Data science is one of the fastest-growing fields in existence. Scientists, businesses, government agencies and other organizations routinely gather huge amounts of data from a variety of sources. Data scientists help transform this information into insights that shape the world, asking and answering questions that influence decisions about healthcare, sustainability, business, security, equity – the list goes on.
Willamette’s data science program helps students gain contemporary computer programming and data analysis skills, either as a major course of study or a minor complementing any undergraduate major. The program also addresses issues such as the ethics of working with data while teaching students how to formulate good questions, design a process for answering them and effectively communicate their findings to a variety of stakeholders.
Students learn two core computer programming languages (R and Python). The R course focuses on introductory statistics, and the Python course focuses on introductory computer programming. Students also complete electives that advance their knowledge of statistical, mathematical, analytical and machine learning techniques. Both majors and minors apply their skills in the Problem-Solving with Data Analytics class, while majors complete their bachelor’s degree with a capstone internship or research project.
Requirements for the Data Science Major (Bachelor of Science) (40 semester hours)
Required Core (28 semester hours)
- CS 151 Introduction to Programming with Python (4)
- DATA 151 Introduction to Data Science (4)
- DATA 152 Statistics for Data Science (4)
- DATA 252 Models and Machine Learning (4)
- DATA 351 Data Management with SQL (4)
- DATA 352W Ethics, Teamwork, and Communication (4)
- MATH 280 Math for Data Scientists (4)
Electives (12 semester hours)
Twelve hours of electives at any level chosen from classes with CS or DATA prefixes or from the following approved list:
- BIOL 342 Biostatistics (4)
- BIOL 347 Bioinformatics (4)
- ECON 350 Introduction to Econometrics and Forecasting (4)
- ENVS 250 Geographic Information Systems (4)
- ENVS 381 Research in Spatial Science (4)
- MATH 253 Linear Algebra (4)
- MATH 266 Probability and Statistics (4)
- MATH 376 Topics in Mathematics: Probability Theory (topic dependent) (4)
- PHYS 340 Advanced Data Analysis and Simulation (ADAS) (4)
Requirements for the Data Science Minor (20 semester hours)
- CS 151 Introduction to Programming with Python (4)
- DATA 151 Introduction to Data Science (4)
- DATA 152 Statistics for Data Science (4)
- Eight hours of electives at any level, chosen from courses with a DATA prefix or from the list of approved electives above
Requirements for the accelerated 3+1 BS/MS in Data Science (56 semester hours)
Required undergraduate Core (16 semester hours):
- CS 151 Introduction to Programming with Python (4)
- DATA 151 Introduction to Data Science (4)
- DATA 152 Statistics for Data Science (4)
- MATH 280 Math for Data Scientists (4)
Undergraduate Electives (8 semester hours):
- Eight hours of electives at any level chosen from classes with CS or DATA prefixes, or from the list of approved electives above
Required graduate Core (20 semester hours):
- DATA 502 Data Visualization and Presentation (4)
- DATA 503 Fundamentals of Data Engineering (4)
- DATA 504 Data Ethics, Policy, and Human Beings (4)
- DATA 505 Applied Machine Learning (4)
- DATA 510 Graduate Capstone (4)
Graduate Electives (12 semester hours):
- Twelve hours of graduate electives with DATA prefix
Course Listings
DATA 151 Introduction to Data Science with R (4)
This course focuses on developing the foundational skills of a modern data scientist, including data cleaning, wrangling, visualization, and communication. Students will actively engage with R and RStudio, a widely used programming language and software environment for statistical computing. The course covers basic descriptive statistics (mean, standard deviation, quantiles, correlation) and introduces students to the tools they need to work with large, real-world data sets. Students will also develop the critical thinking skills needed to use data ethically. The course is the first of two in the introductory Data Science sequence, but will also be of interest to any student who wants to better understand the data they encounter in everyday life and in the world around them. The course does not assume any previous background in statistics or programming.
- General Education Requirement Fulfillment: Mathematical Science
- Offering: Fall
- Professor: Staff
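As an illustration of the descriptive statistics named in the DATA 151 description, the sketch below computes a mean, standard deviation, quartiles, and a correlation. It is written in Python (the program's other core language) using only the standard library; the course itself uses R, and the data and the helper names `describe` and `pearson` are hypothetical, not course material.

```python
import statistics

# Hypothetical sample data: hours studied and exam scores for six students.
hours = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
scores = [52.0, 58.0, 65.0, 70.0, 74.0, 83.0]


def describe(values):
    """Return the mean, sample standard deviation, and quartiles of a list."""
    return {
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),  # sample (n - 1) standard deviation
        "quartiles": statistics.quantiles(values, n=4),  # three cut points
    }


def pearson(x, y):
    """Pearson correlation coefficient computed from its definition."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (ss_x * ss_y) ** 0.5


summary = describe(hours)
r = pearson(hours, scores)
print(summary)
print(f"correlation between hours and scores: {r:.3f}")
```

A correlation close to 1 here only says the two made-up columns rise together; it does not by itself show that one causes the other.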
DATA 152 Inferential Statistics with R (4)
This course gives students a solid grounding in the theory and practice of basic inferential statistics: confidence intervals, hypothesis testing (including chi-squared tests and ANOVA), and linear regression. Students will implement these techniques using R, a statistical programming language. The course also introduces the topics from probability theory needed to understand these methods (the Law of Large Numbers and the Central Limit Theorem) and the computational techniques needed to carry out these tests, including randomization and resampling. Students will develop the skills to write well-defined research questions, test hypotheses, and communicate results in a manner that facilitates action by decision makers.
- General Education Requirement Fulfillment: Mathematical Science
- Offering: Fall
- Professor: Staff
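The randomization and resampling techniques mentioned in the DATA 152 description can be sketched as follows. This is a minimal illustration in Python with the standard library, not the course's R material: `bootstrap_ci` builds a percentile bootstrap confidence interval for a mean, and `permutation_test` estimates a two-sided p-value for a difference in group means. The function names, the two small groups of measurements, and the choice of 2000 resamples are all assumptions made for the example.

```python
import random
import statistics


def bootstrap_ci(sample, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the mean of `sample`."""
    rng = random.Random(seed)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.mean(rng.choices(sample, k=len(sample)))
        for _ in range(n_resamples)
    )
    lower_idx = int((1 - level) / 2 * n_resamples)
    upper_idx = int((1 + level) / 2 * n_resamples) - 1
    return means[lower_idx], means[upper_idx]


def permutation_test(a, b, n_permutations=2000, seed=0):
    """Estimate a two-sided p-value for the difference in group means."""
    rng = random.Random(seed)
    observed = abs(statistics.mean(a) - statistics.mean(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_permutations):
        # Shuffle the group labels by shuffling the pooled values.
        rng.shuffle(pooled)
        diff = abs(
            statistics.mean(pooled[: len(a)]) - statistics.mean(pooled[len(a):])
        )
        if diff >= observed:
            extreme += 1
    return extreme / n_permutations


# Hypothetical measurements for two groups.
group_a = [12, 15, 14, 10, 13, 16, 11, 14]
group_b = [18, 21, 19, 22, 20, 17, 23, 19]

ci_low, ci_high = bootstrap_ci(group_a)
p_value = permutation_test(group_a, group_b)
print(f"95% bootstrap CI for the mean of group A: ({ci_low}, {ci_high})")
print(f"permutation p-value: {p_value}")
```

Both functions take a `seed` so that results are reproducible from run to run.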
DATA 199 Topics in Data Science (1-4)
A semester-long study of topics in Data Science. Topics and emphases will vary according to the instructor. This course may be repeated for credit with different topics. See the New and Topics Courses page on the Registrar’s webpage for descriptions and applicability to majors/minors in other departments.
- General Education Requirement Fulfillment: Topic dependent
- Prerequisite: Topic dependent
- Offering: Occasionally
- Professor: Staff
DATA 252 Models and Machine Learning (4)
This project-based course provides an overview of modern approaches to analyzing large and complex real-world data sets from diverse applications. Students will learn techniques in modeling and predictive methods from selected topics in supervised learning and unsupervised learning. Building on a strong foundation in the generalized linear model framework, students will learn to assess model assumptions and motivate machine learning methods, which may include classification (logistic regression, linear discriminant analysis, naive Bayes, k-means, etc.), non-linear and non-parametric methods, support vector machines, decision trees (classification and regression trees, random forests), boosting, neural networks, and additional topics if time allows. Students will become proficient in implementing these methods using R packages.
- Prerequisite: MATH 280
- Offering: Annually
- Professor: Staff
DATA 299 Topics in Data Science (1-4)
A semester-long study of topics in Data Science. Topics and emphases will vary according to the instructor. This course may be repeated for credit with different topics. See the New and Topics Courses page on the Registrar’s webpage for descriptions and applicability to majors/minors in other departments.
- General Education Requirement Fulfillment: Topic dependent
- Prerequisite: Topic dependent
DATA 351 Data Management with SQL (4)
Data management is core to both applied computer science and data science. It includes storing, managing, and processing datasets of varying sizes and types. This course introduces students to the various ways in which data is stored and processed, including relational databases, file-based databases, cloud-based storage and data streaming. Students will also learn how to access data using Structured Query Language (SQL).
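To give a flavor of accessing data in a relational database with SQL, the sketch below uses Python's built-in `sqlite3` module with an in-memory database. The `enrollments` table, its columns, and its rows are hypothetical and are not drawn from the course; the course may use a different database system.

```python
import sqlite3

# Hypothetical rows: (student, course, credits).
rows = [
    ("Ana", "DATA 151", 4),
    ("Ana", "CS 151", 4),
    ("Ben", "DATA 152", 4),
]

# An in-memory SQLite database; nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE enrollments (student TEXT, course TEXT, credits INTEGER)"
)
# Parameterized inserts keep the data separate from the SQL text.
conn.executemany("INSERT INTO enrollments VALUES (?, ?, ?)", rows)

# Total credits per student, using GROUP BY with an aggregate function.
totals = conn.execute(
    "SELECT student, SUM(credits) FROM enrollments "
    "GROUP BY student ORDER BY student"
).fetchall()
conn.close()

for student, credits in totals:
    print(student, credits)
```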
DATA 352W Ethics, Teamwork, and Communication (4)
- General Education Requirement Fulfillment: Writing-Centered
- Prerequisite: CS 151
- Offering: Annually
- Professor: Staff
DATA 399 Topics in Data Science (1-4)
A semester-long study of topics in Data Science. Topics and emphases will vary according to the instructor. This course may be repeated for credit with different topics. See the New and Topics Courses page on the Registrar’s webpage for descriptions and applicability to majors/minors in other departments.
- General Education Requirement Fulfillment: Topic dependent
- Prerequisite: Topic dependent
- Offering: Occasionally
- Professor: Staff
DATA 429 Topics in Data Science (1-4)
A semester-long study of topics in Data Science. Topics and emphases will vary according to the instructor. This course may be repeated for credit with different topics. See the New and Topics Courses page on the Registrar’s webpage for descriptions and applicability to majors/minors in other departments.
- General Education Requirement Fulfillment: Topic dependent
- Prerequisite: Topic dependent
- Offering: Occasionally
- Professor: Staff