Course Descriptions

DATA 151 Introduction to Data Science with R

This course focuses on developing the foundational skills of a modern data scientist, including data cleaning, wrangling, visualization, and communication. Students will actively engage with R and RStudio, a widely used programming language and software environment for statistical computing. The course covers basic descriptive statistics (mean, standard deviation, quantiles, correlation) and introduces students to the tools they need to work with large, real-world data sets. Students will also develop the critical thinking skills needed to use data ethically. The course is the first of two in the introductory Data Science sequence, but will also be of interest to any student who wants to better understand the data they meet in everyday life and in the world around them. The course does not assume any previous background in statistics or programming.


DATA 152 Inferential Statistics With R

This course gives students a solid grounding in the theory and practice of basic inferential statistics: confidence intervals, hypothesis testing (including chi-squared tests and ANOVA), and linear regression. Students will implement these techniques using R, a statistical programming language. The course also introduces the topics from probability theory needed to understand these methods (the Law of Large Numbers and the Central Limit Theorem), and introduces students to the computational techniques needed to carry out these tests, including randomization and resampling. Students will develop the skills to write well-defined research questions, test hypotheses, and communicate results in a manner that facilitates action by decision makers.


DATA 199 Topics in Data Science

A semester-long study of topics in Data Science. Topics and emphases will vary according to the instructor. This course may be repeated for credit with different topics. See the New and Topics Courses page on the Registrar's webpage for descriptions and applicability to graduation requirements.


DATA 252 Models and Machine Learning

Selected topics in supervised learning, unsupervised learning, and reinforcement learning: perceptron, logistic regression, linear discriminant analysis, decision trees, neural networks, naive Bayes, support vector machines, k-nearest neighbors algorithm, hidden Markov models, expectation-maximization algorithm, k-means, Gaussian mixture model, bias-variance tradeoff, ensemble methods, feature extraction and dimensionality reduction methods, principal component analysis, Markov decision processes, passive and active learning.


DATA 275 Data in the Cosmos

In the coming years, scientific telescopes will be collecting vast amounts of data on the observable sky and our place in the cosmos. As a result, astronomy is intersecting with the field of data science like never before. This course will provide students with an opportunity to explore the techniques and applications of data science in modern astronomy. You will work with large datasets to study the evolution of stars and the age of the universe, use signal processing techniques to identify planets orbiting other stars, and employ basic machine learning techniques to categorize galaxy types. Collaborating with peers from various disciplines, you will also learn how to communicate scientific findings effectively through written and oral presentations. While no prior science background is required, proficiency in programming languages like Python or R is a prerequisite, and a solid grasp of algebra and geometry is highly recommended.


DATA 299 Topics in Data Science

A semester-long study of topics in Data Science. Topics and emphases will vary according to the instructor. This course may be repeated for credit with different topics. See the New and Topics Courses page on the Registrar's webpage for descriptions and applicability to graduation requirements.


DATA 351 Data Management With SQL

Data management is core to both applied computer science and data science. This includes storing, managing, and processing datasets of varying sizes and types. This course introduces students to the various ways in which data is stored and processed, including relational databases, file-based databases, cloud-based storage, and data streaming. Students will also learn how to access data using Structured Query Language (SQL).


DATA 352W Ethics, Teamwork, & Communication

Scientists with backgrounds in data and computing face both novel challenges in ethics, teamwork, and communication and existing challenges in novel contexts. Human-centered scientists must be able to analyze, act upon, and argue in support of ethical use of technologies. Topics will include labor policies, including hiring practices and workplace non-discrimination, tech monopolies and their global impact, open source projects and datasets (namely GitHub), socially responsible research, and accessibility of technologies. To develop a vocabulary to advocate and collaborate, students will collaboratively prepare technical reports and presentations and build technical blog posts with embedded data/computing resources and visualizations.


DATA 375 Problem-Solving With Data Analytics

This course serves as both the culminating experience in the Data Science minor and a required course in the Data Science major. Students will work in teams to apply data analytics tools and skills toward the resolution of a question or problem. Depending on the instructor, this course might organize all projects around a common question, problem, or challenge, or it might allow teams to identify a theme of their own. Teams will work with their instructor to develop a problem-solving strategy that utilizes the data-analytics skills and methods acquired in prerequisite courses, to develop a project plan, and to carry out the project. In most instances, projects will yield a summary essay, research paper, or white paper.


DATA 391 Independent Study

This course is intended for the qualified advanced student who wishes to do an intensive independent study in an area not covered by an existing course in the department. Arrangements for this course must be made with a faculty member before registration.


DATA 395 Quad Internship

Internship in the QUAD center, required of DATA majors.


DATA 399 Topics in Data Science

A semester-long study of topics in Data Science. Topics and emphases will vary according to the instructor. This course may be repeated for credit with different topics. See the New and Topics Courses page on the Registrar's webpage for descriptions and applicability to graduation requirements.


DATA 410 Data Science Capstone

Over the course of the semester, students will propose, plan and execute an actual data science project. Run as an independent study during the student's last year, the project provides an opportunity to integrate all of the core skills learned throughout the program, and to develop a portfolio piece that can help with students' career aspirations. Some class time may be required for collaboration, discussion and critiques of classmates' work.


DATA 429 Topics in Data Science

A semester-long study of topics in Data Science. Topics and emphases will vary according to the instructor. This course may be repeated for credit with different topics. See the New and Topics Courses page on the Registrar's webpage for descriptions and applicability to graduation requirements.


DATA 497 Research in Data Science

Individualized program of investigative research, in which a student works directly with a Data Science faculty member on their area of research expertise. Nature of participation varies from collaborative research to the design and execution of an independent project. The course provides hands-on experience, which may include literature review, data collection, data management, data analysis, and the synthesis of results in a formal paper and/or oral presentation. May be repeated for credit up to a maximum of 8 total credits.


Willamette University

Data Science