
The Willamette MSDS program was designed to develop skills that are relevant to the modern tech landscape. AI is transforming the world and impacting organizations at all levels, and ethics, machine learning and data engineering are key components of successful AI implementation and deployment. The curriculum is reviewed every program year to adapt to the changing environment. Our courses are taught by faculty members in the School of Computing and Information Sciences, and may include contributions from well-credentialed adjunct faculty with connections to local industry.

The MSDS degree requires the completion of nine classes (36 credit hours). Classes are sequenced in order to maximize student learning and allow for a deeper understanding of critical concepts.  

Courses, Academic Year 2025-2026


Fall semester: Foundations

CS 501 Python for Data Science (4)

Data Science is the study of knowledge extraction from massive amounts of data. It requires an integrated skill set including aspects of mathematics, statistics, and computer science, as well as effective problem solving techniques and domain knowledge. This course will introduce students to this rapidly growing field, with an emphasis on basic programming techniques in Python. Students will learn the fundamentals of data structures and complexity theory, data wrangling, exploratory data analysis, data visualization, and basic machine learning algorithms such as regression, classification, and clustering. Additionally, students will practice effective communication by framing their analyses within appropriate context and ethical considerations.

DATA 501 Foundations of Data Science with R (4)

This foundational course offers a full-spectrum introduction to data science and data science workflows, emphasizing data as a source of value creation in the enterprise. The R programming environment serves as the implementation vehicle in support of essential data science activities – data exploration and visualization, data wrangling, predictive modeling, model deployment, and communication. The R programming environment, along with Python, is among the most important tools in the data scientist’s toolbox and this course will feature tools and a style of programming inspired by the popular tidyverse ecosystem – ggplot2 for data visualization, dplyr and tidyr for data wrangling. Students will master elements of the data science workflow through a series of short R programming exercises reinforced by a full-spectrum, integrative final project. Presentation skills are an ever-present theme as students are challenged, through every stage of analysis, to communicate managerial relevance and value to the enterprise.

DATA 504W Data Ethics, Policy and Human Beings (4)

This course explores the legal, policy, and ethical implications of data. These issues arise at each stage of the data science workflow, including data collection, storage, processing, analysis and use. Armed with legal and ethical guidelines, students are then confronted with topics including privacy, surveillance, security, classification, discrimination, decisional autonomy, and duties to warn or act. Using case studies and a lecture-discussion format, the course will address real-world problems in areas like criminal justice, national security, health, marketing and politics.


Spring semester: Application

DATA 502 Data Visualization and Presentation (4)

It’s one thing to conduct an analysis; it’s another to convince someone to change their behavior based on that analysis. In this course, students will study theories of visualization, communication and presentation with the purpose of translating technical results into actionable insight. Using a mix of case studies and code, the course begins with an examination of how to ask good research questions. It then covers the psychology of communication and the construction of compelling visualizations. Finally, students are tasked with writing and presenting their work in a manner suited to a non-technical audience.

DATA 503 Fundamentals of Data Engineering (4)

Data management is core to both applied computer science and data science. This includes storing, managing, and processing datasets of varying sizes and types. This course introduces students to some of the various ways in which data is stored and processed, with particular focus on relational databases and SQL. Over the course of the semester, students will also learn the fundamentals of command-line interfaces, remote connections, and containerization in order to construct a data pipeline to ingest raw data, transform and organize that data into something useful, and then make that polished data available to downstream consumers.

DATA 505 Applied Machine Learning (4)

Machine learning is becoming a core component of many modern organizational processes. It is a growing field at the intersection of computer science and statistics focused on finding patterns in data. Prominent applications include personalized recommendations, image processing and speech recognition. This course will focus on the application of existing machine learning libraries to practical problems faced by organizations. Through lectures, cases and programming projects, students will learn how to use machine learning to solve real world problems, run evaluations and interpret their results.

  • Prerequisite: CS 501/DATA 522

Summer semester: Integration

DATA 513 Advanced Data Engineering (4)

Data engineers design, implement, and maintain the data pipelines that power modern technology and society. This course builds on previous coursework, focusing on ingesting data from diverse sources, including logs, data streams, and various database types. Students will explore automated orchestration tools for scheduling and managing data pipelines, along with key concepts and technology in data warehousing and lakehouses. Emphasis will be placed on documentation, pipeline maintenance, and development through collaborative, semester-long projects.

  • Prerequisite: DATA 503

DATA 515 Advanced Machine Learning (4)

This advanced machine learning course is designed to provide students an in-depth exploration of advanced topics, techniques, algorithms, and applications in the field of machine learning. Through a combination of lectures, hands-on exercises and project-based learning, students will gain a comprehensive understanding of machine learning techniques and their applications across domains. Topics may include: training and fine-tuning neural networks, generative networks, natural language processing, latent space representations, and evaluating ethical considerations such as dataset quality and bias. Particular emphasis will be given to current events and recent advances in the field.

  • Prerequisite: DATA 505

DATA 510 Capstone Project (4)

Over the course of the semester, students will propose, plan and execute a large data science project. Run as a small-group collaboration during the student’s last term, the project requires a synthesis of all core skills learned throughout the program while developing a portfolio piece that markets a student’s skills to future employers. Such marketing also requires that project data be publicly available or publishable. Checkpoints are provided throughout the summer to ensure students are making meaningful progress, with robust opportunities for peer and faculty feedback. Projects culminate with a presentation to faculty, peers, and alumni, as well as a comprehensive written work that is published to the web.

  • Prerequisites: CS 501, DATA 501, 502, 503, 504W, 505

Degree Requirements, Fall 2025 - Summer 2026

Courses required for the Data Science Master of Science degree (9 classes, 36 semester hours) can be completed in as few as 12 months (3 classes per semester, Fall start) or over 24 months (1 or 2 classes per semester, with courses taken in sequence where applicable). Contact us with any questions, or to inquire about part-time format options.

General Requirements

  • DATA 504W: Data Ethics, Policy and Human Beings (4)
  • DATA 510: Graduate Capstone (4)

Analytics Requirements

  • DATA 501: Foundations of Data Science with R (4)
  • DATA 502: Data Visualization and Presentation (4)

Machine Learning Sequence

  • CS 501: Python for Data Science (4)
  • DATA 505: Applied Machine Learning (4)
  • DATA 515: Advanced Machine Learning (4)

Data Engineering Sequence

  • DATA 503: Fundamentals of Data Engineering (4)
  • DATA 513: Advanced Data Engineering (4)

Willamette University

School of Computing and Information Sciences