Course Descriptions

Core Courses

DATA 501 Foundations of Data Science with R (4)

This foundational course offers a full-spectrum introduction to data science and data science workflows, emphasizing data as a source of value creation in the enterprise. The R programming environment serves as the implementation vehicle in support of essential data science activities – data exploration and visualization, data wrangling, predictive modeling, model deployment, and communication. The R programming environment, along with Python, is among the most important tools in the data scientist’s toolbox, and this course will feature tools and a style of programming inspired by the popular tidyverse ecosystem – ggplot2 for data visualization, and dplyr and tidyr for data wrangling. Students will master elements of the data science workflow through a series of short R programming exercises reinforced by a full-spectrum, integrative final project. Presentation skills are an ever-present theme as students are challenged, through every stage of analysis, to communicate managerial relevance and value to the enterprise.

  • Prerequisite: None
  • Offering: Fall
  • Professor: Staff

DATA 502 Data Visualization and Presentation (4)

It’s one thing to conduct an analysis; it’s another to convince someone to change their behavior based on that analysis. In this course, students will study theories of visualization, communication and presentation with the purpose of translating technical results into actionable insight. Using a mix of case studies and code, the course begins with an examination of how to ask good research questions. It then covers the psychology of communication and the construction of compelling visualizations. Finally, students are tasked with writing and presenting their work in a manner suited to a non-technical audience.

  • Prerequisite: None
  • Offering: Fall
  • Professor: Staff

DATA 503 Fundamentals of Data Engineering (4)

Data management is core to both applied computer science and data science. This includes storing, managing, and processing datasets of varying sizes and types. This course introduces students to the various ways in which data is stored and processed, including relational databases, file-based databases, cloud-based storage and data streaming. A key component of the course is learning which architectures fit which types of data science problems, and the strengths and weaknesses of each. Students will learn to work with data that is clean and structured as well as data that is dirty and unstructured.

  • Prerequisite: None
  • Offering: Spring
  • Professor: Staff

DATA 504 Data Ethics, Policy and Human Beings (4)

This course explores the legal, policy, and ethical implications of data. These issues arise at each stage of the data science workflow, including data collection, storage, processing, analysis and use. Armed with legal and ethical guidelines, students are confronted with topics including privacy, surveillance, security, classification, discrimination, decisional autonomy, and duties to warn or act. Using case studies and a lecture-discussion format, the course will address real-world problems in areas like criminal justice, national security, health, marketing and politics.

  • Prerequisite: None
  • Offering: Spring
  • Professor: Staff

DATA 505 Applied Machine Learning (4)

Machine learning is becoming a core component of many modern organizational processes. It is a growing field at the intersection of computer science and statistics focused on finding patterns in data. Prominent applications include personalized recommendations, image processing and speech recognition. This course will focus on the application of existing machine learning libraries to practical problems faced by organizations. Through lectures, cases and programming projects, students will learn how to use machine learning to solve real-world problems, run evaluations and interpret their results.

  • Prerequisite: None
  • Offering: Spring
  • Professor: Staff

DATA 510 Capstone Project (4)

Over the course of the semester, students will propose, plan and execute an actual data science project. Run as an independent study during the student’s last term, the project provides an opportunity to integrate all of the core skills learned throughout the program, and to develop a portfolio piece that can help with students’ career aspirations. Projects must be consequential in nature—i.e., have a real (or potential) impact on some organization, or the world. Grades will be based on assessments by both the faculty advisor and those (potentially) impacted by the project’s results. Data sets must be selected by the student either from a public repository or from the company for which they work and approved by the course instructor within the first two weeks of the term.

  • Prerequisite: None
  • Offering: Summer
  • Professor: Staff

Electives

DATA 520 Marketing Analytics (4)

  • Offering: TBD
  • Professor: Staff

DATA 521 Time Series Modeling (4)

  • Offering: TBD
  • Professor: Staff

DATA 522 Python Programming (4)

We will begin from the basics of the Python programming language and learn the key tools and libraries within Python needed to solve industry problems in data science. The course will include data manipulation and preprocessing using pandas and NumPy, cover machine learning models in scikit-learn, and conclude with building full preprocessing-and-forecasting pipelines. At the completion of this course, students will be able to write basic code and functions as well as visualize data, implement data science workflows, and fit machine learning models using Python. We will also explore differences between Python and R as tools for data science to help students select the correct tool for a given industry problem.

  • Professor: Staff

DATA 599 Topics: Cybersecurity (4)

Cybersecurity can be understood as a mindset or approach rather than as a set of computer science subfields such as secure mobile computing, network and operating system security, secure databases, and secure cryptographic algorithms. This course prepares a general audience to incorporate security concepts and ethics into their daily lives and offers some basic familiarity with writing security-oriented code.

  • Professor: Staff

DATA 599 Topics: Data Science in the Natural Sciences (4)

As “big data” and the role of the data scientist enter mainstream life, the way science is done is also evolving. Natural scientists are frequently turning to trained data scientists to collaborate and assist with data reduction, data storage, and machine learning. This course is intended to help bridge the gap across disciplines: to instruct data scientists in the analyses, formats, requirements, and approaches commonly used in the natural sciences. Particular emphasis will be placed on understanding what sorts of information scientists are interested in, and on facilitating communication between data scientists and natural scientists. Additionally, as science frequently uses a slightly different technology stack than commercial or industrial groups, some time will be devoted to learning how to navigate a Linux environment and work with a variety of open-source frameworks.

  • Professor: Staff

DATA 599 Topics: Survival Analysis (4)

Survival analysis methods consist of statistical modeling techniques that predict the time to an event. The event of interest depends on the application area. Survival analysis techniques have been used in a wide variety of areas: medical professionals working to predict the onset of a disease, HR representatives studying trends in workforce attrition, mechanical engineers investigating the reliability of post-release products, and many others. In this course, students will practice survival analysis techniques through exploration of real data sets. Students can expect to develop a deeper understanding of concepts including observation censoring, accelerated life testing and experimental design, recurrence analysis, and the implementation of survival analysis techniques in statistical software, as well as of communicating statistical results to different audiences.

  • Prerequisite: Integral calculus
  • Not explicitly required, but helpful: Knowledge of multiple linear regression
  • Professor: Gore
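
As an informal illustration of observation censoring, one of the concepts named in the Survival Analysis description, the sketch below computes a Kaplan-Meier survival curve in plain Python. It is not course material: the function name `kaplan_meier` and the small attrition data set are hypothetical, chosen only to show how censored observations count toward the number at risk without counting as events.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: time until the event or until observation stopped
    observed:  True if the event was seen, False if the record is censored
    """
    events = Counter(t for t, seen in zip(durations, observed) if seen)
    exits = Counter(durations)  # everyone leaving the risk set at time t
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:
            # Multiply by the fraction of at-risk subjects surviving time t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        # Censored and event records both leave the risk set after time t.
        at_risk -= exits[t]
    return curve


# Hypothetical data: months until an employee leaves.
# False means the employee was still employed when observation ended.
durations = [2, 3, 3, 5, 8]
observed = [True, True, False, True, False]

curve = kaplan_meier(durations, observed)
for time, prob in curve:
    print(f"month {time}: estimated survival {prob:.2f}")
```

The censored records at months 3 and 8 never lower the curve directly, but they do shrink the risk set, which is why the drop at month 5 is larger than the drop at month 2.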

Willamette University

Masters in Data Science

Address
200 Market Street
Portland, Oregon 97201 U.S.A.
Phone
971-717-7271