Course Descriptions

DATA 151 Introduction to Data Science with R (4)

This course focuses on developing the foundational skills of a modern data scientist including data cleaning, wrangling, visualization, and communication. Students will actively engage with R and RStudio, the most popular programming language and software environment for statistical computing. The course covers basic descriptive statistics (mean, standard deviation, quantiles, correlation) and introduces students to the tools they need to work with large, real-world data sets. Students will also develop the critical thinking skills needed to use data ethically. The course is the first of two in the introductory Data Science sequence, but will also be of interest to any student who wants to better understand the data they meet in everyday life and in the world around them. The course does not assume any previous background in statistics or programming.

  • General Education Requirement Fulfillment: Mathematical Sciences
  • Offering: Fall
  • Professor: Staff

DATA 152 Inferential Statistics with R (4)

This course gives students a solid grounding in the theory and practice of basic inferential statistics: confidence intervals, hypothesis testing (including chi-squared tests and ANOVA), and linear regression. Students will implement these techniques using R, a statistical programming language. The course also introduces the topics from probability theory needed to understand these methods (Law of Large Numbers and the Central Limit Theorem), and introduces students to the computational techniques needed to carry out these tests, including randomization and resampling. Students will develop the skills to write well-defined research questions, test hypotheses, and communicate results in a manner that facilitates action by decision makers.

  • General Education Requirement Fulfillment: Mathematical Sciences
  • Prerequisite: DATA 151
  • Offering: Fall
  • Professor: Staff

DATA 199 Topics in Data Science (1-4)

A semester-long study of topics in Data Science. Topics and emphases will vary according to the instructor. This course may be repeated for credit with different topics. See the New and Topics Courses page on the Registrar’s webpage for descriptions and applicability to majors/minors in other departments.

  • General Education Requirement Fulfillment: Topic dependent
  • Prerequisite: Topic dependent
  • Offering: Occasionally
  • Professor: Staff

DATA 252 Models and Machine Learning (4)

This project-based course provides an overview of modern approaches to analyzing large and complex real-world data sets from diverse applications. Students will learn techniques in modeling and predictive methods from selected topics in supervised learning and unsupervised learning. Building on a strong foundation in the generalized linear model framework, students will learn to assess model assumptions and motivate machine learning methods, which may include classification (logistic regression, linear discriminant analysis, naive Bayes, k-means, etc.), non-linear and non-parametric methods, support vector machines, decision trees (classification and regression trees, random forests), boosting, neural networks, and additional topics if time allows. Students will become proficient in implementing these methods using R packages.

DATA 275 Data in the Cosmos (4)

In the coming years, scientific telescopes will be collecting vast amounts of data on the observable sky and our place in the cosmos. As a result, astronomy is intersecting with the field of data science like never before. This course will provide students with an opportunity to explore the techniques and applications of data science in modern astronomy. You will work with large datasets to study the evolution of stars and the age of the universe, use signal processing techniques to identify planets orbiting other stars, and employ basic machine learning techniques to categorize galaxy types. Collaborating with peers from various disciplines, you will also learn how to communicate scientific findings effectively through written and oral presentations. While no prior science background is required, proficiency in a programming language such as Python or R is a prerequisite, and a solid grasp of algebra and geometry is highly recommended.

  • General Education Requirement Fulfillment: Mathematical Sciences; Natural Sciences
  • Prerequisite: CS 151 or DATA 151
  • Offering: Spring
  • Professor: Staff

DATA 299 Topics in Data Science (1-4)

A semester-long study of topics in Data Science. Topics and emphases will vary according to the instructor. This course may be repeated for credit with different topics. See the New and Topics Courses page on the Registrar’s webpage for descriptions and applicability to majors/minors in other departments.

  • General Education Requirement Fulfillment: Topic dependent
  • Prerequisite: Topic dependent

DATA 351 Data Management with SQL (4)

Data management is core to both applied computer science and data science. This includes storing, managing, and processing datasets of varying sizes and types. This course introduces students to the various ways in which data is stored and processed including relational databases, file-based databases, cloud-based storage and data streaming. Students will also learn how to access data using Structured Query Language (SQL).

  • Prerequisite: CS 151 and DATA 151
  • Offering: Annually
  • Professor: Staff

DATA 352W Ethics, Teamwork, and Communication (4)



  • General Education Requirement Fulfillment: Writing-Centered
  • Prerequisite: CS 151
  • Offering: Annually
  • Professor: Staff

DATA 391 Independent Study (2 or 4)

This course is intended for the qualified advanced student who wishes to do an intensive independent study in an area not covered by an existing course in the department. Arrangements for this course must be made with a faculty member before registration.

  • General Education Requirement Fulfillment: Mathematical Sciences
  • Prerequisite: Consent of instructor
  • Offering: On demand
  • Professor: Staff

DATA 399 Topics in Data Science (1-4)

A semester-long study of topics in Data Science. Topics and emphases will vary according to the instructor. This course may be repeated for credit with different topics. See the New and Topics Courses page on the Registrar’s webpage for descriptions and applicability to majors/minors in other departments.

  • General Education Requirement Fulfillment: Topic dependent
  • Prerequisite: Topic dependent
  • Offering: Occasionally
  • Professor: Staff

DATA 429 Topics in Data Science (1-4)

A semester-long study of topics in Data Science. Topics and emphases will vary according to the instructor. This course may be repeated for credit with different topics. See the New and Topics Courses page on the Registrar’s webpage for descriptions and applicability to majors/minors in other departments.

  • General Education Requirement Fulfillment: Topic dependent
  • Prerequisite: Topic dependent
  • Offering: Occasionally
  • Professor: Staff

DATA 497 Research in Data Science (2 or 4)

Individualized program of investigative research, in which a student works directly with a Data Science faculty member on their area of research expertise. Nature of participation varies from collaborative research to the design and execution of an independent project. The course provides hands-on experience, which may include literature review, data collection, data management, data analysis, and the synthesis of results in a formal paper and/or oral presentation. May be repeated for credit up to a maximum of 8 total credits.

  • General Education Requirement Fulfillment: Mathematical Sciences
  • Prerequisite: Consent of instructor
  • Offering: On demand
  • Professor: Data Science Staff

DATA 501 Foundations of Data Science with R (4)

This foundational course offers a full-spectrum introduction to data science and data science workflows, emphasizing data as a source of value creation in the enterprise. The R programming environment serves as the implementation vehicle in support of essential data science activities – data exploration and visualization, data wrangling, predictive modeling, model deployment, and communication. The R programming environment, along with Python, is among the most important tools in the data scientist’s toolbox and this course will feature tools and a style of programming inspired by the popular tidyverse ecosystem – ggplot2 for data visualization, dplyr and tidyr for data wrangling. Students will master elements of the data science workflow through a series of short R programming exercises reinforced by a full-spectrum, integrative final project. Presentation skills are an ever-present theme as students are challenged, through every stage of analysis, to communicate managerial relevance and value to the enterprise.

  • Prerequisite: None
  • Offering: Fall
  • Professor: Staff

DATA 502 Data Visualization and Presentation (4)

It’s one thing to conduct an analysis; it’s another to convince someone to change their behavior based on that analysis. In this course, students will study theories of visualization, communication, and presentation with the purpose of translating technical results into actionable insight. Using a mix of case studies and code, the course begins with an examination of how to ask good research questions. It then covers the psychology of communication and the construction of compelling visualizations. Finally, students are tasked with writing and presenting their work in a manner suited to a non-technical audience.

  • Prerequisite: None
  • Offering: Fall
  • Professor: Staff

DATA 503 Fundamentals of Data Engineering (4)

Data management is core to both applied computer science and data science. This includes storing, managing, and processing datasets of varying sizes and types. This course introduces students to the various ways in which data is stored and processed including relational databases, file-based databases, cloud-based storage and data streaming. A key component of the course is learning which architectures fit which types of data science problem (and the strengths and weaknesses of each). Students will learn to work with data that is both clean and structured, and dirty and unstructured.

  • Prerequisite: None
  • Offering: Spring
  • Professor: Staff

DATA 504 Data Ethics, Policy and Human Beings (4)

This course explores the legal, policy, and ethical implications of data. Such issues arise at each stage of the data science workflow, including data collection, storage, processing, analysis, and use. Armed with legal and ethical guidelines, students confront topics including privacy, surveillance, security, classification, discrimination, decisional autonomy, and duties to warn or act. Using case studies and a lecture-discussion format, the course addresses real-world problems in areas such as criminal justice, national security, health, marketing, and politics.

  • Prerequisite: None
  • Offering: Spring
  • Professor: Staff

DATA 505 Applied Machine Learning (4)

Machine learning is becoming a core component of many modern organizational processes. It is a growing field at the intersection of computer science and statistics focused on finding patterns in data. Prominent applications include personalized recommendations, image processing, and speech recognition. This course will focus on the application of existing machine learning libraries to practical problems faced by organizations. Through lectures, cases, and programming projects, students will learn how to use machine learning to solve real-world problems, run evaluations, and interpret their results.

  • Prerequisite: None
  • Offering: Spring
  • Professor: Staff

DATA 510 Capstone Project (4)

Over the course of the semester, students will propose, plan and execute an actual data science project. Run as an independent study during the student’s last term, the project provides an opportunity to integrate all of the core skills learned throughout the program, and to develop a portfolio piece that can help with students’ career aspirations. Projects must be consequential in nature—i.e., have a real (or potential) impact on some organization, or the world. Grades will be based on assessments by both the faculty advisor and those (potentially) impacted by the project’s results. Data sets must be selected by the student either from a public repository or from the company for which they work and approved by the course instructor within the first two weeks of the term.

  • Prerequisite: None
  • Offering: Summer
  • Professor: Staff

DATA 520 Marketing Analytics (4)

  • Offering: TBD
  • Professor: Staff

DATA 521 Time Series Modeling (4)

  • Offering: TBD
  • Professor: Staff

DATA 522 Python Programming (4)

We will begin from the basics of the Python programming language and learn the key tools and libraries within Python needed to solve industry problems in data science. The course will include data manipulation and preprocessing using pandas and NumPy, cover machine learning models in scikit-learn, and conclude with building full preprocessing-and-forecasting pipelines. At the completion of this course, students will be able to write basic code and functions as well as visualize data, implement data science workflows, and fit machine learning models using Python. We will also explore differences between Python and R as tools for data science to help students select the correct tool for a given industry problem.

  • Professor: Staff

Willamette University

3+1 BS/MS in Data Science

Address
900 State Street
Salem, Oregon 97301 U.S.A.
Phone
971-717-7271