CS 109: Data Science - (New Course)

Computer Science 109. Data Science - (New Course)
Catalog Number: 70866
Hanspeter Pfister (Computer Science), Joseph K. Blitzstein (Statistics), and Verena Kaynig-Fittkau (Computer Science)
Half course (fall term). Tu., Th., 2:30–4, and a weekly section.
Learning from data to gain useful predictions and insights. This course introduces methods for five key facets of an investigation: data wrangling, cleaning, and sampling to get a suitable data set; data management to access big data quickly and reliably; exploratory data analysis to generate hypotheses and intuition; prediction based on statistical methods such as regression and classification; and communication of results through visualization, stories, and interpretable summaries. The course is built around three modules: prediction and elections, recommendation and business analytics, and sampling and social network analysis.
Note: Only one of CS 109, AC 209, or Stat 121 can be taken for credit. Only admitted graduate students can take AC 209, in which case we expect significant differences in readings, assignments, and projects.
Prerequisite: Programming knowledge at the level of CS 50 or above, and statistics knowledge at the level of Stat 100 or above (Stat 110 recommended).