Graduate
Data Science 200. Introduction to Data Science Programming (3 units)
This fast-paced course gives students the fundamental Python knowledge necessary for advanced work in data science. Students gain frequent practice writing code, building to advanced skills focused on data science applications. We introduce a range of Python objects and control structures, then build on these with classes on object-oriented programming. A major programming project reinforces these concepts, giving students insight into how a large piece of software is built and experience managing a full-cycle development project. The last section covers two popular Python packages for data analysis, NumPy and pandas, and includes an exploratory data analysis.
Section 1 Mo 4:00 pm - 5:30 pm Instructor(s): Uthra Ramanujam
Section 2 Tu 2:00 pm - 3:30 pm Instructor(s): Gerald Benoît
Section 3 Tu 4:00 pm - 5:30 pm Instructor(s): Gerald Benoît
Section 4 Tu 6:30 pm - 8:00 pm Instructor(s): Sridevi Pudipeddi
Section 5 Th 6:30 pm - 8:00 pm Instructor(s): Ysis Wilson-Tarter
Section 6 We 4:00 pm - 5:30 pm Instructor(s): Sridevi Pudipeddi
Section 7 We 6:30 pm - 8:00 pm Instructor(s): Ysis Wilson-Tarter
Section 8 Th 4:00 pm - 5:30 pm Instructor(s): Mumin Khan
Data Science 201. Research Design and Applications for Data and Analysis (3 units)
Introduces the data sciences landscape, with a particular focus on learning data science techniques to uncover and answer the questions students will encounter in industry. Lectures, readings, discussions, and assignments will teach how to apply disciplined, creative methods to ask better questions, gather data, interpret results, and convey findings to various audiences. The emphasis throughout is on making practical contributions to real decisions that organizations will and should make.
Section 1 Mo 4:00 pm - 5:30 pm Instructor(s): Elena Petrov
Section 2 Mo 6:30 pm - 8:00 pm Instructor(s): Carlos Rivera
Section 3 Tu 2:00 pm - 3:30 pm Instructor(s): JP Dolphin
Section 4 Tu 4:00 pm - 5:30 pm Instructor(s): Brooks Ambrose
Section 5 We 4:00 pm - 5:30 pm Instructor(s): Brooks Ambrose
Section 6 We 4:00 pm - 5:30 pm Instructor(s): Napoleon Paxton
Section 7 We 6:30 pm - 8:00 pm Instructor(s): Carlos Rivera
Section 8 We 6:30 pm - 8:00 pm Instructor(s): Donna Dueker
Section 9 Th 4:00 pm - 5:30 pm Instructor(s): Sahab Aslam
Section 10 Th 6:30 pm - 8:00 pm Instructor(s): Conor Healy
Data Science 203. Statistics for Data Science (3 units)
An introduction to many different types of quantitative research methods and statistical techniques for analyzing data. We begin with a focus on measurement, inferential statistics, and causal inference using the open-source statistics language R. Topics in quantitative techniques include descriptive and inferential statistics, sampling, experimental design, tests of difference, ordinary least squares regression, and general linear models.
Section 1 Tu 2:00 pm - 3:30 pm Instructor(s): Paul Laskowski
Section 2 Tu 4:00 pm - 5:30 pm Instructor(s): Paul Laskowski
Section 3 Tu 4:00 pm - 5:30 pm Instructor(s): Tanya Roosta
Section 4 Tu 6:30 pm - 8:00 pm Instructor(s): D. Alex Hughes, Bill Chung
Section 5 We 4:00 pm - 5:30 pm Instructor(s): Mark Labovitz
Section 6 We 6:30 pm - 8:00 pm Instructor(s): Mark Labovitz
Section 7 We 6:30 pm - 8:00 pm Instructor(s): Gunnar Kleemann
Section 8 Th 2:00 pm - 3:30 pm Instructor(s): Paul Laskowski
Section 9 Th 4:00 pm - 5:30 pm Instructor(s): Gunnar Kleemann
Section 10 Th 6:30 pm - 8:00 pm Instructor(s): Gunnar Kleemann
Data Science 205. Fundamentals of Data Engineering (3 units)
Storing, managing, and processing datasets are foundational processes in data science. This course introduces the fundamental knowledge and skills of data engineering that are required to be effective as a data scientist. This course focuses on the basics of data pipelines, data pipeline flows and associated business use cases, and how organizations derive value from data and data engineering. As these fundamentals of data engineering are introduced, learners will interact with data and data processes at various stages in the pipeline, understand key data engineering tools and platforms, and use and connect critical technologies through which one can construct storage and processing architectures that underpin data science applications.
Section 1 Mo 4:00 pm - 5:30 pm Instructor(s): Korin Reid
Section 2 Tu 4:00 pm - 5:30 pm Instructor(s): Doris Schioberg
Section 3 Tu 4:00 pm - 5:30 pm Instructor(s): Kevin Crook
Section 4 Tu 6:30 pm - 8:00 pm Instructor(s): Kevin Crook
Section 5 We 2:00 pm - 3:30 pm Instructor(s): Doris Schioberg
Section 6 We 4:00 pm - 5:30 pm Instructor(s): Doris Schioberg
Section 7 We 6:30 pm - 8:00 pm Instructor(s): Doris Schioberg
Section 8 We 6:30 pm - 8:00 pm Instructor(s): Shiraz Chakraverty
Section 9 Th 6:30 pm - 8:00 pm Instructor(s): Kevin Crook
Section 99 Tu 2:00 pm - 3:30 pm Instructor(s): Doris Schioberg
Data Science 207. Applied Machine Learning (3 units)
Machine learning is a rapidly growing field at the intersection of computer science and statistics concerned with finding patterns in data. It is responsible for tremendous advances in technology, from personalized product recommendations to speech recognition in cell phones. This course provides a broad introduction to the key ideas in machine learning. The emphasis will be on intuition and practical examples rather than theoretical results, though some experience with probability, statistics, and linear algebra will be important.
Section 1 Tu 2:00 pm - 3:30 pm Instructor(s): Amit Bhattacharyya
Section 2 Tu 4:00 pm - 5:30 pm Instructor(s): Nedelina Teneva
Section 3 Tu 4:00 pm - 5:30 pm Instructor(s): John Santerre
Section 4 Tu 6:30 pm - 8:00 pm Instructor(s): John Santerre
Section 5 Tu 6:30 pm - 8:00 pm Instructor(s): Nedelina Teneva
Section 6 We 6:30 pm - 8:00 pm Instructor(s): Nedelina Teneva
Section 7 Th 4:00 pm - 5:30 pm Instructor(s): Ishaani Priyadarshini
Section 8 Sa 10:00 am - 11:30 am Instructor(s): Uri Schonfeld
Section 9 Instructor(s): Alberto Todeschini
Section 98 Mo 2:00 pm - 3:30 pm Instructor(s): Ishaani Priyadarshini
Data Science 209. Data Visualization (3 units)
Visualization enhances exploratory analysis as well as efficient communication of data results. This course focuses on the design of visual representations of data in order to discover patterns, answer questions, convey findings, drive decisions, and provide persuasive evidence. The goal is to give you the practical knowledge you need to create effective tools for both exploring and explaining your data. Exercises throughout the course provide a hands-on experience using relevant programming libraries and software tools to apply research and design concepts learned.
Section 1 Mo 4:00 pm - 5:30 pm Instructor(s): Clinton Brownley
Section 2 Mo 6:30 pm - 8:00 pm Instructor(s): Mak Ahmad
Section 3 Tu 6:30 pm - 8:00 pm Instructor(s): Mak Ahmad
Section 4 We 4:00 pm - 5:30 pm Instructor(s): Clinton Brownley
Section 5 We 6:30 pm - 8:00 pm Instructor(s): Fereshteh Amini
Section 6 Th 4:00 pm - 5:30 pm Instructor(s): Bum Chul Kwon
Data Science 210. Capstone (3 units)
The capstone course will cement skills learned throughout the MIDS program (both core data science skills and “soft skills” like problem-solving, communication, influencing, and management), preparing students for success in the field. The centerpiece is a semester-long group project in which teams of students propose and select project ideas, conduct and communicate their work, receive and provide feedback (in informal group discussions and formal class presentations), and deliver compelling presentations along with a web-based final deliverable. Includes relevant readings, case discussions, and real-world examples and perspectives from panel discussions with leading data science experts and industry practitioners.
Section 1 Mo 4:00 pm - 5:30 pm Instructor(s): Todd Holloway, Joyce Shen
Section 2 Mo 6:30 pm - 8:00 pm Instructor(s): Joyce Shen, Korin Reid
Section 3 Tu 2:00 pm - 3:30 pm Instructor(s): Puya H. Vahabi, Zona Kostic
Section 4 Tu 4:00 pm - 5:30 pm Instructor(s): Fred Nugen, Korin Reid
Section 5 Tu 4:00 pm - 5:30 pm Instructor(s): Puya H. Vahabi, Kira Wetzel
Section 6 Tu 6:30 pm - 8:00 pm Instructor(s): Fred Nugen, Korin Reid
Section 7 Tu 6:30 pm - 8:00 pm Instructor(s): Puya H. Vahabi, Danielle Cummings
Section 8 We 2:00 pm - 3:30 pm Instructor(s): Uri Schonfeld, Zona Kostic
Section 9 We 4:00 pm - 5:30 pm Instructor(s): Joyce Shen, Kira Wetzel
Section 10 We 6:30 pm - 8:00 pm Instructor(s): Joyce Shen, Kevin Hartman
Data Science 221. Modern Data Applications (3 units)
This is a multidisciplinary graduate course that synthesizes data management, data economy, and machine learning & AI strategy and research, product innovation, business and enterprise technology strategy, industry analysis, organizational decision-making and data-driven leadership into one course offering. The course provides strategic thinking tools, analytical frameworks, and real-world case examples to help students explore and investigate modern data applications and opportunities in multiple domains and industries. Students are required to participate in weekly sessions and write response pieces as well as a final paper and presentation evaluating one defining application or emerging technology in machine learning/AI end-to-end.
Section 1 TuTh 6:30 pm - 8:00 pm Instructor(s): Joyce Shen
Data Science 231. Behind the Data: Humans and Values (3 units)
Introduces the legal, policy, and ethical implications of data, including privacy, surveillance, security, classification, discrimination, decisional autonomy, and duties to warn or act. Examines legal, policy, and ethical issues throughout the full data science life cycle (collection, storage, processing, analysis, and use), with case studies from criminal justice, national security, health, marketing, politics, education, employment, athletics, and development. Includes legal and policy constraints and considerations for specific domains and data types, collection methods, and institutions; technical, legal, and market approaches to mitigating and managing concerns; and the strengths and benefits of competing and complementary approaches.
Section 1 Instructor(s): Morgan Ames, Jared Maslin, Deb Donig
Section 2 Tu 6:30 pm - 8:00 pm Instructor(s): Morgan Ames, Jared Maslin, Deb Donig
Section 3 We 4:00 pm - 5:30 pm Instructor(s): Morgan Ames, Jared Maslin, Deb Donig
Data Science 233. Privacy Engineering (3 units)
This course surveys privacy mechanisms applicable to systems engineering, with a particular focus on the inference threat arising due to advancements in artificial intelligence and machine learning. We will briefly discuss the history of privacy and compare two major examples of general legal frameworks for privacy from the United States and the European Union. We then survey three design frameworks of privacy that may be used to guide the design of privacy-aware information systems. Finally, we survey threat-specific technical privacy frameworks and discuss their applicability in different settings, including statistical privacy with randomized responses, anonymization techniques, semantic privacy models, and technical privacy mechanisms.
Section 1 Tu 4:00 pm - 5:30 pm Instructor(s): Daniel Aranki
Section 2 Th 4:00 pm - 5:30 pm Instructor(s): Daniel Aranki
Data Science 241. Experiments and Causal Inference (3 units)
This course introduces students to experimentation in the social sciences. This topic has increased considerably in importance since 1995, as researchers have learned to think creatively about how to generate data in more scientific ways, and developments in information technology have facilitated the development of better data gathering. Key to this area of inquiry is the insight that correlation does not necessarily imply causality. In this course, we learn how to use experiments to establish causal effects and how to be appropriately skeptical of findings from observational data.
Section 1 Mo 4:00 pm - 5:30 pm Instructor(s): David Reiley
Section 2 Mo 6:30 pm - 8:00 pm Instructor(s): David Reiley
Section 3 We 4:00 pm - 5:30 pm Instructor(s): Scott Guenther
Section 4 We 6:30 pm - 8:00 pm Instructor(s): Scott Guenther
Data Science 255. Machine Learning Systems Engineering (3 units)
This course provides learners with hands-on data management and systems engineering experience using containers, cloud, and Kubernetes ecosystems based on current industry practice. The course will be project-based with an emphasis on how production systems are used at leading technology-focused companies and organizations. During the course, learners will build a body of knowledge around data management, architectural design, developing batch and streaming data pipelines, scheduling, and security around data including access management and auditability. We’ll also cover how these tools are changing the technology landscape.
Section 1 Tu 4:00 pm - 5:30 pm Instructor(s): Stephen Muchovej
Section 2 Tu 6:30 pm - 8:00 pm Instructor(s): James York-Winegar
Section 3 We 4:00 pm - 5:30 pm Instructor(s): Luis Villarreal
Section 4 We 6:30 pm - 8:00 pm Instructor(s): Luis Villarreal
Section 5 Sa 8:00 am - 9:30 am Instructor(s): James York-Winegar
Section 6 We 6:30 pm - 8:00 pm Instructor(s): James York-Winegar, Amanda Ford
Data Science 261. Machine Learning at Scale (3 units)
This course teaches the underlying principles required to develop scalable machine learning pipelines for structured and unstructured data at the petabyte scale. Students will gain hands-on experience in Apache Hadoop and Apache Spark.
Section 1 Mo 6:30 pm - 8:00 pm Instructor(s): Siinn Che
Section 2 Tu 6:30 pm - 8:00 pm Instructor(s): Ramakrishna Gummadi
Section 3 We 4:00 pm - 5:30 pm Instructor(s): Vinicio De Sola
Section 4 We 6:30 pm - 8:00 pm Instructor(s): Vinicio De Sola
Section 5 Th 6:30 pm - 8:00 pm Instructor(s): Ramakrishna Gummadi
Section 6 Fr 2:00 pm - 3:30 pm Instructor(s): Vinicio De Sola
Data Science 266. Natural Language Processing with Deep Learning (3 units)
Understanding language is fundamental to human interaction. Our brains have evolved language-specific circuitry that helps us learn it very quickly; however, this also means that we have great difficulty explaining how exactly meaning arises from sounds and symbols. This course is a broad introduction to linguistic phenomena and our attempts to analyze them with machine learning. We will cover a wide range of concepts with a focus on practical applications such as information extraction, machine translation, sentiment analysis, and summarization.
Section 1 Tu 2:00 pm - 3:30 pm Instructor(s): Peter Grabowski
Section 2 Tu 4:00 pm - 5:30 pm Instructor(s): Jennifer Zhu
Section 3 Tu 6:30 pm - 8:00 pm Instructor(s): Jennifer Zhu
Section 4 We 2:00 pm - 3:30 pm Instructor(s): Amit Bhattacharyya
Section 5 We 4:00 pm - 5:30 pm Instructor(s): Natalie Ahn
Section 6 We 6:30 pm - 8:00 pm Instructor(s): Mike Tamir, Paul Spiegelhalter
Section 7 Th 6:30 pm - 8:00 pm Instructor(s): Mark Butler
Data Science 271. Statistical Methods for Discrete Response, Time Series, and Panel Data (3 units)
A continuation of Data Science 203 (Statistics for Data Science), this course trains data science students to apply more advanced methods from regression analysis and time series models. Central topics include linear regression, causal inference, identification strategies, and a wide range of time series models that are frequently used by industry professionals. Throughout the course, we emphasize choosing, applying, and implementing statistical techniques to capture key patterns and generate insight from data. Students who successfully complete this course will be able to distinguish between appropriate and inappropriate techniques given the problem under consideration, the data available, and the given timeframe.
Section 1 Instructor(s): Majid Maki
Section 2 Tu 4:00 pm - 5:30 pm Instructor(s): Vinod Bakthavachalam
Section 3 Th 4:00 pm - 5:30 pm Instructor(s): Mark Labovitz
Data Science 281. Computer Vision (3 units)
This course introduces the theoretical and practical aspects of computer vision, covering both classical and state-of-the-art deep-learning-based approaches. It covers the basics of the image formation process in digital cameras and biological systems; a mathematical and practical treatment of basic image processing; space/frequency representations; classical computer vision techniques for making 3-D measurements from images; and modern deep-learning-based techniques for image classification and recognition.
Section 1 Mo 4:00 pm - 5:30 pm Instructor(s): Rachel Brown
Section 2 We 4:00 pm - 5:30 pm Instructor(s): Alberto Todeschini, Rachel Brown, Senthil Periaswamy
Section 3 Mo 6:30 pm - 8:00 pm Instructor(s): Rachel Brown
Data Science 290. Generative AI: Foundations, Techniques, Challenges, and Opportunities (3 units)
Recent developments in neural network architectures, algorithms, and computing hardware have led to a revolutionary development now usually referred to as generative AI. Large language models (LLMs) are now able to generate seemingly human-like text in response to tasks such as summarization and question answering. Leveraging similar strategies, comparable advances have been made with images as well as audio. With today’s (and anticipated future) capabilities, generative AI is poised to be a tool used comprehensively in a wide variety of ways, and therefore to have a profound set of effects on our lives and society as a whole.
This course is a broad introduction to these new technologies. It is split conceptually into three parts. In the introduction section, we will cover the historical aspects, key ideas, and learnings all the way to Transformer architectures and training aspects. In the practical aspects and techniques section, we will learn how to deploy, use, and train LLMs. We will discuss core concepts like prompt tuning, quantization, and parameter-efficient fine-tuning, and we will also explore use case patterns. Finally, we will discuss the challenges and opportunities offered by generative AI, where we will highlight critical issues like bias and inclusivity, fake information, and safety, as well as some IP issues.
Our focus will be on practical aspects of LLMs to enable students to be both effective and responsible users of generative AI technologies.
Section 1 Tu 4:00 pm - 5:30 pm, Fr 4:00 pm - 5:30 pm Instructor(s): Mark Butler MIDS only. Prerequisites: DATASCI 207
Section 2 Tu 6:30 pm - 8:00 pm, Fr 4:00 pm - 5:30 pm Instructor(s): Joachim Rahmfeld, Mark Butler MIDS only. Prerequisites: DATASCI 207
Section 3 We 6:30 pm - 8:00 pm, Fr 4:00 pm - 5:30 pm Instructor(s): Mark Butler MIDS only. Prerequisites: DATASCI 207