DATA145

DATA 145 - Evidence and Uncertainty

Data Science Undergraduate Studies Undergraduate CDSS - Clg of Comp Data Sci & Society

Subject

DATA

Course Number

145

Course Level

Undergraduate

Course Title

Evidence and Uncertainty

Course Description

When we learn about the world from data, how much can we rely on the conclusions we draw? How do we know if we could do better? This course will cover the statistical theory required to measure and control our uncertainty when we analyze large and complex modern data sets. We will use mathematical and computational lenses to examine optimality properties and error bounds. Topics include the Bayesian and frequentist paradigms, asymptotic and finite-sample methods, parametric and nonparametric techniques, causality, and multiple testing.

Minimum

4

Maximum

4

Grading Basis

Default Letter Grade; P/NP Option

Method of Assessment

Written Exam

Instructors

Adhikari, Fithian

Prerequisites

Math 53, Data C100, and either Data C140 or EECS 126, each completed with a grade of C- or better, or Pass.

Repeat Rules

Course is not repeatable for credit.

Credit Restriction Courses. Students will receive no credit for this course if the following course(s) have already been completed.

-

Credit Restrictions.

Students will receive no credit for DATA 145 after completing STAT 210A.

Credit Replacement Courses. Upon passing, students can use the following course(s) to replace a deficient grade for this course.

-

Course Objectives

The course is primarily intended for students interested in machine learning and artificial intelligence, whether in industry or academia. It will also be helpful preparation for students who want to study statistics at the graduate level. It will examine approaches to defining and modeling uncertainty, and will identify connections and differences between the frequentist and Bayesian paradigms. The emphasis will be on situations where classical statistical methods do not apply and only minimal distributional assumptions can be made. In such settings, computational solutions may remain feasible even when the mathematics becomes intractable.

Student Learning Outcomes

Students will understand the need for statistical inference in data science and why the Bayesian viewpoint is so pervasive in modern data analysis. They will recognize the power and limitations of classical methods and newer computationally intensive approaches. They will appreciate the optimality or near-optimality properties of some asymptotic methods, and learn how to work in finite-sample settings where asymptotic methods do not apply. Throughout, they will use mathematics and computation as needed for problem solving. Upon leaving the course, students should be able to follow upcoming developments in the field without extensive further education in statistical inference.

Formats

Lecture, Discussion

Term

Fall and Spring

Duration (in weeks)

15

Minimum Hours

3

Maximum Hours

3

Lecture Mode of Instruction

In Person

Minimum Hours

1

Maximum Hours

1

Discussion Mode of Instruction

In Person

Minimum Hours

8

Maximum Hours

8