B9145: Topics in Trustworthy AI

Hongseok Namkoong, Columbia University, Spring 2025

Description

Pre-trained AI systems have achieved remarkable capabilities in understanding videos, text, and code, demonstrating reasoning abilities that match or surpass those of human experts. While these omnipresent systems present unprecedented societal opportunities, significant challenges remain before they can meaningfully transform real-world decision-making problems.

A fundamental challenge is that AI systems inevitably encounter inputs unseen during training, as they must operate continuously while processing diverse real-world data, including customer feedback and user interactions. Although scaling datasets has improved capabilities, it has not solved this core challenge. Modern AI systems, despite being trained on datasets orders of magnitude larger than human experience, still struggle with tail inputs – they hallucinate, cannot quantify uncertainty, and perform poorly on underrepresented groups.

The ability to handle tail inputs is a longstanding open problem in AI, with limited fundamental progress over the past decades. As we exhaust easily available data sources, it is becoming clear that we must rethink the standard machine learning paradigm. Our lack of understanding of failure modes highlights the need for both more reliable models and rigorous safety evaluation methods.

This course surveys emerging topics in trustworthy machine learning, spanning data collection, pre-training, finetuning, and inference-time methods. Most topics discussed are active research areas, with reading materials drawn from recent literature (to be posted on the website). The goal is to foster discussion on new research questions, encompassing theoretical and methodological developments, modeling considerations, novel applications, and practical challenges.

Course outline

The course will comprise pedagogical lectures and seminar-style guided discussions. We will begin with an overview of recent advances in AI:

  • Architectures, optimization algorithms, and datasets

  • Pre-training on web-scale data

  • Finetuning on downstream tasks, including supervised and RL-based methods

  • Inference-time search methods

Then, we will cover recent work on improving reliability in machine learning. Since trustworthiness is a loosely defined term with many connotations, we will explore its various aspects, alongside a discussion of future directions. The following is a selection of topics that will be covered in the course:

  • Data-centric view of AI systems

  • Distribution shift

  • Uncertainty quantification

  • Adaptive data collection (active exploration)

  • Adversarial attacks

  • Fairness, equity, and data provenance

  • Causal learning

Lectures

Thursdays, 9am–12:15pm, Kravis 430

Course staff

Hongseok Namkoong (Instructor)

  • Email: namkoong@gsb.columbia.edu

  • Office hours by appointment

Daksh Mittal (TA)

  • Email: DMittal27@gsb.columbia.edu

Prerequisites

There are no formal prerequisites, but the class will be fast-paced and will assume a strong background in machine learning, statistics, and optimization. This is a class intended for PhD students conducting research in related fields. Although some materials are of applied interest, this course has significant theoretical content that requires mathematical maturity. The ability to read, write, and think rigorously is essential to understanding the material.

Grading

Final project (70%), class presentation (30%)

Previous course offering

This is a research topics class that is significantly updated with new material each time it is offered. The course was previously offered in Fall 2020 and Spring 2023.