B9145: Topics in Trustworthy AI
Hongseok Namkoong, Columbia University, Spring 2025
Description
Pre-trained AI systems have achieved remarkable capabilities in understanding videos, text, and code, demonstrating reasoning abilities that match or surpass human experts. While these omnipresent systems present unprecedented societal opportunities, significant challenges remain before they can meaningfully transform real-world decision-making problems. A fundamental challenge is that AI systems inevitably encounter inputs unseen during training, as they must operate continuously while processing diverse real-world data, including customer feedback and user interactions.

Although scaling datasets has improved capabilities, it has not solved this core challenge. Modern AI systems, despite training on datasets orders of magnitude larger than human experience, still struggle with tail inputs: they hallucinate, cannot quantify uncertainty, and perform poorly on underrepresented groups. The ability to handle tail inputs is a longstanding open problem in AI, with limited fundamental progress over past decades. As we exhaust easily available data sources, it is becoming clear that we must rethink the standard machine learning paradigm. Our lack of understanding of failure modes highlights the need for both more reliable models and rigorous safety evaluation methods.

This course surveys emerging topics in trustworthy machine learning, spanning data collection, pre-training, finetuning, and inference-time methods. Most topics discussed are active research areas, with reading materials drawn from recent literature (to be posted on the website). The goal is to foster discussion on new research questions, encompassing theoretical and methodological developments, modeling considerations, novel applications, and practical challenges.

Course outline
The course will consist of pedagogical lectures and seminar-style guided discussions. We will begin with an overview of recent advances in AI. Then, we will cover recent work on improving reliability in machine learning. Since trustworthiness is a loosely defined term with many connotations, we will explore various aspects of this concept, alongside a discussion of future directions. The following is a selection of topics that will be covered in the course.
Lectures
Thursdays, 9am–12:15pm, Kravis 430

Course staff
Hongseok Namkoong (Instructor)
Daksh Mittal (TA)

Prerequisites
There are no formal prerequisites, but the class will be fast-paced and will assume a strong background in machine learning, statistics, and optimization. This class is intended for PhD students conducting research in related fields. Although some materials are of applied interest, this course has significant theoretical content that requires mathematical maturity. The ability to read, write, and think rigorously is essential to understanding the material.

Grading
Final project (70%), class presentation (30%)

Previous course offering
This is a research topics class that is significantly updated with new materials every time it is offered. The course was last offered in Fall 2020 and Spring 2023.