The lecture schedule will be updated as the term progresses.

Date Topic Readings
Tue, Jan 7, 2025 Intro [slides]
Thu, Jan 9, 2025 What are Agents? [slides] Stuart Russell and Peter Norvig, AIMA Chapter 1
Tue, Jan 14, 2025 Simulated Environments and Reality [slides] Peter A. Jansen, A Systematic Survey of Text Worlds as Embodied Natural Language Environments
Jared Sorensen, Action Castle
Thu, Jan 16, 2025 How to Make a Simulation? [slides] Fares Alaboud, Intro to PDDL
Graham Nelson, Intro to Inform7
Andrew Plotkin, The Visible Zorker
Optional: Jason Scott, GET LAMP: The Text Adventure Documentary
Tue, Jan 21, 2025 HW0 Due
Tue, Jan 21, 2025 Search for Planning in Simulations [slides] Kory Becker, Intro to STRIPS
Stuart Russell and Peter Norvig, AIMA Chapter 11
Rich Sutton and Andrew Barto, RL Book Chapter 8.1, 8.9-8.11
Wed, Jan 22, 2025 Finalize Project Groups
Thu, Jan 23, 2025 Classical Control, Pre-Deep Learning [slides] Rich Sutton and Andrew Barto, RL Book Chapter 4, 5
Tue, Jan 28, 2025 Deep Reinforcement Learning, Pre-LLMs [slides] Rich Sutton and Andrew Barto, RL Book Chapter 6.1-6.5
Mnih et al. 2013, Playing Atari with Deep Reinforcement Learning
Optional: Mnih et al. 2015, (Nature version) Human-level control through deep reinforcement learning
Thu, Jan 30, 2025 Project Workshopping
Tue, Feb 4, 2025 Project Pitches
Thu, Feb 6, 2025 HW1 Due
Fri, Feb 7, 2025 Project Pitch Slide Decks Due
Thu, Feb 6, 2025 Reinforcement Learning and Search Combined [slides] Silver et al. 2017, (AlphaZero) Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm
Ammanabrolu et al. 2019, How to Avoid Being Eaten by a Grue: Structured Exploration Strategies for Textual Worlds
Optional: Schrittwieser et al. 2019, (MuZero) Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model
Tue, Feb 11, 2025 Attention and Language Modeling [slides] Vaswani et al., Attention is All You Need
Jay Alammar, The Illustrated Transformer
Brown et al., Language Models are Few-Shot Learners
Thu, Feb 13, 2025 RL for Language Agents Pt 1 (Online RL for NLP, RLHF) [slides] Rich Sutton and Andrew Barto, RL Book Chapter 13
Ramamurthy*, Ammanabrolu* et al., Is Reinforcement Learning (Not) for Natural Language Processing? Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization
Ouyang et al., Training Language Models to Follow Instructions with Human Feedback
Tue, Feb 18, 2025 RL for Language Agents Pt 2 (Rewards and Closed Form Methods) [slides] Lightman et al., Let's Verify Step by Step
Wu et al., Fine-Grained Human Feedback Gives Better Rewards for Language Model Training
Rafailov et al., Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Thu, Feb 20, 2025 HW2 Due
Thu, Feb 20, 2025 Prompt Optimization [slides] Yao et al., ReAct: Synergizing Reasoning and Acting in Language Models
Lewis et al., (RAG) Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
Anthropic, Building Effective Agents
Optional: Khattab et al., DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines
Tue, Feb 25, 2025 Neurosymbolic Tool Use Methods and Agent Reasoning [slides] Wang et al., Behavior Cloned Transformers are Neurosymbolic Reasoners
Liu et al., LLM+P: Empowering Large Language Models with Optimal Planning Proficiency
Patil et al., Gorilla: Large Language Model Connected with Massive APIs
Valmeekam et al., PlanBench: An Extensible Benchmark for Evaluating Large Language Models on Planning and Reasoning about Change
Thu, Feb 27, 2025 Agent Reasoning and Inference Time Methods [slides] Zelikman et al., Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking
Snell et al., Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters
Gandhi et al., Stream of Search (SoS): Learning to Search in Language
Tue, Mar 4, 2025 Multi-Agent Systems and Human-AI Collaboration [slides] Urbanek et al., LIGHT - A large-scale crowdsourced fantasy text game
Park et al., Generative Agents: Interactive Simulacra of Human Behavior
Vats et al., A Survey on Human-AI Teaming with Large Pre-Trained Models
Thu, Mar 6, 2025 HW3 Due
Thu, Mar 6, 2025 Societal Impacts and Agent Safety [slides] Tamkin et al., Evaluating and Mitigating Discrimination in Language Model Decisions
Labunets et al., Computing Optimization-Based Prompt Injections Against Closed-Weights Models By Misusing a Fine-Tuning API
Bowman et al., Measuring Progress on Scalable Oversight for Large Language Models
Optional: Chakrabarty et al., Creativity Support in the Age of Large Language Models: An Empirical Study Involving Emerging Writers
Optional: Ammanabrolu et al., Aligning to Social Norms and Values in Interactive Narratives
Tue, Mar 11, 2025 Final Presentations Pt 1
Thu, Mar 13, 2025 Final Presentations Pt 2
Mon, Mar 17, 2025 Final Project Writeups Due