The lecture schedule will be updated as the term progresses.

Date Topic Readings
Tue, Jan 6, 2026 Intro [slides] Optional: Jason Scott, GET LAMP: The Text Adventure Documentary
Thu, Jan 8, 2026 What are Agents? [slides] Stuart Russell and Peter Norvig, AIMA Chapter 1
Tue, Jan 13, 2026 Simulated Environments and Reality [slides] Peter A. Jansen, A Systematic Survey of Text Worlds as Embodied Natural Language Environments
Jared Sorensen, Action Castle
Thu, Jan 15, 2026 How to Make a Simulation? [slides] Fares Alaboud, Intro to PDDL
Graham Nelson, Intro to Inform7
Andrew Plotkin, The Visible Zorker
Optional: Jason Scott, GET LAMP: The Text Adventure Documentary
Tue, Jan 20, 2026 HW0 Due
Tue, Jan 20, 2026 Search for Planning in Simulations [slides] Kory Becker, Intro to STRIPS
Stuart Russell and Peter Norvig, AIMA Chapter 11
Rich Sutton and Andrew Barto, RL Book Chapter 8.1, 8.9-8.11
Thu, Jan 22, 2026 Finalize Project Groups
Thu, Jan 22, 2026 Classical Control, Pre-Deep Learning [slides] Rich Sutton and Andrew Barto, RL Book Chapter 4, 5
Tue, Jan 27, 2026 Deep Reinforcement Learning, Pre-LLMs [slides] Rich Sutton and Andrew Barto, RL Book Chapter 6.1-6.5
Mnih et al. 2013, Playing Atari with Deep Reinforcement Learning
Optional: Mnih et al. 2015, (Nature version) Human-level control through deep reinforcement learning
Thu, Jan 29, 2026 Reinforcement Learning and Search Combined [slides] Silver et al. 2017, (AlphaZero) Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm
Ammanabrolu et al. 2019, How to Avoid Being Eaten by a Grue: Structured Exploration Strategies for Textual Worlds
Optional: Schrittwieser et al. 2019, (MuZero) Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model
Tue, Feb 3, 2026 Project Pitches 1
Thu, Feb 5, 2026 Project Pitches 2
Tue, Feb 3, 2026 HW1 Due
Mon, Feb 9, 2026 Project Pitch Slide Decks Due
Tue, Feb 10, 2026 Attention and Language Modeling [slides] Vaswani et al., Attention is All You Need
Jay Alammar, The Illustrated Transformer
Brown et al., Language Models are Few-Shot Learners
Thu, Feb 12, 2026 RL for Language Agents Pt 1 (Online RL for NLP, RLHF) [slides] Rich Sutton and Andrew Barto, RL Book Chapter 13
Ramamurthy*, Ammanabrolu* et al., Is Reinforcement Learning (Not) for Natural Language Processing? Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization
Ouyang et al., Training Language Models to Follow Instructions with Human Feedback
Tue, Feb 17, 2026 HW2 Due
Tue, Feb 17, 2026 RL for Language Agents Pt 2 (Rewards and Closed Form Methods) [slides] Lightman et al., Let's Verify Step by Step
Wu et al., Fine-Grained Human Feedback Gives Better Rewards for Language Model Training
Rafailov et al., Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Thu, Feb 19, 2026 Prompt Optimization [slides] Yao et al., ReAct: Synergizing Reasoning and Acting in Language Models
Lewis et al., (RAG) Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
Anthropic, Building Effective Agents
Optional: Khattab et al., DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines
Tue, Feb 24, 2026 Neurosymbolic Tool Use Methods and Agent Reasoning [slides] Wang et al., Behavior Cloned Transformers are Neurosymbolic Reasoners
Liu et al., LLM+P: Empowering Large Language Models with Optimal Planning Proficiency
Patil et al., Gorilla: Large Language Model Connected with Massive APIs
Valmeekam et al., PlanBench: An Extensible Benchmark for Evaluating Large Language Models on Planning and Reasoning about Change
Thu, Feb 26, 2026 Agent Reasoning and Inference Time Methods [slides] Lilian Weng, Why We Think
Zelikman et al., Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking
Snell et al., Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters
Gandhi et al., Stream of Search (SoS): Learning to Search in Language
Tue, Mar 3, 2026 HW3 Due
Tue, Mar 3, 2026 Multi-Agent Systems and Human-AI Collaboration [slides] Urbanek et al., LIGHT - A large-scale crowdsourced fantasy text game
Park et al., Generative Agents: Interactive Simulacra of Human Behavior
Vats et al., A Survey on Human-AI Teaming with Large Pre-Trained Models
Thu, Mar 5, 2026 Agent Safety, Security, and Legality [slides] Tamkin et al., Evaluating and Mitigating Discrimination in Language Model Decisions
Desai and Riedl, Responsible AI Agents
Labunets et al., Computing Optimization-Based Prompt Injections Against Closed-Weights Models By Misusing a Fine-Tuning API
Bowman et al., Measuring Progress on Scalable Oversight for Large Language Models
Optional: Chakrabarty et al., Creativity Support in the Age of Large Language Models: An Empirical Study Involving Emerging Writers
Optional: Ammanabrolu et al., Aligning to Social Norms and Values in Interactive Narratives
Tue, Mar 10, 2026 Final Project Presentations 1
Thu, Mar 12, 2026 Final Project Presentations 2