Lectures

The lecture schedule will be updated as the term progresses.

Date	Topic	Readings
Tue, Jan 6, 2026	Intro [slides]	Optional: Jason Scott, GET LAMP: The Text Adventure Documentary
Thu, Jan 8, 2026	What are Agents? [slides]	Stuart Russell and Peter Norvig, AIMA Chapter 1
Tue, Jan 13, 2026	Simulated Environments and Reality [slides]	Peter A. Jansen, A Systematic Survey of Text Worlds as Embodied Natural Language Environments; Jared Sorensen, Action Castle
Thu, Jan 15, 2026	How to Make a Simulation? [slides]	Fares Alaboud, Intro to PDDL; Graham Nelson, Intro to Inform7; Andrew Plotkin, The Visible Zorker; Optional: Jason Scott, GET LAMP: The Text Adventure Documentary
Tue, Jan 20, 2026	HW0 Due
Tue, Jan 20, 2026	Search for Planning in Simulations [slides]	Kory Becker, Intro to STRIPS; Stuart Russell and Peter Norvig, AIMA Chapter 11; Rich Sutton and Andrew Barto, RL Book Chapter 8.1, 8.9-8.11
Thu, Jan 22, 2026	Finalize Project Groups
Thu, Jan 22, 2026	Classical Control, Pre-Deep Learning [slides]	Rich Sutton and Andrew Barto, RL Book Chapter 4, 5
Tue, Jan 27, 2026	Deep Reinforcement Learning, Pre-LLMs [slides]	Rich Sutton and Andrew Barto, RL Book Chapter 6.1-6.5; Mnih et al. 2013, Playing Atari with Deep Reinforcement Learning; Optional: Mnih et al. 2015, (Nature version) Human-level control through deep reinforcement learning
Thu, Jan 29, 2026	Reinforcement Learning and Search Combined [slides]	Silver et al. 2017, (Alpha Zero) Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm; Ammanabrolu et al. 2019, How to Avoid Being Eaten by a Grue: Structured Exploration Strategies for Textual Worlds; Optional: Schrittwieser et al. 2019, (Mu Zero) Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model
Tue, Feb 3, 2026	HW1 Due
Tue, Feb 3, 2026	Project Pitches 1
Thu, Feb 5, 2026	Project Pitches 2
Mon, Feb 9, 2026	Project Pitch Slide Decks Due
Tue, Feb 10, 2026	Attention and Language Modeling [slides]	Vaswani et al., Attention is All You Need; Jay Alammar, The Illustrated Transformer; Brown et al., Language Models are Few-Shot Learners
Thu, Feb 12, 2026	RL for Language Agents Pt 1 (Online RL for NLP, RLHF) [slides]	Rich Sutton and Andrew Barto, RL Book Chapter 13; Ramamurthy, Ammanabrolu et al., Is Reinforcement Learning (Not) for Natural Language Processing? Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization; Ouyang et al., Training Language Models to Follow Instructions with Human Feedback
Tue, Feb 17, 2026	HW2 Due
Tue, Feb 17, 2026	RL for Language Agents Pt 2 (Rewards and Closed Form Methods) [slides]	Lightman et al., Let's Verify Step by Step; Wu et al., Fine-Grained Human Feedback Gives Better Rewards for Language Model Training; Rafailov et al., Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Thu, Feb 19, 2026	Prompt Optimization [slides]	Yao et al., ReAct: Synergizing Reasoning and Acting in Language Models; Lewis et al., (RAG) Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks; Anthropic, Building Effective Agents; Optional: Khattab et al., DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines
Tue, Feb 24, 2026	Neurosymbolic Tool Use Methods and Agent Reasoning [slides]	Wang et al., Behavior Cloned Transformers are Neurosymbolic Reasoners; Liu et al., LLM+P: Empowering Large Language Models with Optimal Planning Proficiency; Patil et al., Gorilla: Large Language Model Connected with Massive APIs; Valmeekam et al., PlanBench: An Extensible Benchmark for Evaluating Large Language Models on Planning and Reasoning about Change
Thu, Feb 26, 2026	Agent Reasoning and Inference Time Methods [slides]	Lilian Weng, Why We Think; Zelikman et al., Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking; Snell et al., Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters; Gandhi et al., Stream of Search (SoS): Learning to Search in Language
Tue, Mar 3, 2026	HW3 Due
Tue, Mar 3, 2026	Multi-Agent Systems and Human-AI Collaboration [slides]	Urbanek et al., LIGHT - A large-scale crowdsourced fantasy text game; Park et al., Generative Agents: Interactive Simulacra of Human Behavior; Vats et al., A Survey on Human-AI Teaming with Large Pre-Trained Models
Thu, Mar 5, 2026	Agent Safety, Security, and Legality [slides]	Tamkin et al., Evaluating and Mitigating Discrimination in Language Model Decisions; Desai and Riedl, Responsible AI Agents; Labunets et al., Computing Optimization-Based Prompt Injections Against Closed-Weights Models By Misusing a Fine-Tuning API; Bowman et al., Measuring Progress on Scalable Oversight for Large Language Models; Optional: Chakrabarty et al., Creativity Support in the Age of Large Language Models: An Empirical Study Involving Emerging Writers; Optional: Ammanabrolu et al., Aligning to Social Norms and Values in Interactive Narratives
Tue, Mar 10, 2026	Final Project Presentations 1
Thu, Mar 12, 2026	Final Project Presentations 2