Lectures

The lecture schedule will be updated as the term progresses.

Date	Topic	Readings
Tue, Jan 7, 2025	Intro [slides]
Thu, Jan 9, 2025	What are Agents? [slides]	Stuart Russell and Peter Norvig, AIMA Chapter 1
Tue, Jan 14, 2025	Simulated Environments and Reality [slides]	Peter A. Jansen, A Systematic Survey of Text Worlds as Embodied Natural Language Environments; Jared Sorensen, Action Castle
Thu, Jan 16, 2025	How to Make a Simulation? [slides]	Fares Alaboud, Intro to PDDL; Graham Nelson, Intro to Inform7; Andrew Plotkin, The Visible Zorker; Optional: Jason Scott, GET LAMP: The Text Adventure Documentary
Tue, Jan 21, 2025	HW0 Due
Tue, Jan 21, 2025	Search for Planning in Simulations [slides]	Kory Becker, Intro to STRIPS; Stuart Russell and Peter Norvig, AIMA Chapter 11; Rich Sutton and Andrew Barto, RL Book Chapter 8.1, 8.9-8.11
Wed, Jan 22, 2025	Finalize Project Groups
Thu, Jan 23, 2025	Classical Control, Pre-Deep Learning [slides]	Rich Sutton and Andrew Barto, RL Book Chapter 4, 5
Tue, Jan 28, 2025	Deep Reinforcement Learning, Pre-LLMs [slides]	Rich Sutton and Andrew Barto, RL Book Chapter 6.1-6.5; Mnih et al. 2013, Playing Atari with Deep Reinforcement Learning; Optional: Mnih et al. 2015, (Nature version) Human-level control through deep reinforcement learning
Thu, Jan 30, 2025	Project Workshopping
Tue, Feb 4, 2025	Project Pitches
Thu, Feb 6, 2025	HW1 Due
Thu, Feb 6, 2025	Reinforcement Learning and Search Combined [slides]	Silver et al. 2017, (AlphaZero) Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm; Ammanabrolu et al. 2019, How to Avoid Being Eaten by a Grue: Structured Exploration Strategies for Textual Worlds; Optional: Schrittwieser et al. 2019, (MuZero) Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model
Fri, Feb 7, 2025	Project Pitch Slide Decks Due
Tue, Feb 11, 2025	Attention and Language Modeling [slides]	Vaswani et al., Attention is All You Need; Jay Alammar, The Illustrated Transformer; Brown et al., Language Models are Few-Shot Learners
Thu, Feb 13, 2025	RL for Language Agents Pt 1 (Online RL for NLP, RLHF) [slides]	Rich Sutton and Andrew Barto, RL Book Chapter 13; Ramamurthy, Ammanabrolu et al., Is Reinforcement Learning (Not) for Natural Language Processing? Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization; Ouyang et al., Training Language Models to Follow Instructions with Human Feedback
Tue, Feb 18, 2025	RL for Language Agents Pt 2 (Rewards and Closed Form Methods) [slides]	Lightman et al., Let's Verify Step by Step; Wu et al., Fine-Grained Human Feedback Gives Better Rewards for Language Model Training; Rafailov et al., Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Thu, Feb 20, 2025	HW2 Due
Thu, Feb 20, 2025	Prompt Optimization [slides]	Yao et al., ReAct: Synergizing Reasoning and Acting in Language Models; Lewis et al., (RAG) Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks; Anthropic, Building Effective Agents; Optional: Khattab et al., DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines
Tue, Feb 25, 2025	Neurosymbolic Tool Use Methods and Agent Reasoning [slides]	Wang et al., Behavior Cloned Transformers are Neurosymbolic Reasoners; Liu et al., LLM+P: Empowering Large Language Models with Optimal Planning Proficiency; Patil et al., Gorilla: Large Language Model Connected with Massive APIs; Valmeekam et al., PlanBench: An Extensible Benchmark for Evaluating Large Language Models on Planning and Reasoning about Change
Thu, Feb 27, 2025	Agent Reasoning and Inference Time Methods [slides]	Zelikman et al., Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking; Snell et al., Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters; Gandhi et al., Stream of Search (SoS): Learning to Search in Language
Tue, Mar 4, 2025	Multi-Agent Systems and Human-AI Collaboration [slides]	Urbanek et al., LIGHT - A large-scale crowdsourced fantasy text game; Park et al., Generative Agents: Interactive Simulacra of Human Behavior; Vats et al., A Survey on Human-AI Teaming with Large Pre-Trained Models
Thu, Mar 6, 2025	HW3 Due
Thu, Mar 6, 2025	Societal Impacts and Agent Safety [slides]	Tamkin et al., Evaluating and Mitigating Discrimination in Language Model Decisions; Labunets et al., Computing Optimization-Based Prompt Injections Against Closed-Weights Models By Misusing a Fine-Tuning API; Bowman et al., Measuring Progress on Scalable Oversight for Large Language Models; Optional: Chakrabarty et al., Creativity Support in the Age of Large Language Models: An Empirical Study Involving Emerging Writers; Optional: Ammanabrolu et al., Aligning to Social Norms and Values in Interactive Narratives
Tue, Mar 11, 2025	Final Presentations Pt 1
Thu, Mar 13, 2025	Final Presentations Pt 2
Mon, Mar 17, 2025	Final Project Writeups Due