CS294 - LLM Agents - Notes

LLM Agents: a free UC Berkeley MOOC course (CS 294/194-196)

Lecture 1

  • Intermediate reasoning steps improve LLM performance. (Does the LLM not actually know reasoning as such?)

  • Chain-of-thought prompting is very effective. An example problem involves teaching a last-letter pattern by example:

    Bill Gates --> ls; Elon Musk --> nk; then, Barack Obama --> ?? (Answer is ka)

    These exact examples are not in the training data, so how does it work? Does the LLM have basic reasoning ability grounded in natural language that prompting leverages?

  • Reflection questions are effective only when the answer is wrong. If the answer is already right, reflection can push the model to replace it with a wrong one.

  • Should we generate multiple LLM answers and then have the LLM compare them and select the best one? Probably not, because each answer is itself generated by picking the highest-quality next token; it is not clear whether whole-answer evaluation is even possible. Alternatively, multiple LLMs could generate answers and another LLM could compare and select among them, though it is unclear whether such an approach makes sense. Perhaps an agent could do this.

  • LLMs are easily distracted by irrelevant information, but when we explicitly remind them to ignore it, they perform better.

  • LLMs are good at picking up a task from a few examples. This is basically a prompting technique: we explicitly show or explain the task to the LLM, and it does it. Depending on the number of examples, we call it zero-shot, one-shot, or few-shot learning.

  • Beyond prompting, there are also training, fine-tuning, reinforcement learning, RAG, etc.
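The last-letter pattern in the chain-of-thought example above is easy to pin down in code: each name maps to the concatenation of the last letters of its words. A minimal sketch (the helper name is mine, not from the lecture):

```python
def last_letters(name: str) -> str:
    """Concatenate the last letter of each word in a name."""
    return "".join(word[-1] for word in name.split())

# The examples from the lecture follow this rule:
print(last_letters("Bill Gates"))    # -> ls
print(last_letters("Elon Musk"))     # -> nk
print(last_letters("Barack Obama"))  # -> ka
```

Chain-of-thought prompting works on this task by having the model spell out the per-word steps ("the last letter of 'Barack' is 'k' ...") before committing to a final answer.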
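The "generate multiple answers and pick one" idea above is close to what the literature calls self-consistency: sample several reasoning paths and take a majority vote over the final answers, with no separate evaluator needed. A toy sketch, with a deterministic stub standing in for real sampled LLM completions:

```python
from collections import Counter
from itertools import cycle

# Stub standing in for sampled LLM completions; a real implementation
# would draw completions from a model at temperature > 0.
_samples = cycle(["18", "18", "26"])  # two of every three samples agree

def sample_answer(question: str) -> str:
    return next(_samples)

def self_consistency(question: str, n: int = 15) -> str:
    """Sample n answers and return the most frequent final answer."""
    votes = Counter(sample_answer(question) for _ in range(n))
    return votes.most_common(1)[0][0]

print(self_consistency("How many apples are left?"))  # -> 18
```

Majority voting sidesteps the "evaluate a whole answer" problem: no single generation has to be trusted, only the aggregate.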

Lecture 2 ( LLM agents: brief history and overview )

  • The ReAct framework is currently the basis of most agents.
  • Iterations over reasoning and acting get agents closer to achieving their goals and objectives.
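The reason-act iteration can be sketched as a loop of Thought -> Action -> Observation steps. In the sketch below the model turns are scripted and the single tool is a toy calculator; a real ReAct agent would obtain each Thought/Action from an LLM and may loop many more times:

```python
def calculator(expr: str) -> str:
    # Toy tool: evaluate an arithmetic expression with builtins disabled.
    return str(eval(expr, {"__builtins__": {}}))

TOOLS = {"calculator": calculator}

# Scripted model turns standing in for LLM completions.
SCRIPT = [
    ("Thought: I need to compute 12 * 7.", ("calculator", "12 * 7")),
    ("Thought: 84 is the answer.", ("finish", "84")),
]

def react_loop():
    """Alternate reasoning and acting until the model emits a final answer."""
    transcript = []
    for thought, (action, arg) in SCRIPT:
        transcript.append(thought)
        if action == "finish":
            transcript.append(f"Answer: {arg}")
            return arg, transcript
        observation = TOOLS[action](arg)
        transcript.append(f"Action: {action}[{arg}]")
        transcript.append(f"Observation: {observation}")

answer, log = react_loop()
print(answer)  # -> 84
```

Each observation is appended to the running transcript, which is what lets the next reasoning step build on the result of the previous action.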

Lecture 3 ( Agentic AI Frameworks & AutoGen )

  • AutoGen is the most prevalent framework today for building agentic AI systems.
  • It is used by universities as well as leading AI enterprises such as Google, Meta, and Microsoft.
  • Platform choice is crucial, since building agents involves a lot of plug and play between LLMs and various tools.

Lecture 5 ( DSPy Framework )

  • Optimisation frameworks help when stringing together multiple inference engines in an agent.
  • OPRO and MIPRO are a couple of examples.

Lecture 6 ( Agents for Software Development )

  • Copilots and code agents are already impacting productivity.
  • One challenge: evaluation is complicated by codebases that are already in the training data.

Lecture 7 ( Enterprise Workflows )

  • ServiceNow's TapeAgents framework for building and deploying agents in the enterprise.
  • Web agents are more challenging than API agents.

Lecture 8 ( Unified framework of Neural and Symbolic Decision Making )

  • This Meta lecture required a lot of math background. This and the DSPy lecture were the most technical of the 12.

Lecture 9 ( NVIDIA: Generalist Robotics )

  • NVIDIA's Project GR00T: an AI brain for humanoid robots.
  • Agents simulate data for robots to train on, since human-generated data is too small for training a robot.

Lecture 10 ( Open-Source and Science in the Era of Foundation Models )

  • This lecture from a Stanford professor made the case for why open source is important for building better models while scaling.

Lecture 11 ( Measuring Agent capabilities and Anthropic’s RSP )

  • Great lecture from Anthropic on metrics for determining the safety of AI systems.
  • It defines AI Safety Levels 1–4; current systems are on the verge of crossing safety level 2.

Lecture 12 ( Towards Building Safe & Trustworthy AI Agents )

  • A history of safety and security, covering some of the groundbreaking papers that contributed to today's safety standards.