CS294 - LLM Agents - Notes
LLM Agents: a free UC Berkeley MOOC — CS 294/194–196
Intermediate reasoning steps improve LLM performance. (Does the LLM not actually "know" reasoning as such?)
Chain-of-thought prompting is very effective. An example problem involves teaching:
Bill Gates --> ls; Elon Musk --> nk; then Barack Obama --> ?? (answer is ka)
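The mapping in the example is the "last-letter concatenation" task (take the last letter of each word in the name and join them); a minimal sketch of the ground-truth rule that the chain-of-thought prompt is meant to teach:

```python
def last_letter_concat(name: str) -> str:
    """Concatenate the last letter of each word in `name`."""
    return "".join(word[-1] for word in name.split())

print(last_letter_concat("Bill Gates"))    # -> ls
print(last_letter_concat("Elon Musk"))     # -> nk
print(last_letter_concat("Barack Obama"))  # -> ka
```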
These exact examples are presumably not in the training data, so how does it work? Does the LLM have basic reasoning ability grounded in natural language that prompting leverages?
Reflection questions are effective only when the answer is wrong. If the answer is already right, asking the model to reflect pushes it toward some other, wrong answer.
Should we generate multiple LLM answers and then have the LLM compare them and select the best one? Probably not, because each answer is itself generated by picking the highest-probability next token; it is not clear whether whole-answer evaluation is even reliable. Alternatively, multiple LLMs could generate answers and another LLM could compare them and select one — not sure if such an approach makes sense. Maybe an agent could do this.
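A simpler variant of "generate many answers and pick one" is self-consistency: sample several answers and take a majority vote over the final answers, with no judging model at all. A minimal sketch, where `sample_answer` is a stand-in for any LLM call with temperature > 0:

```python
import collections
import random

def self_consistency(sample_answer, n: int = 5) -> str:
    """Sample n answers and return the most common one (majority vote),
    instead of asking a model to judge its own outputs."""
    answers = [sample_answer() for _ in range(n)]
    return collections.Counter(answers).most_common(1)[0][0]

# Hypothetical stand-in for a sampled LLM call: usually right, sometimes not.
def fake_llm_sample() -> str:
    return random.choice(["ka", "ka", "ka", "kka"])

print(self_consistency(fake_llm_sample, n=11))
```

This sidesteps the "who evaluates the answers?" problem for tasks with a short final answer, though it does not help when every sample shares the same systematic error.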
LLMs are easily distracted by irrelevant info, but when we explicitly remind them to ignore irrelevant information, they perform better.
LLMs are good at picking up a task from a few examples. This is basically a prompting technique: we explicitly explain the task to the LLM and it does it. Depending on the number of examples given, we call it zero-shot, one-shot, or few-shot learning.
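The zero-/few-shot distinction is just how the prompt string is assembled; a hypothetical sketch (the `build_prompt` helper and its format are assumptions, not from the course):

```python
def build_prompt(task: str, examples=None, query: str = "") -> str:
    """Build a prompt: task description, optional worked examples, then the query."""
    parts = [task]
    for x, y in (examples or []):        # few-shot: include worked examples
        parts.append(f"Q: {x}\nA: {y}")
    parts.append(f"Q: {query}\nA:")      # the instance we want answered
    return "\n\n".join(parts)

zero_shot = build_prompt("Concatenate the last letters of each word.",
                         query="Barack Obama")
few_shot = build_prompt("Concatenate the last letters of each word.",
                        examples=[("Bill Gates", "ls"), ("Elon Musk", "nk")],
                        query="Barack Obama")
print(few_shot)
```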
Beyond prompting, there is also Training, Fine-Tuning, Reinforcement Learning, RAG, etc.