LLM Agent

Source

LLM -> prompt engineering -> agent (very hard!): using an LLM to interact with an environment

Question: What is the relationship between an LLM agent and an RL agent?

Use a CPU analogy for the LLM

Memory: because the LLM is “stateless”, the agent needs explicit memory; this differs from RL
- short-term working memory
- long-term memory: episodic (experience); semantic (knowledge); procedural (LLM, code)
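
A minimal sketch of the memory layout above, assuming plain Python containers stand in for real stores. The class name `AgentMemory` and the method `build_prompt` are hypothetical, introduced only for illustration; the point is that a stateless LLM sees nothing except what the agent copies into the prompt.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class AgentMemory:
    """Hypothetical container for the memory types listed in the notes."""

    # Short-term working memory: the context for the current task.
    working: list[str] = field(default_factory=list)
    # Long-term memory, split by kind.
    episodic: list[str] = field(default_factory=list)          # past experience
    semantic: dict[str, str] = field(default_factory=dict)     # knowledge
    procedural: dict[str, str] = field(default_factory=dict)   # skills / code

    def build_prompt(self, observation: str) -> str:
        """Join working memory and the new observation into one prompt.

        The LLM keeps no state between calls, so anything it should
        "remember" has to be placed here explicitly.
        """
        return "\n".join(self.working + [observation])


memory = AgentMemory()
memory.working.append("goal: book a flight")
prompt = memory.build_prompt("obs: search page loaded")
```

Here `prompt` contains the goal followed by the observation; dropping the goal from `working` would make the model "forget" it on the next call.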
Action (space): an RL agent is defined by its action space.

- External actions interact with external environments (grounding) - same as RL agent
- Internal actions interact with internal memories - different from RL agent
  - Reasoning: read & write working memory (via LLMs)
  - Retrieval: read long-term memory
  - Learning: write long-term memory (or weights)
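
The three internal actions can be sketched as functions over two lists, one for working memory and one for long-term memory. Everything here is an assumption made for illustration: `fake_llm` is a stand-in for a real model call, retrieval is a naive substring match, and learning only appends to a list (it does not update weights).

```python
from __future__ import annotations

from typing import Callable

LLM = Callable[[str], str]  # hypothetical interface: prompt in, text out


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model so the sketch runs offline."""
    return f"thought about: {prompt.splitlines()[-1]}"


def reason(llm: LLM, working: list[str]) -> str:
    """Reasoning: read working memory, write the new thought back to it."""
    thought = llm("\n".join(working))
    working.append(thought)
    return thought


def retrieve(long_term: list[str], query: str, working: list[str]) -> list[str]:
    """Retrieval: read long-term memory into working memory (substring match)."""
    hits = [item for item in long_term if query.lower() in item.lower()]
    working.extend(hits)
    return hits


def learn(long_term: list[str], item: str) -> None:
    """Learning: write to long-term memory."""
    long_term.append(item)


working = ["task: open door"]
long_term = ["Doors open with keys", "Sky is blue"]
hits = retrieve(long_term, "door", working)
thought = reason(fake_llm, working)
learn(long_term, thought)
```

None of these calls touches the environment, which is what separates them from external (grounding) actions.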
Decision making
- A language agent chooses actions via a decision procedure
  - The actions taken are grouped into decision cycles
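
A toy sketch of a decision loop, assuming one external action per cycle. The names `decide`, `CounterEnv`, and `run_agent` are hypothetical, and the rule-based `decide` only marks the place where a real agent would use the LLM to propose and select an action.

```python
from __future__ import annotations


def decide(working: list[str]) -> str:
    """Hypothetical decision procedure: pick the next action from working memory."""
    return "stop" if working[-1] == "obs: 3" else "increment"


class CounterEnv:
    """Toy external environment: a counter the agent can increment."""

    def __init__(self) -> None:
        self.value = 0

    def step(self, action: str) -> str:
        if action == "increment":
            self.value += 1
        return f"obs: {self.value}"


def run_agent(env: CounterEnv, max_cycles: int = 10) -> list[str]:
    """Run decision cycles until the agent stops or the cycle budget runs out."""
    working = ["obs: 0"]
    trace: list[str] = []
    for _ in range(max_cycles):
        action = decide(working)            # internal: choose an action
        trace.append(action)
        if action == "stop":
            break
        working.append(env.step(action))    # external: act, then observe
    return trace


trace = run_agent(CounterEnv())
```

Each loop iteration is one decision cycle: the agent decides from working memory, acts on the environment, and stores the resulting observation for the next cycle.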

Interact with the physical world / humans: hard to scale
Interact with digital simulations (Atari, MuJoCo): hard to make practical
Interact with digital applications: ?
1. Interact with web (smartphone)
2. Interact with code (PC, ...): productivity

LLMs generate code in an auto-regressive way

Humans write code in an interactive way

LangChain is one example?