LLM Systems · November 20, 2025 · 8 min read

Multi-Agent Prompt Optimization with LangGraph

Orchestrating specialized agents to iteratively refine, evaluate, and test LLM prompts.


Beyond Single-Pass Prompting

Writing effective prompts for LLMs requires iterative experimentation, but most developers lack systematic frameworks for optimization. A single-pass optimizer can't account for edge cases or competing quality dimensions.

To address this, I built Context Refinery, a desktop application using LangGraph that orchestrates a multi-agent pipeline for prompt optimization.

The Agent Pipeline

The system is designed as a LangGraph state machine with four specialized agents:

  1. Analyzer: Breaks down the user's initial prompt and intent.
  2. Rewriter: Applies established prompt engineering techniques to draft improvements.
  3. Evaluator: Scores outputs against user-defined criteria.
  4. Refiner: Triggers conditional loops when thresholds aren't met.
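The four roles above can be sketched as plain functions over a shared state. This is a minimal, library-free illustration, not Context Refinery's actual code: the `PipelineState` fields, the stub agent logic, the criteria weights, and the default threshold and iteration cap are all assumptions made for the example. In a real build each function would call an LLM and be registered as a LangGraph node.

```python
from dataclasses import dataclass, field


@dataclass
class PipelineState:
    """Hypothetical shared state passed between the four agents."""
    prompt: str
    intent: str = ""
    score: float = 0.0
    iterations: int = 0
    history: list = field(default_factory=list)


def analyzer(state: PipelineState) -> None:
    # Stub: a real Analyzer would ask an LLM to summarize the user's intent.
    state.intent = f"intent of: {state.prompt}"


def rewriter(state: PipelineState) -> None:
    # Stub: append one made-up improvement per pass instead of calling an LLM.
    state.history.append(state.prompt)
    state.prompt = state.prompt + " Be specific and show your reasoning."
    state.iterations += 1


def evaluator(state: PipelineState, weights: dict) -> None:
    # Stub criteria scores in [0, 1]; longer prompts score higher here purely
    # so that the loop has something to converge on.
    raw = {name: min(1.0, len(state.prompt) / 100) for name in weights}
    total = sum(weights.values())
    # Weighted mean over the user-defined criteria.
    state.score = sum(raw[n] * w for n, w in weights.items()) / total


def needs_refinement(state: PipelineState, threshold: float,
                     max_iterations: int) -> bool:
    # Refiner: loop again only while the score is low and budget remains.
    return state.score < threshold and state.iterations < max_iterations


def run_pipeline(prompt: str, weights: dict, threshold: float = 0.8,
                 max_iterations: int = 5) -> PipelineState:
    state = PipelineState(prompt=prompt)
    analyzer(state)
    while True:
        rewriter(state)
        evaluator(state, weights)
        if not needs_refinement(state, threshold, max_iterations):
            return state


result = run_pipeline("Summarize this article.",
                      {"clarity": 2.0, "specificity": 1.0})
```

Keeping every agent's output on one state object is what lets the loop decision depend only on data (`score`, `iterations`) rather than on a fixed number of passes.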

Why LangGraph?

LangGraph models the pipeline as a state machine, which enables conditional branching and loops. Both are essential for a refinement pipeline where re-runs are data-driven rather than hardcoded. Unlike a purely sequential chain, a LangGraph graph lets the Evaluator route back to the Rewriter based on quality scores.

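# Conceptual LangGraph routing: the returned label names the next step.
# Assumes GraphState exposes `prompt` as an attribute (e.g. a dataclass or
# Pydantic model) and that evaluator_agent.run returns a score in [0, 1].
def evaluate_prompt(state: GraphState) -> str:
    score = evaluator_agent.run(state.prompt)
    if score < 0.8:
        return "rewrite"  # below threshold: loop back to the Rewriter
    return "finalize"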
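
A threshold-only router can loop forever if the score never reaches 0.8. One common safeguard, sketched here under assumptions, is to cap the number of passes. The `RouterState` fields, the `max_iterations` default, and the node names in `ROUTES` are hypothetical and not taken from Context Refinery; the function is plain Python so it can be tested on its own. In LangGraph, a function of this shape is what gets passed to `add_conditional_edges`, together with a mapping from the returned labels to node names.

```python
from typing import TypedDict


class RouterState(TypedDict):
    """Hypothetical slice of graph state that the router reads."""
    score: float
    iterations: int


def route_after_evaluation(state: RouterState, threshold: float = 0.8,
                           max_iterations: int = 3) -> str:
    # Good enough: stop refining.
    if state["score"] >= threshold:
        return "finalize"
    # Budget exhausted: stop anyway rather than loop indefinitely.
    if state["iterations"] >= max_iterations:
        return "finalize"
    # Otherwise send the prompt back for another rewrite.
    return "rewrite"


# Labels returned by the router, mapped to (assumed) node names.
ROUTES = {"rewrite": "rewriter", "finalize": "finalizer"}

next_node = ROUTES[route_after_evaluation({"score": 0.55, "iterations": 1})]
```

Because the router is a pure function of the state, each branch can be unit-tested without running any agent.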

Desktop-First Privacy

Context Refinery is built with Electron and React. Because it runs locally, no API keys leave the machine and the app works offline once models are configured, which is a strong privacy argument for enterprise prompt optimization workflows.

Takeaways

LangGraph's explicit state model makes multi-agent pipelines much easier to debug, because the state handed from one node to the next can be inspected at every step. Agent role specialization also beat generalism for this kind of multi-step workflow: giving each agent its own specialized system prompt produced noticeably better role adherence than a single generalist multi-task agent.