lesson-22

20 MIN READ | UPDATED: 2026-05-07

🎯 Learning Objectives for This Session

Hey there, future AI architects! Welcome to Part 22 of the LangGraph Multi-Agent Masterclass. By now, your "Universal AI Content Agency" is likely taking shape. Your Agents are performing their respective duties, running smoothly under LangGraph's orchestration. But I bet you've run into this dilemma:

Suddenly, an Agent starts hallucinating, the workflow gets stuck at a specific node, or the final output is completely off the mark. Yet you have no idea where the pipeline broke down, or which LLM call went wrong at which node or edge. Feels like defusing a bomb blindfolded, right? Don't panic: this session hands you a pair of "X-ray glasses"!

In this session, we will dive deep into the art of "observability" and "debugging" for complex LangGraph workflows, focusing specifically on the crown jewel of the LangChain ecosystem: LangSmith.

By the end of this session, you will:

  1. Gain a thorough understanding: Why observability and debugging are the lifelines of development efficiency in multi-agent and LLM-driven complex systems.
  2. Master the core: The fundamental concepts and working principles of LangSmith, and its seamless integration mechanism with LangGraph.
  3. Hands-on practice: Mount LangSmith probes onto our "Universal AI Content Agency" project to track every decision and action of the Planner, Researcher, Writer, and Editor Agents in real-time.
  4. Troubleshoot efficiently: Learn to use LangSmith's visual Trace feature to pinpoint agent collaboration issues, LLM prompt flaws, or tool invocation errors in seconds, bidding farewell to the era of "black-box debugging."

📖 Principle Analysis

Still Debugging with print() in the LLM Era? Wake Up!

In traditional software development, we rely on various IDEs, debuggers, and logging systems to understand code execution flows. However, in the LLM era—especially when building multi-agent systems—traditional debugging methods suddenly fall flat. Why?

  1. The Black Box Effect: The internal reasoning process of an LLM is highly opaque. You feed it a prompt, and it spits out a response. It's incredibly difficult to peek directly into what happened in between.
  2. Non-determinism: LLM outputs are rarely 100% deterministic. Even with the exact same prompt, slight variations can occur at different times or under different temperature settings. This makes reproducing issues a nightmare.
  3. Complex Pipelines: Multi-agent systems orchestrated by LangGraph often involve multiple LLM calls, various tool invocations, and state transfers/decisions across several Agents. A tiny error or misunderstanding can be amplified down the pipeline, ultimately leading to system crashes or skewed outputs.
  4. State Transitions: The core of LangGraph is a state machine, where state flows and mutates between nodes. If you can't clearly see what state each node received and what it returned, tracking down a bug turns into guesswork.

Imagine this: Your Planner Agent comes up with a nonsensical title, causing the Researcher to fail at finding relevant materials. The Writer is forced to hallucinate, and the Editor revises based on that hallucinated content. Ultimately, you get an absurd article. Without a good observability tool, you wouldn't know the issue originated at the Planner stage. You might spend hours tweaking the Writer's prompt—treating the symptom instead of the disease.

LangSmith: The "X-Ray Machine" for Your Multi-Agent System

LangSmith was born to solve these exact pain points. It is an official developer platform launched by LangChain, designed to help you debug, test, evaluate, and monitor LLM-based applications. For complex multi-agent orchestrations like LangGraph, LangSmith is a match made in heaven.

Its core philosophy is simple: Capture and visualize every "Run" of your LLM application. Whether it's a single LLM call, a Chain, a Tool, or our complex LangGraph workflow, LangSmith breaks it down into a series of traceable events and displays them clearly in a tree structure (known as a Trace).
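To make the "tree of Runs" idea concrete, here is a minimal sketch of a trace as a nested data structure. The Run class, its fields, and the latency numbers are illustrative assumptions, not LangSmith's real schema or measured values.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Run:
    """Toy stand-in for one traced step (not LangSmith's real schema)."""
    name: str
    run_type: str  # e.g. "chain", "llm", "tool"
    latency_s: float = 0.0  # made-up numbers, for illustration only
    children: List["Run"] = field(default_factory=list)


def render(run: Run, depth: int = 0) -> List[str]:
    """Return the tree as indented lines, parents before children."""
    lines = [f"{'  ' * depth}{run.name} [{run.run_type}] {run.latency_s:.1f}s"]
    for child in run.children:
        lines.extend(render(child, depth + 1))
    return lines


# One graph invocation: the graph run contains node runs,
# and each node run contains the LLM call made inside it.
trace = Run("agency_graph", "chain", 9.5, [
    Run("planner", "chain", 3.0, [Run("ChatOpenAI", "llm", 2.9)]),
    Run("writer", "chain", 6.5, [Run("ChatOpenAI", "llm", 6.4)]),
])

print("\n".join(render(trace)))
```

Reading the tree top-down is the same habit you will use in the LangSmith UI: start at the graph-level run, then drill into the node and LLM call that look suspicious.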

How does LangSmith collaborate with LangChain/LangGraph?

The LangChain/LangGraph libraries have built-in Tracing mechanisms. Once you set the appropriate environment variables, any LLM call, Tool invocation, or Chain execution initiated via the LangChain library is automatically "intercepted" and sent to the LangSmith backend service. This means you gain powerful observability with virtually zero changes to your business logic code!

What can it show you?

  • Complete Call Traces: From user input to final output, every LLM call, Agent decision, and tool usage is recorded.
  • Inputs/Outputs: The detailed input prompt and LLM output for every single step.
  • Intermediate Steps: The Agent's thought process, tool invocation parameters, and results.
  • Latency and Cost: The execution time of each step, along with estimated token usage and costs.
  • Error Information: If a step fails, LangSmith highlights it and provides the error stack trace.
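The cost figure mentioned above is, at heart, simple arithmetic on token counts. A rough sketch follows; the function name and the per-million-token prices are placeholders chosen for illustration, not real pricing for any model.

```python
def estimate_cost_usd(
    prompt_tokens: int,
    completion_tokens: int,
    input_price_per_1m: float,
    output_price_per_1m: float,
) -> float:
    """Estimate the cost of one LLM call from its token counts.

    Prices are in USD per one million tokens and must be supplied by the
    caller; nothing here looks up real pricing.
    """
    total = prompt_tokens * input_price_per_1m + completion_tokens * output_price_per_1m
    return total / 1_000_000


# Hypothetical prices: 2.0 USD per 1M input tokens, 8.0 USD per 1M output tokens.
cost = estimate_cost_usd(500_000, 250_000, 2.0, 8.0)
print(f"Estimated cost: ${cost:.2f}")
```

Because input and output tokens are usually priced differently, a step with a short prompt and a long completion can cost more than its total token count suggests.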

It's like installing countless micro-cameras and microphones throughout your "Universal AI Content Agency." Every "thought" and "action" of every Agent is in plain sight.

Mermaid Diagram: How LangSmith "Monitors" Your Agency

Let's use a Mermaid diagram to visually understand how LangSmith penetrates your multi-agent workflow:

graph TD
    subgraph Universal AI Content Agency Workflow
        start[User Request] --> Planner(Planner Agent);
        Planner -- Plan Output --> Research(Researcher Agent);
        Research -- Research Report --> Write(Writer Agent);
        Write -- First Draft --> Edit(Editor Agent);
        Edit -- Final Draft --> finish[Final Content Output];
    end

    subgraph LangSmith Observability System
        direction LR
        A[LangSmith UI] --> B{Real-time Observation & Analysis}
        B --> C(Performance Bottleneck Identification)
        B --> D(Prompt Optimization Suggestions)
        B --> E(Agent Behavior Understanding)
        B --> F(Cost & Latency Tracking)
        B --> G(Rapid Error Troubleshooting)
    end

    style A fill:#f9f,stroke:#333,stroke-width:2px,color:#333
    style B fill:#e0e0e0,stroke:#333,stroke-width:1px,color:#333
    style C fill:#e0e0e0,stroke:#333,stroke-width:1px,color:#333
    style D fill:#e0e0e0,stroke:#333,stroke-width:1px,color:#333
    style E fill:#e0e0e0,stroke:#333,stroke-width:1px,color:#333
    style F fill:#e0e0e0,stroke:#333,stroke-width:1px,color:#333
    style G fill:#e0e0e0,stroke:#333,stroke-width:1px,color:#333

    start -- Trigger --> LS_Tracer_Start(LangSmith Tracer);
    Planner -- Call/Execute --> LS_Tracer_Planner(LangSmith Tracer);
    Research -- Call/Execute --> LS_Tracer_Research(LangSmith Tracer);
    Write -- Call/Execute --> LS_Tracer_Write(LangSmith Tracer);
    Edit -- Call/Execute --> LS_Tracer_Edit(LangSmith Tracer);
    finish -- Complete --> LS_Tracer_End(LangSmith Tracer);

    LS_Tracer_Start -- Record Trace --> LangSmith_Backend(LangSmith Backend Service);
    LS_Tracer_Planner -- Record Trace --> LangSmith_Backend;
    LS_Tracer_Research -- Record Trace --> LangSmith_Backend;
    LS_Tracer_Write -- Record Trace --> LangSmith_Backend;
    LS_Tracer_Edit -- Record Trace --> LangSmith_Backend;
    LS_Tracer_End -- Record Trace --> LangSmith_Backend;

    LangSmith_Backend -- Data Display --> A;

As you can see from the diagram, every Agent execution and every LangChain component invocation (including the entire LangGraph execution) is captured by the LangSmith Tracer. These events are then sent to the LangSmith backend service. Ultimately, we can view the detailed execution trajectory of the entire workflow on the LangSmith Web UI through an intuitive graphical interface. This is the transparent transformation of your "Universal AI Content Agency"!

💻 Hands-On Code Practice

Now, let's actually integrate LangSmith into our "Universal AI Content Agency" project. For the sake of simplicity in this demonstration, we will build a streamlined LangGraph workflow: the user provides a topic, the Planner Agent outlines the content, and the Writer Agent drafts the article based on that outline.

1. LangSmith Account and API Key Configuration

First, you need to head over to the LangSmith website and register for an account. After registering, you can find your API Key on your personal settings page.

To let LangChain know where to send the Traces, you need to set a few environment variables. The easiest way is to set them in your .env file (if you are using python-dotenv) or directly in the environment where you run your script:

# Example .env file content
LANGCHAIN_TRACING_V2=true
LANGCHAIN_API_KEY="YOUR_LANGSMITH_API_KEY"
LANGCHAIN_PROJECT="AI Content Agency - Dev" # Optional, used to organize your projects in the LangSmith UI
OPENAI_API_KEY="YOUR_OPENAI_API_KEY" # Read by ChatOpenAI in the script below

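Important Note: LANGCHAIN_TRACING_V2=true is the switch that enables tracing. LANGCHAIN_API_KEY is your LangSmith API key. LANGCHAIN_PROJECT is an optional name that groups all your runs under one project in the LangSmith UI; if you leave it unset, runs land in the project named default. Setting it is highly recommended; otherwise your LangSmith dashboard quickly becomes a chaotic mess. Recent versions of the langsmith SDK also accept LANGSMITH_-prefixed equivalents (for example LANGSMITH_TRACING and LANGSMITH_API_KEY); check the documentation for the version you have installed.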
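Since a missing or misspelled variable fails silently (the script runs, but no Trace shows up), a small pre-flight check can save time. The sketch below uses only the standard library; the function name tracing_config_problems is made up for this example.

```python
import os
from typing import List, Mapping


def tracing_config_problems(env: Mapping[str, str]) -> List[str]:
    """Return human-readable problems with the tracing-related variables."""
    problems = []
    if env.get("LANGCHAIN_TRACING_V2", "").strip().lower() != "true":
        problems.append("LANGCHAIN_TRACING_V2 is not set to 'true'")
    if not env.get("LANGCHAIN_API_KEY", "").strip():
        problems.append("LANGCHAIN_API_KEY is missing or empty")
    if not env.get("LANGCHAIN_PROJECT", "").strip():
        problems.append("LANGCHAIN_PROJECT is unset; runs will go to the 'default' project")
    return problems


# Call this after load_dotenv() so values from the .env file are visible.
for problem in tracing_config_problems(os.environ):
    print("WARNING:", problem)
```

Passing the environment in as an argument, rather than reading os.environ inside the function, keeps the check easy to exercise with a plain dictionary.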

2. Install Necessary Libraries

Ensure your environment has LangChain, LangGraph, OpenAI, and langsmith itself installed:

pip install -qU langchain langgraph langchain_openai langsmith python-dotenv

3. Build the Simplified Agency Graph

We will simulate a simple content creation pipeline: Planner -> Writer.

import os
from dotenv import load_dotenv
from typing import TypedDict, Annotated, List, Union
import operator

from langchain_core.messages import BaseMessage, HumanMessage, AIMessage
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI

from langgraph.graph import StateGraph, END

# Load environment variables
load_dotenv()

# ====================================================================================
# 1. Define the state of our AI Content Agency (AgencyState)
# ====================================================================================
class AgencyState(TypedDict):
    """
    Structure representing the current state of the AI Content Agency.
    It tracks all key information from the user request to the final content output.
    """
    topic: str  # The original topic requested by the user
    plan: str   # The content outline/plan generated by the Planner Agent
    draft: str  # The first draft written by the Writer Agent
    messages: Annotated[List[BaseMessage], operator.add] # Message history for inter-agent communication

# ====================================================================================
# 2. Initialize the Large Language Model (LLM)
# ====================================================================================
# Recommended to use GPT-4o or GPT-4-turbo, as they perform better at understanding complex instructions
# If you don't have access to these models, you can try gpt-3.5-turbo, but results may vary
llm = ChatOpenAI(model="gpt-4o", temperature=0.7)

# ====================================================================================
# 3. Define Agent Node Functions
#    Each function receives the AgencyState and returns the modified state or messages.
# ====================================================================================

def planner_agent(state: AgencyState) -> AgencyState:
    """
    Planner Agent: Generates a detailed content outline based on the user-provided topic.
    """
    print("---Planner Agent: Planning content outline---")
    topic = state["topic"]
    messages = state["messages"]

    # Planner's Prompt
    planner_prompt = PromptTemplate.from_template(
        """
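        You are an experienced content planner. Your task is to produce a detailed,
        logically structured outline for a long-form article on the given topic.

        Topic: {topic}

        Make sure the outline includes the following elements:
        1. A compelling title
        2. Introduction
        3. 3 to 5 main section headings
        4. 2-3 subsection headings under each main section
        5. Conclusion
        6. Call to Action (optional)

        Output the outline directly, with no additional commentary.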
        """
    )

    # Build the planner chain
    planner_chain = planner_prompt | llm | StrOutputParser()

    # Invoke the planner chain to generate the outline
    plan_output = planner_chain.invoke({"topic": topic})

    # Update state: save the plan and add the planner's output to message history
    return {
        "plan": plan_output,
        "messages": [AIMessage(content=f"Planner has completed the outline:\n{plan_output}")]
    }

def writer_agent(state: AgencyState) -> AgencyState:
    """
    Writer Agent: Writes the first draft of the article based on the outline provided by the Planner.
    """
    print("---Writer Agent: Writing article first draft---")
    topic = state["topic"]
    plan = state["plan"]
    messages = state["messages"]

    # Writer's Prompt
    writer_prompt = PromptTemplate.from_template(
        """
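        You are a professional writer. Your task is to write a high-quality, substantive,
        and engaging first draft of an article based on the topic and detailed outline below.

        Topic: {topic}

        Outline:
        {plan}

        Follow the outline's structure and expand every section and subsection so the article is coherent and insightful.
        Length requirement: at least 800 words.
        Output the first draft directly, with no additional commentary.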
        """
    )

    # Build the writer chain
    writer_chain = writer_prompt | llm | StrOutputParser()

    # Invoke the writer chain to generate the first draft
    draft_output = writer_chain.invoke({"topic": topic, "plan": plan})

    # Update state: save the draft and add the writer's output to message history
    return {
        "draft": draft_output,
        "messages": [AIMessage(content=f"Writer has completed the first draft:\n{draft_output}")]
    }

# ====================================================================================
# 4. Build the LangGraph Workflow
# ====================================================================================

def create_agency_workflow():
    """
    Creates and compiles the LangGraph workflow for the AI Content Agency.
    """
    workflow = StateGraph(AgencyState)

    # Add nodes
    workflow.add_node("planner", planner_agent)
    workflow.add_node("writer", writer_agent)

    # Set entry point
    workflow.set_entry_point("planner")

    # Define edges: Planner hands over to Writer upon completion
    workflow.add_edge("planner", "writer")

    # End workflow after Writer completes
    workflow.add_edge("writer", END)

    # Compile the graph
    app = workflow.compile()
    return app

# ====================================================================================
# 5. Run the Workflow and Observe in LangSmith
# ====================================================================================

if __name__ == "__main__":
    print("---AI Content Agency Started, LangSmith Observability Enabled---")

    # Create workflow
    app = create_agency_workflow()

    # Define initial state: User requests a topic
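    initial_state = {
        "topic": "Applications and Challenges of Artificial Intelligence in Healthcare",
        "plan": "",
        "draft": "",
        "messages": [HumanMessage(content="Please write an article about the applications of artificial intelligence in healthcare.")]
    }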

    # Run the workflow
    # LangGraph automatically captures and sends the Trace to LangSmith
    final_state = app.invoke(initial_state)

    print("\n---Workflow Execution Completed---")
    print("\nFinal Article Draft:")
    print(final_state["draft"])

    print("\nPlease visit the LangSmith UI (https://smith.langchain.com/) to view the detailed Trace of this run.")
    print(f"You can find it under 'Projects' in the project named '{os.getenv('LANGCHAIN_PROJECT', 'default')}'.")

4. Run the Code and Observe the LangSmith UI

  1. Ensure your .env file is configured correctly and contains your LangSmith API Key.
  2. Run the Python script above.
    python your_script_name.py
    
  3. The script will start executing, and you will see outputs like ---Planner Agent: Planning content outline--- and ---Writer Agent: Writing article first draft--- in your console.
  4. Open your browser and visit the LangSmith UI: https://smith.langchain.com/.
  5. In the left navigation bar, click "Projects". You should see the LANGCHAIN_PROJECT name you configured (e.g., AI Content Agency - Dev). Click to enter that project.
  6. You will see a list containing the Trace you just ran. Each Trace represents a single app.invoke() call.
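Once the list fills up with runs, it helps to label each invocation as you make it. LangChain runnables, including compiled LangGraph graphs, accept a config dictionary whose run_name, tags, and metadata keys are attached to the Trace. The helper name build_run_config and the specific tag values below are illustrative choices, not anything required by the library.

```python
from typing import Any, Dict


def build_run_config(topic: str, stage: str = "dev") -> Dict[str, Any]:
    """Build a config dict that labels one graph invocation in LangSmith."""
    return {
        "run_name": f"agency:{stage}",       # name shown for the top-level run
        "tags": ["content-agency", stage],   # free-form labels to filter on
        "metadata": {"topic": topic},        # key/value details stored with the run
    }


config = build_run_config("AI in healthcare")
print(config)

# Usage with the compiled graph from the script above:
# final_state = app.invoke(initial_state, config=config)
```

Tagging by stage or feature branch is usually enough to make a specific Trace findable later without creating a new project for every experiment.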

What will you see in the LangSmith UI?

Click on the latest Trace to enter a detailed view:

  • Timeline View: A chronological view clearly displaying the execution order and duration of each component (including the entire LangGraph run, the planner node, the writer node, and the ChatOpenAI LLM calls inside each node).
  • Graph View: This view is particularly cool for LangGraph. It visually displays your Agent pipeline as a graph and highlights the path taken by the current Trace.
  • Detailed Steps:
    • LangGraph Run: The top-level Run shows the inputs (topic, messages) and final outputs (draft, messages) of the entire Graph.
    • Node Runs: Clicking on the planner or writer node reveals the inputs (the state dictionary) and outputs (the state dictionary) of that node when called as a function.
    • LLM Calls: Inside each node, you will see the ChatOpenAI invocations. Click on one to see the exact Prompt sent to the LLM (the messages array) and the raw Response returned by the LLM (content). You can also view token usage and estimated costs.
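When you compare a node's output with the next node's input, you will notice that a node returns only the keys it changed, while the next node receives the full state. The sketch below is a deliberately simplified model of that merge, assuming the AgencyState layout from the script (messages uses operator.add, every other key is overwritten); it is not LangGraph's actual implementation.

```python
import operator
from typing import Any, Callable, Dict

# Keys with a reducer are combined with the old value; all others are replaced.
REDUCERS: Dict[str, Callable[[Any, Any], Any]] = {"messages": operator.add}


def apply_update(state: Dict[str, Any], update: Dict[str, Any]) -> Dict[str, Any]:
    """Merge a node's partial update into a copy of the current state."""
    merged = dict(state)
    for key, value in update.items():
        reducer = REDUCERS.get(key)
        if reducer is not None and key in state:
            merged[key] = reducer(state[key], value)  # e.g. list concatenation
        else:
            merged[key] = value  # plain overwrite
    return merged


state = {"topic": "AI in healthcare", "plan": "", "messages": ["user request"]}
planner_update = {"plan": "1. Intro ...", "messages": ["planner finished"]}

after_planner = apply_update(state, planner_update)
print(after_planner["plan"])
print(after_planner["messages"])
```

This is why the message history grows from node to node in the Trace while plan and draft simply take on their latest value.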

Where does the value of LangSmith lie?

Suppose the article produced by your Writer Agent isn't good enough. In LangSmith, you can:

  • Inspect the Planner Agent's output: Is the outline provided by the Planner flawed? Did it misunderstand the topic?
  • Inspect the Writer Agent's input Prompt: Did the Writer's Prompt fail to properly utilize the Planner's outline? Are the Prompt instructions unclear?
  • Inspect the actual input/output of the Writer Agent's LLM call: Is the LLM simply underperforming given this specific Prompt?

Through this visual, drill-down inspection, you can quickly pinpoint the issue—whether the Planner's Prompt needs tweaking, the Writer's logic is flawed, or it's an inherent issue with the LLM itself. Say goodbye to blind guessing and embrace scientific debugging!
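The same upstream-first habit can be written down as a tiny script: check each stage's output in pipeline order and stop at the first one that fails. Everything here is an illustrative assumption: the function name, the two checks, and the 800-word threshold, which only loosely mirrors the Writer prompt's length requirement. In practice you would copy the outputs from the Trace.

```python
from typing import Callable, Dict, List, Optional, Tuple

Check = Callable[[str], bool]

# Ordered from upstream to downstream, matching the Planner -> Writer pipeline.
CHECKS: List[Tuple[str, Check]] = [
    ("planner", lambda text: "Conclusion" in text),       # outline has a conclusion
    ("writer", lambda text: len(text.split()) >= 800),    # draft is long enough
]


def first_failing_step(outputs: Dict[str, str], checks: List[Tuple[str, Check]]) -> Optional[str]:
    """Return the name of the earliest step whose output fails its check."""
    for step, check in checks:
        if not check(outputs.get(step, "")):
            return step
    return None


outputs = {
    "planner": "Title\nIntroduction\nSection 1\nSection 2",  # no conclusion
    "writer": "A short draft.",
}
print(first_failing_step(outputs, CHECKS))
```

Here both stages look bad, but the function reports the Planner, which is the point: fix the earliest failure first, because downstream problems may simply be its consequences.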

📝 Pitfalls and How to Avoid Them

As a senior mentor, I've seen too many students stumble on their way into the "treasure trove" that is LangSmith. Here are the pitfalls to watch for, so you can skip the unnecessary detours:

  1. "Dead Environment, Dead End": Environment variables are your lifeline!

    • Pitfall: The most common issue is failing to correctly set LANGCHAIN_TRACING_V2=true and LANGCHAIN_API_KEY. Many people set the API Key but forget to enable V2 Tracing, resulting in a blank LangSmith dashboard.
    • Solution: Always verify that both environment variables are correctly set and active in your script's runtime environment. Using python-dotenv is a good habit, but ensure the .env file is actually loaded. Adding a quick print(os.getenv("LANGCHAIN_TRACING_V2")) at the beginning of your script is a great way to double-check.
  2. "Scattered Projects, Messy Dashboard": Make good use of LANGCHAIN_PROJECT.

    • Pitfall: Not setting LANGCHAIN_PROJECT, or using a different project name every time you run. The LangSmith UI will default to putting all Traces under the default project, or it will create a bunch of fragmented projects. Once you have multiple projects, finding a specific Trace becomes like looking for a needle in a haystack.
    • Solution: Set a fixed and meaningful LANGCHAIN_PROJECT name for each independent project (like our "Universal AI Content Agency"). When developing different feature branches, consider using something like LANGCHAIN_PROJECT="AI Content Agency - FeatureX" to keep things organized.
  3. "Data Security is Paramount": Never upload sensitive information.

    • Pitfall: During development, accidentally sending Prompts or LLM outputs containing sensitive information—like user privacy data or company secrets—to LangSmith.
    • Solution: LangSmith is a powerful debugging tool, but it is a cloud service. Exercise extreme caution in production environments or when handling sensitive data. For highly sensitive information, consider data masking/anonymization before sending it to LangSmith. LangSmith does offer a self-hosted option (LangSmith On-premise), though the cloud service is more convenient for most developers.
  4. "Endless Traces, Dazzled Eyes": Learn to filter and search.

    • Pitfall: When your workflow is highly complex or runs frequently, a single Trace might contain hundreds of steps, making it painful to review in the LangSmith UI.
    • Solution: The LangSmith UI offers powerful filtering, searching, and grouping features. You can filter by Agent name, LLM type, status, time range, etc. Learning to use these features will drastically improve your debugging efficiency. Additionally, you can add tags to your Runs to make them easier to find later.
  5. "Custom Components, Where to Go?": Manually integrating the Tracer.

    • Pitfall: If you use entirely custom Python functions inside your LangGraph nodes instead of LangChain's Runnable components, the sub-steps within that custom logic (e.g., a custom external API call) might not be automatically captured by LangSmith.
    • Solution: For custom functions that are not Runnables, instrument them explicitly with the langsmith SDK. The usual route is the traceable decorator (from langsmith import traceable): decorate the function, and each call is recorded as its own run, with its arguments as inputs and its return value as outputs. In recent versions, a traceable function called from inside a LangGraph node should appear nested under that node's run. For finer control over what gets logged, the SDK also provides a trace context manager.
  6. "Performance Impact, Not to be Ignored": Tracing has overhead.

    • Pitfall: Although the overhead of LangSmith Tracing is usually minimal, in extremely high-concurrency scenarios or applications with strict latency requirements, network transmission and data logging might introduce slight additional latency.
    • Solution: When deploying to production, evaluate the performance impact of LangSmith Tracing. Generally, keep it fully enabled during development and testing. In production, consider sampling only a percentage of requests for tracing, or enabling it dynamically only when issues arise.
  7. "Cost Calculations are for Reference Only": Discrepancies with actual bills.

    • Pitfall: The costs displayed in LangSmith are typically estimates based on token counts. They might differ slightly from your actual LLM provider's bill (due to factors like API call charges, concurrency discounts, or precise billing methods for different models).
    • Solution: Use LangSmith's cost data as a tool for quick reference and trend analysis, but rely on your LLM provider's official dashboard for final financial accounting.
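For pitfall 5 above, a minimal sketch of tracing a custom function with the langsmith SDK's traceable decorator might look like the following. The function fetch_reference_stats, its return value, and the researcher_node wrapper are hypothetical stand-ins for your own code; the fallback branch is only there so the sketch still runs where langsmith is not installed.

```python
import time

try:
    from langsmith import traceable  # decorator provided by the langsmith SDK
except ImportError:
    # Fallback: a no-op decorator supporting both @traceable and @traceable(...).
    def traceable(*args, **kwargs):
        if len(args) == 1 and callable(args[0]) and not kwargs:
            return args[0]
        return lambda func: func


@traceable(run_type="tool", name="fetch_reference_stats")
def fetch_reference_stats(topic: str) -> dict:
    """Hypothetical custom helper standing in for an external API call."""
    time.sleep(0.01)  # placeholder for network latency
    return {"topic": topic, "sources_found": 3}  # made-up payload


def researcher_node(state: dict) -> dict:
    """Plain-Python node body; the decorated call is what becomes a child run."""
    stats = fetch_reference_stats(state["topic"])
    return {"research": stats}


print(researcher_node({"topic": "AI in healthcare"}))
```

With tracing disabled the decorator simply calls the function, so the same code works in local tests and in traced runs without any branching in your business logic.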