Programming Notes
nacho4d avatar

Programming Notes

@nacho4d

LangChain basics — tool calling from scratch

A practical summary of how LangChain works under the hood, built up step by step.

1. Initialising the model

from langchain.chat_models import init_chat_model

llm = init_chat_model("gpt-4o-mini", model_provider="openai")
init_chat_model wraps the underlying provider SDK. No network call happens here — it just configures a client object.

2. Defining tools with @tool

from langchain_core.tools import tool

tools = []

@tool
def extract_video_id(url: str) -> str:
    """Extracts the 11-character YouTube video ID from a URL."""
    ...

tools.append(extract_video_id)
@tool is a LangChain-specific decorator — not a Python built-in. The moment Python reads the @tool line, it:
  1. Reads __name__ → becomes the tool's name
  2. Reads __doc__ → becomes the description sent to the LLM
  3. Reads type annotations → builds a Pydantic input schema
  4. Wraps everything into a StructuredTool object
The original function is preserved inside .func and is never lost. After decoration, your variable (extract_video_id) immediately becomes a StructuredTool — it never exists as a plain function in your namespace. Calling a tool directly (outside of LLM flow):
extract_video_id.invoke({"url": "https://youtube.com/watch?v=abc"})
extract_video_id.func("https://youtube.com/watch?v=abc")  # raw function

3. Binding tools to the model

llm_with_tools = llm.bind_tools(tools)

This is pure preparation — no network call. It creates a new runnable that stores the tool definitions to include in every subsequent API call. When you later call .invoke(), LangChain serialises the tool definitions into the OpenAI JSON format:

{
  "model": "gpt-4o-mini",
  "messages": [...],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "extract_video_id",
        "description": "...",
        "parameters": { "type": "object", "properties": { "url": { "type": "string" } } }
      }
    }
  ]
}

4. Message types

LangChain uses typed message classes. Each serialises differently into the JSON sent to the API. The direction is always consistent:

ClassDirectionWho creates it
SystemMessageclient → LLMdeveloper (instructions, persona)
HumanMessageclient → LLMend user
AIMessageLLM → clientthe model
ToolMessageclient → LLMyour tool execution code

Below are samples on how to create each message:

from langchain_core.messages import SystemMessage, HumanMessage, AIMessage, ToolMessage
  • SystemMessage sits permanently at index 0 — never recreated between turns.
    messages = [
        SystemMessage(content="You are a helpful assistant that does this and that...")
    ]
  • HumanMessage is appended fresh every user turn.
    user_input = input("You: ")
    messages.append(HumanMessage(content=user_input))
  • AIMessage is the return value of invoke() — append it as-is, never construct one by hand.
    response = llm_with_tools.invoke(messages)
    messages.append(response)          # response is already an AIMessage
  • ToolMessage requires a tool_call_id matching the id the LLM assigned to its request. That id lives in tool_call["id"], which is why execute_tool reads it from the tool call dict rather than generating one itself.
    for tool_call in response.tool_calls:
        messages.append(execute_tool(tool_call))   # execute_tool returns a ToolMessage

After a full exchange with one tool call, messages[] looks like this:

[
    SystemMessage(content="You are a helpful assistant ..."),   # permanent
    HumanMessage(content="Summarise this video: youtube.com/..."),
    AIMessage(tool_calls=[{"name": "extract_video_id", "args": {...}, "id": "tu_abc"}]),
    ToolMessage(content="T-D1OfcDW1M", tool_call_id="tu_abc"),
    AIMessage(content="Here is a summary ..."),
]

This list is the LLM's entire working memory — sent in full on every invoke() call.

5. The tool call loop

When you call llm_with_tools.invoke(messages), the LLM does not execute your tools. It returns an AIMessage that says "I want to call this tool with these arguments". Your code must execute the tool and send the result back.

sequenceDiagram
    participant P as Your program
    participant L as LLM API

    P->>L: `① POST /v1/messages — messages:[{role:"user",content:"summarize this video"}], tools:[extract_video_id,fetch_transcript]`

    L-->>P: `② 200 stop_reason:"tool_use" — content:[{type:"tool_use",id:"tu_abc",name:"extract_video_id",input:{url:"youtube.com/..."}}]`

    Note over P: `AIMessage(tool_calls) → invoke extract_video_id → returns "T-D1OfcDW1M"`

    P->>L: `③ POST /v1/messages — role:"assistant":{type:"tool_use",id:"tu_abc"}, role:"user":{type:"tool_result",tool_use_id:"tu_abc",content:"T-D1OfcDW1M"}`

    L-->>P: `④ 200 stop_reason:"tool_use" — content:[{type:"tool_use",id:"tu_xyz",name:"fetch_transcript",input:{video_id:"T-D1OfcDW1M"}}]`

    Note over P: `AIMessage(tool_calls) → invoke fetch_transcript → returns "[full transcript]"`

    P->>L: `⑤ POST /v1/messages — role:"user":{type:"tool_result",tool_use_id:"tu_xyz",content:"[full transcript]"}`

    L-->>P: `⑥ 200 stop_reason:"end_turn" — content:[{type:"text",text:"Here is a summary..."}]`

Key rules:

  • Always append the AIMessage to messages[] before appending the ToolMessage. The API requires them to appear in order.
  • When response.tool_calls is non-empty, the loop continues. When it is empty, the LLM is done. Note that the raw field name varies by provider — OpenAI calls it finish_reason, Anthropic calls it stop_reason. LangChain abstracts this behind response.tool_calls.
  • AIMessage.content is typically empty ('') while the LLM is in tool-calling mode — in most providers it has nothing to say to the user until all tools have been executed.

6. The tool mapping

tool_mapping = {t.name: t for t in tools}

Built from the tools list automatically — no need to maintain it by hand. Used to look up the callable from the string name the LLM returns:

tool_fn = tool_mapping[tool_call["name"]]
result  = tool_fn.invoke(tool_call["args"])

7. A reusable execute_tool helper

def execute_tool(tool_call):
    try:
        result = tool_mapping[tool_call["name"]].invoke(tool_call["args"])
        return ToolMessage(content=str(result), tool_call_id=tool_call["id"])
    except Exception as e:
        return ToolMessage(content=f"Error: {str(e)}", tool_call_id=tool_call["id"])

The try/except matters: if the tool throws, you still need to return a ToolMessage with the matching tool_call_id, otherwise messages[] becomes malformed and the next API call fails. Returning an error message lets the LLM see what went wrong and potentially recover.

8. A minimal agent loop

Putting it all together:

def run_agent(messages):  # note: mutates messages in place
    while True:
        response = llm_with_tools.invoke(messages)
        messages.append(response)

        if not response.tool_calls:   # LLM is done
            return response

        for tool_call in response.tool_calls:
            messages.append(execute_tool(tool_call))

The LLM can request multiple tools in a single response — hence the for loop over response.tool_calls. All results are appended before the next invoke().

This is essentially what LangGraph's ToolNode does internally — the tutorial is showing you the manual version so you understand what the framework automates.

9. Chaining

LangChain's chaining feature allows composing multiple steps into a single pipeline using the | operator — the same concept as Unix pipes:

my_chain = p1 | p2 | p3

Each step must return a Runnable, which allows calling my_chain.invoke({...}) on the whole pipeline. Data enters via invoke() and flows through each step in order — the chain itself is just a description of transformations, no data flows through it at definition time.

Two building blocks make this work:

  • RunnablePassthrough.assign(key=fn) — copies the current dict and adds one new key, where the value is the result of calling fn with the current dict. All previous keys are preserved.
  • RunnableLambda(fn) — wraps a plain function into a Runnable so it can participate in the | pipeline. Unlike assign, it replaces the entire value rather than adding a key — used as the final step to extract a single return value.

Here is a hardcoded chain implementation for our agent. Each comment shows the state of the dict x after that step:

summarization_chain = (

    # x = {"query": "...url..."}
    RunnablePassthrough.assign(
        messages=lambda x: [HumanMessage(content=x["query"])]
    )
    # x = {"query": "...", "messages": [HumanMessage]}

    | RunnablePassthrough.assign(
        ai_response=lambda x: llm_with_tools.invoke(x["messages"])
    )
    # x = {..., "ai_response": AIMessage(tool_calls=[extract_video_id])}

    | RunnablePassthrough.assign(
        tool_messages=lambda x: [
            execute_tool(tc) for tc in x["ai_response"].tool_calls
        ]
    )
    # x = {..., "tool_messages": [ToolMessage(content="T-D1OfcDW1M")]}

    | RunnablePassthrough.assign(
        messages=lambda x: x["messages"] + [x["ai_response"]] + x["tool_messages"]
    )
    # x = {..., "messages": [HumanMessage, AIMessage, ToolMessage]}
    # messages is now a proper conversation history for the next LLM call

    | RunnablePassthrough.assign(
        ai_response2=lambda x: llm_with_tools.invoke(x["messages"])
    )
    # x = {..., "ai_response2": AIMessage(tool_calls=[fetch_transcript])}

    | RunnablePassthrough.assign(
        tool_messages2=lambda x: [
            execute_tool(tc) for tc in x["ai_response2"].tool_calls
        ]
    )
    # x = {..., "tool_messages2": [ToolMessage(content="[full transcript]")]}

    | RunnablePassthrough.assign(
        messages=lambda x: x["messages"] + [x["ai_response2"]] + x["tool_messages2"]
    )
    # x = {..., "messages": [HumanMessage, AIMessage, ToolMessage, AIMessage, ToolMessage]}

    | RunnablePassthrough.assign(
        summary=lambda x: llm_with_tools.invoke(x["messages"]).content
    )
    # x = {..., "summary": "The video discusses LangChain..."}

    | RunnableLambda(lambda x: x["summary"])
    # returns just the string — chain is done
)

Note that this chain is hardcoded for exactly two tool calls in a fixed order. It is useful as a learning device to make each round-trip explicit, but it is not suitable for production. The plain function equivalent is:

def summarize_video(query: str) -> str:
    # Build initial messages
    messages = [HumanMessage(content=query)]

    # First LLM call → wants to call extract_video_id
    ai_response = llm_with_tools.invoke(messages)
    messages.append(ai_response)
    for tc in ai_response.tool_calls:
        messages.append(execute_tool(tc))

    # Second LLM call → wants to call fetch_transcript
    ai_response2 = llm_with_tools.invoke(messages)
    messages.append(ai_response2)
    for tc in ai_response2.tool_calls:
        messages.append(execute_tool(tc))

    # Final LLM call → generates summary
    return llm_with_tools.invoke(messages).content

Both are functionally identical. The chain style becomes compelling in production for three reasons:

  1. Observability — the pipe style hooks automatically into tracing and monitoring tools such as LangSmith, without any manual instrumentation.
  2. Composability — chains can be passed around, combined, and reused as building blocks across different parts of an application.
  3. Parallelism — LangChain can run independent steps concurrently within a chain, without manual thread management.

10. Quick reference

ConceptWhat it does
@tool decoratorconverts a function into a StructuredTool at decoration time
bind_tools(tools)stores tool definitions locally, no API call
invoke(messages)sends full message history + tool schemas to the API
AIMessage.tool_callslist of tool calls the LLM wants executed; empty when the LLM is done
ToolMessagecarries a tool result back to the LLM; must include matching tool_call_id
messages[]the LLM's entire working memory, sent in full on every invoke() call
RunnablePassthrough.assigncopies the current dict and adds one new key; used to build pipelines step by step
RunnableLambdawraps a plain function into a Runnable so it can join a | pipeline

0 comments:

This work is licensed under BSD Zero Clause License | nacho4d ®