1. Initialising the model
from langchain.chat_models import init_chat_model
llm = init_chat_model("gpt-4o-mini", model_provider="openai")
init_chat_model wraps the underlying provider SDK. No network call happens here — it just configures a client object.
2. Defining tools with @tool
from langchain_core.tools import tool
tools = []
@tool
def extract_video_id(url: str) -> str:
    """Extracts the 11-character YouTube video ID from a URL."""
    ...
tools.append(extract_video_id)
@tool is a LangChain-specific decorator, not a Python built-in. As soon as Python finishes executing the decorated def statement, the decorator:
- Reads __name__ → becomes the tool's name
- Reads __doc__ → becomes the description sent to the LLM
- Reads type annotations → builds a Pydantic input schema
- Wraps everything into a StructuredTool object
The original function is kept on the tool as .func and is never lost. After decoration, your variable (extract_video_id) immediately becomes a StructuredTool; it never exists as a plain function in your namespace.
Calling a tool directly (outside of LLM flow):
extract_video_id.invoke({"url": "https://youtube.com/watch?v=abc"})
extract_video_id.func("https://youtube.com/watch?v=abc") # raw function
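To make the decoration step more concrete, here is a minimal standard-library sketch of a decorator that does the same four things. ToyTool and toy_tool are hypothetical names invented for this illustration; this is not LangChain's implementation, and it builds a plain dict where LangChain builds a Pydantic schema.

```python
import inspect
from typing import get_type_hints


class ToyTool:
    """Hypothetical stand-in for a tool object; not LangChain's StructuredTool."""

    def __init__(self, func):
        self.func = func                               # the original function is kept
        self.name = func.__name__                      # becomes the tool's name
        self.description = inspect.getdoc(func) or ""  # text that would be sent to the LLM
        hints = get_type_hints(func)
        hints.pop("return", None)                      # only the inputs form the schema
        self.args = {arg: typ.__name__ for arg, typ in hints.items()}

    def invoke(self, arguments: dict):
        # Like a tool, this is called with a dict of named arguments.
        return self.func(**arguments)


def toy_tool(func):
    # Runs once, right after the decorated `def` statement has executed.
    return ToyTool(func)


@toy_tool
def shout(text: str) -> str:
    """Upper-cases the given text."""
    return text.upper()


print(type(shout).__name__)  # ToyTool: the name no longer refers to a plain function
print(shout.name, "|", shout.description, "|", shout.args)
print(shout.invoke({"text": "hi"}), shout.func("hi"))
```

The last line mirrors the two call styles above: invoke() takes a dict, while .func takes the raw positional argument.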
3. Binding tools to the model
llm_with_tools = llm.bind_tools(tools)
This is pure preparation — no network call. It creates a new runnable that stores the tool definitions to include in every subsequent API call. When you later call .invoke(), LangChain serialises the tool definitions into the OpenAI JSON format:
{
  "model": "gpt-4o-mini",
  "messages": [...],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "extract_video_id",
        "description": "...",
        "parameters": { "type": "object", "properties": { "url": { "type": "string" } } }
      }
    }
  ]
}
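As a rough illustration of that serialisation step, the sketch below builds one entry of the "tools" array from a name, a description, and a dict of argument types. The helper to_tool_payload and the JSON_TYPES table are hypothetical, and the "required" key is an addition not shown in the payload above; LangChain derives the real schema from its Pydantic model.

```python
import json

# Hypothetical mapping from Python annotations to JSON Schema type names.
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def to_tool_payload(name: str, description: str, arg_types: dict) -> dict:
    """Build one entry of the "tools" array in the shape shown above."""
    properties = {arg: {"type": JSON_TYPES[typ]} for arg, typ in arg_types.items()}
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": list(arg_types),  # assumption: every argument is mandatory
            },
        },
    }


payload = to_tool_payload(
    "extract_video_id",
    "Extracts the 11-character YouTube video ID from a URL.",
    {"url": str},
)
print(json.dumps(payload, indent=2))
```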
4. Message types
LangChain uses typed message classes. Each serialises differently into the JSON sent to the API. The direction is always consistent:
| Class | Direction | Who creates it |
|---|---|---|
| SystemMessage | client → LLM | developer (instructions, persona) |
| HumanMessage | client → LLM | end user |
| AIMessage | LLM → client | the model |
| ToolMessage | client → LLM | your tool execution code |
The examples below show how to create each message type:
from langchain_core.messages import SystemMessage, HumanMessage, AIMessage, ToolMessage
- SystemMessage sits permanently at index 0 and is never recreated between turns.
messages = [SystemMessage(content="You are a helpful assistant that does this and that...")]
- HumanMessage is appended fresh on every user turn.
user_input = input("You: ")
messages.append(HumanMessage(content=user_input))
- AIMessage is the return value of invoke(). Append it as-is; never construct one by hand.
response = llm_with_tools.invoke(messages)
messages.append(response)  # response is already an AIMessage
- ToolMessage requires a tool_call_id matching the id the LLM assigned to its request. That id lives in tool_call["id"], which is why execute_tool reads it from the tool call dict rather than generating one itself.
for tool_call in response.tool_calls:
    messages.append(execute_tool(tool_call))  # execute_tool returns a ToolMessage
After a full exchange with one tool call, messages[] looks like this:
[
SystemMessage(content="You are a helpful assistant ..."), # permanent
HumanMessage(content="Summarise this video: youtube.com/..."),
AIMessage(tool_calls=[{"name": "extract_video_id", "args": {...}, "id": "tu_abc"}]),
ToolMessage(content="T-D1OfcDW1M", tool_call_id="tu_abc"),
AIMessage(content="Here is a summary ..."),
]
This list is the LLM's entire working memory — sent in full on every invoke() call.
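To give a feel for how the four message kinds turn into different JSON, here is a small sketch that maps each kind to an OpenAI-style chat dict. The helper to_wire and its kind labels are hypothetical, and the output only approximates what the provider integration actually sends; other providers use different shapes.

```python
import json


def to_wire(kind: str, content: str = "", tool_calls=None, tool_call_id=None) -> dict:
    """Hypothetical helper: map a message kind to an OpenAI-style chat dict."""
    role = {"system": "system", "human": "user", "ai": "assistant", "tool": "tool"}[kind]
    wire = {"role": role, "content": content}
    if tool_calls:  # the assistant is asking for tools to be run
        wire["tool_calls"] = [
            {
                "id": tc["id"],
                "type": "function",
                # in this format the arguments travel as a JSON-encoded string
                "function": {"name": tc["name"], "arguments": json.dumps(tc["args"])},
            }
            for tc in tool_calls
        ]
    if tool_call_id:  # a tool result answering one specific request
        wire["tool_call_id"] = tool_call_id
    return wire


call = {"name": "extract_video_id", "args": {"url": "https://youtube.com/watch?v=abc"}, "id": "tu_abc"}
wire_messages = [
    to_wire("system", "You are a helpful assistant."),
    to_wire("human", "Summarise this video."),
    to_wire("ai", tool_calls=[call]),
    to_wire("tool", "abc", tool_call_id="tu_abc"),
]
print([m["role"] for m in wire_messages])
```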
5. The tool call loop
When you call llm_with_tools.invoke(messages), the LLM does not execute your tools. It returns an AIMessage that says "I want to call this tool with these arguments". Your code must execute the tool and send the result back.
sequenceDiagram
participant P as Your program
participant L as LLM API
P->>L: `① POST /v1/messages — messages:[{role:"user",content:"summarize this video"}], tools:[extract_video_id,fetch_transcript]`
L-->>P: `② 200 stop_reason:"tool_use" — content:[{type:"tool_use",id:"tu_abc",name:"extract_video_id",input:{url:"youtube.com/..."}}]`
Note over P: `AIMessage(tool_calls) → invoke extract_video_id → returns "T-D1OfcDW1M"`
P->>L: `③ POST /v1/messages — role:"assistant":{type:"tool_use",id:"tu_abc"}, role:"user":{type:"tool_result",tool_use_id:"tu_abc",content:"T-D1OfcDW1M"}`
L-->>P: `④ 200 stop_reason:"tool_use" — content:[{type:"tool_use",id:"tu_xyz",name:"fetch_transcript",input:{video_id:"T-D1OfcDW1M"}}]`
Note over P: `AIMessage(tool_calls) → invoke fetch_transcript → returns "[full transcript]"`
P->>L: `⑤ POST /v1/messages — role:"user":{type:"tool_result",tool_use_id:"tu_xyz",content:"[full transcript]"}`
L-->>P: `⑥ 200 stop_reason:"end_turn" — content:[{type:"text",text:"Here is a summary..."}]`
Key rules:
- Always append the AIMessage to messages[] before appending the ToolMessage. The API requires them to appear in order.
- When response.tool_calls is non-empty, the loop continues. When it is empty, the LLM is done. Note that the raw field name varies by provider: OpenAI calls it finish_reason, Anthropic calls it stop_reason. LangChain abstracts this behind response.tool_calls.
- AIMessage.content is typically empty ('') while the LLM is in tool-calling mode: in most providers it has nothing to say to the user until all tools have been executed.
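The ordering rule can be checked mechanically before a history is sent. The sketch below uses plain dicts with a hypothetical "type" key in place of LangChain's message classes; find_unanswered_tool_calls is an illustrative helper, not part of any library.

```python
def find_unanswered_tool_calls(messages: list) -> set:
    """Return the ids of tool calls that never received a matching tool result.

    Messages are plain dicts here (hypothetical shape), for example
    {"type": "ai", "tool_calls": [{"id": "tu_abc"}]} and
    {"type": "tool", "tool_call_id": "tu_abc"}.
    """
    pending = set()
    for message in messages:
        if message["type"] == "ai":
            pending.update(tc["id"] for tc in message.get("tool_calls", []))
        elif message["type"] == "tool":
            call_id = message["tool_call_id"]
            if call_id not in pending:
                # A result arrived before (or without) the request it answers.
                raise ValueError(f"tool result {call_id!r} has no preceding request")
            pending.discard(call_id)
    return pending


good_history = [
    {"type": "human"},
    {"type": "ai", "tool_calls": [{"id": "tu_abc"}]},
    {"type": "tool", "tool_call_id": "tu_abc"},
    {"type": "ai", "tool_calls": []},
]
print(find_unanswered_tool_calls(good_history))       # set(): every request was answered
print(find_unanswered_tool_calls(good_history[:2]))   # {'tu_abc'}: the result is still missing
```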
6. The tool mapping
tool_mapping = {t.name: t for t in tools}
Built from the tools list automatically — no need to maintain it by hand. Used to look up the callable from the string name the LLM returns:
tool_fn = tool_mapping[tool_call["name"]]
result = tool_fn.invoke(tool_call["args"])
7. A reusable execute_tool helper
def execute_tool(tool_call):
    try:
        result = tool_mapping[tool_call["name"]].invoke(tool_call["args"])
        return ToolMessage(content=str(result), tool_call_id=tool_call["id"])
    except Exception as e:
        return ToolMessage(content=f"Error: {str(e)}", tool_call_id=tool_call["id"])
The try/except matters: if the tool throws, you still need to return a ToolMessage with the matching tool_call_id, otherwise messages[] becomes malformed and the next API call fails. Returning an error message lets the LLM see what went wrong and potentially recover.
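The behaviour described above can be tried without an API key. This sketch swaps in hypothetical stand-ins (FakeToolMessage, extract_id, fake_tool_mapping, execute_tool_sketch) for the real classes and tools, and shows that a failing tool and an unknown tool name both still produce a message carrying the matching id.

```python
from dataclasses import dataclass


@dataclass
class FakeToolMessage:  # hypothetical stand-in for ToolMessage
    content: str
    tool_call_id: str


def extract_id(url: str) -> str:
    # Toy tool: raises when the URL carries no "v=" parameter.
    if "v=" not in url:
        raise ValueError("no video id in URL")
    return url.split("v=")[1][:11]


fake_tool_mapping = {"extract_id": extract_id}  # tool name -> callable


def execute_tool_sketch(tool_call: dict) -> FakeToolMessage:
    try:
        result = fake_tool_mapping[tool_call["name"]](**tool_call["args"])
        return FakeToolMessage(str(result), tool_call["id"])
    except Exception as e:  # also covers the KeyError for an unknown tool name
        return FakeToolMessage(f"Error: {e}", tool_call["id"])


ok = execute_tool_sketch(
    {"name": "extract_id", "args": {"url": "https://youtube.com/watch?v=T-D1OfcDW1M"}, "id": "tu_1"}
)
bad = execute_tool_sketch({"name": "extract_id", "args": {"url": "https://example.com"}, "id": "tu_2"})
unknown = execute_tool_sketch({"name": "nope", "args": {}, "id": "tu_3"})
for message in (ok, bad, unknown):
    print(message)
```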
8. A minimal agent loop
Putting it all together:
def run_agent(messages):  # note: mutates messages in place
    while True:
        response = llm_with_tools.invoke(messages)
        messages.append(response)
        if not response.tool_calls:  # LLM is done
            return response
        for tool_call in response.tool_calls:
            messages.append(execute_tool(tool_call))
The LLM can request multiple tools in a single response — hence the for loop over response.tool_calls. All results are appended before the next invoke().
The tool-execution step is essentially what LangGraph's ToolNode does internally, and the surrounding loop is what LangGraph's prebuilt agents run for you; the tutorial is showing you the manual version so you understand what the framework automates.
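One practical variation is to cap the number of round-trips so that a model which keeps requesting tools cannot loop forever. The sketch below is an assumption-laden variant of the loop above, not part of the original tutorial: ScriptedLLM, FakeAIMessage, run_agent_bounded, and the max_turns parameter are all hypothetical, and the fake model simply replays canned responses so the loop can be exercised offline.

```python
from dataclasses import dataclass, field


@dataclass
class FakeAIMessage:  # hypothetical stand-in for AIMessage
    content: str = ""
    tool_calls: list = field(default_factory=list)


class ScriptedLLM:
    """Hypothetical fake model that replays canned responses, one per invoke()."""

    def __init__(self, responses):
        self._responses = iter(responses)

    def invoke(self, messages):
        return next(self._responses)


def run_agent_bounded(llm, messages, run_tool, max_turns=5):
    """Same loop as run_agent, but gives up after max_turns model calls."""
    for _ in range(max_turns):
        response = llm.invoke(messages)
        messages.append(response)
        if not response.tool_calls:  # no tool requests: this is the final answer
            return response
        for tool_call in response.tool_calls:
            messages.append(run_tool(tool_call))
    raise RuntimeError(f"no final answer after {max_turns} turns")


fake_llm = ScriptedLLM([
    FakeAIMessage(tool_calls=[{"name": "extract_video_id", "args": {}, "id": "tu_abc"}]),
    FakeAIMessage(tool_calls=[{"name": "fetch_transcript", "args": {}, "id": "tu_xyz"}]),
    FakeAIMessage(content="Here is a summary..."),
])
history = ["Summarise this video"]
final = run_agent_bounded(fake_llm, history, run_tool=lambda tc: f"result of {tc['name']}")
print(final.content, len(history))
```

After the run, history holds the question, two request/result pairs, and the final answer.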
9. Chaining
LangChain's chaining feature allows composing multiple steps into a single pipeline using the | operator — the same concept as Unix pipes:
my_chain = p1 | p2 | p3
Each step must be a Runnable (plain functions and dicts placed next to a Runnable are coerced into one automatically), which allows calling my_chain.invoke({...}) on the whole pipeline. Data enters via invoke() and flows through each step in order. The chain itself is just a description of transformations; no data flows through it at definition time.
Two building blocks make this work:
- RunnablePassthrough.assign(key=fn): copies the current dict and adds one new key, where the value is the result of calling fn with the current dict. All previous keys are preserved.
- RunnableLambda(fn): wraps a plain function into a Runnable so it can participate in the | pipeline. Unlike assign, it replaces the entire value rather than adding a key, so it is used as the final step to extract a single return value.
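The mechanics behind | can be imitated in a few lines of plain Python by overloading __or__. The Step class and the assign helper below are hypothetical miniatures written for this illustration; they are not LangChain's Runnable implementation, which also handles batching, streaming, and tracing.

```python
class Step:
    """Hypothetical miniature of a Runnable: wraps a function and supports `|`."""

    def __init__(self, fn):
        self.fn = fn

    def invoke(self, value):
        return self.fn(value)

    def __or__(self, other):
        # Composing only builds a new Step; nothing runs until invoke() is called.
        return Step(lambda value: other.invoke(self.invoke(value)))


def assign(**new_keys):
    """Toy counterpart of RunnablePassthrough.assign: copy the dict, add keys."""
    return Step(lambda d: {**d, **{key: fn(d) for key, fn in new_keys.items()}})


toy_chain = (
    assign(words=lambda d: d["text"].split())    # adds "words", keeps "text"
    | assign(count=lambda d: len(d["words"]))    # adds "count", keeps the rest
    | Step(lambda d: d["count"])                 # like RunnableLambda: replaces the whole value
)

print(toy_chain.invoke({"text": "pipes compose small steps"}))  # 4
```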
Here is a hardcoded chain implementation for our agent. Each comment shows the state of the dict x after that step:
from langchain_core.runnables import RunnableLambda, RunnablePassthrough

summarization_chain = (
    # x = {"query": "...url..."}
    RunnablePassthrough.assign(
        messages=lambda x: [HumanMessage(content=x["query"])]
    )
    # x = {"query": "...", "messages": [HumanMessage]}
    | RunnablePassthrough.assign(
        ai_response=lambda x: llm_with_tools.invoke(x["messages"])
    )
    # x = {..., "ai_response": AIMessage(tool_calls=[extract_video_id])}
    | RunnablePassthrough.assign(
        tool_messages=lambda x: [
            execute_tool(tc) for tc in x["ai_response"].tool_calls
        ]
    )
    # x = {..., "tool_messages": [ToolMessage(content="T-D1OfcDW1M")]}
    | RunnablePassthrough.assign(
        messages=lambda x: x["messages"] + [x["ai_response"]] + x["tool_messages"]
    )
    # x = {..., "messages": [HumanMessage, AIMessage, ToolMessage]}
    # messages is now a proper conversation history for the next LLM call
    | RunnablePassthrough.assign(
        ai_response2=lambda x: llm_with_tools.invoke(x["messages"])
    )
    # x = {..., "ai_response2": AIMessage(tool_calls=[fetch_transcript])}
    | RunnablePassthrough.assign(
        tool_messages2=lambda x: [
            execute_tool(tc) for tc in x["ai_response2"].tool_calls
        ]
    )
    # x = {..., "tool_messages2": [ToolMessage(content="[full transcript]")]}
    | RunnablePassthrough.assign(
        messages=lambda x: x["messages"] + [x["ai_response2"]] + x["tool_messages2"]
    )
    # x = {..., "messages": [HumanMessage, AIMessage, ToolMessage, AIMessage, ToolMessage]}
    | RunnablePassthrough.assign(
        summary=lambda x: llm_with_tools.invoke(x["messages"]).content
    )
    # x = {..., "summary": "The video discusses LangChain..."}
    | RunnableLambda(lambda x: x["summary"])
    # returns just the string; the chain is done
)
Note that this chain is hardcoded for exactly two tool calls in a fixed order. It is useful as a learning device to make each round-trip explicit, but it is not suitable for production. The plain function equivalent is:
def summarize_video(query: str) -> str:
    # Build initial messages
    messages = [HumanMessage(content=query)]
    # First LLM call → wants to call extract_video_id
    ai_response = llm_with_tools.invoke(messages)
    messages.append(ai_response)
    for tc in ai_response.tool_calls:
        messages.append(execute_tool(tc))
    # Second LLM call → wants to call fetch_transcript
    ai_response2 = llm_with_tools.invoke(messages)
    messages.append(ai_response2)
    for tc in ai_response2.tool_calls:
        messages.append(execute_tool(tc))
    # Final LLM call → generates summary
    return llm_with_tools.invoke(messages).content
Both are functionally identical. The chain style becomes compelling in production for three reasons:
- Observability — the pipe style hooks automatically into tracing and monitoring tools such as LangSmith, without any manual instrumentation.
- Composability — chains can be passed around, combined, and reused as building blocks across different parts of an application.
- Parallelism — LangChain can run independent steps concurrently within a chain, without manual thread management.
10. Quick reference
| Concept | What it does |
|---|---|
| @tool decorator | converts a function into a StructuredTool at decoration time |
| bind_tools(tools) | stores tool definitions locally, no API call |
| invoke(messages) | sends full message history + tool schemas to the API |
| AIMessage.tool_calls | list of tool calls the LLM wants executed; empty when the LLM is done |
| ToolMessage | carries a tool result back to the LLM; must include matching tool_call_id |
| messages[] | the LLM's entire working memory, sent in full on every invoke() call |
| RunnablePassthrough.assign | copies the current dict and adds one new key; used to build pipelines step by step |
| RunnableLambda | wraps a plain function into a Runnable so it can join a \| pipeline |