Build a Customer Support Chatbot with LangChain

This guide shows you how to build a working customer support chatbot in Python using LangChain, with a model, a persona prompt, conversation memory, and a lookup tool, in about 30 minutes. LangChain is an open-source library that connects an AI model to prompts, memory, and your own functions so you write less plumbing and more logic.

By the end you will have a chatbot that remembers what the customer said earlier, stays in a support tone, and can look up a real order status instead of inventing one.

Prerequisites

You only need a few things beyond a working Python setup. If Python is not installed yet, follow How to Install Python for AI on Windows or How to Install Python for AI Projects on Mac first.

Install the libraries:

pip install langchain langchain-openai langchain-community python-dotenv

Store your key in a file named .env in the project folder so it never lands in your code:

OPENAI_API_KEY=sk-your-real-key-here

Add .env to your .gitignore immediately so you never commit your key to a repository:

echo ".env" >> .gitignore

Confirm the key loads before you go further:

python -c "import os; from dotenv import load_dotenv; load_dotenv(); print('Key loaded:', bool(os.getenv('OPENAI_API_KEY')))"

If that prints Key loaded: True, you are ready.
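For the script itself, a fail-fast check can be more useful than a one-off command. The sketch below is one possible shape, not part of LangChain or python-dotenv: require_env and mask are hypothetical helper names, and the code assumes load_dotenv() has already been called earlier in your script.

```python
import os


def require_env(name: str) -> str:
    """Return an environment variable's value, or raise a clear error if it is missing."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set. Check that .env is in the folder you run from."
        )
    return value


def mask(secret: str, visible: int = 4) -> str:
    """Hide all but the last few characters so the key never appears in logs."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]
```

Calling print(mask(require_env("OPENAI_API_KEY"))) right after load_dotenv() confirms the key is present without echoing it in full.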

Step 1: Connect a model and a prompt

The two building blocks are the model (the AI that writes replies) and the prompt (the instructions that shape its tone). In LangChain you create a model object once and reuse it.

The prompt below uses a ChatPromptTemplate, which is a reusable message template. The MessagesPlaceholder is a slot where past conversation turns will be inserted in the next step, and {input} is where the customer's new message goes.

import os
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.output_parsers import StrOutputParser

load_dotenv()

SYSTEM_PROMPT = """You are a Tier-1 customer support agent for an online store.
- Keep a professional, empathetic, solution-oriented tone.
- If you are missing information, ask one clarifying question instead of guessing.
- Never invent a policy, refund amount, or order status. Use the tools provided.
- Keep replies under three sentences unless steps are required."""

prompt = ChatPromptTemplate.from_messages([
    ("system", SYSTEM_PROMPT),
    MessagesPlaceholder(variable_name="history"),
    ("human", "{input}"),
])

# temperature 0.2 keeps answers steady and predictable for support
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.2)

# StrOutputParser turns the model's reply object into plain text
chain = prompt | llm | StrOutputParser()

The | pipe joins the pieces into a single sequence: the prompt fills in, the model answers, and the parser hands you a clean string. You can already test it with chain.invoke({"input": "Hi", "history": []}).
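If the pipe syntax looks like magic, the dependency-free sketch below may help. It is a toy model of the idea, not LangChain's actual implementation: Step, fill_prompt, fake_model, and to_text are hypothetical names, and the "model" only upper-cases its input.

```python
from typing import Any, Callable


class Step:
    """A toy stand-in for a runnable: wraps one function and supports `|`."""

    def __init__(self, fn: Callable[[Any], Any]) -> None:
        self.fn = fn

    def invoke(self, value: Any) -> Any:
        return self.fn(value)

    def __or__(self, other: "Step") -> "Step":
        # a | b builds a new step that runs a, then feeds its output to b
        return Step(lambda value: other.invoke(self.invoke(value)))


# Hypothetical stand-ins for the prompt, the model, and the output parser
fill_prompt = Step(lambda data: f"Customer says: {data['input']}")
fake_model = Step(lambda text: {"content": text.upper()})
to_text = Step(lambda reply: reply["content"])

toy_chain = fill_prompt | fake_model | to_text
print(toy_chain.invoke({"input": "Hi"}))  # prints: CUSTOMER SAYS: HI
```

Each stage only needs to accept what the previous stage returns, which is why the parser can be dropped or swapped without touching the prompt or the model.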

Step 2: Add memory so the bot remembers

A support bot that forgets the previous message is frustrating. LangChain solves this with message history: it saves every turn under a session_id and replays it into the history placeholder automatically.

RunnableWithMessageHistory wraps your chain and does this for you. You supply a function that returns a storage object for a given session id. The example below keeps that storage in a Python dict in memory, which is fine for development but is lost when the process exits.

from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_community.chat_message_histories import ChatMessageHistory

store: dict[str, ChatMessageHistory] = {}

def get_session_history(session_id: str) -> ChatMessageHistory:
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    return store[session_id]

chatbot = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="input",
    history_messages_key="history",
)

Now each call needs a session id so the bot knows which conversation it is in:

config = {"configurable": {"session_id": "customer-42"}}
print(chatbot.invoke({"input": "My order is late."}, config=config))
print(chatbot.invoke({"input": "What did I just ask about?"}, config=config))

The second reply will reference the late order, because the first turn is now in memory. For a deeper look at persistence options, see Add Memory to a Python Chatbot.
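To make the save-and-replay cycle concrete, here is a dependency-free sketch of roughly what a history wrapper does on each call. It is an illustration under simplifying assumptions, not how RunnableWithMessageHistory is implemented: toy_store, with_history, and counting_model are hypothetical names, and the "model" just reports how much history it was shown.

```python
from typing import Callable

Message = tuple[str, str]  # (role, text)

toy_store: dict[str, list[Message]] = {}


def with_history(model: Callable[[list[Message]], str]) -> Callable[[str, str], str]:
    """Wrap a model so each call sees, and then extends, its session's history."""

    def invoke(user_input: str, session_id: str) -> str:
        history = toy_store.setdefault(session_id, [])
        # Replay earlier turns, then add the new customer message
        reply = model(history + [("human", user_input)])
        # Save both sides of the turn for the next call
        history.append(("human", user_input))
        history.append(("ai", reply))
        return reply

    return invoke


def counting_model(messages: list[Message]) -> str:
    """Hypothetical model: reports how many earlier messages it was shown."""
    return f"I can see {len(messages) - 1} earlier message(s)."


toy_bot = with_history(counting_model)
print(toy_bot("My order is late.", "customer-42"))     # 0 earlier messages
print(toy_bot("What did I just ask?", "customer-42"))  # 2 earlier messages
print(toy_bot("Hello", "customer-43"))                 # 0 earlier messages, new session
```

The third call shows why the session id matters: a different id starts from an empty history even though the same wrapper and model are used.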

Step 3: Give the bot a tool to look up real answers

Left alone, a model will happily guess an order status. A tool is a plain Python function the model is allowed to call when it needs a fact. LangChain turns the function's name, type hints, and docstring into a description that is sent to the model, and the model uses that description to decide when and how to call the tool, so write a clear docstring.

Here we add a fake order-lookup function. In a real system this would query your database or order API.

from langchain_core.tools import tool

# Pretend this is your real order database
ORDERS = {
    "1001": {"status": "shipped", "eta": "June 20"},
    "1002": {"status": "processing", "eta": "June 24"},
}

@tool
def lookup_order(order_id: str) -> str:
    """Look up the status and delivery estimate for an order by its ID."""
    order = ORDERS.get(order_id)
    if not order:
        return f"No order found with ID {order_id}."
    return f"Order {order_id}: {order['status']}, estimated delivery {order['eta']}."

# Bind the tool so the model knows it can call it
llm_with_tools = llm.bind_tools([lookup_order])

To let the model actually run the tool and then answer using the result, route any tool calls back to your function. The small helper below handles both cases: a direct text reply, or a request to call the tool.

from langchain_core.messages import HumanMessage, ToolMessage

def answer_with_tools(user_input: str) -> str:
    messages = [("system", SYSTEM_PROMPT), HumanMessage(user_input)]
    ai_msg = llm_with_tools.invoke(messages)

    # If the model asked for no tool, return its text directly
    if not ai_msg.tool_calls:
        return ai_msg.content

    # Otherwise run each requested tool and feed the result back
    messages.append(ai_msg)
    for call in ai_msg.tool_calls:
        result = lookup_order.invoke(call["args"])
        messages.append(ToolMessage(result, tool_call_id=call["id"]))
    return llm_with_tools.invoke(messages).content

print(answer_with_tools("Where is order 1001?"))

Now the bot returns the real shipped status for order 1001 instead of a made-up date. If you would rather have the bot answer from a knowledge base of help articles, Connect a Chatbot to Your Docs with RAG shows the retrieval approach.
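The answer_with_tools helper always runs lookup_order, which works while there is only one tool. Once you bind a second one, each call has to be routed by its name. The sketch below shows one way to do that in plain Python. It assumes tool calls arrive as dicts with "name", "args", and "id" keys, as in the helper above; TOOLS, run_tool_calls, and list_order_ids are hypothetical names added for illustration.

```python
from typing import Callable

ORDERS = {"1001": {"status": "shipped", "eta": "June 20"}}


def lookup_order(order_id: str) -> str:
    order = ORDERS.get(order_id)
    if not order:
        return f"No order found with ID {order_id}."
    return f"Order {order_id}: {order['status']}, estimated delivery {order['eta']}."


def list_order_ids() -> str:
    """Hypothetical second tool, included only to show the routing."""
    return ", ".join(sorted(ORDERS))


# Registry that maps the tool name the model sends to the function to run
TOOLS: dict[str, Callable[..., str]] = {
    "lookup_order": lookup_order,
    "list_order_ids": list_order_ids,
}


def run_tool_calls(tool_calls: list[dict]) -> list[dict]:
    """Run each requested tool and pair the result with its call id."""
    results = []
    for call in tool_calls:
        fn = TOOLS.get(call["name"])
        if fn is None:
            content = f"Unknown tool: {call['name']}"
        else:
            try:
                content = fn(**call["args"])
            except Exception as exc:
                # Report the failure as text instead of crashing the chat
                content = f"Tool {call['name']} failed: {exc}"
        results.append({"tool_call_id": call["id"], "content": content})
    return results
```

Returning an error string for an unknown tool or bad arguments gives the model a chance to apologize or ask a clarifying question, rather than ending the session with a traceback.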

Step 4: Run an interactive chat loop

Tie it together with a loop that reads customer messages and prints replies. This version uses the memory-aware chatbot from Step 2 so the conversation stays in context; it does not call the Step 3 lookup tool, which stays available through answer_with_tools. A try/except keeps a single failed API request from crashing the session.

from openai import OpenAIError, RateLimitError

def reply(user_input: str, session_id: str) -> str:
    try:
        return chatbot.invoke(
            {"input": user_input},
            config={"configurable": {"session_id": session_id}},
        )
    except RateLimitError:
        return "We are busy right now. Please try again in 30 seconds."
    except OpenAIError as e:
        return f"Sorry, something went wrong: {e}"

if __name__ == "__main__":
    session_id = "support-session-01"
    print("Support chatbot ready. Type 'quit' to exit.")
    while True:
        message = input("\nCustomer: ").strip()
        if message.lower() in {"quit", "exit"}:
            print("Session closed.")
            break
        if not message:
            continue
        print(f"Agent: {reply(message, session_id)}")

Save the file as support_bot.py and run it with python support_bot.py. Ask a question, then ask a follow-up, and watch it keep the thread.
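The loop above hard-codes one session id, which is fine for a single terminal. If several customers share one process, each conversation thread needs its own id so their histories do not mix. A minimal sketch, assuming you already have some customer identifier; new_session_id and the id format are hypothetical choices, not something LangChain requires.

```python
import uuid


def new_session_id(customer_id: str) -> str:
    """Build a session id that is unique per conversation thread."""
    # A short random suffix keeps two threads from the same customer apart
    return f"{customer_id}-{uuid.uuid4().hex[:8]}"
```

You would call it once when a conversation starts and pass the same value to every reply(...) call in that thread.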

Key parameters quick reference

Parameter | Where | Default | What it does
model | ChatOpenAI(...) | library default; set it explicitly | Which model answers. gpt-4o-mini is cheap and fast for support.
temperature | ChatOpenAI(...) | 0.7 (version-dependent) | How varied replies are. Use 0.0 to 0.3 so support answers stay consistent.
session_id | config={"configurable": ...} | required | Picks which conversation's memory to load. One id per customer thread.
history_messages_key | RunnableWithMessageHistory(...) | none | Must match the MessagesPlaceholder name so past turns get injected.

Troubleshooting

  1. ValidationError: OPENAI_API_KEY ... field required — The key did not load. Check that .env is in the folder you run from and that you call load_dotenv() before creating the model. See Fix the 401 Unauthorized Error in OpenAI Python.
  2. The bot forgets earlier messages — You either changed the session_id between calls or your history_messages_key does not match the MessagesPlaceholder name. Keep both consistent.
  3. RateLimitError: 429 — You sent requests faster than your plan allows. Wait, then add backoff. Fix the 429 Rate-Limit Error in Python has a retry pattern.
  4. The model ignores your tool — Make sure you called llm.bind_tools([...]) and that the tool's docstring clearly describes when to use it. A vague docstring makes the model skip it.
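For the rate-limit case in item 3, "add backoff" can be sketched as follows. This is a generic exponential backoff with jitter, not the pattern from the linked article: call_with_backoff is a hypothetical helper, and TransientError stands in for the RateLimitError you would catch in the real bot.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical stand-in for a retryable error such as RateLimitError."""


def call_with_backoff(
    fn: Callable[[], T],
    retries: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, waiting base_delay * 2**attempt (plus jitter) after each failure."""
    for attempt in range(retries + 1):
        try:
            return fn()
        except TransientError:
            if attempt == retries:
                raise  # out of retries: let the caller decide what to say
            # Double the wait each time; jitter avoids synchronized retries
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.5)
            sleep(delay)
    raise AssertionError("unreachable")
```

The sleep parameter exists so the waits can be replaced in tests. In the chat loop you could wrap the chatbot.invoke call in a zero-argument lambda and pass it as fn.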

When to use this vs. alternatives

LangChain saves time, but it is not always the right pick. Here is when each approach fits.

  • Use LangChain when you want memory, tools, and retrieval to plug together with little glue code, or when you may swap models later. The wrappers shown here are the payoff.
  • Use the raw openai SDK when your bot is a single prompt with no memory or tools. The extra LangChain layer adds dependencies and abstraction you will not use, so a direct API call is simpler to read and debug.
  • Use a hosted platform (a no-code support tool) when you need a polished widget, analytics, and human handoff out of the box, and you are willing to trade custom logic for speed.

A good rule: reach for LangChain once you need at least two of memory, tools, and document retrieval. Below that, stay with the plain SDK.

To take this further, add real-time replies with Stream Chatbot Responses with Python, or wire it to your sales data through CRM Data Integration with AI.
