This guide shows you how to build a working customer support chatbot in Python using LangChain, with a model, a persona prompt, conversation memory, and a lookup tool, in about 30 minutes. LangChain is an open-source library that connects an AI model to prompts, memory, and your own functions so you write less plumbing and more logic.
By the end you will have a chatbot that remembers what the customer said earlier, stays in a support tone, and can look up a real order status instead of inventing one.
Prerequisites
You only need a few things beyond a working Python setup. If Python is not installed yet, follow How to Install Python for AI on Windows or How to Install Python for AI Projects on Mac first.
- Python 3.10 or newer. Check with python --version.
- A virtual environment. See Create a Python Virtual Environment for AI if this is new.
- An OpenAI API key. If you are choosing a provider, OpenAI vs Anthropic API for Beginners compares them.
Install the libraries:
pip install langchain langchain-openai langchain-community python-dotenv
Store your key in a file named .env in the project folder so it never lands in your code:
OPENAI_API_KEY=sk-your-real-key-here
Add .env to your .gitignore immediately so you never commit your key to a repository:
echo ".env" >> .gitignore
Confirm the key loads before you go further:
python -c "import os; from dotenv import load_dotenv; load_dotenv(); print('Key loaded:', bool(os.getenv('OPENAI_API_KEY')))"
If that prints Key loaded: True, you are ready.
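If you would rather have the script stop with a readable message than fail later inside the model call, a small guard can help. The sketch below uses only the standard library. The name require_env is a hypothetical helper, not part of python-dotenv or LangChain, and it assumes load_dotenv() has already run.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error.

    Hypothetical helper for illustration; call it after load_dotenv().
    """
    value = os.getenv(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set. Check that .env is in the folder you run from."
        )
    return value
```

Calling require_env("OPENAI_API_KEY") near the top of your script surfaces a missing key immediately instead of on the first request.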
Step 1: Connect a model and a prompt
The two building blocks are the model (the AI that writes replies) and the prompt (the instructions that shape its tone). In LangChain you create a model object once and reuse it.
The prompt below uses a ChatPromptTemplate, which is a reusable message template. The MessagesPlaceholder is a slot where past conversation turns will be inserted in the next step, and {input} is where the customer's new message goes.
import os
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.output_parsers import StrOutputParser
load_dotenv()
SYSTEM_PROMPT = """You are a Tier-1 customer support agent for an online store.
- Keep a professional, empathetic, solution-oriented tone.
- If you are missing information, ask one clarifying question instead of guessing.
- Never invent a policy, refund amount, or order status. Use the tools provided.
- Keep replies under three sentences unless steps are required."""
prompt = ChatPromptTemplate.from_messages([
    ("system", SYSTEM_PROMPT),
    MessagesPlaceholder(variable_name="history"),
    ("human", "{input}"),
])
# temperature 0.2 keeps answers steady and predictable for support
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.2)
# StrOutputParser turns the model's reply object into plain text
chain = prompt | llm | StrOutputParser()
The | pipe joins the pieces into a single sequence: the prompt fills in, the model answers, and the parser hands you a clean string. You can already test it with chain.invoke({"input": "Hi", "history": []}).
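If the pipe syntax looks like magic, it may help to see the idea in plain Python. The sketch below is a toy, not LangChain's implementation: Step, fill_prompt, fake_model, and parse_text are hypothetical names, and the "model" just uppercases text. It only shows how overloading | can chain stages so each one receives the previous stage's output.

```python
from typing import Any, Callable


class Step:
    """A tiny stand-in for a pipeable component (hypothetical, not LangChain's class)."""

    def __init__(self, func: Callable[[Any], Any]) -> None:
        self.func = func

    def __or__(self, other: "Step") -> "Step":
        # a | b builds a new Step that runs a, then feeds its output to b
        return Step(lambda value: other.func(self.func(value)))

    def invoke(self, value: Any) -> Any:
        return self.func(value)


# Three toy stages that mirror prompt | llm | parser
fill_prompt = Step(lambda data: f"Customer says: {data['input']}")
fake_model = Step(lambda text: {"content": text.upper()})
parse_text = Step(lambda reply: reply["content"])

toy_chain = fill_prompt | fake_model | parse_text
print(toy_chain.invoke({"input": "Hi"}))  # prints: CUSTOMER SAYS: HI
```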
Step 2: Add memory so the bot remembers
A support bot that forgets the previous message is frustrating. LangChain solves this with message history: it saves every turn under a session_id and replays it into the history placeholder automatically.
RunnableWithMessageHistory wraps your chain and does this for you. You supply a function that returns a storage object for a given session id. Below it is in-memory, which is fine for development.
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_community.chat_message_histories import ChatMessageHistory
store: dict[str, ChatMessageHistory] = {}
def get_session_history(session_id: str) -> ChatMessageHistory:
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    return store[session_id]

chatbot = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="input",
    history_messages_key="history",
)
Now each call needs a session id so the bot knows which conversation it is in:
config = {"configurable": {"session_id": "customer-42"}}
print(chatbot.invoke({"input": "My order is late."}, config=config))
print(chatbot.invoke({"input": "What did I just ask about?"}, config=config))
The second reply will reference the late order, because the first turn is now in memory. For a deeper look at persistence options, see Add Memory to a Python Chatbot.
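To make the mechanics concrete, here is a plain-Python sketch of what a history wrapper does on each call: load the turns stored for the session id, add the new message, get a reply, and save both sides of the turn. It is an illustration under simplifying assumptions, not how RunnableWithMessageHistory is implemented. The names make_remembering_bot and count_turns are hypothetical, and the "model" only counts the messages it can see.

```python
from typing import Callable

Message = tuple[str, str]  # (role, text)


def make_remembering_bot(
    respond: Callable[[list[Message]], str],
) -> Callable[[str, str], str]:
    """Wrap a reply function so each session_id keeps its own history."""
    histories: dict[str, list[Message]] = {}

    def invoke(user_input: str, session_id: str) -> str:
        history = histories.setdefault(session_id, [])
        # Replay earlier turns, then add the new customer message
        messages = history + [("human", user_input)]
        answer = respond(messages)
        # Save both sides of the turn for the next call
        history.extend([("human", user_input), ("ai", answer)])
        return answer

    return invoke


def count_turns(messages: list[Message]) -> str:
    """Stand-in for a model: reports how many customer messages it can see."""
    seen = sum(1 for role, _ in messages if role == "human")
    return f"I can see {seen} customer message(s)."


bot = make_remembering_bot(count_turns)
print(bot("My order is late.", "customer-42"))  # I can see 1 customer message(s).
print(bot("Any update?", "customer-42"))        # I can see 2 customer message(s).
print(bot("Hello", "customer-99"))              # I can see 1 customer message(s).
```

The third call starts from an empty history because it uses a different session id, which is the same reason a changed session_id makes the real bot appear to forget.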
Step 3: Give the bot a tool to look up real answers
Left alone, a model will happily guess an order status. A tool is a plain Python function the model is allowed to call when it needs a fact. LangChain turns the function's name, type hints, and docstring into a description that is sent to the model, and the model uses that description to decide when and how to call the tool, so write a clear docstring.
Here we add a fake order-lookup function. In a real system this would query your database or order API.
from langchain_core.tools import tool
# Pretend this is your real order database
ORDERS = {
    "1001": {"status": "shipped", "eta": "June 20"},
    "1002": {"status": "processing", "eta": "June 24"},
}

@tool
def lookup_order(order_id: str) -> str:
    """Look up the status and delivery estimate for an order by its ID."""
    order = ORDERS.get(order_id)
    if not order:
        return f"No order found with ID {order_id}."
    return f"Order {order_id}: {order['status']}, estimated delivery {order['eta']}."
# Bind the tool so the model knows it can call it
llm_with_tools = llm.bind_tools([lookup_order])
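To see why the docstring and type hints matter, it can help to look at what information a function exposes about itself. The sketch below uses the standard inspect module to collect a name, a description, and parameter types. It is illustrative only: describe_tool is a hypothetical helper, the dictionary it returns is not the schema format LangChain produces, and lookup_order here is a plain, undecorated copy made for the sketch.

```python
import inspect
from typing import Callable


def describe_tool(func: Callable) -> dict:
    """Collect the name, parameter types, and docstring of a function.

    Illustrative only: this is not the schema format LangChain produces.
    """
    signature = inspect.signature(func)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": {
            name: getattr(param.annotation, "__name__", str(param.annotation))
            for name, param in signature.parameters.items()
        },
    }


def lookup_order(order_id: str) -> str:
    """Look up the status and delivery estimate for an order by its ID."""
    return "not implemented in this sketch"


print(describe_tool(lookup_order))
```

If the docstring were empty, the description field would be empty too, and a model would have little to go on when deciding whether the tool applies.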
To let the model actually run the tool and then answer using the result, route any tool calls back to your function. The small helper below handles both cases: a direct text reply, or a request to call the tool.
from langchain_core.messages import HumanMessage, ToolMessage
def answer_with_tools(user_input: str) -> str:
    messages = [("system", SYSTEM_PROMPT), HumanMessage(user_input)]
    ai_msg = llm_with_tools.invoke(messages)
    # If the model asked for no tool, return its text directly
    if not ai_msg.tool_calls:
        return ai_msg.content
    # Otherwise run each requested tool and feed the result back
    messages.append(ai_msg)
    for call in ai_msg.tool_calls:
        result = lookup_order.invoke(call["args"])
        messages.append(ToolMessage(result, tool_call_id=call["id"]))
    return llm_with_tools.invoke(messages).content
print(answer_with_tools("Where is order 1001?"))
Now the bot returns the real shipped status for order 1001 instead of a made-up date. If you would rather have the bot answer from a knowledge base of help articles, Connect a Chatbot to Your Docs with RAG shows the retrieval approach.
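The helper in this step sends every tool call to lookup_order, which is fine while there is only one tool. Once you bind a second one, you would likely want to route each call by its name. The sketch below shows one way to do that with a dictionary, using plain functions so it runs without LangChain. It assumes each call is a dictionary with name, args, and id keys. TOOL_REGISTRY, run_tool_call, and list_order_ids are hypothetical names added for illustration.

```python
from typing import Callable

ORDERS = {"1001": {"status": "shipped", "eta": "June 20"}}


def lookup_order(order_id: str) -> str:
    """Plain-function copy of the order lookup, for this sketch only."""
    order = ORDERS.get(order_id)
    if not order:
        return f"No order found with ID {order_id}."
    return f"Order {order_id}: {order['status']}, estimated delivery {order['eta']}."


def list_order_ids() -> str:
    """Hypothetical second tool, added only to show routing."""
    return ", ".join(sorted(ORDERS))


# Map each tool name to the function that implements it
TOOL_REGISTRY: dict[str, Callable[..., str]] = {
    "lookup_order": lookup_order,
    "list_order_ids": list_order_ids,
}


def run_tool_call(call: dict) -> str:
    """Run one tool call shaped like {"name": ..., "args": {...}, "id": ...}."""
    func = TOOL_REGISTRY.get(call["name"])
    if func is None:
        # Return the problem as text so the model can recover instead of crashing
        return f"Unknown tool: {call['name']}"
    return func(**call["args"])


print(run_tool_call({"name": "lookup_order", "args": {"order_id": "1001"}, "id": "call_1"}))
```

With tools created by the @tool decorator, the registry would hold the tool objects and the last line of run_tool_call would use func.invoke(call["args"]) instead of calling the function directly.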
Step 4: Run an interactive chat loop
Tie it together with a loop that reads customer messages and prints replies. This version uses the memory-aware chatbot from Step 2, so the conversation stays in context. It does not call the Step 3 tool, so order lookups still go through answer_with_tools. A try/except keeps a single failed API call from crashing the session.
from openai import OpenAIError, RateLimitError
def reply(user_input: str, session_id: str) -> str:
    try:
        return chatbot.invoke(
            {"input": user_input},
            config={"configurable": {"session_id": session_id}},
        )
    except RateLimitError:
        return "We are busy right now. Please try again in 30 seconds."
    except OpenAIError as e:
        return f"Sorry, something went wrong: {e}"

if __name__ == "__main__":
    session_id = "support-session-01"
    print("Support chatbot ready. Type 'quit' to exit.")
    while True:
        message = input("\nCustomer: ").strip()
        if message.lower() in {"quit", "exit"}:
            print("Session closed.")
            break
        if not message:
            continue
        print(f"Agent: {reply(message, session_id)}")
Save the file as support_bot.py and run it with python support_bot.py. Ask a question, then ask a follow-up, and watch it keep the thread.
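If you want to test the loop's control flow without typing into a terminal or calling the API, one option is to pass the input source and the reply function in as arguments. The sketch below is a possible refactor, not part of the script above. The name run_chat is hypothetical, and the echo reply function stands in for the real reply.

```python
from typing import Callable, Iterable


def run_chat(
    messages: Iterable[str],
    reply_fn: Callable[[str], str],
    write: Callable[[str], None] = print,
) -> int:
    """Feed messages to reply_fn until 'quit' or 'exit'; return replies sent."""
    answered = 0
    for raw in messages:
        message = raw.strip()
        if message.lower() in {"quit", "exit"}:
            write("Session closed.")
            break
        if not message:
            continue  # skip blank lines, as the interactive loop does
        write(f"Agent: {reply_fn(message)}")
        answered += 1
    return answered


transcript: list[str] = []
count = run_chat(
    ["My order is late.", "   ", "quit", "never reached"],
    reply_fn=lambda text: f"(echo) {text}",
    write=transcript.append,
)
print(count, transcript)
```

Here the blank line is skipped, the loop stops at quit, and the final message is never processed, which mirrors the behavior of the interactive version.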
Key parameters quick reference
| Parameter | Where | Default | What it does |
|---|---|---|---|
| model | ChatOpenAI(...) | library default; set it explicitly | Which model answers. gpt-4o-mini is cheap and fast for support. |
| temperature | ChatOpenAI(...) | version-dependent (0.7 in older releases) | How varied replies are. Use 0.0–0.3 so support answers stay consistent. |
| session_id | config={"configurable": ...} | required | Picks which conversation's memory to load. One id per customer thread. |
| history_messages_key | RunnableWithMessageHistory(...) | none | Must match the MessagesPlaceholder name so past turns get injected. |
Troubleshooting
- ValidationError: OPENAI_API_KEY ... field required: The key did not load. Check that .env is in the folder you run from and that you call load_dotenv() before creating the model. See Fix the 401 Unauthorized Error in OpenAI Python.
- The bot forgets earlier messages: You either changed the session_id between calls or your history_messages_key does not match the MessagesPlaceholder name. Keep both consistent.
- RateLimitError: 429: You sent requests faster than your plan allows. Wait, then add backoff. Fix the 429 Rate-Limit Error in Python has a retry pattern.
- The model ignores your tool: Make sure you called llm.bind_tools([...]) and that the tool's docstring clearly describes when to use it. A vague docstring makes the model skip it.
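As a rough idea of what "add backoff" can look like, here is a minimal retry sketch using only the standard library. It is one common pattern, not the one from the linked guide. The names call_with_backoff and TransientError are hypothetical, and TransientError stands in for whatever retryable exception your client raises.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Stand-in for a retryable error such as a 429 (hypothetical class)."""


def call_with_backoff(
    func: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on TransientError with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts, let the caller handle it
            # Delays grow 1s, 2s, 4s, ... plus up to 0.5s of random jitter
            delay = base_delay * 2 ** (attempt - 1) + random.uniform(0, 0.5)
            sleep(delay)
    raise RuntimeError("max_attempts must be at least 1")
```

In the support bot you would catch RateLimitError from the openai package instead of TransientError and wrap the model call, for example call_with_backoff(lambda: chatbot.invoke({"input": text}, config=config)). The sleep parameter exists so tests can record delays instead of actually waiting.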
When to use this vs. alternatives
LangChain saves time, but it is not always the right pick. Here is when each approach fits.
- Use LangChain when you want memory, tools, and retrieval to plug together with little glue code, or when you may swap models later. The wrappers shown here are the payoff.
- Use the raw openai SDK when your bot is a single prompt with no memory or tools. The extra LangChain layer adds dependencies and abstraction you will not use, so a direct API call is simpler to read and debug.
- Use a hosted platform (a no-code support tool) when you need a polished widget, analytics, and human handoff out of the box, and you are willing to trade custom logic for speed.
A good rule: reach for LangChain once you need at least two of memory, tools, and document retrieval. Below that, stay with the plain SDK.
To take this further, add real-time replies with Stream Chatbot Responses with Python, or wire it to your sales data through CRM Data Integration with AI.
Related guides
- Custom AI Chatbot Development — the main guide for this track.
- Add Memory to a Python Chatbot — persist conversations beyond a single run.
- Stream Chatbot Responses with Python — show replies word by word.
- Connect a Chatbot to Your Docs with RAG — answer from your help center.