A production-pattern AI chatbot built with LangGraph — featuring stateful multi-turn conversations, persistent memory using both in-memory and SQLite checkpointers, real-time token streaming, and a Streamlit UI with threading support.
Most chatbot tutorials use simple prompt → response loops with no memory.
This project implements the exact architecture used in production agentic systems:
- ✅ StateGraph — conversation logic modeled as a graph, not a chain
- ✅ Checkpointing — memory persists across sessions using SQLite (not just in-memory)
- ✅ Streaming — tokens stream in real-time, not returned all at once
- ✅ Thread isolation — each conversation has its own
thread_id, just like production chatbots - ✅ Two backends — swap between in-memory and database persistence without changing the graph
User Input (Streamlit UI)
│
▼
LangGraph StateGraph
│
┌────▼────┐
│chat_node│ ◄── ChatOpenAI (GPT)
└────┬────┘
│
Checkpointer
┌────┴────────────────┐
│ │
InMemorySaver SqliteSaver
(session only) (chatbot.db — persists across restarts)
Conversation Flow:
START → chat_node → END
↑
state["messages"] — full history passed every turn via add_messages reducer
| Layer | Technology |
|---|---|
| LLM | OpenAI GPT (via langchain_openai) |
| Orchestration | LangGraph StateGraph |
| State Management | TypedDict + Annotated message reducers |
| Memory (session) | InMemorySaver |
| Memory (persistent) | SqliteSaver → chatbot.db |
| Streaming | LangGraph streaming mode |
| Frontend | Streamlit with threading |
| Config | python-dotenv |
chatbot_langgraph/
├── langgraph_backend.py # Core graph — InMemorySaver checkpointer
├── langgraph_backend_streamming.py # Same graph with real-time token streaming
├── database_backend.py # SQLite-persisted graph — survives restarts
├── langgraph_frontend.py # CLI frontend — run conversations in terminal
├── database_frontend.py # CLI frontend — loads past threads from DB
├── streamlit_frontend_threading.py # Full Streamlit UI with thread-safe streaming
├── .gitignore
└── README.md
class ChatState(TypedDict):
messages: Annotated[list[BaseMessage], add_messages]The add_messages reducer appends new messages to history automatically — no manual list management needed.
checkpointer = InMemorySaver()
chatbot = graph.compile(checkpointer=checkpointer)
# Each thread_id is a separate conversation
config = {"configurable": {"thread_id": "user_123"}}
response = chatbot.invoke({"messages": [HumanMessage("Hello")]}, config)conn = sqlite3.connect("chatbot.db", check_same_thread=False)
checkpointer = SqliteSaver(conn=conn)
chatbot = graph.compile(checkpointer=checkpointer)
# Load all past conversation threads
def get_all_threads():
return list({cp.config['configurable']['thread_id']
for cp in checkpointer.list(None)})for chunk in chatbot.stream(
{"messages": [HumanMessage(content=user_input)]},
config=config,
stream_mode="messages"
):
print(chunk[0].content, end="", flush=True)git clone https://github.com/Rushi6767/chatbot_langgraph.git
cd chatbot_langgraph
pip install langgraph langchain-openai langchain-core streamlit python-dotenv# Create .env file
echo "OPENAI_API_KEY=your_key_here" > .envTerminal (in-memory):
python langgraph_frontend.pyTerminal (with SQLite persistence):
python database_frontend.pyStreamlit UI (with streaming):
streamlit run streamlit_frontend_threading.py| Concept | What it means in practice |
|---|---|
StateGraph |
Model conversation logic as nodes + edges, not sequential chains |
add_messages reducer |
Automatic message history management — append-only, no bugs |
InMemorySaver |
Fast dev/testing — memory resets on restart |
SqliteSaver |
Production-grade persistence — reload past conversations |
thread_id |
Isolate multiple users or sessions in one graph |
| Streaming | Real-time UX — tokens appear as generated, not after full response |
| Streamlit threading | Non-blocking UI while LLM streams tokens in background |
Rushi Sathavara — Python & AI Engineer
MS Computer Science, Harrisburg University (4.0 GPA)