Digital You - RAG Evolution

An earlier version of Digital You relied on "Context Stuffing", which places the entire bio into the prompt. This project evolves that concept into a Retrieval-Augmented Generation (RAG) system, which retrieves only the text relevant to each query and so scales better as the data grows.

Architectural Evolution

Ingestion: Reads documents from /data.
Chunking: Splits text into manageable segments.
Embedding: Converts text into numerical vectors.
Retrieval: Fetches only relevant segments to answer user queries.
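
The four stages above can be sketched end to end with the standard library alone. This is a minimal illustration, not the code expected in app.py: the function names (chunk_text, embed, retrieve) are hypothetical, and the bag-of-words "embedding" is a toy stand-in for a real embedding model and vector store.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=8, overlap=2):
    """Chunking: split text into overlapping windows of words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), step)]


def embed(text):
    """Embedding (toy): a word-count vector instead of a learned dense vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=2):
    """Retrieval: return the k chunks most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    # Ingestion is simulated with in-memory strings instead of files from /data.
    segments = [
        "I studied physics in Lisbon.",
        "My favorite hobby is rock climbing.",
        "I work as a data engineer.",
    ]
    print(retrieve("What is your hobby?", segments, k=1))
```

Only the retrieved segments would be passed to the model, which is what keeps the prompt small compared with stuffing every document into it.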

Setup

Sync Environment: Run uv sync in the root directory.
Data: Place .txt, .md, or .pdf files in the /data directory.
Tasks: Complete the TODOs in app.py to build the RAG pipeline.

Your Mission

Data Infrastructure: Build a vector store in chroma_db/ using your bio data.
Contextualization: Teach the AI to rewrite each question using the chat history so it never "forgets" the subject of the conversation.
Persona Engineering: You must write your own qa_prompt. This is where you define your "Digital Twin" identity and restrict the AI to the retrieved context.
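
As a starting point for the last two items, the sketch below shows one possible shape for the two prompts as plain strings. It is framework-agnostic and purely illustrative: the helper names (build_contextualize_prompt, build_qa_prompt), the instruction wording, and the example persona are assumptions, not the structure that app.py requires.

```python
# Hypothetical instruction text; adapt the wording to your own persona.
CONTEXTUALIZE_INSTRUCTIONS = (
    "Given the chat history and the latest user question, rewrite the question "
    "so it can be understood without the history. Do not answer it."
)

QA_INSTRUCTIONS = (
    "You are the digital twin of {name}. Answer in the first person, using only "
    "the context below. If the context does not contain the answer, say you "
    "do not know."
)


def build_contextualize_prompt(history, question):
    """Build the prompt that asks the model for a standalone question.

    history is a list of (role, text) tuples, oldest first.
    """
    lines = [CONTEXTUALIZE_INSTRUCTIONS, "", "Chat history:"]
    lines += [f"{role}: {text}" for role, text in history]
    lines += ["", f"Latest question: {question}", "Standalone question:"]
    return "\n".join(lines)


def build_qa_prompt(name, context_chunks, question):
    """Build the answering prompt from the persona, retrieved chunks, and question."""
    context = "\n---\n".join(context_chunks)
    return "\n".join([
        QA_INSTRUCTIONS.format(name=name),
        "",
        "Context:",
        context,
        "",
        f"Question: {question}",
    ])


if __name__ == "__main__":
    history = [("user", "Where did you study?"), ("assistant", "In Lisbon.")]
    print(build_contextualize_prompt(history, "What did you major in there?"))
```

The rewritten standalone question, not the raw follow-up, is what should be sent to the retriever; otherwise a question like "What did you major in there?" carries no searchable subject.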

Enhancements

Use SemanticChunker for smarter context splitting.
Customize the Gradio UI using gr.Blocks.
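
To illustrate the idea behind semantic chunking, the toy sketch below starts a new chunk wherever two adjacent sentences look dissimilar. It does not use or reproduce the SemanticChunker API: the function name semantic_chunks, the word-overlap similarity, and the 0.2 threshold are all assumptions chosen for the example, whereas a real semantic splitter would compare embedding vectors.

```python
import math
import re
from collections import Counter


def _vec(sentence):
    """Toy sentence vector: word counts (a real splitter would use embeddings)."""
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def _cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semantic_chunks(text, threshold=0.2):
    """Group consecutive sentences; break where adjacent similarity < threshold."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    if not sentences:
        return []
    chunks, current = [], [sentences[0]]
    for prev, nxt in zip(sentences, sentences[1:]):
        if _cosine(_vec(prev), _vec(nxt)) < threshold:
            # Topic shift detected: close the current chunk and start a new one.
            chunks.append(" ".join(current))
            current = [nxt]
        else:
            current.append(nxt)
    chunks.append(" ".join(current))
    return chunks


if __name__ == "__main__":
    bio = (
        "I love rock climbing. Rock climbing keeps me fit. "
        "My job is data engineering."
    )
    for chunk in semantic_chunks(bio):
        print(chunk)
```

Compared with fixed-size splitting, this keeps related sentences together, so a retrieved chunk is less likely to cut a thought in half.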

