mitjasai/local-csc-rag

Local RAG with CSC documentation

This repository contains code for retrieval-augmented generation (RAG) over the CSC user documentation, using models that run locally on a workstation CPU.

Setting up

Make sure you have working installations of Python 3 and Docker Engine, including Docker Compose.

Install Python dependencies

python3 -m venv .venv
. .venv/bin/activate
pip install -r requirements.txt

Download models

The models are run using the llama.cpp framework, which requires models in the GGUF format. Many pre-converted models are available on Hugging Face. The commands below fetch two of them: a quantized embedding model (EmbeddingGemma 300M) and a quantized instruction-tuned chat model (Gemma 3 4B). The hf CLI tool used here is installed by the huggingface-hub Python package, which is included in the dependencies.

hf download --local-dir ./data/llama.cpp unsloth/embeddinggemma-300m-GGUF embeddinggemma-300m-Q4_0.gguf
hf download --local-dir ./data/llama.cpp unsloth/gemma-3-4b-it-qat-GGUF gemma-3-4b-it-qat-Q4_K_M.gguf
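
Before moving on, it can be worth confirming that both files actually landed in the target directory, since an interrupted download is easy to miss. The following is a minimal sketch and not part of the repository: `MODEL_FILES` and `missing_models` are hypothetical names, and the directory and file names are copied from the download commands above.

```python
from pathlib import Path

# File names copied from the two `hf download` commands above.
MODEL_FILES = (
    "embeddinggemma-300m-Q4_0.gguf",
    "gemma-3-4b-it-qat-Q4_K_M.gguf",
)


def missing_models(model_dir="./data/llama.cpp", expected=MODEL_FILES):
    """Return the expected GGUF file names that are absent or empty in model_dir."""
    root = Path(model_dir)
    missing = []
    for name in expected:
        path = root / name
        # A zero-byte file usually points to an interrupted download.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing
```

Calling `missing_models()` from the repository root returns an empty list when both models are in place, and otherwise the names of the files that still need to be downloaded.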

Usage

Start the llama.cpp and Qdrant servers

Start the llama.cpp inference server and Qdrant vector database containers using Docker Compose. The container images are pulled automatically.

docker compose up

To stop and remove the containers, use the corresponding down command.

docker compose down

If Compose fails to start the services, removing stopped containers first may help. Note that this command removes all stopped containers on the host, not only the ones belonging to this project.

docker container prune
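
After `docker compose up`, the servers may need a moment before they accept requests, especially while a model is still loading. A minimal readiness check could look like the sketch below. It is not part of the repository and rests on assumptions: the llama.cpp server exposes a `/health` endpoint and Qdrant exposes `/readyz`, but the host ports shown (8080 and 6333) are upstream defaults and may not match the ports mapped in this project's Compose file. The names `DEFAULT_ENDPOINTS`, `is_ready`, and `wait_until_ready` are hypothetical.

```python
import time
import urllib.error
import urllib.request

# Assumed URLs: adjust hosts and ports to match the Compose file.
DEFAULT_ENDPOINTS = {
    "llama.cpp": "http://localhost:8080/health",
    "qdrant": "http://localhost:6333/readyz",
}


def is_ready(url, timeout=2.0):
    """Return True if a GET request to url is answered with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, and non-2xx answers such as
        # the 503 a server may return while it is still loading a model.
        return False


def wait_until_ready(url, total_timeout=60.0, interval=1.0):
    """Poll url until it is ready or total_timeout seconds have passed."""
    deadline = time.monotonic() + total_timeout
    while True:
        if is_ready(url):
            return True
        if time.monotonic() + interval > deadline:
            return False
        time.sleep(interval)
```

For example, `all(wait_until_ready(url) for url in DEFAULT_ENDPOINTS.values())` evaluates to `True` once every listed service responds, and to `False` if any of them does not come up within the timeout.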

Build Qdrant vector index

Build the vector database using the provided script. This takes around 25 minutes on my workstation.

python3 build_index.py
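
The contents of build_index.py are not reproduced here, but an index build of this kind typically splits each document into chunks, embeds every chunk, and stores the vector alongside the text. The sketch below illustrates that general pattern only and is not the repository's implementation. The names `chunk_text` and `build_points`, the character-based chunking, and the default sizes are all assumptions; `embed` stands in for whatever function calls the embedding model. The resulting dicts use the `id`, `vector`, and `payload` fields that Qdrant points carry.

```python
import uuid


def chunk_text(text, size=800, overlap=100):
    """Split text into chunks of at most size characters, overlapping by overlap."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks


def build_points(documents, embed, size=800, overlap=100):
    """Turn {source: text} into point dicts; embed maps a string to a vector."""
    points = []
    for source, text in documents.items():
        for index, chunk in enumerate(chunk_text(text, size, overlap)):
            points.append({
                # Deterministic ID, so rebuilding yields the same ID per chunk.
                "id": str(uuid.uuid5(uuid.NAMESPACE_URL, f"{source}#{index}")),
                "vector": embed(chunk),
                "payload": {"source": source, "text": chunk},
            })
    return points
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from at least one chunk, at the cost of embedding some text twice.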

Open Chainlit chat interface

Start the Chainlit chat interface. The chainlit CLI tool is installed by the Python package of the same name, which is included in the dependencies. The server's startup script takes a few seconds to run, so if the web UI shows an error message at first, try refreshing the page.

chainlit run app.py

RAG is disabled by default, but can be enabled from the settings menu, which is accessed by clicking on the gear icon located on the left side of the input field. The retrieved documents can be viewed by expanding the "Used retrieve" step in the chat and clicking on one of the source icons at the end of the step.
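
To make concrete what enabling RAG changes, the sketch below shows the general retrieve-then-prompt pattern: rank stored chunks by similarity to the question's embedding, then place the best matches in the prompt as context. It is an illustration under assumptions, not the code in app.py. In the running application the similarity search is handled by Qdrant, whereas this sketch uses a brute-force in-memory search as a stand-in. The function names, the prompt wording, and the item layout (dicts with `vector` and `payload` keys) are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vector, points, top_k=3):
    """Return the top_k points most similar to query_vector, best first."""
    ranked = sorted(
        points,
        key=lambda p: cosine_similarity(query_vector, p["vector"]),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question, retrieved):
    """Assemble a prompt that asks the model to answer from the retrieved context."""
    context = "\n\n".join(
        f"[{i}] {p['payload']['source']}\n{p['payload']['text']}"
        for i, p in enumerate(retrieved, start=1)
    )
    return (
        "Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
```

With RAG disabled, the model would see only the question. With it enabled, the prompt also carries the numbered source chunks, which correspond to the documents shown in the retrieval step of the chat.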
