Changes from all commits (17 commits):
- `7b5eddd` chore(repo): gitignore runtime state, untrack vector_store + sqlite dbs (danrixd, Apr 14, 2026)
- `1b0e600` feat(backend): real LLM SDKs, multi-model per tenant, settings store,… (danrixd, Apr 14, 2026)
- `6a5db9e` feat(demo): six seeded knowledge vaults (personal / company / organiz… (danrixd, Apr 14, 2026)
- `5a8a702` feat(frontend): RAG visualizer, Vault editor, Settings page, live API… (danrixd, Apr 14, 2026)
- `8007c34` docs+test: recruiter README, CHANGELOG, architecture doc, smoke scrip… (danrixd, Apr 14, 2026)
- `83835a1` feat(demo): load_financebench.py — SEC 10-Ks + S&P 500 bars/profiles … (danrixd, Apr 14, 2026)
- `fd04398` fix(loader): resilient SP500 fetch + bounded chunker + CPU ingest fal… (danrixd, Apr 14, 2026)
- `b0730c1` fix(loader): standalone ingest script to sidestep native-state segfault (danrixd, Apr 14, 2026)
- `c026ab0` fix(files): recursive vault listing + nested-path routes + financeben… (danrixd, Apr 14, 2026)
- `b7f04bc` feat(ui): interactive file tree with search + collapsible folders for… (danrixd, Apr 14, 2026)
- `94738da` feat: persist RAG traces so they can be replayed without spending tokens (danrixd, Apr 14, 2026)
- `d5c312d` fix(privacy): don't leak absolute filesystem paths into vault content (danrixd, Apr 14, 2026)
- `e91e3c7` feat: T2/T3/T4 sweep — 16 of the 17 outstanding gap-analysis items (danrixd, Apr 14, 2026)
- `911f140` fix(eval): ASCII-only progress output to avoid Windows cp1252 crash (danrixd, Apr 15, 2026)
- `bb317ab` docs: stage CI workflow as template + pin PR body (danrixd, Apr 15, 2026)
- `2bf5122` docs: demo walkthrough script for the GIF recording (danrixd, Apr 15, 2026)
- `8f8b979` docs: FinanceBench eval results — 45% auto-score with Claude Opus 4.6 (danrixd, Apr 15, 2026)
11 changes: 9 additions & 2 deletions .env.example
@@ -1,5 +1,12 @@
# Example environment variables for SmartBaseAI
# Copy this file to `.env` and adjust the values as needed
# Copy this file to `.env` and adjust values.
# Anything set here can be overridden at runtime via the admin Settings page
# (values saved there win over env vars).

# Secret key used for signing JWT tokens
# Required: JWT signing key
SECRET_KEY=change-this-secret

# LLM provider credentials (any or all)
# ANTHROPIC_API_KEY=sk-ant-...
# OPENAI_API_KEY=sk-...
# OLLAMA_BASE_URL=http://localhost:11434
19 changes: 19 additions & 0 deletions .gitignore
@@ -1,13 +1,32 @@
# Python cache files
__pycache__/
*.pyc
.pytest_cache/

# Environment files
.env
frontend/.env.local

# Node dependencies
node_modules/

# Build artifacts
dist/
build/
frontend/dist/

# Runtime state — never commit
vector_store/
data/*.db
data/uploads/
data/financebench/
data/_downloads/
embeddings.json

# Virtual envs
venv/
.venv/

# IDE
.idea/
.vscode/
63 changes: 63 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,63 @@
# Changelog

All notable changes to SmartBaseAI are documented here. Dates are approximate and derived from git history.

## [Unreleased] — upgrade branch

### Fixed
- **`SECRET_KEY` now read from the environment.** Both `api/routes_auth.py` and `api/auth_middleware.py` previously hardcoded `"super_secret"`, silently ignoring the `.env` value. A new `api/config.py` centralizes the setting and loads `.env` via `python-dotenv` when available.
- **File uploads are now ingested into the tenant vector store.** `POST /files/upload` previously only wrote the file to disk. Text formats (`.txt`, `.md`, `.csv`, `.log`) are now pushed through `TenantVectorStore.add_document` so uploaded content is immediately searchable by chat. Unsupported binary formats still save cleanly and log a skip.
- **Tenants created via the admin API are now immediately visible to the chat route.** `TenantManager` previously cached `tenants.json` in memory at construction time, so the `tenant_manager` instance owned by `routes_chat.py` never saw tenants created via the `manager` instance owned by `routes_admin.py` until the process restarted. Now `TenantManager` re-reads storage on every call.
- **Duplicate tenant / user creation returns `409 Conflict`** instead of surfacing a `ValueError` as a generic `500`.
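The `SECRET_KEY` fix above can be sketched as a minimal `api/config.py` (the module name comes from the changelog entry; its exact contents here are an assumption):

```python
# Minimal sketch of the centralized-settings pattern: load .env once
# (when python-dotenv is installed) and expose SECRET_KEY from a single
# module, so routes and middleware can no longer drift apart.
import os

try:
    from dotenv import load_dotenv
    load_dotenv()  # picks up .env in the working directory, if present
except ImportError:
    pass  # python-dotenv is optional; plain environment variables still work

SECRET_KEY = os.getenv("SECRET_KEY", "change-this-secret")
```

Both `routes_auth.py` and `auth_middleware.py` would then `from api.config import SECRET_KEY` instead of carrying their own copies.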

### Added
- `scripts/smoke_test.py` — end-to-end test that exercises auth, admin, tenant CRUD, user CRUD, file upload with ingestion, chat RAG path, and chat DB-lookup path against a live server.
- `docs/architecture.md` with a Mermaid diagram of the orchestration flow.
- Recruiter-facing `README.md` rewrite — hero, elevator pitch, quickstart, tech stack, contact.
- `TODO.md` for out-of-scope ideas surfaced during polish.

### Changed
- `python-dotenv` added to `requirements.txt`.

---

## Historical milestones (from git log)

### Auth and multi-tenancy
- JWT auth with role-based access (`super_admin` / `admin` / `user`)
- Multi-tenant user model scoped by `tenant_id`
- Legacy plaintext password migration with bcrypt rehash on verify
- Schema migrations for legacy `users` tables (rename `password` → `hashed_password`, backfill timestamps)

### Retrieval + orchestration
- `ResponseGenerator` introduced as the three-source orchestrator (history + DB + RAG)
- Fusion layer — DB-preferred with RAG as supplemental context
- Persistent per-tenant Chroma vector store via `TenantVectorStore`
- Hybrid search (keyword ∪ semantic) returned in Chroma-compatible format
- `exact_lookup` fast path for date-keyed structured queries (`data/<tenant>.db`)
- CUDA auto-detected for `sentence-transformers` with CPU fallback
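The keyword ∪ semantic union can be sketched with plain lists (the real store queries Chroma; these ranked id lists are stand-ins for its two result sets):

```python
# Illustrative-only sketch of hybrid retrieval as a de-duplicated union,
# with keyword matches ranked ahead of semantic neighbours.
def hybrid_search(keyword_hits: list[str], semantic_hits: list[str],
                  top_k: int = 5) -> list[str]:
    """Union the two result lists, keyword results first, de-duplicated."""
    seen: set[str] = set()
    merged: list[str] = []
    for doc_id in keyword_hits + semantic_hits:
        if doc_id not in seen:
            seen.add(doc_id)
            merged.append(doc_id)
    return merged[:top_k]
```

Packing `merged` back into Chroma's `{"ids": ..., "documents": ...}` shape is what lets downstream code treat both paths uniformly.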

### Model backends
- Ollama integration with real HTTP calls and streaming-JSON handling
- OpenAI and Anthropic model wrappers under `ai/models/`
- Local Llama support

### Data ingestion
- `ETLManager` with pluggable connectors (Postgres / MySQL / MongoDB / HTTP API)
- Cleaners + metadata generator
- Script to load CSV data into a tenant's structured sample DB

### API
- FastAPI app with routers for auth, chat, admin, and files
- CORS middleware
- Conversation + audit log repositories (SQLite)
- Tenant listing and per-tenant chat session management

### Frontend
- React 19 + Vite + Tailwind frontend under `frontend/`
- Login, chat, files, and admin pages with role-aware views
- Axios client with configurable `VITE_API_BASE_URL`
- Tenant selection for super-admins, persistent chat sessions

### Testing
- `pytest` suites for AI pipeline, API routes, ingestion, hybrid search, and user migration
206 changes: 109 additions & 97 deletions README.md
@@ -1,153 +1,165 @@
![SmartBaseAI Logo](logo.png)
<p align="center">
<img src="logo.png" alt="SmartBaseAI" width="240"/>
</p>

SmartBaseAI is an open-source starter kit for building multi-tenant chat
applications. It includes a FastAPI backend, simple Next.js frontends and a
collection of scripts for managing tenants and embeddings.
<h1 align="center">SmartBaseAI</h1>

## Environment setup
<p align="center">
<em>A multi-tenant LLM platform that grounds answers in <b>both</b> your structured databases <b>and</b> your unstructured documents.</em>
</p>

Before running the API create a `.env` file in the project root. You can start by copying the provided example:
<p align="center">
<a href="#quickstart">Quickstart</a> ·
<a href="#architecture">Architecture</a> ·
<a href="#demo">Demo</a> ·
<a href="#tech-stack">Tech stack</a> ·
<a href="#contact">Contact</a>
</p>

```bash
cp .env.example .env
```

Edit the file and set `SECRET_KEY` to a secure value that will be used for signing JWT tokens. Any variables defined in `.env` are read by the backend on startup.
---

## Backend setup
## What is it

Install Python dependencies and run the API server:
SmartBaseAI is an open-source backend + web UI for building chat experiences over private knowledge bases. Unlike a vanilla RAG starter, every chat turn runs through an **orchestrator** that merges three sources before calling the LLM:

```bash
pip install -r requirements.txt
python scripts/run_server.py --reload
```
1. **Conversation history** — maintained per session.
2. **Exact structured lookups** — the orchestrator detects entities in the user's message (e.g. an ISO date) and pulls the matching row from the tenant's SQL / Mongo / API source.
3. **Hybrid RAG** — per-tenant Chroma store with keyword ∪ semantic retrieval over uploaded documents.

The server listens on port `8000` by default.
Each tenant is isolated: its own persistent vector store, its own DB connector, its own model backend. LLMs are pluggable — **OpenAI, Anthropic, Ollama, or a local Llama** — so a tenant can run fully on-prem or fully hosted without code changes.

## Frontend setup
Built for environments where "hallucinate a close price" is not acceptable: the exact DB row is preferred, RAG is treated as supplemental context, and the fallback is an explicit `"No information"` rather than a fabricated answer.

Two Next.js applications are provided. Each must be started separately.
## Demo

### Chat interface
<!-- TODO: replace with actual recording -->
<p align="center"><i>Demo GIF coming soon — see <a href="docs/screenshots/">docs/screenshots/</a>.</i></p>

```bash
cd ui/web
npm install
npm run dev
```
## Quickstart

### Admin panel
**Requirements:** Python 3.10+, Node 18+, optionally a running Ollama instance (`ollama pull llama3`) or an OpenAI / Anthropic API key.

```bash
cd ui/admin_panel
npm install
npm run dev
```

## Run the web app

Start the API server and a frontend to use SmartBaseAI in the browser.
# 1. Clone
git clone https://github.com/danrixd/smartbaseai.git
cd smartbaseai

```bash
# terminal 1: backend
python scripts/run_server.py --reload

# terminal 2: chat interface
cd ui/web
npm install
npm run dev
```
# 2. Configure
cp .env.example .env
# edit .env: set SECRET_KEY and (optionally) OPENAI_API_KEY / ANTHROPIC_API_KEY

The chat interface is available at <http://localhost:3000>. To manage tenants, run the admin panel in another terminal:
# 3. Backend
pip install -r requirements.txt
python scripts/run_server.py --reload # http://localhost:8000

```bash
cd ui/admin_panel
# 4. Frontend (in a second terminal)
cd frontend
npm install
npm run dev -- -p 3001
npm run dev # http://localhost:5173
```

Visit <http://localhost:3001> for administrative tasks.
Open <http://localhost:5173>, log in, pick a tenant, and start chatting. The default admin credentials live in `.env.example` — change them before exposing the service.

## Utility scripts

### Create a tenant
### Creating a tenant

```bash
python scripts/setup_tenant.py tenant1 --name "Tenant 1" --db-type postgres \
--db-config '{"host": "localhost", "user": "app"}'
python scripts/setup_tenant.py tenant1 \
--name "Tenant 1" \
--db-type postgres \
--db-config '{"host": "localhost", "user": "app"}' \
--model-type ollama --model-name llama3
```

### Build embeddings
### Ingesting documents

```bash
python scripts/build_embeddings.py --source docs/ --output embeddings.json \
python scripts/build_embeddings.py \
--source docs/ \
--output embeddings.json \
--embedder local
```

## Example API usage

### Authenticate
### Running the tests

```bash
curl -X POST -H "Content-Type: application/json" \
-d '{"username": "admin", "password": "ChangeThis123!"}' \
http://localhost:8000/auth/login
pip install -r requirements.txt
pytest -q
```

Use the `access_token` returned above when calling other endpoints.
## Architecture

### Send a chat message
```mermaid
flowchart LR
U[User] --> FE[React + Vite UI]
FE -->|JWT| API[FastAPI<br/>auth · chat · admin · files]

```bash
curl -H "Authorization: Bearer <token>" \
-H "Content-Type: application/json" \
-X POST http://localhost:8000/chat/message \
-d '{"session_id": "s1", "tenant_id": "t1", "message": "hello"}'
```
API --> ORCH[ResponseGenerator<br/>orchestrator]

### Tenant operations
ORCH --> HIST[(Conversation<br/>history)]
ORCH --> DB[(Tenant DB<br/>Postgres · MySQL · Mongo · API)]
ORCH --> RAG[Hybrid retrieval<br/>keyword ∪ semantic]

```bash
# list tenants
curl -H "Authorization: Bearer <token>" http://localhost:8000/admin/tenants
RAG --> VS[(Per-tenant Chroma<br/>MiniLM-L6-v2)]

ORCH --> LLM{LLM backend}
LLM --> OAI[OpenAI]
LLM --> ANT[Anthropic]
LLM --> OLL[Ollama]
LLM --> LLA[Local Llama]

# get configuration for a tenant
curl -H "Authorization: Bearer <token>" http://localhost:8000/admin/tenants/t1
ING[ETL Manager] -->|clean + metadata| VS
SRC[(Source DBs / APIs)] --> ING
```

## Testing
Full write-up: **[docs/architecture.md](docs/architecture.md)**.

Install the Python dependencies before running the test suite:
Key files if you want to read the code:

```bash
pip install -r requirements.txt
```
- `chatbot/response_generator.py` — the three-source orchestrator.
- `ai/rag_pipeline.py` — tenant-aware RAG with FAISS fallback.
- `ai/vector_stores/chroma_store.py` — persistent per-tenant store with hybrid query.
- `ingestion/etl_manager.py` — pluggable DB connectors → clean → metadata → store.
- `api/app.py` — FastAPI entrypoint; routers under `api/routes_*.py`.

## Tech stack

Run all unit tests with:
| Layer | Choice |
|--------------|------------------------------------------------------------------|
| Backend | Python 3.10+, FastAPI, Pydantic, JWT auth, pytest |
| Retrieval | ChromaDB (persistent), sentence-transformers `all-MiniLM-L6-v2`, FAISS fallback, hybrid keyword + semantic |
| LLMs | OpenAI, Anthropic, Ollama, local Llama — pluggable via `ai/models/` |
| Data sources | Postgres, MySQL, MongoDB, generic HTTP APIs |
| Frontend | React 19, Vite, Tailwind, React Router, Axios |
| Infra | GPU auto-detected for embeddings (CUDA → CPU fallback) |

## Example API usage

```bash
pytest -q
# 1. Authenticate
curl -X POST -H "Content-Type: application/json" \
-d '{"username": "admin", "password": "ChangeThis123!"}' \
http://localhost:8000/auth/login

# 2. Chat
curl -H "Authorization: Bearer <token>" \
-H "Content-Type: application/json" \
-X POST http://localhost:8000/chat/message \
-d '{"session_id": "s1", "tenant_id": "tenant1", "message": "What was the close on 2024-03-15?"}'
```

## Extending the response generator
## Project status

The `ResponseGenerator` combines conversation history, structured data from a
tenant database and unstructured context from the vector store. Additional data
sources can be integrated by implementing new helper methods that fetch and
format their results before calling the language model. Each source should be
encapsulated in its own method and added to the prompt in
`chatbot/response_generator.py`.
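A new source would follow that same fetch-then-format shape; a sketch under the assumption that each helper is self-contained (class and method names here are hypothetical, not the project's API):

```python
# Illustrative pattern for an additional data source: fetch results,
# then format them for the prompt. Names are hypothetical.
class GlossarySource:
    """Example extra source: a tiny in-memory term glossary."""

    def __init__(self, terms: dict[str, str]):
        self.terms = terms

    def fetch(self, message: str) -> list[str]:
        # Return definitions for any glossary terms found in the message.
        lowered = message.lower()
        return [f"{t}: {d}" for t, d in self.terms.items() if t in lowered]

    def format_for_prompt(self, hits: list[str]) -> str:
        # An empty string lets the orchestrator skip this prompt section.
        return "\n".join(f"[glossary] {h}" for h in hits)
```

The orchestrator would call `fetch`, append the formatted block to the prompt alongside history, DB, and RAG context, and omit the section when nothing matched.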
Active personal project. See [CHANGELOG.md](CHANGELOG.md) for milestones and [TODO.md](TODO.md) for the backlog.

## Contributing
## Contact

Contributions are welcome! Please fork the repository and open a pull request
for any enhancements or bug fixes. For large changes, open an issue first to
discuss the proposed work. Make sure to run the test suite with `pytest -q`
before submitting a PR.
Built by **Dan Ringart** — algorithm developer, B.Sc. Physics (Tel Aviv University). Background in simulations, quant trading systems, and LLM-orchestrated platforms.

## License
- Website: [danringart.com](https://danringart.com)
- GitHub: [@danrixd](https://github.com/danrixd)

Feedback, issues, and PRs welcome.

This project is licensed under the [MIT License](LICENSE).
## License

MIT — see [LICENSE](LICENSE).