Picochat

Train a small language model on your domain — without hiding the evidence.

Picochat is a local-first factory for building small, specialized language models. Bring your own text, train (from scratch or by fine-tuning an existing model), evaluate honestly, chat with it, and serve it to your team — end to end, from one dashboard or the CLI.

Product Page · Pipeline Guide · Honesty Checks · Release Gates · Deploy · 100M Runbook · 1B Runbook

License: MIT · Python 3.10+ · Local + cloud · H100/H200 ready

What is Picochat?

Most domain-model projects stall in the same places: there's no clean path from "I have some text" to "my team can use a model," and the evaluation quietly leaks, memorizes, or overstates. Picochat fixes both. It is an end-to-end small-language-model factory with a dashboard as the control plane and an honest evaluation/release gate at its core.

You can drive the whole lifecycle — no terminal required:

bring data → build/refine training data → train → evaluate → compare → chat → serve → export

There are two ways to start a model:

  • Train from scratch. picochat run tiny (or any scale) builds a Picochat-native model: tokenizer → base pretraining → chat SFT → optional DPO → eval → release gate.
  • Fine-tune an existing model. picochat train hf-sft starts from a Hugging Face causal LM (SmolLM, Qwen, …) and fine-tunes it on your chat data, with optional LoRA.

Picochat builds small, specialized models — fast and cheap to run, honest about what they do and don't know. Best when the domain is narrow and the data is yours. It is not a general chatbot, not RAG, and not a frontier-model claim.

Do it all from the dashboard

picochat web --runs-dir runs --port 8765   # then open http://127.0.0.1:8765

  • Bring and refine data. Import a Hugging Face dataset, point at a local folder of docs, generate starter chat/eval data, then edit the JSONL in-browser with live validation.
  • Train. A guided wizard (data → check training data → train), from scratch or by fine-tuning an existing model, with an Advanced panel for architecture, optimizer (Muon), precision, LoRA, and DPO.
  • Evaluate honestly. Pass/fail results with refusal and prompt-echo signals, eval re-runs on demand, and the honesty / contamination report that checks for leakage between SFT, eval, and corpus.
  • Compare and leaderboard. Rank every run by visible eval, or pick runs for a side-by-side metric matrix.
  • Chat. Talk to your model (native or fine-tuned HF) in the Playground.
  • Serve to your team. One click starts an OpenAI-compatible /v1 endpoint with a copy-paste snippet.
  • Export. Convert a run to a Transformers model plus model card to use anywhere.
  • Cloud. Launch training on Modal (with recipes for Colab / Lambda) from the dashboard, then pull the finished run back to local.
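
To make "live validation" of chat JSONL concrete, here is a minimal sketch of the kind of per-line check an editor could run. It is not Picochat's validator. The row schema (a "messages" list of role/content objects, mirroring the request body used with the serve endpoint) and the allowed role set are assumptions, and validate_chat_jsonl is a hypothetical helper name.

```python
import json

# Assumed role set; Picochat's actual schema may differ.
ALLOWED_ROLES = {"system", "user", "assistant"}


def validate_chat_jsonl(text: str) -> list[str]:
    """Return human-readable problems; an empty list means every row passed."""
    problems: list[str] = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # tolerate blank lines between rows
        try:
            row = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append(f"line {lineno}: invalid JSON ({exc.msg})")
            continue
        messages = row.get("messages") if isinstance(row, dict) else None
        if not isinstance(messages, list) or not messages:
            problems.append(f"line {lineno}: missing non-empty 'messages' list")
            continue
        for i, msg in enumerate(messages):
            if not isinstance(msg, dict):
                problems.append(f"line {lineno}: message {i} is not an object")
                continue
            role = msg.get("role")
            if not isinstance(role, str) or role not in ALLOWED_ROLES:
                problems.append(f"line {lineno}: message {i} has unknown role {role!r}")
            content = msg.get("content")
            if not isinstance(content, str) or not content.strip():
                problems.append(f"line {lineno}: message {i} has empty content")
    return problems
```

Reporting problems by line number, rather than raising on the first error, is what lets an editor highlight every bad row at once.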

[Screenshot: dashboard overview]

Compare runs and rank them on a leaderboard

[Screenshot: leaderboard and compare]

Chat with your model, then serve it as a team API

[Screenshot: Playground and serve]

Train on remote GPUs, with the dashboard as the control plane

[Screenshot: cloud training]

The honest part

Picochat treats evaluation integrity as a product feature, not an afterthought. Every run can be inspected, compared, and blocked — a finished run is not a release.

  • Separate practice from scoring. SFT rows are practice; eval rows are the scoreboard. Picochat checks they don't overlap.
  • Honesty / contamination report. Detects exact and near leakage between chat SFT, eval prompts, and the base corpus, plus memorization risk.
  • Release gate. Blocks release when SFT fit, held-out fit, visible eval, prompt echo, refusal behavior, external benchmarks, or honesty checks fail — surfaced in the dashboard with the underlying markdown reports.
  • Preflight + GPU-spend guards. Long/paid runs require sanity, preflight, and a short DDP dry run, with explicit paid-launch confirmation.
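
The overlap check between practice (SFT) rows and scoreboard (eval) rows can be illustrated with a small sketch. This is not Picochat's implementation: it assumes prompts are plain strings, uses word-trigram Jaccard similarity as a stand-in for "near" leakage, and the 0.5 threshold and all function names are hypothetical.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and strip punctuation so trivial edits do not hide a match."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))


def shingles(text: str, n: int = 3) -> set[tuple[str, ...]]:
    """Set of word n-grams; short texts collapse to a single shingle."""
    words = normalize(text).split()
    if len(words) < n:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a: set, b: set) -> float:
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def find_leaks(sft_prompts, eval_prompts, threshold=0.5):
    """Return (eval_index, sft_index, kind, score) for each suspected leak."""
    sft_norm = [normalize(p) for p in sft_prompts]
    sft_sh = [shingles(p) for p in sft_prompts]
    leaks = []
    for ei, prompt in enumerate(eval_prompts):
        e_norm, e_sh = normalize(prompt), shingles(prompt)
        for si in range(len(sft_prompts)):
            if e_norm and e_norm == sft_norm[si]:
                leaks.append((ei, si, "exact", 1.0))
                continue
            score = jaccard(e_sh, sft_sh[si])
            if score >= threshold:  # assumed cutoff for "near" leakage
                leaks.append((ei, si, "near", score))
    return leaks
```

The pairwise loop is quadratic, which is fine for a few thousand rows; checking against a large base corpus would need an index such as MinHash, which is outside this sketch.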

Quick start

git clone https://github.com/gowtham0992/picochat.git
cd picochat
python3 -m venv .venv
source .venv/bin/activate
python -m pip install -e ".[dev,hf]"

The installed command is picochat (a shorter pico alias is also provided).

picochat demo                              # tiny end-to-end demo
picochat web --runs-dir runs --port 8765   # the dashboard
docker compose up --build picochat-web     # …or via Docker

…or from the CLI

The dashboard is a control plane over the CLI; everything is scriptable.

# Build a dataset pack from a Hugging Face dataset, then train from scratch
picochat data hf-import --dataset <hf/dataset> --pack-out my_pack --max-rows 5000
picochat run tiny --dataset-pack my_pack/dataset_pack.json

# Fine-tune an existing Hugging Face model on your chat data
picochat train hf-sft --model HuggingFaceTB/SmolLM2-135M-Instruct \
  --input my_pack/chat.jsonl --out-dir runs/my-domain-ft --peft lora

# Optional preference alignment after SFT
picochat data preference-starter --input my_pack/chat.jsonl --out data/preferences.jsonl
picochat run tiny --dataset-pack my_pack/dataset_pack.json \
  --dpo-input data/preferences.jsonl --dpo-steps 200

# Rank completed runs and export a model
picochat leaderboard --runs-dir runs --out reports/leaderboard.md
picochat export hf --checkpoint runs/my-run/sft/checkpoint \
  --tokenizer runs/my-run/tokenizer.json --out-dir exports/my-run

Serve your model

One click in the Playground, or:

picochat serve \
  --checkpoint runs/my-run/sft/checkpoint \
  --tokenizer runs/my-run/tokenizer.json \
  --host 127.0.0.1 --port 8000

curl http://127.0.0.1:8000/v1/chat/completions \
  -H 'content-type: application/json' \
  -d '{"model":"my-run","messages":[{"role":"user","content":"What is Picochat?"}],"max_tokens":80}'

picochat serve is OpenAI-compatible (/v1/models, /v1/completions, /v1/chat/completions, plus stream=true SSE) and can serve either a native Picochat checkpoint or a fine-tuned Hugging Face model (--hf-model). Binding to a non-loopback host automatically requires a bearer key. For high-throughput production serving, export to HF and run vLLM / TGI / llama.cpp.
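
For callers that prefer Python to curl, the following standard-library sketch builds the same chat request and reads a stream=true response. It assumes the stream follows the usual OpenAI chunk shape (lines of "data: {json}" carrying choices[].delta.content, terminated by "data: [DONE]") and that the bearer key travels in an Authorization header. Neither detail is confirmed by this README, and the helper names are hypothetical.

```python
import json
import urllib.request


def build_chat_request(base_url, model, prompt, api_key=None, stream=True):
    """Build a POST request for /v1/chat/completions (nothing is sent yet)."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 80,
        "stream": stream,
    }
    headers = {"content-type": "application/json"}
    if api_key:  # assumed header format for the bearer key
        headers["authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def iter_stream_text(lines):
    """Yield text pieces from SSE lines, stopping at the [DONE] sentinel."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


def stream_chat(base_url, model, prompt, api_key=None):
    """Send the request and return the full streamed reply as one string."""
    request = build_chat_request(base_url, model, prompt, api_key)
    with urllib.request.urlopen(request) as response:
        return "".join(iter_stream_text(response))
```

With the server from the example above running, stream_chat("http://127.0.0.1:8000", "my-run", "What is Picochat?") would return the reply text.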

Scale up

Picochat scales from a laptop CPU smoke test to multi-GPU H100/H200 runs. Larger runs are intentionally gated:

setup → sanity → import → release-skills pack → preflight → DDP dry run → run → SFT/eval → release gate

Documentation

Picochat is inspired by Andrej Karpathy's nanochat, with a different goal: make the whole small-model factory inspectable and usable, not claim frontier behavior from a tiny run.

Development

pytest -q                 # Python tests
npm ci && npm run frontend:check && npm run frontend:build   # the dashboard
ruff check src tests      # lint

See CONTRIBUTING.md for PR standards and the release-evidence expectations.

License

MIT. See LICENSE.
