λ Kalle — KRR Chat

A language model with no neural network.

Eigenvalues, kernel ridge regression, and honest corpus engineering.

The same task as GPT-4 — predict the next word — solved with eigenvalues instead of backpropagation.

→ Launch Kalle in your browser

What is this?

Kalle is a bilingual chatbot (German + English) that runs entirely in your browser using Kernel Ridge Regression with Random Fourier Features. No neural networks, no backpropagation, no gradient descent, no server — just matrix multiplication, a Gaussian kernel, and eigenvalues. TensorFlow.js provides WebGL GPU acceleration; the model weights are embedded directly in the HTML.

Kalle chats about food, hobbies, music, feelings, weather, and simple math — with multi-turn context awareness, honest scope boundaries, and RAG (Retrieval-Augmented Generation) over the Eigenvalues & AI blog post. Ask "What are eigenvalues?" or "How does PageRank work?" and Kalle retrieves the relevant blog section to answer.

How the build works

Everything is bundled into a single self-contained HTML file (index.html, ~33 MB). No server, no API, no external dependencies at runtime.

OFFLINE BUILD (Python, Float64, ~3 min on CPU)
┌─────────────────────────────────────────────────────────┐
│                                                         │
│  data/corpus.md (2113 dialog pairs)                     │
│       │                                                 │
│       ├──→ Word2Vec (gensim, 32-dim) → embeddings       │
│       ├──→ Token sequence (5× repeat) → RFF features    │
│       │         └──→ W = (Z^TZ + λI)⁻¹ Z^TY   [KRR]   │
│       ├──→ IDF weights + BoW pair embeddings             │
│       └──→ data/chunk_index.json → RAG chunk embeddings  │
│                                                         │
│  All tensors → Float16 + gzip + base64                  │
│       └──→ Injected into data/template.html             │
│            └──→ index.html (self-contained, ~33 MB)     │
└─────────────────────────────────────────────────────────┘

ONLINE (Browser, WebGL GPU)
┌─────────────────────────────────────────────────────────┐
│  1. Decompress base64 → Float16 → GPU tensors           │
│  2. User query → chunk retrieval (RAG) → pair matching   │
│  3. Word-by-word rendering with prediction comparison    │
│  4. TensorFlow.js WebGL for matrix ops (<1ms/word)      │
└─────────────────────────────────────────────────────────┘

The template (data/template.html) contains ~80 lines of vanilla JavaScript for the matching and rendering logic. The build script (src/build.py) trains the model offline in Float64 for numerical precision, then quantizes to Float16 and packs everything into the template.

The math (three equations)

# 1. Random Fourier Features (Rahimi & Recht, 2007)
z(x) = sqrt(2/D) * cos(x @ omega + bias)       # ω ~ N(0, 1/σ²)

# 2. Kernel Ridge Regression (closed-form, no gradient descent)
W = solve(Z.T @ Z + lambda * I, Z.T @ Y)        # one matrix solve

# 3. Prediction (single matrix-vector multiply)
next_word = argmax(z(context) @ W)               # <1ms on WebGL GPU

No epochs. No learning rate. No convergence monitoring. W is the only learned parameter (6144 × 2952 ≈ 18.1M values).

Key numbers

Parameter	Value
Corpus	2174 curated dialog pairs (DE + EN)
Vocabulary	2952 words (Word2Vec, 32-dim)
Context window	24 words
RFF dimension (D)	6144
Kernel bandwidth (σ)	1.5
Regularization (λ)	10⁻⁶
RAG chunks	29 (from Eigenvalues & AI blog)
File size	~56 MB (self-contained HTML)
Runtime	WebGL GPU via TensorFlow.js

Architectural properties

Deterministic consequences of the BoW+IDF design — not programmed, but also not magical:

Math validation: Kalle asks "what is 3+5?", user says "8" → "correct!" — pure pattern matching on plus 3 5 8, no code checks the math
Insult immunity: Profanity is out-of-vocabulary → invisible to the model → no filter needed
Typo robustness: OOV words from typos are silently ignored, remaining words still match
Bilingual routing: English/German words have different IDF weights → queries route to language-appropriate pairs without language detection code
Honest limits: Low confidence → "I can only talk about food, hobbies, music, feelings, weather and math"

The journey: less is more

Iteration	Pairs	Encoding	Top-1	Lesson
V1 (original)	57	Hash (128 buckets)	99.8%	Perfect but narrow
MEGA (mass)	4301	Word2Vec + hacks	34.9%	More data = worse!
FINAL (curated)	2174	Word2Vec (32-dim)	65.4%	Curated > generated

This mirrors what OpenAI, Anthropic and Google learned: data quality beats data quantity.

Engineering

Build from source

pip install numpy gensim

# Generate corpus (optional — data/corpus.md already included)
python3 src/gen_corpus.py

# v1 build: direct Gaussian elimination (fastest on CPU at current scale)
python3 src/build.py
# → produces index.html (~56 MB, all weights embedded, runs in any browser)

# v2 build: pluggable solver (v1-compatible + new iterative paths)
python3 src/build_v2.py --solver=direct                           # v1-equivalent
python3 src/build_v2.py --solver=cg                               # Block-PCG (default, GPU-friendly)
python3 src/build_v2.py --solver=cg --cg-tol=1e-8 --cg-maxiter=2000  # tighter tolerance
# → produces kalle-chat-v2.html

The v2 solver implements the absorber-stochasticization described in the v2 paper — same result as direct solve, but using only matrix-vector products (GPU-ideal, scales to D ≫ 10,000). See benchmarks/README.md for synthetic comparisons and benchmarks/real_kalle_results.md for the full-scale benchmark on actual Kalle training matrices (NumPy vs. PyTorch CPU vs. Apple MPS GPU — with a surprising finding: PyTorch's BLAS back-end matters more than the GPU at this scale).

Run tests

pip install playwright && playwright install chromium

# Full regression suite (34 scenarios)
python3 tests/test_regression.py index.html

# Filter by category
python3 tests/test_regression.py index.html --filter math
python3 tests/test_regression.py index.html --filter emotion

Run locally

git clone https://github.com/pmmathias/krr-chat.git
cd krr-chat
open index.html   # opens in default browser, GPU accelerated

No build step needed to run — the HTML file is self-contained with embedded model weights.

Project structure

krr-chat/
├── index.html              # The chatbot (self-contained, ~56 MB)
├── README.md
├── ARCHITECTURE.md         # Full technical deep-dive
├── src/
│   ├── build.py            # Build pipeline: corpus → Word2Vec → KRR → HTML
│   ├── gen_corpus.py       # Curated corpus generator (~2100 pairs)
│   └── gen_rag_qa.py       # RAG Q&A pair generator from blog chunks
├── tests/
│   └── test_regression.py  # Playwright regression suite (34 scenarios)
└── data/
    ├── corpus.md           # 2174 curated dialog pairs (training data)
    ├── chunk_index.json    # 29 blog chunks for RAG retrieval
    └── template.html       # HTML/JS template (matching + rendering logic)

See ARCHITECTURE.md for the full technical architecture, design decisions, and mathematical foundations.

License

MIT

Author

Mathias Leonhardt — CTO at pmagentur.com, writing about the mathematical foundations of AI at ki-mathias.de

Built in collaboration with Claude Code (Anthropic)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

λ Kalle — KRR Chat

A language model with no neural network.

Eigenvalues, kernel ridge regression, and honest corpus engineering.

→ Launch Kalle in your browser

What is this?

How the build works

The math (three equations)

Key numbers

Architectural properties

The journey: less is more

Engineering

Build from source

Run tests

Run locally

Project structure

Further reading

License

Author

About

Uh oh!

Releases 1

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 31 Commits
benchmarks		benchmarks
data		data
paper		paper
projektmanagement		projektmanagement
src		src
tests		tests
.gitignore		.gitignore
ARCHITECTURE.md		ARCHITECTURE.md
README.md		README.md
VARIANTS.md		VARIANTS.md
index.html		index.html
kalle-chat-v2.html		kalle-chat-v2.html

Folders and files

Latest commit

History

Repository files navigation

λ Kalle — KRR Chat

A language model with no neural network.

Eigenvalues, kernel ridge regression, and honest corpus engineering.

→ Launch Kalle in your browser

What is this?

How the build works

The math (three equations)

Key numbers

Architectural properties

The journey: less is more

Engineering

Build from source

Run tests

Run locally

Project structure

Further reading

License

Author

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages