Zandereins/README.md

Franz Paul

Production AI engineer · Microsoft 365 / Cloud consulting background · Dresden, Germany

I build deterministic tooling that makes AI agents measurable — quality you can defend with numbers, not vibes. Available for freelance.

Total stars · schliff score · schliff on PyPI · Hydra · fpaul.dev · LinkedIn

Featured · Open source · Work with me


Most AI tooling calls the agent and hopes. I build the layer that makes the agent's output measurable — one coherent stack with three jobs: you score the instructions before you trust the agent, you review its output with more than one mind, and you give it the context to be right in the first place. The throughline is measure first, then fix.

| Layer | Tool | Its job |
| --- | --- | --- |
| 1 · Score | schliff | Measure the instructions before you run the agent. |
| 2 · Review | hydra | Review the diff with more than one model's read. |
| 3 · Context | vault-sync (private) | Keep the agent working from ground truth. |

Featured

Two shipped, public tools do the heavy lifting; the rest are private systems that show how I work end to end.

| Repo | What it does | Signal |
| --- | --- | --- |
| schliff | Deterministic, stdlib-only quality scorer for AI agent instruction files: SKILL.md, CLAUDE.md, .cursorrules, AGENTS.md, system prompts | 8 scorers (7 in the composite) · anti-gaming detection · deterministic patches · 1,288 tests · MIT · on PyPI |
| hydra | Multi-perspective review council for Claude Code: advisors analyze, reviewers cross-examine, a chairman synthesizes | 4 advisors (6 in deep mode) · 3 cross-examining reviewers · up to 10 agents · Claude Opus + OpenAI Codex · MIT |
| vault-sync (private) | Syncs GitHub repos + PyPI metadata into an Obsidian vault as a Context Mirror | CLI + macOS menubar widget (Molty mascot) + Claude Code MCP plugin (4 read-only MCP tools, 4 skills, 2 hooks) · Python ≥3.10 · MIT |
| project-beat (private) | FastAPI + Next.js 16 freelance-job radar across German boards | Scrapes 4 active boards five times daily · 6-component hybrid matching · Supabase dashboard |
| mission-control (private) | Next.js 16 command center for an OpenClaw VPS | 23 server-side API endpoints · Kanban board · JSON persistence · Tailscale-only access |
| fpaul.dev | Personal developer site: Next.js 16, MDX, Writing section on AI security and agent tooling | Live on Vercel |
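
The council flow described for hydra (advisors analyze, reviewers cross-examine, a chairman synthesizes) can be sketched in a few lines. This is an illustrative outline only, not hydra's code: plain functions stand in for model calls, and every name here (`run_council`, `todo_advisor`, `evidence_reviewer`, and so on) is hypothetical.

```python
from typing import Callable

# Hypothetical type aliases; the real tool's interfaces are not shown in this README.
Advisor = Callable[[str], list[str]]              # diff -> findings
Reviewer = Callable[[str, list[str]], list[str]]  # diff, findings -> surviving findings
Chairman = Callable[[list[str]], str]             # findings -> verdict


def run_council(diff: str, advisors: list[Advisor],
                reviewers: list[Reviewer], chairman: Chairman) -> str:
    """Collect findings, let each reviewer challenge them, then synthesize."""
    findings: list[str] = []
    for advise in advisors:
        findings.extend(advise(diff))
    for review in reviewers:
        findings = review(diff, findings)
    return chairman(findings)


# Stand-ins for model-backed agents (assumptions for the sake of the example).
def todo_advisor(diff: str) -> list[str]:
    return ["leftover TODO"] if "TODO" in diff else []


def debug_advisor(diff: str) -> list[str]:
    return ["debug print"] if "print(" in diff else []


def style_advisor(diff: str) -> list[str]:
    return ["line too long"]  # deliberately over-eager, so a reviewer has work to do


def evidence_reviewer(diff: str, findings: list[str]) -> list[str]:
    # Keep "line too long" only if the diff actually contains a long line.
    has_long_line = any(len(line) > 100 for line in diff.splitlines())
    return [f for f in findings if f != "line too long" or has_long_line]


def simple_chairman(findings: list[str]) -> str:
    if not findings:
        return "approve"
    return "request changes: " + "; ".join(sorted(findings))


verdict = run_council(
    "+ print(x)  # TODO remove",
    advisors=[todo_advisor, debug_advisor, style_advisor],
    reviewers=[evidence_reviewer],
    chairman=simple_chairman,
)
print(verdict)  # request changes: debug print; leftover TODO
```

The point of the cross-examination step is that an unsupported finding is dropped before the chairman sees it.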

What I build

Deterministic, well-tested tooling for the agentic-coding ecosystem: things that score and review AI agents instead of just calling them. In practice, measure first, then fix, comes down to three things: anti-gaming detection so a score can't be inflated, deterministic patches that apply ~32% of schliff's fixes mechanically, and spec-first discipline where every claim is checked against the real artifact. Stdlib-first Python, TypeScript where the runtime demands it.
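
To make "deterministic scoring with anti-gaming detection" concrete, here is a minimal stdlib-only sketch. It is not schliff's scoring logic: the three checks, the word list, and the name `score_instructions` are assumptions chosen for illustration. It shows only the general idea that a score is a pure function of the text and that padding a file with repeated lines lowers the result instead of raising it.

```python
import re

# Hypothetical word list; a real scorer would use its own rules.
VAGUE_WORDS = {"maybe", "somehow", "stuff", "things", "etc"}


def score_instructions(text: str) -> dict[str, float]:
    """Score an instruction file using only its text (no model calls, no randomness)."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines:
        return {"structure": 0.0, "specificity": 0.0, "uniqueness": 0.0, "composite": 0.0}

    words = re.findall(r"[a-z']+", text.lower())

    # Structure: does the file contain at least one markdown heading?
    structure = 1.0 if any(ln.startswith("#") for ln in lines) else 0.0

    # Specificity: share of words that are not on the vague-word list.
    vague = sum(1 for w in words if w in VAGUE_WORDS)
    specificity = 1.0 - vague / len(words) if words else 0.0

    # Anti-gaming: repeated lines add length but no information, so the
    # composite is scaled by the share of distinct lines.
    uniqueness = len(set(lines)) / len(lines)

    composite = round((structure + specificity) / 2 * uniqueness, 3)
    return {
        "structure": structure,
        "specificity": specificity,
        "uniqueness": uniqueness,
        "composite": composite,
    }


good = "# Rules\nRun the tests before committing.\nUse type hints."
padded = good + "\nUse type hints." * 3  # same content, stuffed with repeats

print(score_instructions(good)["composite"])    # 1.0
print(score_instructions(padded)["composite"])  # 0.5
```

Because nothing in the function depends on a model or on randomness, running it twice on the same file always returns the same numbers, which is what makes the score usable as a regression check.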

Open source

A clean merged PR is the receipt I trust most — third-party-validated proof a maintainer accepted the work.

| Project | Contribution | Status |
| --- | --- | --- |
| modelcontextprotocol/servers | Added a root CLAUDE.md covering the full reference-servers monorepo: 7 servers (4 TypeScript, 3 Python) | PR #3733, merged by a maintainer, April 2026 |

Same thesis, applied upstream: better context, fewer guesses.

Dev environment & stack

  • Languages: Python (stdlib-first, ≥3.10), TypeScript, SQL
  • AI / Agents: Claude Code, OpenAI Codex, MCP servers, agent instruction-file quality scoring, multi-agent review councils
  • Web: Next.js 16, React 19, Tailwind CSS, MDX
  • Backend / Data: FastAPI, Supabase / Postgres, Playwright, multilingual embeddings
  • Infra: Docker, Tailscale zero-trust networking, Vercel, Hetzner VPS
  • Tooling discipline: deterministic scorers, anti-gaming detection, heavy test coverage, single-sourced versioning, spec-driven workflows
  • Knowledge base: Obsidian (PARA), synced to repos via vault-sync

More private systems

  • OpenClaw / Vega stack — self-hosted OpenClaw Gateway on a Hetzner VPS: Docker Compose, a security-hardening overlay, and access locked behind a Tailscale zero-trust network, driving an always-on OpenClaw agent workspace.
  • Mission Control — private Next.js command center for the OpenClaw VPS: 23 server-side API endpoints, a Kanban board (Open / In Progress / Review / Done), JSON-file persistence (no database), reached only over Tailscale.
  • project-beat — private Python / FastAPI + Next.js system that scrapes 4 active German freelance job boards (freelance.de, GULP, Freelancermap, Hays — 13 sources configured) five times daily and ranks postings against profiles via a 6-component hybrid matching pipeline on a Supabase dashboard.
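
A hybrid matching pipeline like the one mentioned for project-beat usually ends in a weighted combination of per-component scores. The sketch below shows that final step only, under stated assumptions: the component names (`keyword_overlap`, `embedding_similarity`) and the weights are hypothetical, and only two components are shown because the README does not name the six it uses.

```python
def hybrid_score(components: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted mean of component scores, each expected to lie in [0, 1]."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    missing = set(weights) - set(components)
    if missing:
        raise KeyError(f"missing component scores: {sorted(missing)}")
    return sum(weights[name] * components[name] for name in weights) / total_weight


def rank(postings: dict[str, dict[str, float]],
         weights: dict[str, float]) -> list[tuple[str, float]]:
    """Return (posting_id, score) pairs, best first; ties are broken by id."""
    scored = [(pid, hybrid_score(scores, weights)) for pid, scores in postings.items()]
    return sorted(scored, key=lambda item: (-item[1], item[0]))


# Hypothetical weights and component scores for two postings.
weights = {"keyword_overlap": 2, "embedding_similarity": 3}
postings = {
    "posting-a": {"keyword_overlap": 1.0, "embedding_similarity": 0.5},
    "posting-b": {"keyword_overlap": 0.5, "embedding_similarity": 1.0},
}

ranking = rank(postings, weights)
print([pid for pid, _ in ranking])  # ['posting-b', 'posting-a']
```

Keeping the weights in one dictionary makes the ranking reproducible: the same scores and weights always produce the same order.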

Why I build this way

Production AI engineer with a consulting background in Microsoft 365 and Microsoft Cloud, based in Dresden, Germany. The enterprise work taught me the thing the AI-hype market keeps forgetting: tooling that can't be measured can't be trusted. So I work spec-first — a spec is the single source of truth, the code follows, and claims get verified against the real artifact — and I build to the same standard I'd ship to a client. That's the whole reason the stack starts with score.

Work with me

Available for freelance engagements — AI tooling, agent quality / eval systems, Microsoft 365 / Microsoft Cloud, and full-stack web.

fpaul.dev · LinkedIn

Popular repositories

  1. schliff (Public)

    Deterministic quality scorer for AI agent instruction files — 8-dimension scoring with security, multi-format (SKILL.md, CLAUDE.md, .cursorrules, AGENTS.md), anti-gaming detection, zero dependencies

    Python 3

  2. hydra (Public)

    Multi-perspective code review council for Claude Code. 3 advisors by default, 10 agents in deep mode (Opus + Codex). Evidence chains, adversarial self-test, dual-path verdict. Based on Karpathy's L…

    Python 3

  3. openclaw (Public)

    Forked from openclaw/openclaw

    Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞

    TypeScript

  4. n8n (Public)

    Forked from n8n-io/n8n

    Fair-code workflow automation platform with native AI capabilities. Combine visual building with custom code, self-host or cloud, 400+ integrations.

    TypeScript

  5. Zandereins (Public)

    GitHub Profile README

  6. awesome-agent-skills (Public)

    Forked from VoltAgent/awesome-agent-skills

    Claude Code Skills and 700+ agent skills from official dev teams and the community, compatible with Codex, Antigravity, Gemini CLI, Cursor and others.