Releases · zenprocess/servingcard

servingcard v0.1.0

Hardware-specific LLM serving configurations. Model cards for serving.

What's included

- Spec: ServingCard YAML v1.0 with JSON Schema
- CLI: servingcard benchmark, servingcard apply, servingcard validate, servingcard info
- Registry: 3 seed configs (Qwen3-coder on NVIDIA GB10)
  - FP8 + Eagle3 speculative: 69 tok/s
  - FP8 baseline: 42 tok/s
  - NVFP4: 42 tok/s, 262K context
- PawBench integration: benchmark harness for producing serving cards
- 57 tests, CI on Python 3.10/3.11/3.12
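
The three seed throughput figures can be compared directly. As a rough sketch (the dictionary keys and the helper name `speedup_vs_baseline` are illustrative and not part of servingcard), the following computes each config's throughput relative to the FP8 baseline:

```python
# Throughput figures (tok/s) copied from the seed configs listed above.
# The dictionary keys are illustrative labels, not registry identifiers.
SEED_THROUGHPUT = {
    "fp8-eagle3": 69.0,
    "fp8-baseline": 42.0,
    "nvfp4": 42.0,
}


def speedup_vs_baseline(throughput, baseline_key="fp8-baseline"):
    """Return each config's throughput divided by the baseline's, rounded to 2 places."""
    base = throughput[baseline_key]
    return {name: round(tps / base, 2) for name, tps in throughput.items()}


if __name__ == "__main__":
    for name, ratio in speedup_vs_baseline(SEED_THROUGHPUT).items():
        print(f"{name}: {ratio:.2f}x")
```

On these numbers, the Eagle3 speculative config comes out at roughly 1.64x the FP8 baseline, while NVFP4 matches the baseline's throughput (its entry also lists a 262K context).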

Install

git clone https://github.com/zenprocess/servingcard
cd servingcard/packages/python
pip install -e .
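
Since CI is stated to run on Python 3.10, 3.11, and 3.12, it may be worth checking the interpreter before installing. A minimal sketch, assuming only that list of versions (the helper `is_ci_tested` is hypothetical, and other Python versions may well work but are not covered by the stated CI matrix):

```python
import sys

# Versions the release notes say CI runs on; anything else is untested, not necessarily broken.
CI_TESTED = {(3, 10), (3, 11), (3, 12)}


def is_ci_tested(version=None):
    """True if the (major, minor) pair is one of the CI-tested versions."""
    major, minor = (version or sys.version_info)[:2]
    return (major, minor) in CI_TESTED


if __name__ == "__main__":
    status = "CI-tested" if is_ci_tested() else "not covered by CI"
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}: {status}")
```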

Quick start

# Apply a community config
servingcard apply qwen3-coder/gb10-fp8-eagle3-spec3

# Benchmark your setup
servingcard benchmark --model qwen3-coder --hardware nvidia-gb10 --endpoint http://localhost:8000
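
The same two commands can be driven from a script. The sketch below only assembles the argument lists shown above and shells out to the CLI; the helper names (`apply_cmd`, `benchmark_cmd`, `run`) are hypothetical, and it assumes the `servingcard` executable is on `PATH` after the source install. It does not use any servingcard Python API.

```python
import shutil
import subprocess


def apply_cmd(config_id):
    """Argument list for `servingcard apply <config_id>`."""
    return ["servingcard", "apply", config_id]


def benchmark_cmd(model, hardware, endpoint):
    """Argument list for `servingcard benchmark` with the flags shown above."""
    return [
        "servingcard", "benchmark",
        "--model", model,
        "--hardware", hardware,
        "--endpoint", endpoint,
    ]


def run(cmd):
    """Run the command if the CLI is on PATH; return its exit code, or None if missing."""
    if shutil.which(cmd[0]) is None:
        print(f"{cmd[0]} not found on PATH; install it first")
        return None
    return subprocess.run(cmd, check=False).returncode


if __name__ == "__main__":
    run(apply_cmd("qwen3-coder/gb10-fp8-eagle3-spec3"))
    run(benchmark_cmd("qwen3-coder", "nvidia-gb10", "http://localhost:8000"))
```

Passing the command as a list rather than a single string avoids shell quoting issues with the endpoint URL.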

A PyPI package is planned for v0.2.0; until then, install from source as described under Install.

v0.1.0 — First release
