Add PegaFlow external KV cache blog post by Alex-yang00 · Pull Request #211 · vllm-project/vllm-project.github.io

Alex-yang00 · 2026-05-19T02:27:04Z

Summary

This PR adds a new blog post for the vLLM x Novita AI collaboration on PegaFlow, an external KV cache service for production LLM serving.

The post covers:

Why moving KV cache lifetime out of the inference process improves restart behavior and failure isolation
How PegaFlow pools KV cache across local instances, TP ranks, and remote nodes
Benchmark results for startup time, local cache sharing, MLA KV deduplication, and RDMA remote reads
The three-level cache hierarchy with pinned DRAM, remote RDMA-accessible DRAM, and SSD
vLLM integration through the external KV connector interface
Quick-start commands and a public reference benchmark from the PegaFlow repository

Assets

Adds figures for:

PegaFlow architecture
Startup time comparison
Rust/Python tail-latency comparison
Local sharing result summary
Cross-node RDMA throughput
Cache-policy comparison

Review notes

The public PegaFlow install commands, connector configuration, P2P flags, and reference benchmark were checked against the novitalabs/pegaflow README and docs. The internal production benchmark numbers should still be confirmed by the Novita AI team before marking this PR ready for merge.

Signed-off-by: Alex-wuhu <yanglongwei06@gmail.com>

esmeetu · 2026-05-19T12:49:54Z

+  --metaserver-addr http://metaserver-host:50056
+```
+
+Connect vLLM without modifying vLLM source code:


We could specify the vLLM version used here.

Thanks! Added a note in the quick-start section that the examples in this post use vllm>=0.20.0.
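For readers who want to check a local environment against that version note, here is a minimal sketch. It assumes vLLM is installed as the `vllm` distribution; the helper names (`release_tuple`, `meets_minimum`, `check_installed_vllm`) are hypothetical and are not part of vLLM or PegaFlow. It compares only the numeric release segment, so pre-release suffixes such as `rc1` are ignored rather than ordered as PEP 440 would order them.

```python
import re
from importlib import metadata

# Minimum version taken from the note above (vllm>=0.20.0).
MIN_VLLM = "0.20.0"


def release_tuple(version: str) -> tuple[int, ...]:
    """Return the leading numeric release segment, e.g. '0.20.1rc1' -> (0, 20, 1)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed: str, minimum: str = MIN_VLLM) -> bool:
    """Compare release segments, padding the shorter one with zeros."""
    a, b = release_tuple(installed), release_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_installed_vllm() -> bool:
    """Return False if vLLM is missing or older than MIN_VLLM."""
    try:
        return meets_minimum(metadata.version("vllm"))
    except metadata.PackageNotFoundError:
        return False


if __name__ == "__main__":
    print("vllm version OK" if check_installed_vllm() else "vllm missing or too old")
```

A stricter check could use `packaging.version.Version` instead of the regular expression, at the cost of an extra dependency.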

esmeetu · 2026-05-19T12:51:21Z

LGTM! There's a small suggestion; please take a look.


vercel Bot deployed to Preview May 19, 2026 02:27

vercel Bot deployed to Preview May 19, 2026 02:30

vercel Bot deployed to Preview May 19, 2026 02:36

Alex-yang00 marked this pull request as ready for review May 19, 2026 02:38

Add PegaFlow blog post

6aecf39

Signed-off-by: Alex-wuhu <yanglongwei06@gmail.com>

Alex-yang00 force-pushed the codex/pegaflow-blog branch from c6ec6c1 to 6aecf39 May 19, 2026 02:39

vercel Bot deployed to Preview May 19, 2026 02:40

esmeetu reviewed May 19, 2026


esmeetu approved these changes May 19, 2026


Specify vLLM version for PegaFlow examples

62afa48

Signed-off-by: Alex-wuhu <yanglongwei06@gmail.com>

vercel Bot deployed to Preview May 19, 2026 13:14

Alex-yang00 wants to merge 2 commits into vllm-project:main from Alex-yang00:codex/pegaflow-blog
