From 4d23adce3d46836567b96a0ed88349731fb16d9c Mon Sep 17 00:00:00 2001 From: dacharyc Date: Fri, 8 May 2026 22:24:59 -0400 Subject: [PATCH 1/2] Hugo fixups + cross-linking to new Platforms page --- README.md | 4 ++-- SPEC.md | 21 ++++++++++++++--- site/content/_index.md | 6 ++++- .../{platforms.md => platforms/_index.md} | 23 +++++++++++++------ site/layouts/_default/list.html | 6 ----- site/static/llms.txt | 1 + 6 files changed, 42 insertions(+), 19 deletions(-) rename site/content/{platforms.md => platforms/_index.md} (88%) diff --git a/README.md b/README.md index a3254c5..37105f2 100644 --- a/README.md +++ b/README.md @@ -5,7 +5,7 @@ docs agent-friendly. The spec focuses on meeting the technical constraints of agent platforms (truncation limits, content negotiation, discovery); it does not consider qualitative evaluation of content. -**Status**: Draft (v0.5.0) +**Status**: Draft (v0.5.1) **Full spec**: [SPEC.md](SPEC.md) | **Website**: [agentdocsspec.com](https://agentdocsspec.com) @@ -98,7 +98,7 @@ This spec is open for community review. We welcome: - **Proposed changes**: Submit a pull request (open an issue first for significant changes) - **Platform data**: If you know a platform's truncation limits, contribute to - [the Platforms tables](./site/content/platforms.md) + [the Platforms tables](https://agentdocsspec.com/platforms/) - **Real-world results**: If you've evaluated your docs against this spec, we'd love to hear what you found diff --git a/SPEC.md b/SPEC.md index 3c01912..611af1a 100644 --- a/SPEC.md +++ b/SPEC.md @@ -3,8 +3,8 @@ | | | |--------------|--------------------------------------------------------------| | **Status** | Draft | -| **Version** | 0.5.0 | -| **Date** | 2026-04-25 | +| **Version** | 0.5.1 | +| **Date** | 2026-05-08 | | **Author** | Dachary Carey + community contributors | | **URL** | https://agentdocsspec.com | | **Repository** | https://github.com/agent-ecosystem/agent-docs-spec | @@ -653,6 +653,10 @@ Because different agents hit different paths, this spec defines size checks for **both** the markdown response (if available) and the HTML response. A site that's only optimized for the markdown path is leaving most agents behind. +For empirical observations of how specific platforms (Claude, Cursor, Copilot, +Gemini, Windsurf Cascade, and others) handle retrieval, truncation, and +summarization in practice, see [Agent platform comparisons](https://agentdocsspec.com/platforms/). + ### `rendering-strategy` - **What it checks**: Whether the HTTP response contains the page's actual @@ -1503,7 +1507,7 @@ becomes available. ### Known Platform Limits -Compare platform architecture and truncation limits in [Platforms](./site/content/platforms.md). +Compare platform architecture and truncation limits in [Platforms](https://agentdocsspec.com/platforms/). ### What This Means for Threshold Selection @@ -1598,6 +1602,17 @@ welcome. ## Changelog +### v0.5.1 (2026-05-08) + +- Moved per-platform truncation data out of Appendix A into a new + [Platforms](https://agentdocsspec.com/platforms/) comparison page on the + site. Appendix A retains the spec's threshold rationale and points readers + to the platforms page for current per-platform observations. Category 3 + (Page Size and Truncation Risk) now references the platforms page so + readers can connect threshold choices to empirical pipeline behavior. No + threshold or check definitions changed. Platforms page authored by + Rhyannon Rodriguez. + ### v0.5.0 (2026-04-25) - Split `llms-txt-directive` into two independent checks: diff --git a/site/content/_index.md b/site/content/_index.md index 9ffaaa9..171825d 100644 --- a/site/content/_index.md +++ b/site/content/_index.md @@ -1,5 +1,5 @@ --- -title: "Agent-Friendly Documentation Spec" +title: "Can agents read your documentation?" description: "A proposed specification for making documentation sites work well for coding agents." --- @@ -30,6 +30,10 @@ severity. **[Read the Full Spec](/spec/)** +For empirical observations on how specific agent platforms (Claude, Cursor, +Copilot, Gemini, and others) handle retrieval, truncation, and summarization, +see **[Agent platform comparisons](/platforms/)**. + ## Quick Start for Documentarians If you can only do a few things, these have the highest impact: diff --git a/site/content/platforms.md b/site/content/platforms/_index.md similarity index 88% rename from site/content/platforms.md rename to site/content/platforms/_index.md index 51c27fc..d978c06 100644 --- a/site/content/platforms.md +++ b/site/content/platforms/_index.md @@ -1,13 +1,22 @@ --- -title: "Platforms" +title: "Agent platform comparisons" description: "Agent platform comparisons for retrieval, truncation, and summarization layers." +showTableOfContents: true --- -| **Section** | **Description** | -| ----------- | ------------------ | -| [Retrieval](#retrieval) | How and when an agent fetches content | -| [Truncation](#truncation) | What gets lost and whether agents report it | -| [Summarization](#summarization) | What happens to content between retrieval and generation | +| | | +|------------------|----------------------------------------------------------------------------------| +| **Author** | Rhyannon Rodriguez | +| **Last updated** | 2026-05-08 | +| **Methodology** | [Agent Ecosystem Testing](https://rhyannonjoy.github.io/agent-ecosystem-testing) | + +This page provides an overview of observed agent web fetch retrieval behavior across agent platforms. To clarify observed behavior, we break down the retrieval process into three key components that form most agent web fetch pipelines: + +- [Retrieval](#retrieval): How and when an agent fetches content +- [Truncation](#truncation): What gets lost and whether agents report it +- [Summarization](#summarization): What happens to content between retrieval and generation + +These observations inform the size thresholds and pipeline assumptions in the [Agent-Friendly Documentation Spec](/spec/), particularly [Category 3: Page Size and Truncation Risk](/spec/#category-3-page-size-and-truncation-risk). ## Retrieval @@ -59,5 +68,5 @@ Observable outputs from default settings primarily inform the conclusions below. | [Gemini API URL context](https://rhyannonjoy.github.io/agent-ecosystem-testing/docs/google-gemini-url-context-tool/methodology) | _API layer pipeline, undocumented_ | Pre-generation injection suggests processing occurs before LLM invocation. No transformation layer between retrieval and generation; LLM receives content directly and any summarization occurs as part of generation, not as an intermediate pipeline stage. | | [GitHub Copilot](https://rhyannonjoy.github.io/agent-ecosystem-testing/docs/microsoft-github-copilot/methodology) | _Inferred via relevance-ranking, undocumented for web fetch_ | Reassembled excerpts, outputs that don't note discarded content, browser masquerading, and tool substitution patterns suggests an orchestrator-subagent relationship and not a linear, passive pipeline. Agent loop descriptions vary by implementation. [VS Code-Copilot docs](https://code.visualstudio.com/docs/copilot/agents/subagents) describe subagent delegation as _main agent-initiated_ for complex tasks with further config available, but [Copilot SDK docs](https://docs.github.com/en/copilot/how-tos/copilot-sdk/use-copilot-sdk/custom-agents) only describe subagents as configurable, and not default architecture. | | MCP Fetch (reference server) | _None_ hard truncation at `max_length` | Passive, linear pipeline without a processing layer. | -| [OpenAI web search](./open-ai-web-search-tool/methodology.md) | _Differs by API surface, undocumented_ | Chat Completions autonomously retrieves, but Responses' LLM actively manages search in the chain of thought with `open_page` and `find_in_page`, suggesting a processing layer, but not explicitly documented or named in either API responses. | +| [OpenAI web search](https://rhyannonjoy.github.io/agent-ecosystem-testing/docs/open-ai-web-search-tool/methodology) | _Differs by API surface, undocumented_ | Chat Completions autonomously retrieves, but Responses' LLM actively manages search in the chain of thought with `open_page` and `find_in_page`, suggesting a processing layer, but not explicitly documented or named in either API responses. | | [Windsurf Cascade](https://rhyannonjoy.github.io/agent-ecosystem-testing/docs/cognition-windsurf-cascade/methodology) | _Inferred via chunking, undocumented for web and docs search_ | Codebase research triggers [built-in subagent Fast Context](https://docs.windsurf.com/context-awareness/fast-context). Test prompts likely invoked Fast Context alongside web search. Chunk analysis, tool substitution, terminal execution, and workspace referencing suggest an extensive processing layer, not a passive, linear pipeline. | diff --git a/site/layouts/_default/list.html b/site/layouts/_default/list.html index cc58147..64c7650 100644 --- a/site/layouts/_default/list.html +++ b/site/layouts/_default/list.html @@ -44,11 +44,5 @@

-

- {{ i18n "list.no_articles" | emojify }} -

- {{ end }} {{ end }} diff --git a/site/static/llms.txt b/site/static/llms.txt index 04afe91..c40d3e3 100644 --- a/site/static/llms.txt +++ b/site/static/llms.txt @@ -5,6 +5,7 @@ ## Docs - [Full Spec](https://agentdocsspec.com/spec/index.md): The complete specification with all checks, thresholds, and implementation guidance. +- [Platforms](https://agentdocsspec.com/platforms/index.md): Comparison of agent platforms across retrieval, truncation, and summarization behavior. - [Home](https://agentdocsspec.com/index.md): Overview, quick start guide, and background links. ## Tools From 747ef57f95bcfafa1f6d4f58ebd23d36f4a3128c Mon Sep 17 00:00:00 2001 From: dacharyc Date: Fri, 8 May 2026 22:32:31 -0400 Subject: [PATCH 2/2] Apply upstream formatting fix to relocated platforms page --- site/content/platforms/_index.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/site/content/platforms/_index.md b/site/content/platforms/_index.md index d978c06..68e4474 100644 --- a/site/content/platforms/_index.md +++ b/site/content/platforms/_index.md @@ -47,7 +47,7 @@ testing analysis and/or tool documentation. | ---------- | ----------------- | ------- | | [Claude API web fetch](https://rhyannonjoy.github.io/agent-ecosystem-testing/docs/anthropic-claude-api-web-fetch-tool/claude-interpreted-vs-raw) | ~20,700 chars and/or ~100 KB of rendered content _default unset_ | `max_content_tokens` approximate, setting 5,000 returned 17,186 chars, truncation occurs mid-token. Default limit identified in raw track, self-report attributed missing content to JavaScript rendering, masking character limit. | | [Claude Code](https://giuseppegurgone.com/claude-webfetch) | ~100,000 chars | Trusted sites serving `text/markdown` under 100K chars bypass summarization, while content over 100K chars are passed to a summarization LLM. | -| [Cursor](https://rhyannonjoy.github.io/agent-ecosystem-testing/docs/anysphere-cursor/cursor-interpreted-vs-raw)** | 28 KB–240 KB+ _method-dependent_, _nondeterministic filtering_ | `WebFetch MCP` ~28 KB, `urllib` ~72 KB, unknown path 245 KB+, `curl` no ceiling detected; appears to apply structure-aware content filtering, navigation and CSS stripped, but content selection heuristic presents as complete, so agents don't report truncation. | +| [Cursor](https://rhyannonjoy.github.io/agent-ecosystem-testing/docs/anysphere-cursor/cursor-interpreted-vs-raw) | 28 KB–240 KB+ _method-dependent_, _nondeterministic filtering_ | `WebFetch MCP` ~28 KB, `urllib` ~72 KB, unknown path 245 KB+, `curl` no ceiling detected; appears to apply structure-aware content filtering, navigation and CSS stripped, but content selection heuristic presents as complete, so agents don't report truncation. | | [Gemini API URL context](https://rhyannonjoy.github.io/agent-ecosystem-testing/docs/google-gemini-url-context-tool/gemini-interpreted-vs-raw) | _No fixed ceiling or silent dropping detected_, 20 URLs hard limit per request | API-layer rejection returns `400` and doesn't consume tokens; retrieval-layer failure completes the request, but records `URL_RETRIEVAL_STATUS_ERROR`. Format support inconsistent with documentation: PDF fails, YouTube succeeds, JSON nondeterministic; Google Docs fail consistently. | | [GitHub Copilot](https://rhyannonjoy.github.io/agent-ecosystem-testing/docs/microsoft-github-copilot/copilot-interpreted-vs-raw) | _No fixed ceiling detected, _nondeterministic excerpting_, tested 6.68M tokens | Pipeline with `fetch_webpage` discards whole sections or more granularly before generation, `curl` delivers all raw bytes but unreadable, chat rendering cutoff visible in output, not persisted as requested, but agents don't reliably report these results as truncation. | | [MCP Fetch (reference server)](https://pypi.org/project/mcp-server-fetch/) | Default 5,000 chars | Default `max_length` is 5,000 chars, but configurable up to 1,000,000; uniquely user-controlled truncation. |