Web Resource Ledger (WRL)

Cryptographic evidence of web content -- capture what a page looked like, when, with proof anyone can verify.

Submit a URL, get back a screenshot, rendered HTML, HTTP headers, and an Ed25519-signed WACZ bundle. The verification URL works for anyone -- no account needed. Deploy on your own infrastructure; your captures, your keys, your evidence.

Status: Early development, single-operator deployment. The API is functional and deployed but pre-1.0. See the roadmap for what's coming.

What you get

A single API call produces:

  • Dual screenshots (PNG) -- before and after cookie consent dismissal, so both the banner presence and the underlying page content are preserved
  • Rendered HTML -- the DOM after JavaScript execution
  • HTTP response headers -- the server's response at capture time
  • Signed WACZ bundle -- all artifacts packaged, hashed, and signed with Ed25519
  • Verification URL -- a shareable link anyone can use to confirm authenticity

Usage

Requires a running WRL instance. See Setup below.

export WRL_API_KEY=your_tenant_api_key

Tenant keys are created via the admin API (see step 8a). For deployments using the legacy static key, WRL_API_KEY is your CAPTURE_API_KEY value.

Replace wrl.example.com with your deployment URL, or localhost:8787 for local dev.

Step 1: Submit a capture

curl -X POST https://wrl.example.com/v1/captures \
  -H "Authorization: Bearer $WRL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com"}'
{
  "id": "cap_a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6",
  "statusUrl": "https://wrl.example.com/v1/captures/cap_a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6/status",
  "note": "Use GET /v1/captures to list and search your captures."
}

Your captures are always accessible. Use GET /v1/captures to list them, or save the capture ID for direct access.
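For clients that prefer a script over curl, the same request might be built as follows. This is a minimal sketch using only the Python standard library; the function names build_capture_request and submit_capture are illustrative and not part of WRL.

```python
import json
import urllib.request


def build_capture_request(base_url: str, api_key: str, target_url: str) -> urllib.request.Request:
    """Build the POST /v1/captures request from the curl example."""
    body = json.dumps({"url": target_url}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/captures",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def submit_capture(base_url: str, api_key: str, target_url: str) -> dict:
    """Send the request and return the parsed JSON body (id, statusUrl, note)."""
    request = build_capture_request(base_url, api_key, target_url)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Calling submit_capture("https://wrl.example.com", api_key, "https://example.com") against a running instance should return the id and statusUrl used in the next steps.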

Step 2: Poll for completion

curl https://wrl.example.com/v1/captures/cap_a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6/status

No auth required -- the capture ID acts as the access secret. Poll until status is complete or failed. The response includes a captureUrl when complete.
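A polling loop for this step could look like the sketch below. It assumes only what is stated above: the status body carries a status field that eventually becomes complete or failed. The helper name poll_until_done, the 2-second interval, and the 60-attempt cap are arbitrary choices for illustration.

```python
import time
from typing import Callable


def poll_until_done(
    fetch_status: Callable[[], dict],
    interval_seconds: float = 2.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_status() until its status is "complete" or "failed"."""
    for attempt in range(max_attempts):
        body = fetch_status()
        if body.get("status") in ("complete", "failed"):
            return body
        # Do not sleep after the final attempt; fall through to the timeout.
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"capture still not finished after {max_attempts} polls")
```

fetch_status is any zero-argument callable that returns the parsed status JSON, for example a small wrapper around a GET of the statusUrl from step 1. Passing it in keeps the loop independent of the HTTP client.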

Step 3: Retrieve artifacts

curl https://wrl.example.com/v1/captures/cap_a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6

Returns metadata and signed artifact URLs (screenshot, html, headers, wacz) plus a verifyUrl.

Step 4: Verify the bundle

curl https://wrl.example.com/v1/verify/cap_a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6

Returns a JSON verification result with up to four checks: artifactHashes, bundleHash, signature, and (for new captures) timestamp. The timestamp check verifies an RFC 3161 independent timestamp obtained at capture time. Legacy captures return three checks. The verifyUrl from step 3 also renders as a human-readable page in browsers.

The verifyUrl is meant to be shared. Keep in mind that it embeds the capture ID, and the capture ID grants full access to all artifacts without authentication: anyone who has it can view the capture. Treat the ID as a secret wherever you do not intend to share the capture.

Offline verification

For independent, offline verification -- including full CMS/PKCS#7 certificate chain validation -- use the CLI tool:

npx @w-r-l/verify capture.wacz --origin https://wrl.example.com

See packages/verify/ for details.

Finding and sharing captures

Finding captures: GET /v1/captures lists your captures (requires your API key). Use it to browse and recover capture IDs.

Sharing captures: The capture ID in any URL works without authentication. Share verification URLs freely.

curl https://wrl.example.com/v1/captures \
  -H "Authorization: Bearer $WRL_API_KEY"
{
  "data": [
    {
      "id": "cap_a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6",
      "status": "complete",
      "url": "https://example.com",
      "createdAt": "2024-01-15T10:30:00.000Z",
      "completedAt": "2024-01-15T10:30:45.123Z"
    }
  ],
  "pagination": {
    "cursor": null,
    "hasMore": false,
    "limit": 20
  }
}

Optional query parameters: limit (1-100, default 20), cursor (for paging), status (pending, complete, or failed).
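To walk every page of results, a client might follow the cursor as sketched below. The sketch assumes the cursor returned in pagination is passed back unchanged as the cursor query parameter, which this README implies but does not spell out. list_url and iter_captures are illustrative names.

```python
from typing import Callable, Iterator, Optional
from urllib.parse import urlencode


def list_url(base_url: str, limit: int = 20, cursor: Optional[str] = None,
             status: Optional[str] = None) -> str:
    """Build a GET /v1/captures URL with the optional query parameters."""
    params = {"limit": limit}
    if cursor:
        params["cursor"] = cursor
    if status:
        params["status"] = status
    return f"{base_url.rstrip('/')}/v1/captures?{urlencode(params)}"


def iter_captures(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every capture record, following pagination.cursor while hasMore is true."""
    cursor = None
    while True:
        page = fetch_page(cursor)
        yield from page.get("data", [])
        pagination = page.get("pagination", {})
        cursor = pagination.get("cursor")
        # Stop when the server reports no further pages or gives no cursor.
        if not pagination.get("hasMore") or cursor is None:
            return
```

fetch_page takes the current cursor (None for the first page) and returns the parsed JSON body; in practice it would issue an authenticated GET to list_url(...).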

Captures are rate-limited to 10 per minute per IP. Verification is limited to 60 per minute per IP. Error responses use RFC 9457 application/problem+json format. For full details, see openapi.yaml.
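Error bodies in this format can be read with a few lines of code. The sketch below extracts the standard RFC 9457 members (type, title, status, detail, instance); which of them WRL actually populates is not stated here, so treat openapi.yaml as the authority. parse_problem is an illustrative helper name.

```python
import json
from typing import Optional


def parse_problem(content_type: str, body: bytes) -> Optional[dict]:
    """Return RFC 9457 problem details as a dict, or None if the body is not problem+json."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type != "application/problem+json":
        return None
    problem = json.loads(body)
    # All members are optional in RFC 9457; an absent "type" means "about:blank".
    return {
        "type": problem.get("type", "about:blank"),
        "title": problem.get("title"),
        "status": problem.get("status"),
        "detail": problem.get("detail"),
        "instance": problem.get("instance"),
    }
```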

Setup

Prerequisites

  • Wrangler CLI installed and authenticated
  • Node.js 22+ (see .nvmrc)
  • Cloudflare account with R2 and Browser Rendering enabled

1. Install dependencies

npm install

2. Create KV namespace

wrangler kv namespace create wrl-kv

Set id and preview_id in wrangler.toml to the values returned by wrangler. If you forked this repo, they replace the IDs already present in the file.

3. Create R2 bucket

wrangler r2 bucket create wrl-captures
wrangler r2 bucket create wrl-captures-preview

4. Configure capture API key (legacy fallback)

CAPTURE_API_KEY is a static bearer token that acts as a fallback when no KV-based tenant key is found. For new deployments, consider setting up the admin API (step 8a) and creating tenant keys instead. For existing deployments, this key continues to work during migration.

In the usage examples above, this is $WRL_API_KEY.

Generate a key:

openssl rand -hex 32
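On a machine without openssl, an equivalent 32-byte hex key can be produced with Python's standard library. This is only an alternative sketch; generate_hex_key is an illustrative name.

```python
import secrets


def generate_hex_key(num_bytes: int = 32) -> str:
    """Return num_bytes of cryptographically secure randomness as a hex string."""
    return secrets.token_hex(num_bytes)


if __name__ == "__main__":
    print(generate_hex_key())
```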

Set the production secret:

wrangler secret put CAPTURE_API_KEY

For local dev, add to .dev.vars:

CAPTURE_API_KEY=<hex string from the command above>

Security: Never commit this value to version control. .dev.vars is already in .gitignore.

5. Configure signing key

WRL signs WACZ bundles with Ed25519. The signing key is optional -- if SIGNING_KEY is not set, captures complete successfully but without WACZ bundles (screenshot, HTML, and headers are still stored).

Generate a key pair:

node scripts/generate-signing-key.js

The script prints a private key (PKCS8 DER, base64) and the corresponding public key (raw, base64). The public key is embedded automatically in every signed bundle for verification -- no separate distribution needed.
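Before pasting the values into a secret, it can help to check that they have the expected shape. The sketch below assumes the script emits an unencrypted PKCS8 key in its common 48-byte form (a fixed 16-byte DER header followed by the 32-byte seed) and a 32-byte raw public key; if the script's output differs, the check does not apply. Both function names are illustrative.

```python
import base64

# DER header of an unencrypted PKCS8 Ed25519 private key; the 32-byte seed follows it.
ED25519_PKCS8_PREFIX = bytes.fromhex("302e020100300506032b657004220420")


def _decode(b64_text: str) -> bytes:
    """Strictly decode base64, returning empty bytes on malformed input."""
    try:
        return base64.b64decode(b64_text.strip(), validate=True)
    except ValueError:
        return b""


def looks_like_ed25519_private_key(b64_text: str) -> bool:
    raw = _decode(b64_text)
    return len(raw) == 48 and raw.startswith(ED25519_PKCS8_PREFIX)


def looks_like_ed25519_public_key(b64_text: str) -> bool:
    return len(_decode(b64_text)) == 32
```

This checks length and framing only. It does not prove that the two values belong to the same key pair.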

Set the production secret:

wrangler secret put SIGNING_KEY
# Paste the private key (PKCS8 DER, base64) when prompted

For local dev, add to .dev.vars:

SIGNING_KEY=<base64 string from the script>

Security: Never commit the private key to version control. .dev.vars is already in .gitignore.

6. Configure IP hash seed (recommended)

IP_HASH_SEED is an HMAC seed used to hash IP addresses before they appear in logs. Without it, log entries have no IP correlation for abuse analysis.
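The idea of a keyed IP hash can be illustrated as follows. This is not WRL's implementation: the choice of HMAC-SHA256, the hex decoding of the seed, and the name hash_ip are all assumptions made for the example.

```python
import hashlib
import hmac


def hash_ip(ip_address: str, seed_hex: str) -> str:
    """Return a keyed hash of an IP address, stable for a given seed."""
    seed = bytes.fromhex(seed_hex)  # assumption: the seed is the hex string from openssl
    return hmac.new(seed, ip_address.encode("utf-8"), hashlib.sha256).hexdigest()
```

The same IP always maps to the same digest under one seed, which allows correlation in logs, while the raw address cannot be looked up without the seed.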

Generate a seed:

openssl rand -hex 32

Set the production secret:

wrangler secret put IP_HASH_SEED

For local dev, add to .dev.vars:

IP_HASH_SEED=<hex string from the command above>

7. Configure Coralogix log ingestion (required for production observability)

CORALOGIX_SEND_KEY is the API key for structured log ingestion to Coralogix. Without it, the Worker runs normally but logs go to console only -- no structured log ingestion occurs. For fork developers who do not use Coralogix, this key is effectively optional.

Find your send key in the Coralogix dashboard under Settings > Send Your Data > API Keys.

Set the production secret:

wrangler secret put CORALOGIX_SEND_KEY

For local dev, structured logs are emitted to the console. No key is needed for wrangler dev.

8. Configure CORS origins (optional)

CORS_ORIGINS is a comma-separated list of allowed origins for CORS preflight responses. Only needed if browser-based clients will call the API directly.

Set it as an environment variable in wrangler.toml (not a secret):

[vars]
CORS_ORIGINS = "https://app.example.com,https://www.example.com"

If omitted, cross-origin requests from browsers are blocked. Server-to-server requests are unaffected.
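How a comma-separated allowlist of this kind is typically evaluated can be sketched as below. Exact string matching against the request's Origin header is an assumption here, not a description of WRL's code, and both function names are illustrative.

```python
from typing import Optional, Set


def parse_cors_origins(value: str) -> Set[str]:
    """Split a comma-separated CORS_ORIGINS value, ignoring blanks and whitespace."""
    return {origin.strip() for origin in value.split(",") if origin.strip()}


def is_origin_allowed(origin: Optional[str], allowed: Set[str]) -> bool:
    """True only when the Origin header exactly matches an allowlisted entry."""
    return origin is not None and origin in allowed
```

With an empty or missing value the allowlist is empty, so no browser origin matches, which is consistent with the behavior described above.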

8a. Configure admin key (required for per-tenant key management)

ADMIN_KEY is the bearer token for the admin API (/v1/admin/keys). It grants the ability to create, list, and revoke per-tenant API keys. It does not grant capture or read access.

Generate a key:

openssl rand -hex 32

Set the production secret:

wrangler secret put ADMIN_KEY

For local dev, add to .dev.vars:

ADMIN_KEY=<hex string from the command above>

Once deployed, create your first tenant key:

curl -X POST <YOUR_PRODUCTION_URL>/v1/admin/keys \
  -H "Authorization: Bearer $ADMIN_KEY" \
  -H "Content-Type: application/json" \
  -d '{"tenantId": "default", "scopes": ["capture"], "name": "default-key"}' | jq .

Use the returned key value as $WRL_API_KEY going forward. See OPERATIONS.md for the full migration runbook.

Security: Never commit this value to version control. .dev.vars is already in .gitignore.

9. Deploy

wrangler deploy

Steps 1-9 are one-time setup. After initial deployment, the CD pipeline handles staging and production deploys automatically on every push to main. For the full deploy flow, environment configuration, rollback procedures, and how secrets map across Worker runtime, GitHub CI, and local development, see OPERATIONS.md.

Development

See CONTRIBUTING.md for local dev setup, test conventions, and contribution guidelines.

See OPERATIONS.md for deployment, rollback, and environment setup.

Staging

wrangler.toml includes an [env.staging] configuration with its own R2 bucket and KV namespace. Before deploying, you must create those resources in your own Cloudflare account (same as steps 2-3 above, but scoped to staging).

Create the staging KV namespace:

wrangler kv namespace create KV --env staging

Update the id field under [env.staging.kv_namespaces] in wrangler.toml with the returned ID.

Create the staging R2 bucket:

wrangler r2 bucket create wrl-captures-staging

Then deploy to staging:

wrangler deploy --env staging

Staging auto-deploys on merge to main via deploy-staging.yml. Secrets must be set separately for the staging environment:

wrangler secret put CAPTURE_API_KEY --env staging
wrangler secret put SIGNING_KEY --env staging
# repeat for any other secrets

Smoke tests run against a live deployment. Set SMOKE_URL and SMOKE_API_KEY, then:

npm run smoke

Roadmap

WRL follows a three-act development plan:

  1. Solid Foundation (complete) -- List endpoint, key versioning, CORS, security hardening. Closes the trust gaps for single-operator use.
  2. Evidence-Grade (in progress) -- RFC 3161 timestamps, per-tenant keys (complete), audit logging. Makes "evidence" independently verifiable.
  3. Infrastructure -- MCP server, web UI, batch capture. Expands WRL into a platform other tools build on.

See docs/backlog.md for the full roadmap and GitHub issues for detailed tracking.

Built with despicable-agents

WRL was built using despicable-agents, a multi-agent orchestration framework. Every phase of development is documented in docs/evolution/ -- the prompts, decisions, and outcomes are all there.

Reference

Key Rotation

Key rotation is safe -- old captures continue to verify after rotation. Every time a capture is signed, the signing key is archived automatically. Each key is identified by a keyId: the first 8 hex characters of the SHA-256 of the raw 32-byte public key. The keyId is stored in the WACZ bundle's signedData.signatures array (v0.2.0) or signedData directly (v0.1.0 legacy) and in the KV capture record. During verification, the system looks up the correct historical key by keyId rather than assuming the current key.

Rotation procedure:

  1. Generate a new key pair: node scripts/generate-signing-key.js
  2. Update the production secret: wrangler secret put SIGNING_KEY
  3. Update local dev secret in .dev.vars (if applicable)

New captures are signed with the new key. Existing captures are verified against the archived key that signed them. The /.well-known/signing-keys endpoint lists the full key archive for third-party verifiers.

Public Key Endpoint

GET /.well-known/signing-key returns the current Ed25519 public key. Third-party verifiers can fetch the key without trusting the publicKey embedded in individual WACZ bundles. Responses are cached for 1 hour at the edge.

{
  "algorithm": "Ed25519",
  "publicKey": "<base64-encoded raw 32-byte key>",
  "keyId": "<8-char hex fingerprint>"
}

keyId is the first 8 hex characters of the SHA-256 of the raw public key bytes. Use it to match against the key archive when verifying historical captures.
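A verifier can recompute the fingerprint from the publicKey field and compare it with keyId. The sketch follows the definition above; the function name key_id_from_public_key is illustrative.

```python
import base64
import hashlib


def key_id_from_public_key(public_key_b64: str) -> str:
    """Return the first 8 hex characters of SHA-256 over the raw 32-byte public key."""
    raw = base64.b64decode(public_key_b64, validate=True)
    if len(raw) != 32:
        raise ValueError(f"expected a 32-byte Ed25519 public key, got {len(raw)} bytes")
    return hashlib.sha256(raw).hexdigest()[:8]
```

If the recomputed value differs from the keyId in the response, the key material and its label do not belong together and the response should not be trusted.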

Key Archive Endpoint

GET /.well-known/signing-keys lists all historical signing keys. Use this endpoint to verify captures signed with any key, not just the current one.

{
  "keys": [
    {
      "keyId": "<8-char hex fingerprint>",
      "algorithm": "Ed25519",
      "publicKey": "<base64-encoded raw 32-byte key>",
      "archivedAt": "<ISO 8601 timestamp>"
    }
  ]
}

Third-party verifiers: match the keyId from a WACZ bundle's signedData (v0.1.0) or signedData.signatures array (v0.2.0) against this list to retrieve the correct public key for signature verification. This endpoint has the same rate limit as /.well-known/signing-key.
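Looking up the right key in the archive response is a simple scan. The sketch below works on the parsed JSON in the shape shown above; find_archived_key is an illustrative name, and extracting the keyId from the bundle itself is left out.

```python
from typing import Optional


def find_archived_key(archive: dict, key_id: str) -> Optional[dict]:
    """Return the archive entry whose keyId matches, or None if it is absent."""
    for entry in archive.get("keys", []):
        if entry.get("keyId") == key_id:
            return entry
    return None
```

A None result means the bundle was signed with a key this deployment has never archived, which a verifier should treat as a failure rather than fall back to the current key.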

Health Endpoint

GET /health returns the current service status and legal document URLs.

{ "status": "ok", "legal": { "terms": "<url>", "policy": "<url>" } }

Useful for uptime monitoring and smoke tests.

Response Headers

Responses carry the following headers:

  • Link -- points to the terms-of-service URL with rel="terms-of-service". Present on every response.
  • Strict-Transport-Security -- includes preload and includeSubDomains. Present on every response.
  • X-RateLimit-Limit -- the rate limit ceiling for the endpoint. Present on responses from rate-limited endpoints (captures, verification, signing key endpoints).
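A smoke test might assert the two always-present headers along these lines. The sketch checks only what is listed above (the HSTS directives and the rel value of the Link header); the function name is illustrative, and header names are compared case-insensitively.

```python
from typing import Dict, List


def missing_header_expectations(headers: Dict[str, str]) -> List[str]:
    """Return a list of problems found in a response's headers (empty if none)."""
    lowered = {name.lower(): value for name, value in headers.items()}
    problems = []

    hsts = lowered.get("strict-transport-security", "")
    # Directive names are case-insensitive; drop any "=value" part before comparing.
    directives = {part.strip().split("=", 1)[0].lower() for part in hsts.split(";") if part.strip()}
    for required in ("preload", "includesubdomains"):
        if required not in directives:
            problems.append(f"Strict-Transport-Security lacks {required}")

    if 'rel="terms-of-service"' not in lowered.get("link", ""):
        problems.append('Link header lacks rel="terms-of-service"')
    return problems
```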

Legal

By using this API, you agree to the Terms of Service.

License

Apache 2.0
