Merged
75 changes: 37 additions & 38 deletions .github/workflows/checks.yml
@@ -9,7 +9,7 @@ on:
workflow_dispatch:

concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
group: ${{ github.workflow }}-${{ github.head_ref || github.ref }}
cancel-in-progress: true

env:
@@ -80,45 +80,44 @@ jobs:
run: nix develop --command deno bench --allow-all benchmarks/

- name: s3-tests
if: false
run: |
# Run MinIO tests in background
nix develop --command deno run --allow-all x/s3-tests.ts --backend minio &
MINIO_PID=$!

# Run Swift tests in background against SAIO
nix develop --command deno run --allow-all x/s3-tests.ts --backend swift &
SWIFT_PID=$!

# Wait for both and capture exit codes
MINIO_EXIT=0
if ! wait $MINIO_PID; then
MINIO_EXIT=$?
fi

SWIFT_EXIT=0
if [ -n "$SWIFT_PID" ]; then
if ! wait $SWIFT_PID; then
SWIFT_EXIT=$?
fi
fi

# Exit with error if either failed
if [ $MINIO_EXIT -ne 0 ] || [ $SWIFT_EXIT -ne 0 ]; then
echo "One or more compatibility tests failed (MinIO: $MINIO_EXIT, Swift: $SWIFT_EXIT)"
exit 1
set +e

run_minio() {
echo "=== Running s3-tests (MinIO) ==="
nix develop --command deno run --allow-all x/s3-tests.ts --backend minio --no-abort
echo "--- s3-tests/s3-tests.log (MinIO) ---"
cat s3-tests/s3-tests.log || true
echo "--- s3-tests/herald-proxy.log (MinIO) ---"
cat s3-tests/herald-proxy.log || true
}

run_swift() {
echo "=== Running s3-tests (Swift) ==="
nix develop --command deno run --allow-all x/s3-tests.ts --backend swift --no-abort
echo "--- s3-tests/s3-tests-swift.log (Swift) ---"
cat s3-tests/s3-tests-swift.log || true
echo "--- s3-tests/herald-proxy-swift.log (Swift) ---"
cat s3-tests/herald-proxy-swift.log || true
}

run_minio &
pid_minio=$!

run_swift &
pid_swift=$!

wait $pid_minio
status_minio=$?

wait $pid_swift
status_swift=$?

# Fail the step if either failed
if [ $status_minio -ne 0 ] || [ $status_swift -ne 0 ]; then
# exit 1
Comment on lines +85 to +119
⚠️ Potential issue | 🟠 Major

Ensure s3-tests failures fail the workflow.

The final guard comments out exit 1, so the step succeeds even when MinIO/Swift tests fail.

🛠️ Reinstate failure propagation
-          if [ $status_minio -ne 0 ] || [ $status_swift -ne 0 ]; then
-            # exit 1
-          fi
+          if [ $status_minio -ne 0 ] || [ $status_swift -ne 0 ]; then
+            exit 1
+          fi
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
set +e
run_minio() {
echo "=== Running s3-tests (MinIO) ==="
nix develop --command deno run --allow-all x/s3-tests.ts --backend minio --no-abort
echo "--- s3-tests/s3-tests.log (MinIO) ---"
cat s3-tests/s3-tests.log || true
echo "--- s3-tests/herald-proxy.log (MinIO) ---"
cat s3-tests/herald-proxy.log || true
}
run_swift() {
echo "=== Running s3-tests (Swift) ==="
nix develop --command deno run --allow-all x/s3-tests.ts --backend swift --no-abort
echo "--- s3-tests/s3-tests-swift.log (Swift) ---"
cat s3-tests/s3-tests-swift.log || true
echo "--- s3-tests/herald-proxy-swift.log (Swift) ---"
cat s3-tests/herald-proxy-swift.log || true
}
run_minio &
pid_minio=$!
run_swift &
pid_swift=$!
wait $pid_minio
status_minio=$?
wait $pid_swift
status_swift=$?
# Fail the step if either failed
if [ $status_minio -ne 0 ] || [ $status_swift -ne 0 ]; then
exit 1
fi
🤖 Prompt for AI Agents
In @.github/workflows/checks.yml around lines 84 - 118, The workflow currently
swallows s3-tests failures because the final conditional that should abort the
step is commented out; restore failure propagation by re-enabling the exit in
the final check (the block that examines status_minio and status_swift).
Specifically, in the script that defines run_minio and run_swift and captures
pid_minio/pid_swift and status_minio/status_swift, remove the comment before the
exit so the step exits non-zero (e.g., exit 1 or exit
$status_minio/$status_swift) when either status_minio or status_swift is
non-zero.
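
The propagation described above could be sketched as follows. This is a minimal illustration, not the workflow's actual script: `ok_job` and `bad_job` are hypothetical stand-ins for `run_minio` and `run_swift`, and `run_both` is an illustrative name. Note also that bash treats an `if` body containing only a comment as a syntax error, so the guard needs a real command inside it.

```bash
# Hypothetical stand-ins for run_minio / run_swift.
ok_job() { return 0; }
bad_job() { return 3; }

# Run two jobs in parallel and return non-zero if either one failed.
run_both() {
  "$1" &
  pid_a=$!
  "$2" &
  pid_b=$!

  # `wait PID || status=$?` records the job's exit code without tripping `set -e`.
  status_a=0
  wait "$pid_a" || status_a=$?
  status_b=0
  wait "$pid_b" || status_b=$?

  if [ "$status_a" -ne 0 ] || [ "$status_b" -ne 0 ]; then
    echo "failed (first: $status_a, second: $status_b)" >&2
    return 1
  fi
}
```

In the workflow step, the equivalent of `run_both run_minio run_swift || exit 1` would make the step fail whenever either backend's tests fail, while still letting both jobs finish and print their logs.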

fi

- name: prune uv cache
run: nix develop --command uv cache prune --ci

- name: failure logs
if: failure()
run: |
echo "--- s3-tests/s3-tests.log (MinIO) ---"
cat s3-tests/s3-tests.log || true
echo "--- s3-tests/s3-tests-swift.log (Swift) ---"
cat s3-tests/s3-tests-swift.log || true
echo "--- s3-tests/herald-proxy.log ---"
cat s3-tests/herald-proxy.log || true
echo "--- s3-tests/herald-proxy-swift.log ---"
cat s3-tests/herald-proxy-swift.log || true
1 change: 0 additions & 1 deletion .gitignore
@@ -88,4 +88,3 @@ token
*.db-shm
*.db-wal
.vscode
symlinks
2 changes: 2 additions & 0 deletions AGENTS.md
@@ -31,3 +31,5 @@

- Always fix deno lint and deno check issues before running tests; the type
system is there to help.
- Never use `--no-check`. Treat the codebase like a Rust codebase. Live and die
by the type system.
158 changes: 157 additions & 1 deletion README.md
@@ -24,6 +24,37 @@ Herald is an S3 proxy that supports:
- Backend routing based on bucket names.
- Flexible bucket mapping with glob support.

## Quick start

Run Herald in Docker with env-only config (no YAML). Point it at an
S3-compatible backend (e.g. [MinIO](https://min.io)) and use any S3 client
against Herald.

```bash
# Start Herald (default backend: S3 at host's MinIO). Port 3000.
docker run -p 3000:3000 \
-e HERALD_DEFAULT_PROTOCOL=s3 \
-e HERALD_DEFAULT_ENDPOINT=http://host.docker.internal:9000 \
-e HERALD_DEFAULT_REGION=us-east-1 \
-e HERALD_DEFAULT_ACCESS_KEY_ID=minioadmin \
-e HERALD_DEFAULT_SECRET_ACCESS_KEY=minioadmin \
ghcr.io/expnt/herald:latest
```

Use the AWS CLI (or any S3 client) with Herald as the endpoint. The S3 API is
mounted at `/s3`; use path-style addressing so that the bucket and key appear in
the URL path.

```bash
# List buckets via Herald
aws s3 ls --endpoint-url http://localhost:3000/s3

# List objects in a bucket
aws s3 ls --endpoint-url http://localhost:3000/s3 s3://my-bucket/
```

- **Images:** [ghcr.io/expnt/herald](https://ghcr.io/expnt/herald)
- **Helm chart:** [chart/](chart/) for Kubernetes (chart may be outdated; update
planned).

## Config

Herald is configured via a YAML file (typically `herald.yaml`). The
@@ -52,12 +83,18 @@ backends:
# 1. "*" to match all buckets not claimed by other backends
# 2. A glob pattern like "logs-*"
# 3. A map of bucket definitions for granular control
# Optional: auth for this backend (bucket > backend > global)
auth:
accessKeysRefs: [admin]

buckets:
# Simple bucket mapping (inherits backend settings)
my-bucket: {}

# Mapping with overrides
# Mapping with overrides; bucket-level auth overrides backend/global
external-data:
auth:
accessKeysRefs: [readonly]
# Map proxy bucket "external-data" to backend bucket "data-v1"
bucket_name: data-v1
# Override endpoint for this specific bucket
@@ -85,6 +122,10 @@ backends:
# Route all archive buckets to Swift
buckets: "archive-*"

# Optional: require S3 SigV4 auth for incoming requests (see Auth section)
auth:
accessKeysRefs: [admin, readonly]

cors:
# Global CORS defaults
allowedOrigins: ["*"]
@@ -95,6 +136,44 @@ cors:
credentials: false
```

### Auth (incoming request verification)

Herald can verify incoming S3 requests using AWS Signature Version 4 (SigV4).
When auth is configured, only requests signed with one of the configured access
keys are accepted. Credentials are never stored in the config file; you
reference them by name (_refs_) and supply the actual keys via environment
variables.

#### Precedence

Auth is resolved at three levels with the same precedence as CORS: **Bucket >
Backend > Global**. The most specific definition wins (e.g. a bucket’s `auth`
overrides its backend’s `auth`).
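
The precedence rule above can be sketched as a "first defined level wins" lookup. This is an illustration only: `resolve_auth_refs` is a hypothetical helper, each argument is a comma-separated list of refs, and an empty string is assumed to mean "no `auth` defined at this level" (how Herald treats an explicitly empty list is not stated here).

```bash
# Print the auth refs that apply to a request: bucket first, then backend, then global.
# Usage: resolve_auth_refs "<bucket refs>" "<backend refs>" "<global refs>"
resolve_auth_refs() {
  for level in "$1" "$2" "$3"; do
    if [ -n "$level" ]; then
      printf '%s\n' "$level"
      return 0
    fi
  done
  # No level defines auth, so no SigV4 verification would apply.
  return 1
}
```

For example, `resolve_auth_refs "readonly" "admin" "admin,readonly"` prints `readonly`, because the bucket-level definition is the most specific one.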

#### Config shape

At each level you set `auth.accessKeysRefs`: a list of ref names (strings). Each
ref maps to a pair of env vars:

- `HERALD_AUTH_<REF>_ACCESS_KEY_ID` — access key id
- `HERALD_AUTH_<REF>_SECRET_KEY` — secret key

`<REF>` is the ref name in UPPERCASE (e.g. ref `admin` →
`HERALD_AUTH_ADMIN_ACCESS_KEY_ID`). Only refs that have both env vars set are
used; missing refs are skipped.

Example: global `auth.accessKeysRefs: [admin, readonly]` with
`HERALD_AUTH_ADMIN_ACCESS_KEY_ID`, `HERALD_AUTH_ADMIN_SECRET_KEY` and
`HERALD_AUTH_READONLY_ACCESS_KEY_ID`, `HERALD_AUTH_READONLY_SECRET_KEY` set in
the environment allows requests signed with either key. You can override at
backend or bucket level (e.g. a backend that only accepts `admin`, or a bucket
that only accepts `readonly`).
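
The ref-to-variable mapping described above can be checked with a small shell sketch. Both helper names are hypothetical, and the sketch assumes that an empty value counts as "not set", which the text does not spell out.

```bash
# Print the two env var names for an auth ref (ref name upper-cased, as described above).
auth_ref_env_names() {
  ref_upper=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
  printf '%s\n' "HERALD_AUTH_${ref_upper}_ACCESS_KEY_ID" "HERALD_AUTH_${ref_upper}_SECRET_KEY"
}

# Succeed only if both env vars for the ref are exported and non-empty.
auth_ref_is_usable() {
  for name in $(auth_ref_env_names "$1"); do
    [ -n "$(printenv "$name")" ] || return 1
  done
}
```

Running `auth_ref_is_usable admin` before starting Herald is one way to spot a ref that would be skipped because one of its two variables is missing.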

#### When auth is not configured

If no `auth` is defined at any level for a request, Herald does not perform
SigV4 verification and the request is not gated by these credentials.

### CORS Configuration

Herald supports fine-grained CORS control at three levels with the following
@@ -152,3 +231,80 @@ resolves the backend using the following priority:
backends' `buckets` maps.
3. **Glob match (string)**: If a backend has `buckets: "string-*"`, it checks if
the bucket name matches that pattern.

When several backends could match (e.g. two globs), the **first backend in
config order** wins.
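
The "first backend in config order wins" tie-break can be illustrated with shell glob matching. This sketch covers only the glob step, not the full priority list, and the `name=pattern` argument format and `resolve_backend` name are assumptions made for the example.

```bash
# Print the first backend whose glob matches the bucket.
# Usage: resolve_backend <bucket> <name=pattern>... (arguments in config order)
resolve_backend() {
  bucket=$1
  shift
  for entry in "$@"; do
    name=${entry%%=*}
    pattern=${entry#*=}
    # An unquoted $pattern in a case arm is matched as a glob.
    case $bucket in
      $pattern)
        printf '%s\n' "$name"
        return 0
        ;;
    esac
  done
  return 1
}
```

With `resolve_backend logs-2024 "logs=logs-*" "default=*"`, both patterns match but `logs` is printed because it is listed first; reversing the arguments would print `default`.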

### Environment variable configuration

Configuration can be supplied or overridden via environment variables.
Environment values are merged with the YAML at load time, and the environment
wins when both set the same path. All config-related variables use the `HERALD_`
prefix. `HERALD_<KEY>` applies to the `default` backend, or to the global config
for top-level keys such as auth and CORS; `HERALD_<BACKEND>_<KEY>` applies to
the named backend. Keys are normalised (e.g. `AUTH_URL` → `auth_url`), and
credential keys are placed under `credentials`.

| Var | Purpose | Default |
| --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------- | ------------- |
| `HERALD_CONFIG_PATH` | Path to YAML config file | `herald.yaml` |
| `HERALD_LOG_LEVEL` | Log level (e.g. `DEBUG`, `INFO`) | (none; INFO) |
| `PORT` | HTTP server port | `3000` |
| `HERALD_AUTH_ACCESS_KEYS_REFS` | Global auth: comma-separated ref names | — |
| `HERALD_<BACKEND>_AUTH_ACCESS_KEYS_REFS` | Backend auth: comma-separated ref names | — |
| `HERALD_AUTH_<REF>_ACCESS_KEY_ID` | Access key for auth ref (SigV4) | — |
| `HERALD_AUTH_<REF>_SECRET_KEY` | Secret key for auth ref (SigV4) | — |
| `HERALD_PROTOCOL`, `HERALD_ENDPOINT`, `HERALD_REGION`, `HERALD_BUCKETS` | Default backend (S3) | — |
| `HERALD_<BACKEND>_PROTOCOL`, `HERALD_<BACKEND>_ENDPOINT`, `HERALD_<BACKEND>_REGION`, `HERALD_<BACKEND>_BUCKETS` | Backend (S3) | — |
| `HERALD_<BACKEND>_ACCESS_KEY_ID`, `HERALD_<BACKEND>_SECRET_ACCESS_KEY` | Backend S3 credentials | — |
| `HERALD_<BACKEND>_AUTH_URL`, `HERALD_<BACKEND>_CONTAINER`, `HERALD_<BACKEND>_USERNAME`, `HERALD_<BACKEND>_PASSWORD`, `HERALD_<BACKEND>_PROJECT_NAME`, `HERALD_<BACKEND>_USER_DOMAIN_NAME`, `HERALD_<BACKEND>_PROJECT_DOMAIN_NAME` | Backend (Swift) | — |
| `HERALD_CORS_ALLOWED_ORIGINS`, `HERALD_CORS_ALLOWED_METHODS`, `HERALD_CORS_ALLOWED_HEADERS`, `HERALD_CORS_EXPOSED_HEADERS`, `HERALD_CORS_MAX_AGE`, `HERALD_CORS_CREDENTIALS` | Global CORS (lists comma-separated) | — |
| `HERALD_<BACKEND>_CORS_<KEY>` | Backend CORS (same keys as above) | — |
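
As a rough illustration of the table, an env-only setup might look like the following. The backend values are copied from the Quick start, while the key values and origins are placeholders invented for this sketch.

```bash
# Default backend, using the HERALD_<BACKEND>_<KEY> form with BACKEND=DEFAULT.
export HERALD_DEFAULT_PROTOCOL=s3
export HERALD_DEFAULT_ENDPOINT=http://host.docker.internal:9000
export HERALD_DEFAULT_REGION=us-east-1
export HERALD_DEFAULT_ACCESS_KEY_ID=minioadmin
export HERALD_DEFAULT_SECRET_ACCESS_KEY=minioadmin

# Global auth: two refs, each backed by its own pair of variables (placeholder values).
export HERALD_AUTH_ACCESS_KEYS_REFS=admin,readonly
export HERALD_AUTH_ADMIN_ACCESS_KEY_ID=example-admin-key
export HERALD_AUTH_ADMIN_SECRET_KEY=example-admin-secret
export HERALD_AUTH_READONLY_ACCESS_KEY_ID=example-readonly-key
export HERALD_AUTH_READONLY_SECRET_KEY=example-readonly-secret

# Global CORS: list values are comma-separated (hypothetical origins).
export HERALD_CORS_ALLOWED_ORIGINS=https://app.example.com,https://admin.example.com

# Server settings.
export HERALD_LOG_LEVEL=DEBUG
export PORT=3000
```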

### Health and observability

- **Health:** `GET /health` returns `{ "status": "ok" }`. Use it for
liveness/readiness.
- **Logging:** Set `HERALD_LOG_LEVEL` (e.g. `DEBUG`, `INFO`) to control log
verbosity.
- **Tracing:** Optional OpenTelemetry: set `OTEL_EXPORTER_OTLP_ENDPOINT` (and
`OTEL_SERVICE_NAME`, default `herald`) to export traces to an OTLP collector.
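
A readiness probe built on the health endpoint might look like the sketch below. The helper names, the retry count, and the one-second delay are illustrative choices, and the body comparison assumes the response contains exactly the documented `status` field.

```bash
# Return success if the body is the documented health payload, ignoring whitespace.
health_body_ok() {
  compact=$(printf '%s' "$1" | tr -d '[:space:]')
  [ "$compact" = '{"status":"ok"}' ]
}

# Poll GET /health until it reports ok or the attempts run out.
# Usage: wait_for_herald <base_url> [attempts]
wait_for_herald() {
  base_url=$1
  attempts=${2:-10}
  while [ "$attempts" -gt 0 ]; do
    body=$(curl -fsS "$base_url/health" 2>/dev/null) && health_body_ok "$body" && return 0
    attempts=$((attempts - 1))
    sleep 1
  done
  return 1
}
```

For a container started as in the Quick start, `wait_for_herald http://localhost:3000` would block until Herald answers or roughly ten seconds have passed.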

## Deployment

- **Docker:** Images are published at
[ghcr.io/expnt/herald](https://ghcr.io/expnt/herald). Use env vars (see table
above) or mount a `herald.yaml` and set `HERALD_CONFIG_PATH`.
- **Kubernetes:** A Helm chart is in [chart/](chart/). It may be outdated;
updates are planned.

## Limitations

Herald is an S3 proxy focused on routing, protocol translation, and core object
operations. The following are **not** currently supported (or are partial):

- **Bucket subresources:** Bucket policies (`?policy`), lifecycle
(`?lifecycle`), versioning config (`?versioning`), tagging (`?tagging`), ACLs
(`?acl`), website (`?website`), public access block (`?publicAccessBlock`),
replication, logging, inventory, metrics, ownership controls.
- **Object subresources:** Object ACLs, tagging, legal hold, retention (Object
Lock), S3 Select. Copy Object (`x-amz-copy-source`) and Multi-Object Delete
(`POST ?delete`) are not implemented.
- **Object operations:** GetObjectAttributes (`?attributes`) is not implemented.
Checksum headers (`x-amz-checksum-*`) and conditional requests (`If-Match`,
etc.) are not fully supported.
- **List enhancements:** `encoding-type=url`, special delimiter handling,
ListObjectsV2 `FetchOwner`, and unordered listing may not behave as they do on
S3.
- **Auth & IAM:** No IAM policy evaluation, STS, or web identity federation.
Anonymous access for public buckets/objects is not implemented. Invalid or
missing SigV4 auth may not return 403/400 as expected.
- **Validation & protocol:** Bucket naming rules (length, format) are not
strictly enforced. HTTP 100 Continue (`Expect: 100-continue`) is not
supported. Some error codes and response fields may differ from S3.

For the full list of missing functionality and focus tests (from the s3-tests
suite), see [TODO.md](TODO.md).

## Prior art

- https://github.com/gaul/s3proxy
- https://github.com/ceph/s3-tests