feat: compress termination messages to fit more results in 4KB limit by waveywaves · Pull Request #9682 · tektoncd/pipeline

waveywaves · 2026-03-28T12:15:17Z

Changes

Add optional flate compression for termination messages, gated behind the
enable-termination-message-compression alpha feature flag. When enabled, results
are compressed with Go stdlib compress/flate and base64-encoded before writing
to the termination message path, effectively increasing the usable capacity from
~33 to ~187 results (5.7x) for typical image digest payloads.

The problem

Kubernetes limits termination messages to 4KB per container, and the total across
all containers in a pod is capped at 12KB (divided by container count). With multiple
steps, each container may get as little as 1KB for results (#4808). The sidecar logs
method (TEP-0127) bypasses this but adds a sidecar container to every TaskRun,
which is wasteful when most tasks don't need large results (#8448).

The solution

Compress the JSON payload before writing to the termination message path. The parser
auto-detects compressed messages via a tknz: prefix, maintaining full backward
compatibility with uncompressed messages.

Step writes result → entrypoint reads → JSON → flate compress → base64 → termination message
                                                                              ↓
Reconciler reads ← JSON ← flate decompress ← base64 decode ← termination message

E2E test results (kind cluster, k8s v1.29.2)

Results	Without Compression	With Compression
30	3776 bytes ✅	—
35	>4096 bytes ❌	899 bytes ✅
100	❌	2241 bytes ✅
150	❌	3307 bytes ✅
180	❌	3952 bytes ✅
185	❌	4061 bytes ✅
200	❌	>4096 bytes ❌

Max capacity: ~33 → ~187 results (5.7x improvement)

Implementation

Feature flag: enable-termination-message-compression (alpha, default: false)
- Uses PerFeatureFlag with AlphaAPIFields stability
- Documented in config/config-feature-flags.yaml
- Only meaningful when results-from: termination-message (default)
Write path: pkg/termination/write.go compresses with flate.BestCompression + base64.RawStdEncoding, prefixed with tknz:
- Smart fallback: if compressed output is larger than plain JSON, falls back automatically
- Fixed pre-existing O_TRUNC bug in file rewrite
Parse path: pkg/termination/parse.go auto-detects tknz: prefix, decompresses transparently
Decompression safety: io.LimitReader with 1MB cap + explicit size check
Entrypoint wiring: ConfigMap → pkg/pod/pod.go → cmd/entrypoint/main.go → Entrypointer
Zero new dependencies — Go stdlib only

Test coverage

18 unit tests: round-trip, size comparison, append, auto-detect, corrupt data, capacity benchmark
Feature flag tests (default, all-flags-set, invalid value)
Example TaskRun and Pipeline in examples/v1/*/alpha/

Related Issues

Relates to Enable larger results without a sidecar on every TaskRun #8448 (enable larger results without sidecar)
Relates to Results, TerminationMessage and Containers #4808 (results, termination message and containers)
Relates to Changing the way Result Parameters are stored. #4012 (changing the way result parameters are stored)
Complements TEP-0127 (larger results via sidecar logs)
Short-term measure until TEP-0164 (external result storage via OCI/S3/GCS) addresses the underlying 4KB limitation

/kind feature

Submitter Checklist

Has Docs if any changes are user facing
Has Tests included if any functionality added or changed
pre-commit Passed
Follows the commit message standard
Meets the Tekton contributor standards
Has a kind label
Release notes block below has been updated

Release Notes

Added optional termination message compression (alpha feature flag
enable-termination-message-compression) that uses flate compression to fit
approximately 5.7x more results in the 4KB Kubernetes termination message limit.
The parser auto-detects compressed messages for full backward compatibility.
Zero new dependencies — uses Go stdlib only.

waveywaves · 2026-03-28T13:52:46Z

E2E Test Results — Kind Cluster (k8s v1.29.2)

Deployed Tekton from this branch via ko apply with KO_DOCKER_REPO=kind.local.

Without compression (baseline)

Results	Termination Message	Status
30	3776 bytes	PASS
35	>4096 bytes	FAIL — `Termination message is above max allowed size 4096, caused by large task result.`

Max capacity without compression: ~33 results

With compression (`enable-termination-message-compression: "true"`)

Results	Termination Message	Status
35	899 bytes	PASS
100	2241 bytes	PASS
150	3307 bytes	PASS
175	3849 bytes	PASS
180	3952 bytes	PASS
185	4061 bytes	PASS
200	>4096 bytes	FAIL

Max capacity with compression: ~187 results

Summary

Result type: ~100 byte image digest (gcr.io/project/image-NNN@sha256:64-hex-chars)
Without compression: ~33 results
With compression:    ~187 results
Improvement:         5.7x capacity increase

All results verified intact after round-trip (first/last result values checked, result count matched). The tknz: prefix was present in the termination message, confirming compression was active.

vdemeester

A few things to address before this can move forward.

Should fix

No e2e test with the flag enabled. The unit tests in pkg/termination/ are thorough, but there's no integration test proving the full loop on a real cluster: feature flag → pod builder → entrypoint → compressed write → reconciler parse → results in .status.taskResults. At minimum, add one test to the alpha e2e suite that creates a TaskRun with enough results to exceed 4KB uncompressed and asserts they all appear in status.
No tests in pkg/pod/pod_test.go or pkg/entrypoint/entrypointer_test.go for the new flag plumbing. The termination package is well-covered but the integration points aren't.
OpenAPI spec not regenerated. EnableTerminationMessageCompression was added to the FeatureFlags struct but openapi_generated.go wasn't updated. Run ./hack/update-openapigen.sh.
CRD schema entries missing description. The enableTerminationMessageCompression field is added in 8 places across the CRD YAMLs but without a description, unlike all other feature flags there.

Strategic context

I want to be transparent about the bigger picture here. There is a TEP in progress (TEP-0164) that addresses the root cause — storing results in external backends (OCI/S3/GCS) so the 4KB limit doesn't apply at all. That TEP also unifies results with TEP-0147 artifacts and converges with TEP-0139 trusted artifacts.

That said, TEP-0164 is still in proposal stage and won't land for a while. Users are hitting the 4KB wall today (as the issues you reference show), and this is a reasonable opt-in stop-gap: zero new deps, backward compatible, meaningful capacity improvement.

I'd suggest two things to make the relationship clear:

Add a note to the feature flag description (in config-feature-flags.yaml and docs) that this is a short-term measure, and that external result storage (TEP-0164) will address the underlying limitation.
Reference TEP-0164 in the PR description alongside the other related issues, so the community can see the full picture.

With the technical items above addressed, I'm supportive of this going in as an alpha feature.

vdemeester · 2026-03-30T14:30:03Z

 	}

+	if featureFlags.EnableTerminationMessageCompression {
+		commonExtraEntrypointArgs = append(commonExtraEntrypointArgs, "-compress_termination_message=true")


Must fix: This is not gated on the result extraction method. When results-from: sidecar-logs is active, the sidecar reads results from disk, not the termination message — compression is meaningless and adds confusion.

Suggested change

commonExtraEntrypointArgs = append(commonExtraEntrypointArgs, "-compress_termination_message=true")

if featureFlags.EnableTerminationMessageCompression && !sidecarLogsResultsEnabled {

commonExtraEntrypointArgs = append(commonExtraEntrypointArgs, "-compress_termination_message=true")

}

Done. Added && !sidecarLogsResultsEnabled guard. Also added TestPodBuild_CompressTerminationMessage with 3 subtests covering: enabled+termination-message (arg present), enabled+sidecar-logs (arg skipped), disabled (arg absent).

Thanks @waveywaves - please make sure that this is also documented in the feature flag docs.

Might be worth logging a warning for the operator to catch as this would be an inconsistent configuration that needs to be detected

vdemeester · 2026-03-30T14:30:04Z

+		var wErr error
+		if e.CompressTerminationMessage {
+			wErr = termination.WriteCompressedMessage(e.TerminationPath, output)
+		} else {
+			wErr = termination.WriteMessage(e.TerminationPath, output)
+		}
+		if wErr != nil {


Must fix: This exact if compress { WriteCompressed } else { Write } block is duplicated at L478-485 in readResultsFromDisk. Extract a helper to avoid divergence:

func (e Entrypointer) writeTerminationMessage(path string, results []result.RunResult) error { if e.CompressTerminationMessage { return termination.WriteCompressedMessage(path, results) } return termination.WriteMessage(path, results) }

Done. Extracted writeTerminationMessage helper method. Both call sites now use it. Added TestWriteTerminationMessage_Compression with 2 subtests (plain vs compressed output).

vdemeester · 2026-03-30T14:30:04Z

 	github.com/prometheus/client_golang v1.23.2 // indirect
 	github.com/prometheus/client_model v0.6.2 // indirect
-	github.com/prometheus/common v0.67.5 // indirect
+	github.com/prometheus/common v0.67.5


Nit: Unrelated change — github.com/prometheus/common went from // indirect to direct but nothing in the codebase imports it. Looks like an accidental go mod tidy artifact, please revert.

vdemeester · 2026-03-30T14:30:04Z

-	if err := json.Unmarshal([]byte(msg), &r); err != nil {
-		return nil, fmt.Errorf("parsing message json: %w, msg: %s", err, msg)
+	if err := json.Unmarshal([]byte(jsonMsg), &r); err != nil {
+		return nil, fmt.Errorf("parsing message json: %w, msg: %s", err, jsonMsg)


Nit: On JSON parse failure after decompression, this includes the full decompressed payload in the error message. For large payloads that could produce enormous log entries. Consider truncating to first N bytes.

Done. Truncated to 256 bytes with ...(truncated) suffix.

vdemeester · 2026-03-30T14:30:04Z

+	EnableWaitExponentialBackoff             bool `json:"enableWaitExponentialBackoff,omitempty"`
+	EnableTerminationMessageCompression     bool `json:"enableTerminationMessageCompression,omitempty"`


Nit: Inconsistent alignment — these two fields have extra whitespace padding that doesn't match the rest of the struct.

khrm · 2026-04-07T13:07:11Z

+
+// maxDecompressedSize is the upper bound for decompressed termination messages.
+// The pre-compression JSON was at most 4KB, but with compression we allow
+// larger payloads. 1MB is a safe limit that prevents decompression bombs


Isn't 1MB too generous? How much max payload are we expecting? 128 KB? Or is 1 MB really possible except in case of attack?

Good question. The pre-compression input is capped at 4KB by Kubernetes (termination message limit). With flate, the theoretical max decompressed output from 4KB compressed is about 64KB for typical JSON.

1MB is a safety net against decompression bombs. 128KB would work for legitimate use — happy to lower it. The key constraint: must be bounded (not unlimited) and larger than any legitimate 4KB-compressed payload could produce.

Let's put it to 128 KB

Done — changed to 128KB.

khrm · 2026-04-07T13:30:21Z

+		existing, err := parseExisting(fileContents)
+		if err == nil {
+			pro = append(existing, pro...)
 		}


Why weren't we surfacing error earlier? Is it still valid now to not surface it if error vectors have increased leading to loss of results?

The pre-existing code silently swallows parse errors from the existing file (line 57-59) — this was intentional so a corrupt termination message doesn't prevent a step from completing and writing its own results.

With compression, the error vectors haven't increased: compressed messages auto-detect via the tknz: prefix, and if decompression fails, the same fallback applies (treat as corrupt, overwrite with new results).

If you think we should log a warning when dropping corrupt existing results, I can add that — though it would be a behavior change for the non-compressed path too.

Updated — now logs a warning when existing termination message can't be parsed:

slog.Warn("Failed to parse existing termination message, previous results will be lost", "error", parseErr, "path", path)

This applies to both compressed and non-compressed paths. Results are still written (step doesn't fail), but the data loss is no longer silent.

waveywaves · 2026-04-14T06:50:58Z

/retest

Adds optional zlib compression (gated by enable-termination-message-compression feature flag) to the entrypoint binary's termination message writer, with decompression on the controller side. Falls back to uncompressed format when compression is disabled or the controller encounters uncompressed data. This increases effective result capacity from ~33 to ~187 results in the 4KB termination message limit (5.7x improvement). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

waveywaves · 2026-04-16T14:05:58Z

/retest

afrittoli

Thanks @waveywaves - it looks good, just a few minor comments.

/approve

afrittoli · 2026-04-20T09:49:59Z

 	}

+	if featureFlags.EnableTerminationMessageCompression {
+		commonExtraEntrypointArgs = append(commonExtraEntrypointArgs, "-compress_termination_message=true")


Thanks @waveywaves - please make sure that this is also documented in the feature flag docs.

afrittoli · 2026-04-20T10:23:05Z

+# The task generates 20 image digest results. Without compression, this
+# produces ~5KB of JSON, exceeding the 4KB Kubernetes termination message
+# limit. With compression enabled (~40-50% reduction), it fits comfortably.


NIT: wondering why is this in the no-ci folder? Couldn't we setup the test cluster with the flag enabled?

afrittoli · 2026-04-20T10:26:30Z

 	}

+	if featureFlags.EnableTerminationMessageCompression {
+		commonExtraEntrypointArgs = append(commonExtraEntrypointArgs, "-compress_termination_message=true")


Might be worth logging a warning for the operator to catch as this would be an inconsistent configuration that needs to be detected

tekton-robot · 2026-04-20T10:27:30Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: afrittoli

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details

Needs approval from an approver in each of these files:

~~OWNERS~~ [afrittoli]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

tekton-robot requested review from pritidesai and twoGiants March 28, 2026 12:15

tekton-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Mar 28, 2026

waveywaves requested review from aThorp96 and vdemeester and removed request for pritidesai March 28, 2026 16:39

vdemeester reviewed Mar 30, 2026

View reviewed changes

waveywaves force-pushed the feat/compress-termination-results branch from e4cec69 to bf1e5f0 Compare March 31, 2026 09:02

waveywaves marked this pull request as ready for review April 2, 2026 14:00

tekton-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 2, 2026

tekton-robot requested review from abayer and pritidesai April 2, 2026 14:00

waveywaves force-pushed the feat/compress-termination-results branch from bf1e5f0 to b57cd0e Compare April 2, 2026 23:08

vdemeester self-assigned this Apr 7, 2026

khrm reviewed Apr 7, 2026

View reviewed changes

waveywaves force-pushed the feat/compress-termination-results branch from b57cd0e to 23c80a4 Compare April 9, 2026 09:47

waveywaves force-pushed the feat/compress-termination-results branch from 23c80a4 to bf8451c Compare April 14, 2026 09:33

tekton-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Apr 14, 2026

waveywaves force-pushed the feat/compress-termination-results branch from bf8451c to 9a62b83 Compare April 16, 2026 11:12

tekton-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Apr 16, 2026

afrittoli reviewed Apr 20, 2026

View reviewed changes

tekton-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 20, 2026

		EnableWaitExponentialBackoff bool `json:"enableWaitExponentialBackoff,omitempty"`
		EnableTerminationMessageCompression bool `json:"enableTerminationMessageCompression,omitempty"`

Conversation

waveywaves commented Mar 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Changes

The problem

The solution

E2E test results (kind cluster, k8s v1.29.2)

Implementation

Test coverage

Related Issues

Submitter Checklist

Release Notes

Uh oh!

waveywaves commented Mar 28, 2026

E2E Test Results — Kind Cluster (k8s v1.29.2)

Without compression (baseline)

With compression (enable-termination-message-compression: "true")

Summary

Uh oh!

vdemeester left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Should fix

Strategic context

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

khrm Apr 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

waveywaves commented Apr 14, 2026

Uh oh!

waveywaves commented Apr 16, 2026

Uh oh!

afrittoli left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

tekton-robot commented Apr 20, 2026

Uh oh!

waveywaves commented Mar 28, 2026 •

edited

Loading

With compression (`enable-termination-message-compression: "true"`)

vdemeester left a comment •

edited

Loading

khrm Apr 7, 2026 •

edited

Loading