Add zstd compression for JSONL transcript storage#514

Open

evisdren wants to merge 3 commits into main from worktree-cached-spinning-nebula
Conversation

@evisdren
Contributor

Summary

  • Add zstd compression to JSONL transcripts before storing as git blobs, reducing object sizes 10-15x and dramatically improving git push times for the entire/checkpoints/v1 branch
  • New compression package with Compress/Decompress helpers using klauspost/compress/zstd
  • All write paths (committed + shadow branch) now compress transcripts; all read paths try compressed first with uncompressed fallback for backward compatibility
  • New entire optimize [--apply] migration command to rewrite existing uncompressed data on the metadata branch using ApplyTreeChanges tree surgery
  • Compression and storage benchmarks for throughput, ratio, and simulated push payload

Changes

New files:

  • compression/zstd.go — Compress/Decompress/IsCompressedName/CompressedName
  • compression/zstd_test.go — round-trip, empty, large, concurrent, invalid data tests
  • compression/zstd_bench_test.go — benchmarks at 1KB–25MB
  • checkpoint/compression_bench_test.go — end-to-end write, read, push payload, migration benchmarks
  • optimize.go — entire optimize [--apply] command (dry-run by default)

Modified files:

  • paths/paths.go — TranscriptCompressedFileName, CompressedSuffix constants
  • agent/chunking.go — ChunkCompressed/ReassembleCompressed for raw byte splitting
  • checkpoint/committed.go — compressed writes/reads for transcripts + subagents
  • checkpoint/temporary.go — compressed shadow branch writes + fallback reads
  • strategy/common.go, manual_commit_condensation.go, manual_commit_hooks.go — compressed-first reads
  • root.go — register optimize command
  • go.mod — promote klauspost/compress from indirect to direct dependency
  • Test files updated to handle .zst format

Test plan

  • mise run fmt — clean
  • mise run lint — 0 issues
  • mise run test:ci — all unit + integration tests pass
  • Run compression benchmarks: go test -bench=BenchmarkCompress -benchmem ./cmd/entire/cli/compression/...
  • Run storage benchmarks: go test -bench=BenchmarkSimulatedPushPayload -benchmem ./cmd/entire/cli/checkpoint/...
  • Manual: checkpoint + commit → verify full.jsonl.zst on metadata branch tree
  • Manual: read back old uncompressed checkpoints → verify backward-compat

🤖 Generated with Claude Code

Compress JSONL transcripts with zstd before storing as git blobs,
reducing object sizes 10-15x. This dramatically improves git push
times for the entire/checkpoints/v1 branch.

- Add compression package (zstd.go) with Compress/Decompress helpers
- Add ChunkCompressed/ReassembleCompressed for compressed byte splitting
- Compress transcripts in all write paths (committed + shadow branch)
- Add compressed-first reads with uncompressed fallback (backward compat)
- Add `entire optimize` command to migrate existing uncompressed data
- Add compression and storage benchmarks
- Promote klauspost/compress from indirect to direct dependency

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: e65d9bec561a
Copilot AI review requested due to automatic review settings February 26, 2026 03:37
@evisdren evisdren requested a review from a team as a code owner February 26, 2026 03:37
@cursor

cursor bot commented Feb 26, 2026

PR Summary

Medium Risk
Changes core checkpoint persistence/read logic and on-disk formats (new .zst paths and chunking semantics); backward-compat fallbacks help, but any mismatch could make transcripts unreadable or break migration.

Overview
Transcripts and subagent JSONL artifacts are now stored zstd-compressed on entire/checkpoints/v1 (and shadow/temporary checkpoints), using full.jsonl.zst / .zst-suffixed agent-*.jsonl and chunking the compressed bytes when blobs exceed MaxChunkSize.

Read paths were updated to prefer compressed content with transparent decompression, while retaining fallbacks for legacy uncompressed/chunked formats; metadata directory ingestion also compresses .jsonl files and adjusts tree paths accordingly. A new entire optimize command performs a dry-run (default) or --apply migration that rewrites existing uncompressed transcript blobs to .zst, and tests/benchmarks were added/updated to validate and measure compression behavior.

Written by Cursor Bugbot for commit ae08fa2.

Contributor

Copilot AI left a comment

Pull request overview

This PR adds zstd compression to JSONL transcript storage to reduce git object sizes 10-15x and improve git push performance for the entire/checkpoints/v1 metadata branch. It introduces a new compression package, updates all write paths to compress transcripts, adds backward-compatible reads with compressed-first fallback, and provides an entire optimize migration command to compress existing uncompressed data.

Changes:

  • New compression package with zstd Compress/Decompress functions and helper utilities
  • Updated checkpoint storage (committed.go, temporary.go) to compress all JSONL transcripts and subagent transcripts before writing
  • New agent chunking functions (ChunkCompressed/ReassembleCompressed) for splitting compressed binary data at byte boundaries
  • Updated all read paths in strategy package to try compressed format first with uncompressed fallback
  • New entire optimize command for migrating existing uncompressed transcripts using tree surgery
  • Updated integration tests to handle both compressed and uncompressed transcript formats

Reviewed changes

Copilot reviewed 19 out of 19 changed files in this pull request and generated 11 comments.

Show a summary per file
File Description
compression/zstd.go Core compression/decompression using klauspost/compress/zstd
compression/zstd_test.go Unit tests for compression round-trips and edge cases
compression/zstd_bench_test.go Benchmarks for compression throughput and ratios
checkpoint/compression_bench_test.go End-to-end benchmarks for write/read/migration scenarios
checkpoint/committed.go Compressed writes for transcripts with chunking support; compressed-first reads
checkpoint/temporary.go Compressed writes for shadow branch; fallback reads with decompression
checkpoint/checkpoint_test.go Updated tests to decompress subagent transcripts
agent/chunking.go New ChunkCompressed/ReassembleCompressed for binary data splitting
strategy/common.go Compressed-first reads with fallback for getTaskTranscriptFromTree
strategy/manual_commit_hooks.go Compressed-first reads in sessionHasNewContent
strategy/manual_commit_condensation.go Compressed-first reads in extractSessionData
optimize.go New migration command with dry-run default and tree surgery implementation
root.go Command registration for optimize
paths/paths.go Added TranscriptCompressedFileName and CompressedSuffix constants
integration_test/testenv.go Helper to read compressed or uncompressed transcripts
integration_test/*.go Updated tests to check for both compressed and uncompressed formats
go.mod Promoted klauspost/compress from indirect to direct dependency

Comment on lines +160 to +199
// ChunkCompressed splits compressed (binary) data into chunks at raw byte boundaries.
// Unlike ChunkJSONL which respects line boundaries, this splits at arbitrary byte offsets
// since the consumer will reassemble the raw bytes before decompression.
func ChunkCompressed(data []byte, maxSize int) [][]byte {
	if len(data) <= maxSize {
		return [][]byte{data}
	}

	var chunks [][]byte
	for len(data) > 0 {
		end := maxSize
		if end > len(data) {
			end = len(data)
		}
		chunk := make([]byte, end)
		copy(chunk, data[:end])
		chunks = append(chunks, chunk)
		data = data[end:]
	}
	return chunks
}

// ReassembleCompressed concatenates raw byte chunks back into a single buffer.
func ReassembleCompressed(chunks [][]byte) []byte {
	if len(chunks) == 0 {
		return nil
	}
	if len(chunks) == 1 {
		return chunks[0]
	}
	total := 0
	for _, c := range chunks {
		total += len(c)
	}
	result := make([]byte, 0, total)
	for _, c := range chunks {
		result = append(result, c...)
	}
	return result
}

Copilot AI Feb 26, 2026

The new functions ChunkCompressed and ReassembleCompressed in this file lack test coverage. These functions handle critical binary data chunking for compressed transcripts, and should have tests to verify:

  • Correct chunking at byte boundaries
  • Round-trip preservation (chunk then reassemble yields original data)
  • Edge cases (empty data, single chunk, data size exactly at maxSize, data size just over maxSize)
  • Proper handling when maxSize is very small

Since other chunking functions in this file (ChunkJSONL, ReassembleJSONL) have comprehensive test coverage in chunking_test.go, these new functions should follow the same pattern.

Comment on lines +89 to +145
for entryPath, entry := range entries {
	if !isUncompressedTranscript(entryPath) {
		continue
	}

	// Read the blob content
	blob, err := repo.BlobObject(entry.Hash)
	if err != nil {
		continue
	}

	reader, err := blob.Reader()
	if err != nil {
		continue
	}

	content := make([]byte, blob.Size)
	n, readErr := io.ReadFull(reader, content)
	_ = reader.Close()
	if readErr != nil && n == 0 {
		continue
	}
	content = content[:n]

	originalSize := int64(len(content))

	// Compress the content
	compressed, err := compression.Compress(content)
	if err != nil {
		continue
	}

	compressedSize := int64(len(compressed))
	totalOriginalSize += originalSize
	totalCompressedSize += compressedSize
	filesCompressed++

	if apply {
		// Create compressed blob
		blobHash, err := checkpoint.CreateBlobFromContent(repo, compressed)
		if err != nil {
			continue
		}

		// Delete old uncompressed entry
		changes = append(changes, checkpoint.TreeChange{
			Path:  entryPath,
			Entry: nil, // delete
		})

		// Add new compressed entry
		compressedPath := entryPath + paths.CompressedSuffix
		changes = append(changes, checkpoint.TreeChange{
			Path:  compressedPath,
			Entry: &object.TreeEntry{Mode: filemode.Regular, Hash: blobHash},
		})
	}

Copilot AI Feb 26, 2026

The error handling in the blob reading loop silently continues on any error, which means if blob reading fails or compression fails for legitimate reasons (e.g., I/O errors, corrupted data), the command will skip those files without informing the user. This could mask real problems.

At minimum, failed compressions should be logged (they're operational issues, not user content). Consider:

  • Logging warnings for failed blob reads (may indicate repo corruption)
  • Logging warnings for compression failures (unexpected but should be visible)
  • Possibly collecting and reporting error counts at the end of the operation

This is particularly important because the dry-run output shows savings estimates but doesn't mention any files that were skipped due to errors, which could give an inaccurate picture.

Comment on lines +1095 to +1104
// Try compressed format first
compressedPath := sessionDir + "/" + paths.TranscriptCompressedFileName
if file, fileErr := tree.File(compressedPath); fileErr == nil {
	content, contentErr := file.Contents()
	if contentErr == nil {
		decompressed, decompressErr := compression.Decompress([]byte(content))
		if decompressErr == nil {
			return decompressed, nil
		}
	}

Copilot AI Feb 26, 2026

This code only tries to read the base compressed file (full.jsonl.zst) but doesn't handle chunked compressed files (e.g., full.jsonl.zst, full.jsonl.zst.001, full.jsonl.zst.002).

When a compressed transcript exceeds MaxChunkSize (50MB), it's split into chunks. However, this read path will fail to reassemble those chunks - it will only read the first chunk file.

The code should use the same chunked reading logic that exists in readTranscriptFromTree in committed.go (specifically the readCompressedTranscript helper function), which properly handles both single and chunked compressed files.

Suggested change

// Try compressed format first
compressedPath := sessionDir + "/" + paths.TranscriptCompressedFileName
if file, fileErr := tree.File(compressedPath); fileErr == nil {
	content, contentErr := file.Contents()
	if contentErr == nil {
		decompressed, decompressErr := compression.Decompress([]byte(content))
		if decompressErr == nil {
			return decompressed, nil
		}
	}

// Try compressed format first (supports single-file and chunked transcripts)
baseCompressedPath := sessionDir + "/" + paths.TranscriptCompressedFileName
var compressedPaths []string
filesIter := tree.Files()
err = filesIter.ForEach(func(f *object.File) error {
	if strings.HasPrefix(f.Name, baseCompressedPath) {
		compressedPaths = append(compressedPaths, f.Name)
	}
	return nil
})
if err != nil {
	return nil, fmt.Errorf("failed to iterate tree files: %w", err)
}
if len(compressedPaths) > 0 {
	// Ensure chunks are read in order: base file, then .001, .002, ...
	sort.Strings(compressedPaths)
	var combined []byte
	for _, path := range compressedPaths {
		file, fileErr := tree.File(path)
		if fileErr != nil {
			return nil, fmt.Errorf("failed to read compressed transcript chunk %s: %w", path, fileErr)
		}
		content, contentErr := file.Contents()
		if contentErr != nil {
			return nil, fmt.Errorf("failed to read contents of compressed transcript chunk %s: %w", path, contentErr)
		}
		combined = append(combined, []byte(content)...)
	}
	decompressed, decompressErr := compression.Decompress(combined)
	if decompressErr == nil {
		return decompressed, nil
	}
	// If decompression fails, fall back to trying uncompressed formats below.

Comment on lines +1068 to +1076
// Try compressed format first, then uncompressed, then legacy
if file, fileErr := tree.File(metadataDir + "/" + paths.TranscriptCompressedFileName); fileErr == nil {
	hasTranscriptFile = true
	if content, contentErr := file.Contents(); contentErr == nil {
		if decompressed, decompressErr := compression.Decompress([]byte(content)); decompressErr == nil {
			transcriptLines = countTranscriptItems(state.AgentType, string(decompressed))
		}
	}
} else if file, fileErr := tree.File(metadataDir + "/" + paths.TranscriptFileName); fileErr == nil {

Copilot AI Feb 26, 2026

This code only tries to read the base compressed file but doesn't handle chunked compressed files. When a compressed transcript exceeds MaxChunkSize (50MB), it's stored as multiple chunk files (full.jsonl.zst, full.jsonl.zst.001, etc.), but this code will only read the first chunk.

The same issue exists in strategy/common.go:1095-1105. The proper solution is to iterate through tree entries to find all chunks with the compressed transcript base name, sort them, read them all, reassemble the compressed data, then decompress.
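
The shared helper the review proposes could look roughly like this, sketched over a plain path→content map rather than a go-git tree so it stays self-contained (function and variable names are assumptions):

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// reassembleChunks gathers every tree entry whose name starts with base
// (the base file plus numbered chunks like base+".001"), sorts them so the
// base file comes first, and concatenates their contents. The caller then
// decompresses the combined bytes. The files map stands in for a tree listing.
func reassembleChunks(files map[string][]byte, base string) ([]byte, bool) {
	var names []string
	for name := range files {
		if strings.HasPrefix(name, base) {
			names = append(names, name)
		}
	}
	if len(names) == 0 {
		return nil, false
	}
	// Lexicographic order works: "full.jsonl.zst" < ".zst.001" < ".zst.002".
	sort.Strings(names)
	var combined []byte
	for _, n := range names {
		combined = append(combined, files[n]...)
	}
	return combined, true
}

func main() {
	files := map[string][]byte{
		"s1/full.jsonl.zst":     []byte("AAA"),
		"s1/full.jsonl.zst.001": []byte("BBB"),
		"s1/full.jsonl.zst.002": []byte("CCC"),
		"s1/other.txt":          []byte("x"),
	}
	data, ok := reassembleChunks(files, "s1/full.jsonl.zst")
	fmt.Println(ok, string(data))
}
```

Sharing one helper like this across strategy/common.go, manual_commit_hooks.go, and temporary.go would keep the chunk-handling logic in a single place.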

Comment on lines 599 to +610
// Fall back to direct file access (for backwards compatibility)
// Try compressed first
compressedPath := metadataDir + "/" + paths.TranscriptCompressedFileName
if file, fileErr := tree.File(compressedPath); fileErr == nil {
	content, contentErr := file.Contents()
	if contentErr == nil {
		decompressed, decompressErr := compression.Decompress([]byte(content))
		if decompressErr == nil {
			return decompressed, nil
		}
	}
}

Copilot AI Feb 26, 2026

The fallback path (lines 600-610) only reads the base compressed file and doesn't handle chunks. If readTranscriptFromTree fails but a chunked compressed transcript exists, this fallback will only read the first chunk.

However, since line 593 uses readTranscriptFromTree which properly handles chunked compressed files, this fallback should only be reached in edge cases. Still, for consistency and correctness, the fallback should either:

  1. Also handle chunks (iterate through possible chunk files)
  2. Be documented as intentionally handling only non-chunked files
  3. Log a warning if it's used (indicating an unexpected code path)

Comment on lines +174 to +185
now := plumbing.NewHashReference(refName, plumbing.ZeroHash) // placeholder
_ = now

commitObj := &object.Commit{
	Author:    object.Signature{Name: authorName, Email: authorEmail},
	Committer: object.Signature{Name: authorName, Email: authorEmail},
	Message:   fmt.Sprintf("Optimize: compress %d transcript files\n", filesCompressed),
	TreeHash:  newTreeHash,
	ParentHashes: []plumbing.Hash{
		ref.Hash(),
	},
}

Copilot AI Feb 26, 2026

The code creates a placeholder reference that is never used. Lines 174-175 create a now variable with plumbing.NewHashReference and immediately discard it with _ = now. This serves no purpose.

The commit signature should use time.Now() for the timestamp. Looking at other commit creation code in the codebase (e.g., temporary.go:780-786), the pattern is to create a time.Time value and use it in the object.Signature.When field. However, this code doesn't set the When field at all, which means the commit will have a zero timestamp.

Add proper timestamp handling:

now := time.Now()
commitObj := &object.Commit{
    Author:    object.Signature{Name: authorName, Email: authorEmail, When: now},
    Committer: object.Signature{Name: authorName, Email: authorEmail, When: now},
    // ...
}

Comment on lines +354 to +358
chunks := agent.ChunkCompressed(compressed, agent.MaxChunkSize)
baseName := paths.TranscriptCompressedFileName
if compressErr != nil {
	baseName = paths.TranscriptFileName
}

Copilot AI Feb 26, 2026

When compression fails and the code falls back to storing uncompressed data, it still uses agent.ChunkCompressed which is designed for binary data and splits at arbitrary byte boundaries. This is incorrect for uncompressed JSONL data.

For uncompressed JSONL, the code should use agent.ChunkTranscript (as was done before this change) which respects line boundaries and maintains JSONL validity. Using ChunkCompressed on uncompressed JSONL will split JSON objects mid-line, making the chunks invalid JSONL.

The fix should:

  • Use agent.ChunkTranscript when compressErr != nil
  • Use agent.ChunkCompressed only when compression succeeded
  • Detect agent type from the transcript content when using ChunkTranscript

Comment on lines +1331 to +1342
// readTranscriptContent reads the transcript content for a checkpoint, trying compressed first.
func (env *TestEnv) readTranscriptContent(checkpointID string) string {
	env.T.Helper()

	// Try compressed format first
	compressedPath := SessionFilePath(checkpointID, paths.TranscriptCompressedFileName)
	if compressedContent, found := env.ReadFileFromBranch(paths.MetadataBranchName, compressedPath); found {
		decompressed, err := compression.Decompress([]byte(compressedContent))
		if err == nil {
			return string(decompressed)
		}
	}

Copilot AI Feb 26, 2026

This helper function only reads the base compressed file and doesn't handle chunked compressed transcripts. If a test creates a transcript larger than MaxChunkSize (50MB), it will be stored as multiple chunks but this helper will only read the first chunk.

While current tests likely don't create transcripts large enough to trigger chunking, this could cause subtle test failures in the future if someone adds a test with large transcripts. Consider either:

  1. Adding chunk-handling logic (iterate through numbered suffixes)
  2. Documenting that this helper doesn't support chunked transcripts
  3. Making the helper use the same readTranscriptFromTree logic that the production code uses

Comment on lines +470 to +477
// Fall back to shadow branch copy — try compressed first, then uncompressed
if file, fileErr := tree.File(metadataDir + "/" + paths.TranscriptCompressedFileName); fileErr == nil {
	if content, contentErr := file.Contents(); contentErr == nil {
		fullTranscript = content
		if decompressed, decompressErr := compression.Decompress([]byte(content)); decompressErr == nil {
			fullTranscript = string(decompressed)
		}
	}
} else if file, fileErr := tree.File(metadataDir + "/" + paths.TranscriptFileNameLegacy); fileErr == nil {
	if content, contentErr := file.Contents(); contentErr == nil {
		fullTranscript = content
	}

Copilot AI Feb 26, 2026

This code only tries to read the base compressed file but doesn't handle chunked compressed files. When a compressed transcript exceeds MaxChunkSize (50MB), it's stored as multiple chunk files, but this code will only read the first chunk.

The same issue exists in strategy/common.go:1095-1105 and strategy/manual_commit_hooks.go:1068-1076. A helper function should be created to properly read compressed transcripts (with chunk handling) that can be shared across all these call sites.

Comment on lines +21 to +39
var dryRunFlag bool

cmd := &cobra.Command{
	Use:   "optimize",
	Short: "Optimize stored checkpoint data",
	Long: `Compress existing uncompressed transcript data on the entire/checkpoints/v1 branch.

New checkpoints are automatically compressed with zstd. This command migrates
older uncompressed data to the compressed format, reducing storage size and
improving push/pull performance.

Default: dry run that shows what would be compressed and estimated savings.
With --apply, actually compresses the data.`,
	RunE: func(cmd *cobra.Command, _ []string) error {
		return runOptimize(cmd.OutOrStdout(), !dryRunFlag)
	},
}

cmd.Flags().BoolVar(&dryRunFlag, "apply", false, "Actually compress data (default: dry run)")

Copilot AI Feb 26, 2026

The PR description says the command syntax is entire optimize [--apply] but line 39 shows the flag help text says "Actually compress data (default: dry run)" with the flag name as --apply. This is confusing because:

  1. The boolean variable is named dryRunFlag but the flag is named apply
  2. Most CLI tools use --dry-run as a flag to prevent changes, not --apply to enable them
  3. The default behavior (dry run) is non-standard; usually a command performs its action by default and --dry-run prevents it

While the implementation is functionally correct (the negation on line 35 handles the inversion), this design is counterintuitive for users familiar with standard CLI patterns. Consider switching to a --dry-run flag that defaults to false, or keeping --apply but renaming the variable to applyFlag for clarity.


@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 6 potential issues.

Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable autofix in the Cursor dashboard.

Comment @cursor review or bugbot run to trigger another review on this PR

Path: compressedPath,
Entry: &object.TreeEntry{Mode: filemode.Regular, Hash: blobHash},
})
}

Optimize command corrupts chunked transcripts with wrong filenames

High Severity

The optimize command compresses each chunk file individually, appending .zst to the original filename (e.g., full.jsonl.001 → full.jsonl.001.zst). However, the read path expects compressed chunk names in the format full.jsonl.zst.001. The misnamed files are invisible to the compressed reader and, worse, ParseChunkIndex("full.jsonl.001.zst", "full.jsonl") successfully parses index 1, so the uncompressed fallback reader picks them up and returns compressed binary as if it were JSONL text. For chunked transcripts, this causes data corruption on read after optimization.

Additional Locations (1)
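
To illustrate the naming mismatch, here is a hypothetical helper producing the layout the read path expects (for illustration only; not the PR's actual code):

```go
package main

import "fmt"

// compressedChunkName shows the naming the read path expects: the .zst
// suffix attaches to the base transcript name, and the chunk index goes
// after it (full.jsonl.zst.001) — not before it (full.jsonl.001.zst),
// which is what the optimize command currently produces for chunks.
func compressedChunkName(base string, index int) string {
	name := base + ".zst"
	if index == 0 {
		return name
	}
	return fmt.Sprintf("%s.%03d", name, index)
}

func main() {
	fmt.Println(compressedChunkName("full.jsonl", 0)) // full.jsonl.zst
	fmt.Println(compressedChunkName("full.jsonl", 1)) // full.jsonl.zst.001
}
```

A fix along these lines would have the migration strip the old chunk index, compress the reassembled transcript as a whole, and re-chunk the compressed bytes under the expected names.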


agentBlobHash, agentBlobErr := CreateBlobFromContent(s.repo, compressed)
if agentBlobErr == nil {
-	agentPath := taskPath + "agent-" + opts.AgentID + ".jsonl"
+	agentPath := taskPath + "agent-" + opts.AgentID + ".jsonl" + paths.CompressedSuffix

SessionFilePaths.Transcript references non-existent uncompressed file path

Medium Severity

filePaths.Transcript is set to paths.TranscriptFileName (full.jsonl), but writeTranscript now stores the file as paths.TranscriptCompressedFileName (full.jsonl.zst). The CheckpointSummary metadata JSON will contain a path that doesn't correspond to any actual file in the git tree, breaking any consumer that follows this path to locate the transcript.


ParentHashes: []plumbing.Hash{
ref.Hash(),
},
}

Optimize commit created with zero-value timestamp

Medium Severity

The object.Signature for the optimize commit omits the When field, resulting in a git commit with a year-0001 timestamp. The existing createCommit method in the codebase properly sets When: time.Now(). The nearby dead code now := plumbing.NewHashReference(...) / _ = now at lines 174–175 suggests timestamp handling was intended but not completed.


// is updated via the compressedTreePath output parameter (if non-nil).
func createRedactedBlobFromFile(repo *git.Repository, filePath, treePath string) (plumbing.Hash, filemode.FileMode, error) {
	hash, mode, _, err := createRedactedBlobFromFileWithCompression(repo, filePath, treePath)
	return hash, mode, err

Wrapper compresses JSONL but discards effective path

Medium Severity

createRedactedBlobFromFile wraps createRedactedBlobFromFileWithCompression but discards the effective tree path (the third return value). This means .jsonl files get their content compressed, but the caller (copyMetadataDir) stores the compressed blob under the original .jsonl path without the .zst suffix. The parallel functions addDirectoryToEntriesWithAbsPath and addDirectoryToChanges were correctly updated to use the WithCompression variant, but copyMetadataDir was not.

Additional Locations (1)


// Create new commit
authorName, authorEmail := checkpoint.GetGitAuthorFromRepo(repo)
now := plumbing.NewHashReference(refName, plumbing.ZeroHash) // placeholder
_ = now

Dead placeholder code left in optimize command

Low Severity

now := plumbing.NewHashReference(refName, plumbing.ZeroHash) creates a reference object that is immediately discarded with _ = now. This appears to be leftover development scaffolding — the variable name now and the // placeholder comment suggest it was meant to hold a timestamp but was never completed.


// CompressedName returns the filename with a .zst suffix appended.
func CompressedName(name string) string {
	return name + ".zst"
}

Unused exported functions in compression package

Low Severity

IsCompressedName and CompressedName are exported functions that are never called anywhere in the codebase outside their own test file. The codebase uses paths.CompressedSuffix and strings.HasSuffix directly instead of these helpers.


evisdren and others added 2 commits February 27, 2026 12:09
Push perf test (build tag: pushperf) measures real GitHub push times
for entire/checkpoints/v1 using real transcript data from the source
repo, with GIT_TRACE2_EVENT profiling to break down each phase
(ref negotiation, pack+send, remote processing).

Growth model test (build tag: growthmodel) projects data volumes,
push times, and GitHub size limit timelines across team sizes
(10/50/250/1000 devs) and time horizons (1-12 months).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summarizes growth model projections, push profiling results, GitHub
size limit timelines, platform-level storage estimates, and per-developer
unit economics. Includes potential mitigations for data scaling.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>