Skip to content

fix: hide swift multipart upload parts#89

Open
tewodrosdy wants to merge 1 commit into
mainfrom
fix/swift-multipart-parts-hidden
Open

fix: hide swift multipart upload parts#89
tewodrosdy wants to merge 1 commit into
mainfrom
fix/swift-multipart-parts-hidden

Conversation

@tewodrosdy
Copy link
Copy Markdown

@tewodrosdy tewodrosdy commented May 21, 2026

  • Fixes Swift multipart uploads so uploaded parts are stored under Herald internal state instead of the user object prefix. Completed multipart uploads now expose only the final object in S3 listings, not raw part objects such as <object>/1.
  • Swift SLO segment objects must remain available after CompleteMultipartUpload, so deleting parts immediately is not safe. The safer approach is to store segments in Herald’s internal namespace, build the SLO manifest from those internal paths, hide internal state from ListObjectsV2, and clean segments through abort/delete paths.

Migration notes

No user-facing migration is required. Existing multipart upload API usage stays the same.


  • The change comes with new or modified tests
  • Hard-to-understand functions have explanatory comments
  • End-user documentation is updated to reflect the change

Summary by CodeRabbit

  • New Features

    • Added support for legacy OpenStack Swift v1 authentication with automatic fallback
  • Improvements

    • More robust multipart upload handling and session tracking; completed uploads no longer expose raw part objects
    • Improved handling for large-object deletions and bulk-delete operations
    • Listings now filter internal metadata paths
  • Bug Fixes

    • Deleting a missing object returns a successful (204) response
  • Tests

    • Added/updated tests to verify multipart completion and delete behavior

Review Change Stack

@tewodrosdy tewodrosdy requested a review from Sabian-A May 21, 2026 12:47
@tewodrosdy tewodrosdy self-assigned this May 21, 2026
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 21, 2026

📝 Walkthrough

Walkthrough

Adds legacy Swift v1 auth fallback; refactors manual bulk-delete to derive connection details from swiftUrl; detects and correctly deletes static large objects; tracks multipart part storage paths for manifest writing and targeted cleanup; filters internal state paths from listings; and adds tests verifying multipart completion and delete semantics.

Changes

Swift Backend Multipart and SLO Improvements

Layer / File(s) Summary
Legacy Swift v1 Auth Fallback
src/backends/swift/auth.ts
Added normalizeLegacySwiftStorageUrl and getLegacySwiftV1AuthMeta helpers that extract token and storage URL from response headers. Main auth flow falls back to legacy v1 when primary request returns 404 or 405.
Bulk Delete Refactor & Connection Handling
src/backends/swift/objects.ts
sendManualBulkDeleteRequest now accepts swiftUrl, builds a raw ?bulk-delete request from URL pathname/search, ensures trailing newline and Content-Length, and selects Deno.connectTls vs Deno.connect based on protocol; call sites updated to pass swiftUrl.
Static Large Object Deletion Handling
src/backends/swift/objects.ts, src/backends/swift/mod.ts
deleteObject issues a pre-HEAD to detect x-static-large-object and appends ?multipart-manifest=delete to DELETE for SLO cleanup. convertSwiftDeleteObjectToS3Response treats Swift 404 as S3 204 success.
Multipart Part Storage Model & Constants
src/backends/swift/objects.ts, src/constants/s3.ts
Introduced MPUPart metadata with lastModified and path, helpers to compute per-part storage and manifest paths, and exported MULTIPART_UPLOAD_PARTS_PATH.
Upload Part with Path Tracking
src/backends/swift/objects.ts
uploadPart computes a multipart-specific object path for each part, uploads the part to that path, and records the computed path in the multipart session JSON.
Multipart Completion, Listing, and Abort Cleanup
src/backends/swift/objects.ts
completeMultipartUpload writes SLO manifest entries using computed manifest paths; listParts GETs and validates the multipart session then generates S3 ListParts XML; abortMultipartUpload reads session parts to bulk-delete exact part objects and falls back to prefix listing only when session paths are absent, then deletes the multipart session.
State Path Filtering in Listings
src/backends/swift/utils/mod.ts, src/constants/s3.ts
Import HERALD_STATE_PATH and skip list entries whose name is missing or matches/is under the herald state path prefix to hide internal state objects from S3 listings.
Multipart & Delete Test Additions
tests/swift/basic/multipartupload_test.ts, tests/swift/basic/object_test.ts
Added ListObjectsV2 checks after multipart completion to assert raw part objects under the part-prefix are not exposed; added a test asserting deleting a missing object returns 204.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • expnt/herald#17: Related multipart upload implementation and session/part handling changes.
  • expnt/herald#54: Overlapping Swift multipart refactors affecting listParts/abortMultipartUpload.
  • expnt/herald#20: Related changes to multipart listing and abort cleanup in the Swift backend.

Suggested reviewers

  • Sabian-A
  • hailatGH
  • zifeo
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 5.56% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'fix: hide swift multipart upload parts' clearly and concisely summarizes the main change: hiding multipart upload parts in Swift backend.
Description check ✅ Passed The description covers all three required sections (WHAT, WHY, HOW), includes migration notes, and marks test/comment checkboxes completed; documentation checkbox is appropriately unchecked.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/swift-multipart-parts-hidden

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot added the bug fix label May 21, 2026
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/backends/swift/objects.ts (1)

1500-1629: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Reorder abort cleanup: delete the session file only after the parts bulk-delete succeeds.

The current sequence is: read session (Lines 1500-1518) → DELETE session file (Lines 1520-1552) → build objectsToDelete → bulk-delete parts (Lines 1598-1629). If the bulk-delete step fails, the function returns the error to the caller (Line 1623), but the session JSON has already been removed. A retried abort can no longer recover the part paths from the session, and the backward-compat prefix listing under ${object}/ will return nothing for new-style uploads (parts live under MULTIPART_UPLOAD_PARTS_PATH), so the segments stay in the container as orphans.

This is a reliability regression introduced by making the session the source of truth for part locations. Delete the session file after the bulk-delete has succeeded so the next retry can still locate the parts.

🛡️ Suggested reordering

Move the session-DELETE block (Lines 1520-1552) to run after a successful bulk-delete, e.g.:

-  try {
-    const deleteSessionFunc = async () => { ... };
-    const deleteResponse = await retryWithExponentialBackoff(deleteSessionFunc);
-    ...
-  } catch (error) { ... }
-
   const objectsToDelete = new Set(...);
   if (objectsToDelete.size === 0) { ... }
 
   if (objectsToDelete.size === 0) {
     logger.info(`No parts found for multipart upload ${uploadId}`);
   } else {
     // Bulk delete all parts
     ...
     if (!isOk(bulkDeleteResponseResult)) {
       logger.error(...);
       return bulkDeleteResponseResult;
     }
     ...
   }
+
+  // Only remove the session file once we have successfully cleaned up the parts,
+  // so a retried abort can still recover the part paths if bulk-delete fails.
+  try {
+    const deleteSessionFunc = async () => { ... };
+    const deleteResponse = await retryWithExponentialBackoff(deleteSessionFunc);
+    ...
+  } catch (error) { ... }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/backends/swift/objects.ts` around lines 1500 - 1629, The abort cleanup
currently deletes the multipart session file (multipartSessionUrl via
deleteSessionFunc) before attempting the bulk delete of parts (built from
sessionParts and sent with sendManualBulkDeleteRequest), which can lose part
paths on failure; change the flow so you read sessionParts first, build
objectsToDelete, perform the bulk delete (retryWithExponentialBackoff around
sendManualBulkDeleteRequest), and only upon successful bulk-delete remove the
session file (invoke the DELETE logic for multipartSessionUrl and its logging) —
keep the same error handling and logging but move the deleteSessionFunc / DELETE
retry block to after the bulk-delete success branch so retries can still find
parts if the bulk delete fails.
🧹 Nitpick comments (3)
tests/swift/basic/multipartupload_test.ts (1)

131-142: ⚡ Quick win

Tighten the raw-parts assertion to match the large-file test.

The negative assertion only checks for ${objectKey}/1. If part numbering ever changes or additional parts leak, this would silently pass. The large-file variant at lines 371-382 uses startsWith(${largeObjectKey}/) which catches any leaked part — apply the same here for consistency and stronger coverage.

♻️ Proposed change
-    const keys = result.Contents?.map((object) => object.Key) ?? [];
-    assert(keys.includes(objectKey));
-    assert(!keys.includes(`${objectKey}/1`));
+    const keys = result.Contents?.map((object) => object.Key) ?? [];
+    assert(keys.includes(objectKey));
+    assert(!keys.some((key) => key?.startsWith(`${objectKey}/`)));
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/swift/basic/multipartupload_test.ts` around lines 131 - 142, In the
"Completed Upload Does Not Expose Raw Parts" test update the negative assertion
to detect any leaked part keys (not just `${objectKey}/1`): after computing keys
from result.Contents (using the existing objectKey variable), assert that no key
in keys startsWith(`${objectKey}/`) — e.g., replace the current
assert(!keys.includes(`${objectKey}/1`)) with an assertion that keys.every(k =>
!k?.startsWith(`${objectKey}/`)) or equivalent so any raw part prefixed by
objectKey is caught.
src/backends/swift/auth.ts (1)

42-81: 💤 Low value

Consider mirroring observability of the v3 path.

getLegacySwiftV1AuthMeta is silent on success and on entry, whereas the v3 path logs "Fetching Authorization Token…" and a success line (lines 99-100, 150, 186). Adding equivalent info logs here would make it possible to tell from logs whether v1 was actually exercised and whether it succeeded, which matters since this branch is only hit on auth-endpoint failure.

📋 Optional logging additions
 async function getLegacySwiftV1AuthMeta(config: SwiftConfig): Promise<{
   storageUrl: string;
   token: string;
 }> {
   const { auth_url, credentials } = config;
+  logger.info("Fetching Authorization Token From Swift Server (legacy v1)");
   const response = await fetch(auth_url, {
     ...
   });
   ...
+  logger.info("Authorization Token and Storage URL retrieved Successfully (legacy v1)");
   return {
     token,
     storageUrl: normalizeLegacySwiftStorageUrl(auth_url, storageUrl),
   };
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/backends/swift/auth.ts` around lines 42 - 81, Add info-level logging at
the start and on successful completion of getLegacySwiftV1AuthMeta so the v1
flow is observable like the v3 path; log an entry message (e.g., "Fetching
Authorization Token (Swift v1) from {auth_url}") when the function begins and a
success message including the resolved storageUrl and masked token (or token
presence) right before returning the normalized storage URL, using the existing
getLegacySwiftV1AuthMeta function and normalizeLegacySwiftStorageUrl/HeraldError
symbols to locate insertion points; keep error throw behavior unchanged but
ensure the success log occurs only after token and storageUrl checks pass.
src/backends/swift/objects.ts (1)

354-369: 💤 Low value

Minor: cache unwrapOk(headResponse) and consider logging when HEAD fails on a possible SLO.

unwrapOk(headResponse) is invoked three times in the SLO-detection block. Caching it would also let you log a warning when the HEAD probe doesn't return OK, since a persistent failure here means an SLO will be DELETEd without ?multipart-manifest=delete, leaking segments. Retries via retryWithExponentialBackoff already cover transient blips, so a single warning at this point is a useful signal.

♻️ Proposed refactor
-  const headResponse = await retryWithExponentialBackoff(async () => {
-    return await fetch(reqUrl, {
-      method: "HEAD",
-      headers: headers,
-    });
-  });
-  if (isOk(headResponse) && unwrapOk(headResponse).ok) {
-    const isStaticLargeObject =
-      unwrapOk(headResponse).headers.get("x-static-large-object")
-        ?.toLowerCase() === "true";
-    if (isStaticLargeObject) {
-      reqUrl = `${swiftUrl}/${bucket}/${object}?multipart-manifest=delete`;
-    }
-  }
+  const headResponse = await retryWithExponentialBackoff(async () => {
+    return await fetch(reqUrl, {
+      method: "HEAD",
+      headers: headers,
+    });
+  });
+  if (isOk(headResponse)) {
+    const headOk = unwrapOk(headResponse);
+    if (headOk.ok) {
+      const isStaticLargeObject =
+        headOk.headers.get("x-static-large-object")?.toLowerCase() === "true";
+      if (isStaticLargeObject) {
+        reqUrl = `${swiftUrl}/${bucket}/${object}?multipart-manifest=delete`;
+      }
+    } else if (headOk.status !== 404) {
+      logger.warn(
+        `SLO detection HEAD failed for ${bucket}/${object}: ${headOk.statusText}; proceeding with plain DELETE`,
+      );
+    }
+  }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/backends/swift/objects.ts` around lines 354 - 369, Cache the result of
unwrapOk(headResponse) instead of calling it multiple times: after calling
retryWithExponentialBackoff for the HEAD probe, assign const headOk =
isOk(headResponse) ? unwrapOk(headResponse) : null, then use headOk to compute
isStaticLargeObject and adjust reqUrl; additionally, when headOk is null or
headOk.ok is false, emit a warning (e.g., logger.warn or console.warn) noting
the HEAD probe failed for the object so that potential SLO deletions without
?multipart-manifest=delete are visible; update references in the SLO-detection
block (isStaticLargeObject, reqUrl) to use the cached headOk.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@src/backends/swift/objects.ts`:
- Around line 1500-1629: The abort cleanup currently deletes the multipart
session file (multipartSessionUrl via deleteSessionFunc) before attempting the
bulk delete of parts (built from sessionParts and sent with
sendManualBulkDeleteRequest), which can lose part paths on failure; change the
flow so you read sessionParts first, build objectsToDelete, perform the bulk
delete (retryWithExponentialBackoff around sendManualBulkDeleteRequest), and
only upon successful bulk-delete remove the session file (invoke the DELETE
logic for multipartSessionUrl and its logging) — keep the same error handling
and logging but move the deleteSessionFunc / DELETE retry block to after the
bulk-delete success branch so retries can still find parts if the bulk delete
fails.

---

Nitpick comments:
In `@src/backends/swift/auth.ts`:
- Around line 42-81: Add info-level logging at the start and on successful
completion of getLegacySwiftV1AuthMeta so the v1 flow is observable like the v3
path; log an entry message (e.g., "Fetching Authorization Token (Swift v1) from
{auth_url}") when the function begins and a success message including the
resolved storageUrl and masked token (or token presence) right before returning
the normalized storage URL, using the existing getLegacySwiftV1AuthMeta function
and normalizeLegacySwiftStorageUrl/HeraldError symbols to locate insertion
points; keep error throw behavior unchanged but ensure the success log occurs
only after token and storageUrl checks pass.

In `@src/backends/swift/objects.ts`:
- Around line 354-369: Cache the result of unwrapOk(headResponse) instead of
calling it multiple times: after calling retryWithExponentialBackoff for the
HEAD probe, assign const headOk = isOk(headResponse) ? unwrapOk(headResponse) :
null, then use headOk to compute isStaticLargeObject and adjust reqUrl;
additionally, when headOk is null or headOk.ok is false, emit a warning (e.g.,
logger.warn or console.warn) noting the HEAD probe failed for the object so that
potential SLO deletions without ?multipart-manifest=delete are visible; update
references in the SLO-detection block (isStaticLargeObject, reqUrl) to use the
cached headOk.

In `@tests/swift/basic/multipartupload_test.ts`:
- Around line 131-142: In the "Completed Upload Does Not Expose Raw Parts" test
update the negative assertion to detect any leaked part keys (not just
`${objectKey}/1`): after computing keys from result.Contents (using the existing
objectKey variable), assert that no key in keys startsWith(`${objectKey}/`) —
e.g., replace the current assert(!keys.includes(`${objectKey}/1`)) with an
assertion that keys.every(k => !k?.startsWith(`${objectKey}/`)) or equivalent so
any raw part prefixed by objectKey is caught.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 2d3868bb-c9e8-40fa-89fe-93ed31446e7a

📥 Commits

Reviewing files that changed from the base of the PR and between 40b6ce1 and 4170f79.

📒 Files selected for processing (5)
  • src/backends/swift/auth.ts
  • src/backends/swift/objects.ts
  • src/backends/swift/utils/mod.ts
  • src/constants/s3.ts
  • tests/swift/basic/multipartupload_test.ts

@tewodrosdy tewodrosdy force-pushed the fix/swift-multipart-parts-hidden branch from 4170f79 to ec43986 Compare May 21, 2026 14:22
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/backends/swift/objects.ts (1)

1626-1644: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Keep the upload session until bulk delete is fully confirmed.

Here we only log the parsed bulk-delete body and then delete the session unconditionally. Non-2xx manual responses and per-object Errors are ignored, so a partial failure leaves hidden parts orphaned and removes the only authoritative path list for retries.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/backends/swift/objects.ts` around lines 1626 - 1644, The code currently
calls deleteMultipartSession() immediately after logging the parsed bulk-delete
body, which can orphan parts if the bulk delete returned non-2xx statuses or
per-object Errors; modify the flow around
bulkDeleteResponseResult/retryWithExponentialBackoff and getBulkDeleteJsonBody
so you parse the JSON (using unwrapOk/getBulkDeleteJsonBody), inspect for any
non-2xx entries or any Errors array items, and only call
deleteMultipartSession() when the parsed response confirms all objects
succeeded; if any failures are detected, log detailed error(s) (use
unwrapErr/unwrapOk and logger.error) and return a failure Result instead of
deleting the session so the authoritative parts list remains for retries.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/backends/swift/auth.ts`:
- Around line 62-82: The code currently accepts empty or malformed Swift auth
headers; update the auth parsing in the token/storageUrl block to reject blank
values and surface structured errors: trim and validate the extracted token and
storageUrl (treat "" or whitespace-only as missing) and throw HeraldError with
appropriate status/messages instead of proceeding; wrap the call to
normalizeLegacySwiftStorageUrl(auth_url, storageUrl) in a try/catch and convert
any parsing/TypeError into a HeraldError (include context like "Invalid
X-Storage-Url" and the original error message) so
normalizeLegacySwiftStorageUrl, token, storageUrl and HeraldError are referenced
for locating the changes.

In `@src/backends/swift/objects.ts`:
- Around line 356-377: The HEAD response handling must not let code fall through
to a plain DELETE when the HEAD failed or returned a transient non-2xx; change
the logic around headResponse/headOk/headErr so that when isOk(headResponse) is
false (headOk == null) you throw or return an error (so
retryWithExponentialBackoff or the caller will retry/fail) instead of logging
and continuing; only proceed to potentially rewrite reqUrl for SLO deletion when
headOk is present and indicates x-static-large-object; reference symbols to
update: retryWithExponentialBackoff, headResponse, isOk, unwrapOk, headOk,
headErr, reqUrl, swiftUrl, bucket, object, and logger.warn. Ensure transient
HEAD failures cause a retry/failure rather than falling back to a plain DELETE.
- Around line 1566-1571: When aborting, the current objectsToDelete set only
includes internal-state parts with part.path, skipping legacy stored parts and
leaving `${object}/<partNumber>` segments behind for mixed sessions. Update the
logic that builds objectsToDelete (used in the abort flow) to iterate
sessionParts and for each part add either `${bucket}/${path}` when part.path
exists or the legacy key `${bucket}/${object}/${partNumber}` (or whatever legacy
key format uses the part number) when path is missing; reference sessionParts,
objectsToDelete, bucket, object and the part's partNumber property to ensure
both internal-state and legacy parts are included in the deletion list.

---

Outside diff comments:
In `@src/backends/swift/objects.ts`:
- Around line 1626-1644: The code currently calls deleteMultipartSession()
immediately after logging the parsed bulk-delete body, which can orphan parts if
the bulk delete returned non-2xx statuses or per-object Errors; modify the flow
around bulkDeleteResponseResult/retryWithExponentialBackoff and
getBulkDeleteJsonBody so you parse the JSON (using
unwrapOk/getBulkDeleteJsonBody), inspect for any non-2xx entries or any Errors
array items, and only call deleteMultipartSession() when the parsed response
confirms all objects succeeded; if any failures are detected, log detailed
error(s) (use unwrapErr/unwrapOk and logger.error) and return a failure Result
instead of deleting the session so the authoritative parts list remains for
retries.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: b00aae7d-2a7c-41dd-babf-a263f38e427d

📥 Commits

Reviewing files that changed from the base of the PR and between 4170f79 and ec43986.

📒 Files selected for processing (7)
  • src/backends/swift/auth.ts
  • src/backends/swift/mod.ts
  • src/backends/swift/objects.ts
  • src/backends/swift/utils/mod.ts
  • src/constants/s3.ts
  • tests/swift/basic/multipartupload_test.ts
  • tests/swift/basic/object_test.ts

Comment on lines +62 to +82
const token = response.headers.get("x-auth-token") ??
response.headers.get("x-storage-token");
const storageUrl = response.headers.get("x-storage-url");

if (token == null) {
throw new HeraldError(400, {
message:
"Error Authenticating to Swift Server: auth token header is null",
});
}

if (storageUrl == null) {
throw new HeraldError(404, {
message: "Storage URL not found in Swift auth response",
});
}

const normalizedStorageUrl = normalizeLegacySwiftStorageUrl(
auth_url,
storageUrl,
);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Reject blank or malformed legacy auth headers before normalizing them.

Line 62 only rejects null, so an empty X-Auth-Token is returned as a valid credential, and Line 79 can throw a raw TypeError if X-Storage-Url is present but blank or malformed. This is an upstream-response parsing path, so it should fail with a structured HeraldError instead of returning unusable auth state or leaking an uncaught URL parse error.

Proposed fix
-  const token = response.headers.get("x-auth-token") ??
-    response.headers.get("x-storage-token");
-  const storageUrl = response.headers.get("x-storage-url");
+  const token = (response.headers.get("x-auth-token") ??
+    response.headers.get("x-storage-token"))?.trim();
+  const storageUrl = response.headers.get("x-storage-url")?.trim();

-  if (token == null) {
+  if (!token) {
     throw new HeraldError(400, {
       message:
         "Error Authenticating to Swift Server: auth token header is null",
     });
   }

-  if (storageUrl == null) {
+  if (!storageUrl) {
     throw new HeraldError(404, {
       message: "Storage URL not found in Swift auth response",
     });
   }

-  const normalizedStorageUrl = normalizeLegacySwiftStorageUrl(
-    auth_url,
-    storageUrl,
-  );
+  let normalizedStorageUrl: string;
+  try {
+    normalizedStorageUrl = normalizeLegacySwiftStorageUrl(
+      auth_url,
+      storageUrl,
+    );
+  } catch {
+    throw new HeraldError(502, {
+      message: "Invalid storage URL returned by Swift auth response",
+    });
+  }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/backends/swift/auth.ts` around lines 62 - 82, The code currently accepts
empty or malformed Swift auth headers; update the auth parsing in the
token/storageUrl block to reject blank values and surface structured errors:
trim and validate the extracted token and storageUrl (treat "" or
whitespace-only as missing) and throw HeraldError with appropriate
status/messages instead of proceeding; wrap the call to
normalizeLegacySwiftStorageUrl(auth_url, storageUrl) in a try/catch and convert
any parsing/TypeError into a HeraldError (include context like "Invalid
X-Storage-Url" and the original error message) so
normalizeLegacySwiftStorageUrl, token, storageUrl and HeraldError are referenced
for locating the changes.

Comment on lines +356 to +377
const headResponse = await retryWithExponentialBackoff(async () => {
return await fetch(reqUrl, {
method: "HEAD",
headers: headers,
});
});
const headOk = isOk(headResponse) ? unwrapOk(headResponse) : null;
const headErr = isOk(headResponse) ? null : unwrapErr(headResponse);
if (headOk?.ok) {
const isStaticLargeObject = headOk.headers.get("x-static-large-object")
?.toLowerCase() === "true";
if (isStaticLargeObject) {
reqUrl = `${swiftUrl}/${bucket}/${object}?multipart-manifest=delete`;
}
} else {
const reason = headOk == null
? headErr?.message
: `${headOk.status} ${headOk.statusText}`;
logger.warn(
`Could not confirm object metadata before delete for ${bucket}/${object}: ${reason}`,
);
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't fall back to a plain DELETE when the SLO check fails.

If the HEAD request errors or returns a transient non-2xx, this still issues a normal DELETE. For static large objects that removes only the manifest, leaving the hidden segments behind while the caller still sees a successful delete. Please fail or retry until the SLO/non-SLO decision is known.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/backends/swift/objects.ts` around lines 356 - 377, The HEAD response
handling must not let code fall through to a plain DELETE when the HEAD failed
or returned a transient non-2xx; change the logic around
headResponse/headOk/headErr so that when isOk(headResponse) is false (headOk ==
null) you throw or return an error (so retryWithExponentialBackoff or the caller
will retry/fail) instead of logging and continuing; only proceed to potentially
rewrite reqUrl for SLO deletion when headOk is present and indicates
x-static-large-object; reference symbols to update: retryWithExponentialBackoff,
headResponse, isOk, unwrapOk, headOk, headErr, reqUrl, swiftUrl, bucket, object,
and logger.warn. Ensure transient HEAD failures cause a retry/failure rather
than falling back to a plain DELETE.

Comment on lines +1566 to +1571
const objectsToDelete = new Set(
sessionParts
.map((part) => part.path)
.filter((path): path is string => Boolean(path))
.map((path) => `${bucket}/${path}`),
);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Handle mixed legacy and internal-state parts during abort.

Once any stored part has a path, the legacy ${object}/ fallback is skipped entirely. An upload started before this change and resumed after it will have a mixed session, so abort deletes only the new internal-state parts and leaves the older ${object}/<partNumber> segments behind.

Also applies to: 1573-1608

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/backends/swift/objects.ts` around lines 1566 - 1571, When aborting, the
current objectsToDelete set only includes internal-state parts with part.path,
skipping legacy stored parts and leaving `${object}/<partNumber>` segments
behind for mixed sessions. Update the logic that builds objectsToDelete (used in
the abort flow) to iterate sessionParts and for each part add either
`${bucket}/${path}` when part.path exists or the legacy key
`${bucket}/${object}/${partNumber}` (or whatever legacy key format uses the part
number) when path is missing; reference sessionParts, objectsToDelete, bucket,
object and the part's partNumber property to ensure both internal-state and
legacy parts are included in the deletion list.

@Yohe-Am
Copy link
Copy Markdown
Collaborator

Yohe-Am commented May 24, 2026

Oh damn, didn't know someone else was hacking on this repo. I done merged a full rewerite that has swift multipart uploads working. It has a lot of things missing still but I'm currently using that version in prod for CNPG, Velero backups and so on. Try it out.

@teddy-23
Copy link
Copy Markdown

Thanks @Yohe-Am, I’ll test latest main then. Just to confirm: does the rewrite already hide Swift multipart part objects from normal S3 ListObjects/ListObjectsV2 responses after CompleteMultipartUpload?

@Yohe-Am
Copy link
Copy Markdown
Collaborator

Yohe-Am commented May 25, 2026

Yep @teddy-23, you can see that in the usages of the filterXxx functions on locations like https://github.com/expnt/herald/blob/main/src/Frontend/Objects/List.ts#L43 .

@Yohe-Am
Copy link
Copy Markdown
Collaborator

Yohe-Am commented May 25, 2026

One thing you'll have to watch out for, this impl uses the Swift bucket itself as a KV store for metadata https://github.com/expnt/herald/blob/main/src/Services/BackendKeyValueStore.ts . Not suitable for for heavy usage since one will be paying per GET/PUT request. Add a kv impl backed by something else if you have such a usecase. It's just a matter of implementing the KeyValue interface from @effect/platform.

If hacking on the rewrite, check out the integration test suite in Typescript but also, tht python submodule that uses an upstram extensive test suite. It's only partially passing on the upstream test suite but I don't have it blocking the CI. Make sure that changes don't reduce the pass rate on that (I probably should wire CI to detect that). Look at the latest CI runs for the upstream test suite to understand what's supported and missing today.

Edit: looks like I'd disabled upstream suite runs in CI to make the jobs complete faster.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants