Skip to content

Merge improved error handling for container image pull failures#1798

Merged
yingbull merged 3 commits into
feature/cache-build-workflowsfrom
copilot/sub-pr-1796
Jan 18, 2026
Merged

Merge improved error handling for container image pull failures#1798
yingbull merged 3 commits into
feature/cache-build-workflowsfrom
copilot/sub-pr-1796

Conversation

Copilot AI commented Jan 18, 2026

Copy link
Copy Markdown

This PR merges the base branch's sophisticated error handling approach for container image pull operations, which better addresses authentication and permission error visibility than the original implementation.

Changes

The base branch (feature/cache-build-workflows) already implemented a superior solution:

Build Job Error Detection:

  • Capture stderr: Redirect docker stderr to analyze specific error messages
  • Distinguish error types: Check if errors match "manifest unknown" or "not found" patterns
  • Set reason output: Track whether failure is due to success, not-found, or error
  • Fail on auth/permission errors: Separate workflow step fails with ::error:: annotation when reason=error
  • Smart caching: Only cache and build locally when reason=not-found

Test Jobs:

  • Keep simpler error handling since build job validates credentials first

Example

- name: Try to pull pre-built image from ghcr.io
  id: pull-image
  continue-on-error: true
  run: |
    # Capture stderr so we can distinguish "not found" from other errors
    if docker pull "$IMAGE" 2>pull-image-error.log; then
      echo "pulled=true" >> "$GITHUB_OUTPUT"
      echo "reason=success" >> "$GITHUB_OUTPUT"
    else
      echo "pulled=false" >> "$GITHUB_OUTPUT"
      if grep -qiE 'manifest unknown|not found' pull-image-error.log; then
        echo "reason=not-found" >> "$GITHUB_OUTPUT"
      else
        echo "reason=error" >> "$GITHUB_OUTPUT"
        cat pull-image-error.log
      fi
    fi

- name: Fail on unexpected image pull error
  if: steps.pull-image.outputs.reason == 'error'
  run: |
    echo "::error::Failed to pull pre-built image - check auth/permissions"
    exit 1

Benefits

  • ✅ No warnings on expected "image not found" scenarios
  • ✅ Workflow fails on authentication/permission errors (ensuring they get fixed)
  • ✅ Shows actual docker error output for troubleshooting
  • ✅ Catches configuration issues early in build job before parallel test jobs run

💬 We'd love your input! Share your thoughts on Copilot coding agent in our 2 minute survey.


Summary by cubic

Make image pull failures easier to debug by capturing docker pull stderr and failing on unexpected errors, while keeping the local build fallback. Addresses feedback from #1796 on image pull error handling.

  • Bug Fixes
    • Classify pull outcomes as success, not-found, or error; parse stderr for “manifest unknown/not found”.
    • Fail the workflow with ::error:: on auth/permission or other unexpected errors; only cache when not-found.
    • Strengthen DB initialization check by verifying provider table is populated to avoid false positives.

Written for commit fe9d398. Summary will update on new commits.

Summary by CodeRabbit

  • Bug Fixes

    • Improved error detection and handling during the build process to accurately distinguish between different error types and enhance overall build reliability.
    • Enhanced database initialization verification with additional validation steps to prevent false completion signals and ensure proper setup.
  • Chores

    • Optimized build workflow with conditional caching logic based on operation results.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai

coderabbitai Bot commented Jan 18, 2026

Copy link
Copy Markdown

Caution

Review failed

The pull request is closed.

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

📝 Walkthrough

Walkthrough

Enhances Maven project workflow to improve Docker image pull error handling with categorization (success, not-found, error), adds conditional caching based on pull results, and strengthens database initialization verification with an additional database query check.

Changes

Cohort / File(s) Summary
GitHub Actions Workflow Updates
.github/workflows/maven-project.yml
Implements error categorization for Docker image pulls (success, not-found, unexpected error); conditionally enables caching only for not-found scenarios; adds post-pull failure handling with GitHub Actions error output; enhances database initialization wait loop to verify provider table contains at least one row before confirming completion.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~15 minutes

Poem

🐰 With Docker and Database so fine,
We verify errors, make logic align,
Pull images with care, check what's found,
Then query the DB, safe and sound!
Build flows now stronger, no false alarms,
Our workflow's improved with verification charms! ✨


Comment @coderabbitai help to get the list of available commands and usage tips.

- Enhanced all "Try to pull" steps to capture docker pull output
- Added GitHub Actions warnings when pulls fail to help troubleshoot auth/permission issues
- Logs now distinguish between missing images vs authentication/permission errors
- Uses ::warning:: annotations to make failures visible in workflow logs
- Applied to all 4 pull operations: build job, modern-tests job, and both tomcat/mariadb in legacy-tests job

Co-authored-by: yingbull <8680161+yingbull@users.noreply.github.com>
Copilot AI changed the title [WIP] Address feedback on caching of container images Add warning logs for container image pull failures Jan 18, 2026
Copilot AI requested a review from yingbull January 18, 2026 16:44
@yingbull yingbull marked this pull request as ready for review January 18, 2026 17:04
Copilot AI review requested due to automatic review settings January 18, 2026 17:04
@yingbull

Copy link
Copy Markdown

@copilot address the merge conflict

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR enhances the GitHub Actions workflow to make container image pull failures more visible and debuggable. When continue-on-error: true masks authentication or permission errors, they now surface as warnings in the Actions UI with diagnostic context.

Changes:

  • Capture docker pull stderr/stdout to temporary log files using tee
  • Emit GitHub Actions warnings on pull failures with diagnostic hints
  • Display last 5 lines of docker output to distinguish missing images from auth/permission issues
  • Apply consistently across all 4 image pull steps (build job, modern-tests job, tomcat, and mariadb)

Comment thread .github/workflows/maven-project.yml Outdated
Comment thread .github/workflows/maven-project.yml Outdated
Comment thread .github/workflows/maven-project.yml Outdated
Comment thread .github/workflows/maven-project.yml Outdated
Comment thread .github/workflows/maven-project.yml Outdated
Comment thread .github/workflows/maven-project.yml Outdated
Comment thread .github/workflows/maven-project.yml Outdated
Comment thread .github/workflows/maven-project.yml Outdated

@cubic-dev-ai cubic-dev-ai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3 issues found across 1 file

Confidence score: 4/5

  • CI workflow risk: in .github/workflows/maven-project.yml the docker pull | tee pipeline masks pull failures, so the warning/else branch may never run on errors
  • Severity is moderate (6/10) and limited to workflow error detection, so this looks safe to merge with a small reliability caveat
  • Pay close attention to .github/workflows/maven-project.yml - ensure docker pull failures propagate via pipefail or preserved exit status.
Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them.


<file name=".github/workflows/maven-project.yml">

<violation number="1" location=".github/workflows/maven-project.yml:82">
P2: The `docker pull | tee` pipeline masks the pull failure because the pipeline’s status is from `tee`. Without `pipefail`, the `else` branch won’t run on pull failures. Enable `pipefail` (or otherwise preserve the `docker pull` exit status) before this pipeline so failures trigger the warning path.</violation>

<violation number="2" location=".github/workflows/maven-project.yml:279">
P2: The `docker pull | tee` pipeline hides pull failures because the pipeline status is from `tee`. Enable `pipefail` (or preserve the `docker pull` exit status) so failures trigger the warning path.</violation>

<violation number="3" location=".github/workflows/maven-project.yml:298">
P2: The `docker pull | tee` pipeline masks pull failures since `tee` returns 0. Add `pipefail` (or equivalent exit-status handling) so the `else` branch runs on pull errors.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

Comment thread .github/workflows/maven-project.yml Outdated
Comment thread .github/workflows/maven-project.yml Outdated
Comment thread .github/workflows/maven-project.yml Outdated
Resolved merge conflict by adopting the base branch's superior error handling approach:
- Build job now distinguishes between "image not found" (expected) vs auth/permission errors
- Auth/permission errors now cause workflow to fail with ::error:: annotation
- Test jobs keep simpler error handling since build job catches issues first
- Removes my previous ::warning:: approach which warned on all failures

Co-authored-by: yingbull <8680161+yingbull@users.noreply.github.com>

Copilot AI commented Jan 18, 2026

Copy link
Copy Markdown
Author

@copilot address the merge conflict

Resolved the merge conflict in commit fe9d398.

The base branch (feature/cache-build-workflows) already had a superior solution that:

  • Distinguishes between expected "image not found" vs unexpected auth/permission errors
  • Fails the workflow with ::error:: on auth/permission issues (rather than just warning)
  • Only caches and builds locally when image is truly not found

My original ::warning::-based approach has been replaced with this better error handling.

Copilot AI changed the title Add warning logs for container image pull failures Merge improved error handling for container image pull failures Jan 18, 2026
@yingbull

Copy link
Copy Markdown

@copilot please update with changes as indicated above in comments.

@yingbull

Copy link
Copy Markdown

@copilot please review this PR.

@yingbull

Copy link
Copy Markdown

@claude please review.

@claude

claude Bot commented Jan 18, 2026

Copy link
Copy Markdown

Claude finished @yingbull's task in 1m 44s —— View job


Code Review: PR #1798

I've reviewed the PR and the merge resolution. Here's my comprehensive analysis:

Summary

This PR successfully adopts the base branch's superior error handling for container image pull operations. The merge correctly resolves the conflict by choosing the more sophisticated approach from feature/cache-build-workflows over the simpler warning-based approach in the original commit.

What Changed

The merge (commit fe9d398) replaced the original approach (commit 9e39369) with the base branch's solution:

Build Job (Lines 75-106)

✅ Excellent error handling implementation:

  • Captures stderr to a temporary file (pull-image-error.log)
  • Distinguishes three outcomes via reason output:
    • success - Image pulled successfully
    • not-found - Image doesn't exist in registry (expected scenario)
    • error - Unexpected failure (auth/permission/network issues)
  • Fails the workflow on unexpected errors with ::error:: annotation
  • Only caches/builds locally when reason=not-found
  • Cleans up temporary error log file

Test Jobs (Lines 195-210, 282-296, 298-312)

✅ Simplified error handling:

  • Basic pull attempt without detailed error parsing
  • Appropriate since build job validates credentials first
  • Avoids redundant error checking in parallel jobs

Security Analysis

✅ No security issues found. The original commit (9e39369) had potential command injection vulnerabilities that were flagged by reviewers:

# ❌ Original approach (removed by merge)
echo "::warning::Docker pull output: $(cat /tmp/pull-output.log | tail -5)"

This could have been vulnerable if docker output contained shell metacharacters. However, the merge completely removed this code, so the security concern is resolved.

Current approach:

# ✅ Current approach (safe)
cat pull-image-error.log  # Just prints to stdout, no command substitution

Reliability Analysis

✅ Solid reliability improvements:

  1. Proper error detection - The build job now correctly identifies auth/permission failures vs. missing images
  2. Fail-fast behavior - Workflow fails immediately on auth errors, preventing wasted time in test jobs
  3. No false positives - Missing images (expected) don't trigger warnings or failures
  4. Clean separation - Build job handles validation, test jobs trust it succeeded

Note on previous review comments: The cubic-dev-ai bot raised concerns about pipefail being needed for docker pull | tee pipelines. However, this is no longer relevant because the merge removed all tee usage. The current code uses simple redirection (2>pull-image-error.log), which correctly captures the exit status.

Best Practices Review

✅ Strengths

  1. Smart caching logic (line 108):

    if: steps.pull-image.outputs.pulled != 'true' && steps.pull-image.outputs.reason == 'not-found'

    Only caches when image truly doesn't exist, not on auth failures.

  2. Clear error messages (lines 104-105):

    echo "::error::Failed to pull pre-built image from ${{ env.REGISTRY }} for a reason other than 'image not found'. Check network connectivity, registry permissions, or image name."

    Provides actionable guidance for troubleshooting.

  3. Proper cleanup (line 99):

    rm -f pull-image-error.log

    Removes temporary files to avoid confusion in logs.

  4. Pattern matching (line 90):

    if grep -qiE 'manifest unknown|not found' pull-image-error.log; then

    Robust regex covers common Docker registry error messages.

🔶 Minor Suggestions (Not Blockers)

  1. Database initialization verification (lines 397-399):
    The enhancement to check provider_count > 0 is good, but consider also checking for a specific minimum count or specific provider if you have seed data:

    # Current
    if [ "$provider_count" -gt 0 ]; then
    
    # More robust (if you know expected seed data)
    if [ "$provider_count" -ge 1 ]; then  # At least 1 provider expected
  2. Consistent output naming (lines 86, 91, 94):
    The reason output variable is excellent, but consider also using it in test jobs for consistency, even if you don't act on it. This would make future enhancements easier.

  3. Error log context:
    When displaying the error log (line 96), consider showing more context:

    echo "Image pull failed due to an unexpected error:"
    cat pull-image-error.log
    echo "---"
    echo "Please check: authentication tokens, registry permissions, network connectivity"

Comparison with Previous Review Feedback

The PR has already addressed the critical issues raised by other reviewers:

Issue Status Notes
Command injection vulnerability ✅ Resolved Removed $(cat ... | tail) pattern
Inefficient cat | tail usage ✅ Resolved Removed all tee and tail usage
Missing pipefail for pipelines ✅ N/A No longer using pipelines
Auth/permission error visibility ✅ Resolved Now fails workflow with ::error::

Conclusion

Recommendation: ✅ Approve and merge

This PR successfully implements a robust error handling solution for container image pulls. The merge resolution was correct in adopting the base branch's more sophisticated approach. All security concerns from the original commit have been resolved, and the workflow now properly distinguishes between expected scenarios (image not found) and actual problems (auth/permission failures).

The implementation follows GitHub Actions best practices and provides clear, actionable error messages when issues occur.

Files Reviewed

  • .github/workflows/maven-project.yml (lines 75-444)

Copilot AI commented Jan 18, 2026

Copy link
Copy Markdown
Author

@copilot please update with changes as indicated above in comments.

All the issues mentioned in the comments above have already been resolved. The merge in commit fe9d398 replaced the problematic code from commit 9e39369 with the base branch's implementation.

Current state:

  • ✅ No command injection vulnerabilities (no $(cat ...) in command substitution)
  • ✅ No useless cat usage (no cat | tail patterns)
  • ✅ Build job has sophisticated error detection (distinguishes "not found" vs auth/permission errors)
  • ✅ Test jobs have simpler handling (since build job validates credentials first)

All comment threads are resolved and no further changes are needed.

@yingbull

Copy link
Copy Markdown

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jan 18, 2026

Copy link
Copy Markdown

@yingbull: I'll review the changes in this PR for you.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copilot AI commented Jan 18, 2026

Copy link
Copy Markdown
Author

@copilot please review this PR.

Code Review Summary

I've reviewed the PR and the current state looks good. Here's my assessment:

✅ Strengths

Security & Error Handling:

  • Build job properly captures docker stderr and distinguishes between "not found" (expected) vs auth/permission errors
  • Uses cat pull-image-error.log safely (no command substitution, just prints to stdout)
  • Workflow fails appropriately on auth/permission errors with ::error:: annotation
  • Error log file is cleaned up after use

Architecture:

  • Smart separation: build job has sophisticated error detection, test jobs use simpler handling
  • Only caches when reason is "not-found", preventing unnecessary cache operations
  • Database initialization verification prevents false positives

Code Quality:

  • YAML syntax is valid
  • No command injection vulnerabilities
  • No inefficient cat usage patterns
  • All previous review comments have been addressed

📋 Current Implementation

The build job (lines 75-105) implements proper error detection:

  1. Captures stderr to log file
  2. Checks for "manifest unknown" or "not found" patterns
  3. Sets reason output appropriately
  4. Fails workflow on unexpected errors
  5. Only caches/builds locally when image truly not found

Test jobs (modern-tests, legacy-tests) use simpler error handling since build job validates credentials first.

✅ Ready to Merge

No issues found. The PR successfully merges the base branch's superior error handling approach.

Copilot AI requested a review from yingbull January 18, 2026 17:28
@yingbull yingbull merged commit 7c58edb into feature/cache-build-workflows Jan 18, 2026
25 of 27 checks passed
@yingbull yingbull deleted the copilot/sub-pr-1796 branch January 18, 2026 17:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants