Skip to content

Avoid infinite log resending if exceptions happen during the log-sending process#359

Open
amidofu wants to merge 15 commits intoaws:devfrom
boostdraft:user/amidofu/log_24hr_issue_aws
Open

Avoid infinite log resending if exceptions happen during the log-sending process#359
amidofu wants to merge 15 commits intoaws:devfrom
boostdraft:user/amidofu/log_24hr_issue_aws

Conversation

@amidofu
Copy link
Copy Markdown

@amidofu amidofu commented May 8, 2026

Issue #, if available:
This is a PR which replaces #219 with the latest dev branch

Description of changes:
We drop the logs in the log batch if the current log batch causes exceptions in the log-sending process
We did this because we observed that our log batch causes the over 24 hours spanning problem
and since the monitor is not stopped, it just resends the log batch forever.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@amidofu amidofu requested review from a team as code owners May 8, 2026 02:57
@amidofu amidofu requested review from GarrettBeatty and philasmar May 8, 2026 02:57
@amidofu amidofu changed the title User/amidofu/log 24hr issue aws Avoid infinite log resending if exceptions happen during the log-sending process May 8, 2026
@GarrettBeatty GarrettBeatty requested a review from Copilot May 8, 2026 16:09
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates AWSLoggerCore’s batching/sending behavior to prevent a log batch that triggers send-time exceptions (notably CloudWatch Logs’ 24-hour timestamp-range constraint) from being resent indefinitely by the monitor loop.

Changes:

  • Resets (drops) the current in-flight batch when an exception occurs in the monitor loop, to avoid infinite resend.
  • Adds PrepareLogEventBatchForSending() to sort events and trim older events until the batch fits within a 24-hour timestamp range.
  • Adds RemoveMessageAt(int index) to LogEventBatch to support removing individual events while keeping message-size accounting consistent.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/AWS.Logger.Core/Core/AWSLoggerCore.cs Outdated
Comment thread src/AWS.Logger.Core/Core/AWSLoggerCore.cs Outdated
Comment thread src/AWS.Logger.Core/Core/AWSLoggerCore.cs Outdated
Comment thread src/AWS.Logger.Core/Core/AWSLoggerCore.cs Outdated
Comment thread src/AWS.Logger.Core/Core/AWSLoggerCore.cs
Comment thread src/AWS.Logger.Core/Core/AWSLoggerCore.cs Outdated
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (1)

src/AWS.Logger.Core/Core/AWSLoggerCore.cs:489

  • SendMessages still only catches ResourceNotFoundException. Any other PutLogEventsAsync failure will leave _repo intact, and the monitor loop will keep retrying the same failing batch indefinitely. To match the PR intent, catch/log broader exceptions (e.g., AmazonCloudWatchLogsException / AmazonServiceException) and drop/reset the current batch (optionally with backoff) to prevent infinite resend loops.
            catch (ResourceNotFoundException ex)
            {
                // The specified log stream does not exist. Refresh or create new stream.
                LogLibraryServiceError(ex);

Comment thread src/AWS.Logger.Core/Core/AWSLoggerCore.cs Outdated
Comment thread src/AWS.Logger.Core/Core/AWSLoggerCore.cs Outdated
Comment thread src/AWS.Logger.Core/Core/AWSLoggerCore.cs
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 2 comments.

Comment thread src/AWS.Logger.Core/Core/AWSLoggerCore.cs
Comment thread src/AWS.Logger.Core/Core/AWSLoggerCore.cs Outdated
amidofu and others added 4 commits May 9, 2026 15:00
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@amidofu amidofu requested a review from Copilot May 9, 2026 06:21
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 3 comments.

Comment thread src/AWS.Logger.Core/Core/AWSLoggerCore.cs
Comment thread src/AWS.Logger.Core/Core/AWSLoggerCore.cs
Comment on lines +440 to +467
private static readonly TimeSpan MaxLogEventBatchAllowedTimeRange = TimeSpan.FromHours(24);
private void PrepareLogEventBatchForSending()
{
//Make sure the log events are in order from the oldest to the newest.
_repo._request.LogEvents.Sort((ev1, ev2) =>
ev1.Timestamp.GetValueOrDefault().CompareTo(ev2.Timestamp.GetValueOrDefault()));
if (_repo._request.LogEvents.Count > 0)
{
DateTime latestLogDateTime = _repo._request.LogEvents.Last().Timestamp ?? DateTime.UtcNow;
//Avoid the error that log events must be within a 24-hour window.
//https://docs.aws.amazon.com/AmazonCloudWatchLogs/latest/APIReference/API_PutLogEvents.html
int lastInvalidEventIndexToRemove = -1;
for (int i = 0; i < _repo._request.LogEvents.Count; i++)
{
var logEvent = _repo._request.LogEvents[i];
if (!logEvent.Timestamp.HasValue || (latestLogDateTime - logEvent.Timestamp.Value) > MaxLogEventBatchAllowedTimeRange)
{
lastInvalidEventIndexToRemove = i;
}
else
{
break; // Events are in order, so we can stop checking once we find a valid event
}
}
if (lastInvalidEventIndexToRemove >= 0)
{
_repo.RemoveMessages(0, lastInvalidEventIndexToRemove + 1);
}
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated no new comments.

@amidofu
Copy link
Copy Markdown
Author

amidofu commented May 9, 2026

@GarrettBeatty
sorry, for copilot's comments below, I'm not sure how to do them

  1. Avoid infinite log resending if exceptions happen during the log-sending process #359 (comment): how to narrow down the exceptions to the bad log event cases so that we can only drop those bad log event to prevent it sending the bad log event over and over again?

  2. Avoid infinite log resending if exceptions happen during the log-sending process #359 (comment): how can I add unit tests to test the log event pruning can work in the real log sending environment?

thank you

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants