Skip to content

fix: Simplify refresh logic on a failed fetch#16

Merged
caitlynstocker merged 5 commits into
mainfrom
cat/devex-84/simplify-refresh-retry
May 29, 2026
Merged

fix: Simplify refresh logic on a failed fetch#16
caitlynstocker merged 5 commits into
mainfrom
cat/devex-84/simplify-refresh-retry

Conversation

@caitlynstocker
Copy link
Copy Markdown
Contributor

@caitlynstocker caitlynstocker commented May 28, 2026

Background 🌇

The feature toggle evaluation manifest is fetched when a provider is initialized and then refreshed regularly according to a cache interval. If one of these fetches fails, the refresh interval drops from the cache duration (default one min) down to five seconds. This could significantly increase load on OctoToggle in the event of an outage and does not seem worth the cost considering that we already have regular refreshes at the cache duration interval.

In DEVEX-41, we added logic to fall back the most recent set of evaluations in the case where a refresh fails, rather than falling back to default toggle states. This made waiting the cache duration for a refresh (rather than frantically retrying) a more acceptable option.

What's this? 🐦

The PR removes the fast-refresh logic. Now instead of dropping down to a five second refresh interval on failure, we just carry on refreshing at the caching interval.

Testing 🧪

✅ I've given these changes a manual test with both the Functions and ASP.NET OctoToggle setup
✅ The updated refresh logic has has been tested using the tests added in this PR.

How to review? 🔍

💬 Check out the comments on the code
☑️ Code quality

Completes DEVEX-84

@caitlynstocker caitlynstocker requested a review from a team as a code owner May 28, 2026 02:45

// Stop warnings about refresh failures from reaching the test output
julLogger.setUseParentHandlers(false);

Copy link
Copy Markdown
Contributor Author

@caitlynstocker caitlynstocker May 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

QUESTION: The tests involving refresh failures were outputting a lot of scary looking logs - which is expected as we're throwing an error every time a refresh fails. I've added these lines to keep the test output clean, however this could potentially hide other more useful logs. What is the better option here between hiding logs and showing too many?

Here is the combined test output before adding these limitations to the test loggers.

SEVERE: Failed to retrieve updated feature manifest. Retaining existing context which may be stale.
May 28, 2026 2:22:02 PM com.octopus.openfeature.provider.OctopusContextProvider refresh
SEVERE: Failed to retrieve updated feature manifest. Retaining existing context which may be stale.
May 28, 2026 2:22:02 PM com.octopus.openfeature.provider.OctopusContextProvider refresh
SEVERE: Failed to retrieve updated feature manifest. Retaining existing context which may be stale.
May 28, 2026 2:22:02 PM com.octopus.openfeature.provider.OctopusContextProvider refresh
SEVERE: Failed to retrieve updated feature manifest. Retaining existing context which may be stale.
May 28, 2026 2:22:02 PM com.octopus.openfeature.provider.OctopusContextProvider refresh
SEVERE: Failed to retrieve updated feature manifest. Retaining existing context which may be stale.
May 28, 2026 2:22:02 PM com.octopus.openfeature.provider.OctopusContextProvider refresh
SEVERE: Failed to retrieve updated feature manifest. Retaining existing context which may be stale.
May 28, 2026 2:22:02 PM com.octopus.openfeature.provider.OctopusContextProvider refresh
SEVERE: Failed to retrieve updated feature manifest. Retaining existing context which may be stale.
... this x about 10

java.lang.RuntimeException: Oops! Simulated refresh error
	at com.octopus.openfeature.provider.OctopusContextProviderTests$ThrowsOnRefreshClient.haveFeatureTogglesChanged(OctopusContextProviderTests.java:225)

	at com.octopus.openfeature.provider.OctopusContextProvider.refresh(OctopusContextProvider.java:48)
	at java.base/java.lang.Thread.run(Thread.java:1474)

May 28, 2026 2:22:08 PM com.octopus.openfeature.provider.OctopusContextProvider refresh
SEVERE: Failed to retrieve updated feature manifest. Retaining existing context which may be stale.
java.lang.RuntimeException: Oops! Simulated refresh error
	at com.octopus.openfeature.provider.OctopusContextProviderTests$ThrowsOnRefreshClient.haveFeatureTogglesChanged(OctopusContextProviderTests.java:225)

	at com.octopus.openfeature.provider.OctopusContextProvider.refresh(OctopusContextProvider.java:48)
	at java.base/java.lang.Thread.run(Thread.java:1474)
... this x 3

May 28, 2026 2:22:09 PM com.octopus.openfeature.provider.OctopusContextProvider refresh
SEVERE: Failed to retrieve updated feature manifest. Retaining existing context which may be stale.
May 28, 2026 2:22:09 PM com.octopus.openfeature.provider.OctopusContextProvider refresh
SEVERE: Failed to retrieve updated feature manifest. Retaining existing context which may be stale.
May 28, 2026 2:22:09 PM com.octopus.openfeature.provider.OctopusContextProvider refresh
SEVERE: Failed to retrieve updated feature manifest. Retaining existing context which may be stale.
May 28, 2026 2:22:09 PM com.octopus.openfeature.provider.OctopusContextProvider refresh
SEVERE: Failed to retrieve updated feature manifest. Retaining existing context which may be stale.
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 16.66 s -- in com.octopus.openfeature.provider.OctopusContextProviderTests

Results:

Tests run: 5, Failures: 0, Errors: 0, Skipped: 0

------------------------------------------------------------------------
BUILD SUCCESS
------------------------------------------------------------------------
Total time:  20.460 s
Finished at: 2026-05-28T14:22:14+10:00
------------------------------------------------------------------------

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't feel strongly either way, but given we are adding the logger to assert against, this test output is a side effect and it is reasonable to suppress it.

provider.shutdown();
}
}

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests below have been translated straight across from my .NET PR for the same ticket.

@caitlynstocker caitlynstocker changed the title Cat/devex 84/simplify refresh retry fix: Simplify refresh logic May 28, 2026
@caitlynstocker caitlynstocker changed the title fix: Simplify refresh logic fix: Simplify refresh logic on a failed fetch May 28, 2026
@caitlynstocker caitlynstocker requested a review from Copilot May 29, 2026 03:19
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR simplifies OctopusContextProvider refresh behavior so failed refreshes no longer switch to a five-second retry loop and instead continue using the configured cache duration.

Changes:

  • Removed the refresh retry delay and retry-attempt tracking.
  • Standardized refresh failure logging while retaining the existing context.
  • Added tests covering recovery after failed initial/refresh fetches and logging for refresh exceptions.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
src/main/java/com/octopus/openfeature/provider/OctopusContextProvider.java Removes fast retry behavior and keeps refresh cadence tied to cache duration.
src/test/java/com/octopus/openfeature/provider/OctopusContextProviderTests.java Adds coverage for failed refresh recovery and exception logging behavior.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

byte[] initialHash = {0x01, 0x02, 0x03, 0x04};
byte[] updatedHash = {0x01, 0x02, 0x03, 0x05};

var logMessages = new ArrayList<String>();
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a test thread and refresh thread both writing to the logger, so this seems like a good call. Have added it to the existing refresh test above as well 👍

Comment thread src/test/java/com/octopus/openfeature/provider/OctopusContextProviderTests.java Outdated
Copy link
Copy Markdown
Contributor

@liamhughes liamhughes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nicely done. Wondering if we wanted to align a bit with the .NET version while we're here.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just noticed the poor grammar here. Would you mind? 😄

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could even totally wild and log the same message we log in the .NET provider! 🫣

}

try {
var toggles = client.getFeatureToggleEvaluationManifest();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about aligning the error handling of OctopusClient methods to the .NET version? i.e. remove the try/catch blocks and letting them bubble up?

Copy link
Copy Markdown
Contributor Author

@caitlynstocker caitlynstocker May 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. I think this commit makes them equivalent to the .NET. Would you mind giving it an extra check for me?


// Stop warnings about refresh failures from reaching the test output
julLogger.setUseParentHandlers(false);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't feel strongly either way, but given we are adding the logger to assert against, this test output is a side effect and it is reasonable to suppress it.

@caitlynstocker caitlynstocker force-pushed the cat/devex-84/simplify-refresh-retry branch from 1cd8b92 to 111a4be Compare May 29, 2026 03:42
return null;
}
var evaluations = OctopusObjectMapper.INSTANCE.readValue(httpResponse.body(), new TypeReference<List<FeatureToggleEvaluation>>() {});
return new FeatureToggles(evaluations, Base64.getDecoder().decode(contentHashHeader.get()));
Copy link
Copy Markdown
Contributor Author

@caitlynstocker caitlynstocker May 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change removes error handling for consistency with the GetFeatureToggleEvaluationManifest function in the .NET provider. Errors here will now bubble up to be logged in the catch blocks of the OctopusContextProvider's initialize() and refresh() functions.

HttpResponse<String> httpResponse = client.send(request, HttpResponse.BodyHandlers.ofString());
FeatureToggleCheckResponse checkResponse = OctopusObjectMapper.INSTANCE.readValue(httpResponse.body(), FeatureToggleCheckResponse.class);
return !Arrays.equals(checkResponse.contentHash, contentHash);
}
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change removes error handling for consistency with the HaveFeaturesChanged function in the .NET provider. Errors here will now bubble up to be logged in the catch blocks of the OctopusContextProvider's refresh() function.

@caitlynstocker caitlynstocker requested a review from liamhughes May 29, 2026 04:42
Copy link
Copy Markdown
Contributor

@liamhughes liamhughes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sweet. 👍

Forgot to say on the .NET one, but I have done some additional testing with a polling client testing red --> green --> red. Seems to be rock solid.

@caitlynstocker caitlynstocker merged commit e4c0e78 into main May 29, 2026
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants