fix: Simplify refresh logic on a failed fetch#16
Conversation
|
|
||
| // Stop warnings about refresh failures from reaching the test output | ||
| julLogger.setUseParentHandlers(false); | ||
|
|
There was a problem hiding this comment.
QUESTION: The tests involving refresh failures were outputting a lot of scary looking logs - which is expected as we're throwing an error every time a refresh fails. I've added these lines to keep the test output clean, however this could potentially hide other more useful logs. What is the better option here between hiding logs and showing too many?
Here is the combined test output before adding these limitations to the test loggers.
SEVERE: Failed to retrieve updated feature manifest. Retaining existing context which may be stale.
May 28, 2026 2:22:02 PM com.octopus.openfeature.provider.OctopusContextProvider refresh
SEVERE: Failed to retrieve updated feature manifest. Retaining existing context which may be stale.
May 28, 2026 2:22:02 PM com.octopus.openfeature.provider.OctopusContextProvider refresh
SEVERE: Failed to retrieve updated feature manifest. Retaining existing context which may be stale.
May 28, 2026 2:22:02 PM com.octopus.openfeature.provider.OctopusContextProvider refresh
SEVERE: Failed to retrieve updated feature manifest. Retaining existing context which may be stale.
May 28, 2026 2:22:02 PM com.octopus.openfeature.provider.OctopusContextProvider refresh
SEVERE: Failed to retrieve updated feature manifest. Retaining existing context which may be stale.
May 28, 2026 2:22:02 PM com.octopus.openfeature.provider.OctopusContextProvider refresh
SEVERE: Failed to retrieve updated feature manifest. Retaining existing context which may be stale.
May 28, 2026 2:22:02 PM com.octopus.openfeature.provider.OctopusContextProvider refresh
SEVERE: Failed to retrieve updated feature manifest. Retaining existing context which may be stale.
... this x about 10
java.lang.RuntimeException: Oops! Simulated refresh error
at com.octopus.openfeature.provider.OctopusContextProviderTests$ThrowsOnRefreshClient.haveFeatureTogglesChanged(OctopusContextProviderTests.java:225)
at com.octopus.openfeature.provider.OctopusContextProvider.refresh(OctopusContextProvider.java:48)
at java.base/java.lang.Thread.run(Thread.java:1474)
May 28, 2026 2:22:08 PM com.octopus.openfeature.provider.OctopusContextProvider refresh
SEVERE: Failed to retrieve updated feature manifest. Retaining existing context which may be stale.
java.lang.RuntimeException: Oops! Simulated refresh error
at com.octopus.openfeature.provider.OctopusContextProviderTests$ThrowsOnRefreshClient.haveFeatureTogglesChanged(OctopusContextProviderTests.java:225)
at com.octopus.openfeature.provider.OctopusContextProvider.refresh(OctopusContextProvider.java:48)
at java.base/java.lang.Thread.run(Thread.java:1474)
... this x 3
May 28, 2026 2:22:09 PM com.octopus.openfeature.provider.OctopusContextProvider refresh
SEVERE: Failed to retrieve updated feature manifest. Retaining existing context which may be stale.
May 28, 2026 2:22:09 PM com.octopus.openfeature.provider.OctopusContextProvider refresh
SEVERE: Failed to retrieve updated feature manifest. Retaining existing context which may be stale.
May 28, 2026 2:22:09 PM com.octopus.openfeature.provider.OctopusContextProvider refresh
SEVERE: Failed to retrieve updated feature manifest. Retaining existing context which may be stale.
May 28, 2026 2:22:09 PM com.octopus.openfeature.provider.OctopusContextProvider refresh
SEVERE: Failed to retrieve updated feature manifest. Retaining existing context which may be stale.
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 16.66 s -- in com.octopus.openfeature.provider.OctopusContextProviderTests
Results:
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0
------------------------------------------------------------------------
BUILD SUCCESS
------------------------------------------------------------------------
Total time: 20.460 s
Finished at: 2026-05-28T14:22:14+10:00
------------------------------------------------------------------------
There was a problem hiding this comment.
I don't feel strongly either way, but given we are adding the logger to assert against, this test output is a side effect and it is reasonable to suppress it.
| provider.shutdown(); | ||
| } | ||
| } | ||
|
|
There was a problem hiding this comment.
The tests below have been translated straight across from my .NET PR for the same ticket.
There was a problem hiding this comment.
Pull request overview
This PR simplifies OctopusContextProvider refresh behavior so failed refreshes no longer switch to a five-second retry loop and instead continue using the configured cache duration.
Changes:
- Removed the refresh retry delay and retry-attempt tracking.
- Standardized refresh failure logging while retaining the existing context.
- Added tests covering recovery after failed initial/refresh fetches and logging for refresh exceptions.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
src/main/java/com/octopus/openfeature/provider/OctopusContextProvider.java |
Removes fast retry behavior and keeps refresh cadence tied to cache duration. |
src/test/java/com/octopus/openfeature/provider/OctopusContextProviderTests.java |
Adds coverage for failed refresh recovery and exception logging behavior. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| byte[] initialHash = {0x01, 0x02, 0x03, 0x04}; | ||
| byte[] updatedHash = {0x01, 0x02, 0x03, 0x05}; | ||
|
|
||
| var logMessages = new ArrayList<String>(); |
There was a problem hiding this comment.
We have a test thread and refresh thread both writing to the logger, so this seems like a good call. Have added it to the existing refresh test above as well 👍
liamhughes
left a comment
There was a problem hiding this comment.
Nicely done. Wondering if we wanted to align a bit with the .NET version while we're here.
There was a problem hiding this comment.
Just noticed the poor grammar here. Would you mind? 😄
There was a problem hiding this comment.
We could even totally wild and log the same message we log in the .NET provider! 🫣
| } | ||
|
|
||
| try { | ||
| var toggles = client.getFeatureToggleEvaluationManifest(); |
There was a problem hiding this comment.
What do you think about aligning the error handling of OctopusClient methods to the .NET version? i.e. remove the try/catch blocks and letting them bubble up?
There was a problem hiding this comment.
Sounds good. I think this commit makes them equivalent to the .NET. Would you mind giving it an extra check for me?
|
|
||
| // Stop warnings about refresh failures from reaching the test output | ||
| julLogger.setUseParentHandlers(false); | ||
|
|
There was a problem hiding this comment.
I don't feel strongly either way, but given we are adding the logger to assert against, this test output is a side effect and it is reasonable to suppress it.
1cd8b92 to
111a4be
Compare
| return null; | ||
| } | ||
| var evaluations = OctopusObjectMapper.INSTANCE.readValue(httpResponse.body(), new TypeReference<List<FeatureToggleEvaluation>>() {}); | ||
| return new FeatureToggles(evaluations, Base64.getDecoder().decode(contentHashHeader.get())); |
There was a problem hiding this comment.
This change removes error handling for consistency with the GetFeatureToggleEvaluationManifest function in the .NET provider. Errors here will now bubble up to be logged in the catch blocks of the OctopusContextProvider's initialize() and refresh() functions.
| HttpResponse<String> httpResponse = client.send(request, HttpResponse.BodyHandlers.ofString()); | ||
| FeatureToggleCheckResponse checkResponse = OctopusObjectMapper.INSTANCE.readValue(httpResponse.body(), FeatureToggleCheckResponse.class); | ||
| return !Arrays.equals(checkResponse.contentHash, contentHash); | ||
| } |
There was a problem hiding this comment.
This change removes error handling for consistency with the HaveFeaturesChanged function in the .NET provider. Errors here will now bubble up to be logged in the catch blocks of the OctopusContextProvider's refresh() function.
liamhughes
left a comment
There was a problem hiding this comment.
Sweet. 👍
Forgot to say on the .NET one, but I have done some additional testing with a polling client testing red --> green --> red. Seems to be rock solid.
Background 🌇
The feature toggle evaluation manifest is fetched when a provider is initialized and then refreshed regularly according to a cache interval. If one of these fetches fails, the refresh interval drops from the cache duration (default one min) down to five seconds. This could significantly increase load on OctoToggle in the event of an outage and does not seem worth the cost considering that we already have regular refreshes at the cache duration interval.
In DEVEX-41, we added logic to fall back the most recent set of evaluations in the case where a refresh fails, rather than falling back to default toggle states. This made waiting the cache duration for a refresh (rather than frantically retrying) a more acceptable option.
What's this? 🐦
The PR removes the fast-refresh logic. Now instead of dropping down to a five second refresh interval on failure, we just carry on refreshing at the caching interval.
Testing 🧪
✅ I've given these changes a manual test with both the Functions and ASP.NET OctoToggle setup
✅ The updated refresh logic has has been tested using the tests added in this PR.
How to review? 🔍
💬 Check out the comments on the code
☑️ Code quality
Completes DEVEX-84