Skip to content

Fix test flakiness#7400

Open
Evangelink wants to merge 1 commit intomainfrom
dev/amauryleve/json-rcp-flaky
Open

Fix test flakiness#7400
Evangelink wants to merge 1 commit intomainfrom
dev/amauryleve/json-rcp-flaky

Conversation

@Evangelink
Copy link
Member

Fix #7398

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Addresses flakiness in the server-mode TCP startup/abort path (Issue #7398) by tightening cancellation handling around TCP connection establishment and by broadening the set of cancellation-adjacent exceptions treated as expected during shutdown/abort.

Changes:

  • Adds a post-ConnectAsync cancellation check before calling TcpClient.GetStream() to avoid a race where the socket is closed by cancellation.
  • Consolidates multiple cancellation-related catch blocks in ServerTestHost into a single filtered catch that also covers InvalidOperationException from GetStream().

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
src/Platform/Microsoft.Testing.Platform/ServerMode/JsonRpc/MessageHandlerFactory.cs Adds a cancellation guard between successful connect and GetStream() to prevent “non-connected sockets” failures.
src/Platform/Microsoft.Testing.Platform/Hosts/ServerTestHost.cs Expands/centralizes cancellation exception handling to treat additional timing-dependent socket exceptions as expected during abort.

Comment on lines 144 to +150
(_messageHandler as IDisposable)?.Dispose();
}
catch (OperationCanceledException ex) when (ex.CancellationToken == cancellationToken)
{
}
catch (SocketException ex) when (ex.SocketErrorCode == SocketError.OperationAborted && cancellationToken.IsCancellationRequested)
{
}
catch (ObjectDisposedException) when (cancellationToken.IsCancellationRequested)
catch (Exception ex) when
// When the cancellation token fires during TCP connect or message handling, several
// exception types can surface depending on the exact timing:
(cancellationToken.IsCancellationRequested
// the standard cancellation path.
Copy link

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When an exception is swallowed by this cancellation-specific catch filter, _messageHandler may have been successfully created but will not be disposed (the dispose call is only on the success path). Consider moving (_messageHandler as IDisposable)?.Dispose() into finally (or duplicating it in the catch) to ensure TCP resources are always cleaned up.

Copilot uses AI. Check for mistakes.
Comment on lines +58 to 60
cancellationToken.ThrowIfCancellationRequested();
NetworkStream stream = client.GetStream();
return new TcpMessageHandler(client, clientToServerStream: stream, serverToClientStream: stream, FormatterUtilities.CreateFormatter());
Copy link

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If cancellationToken.ThrowIfCancellationRequested() throws here, the TcpClient instance is never disposed. That can leave underlying resources open (even if the socket was closed by cancellation). Consider disposing client before rethrowing/throwing on cancellation, and also ensure client is disposed if GetStream() ends up throwing in this path.

See below for a potential fix:

            bool disposeClient = true;

            try
            {
#if NETCOREAPP
                await client.ConnectAsync(host: _host, port: _port, cancellationToken).ConfigureAwait(false);
#else
                await client.ConnectAsync(host: _host, port: _port).WithCancellationAsync(cancellationToken, observeException: true).ConfigureAwait(false);
#endif
                // ConnectAsync with a CancellationToken registers a callback that closes the socket.
                // If the connect completes at the OS level at the same instant the token fires,
                // ConnectAsync can return successfully while the socket is already closed,
                // causing GetStream() to throw InvalidOperationException ("non-connected sockets").
                cancellationToken.ThrowIfCancellationRequested();
                NetworkStream stream = client.GetStream();
                TcpMessageHandler handler = new(client, clientToServerStream: stream, serverToClientStream: stream, FormatterUtilities.CreateFormatter());
                disposeClient = false;
                return handler;
            }
            finally
            {
                if (disposeClient)
                {
                    client.Dispose();
                }
            }

Copilot uses AI. Check for mistakes.
Comment on lines +54 to +55
// ConnectAsync with a CancellationToken registers a callback that closes the socket.
// If the connect completes at the OS level at the same instant the token fires,
Copy link

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment about ConnectAsync registering a cancellation callback that closes the socket is accurate for the NETCOREAPP overload taking a CancellationToken, but not for the WithCancellationAsync path (which doesn't close the socket). Please adjust the comment to reflect the conditional behavior so it stays correct across TFMs.

Suggested change
// ConnectAsync with a CancellationToken registers a callback that closes the socket.
// If the connect completes at the OS level at the same instant the token fires,
// On NETCOREAPP, the ConnectAsync overload that takes a CancellationToken registers a callback that closes the socket.
// In that case, if the connect completes at the OS level at the same instant the token fires,

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Investigate ServerCanBeStartedAndAborted_TcpIp test flakiness

1 participant