fix(mcp): handle SSL connection drop during pre-call session teardown#39917
fix(mcp): handle SSL connection drop during pre-call session teardown#39917aminghadersohi wants to merge 3 commits intoapache:masterfrom
Conversation
When RDS drops an SSL connection due to idle timeout or max-connection-age, `db.session.remove()` in `sync_wrapper` raises `OperationalError` because the implicit rollback inside `session.close()` fails on the dead DBAPI connection. This caused the MCP tool call to fail even when the operation itself completed successfully, and left a dead connection in the pool. Introduce `_remove_session_safe()` which: - Catches `OperationalError` from `session.remove()` (SSL/network errors) - Calls `session.invalidate()` to mark the dead connection for pool discard - Retries `session.remove()` so the scoped registry is clean before the tool runs Replace the bare `db.session.remove()` in `sync_wrapper` with `_remove_session_safe()`. Add a unit test verifying `invalidate()` is called and remove is retried on SSL error.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #39917 +/- ##
==========================================
- Coverage 63.88% 63.87% -0.02%
==========================================
Files 2583 2583
Lines 136604 136636 +32
Branches 31502 31504 +2
==========================================
+ Hits 87276 87277 +1
- Misses 47812 47843 +31
Partials 1516 1516
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Pull request overview
This PR hardens MCP sync tool execution against stale/closed DBAPI connections by making the pre-invocation db.session.remove() step tolerant to OperationalError (e.g., SSL idle-timeout connection drops), so successful tool calls aren’t failed by teardown cleanup.
Changes:
- Added
_remove_session_safe()to catchOperationalErrorfromdb.session.remove(), invalidate the session’s connection, and retry removal. - Updated
sync_wrapperinmcp_auth_hookto use_remove_session_safe()instead of callingdb.session.remove()directly. - Added a unit test covering the SSL/connection-drop failure mode and ensuring the tool call still succeeds.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
superset/mcp_service/auth.py |
Introduces safe scoped-session removal logic and wires it into sync MCP tool execution. |
tests/unit_tests/mcp_service/test_auth_user_resolution.py |
Adds regression coverage to ensure sync_wrapper tolerates OperationalError during pre-call session cleanup. |
Code Review Agent Run #c82e1fActionable Suggestions - 0Review Details
Bito Usage GuideCommands Type the following command in the pull request comment and save the comment.
Refer to the documentation for additional commands. Configuration This repository uses Documentation & Help |
- Replace nonlocal counter in SSL error test with MagicMock side_effect list - Add inline comment on retry db.session.remove() to clarify intent
✅ Deploy Preview for superset-docs-preview ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
…st OperationalError Some database drivers (e.g. MySQL, SQLite) surface dropped connections as InterfaceError rather than OperationalError. Both are DBAPIError subclasses. Widen the catch in _remove_session_safe() so all DBAPI-level disconnect errors are handled consistently regardless of driver.
|
Addressed catch: widened from to so drivers that surface disconnect as are also covered. — agor claude on Amin's behalf |
Code Review Agent Run #a5a6a8Actionable Suggestions - 0Review Details
Bito Usage GuideCommands Type the following command in the pull request comment and save the comment.
Refer to the documentation for additional commands. Configuration This repository uses Documentation & Help |
SUMMARY
MCP tools running in thread-pool workers (sync tools) call
db.session.remove()before each invocation to clear any stale thread-local session. If the underlying DBAPI connection died between requests — e.g. RDS dropped it due to SSL idle-timeout or max-connection-age — the implicitrollback()insidesession.close()raisesOperationalError. This caused the tool call to fail with a stack trace, even when the operation itself had completed successfully.Root cause:
sync_wrapperinauth.pycalleddb.session.remove()bare, with no handling for connection-level errors.Fix: Extract
_remove_session_safe()which:OperationalErrorfromsession.remove()session.invalidate()to mark the dead connection for pool discard (prevents it being checked out again by another thread)session.remove()to deregister the scoped sessionThe tool call proceeds normally; a fresh connection is obtained on the next DB access.
BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
N/A — backend-only change
TESTING INSTRUCTIONS
pytest tests/unit_tests/mcp_service/test_auth_user_resolution.py::test_sync_wrapper_handles_ssl_error_on_pre_call_removeADDITIONAL INFORMATION