DRAFT: Upgrade to Python 3.13 with libtmux race condition fix #1978
base: main
- Update target-version and pythonVersion to 3.13 in root pyproject.toml
- Update Python version in server.yml build matrix to 3.13
- Update Python version in pypi-release.yml to 3.13
- Update Python version in pr-review action to 3.13
- Pin libtmux to neubig/libtmux#fix/new-session-race-condition branch which fixes the race condition in new_session() that causes TmuxObjectDoesNotExist errors in Python 3.13 environments

The libtmux fix avoids the race condition by eliminating the separate list-sessions query after session creation, instead parsing the session data directly from the -P output of new-session.

Fixes the Python 3.13 + PyInstaller + Docker compatibility issue reported in libtmux#624.

Co-authored-by: openhands <openhands@all-hands.dev>
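The idea in this commit message, reading the new session's fields from the `-P` output of `new-session` instead of issuing a follow-up `list-sessions` query, can be sketched roughly as below. This is only an illustration and not libtmux's actual code: the helper names `build_format` and `parse_new_session_output`, the choice of fields, and the tab separator are assumptions made for the example. Only the `#{...}` format variables themselves are standard tmux.

```python
# Hypothetical sketch; not libtmux's implementation.
# The field names are standard tmux format variables, but which fields
# are requested and the separator used are assumptions for illustration.
FIELDS = ("session_id", "session_name", "session_created")
SEPARATOR = "\t"


def build_format(fields=FIELDS, sep=SEPARATOR):
    """Build the -F format string, one #{...} variable per field."""
    return sep.join("#{%s}" % name for name in fields)


def parse_new_session_output(line, fields=FIELDS, sep=SEPARATOR):
    """Map the single line printed by `tmux new-session -P -F <format>` to a dict."""
    values = line.rstrip("\n").split(sep)
    if len(values) != len(fields):
        raise ValueError(f"expected {len(fields)} fields, got {len(values)}")
    return dict(zip(fields, values))


# A caller would pass build_format() to `tmux new-session -d -P -F ...`
# and parse the one line of stdout, so no second tmux query is needed.
session = parse_new_session_output("$3\tdemo\t1700000000\n")
print(session["session_id"], session["session_name"])
```

Because all the data comes back from the same command that created the session, there is no window in which a second query could miss it.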
Hi! I started running the integration tests on your PR. You will receive a comment with the results shortly.

🧪 Integration Tests Results
Overall Success Rate: 100.0%

📁 Detailed Logs & Artifacts
Click the links below to access detailed agent/LLM logs showing the complete reasoning process for each model. On the GitHub Actions page, scroll down to the 'Artifacts' section to download the logs.
📊 Summary
📋 Detailed Results
litellm_proxy_deepseek_deepseek_reasoner
Skipped Tests:
litellm_proxy_moonshot_kimi_k2_thinking
Skipped Tests:
litellm_proxy_gemini_3_pro_preview
litellm_proxy_claude_sonnet_4_5_20250929
🔄 Running Examples with
| Example | Status | Duration | Cost |
|---|---|---|---|
| 01_standalone_sdk/02_custom_tools.py | ✅ PASS | 28.7s | $0.03 |
| 01_standalone_sdk/03_activate_skill.py | ✅ PASS | 21.6s | $0.03 |
| 01_standalone_sdk/05_use_llm_registry.py | ✅ PASS | 11.9s | $0.01 |
| 01_standalone_sdk/07_mcp_integration.py | ✅ PASS | 48.3s | $0.04 |
| 01_standalone_sdk/09_pause_example.py | ✅ PASS | 19.9s | $0.02 |
| 01_standalone_sdk/10_persistence.py | ✅ PASS | 27.5s | $0.02 |
| 01_standalone_sdk/11_async.py | ✅ PASS | 34.3s | $0.03 |
| 01_standalone_sdk/12_custom_secrets.py | ✅ PASS | 16.5s | $0.02 |
| 01_standalone_sdk/13_get_llm_metrics.py | ✅ PASS | 21.5s | $0.01 |
| 01_standalone_sdk/14_context_condenser.py | ✅ PASS | 4m 25s | $0.53 |
| 01_standalone_sdk/17_image_input.py | ✅ PASS | 18.7s | $0.02 |
| 01_standalone_sdk/18_send_message_while_processing.py | ✅ PASS | 27.9s | $0.02 |
| 01_standalone_sdk/19_llm_routing.py | ✅ PASS | 15.8s | $0.02 |
| 01_standalone_sdk/20_stuck_detector.py | ✅ PASS | 22.3s | $0.02 |
| 01_standalone_sdk/21_generate_extraneous_conversation_costs.py | ✅ PASS | 9.7s | $0.00 |
| 01_standalone_sdk/22_anthropic_thinking.py | ✅ PASS | 22.2s | $0.02 |
| 01_standalone_sdk/23_responses_reasoning.py | ✅ PASS | 1m 27s | $0.01 |
| 01_standalone_sdk/24_planning_agent_workflow.py | ✅ PASS | 5m 5s | $0.41 |
| 01_standalone_sdk/25_agent_delegation.py | ✅ PASS | 2m 16s | $0.18 |
| 01_standalone_sdk/26_custom_visualizer.py | ✅ PASS | 20.6s | $0.02 |
| 01_standalone_sdk/28_ask_agent_example.py | ✅ PASS | 38.4s | $0.03 |
| 01_standalone_sdk/29_llm_streaming.py | ✅ PASS | 36.6s | $0.03 |
| 01_standalone_sdk/30_tom_agent.py | ✅ PASS | 11.2s | $0.01 |
| 01_standalone_sdk/31_iterative_refinement.py | ✅ PASS | 3m 7s | $0.21 |
| 01_standalone_sdk/32_configurable_security_policy.py | ✅ PASS | 19.7s | $0.02 |
| 01_standalone_sdk/34_critic_example.py | ❌ FAIL (Exit code 1) | 3.8s | -- |
| 01_standalone_sdk/36_event_json_to_openai_messages.py | ✅ PASS | 9.8s | $0.00 |
| 01_standalone_sdk/37_llm_profile_store.py | ✅ PASS | 4.0s | $0.00 |
| 02_remote_agent_server/01_convo_with_local_agent_server.py | ✅ PASS | 58.3s | $0.04 |
| 02_remote_agent_server/02_convo_with_docker_sandboxed_server.py | ❌ FAIL (Exit code 1) | 1m 4s | -- |
| 02_remote_agent_server/03_browser_use_with_docker_sandboxed_server.py | ❌ FAIL (Exit code 1) | 16.8s | -- |
| 02_remote_agent_server/04_convo_with_api_sandboxed_server.py | ❌ FAIL (Exit code 1) | 1m 1s | -- |
| 02_remote_agent_server/07_convo_with_cloud_workspace.py | ✅ PASS | 31.8s | $0.02 |
| 02_remote_agent_server/08_convo_with_apptainer_sandboxed_server.py | ❌ FAIL (Exit code 1) | 4.1s | -- |
| 04_llm_specific_tools/01_gpt5_apply_patch_preset.py | ✅ PASS | 18.7s | $0.02 |
| 04_llm_specific_tools/02_gemini_file_tools.py | ✅ PASS | 44.4s | $0.08 |
| 05_skills_and_plugins/01_loading_agentskills/main.py | ✅ PASS | 12.5s | $0.01 |
| 05_skills_and_plugins/02_loading_plugins/main.py | ✅ PASS | 7.8s | $0.01 |
❌ Some tests failed
Total: 38 | Passed: 33 | Failed: 5 | Total Cost: $1.94
Failed examples:
- examples/01_standalone_sdk/34_critic_example.py: Exit code 1
- examples/02_remote_agent_server/02_convo_with_docker_sandboxed_server.py: Exit code 1
- examples/02_remote_agent_server/03_browser_use_with_docker_sandboxed_server.py: Exit code 1
- examples/02_remote_agent_server/04_convo_with_api_sandboxed_server.py: Exit code 1
- examples/02_remote_agent_server/08_convo_with_apptainer_sandboxed_server.py: Exit code 1
…HUB_SHA

GitHub Actions sets GITHUB_SHA to the merge commit by default, which differs from the PR head commit. Use a custom variable AGENT_SERVER_SHA to explicitly pass the PR head SHA to example scripts for Docker image selection.

Co-authored-by: openhands <openhands@all-hands.dev>
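How an example script might choose between the two variables can be sketched as follows. This is a hedged illustration, not the repository's code: the function name `resolve_image_sha` is hypothetical, and falling back to GITHUB_SHA when AGENT_SERVER_SHA is unset is an assumption of this sketch.

```python
import os


def resolve_image_sha(env=None):
    """Return the commit SHA used to pick the agent-server image.

    Prefers AGENT_SERVER_SHA (the PR head commit) over GITHUB_SHA,
    which per the commit message above is the merge commit by default.
    The fallback order is an assumption made for this sketch.
    """
    env = os.environ if env is None else env
    sha = env.get("AGENT_SERVER_SHA") or env.get("GITHUB_SHA")
    if not sha:
        raise RuntimeError("neither AGENT_SERVER_SHA nor GITHUB_SHA is set")
    return sha


# Placeholder values standing in for real commit hashes.
ci_env = {"GITHUB_SHA": "merge-commit-sha", "AGENT_SERVER_SHA": "pr-head-sha"}
print(resolve_image_sha(ci_env))
```

Passing the environment as a parameter keeps the helper easy to test without modifying the real process environment.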
🔄 Running Examples with
| Example | Status | Duration | Cost |
|---|---|---|---|
| 01_standalone_sdk/02_custom_tools.py | ✅ PASS | 25.3s | $0.03 |
| 01_standalone_sdk/03_activate_skill.py | ✅ PASS | 17.9s | $0.03 |
| 01_standalone_sdk/05_use_llm_registry.py | ✅ PASS | 11.1s | $0.01 |
| 01_standalone_sdk/07_mcp_integration.py | ✅ PASS | 41.2s | $0.04 |
| 01_standalone_sdk/09_pause_example.py | ✅ PASS | 13.5s | $0.01 |
| 01_standalone_sdk/10_persistence.py | ✅ PASS | 27.9s | $0.02 |
| 01_standalone_sdk/11_async.py | ✅ PASS | 28.7s | $0.03 |
| 01_standalone_sdk/12_custom_secrets.py | ✅ PASS | 11.2s | $0.01 |
| 01_standalone_sdk/13_get_llm_metrics.py | ✅ PASS | 20.8s | $0.01 |
| 01_standalone_sdk/14_context_condenser.py | ✅ PASS | 4m 22s | $0.54 |
| 01_standalone_sdk/17_image_input.py | ✅ PASS | 17.2s | $0.02 |
| 01_standalone_sdk/18_send_message_while_processing.py | ✅ PASS | 25.1s | $0.01 |
| 01_standalone_sdk/19_llm_routing.py | ✅ PASS | 13.2s | $0.02 |
| 01_standalone_sdk/20_stuck_detector.py | ✅ PASS | 19.5s | $0.02 |
| 01_standalone_sdk/21_generate_extraneous_conversation_costs.py | ✅ PASS | 14.8s | $0.00 |
| 01_standalone_sdk/22_anthropic_thinking.py | ✅ PASS | 18.1s | $0.01 |
| 01_standalone_sdk/23_responses_reasoning.py | ✅ PASS | 1m 10s | $0.01 |
| 01_standalone_sdk/24_planning_agent_workflow.py | ✅ PASS | 2m 54s | $0.22 |
| 01_standalone_sdk/25_agent_delegation.py | ✅ PASS | 2m 4s | $0.17 |
| 01_standalone_sdk/26_custom_visualizer.py | ✅ PASS | 18.6s | $0.02 |
| 01_standalone_sdk/28_ask_agent_example.py | ✅ PASS | 28.3s | $0.02 |
| 01_standalone_sdk/29_llm_streaming.py | ✅ PASS | 38.6s | $0.03 |
| 01_standalone_sdk/30_tom_agent.py | ✅ PASS | 12.2s | $0.01 |
| 01_standalone_sdk/31_iterative_refinement.py | ❌ FAIL (Timed out after 600 seconds) | 10m 0s | -- |
| 01_standalone_sdk/32_configurable_security_policy.py | ✅ PASS | 23.7s | $0.02 |
| 01_standalone_sdk/34_critic_example.py | ❌ FAIL (Exit code 1) | 3.9s | -- |
| 01_standalone_sdk/36_event_json_to_openai_messages.py | ✅ PASS | 21.3s | $0.01 |
| 01_standalone_sdk/37_llm_profile_store.py | ✅ PASS | 4.1s | $0.00 |
| 02_remote_agent_server/01_convo_with_local_agent_server.py | ✅ PASS | 1m 5s | $0.04 |
| 02_remote_agent_server/02_convo_with_docker_sandboxed_server.py | ❌ FAIL (Exit code 1) | 4.7s | -- |
| 02_remote_agent_server/03_browser_use_with_docker_sandboxed_server.py | ❌ FAIL (Exit code 1) | 5.7s | -- |
| 02_remote_agent_server/04_convo_with_api_sandboxed_server.py | ❌ FAIL (Exit code 1) | 5m 11s | -- |
| 02_remote_agent_server/07_convo_with_cloud_workspace.py | ✅ PASS | 28.3s | $0.03 |
| 02_remote_agent_server/08_convo_with_apptainer_sandboxed_server.py | ❌ FAIL (Exit code 1) | 4.7s | -- |
| 04_llm_specific_tools/01_gpt5_apply_patch_preset.py | ✅ PASS | 31.8s | $0.04 |
| 04_llm_specific_tools/02_gemini_file_tools.py | ✅ PASS | 1m 10s | $0.05 |
| 05_skills_and_plugins/01_loading_agentskills/main.py | ✅ PASS | 10.5s | $0.01 |
| 05_skills_and_plugins/02_loading_plugins/main.py | ✅ PASS | 7.5s | $0.01 |
❌ Some tests failed
Total: 38 | Passed: 32 | Failed: 6 | Total Cost: $1.52
Failed examples:
- examples/01_standalone_sdk/31_iterative_refinement.py: Timed out after 600 seconds
- examples/01_standalone_sdk/34_critic_example.py: Exit code 1
- examples/02_remote_agent_server/02_convo_with_docker_sandboxed_server.py: Exit code 1
- examples/02_remote_agent_server/03_browser_use_with_docker_sandboxed_server.py: Exit code 1
- examples/02_remote_agent_server/04_convo_with_api_sandboxed_server.py: Exit code 1
- examples/02_remote_agent_server/08_convo_with_apptainer_sandboxed_server.py: Exit code 1
- Regenerate uv.lock with pinned libtmux git dependency
- Simplify Generator[T, None, None] to Generator[T] in test files

Co-authored-by: openhands <openhands@all-hands.dev>
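The Generator simplification relies on a Python 3.13 change: the send and return type parameters of Generator now default to None, so Generator[T] means the same as Generator[T, None, None]. A minimal sketch follows; the `countdown` function is a made-up example and is not taken from the test files.

```python
from collections.abc import Generator


def countdown(n: int) -> Generator[int]:
    """Yield n, n-1, ..., 1; nothing is sent in and nothing is returned."""
    while n > 0:
        yield n
        n -= 1


print(list(countdown(3)))  # prints [3, 2, 1]
```

Importing from `collections.abc` rather than `typing` is a choice made for this sketch. The single-parameter spelling is understood by type checkers that target Python 3.13.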
🔄 Running Examples with
| Example | Status | Duration | Cost |
|---|---|---|---|
| 01_standalone_sdk/02_custom_tools.py | ✅ PASS | 25.7s | $0.03 |
| 01_standalone_sdk/03_activate_skill.py | ✅ PASS | 20.0s | $0.03 |
| 01_standalone_sdk/05_use_llm_registry.py | ✅ PASS | 14.0s | $0.01 |
| 01_standalone_sdk/07_mcp_integration.py | ✅ PASS | 30.2s | $0.02 |
| 01_standalone_sdk/09_pause_example.py | ✅ PASS | 18.5s | $0.01 |
| 01_standalone_sdk/10_persistence.py | ✅ PASS | 35.4s | $0.03 |
| 01_standalone_sdk/11_async.py | ✅ PASS | 31.4s | $0.03 |
| 01_standalone_sdk/12_custom_secrets.py | ✅ PASS | 20.0s | $0.02 |
| 01_standalone_sdk/13_get_llm_metrics.py | ✅ PASS | 21.1s | $0.02 |
| 01_standalone_sdk/14_context_condenser.py | ✅ PASS | 6m 29s | $0.84 |
| 01_standalone_sdk/17_image_input.py | ✅ PASS | 16.4s | $0.02 |
| 01_standalone_sdk/18_send_message_while_processing.py | ✅ PASS | 23.3s | $0.01 |
| 01_standalone_sdk/19_llm_routing.py | ✅ PASS | 15.6s | $0.02 |
| 01_standalone_sdk/20_stuck_detector.py | ✅ PASS | 15.8s | $0.02 |
| 01_standalone_sdk/21_generate_extraneous_conversation_costs.py | ✅ PASS | 11.4s | $0.00 |
| 01_standalone_sdk/22_anthropic_thinking.py | ✅ PASS | 16.7s | $0.01 |
| 01_standalone_sdk/23_responses_reasoning.py | ✅ PASS | 57.8s | $0.01 |
| 01_standalone_sdk/24_planning_agent_workflow.py | ✅ PASS | 51.3s | $0.05 |
| 01_standalone_sdk/25_agent_delegation.py | ✅ PASS | 1m 44s | $0.19 |
| 01_standalone_sdk/26_custom_visualizer.py | ✅ PASS | 22.6s | $0.03 |
| 01_standalone_sdk/28_ask_agent_example.py | ✅ PASS | 35.2s | $0.03 |
| 01_standalone_sdk/29_llm_streaming.py | ✅ PASS | 44.6s | $0.04 |
| 01_standalone_sdk/30_tom_agent.py | ✅ PASS | 10.6s | $0.01 |
| 01_standalone_sdk/31_iterative_refinement.py | ✅ PASS | 3m 7s | $0.22 |
| 01_standalone_sdk/32_configurable_security_policy.py | ✅ PASS | 16.6s | $0.02 |
| 01_standalone_sdk/34_critic_example.py | ❌ FAIL (Exit code 1) | 3.8s | -- |
| 01_standalone_sdk/36_event_json_to_openai_messages.py | ✅ PASS | 13.1s | $0.00 |
| 01_standalone_sdk/37_llm_profile_store.py | ✅ PASS | 4.1s | $0.00 |
| 02_remote_agent_server/01_convo_with_local_agent_server.py | ✅ PASS | 59.4s | $0.04 |
| 02_remote_agent_server/02_convo_with_docker_sandboxed_server.py | ❌ FAIL (Exit code 1) | 4.8s | -- |
| 02_remote_agent_server/03_browser_use_with_docker_sandboxed_server.py | ❌ FAIL (Exit code 1) | 4.9s | -- |
| 02_remote_agent_server/04_convo_with_api_sandboxed_server.py | ❌ FAIL (Exit code 1) | 5m 11s | -- |
| 02_remote_agent_server/07_convo_with_cloud_workspace.py | ✅ PASS | 28.6s | $0.02 |
| 02_remote_agent_server/08_convo_with_apptainer_sandboxed_server.py | ❌ FAIL (Exit code 1) | 5.6s | -- |
| 04_llm_specific_tools/01_gpt5_apply_patch_preset.py | ✅ PASS | 20.8s | $0.03 |
| 04_llm_specific_tools/02_gemini_file_tools.py | ✅ PASS | 57.7s | $0.07 |
| 05_skills_and_plugins/01_loading_agentskills/main.py | ✅ PASS | 14.0s | $0.02 |
| 05_skills_and_plugins/02_loading_plugins/main.py | ✅ PASS | 7.6s | $0.01 |
❌ Some tests failed
Total: 38 | Passed: 33 | Failed: 5 | Total Cost: $1.90
Failed examples:
- examples/01_standalone_sdk/34_critic_example.py: Exit code 1
- examples/02_remote_agent_server/02_convo_with_docker_sandboxed_server.py: Exit code 1
- examples/02_remote_agent_server/03_browser_use_with_docker_sandboxed_server.py: Exit code 1
- examples/02_remote_agent_server/04_convo_with_api_sandboxed_server.py: Exit code 1
- examples/02_remote_agent_server/08_convo_with_apptainer_sandboxed_server.py: Exit code 1
Looks like there are a few issues preventing this PR from being merged!
If you'd like me to help, just leave a comment. Feel free to include any additional details that might help me get this PR into a better state.
Summary
This PR upgrades the project to Python 3.13 and pins libtmux to neubig/libtmux@fix/new-session-race-condition which contains the fix for the race condition reported in libtmux#624.
Context
See upstream PR: tmux-python/libtmux#625
The issue was that `new_session()` in libtmux would:
1. Run `tmux new-session -P -F#{session_id}` to create the session
2. Run `tmux list-sessions` to fetch full session data

This created a race condition in Python 3.13 environments (especially with PyInstaller + Docker) where `list-sessions` might not see the newly created session yet, causing `TmuxObjectDoesNotExist` errors.

The fix expands the `-F` format string to include all Session fields and parses the output directly, eliminating the separate list-sessions query entirely.

Changes
- `target-version` from `py312` to `py313` in root pyproject.toml (ruff)
- `pythonVersion` from `3.12` to `3.13` in root pyproject.toml (pyright)
- `server.yml` build matrix from 3.12 to 3.13
- `pypi-release.yml` from 3.12 to 3.13
- `pr-review` action from 3.12 to 3.13
- `libtmux @ git+https://github.com/neubig/libtmux.git@fix/new-session-race-condition`

Testing
This PR needs integration tests to verify the libtmux fix works correctly in our CI environment. The `integration-test` label should trigger those tests.

Note
This is a draft PR to test the libtmux fix. Once the upstream PR is merged and released to PyPI, we should update the dependency to the released version.
Related issues:
Agent Server images for this PR
• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server
Variants & Base Images
• eclipse-temurin:17-jdk
• nikolaik/python-nodejs:python3.13-nodejs22
• golang:1.21-bookworm

Pull (multi-arch manifest)

    # Each variant is a multi-arch manifest supporting both amd64 and arm64
    docker pull ghcr.io/openhands/agent-server:bcad0e2-python

Run
All tags pushed for this build
About Multi-Architecture Support
• Each variant tag (e.g., bcad0e2-python) is a multi-arch manifest supporting both amd64 and arm64
• Architecture-specific tags (e.g., bcad0e2-python-amd64) are also available if needed
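The tag naming visible in the examples above (short SHA, then variant, then an optional architecture suffix) can be expressed as a small helper. This is a sketch inferred only from the two tags shown here. The function name `image_tag` is hypothetical, and the real build scripts may compose tags differently.

```python
# Registry path as shown in the pull command above.
REPO = "ghcr.io/openhands/agent-server"


def image_tag(short_sha, variant, arch=None):
    """Compose '<sha>-<variant>' or '<sha>-<variant>-<arch>' (pattern inferred from the examples)."""
    tag = f"{short_sha}-{variant}"
    return f"{tag}-{arch}" if arch else tag


print(f"{REPO}:{image_tag('bcad0e2', 'python')}")
print(f"{REPO}:{image_tag('bcad0e2', 'python', arch='amd64')}")
```

The first form names the multi-arch manifest, and the second names a single-architecture image.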