fix: make TDgpt Docker model startup and downloads robust#35353
fix: make TDgpt Docker model startup and downloads robust#35353tomchon wants to merge 5 commits into
Conversation
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Code Review
This pull request introduces a centralized Hugging Face download utility (hf_download.py) that implements fallback logic and environment variable configuration, replacing direct snapshot_download calls across multiple model server scripts. Additionally, it refines argument handling in the Docker entrypoint and the model downloader tool, while introducing new unit tests. Review feedback recommends adopting argparse for more robust command-line argument parsing and utilizing os.path.join for cross-platform path construction in test scripts.
| if len(sys.argv) < 4: | ||
| print("invalid parameters, e.g.,:\n python model_downloader.py model_directory model_name ep_enable") | ||
| exit(-1) | ||
| sys.exit(1) | ||
|
|
||
| path = sys.argv[1].strip('\'"') | ||
| model_name = sys.argv[2].strip('\'"') | ||
| ep_enable = bool(sys.argv[3]) | ||
| ep_raw = sys.argv[3] | ||
| try: | ||
| ep_enable = parse_bool(ep_raw) | ||
| except ValueError: | ||
| print(f"invalid ep_enable parameter: {ep_raw}") | ||
| sys.exit(1) |
There was a problem hiding this comment.
The argument parsing here is done manually using sys.argv. While functional, this is less robust and user-friendly than using the argparse module. The other server scripts in this PR have been updated to use argparse. For consistency and better maintainability (e.g., automatic help messages, better error handling), consider refactoring this part to use argparse as well.
| import pytest | ||
|
|
||
|
|
||
| sys.path.append(os.path.dirname(os.path.abspath(__file__)) + "/..") |
There was a problem hiding this comment.
For better portability and to adhere to best practices, it's recommended to use os.path.join for constructing paths instead of string concatenation. This will ensure the code works correctly on different operating systems.
| sys.path.append(os.path.dirname(os.path.abspath(__file__)) + "/..") | |
| sys.path.append(os.path.join(os.path.dirname(os.path.abspath(__file__)), "..")) |
| import pytest | ||
|
|
||
|
|
||
| tdgpt_root = os.path.dirname(os.path.abspath(__file__)) + "/.." |
There was a problem hiding this comment.
For better portability and to adhere to best practices, it's recommended to use os.path.join for constructing paths instead of string concatenation. This will ensure the code works correctly on different operating systems.
| tdgpt_root = os.path.dirname(os.path.abspath(__file__)) + "/.." | |
| tdgpt_root = os.path.join(os.path.dirname(os.path.abspath(__file__)), "..") |
There was a problem hiding this comment.
Pull request overview
This PR improves TDgpt Docker startup and Hugging Face model download robustness by centralizing endpoint/boolean handling and updating model servers + downloader to use the shared helper.
Changes:
- Add
hf_download.pyhelper with robust boolean parsing, endpoint resolution, and fallback from custom/mirror endpoints to official Hugging Face. - Update TDgpt model downloader and non-tdtsfm model servers to use the shared download helper.
- Add focused unit tests for boolean parsing, endpoint precedence, and fallback behavior.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| tools/tdgpt/taosanalytics/misc/hf_download.py | New shared helper for endpoint resolution and snapshot download fallback. |
| tools/tdgpt/taosanalytics/misc/model_downloader.py | Uses helper; fixes CLI boolean parsing and exits with consistent error code. |
| tools/tdgpt/taosanalytics/tsfmservice/chronos-server.py | Switches to helper for model downloads; adds misc path for import. |
| tools/tdgpt/taosanalytics/tsfmservice/moirai-server.py | Switches to helper for model downloads; adds misc path for import. |
| tools/tdgpt/taosanalytics/tsfmservice/moment-server.py | Switches to helper for model downloads; adds misc path for import. |
| tools/tdgpt/taosanalytics/tsfmservice/timemoe-server.py | Switches to helper for model downloads; adds misc path for import. |
| tools/tdgpt/taosanalytics/tsfmservice/timesfm-server.py | Switches to helper for model downloads; adds misc path for import. |
| tools/tdgpt/tests/hf_download_test.py | New tests covering parse/endpoint precedence/fallback behaviors. |
| tools/tdgpt/tests/model_downloader_test.py | New test verifying invalid boolean input exits with code 1. |
| packaging/docker/entrypoint.sh | Passes argparse flags to non-tdtsfm servers instead of positional args. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| model_args=(--model-folder "${subdir}" --model-name "${model_name}") | ||
| if [ "${ENDPOINT_ENABLE}" == "True" ]; then | ||
| model_args+=(--enable-ep) | ||
| fi |
| **kwargs: Any, | ||
| ) -> str: | ||
| endpoint = resolve_endpoint(enable_ep) | ||
| print(f"set the download ep:{endpoint}") |
|
Superseded by the clean download-only PR: #35355 |
Summary
Test Plan