Skip to content

Add retry logic to download_and_extract_meshes#120

Open
gbocchio-bdai wants to merge 2 commits into
rai-opensource:mainfrom
gbocchio-bdai:gbocchio/add-retries-to-meshes-download
Open

Add retry logic to download_and_extract_meshes#120
gbocchio-bdai wants to merge 2 commits into
rai-opensource:mainfrom
gbocchio-bdai:gbocchio/add-retries-to-meshes-download

Conversation

@gbocchio-bdai
Copy link
Copy Markdown

Description

This PR adds a _request_with_retry function to judo/utils/assets.py and uses it to download the meshes; essentially making download_and_extract_meshes more resilient in CI.

It's not infrequent for one given HTTP request to fail in CI due to network instability. Adding a few retries solves most transient failures.

Comment thread judo/utils/assets.py Outdated
asset_name: str = "meshes.zip",
tag: str | None = None,
retries: int = 3,
timeout: float = 30.0,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it in seconds or minutes? Consider specify the unit in the variable name?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

Comment thread judo/utils/assets.py Outdated
Comment thread judo/utils/assets.py Outdated
return response
except (requests.ConnectionError, requests.Timeout, requests.HTTPError) as exc:
if attempt >= retries - 1:
raise
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consider removing this if.. raise: Other raise statements below won't be reached if we raise here.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the if ... raise. I added a similar if to prevent sleeping on the last retry if it fails.

Comment thread judo/utils/assets.py Outdated
@gbocchio-bdai gbocchio-bdai requested a review from dta-bdai May 11, 2026 13:21
Copy link
Copy Markdown
Collaborator

@dta-bdai dta-bdai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update! One more request change to simplify the code. Can you please take a look?

Comment thread judo/utils/assets.py
time.sleep(backoff_time_s)
if last_exception is not None:
raise last_exception
raise RuntimeError("Unexpected retry loop termination in _request_with_retry")
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this line won't never been reached.

Comment thread judo/utils/assets.py
Comment on lines 46 to 58
last_exception: Exception | None = None
for attempt in range(retries):
for attempt in range(retries + 1):
try:
response = requests.request(method, url, timeout=timeout, **request_kwargs)
response = requests.request(method, url, timeout=timeout_s, **request_kwargs)
response.raise_for_status()
return response
except (requests.ConnectionError, requests.Timeout, requests.HTTPError) as exc:
if attempt >= retries - 1:
raise
last_exception = exc
time.sleep(backoff_time)
except (requests.ConnectionError, requests.Timeout, requests.HTTPError) as request_exception:
last_exception = request_exception
if attempt < retries:
time.sleep(backoff_time_s)
if last_exception is not None:
raise last_exception
raise RuntimeError("Unexpected retry loop termination in _request_with_retry")
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
last_exception: Exception | None = None
for attempt in range(retries):
for attempt in range(retries + 1):
try:
response = requests.request(method, url, timeout=timeout, **request_kwargs)
response = requests.request(method, url, timeout=timeout_s, **request_kwargs)
response.raise_for_status()
return response
except (requests.ConnectionError, requests.Timeout, requests.HTTPError) as exc:
if attempt >= retries - 1:
raise
last_exception = exc
time.sleep(backoff_time)
except (requests.ConnectionError, requests.Timeout, requests.HTTPError) as request_exception:
last_exception = request_exception
if attempt < retries:
time.sleep(backoff_time_s)
if last_exception is not None:
raise last_exception
raise RuntimeError("Unexpected retry loop termination in _request_with_retry")
for attempt in range(retries):
try:
response = requests.request(method, url, timeout=timeout, **request_kwargs)
response.raise_for_status()
return response
except (requests.ConnectionError, requests.Timeout, requests.HTTPError) as request_exception:
if attempt < retries:
time.sleep(backoff_time_s)
continue
raise request_exception

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you try the suggestion?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants