Add qa_count_clear_total to the outputs #161
Conversation
Add qa_count_clear_total to the outputs
Add some GDAL parameters to get around processes giving up too quickly when a scene is in cold storage.
```python
os.environ["GDAL_HTTP_TIMEOUT"] = "300"      # default is 30s
os.environ["GDAL_HTTP_MAX_RETRY"] = "10"     # default is 0
os.environ["GDAL_HTTP_RETRY_DELAY"] = "5"    # seconds between retries
```
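A minimal sketch of how these settings could be applied without clobbering values an operator has already exported. The helper name `configure_gdal_retries` is hypothetical, not part of this PR; GDAL reads these environment variables when a dataset is first opened, so they need to be set before any reads happen.

```python
import os

# Cold-storage-friendly GDAL HTTP options (values from this PR).
COLD_STORAGE_GDAL_OPTS = {
    "GDAL_HTTP_TIMEOUT": "300",      # per-request timeout; default is 30s
    "GDAL_HTTP_MAX_RETRY": "10",     # retries on failed HTTP requests; default is 0
    "GDAL_HTTP_RETRY_DELAY": "5",    # seconds between retries
}


def configure_gdal_retries(opts=COLD_STORAGE_GDAL_OPTS):
    """Set GDAL HTTP options, keeping any values already in the environment."""
    for key, value in opts.items():
        os.environ.setdefault(key, value)


configure_gdal_retries()
```

Using `setdefault` means a deployment can still override any option via its own environment configuration.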
For full integration test results, refer to the Tests directory README.
robbibt left a comment:
Hey @vnewey, I've made some suggestions below. I think you can accept the suggestions directly through GitHub if they look OK.
The GDAL change is the biggest unknown here I think - I was wondering if it might be safer to adopt the existing datacube GDAL cloud defaults instead. But I guess we can test it and see how it goes.
```python
# Record params in logs
log.info(f"{run_id}: Using parameters {input_params}")

# This is to help when scenes need to be moved from s3 cold storage
```
I don't know enough about these specific options to know if they're likely to have any negative downstream impacts. I do wonder, though, whether we should just adopt the `GDAL_CLOUD_DEFAULTS` that are used by datacube when you run `configure_s3_access`:
https://github.com/opendatacube/odc-loader/blob/main/src/odc/loader/_rio.py#L76-L80
```python
GDAL_CLOUD_DEFAULTS = {
    "GDAL_DISABLE_READDIR_ON_OPEN": "EMPTY_DIR",
    "GDAL_HTTP_MAX_RETRY": "10",
    "GDAL_HTTP_RETRY_DELAY": "0.5",
}
```
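One possible middle ground, sketched below as an assumption rather than a tested change: start from the odc-loader cloud defaults and layer the longer timeout from this PR on top, so the two sources of settings stay reconcilable. Dict merge order means the PR's value wins on any overlapping key.

```python
import os

# Defaults used by datacube's configure_s3_access (from odc-loader).
GDAL_CLOUD_DEFAULTS = {
    "GDAL_DISABLE_READDIR_ON_OPEN": "EMPTY_DIR",
    "GDAL_HTTP_MAX_RETRY": "10",
    "GDAL_HTTP_RETRY_DELAY": "0.5",
}

# Layer this PR's longer timeout on top of the cloud defaults;
# later entries in the merge override earlier ones on overlap.
merged = {**GDAL_CLOUD_DEFAULTS, "GDAL_HTTP_TIMEOUT": "300"}

os.environ.update(merged)
```

Whether the shorter `GDAL_HTTP_RETRY_DELAY` of 0.5s is enough for cold-storage restores is exactly the open question in this thread, so that value would still need testing.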
Updated variable name for clarity and adjusted comments for S3 configuration.
