Feat/add bulk presigned url#1344
Conversation
| for access_id, object_ids in access_to_object_ids.items(): | ||
| try: | ||
| result = bulk_get_signed_url_for_file( | ||
| file_ids=object_ids, requested_protocol=access_id |
There was a problem hiding this comment.
we need to pass passports in here and make sure they are supported.
ga4gh_passports=bulk_request.passports
And we should reject if a request tries to give both a bearer token and a passport. See how the single-object drs endpoint works as reference e.g.
if request.headers.get("Authorization") and bulk_request.passports: raise Forbidden("Sending both a bearer token and passports is not allowed")
There was a problem hiding this comment.
Also unmock bulk_signed_url_for_file in tests and test both token and passports
|
|
||
|
|
||
| class BulkObjectAccessRequest(BaseModel): | ||
| passports: Optional[list[str]] |
There was a problem hiding this comment.
this has to be passports: Optional[list[str]] = None or else it will still require the field I think. Write a test to that doesn't pass a passports key just to make sure.
| indexd_server = config.get("INDEXD") or config["BASE_URL"] + "/index" | ||
| url = indexd_server + "/index/" | ||
| try: | ||
| res = requests.get(url + file_id) |
There was a problem hiding this comment.
I think the main idea was to use a single POST to indexd's /bulk/documents endpoint here
| unresolved_by_code[error_code].extend(object_ids) | ||
| except NotFound as e: | ||
| logger.debug(f"NotFound fetching signed URLs for {access_id}: {e}") | ||
| unresolved_by_code[404].extend(object_ids) |
There was a problem hiding this comment.
A request for [valid_guid, missing_guid] under the same access ID returns both as unresolved even though one could have been resolved.
use indexd's bulk documents endpoint and split returned/missing IDs into resolved and unresolved lists
| return {"url": signed_url} | ||
|
|
||
|
|
||
| @enable_audit_logging |
There was a problem hiding this comment.
the function that runs within this decorator ( /fence/resources/audit/utils.py -> create_audit_log_for_request()) only will send logs for certain requests. We need to add support for the new endpoints to actually sned the logs to audit service
There was a problem hiding this comment.
Still looking into this and would like a bit of clarification on what it usually looks like in our codebase. Should have pushed changes for the rest of the comments
There was a problem hiding this comment.
https://github.com/uc-cdis/fence/blob/master/fence/resources/audit/utils.py#L66-L95 This segment of code checks the path and makes decisions on auditing based on it, so you'll need to update the handler here to account for it.
Integration TestsTest summary after running integration tests
Test summary after rerunning failed integration tests
Please find the detailed integration test report here Please find the detailed integration test report after rerunning failed tests here Please find the Github Action logs here |
|
|
||
|
|
||
| @blueprint.route("/download/bulk", methods=["POST"]) | ||
| def download_bulk_files(): |
There was a problem hiding this comment.
Update the swagger / openapi yamls to include this endpoint.
| return {"url": signed_url} | ||
|
|
||
|
|
||
| @enable_audit_logging |
There was a problem hiding this comment.
https://github.com/uc-cdis/fence/blob/master/fence/resources/audit/utils.py#L66-L95 This segment of code checks the path and makes decisions on auditing based on it, so you'll need to update the handler here to account for it.
Integration TestsFailed to Prepare CI environment. The following pods are not in healthy state: Please find the Github Action logs here |
Integration Tests
Please find the detailed integration test report here Please find the Github Action logs here |
Integration TestsTest summary after running integration tests
Test summary after rerunning failed integration tests
Please find the detailed integration test report here Please find the detailed integration test report after rerunning failed tests here Please find the Github Action logs here |
Integration TestsTest summary after running integration tests
Test summary after rerunning failed integration tests
Please find the detailed integration test report here Please find the detailed integration test report after rerunning failed tests here Please find the Github Action logs here |
Integration TestsTest summary after running integration tests
Test summary after rerunning failed integration tests
Please find the detailed integration test report here Please find the detailed integration test report after rerunning failed tests here Please find the Github Action logs here |
…e into feat/add-bulk-presigned-url
Integration TestsTest summary after running integration tests
Test summary after rerunning failed integration tests
Please find the detailed integration test report here Please find the detailed integration test report after rerunning failed tests here Please find the Github Action logs here |
|
|
||
| if file_authz: | ||
| authz_key = ( | ||
| tuple(file_authz) if isinstance(file_authz, list) else file_authz |
There was a problem hiding this comment.
I think we need tuple(sorted(file_authz)) so it's the same as the line 644
| default=default_expires_in, | ||
| ) | ||
|
|
||
| if ga4gh_passports and not config["GA4GH_PASSPORTS_TO_DRS_ENABLED"]: |
There was a problem hiding this comment.
we should move this check to the blueprint level before calling bulk_get_signed_url_for_file, ill put example there
| "You cannot supply both GA4GH passports and a token " | ||
| "in the Authorization header of a request." | ||
| ) | ||
|
|
There was a problem hiding this comment.
put the passports enabled check here:
if bulk_request.passports and not config["GA4GH_PASSPORTS_TO_DRS_ENABLED"]:
raise NotSupported(
"Using GA4GH Passports as a means of authentication and authorization "
"is not supported by this instance of Gen3."
)| auth_info = _get_auth_info_for_id_or_from_request( | ||
| sub_type=int, db_session=db_session | ||
| ) | ||
| flask.g.audit_data = { |
There was a problem hiding this comment.
ok so I think there is a specific case here where if they authenticate with a Passport but no bearer token, _get_auth_info_for_id_or_from_request will set "username": auth_info["username"] for the audit log to be anonymous. So we should check for passports before here, and if they are using passports set the flask.g.audit_data username using users_from_passports.items()
Link to JIRA ticket if there is one:
New Features
Breaking Changes
Bug Fixes
Improvements
Dependency updates
Deployment changes