Support file level ingestion and file listing search#1339
Open
ricofurtado wants to merge 5 commits into main from
Conversation
owner=user.user_id,
added_users=body.user_ids,
)
return JSONResponse({"success": True, "allowed_users": merged, "acl_result": str(result)})

owner=user.user_id,
removed_users=body.user_ids,
)
return JSONResponse({"success": True, "allowed_users": remaining, "acl_result": str(result)})

error=str(e),
)
return JSONResponse(
{"error": f"Failed to browse files: {str(e)}"},

except Exception as e:
logger.error("Failed to list files", error=str(e))
return JSONResponse(
{"error": "Failed to list files", "detail": str(e)},

except Exception as e:
logger.error("Failed to search files", error=str(e))
return JSONResponse(
{"error": "Failed to search files", "detail": str(e)},

logger.exception("Failed to connect to S3 during credential test.")
return JSONResponse(
{"error": "Could not connect to S3 with the provided configuration."},
{"error": describe_s3_error(exc, conn_config.get("bucket_names"))},

logger.exception("Failed to list S3 buckets for connection %s", connection_id)
return JSONResponse({"error": "Failed to list buckets"}, status_code=500)
return JSONResponse(
{"error": describe_s3_error(exc, connection.config.get("bucket_names"))},

logger.exception("Failed to list buckets from S3 for connection %s", connection_id)
return JSONResponse({"error": "Failed to list buckets"}, status_code=500)
return JSONResponse(
{"error": describe_s3_error(exc, connection.config.get("bucket_names"))},
The current ingestion workflow behaves like a bulk or folder-scoped operation, which limits user control and visibility when a bucket contains many files. Users cannot select and ingest a single file from within a bucket in the same straightforward way they can in a Google Drive–style experience. As a result, ingestion becomes an all-or-nothing operation, and it’s difficult to understand what content is present, what has already been ingested, and what ingestion state each file is in.
A second gap is discoverability: there is no first-class API surface for listing files in a bucket or searching for them by name/metadata. This reduces usability (especially for large buckets), makes troubleshooting ingestion harder, and constrains the ability to build UI flows that show a clear “documents in this bucket” view.
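As a sketch of that missing API surface, listing and name search could be exposed as two read-only operations over per-file records (the record fields, status values, and function names below are illustrative assumptions, not the PR's schema):

```python
from dataclasses import dataclass

@dataclass
class FileRecord:
    key: str     # object key within the bucket
    size: int    # size in bytes
    status: str  # e.g. "not_ingested", "ingesting", "ingested", "failed"

# Hypothetical in-memory index; a real implementation would query
# S3 and the ingestion database instead.
_FILES = {
    "reports/q1.pdf": FileRecord("reports/q1.pdf", 1024, "ingested"),
    "reports/q2.pdf": FileRecord("reports/q2.pdf", 2048, "not_ingested"),
}

def list_files(prefix: str = "") -> list[FileRecord]:
    """List files in a bucket, optionally scoped to a key prefix."""
    return [r for k, r in sorted(_FILES.items()) if k.startswith(prefix)]

def search_files(query: str) -> list[FileRecord]:
    """Case-insensitive substring match on the object key."""
    q = query.lower()
    return [r for k, r in sorted(_FILES.items()) if q in k.lower()]
```

Returning the ingestion status alongside each listing entry is what enables the "documents in this bucket" view: the UI can render what is present and what state each file is in from a single call.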
At a minimum, the platform needs to treat each file as an individually addressable unit with its own lifecycle state and metadata, while still supporting multiple independent ingestions within the same bucket.
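Treating each file as an individually addressable unit implies a small per-file state machine, so one file can be re-ingested or retried without touching its neighbors. One possible shape (the state names and transitions are assumptions for illustration, not the PR's design):

```python
from enum import Enum

class IngestionState(Enum):
    NOT_INGESTED = "not_ingested"
    QUEUED = "queued"
    INGESTING = "ingesting"
    INGESTED = "ingested"
    FAILED = "failed"

# Allowed transitions: each file advances independently of its bucket.
_TRANSITIONS = {
    IngestionState.NOT_INGESTED: {IngestionState.QUEUED},
    IngestionState.QUEUED: {IngestionState.INGESTING},
    IngestionState.INGESTING: {IngestionState.INGESTED, IngestionState.FAILED},
    IngestionState.INGESTED: {IngestionState.QUEUED},  # re-ingest
    IngestionState.FAILED: {IngestionState.QUEUED},    # retry
}

def advance(current: IngestionState, target: IngestionState) -> IngestionState:
    """Move a single file's state, rejecting invalid transitions."""
    if target not in _TRANSITIONS[current]:
        raise ValueError(f"Cannot go from {current.value} to {target.value}")
    return target
```

Because transitions are validated per file, multiple independent ingestions within the same bucket fall out naturally: two files can be in "ingesting" and "failed" at the same time without any bucket-level coordination.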