Support file level ingestion and file listing search #1339

Open
ricofurtado wants to merge 5 commits into main from support-file-level-ingestion-and-file-listing-search

@ricofurtado
Collaborator

The current ingestion workflow behaves like a bulk or folder-scoped operation, which limits user control and visibility when a bucket contains many files. Users cannot select and ingest a single file from a bucket as easily as they can in a Google Drive–style experience. As a result, ingestion becomes an all-or-nothing operation, and it is hard to tell what content is present, what has already been ingested, and which ingestion state each file is in.

A second gap is discoverability: there is no first-class API surface for listing files in a bucket or searching for them by name/metadata. This reduces usability (especially for large buckets), makes troubleshooting ingestion harder, and constrains the ability to build UI flows that show a clear “documents in this bucket” view.
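As a rough illustration of what "searching by name/metadata" could mean, here is a minimal sketch over an in-memory list of file entries. The function name `search_files`, the entry shape (`name` plus a `metadata` dict), and the glob-style matching are all assumptions made for illustration; they are not taken from this PR's implementation, which would presumably query the storage backend or search index instead.

```python
from fnmatch import fnmatchcase


def search_files(files, name_pattern="*", **metadata):
    """Filter hypothetical file entries by name glob and exact metadata match.

    `files` is assumed to be a list of dicts shaped like
    {"name": str, "metadata": dict}. This shape is illustrative only.
    """
    return [
        entry
        for entry in files
        # Case-sensitive glob match on the file name.
        if fnmatchcase(entry["name"], name_pattern)
        # Every requested metadata key must be present with an equal value.
        and all(
            entry.get("metadata", {}).get(key) == value
            for key, value in metadata.items()
        )
    ]
```

Calling `search_files(files, "*.pdf", owner="bob")` would return only the PDF entries whose metadata has `owner` set to `"bob"`; calling it with no filters returns the full listing.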

At a minimum, the platform needs to treat each file as an individually addressable unit with its own lifecycle state and metadata, while still supporting multiple independent ingestions within the same bucket.
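One way to picture "an individually addressable unit with its own lifecycle state and metadata" is the sketch below. The names `IngestionState`, `FileRecord`, and `FileRegistry`, the specific states, and the `bucket/key` identifier scheme are hypothetical; the PR description does not specify them.

```python
from dataclasses import dataclass, field
from enum import Enum


class IngestionState(Enum):
    # Hypothetical lifecycle states; the PR does not enumerate the real ones.
    PENDING = "pending"
    INGESTING = "ingesting"
    INGESTED = "ingested"
    FAILED = "failed"


@dataclass
class FileRecord:
    bucket: str
    key: str
    state: IngestionState = IngestionState.PENDING
    metadata: dict = field(default_factory=dict)

    @property
    def file_id(self) -> str:
        # Assumed addressing scheme: bucket name plus object key.
        return f"{self.bucket}/{self.key}"


class FileRegistry:
    """In-memory stand-in for wherever per-file state would actually be stored."""

    def __init__(self):
        self._records = {}

    def register(self, bucket, key, **metadata):
        record = FileRecord(bucket, key, metadata=metadata)
        self._records[record.file_id] = record
        return record

    def set_state(self, file_id, state):
        # Each file changes state independently of others in the same bucket.
        self._records[file_id].state = state

    def list_bucket(self, bucket):
        return [r for r in self._records.values() if r.bucket == bucket]
```

Because state lives on each record rather than on the bucket, two files in the same bucket can be ingested independently: marking `docs/a.pdf` as `INGESTED` leaves `docs/b.pdf` in `PENDING`.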

@github-actions (bot) added the frontend, backend, and tests labels on Apr 6, 2026
Comment thread src/api/acl.py
owner=user.user_id,
added_users=body.user_ids,
)
return JSONResponse({"success": True, "allowed_users": merged, "acl_result": str(result)})
Comment thread src/api/acl.py
owner=user.user_id,
removed_users=body.user_ids,
)
return JSONResponse({"success": True, "allowed_users": remaining, "acl_result": str(result)})
Comment thread src/api/connectors.py
error=str(e),
)
return JSONResponse(
{"error": f"Failed to browse files: {str(e)}"},
Comment thread src/api/files.py
except Exception as e:
logger.error("Failed to list files", error=str(e))
return JSONResponse(
{"error": "Failed to list files", "detail": str(e)},
Comment thread src/api/files.py
except Exception as e:
logger.error("Failed to search files", error=str(e))
return JSONResponse(
{"error": "Failed to search files", "detail": str(e)},
@ricofurtado requested a review from mfortman11 on April 14, 2026 at 18:32
logger.exception("Failed to connect to S3 during credential test.")
return JSONResponse(
{"error": "Could not connect to S3 with the provided configuration."},
{"error": describe_s3_error(exc, conn_config.get("bucket_names"))},
logger.exception("Failed to list S3 buckets for connection %s", connection_id)
return JSONResponse({"error": "Failed to list buckets"}, status_code=500)
return JSONResponse(
{"error": describe_s3_error(exc, connection.config.get("bucket_names"))},
logger.exception("Failed to list buckets from S3 for connection %s", connection_id)
return JSONResponse({"error": "Failed to list buckets"}, status_code=500)
return JSONResponse(
{"error": describe_s3_error(exc, connection.config.get("bucket_names"))},
Labels

backend 🔷 Issues related to backend services (OpenSearch, Langflow, APIs) frontend 🟨 Issues related to the UI/UX tests

2 participants