feat(mcp): Add support for reading Connector Builder draft manifests #992
- Add get_connector_builder_project() to api_util for fetching builder project data via /v1/connector_builder_projects/get endpoint
- Add has_draft, draft_manifest properties and get_builder_project_data() method to CustomCloudSourceDefinition
- Add include_draft parameter to get_custom_source_definition MCP tool
- Add new get_connector_builder_draft_manifest MCP tool for dedicated draft manifest retrieval

Closes #991

Co-Authored-By: AJ Steers <aj@airbyte.io>
👋 Greetings, Airbyte Team Member! Here are some helpful tips and reminders for your convenience.

Testing This PyAirbyte Version

You can test this version of PyAirbyte using the following:

# Run PyAirbyte CLI from this branch:
uvx --from 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1773183595-mcp-draft-manifests' pyairbyte --help
# Install PyAirbyte from this branch for development:
pip install 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1773183595-mcp-draft-manifests'
The has_draft and draft_manifest properties now share a cached result from get_builder_project_data() instead of each making independent API requests. Co-Authored-By: AJ Steers <aj@airbyte.io>
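The shared-cache idea in this commit can be pictured with a minimal sketch. It is not the PyAirbyte implementation: `DefinitionSketch`, the injected `fetch` callable, and the response keys `hasDraft` and `draftManifest` are assumptions made for illustration.

```python
from __future__ import annotations

from typing import Any, Callable


class DefinitionSketch:
    """Hypothetical stand-in for CustomCloudSourceDefinition (not the real class)."""

    def __init__(self, fetch: Callable[[], dict[str, Any]]) -> None:
        self._fetch = fetch  # stand-in for the real API request
        self._builder_project_data: dict[str, Any] | None = None

    def get_builder_project_data(self, *, use_cache: bool = True) -> dict[str, Any]:
        # Only call the API when nothing is cached or the caller opts out.
        if self._builder_project_data is None or not use_cache:
            self._builder_project_data = self._fetch()
        return self._builder_project_data

    @property
    def has_draft(self) -> bool:
        # "hasDraft" is an assumed response key.
        return bool(self.get_builder_project_data().get("hasDraft", False))

    @property
    def draft_manifest(self) -> dict[str, Any] | None:
        # "draftManifest" is an assumed response key.
        return self.get_builder_project_data().get("draftManifest")
```

Reading both properties in sequence triggers a single fetch; passing `use_cache=False` forces a refresh.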
📝 Walkthrough

Adds read access to Connector Builder draft manifests: new API helper to fetch builder project data, caching and accessors in cloud connector objects, and MCP endpoint updates to optionally return draft manifest data alongside published manifests.
Sequence Diagram(s)

sequenceDiagram
participant Agent as MCP Agent/Client
participant Endpoint as MCP Endpoint (cloud.py)
participant Connector as CustomCloudSourceDefinition (connectors.py)
participant API as Airbyte API (api_util.py)
Agent->>Endpoint: request draft manifest (definition_id, workspace_id)
activate Endpoint
Endpoint->>Connector: get definition and call get_builder_project_data()
activate Connector
Connector->>API: get_connector_builder_project(workspace_id, builder_project_id)
activate API
API->>API: call /connector_builder_projects/get_with_manifest
API-->>Connector: return builder project data
deactivate API
Connector->>Connector: cache data, extract has_draft & draft_manifest
Connector-->>Endpoint: return builder project data
deactivate Connector
Endpoint->>Endpoint: combine published_manifest with has_draft/draft_manifest
Endpoint-->>Agent: respond with combined manifest details
deactivate Endpoint
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~22 minutes

Would you like me to suggest targeted unit tests or examples for the new endpoints and accessors, wdyt?

🚥 Pre-merge checks | ✅ 5 passed
PyTest Results (Full)

413 tests ±0, 395 ✅ -1, 24m 37s ⏱️ -22s

Results for commit fba92c4. ± Comparison against base commit fa52519.

This pull request skips 1 test.
Devin, please resolve Pytest failures. Also, please do an end-to-end test of the new features using your mcp-tool-test poe task.
…_get_previous_sync_result

- Add '= None' to all workspace_id keyword-only params in MCP cloud tools to fix pre-existing FastMCP exclude_args registration error when AIRBYTE_CLOUD_WORKSPACE_ID env var is set
- Guard test_get_previous_sync_result against empty sync logs with pytest.skip instead of IndexError

Co-Authored-By: AJ Steers <aj@airbyte.io>
Aaron ("AJ") Steers (@aaronsteers) Addressed both items:

1. Pytest failures resolved:
2. End-to-end MCP tool testing:

The 403 is expected: the Config API …
Not expected. Not okay. Try harder to fix it or reach out to me in Slack.
- Changed endpoint from /connector_builder_projects/get to /connector_builder_projects/get_with_manifest per Config API OpenAPI spec
- Added _add_defaults_for_exclude_args() to patch function signatures at registration time, satisfying FastMCP's requirement for excluded args to have defaults without adding = None to source function signatures
- Reverted previous workspace_id = None changes per reviewer feedback

Co-Authored-By: AJ Steers <aj@airbyte.io>
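A minimal sketch of the signature-patching technique described in this commit, using only the standard `inspect` module. The helper name `add_defaults_for_excluded` and the `example_tool` function are hypothetical; the real `_add_defaults_for_exclude_args()` iterates over registered tools and may differ in detail.

```python
from __future__ import annotations

import inspect
from typing import Callable


def add_defaults_for_excluded(func: Callable, exclude_args: list[str]) -> None:
    """Give each excluded parameter a default of None, on the signature only."""
    sig = inspect.signature(func)
    new_params = [
        p.replace(default=None)
        if p.name in exclude_args and p.default is inspect.Parameter.empty
        else p
        for p in sig.parameters.values()
    ]
    # inspect.signature() honors __signature__ when it is set on the function.
    func.__signature__ = sig.replace(parameters=new_params)


def example_tool(definition_id: str, *, workspace_id: str) -> str:  # hypothetical tool
    return f"{workspace_id}/{definition_id}"
```

Only the introspected signature changes: calling `example_tool("x")` without `workspace_id` still raises `TypeError`, because the function's real defaults are untouched. The sketch also assumes excluded parameters are keyword-only, as `workspace_id` is described here; adding a default to a positional parameter that precedes one without a default would make `Signature.replace()` raise `ValueError`.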
Actionable comments posted: 2
♻️ Duplicate comments (1)
airbyte/mcp/cloud.py (1)
1724-1734: ⚠️ Potential issue | 🟡 Minor

Drop the Python-level default from `include_draft` to stay consistent with the rest of the MCP surface?

The `wait` and `wait_timeout` parameters in `run_cloud_sync` use only `Field(default=...)` without Python-level defaults. Using the same pattern here (removing `= False`) keeps the signature cleaner and aligns with how other `@mcp_tool` parameters are defined throughout this file.

Suggested change

     include_draft: Annotated[
         bool,
         Field(
             description=(
                 "Whether to include the Connector Builder draft manifest in the response. "
                 "If True and a draft exists, the response will include 'has_draft' and "
                 "'draft_manifest' fields. Defaults to False."
             ),
             default=False,
         ),
    -] = False,
    +],

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@airbyte/mcp/cloud.py` around lines 1724 - 1734, The include_draft parameter currently has a Python-level default (= False) in addition to Field(default=False); remove the Python-level default so the signature matches other MCP params (like wait and wait_timeout) that only use Field(default=...), i.e., keep Annotated[bool, Field(..., default=False)] without the trailing "= False"; update the declare symbol include_draft in this `@mcp_tool` parameter definition so only the Field supplies the default and the function signature remains consistent with run_cloud_sync/wait/wait_timeout patterns.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@airbyte/mcp/cloud.py`:
- Around line 2681-2687: The code currently uses the module-level constant
AIRBYTE_CLOUD_WORKSPACE_ID_IS_SET (frozen at import) to decide exclude_args
before calling register_mcp_tools; instead, compute the workspace-id presence at
registration time by reading the environment (or the same runtime check used to
set AIRBYTE_CLOUD_WORKSPACE_ID_IS_SET) immediately before building exclude_args,
then pass that runtime result to decide whether to set exclude_args =
["workspace_id"] or None and call _add_defaults_for_exclude_args(exclude_args)
only when exclude_args is truthy; update the call site around register_mcp_tools
(referencing register_mcp_tools, _add_defaults_for_exclude_args, and
AIRBYTE_CLOUD_WORKSPACE_ID_IS_SET) so tests or embedded-server setups that
change the env after import get the correct schema registration.
- Around line 2638-2673: Limit the signature patching in
_add_defaults_for_exclude_args to only cloud module tools: when iterating
_REGISTERED_TOOLS, skip any func whose origin module isn't this module (e.g.,
check func.__module__ != __name__ or use inspect.getmodule(func) to compare).
Keep the rest of the logic identical (compute needs_patch, build new_params,
assign func.__signature__), but only perform those steps for functions from this
module so local/third-party tools are not mutated; reference
_add_defaults_for_exclude_args and _REGISTERED_TOOLS to locate the change.
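The two inline comments above could be addressed along these lines. This is only a sketch: `compute_exclude_args` and `is_local_tool` are hypothetical helper names, and the assumption that a non-empty `AIRBYTE_CLOUD_WORKSPACE_ID` means "set" may not match the check behind `AIRBYTE_CLOUD_WORKSPACE_ID_IS_SET`.

```python
from __future__ import annotations

import os
from typing import Callable


def compute_exclude_args(env_var: str = "AIRBYTE_CLOUD_WORKSPACE_ID") -> list[str] | None:
    """Decide at call time (not import time) whether to hide workspace_id."""
    # Assumption: a non-empty value counts as "set".
    return ["workspace_id"] if os.environ.get(env_var) else None


def is_local_tool(func: Callable, module_name: str) -> bool:
    """Return True when func was defined in the named module."""
    return getattr(func, "__module__", None) == module_name
```

Calling `compute_exclude_args()` immediately before registration picks up environment changes made after import, and filtering with `is_local_tool(func, __name__)` would keep the signature patch away from tools defined in other modules.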
📒 Files selected for processing (2)
airbyte/_util/api_util.py
airbyte/mcp/cloud.py
…endpoint The get_with_manifest endpoint requires the workspace that owns the builder project, which may differ from the caller's workspace. Updated get_connector_builder_project_for_definition_id to return the full response dict (including workspaceId), and use it when fetching project data. E2E tested with poe mcp-tool-test - successfully returns draft manifest, published manifest, and has_draft status. Co-Authored-By: AJ Steers <aj@airbyte.io>
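The owning-workspace lookup described in this commit can be sketched as follows. `resolve_project_workspace` is a hypothetical helper; only the `workspaceId` key comes from the commit message.

```python
from __future__ import annotations

from typing import Any


def resolve_project_workspace(response: dict[str, Any], caller_workspace_id: str) -> str:
    """Prefer the workspace that owns the builder project, else the caller's."""
    # `or` falls back when the key is missing, None, or an empty string.
    return response.get("workspaceId") or caller_workspace_id
```

Using `or` rather than a `.get()` default means an explicit `None` or empty value in the response also falls back to the caller's workspace.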
🧹 Nitpick comments (1)
airbyte/cloud/connectors.py (1)
496-548: Minor docstring tweak?

The docstring on lines 503-504 mentions `/v1/connector_builder_projects/get`, but the underlying API function `api_util.get_connector_builder_project` actually calls `/connector_builder_projects/get_with_manifest`. Would you want to update this for accuracy, wdyt?

Otherwise, the caching logic and error handling look solid here!

📝 Suggested docstring fix

     def get_builder_project_data(
         self,
         *,
         use_cache: bool = True,
     ) -> dict[str, Any]:
         """Fetch the full connector builder project data, including draft manifest if present.

    -    This calls the `/v1/connector_builder_projects/get` endpoint which returns
    +    This calls the `/v1/connector_builder_projects/get_with_manifest` endpoint which returns
         the project metadata and draft manifest (if one exists).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@airbyte/cloud/connectors.py` around lines 496 - 548, Docstring incorrectly references `/v1/connector_builder_projects/get` while the implementation calls api_util.get_connector_builder_project which hits `/connector_builder_projects/get_with_manifest`; update the docstring in get_builder_project_data to reflect the correct endpoint (`/connector_builder_projects/get_with_manifest`) and adjust any description text if needed to match that endpoint's semantics so docs accurately describe the API call performed by api_util.get_connector_builder_project.
📒 Files selected for processing (2)
airbyte/_util/api_util.py
airbyte/cloud/connectors.py
Co-Authored-By: AJ Steers <aj@airbyte.io>
…ults Prevents redundant API calls when builderProjectId is None by using a boolean flag (_connector_builder_project_id_fetched) to distinguish 'not yet fetched' from 'fetched but was None'. Co-Authored-By: AJ Steers <aj@airbyte.io>
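The "fetched" flag pattern from this commit, as a minimal sketch. `ProjectIdCache` and the injected `fetch` callable are hypothetical; the attribute names mirror the ones mentioned in the commit message.

```python
from __future__ import annotations

from typing import Callable


class ProjectIdCache:
    """Hypothetical illustration of caching a result that may legitimately be None."""

    def __init__(self, fetch: Callable[[], str | None]) -> None:
        self._fetch = fetch  # stand-in for the real API lookup
        self._connector_builder_project_id: str | None = None
        self._connector_builder_project_id_fetched = False

    @property
    def connector_builder_project_id(self) -> str | None:
        # Checking the flag, not the value, keeps a None result cached.
        if not self._connector_builder_project_id_fetched:
            self._connector_builder_project_id = self._fetch()
            self._connector_builder_project_id_fetched = True
        return self._connector_builder_project_id
```

A plain `if self._connector_builder_project_id is None:` check would re-fetch on every access whenever the API returns `None`, which is the redundant-call problem the flag avoids.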
feat(mcp): Add support for reading Connector Builder draft manifests
Summary
Adds the ability to read Connector Builder draft (unpublished) manifests via the MCP tools and the `CustomCloudSourceDefinition` class. Previously, only published manifests were accessible through the `DeclarativeSourceDefinitions` API, preventing MCP-powered agents from inspecting what users are actively building.

Changes:

- `api_util.py`:
  - `get_connector_builder_project()` function calling the `/v1/connector_builder_projects/get_with_manifest` Config API endpoint
  - `get_connector_builder_project_for_definition_id()` now returns the full response dict (including `workspaceId`) instead of just the project ID string
- `connectors.py`:
  - `get_builder_project_data()` method (with caching), `has_draft` and `draft_manifest` properties on `CustomCloudSourceDefinition`
  - `_builder_project_workspace_id`: the workspace that owns the builder project (may differ from the caller's workspace)
  - `_connector_builder_project_id_fetched` sentinel flag to properly cache `None` project ID results (avoids redundant API calls)
- `mcp/cloud.py`:
  - `get_custom_source_definition` gains `include_draft` parameter
  - `get_connector_builder_draft_manifest` dedicated MCP tool
  - `_add_defaults_for_exclude_args()` helper to patch function signatures for FastMCP `exclude_args` compatibility without adding Python-level `= None` defaults to source code

Closes #991
Updates since last revision
- Changed endpoint from `/connector_builder_projects/get` to `/connector_builder_projects/get_with_manifest` per Config API spec.
- The `get_with_manifest` endpoint requires the workspace that owns the builder project, which can differ from the caller's workspace. `get_for_definition_id` returns the correct owning workspace ID, which is now captured and used.
- The `connector_builder_project_id` property now uses a `_fetched` boolean flag so that a `None` API result is cached and doesn't trigger repeated API calls on every access.
- `exclude_args`: FastMCP requires excluded args to have Python-level defaults. Rather than adding `= None` to every `workspace_id` param, `_add_defaults_for_exclude_args()` patches signatures at registration time via `inspect.signature()`.
- `poe mcp-tool-test get_connector_builder_draft_manifest '{"definition_id": "..."}'` successfully returns draft manifest, published manifest, `has_draft` status, and builder project metadata.

Review & Testing Checklist for Human
- Assumes `get_for_definition_id` always returns the owning workspace ID that `get_with_manifest` needs. Verify this holds for definitions shared across workspaces or org-level definitions. If the owning workspace ID is ever missing, the fallback is the caller's workspace ID (line `self._builder_project_workspace_id or self.workspace.workspace_id`).
- `.get()` defaults: All API response access uses `.get()` with safe defaults (`None`, `False`, `{}`). If the response shape changes or keys are renamed, the code silently returns empty data instead of erroring. Consider whether stricter validation is needed.
- The return type of `get_connector_builder_project_for_definition_id` changed from `str | None` to `dict[str, Any]`. This is an internal function, but verify no other callers exist outside `connectors.py`.
- `_add_defaults_for_exclude_args` accesses `fastmcp_extensions.decorators._REGISTERED_TOOLS` (private API) and modifies `func.__signature__`. Verify this still works if FastMCP or fastmcp_extensions is upgraded.
- Run `get_connector_builder_draft_manifest` and verify draft manifest content matches what's in the Connector Builder UI.

Notes
Summary by CodeRabbit
New Features
Tests