feat(connectors): add smoke test source for destination regression testing #982
…sting Co-Authored-By: AJ Steers <aj@airbyte.io>
👋 Greetings, Airbyte Team Member! Here are some helpful tips and reminders for your convenience.
Testing This PyAirbyte Version
You can test this version of PyAirbyte using the following:
# Run PyAirbyte CLI from this branch:
uvx --from 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1770933386-smoke-test-source' pyairbyte --help
# Install PyAirbyte from this branch for development:
pip install 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1770933386-smoke-test-source'
Co-Authored-By: AJ Steers <aj@airbyte.io>
Aaron ("AJ") Steers (aaronsteers)
left a comment
Looks good! Test and debug by running locally until everything is fully tested
Will do! Running comprehensive local testing now across all 15 streams, custom scenario injection, filtering, and large batch generation.
…s module Co-Authored-By: AJ Steers <aj@airbyte.io>
… config options By default, high-volume scenarios (large_batch_stream) are excluded. The scenario_filter list is unioned with the boolean-driven sets, with overlaps deduplicated. Co-Authored-By: AJ Steers <aj@airbyte.io>
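The selection rule described in this commit message (boolean-driven sets unioned with `scenario_filter`, overlaps deduplicated, high-volume scenarios off by default) could be sketched roughly as follows. This is an illustration only, not the code from this PR: the flag names `include_default` and `include_high_volume` and the grouping of streams into the two lists are hypothetical, since the actual config option names are not shown here.

```python
from typing import Iterable, List, Optional

# Hypothetical grouping: only these stream names are mentioned in this PR.
DEFAULT_SCENARIOS = ["empty_stream", "wide_table_50_columns"]
HIGH_VOLUME_SCENARIOS = ["large_batch_stream"]


def select_scenarios(
    include_default: bool = True,        # hypothetical flag name
    include_high_volume: bool = False,   # hypothetical flag name; off by default
    scenario_filter: Optional[Iterable[str]] = None,
) -> List[str]:
    """Union the boolean-driven sets with the explicit filter list."""
    selected: List[str] = []
    if include_default:
        selected.extend(DEFAULT_SCENARIOS)
    if include_high_volume:
        selected.extend(HIGH_VOLUME_SCENARIOS)
    # Explicitly named scenarios are added on top of the boolean-driven sets.
    selected.extend(scenario_filter or [])
    # dict.fromkeys removes duplicates while keeping first-seen order.
    return list(dict.fromkeys(selected))
```

Under these assumptions, naming `large_batch_stream` in `scenario_filter` opts it in even when the high-volume flag is left at its default, and a stream named twice appears only once.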
…ide_table schema nullability Co-Authored-By: AJ Steers <aj@airbyte.io>
Co-Authored-By: AJ Steers <aj@airbyte.io>
…devin/1770933386-smoke-test-source
Devin, log a new issue regarding STATE here in the pyairbyte repo, then replace the TK-TODO comment with a regular TODO comment and link to the newly logged issue.
…ling Co-Authored-By: AJ Steers <aj@airbyte.io>
Co-Authored-By: AJ Steers <aj@airbyte.io>
Co-Authored-By: AJ Steers <aj@airbyte.io>
60c502b
into
devin/1769653021-destination-pyairbyte-universal
This PR targets the following PR:
Summary
Adds a Smoke Test Source connector (`source-smoke-test`) for destination regression testing, as described in #981. This source generates synthetic data across 15 predefined scenarios designed to exercise common destination failure patterns.

The source also supports dynamic scenario injection via a `custom_scenarios` config field, enabling new failure conditions (e.g., anonymized from production errors) to be added at runtime without code changes.

New files:
- `airbyte/cli/universal_connector/smoke_test_source.py`: `SourceSmokeTest` class + `PREDEFINED_SCENARIOS` data
- `airbyte/cli/universal_connector/run_smoke_test.py`: CLI entry point
- `__init__.py` exports and `pyproject.toml` entry point registration

Local Testing Results
All CLI commands verified locally:
- `spec`: outputs valid JSON schema
- `check --config '{}'`: passes with 15 scenarios
- `discover --config '{}'`: returns all 15 streams
- `read` with full catalog: emits 1028 records across 14 streams (`empty_stream` correctly emits 0)
- `scenario_filter`: correctly filters discover and read to only named streams
- `large_batch_record_count=5`: correctly limits `large_batch_stream` to 5 records
- `large_batch_record_count=0`: correctly emits 0 records (stream still appears in catalog)
- `custom_scenarios`: successfully injects and reads custom scenario data

Review & Testing Checklist for Human
- `wide_table_50_columns` schema mismatch: The second record sets all 49 columns to `None`, but the schema declares them as `"type": "string"` (not nullable). This may cause issues with strict schema validation in some destinations.
- `large_batch_record_count=0` behavior: The stream still appears in `discover()` but emits zero records. Verify whether this is the desired behavior or if the stream should be excluded from the catalog entirely.
- Test with `destination-pyairbyte` (from feat(connectors): add universal source and destination using PyAirbyte #969) to verify compatibility.

Recommended test plan:
1. Run `source-smoke-test spec` and verify the output JSON schema
2. Run `source-smoke-test check --config '{}'` and confirm success
3. Run `source-smoke-test discover --config '{}'` and verify all 15 streams appear
4. Run `source-smoke-test read --config '{}' --catalog <catalog.json>` with a catalog selecting a few streams, and verify records are emitted correctly
5. Test with `destination-pyairbyte` (from feat(connectors): add universal source and destination using PyAirbyte #969) to verify end-to-end compatibility

Notes
Requested by: Aaron ("AJ") Steers (@aaronsteers)
Devin session: https://app.devin.ai/sessions/02ea317be9054bcba5e48a6ed1622620
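
The `wide_table_50_columns` nullability concern from the review checklist can be illustrated with a small, dependency-free sketch. It assumes the usual JSON Schema convention that a property accepts `None` only when `"null"` is among its declared types. The column names (`col_1` … `col_49`) and the helper functions are hypothetical and are not taken from the connector's code.

```python
def accepts_null(property_schema: dict) -> bool:
    """Return True if a JSON Schema property permits null values."""
    if "type" not in property_schema:
        return True  # no type constraint means any value is allowed
    declared = property_schema["type"]
    types = declared if isinstance(declared, list) else [declared]
    return "null" in types


def make_nullable(property_schema: dict) -> dict:
    """Return a copy of the property schema that also accepts null."""
    if accepts_null(property_schema):
        return dict(property_schema)
    declared = property_schema["type"]
    types = declared if isinstance(declared, list) else [declared]
    return {**property_schema, "type": ["null", *types]}


def null_violations(record: dict, schema: dict) -> list:
    """List the fields that are None although the schema forbids null."""
    props = schema.get("properties", {})
    return [
        name
        for name, value in record.items()
        if value is None and name in props and not accepts_null(props[name])
    ]


# Hypothetical stand-in for the stream: an id plus 49 string columns.
columns = [f"col_{i}" for i in range(1, 50)]
strict_schema = {
    "type": "object",
    "properties": {"id": {"type": "integer"}, **{c: {"type": "string"} for c in columns}},
}
relaxed_schema = {
    "type": "object",
    "properties": {k: make_nullable(v) for k, v in strict_schema["properties"].items()},
}
# Second record from the checklist item: every non-id column is None.
null_record = {"id": 2, **{c: None for c in columns}}

print(len(null_violations(null_record, strict_schema)))
print(len(null_violations(null_record, relaxed_schema)))
```

Declaring the columns as `["null", "string"]` is one way to make the all-null record valid under strict validation. Whether the connector should relax the schema or change the record is a design decision for the PR authors.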