feat(connectors): add smoke test source for destination regression testing #982

Merged
Aaron ("AJ") Steers (aaronsteers) merged 11 commits into devin/1769653021-destination-pyairbyte-universal from devin/1770933386-smoke-test-source
Feb 26, 2026
Conversation

@aaronsteers Aaron ("AJ") Steers (aaronsteers) commented Feb 12, 2026

This PR targets the following PR:


Summary

Adds a Smoke Test Source connector (source-smoke-test) for destination regression testing, as described in #981. This source generates synthetic data across 15 predefined scenarios designed to exercise common destination failure patterns:

  • Type coverage: basic types, timestamps, large decimals, nested JSON/arrays
  • Null handling: nullable columns across all types, always-null columns, null-vs-empty-vs-zero
  • Naming edge cases: reserved SQL words, CamelCase, dots/dashes/spaces in column names, long column names, mixed-case stream names
  • Schema variations: wide table (50 columns), no-primary-key stream, empty stream, single-record stream
  • Batch sizes: configurable large batch (default 1000 records)
  • Unicode/special strings: international characters, escape sequences

The source also supports dynamic scenario injection via a custom_scenarios config field, enabling new failure conditions (e.g., anonymized from production errors) to be added at runtime without code changes.
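
As a rough illustration of runtime scenario injection, the sketch below builds a config with one `custom_scenarios` entry and serializes it for `--config`. The keys inside the scenario (`name`, `json_schema`, `primary_key`, `records`) and the helper `make_scenario` are assumptions made for this example; the description above does not spell out the field layout, so the output of `spec` remains the authority on the real shape.

```python
import json


def make_scenario(name, columns, records, primary_key=None):
    """Build one hypothetical custom scenario entry.

    The key names used here (name, json_schema, primary_key, records) are
    assumptions for illustration, not the connector's confirmed config schema.
    """
    return {
        "name": name,
        "json_schema": {
            "type": "object",
            "properties": {col: {"type": types} for col, types in columns.items()},
        },
        "primary_key": [[key] for key in (primary_key or [])],
        "records": records,
    }


# A made-up failure condition: a nullable numeric column holding null and negative zero.
scenario = make_scenario(
    name="negative_zero_amounts",
    columns={"id": ["integer"], "amount": ["null", "number"]},
    records=[{"id": 1, "amount": -0.0}, {"id": 2, "amount": None}],
    primary_key=["id"],
)
config = {"custom_scenarios": [scenario]}
config_json = json.dumps(config)  # string that could be passed via --config
```

Keeping scenarios as plain data like this is what would let an anonymized production failure be replayed without a code change.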

New files:

  • airbyte/cli/universal_connector/smoke_test_source.py: SourceSmokeTest class + PREDEFINED_SCENARIOS data
  • airbyte/cli/universal_connector/run_smoke_test.py: CLI entry point
  • Updated __init__.py exports and pyproject.toml entry point registration

Local Testing Results

All CLI commands verified locally:

  • spec — outputs valid JSON schema
  • check --config '{}' — passes with 15 scenarios
  • discover --config '{}' — returns all 15 streams
  • read with full catalog — emits 1028 records across 14 streams (empty_stream correctly emits 0)
  • scenario_filter — correctly filters discover and read to only named streams
  • large_batch_record_count=5 — correctly limits large_batch_stream to 5 records
  • large_batch_record_count=0 — correctly emits 0 records (stream still appears in catalog)
  • custom_scenarios — successfully injects and reads custom scenario data
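
Record counts like the ones above can be re-checked with a small script. The sketch below assumes the `read` output follows the Airbyte protocol convention of one JSON message per line, with `RECORD` messages carrying the stream name under `record.stream`. The helper name `count_records` and the sample lines (including the stream name `demo_stream`) are made up for illustration.

```python
import json
from collections import Counter


def count_records(lines):
    """Count RECORD messages per stream in Airbyte-style output (one JSON object per line)."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            message = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate stray non-JSON log lines
        if isinstance(message, dict) and message.get("type") == "RECORD":
            counts[message["record"]["stream"]] += 1
    return counts


# Illustrative sample; "demo_stream" is a made-up stream name.
sample_output = [
    '{"type": "LOG", "log": {"level": "INFO", "message": "starting read"}}',
    '{"type": "RECORD", "record": {"stream": "large_batch_stream", "data": {"id": 1}, "emitted_at": 0}}',
    '{"type": "RECORD", "record": {"stream": "large_batch_stream", "data": {"id": 2}, "emitted_at": 0}}',
    '{"type": "RECORD", "record": {"stream": "demo_stream", "data": {"id": 1}, "emitted_at": 0}}',
    "not json",
]
counts = count_records(sample_output)
total = sum(counts.values())
```

Passing `sys.stdin` instead of `sample_output` would let the same function consume piped output from a `read` run. A stream that emits nothing, such as empty_stream, simply has a count of zero.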

Review & Testing Checklist for Human

  • No automated tests are included. The source has not been verified with any unit or integration tests. Consider whether tests should be required before merge or tracked as follow-up.
  • wide_table_50_columns schema mismatch: The second record sets all 49 columns to None, but the schema declares them as "type": "string" (not nullable). This may cause issues with strict schema validation in some destinations.
  • large_batch_record_count=0 behavior: The stream still appears in discover() but emits zero records. Verify whether this is the desired behavior or if the stream should be excluded from the catalog entirely.
  • End-to-end test: Pipe output into destination-pyairbyte (from feat(connectors): add universal source and destination using PyAirbyte #969) to verify compatibility.
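
The nullability concern raised for wide_table_50_columns can be checked mechanically. The sketch below is not taken from the PR; it assumes each stream is described by a JSON Schema `properties` map and flags `None` values in columns whose declared `type` does not include `"null"`. The function names and the three-column sample are hypothetical.

```python
def is_nullable(prop_schema):
    """Return True if a JSON Schema property permits null values."""
    declared = prop_schema.get("type")
    if declared is None:
        return True  # no type constraint, so anything (including null) is allowed
    if isinstance(declared, list):
        return "null" in declared
    return declared == "null"


def find_null_violations(json_schema, records):
    """Return (record_index, column) pairs where None sits in a non-nullable column."""
    properties = json_schema.get("properties", {})
    violations = []
    for index, record in enumerate(records):
        for column, value in record.items():
            prop = properties.get(column)
            if prop is not None and value is None and not is_nullable(prop):
                violations.append((index, column))
    return violations


# Hypothetical miniature of the wide-table case: strict string columns, second record all None.
strict_schema = {
    "type": "object",
    "properties": {
        "id": {"type": "integer"},
        "col_1": {"type": "string"},
        "col_2": {"type": "string"},
    },
}
nullable_schema = {
    "type": "object",
    "properties": {
        "id": {"type": "integer"},
        "col_1": {"type": ["null", "string"]},
        "col_2": {"type": ["null", "string"]},
    },
}
records = [
    {"id": 1, "col_1": "a", "col_2": "b"},
    {"id": 2, "col_1": None, "col_2": None},
]
strict_violations = find_null_violations(strict_schema, records)
nullable_violations = find_null_violations(nullable_schema, records)
```

With the strict schema the second record is flagged for both string columns; declaring the columns as `["null", "string"]` clears the violations.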

Recommended test plan:

  1. Run source-smoke-test spec and verify the output JSON schema
  2. Run source-smoke-test check --config '{}' and confirm success
  3. Run source-smoke-test discover --config '{}' and verify all 15 streams appear
  4. Run source-smoke-test read --config '{}' --catalog <catalog.json> with a catalog selecting a few streams, and verify records are emitted correctly
  5. Pipe output into destination-pyairbyte (from feat(connectors): add universal source and destination using PyAirbyte #969) to verify end-to-end compatibility
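
For step 4, a catalog file selecting a few streams can be derived from the `discover` output. The sketch below assumes the standard Airbyte protocol shapes: a `CATALOG` message containing `catalog.streams`, and a configured catalog whose entries wrap each stream with `sync_mode` and `destination_sync_mode`. The helper name and the sample discover line are illustrative, and the choice of `full_refresh` with `overwrite` is an assumption about what a smoke test would want.

```python
import json


def build_configured_catalog(discover_lines, selected=None):
    """Turn the CATALOG message from discover output into a configured catalog dict.

    If `selected` is given, only streams whose name is in it are kept.
    """
    for line in discover_lines:
        try:
            message = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip non-JSON lines
        if isinstance(message, dict) and message.get("type") == "CATALOG":
            return {
                "streams": [
                    {
                        "stream": stream,
                        "sync_mode": "full_refresh",
                        "destination_sync_mode": "overwrite",
                    }
                    for stream in message["catalog"]["streams"]
                    if selected is None or stream["name"] in selected
                ]
            }
    raise ValueError("no CATALOG message found in discover output")


# Illustrative discover output with two of the streams named in this PR.
sample_discover = [
    json.dumps(
        {
            "type": "CATALOG",
            "catalog": {
                "streams": [
                    {
                        "name": "empty_stream",
                        "json_schema": {"type": "object", "properties": {}},
                        "supported_sync_modes": ["full_refresh"],
                    },
                    {
                        "name": "large_batch_stream",
                        "json_schema": {"type": "object", "properties": {}},
                        "supported_sync_modes": ["full_refresh"],
                    },
                ]
            },
        }
    )
]
catalog = build_configured_catalog(sample_discover, selected={"empty_stream"})
catalog_json = json.dumps(catalog, indent=2)  # contents for catalog.json
```

Writing `catalog_json` to a file gives something to pass as `--catalog` in step 4.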

Notes

Requested by: Aaron ("AJ") Steers (@aaronsteers)
Devin session: https://app.devin.ai/sessions/02ea317be9054bcba5e48a6ed1622620


…sting

Co-Authored-By: AJ Steers <aj@airbyte.io>
@devin-ai-integration
Contributor

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

@github-actions

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

Testing This PyAirbyte Version

You can test this version of PyAirbyte using the following:

# Run PyAirbyte CLI from this branch:
uvx --from 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1770933386-smoke-test-source' pyairbyte --help

# Install PyAirbyte from this branch for development:
pip install 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1770933386-smoke-test-source'

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

  • /fix-pr - Fixes most formatting and linting issues
  • /uv-lock - Updates uv.lock file
  • /test-pr - Runs tests with the updated PyAirbyte
  • /prerelease - Builds and publishes a prerelease version to PyPI

Community Support

Questions? Join the #pyairbyte channel in our Slack workspace.

Co-Authored-By: AJ Steers <aj@airbyte.io>
Contributor Author

Looks good! Test and debug by running locally until everything is fully tested

@devin-ai-integration
Contributor

Will do! Running comprehensive local testing now across all 15 streams, custom scenario injection, filtering, and large batch generation.

github-actions bot commented Feb 12, 2026

PyTest Results (Fast Tests Only, No Creds)

343 tests  +17   343 ✅ +17   5m 43s ⏱️ +4s
  1 suites ± 0     0 💤 ± 0 
  1 files   ± 0     0 ❌ ± 0 

Results for commit 59b39e8. ± Comparison against base commit 4e326cd.

♻️ This comment has been updated with latest results.

…s module

Co-Authored-By: AJ Steers <aj@airbyte.io>
@aaronsteers Aaron ("AJ") Steers (aaronsteers) marked this pull request as ready for review February 12, 2026 22:19
devin-ai-integration[bot]

This comment was marked as resolved.

devin-ai-integration bot and others added 2 commits February 12, 2026 22:22
… config options

By default, high-volume scenarios (large_batch_stream) are excluded.
The scenario_filter list is unioned with the boolean-driven sets,
with overlaps deduplicated.

Co-Authored-By: AJ Steers <aj@airbyte.io>
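
The selection rule in the commit message above (boolean-driven sets unioned with scenario_filter, overlaps deduplicated, high-volume scenarios off by default) could be sketched roughly as follows. The option names `include_default` and `include_high_volume` and the function `select_scenarios` are hypothetical, since the commit title is truncated and the actual config keys are not shown here. `fast_stream_a` and `fast_stream_b` are placeholder stream names.

```python
def select_scenarios(
    default_names,
    high_volume_names,
    include_default=True,
    include_high_volume=False,
    scenario_filter=None,
):
    """Union the boolean-driven scenario sets with an explicit filter list.

    Order is preserved and duplicates are dropped. The two boolean flags are
    hypothetical stand-ins for the connector's real config options.
    """
    selected = []
    seen = set()

    def add(names):
        for name in names:
            if name not in seen:
                seen.add(name)
                selected.append(name)

    if include_default:
        add(default_names)
    if include_high_volume:
        add(high_volume_names)
    add(scenario_filter or [])  # explicit names are unioned in, not intersected
    return selected


defaults = ["fast_stream_a", "fast_stream_b"]  # placeholder names
high_volume = ["large_batch_stream"]

by_default = select_scenarios(defaults, high_volume)
with_filter = select_scenarios(
    defaults, high_volume, scenario_filter=["large_batch_stream", "fast_stream_a"]
)
```

Under this reading, naming large_batch_stream in the filter brings it back even though the high-volume flag is off, and a name already selected is not repeated.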
…ide_table schema nullability

Co-Authored-By: AJ Steers <aj@airbyte.io>
github-actions bot commented Feb 12, 2026

PyTest Results (Full)

413 tests  +17   396 ✅ +17   24m 41s ⏱️ +5s
  1 suites ± 0    17 💤 ± 0 
  1 files   ± 0     0 ❌ ± 0 

Results for commit 59b39e8. ± Comparison against base commit 4e326cd.

♻️ This comment has been updated with latest results.

@aaronsteers
Contributor Author

Devin, log a new issue regarding STATE here in the pyairbyte repo, then replace the TK-TODO comment with a regular TODO comment and link to the newly logged issue.

…ling

Co-Authored-By: AJ Steers <aj@airbyte.io>
@devin-ai-integration
Contributor

Done. Created #987 for STATE message handling, and replaced the TK-TODO with a regular TODO linking to the issue (eb27252).


devin-ai-integration bot and others added 2 commits February 26, 2026 16:59
Co-Authored-By: AJ Steers <aj@airbyte.io>
Co-Authored-By: AJ Steers <aj@airbyte.io>
@aaronsteers Aaron ("AJ") Steers (aaronsteers) merged commit 60c502b into devin/1769653021-destination-pyairbyte-universal Feb 26, 2026
14 of 15 checks passed
@aaronsteers Aaron ("AJ") Steers (aaronsteers) deleted the devin/1770933386-smoke-test-source branch February 26, 2026 17:04