Summary
Improve ExternalTaskSensor diagnostics to make cross-DAG dependency failures easier to debug, particularly when execution_delta is misconfigured or when the expected upstream DAG run does not exist.
Currently, when an ExternalTaskSensor waits indefinitely (or until timeout), users often need to manually inspect multiple DAGs and execution dates to determine the root cause. This makes troubleshooting slow and error-prone, especially during backfills or schedule changes.
Problem
When an ExternalTaskSensor has an incorrect execution_delta, a schedule that does not line up with the upstream DAG, or an invalid upstream dependency, the sensor may remain in a waiting state without providing enough context to quickly identify the issue.
Common failure scenarios include:
- Upstream DAG run does not exist for the computed execution date.
- Upstream task failed or was skipped.
- execution_delta is incorrectly configured.
- Schedule changes create execution-date mismatches.
- Backfills introduce unexpected dependency mappings.
In these cases, users often need to manually:
- Calculate the expected execution date.
- Locate the upstream DAG run.
- Verify task status.
- Determine whether the dependency configuration is correct.
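The first manual step, calculating the expected execution date, can be sketched as follows. This is a minimal illustration, assuming the documented ExternalTaskSensor behavior of subtracting execution_delta from the sensor run's own logical date. The helper name `expected_upstream_date` is hypothetical and not part of Airflow.

```python
from datetime import datetime, timedelta, timezone


def expected_upstream_date(logical_date: datetime, execution_delta: timedelta) -> datetime:
    """Return the upstream logical date a sensor run would look for.

    Hypothetical helper: assumes execution_delta is subtracted from the
    sensor run's logical date, so a positive delta points into the past.
    """
    return logical_date - execution_delta


# A sensor run for 2026-06-04 with a one-day delta looks for the
# upstream run dated 2026-06-03.
sensor_run = datetime(2026, 6, 4, tzinfo=timezone.utc)
print(expected_upstream_date(sensor_run, timedelta(days=1)).isoformat())
# prints 2026-06-03T00:00:00+00:00
```

If the upstream DAG has no run at exactly the computed date, for example because the two schedules are offset by an hour, the sensor has nothing to match against.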
Proposed Improvement
Enhance sensor diagnostics and logging to provide additional context when waiting or timing out.
Potential improvements:
- Log the computed upstream execution date.
- Include upstream DAG ID and task ID in diagnostic messages.
- Report whether a matching upstream DAG run was found.
- Differentiate between:
  - Missing DAG run
  - Missing task instance
  - Failed task
  - Waiting task
- Include dependency configuration details (execution_delta, execution date mapping, etc.).
- Provide actionable hints in timeout/error messages.
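One possible way to distinguish the four cases is sketched below. It is not tied to Airflow's internals: `UpstreamStatus` and `classify_upstream` are hypothetical names, the caller is assumed to have already looked up the upstream DAG run and task instance, and grouping "skipped" with the failed states is an assumption that follows the failure scenarios listed under Problem.

```python
from enum import Enum
from typing import Optional


class UpstreamStatus(Enum):
    MISSING_DAG_RUN = "missing DAG run"
    MISSING_TASK_INSTANCE = "missing task instance"
    FAILED_TASK = "failed task"
    WAITING_TASK = "waiting task"
    SUCCEEDED = "succeeded"


# Assumption: these task-state strings count as "failed" for diagnostics.
FAILED_STATES = {"failed", "upstream_failed", "skipped"}


def classify_upstream(dag_run_found: bool, task_state: Optional[str]) -> UpstreamStatus:
    """Map the lookup results to a single diagnostic category.

    task_state is None when no task instance exists for the upstream run.
    """
    if not dag_run_found:
        return UpstreamStatus.MISSING_DAG_RUN
    if task_state is None:
        return UpstreamStatus.MISSING_TASK_INSTANCE
    if task_state in FAILED_STATES:
        return UpstreamStatus.FAILED_TASK
    if task_state == "success":
        return UpstreamStatus.SUCCEEDED
    # Any other state (queued, running, scheduled, ...) is still in progress.
    return UpstreamStatus.WAITING_TASK


print(classify_upstream(False, None).value)      # prints missing DAG run
print(classify_upstream(True, "running").value)  # prints waiting task
```

Checking for the DAG run first matters: a missing run usually points to a date-mapping problem, while a missing or failed task instance points to the upstream DAG itself.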
Example
Current behavior:
Proposed behavior:
Waiting for DAG 'upstream_dag'
Task: 'publish_data'
Expected execution date: 2026-06-03T00:00:00Z
execution_delta: 1 day
No matching upstream DAG run found for the expected execution date.
Possible causes:
- Incorrect execution_delta
- Upstream DAG did not run
- Schedule mismatch between DAGs
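A message like the proposed one could be assembled along these lines. This is a sketch only: `format_missing_run_message` is a hypothetical helper, and the layout simply mirrors the example above rather than any existing Airflow log format.

```python
from datetime import datetime, timedelta, timezone


def format_missing_run_message(
    dag_id: str,
    task_id: str,
    expected_date: datetime,
    execution_delta: timedelta,
) -> str:
    """Build the multi-line diagnostic for a missing upstream DAG run.

    expected_date must be timezone-aware; it is rendered in UTC.
    """
    stamp = expected_date.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    lines = [
        f"Waiting for DAG '{dag_id}'",
        f"Task: '{task_id}'",
        f"Expected execution date: {stamp}",
        # str(timedelta) renders one day as "1 day, 0:00:00".
        f"execution_delta: {execution_delta}",
        "No matching upstream DAG run found for the expected execution date.",
        "Possible causes:",
        "- Incorrect execution_delta",
        "- Upstream DAG did not run",
        "- Schedule mismatch between DAGs",
    ]
    return "\n".join(lines)


print(
    format_missing_run_message(
        "upstream_dag",
        "publish_data",
        datetime(2026, 6, 3, tzinfo=timezone.utc),
        timedelta(days=1),
    )
)
```

Keeping the message in one helper means the same text can be reused for periodic "still waiting" log lines and for the final timeout error.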
Benefits
- Faster root-cause analysis for dependency issues.
- Reduced operational overhead.
- Easier debugging during backfills.
- Improved observability of cross-DAG dependencies.
- Better onboarding experience for new contributors.
Additional Context
Telemetry-Airflow relies heavily on cross-DAG dependencies, making sensor observability particularly important. More descriptive diagnostics would help users identify configuration problems without manually inspecting multiple DAG runs and task instances.