Skip to content

🚨 CRITICAL FIX: Race condition in shared resource manager#16

Draft
fosk06 wants to merge 1 commit intomainfrom
feat/fix-shared-resource-manager-bug
Draft

🚨 CRITICAL FIX: Race condition in shared resource manager#16
fosk06 wants to merge 1 commit intomainfrom
feat/fix-shared-resource-manager-bug

Conversation

@fosk06
Copy link
Copy Markdown
Owner

@fosk06 fosk06 commented Sep 16, 2025

🚨 CRITICAL BUG FIX: Race condition in SQLMeshResultsResource

Problem

The shared resource manager had a critical race condition where multiple assets could simultaneously:

  1. Check has_results(run_id)False
  2. Decide to execute SQLMesh
  3. Execute SQLMesh in parallel
  4. Result: Multiple SQLMesh executions for the same run!

Solution

Implemented a robust solution using Dagster's native dependency system:

✅ New Architecture

  • Precompute Asset: sqlmesh_execution_results executes SQLMesh once per run
  • Individual Assets: Depend directly on the precompute asset via deps=[sqlmesh_execution_results]
  • Native Dagster Flow: No more race conditions, Dagster manages execution order
  • Reused Existing Logic: All console/notifier tracking preserved

🔧 Key Changes

  • Created sqlmesh_execution_results asset that executes SQLMesh once
  • Modified individual assets to depend on precompute results
  • Removed SQLMeshResultsResource (no longer needed)
  • Reused existing utilities: execute_sqlmesh_materialization, handle_successful_execution
  • Preserved all console custom and notifier logic for audit capture

🎯 Benefits

  • Guaranteed Single Execution: One SQLMesh run per Dagster run
  • No Race Conditions: Dagster manages dependencies natively
  • Preserved Functionality: All audit capture and model tracking works
  • Better Debugging: Clear execution flow in Dagster UI
  • Simplified Architecture: No complex shared state management

🧪 Testing

  • All 173 tests pass
  • Integration tests updated for new architecture
  • Backward compatibility maintained

This fixes the fundamental architectural flaw that was causing multiple SQLMesh executions.

Files Changed

  • src/dg_sqlmesh/factory.py - New precompute asset architecture
  • tests/integration/test_asset_execution.py - Updated tests for new architecture

Breaking Changes

  • SQLMeshResultsResource is no longer used (removed from resources)
  • Individual assets now depend on sqlmesh_execution_results asset

Migration

No migration needed - the change is backward compatible at the API level.

🚨 CRITICAL BUG FIX: Race condition in SQLMeshResultsResource

## Problem
The shared resource manager had a critical race condition where multiple assets could simultaneously:
1. Check has_results(run_id) → False
2. Decide to execute SQLMesh
3. Execute SQLMesh in parallel
4. Result: Multiple SQLMesh executions for the same run!

## Solution
Implemented a robust solution using Dagster's native dependency system:

### ✅ New Architecture
- **Precompute Asset**: sqlmesh_execution_results executes SQLMesh once per run
- **Individual Assets**: Depend directly on the precompute asset via deps=[sqlmesh_execution_results]
- **Native Dagster Flow**: No more race conditions, Dagster manages execution order
- **Reused Existing Logic**: All console/notifier tracking preserved

### 🔧 Key Changes
- Created sqlmesh_execution_results asset that executes SQLMesh once
- Modified individual assets to depend on precompute results
- Removed SQLMeshResultsResource (no longer needed)
- Reused existing utilities: execute_sqlmesh_materialization, handle_successful_execution
- Preserved all console custom and notifier logic for audit capture

### 🎯 Benefits
- ✅ **Guaranteed Single Execution**: One SQLMesh run per Dagster run
- ✅ **No Race Conditions**: Dagster manages dependencies natively
- ✅ **Preserved Functionality**: All audit capture and model tracking works
- ✅ **Better Debugging**: Clear execution flow in Dagster UI
- ✅ **Simplified Architecture**: No complex shared state management

### 🧪 Testing
- All 173 tests pass
- Integration tests updated for new architecture
- Backward compatibility maintained

This fixes the fundamental architectural flaw that was causing multiple SQLMesh executions.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant