Great project! Here are a few thoughts on your architecture:
Data Quality: Definitely move to a Dead Letter Queue (DLQ). Dropping rows makes you lose visibility into why data is failing. Storing them separately allows for later auditing and reprocessing.
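The DLQ pattern can be sketched in a few lines. This is a minimal illustration, not a prescription: the `validate` function and the JSONL dead-letter file are placeholders for whatever validation rules and storage (a separate table, an S3 prefix, a queue) fit your pipeline.

```python
import json
from datetime import datetime, timezone

def process_rows(rows, validate, dlq_path="dead_letter.jsonl"):
    """Route rows that fail validation to a dead-letter file instead of dropping them."""
    good = []
    with open(dlq_path, "a", encoding="utf-8") as dlq:
        for row in rows:
            try:
                validate(row)
                good.append(row)
            except ValueError as err:
                # Keep the failing row plus context so it can be audited
                # and replayed later.
                dlq.write(json.dumps({
                    "row": row,
                    "error": str(err),
                    "failed_at": datetime.now(timezone.utc).isoformat(),
                }) + "\n")
    return good

# Illustrative validation rule: reject rows with a missing id.
def validate(row):
    if not row.get("id"):
        raise ValueError("missing id")

rows = [{"id": "1", "value": 10}, {"id": "", "value": 20}]
clean = process_rows(rows, validate)  # bad row lands in dead_letter.jsonl
```

Reprocessing then becomes a matter of reading the DLQ file back, fixing the root cause, and re-running the same pipeline over those rows.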
Orchestration: For <10k rows, Airflow is likely overkill. Stick to your script, or reach for Prefect/Dagster if you want a UI and retry logic without Airflow's heavy infrastructure.
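The retry behavior those orchestrators give you is easy to approximate in a plain script while you're still small. A hedged sketch (function names and the flaky task are made up for illustration):

```python
import time

def with_retries(fn, attempts=3, delay_seconds=1.0):
    """Re-run a task on failure, mimicking an orchestrator's retry policy."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == attempts:
                raise  # out of retries: surface the original error
            time.sleep(delay_seconds)

# Simulated task that fails twice, then succeeds.
calls = {"n": 0}
def flaky_load():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient DB error")
    return "loaded"

result = with_retries(flaky_load, attempts=3, delay_seconds=0.01)
```

When the script outgrows this, Prefect/Dagster add the same idea plus scheduling, observability, and a UI, without you rewriting the task logic.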
Docker Networking: Ensure your DB container is not exposing ports to the public internet (remove the ports mapping in docker-compose if only the app needs access). Use a private Docker network and environment variables (secret files) for credentials.
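Concretely, that looks something like the compose fragment below. Service names, the Postgres image, and the `.env` file are illustrative assumptions, not your actual setup:

```yaml
# docker-compose.yml sketch (service names are illustrative)
services:
  app:
    build: .
    env_file: .env          # DB credentials live here, not in the compose file
    networks: [backend]
  db:
    image: postgres:16
    env_file: .env
    networks: [backend]     # no `ports:` mapping, so the DB is unreachable from the host

networks:
  backend:
    driver: bridge
```

On the shared `backend` network the app reaches the database by service name (`db:5432`), while nothing outside the Compose project can connect to it.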

Answer selected by Thiago-code-lab