📊 Add income distribution S3 export for bespoke viz #6128
Quick links (staging server):
chart-diff: ✅ No charts for review.
data-diff: ✅ No differences found.
Automatically updated datasets matching excess_mortality|covid|fluid|flunet|country_profile|garden/ihme_gbd/2019/gbd_risk are not included.
Edited: 2026-05-15 12:27:40 UTC
paarriagadap
left a comment
Looks good!
My only comment is about hardcoding the year, which is not very future-proof. I would prefer to use the latest year instead.
It's true that the previous step now looks redundant, but I am keeping it for the time being because I might use it for static charts.
# S3 bucket name and folder where dataset files will be stored.
S3_BUCKET_NAME = "owid-public"
S3_DATA_DIR = Path("data/poverty-inequality")
EXPORT_YEAR = 2026
I would prefer this to be the latest year rather than hardcoded to 2026.
S3_BUCKET_NAME = "owid-public"
S3_DATA_DIR = Path("data/poverty-inequality")
EXPORT_YEAR = 2026
OUTPUT_FILE = f"income-distribution.{EXPORT_YEAR}.json"
Same here: hardcoding the year can be a problem for future versions of the data.
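To make the suggestion concrete, here is a minimal sketch of deriving the year from the data instead of hardcoding it. The helper names `latest_year` and `output_file_name` and the example list of years are hypothetical and not part of this PR; only the file name pattern comes from the diff above.

```python
def latest_year(years):
    """Return the most recent year present in the data."""
    years = list(years)
    if not years:
        raise ValueError("no years found in the data")
    return max(years)


def output_file_name(year):
    # Same pattern as OUTPUT_FILE in the diff, with the year passed in.
    return f"income-distribution.{year}.json"


# Hypothetical year values, for illustration only.
example_years = [2024, 2025, 2026, 2025]
export_year = latest_year(example_years)
output_file = output_file_name(export_year)
```

With this shape, a new data release with a later year would change the output file name without any edit to the step.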
I've now changed it so that it always creates a data file for each year. Then we don't need to hardcode anything here, and can just flexibly use whichever data file we need.
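A rough sketch of the "one data file per year" idea, assuming the data is available as flat records with a `year` field. The function `write_yearly_files`, the record fields, and the example values are assumptions for illustration; the actual step may organise this differently, and the S3 upload is left out here.

```python
import json
import tempfile
from collections import defaultdict
from pathlib import Path


def write_yearly_files(records, output_dir):
    """Write one JSON file per year and return the paths, keyed by year."""
    by_year = defaultdict(list)
    for record in records:
        by_year[record["year"]].append(record)

    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    paths = {}
    for year, rows in sorted(by_year.items()):
        # File name pattern taken from the diff: income-distribution.<year>.json
        path = output_dir / f"income-distribution.{year}.json"
        path.write_text(json.dumps(rows))
        paths[year] = path
    return paths


# Made-up records, for illustration only.
example_records = [
    {"country": "A", "year": 2025, "value": 1.0},
    {"country": "A", "year": 2026, "value": 1.5},
    {"country": "B", "year": 2026, "value": 2.0},
]
with tempfile.TemporaryDirectory() as tmp:
    written = write_yearly_files(example_records, tmp)
    file_names = sorted(p.name for p in written.values())
    rows_2026 = json.loads(written[2026].read_text())
```

Because every year gets its own file, the consumer picks the year by file name and nothing in the export step needs to know which year is current.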
Our other bespoke viz also upload data to the owid-public S3 bucket (see e.g. #6067 and https://github.com/owid/etl/blob/6e3a93b6d1afcbc99771e00556162131eb3f9c59/etl/steps/export/s3/ihme_gbd/latest/gbd_treemap_json.py), so let's do that for the income distribution plot, too. We also transform the data into a JSON format that's easily consumable in JS.
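As an illustration of what "easily consumable in JS" could mean, the sketch below groups flat rows into a nested object of parallel arrays. The actual JSON layout used by this step is not shown in the PR, so the shape, the `country`, `bin` and `value` fields, and the helper name `to_js_friendly` are all assumptions.

```python
import json


def to_js_friendly(records):
    """Group flat rows into {country: {"bins": [...], "values": [...]}}."""
    payload = {}
    for row in records:
        entry = payload.setdefault(row["country"], {"bins": [], "values": []})
        entry["bins"].append(row["bin"])
        entry["values"].append(row["value"])
    return payload


# Made-up rows, for illustration only.
example_rows = [
    {"country": "A", "bin": 1, "value": 0.5},
    {"country": "A", "bin": 2, "value": 0.8},
    {"country": "B", "bin": 1, "value": 1.2},
]
payload = to_js_friendly(example_rows)
# Compact separators keep the file small; dicts, lists and numbers map
# directly onto JS objects, arrays and numbers after JSON.parse.
text = json.dumps(payload, separators=(",", ":"))
```

A layout like this lets the front end look up one country and read its arrays directly, without filtering a long list of rows.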
@paarriagadap If you want, you could now get rid of the data://external/poverty_inequality/latest/thousand_bins_distribution step. I believe that specifying the latest version in this step's DAG should be enough.
Summary
- Adds export://s3/poverty_inequality/latest/income-distribution in the poverty and inequality DAG.
- Stores the output at data/poverty-inequality/income-distribution.2026.json.
Validation
DRY_RUN=1 .venv/bin/etlr export://s3/poverty_inequality/latest/income-distribution --export --private