This document describes the MySQL schema used by the Climate Analog Database & API.
It covers the purpose of each table group, the data flow (ingest → temp → final → distances), and how to recreate the schema locally.
- High-Level Architecture
- Data Flow (Ingest → Norms → Distances → Queries)
- Table Groups
- Stored Procedures
- Conventions
- Recreating Schema Locally
- Operational Notes
- See Also
erDiagram
STATES {
string StateCode PK
string StateAbbr
string StateName
}
COUNTIES {
int CountyID PK
string CountyCode
string CountyName
string StateCode FK
float Latitude
float Longitude
}
PHYSICALDISTANCES {
int TargetCountyId PK
int AnalogCountyId PK
float PhysicalDistance
}
%% ------------------ FINAL FACT TABLES ------------------
MONTHLY_PRECIP_TargetState {
int CountyID PK
int Year PK
string Month PK
float Precipitation
}
MONTHLY_TEMP_TargetState {
int CountyID PK
int Year PK
string Month PK
float Temperature
}
SEASONAL_PRECIP_TargetState {
int CountyID PK
int Year PK
string Season PK
float Precipitation
}
SEASONAL_TEMP_TargetState {
int CountyID PK
int Year PK
string Season PK
float Temperature
}
YEARLY_PRECIP_TargetState {
int CountyID PK
int Year PK
float Precipitation
}
YEARLY_TEMP_TargetState {
int CountyID PK
int Year PK
float Temperature
}
%% ------------------ NORMS TABLES ------------------
MONTHLY_PRECIP_NORMS {
int CountyID PK
string Month PK
float NormPrecipitation
float StdDevPrecipitation
}
MONTHLY_TEMP_NORMS {
int CountyID PK
string Month PK
float NormTemperature
float StdDevTemperature
}
SEASONAL_PRECIP_NORMS {
int CountyID PK
string Season PK
float NormPrecipitation
float StdDevPrecipitation
}
SEASONAL_TEMP_NORMS {
int CountyID PK
string Season PK
float NormTemperature
float StdDevTemperature
}
YEARLY_PRECIP_NORMS {
int CountyID PK
float NormPrecipitation
float StdDevPrecipitation
}
YEARLY_TEMP_NORMS {
int CountyID PK
float NormTemperature
float StdDevTemperature
}
%% ------------------ DISTANCES (FINAL) ------------------
MONTHLY_PRECIP_DIST {
int TargetCountyID PK
int AnalogCountyID PK
int Year PK
string Month PK
float Distance
int AnalogRank
}
MONTHLY_TEMP_DIST {
int TargetCountyID PK
int AnalogCountyID PK
int Year PK
string Month PK
float Distance
int AnalogRank
}
MONTHLY_COMBINED_DIST {
int TargetCountyID PK
int AnalogCountyID PK
int Year PK
string Month PK
float Distance
int AnalogRank
}
SEASONAL_PRECIP_DIST {
int TargetCountyID PK
int AnalogCountyID PK
int Year PK
string Season PK
float Distance
int AnalogRank
}
SEASONAL_TEMP_DIST {
int TargetCountyID PK
int AnalogCountyID PK
int Year PK
string Season PK
float Distance
int AnalogRank
}
SEASONAL_COMBINED_DIST {
int TargetCountyID PK
int AnalogCountyID PK
int Year PK
string Season PK
float Distance
int AnalogRank
}
YEARLY_PRECIP_DIST {
int TargetCountyID PK
int AnalogCountyID PK
int Year PK
float Distance
int AnalogRank
}
YEARLY_TEMP_DIST {
int TargetCountyID PK
int AnalogCountyID PK
int Year PK
float Distance
int AnalogRank
}
YEARLY_COMBINED_DIST {
int TargetCountyID PK
int AnalogCountyID PK
int Year PK
float Distance
int AnalogRank
}
%% ------------------ TEMP STAGING (no FKs by design) ------------------
TargetState_MNTH_PRECIP_TEMP {
int CountyID
int Year
string Month
float Precipitation
}
TargetState_MNTH_TEMP_TEMP {
int CountyID
int Year
string Month
float Temperature
}
TargetState_SEAS_PRECIP_TEMP {
int CountyID
int Year
string Season
float Precipitation
}
TargetState_SEAS_TEMP_TEMP {
int CountyID
int Year
string Season
float Temperature
}
TargetState_YR_PRECIP_TEMP {
int CountyID
int Year
float Precipitation
}
TargetState_YR_TEMP_TEMP {
int CountyID
int Year
float Temperature
}
MNTH_PRECIP_DIST_TEMP {
int TargetCountyID
int AnalogCountyID
int Year
string Month
float Distance
}
MNTH_TEMP_DIST_TEMP {
int TargetCountyID
int AnalogCountyID
int Year
string Month
float Distance
}
SEAS_PRECIP_DIST_TEMP {
int TargetCountyID
int AnalogCountyID
int Year
string Season
float Distance
}
SEAS_TEMP_DIST_TEMP {
int TargetCountyID
int AnalogCountyID
int Year
string Season
float Distance
}
YR_PRECIP_DIST_TEMP {
int TargetCountyID
int AnalogCountyID
int Year
float Distance
}
YR_TEMP_DIST_TEMP {
int TargetCountyID
int AnalogCountyID
int Year
float Distance
}
%% ------------------ RELATIONSHIPS ------------------
STATES ||--o{ COUNTIES : has
COUNTIES ||--o{ MONTHLY_PRECIP_TargetState : has
COUNTIES ||--o{ MONTHLY_TEMP_TargetState : has
COUNTIES ||--o{ SEASONAL_PRECIP_TargetState: has
COUNTIES ||--o{ SEASONAL_TEMP_TargetState : has
COUNTIES ||--o{ YEARLY_PRECIP_TargetState : has
COUNTIES ||--o{ YEARLY_TEMP_TargetState : has
COUNTIES ||--o{ MONTHLY_PRECIP_NORMS : norms_for
COUNTIES ||--o{ MONTHLY_TEMP_NORMS : norms_for
COUNTIES ||--o{ SEASONAL_PRECIP_NORMS: norms_for
COUNTIES ||--o{ SEASONAL_TEMP_NORMS : norms_for
COUNTIES ||--o{ YEARLY_PRECIP_NORMS : norms_for
COUNTIES ||--o{ YEARLY_TEMP_NORMS : norms_for
COUNTIES ||--o{ PHYSICALDISTANCES : as_Target
COUNTIES ||--o{ PHYSICALDISTANCES : as_Analog
COUNTIES ||--o{ MONTHLY_PRECIP_DIST : as_Target
COUNTIES ||--o{ MONTHLY_TEMP_DIST : as_Target
COUNTIES ||--o{ MONTHLY_COMBINED_DIST : as_Target
COUNTIES ||--o{ SEASONAL_PRECIP_DIST : as_Target
COUNTIES ||--o{ SEASONAL_TEMP_DIST : as_Target
COUNTIES ||--o{ SEASONAL_COMBINED_DIST: as_Target
COUNTIES ||--o{ YEARLY_PRECIP_DIST : as_Target
COUNTIES ||--o{ YEARLY_TEMP_DIST : as_Target
COUNTIES ||--o{ YEARLY_COMBINED_DIST : as_Target
COUNTIES ||--o{ MONTHLY_PRECIP_DIST : as_Analog
COUNTIES ||--o{ MONTHLY_TEMP_DIST : as_Analog
COUNTIES ||--o{ MONTHLY_COMBINED_DIST : as_Analog
COUNTIES ||--o{ SEASONAL_PRECIP_DIST : as_Analog
COUNTIES ||--o{ SEASONAL_TEMP_DIST : as_Analog
COUNTIES ||--o{ SEASONAL_COMBINED_DIST: as_Analog
COUNTIES ||--o{ YEARLY_PRECIP_DIST : as_Analog
COUNTIES ||--o{ YEARLY_TEMP_DIST : as_Analog
COUNTIES ||--o{ YEARLY_COMBINED_DIST : as_Analog
- DB Engine: MySQL
- Primary Entities: Counties (with lat/long), States
- Facts: Precipitation & Temperature for Monthly, Seasonal, and Yearly time scales (TargetState-focused final tables).
- Norms: Means & std dev computed over 1991–2020 for monthly/seasonal/yearly, per county.
- Distances: Euclidean distances computed for precip, temp, and combined across monthly/seasonal/yearly.
- TEMP Tables: Used to ingest and transform before copying into final fact tables (keeps ingestion idempotent).
-
Ingest NOAA files via
/addallcountydata:- Parse fixed-width lines.
- Look up
CountyIDfromStates/Counties. - Write to
TargetStateCounty*_*_TEMPtables (monthly/seasonal/yearly + precip/temp).
-
Compute Norms (1991–2020):
- Accumulate totals & std dev for monthly, seasonal, yearly windows.
- Insert into
*_normstables.
-
Copy TEMP → Final:
TargetStateCountyMonthlyPrecip_TEMP→monthly_precipitation_data_TargetStateTargetStateCountyMonthlyTemp_TEMP→monthly_temperature_data_TargetState- (same for seasonal/yearly)
-
Compute Distances:
- Fill
*_distances_TEMPthen write/merge into*_distances(and*_combined_distances).
- Fill
-
Query Layer:
- Stored procedures fetch top analogs by time-scale and year/season/month.
-
States
Canonical list of states.
Used to resolve state codes during ingest. -
Counties
Canonical list of counties (CountyID,CountyName,StateCode,Latitude,Longitude).
Used to resolveCountyIDand provide spatial context. -
PhysicalDistances(optional / auxiliary)
If present, stores geodesic/physical distances between counties.
Used during ingestion to stage values that are eventually copied into final fact tables. These are dropped & recreated each run.
TargetStateCountyMonthlyPrecip_TEMPTargetStateCountyMonthlyTemp_TEMPTargetStateCountySeasonalPrecip_TEMPTargetStateCountySeasonalTemp_TEMPTargetStateCountyYearlyPrecip_TEMPTargetStateCountyYearlyTemp_TEMP
Why TEMP?
- Keeps partial runs isolated
- Allows “insert ignore / replace” semantics safely
- Enables efficient bulk copy into final fact tables
Hold the canonical time-series after TEMP copy:
-
Monthly
monthly_precipitation_data_TargetState(CountyID, Year, Month, Precipitation)
monthly_temperature_data_TargetState(CountyID, Year, Month, Temperature)
-
Seasonal (meteorological: DJF=winter, MAM, JJA, SON)
seasonal_precipitation_data_TargetState(CountyID, Year, Season, Precipitation)
seasonal_temperature_data_TargetState(CountyID, Year, Season, Temperature)
-
Yearly
yearly_precipitation_data_TargetState(CountyID, Year, Precipitation)
yearly_temperature_data_TargetState(CountyID, Year, Temperature)
Season boundaries (meteorological):
- winter: Dec (prev yr) + Jan + Feb
- spring: Mar + Apr + May
- summer: Jun + Jul + Aug
- fall: Sep + Oct + Nov
Per-county mean/stddev for each time-scale (used to contextualize anomalies):
monthly_precipitation_normsmonthly_temperature_normsseasonal_precipitation_normsseasonal_temperature_normsyearly_precipitation_normsyearly_temperature_norms
Window: 1991–2020 inclusive, matching NOAA climate normals in code.
Values: mean and standard deviation by month/season/year.
Distances computed between a target county and analog counties for each time-scale.
-
Monthly
monthly_precipitation_distancesmonthly_temperature_distancesmonthly_combined_distances- TEMP variants during compute:
monthly_precipitation_distances_TEMPmonthly_temperature_distances_TEMP
-
Seasonal
seasonal_precipitation_distancesseasonal_temperature_distancesseasonal_combined_distances- TEMP variants:
seasonal_precipitation_distances_TEMPseasonal_temperature_distances_TEMP
-
Yearly
yearly_precipitation_distancesyearly_temperature_distancesyearly_combined_distances- TEMP variants:
yearly_precipitation_distances_TEMPyearly_temperature_distances_TEMP
Combined distances use both precip and temp (the exact formula is encapsulated in stored procedures used by the backend).
See full signatures and bodies in
docs/PROCEDURES.md.
Below is a functional grouping that matches the code:
-
Inserts / Norms
InsertMonthlyPrecipitationTargetState,InsertMonthlyTemperatureTargetStateInsertSeasonalPrecipitationTargetState,InsertSeasonalTemperatureTargetStateInsertYearlyPrecipitationTargetState,InsertYearlyTemperatureTargetStateInsertMonthlyPrecipitationNorms,InsertMonthlyTemperatureNormsInsertSeasonalPrecipitationNorms,InsertSeasonalTemperatureNormsInsertYearlyPrecipitationNorms,InsertYearlyTemperatureNorms
-
Distance Calculations
CalculateMonthlyPrecipitationDistancesCalculateMonthlyTemperatureDistancesCalculateAllMonthlyCombinedDistancesCalculateSeasonalPrecipitationDistancesCalculateSeasonalTemperatureDistancesCalculateAllSeasonalCombinedDistancesCalculateYearlyPrecipitationDistancesCalculateYearlyTemperatureDistancesCalculateYearlyCombinedDistances
-
Lookup / Helpers
GetCountyIDByCodeAndStateInsertCounty,InsertState
-
Top Analogs (Query API)
- By Year (all years / specific year)
GetAllTopPrecipAnalogsForCountyByYearGetAllTopTempAnalogsForCountyByYearGetAllTopCombinedAnalogsForCountyByYearGetTopPrecipAnalogsForCountyByYearGetTopTempAnalogsForCountyByYearGetTopCombinedAnalogsForCountyByYear
- By Season
GetAllTopPrecipAnalogsForCountyBySeasonGetAllTopTempAnalogsForCountyBySeasonGetAllTopCombinedAnalogsForCountyBySeasonGetPrecipAnalogsForCountyByYearAndSeasonGetTempAnalogsForCountyByYearAndSeasonGetCombinedAnalogsForCountyByYearAndSeason
- By Month
GetAllTopPrecipAnalogsForCountyByMonthGetAllTopTempAnalogsForCountyByMonthGetAllTopCombinedAnalogsForCountyByMonthGetPrecipAnalogsForCountyByYearAndMonthGetTempAnalogsForCountyByYearAndMonthGetCombinedAnalogsForCountyByYearAndMonth
- By Year (all years / specific year)
- Months stored as zero-padded strings:
'01'–'12'. - Seasons stored as lowercase strings:
'winter' | 'spring' | 'summer' | 'fall'. - Invalid NOAA values:
-9.99/-99.90are ignored in aggregates. - TEMP tables are dropped and recreated during each ingest run.
- Target State focus: Final fact tables are suffixed with
_TargetStateand store TargetState county time series.
Use the dump (schema-only + routines) and restore:
Create dump (from remote):
mysqldump -u <user> -h <host> -P <port> -p \
--no-data --routines --no-tablespaces \
climate-change-app > docs/db-dump.sqlRestore into local DB (example):
# Create (if needed)
mysql -u root -p -e "CREATE DATABASE climate_change_app_local;"
# Import
mysql -u root -p climate_change_app_local < docs/db-dump.sqlIf you share this repo, include docs/db-dump.sql so others can bootstrap easily.
Idempotent Ingest: Ingestion uses TEMP tables + INSERT IGNORE/REPLACE to avoid duplicate final writes. Latest Month Guard: Code checks the latest inserted (Year, Month) to skip old data. Cron Updates: src/cron/syncData.js checks NOAA for newer files and hits /addallcountydata if needed. Backups: Regularly export schema (and optionally data) for disaster recovery. Performance: Distance computations can be heavy — they are batched by time-scale and use TEMP tables internally.
TABLES.md — detailed table columns with types & keys
PROCEDURES.md — stored procedure signatures & bodies