GeographicDataService/MSOA_IZ_Area_Classification

Aggregate ONS Output Area Classification (OAC) for GB MSOA/IZ

This repository provides code to aggregate the 2021/22 Output Area Classification (OAC) from the lowest geographical level (Output Areas) to:

  • 2021 Middle Layer Super Output Areas (MSOA) in England & Wales
  • 2022 Intermediate Zones (IZ) in Scotland

Overview

This code aggregates OAC classifications from UK Census Output Areas up to their respective mid-level geographies (MSOA/IZ). It then selects the dominant OAC Subgroup within each mid-level geography based on the largest total population contribution, effectively classifying each MSOA or IZ by the most populous Subgroup within it.
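
The dominant-Subgroup rule described above can be sketched on toy data as follows. This is an illustrative sketch rather than the repository's script: the data frame, its column names (msoa_iz, subgroup, population), and its values are assumptions. Ties are broken here by keeping the first row, which the source does not specify.

```r
library(dplyr)

# Toy OA-level table; column names and values are hypothetical.
oa <- data.frame(
  oa_code    = c("OA1", "OA2", "OA3", "OA4", "OA5"),
  msoa_iz    = c("M1", "M1", "M1", "M2", "M2"),
  subgroup   = c("1a1", "1a1", "2b1", "3a2", "6a1"),
  population = c(300, 250, 400, 310, 290)
)

dominant <- oa %>%
  # Total population of each Subgroup within each MSOA/IZ
  group_by(msoa_iz, subgroup) %>%
  summarise(pop = sum(population), .groups = "drop") %>%
  # Keep the Subgroup with the largest total in each MSOA/IZ
  group_by(msoa_iz) %>%
  slice_max(pop, n = 1, with_ties = FALSE) %>%
  ungroup()

print(dominant)
```

Note that in M1 the single largest Output Area belongs to Subgroup 2b1, yet 1a1 is selected because its two Output Areas together contribute more population.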

The workflow is as follows:

  1. Merge lookup tables of Output Areas to MSOA/IZ.
  2. Import OAC classifications for all Output Areas.
  3. Join total population counts.
  4. Aggregate OAC Subgroups to the mid-level geography.
  5. Generate a final aggregated classification.
  6. Compare original OA-level classification with aggregated classification using an alluvial plot.
  7. Export final outputs as CSV, Parquet, and GeoPackage files.
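
Steps 1 to 3 of the workflow above amount to stacking the two lookups and joining the OAC classes and population counts by Output Area code. A minimal sketch, assuming hypothetical column names (oa_code, mid_code, subgroup, population) and toy values; the real lookup files use their own column names:

```r
library(dplyr)

# Hypothetical lookups for England & Wales and for Scotland.
lookup_ew  <- data.frame(oa_code = c("E001", "E002"), mid_code = c("M1", "M1"))
lookup_sct <- data.frame(oa_code = "S001", mid_code = "IZ1")

# Step 1: stack both lookups into one Great Britain table.
gb_lookup <- bind_rows(
  mutate(lookup_ew,  nation = "England & Wales"),
  mutate(lookup_sct, nation = "Scotland")
)

# Steps 2-3: attach the OAC class and the total population of each OA.
oac <- data.frame(oa_code  = c("E001", "E002", "S001"),
                  subgroup = c("1a1", "2b1", "3a2"))
pop <- data.frame(oa_code    = c("E001", "E002", "S001"),
                  population = c(300, 400, 310))

oa_table <- gb_lookup %>%
  left_join(oac, by = "oa_code") %>%
  left_join(pop, by = "oa_code")

print(oa_table)
```

Checking the joined table for missing values is a quick way to spot Output Areas that failed to match in either join.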

Requirements

The script uses the following R packages:

  • tidyverse
  • arrow
  • magrittr
  • sf
  • ggalluvial

Make sure these packages are installed before running the script:

install.packages(c("tidyverse", "magrittr", "sf", "ggalluvial"))
# For 'arrow', install from CRAN or the appropriate binary source
install.packages("arrow")

Data Sources

  1. Lookup: OA to MSOA (England & Wales)
  2. Lookup: OA to IZ (Scotland)
  3. OAC Input
    • Parquet file: ./data/UK_OAC_Final.parquet
  4. Total Population Data
  5. Geographical Boundaries
    • MSOA/IZ boundaries in GeoPackage format: ./data/MSOA_IZ.gpkg

Usage

  1. Clone or download this repository.
  2. Place the required data files in the correct directories, as indicated in the script (e.g., ./data/UK_OAC_Final.parquet, ./data/MSOA_IZ.gpkg).
  3. Install the required R packages (see Requirements).
  4. Open the R script (or copy-paste it into an R environment).
  5. Run the script from start to finish.

The script reads data from the specified sources, performs the aggregation, and writes CSV, Parquet, and GeoPackage outputs along with a comparison plot.
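
Before running the script, it can help to confirm that the input files are where it expects them. The check below is a small sketch in base R; missing_inputs is a hypothetical helper, and the vector lists only the two paths named in this README, so extend it with any other inputs the script needs.

```r
# Paths named in this README; other inputs may also be required.
required_inputs <- c("./data/UK_OAC_Final.parquet", "./data/MSOA_IZ.gpkg")

# Hypothetical helper: return the paths that do not exist on disk.
missing_inputs <- function(paths) {
  paths[!file.exists(paths)]
}

missing <- missing_inputs(required_inputs)
if (length(missing) > 0) {
  message("Missing input files: ", paste(missing, collapse = ", "))
} else {
  message("All listed input files were found.")
}
```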


Outputs

The script generates several key outputs:

  • ./data/GB_OA_Lookup.parquet: Combined lookup table for Great Britain Output Areas to MSOA/IZ
  • ./data/MSOA_IZ_Lookup.csv and ./data/MSOA_IZ_Lookup.parquet: Final aggregated OAC classifications for MSOA/IZ
  • ./data/MSOA_IZ_SF_Counts.gpkg: Spatial data with OAC classifications and counts of distinct Supergroups, Groups, and Subgroups per MSOA/IZ
  • ./plot/Comparison.png: Alluvial plot showing classification flows
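
As a rough illustration of how outputs in these three formats can be written, the sketch below exports a toy table with readr, arrow, and sf. It writes to a temporary directory rather than ./data, the column names and coordinates are made up, and points stand in for the real MSOA/IZ polygons.

```r
library(sf)

out_dir <- tempdir()  # the script itself writes to ./data

# Toy classified table; names, codes, and coordinates are hypothetical.
classified <- data.frame(
  mid_code = c("M1", "IZ1"),
  subgroup = c("1a1", "3a2"),
  x        = c(530000, 325000),
  y        = c(180000, 673000)
)
attributes_only <- classified[, c("mid_code", "subgroup")]

# Tabular outputs
readr::write_csv(attributes_only, file.path(out_dir, "MSOA_IZ_Lookup.csv"))
arrow::write_parquet(attributes_only, file.path(out_dir, "MSOA_IZ_Lookup.parquet"))

# Spatial output: points in British National Grid (EPSG:27700) as a stand-in
classified_sf <- st_as_sf(classified, coords = c("x", "y"), crs = 27700)
st_write(classified_sf, file.path(out_dir, "MSOA_IZ_SF_Counts.gpkg"),
         delete_dsn = TRUE, quiet = TRUE)
```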

Comparison

The script produces an alluvial plot (Comparison.png) showing how Supergroups at the Output Area level flow into the aggregated Supergroups at the MSOA/IZ level.

  • Left Axis: Original OA-level Supergroups
  • Right Axis: Aggregated MSOA/IZ Supergroups

This helps visualize the degree of alignment or shifts in classification that occur during aggregation.
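
A two-axis alluvial plot of this kind can be drawn with ggalluvial roughly as follows. This sketch uses invented flow counts and hypothetical column names (oa_supergroup, mid_supergroup, n); it is not the code that produced Comparison.png.

```r
library(ggplot2)
library(ggalluvial)

# Invented counts of Output Areas by original and aggregated Supergroup.
flows <- data.frame(
  oa_supergroup  = c("Retired Professionals", "Retired Professionals",
                     "Baseline UK", "Baseline UK"),
  mid_supergroup = c("Retired Professionals", "Baseline UK",
                     "Baseline UK", "Legacy Communities"),
  n              = c(120, 30, 200, 50)
)

p <- ggplot(flows, aes(axis1 = oa_supergroup, axis2 = mid_supergroup, y = n)) +
  geom_alluvium(aes(fill = oa_supergroup)) +   # flows between the two axes
  geom_stratum() +                             # boxes for each class
  geom_text(stat = "stratum", aes(label = after_stat(stratum))) +
  scale_x_discrete(limits = c("OA", "MSOA/IZ"), expand = c(0.1, 0.1)) +
  labs(y = "Output Areas", fill = "OA Supergroup")

# ggsave("./plot/Comparison.png", p)  # uncomment to write the figure
```

Flows that stay level between the axes are Output Areas whose Supergroup is unchanged by aggregation; flows that cross are Output Areas absorbed into an MSOA/IZ with a different Supergroup.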

Statistical Summaries

Several tables and data frames show how many distinct Subgroups, Groups, and Supergroups are contained within each MSOA or IZ. This reveals how homogeneous or diverse each mid-level geography is in terms of OAC classes.

  • Compare_Subgroup, Compare_Group, and Compare_Supergroup
    • Show distributions of how many different classes exist per MSOA/IZ.
  • n_all
    • Merges the counts of distinct Supergroups, Groups, and Subgroups.

These outputs are joined to the spatial data frame and written to MSOA_IZ_SF_Counts.gpkg.
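
Counting distinct classes per MSOA/IZ can be sketched with dplyr as below. The name n_all is taken from the text above, but its structure here, along with the input column names and values, is an assumption.

```r
library(dplyr)

# Toy OA-level classes; codes are hypothetical.
oa <- data.frame(
  mid_code   = c("M1", "M1", "M1", "M2", "M2"),
  supergroup = c("1", "1", "2", "3", "3"),
  group      = c("1a", "1a", "2b", "3a", "3a"),
  subgroup   = c("1a1", "1a2", "2b1", "3a2", "3a2")
)

# Number of distinct classes at each OAC level within every MSOA/IZ
n_all <- oa %>%
  group_by(mid_code) %>%
  summarise(
    n_supergroup = n_distinct(supergroup),
    n_group      = n_distinct(group),
    n_subgroup   = n_distinct(subgroup)
  )

# Distribution: how many MSOA/IZ contain 1, 2, 3, ... distinct Subgroups
compare_subgroup <- count(n_all, n_subgroup)
```

In this toy data M2 is fully homogeneous (one class at every level), while M1 mixes two Supergroups and three Subgroups.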


Supergroup Classifications

The analysis uses 8 OAC Supergroups with descriptive labels:

  1. Retired Professionals
  2. Suburbanites & Peri-Urbanites
  3. Multicultural & Educated Urbanites
  4. Low-Skilled Migrant & Student Communities
  5. Ethnically Diverse Suburban Professionals
  6. Baseline UK
  7. Semi & Un-Skilled Workforce
  8. Legacy Communities
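
One simple way to attach these labels to Supergroup codes is a named vector used as factor labels. The sketch below assumes the codes are stored as the characters "1" to "8"; the object names are hypothetical.

```r
# Supergroup codes (assumed "1" to "8") mapped to the labels listed above.
supergroup_labels <- c(
  "1" = "Retired Professionals",
  "2" = "Suburbanites & Peri-Urbanites",
  "3" = "Multicultural & Educated Urbanites",
  "4" = "Low-Skilled Migrant & Student Communities",
  "5" = "Ethnically Diverse Suburban Professionals",
  "6" = "Baseline UK",
  "7" = "Semi & Un-Skilled Workforce",
  "8" = "Legacy Communities"
)

codes <- c("3", "6", "1")  # example codes
labelled <- factor(codes,
                   levels = names(supergroup_labels),
                   labels = unname(supergroup_labels))
print(labelled)
```

Storing the result as a factor keeps all eight levels in a fixed order, which is convenient for plot legends.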

Acknowledgments

  • ONS Output Area Classification: Data provided under the Open Government Licence.
  • Geography Boundaries: Sourced from ONS and National Records of Scotland (NRS).
  • Code Contributors: @alexsingleton, Geographic Data Service.

For any questions or issues, please open a GitHub issue or reach out to the authors.

