Revising small area statistics geographies – data zones and intermediate zones: consultation

Data Zones and Intermediate Zones are small area geographies used in the production of official statistics in Scotland. They were first introduced in 2004 and revised in 2014. The purpose of this consultation is to seek feedback from users on proposals to update these geographies.


Why we need to revise Data Zones and Intermediate Zones

Data Zones and Intermediate Zones are designed to be stable geographies. There is a need, however, to revise them periodically to ensure they remain fit for purpose.

There are a number of reasons why Data Zones and Intermediate Zones are periodically updated. These reasons are discussed in this section.

Population change over time

The main reason we need to revise Data Zones and Intermediate Zones is to account for population change. These geographies are designed to have roughly similar populations. The current versions were built using data from the 2011 Census to ensure this requirement was satisfied.

Over time, however, changes in the population can cause some Data Zones to fall below or exceed the population thresholds. This can be due to new houses being built, old houses being demolished, or other changes in population distribution across Scotland. To account for these population changes, Data Zones and Intermediate Zones must periodically be reviewed using more current data.
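The kind of check this implies can be sketched in a few lines of Python. This is only an illustration: the zone codes, the populations and the lower and upper thresholds of 500 and 1,000 are assumptions made for the example, not figures taken from this consultation.

```python
def flag_out_of_range(populations, lower, upper):
    """Return the zone codes whose population falls outside [lower, upper]."""
    return sorted(
        code for code, pop in populations.items()
        if pop < lower or pop > upper
    )


# Hypothetical zone codes and population counts, for illustration only.
populations = {"DZ_A": 480, "DZ_B": 760, "DZ_C": 1340}

# Assumed thresholds; substitute whatever limits the review actually uses.
flagged = flag_out_of_range(populations, lower=500, upper=1000)
print(flagged)
```

In this made-up example, DZ_A has fallen below the lower limit and DZ_C has grown past the upper limit, so both would be candidates for review.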

In particular, revisions are timed to coincide with the Scottish Census. Data Zones and Intermediate Zones were first produced in 2004 based on 2001 Census results. They were then revised in 2014 so that they were based on 2011 Census data. The purpose of the proposals outlined in this consultation is to update Data Zones and Intermediate Zones so that they are based on recently published Census 2022 data.

Changes to higher level boundaries

Data Zones and Intermediate Zones are also revised to take into account changes to other geographic boundaries, most notably local authority boundaries. This is because Data Zones are designed so that aggregations of Data Zones produce exact matches of local authorities.

Local authority boundaries rarely change. When they do, however, aggregations of Data Zones no longer match local authority boundaries exactly, so statistics produced from Data Zones become approximations based on best-fit aggregations.

This can also introduce disclosure risk. This is because it may be possible to infer information about the very small gaps that exist between aggregations of Data Zone statistics and statistics produced using the exact boundaries of higher level geographies.
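A small worked example may help to show how such a gap could be exposed by subtraction. The figures below are entirely hypothetical, and the function name is introduced only for this sketch.

```python
def gap_count(exact_total, best_fit_total):
    """Count attributable to the area between the exact boundary
    and the best-fit aggregation of Data Zones."""
    return exact_total - best_fit_total


# Hypothetical published figures for the same local authority.
exact_total = 12345     # statistic produced using the exact boundary
best_fit_total = 12342  # same statistic summed over best-fit Data Zones

residual = gap_count(exact_total, best_fit_total)
print(residual)
```

If both totals were published, the difference would describe only the handful of people living in the gap, which is why very small residuals of this kind can be disclosive.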

There have been two boundary changes to local authorities since 2011. A small change, which did not affect any population, was made between Fife and Perth and Kinross at Keltybridge and Fife Environmental Energy Park, Westfield in February 2018. A slightly larger change, which did affect some population, occurred between Glasgow City and North Lanarkshire at Cardowan by Stepps in April 2019.

Changes to lower-level boundaries

Data Zones are produced by aggregating Census Output Areas. Output Areas in turn are produced by aggregating ‘frozen’ Census postcodes. Changes in postcode boundaries over time can erode this link between postcodes and Data Zones, which means postcodes will be aggregated up to Data Zone level on a best-fit basis.

Postcodes are owned and maintained by the Royal Mail purely for the purposes of delivering mail. Analysts use postcodes as the building blocks of almost all higher level geographies, including Data Zones, because the postcode is the most common spatial reference available.

The Royal Mail do not define postcode boundaries. Instead, National Records of Scotland (NRS) produce postcode boundaries based on the location of address points with the same postcode. A ‘frozen’ set of Census postcode boundaries is used to create Data Zones. NRS publish updated postcode boundaries every six months, but the Data Zone boundaries remain fixed.

There are two issues associated with postcode drift that affect Data Zones. Firstly, as mentioned above, postcode boundaries will gradually change as address points are added or removed. Since analysts will usually use postcodes to allocate data to higher level geographies, this means that the notional area that the statistics relate to will gradually change over time and will be slightly different to the exact Data Zone boundaries, which remain fixed.

The second issue associated with postcode drift is related to using the postcode centroid to allocate postcodes to Data Zones. The postcode centroid is located at the address point nearest to the average easting and northing of all address points within the postcode. This, combined with the odd shape and the small size of the postcode geography, can occasionally result in postcodes switching Data Zones if there is a very small change to the address points in a postcode.
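The centroid rule described above can be sketched as follows. This is a minimal illustration rather than the NRS implementation: the coordinates are invented, and the single boundary at easting 50 separating two hypothetical zones "A" and "B" is an assumption made to keep the example short.

```python
from statistics import mean


def postcode_centroid(address_points):
    """Return the address point nearest to the mean easting and northing."""
    mean_e = mean(e for e, _ in address_points)
    mean_n = mean(n for _, n in address_points)
    return min(
        address_points,
        key=lambda p: (p[0] - mean_e) ** 2 + (p[1] - mean_n) ** 2,
    )


def zone_for(point, boundary_easting=50):
    """Hypothetical rule: zone 'A' lies west of the boundary, 'B' east of it."""
    return "A" if point[0] < boundary_easting else "B"


# Invented (easting, northing) address points for one long, thin postcode.
before = [(0, 0), (10, 0), (100, 0)]
after = before + [(120, 0)]  # one new address is added at the eastern end

centroid_before = postcode_centroid(before)
centroid_after = postcode_centroid(after)
print(centroid_before, zone_for(centroid_before))
print(centroid_after, zone_for(centroid_after))
```

In this example, adding a single address shifts the average position far enough that a different address point becomes the centroid, and the whole postcode moves from zone A to zone B.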

The main implication of this is that postcodes are assigned to Data Zones on a best-fit basis and the area that the statistics relate to will be slightly different to the area within the Data Zone boundary. This can introduce inaccuracy into statistics based on postcodes. Further information on the NRS postcode products is available from the NRS website.

Contact

Email: statistics.enquiries@gov.scot
