Equality Data Improvement Programme (EDIP) project board - equality data audit and overview of data on race: December 2021
Supporting paper from the meeting of the group on 15 December 2021.
Summary of equality data audit findings - race data
This section of the paper summarises the RAG (red, amber, green) ratings provided by analysts in each analytical area in relation to the collection and publication of data on race. Analysts were provided with the following definition: “Race (or ethnicity) - The Equality Act 2010 defines race as a group of people defined by their colour, nationality (including citizenship) or ethnic or national origins. Many data collections ask about ethnic group rather than race, using questions such as: "What is your ethnic group?". Some datasets may, however, record inequality arising from race, such as racial hate crimes.” The term ‘race’ is used with reference to this definition in this paper.
Please note that some datasets included in the audit returns do not include information about individuals or households, but about businesses, sectors, local authorities or funding allocations. These datasets are included in the figures presented below because equality breakdowns, whilst often more challenging, may still be possible (e.g. breakdowns on the proportion of employees by race). Nevertheless, we highlight that, even in an ideal world, it is unlikely that it would be possible to collect and publish data on race from all datasets marked as ‘red’.
All datasets
A total of 199 datasets were included in returns across the 10 analytical areas within the Scottish Government (National Records of Scotland (NRS) is included). As shown in Table 1, of these:
- just over half (53%) do not collect data on race and almost 6 in 10 (58%) are not used to publish data breakdowns by race
- almost 3 in 10 (29%) collect data on race that is robust enough to produce reliable statistics and just over 2 in 10 (22%) are used to proactively publish data breakdowns by race
There are a total of 38 datasets that are currently collecting data on race that is robust enough to produce reliable statistics and are used to proactively publish data breakdowns by race (i.e. ‘green’ for both collected and published). Annex A provides an overview of these datasets by analytical area.
Table 1: all datasets by RAG rating for the collection and publication of race data
Note that due to rounding percentages may not sum to 100.
Rating | Collected | Published
R (red) | 53% (105) | 58% (116)
A (amber) | 10% (19) | 9% (17)
G (green) | 29% (57) | 22% (43)
No rating provided | 9% (18) | 12% (23)
TOTAL | 100% (199) | 100% (199)
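The rounding note above Table 1 can be illustrated with a short calculation. The following is a minimal sketch in Python, assuming that each percentage is the count divided by the column total and rounded to the nearest whole number (the paper does not state the rounding method used). The counts are those shown in Table 1; the names `counts` and `to_percentages` are illustrative only.

```python
# Counts taken from Table 1 (all datasets), in the order:
# red, amber, green, no rating provided.
counts = {
    "Collected": [105, 19, 57, 18],
    "Published": [116, 17, 43, 23],
}


def to_percentages(values):
    """Return each count as a whole-number percentage of the column total.

    Assumption: rounding to the nearest whole number using Python's
    built-in round(); the paper does not specify its rounding method.
    """
    total = sum(values)
    return [round(100 * v / total) for v in values]


for column, values in counts.items():
    pcts = to_percentages(values)
    # Print the column name, the total count, the rounded percentages
    # and the sum of those rounded percentages.
    print(column, sum(values), pcts, sum(pcts))
```

Under these assumptions the rounded percentages in each column sum to 101 rather than 100, which is the effect the rounding note describes.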
Across the analytical areas, the most commonly reported barriers to the collection or publication of protected characteristic data include:
- the responsibility or ownership of datasets sits with organisations outwith the Scottish Government
- low sample size makes producing robust estimates for smaller groups or those with intersecting protected characteristics challenging
- data is supplied to the Scottish Government at an aggregate level so information about individuals is unknown
- poor completion rates for some protected characteristics, including race
- sensitivities around the collection of data on protected characteristics due to features of the data collection context or respondents
- outdated IT systems for recording data
Health and social care analysis (HSCA)
There were 65 datasets included in the HSCA return. As shown in Table 2, of these:
- almost 7 in 10 (66%) do not collect data on race and just over 7 in 10 (72%) are not used to publish data breakdowns by race
- just over 2 in 10 (22%) collect robust data on race and just over 1 in 10 (12%) are used to proactively publish data breakdowns by race
There were 7 datasets used by HSCA where robust data on race is both collected and proactively published (Annex A).
Table 2: datasets used by HSCA by RAG rating for the collection and publication of race data
Note that due to rounding percentages may not sum to 100.
Rating | Collected | Published
R (red) | 66% (43) | 72% (47)
A (amber) | 9% (6) | 11% (7)
G (green) | 22% (14) | 12% (8)
No rating provided | 3% (2) | 5% (3)
TOTAL | 100% (65) | 100% (65)
Communities analysis division (CAD)
There were 27 datasets included in the CAD return. As shown in Table 3, of these:
- just over 4 in 10 (41%) do not collect data on race and almost 5 in 10 (48%) are not used to publish data breakdowns by race
- almost 4 in 10 (37%) collect robust data on race and 3 in 10 (30%) are used to proactively publish data breakdowns by race
There were 7 datasets used by CAD where robust data on race is both collected and proactively published (Annex A).
Table 3: datasets used by CAD by RAG rating for the collection and publication of race data
Note that due to rounding percentages may not sum to 100.
Rating | Collected | Published
R (red) | 41% (11) | 48% (13)
A (amber) | 7% (2) | 4% (1)
G (green) | 37% (10) | 30% (8)
No rating provided | 15% (4) | 19% (5)
TOTAL | 100% (27) | 100% (27)
Note that there are 3 datasets in the CAD return where analysts have reported that there is no scope to collect equality data as the datasets contain data about properties not individuals or households.
Constitution and external affairs analysis (CEAA)
There were 18 datasets included in the CEAA return. As shown in Table 4, of these:
- just over 4 in 10 (44%) do not collect data on race and just over 6 in 10 (61%) are not used to publish data breakdowns by race
- almost 3 in 10 (28%) collect robust data on race and almost 2 in 10 (17%) are used to proactively publish data breakdowns by race
There were 3 datasets used by CEAA where robust data on race is both collected and proactively published (Annex A).
Table 4: datasets used by CEAA by RAG rating for the collection and publication of race data
Note that due to rounding percentages may not sum to 100.
Rating | Collected | Published
R (red) | 44% (8) | 61% (11)
A (amber) | 6% (1) | 0% (0)
G (green) | 28% (5) | 17% (3)
No rating provided | 22% (4) | 22% (4)
TOTAL | 100% (18) | 100% (18)
Please note that many datasets reported in the CEAA return do not include person or household level data so the scope to collect equality data may be limited. For example, ‘red’ datasets included in the return include export statistics, airport data, and datasets on International Development Fund and Foreign Direct Investment expenditure.
Education analytical services (EAS)
There were 23 datasets included in the EAS return. As shown in Table 5, of these:
- 3 in 10 (30%) do not collect race data and just under 4 in 10 (39%) are not used to publish data breakdowns by race
- over 6 in 10 (61%) collect data on race and over 5 in 10 (52%) are used to proactively publish data breakdowns by race
There were 12 datasets used by EAS where robust data on race is both collected and proactively published (Annex A).
Table 5: datasets used by EAS by RAG rating for the collection and publication of race data
Note that due to rounding percentages may not sum to 100.
Rating | Collected | Published
R (red) | 30% (7) | 39% (9)
A (amber) | 9% (2) | 9% (2)
G (green) | 61% (14) | 52% (12)
No rating provided | 0% (0) | 0% (0)
TOTAL | 100% (23) | 100% (23)
Justice analytical services (JAS)
There were 14 datasets included in the JAS return. As shown in Table 6, of these:
- 5 in 10 (50%) do not collect data on race and just over 4 in 10 (43%) are not used to publish data breakdowns by race
- almost 4 in 10 (36%) collect data on race and just over 2 in 10 (21%) are used to proactively publish data breakdowns by race
There were 3 datasets used by JAS where robust data on race is both collected and proactively published (Annex A).
Table 6: datasets used by JAS by RAG rating for the collection and publication of race data
Note that due to rounding percentages may not sum to 100.
Rating | Collected | Published
R (red) | 50% (7) | 43% (6)
A (amber) | 14% (2) | 14% (2)
G (green) | 36% (5) | 21% (3)
No rating provided | 0% (0) | 21% (3)
TOTAL | 100% (14) | 100% (14)
Local government analytical services (LG)
There are 11 datasets in the LG return. As shown in Table 7, of these:
- over 9 in 10 (91%) do not collect data on race and none of the datasets are used to publish data breakdowns by race (100% are rated ‘red’)
- there are no datasets collecting robust race data or proactively publishing data breakdowns by race
Table 7: datasets used by LG by RAG rating for the collection and publication of race data
Note that due to rounding percentages may not sum to 100.
Rating | Collected | Published
R (red) | 91% (10) | 100% (11)
A (amber) | 9% (1) | 0% (0)
G (green) | 0% (0) | 0% (0)
No rating provided | 0% (0) | 0% (0)
TOTAL | 100% (11) | 100% (11)
Please note that the majority of datasets reported in the LG return do not include person or household level data so the scope to collect equality data may be limited.
National Records of Scotland (NRS)
There were 15 datasets in the NRS return. As shown in Table 8, of these:
- just over 7 in 10 (73%) do not collect race data and just over 7 in 10 (73%) are not used to publish data breakdowns by race
- 2 in 10 (20%) collect robust data on race and 2 in 10 (20%) are used to proactively publish data breakdowns by race
There were 2 datasets used by NRS where robust data on race is collected and proactively published (Annex A).
Table 8: datasets used by NRS by RAG rating for the collection and publication of race data
Note that due to rounding percentages may not sum to 100.
Rating | Collected | Published
R (red) | 73% (11) | 73% (11)
A (amber) | 7% (1) | 7% (1)
G (green) | 20% (3) | 20% (3)
No rating provided | 0% (0) | 0% (0)
TOTAL | 100% (15) | 100% (15)
Office of the Chief Economic Adviser (OCEA)
There were 19 datasets in the OCEA return. As shown in Table 9, of these:
- almost 6 in 10 (58%) do not collect race data and almost 6 in 10 (58%) are not used to publish data breakdowns by race
- almost 3 in 10 (26%) collect robust data on race and almost 3 in 10 (26%) are used to proactively publish data breakdowns by race
There were 3 datasets used by OCEA where robust data on race is both collected and proactively published (Annex A).
Table 9: datasets used by OCEA by RAG rating for the collection and publication of race data
Note that due to rounding percentages may not sum to 100.
Rating | Collected | Published
R (red) | 58% (11) | 58% (11)
A (amber) | 16% (3) | 16% (3)
G (green) | 26% (5) | 26% (5)
No rating provided | 0% (0) | 0% (0)
TOTAL | 100% (19) | 100% (19)
Rural and environmental science analytical services (RESAS)
There were 5 datasets in the RESAS return. As shown in Table 10, of these:
- 8 in 10 (80%) do not collect race data and 8 in 10 (80%) are not used to publish data breakdowns by race
- no datasets collect robust data on race and no datasets are used to proactively publish race data
Table 10: RESAS datasets by RAG rating for the collection and publication of race data
Note that due to rounding percentages may not sum to 100.
Rating | Collected | Published
R (red) | 80% (4) | 80% (4)
A (amber) | 20% (1) | 20% (1)
G (green) | 0% (0) | 0% (0)
No rating provided | 0% (0) | 0% (0)
TOTAL | 100% (5) | 100% (5)
Transport Scotland strategy and analysis
There were 2 datasets in the Transport Scotland strategy and analysis return. As shown in Table 11, one of these datasets is not used to collect or publish race data; the other is used to collect robust data on race, which is proactively published.
Table 11: Transport datasets by RAG rating for the collection and publication of race data
Note that due to rounding percentages may not sum to 100.
Rating | Collected | Published
R (red) | 50% (1) | 50% (1)
A (amber) | 0% (0) | 0% (0)
G (green) | 50% (1) | 50% (1)
No rating provided | 0% (0) | 0% (0)
TOTAL | 100% (2) | 100% (2)
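As a cross-check on the overall figure of 38 datasets rated ‘green’ for both collection and publication, the counts reported in each analytical area section can be tallied. The following is a minimal sketch in Python, using only the counts stated in the text of this paper. Areas that report no such datasets (LG and RESAS) are entered as 0, and the single Transport Scotland dataset that collects and proactively publishes robust race data is entered as 1. The variable names are illustrative only.

```python
# Number of datasets per analytical area where robust data on race is
# both collected and proactively published, as stated in each section.
green_for_both = {
    "HSCA": 7,
    "CAD": 7,
    "CEAA": 3,
    "EAS": 12,
    "JAS": 3,
    "LG": 0,                   # no datasets collecting robust race data
    "NRS": 2,
    "OCEA": 3,
    "RESAS": 0,                # no datasets collecting robust race data
    "Transport Scotland": 1,   # one of the two datasets in the return
}

# Sum the per-area counts for comparison with the overall figure.
total_green_for_both = sum(green_for_both.values())
print(total_green_for_both)
```

The tally covers all 10 analytical areas and agrees with the 38 datasets reported in the summary for all datasets.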