Guide to basic quality assurance in statistics

Guidance for those producing official statistics to ensure that quality is monitored and assured.


Basic checks

These are some basic ways of quality assuring data before analysis. They can help to identify problems in the data.

It is important to understand what kind of data is expected: raw unit-level data, aggregated data, processed data, etc. This will aid understanding of the kinds of quality assurance required.

Check that the dataset is complete. Are there any missing values? Check against the expected number of data entries where possible (e.g. total number of schools, Scottish population, survey respondents, etc.).
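
As an illustration only, a completeness check could be sketched in Python as follows. The records, the variable names and the expected row count are all hypothetical; in practice the expected count would come from a reference source such as a register of schools.

```python
# Hypothetical records: one row per school; None marks a missing value.
records = [
    {"school_id": "S001", "pupils": 420},
    {"school_id": "S002", "pupils": None},
    {"school_id": "S003", "pupils": 310},
]
EXPECTED_ROWS = 4  # assumed count taken from a reference list


def completeness_report(rows, expected_rows):
    """Return the shortfall in rows and the count of missing values per variable."""
    missing = {}
    for row in rows:
        for name, value in row.items():
            if value is None or value == "":
                missing[name] = missing.get(name, 0) + 1
    return {"rows_short": expected_rows - len(rows), "missing": missing}


report = completeness_report(records, EXPECTED_ROWS)
print(report)
```

A non-zero shortfall or any missing values would be a prompt to go back to the data supplier rather than a result in itself.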

For survey data, check that the number of responses is correct based on routing (e.g. follow-up questions asked only of those who answered yes to a previous question). It is also worth checking that all variables are present.
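
The routing check can be sketched as below, assuming a hypothetical survey in which question q2 is asked only of respondents who answered "yes" to q1. The question names, identifiers and responses are invented for illustration.

```python
# Hypothetical survey responses; None means the question was not answered.
responses = [
    {"id": 1, "q1": "yes", "q2": "weekly"},
    {"id": 2, "q1": "no", "q2": None},
    {"id": 3, "q1": "no", "q2": "daily"},  # answered despite being routed past q2
    {"id": 4, "q1": "yes", "q2": None},    # routed to q2 but gave no answer
]
EXPECTED_VARIABLES = ["id", "q1", "q2"]  # assumed list of variables


def routing_errors(rows, filter_q, filter_value, follow_up_q):
    """Return ids with an unexpected follow-up answer and ids missing one."""
    unexpected, missing = [], []
    for row in rows:
        routed_in = row[filter_q] == filter_value
        answered = row[follow_up_q] is not None
        if answered and not routed_in:
            unexpected.append(row["id"])
        elif routed_in and not answered:
            missing.append(row["id"])
    return unexpected, missing


# Variables that are absent from at least one response.
absent_variables = [
    name for name in EXPECTED_VARIABLES if any(name not in row for row in responses)
]
unexpected, missing = routing_errors(responses, "q1", "yes", "q2")
print(absent_variables, unexpected, missing)
```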

Also check the dataset for possible duplicate entries, which may be the result of a mistake in the data entry.
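
One possible way to look for duplicates is to count how often each identifying key occurs. The sketch below assumes, hypothetically, that the first two fields of each entry (an identifier and a year) should be unique; which fields form the key will depend on the dataset.

```python
from collections import Counter

# Hypothetical entries: (school_id, year, pupils).
entries = [
    ("S001", 2023, 420),
    ("S002", 2023, 515),
    ("S001", 2023, 420),  # repeated entry
    ("S003", 2023, 310),
]


def find_duplicates(rows, key_length):
    """Return each key (the first key_length fields) that occurs more than once."""
    counts = Counter(row[:key_length] for row in rows)
    return {key: n for key, n in counts.items() if n > 1}


duplicates = find_duplicates(entries, key_length=2)
print(duplicates)
```

A repeated key is not always an error, so flagged entries should be reviewed before anything is removed.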

Data checks

Scan through the dataset and consider if the values in the dataset look sensible (e.g. if the Census reported the population of Scotland to be 10 million, it would be apparent that the value is not sensible).

Plotting the data on a chart is a helpful way to identify outliers and to see the trend in the data. Compare the new data with previous years: have there been any large changes between years? If so, do you know the reason why? For example, there may have been a change in methodology or a change in policy intervention. It is important to ensure that possible reasons are clearly evidenced and are not just assumptions.
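
Alongside a chart, large changes between years can be flagged with a simple calculation. The sketch below uses an invented series and an assumed threshold of 10%; the appropriate threshold is a judgement that depends on how much the data normally vary.

```python
# Hypothetical annual figures; not real data.
series = {2019: 1040, 2020: 1065, 2021: 1050, 2022: 1420, 2023: 1435}
THRESHOLD = 0.10  # assumed: flag changes of more than 10% on the previous year


def large_changes(values_by_year, threshold):
    """Return (year, proportional change) for each change larger than threshold."""
    years = sorted(values_by_year)
    flagged = []
    for previous, current in zip(years, years[1:]):
        base = values_by_year[previous]
        if base == 0:
            continue  # a proportional change from zero is undefined
        change = (values_by_year[current] - base) / base
        if abs(change) > threshold:
            flagged.append((current, round(change, 3)))
    return flagged


flagged = large_changes(series, THRESHOLD)
print(flagged)
```

Each flagged year then needs an evidenced explanation, such as a documented change in methodology.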

Check that the values in columns and rows add up to the totals. Cumulative totals in columns and rows are very useful to check the data is accurate and does not contain mistakes (e.g. the area of derelict land in each local authority figures, when added together, should equal the area of derelict land in Scotland figure).
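
A check that component figures sum to a reported total might look like the following sketch. The local authority names and areas are invented, and the tolerance is an assumption intended to allow for rounding in published figures.

```python
import math

# Hypothetical areas of derelict land in hectares; not real data.
local_authority_area = {
    "Authority A": 120.5,
    "Authority B": 80.25,
    "Authority C": 45.0,
}
reported_national_total = 245.75


def totals_match(components, reported_total, tolerance=0.05):
    """Check that the components sum to the reported total within a tolerance."""
    return math.isclose(
        sum(components.values()), reported_total, rel_tol=0.0, abs_tol=tolerance
    )


print(totals_match(local_authority_area, reported_national_total))
```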

If possible, compare data with previously published data to check that they correspond, as some data sources may have already published similar data. Consider similar publications from the past but also other products (compendium publications, PQs, similar analyses produced by other organisations, etc.) so you can be confident that your data are sensible. This is particularly important when preparing compendium publications. As with all quality assurance, if the data do not correspond, consider why not and then take appropriate action.

Ensure that all figures reported throughout a publication are consistent (i.e. if reporting a value in Table 2 of a publication and then again in Table 4, make sure these numbers correspond). Also ensure that numerators and denominators are correctly applied throughout and between tables. Consider the conclusions drawn from the basic data and those drawn from more processed data such as proportions, summary statistics and charts: are they the same? If not, why not?
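
As a rough sketch of a consistency check between tables, the example below compares a figure that appears in two hypothetical tables and recomputes a percentage from its numerator and denominator. The table contents and variable names are invented for illustration.

```python
# Hypothetical extracts from two tables in the same publication.
table_2 = {"total_pupils": 1245, "pupils_with_support": 249}
table_4 = {"total_pupils": 1245, "support_rate_percent": 20.0}


def consistency_issues(counts, rates, decimals=1):
    """List mismatches between a table of counts and a table of rates."""
    issues = []
    if counts["total_pupils"] != rates["total_pupils"]:
        issues.append("total_pupils differs between tables")
    # Recompute the percentage from the numerator and denominator.
    recomputed = round(
        100 * counts["pupils_with_support"] / counts["total_pupils"], decimals
    )
    if recomputed != rates["support_rate_percent"]:
        issues.append(f"support_rate_percent recomputes to {recomputed}")
    return issues


issues = consistency_issues(table_2, table_4)
print(issues)
```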

Consider the ‘common sense’ of the final figures. Where relevant, consider links to other data, for example expenditure data. This might be particularly relevant when considering changes or differences between groups (sectors in the economy, types of farming, local authority data, gender analysis, etc.).

 
