Public dialogue on the use of data by the public sector in Scotland

This report presents the findings from a public dialogue on the use of data in Scotland commissioned by the Scottish Government to explore the ethics of data-led projects. The purpose of the panel was to inform approaches to data use by the Scottish Government and public sector agencies in Scotland.


Looking back: past projects that the Data and Intelligence Network supported

This chapter summarises reflections on four past projects that were presented to the panel. By exploring reactions to and perceptions of specific data projects, this chapter highlights the ethical considerations around data use that were important to participants, which later fed into the guidelines they developed.

These sessions represented a distinct stage in the learning process. By hearing from representatives of real-life data projects, participants had the chance to place some of the concepts outlined in session one (data sharing, data protection, data ethics) into a practical context. The sessions also gave participants the chance to scrutinise projects and put questions directly to the specialists in plenary Q&A sessions.

The past data projects reviewed in sessions two and three were:

  • Shielding list (session two) – Medical records were used by NHS Scotland during the pandemic to identify those more likely to be clinically at risk from COVID-19.
  • CURL (session two) – Health data was linked with residential addresses to improve understanding of health risks in different situations.
  • Equalities (session three) – Information from medical records, education records and census data was used to develop as complete a picture as possible of the protected characteristics across Scotland.
  • Ukrainian Displaced People (session three) – Data was processed and shared during the Ukraine Crisis so that Ukrainians could be safely housed across Scotland.

As well as representatives from each project sharing their reflections, in session three an academic from Tilburg University, Dr Anuj Puri, joined to offer his reflections on the projects from an alternative perspective (having not been involved in them himself). This perspective provided an opportunity for participants to consider other points of view on the ethical issues around the use of data by the public sector.

In this chapter, each project is presented separately, summarising the key ethical considerations raised by the panel as part of their assessments. These reflections formed the basis of the ethical guidelines that the panel developed in later sessions.

Key findings

  • On the shielding list project, the panel felt the benefits of saving lives during the pandemic offset the risks and challenges (recognising the potential harms around asking people to shield and adding them to the shielding list without their consent).
  • On the CURL project, there were more mixed views on the benefits of the project – some felt the linked data could be useful in future while others highlighted a lack of transparency around these possible uses.
  • On the equalities project, the benefits of having this data available for future use were broadly recognised. The challenges associated with it largely hinged on data quality concerns and the risk of this leading to poor policy decisions based on skewed data.
  • On the Ukrainian Displaced People project, the humanitarian aspect was broadly applauded. However, there were also concerns raised about data sharing between countries in the context of war.

The key ethical considerations raised in relation to these past projects included:

  • Ensuring accurate and up-to-date data.
  • Proportionate use and not going beyond the original scope.
  • Weighing up the relative benefits and harms to society.
  • Ensuring transparency and accountability in decisions about what data is used, by whom, and for how long it is held.
  • Ensuring data is held securely.
  • Ensuring the principles of consent are adhered to.

Past project one: Shielding list

During the COVID-19 pandemic, medical records were used by NHS Scotland to identify citizens who were more likely to be clinically at risk from COVID-19. These data were then used to contact individuals to request that they stay in their homes and take extra precautions to minimise their risk of contracting COVID-19. The data were shared with local authorities so that they could provide additional support to any individuals who were shielding. A summary of this project was presented in plenary by the DIN team, followed by smaller breakout discussions and a Q&A with the DIN team in plenary.

Strong positive impact of the shielding programme

The shielding programme was seen as having a positive impact by helping to keep people safe during the pandemic. Participants reflected on how the shielding list had affected their own lives or the lives of those they cared about, and largely felt the risks associated with sharing identifiable health data were outweighed by the benefits of protecting vulnerable groups.

"I have four friends, all of whom were shielding. I think the government got it spot on, and quickly." (Session two)

Potential risk of harm

The panel were mindful of the potential harms associated with receiving a letter and being advised to stay at home (such as negative impacts on people’s mental health and wellbeing). They also highlighted the potential risk to individuals’ privacy. For example, one participant expressed discomfort about being on the shielding list and having such information about them shared. This concern related to the risk, identified earlier by the panel, that data could be used for purposes other than shielding.

The project also sparked some discussion about the issue of consent. One view was that information on people’s health conditions should not be shared beyond the NHS – such as with charities – without their permission. Another view was that it was acceptable to share this information with organisations that could provide support to those who were shielding, as the data may not need to include detailed information (i.e. their name and address but not their health condition). A more exceptional view was that individuals should have been consulted about being included on the shielding list in the first place.

“[A] negative would be maybe the possibility of intrusion, if that is the right word. You don’t want someone to know something and you get a letter discussing that. It could be an issue for someone they have to personally deal with.” (Session two)

Importance of data being used for a specific purpose only

Questions were raised about how long the shielding list data would be held for. In discussions around this, participants highlighted the importance of the data not being kept longer than was needed and only being used for the specific purpose of shielding. On balance, the panel was reassured that data had been used proportionately and appropriately, and that there was a clear justification for its use in this specific case.

“I'd be very nervous about day-to-day sharing of data unless it was for a really important purpose, like shielding.” (Session two)

Risk of gaps in the data

Concerns were also raised about data quality, including how accurate and up to date the data were. Participants noted that gaps in the data may have led to people being missed from the shielding list.

“That's the issue about the data being in the right place and up to date. It's fine if you are keeping your data up to date in the right place, but how do you know if you have missed someone out?” (Session two)

Necessity of clear roles and responsibilities when multiple organisations are sharing data

Participants raised questions about the number of organisations involved in the shielding list project and were unclear about who was accountable.

Questions raised in relation to the shielding list project:

  • “Who is making the judgment calls about who data is transferred to?”
  • “Why are the third party organisations being involved in data sharing?”

Given the range of public sector organisations involved in the shielding list data project (such as health boards, GPs, universities and local authorities) and the range of data sources (such as GP, local authority and academic datasets), the panel felt that clarity over roles and responsibilities was important.

“Most of all, I think that it should be very clear what is being taken and who is getting this data, who it's being shared with.” (Session two)

Participants also felt that the public benefits of projects like the shielding list should be clearly defined and communicated by the organisations involved (not through “long T&Cs”) along with assurances that data were being used responsibly. Having such transparency was linked to building trust in public sector use of data about citizens.

"There's no feedback or follow-up on how it's been used and its impact. If data was used for the good of society, and we knew that, we might trust the organisations more with the data." (Session two)

In concluding discussions on the shielding list project, the panel noted down their key ethical considerations on a digital whiteboard using post-it notes.

Past project two: CURL

Public Health Scotland and academic researchers from the Scottish Centre for Administrative Data Research (SCADR) undertook work to link health data (using Community Health Index – or CHI – numbers) and residential addresses (using Unique Property Reference Numbers or UPRNs). The project was called CHI/UPRN Residential Linkage (CURL). During the pandemic, this project helped the Scottish Government understand the impact of hospital discharges to care homes in terms of COVID-19 outbreaks and improve testing in care homes. A wider aim was to combine this linked dataset with other data for future uses, for example combining it with geography or area-based datasets to understand the impact of flooding on people’s health. A summary of this project was presented to the panel in plenary by a DIN team member, and was followed by smaller breakout discussions and a Q&A with the DIN team member in plenary.

Concerns about widening the scope of the project in future

The primary purpose of this data-led project, to minimise the spread of COVID-19 in care homes, was recognised as a positive one. Participants felt that the “tidying up” of data for future use beyond the pandemic would also be beneficial, for example by helping to understand public health needs at a local level. However, some participants were not clear on the possible benefits of linking such data and what difference it could make in the future.

“The care home scenario was a great use of it, but he was talking about bringing this forward into the future… I don't know how things like insulating the roof, like he said, can have an impact on your overall health.” (Session two)

The scope of the project was therefore viewed as a challenge, given the range of possible future uses that were outlined in the presentation. These possible uses – such as for understanding the impact of flooding on people’s health – were not widely recognised as being relevant to people’s health data and were described as potentially “intrusive”. Although it was deemed appropriate to link this data to protect people in care homes (recognising that this was an emergency situation), the panel considered the lack of transparency around these wider uses to be a risk and questioned the linked data being used more widely without consent. It was suggested that people should be given the opportunity to provide consent for uses of the data that go beyond the original scope (in this case, helping to understand the risk of COVID-19 in care homes and minimise its spread).

“It could be used for good things in the future, but I don't think it's great you can take that system that exists for an emergency and then adapt it for future projects. If there was consent for the people in that household, there may be better awareness. But otherwise it feels quite intrusive.” (Session two)

The panel highlighted the importance of weighing up the benefits and harms that this use of data may have on individuals and society. While they could see the benefits of such data projects during the pandemic, there was also a sense of powerlessness in terms of how data about citizens were used. A clearly outlined public benefit for any future use of the CURL data was therefore deemed to be important.

“It's got to be the impact of the people and the communities. Why are they getting that data, and what would be the impact on the community? It's about having a clear purpose.” (Session two)

Concerns about data security

Other concerns raised about this project were the amount of data being analysed, the extent to which personal data could be accessed via the CHI identification numbers, and the risk of data breaches occurring.

Given the possible future uses of CURL data, participants felt it was important to ensure adequate security was in place to prevent data leakages or misuse. They also felt that the amount of data being collected should be limited to minimise the impact on individuals if such incidents were to occur.

“The more organisations it's shared amongst the more susceptible it is to falling into the wrong hands. They've already mentioned they work with companies, so they know your age, your details. Am I going to be sold insurance products? Do they need all the data that's passed over to them? Are there safeguards in place for that, as well?” (Session two)

Importance of data completeness

Participants also supported the idea of reviewing the data for any gaps that would risk individuals being excluded or not benefitting from initiatives if their CHI number was not known. This reflected a broader need for reassurance that data were being used ethically, robustly, and for the benefit of society. While it was agreed that the data should be as complete as possible, it was also felt that only the minimum amount of data required to fulfil the project objectives should be used.

“In theory, the more data there is, the more potential for misuse, or even use that wasn't its original intent.” (Session two)

Wariness of private sector use of health data

Participants were wary of commercial interests in the CURL project. While reassured by the additional information provided by the specialist on the role of ethical committees in academia and in public bodies to control access to health data, participants felt it was important to consider the risk of misuse by private sector organisations, such as insurance companies. It was deemed appropriate that decisions about the use of health data should be made by the NHS.

In concluding discussions on the CURL project, the panel noted down their key ethical considerations on a digital whiteboard using post-it notes.

Past project three: Equalities and protected characteristics

The equalities and protected characteristics project aimed to develop as complete a picture as possible of the protected characteristics across the Scottish population using information from medical records, education records and census data. The purpose of linking this data together was to enable public bodies and academic organisations to better consider equality issues when planning and delivering services. A summary of this project was presented to the panel in plenary by Duncan Buchanan (Research Data Scotland) and was followed by smaller breakout discussion and Q&A with Duncan in plenary.

Benefits and risks of linked data for future use

The ability to quickly access this linked data in future was considered a benefit of the project. Reflecting on the pandemic, when data needed to be compiled or linked quickly, it was felt that having such data already available would ensure speed and quality if it was ever required urgently. A more exceptional view, however, was that this might result in having data “for the sake of it” and that a clear purpose was lacking.

“It's a good thing they've got access to data, especially following the pandemic so you can roll out help and things like that in a timely fashion and bring these bodies together.” (Session three)

Reflecting on the presentation, the panel felt assured that the organisations involved had been aware of the challenges associated with this type of data linkage and taken steps to address them. For instance, the panel were reassured about the existence of safe havens (secure environments where data is held and can only be accessed by approved researchers). The panel also pointed to the use of various data sources to ensure the information being used was more accurate than if relying on only one source (like the census).

Risk of incomplete data impacting decisions

The risks associated with this project hinged mainly on the issue of data quality. As had been pointed out in the presentation, one of the challenges with this project was accounting for the different ways in which characteristics were recorded and individuals’ changing circumstances. For instance, there was no data available on gender reassignment or sexual orientation, and there were some characteristics (such as religion or disability) that were only recorded every ten years. There was some concern among participants that incomplete data could skew the results, providing inaccurate information and leading to “bad” policy decisions.

“They talked about gender reassignment surgeries not being tracked, but it made me think about broader data that might not be gathered. What you exclude can be very telling. If you don't take some data, or if people refuse to give it, then it still might skew results, and over time, that gets worse and worse.” (Session three)

Other risks and challenges associated with this project included the possibility of identifying an individual due to the amount of information being collected across multiple sources; data being open to abuse if information was not stored securely or if passed onto third parties; and the lack of clear research objectives leading to data being used for purposes not in the public interest or that exacerbate inequality or discrimination.

Importance of having a clear justification and set parameters for using the data

Given the possible future uses of this linked data, concerns were raised over data being passed to third parties. It was therefore felt that any organisation wishing to make use of this data would need a clear justification. There was a view that those seeking to use the data should demonstrate how this would benefit communities and be in the public good. Considering the reflections offered by the academic Anuj Puri on the projects presented in session three, the panel also highlighted the importance of staying within the original scope of a project, especially where the principles of consent apply.

“They're using people's information for a separate project where they haven't asked the individuals. It's been used for other things. I think people should be given the opportunity to say, 'We're going to give your data to a 3rd party,' and say yes or no. At the end of the day, it's for a completely different project.” (Session three)

The challenge of reconciling different ethical issues

These discussions highlighted a broader challenge for some participants in reconciling issues around identifiability, data quality, and consent. Participants still had questions about these aspects and how they related to each other:

Some questions raised by the panel in relation to the equalities project:

  • “To what extent does anonymity, or pseudonymisation, of data compromise data quality?”
  • “How do you minimise the risk to data accuracy while removing identifying information?”
  • “How do you get informed consent if the data is being anonymised?”
  • “Can the ID number somehow be traced back to the individual/their information?”

Questions over the identifiability of data highlighted an ongoing lack of clarity around what impact pseudonymising or de-identifying data would have on other aspects like data quality and individual privacy. There was some reassurance in knowing that measures were in place to help remove some identifying information to protect individuals’ privacy. However, concern remained about the potential for data to become identifiable when linked in ways such as in the equalities project and participants suggested that the public may not be aware of this.

In concluding discussions on the equalities and protected characteristics project, the panel noted down their key ethical considerations on a digital whiteboard using post-it notes.

Past project four: Ukrainian Displaced People

The UK and Scottish Governments processed and shared data during the Ukraine Crisis so that Ukrainians could be safely housed across Scotland. Immigration, safeguarding and housing data were shared between relevant agencies and organisations to ensure displaced people could be safely looked after. A summary of this project was presented to the panel in plenary by a Scottish Government representative and was followed by smaller breakout discussions and a Q&A with the presenter in plenary.

Benefit of defining data use principles

The aim of the project, to support Ukrainian people coming to Scotland, was generally seen as a positive one. Based on the information given in the presentation, participants considered the principles established by those involved in the programme (such as data minimisation, necessity, proportionality, and humanitarianism) to be appropriate and felt that the use of data had been restricted in line with those principles.

“That's how things should be in general. Things should be defined at the very beginning, instead of collecting as much data as possible. Define the principles and then collect what you need.” (Session three)

Concerns about holding and updating data indefinitely

While recognising the humanitarian good of the Ukrainian Displaced People project, the panel identified a number of risks and challenges, such as holding sensitive information about Ukrainian people and their hosts for an indefinite period of time (given the uncertainty around Russia’s invasion of Ukraine) and keeping the data up to date. It was also suggested that the process could be insensitive to Ukrainian refugees who had been through a traumatic experience.

"It seems like reducing people to numbers. Every family's journey has now become a case note." (Session three)

Discussions about this project reflected a broad range of views on wider debates around immigration and refugee policies. In weighing up the relative benefits and risks, participants had different groups in the forefront of their minds; some were thinking about the safety and wellbeing of refugees, and others that of the hosts.

Concerns over data security in an international emergency

This project raised questions about data security, given the international context and the sharing of data between organisations during a period of conflict. The panel recognised the complexities and challenges around this but wanted to know more about how data were kept secure to protect the refugees and hosts, especially when such sensitive information (e.g. criminal records) was shared between different countries and agencies:

Questions raised by the panel in relation to the Ukrainian Displaced People project:

  • “What differences are there between Scotland and other UK nations regarding the approach to data collection on Ukrainian refugees and hosts?”
  • “How is data security managed on the Ukraine project?”
  • “Who has access to data on hosts/refugees? Which delivery agencies/parts of the council?”
  • “Is there any risk to ‘group privacy’ / a risk to the Ukrainian community from how data could be used?”

It was recognised that ensuring data is of good quality takes time, but in an urgent or emergency situation there was a risk that this could be overlooked. Having accurate information was considered important for avoiding any exploitation of individuals – both refugees and hosts – involved in the programme.

The panel raised further considerations in relation to how different countries might approach data sharing and retention, and how any potential differences are accounted for.

“The fact that you're looking at foreign nationals coming into the country. You're holding details about people from another country which needs to be held with sensitivity. You would presumably want to give that information back at some point and probably wouldn't want to hold onto it going forwards. One of the big concerns is, what do you do with the information going forward.” (Session three)

In concluding discussions on the Ukrainian Displaced People project, the panel noted down their key ethical considerations on a digital whiteboard using post-it notes.

Contact

Email: michaela.omelkova@gov.scot
