Statistics.gov.scot improvement project: discovery user research report
The research aimed to understand the current user needs and expectations of the Scottish Government’s site for open access to Scotland’s official statistics: statistics.gov.scot. This programme of user research is one workstream of the discovery project to improve statistics.gov.scot.
Interviews
11 semi-structured interviews were held from 15/07/24 to 14/08/24 with subject experts and stakeholders. Interviews ranged from 30 mins to 1 hour. 6 participants were internal to SG, and 5 were external. Interviews were held either in-person at Victoria Quay, or online using Microsoft Teams or Google Meet. The interviews were led by one member of the SG Open Data Team, with Tom Farrington from Storm ID helping to facilitate and take notes.
The overall aim of these interviews was to gain insights into specific data management, technology, and communications issues potentially facing both the internal project team and the broader open data community. Interviewees also reflected on the project’s findings up to this point, and their multiple perspectives gave the project team a useful sense-check.
Interview schedules included some questions tailored towards the specific interviewee. These were prepared collaboratively by the project team. A summary of interim themes was shared with participants in advance of the sessions for background. This summary and a sample interview schedule are included in Appendix C.
Anonymous qualitative data was generated in the form of notes typed by the interviewers into a secure Confluence Whiteboard. The user researcher then carried out a basic qualitative content analysis of this data, using a version of the template analysis described in the Methodology section.
Interview analysis
Given the more targeted aims of the interviews, and the distinct nature of interview data compared with workshop and focus group data, this analysis does not intentionally build upon the previous analyses (i.e. does not specifically bring the existing template to bear on this data), although the themes below are clearly complementary. Each theme is accompanied by a set of sub-themes, each with illustrative quotations. These findings were presented to the wider Open Data Team for discussion on 11/07/24.
Interview theme 1: Complexity and technical barriers
Interviewees highlighted the difficulties associated with using SPARQL, and expressed a preference for simpler and more user-friendly APIs. There’s a call for a reliable, well-maintained and well-documented API (CKAN is cited here), which is seen as essential for accessibility and usability by more experienced users. Again, on statistics.gov.scot, the complexity of features like ‘data cart’ and ‘dimension locking’ posed barriers even to experienced users, although the intent behind both of these (to partition data prior to download) was seen as admirable.
The terminology used on the site is again seen as overly complex and not intuitive, even for users who are highly technical. There’s a need for the site to cater to users with varying levels of expertise by simplifying language and processes. The name of the site is also seen as a barrier, being difficult to say and spell, and a misnomer to some, which raises the potential need for rebranding to improve user engagement.
Sub-themes and illustrative quotations:
Technical challenges with APIs and tools:
- “I’d rather just have a REST API chucked in front of me - CKAN is what good looks like.”
- “I’d rather use anything [other] than SPARQL”
- “The ‘on-ramp’ is too steep”
User interface and jargon:
- “I’m data literate and [don't understand] the dimension locking, datacube, SPARQL…even if you are an advanced user it’s quite difficult to use”
- “It’s too tangled up in academic terms… built by brainy people for brainy people”
- “The title of the site is confusing and potentially just wrong - you take data and produce statistics. I’d expect to find statistics on the site rather than data.”
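The contrast interviewees drew between a REST-style API and SPARQL can be illustrated with a minimal sketch. It only builds the two request URLs for a keyword search of datasets and makes no network calls. The CKAN base URL is a hypothetical placeholder, the SPARQL endpoint address is an assumption that should be checked against the site's own documentation, and the query shape (datasets typed as qb:DataSet with an rdfs:label) is illustrative rather than a description of how the site's data is actually modelled.

```python
from urllib.parse import urlencode

# Hypothetical CKAN portal, used purely for illustration.
CKAN_BASE = "https://ckan.example.org"
# Assumed endpoint address; confirm against the site's documentation.
SPARQL_ENDPOINT = "https://statistics.gov.scot/sparql"


def ckan_search_url(term: str, rows: int = 5) -> str:
    """Build a dataset search URL for CKAN's Action API (package_search)."""
    params = urlencode({"q": term, "rows": rows})
    return f"{CKAN_BASE}/api/3/action/package_search?{params}"


def sparql_search_url(term: str, limit: int = 5) -> str:
    """Build a roughly equivalent keyword search as a SPARQL GET request.

    The user needs to know the RDF vocabularies and write a graph pattern.
    The term is not escaped here, so this is unsuitable for untrusted input.
    """
    query = f"""
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX qb: <http://purl.org/linked-data/cube#>
SELECT ?dataset ?label WHERE {{
  ?dataset a qb:DataSet ;
           rdfs:label ?label .
  FILTER(CONTAINS(LCASE(STR(?label)), "{term.lower()}"))
}} LIMIT {limit}
""".strip()
    return f"{SPARQL_ENDPOINT}?{urlencode({'query': query})}"


if __name__ == "__main__":
    print(ckan_search_url("population"))
    print(sparql_search_url("population"))
```

The first request needs only a search term, while the second requires prior knowledge of prefixes, classes and graph patterns, which is one way of reading the comment that the ‘on-ramp’ is too steep.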
Interview theme 2: Challenges in data publication and maintaining standards
Interviewees who had published data on statistics.gov.scot noted that the process is cumbersome and counterintuitive. Again, even experienced users felt inadequate or frustrated when interacting with statistics.gov.scot. Those who do not publish data still wanted the process to be simple. The difficulty of publishing is compounded by the lack of integration between different platforms, which forces statisticians to reprocess data multiple times. This is frustrating and demotivating.
The strict requirements for adherence to data standards and the correct use of specific geography codes are seen as challenging but vital. Some interviewees suggested that responsibility for meeting data standards must lie with those providing data.
There is a recurring theme around the need for better data governance, with calls for clearer standards and more consistent practices across different datasets. Interviewees pointed out that inconsistent metadata and descriptions often undermine the usability of the data, and there's a strong push towards centralised and standardised data management practices. Interviewees also pointed out the potential utility of moving towards cloud-based solutions. There is some overlap here with theme 4.
Sub-themes and illustrative quotations:
Difficulty in data publishing:
- “I used to feel like an idiot when trying to upload data in my previous role.”
- “Some teams are spending 3 months making publishable tables on gov.scot - they then have to reprocess it to publish [elsewhere] - how do we get all these platforms to talk to each other to avoid this?”
- “I would rather people publish the data badly than not publish at all!”
Need for improved data governance:
- “We need to have data governance across all organisations - it’s not that complicated but it’s not well communicated.”
Interview theme 3: Trust and reliability concerns
Some interviewees agreed with concerns that the site’s appearance, poor performance, and the inconsistent quality of data undermine trust in the platform. There is a perception that the site is unreliable, which affects user confidence. Out-of-date content and a lack of clarity about dataset licensing were mentioned as issues that can undermine trust and usability. Interviewees also noted that many potential users (e.g. academics and other researchers) may be unaware of the site, due to a lack of recent communications and publicity.
In the context of similar platforms, interviewees noted that a lack of clarity around data provenance and the inconsistent application of metadata and licensing information could lead to misuse or misinterpretation of data, particularly in commercial contexts.
Sub-themes and illustrative quotations:
Perceived reliability and reputation:
- “[statistics.gov.scot] is quite out of date - could it be that the gov only publishes data that isn’t published elsewhere?”
- “There was a backlash about the clunkiness of the platform last time it was publicised”
- “a lot of academics don’t use it or aren’t aware of it - some third-sector people don’t know it”
- “What about a complete re-brand? Brand loyalty feels like the opposite situation.”
Impact on data trust:
- “[statistics.gov.scot] needs a reputation as being official and meeting quality standards”
- “[It’s] not always clear what the licence is, for specific datasets.”
Interview theme 4: Strategic and structural recommendations
Several interviewees advocated for a more centralised approach to data governance and management, which would help streamline the processes of data publication and access. The DAMA wheel was cited as illustrating the foundations of good data governance. Again, the move towards the cloud was cited as an opportunity to pursue this approach.
Interviewees mentioned the need for better communication between data publishers and users, including mechanisms for user feedback. Interviewees highlighted the utility of closing the feedback loop, which would not only improve data quality but also help publishers understand how their data is being used and valued.
Sub-themes and illustrative quotations:
Centralisation and standardisation:
- “there’s a need for a top down directive to sort out data and management of data”
- “[statistics.gov.scot] sits a way off on its own and doesn’t connect to anything”
- “Some people can’t put their data in, not because they don’t want to, but because the platform won’t allow it.”
Feedback mechanisms and communication:
- “Closing the feedback loop between publishers and users is a priority.”
Interview theme 5: Purpose and design
There was general agreement that the current site does not clearly communicate what it is for, and who it is for (e.g. general users, data analysts, both). This makes it difficult for users to know what to expect and whether or not the site will be useful to them.
Interviewees noted the importance of designing the site with the user in mind; a clear understanding of user needs should drive the platform's development. This includes simplifying user journeys and ensuring that the site’s content and tools are accessible to an appropriately broad audience. The site must offer structured and standardised guidance to both data publishers and users.
Sub-themes and illustrative quotations:
Confusion around the site’s purpose and intended audience:
- “Users don’t care about the technical aspects… they just want answers to simple questions.”
- “Build it for what you need just now, in a way that you can create future iterations.”
Need for user-centred design:
- “statistics.gov.scot looks like someone just got carried away!”
- “we could probably reasonably conclude from this Discovery that we don’t have something that is meeting user needs and so we need a redesign from the ground up.”
- “A full-scale system wide rethink and redesign of how we disseminate data seems sensible.”
- “whittle it down to the things that absolutely need to be there before you add other things on.”
Contact
Email: auren.clarke@gov.scot