Small Business Bonus Scheme: evaluation

This report presents the results of an evaluation of the Small Business Bonus Scheme (SBBS), and provides recommendations in relation to the SBBS and non-domestic rates relief more broadly.


Appendix F: NDR – IDBR data linkage exercise (provided by the Scottish Government)

Executive summary

  • A dataset of 130,861 non-domestic properties derived from the 2018 Non-Domestic Rates (NDR) Valuation Roll and Billing System Snapshot was matched to the Inter-Departmental Business Register (IDBR).
  • In total, 47% of all properties were matched successfully, using a variety of name and address information.
  • The dataset included both SBBS recipients and non-SBBS recipients. The match rate for SBBS recipients was identical to the overall match rate of 47%.
  • Properties with a higher rateable value were more likely to match to the IDBR than those with a lower rateable value.

Background

This report describes the process and results of linking data on Small Business Bonus Scheme (SBBS) relief recipients to the IDBR. The data-linkage exercise was carried out to gain a better economic understanding of the relief recipients as part of the SBBS Review. The linked data will be used by the contractor, the Fraser of Allander Institute, to describe the types of businesses in receipt of SBBS rates relief, and will allow linkage to other economic datasets through the IDBR reference number.

To gauge the feasibility of the data-linkage exercise, we first worked with statisticians in the Office of the Chief Economic Advisor (OCEA) to link a sample of 1,000 SBBS recipients to the IDBR. The initial match rate of the sample was 42%. After further review of the data we deleted a number of incorrect matches. Most of these resulted from matching to the wrong name or address information in the sample, for example where a match was based on proprietor name and address even though the ratepayer was the tenant. The final match rate of the sample was 34%, which was considered sufficient to proceed with the full data-linkage exercise, carried out by the ONS. We used the results of the matched sample to inform the specification of the full exercise, which led to additional name and address information being included. As shown below, this contributed to a higher match rate in the full data-linkage exercise.[51]

Non-domestic rates data

We selected all SBBS relief recipients from the most common property core descriptions, to avoid including property types that we knew were unlikely to be on the IDBR, e.g. public toilets and sewage tanks. We also limited our selection to those core descriptions that had at least 100 relief recipients. A full list of all core descriptions included in the dataset is provided in Section F.1. In total, 86% of all SBBS relief recipients were included in the dataset for matching. We also included a number of properties with the same core descriptions and with a rateable value (RV) below £20,000 which were not in receipt of any kind of NDR relief, to allow the contractor to compare SBBS relief recipients to non-recipients.

Our NDR data provides a number of different name and address variables. To allow all of these to be used in the data-linkage exercise, we split the data into four datasets that used different combinations of the name and address variables. This resulted in the following datasets:

1. Business name & property address (130,861 observations)

2. Ratepayer name & ratepayer address (97,362 observations)

3. Business name & proprietor, tenant or occupier (PTO) address (97,648 observations)

4. Ratepayer name & property address (97,398 observations) [52]

Dataset 1 is larger than the others because it includes non-SBBS relief recipients.
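
The split into four name and address combinations can be sketched as follows. This is a minimal illustration only: the field names are hypothetical (the report does not give the NDR variable names), and dropping a record from a dataset when either field is missing is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Property:
    # Hypothetical field names; the real NDR variable names are not given in this report.
    property_id: str
    business_name: Optional[str]
    ratepayer_name: Optional[str]
    property_address: Optional[str]
    ratepayer_address: Optional[str]
    pto_address: Optional[str]


# Name/address pairing used by each of the four datasets listed above.
COMBINATIONS = {
    1: ("business_name", "property_address"),
    2: ("ratepayer_name", "ratepayer_address"),
    3: ("business_name", "pto_address"),
    4: ("ratepayer_name", "property_address"),
}


def split_into_datasets(properties):
    """Return {dataset number: [(property_id, name, address), ...]}.

    Assumption: a property is left out of a dataset when either the name
    or the address for that combination is missing.
    """
    datasets = {k: [] for k in COMBINATIONS}
    for p in properties:
        for k, (name_field, addr_field) in COMBINATIONS.items():
            name, addr = getattr(p, name_field), getattr(p, addr_field)
            if name and addr:
                datasets[k].append((p.property_id, name, addr))
    return datasets
```

Each dataset can then be submitted for matching separately, with the property identifier carried through so that results can be merged back afterwards.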

Analysis

ONS matched all four datasets separately to the IDBR. Matches were searched for on any record on the IDBR (VAT, PAYE, local unit, reporting unit or enterprise). For each dataset, we received three files with results: matches, non-matches and multiple matches. We merged these with the original rates datasets. For each match, ONS provided an enterprise reference number as well as all Scottish local unit numbers associated with that enterprise number. Given the research question, we focused our analysis on the enterprise-level results only, although the final dataset contains a count variable that indicates the number of Scottish local units associated with each enterprise reference number.

ONS also included matches to historical IDBR records. On advice from colleagues in OCEA, we decided to exclude any matches to firms that had ceased trading prior to 1 January 2017.

For each matched property we received a match score. ONS only included those matches with a match score of 84% or higher.
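
The two filters described above (the score threshold applied by ONS and the exclusion of firms that ceased trading before 1 January 2017) can be sketched as a single predicate. The function and argument names are hypothetical, and representing a live record by a missing death date is an assumption.

```python
from datetime import date
from typing import Optional

CUTOFF = date(2017, 1, 1)  # matches to firms that ceased trading before this date are excluded
MIN_SCORE = 84.0           # ONS only returned matches with a score of 84% or higher


def keep_match(score: float, death_date: Optional[date]) -> bool:
    """Return True if a candidate match passes both filters.

    Assumption: a record that is still live has no death date (None).
    """
    if score < MIN_SCORE:
        return False
    return death_date is None or death_date >= CUTOFF
```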

As it was possible for properties to match to different enterprises across the four datasets, we used the following selection criteria to decide which matches to keep:

1. If only one match across all four datasets, use that reference number.

2. If matched to the same reference number in all datasets where there is a match, use that reference number.

3. If matched to different reference numbers across the datasets, use the reference number that occurs most frequently.

4. If matched to different reference numbers across the datasets and no reference number occurs more frequently than another, use the reference number with the highest match score.

There were no cases in which several reference numbers had the same frequency, and the same match scores, so these four criteria were considered sufficient to select an enterprise number to retain for all matches.
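
The four criteria above can be sketched as one function. This is an illustrative sketch rather than the code used in the exercise; the function name and the input format (a list of reference and score pairs, at most one per dataset) are assumptions.

```python
from collections import Counter


def select_reference(matches):
    """Pick one enterprise reference number for a property.

    `matches` is a list of (reference, match_score) pairs, at most one per
    dataset. Returns None when the property matched in no dataset.
    """
    if not matches:
        return None
    counts = Counter(ref for ref, _ in matches)
    top = max(counts.values())
    candidates = [ref for ref, n in counts.items() if n == top]
    if len(candidates) == 1:
        # Criteria 1 to 3: a single match, agreement across datasets,
        # or one reference occurring more often than any other.
        return candidates[0]
    # Criterion 4: tie on frequency, so fall back to the highest match score.
    best_score = {ref: max(s for r, s in matches if r == ref) for ref in candidates}
    return max(candidates, key=best_score.get)
```

Criteria 1 to 3 collapse into the "most frequent reference" rule, because a single match or a unanimous match is simply a case where only one reference has the top count.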

Additional checks

To assess the quality of the match we carried out some further checks. We first inspected a number of matches manually to see if they were as expected. This did not reveal any unexpected matches.

We also reviewed where multiple properties matched to the same enterprise. Across the matched dataset, in the vast majority of cases (99.4%) fewer than ten properties matched to a single enterprise. Most enterprises (82.4%) matched to a single property.

We checked cases where more than 50 properties had matched to the same enterprise. Examples of enterprises to which a large number of properties matched, with those properties still receiving SBBS relief, include local authorities, time-share companies and a large business operating as a franchise.
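
A check of this kind can be sketched as below. The function name and input format are hypothetical, and the shares are computed over enterprises, which is one reading of the percentages quoted above.

```python
from collections import Counter


def enterprise_summary(property_to_enterprise, large_threshold=50):
    """Summarise how many properties matched to each enterprise.

    `property_to_enterprise` maps a property id to its retained enterprise
    reference. Returns a tuple of:
      - share of enterprises matched to exactly one property,
      - share of enterprises matched to fewer than ten properties,
      - sorted list of enterprises with more than `large_threshold` properties.
    """
    per_enterprise = Counter(property_to_enterprise.values())
    n = len(per_enterprise)
    if n == 0:
        return 0.0, 0.0, []
    single = sum(1 for c in per_enterprise.values() if c == 1) / n
    under_ten = sum(1 for c in per_enterprise.values() if c < 10) / n
    large = sorted(e for e, c in per_enterprise.items() if c > large_threshold)
    return single, under_ten, large
```

The list of large enterprises is the set that would then be inspected by hand.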

We also matched the results back to the pilot to see if results were similar. Of those cases matched in the pilot, around 60% had the same enterprise reference number in the full-data match. The remaining 40% or so yielded a different result in the full-data match. Of these, about three quarters had a match in the pilot but did not match to the IDBR in the full data-linkage exercise; the remaining quarter matched to a different reference number in the full exercise than in the pilot.

Results

Through linking the matched datasets back to the original sample we are able to calculate the match rate of the data-linkage exercise. Table F.1 presents the match rate across the four datasets as well as the overall match rate for the whole sample, which is 47%.

Table F.1: Match rates across different datasets
Dataset name Matching variables Size Matches Match rate Matched to dead record Non-matches Multiples
Dataset 1 Business name and property address 130,861 36,625 28% 5,478 86,917 2,021
Dataset 2 Ratepayer name and ratepayer address 97,362 24,472 25% 3,682 67,434 1,881
Dataset 3 Business name and PTO Address 97,648 35,327 36% 5,558 51,441 5,530
Dataset 4 Ratepayer name and property address 97,398 31,793 33% 4,690 59,342 1,681
All - 130,861 61,902 47% - - -

Note: Business Name and Proprietor, Tenant, and Occupier (PTO) Address are derived from the PTOA dataset. Property Address comes from the Valuation Roll. Ratepayer Name and Ratepayer Address come from the Billing System Snapshot. See Section F.2 for more detail.

Dead records are IDBR records with a death date before 1 January 2017 (i.e. firms that had ceased trading prior to this date).

The IDBR excludes sole traders or partnerships with no employees and an annual turnover of less than the VAT threshold. This would suggest that smaller businesses (which are less likely to be on the IDBR) are more likely to be excluded from the matched sample. Comparing the mean RV of matched and non-matched cases did in fact show that the mean RV for properties that matched to the IDBR is around £2,000 higher than the RV of properties that did not match, a statistically significant difference. This indicates that properties with a higher RV are more likely to match to the IDBR. This is in line with the findings of the smaller sample matching exercise.
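
The report does not state which test was used to compare mean RV. As one common choice, a Welch's t-test (which does not assume equal variances) could be sketched as follows; the function name is hypothetical and the sample values are purely illustrative, not taken from the NDR data.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    va = variance(a) / len(a)  # squared standard error of the first mean
    vb = variance(b) / len(b)  # squared standard error of the second mean
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Illustrative rateable values only (in pounds), not taken from the NDR data.
matched_rv = [6000, 8000, 9000, 12000]
unmatched_rv = [3000, 5000, 6000, 7000]
t_stat, dof = welch_t(matched_rv, unmatched_rv)
```

A positive statistic indicates a higher mean RV among matched properties; its significance would be read from a t distribution with the returned degrees of freedom.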

Table F.2 shows the match rate by RV quintile. This shows that as RV increases, properties are more likely to match to the IDBR. Table F.3 shows that properties that were part of a chain were more likely to match to the IDBR than those which occupied a single site. Once again this illustrates that larger businesses were more likely to be matched than smaller ones.

The match rate for SBBS and non-SBBS relief recipients was identical, at 47%. In total, 46,040 SBBS properties were matched, 47% of all SBBS properties (35% of all properties included in the sample).

Table F.2: Match rate by RV quintile
RV Quintiles Matched properties All properties Match rate Matched SBBS properties All SBBS properties SBBS match rate
Q1 (lowest) 8,334 26,530 31% 6,096 18,350 33%
Q2 10,138 25,419 40% 7,993 20,621 39%
Q3 12,238 26,135 47% 9,907 21,314 46%
Q4 14,427 26,438 55% 11,545 21,024 55%
Q5 (highest) 16,765 26,339 64% 10,499 16,091 65%
All 61,902 130,861 47% 46,040 97,400 47%

Table F.3: Match rate by site count
Site count Matched properties All properties Match rate Matched SBBS properties All SBBS properties SBBS match rate
Single 35,597 82,664 43% 31,690 71,190 45%
Multiple 26,305 48,197 55% 14,350 26,210 55%
All 61,902 130,861 47% 46,040 97,400 47%
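
The match rates in these tables are matched properties divided by all properties, expressed as a whole-number percentage. As a check, the short sketch below recomputes the overall rates in Table F.2 from the counts shown; the helper name and the round-half-up rule are assumptions.

```python
def match_rate(matched: int, total: int) -> int:
    """Match rate as a whole-number percentage (rounded half up; an assumption)."""
    return int(100 * matched / total + 0.5)


# (matched properties, all properties) by RV quintile, taken from Table F.2.
quintiles = {
    "Q1": (8334, 26530),
    "Q2": (10138, 25419),
    "Q3": (12238, 26135),
    "Q4": (14427, 26438),
    "Q5": (16765, 26339),
}

rates = {q: match_rate(m, n) for q, (m, n) in quintiles.items()}

# Overall rate from the column totals.
total_matched = sum(m for m, _ in quintiles.values())
total_properties = sum(n for _, n in quintiles.values())
overall = match_rate(total_matched, total_properties)
```

The quintile counts sum to the totals in the "All" row (61,902 matched out of 130,861), and the recomputed rates rise from 31% in the lowest quintile to 64% in the highest.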

The tables below compare the matched and non-matched properties by local authority, property core description and property class.

Property class and core descriptions are classifications used by the Scottish Assessors to describe the type of a property. Properties are classified into one of 20 classes, which describe the general type such as 'shop' or 'hotel', and into one of 202 core descriptions, such as 'retail shop' or 'hostel'.

Table F.4: Match rate by local authority
Local Authority Matched properties All properties Match rate Matched SBBS properties All SBBS properties SBBS match rate
Aberdeen City 1,991 3,738 53% 1,015 1,789 57%
Aberdeenshire 3,264 6,122 53% 2,643 5,143 51%
Angus 1,335 2,663 50% 1,068 2,251 47%
Argyll & Bute 1,919 5,166 37% 1,367 3,735 37%
City of Edinburgh 5,851 11,376 51% 3,970 7,746 51%
Clackmannanshire 436 903 48% 314 592 53%
Dumfries & Galloway 2,539 5,412 47% 2,085 4,697 44%
Dundee City 1,571 3,041 52% 1,193 2,401 50%
East Ayrshire 993 2,357 42% 765 1,843 42%
East Dunbartonshire 771 1,500 51% 633 1,196 53%
East Lothian 914 1,921 48% 754 1,666 45%
East Renfrewshire 546 988 55% 440 769 57%
Falkirk 1,477 2,882 51% 984 2,065 48%
Fife 3,707 7,793 48% 2,808 6,380 44%
Glasgow City 6,523 13,482 48% 4,859 10,337 47%
Highland 4,609 10,701 43% 3,462 8,419 41%
Inverclyde 685 1,413 48% 495 1,059 47%
Midlothian 939 1,738 54% 640 1,182 54%
Moray 1,188 2,555 46% 879 2,054 43%
Na h-Eileanan Siar 404 1,376 29% 308 1,179 26%
North Ayrshire 1,215 3,164 38% 976 2,583 38%
North Lanarkshire 2,817 5,622 50% 2,159 4,188 52%
Orkney Islands 546 1,217 45% 432 1,058 41%
Perth & Kinross 2,793 5,096 55% 2,197 4,165 53%
Renfrewshire 1,918 6,795 28% 1,384 2,593 53%
Scottish Borders 1,992 4,343 46% 1,653 3,718 44%
Shetland Islands 460 943 49% 338 727 46%
South Ayrshire 1,350 2,783 49% 1,010 2,190 46%
South Lanarkshire 3,326 5,895 56% 2,383 3,878 61%
Stirling 1,452 2,819 52% 1,033 2,091 49%
West Dunbartonshire 766 1,880 41% 594 1,371 43%
West Lothian 1,605 3,177 51% 1,199 2,335 51%
All 61,902 130,861 47% 46,040 97,400 47%

Table F.5: Match rate by core description
Core description Matched properties All properties Match rate Matched SBBS properties All SBBS properties SBBS match rate
Boathouse 72 219 33% 58 179 32%
Bowling Club 177 398 44% 170 384 44%
Café 460 831 55% 395 678 58%
Car Wash 67 208 32% 45 142 32%
Caravan Site 179 477 38% 135 392 34%
Cattery 29 128 23% 27 118 23%
Clinic 208 371 56% 98 217 45%
Club 306 811 38% 245 692 35%
Community Centre 88 196 45% 27 109 25%
Day Nursery 255 379 67% 171 250 68%
Electricity 226 331 68% 129 179 72%
Factory 473 806 59% 367 596 62%
Garage 1,543 2,858 54% 1,286 2,340 55%
Garden Centre 55 113 49% 51 101 50%
Guest House 349 1,359 26% 308 1,222 25%
Hall 359 1,735 21% 244 1,493 16%
Hostel 132 270 49% 91 188 48%
Hotel 407 811 50% 275 564 49%
Kennels 119 383 31% 105 340 31%
Kiosk 79 223 35% 50 161 31%
Museum 60 153 39% 39 121 32%
Office 15,133 26,367 57% 9,730 16,506 59%
Public House 883 1,554 57% 697 1,140 61%
Restaurant 692 1,329 52% 563 1,019 55%
Riding School 65 182 36% 59 166 36%
Self-Catering 3,469 14,682 24% 2,809 12,874 22%
Shop 17,092 34,403 50% 13,654 27,870 49%
Showroom 314 513 61% 242 398 61%
Store 5,836 15,173 38% 4,046 8,680 47%
Studio 241 955 25% 202 832 24%
Surgery 990 1,592 62% 736 1,146 64%
Time Share Units 877 1,113 79% 715 944 76%
Warehouse 1,017 1,653 62% 644 1,016 63%
Workshop 8,101 15,412 53% 6,637 12,484 53%
Yard 1,549 2,873 54% 990 1,859 53%
All 61,902 130,861 47% 46,040 97,400 47%

Table F.6: Match rate by property class
Class Matched properties All properties Match rate Matched SBBS properties All SBBS properties SBBS match rate
Care Facilities 183 333 55% 128 235 54%
Cultural 210 383 55% 129 263 49%
Education and Training 36 65 55% 30 53 57%
Garages and Petrol Stations 1,534 2,839 54% 1,285 2,335 55%
Health and Medical 1,138 1,896 60% 790 1,318 60%
Hotels 1,106 3,854 29% 775 3,137 25%
Industrial Subjects 17,278 37,042 47% 12,937 25,584 51%
Leisure, Entertainment, Caravans etc. 4,920 16,652 30% 4,097 14,656 28%
Offices 15,150 26,346 58% 9,763 16,522 59%
Other 198 584 34% 163 496 33%
Petrochemical 1 1 100% - - -
Public Houses 873 1,542 57% 690 1,133 61%
Public Service Subjects 387 1,582 24% 201 1,267 16%
Religious 4 16 25% 3 11 27%
Shops 18,698 37,408 50% 14,957 30,216 50%
Sporting Subjects 23 75 31% 22 73 30%
Statutory Undertaking 163 243 67% 70 101 69%
All 61,902 130,861 47% 46,040 97,400 47%

F.1 Property core descriptions included in data matching
Core description Number of SBBS recipients
Shop 27,871
Office 16,509
Self-catering 12,874
Workshop 12,485
Store 8,681
Garage 2,340
Yard 1,859
Hall 1,493
Guest House 1,222
Surgery 1,147
Public House 1,140
Restaurant 1,020
Warehouse 1,018
Time Share Units 944
Studio 832
Club 692
Café 678
Factory 596
Hotel 566
Showroom 398
Caravan Site 392
Bowling Club 384
Kennels 340
Day Nursery 250
Clinic 217
Hostel 188
Boathouse 179
Electricity 179
Riding School 166
Kiosk 161
Car Wash 142
Museum 121
Cattery 118
Community Centre 110
Garden Centre 101
All 97,413

F.2 Variables included in data matching

The variables included in the data matching were derived from three different datasets: the Billing System Snapshot, Valuation Roll and PTOA dataset.

Billing System Snapshot

The Billing System Snapshot is collated annually and holds information used by local authorities to keep track of payments and reliefs. The Ratepayer name and address are derived from this dataset.

Valuation Roll

The Valuation Roll is maintained by Scottish Assessors and holds information on all non-domestic properties. The Property address variables come from this dataset.

PTOA dataset

Finally, the PTOA dataset holds at least one Proprietor, Tenant, or Occupier record for each Valuation Roll entry. There can be a tenant or occupier in addition to a proprietor, so this dataset may hold multiple name and address variables for each property. The occupier status variable identifies whether the property is occupied by the Proprietor, Tenant, or Occupier, or is vacant. This variable is also used to derive the Business Name variable, which takes the Proprietor name where the property is vacant or occupied by the proprietor, the Tenant name where it is occupied by a tenant, and the Occupier name where there is an occupier.
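
The derivation of the Business Name variable can be sketched as a simple lookup on occupier status. The status labels and the function signature below are hypothetical; the report does not give the codes used in the PTOA dataset.

```python
from typing import Optional


def business_name(
    status: str,
    proprietor: Optional[str],
    tenant: Optional[str],
    occupier: Optional[str],
) -> Optional[str]:
    """Derive Business Name from the occupier status variable.

    The status labels ("vacant", "proprietor", "tenant", "occupier") are
    hypothetical stand-ins for whatever codes the PTOA dataset uses.
    """
    if status in ("vacant", "proprietor"):
        # Vacant or proprietor-occupied: use the Proprietor name.
        return proprietor
    if status == "tenant":
        return tenant
    if status == "occupier":
        return occupier
    raise ValueError(f"unknown occupier status: {status!r}")
```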

Variable: Business name - Source: PTOA dataset

Variable: Ratepayer name - Source: Billing System Snapshot

Variable: Property address - Source: Valuation Roll

Variable: Ratepayer address - Source: Billing System Snapshot

Variable: PTOA address - Source: PTOA dataset

Contact

Email: ndr@gov.scot
