Growing up in Scotland: children's social, emotional and behavioural characteristics at entry to primary school
This report investigates the extent and nature of social, emotional and behavioural difficulties among Scottish school children around the age they enter primary one, and shows which children are most likely to have these difficulties.
APPENDIX 3 - DESCRIPTION OF CLUSTER ANALYSIS
The children were grouped into clusters using k-means clustering in SPSS v15. In this method the desired number of clusters is specified beforehand. Random starting points are used as initial cluster centres and the data are grouped around these points. An iterative procedure then re-calculates the cluster centres and re-groups the data until the variance within each cluster is as low as possible. A number of different groupings were run, from 4 to 7 clusters. A five-cluster solution was eventually chosen as best representing the shared patterns of behaviour and best differentiating between different types of children.
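The procedure described above can be illustrated with a minimal sketch in Python. This is not the SPSS routine used for the report. The function name `kmeans`, the `seed` and `max_iter` parameters, the use of Euclidean distance and the toy scores are all assumptions made for illustration.

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Basic k-means (Lloyd's algorithm). Returns (centres, labels)."""
    rng = random.Random(seed)
    # Random starting points: k distinct cases serve as the initial centres.
    centres = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: attach each case to its nearest centre.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centres[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # no case changed cluster, so the grouping has settled
        labels = new_labels
        # Update step: move each centre to the mean of its current members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centre if a cluster becomes empty
                centres[c] = [sum(col) / len(members) for col in zip(*members)]
    return centres, labels


# Hypothetical toy scores, one tuple per child, in the order:
# conduct, emotional, hyperactivity, peer problems, pro-social.
toy_scores = [
    (1, 0, 1, 0, 10), (1, 1, 2, 0, 9), (0, 0, 1, 1, 10),
    (3, 1, 7, 2, 7), (4, 2, 8, 2, 6), (3, 1, 9, 3, 7),
]
centres, labels = kmeans(toy_scores, k=2)
print(labels)
```

In practice the same call would be repeated for each candidate number of clusters (4 to 7 in the report) and the resulting profiles compared. The choice between solutions rests on interpretability, which the code itself cannot judge.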
Table 2.2 in section 2.3.1 shows mean individual scores for each cluster on each of the SDQ scales, plus scores for the overall sample. Table A3.1 shows the median and range of scores for each cluster. Together, the tables show in detail how the profiles of the clusters differ. A summary of the behavioural characteristics of the children in each cluster is also provided in section 2.3.1.
Table A3.1 Median scores (and range) on SDQ scales by cluster
SDQ scale | Cluster 1 | Cluster 2 | Cluster 3 | Cluster 4 | Cluster 5
---|---|---|---|---|---
Conduct problems | 2 (0 - 6) | 1 (0 - 4) | 3 (0 - 9) | 3 (0 - 8) | 2 (0 - 6)
Emotional symptoms | 1 (0 - 4) | 0 (0 - 5) | 1 (0 - 7) | 4 (2 - 10) | 1 (0 - 4)
Hyperactivity | 3 (0 - 6) | 1 (0 - 3) | 7 (4 - 10) | 4 (0 - 10) | 4 (3 - 8)
Peer problems | 1 (0 - 6) | 0 (0 - 6) | 2 (0 - 10) | 3 (0 - 8) | 0 (0 - 5)
Pro-social | 6 (0 - 9) | 10 (6 - 10) | 7 (0 - 10) | 9 (4 - 10) | 9 (7 - 10)
Base (unweighted) | 377 | 820 | 251 | 220 | 565
To validate the groupings defined by the cluster analysis, a series of checks was run on the data. Table A3.2 shows the mean, minimum and maximum distances of individual cases from their cluster centres. The distance measures are based on the SDQ scores that were used to cluster the children. These data can be used to examine how similar the children in each cluster are, as ideally the individuals within each cluster should be alike. The more similar the children in a cluster, the lower their distances from the cluster centre.
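As an illustration of this check, the sketch below computes the mean, minimum and maximum distance of cases from their own cluster centre. It assumes Euclidean distance; the function name `distance_summary` and the toy data are hypothetical and do not come from the report.

```python
import math
from collections import defaultdict


def distance_summary(points, labels, centres):
    """Mean, min and max distance of cases from their own cluster centre."""
    by_cluster = defaultdict(list)
    for p, lab in zip(points, labels):
        # Euclidean distance from the case to the centre of its cluster
        # (an assumption; the report does not name the distance measure).
        by_cluster[lab].append(math.dist(p, centres[lab]))
    return {
        lab: {"mean": sum(d) / len(d), "min": min(d), "max": max(d)}
        for lab, d in sorted(by_cluster.items())
    }


# Hypothetical two-dimensional example with two clusters.
toy_points = [(0, 0), (2, 0), (10, 0), (10, 4)]
toy_labels = [0, 0, 1, 1]
toy_centres = [(1, 0), (10, 2)]
summary = distance_summary(toy_points, toy_labels, toy_centres)
for lab, stats in summary.items():
    print(lab, stats)
```

A cluster with a low mean and a small maximum is tightly packed, while a large maximum points to at least one case lying far from the rest.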
Table A3.2 Distance of cases from the cluster centre
Cluster number | Mean distance | Minimum distance | Maximum distance
---|---|---|---
1 | 2.53 | 0.41 | 6.99
2 | 1.96 | 0.72 | 5.89
3 | 3.39 | 1.24 | 10.14
4 | 3.45 | 0.91 | 7.25
5 | 2.17 | 0.63 | 4.85

Distances are measured from each case to the centre of the cluster to which it was classified.
The table suggests that individuals within Cluster 3 are the most dispersed, with the largest maximum distance (10.14), whereas Cluster 2 has the lowest mean distance (1.96), suggesting the children within this cluster are the most alike. Overall, the differences between clusters are not large, so there is no strong evidence that any single cluster should be split further. As a final check, the children on the cluster peripheries were excluded and mean SDQ scores for each cluster were re-calculated. Individuals on the periphery are more likely to be outliers with scores at the extreme ends of the scale ranges, which could affect the profile of the cluster. The analysis showed that the means were unchanged. The outliers were therefore not affecting the overall profile of each cluster, which indicates that the clusters are robust.
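The final check can be sketched as follows. The report does not say how the periphery was defined, so the `keep_fraction` parameter (the share of cases nearest the centre that is retained), the function names and the toy data are assumptions made purely for illustration.

```python
import math


def mean_profile(cases):
    """Column-wise mean of a list of equal-length score tuples."""
    return [sum(col) / len(cases) for col in zip(*cases)]


def trimmed_profile(cases, keep_fraction=0.75):
    """Mean profile after dropping the cases farthest from the centre."""
    centre = mean_profile(cases)
    # Rank cases by distance from the cluster centre, nearest first.
    ranked = sorted(cases, key=lambda p: math.dist(p, centre))
    # keep_fraction is a hypothetical cut-off, not taken from the report.
    n_keep = max(1, math.floor(len(ranked) * keep_fraction))
    return mean_profile(ranked[:n_keep])


# Hypothetical cluster of four cases, one of which is a clear outlier.
cluster_cases = [(0, 0), (2, 0), (1, 0), (1, 8)]
full = mean_profile(cluster_cases)
trimmed = trimmed_profile(cluster_cases)
shift = [abs(a - b) for a, b in zip(full, trimmed)]
print(full, trimmed, shift)
```

In this toy example the outlier moves the second score noticeably, so the cluster would not count as robust. In the report the re-calculated means were unchanged, which is the opposite outcome.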