Scottish Study of Early Learning and Childcare: Three-year-olds (Phase 3) Report - Updated 2021
Findings from the third phase of the Scottish Study of Early Learning and Childcare (SSELC), a research project established to evaluate the expansion of early learning and childcare in Scotland.
Appendix D: Regression analysis
Tables D1 to D6 show the results of logistic regression analysis of whether a child has delayed development on at least two domains of the Ages and Stages Questionnaire and of raised / high score on the Strengths and Difficulties Questionnaire total difficulties scale at Phase 3.
Logistic regression analysis is a method of summarising the relationship between a binary 'outcome' variable and one or more 'predictor' variables. It allows us to estimate the odds of a child having a score of '1' on the outcome variable (as opposed to '0') from knowledge of their scores on the predictor variables. In the model shown in Table D1, a score of '1' on the dependent variable refers to exhibiting development that is on schedule for at least four of the five ASQ domains (that is, no delayed development, or delayed development on just one of the domains), while a '0' refers to exhibiting delayed development on two or more of the domains.
Logistic regression allows us to consider multiple relationships at the same time and to identify those relationships between a predictor variable and the outcome variable which remain statistically significant even when we take into account other predictor variables. For those variables that do remain significant, we can say that they show an independent association with the outcome variable while controlling for all other factors in the model.
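As a rough illustration of how a logistic model turns a child's scores on the predictor variables into odds, the sketch below adds up coefficients on the log-odds scale and exponentiates the result. The two coefficients are the natural logs of odds ratios reported in Table D1 (1.7 for 'Female' and 2.3 for the highest home learning environment quartile). The intercept is a made-up placeholder, since the appendix does not report one, and the function and variable names are illustrative only.

```python
import math


def predicted_odds(intercept, coefficients, values):
    """Odds of outcome '1' for one child under a logistic model."""
    log_odds = intercept + sum(
        coefficients[name] * values[name] for name in coefficients
    )
    return math.exp(log_odds)


def odds_to_probability(odds):
    """Convert odds to a probability between 0 and 1."""
    return odds / (1.0 + odds)


# Coefficients are logs of odds ratios shown in Table D1.
coefficients = {"female": math.log(1.7), "hle_top_quartile": math.log(2.3)}
# ASSUMPTION: placeholder intercept, not a value from the study.
intercept = math.log(0.5)

boy = {"female": 0, "hle_top_quartile": 0}
girl = {"female": 1, "hle_top_quartile": 0}

odds_boy = predicted_odds(intercept, coefficients, boy)
odds_girl = predicted_odds(intercept, coefficients, girl)

# With every other predictor held fixed, the ratio of the two odds
# equals the odds ratio for 'Female'.
print(odds_girl / odds_boy)
```

Whatever intercept is chosen, the ratio of the two odds stays the same, which is why an odds ratio can be reported without reference to a particular baseline.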
The first two regression models are longitudinal models. All of the predictor variables included are from Phase 1, while the outcome variables are from Phase 3. This introduces a time element allowing us to say for certain that all of the predictor variables predate the outcome – although this does not imply causality. The other four models are cross-sectional, taking all the data from Phase 3.
Tables D1 to D6 show how the odds for each category of each predictor variable compare with the odds for the reference (or 'base') category. An odds ratio greater than 1 indicates that, holding all other factors constant, a child in that category has higher odds of being in category '1' for the outcome variable than a child in the base category. For example, in Table D1, the odds ratio of 1.7 for the category 'Female' means that the odds of a girl exhibiting development that is on schedule for at least four of the five ASQ domains are 1.7 times those for a boy (the base category), holding all other factors constant. Conversely, an odds ratio below 1 means that children in that category have lower odds of being in category '1' than children in the base category.
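An odds ratio multiplies odds, not probabilities, so its effect on the probability of the outcome depends on where the base category starts. The sketch below applies the odds ratio of 1.7 from the example above to two baseline probabilities. Both baselines are hypothetical values chosen for illustration, not figures from the study, and the function name is an assumption.

```python
def apply_odds_ratio(base_probability, odds_ratio):
    """Probability of outcome '1' after multiplying the base odds by an odds ratio."""
    base_odds = base_probability / (1.0 - base_probability)
    new_odds = base_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


# ASSUMPTION: hypothetical baseline probabilities for the base category.
for base_probability in (0.5, 0.8):
    shifted = apply_odds_ratio(base_probability, 1.7)
    print(f"base {base_probability:.2f} -> {shifted:.3f}")
```

The same odds ratio produces a smaller change in probability when the baseline is already high, which is one reason odds ratios should not be read as "1.7 times as likely" in the everyday sense.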
Because data are taken from a sample, the odds ratios are only estimates, so we also include a confidence interval around each estimate. If the survey were to be repeated many times, we would expect intervals calculated in this way to contain the true value 95 times out of 100.
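The appendix does not say how the intervals were calculated. A common approach, assumed here, is a Wald interval: take the coefficient on the log-odds scale, add and subtract 1.96 standard errors, and exponentiate. The sketch below works backwards from a published interval to the standard error it implies, and checks whether an interval excludes 1. All function names are illustrative.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval


def wald_interval(odds_ratio, standard_error, z=Z_95):
    """Interval for an odds ratio, built on the log-odds scale (ASSUMED method)."""
    log_or = math.log(odds_ratio)
    return (
        math.exp(log_or - z * standard_error),
        math.exp(log_or + z * standard_error),
    )


def implied_standard_error(lower, upper, z=Z_95):
    """Standard error of the log odds ratio implied by a Wald interval."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)


def interval_excludes_one(lower, upper):
    """True when the interval lies wholly above or wholly below 1."""
    return lower > 1.0 or upper < 1.0


# 'Female' in Table D1: interval (0.9 - 3.0), significance 0.076.
se = implied_standard_error(0.9, 3.0)
centre = math.sqrt(0.9 * 3.0)  # midpoint of the interval on the log scale
print(wald_interval(centre, se))
print(interval_excludes_one(0.9, 3.0))
```

An interval that includes 1 corresponds to a category that is not significant at the 0.05 level. Because the published bounds are rounded to one decimal place, a bound shown as 1.0 cannot be classified reliably this way.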
Two measures of statistical significance are provided. The first is for the comparison between a particular category and the base category, while the second is for the variable as a whole. Where the independent variable has just two categories, these are the same. A significance level of 0.05 or less indicates that there is a 5% chance or less that we would have found these differences between the categories just by chance if in fact no such difference exists, hence we can say that we are 95% sure there is a relationship between the predictor and outcome variables. A level of <0.001 indicates that there is a less than 0.1% chance, so we can say that we are 99.9% sure that the relationship exists. For the purposes of Tables 4 and 5, we described a level of significance of less than 0.01 as "highly significant", of between 0.01 and 0.05 as "moderately significant", and of between 0.05 and 0.10 as "marginally significant".
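The three descriptive bands can be written as a small lookup. This is a sketch only: the text does not say which band the exact boundary values 0.01 and 0.05 fall into, so the handling of those boundaries is an assumption, as is the "not significant" label for values of 0.10 and above.

```python
def describe_significance(p_value):
    """Map a significance level to the descriptive bands used in the report."""
    if p_value < 0.01:
        return "highly significant"
    # ASSUMPTION: 0.01 itself is placed in the 'moderately' band.
    if p_value < 0.05:
        return "moderately significant"
    # ASSUMPTION: 0.05 itself is placed in the 'marginally' band.
    if p_value < 0.10:
        return "marginally significant"
    # ASSUMPTION: label for anything at or above 0.10.
    return "not significant"


# Overall significance values taken from Table D1.
for label, p in [
    ("Sex of child", 0.076),
    ("Highest qualification of respondent", 0.023),
    ("Area deprivation (SIMD)", 0.755),
]:
    print(label, "->", describe_significance(p))
```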
The Nagelkerke R-square value provided at the bottom of each model is a rough indication of the proportion of variation in the outcome variable explained by the predictor variables in the model. In the first two models this is between 0.2 and 0.25, which is fairly typical for this type of analysis, while in the subsequent models this is lower still. This means that there is a lot of variation in the data which is not explained by the variables (and nor would we expect it to be).
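For reference, the Nagelkerke R-square is usually computed from the log-likelihoods of the fitted model and of a model with no predictors. It rescales the Cox and Snell R-square so that its maximum is 1. The sketch below uses the standard formula. The log-likelihood values in the example are hypothetical, since the appendix does not report them.

```python
import math


def nagelkerke_r_square(ll_null, ll_model, n):
    """Nagelkerke R-square from the null and fitted log-likelihoods."""
    # Cox and Snell R-square: 1 - (L_null / L_model) ** (2 / n)
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    # Largest value Cox and Snell can take for this null model
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell / max_cox_snell


# ASSUMPTION: invented log-likelihoods; only n = 243 comes from Table D1.
print(round(nagelkerke_r_square(ll_null=-150.0, ll_model=-130.0, n=243), 2))
```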
All models have been tested for stability through the systematic removal of variables to check for changes in odds ratios and significance of other variables, and checks on the covariation of independent variables, and all were found to be stable. Because of the small sample size for all the models, but particularly the longitudinal ones, it was not possible to include a large number of predictor variables. Instead, a number of key variables were forced into each of the models. Other variables were then systematically tested to check for a significant association with the outcome variable when controlling for other factors. Only tested variables that were significant at the 10% (0.10) level were included in the final models as presented in Tables D1 to D6. The variables which were forced into the models are included in the tables irrespective of their level of significance. Forced variables for the longitudinal models were the Phase 1 score (ASQ 4+ domains on schedule or SDQ total difficulties on schedule, to match the outcome), the ITERS rating from the setting observation data, sex of the child and area deprivation of the home address. Forced variables for the cross-sectional models were sex of the child, area deprivation of the home address and whether the child has a long-term condition which may affect their development, as reported by either the keyworker or the parent. A list of all the variables considered for inclusion in the regression models is given below.
Longitudinal models (All variables from Phase 1) | Cross-sectional models (All variables from Phase 3) |
---|---|
SDQ total difficulties score (SDQ model only) (banded) | |
ASQ 4+ domains on schedule (ASQ model only) (banded) | |
ITERS total score (from setting observations) | |
Sex | Sex |
Area deprivation (SIMD) | Area deprivation (SIMD) |
In employment | In employment |
Equivalised household income (banded) | Equivalised household income (banded) |
Number of parents in household | Number of parents in household |
Number of siblings | Number of siblings |
Ethnic group | Ethnic group |
Language spoken at home | Language spoken at home |
Highest qualification of respondent | Highest qualification of respondent |
Longstanding illness | Longstanding illness |
Ever breastfed | Ever breastfed |
Whether sleeps through night | Whether sleeps through night |
Hours sleep per 24 (banded) | Hours sleep per 24 (banded) |
Home learning environment (banded) | Home learning environment (banded) |
Parental warmth scale (banded) | Parental warmth scale (banded) |
Parental longstanding illness | Parental longstanding illness |
Parent age (banded) | Parent age (banded) |
Short WEMWBS (banded) | Short WEMWBS (banded) |
Parental self-efficacy | Parental self-efficacy |
Any formal childcare (other than nursery) | Any formal childcare (other than nursery) |
Any informal childcare | Any informal childcare |
Feelings about amount of support | Feelings about amount of support |
Total hours of childcare (banded) | |
| Confusion, hubbub and order scale (banded) |
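
The selection rule described before the variable list (forced variables always kept, other variables kept only if significant at the 0.10 level) could be sketched as below. This is one simple reading of the procedure, not the authors' actual code. The `p_value_for` callable is a hypothetical stand-in for fitting a model and reading off the overall significance of the tested variable, and the p-values in the example are placeholders rather than results from the study.

```python
def select_predictors(forced, candidates, p_value_for, threshold=0.10):
    """Keep all forced variables, plus candidates significant at the threshold.

    p_value_for(candidate, forced) is a HYPOTHETICAL callable returning the
    overall significance of `candidate` when added to the forced variables.
    """
    selected = list(forced)
    for name in candidates:
        # ASSUMPTION: a value exactly at the threshold counts as significant.
        if p_value_for(name, forced) <= threshold:
            selected.append(name)
    return selected


# Forced variables for the cross-sectional models, as listed in the text.
forced = ["Sex", "Area deprivation (SIMD)", "Long-term condition"]

# ASSUMPTION: placeholder p-values for illustration only.
placeholder_p = {
    "Home learning environment (banded)": 0.02,
    "Number of siblings": 0.40,
    "Ethnic group": 0.09,
}

final_model = select_predictors(
    forced,
    list(placeholder_p),
    lambda name, _forced: placeholder_p[name],
)
print(final_model)
```
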

Table D1: Longitudinal model (ASQ)

Variable and category | Odds ratio | Confidence interval | Sig. (compared with base) | Sig. (overall) |
---|---|---|---|---|
Phase 1 - ASQ summary measure | | | | <0.001 |
On schedule for at least four domains | 5.4 | (2.6 - 11.2) | <0.001 | |
Not on schedule for at least two domains | | | | |
Sex of child | | | | 0.076 |
Female | 1.7 | (0.9 - 3.0) | 0.076 | |
Male (+ missing) | | | | |
Phase 1 - Area deprivation (SIMD) of home address | | | | 0.755 |
less deprived (+ missing) | 0.9 | (0.5 - 1.6) | 0.755 | |
20% most deprived | | | | |
Phase 1 - ITERS total score | | | | 0.112 |
Not observed | 0.2 | (0.1 - 0.9) | 0.037 | |
6+ | 0.4 | (0.1 - 1.0) | 0.056 | |
5 - <6 | 0.9 | (0.4 - 2.0) | 0.708 | |
4 - <5 | 0.7 | (0.3 - 1.7) | 0.431 | |
<4 | | | | |
Phase 1 - Highest qualification of respondent | | | | 0.023 |
Degree / HE | 0.9 | (0.4 - 2.0) | 0.710 | |
Upper school / post-school/pre-HE (Highers, HNC, etc.) (+ missing) | 2.3 | (1.2 - 4.4) | 0.015 | |
None / lower school (Standard Grade, etc.) | | | | |
Phase 1 - Home learning environment scale | | | | 0.021 |
Highest quartile (most frequent activities) | 2.3 | (1.1 - 4.8) | 0.021 | |
Other (+ missing) | | | | |
n = 243 | Nagelkerke R-square = 0.24 |

Table D2: Longitudinal model (SDQ)

Variable and category | Odds ratio | Confidence interval | Sig. (compared with base) | Sig. (overall) |
---|---|---|---|---|
Phase 1 - SDQ total difficulties score | | | | <0.001 |
Close to average | 2.9 | (1.7 - 4.9) | <0.001 | |
Raised/High/Very high | | | | |
Sex of child | | | | 0.028 |
Female | 1.8 | (1.1 - 3.2) | 0.028 | |
Male (+ missing) | | | | |
Phase 1 - Area deprivation (SIMD) of home address | | | | 0.185 |
less deprived (+ missing) | 1.5 | (0.8 - 2.6) | 0.185 | |
20% most deprived | | | | |
Phase 1 - ITERS total score | | | | 0.455 |
Not observed | 0.6 | (0.2 - 2.3) | 0.460 | |
6+ | 0.4 | (0.1 - 1.3) | 0.118 | |
5 - <6 | 0.5 | (0.2 - 1.3) | 0.144 | |
4 - <5 | 0.7 | (0.3 - 1.9) | 0.518 | |
<4 | | | | |
Phase 1 - Home learning environment scale | | | | 0.007 |
Highest quartile (most frequent activities) | 2.7 | (1.3 - 5.4) | 0.007 | |
Other (+ missing) | | | | |
n = 267 | Nagelkerke R-square = 0.19 |

Table D3: Cross-sectional model

Variable and category | Odds ratio | Confidence interval | Sig. (compared with base) | Sig. (overall) |
---|---|---|---|---|
Sex of child | | | | 0.350 |
Female | 1.3 | (0.7 - 2.3) | 0.350 | |
Male (+ missing) | | | | |
Area deprivation (SIMD) of home address | | | | 0.509 |
3/4/5 - less deprived | 0.8 | (0.5 - 1.5) | 0.509 | |
1/2 - 40% most deprived (+ missing) | | | | |
Long term health condition | | | | 0.003 |
No (+ missing) | 3.1 | (1.5 - 6.6) | 0.003 | |
Yes | | | | |
Home learning environment scale | | | | 0.019 |
Highest quartile (most frequent activities) | 3.3 | (1.5 - 7.1) | 0.003 | |
3rd (+ missing) | 1.4 | (0.7 - 2.7) | 0.347 | |
2nd | 1.2 | (0.6 - 2.6) | 0.577 | |
Lowest quartile (least frequent activities) | | | | |
n = 243 | Nagelkerke R-square = 0.11 |

Table D4: Cross-sectional model

Variable and category | Odds ratio | Confidence interval | Sig. (compared with base) | Sig. (overall) |
---|---|---|---|---|
Sex of child | | | | 0.078 |
Female | 1.5 | (1.0 - 2.3) | 0.078 | |
Male (+ missing) | | | | |
Area deprivation (SIMD) of home address | | | | 0.135 |
3/4/5 - less deprived (+ missing) | 1.4 | (0.9 - 2.1) | 0.135 | |
1/2 - 40% most deprived | | | | |
Language usually spoken at home | | | | 0.072 |
English only | 1.7 | (1.0 - 3.2) | 0.072 | |
Other languages (including English and other languages) | | | | |
Highest qualification of respondent | | | | 0.002 |
Degree / HE | 2.9 | (1.6 - 5.1) | <0.001 | |
Upper school / post-school/pre-HE (Highers, HNC, etc.) (+ missing) | 2.0 | (1.2 - 3.5) | 0.008 | |
None / lower school (Standard Grade, etc.) | | | | |
Child long-term health condition | | | | 0.223 |
No (+ missing) | 1.6 | (0.8 - 3.3) | 0.223 | |
Yes | | | | |
Parental long-term condition | | | | 0.054 |
No (+ missing) | 1.7 | (1.0 - 3.0) | 0.054 | |
Yes | | | | |
n = 515 | Nagelkerke R-square = 0.10 |

Table D5: Cross-sectional model

Variable and category | Odds ratio | Confidence interval | Sig. (compared with base) | Sig. (overall) |
---|---|---|---|---|
Sex of child | | | | 0.470 |
Female | 1.2 | (0.7 - 2.0) | 0.470 | |
Male (+ missing) | | | | |
Area deprivation (SIMD) of home address | | | | 0.348 |
3/4/5 - less deprived | 1.3 | (0.7 - 2.4) | 0.348 | |
1/2 - 40% most deprived (+ missing) | | | | |
Long-term health condition | | | | 0.098 |
No (+ missing) | 1.7 | (0.9 - 3.1) | 0.098 | |
Yes | | | | |
n = 253 | Nagelkerke R-square = 0.02 |

Table D6: Cross-sectional model

Variable and category | Odds ratio | Confidence interval | Sig. (compared with base) | Sig. (overall) |
---|---|---|---|---|
Sex of child | | | | 0.020 |
Female | 1.6 | (1.1 - 2.5) | 0.020 | |
Male (+ missing) | | | | |
Area deprivation (SIMD) of home address | | | | 0.002 |
3/4/5 - less deprived (+ missing) | 2.1 | (1.3 - 3.3) | 0.002 | |
1/2 - 40% most deprived | | | | |
Ethnic group | | | | 0.026 |
Non-white | 0.4 | (0.2 - 0.9) | 0.026 | |
White (+ missing) | | | | |
Long-term health condition | | | | 0.148 |
No (+ missing) | 1.8 | (0.8 - 4.1) | 0.148 | |
Yes | | | | |
Confusion, hubbub and order scale (CHAOS) | | | | 0.029 |
Lowest/middle tertile (least chaotic) (+ missing) | 1.6 | (1.1 - 2.5) | 0.029 | |
Highest tertile (most chaotic) | | | | |
Total hours of childcare (formal and informal) | | | | 0.029 |
> 18 | 1.6 | (1.1 - 2.6) | 0.029 | |
Up to 18 | | | | |
n = 518 | Nagelkerke R-square = 0.11 |
Contact
Email: socialresearch@gov.scot