This paper illustrated that data pooling generally led to better or more reliable estimates of the variables of interest. Doubling the sample led to moderate reductions in RSEs for the selected COAG performance measures while quadrupling it reduced RSEs by about half.
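The relationship between sample size and RSE reported above is close to what the usual square-root rule would suggest. The sketch below is illustrative only: it assumes simple random sampling, an unchanged point estimate, and a standard error proportional to 1/sqrt(n). The function name and the starting RSE of 10% are hypothetical and are not taken from the paper.

```python
import math


def rse_after_pooling(rse_single: float, sample_multiplier: float) -> float:
    """Approximate RSE after the sample grows by `sample_multiplier`.

    Assumption: the standard error shrinks with 1/sqrt(n) and the point
    estimate itself does not change, so the RSE shrinks by the same factor.
    """
    return rse_single / math.sqrt(sample_multiplier)


rse = 10.0  # hypothetical single-year RSE, in per cent
doubled = rse_after_pooling(rse, 2)      # roughly a 29% reduction
quadrupled = rse_after_pooling(rse, 4)   # a 50% reduction

print(f"single-year RSE: {rse:.2f}%")
print(f"doubled sample:  {doubled:.2f}%")
print(f"quadrupled:      {quadrupled:.2f}%")
```

Under these assumptions, doubling the sample gives a moderate reduction and quadrupling it halves the RSE. Complex survey designs and overlapping samples would generally give smaller gains than this idealised rule.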
Data pooling on this scale, however, may not enable detection of very small changes in slow-moving indicators for which very large increases in sample size (particularly at jurisdiction or SEIFA-quintile level) are required. The data pooling options considered in this paper did not enable detection of the small changes in the selected education attainment/participation measures and much larger increases in sample size would be required to measure such small changes, if they exist.
It should be noted that there is no reason why data pooling should always lead to detection of small changes, since in some cases no real underlying change exists between two estimates. Pooling may produce better and more reliable estimates that bring the estimates for two periods closer together than the corresponding single-year estimates. In that case, no statistically significant difference between the two estimates would be detected, which would be the correct conclusion to draw.
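The reasoning above can be made concrete with a simple test of the difference between two estimates. This is a minimal sketch, assuming the two estimates come from independent samples and that a normal approximation with a 95% critical value of 1.96 is appropriate. The function name and all figures are hypothetical, chosen for illustration rather than drawn from the paper.

```python
import math


def change_is_significant(est1, se1, est2, se2, z_crit=1.96):
    """Two-sided z-test for the difference between two survey estimates.

    Assumption: the estimates are independent, so the variance of the
    difference is the sum of the two variances (no covariance term).
    Returns (is_significant, z_statistic).
    """
    se_diff = math.sqrt(se1 ** 2 + se2 ** 2)
    z = (est2 - est1) / se_diff
    return abs(z) > z_crit, z


# Hypothetical single-year estimates (per cent) with standard errors of 1.0
sig_single, z_single = change_is_significant(84.0, 1.0, 85.5, 1.0)

# Same estimates with standard errors halved, as if the sample were quadrupled
sig_pooled, z_pooled = change_is_significant(84.0, 0.5, 85.5, 0.5)

# Pooled estimates that have moved closer together: no change detected,
# even though the standard errors are smaller
sig_closer, z_closer = change_is_significant(84.6, 0.5, 85.0, 0.5)

print(sig_single, round(z_single, 2))
print(sig_pooled, round(z_pooled, 2))
print(sig_closer, round(z_closer, 2))
```

The third case shows the situation described above: more precise estimates do not guarantee a significant difference if the estimates themselves converge. Where the samples for the two periods overlap, a covariance term would be needed and this simple form would not apply.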
Data pooling resulted in no appreciable gain in the detection of change over a two-year period in the attainment indicator NEA 7 - the proportion of the 20-24 year old population having attained at least Year 12 or equivalent or AQF Certificate Level II or above. This indicator is very slow moving, reflecting the relatively long period over which Australian youth have already had good access to, and high levels of participation in, education and training, both at school and in tertiary settings. Furthermore, this indicator is based on a small population, and the sample, even when doubled, does not provide a sufficient base from which to detect change in the survey estimates over a two-year period.
The assessment of the potential benefit of data pooling for NEA 9 – the proportion of young people aged 15-19 years participating in post school education or training, and NEA 10 – the proportion of 18 to 24 year olds engaged in full time employment, education or training at or above Certificate III, was affected by the occurrence of the Global Financial Crisis in 2009. While data pooling detected change over the two-year period 2008 to 2010, so did single-year data over the critical year 2008 to 2009.
Compared with single-year data over a one-year period, there were some gains at the state/territory level in the measurement of change using data pooled over two years for NASWD 2 – the proportion of 20-64 year olds who do not have qualifications at or above a Certificate III level. This indicator is steadily declining and has a large population base. The continuing decline in this at-risk indicator reflects generational change, whereby older age cohorts with, on average, relatively low qualification levels are being replaced by younger ones with higher qualifications. It may also reflect, in part, access across all ages to continuing education and training opportunities. While NASWD 2 has a sufficiently large sample and rate of change for data pooling to lead to improvements in accuracy at the state/territory level, these gains are offset to the extent that single-year data were also sufficient for measuring the change that occurred over a two-year period.
Nevertheless, the data pooling methodology investigated here may have application in other situations, for instance where the annual rate of change of a variable is such that it is just below the threshold of statistical significance when based on comparison of single-year estimates. Data pooling allows for the sample from one survey to be boosted with data from other surveys without the additional cost of collection and with no further provider load on the community. The methodology adopted here, based on re-weighting the combined sample to the latest survey reference point, resulted in estimates that were broadly consistent between single-year and pooled data. Although the pooled data were lagging due to changes in population structure and labour force participation over the period in which pooling took place, estimates for narrow age ranges of about 5 to 10 years were remarkably similar.
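The idea of re-weighting a combined sample to the latest reference point can be sketched as a simple post-stratification step. The example below is only an illustration and is not the weighting procedure used in the study: it assumes each record carries an original survey weight and a stratum label (here an age group), and that population benchmarks for the latest reference point are available. The function name, field names and all numbers are hypothetical.

```python
from collections import defaultdict


def reweight_to_benchmarks(records, benchmarks):
    """Scale the weights of pooled records so that, within each stratum,
    they sum to the population benchmark at the latest reference point.

    records:    list of dicts with 'stratum' and 'weight' keys
    benchmarks: dict mapping stratum -> benchmark population total
    Returns a new list of records with an added 'pooled_weight' key.
    """
    # Sum the original weights of the pooled sample within each stratum
    totals = defaultdict(float)
    for r in records:
        totals[r["stratum"]] += r["weight"]

    # Apply one scaling factor per stratum
    return [
        {**r, "pooled_weight": r["weight"] * benchmarks[r["stratum"]] / totals[r["stratum"]]}
        for r in records
    ]


# Hypothetical records pooled from two survey years
pooled = [
    {"year": 2009, "stratum": "20-24", "weight": 500.0},
    {"year": 2009, "stratum": "20-24", "weight": 500.0},
    {"year": 2010, "stratum": "20-24", "weight": 520.0},
    {"year": 2010, "stratum": "20-24", "weight": 520.0},
    {"year": 2009, "stratum": "25-29", "weight": 490.0},
    {"year": 2010, "stratum": "25-29", "weight": 500.0},
]

# Hypothetical population benchmarks at the latest (2010) reference point
benchmarks_2010 = {"20-24": 1040.0, "25-29": 500.0}

reweighted = reweight_to_benchmarks(pooled, benchmarks_2010)

# Check: pooled weights now sum to the 2010 benchmark within each stratum
check = defaultdict(float)
for r in reweighted:
    check[r["stratum"]] += r["pooled_weight"]
print(dict(check))
```

Because every record in a stratum receives the same scaling factor, earlier-year respondents still reflect the characteristics of their own survey period. This is one way to see why pooled estimates can lag when population structure or labour force participation is shifting.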
Some of the limitations of this study might be overcome if it were possible to pool data over a single calendar year. This would further minimise the lag effect of pooling and allow some short-term trends, for example in the engagement of youth in work or study, to be better detected.
This page last updated 14 September 2011