Combining data from multiple surveys: the SEW Data Pooling Project
Users of official statistics are becoming more sophisticated, requiring estimates at more disaggregated or smaller subpopulation levels. Individual surveys, however, are often not large or extensive enough to provide estimates at the desired level, and budgetary pressures preclude carrying out separate, smaller surveys for each analytical problem. There is potentially great value, therefore, in pooling or combining data from existing sources to construct new estimates at the required levels. In addition to the work described in the previous article, the Analytical Services Branch (ASB), in conjunction with the National Centre for Education and Training, has begun an investigation into the feasibility of pooling or combining data from different education surveys to derive better estimates of selected key educational participation and attainment measures.
The main survey of interest is the ABS Survey of Education and Work (SEW), which policy departments use to produce key performance measures of Australian youth participation and attainment in education and training. There is strong interest in the disaggregation of these measures by state/territory and in movements over time (year to year). As a supplementary survey to the Labour Force Survey, SEW delivers accurate point-in-time estimates at the national level and reasonably accurate estimates for all states/territories except the Northern Territory. In addition to more accurate estimates at state/territory level, and possibly for other relevant subpopulation groupings (e.g., sex, age, area of socioeconomic disadvantage), stakeholders are interested in the ability of the data to detect relatively small movements in the key performance measures from year to year. The relatively small sample sizes for the required variables in the current SEW surveys do not allow detection of small year-to-year movements, particularly for smaller jurisdictions and subpopulations.
The aim of the SEW Data Pooling project is therefore to assess the feasibility and benefits of pooling or combining current SEW data with historical SEW surveys and/or other surveys, in order to improve the accuracy of the key performance measures of participation and attainment. Broadly, the project will investigate, under different options, the improvements in accuracy of both single-year and movement estimates of the key performance measures, where accuracy is measured by standard errors (SEs) and relative standard errors (RSEs). Improvements will be assessed relative to the accuracy achieved from estimates based only on SEW.
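The accuracy comparison described above can be illustrated with a minimal sketch. It is not the project's method, which the article does not specify: it assumes two independent, unbiased estimates of the same proportion and combines them by inverse-variance weighting, a standard textbook technique swapped in for illustration. All figures and function names are hypothetical, and real survey pooling would also need to account for differences in scope, design and sample overlap.

```python
import math

def rse(estimate, se):
    """Relative standard error, expressed as a percentage of the estimate."""
    return 100.0 * se / estimate

def pool_inverse_variance(estimates, ses):
    """Combine independent estimates of the same quantity.

    Each estimate is weighted by 1 / SE^2 (an assumption of this sketch,
    not a method stated in the article).
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled, pooled_se

def movement_se(se_start, se_end):
    """SE of a year-to-year difference, assuming independent samples."""
    return math.sqrt(se_start ** 2 + se_end ** 2)

# Hypothetical figures for one small jurisdiction (not real survey results).
sew_est, sew_se = 0.80, 0.04      # estimate from SEW alone
other_est, other_se = 0.78, 0.03  # estimate from a second, independent survey

pooled_est, pooled_se = pool_inverse_variance(
    [sew_est, other_est], [sew_se, other_se]
)

print(f"SEW only: estimate={sew_est:.4f}, RSE={rse(sew_est, sew_se):.2f}%")
print(f"Pooled:   estimate={pooled_est:.4f}, RSE={rse(pooled_est, pooled_se):.2f}%")
print(f"Movement SE, SEW only in both years: {movement_se(sew_se, sew_se):.4f}")
print(f"Movement SE, pooled in both years:   {movement_se(pooled_se, pooled_se):.4f}")
```

Under these assumptions the pooled SE is always smaller than the smallest input SE, and the SE of a movement estimate shrinks accordingly, which is the kind of improvement the project sets out to quantify against the SEW-only baseline.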
Initially, the following four key COAG educational measures will be the focus of analytical comparison: the proportion of 18 to 24 year olds engaged in [full time] employment, education or training at or above Certificate III level; the proportion of 19 year olds who have completed Year 12 or equivalent or Certificate II or above; the proportion of 20 to 24 year olds who have completed Year 12 or equivalent or Certificate II or above; and the proportion of 25 to 29 year olds who have completed Certificate III or above. These measures have been chosen because of their prominence in reporting against the COAG National Education Agreement or in the MCEETYA annual National Report on Schooling.
Possible options for data pooling include: combining the current SEW with one or more previous SEWs; combining SEW with another monthly Labour Force supplementary survey (e.g., Labour Mobility Survey, Job Experience Survey, Underemployed Workers Survey, Childhood Education and Care Survey); combining SEW with a Special Social Survey (e.g., Adult Literacy and Life Skills, Survey of Education and Training, Survey of Disability, Ageing and Carers); and any combination of the above three options.
If the initial phase demonstrates that the approach is feasible, a second phase will undertake the actual pooling or combining of data to identify where improvements are possible, and will recommend the data pooling process and option that deliver the greatest accuracy improvements in the identified key educational performance measures.
For more information, please contact Anil Kumar on (02) 6252 5344 or firstname.lastname@example.org