Combining data from multiple surveys: LFS and NATSIHS
The ABS has been investigating the feasibility of combining data from multiple surveys to improve estimates of population totals. The key advantage in combining data is increased sample size, which reduces sampling error. However, inconsistencies between surveys, in areas such as scope, sample design and questions, may increase non-sampling error if they are not accounted for when combining data.
As reported in Methodological News two quarters ago, a research project in the Analytical Services Branch (ASB) is using the Labour Force Survey (LFS) and the National Aboriginal and Torres Strait Islander Health Survey (NATSIHS) to evaluate the benefits and issues involved in combining data from two surveys. The surveys were combined to produce labour force estimates for the Indigenous population in Australia. Both surveys collect labour force information, but each uses a different set of questions to ascertain a respondent's labour force status; the LFS questions are more detailed.
Three approaches to combining the data were considered. In Approach 1, the LFS and NATSIHS labour force variables were assumed to be consistent and were combined to produce a labour force estimate. In Approaches 2 and 3, the LFS labour force variable was taken as the 'gold standard' and the NATSIHS labour force variable was assumed to contain some measurement error. In Approach 2, a 'NATSIHS' labour force variable was imputed for each LFS respondent, which allowed the use of a two-phase estimator. In Approach 3, an 'LFS' labour force variable was imputed for each NATSIHS respondent, and the imputed NATSIHS data were then combined with the LFS to produce a labour force estimate. In some cases the imputation was stochastic: respondents were assigned a probability of being employed, unemployed or not in the labour force.
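The stochastic imputation mentioned above can be illustrated with a minimal Python sketch. It assumes, hypothetically, that each respondent carries a survey weight and a set of imputed probabilities over the three labour force states, and that a weighted total is formed by summing weight times probability. The names `Respondent` and `expected_totals` and all figures are illustrative only; the article does not describe the imputation model or the estimator actually used.

```python
from dataclasses import dataclass

STATES = ("employed", "unemployed", "nilf")  # nilf = not in the labour force


@dataclass
class Respondent:
    weight: float  # survey weight: number of people this respondent represents
    probs: dict    # hypothetical imputed probability of each labour force state


def expected_totals(respondents):
    """Weighted totals where each respondent contributes weight * probability."""
    totals = {state: 0.0 for state in STATES}
    for r in respondents:
        if abs(sum(r.probs.values()) - 1.0) > 1e-9:
            raise ValueError("imputed probabilities must sum to 1")
        for state in STATES:
            totals[state] += r.weight * r.probs.get(state, 0.0)
    return totals


# Made-up respondents: one with a fractional imputation, one with a certain status.
sample = [
    Respondent(100.0, {"employed": 0.7, "unemployed": 0.1, "nilf": 0.2}),
    Respondent(50.0, {"employed": 0.0, "unemployed": 0.0, "nilf": 1.0}),
]
totals = expected_totals(sample)
```

A respondent whose status is known with certainty is simply the special case where one probability equals 1, so the same function handles both deterministic and stochastic imputation.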
All three approaches produced Australia-level labour force estimates with lower Relative Standard Errors (RSEs) than the corresponding LFS and NATSIHS estimates. Each approach also produced some substantial RSE gains for lower-level estimates, such as at the state by area (major city, regional or remote) level.
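To see why a combined estimate can have a lower RSE than either survey alone, consider the following Python sketch. It uses a textbook inverse-variance weighted average of two independent, unbiased estimates, which is an assumption made here for illustration and not necessarily the estimator used in the project. The function names and all figures are made up.

```python
import math


def rse(estimate, variance):
    """Relative standard error: standard error divided by the estimate."""
    return math.sqrt(variance) / estimate


def composite(est_a, var_a, est_b, var_b):
    """Inverse-variance weighted average of two independent, unbiased estimates."""
    w = var_b / (var_a + var_b)              # weight placed on estimate A
    est = w * est_a + (1 - w) * est_b
    var = var_a * var_b / (var_a + var_b)    # never larger than var_a or var_b
    return est, var


# Made-up estimates of the same population total from two surveys.
lfs_est, lfs_var = 1000.0, 100.0 ** 2
nat_est, nat_var = 1100.0, 100.0 ** 2

comb_est, comb_var = composite(lfs_est, lfs_var, nat_est, nat_var)
lfs_rse = rse(lfs_est, lfs_var)
nat_rse = rse(nat_est, nat_var)
comb_rse = rse(comb_est, comb_var)
```

Under these assumptions the combined variance is always smaller than either input variance, which is the sampling-error gain that motivates combining surveys. The gain holds only if both surveys measure the same quantity.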
A set of diagnostics was developed to assess the quality of the estimates produced by combining the surveys, and these were applied to Approach 1. One diagnostic showed that the employment and unemployment estimates at the state by area level in Approach 1 are more efficient than the corresponding LFS estimates, as long as the true difference between the LFS and NATSIHS estimates is less than 15% and 30% respectively. The conclusion of the work is that it is worthwhile to combine the LFS and NATSIHS. The diagnostics will soon be applied to Approaches 2 and 3 to determine whether they are more robust than Approach 1.
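The idea behind a break-even diagnostic of this kind can be sketched in Python as follows. The sketch assumes a simple model that is not taken from the article: the LFS estimate is unbiased, the second survey differs from it by a fixed true amount `diff`, and the combined estimate is a weighted average with weight `w` on the LFS. The combined estimate is then more efficient than the LFS alone when its mean squared error (variance plus squared bias) is below the LFS variance. The function names and figures are hypothetical, and the 15% and 30% thresholds quoted above come from the authors' own diagnostics, not from this calculation.

```python
import math


def combined_mse(var_lfs, var_other, diff, w):
    """MSE of w * LFS + (1 - w) * other, when 'other' is off by a true difference diff."""
    variance = w ** 2 * var_lfs + (1 - w) ** 2 * var_other
    bias = (1 - w) * diff
    return variance + bias ** 2


def break_even_diff(var_lfs, var_other, w):
    """Largest true difference for which the combined MSE does not exceed the LFS variance."""
    variance = w ** 2 * var_lfs + (1 - w) ** 2 * var_other
    if variance >= var_lfs:
        return 0.0  # combining gives no variance gain, so no bias can be tolerated
    return math.sqrt(var_lfs - variance) / (1 - w)


# Made-up variances; w is the inverse-variance weight on the LFS estimate.
var_lfs, var_other = 80.0 ** 2, 60.0 ** 2
w = var_other / (var_lfs + var_other)

threshold = break_even_diff(var_lfs, var_other, w)
mse_small_diff = combined_mse(var_lfs, var_other, 50.0, w)
mse_large_diff = combined_mse(var_lfs, var_other, 150.0, w)
```

Dividing the threshold by the size of the estimate expresses it as a percentage, which is the form in which the tolerances are reported in the text.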
For more information, please contact James Chipperfield on (02) 6252 7301 or firstname.lastname@example.org, or Julia Chessman on (02) 6252 5098 or email@example.com