Sean Buttsworth, Rachel Barker, and Mark Zhang, Methodology Division, Australian Bureau of Statistics
Methodology Advisory Committee
The ABS Methodology Advisory Committee (MAC) is an expert advisory group of statisticians and data scientists drawn mainly from, but not restricted to, universities across Australia and New Zealand.
The function of the MAC is to provide expert advice to the Chief Methodologist on selected methodological issues relevant to the production of official statistics, including questionnaire development, survey design, data collection, data linkage, estimation, time series analysis, confidentialisation and the use of new and emerging data sources. The MAC is a key mechanism used by the ABS to ensure outputs are based on sound and objective statistical principles.
The Committee generally meets two or three times each year to discuss papers on specific issues. The papers can relate to work at different stages of the methodology solution cycle:
- during initial thinking around a statistical and/or data science method to meet new or emerging needs;
- as investigations proceed and the more complex aspects of a proposed method are developed;
- as methods are finalised and circulated for public comment; or
- as part of ongoing reviews of methods already implemented in ABS work.
In providing advice the Committee considers aspects such as:
- the statistical validity of the approach proposed;
- similar statistical or data science problems, arising in other fields, where work done may provide the basis for improved solutions to the given problem; and
- the implications of the statistical or data science methods proposed for the valid use of the outputs, and in particular the inferences that might be drawn from the resulting data.
Purpose of this paper
We seek input from MAC on the suitability of the proposed solution for estimating labour force statistics at a small area level and quantifying their uncertainty, and welcome any suggestions for improving the quality of these estimates.
ABS context
- In the 2022-23 federal budget the ABS was given funding for enhancing regional labour market statistics to provide a better understanding of regional labour market developments.
- The ABS has undertaken to provide improved SA4-level monthly estimates of key labour force statistics in 2023-24. We aim to have a production system in place by February 2024 so that monthly SA4 modelled estimates can be released in conjunction with the monthly direct estimates at national and state level.
- By the end of June 2024 we are aiming to provide even finer-level (e.g. SA2, SA3, LGA) modelled estimates at a frequency that yields sufficiently reliable estimates. It is also a goal to provide SA4 modelled estimates disaggregated into age by gender demographic groups.
Problem specification
- The Labour Force Survey is designed primarily to provide national estimates, with the secondary design objective of producing state and territory estimates. Direct estimates at Statistical Area Level 4 (SA4) are also provided, notwithstanding that many of these SA4 estimates have high levels of sampling error due to small sample sizes.
- Small area modelling techniques may be employed to produce SA4-level estimates with greater stability. However, application of the cross-sectional cell-level mixed models typically employed by the ABS is unlikely to be either appropriate or optimal given the rotating panel sample design of the Labour Force Survey.
- Hence time series small area models (Rao-Yu models in particular) have been used to properly account for, and take advantage of, the temporal correlations in the labour force data. However, such models are substantially more complicated to apply than simple cross-sectional models.
- This paper describes our application of Rao-Yu models to produce SA4-level monthly estimates of labour force statistics.
- Mean squared error estimators are readily available for the modelled small area estimates of levels (counts). However, movement estimates and estimates of unemployment rates are also desired, and their mean squared errors are not straightforward to obtain because they depend on the correlations between modelled estimates. We expect that a parametric bootstrap may be necessary.
- While it is not the focus of this paper, we also face the challenge of modelling at finer levels of geography and demographics.
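For orientation, the standard area-level Rao-Yu formulation and the reason movement errors depend on correlations can be written as below. This is a sketch in textbook notation, not the paper's own specification: the symbols (y, θ, x, β, v, u, ρ, e) and the normality assumptions are ours, with i indexing SA4s and t indexing months.

```latex
% Sampling model: direct estimate = true value + sampling error.
% Under a rotating panel, e_{it} is correlated across t within area i.
y_{it} = \theta_{it} + e_{it}

% Linking model: regression + random area effect + AR(1) area-by-time effect
\theta_{it} = x_{it}^{\top}\beta + v_i + u_{it}, \qquad
u_{it} = \rho\, u_{i,t-1} + \varepsilon_{it}, \quad |\rho| < 1

v_i \sim N(0, \sigma_v^2), \qquad \varepsilon_{it} \sim N(0, \sigma^2)

% MSE of a modelled movement: the cross term is the correlation-dependent part
\mathrm{MSE}\bigl(\hat\theta_{it} - \hat\theta_{i,t-1}\bigr)
  = \mathrm{MSE}(\hat\theta_{it}) + \mathrm{MSE}(\hat\theta_{i,t-1})
  - 2\,\mathrm{E}\bigl[(\hat\theta_{it} - \theta_{it})
      (\hat\theta_{i,t-1} - \theta_{i,t-1})\bigr]
```

The last line is an algebraic identity: the level MSEs alone do not determine the movement MSE, because the cross-product of prediction errors at adjacent months is also needed. A rate, being a ratio of two modelled quantities, raises the same issue.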
Key findings, issues, and challenges
- The Rao-Yu model evaluated seems, in most cases, conceptually appropriate, and empirically produces plausible estimates with greater accuracy than direct estimation methods.
- Dealing with COVID and other real-world changes presented difficulties, and the model will need to be managed if future shocks occur.
- Estimation of the AR(1) coefficients for the NSW and VIC employment models stopped at the upper bound of 0.98, indicating non-stationarity problems. The dynamic model of Fay and Diallo (2012) may be a better alternative in these cases.
- The random area effect variance estimates and random area by time effect variance estimates were small but statistically significant for unemployment. For employment, both random effect variance estimates were small and not statistically significant.
- Uncertainty estimation for movements and rates will be required.
- For a large number of time points and areas, fitting the Rao-Yu model, as implemented in the sae2 R package, is computationally demanding. Hence we have not explored Australia-level models. The computational demands will also need to be considered when monthly production is implemented in 2024, when we look at bootstrapping for uncertainty estimation, and when we explore the feasibility of modelled estimates for finer-level geographies.
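To illustrate why an AR(1) coefficient estimated at 0.98 is a warning sign, the sketch below builds the implied covariance of the area-level signal (area effect plus area-by-time effect) under a stationary Rao-Yu-type structure. It is not the paper's code: the function names and the parameter values are hypothetical, and it describes the variance of the signal itself, not the mean squared error of a modelled estimate.

```python
import numpy as np


def rao_yu_signal_covariance(n_times, sigma2_v, sigma2_eps, rho):
    """Covariance over time of v_i + u_it for one area.

    Assumes v_i has variance sigma2_v and u_it is a stationary AR(1)
    process with coefficient rho and innovation variance sigma2_eps.
    """
    if not abs(rho) < 1:
        raise ValueError("a stationary AR(1) process requires |rho| < 1")
    t = np.arange(n_times)
    lags = np.abs(np.subtract.outer(t, t))
    # Stationary AR(1) autocovariance: sigma2_eps * rho^|lag| / (1 - rho^2)
    ar1_cov = sigma2_eps * rho ** lags / (1.0 - rho ** 2)
    # The area effect v_i is constant over time, so it adds to every entry
    return sigma2_v + ar1_cov


def movement_variance(cov, t):
    """Variance of the change in the signal between times t-1 and t."""
    return cov[t, t] + cov[t - 1, t - 1] - 2.0 * cov[t, t - 1]


if __name__ == "__main__":
    for rho in (0.5, 0.9, 0.98):
        cov = rao_yu_signal_covariance(3, sigma2_v=0.5, sigma2_eps=1.0, rho=rho)
        print(rho, cov[0, 0], movement_variance(cov, 1))
```

As rho approaches 1, the factor 1 / (1 - rho^2) makes the implied level variance grow without bound while the movement variance stays close to the innovation variance. An estimate held at the boundary therefore suggests the stationary form is straining to describe near-random-walk behaviour, which is the situation the dynamic model is intended for.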
Questions for Methodology Advisory Committee
- Does MAC have any feedback on the proposed methodology for modelling, small area estimation and uncertainty estimation?
- Are there better alternatives for dealing with the time-varying regression coefficient due to COVID impacts?
- Does MAC have any alternative modelling approaches to suggest more generally?
- Are there improvements or further work that should be pursued?
- Any thoughts on the value of a multivariate model and on obtaining finer-level estimates would be appreciated.