TECHNICAL NOTE 5 STATISTICAL IMPACT OF ADL
1 The introduction of Automated Data Linking (ADL) was arguably the most significant change in the 2011 PES. It improved the PES linking and matching methodology, which is fundamental to obtaining a measure of net undercount. For further information on ADL, see Linking and Matching.
2 Based on the outcome of a feasibility study undertaken after the 2006 PES, the ABS expected the new methodology to deliver a significantly better linking and matching outcome than the methodology employed in previous years, and consequently a better, and significantly lower, estimate of net undercount. The main implication was that the discrepancy between 2006-based and 2011-based population estimates would increase as a direct result of this change.
3 The decision to conduct a Statistical Impact Study therefore arose primarily from the need to explain how much of the intercensal discrepancy (i.e. the change between 2006-based and 2011-based population estimates) could be attributed to the change in PES methodology. That is, the study aimed to assess the extent to which the discrepancy was explained by a change in how Census coverage was being measured, as opposed to a change in Census coverage itself.
4 Since the introduction of ADL represented a change to the processing of statistical inputs, rather than the inputs themselves, it was possible to estimate the impact of the new methodology through processing a sample of 2011 records using a close approximation of the methodology that was used in the 2006 PES. The matching outcomes of this process could then be compared with the outcomes achieved through ADL.
5 For the study, a random sample of 2,158 dwellings was selected, stratified by state. The sample contained around 5,700 persons, or approximately 6% of the total 2011 responding PES sample.
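As an illustration only, a stratified random selection of dwellings can be sketched as follows. The function name, the record layout, the per-state sample sizes and the toy dwelling list are all hypothetical; this note does not state how the 2,158 dwellings were allocated across states, so the allocation below is an assumption made purely for the example.

```python
import random
from collections import defaultdict

def stratified_sample(dwellings, n_per_state, seed=0):
    """Draw a simple random sample of dwellings within each state (stratum).

    dwellings   -- list of dicts with a "state" key (hypothetical layout)
    n_per_state -- dict mapping state -> number of dwellings to select
                   (hypothetical allocation, not the one used by the ABS)
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    by_state = defaultdict(list)
    for dwelling in dwellings:
        by_state[dwelling["state"]].append(dwelling)

    sample = []
    for state, n in n_per_state.items():
        # Sampling without replacement, independently within each stratum
        sample.extend(rng.sample(by_state[state], n))
    return sample

# Toy data: 100 made-up dwellings split evenly across two states
toy_dwellings = [{"id": i, "state": "NSW" if i % 2 == 0 else "VIC"}
                 for i in range(100)]
toy_sample = stratified_sample(toy_dwellings, {"NSW": 3, "VIC": 2})
```

Selecting independently within each state guarantees that every state is represented in the sample, which a single unstratified draw would not.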
6 The Statistical Impact Study records were separately processed using a methodology that was a close approximation of that used in the 2006 PES. The matching methodology used in 2006 is outlined in Appendix 2 of the 2006 PES publication: Census of Population and Housing - Undercount, 2006 (cat. no. 2940.0).
ESTIMATING THE IMPACT OF ADL
7 Matching outcomes for the Statistical Impact Study sample were compared with the outcomes from the ADL-enabled process, to determine where the two methodologies differed. In order to effectively estimate the impact, some imputation was required. ADL-enabled matches that were not made using the 2006 processing were classified into six imputation groups, based on:
- whether the match was at the place of enumeration, a respondent-provided search address or another address; and
- the quality of the link.
The Statistical Impact Study sample provided the average number of matches for each ADL-enabled match within these imputation groups, as well as an average number of Statistical Impact Study matches for the small number of persons unmatched in ADL-enabled processing. These averages were then applied to the rest of the PES sample, together with the outcomes of some vague address modelling (which was a feature of 2006 processing made redundant by ADL).
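The averaging step described above can be sketched roughly as follows. This is a minimal illustration, not the ABS implementation: the record layout, the group labels, the field names (`group`, `matches_2006`, `expected_matches_2006`) and the toy values are all assumptions introduced for the example, and the vague address modelling is not represented.

```python
from collections import defaultdict

def group_match_rates(study_records):
    """Average number of 2006-style matches per record, by imputation group.

    Each study record carries a hypothetical "group" label and the number
    of matches ("matches_2006") found under the approximated 2006 method.
    """
    totals = defaultdict(lambda: [0.0, 0])  # group -> [sum of matches, count]
    for rec in study_records:
        acc = totals[rec["group"]]
        acc[0] += rec["matches_2006"]
        acc[1] += 1
    return {group: total / count for group, (total, count) in totals.items()}

def impute_expected_matches(rest_of_sample, rates):
    """Attach the group average to each record outside the study sample."""
    return [dict(rec, expected_matches_2006=rates[rec["group"]])
            for rec in rest_of_sample]

# Toy study sample (made-up values): group = (match location, link quality)
toy_study = [
    {"group": ("enumeration", "high"), "matches_2006": 1},
    {"group": ("enumeration", "high"), "matches_2006": 1},
    {"group": ("search_address", "low"), "matches_2006": 0},
    {"group": ("search_address", "low"), "matches_2006": 1},
]
toy_rates = group_match_rates(toy_study)

# Toy records from the rest of the PES sample (also made up)
toy_rest = [{"person_id": "A", "group": ("search_address", "low")},
            {"person_id": "B", "group": ("enumeration", "high")}]
toy_imputed = impute_expected_matches(toy_rest, toy_rates)
```

In this toy example, person A receives an expected 0.5 matches and person B an expected 1.0. Persons unmatched in ADL-enabled processing could be handled in the same way by treating them as a further group.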
Once the imputation work was complete, the file was then run through the same person weighting and estimation processes used to produce the published 2011 PES estimates. The format of the estimates output was therefore directly comparable to the main estimates produced via the ADL-enabled processing method.
The ADL Statistical Impact Study estimated that the use of ADL in 2011 PES linking and matching resulted in a net undercount that was 246,985 persons lower than the 2006 PES matching methodology would have delivered.
The Statistical Impact Study result has a standard error of 43,000. A common approach to assessing the variability inherent in estimates is to examine the 95% confidence interval (which is approximately two standard errors either side of the estimate). Using this approach, there is a 95% chance that the true value of the statistical impact of ADL on net undercount in 2011 is between 160,985 and 332,985 persons.
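The interval quoted above can be reproduced with a short calculation. The function name is illustrative, and the multiplier of two standard errors follows the approximation described in this note rather than the more exact 1.96.

```python
def confidence_interval(estimate, standard_error, multiplier=2):
    """Return (lower, upper) as estimate -/+ multiplier * standard_error."""
    half_width = multiplier * standard_error
    return estimate - half_width, estimate + half_width

# Figures taken from this note: estimate 246,985 with standard error 43,000
lower, upper = confidence_interval(246_985, 43_000)
print(lower, upper)  # 160985 332985
```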
It is important to remember that the Statistical Impact Study estimate was not designed to provide an alternative measure of net undercount for 2011 in 2006 terms, but only to identify the impact of the ADL methodology. A range of PES and Census changes unrelated to ADL will also affect comparability between 2006 and 2011. For instance, the reduced level of Census imputation in 2011 will directly affect the comparability of net undercount measures with their 2006 equivalents.
The ADL Statistical Impact Study reinforces the value of the ABS continuing to make innovative changes to the PES, especially in developing linking and matching methods that provide a more accurate estimate of the completeness of Census coverage.