CENSUS IMPUTATION FOR NON-RESPONDING DWELLINGS
For some non-responding dwellings a count of the number of males and females is available. This count is obtained by asking a neighbour or by some other method, and is referred to as 'credible source data'. Analysis of credible source data from 2001 showed that these records differed from the general responding Census population. In particular, the average number of persons per household for which credible source data was available was 1.8. Given that there are roughly 140 000 system created dwellings, this difference (1.8 persons versus the 2.4 imputed on average in 2001) accounts for the estimated overcount.
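The size of the overcount can be checked with simple arithmetic. The sketch below uses only the rounded figures quoted in this article; the variable names are illustrative. The result, about 84 000, is close to the reported figure of roughly 83 000, and the small gap is presumably due to rounding of the published averages.

```python
# Rounded figures quoted in the article; variable names are illustrative.
system_created_dwellings = 140_000
mean_imputed_2001 = 2.4      # average persons imputed per dwelling in 2001
mean_credible_source = 1.8   # average persons per dwelling in credible source data

# Extra persons imputed per dwelling, scaled up to all system created dwellings.
overcount = system_created_dwellings * (mean_imputed_2001 - mean_credible_source)
print(f"Approximate overcount: {overcount:,.0f} persons")
```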
The outcome of the review was that a new imputation methodology would be devised for non-responding dwellings, one that incorporated the credible source data as the best available representation of non-respondents. Following on from this analysis, a variety of imputation methods were assessed, including:
A review of the 2001 Census imputation methodology was conducted. In particular, the review focussed on removing the processing-induced overcount of roughly 83 000 persons.
When a Census collector believes a dwelling to be occupied but, for a variety of reasons, no form is obtained for it, the dwelling is considered to be non-responding. Census records are created for these dwellings and for the persons believed to be living in them. These are called 'system created records' (previously they were known as 'dummy' records).
For most non-responding dwellings the number of males and females resident on Census night is imputed. In 2001 (and previously) this was done using the average number of males and females per household for the Collection District (CD) in which the dwelling was located. The average at the Australia level was 2.4 persons (males and females combined).
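The CD-level averaging described above can be sketched as follows. This is a minimal illustration rather than the actual Census processing system: the record layout, the CD codes and the function name `cd_means` are assumptions, and the article does not say how fractional averages were converted to whole persons, so that step is left out.

```python
from collections import defaultdict

def cd_means(responding):
    """Mean males and females per responding household, keyed by CD.

    `responding` is an iterable of (cd, males, females) tuples. This
    layout is hypothetical, chosen only for the illustration.
    """
    totals = defaultdict(lambda: [0, 0, 0])  # males, females, households
    for cd, males, females in responding:
        entry = totals[cd]
        entry[0] += males
        entry[1] += females
        entry[2] += 1
    return {cd: (m / n, f / n) for cd, (m, f, n) in totals.items()}

# Made-up responding households in two CDs.
responding = [
    ("CD001", 1, 1),
    ("CD001", 2, 1),
    ("CD001", 1, 2),
    ("CD002", 1, 0),
    ("CD002", 0, 1),
]
means = cd_means(responding)

# A non-responding dwelling takes the averages of its own CD.
imputed_males, imputed_females = means["CD001"]
print(imputed_males + imputed_females)
```

Because every non-responding dwelling in a CD receives the same average, this approach cannot reproduce a different household size distribution among non-respondents, which is the weakness the review identified.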
The assessment involved reimputing 2001 non-respondents. The results were compared with the actual imputed values produced during 2001 Census processing.
All of the methods reduced the overcount by a desirable amount, and had similar levels of accuracy. The hotdecking method was ultimately selected because it imputes non-respondents with the same distribution of dwelling size as the credible source data. This is desirable because credible source records have a far higher proportion of one-person households than the general population. Using hotdecking, we impute 65 000 more one-person households than were actually imputed for the 2001 Census. This has flow-on effects when we go on to impute other characteristics for system created records, such as age and marital status (i.e. small households are more likely to contain persons aged 20-35 and 55+ and less likely to contain persons aged 0-15).
Population Census is currently working on a system to implement this method in readiness for testing in the 2005 Census Dress Rehearsal.
For further information, please contact Claire Clarke on (02) 6252 5556.
- using credible source data to create an adjustment factor for the CD level mean number of persons per responding household,
- using a model incorporating dwelling and CD characteristics to predict the number of persons per household,
- using the mean number of persons per household from credible source data within broad imputation classes,
- using a hotdecking methodology to select a donor with credible source data that had the same dwelling and geographic characteristics as a non-respondent.
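The hotdecking option in the list above can be illustrated with a short sketch. It is not the production method: the field names (`dwelling_type`, `region`), the choice of matching variables, the random selection of a donor within a class and the handling of classes with no donor are all assumptions made for the example.

```python
import random
from collections import defaultdict

def build_donor_pool(credible_records):
    """Group credible source records into imputation classes.

    A class is keyed by (dwelling_type, region); both fields are
    hypothetical stand-ins for dwelling and geographic characteristics.
    """
    pool = defaultdict(list)
    for rec in credible_records:
        key = (rec["dwelling_type"], rec["region"])
        pool[key].append((rec["males"], rec["females"]))
    return pool

def hotdeck_impute(non_respondents, pool, rng):
    """Copy male and female counts from a randomly chosen donor in the same class."""
    imputed = []
    for rec in non_respondents:
        donors = pool.get((rec["dwelling_type"], rec["region"]))
        if not donors:
            # Assumption: leave the counts unset when no donor exists.
            imputed.append({**rec, "males": None, "females": None})
            continue
        males, females = rng.choice(donors)
        imputed.append({**rec, "males": males, "females": females})
    return imputed

# Made-up credible source records (donors) and non-respondents (recipients).
credible = [
    {"dwelling_type": "flat", "region": "urban", "males": 1, "females": 0},
    {"dwelling_type": "flat", "region": "urban", "males": 0, "females": 1},
    {"dwelling_type": "house", "region": "urban", "males": 1, "females": 1},
]
non_respondents = [
    {"id": 1, "dwelling_type": "flat", "region": "urban"},
    {"id": 2, "dwelling_type": "house", "region": "urban"},
    {"id": 3, "dwelling_type": "house", "region": "rural"},
]

pool = build_donor_pool(credible)
result = hotdeck_impute(non_respondents, pool, random.Random(2001))
for row in result:
    print(row)
```

Because each recipient takes the actual counts of a real donor, the imputed dwellings inherit the household size distribution of the credible source data within each class, rather than a single average value.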