2062.0 - Census Data Enhancement Project: An Update, Oct 2010
Previous ISSUE Released at 11:30 AM (CANBERRA TIME) 15/10/2010
CREATION OF A STATISTICAL LONGITUDINAL CENSUS DATASET (SLCD)
A major component of the CDE project is the creation of a Statistical Longitudinal Census Dataset, or SLCD. The first wave of the SLCD was formed with a 5% random sample of the population from the 2006 Census. The second wave of the SLCD will be created by combining the wave one data in the SLCD 2006 with data from the 2011 Census of Population and Housing.
The 5% SLCD is formed using statistical data linking techniques (see Glossary) rather than matching based on name and address. A paper outlining methods for creating the 5% SLCD, results from similar statistical linking projects, and preliminary results of linking can be found in ABS Research Paper: Exploring Methods for Creating a Longitudinal Census Dataset (ABS cat. no. 1352.0.55.076).
A quality study undertaken as part of the 2006 CDE project assessed the likely quality of the 5% SLCD. The study found that the 5% SLCD would produce results of similar or better quality than a panel survey. For further information, see Assessing the Likely Quality of the Statistical Longitudinal Census Dataset (ABS cat. no. 1351.0.55.026).
Further information about the second wave of the 5% SLCD and plans for the third wave are outlined in Section 3 of this paper.
BRINGING TOGETHER 2006 CENSUS DATA WITH OTHER DATASETS USING NAME AND ADDRESS DURING CENSUS PROCESSING TO UNDERTAKE QUALITY STUDIES
The second component of the 2006 CDE project was the undertaking of several quality studies which involved bringing together data from the 2006 Census with other ABS and non-ABS datasets. The agreement of the custodians of the non-ABS datasets was required for projects using non-ABS datasets. The aim of these studies was to understand and evaluate the quality of ABS statistical operations and outputs, to better inform ABS on the most suitable statistical techniques for bringing together the 5% SLCD and other datasets, and to assess the quality of datasets created using these techniques.
The quality studies were undertaken during the 2006 Census processing period using names and addresses to undertake the linkage, after which all Census forms and names and addresses held by the ABS were destroyed. The linked datasets created for the quality studies did not contain name and address information and were destroyed at the completion of the studies (see previous comments regarding security, confidentiality and accessibility).
A number of quality studies were proposed for Census 2006 as outlined in the information paper Census Data Enhancement Project: An Update (ABS cat. no. 2062.0). The outcomes of studies that did proceed are outlined below.
Feasibility of combining the 5% SLCD with data from future Censuses
This study aimed to test the feasibility of bringing together a 5% sample of one Census with subsequent Censuses using statistical techniques. This was simulated by linking the 2005 Census Dress Rehearsal dataset to the 2006 Census data both with and without names and addresses as matching variables. The linking using name and address acted as a benchmark for assessing the quality of the linking without using name and address. Details about the linking methodologies used, the application of these methodologies in the quality study using the 2005 Census Dress Rehearsal dataset and the outcomes of the study were released in three research papers:
The ABS will be undertaking a series of quality studies using the 2010 Census Dress Rehearsal dataset. For further details, see '1 Bringing together 2011 census data with other datasets during census processing using name and address, for quality studies'.
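The benchmarking idea in this study, where links made with name and address are treated as the reference against which links made without them are judged, can be illustrated with a minimal sketch. The function name, record identifiers and the two summary measures below are assumptions chosen for illustration. They are not taken from the ABS research papers, and the example pairs are made up.

```python
def link_quality(benchmark_links, candidate_links):
    """Compare candidate links against benchmark links.

    Each argument is a collection of (dress_rehearsal_id, census_id) pairs.
    The benchmark is assumed to come from linking with name and address,
    the candidate from linking without them (hypothetical inputs).
    """
    benchmark = set(benchmark_links)
    candidate = set(candidate_links)
    correct = benchmark & candidate  # pairs found by both methods

    # Share of candidate links that agree with the benchmark.
    precision = len(correct) / len(candidate) if candidate else 0.0
    # Share of benchmark links that the candidate method recovered.
    match_rate = len(correct) / len(benchmark) if benchmark else 0.0

    return {
        "correct": len(correct),
        "precision": precision,
        "match_rate": match_rate,
    }


# Illustrative, made-up link pairs.
benchmark = {("d1", "c1"), ("d2", "c2"), ("d3", "c3"), ("d4", "c4")}
candidate = {("d1", "c1"), ("d2", "c2"), ("d3", "c9")}
result = link_quality(benchmark, candidate)
print(result)
```

In this toy example the candidate method recovers two of the four benchmark links and makes one link that the benchmark does not support.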
This study used name and address, during the Census processing period, to bring together data from the 2006 Census with data from death registrations for the period August 2006 to 30 June 2007. The study assessed the undercoverage of Indigenous deaths in death registration records; identified factors that may be contributing to that undercoverage; and assessed the feasibility of calculating and applying adjustment factors to improve estimates of Indigenous mortality.
Findings from the quality study using death registration data can be found in the information paper, Census Data Enhancement - Indigenous Mortality Quality Study, 2006-07 (ABS cat. no. 4723.0).
Adjustment factors obtained in the quality study were subsequently used to develop and introduce a new method for deriving the adjusted Indigenous deaths used to produce life tables and life expectancy estimates for Aboriginal and Torres Strait Islander Australians. The availability of information from the quality study considerably improved the quality and robustness of the estimates of Indigenous life expectancy. These estimates had previously relied on a range of assumptions about the level of under-identification of Indigenous deaths in each jurisdiction. The linked Census and death registration data pointed to significant deficiencies in these assumptions, which had resulted in significant underestimation of Indigenous life expectancy in some states/territories. For further information, see Discussion Paper: Assessment of Methods for Developing Life Tables for Aboriginal and Torres Strait Islander Australians, 2006 (ABS cat. no. 3302.0.55.002).
The ABS will be undertaking an Indigenous Mortality project as part of the 2011 CDE project. For further details, see '2.1 Indigenous Mortality Project'.
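The general idea of an adjustment factor for under-identification can be shown with a small sketch. It assumes, purely for illustration, that the factor is the ratio of linked deaths identified as Indigenous in the Census to linked deaths identified as Indigenous in the death registration. The actual ABS method is described in the papers cited above and may differ. The field names and all figures are hypothetical.

```python
def adjustment_factor(linked_records):
    """Ratio of Census-identified to registration-identified Indigenous deaths.

    Each record is a dict with two boolean fields (hypothetical names):
    'census_indigenous' and 'registration_indigenous'.
    """
    census_count = sum(1 for r in linked_records if r["census_indigenous"])
    registration_count = sum(
        1 for r in linked_records if r["registration_indigenous"]
    )
    if registration_count == 0:
        raise ValueError("no registration-identified Indigenous deaths")
    return census_count / registration_count


# Made-up linked records: five identified as Indigenous in the Census,
# of which four are also identified as Indigenous in the registration.
linked = (
    [{"census_indigenous": True, "registration_indigenous": True}] * 4
    + [{"census_indigenous": True, "registration_indigenous": False}]
    + [{"census_indigenous": False, "registration_indigenous": False}]
)

factor = adjustment_factor(linked)   # 5 / 4
registered_deaths = 200              # illustrative figure only
adjusted_deaths = registered_deaths * factor
print(factor, adjusted_deaths)
```

A factor above 1 in this sketch indicates that registrations identify fewer Indigenous deaths than the linked Census records do, so registered counts are scaled up.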
Assessing the feasibility of bringing together data from the Department of Immigration and Citizenship's Settlement Database and the 2006 Census
The Migrants Quality Study was conducted to assess the feasibility of linking the Department of Immigration and Citizenship's Settlement Database (SDB) to the 5% Statistical Longitudinal Census Dataset (SLCD) without the use of name and address as linking variables. Findings of the quality study Assessing the Quality of Linking Migrant Settlement Records to Census Data (ABS cat. no. 1351.0.55.027) were released in August 2009. The results from the quality study indicated that linking the SDB to the 5% SLCD is feasible and can produce useful information that no other data source currently provides. However, some quality issues were identified and further work was proposed to ensure that the linked data are correctly interpreted and appropriately used. For further details, see '1 Bringing together 2011 census data with other datasets during census processing using name and address, for quality studies'.
Assessing Automatic Data Linking for the Census Post Enumeration Survey
The Census Post Enumeration Survey (PES) is conducted a few weeks after the Census to estimate the number of people who were missed in the Census or who were counted more than once. This is done by matching Census and PES responses. A quality study undertaken after the 2006 PES assessed the feasibility of introducing automated linking processes to improve the efficiency and effectiveness of the PES, in line with previous changes to the survey, such as the introduction of Computer Assisted Interviewing in 2006. The study found that automated data linking can provide important quality and efficiency gains to replace in part, though not entirely, the clerical matching process. A key advantage of automated data linking is that it provides a greater capability for locating respondents at undisclosed or poorly reported Census night addresses.
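One common way to combine automated linking with clerical matching is to score each candidate pair and send only the uncertain pairs to clerical review. The sketch below illustrates that general pattern and is not the ABS procedure. The field names, agreement weights and the two thresholds are all hypothetical values chosen for the example.

```python
def agreement_score(census_rec, pes_rec, weights):
    """Sum the weights of fields on which the two records agree.

    Fields missing from the Census record contribute nothing.
    """
    return sum(
        weight
        for field, weight in weights.items()
        if census_rec.get(field) is not None
        and census_rec.get(field) == pes_rec.get(field)
    )


def classify(score, upper, lower):
    """Link automatically, refer to clerical review, or reject the pair."""
    if score >= upper:
        return "auto_link"
    if score >= lower:
        return "clerical_review"
    return "non_link"


# Hypothetical weights and thresholds.
WEIGHTS = {"surname": 4, "given_name": 3, "date_of_birth": 5, "sex": 1}
UPPER, LOWER = 11, 6

census = {"surname": "Nguyen", "given_name": "Anh",
          "date_of_birth": "1980-02-14", "sex": "F"}
pes_exact = dict(census)
pes_partial = {"surname": "Nguyen", "given_name": "Ann",
               "date_of_birth": "1980-02-14", "sex": "F"}
pes_other = {"surname": "Smith", "given_name": "Jo",
             "date_of_birth": "1975-07-01", "sex": "F"}

decisions = [
    classify(agreement_score(census, pes, WEIGHTS), UPPER, LOWER)
    for pes in (pes_exact, pes_partial, pes_other)
]
print(decisions)
```

Under this design, clerical effort is spent only on pairs scoring between the two thresholds, which is consistent with automated linking replacing the clerical process in part rather than entirely.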
The third component of the 2006 CDE project was the bringing together of the 5% SLCD with specified non-ABS datasets using statistical techniques. The agreement of the custodians of the non-ABS datasets was required for these projects, and projects could only be for statistical and research purposes.
Three non-ABS datasets were identified for this component of the CDE Project. These were birth and death register data, including cause of death data; migrants data from the Department of Immigration and Citizenship's Settlement Database (SDB); and national disease registers.
A statistical study based on bringing together data from the Department of Immigration and Citizenship's Settlement Database with data from the 2006 Census was undertaken. Bringing together migrant information with Census information had the potential to provide insights into patterns of settlement of different groups of migrants, including family formation, housing, labour force characteristics, changing occupations, educational pathways and region of settlement. The study started after the completion of the quality study (discussed above) and results have been released in Perspectives on Migrants, June 2010 (ABS cat. no. 3416.0).
PRIVACY AND CONFIDENTIALITY
A fundamental aspect of the CDE project is the management of privacy and confidentiality. The ABS applied strict protocols to the 2006 CDE project to ensure that the privacy of individuals and the confidentiality of their data were protected throughout. These protocols included:
Audits undertaken by Oakton in 2008 and 2010 found that: