Australian Bureau of Statistics
2062.0 - Census Data Enhancement Project: An Update, Oct 2010
Latest ISSUE Released at 11:30 AM (CANBERRA TIME) 15/10/2010
ABS Australian Bureau of Statistics
ABSDL Australian Bureau of Statistics Data Laboratory
ALLD Australian Longitudinal Learning Database
CDE Census Data Enhancement
COAG Council of Australian Governments
CURF Confidentialised Unit Record Files
DIAC Department of Immigration and Citizenship
NAA National Archives of Australia
PES (Census) Post Enumeration Survey
SLCD Statistical Longitudinal Census Dataset
WA Western Australia
ABS Data Laboratory
The ABS Data Laboratory (ABSDL) is the data analysis solution for high-end data users who want to extract full value from ABS microdata. The ABSDL provides an interactive environment, enabling the analysis of Basic, Expanded or Specialist (customised) Confidentialised Unit Record Files (CURFs).
Australian Longitudinal Learning Database
The Australian Longitudinal Learning Database (ALLD) has been proposed as a national longitudinal statistical database on the education pathways and outcomes of Australian students from early childhood education through to the end of their schooling or education. The ALLD would be constructed from administrative records, and, with community support, would include data drawn from Census and survey records.
Birth register data
The responsibility for registration of births in Australia lies with the individual State and Territory Registrars of Births, Deaths and Marriages. A Birth Registration Statement is completed by at least one of the parents of a baby. This information is the basis of the data provided to the ABS for processing and production of birth statistics.
Census processing period
The period of time immediately after the conduct of the Census of Population and Housing during which the Census forms are processed to produce statistical outputs.
Census Time Capsule
In Australian Censuses prior to 2001, forms and other name-identified records were destroyed once the statistical data required for the purposes of the Census had been extracted.
Following recommendations from the House of Representatives Standing Committee, the Government decided that for the 2001 Census all people would be given the option of having their name-identified responses retained for 99 years (Census Time Capsule). After 99 years, the name-identified data will be made public for future generations. This option was again included in the 2006 Census and will be a permanent feature of future Censuses.
Some 53% of the population chose to have their individual responses from the 2001 Census retained, and 56% from the 2006 Census. These responses are now held by the National Archives of Australia. To maintain the current high levels of public confidence and cooperation in the Census, and to respect the wishes of those who do not want their information retained for future release, information will only be kept for those persons who explicitly give their consent. For privacy reasons, the name-identified information will not be available for any purpose, including to a court or tribunal, within the 99-year closed access period.
After this information has been transferred to the National Archives of Australia and statistical processing is completed, the ABS will destroy all paper and eCensus forms including the computer images of those forms. As in the past, the paper forms will be pulped for recycling.
Confidentialised Unit Record File (CURF)
A CURF is a file of responses to an ABS statistical collection that has had specific identifying information about a person or organisation confidentialised.
The most basic of the techniques employed by the ABS involves ensuring that all identifying information, such as names and addresses, is removed from the files.
Additionally, the data items that are most likely to enable identification of unit records are only released in broad categories. For example, while survey questionnaires may capture your home or business address, microdata may only be released at the State or Territory level.
More advanced confidentialisation occurs through checking the CURFs for records with uncommon combinations of responses. These records may be altered slightly to ensure individual respondents cannot be identified.
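The check for uncommon combinations of responses can be illustrated with a short sketch. This is not the ABS procedure, which is not described here; the function name `flag_rare_records`, the field names and the threshold `min_count` are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def flag_rare_records(records, keys, min_count=3):
    """Return the indices of records whose combination of values for
    `keys` occurs fewer than `min_count` times in the file.

    `min_count` is an illustrative threshold, not an ABS rule.
    """
    combos = Counter(tuple(r[k] for k in keys) for r in records)
    return [
        i
        for i, r in enumerate(records)
        if combos[tuple(r[k] for k in keys)] < min_count
    ]


# Hypothetical unit records, already released in broad categories.
records = [
    {"state": "WA", "age_group": "30-39", "occupation": "Teacher"},
    {"state": "WA", "age_group": "30-39", "occupation": "Teacher"},
    {"state": "WA", "age_group": "30-39", "occupation": "Teacher"},
    {"state": "WA", "age_group": "80-89", "occupation": "Pilot"},
]

rare = flag_rare_records(records, ["state", "age_group", "occupation"])
print(rare)  # indices of records that would need further treatment
```

Records flagged in this way would then be candidates for the slight alteration described above.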
Dataset
A file containing the individual responses from a statistical collection, administrative records or a register of information (for example, a disease register). Datasets are used to generate statistical output.
Death register data
Registration of deaths is the responsibility of the individual State and Territory Registrars of Births, Deaths and Marriages and is based on the data provided on an information form. This information form is the basis of the data provided to the ABS for processing and production of death statistics.
Identifiable
In this publication, unit record data is considered identifiable if the data available in the record identifies the specific individual to whom it refers.
Longitudinal dataset
A dataset which contains information for the same unit at a number of different points in time.
Long-term migration data
Statistical data held by the Department of Immigration and Citizenship (DIAC) from the administration of immigration programs. This includes overseas arrivals and departures data, where the duration is over 12 months, and visa grant data, including type of visa.
Non-identifying grouped numeric code
From 2011 a non-identifying grouped numeric code will be included on the 5% SLCD records to improve the accuracy of the linked dataset and the efficiency of the linking process. The code will be based on name and created using a secure one-way process. Each group code will represent about 2000 people.
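One way such a code could be derived is sketched below. The actual ABS process is not specified in this publication, so everything here is an assumption made for illustration: the use of a keyed hash (HMAC-SHA256) as the one-way step, the name normalisation, the function name `group_code`, the key, and the population figure used to size the groups.

```python
import hashlib
import hmac
import unicodedata


def group_code(name: str, secret_key: bytes, n_groups: int) -> int:
    """Map a name to one of `n_groups` numeric codes using a keyed,
    one-way hash. Many different names share each code, so the code
    alone does not identify a person.
    """
    # Normalise so that spacing and letter case do not change the code.
    normalised = unicodedata.normalize("NFKD", name).casefold()
    normalised = " ".join(normalised.split())
    digest = hmac.new(
        secret_key, normalised.encode("utf-8"), hashlib.sha256
    ).digest()
    # Reduce the first 8 bytes of the digest to a group number.
    return int.from_bytes(digest[:8], "big") % n_groups


# Illustrative sizing only: a hypothetical population of 20 million
# with about 2000 people per code gives 10,000 groups.
N_GROUPS = 20_000_000 // 2_000
KEY = b"hypothetical-secret-key"

print(group_code("Jane Citizen", KEY, N_GROUPS))
```

Because the hash is keyed and then reduced to a small number of groups, the code cannot be reversed to recover a name, yet two records for the same name always receive the same code, which is what makes it useful as a linking aid.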
Statistical Integration Projects
The bringing together of unit record data from different administrative and/or survey sources to provide new datasets for statistical and research purposes. These new datasets address significant research questions, produce new statistical outputs or enable understanding and evaluation of the quality of statistical operations, techniques and/or outputs.
Statistical purposes
Functions related to the compilation, analysis and dissemination of statistics. Statistical purposes preclude use of a dataset for administrative or client management purposes, where there is an impact on specified individuals.
Statistical techniques
In this publication, statistical techniques refers to the methods that would be used to bring together different administrative and/or survey sources. The proposed method is often referred to as probabilistic record linkage, which involves bringing together data from two different datasets using a number of characteristics such as name, address, age/date of birth, sex, geographic region, and country of birth. All possible linkages based on these data items, or a subset of them, are evaluated. The records for which the linkage is most likely to be correct are brought together.
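The idea of evaluating all possible linkages and keeping the most likely ones can be sketched with a simple agreement-weight score in the style commonly used for probabilistic record linkage. This is a minimal illustration and not the ABS implementation: the field list, the probabilities in `FIELDS`, the threshold and the function names are all hypothetical.

```python
import math

# Hypothetical probabilities for each field:
#   m = chance the field agrees when two records are the same person
#   u = chance the field agrees when they are different people
FIELDS = {
    "name": (0.95, 0.01),
    "date_of_birth": (0.97, 0.003),
    "sex": (0.99, 0.5),
    "region": (0.90, 0.05),
}


def match_weight(a, b):
    """Sum log-likelihood-ratio weights over all fields: agreement
    adds log2(m/u), disagreement adds log2((1-m)/(1-u))."""
    total = 0.0
    for field, (m, u) in FIELDS.items():
        if a[field] == b[field]:
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


def link(file_a, file_b, threshold=0.0):
    """For each record in file_a, score every record in file_b and
    keep the best-scoring pair if its weight exceeds `threshold`.
    Returns a list of (index_a, index_b, weight) tuples."""
    if not file_b:
        return []
    links = []
    for i, a in enumerate(file_a):
        best_weight, best_j = max(
            (match_weight(a, b), j) for j, b in enumerate(file_b)
        )
        if best_weight > threshold:
            links.append((i, best_j, best_weight))
    return links


file_a = [
    {"name": "jane citizen", "date_of_birth": "1980-01-02", "sex": "F", "region": "WA"},
    {"name": "john smith", "date_of_birth": "1975-05-06", "sex": "M", "region": "NSW"},
]
file_b = [
    {"name": "john smith", "date_of_birth": "1975-05-06", "sex": "M", "region": "VIC"},
    {"name": "jane citizen", "date_of_birth": "1980-01-02", "sex": "F", "region": "WA"},
    {"name": "alex doe", "date_of_birth": "1990-09-09", "sex": "F", "region": "WA"},
]

for i, j, weight in link(file_a, file_b):
    print(i, j, round(weight, 2))
```

In this example the second record in `file_a` still links to its counterpart even though the region differs, because agreement on the other fields outweighs the single disagreement. The sketch does not enforce one-to-one links or handle misspelt names; a production system would need both.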
WA Enhanced Mortality dataset
The WA Enhanced Mortality dataset involves linking WA death registration data with a range of health datasets available in the WA Data Linkage System. Indigenous status will be derived for this linked dataset using a number of possible business rules.
This page last updated 15 October 2010