2080.5 - Information Paper: Australian Census Longitudinal Dataset, Methodology and Quality Assessment, 2006-2011  
ARCHIVED ISSUE Released at 11:30 AM (CANBERRA TIME) 18/12/2013   

3.1.2 CONSISTENCY OF COMMON INFORMATION ON RECORD PAIRS

In data linkage projects, geographic boundaries function as blocking variables that restrict the search for record pairs. They are also used as linking variables, and when combined with other linking fields such as age, sex and date of birth, provide a high level of uniqueness, and reduce the likelihood of linking to an incorrect record.
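As a rough illustration of blocking, the sketch below compares records only when they share the same value of a blocking field. The record layout, the field name `mesh_block`, and the function `candidate_pairs` are hypothetical and are not taken from the ACLD linkage system.

```python
from collections import defaultdict


def candidate_pairs(records_a, records_b, block_key):
    """Yield (a, b) pairs whose records share the same blocking value."""
    # Index the second file by blocking value so each record in the first
    # file is only compared with records in the same block.
    blocks = defaultdict(list)
    for rec in records_b:
        blocks[rec[block_key]].append(rec)
    for rec_a in records_a:
        for rec_b in blocks.get(rec_a[block_key], []):
            yield rec_a, rec_b


# Hypothetical toy records; real Census records carry many more fields.
census_2006 = [
    {"id": "a1", "mesh_block": "MB001", "sex": "F"},
    {"id": "a2", "mesh_block": "MB002", "sex": "M"},
]
census_2011 = [
    {"id": "b1", "mesh_block": "MB001", "sex": "F"},
    {"id": "b2", "mesh_block": "MB001", "sex": "M"},
    {"id": "b3", "mesh_block": "MB003", "sex": "M"},
]

pairs = [
    (a["id"], b["id"])
    for a, b in candidate_pairs(census_2006, census_2011, "mesh_block")
]
print(pairs)  # [('a1', 'b1'), ('a1', 'b2')]
```

Without blocking, the toy files above would produce six comparisons; blocking on the hypothetical `mesh_block` field reduces this to two.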

Table 4 shows the number of linked records with consistent information on key linking fields, grouped by the level of geography at which the record pair agreed.

TABLE 4 - CONSISTENCY OF LINKED RECORDS, By geography and selected linking fields

Consistency of key linkage fields(a)(b)                       no.        %

Mesh Block combined with
  Age exact, Sex, DOB Day and Month agree                 552 714     69.0
  Age exact, Sex agree                                     41 135      5.1
  Age +/- 2 years, Sex agree                                7 798      1.0
SA2 combined with
  Age +/- 2 years, Sex, DOB Day and Month agree            84 265     10.5
  Age +/- 2 years, Sex agree                               26 739      3.3
SA4 combined with
  Age +/- 2 years, Sex, DOB Day and Month agree            66 623      8.3
Total records included                                    779 274     97.3
Total records linked                                      800 759    100.0

(a) Only includes records that agree on all key linking fields.
(b) Categories are mutually exclusive. Records that agree in each category are excluded from subsequent categories.
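Because the categories are mutually exclusive, the six category counts in Table 4 should sum to the "Total records included" figure, and each percentage should be the count divided by the total records linked. The short check below confirms this, reading the Mesh Block "Age +/- 2 years, Sex agree" count as 7 798. The variable names are illustrative only.

```python
# Category counts from Table 4, in table order.
counts = {
    "Mesh Block: Age exact, Sex, DOB Day and Month agree": 552_714,
    "Mesh Block: Age exact, Sex agree": 41_135,
    "Mesh Block: Age +/- 2 years, Sex agree": 7_798,
    "SA2: Age +/- 2 years, Sex, DOB Day and Month agree": 84_265,
    "SA2: Age +/- 2 years, Sex agree": 26_739,
    "SA4: Age +/- 2 years, Sex, DOB Day and Month agree": 66_623,
}
total_included = 779_274
total_linked = 800_759

# Mutually exclusive categories must add up to the included total.
category_sum = sum(counts.values())

# Each percentage is expressed relative to all linked records.
shares = {
    label: round(100 * n / total_linked, 1) for label, n in counts.items()
}
included_share = round(100 * total_included / total_linked, 1)

print(category_sum == total_included)  # True
print(included_share)                  # 97.3
```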




Just over 97% of all records that were matched in the ACLD linkage process agreed on small to medium levels of geographic area combined with other key linking fields, such as age, sex and date of birth. While the number of consistent fields can give a strong indication of likely linkage quality, other factors should be taken into account, for example, the expected number of people in a geographic area that are likely to share a characteristic by chance. A tolerance of plus or minus two years was used at certain parts of the linkage process to cater for persons who may have understated their age in 2006 and overstated it in 2011 or vice versa.
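The way a record pair might be assigned to the first Table 4 category it satisfies can be sketched as follows. This is an illustration under stated assumptions, not the ABS procedure: it assumes that age agreement is assessed after adding five years to the 2006 age, and the field names (`mesh_block`, `sa2`, `sa4`, `dob`) and the functions `age_agrees` and `classify` are hypothetical.

```python
CENSUS_GAP_YEARS = 5  # assumption: 2006 ages are aged forward five years


def age_agrees(age_2006, age_2011, tolerance=0):
    """True when the aged-forward 2006 age is within `tolerance` years."""
    return abs((age_2006 + CENSUS_GAP_YEARS) - age_2011) <= tolerance


# (geography field, age tolerance in years, day/month of birth required),
# listed in the same order as the categories in Table 4.
TIERS = [
    ("mesh_block", 0, True),
    ("mesh_block", 0, False),
    ("mesh_block", 2, False),
    ("sa2", 2, True),
    ("sa2", 2, False),
    ("sa4", 2, True),
]


def classify(rec_2006, rec_2011):
    """Return the index of the first tier the pair satisfies, else None."""
    if rec_2006["sex"] != rec_2011["sex"]:
        return None  # every tier requires sex to agree
    dob_ok = rec_2006["dob"] == rec_2011["dob"]
    for index, (geo, tolerance, need_dob) in enumerate(TIERS):
        if rec_2006[geo] != rec_2011[geo]:
            continue
        if not age_agrees(rec_2006["age"], rec_2011["age"], tolerance):
            continue
        if need_dob and not dob_ok:
            continue
        return index  # earlier tiers take precedence, so tiers never overlap
    return None


# Hypothetical pair: same Mesh Block and sex, age off by one year.
person_2006 = {"mesh_block": "MB1", "sa2": "S1", "sa4": "R1",
               "sex": "F", "age": 30, "dob": (14, 3)}
person_2011 = dict(person_2006, age=36)
print(classify(person_2006, person_2011))  # 2
```

Returning at the first satisfied tier is what keeps the categories mutually exclusive in this sketch.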

By contrast, record pairs may have inconsistent information and yet be a true link. Inconsistent information may be recorded for the same person in different Censuses due to a range of factors, including:
  • Transcription errors in the Census, where the wrong category is selected or the information is transposed, such as the day of birth being entered in the month field instead of the day field.
  • Data capture errors, where the Census form is scanned using Optical Character Recognition software and certain characters may be mis-classified, such as a 1 captured as a 7 or a 3 as an 8.
  • Reporting errors, where information is given for the wrong member of the household (e.g. person 1's information is reported for person 3) or where the person completing the Census estimates information that they do not know (e.g. about a fellow group household member).
  • Information that was not stated by the respondent and has been imputed as part of Census processing (such as age or sex).
  • A different person filling out the Census form at each time point and interpreting the questions differently.
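Two of the error patterns listed above are mechanical enough to test for directly. The sketch below flags a day and month of birth that appear transposed, and values that differ by a single character in one of the confusable pairs mentioned (1 and 7, 3 and 8). The function names are hypothetical, and nothing here is drawn from the ABS processing system.

```python
def is_day_month_swap(dob_a, dob_b):
    """True when two (day, month) tuples differ only by a day/month swap."""
    day_b, month_b = dob_b
    return dob_a != dob_b and dob_a == (month_b, day_b)


# Character pairs cited in the text as liable to OCR mis-classification.
OCR_CONFUSIONS = {frozenset("17"), frozenset("38")}


def differs_by_ocr_confusion(value_a, value_b):
    """True when two equal-length strings differ in exactly one position
    and that difference is one of the confusable character pairs."""
    if len(value_a) != len(value_b):
        return False
    diffs = [(x, y) for x, y in zip(value_a, value_b) if x != y]
    return len(diffs) == 1 and frozenset(diffs[0]) in OCR_CONFUSIONS


print(is_day_month_swap((3, 11), (11, 3)))       # True
print(differs_by_ocr_confusion("1971", "1977"))  # True
print(differs_by_ocr_confusion("13", "14"))      # False
```

Checks like these can only suggest that an inconsistent pair may still be a true link; they do not confirm it.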




This section contains the following subsection:
          3.1.2.1 Consistent reporting of Indigenous status
