2901.0 - Census Dictionary, 2006 (Reissue)  
Previous issue. Released at 11:30 AM (Canberra time) 28/06/2007. Reissue.
Age (AGEP) - Characteristics 2006


Age has been collected in all Australian Censuses. Age data, combined with sex data, are essential for the production of accurate population estimates based on the Census count.

The 2006 Census form gives respondents the option of writing in their age and/or their date of birth. During processing, age is calculated from date of birth where provided; otherwise the stated age is used. Only age in years is output. If neither age nor date of birth is provided, age is imputed using other information on the form and an age distribution of the population. The variable Imputation Flag for Age (IFAGEP) indicates whether a person's age has been imputed for the Census.
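The order of precedence described above (date of birth first, then stated age, then imputation) can be sketched as follows. This is only an illustration and not the ABS processing code: the function name, the returned flag and the reference date of 8 August 2006 for Census Night are assumptions that do not appear on this page.

```python
from datetime import date
from typing import Optional, Tuple

# Assumed reference date for Census Night 2006 (not stated on this page).
CENSUS_NIGHT = date(2006, 8, 8)


def derive_age(
    dob: Optional[date],
    stated_age: Optional[int],
    reference: date = CENSUS_NIGHT,
) -> Tuple[Optional[int], bool]:
    """Return (age_in_years, needs_imputation).

    Date of birth takes precedence over stated age. When neither is
    available the age is left as None and flagged for imputation,
    loosely mirroring the role of IFAGEP.
    """
    if dob is not None:
        years = reference.year - dob.year
        # Subtract one year if the birthday has not yet occurred
        # in the reference year.
        if (reference.month, reference.day) < (dob.month, dob.day):
            years -= 1
        return years, False
    if stated_age is not None:
        return stated_age, False
    return None, True
```

In this sketch a record with both a date of birth and a stated age always takes its age from the date of birth, whatever the stated age says.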

Image of Question: 2006 Household Form - Question 4


For the 2001 Census, age was available for 0 to 99 years singly and then 100 years and over. For 2006, age is available for 0 to 115 years singly.

Applicable to: All persons

000 - 115 (0 to 115 years of age singly)
Or data may be output by age group, for example by 5 year age groups:
0-4 years
5-9 years
and so on.

Total number of categories:
by single year 116
by 5 year age group 18
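
The category totals above can be checked with a short sketch. The page does not say where the 5 year groups stop; an open-ended top group of "85 years and over" is assumed here because it is what gives 18 groups. The function name and label wording are illustrative only.

```python
def five_year_group(age: int, top: int = 85) -> str:
    """Map a single year of age (0 to 115) to a 5 year age group label.

    `top` is the assumed start of the open-ended final group.
    """
    if not 0 <= age <= 115:
        raise ValueError("age must be between 0 and 115")
    if age >= top:
        return f"{top} years and over"
    low = age - age % 5
    return f"{low}-{low + 4} years"


single_years = list(range(0, 116))                        # 0 to 115 singly
age_groups = {five_year_group(a) for a in single_years}   # distinct group labels

print(len(single_years), len(age_groups))
```

Under that assumption the script prints 116 single year categories and 18 age groups, matching the totals listed.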

More Detailed Description
Quality Statement - Age (AGEP)

In 2006, the question on Age included the option to report either Date of Birth (DOB) or Age last birthday. The check box for selecting '100 years or more' that appeared in 2001 was removed, allowing people to record actual ages in this age range.

The majority of respondents provided DOB information only (52.9%), while 36.6% reported both DOB and Age last birthday and 5.7% reported Age last birthday only. The remainder (4.8%) did not state either. Where both sets of information were provided, DOB information was used to derive an age in years (AGEP).

Where age could not be derived or was not stated (or was set to not stated during processing, as discussed below), it was imputed using other information on the form and an age distribution of the population. The imputation rate in 2006 for Age (AGEP) was 5.0%, compared with 3.6% for 2001. Nearly all of this imputation is attributable to the 4.2% of persons in dwellings which were occupied on Census Night but did not return a completed form. Persons are imputed into these dwellings together with some demographic characteristics, including AGEP. In 2001, 2.2% of persons were imputed into dwellings for which no form was received.

There were a small number of cases where age was set to 'not stated' because of inconsistencies between age and relationship data. This occurred most often because the Census concept of a parent and child relationship requires a 15 year age gap where such a relationship exists (and a 30 year age gap where a grandparent/grandchild relationship exists). Where this condition is not met, the age of the parent or grandparent is set to not stated and then imputed. These types of adjustments occurred for 0.2% of all persons.
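
The age gap rule just described can be sketched as a simple consistency edit. The names below are hypothetical, and the rule is read as "a gap of at least 15 (or 30) years", which is an interpretation of the wording above rather than a documented specification.

```python
from typing import Optional

# Minimum age gaps in years, taken from the text above.
MIN_GAP = {"parent": 15, "grandparent": 30}


def age_gap_consistent(elder_age: int, child_age: int, relationship: str) -> bool:
    """True if the elder is at least the required number of years older."""
    return elder_age - child_age >= MIN_GAP[relationship]


def edit_elder_age(elder_age: int, child_age: int, relationship: str) -> Optional[int]:
    """Return the elder's age, or None ('not stated') when the gap is too small.

    A None result stands for a record whose age would then be imputed.
    """
    if age_gap_consistent(elder_age, child_age, relationship):
        return elder_age
    return None
```

Only the age of the parent or grandparent is reset in this sketch, as in the text; the child's age is left untouched.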

There are two main sources of error in age data: respondent error and processing error.

Respondent error

Users of the data need to bear in mind that almost all census data are as originally reported by the respondents. Respondents occasionally provided the date that they filled out the form, or the date of their last birthday, as their date of birth. Where such records could be positively identified using other information on the form, their ages were set to not stated and then imputed. Other respondent errors, such as crossing out incorrect digits, transposing numbers (particularly by eCensus users), and 'sticky key' repetition errors (for eCensus users), are more difficult to detect, and such errors are likely to remain in final output.

Processing error

Age data were mostly captured from handwritten numeric responses, so there is some risk of character recognition error. During processing, the vast majority of individual characters handwritten on paper forms met preset recognition confidence levels and were accepted without further examination. However, there are low-level patterns of regular numeric substitution in the final data (for example between 8 and 0; 1 and 7; 4 and 9) that suggest that the automated preset recognition confidence tests may not have been sufficiently rigorous for some poor handwriting, affecting a small proportion of AGEP data.

Characters that failed recognition confidence levels were sent to a team of coders for further determination. Coders selected the most likely digit the respondent was trying to convey, based on visual inspection of an image of the response. If no determination could be made regarding individual digits within Age last birthday, the entire content of the field was deleted so that misleading information was not passed on to later systems. For DOB, where the Year of Birth was unrecognisable and could not be ascertained from an associated Age last birthday response, that field was deleted. Age for these records was imputed at a later stage of processing.

Sample checks were made throughout the data capture processing schedule, to ensure an acceptable level of processing quality was maintained.

Data confrontation

One way of measuring the accuracy of age data is to compare reported age and derived age (calculated from DOB data) for the 36.6% of respondents who supplied both sets of information. Where both DOB and Age last birthday were provided, the two values for age were consistent in 91.7% of cases, giving high confidence that the age (AGEP) for these records was correct. For 6.2% of persons there was only a one year difference between the data items. For the remaining 2.1%, however, where the difference was two or more years, respondent error (for either variable, or both) or character recognition problems during processing were the most likely causes. In all cases, the assumption was made that DOB was correct. It is equally probable (but unverifiable) that a similar degree of error exists in AGEP for those records where just Date of Birth, or just Age last birthday, was supplied by the respondent.
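
The comparison described above amounts to banding the absolute difference between reported and derived age. The following is a minimal sketch with hypothetical function names; the sample pairs in the usage note are invented for illustration and are not ABS data.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def agreement_band(reported_age: int, derived_age: int) -> str:
    """Classify one record by the gap between reported and derived age."""
    diff = abs(reported_age - derived_age)
    if diff == 0:
        return "consistent"
    if diff == 1:
        return "one year difference"
    return "two or more years difference"


def band_shares(pairs: Iterable[Tuple[int, int]]) -> Dict[str, float]:
    """Percentage of records in each band, rounded to one decimal place."""
    counts = Counter(agreement_band(r, d) for r, d in pairs)
    total = sum(counts.values())
    return {band: round(100 * n / total, 1) for band, n in counts.items()}
```

For example, `band_shares([(30, 30), (30, 30), (41, 40), (52, 25)])` puts half of these invented records in the consistent band and a quarter in each of the other two. Applied to the real records, the same banding would yield the 91.7%, 6.2% and 2.1% split reported above.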

Caution in uses of data

Census data can be used for the analysis of population characteristics at finer geographic levels and for smaller sub-groups than would be reliably available from household surveys. However, at very fine data levels, and as other data items are incorporated, outliers (unusual results) may become more apparent. Users are therefore cautioned to keep in mind the age data quality issues outlined above when looking at small population groups. Post-censal analysis of data for the population aged 100 and over indicates that users should be wary of cross-classifying age data for this group with other population characteristics. In most cases it is advised that users collapse the single age categories for those aged 100 and over into a single output category. For the official ABS estimate of demographic data on the Australian population, users should use data on Estimated Resident Population (3201.0).
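
One way to follow the advice on collapsing ages 100 and over is sketched below. The function name, the label text and the sample counts are illustrative assumptions, and the input is taken to be a mapping from single year of age to a person count.

```python
from typing import Dict


def collapse_100_plus(counts: Dict[int, int], threshold: int = 100) -> Dict[str, int]:
    """Merge single year counts at or above `threshold` into one category."""
    collapsed: Dict[str, int] = {}
    for age in sorted(counts):
        label = str(age) if age < threshold else f"{threshold} years and over"
        collapsed[label] = collapsed.get(label, 0) + counts[age]
    return collapsed


# Invented counts, for illustration only.
example = {98: 7, 99: 5, 100: 3, 104: 2, 115: 1}
print(collapse_100_plus(example))
```

Collapsing before any cross-classification keeps the few records aged 100 and over in a single category, so that isolated capture or reporting errors are less visible in small cells.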

The ABS aims to produce high quality data from the Census. To achieve this, extensive effort is put into Census form design, collection procedures, and processing procedures. More details regarding these efforts can be found in:
All are available from the ABS Website.

06/06/2008 Note: The Quality Statement - Age (AGEP) was amended.

