This article was published in the July 2008 issue of Australian Labour Market Statistics (cat. no. 6105.0).
UNDERENUMERATION IN THE LABOUR FORCE SURVEY
FINDINGS OF A CENSUS DATA ENHANCEMENT QUALITY STUDY
The Census of Population and Housing is the largest statistical collection undertaken by the ABS. It aims to collect information about every person in Australia on Census Night.
During Census processing, the ABS uses name and address information collected in the Census to assist in processing the data, including coding family structure and checking undercount. After Census processing is complete, all names and addresses held by the ABS are destroyed.
For the 2006 Census, the ABS is undertaking a Census Data Enhancement (CDE) project. As part of this project, the ABS is conducting a number of quality studies which bring together 2006 Census data and other specified datasets. The results of these studies will indicate the usefulness and validity of such linkages and will allow the ABS to make improvements to its collections.
An investigation into underenumeration in the Labour Force Survey (LFS) is a component of the LFS quality assurance program. The 2006 Census provides an opportunity to bring together information related to dwellings in both the LFS and the 2006 Census. For the study, a linked dataset was created by clerical methods, using names, addresses and other demographic variables from the 2006 Census and the August 2006 LFS. All names and addresses used to create this dataset were subsequently removed. The resulting dataset was only available to ABS officers and was destroyed on 30 June 2008.
This paper reports on the findings of the study. The three main purposes of this study were:
- to produce an estimate of underenumeration in the LFS, relative to the Census;
- to determine and compare the types of people, dwellings and geographies associated with high underenumeration; and
- to develop recommendations to reduce underenumeration in the LFS.
All of the CDE studies have strict data security procedures in place to ensure confidentiality. These procedures were followed in this quality study. For further information on the CDE project see the Statistician's CDE Project Statement of Intention, which is available on the ABS website, and the following ABS papers:
- Methodology of Evaluating the Quality of Probabilistic Linking (cat. no. 1351.0.55.018);
- Enhancing the Population Census: Developing a Longitudinal View 2006 (cat. no. 2060.0); and
- Census Data Enhancement Project: An Update (cat. no. 2062.0).
The LFS is a monthly household survey which collects information about the labour force status and other characteristics of the usually resident Australian civilian population aged 15 years and over. Estimates of employment, unemployment, and labour force participation published from the survey each month are used to inform key social and economic policies.
The LFS has an extensive quality assurance program which aims to ensure LFS estimates are of the highest standard. Despite these efforts, estimates from the LFS, like those from all surveys, are affected by various forms of error. One form of error is underenumeration.
Underenumeration arises when part of the target population is missed. This can occur, for example, when dwellings are missed in constructing the sampling frame. It can also occur when information is not obtained for all persons selected in the survey. Some groups within the population, for example young males, can be more difficult to contact than others. If this occurs, the group being underenumerated is not fully represented in the sample.
Benchmarking the estimates to population estimates by age and sex compensates for underenumeration of certain groups. However, to the extent that, within benchmark categories, the characteristics of enumerated persons are different to the characteristics of the people missed, a bias remains.
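The point about residual bias can be illustrated with a minimal sketch in Python. All figures are hypothetical, and the single-category weighting is a simplification for illustration only; it is not the ABS estimation method. The sketch assumes one benchmark category made up of an easy-to-contact group and a hard-to-contact group with different unemployment rates, where only half of the hard-to-contact group responds.

```python
# Hypothetical illustration of benchmarking within one age-by-sex category.
# All numbers are invented for this sketch; none come from the LFS or the Census.


def benchmark_weight(benchmark_population: float, respondents: int) -> float:
    """Weight that makes the respondents in a category sum to its benchmark."""
    if respondents <= 0:
        raise ValueError("respondents must be positive")
    return benchmark_population / respondents


# Assumed population of the category: 16,000 easy-to-contact people
# (5% unemployed) and 4,000 hard-to-contact people (10% unemployed).
BENCHMARK_POPULATION = 16_000 + 4_000
TRUE_UNEMPLOYED = 800 + 400

# Assumed respondents under a 1-in-100 sample: all 160 easy-to-contact people
# respond (8 unemployed), but only 20 of the 40 hard-to-contact people
# respond (2 unemployed).
respondents = 160 + 20
unemployed_respondents = 8 + 2

weight = benchmark_weight(BENCHMARK_POPULATION, respondents)
weighted_population = weight * respondents
weighted_unemployed = weight * unemployed_respondents

print(f"Weighted population: {weighted_population:.0f}")
print(f"Weighted unemployed: {weighted_unemployed:.0f}")
print(f"True unemployed:     {TRUE_UNEMPLOYED}")
```

In this hypothetical case the weighted population matches the benchmark of 20,000, but the weighted unemployed count (about 1,111) falls short of the assumed true count of 1,200, because the people missed differ from the people enumerated within the same benchmark category.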
The limitations of the current study, discussed in the next section, prevent it being used to estimate the bias from underenumeration in the LFS. Rather, its results provide an indication of the characteristics and extent of LFS underenumeration.
This study measured underenumeration in the LFS through a comparison with the Census. Whilst the Census aims to collect information about everyone in Australia on Census Night, it too is affected by underenumeration. The extent of underenumeration in the Census is estimated by the Census Post Enumeration Survey (PES), conducted three weeks after Census Night (for details of Census undercount, see Census of Population and Housing - Details of Undercount, cat. no. 2940.0). As the CDE Quality Study of Labour Force Underenumeration did not include persons who were missed by the Census, it does not provide a complete measure of underenumeration in the LFS.
An additional limitation of the current study is the lower than anticipated linkage rate between LFS and Census records. Linkage rates in non-private dwellings (for example, hotels, caravan parks and Indigenous community dwellings) were extremely low, with only 0.4% of addresses linked. Linkage rates in LFS private dwellings were considerably higher, with 71.3% of addresses linked. Consequently, the findings presented in this paper are for linked addresses for private dwellings only. The main hindrances to linking were the lack of detail in some LFS and Census address information and differences in address descriptions between the two collections.
The limitations of this study become evident when comparisons are made between the study's results and an indicative measure of underenumeration derived from the 'apparent enumeration rate'. This rate is calculated using the following formula:
Apparent enumeration rate = (Actual LFS sample / Expected LFS sample) x 100
where 'Actual LFS sample' is the number of people who fully responded to the LFS, and 'Expected LFS sample' is the civilian population aged 15 years and over multiplied by the sampling fraction for a given State or Territory. The sampling fraction is the proportion of the civilian population aged 15 years and over intended to be selected in the survey, and is calculated as part of the sample design process. For example, the sampling fraction for New South Wales in August 2006 was 1/321. This means that one in every 321 people in New South Wales should be selected in the LFS sample.
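The calculation described above can be sketched in Python as follows. The sampling fraction of 1/321 is the New South Wales figure quoted in the text; the population and respondent counts are hypothetical round numbers chosen for illustration, and the function names are assumptions of this sketch.

```python
from fractions import Fraction


def expected_sample(civilian_population_15_plus: int,
                    sampling_fraction: Fraction) -> Fraction:
    """Expected LFS sample: civilian population aged 15+ times the sampling fraction."""
    return civilian_population_15_plus * sampling_fraction


def apparent_enumeration_rate(actual_sample: int, expected: Fraction) -> float:
    """Actual LFS sample as a percentage of the expected LFS sample."""
    if expected <= 0:
        raise ValueError("expected sample must be positive")
    return float(actual_sample / expected * 100)


# Sampling fraction quoted in the text for New South Wales, August 2006.
nsw_fraction = Fraction(1, 321)

# Hypothetical inputs, not actual ABS figures.
population = 3_210_000
fully_responding = 9_000

expected = expected_sample(population, nsw_fraction)
rate = apparent_enumeration_rate(fully_responding, expected)

print(f"Expected sample: {expected}")
print(f"Apparent enumeration rate: {rate:.1f}%")
print(f"Apparent underenumeration rate: {100 - rate:.1f}%")
```

The apparent underenumeration rate is simply the complement of the apparent enumeration rate, which is why a rate below 100% indicates that fewer people responded than the sample design intended.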
The 'apparent enumeration rate' for the August 2006 LFS is 86.9%, which corresponds to an underenumeration rate of 13.1%. This is considerably higher than the underenumeration rate, relative to the Census, calculated in the present study. This discrepancy is not unexpected, as the underenumeration rate calculated in the present study does not account for:
- people who did not return a Census form;
- people in non-private dwellings; or
- people in private dwellings whose address could not be linked.
Another source of discrepancy is the treatment of people who are temporarily overseas. In the LFS, selection rules are applied at each dwelling in the sample, with the aim of associating each person in the target population with one and only one dwelling. People who usually live in a private dwelling, but who are away for six weeks or more, are excluded from the LFS at their usual residence, and have their chance of selection in the survey at the place at which they are staying. If they are staying overseas, they have no chance of selection in the survey. Therefore, part of the apparent underenumeration rate is explained by the inclusion of people in the estimate of the civilian population aged 15 years and over, who in practice have no chance of selection in the LFS.
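The selection rule described above can be sketched as a small Python function. This is a simplified reading of the paragraph, not the full LFS coverage rules: the function name and return labels are hypothetical, and the sketch assumes that a person away for less than six weeks keeps their chance of selection at their usual residence.

```python
from typing import Optional

SIX_WEEKS = 6  # absence threshold stated in the text, in weeks


def lfs_selection_place(weeks_away: float, staying_overseas: bool) -> Optional[str]:
    """Return where a usual resident of a private dwelling can be selected.

    Returns None when the person has no chance of selection in the LFS.
    """
    if weeks_away < SIX_WEEKS:
        # Assumption of this sketch: short absences do not move the person.
        return "usual residence"
    if staying_overseas:
        # Away six weeks or more and overseas: no chance of selection.
        return None
    # Away six weeks or more within Australia: selected where they are staying.
    return "place of stay"


print(lfs_selection_place(weeks_away=0, staying_overseas=False))
print(lfs_selection_place(weeks_away=8, staying_overseas=False))
print(lfs_selection_place(weeks_away=8, staying_overseas=True))
```

People in the last case still count towards the civilian population aged 15 years and over, which is the sense in which they inflate the expected sample without being able to contribute to the actual sample.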
A further issue to be kept in mind when interpreting the results of the current study is the potential time gap between Census and LFS enumeration of each dwelling. The majority of LFS enumeration occurred in the two week period 7-20 August, with a small proportion of dwellings enumerated in the follow-up period 21-31 August. Census enumeration occurred over a longer period. Census Night was Tuesday 8 August, and the majority of Census enumeration occurred before the end of August. However, a small proportion of Census forms were completed before Census Night, and some were completed in September. If there was a time gap between collections for a particular dwelling, it could be as little as one day, but could be over 30 days. As a result, the mobility of people, and entire households, could have had an effect on the accuracy of the current analysis as there was no reliable way of differentiating between actual underenumeration and cases where occupants of a dwelling changed between the time of Census and LFS enumeration.
THE POPULATION FOR ANALYSIS
For the purposes of this study, the population of interest is persons aged 15 years and over who are usual residents in a private dwelling and who were enumerated at their usual residence on Census Night. These people were considered to be most likely to have been usual residents at the time of LFS enumeration, and therefore should have been enumerated in the LFS. Visitors to dwellings on Census Night were excluded from the analysis because their mobility would potentially distort the underenumeration analysis if they were not matched between the LFS and the Census. Those aged under 15 years were also excluded as they are out of the scope of the LFS. It should be noted that the population for the present study does not mirror the defined LFS population precisely, as it is likely to include a small number of persons who are normally out of the scope of the LFS, such as permanent members of the Australian defence forces.
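The inclusion rules for the analysis population can be sketched as a filter in Python. The record layout and field names are hypothetical and do not reflect the actual Census or CDE data structures; the age cut-off follows the LFS scope of persons aged 15 years and over.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CensusPerson:
    """Hypothetical person record; field names are assumptions of this sketch."""
    age: int
    in_private_dwelling: bool
    enumerated_at_usual_residence: bool  # False for visitors on Census Night


def in_analysis_population(person: CensusPerson) -> bool:
    """True if the person meets the study's stated inclusion rules."""
    return (
        person.age >= 15
        and person.in_private_dwelling
        and person.enumerated_at_usual_residence
    )


people = [
    CensusPerson(age=34, in_private_dwelling=True, enumerated_at_usual_residence=True),
    CensusPerson(age=12, in_private_dwelling=True, enumerated_at_usual_residence=True),
    CensusPerson(age=40, in_private_dwelling=True, enumerated_at_usual_residence=False),
    CensusPerson(age=67, in_private_dwelling=False, enumerated_at_usual_residence=True),
]

analysis_population = [p for p in people if in_analysis_population(p)]
print(len(analysis_population))
```

Only the first hypothetical record passes the filter: the second is under 15, the third is a visitor, and the fourth is in a non-private dwelling. The filter does not attempt to remove defence force members, consistent with the caveat noted above.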
The results of this study are based on unweighted counts. In other words, the data have not been inflated to represent the Australian population as a whole. Of greater interest for the purposes of this study are the raw counts and proportion of people missed in the LFS, relative to those who were enumerated in the LFS sample.
For analysis, the population of interest was divided into two distinct groups: those missed in the LFS (but enumerated in the Census) and those enumerated in both the LFS and Census.
The LFS underenumeration rate was calculated using the following formula:
Those missed in the LFS / (Those missed in the LFS + Those enumerated in both the LFS and Census) x 100
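As a minimal sketch, the formula above can be written in Python as follows. The counts used are hypothetical and unweighted; they are not the counts from the study, and the function name is an assumption of this sketch.

```python
def underenumeration_rate(missed_in_lfs: int, enumerated_in_both: int) -> float:
    """Percentage of the analysis population missed in the LFS, relative to the Census."""
    total = missed_in_lfs + enumerated_in_both
    if total <= 0:
        raise ValueError("the analysis population must contain at least one person")
    return 100 * missed_in_lfs / total


# Hypothetical unweighted counts for illustration only.
missed = 30
both = 970

print(f"{underenumeration_rate(missed, both):.1f}%")
```

Because the denominator is the sum of the two groups, the rate describes only people enumerated in the Census at linked private dwellings, which is why it is described as underenumeration relative to the Census rather than a complete measure.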
LFS Underenumeration rate relative to Census
The underenumeration rate calculated in this study is the proportion of people missed in the LFS, relative to the Census. Based on the results of the current study, the national underenumeration rate in the LFS, relative to the Census, is 5.0%.
The following paragraphs provide an indication of the characteristics of LFS underenumeration.
1. Persons enumerated, LFS and Census
|Enumerated in both LFS and Census |
|Missed in LFS |
|Total persons |
LFS Underenumeration by state/territory
The underenumeration rates, relative to the Census, varied considerably between the states and territories. The highest underenumeration rates were observed in Queensland (5.6%) and New South Wales (5.5%). The lowest underenumeration rates were observed in the ACT (2.4%) and Tasmania (2.9%).
2. LFS Underenumeration, States and territories
Missed in LFS
Enumerated in both LFS and Census
|New South Wales |
|South Australia |
|Western Australia |
|Northern Territory |
|Australian Capital Territory |
LFS Underenumeration by sex and age groups
The underenumeration rate, relative to the Census, was higher for males (5.4%) than females (4.5%). For both the male and female populations, the underenumeration rates were highest in the 15-34 year old age group, with the highest underenumeration rate observed for males aged 25-34 (8.1%).
Substantially lower underenumeration rates were observed for persons 35 years and over, with underenumeration rates ranging from 3.5% for persons 75 years and over to 4.6% for persons 35-44 years.
3. LFS Underenumeration Rate, by sex and age groups
|Age group (years) |
|75 and over |
LFS Underenumeration by sex and registered marital status
The highest underenumeration rates, relative to the Census, were observed in the never married (7.3%) and separated (8.1%) populations. The lowest underenumeration rate was observed in the married population (3.2%). The highest underenumeration rate observed was for separated males (10.9%).
4. LFS Underenumeration Rate, by sex and registered marital status
|Marital status |
|Never married |
LFS Underenumeration by sex and labour force status
High underenumeration rates, relative to the Census, were observed amongst those whose labour force status in the Census was unstated (10.9%), in the unemployed looking for full-time work population (10.1%) and the employer population (7.4%). Males whose labour force status was not stated (13.3%) and males looking for full-time work (11.1%) had the highest underenumeration rates observed in this study.
5. LFS Underenumeration Rate, by sex and labour force status(a)
|Labour force status |
|Own account worker |
|Contributing family worker |
|Unemployed looking for full-time work |
|Unemployed looking for part-time work |
|Not in the labour force |
|Not stated |
|(a) This Labour Force Status classification (LFSP) was used in the 2001 Census. Following the release of Census Dictionary, 2006 in May 2006, this Classification was changed to LFS06P (see Census Dictionary: Corrigendum (cat. no. 2901.0)). As the CDE Project data was linked and analysed before the changed classification was available, the LFSP classification is used here. For definitions of each category, see the Glossary. |
Other characteristics of persons missed in LFS
Compared to the overall underenumeration rate of 5.0%, relative to the Census, several other characteristics (that is, types of people, dwellings or geographies) were identified as being associated with high underenumeration. They are presented in the following table.
6. LFS underenumeration rate, Other characteristics
Missed in LFS
Enumerated in both LFS and Census
|Living elsewhere in Australia in 2005(a) |
|Living overseas in 2005 |
|Living overseas in 2001 |
|Living in a single person household |
|Living in a household of 6 or more persons |
|Aboriginal or Torres Strait Islander |
|Living in remote/very remote areas |
|Speak a language other than English at home |
|(a) Living somewhere other than person's usual address in the 2006 Census. |
The results of the CDE Quality Study of LFS Underenumeration provided a measure of LFS underenumeration when compared with the Census (5.0%) and identified some characteristics of those who were enumerated in the Census but missed in the LFS. As a consequence of this study, the ABS will review the field operations which apply to the groups with high underenumeration rates relative to the Census. Due to the limitations of the study, including the linkage rate between Census and LFS addresses, the findings will not be used to minimise the impact of undercoverage bias on LFS estimates through weighting adjustments. However, a beneficial outcome of this CDE quality study is that the ABS has identified the potential to improve the linkage rate in future studies through the way that address details are recorded in the LFS and Census.