How administrative data improved the quality of the 2021 Census

How we used administrative data to improve the quality of the information we collected

Released
20/01/2023

Background

Prior to the 2021 Census, we shared our plans to use administrative data to support the Census. Following the Census, the Statistical Independent Assurance Panel, in its report on the quality of 2021 Census data, was “pleased to note data quality improvements resulting from enhancements to occupancy determination and imputation for non-responding households, most notably as a result of the use of administrative data.”

This article explains what we did and shows how administrative data improved the quality of the 2021 Census.

What is administrative data?

Administrative data is information that government departments, businesses and other organisations collect. These organisations collect information for a range of reasons, such as:

  • registrations
  • sales
  • record keeping.

Some examples of administrative data:

  • personal income tax information from the Australian Taxation Office
  • information about the number of people who use Medicare from the Department of Health.

We only collect and use administrative data for statistics and research. We don't share or release this information in a way that could identify anyone.

Methods to improve the quality of Census counts

To make sure that we count all Australians in the Census, we need to:

  • decide whether a house was occupied on Census night, when we didn’t receive a Census form
  • adjust the count for people who were missed.

Results from the 2016 Post Enumeration Survey showed that we don’t always get these decisions right: we might think a house was occupied on Census night when it was actually empty, or we might adjust the count by adding people of the wrong age. For the 2021 Census, we used information from administrative data to improve our methods.

Assessing which houses were empty (occupancy determination)

For houses where we didn’t receive a Census form and our field staff couldn’t work out if the house was occupied, we used administrative data to help us decide whether the house was empty or occupied on Census night. We explained the way we did this in a previous article, Using administrative data to improve the Census count.

We made one improvement to the approach described above: we added publicly available information about rental vacancies. If a house was listed for rent around Census time, then we were more likely to decide it was empty.

For most houses, we had good information about whether they were occupied or vacant, either from a Census form or from our field staff. We only used our statistical model to help decide occupancy for 2% (about 218,000) of all houses. The model predicted that almost three-quarters of these (72% or about 156,000) were empty.

In the states and territories that were impacted by COVID-19 lockdowns around Census time, we used our model to set occupancy for more houses (2.5% for New South Wales and Victoria, and 2.2% for the ACT) than in those that were not in lockdown (from 0.9% in Tasmania to 1.9% in the Northern Territory). This is likely because field staff found it more difficult to work out whether a house was occupied when they couldn’t knock on any doors (Census field work was contactless where COVID-19 lockdowns were in place).
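The article does not publish the occupancy model itself, but the idea of combining administrative signals into an occupancy decision can be sketched as follows. This is a minimal illustration, not the ABS model: the feature names (rental listing, electricity use, service use), the weights, the logistic form and the threshold are all assumptions.

```python
import math

# Hypothetical sketch only: feature names and weights are illustrative
# assumptions, not the ABS occupancy model.
WEIGHTS = {
    "listed_for_rent": -2.0,        # a rental-vacancy listing suggests the house was empty
    "recent_electricity_use": 1.5,  # recent electricity use suggests it was occupied
    "recent_service_use": 1.2,      # recent government-service activity suggests occupancy
}
BIAS = 0.3

def occupancy_score(signals: dict) -> float:
    """Combine administrative signals into a probability-like score
    that a non-responding dwelling was occupied on Census night."""
    z = BIAS + sum(WEIGHTS[name] for name, present in signals.items() if present)
    return 1.0 / (1.0 + math.exp(-z))  # logistic transform

def predict_occupied(signals: dict, threshold: float = 0.5) -> bool:
    return occupancy_score(signals) >= threshold
```

Under these illustrative weights, a dwelling listed for rent with no sign of recent activity scores well below the threshold and would be set to unoccupied, while one with recent electricity and service use scores well above it.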

Adjusting the Census count (imputation)

For houses where we didn’t receive a Census form and the house was occupied (based on field information or our model), we needed to adjust our count for the people we missed. Like previous Censuses, we did this using a process called imputation. Imputation is where we copy basic Census information (number of people with their age and sex) from another similar household to represent the missed people.

For the 2021 Census, we used administrative data to help us choose a representative household (known as a ‘donor’) where the people are more similar in age to those who were missed. We described our method in this article Using administrative data to improve the Census count.

For the 2021 Census, there were about 379,000 (3.5%) dwellings where we needed to make up for missed people. This was down from around 430,000 (4.4%) dwellings in the 2016 Census.
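The donor-selection idea above can be sketched in code. This is a hypothetical illustration, assuming we hold administrative age estimates for the missed household and a pool of responding households to draw from; the data layout and the distance measure are assumptions for this sketch, not the ABS method.

```python
# Illustrative hot-deck imputation: copy basic demographics from the most
# similar responding household (the "donor"), matching on administrative
# age estimates for the missed household.

def choose_donor(target_admin_ages, donors):
    """Pick the donor household whose ages best match the administrative
    age estimates for the missed household."""
    def age_distance(donor_ages):
        if len(donor_ages) != len(target_admin_ages):
            return float("inf")  # require the same household size
        return sum(abs(a - b)
                   for a, b in zip(sorted(donor_ages), sorted(target_admin_ages)))
    return min(donors, key=lambda donor: age_distance(donor["ages"]))

# Hypothetical donor pool of responding households
donors = [
    {"id": 1, "ages": [34, 36]},
    {"id": 2, "ages": [67, 70]},
    {"id": 3, "ages": [35, 33, 5]},
]

# Administrative data suggests the missed household held two older adults,
# so the older couple (donor 2) is chosen rather than a random household.
best = choose_donor([68, 71], donors)
```

Without the administrative age signal, any same-sized responding household could serve as the donor, which is how older age groups ended up overadjusted in 2016.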

Measuring improvements in the quality of Census counts

Improvements in how we adjusted for people we missed (occupancy determination and imputation)

Results from the 2021 Post Enumeration Survey showed that, by using administrative data, we did a better job of deciding whether non-responding houses were occupied and adjusting the count for the people we missed. Together, these improvements delivered Census counts that were more accurate. This was especially true for counts in inner city areas and for older Australians.

First, the 2016 Post Enumeration Survey showed that we set too many non-responding houses to occupied and, as a result, we added about 320,000 too many people when we adjusted the count (about 1.4% of the population). In the 2021 Post Enumeration Survey this reduced to about 230,000 too many people (about 0.9% of the population).

While this is only a small change at a national level, improvements were much more significant for inner city high-rise areas where it’s hard to tell if people are home. For example, Figure 1 below shows the proportion of non-responding houses that were set to occupied for the whole of Sydney, the Inner-Sydney SA4 and the Potts Point SA2. The greatest difference was for the Potts Point SA2 in central Sydney where high-rise apartments are very dense and where we probably set too many non-responding houses to occupied in 2016.

A second major improvement to the Census counts was in the age profile, where administrative data helped us choose donor houses with people who were more similar in age to the people who were missed. You can see in Figure 2 that the 2016 Post Enumeration Survey shows we added too many older people in our adjustment for the 2016 Census. In contrast, the 2021 Post Enumeration Survey shows we reduced the net overcount in these older age groups, making it more similar across all adult age groups.

Another way we can see the improvements in 2021 Census data is when we compare the Census population with the Census night population estimate from the Post Enumeration Survey. In 2016, the count of Census people under 35 was noticeably lower than the estimate from the Post Enumeration Survey (gap between shaded region and line in Figure 3a) and the count of Census people aged 55-74 was higher than the estimate from the Post Enumeration Survey (shaded region above the line in Figure 3a).

Note: Excludes Other Territories and overseas visitors.

In 2021, the count of Census people under 35 was closer to the estimate from the Post Enumeration Survey (smaller gap between the shaded region and line in Figure 3b) and the count of Census people aged 55-74 was closer to the estimate from the Post Enumeration Survey (very little shading above the line in Figure 3b).

Note: Excludes Other Territories and overseas visitors.

In summary, using administrative data helped to improve the 2021 Census count by:

  • reducing the overadjustment for missing persons, particularly for inner city high-rise areas
  • giving a more balanced adjustment for missing people across age groups, reducing the overadjustment for older people.

Checking Census data quality (quality assurance)

We used administrative data as an independent check for Census counts and occupancy rates (the proportion of houses that are occupied). This helped give us confidence that the Census data was accurate, particularly when it wasn’t what we expected.

In some areas, the occupancy rate for the 2021 Census was quite different to the occupancy rate in the 2016 Census. We predicted occupancy rates for these areas using administrative data and these rates generally matched the Census data, giving us confidence that the data was accurate.

We also used administrative data to check Census counts when they didn’t match what we expected from our official measure of Australia’s population at that time (the Estimated Resident Population, or ERP). The Census counts and ERP matched well for all of Australia, but we saw some differences in specific areas. For example, the Census count for people aged 25-39 in Tasmania was higher than we expected from ERP (difference between the dark blue and orange lines in Figure 4). When we looked at the administrative data (light blue line), the count of people aged 25-39 in Tasmania was higher than ERP as well. This gave us confidence that the Census data was accurate.

  1. Estimated Resident Population (ERP) is unrebased

Enhancing Census with administrative data

Preparing for unexpected events

When we shared our plans to use administrative data to support the Census, we included information on preparing for unexpected events. We developed and tested a method for using administrative data to fill gaps in Census data. Even though the COVID-19 pandemic was an unexpected event, it didn’t cause any significant gaps in Census data, and we didn’t need to use this method. We are well prepared if we need this in the future.

Enhancing Census income data

We have added to the income data available in the Census using linked administrative data. We used income data from the Australian Taxation Office and the Department of Social Services to provide extra information. This includes:

  • Weekly income earned in the 2020/2021 financial year (in $10 per week categories)
  • Main source of income
  • Main type of government benefit payment.

Like other data we collect, administrative data is collected under Census and statistics laws (the Census and Statistics Act 1905). This means any comparison between income reported in the Census and income in administrative data can only be made for statistical purposes, not for compliance.

When it's available, we will add a link to the analysis here.

Future plans

Given the improvements that integrated administrative data made to the 2021 Census, we will be exploring new ways to use it to support the 2026 Census. This includes:

  • reducing visits to houses that we predict to be vacant around Census time
  • filling more gaps in Census data, particularly for areas where we know it is harder to reach everyone
  • adding to the information on the Census, for example, adding information about people's main source of income or more information on where people lived since the last Census
  • potentially replacing some information currently collected on the Census, for example, removing the income range tick box question and replacing it with more detailed income data sourced from administrative records.

As part of our work to support the 2021 Census, we created two administrative datasets: one about people and one about houses. Like the Census, these are a snapshot of Australian people and houses as of August 2021. We think these datasets could be a useful addition alongside the Census snapshot in helping to understand and improve the lives of Australians. We plan to make these administrative datasets available to researchers in 2023, with release of data cubes in June and possible further releases in the second half of the year.

Appendix - Measuring the quality of our administrative data

The administrative data we used to support the 2021 Census comprised mostly government data, along with rental vacancies data and electricity use information. The government data we used was from the Multi-Agency Data Integration Project (MADIP), which combines data on healthcare, government payments and personal income tax with population demographics.

In the lead-up to the 2021 Census, we released an article, Assessing administrative data quality to enhance the 2021 Census, which demonstrated the quality of this data by comparing it to 2016 population estimates to show how well it would have captured the 2016 population.

In this section we indicate the quality of the data we used to support the 2021 Census by making a similar comparison with rebased 2021 population estimates. Note that this data has received many small improvements throughout the Census dissemination period; here we show a more recent version from late 2022.

Working out which people to include

Administrative records in MADIP cover people who were resident in Australia from January 2006 to June 2021. For this project, we were interested in the population around Census time (we used 30 June 2021, just six weeks prior to Census night).

To identify the people who were alive and living in Australia at Census time, we applied rules (see Table 1) to remove people who had either left the country or had died prior to 30 June 2021.

Table 1: Rules used to scope the administrative data to persons living in Australia at 30 June 2021

| Scoping rule | Measured by | Datasets used |
| --- | --- | --- |
| Remove people recorded as deceased prior to 30 June 2021 | Date of death | Medicare; Social Security; Death Registrations |
| Remove people recorded as having left the country prior to 30 June 2021 | Date of departure; Date of arrival | Travellers |
| Remove people who have not recently used a government service | Use of a government service in the last 1-5 years(a) | Medicare and Pharmaceutical Benefits; Social Security; Immunisation Register; Single Touch Payroll |

(a) Exact duration depends on a person's age
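The scoping rules in Table 1 amount to a filter over person records. The sketch below is illustrative only: the record layout and field names are assumptions, and a flat five-year service-use window stands in for the actual age-dependent window noted in the table.

```python
from datetime import date

REFERENCE_DATE = date(2021, 6, 30)
SERVICE_USE_WINDOW_DAYS = 5 * 365  # assumption: the real window depends on age

def in_scope(person: dict) -> bool:
    """Apply the Table 1 rules: keep a person only if they are not recorded
    as deceased, not recorded as having left the country, and have recently
    used a government service."""
    death = person.get("date_of_death")
    if death and death < REFERENCE_DATE:
        return False  # rule 1: recorded as deceased before 30 June 2021
    departed = person.get("date_of_departure")
    returned = person.get("date_of_arrival")
    if departed and departed < REFERENCE_DATE and not (returned and returned > departed):
        return False  # rule 2: left the country and not recorded as returning
    last_use = person.get("last_service_use")
    if not last_use or (REFERENCE_DATE - last_use).days > SERVICE_USE_WINDOW_DAYS:
        return False  # rule 3: no recent government service use
    return True
```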

Working out where to place people

As well as working out which people to include, we applied rules to place people in the most appropriate geographic location. We used location information from a range of administrative data sources, including Medicare, Social Security, Tax and the Immunisation Register. To pick the best location for our time point of interest (30 June 2021), we used information on how recently each location was updated, its precision, and whether it was likely to be a residential location.
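Choosing the best location from several sources is essentially a ranking problem over the three criteria named above. A minimal sketch, assuming hypothetical candidate records carrying those criteria (the field names, precision coding and ranking order are assumptions):

```python
from datetime import date

def best_location(candidates):
    """Choose the address most likely to be where the person lived on
    30 June 2021, preferring residential, precise, recently updated records."""
    def rank(candidate):
        return (
            candidate["is_residential"],  # residential beats non-residential
            candidate["precision"],       # e.g. 2 = full address, 1 = suburb only
            candidate["last_updated"],    # most recently updated wins ties
        )
    return max(candidates, key=rank)

# Hypothetical candidate addresses for one person
candidates = [
    {"source": "Medicare", "is_residential": True, "precision": 2,
     "last_updated": date(2021, 5, 1)},
    {"source": "Tax", "is_residential": False, "precision": 2,
     "last_updated": date(2021, 6, 15)},
    {"source": "Immunisation", "is_residential": True, "precision": 1,
     "last_updated": date(2021, 6, 20)},
]
```

Here the Medicare record wins: although the Tax address is fresher, it is not a residential location, and the Immunisation address is only suburb-level.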

How well does administrative data match the population?

To understand how well our administrative data represents the population, we compare it to the official ABS population count, the Estimated Resident Population (ERP).

National results

The national count of people in our administrative data (25,625,000) closely matches the ERP count (25,688,000), with a difference of around 63,000 people or 0.2% of the population. When we compare the age profiles, administrative data is very close to ERP (Figure 5).
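As a quick arithmetic check of the national figures quoted above:

```python
def pct_difference(admin_count: int, erp_count: int) -> float:
    """Percentage difference of the administrative count relative to ERP."""
    return 100.0 * (admin_count - erp_count) / erp_count

# National counts quoted above
diff = pct_difference(25_625_000, 25_688_000)
# diff is about -0.25, i.e. the administrative count sits roughly 0.2% below ERP
```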

  1. Estimated Resident Population (ERP) is rebased

If we look at the difference between administrative data and ERP more closely, we see that, for most ages, the difference (shown by the light blue line in Figure 6) falls within the range of possible error that is present in ERP (shown by the shaded band). This uncertainty in ERP is introduced when we apply the sample-based adjustment factor for undercounting in the Census, as determined by the Post Enumeration Survey.

Where the percentage difference stays within the band, it is not clear whether the administrative count or ERP is closer to the true population. One clear difference is that there are fewer babies in the administrative data than in ERP. This is due to a delay in babies being registered in the administrative data.

  1. Estimated Resident Population (ERP) is rebased

State and territory results

When we compare the administrative data with ERP at the state and territory level, we see that the count of people in administrative data is close to ERP, being within the margin of error for most states (NSW, Vic, Qld, SA and Tas). The count of people in administrative data for WA, NT and ACT is less than ERP (Figure 7).

  1. Estimated Resident Population (ERP) is rebased
  2. Error bars represent the uncertainty in ERP introduced by sampling error in the Post Enumeration Survey

Comparing in more detail, we see that the administrative data count is close to ERP in all capital cities except Melbourne, Perth and Canberra where it is lower to a significant degree (outside the margin of error). Outside of the capital cities the administrative data count is higher than ERP to a significant degree in regional NSW and Vic and is substantially lower than ERP in regional NT (about 17% lower). The large difference in regional NT is mostly due to an absence of reliable location information for people in the remote Northern Territory in administrative data (Figure 8).

  1. Estimated Resident Population (ERP) is rebased
  2. Error bars represent the uncertainty in ERP introduced by sampling error in the Post Enumeration Survey

Note: Y axis is truncated at -10
