Using administrative data to fill possible data gaps in the Census

Preparing the 2021 Census for unexpected events like bushfires and floods

Released
12/03/2021

Being prepared for unexpected events

The Census is one of Australia’s largest peacetime operations. The COVID-19 pandemic and 2020 bushfires are reminders that the Census needs to be ready to change when unexpected events happen.

To make sure we can still deliver the highest quality Census, we are getting ready to use administrative data to fill any significant gaps that unexpected events might cause.

A bushfire, for example, might make it hard for people in an affected town or area to complete the Census. A national emergency, such as the outbreak of a pandemic, could affect how the whole country responds.

Administrative data can help improve the population counts and fill in the gaps in some of the other information collected on Census forms.

Other countries have successfully used administrative data to fill in gaps in their censuses.  For example:

  1. The US Census Bureau used administrative data to achieve a high response in its 2020 Census. Early results show that it helped count people in about 6% of houses. The Bureau needed to do this even after it increased the time for collecting Census forms from three to six months because of COVID-19.
  2. Stats NZ used administrative data to fill gaps in its 2018 Census after it got a lower response than expected. Data for 89% of people in its Census came from the 2018 Census form and 11% from administrative data. This meant Stats NZ got higher quality counts compared to previous censuses, but it couldn't fill all the gaps in Census information.
  3. In Canada, a large bushfire at Fort McMurray interrupted the nation's 2016 Census, with about 100,000 people needing to be evacuated from the area. If Statistics Canada didn't receive a response from a home in the evacuated area, it used administrative data to fill in basic information.

When we'd use administrative data for the Census

We think it’s important to be prepared and will only use administrative data in this way if we really need to.

It’s very important to us to keep the public’s trust. In 2020, we did a privacy impact assessment on using administrative data.  The assessment found that people’s privacy is well protected. It recommended that we publish information on how we will decide whether to use administrative data or not.

To answer this recommendation, we developed the principles in the table below for when we would use administrative data to fill gaps in the Census.  There is also more detail in Figure 1 on how we will use these principles to guide our decisions.

  1. Census data quality is significantly affected

There must be a large enough impact to the quality of Census data for us to use administrative data. For example:

  • we can't get data for a town or local region
  • the gaps for a particular population are large enough that the Census can’t give accurate information for planning and policy decisions.

  2. Administrative data is of high enough quality

Administrative data must be of high enough quality to fix the impact on Census data quality.

  3. Delays to Census results are acceptable

The benefits to data quality must clearly outweigh the costs, particularly any delays to Census results.

  4. Owners of the administrative data are supportive

Before using administrative data to fill gaps in the Census, the owners of the administrative data (like the Australian Taxation Office) must agree to us using their data for this reason.

  5. Transparency and keeping the public's trust

We are transparent about how we might use administrative data in the Census and the benefits.

We must assess and limit any impacts on privacy before we use administrative data.

How we could use administrative data

If unexpected events affect the Census, we could fill gaps using administrative data from the Australian Taxation Office, Medicare and Centrelink.

There are two steps we could use to fill any gaps in the Census information:

  1. Find any records from the Australian Taxation Office, Medicare and Centrelink systems that seem to be missing from the Census and are in the area or group of people the unexpected event has affected. The administrative data would give us at least the person's age, sex and the area where they live, which are key pieces of information from the Census.
  2. Fill in other Census information for these records using administrative data and 2016 Census data. For example, we may be able to find family relationships from Centrelink and Medicare records or fill in information from the 2016 Census that wouldn’t have changed over time, such as a person's country of birth.
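
The two steps above can be illustrated with a minimal Python sketch. It is not the ABS method: the `Person` record, the `person_id` linkage key, and the function names are all hypothetical, and real administrative and Census records would not share a ready-made identifier like this. The sketch only shows the shape of the logic: first select administrative records in the affected area that have no Census response, then carry forward a stable attribute (here, country of birth) from an earlier Census record where one exists.

```python
from dataclasses import dataclass, replace
from typing import Dict, Iterable, List, Optional, Set


@dataclass
class Person:
    person_id: str  # hypothetical linkage key, assumed for illustration only
    age: int
    sex: str
    area: str
    country_of_birth: Optional[str] = None


def find_missing(admin: Iterable[Person],
                 census_ids: Set[str],
                 affected_areas: Set[str]) -> List[Person]:
    """Step 1: admin records in an affected area with no Census response."""
    return [
        p for p in admin
        if p.area in affected_areas and p.person_id not in census_ids
    ]


def fill_from_previous(records: Iterable[Person],
                       previous: Dict[str, Person]) -> List[Person]:
    """Step 2: fill country of birth from an earlier Census record, if any."""
    filled = []
    for p in records:
        prev = previous.get(p.person_id)
        cob = p.country_of_birth
        if cob is None and prev is not None:
            cob = prev.country_of_birth
        # replace() returns a copy, so the input records are left unchanged
        filled.append(replace(p, country_of_birth=cob))
    return filled
```

Only attributes that would not change over time are carried forward in this sketch. Age and area come from the administrative record, because a value copied from an earlier Census could be out of date.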

We're also developing a possible way to fill in Indigenous status, if counts of Aboriginal and Torres Strait Islander peoples are significantly affected. We would only use administrative data for this after talking with Aboriginal and Torres Strait Islander groups.

Figure 1: Decision process guiding when to use administrative data to fill gaps in the Census

Flow diagram which steps through the process for deciding whether to continue normal output processes for the Census, or to use administrative data to fill gaps in the Census. Each step in the process relates to the principles outlined in the table.
If Census response is lower than expected, the first step is to do targeted communication and additional field work. If the Census response is still too low, then the decision process begins.

  • Step 1 - Has Census data quality been significantly affected? Is a whole area missing, or are there big gaps for important populations? Are important planning and policy decisions going to be affected? If no, continue normal output processes for the Census. If yes, continue to Step 2.
  • Step 2 - Is there admin data of high enough quality to fix the impacts? Does administrative data cover the gaps for the affected areas or populations? Does it have the right information? If no, continue normal output processes for the Census. If yes, continue to Step 3.
  • Step 3 - Do benefits outweigh the costs, particularly delays to critical results? What is the delay to critical results like population counts? Are the additional costs to data processing feasible? If no, continue normal output processes for the Census. If yes, continue to Step 4.
  • Step 4 - Do the data owners agree to us using their administrative data for this reason? Are signed agreements in place? Are data owners being kept informed of plans? If no, continue normal output processes for the Census. If yes, continue to Step 5.
  • Step 5 - Have we been transparent and are we protecting privacy? Has there been a privacy impact assessment of this approach? Has the approach been made public? If no, continue normal output processes for the Census. If yes, use administrative data to fill gaps in the Census.
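
The decision process described for Figure 1 can be read as five yes/no gates checked in order, where any "no" means normal output processes continue. The Python sketch below is an illustration only, assuming each step has already been assessed as a single yes or no. The step keys and the `decide` function are hypothetical names, and the sub-questions under each step are not modelled.

```python
from typing import Mapping

# Hypothetical keys, one per step in the decision process, in order.
STEPS = (
    "quality_significantly_affected",     # Step 1
    "admin_data_good_enough",             # Step 2
    "benefits_outweigh_costs",            # Step 3
    "data_owners_agree",                  # Step 4
    "transparent_and_privacy_protected",  # Step 5
)

NORMAL = "continue normal output processes"
USE_ADMIN = "use administrative data to fill gaps"


def decide(answers: Mapping[str, bool]) -> str:
    """Return the outcome of the five-step process.

    A missing answer is treated as "no", so administrative data is
    used only when every step has been explicitly answered "yes".
    """
    for step in STEPS:
        if not answers.get(step, False):
            return NORMAL
    return USE_ADMIN
```

Treating an unanswered step as "no" is a design choice in this sketch. It matches the stated intent that administrative data is used only if it is really needed.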