Using administrative data to improve the Census count

Improving how we do things so we get a high-quality 2021 Census count

Released
11/06/2021

We are using administrative data to help assess which houses were empty on Census night and to help improve the Census count.

We work hard to collect data from every house that was occupied on Census night. We still need to adjust the count for a small number of houses where someone was home but we didn't receive a Census form.

We do this using a process called imputation. Imputation is where we copy basic Census information (number of people with their age and sex) from another similar household to represent the missed people.
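The copying step described above can be sketched in a few lines of Python. This is only an illustration, not our production method: the `Person` class, the `impute_household` function and the sample households are hypothetical, and the sketch assumes the list of similar responding households has already been worked out.

```python
import random
from dataclasses import dataclass


@dataclass
class Person:
    age: int
    sex: str


def impute_household(similar_households, rng):
    """Copy the age and sex of each person in one donor household.

    Assumption: `similar_households` already holds only responding
    households judged similar to the house with no form.
    """
    donor = rng.choice(similar_households)
    # Copy the basic records rather than reusing the donor's own objects.
    return [Person(p.age, p.sex) for p in donor]


# Hypothetical responding households (illustrative values only).
responding = [
    [Person(34, "F"), Person(36, "M")],
    [Person(71, "F")],
]

imputed = impute_household(responding, random.Random(0))
```

The imputed records stand in for the missed people. They carry only the number of people with their age and sex, as described above.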

We know from looking at results from the 2016 Post Census Review that, despite our best efforts, we sometimes impute data for a house that was actually empty on Census night.  If this happens a lot in an area, it can make the counts for that area larger than they should be.

Our research also shows that when we adjust the Census count using imputation, we tend to count more older Australians than we should. You can read more about this on our page "Can administrative data help to improve the Census count?"

Improvements using administrative data

Using administrative data will improve our Census count in two ways: 

  1. It helps us to better assess whether a house was empty on Census night.  This helps to make sure that we don’t over-adjust Census counts. This is especially important for areas where it is harder to assess what is occupied and what isn't, such as inner-city areas with lots of apartment buildings.
  2. Administrative data helps us to choose a donor or representative household where the people are more similar in age to those who were missed.  This makes sure that the adjustments we make to the count don't include too many older Australians.

Assessing which houses were empty

After we have collected Census data, we look at the small number of houses where we haven't received a Census form and where Census field staff aren't able to work out if the house was occupied. (In the 2016 Census, this was about 3% of all houses.)

We need to decide if these houses were empty on Census night, or if we should adjust the Census count to cover people who were missed.  In the 2021 Census, we will use administrative data to help with this decision.

First, we create indicators of occupancy from administrative data for the houses where we need to make a decision.  Some of these are specific to a house, while some cover a group of houses across a wider area.  We store all administrative data safely and securely and we always keep personal data confidential by storing it without address information attached.

The administrative data we use to create indicators includes:

  • data from government services like Medicare, Centrelink and the Australian Taxation Office
  • information on electricity use from energy distributors
  • information from our Address Register
  • data from the last Census.

Table 1 below shows the types of indicators we use to assess whether houses were empty or occupied. Indicators we create from government data and our Address Register are for individual houses. Indicators we create from electricity use and the last Census are for a group of houses.

We don't use electricity data for individual houses. The 2021 Census Administrative Data Privacy Impact Assessment recommended using electricity data cautiously. So we will only use this information at an area level for the 2021 Census.

Although this information is less exact when we use it for an area, it still gives us an important sign of whether homes are occupied or not because it's more up to date than government data. Government data is updated more slowly and is a better sign of whether a house is empty or occupied over the long term.

Table 1: The types of indicators we use to assess if a house was empty or occupied
Individual house
  Indicators we use:
  • The possible number, sex, and age group of people living in the house.
  • Whether the people who probably live in the house recently used a government service.
  • The type of house, such as a stand-alone house or an apartment.
  Data source:
  • Medicare
  • Centrelink
  • Australian Taxation Office
  • Our Address Register

A group of houses across a wide area
  Indicators we use:
  • An estimated occupancy rate at Census time, based on electricity data, for houses in the area where we don't know if they were occupied or not.
  • The estimated occupancy rate across all houses in the area from the 2016 Census.
  Data source:
  • Electricity use
  • 2016 Census

We use these indicators to assess whether houses that didn’t send us a form were empty or occupied. For example, if Centrelink data shows that someone in the house recently received a government payment, we will probably decide it's occupied.  On the other hand, if the house is in an area where electricity use shows that not many houses are occupied, we will probably decide it's empty.

We have developed a set of rules (a statistical model) based on the indicators in Table 1, using 2016 data. These rules decide whether a house was occupied, without our staff looking at the indicators.
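To give a feel for how rules like these can work, here is a minimal Python sketch built only on the two examples above (recent use of a government service, and the occupancy rate for the area). It is not our statistical model: the `Indicators` class, the `probably_occupied` function and the 0.5 cut-off are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Indicators:
    # House-level indicator, created from government data.
    recent_service_use: bool
    # Area-level indicator (0 to 1), for example from electricity use.
    area_occupancy_rate: float


# Hypothetical cut-off, chosen only for illustration.
AREA_OCCUPANCY_CUTOFF = 0.5


def probably_occupied(ind: Indicators) -> bool:
    """Return True if the house was probably occupied on Census night."""
    # Recent use of a government service is a strong sign of occupancy.
    if ind.recent_service_use:
        return True
    # Otherwise fall back on how many houses in the area look occupied.
    return ind.area_occupancy_rate >= AREA_OCCUPANCY_CUTOFF


# Someone recently received a payment, so the house is probably occupied.
print(probably_occupied(Indicators(True, 0.2)))
# No recent service use and a low area rate, so the house is probably empty.
print(probably_occupied(Indicators(False, 0.2)))
```

A real model would weigh all of the indicators in Table 1 together, rather than checking them one at a time as this sketch does.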

When we use the new rules on 2016 Census data, it shows that we would have decided that another 1.7% of houses were empty. This would have reduced the Census count of people by about 2%. (Our official population estimates wouldn't have changed because we create those using a separate process.) These new numbers match closely with estimates from the 2016 Post Census Review. This makes us confident that we're improving the Census counts.

We expect to see bigger decreases in Census counts in inner-city areas. There are lots of high-rise apartments in these areas and it's harder to tell if the apartments are empty. If we had used this new method in 2016, counts for some of these areas would have been 5–10% lower.

Adjusting the Census count

For the 2021 Census, we will also use administrative data to help choose donor houses with people who are similar in age to the people who were missed.

For houses we decide were occupied on Census night, we first work out the number and ages of people at the house from government data (see Table 1 above). We then use this information to help choose a donor house from the Census that has people of the right age.

For example, administrative data may show that two people aged 30–34 live in a house that we decided was occupied. We then choose a donor house from the Census where administrative data also shows that there were two people aged 30–34. We copy the Census counts from this donor house across for the missed people. Matching on age in this way should give us counts of people in the right age group, which choosing a house in the same area at random would not.
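The age-matching step in this example can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not our actual method: the function names, the five-year age groups used for matching, the exact-match rule and the sample data are all hypothetical.

```python
import random
from collections import Counter


def age_group(age):
    """Five-year age group label, for example 32 -> '30-34'."""
    low = (age // 5) * 5
    return f"{low}-{low + 4}"


def admin_profile(admin_ages):
    """Number of people in each age group, according to administrative data."""
    return Counter(age_group(a) for a in admin_ages)


def choose_donor(missed_admin_ages, candidates, rng):
    """Pick a donor whose administrative age profile matches the missed house.

    `candidates` is a list of (admin_ages, census_records) pairs for houses
    that returned a Census form. Returns a copy of the donor's Census
    records, or None if no candidate matches.
    """
    target = admin_profile(missed_admin_ages)
    matches = [
        census for admin_ages, census in candidates
        if admin_profile(admin_ages) == target
    ]
    if not matches:
        return None
    return list(rng.choice(matches))


# Hypothetical data: administrative data shows two people aged 30-34
# at the house that didn't return a form.
candidates = [
    ([30, 34], [{"age": 30, "sex": "F"}, {"age": 34, "sex": "M"}]),
    ([72], [{"age": 72, "sex": "F"}]),
]
donor_records = choose_donor([31, 33], candidates, random.Random(0))
```

Note that the records copied across are the donor's Census records, not the administrative data for the missed house. Administrative data is only used to decide which donor to choose.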

We don’t use the administrative data counts as direct replacements for people missed in the house.  This is something we may think about in the future, but we need to do further research to understand if this would be appropriate.