# Microdata and TableBuilder: Census of Population and Housing

Designed for complex data queries such as detailed analysis and modelling on appropriately confidentialised unit record data

## Introduction

This publication provides information about the microdata files released from the 2016 Census of Population and Housing. Included are details about the methodology, how to use the files, and the conditions of use. Data item lists and information on the quality of the microdata are also provided.

Microdata files contain the most detailed information available from a Census and comprise the confidentialised responses to individual questions on the Census Form or data derived from two or more questions. This level of detail is released with the approval of the Australian Statistician.

This publication was previously called Microdata: Census of Population and Housing, Census Sample File, 2011.

The 2016 microdata files comprise TableBuilder datasets, a Basic Confidentialised Unit Record File (CURF), and a Detailed Microdata file. Subject to limitations in the data classifications used, these files enable users to tabulate, manipulate and analyse data to their own specifications.

In particular, Detailed Microdata files contain small systematic samples of confidentialised occupied private dwellings and non-private dwellings, with their associated family and person records. The Basic CURF contains a small systematic sample of confidentialised occupied private dwellings with their associated family and person records, and a random sample of persons from all non-private dwellings together with a record for the associated non-private dwelling.

The data were collected on Census Night, 9 August 2016.

### Available products

The following microdata products are available from the 2016 Census:

• TableBuilder datasets, available via the TableBuilder portal on the ABS website.
• Detailed Microdata files, available through the ABS DataLab environment to approved users.
• 1% sample Basic CURF, available through the Microdata Download to approved users.

The Detailed Microdata files contain the full range of classifications that are available for selected person, family and household data items. The Basic CURF contains similar information, though some items are excluded or shown in less detail. Both the Basic CURF and the Detailed Microdata are available in SAS, SPSS and STATA formats.

Further information about these services, and other information to assist users in understanding and accessing ABS microdata in general, is available from the Microdata Entry Page on the ABS website.

### Apply for access

Before applying for access, users should read and familiarise themselves with the information contained in this publication and the Responsible Use of ABS Microdata, User Guide (cat. no. 1406.0.55.003).

To apply for access to TableBuilder, Detailed Microdata and/or Basic CURF files, please see How to Apply for Microdata on the ABS website.

### Further information

Further information about the microdata files can be found in this publication:

• Detailed lists of data items for TableBuilder, 5% Detailed Microdata and Basic CURF files are available under the Data downloads section.
• Information on data quality and definitions can be found under the Quality declaration section.

### Data available on request

Data obtained in the Census but not contained on TableBuilder, Detailed Microdata or Basic CURF files may be available from the ABS, on request, as statistics in tabulated form.

Subject to confidentiality and sampling variability constraints, special tabulations can be produced incorporating data items, populations and geographic areas selected to meet individual requirements. These are available on request, on a fee for service basis. Contact the National Information and Referral Service (NIRS) on 1300 135 070 or client.services@abs.gov.au for further information.

## Methodology

### Selection of sample

Data in the Census microdata files represent samples of dwelling, family and person records from the 2016 Census of Population and Housing. Systematic sampling techniques were utilised to ensure a representative sample across states and territories in each microdata file.
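As an illustration only, the general idea of a systematic sample can be sketched as follows: pick a random start, then take every k-th record from an ordered list. The frame, the ordering by state and the function name `systematic_sample` are assumptions made for this sketch, not a description of the exact ABS selection procedure.

```python
import random


def systematic_sample(records, interval, seed=None):
    """Select every `interval`-th record after a random start position."""
    rng = random.Random(seed)
    start = rng.randrange(interval)  # random start in [0, interval)
    return records[start::interval]


# Hypothetical frame of 1,000 dwellings, ordered by state so that each
# state is represented in proportion to its size.
frame = [("NSW", i) for i in range(600)] + [("VIC", i) for i in range(400)]

# A 1-in-100 selection, analogous to a 1% sample.
sample = systematic_sample(frame, interval=100, seed=1)
nsw_selected = sum(1 for state, _ in sample if state == "NSW")
```

Because the frame is ordered by state, the 10 selected dwellings split 6 to 4 between the two hypothetical states whatever the random start, which is the sense in which a systematic sample is representative across areas.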

### Detailed Microdata and Basic CURF files

The Detailed Microdata file contains a 5% sample of dwelling records, taken from occupied private dwellings and non-private dwellings, and their associated family and person records. That is, the Detailed Microdata file provides a sample of five occupied private and non-private dwelling records in every hundred from the Census, and their associated family and person records.

The 1% Basic CURF provides a sample of one private dwelling record in every hundred from the Census, and the associated family and person records. Dwellings with more than six usual residents were removed from the sample to ensure confidentiality of large dwellings. For non-private dwellings the sampling is applied to persons present, where one person in every hundred is selected and the associated dwelling records included on the file.

The data are released under the Census and Statistics Act 1905, which has provision for the release of individual level records, i.e. unit records, where the information is not likely to enable the identification of a particular person or organisation. Accordingly, there are no names or addresses on the microdata files, and other steps, including the following list of actions, are taken to maintain respondent confidentiality.

In both the Detailed Microdata and the Basic CURF:

• Records from the Other Territories, comprising Jervis Bay, Cocos (Keeling) and Christmas Islands, have been excluded from sampling, as have migratory, shipping and off-shore statistical areas; and
• Some data items that were collected in the Census have been excluded from the files.

In the Basic CURF, additional confidentiality measures were undertaken:

• Large households, i.e., with seven or more usual residents, have been replaced in the sample to ensure confidentiality of large households. A dwelling from a similar geographic region with similar size (up to six residents) was chosen via random sampling as a replacement for each large household;
• The level of detail of certain data items has been reduced by grouping, ranging or top coding values; and
• Where necessary, minor edits were made to individual records.
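The grouping, ranging and top coding mentioned above can be illustrated with a small sketch. The cap of 85 and the band width of 5 are hypothetical values chosen for the example; the actual treatments applied to each data item are those shown in the data item lists.

```python
def top_code(value, cap):
    """Replace any value above `cap` with `cap` (e.g. '85 years and over')."""
    return min(value, cap)


def to_range(value, width):
    """Replace a value with the (lower, upper) bounds of its band."""
    lower = (value // width) * width
    return (lower, lower + width - 1)


# Hypothetical single-year ages before confidentialisation.
ages = [3, 37, 64, 92]

top_coded = [top_code(a, cap=85) for a in ages]
ranged = [to_range(a, width=5) for a in ages]
```

Both treatments reduce the detail of rare values (such as very high ages) while leaving the bulk of the distribution usable for analysis.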

Given the nature of the changes made and the relatively small number of records involved, the effect on data for analysis purposes is considered negligible. These changes also mean that estimates produced from the microdata files may differ from those published in Census products (QuickStats, DataPacks, Community Profiles and TableBuilder) or DataLab output.

Data included on the microdata files comprise the key output items for the 2016 Census, including person demographics, labour force, education, family and dwelling characteristics. For a full list of available data items in TableBuilder, Detailed Microdata and Basic CURF files, please see the Data Item Lists in the Data downloads section.

### Changes from previous Census Microdata files

There have been 5 new data items included on the 2016 Detailed Microdata file and 4 new data items on the Basic CURF. These are:

• Indigenous Status (INGP) on the persons level
• Indigenous Household Indicator (INGDWTD) on the dwelling level
• Form type (FTPP) on the persons level
• Status in Employment (SIEMP), which is a new item for the 2016 Census and replaces Employment Type (EMTP), which was used in 2011 Census output.
• Type of Non-Private Dwelling (NPDD) on the dwelling level (available on the Detailed Microdata file only).

The following data items underwent changes to their classifications in the 2016 Census:

• Ancestry (ANC1P, ANC2P)
• Birthplace of Mother (BPFP)
• Birthplace of Father (BPMP)
• Income classifications for persons (INCP), family (FINF, FINASF, FIDF) and household (HIND, HINASD, HIDD, HIED)
• Religious Affiliation (RELP)
• Year of Arrival in Australia (YARP), to accommodate the years between the 2011 and 2016 Censuses.

### Estimation procedure

An estimate of the total for an item can be obtained by totalling the item for the relevant Census microdata file and then multiplying the result by 20 for the Detailed Microdata file, or by 100 for the Basic CURF. Note that this estimate of total will not correspond exactly to the total that would be obtained from the full Census, firstly because of the sampling error arising due to the microdata files containing only a sample of Census records, and secondly, in the Basic CURF, because of the exclusion of large households.
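A minimal sketch of this weighting step is shown below. The weights of 20 and 100 come from the text; the dictionary keys and the function name `estimate_total` are illustrative names only.

```python
# Weights are the inverse of the sampling fraction: 1/0.05 and 1/0.01.
WEIGHTS = {"detailed": 20, "basic_curf": 100}


def estimate_total(sample_total, file_type):
    """Scale a total observed on a microdata file up to a Census-level estimate."""
    return sample_total * WEIGHTS[file_type]


# Hypothetical sample totals for the same item on each file.
detailed_estimate = estimate_total(1234, "detailed")
curf_estimate = estimate_total(250, "basic_curf")
```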

Averages from the microdata files, such as the proportion of persons falling into a particular category, can be used as an estimate of the corresponding average in the Census. For example, the proportion of Australian born persons who are students is estimated by the proportion of students observed among Australian born persons on the microdata files. Note that if the denominator of such a proportion is known from the full Census then it can be multiplied by the estimated proportion to give an estimate of the numerator. For example, the total number of Australian born students could be estimated by multiplying the above proportion by the Australian born population. This gives an alternative estimate from using one of the microdata files (rather than counting the Australian born students on the Detailed Microdata file and multiplying by 20) that may be preferred in some circumstances, since it is more compatible with the known full-Census count.
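The two ways of estimating a total described above can be compared with a short sketch. All counts below are hypothetical, and `proportion_based_total` is an illustrative name.

```python
def proportion_based_total(sample_numerator, sample_denominator, census_denominator):
    """Estimate a total as (sample proportion) x (known full-Census denominator)."""
    proportion = sample_numerator / sample_denominator
    return proportion * census_denominator


# Hypothetical Detailed Microdata counts: 150 students among 1,000
# Australian born persons on the file.
students_in_sample = 150
australian_born_in_sample = 1000

# Hypothetical full-Census count of Australian born persons.
australian_born_census = 20500

direct_estimate = students_in_sample * 20  # weight for the 5% file
ratio_estimate = proportion_based_total(
    students_in_sample, australian_born_in_sample, australian_born_census
)
```

Here the direct estimate is 3,000 while the proportion-based estimate is about 3,075. The difference arises because the weighted sample denominator (20,000) does not exactly match the hypothetical known Census count.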

Household, family and person estimates are available for private dwellings in both Census microdata files. For non-private dwellings, person and household estimates are available in the Detailed Microdata file, but only person estimates are available in the Basic CURF because of the differing sampling methodologies. Family records are not applicable to non-private dwellings in either file.

### Reliability of estimates

The sampling error should be taken into account when interpreting estimates from the Census microdata files. A measure of the likely difference between an estimate from the Census microdata files and the corresponding full Census value is given by the standard error (SE) of the estimate. The SE indicates the extent to which an estimate might have varied by chance because only a sample of persons was included. There are about two chances in three that a sample estimate will differ by less than one SE from the full Census value, and about 19 chances in 20 that the difference will be less than two SEs. Another measure of sampling variability is the relative standard error (RSE), which is obtained by expressing the SE as a percentage of the estimate to which it refers.
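The relationship between an estimate, its SE and its RSE can be sketched as below. The estimate and SE are hypothetical numbers, and the function names are illustrative.

```python
def relative_standard_error(estimate, se):
    """RSE: the SE expressed as a percentage of the estimate."""
    return 100.0 * se / estimate


def interval(estimate, se, n_se):
    """Range within `n_se` standard errors of the estimate."""
    return (estimate - n_se * se, estimate + n_se * se)


# Hypothetical estimate and standard error.
estimate, se = 50000, 1500

rse = relative_standard_error(estimate, se)
about_two_in_three = interval(estimate, se, n_se=1)  # roughly 2 chances in 3
about_19_in_20 = interval(estimate, se, n_se=2)      # roughly 19 chances in 20
```

With these numbers the RSE is 3%, and the full Census value would be expected to lie between 47,000 and 53,000 in about 19 cases out of 20.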

Non-sampling errors may occur in any statistical collection - a full count or a sample - and should not be confused with imprecision due to sampling error, which is measured by the SE. Non-sampling errors in both Census microdata files are differences due to the exclusion of large dwellings, while in the Census as a whole there may be inaccuracies that occur because of imperfections in reporting by respondents, errors made in collection (such as when recording responses) and errors made in processing the Census data. It is not possible to quantify non-sampling error, but every effort is made to reduce it to a minimum. For the following examples, non-sampling error is assumed to be zero. In practice, the potential for non-sampling error adds to the uncertainty in the estimates that is caused by sampling variability.

#### Standard error calculation

Both Census microdata files can be treated, for the purposes of standard error calculations, as a simple random sample of dwellings from the private dwelling population. For some analytic purposes, the non-private dwelling population has only a minor influence on results, and it is sufficient to include each person counted in a non-private dwelling as a separate 'dwelling' when calculating standard errors.

##### Dwelling level estimates

Estimates of the SE of averages for dwelling-level items can be obtained using standard formulae for a simple random sample. These standard error formulae require computing the average value of an item of interest per dwelling on the Census microdata file. The formula for $$y_{A V}$$, the estimated average of an item that takes value $$y_d$$ for dwelling $$d$$ out of $$n$$ sampled dwellings in a geographic area, is:

$$y_{A V}=\frac{1}{n} \sum\limits_{d} y_{d}$$

where $$\sum\limits_{d}$$ represents summing over the $$n$$ dwellings.

The standard error estimate $$S E\left(y_{A V}\right)$$ is given by the following formula:

$$S E\left(y_{A V}\right)=\sqrt{\frac{1}{n} \frac{1}{n-1} \sum\limits_{d}\left(y_{d}-y_{A V}\right)^{2}}$$

The estimate $$y_{T O T}$$ of the total count for this item, and its corresponding SE estimate $$S E\left(y_{T O T}\right)$$, are obtained by multiplying the average per dwelling by the number of dwellings in the geographic area. The number of dwellings is approximated with minimal error by:

$$w \times n$$

where w is the weight (20 on the Detailed Microdata file and 100 on the Basic CURF) since the construction of the Census microdata file ensures proportional representation of geographic areas.

The formulae are as follows:

$$y_{T O T}=w \times n \times y_{A V}$$

$$S E\left(y_{T O T}\right)=w \times n \times S E\left(y_{A V}\right)$$

Note that the geographic area to be used in these calculations should be the smallest geographic area containing the dwellings in question. For example, estimates for a single state should use state as the geographic area.
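The dwelling level formulae above translate directly into code. The sketch below uses only the standard library; the input values are hypothetical and `dwelling_estimates` is an illustrative name.

```python
import math


def dwelling_estimates(y, w):
    """Return (y_AV, SE(y_AV), y_TOT, SE(y_TOT)) for dwelling values `y`.

    `y` holds the item value y_d for each of the n sampled dwellings in the
    geographic area, and `w` is the weight (20 or 100).
    """
    n = len(y)
    y_av = sum(y) / n
    sum_sq = sum((y_d - y_av) ** 2 for y_d in y)
    se_av = math.sqrt(sum_sq / (n * (n - 1)))
    y_tot = w * n * y_av
    se_tot = w * n * se_av
    return y_av, se_av, y_tot, se_tot


# Hypothetical item values for five sampled dwellings on the 5% file.
y_av, se_av, y_tot, se_tot = dwelling_estimates([2, 0, 1, 3, 4], w=20)
```

In practice `y` would contain only the dwellings in the smallest geographic area of interest, as noted above.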

##### Person level estimates

The above formulae can be applied to totals of persons by treating the $$y_{d}$$ as person counts within the dwelling i.e. $$y_{d}$$ is the number of persons from dwelling d with the characteristic of interest. This makes $$y_{A V}$$ the average number of persons per dwelling having this characteristic, and $$y_{T O T}$$ the total number of persons in the geographic area with this characteristic.
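One way to build the per-dwelling counts $$y_{d}$$ from person records is sketched below. `ABSHID` is the dwelling identifier described in this publication; the `STUDENT` flag, the record values and the function name are hypothetical. Note that dwellings with no persons having the characteristic must still contribute a zero.

```python
from collections import Counter


def persons_per_dwelling(dwelling_ids, person_records, has_characteristic):
    """Count, for every sampled dwelling, the persons with the characteristic."""
    counts = Counter(
        p["ABSHID"] for p in person_records if has_characteristic(p)
    )
    # Dwellings absent from `counts` have no such persons, so y_d = 0.
    return [counts.get(d, 0) for d in dwelling_ids]


# Hypothetical sample of three dwellings and five persons.
dwelling_ids = ["H1", "H2", "H3"]
persons = [
    {"ABSHID": "H1", "STUDENT": 1},
    {"ABSHID": "H1", "STUDENT": 1},
    {"ABSHID": "H2", "STUDENT": 0},
    {"ABSHID": "H3", "STUDENT": 1},
    {"ABSHID": "H3", "STUDENT": 0},
]

y_d = persons_per_dwelling(dwelling_ids, persons, lambda p: p["STUDENT"] == 1)
```

The resulting list can then be used as the $$y_{d}$$ values in the dwelling level formulae. The same approach applies to family counts, using family records in place of person records.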

##### Family level estimates

Similarly, estimates for family-level items can be obtained by treating the $$y_{d}$$ as family counts within the dwelling i.e. $$y_{d}$$ is the number of families from dwelling d with the characteristic of interest, $$y_{A V}$$ is the average number of families per dwelling having the characteristic, and $$y_{T O T}$$ is the total number of families in the geographic area with the characteristic.

##### Clustering of the person sample

For some person level variables, it may be a reasonable approximation to treat the Census microdata files as a simple random sample of persons, even though each is in fact a sample of dwellings. This would involve letting d in the above formulae indicate persons rather than dwellings, and replacing n by the number of persons in the geographic area of interest. Person level means and associated standard errors could then be obtained by a standard tabulation package applied to the person level data.

Unfortunately, doing this will typically give an underestimate of the actual SE. The extent of this underestimation depends on how clustered the variable of interest is within dwellings - that is, on how often similar values of the variable tend to occur together in the same dwelling. The understatement of standard error will be greatest for variables that are highly clustered within dwellings, such as birthplace.

For this reason, it would be appropriate when treating the Census microdata files as a sample of persons to obtain a measure of the effect of clustering for the variables being investigated. A suitable measure is the design factor (DEFT), given by the ratio of the SE calculated correctly (with dwellings as units) to the SE calculated treating persons as units. Standard errors from the person level analysis can then be adjusted by this factor.

The SE ignoring clustering will be denoted by $$S E_{p}\left(y_{T O T}\right)$$, with the subscript p indicating that it is calculated at the person level. For example, to estimate the total number of Australian born persons, take the person level Census microdata file and create a variable taking the value 1 for Australian born persons and 0 otherwise. This variable is then used to estimate the total and its SE, treating persons as the units.

An example using the 2011 Census microdata files showed that, for this total, the standard error produced ignoring clustering underestimated the actual standard error by a factor of 2. Users could expect that other totals (e.g. for geographic regions) for the variable 'Australian-born' would have a similar design factor.
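The design factor can be computed by applying the same SE formula twice, once with dwellings as units and once with persons as units. The sketch below uses deliberately extreme toy data in which every person in a dwelling shares the same value; the data and function names are hypothetical.

```python
import math


def se_of_total(values, w):
    """SE of an estimated total, treating `values` as a simple random sample."""
    n = len(values)
    mean = sum(values) / n
    sum_sq = sum((v - mean) ** 2 for v in values)
    return w * n * math.sqrt(sum_sq / (n * (n - 1)))


def design_factor(dwellings, w):
    """DEFT: SE with dwellings as units divided by SE with persons as units."""
    y_d = [sum(d) for d in dwellings]            # count per dwelling
    persons = [z for d in dwellings for z in d]  # one 0/1 value per person
    return se_of_total(y_d, w) / se_of_total(persons, w)


# Four hypothetical dwellings of four persons; 1 = Australian born.
dwellings = [[1, 1, 1, 1], [0, 0, 0, 0], [1, 1, 1, 1], [0, 0, 0, 0]]
deft = design_factor(dwellings, w=100)
```

For this fully clustered toy sample the design factor is the square root of 5 (about 2.24), so a person level SE would need to be multiplied by that factor. Real variables are usually less clustered than this.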

##### Proportions

Simple approximations can be used to estimate the standard error for a ratio of counts. If $$y_{T O T_{1}}$$ and $$y_{T O T_{2}}$$ are estimated totals for two nested categories (i.e. category 2 is a subset of category 1) then writing

$$R S E\left(y_{T O T}\right)=\frac{S E\left(y_{T O T}\right)}{y_{T O T}}$$

for the relative standard error gives the following approximation:

$$R S E\left(\frac{y_{T O T_{2}}}{y_{T O T_{1}}}\right)=\sqrt{R S E\left(y_{T O T_{2}}\right)^{2}-R S E\left(y_{T O T_{1}}\right)^{2}}$$

This formula depends on the two categories being nested, and should not be used for distinct categories.
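A minimal sketch of this approximation follows. The RSE values are hypothetical, and the check for a negative value under the square root is an addition made for this example rather than part of the published method.

```python
import math


def rse_of_proportion(rse_subset, rse_total):
    """Approximate RSE of y_TOT2 / y_TOT1, where category 2 is nested in 1."""
    diff = rse_subset ** 2 - rse_total ** 2
    if diff < 0:
        # For nested categories the subset normally has the larger RSE.
        raise ValueError("RSE of the subset is smaller than RSE of the total")
    return math.sqrt(diff)


# Hypothetical RSEs expressed as fractions: 5% for the subset total
# (category 2) and 3% for the enclosing total (category 1).
rse_ratio = rse_of_proportion(0.05, 0.03)
```

With these inputs the RSE of the proportion is approximately 4%.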

##### Differences

If two totals are for distinct categories (e.g. in comparing estimates across states), then the difference between two totals has the following SE approximation:

$$S E\left(y_{T O T_{2}}-y_{T O T_{1}}\right)=\sqrt{S E\left(y_{T O T_{2}}\right)^{2}+S E\left(y_{T O T_{1}}\right)^{2}}$$

While this formula will only be exact for differences between separate and uncorrelated (unrelated) characteristics or sub-populations, it is expected to provide a good approximation for most differences likely to be of interest.
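The difference formula can be applied as in the sketch below, which also expresses the difference in units of its SE. The totals and SEs are hypothetical.

```python
import math


def se_of_difference(se_1, se_2):
    """Approximate SE of the difference between two distinct totals."""
    return math.sqrt(se_1 ** 2 + se_2 ** 2)


# Hypothetical estimated totals for two states, with their SEs.
total_1, se_1 = 11000, 400
total_2, se_2 = 12000, 300

difference = total_2 - total_1
se_diff = se_of_difference(se_1, se_2)
difference_in_ses = difference / se_diff
```

Here the difference of 1,000 equals two SEs (SE of 500), which is at the boundary of the "19 chances in 20" range described under Reliability of estimates.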

##### Regression estimates

One use of the sample file will be to examine relationships between variables using regression methods. By treating the dwelling as the sample unit, standard regression packages can be used unweighted and the resulting standard errors and test statistics will be good estimates. For example, a regression model could be derived for $$y_{i}$$, the number of persons in the dwelling needing assistance with core activities, against various characteristics $$x_{1 i}, x_{2 i}, \ldots, x_{k i}$$ such as $$x_{1 i}$$, the number of persons in the dwelling aged over 65 years, to fit the linear regression model:

$$y_{i}=a+b_{1} x_{1 i}+\ldots+b_{k} x_{k i}$$

Measures of model fit and of significance of the parameters $$a, b_{1}, \ldots, b_{k}$$ from the standard package will then be appropriate. Unfortunately, such a linear model may not adequately describe the relationships between variables at a dwelling level.
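As a sketch of a dwelling level regression, the following fits the model with a single explanatory variable by ordinary least squares, using one record per dwelling. The data are hypothetical, and a real analysis would normally use a statistical package with several explanatory variables.

```python
import math


def simple_ols(x, y):
    """Fit y = a + b*x by least squares; return (a, b, SE of b)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    se_b = math.sqrt(rss / (n - 2) / sxx)  # residual variance / Sxx
    return a, b, se_b


# One value per sampled dwelling (hypothetical):
# x = persons in the dwelling aged over 65 years,
# y = persons in the dwelling needing assistance with core activities.
x = [0, 1, 2, 0, 1, 2]
y = [0, 1, 1, 0, 0, 2]

a, b, se_b = simple_ols(x, y)
```

Because each observation is a dwelling, the sample unit matches the unit of analysis and the usual standard error of the slope is appropriate.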

If a similar regression is performed treating person as the sample unit, the resulting standard errors and measures of significance could be inaccurate or misleading. This arises because the persons in the sample are clustered within dwellings, and so their responses may be "correlated" or affected by similar influences such as characteristics of the dwelling. The extent to which the measures of significance are affected will depend on how clustered the variable $$y_{i}$$ is likely to be within dwellings.

If a person level analysis is performed, such as a 'logistic analysis' of the probability of a person having a given characteristic, then the effect of clustering should be taken into account when interpreting the outcomes. In particular, SEs are likely to be understated, as discussed in the section Clustering of the person sample, and this will tend to increase the apparent significance of modelled effects.

Techniques are available to perform valid analyses at the person level for a sample that is clustered within dwellings, treating persons as being subject to both person and dwelling effects. These techniques include 'multi-level', 'random effect' and 'mixed' modelling (see footnotes 1 and 2).

By using these techniques, models can be used that do a better job of describing the actual relationships between variables at both person and dwelling level. Statistical packages are widely available to validly perform such analyses.

### Footnotes

1. Goldstein, H. and Arnold, E., 1995, 'Multilevel Statistical Models', 2nd ed., Halsted Press, New York.
2. Snijders, Tom A. B. and Bosker, Roel J., 1999, 'Multilevel analysis: an introduction to basic and advanced multilevel modelling', SAGE, London.

## Using the Basic CURF

The full classification structures for all Basic CURF data items can be found in the Census Dictionary, 2016 (cat. no. 2901.0).

Many of the classifications in the Basic CURF have been collapsed and the full listings of the Basic CURF classifications are detailed in the Data items lists in the Data downloads section.

### Identifiers

#### Dwelling, Family and Person IDs

Each record level has been given an identifier:

• Dwelling (Household) - ABSHID
• Family - ABSFID
• Person - ABSPID

To enable users to link records, the following Identifiers are available across levels:

• ABSFID and the related ABSHID on each Family record
• ABSPID and the related ABSFID and ABSHID on each Person record.
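Linking across levels using these identifiers might look like the sketch below. `ABSHID`, `ABSFID` and `ABSPID` are the identifiers listed above; the other field names and all values are hypothetical, as is the assumption that a person without a matching family record should simply be linked to the dwelling alone.

```python
def link_person(person, families, dwellings):
    """Combine a person record with its family and dwelling records."""
    dwelling = dwellings[person["ABSHID"]]
    # Assumption: some persons may have no matching family record.
    family = families.get(person["ABSFID"], {})
    return {**dwelling, **family, **person}


# Hypothetical records, keyed by their own identifier.
dwellings = {"H1": {"ABSHID": "H1", "DWELLING_ITEM": "a"}}
families = {"F1": {"ABSFID": "F1", "ABSHID": "H1", "FAMILY_ITEM": "b"}}
person = {"ABSPID": "P1", "ABSFID": "F1", "ABSHID": "H1", "PERSON_ITEM": "c"}

linked = link_person(person, families, dwellings)
```

The linked record carries dwelling, family and person items together, which is the form needed for analyses that combine characteristics from more than one level.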

#### Dwelling Indicator for Persons

The DWIP (Dwelling Indicator for Persons) variable was introduced in 2006 as a way of enabling users of the microdata files to more easily distinguish between those people enumerated in private dwellings and those enumerated in non-private dwellings (without the need to link to the household file). This variable was applied in 2011 and is included in 2016 as well.

The DWIP variable applies to all persons enumerated in an occupied private dwelling or non-private dwelling. Categories are:

1. Enumerated in an occupied private dwelling
2. Enumerated in a non-private dwelling.

As migratory, off-shore and shipping areas were not included in the sample, there is no 'Not applicable' category for this variable.
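As an illustration, persons can be tallied by DWIP directly from a person level CSV file without linking to the dwelling file. The sketch assumes that the column is named `DWIP` and stores the category codes 1 and 2 as shown above; the in-memory sample data stand in for a real file.

```python
import csv
import io
from collections import Counter

DWIP_LABELS = {
    "1": "Enumerated in an occupied private dwelling",
    "2": "Enumerated in a non-private dwelling",
}


def count_by_dwip(csv_file):
    """Count person records in each DWIP category of an open CSV file."""
    reader = csv.DictReader(csv_file)
    return Counter(row["DWIP"] for row in reader)


# Hypothetical person level extract with three records.
sample_csv = io.StringIO("ABSPID,DWIP\nP1,1\nP2,1\nP3,2\n")
counts = count_by_dwip(sample_csv)
labelled = {DWIP_LABELS[code]: n for code, n in counts.items()}
```

For a real file, an open file object (for example from `open(path, newline="")`) would be passed in place of the `io.StringIO` object.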

#### Geography

The Basic CURF contains information on the geographic area of selected dwellings. For 2016, geographic areas in the Basic CURF are based on the Australian Statistical Geography Standard (ASGS).

To ensure that the information on the file is not likely to enable identification of a person or household, all areas are defined using a minimum population size of 250,000 persons from the full Census (except for the Northern Territory, which has a total population of 228,833 persons). Records are randomly ordered within a region to further reduce the likelihood of individual identification.

All regions can be aggregated to the state level as in the 2011 files.

Geographic regions are formed from Statistical Area Level 4 areas and form the basis of the following data items:

• AREAENUM (Area of enumeration);
• REGUCP (Region of usual residence on Census night);
• REGU1P (Region of usual residence 1 year ago); and
• REGU5P (Region of usual residence 5 years ago).

A full list of regions is included in the Data Item List.

#### Files and file structures

Dwelling, Family and Person Level files are available in the following formats:

• CSV in a comma delimited ASCII text format;
• SAS for Windows;
• SPSS for Windows; and
• STATA.

## Using the Detailed Microdata in the DataLab

Detailed Microdata files are the ABS's most detailed unit record data and have been designed specifically for use within the DataLab environment. A 5% sample of person, family and household unit record data from the 2016 Census of Population and Housing has been released as Detailed Microdata files into the ABS's DataLab environment.

The full listing of the Detailed Microdata classifications and the corresponding Census classifications are detailed in the Data Item Lists in the Data downloads section. In some cases these will differ marginally.

Further information about Census data items can be found in the Census Dictionary, 2016 (cat. no. 2901.0). For information about response rates and Census data quality, please visit the Understanding the Census and Census Data (cat. no. 2900.0) publication.

### Identifiers

#### Dwelling, Family and Person IDs

Each record level has been given an identifier:

• Dwelling (Household) - ABSHID
• Family - ABSFID
• Person - ABSPID.

To enable users to link records, the following Identifiers are available across levels:

• ABSFID and the related ABSHID on each Family record
• ABSPID and the related ABSFID and ABSHID on each Person record.

#### Dwelling Indicator for Persons

The DWIP (Dwelling Indicator for Persons) variable was introduced in 2006 as a way of enabling users of the Census microdata file to more easily distinguish between those people enumerated in private dwellings and those enumerated in non-private dwellings (without the need to link to the household file). This variable was applied in 2011 and in 2016 as well.

The DWIP variable applies to all persons enumerated in an occupied private dwelling or non-private dwelling. Categories are:

1. Enumerated in an occupied private dwelling
2. Enumerated in a non-private dwelling.

As migratory, off-shore and shipping areas were not included in the sample, there is no 'Not applicable' category for this variable.

#### Geography

The Detailed Microdata file contains information on the geographic area of selected dwellings and each person's usual residence geographies. For 2016, geographic areas in the file have been based on the Australian Statistical Geography Standard (ASGS).

A list of the geographic data items available in the Detailed Microdata file is available in the Data Item List in the Data downloads section.

#### Files and file structures

##### CSV

These files contain the data in a comma delimited ASCII text format.

• CDM16_dwelling.csv contains the Dwelling level data
• CDM16_family.csv contains the Family level data
• CDM16_person.csv contains the Person level data

##### SAS

These files contain the data in SAS for Windows format:

• CDM16_dwelling.sas7bdat contains the Dwelling level data
• CDM16_family.sas7bdat contains the Family level data
• CDM16_person.sas7bdat contains the Person level data

##### SPSS

These files contain the data in SPSS for Windows format:

• CDM16_dwelling.sav contains the Dwelling level data
• CDM16_family.sav contains the Family level data
• CDM16_person.sav contains the Person level data

##### STATA

These files contain the data in STATA format:

• CDM16_dwelling.dta contains the Dwelling level data
• CDM16_family.dta contains the Family level data
• CDM16_person.dta contains the Person level data

##### Information files

This file is a SAS library containing formats:

• FORMATS.sas7bcat

## Using TableBuilder for Census data

TableBuilder is an online data tool in which you can create tables, graphs and maps of ABS microdata. It is designed to help you produce data specific to your needs through a flexible online user interface. There are two TableBuilder systems: Census TableBuilder and TableBuilder for all other datasets.

Within Census TableBuilder, you can:

• construct tables of Census data for a range of geographic areas, including small area geographies like Postcodes or SA2s
• display data by counts or percentages
• view and export data as graphs and thematic maps in a variety of formats, including PDF and KMZ files
• create and save customised geographic areas and data items (recodes).

A list of the data items available for Census TableBuilder can be found in the Data downloads section.

System restrictions have been implemented which prevent the cross-tabulation of certain data items within the following 2016 Census Pro datasets:

• 2016 Census - Counting Persons, Place of Enumeration
• 2016 Census - Counting Families, Place of Enumeration
• 2016 Census - Counting Persons, Estimating Homelessness
• 2016 Experimental Index of Household Advantage and Disadvantage - Counting Persons, Place of Enumeration
• 2016 Experimental Index of Household Advantage and Disadvantage - Counting Families, Place of Enumeration

These restrictions have been applied to:

• maintain the confidentiality of respondents
• ensure the output of quality data
• assist users by not allowing combinations of data items that statistically should not be combined.

When the restriction is triggered the following error message will be displayed: "These variables cannot be used together". Other similar data items may be available. For example, if you are using Geographical Areas from Mesh Block (MBs), you may be able to use another Geographical Area data item instead, such as Main Statistical Area Structure (Main ASGS).

Apply for access to Basic or Pro Census TableBuilder.

To access the free and charged versions of Census TableBuilder, visit the TableBuilder page on the Census website. This page also contains short video tutorials and links to the TableBuilder User Guide and the Census Dictionary, 2016 (cat. no. 2901.0) to help users make the most of this tool.


## Previous releases

• Census of Population and Housing, 2011: TableBuilder, Basic microdata, Detailed microdata
• Census of Population and Housing, 2006: TableBuilder, Basic microdata, Detailed microdata
• Census of Population and Housing, 2001: Basic microdata, Detailed microdata
• Census of Population and Housing, 1996: Basic microdata
• Census of Population and Housing, 1991: Basic microdata
• Census of Population and Housing, 1986: Basic microdata
• Census of Population and Housing, 1981: Basic microdata

## History of changes


##### 29/10/2019

2016 Experimental Index of Household Advantage and Disadvantage (IHAD) datasets made available via Census TableBuilder Pro. Release includes supportive changes to 'Introduction' and 'Using TableBuilder for Census Data' chapters, as well as the 'TableBuilder Guest, Basic and Pro Data Items List' in the Data downloads section.

##### 23/08/2019

Additional content: Census TableBuilder Pro system restrictions now included in the 'Using TableBuilder for Census Data' chapter. Changes also made to the 'TableBuilder Guest, Basic and Pro Data Items List' in the Data downloads section.

##### 11/04/2019

Basic CURF made available via Microdata Downloads. Release includes textual changes relating to sampling methodology and availability of Microdata products.

##### 10/01/2018

Updates to expected Basic CURF release date and minor corrections to Detailed Microdata data item list.

## Quality declaration

### Institutional environment

The microdata products addressed in this publication are released in accordance with the conditions specified in the Statistics Determination section of the Census and Statistics Act 1905 (CSA), noting that the Census and Statistics (Information Release and Access) Determination 2018 came into effect on 15 August 2018 and has replaced the Statistics Determination 1983. This ensures that confidentiality is maintained whilst enabling unit record level data to be released. More information on the confidentiality practices associated with microdata can be found on the About CURF Microdata or Detailed Microdata page.

For information on the institutional environment of the Australian Bureau of Statistics (ABS), including the legislative obligations of the ABS, please see ABS Legislative Framework.

### Relevance

Microdata from the 2016 Census of Population and Housing are available as TableBuilder datasets, 5% Detailed Microdata files and a 1% Basic CURF. These microdata files are the most detailed information available about key characteristics of people in Australia on Census night, and are released to support advanced data analysis. These characteristics are generally responses to individual questions on the Census form or data derived from two or more questions.

### Timeliness

The Census and Statistics Act 1905 requires the Australian Statistician to conduct a Census on a regular basis. Since 1961, a Census has been required every five years. The 2016 Census was the 17th national Census, and was held on 9 August 2016. Microdata products in recent times are usually released within three years of the collection of Census data.

### Accuracy

The microdata files generally contain finer levels of detail of data items than what is otherwise published in other formats, for example in 2016 Community Profiles. For more information on the level of detail provided, see the associated data item listings for individual microdata products.

Steps to confidentialise the data made available on the microdata files are taken in such a way as to maximise the usefulness of the data while maintaining the confidentiality of respondents. As a result, it may not be possible to exactly reconcile all the statistics produced from the microdata with other published statistics.

### Coherence

It is important for Census microdata to be comparable and compatible with previous Censuses and related survey or administrative data sources. However:

• There are differences regarding how the sample has been created in relation to larger households in different Census years.
• The product types have changed in 2016 in response to the evolving institutional environment, where a Detailed Microdata file instead of an Expanded CURF was released. This enables more detailed information to be provided for Census data items compared to previous Census years.
• The classifications used for Census data topics change over time.
• Geographic areas on the 2016 Census microdata files are based on the Australian Statistical Geography Standard (ASGS), which replaced the Australian Statistical Geography Classification (ASGC) used in previous microdata files.

### Interpretability

The information within this publication should be referred to when using the microdata products. It explains the sample methodology, use of the microdata files, file structure, the data item lists, and changes over time.

The Census Dictionary, 2016 (cat. no. 2901.0) and Understanding the Census and Census Data (cat. no. 2900.0) include information on the Census objectives, methods and design, content, data quality and interpretation, output data items, information about the availability of results and comparability with previous surveys.

### Accessing the data

Microdata files are available to approved users. Users wishing to access the microdata files should read the How to Apply for Microdata web page before applying for access. Users should also familiarise themselves with information available via the Microdata Entry Page.

A full list of available microdata can be viewed via the Available Microdata page. More detail regarding types and modes of access to microdata can be found on the Compare access options page.