2901.0 - Census of Population and Housing: Census Dictionary, 2016

ARCHIVED ISSUE Released at 11:30 AM (CANBERRA TIME) 23/08/2016


Introduced random error

Under the Census and Statistics Act 1905 it is an offence to release any information collected under the Act that is likely to enable identification of any particular individual or organisation. Introduced random error is used to ensure that no data are released that could risk the identification of individuals in the statistics.

Many classifications used in ABS statistics have an uneven distribution of data throughout their categories. For example, the numbers of people who are Anglican or who were born in Italy are quite large (3,679,907 and 185,403 respectively in 2011), while the numbers of people who are Buddhist or who were born in Chile are relatively small (528,981 and 24,937 respectively in 2011). When religion is cross-classified with country of birth, the number in the table cell who are Anglican and who were born in Italy could be small, and the number of Buddhists born in Chile even smaller. These small numbers increase the risk of identifying individuals in the statistics.

Even when variables are more evenly distributed in the classifications, the problem still occurs. The more detailed the classifications, and the more of them that are applied in constructing a table, the greater the incidence of very small cells.
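The arithmetic behind this can be sketched as follows. The sketch assumes a hypothetical population of 100,000 people and hypothetical classifications with 10, 20 and 50 categories; the function name mean_cell_size is illustrative only and is not part of any ABS tool. It shows that the number of cells is the product of the category counts, so the average cell size falls quickly as classifications are added.

```python
import math


def mean_cell_size(population: int, category_counts: list[int]) -> float:
    """Average number of people per cell when the classifications are crossed."""
    n_cells = math.prod(category_counts)  # cells multiply with each classification
    return population / n_cells


# Hypothetical figures, chosen only to illustrate the trend.
population = 100_000
print(mean_cell_size(population, [10]))          # 10000.0
print(mean_cell_size(population, [10, 20]))      # 500.0
print(mean_cell_size(population, [10, 20, 50]))  # 10.0
```

This is an average over evenly spread data; with uneven distributions many cells would be far smaller than the average.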

Care is taken in the specification of tables to minimise the risk of identifying individuals. In addition, a technique has been developed to randomly adjust cell values. Random adjustment of the data is considered to be the most satisfactory technique for avoiding the release of identifiable Census data. When the technique is applied, all cells are slightly adjusted to prevent any identifiable data being exposed. These adjustments result in small introduced random errors. However, the information value of the table as a whole is not impaired. The technique allows very large tables, for which there is a strong client demand, to be produced even though they contain numbers of very small cells.

The counts and totals in summary tables are subjected to small adjustments. These adjustments may cause the sum of rows or columns to differ by small amounts from table totals. The counts are adjusted independently in a controlled manner, so the same information is adjusted by the same amount wherever it appears. However, counts in tables at higher geographic levels may not equal the sum of the counts in the tables for the component geographic units.
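A minimal sketch of this behaviour follows. It is not the ABS method, which is not described here; it assumes a toy scheme in which each cell's adjustment is derived from a hash of a hypothetical cell key and is bounded by a hypothetical MAX_SHIFT of 2. The counts and area names are invented for illustration.

```python
import hashlib

MAX_SHIFT = 2  # hypothetical bound on the adjustment, chosen for illustration


def adjust(cell_key: str, count: int) -> int:
    """Return count plus a small shift derived deterministically from cell_key."""
    digest = hashlib.sha256(cell_key.encode("utf-8")).digest()
    shift = digest[0] % (2 * MAX_SHIFT + 1) - MAX_SHIFT  # integer in [-2, 2]
    return max(0, count + shift)  # never publish a negative count


# Hypothetical true counts for three component areas and their total.
components = {"area=A": 14, "area=B": 9, "area=C": 23}
true_total = sum(components.values())

adjusted_components = {key: adjust(key, n) for key, n in components.items()}
adjusted_total = adjust("area=A+B+C", true_total)

# The same cell receives the same adjustment in whichever table it appears.
assert adjust("area=A", 14) == adjusted_components["area=A"]

# The total is adjusted in its own right, so the adjusted components
# need not sum to the adjusted total.
gap = sum(adjusted_components.values()) - adjusted_total
print(adjusted_components, adjusted_total, gap)
```

Because the total is adjusted separately rather than recomputed from the adjusted components, the gap can be non-zero, which mirrors the small discrepancies described above.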

It is not possible to determine which individual figures have been affected by random error adjustments, but the small variance which may be associated with derived totals can, for the most part, be ignored.

No reliance should be placed on small cells, as they are affected by random adjustment, respondent error and processing error.

Many different classifications are used in Census tables and the tables are produced for a variety of geographical areas. The effect of the introduced random error is minimised if the statistic required is taken directly from a tabulation rather than derived by aggregating more finely classified data. Similarly, rather than aggregating data from small areas to obtain statistics about a larger standard geographic area, published data for the larger area should be used wherever possible.
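The reason for this advice can be illustrated with a small simulation. It is a sketch under stated assumptions, not a model of the actual ABS adjustment: each published figure is assumed to carry an independent integer error drawn uniformly from -2 to 2 (the hypothetical MAX_SHIFT), and the seed and cell count are arbitrary. It compares the error in one directly published total with the accumulated error from summing 50 small-area cells.

```python
import random
import statistics

MAX_SHIFT = 2  # hypothetical bound on each adjustment
rng = random.Random(2016)  # fixed seed so the run is repeatable


def shift() -> int:
    """One toy adjustment: an integer drawn uniformly from [-MAX_SHIFT, MAX_SHIFT]."""
    return rng.randint(-MAX_SHIFT, MAX_SHIFT)


def simulate(n_cells: int, trials: int = 5000) -> tuple[float, float]:
    """Return the spread of error for a direct total and for a sum of n_cells cells."""
    direct_errors = [shift() for _ in range(trials)]
    summed_errors = [sum(shift() for _ in range(n_cells)) for _ in range(trials)]
    return statistics.pstdev(direct_errors), statistics.pstdev(summed_errors)


direct_sd, summed_sd = simulate(n_cells=50)
print(f"direct total: sd of error ~ {direct_sd:.2f}")
print(f"sum of 50 cells: sd of error ~ {summed_sd:.2f}")
```

Under these assumptions the error in the aggregated figure grows roughly with the square root of the number of cells summed, while the directly published figure carries only a single adjustment.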

When calculating proportions, percentages or ratios from cross-classified or small area tables, the random error introduced can be ignored except when very small cells are involved, in which case the impact on percentages and ratios can be significant.
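The point about very small cells can be made concrete with simple arithmetic. The sketch below assumes a hypothetical adjustment of 2 applied to cells of different sizes; the function name relative_impact and the cell sizes are illustrative only.

```python
def relative_impact(count: int, shift: int) -> float:
    """Size of the adjustment relative to the cell count."""
    return abs(shift) / count


# The same hypothetical adjustment of 2 applied to cells of different sizes.
for count in (5, 50, 5000):
    print(count, f"{relative_impact(count, 2):.2%}")
# 5 40.00%
# 50 4.00%
# 5000 0.04%
```

An adjustment that is negligible for a cell of 5,000 changes a cell of 5 by 40 per cent, so any percentage or ratio built on the small cell inherits that uncertainty.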

See also Confidentiality.