Comparison Pitfalls
On this page:
> Different sources
> Different definitions
> Changes to the data set
> Correlating information
> Results lack variation
Take care when making comparisons. A valid comparison is like with like: 'oranges and oranges', not 'apples and oranges'.
Different sources
Be wary of comparing data from different sources. Consider whether the data sets are in fact comparable.
Example 1
Results from the 2006 Census regarding unpaid child care cannot be directly compared with the results of the ABS Child Care Survey, because the two collections covered children of different ages over different periods. The Census question referred to care provided for children aged under 15 years in the two weeks prior to the Census, while the Child Care Survey only included children aged under 13 years during a single reference week.
Example 2
ABS and Centrelink both collect information about unemployed persons, but the data sets are not comparable. ABS unemployed are defined by activity: they are people who are without work, have been actively seeking work in the past four weeks, and were available to start work last week. Centrelink unemployed are defined by their eligibility to receive unemployment benefits.
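As a rough sketch, the scope mismatch described in Example 1 can be recorded and checked before any comparison is attempted. The `Scope` class and its field names are hypothetical, introduced only for this illustration; the values come from Example 1.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Scope:
    """Hypothetical record of who and what a collection covers."""
    source: str
    max_child_age_exclusive: int  # children counted are younger than this age
    reference_period_weeks: int   # length of the period the question refers to


def scope_differences(a: Scope, b: Scope) -> list[str]:
    """Return the names of the scope attributes on which two collections differ."""
    return [
        f.name
        for f in fields(Scope)
        if f.name != "source" and getattr(a, f.name) != getattr(b, f.name)
    ]


# Values taken from Example 1.
census_2006 = Scope("2006 Census", max_child_age_exclusive=15, reference_period_weeks=2)
child_care_survey = Scope("ABS Child Care Survey", max_child_age_exclusive=13, reference_period_weeks=1)

differences = scope_differences(census_2006, child_care_survey)
if differences:
    print("Not directly comparable; scope differs on:", ", ".join(differences))
```

An empty list would not prove that two sources are comparable. It only means that none of the recorded attributes differ, so the check is only as good as the attributes chosen.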
Different definitions
Definitions may differ depending on the context or the survey. Always check that you have the correct definition and are clear about what you are describing.
Example
Depending on the circumstances, a 'child' may be defined as a person aged under 18 years, under 15 years, or under 13 years.
Changes to the data set
A data set can change over time, for example in its classification, geography, sample size or methodology. Such changes may result in a break in a time series.
Example
New industry classification codes, known as Australian and New Zealand Standard Industrial Classification (ANZSIC), were developed in 2006, replacing the 1993 edition, which was the first version produced. ANZSIC 2006 codes reflect the changes that have occurred in the structure and composition of industry since the previous edition, and enhance international comparability. However, direct comparisons with ANZSIC 1993 cannot be made.
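One way to respect a break in a time series is to calculate changes only between observations compiled on the same basis. The sketch below assumes each observation is tagged with the classification edition it was compiled under. The periods and values are invented for illustration, and the function names are hypothetical.

```python
from itertools import groupby

# Hypothetical (period, classification edition, value) observations.
# The figures are invented for illustration only.
series = [
    (1, "ANZSIC 1993", 120.0),
    (2, "ANZSIC 1993", 124.0),
    (3, "ANZSIC 2006", 131.0),
    (4, "ANZSIC 2006", 135.0),
]


def comparable_segments(observations):
    """Split a time-ordered series wherever the classification edition changes."""
    return [
        list(group)
        for _, group in groupby(observations, key=lambda obs: obs[1])
    ]


def changes_within_segments(observations):
    """Period-on-period changes, computed only between points on the same edition."""
    changes = {}
    for segment in comparable_segments(observations):
        for (_, _, prev), (period, _, curr) in zip(segment, segment[1:]):
            changes[period] = curr - prev
    return changes


print(changes_within_segments(series))
```

No change is reported for period 3, because the observation before it was compiled under a different edition.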
Correlating information
Correlation does not mean causation. The relationship between data and an event may be purely coincidental, or there may be multiple reasons behind an event taking place, with the data only reflecting one aspect of the relationship.
Example
An increase in the number of shark attacks along the eastern seaboard of Australia in January 2009 may have corresponded with booming retail sales of a new sunscreen product. This retail boom just happened to coincide with the peak shark attack period, but the two are not necessarily related.
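The sketch below illustrates how two series can be strongly correlated without one causing the other. All figures are invented for illustration. It assumes, purely for the sake of the example, that both series follow a shared seasonal driver (beach visits), and it computes the Pearson correlation coefficient by hand.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented monthly figures, January to December, for illustration only.
beach_visits = [90, 80, 60, 40, 25, 15, 10, 15, 30, 50, 70, 85]  # assumed shared driver
shark_attacks = [5, 4, 3, 2, 1, 0, 0, 1, 1, 2, 4, 5]
sunscreen_sales = [950, 820, 640, 400, 270, 160, 120, 150, 310, 520, 730, 880]

r = pearson(shark_attacks, sunscreen_sales)
print(f"shark attacks vs sunscreen sales: r = {r:.2f}")
print(f"shark attacks vs beach visits:    r = {pearson(shark_attacks, beach_visits):.2f}")
```

A coefficient close to 1 here says only that the two series rise and fall together. It says nothing about whether sunscreen sales affect shark attacks.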
Results lack variation
Variation in data is important and almost impossible to remove. A lack of variation in results over time should therefore be cause for suspicion.
Example
If the unemployment rate remained unchanged over many months, it would be worth investigating why this was the case.
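A simple check for suspiciously flat results is to measure the longest run of unchanged values in a series. This is a minimal sketch: the monthly rates are invented, and the threshold `MAX_PLAUSIBLE_RUN` is a hypothetical setting that would need to be chosen to suit the series in question.

```python
def longest_unchanged_run(values, tolerance=0.0):
    """Length of the longest run of consecutive values in which each value
    stays within `tolerance` of the value before it."""
    if not values:
        return 0
    longest = current = 1
    for prev, curr in zip(values, values[1:]):
        if abs(curr - prev) <= tolerance:
            current += 1
            longest = max(longest, current)
        else:
            current = 1
    return longest


# Invented monthly unemployment rates (per cent), for illustration only.
monthly_rate = [5.4, 5.3, 5.3, 5.3, 5.3, 5.3, 5.3, 5.5]

MAX_PLAUSIBLE_RUN = 4  # hypothetical threshold; choose one suited to the series

run = longest_unchanged_run(monthly_rate)
if run > MAX_PLAUSIBLE_RUN:
    print(f"{run} consecutive unchanged values: worth investigating")
```

A long run is a prompt to look more closely, not proof of an error. Heavily rounded figures, for example, can repeat legitimately.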
This page last updated 1 October 2009