Australian Bureau of Statistics

Comparison Pitfalls

Be wary when making comparisons. Comparisons cannot be made between 'apples and oranges', only between 'oranges and oranges'.

Different sources

Be wary of comparing data from different sources. Consider whether the data sets are in fact comparable.

Example 1

Results from the 2006 Census regarding unpaid child care cannot be directly compared with the results of the ABS Child Care Survey, because the two collections covered children of different ages over different reference periods. The Census question referred to care provided for children aged under 15 years in the two weeks prior to the Census, while the Child Care Survey only included children aged under 13 years during a single reference week.
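One way to make this kind of check routine is to record the scope of each collection and compare the records before comparing any results. The following is a minimal sketch only: the class CollectionScope and the function comparability_issues are hypothetical names invented for illustration, and the only values used are the age limits and reference periods stated in Example 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CollectionScope:
    """Hypothetical record of what a collection actually measured."""
    name: str
    child_age_under: int   # children aged under this many years
    reference_days: int    # length of the reference period, in days


def comparability_issues(a, b):
    """Return a list of scope differences between two collections."""
    issues = []
    if a.child_age_under != b.child_age_under:
        issues.append(
            f"age scope differs: under {a.child_age_under} "
            f"vs under {b.child_age_under}"
        )
    if a.reference_days != b.reference_days:
        issues.append(
            f"reference period differs: {a.reference_days} days "
            f"vs {b.reference_days} days"
        )
    return issues


# Values taken from Example 1: two weeks vs a single reference week.
census = CollectionScope("2006 Census", child_age_under=15, reference_days=14)
survey = CollectionScope("Child Care Survey", child_age_under=13, reference_days=7)

issues = comparability_issues(census, survey)
for issue in issues:
    print(issue)
```

An empty list does not prove that two data sets are comparable. It only means that none of the recorded differences were found.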

Example 2

ABS and Centrelink both collect information about unemployed persons, but the data sets are not comparable. The ABS defines unemployed people by activity. That is, they are people who are without work, but have been actively seeking work in the past four weeks, and were available to start work last week. Centrelink defines unemployed people by their eligibility to receive unemployment benefits.



Different definitions

Definitions may differ depending on the context or the survey. Always check that you have the correct definition and are clear about what you are describing.

Example

A 'child' may be defined in some instances as a person aged under 18 years, a person aged under 15 years, or a person aged under 13 years, depending on the circumstances.



Changes to the data set

Changes can occur to a data set over time, such as changes in classification, geography, sample size, methodology, etc. This may result in a break in a time series.

Example

New industry classification codes, known as the Australian and New Zealand Standard Industrial Classification (ANZSIC), were developed in 2006, replacing the 1993 edition, which was the first version produced. ANZSIC 2006 codes reflect the changes that have occurred in the structure and composition of industry since the previous edition, and enhance international comparability. However, data coded to ANZSIC 2006 cannot be directly compared with data coded to ANZSIC 1993.
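A break in a time series can be made explicit by tagging each observation with the classification edition it was coded to, and refusing to calculate a change across editions. The sketch below assumes a simple list of (period, edition, value) tuples. The function name changes_within_edition is hypothetical, and the periods and values are invented for illustration. They are not ABS data and do not indicate when any real series changed classification.

```python
def changes_within_edition(observations):
    """Period-to-period changes, with None where the edition changes.

    observations: list of (period, edition, value) tuples in time order.
    """
    changes = []
    for (_, prev_edition, prev_value), (period, edition, value) in zip(
        observations, observations[1:]
    ):
        if edition == prev_edition:
            changes.append((period, value - prev_value))
        else:
            # Break in series: the two values are not directly comparable.
            changes.append((period, None))
    return changes


# Invented figures, for illustration only.
series = [
    (1, "ANZSIC 1993", 100),
    (2, "ANZSIC 1993", 104),
    (3, "ANZSIC 2006", 90),
    (4, "ANZSIC 2006", 93),
]

changes = changes_within_edition(series)
print(changes)
```

Here the apparent fall between periods 2 and 3 is reported as a break rather than as a change, because it may reflect the new classification and not the industry itself.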



Correlating information

Correlation does not mean causation. The relationship between data and an event may be purely coincidental, or there may be multiple reasons behind an event taking place, with the data only reflecting one aspect of the relationship.


Example

An increase in the number of shark attacks along the eastern seaboard of Australia in January 2009 may have corresponded with booming retail sales of a new sunscreen product. This retail boom just happened to coincide with the peak shark attack period, but the two are not necessarily related.
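The point can be illustrated with a small calculation. In the sketch below, two series are both built from the same seasonal pattern, so they are strongly correlated even though neither influences the other. All figures are invented for illustration. The names summer_index, sunscreen_sales and shark_attacks are hypothetical, and the seasonal driver is an assumption added to make the example work, not a finding from any data set.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented monthly "how summery is it" index, January to December.
summer_index = [30, 28, 25, 21, 17, 14, 13, 15, 18, 22, 25, 28]

# Both series depend on the season, not on each other.
wobble = [50, -30, 20, -10, 40, -20, 10, -40, 30, -50, 0, 20]
sunscreen_sales = [100 * s + w for s, w in zip(summer_index, wobble)]
shark_attacks = [s // 5 for s in summer_index]

r = pearson(sunscreen_sales, shark_attacks)
print(f"correlation between sunscreen sales and shark attacks: {r:.2f}")
```

The coefficient comes out close to 1, yet by construction sunscreen sales have no effect on shark attacks. Both simply rise and fall with the season.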



Results lack variation

Variation in data is normal and almost impossible to remove. Therefore, a lack of variation in results over time should be cause for suspicion.

Example

If the unemployment rate remained unchanged over many months, it would be worth investigating why this was the case.
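A simple screening step is to look for long runs of identical values in a series. The sketch below is one possible approach. The function name longest_flat_run and the tolerance parameter are hypothetical, the monthly rates are invented, and how long a run must be before it counts as suspicious is a judgement for the analyst.

```python
def longest_flat_run(values, tolerance=0.0):
    """Length of the longest run of consecutive, effectively equal values."""
    if not values:
        return 0
    longest = current = 1
    for previous, value in zip(values, values[1:]):
        if abs(value - previous) <= tolerance:
            current += 1
            longest = max(longest, current)
        else:
            current = 1
    return longest


# Invented monthly unemployment rates (per cent), for illustration only.
rates = [5.2, 5.1, 5.1, 5.1, 5.1, 5.1, 5.3]

run = longest_flat_run(rates)
if run >= 4:  # the threshold of 4 months is an arbitrary assumption
    print(f"rate unchanged for {run} consecutive months: worth investigating")
```

A flagged run is only a prompt to look closer. The cause may be rounding, a carried-forward estimate, a processing error, or a genuinely stable result.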

© Commonwealth of Australia 2008

Unless otherwise noted, content on this website is licensed under a Creative Commons Attribution 2.5 Australia Licence together with any terms, conditions and exclusions as set out in the website Copyright notice. For permission to do anything beyond the scope of this licence and copyright terms contact us.