1504.0 - Methodological News, Mar 2021  
ARCHIVED ISSUE Released at 11:30 AM (CANBERRA TIME) 29/03/2021   

Exploring the use of machine learning for anomaly detection and editing

The ABS is assessing the feasibility of using machine learning to identify and correct anomalous data. This work aims to understand how machine learning can help to prioritise and reduce manual effort, identify incorrect data, find new edit rules, propose edit value(s) with a measure of confidence, and address issues that traditional approaches struggle with.

We are currently focussed on business administrative data, particularly big data (in terms of number of records and data items) with a time-series aspect.

As part of this work we are examining the relative advantages and limitations of different methods, including traditional approaches, and the use of several techniques in combination. We will also explore:

    • Methods to adequately demonstrate and explain how a machine learning algorithm came to a solution.
    • How to develop edit rules for new or evolving datasets.
    • How to update models and algorithms over time and in response to real-world changes.

We are assessing a selection of machine learning and traditional approaches, ranging from simple techniques to more complex methods suited to the time-series nature of some datasets. The aim is to develop a toolbox of techniques suited to datasets with different characteristics.

Initial work has focussed on data where the behaviour patterns of correct and incorrect data are poorly understood, and where a key challenge is to develop a labelled dataset against which the methods can be assessed. Unsupervised approaches such as Local Outlier Factor can help to identify anomalies; however, human expertise is needed to differentiate incorrect data from data that are merely unusual.
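To illustrate how an unsupervised method such as Local Outlier Factor scores records without labels, the following is a minimal sketch in plain Python. It is not the ABS implementation: the function name, the choice of k = 3, and the (turnover, wages) records are hypothetical and made up for illustration, and the sketch assumes the data contain no large groups of exact duplicates. In practice a library implementation, such as the `LocalOutlierFactor` class in scikit-learn, would normally be used instead.

```python
import math

def local_outlier_factor(points, k=3):
    """Return one LOF score per point (hypothetical helper, not ABS code).

    Scores close to 1 mean a point is about as dense as its neighbours;
    scores well above 1 mean it sits in a much sparser region.
    Assumes fewer than k exact duplicates of any point.
    """
    n = len(points)
    # Pairwise Euclidean distances between all records.
    dist = [[math.dist(p, q) for q in points] for p in points]

    neighbours, k_dist = [], []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        kd = dist[i][order[k - 1]]  # distance to the k-th nearest neighbour
        k_dist.append(kd)
        # Keep every point within the k-distance, so ties are included.
        neighbours.append([j for j in order if dist[i][j] <= kd])

    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], dist[i][j]) for j in neighbours[i]]
        lrd.append(len(reach) / sum(reach))

    # LOF: average density of the neighbours relative to the point's own density.
    return [
        sum(lrd[j] for j in neighbours[i]) / (len(neighbours[i]) * lrd[i])
        for i in range(n)
    ]

# Made-up (turnover, wages) records; the last one is deliberately unusual.
records = [(100, 40), (102, 41), (98, 39), (101, 42), (99, 40), (103, 41), (160, 10)]
scores = local_outlier_factor(records, k=3)

for record, score in zip(records, scores):
    print(record, round(score, 2))
```

A high score only marks a record as a candidate for review. Deciding whether that record is wrong or merely unusual still requires subject-matter expertise, which is why a labelled dataset is needed to assess such methods.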

The ABS is engaging with other National Statistical Organisations that are undertaking similar work in order to share best practice.

For further information, please contact Jenny Pocknee at methodology@abs.gov.au.
