1351.0.55.163 - Research Paper: Synthetic Microdata - A Possible Dissemination Tool, Oct 2018  
ARCHIVED ISSUE Released at 11:30 AM (CANBERRA TIME) 15/10/2018  First Issue


Enhancing microdata access is one of the strategic priorities for the Australian Bureau of Statistics (ABS) in its transformation program. However, balancing enhanced data access against the protection of confidentiality is a delicate act. The ABS could use synthetic data to make its business microdata more accessible to researchers, informing decision making while maintaining confidentiality. Some Australian industries are characterised by oligopoly or duopoly, so a small number of large businesses dominate and are easy to identify. This means that existing microdata protection techniques, such as information reduction or perturbation, may not be as effective for business microdata as they are for household microdata.

The research addresses the following questions: Can a synthetic data approach enhance microdata access for longitudinal business data? What is the trade-off between utility and protection under the synthetic data approach? The study compares confidentialised input and confidentialised output approaches for protecting confidentiality while analysing Australian microdata from business survey or administrative data sources.

The preliminary results suggest that synthetic data can serve as a dissemination tool that makes more business microdata accessible while ensuring confidentiality. The analysis shows that the confidentialised input approach provides more protection than the confidentialised output approach in this particular setting, a one percent sample file of business microdata. This is partly because, under the input approach, researchers have access to the microdata itself, so more noise must be added for protection. The utility loss under the synthetic data and perturbation approaches is comparable, as the estimated coefficients are similar. The ABS could therefore consider synthetic data as an approach to enhancing access to business microdata.
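
As a toy illustration of what comparing utility loss through estimated coefficients can look like, the following Python sketch fits the same regression to an original file, a perturbed file and a synthetic file. It is not the method used in the paper: the data are simulated, the variable meanings (log employment, log turnover) are hypothetical, simple additive noise stands in for perturbation, and draws from a single fitted linear model stand in for a synthesiser.

```python
import numpy as np


def fit_ols(x, y):
    """Return [intercept, slope] from an ordinary least squares fit of y on x."""
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def perturb(values, rng, scale):
    """Add zero-mean Gaussian noise to each value (assumed perturbation scheme)."""
    return values + rng.normal(0.0, scale, size=values.shape)


def synthesise(x, y, rng):
    """Replace y with draws from a linear model fitted to the original data."""
    coef = fit_ols(x, y)
    fitted = coef[0] + coef[1] * x
    resid_sd = np.std(y - fitted, ddof=2)
    return fitted + rng.normal(0.0, resid_sd, size=y.shape)


def compare(seed=0, n=2000):
    """Fit the same model to original, perturbed and synthetic toy data."""
    rng = np.random.default_rng(seed)
    x = rng.normal(10.0, 1.0, size=n)                  # hypothetical log employment
    y = 1.0 + 0.8 * x + rng.normal(0.0, 0.5, size=n)   # hypothetical log turnover
    return {
        "original": fit_ols(x, y),
        "perturbed": fit_ols(x, perturb(y, rng, scale=0.2)),
        "synthetic": fit_ols(x, synthesise(x, y, rng)),
    }


if __name__ == "__main__":
    for name, coef in compare().items():
        print(f"{name:>9}: intercept={coef[0]:.3f}, slope={coef[1]:.3f}")
```

If the slopes from the perturbed and synthetic files sit close to the slope from the original file, the two protection approaches cost a similar amount of utility for this particular model. A fuller assessment would also compare standard errors and measure disclosure risk, neither of which this sketch attempts.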