1504.0 - Methodological News, Jan 2019

ARCHIVED ISSUE Released at 11:30 AM (CANBERRA TIME) 23/01/2019

PROTECTING THE CONFIDENTIALITY OF PROVIDERS

The release of aggregated statistics provides useful information for researchers, policy decision-makers and the community. However, any release by definition comes with a risk of privacy loss; balancing the two is known as the risk-utility trade-off. The ABS publishes data in a way that maximises the utility of statistical information while upholding our obligations to protect the confidentiality of our providers. The ABS applies a suite of rigorous confidentiality methods and protections so that it can release useful information that is not likely to enable the identification of individuals.

The ABS undertakes a range of investigations to stay at the leading edge of methodological practice in confidentiality, particularly in light of the increasingly sophisticated computing technologies and algorithms available to attackers, and the emerging types of data and information solutions expected by analysts and the community.

Differential privacy is a framework for protecting privacy by adding random noise to released data. The framework focuses on providing 'plausible deniability' to a respondent about whether they contributed information to the dataset. This is expressed through a measurable limit on the resulting loss of privacy.

Differential privacy promises a number of theoretical benefits, such as measuring the privacy loss associated with a particular perturbation distribution and calculating the accumulated privacy loss across multiple queries. However, it must also be considered in terms of its underlying assumptions, practical implementation, social licence and stakeholder acceptance, and the broader confidentiality protections and legislative requirements. In particular:

1. The privacy loss measure relates only to the perturbation component, so it will underestimate the protection provided by the overall package of confidentiality applied to the data under the ‘Five Safes’ framework;
2. There are limitations when extending beyond person-level counting type queries;
3. Access to correlated auxiliary data weakens the protection provided by (any) perturbation mechanism; and
4. Implementation requirements may perturb data beyond levels acceptable to analysts, or may be practically infeasible.

The theory is still evolving, but some aspects may be useful to augment current approaches. The ABS is assessing the potential benefits of differential privacy and identifying unresolved issues and implications, including implications for practical implementation. The ABS is also engaging with experts and other National Statistical Organisations who face similar challenges to better understand these implications. The US Census Bureau, for example, is incorporating perturbation as part of its confidentialisation method for the 2020 US Census, including an adapted form of differential privacy, and is working to identify and resolve outstanding issues and progress the theoretical foundations.

It is important to remember that the differential privacy framework does not guarantee privacy protection. Rather, it allows a bound to be placed on the maximum privacy loss for aggregated tables, thereby providing a data-providing agency with a mechanism to control risk. Differential privacy also does not in itself inform the analyst of the utility loss relevant to their particular analysis; some of the literature argues that, under differential privacy, the usefulness of microdata files may be severely damaged. While there is a growing literature investigating potential relaxations, adaptations and alternatives that attempt to address these issues, the theory is still maturing.
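
To make the idea of a bounded privacy loss concrete, the following is a minimal sketch of one textbook construction, the Laplace mechanism applied to person-level counting queries. It is an illustration only and is not a description of the perturbation methods the ABS applies. The class name `PrivateCounter`, the parameter names and the toy data are all hypothetical. The sketch assumes that each query has sensitivity 1 (adding or removing one person changes a count by at most 1) and tracks the accumulated privacy loss by simply summing the epsilon spent on each query, which is a worst-case bound.

```python
import random


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) variate.

    The difference of two independent exponential variates with mean
    `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


class PrivateCounter:
    """Hypothetical helper: answers counting queries with Laplace noise
    and tracks accumulated privacy loss by basic sequential composition
    (a worst-case bound in which the epsilons simply add up)."""

    def __init__(self, records, total_epsilon, seed=None):
        self.records = list(records)
        self.total_epsilon = total_epsilon  # overall privacy budget
        self.spent_epsilon = 0.0            # privacy loss used so far
        self.rng = random.Random(seed)

    def noisy_count(self, predicate, epsilon):
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if self.spent_epsilon + epsilon > self.total_epsilon + 1e-12:
            raise RuntimeError("privacy budget exhausted")
        true_count = sum(1 for r in self.records if predicate(r))
        # Assumed sensitivity of 1 for a person-level count, so the
        # noise scale is 1 / epsilon: smaller epsilon means more noise.
        noisy = true_count + laplace_noise(1.0 / epsilon, self.rng)
        self.spent_epsilon += epsilon
        return noisy


# Toy data: ages of ten fictional respondents (not real ABS data).
ages = [23, 35, 41, 29, 67, 52, 38, 19, 74, 45]
counter = PrivateCounter(ages, total_epsilon=1.0, seed=0)
over_40 = counter.noisy_count(lambda a: a > 40, epsilon=0.5)
under_30 = counter.noisy_count(lambda a: a < 30, epsilon=0.5)
print(over_40, under_30, counter.spent_epsilon)
```

In this sketch a third query would be refused because the two queries above have used the whole budget. The example also shows the trade-off in miniature: a smaller epsilon tightens the bound on privacy loss but makes each released count noisier, and the sketch says nothing about the utility loss for any particular analysis.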
Open issues for differential privacy include aggregated units (e.g. families), descriptive statistics, weighted data, magnitude data, correlated data and longitudinal data. In particular, there is currently no widely accepted way to calculate the accumulated privacy loss over multiple queries (current composition strategies are based on a ‘worst case’ scenario). The ABS is undertaking work to assess the differential privacy measure relevant to the perturbation mechanisms commonly applied.

For more information, please contact Daniel Elazar (Methodology@abs.gov.au).

The ABS Privacy Policy outlines how the ABS will handle any personal information that you provide to us.