This issue contains three articles:
- The connection between the ABS perturbation methodology and differential privacy
- Adjusting to the times: the use of video interviews for data collection
- Safe use of personal data for sample design and estimation
Features important work and developments in ABS methodologies
The ABS is committed to improving access while ensuring that privacy and confidentiality are maintained. The emergence of differential privacy (DP) methods has created opportunities to better quantify the trade-off between statistical utility and confidentiality protection in our statistical outputs.
The ABS is continuing to explore the opportunities offered by DP. This research builds on Leaver and Marley (2011) and Bailie and Chien (2019) to improve the perturbation methodology we use in TableBuilder. This methodology has two elements – an entropy maximisation method for generating the perturbation table and a cell key method to ensure consistent protections for statistical outputs. Recent work has focussed on the first element and has considered an analytical entropy maximisation approach to incorporate (ε,δ)-DP parameters in the design of the perturbation table (transition matrix).
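The role of the cell key in giving consistent protection can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the modulus `BIG_N`, the modular sum used to combine record keys, and the values in `PTABLE` are hypothetical and are not the parameters used in TableBuilder. The sketch only shows why the same set of contributors always receives the same perturbation, regardless of how the query is phrased.

```python
import random

BIG_N = 2**32  # hypothetical modulus for record keys and cell keys

# Hypothetical perturbation table: for each true count, a symmetric
# distribution of (perturbation, probability) pairs. Illustrative values only.
PTABLE = {
    0: [(0, 1.0)],
    1: [(-1, 0.25), (0, 0.5), (1, 0.25)],
    2: [(-2, 0.1), (-1, 0.2), (0, 0.4), (1, 0.2), (2, 0.1)],
}
DEFAULT_ROW = [(-2, 0.1), (-1, 0.2), (0, 0.4), (1, 0.2), (2, 0.1)]


def cell_key(record_keys):
    """Combine the record keys of a cell's contributors into one cell key."""
    return sum(record_keys) % BIG_N


def perturb(count, key):
    """Look up the perturbed count, using the cell key as a fixed draw."""
    u = key / BIG_N  # deterministic value in [0, 1) derived from the cell key
    row = PTABLE.get(count, DEFAULT_ROW)
    cumulative = 0.0
    for delta, prob in row:
        cumulative += prob
        if u < cumulative:
            return count + delta
    return count + row[-1][0]  # guard against floating-point rounding


# Each unit record is assigned a permanent random record key once.
rng = random.Random(0)
record_keys = {person_id: rng.randrange(BIG_N) for person_id in range(1000)}

# The same four contributors, reached by two differently ordered queries.
cell = [3, 17, 256, 900]
keys = [record_keys[i] for i in cell]
first = perturb(len(cell), cell_key(keys))
second = perturb(len(cell), cell_key(list(reversed(keys))))
```

Because the cell key depends only on which records contribute to the cell, `first` and `second` are always equal, so repeating or rephrasing a query cannot be used to average the perturbation away.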
Collaborating with Professor Parastoo Sadeghi, the ABS has considered a single static counting query function and explored the analytical form of the symmetric perturbation distribution – a special case of current TableBuilder parameters which include asymmetric perturbation distributions. This collaboration has established:
This research has shown it is possible to incorporate DP parameters in the design of the perturbation table. There are several areas for future research including:
For more information, please contact Professor Parastoo Sadeghi or Dr. Joseph Chien.
The need to mitigate challenges associated with in-person interviewing during the COVID-19 pandemic, together with an increase in the use of video calls across society, contributed to the decision to explore the collection of official statistics using remote video interviews.
Video-Assisted Live Interviewing (VALI) is data collection conducted online using a video conferencing platform. While VALI was initially considered for pandemic-related reasons, it also has the potential to improve data collection efficiency, reduce costs and enhance interviewer safety, and may improve response rates and reduce provider burden.
A comprehensive range of VALI-related research has been undertaken to develop the video interviewing process. This research includes an online panel study of mode preferences, field test observation, and multiple rounds of usability testing with ABS field interviewers and various respondent cohorts.
Findings from testing included that respondents:
Further evaluation activities, including a pilot study, will be undertaken prior to decisions being made about the future of VALI within the ABS.
Plans are also underway to conduct a modal experiment to enable data quality comparisons between VALI, online and telephone collection. The results of this experiment are due in early 2023.
For more information, please contact Kirsten Gerlach.
The ABS is the custodian for MADIP – the Multi-Agency Data Integration Project. The ABS collects and links administrative data from a number of Australian government agencies to create a secure data asset combining information on health, education, government payments, income and taxation, employment, and population demographics. Access to these data is provided to authorised researchers in a way that protects the privacy of personal information.
This rich source of data could also be used to improve the efficiency of ABS household surveys, by identifying subpopulations of interest and over-sampling these subpopulations, or by using administrative data to improve the efficiency of estimation. But this needs to be done in a safe way that protects the confidentiality of the information and is seen as an appropriate use of the information by the agencies that supply the data and by the Australian community.
One method of using the data that is generally considered safe is to use area-level summaries. For example, if a survey needs to over-sample recent migrants, the proportion of recent migrants in the population can be calculated at the area level, and areas with high proportions can be over-sampled. Typically an SA1 area is used, with an average population of approximately 400 people.
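As a rough illustration of the area-level approach, the sketch below allocates a sample across areas and applies a higher sampling rate where the proportion of the target subpopulation is high. The function name, the threshold, the over-sampling factor and the area figures are all hypothetical; the article does not describe the allocation rule the ABS actually uses.

```python
def allocate_sample(areas, total_sample, threshold=0.2, oversample_factor=3.0):
    """Allocate a sample across areas, over-sampling areas where the
    target subpopulation makes up a high proportion of the population.

    areas: dict mapping area code -> (population, target_count)
    Returns a dict mapping area code -> sample size (not rounded).
    """
    weights = {}
    for code, (population, target_count) in areas.items():
        proportion = target_count / population if population else 0.0
        # Areas at or above the threshold get a boosted sampling rate.
        factor = oversample_factor if proportion >= threshold else 1.0
        weights[code] = population * factor
    total_weight = sum(weights.values())
    return {code: total_sample * w / total_weight for code, w in weights.items()}


# Three made-up areas of about SA1 size; only area-level summaries are used.
areas = {
    "A": (400, 120),  # 30% recent migrants
    "B": (400, 20),   # 5% recent migrants
    "C": (400, 8),    # 2% recent migrants
}
allocation = allocate_sample(areas, total_sample=100)
```

With these made-up figures, area A receives 60 of the 100 sample units and areas B and C receive 20 each. No individual record is consulted, only the count of recent migrants per area.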
While area-level summaries work well in many situations, further efficiencies can be gained by using information at the address level, particularly if the interest is in relatively rare subpopulations. The gain needs to be balanced against the risk to privacy that would occur if information from MADIP were simply linked to addresses on the sampling frame. One technique developed by the ABS is to fit a predictive model that returns a propensity: the likelihood that an address contains the subpopulation of interest. These models can take into account the chance that MADIP information is out of date, for example because people have moved address. While the use of predictive models is, in general, not as effective as directly linking personal information to the sampling frame, it can be a significant improvement on using area-level summaries and can strike an appropriate balance between improving efficiency and protecting the privacy of personal information.
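The idea of an address-level propensity can be sketched with a small logistic regression fitted to synthetic data. This is an illustration under stated assumptions, not the ABS model: the article does not say what kind of model is used, the two features (an administrative flag and the age of the administrative record in years) are hypothetical, and the training data are simulated. The point is that the sampling frame would carry only a score, and that the score can fall as the administrative information ages.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def propensity(weights, features):
    """Predicted probability that an address contains the target subpopulation."""
    z = weights[0] + sum(w * x for w, x in zip(weights[1:], features))
    return sigmoid(z)


def fit_logistic(X, y, learning_rate=0.5, epochs=1500):
    """Fit logistic regression by batch gradient descent.

    Returns [intercept, coefficient_1, coefficient_2, ...].
    """
    weights = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for features, label in zip(X, y):
            error = propensity(weights, features) - label
            grad[0] += error
            for j, x in enumerate(features):
                grad[j + 1] += error * x
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad)]
    return weights


# Synthetic training data (illustrative only, not ABS data).
# Hypothetical features per address: [administrative flag, flag * record age].
rng = random.Random(1)
X, y = [], []
for _ in range(400):
    flagged = rng.random() < 0.3      # admin data suggests the subpopulation
    years_old = rng.uniform(0, 5)     # age of the administrative record
    # Simulated truth: the flag becomes less reliable as the record ages,
    # for example because people have moved address.
    chance = 0.05 + (0.8 * math.exp(-0.3 * years_old) if flagged else 0.0)
    X.append([1.0, years_old] if flagged else [0.0, 0.0])
    y.append(1 if rng.random() < chance else 0)

weights = fit_logistic(X, y)
fresh = propensity(weights, [1.0, 0.0])      # flagged, record is current
stale = propensity(weights, [1.0, 5.0])      # flagged, record is 5 years old
unflagged = propensity(weights, [0.0, 0.0])  # no administrative flag
```

In this simulation a flagged address with a current record scores higher than one with a five-year-old record, and both can be compared with the score for an unflagged address. Addresses could then be grouped into strata by propensity and sampled at different rates, in the same way that areas are in the area-level approach.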
For more information, please contact Bruce Fraser.
Please email methodology@abs.gov.au to:
Alternatively, you can post to:
Methodological News Editor
Methodology Division
Australian Bureau of Statistics
Locked Bag No. 10
Belconnen ACT 2617
The ABS Privacy Policy outlines how the ABS will handle any personal information that you provide to us.
Releases from June 2021 onwards can be accessed under research.
Releases up to March 2021 can be accessed under past releases.