An Alternative Approach to Measure Record Level Disclosure Risk in Micro-data
The ABS is required by legislation to ensure that no statistical outputs are released in a manner that is likely to enable the identification of a particular person or organisation. A key component of the ABS dissemination strategy is the release of micro-data files in the form of licensed Confidentialised Unit Record Files (CURFs). Each micro-data file produced must therefore be assessed to ensure that the likelihood of identification is minimised.
The ABS currently assesses the risk of identification through a number of manual tabulation procedures together with the Special Uniques Detection Algorithm (SUDA) program. This process can be labour intensive.
The ABS is looking for a way to improve the assessment method by identifying a set of unit record risk measures that are:
· statistically valid and objective so that risk can be measured reliably
· consistent across CURFs so that we can properly assess relative risks
· in accordance with practical experience
· fast to calculate
· easy to interpret and apply.
One approach suggested by Elamir and Skinner (2006) is to use a log-linear model to estimate the probability that a unique record in the CURF is a match to a person known in the population. They assume that for each combination of key characteristics, the number of people in the population that have that combination can be modelled by a Poisson distribution. Using this assumption and log-linear modelling they provide an estimate for the probability described. A drawback of this method is that it can lead to biased estimates due to the presence of many zero counts. To overcome the supposed shortfalls of the log-linear model, Manrique-Vallier and Reiter (2012) suggest using a grade of membership model to calculate the same probability.
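To make the Poisson idea concrete, the following is a minimal sketch and not the implementation used by the ABS or by the cited authors. It makes several assumptions that are not in the text above: there are only two key variables, the log-linear model contains main effects only (independence, so the fitted cell means have a closed form), and the CURF is a Bernoulli sample with a known sampling fraction. The function names `independence_fit` and `unique_match_risk` are hypothetical. Under these assumptions, if a cell with population mean lam contains exactly one sampled record, the number of unsampled people in that cell is Poisson with mean m = lam * (1 - sampling fraction), and the expected probability of a correct match is (1 - exp(-m)) / m.

```python
import math
from collections import Counter


def independence_fit(records):
    """Fitted sample cell means under a main-effects (independence)
    log-linear model for two key variables: row total * column total / n."""
    n = len(records)
    row_totals = Counter(a for a, _ in records)
    col_totals = Counter(b for _, b in records)
    return {(a, b): row_totals[a] * col_totals[b] / n
            for a in row_totals for b in col_totals}


def unique_match_risk(records, sampling_fraction):
    """For each sample-unique key combination, estimate the expected
    probability that the record is a correct match, E(1/F | f = 1),
    where F is the population count and f the sample count of the cell."""
    sample_counts = Counter(records)
    fitted = independence_fit(records)
    risks = {}
    for cell, f in sample_counts.items():
        if f != 1:
            continue  # only sample uniques are scored in this sketch
        # Assumed relationship: sample mean = sampling fraction * population mean.
        lam = fitted[cell] / sampling_fraction
        # Mean number of unsampled population units sharing this combination.
        m = lam * (1.0 - sampling_fraction)
        risks[cell] = 1.0 if m == 0 else (1.0 - math.exp(-m)) / m
    return risks


# Hypothetical toy data: (sex, area) pairs for four sampled records.
toy_sample = [("M", "a"), ("M", "a"), ("M", "b"), ("F", "a")]
toy_risks = unique_match_risk(toy_sample, sampling_fraction=0.5)
```

A risk close to 1 suggests the sample-unique record is probably also unique in the population, while a risk close to 0 suggests many people in the population share the same combination. With sparse keys, a main-effects model of this kind is exactly where the zero-count bias mentioned above can arise, so the sketch illustrates the mechanics rather than a recommended model.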
The Data Access and Confidentiality Methodology Unit (DACMU) will investigate the log-linear modelling approach, the grade of membership modelling approach and other relevant approaches (including SUDA output), analysing their accuracy and precision for future CURF assessments. The results of current CURF assessment procedures will be used to validate the risk measures obtained from these new methods.