MANAGING THE RISK OF DISCLOSURE: THE FIVE SAFES FRAMEWORK
Is the researcher appropriately authorised to access and use the data?
By placing controls on the way data are accessed, the data custodian invests some responsibility in the researcher for preventing re-identification. The general rule is this: as the detail in the data increases, so should the level of user authorisation required.
Prerequisites for user authorisation usually include the following:
By definition, a Safe People assessment would not be required for open data (i.e. data that are released into the public domain with no restriction on use).
Are the data to be used for an appropriate purpose?
Users wanting to access detailed microdata should be expected to explain the purpose of their project. For example, in order to access detailed microdata in the ABS DataLab, users must demonstrate to the ABS that their project has a statistical purpose and show it has:
As with Safe People, the need for a Safe Project assessment will depend on the context in which the data are accessed. It would not be required for open data.
Does the access environment prevent unauthorised use?
The environment here can be considered in terms of both the IT and the physical environment. In data access contexts such as open data, Safe Settings are not required. At the other end of the spectrum, however, sensitive data should only be accessed via secure research centres.
Secure research centres may have features such as:
Safe Settings ensure that data access and use occur in a transparent way.
Has appropriate and sufficient protection been applied to the data?
At a minimum, direct identifiers (such as name and address) must be removed from the data before they are released. Further statistical disclosure controls should also be applied, depending on how the data will be released. Table 1 shows some of the statistical factors that should be considered when assessing disclosure risk.
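As an illustration of the minimum treatment described above, the following Python sketch drops direct identifier fields from a set of unit records. The field names, the sample records and the function `remove_direct_identifiers` are hypothetical and are not drawn from any ABS system. Removing direct identifiers alone does not make data safe; it is only the first step before further statistical disclosure controls are considered.

```python
# Hypothetical set of direct identifier fields; a data custodian would
# define its own list for each dataset.
DIRECT_IDENTIFIERS = {"name", "address"}


def remove_direct_identifiers(records, identifiers=DIRECT_IDENTIFIERS):
    """Return copies of the records with direct identifier fields dropped."""
    return [
        {field: value for field, value in record.items() if field not in identifiers}
        for record in records
    ]


# Made-up example records, for illustration only.
records = [
    {"name": "A. Citizen", "address": "1 Example St", "age_group": "30-34", "state": "NSW"},
    {"name": "B. Resident", "address": "2 Sample Rd", "age_group": "85+", "state": "TAS"},
]

treated = remove_direct_identifiers(records)
```

The original records are left unchanged, so the custodian keeps the identified file while only the treated copy moves on to further assessment.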
Are the statistical results non-disclosive?
This is the final check on the information before it is made public; it aims to reduce the risk of disclosure to a minimum. All data made available outside the data custodian’s IT environment must be checked for disclosure. For example, in the ABS DataLab, statistical experts check all outputs for inadvertent disclosure before the data leave the DataLab environment.
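One common kind of output check is to look for table cells based on very few contributors. The Python sketch below flags such cells so that an output can be held back for review. The threshold `MIN_CELL_COUNT`, the function `flag_small_cells` and the sample table are assumptions made for illustration. They do not describe the checks actually performed in the ABS DataLab, which are carried out by statistical experts.

```python
# Hypothetical threshold: non-zero counts below this value are flagged.
# A real threshold would be set by the data custodian.
MIN_CELL_COUNT = 10


def flag_small_cells(table, min_count=MIN_CELL_COUNT):
    """Return the cells whose non-zero count falls below min_count."""
    return {cell: count for cell, count in table.items() if 0 < count < min_count}


# Made-up frequency table keyed by (state, age group).
output_table = {
    ("NSW", "30-34"): 154,
    ("NSW", "85+"): 3,
    ("TAS", "30-34"): 27,
    ("TAS", "85+"): 0,
}

flagged = flag_small_cells(output_table)

# The output is cleared for release only if no cell was flagged.
cleared = not flagged
```

Here the cell for NSW residents aged 85+ is flagged, so the table would be referred back for treatment rather than released as it stands.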
In practice, the Safe Data part of the Five Safes should be addressed after the other four are considered. This is because the degree of data treatment required will become evident once it is clear who will be able to access the data, under what conditions, in what circumstances and how the resulting data will be protected in order to be made public. The process is likely to be iterative, as data treatment with a view to maintaining utility may necessitate reassessing one or more of the other four safes.
THE FIVE SAFES IN PRACTICE: EXAMPLES FROM THE ABS
The Five Safes Framework provides a mechanism for data custodians to take necessary and reasonable steps to manage disclosure risk in their data releases. It broadens the approach to data confidentiality by considering not just the treatment of data, but also the manner and context in which data are released.
The safes are assessed independently, but also considered as a whole. They can be thought of as a series of adjustable levers or controls to effectively manage risk and maximise the usefulness of a data release. The degree to which each safe is controlled is critical to assessing the disclosure risk. Tightly controlling all five regardless of context can be counterproductive, because the added restrictions may not produce a corresponding benefit (i.e. useful data).
The table below describes how the ABS applies the Five Safes Framework to three different data access channels – open data, Confidentialised Unit Record Files (CURF) and detailed microdata files.
In all three cases, applying any one Safe in isolation is unlikely to provide an effective confidentiality solution. However, when all five are considered in combination, the overall disclosure risk becomes very low.
The treatment of the microdata files in the ABS DataLab exemplifies the framework’s holistic nature: Safe People, Safe Projects, Safe Settings, Safe Data and Safe Outputs are all controlled to mitigate the risk of disclosure, allowing appropriately authorised researchers to work securely with highly detailed microdata.
When data are loaded into the user’s own environment, the data custodian has no way to monitor how the data are used. In this case, the data custodian mitigates disclosure risk by directly protecting the data. The downside of this approach is that the data can lose some of their utility. Examples of these types of datasets include the following:
Techniques to treat microdata to mitigate disclosure risks are outlined in Part 5: Managing the risk of disclosure: treating microdata.
As Table 2 shows, tabular data are most effectively protected through Safe Data and Safe Outputs. Techniques for protecting tabular data are presented in Part 4: Managing the risk of disclosure: treating aggregate data.