Data confidentiality guide
Learn about the Five Safes framework, confidentiality techniques and confidentialising your own data
Safely releasing valuable data
Data is a strategic and valuable resource "for growing the economy, improving service delivery and transforming policy outcomes for the Nation" (Australian Government Public Data Policy Statement).
It is important that organisations that collect data, including the Australian Bureau of Statistics (ABS), make their data holdings widely available. This includes releasing aggregate and unit record datasets (microdata) in ways that optimise their usefulness while still protecting the secrecy and privacy of those who have provided the information, as required by Australian legislation. By law, the ABS must disseminate official statistics while making sure that information is not released in a way that is likely to enable individuals or organisations to be identified. The ABS has safely and effectively made data holdings available for over 110 years.
This guide focuses on methods and management techniques to securely release data while maintaining the confidentiality of the individuals or organisations to which the information relates. If you have questions or feedback, email email@example.com.
What is confidentiality
Confidentiality is protecting the secrecy and privacy of information collected from individuals and organisations.
When information is made available to researchers, it needs to be done in a way that is unlikely to allow individuals or organisations to be identified. Maintaining confidentiality is both a legal and ethical obligation. Failure to maintain confidentiality is called a confidentiality breach, or disclosure.
We focus on managing the risks of two main types of breach:
- re-identification: where the identity of a person or organisation is determined using other public or privately held information about them
- attribute disclosure: where a characteristic of an individual or organisation is determined without formally re-identifying them
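The two breach types above can be illustrated with a minimal sketch. It assumes a toy microdata file in which a few fields (here called quasi-identifiers) could be matched against other information, and one field is sensitive. All field names, values and function names are hypothetical and invented for illustration; this is not a method prescribed by this guide.

```python
from collections import defaultdict

# Hypothetical toy microdata; every field name and value is invented.
records = [
    {"age_group": "30-39", "postcode": "2600", "occupation": "Nurse", "condition": "Asthma"},
    {"age_group": "30-39", "postcode": "2600", "occupation": "Nurse", "condition": "Asthma"},
    {"age_group": "40-49", "postcode": "2600", "occupation": "Teacher", "condition": "Diabetes"},
    {"age_group": "40-49", "postcode": "2601", "occupation": "Teacher", "condition": "Asthma"},
    {"age_group": "40-49", "postcode": "2601", "occupation": "Teacher", "condition": "None"},
]

# Assumption: these fields could be linked to other public or private data.
QUASI_IDENTIFIERS = ("age_group", "postcode", "occupation")
SENSITIVE = "condition"


def group_by_keys(rows, keys):
    """Group records by their combination of quasi-identifier values."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[k] for k in keys)].append(row)
    return groups


def reidentification_risks(rows, keys):
    """Key combinations held by exactly one record (a unique, matchable record)."""
    groups = group_by_keys(rows, keys)
    return sorted(k for k, members in groups.items() if len(members) == 1)


def attribute_disclosure_risks(rows, keys, sensitive):
    """Key combinations shared by several records that all have the same
    sensitive value, so the value is revealed without singling anyone out."""
    groups = group_by_keys(rows, keys)
    return sorted(
        k for k, members in groups.items()
        if len(members) > 1 and len({m[sensitive] for m in members}) == 1
    )


unique_keys = reidentification_risks(records, QUASI_IDENTIFIERS)
uniform_keys = attribute_disclosure_risks(records, QUASI_IDENTIFIERS, SENSITIVE)
print("Re-identification risk:", unique_keys)
print("Attribute disclosure risk:", uniform_keys)
```

In this toy file, the only 40-49 year old teacher in postcode 2600 is unique and could be re-identified, while the two nurses cannot be told apart but both have the same condition, so the condition is disclosed for anyone known to be in that group.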
Maintaining confidentiality (protecting secrecy, privacy and identity) is essential to preserving public trust in data custodians (agencies that collect, manage and release data). All data custodians must carefully consider confidentiality requirements before the release of any data, whether aggregate or microdata.
Obligation to maintain confidentiality
Australian Government agencies collect data from individuals and organisations as a standard part of their activities. There is a legal and ethical responsibility for agencies to respect and maintain the secrecy, privacy and identity of those providing the information.
In practice, this means implementing policies and procedures that address all aspects of data protection. Agencies should ensure that identifiable information:
- is not released publicly (except where allowed by legislation)
- is maintained and accessed securely
- is available only to approved people and on a need-to-know basis
There is a public expectation that agencies treat information about individuals and organisations with respect and manage it appropriately.
The obligation to keep confidential the identities and characteristics of people and organisations is primarily reflected in laws governing the collection, use and dissemination of information. These laws include, for example:
- Privacy Act 1988
- Social Security (Administration) Act 1999
- Taxation Administration Act 1953
- Census and Statistics Act 1905
These and other pieces of legislation have different terminology for the process of making data available in a safe manner. However, they all require reasonable steps to be taken to limit the likelihood of an individual person or organisation being re-identified in any data that is released. Penalties apply if the secrecy provisions set out in these Acts are breached. For example, the Census and Statistics Act stipulates criminal penalties for enabling the likely identification of an individual or organisation.
Organisations may also have policies and principles that outline additional non-legislative requirements for maintaining confidentiality. In the government sector, these documents set standards for employee behaviour and provide advice on the protocols and procedures for managing information safely. For example, the APS Values and Code of Conduct explain the high levels of ethical behaviour required of Commonwealth Government employees. Agencies planning to integrate datasets can find principles-based obligations in the High Level Principles for Data Integration Involving Commonwealth Data for Statistical and Research Purposes.
In Australia, data protections are recognised in the Privacy Act 1988.
The Act sets out people's rights in relation to the collection, use, sharing and retention of information they provide to the Commonwealth. The Privacy Act also establishes the Australian Privacy Principles (APPs), which outline how most Australian Government agencies, all private sector and not-for-profit organisations with an annual turnover of more than $3 million, all private health service providers and some small businesses must treat personal information. Importantly, APP 6 limits the disclosure of personal information. Personal information is defined in s6(1) of the Privacy Act as:
"information or an opinion about an identified individual, or an individual who is reasonably identifiable:
- whether the information or opinion is true or not; and
- whether the information or opinion is recorded in a material form or not."
Government agencies in the Northern Territory, ACT and most Australian states are bound by privacy legislation specific to their state or territory. Agencies in Western Australia are bound by the confidentiality provisions and privacy principles in the Freedom of Information Act 1992 (WA), while South Australia has an Information Privacy Principles Instruction administered by the Privacy Committee of South Australia.
Contextual approach to confidentiality
Legislation enables data to be released as long as reasonable steps are taken to prevent re-identification. This is a contextual approach to confidentiality: as long as the practical result of the processes applied is that the confidentiality of individuals or organisations is not breached, the legal and ethical requirements are satisfied. The processes used to achieve this depend heavily on the surrounding context (or manner) in which the data is released. To maintain confidentiality, you must consider:
- the environment into which the data will be released (such as a public website, a secure data laboratory)
- the method and the degree of data treatment to be applied to prevent re-identification in that environment
- the balance between adequately treating the data and ensuring its usefulness
The Privacy Act supports this contextual approach to maintaining the confidentiality of the data it protects (that is, personal information). The notion of 'identifiability' is central to the operation of the Privacy Act, although there is no formal definition of when an individual is 'identifiable' or 'reasonably identifiable' in a dataset. The Office of the Australian Information Commissioner sets out a number of factors that organisations should consider when determining the identifiability of data they hold (De-identification and the Privacy Act), and also provides guidance on 'what is personal information'. These resources show that determining whether any data subjects are 'reasonably identifiable' in a dataset requires a contextual consideration of the particular circumstances of the case, including:
- the nature and amount of information
- who will hold and have access to the information
- the other information that is available to researchers (privately held or publicly available)
- the practicality of using that information to identify an individual
In some cases, this contextual approach may mean that a focus on treating the data will be the only practical option (such as when data is made publicly available on a website). In other cases, controls on the environment in which data is to be accessed, used or released may play a larger role. Understanding this context informs decisions about the level of treatment required for a data release.
Other context controls could include:
- establishing processes to approve researchers before being granted access
- ensuring the purpose for which data is used is appropriate, legal and ethical
- providing a secure access environment
- checking outputs to prevent disclosure in publicly released information
For example, the ABS applies this contextual approach to confidentiality using the Five Safes Framework in order to provide researchers with secure access to detailed microdata within the ABS DataLab. A similar approach is taken by the Sax Institute in their Secure Unified Research Environment (SURE).
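Of the context controls listed above, checking outputs before public release is the easiest to sketch in code. The example below assumes a simple minimum cell count rule for an aggregate table. The threshold value, the function name and the table contents are all hypothetical, and the rule is one common illustration of output checking, not a description of how the ABS DataLab or SURE checks outputs.

```python
# Hypothetical threshold: this guide does not specify a rule or a value.
MIN_CELL_COUNT = 3


def flag_small_cells(table, min_count=MIN_CELL_COUNT):
    """Return the labels of table cells whose count is below min_count.

    A flagged cell would need further treatment (for example suppression
    or combining categories) before the output is released.
    """
    return sorted(label for label, count in table.items() if count < min_count)


# Invented counts of people by postcode and labour force status.
output_table = {
    ("2600", "Employed"): 120,
    ("2600", "Unemployed"): 2,
    ("2601", "Employed"): 45,
    ("2601", "Unemployed"): 1,
}

flagged = flag_small_cells(output_table)
for label in flagged:
    print("Needs treatment before release:", label)
```

A check like this only addresses the output; under a contextual approach it would sit alongside the other controls, such as approving researchers and providing a secure access environment.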