What is Protari
Protari is software that enables fast, safe data access and innovation by users. It has been developed collaboratively by Data61 and incorporates ABS methods for applying confidentiality on the fly.
The ABS is conducting a trial of Protari to explore how harnessing API technology can lead to better outcomes for users. Currently, Protari is being used to improve researcher access to insights from the Multi-Agency Data Integration Project (MADIP).
How can I use Protari
There are several ways that you can interact with Protari, depending on what you are hoping to achieve.
The ABS uses the 'Five Safes Framework' for protecting the privacy of individuals when releasing data. For more information see Managing the Risk of Disclosure: The Five Safes Framework, in the ABS Confidentiality Series.
Perturbation is one element of privacy protection within this broader framework. To minimise the risk of identifying individuals in aggregate statistics, outputs based on real data derived through Protari are perturbed. That is, small random perturbations (or changes) are applied to individual cells while the information value of the table as a whole is retained. The ABS considers perturbation to be the most satisfactory technique for avoiding the release of identifiable data while maximising the range of information that can be released. Perturbation is considered necessary due to the flexible nature of possible queries, the amount of detail in the underlying dataset, and the potential for results from multiple queries to be compared. When interpreting results from Protari, consider that:
Protari may not be suitable for all types of research. It is the responsibility of each researcher to assess the fitness of data for their intended purpose.
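To make the idea of cell-level perturbation concrete, the following toy sketch adds a small random change to each cell of a table of counts. It is an illustration only and is not the ABS perturbation method, which is not described on this page. The function name `perturb_table`, the `max_change` parameter and the example counts are all hypothetical.

```python
import random


def perturb_table(cells, max_change=2, seed=None):
    """Return a copy of `cells` with a small random change applied to each count.

    Toy illustration only: NOT the ABS method. `max_change` is an assumed bound.
    """
    rng = random.Random(seed)
    perturbed = {}
    for label, count in cells.items():
        noise = rng.randint(-max_change, max_change)  # small change per cell
        perturbed[label] = max(0, count + noise)      # counts cannot go below zero
    return perturbed


# Hypothetical counts, not taken from any real or synthetic dataset.
table = {"Group A": 1520, "Group B": 310, "Group C": 4}
print(perturb_table(table, seed=1))
```

In a sketch like this, large cells are barely affected in relative terms, while small cells can change noticeably. This is one reason to treat small counts in any perturbed output with care.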
For more information about perturbation, see:
In addition to perturbation, other protections are applied to real data accessed via Protari. For example, all underlying data is de-identified, the number of queries each user can run is limited, rules restrict certain items from being run together, and activity is logged and monitored.
There are two types of data available via Protari:
Real data - controlled access
Access to real data is limited to analysts associated with Australian Commonwealth or State Government agencies for statistical and research purposes.
Individuals can apply to use Protari to analyse the following Multi-Agency Data Integration Project (MADIP) products:
For information about using MADIP data products, see the Using Protari section in Microdata: Multi-Agency Data Integration Project, Australia.
Synthetic data - open access
To foster innovation and support the development of custom interfaces, there are no restrictions on who can query synthetic (test) data via the Protari API. User authentication is not required.
The synthetic data contains no real data, but it is generally consistent with the structure of the related real data product. Using the synthetic data allows a researcher or developer to become familiar with Protari's functionality and the structure of the real data, and to prepare code or programs without having to apply for access to, or use limited queries against, the real data. Real data access is subject to additional protections that are not required for synthetic data, such as perturbation of results, rules restricting certain items from being run together, query limits and active usage monitoring.
Synthetic data is available via the Protari API for the following MADIP products:
For instructions on how to access the synthetic data see Using the API.
Using the table interface
If you are approved to participate in the trial of Protari, you can access the ABS Protari Table Interface at https://protari.abs.gov.au/
When you access the Table Interface, you are redirected to log in through the authentication service Okta. Further information on how to use the Table Interface can be found in Data61's Protari documentation, under Using the Table Interface. It covers selecting your dataset, building your query, generating results, selecting output formats, and other features.
Using the API
The easiest way to get started and explore the capabilities of the Protari API is through the Swagger user interface at https://protari.abs.gov.au/api/v1/ui/. Swagger provides a user interface for exploring and testing the various endpoints the API offers.
If you are an experienced API user, you can query the ABS Protari API from your own software, for example with code written in R or Python. Start your queries with the base URL `https://protari.abs.gov.au/api/v1`, and append the endpoint you need (examples below).
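As a minimal sketch of querying from your own software, the Python below builds a request URL from the base URL and defines a helper that sends a GET request. Only the base URL comes from this page. The endpoint path `datasets`, the helper names `build_url` and `fetch_json`, and the assumption that responses are JSON are illustrative, so check them against the Swagger interface before relying on them.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://protari.abs.gov.au/api/v1"  # base URL given on this page


def build_url(path, params=None):
    """Join the base URL with an endpoint path and optional query parameters."""
    url = BASE_URL + "/" + path.strip("/")
    if params:
        url += "?" + urlencode(params, doseq=True)
    return url


def fetch_json(url, timeout=30):
    """Send a GET request and decode the body as JSON (makes a network call)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# "datasets" is an assumed endpoint path, used here only for illustration.
print(build_url("datasets"))
```

Calling `fetch_json(build_url("datasets"))` would then send the request. Building the URL separately from sending it makes it easy to inspect exactly what will be requested before any query is run.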
Whether using Swagger or your own software, querying real datasets requires user authentication. This is not required for synthetic (test) datasets. Currently, if you are already an approved user for real data access, you can acquire an authentication token via your browser's developer tools. Contact email@example.com for more details. The ABS is working on an improved authentication flow.
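The sketch below shows one way a token might be attached to a request for a real dataset and omitted for a synthetic one. It assumes the token is sent as a bearer token in the `Authorization` header, which this page does not confirm. The helper name `build_request` and the `datasets` endpoint path are also hypothetical.

```python
from urllib.request import Request

BASE_URL = "https://protari.abs.gov.au/api/v1"  # base URL given on this page


def build_request(url, token=None):
    """Create a GET request, adding an Authorization header only if a token is given.

    Assumption: the API accepts the token using the "Bearer" scheme.
    """
    headers = {"Accept": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return Request(url, headers=headers)


# Synthetic (test) data: no token needed. "datasets" is an assumed path.
open_request = build_request(BASE_URL + "/datasets")

# Real data: pass the token you obtained as an approved user (placeholder here).
secure_request = build_request(BASE_URL + "/datasets", token="YOUR_TOKEN")
print(secure_request.has_header("Authorization"))
```

Either request object can be passed to `urllib.request.urlopen` to send it. Keep real tokens out of source files, for example by reading them from an environment variable.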
Main API endpoints
For more information, see Data61's Protari documentation. The ABS is currently using Protari v1.2.0.
Find out more and apply for access
Try out the synthetic data API in Protari. To apply for access to real data and find out more about Protari, contact firstname.lastname@example.org.