Protari
 


At the ABS, we embrace solutions that improve the way Australians access the data they need while protecting privacy and data security. Protari combines the ABS' world-leading confidentiality methods with API technology to streamline delivery of our data to your desktop and drive user-centred innovation.



What is Protari

Protari is software that enables fast, safe data access and innovation by users. Protari has been developed collaboratively by Data61, incorporating ABS methods for on-the-fly confidentiality.
  • Safe data access - Users can create tables from underlying unit-record data, similar to TableBuilder.
  • User innovation - Users can build applications or custom interfaces using Protari's functionality as an Application Programming Interface (API).

The ABS is conducting a trial of Protari to explore how harnessing API technology can lead to better outcomes for users. Currently, Protari is being used to improve researcher access to insights from the Multi-Agency Data Integration Project (MADIP).


How can I use Protari

There are several ways that you can interact with Protari, depending on what you are hoping to achieve.
| If this is you... | ...you can: |
| --- | --- |
| Policy researcher more comfortable with a user interface than code | Use the default Protari Table Interface |
| Policy researcher comfortable using code | Query the Protari API |
| Organisation with a policy research role interested in offering staff streamlined data access | Create or commission a custom interface that draws on data dynamically via the API: discuss your project feasibility with the ABS at protari@abs.gov.au, and talk to your own IT area about the potential to set up a customised interface |

Open synthetic data is available to promote innovation and support development of custom interfaces.

Protecting privacy

The ABS uses the 'Five Safes Framework' for protecting the privacy of individuals when releasing data. For more information see Managing the Risk of Disclosure: The Five Safes Framework, in the ABS Confidentiality Series.

Perturbation is one element of privacy protection within this broader framework. To minimise the risk of identifying individuals in aggregate statistics, outputs based on real data derived through Protari are perturbed. That is, small random perturbations (or changes) are applied to individual cells while the information value of the table as a whole is retained. The ABS considers perturbation to be the most satisfactory technique for avoiding the release of identifiable data while maximising the range of information that can be released. Perturbation is considered necessary due to the flexible nature of possible queries, the amount of detail in the underlying dataset, and the potential for results from multiple queries to be compared. When interpreting results from Protari, consider that:
  • while perturbation introduces random errors, it does so with almost no bias
  • running the same query multiple times results in the same perturbation and therefore the same output
  • some relationships between different estimates (e.g. adding to 100%) may not be preserved exactly
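
The points above can be illustrated with a toy sketch. This is not the ABS perturbation method: the function `toy_perturb`, the hash-based shift, the `max_shift` parameter and the counts are all hypothetical, chosen only to show why a repeated query returns the same value and why perturbed parts need not add up to a perturbed total.

```python
import hashlib


def toy_perturb(count, cell_key, max_shift=2):
    """Shift a count by a small integer derived from a hash of the cell key.

    Illustrative only: this is NOT the ABS perturbation method.
    """
    digest = hashlib.sha256(cell_key.encode("utf-8")).digest()
    # Map the first hash byte to an integer in [-max_shift, max_shift].
    shift = digest[0] % (2 * max_shift + 1) - max_shift
    return max(0, count + shift)


# Hypothetical true counts for a two-category table.
true_counts = {"sex=1": 480, "sex=2": 520}

# Each cell, and the total, is perturbed independently.
perturbed = {key: toy_perturb(n, key) for key, n in true_counts.items()}
perturbed_total = toy_perturb(sum(true_counts.values()), "total")

# The same cell always receives the same shift, so repeating a query
# reproduces the same output.
repeated = {key: toy_perturb(n, key) for key, n in true_counts.items()}

# The sum of perturbed cells may differ slightly from the perturbed total.
print(sum(perturbed.values()), perturbed_total)
```

Because the shift depends only on the cell key, the result is repeatable. Because each cell is shifted independently, additive relationships are only approximately preserved.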

Protari may not be suitable for all types of research. It is the responsibility of each researcher to assess the fitness of data for their intended purpose.
  • It is expected that for most queries perturbation will have a smaller impact on data accuracy than will other forms of error, such as coverage error, response error and linkage error.
  • Although cells may appear to contain none or all of the relevant population, this is not necessarily a reflection of the true value of the cell.
  • No reliance should be placed on data cells with small values, particularly where the total table population is also small.

For more information about perturbation, see:
  • Confidentiality in the TableBuilder User Guide. The methods used to calculate the perturbations and other confidentiality protections in Protari are very similar to those used in TableBuilder.
  • The Managing the Risk of Disclosure: Treating Aggregate Data section of the ABS Confidentiality Series.

In addition to perturbation, other protections are also applied to real data when accessed via Protari. For example, any underlying data is de-identified, users are limited in the number of queries they are able to run, rules restrict certain items being run together, and activity is logged and monitored.

Available data

There are two types of data available via Protari:
  • real data
  • synthetic (or test) data

Real data - controlled access

Access to real data is limited to analysts associated with Australian Commonwealth or State Government agencies for statistical and research purposes.

Individuals can apply to use Protari to analyse the following Multi-Agency Data Integration Project (MADIP) products:
For information about using MADIP data products, see the Using Protari section in Microdata: Multi-Agency Data Integration Project, Australia.

Synthetic data - open access

To foster innovation and support development of custom interfaces, there are no restrictions on who can query synthetic (test) data via the Protari API. User authentication is not required.

The available synthetic data does not contain real data, but it is generally consistent with the structure of the related real data product. Using the synthetic data lets researchers and developers become familiar with Protari's functionality and the structure of the real data, so they can prepare code or programs without having to apply for access to, or use limited queries against, the real data. Additional protections that are not required for synthetic data are applied to real data access, such as perturbation of results, rules restricting certain items being run together, query limits and active usage monitoring.

Synthetic data is available via the Protari API for the following MADIP products:
For instructions on how to access the synthetic data see Using the API.


Using the table interface

If you are approved to participate in the trial of Protari, you can access the ABS Protari Table Interface at https://protari.abs.gov.au/

When accessing the Table Interface, you are redirected to log in to the authentication service Okta. Further information on how to use the Table Interface can be found in Data61's Protari documentation, Using the Table Interface. This covers selecting your dataset, building your query, generating results, selecting output formats, and other features.

Using the API

The easiest way to get started and explore the capabilities of the Protari API is through the Swagger user interface at https://protari.abs.gov.au/api/v1/ui/. Swagger provides a user interface for exploring and testing the various endpoints the API offers.

If you are an experienced API user, you can query the ABS Protari API from your own software, for example with code written in R or Python. Start each query with the base URL `https://protari.abs.gov.au/api/v1`, then append the endpoint you need (examples below).

Authentication

Whether using Swagger or your own software, querying real datasets requires user authentication. This is not required for synthetic (test) datasets. Currently, if you are already an approved user for real data access, you can acquire an authentication token via your browser's developer tools. Contact protari@abs.gov.au for more details. The ABS is working on an improved authentication flow.
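
As a rough sketch of how an authentication token might be attached to a request from your own code, the example below uses Python's standard library. The header name `Authorization` and the `Bearer` scheme are assumptions, not confirmed by this page; check with protari@abs.gov.au for the scheme the service actually expects. The helper `build_request` and the token value are hypothetical.

```python
from urllib.request import Request

BASE_URL = "https://protari.abs.gov.au/api/v1"


def build_request(url, token=None):
    """Build a GET request, optionally carrying an authentication token."""
    headers = {"Accept": "application/json"}
    if token:
        # ASSUMPTION: the token is sent as a Bearer token. The real header
        # format is not documented on this page.
        headers["Authorization"] = f"Bearer {token}"
    return Request(url, headers=headers)


# Synthetic (test) datasets need no token.
open_request = build_request(f"{BASE_URL}/datasets")

# Real datasets need a token; "my-token" is a placeholder value.
auth_request = build_request(f"{BASE_URL}/datasets", token="my-token")
```

Either request object can then be passed to `urllib.request.urlopen` to send it.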

Main API endpoints

| Endpoint | Expected response |
| --- | --- |
| /about | Returns general information about the API such as the version of Protari in use, and any terms of access (JSON format only) |
| /datasets | Returns a list of available datasets (does not display datasets a user does not have approval for) (JSON format only) |
| /datasets/{dataset_name}[?values={true\|false}] | Returns the metadata for a dataset (JSON format only) |
| /datasets/{dataset_name}/fields/{field_name} | Returns the metadata for a single field (aka variable, data item) (JSON format only) |
| /datasets/{dataset_name}/aggregation[/{csv\|sdmx-json}][?{query_string}] | Returns aggregate results as JSON (default), CSV or SDMX-JSON |
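
The endpoints listed above can be reached from Python with the standard library alone. The following is a minimal sketch, not an official client: the helper names `endpoint_url` and `fetch_json` are hypothetical, and the shape of each JSON response is not assumed here.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "https://protari.abs.gov.au/api/v1"


def endpoint_url(*parts, base_url=BASE_URL):
    """Join path segments onto the base URL, percent-encoding each one."""
    path = "/".join(quote(str(part), safe="") for part in parts)
    return f"{base_url}/{path}"


def fetch_json(url, timeout=30):
    """Send a GET request and decode the JSON body (needs network access)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


about_url = endpoint_url("about")
datasets_url = endpoint_url("datasets")
dataset_url = endpoint_url("datasets", "MADIP_2011_SYNTH")
field_url = endpoint_url("datasets", "MADIP_2011_SYNTH", "fields", "madip_sex")
```

For example, `fetch_json(dataset_url)` would request the metadata for the synthetic dataset, which does not require authentication.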

Example queries

| Type | Example | Expected response |
| --- | --- | --- |
| datasets | https://protari.abs.gov.au/api/v1/datasets/ | A list of available datasets (JSON format only) (restricted datasets will only be shown if user is authenticated) |
| dataset | https://protari.abs.gov.au/api/v1/datasets/MADIP_2011_SYNTH | The metadata for a dataset (JSON format only) |
| group_by | https://protari.abs.gov.au/api/v1/datasets/MADIP_2011_SYNTH/aggregation?group_by=madip_sex | Results (unit counts by default) broken down according to the value in a particular field (or fields) (default format JSON) |
| csv group_by | https://protari.abs.gov.au/api/v1/datasets/MADIP_2011_SYNTH/aggregation/csv?group_by=madip_sex,madip_state | As above in CSV format |
| where | https://protari.abs.gov.au/api/v1/datasets/MADIP_2011_SYNTH/aggregation?where=madip_state=7 | Returns only results for units meeting a condition (or conditions) |
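
The example queries above follow a regular pattern, so the aggregation URLs can be assembled in code. The sketch below is one possible approach: the helper `aggregation_url` is hypothetical, and joining `group_by` and `where` with `&` in a single query is an assumption based on ordinary query-string conventions rather than something this page confirms.

```python
from urllib.parse import quote

BASE_URL = "https://protari.abs.gov.au/api/v1"
FORMATS = (None, "csv", "sdmx-json")  # None means the default JSON output


def aggregation_url(dataset, group_by=(), where=None, fmt=None):
    """Build an aggregation URL in the style of the example queries."""
    if fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    url = f"{BASE_URL}/datasets/{quote(dataset, safe='')}/aggregation"
    if fmt:
        url += f"/{fmt}"
    params = []
    if group_by:
        # Several fields are separated by commas, as in the CSV example.
        params.append("group_by=" + ",".join(group_by))
    if where:
        # The condition is passed through as written, e.g. "madip_state=7".
        params.append("where=" + where)
    # ASSUMPTION: multiple parameters are combined with "&".
    return url + ("?" + "&".join(params) if params else "")


by_sex = aggregation_url("MADIP_2011_SYNTH", group_by=["madip_sex"])
by_sex_state_csv = aggregation_url(
    "MADIP_2011_SYNTH", group_by=["madip_sex", "madip_state"], fmt="csv"
)
one_state = aggregation_url("MADIP_2011_SYNTH", where="madip_state=7")
```

The three URLs built here match the group_by, csv group_by and where examples, and can be opened in a browser or fetched with any HTTP client.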

For more information, see Data61's Protari Documentation. The ABS is currently using Protari v1.2.0.

Find out more and apply for access

Try out the synthetic data API in Protari. To apply for access to real data and find out more about Protari, contact protari@abs.gov.au.
