ABS TRIAL OF PROTARI
The ABS is conducting a user trial of new software called Protari. Protari aims to provide relevant, customisable analytical interfaces for users while maintaining privacy. The ABS is inviting participation from analysts associated with Australian Commonwealth or State Government agencies.
Protari:
- is an Application Programming Interface (API) that enables an authorised user to generate aggregate tables that are confidentialised on-the-fly from underlying unit-record data
- allows you to dynamically integrate your analysis into broader analysis at your own desktop or into data visualisation tools
- is being developed collaboratively with Data61 under the National Innovation and Science Agenda's Platforms for Open Data Program
There are two ways to access Protari:
- query the Application Programming Interface (API) directly using your own code/software, or
- use the Protari Table Interface.
PARTICIPATION IN THE PROTARI TRIAL
If you are an analyst associated with a Commonwealth or State Government agency, you may apply to participate in the Protari trial. The ABS will assess your application and, if it is approved, invite you to take part. For further information, contact email@example.com.
Consider applying to participate in the Protari trial if:
- you are able to query the API directly and want to benefit from its programmability and machine readability, or
- you want to use the standard Protari Table Interface to run tabular analysis.
In either case, apply to participate only if:
- your analysis is for statistical or research purposes
- your needs can be met through tabular analysis, and
- you accept that perturbed output is fit for your purpose (see the 'Perturbation to protect confidentiality' section below).
To be invited to participate in the Protari trials, an applicant must:
- be registered, and joined to their organisation, in the ABS Registration Centre, agreeing to comply with the Registration Centre's Conditions of use
- submit an 'Application to Participate in the Protari Trials' to firstname.lastname@example.org. This application will include agreeing to the 'Terms and conditions to access Protari as a participant in the ABS Trial', and
- be willing to provide feedback to inform future development.
QUERYING THE PROTARI API DIRECTLY
Querying the API directly to produce tabular output is suitable if you already use compatible software (such as R or Python) in your own environment, and:
- you are interested in programming queries into your own analysis/code, or
- you are interested in embedding dynamic queries into a customised user interface (e.g. data visualisation tool) for other authorised users.
For more information on how to use the API directly, see Data61's Protari documentation.
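As a rough illustration of what programming a query into your own code might look like, the Python sketch below builds a request URL for an aggregate table. The host, path, parameter names and dataset name are all placeholders invented for this example and are not Protari's actual API; Data61's Protari documentation is the authority for the real request format and authorisation steps.

```python
from urllib.parse import quote, urlencode

# Hypothetical placeholder: not the real Protari host or path.
BASE_URL = "https://protari.example.org/api"


def build_query_url(dataset, group_by, function="count"):
    """Build a GET URL asking for one aggregate table.

    'function' and 'group_by' are assumed parameter names used only
    for illustration.
    """
    params = [("function", function), ("group_by", list(group_by))]
    path = f"{BASE_URL}/{quote(dataset, safe='')}/aggregation"
    # doseq=True repeats the key once for each grouping variable.
    return f"{path}?{urlencode(params, doseq=True)}"


url = build_query_url("sample_dataset", ["SEX", "AGE_GROUP"])
print(url)
```

Keeping URL construction in one small function means the same query can be reused from a script, a notebook or a visualisation tool, and only that function needs to change once the real request format is known.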
USING THE PROTARI TABLE INTERFACE
The Protari Table Interface is suitable for users who have a research purpose to analyse data within Protari and are more comfortable with a user interface than writing code. For more information on how to use the Table Interface, see Using the ABS Protari Table Interface.
PERTURBATION TO PROTECT CONFIDENTIALITY
To minimise the risk of identifying individuals in aggregate statistics, outputs derived through Protari are perturbed. That is, small random perturbations (or changes) are applied to individual cells within results while the information value of the table as a whole is retained. The ABS considers perturbation to be the most satisfactory technique for avoiding the release of identifiable data while maximising the range of information that can be released. Perturbation is considered necessary due to the flexible nature of the possible queries, the amount of detail in the underlying dataset, and the potential for results from multiple queries to be compared. When interpreting results from Protari, consider that:
- while perturbation introduces random errors, it does so with almost no bias
- running the same query multiple times will result in the same perturbation and therefore the same results
- some relationships between different estimates (e.g. adding to 100%) may not be preserved exactly
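The last two points can be illustrated with a toy Python sketch. It is not the ABS method: the keyed hash, the secret, the cell labels and the bound of plus or minus 2 are all assumptions chosen for illustration. It shows only how a perturbation tied to the cell, rather than to the moment the query is run, gives repeatable results, and why independently perturbed cells and totals need not agree exactly.

```python
import hashlib


def perturb_count(true_count, cell_key, max_shift=2, secret="demo-secret"):
    """Add a small pseudo-random integer that depends only on the cell.

    Toy scheme for illustration; max_shift and secret are assumed values.
    """
    digest = hashlib.sha256(f"{secret}|{cell_key}".encode()).digest()
    # Map the hash to an integer in [-max_shift, +max_shift].
    shift = int.from_bytes(digest[:8], "big") % (2 * max_shift + 1) - max_shift
    return max(0, true_count + shift)


# Made-up counts for a single table.
true_counts = {"Employed": 120, "Unemployed": 15, "Not in labour force": 65}

table = {k: perturb_count(v, k) for k, v in true_counts.items()}
table_again = {k: perturb_count(v, k) for k, v in true_counts.items()}
print(table == table_again)  # True: the same query gives the same output

# The total is perturbed on its own, so the cells need not add up to it
# and percentage shares need not sum to exactly 100.
total = perturb_count(sum(true_counts.values()), "Total")
shares = {k: 100 * v / total for k, v in table.items()}
print(sum(table.values()), total, round(sum(shares.values()), 1))
```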
Protari may not be suitable for all types of research. It is the responsibility of each researcher to assess the fitness of data for their intended purposes.
- It is expected that, for most queries, perturbation will have a smaller impact on data accuracy than other forms of error (e.g. coverage error, response error and linkage error).
- Although cells may appear to contain none, or all, of the relevant population, this is not necessarily a reflection of the true value of the cell.
- No reliance should be placed on data cells with small values, particularly where the total table population is also small.
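To see why small cells deserve this caution, consider how large a fixed-size perturbation is relative to the cell it is applied to. The Python sketch below assumes, purely for illustration, that a cell can shift by at most 2; the actual size of ABS perturbations is not stated here.

```python
def max_relative_error(true_count, max_shift=2):
    """Largest possible shift as a fraction of the true count.

    max_shift=2 is an assumed illustrative bound, not an ABS parameter.
    """
    return max_shift / true_count


for count in (3, 30, 3000):
    print(f"count {count}: up to {max_relative_error(count):.1%}")
# count 3: up to 66.7%
# count 30: up to 6.7%
# count 3000: up to 0.1%
```

The same absolute change that is negligible in a cell of several thousand can dominate a cell of three, which is why small cells should not be relied on.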
The methods used to calculate the perturbations and other confidentiality protections are very similar to those used in ABS TableBuilder (see the Confidentiality page of the TableBuilder User Guide). The ABS uses the 'Five Safes Framework' to protect the privacy of individuals when releasing data. Perturbation is only one element of privacy protection within this broader framework. Further information on perturbation can be found in the 'Managing the Risk of Disclosure: Treating Aggregate Data' section of the ABS Confidentiality Series, while more information about the Five Safes Framework is in the 'Managing the Risk of Disclosure: The Five Safes Framework' section of the same publication.
AVAILABLE DATA IN PROTARI
Users in the trial can currently apply to use Protari to analyse data from the following Multi-Agency Data Integration Project (MADIP) products:
- MADIP Basic Longitudinal Extract, 2011-2016 (2011 Cohort)
- MADIP Basic Longitudinal Extract, 2011-2016 (2011-2016 Cohorts)
For information about using these data products via Protari, see the 'Using Protari' section in Microdata: Multi-Agency Data Integration Project, Australia (cat. no. 1700.0).