This document was added 22/11/2018.
For the full list of data items for the above data products, see the Data Item lists from the Downloads tab within this publication. Note that, for the BLE (2011-2016 Cohorts), Protari provides subsets of the data rather than the full file. More information is provided below.
For information on scope and coverage, defining your population of interest, the linking methodology and weighting - see the Methodology (2011 Cohort) and Methodology (2011-2016 Cohorts) sections within this publication.
ASSESSING THE FITNESS FOR PURPOSE OF DATA
It is the responsibility of each researcher to assess the fitness of data for intended purposes. Protari perturbs (slightly adjusts) data outputs to protect privacy while retaining the information value of the table as a whole (more information below). Data accuracy may also be impacted by statistical or other error, including but not limited to coverage error, response error and linkage error (see 'Methodology' chapters linked above). No reliance should be placed on data cells with small values, particularly where the total table population is also small.
PERTURBATION TO PROTECT CONFIDENTIALITY
To minimise the risk of identifying individuals in aggregate statistics, outputs derived through Protari are perturbed. That is, small random perturbations (or changes) are applied to individual cells within results while the information value of the table as a whole is retained. The ABS considers perturbation to be the most satisfactory technique for avoiding the release of identifiable data while maximising the range of information that can be released. Perturbation is considered necessary due to the flexible nature of the possible queries, the amount of detail in the underlying dataset, and the potential for results from multiple queries to be compared. When interpreting results from Protari, consider that:
Protari may not be suitable for all types of research. It is the responsibility of each researcher to assess the fitness of data for their intended purposes.
The methods used to calculate the perturbations and other confidentiality protections are very similar to those used in ABS TableBuilder (see the Confidentiality page of the TableBuilder User Guide). The ABS uses the 'Five Safes Framework' for protecting the privacy of individuals when releasing data. Perturbation is only one element of privacy protection within this broader framework. Further information on perturbation can be found in the 'Managing the Risk of Disclosure: Treating Aggregate Data' section of ABS Confidentiality Series, Aug 2017 (cat. no. 1160.0) while more information about the Five Safes Framework is in the 'Managing the Risk of Disclosure: The Five Safes Framework' section of the same publication.
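The following toy sketch illustrates why perturbed cells with small values should not be relied on. It is not the ABS perturbation method, which is not specified here; the function name `perturb_table`, the `max_shift` parameter and the cell counts are all hypothetical and chosen only for illustration.

```python
import random

def perturb_table(cells, max_shift=2, seed=0):
    """Toy illustration only: add a small random integer to each cell count.

    This is NOT the method used by Protari or TableBuilder.
    """
    rng = random.Random(seed)
    perturbed = {}
    for label, count in cells.items():
        shift = rng.randint(-max_shift, max_shift)
        perturbed[label] = max(0, count + shift)  # counts never go negative
    return perturbed

# Invented cell counts: one large cell, one very small cell, one mid-sized cell.
cells = {"Group A": 1250, "Group B": 4, "Group C": 310}
perturbed = perturb_table(cells)

for label, count in cells.items():
    relative_change = abs(perturbed[label] - count) / count
    print(label, count, perturbed[label], round(relative_change, 3))
```

In a sketch like this, a shift of one or two is negligible for a cell of 1250 but can be a large share of a cell of 4, which is consistent with the guidance that no reliance should be placed on cells with small values.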
SUBSETS AND SAMPLE FILES FOR BLE (2011-2016 COHORTS)
The BLE (2011-2016 Cohorts) is available as subsets and samples via Protari, rather than in its entirety. Each subset and sample file includes data from the in-scope population, across all six years (2011-2016), for all the General, Scoping, Derived, Census, PIT, SSRI, and Apprentice and Trainee data items. The files differ in which MBS/PBS data items they include and in their geographic availability. For the full list of data items, see the Data Item List from the Downloads tab within this publication. The following subsets and samples are currently available for use by approved users via Protari.
TABLE 1: COMPARING SUBSETS AND SAMPLE FILES FOR BLE (2011-2016 COHORTS)
TABLE 2: SUMMARY MBS AND PBS DATA ITEMS INCLUDED IN THE HEALTH SUMMARY SUBSET
Both the BLE (2011 Cohort) and the BLE (2011-2016 Cohorts) contain a single weight field to correct for incomplete linkage of Census (2011 Census or 2016 Census respectively) to MEDB/MADIP (for further information, see the Methodology (2011 Cohort) or Methodology (2011-2016 Cohorts) sections within this publication). Typically, queries involving Census fields should use these weights so that estimates for the complete Census population are obtained. For queries that do not include fields from Census, more accurate results will usually be obtained by not applying the weights, so that records that are not linked to Census can also be used in the estimation.
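The difference between the two approaches can be sketched with a handful of invented records. This is an illustration under stated assumptions, not the ABS estimation procedure: the field names (`linked`, `weight`, `has_pit_record`), the weight values, and the assumption that unlinked records carry no weight are all hypothetical.

```python
# Toy records with invented values. Assumption: only records linked to
# Census carry a weight; unlinked records have weight None.
records = [
    {"linked": True,  "weight": 1.25, "has_pit_record": True},
    {"linked": True,  "weight": 1.25, "has_pit_record": True},
    {"linked": True,  "weight": 1.25, "has_pit_record": False},
    {"linked": True,  "weight": 1.25, "has_pit_record": True},
    {"linked": False, "weight": None, "has_pit_record": True},
]

def weighted_estimate(records, condition):
    """Sum the weights of linked records that satisfy the condition."""
    return sum(r["weight"] for r in records if r["linked"] and condition(r))

def unweighted_count(records, condition):
    """Count every record that satisfies the condition, linked or not."""
    return sum(1 for r in records if condition(r))

def has_pit(record):
    """A condition that does not involve any Census field."""
    return record["has_pit_record"]

print(weighted_estimate(records, has_pit))  # estimate scaled up from linked records
print(unweighted_count(records, has_pit))   # direct count over all records
```

Here the weighted figure is an estimate scaled up from the four linked records, while the unweighted figure counts all five records directly. For a condition that uses no Census fields, the direct count uses more of the available information.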
In Protari, users choose whether to apply these weights by selecting the corresponding dataset. If the '<dataset name> Weighted' dataset is selected, the weights will be used in calculating the results. If the '<dataset name> Unweighted' dataset is selected, the weights will not be used. The two datasets are otherwise the same.
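As a minimal sketch of this naming convention, the helper below picks a dataset name based on whether a query uses Census fields. The function `select_dataset` and the name 'Example Subset' are hypothetical; only the 'Weighted' and 'Unweighted' suffixes come from the description above, and the rule applied is the typical guidance rather than a requirement.

```python
def select_dataset(dataset_name, uses_census_fields):
    """Return the dataset name to select, following the typical guidance:
    weighted when the query involves Census fields, unweighted otherwise."""
    suffix = "Weighted" if uses_census_fields else "Unweighted"
    return f"{dataset_name} {suffix}"

# 'Example Subset' is a placeholder, not a real Protari dataset name.
print(select_dataset("Example Subset", uses_census_fields=True))
print(select_dataset("Example Subset", uses_census_fields=False))
```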
Within the BLE all source datasets except Personal Income Tax (PIT) have years ending 31 December. For PIT data, a year refers to the financial year ending on 30 June of that calendar year (e.g. '2015' refers to the financial year 2014-15).
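The year convention can be sketched as a small helper that maps a BLE year to the period it covers. The function names and the `source` argument are hypothetical, and the sketch assumes that non-PIT years are calendar years starting on 1 January.

```python
from datetime import date

def reference_period(year, source):
    """Return (start, end) dates for a BLE year.

    PIT years are financial years ending 30 June of the stated year.
    All other sources are assumed to be calendar years ending 31 December.
    """
    if source == "PIT":
        return date(year - 1, 7, 1), date(year, 6, 30)
    return date(year, 1, 1), date(year, 12, 31)

def financial_year_label(year):
    """Format a PIT year in the usual financial-year style, e.g. 2015 as '2014-15'."""
    return f"{year - 1}-{year % 100:02d}"

print(reference_period(2015, "PIT"))
print(financial_year_label(2015))
```

Keeping this offset in mind matters when comparing PIT items with items from other sources for the same nominal year, since the two periods overlap by only six months.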
Although the BLE contains six years of data (2011-2016) for most source datasets, Census and Derived data items relate to a single year only (i.e. 2011 for the BLE (2011 Cohort) and 2016 for the BLE (2011-2016 Cohorts)). In Protari, results relating to data items prefixed 'Census 2011', 'Derived 2011', 'Census 2016' or 'Derived 2016' will always refer to that single year, even if the query is run for a different year.
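One way to keep track of this when interpreting results is a lookup on the data item prefix, sketched below. The helper `effective_year` and the example item names are hypothetical; only the four prefixes come from the text above.

```python
# Prefixes whose data items always refer to a single fixed year.
FIXED_YEAR_PREFIXES = {
    "Census 2011": 2011,
    "Derived 2011": 2011,
    "Census 2016": 2016,
    "Derived 2016": 2016,
}

def effective_year(data_item, query_year):
    """Return the year a result actually refers to for a given data item."""
    for prefix, fixed_year in FIXED_YEAR_PREFIXES.items():
        if data_item.startswith(prefix):
            return fixed_year  # fixed regardless of the year queried
    return query_year

# Item names below are placeholders, not real BLE data item names.
print(effective_year("Census 2016 Example Item", query_year=2013))
print(effective_year("Example Item", query_year=2013))
```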