1406.0.55.007 - DataLab, User Guide  
Latest ISSUE Released at 11:30 AM (CANBERRA TIME) 25/03/2021  First Issue

DataLab input and output clearance

Load file into DataLab
Request file load
DataLab output clearance
Request output


Request input clearance

If you have your own data, code or files that you would like to use in DataLab, they need to be approved before they can be loaded. This is known as input clearance. Examples of inputs include:
  • data - aggregated data, tables, microdata and classifications
  • code - user written code and packages
  • other files - Word documents and PDFs

To request input clearance, use the Request file load button at the top of this page. We aim to respond to your request within two business days. Larger, more complex loads are likely to take longer.

If the Request file load button does not generate an email, use the template below to submit your request.

To: microdata.access@abs.gov.au

Subject: Request DataLab input file load

Dear DataLab team

I would like to load the attached file(s) to my DataLab project.

Project name:
File type (e.g. code or data):
Description of each file:

Additional information required for each data file:
  • organisation/individual owner of the data:
  • source of the data (include website link if applicable):
  • any terms of use or licensing that applies to the data that may restrict its use in the ABS DataLab and require additional permissions or conditions:


Preparing your output for clearance

Outputs from DataLab must be approved by ABS before they can be released. You must not remove anything (data, code, notes, etc.) from the DataLab yourself.

Before you ask for output clearance, apply the appropriate DataLab output rules to each statistic.

Descriptive statistics

Frequency tables
  • Rule of 10
  • Group disclosure rule
  • Consequential suppression

Magnitude tables, means, totals, indices, indicators, proportions, measures of concentration
  • Rule of 10
  • Dominance rules
  • Group disclosure rule
  • Consequential suppression

Ratios
  • Rule of 10
  • Dominance rules
  • Group disclosure rule
  • Consequential suppression
  • If the ratio is calculated at the business or individual level, the ratio is treated as another variable on the dataset and the (1,50) and (2,67) dominance rules apply as usual
  • If the ratio is in the form of aggregate/aggregate, the (1,50) and (2,67) dominance rules apply to the numerator and denominator separately. If either the numerator or the denominator fails, the ratio is suppressed
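
As an illustration of the aggregate/aggregate case, the sketch below applies the (1,50) and (2,67) dominance rules to the numerator and denominator contributions separately and suppresses the ratio if either fails. It is a minimal sketch, not an ABS tool: the function names `passes_dominance` and `aggregate_ratio` and the sample figures are hypothetical, and the other rules listed for ratios (such as the rule of 10) still need to be checked separately.

```python
def passes_dominance(values):
    """True if the contributions pass both the (1,50) and (2,67) rules."""
    # Work with absolute values so negative contributions are handled.
    magnitudes = sorted((abs(v) for v in values), reverse=True)
    total = sum(magnitudes)
    if total == 0:
        return False  # assumption: a cell with no contribution cannot pass
    top1 = magnitudes[0]
    top2 = sum(magnitudes[:2])
    return top1 * 100 <= 50 * total and top2 * 100 <= 67 * total


def aggregate_ratio(numerators, denominators):
    """Return sum(numerators) / sum(denominators), or None if suppressed."""
    if not (passes_dominance(numerators) and passes_dominance(denominators)):
        return None
    return sum(numerators) / sum(denominators)


# Hypothetical unit-level contributions to each aggregate.
profits = [12, 9, 11, 10, 8, 10, 9, 11, 10, 10]
turnover = [100, 90, 110, 95, 105, 100, 98, 102, 100, 100]

print(aggregate_ratio(profits, turnover))                # 0.1
print(aggregate_ratio([500] + profits[1:], turnover))    # None: numerator fails (1,50)
```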

Maximums, minimums
In place of actual maximums and minimums, and subject to the minimum contributors for percentiles, use:
  • 99th and 1st percentiles
  • 95th and 5th percentiles
  • 90th and 10th percentiles

Quantiles (including median, quartiles, quintiles, deciles, percentiles)
  • Minimum contributors for percentiles

Box plot
  • The same rules apply as for quartiles, maximums and minimums
  • Minimum contributors for percentiles

Mode
  • Rule of 10

Higher moments of distributions/measures of spread (including variance, covariance, kurtosis, skewness)
  • Rule of 10

Graphs, pictorial representations of actual data
  • Not normally released if showing individual observations


Correlation and regression analysis

Regression coefficients, and summary and test statistics
  • Minimum 10 degrees of freedom
  • R-squared ≤ 0.8

For regressions that have a continuous dependent variable and only categorical independent variables, the regression will return the average of each category. In this case:
  • Rule of 10
  • Dominance rules
  • Provide a cross-tab of the independent variables. Each cell must have at least 10 observations.
  • Each cell in the cross-tab needs to be tested for the (1,50) and (2,67) dominance rules for the dependent variable.
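
The cross-tab count check described above could be sketched as follows. This is only an illustration under stated assumptions: the record layout, the variable names (`sex`, `state`) and the helper names are hypothetical, and the sketch covers the count check only, not the dominance test on the dependent variable. Combinations with no observations do not appear in the result at all, so they need to be reviewed separately.

```python
from collections import Counter


def crosstab_counts(records, variables):
    """Count observations for each observed combination of the categorical variables."""
    return Counter(tuple(record[v] for v in variables) for record in records)


def cells_below_minimum(counts, minimum=10):
    """Return the cross-tab cells with fewer than `minimum` observations."""
    return {cell: n for cell, n in counts.items() if n < minimum}


# Hypothetical unit records with two categorical independent variables.
records = (
    [{"sex": "F", "state": "NSW"}] * 12
    + [{"sex": "M", "state": "NSW"}] * 4
)

counts = crosstab_counts(records, ["sex", "state"])
print(cells_below_minimum(counts))  # {('M', 'NSW'): 4}
```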

Hazard models
  • Rule of 10
  • There must be at least 10 'failures'

Estimation residuals
  • Not normally released
  • Provide justification

Correlation coefficients
  • Rule of 10


Unit records

Print, list or other commands that produce unit record level data
  • Prohibited


Rules of thumb

Rule of 10
  • Each cell/statistic should have at least 10 (unweighted) contributors
  • Provide unweighted counts
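
A minimal sketch of a rule of 10 check on a table of unweighted counts is shown below. The function name and the sample counts are hypothetical. The sketch flags every cell with a count under 10, including empty cells, which is an assumption, since the treatment of empty cells is not covered here.

```python
def rule_of_10_failures(unweighted_counts, minimum=10):
    """Return the labels of cells whose unweighted count is below the minimum."""
    return [label for label, n in unweighted_counts.items() if n < minimum]


# Hypothetical unweighted contributor counts per cell.
counts = {"NSW": 42, "VIC": 17, "TAS": 6}
print(rule_of_10_failures(counts))  # ['TAS']
```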

Dominance rules
  • (1,50) rule: the largest contributor of a cell/statistic should not exceed 50% of the total for that cell/statistic
  • (2,67) rule: the two largest contributors of a cell/statistic should not exceed 67% of the total for that cell/statistic
  • Where a cell/statistic includes negative values, replace them with their absolute values, then take the largest one (or two) absolute value(s) and calculate the (1,50) and (2,67) statistics as contributions to the total of absolute values
  • Provide evidence
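
The two dominance statistics can be computed along the following lines, which may also help when preparing the evidence. This is a sketch rather than an official implementation: `dominance_check` and the sample values are hypothetical, and "should not exceed" is read as passing when the share is exactly 50% or 67%.

```python
def dominance_check(values):
    """Compute the (1,50) and (2,67) statistics for one cell's contributions."""
    # Absolute values are used so that negative contributions are handled.
    magnitudes = sorted((abs(v) for v in values), reverse=True)
    total = sum(magnitudes)
    if total == 0:
        raise ValueError("total of absolute values is zero")
    top1 = magnitudes[0]
    top2 = sum(magnitudes[:2])
    return {
        "top1_share": top1 / total,
        "top2_share": top2 / total,
        "passes_1_50": top1 * 100 <= 50 * total,
        "passes_2_67": top2 * 100 <= 67 * total,
    }


# Hypothetical contributions to one cell, including a negative value.
result = dominance_check([40, -30, 20, 10])
print(result["passes_1_50"], result["passes_2_67"])  # True False
```

In this example the largest contributor holds 40% of the total of absolute values, but the two largest together hold 70%, so the cell fails the (2,67) rule.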

Group disclosure rule
  • In all tabular and similar outputs, no cell should contain 90% or more of the column or row total
  • Provide evidence
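
One way to screen a two-way table against the group disclosure rule is sketched below. The function name and the sample table are hypothetical. The sketch assumes a table with at least two rows and two columns, and it skips zero cells, which is an assumption rather than something stated in the rule.

```python
def group_disclosure_failures(table, threshold_percent=90):
    """Return (row, column) indexes of cells holding 90% or more of a row or column total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    failures = []
    for i, row in enumerate(table):
        for j, cell in enumerate(row):
            if cell == 0:
                continue  # assumption: empty cells are not tested
            fails_row = cell * 100 >= threshold_percent * row_totals[i]
            fails_col = cell * 100 >= threshold_percent * col_totals[j]
            if fails_row or fails_col:
                failures.append((i, j))
    return failures


# Hypothetical frequency table: rows are regions, columns are categories.
table = [
    [95, 5],
    [40, 30],
]
print(group_disclosure_failures(table))  # [(0, 0)]
```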

Minimum contributors for percentiles

Percentile    Minimum contributors
0.01          500
0.05          100
0.10          50
0.25          20
0.50          10
0.75          20
0.90          50
0.95          100
0.99          500
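
The table above can be encoded as a simple lookup, as in the sketch below. The names are hypothetical, percentiles are keyed as whole numbers (1 for 0.01, 50 for 0.50), and percentiles that are not listed in the table raise an error rather than being guessed at, since no minimum is given for them here.

```python
# Minimum contributors by percentile, keyed as whole-number percentiles.
MIN_CONTRIBUTORS = {
    1: 500, 5: 100, 10: 50, 25: 20, 50: 10,
    75: 20, 90: 50, 95: 100, 99: 500,
}


def percentile_has_enough_contributors(percentile, n_contributors):
    """True if the percentile meets its minimum number of contributors."""
    if percentile not in MIN_CONTRIBUTORS:
        raise ValueError(f"no minimum listed for percentile {percentile}")
    return n_contributors >= MIN_CONTRIBUTORS[percentile]


print(percentile_has_enough_contributors(50, 12))   # True
print(percentile_has_enough_contributors(95, 60))   # False
```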

Consequential suppression

If one or more of the rules of thumb fail and suppression is applied, one or more additional cells should be suppressed so that the value of the primary suppressed cell cannot be worked out.

When the rule of 10 fails, suppress enough additional cells that someone with access to multiple tables about the same sample cannot use those tables to deduce the values of cells with fewer than 10 observations.

When the dominance rules fail: if area11 + area12 + area13 = area1 and a cell in area11 is suppressed, then the same cell in area12 and/or area13 must also be suppressed, so that both dominance rules pass for the combined suppressed cells.
The same applies to any other relationship between cells. Examples include:
  • Industry11 + Industry12 + Industry13 = Industry1
  • variable1 + variable2 + variable3 = variable4
  • (variable1 - variable2) / variable1 = variable3
  • variable1 / variable2 = variable3
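
To see why a single suppression is not enough when cells add up to a published total, consider the sketch below. The function name and figures are hypothetical. It only shows that one hidden cell can be recovered by subtraction. It does not test whether the combined suppressed cells pass the rules of thumb, which still has to be checked.

```python
def recover_suppressed(total, published_parts):
    """Recover a suppressed part (marked None) by subtraction, if exactly one is hidden."""
    hidden = [part for part in published_parts if part is None]
    if len(hidden) != 1:
        return None  # zero hidden cells, or too many to solve by subtraction
    return total - sum(part for part in published_parts if part is not None)


# Industry11 + Industry12 + Industry13 = Industry1 (hypothetical counts).
print(recover_suppressed(120, [70, None, 45]))    # 5: the suppressed cell is exposed
print(recover_suppressed(120, [70, None, None]))  # None: a second suppression blocks subtraction
```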

Minimum 10 degrees of freedom
  • All modelled output should have at least 10 degrees of freedom
  • Degrees of freedom = number of observations - number of parameters - other restrictions of the model
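
The degrees of freedom formula above is simple arithmetic, sketched below with hypothetical function names and figures. How to count "other restrictions" depends on the model and is left to the analyst.

```python
def degrees_of_freedom(n_observations, n_parameters, n_other_restrictions=0):
    """Degrees of freedom = observations - parameters - other model restrictions."""
    return n_observations - n_parameters - n_other_restrictions


def meets_minimum_df(n_observations, n_parameters, n_other_restrictions=0, minimum=10):
    """True if the model has at least the minimum degrees of freedom."""
    return degrees_of_freedom(n_observations, n_parameters, n_other_restrictions) >= minimum


# Hypothetical model: 25 observations, 12 parameters, 4 other restrictions.
print(degrees_of_freedom(25, 12, 4))  # 9
print(meets_minimum_df(25, 12, 4))    # False
```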


Request output clearance

To request output clearance:
  1. Make sure you have applied the output clearance rules of thumb.
  2. Move your output to the Output drive (in new DataLab) or the Clearance folder on H drive (in old DataLab).
  3. Use the Request output button at the top of this page. If the Request output button does not generate an email, use the template below to submit your request.

We aim to respond to your output clearance request within two business days. Larger, more complex outputs are likely to take longer.

To: microdata.access@abs.gov.au

Subject: Request DataLab output clearance

Dear DataLab team

I have saved my output to the Output drive (new DataLab)/Clearance folder (old DataLab) for ABS review.

Project name:
Output file name(s):
Data file(s) used (e.g. BLADE1617_CORE):
Description of the original and self-constructed variables:
Description of the analysis:

Additional requirements are listed below:
  • Weighted outputs: I have included the unweighted frequencies in my output.
  • Graphs/charts: I have included the underlying numbers used to produce the graphs/charts.
  • I have included any relevant code and log files.



