DataLab
Description

Analyse the most detailed microdata in the secure DataLab for your statistical research or modelling, find out about costs and how to access

Released
4/11/2021
Content
What is DataLab
DataLab is the analysis solution for high-end users who want to undertake real-time, complex analysis of detailed microdata. See Compare data services.
Features, view and analyse unit record information recent versions of analytical software, including R, SAS, Stata and Python virtual access to files that r
Who can access DataLab, Researchers who meet ABS safe people criteria, including: ability to use at least one of the statistical analytical languages available in the D
Detailed microdata in the DataLab, designed specifically for use within the DataLab environment direct identifiers (such as names and addresses) removed further appropriate confiden
Limited release detailed microdata in the DataLab, Some datasets have been released on a limited basis. This includes BLADE and custom MADIP detailed microdata, which are available
Cost
Approved users can access standard detailed microdata in DataLab for approved projects. This includes: ABS survey and census coll
DataLab charges - 1 July 2022 to 30 June 2023, Standard DataLab access incurs an annual fee. This fee is based on the number of users with virtual machine access in a project. The fee covers a
Applying for and using DataLab
Step 1. Register and activate your account, Register and agree to conditions of use Use your organisation email address when registering to join your organisation automatically
Step 2. Submit project proposal, Project proposals Projects must be for statistical and/or research purposes and provide public benefit Projects must not be for comp
Step 3. Seek approval, User approval You must have statistical experience or be referred by an experienced researcher on your project team You must be able to use th
Step 4. Complete safe researcher training, See DataLab safe researcher training to enrol You can register to complete your training while we are considering your project proposal
Step 5. Submit documentation, There are legal documents you must sign before accessing microdata in the DataLab Your organisation's CEO (or equivalent) signs the Respons

Topics

List of detailed microdata files available in DataLab, links to publications and data item lists

Released
8/11/2021

Detailed microdata files and reference periods in DataLab are listed below. For datasets in other systems see MicrodataDownload and TableBuilder, or all topics in Available microdata and TableBuilder.

You need to apply for access by submitting a DataLab project proposal before you can access these files.

Use Ctrl+F (Windows) or Command+F (Mac) to search this list.
Economy
Business Longitudinal Analysis Data Environment (BLADE)
Business Characteristics, 2004-05 to 2009-10
Business Characteristics, 2006-07 to 2010-11
Business Characteristics, 2008-09 to 2012-13
Business Characteristics, 2009-10 to 2013-14
Business Characteristics, 2010-11 to 2014-15
Business Characteristics, 2011-12 to 2015-16
Innovation in Australian Business, 2003
Management and Organisational Capabilities of Australian Business, 2015-16
 
Labour
Australians' Employment and Unemployment Patterns, 1994-97
Employee Earnings and Hours, 2006, 2010, 2012, 2014, 2016, and 2018
Employee Earnings and Jobs, 2011-12
Employment Arrangements and Superannuation, 2000
Employment Arrangements, Retirement and Superannuation, April to July 2007
Labour Force Survey and Employee Earnings Benefits and Trade Union Membership, 2006
Labour Force Survey and Employee Earnings Benefits and Trade Union Membership, 2008
Labour Force Survey and Employee Earnings Benefits and Trade Union Membership, 2010
Labour Force Survey and Forms of Employment Survey, 2008
Labour Force Survey and Labour Mobility, 2008
Labour Force Survey and Labour Mobility, 2010
Labour Force Survey and Labour Mobility, 2012
Longitudinal Labour Force, monthly data from 1982
Participation, Job search and Mobility, 2015-2022
Pregnancy and Employment Transitions, 2005
Pregnancy and Employment Transitions, 2011
Qualifications and Work, 2018-19
Work-Related Injuries, 2009-10
 
People
Adult Literacy and Life Skills, 2006
Australian Census and Migrants, 2011
Australian Census and Migrants, 2016
Australian Census and Temporary Entrants, 2016
Australian Census Longitudinal Dataset, 2006-2011
Australian Census Longitudinal Dataset, 2006-2016
Australian Census Longitudinal Dataset, 2011-2016
Census of Population and Housing, 2001
Census of Population and Housing, 2006
Census of Population and Housing, 2011
Census of Population and Housing, 2016
Characteristics of Recent Migrants, 2007 and 2010
Child Care, 1999
Child Care, 2002
Child Care, 2005
Childhood Education and Care, 2008
Childhood Education and Care, 2011
Crime and Safety, 2002
Crime and Safety, 2005
Crime Victimisation, 2009-10
Education and Training, 2005
Education and Training, 2009
Education and Work, 2016, 2017
Family Characteristics and Transitions, 2006-07
Family Characteristics, 2003
Family Characteristics, 2009-10
General Social Survey, 2002
General Social Survey, 2006
General Social Survey, 2010
General Social Survey, 2014
Household Expenditure, Income and Housing, 2003-04 including Fiscal Incidence Study
Household Expenditure, Income and Housing, 2009-10 including Fiscal Incidence Study
Household Expenditure, Income and Housing, 2015-16 including Fiscal Incidence Study
Income and Housing, 2000-01
Income and Housing, 2002-03
Income and Housing, 2005-06
Income and Housing, 2007-08
Income and Housing, 2011-12
Income and Housing, 2013-14
Income and Housing, 2017-18
Income and Housing, 2019-20
Multi-Agency Data Integration Project (MADIP), 2011-2016
Multipurpose Household Survey, 2004-05
- Household Use of Information Technology
- Barriers and Incentives to Labour Force Participation
- Retirement and Retirement Intentions
Multipurpose Household Survey, 2005-06
- Household Use of Information Technology
- Participation in Sports and Physical Recreation
- Attendance at Selected Cultural and Leisure Venues and Events
- Sports Attendance
- Work-Related Injuries
Multipurpose Household Survey, 2006-07
- Adult Learning
- Barriers and Incentives to Labour Force Participation
- Retirement and Retirement Intentions
- Household Use of Information Technology
- Family Characteristics and Transitions
Multipurpose Household Survey, 2007-08
- Environmental Views and Behaviour
- Household Use of Information Technology
- Personal Fraud
Multipurpose Household Survey, 2008-09
- Crime Victimisation
- Barriers and Incentives to Labour Force Participation
- Retirement and Retirement Intentions
- Household Use of Information Technology
Outcomes from Vocational Education and Training in Schools, 2006-2011
Participation in Sport and Physical Recreation, 2009-10
Participation in Sport and Physical Recreation, 2011-12
Personal Fraud, 2007-08
Personal Income of Migrants, annually from 2009-10 to 2016-17
Personal Safety Survey, 2005
Personal Safety, 2012
Personal Safety, 2016
Preschool Education, annually from 2013 to 2021
Programme for the International Assessment of Adult Competencies, 2011-12
Time Use, 1997
Time Use, 2006
 
Health
Australian Health Survey, Core Content - Risk Factors and Selected Health Conditions, 2011-12
Disability, Ageing and Carers, 2015
Disability, Ageing and Carers, 2018
Mental Health and Wellbeing, 2007
Mental Health and Wellbeing, 2020-21
Mortality, Enhanced Characteristics, 2011-12
National Aboriginal and Torres Strait Islander Health Survey, 2004-05
National Aboriginal and Torres Strait Islander Health Survey, Core Content - Risk Factors and Selected Health Conditions, 2012-13
National Aboriginal and Torres Strait Islander Health Survey, Detailed Conditions and Other Health Data, 2012-13
National Aboriginal and Torres Strait Islander Health Survey, Nutrition and Physical Activity, 2012-13
National Aboriginal and Torres Strait Islander Health, 2018-19
National Aboriginal and Torres Strait Islander Social Survey, 2002
National Aboriginal and Torres Strait Islander Social Survey, 2008
National Aboriginal and Torres Strait Islander Social Survey, 2014-15
National Aboriginal and Torres Strait Islander Survey, 1994
National Health Indigenous, 2001
National Health Survey, 2001
National Health Survey, 2004-05
National Health Survey, 2007-08
National Health Survey, 2011-12
National Health Survey, 2014-15
National Health Survey, 2017-18, 2020-21
Nutrition and Physical Activity, 2011-12
Patient Experiences, 2018-19, 2019-20, 2020-21
Smoker Status, 2017-18, 2020-21
 
Environment
Household Energy Consumption, 2012

Safe researcher training

Registering your interest to attend DataLab training, and training resources

Released
19/11/2021

What is safe researcher training

DataLab safe researcher training must be undertaken before you can use the DataLab or be approved on a project:

  • training enables new users to become approved DataLab researchers
  • available as face-to-face training via ABS offices, in most capital cities
  • also available as virtual training

Training covers:

  • your shared responsibilities as a DataLab user
  • meeting your legislative requirements
  • appropriate output for ABS clearance and data release

The training does not include:

  • using the system
  • statistical capability training
  • code or analytical language training

Before enrolling in training, check who can access the DataLab to make sure:

  • you understand the pre-requisites for accessing DataLab
  • you, your organisation and your project are eligible for DataLab access

Safe researcher training and DataLab access are only available to researchers located in Australia, in Australian organisations. International researchers and organisations will be considered on a case-by-case basis.

Refresher training

Researchers need to undertake refresher training because:

  • key operations, such as output checking procedures and rules, change over time
  • people can become complacent about complying with appropriate behaviours in the DataLab
  • it reinforces the need to refresh your skills and knowledge about implementing safe researcher practices
  • it ensures you remain aware of your responsibilities and obligations when using the DataLab

The refresher training policy requires:

  • all active users and discussants to complete the online Safe Researcher DataLab video modules every two years, or sooner if instructed to by the ABS
  • if you are already working on active projects, you will be registered for online training but any new access applications can still be processed before you complete the refresher training
  • if you are not already working on active projects or have moved organisations since previously completing training, you will be registered for online training and new accesses will not be approved until you have completed and passed the online training

How to register your interest

Training should be timely - only enrol when you have submitted your project proposal or have requested to join a project. 

To register your interest, click on the email link at the top of this page.

If the button does not generate an email, use the template below to submit your request. We will acknowledge your email within 2-3 working days.

To: microdata.access@abs.gov.au

Subject: Register interest in DataLab safe researcher training

Dear DataLab Client Support team

I would like to enrol in an upcoming DataLab safe researcher training session.

Full name: 
Organisation: 
Preference for face-to-face or virtual training: 
Base location: 
Contact phone number/s (mobile): 


DataLab project name and number: 
Have you been added to the project proposal by the project lead? 
Has the project lead submitted the updated project proposal to the ABS?
If the project has not commenced yet, what is the indicative start date: 

Safe researcher training resources

The attached slides are presented during the DataLab Safe Researcher Training. We will also email the slides to you with other materials after you have completed training.

Part 1 - Working together to enable microdata access

Part 2 - Maintaining data confidentiality

Part 3 - Statistical disclosure control

DataLab safe researcher training Parts 1 and 2.docx

DataLab safe researcher training Part 3.docx

You should also read Responsible use of ABS microdata user guide to understand your responsibilities as a safe researcher.

Using DataLab responsibly

Roles and expected behaviours for being a safe researcher in the ABS DataLab

Released
4/11/2021

Roles and expected behaviours

ABS

  • encourages, promotes and supports the use of data for research and/or statistical purposes
  • provides training on guidelines and compliance requirements for safe researchers and safe use of data
  • provides a secure environment for flexible and wide-ranging microdata access to meet researchers' needs
  • provides a range of statistical packages and updates
  • provides adequate metadata
  • manages the authorisation, provision and removal of access to microdata
  • provides researchers with the principles and rules for safe outputs
  • checks outputs and provides advice on how to make outputs non-disclosive
  • responds to questions relating to the data, processes and systems, in a timely manner
  • respects researchers' academic independence
  • monitors and audits DataLab use to ensure compliance with procedures and legislative requirements

Lead researchers

  • submit your research proposal to the ABS by following the steps in the DataLab User Guide
  • provide an updated project proposal to reflect changes to the team, scope, project time frames and/or data requirements - use the 'Document history' section on page 2 of the project proposal to advise us of any changes
  • support your research team to adhere to DataLab safe researcher practices and behaviours, and build a culture of best practice within your team
  • provide feedback on the outcomes of your project and experience with the ABS microdata and DataLab upon project closure
  • provide ABS with two weeks' notice before release and then provide a link to any published research stemming from the project's findings
  • advise the DataLab team immediately of any suspected incidents in the DataLab, including data security or procedural failures
  • support the ABS in communicating key messages with your research team
  • adhere to any relevant requirements of analysts

Approved project team researchers/analysts

As an approved project team analyst, you have access to the DataLab and may discuss uncleared data with other approved analysts or discussants on your project team. You must:

  • meet all on-boarding requirements, including:
    • completing the safe researcher DataLab training
    • confirming you are willing to have your name, organisation, microdata you have access to, projects, and links to resultant papers published from the research on a register on the ABS website (unless otherwise agreed in advance)
    • confirming you belong to an organisation that has a Responsible Officer undertaking in place with the ABS
    • signing and agreeing to the conditions in the individual undertaking and other associated paperwork (including the Declaration of Compliance)
    • confirming in writing that you are not currently restricted from accessing government data, or any other data due to misuse of data or a breach of data policy/procedures
    • declaring you have at least three years’ quantitative research experience or university study with a significant component working with quantitative data, or if this is not possible comply with the pre-requisite skills and/or experience expected of approved researchers
    • having experience with at least one of the statistical analytical languages available in the DataLab
  • comply with ABS protocols and instructions for access and use of microdata in the DataLab
  • access only the microdata you have been approved to access - if you can access data that you believe you should not be able to, contact the DataLab team immediately
  • inform the ABS if you leave the project team or if you leave your organisation
  • only access the DataLab from a private location with a secure internet connection, not from public networks or spaces
  • protect your work area and screen from oversight by others, including unauthorised colleagues, family, children and pet cams
  • not screen share your DataLab session or content, even when meeting only with approved researchers on your project team
  • keep passwords for the DataLab secure
  • not share DataLab login credentials
  • not attempt to identify individuals or organisations within data held in the DataLab
  • not attempt to match DataLab unit record data with any other list, database or repository of persons or organisations
  • not attempt to avoid, override or otherwise circumvent the system or procedures
  • not transcribe or copy anything from the DataLab prior to output clearance (this includes screen sharing or any written or photographic form)
  • not transcribe or copy anything from the DataLab prior to output clearance to share it with any other researcher (approved or not) or with ABS personnel or the DataLab Client Support Team
  • only use the shared project space within the DataLab to share uncleared work with approved project researchers or to communicate with ABS DataLab support areas about uncleared data
  • not send outputs to be cleared to the DataLab team - you must request output clearance
  • not attempt to link two microdata files within the DataLab at the unit record level based on matching characteristics, except where linking keys have been provided and the files are designed to be linked
  • not deliberately attempt to identify individual or organisational respondents or mishandle a spontaneous recognition event
  • be aware that data confidentiality is your responsibility when submitting outputs for review - see Confidentiality in ABS microdata and output guidelines for more information
  • report any security incidents or procedural failures to the DataLab team immediately via microdata.access@abs.gov.au and cc the lead researcher

    Approved project team researchers/discussants

    As an approved project team discussant, you do not have access to the DataLab but may discuss uncleared data with other approved analysts or discussants on your project team. You must:

    • meet all on-boarding requirements, including completion of the safe researcher DataLab training and signing of all relevant undertakings
    • seek approval from the ABS, via the project lead, if you wish to become a DataLab analyst
    • adhere to any relevant requirements of project team researchers/analysts

    Guiding principles

    External communication

    We encourage you to communicate as much as possible within the DataLab environment.

    If you need to communicate via other means, consider what is to be communicated and how the communication will take place to ensure that you do not inadvertently remove uncleared data from the DataLab.

    Managing communication

    • You can use notes within the DataLab to leave messages for other approved project team members. Let them know that you have left a note and would like them to view it.
    • You can talk to your supervisor/s about the data, if they are approved researchers or discussants on the project, but consider the environment and who is around.
    • Phone calls and video conferencing may be used for discussions but never share your screen.
    • Do not transmit any uncleared DataLab output in an email, including with ABS personnel or the DataLab Client Support Team. Instead, let your approved project team colleague or the ABS know and ask them to view the issue within the DataLab. Similarly your approved colleagues or the ABS need to leave their responding information within the DataLab and let you know that there is information for you within the project in the DataLab.

    Remote access

    ABS trusts and supports approved researchers who remotely access the DataLab.

    Remote access is permitted under the following conditions:

    • It must be used in a work or private location.
    • The screen must be protected from oversight by any other person. This includes password-protecting your screen, should you move away from your computer.
    • A secure internet connection must be used:
      • A secure internet connection means any Wi-Fi that is password protected (e.g. work, home, your hotel room, hotspotting from your phone)
      • A non-secure internet connection means an open or public connection like a restaurant/cafe, airport, public transport, hotel lobby or shopping mall
    • Overseas access to DataLab is not allowed under any circumstances.
    • Working in the DataLab from home is supported by the ABS but you are responsible for checking and complying with your organisation's requirements for working from home.
    • Do not use any type of internal messaging system which may have external server connections.
    • The DataLab screens are to be kept secure at all times whether you are working within your organisation or from home.

    Further information is available in the Responsible use of ABS microdata user guide.

    Requirements to become an approved researcher

    Pre-requisite skills and/or research experience required of approved researchers

    To be an approved DataLab researcher, you must have the analytical research experience to carry out quantitative data research or analysis in the DataLab. This includes the ability to use at least one of the statistical analytical languages supported in the DataLab. You may have acquired this experience by working on research, analytical or statistical projects. For example, a person employed for three years in a relevant field (such as a university researcher, research assistant, or a government or non-government employee working in research or statistics) who worked for around half of their time on quantitative research projects would have spent a significant component of their time working with quantitative data.

    You may also have qualifications (either an undergraduate or higher degree) with a significant proportion of mathematics or statistics. A significant proportion of the degree should cover research method components and analytical fields, including:

    • qualitative data collection and research design, interviewing skills, conducting focus groups and ethnographic methods
    • quantitative data collection and research design, questionnaire design, sampling and weighting
    • hypothesis testing and evaluation
    • undertaking systematic reviews
    • data analysis, including data linkage, imputation and presentation of results
    • application of ethics to research

    Other relevant undergraduate degrees may include psychology, demography, social policy, sociology, political science, geography, economics, and social statistics. If you have postgraduate qualifications, you may combine multiple degrees to ensure you meet this requirement. This is a cumulative requirement.

    If you do not meet the above criteria but still want to access the DataLab, you may request a referral by an authorised researcher who is on the same research team as you. The referring researcher must meet all of the following requirements:

    • have at least three years of either quantitative research or analysis experience or university study with a significant component working with quantitative data
    • be working on the same project within the DataLab as the less experienced researcher
    • agree to directly supervise and take responsibility for the work of the less experienced researcher
    • have the agreement of a Senior Executive from the less experienced researcher’s organisation for this referral

    Download the undertaking, declaration and referral forms.

    The ABS does not provide support to researchers relating to statistical analytical languages or coding issues.

    Failing to comply with DataLab conditions of use

    As an approved researcher, you have signed appropriate documentation agreeing to comply with data access provisions under relevant legislation, whenever you access detailed microdata in the DataLab.

    If you suspect that you or others in your team may have failed to comply with a microdata undertaking, immediately cease the behaviour, notify the lead researcher and email microdata.access@abs.gov.au as soon as possible.

    For further information, see Consequences of failing to comply with a microdata undertaking in the Responsible use of ABS microdata user guide.

    Citing the ABS

    Information and research using ABS data must be acknowledged.

    See How to cite ABS sources to correctly reference ABS material.

    Input and output clearance

    Requesting input and output clearance, output rules

    Released
    19/11/2021

    Outputs from DataLab must be approved by ABS before they can be released. You must not remove anything (data, code, notes, etc.) from the DataLab yourself.

    Before you ask for output clearance, apply the appropriate DataLab output rules to each statistic.

    Request output clearance

    Request input clearance
     

    Output rules

    Rule of 10

    • Each cell/statistic should have at least 10 (unweighted) contributors
    • Provide unweighted counts
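
As an illustration of the rule of 10, the following minimal Python sketch counts unweighted contributors per cell and lists the cells that fall below the threshold. The function name and the sample records are hypothetical and are not part of any ABS tooling.

```python
from collections import Counter

MIN_CONTRIBUTORS = 10  # threshold taken from the rule of 10


def cells_failing_rule_of_10(categories):
    """Return {cell: unweighted count} for cells with fewer than 10 contributors."""
    counts = Counter(categories)  # one entry per unit record, no weights applied
    return {cell: n for cell, n in counts.items() if n < MIN_CONTRIBUTORS}


# Hypothetical unit records: one category label per respondent.
records = ["NSW"] * 25 + ["VIC"] * 12 + ["TAS"] * 4
failing = cells_failing_rule_of_10(records)
print(failing)  # {'TAS': 4}
```

Cells returned by this check would need suppression or regrouping before an output clearance request.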

    Dominance rules

    • (1,50) rule: the largest contributor of a cell/statistic should not exceed 50% of the total for that cell/statistic
    • (2,67) rule: the two largest contributors of a cell/statistic should not exceed 67% of the total for that cell/statistic
    • Replace negative values with absolute values, take the largest one (two) absolute value(s) and calculate the (1,50) and (2,67) statistics for the contribution to the total of absolute values
    • Provide evidence
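
The dominance bullets above can be sketched in Python as follows. This is a minimal illustration, assuming unit-level contributions are available as a list of numbers; the function names and the sample profits are hypothetical. Negative values are replaced by their absolute values before the shares are calculated, as the third bullet describes.

```python
def dominance_shares(values):
    """Return (top-1 share, top-2 share) of the total of absolute values."""
    magnitudes = sorted((abs(v) for v in values), reverse=True)
    total = sum(magnitudes)
    if total == 0:
        raise ValueError("total of absolute values is zero; shares are undefined")
    return magnitudes[0] / total, sum(magnitudes[:2]) / total


def passes_dominance(values):
    """True when both the (1,50) and (2,67) rules are satisfied."""
    top1, top2 = dominance_shares(values)
    return top1 <= 0.50 and top2 <= 0.67


# Hypothetical business profits ($M), including one loss.
profits = [4.0, -3.0, 2.0, 1.0]
top1, top2 = dominance_shares(profits)
print(round(top1, 2), round(top2, 2), passes_dominance(profits))  # 0.4 0.7 False
```

In this example the largest contributor passes the (1,50) rule, but the two largest contributors together hold 70% of the total, so the (2,67) rule fails.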

     

    Applying dominance rules

    The dominance rule applies to tables that present magnitude or continuous variables such as income or turnover. This does not apply to categorical variables or counts. The rule is designed to prevent the re-identification of units that contribute a large percentage of a cell's total value, which could in turn reveal information about individuals, households or businesses. The cell dominance rule defines the number of units that are allowed to contribute a defined percentage of the total. 

    DataLab has a (1,50) and (2,67) rule. This means that the top contributor cannot contribute more than 50% of the total value to a cell and the top 2 contributors cannot contribute more than 67% of the total value to a cell.

    Dominance testing is required if any mean, total, ratio, proportion or measure of concentration statistic can be calculated for continuous or magnitude variables.

    While ratios/proportions can be continuous, if the numerator and denominator of the ratios/proportions are counts, we do not need dominance statistics.

    It is also required when there is a regression with a continuous dependent variable and categorical independent variables. In this case, every combination of categorical variables (crosstab) will need to be tested for dominance against the dependent variable.

    The table below shows an example of the additional information that analysts need to provide for output clearance when requesting a mean, total, ratio, proportion or measure of concentration.

    There are multiple instances where the (1,50) and (2,67) rules are violated. For example:

    The top contributor in LGA 3 contributes 2.51/3.22 = 78% of the total.

    This violates the (1,50) rule.

    The top 2 contributors in LGA 3 contribute 3.03/3.22 = 94% of the total.

    This violates the (2,67) rule.

    LGA 4 also violates both rules (72% and 87%).

    You may also need to apply consequential suppression to your table so suppressed values cannot be derived.

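    | LGA | Total Profit ($M) | Top 1 Contributor ($M) | Top 2 Contributors ($M) | Top 1 Contribution to Total Profit (%) | Top 2 Contribution to Total Profit (%) |
    |---|---|---|---|---|---|
    | 1 | 1.65 | 0.51 | 0.82 | 31 | 50 |
    | 2 | 0.94 | 0.11 | 0.15 | 12 | 16 |
    | 3 | 3.22 | 2.51 | 3.03 | 78 | 94 |
    | 4 | 2.1 | 1.52 | 1.83 | 72 | 87 |
    | 5 | 2.05 | 0.5 | 0.8 | 24 | 39 |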
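
The percentages in the example table can be reproduced from its dollar columns. The sketch below is illustrative only: it copies the five rows of the table, recomputes the two contribution percentages, and records which LGAs exceed the 50% and 67% limits.

```python
# (LGA, total profit, top 1 contributor, top 2 contributors), all in $M,
# copied from the example table.
rows = [
    (1, 1.65, 0.51, 0.82),
    (2, 0.94, 0.11, 0.15),
    (3, 3.22, 2.51, 3.03),
    (4, 2.10, 1.52, 1.83),
    (5, 2.05, 0.50, 0.80),
]

failures = {}
for lga, total, top1, top2 in rows:
    broken = []
    if top1 / total > 0.50:
        broken.append("(1,50)")
    if top2 / total > 0.67:
        broken.append("(2,67)")
    if broken:
        failures[lga] = broken
    print(f"LGA {lga}: top 1 = {top1 / total:.0%}, top 2 = {top2 / total:.0%}")

print(failures)  # {3: ['(1,50)', '(2,67)'], 4: ['(1,50)', '(2,67)']}
```

LGAs 3 and 4 fail both rules, so their cells would need to be suppressed, along with any consequential suppression required.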


     

    Group disclosure rule

    • In all tabular and similar outputs, no cell should contain 90% or more of the column or row total
    • Provide evidence
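
A minimal sketch of the group disclosure check for a two-way table of counts, held as a list of rows. The function name and the sample counts are hypothetical; the 90% threshold is the one stated in the rule above.

```python
def group_disclosure_failures(table, threshold=0.90):
    """Return (row, col) positions whose cell holds >= threshold of its row or column total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    failures = []
    for i, row in enumerate(table):
        for j, cell in enumerate(row):
            in_row = row_totals[i] > 0 and cell / row_totals[i] >= threshold
            in_col = col_totals[j] > 0 and cell / col_totals[j] >= threshold
            if in_row or in_col:
                failures.append((i, j))
    return failures


# Hypothetical counts: rows are regions, columns are two categories.
counts = [
    [95, 5],
    [40, 60],
    [30, 70],
]
print(group_disclosure_failures(counts))  # [(0, 0)]
```

Here the first cell holds 95% of its row total, so it is flagged. Note that this sketch assumes a table with at least two rows and two columns; in a single-column table every cell would trivially equal its row total.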

    Minimum contributors for percentiles

    | Percentile | Minimum contributors |
    |---|---|
    | 0.01 | 500 |
    | 0.05 | 100 |
    | 0.10 | 50 |
    | 0.25 | 20 |
    | 0.50 | 10 |
    | 0.75 | 20 |
    | 0.90 | 50 |
    | 0.95 | 100 |
    | 0.99 | 500 |
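
The percentile thresholds above lend themselves to a simple lookup. The following sketch is an assumption about how an analyst might encode them; the function name is hypothetical, and percentiles not listed in the table are rejected rather than guessed.

```python
# Thresholds copied from the table of minimum contributors for percentiles.
MIN_CONTRIBUTORS = {
    0.01: 500, 0.05: 100, 0.10: 50, 0.25: 20, 0.50: 10,
    0.75: 20, 0.90: 50, 0.95: 100, 0.99: 500,
}


def percentile_allowed(percentile, n_contributors):
    """True when a listed percentile has enough unweighted contributors."""
    if percentile not in MIN_CONTRIBUTORS:
        raise KeyError(f"no threshold listed for percentile {percentile}")
    return n_contributors >= MIN_CONTRIBUTORS[percentile]


print(percentile_allowed(0.50, 35))  # True: a median needs at least 10
print(percentile_allowed(0.95, 35))  # False: the 95th percentile needs at least 100
```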

    Minimum 10 degrees of freedom

    • All modelled output should have at least 10 degrees of freedom
    • Degrees of freedom = number of observations - number of parameters - other restrictions of the model
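
The degrees of freedom formula above is simple arithmetic, sketched here in Python. The function names and example figures are hypothetical; how to count "other restrictions" depends on the model and is left to the analyst.

```python
def degrees_of_freedom(n_observations, n_parameters, n_other_restrictions=0):
    """Degrees of freedom = observations - parameters - other restrictions."""
    return n_observations - n_parameters - n_other_restrictions


def meets_minimum_df(n_observations, n_parameters, n_other_restrictions=0, minimum=10):
    """True when the modelled output has at least `minimum` degrees of freedom."""
    return degrees_of_freedom(n_observations, n_parameters, n_other_restrictions) >= minimum


print(degrees_of_freedom(25, 4), meets_minimum_df(25, 4))  # 21 True
print(degrees_of_freedom(12, 4), meets_minimum_df(12, 4))  # 8 False
```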

    Consequential suppression

    If one or more of the rules fail and suppression is applied, one or more additional cells should be suppressed to protect the value of the primary suppressed cell from being worked out.

    In the case of the rule of 10 failing, suppression must ensure that someone with access to multiple tables regarding the same sample cannot use those tables to deduce the values of cells with fewer than 10 observations.

    In the case of the dominance rules failing, if area11 + area12 + area13 = area1, and a cell in area11 is suppressed, then the same cell in area12 and/or area13 also needs to be suppressed such that both dominance rules pass for the combined suppressed cells.

    Likewise, for any other relationships. Examples include:

    • Industry11 + Industry12 + Industry13 = Industry1
    • variable1 + variable2 + variable3 = variable4
    • (variable1 - variable2) / variable1 = variable3
    • variable1 / variable2 = variable3
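
To see why a single suppressed cell is not enough in an additive relationship, consider the sketch below. It is a hypothetical check, not an ABS procedure: given published components that sum to a published total, it reports any suppressed cell that can be recovered by subtraction.

```python
def derivable_cells(components, total):
    """Return {name: derived value} for a suppressed cell recoverable by subtraction.

    `components` maps cell names to published values, with None marking suppression.
    """
    suppressed = [name for name, value in components.items() if value is None]
    if total is None or len(suppressed) != 1:
        return {}  # nothing suppressed, or too many unknowns to solve exactly
    known = sum(value for value in components.values() if value is not None)
    return {suppressed[0]: total - known}


# Hypothetical values where area11 + area12 + area13 = area1 = 100.
print(derivable_cells({"area11": None, "area12": 40, "area13": 25}, 100))    # {'area11': 35}
print(derivable_cells({"area11": None, "area12": None, "area13": 25}, 100))  # {}
```

This only detects exact recovery through one additive relationship. It does not test whether the combined suppressed cells pass the dominance rules, which the guidance above also requires.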

    Preparing your output for clearance

    Descriptive statistics

    Frequency tables
    • Rule of 10
    • Group disclosure rule
    • Consequential suppression
    Magnitude tables, means, totals, indices, indicators, proportions, measures of concentration
    • Rule of 10
    • Dominance rules
    • Group disclosure rule
    • Consequential suppression
    Ratios
    • Rule of 10
    • Dominance rules
    • Group disclosure rule
    • Consequential suppression
    • If the ratio is calculated at the business or individual level, the ratio is treated as another variable on the dataset and the (1,50) and (2,67) dominance rules apply as usual
    • If the ratio is in the form of aggregate/aggregate, the (1,50) and (2,67) dominance rules apply to the numerator and denominator separately. If either the numerator or the denominator fails, the ratio is suppressed
    Maximums, minimums

    Subject to minimum contributors for percentiles, use:

    • 99th and 1st percentiles
    • 95th and 5th percentiles
    • 90th and 10th percentiles
    Quantiles (including median, quartiles, quintiles, deciles, percentiles)
    • Minimum contributors for percentiles
    Box plot
    • Same rules apply as per quartiles, maximums and minimums
    • Minimum contributors for percentiles
    Mode
    • Rule of 10
    Higher moments of distributions/measures of spread (including variance, covariance, kurtosis, skewness)
    • Rule of 10
    Graphs, pictorial representations of actual data
    • Not normally released if showing individual observations

    Correlation and regression analysis

    Regression coefficients, and summary and test statistics
    • Minimum 10 degrees of freedom
    • R-squared ≤ 0.8

    For regressions that have a continuous dependent variable and only categorical independent variables, the regression will return the average of each category. In this case:

    • Rule of 10
    • Dominance rules
    • Provide a cross-tab of the independent variables. Each cell must have at least 10 observations.
    • Each cell in the cross-tab needs to be tested for the (1,50) and (2,67) dominance rules for the dependent variable.
    Hazard models
    • Rule of 10
    • There must be at least 10 'failures'
    Estimation residuals
    • Not normally released
    • Provide justification
    Correlation coefficients
    • Rule of 10
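
    As an illustration of the aggregate/aggregate ratio rule above, the following Python sketch applies the (1,50) and (2,67) dominance rules to the numerator and the denominator separately. It assumes that a cell breaches an (n, k) rule when its n largest contributors account for more than k% of the cell total, and that contributions are non-negative. The function names and the data are hypothetical, and the sketch is only a self-check, not a substitute for review by the output clearance team.

```python
def top_share(values, n):
    """Share of the total contributed by the n largest values (0.0 if the total is 0)."""
    total = sum(values)
    if total == 0:
        return 0.0
    return sum(sorted(values, reverse=True)[:n]) / total


def passes_dominance(values, rules=((1, 0.50), (2, 0.67))):
    """True if no (n, k) rule is breached. Assumption: a breach means share > k."""
    return all(top_share(values, n) <= k for n, k in rules)


def ratio_or_suppress(numerator_values, denominator_values):
    """Return the aggregate/aggregate ratio, or None if it should be suppressed."""
    if not passes_dominance(numerator_values):
        return None  # numerator fails, so the ratio is suppressed
    if not passes_dominance(denominator_values):
        return None  # denominator fails, so the ratio is suppressed
    denominator_total = sum(denominator_values)
    if denominator_total == 0:
        return None  # ratio is undefined
    return sum(numerator_values) / denominator_total


# Hypothetical unit-level contributions for 12 businesses in one cell.
profit = [10] * 12
turnover_even = [100] * 12
turnover_dominated = [500] + [40] * 11  # one business contributes 500 of 940

print(ratio_or_suppress(profit, turnover_even))
print(ratio_or_suppress(profit, turnover_dominated))  # None: denominator fails (1,50)
```

    The rule of 10 and the group disclosure rule still need to be checked separately for the same cell.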
       

    How to apply dominance rule and rule of 10 for regression

    Example 1: Linear Regression

    A linear regression was run to predict income by age and health status:

    Age was binned into three categories: <18 years, >18 and < 30 and > 30 years, where <18 was the reference category.

    Health status was categorised according to healthy or unhealthy, where unhealthy was the reference category.

    Suppose the desired output was a regression summary below:

                     Beta Coefficient  P Value
    Constant         1.5               0.001
    Age >18 and <30  2                 0.004
    Age >30          3                 0.002

    N=1000, R-Squared=0.67

    We should provide a crosstabulation of counts and a dominance table for the output clearance team.

    Crosstabulation of Counts

                     Unhealthy  Healthy
    Age <18          15         30
    Age >18 and <30  40         70
    Age >30          60         89

    Counts for each combination of variables are greater than 10. The rule of 10 is satisfied.

    Dominance Table

    We should provide a dominance table for the output clearance team like below:

    Please note: Only the columns Top 1 and Top 2 Contribution to Total Income are required. The other columns are presented to illustrate the calculation. This table is also usually presented in one long spreadsheet.

    Unhealthy
                     Total Income  Top 1 Income  Top 2 Income  Top 1 Contribution to Total Income  Top 2 Contribution to Total Income
    Age <18          $1,500        $500          $900          500/1,500 = 33%                     900/1,500 = 60%
    Age >18 and <30  $130,000      $55,000       $85,000       55,000/130,000 = 42%                85,000/130,000 = 65%
    Age >30          $1,000,000    $520,000      $600,000      520,000/1,000,000 = 52%             600,000/1,000,000 = 60%

    Healthy
                     Total Income  Top 1 Income  Top 2 Income  Top 1 Contribution to Total Income  Top 2 Contribution to Total Income
    Age <18          $2,500        $1,000        $1,300        1,000/2,500 = 40%                   1,300/2,500 = 52%
    Age >18 and <30  $230,000      $155,000      $200,000      155,000/230,000 = 67%               200,000/230,000 = 87%
    Age >30          $2,000,000    $600,000      $900,000      600,000/2,000,000 = 30%             900,000/2,000,000 = 45%

    There are multiple instances where the (1,50) and (2,67) rules are violated. Adjustments to the regression output will need to be applied before it can be cleared. The most common suggestion is to suppress the constant/intercept.
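
    The checks in Example 1 can be sketched in Python as follows. The counts and the Total Income, Top 1 Income and Top 2 Income figures are taken from the example tables, and the contribution shares are recomputed from those income columns rather than copied from the percentage columns. The sketch assumes that a cell needs at least 10 observations and that an (n, k) dominance rule is breached when the share is greater than k%. The names cells and check_cell are illustrative only.

```python
# (count, total income, top 1 income, top 2 income) for each cell of the
# crosstab of the independent variables, copied from the example tables.
cells = {
    ("Age <18", "Unhealthy"): (15, 1_500, 500, 900),
    ("Age >18 and <30", "Unhealthy"): (40, 130_000, 55_000, 85_000),
    ("Age >30", "Unhealthy"): (60, 1_000_000, 520_000, 600_000),
    ("Age <18", "Healthy"): (30, 2_500, 1_000, 1_300),
    ("Age >18 and <30", "Healthy"): (70, 230_000, 155_000, 200_000),
    ("Age >30", "Healthy"): (89, 2_000_000, 600_000, 900_000),
}


def check_cell(count, total, top1, top2):
    """Return the list of rules a cell fails (an empty list means it passes)."""
    issues = []
    if count < 10:            # rule of 10: at least 10 observations per cell
        issues.append("rule of 10")
    if top1 / total > 0.50:   # (1,50): largest contributor's share of the total
        issues.append("(1,50)")
    if top2 / total > 0.67:   # (2,67): two largest contributors' combined share
        issues.append("(2,67)")
    return issues


# Keep only the cells that fail at least one rule.
failures = {}
for cell, figures in cells.items():
    issues = check_cell(*figures)
    if issues:
        failures[cell] = issues

for cell, issues in failures.items():
    print(cell, "fails", ", ".join(issues))
```

    With these figures the sketch flags the Age >30, Unhealthy cell under the (1,50) rule and the Age >18 and <30, Healthy cell under both dominance rules. No cell fails the rule of 10.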


     

    Unit records

    Print, list or other commands that produce unit record level data

    • Prohibited

    Request output clearance

    To request output clearance:

    1. Make sure you have applied the output clearance rules.
    2. Move your output to the Output drive.
    3. Use the 'Request output clearance' link at the top of this page. If the link does not generate an email, use the template below to submit your request.

    Outputs generally take 2-3 business days to be cleared if all the rules have been followed. Outputs where the rules have been improperly applied will take longer. Large outputs will also take longer. To minimise clearance time, ensure that requests contain only necessary outputs and the rules have been correctly applied.

    To: microdata.access@abs.gov.au

    Subject: Request DataLab output clearance

    Dear DataLab team

    I have saved my output to the Output drive for ABS review.

    Project name:
    Output file name(s):
    Data file(s) used (e.g. BLADE1617_CORE):
    Description of the original and self-constructed variables:
    Description of the analysis:

    Additional requirements are listed below:

    • Weighted outputs: I have included the unweighted frequencies in my output.
    • Graphs/charts: I have included the underlying numbers used to produce the graphs/charts.
    • I have included any relevant code and log files.

    Request input clearance

    If you have your own data, code or files that you would like to use in DataLab, they need to be approved before they can be loaded. This is known as input clearance. Examples of inputs include:

    • data - aggregated data, tables, microdata and classifications
    • code - user written code and packages
    • other files - Word documents and PDFs

    To request input clearance use the 'Request input clearance' link at the top of this page. If the Request input clearance link does not generate an email, use the template below to submit your request.

    We aim to respond to your input clearance request within two to three business days. It is likely to take longer if your request is large, complex or needs clarification.

    To: microdata.access@abs.gov.au

    Subject: Request DataLab input file load

    Dear DataLab team

    I would like to load the attached file(s) to my DataLab project.

    Project name:
    File type (e.g. code or data):
    Description of each file:

    Additional information required for each data file:

    • organisation/individual owner of the data:
    • source of the data (include website link if applicable):
    • any terms of use or licensing that applies to the data that may restrict its use in the ABS DataLab and require additional permissions or conditions:

    Logging into the portal and workspace

    Logging in, activating and starting your VM, first time use/new phone steps, resetting your password

    Released
    19/11/2021

    There are three steps when logging into new DataLab: 

    1. Log into the DataLab portal where you can access information and settings related to your profile, project and start your virtual machine (VM)
    2. Activate, then start the VM for your project
    3. Log into the Citrix DataLab workspace where you and your project teams run analyses and produce output for clearance

    Log into the DataLab portal

    Click Sign in on the DataLab landing page.

    DataLab landing page

    If you are logging in for the first time, you will also need to use the First time use/new phone steps.

    For returning users, click on your account (firstname.lastname@mydata.abs.gov.au) or Use another account and enter your account name. All DataLab accounts use the @mydata.abs.gov.au domain format. Enter your password and Sign in.

    Returning users can click on their account followed by a password to log in

    By logging in you agree to these conditions:

    Important Notice

    If you are not authorised to access this system, exit immediately. Unauthorised users may be subject to criminal and civil penalties.

    This is an Australian Government computer system. Part 10.7 of the Criminal Code Act 1995 outlines the penalties that may apply for unlawful use of Government systems including unauthorised access, modification or impairment of computer systems, data or electronic communications. The Act provides penalties of up to 10 years imprisonment for such offences. By proceeding, you are representing yourself as an authorised user and acknowledge you have read and agree to comply with the Responsible Use of ABS Microdata User Guide. Your activity will be logged, monitored and investigated should any misuse be suspected.

    Sanctions ranging from a reprimand to revocation of access or termination of employment may be imposed if misuse is determined.

    For system security, you need to authenticate your log in using the Microsoft Authenticator app on your mobile phone. To set this up, see First time use/new phone steps.

    A notification from the Microsoft Authenticator app is sent to your phone for you to approve.

    2 factor authentication message

    If you don't approve within the time limit, click Send another request to my Microsoft Authenticator app. If the request expires, re-enter your account and password in the DataLab log in screen.

    2 factor authentication expired message, with send another request option

    You can also change the way you approve the sign in request by selecting Sign in another way.

    Link to alternate sign in method

    You then have two options:

    1. approve a request on your phone app
    2. use a verification code from your phone app
    Alternate sign in method options

    After you approve in Microsoft Authenticator, you are logged into the DataLab portal.

    DataLab portal

    Activate, then start the VM

    To enter your DataLab project workspace you need to:

    1. activate your Virtual Machine
    2. start the VM
    3. launch your desktop

    Step 1 Activate Virtual Machine

    Select Search virtual machines from the My virtual machines tile, or select the laptop icon from the left navigator. For information about other functions you can do within this tile, see Functions in My Virtual Machines.

    Search virtual machines from the My virtual machines tile

    Each project has a separate virtual machine. The virtual machine name is your project number followed by random letters, which provide a unique name assigned to your profile.

    • If your machine already shows a green tick in the Active column, skip to step 2 Start the VM.
    • If not, select the Change active VM button. Don't activate a virtual machine when its status is Building. Wait until the status has changed to Ready.
    • If your machine shows the status of Dormant you must rebuild the VM first. See VM management options for more information.
    My Virtual Machines view

    Select the machine you want to activate from the drop down list

    Change My Active VM view, select a VM to activate

    The Activate VM button becomes active. Select Activate VM.

    Activate VM button

    Confirm your action by selecting Yes. If you have a VM for another project that is currently active, this logs you out of your other session. If you have a program running in your Workspace using another VM, this will stop the program. You can only run multiple VMs if you have requested and are using offline local disk space.

    Activate this VM warning about logging out of existing sessions

    You can track the VM activation progress by either selecting Track from the pop up notification or the i icon on the left navigator.

    Notification: track activate VM progress

    When the VM activation completes, a message confirms that the action was successful. If the action fails, repeat the above steps to activate.

    Notification: Activate VM succeeded

    If you navigated to the Action Log page, select the laptop icon in the left navigator to return to the My Virtual Machines page.

    Step 2 Start the VM

    From the My Virtual Machines page, click on the VM name that is connected to the project you want to work on to view specific settings for this virtual machine. For information about other functions you can do within this tile, see Functions in My Virtual Machines.

    My virtual machines view showing VM name

    The workflow diagram at the top of the page shows you have activated the virtual machine (step 1). You now need to start this machine (step 2).

    Steps to launch this VM view

    In the Power State box, select Start VM, and confirm.

    VM Power State box 'Start VM' button and confirmation

    You can track the Start VM progress in the Action Log by selecting either Track from the pop up notification or the i icon in the left navigator.

    Notification: Start VM in progress

    Step 3 Launch the desktop

    Starting the VM takes several minutes. If you try to access your workspace too early you will not see your virtual machine in the DataLab Workspace. When it has completed, the status changes from In progress to Succeeded. If the action fails, repeat the steps to Start the VM.

    Action Log view

    When the status has changed to Succeeded, navigate to the specific VM page described in Start the VM.

    You are now at step 3 of the VM workflow diagram. Launch the desktop using the link Go to the DataLab Workspace. This opens the Citrix Workspace application.

    Steps to launch this VM workflow diagram. Step 3 launch desktop button

    Log into Citrix DataLab Workspace

    When you click on Go to the DataLab Workspace, the application Citrix Workspace opens. The first time Citrix opens, click Detect Workspace in the pop up message.

    Welcome to Citrix Workspace message

    If you receive an error instead of launching the Citrix Workspace application, you need to download the Citrix receiver application to your device. The latest version of Citrix Workspace is available at https://www.citrix.com/en-au/downloads/workspace-app/windows/.

    Warning dialog, Windows can't open this file

    Select Open Citrix Workspace Launcher.

    Browser alert to open Citrix Workspace Launcher

    For first time users, Citrix provides a tour of the system and its features. If you do not want to take the tour, select maybe later.

    Citrix Workspace virtual tour welcome screen

    If you want to take the tour later, you can access this feature by clicking on the arrow button at the bottom right corner of the screen.

    Citrix Workspace take the tour at any time button

    The Citrix dashboard shows your applications and your recently used desktops. If you do not see any desktops under Recent select the Desktops option in the left navigator. Desktop is the Citrix term for virtual machine.

    Citrix Workspace dashboard

    From the All Desktops view, click the virtual machine/Desktop (this is your project number) to open your project in the DataLab environment.

    Citrix Workspace All Desktops view

    This may open a dialogue box asking if you want to Open or Save the application. Choose Open.

    Open or save popup dialogue box

    It may take a few minutes for Citrix to connect to your DataLab desktop. When completed, your DataLab workspace launches.

    DataLab desktop connecting screen
    Welcome to DataLab screen

    Log in using the same credentials you used to log into the DataLab Portal. Once logged in, the system may take a few minutes to load as it prepares Windows.

    DataLab VM Windows log in screen

    As part of the conditions of use, all activity within the workspace is recorded for auditing and reporting purposes.

    Notification activity in DataLab VM is recorded

    Your DataLab workspace looks like this. See Using your workspace.

    DataLab workspace desktop

    First time use/new phone steps

    New DataLab uses two factor authentication to provide a secure log in environment. You need to download the Microsoft Authenticator application to your smart phone to use the DataLab.

    Open https://new.datalab.abs.gov.au (the temporary url while projects are migrating) and click Sign in.

    The first time you log in, enter the user name and password provided to you by us. If you are using a new phone, continue to use your existing credentials. All DataLab accounts use the @mydata.abs.gov.au domain format.

    DataLab sign in username
    DataLab sign in password

    In Additional security verification, choose Mobile app from the drop-down menu, and Receive notifications for verification. Click the Set up button.

    DataLab additional security verification

    A pop-up appears with 3 steps on how to link your new DataLab account to the Microsoft Authenticator app.

    DataLab configure mobile app 2 factor authentication QR code

    Step 1. Download the Microsoft Authenticator application to your smart phone from the App Store (for iOS) or Google Play (for Android).

    Microsoft Authenticator app on both Apple App and Google Play Stores for mobile devices

    Step 2. Open the app and create a Work or school account.

    Microsoft Authenticator Add account screen

    Step 3. Select Scan the QR code.

    Microsoft Authenticator app scan QR code option

    The QR code appears on your screen. Scan it with your phone.

    Microsoft Authenticator app scan QR code on phone screen

    Click Next in your browser, which returns you to the Additional security verification screen. Select Next again.

    Configure mobile app, Next
    Additional security verification notifications options

    The Additional Security Verification sends a notification to your phone.

    Additional security verification step 2 screen

    From your phone, review the request, Approve sign-in? Choose Approve.

    Approve sign-in screen with Approve or Deny

    The web page is refreshed. Click Done, then Next to change your password.

    Additional security verification communicating with mobile app device

    Set up a new password for your account. Your password cannot contain your user ID. It must be a minimum of 8 characters and contain at least three of the following:

    • upper-case letters A – Z
    • lower-case letters a – z
    • numbers 
    • special characters @ # $ % ^ & * - _ ! + = [ ] { } | \ : ' , . ? / ` ~ " ( ) ;
    Update password screen
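
    If you want to check a candidate password against the rules listed above before entering it, the rules can be sketched in Python as below. This is an illustration only: the function name is hypothetical, the user ID comparison is assumed to ignore case, and the sign in page remains the authority on what is accepted.

```python
import string

# Special characters listed in the password rules above.
SPECIALS = set("@#$%^&*-_!+=[]{}|\\:',.?/`~\"();")


def password_meets_rules(password, user_id):
    """Check length, user ID exclusion and the three-of-four character class rule."""
    if len(password) < 8:
        return False
    # Assumption: the user ID check is case-insensitive.
    if user_id and user_id.lower() in password.lower():
        return False
    classes_present = [
        any(c in string.ascii_uppercase for c in password),  # A – Z
        any(c in string.ascii_lowercase for c in password),  # a – z
        any(c in string.digits for c in password),           # numbers
        any(c in SPECIALS for c in password),                # special characters
    ]
    return sum(classes_present) >= 3


# Hypothetical examples.
print(password_meets_rules("Summer2024", "jane.citizen"))    # True: upper, lower, digits
print(password_meets_rules("alllowercase", "jane.citizen"))  # False: one class only
```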

    After changing your password, you will be prompted to provide more information. Click "Next".

    More information required, Next screen

    To make sure you can reset your password in the future, you also need to set up an authentication phone and/or email.

    • You need to set up at least one in order to proceed.
    • Click Set it up now beside your chosen option.
    Don't lose access to your account screen

    Authentication by phone

    The system verifies your phone by the option you select.

    Select country and phone number

    Select the country or region you are registering in from the drop down menu, then enter your mobile number.

    Enter your authentication phone number

    Once your phone number is entered, choose between receiving a text or a call to activate the verify button. The text option sends a verification code to your phone. Enter the code then select verify. The call option sends you an automated phone call that will ask you to press the # key to verify.

    Verify your authentication phone number

    Authentication by email

    Enter an email address to activate the email me button.

    Enter your authentication email address

    A verification code is emailed to you. Enter the verification code and select verify. Once you have set up your details, select Finish to return to DataLab sign in.

    Verify your authentication email address
    Authentication phone number and email address confirmation screen

    Reset your password

    If you forget your password select the Forgot my password link.

    Enter password screen showing forgot my password link

    Your user ID is populated for you. Enter the characters in the picture, or words in the audio. Click Next.

    Enter the characters shown or click the audio link

    The next screen takes you to Step One of verifying your account. Choose from the options in the left-hand column (options available depend on what you chose during your original account set up process).

    Enter your authentication phone number to get back into your account

    Verify your account information via email, text, or call (whichever you chose), and follow the prompts to reset your password. The screen below displays when your password has been reset. Click the link to sign in with your new password.

    Your password has been reset confirmation screen

    Using your workspace

    Getting started, accessing your data files, available software, locking your workspace and signing out

    Released
    4/11/2021

    Getting started in the DataLab workspace

    When you have successfully logged into the Citrix Workspace, your DataLab workspace looks like this.

    DataLab workspace

    You can use DataLab in a similar way to using other secure networked systems, where you can securely see, use and share data files, analysis and output with the other members of your project team.

    Open File Explorer and click on This PC to see the network drives you have access to:

    • Library: All researchers can see all files in the Library drive. This is where we upload support information, such as statistical language documentation, the ANZSIC classification and this DataLab user guide. Files cannot be saved to this drive.
    • Output: Any output you want the ABS to clear should be saved to this drive. Only members of your team can see this drive. See also Request output clearance. Information is backed up nightly and retained for 14 days.
    • Project: A shared space for your team to work in and store all your project files. Only members of your team can see this drive. Information is backed up nightly and retained for 14 days. The default storage is 1TB. You will need to review and delete unnecessary files as your project files grow over time. If necessary, an increase to this storage can be requested via microdata.access@abs.gov.au. There may be a cost for additional storage.
    • Products: Access data files that have been approved for your project. However, it is best to use the My data products shortcut on your desktop as this shows you only the datasets you have been approved to access, rather than all dataset short names. Files cannot be saved to this drive.
    • LocalDisk: If you have been granted local disk space, this can be used to run jobs on offline virtual machines (desktops). You may want to request this option if you have multiple projects that you are actively involved in. There may be a cost associated with attaching local disk space to your VM. The local disk will only be present if it has been allocated to your VM.
    • Drives A, C and D are not to be used. Information saved here is destroyed with each nightly shutdown and 30 day rebuild.
    Network drives you have access to

    Do not store files in any other folders. Other members of your project cannot see files if you store them in other drives. Files stored outside of the Project and Output drives are destroyed every 30 days as part of DataLab security protocols.

    Refreshing your network drives. If your network drives do not appear in File Explorer, you can click the Refresh Network shortcut on the desktop. A confirmation message appears when this has been successfully refreshed.

    Refreshing your network drives

    Accessing your data files

    To access the data files for your project, use the My Data Products shortcut on your desktop.

    My Data Products shortcut

    The My Data Products folder displays only the products approved for your project.

    My Data Products folder

    Selecting the Products drive shows you the short name of all data loaded to the DataLab. However, if you try to open a file that is not approved for your project you are denied access and receive an error.

    Products drive
    Error message when accessing a file that is not approved for your project

    Available software

    Software can be opened using the shortcuts on your desktop or by using search on the Taskbar.

    All researchers have access to these applications in the DataLab:

    • LibreOffice
    • Acrobat Reader
    • Notepad++
    • QGIS
    • Git (available locally for projects to version their code)
    • R 4.1.1, including:
      • RStudio 1.4.1717
      • RTools 40
    • Python 3.7 (Anaconda3 distribution) including:
      • Jupyter Notebook
      • Spyder

    If required, you can also request:

    • SAS 9.4
    • Stata MP 16

    Microsoft Word and Excel are not currently available, as these applications require an internet connection, which is not supported in a secure system like DataLab. The versions which worked in old DataLab are no longer available. We are reviewing options on how to support this software within our secure system.

    Firefox and Edge are available to support access for Databricks (which is under development) and for Jupyter notebooks to use Python/R. These browsers cannot be used to browse the internet.

    If a package in your statistical software choice is not available, send an email to microdata.access@abs.gov.au describing the package required. Managing your R packages explains how you can manage R packages using the RStudio package manager shortcut on your desktop.

    To open a PDF, right click and choose open with Adobe Acrobat Reader. Due to a default setting in Microsoft, if you double click, the system automatically uses Microsoft Edge to open any PDF file.

    How to open a PDF file

    Managing your R packages

    If you are working with a specific set of R packages, you can manage these using the RStudio package manager shortcut on your desktop.

    RStudio package manager shortcut

    In the R Package Manager, click Get Started to take you to the available packages. You can use this tool to search for packages (in the left column) and install the R packages you want to use for your project. If the packages you need are not listed, email your request to microdata.access@abs.gov.au.

    R Package Manager page where you can check your available packages by clicking Get Started
    List of the available packages in your RStudio package manager

    Sign out or Lock your DataLab session

    When you walk away from your computer or are finished with your DataLab session, you must either lock your work station or sign out of your account to ensure nobody else accesses your DataLab account. Click the Windows menu in the bottom left corner and select the person icon to select Lock or Sign out.

    Signing out or locking your DataLab session

    Lock if you need to leave your computer for a short length of time. To log back in use the dot menu at the top of your virtual session.

    Use the dot menu located at the top of your virtual session to log back in to your locked session

    When expanded, select Ctrl+Alt+Del, and re-enter your credentials.

    Select the Ctrl+Alt+Del icon and re-enter your credentials

    Sign out to leave your workspace session. This closes your session but does not end any programs you have running. Your programs will continue to run until 10pm that night, or longer if you have selected the Bypass option in the portal.

    Signing out returns you to the Citrix workspace portal, where you can either close the browser window or Log Out of the portal using the icon in the top right corner (with your initial). To log back in, see Logging into the portal and workspace.

    Citrix Workspace portal screen where you can close the browser window or Log Out of the portal

    Portal features

    My Virtual Machines, My Accounts and My Projects

    Released
    4/11/2021

    The DataLab portal is where you find information about your DataLab account, projects, and virtual machines.

    DataLab portal

    The DataLab portal displays information in three tiles: 

    • My Virtual Machines to activate, start and launch the VM associated with your project
    • My Account to view your personal contact information and the virtual machines you have access to
    • My Projects to view basic attributes of the project as well as users in each project

    Left navigator menu
    The left navigator menu is open by default and has shortcuts that can be used to navigate between pages when you are not on the home page. Click the arrow to collapse or expand the navigator menu.

    Left navigator menu
    Home

    Returns you to the DataLab portal home page where the three tiles are displayed.

    My Projects, My Virtual Machines and My Account 

    These are shortcuts to the tiles on the home page.

    Action Log 

    Keeps a record of your portal actions. This can help you manage your sessions and provides useful information if you encounter problems with the system. It includes:

    • starting VM
    • stopping VM
    • changing your Active VM
    • restarting VM
    • rebuilding VM
    Global links

    The links at the top right are available from all pages of the portal:

    Your obligations and management responsibilities

    The Responsible Use of ABS Microdata obligations guide helps you understand your obligations and management responsibilities to handle microdata safely. Read the guide and contact microdata.access@abs.gov.au if you would like any help understanding your responsibilities.

    Important messages banner

    This banner appears at the top of your DataLab portal window when we have an important message for your consideration or action.

    Functions in My Virtual Machines

    By selecting either the My Virtual Machines tile or laptop icon you can view your virtual machines. You have one virtual machine for each project you are approved for.

    Accessing your virtual machine using the my virtual machine tile or the laptop icon

    All your virtual machines are listed in a table with details about each machine. You can export this table as a spreadsheet or filter the display by using the quick search feature.

    List of your virtual machines
    • Name is your project number followed by random letters, providing a unique name that is assigned to your profile. This is also a link to the management options for that VM.
    • Build date is the date we assigned you to that project.
    • Type describes the size of machine you will be using. The size allocated to you depends on the amount of data you have access to.
    • Status: don't activate a virtual machine when its status is Building or Dormant. Wait until the status has changed to Ready.
    • Power State shows if the machine has been started.
    • Active indicates which machine is currently active. A VM must be active before you can start it.
    • Local Disk (GB) shows if you have been allocated extra space for running jobs on offline VMs.


    Use the Change active VM button to swap between projects. The virtual machine must show a green tick in the Active column before you start the VM. VMs that are Dormant must be rebuilt before they can be made Active.

    VM management options

    When you have selected the VM associated with your project, click the Name link to see the management options.

    If your VM is Dormant, no management options are available.

    VM management options when VM is dormant

    Select Rebuild now, which prompts a confirmation message. Select Yes to proceed. Rebuilding takes 45 minutes to complete and can be monitored for success from the Action Log.

    When your VM is active, the management options can be selected again.

    VM management options when VM is active

    The Steps to launch this Virtual Machine workflow diagram shows where you are up to when you launch your workspace.

    Basic Attributes

    • Name is your project number followed by random letters, providing a unique name that is assigned to your profile.
    • Project profile is a link to the project details page (also accessible from the My Projects tile)
    • Status of the project (Building, Ready or Destroyed)
    • Date registered is the date we created your project profile
    • Date destroyed is the date the VM was removed from the system
    • VM option type is the size of machine available to you (small, medium, large or x-large). The size assigned depends on the amount of data attached to your project
    • Activated indicates if the VM is ready to be started
    • Local Disk (GB) shows if you have been allocated extra space for running jobs on offline VMs

    Power State: You can start, stop, or restart the VM. This can be helpful if swapping between virtual machines or having difficulties seeing your machine in the Citrix workspace.

    For non-standard machines, you will continue to incur system costs until you click Stop VM (or until the next scheduled shutdown).

    Power State for the VM

    Scheduled Shutdown: Virtual Machines are automatically shut down every night at 10pm AEST. If you have a program running that you expect to run past 10pm, you can choose to extend your session for up to 3 days by selecting Bypass shutdown.

    Bypass shutdown
    Options for extending your VM session
    Confirming the extension of your VM session
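
    To judge whether a job is likely to run past the nightly shutdown, and therefore whether you may want to select Bypass shutdown first, a rough calculation like the Python sketch below can help. It assumes AEST means a fixed UTC+10 offset with no daylight saving adjustment. The function names and the example times are hypothetical, and the portal remains the authority on the actual shutdown schedule.

```python
from datetime import datetime, timedelta, timezone

# Assumption: AEST is treated as a fixed UTC+10 offset (no daylight saving).
AEST = timezone(timedelta(hours=10), "AEST")


def next_shutdown(now):
    """Return the next 10pm AEST strictly after `now` (a timezone-aware datetime)."""
    local = now.astimezone(AEST)
    shutdown = local.replace(hour=22, minute=0, second=0, microsecond=0)
    if shutdown <= local:
        shutdown += timedelta(days=1)  # today's shutdown has already passed
    return shutdown


def finishes_before_shutdown(now, expected_runtime):
    """True if a job started at `now` should end before the next nightly shutdown."""
    return now + expected_runtime <= next_shutdown(now)


# Hypothetical example: a job started at 7pm AEST that needs about 4 hours.
start = datetime(2021, 11, 4, 19, 0, tzinfo=AEST)
print(next_shutdown(start))
print(finishes_before_shutdown(start, timedelta(hours=4)))  # False: consider Bypass shutdown
```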

    Scheduled Rebuild: VMs are automatically destroyed and rebuilt every 30 days for security and maintenance purposes:

    • you cannot extend this time
    • it displays a date and numbered countdown on a coloured bar, adjusted to the time in your local area
    • the coloured bar changes colour, starting with green, moving to orange, and finally red as you get closer to the rebuild date
    • you can choose to rebuild before the scheduled time by selecting Rebuild now
    • after rebuilding, the countdown resets to 30 days and allows you to bypass the nightly shutdown
    • if you try to bypass a shutdown when your machine is scheduled for a rebuild, the system will deny the action, but offer to Rebuild now
    Scheduled Rebuild

    Run jobs on offline VMs

    If you are an analyst who works across multiple projects, you can request local disk space. This will enable your VM to run jobs offline.

    Datasets are stored on a remote file share. Only the active machine has network access to this location. Your inactive virtual machines do not. To run offline jobs, you need to request local disk space to be attached to your machine. There may be a cost associated with this.

    When running jobs offline, the inactive machine can continue to run your program because the data it needs is on the local disk rather than the remote file share. However, working like this does not allow your project team to see your analysis or output. You should always move your output back to your Project or Output drives where your project team can access and review it. See Using your workspace for more on the available drives in DataLab.

    To use local disk space:

    1. Request access to a local disk for your project by emailing microdata.access@abs.gov.au.
    2. Copy the data products you need to the local disk.
    3. In your program, point to your source/input data on the local disk and start the job.
    4. In your program, save your output to the local disk.
    5. Exit Citrix and return to the DataLab portal to activate another machine.
    6. After you have finished running your analysis offline (local disk) move your analysis and output back to your Project drive.
    Local disk space
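
    Steps 2 and 6 above can be scripted. The following Python sketch is an illustration only: the function names stage_inputs and return_outputs are hypothetical, and you must supply the real locations of your data products, local disk and Project drive, which are not assumed here.

```python
import shutil
from pathlib import Path


def stage_inputs(source_dir: Path, local_dir: Path, filenames: list[str]) -> list[Path]:
    """Copy the named data files from source_dir to local_dir (step 2)."""
    local_dir.mkdir(parents=True, exist_ok=True)
    staged = []
    for name in filenames:
        target = local_dir / name
        shutil.copy2(source_dir / name, target)  # the source file is left in place
        staged.append(target)
    return staged


def return_outputs(local_output_dir: Path, project_dir: Path) -> list[Path]:
    """Move every file in local_output_dir to project_dir (step 6)."""
    project_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for item in sorted(local_output_dir.iterdir()):
        if item.is_file():
            destination = project_dir / item.name
            shutil.move(str(item), str(destination))
            moved.append(destination)
    return moved
```

    Run stage_inputs before starting the offline job, and return_outputs once the job has finished and the machine is active again, so that your project team can review the results.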

    Functions in My Account

    Select either the My Account tile or person icon to see details about your account.

    Accessing your account details
    Details about your account

    Basic attributes displays your name, email, phone etc. If your personal details are incorrect, email us at microdata.access@abs.gov.au with the correct information.

    Account settings allows you to opt in or out of receiving email reminders. These reminders let you know when your virtual machine will shut down. If you have started your VM that day, a notification is sent at 5pm AEST/AEDT, before the 10pm scheduled shutdown. You will also receive a reminder before your 30 day VM rebuild. You can change this option at any time.

    Email reminder in Account settings
    Confirming settings update
    Confirmation for settings update

    Functions in My Projects

    Select either the My Projects tile or briefcase icon to see details about your projects:

    • ID (active link) is your project number
    • Project name
    • Organisation shows the lead organisation for the project
    • Status indicates if the project is Building, Open or Destroyed
    • Start date is the date the project is created by us
    • End date is the indicative closure date for the project
    • Closed date will show when the project is actually closed
    • Description is a summary taken from your project proposal
    • Users are the analysts who are approved to access the data
    Accessing details about your projects
    Project details

    Selecting the project ID link displays information about each project:

    • Basic attributes of the project. End date is the indicative project closure date. The default VM type of small is allocated to all analysts when added to a project. Default VMs can be changed after they have been allocated.
    • Project lead contact information.
    • Users lists the DataLab user IDs of all approved researchers in the project. This does not include approved discussants, as they do not have access in DataLab.
    Information about each project

    Recommended browsers

    DataLab is presented in a web browser. It is recommended to use the latest versions of:

    • Chrome
    • Firefox
    • Safari

    Internet Explorer is not recommended.

    Troubleshooting

    Help with logging in, virtual machines, errors and running out of space, code and software

    Released
    4/11/2021

    Logging in

    I can't log in

    • If you have entered your user name or password using copy and paste, you may have accidentally included hidden characters or a space.
    • Your organisation firewall may be blocking access. Try accessing DataLab while disconnected from your organisation's network.
    • The ABS DataLab only supports use of the Microsoft Authenticator app.
    • If you have changed your mobile phone, we need to reset your Microsoft multi-factor authentication. Email microdata.access@abs.gov.au.
    • If you need to reset your password, use the Forgot my password link on the initial DataLab sign in screen.
    • Clear your browser cache.
    • Try a different browser. See Recommended browsers.

    Has my organisation authenticated my access to the DataLab

    DataLab is enabled by cloud infrastructure, which may be blocked by some organisations’ firewall settings.

    ABS cannot make changes to external organisations' infrastructure. Project Leads need to supply the information below to each organisation participating on this project.

    Network/IT Security sections in each organisation need to review and make changes to authenticate access.

    There are 4 steps which need to be applied to each organisation’s security settings before the project start date to enable access to DataLab.

    1. Enable authentication to the tenant

    Users need to authenticate to one of the ABS Azure Active Directory tenants. Access to external tenants may be strictly controlled by government agencies and academic workplaces. Authentication must be enabled to the tenants:

    • mydata.abs.gov.au
    • absmydata.onmicrosoft.com

    2. Allow user access to URLs

    Users will need to access the following URLs:

    • DataLab production portal: datalab.abs.gov.au and gw.datalab.abs.gov.au
    • Citrix portal: absdatalab.cloud.com

    3. 2020 version of Citrix Workspace client installed

    The originating client machine must have a recent version of the Citrix Workspace client installed. It is available from the Citrix Workspace download page.

    4. Enable HTTPS connections

    All Remote Desktop client connections to ABS DataLab go via the Citrix Cloud service. You will need to enable HTTPS connections to all of:

    • *.citrix.com
    • *.cloud.com
    • *.nssvc.net
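
    Network or IT security staff may want a quick way to confirm that the hosts named in step 2 are reachable from inside their network. The following Python sketch is one possible check, not an ABS-provided tool. It only tests whether a TCP connection to port 443 can be opened. It does not verify tenant authentication, and the wildcard domains in step 4 cannot be probed directly. The function name check_https_reachable is hypothetical.

```python
import socket

# Hostnames taken from step 2 above.
REQUIRED_HOSTS = [
    "datalab.abs.gov.au",
    "gw.datalab.abs.gov.au",
    "absdatalab.cloud.com",
]


def check_https_reachable(hosts, connect=socket.create_connection, timeout=5.0):
    """Return {host: bool} showing whether a TCP connection to port 443 opens.

    `connect` is injectable so the check can be exercised without a network.
    """
    results = {}
    for host in hosts:
        try:
            connection = connect((host, 443), timeout)
            connection.close()
            results[host] = True
        except OSError:
            # Covers DNS failures, refused connections and timeouts.
            results[host] = False
    return results
```

    Calling check_https_reachable(REQUIRED_HOSTS) from a machine on the organisation's network returns a dictionary of results. A False value suggests the host is blocked or unreachable and that the firewall settings need review.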

    Why do I have to log in twice during the access process

    The DataLab has more functionality and features available to you, so you can set options as well as undertake your research.

    • First log-in is to the DataLab portal, where you can view and set options for your DataLab account information and virtual machines. Read more in DataLab portal features.
    • Second log-in is to the DataLab workspace where you undertake your analysis.

    How long does my temporary password/password last

    • The temporary password issued to you by the ABS lasts for 90 days. After you have completed the set up steps you must reset your password.
    • If you have forgotten your temporary password, email microdata.access@abs.gov.au for a reset.

    I forgot my password to get into the DataLab portal

    Your log in credentials for the DataLab portal are the same as for the DataLab workspace. You can reset your password by clicking on the Forgot my password link.

    My password expired while my Citrix workspace is running

    Your session will continue until a shutdown is required (either the nightly shutdown or the 30 day rebuild). You can still reset your password while your session is running.

    Virtual machines

    My virtual machine is not launching

    1. You must Activate, then start the VM. Follow the process and wait for each step to complete before progressing.
    2. Check your internet connection. If you have a weak or intermittent connection, this can affect launching your virtual machine.
    3. Try launching the virtual machine outside of your organisation's online environment. Some institutions’ or Government departments’ firewall or other security settings may be preventing access to DataLab portal and/or launching of the virtual machine. Attempting to connect outside your agency’s online environment may assist in forming the VM connection.
    4. A Citrix issue can prevent the VM from launching. Try again after installing the latest version of Citrix Workspace.
    5. Restart your virtual machine. As with restarting a computer, restarting your virtual machine can sometimes resolve problems with launching your machine successfully. From the virtual machine page click the Restart VM button and wait 10 minutes to ensure the reboot of the machine is complete before attempting to launch again.
    6. If you are still having trouble, email microdata.access@abs.gov.au.
    Restart VM button

    What are the virtual machines/desktops

    • Virtual machines, or VMs, are the virtual workspaces you can use to undertake your analysis in the DataLab.
    • Virtual machines are called Desktops in the Citrix Portal.
    • You have one machine for each project. This is a security measure to prevent data from one project being accessed by another project that the same researcher has access to.
    • VMs are created by us as part of the project application process, described in About DataLab.
    • You can run analysis on multiple virtual machines at the same time, but only if you have been granted local disk space. See Run jobs on offline VMs (desktops). You may want to request this option if you have multiple projects that you are actively involved in.

    How large are the different sizes of virtual machines in the DataLab

    • Small (2 core CPU, 8GB memory), intended for supervisors or users who are reviewing code rather than doing their own analysis
    • Medium (2 core CPU, 16GB memory)
    • Large (2 ‘fast’ core CPU, 64GB memory)

    We assign a machine size appropriate for your project, mainly driven by the size of the data approved for it. If you notice poor performance, email microdata.access@abs.gov.au. Larger machines incur higher running costs. With user charging, you may need to consult with your organisation to confirm incurring additional expenses for your project before applying for a larger machine.

    What does it mean for a virtual machine to be Active and why does this matter

    If you are a member of multiple projects in the DataLab, you will have more than one virtual machine. Your Active machine is the one that is connected to the remote file share, where the data files are stored. For security purposes, only one of your sessions can connect to the remote file share at a time. You can activate your virtual machine by using the Change Active VM button.

    Why are virtual machines destroyed every 30 days

    Virtual machines are destroyed approximately every 30 days for security purposes. If the 30 day timing will interfere with the timing of your project, you can choose to destroy and rebuild earlier than 30 days at a time that suits you.

    Is my virtual machine backed up

    Virtual machine project and output drives are backed up every night and kept for 14 days. Files outside of these drives are not recoverable.

    Where do I save the work I have done on a virtual machine that is scheduled to be destroyed

    Save your work to your Project or Output drives to ensure that your analysis is not lost. Information saved outside of these drives is destroyed when your machine is rebuilt every 30 days.

    Can I have multiple virtual machines running code at the same time

    Only if you have requested local disk space to be allocated to a machine. This allows you to run jobs on offline VMs.

    I can't see my project's products

    Try logging out of Citrix, stopping your VM and then beginning the Start VM process again. If that does not work, select Rebuild now from your VM management options.

    Errors and running out of space

    One of my network drives in the analysis environment is missing

    If you cannot see the Library, Project, and Output network drives in File Explorer, go to the desktop and double-click the Refresh Network Drives icon.

    Refresh Network Drives icon

    I got an error while working with data in SAS/Stata/R/Python

    Stata error example

    This means you have exceeded the memory for your virtual machine.

    1. Use an alternative method or program to manipulate or process the dataset. Some processes, programs and methods for working with large datasets are more memory-intensive than others. Try an alternative to see if it uses less memory.

    • Most statistical software tools are able to filter data as it is imported. If your analysis only needs variables a, b and c from a dataset containing 30 variables, then selecting, filtering or importing only these variables uses less memory.
    • If you cannot do this in your software, consider creating a subsetted data file using another tool, such as Python, as the first step of preparing your data for analysis.
    • If you are unsure of alternative methods, we recommend discussing with other researchers in your project team who are more familiar with your chosen statistical software. The ABS does not provide advice or training on using the analytical tools provided to you in the DataLab.

    2. Email microdata.access@abs.gov.au to request a larger machine. Larger machines incur higher running costs. With user charging, you may need to consult with your organisation to confirm incurring additional expenses for your project before applying for a larger machine.
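
    As an illustration of the suggestion above to create a subsetted data file with Python, the following sketch streams a CSV file one row at a time and writes only the selected variables to a new file, so the full dataset is never held in memory. It assumes your data is in CSV format with a header row. The function name subset_columns and the file names in the usage note are hypothetical.

```python
import csv


def subset_columns(source_path, target_path, columns):
    """Stream source_path row by row, writing only `columns` to target_path."""
    with open(source_path, newline="", encoding="utf-8") as src:
        reader = csv.DictReader(src)
        missing = [c for c in columns if c not in (reader.fieldnames or [])]
        if missing:
            raise KeyError(f"columns not found in source file: {missing}")
        with open(target_path, "w", newline="", encoding="utf-8") as dst:
            # extrasaction="ignore" drops every variable not listed in `columns`.
            writer = csv.DictWriter(dst, fieldnames=columns, extrasaction="ignore")
            writer.writeheader()
            for row in reader:
                writer.writerow(row)
    return target_path
```

    For example, subset_columns("full.csv", "subset.csv", ["a", "b", "c"]) keeps variables a, b and c. You can then load the smaller file into your preferred statistical software.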

    I am running out of space in my Project drive

    Clean up the drive contents, review and delete redundant files to free up space.

    Email microdata.access@abs.gov.au to request a storage increase. There may be a cost associated with this.
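
    To find candidates for clean-up, it can help to list the largest files first. The following Python sketch is one way to do this. The function name largest_files is hypothetical, and you need to pass the actual location of your Project drive.

```python
from pathlib import Path


def largest_files(folder, top=10):
    """Return the `top` largest files under `folder` as (size_in_bytes, path) pairs."""
    sizes = []
    for path in Path(folder).rglob("*"):
        if path.is_file():
            sizes.append((path.stat().st_size, path))
    # Sort by size only, largest first.
    sizes.sort(key=lambda pair: pair[0], reverse=True)
    return sizes[:top]
```

    Review the listed files with your project team before deleting anything, as other researchers may still need them.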

    Code and software

    I have some code for one project that I want to use in another project - how do I arrange this

    You can request input clearance for data, code or files to be loaded to your project, from either another project, or other sources that you hold.

    Can I use a mix of SAS, Stata, R and Python for different people in my project team

    Yes, each virtual machine has R and Python as default software. SAS and Stata are not automatically provided on all machines but can be requested as they require a licence to be assigned to your virtual machine. Email microdata.access@abs.gov.au with your request.

    Is cluster processing possible in the DataLab

    Cluster processing is not currently available. We are developing a Databricks service to provide a scalable clustered analytics environment for users.

    Is there a delay between assigning data to a project and users seeing it

    Yes, it takes about 5 minutes to process the connection. You also need to log out of your virtual machine to allow the system to refresh your session with the new data.

    What can I do if my code will run past 10pm tonight

    You can extend your session to bypass the nightly shutdown, by one, two or three nights.
     

    How do I see what R packages I have available and how do I manage these

    Use the RStudio Package Manager shortcut on the DataLab virtual machine desktop to check the range of R packages available to you. See Managing your R packages.

    SAS warning messages

    If the project you opened was saved with SAS Datalab – [machine name], you are connecting to the local SAS server without a profile. When you try to run the project without selecting a profile, the system may present an error message saying "The server "SASMain" is not defined in the current repository". Click through the messages and continue.

    I can’t find the R packages I need in the analysis environment

    1. See Managing your R packages to use the RStudio Package Manager on the desktop.
    2. If the package you need is not listed, email your request to microdata.access@abs.gov.au.

    Double clicking to open a PDF is not working

    Due to a Microsoft default setting, the system automatically uses Microsoft Edge to open any PDF file. To open the file with Adobe Acrobat Reader instead, right-click the file and select Open with > Adobe Reader.

    Launching a PDF file using Adobe Acrobat Reader