DataLab

Analyse the most detailed microdata in the secure DataLab for your statistical research or modelling. Find out about charges and how to access DataLab.

Released
4/11/2021


What is DataLab

DataLab is the analysis solution for high-end users who want to undertake complex, real-time analysis of detailed microdata. Compare data services to see if detailed microdata in the DataLab is the right service for you.

Note: Government organisations seeking to leverage the DataLab cloud infrastructure to host their own data and manage their own end-users should consider the Secure Environment for Analysing Data (SEAD) service. 

Features

  • View and analyse unit record information
  • Recent versions of analytical software, including R, SAS, Stata and Python
  • Virtual access to files that remain in the secure ABS environment
  • All analytical output that you want to use outside DataLab is checked by the ABS before release

Who can access the DataLab

Detailed survey and integrated microdata are available for approved projects. Organisational users must be:

  • government employees
  • government contractors and individuals sponsored by government
  • academics
  • researchers from public policy research institutes
  • sponsored by government

All users need to also meet ABS safe people criteria, including researchers who:

  • belong to an Australian organisation (international researchers and organisations will be considered on a case by case basis)
  • belong to an organisation with a Responsible Officer Undertaking (ROU) in place with the ABS
  • are located in Australia when accessing the microdata
  • have completed all relevant undertakings and declarations
  • have the ability to use at least one of the statistical analytical languages available in the DataLab
  • have at least three years of either quantitative research experience or university study with a significant component working with quantitative data, or have a referral from an experienced researcher working on the same project
  • have an approved safe project that is for statistical and/or research purposes and demonstrates public value
  • have completed ABS safe researcher training and refresher training as per ABS refresher policy
  • meet additional criteria that apply to specific microdata.

DataLab system security

The ABS is committed to keeping the ABS DataLab safe and secure. We have a strong data protection culture and extensive experience in keeping data secure as Australia’s national statistical organisation and as an Accredited Data Service Provider. The ABS DataLab is hosted in Microsoft Azure and meets PROTECTED level security standards as prescribed in the Australian Government Information Security Manual (ISM). It is subject to Infosec Registered Assessors Program (IRAP) certification, ongoing security audits, and robust IT security testing and patching, which together deliver the Safe Settings aspect of the Five Safes Framework.

The technology underpinning the ABS DataLab includes:

  • data encryption at rest to mitigate against unauthorised access to microdata
  • Azure Storage Accounts to securely hold individual research products and allow querying from authorised users
  • cloud servers (including backup servers) hosted exclusively onshore, with access only authorised for use in Australia unless approved by the ABS
  • closed network virtual machines to provide secure, isolated research spaces for the analysis of microdata
  • guarded access through multi-factor authentication and workspace segmentation inhibiting data sharing between projects
  • a DataLab Product Storage Account protected with Microsoft Defender providing threat detection against malicious/unusual behaviour.

The ABS employs the above with a focus on industry standard security posture management to provide a safe and secure platform for policy and program delivery work.

Detailed microdata in the DataLab

  • Designed specifically for use within the DataLab environment
  • Direct identifiers (such as names and addresses) removed
  • Further appropriate confidentiality applied within the context of the other security features of the DataLab
  • Topics include Census, health, education, labour force, Aboriginal and Torres Strait Islander peoples, migrants, crime, business, disabilities, ageing and carers
  • Datasets include ABS survey results, administrative data collected by other organisations and integrated datasets
  • Data item lists are linked in detailed microdata topics in the DataLab

Charges

Costs for 2023-24 are now available below. If you have any questions, please contact data.services@abs.gov.au.

Approved users can access standard detailed microdata in DataLab for approved projects. This includes:

  • ABS survey and census collections
  • data ABS has collected from other organisations (with custodian approval)
  • integrated microdata such as:
    • Person Level Integrated Data Asset (PLIDA)
    • Business Longitudinal Analysis Data Environment (BLADE) Core plus BLADE standard module (various ABS surveys), Intellectual Property Longitudinal Research Data (IPLORD) and Merchandise Imports and Exports
    • PLIDA/BLADE linked data

Additional charges apply for customised data integration services.  

DataLab charges

DataLab access incurs an annual charge. This charge is based on the number of analysts with virtual machine access in a project. The charge covers the annual costs of:  

  • project establishment and ongoing administration and support 
  • researcher onboarding (including training and refresher training) 
  • changes to analysts and discussants within a project 
  • standard virtual machine access for analysts 
  • standard software access for analysts (e.g., R, Python, Stata) 
  • project storage up to one terabyte 
  • standard output and input clearance 

The ABS is committed to supporting the DataLab service and subsidising DataLab users. In 2023-24, increasing costs and budget constraints require the ABS to move towards a more sustainable partial cost recovery arrangement with all our clients. The updated charges below will allow the ABS to maintain its service levels as well as deliver critical system and infrastructure enhancements.

The ABS is adopting a staggered approach to price increases to minimise the impacts on existing projects. Existing projects are projects established before 1 July 2023. Existing and new projects will be subject to different pricing models for 2023-24 and 2024-25. See below for more details.

Please note: 

  • Individual quotes will be prepared for projects with over 25 analysts. 
  • An increase in the number of analysts to the next tier will incur an additional charge equivalent to the next tier.
  • Significant changes in project scope may result in the establishment of a new project.
  • Project extensions after 1 July 2023 will be charged at the new project rates.
  • Access to non-standard services is subject to additional charges. Please see the ‘Additional non-standard access and service charges’ section below.
  • All charges are calculated quarterly, based on the month of request. For example, a project commencing in October will be charged the annual fee and any non-standard DataLab access based on three quarters of the financial year.
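
As an illustration of the quarterly calculation, the sketch below assumes the financial year runs from 1 July to 30 June and that the quarter containing the month of request is charged in full. The function names are illustrative only and are not part of any ABS system. Actual invoicing may differ.

```python
def quarters_remaining(start_month: int) -> int:
    """Quarters left in a July-to-June financial year, counting the
    quarter that contains start_month (1 = January, 12 = December)."""
    if not 1 <= start_month <= 12:
        raise ValueError("start_month must be between 1 and 12")
    # Shift the calendar so that July becomes month 0 of the financial year.
    fy_month_index = (start_month - 7) % 12
    return 4 - fy_month_index // 3


def prorated_charge(annual_charge: float, start_month: int) -> float:
    """Share of an annual charge payable for the remaining quarters.

    Assumption: each quarter costs one quarter of the annual charge.
    """
    return annual_charge * quarters_remaining(start_month) / 4


# A project commencing in October has three quarters remaining.
print(quarters_remaining(10))        # 3
print(prorated_charge(5_000, 10))    # 3750.0
```

The October example matches the one given above: three of the four quarters are charged.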

Annual charges for existing projects

Existing projects are projects established before 1 July 2023.  

Tier 1 and 2 projects (projects with up to 10 analysts)
  • Tier 1 and 2 projects will continue to be charged the same annual rates for financial years 2023-24 and 2024-25.
  • Prices will increase from 2025-26 for all Tier 1 and 2 projects.
Tier 3 projects (projects with 11 to 25 analysts) 
  • Tier 3 projects will be charged half the 2022-23 annual charge of $10,000 ($5,000 excluding GST) for 1 July 2023 to 31 December 2023, with prices increasing from 1 January 2024.
  • From 1 January 2024, Tier 3 projects will be subject to a new charge rate. 
  • Tier 3 projects will be charged half the new 2023-24 annual charge of $30,000 ($15,000 excluding GST) for 1 January 2024 to 30 June 2024. 
  • The delayed roll out of the new charging model provides project leads the opportunity to decrease usage or close their project before prices increase. 
Tier 4 projects (projects with over 25 analysts) 
  • Tier 4 projects will be charged custom pricing from 1 July 2023 under the new pricing model.
2023-24 annual charges for existing projects
Annual charge (per project) | Excluding GST | Including GST
Tier 1 - 1 to 5 analysts | $2,000 | $2,200
Tier 2 - 6 to 10 analysts | $4,000 | $4,400
Tier 3 - 11 to 25 analysts, 1 Jul 2023 to 31 Dec 2023 | $5,000 | $5,500
Tier 3 - 11 to 25 analysts, 1 Jan 2024 to 30 Jun 2024 | $15,000 | $16,500
Tier 4 - Over 25 analysts | Custom | Custom

 

Annual charges for new projects

New projects are projects established after 1 July 2023.  

2023-24 annual charges for new projects
Annual charge (per project) | Excluding GST | Including GST
Tier 1 - 1 to 5 analysts | $5,000 | $5,500
Tier 2 - 6 to 10 analysts | $12,000 | $13,200
Tier 3 - 11 to 25 analysts | $30,000 | $33,000
Tier 4 - Over 25 analysts | Custom | Custom
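
The new-project tiers above can be expressed as a simple lookup. This is a minimal sketch using the 2023-24 figures from the table. The function names are illustrative, and the 10% GST rate is inferred from the difference between the two price columns.

```python
from typing import Optional

# (maximum analysts in tier, annual charge excluding GST), 2023-24 new projects.
NEW_PROJECT_TIERS = [
    (5, 5_000),    # Tier 1: 1 to 5 analysts
    (10, 12_000),  # Tier 2: 6 to 10 analysts
    (25, 30_000),  # Tier 3: 11 to 25 analysts
]


def annual_charge_ex_gst(analysts: int) -> Optional[int]:
    """Return the annual charge excluding GST, or None for Tier 4
    (over 25 analysts), where an individual quote is prepared."""
    if analysts < 1:
        raise ValueError("a project needs at least one analyst")
    for max_analysts, charge in NEW_PROJECT_TIERS:
        if analysts <= max_analysts:
            return charge
    return None


def with_gst(amount_ex_gst: int) -> int:
    """Add 10% GST using integer arithmetic (assumes whole-dollar amounts
    that are multiples of 10, as in the table)."""
    return amount_ex_gst * 11 // 10


charge = annual_charge_ex_gst(7)
print(charge, with_gst(charge))  # 12000 13200
```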

 

University projects

The annual charge for projects with up to 10 DataLab analysts is covered under the ABS/Universities Australia agreement. These projects will not be subject to an annual charge for the duration of the current agreement, in place until 31 December 2023.

Additional non-standard access and service charges

Non-standard charges apply to all projects for services that are not within the scope of the annual charge. Non-standard charges include access to SAS, Databricks, non-standard virtual machines and storage above one terabyte. Non-standard charges also apply to increased service levels, such as priority clearance and high service level. More information on non-standard access and services is in the tables below.

Non-standard DataLab access charges
Annual charges per person per project | Excluding GST | Including GST
Use of SAS | $500 | $550
Non-standard virtual machines | $1,700 (minimum) | $1,870 (minimum)

Non-standard virtual machines: Standard virtual machines are included in the annual charge and comprise machines up to and including the large size. Please refer to virtual machines for further information on size.

The price of this access is a minimum charge. Should analysts exceed their usage in dollar terms for their non-standard virtual machine within the financial year, access can continue subject to additional charges being applied.

If shorter term usage of a non-standard virtual machine is required, please contact data.services@abs.gov.au to discuss options.

Annual charges per project | Excluding GST | Including GST
Databricks - low usage | $3,500 (minimum) | $3,850 (minimum)
Databricks - high usage | $6,500 (minimum) | $7,150 (minimum)
Each additional terabyte of storage | $850 | $935

Databricks: Please refer to Databricks for more information on this service.

The price of this access is a minimum charge. Should analysts exceed their usage in dollar terms for their access to Databricks within the financial year, access can continue subject to additional charges being applied.

Storage: One terabyte of storage is included in the annual charge.

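
To show how the per-person and per-project access charges above combine, the sketch below estimates the minimum annual non-standard access charge excluding GST. It assumes storage is counted in whole terabytes and uses illustrative names throughout. Usage above the minimum amounts attracts additional charges that are not modelled here.

```python
from typing import Optional

# 2023-24 figures excluding GST, taken from the access charges above.
SAS_PER_PERSON = 500
NON_STANDARD_VM_PER_PERSON_MIN = 1_700
DATABRICKS_MIN = {"low": 3_500, "high": 6_500}
EXTRA_TERABYTE = 850
INCLUDED_TERABYTES = 1  # covered by the annual charge


def minimum_access_charge(
    sas_users: int = 0,
    vm_users: int = 0,
    databricks: Optional[str] = None,
    storage_tb: int = 1,
) -> int:
    """Minimum annual non-standard access charge for one project."""
    # Only storage beyond the included terabyte is charged.
    extra_tb = max(0, storage_tb - INCLUDED_TERABYTES)
    total = (
        sas_users * SAS_PER_PERSON
        + vm_users * NON_STANDARD_VM_PER_PERSON_MIN
        + extra_tb * EXTRA_TERABYTE
    )
    if databricks is not None:
        total += DATABRICKS_MIN[databricks]  # "low" or "high"
    return total


# Three SAS users, one non-standard VM user, low-usage Databricks, 3 TB storage.
print(minimum_access_charge(sas_users=3, vm_users=1, databricks="low", storage_tb=3))  # 8400
```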
Non-standard DataLab service charges
Annual charges per project | Excluding GST | Including GST
High output demand - Tier 1 | $19,000 | $20,900
High output demand - Tier 2 | $41,000 | $45,100
High service level | $20,000 | $22,000

High output demand

A high output demand charge applies to projects that require a higher level of service, including high volumes, faster turnaround or the application of special rules. This charge will be applied when the project team requests this higher level of service, or when the ABS determines that a project is requiring resources exceeding cost recovery of the annual charge. High output demand charges are structured into two separate tiers.

Tier 1 applies to projects with output requests that:

  • regularly exceed 2 per month
  • regularly require turnaround of less than 48 hours, or
  • require informal ABS methodological or policy advice to facilitate output.

Tier 2 applies to projects with output requests that:

  • regularly exceed weekly occurrences
  • regularly require same-day turnaround, or
  • require formal ABS methodological or legislative advice to facilitate output.

High service level

A high service level charge applies to projects that require resources exceeding the cost recovery of the annual fee. This charge will be applied when the project team requests this higher level of service, or when the ABS determines that a project is requiring resources exceeding cost recovery of the annual charge.

The following will be factors in considering the application of the high service level charge:

  • frequent/regular meetings
  • frequent/complex queries
  • frequent/complex project changes
  • high volume of publications requiring review and custodian notifications
  • projects with a large range of research topics requiring access to a high volume of datasets and data integration work
  • projects with custom requirements
  • projects with compressed timelines and critical milestones which prompt out-of-session arrangements and prioritisation within the ABS work program
  • projects with multiple phases with varied needs requiring staged custodian approvals, additional approvals such as non-DataLab approvals, and increased effort to monitor project status and progress
  • projects with non-s15 access, for example, s14 or s16, requiring additional arrangements
  • projects merging or splitting, thus requiring re-approvals and rearrangements

 

Applying for DataLab access

Step 1. Ensure you meet requirements

For criteria, refer to Who can access the DataLab

Organisation approval

  • Your organisation must have a verified Responsible Officer Undertaking (ROU) in place with the ABS. If one does not exist, your organisation will not be available for selection in the myDATA online project proposal.
  • To check if there is an active ROU for your organisation, go to the myDATA user portal homepage and select Dashboard/Organisation.

Researcher approval

  • You must have a commitment to protect the confidentiality of data.
  • Every member of your project team who will see or discuss uncleared outputs (whether or not they will be using DataLab) needs to be approved.

Project approval

  • Projects must be for statistical and/or research purposes and provide public benefit.
  • Projects must not be for compliance or regulatory purposes.
  • Every project needs to be approved by the ABS.
  • Projects for or about Aboriginal and/or Torres Strait Islander peoples may be subject to a Cultural Review by the Centre of Aboriginal and Torres Strait Islander Statistics at the ABS.
  • Some projects also require consideration and approval by data custodians.

Refer to What is DataLab and Using DataLab responsibly for more information.

Step 2. Register and activate your account

Registration of an account will allow you to:

  • create a project proposal within the myDATA user portal
  • enrol in DataLab safe researcher training
  • collect forms for your onboarding process after training has been completed
  • draft and review projects you are participating in.

Register in the myDATA Portal and agree to the Conditions of use

  • Use your organisation email address. If you are a user in more than one organisation, you will need to register separately using the email address for each organisation.
  • Authenticate your account - myDATA will automatically email your registered account with steps to authenticate.
  • If you encounter errors in myDATA, please submit a System support query.

You can complete your training while your project proposal is underway.

Access will not be granted until DataLab safe onboarding is completed. For the onboarding process, the following documents are required to be submitted:

For further information, please refer to the myDATA user guides section below.

Step 3. Submit project proposal

  • Create and complete a new project proposal in the myDATA user portal
  • Updates to an existing project proposal (Word document)

The creator of a new project will be automatically assigned as the Project Editor and is the only person who can edit and submit the project proposal. Project Leads can be identified separately when adding people to the project.

Step through the online form and complete all required fields.

Submit your completed project proposal to the ABS for review. The ABS will respond with feedback if edits are required.

Note: researchers must complete registration and activation steps successfully before they can be added to the project. Training does not have to be completed to be added to a project – access will not be granted until after approval is given.  

For further information, please refer to the myDATA user guides section below.

For existing projects, the project proposal must be updated with any changes (e.g., changes to researchers, organisations, data or scope) and submitted to data.services@abs.gov.au

Changes to project proposals must be made in red, with tracked changes on, and supplied on the newest version of the template below.

Project proposal template and data request form

🗎 ABS DataLab Data Request Form.xlsx

🗎 ABS DataLab Project Proposal.docx

Enabling access to DataLab

DataLab is enabled by cloud infrastructure, which may be blocked by some organisations’ firewall settings. 

ABS cannot make changes to external organisations' infrastructure. Project Leads need to supply the information below to each organisation participating on this project. 

Network/IT Security sections in each organisation need to review and make changes to authenticate access. This only needs to be done once per organisation.

Citrix access configuration

There are four steps which need to be applied to each organisation’s security settings before the project start date to enable access to DataLab.

1. Enable authentication to the tenant

Users need to authenticate to one of the ABS Azure Active Directory tenants, which may be strictly controlled by government agencies and academic workplaces. Authentication must be enabled to the tenants:

  • mydata.abs.gov.au
  • absmydata.onmicrosoft.com

2. Allow user access to URLs

Users will need to access the following URLs:

  • DataLab production portal: datalab.abs.gov.au and gw.datalab.abs.gov.au
  • Citrix portal: absdatalab.cloud.com

3. 2020 version of Citrix Workspace client installed

The originating client machine must have a recent version of the Citrix Workspace client installed. The client is available from the Citrix Workspace download page.

4. Enable HTTPS connections

All Remote Desktop client connections to ABS DataLab go via the Citrix Cloud service. Your organisation's Network/Security area will need to enable HTTPS connections to the following:

  • *.citrix.com
  • *.cloud.com
  • *.*.nssvc.net

Organisations that can't enable all subdomains can whitelist using wildcards to prevent future connectivity issues to Citrix. For more information refer to Citrix Product Documentation.

Customers who can’t enable all subdomains can use the following addresses instead:

  • *.g.nssvc.net
  • *.c.nssvc.net

SSL/TLS inspection must be bypassed for *.nssvc.net as it can break connections to Citrix Gateway Service.
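
Network teams may find it useful to check which hostnames a wildcard rule would cover before changing firewall settings. The sketch below uses Python's standard `fnmatch` module, in which `*` matches any run of characters including dots. Real firewalls and proxies may interpret wildcards differently, so treat this as an assumption to verify against your own product. The hostname `example.g.nssvc.net` is hypothetical and used only for illustration.

```python
from fnmatch import fnmatchcase

# Wildcard addresses listed above for Citrix HTTPS connections.
CITRIX_PATTERNS = ["*.citrix.com", "*.cloud.com", "*.*.nssvc.net"]


def is_allowed(hostname: str, patterns=CITRIX_PATTERNS) -> bool:
    """Return True if hostname matches at least one wildcard pattern.

    Matching is case-insensitive and ignores a trailing dot.
    """
    host = hostname.lower().rstrip(".")
    return any(fnmatchcase(host, pattern) for pattern in patterns)


print(is_allowed("absdatalab.cloud.com"))   # True, the Citrix portal address
print(is_allowed("example.g.nssvc.net"))    # True, hypothetical gateway host
print(is_allowed("datalab.abs.gov.au"))     # False, not a Citrix address
```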

Azure Virtual Desktop configuration

1. Enable authentication to the tenant

Users need to authenticate to one of the ABS Azure Active Directory tenants, which may be strictly controlled by government agencies and academic workplaces. Authentication must be enabled to the tenant:

  • absmydata.onmicrosoft.com

This tenant is located in Azure Australia East and Azure Australia Central regions. 

2. Allow user access to URLs

Users will need to access the following URLs:

  • DataLab production portal: datalab.abs.gov.au and sead.abs.gov.au

3. Configure your organisation's network to allow outbound connections to the following addresses required for Azure Virtual Desktop (AVD):

  • login.microsoftonline.com 
  • *.wvd.microsoft.com 
  • *.servicebus.windows.net 
  • go.microsoft.com 
  • aka.ms 
  • learn.microsoft.com 
  • privacy.microsoft.com 
  • query.prod.cms.rt.microsoft.com 

These addresses all utilise the TCP protocol and outbound port 443 for communication. 
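
As a rough way to confirm that outbound TCP connections on port 443 are permitted, the sketch below attempts a connection to each literal hostname in the list above. Wildcard entries cannot be resolved directly and are skipped. The function names are illustrative. A successful TCP connection does not prove that authentication or the full Azure Virtual Desktop session will work, only that the port is reachable.

```python
import socket

# Addresses listed above for Azure Virtual Desktop (AVD).
AVD_ADDRESSES = [
    "login.microsoftonline.com",
    "*.wvd.microsoft.com",
    "*.servicebus.windows.net",
    "go.microsoft.com",
    "aka.ms",
    "learn.microsoft.com",
    "privacy.microsoft.com",
    "query.prod.cms.rt.microsoft.com",
]


def concrete_hosts(addresses):
    """Keep only literal hostnames; wildcard entries cannot be tested directly."""
    return [address for address in addresses if "*" not in address]


def can_connect(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if an outbound TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers DNS failures, refusals and timeouts
        return False


def check_all(addresses=AVD_ADDRESSES) -> dict:
    """Map each literal hostname to its connection result."""
    return {host: can_connect(host) for host in concrete_hosts(addresses)}
```

Calling `check_all()` from a machine inside your organisation's network returns a dictionary of hostnames and results. Any `False` entries indicate addresses to raise with your Network/IT Security section.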

Contact data.services@abs.gov.au for further assistance.

myDATA user portal

The myDATA portal is your one-stop shop for all aspects of DataLab training, onboarding and project management. 

Below are the myDATA user guides, which will assist you in navigating the various aspects of training and project management.

  1. Register and activate user account (download pdf)
  2. User portal dashboard navigation (download pdf)
  3. Self-training enrolment (download pdf)
  4. Download and return forms (download pdf)
  5. Create and submit a project proposal (download pdf)

Privacy policy

The ABS privacy policy and DataLab privacy notice outline how the ABS handles any personal information that you provide to us.

Products

List of detailed microdata files available in DataLab, links to publications and data item lists

Released
8/11/2021

Detailed microdata files and reference periods in DataLab are listed below. For datasets in other systems see MicrodataDownload and TableBuilder, or all topics in Available microdata and TableBuilder.

You need to apply for access by submitting a DataLab project proposal before you can access these files.

Use Ctrl+F (Windows) or Command+F (Mac) to search this list.

Economy

Environment

Health

Labour

People

Adult Literacy and Life Skills, 2006

Australian Census and Migrants, 2011, 2016, 2021

Australian Census and Temporary Entrants, 2016, 2021

Australian Census Longitudinal Dataset, 2006-2011, 2006-2021, 2011-2021, 2016-2021

Census of Population and Housing, 2001

Census of Population and Housing, 2006

Census of Population and Housing, 2011

Census of Population and Housing, 2016, 2021

Characteristics of Recent Migrants, 2007 and 2010

Child Care, 1999

Child Care, 2002

Child Care, 2005

Childhood Education and Care, 2008

Childhood Education and Care, 2011

Crime and Safety, 2002

Crime and Safety, 2005

Crime Victimisation, 2009-10

Education and Training, 2005

Education and Training, 2009

Education and Work, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2022, 2023

Family Characteristics and Transitions, 2006-07

Family Characteristics, 2003

Family Characteristics, 2009-10

General Social Survey, 2002

General Social Survey, 2006

General Social Survey, 2010

General Social Survey, 2014

Household Expenditure, Income and Housing, 2003-04 including Fiscal Incidence Study

Household Expenditure, Income and Housing, 2009-10 including Fiscal Incidence Study

Household Expenditure, Income and Housing, 2015-16 including Fiscal Incidence Study

Income and Housing, 2000-01, 2002-03, 2003-04, 2005-06, 2007-08, 2009-10, 2011-12, 2013-14, 2015-16, 2017-18, 2019-20

Person Level Integrated Data Asset previously known as Multi-Agency Data Integration Project (MADIP), 2011-2016

Multipurpose Household Survey, 2004-05, includes the following:
- Household Use of Information Technology
- Barriers and Incentives to Labour Force Participation
- Retirement and Retirement Intentions

Multipurpose Household Survey, 2005-06, includes the following:
- Household Use of Information Technology
- Participation in Sports and Physical Recreation
- Attendance at Selected Cultural and Leisure Venues and Events
- Sports Attendance
- Work-Related Injuries

Multipurpose Household Survey, 2006-07, includes the following:
- Adult Learning
- Barriers and Incentives to Labour Force Participation
- Retirement and Retirement Intentions
- Household Use of Information Technology
- Family Characteristics and Transitions

Multipurpose Household Survey, 2007-08, includes the following:
- Environmental Views and Behaviour
- Household Use of Information Technology
- Personal Fraud

Multipurpose Household Survey, 2008-09, includes the following:
- Crime Victimisation
- Barriers and Incentives to Labour Force Participation
- Retirement and Retirement Intentions
- Household Use of Information Technology

Outcomes from Vocational Education and Training in Schools, 2006-2011

Participation in Sport and Physical Recreation, 2009-10

Participation in Sport and Physical Recreation, 2011-12

Personal Fraud, 2007-08

Personal Income of Migrants, annually from 2009-10, 2010-11, 2011-12, 2012-13, 2014-15, 2015-16, 2016-17

Personal Safety Survey, 2005

Personal Safety, 2012, 2016

Preschool Education, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022

Programme for the International Assessment of Adult Competencies, 2011-12

Time Use, 1997

Time Use, 2006

Safe researcher training

Registering your interest to attend DataLab training, and training resources

Released
19/11/2021

What is safe researcher training

DataLab safe researcher training must be undertaken before you can use the DataLab or be approved on a project:

  • training enables new users to become approved DataLab researchers or discussants
  • available as face-to-face training via ABS offices, in most capital cities
  • also available as virtual training

Training covers:

  • your shared responsibilities as a DataLab user
  • meeting your legislative requirements
  • appropriate output for ABS clearance and data release

The training does not include:

  • using the system
  • using the data
  • statistical capability training
  • code or analytical language training

Safe researcher training and DataLab access are only available to researchers located in Australia in Australian organisations. International researchers and organisations will be considered on a case by case basis. 

The current wait time for DataLab training is approximately 4 weeks.

How to register and enrol in a session

Register for DataLab safe researcher training via the myDATA portal. The myDATA portal will be your one-stop shop for all aspects of DataLab training, onboarding and project management.

Once you have created your myDATA portal user account, click on the ‘My onboarding’ tile. You will find an ‘Enrol in training’ button. Click on the magnifying glass to select the training session that best suits you and click ‘Select’. Should there not be a suitable session, click on ‘No session suitable’ and we will contact you directly.

Detailed registration steps are listed in the User Guide here.

Email data.services@abs.gov.au if you have issues with registering in the portal.

Refresher training

The successful and safe operation of the ABS DataLab relies upon researchers understanding their responsibilities and obligations when accessing the ABS DataLab.

All researchers seeking access to the ABS DataLab, including discussants, must complete the DataLab Safe Researcher training and satisfactorily complete a quiz before they will be granted access to the DataLab.

To retain access to the DataLab researchers must:

  • undertake refresher training every two years or as directed by the ABS Researcher Onboarding and System Support
  • resubmit relevant Declarations and Undertaking covering your responsibilities and obligations every two years or as required and as directed by the ABS Researcher Onboarding and System Support

If you think you might be due for refresher training please email data.services@abs.gov.au

Changing organisations does not invalidate a researcher's training status and the usual refresher training requirements apply.

Refresher training policy:
Researchers need to undertake refresher training because:

  • key operations, such as output checking procedures and rules, will change over time
  • reinforcing key elements on a regular basis reduces the likelihood that researchers forget them
  • people can become complacent about complying with appropriate behaviours in the DataLab
  • training reinforces the need to constantly refresh one's skills and knowledge about implementing safe researcher practices
  • this ensures researchers remain aware of their responsibilities and obligations when using the DataLab

The refresher training policy requires: 

  • all active users and discussants to complete the Safe Researcher DataLab refresher training every two years, or sooner if instructed to by the ABS
  • all active users and discussants to resubmit relevant Declarations and Undertaking covering their responsibilities and obligations every two years, or sooner if instructed to by the ABS
  • training to be completed in the time frame specified by the ABS to ensure access is not suspended.
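
For project leads tracking their team's training status, the two-year interval can be sketched as below. This assumes the interval is counted from the date training was last completed, which the policy does not state explicitly, and the ABS may direct refresher training sooner. The function names are illustrative.

```python
from datetime import date

REFRESHER_INTERVAL_YEARS = 2


def refresher_due(last_completed: date) -> date:
    """Date by which refresher training falls due under a two-year interval."""
    target_year = last_completed.year + REFRESHER_INTERVAL_YEARS
    try:
        return last_completed.replace(year=target_year)
    except ValueError:
        # 29 February has no counterpart two years later.
        # Assumption: fall back to 28 February.
        return last_completed.replace(year=target_year, day=28)


def is_due(last_completed: date, today: date) -> bool:
    """True once the due date has been reached."""
    return today >= refresher_due(last_completed)


print(refresher_due(date(2022, 3, 15)))             # 2024-03-15
print(is_due(date(2022, 3, 15), date(2024, 3, 14)))  # False
```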

Safe researcher training resources

The slides linked below are presented during the DataLab Safe Researcher Training. 

Part 1 - Working together to enable microdata access

Part 2 - Maintaining data confidentiality

Part 3 - Statistical disclosure control

🗎 DataLab safe researcher training Parts 1 and 2 (PDF)

🗎 DataLab safe researcher training Part 3 (PDF)

You should also read Responsible use of ABS microdata user guide and Using DataLab Responsibly to understand your responsibilities as a safe researcher.

Conditions of use

DataLab and myDATA conditions of use

Released
20/07/2022

DataLab conditions of use

By accessing and using DataLab, you agree to abide by the administering organisation's relevant requirements and obligations, including the conditions outlined below. If you cannot abide by these conditions, your use of the system is to cease immediately.

By using DataLab, I agree:

  1. To adhere to all access, usage, security and other procedural guidelines within DataLab as directed by the ABS, including those provided in the Undertaking by the Responsible Officer of an Organisation, Undertaking by an individual, Declaration of Compliance, and other directions provided to me.
  2. To take all necessary measures to protect the security of DataLab and the data held within, by safeguarding my access credentials and promptly notifying the ABS of any security incidents or procedural failures.      
  3. To adhere, where applicable, to ABS system constraints.     
  4. To the possibility that the ABS may discuss my registration and access with the administering organisation, who have the authority to remove my access.    
  5. To comply with the ABS conditions of sale
  6. To use and operate DataLab in compliance with the relevant operating manuals and documentation.    
  7. To not remove or attempt to remove content from DataLab by any means, including extracting or copying material by screen capture, handwritten notes, or transcription, without obtaining written approval from the ABS.    
  8. To uphold the integrity of ABS intellectual property by not removing, obscuring, or altering any ABS attributions, including logos, legal notices, or other labels visible in DataLab.    
  9. To not attempt to load code, software or applications without seeking the appropriate authorisation from the ABS.     
  10. To cooperate with any audit or investigation initiated by the ABS or administering organisation that pertains to any matter concerning the DataLab. 
  11. To delete or destroy data when requested to do so by the ABS.

I acknowledge that:   

  1. My use of DataLab may be audited by the ABS.    
  2. DataLab is authorised for use only within Australia unless prior written authorisation has been provided by the ABS.     
  3. A breach of these conditions may result in sanctions which may include, but are not limited to, the ABS revoking my access to DataLab permanently or for a set period.    
  4. The ABS will not provide guidance on how to conduct data analysis, modelling or how to utilise the statistical tools available.     
  5. Features and functionality of DataLab may undergo necessary changes or upgrades without user consultation.    
  6. The ABS does not guarantee, or accept any legal liability arising from or connected to, the use of material contained within, or derived from, DataLab.

Expected behaviours: 

  1. Comply with the protocols and instructions of the ABS.     
  2. Access only the data I have been approved to access, and notify the ABS if I believe I have access to data I should not have.
  3. Not attempt to avoid, override, or bypass the system or procedures.    
  4. Maintain data confidentiality when submitting outputs for review.    
  5. Request output clearance through the ABS DataLab Clearance procedure in all instances.    
  6. Notify the ABS of any suspected activities that may impact the security of DataLab.   

Remote access:

Remote access within Australia is permitted under the following conditions:       

  1. DataLab must only be accessed from a work or private location.     
  2. A secure internet connection must be used.  
  3. Overseas access to DataLab is not permitted unless approved by the ABS.     
  4. Do not use any type of internal messaging system, do not share your screen, and do not transcribe any data from the DataLab prior to output clearance.

myDATA conditions of use

The following conditions are applicable to all users registered in My Data Approvals to Access (myDATA) for access to DataLab projects.

By registering in myDATA, I agree: 

  1. To provide true and correct information about my identity and contact details when registering, and if these details change, to provide the updated details to the ABS. I understand if I leave my organisation new approvals need to be sought for continued access to data or systems.
  2. The ABS may discuss my registration and access with my organisation's Contact Officer and Project lead, who have the authority to remove my access.
  3. To keep my log in credentials for ABS systems secure and not share my log in credentials with any other person or organisation.
  4. ABS can share my project information, such as contact details, project name, project lead and organisation and dataset access with other organisations and users. This information will only be shared at the discretion of the ABS for the purposes of administering access to data, such as seeking approvals on your behalf with the data custodians responsible for the data, system logging and disclosing user details to other users for the purpose of enabling collaboration.
  5. ABS can share with other organisations confidentialised information about my organisation's system and data usage, for the purposes of usage reporting, auditing, feedback or performance. Any personal information collected will be held while there is a business need and kept in accordance with the Privacy Act 1988. See also, ABS privacy policy and DataLab privacy notice.
  6. ABS may contact me to provide me with information about data, systems or products, such as upcoming releases, system changes or to seek my feedback.
  7. To comply with the ABS conditions of sale.

I acknowledge that:

  1. My use of myDATA and ABS systems may be audited.
  2. The ABS may partially or fully remove, suspend or deactivate my access to ABS systems, data, files or tables. This may be for ABS operational reasons, such as implementing changes to data or systems, or for a breach of ABS directions and system constraints.
  3. A breach of these conditions of use may result in sanctions which may include, but are not limited to, ABS removing access to these systems for me and/or my organisation permanently or for a set period of time.
  4. Removal of access does not entitle a user or organisation to a refund of any subscription charges and the ABS is not liable for any damages this removal of access may have for a user or organisation.
  5. While the ABS has taken great care to ensure that the information provided within ABS systems is as correct and accurate as possible, the ABS does not guarantee, or accept any legal liability arising from, or connected to, the use of any material contained within, or derived from these systems. See ABS disclaimer for further information.

Privacy

The ABS privacy policy and DataLab user privacy notice  outline how the ABS handles any personal information that you provide to us.   

The ABS Privacy Policy for Managing and Operating Our Business outlines how we handle personal information that is collected for managing and operating within the ABS.  

Using DataLab responsibly

Roles and expected behaviours for being a safe researcher in the ABS DataLab

Released
4/11/2021

Roles and expected behaviours

ABS

  • encourages, promotes and supports the use of data for research and/or statistical purposes
  • provides training on guidelines and compliance requirements for safe researchers and safe use of data
  • provides a secure environment for flexible and wide-ranging microdata access to meet researchers' needs
  • provides a range of statistical packages and updates
  • provides adequate metadata
  • manages the authorisation, provision and removal of access to microdata
  • provides researchers with the principles and rules for safe outputs
  • provides a cultural review of Aboriginal and Torres Strait Islander data projects through its Centre of Aboriginal and Torres Strait Islander Statistics
  • checks outputs and provides advice on how to make outputs non-disclosive
  • responds to questions relating to the data, processes and systems, in a timely manner
  • respects researchers' academic independence
  • monitors and audits DataLab use to ensure compliance with procedures and legislative requirements

Lead researchers

  • submit your research proposal to the ABS by following the steps in the DataLab User Guide
  • provide an updated project proposal to reflect changes to the team, scope, project time frames and/or data requirements - use the 'Document history' section on page 2 of the project proposal to advise us of any changes
  • support your research team to adhere to DataLab safe researcher practices and behaviours and build a culture of best practice within your team
  • provide feedback on the outcomes of your project and experience with the ABS microdata and DataLab upon project closure
  • provide the ABS with two weeks' notice before release and then provide a link to any published research stemming from the project's findings
  • advise the DataLab team immediately of any suspected incidents in the DataLab, including data security or procedural failures
  • support the ABS in communicating key messages with your research team
  • adhere to any relevant requirements of analysts

Approved project team researchers/analysts

As an approved project team analyst, you have access to the DataLab and may discuss uncleared data with other approved analysts or discussants on your project team. You must:

  • meet all on-boarding requirements, including:
    • completing the safe researcher DataLab training
    • confirming you are willing to have your name, organisation, microdata you have access to, projects, and links to resultant papers published from the research on a register on the ABS website (unless otherwise agreed in advance)
    • confirming you belong to an organisation that has a Responsible Officer undertaking in place with the ABS
    • signing and agreeing to the conditions in the individual undertaking and other associated paperwork (including the Declaration of Compliance)
    • confirming in writing that you are not currently restricted from accessing government data, or any other data due to misuse of data or a breach of data policy/procedures
    • declaring you have at least three years’ quantitative research experience or university study with a significant component working with quantitative data, or if this is not possible comply with the pre-requisite skills and/or experience expected of approved researchers
    • having experience with at least one of the statistical analytical languages available in the DataLab
  • comply with ABS protocols and instructions for access and use of microdata in the DataLab
  • access only the microdata you have been approved to access - if you can access data that you believe you should not be able to, contact the DataLab team immediately
  • inform the ABS if you leave the project team or if you leave your organisation
  • only access the DataLab from a private location with a secure internet connection, not from public networks or spaces
  • protect your work area and screen from oversight by others, including unauthorised colleagues, family, children and pet cams
  • not screen share your DataLab session or content, even when meeting only with approved researchers on your project team
  • keep passwords for the DataLab secure
  • not share DataLab log in credentials
  • not attempt to identify individuals or organisations within data held in the DataLab
  • not attempt to match DataLab unit record data with any other list, database or repository of persons or organisations
  • not attempt to avoid, override or otherwise circumvent the system or procedures
  • not transcribe or copy anything from the DataLab prior to output clearance (this includes screen sharing or any written or photographic form)
  • not transcribe or copy anything from the DataLab prior to output clearance to share it with any other researcher (approved or not) or with ABS personnel or the Researcher Onboarding and System Support team
  • only use the shared project space within the DataLab to share uncleared work with approved project researchers or to communicate with ABS DataLab support areas about uncleared data
  • not send outputs to be cleared to the DataLab team - you must request output clearance
  • not attempt to link two microdata files within the DataLab at the unit record level based on matching characteristics, except where linking keys have been provided and the files are designed to be linked
  • not deliberately attempt to identify individual or organisational respondents or mishandle a spontaneous recognition event
  • be aware that data confidentiality is your responsibility when submitting outputs for review - see Confidentiality in ABS microdata and output guidelines for more information
  • report any security incidents or procedural failures to the DataLab team immediately via data.services@abs.gov.au and cc the lead researcher

Approved project team researchers/discussants

As an approved project team discussant, you do not have access to the DataLab but may discuss uncleared data with other approved analysts or discussants on your project team. You must:

  • meet all on-boarding requirements, including completion of the safe researcher DataLab training and signing of all relevant undertakings
  • seek approval from the ABS, via the project lead, if you wish to become a DataLab analyst
  • adhere to any relevant requirements of project team researchers/analysts

Guiding principles

External communication

We encourage you to communicate as much as possible within the DataLab environment.

If you need to communicate via other means, consider what is to be communicated and how the communication will take place to ensure that you do not inadvertently remove uncleared data from the DataLab.

Managing communication

  • You can use notes within the DataLab to leave messages for other approved project team members. Let them know that you have left a note and would like them to view it.
  • You can talk to your supervisor/s about the data, if they are approved researchers or discussants on the project, but consider the environment and who is around.
  • Phone calls and video conferencing may be used for discussions but never share your screen.
  • Do not transmit any uncleared DataLab output in an email, including with ABS personnel or the Researcher Onboarding and System Support team. Instead, let your approved project team colleague or the ABS know and ask them to view the issue within the DataLab. Similarly your approved colleagues or the ABS need to leave their responding information within the DataLab and let you know that there is information for you within the project in the DataLab.

Remote access

ABS trusts and supports approved researchers who remotely access the DataLab.

Remote access is permitted under the following conditions:

  • It must be used in a work or private location.
  • The screen must be protected from oversight by any other person. This includes password-protecting your screen, should you move away from your computer.
  • A secure internet connection must be used:
    • A secure internet connection means any Wi-Fi that is password protected (e.g. work, home, your hotel room, hotspotting from your phone)
    • A non-secure internet connection means an open or public connection like a restaurant/cafe, airport, public transport, hotel lobby or shopping mall
  • Overseas access to DataLab is not allowed under any circumstances.
  • Working in the DataLab from home is supported by the ABS but you are responsible for checking and complying with your organisation's requirements for working from home.
  • Do not use any type of internal messaging system which may have external server connections.
  • The DataLab screens are to be kept secure at all times whether you are working within your organisation or from home.

Further information is available in the Responsible use of ABS microdata user guide.

Requirements to become an approved researcher

Pre-requisite skills and/or research experience required of approved researchers

To be an approved DataLab researcher, you must have the analytical research experience to be able to carry out quantitative data research or analysis in the DataLab. This includes the ability to use at least one of the statistical analytical languages supported in the DataLab. This may have been acquired through working on research, analytical or statistical projects. For example, a person who was employed for three years in a relevant field, such as a university researcher, research assistant or a government or non-government employee working in research or statistics. If they had worked for around half of their time on quantitative research projects, then they would have spent a significant component of their time working with quantitative data.

You may also have qualifications (either an undergraduate or higher degree) with a significant proportion of mathematics or statistics. A significant proportion of the degree should cover research method components and analytical fields, including:

  • qualitative data collection and research design, interviewing skills, conducting focus groups and ethnographic methods
  • quantitative data collection and research design, questionnaire design, sampling and weighting
  • hypothesis testing and evaluation
  • undertaking systematic reviews
  • data analysis, including data linkage, imputation and presentation of results
  • application of ethics to research

Other relevant undergraduate degrees may include psychology, demography, social policy, sociology, political science, geography, economics, and social statistics. If you have postgraduate qualifications, you may combine multiple degrees to ensure you meet this requirement. This is a cumulative requirement.

If you do not meet the above criteria but still want to access the DataLab, you may request a referral by an authorised researcher who is on the same research team as you. The referring researcher must meet all of the following requirements:

  • have at least three years of either quantitative research or analysis experience or university study with a significant component working with quantitative data
  • be working on the same project within the DataLab as the less experienced researcher
  • agree to directly supervise and take responsibility for the work of the less experienced researcher
  • have the agreement of a Senior Executive from the less experienced researcher’s organisation for this referral

Download the undertaking, declaration and referral forms.

The ABS does not provide support to researchers relating to statistical analytical languages or coding issues.

Publishing and citing data

Referencing DataLab data in publications

Preparing outputs to be published

  • Publishing refers to making information available to the public by any means.
  • The ABS encourages researchers to share their research findings (which have been cleared for confidentiality by the ABS) and make the results publicly available.
  • If the ABS have cleared and sent you your outputs from the DataLab it means they have passed statistical disclosure checks and are cleared to be released. 

When do I need to inform the ABS I am publishing? 

  • Any publication, report and presentation that references BLADE or PLIDA data needs to be provided to the ABS a minimum of 2 weeks prior to wider release. This is a requirement of our Data Custodians. This process does not seek approval from custodians, but rather is in place to give custodians visibility of project outputs, provide comments and brief Ministers, as required. Analysts are sent any comments or feedback provided by custodians on their publications.
  • Project teams can use outputs referencing BLADE or PLIDA data as part of preliminary analysis and to collaborate with others. This does not require 2 weeks' notice to the ABS. At this stage your outputs are considered draft analysis: they are not yet published and are not for further circulation. When you circulate this draft analysis, clearly state that it is draft and not for further distribution.
  • When your analysis is ready to publish, the 2 weeks' notice requirement applies.

How do I cite my work?

Information and research using ABS data must be acknowledged.

When citing the ABS DataLab the preferred citation structure is as follows:

  1. Source of Data: e.g. Person-Level Integrated Data Asset (PLIDA).
  2. Date/reference period of data used
  3. PLIDA product: e.g. PLIDA Modular Product
  4. Pathway of access: e.g. ABS DataLab
  5. Statement: “Findings based on use of PLIDA data.”
  6. Include an explanation of any processes or transformations which have been applied to that data
  7. If you are using data from ATO, DSS or Home Affairs you must also include the following disclaimer:

    “The results of these studies are based, in part, on data supplied to the ABS under the Taxation Administration Act 1953, A New Tax System (Australian Business Number) Act 1999, Australian Border Force Act 2015, Social Security (Administration) Act 1999, A New Tax System (Family Assistance) (Administration) Act 1999, Paid Parental Leave Act 2010 and/or the Student Assistance Act 1973. Such data may only be used for the purpose of administering the Census and Statistics Act 1905 or performance of functions of the ABS as set out in section 6 of the Australian Bureau of Statistics Act 1975. No individual information collected under the Census and Statistics Act 1905 is provided back to custodians for administrative or regulatory purposes. Any discussion of data limitations or weaknesses is in the context of using the data for statistical purposes and is not related to the ability of the data to support the Australian Taxation Office, Australian Business Register, Department of Social Services and/or Department of Home Affairs’ core operational requirements.

    Legislative requirements to ensure privacy and secrecy of these data have been followed. For access to PLIDA and/or BLADE data under Section 16A of the ABS Act 1975 or enabled by section 15 of the Census and Statistics (Information Release and Access) Determination 2018, source data are de-identified and so data about specific individuals has not been viewed in conducting this analysis. In accordance with the Census and Statistics Act 1905, results have been treated where necessary to ensure that they are not likely to enable identification of a particular person or organisation.”

Please refer to these examples:

Example 1

Person Level Integrated Data Asset (PLIDA), 2021, Census of Population and Housing, ABS DataLab. Findings based on use of PLIDA data.

Example 2

Australian Bureau of Statistics (2020) Microdata: Personal Income of Migrants, Australia, accessed 15 December 2020

A citation of your work will be added to the short online description of your project once your work is published. See also How to cite ABS sources.

Failing to comply with DataLab conditions of use

As an approved researcher, you have signed appropriate documentation agreeing to comply with data access provisions under relevant legislation, whenever you access detailed microdata in the DataLab.

If you suspect that you or others in your team may have failed to comply with a microdata undertaking, immediately cease the behaviour, notify the lead researcher and email data.services@abs.gov.au as soon as possible.

For further information, see Consequences of failing to comply with a microdata undertaking in the Responsible use of ABS microdata user guide.

Input and output clearance

Requesting input, output and transfer clearance. Applying the output rules to your analysis. 

Released
19/11/2021

DataLab clearance instructions and templates

Request output clearance

DataLab outputs must be approved and cleared by the ABS before being shared. Output requests generally take 1–2 weeks to be completed. Large, complex, or insufficiently described files will take longer to review. Please apply data minimisation principles and only request what you need. You must not copy or remove anything (for example data, code, notes) out of the DataLab yourself. Please do not include any counts or data from DataLab in your emails with the ABS.

To request output clearance:

  1. Format the data with clear headings and labels (ODS or ODT format is preferred).
  2. Apply the appropriate output rules and prepare evidence in a separate file.
  3. Move the relevant files, including supporting evidence, to a new sub-folder in the output (O:) drive.
  4. Click the button below to generate an email and then complete the details and send.

✉ Request output clearance

If the link does not generate an email, please use the following template in a new email. Do not reply to, forward or copy an existing email chain for a new request as this will not be received.

To: datalab.clearance@abs.gov.au 
Subject: Request DataLab output clearance 

  1. Virtual machine (VM): 
  2. Project name: 
  3. (Optional) If urgent, date required and justification:
  4. Output drive sub-folder path: O:/
  5. Names of files requiring clearance: 
  6. (If relevant) Names of files with supporting evidence: 
  7. Data products used to produce output (e.g. blade1617_core): 
  8. Relationship of output to project objectives: 
  9. (If relevant) Relationship to previous requests: 
  10. For each table / model / graph:
    • Description of output (e.g. weighted income by age, logistic regression model to predict health service usage)
    • People / businesses in scope including reference period (e.g. mining businesses operating in 2019)
    • Definitions of each original and self-constructed variable in output (e.g. count: unweighted count of people, empstat: employment status)
    • Output rules applied (e.g. Rule of 10 on unweighted counts, dominance, degrees of freedom)
   ** Reminder: Do not include counts or data in emails ** 

Request input clearance

This request is only for loading aggregate data, concordances, supporting material or statistical code to DataLab projects. To add microdata to your project, please submit an existing DataLab project query. To request new software or a software package, please use this template.

The following will not be loaded to the DataLab.

  • Names of people or businesses
  • Addresses or longitudes and latitudes for specific locations
  • Free text fields

To request input clearance:

  1. Click the button below to generate an email and complete the details.
  2. Attach any files for input and send.

✉ Request input clearance

If the link does not generate an email, please use the following template in a new email. Do not reply to, forward or copy an existing email chain for a new request as this will not be received. 

To: datalab.clearance@abs.gov.au
Subject: Request DataLab input file load

  1. Virtual machine (VM): 
  2. Project name:  
  3. (Optional) If urgent, date required and justification: 
  4. File type (e.g. code, aggregate data, correspondence file): 
  5. If publicly available:
    • Source URL: 
    • Terms of use (e.g. Creative Commons Attribution 4.0): 
  6. If not publicly available:
    • Name of owner/author/custodian: 
    • Attach consent to use file in DataLab. 
  7. Description of each file:
  8. How the file will be used: 
  9. For data tables - description of each variable:

Request transfer between projects

This request is to move code and other files that do not contain data between DataLab projects. Please ensure there are no counts or IDs anywhere, including in logs or comments. If you wish to move files containing data, please submit an output request ensuring all output rules are met and note that you want it transferred to another project.

To request transfer clearance:

  1. Check files do not contain any data.
  2. Move the relevant files to a new sub-folder in the output (O:) drive. 
  3. Click the button below to generate an email and then complete the details and send. 

✉ Request transfer between projects

If the link does not generate an email, please use the following template in a new email. Do not reply to, forward or copy an existing email chain for a new request as this will not be received.

To: datalab.clearance@abs.gov.au
Subject: Request DataLab file transfer

  1. Transfer from virtual machine (VM):
  2. Transfer to VM: 
  3. (Optional) If urgent, date required and justification: 
  4. Output drive sub-folder path: O:/
  5. Names of files requiring transfer: 
  6. Reason for moving files: 

** Reminder: Do not include counts or data in emails **  

Output rules quick reference table

The most common types of analysis are listed below along with the applicable rules for output. Other output types will be assessed based on similar principles. 

 
Output type | Applicable rules
Frequency tables (counts, percentages) | Rule of 10; Group disclosure
Magnitude statistics (means, sums, ratios) | Rule of 10; Group disclosure; Dominance
Quantiles (percentiles, medians) | Minimum contributors for quantiles
Minimums, maximums, ranges | Minimum contributors for quantiles
Models including regressions | Degrees of freedom; Model-specific rules
Charts (graphs, plots and histograms) | Chart clearance
Microdata | Not appropriate for output
Synthetic microdata | Not appropriate for output

Rule of 10 

The rule of 10 refers to the minimum number of contributors required for each cell or statistic. The underlying (unweighted) count of observations must meet this threshold, and evidence must be provided. 

If multiple tables are produced, differences of less than ten should not be able to be calculated through combining the tables. 
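
The differencing risk described above can be sketched in a few lines of Python. This is an illustration only, not an ABS tool: the function name and the counts are invented, and treating a difference of zero as harmless is an assumption.

```python
# Illustrative sketch only; the function name and counts are hypothetical.
THRESHOLD = 10

def differencing_risk(count_a, count_b, threshold=THRESHOLD):
    """Return True if subtracting two counts isolates between 1 and threshold-1 units."""
    difference = abs(count_a - count_b)
    # Assumption: a difference of zero isolates nobody, so it is not flagged.
    return 0 < difference < threshold

# Table 1 reports people aged 15-64; Table 2 reports people aged 15-65.
# Each cell meets the rule of 10 on its own, but the difference between
# them reveals the small group aged exactly 65.
aged_15_to_64 = 130
aged_15_to_65 = 135
risky = differencing_risk(aged_15_to_64, aged_15_to_65)
print(risky)
```

A check like this only covers pairs of cells you think to compare; overlapping categories across all tables in a request still need to be reviewed by hand.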

The rule of 10 applies to most outputs including counts, percentages (both numerator and denominator), means, sums, ratios, and other statistics. 

Options for making output safe include suppression of small counts, aggregation of categories or perturbation. If a cell is suppressed but it can be derived or estimated from other outputs, one or more additional values should be suppressed to protect the value of the primary suppressed cell from being worked out.
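
As a minimal sketch of primary suppression under the rule of 10, assume the unweighted counts for a table are held in a Python dictionary keyed by cell. The function names, cell labels and counts below are hypothetical and are not part of any ABS-provided tooling. The text above does not say how empty cells are treated, so this sketch simply flags every count below the threshold.

```python
# Illustrative sketch only; names and data are hypothetical.
RULE_OF_10 = 10

def failing_cells(unweighted_counts, threshold=RULE_OF_10):
    """List the cells whose unweighted count is below the threshold."""
    return sorted(cell for cell, n in unweighted_counts.items() if n < threshold)

def suppress_small_cells(unweighted_counts, threshold=RULE_OF_10):
    """Return a copy of the table with failing cells replaced by None."""
    return {
        cell: (n if n >= threshold else None)
        for cell, n in unweighted_counts.items()
    }

counts = {
    ("State A", "employed"): 240,
    ("State A", "unemployed"): 7,
    ("State B", "employed"): 198,
    ("State B", "unemployed"): 12,
}

flagged = failing_cells(counts)
safe_table = suppress_small_cells(counts)
print(flagged)
```

This covers primary suppression only. If row or column totals are released alongside the table, the suppressed cell could be recovered by subtraction, so further cells would need to be suppressed as described above.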

See Data downloads for examples and options for treatment. 

Dominance 

The dominance rule is designed to prevent the re-identification of units that contribute a large percentage of a cell's total value, which could in turn reveal information about individuals, households or businesses. 

DataLab has a (1,50) and a (2,67) rule. This means that for any cell, the largest contributor cannot account for more than 50% of the total value and the largest two contributors cannot account for more than 67% of the total value. 

Where a variable can take both positive and negative values, the negative values should be replaced with absolute values before determining the largest contributors and the total. The largest absolute value is then divided by the sum of absolute values to determine if the (1,50) rule is met, and the sum of the two largest absolute values is divided by the sum of absolute values to check the (2,67) rule.  

Similar to the rule of 10, in the case of the dominance rule failing and if a cell is suppressed but it can be derived or estimated from other outputs, one or more additional values should be suppressed to protect the values of the primary suppressed cell from being worked out.

Dominance must be checked if any mean, total or similar statistic is calculated for continuous or magnitude variables. It does not apply to counts.
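
The (1,50) and (2,67) checks can be sketched as below, assuming the unit-level contributions to a single cell are available as a list of numbers. The function name and example values are hypothetical, and returning a pass for a cell whose contributions sum to zero is an assumption made only to keep the sketch total.

```python
# Illustrative sketch only; names and values are hypothetical.
def dominance_check(contributions):
    """Return (passes_1_50, passes_2_67) for the contributions to one cell."""
    # Negative values are replaced by their absolute values before ranking.
    magnitudes = sorted((abs(v) for v in contributions), reverse=True)
    total = sum(magnitudes)
    if total == 0:
        return True, True  # assumption: nothing can dominate an all-zero cell
    largest_share = magnitudes[0] / total
    largest_two_share = sum(magnitudes[:2]) / total
    return largest_share <= 0.50, largest_two_share <= 0.67

balanced = [500, 120, 100, 90, 80, 60, 50, 40, 30, 30]
two_dominant = [400, 400, 50, 50, 20, 20, 20, 20, 10, 10]
one_dominant = [-900, 300, 200, 100, 50, 50, 40, 30, 20, 10]

print(dominance_check(balanced))      # both rules met
print(dominance_check(two_dominant))  # (1,50) met, (2,67) not met
print(dominance_check(one_dominant))  # neither rule met
```

In the third example the largest contributor is negative; its absolute value (900) is compared with the sum of absolute values (1,700), which is why the cell fails even though the signed total is small.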

See Data downloads for examples and options for treatment. 

Group disclosure

Group (or attribute) disclosure occurs when all or nearly all units that have one feature also have some other feature. This means that even when the individual units may appear protected based on other rules, a previously unknown attribute of a unit may be disclosed based on the attributes of the group. Group disclosure risk should be assessed when any cell contains more than 90% of the total number of units in the row or column.   

This rule applies to frequency tables. Whether group disclosure requires treatment depends on the sensitivity and nature of the output. 
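
One way to screen a frequency table for the 90% threshold is sketched below. It assumes the table is a nested dictionary of counts (row, then column); the function name, labels and counts are invented for illustration. A flagged cell is a prompt to assess the risk, not an automatic failure, since treatment depends on the sensitivity and nature of the output.

```python
# Illustrative sketch only; names and data are hypothetical.
def group_disclosure_cells(table, share=0.90):
    """Return (row, column) pairs holding more than `share` of a row or column total."""
    row_totals = {row: sum(cols.values()) for row, cols in table.items()}
    col_totals = {}
    for cols in table.values():
        for col, n in cols.items():
            col_totals[col] = col_totals.get(col, 0) + n

    flagged = []
    for row, cols in table.items():
        for col, n in cols.items():
            row_share = n / row_totals[row] if row_totals[row] else 0.0
            col_share = n / col_totals[col] if col_totals[col] else 0.0
            if row_share > share or col_share > share:
                flagged.append((row, col))
    return flagged

table = {
    "Region A": {"receives payment": 190, "no payment": 10},
    "Region B": {"receives payment": 120, "no payment": 80},
}

cells_to_assess = group_disclosure_cells(table)
print(cells_to_assess)
```

Here 190 of the 200 units in Region A (95%) receive the payment, so knowing that someone lives in Region A reveals their payment status with high confidence.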

See Data downloads for examples and options for treatment. 

Minimum contributors for quantiles

Quantiles and other relative ranks must be based on a minimum number of contributors depending on the precision. Underlying unweighted counts should be provided when reporting quantiles in the outputs. For information on required contributors for quantiles, see the table below: 

 
Quantile | Minimum contributors
Medians (0.50) | 10
Quartiles (0.25, 0.5, 0.75) | 20
Quintiles (0.2, 0.4, 0.6, 0.8) | 25
Deciles (0.1, 0.2, 0.3 ... 0.9) | 50
Vigintiles (0.05, 0.1, 0.15 ... 0.95) | 100
Percentiles (0.01, 0.02 ... 0.99) | 500

Minimums and maximums are generally unsafe to output. The following percentiles are safe options if the minimum contributors rule is satisfied: 

  • 1st and 99th percentiles 
  • 5th and 95th percentiles 
  • 10th and 90th percentiles 
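
The thresholds in the table above can be encoded as a simple lookup. The sketch below is illustrative only: the dictionary keys and function names are hypothetical, and the thresholds are copied from the table.

```python
# Illustrative sketch only; names are hypothetical, thresholds are from the table above.
MIN_CONTRIBUTORS = {
    "median": 10,
    "quartile": 20,
    "quintile": 25,
    "decile": 50,
    "vigintile": 100,
    "percentile": 500,
}

def quantile_allowed(kind, unweighted_count):
    """Check whether a quantile type has enough unweighted contributors."""
    return unweighted_count >= MIN_CONTRIBUTORS[kind]

def finest_quantile(unweighted_count):
    """Return the most precise quantile type the count supports, or None."""
    allowed = [
        kind for kind, minimum in MIN_CONTRIBUTORS.items()
        if unweighted_count >= minimum
    ]
    # The dictionary is ordered from coarsest to finest, so the last match is the finest.
    return allowed[-1] if allowed else None

print(quantile_allowed("quartile", 18))
print(finest_quantile(60))
```

With 60 unweighted contributors, deciles are the finest breakdown that meets the table; with fewer than 10, no quantile should be reported.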

See Data downloads for examples and options for treatment. 

Degrees of freedom

Models and regressions are generally safe to output. However, overfitted models can pose a disclosure risk. All models and regressions must have a minimum of 10 degrees of freedom and evidence that this has been met should be provided.

The degrees of freedom are calculated by subtracting the number of parameters and other model restrictions from the total number of observations that contribute to the model.
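
As a worked illustration of that calculation, the sketch below subtracts the number of parameters and any other restrictions from the number of contributing observations and compares the result with the minimum of 10. The function names and example figures are hypothetical.

```python
# Illustrative sketch only; names and figures are hypothetical.
MIN_DEGREES_OF_FREEDOM = 10

def degrees_of_freedom(n_observations, n_parameters, n_restrictions=0):
    """Observations contributing to the model, less parameters and other restrictions."""
    return n_observations - n_parameters - n_restrictions

def meets_dof_rule(n_observations, n_parameters, n_restrictions=0,
                   minimum=MIN_DEGREES_OF_FREEDOM):
    """Check the result against the minimum degrees of freedom."""
    return degrees_of_freedom(n_observations, n_parameters, n_restrictions) >= minimum

# A regression on 250 observations with 12 estimated parameters.
print(degrees_of_freedom(250, 12))
# A small subgroup model: 20 observations and 12 parameters.
print(meets_dof_rule(20, 12))
```

The first model has 238 degrees of freedom and meets the rule; the second has only 8 and does not. Count only the observations actually used in estimation, after any rows are dropped for missing values.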

See Data downloads for examples and options for treatment.

Model-specific rules

There are additional rules for specific model types. 

For ordinary least squares regressions, the R-squared should be lower than 0.9. If the R-squared is higher than this, the constant may need to be suppressed to prevent predictions. This requirement does not apply to other models such as fixed effects or two-stage regressions. 

Additionally, for ordinary least squares regressions with a continuous dependent variable and only categorical independent variables, the regression will approximate the tabular means. The addition of a continuous independent variable, or suppression of the intercept reduces the disclosure risk. Otherwise, apply the rule of 10 and dominance rules.

For survival curves, each step change in the survival curve should represent at least 10 data subjects. 

Correlation coefficients should be calculated based on a minimum of 10 contributors.  

Gini coefficients are usually safe to output, and must be based on a minimum of 10 contributors. 

For classification and regression trees, any underlying unweighted counts must meet the rule of 10.

For other models, please provide evidence that no estimates or parameters are derived from fewer than 10 underlying contributors and explain why the output is non-disclosive.  

See Data downloads for examples and options for treatment. 
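
For the survival curve rule above, a minimal self-check might look like the following. It assumes you already have the number of data subjects behind each step change. The `unsafe_steps` helper and the counts are hypothetical.

```python
RULE_OF_TEN = 10  # minimum data subjects per step change, from the rule above


def unsafe_steps(step_counts):
    """Return the indexes of step changes backed by fewer than 10 data subjects."""
    return [i for i, count in enumerate(step_counts) if count < RULE_OF_TEN]


if __name__ == "__main__":
    # Hypothetical counts of data subjects behind each step change.
    counts = [14, 10, 7, 22, 3]
    print(unsafe_steps(counts))  # [2, 4]
```

Any flagged step would need treatment before the curve is submitted for clearance.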

Chart clearance

All graphs, plots and other charts are subject to the output rules that apply to the underlying output type. The data used in the chart must be provided, accompanied by any relevant supporting evidence that it meets output rules. 

Charts that plot characteristics of individual units or groups of fewer than 10 units will not be cleared. 

See Data downloads for examples and options for treatment. 

Data downloads

DataLab output clearance examples (not real data)

Logging into the portal and workspace

Logging in, launching your VM, first time use/new phone steps, resetting your password

Released
19/11/2021

There are two steps when logging into DataLab, which have been streamlined as of August 2023:

  1. Log into the DataLab portal, where you can access information and settings related to your profile and project, and start your virtual machine (VM)
  2. Launch the VM for your project where you and your project team members view data, run analysis and prepare reports or data outputs for clearance

Log into the DataLab portal

Enter your account details into the DataLab log in page.

If you are logging in for the first time, for system security you will need to authenticate your log in using the Microsoft Authenticator app on your mobile phone. To set this up, see First time use/new phone steps.

For returning users, click on your account (firstname.lastname@mydata.abs.gov.au) or use another account and enter your account name. All DataLab accounts use the @mydata.abs.gov.au domain format. Enter your password and Sign in.

Choose an account

By logging in you agree to these conditions:

Important Notice

If you are not authorised to access this system, exit immediately. Unauthorised users may be subject to criminal and civil penalties.

This is an Australian Government computer system. Part 10.7 of the Criminal Code Act 1995 outlines the penalties that may apply for unlawful use of Government systems including unauthorised access, modification or impairment of computer systems, data or electronic communications. The Act provides penalties of up to 10 years imprisonment for such offences. By proceeding, you are representing yourself as an authorised user and acknowledge you have read and agree to comply with the Responsible Use of ABS Microdata User Guide. Your activity will be logged, monitored and investigated should any misuse be suspected.

Sanctions ranging from a reprimand to revocation of access or termination of employment may be imposed if misuse is determined.

Once you have entered your credentials and hit ‘Sign in’, a notification from the Microsoft Authenticator app is sent to your phone and asks you to perform a “number match”. Enter the numbers shown on your browser screen into the authenticator app on your phone to proceed.

Are you trying to sign in - authenticator

If you don't approve within the time limit, click ‘Send another request to my Microsoft Authenticator app’. If the request expires, re-enter your account and password in the DataLab log in screen.

We didn't hear from you

You can also change the way you approve the sign in request by selecting "I can’t use my Microsoft Authenticator app right now".

I can't use my authenticator right now

You then have two options:

  1. approve a request on your phone app
  2. use a verification code from your phone app

Verify your identity

After you approve in Microsoft Authenticator, you are logged into the DataLab portal.

DataLab user interface

Launch the VM

To enter your DataLab project workspace you need to:

  1. Activate your VM
  2. Launch your desktop

Step 1 Activate your VM

Each project VM is displayed on individual tiles, with your “active” VM appearing at the top above those that are “locked”. For more information about your VMs, see Functions in My Projects.

My projects page

Each project has a separate VM and you can only access one project VM at a time. If your machine is already available to launch, skip to Launch your desktop.

  • If not, click the ‘Activate’ button as shown below and wait until the ‘Launch’ button appears.
  • If your machine shows a status other than ‘Launch’ or ‘Activate’ you must rebuild the VM first. See VM management options for more information.

Click on the ‘Activate’ button of the VM you want to launch.

Virtual machine activate button

If you have a VM for another project that is currently active, this logs you out of your other session. If you have a program running in your Workspace using another VM, this will stop the program. You can only run multiple VMs if you have requested and are using offline local disk space.

Virtual machine activating

As seen below, you can track the VM activation progress by either selecting 'Track' from the pop up notification or from the action log icon on the left navigator.

Action log tracking

When the VM activation completes, an additional pop up notification confirms that the action was successful. If the action fails, repeat the above steps to activate.

Action log succeeded

If you navigated to the Action Log page, select the laptop icon named 'My Projects' in the left navigator to return to the 'My Projects' page.

Step 2 Launch your desktop

Azure Virtual Desktop (AVD)

If you have changed your VM version to AVD (refer to VM management options on how to do this), launch the VM by clicking the ‘Connect’ button on the active VM tile.

Prior to launch, you will be asked to choose the type of connection from the drop-down menu, as shown in the image below. AVD lets you connect either through a web browser or, if required, through a downloaded application (the 'Remote Desktop Client').

Note: If you want to use multi-screen functionality, we advise you to connect via the 'Remote Desktop Client for Windows' option. This option may also improve connection stability and VM screen resolution.

VM tile with three launch options shown in a drop-down menu

This image shows three options from the drop-down menu when selecting ‘Connect’ to launch your AVD VM. The first option is “Connect (web client)”, the second is “Connect (Remote Desktop Client for Windows)”, and the third option is “Other ways to connect”, which brings up another menu showing other launch options.

 

Connect (web client) option

If you are connecting to the 'web client' version of AVD, a new browser page will be launched, and you will be met with the below settings screen. This window may have slightly different options for each device. There is no need to change any of these settings. Press ‘Connect’ to continue. 

Web Client settings page

You will then be asked to log into your VM using your DataLab credentials. Once you have entered your credentials, click 'Sign In'. 

 

Web Client credentials page

After a successful login, you will see the below loading screen. 

 

AVD web client loading screen

Your VM will then prepare Windows for you. 

 

Web Client preparing windows screen

You should now have access to your VM within a browser window. For more information on using the workspace see Using your workspace. 

Note: You may see an upload button in the toolbar which, intentionally, does not work. You may see a success message after uploading, but you will not be able to retrieve any files that you have uploaded. If you wish to upload data or packages to the DataLab, please Contact us.

 

DataLab VM

Connect 'Remote Desktop Client for Windows' option

If you are connecting to the 'Remote Desktop Client' version of AVD, ensure that your IT department has enabled the correct networking addresses. See Enabling access to the DataLab under 'Azure Virtual Desktop configuration' for more information.

Note: If you intend to use the 'Remote Desktop Client' on your organisation's workspace, you will need to contact your internal IT department to make it available to you. 

The latest version of 'Remote Desktop Client' for Windows is available here

The latest version of 'Remote Desktop Client' for Macintosh is available here

 

Once you have access and have started the application, click ‘Subscribe’ as shown below.

 

Remote Desktop subscription page

You will then be shown a new window that allows you to log into your DataLab account using your DataLab credentials. 

 

Remote Desktop Client login page

Upon a successful login, you will be shown all your available VMs. If you can’t see the VM that you want, it may be dormant and require a rebuild.

If you have access to multiple projects you may be shown multiple tabs, one for each project as shown in the image below. 

Note: If you wish to use multi-display, right click on your VM icon, click ‘Settings’, and turn off ‘Use default settings’. You should now see options for how you would like to display your VM on your computer.

 

Remote Desktop Client VM options

Click one of the computer icons and you will be taken to a login screen to reconfirm your identity before access to your VM is granted. This screen may not appear if you have recently logged in through a similar window.

 

AVD identity confirmation

The below loading bar should briefly appear. 

 

Remote Desktop Client loading bar

If, instead of the second login prompt, you receive a message as shown in the below image, then either your machine is not the active machine or it is not using the AVD version ‘2024avd’. If your desktop session does not start, repeat Step 1 Activate your VM and check that the version at the bottom of your VM says ‘2024avd’.

 

AVD remote desktop error message

You will then be presented with another login screen for the VM itself. 

 

Windows security login credentials

You should then have a new window open on your computer with your VM desktop open. 

 

DataLab virtual machine launched

For more information on using the workspace see Using your workspace.

 

Citrix

Launch the VM using the ‘Launch’ button from the active VM tile. If you are using Citrix, this action will open the Citrix Workspace application after a few moments.

 

Virtual machine Launch button shown from within the DataLab user interface

The Citrix dashboard shows all your recently used desktops. Desktop is the Citrix term for VM.

View of your available desktop sessions from within the Citrix interface

From the All Desktops view, click the VM/Desktop (this is your project number) to open your project in the DataLab environment.

Launching the desktop from within the Citrix workspace

Citrix will then begin connecting to your DataLab desktop. This may take a few minutes. A popup in the bottom right-hand corner of your screen will show you the connection progress. If your desktop session does not start, repeat Step 1 Activate your VM.

Workspace powering on

You will then be asked to open the Citrix Workspace Launcher. If you do not have the Citrix receiver application, you will need to download it to your device. The latest version of Citrix Workspace is available at https://www.citrix.com/en-au/downloads/workspace-app/windows/.

At this point you can select the ‘Always allow’ checkbox, if you do not wish to receive this alert each time Citrix attempts to broker your connection.

Open Citrix Workspace Launcher notification

Log in using the same credentials you used to log into the DataLab Portal. Once logged in, the system may take a few minutes to load as it prepares Windows.

DataLab VM Windows log in screen

As part of the conditions of use, all activity within the workspace is recorded for auditing and reporting purposes.

Notification activity in DataLab VM is recorded

Your DataLab workspace looks like this. For more information on using the workspace see Using your workspace.

DataLab workspace

First time use/new phone steps

The DataLab uses two factor authentication to provide a secure log in environment. You need to download the Microsoft Authenticator app to your smart phone to use the DataLab.

Open https://datalab.abs.gov.au and enter your credentials.

The first time you log in, enter the username and password provided to you by the ABS. 

If you are using a new phone, refer to the Contact us page for system support. An ABS system administrator will need to reset your Authenticator. 

Note: all DataLab accounts use the @mydata.abs.gov.au domain format.

DataLab sign in username

If you are switching to a new phone (not a new account), you will be shown the ‘More information required’ screen. Click ‘Next’.

More information required

This will direct you to Step 1 of setting up your Microsoft Authenticator application. Download the Microsoft Authenticator app to your smart phone from the App Store (for iOS) or Google Play (for Android). Make sure that the authenticator is published by Microsoft, as the ABS DataLab only supports Microsoft Authenticator.

Microsoft Authenticator screen shot

Once you have fulfilled the initial requirements noted for Step 1, proceed by clicking ‘Next’.

Method 1 of 2 Microsoft Authenticator

You will then be guided to open your downloaded Microsoft Authenticator app and add a ‘Work or school account’.

Add a work or school account in the app

Once complete, click ‘Next’.

Method 1 of 2 continued

The following screen presents a QR code to scan using your Microsoft Authenticator app. 

Scan the QR code on the screen

Initiate the scanning function on your phone, then hover your phone over the QR code shown on your browser screen.

Initiating scan
Scanning in progress

Once scanning is complete, click ‘Next’ on your browser screen. You will be asked to enter the security number shown on your browser screen into your Microsoft Authenticator app. Once complete, click ‘Next’.

Additional security verification communicating with mobile app device

After entering the security number and receiving approval from your Microsoft Authenticator application, you will see a ‘Notification Approved’ confirmation on your browser. Click ‘Next’ to proceed to Method 2 of setting up your multi-factor authentication.

Notification approved

To make sure you can reset your password in the future, you also need to set up an authentication phone and/or email. The system will verify you by the option you select.

Authentication by phone

The system verifies your phone by the option you select.

Select your region from the drop-down menu, enter your mobile number and choose between receiving a text or a call to activate the verify button. Enter the code then select ‘Next’ to verify. The call option sends you an automated phone call that will ask you to press the # key to verify.

Enter your authentication phone number
Enter security code

Once verified, select ‘Next’ and then ‘Done’ to return to the sign in page. 

SMS verified
SMS authenticated successfully

Authentication by email

Click ‘I want to set up a different method’ and select ‘Email’ from the drop-down menu.

Choose a different authentication method

Enter an email address then click ‘Next’.

Enter your email address

A verification code is emailed to you. Enter the verification code and select ‘Next’. Once you have set up your details, select ‘Done’ to return to the sign in page. 

Email verified

Set up a new password for your account. Your password cannot contain your user ID. It must be a minimum of 8 characters and contain at least three of the following:

  • upper-case letters A – Z
  • lower-case letters a – z
  • numbers 
  • special characters @ # $ % ^ & * - _ ! + = [ ] { } | \ : ' , . ? / ` ~ " ( ) ;

Update your password

You can then log in to your account using your new password. 
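
If you want to check a candidate password against these rules before entering it, the rules can be sketched as below. This is an illustrative check only, not the validation the sign-in page performs. `password_problems` is a hypothetical helper, the example user ID is made up, and treating the user ID comparison as case-insensitive is an assumption.

```python
# Special characters copied from the list above.
SPECIAL_CHARACTERS = set("@#$%^&*-_!+=[]{}|\\:',.?/`~\"();")


def password_problems(password, user_id):
    """Return a list of rule violations; an empty list means the password passes."""
    problems = []
    if len(password) < 8:
        problems.append("shorter than 8 characters")
    # Assumption: the user ID check ignores letter case.
    if user_id and user_id.lower() in password.lower():
        problems.append("contains the user ID")
    classes_used = sum([
        any(c.isascii() and c.isupper() for c in password),
        any(c.isascii() and c.islower() for c in password),
        any(c.isascii() and c.isdigit() for c in password),
        any(c in SPECIAL_CHARACTERS for c in password),
    ])
    if classes_used < 3:
        problems.append("uses fewer than three character classes")
    return problems


if __name__ == "__main__":
    print(password_problems("Tr1cky-pass", "jane.citizen"))  # []
    print(password_problems("password", "jane.citizen"))
```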

Reset your password

If you forget your password, click on the ‘Forgotten my password’ link. If you have received a notification to reset your password while in your VM, log out and then click on the ‘Forgotten my password’ link.

Enter password screen showing forgot my password link

Your user ID is populated for you. Enter the characters in the picture, or words in the audio, then click 'Next'.

Enter the characters shown or click the audio link

The next screen takes you to Step One of verifying your account. Choose from the options in the left-hand column (options available depend on what you chose during your original account set up process).

Enter your authentication method to get back into your account

Verify your account information via email, text, or call (whichever you chose), and follow the prompts to reset your password. The screen below displays when your password has been reset. Click the link to sign in with your new password.

Your password has been reset confirmation screen

Using your workspace

Getting started, accessing your data files, available software, locking your workspace and signing out

Released
4/11/2021

Getting started in the DataLab workspace

When you have successfully logged into your Virtual Machine (VM) instance, your DataLab workspace looks like this:

DataLab workspace

You can use DataLab in a similar way to using other secure networked systems, where you can securely see, use and share data files, analysis and output with the other members of your project team.

Open File Explorer and click on This PC to see the network drives you have access to:

  • Library (L drive): All researchers can see all files in the Library drive. This is where we upload support information, such as statistical language documentation, ANZSIC classification and general access guides for non-standard products. Files cannot be saved to this drive.
  • Output (O drive): Any output you want the ABS to clear should be saved to this drive. Only members of your team can see this drive. See also Request output clearance. Information is backed up nightly and retained for 14 days. Information in this folder remains unaffected by a rebuild.
  • Project (P drive): A shared space for your team to work in and store all your project files, as well as set up and run Python and R scripts. Only members of your team can see this drive. Information is backed up nightly and retained for 14 days. Information in this folder remains unaffected by a rebuild. The default storage is 1TB. You will need to review and delete unnecessary files as your project files grow over time. If necessary, an increase to this storage can be requested via the Contact us page. There may be a cost for additional storage.
  • Products (R drive): Access data files that have been approved for your project. However, it is best to use the My data products shortcut on your desktop as this shows you only the datasets you have been approved to access, rather than all dataset short names. Files cannot be saved to this drive.
  • Local Disk (X drive): If you have been granted local disk space, this can be used to run jobs on offline virtual machines (desktops). You may want to request this option if you have multiple projects that you are actively involved in. There may be a cost associated with attaching local disk space to your VM. The local disk will only be present if it has been allocated to your VM. To request local disk space contact the ABS via the Contact us page.
  • Drive C: Can be used to run scripts and to create new Python package virtual environments that are not facilitated through Jupyter Notebook, JupyterLab or Spyder. Note that the C drive is also destroyed with each 30 day rebuild. Avoid using this drive for saving files, as there is limited space and no ability to increase the storage capacity. If more storage is required, a local disk can be requested for your VM.
  • Drives A and D: Not to be used. Information saved here is either destroyed with each nightly shutdown and 30 day rebuild, or has restricted access. Attempting to read or write from Drives A or D will invoke a group policy error due to access controls. In this case, please use the C drive or consult your project lead to request local disk space.

Network drives you have access to

Do not store files in any other folders. Other members of your project cannot see files if you store them in other drives. Files stored outside of the Project and Output drives are destroyed every 30 days as part of DataLab security protocols.
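
When reviewing what is taking up space on the Project drive, a short script can list the largest files. The sketch below is one possible approach using only the Python standard library. It assumes the Project drive is mapped to the letter P as described above, and `largest_files` is a hypothetical helper.

```python
from pathlib import Path


def largest_files(folder, top_n=10):
    """Return (size_in_bytes, path) for the top_n largest files under folder."""
    root = Path(folder)
    if not root.is_dir():
        return []
    sizes = []
    for path in root.rglob("*"):
        if path.is_file():
            sizes.append((path.stat().st_size, path))
    sizes.sort(key=lambda item: item[0], reverse=True)
    return sizes[:top_n]


if __name__ == "__main__":
    # "P:/" assumes the Project drive letter described above.
    for size, path in largest_files("P:/", top_n=5):
        print(f"{size / 1024**2:10.1f} MB  {path}")
```

The script only reads file sizes. Deciding what to delete remains a decision for your project team.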

Refreshing your network drives. If your network drives do not appear in File Explorer, you can click the 'Refresh Network' shortcut on the desktop. A confirmation message appears when this has been successfully refreshed.

Refreshing your network drives

Accessing your data files

To access the data files for your project, use the 'My Data Products' shortcut on your desktop.

My Data Products shortcut

The My Data Products folder displays only the products approved for your project.

My Data Products folder

Selecting the 'Products' drive shows you the short name of all data loaded to the DataLab. However, if you try to open a file that is not approved for your project you are denied access and receive an error.

Products drive
Error message when accessing a file that is not approved for your project

Available software

Software can be opened using the shortcuts on your desktop or by using search on the Taskbar.

All researchers have access to these applications in the DataLab:

  • LibreOffice 7.5
  • Acrobat Reader
  • Notepad++ 8.5.2
  • QGIS 3.30
  • WinMerge 2.16
  • Git 2.40
  • Stata MP18
  • CUDA 12.1.1
  • R 4.1.3, including:
    • RStudio 2023.03
    • RTools 42
  • Python 3.9 (Anaconda3 distribution) including:
    • Jupyter Notebook & JupyterLab
    • Spyder
  • PostgreSQL 15
  • 7Zip

If required, you can also request:

  • SAS 9.4
  • Azure Databricks

Microsoft Word and Excel are not currently available, as these applications require an internet connection, which is not supported in a secure system like DataLab. LibreOffice is the alternative offered in the system, with similar capabilities to Microsoft Office.

Firefox and Edge are available to support access for Databricks (which is under development) and for Jupyter notebooks to use Python/R. These browsers cannot be used to browse the internet.

If a package in your statistical software choice is not available, you can request it using the Contact us page. 

Managing your R & Python packages explains how you can manage R and Python packages using the Posit Package Manager shortcut on your desktop.

Databricks

Databricks is available to projects within the DataLab as a non-standard product. 

What is Databricks? 

Databricks is a cloud-based Big Data processing platform which provides users with an integrated environment to collaborate on projects and offers a range of tools for data exploration, visualisation and analysis. Within the Databricks environment, users can:

  • Build pipelines for streaming data processing.
  • Build and run machine learning tools.
  • Create interactive dashboards.
  • Take advantage of scalable distributed computing capability.

Project analysts will also have access to the Databricks Academy training subscription (an online library of Databricks training guides), in addition to instruction materials on how to set up the Databricks workspace provided in the ABS shared library.

How do I allocate a Databricks workspace to my project? 

To allocate a Databricks workspace to your project, you will need to submit a request to data.services@abs.gov.au. Once your project is allocated a Databricks workspace, it can be accessed from within your VM using the installed Edge or Firefox browsers.

How will costing work? 

Access to Databricks will be charged per project on a financial year basis. Projects can select either a low or high usage profile, each with its own charging profile. The appropriate usage profile depends on how much compute resource project analysts are estimated to consume. The same level of service applies across both profiles.

Usage will be monitored by the ABS and project leads will be advised if their usage is projected to exceed the charges paid. Should a project exceed the usage of its profile within the financial year, access to this service may cease, or may continue subject to additional charges. Please see our DataLab charges for more information.

As Databricks uses separate compute power, projects requesting access to Databricks should consider if they need to continue to maintain their existing VM sizes. The option of scaling down the size of existing VMs provides users the opportunity to save on project costs. 

What are the cluster policy arrangements? 

User analysts can be provisioned with the following cluster policy options: 

Instance: DS3 v2

  • Server purpose: General purpose
  • Max autoscale workers: 5
  • CPU: 4
  • RAM/Databricks Units: 14GB/0.75

Instance: D13 v2

  • Server purpose: Memory optimized 
  • Max autoscale workers: 4
  • CPU: 8
  • RAM/Databricks Units: 56GB/2

Instance: DS3 v2

  • Server purpose: Compute optimized 
  • Max autoscale workers: 4
  • CPU: 16
  • RAM/Databricks Units: 32GB/3

Databricks cluster policies will restrict the type and number of workers you can provision for a cluster. If an existing policy does not fit your requirements, you can request a new policy via the ABS. All information regarding this can be found in the ABS shared library.
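
To give a sense of how a cluster policy restricts worker type and number, the sketch below expresses the first option above in the general shape of a Databricks cluster policy definition. It is illustrative only and is not the policy the ABS applies. The node type ID `Standard_DS3_v2` is an assumed Azure identifier for "DS3 v2", and `max_workers` is a hypothetical helper.

```python
import json

# Hypothetical policy mirroring the first option above (DS3 v2, up to 5 autoscale workers).
# Attribute names follow the Databricks cluster policy definition format.
# "Standard_DS3_v2" is an assumed node type ID, not taken from ABS documentation.
general_purpose_policy = {
    "node_type_id": {"type": "fixed", "value": "Standard_DS3_v2"},
    "autoscale.max_workers": {"type": "range", "maxValue": 5},
}


def max_workers(policy):
    """Read the autoscale worker ceiling from a policy definition."""
    return policy["autoscale.max_workers"]["maxValue"]


if __name__ == "__main__":
    print(json.dumps(general_purpose_policy, indent=2))
    print("Max autoscale workers:", max_workers(general_purpose_policy))
```

Because clients do not have administrative access, a definition like this would be created and managed by the ABS rather than by project analysts.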

To ensure the security and integrity of the DataLab, clients will not have administrative access to the Databricks workspace and some usage restrictions may apply. Administration will be exclusively managed by the ABS, aligning with the specified usage restrictions of the DataLab. 

Please contact data.services@abs.gov.au with any questions.

Managing your R & Python packages

If you are working with a specific set of R and/or Python packages, you can now manage these using the Package Manager shortcut on your desktop.

Posit package manager shortcut

In the Package Manager, click 'Get Started' to navigate to the available packages. You can use this tool to search for packages (in the left column) and install the packages you want to use for your project. If the packages you need are not listed, you can request them using the Contact us page.

Posit Package Manager page where you can check your available packages by clicking Get Started
List of the available packages in your Posit package manager
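
Before requesting a Python package, it can be worth checking whether it is already installed in your environment. The sketch below uses the standard library's `importlib.util.find_spec`. The `missing_packages` helper and the package list are hypothetical, and the check works on top-level import names only.

```python
from importlib import util


def missing_packages(names):
    """Return the top-level import names that cannot be found in this environment."""
    return [name for name in names if util.find_spec(name) is None]


if __name__ == "__main__":
    # Hypothetical list of packages a project script relies on.
    wanted = ["json", "numpy", "pandas"]
    print("Not installed:", missing_packages(wanted))
```

Note that the import name of a package can differ from the name it is listed under in a package repository.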

Virtual machines

What are virtual machines?

Virtual machines, or VMs, are the virtual workspaces you use to undertake your analysis in the DataLab. VMs are created by the ABS as part of the project establishment process, described in About DataLab.

You have one VM for each project. This is a design feature to prevent data from one project being accessed by another project. You can run analysis on multiple virtual machines at the same time, but only if you have been granted local disk space. See Run jobs on offline VMs (desktops). You may want to request this option if you have multiple projects that you are actively involved in.

Virtual machine sizes

The ABS offers standard and non-standard VM sizes. Standard VMs are included in the DataLab annual fee, whereas non-standard VMs are subject to additional charges as they are more expensive to run. For more information on charges, see DataLab charges.

Researchers may request access to a non-standard machine for performance or productivity purposes. If you require a non-standard machine, you will need to consult your project lead and your project lead will need to send the ABS an updated project proposal.

Currently offered VMs and approximate running costs are listed in the tables below.

 
Standard virtual machines

Large VMs are provided as the default and most projects operate efficiently with this size.

If you have a small or medium machine, it can be upgraded to a large at no additional charge. Please contact data.services@abs.gov.au for further assistance.

Standard Virtual Machines
Name      CPU Cores   RAM     Approx cost per hour ($AUD)
Small     2           8GB     Not applicable (included in the DataLab annual fee)
Medium    2           16GB    Not applicable (included in the DataLab annual fee)
Large     8           64GB    Not applicable (included in the DataLab annual fee)

 

Non-standard virtual machines

Non-standard machines are available on request. Additional charges apply. Please refer to DataLab charges for more information.

Non-standard Virtual Machines
Name        CPU Cores   RAM      Approx cost per hour ($AUD)
X-Large     16          128GB    $1.80
XX-Large    32          256GB    $3.80
XXX-Large   64          504GB    $6.40

 

Specialised and custom non-standard virtual machines

The following specialised VMs (also non-standard), capable of supporting machine learning and high-performance computing, can also be requested. These are assessed on a case by case basis with the appropriate justification, and are subject to quote. If the required VM is not listed, the ABS may be able to provide a customised option at an additional charge. Please describe why the available machines do not meet your needs in any justification provided. A list of virtual machines by region can be viewed via the Azure website.

Assigned names of VMs are unrelated to Azure naming conventions. The ABS reviews the VM options it provides periodically. Please revisit this page for any updates.

Specialised and Custom Non-standard Virtual Machines
Name          CPU Cores   RAM       GPU               Approx cost per hour ($AUD)
Large GPU     8           56GB      Tesla T4 16GB     $1.50
X-Large GPU   16          110GB     Tesla T4 16GB     $2.40
M-series      128         2000GB    Not applicable    $28.80
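
To get a feel for what the approximate hourly rates mean for a project budget, the running cost can be estimated as hours multiplied by rate. The sketch below copies the rates from the tables above. `estimated_cost` and the 40 hour figure are hypothetical, and actual charges are set out in DataLab charges or by quote.

```python
from decimal import Decimal

# Approximate hourly rates ($AUD) copied from the tables above.
HOURLY_RATE_AUD = {
    "X-Large": Decimal("1.80"),
    "XX-Large": Decimal("3.80"),
    "XXX-Large": Decimal("6.40"),
    "Large GPU": Decimal("1.50"),
    "X-Large GPU": Decimal("2.40"),
    "M-series": Decimal("28.80"),
}


def estimated_cost(vm_name, hours):
    """Approximate running cost in $AUD for `hours` of use."""
    return HOURLY_RATE_AUD[vm_name] * Decimal(str(hours))


if __name__ == "__main__":
    # Hypothetical usage: an XX-Large VM running for 40 hours.
    print(estimated_cost("XX-Large", 40))  # 152.00
```

`Decimal` is used rather than floating point so that dollar amounts are not affected by binary rounding.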

 

Sign out or Lock your DataLab session

When you walk away from your computer or are finished with your DataLab session, you must either lock your workstation or sign out of your account to ensure nobody else accesses your DataLab account.

To lock or sign out of your workspace, click the menu button at the top of your window to expand the toolbar, then select 'Ctrl+Alt+Del' to be presented with the options to lock or sign out of your workspace. 

Workspace toolbar expander
Ctrl, alt, delete menu option
Lock menu option

If you need to leave your computer for a short length of time or you have analysis running, lock your DataLab screen. You can then close your VM window by using the X in the top right-hand corner. This closes your session but does not end any programs you have running. Your programs will continue to run until 10pm that night, or longer if you have selected the Bypass option in the portal.

'X' button to exit workspace
Desktop viewer alert received when exiting workspace

Sign out to leave your workspace session. This closes your session and will end any programs you have running. 

If you are using Citrix workspace portal you may still be logged into the browser. You can either close this window or Log Out of the portal using the icon in the top right corner (with your initial). To log back in, see Logging into the portal and workspace.

Citrix Workspace portal screen where you can close the browser window or Log Out of the portal

Portal features

My Virtual Machines, My Accounts and My Projects

Released
4/11/2021

The DataLab portal is where you will find information about your DataLab account, projects and virtual machines (VMs).

DataLab portal

The DataLab portal displays information across three tabs: 

My Projects

From this tab you can activate, start and launch the VM associated with your project. 

My Account

Use this tab to view your personal contact information and basic account settings. 

Action Log 

Keeps a record of your portal actions. This can help you manage your sessions and provides useful information if you encounter problems with the system. It includes:

  • starting VM
  • stopping VM
  • changing your Active VM
  • restarting VM
  • rebuilding VM

Left navigator menu

The left navigator menu contains shortcuts that can be used to navigate between pages. Click the arrow to collapse or expand the navigator menu.

Left navigator menu

Global links

The links at the top right are available from all pages of the portal:

  • What’s New acts as a global information centre for the DataLab, showcasing information about new features and updates
  • DataLab Workspace to access your VM. You must activate your VM before you can use this shortcut
  • About the DataLab to access this user guide
  • DataLab Privacy Notice
  • User Icon displays your details, including name, user name and user role. This is also where you log out of the DataLab

Your obligations and management responsibilities

The Responsible Use of ABS Microdata obligations guide and Using DataLab responsibly pages will help you understand your obligations and management responsibilities to handle microdata safely. Read through these pages or Contact Us if you would like any help understanding your responsibilities.

Important messages banner

This banner appears at the top of your DataLab portal window when we have an important message for your consideration or action.

Functions in My Projects

We've simplified the user interface to make it easier to launch your project VMs. On the updated My Projects page, you will see your currently "active" machine, with your other "locked" machines below it. You can only access one project at a time. Please read the information below to familiarise yourself with the new interface.

My Projects page functionality

VM management options

Click the ‘Manage’ button in the upper corner of your VM tile to see a detailed management view.

VM management options button
VM management options

If your VM is dormant, no management options are available.

dormant vm showing no management options

Note: As of March 2024, you can access your virtual machine via the new Azure Virtual Desktop (AVD) experience by selecting 'Change VM Version' and then choosing the '2024avd' version from the drop-down menu.

The contents of each pane on this screen are outlined as follows:

Power State: You can start, stop, or restart the VM. This can be helpful if swapping between VMs or having difficulties seeing your machine in the Citrix workspace.

Power State for the VM

Scheduled Shutdown: VMs are automatically shut down every night at 10pm AEST. If you have a program that you expect to run past 10pm, you can extend your session for up to 3 days by selecting 'Bypass shutdown'.

Bypass shutdown
Bypass shutdown duration

Scheduled Rebuild: VMs are automatically destroyed and rebuilt every 30 days for security and maintenance purposes:

  • you cannot extend this time, however you can choose to rebuild before the scheduled time by selecting ‘Rebuild Now’
  • it displays a date and numbered countdown on the coloured bar, with time adjusted to your local area
  • the coloured bar changes, starting with green, moving to orange, and finally red as you get closer to the rebuild date
  • after rebuilding the countdown resets to 30 days and allows you to bypass the nightly shutdown
  • if you try to bypass a shutdown when your machine is scheduled for a rebuild, the system will deny the action, but offer to ‘Rebuild Now’

Scheduled Rebuild

Run jobs on offline VMs

If you are an analyst who works across multiple projects, you can request local disk space. This will enable your VM to run jobs offline, noting the 30 day rebuild still applies.

Datasets are stored on a remote file share. Only the active machine has network access to this location. Your locked virtual machines do not. To run offline jobs, you need to request local disk space to be attached to your machine. There may be a cost associated with this.

When running jobs offline, the inactive machine can continue to run your program because it reads the data from the local disk rather than from the remote file share. However, working like this does not allow your project team to see your analysis or output. You should always move your output back to your Project or Output drives, where your project team can access and review it. See Using your workspace for more on the available drives in DataLab.

To use local disk space:

  1. Request access to a local disk for your project through the Contact Us page.
  2. Copy the data products you need to the local disk.
  3. In your program, point to your source/input data on the local disk and start the job.
  4. In your program, save your output to the local disk.
  5. Exit your VM and return to the DataLab portal to activate another machine.
  6. After you have finished running your analysis offline (on the local disk), move your analysis and output back to your Project drive.
Local disk space
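
Steps 2 and 6 above both come down to copying a folder from one drive to another. The following is a minimal Python sketch of that copy, not an ABS-provided tool. The function name copy_tree and the folder locations in the comments are assumptions for illustration only; replace them with the drives and folders shown in your own workspace.

```python
import shutil
from pathlib import Path


def copy_tree(source, destination):
    """Copy every file under source into destination, keeping the folder layout.

    Returns the list of files written, so you can confirm nothing was missed.
    """
    source = Path(source)
    destination = Path(destination)
    copied = []
    for src_file in sorted(source.rglob("*")):
        if src_file.is_file():
            target = destination / src_file.relative_to(source)
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src_file, target)  # copy2 also keeps file timestamps
            copied.append(target)
    return copied


# Hypothetical locations (assumptions, not actual DataLab paths):
# copy_tree("P:/my_project/inputs", "X:/inputs")     # step 2: data to local disk
# copy_tree("X:/outputs", "P:/my_project/outputs")   # step 6: output back to Project drive
```

Checking the returned list against what you expected to copy is a simple way to confirm that your output reached the Project drive before the next rebuild.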

Functions in My Account

Select the ‘My Account’ tab from the navigation panel to see details about your account.

Accessing your account details
Details about your account

Basic attributes displays your name, email, phone etc. If your personal details are incorrect, please Contact Us with the correct information.

Account settings allows you to opt in or out of receiving email reminders. These reminders let you know when your virtual machine will shut down. If you have started your VM that day, a notification is sent at 5pm AEST/AEDT, ahead of the 10pm scheduled shutdown. A reminder is also sent before your 30 day VM rebuild. You can change this option at any time by clicking ‘Edit’.

Email reminder in Account settings

Recommended browsers

DataLab is presented in a web browser. We recommend using the latest version of any of the following browsers:

  • Chrome
  • Microsoft Edge
  • Firefox
  • Safari

Internet Explorer is not recommended.

Note: Mobile devices are not supported/enabled for the DataLab. 

Troubleshooting

Help with logging in, virtual machines, errors and running out of space, code and software

Released
4/11/2021

Authentication

I’m having trouble with my Multi-Factor Authentication

If your DataLab account name is not recognised by your authenticator application, it may be because you have downloaded an authenticator not published by Microsoft. The ABS DataLab only supports Microsoft Authenticator. 

You will need to download the Microsoft Authenticator application to your smart phone from the App Store (for iOS) or Google Play (for Android) to fulfil the login sequence.

Microsoft Authenticator App

The following authenticator applications (similar in appearance to the Microsoft Authenticator) will not function with the ABS DataLab:

Example authentication apps that are not supported by the ABS DataLab

If you have verified you are utilising the correct application, but are not receiving prompts to authenticate on your mobile device, check your battery optimisation is turned OFF by following these steps. 

  1. Open Microsoft Authenticator and look at the top right-hand corner. If there is a red dot above the three white dots, press it and then select 'Allow' to turn battery optimisation off.
  2. Try logging in again on the DataLab landing page. You can also try switching your phone from WiFi to mobile data to ensure a good connection.

If you receive the ‘Error communicating with server’ message on your mobile device while it is connected to your home WiFi, switch your phone to mobile data and try logging in again.

If you are switching to a new phone or tablet, the operating system of some mobile devices (namely iOS 16 and some older versions of Android) may not interact as expected with Microsoft Authenticator. In this case, try setting up another phone or tablet. If you still have your old phone, it may help to remove the old account from the Microsoft Authenticator application on that device.

Logging in

I can't log in

  • If you have entered your user name or password using copy and paste, you may have accidentally included hidden characters or a space.
  • Your organisation firewall may be blocking access. Try accessing DataLab while disconnected from your organisation's network.
  • The ABS DataLab only supports use of the Microsoft Authenticator app.
  • If you have changed your mobile phone, we need to reset your Microsoft Multi-Factor Authentication. Email data.services@abs.gov.au to request this.
  • If you need to reset your password this must be done via the Forgot my password link in the initial DataLab sign in screen.
  • Clear your browser cache.
  • Try a different browser. See Recommended browsers.

Has my organisation authenticated my access to the DataLab

DataLab is enabled by cloud infrastructure, which may be blocked by some organisations’ firewall settings.

The ABS cannot make changes to external organisations' infrastructure. Project Leads need to supply the information below to each organisation participating in the project.

Network/IT Security sections in each organisation need to review and make changes to authenticate access.

There are 4 steps which need to be applied to each organisation’s security settings before the project start date to enable access to DataLab.

1. Enable authentication to the tenant

Users need to authenticate to one of the ABS Azure Active Directory tenants. This may be strictly controlled by government agencies and academic workplaces. Authentication must be enabled to the tenants:

  • mydata.abs.gov.au
  • absmydata.onmicrosoft.com

2. Allow user access to URLs

Users will need to access the following URLs:

  • DataLab production portal: datalab.abs.gov.au and gw.datalab.abs.gov.au
  • Citrix portal: absdatalab.cloud.com

3. Install the Citrix Workspace client (if you are using the Citrix Desktop)

The originating client machine must have a recent version (2020 or later) of the Citrix Workspace client installed. It is available from the Citrix Workspace download page.

If you are using Azure Virtual Desktop (AVD), ensure your organisation’s network allows outbound connections to the specified addresses:

  • login.microsoftonline.com  
  • *.wvd.microsoft.com  
  • *.servicebus.windows.net  
  • go.microsoft.com  
  • aka.ms  
  • learn.microsoft.com  
  • privacy.microsoft.com  
  • query.prod.cms.rt.microsoft.com  

These addresses all utilise the TCP protocol and outbound port 443 for communication. Contact data.services@abs.gov.au for further assistance. 
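
Network or IT security staff may find it useful to confirm, from a machine on the organisation's network, that outbound connections on TCP port 443 are possible. The sketch below is one way to do that in Python using only the standard library. The function names and the connect parameter are assumptions made for this example, and the wildcard addresses (*.wvd.microsoft.com and *.servicebus.windows.net) are left out because a wildcard cannot be tested directly. A successful connection only shows that the port is reachable; it does not confirm that a proxy or inspection device will allow DataLab traffic.

```python
import socket

# Addresses taken from the list above; wildcard entries are omitted
# because they cannot be resolved as written.
AVD_HOSTS = [
    "login.microsoftonline.com",
    "go.microsoft.com",
    "aka.ms",
    "learn.microsoft.com",
    "privacy.microsoft.com",
    "query.prod.cms.rt.microsoft.com",
]


def can_connect(host, port=443, timeout=5.0, connect=socket.create_connection):
    """Return True if an outbound TCP connection to host:port succeeds."""
    try:
        with connect((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return False


def check_hosts(hosts, connect=socket.create_connection):
    """Map each host to True (reachable on TCP 443) or False (not reachable)."""
    return {host: can_connect(host, connect=connect) for host in hosts}
```

Calling check_hosts(AVD_HOSTS) returns a dictionary with True for each address that accepted a connection and False for each that did not.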

Why do I have to log in twice during the access process

The DataLab has more functionality and features available to you, so you can set options as well as undertake your research.

  • First log-in is to the DataLab portal, where you can view and set options for your DataLab account information and virtual machines. Read more in DataLab portal features.
  • Second log-in is to the DataLab workspace where you undertake your analysis.

How long does my temporary password/password last

  • The temporary password issued to you by the ABS lasts for 90 days. After you have completed the set up steps you must reset your password.
  • If you have forgotten your temporary password, email data.services@abs.gov.au for a reset.

I forgot my password to get into the DataLab portal

Your log in credentials for the DataLab portal are the same as for the DataLab workspace. You can reset your password by clicking on the Forgot my password link.

My password expired while my virtual machine is running

Your session will continue until a shutdown is required (either the nightly shutdown or the 30 day rebuild). You can still reset your password while your session is running.

Virtual machines

My virtual machine is not launching

  1. You must Activate, then start the VM. Follow the process and wait for each step to complete before progressing.
  2. Check your internet connection. If you have a weak or intermittent connection, this can affect launching your virtual machine.
  3. Try launching the virtual machine outside of your organisation's online environment. Some institutions’ or Government departments’ firewall or other security settings may be preventing access to DataLab portal and/or launching of the VM. Attempting to connect outside your agency’s online environment may assist in forming the VM connection.
  4. Launch failures can be caused by a Citrix issue. Try again after installing the latest version of Citrix Workspace. Alternatively, you can switch your VM version to the newly released AVD by selecting 'Manage' on the virtual machine you are attempting to connect to.
  5. Restart your virtual machine. As with restarting a computer, restarting your virtual machine can sometimes resolve problems with launching your machine successfully. From the virtual machine page click the Restart VM button and wait 10 minutes to ensure the reboot of the machine is complete before attempting to launch again.
  6. If you are still having trouble, email data.services@abs.gov.au.
Restart VM button

What does it mean for a virtual machine to be Active and why does this matter

If you are a member of multiple projects in the DataLab, you will have more than one virtual machine. Your Active machine is the one that is connected to the remote file share, where the data files are stored. For security purposes, only one of your sessions can connect to the remote file share at a time. You can activate your virtual machine by using the Change Active VM button.

My Virtual Machine is not launching from the Remote Desktop client

If your VM does not launch from the Remote Desktop client, there are a few possible causes. Check that your VM is using the ‘2024avd’ version, that the machine you are trying to launch is ‘activated’, and that your machine is not currently being rebuilt.

Remote Desktop client

Why are virtual machines destroyed every 30 days

Virtual machines are destroyed approximately every 30 days for security purposes. If the 30 day timing will interfere with the timing of your project, you can choose to destroy and rebuild earlier than 30 days at a time that suits you.

Is my virtual machine backed up

Virtual machine project and output drives are backed up every night and kept for 14 days. Files outside of these drives are not recoverable.

Where do I save the work I have done on a virtual machine that is scheduled to be destroyed

Save your work to your Project or Output drives to ensure that your analysis is not lost. Information saved outside of these drives is destroyed when your machine is rebuilt every 30 days.

Can I have multiple virtual machines running code at the same time

Only if you have requested local disk space to be allocated to a machine. This allows you to run jobs on offline VMs.

I can't see my project's products

Try logging out of and stopping your VM, then begin the Start VM process again. If that does not work, try ‘Rebuild Now’ from your VM management options.

There is an upload button on my Azure Virtual Desktop machine toolbar

The upload button in your toolbar is intentionally non-functional. You may see a success message after uploading, but you will not be able to retrieve any files that you have uploaded. If you wish to upload data or packages to the DataLab, please contact the Data Services team or Input Clearance team.

I'm experiencing performance issues within my DataLab workspace

System performance issues can occur for many different reasons, and every issue requires a unique approach to troubleshoot. Many issues can be resolved by attempting one of the following:

  • Ensure you have a good internet connection 
  • If you are using Citrix, ensure your Citrix Workspace application is up to date (see the Citrix Workspace download page)
  • Close and reopen the program you are using, and close any other programs or processes that may be consuming system memory
  • Confirm your Project drive has available space; if not, free up space
  • Refresh your network drives using the icon on the workspace desktop
  • Attempt to shut down or rebuild your virtual machine

If issues persist, email data.services@abs.gov.au for further assistance.

Errors and running out of space

One of my network drives in the analysis environment is missing

If you cannot see the Library, Project, and Output network drives in File Explorer, go to the desktop and double-click the Refresh Network Drives icon.

Refresh Network Drives icon

I got an error while working with data in SAS/Stata/R/Python

Stata error example

This means you have exceeded the memory for your virtual machine.

1. Use an alternative method or program to manipulate or process the dataset. Some processes, programs and methods for working with large datasets are more memory-intensive than others. Try an alternative method to see if it uses less memory.

  • Most statistical software tools are able to filter data as it is imported. If your analysis only needs variables a, b and c from a dataset containing 30 variables, then selecting, filtering or importing only these variables uses less memory.
  • If you cannot do this in your software, consider creating a subsetted data file using another tool, such as Python, as the first step of preparing your data for analysis.
  • If you are unsure of alternative methods, we recommend discussing with other researchers in your project team who are more familiar with your chosen statistical software. The ABS does not provide advice or training on using the analytical tools provided to you in the DataLab.

2. Email data.services@abs.gov.au to request a larger machine. Larger machines incur higher running costs. With user charging, you may need to consult with your organisation to confirm incurring additional expenses for your project before applying for a larger machine.
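
As an illustration of option 1 above, the Python sketch below creates a subsetted file containing only the variables you need. It reads the source file one row at a time, so the full dataset is never loaded into memory. It assumes the data is a CSV file with a header row; the function name write_subset and the file and variable names in the comments are placeholders, not DataLab conventions.

```python
import csv


def write_subset(source_path, subset_path, keep_columns):
    """Stream source_path row by row and write only keep_columns to subset_path.

    Only one row is held in memory at a time.
    Returns the number of data rows written.
    """
    rows_written = 0
    with open(source_path, newline="", encoding="utf-8") as src, \
            open(subset_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.DictReader(src)
        available = reader.fieldnames or []
        missing = [name for name in keep_columns if name not in available]
        if missing:
            raise KeyError(f"Columns not found in source file: {missing}")
        # extrasaction="ignore" drops every column that is not in keep_columns.
        writer = csv.DictWriter(dst, fieldnames=keep_columns, extrasaction="ignore")
        writer.writeheader()
        for row in reader:
            writer.writerow(row)
            rows_written += 1
    return rows_written


# Hypothetical usage: keep variables a, b and c from a 30-variable file.
# write_subset("full_dataset.csv", "subset_abc.csv", ["a", "b", "c"])
```

If you use pandas, the usecols argument of pandas.read_csv gives a similar result at import time.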

SAS configuration file can now be edited by users with SAS installed

Users with SAS installed can now edit the SAS configuration file located at “C:\Program Files\SASHome\SASFoundation\9.4\nls\en\sasv9.cfg”. This file governs various software settings and parameters. Note that any changes will need to be repeated following machine rebuilds, as the C drive is destroyed on rebuild. To modify the SAS config file, follow these steps:

  1. Open the file at the directory above, using SAS or a text editor.
  2. Make the necessary changes.
  3. Save the file (and a backup version on your P drive, to reinstate following machine rebuilds), then test that the software behaves as intended. Exercise caution, as improper edits may lead to unexpected behaviour.
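
The backup in step 3, and the later restore after a rebuild, can be scripted as the same one-line call. This is a sketch only: the configuration path is the one quoted above, while the backup folder on the P drive and the function name copy_config are assumptions for illustration.

```python
import shutil
from pathlib import Path

# Path quoted in the text above.
SAS_CONFIG = Path(r"C:\Program Files\SASHome\SASFoundation\9.4\nls\en\sasv9.cfg")
# Hypothetical backup location on the P drive; choose your own folder.
BACKUP = Path(r"P:\config_backup\sasv9.cfg")


def copy_config(source, target):
    """Copy a configuration file, creating the target folder if needed."""
    source = Path(source)
    target = Path(target)
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, target)
    return target


# After editing the configuration:  copy_config(SAS_CONFIG, BACKUP)
# After a machine rebuild:          copy_config(BACKUP, SAS_CONFIG)
```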

I am running out of space in my Project drive

Clean up the drive contents: review and delete redundant files to free up space.
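
To help decide what to review, the Python sketch below lists the largest files under a folder. The function name largest_files and the example folder in the comments are assumptions; point it at your own Project drive location.

```python
from pathlib import Path


def largest_files(folder, top_n=10):
    """Return the top_n largest files under folder as (size_in_bytes, path) pairs."""
    sizes = []
    for path in Path(folder).rglob("*"):
        try:
            if path.is_file():
                sizes.append((path.stat().st_size, path))
        except OSError:
            continue  # skip files that cannot be read (for example, locked files)
    # Sort on size only, largest first.
    sizes.sort(key=lambda item: item[0], reverse=True)
    return sizes[:top_n]


# Hypothetical usage:
# for size, path in largest_files("P:/my_project", top_n=20):
#     print(f"{size / 1024 ** 2:10.1f} MB  {path}")
```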

Email data.services@abs.gov.au to request a storage increase. There may be a cost associated with this.

Code and software

I have some code for one project that I want to use in another project - how do I arrange this

You can request input clearance for data, code or files to be loaded to your project, from either another project, or other sources that you hold.

Can I use a mix of SAS, Stata, R and Python for different people in my project team

Yes. Each virtual machine has R, Stata and Python as default software. SAS is not automatically provided on all machines, as it requires a licence to be assigned to your virtual machine, but it can be requested. Email data.services@abs.gov.au with your request.

Is cluster processing possible in the DataLab

Cluster processing is not currently available. We are developing a Databricks service to provide a scalable clustered analytics environment for users.

Is there a delay between assigning data to a project and users seeing it

Yes, it takes about 5 minutes to process the connection. You also need to log out of your virtual machine to allow the system to refresh your session with the new data.

What can I do if my code will run longer than 10pm tonight

You can extend your session to bypass the nightly shutdown, by one, two or three nights.

How do I see what R packages I have available and how do I manage these

Use the RStudio Package Manager shortcut on the DataLab virtual machine desktop to check the range of R packages available to you. See Managing your R packages.

SAS warning messages

If the project you opened was saved with SAS Datalab – [machine name], you are connecting to the local SAS server without a profile. When you try to run the project without selecting a profile, the system may present an error message saying "The server "SASMain" is not defined in the current repository". Click through the messages and continue.

I can’t find the R packages I need in the analysis environment

  1. See Managing your R packages to use the RStudio Package Manager on the desktop.
  2. If the package you need is not listed, email your request to data.services@abs.gov.au.

Double clicking to open a PDF is not working

Due to a default Microsoft setting, the system automatically uses Microsoft Edge to open any PDF file. To open a PDF file in Adobe Acrobat Reader instead, right-click the file and select Open with > Adobe Reader.

Launching a PDF file using Adobe Acrobat Reader

How can I enable larger data storage for the PostgreSQL data directory

When a local disk is attached, the PostgreSQL data directory is now automatically relocated to X:\psql\data, which provides more storage capacity for larger datasets. To have a local disk attached, please email data.services@abs.gov.au.

Contact us

Key contacts and template emails for ABS DataLab

Released
20/07/2022

Enrol in DataLab safe researcher training

To register interest in DataLab safe researcher training, please use the link on the Safe researcher training page.

New DataLab project query

 \(\Large ✉\) New DataLab project query.

Template email to submit a query about a new DataLab project

To: data.services@abs.gov.au
Subject: New DataLab project query 

Dear DataLab team 

I would like to enquire about setting up a new DataLab project. 

Project Organisation/s:
Project Title:
Draft Project Proposal (if completed): 

Existing DataLab project query

Template email for submitting a query about an existing project

To: data.services@abs.gov.au
Subject: Existing DataLab project query 

Dear DataLab team 

I have a query regarding an active DataLab project.

Project Name and Number:
Project Organisation/s:
Project query: 

System Support

 \(\Large ✉\) System support query (this includes requesting an MFA or password reset).

Template email to request system support

To: data.services@abs.gov.au
Subject: Request System Support 

  • I would like to request an MFA reset. If yes, please indicate which system: DataLab/myDATA
  • I would like a password reset. If yes, please indicate which system: DataLab/myDATA
  • I would like to request assistance with DataLab/myDATA system issue.

Note: Do not take any screen shots or copies of the data in the DataLab.

On which system are you encountering issues: DataLab or myDATA?

  • Project name and number (if applicable):
  • Virtual machine (if applicable):
  • Exact time/date when the issue was experienced (and your time zone):
  • Describe the system issue in as much detail as possible:
  • Describe the steps taken to reproduce your issue:
  • Have you tried troubleshooting using the website user guide, if applicable? What troubleshooting steps have you taken?
  • Are you connected to a network (wifi/wired/VPN) at work or home?
  • Are you using Azure Virtual Desktop (AVD) or Citrix to access your DataLab virtual machine, if applicable?
  • What browser and operating system are you using and have all outstanding updates been installed?
  • Are there any error messages or logs associated with your issue? If so, please copy the text or any images you have of the error or logs into your output drive, specifying the location.

Request Input or Output Clearance

To request input or output clearance, please use the relevant links on the Input and output clearance page.

Request new software or software packages

 \(\Large ✉\) Request new software or software packages (this includes Python, R and Stata packages or other software not currently supported in the DataLab).

Template email to request new software or software packages

To: data.services@abs.gov.au
Subject: Request for packages in DataLab

Dear DataLab team 

I would like to request the following software or software packages be added to the DataLab. 

List software package name and include the source link:

Note: We only accept software packages from the following sources.

Python

  • https://anaconda.org/ (preferred option) 
  • https://pypi.org/  

Packages must be compatible with the current version within DataLab (Python 3.9.7) 
Anaconda.org packages must be owned by Anaconda 
PyPi.org packages must be wheel format (.whl) 

Stata 

  • https://ideas.repec.org/ 

Stata packages must include the .toc file 

R packages 

  •  https://cran.r-project.org

Databricks, other software not currently available in the DataLab, or software packages not available from Anaconda, PyPI, IDEAS or CRAN can be listed below with a business case for the ABS to review.

List of other software or software packages (include the source link):

Business case
Does the file contain data or executables:
Brief business justification including benefit to broader DataLab user group:
Who is the owner of the data/code and are they a recognised and trusted source:
Any terms of use or licensing that apply to the data/code that may restrict its use in the ABS DataLab and require additional permissions or conditions:

Notification of upcoming publication

Template email to notify ABS of an upcoming publication

To: data.services@abs.gov.au
Subject: Notification of upcoming publication

Dear DataLab team 

In accordance with the Conditions of Use I hereby notify the ABS of the intention to publish the following (attached) publication.

This process is not seeking approval from the custodians and is simply in place to give the custodians an opportunity to get across project outputs, provide comments and brief ministers, as required.

I can confirm that my published results appropriately reference the data source/s, and that an explanation of any processes or transformations applied to that data, and any relevant data disclaimers, have been included.

All other queries

For all other queries, please contact us via email at data.services@abs.gov.au. This email account is monitored and we will respond to your query during standard business hours, Monday to Friday.