Troubleshooting

DataLab

Help with logging in, virtual machines, errors and running out of space, code and software

Released
4/11/2021

Logging in

I can't log in

  • If you have entered your user name or password using copy and paste, you may have accidentally included hidden characters or a space.
  • Your organisation firewall may be blocking access. Try accessing DataLab while disconnected from your organisation's network.
  • The ABS DataLab only supports use of the Microsoft Authenticator app.
  • If you have changed your mobile phone, we need to reset your Microsoft Multi-Factor Authentication. Email microdata.access@abs.gov.au.
  • If you need to reset your password, you must do this via the Forgot my password link on the initial DataLab sign-in screen.
  • Clear your browser cache.
  • Try a different browser. See Recommended browsers.

Has my organisation authenticated my access to the DataLab

DataLab is enabled by cloud infrastructure, which may be blocked by some organisations’ firewall settings.

The ABS cannot make changes to external organisations' infrastructure. Project Leads need to supply the information below to each organisation participating in the project.

Network/IT Security sections in each organisation need to review and make changes to authenticate access.

There are 4 steps which need to be applied to each organisation’s security settings before the project start date to enable access to DataLab.

1. Enable authentication to the tenant

Users need to authenticate to one of the ABS Azure Active Directory tenants. Access to external tenants may be strictly controlled by government agencies and academic workplaces. Authentication must be enabled to the tenants:

  • mydata.abs.gov.au
  • absmydata.onmicrosoft.com

2. Allow user access to URLs

Users will need to access the following URLs:

  • DataLab production portal: datalab.abs.gov.au and gw.datalab.abs.gov.au
  • Citrix portal: absdatalab.cloud.com

3. 2020 version of Citrix Workspace client installed

The originating client machine must have a recent version of the Citrix Workspace client installed. It is available from the Citrix Workspace download page.

4. Enable HTTPS connections

All Remote Desktop client connections to ABS DataLab go via the Citrix Cloud service. You will need to enable HTTPS connections to all of the following:

  • *.citrix.com
  • *.cloud.com
  • *.nssvc.net
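
The host names in steps 2 and 4 can be spot-checked from a client machine with a short script. The following is a minimal sketch for IT staff, not an ABS-provided tool: the function names (`covered_by`, `can_reach_https`, `report`) are hypothetical, and only the host names and wildcard patterns come from the steps above. It uses the Python standard library only.

```python
import socket
import ssl
from fnmatch import fnmatchcase

# Wildcard patterns and portal hosts copied from steps 2 and 4 above.
HTTPS_PATTERNS = ["*.citrix.com", "*.cloud.com", "*.nssvc.net"]
PORTAL_HOSTS = ["datalab.abs.gov.au", "gw.datalab.abs.gov.au", "absdatalab.cloud.com"]


def covered_by(host, patterns=HTTPS_PATTERNS):
    """Return True if host matches one of the wildcard patterns."""
    host = host.lower()
    return any(fnmatchcase(host, pattern) for pattern in patterns)


def can_reach_https(host, port=443, timeout=5.0):
    """Return True if a TLS handshake with host:port succeeds from this machine."""
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=host):
                return True
    except OSError:
        # Covers DNS failures, refused or timed-out connections, and TLS errors.
        return False


def report(hosts=PORTAL_HOSTS):
    """Map each host to whether it was reachable (hypothetical helper)."""
    return {host: can_reach_https(host) for host in hosts}
```

Calling `report()` from inside the organisation's network returns a dictionary of host names and reachability results. A `False` result suggests, but does not prove, that a firewall or proxy is blocking the connection. Wildcard entries such as `*.nssvc.net` cannot be probed directly, so `covered_by` only confirms that a specific host name falls under one of the listed patterns.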

Why do I have to log in twice during the access process

The DataLab lets you set options as well as undertake your research, and each of these has its own log-in.

  • First log-in is to the DataLab portal, where you can view and set options for your DataLab account information and virtual machines. Read more in DataLab portal features.
  • Second log-in is to the DataLab workspace where you undertake your analysis.

How long does my temporary password/password last

  • The temporary password issued to you by the ABS lasts for 90 days. After you have completed the setup steps, you must reset your password.
  • If you have forgotten your temporary password, email microdata.access@abs.gov.au for a reset.

I forgot my password to get into the DataLab portal

Your login credentials for the DataLab portal are the same as for the DataLab workspace. You can reset your password by clicking on the Forgot my password link.

My password expired while my Citrix Workspace is running

Your session will continue until a shutdown is required (either the nightly shutdown or the 30 day rebuild). You can still reset your password while your session is running.

Virtual machines

My virtual machine is not launching

  1. You must Activate, then start the VM. Follow the process and wait for each step to complete before progressing.
  2. Check your internet connection. If you have a weak or intermittent connection, this can affect launching your virtual machine.
  3. Try launching the virtual machine from outside your organisation's network. Some institutions’ or government departments’ firewalls or other security settings may prevent access to the DataLab portal or the launching of the virtual machine. Connecting from outside your agency’s network may help establish the VM connection.
  4. A VM failing to launch can be caused by a Citrix issue. Try again after installing the latest version of Citrix Workspace.
  5. Restart your virtual machine. As with restarting a computer, restarting your virtual machine can sometimes resolve problems with launching your machine successfully. From the virtual machine page click the Restart VM button and wait 10 minutes to ensure the reboot of the machine is complete before attempting to launch again.
  6. If you are still having trouble, email microdata.access@abs.gov.au.

What are the virtual machines/desktops

  • Virtual machines, or VMs, are the virtual workspaces you can use to undertake your analysis in the DataLab.
  • Virtual machines are called Desktops in the Citrix Portal.
  • You have one machine for each project. This is a security measure to prevent data from one project being accessed by another project that the same researcher has access to.
  • VMs are created by us as part of the project application process, described in About DataLab.
  • You can run analysis on multiple virtual machines at the same time, but only if you have been granted local disk space. See Run jobs on offline VMs (desktops). You may want to request this option if you have multiple projects that you are actively involved in.

How large are the different sizes of virtual machines in the DataLab

  • Small (2 core CPU, 8GB memory), intended for supervisors or users who are reviewing code rather than doing their own analysis
  • Medium (2 core CPU, 16GB memory)
  • Large (2 ‘fast’ core CPU, 64GB memory)

We assign the machine size appropriate for your use, mainly driven by the size of the data approved for your project. If you are noticing poor performance, you can email microdata.access@abs.gov.au. Larger machines incur higher running costs. With user charging, you may need to consult with your organisation to confirm it will cover the additional expense before applying for a larger machine.

What does it mean for a virtual machine to be Active and why does this matter

If you are a member of multiple projects in the DataLab, you will have more than one virtual machine. Your Active machine is the one that is connected to the remote file share, where the data files are stored. For security purposes, only one of your sessions can connect to the remote file share at a time. You can activate a virtual machine by using the Change Active VM button.

Why are virtual machines destroyed every 30 days

Virtual machines are destroyed approximately every 30 days for security purposes. If the scheduled rebuild will interfere with your project's timing, you can choose to destroy and rebuild your machine earlier, at a time that suits you.

Is my virtual machine backed up

Virtual machine project and output drives are backed up every night and kept for 14 days. Files outside of these drives are not recoverable.

Where do I save the work I have done on a virtual machine that is scheduled to be destroyed

Save your work to your Project or Output drives to ensure that your analysis is not lost. Information saved outside of these drives is destroyed when your machine is rebuilt every 30 days.

Can I have multiple virtual machines running code at the same time

Only if you have requested local disk space to be allocated to a machine. This allows you to run jobs on offline VMs.

I can't see my project's products

Try logging out of Citrix, stopping your VM, and then beginning the Start VM process again. If that does not work, try the rebuild now option in your VM management options.

Errors and running out of space

One of my network drives in the analysis environment is missing

If you cannot see the Library, Project, and Output network drives in File Explorer, go to the desktop and double-click the Refresh Network Drives icon.

I got an error while working with data in SAS/Stata/R/Python

Stata error example

An error like this means you have exceeded the memory available to your virtual machine.

1. Use an alternative method or program to manipulate or process the dataset. Some methods and programs for working with large datasets are more memory-intensive than others, so try an alternative to see whether it uses less memory.

  • Most statistical software tools are able to filter data as it is imported. If your analysis only needs variables a, b and c from a dataset containing 30 variables, then selecting, filtering or importing only these variables uses less memory.
  • If you cannot do this in your software, consider creating a subsetted data file using another tool, such as Python, as the first step of preparing your data for analysis.
  • If you are unsure of alternative methods, we recommend discussing with other researchers in your project team who are more familiar with your chosen statistical software. The ABS does not provide advice or training on using the analytical tools provided to you in the DataLab.

2. Email microdata.access@abs.gov.au to request a larger machine. Larger machines incur higher running costs. With user charging, you may need to consult with your organisation to confirm incurring additional expenses for your project before applying for a larger machine.
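
As an illustration of the subsetting approach in option 1, the sketch below uses Python's standard `csv` module to copy only the needed variables from a large CSV file into a smaller one, reading one row at a time so the full dataset is never held in memory. The function name `subset_csv` and the file names in the usage note are hypothetical, and the sketch assumes the data is a UTF-8 CSV file with a header row.

```python
import csv


def subset_csv(src_path, dst_path, keep_columns):
    """Copy only keep_columns from src_path to dst_path, one row at a time."""
    with open(src_path, newline="", encoding="utf-8") as src, \
         open(dst_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.DictReader(src)
        missing = [c for c in keep_columns if c not in (reader.fieldnames or [])]
        if missing:
            raise KeyError(f"Columns not found in {src_path}: {missing}")
        # extrasaction="ignore" drops every column not listed in keep_columns.
        writer = csv.DictWriter(dst, fieldnames=keep_columns, extrasaction="ignore")
        writer.writeheader()
        rows_written = 0
        for row in reader:  # streams the file row by row
            writer.writerow(row)
            rows_written += 1
    return rows_written
```

For example, `subset_csv("full_dataset.csv", "subset_abc.csv", ["a", "b", "c"])` would write a file containing only variables a, b and c, which can then be loaded into your statistical software in place of the full file.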

I am running out of space in my Project drive

Clean up the drive contents, review and delete redundant files to free up space.

Email microdata.access@abs.gov.au to request a storage increase. There may be a cost associated with this.
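
To help decide what to delete, you could list the largest files on the drive first. This is a minimal sketch using only the Python standard library; the function name `largest_files` is hypothetical, and the folder path you pass in is whatever location your Project drive is mapped to on your virtual machine.

```python
import os


def largest_files(root, top_n=20):
    """Return up to top_n (size_in_bytes, path) pairs under root, largest first."""
    sizes = []
    for folder, _subfolders, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(folder, name)
            try:
                sizes.append((os.path.getsize(path), path))
            except OSError:
                # Skip files that vanish or cannot be read while scanning.
                continue
    sizes.sort(reverse=True)
    return sizes[:top_n]
```

Looping over the result, for example `for size, path in largest_files(project_folder): print(size, path)`, prints the biggest files first so that redundant intermediate datasets are easy to spot. The script only reports file sizes and does not delete anything.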

Code and software

I have some code for one project that I want to use in another project - how do I arrange this

You can request input clearance for data, code or files to be loaded to your project, from either another project, or other sources that you hold.

Can I use a mix of SAS, Stata, R and Python for different people in my project team

Yes, each virtual machine has R and Python as default software. SAS and Stata are not automatically provided on all machines but can be requested as they require a licence to be assigned to your virtual machine. Email microdata.access@abs.gov.au with your request.

Is cluster processing possible in the DataLab

Cluster processing is not currently available. We are developing a Databricks service to provide a scalable clustered analytics environment for users.

Is there a delay between assigning data to a project and users seeing it

Yes, it takes about 5 minutes to process the connection. You also need to log out of your virtual machine to allow the system to refresh your session with the new data.

What can I do if my code will run longer than 10pm tonight

You can extend your session to bypass the nightly shutdown, by one, two or three nights.
 

How do I see what R packages I have available and how do I manage these

Use the RStudio Package Manager shortcut on the DataLab virtual machine desktop to check the range of R packages available to you. See Managing your R packages.

SAS warning messages

If the project you opened was saved with SAS Datalab – [machine name], you are connecting to the local SAS server without a profile. When you try to run the project without selecting a profile, the system may present an error message saying "The server "SASMain" is not defined in the current repository". Click through the messages and continue.

I can’t find the R packages I need in the analysis environment

  1. See Managing your R packages to use the RStudio Package Manager on the desktop.
  2. If the package you need is not listed, email your request to microdata.access@abs.gov.au.

Double clicking to open a PDF is not working

Because of a Microsoft default setting, the system automatically uses Microsoft Edge to open any PDF file. To open the file in Adobe Acrobat Reader instead, right-click the file and select Open with > Adobe Reader.
