1406.0.55.007 - DataLab, User Guide  
Latest ISSUE Released at 11:30 AM (CANBERRA TIME) 25/03/2021  First Issue

New DataLab: troubleshooting

Logging in
Virtual machines
Errors and running out of space
Code and software

Logging in

Why do I have to log in twice during the access process

The new DataLab has more functionality and features available to you, so you can set options as well as undertake your research.

  • First log-in is to the DataLab portal, where you can view and set options for your DataLab account information and virtual machines. Read more in New DataLab portal features.
  • Second log-in is to the DataLab workspace where you undertake your analysis.

I can't log in
How long does my temporary password/password last
  • The temporary password issued to you by the ABS lasts for 90 days. After you have completed the setup steps, you must reset your password.
  • If you have forgotten your temporary password, email microdata.access@abs.gov.au for a reset.

I forgot my password to get into the new DataLab portal

Your login credentials for the new DataLab portal are the same as for the new DataLab workspace. You can reset your password by clicking the Forgot my password link.

Virtual machines

What are the virtual machines/desktops
  • Virtual machines, or VMs, are the virtual workspaces you can use to undertake your analysis in the DataLab.
  • Virtual machines are called Desktops in the Citrix Portal.
  • You have one machine for each project. This is a security measure to prevent data from one project being accessed by another project that the same researcher has access to.
  • VMs are created by us as part of the project application process, described in About DataLab.
  • You can run analysis on multiple virtual machines at the same time, but only if you have been granted local disk space. See Run jobs on offline VMs (desktops). You may want to request this option if you have multiple projects that you are actively involved in.

How large are the different sizes of virtual machines in the DataLab
  • Small (2 core CPU, 8GB memory), intended for supervisors or users who are reviewing code rather than doing their own analysis
  • Medium (2 core CPU, 16GB memory)
  • Large (2 ‘fast’ core CPU, 64GB memory)

We will assign the machine size that is appropriate for your project, mainly driven by the size of the data approved for your project. If you are noticing poor performance, you can email microdata.access@abs.gov.au. Larger machines incur higher running costs. With user charging, you may need to confirm with your organisation that it will accept the additional expense for your project before applying for a larger machine.

What does it mean for a virtual machine to be Active and why does this matter

If you are a member of multiple projects in the DataLab, you will have more than one virtual machine. Your Active machine is the one that is connected to the remote file share, where the data files are stored. For security purposes, only one of your sessions can connect to the remote file share at a time. You can activate a virtual machine by using the Change Active VM button.

Why are virtual machines destroyed every 30 days

Virtual machines are destroyed approximately every 30 days for security purposes. If the 30-day schedule will interfere with the timing of your project, you can choose to destroy and rebuild your machine earlier, at a time that suits you.

Where do I save the work I have done on a virtual machine that is scheduled to be destroyed

Save your work to your Project or Output drives to ensure that your analysis is not lost. Information saved outside of these drives is destroyed when your machine is rebuilt every 30 days.
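
If you keep working files in a local folder on the virtual machine, copying that folder to your Project or Output drive before the rebuild can be scripted. The following is a minimal Python sketch, not an ABS-provided tool. The function name and both folder paths are hypothetical: pass the actual locations of your local folder and your Project or Output drive. It assumes Python 3.8 or later.

```python
import shutil
from pathlib import Path


def copy_work_to_drive(local_folder, drive_folder):
    """Copy local_folder (with its contents) into drive_folder.

    Both arguments are hypothetical placeholders for your own paths.
    Existing files with the same name in the destination are overwritten.
    Returns the number of files present in the destination copy afterwards.
    """
    source = Path(local_folder)
    target = Path(drive_folder) / source.name
    # dirs_exist_ok=True lets the copy be repeated onto an existing folder.
    shutil.copytree(source, target, dirs_exist_ok=True)
    return sum(1 for item in target.rglob("*") if item.is_file())
```

Running the function again after further work refreshes the copy, because files already in the destination are overwritten rather than causing an error.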

Can I have multiple virtual machines running code at the same time

Only if you have requested local disk space to be allocated to a machine. This allows you to run jobs on offline VMs.

My virtual machine is not launching
  1. Check to make sure your virtual machine is powered on. Virtual machines must be active and powered on before they can be launched.
  2. Restart your virtual machine. As with restarting a computer, restarting your virtual machine can sometimes resolve problems with launching your machine successfully. From the virtual machine page click the Restart VM button and wait 10 minutes to ensure the reboot of the machine is complete before attempting to launch again.
    Image: Power state with stop VM, start VM and restart VM buttons
  3. Check your internet connection. If you have a weak or intermittent connection, this can affect launching your virtual machine.
  4. Try launching the virtual machine outside of your organisation's online environment. Some institutions’ or Government departments’ firewall or other security settings may prevent access to the new DataLab portal and/or launching of the virtual machine. Attempting to connect outside your agency’s online environment may assist in forming the VM connection.
  5. If you are still having trouble, email microdata.access@abs.gov.au.

Errors and running out of space

I am getting a Citrix error

Check that you have the latest version of the Citrix Workspace application from https://www.citrix.com/en-au/downloads/workspace-app/windows/ or request it from your IT department. An example of a Citrix error is shown below.
Image: windows error can't open this file

One of my network drives in the analysis environment is missing

If you cannot see the Library, Project, and Output network drives in File Explorer, go to the desktop and double-click the Refresh Network Drives icon.
Image: Refresh network drives desktop shortcut

I got an error while working with data in SAS/Stata/R/Python

Stata error example
Image: error op sys refuses to provide memory

This means you have exceeded the memory for your virtual machine.
  1. Use an alternative method/program to manipulate or process the dataset. Some processes/programs/methods for working with large datasets are more memory-intensive than others. Try an alternative method to see whether it uses less memory.
    • Most statistical software tools are able to filter data as it is imported. If your analysis only needs variables a, b and c from a dataset containing 30 variables, then selecting, filtering or importing only these variables uses less memory.
    • If you cannot do this in your software, consider creating a subsetted data file using another tool, such as Python, as the first step of preparing your data for analysis.
    • If you are unsure of alternative methods, we recommend discussing with other researchers in your project team who are more familiar with your chosen statistical software. The ABS does not provide advice or training on using the analytical tools provided to you in the DataLab.
  2. Email microdata.access@abs.gov.au to request a larger machine. Larger machines incur higher running costs. With user charging, you may need to consult with your organisation to confirm incurring additional expenses for your project before applying for a larger machine.
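
The subsetting approach in step 1 can be sketched in Python as follows. This is an illustration only, assuming your data is a CSV file with a header row. The function name, file names and column names are hypothetical. It uses only the Python standard library and reads one row at a time, so the full dataset is never held in memory.

```python
import csv


def write_column_subset(source_path, target_path, keep_columns):
    """Write only keep_columns from the CSV at source_path to target_path.

    Rows are streamed one at a time, so memory use stays small even for
    very large files. Returns the number of data rows written.
    """
    with open(source_path, newline="", encoding="utf-8") as src:
        reader = csv.DictReader(src)
        available = reader.fieldnames or []
        missing = [name for name in keep_columns if name not in available]
        if missing:
            raise KeyError(f"Columns not found in source file: {missing}")
        with open(target_path, "w", newline="", encoding="utf-8") as dst:
            # extrasaction="ignore" drops every column not listed in keep_columns.
            writer = csv.DictWriter(dst, fieldnames=keep_columns,
                                    extrasaction="ignore")
            writer.writeheader()
            rows_written = 0
            for row in reader:
                writer.writerow(row)
                rows_written += 1
    return rows_written


# Hypothetical usage: keep variables a, b and c from a wider file.
# write_column_subset("full_dataset.csv", "subset_abc.csv", ["a", "b", "c"])
```

You can then import the smaller file into your statistical software in place of the full dataset.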

I am running out of space in my Project drive
  1. Clean up the drive contents, review and delete redundant files to free up space.
  2. Email microdata.access@abs.gov.au to request a storage increase. There may be a cost associated with this.
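
To help with the review in step 1, a short script can list the largest files under a folder so you can decide which ones are redundant. The following Python sketch is illustrative only. The function name is hypothetical, and you need to pass the path of your own Project drive or a subfolder of it. It only reports file sizes and does not delete anything.

```python
import os


def largest_files(root, top_n=20):
    """Return up to top_n (size_in_bytes, path) tuples under root, largest first."""
    sizes = []
    for folder, _subfolders, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(folder, name)
            try:
                sizes.append((os.path.getsize(path), path))
            except OSError:
                # Skip files that cannot be read, for example files that
                # are locked or removed while the scan is running.
                continue
    sizes.sort(reverse=True)
    return sizes[:top_n]


# Hypothetical usage: print the ten largest files in megabytes.
# for size, path in largest_files("path/to/your/project/folder", top_n=10):
#     print(f"{size / 1_000_000:10.1f} MB  {path}")
```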

Code and software

I have some code for one project that I want to use in another project - how do I arrange this

You can request input clearance for data, code or files to be loaded to your project, from either another project, or other sources that you hold.

Can I use a mix of SAS, Stata, R and Python for different people in my project team

Yes, each virtual machine has R and Python as default software. SAS and Stata are not automatically provided on all machines but can be requested as they require a licence to be assigned to your virtual machine. Email microdata.access@abs.gov.au with your request.

Is cluster processing possible in the DataLab

Cluster processing is not currently available. We are developing a Databricks service to provide a scalable clustered analytics environment for users.

Is there a delay between assigning data to a project and users seeing it

Yes, it takes about 5 minutes to process the connection. You also need to log out of your virtual machine to allow the system to refresh your session with the new data.

What can I do if my code will run longer than 10pm tonight

You can extend your session by one, two or three nights to bypass the nightly shutdown.

How do I see what R packages I have available and how do I manage these

Use the RStudio Package Manager shortcut on the DataLab virtual machine desktop to check the range of R packages available to you. See Managing your R packages.

I can’t find the R packages I need in the analysis environment
  1. See Managing your R packages to use the RStudio Package Manager on the desktop.
  2. If the package you need is not listed, email your request to microdata.access@abs.gov.au.

Double clicking to open a PDF is not working

Due to a default Microsoft setting, the system automatically uses Microsoft Edge to open any PDF file. To open the PDF file in Adobe Acrobat Reader instead, right-click the file and select Open with > Adobe Reader.
Image: File Explorer showing open with menu selecting Adobe Acrobat Reader DC
