Azure Storage Explorer
How to navigate and use Azure Storage Explorer for data ingress/egress
Setting up Azure Storage Explorer
The default method of loading data to SEAD is to use Microsoft Azure Storage Explorer, which will need to be set up and managed by your organisation's ICT department. To open Azure Storage Explorer, search for the application in the Start menu of your virtual machine, then click Open.
NOTE: If users/analysts are using Azure Data Lake containers with Databricks, Azure Storage Explorer is available inside users' VMs as an alternative to AzCopy to manage and transfer files between their file share drives (output, project, etc.) and blob storage. Users can refer to the 'Azure Storage Explorer User Guide' in the shared library drive for more information. Users/analysts cannot upload or download data via Azure Storage Explorer outside of their virtual machine. Data ingress and egress to SEAD is managed by data administrators only.
NOTE: Some organisations may require users to configure a proxy for web connections. You can configure proxy settings by selecting Edit > Configure Proxy. If unsure, contact your organisation's ICT support for assistance.
Logging into Azure Storage Explorer
Within the application, click on the person icon on the top left of your side menu bar and select ‘Add an account’.
On the following ‘Select Resource’ window, select ‘Subscription’.
On the following ‘Select Azure Environment’ window, select ‘Azure’ then press ‘Next’.
The system will redirect to a browser window.
Enter your mydata.abs.gov.au username, or select your account.
Log in, then return to Azure Storage Explorer. You will now have your account added and you can access the SEAD products and project files.
Authentication
The below message appears when you have been timed out of the system:
Also follow these steps if you can’t see any files or file containers after logging in. Your session may have timed out, in which case you need to reauthenticate.
To manage this, select ‘Manage Accounts’ from the reauthentication notification at the top of your screen.
Then select the profile (person icon) from the menu bar on the left and select 'Reauthenticate now' beneath the account you are reauthenticating.
Follow the prompts for username and password.
Once you have authenticated, the browser will display the following message:
NOTE: In some cases, you may receive a notification stating that the page cannot be reached. If this occurs, close the window and return to Azure Storage Explorer where your account will be active. If it is not active, refresh your view.
Uploading data
NOTE: It is the responsibility of the SEADpod Project Owner to ensure compliance with the Project Owner’s legal requirements, rules and obligations pertaining to data input and outputs.
Firstly, you must create a product shell from the SEAD administrator interface (refer to Creating Products).
NOTE: You must create the product shell from the SEAD web application as this will enable an Access Control List (ACL). Product shells made directly in Azure Storage Explorer are not configured with this protection. Users will have to log out and back in again for new linked products and subsequent files to become available and access control lists to refresh.
Return to Azure Storage Explorer after creating your product shell, ensuring your Azure account is activated (refer to Setting up Azure Storage Explorer). In the left-hand side bar there should be a key icon listing Production ‘Prod’. Listed under ‘Storage Accounts’ are the Projects by number, library drive and product list.
NOTE: There may be multiple DataLab subscriptions visible in a SEAD administrator's Azure Storage Explorer, each containing different products. Products are spread across multiple subscriptions as a result of the load-balancing mechanisms in place to manage Azure limitations. SEAD administrators do not have permission to move projects to specific subscriptions.
Tip: Right-click on the product folder and select 'Pin to Quick Access' to allow you to quickly locate the product folder instead of searching through the list of folders.
Under the project, library or product drive, open ‘File Shares’ and select the required folder associated with the project in question.
Locate the product shell you created in SEAD previously by entering its short name in the search box in the top right corner, or by scrolling to it.
Click on the folder. This opens the folder you created in SEAD.
NOTE: Depending on what is being loaded, you may need to open additional folders. Review the data load request form for further information on what files are to be loaded.
Select the Upload button, ensuring it corresponds to the correct product. You will then see the options to Upload Folder or Upload Files.
On the following window, select the three dots to locate your folders/files.
Select the product folder or file name and press ‘Select Folder’.
On the following window, check the upload paths are correct and at the appropriate file/location level. If correct, select Upload.
NOTE: Only provisioned data administrators within your SEADpod can view objects in Azure Storage Explorer. Azure denies authentication to administrators who do not have the data administrator role within the SEADpod.
The Activities box at the bottom of the Azure Storage Explorer window indicates whether the upload was successful. If the upload has failed, retry the above steps.
Next, quality check the data upload by ensuring the number of files in the upload matches the number in the source location. For this example, the number of cache items at the bottom left-hand corner should equal the number of sub-folders in the directory.
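As an aid to this check, the counts for the source location can be produced with a short script rather than by hand. The following is a minimal Python sketch, not part of SEAD or Azure Storage Explorer. It assumes the source data sits in a local folder you can read, and the function name `summarise_folder` is hypothetical. Compare its output with the counts shown in Azure Storage Explorer after the upload.

```python
from pathlib import Path


def summarise_folder(folder):
    """Return (number of top-level sub-folders, total number of files) for a local folder."""
    root = Path(folder)
    # Sub-folders directly inside the source folder.
    subfolders = sum(1 for p in root.iterdir() if p.is_dir())
    # Every file at any depth below the source folder.
    total_files = sum(1 for p in root.rglob("*") if p.is_file())
    return subfolders, total_files


# Hypothetical usage (replace the path with your own source location):
# subfolders, total_files = summarise_folder(r"C:\data\my_product")
# print(f"Sub-folders: {subfolders}, files: {total_files}")
```

If the numbers differ from what Azure Storage Explorer shows for the uploaded product, review the Activities box and repeat the upload for the missing items.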
To download (egress) data from SEAD, use the Download function (beside the Upload button) in Azure Storage Explorer to download the data to your organisation's local data store.
NOTES:
- Reminder: SEAD users/researchers are unable to load data to or egress it from SEAD, as they cannot use Azure Storage Explorer outside their virtual machine. Users do have read and write access to the files made available to them in the 'My Products' folder, and to the Project/Output folders that are linked to their project from their workspace, so they can delete files in these folders.
- If the folder in the Azure Storage Explorer directory is empty, it has not been loaded. Other ways to quality check the data upload include making sure files have actual content (size should not be 0 KB). Also be aware that files uploaded to SEAD appear smaller because of compression. For some files, the CONTENT-MD5 may be blank. This is because CONTENT-MD5 is specific to Azure Blob Storage, and since the storage in SEAD is based on Azure Files (file shares), it is unused and left blank.
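The empty-folder and 0 KB checks described above can also be run against the source location before uploading, so that problems are caught early. This is a minimal Python sketch under that assumption. The function names `find_empty_files` and `find_empty_folders` are hypothetical, and the script only inspects a local folder, not the SEAD file share itself.

```python
from pathlib import Path


def find_empty_files(folder):
    """Return the files under `folder` whose size is 0 bytes, sorted by path."""
    root = Path(folder)
    return sorted(p for p in root.rglob("*") if p.is_file() and p.stat().st_size == 0)


def find_empty_folders(folder):
    """Return the sub-folders under `folder` that contain nothing at all, sorted by path."""
    root = Path(folder)
    return sorted(p for p in root.rglob("*") if p.is_dir() and not any(p.iterdir()))


# Hypothetical usage (replace the path with your own source location):
# for path in find_empty_files(r"C:\data\my_product"):
#     print(f"0-byte file: {path}")
# for path in find_empty_folders(r"C:\data\my_product"):
#     print(f"Empty folder: {path}")
```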
Restore files from backup
Administrators can restore files that have been backed up within the last 14 days. To do this, navigate to the project share for the relevant project and click the ‘Current’ drop-down or the ‘View Share Snapshots’ button.
Here you can view backup snapshots for the last 14 days. The timestamps are in UTC, which is 10 hours behind AEST, so a snapshot time equates to about 5 AM the following day in AEST. For example, 2023-05-01T18:47:05.0000000Z equates to 2023-05-02 4:47 AM AEST.
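To avoid converting snapshot times by hand, the conversion can be scripted. This is a minimal Python sketch: the function name `snapshot_to_aest` is hypothetical, and it assumes a fixed UTC+10 offset (AEST), so it does not adjust for daylight saving time.

```python
from datetime import datetime, timedelta, timezone

# AEST is a fixed UTC+10 offset. Daylight saving (UTC+11) is
# deliberately not handled in this sketch.
AEST = timezone(timedelta(hours=10), name="AEST")


def snapshot_to_aest(stamp):
    """Convert a snapshot timestamp such as 2023-05-01T18:47:05.0000000Z to AEST."""
    # Drop the trailing 'Z' and the seven-digit fractional seconds,
    # which standard datetime parsing does not always accept.
    whole_seconds = stamp.rstrip("Z").split(".")[0]
    utc_time = datetime.strptime(whole_seconds, "%Y-%m-%dT%H:%M:%S")
    utc_time = utc_time.replace(tzinfo=timezone.utc)
    return utc_time.astimezone(AEST)


local = snapshot_to_aest("2023-05-01T18:47:05.0000000Z")
print(local.strftime("%Y-%m-%d %H:%M %Z"))
# prints: 2023-05-02 04:47 AEST
```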
If the file you are restoring still exists in the Current snapshot and you don’t want to overwrite it from the restore point, create a backup folder and copy the file into it.
Select the time of the backup to restore, right-click the file and select Restore Snapshot.
Confirm the restore.
NOTE: If you haven’t renamed the Current version of the file or copied it into a backup folder, it will be overwritten.
Verify the restore completed successfully in the activity log.