2026 Universities Australia Solutions Summit: Keynote address

Harnessing Data for National Decision-Making

Release date and time
25/02/2026 4:00pm AEDT

Dr David Gruen AO
Australian Statistician
Wednesday 25 February 2026

Summary

The talk will discuss the development of Australian integrated data assets, which are now some of the most extensive in the world. This is enabling increasingly sophisticated analysis of a wide range of Australian public policy problems and evaluation of social policies by researchers in universities and the public sector. It is also attracting researchers from some of the best universities in the world who are accessing these data to work on Australian policy problems.

Introduction

Thank you for the opportunity to speak to you. I want to spend my time today talking about a recent development that is having a powerful effect on our capacity to do high quality analysis and research on public policy problems. That development is the rise of integrated data assets in Australia. ‘Integrated data asset’ may not be a term you have heard before. So before delving into the details, let me explain the key ideas.

The building blocks for an integrated data asset are datasets that include detailed information on individuals (or individual businesses). To give an example, the Australian Immunisation Register records, for all Australians, when they were vaccinated against COVID-19, both original vaccinations and boosters. (Australians who were not vaccinated do not appear in that dataset.) To give a different example, the Higher Education dataset records information about enrolments, completions, courses, and loan amounts for students studying at Australian higher education institutions. 

There are many datasets like these that record information about individuals across domains including earned income, tax paid, details about education, income support payments (like unemployment benefits or single parent payments), employment outcomes, disability, and migrant information.

Datasets including this wide array of different types of information are then ‘integrated’ by linking them together in such a way that the individual records for the same person in the different datasets are identified as such. Given the breadth of the available information, analysts can then use these integrated data assets to examine many aspects of people’s lives and behaviour in ways that can identify correlations – but can also be used to draw causal links.

All the individual records in the datasets are de-identified so that privacy is preserved, and the identity of the individuals in the datasets remains unknown. All researchers seeking access to these integrated data assets must sign an undertaking that they will not try to re-identify anyone in the datasets. [1]
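To make the linking idea concrete, here is a minimal sketch in Python. It assumes each dataset already carries a common de-identified linkage key. The field name `spine_id`, the toy records and the `link_records` helper are all hypothetical, chosen for illustration; this is not the ABS's actual linkage method, which is described in the material cited in footnote [1].

```python
# Toy records only: the key name "spine_id" and all values are made up for illustration.
higher_education = [
    {"spine_id": "p1", "course": "Nursing"},
    {"spine_id": "p2", "course": "Economics"},
    {"spine_id": "p3", "course": "Engineering"},
]
# People who were never vaccinated simply do not appear in this dataset.
immunisation = [
    {"spine_id": "p1", "covid_doses": 3},
    {"spine_id": "p3", "covid_doses": 2},
]


def link_records(left, right, key="spine_id"):
    """Attach the matching right-hand record (if any) to every left-hand record."""
    right_by_key = {row[key]: row for row in right}
    linked = []
    for row in left:
        match = right_by_key.get(row[key], {})
        # Left-hand fields win on any name clash; unmatched rows are kept as-is.
        linked.append({**match, **row})
    return linked


linked = link_records(higher_education, immunisation)
for person in linked:
    # A missing "covid_doses" field means no immunisation record was found.
    print(person["spine_id"], person["course"], person.get("covid_doses", 0))
```

Keeping every left-hand record matters here: because unvaccinated people do not appear in the immunisation dataset, a join that kept only matched records would silently drop them from the analysis.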

Australia has come a long way in integrating datasets and a great deal of research is being conducted using these integrated data assets – much of it by researchers in universities represented by Universities Australia. There are several Australian integrated data assets, but the two I will talk about are the largest and most extensive integrated data assets in Australia. They are hosted by the Australian Bureau of Statistics. [2] These integrated data assets are called the Person Level Integrated Data Asset (PLIDA) and the Business Longitudinal Analysis Data Environment (BLADE). As the names suggest, PLIDA links together different aspects of people’s lives and behaviours, while BLADE links together different aspects of businesses’ operations.

Universities Australia has an agreement with the ABS which provides funding for Australian university researchers to access PLIDA and BLADE (and other microdata assets hosted by the ABS). I take this opportunity to thank Luke Sheehy and Mandev Kumar from UA for their help in negotiating this agreement.

PLIDA and BLADE

The first versions of what would become PLIDA and BLADE were launched in 2015 – so they are just over 10 years old. They started small, with just a few datasets each, and have grown enormously in the subsequent decade. [3]

Both assets are longitudinal: the core datasets in PLIDA span the years 2006 to 2025, while the core datasets in BLADE span the years 2001 to 2025. 

They are large. PLIDA contains records for about 39.9 million people – almost everyone who has been in Australia at any time since 2006. BLADE contains records for about 12.5 million active businesses over 2001-2025.

Figures 1 and 2 below show the datasets currently included in PLIDA and BLADE. PLIDA currently integrates 37 datasets including the Census, tax return data, data on social security recipients, migrants, and on health, education, and disability. BLADE currently integrates 41 datasets including surveys on a wide range of business characteristics, data on business income and tax, on exports and imports, insolvency, and employment conditions.

These integrated data assets therefore provide analysts with powerful tools to shed light on public policy problems across multiple dimensions.

Figure 1: Person Level Integrated Data Asset (PLIDA) datasets

This figure outlines all the datasets included in the Person Level Integrated Data Asset (PLIDA). PLIDA is a secure data asset combining information on health, education, government payments, income and taxation, employment, and population demographics (including the Census) over time. It provides whole-of-life insights about various population groups in Australia, such as the interactions between their characteristics, use of services like healthcare and education, and outcomes like improved health and employment.

Figure 2: Business Longitudinal Analysis Data Environment (BLADE) Datasets

This figure outlines all the datasets included in the Business Longitudinal Analysis Data Environment (BLADE). BLADE is an economic data tool combining tax, trade and intellectual property data with information from ABS surveys to provide a better understanding of the Australian economy and business performance over time.

Improving the Evidence Base for Public Policy

As of January 2026, there were 354 active research projects accessing ABS-hosted integrated data assets. Of these, about half use PLIDA, about a third use a combination of both assets, and the remainder use BLADE on its own. About half are university projects and one third are government projects (Federal and State), with the remainder mostly internal ABS projects and projects undertaken by Australian think tanks like the Grattan Institute and e61.

The nearly 2,100 analysts working on these projects do so after receiving training on how to access the data assets and on the importance of maintaining the privacy of the underlying data. [4]

PLIDA and BLADE are unusual in their quality and breadth. They put Australia in the vanguard of data integration efforts around the world. Given the quality of these Australian assets, we have been keen to enable access to PLIDA and BLADE to researchers from outside Australia. [5]

This is a recent initiative, and we are seeing significant interest. Thus far, projects using PLIDA and/or BLADE are being undertaken by researchers at the University of Chicago; Harvard; MIT; the University of Illinois Urbana-Champaign; the University of California, Berkeley; the London School of Economics; and the OECD.

In the academic discipline I know best – economics – it is hard for academics to get research on Australian economic issues published in the top international journals. Making Australian integrated microdata available to international researchers is generating wider international interest in Australian policy issues that can be tackled using these data. In turn, this is making a modest contribution to improving international recognition of academic work conducted using Australian data.

Let me now describe some of the public policy issues being tackled using Australian integrated data assets. 

During the COVID-19 pandemic, elements of both PLIDA and BLADE were used by the Federal Treasury to link employees to their employers and to track flows between employment and the range of support payments put in place to soften the economic impact of the pandemic. This enabled Treasury to have a detailed understanding of labour market outcomes when it provided advice on the appropriate time to wind down JobKeeper (the main support payment for laid-off workers). [6]

To give a quite different example, researchers at Dilin Duwa, associated with the University of Melbourne, are using PLIDA and BLADE to track the contribution over time of Aboriginal and Torres Strait Islander businesses to employment and the wider Australian economy. The longitudinal nature of the data enables researchers to explore growth in the economic contribution of Indigenous businesses and examine ways in which they differ from non-Indigenous businesses. [7]

In a third example, in 2022 the Department of Health used the link in PLIDA between the Census and the Australian Immunisation Register to identify groups with low-vaccine uptake who spoke languages other than English. Table 1 shows some of the results. The level of detail shown in the table enabled communication campaigns, digital translations, and community outreach activities to be developed rapidly to lift vaccine rates for those groups that had been identified as having low uptake.

Table 1: COVID-19 vaccination uptake by language group and country of birth, as at 17 July, 2022

This table outlines COVID-19 vaccination uptake by language group and country of birth, as at 17 July, 2022.

The link between these data and Single Touch Payroll data (which links workers with their employers) enabled the Department of Health to also examine vaccination rates for people such as aged care workers who were working with vulnerable people.

The link in PLIDA with the Australian Immunisation Register was also used by an academic study which followed 3.8 million Australians aged 65 and over in 2022 to examine the relationship between vaccination status and mortality for this older age group.

The study demonstrated that a person aged 65 or over who had received three COVID-19 vaccinations, with the third dose administered within the previous three months, had COVID-19 mortality 93 per cent lower than that of a comparable unvaccinated person.

The study also demonstrated how vaccine effectiveness wanes over time. It showed that people who received their most recent booster within the previous three months had a much larger reduction in mortality (by around 20 percentage points) than people whose latest booster had been more than six months ago. Being vaccinated reduced mortality significantly relative to the unvaccinated but the level of protection was noticeably higher for those with a recent booster.
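The percentages quoted here are relative reductions in mortality rates. As a rough illustration of the arithmetic only, the sketch below computes a relative reduction from two rates, and then the gap between two such reductions in percentage points. The function name and all rates are hypothetical, chosen so that the first result matches the 93 per cent figure and the gap matches the roughly 20 percentage points mentioned above. The study itself compared vaccinated people with comparable unvaccinated people, which this simple calculation does not attempt.

```python
def relative_reduction(rate_vaccinated, rate_unvaccinated):
    """Proportional fall in the mortality rate relative to the unvaccinated group."""
    if rate_unvaccinated <= 0:
        raise ValueError("unvaccinated rate must be positive")
    return 1 - rate_vaccinated / rate_unvaccinated


# Hypothetical deaths per 100,000 people; these are not figures from the study.
unvaccinated = 100.0
recent_booster = 7.0   # booster within the previous three months
older_booster = 27.0   # latest booster more than six months ago

recent = relative_reduction(recent_booster, unvaccinated)
older = relative_reduction(older_booster, unvaccinated)

print(f"recent booster: {recent:.0%} lower mortality")
print(f"older booster:  {older:.0%} lower mortality")
print(f"gap: {(recent - older) * 100:.0f} percentage points")
```

Note the distinction the sketch draws: each reduction is a percentage of the unvaccinated rate, while the difference between two reductions is expressed in percentage points.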

In another example, researchers at the ANU using PLIDA found strong evidence that the introduction of the National Disability Insurance Scheme led to a nearly one-third increase in reported autism prevalence and accounted for nearly half of new autism diagnoses since the introduction of the scheme. Thus, financial incentives have led to an increase in the proportion of people diagnosed with autism, who are therefore eligible for financial support from the NDIS. [8]

As a final example, researchers at e61 linked de-identified individual tax records with university enrolments data available in PLIDA to study the relationship between individuals’ Australian Tertiary Admission Rank (ATAR) and their subsequent earnings. What did they find? 

Beyond the age of 25, average earnings are higher for individuals with higher ATARs – and the higher the ATAR, the higher the average earnings. Further, the average earnings gap rises as people age. Just to be clear: these results are averages and plenty of people with lower ATARs earn more than others with higher ATARs! [9]

In a new development, PLIDA is being used to conduct evaluations of social programs. Let me give two examples.

The ACT Justice and Community Safety Directorate is evaluating the policy commitment to reduce recidivism in the ACT by 25 per cent by 2025. ANU researchers will use linked administrative data from ACT Police, Criminal Courts and Corrective Services to assess whether this policy commitment has been achieved.

As a second example, Mission Australia, in collaboration with research partners from the University of Sydney and RMIT, is linking Mission Australia data with income support and labour market data in PLIDA to evaluate the extent to which its homelessness support programs and services have improved outcomes for recipients of that support.

Conclusion

Australia’s recent experience building powerful integrated data assets has been an overwhelmingly positive one. Important public policy questions and careful evaluations of social policy are being tackled by researchers in universities and the public service accessing PLIDA and BLADE. 

These two integrated data assets are unusual in their quality and breadth. They have attracted interest from researchers at some of the best universities in the world who are using them to work on Australian policy problems.

Thank you.

Footnotes

[1] The linking of datasets is done via a spine that is common across the linked datasets (see https://www.abs.gov.au/about/data-services/data-integration/person-linkage-spine for further information). It is incumbent on the hosts of these data assets, in this case the ABS, to ensure they are secure, with well-developed protocols to ensure the private information of individuals (and businesses) is protected and is not compromised. For more on the ABS approach to data confidentiality, see https://www.abs.gov.au/about/data-services/data-confidentiality-guide/five-safes-framework.

[2] Other integrated data assets in Federal public service agencies are the National Disability Data Asset, owned by the Department of Health, Disability and Ageing; ALife, hosted by the Australian Taxation Office; the National Health Data Hub, hosted by the AIHW; and other integrated data assets hosted by the ABS, including the Australian Census Longitudinal Dataset (see https://www.abs.gov.au/about/data-services/data-integration). Other integrated data assets are hosted by State governments and universities.

[3] For the history of PLIDA and BLADE, see https://www.abs.gov.au/about/our-organisation/australian-statistician/speeches/data-linkage-and-integration-improve-evidence-base-public-policy-lessons-australia.

[4] As previously mentioned, everyone accessing the integrated data assets first signs an undertaking that they will not attempt to re-identify any individual or business in the unit records they access.

[5] https://www.abs.gov.au/statistics/microdata-tablebuilder/datalab/access-outside-australia

[6] See https://treasury.gov.au/publication/p2021-211978 and https://www.abs.gov.au/system/files/documents/dd267c4bbee2318ccdaa8a6cd5e54974/Cully%2C%20Whalan%20-%20Looking%20under%20the%20lamppost%2C%20new%20data%20for%20unseen%20challenges.pdf

[7] See https://dilinduwa.com.au/snapshot-studies

[8] https://crawford.anu.edu.au/sites/default/files/2025-04/Complete%20WP%20Ranjan%20Breunig%20Apr%202025.pdf

[9] https://e61.in/whats-in-an-atar-how-university-admission-scores-predict-future-incomes/
