Dr David Gruen*

Thursday, 13 February 2020

Understanding what’s happening and why – the role for microdata


Hello and welcome to the ABS/UNSW BLADE Expo. It’s a pleasure to be with you all today to open this event. This is my first public speaking appearance as Australian Statistician, and I’m very pleased to be discussing a range of topics that are important to me, the ABS and many of you. I expect this will be the first of many opportunities for data users, stakeholders, researchers, policy makers, academics and the ABS to broaden and deepen our collaboration. Enhancing ABS collaboration, particularly with academics and other experts, is a topic close to my heart, and one I’ll come back to a little later.

Today I would like to talk to you a little about me, particularly my statistics and data journey; reflect on the ABS data journey over recent years; and then look forward to highlight some of the opportunities I see for data to provide insights into some of Australia’s policy challenges.

I’ll also say a few words about BLADE, the Business Longitudinal Analytical Data Environment, which is one of the ABS’, and Australia’s, key integrated data assets. In the next presentation, Alan Herning from the ABS will provide you with a much more detailed description of BLADE, the data and the opportunities it presents to you – policy makers and researchers.

I’ll then sum up and ensure there’s time for questions.

A few things about me

I have always had a passion for data. I’m inquisitive and have a strong desire to understand not only what’s happening, but also why. This way of thinking has led me, time and time again, over many years of study and my career, to seek out data and statistics.

My first exposure to statistics was in my undergraduate degree. I remember learning that if you want to derive an unbiased estimate of the population variance from a sample variance then you need to divide by ‘n-1’ rather than ‘n’ in circumstances when you also need to use the sample mean to estimate the population mean. I remember being impressed by how clever that is, and concluding: these statisticians know a thing or two!
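For readers who would like to see the result behind that remark, it can be sketched as follows. The notation here ($x_i$ for the sample values, $\bar{x}$ for the sample mean, $\mu$ and $\sigma^2$ for the population mean and variance) is introduced purely for illustration, and the statement assumes $n$ independent draws from a population with finite variance.

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i ,
\qquad
s^2 = \frac{1}{n-1}\sum_{i=1}^{n} \left(x_i - \bar{x}\right)^2 ,
\qquad
\mathbb{E}\!\left[s^2\right] = \sigma^2 .

% Dividing by n instead understates the variance on average:
\mathbb{E}\!\left[\frac{1}{n}\sum_{i=1}^{n} \left(x_i - \bar{x}\right)^2\right]
  = \frac{n-1}{n}\,\sigma^2 .

% If the population mean \mu is known, dividing by n is already unbiased:
\mathbb{E}\!\left[\frac{1}{n}\sum_{i=1}^{n} \left(x_i - \mu\right)^2\right]
  = \sigma^2 .
```

The correction arises because the deviations are measured from $\bar{x}$, which is itself estimated from the same sample and so sits slightly closer to the data than $\mu$ does.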

My career took a short detour into biophysics, a discipline that confirmed for me the importance of data.

A few years after I became a biophysicist I was feeling restless, so I began reading Macroeconomics by Dornbusch and Fischer. Since then I haven’t really looked back, apart from the occasional Google Scholar search to see how often my biophysics work has been cited.

My subsequent economics study and PhD reinforced for me the power of data and statistics. As part of my PhD, I conducted a survey to examine the relevance of the Ricardian equivalence hypothesis, though I’m not sure my sample size of 632 economics students would be viewed favourably by the methodologists at the ABS. Nevertheless, the data and results enabled me to challenge assumptions, present new ideas and shed light on why individuals behaved in the way they did.

I was a heavy user of ABS statistics during my time as Head of Research at the Reserve Bank of Australia (RBA). This was the late 1990s and early 2000s, a period following significant micro and macroeconomic reform in Australia, as well as the introduction of the Goods and Services Tax. While at the RBA I often wanted to get my hands on the underlying data from which the statistics had been compiled. I knew the underlying data would provide more insights into why individuals and businesses were behaving in particular ways. I couldn’t get hold of the underlying data back then, but now I can! I’m happy to hear that over recent years RBA staff have undertaken studies utilising ABS microdata from the Consumer Price Index[1]; Wage Price Index[2]; and the New Capital Expenditure Survey[3]. I’m keen for this type of work to continue, and to expand the opportunities for academics and other experts to undertake this type of research.

The Global Financial Crisis hit during my time at the Treasury. I’ve reflected on the crisis previously[4], discussing the main causes. Related to this was the intense focus the Treasury and others paid to the statistics as they were released, both domestically and internationally. Once again, I was reminded of the critical need for high quality and timely data and statistics.

When I moved to the Department of Prime Minister and Cabinet (PM&C) I gained responsibility for data policy in Australia. In recent years I have led the Data Integration Partnership for Australia (DIPA). DIPA started in July 2017 and is a three-year $130.8 million investment to maximise the use and value of the Government’s data assets. DIPA aims to create new insights into important policy questions through data integration and analysis.

DIPA is a whole-of-government collaboration of over 20 Commonwealth agencies, and is improving technical data infrastructure and data integration capabilities across the Australian Public Service. Important data assets such as BLADE have been further developed under DIPA, allowing policy makers to gain insights that were not possible before. At PM&C I was also responsible for establishing the Office of the National Data Commissioner. I recruited Deb Anton, the Interim Commissioner, and have supported the development of an environment, and frameworks, to foster greater sharing of government data across the Commonwealth for the benefit of Australians.

I mention all this background to emphasise that I come to the role of Australian Statistician with a knowledge of, and affection for, statistics and data. I know the importance of producing high quality statistics and, as a long-term user, I’m attuned to the need for access to the underlying data. I suspect, perhaps I already know, that these topics are also important to many of you.

Now that you have some insight into my journey, I would like to turn to the journey the ABS has been on over recent years.

The ABS journey

The ABS has a long history of collecting data, producing statistics and keeping data about people and businesses safe.

Important ABS statistics that the community relies upon have been around for decades. The ABS CPI dates back to 1948, and the quarterly Australian National Accounts series begins in 1958.

These important statistics, and many others like them, provide critically important information to policy makers about what’s happening in Australia’s economy, society and environment. And it’s the underlying data behind these statistics, a great asset, that can help all of us understand the ‘why’ of what’s happening.

The ABS has, over the past few years, improved access to ABS microdata – particularly integrated microdata.

In the past, ABS microdata was available only to government employees who were seconded to the ABS, and it could only be accessed on-site in ABS offices. This approach limited the ability of the ABS to grant access to skilled researchers who could use the data to assist government in evaluating and formulating policy.

However, significant change occurred at the ABS in late 2018 when the Census and Statistics (Information Release and Access) Determination 2018 was enacted by the Federal Parliament.

The release of ABS data, including microdata, is authorised by the Census and Statistics (Information Release and Access) Determination 2018. This updated Determination enabled greater access to ABS microdata and, in particular, broadened the purposes for which information can be used to include ‘research’ purposes in addition to the original ‘statistical’ purposes.

It was my predecessor, David Kalisch, who utilised the Determination to provide microdata access, including to BLADE and MADIP (the person-level integrated asset), to a wider range of trusted researchers and analysts via the ABS Virtual DataLab. This wider range of trusted researchers now includes government employees, government contractors, State and Territory government employees, individuals sponsored by government departments, academics and researchers from public policy institutes.

This access is supported by the ABS Five Safes, under which projects are assessed on five dimensions: safe people, safe projects, safe data, safe settings and safe outputs. I won’t go into each of these here as Alan will be covering them a bit later.

By enabling timely, safe and effective access to data, these changes opened the door to a significant amount of new research that will strengthen the development of evidence-based policy, programs and services, which ultimately will provide a significant benefit to the Australian community.

And these new access arrangements have resulted in more researchers utilising BLADE. In 2017 there were 26 researchers accessing the BLADE data asset from across various government departments. Following the application of the Statistics Determination 2018, this had increased to 148 researchers, including 17 academics, across 61 projects as at December 2019.

Enabling academics, and others, to access ABS microdata is one element of my desire to expand ABS engagement with academics, public policy institutes, policy makers at Federal and State and Territory government level, and other experts. While I’m still learning the ropes as the new Australian Statistician, my sense is that a significant amount of progress has already been made – and there is more to be done by the ABS to enable broad access to ABS data in a safe way.

I’d like to make a few comments about BLADE.

BLADE’s history is short, but it has evolved considerably since its creation in 2014. BLADE is best described as a series of integrated, linked longitudinal datasets.

BLADE was originally established as the Expanded Analytical Business Longitudinal Database (EABLD) in 2014 as a joint project between the ABS and the Department of Industry, Innovation and Science, to enable Australia to participate in the OECD’s project on employment dynamics. The first BLADE asset initially integrated select ABS business surveys with data from the Australian Taxation Office.

As I mentioned earlier, in July 2017 the Commonwealth Government established the Data Integration Partnership for Australia to raise the use and value of the Government’s data assets. DIPA funded the ABS to expand and improve BLADE to enable policy makers to gain insights into the economy and business performance and dynamics that were not possible before.

Under DIPA, the ABS has expanded BLADE to include more data from the ATO, more ABS survey data and more administrative data from a number of government departments, such as Merchandise Imports and Exports data from the Department of Home Affairs and government program data from the Department of Industry, Innovation and Science.

In late 2019, the ABS also integrated into BLADE 12 years of Agriculture census and survey data for the first time.

The first DIPA project linking employer and employee information from BLADE and MADIP is currently underway with the ABS goal to develop an integrated data asset that enables research across economic, social, geospatial and environmental areas.

As I mentioned earlier, we are facing a range of social, economic and environmental challenges – with many of these challenges complex and interrelated. To be able to respond to these challenges, data that reflect these multiple dimensions need to be available.

Now I would like to focus on the future.

Looking to the future

Utilising data well enables us all to gain insights into what’s happening now in our communities, economy and the environment; why it’s happening; and what might happen next. And it’s more important than ever to gain these insights in a range of important social and economic policy areas – like housing, energy, tax and inequality.

Most, if not all, countries around the world are facing a range of challenges – with many of these challenges complex and interrelated. Expert multidisciplinary analysis and research is needed, especially using integrated datasets like BLADE.

And analysis on integrated data has already yielded positive results. A recent study analysed Pharmaceutical Benefits Scheme (PBS) data to identify adverse events associated with medicines. The study identified 5 medications potentially associated with heart failure[5]. This study is a compelling case for using data well. There are many more examples where combined administrative data are being used to identify and understand the characteristics of Australia’s businesses – you’ll see some examples of these studies after morning tea; to improve the allocation of school funding; and in partnership with Geoscience Australia, the ABS is exploring the use of satellite images about land cover, water availability and use to better understand the impact of bushfires and floods.

Expertise from academics and others is needed to enable the ABS to continue to utilise and interpret the ever-increasing volume of available data; provide insights from microdata; and ensure statistical outputs are well communicated, timely, accurate, and reflect the dynamic nature of Australian society, economy and environment.

I’ll be focused on ensuring experts are invited into discussions relating to these topics at the ABS.

There are some challenging areas that are front of mind that would benefit from early expert engagement.

Producing high quality statistics requires accurate data sources, sound methods and an ability to interpret the results produced, especially when the data produce surprising results.

The productivity slowdown across the developed world in recent decades, including in Australia, continues to be a puzzle. Economists continue to question why seemingly rapid technology diffusion is not driving higher rates of productivity growth at the economy wide level.

Australia is predominantly a services-based economy. Sixty-eight per cent (68%) of GDP in Australia is derived from services activities. Services and non-market activities present special measurement challenges for National Statistics Offices. Continuing to measure and understand the nature of these activities will be important to a comprehensive understanding of Australia’s economic performance in the years to come.

In previous roles I have spoken publicly about: the future of work and the impact of technology[6]; inequality[7]; as well as technology and international trade[8]. These are topics that will continue to be of interest to me and, hopefully, to many of you.

Utilising experts benefits the ABS, our staff and the statistics produced. I’m aware of some examples from the recent past where this has been borne out.

Some of you here today are members of the ABS Methodology Advisory Committee; have mentored or are supervising ABS staff studying for a PhD; and have contributed to enhancements to important statistical outputs. The Consumer Price Index (CPI) is a great example. Academic work by Professor Kevin Fox and support from Dutch Professor Jan de Haan enabled the ABS to take advantage of new data sources – scanner data and web scraping – and implement new index methods, which means the ABS now produces one of the world’s best measures of household inflation. Though, no doubt some of you would prefer the CPI to be compiled monthly!

Advancing knowledge in the topics I’ve mentioned will bring great benefit to policy makers, and will perhaps shed more light on Australia’s productivity performance. BLADE is a key data asset that can help.

Microdata presents great opportunities for the ABS and researchers. And these opportunities also come with enormous responsibility. A responsibility to care for these data in a way that respects the people and businesses who provided them, to keep the data safe while also carefully using the data to improve the lives of Australians.

To continue to harness the opportunities of data we need to be focused on the community’s trust of what is planned for data use and how that use is carried out. In this context, the ABS has an important role.

Let me now sum up so I have time to answer some of your questions.

The ABS is much more than Australia’s national statistics agency that publishes high quality statistics almost every day. The ABS is an independent institution with a history of almost 115 years focused on safely capturing, storing and releasing data. It’s in the DNA of the ABS to carefully consider the safety of data in all aspects of our business. At a time when the power of data is being increasingly recognised, the ABS is playing an expanded role driving improved data capability across the Australian Public Service and beyond. The ABS will continue to do much more than produce a core set of economic and demographic official statistics.

Building community trust in the use of data by government is a long-term pursuit because years of good practice can be quickly undone. The ABS will continue to champion the safe use of data into the future, and provide the foundation on which others can rely to demonstrate the highest professional data standards. As has been widely recognised, the ABS is one of Australia’s core public policy institutions, one that provides an invaluable service to all Australians.

Thank you for your time. I’m looking forward to your questions.

