In 2014, the Australian Bureau of Statistics (ABS) National Migrants Statistics Unit (NMSU) conducted a study to examine the quality of the Department of Immigration and Border Protection (DIBP) Temporary Visa Holders (TVH) administrative data and its suitability for data integration. For the purposes of this study, the administrative data on temporary migrants comprises data on International Students and Temporary work (skilled) (subclass 457) visa holders as at 31 July 2011.
The study investigates the quality of the administrative data on migrants holding student and Temporary work (skilled) visas, and assesses the suitability of the data for linking with the ABS 2011 Australian Census of Population and Housing and with the Australian Taxation Office (ATO) Personal Income Tax (PIT) data. Linking the data to the Census file would create an Australian Census and Temporary Migrants Integrated Dataset (ACTMID) that could provide enhanced information on temporary migrant outcomes. If an integrated dataset of sufficient quality could be produced, it would provide new information on the characteristics of temporary migrants, such as usual residence currently and one year ago, labour force status, educational qualifications obtained, income and housing. This data would support policy and planning, and would give academics and researchers a rich source of information on this growing population group.
The study also explores methodologies to improve future data integration studies and compares its findings with earlier studies that assessed the quality of the integrated dataset created by linking the 2011 Census to the Department of Social Services (DSS) and DIBP Settlement Database (SDB).
Relevant legislation and guidelines, including the Privacy Act 1988 and the High Level Principles for Data Integration Involving Commonwealth Data for Statistical and Research Purposes, were adhered to, protecting the privacy of individuals.
The data items provided for the study were assessed for completeness and for anomalies. The feasibility of linking with the 2011 Census was assessed using an experimental linkage simulation tool. This tool simulates the probabilistic linking process and produces a diagnostic report, so that the feasibility of linking two datasets can be assessed before any linkage is undertaken.
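The study does not describe the internals of the simulation tool. As a rough illustration of what a probabilistic linking process scores, the sketch below uses the widely used Fellegi-Sunter formulation, in which each linking field adds a log-likelihood weight depending on whether two records agree on it. The field names and the m- and u-probabilities are hypothetical values chosen for illustration, not figures from the study.

```python
import math

# Hypothetical m- and u-probabilities per linking field (illustrative only).
# m = P(field agrees | both records belong to the same person)
# u = P(field agrees | the records belong to different people)
FIELD_PARAMS = {
    "sex": (0.98, 0.50),
    "birth_year": (0.95, 0.02),
    "meshblock": (0.80, 0.001),
}


def match_weight(rec_a, rec_b, params=FIELD_PARAMS):
    """Sum the log2 agreement or disagreement weight of each field."""
    total = 0.0
    for field, (m, u) in params.items():
        a, b = rec_a.get(field), rec_b.get(field)
        if a is None or b is None:
            continue  # a missing value contributes no evidence either way
        if a == b:
            total += math.log2(m / u)  # agreement weight (positive)
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement weight (negative)
    return total


# Hypothetical record pair: same person, with and without a geocoded Meshblock.
census = {"sex": "F", "birth_year": 1988, "meshblock": "20000001"}
tvh_full = {"sex": "F", "birth_year": 1988, "meshblock": "20000001"}
tvh_no_address = {"sex": "F", "birth_year": 1988, "meshblock": None}

full_weight = match_weight(census, tvh_full)
partial_weight = match_weight(census, tvh_no_address)
```

Under these assumed parameters, the pair without a Meshblock scores much lower than the fully populated pair, because the most discriminating field contributes nothing. A pair is typically accepted as a link only when its total weight exceeds a chosen cut-off, so records with missing address information are harder to link.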
This paper provides background to the Temporary Migrant Feasibility Study, a discussion of the quality of the TVH administrative data items, and a brief description of the simulated linking strategy, with detailed results included in the Appendix.
The results of the quality study and the linkage simulation show that linking the TVH data to the 2011 Census is likely to achieve a link rate of about 70%, but that there are a number of issues with the data. The most prevalent issue is records that lack residential address information and therefore could not be assigned a Meshblock during geocoding. Extensive analysis would be required after any linkage to ascertain the characteristics of temporary migrants in both the linked and unlinked records, so that any systematic bias in the linked data can be accounted for.
The assessment of the feasibility of undertaking future linking of the TVH records to the 2011 Census concluded that, with simulated linkage results of a 70% link rate and a 98% precision rate, all the required elements are in place to produce a useful dataset for analysis. This is particularly true for the largest groups of temporary migrants (International Students and Temporary work (skilled) (subclass 457) visa holders). However, to improve the link rate and provide more reliable and more detailed estimates, further work is needed to assess and improve data quality and to address any issues that arise from the under-representation of certain subgroups in the linked dataset.
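To make the two reported measures concrete, the sketch below computes them from raw counts. It assumes the common definitions: link rate is the share of TVH records that receive a link, and precision is the share of those links that are correct. The counts are hypothetical, chosen only so that the results reproduce the reported 70% and 98%.

```python
def linkage_summary(n_records, n_linked, n_correct_links):
    """Return (link_rate, precision) for a linkage run."""
    if n_records <= 0 or n_linked <= 0:
        raise ValueError("record and link counts must be positive")
    link_rate = n_linked / n_records          # share of records that were linked
    precision = n_correct_links / n_linked    # share of links that are correct
    return link_rate, precision


# Hypothetical counts, not taken from the study.
link_rate, precision = linkage_summary(
    n_records=1000, n_linked=700, n_correct_links=686
)
false_links = 700 - 686  # links that join records of different people
```

With these assumed counts, 686 of every 1,000 records would be correctly linked, 14 would be linked to the wrong Census record, and 300 would remain unlinked. The unlinked group is the main source of potential bias if it differs systematically from the linked group.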
The study also found that linking the TVH records to the ATO PIT data is not feasible, primarily because of the quality of the address data. Very few variables are common to both datasets, so successful linkage to PIT data requires high-quality name and address data. The temporary migrants dataset did not meet this requirement because high-quality address information was not available for all records.