Microdata and TableBuilder: Characteristics of Employment, Australia

Enables detailed analysis of employment characteristics

COE microdata in DataLab

Microdata from the annual Characteristics of Employment (COE) survey is now available in ABS DataLab as a supplementary file for the Longitudinal Labour Force (LLFS) microdata. All existing users of the LLFS microdata will automatically get access to the COE file (use of the file may require an updated project proposal) and new users can apply for access to both files. 

This release of COE microdata features data collected annually for the months August 2014 to August 2021 and enables detailed analysis of employee earnings, casual workers, independent contractors, trade union membership, labour hire, job flexibility and job security. A future update to include the results from the August 2022 COE survey is scheduled for release on 14 December 2022.

A detailed data item list for the COE microdata is available in Data downloads.

Introduction

This product provides a range of information about the release of microdata relating to employment characteristics.

Microdata are the most detailed information available from a survey and are generally the responses to individual questions on the questionnaire or data derived from two or more questions.

Characteristics of Employment, 2014 to 2021

The Characteristics of Employment survey (COE) is conducted in August throughout Australia and is designed to provide statistics on employment across the following 18 concepts:

  • Away from work
  • Casual work and Job security
  • Characteristics of employment (all jobs)
  • Characteristics of main job
  • Characteristics of second job
  • Demography
  • Earnings in main job (median, mean and distribution of weekly and hourly earnings)
  • Education and Qualifications
  • Families and children
  • Fixed-term contracts
  • Independent contractors
  • Job flexibility and Working from home
  • Labour hire
  • Leave entitlements
  • Overemployment and Overtime
  • Trade union membership
  • Underemployment
  • Working arrangements and Working patterns

Microdata from the COE survey is released in both TableBuilder and DataLab.

TableBuilder is an online tool for creating tables and graphs from underlying microdata. Refer to TableBuilder for more information.

DataLab is the analysis solution for high-end users who want to undertake real-time complex analysis of detailed microdata in a secure environment. Refer to DataLab for more information.

Historical microdata, 1998 to 2010

Prior to 2014, microdata relating to employment characteristics was released in a number of Confidentialised Unit Record Files (CURFs). These files are available in DataLab.

Accessing the data

You can use this data in:

  • TableBuilder - online tool for creating tables and graphs.
  • DataLab - analyse detailed microdata.

Compare data services to see what's right for you. Information on how to apply for access can be found in TableBuilder and DataLab.

Further information about these products, and other information to assist users in understanding and accessing microdata in general, is available from the Microdata and TableBuilder Entry Page.

Further information

Further information about the survey and the microdata can be found in the various pages associated with this product, including:

Support

For further support in the use of this product, please contact Microdata Access Strategies via microdata.access@abs.gov.au.

Data available on request

Data collected in the survey but not included in TableBuilder or DataLab may be available from the ABS, on request, as statistics in tabulated form.

Subject to confidentiality and sampling variability constraints, special tabulations can be produced incorporating data items, populations and geographic areas selected to meet individual requirements. These are available, on request, on a fee for service basis. For more information, contact the ABS by visiting www.abs.gov.au/about/contact-us or email the Labour Statistics Branch at labour.statistics@abs.gov.au.

Privacy

The ABS Privacy Policy outlines how the ABS handles any personal information that you provide to us.

Changes to TableBuilder

Many improvements have been made to the 2021 COE TableBuilder release to simplify the way data items are presented, increase the range of data items available and generally enhance the usability of the product.

Concept groups

The way that data items are arranged in TableBuilder has been changed to better reflect different conceptual groupings (rather than populations and data items). These concept groups are intended to assist with understanding the data and using the dataset. Data items are now listed under one of the following conceptual groups. Some of the concepts are only collected every two years, on an alternating basis.

All years

  • Away from work
  • Characteristics of employment (all jobs)
  • Characteristics of main job
  • Characteristics of second job
  • Demography
  • Earnings in main job (median, mean and distribution of weekly and hourly earnings)
  • Education and Qualifications
  • Families and children
  • Fixed-term contracts
  • Independent contractors
  • Leave entitlements
  • Underemployment

Even years only

  • Casual work and Job security
  • Characteristics of independent contractors
  • Labour hire
  • Trade union membership

Odd years only

  • Job flexibility and Working from home
  • Overemployment and Overtime
  • Working arrangements and Working patterns

For more details, refer to the Data item list available in the Data downloads section.

Occupation data

Occupation data is now provided to the unit group level (4-digit). TableBuilder may suppress this level if a requested table is too finely detailed. Occupation data can be aggregated to minor, sub-major and major group levels (3-, 2- and 1-digit levels) to avoid table suppression.

Hourly earnings in main job

The parametric item "Hourly earnings in main job," which is available under Summation options in TableBuilder, has been changed from dollars to cents. Median and mean hourly earnings are more accurately calculated using cents as the software performs better with integers. Custom ranges can be specified in one-dollar increments (100 cents) between $5 per hour and $385 per hour (500 and 38500 cents).
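As a rough illustration of working in cents, the sketch below converts whole-dollar boundaries into the cent values described above and checks them against the stated limits. The function names and the example range widths are hypothetical and are not part of TableBuilder.

```python
# Limits stated in the text above, expressed in cents.
MIN_CENTS = 500      # $5 per hour
MAX_CENTS = 38500    # $385 per hour
STEP_CENTS = 100     # one-dollar increments


def is_valid_boundary(cents: int) -> bool:
    """Return True if a boundary is a whole-dollar amount within the allowed span."""
    return MIN_CENTS <= cents <= MAX_CENTS and cents % STEP_CENTS == 0


def custom_ranges(start_dollars: int, stop_dollars: int, width_dollars: int) -> list[tuple[int, int]]:
    """Build consecutive (lower, upper) boundaries in cents from whole-dollar inputs."""
    bounds = [d * 100 for d in range(start_dollars, stop_dollars + 1, width_dollars)]
    invalid = [b for b in bounds if not is_valid_boundary(b)]
    if invalid:
        raise ValueError(f"Boundaries outside the allowed span: {invalid}")
    return list(zip(bounds[:-1], bounds[1:]))


# Hypothetical example: $10-wide ranges from $20 to $50 per hour.
ranges = custom_ranges(20, 50, 10)
print(ranges)
```

Whether each boundary is treated as inclusive or exclusive is not stated here, so the sketch only produces the boundary values themselves.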

Not applicable categories

The "Not applicable" categories in each data item now have descriptive labels to describe which populations are not included for the data item. For example, in the data item “Time remaining on fixed-term contract in main job,” the “Not applicable” category has been labelled as “Not on a fixed-term contract (ongoing)” to indicate that only people on a fixed-term contract were asked about the time remaining on their contract.

Other changes

Some data item labels have been revised or shortened to improve interpretability (note the concepts remain the same).

Hours worked, hours preferred and duration data items have been aligned to consistent higher level groupings. 

For more details, including notes on how each data item differs from the 2020 COE TableBuilder release, refer to the Data Item List in the Data downloads section.

Rebenchmarking and seasonal factor adjustments

Since 2017, the COE data have been rebenchmarked every year to reflect the most recently available release of Estimated Resident Population (ERP) data and Labour Force Survey population benchmarks. The data for 2014 to 2021 have been revised to incorporate the population benchmarks that were used to produce estimates published in the August 2021 Labour Force, Australia.

To reduce the impact of seasonal effects on employment, the benchmarks have been adjusted by factors based on seasonally adjusted Labour Force Survey estimates (as published in August 2021). For example, August estimates have a typical seasonal pattern of lower employment. The factors applied increase the number of employed, to align with seasonally adjusted LFS estimates.

Trend series factors would usually be used in COE benchmarks but are currently not available during the COVID period. Seasonally adjusted factors will be used until trend series are reinstated in Labour Force statistics.

Topic-based publications

Since 2020, statistics from the Characteristics of Employment survey are published across three topic-based releases:

Data and file structure

Survey methodology

General information about the Characteristics of Employment (COE) survey, including summary results, is available in the following publications

Detailed information about the survey including scope and coverage, survey design, data collection methodology, weighting, estimation and benchmarking, estimate reliability and a glossary can be accessed from the Methodology page of the publication.

Data items

The data items included in the COE TableBuilder are grouped under broad headings and subheadings as shown in the image below. A complete data items list can be found in Data downloads.

Headings and subheadings

File structure

The underlying format of the COE TableBuilder file is structured at a single person level. This person level contains general demographic information such as age, sex and country of birth, as well as details about status of employment, weekly earnings, working arrangements, trade union membership and educational qualifications.

When tabulating data from TableBuilder, person weights are automatically applied to the underlying sample counts to provide the survey's population estimates.

Reference year

The COE TableBuilder contains a mandatory field called Reference year to allow for historical analysis. By default, this field will be present in any new table as per the image below:

Reference year will be present in any new tables

Individual years can be removed from the table using the data item panel by selecting the required year and removing it from the table as per the image below:

Select individual years in the field of Reference year

However, at least one category (reference year) of the mandatory field must be present in a table for TableBuilder to retrieve data.

Biennial content

The COE TableBuilder contains biennial content. Data items are labelled as available for either "All years," "Even years only" or "Odd years only" in the Data items list.

When a data item is placed in a table for a particular reference year where the data was not collected, TableBuilder will return estimates in a category that contains the label "Not collected (even years only)" or "Not collected (odd years only)." When data for a biennial item is requested across multiple years, TableBuilder will retrieve data for the applicable reference years and return "Not collected" for the years where the data was not collected.
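The availability rule described above can be sketched in a few lines. This is only an illustration of the even/odd logic: the function names are hypothetical, and the availability strings are taken from the labels quoted in this section.

```python
def is_collected(availability: str, reference_year: int) -> bool:
    """Apply the 'All years' / 'Even years only' / 'Odd years only' rule."""
    if availability == "All years":
        return True
    if availability == "Even years only":
        return reference_year % 2 == 0
    if availability == "Odd years only":
        return reference_year % 2 == 1
    raise ValueError(f"Unknown availability label: {availability}")


def category_label(availability: str, reference_year: int) -> str:
    """Return the kind of label a table cell would carry for a given year."""
    if is_collected(availability, reference_year):
        return "Collected"
    return f"Not collected ({availability.lower()})"


# Reference years covered by this release.
years = list(range(2014, 2022))
even_years = [y for y in years if is_collected("Even years only", y)]
odd_years = [y for y in years if is_collected("Odd years only", y)]
print(even_years, odd_years)
print(category_label("Even years only", 2021))
```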

Not applicable categories

Most data items included in the TableBuilder file include a 'Not applicable' category. This category generally represents the number of people who were not asked a particular question or the number of people excluded from the population for a data item when that data item was derived (e.g. Status of employment in second job is not applicable for people without a second job).

Since 2021, the "Not applicable" categories in each data item have had descriptive labels to describe which populations are not included for the data item. For example, in the data item “Time remaining on fixed-term contract in main job,” the “Not applicable” category has been labelled as “Not on a fixed-term contract (ongoing)” to indicate that only people on a fixed-term contract were asked about the time remaining on their contract.

Table populations

The population relevant to each data item should be kept in mind when extracting and analysing data. The actual population count for each data item is equal to the total cumulative frequency minus the 'Not applicable' category.

Generally, some populations can be 'filtered' using other relevant data items. For example, if the population of interest is 'Employees', any data item with that population (excluding the 'Not applicable' category) could be used.
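The population arithmetic described above is a simple subtraction. The sketch below assumes a table has been exported as a mapping from category label to estimate. The "Not applicable" label is the one quoted in this product, while the other category labels and all of the figures are invented purely for illustration.

```python
def population_count(category_estimates: dict[str, float], not_applicable_label: str) -> float:
    """Total across all categories minus the 'Not applicable' category."""
    total = sum(category_estimates.values())
    return total - category_estimates[not_applicable_label]


# Invented figures for "Time remaining on fixed-term contract in main job".
estimates = {
    "Less than 12 months": 120.0,              # hypothetical category
    "12 months or more": 80.0,                 # hypothetical category
    "Not on a fixed-term contract (ongoing)": 800.0,
}

on_fixed_term = population_count(estimates, "Not on a fixed-term contract (ongoing)")
print(on_fixed_term)
```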

Zero value cells

Tables generated from sample surveys will sometimes contain cells with zero values because no respondents that satisfied the parameters of a particular cell in a table were in the survey. This is despite there being people in the general population with those characteristics. This is an example of sampling variability, which occurs with all sample surveys. Relative standard errors cannot be generated for zero cells.

Availability of median earnings data in TableBuilder

For the Characteristics of Employment survey, median weekly earnings are a more robust measure of the centre for earnings data and have been given more prominence since 2017.

To minimise the risk of identifying individuals in aggregate statistics, a technique is used to randomly adjust cell values. This technique is called perturbation. Perturbation involves small random adjustments of the statistics and is considered the most satisfactory technique for avoiding the release of identifiable statistics while maximising the range of information that can be released.

The ABS has tested and implemented a perturbation process in respect of median earnings data to ensure both that the confidentiality of individuals is maintained and that the integrity of medians is preserved.

Using TableBuilder

For general information relating to the TableBuilder or instructions on how to use features of the TableBuilder product, please refer to TableBuilder and the TableBuilder, User Guide.

More specific information applicable to the Characteristics of Employment (COE) survey TableBuilder, which should enable users to understand, interpret and tabulate the data, is outlined below.

Confidentiality features in TableBuilder

In accordance with the Census and Statistics Act 1905, all the data in TableBuilder are subjected to a confidentiality process before release. This confidentiality process is undertaken to avoid releasing information that may allow the identification of individuals, families, households, dwellings or businesses.

Processes used in TableBuilder to confidentialise records include the following:

  • perturbation of data; and
  • table suppression

Perturbation effects

To minimise the risk of identifying individuals in aggregate statistics, a technique is used to randomly adjust cell values. This technique is called perturbation. Perturbation involves small random adjustments of the statistics and is considered the most satisfactory technique for avoiding the release of identifiable statistics while maximising the range of information that can be released. These adjustments have a negligible impact on the underlying pattern of the statistics.

The introduction of these random adjustments results in tables not adding up. Randomly adjusted individual cells will be consistent across tables, but the totals in any table may not equal the sum of the individual cell values. The size of the difference between summed cells and the relevant total will generally be very small.

Please be aware that the effects of perturbing the data may result in components being larger than their totals. This also affects proportions calculated from perturbed values, which may exceed 100%.
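The toy example below may help show why perturbed cells can be consistent across tables while totals still fail to add up. It is not the ABS perturbation method, whose details are not described here. It simply assumes that each cell receives a small adjustment determined by which records it contains, and that the total is adjusted independently as a cell in its own right. All names and record identifiers are hypothetical.

```python
import hashlib


def toy_perturb(record_ids: frozenset[str], true_count: int) -> int:
    """Adjust a count by -2..+2, determined only by which records are in the cell."""
    key = ",".join(sorted(record_ids)).encode("utf-8")
    adjustment = hashlib.sha256(key).digest()[0] % 5 - 2
    return max(0, true_count + adjustment)


# Hypothetical cells: two disjoint groups of records and their union (the total).
cell_a = frozenset(f"r{i}" for i in range(0, 40))
cell_b = frozenset(f"r{i}" for i in range(40, 100))
total = cell_a | cell_b

perturbed_a = toy_perturb(cell_a, len(cell_a))
perturbed_b = toy_perturb(cell_b, len(cell_b))
perturbed_total = toy_perturb(total, len(total))

# The same cell always receives the same adjustment, whichever table it appears in.
assert toy_perturb(cell_a, len(cell_a)) == perturbed_a

# The total is adjusted independently, so it can differ from the sum of the cells.
difference = perturbed_total - (perturbed_a + perturbed_b)
print(perturbed_a, perturbed_b, perturbed_total, difference)
```

In this toy scheme the gap between a total and the sum of its two components can never exceed six, which mirrors the point that such differences are generally very small.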

Table suppression

Some tables generated within TableBuilder may contain a substantial proportion of very low counts within cells (excluding cells that have counts of zero). When this occurs, all values within the table are suppressed to preserve confidentiality. The following error message is displayed (in red) at the bottom of the table when table suppression has occurred.

ERROR: The table has been suppressed as it is too sparse
ERROR: table cell values have been suppressed

Counting units and weights

Weighting is the process of adjusting results from a sample survey to infer results for the total population. To do this, a 'weight' is allocated to each record. The weight is the value that indicates how many population units are represented by each sample unit.

To produce estimates for the in-scope population, you must use a weight field in your tables. In TableBuilder, weight fields can be found under the Summation Options category in the left-hand pane, under the applicable level. If you do not select a weight field, TableBuilder will apply 'Person weight' by default, which gives estimates of the number of persons.

If you are estimating the number of persons with certain characteristics (e.g. 'Number of jobs held last week') the weight listed under the category heading 'Person level weighting' must be used.
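To make the role of the weight concrete, the sketch below sums person weights over the sample records that have a given characteristic. The field names and all values are invented for illustration. In TableBuilder this step is applied automatically.

```python
from typing import Callable

# Invented sample records: each weight says how many people the record represents.
records = [
    {"jobs_last_week": 1, "person_weight": 250.0},
    {"jobs_last_week": 2, "person_weight": 300.0},
    {"jobs_last_week": 1, "person_weight": 150.5},
    {"jobs_last_week": 2, "person_weight": 99.5},
]


def weighted_estimate(records: list[dict], has_characteristic: Callable[[dict], bool],
                      weight_field: str = "person_weight") -> float:
    """Estimate the number of persons by summing the weights of matching records."""
    return sum(r[weight_field] for r in records if has_characteristic(r))


sample_count = sum(1 for r in records if r["jobs_last_week"] == 2)
population_estimate = weighted_estimate(records, lambda r: r["jobs_last_week"] == 2)
print(sample_count, population_estimate)
```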

When creating a table, a default Summation Item will need to be the Reference year as this item will provide data for the relevant year. This item will then be used for time-series purposes as future data becomes available.

Selecting data items for cross-tabulation

The Person level contains a range of data items detailing the characteristics of respondents including demographic, education, labour force, earnings, working arrangements, trade union membership and population variables.

Populations and data items

When adding a data item to a table, it should be noted that not all respondents to the survey may be associated with that data item. For example, the data item “Duration of current trade union membership” is only applicable to "Trade union members." When using this item in a table, it would be appropriate to also use the population "P12 - Trade union members," to restrict the output of this table to this population only.

Similarly, if multiple data items are included in a table, they should all apply to the same population group.

For more information about data items, refer to the COE TableBuilder Data Items List available from Data downloads.

Cross-tabulating data items on the same level

Cross-tabulating data from the Person Level with other data items from the same level will produce data about people. For example, cross-tabulating the geographic variable 'State or territory of usual residence' by the 'Hours usually worked in main job' produces a table showing the number of people in each state or territory by the hours that they usually work each week in their main job.
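For readers working with unit records rather than the TableBuilder interface, a person-level cross-tabulation can be sketched as a weighted count over pairs of categories. The field names, hours categories and weights below are hypothetical and are not taken from the data item list.

```python
from collections import defaultdict

# Invented person-level records.
records = [
    {"state": "NSW", "hours": "35-44", "person_weight": 200.0},
    {"state": "NSW", "hours": "1-34", "person_weight": 100.0},
    {"state": "Vic.", "hours": "35-44", "person_weight": 150.0},
    {"state": "NSW", "hours": "35-44", "person_weight": 50.0},
]


def cross_tabulate(records: list[dict], row_item: str, column_item: str,
                   weight_field: str = "person_weight") -> dict[tuple[str, str], float]:
    """Sum person weights for every (row category, column category) pair."""
    table: dict[tuple[str, str], float] = defaultdict(float)
    for record in records:
        table[(record[row_item], record[column_item])] += record[weight_field]
    return dict(table)


table = cross_tabulate(records, "state", "hours")
for (state, hours), estimate in sorted(table.items()):
    print(state, hours, estimate)
```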

Multi-response data items

A number of the survey's data items allow respondents to report more than one response. These are referred to as 'multi-response data items'. An example of such a data item is pictured below. For this data item, respondents can report all the days of the week they usually work.

Multiple-response data item

When a multi-response data item is tabulated, a person is counted against each response they have provided (e.g. a person who responds 'Monday' and 'Thursday' and 'Saturday' will be counted once in each of these three categories).

As a result, each person in the appropriate population is counted at least once, and some people are counted multiple times. Therefore, the total for a multi-response data item will be less than or equal to the sum of its components.
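The counting rule for multi-response items can be checked with a small unweighted example. The respondents and their answers below are invented. The point is only that the number of people is never more than the sum of the category counts.

```python
from collections import Counter

# Invented responses to "days of the week usually worked".
responses = {
    "person_1": ["Monday", "Thursday", "Saturday"],
    "person_2": ["Monday"],
    "person_3": ["Tuesday", "Thursday"],
}

# Each person is counted once in every category they reported.
category_counts = Counter(day for days in responses.values() for day in days)

total_people = len(responses)
sum_of_components = sum(category_counts.values())

print(dict(category_counts))
print(total_people, sum_of_components)
```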

For more information on definitions and concepts that apply to the data items in this file, please refer to Characteristics of Employment and Labour Force.

Using DataLab

DataLab allows real-time access to detailed microdata files through a portal to a secure ABS environment. Using detailed microdata in DataLab allows users to run advanced statistical analyses using recent analytical software.

For information about the data items available on the detailed microdata files, see the Data Item Lists in Data downloads.

About DataLab

Detailed microdata files in DataLab can be accessed on-site at ABS offices or in a secure virtual environment from your own computer. All unit record data remains in DataLab, and any analysis results or tables are checked by the ABS before being provided to users.

Refer to DataLab for more information, including prerequisites for DataLab access.

COE microdata in DataLab, 2014-2021

Characteristics of Employment (COE) microdata is now available in ABS DataLab as a supplementary file for the Longitudinal Labour Force (LLFS) microdata. All existing users of the LLFS microdata will automatically get access to the COE file (use of the file may require an updated project proposal) and new users can apply for access to both files. 

This release of COE microdata features data collected annually for the months August 2014 to August 2021. A future update to include the results from the August 2022 COE survey is scheduled for release on 14 December 2022.

Record identifiers

The record identifiers used in the COE and LLFS microdata are consistent across both files, to facilitate data linkage between the two files and enable further analysis. The COE survey is collected from private dwellings in seven-eighths of the Labour Force Survey (LFS) sample, so not all August records on the LLFS file will have a corresponding COE record.

More details on these records and the formatting of record identifiers can be found in the Data Item List in Data downloads.
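A linkage between the two files might look something like the sketch below, which keeps track of August LLFS records that have no COE counterpart. The identifier name `record_id` and the other field names are placeholders. The actual identifier names and formats are documented in the Data Item List.

```python
def link_records(llfs_records: list[dict], coe_records: list[dict],
                 key: str = "record_id") -> tuple[list[dict], list[dict]]:
    """Join LLFS records to COE records on a shared identifier.

    Returns (linked, unmatched): merged records, and LLFS records with no COE match.
    """
    coe_by_id = {record[key]: record for record in coe_records}
    linked, unmatched = [], []
    for llfs in llfs_records:
        coe = coe_by_id.get(llfs[key])
        if coe is None:
            unmatched.append(llfs)
        else:
            linked.append({**llfs, **coe})
    return linked, unmatched


# Invented August records: one LLFS record falls outside the COE sample.
llfs_august = [
    {"record_id": "A001", "labour_force_status": "Employed"},
    {"record_id": "A002", "labour_force_status": "Employed"},
    {"record_id": "A003", "labour_force_status": "Unemployed"},
]
coe = [
    {"record_id": "A001", "trade_union_member": "Yes"},
    {"record_id": "A003", "trade_union_member": "No"},
]

linked, unmatched = link_records(llfs_august, coe)
print(len(linked), [r["record_id"] for r in unmatched])
```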

Weights

Person level weights (and replicate weights for calculating standard errors) are provided on the COE file. These differ from the weights provided on the LLFS file, as the weights are recalibrated for COE due to the reduced sample size compared to the LFS. Aggregate estimates from both sets of weights will align closely, as the COE survey data is benchmarked to match seasonally adjusted estimates from the LFS, but care should be taken when performing micro analysis.

COE weights are recommended for cross-sectional analysis of COE data items, but when linking COE and LLFS data for longitudinal analysis, new weights should be calculated based on the population benchmarks provided on the LLFS file. Care should be taken to account for attrition bias by adjusting the weights appropriately (increasing the weights for those more likely to leave the LFS). More information on using benchmarks and weights for longitudinal analysis is provided in Longitudinal Labour Force.

Earnings and other parameters

Data related to earnings and other parameters (including duration, hours, ages, etc.) are presented using two data items:

  • flag item - indicates which records have earnings or other parametric data and which records do not. A flag of '1' indicates that the record has parametric data, and flags of '0' or negative values indicate that the record does not have parametric data and are categorised by the reason why they were excluded. Flag data items use identifiers that end in 'A'. 
  • values item - provides the value for earnings or other parametric information. Values data items have identifiers that end in 'B'.

Records that have a values data item equal to '0' have an ambiguous meaning when used on their own - it could indicate a parametric value of zero, or it could also indicate that there is no parametric data. Zero values should be interpreted in conjunction with the flag data item to determine the meaning and whether they should be included in analysis or not.

When analysing parametric items, such as calculating median or average weekly earnings, the data should first be filtered for records that have a flag value of '1' before performing the analysis.

For more information on earnings and other parametric items, refer to the Data Item List in Data downloads.
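The effect of filtering on the flag item before analysing the values item can be seen in a small example. The identifiers `EARNA` and `EARNB` are hypothetical stand-ins for a flag item and its values item, and the figures are invented. Weights are left out to keep the sketch short, although they would be needed for population estimates.

```python
from statistics import median

# Invented records: "EARNA" is the flag item, "EARNB" the values item.
records = [
    {"EARNA": 1, "EARNB": 1200},   # has earnings data
    {"EARNA": 1, "EARNB": 0},      # has data, and the value really is zero
    {"EARNA": 0, "EARNB": 0},      # no parametric data
    {"EARNA": -1, "EARNB": 0},     # no parametric data (excluded for another reason)
    {"EARNA": 1, "EARNB": 800},    # has earnings data
]


def values_with_data(records: list[dict], flag_item: str, values_item: str) -> list[int]:
    """Keep only values from records whose flag equals 1."""
    return [r[values_item] for r in records if r[flag_item] == 1]


filtered = values_with_data(records, "EARNA", "EARNB")
unfiltered = [r["EARNB"] for r in records]

print(median(filtered))    # 800: based only on records with parametric data
print(median(unfiltered))  # 0: distorted by zeros that mean "no data"
```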

Historical microdata in DataLab

Prior to 2014, microdata relating to employment characteristics was released in a number of Confidentialised Unit Record Files (CURFs). These files are available in ABS DataLab.

For more information about these microdata releases, refer to the following archived publications:

Data downloads

Data files

Previous releases

Release | TableBuilder data series | MicrodataDownload | DataLab
Labour Force Survey and Forms of Employment Survey, 2008 | - | Basic microdata | -
Labour Force Survey and Employee Earnings Benefits and Trade Union Membership, 2010 | - | Basic microdata | Detailed microdata
Labour Force Survey and Employee Earnings Benefits and Trade Union Membership, 2008 | - | Basic microdata | Detailed microdata
Labour Force Survey and Employee Earnings Benefits and Trade Union Membership, 2006 | - | Basic microdata | Detailed microdata
Labour Force Survey and Employee Earnings Benefits and Trade Union Membership, 2004 | - | Basic microdata | Detailed microdata
Forms of Employment, 1998 | - | Basic microdata | -

History of changes

04/08/2022

This issue provides details on the first release of COE microdata in DataLab as a supplementary file for the Longitudinal Labour Force (LLFS) microdata. This file features data collected annually for the months August 2014 to August 2021. A future update to include the results from the August 2022 COE survey is scheduled for release later in the year on 14 December 2022.

20/04/2022

Minor updates to labels related to fixed-term contracts and leave entitlements. More details are provided in the updated Data Item List available in Data downloads.

14/12/2021

This update coincides with the release of data from the 2021 Characteristics of Employment Survey into TableBuilder. 

Many improvements have been made to the 2021 COE TableBuilder release to simplify the way data items are presented, increase the range of data items available and generally enhance the usability of the product. See "Changes this release" for more information.

Previous catalogue number

This release previously used catalogue number 6333.0.00.001.