4234.0.30.001 - Microdata: Work Related Training and Adult Learning, April 2013 Quality Declaration 
ARCHIVED ISSUE Released at 11:30 AM (CANBERRA TIME) 28/03/2014  First Issue

FILE STRUCTURE AND CONTENT


FILE STRUCTURE

The 2013 Work Related Training and Adult Learning (WRTAL) TableBuilder file is structured as a single level, the person level. This level contains general demographic information about each survey respondent, such as their age, sex, country of birth and labour force status, as well as details about recent education and training activities.

When tabulating data, person weights are automatically applied to the underlying sample counts to provide the survey's population estimates.
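
The weighting step can be pictured with a minimal Python sketch. It is an illustration only and not how TableBuilder is implemented; the records, weights and the function name `weighted_estimate` are hypothetical.

```python
# Hypothetical person-level records: (sex, person_weight).
# A person weight is the number of people in the population
# that one survey respondent is taken to represent.
records = [
    ("Male", 1200.0),
    ("Female", 950.0),
    ("Male", 1100.0),
    ("Female", 1050.0),
]


def weighted_estimate(rows, sex):
    """Sum the person weights of all respondents in one category."""
    return sum(weight for s, weight in rows if s == sex)


# Unweighted sample count: how many respondents fall in the cell.
sample_count_males = sum(1 for s, _ in records if s == "Male")

# Weighted population estimate: what a tabulated cell would show.
estimate_males = weighted_estimate(records, "Male")

print(sample_count_males, estimate_males)
```

In this sketch two sampled males yield a population estimate equal to the sum of their weights rather than a count of 2.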

The data items included in the 2013 WRTAL TableBuilder are grouped under the following broad headings. A complete data item list can be accessed from the Downloads page.

File structure for WRTAL TableBuilder

FILE CONTENT

Multi-Response Data Items


A number of questions included in the survey allowed respondents to provide more than one response. The data items resulting from these questions are referred to as 'multi-response data items'. For example, a person can report several reasons why they undertook their most recent work-related training course - as shown below.

Example of multiple response item
When a multi-response data item is tabulated, the same record (or person in this case) is counted against each response they have provided. As a result, each person in the applicable population is counted at least once and some persons are counted multiple times. The sum of the individual multi-response categories will therefore be greater than the number of people applicable to the data item. The total, however, remains the number of individuals estimated for that population of interest. In our example, the sum of the components in the table below is 5,366,500, whereas the total applicable population is 4,612,500 persons.
Table displaying multiple response item
All multi-response data items can be identified in the data item list by the following label <Multiple response data item>.
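
The counting rule for multi-response items can be sketched as follows. This is an illustration under stated assumptions: the respondents, weights and reason labels are hypothetical and are not taken from the WRTAL data item list.

```python
from collections import Counter

# Hypothetical respondents, each with a person weight and the set of
# reasons they reported (the labels are invented for illustration).
people = [
    {"weight": 1000.0, "reasons": {"Reason A", "Reason B"}},
    {"weight": 1500.0, "reasons": {"Reason B"}},
    {"weight": 800.0, "reasons": {"Reason A", "Reason B", "Reason C"}},
]

# Each person is counted once against every response they gave.
category_estimates = Counter()
for person in people:
    for reason in person["reasons"]:
        category_estimates[reason] += person["weight"]

# Sum of the components: people with several responses are counted repeatedly.
sum_of_categories = sum(category_estimates.values())

# Total applicable population: every person is counted exactly once.
total_population = sum(person["weight"] for person in people)

print(sum_of_categories, total_population)
```

The gap between the two figures equals the weighted number of second and later responses. In the published example that gap is 5,366,500 - 4,612,500 = 754,000.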

Special Codes

For some data items, certain classification values have been reserved as special codes which should not be treated as quantitative values. In particular, these codes should be excluded when calculating means, medians and modes. These special codes generally relate to 'invalid' responses such as 'Not stated' (e.g. code 99998) and 'Not applicable' (e.g. code 99999).

The data item list in the Downloads tab provides all the categories, including special codes, that are applicable to each data item.
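
The effect of leaving special codes in a calculation can be seen in a short Python sketch. It assumes the example codes quoted above (99998 and 99999); the actual codes differ by data item and should be taken from the data item list. The cost values are hypothetical and, for simplicity, unweighted.

```python
from statistics import mean

# Example special codes quoted in the text; check the data item list
# for the codes that apply to a particular data item.
NOT_STATED = 99998
NOT_APPLICABLE = 99999
SPECIAL_CODES = {NOT_STATED, NOT_APPLICABLE}

# Hypothetical reported costs in dollars, with special codes mixed in.
costs = [0, 120, 250, NOT_STATED, NOT_APPLICABLE, NOT_APPLICABLE, 430]

# Keep only the 'valid responses' before summarising.
valid_costs = [c for c in costs if c not in SPECIAL_CODES]

naive_mean = mean(costs)        # wrongly treats the codes as dollar amounts
valid_mean = mean(valid_costs)  # mean of the valid responses only

print(naive_mean, valid_mean)
```

Treating the codes as quantities inflates the mean by several orders of magnitude here, which is why they need to be removed first.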

Continuous Data Items

For the WRTAL survey, there are a number of continuous data items that are available for selection from Summation Options in the Customise Table pane. Continuous data items are generally those data items that can be measured, written as a value in a specified unit and can be placed in ascending or descending order. For this survey, some examples of continuous data include age (in single years), cost (single dollars) and time (single hours). These continuous items can be used to create sums, medians, means and customised ranges.

Most continuous data items have special codes for particular responses (see the Special Codes section above). However, these special codes are excluded from each of the continuous data items available from Summation Options, so that medians, means, etc. are calculated from 'valid responses' only. The data relating to any special code categories are instead available from a corresponding data item that can be selected from the list of categorical data items under the relevant grouping.

For example, as shown below, the responses for the data item 'Personal cost for most recent work-related training' may be a dollar value in the range $0 – $99,990, 'Not stated', or 'Not applicable' for those persons who did not undertake any work-related training.
Example from data item list of continuous item
The 'valid responses' used for calculating means, medians, etc. are the single dollar values only. The special codes within the categorical data item relate to the 'Not stated' and 'Not applicable' categories, and provide an estimate of the number of people in these populations. The corresponding 'Personal cost for most recent work-related training' data item, included in the categorical data item list, would be as follows:
Example of categorical data item
It is highly recommended that when interpreting a table of sums, medians or means of a continuous data item, the corresponding categorical data item is also used in a separate tabulation for a complete understanding of all 'valid' and 'invalid' responses.

The following two tables display the types of results obtained when using a continuous data item from Summation Options and the corresponding data item from the categorical level. In this example, the mean or average Personal cost for the most recent work-related training course by Sex has been tabulated from the continuous data item in Summation Options. This shows that the average cost was estimated at around $185 for men and $151 for women.

Table showing continuous item
When the corresponding data item from the categorical data item list is cross-tabulated by sex, the results show the estimates of the total population relevant to each category. For example, the table below shows that for the survey population in scope, there was an estimated 8.5 million males aged 15–74 years in Australia. Of these, 2.3 million undertook work-related training and reported their personal cost for their most recent course (i.e. a 'valid response'). A further 15,000 males undertook work-related training but did not state the amount of the cost they incurred, while 6.2 million males did not undertake any work-related training (i.e. Not applicable).

Table showing categorical item
Taken together, these tables show the size of the population to which the above estimate of the average Personal cost for the most recent work-related training course applies: an estimated 2.3 million men spent, on average, $185 on their most recent work-related training course in the last 12 months.
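
The way the categorical and continuous results fit together can be checked with simple arithmetic. The sketch below uses only the rounded male estimates quoted in the text, so every derived figure is approximate. The implied total spend is an illustrative order of magnitude, not a published estimate.

```python
# Rounded estimates for males quoted in the text.
valid_response = 2_300_000   # undertook training and reported a cost
not_stated = 15_000          # undertook training, cost not stated
not_applicable = 6_200_000   # did not undertake work-related training

# The three categories together make up the in-scope male population
# (about 8.5 million once rounding is allowed for).
total_males = valid_response + not_stated + not_applicable

# Males who undertook work-related training, with or without a stated cost.
undertook_training = valid_response + not_stated

# The mean applies to valid responses only, so an implied total is
# mean x valid responses (approximate, because both inputs are rounded).
mean_cost_dollars = 185
implied_total_spend = valid_response * mean_cost_dollars

print(total_males, undertook_training, implied_total_spend)
```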

In summary, it is important to use the data item list and be aware of the special codes that may be applicable when interpreting a sum, median or mean of a continuous variable.

All continuous data items can be identified in the data item list by the following label <Continuous data item>. The corresponding categorical data item is denoted with <Categorical data item>.

Not Applicable Categories

Most data items included in the TableBuilder file include a 'Not applicable' category. The classification values of these 'Not applicable' categories, where relevant, are shown in the data item list in the Downloads tab. The 'Not applicable' category generally represents the number of people who were not asked a particular question or the number of people excluded from the population for a data item when that data was derived (e.g. Year of Arrival in Australia is not applicable for people born in Australia).

Table Populations

The population relevant to each data item is identified in the data item list and should be kept in mind when extracting and analysing data. The actual population count for each data item is equal to the total cumulative frequency minus the 'Not applicable' category.
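
As a minimal sketch of that subtraction, assume a hypothetical tabulation of a data item with a 'Not applicable' category. The category labels and estimates below are invented for illustration.

```python
# Hypothetical weighted estimates for one data item.
table = {
    "Category 1": 3000.0,
    "Category 2": 2000.0,
    "Not applicable": 12000.0,
}

# Total across all categories, including 'Not applicable'.
total = sum(table.values())

# Population count for the data item: total minus 'Not applicable'.
population = total - table["Not applicable"]

print(total, population)
```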

Generally, all populations, including very specific populations, can be 'filtered' using other relevant data items. For example, if the population of interest is 'Employed persons', any data item with that population (excluding the 'Not applicable' category) could be used. While any applicable data item can be used for this filtering process, the WRTAL TableBuilder file also includes some data items that have been specifically derived for this purpose. For example, the population data item 'Persons aged 15–24 years' can be used to filter this population rather than the actual age group data item. The specifically derived population data items are listed in the data item list in the 'Population data items' worksheet.
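
The equivalence between filtering on a derived population data item and filtering on the underlying data item can be sketched in Python. The records, the field name `aged_15_24` and the helper `estimate` are hypothetical; in TableBuilder the filtering is done through the table interface rather than in code.

```python
# Hypothetical records: age in single years, a derived population flag
# for 'Persons aged 15-24 years', and a person weight.
records = [
    {"age": 19, "aged_15_24": True, "weight": 900.0},
    {"age": 37, "aged_15_24": False, "weight": 1100.0},
    {"age": 24, "aged_15_24": True, "weight": 1000.0},
    {"age": 62, "aged_15_24": False, "weight": 1300.0},
]


def estimate(rows):
    """Weighted population estimate for a set of records."""
    return sum(r["weight"] for r in rows)


# Filter with the derived population data item.
by_flag = [r for r in records if r["aged_15_24"]]

# Filter with the underlying age data item instead.
by_age = [r for r in records if 15 <= r["age"] <= 24]

print(estimate(by_flag), estimate(by_age))
```

Both routes select the same records, so the choice between them is one of convenience.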

Zero Value Cells

Tables generated from sample surveys will sometimes contain cells with zero values because no respondents that satisfied the parameters of a particular cell in a table were in the survey. This is despite there being people in the general population with those characteristics. That is, the cell may have had a value above zero if all persons in scope of the survey had been enumerated. This is an example of sampling variability which occurs with all sample surveys. Relative Standard Errors cannot be generated for zero cells.
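
The reason a Relative Standard Error cannot be produced for a zero cell follows from its usual definition: the standard error expressed as a percentage of the estimate. The sketch below assumes that standard definition; the function name and input values are hypothetical, and TableBuilder's own calculation is not shown in this document.

```python
def relative_standard_error(estimate, standard_error):
    """Return the RSE as a percentage, or None when the estimate is zero.

    RSE = 100 * standard error / estimate, which is undefined when the
    estimate is zero because it would require division by zero.
    """
    if estimate == 0:
        return None
    return 100.0 * standard_error / estimate


rse_nonzero = relative_standard_error(2000.0, 300.0)
rse_zero_cell = relative_standard_error(0.0, 0.0)

print(rse_nonzero, rse_zero_cell)
```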