FILE STRUCTURE AND CONTENT
Special Codes
For some data items, certain classification values have been reserved as special codes, which should not be added as if they were quantitative values. In particular, these codes should be excluded when calculating means, medians and modes. Special codes generally relate to 'invalid' responses such as 'Not stated' (e.g. code 99998) and 'Not applicable' (e.g. code 99999).
The data item list in the Downloads tab provides all the categories, including special codes, that are applicable to each data item.
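For users working with exported values outside TableBuilder, the exclusion rule can be illustrated with a minimal Python sketch. The codes 99998 and 99999 are the examples given above and may differ between data items; the response values are hypothetical and are not survey data.

```python
from statistics import mean, median

# Example special codes from the text; check the data item list for each item.
NOT_STATED = 99998
NOT_APPLICABLE = 99999
SPECIAL_CODES = {NOT_STATED, NOT_APPLICABLE}


def valid_values(values):
    """Return only the quantitative responses, dropping special codes."""
    return [v for v in values if v not in SPECIAL_CODES]


# Hypothetical responses for illustration only.
responses = [120, 0, 99999, 250, 99998, 80, 99999]

valid = valid_values(responses)
naive_mean = mean(responses)    # wrong: treats the codes as dollar amounts
correct_mean = mean(valid)      # uses valid responses only
correct_median = median(valid)

print(f"Naive mean (incorrect): {naive_mean:.2f}")
print(f"Mean of valid responses: {correct_mean:.2f}")
print(f"Median of valid responses: {correct_median:.2f}")
```

Leaving the codes in inflates the mean by several orders of magnitude, which is why they must be removed before any summary statistic is calculated.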
Continuous Data Items
For the WRTAL survey, a number of continuous data items are available for selection from Summation Options in the Customise Table pane. Continuous data items are generally those that can be measured, written as a value in a specified unit, and placed in ascending or descending order. For this survey, examples of continuous data include age (in single years), cost (in single dollars) and time (in single hours). These continuous items can be used to create sums, medians, means and customised ranges.
Most continuous data items have special codes for particular responses (see the Special Codes section above). However, these special codes are EXCLUDED from each of the applicable continuous data items available from Summation Options, so that medians, means, etc. are calculated only from 'valid responses'. Instead, the data relating to any special code categories are available from a corresponding data item that can be selected from the list of categorical data items under the relevant grouping.
For example, as shown below, the responses for the data item 'Personal cost for most recent work-related training' may be a dollar value in the range $0 – $99,990, 'Not stated', or 'Not applicable' for those persons who did not undertake any work-related training.
The 'valid responses' used for calculating means, medians, etc. are the single dollar values only while the special codes within the categorical data item relate to the 'Not stated' and 'Not applicable' categories and are an estimate of the number of people in these populations. The corresponding 'Personal cost for most recent work-related training' data item, included in the categorical data item list, would be as follows:
It is highly recommended that when interpreting a table of sums, medians or means of a continuous data item, the corresponding categorical data item is also used in a separate tabulation for a complete understanding of all 'valid' and 'invalid' responses.
The following two tables display the types of results obtained when using a continuous data item from Summation Options and the corresponding data item from the categorical level. In this example, the mean or average Personal cost for the most recent work-related training course by Sex has been tabulated from the continuous data item in Summation Options. This shows that the average cost was estimated at around $185 for men and $151 for women.
When the corresponding data item from the categorical data item list is cross-tabulated by sex, the results show the estimates of the total population relevant to each category. For example, the table below shows that for the survey population in scope, there were an estimated 8.5 million males aged 15–74 years in Australia. Of these, 2.3 million undertook work-related training and reported their personal cost for their most recent course (i.e. a 'valid response'). A further 15,000 males undertook work-related training but did not state the cost they incurred, while 6.2 million males did not undertake any work-related training (i.e. 'Not applicable').
Consequently, taken together, these tables show the size of the population to which the above estimate of the average Personal cost for the most recent work-related training course applies: an estimated 2.3 million men spent, on average, $185 on their most recent work-related training course in the last 12 months.
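The arithmetic linking the two tables can be checked with a short Python sketch. It uses only the rounded male estimates quoted above, so the results are approximate. The implied total cost is a derived illustration (mean multiplied by the number of valid responses), not a published estimate.

```python
# Rounded estimates for males quoted in the text.
valid_response = 2_300_000   # undertook training and reported a cost
not_stated = 15_000          # undertook training, cost not stated
not_applicable = 6_200_000   # did not undertake work-related training
mean_cost = 185              # average personal cost in dollars

# The three categories together make up the in-scope male population.
total_males = valid_response + not_stated + not_applicable

# The mean applies only to the 'valid response' group.
share_valid = valid_response / total_males
implied_total_cost = valid_response * mean_cost

print(f"Total males in scope: {total_males:,}")
print(f"Share with a valid response: {share_valid:.1%}")
print(f"Implied total personal cost: ${implied_total_cost:,}")
```

The total of about 8.5 million matches the in-scope male population, while the $185 average describes only the 2.3 million men with a valid response.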
In summary, it is important to use the data item list and be aware of the special codes that may be applicable when interpreting a sum, median or mean of a continuous variable.
All continuous data items can be identified in the data item list by the following label <Continuous data item>. The corresponding categorical data item is denoted with <Categorical data item>.
Not Applicable Categories
Most data items included in the TableBuilder file include a 'Not applicable' category. The classification values of these 'Not applicable' categories, where relevant, are shown in the data item list in the Downloads tab. The 'Not applicable' category generally represents the number of people who were not asked a particular question or the number of people excluded from the population for a data item when that data was derived (e.g. Year of Arrival in Australia is not applicable for people born in Australia).
The population relevant to each data item is identified in the data item list and should be kept in mind when extracting and analysing data. The actual population count for each data item is equal to the total cumulative frequency minus the 'Not applicable' category.
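As a minimal sketch of this rule, the following Python function subtracts the 'Not applicable' frequency from the total. The function name and the 'Valid response' label are hypothetical; the figures are the rounded male estimates for the personal cost data item quoted earlier.

```python
def population_count(frequencies, not_applicable_label="Not applicable"):
    """Population for a data item: total frequency minus 'Not applicable'."""
    total = sum(frequencies.values())
    return total - frequencies.get(not_applicable_label, 0)


# Rounded male estimates for the personal cost data item.
personal_cost_males = {
    "Valid response": 2_300_000,  # hypothetical label for the dollar values
    "Not stated": 15_000,
    "Not applicable": 6_200_000,
}

applicable_males = population_count(personal_cost_males)
print(f"Males in the population for this data item: {applicable_males:,}")
```

Note that 'Not stated' responses remain in the population: those people were asked the question, even though they did not give a usable answer.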
Generally, all populations, including very specific populations, can be 'filtered' using other relevant data items. For example, if the population of interest is 'Employed persons', any data item with that population (excluding the 'Not applicable' category) could be used. While any applicable data item can be used for this filtering process, the WRTAL TableBuilder file also includes some data items that have been specifically derived for this purpose. For example, the population data item 'Persons aged 15–24 years' can be used to filter this population rather than the actual age group data item. The specifically derived population data items are listed in the data item list in the 'Population data items' worksheet.
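TableBuilder applies this filtering through its own interface, but the underlying logic can be sketched in Python. The records, field names and values below are hypothetical and purely illustrative; they do not reflect the structure of the WRTAL file.

```python
# Hypothetical unit records; field names and values are illustrative only.
records = [
    {"age": 19, "pop_15_24": "Persons aged 15–24 years", "training": "Yes"},
    {"age": 42, "pop_15_24": "Not applicable", "training": "Yes"},
    {"age": 23, "pop_15_24": "Persons aged 15–24 years", "training": "No"},
    {"age": 67, "pop_15_24": "Not applicable", "training": "No"},
]


def filter_population(rows, population_item):
    """Keep rows that belong to the population, i.e. are not 'Not applicable'."""
    return [r for r in rows if r[population_item] != "Not applicable"]


young = filter_population(records, "pop_15_24")
ages = [r["age"] for r in young]
print(ages)
```

Any data item whose population matches the group of interest could be passed as `population_item`; a purpose-built population item simply makes the intent clearer.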
Zero Value Cells
Tables generated from sample surveys will sometimes contain cells with zero values because the sample contained no respondents who satisfied the parameters of a particular cell. This is despite there being people in the general population with those characteristics. That is, the cell may have had a value above zero if all persons in scope of the survey had been enumerated. This is an example of sampling variability, which occurs with all sample surveys. Relative Standard Errors cannot be generated for zero cells.
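To see why zero cells arise even when the characteristic exists in the population, consider a simplified Python sketch. It assumes each respondent is selected independently with the same chance of having the characteristic, which is a simplification and not the actual WRTAL sample design. The prevalence and sample size are hypothetical.

```python
def prob_zero_cell(prevalence, sample_size):
    """Chance that no sampled person has a characteristic.

    Assumes independent selections, each with probability `prevalence`
    of having the characteristic (a simplification of real survey designs).
    """
    return (1.0 - prevalence) ** sample_size


# Hypothetical values: 1 in 10,000 people have the characteristic,
# and 10,000 people are sampled.
p_zero = prob_zero_cell(prevalence=0.0001, sample_size=10_000)
print(f"Probability of a zero cell: {p_zero:.2f}")
```

Under these assumptions, a rare characteristic has a substantial chance of being missed entirely, so a zero cell should not be read as evidence that no such people exist in the population.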