USING THE EXPANDED CURF
About the CURF
The 2014–15 National Aboriginal and Torres Strait Islander Social Survey (NATSISS) Expanded CURF contains three separate record level files which are described on the File Structure page on the Summary tab. Subject to the limitation of sample size, the data classifications used and the conditions of use, it is possible to interrogate the data, produce tabulations and undertake statistical analyses to individual specifications.
The data included in the CURF are released under the provisions of the Census and Statistics Act 1905. This legislation allows the Australian Statistician to release unit record data, or microdata, provided this is done "in a manner that is not likely to enable the identification of a particular person or organisation to which it relates." Accordingly, there are no names or addresses of survey respondents on the CURF and other steps, including the following list of actions, have been taken to protect the confidentiality of respondents:
The nature of the changes made ensures that their effect on data for analysis purposes is negligible. These changes, and the fact that estimates previously published in National Aboriginal and Torres Strait Islander Social Survey, Australia, 2014–15 (cat. no. 4714.0) have had perturbation applied to all data items, mean that estimates produced from the CURF may differ from those published in the summary publication or produced in TableBuilder.
Steps to confidentialise the datasets made available on the CURF are undertaken in such a way as to ensure the integrity of the datasets and optimise the content, while maintaining the confidentiality of respondents. Intending purchasers should ensure that the data they require, at the level of detail they require, are available on the CURF. Data obtained in the survey but not contained on the CURF may be available in TableBuilder or in tabulated form on request. The Data Item List, which is available as an Excel spreadsheet on the Downloads tab, contains information about the data items.
Table 1 shows the number of records on each level for the CURF dataset.
Table 1: Counting units and number of records, by level
Weights and Estimation
Information regarding Weights and Estimation is available on the File Structure page.
A series of identifiers can be used to identify and link records at each level of the file.
File level identifiers
The identifiers ABSHID, ABSPID, ABSBID appear on all levels of the file (as they are needed to create a hierarchical CSV file). Where the information for the identifier is not relevant for a level, it has a value of 0.
Each household has a unique thirteen-digit random identifier, ABSHID. This identifier appears on the Household level and is repeated on every other level. The Barriers to services episode level is a child of the Person level, and therefore its unique identifier comprises the Household, Person and episode level identifiers. The composition of identifiers for each level is outlined below:
1. Household = ABSHID
2. Person = ABSHID, ABSPID
3. Barriers to services = ABSHID, ABSPID, ABSBID
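As a rough illustration of how these identifiers combine into a unique key at each level, the following Python sketch builds the key for one record per level. The identifier names come from the CURF; the record values and the helper name record_key are hypothetical and used only for illustration.

```python
# Which identifiers make up the unique key at each level (from the list above).
LEVEL_KEYS = {
    "Household": ("ABSHID",),
    "Person": ("ABSHID", "ABSPID"),
    "Barriers to services": ("ABSHID", "ABSPID", "ABSBID"),
}

def record_key(record, level):
    """Return the tuple of identifier values that uniquely identifies a record."""
    return tuple(record[name] for name in LEVEL_KEYS[level])

# Hypothetical records; identifiers not relevant to a level are set to 0.
household = {"ABSHID": 1000000000001, "ABSPID": 0, "ABSBID": 0}
person = {"ABSHID": 1000000000001, "ABSPID": 2, "ABSBID": 0}
barrier = {"ABSHID": 1000000000001, "ABSPID": 2, "ABSBID": 1}

print(record_key(household, "Household"))
print(record_key(person, "Person"))
print(record_key(barrier, "Barriers to services"))
```

Identifiers that are not part of a level's key (the zeros) are ignored when the key is built, so they do not need to be removed first.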
Copying information across levels
Identifiers can be used to copy information from one level of the file to another. The following SAS code (or equivalent) can be used to copy information from one level to another:
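PROC SORT DATA=ISS14EP; BY ABSHID; RUN; *Person level file;
PROC SORT DATA=ISS14EH; BY ABSHID; RUN; *Household level file;
DATA PERSONHH; *Output dataset name is illustrative only;
*DROP= stops the zero-filled household identifiers overwriting the person values;
MERGE ISS14EP (IN=A) ISS14EH (IN=B DROP=ABSPID ABSBID);
BY ABSHID;
IF A AND B THEN OUTPUT; *Only keeps records which are present on both files;
RUN;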
The following SAS code (or equivalent) can be used to copy information from a higher level to a level below:
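PROC SORT DATA=ISS14EP; BY ABSHID ABSPID; RUN; *Person level file;
PROC SORT DATA=ISS14EB; BY ABSHID ABSPID; RUN; *Barriers to services level file;
DATA BARPERSON; *Output dataset name is illustrative only;
*DROP= stops the zero-filled person level ABSBID overwriting the episode values;
MERGE ISS14EB (IN=A) ISS14EP (IN=B DROP=ABSBID);
BY ABSHID ABSPID;
IF A AND B THEN OUTPUT; *Only keeps records which are present on both files;
RUN;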
This merge will match one ISS14EP record to many ISS14EB records. The statement 'If A and B then OUTPUT;' ensures that only records present on both files are kept. If this statement was not used then ISS14EP records without a corresponding ISS14EB record would appear with a missing value for all ISS14EB data items. Note that the data items copied from the ISS14EP level will now have the counting unit for the level they have been added to, being instances of Barriers to services in this case.
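For users working outside SAS, the same one-to-many match can be sketched in plain Python. This is only an illustration under stated assumptions: the records, the SEX codes and the function name merge_down are hypothetical, and the CURF itself is supplied in SAS, SPSS and STATA formats rather than as Python objects.

```python
def merge_down(child_rows, parent_rows, keys):
    """Inner-join each child record to its single parent record on `keys`."""
    parents = {tuple(p[k] for k in keys): p for p in parent_rows}
    merged = []
    for child in child_rows:
        parent = parents.get(tuple(child[k] for k in keys))
        if parent is None:
            continue  # same effect as IF A AND B: drop unmatched records
        # Parent items are added first so the child's own identifiers win.
        merged.append({**parent, **child})
    return merged

# Hypothetical data: person 1 reports two barriers, person 2 reports none.
persons = [
    {"ABSHID": 1, "ABSPID": 1, "ABSBID": 0, "SEX": 1},
    {"ABSHID": 1, "ABSPID": 2, "ABSBID": 0, "SEX": 2},
]
barriers = [
    {"ABSHID": 1, "ABSPID": 1, "ABSBID": 1},
    {"ABSHID": 1, "ABSPID": 1, "ABSBID": 2},
]

merged = merge_down(barriers, persons, keys=("ABSHID", "ABSPID"))
print(len(merged))  # one row per matched barrier record, not per person
```

Person 2 has no Barriers to services record and so does not appear in the result, and person 1's SEX value is repeated on each of that person's barrier records.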
Combining data from different levels can sometimes be confusing, both in selecting an appropriate item and in understanding the counting unit. For example, if you are interested in Barriers to services, and you want to analyse this by characteristics such as sex or age, then you might cross-tabulate SEX by COUNTBAR (Types of selected services has most problems accessing). This would yield results indicating the estimate (or sample count) of instances of barriers experienced in each category, split by sex, rather than the estimate (or sample count) of males or females. When looking at the Barriers to services level, the counting unit is instances of barriers experienced, rather than persons.
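The difference between the two counting units can be shown with a small Python sketch. The rows and SEX codes below are hypothetical, and the counts are unweighted sample counts; estimates would also need the weights described on the File Structure page.

```python
from collections import Counter

# Hypothetical barrier-level rows with SEX copied from the person level.
rows = [
    {"ABSHID": 1, "ABSPID": 1, "ABSBID": 1, "SEX": 1},
    {"ABSHID": 1, "ABSPID": 1, "ABSBID": 2, "SEX": 1},
    {"ABSHID": 1, "ABSPID": 1, "ABSBID": 3, "SEX": 1},
    {"ABSHID": 2, "ABSPID": 1, "ABSBID": 1, "SEX": 2},
]

# Counting rows gives instances of barriers in each SEX category.
instances_by_sex = Counter(row["SEX"] for row in rows)

# Counting distinct (ABSHID, ABSPID) pairs gives persons with at least one barrier.
persons_seen = {(row["ABSHID"], row["ABSPID"], row["SEX"]) for row in rows}
persons_by_sex = Counter(sex for _, _, sex in persons_seen)

print(dict(instances_by_sex))
print(dict(persons_by_sex))
```

In this made-up data, one person accounts for three of the four barrier instances, so the instance counts and the person counts differ for that SEX category.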
Example STATA code
The following STATA code will display values for the household level data items STATE and SF2SA1DN:
table STATE, c( freq ) f(%11.0f) stubwidth(30)
table SF2SA1DN, c( freq ) f(%11.0f) stubwidth(30)
The following STATA code will display values for the person level data items LTCQ01 and NETINHOM:
table LTCQ01, c( freq ) f(%11.0f) stubwidth(30)
table NETINHOM, c( freq ) f(%11.0f) stubwidth(30)
Example SPSS code
The following SPSS code will display values for the household level data items STATE and SF2SA1DN:
The following SPSS code will display values for the person level data items LTCQ01 and NETINHOM:
Remote/Non-remote only data items
Some survey questions were only asked of people/households in either remote or non-remote areas. Data items based on these questions are therefore only applicable to their relevant geographies. These items are identified by their population description on the Data Item List.
Indigenous status for Queensland
The 2014–15 NATSISS sample for Queensland was designed to allow for the release of data on the Torres Strait Islander population in that state. When using the Indigenous status item for Queensland on the CURF (QLDINDS), it should be noted that the Torres Strait Islander category comprises persons who:
Multi-response items on the CURF
There are a number of data items on the CURF that contain multiple responses. This means that the person being interviewed was able to select one or more response categories for these items. Multiple response items are indicated on the Data Item List.
On the CURF, each response category for the multiple response questions is treated as a separate data item. Each data item therefore has a response of either:
A 'Not applicable' response has a code of '0' indicating that the response category does not apply for the respondent. A 'Yes' response has a code greater than '0' indicating a positive response for that category.
An example of a multiple response item is the question on the 'Types of selected stressors experienced by self, family or friends in last 12 months' (TSTRAL), which has 26 response categories. From these categories, 26 separate data items have been produced - TSTRALA, TSTRALB, TSTRALC...TSTRALZ.
In most cases, multiple response items will have a number of categories falling into the first SAS category. This is denoted by an 'A' at the end of the fixed SAS name, eg TSTRALA. This category will contain the first multiple response category, as well as any special codes for the item. Using the example of TSTRALA, these special codes are 99 'Not applicable' and 98 'Refusal'. When using data from these multiple response items, the placement of these special codes should be confirmed by referring to the Data Item List.
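A minimal Python sketch of tallying positive responses for a multiple response item is shown below. The person records are hypothetical, only three of the 26 TSTRAL items are used, and the placement of the special codes 98 and 99 on TSTRALA follows the description above; it should still be confirmed against the Data Item List.

```python
# Hypothetical person records for three of the TSTRAL items.
# Code 0 means the category does not apply; a code above 0 is a positive
# response. TSTRALA also carries the item's special codes (98 and 99).
SPECIAL_CODES_A = {98, 99}

people = [
    {"TSTRALA": 1, "TSTRALB": 0, "TSTRALC": 3},
    {"TSTRALA": 99, "TSTRALB": 0, "TSTRALC": 0},
    {"TSTRALA": 0, "TSTRALB": 2, "TSTRALC": 3},
]

def selected(person, item):
    """True when the person gave a positive response for this category."""
    code = person[item]
    if item == "TSTRALA" and code in SPECIAL_CODES_A:
        return False  # special code, not a real response category
    return code > 0

counts = {item: sum(selected(p, item) for p in people)
          for item in ("TSTRALA", "TSTRALB", "TSTRALC")}
print(counts)
```

Without the special code check, the second person's code of 99 on TSTRALA would be counted as a positive response for the first category.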
Continuous data items on the CURF
When analysing continuous items at the person and household levels, it is necessary to exclude the special codes. The special codes are used for responses that do not represent the data being collected (eg 'Don't know'). The codes vary, but will generally be 0, 96, 97, 98, 99 or variations of these. For example, the 'Weekly rent' data item has reserved values of:
The Data Item List provides the special codes for continuous items. Care should be taken to exclude these codes when categorising higher values for ranges, and when calculating means, medians and other summary statistics.
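The effect of leaving special codes in a continuous item can be sketched in Python as follows. Both the reserved values and the rent values here are invented for illustration; the actual special codes for each item are those given in the Data Item List.

```python
from statistics import mean, median

# Hypothetical reserved values; the real ones are listed in the Data Item List.
SPECIAL_CODES = {0, 9996, 9997, 9998, 9999}

# Hypothetical weekly rent values, including some special codes.
weekly_rent = [250, 310, 9999, 180, 0, 420, 9998]

# Keep only genuine dollar values before summarising.
valid = [v for v in weekly_rent if v not in SPECIAL_CODES]

print(mean(weekly_rent))  # distorted by the special codes
print(mean(valid))        # summary over genuine values only
print(median(valid))
```

The same filter applies when grouping values into ranges, since an unfiltered top range would otherwise absorb the high-valued special codes.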
Accessing the Expanded CURF
The 2014–15 Expanded CURF can be accessed via the ABS Data Laboratory and the Remote Access Data Laboratory, and is available in SAS, SPSS and STATA formats.
Expanded CURF data files
These files contain the data for the CURF in SAS format.
ISS14EH.SAS7BDAT contains the Household level data
ISS14EP.SAS7BDAT contains the Person level data
ISS14EB.SAS7BDAT contains the Barriers to services data
These files contain the data for the CURF in SPSS format.
ISS14EH.SAV contains the Household level data
ISS14EP.SAV contains the Person level data
ISS14EB.SAV contains the Barriers to services data
These files contain the data for the CURF in STATA format.
ISS14EH.DTA contains the Household level data
ISS14EP.DTA contains the Person level data
ISS14EB.DTA contains the Barriers to services data
Data Item List
The Data Item List contains all the data items, including details of categories and code values that are available on the CURF, and is available on the Downloads tab.
FORMATS.sas7bcat is a SAS library containing formats
The following plain text format files contain data item code values and category labels at each level, for both unweighted and weighted data.
FREQUENCIES_ISS14EHU.txt contains Household level unweighted data
FREQUENCIES_ISS14EHW.txt contains Household level weighted data
FREQUENCIES_ISS14EPU.txt contains Person level unweighted data
FREQUENCIES_ISS14EPW.txt contains Person level weighted data
FREQUENCIES_ISS14EBU.txt contains Barriers to services level unweighted data
FREQUENCIES_ISS14EBW.txt contains Barriers to services level weighted data