4720.0.55.002 - Microdata: National Aboriginal and Torres Strait Islander Social Survey, 2014-15 Quality Declaration 
Latest ISSUE Released at 11:30 AM (CANBERRA TIME) 29/07/2016  First Issue

USING THE EXPANDED CURF

About the CURF

The 2014–15 National Aboriginal and Torres Strait Islander Social Survey (NATSISS) Expanded CURF contains three separate record level files which are described on the File Structure page on the Summary tab. Subject to the limitation of sample size, the data classifications used and the conditions of use, it is possible to interrogate the data, produce tabulations and undertake statistical analyses to individual specifications.

The data included in the CURF are released under the provisions of the Census and Statistics Act 1905. This legislation allows the Australian Statistician to release unit record data, or microdata, provided this is done "in a manner that is not likely to enable the identification of a particular person or organisation to which it relates." Accordingly, there are no names or addresses of survey respondents on the CURF and other steps, including the following list of actions, have been taken to protect the confidentiality of respondents:

  • Excluding some data items that were collected
  • Applying value ranges, collapses or top-coding to some variables
  • Perturbation of dollar values
  • Changing some demographic characteristics on unusual records to protect against identification

The nature of the changes made ensures that their effect on data for analysis purposes is considered negligible. These changes, and the fact that estimates previously published in National Aboriginal and Torres Strait Islander Social Survey, Australia, 2014–15 (cat. no. 4174.0) have perturbation applied to all data items, mean that estimates produced from the CURF may differ from those published in the summary publication or produced in TableBuilder.

Steps to confidentialise the datasets made available on the CURF are undertaken in such a way as to ensure the integrity of the datasets and optimise the content, while maintaining the confidentiality of respondents. Intending purchasers should ensure that the data they require, at the level of detail they require, are available on the CURF. Data obtained in the survey but not contained on the CURF may be available in TableBuilder or in tabulated form on request. The Data Item List, which is available as an Excel spreadsheet on the Downloads tab, contains information about the data items.


Record counts

Table 1 shows the number of records on each level for the CURF dataset.

Table 1: Counting units and number of records, by level

Level                        Counting unit          Number of records

Household level              Households             6,611
Person level                 Selected persons       11,178
Barriers to services level   Barriers to services   8,717



Weights and Estimation

Information regarding Weights and Estimation is available on the File Structure page.

Identifiers

There is a series of identifiers that can be used on records at each level of the file.

File level identifiers

The identifiers ABSHID, ABSPID, ABSBID appear on all levels of the file (as they are needed to create a hierarchical CSV file). Where the information for the identifier is not relevant for a level, it has a value of 0.

Each household has a unique thirteen digit random identifier, ABSHID. This identifier appears on the Household level and is repeated on every other level. The Barriers to services episode level is a child of the Person level, and therefore its unique identifier comprises the Household, Person and episode level identifiers. The composition of identifiers for each level is outlined below:

1. Household = ABSHID
2. Person = ABSHID, ABSPID
3. Barriers to services = ABSHID, ABSPID, ABSBID
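
As a rough illustration of how these composite identifiers work, the Python sketch below builds the key for a record at each level. Only the identifier names ABSHID, ABSPID and ABSBID come from the CURF. The record values, the `LEVEL_KEYS` table and the `record_key` helper are hypothetical and exist only for this example.

```python
# Sketch only: identifier values below are invented for illustration.
# Identifiers needed to uniquely identify a record at each level.
LEVEL_KEYS = {
    "household": ("ABSHID",),
    "person": ("ABSHID", "ABSPID"),
    "barriers": ("ABSHID", "ABSPID", "ABSBID"),
}


def record_key(record, level):
    """Return the tuple of identifiers that identifies a record at a level."""
    return tuple(record[name] for name in LEVEL_KEYS[level])


# A hypothetical person record: ABSBID is 0 because it is not relevant here.
person = {"ABSHID": 1000000000001, "ABSPID": 2, "ABSBID": 0}
# A hypothetical barriers record belonging to the same person.
barrier = {"ABSHID": 1000000000001, "ABSPID": 2, "ABSBID": 1}

person_key = record_key(person, "person")
barrier_key = record_key(barrier, "barriers")

# The parent person of a barriers record is found by ignoring ABSBID.
parent_key = record_key(barrier, "person")
print(person_key, barrier_key, parent_key == person_key)
```

Reading the key from the level rather than from the record avoids treating the zero-filled identifiers as meaningful values.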

Copying information across levels

Identifiers can be used to copy information from one level of the file to another. For example, the following SAS code (or equivalent) copies Household level information onto each Person level record by matching on ABSHID:

PROC SORT DATA=ISS14EP; *Person level file;
BY ABSHID;
RUN;

PROC SORT DATA=ISS14EH; *Household level file;
BY ABSHID;
RUN;

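DATA MERGE_FILE;
MERGE ISS14EP (IN=A) ISS14EH (IN=B DROP=ABSPID ABSBID); *Drop the zero-filled identifiers so they do not overwrite the Person level values;
BY ABSHID;
IF A AND B THEN OUTPUT; *Only keeps records which are present on both files;
RUN;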

The following SAS code (or equivalent) can be used to copy information from a higher level to a level below:

PROC SORT DATA=ISS14EP; *Person level file;
BY ABSHID ABSPID;
RUN;

PROC SORT DATA=ISS14EB; *Barriers to services level file;
BY ABSHID ABSPID;
RUN;

DATA MERGE_FILE;
MERGE ISS14EB (IN=A) ISS14EP (IN=B DROP=ABSBID); *Drop the zero-filled ABSBID from the Person level file;
BY ABSHID ABSPID; *Match each barriers record to its own person;
IF A AND B THEN OUTPUT; *Only keeps records which are present on both files;
RUN;

This merge matches one ISS14EP record to many ISS14EB records. The statement 'IF A AND B THEN OUTPUT;' ensures that only records present on both files are kept. If this statement were not used, ISS14EP records without a corresponding ISS14EB record would appear with missing values for all ISS14EB data items. Note that data items copied from the ISS14EP level take on the counting unit of the level they have been added to, in this case instances of Barriers to services.
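
For readers working outside SAS, the following Python sketch shows one possible equivalent of this one-to-many match. It assumes each barriers record should pick up the items of its own person, so it matches on both ABSHID and ABSPID. The records are toy data, the code values are invented, and `copy_person_items` is a hypothetical helper rather than part of any ABS product.

```python
# Sketch only: toy records standing in for the ISS14EP and ISS14EB files.
persons = [
    {"ABSHID": 1, "ABSPID": 1, "SEX": 1},
    {"ABSHID": 1, "ABSPID": 2, "SEX": 2},
    {"ABSHID": 2, "ABSPID": 1, "SEX": 2},  # person with no barriers records
]
barriers = [
    {"ABSHID": 1, "ABSPID": 1, "ABSBID": 1, "COUNTBAR": 3},
    {"ABSHID": 1, "ABSPID": 1, "ABSBID": 2, "COUNTBAR": 5},
    {"ABSHID": 1, "ABSPID": 2, "ABSBID": 1, "COUNTBAR": 3},
]


def copy_person_items(barrier_rows, person_rows, items):
    """Copy the named person level items onto each matching barriers record."""
    lookup = {(p["ABSHID"], p["ABSPID"]): p for p in person_rows}
    merged_rows = []
    for b in barrier_rows:
        person = lookup.get((b["ABSHID"], b["ABSPID"]))
        if person is None:
            continue  # like IF A AND B: drop records without a match
        row = dict(b)  # keep the barriers identifiers, including ABSBID
        for item in items:
            row[item] = person[item]
        merged_rows.append(row)
    return merged_rows


merged = copy_person_items(barriers, persons, ["SEX"])
for row in merged:
    print(row)
```

Copying only the named items leaves the identifiers on the barriers records untouched, and the person with no barriers records does not appear in the result.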

Combining data from different levels can sometimes be confusing, both in selecting an appropriate item and in understanding the counting unit. For example, if you are interested in Barriers to services, and you want to analyse this by characteristics such as sex or age, then you might cross-tabulate SEX by COUNTBAR (Types of selected services has most problems accessing). This would yield results indicating the estimate (or sample count) of instances of barriers experienced in each category, split by sex, rather than the estimate (or sample count) of males or females. When looking at the Barriers to services level, the counting unit is instances of barriers experienced, rather than persons.
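
The difference between the two counting units can be seen in a small Python sketch. The rows below are hypothetical merged barriers records carrying SEX from the Person level. The code values are invented and the counts are unweighted, so this illustrates only the counting logic and not how to produce estimates.

```python
from collections import Counter

# Sketch only: hypothetical barriers level rows with SEX copied from persons.
rows = [
    {"ABSHID": 1, "ABSPID": 1, "ABSBID": 1, "SEX": 1},
    {"ABSHID": 1, "ABSPID": 1, "ABSBID": 2, "SEX": 1},
    {"ABSHID": 1, "ABSPID": 1, "ABSBID": 3, "SEX": 1},
    {"ABSHID": 1, "ABSPID": 2, "ABSBID": 1, "SEX": 2},
]

# Counting rows counts instances of barriers, split by sex.
barrier_counts = Counter(r["SEX"] for r in rows)

# Counting distinct (ABSHID, ABSPID) pairs counts persons instead.
distinct_persons = {(r["ABSHID"], r["ABSPID"], r["SEX"]) for r in rows}
person_counts = Counter(sex for _, _, sex in distinct_persons)

print("Barrier instances by sex:", dict(barrier_counts))
print("Persons by sex:", dict(person_counts))
```

In this toy data one person accounts for three of the four barrier instances, so the two tabulations give different answers for the same SEX category.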

Example STATA code

The following STATA code will display values for the household level data items STATE and SF2SA1DN:

use "`ISS14EH'"
table STATE, c( freq ) f(%11.0f) stubwidth(30)
table SF2SA1DN, c( freq ) f(%11.0f) stubwidth(30)

The following STATA code will display values for the person level data items LTCQ01 and NETINHOM:

use "`ISS14EP'"
table LTCQ01, c( freq ) f(%11.0f) stubwidth(30)
table NETINHOM, c( freq ) f(%11.0f) stubwidth(30)

Example SPSS code

The following SPSS code will display values for the household level data items STATE and SF2SA1DN:

GET
FILE=ISS14EH.
EXECUTE.
FREQUENCIES
VARIABLES=STATE SF2SA1DN/ORDER=ANALYSIS.

The following SPSS code will display values for the person level data items LTCQ01 and NETINHOM:

GET
FILE=ISS14EP.
EXECUTE.
FREQUENCIES
VARIABLES=LTCQ01 NETINHOM/ORDER=ANALYSIS.


Geography

Remote/Non-remote only data items

Some survey questions were only asked of people/households in either remote or non-remote areas. Data items based on these questions are therefore only applicable to their relevant geographies. These items are identified by their population description on the Data Item List.

Indigenous status for Queensland

The 2014–15 NATSISS sample for Queensland was designed to allow for the release of data on the Torres Strait Islander population in that state. When using the Indigenous status item for Queensland on the CURF (QLDINDS), it should be noted that the Torres Strait Islander category comprises persons who:

  • Identified as being of Torres Strait Islander origin only; or
  • Identified as being of both Aboriginal and Torres Strait Islander origin.


Multi-response items on the CURF

There are a number of data items on the CURF that contain multiple responses. This means that the person being interviewed was able to select one or more response categories for these items. Multiple response items are indicated on the Data Item List.

On the CURF, each response category for the multiple response questions is treated as a separate data item. Each data item therefore has a response of either:
  • Not applicable; or
  • Yes.

A 'Not applicable' response has a code of '0' indicating that the response category does not apply for the respondent. A 'Yes' response has a code greater than '0' indicating a positive response for that category.

An example of a multiple response item is the question on the 'Types of selected stressors experienced by self, family or friends in last 12 months' (TSTRAL), which has 26 response categories. From these categories, 26 separate data items have been produced: TSTRALA, TSTRALB, TSTRALC...TSTRALZ.

In most cases, multiple response items will have a number of categories falling into the first SAS category. This is denoted by an 'A' at the end of the fixed SAS name, eg TSTRALA. This category will contain the first multiple response category, as well as any special codes for the item. Using the example of TSTRALA, these special codes are 99 'Not applicable' and 98 'Refusal'. When using data from these multiple response items, the placement of these special codes should be confirmed by referring to the Data Item List.
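
A minimal Python sketch of counting positive responses across a multiple response item is given below. It uses only the first three TSTRAL data items, with invented code values, and assumes the special codes sit at 98 and 99 as in the TSTRALA example. The actual codes and their placement should be confirmed against the Data Item List. `count_yes` is a hypothetical helper.

```python
# Sketch only: code values are assumptions; check the Data Item List.
SPECIAL_CODES = {98, 99}  # as described for TSTRALA in the text above

# Hypothetical person records with three of the 26 TSTRAL data items.
records = [
    {"TSTRALA": 1, "TSTRALB": 2, "TSTRALC": 0},
    {"TSTRALA": 99, "TSTRALB": 0, "TSTRALC": 0},
    {"TSTRALA": 0, "TSTRALB": 2, "TSTRALC": 3},
    {"TSTRALA": 98, "TSTRALB": 0, "TSTRALC": 0},
]


def count_yes(rows, item):
    """Count records with a positive response, ignoring special codes."""
    return sum(
        1 for r in rows if r[item] > 0 and r[item] not in SPECIAL_CODES
    )


items = ("TSTRALA", "TSTRALB", "TSTRALC")
yes_counts = {item: count_yes(records, item) for item in items}
print(yes_counts)
```

Because a respondent can appear under several categories, the category counts can sum to more than the number of respondents.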


Continuous data items on the CURF

When analysing continuous items at the person and household levels, it is necessary to exclude the special codes. The special codes are used for responses that do not represent the data being collected (eg 'Don't know'). The codes vary, but will generally be 0, 96, 97, 98, 99 or variations of these. For example, the 'Weekly rent' data item has reserved values of:

  • 9997 for Not applicable; and
  • 9998 for Not known.

The Data Item List provides the special codes for continuous items. Care should be taken to exclude these codes when categorising higher values for ranges, and when calculating means, medians and other summary statistics.
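
The effect of leaving special codes in a calculation can be sketched in Python as follows. The reserved values 9997 and 9998 are those given above for 'Weekly rent'. The rent amounts are invented, and the variable names are hypothetical rather than CURF data item names.

```python
from statistics import mean, median

# Reserved values for the 'Weekly rent' item, as listed above.
RENT_SPECIAL_CODES = {9997, 9998}

# Sketch only: invented values, including two special codes.
weekly_rent = [250, 320, 9997, 180, 9998, 410]

# Keep only values that represent an actual rent amount.
valid_rent = [v for v in weekly_rent if v not in RENT_SPECIAL_CODES]

naive_mean = mean(weekly_rent)   # wrongly treats special codes as dollars
valid_mean = mean(valid_rent)    # special codes excluded
valid_median = median(valid_rent)

print("Mean including special codes:", round(naive_mean, 2))
print("Mean excluding special codes:", valid_mean)
print("Median excluding special codes:", valid_median)
```

The same filter should be applied before grouping values into ranges, otherwise the special codes fall into the highest range.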


Accessing the Expanded CURF

The 2014–15 Expanded CURF can be accessed via the ABS Data Laboratory and the Remote Access Data Laboratory, and is available in SAS, SPSS and STATA formats.


Expanded CURF data files

SAS files

These files contain the data for the CURF in SAS format.

ISS14EH.SAS7BDAT contains the Household level data
ISS14EP.SAS7BDAT contains the Person level data
ISS14EB.SAS7BDAT contains the Barriers to services data

SPSS files

These files contain the data for the CURF in SPSS format.

ISS14EH.SAV contains the Household level data
ISS14EP.SAV contains the Person level data
ISS14EB.SAV contains the Barriers to services data

STATA files

These files contain the data for the CURF in STATA format.

ISS14EH.DTA contains the Household level data
ISS14EP.DTA contains the Person level data
ISS14EB.DTA contains the Barriers to services data


Information files

Data Item List

The Data Item List contains all the data items, including details of categories and code values that are available on the CURF, and is available on the Downloads tab.

Formats file

FORMATS.sas7bcat is a SAS catalog file containing formats.

Frequency files

The following plain text format files contain data item code values and category labels at each level, for both unweighted and weighted data.

FREQUENCIES_ISS14EHU.txt contains Household level unweighted data
FREQUENCIES_ISS14EHW.txt contains Household level weighted data

FREQUENCIES_ISS14EPU.txt contains Person level unweighted data
FREQUENCIES_ISS14EPW.txt contains Person level weighted data

FREQUENCIES_ISS14EBU.txt contains Barriers to services level unweighted data
FREQUENCIES_ISS14EBW.txt contains Barriers to services level weighted data