4715.0.30.003 - Microdata: Australian Aboriginal and Torres Strait Islander Health Survey, Core Content - Risk Factors and Selected Health Conditions, 2012-13
ARCHIVED ISSUE Released at 11:30 AM (CANBERRA TIME) 28/07/2015  First Issue

USING THE EXPANDED CURF

ABOUT THE EXPANDED CURFs

The Aboriginal and Torres Strait Islander Health Survey, Core Content - Risk Factors and Selected Health Conditions 2012–13 Expanded CURFs contain unit records relating to all of the survey respondents aged 2 years and over who participated in the National Aboriginal and Torres Strait Islander Health Survey (NATSIHS) and National Aboriginal and Torres Strait Islander Nutrition and Physical Activity Survey (NATSINPAS). The data are released under the Census and Statistics Act 1905, which has provision for the release of data in the form of unit records where the information is not likely to enable the identification of a particular person or organisation. Accordingly, there are no names or addresses of survey respondents on the CURFs and other steps, including the following list of actions, have been taken to protect the confidentiality of respondents:

  • the level of detail of many data items has been reduced by grouping, ranging or top coding values
  • some unusual records have been changed to protect against identification
  • some data items that were collected have been excluded
  • household income data has been perturbed. In addition, the household income deciles have been applied using the cut-offs produced by data from the Australian Health Survey (AHS) Expanded CURF files, with an additional CPI adjustment made.

Given the nature of the changes made and the relatively small number of records involved, the effect on data for analysis purposes is considered negligible.

The changes mean that estimates produced from the CURFs may differ from those published in Australian Aboriginal and Torres Strait Islander Health Survey: Updated Results, Australia - 2012-13 (cat. no. 4727.0.55.006), Australian Aboriginal and Torres Strait Islander Health Survey: Biomedical Results (cat. no. 4727.0.55.003) or subsequent AATSIHS Core-related publications.

Detailed information about the data collected, comments regarding data quality and other points to assist in using and interpreting the data are contained in Australian Aboriginal and Torres Strait Islander Health Survey: Users’ Guide, 2012-13 (cat. no. 4727.0.55.002).

The 2012-13 Aboriginal and Torres Strait Islander Health Survey, Core Content is available via two Expanded Confidentialised Unit Record Files (CURFs):
  • a State/Territory Expanded CURF
  • a State/Territory by ASGS Remoteness Expanded CURF

The primary differences between these CURFs are the:
  • type of geography available, and
  • availability of non-remote and remote restricted data items.

State/Territory Expanded CURF

This CURF contains a data item (STATEE) which identifies each state and territory separately, except Tasmania and the ACT. Due to confidentiality considerations, the samples from Tasmania and the ACT have been combined into a single category of Tas/ACT. This data item is located on the household level file. In addition, this CURF contains the Socio-economic Index of Relative Disadvantage (SEIFA - deciles) variable.

This CURF contains items that were collected in both non-remote and remote areas.

The structure of this CURF is:

Image: State/Territory CURF File Structure

State/Territory by ASGS Remoteness Expanded CURF

This CURF contains a broad National Remoteness data item (RAECURF) and a special data item (STATREM) consisting of 16 output categories which comprise selected cross-classifications of state/territory by remoteness, where sample and population estimate sizes permit. Output categories can be found in the Expanded CURF data item list located in the Downloads tab of this product. These two data items are available on the household level file. In addition, this CURF contains the Socio-economic Index of Relative Disadvantage (SEIFA - deciles) variable.

This CURF contains items that were collected in both non-remote and remote areas, as well as items collected in non-remote areas only or remote areas only. The data item list identifies the remoteness area in which each data item was collected. Users should refer to it to ensure that the population they are representing is correct.

The structure of this CURF is:

Image: State by ASGS Remoteness File Structure


ACCESSING EXPANDED CURFS

Expanded CURFs can only be accessed via the Remote Access Data Laboratory (RADL). Users must have applied for use of the RADL prior to using the Expanded CURF microdata. Details on the RADL can be found on the Remote Access Data Laboratory page.


COUNTS AND WEIGHTS
NUMBER OF RECORDS BY LEVEL, AATSIHS - CORE 2012-13 EXPANDED CURF

LEVEL | RECORD COUNTS (unweighted) | APPLICABLE POPULATION RECORD COUNTS (unweighted) | WEIGHTED COUNTS
1. Household level | 8 237 | 8 237 | N/A
2. Persons in household level | 28 785 | 28 785 | N/A
3. Person level | 12 947 | 12 947 | 606 915
4. Condition level | 17 224 | 17 224(a) | N/A
5. Biomedical level | 12 947 | 3 293(b) | 365 868
6. Child 5-17 years physical activity day level (NR Only)(c) | 17 361 | 6 621(d) | N/A
7. Child 5-17 years physical activity detailed level (NR Only)(c) | 21 607 | 10 867(e) | N/A

(a) Comprising 12,947 people, including people who have no condition
(b) Comprising biomedical participants 18 years and over
(c) Applicable to State/Territory by ASGS Remoteness CURF only.
(d) Comprising 3 days of data for 2,207 children aged 5-17 years living in non-remote areas
(e) Comprising 3 days of activity data for 2,207 children aged 5-17 years living in non-remote areas, including those who did no physical activity


Weights and Hierarchical Files

Weight Variables

There are two weight variables on the CURF file:

Person Weight PAA (IAHSPERW) - Person level - Benchmarked to the total population. This weight is located on the Person level.
Biomedical Weight PAA (IHMSPERW) - Biomedical level - Benchmarked to the total population aged 18 years and over. This weight is located on the Biomedical level. Note that this level also contains records for persons who did not participate in the biomedical component. Their biomedical weight is set to 0, so they do not contribute to estimates. When using biomedical variables in conjunction with other variables on the biomedical level or with variables from other levels, the biomedical weight should be used.

There are no weights associated with the other levels. This is because the records are repeated for each person. If, for example, IAHSPERW is merged onto the Conditions level, it will be attached to each condition record and therefore be repeated for each person where they have more than one condition. This should be considered when producing tables. See Copying information across levels below for more information.

For more information about weights see Reliability of Estimates below.
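
As a quick check that the two weights behave as described, their sums can be compared with the weighted counts in the table above. The following is a minimal sketch only. It assumes the State/Territory files are available under the same one-level names used in the SAS examples later in this section (AIH13SSP and AIH13SBI).

```sas
/* Person level: total of the person weight over all selected persons */
PROC MEANS DATA=AIH13SSP N SUM MAXDEC=0;
  VAR IAHSPERW;
RUN;

/* Biomedical level: restrict to records with a non-zero biomedical weight, */
/* because non-participants are kept on this level with a weight of 0       */
PROC MEANS DATA=AIH13SBI N SUM MAXDEC=0;
  WHERE IHMSPERW > 0;
  VAR IHMSPERW;
RUN;
```

If the files match the description above, the first sum should be close to 606 915 and the second close to 365 868, with the second step reading roughly 3 293 records.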

Using Weights

The AATSIHS is a sample survey. To produce estimates for the in-scope population you must use weight fields in your calculations. The 'Biomedical Weight PAA' must be used for all tables where a biomedical level data item is being used. This includes where biomedical items are being used with items from other levels. Which weight, if any, is used on data at non-benchmarked levels will affect the result as shown in the examples below:

LEVEL | Explanation of estimate if Person weight is used | Weighted estimate of level if Person weight is applied
1. Household level | Persons in households with the specified characteristics. | N/A(a)
2. Persons in household level | Persons in households containing one or more persons with the specified characteristics. | N/A(a)
3. Person level | Persons with the specified characteristics. | 606 915
4. Condition level | Persons with one or more conditions with the specified characteristics. | 771 011(b)
5. Biomedical level | Use Biomedical weight to calculate persons with specified characteristics. | 365 868(c)
6. Child 5-17 years physical activity day level (NR Only)(d) | Persons with one or more physical activity days with the specified characteristics. | 918 823(b)
7. Child 5-17 years physical activity detailed level (NR Only)(d) | Persons with one or more physical activity types with the specified characteristics. | 1 218 818(b)

(a) Data must be flattened to be a household characteristic (if using the Persons in household level) and/or copied to the Person level. Because a household may contain more than one selected respondent with a weight, the person weight should not be copied to either the Household or Persons in household levels.
(b) Each person selected in the survey has at least one record on each level below the Person level. Weighted estimates produced for these levels without any filtering to restrict to the applicable population therefore include the weights of persons who are not applicable to the level or characteristic.
(c) Weighted estimate when using biomedical weight.
(d) State/Territory by ASGS Remoteness CURF only.
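
The rule about using the Biomedical weight whenever Biomedical level items are involved can be sketched as follows. This is an illustrative example only. It assumes the State/Territory files are available under the one-level names AIH13SBI and AIH13SSP, and the work dataset names BIO, PERS and BIOPERS are made up for this sketch. SEX is brought down from the Person level and tabulated on the Biomedical level. A biomedical data item from the data item list could be added to the TABLES statement in the same way.

```sas
/* Sort copies of both levels by the identifiers they share */
PROC SORT DATA=AIH13SBI OUT=BIO;
  BY ABSHHRID ABSAPRID ABSSPRID;
RUN;

PROC SORT DATA=AIH13SSP (KEEP=ABSHHRID ABSAPRID ABSSPRID SEX) OUT=PERS;
  BY ABSHHRID ABSAPRID ABSSPRID;
RUN;

/* Attach the person level item to each Biomedical level record */
DATA BIOPERS;
  MERGE BIO (IN=INBIO) PERS;
  BY ABSHHRID ABSAPRID ABSSPRID;
  IF INBIO;
RUN;

/* Weight by the Biomedical weight, not the Person weight. */
/* Records with a weight of 0 do not add to the estimates. */
PROC FREQ DATA=BIOPERS;
  TABLES SEX;
  WEIGHT IHMSPERW;
RUN;
```

If the files match the table above, the weighted total across the SEX categories should be close to 365 868 rather than 606 915.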


IDENTIFIERS

Every record on each level of the file is uniquely identified.

The identifiers ABSHHRID, ABSAPRID, ABSSPRID, ABSBIRID, ABSCNRID appear on all levels of the files. In addition, the State/Territory by ASGS Remoteness Expanded CURF also has ABSCPAID and ABSCPDID located on all levels of its files. Where the information for the identifier is not relevant for a level, it has a value of 0. See below for details on which IDs are relevant for which levels.

Each household has a unique thirteen digit random identifier, ABSHHRID. This identifier appears on the household level and is repeated on each level on each record pertaining to that household. The combination of identifiers uniquely identifies a record at a particular level as shown below.

STATE/TERRITORY EXPANDED CURF

1. Household = ABSHHRID
2. Persons in Household = ABSHHRID, ABSAPRID
3. Person = ABSHHRID, ABSAPRID, ABSSPRID
4. Conditions = ABSHHRID, ABSAPRID, ABSSPRID, ABSCNRID
5. Biomedical = ABSHHRID, ABSAPRID, ABSSPRID, ABSBIRID

STATE/TERRITORY BY ASGS REMOTENESS EXPANDED CURF

1. Household = ABSHHRID
2. Persons in Household = ABSHHRID, ABSAPRID
3. Person = ABSHHRID, ABSAPRID, ABSSPRID
4. Conditions = ABSHHRID, ABSAPRID, ABSSPRID, ABSCNRID
5. Biomedical = ABSHHRID, ABSAPRID, ABSSPRID, ABSBIRID
6. Child 5-17 years Physical Activity = ABSHHRID, ABSAPRID, ABSSPRID, ABSCPAID
7. Child 5-17 years Physical Activity detailed = ABSHHRID, ABSAPRID, ABSSPRID, ABSCPAID, ABSCPDID

ABSHHRID assists with linking together people of the same household and also with household characteristics such as geography (located on the household level). The combination of ABSHHRID, ABSAPRID, ABSSPRID and ABSCNRID identifies each individual condition record a selected person has. When merging data with a level above, only those identifiers relevant to the level above are required. However, when merging, for example, the conditions level with the person level, the data on the person level will duplicate for each condition. See Copying information across levels below for more information.
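
The identifier rules above can be illustrated with a short sketch. It is not a prescribed method. It assumes the State/Territory files are available under the one-level names AIH13SCO, AIH13SHH and AIH13SSP, and the work dataset names (CONDUNIQ, CONDDUPS, HH, PERS, PERSGEO) are made up for this sketch. The first step checks that the four Condition level identifiers are unique. The remaining steps bring the household level geography item STATEE down to the Person level using ABSHHRID alone.

```sas
/* Records sharing the same four identifiers are written to CONDDUPS. */
/* If the identifiers are unique, CONDDUPS will contain no records.   */
PROC SORT DATA=AIH13SCO OUT=CONDUNIQ NODUPKEY DUPOUT=CONDDUPS;
  BY ABSHHRID ABSAPRID ABSSPRID ABSCNRID;
RUN;

/* Only the household identifier is needed to merge with the Household level */
PROC SORT DATA=AIH13SHH (KEEP=ABSHHRID STATEE) OUT=HH;
  BY ABSHHRID;
RUN;

PROC SORT DATA=AIH13SSP OUT=PERS;
  BY ABSHHRID;
RUN;

DATA PERSGEO;
  MERGE PERS (IN=INPERS) HH;
  BY ABSHHRID;
  IF INPERS; /* keep one record per selected person */
RUN;

/* Weighted person estimates by state/territory */
PROC FREQ DATA=PERSGEO;
  TABLES STATEE;
  WEIGHT IAHSPERW;
RUN;
```

Because PERSGEO still has one record per selected person, the person weight is not repeated and the weighted total remains a count of persons.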

COPYING INFORMATION ACROSS LEVELS

For information regarding whether a level is higher or lower than another, refer to the file structure image in the About the Expanded CURFs section above.

Lower level to a higher level

The following SAS code is an example of copying information from a lower level to a level above:

PROC SORT DATA=AIH13SCO; /* Condition level */
  BY ABSHHRID ABSAPRID ABSSPRID;
RUN;

DATA TTLLT (KEEP=ABSHHRID ABSAPRID ABSSPRID LONGTERM NOTCURR);
  SET AIH13SCO;
  BY ABSHHRID ABSAPRID ABSSPRID; /* This step will go through each Condition record within each unique combination of ABSHHRID, ABSAPRID and ABSSPRID */

  RETAIN LONGTERM NOTCURR;
  IF FIRST.ABSSPRID THEN DO; LONGTERM=0; NOTCURR=0; END; /* Note as the file is sorted by the three IDs, reference to first is only needed for the last part of the ID */
  IF AHSSTAT=1 THEN LONGTERM=LONGTERM+1; /* adds to the count of the number of diagnosed long term conditions */
  IF AHSSTAT=3 THEN NOTCURR=NOTCURR+1; /* adds to the count of the number of diagnosed conditions that are not current */

  IF LAST.ABSSPRID THEN OUTPUT; /* This outputs the totals found within each unique combination of ABSHHRID, ABSAPRID and ABSSPRID */
RUN;

PROC SORT DATA=AIH13SSP; /* PERSON level - the level above Condition */
  BY ABSHHRID ABSAPRID ABSSPRID;
RUN;

DATA MRGFILES;
  MERGE TTLLT AIH13SSP;
  BY ABSHHRID ABSAPRID ABSSPRID;
RUN;

PROC FREQ DATA=MRGFILES; /* This procedure gives a weighted count of the data copied up from the Condition level to the Person level */
  TABLES LONGTERM NOTCURR*SEX; /* LONGTERM will be a weighted frequency table. NOTCURR will be in a weighted frequency table cross-tabbed by Sex */
  WEIGHT IAHSPERW;
RUN;

The new variables LONGTERM and NOTCURR give the number of collected conditions a person has that are diagnosed/long term or diagnosed/not current, respectively. They are meaningful on the Person level because only one value per person is produced for each variable. Merging these new items onto the Person level allows them to be analysed with any other items on the Person level and allows weighted estimates to be produced correctly.

Higher level to a lower level

The following SAS code is an example of copying information from a higher level to a level below:

DATA PERSON (KEEP=ABSHHRID ABSAPRID ABSSPRID AGEEC SEX IAHSPERW);
  SET AIH13SSP;
RUN;

PROC SORT DATA=PERSON;
  BY ABSHHRID ABSAPRID ABSSPRID;
RUN;

PROC SORT DATA=AIH13SCO;
  BY ABSHHRID ABSAPRID ABSSPRID;
RUN;

DATA MRGFILES2;
  MERGE AIH13SCO PERSON;
  BY ABSHHRID ABSAPRID ABSSPRID;
RUN;

PROC FREQ DATA=MRGFILES2; /* This procedure gives a weighted count of the AHSSTAT items located on the Condition level by the SEX variable and weight brought down from the Person level */
  TABLES SEX*AHSSTAT;
  WEIGHT IAHSPERW;
RUN;

This merge matches one Person record to many Conditions records. So, the data items copied from the person level ('AGEEC' and 'SEX' and 'IAHSPERW' in the example) will be repeated for the counting unit of the level they have been added to, Conditions in this case. Each Conditions record will therefore receive the Age and Sex and Person Weight of the Person they belong to. Weighted estimates will now be influenced by people who have more than one condition as their weight will be applied to multiple conditions.
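
Where the aim is to estimate persons rather than conditions, one possible approach is to reduce the merged file so that each person contributes at most one record per condition status before the weight is applied. The sketch below starts from the MRGFILES2 dataset created above. PERSTAT is a made-up work dataset name.

```sas
/* Keep one record per person for each value of AHSSTAT */
PROC SORT DATA=MRGFILES2 OUT=PERSTAT NODUPKEY;
  BY ABSHHRID ABSAPRID ABSSPRID AHSSTAT;
RUN;

/* Each cell now estimates the number of persons with one or more */
/* conditions of that status, because a person's weight is used   */
/* at most once within each AHSSTAT category                      */
PROC FREQ DATA=PERSTAT;
  TABLES SEX*AHSSTAT;
  WEIGHT IAHSPERW;
RUN;
```

The AHSSTAT categories in this table are not mutually exclusive at the person level, because a person can have conditions of more than one status. The cells therefore should not be added together to give a total number of persons.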


RELIABILITY OF ESTIMATES

As the survey was conducted on a sample of private households in Australia, it is important to take account of the method of sample selection when deriving estimates from the CURF. This is particularly important as a person's chance of selection in the survey varied depending on the state or territory in which the person lived. If these chances of selection are not accounted for, by use of appropriate weights, the results will be biased. For details on the weighting process see Weighting, Benchmarks and Estimation procedures in Australian Aboriginal and Torres Strait Islander Health Survey: Users' Guide, 2012-13 (cat. no. 4727.0.55.002).

Each person record has a main weight (IAHSPERW). This weight indicates how many population units are represented by the sample units. When producing estimates of sub-populations from the CURF, it is essential that they are calculated by adding the weights of persons in each category and not just by counting the sample number in each category. If each person's weight were to be ignored when analysing the data to draw inferences about the population, then no account would be taken of a person's chance of selection or of different response rates across population groups, with the result that the estimates produced could be biased. The application of weights ensures that estimates will conform to an independently estimated distribution of the population by age, by sex, etc. rather than to the distributions within the sample itself.

Each person record on the CURF contains 250 replicate weights in addition to the main weight. Replicate weights can be used to calculate measures of sampling error. For details on sampling error calculations and replicate weights see the Technical Note in the Australian Aboriginal and Torres Strait Islander Health Survey: Users' Guide, 2012-13 (cat. no. 4727.0.55.002).
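
The following sketch shows how replicate weights might be used to approximate a standard error. It rests on several assumptions that are not stated on this page and should be checked against the Technical Note and the data item list: that a delete-a-group jackknife with 250 groups applies, that the replicate weights are stored as REPWT1 to REPWT250 (hypothetical names), and that SEX=1 is a valid category code (used here only to define an illustrative subgroup).

```sas
/* ASSUMPTION: REPWT1-REPWT250 are placeholder names for the replicate weights */
DATA REP (KEEP=EST0 EST1-EST250);
  SET AIH13SSP;
  ARRAY RW{250} REPWT1-REPWT250;
  ARRAY EST{250} EST1-EST250;
  FLAG = (SEX = 1);            /* illustrative subgroup only */
  EST0 = FLAG * IAHSPERW;      /* contribution to the main estimate */
  DO G = 1 TO 250;
    EST{G} = FLAG * RW{G};     /* contribution to each replicate estimate */
  END;
RUN;

/* Sum the contributions to get the main and replicate estimates */
PROC MEANS DATA=REP NOPRINT;
  VAR EST0 EST1-EST250;
  OUTPUT OUT=TOT (DROP=_TYPE_ _FREQ_) SUM=;
RUN;

/* ASSUMPTION: jackknife variance = (249/250) * sum of squared differences */
DATA SE (KEEP=EST0 SE RSE);
  SET TOT;
  ARRAY EST{250} EST1-EST250;
  SS = 0;
  DO G = 1 TO 250;
    SS = SS + (EST{G} - EST0)**2;
  END;
  SE = SQRT((249/250) * SS);
  RSE = 100 * SE / EST0;       /* relative standard error, per cent */
RUN;

PROC PRINT DATA=SE;
RUN;
```

If the actual replicate weight names or variance formula differ from these assumptions, the ARRAY statement and the multiplier in the final step would need to be changed accordingly.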

EXPANDED CURF FILES

SAS files

These files contain the data for the CURFs in SAS format.

STATE/TERRITORY

AIH13SHH.sas7bdat contains the Household level data
AIH13SPH.sas7bdat contains the Persons in Household level data (All Persons)
AIH13SSP.sas7bdat contains the Person level data (Selected Person)
AIH13SBI.sas7bdat contains the Biomedical level data
AIH13SCO.sas7bdat contains the Condition level data

STATE/TERRITORY BY ASGS REMOTENESS

AIH13RHH.sas7bdat contains the Household level data
AIH13RPH.sas7bdat contains the Persons in Household level data (All Persons)
AIH13RSP.sas7bdat contains the Person level data (Selected Person)
AIH13RBI.sas7bdat contains the Biomedical level data
AIH13RCO.sas7bdat contains the Condition level data
AIH13RCP.sas7bdat contains the Child 5-17 years Physical Activity level (NR only) data
AIH13RCD.sas7bdat contains the Child 5-17 years Physical Activity detailed level (NR only) data


SPSS files

These files contain the data for the CURFs in SPSS format.

STATE/TERRITORY

AIH13SHH.sav contains the Household level data
AIH13SPH.sav contains the Persons in Household level data (All Persons)
AIH13SSP.sav contains the Person level data (Selected Person)
AIH13SCO.sav contains the Condition level data
AIH13SBI.sav contains the Biomedical level data

STATE/TERRITORY BY ASGS REMOTENESS

AIH13RHH.sav contains the Household level data
AIH13RPH.sav contains the Persons in Household level data (All Persons)
AIH13RSP.sav contains the Person level data (Selected Person)
AIH13RCO.sav contains the Condition level data
AIH13RBI.sav contains the Biomedical level data
AIH13RCP.sav contains the Child 5-17 years Physical Activity level (NR only) data
AIH13RCD.sav contains the Child 5-17 years Physical Activity detailed level (NR only) data


STATA files

These files contain the data for the CURFs in STATA format.

STATE/TERRITORY

AIH13SHH.dta contains the Household level data
AIH13SPH.dta contains the Persons in Household level data (All Persons)
AIH13SSP.dta contains the Person level data (Selected Person)
AIH13SCO.dta contains the Condition level data
AIH13SBI.dta contains the Biomedical level data

STATE/TERRITORY BY ASGS REMOTENESS

AIH13RHH.dta contains the Household level data
AIH13RPH.dta contains the Persons in Household level data (All Persons)
AIH13RSP.dta contains the Person level data (Selected Person)
AIH13RCO.dta contains the Condition level data
AIH13RBI.dta contains the Biomedical level data
AIH13RCP.dta contains the Child 5-17 years Physical Activity level (NR only) data
AIH13RCD.dta contains the Child 5-17 years Physical Activity detailed level (NR only) data


Information files

FORMATS.sas7bcat is a SAS catalog containing formats. There is one produced for the State/Territory CURF and one for the State/Territory by ASGS Remoteness CURF.

Frequency files

STATE/TERRITORY

ECURF AIH13S Household Freq.txt contains frequencies for Household level data
ECURF AIH13S Persons in Household Freq.txt contains frequencies for Persons in Household level data (All Persons)
ECURF AIH13S Person Freq.txt contains frequencies for Person level data (Selected Person)
ECURF AIH13S Condition Freq.txt contains frequencies for Condition level data
ECURF AIH13S Biomedical Unweighted Freq.txt contains unweighted frequencies for Biomedical level data
ECURF AIH13S Biomedical Weighted Freq.txt contains weighted frequencies for Biomedical level data

STATE/TERRITORY BY ASGS REMOTENESS

ECURF AIH13R Household Freq.txt contains frequencies for Household level data
ECURF AIH13R Persons in Household Freq.txt contains frequencies for Persons in Household level data (All Persons)
ECURF AIH13R Person Freq.txt contains frequencies for Person level data (Selected Person)
ECURF AIH13R Condition Freq.txt contains frequencies for Condition level data
ECURF AIH13R Biomedical Unweighted Freq.txt contains unweighted frequencies for Biomedical level data
ECURF AIH13R Biomedical Weighted Freq.txt contains weighted frequencies for Biomedical level data
ECURF AIH13R Child Physical Activity Freq.txt contains frequencies for Child 5-17 years Physical Activity level (NR only) data
ECURF AIH13R Child Physical Activity Detailed Freq.txt contains frequencies for Child 5-17 years Physical Activity detailed level (NR only) data