4324.0.55.002 - Microdata: Australian Health Survey, Nutrition and Physical Activity, 2011-12 Quality Declaration 
Latest ISSUE Released at 11:30 AM (CANBERRA TIME) 13/11/2014  First Issue

USING THE EXPANDED CURF


ABOUT THE EXPANDED CURF


The NNPAS 2011–12 Expanded Confidentialised Unit Record File (CURF) contains unit records relating to all of the survey respondents. The data are released under the Census and Statistics Act 1905, which has provision for the release of data in the form of unit records where the information is not likely to enable the identification of a particular person or organisation. Accordingly, there are no names or addresses of survey respondents on the CURF and other steps, including the following list of actions, have been taken to protect the confidentiality of respondents:
        • the level of detail of many data items has been reduced by grouping, ranging or top coding values
        • some unusual records have been changed to protect against identification
        • some data items that were collected have been excluded
        • income data has been perturbed.

The nature of the changes made and the relatively small number of records involved mean that the effect on data for analysis purposes is considered negligible.

The changes mean that estimates produced from the CURF may differ from those published in Australian Health Survey: Physical Activity - 2011-12 (cat. no. 4364.0.55.004) or subsequent publications.

Detailed information about the data collected, comments regarding data quality and other points to assist in using and interpreting the data are contained in Australian Health Survey: Users' Guide, 2011-13 (cat. no. 4363.0.55.001).


ACCESSING EXPANDED CURFS


Expanded CURFs can be accessed via the Remote Access Data Laboratory (RADL) and/or the DataLab. Users must have applied for use of the RADL and/or DataLab prior to using the Expanded CURF microdata.


COUNTS AND WEIGHTS


NUMBER OF RECORDS BY LEVEL, NNPAS 2011-12 EXPANDED CURF

Level | Record count (unweighted) | Weighted count (if applicable)
Household level | 9 519 | 8 581 354
Persons in Household level (All persons) | 23 464 | N/A
Person level (Selected persons) | 12 153 | 21 526 456
Conditions level | 15 897 | N/A
Child 2-4 Physical Activity Day level | 16 203 | N/A
Child 5-17 Physical Activity Day level | 24 411 | N/A
Child 5-17 Physical Activity Detailed level | 31 789 | N/A
Adult Physical Activity level | 13 474 | N/A
Pedometer level | 51 341 | N/A
Biomedical level (Persons 5+) | 12 153 | 20 649 321
Food level | 341 897 | N/A
Supplement level | 25 141 | N/A
Australian Dietary Guidelines level | 3 102 528 | N/A


Weights and Hierarchical Files


Weight Variables

There are three weight variables on the file:

        • Household Weight (NPAHHWT) - Household level - Benchmarked
        • Person Weight (NPAFINWT) - Selected Person level - Benchmarked to the total population aged 2 years and over
        • Biomedical Weight (NHMSPERW) - Biomedical level - Benchmarked to the total population aged 5 years and over. Note that this level also contains records for non-biomedical participants; however, their biomedical weight is set to 0 so they will not contribute to estimates. When using biomedical variables in conjunction with other variables on the biomedical level or with variables from other levels, the biomedical weight should be used.

There is no weight associated with the Persons in Household level. This level is available in order to produce compositional information about the household (e.g. number of persons in the household aged 4-14 years). That information can then be used either with the household weight to represent, for example, the number of households with at least two persons aged 4-14 years, or with the person weight to represent the number of people living in households that contain at least two persons aged 4-14 years.
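The household composition example above can be sketched in SAS as follows. This is an illustration only: it assumes that an age item named AGEC (in single years) is present on the Persons in Household file, which should be checked against the Data Item List, and the dataset names HHCOMP, HH and HHFLAG and the variables N4TO14 and TWOPLUS are hypothetical.

```sas
/* Count persons aged 4-14 years in each household (All Persons level). */
/* ASSUMPTION: AGEC is available on NPA11EAP in single years.           */
PROC SQL;
  CREATE TABLE HHCOMP AS
  SELECT ABSHID,
         SUM(CASE WHEN AGEC BETWEEN 4 AND 14 THEN 1 ELSE 0 END) AS N4TO14
  FROM NPA11EAP
  GROUP BY ABSHID;
QUIT;

PROC SORT DATA=HHCOMP;
  BY ABSHID;
RUN;

PROC SORT DATA=NPA11EHH OUT=HH; /* Household level, sorted copy */
  BY ABSHID;
RUN;

/* Attach the count to each household and flag households with two or more */
DATA HHFLAG;
  MERGE HH (IN=INHH) HHCOMP;
  BY ABSHID;
  IF INHH;
  TWOPLUS = (N4TO14 >= 2); /* 1 = at least two persons aged 4-14, else 0 */
RUN;

/* Weighted number of households, using the household weight */
PROC FREQ DATA=HHFLAG;
  TABLES TWOPLUS;
  WEIGHT NPAHHWT;
RUN;
```

Merging the same flag onto the Selected Person file by ABSHID and weighting by NPAFINWT would instead estimate the number of people living in such households.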

There is also no weight associated with the other levels. This is because the records are repeated for each person. If, for example, NPAFINWT is merged onto the Conditions level, it will be attached to each condition record and therefore be repeated for each person where they have more than one condition. This should be considered when producing tables. See 'Copying information across levels' below for more information.

For more information about weights see 'Reliability of Estimates' below.

Using Weights

The NNPAS is a sample survey. To produce estimates for the in-scope population you must use weight fields in your calculations. The 'Biomedical Weight (Benchmarked weight)' must be used for all tables where a biomedical level data item is being used. This includes where biomedical items are being used with items from other levels. Which weight, if any, is used on data at non-benchmarked levels will affect the result as shown in the examples below:



Level of data item | Estimates if using Household Weight | Estimates if using Person Weight
Household level | Households with the specified characteristics. | Persons in households with the specified characteristics.
Persons in Household level (All persons) | Households containing one or more persons with the specified characteristics. | Persons in households containing one or more persons with the specified characteristics.
Person level (Selected persons) | Households containing one or more selected persons with the specified characteristics. | Persons with the specified characteristics.
Conditions level | Households containing one or more selected persons with one or more conditions with the specified characteristics. | Persons with one or more conditions with the specified characteristics.
Child 2-4 Physical Activity Day level | Households containing one or more selected persons with one or more physical activity days with the specified characteristics. | Persons with one or more physical activity days with the specified characteristics.
Child 5-17 Physical Activity Day level | Households containing one or more selected persons with one or more physical activity days with the specified characteristics. | Persons with one or more physical activity days with the specified characteristics.
Child 5-17 Physical Activity Detailed level | Households containing one or more selected persons with one or more physical activity types with the specified characteristics. | Persons with one or more physical activity types with the specified characteristics.
Adult Physical Activity level | Households containing one or more selected persons with one or more physical activity types with the specified characteristics. | Persons with one or more physical activity types with the specified characteristics.
Pedometer level | Households containing one or more selected persons with one or more pedometer days with the specified characteristics. | Persons with one or more pedometer days with the specified characteristics.
Biomedical level (Persons 5+) | Households containing one or more selected persons with one or more specified biomedical characteristics. | Persons with the specified biomedical characteristics.
Food level | Households containing one or more persons with one or more food days with the specified characteristics. | Persons with one or more food days with the specified characteristics.
Supplement level | Households containing one or more persons with one or more supplement days with the specified characteristics. | Persons with one or more supplement days with the specified characteristics.
Australian Dietary Guidelines (ADG) level | Households containing one or more persons with one or more food days with the specified summary characteristics. | Persons with one or more food days with the specified summary characteristics.
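As an illustration of the Conditions level entry under the person weight ("Persons with one or more conditions with the specified characteristics"), the following sketch counts each person once, however many matching condition records they have. It assumes, as in the SAS example under 'Copying information across levels', that AHSSTAT=1 identifies a diagnosed long-term condition; the names ANYLT and EST_PERSONS are hypothetical.

```sas
PROC SQL;
  /* One row per selected person with at least one matching condition */
  CREATE TABLE ANYLT AS
  SELECT DISTINCT ABSHID, ABSAID, ABSGID
  FROM NPA11ECN
  WHERE AHSSTAT = 1;

  /* Sum the person weight once per person, not once per condition record */
  SELECT SUM(P.NPAFINWT) AS EST_PERSONS
  FROM NPA11ESP AS P
       INNER JOIN ANYLT AS C
       ON  P.ABSHID = C.ABSHID
       AND P.ABSAID = C.ABSAID
       AND P.ABSGID = C.ABSGID;
QUIT;
```

Summing NPAFINWT directly over condition records would instead count a person once for every matching condition, which overstates the number of persons.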




IDENTIFIERS

Every record on each level of the file is uniquely identified.

The identifiers ABSHID, ABSAID, ABSGID, ABSCID, ABSTID, ABSKID, ABSDID, ABSUID, ABSEID, ABSBID, ABSFID, ABSSID and ABSLFID appear on all levels of the file. Where the information for the identifier is not relevant for a level, it has a value of 0. See the Data Item List for details on which IDs equate to which levels.

Each household has a unique thirteen digit random identifier, ABSHID. This identifier appears on the household level and is repeated on each level on each record pertaining to that household. The combination of identifiers uniquely identifies a record at a particular level as shown below.

1. Household = ABSHID
2. All Persons in Household = ABSHID, ABSAID
3. Selected Person = ABSHID, ABSAID, ABSGID
4. Conditions = ABSHID, ABSAID, ABSGID, ABSCID
5. Child 2-4 Physical Activity Day = ABSHID, ABSAID, ABSGID, ABSTID
6. Child 5-17 Physical Activity Day = ABSHID, ABSAID, ABSGID, ABSKID
7. Child 5-17 Physical Activity Detailed = ABSHID, ABSAID, ABSGID, ABSKID, ABSDID
8. Adult Physical Activity = ABSHID, ABSAID, ABSGID, ABSUID
9. Pedometer = ABSHID, ABSAID, ABSGID, ABSEID
10. Biomedical = ABSHID, ABSAID, ABSGID, ABSBID
11. Food = ABSHID, ABSAID, ABSGID, ABSFID
12. Supplement = ABSHID, ABSAID, ABSGID, ABSSID
13. ADG = ABSHID, ABSAID, ABSGID, ABSLFID
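One way to confirm that a combination of identifiers uniquely identifies the records on a level is sketched below for the Conditions level. The output dataset names CNKEYS and CNDUPS are hypothetical; the check assumes the file is available under the name NPA11ECN, as in the other examples in this document.

```sas
/* Write one record per key combination to CNKEYS and any records with a */
/* repeated key to CNDUPS. The source file itself is left unchanged.      */
PROC SORT DATA=NPA11ECN OUT=CNKEYS NODUPKEY DUPOUT=CNDUPS;
  BY ABSHID ABSAID ABSGID ABSCID;
RUN;
```

If the identifiers are unique as described, CNDUPS will contain no observations and the SAS log will report that no duplicate keys were deleted.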

ABSHID assists with linking together people of the same household and also with household characteristics such as geography (located on the household level). The combination of ABSHID, ABSAID, ABSGID and ABSCID identifies each individual condition record for a person. When merging data with a level above, only the identifiers relevant to the level above are required. However, when merging, for example, the Conditions level with the Person level, the data from the Person level will be duplicated for each condition. See 'Copying information across levels' below for more information.

Copying information across levels


The following SAS code is an example of copying information from a lower level to a level above:

PROC SORT DATA=NPA11ECN; /* Conditions level */
  BY ABSHID ABSAID ABSGID;
RUN;

/* This step goes through each Condition record within each unique combination of ABSHID, ABSAID and ABSGID */
DATA TTLLT (KEEP=ABSHID ABSAID ABSGID LONGTERM NOTCURR);
  SET NPA11ECN;
  BY ABSHID ABSAID ABSGID;

  RETAIN LONGTERM NOTCURR; /* Carry the running counts from one record to the next */

  /* As the file is sorted by these IDs, reference to the first is only needed for the last part of the ID */
  IF FIRST.ABSGID THEN DO;
    LONGTERM=0;
    NOTCURR=0;
  END;

  IF AHSSTAT=1 THEN LONGTERM=LONGTERM+1; /* Counts the number of diagnosed long-term conditions */
  IF AHSSTAT=3 THEN NOTCURR=NOTCURR+1; /* Counts the number of diagnosed conditions that are not current */

  IF LAST.ABSGID THEN OUTPUT; /* Outputs the totals found within each unique combination of ABSHID, ABSAID and ABSGID */
RUN;

PROC SORT DATA=NPA11ESP; /* Selected Person: the level above Conditions */
  BY ABSHID ABSAID ABSGID;
RUN;

DATA MRGFILES;
  MERGE TTLLT NPA11ESP;
  BY ABSHID ABSAID ABSGID;
RUN;

PROC FREQ DATA=MRGFILES; /* Gives a weighted count of the data copied up from the Conditions level to the Selected Person level */
  TABLES LONGTERM NOTCURR;
  WEIGHT NPAFINWT;
RUN;

The new variables LONGTERM and NOTCURR give the number of collected conditions a person has that are either diagnosed/long-term or diagnosed/not current. They are therefore meaningful on the Person level, where only one value per person is produced.

This process allows them to be analysed with any other items on the person level and for weighted estimates to be correctly produced.

The following SAS code is an example of copying information from a higher level to a level below:

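DATA PERSON (KEEP=ABSHID ABSAID ABSGID AGEC SEX); /* Keep only the identifiers and the items to be copied down */
  SET NPA11ESP;
RUN;

PROC SORT DATA=PERSON;
  BY ABSHID ABSAID ABSGID;
RUN;

PROC SORT DATA=NPA11ECN; /* Conditions level */
  BY ABSHID ABSAID ABSGID;
RUN;

DATA MRGFILES;
  MERGE NPA11ECN PERSON;
  BY ABSHID ABSAID ABSGID;
RUN;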

This merge matches one Person record to many Conditions records. So, the data items copied from the person level ('AGEC' and 'SEX' in the example) will be repeated for the counting unit of the level they have been added to, Conditions in this case. Each Conditions record will therefore receive the AGEC and SEX of the Person they belong to.


MULTI-RESPONSE ITEMS


A number of questions in the survey allowed respondents to provide one or more responses. Each response category for these multi-response data items is treated as a separate data item. On the CURF, these data items share the same identifier (SAS name) prefix but are each separately suffixed with a letter - A for the first response, B for the second response, C for the third response and so on.

For example, the multi-response data item 'Type of diet currently on' has thirteen response categories (excluding not applicable). There are thirteen data items named TYPDIETA, TYPDIETB, TYPDIETC...TYPDIETM. Each data item in the series has a 'Yes' response code and a 'Null' response code indicating that the response was not relevant for the respondent. The example TYPDIET (A--M) places the not applicable response (code 97, where the question was not asked of the respondent) in the first item, TYPDIETA. So TYPDIETA has three response codes: the 'Yes' response code of 10 (Weight loss or low calorie diet), the 'Null' response code of 0 and the not applicable code of 97. The remaining items TYPDIETB--M have just two response codes each. The Data Item List identifies all multi-response items and lists the corresponding codes with the corresponding response categories.

Note that the sum of individual multi-response categories will be greater than the population applicable to the particular data item as respondents are able to select more than one response.
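The following SAS sketch shows one way of working with the TYPDIET series. It assumes that the thirteen items are numeric and are held on the Selected Person file (both should be confirmed in the Data Item List); the dataset name DIET and the variable NDIETS are hypothetical.

```sas
DATA DIET;
  SET NPA11ESP;
  ARRAY TD{13} TYPDIETA TYPDIETB TYPDIETC TYPDIETD TYPDIETE TYPDIETF TYPDIETG
               TYPDIETH TYPDIETI TYPDIETJ TYPDIETK TYPDIETL TYPDIETM;

  /* Count the 'Yes' responses: any code other than Null (0) or not applicable (97) */
  NDIETS = 0;
  DO I = 1 TO 13;
    IF TD{I} > 0 AND TD{I} NE 97 THEN NDIETS = NDIETS + 1;
  END;
  DROP I;
RUN;

/* Weighted persons for each response category, and by number of responses given */
PROC FREQ DATA=DIET;
  TABLES TYPDIETA TYPDIETB TYPDIETC TYPDIETD TYPDIETE TYPDIETF TYPDIETG
         TYPDIETH TYPDIETI TYPDIETJ TYPDIETK TYPDIETL TYPDIETM NDIETS;
  WEIGHT NPAFINWT;
RUN;
```

Because a person can have a 'Yes' code on several items, the weighted 'Yes' counts across the thirteen tables can add to more than the applicable population, whereas the NDIETS table counts each person once.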


RELIABILITY OF ESTIMATES


As the survey was conducted on a sample of private households in Australia, it is important to take account of the method of sample selection when deriving estimates from the CURF. This is particularly important as a person's chance of selection in the survey varied depending on the state or territory in which the person lived. If these chances of selection are not accounted for by use of appropriate weights, the results will be biased. For details on the NNPAS weighting process, see Weighting, Benchmarks and Estimation procedures in Australian Health Survey: Users' Guide, 2011-13 (cat. no. 4363.0.55.001).

Each person record has a main weight (NPAFINWT). This weight indicates how many population units are represented by the sample units. When producing estimates of sub-populations from the CURF, it is essential that they are calculated by adding the weights of persons in each category and not just by counting the sample number in each category. If each person's weight were to be ignored when analysing the data to draw inferences about the population, then no account would be taken of a person's chance of selection or of different response rates across population groups, with the result that the estimates produced could be biased. The application of weights ensures that estimates will conform to an independently estimated distribution of the population by age, by sex, etc. rather than to the distributions within the sample itself.

Each person record on the CURF contains 60 replicate weights in addition to the main weight. Replicate weights can be used to calculate measures of sampling error. For details on sampling error calculations and replicate weights, see Technical Note.
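A rough sketch of a standard error calculation from the replicate weights is given below. It rests on several assumptions that should be checked against the Technical Note: that the delete-a-group jackknife form SE = sqrt((59/60) x sum of squared differences between each replicate estimate and the main estimate) applies, and that the replicate weights can be referred to as REPWT1 to REPWT60, which are hypothetical names used here only for illustration. SEX = 2 is used purely as an example category code, and TOTALS and SE_EST are hypothetical dataset names.

```sas
/* Weighted totals for the subgroup: one using the main weight and */
/* one for each of the 60 replicate weights (hypothetical names).  */
PROC MEANS DATA=NPA11ESP NOPRINT;
  WHERE SEX = 2;
  VAR NPAFINWT REPWT1-REPWT60;
  OUTPUT OUT=TOTALS SUM=; /* output variables keep the input names */
RUN;

DATA SE_EST;
  SET TOTALS;
  ARRAY REP{60} REPWT1-REPWT60;

  /* Sum of squared differences between replicate and main estimates */
  SUMSQ = 0;
  DO G = 1 TO 60;
    SUMSQ = SUMSQ + (REP{G} - NPAFINWT)**2;
  END;

  SE  = SQRT((59/60) * SUMSQ); /* ASSUMED jackknife formula, see Technical Note */
  RSE = 100 * SE / NPAFINWT;   /* relative standard error, per cent */
  KEEP NPAFINWT SE RSE;
RUN;
```

In the output dataset, NPAFINWT holds the estimate itself (the sum of the main weights for the subgroup), with SE and RSE alongside it.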


EXPANDED CURF FILES

SAS files

These files contain the data for the CURF in SAS format.

NPA11EHH.sas7bdat contains the Household level data
NPA11EAP.sas7bdat contains the Persons in Household level data (All Persons)
NPA11ESP.sas7bdat contains the Person level data (Selected Person)
NPA11ECN.sas7bdat contains the Conditions level data
NPA11ETP.sas7bdat contains the Child 2-4 Physical Activity Day level data
NPA11ECP.sas7bdat contains the Child 5-17 Physical Activity Day level data
NPA11ECD.sas7bdat contains the Child 5-17 Physical Activity Detailed level data
NPA11EAD.sas7bdat contains the Adult Physical Activity level data
NPA11EPD.sas7bdat contains the Pedometer level data
NPA11EBI.sas7bdat contains the Biomedical level data
NPA11EFD.sas7bdat contains the Food level data
NPA11ESU.sas7bdat contains the Supplement level data
NPA11EAG.sas7bdat contains the ADG level data

SPSS files

These files contain the data for the CURF in SPSS format.

NPA11EHH.sav contains the Household level data
NPA11EAP.sav contains the Persons in Household level data (All Persons)
NPA11ESP.sav contains the Person level data (Selected Person)
NPA11ECN.sav contains the Conditions level data
NPA11ETP.sav contains the Child 2-4 Physical Activity Day level data
NPA11ECP.sav contains the Child 5-17 Physical Activity Day level data
NPA11ECD.sav contains the Child 5-17 Physical Activity Detailed level data
NPA11EAD.sav contains the Adult Physical Activity level data
NPA11EPD.sav contains the Pedometer level data
NPA11EBI.sav contains the Biomedical level data
NPA11EFD.sav contains the Food level data
NPA11ESU.sav contains the Supplement level data
NPA11EAG.sav contains the ADG level data

STATA files

These files contain the data for the CURF in STATA format.

NPA11EHH.dta contains the Household level data
NPA11EAP.dta contains the Persons in Household level data (All Persons)
NPA11ESP.dta contains the Person level data (Selected Person)
NPA11ECN.dta contains the Conditions level data
NPA11ETP.dta contains the Child 2-4 Physical Activity Day level data
NPA11ECP.dta contains the Child 5-17 Physical Activity Day level data
NPA11ECD.dta contains the Child 5-17 Physical Activity Detailed level data
NPA11EAD.dta contains the Adult Physical Activity level data
NPA11EPD.dta contains the Pedometer level data
NPA11EBI.dta contains the Biomedical level data
NPA11EFD.dta contains the Food level data
NPA11ESU.dta contains the Supplement level data
NPA11EAG.dta contains the ADG level data

Information files

FORMATS.sas7bcat is a SAS library containing formats
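As a sketch of how the formats catalogue might be used with the SAS files, the statements below point SAS at a library containing both the data files and FORMATS.sas7bcat. The library reference CURF and the path are hypothetical, and the access environment may already define the library, in which case only the OPTIONS statement would be relevant.

```sas
LIBNAME CURF '/path/to/curf/files'; /* hypothetical location of the CURF files */

/* Search the FORMATS catalogue in the CURF library when applying formats */
OPTIONS FMTSEARCH=(CURF.FORMATS WORK);

/* Weighted person estimates by sex, shown with formatted category labels */
PROC FREQ DATA=CURF.NPA11ESP;
  TABLES SEX;
  WEIGHT NPAFINWT;
RUN;
```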