ABOUT THE BASIC CURF
The NNPAS 2011–12 Basic CURF contains unit records relating to all of the survey respondents. The data are released under the Census and Statistics Act 1905, which has provision for the release of data in the form of unit records where the information is not likely to enable the identification of a particular person or organisation. Accordingly, the CURF contains no names or addresses of survey respondents, and other steps, including the following, have been taken to protect the confidentiality of respondents:
- the level of detail of many data items has been reduced by grouping, ranging or top coding values
- some unusual records have been changed to protect against identification
- some data items that were collected have been excluded
- income data has been perturbed.
The nature of the changes made, and the relatively small number of records involved, mean that the effect on data for analysis purposes is considered negligible.
The changes mean that estimates produced from the CURF may differ from those published in Australian Health Survey: Nutrition First Results - Foods and Nutrients, 2011-12 (cat. no. 4364.0.55.007) or subsequent publications.
Detailed information about the data collected, comments regarding data quality and other points to assist in using and interpreting the data are contained in Australian Health Survey: Users' Guide, 2011-13 (cat. no. 4363.0.55.001). It is recommended that relevant parts of the guide be read in conjunction with the use of the NNPAS 2011-12 Basic CURF.
COUNTING UNITS AND WEIGHTS
NUMBER OF RECORDS BY LEVEL, NNPAS 2011-12 BASIC CURF
|Level|RECORD COUNTS (unweighted)|WEIGHTED COUNTS (if applicable)|
|Person level (Selected persons)||21 526 456|
|Biomedical level (Persons 5+)||20 649 321|
|Australian Dietary Guidelines level|3 102 528||
The counting unit for the person level is the (selected) person/s, for the food level it is foods, for the supplement level it is supplements and for the ADG level it is food summaries. There is a weight attached to the person level in order to estimate the total population of the relevant counting unit, in this case persons. The person weight is called NPAFINWT.
Note that only weighted counts on the person level will produce an estimate of the total number of persons with the specified characteristics. This is because the food, supplement and ADG records are repeated for each person. If, for example, NPAFINWT is merged onto the food level, it will be attached to each food record and therefore be repeated for each person. Information should be copied to the person level in order to create weighted estimates. See 'Copying information across levels' below for an example. For more information about weights, see 'Reliability of Estimates' below.
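The effect of repeating the person weight on a lower level can be seen in a small made-up example. The following sketch uses two invented persons and four invented food records; the dataset names PERS, FOOD and FOODWT and all values are hypothetical, and only ABSPID, ABSFID and NPAFINWT are names taken from the CURF.

```sas
/* Toy person level: two invented persons with invented weights */
DATA PERS;
  INPUT ABSPID NPAFINWT;
  DATALINES;
1 1500
2 2000
;
RUN;

/* Toy food level: person 1 has three food records, person 2 has one */
DATA FOOD;
  INPUT ABSPID ABSFID;
  DATALINES;
1 1
1 2
1 3
2 1
;
RUN;

/* Merge the person weight onto the food level (both datasets are already in ABSPID order) */
DATA FOODWT;
  MERGE FOOD (IN=INFOOD) PERS;
  BY ABSPID;
  IF INFOOD;
RUN;

/* Person level: sum of weights is 1500 + 2000 = 3500, an estimate of persons */
PROC MEANS DATA=PERS SUM;
  VAR NPAFINWT;
RUN;

/* Food level: sum of weights is 3*1500 + 2000 = 6500, which is not a count of persons */
PROC MEANS DATA=FOODWT SUM;
  VAR NPAFINWT;
RUN;
```

In this toy case the food level total (6500) overstates the person level total (3500) because the weight of person 1 is counted once for each of that person's three food records.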
There is also a biomedical weight for the Biomedical level, which is called NHMSPERW. Records on this level are benchmarked to the total population aged 5 years and over. Note that this level also contains records for persons who did not participate in the biomedical component; their biomedical weight is set to 0 so they will not contribute to estimates. When using biomedical data items in conjunction with other items on the biomedical level or with items from other levels, the biomedical weight should be used.
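As a minimal sketch of how the zero weights behave, the following example tabulates an invented biomedical item using NHMSPERW. The dataset name BIOM, the item BIOFLAG, its codes and all values are hypothetical; only ABSPID and NHMSPERW are names from the CURF.

```sas
/* Toy biomedical level: person 3 is a non-participant, so the biomedical weight is 0 */
/* BIOFLAG is a hypothetical item: 1 and 2 are invented result codes, 0 is an invented not applicable code */
DATA BIOM;
  INPUT ABSPID NHMSPERW BIOFLAG;
  DATALINES;
1 1800 1
2 2200 2
3 0 0
4 1000 1
;
RUN;

/* Weighted frequencies: BIOFLAG=1 gives 1800 + 1000 = 2800 and BIOFLAG=2 gives 2200 */
/* The zero-weight record contributes nothing to the weighted frequencies */
PROC FREQ DATA=BIOM;
  TABLES BIOFLAG;
  WEIGHT NHMSPERW;
RUN;
```

By default PROC FREQ ignores observations with a weight of 0, so the non-participant record does not need to be filtered out beforehand in this sketch.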
Every record on each level of the file is uniquely identified.
The identifiers ABSPID, ABSFID, ABSSID, ABSBID, ABSLFID and ABSHID appear on all levels of the file. Where the information for the identifier is not relevant for a level, it has a value of 0. See the Data Item List for details on which IDs equate to which levels.
Each person has a unique fourteen-digit random identifier, ABSPID. This identifier appears on the person level and is repeated on each level on each record pertaining to that person. On the food level, the item ABSFID sequentially numbers each food record within each person record. The combination of ABSPID and ABSFID uniquely identifies each food record. On the supplement level, the item ABSSID sequentially numbers each supplement record within each person record. The combination of ABSPID and ABSSID uniquely identifies each supplement record. On the Australian Dietary Guidelines level, ABSLFID sequentially numbers each ADG summary record within each person record. The combination of ABSPID and ABSLFID uniquely identifies each ADG summary record.
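One way to confirm that a combination of identifiers is unique on a level is to sort with NODUPKEY and inspect the records that are set aside. The following is a sketch on invented food records; the dataset names FOOD, FOODCHK and FOODDUP and all values are hypothetical.

```sas
/* Toy food level: two invented persons with two food records each */
DATA FOOD;
  INPUT ABSPID ABSFID;
  DATALINES;
1 1
1 2
2 1
2 2
;
RUN;

/* NODUPKEY keeps the first record for each ABSPID/ABSFID combination */
/* DUPOUT= writes any further records with a repeated key to FOODDUP */
PROC SORT DATA=FOOD OUT=FOODCHK NODUPKEY DUPOUT=FOODDUP;
  BY ABSPID ABSFID;
RUN;

/* FOODDUP has no observations when the key is unique, as it is for the toy data above */
PROC PRINT DATA=FOODDUP;
RUN;
```

The same check could be applied to the supplement level with ABSPID and ABSSID, or to the ADG level with ABSPID and ABSLFID.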
ABSHID uniquely identifies each household, but note that due to the absence of a household level on the Basic CURF, ABSHID is not needed for sorting, merging and/or copying information between the five levels. ABSHID aids in analysis of household characteristics by relating members of the same household.
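As an illustration of relating members of the same household, the following sketch counts the person records that share each ABSHID. The dataset names PERS and HHSIZE, the derived item NSEL and all values are hypothetical.

```sas
/* Toy person level: persons 1 and 2 belong to the same invented household */
DATA PERS;
  INPUT ABSPID ABSHID;
  DATALINES;
1 101
2 101
3 102
;
RUN;

/* One row per household with the number of person records it contains */
/* Household 101 has 2 person records and household 102 has 1 */
PROC SQL;
  CREATE TABLE HHSIZE AS
  SELECT ABSHID, COUNT(*) AS NSEL
  FROM PERS
  GROUP BY ABSHID;
QUIT;
```

A table such as HHSIZE could then be merged back onto the person level by ABSHID if a household characteristic is needed on each person record.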
Copying information across levels
Much of the important data from the food and supplement level has already been copied to the person level. The person level file contains data items summing a person's nutrient intake for day one total, day one food only, day one supplements only, day two total, day two food only and day two supplements only. These are provided for each of the 44 nutrients (however, there are no supplement totals for nutrients not measured on the supplement level).
The following SAS code is an example of copying information from a lower level to a level above:
PROC SORT DATA=NPA11BF; BY ABSPID; RUN; /*sorts the food level by person identifier*/
DATA TTLBREAD (KEEP=ABSPID BRDT1 BRDT2);
SET NPA11BF;
BY ABSPID;
RETAIN BRDT1 BRDT2;
IF FIRST.ABSPID THEN DO; BRDT1=0; BRDT2=0; END; /*resets the totals at the first food record of each person*/
IF THRDIGC=122 AND ENERGYWF>0 AND DAYNUM=1 THEN BRDT1=SUM(BRDT1,ENERGYWF); /*sums the energy with dietary fibre intake for each record in the food group 'regular breads, and bread rolls (plain/unfilled/untopped varieties)' for day 1*/
IF THRDIGC=122 AND ENERGYWF>0 AND DAYNUM=2 THEN BRDT2=SUM(BRDT2,ENERGYWF); /*sums the energy with dietary fibre intake for each record in the food group 'regular breads, and bread rolls (plain/unfilled/untopped varieties)' for day 2*/
IF LAST.ABSPID THEN OUTPUT; /*writes one record per person*/
RUN;
PROC SORT DATA=NPA11BP; BY ABSPID; RUN;
DATA MRGFILES;
MERGE TTLBREAD NPA11BP;
BY ABSPID;
RUN;
PROC FREQ DATA=MRGFILES; /*This procedure gives a weighted count of the data copied up from the food level*/
TABLES BRDT1 BRDT2;
WEIGHT NPAFINWT;
RUN;
The new data items BRDT1 and BRDT2 are a sum of the energy with dietary fibre of regular breads, and bread rolls (plain/unfilled/untopped varieties) for each person per day of intake. They are therefore meaningful on the person level, where only one value per record is produced for each variable. If a person has no day two intake then BRDT2=0. Merging the new data items onto the person level allows them to be analysed with any other items on the person level and for weighted estimates to be correctly produced.
The following SAS code is an example of copying information from a higher level to a level below:
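PROC SORT DATA=NPA11BS; BY ABSPID; RUN;
PROC SORT DATA=NPA11BP; BY ABSPID; RUN;
DATA SUPPPERS; /*SUPPPERS is an illustrative name for the merged output dataset*/
MERGE NPA11BS (IN=INSUPP) NPA11BP (KEEP=ABSPID ABSHID AGEC SEX);
BY ABSPID; IF INSUPP; /*keeps supplement records only, so persons with no supplement records are dropped*/
RUN;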
This merge matches one person record to many supplement records. So, the data items copied from the person level ('AGEC' and 'SEX' in the example) will be repeated for the counting unit of the level they have been added to, supplements in this case.
A number of questions in the survey allowed respondents to provide one or more responses. Each response category for these multi-response data items is treated as a separate data item. On the CURF, these data items share the same identifier (SAS name) prefix but are each separately suffixed with a letter - A for the first response, B for the second response, C for the third response and so on.
For example, the multi-response data item 'Type of diet currently on' has thirteen response categories (excluding not applicable). There are thirteen data items named TYPDIETA, TYPDIETB, TYPDIETC...TYPDIETM. Each data item in the series has a 'Yes' response code and a 'Null' response code indicating that the response was not relevant for the respondent. In the TYPDIETA to TYPDIETM series, the not applicable response (code 97, where the question was not asked of the respondent) is placed in the first item, TYPDIETA. So TYPDIETA has three response codes: the 'Yes' response code of 10 (Weight loss or low calorie diet), the 'Null' response code of 0 and the not applicable code of 97. The remaining items, TYPDIETB to TYPDIETM, have just two response codes each. The data item list identifies all multi-response items and lists the corresponding codes with the corresponding response categories.
Note that the sum of individual multi-response categories will be greater than the population applicable to the particular data item as respondents are able to select more than one response.
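The following sketch illustrates this on invented data, using only the first three of the thirteen TYPDIET items. The dataset names, the derived items NDIETS and APPLIC, the 'Yes' codes 11 and 12 for TYPDIETB and TYPDIETC, and all values are hypothetical; the codes 10, 0 and 97 for TYPDIETA are those described above.

```sas
/* Toy person level: person 1 gave two responses, person 3 was not asked the question */
DATA PERS;
  INPUT ABSPID NPAFINWT TYPDIETA TYPDIETB TYPDIETC;
  DATALINES;
1 1500 10 11 0
2 2000 0 11 0
3 1000 97 0 0
4 2500 0 0 12
;
RUN;

DATA DIETFLAG;
  SET PERS;
  ARRAY DIET{*} TYPDIETA TYPDIETB TYPDIETC;
  NDIETS = 0;
  /* Any code other than Null (0) or not applicable (97) is treated as a 'Yes' response */
  DO I = 1 TO DIM(DIET);
    IF DIET{I} NOT IN (0, 97) THEN NDIETS = NDIETS + 1;
  END;
  /* APPLIC is 1 when the question was asked of the respondent, otherwise 0 */
  APPLIC = (TYPDIETA NE 97);
  DROP I;
RUN;

/* Weighted sum of responses: 2*1500 + 2000 + 0 + 2500 = 7500 */
/* Weighted applicable population: 1500 + 2000 + 2500 = 6000 */
PROC MEANS DATA=DIETFLAG SUM;
  VAR NDIETS APPLIC;
  WEIGHT NPAFINWT;
RUN;
```

The weighted number of responses (7500) exceeds the weighted applicable population (6000) because person 1 is counted under two response categories.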
RELIABILITY OF ESTIMATES
As the survey was conducted on a sample of private households in Australia, it is important to take account of the method of sample selection when deriving estimates from the CURF. This is particularly important as a person's chance of selection in the survey varied depending on the state or territory in which the person lived. If these chances of selection are not accounted for by use of appropriate weights, the results will be biased. For details on the NNPAS weighting process, see Weighting, Benchmarks and Estimation procedures in Australian Health Survey: Users' Guide, 2011-13 (cat. no. 4363.0.55.001).
Each person record has a main weight (NPAFINWT). This weight indicates how many population units are represented by the sample units. When producing estimates of sub-populations from the CURF, it is essential that they are calculated by adding the weights of persons in each category and not just by counting the sample number in each category. If each person's weight were to be ignored when analysing the data to draw inferences about the population, then no account would be taken of a person's chance of selection or of different response rates across population groups, with the result that the estimates produced could be biased. The application of weights ensures that estimates will conform to an independently estimated distribution of the population by age, by sex, etc. rather than to the distributions within the sample itself.
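The difference between counting sample records and adding weights can be sketched on invented data as follows. The dataset name PERS, the SEX codes 1 and 2 and all values are hypothetical.

```sas
/* Toy person level: five invented persons with invented weights */
DATA PERS;
  INPUT ABSPID SEX NPAFINWT;
  DATALINES;
1 1 1500
2 1 2500
3 2 1000
4 2 1200
5 2 1300
;
RUN;

/* Unweighted: counts sample records, giving 2 for SEX=1 and 3 for SEX=2 */
PROC FREQ DATA=PERS;
  TABLES SEX;
RUN;

/* Weighted: adds NPAFINWT within each category, giving 4000 for SEX=1 and 3500 for SEX=2 */
PROC FREQ DATA=PERS;
  TABLES SEX;
  WEIGHT NPAFINWT;
RUN;
```

In this toy case the unweighted table suggests that SEX=2 is the larger group, while the weighted estimates show the opposite, which is the kind of bias the weights are designed to remove.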
Each person record on the CURF contains 60 replicate weights in addition to the main weight. Replicate weights can be used to calculate measures of sampling error. For details on sampling error calculations and replicate weights, see Technical Note.
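As a rough sketch of how replicate weights might be used, the following example calculates a standard error for a weighted total on invented data. It assumes a delete-a-group jackknife of the form SE = sqrt((G-1)/G x sum of (replicate estimate - main estimate) squared), where G is the number of replicate weights; this form is an assumption here and the Technical Note remains the authoritative source for the method. The replicate weight names REPWT1 to REPWT3, the item FLAG, the other dataset and variable names and all values are hypothetical, and only three replicates are used to keep the example short.

```sas
%LET G = 3; /* toy value; the CURF carries 60 replicate weights */

/* Toy person level: FLAG=1 marks persons with the characteristic of interest */
DATA PERS;
  INPUT ABSPID FLAG NPAFINWT REPWT1-REPWT3;
  DATALINES;
1 1 1500 1600 1400 1500
2 0 2000 1900 2100 2000
3 1 1000 1100 1000 900
4 0 2500 2400 2500 2600
;
RUN;

/* Contribution of each person to the total under the main and replicate weights */
DATA PROD;
  SET PERS;
  ARRAY REP{&G} REPWT1-REPWT&G;
  ARRAY YR{&G} YREP1-YREP&G;
  YMAIN = FLAG * NPAFINWT;
  DO I = 1 TO &G;
    YR{I} = FLAG * REP{I};
  END;
  DROP I;
RUN;

/* Sum the contributions: YMAIN=2500, YREP1=2700, YREP2=2400, YREP3=2400 */
PROC MEANS DATA=PROD NOPRINT;
  VAR YMAIN YREP1-YREP&G;
  OUTPUT OUT=TOTALS SUM=;
RUN;

/* Assumed jackknife formula: sum of squared differences is 60000, so SE is about 200 */
DATA SE;
  SET TOTALS;
  ARRAY YR{&G} YREP1-YREP&G;
  SSQ = 0;
  DO I = 1 TO &G;
    SSQ = SSQ + (YR{I} - YMAIN)**2;
  END;
  SE = SQRT((&G - 1) / &G * SSQ);
  RSE = 100 * SE / YMAIN; /* relative standard error, about 8 per cent here */
  KEEP YMAIN SE RSE;
RUN;

PROC PRINT DATA=SE;
RUN;
```

To apply the same steps to the CURF, G would be set to 60 and the hypothetical REPWT names replaced with the replicate weight names given in the Data Item List.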
BASIC CURF FILES
ASCII text format files
These files contain the raw confidentialised survey data in hierarchical comma delimited ASCII text format.
NNPAS11B.csv contains all levels
NPA11BP.csv contains Person level data
NPA11BF.csv contains Food level data
NPA11BS.csv contains Supplement level data
NPA11BB.csv contains Biomedical level data
NPA11BA.csv contains ADG level data
These files contain the data for the CURF in SAS format.
NPA11BP.sas7bdat contains the Person level data
NPA11BF.sas7bdat contains the Food level data
NPA11BS.sas7bdat contains the Supplement level data
NPA11BB.sas7bdat contains the Biomedical level data
NPA11BA.sas7bdat contains the ADG level data
These files contain the data for the CURF in SPSS format.
NPA11BP.sav contains the Person level data
NPA11BF.sav contains the Food level data
NPA11BS.sav contains the Supplement level data
NPA11BB.sav contains the Biomedical level data
NPA11BA.sav contains the ADG level data
These files contain the data for the CURF in Stata format.
NPA11BP.dta contains the Person level data
NPA11BF.dta contains the Food level data
NPA11BS.dta contains the Supplement level data
NPA11BB.dta contains the Biomedical level data
NPA11BA.dta contains the ADG level data
FORMATS.sas7bcat is a SAS library containing formats
NNPAS11B.sas contains a SAS program to load NNPAS11B.csv and the SAS formats into SAS for Windows
IMPORTANT INFORMATION.pdf describes the file contents of the CURF and information on using the CURF
COPYRITE1.bat describes Copyright obligations for CURF users
The following plain text format files contain data item code values and category labels at each level, with weighted and unweighted frequencies for each value.
FREQUENCIES_NPA11BP.txt contains frequencies for Person level items
FREQUENCIES_NPA11BF.txt contains frequencies for Food level items
FREQUENCIES_NPA11BS.txt contains frequencies for Supplement level items
FREQUENCIES_NPA11BB.txt contains frequencies for Biomedical level items
FREQUENCIES_NPA11BA.txt contains frequencies for ADG level items