Microdata: National Health Measures Survey

Provides data from the National Health Measures Survey for biomedical tests to assess chronic conditions and nutrition status

Introduction

The National Health Measures Survey (NHMS) is collected on an irregular basis and is designed to provide a range of information about the health of Australians. It provides information on chronic disease and nutrient biomarkers, health risk factors and objective cases of disease.

Biomedical tests include: 

  • chronic disease biomarkers, including tests for diabetes, lipid levels, kidney disease and liver function
  • nutrient biomarkers, including tests for iron, folate, iodine, vitamin B12 and vitamin D levels.

Additional key topics include:

  • education attainment and attendance
  • labour force participation
  • selected self-reported health conditions
  • physical measures (weight, height, waist circumference and blood pressure).

This information can be cross classified by selected demographic and socioeconomic characteristics. 

Microdata from this survey was previously released as the Australian Health Survey (AHS) core content. 

This product provides information about the microdata release from the 2022-24 NHMS and 2011-12 AHS core content including details about the data files and how to use the microdata products. 

The 2022-24 NHMS is broadly comparable to the 2011-12 AHS core content. However, comparisons should be made with caution.

For more details about the 2022-24 NHMS, including making comparisons with previous surveys, see National Health Measures Survey methodology, 2022-24. 

Available products

The following microdata products are available from this survey:

  • Detailed microdata - approved users can access DataLab for in-depth and interactive data analysis using a range of statistical software packages. This product is available for the 2022-24 NHMS and the 2011-12 AHS core content.
  • TableBuilder - an online tool for creating tables and graphs. This product is available for the 2011-12 AHS core content only. For more information, see the TableBuilder page. 

To apply for access, see Microdata Entry Page.

Before you apply for access, read Responsible Use of ABS Microdata, User Guide.

File structure

Datasets from the NHMS are hierarchical in nature. A hierarchical data file is an efficient means of storing and retrieving information which describes one to many, or many to many, relationships. For example, a person may report multiple conditions.

2022-24 NHMS file structure

2011-12 AHS core content file structure

Counts and weights

Number of records by level, 2022-24 NHMS microdata

| Level | Record counts (unweighted) | Weighted counts (if applicable) |
| --- | --- | --- |
| Household | 21,915 | 10,219,039 |
| Selected person | 27,786 | 25,053,995 |
| Conditions | 40,209 | n/a |
| Biomedical | 27,786 | 24,155,696 |

Number of records by level, 2011-12 AHS core content microdata

| Level | Record counts (unweighted) | Weighted counts (if applicable) |
| --- | --- | --- |
| Household | 25,080 | 8,581,354 |
| Persons in household | 61,657 | n/a |
| Person | 31,837 | 21,526,456 |
| Conditions | 42,037 | n/a |
| Biomedical | 31,837 | 20,649,321 |

Weight variables

For the 2022-24 NHMS, there are three weight variables on the file:

  • Household Weight (FINHHWT) - Household level - Benchmarked
  • Person Weight (FINPERWT) - Selected Person level - Benchmarked to the population aged 2 years and over
  • Biomedical Weight (FINBIOWT) - Biomedical level - Benchmarked to the population aged 5 years and over.

For the 2011-12 AHS core content, there were three weight variables on the file:

  • Household Weight (AHSHHWT) - Household level - Benchmarked
  • Person Weight (AHSPERWT) - Person level - Benchmarked to the population aged 2 years and over
  • Biomedical Weight (NHMSPERWT) - Biomedical level - Benchmarked to the population aged 5 years and over.

There is no weight associated with the other levels. This is because the records are repeated for each person. If, for example, FINPERWT is merged onto the Conditions level, it will be attached to each condition record and therefore be repeated for each person where they have more than one condition. This should be considered when producing tables.
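The repetition described above can be illustrated with a minimal sketch. The records, weight values and the `condition` field are made up for illustration; only ABSHIDD, ABSPID and FINPERWT are names taken from this page.

```python
# Hypothetical toy records; all values are made up for illustration only.
persons = [
    {"ABSHIDD": "H1", "ABSPID": "P1", "FINPERWT": 1000.0},
    {"ABSHIDD": "H2", "ABSPID": "P1", "FINPERWT": 1500.0},
]
conditions = [
    {"ABSHIDD": "H1", "ABSPID": "P1", "condition": "asthma"},
    {"ABSHIDD": "H1", "ABSPID": "P1", "condition": "diabetes"},
    {"ABSHIDD": "H2", "ABSPID": "P1", "condition": "asthma"},
]

# Look up each person's weight by the (household, person) identifier pair.
weight_by_person = {(p["ABSHIDD"], p["ABSPID"]): p["FINPERWT"] for p in persons}

# Merging the person weight onto the Conditions level repeats it
# once for every condition record belonging to that person.
merged = [
    dict(c, FINPERWT=weight_by_person[(c["ABSHIDD"], c["ABSPID"])])
    for c in conditions
]

# Summing over condition records counts the person in H1 twice.
naive_total = sum(r["FINPERWT"] for r in merged)

# To count persons, keep one weight per (household, person) pair.
one_weight_per_person = {(r["ABSHIDD"], r["ABSPID"]): r["FINPERWT"] for r in merged}
person_total = sum(one_weight_per_person.values())

print(naive_total, person_total)
```

In this toy data the naive sum is larger than the person-level total because one person has two condition records.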

Using weights

The NHMS is a sample survey, so to produce estimates for the in-scope population you must use weight fields in your calculations. When analysing a Household level item at the household level, you will need to use the household weight. For example, use the household weight to estimate the number of households in a state, rather than the number of persons living in that state.

Caution should be used when applying the ‘Household’ weight to items from other levels. For example, if the household weight is applied to a selected person level demographic item, such as ‘Sex’, your table will show the number of households with one or more selected persons of that sex. Since up to two people can be selected in the NHMS, this will result in some households being counted twice, once for the selected adult and once for the selected child, if they are both the same sex.

When analysing the results of a biomedical test, you will need to use the biomedical weight. This includes when performing analysis on the biomedical participants using items from the selected person level. Selected persons who did not participate in the biomedical component have a biomedical weight of 0. 
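As a rough sketch of the biomedical case, the example below sums FINBIOWT over toy selected-person records. The item names `SEX` and `HIGH_CHOL` and all values are hypothetical and are not taken from the Data Item List.

```python
# Hypothetical selected-person records with made-up values.
# SEX and HIGH_CHOL are illustrative item names only.
records = [
    {"SEX": "F", "HIGH_CHOL": 1, "FINPERWT": 900.0, "FINBIOWT": 1200.0},
    {"SEX": "F", "HIGH_CHOL": 0, "FINPERWT": 1100.0, "FINBIOWT": 1300.0},
    # Did not participate in the biomedical component: biomedical weight is 0.
    {"SEX": "M", "HIGH_CHOL": None, "FINPERWT": 1000.0, "FINBIOWT": 0.0},
    {"SEX": "M", "HIGH_CHOL": 1, "FINPERWT": 950.0, "FINBIOWT": 2000.0},
]


def weighted_total(rows, weight, predicate):
    """Sum the chosen weight over the rows that satisfy the predicate."""
    return sum(r[weight] for r in rows if predicate(r))


# A selected person level item (SEX) analysed for biomedical participants
# still uses the biomedical weight, so non-participants contribute nothing.
females_biomedical = weighted_total(records, "FINBIOWT", lambda r: r["SEX"] == "F")
males_biomedical = weighted_total(records, "FINBIOWT", lambda r: r["SEX"] == "M")

# A biomedical test result also uses the biomedical weight.
high_chol = weighted_total(records, "FINBIOWT", lambda r: r["HIGH_CHOL"] == 1)

print(females_biomedical, males_biomedical, high_chol)
```

Because non-participants carry a weight of 0, they do not need to be filtered out before summing.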

File content

Available data items

Data items for the 2022-24 NHMS include:

  • demographics - age, sex at birth, country of birth, main language spoken, marital status, visa status
  • household details - household type, size, composition, SEIFA, geography, income
  • educational attainment/participation
  • labour force status
  • selected health risk factors
  • physical measurements
  • selected self-reported conditions
  • results from biomedical tests.

The Data Item Lists section is the definitive source of available data items and categories.

Identifiers

Every record on each level of the file is uniquely identified. See Data Item Lists for details on which ID equates to which level.

Each household has a unique random identifier, ABSHIDD. This identifier appears on the household level and is repeated on each level on each record pertaining to that household. Each person within a household will also have a unique person ID, ABSPID. A combination of identifiers for a particular level and all levels above in the hierarchical structure uniquely identifies a record at a particular level. For example, each record on the conditions level is uniquely identified by a combination of the Household, Person and Conditions level identifiers.

The household record identifier, ABSHIDD, can be used to link people from the same household and to attach household characteristics, such as geography (located on the household level), to the person records. When merging data with a level above, only the identifiers relevant to the level above are required.
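A minimal sketch of such a merge is shown below. It attaches a household level item to person records using ABSHIDD alone, and checks that the pair (ABSHIDD, ABSPID) is unique at the person level. The `STATE` item and all values are hypothetical.

```python
# Hypothetical toy records; STATE is an illustrative household level item.
households = [
    {"ABSHIDD": "H1", "STATE": "NSW"},
    {"ABSHIDD": "H2", "STATE": "VIC"},
]
persons = [
    {"ABSHIDD": "H1", "ABSPID": "P1"},
    {"ABSHIDD": "H1", "ABSPID": "P2"},
    {"ABSHIDD": "H2", "ABSPID": "P1"},
]

# Merging with the level above needs only that level's identifier (ABSHIDD).
state_by_household = {h["ABSHIDD"]: h["STATE"] for h in households}
persons_with_state = [dict(p, STATE=state_by_household[p["ABSHIDD"]]) for p in persons]

# A person record is identified by its own identifier plus the one above it.
person_keys = [(p["ABSHIDD"], p["ABSPID"]) for p in persons_with_state]
keys_are_unique = len(person_keys) == len(set(person_keys))

for row in persons_with_state:
    print(row)
```

Note that ABSPID alone is not unique in this toy data ("P1" appears in two households), which is why the household identifier is kept in the key.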

Multi-response items

Several questions in the survey allowed respondents to provide one or more responses. Each response category for these multi-response data items is treated as a separate data item. In the detailed microdata, these data items share the same identifier (SAS name) prefix but are each separately suffixed with a letter - A for the first response, B for the second response, C for the third response and so on.

For example, the multi-response data item 'All types of physical activity undertaken last week' has seven response categories. There are seven data items named PATYPEWA, PATYPEWB, PATYPEWC, ... PATYPEWG. Each data item in the series has either a positive response code or a null response code, with the exception of the first item in the series, PATYPEWA. PATYPEWA has four potential response codes: the positive response code 1 - 'Walking for exercise', code 0 - null response, and the additional response codes 8 - 'No physical activity in last week' and 9 - 'Not applicable'. The remaining items, PATYPEWB to PATYPEWG, have just two response codes each. The Data Item List identifies all multi-response items and lists the codes with their corresponding response categories.

Note that the sum of individual multi-response categories will be greater than the population applicable to a particular data item as respondents can select more than one response.
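The sketch below illustrates this with toy, unweighted records. It assumes that any code other than 0, 8 or 9 is a positive response, and the positive codes shown for PATYPEWB and PATYPEWC (2 and 3) are assumptions; check the Data Item List for the actual codes.

```python
from string import ascii_uppercase

# The seven item names in the series: PATYPEWA .. PATYPEWG.
items = ["PATYPEW" + letter for letter in ascii_uppercase[:7]]

# Assumption for this sketch: 0 = null, 8 and 9 = special codes on PATYPEWA,
# and every other code is a positive response.
NON_POSITIVE_CODES = {0, 8, 9}


def make_record(**codes):
    """Build a toy record where unspecified items hold the null code 0."""
    record = {name: 0 for name in items}
    record.update(codes)
    return record


people = [
    make_record(PATYPEWA=1, PATYPEWB=2, PATYPEWC=3),  # three activities
    make_record(PATYPEWA=1),                          # one activity
    make_record(PATYPEWC=3),                          # one activity
    make_record(PATYPEWA=8),                          # no physical activity
]

# Count positive responses separately for each item in the series.
counts = {
    name: sum(1 for p in people if p[name] not in NON_POSITIVE_CODES)
    for name in items
}

# Applicable population: everyone except 'Not applicable' (code 9 on PATYPEWA).
applicable = sum(1 for p in people if p["PATYPEWA"] != 9)

print(counts, sum(counts.values()), applicable)
```

Here the category counts sum to more than the applicable population because the first person contributes to three categories.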

Non-Indigenous flag

The purpose of the Non-Indigenous flag (NONINDST) is to assist users in producing non-Indigenous data only. It should not be used to estimate Aboriginal and Torres Strait Islander populations through differencing, as the scope of the National Health Measures Survey excludes Very Remote areas of Australia and discrete Aboriginal and Torres Strait Islander communities.

Continuous items

Some continuous data items are allocated special codes for certain responses (e.g. 9999 = 'Not applicable'). Any special codes for continuous (summation) data items are listed in the Data Item List (DIL) and can be found in the categorical version of the continuous item. Note, however, that a label against '0' in the DIL (for example, 'Did not visit' or 'Did not do') does not necessarily mean that zeros are excluded from the ranges, as they may still be important in some calculations. Refer to the categorical version of the item to identify which codes are specifically excluded. The total shown therefore represents only the 'valid responses' for that continuous data item, rather than all responses (including special codes).
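As an illustration of why special codes matter, the sketch below computes a weighted mean of a hypothetical continuous item, excluding the special code 9999 but keeping valid zeros. The values, weights and the choice of special code are assumptions for this example only.

```python
# Assumption for this sketch: 9999 is the only special code for the item.
SPECIAL_CODES = {9999}

# (value, weight) pairs for a hypothetical continuous item; values are made up.
rows = [
    (30, 1000.0),
    (0, 500.0),       # a valid zero, kept in the calculation
    (60, 1500.0),
    (9999, 2000.0),   # 'Not applicable', excluded from the calculation
]

valid = [(value, weight) for value, weight in rows if value not in SPECIAL_CODES]

# Weighted mean over valid responses only.
weighted_mean = sum(v * w for v, w in valid) / sum(w for _, w in valid)

# For comparison: leaving the special code in badly distorts the result.
distorted_mean = sum(v * w for v, w in rows) / sum(w for _, w in rows)

print(weighted_mean, distorted_mean)
```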

Reliability of estimates

As the survey was conducted on a sample of private households in Australia, it is important to take account of the method of sample selection when deriving estimates from the detailed microdata. This is important because a person's chance of selection in the survey varied depending on the state or territory in which the person lived. If these chances of selection are not accounted for by use of appropriate weights, the results could be biased.

Each person or household record has a main weight (e.g., FINPERWT). This weight indicates how many population units are represented by the sample unit. When producing estimates of sub-populations from the detailed microdata, it is essential that they are calculated by adding the weights of persons or households in each category and not just by counting the sample number in each category. If each person or household’s weight were to be ignored when analysing the data to draw inferences about the population, then no account would be taken of a person or household's chance of selection or of different response rates across population groups, with the result that the estimates produced could be biased. The application of weights ensures that estimates will conform to an independently estimated distribution of the population by age, by sex, etc. rather than to the distributions within the sample itself.
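A small sketch of the difference between counting sample records and summing weights is given below. The states and weight values are made up, and the example assumes a design in which one state is sampled at a much higher rate than the other.

```python
from collections import defaultdict

# Hypothetical (state, FINPERWT) pairs; values are made up for illustration.
# The TAS records have small weights because, in this toy design, people
# there had a higher chance of selection.
sample = [
    ("NSW", 3000.0),
    ("NSW", 3000.0),
    ("TAS", 500.0),
    ("TAS", 500.0),
    ("TAS", 500.0),
    ("TAS", 500.0),
]

sample_counts = defaultdict(int)
estimates = defaultdict(float)
for state, weight in sample:
    sample_counts[state] += 1      # counting sample records
    estimates[state] += weight     # adding the weights

unweighted_share = sample_counts["TAS"] / sum(sample_counts.values())
weighted_share = estimates["TAS"] / sum(estimates.values())

print(dict(estimates), unweighted_share, weighted_share)
```

In this toy data the unweighted share for TAS is about two thirds, while the weighted estimate is one quarter.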

It is also important to calculate a measure of sampling error for each estimate. Sampling error occurs because only part of the population is surveyed to represent the whole population. Sampling error should be considered when interpreting estimates, as it gives an indication of accuracy and reflects the importance that can be placed on interpretations using the estimate. Measures of sampling error include standard error (SE), relative standard error (RSE) and margin of error (MoE). These measures of sampling error can be estimated using the replicate weights. The replicate weight variables provided on the microdata are labelled WHM01XX (household), WPM01XX (person) and WPB01XX (biomedical), where XX represents the number of the given replicate group. For example, NHMS produces survey microdata with 60 replicate groups, which means there are 60 person replicate weight variables labelled WPM0101 to WPM0160.
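The naming pattern above can be generated programmatically, for example when selecting replicate weight columns. The helper below is a hypothetical convenience function, not part of any ABS tool.

```python
def replicate_names(prefix, groups=60):
    """Return replicate weight names such as WPM0101 .. WPM0160.

    `prefix` is the part before the two-digit replicate group number,
    e.g. "WHM01" (household), "WPM01" (person) or "WPB01" (biomedical).
    """
    return [f"{prefix}{group:02d}" for group in range(1, groups + 1)]


person_replicates = replicate_names("WPM01")
biomedical_replicates = replicate_names("WPB01")

print(person_replicates[0], person_replicates[-1], len(person_replicates))
```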

Using replicate weights for estimating sampling error

Overview of replication methods

ABS household surveys employ complex sample designs and weighting which require special methods for estimating the variance of survey statistics.  Variance estimators for a simple random sample are not appropriate for this survey microdata.

A class of techniques called 'replication methods' provide a general process for estimating variance for the types of complex sample designs and weighting procedures employed in ABS household surveys. The ABS uses a method called the Group Jackknife Replication Method. 

A basic idea behind the replication approach is to split the sample into G replicate groups. One replicate group is then dropped from the file and a new set of weights is produced for the remaining sample. This is repeated for all G replicate groups to provide G sets of replicate weights. For each set of replicate weights, the statistic of interest is recalculated and the variance of the full sample statistic is estimated using the variability among the replicate statistics.

The statistics calculated from these replicates are called replicate estimates. Replicate weights provided on the microdata file enable the variance of survey statistics, such as means and medians, to be calculated relatively simply. Further technical explanation can be found in Section 4 of Research Paper: Weighting and Standard Error Estimation for ABS Household Surveys (Methodology Advisory Committee).

How to use replicate weights

To calculate the standard error of any statistic derived from the survey data, the method is as follows:

  1. Calculate the estimate of the statistic of interest using the main weight.
  2. Repeat the calculation above for each replicate weight, substituting the replicate weight for the main weight and creating G replicate estimates. In the example where there are 60 replicate weights, you will have 60 replicate estimates.
  3. Use the outputs from steps 1 and 2 as inputs to the formula below to calculate the estimate of the Standard Error (SE) for the statistic of interest.
\[SE(y) = \sqrt {\frac{{G - 1}}{G}\sum\nolimits_{g = 1}^G {{{\left( {{y_{(g)}} - y} \right)}^2}} }\]

[Equation 1]

  • G = Number of replicate groups
  • g = the replicate group number
  • y(g) = Replicate estimate for group g, i.e. the estimate of y calculated using the replicate weight for g
  • y = the weighted estimate of y from the sample

From the SE you can then derive the following measures of sampling error: relative standard error (RSE) and margin of error (MoE) of the estimate.

\[\text{Relative Standard Error (RSE)} = \frac{\text{SE}}{\text{Estimate}} \times 100\%\]

[Equation 2]

\[\text{Margin of Error (MoE)} = 1.96 \times \text{SE}\]

[Equation 3]
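The three steps and Equations 1 to 3 can be sketched as follows. This is a minimal illustration on toy records with only three replicate groups; the `VALUE` item and all weight values are made up, and a real file would carry the full set of replicate weights.

```python
import math

# Toy records with a main weight and G = 3 replicate weights (made-up values).
# VALUE is a hypothetical continuous item.
rows = [
    {"VALUE": 10.0, "FINPERWT": 100.0, "WPM0101": 0.0, "WPM0102": 150.0, "WPM0103": 150.0},
    {"VALUE": 20.0, "FINPERWT": 100.0, "WPM0101": 150.0, "WPM0102": 0.0, "WPM0103": 150.0},
    {"VALUE": 30.0, "FINPERWT": 100.0, "WPM0101": 150.0, "WPM0102": 150.0, "WPM0103": 0.0},
]
replicate_weights = ["WPM0101", "WPM0102", "WPM0103"]


def weighted_mean(records, value, weight):
    """Weighted mean of `value` using the named weight variable."""
    total_weight = sum(r[weight] for r in records)
    return sum(r[value] * r[weight] for r in records) / total_weight


def jackknife_se(estimate, replicate_estimates):
    """Equation 1: SE from the full-sample estimate and G replicate estimates."""
    g = len(replicate_estimates)
    squared_diffs = sum((y_g - estimate) ** 2 for y_g in replicate_estimates)
    return math.sqrt((g - 1) / g * squared_diffs)


# Step 1: estimate using the main weight.
estimate = weighted_mean(rows, "VALUE", "FINPERWT")

# Step 2: repeat the calculation with each replicate weight.
replicate_estimates = [weighted_mean(rows, "VALUE", w) for w in replicate_weights]

# Step 3: apply Equation 1, then Equations 2 and 3.
se = jackknife_se(estimate, replicate_estimates)
rse_percent = se / estimate * 100
moe = 1.96 * se

print(estimate, replicate_estimates, se, rse_percent, moe)
```

The same `jackknife_se` function applies to any statistic, provided the statistic is recalculated in full for each replicate weight.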

An example of calculating the SE for an estimate of the mean

Suppose you are calculating the mean value of earnings, y, in a sample.  Using the main weight produces an estimate of $500.

You have 5 sets of Group Jackknife replicate weights and using these weights (instead of the main weight) you calculate 5 replicate estimates of $510, $490, $505, $503, $498 respectively. 

To calculate the standard error of the estimate, substitute the following inputs into Equation 1:

  • G = 5
  • y = 500
  • g = 1, y(g) = 510
  • g = 2, y(g) = 490
  • g = 3, y(g) = 505
  • g = 4, y(g) = 503
  • g = 5, y(g) = 498

\(SE(y) = \sqrt{\frac{5-1}{5} \sum_{g=1}^5 (y_{(g)}- 500)^2}\)

\(SE(y) = \sqrt{\frac{4}{5}((510-500)^2+(490-500)^2+(505-500)^2+(503-500)^2+(498-500)^2)}\)

\(SE(y)= \sqrt{\frac{4}{5} \times 238}\)

\(SE(y)= 13.8\)

To calculate the RSE, divide the SE by the estimate of y ($500) and multiply by 100 to express it as a percentage.

\(RSE(y)=\frac{13.8}{500} \times 100\)

\(RSE(y)=2.8\%\)

To calculate the margin of error, multiply the SE by 1.96.

\(\text {Margin of Error} (y)=13.8 \times 1.96\)

\(\text {Margin of Error} (y)=27.05\)
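The arithmetic in this worked example can be checked with a few lines of code. The sketch below uses only the figures given above and carries the unrounded SE through to the RSE and margin of error.

```python
import math

y = 500.0                                     # estimate from the main weight
replicates = [510.0, 490.0, 505.0, 503.0, 498.0]
G = len(replicates)

# Sum of squared differences between each replicate estimate and y.
sum_sq = sum((y_g - y) ** 2 for y_g in replicates)

se = math.sqrt((G - 1) / G * sum_sq)          # standard error
rse = se / y * 100                            # relative standard error, in %
moe = 1.96 * se                               # margin of error

print(sum_sq, round(se, 1), round(rse, 1), round(moe, 2))
# prints: 238.0 13.8 2.8 27.05
```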

Data Item Lists

Data files

Previous releases

| Release | TableBuilder data series | Microdata Download | DataLab |
| --- | --- | --- | --- |
| Australian Health Survey, core content, 2011-12 | TableBuilder | n/a | Detailed microdata |

Further information

See National Health Measures Survey methodology, 2022-24 for further information about the 2022-24 NHMS cycle.

See Australian Health Survey: Biomedical Results for Chronic Diseases methodology and Australian Health Survey: Biomedical Results for Nutrients methodology for further information about the 2011-12 AHS cycle.

Previous catalogue number

This release previously used catalogue number 4324.0.55.003.
