Microdata: National Aboriginal and Torres Strait Islander Nutrition and Physical Activity Survey

Provides data from the NATSINPAS for key health statistics including nutrition, physical activity and sleep.

Release date and time

01/05/2026 11:30am AEST

Introduction

The National Aboriginal and Torres Strait Islander Nutrition and Physical Activity Survey (NATSINPAS) is designed to provide a range of information about nutrition, physical activity, sedentary behaviour and sleep. This information can be cross classified by selected demographic and socioeconomic characteristics.

This publication provides information about the microdata releases from the two NATSINPAS surveys, 2023 and 2012–13. It includes details about the data files and how to use the different microdata products. Data Item Lists, information about the survey methodology, and a link to microdata for the previous NATSINPAS release (2012–13) are also provided.

The 2023 NATSINPAS is generally comparable to the 2012–13 NATSINPAS. However, due to the time between the NATSINPAS surveys, there have been a number of changes to the content. These changes are mainly due to updates to relevant nutrition and physical activity guidelines, updates to demographic standards (for example, occupation, industry) and the addition of content based on user needs. Major changes that occurred include changes to the dietary recall tool and the use of an accelerometer instead of a pedometer to collect directly measured data on physical activity, inactivity and sleep.

A summary of the main content changes applied in the 2023 NATSINPAS compared with the 2012–13 survey can be found in the NATSINPAS Methodology and Data Item List in the Data downloads section. Additionally, a comparison of food and nutrient collections over time can be found in the Intergenerational Health and Mental Health Study: Concepts, Sources and Methods.

Available products

Basic microdata – approved users can download and analyse unit record data in their own environment. This product is available for both the 2023 and 2012–13 NATSINPAS surveys. For more information, see the MicrodataDownload page.
Detailed microdata – approved users can access DataLab for in-depth and interactive data analysis using a range of statistical software packages. This product is also available for both the 2023 and 2012–13 NATSINPAS surveys. For more information, including prerequisites for DataLab access, see the DataLab page.
TableBuilder is an online tool for creating tables and graphs, and can be accessed via the ABS website. Using TableBuilder, users can only access the 2012–13 NATSINPAS data.

File structure

Datasets from the NATSINPAS are hierarchical in nature. A hierarchical data file is an efficient means of storing and retrieving information which describes one to many, or many to many, relationships. For example, a child aged 5-17 may report multiple days on which physical activity was undertaken, and different types of physical activity on each of these days.

Data about households and families are contained as individual characteristics on person records. While estimates are also available at the household level, estimates at the family level are not available from this survey. The data items and related output categories are described in the Data Item Lists in the Data downloads section.

2023 NATSINPAS file structure

The following table shows the levels available in the microdata products and the information contained on those levels:

Level name	Information contained on level	Basic Microdata	Detailed Microdata
1. Household	Geographic classifications, household size and structure, dwelling characteristics and household income details	✓	✓
2. Person	Demographic and socioeconomic characteristics of survey respondents, as well as health, health risks and related information provided by respondents	✓	✓
3. Conditions	Health conditions and status		✓
4. Pre-School Aged Child (2–5) Physical Activity Day	Physical activity and sedentary behaviour across an individual day, for 2–4 year olds, and 5 year olds not attending primary school living in non-remote areas		✓
5. School-Aged Child (5–17) Physical Activity Day	Physical activity (including active transport and muscle and bone strengthening activities) and sedentary behaviour across an individual day, for 5 year olds attending primary school and 6–17 year olds living in non-remote areas		✓
6. School-Aged Child (5–17) Physical Activity Detail	Additional detail about physical activity at the individual activity level, 5 year olds attending primary school and 6–17 year olds living in non-remote areas		✓
7. Adult Physical Activity Day	Physical activity (including active transport and muscle and bone strengthening activities), for adults living in non-remote areas		✓
8. Adult Physical Activity Detail	Additional detail about physical activity at the individual activity level, for adults living in non-remote areas		✓
9. Dietary Recall	Food intake details on the day prior to the interview	✓	✓
10. Supplements	Dietary supplement intake details on the day prior to the interview	✓	✓
11. Accelerometer - Sleep	Accelerometer measured average sleep duration, wake times and sleep onset times for each main sleep period		✓
12. Accelerometer - Day	Accelerometer measured physical activity, bouts, total sleep and steps for midnight-to-midnight day		✓
13. Accelerometer - Hour	Accelerometer measured physical activity, bouts, total sleep and steps per hour of midnight-to-midnight day		✓
14. Accelerometer - Quarter Hour	Accelerometer measured average luminosity, temperature, acceleration magnitude calculated for each quarter hour interval of wear time		✓
15. Accelerometer - 5-Second	Acceleration metrics, averaged per 5-second interval of wear time		✓
16. Accelerometer - Input 100Hz	Raw accelerometer data: unprocessed, device measured acceleration for the X, Y and Z device axes		✓

An example file (DataLab test file) for the Accelerometer 5-second, Quarter hour and Input 100Hz levels is available in the Data downloads section.

The following table shows the hierarchical file structure and the relationship between each level:

Level 1	Level 2	Level 3	Level 4	Relationship type
Household				One record per in scope household
	Person			Up to two selected person records per household (1 adult and 1 child)
		Conditions		One condition record for each reported condition for each selected person record
		Pre-School Aged Child (2–5) Physical Activity Day		Seven physical activity day records per selected person aged 2–5 years (5 year olds not attending primary school)
		School-Aged Child (5–17) Physical Activity Day		Seven physical activity day records per selected person aged 5–17 years (5 year olds attending primary school)
			School-Aged Child (5–17) Physical Activity Detail	One physical activity detail record for each reported activity for each selected person aged 5–17 years on each day of the week (5 year olds attending primary school)
		Adult Physical Activity Day		Seven physical activity day records per selected adult
			Adult Physical Activity Detail	One physical activity detail record for each reported activity for each selected adult
		Dietary Recall		One food/drink record for each reported food/drink for each selected person record
		Supplements		One supplement record for each reported supplement for each selected person record
		Accelerometer - Sleep		One record per main sleep period for persons participating in the accelerometer study
		Accelerometer - Day		Seven records per person opting into accelerometer study corresponding to each day during the wear period
			Accelerometer - Hour	168 records per person participating in accelerometer study corresponding to each hour in the wear period (7 days x 24 hours)
			Accelerometer - Quarter Hour	Up to 672 records per person participating in the accelerometer study corresponding to each quarter hour in the wear period (7 days x 24 hours x 4 quarter hours)
			Accelerometer - 5-Second	Up to 120,960 records per person participating in the accelerometer study corresponding to each 5-second period of the wear time (7 days x 24 hours x 60 minutes x 60/5 seconds)
			Accelerometer - Input 100Hz	Up to 60,480,000 records per person participating in the accelerometer study (500 x higher resolution than the 5-second level)
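
The per-person record limits quoted in the table above follow directly from a seven-day wear period. The short sketch below reproduces that arithmetic; the variable names are illustrative only and do not correspond to data items on the file.

```python
# Upper bounds on accelerometer records per person for a 7-day wear period.
DAYS = 7
HOURS_PER_DAY = 24
SAMPLE_RATE_HZ = 100  # device measurement rate stated in this publication

hour_records = DAYS * HOURS_PER_DAY                              # Hour level
quarter_hour_records = hour_records * 4                          # Quarter Hour level
five_second_records = hour_records * 60 * (60 // 5)              # 5-Second level
input_100hz_records = hour_records * 60 * 60 * SAMPLE_RATE_HZ    # Input 100Hz level

print(hour_records, quarter_hour_records, five_second_records, input_100hz_records)
```

Actual files may contain fewer records where the device was not worn for the whole period.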

2012–13 NATSINPAS file structure

The following table shows the levels available in each NATSINPAS microdata product and examples of the information that is contained on those levels:

Level	TableBuilder	Basic CURF	Expanded CURF	Information contained on level
1. Household	✓	✓	✓	Geographic classifications, household size and structure
2. Persons in household	✓	-	✓	Basic demographic and relationship details of all members of households
3. Person (selected persons)	✓(a)	✓	✓	This is the main level, containing demographic and socioeconomic characteristics of survey respondents, and most of the physical activity, nutrition and health information
4. Condition	✓	✓	✓	Selected health conditions reported by respondents
5. Child 2-4 years physical activity day (NR Only)	✓	✓	✓	Physical and sedentary activities undertaken on the three days prior to interview for children aged 2–4 years living in non-remote areas, including specific information on time spent on indoors/outdoors/screen-based activities per day
6. Child 5-17 years physical activity day (NR Only)	✓	✓	✓	Physical and sedentary activities undertaken on the three days prior to interview for children aged 5–17 years living in non-remote areas, including specific information on time spent on active transport/moderate-vigorous/screen-based activities per day
7. Child 5-17 years physical activity detailed (NR Only)	✓	✓	✓	Detailed information about the physical activities undertaken each day on the three days prior to interview for children aged 5–17 years living in non-remote areas, including time spent on active transport/moderate-vigorous by detailed type, whether the moderate-vigorous activity was organised, and time spent on the organised component
8. Adult physical activity (NR Only)	✓	✓	✓	Detailed information about the physical activities undertaken, and identification for each activity on whether it was organised, for persons aged 18 years and over living in non-remote areas
9. Pedometer (NR Only)	✓	✓	✓	Number of steps and time the pedometer was worn, for up to eight days, as reported by respondents aged 5 years and over living in non-remote areas who participated in this component
10. Biomedical	-	✓	✓	Pathology test information for markers of chronic disease such as blood sugar levels, cholesterol and kidney function, markers of nutritional status, as well as markers of exposure to chemicals such as nicotine for respondents aged 18 years and over who participated in this component
11. Food	✓(a)	✓	✓	Food intake details on the day prior to the interview and on a second day for respondents that completed the follow-up interview (CATI), including nutrient information on each of the food items consumed as well as the time and eating occasion of consumption
12. Supplement	✓(a)	✓	✓	Dietary supplement intake details on the day prior to the interview and on a second day for respondents that completed the follow-up interview (CATI), including nutrient information on each of the dietary supplement items consumed
13. Australian Dietary Guidelines	-	✓	✓	Australian Dietary Guidelines (ADG) items including: day number for intake, food group, ADG food source inclusions and serve amount

a. The TableBuilder product does not contain Day 2 (CATI) nutrition data on the Person/Food/Supplement levels or nutrient information on the Food or Supplement levels. See the TableBuilder Data item list located in the Data downloads section for the nutrition content available in this product.

The following table shows the hierarchical file structure and the relationship between each level:

Level 1	Level 2	Level 3	Level 4	Relationship type
Household				One record per in scope household
	Persons in household			One record per usual resident within household
	Selected person			Up to two selected person records per household (1 adult and 1 child)
		Conditions		One condition record for each reported condition for each selected person record
		Child 2–4 years physical activity day		Seven physical activity day records per selected person aged 2–4 years
		Child 5–17 years physical activity day		Seven physical activity day records per selected person aged 5–17 years
			Child 5–17 years physical activity detailed	One physical activity detail record for each reported activity for each selected person aged 5–17 years on each day of the week
		Adult physical activity		One physical activity detail record for each reported activity for each selected person aged 18 years and over
		Pedometer		One pedometer record for each day a pedometer was worn by selected persons aged 5 years and over
		Biomedical		One biomedical record for each selected person record
		Food		One food/drink record for each reported food/drink for each food day for each selected person record
		Supplement		One supplement record for each reported supplement for each food day for each selected person record
		Australian Dietary Guidelines		One food summary record for each food summary characteristic for each food day for each selected person record

Counts and weights

2023 NATSINPAS

Number of records by level, NATSINPAS 2023 microdata:

Level	Record counts (unweighted)	Weighted counts (if applicable)
Household	2,097	462,188
Person (Selected persons)	2,879	946,233
Conditions	5,348	N/A
Pre-School Aged Child (2–5) Physical Activity Day	3,027	N/A
School-Aged Child (5–17) Physical Activity Day	3,493	N/A
School-Aged Child (5–17) Physical Activity Detail	4,098	N/A
Adult Physical Activity Day	8,345	N/A
Adult Physical Activity Detail	3,006	N/A
Dietary Recall (known as ‘Food level’ in 2012–13 NATSINPAS)	39,829	N/A
Supplements	3,175	N/A
Accelerometer - Sleep	7,140	N/A
Accelerometer - Day	7,231	N/A
Accelerometer - Hour	173,544	N/A
Accelerometer - Quarter Hour	524,536	N/A
Accelerometer - 5-Second	Up to 120,960 rows per file (1,033 files)	N/A
Accelerometer - Input 100Hz	Up to 60,480,000 rows per file (1,033 files)	N/A

2012–13 NATSINPAS

Number of records by level, NATSINPAS 2012–13:

Levels	Record counts (unweighted)	Weighted counts (if applicable)
Household level	2,900	N/A
Persons in Household level (All persons)	10,275	N/A
Person level (Selected persons)	4,109	609,915
Conditions level	5,414	N/A
Child 2–4 Physical Activity Day level (NR only)	4,399	N/A
Child 5–17 Physical Activity Day level (NR only)	5,063	N/A
Child 5–17 Physical Activity Detailed level (NR only)	5,982	N/A
Adult Physical Activity level (NR only)	4,202	N/A
Pedometer level (NR only)	7,753	N/A
Biomedical level (Persons 5+)	4,109	365,868
Food level	72,376	N/A
Supplement level	8,538	N/A
Australian Dietary Guidelines level	761,280	N/A

Weight variables

For the 2023 NATSINPAS, there are two weight variables on the file:

Household Weight (FINHHWT) - Household level - Benchmarked
Person Weight (FINPERWT) - Selected Person level - Benchmarked to the total population aged 2 years and over.

There is no weight associated with the other levels. This is because the records are repeated for each selected person. If, for example, FINPERWT is merged onto the Conditions level, it will be attached to each condition record and therefore be repeated for each selected person where they have more than one condition. This should be considered when producing tables and analysing microdata.
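
As a rough illustration of why repeated weights matter, the sketch below merges FINPERWT onto a few synthetic condition records and compares a naive sum with a person-level sum. The identifier values, weights and the CONDITION item name are invented for illustration; only ABSHIDD, ABSPIDD and FINPERWT are names taken from this publication.

```python
# Synthetic records; IDs, weights and condition codes are invented for illustration.
persons = [
    {"ABSHIDD": "H1", "ABSPIDD": "P1", "FINPERWT": 300.0},
    {"ABSHIDD": "H1", "ABSPIDD": "P2", "FINPERWT": 250.0},
    {"ABSHIDD": "H2", "ABSPIDD": "P1", "FINPERWT": 400.0},
]
conditions = [
    {"ABSHIDD": "H1", "ABSPIDD": "P1", "CONDITION": "A"},
    {"ABSHIDD": "H1", "ABSPIDD": "P1", "CONDITION": "B"},
    {"ABSHIDD": "H2", "ABSPIDD": "P1", "CONDITION": "A"},
]

weight_by_person = {(p["ABSHIDD"], p["ABSPIDD"]): p["FINPERWT"] for p in persons}

# Merge the person weight onto every condition record.
merged = [
    dict(c, FINPERWT=weight_by_person[(c["ABSHIDD"], c["ABSPIDD"])])
    for c in conditions
]

# Naive: summing the merged weights counts person H1/P1 twice.
naive_total = sum(r["FINPERWT"] for r in merged)

# Persons with at least one condition: count each person once.
persons_with_condition = {(r["ABSHIDD"], r["ABSPIDD"]) for r in merged}
person_total = sum(weight_by_person[key] for key in persons_with_condition)

print(naive_total, person_total)
```

The naive total is an estimate of reported conditions, not of persons. Which of the two is appropriate depends on the table being produced.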

For the 2012–13 NATSINPAS, there are two weight variables on the file:

Person Weight (IPAFINWT) - Selected Person level - Benchmarked to the total population aged 2 years and over.
Biomedical persons (IHMSPERW) - located on the Biomedical level. This weight has been benchmarked to produce Australian population estimates based on Biomedical participants aged 5 years and over.

There are no weights associated with the Household level. Household variables can be used in conjunction with the Person or Biomedical weights to provide, for example, geographic or household compositional information for selected persons. There are also no weights associated with the other levels. This is because the records are repeated for each person who was selected in the survey. If, for example, IPAFINWT is merged onto the Conditions level, it will be attached to each condition record and therefore be repeated for each person where they have more than one condition. This should be considered when producing tables and analysing microdata.

Using weights in the 2023 NATSINPAS

The NATSINPAS is a sample survey, so to produce estimates for the in-scope population, you must use weight fields in your calculations. When analysing a Household level item at the household level, you will need to use the household weight. For example, if you wanted to know the number of households in a state, rather than the number of persons living in that state, you need to use the household weight, not the person weight.

Caution should be used when applying the ‘Household’ weight to items from other levels. For example, if the household weight is applied to a selected person level demographic item, such as ‘Sex’, your table will show the number of households with one or more selected persons of that sex. Since up to two people can be selected in the NATSINPAS, this will result in some households being counted twice, once for the selected adult and once for the selected child, if they are both the same sex.
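
The double counting described above can be seen in a small synthetic example. In this sketch the SEX values and weights are invented, and the de-duplication step is one possible way to count each household once per category, not a prescribed method.

```python
# Synthetic selected-person records with the household weight merged on.
# SEX values and weights are invented; FINHHWT and ABSHIDD are named in this publication.
records = [
    {"ABSHIDD": "H1", "SEX": "Male", "FINHHWT": 200.0},    # selected adult
    {"ABSHIDD": "H1", "SEX": "Male", "FINHHWT": 200.0},    # selected child, same household
    {"ABSHIDD": "H2", "SEX": "Female", "FINHHWT": 150.0},
    {"ABSHIDD": "H2", "SEX": "Male", "FINHHWT": 150.0},
]

def households_by_sex(rows, dedupe):
    """Sum FINHHWT by SEX, optionally counting each household once per category."""
    totals, seen = {}, set()
    for row in rows:
        key = (row["ABSHIDD"], row["SEX"])
        if dedupe and key in seen:
            continue
        seen.add(key)
        totals[row["SEX"]] = totals.get(row["SEX"], 0.0) + row["FINHHWT"]
    return totals

naive = households_by_sex(records, dedupe=False)    # household H1 counted twice for Male
deduped = households_by_sex(records, dedupe=True)   # each household once per category
print(naive, deduped)
```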

File content

Available data items

Data items for the 2023 NATSINPAS include:

Demographics – age, sex, Indigenous status, main language spoken at home, marital status
Household details – size, household composition, Socio-Economic Indexes for Areas (SEIFA), geography
Employment – Labour force status, hours usually worked
Education – current study status, attainment
Household Income
Long-term health status relating to diabetes, kidney disease, mental health conditions
Risk factors such as tobacco smoking and physical activity
24-hour dietary recall
Specific dietary information such as consumption of fruits, vegetables, oils, fats, salt, tap water and dietary supplements
Influences on dietary choices
Physical and sedentary activity
Sleep behaviours
Self-reported height and weight
Physical Measures – blood pressure, height, weight and waist.

The Data Item List in the Data downloads section is the definitive source of available data items and categories.

Identifiers

Every record on each level of the file is uniquely identified. See Data Item List in the Data downloads section for details on which ID equates to which level.

Each household has a unique random identifier, ABSHIDD. This identifier appears on the household level and is repeated on each level on each record pertaining to that household. A combination of identifiers for a particular level and all levels above in the hierarchical structure uniquely identifies a record at a particular level. For example, each record on the conditions level is uniquely identified by a combination of the Household, Person and Conditions level identifiers.

The Household record identifier, ABSHIDD, assists with linking people from the same household, and with linking household characteristics such as geography (located on the household level) to the Person records. When merging data with a level above, only those identifiers relevant to the level above are required.
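
A minimal sketch of this kind of merge is shown below, using synthetic records. ABSHIDD and ABSPIDD are the identifiers named in this publication; STATE and AGE are hypothetical item names used only for illustration, so check the Data Item List for the actual names.

```python
# Synthetic two-level extract. STATE and AGE are hypothetical item names.
households = [
    {"ABSHIDD": "H1", "STATE": "NSW"},
    {"ABSHIDD": "H2", "STATE": "QLD"},
]
persons = [
    {"ABSHIDD": "H1", "ABSPIDD": "P1", "AGE": 34},
    {"ABSHIDD": "H1", "ABSPIDD": "P2", "AGE": 9},
    {"ABSHIDD": "H2", "ABSPIDD": "P1", "AGE": 51},
]

# Person records are unique on the household and person identifiers together.
person_keys = [(p["ABSHIDD"], p["ABSPIDD"]) for p in persons]
assert len(person_keys) == len(set(person_keys))

# Merging with the level above needs only that level's identifier (ABSHIDD).
household_lookup = {h["ABSHIDD"]: h for h in households}
persons_with_geography = [
    {**p, "STATE": household_lookup[p["ABSHIDD"]]["STATE"]} for p in persons
]

for row in persons_with_geography:
    print(row)
```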

Multi-response items

Several questions in the survey allowed respondents to provide one or more responses. Each response category for these multi-response data items is treated as a separate data item. In the microdata, these data items share the same identifier (SAS name) prefix but are each separately suffixed with a letter - A for the first response, B for the second response, C for the third response and so on.

For example, the multi-response data item 'All types of physical activity undertaken in last week' (PATYPEW) has seven response categories. There are seven data items named PATYPEWA, PATYPEWB, PATYPEWC, ..., PATYPEWG. Each data item in the series will have either a positive response code or a null response code, with the exception of the first item in the series, PATYPEWA.

PATYPEWA has four potential response codes:

code 0 – null response
code 1 – 'Walking for exercise, recreation or sport' – positive response
code 8 – 'No physical activity in last week'
code 9 – 'Not applicable'.

The remaining items, PATYPEWB, PATYPEWC, ..., PATYPEWG, have just two response codes each. The Data Item List identifies all multi-response items and lists the corresponding codes with the corresponding response categories. See the Data Item List in the Data downloads section.

Note that the sum of individual multi-response categories will be greater than the population applicable to a particular data item as respondents can select more than one response.
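
The sketch below illustrates this with three of the PATYPEW items on synthetic records. Code 1 on PATYPEWA and codes 0, 8 and 9 are taken from this publication; the positive codes used for PATYPEWB and PATYPEWC (2 and 3) are assumptions, so refer to the Data Item List for the actual codes.

```python
# Synthetic person records. Positive codes 2 and 3 are assumed for illustration.
ITEMS = ["PATYPEWA", "PATYPEWB", "PATYPEWC"]
NON_ACTIVITY_CODES = {0, 8, 9}  # null response, no physical activity, not applicable

persons = [
    {"PATYPEWA": 1, "PATYPEWB": 2, "PATYPEWC": 3},  # three activity types
    {"PATYPEWA": 0, "PATYPEWB": 0, "PATYPEWC": 3},  # one activity type
    {"PATYPEWA": 8, "PATYPEWB": 0, "PATYPEWC": 0},  # no physical activity
]

# Count positive responses per item.
responses_per_item = {
    item: sum(1 for p in persons if p[item] not in NON_ACTIVITY_CODES)
    for item in ITEMS
}

# Count persons with at least one positive response.
persons_active = sum(
    1 for p in persons if any(p[item] not in NON_ACTIVITY_CODES for item in ITEMS)
)
total_responses = sum(responses_per_item.values())

print(responses_per_item, persons_active, total_responses)
```

Here the responses sum to more than the number of persons, so category counts should not be added together to form a person total.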

Continuous items

Some continuous data items are allocated special codes for certain responses (e.g. 9999 = 'Not applicable'). Any special codes for continuous (summation) data items are listed in the Data Item List and will be found in the categorical version of the continuous item. However, note that labelling of '0's in the Data Item List does not necessarily mean they are excluded from the ranges (for example - identifying 0 as 'Did not do') as they may still be important in some calculations. Reference should be made to the categorical version of the item to identify which codes are specifically excluded. Therefore, the total shown only represents 'valid responses' of that continuous data item rather than all responses (including special codes). See the Data Item List in the Data downloads section.
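
As a simple illustration, the sketch below excludes a special code before averaging a continuous item while keeping valid zeros. The code 9999 is the example given above; the values are synthetic, and the codes that apply to any real item should be taken from the Data Item List.

```python
NOT_APPLICABLE = 9999  # special code used as an example in this publication

# Synthetic minutes of activity; 0 is treated as a valid value and is kept.
minutes = [0, 30, 45, 9999, 60, 9999]

valid = [m for m in minutes if m != NOT_APPLICABLE]
mean_valid = sum(valid) / len(valid)      # mean over valid responses only
mean_naive = sum(minutes) / len(minutes)  # distorted by the special code

print(mean_valid, mean_naive)
```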

Using 2023 NATSINPAS accelerometer microdata files

Accelerometers are a common type of sensor used to study human movement. They are wearable devices that measure linear acceleration – the change in a person’s speed (velocity) per unit time. Acceleration was measured 100 times per second (100Hz) for up to one week resulting in large files (up to 60,480,000 rows per file).

Accelerometer data is output at different levels of detail. These include the most detailed input 100Hz files, as well as summary files with data per:

5-seconds
quarter-hour (15-minutes)
hour
day
person (weekly, weekday and weekend).

Due to the amount of data on the Input 100Hz and 5-second levels, these data are given as one file per participating person. Users may choose small samples of the data for the populations they are interested in. An example file (DataLab Test File) for the Accelerometer 5-second, Quarter hour and Input 100Hz levels is available in the Data downloads section. More detailed data allow users to perform their own analysis of accelerometer data for specific research needs.

For all sub-person levels, data is provided for each person who participated in the accelerometer study regardless of whether they met the minimum wear time requirements for inclusion in the publication estimates. Wear time and imputation flags have been provided to enable users to restrict the data.

As discussed in the NATSINPAS methodology, for people living in remote areas, publication estimates were produced using the first four days of data; that is, the first three full 24 hour periods of wear, and the first and last partial 24 hour periods summed. Categories 1, 2, 3 and 7 of the data items “Midnight-midnight periods, first-last day summed” on the Accelerometer hour level and day level (variable names: MMORDDAY, MMORDER) can be used to identify these days (see table below). To meet the minimum wear criteria for inclusion in the publication estimates, there must have been at least 48 hours of wear time in this period.

'Midnight-midnight periods, first-last day summed' categories

Category number	Category description
1	First full 24 hour period
2	Second full 24 hour period
3	Third full 24 hour period
4	Fourth full 24 hour period
5	Fifth full 24 hour period
6	Sixth full 24 hour period
7	First and last partial 24 hour periods summed

For people living in non-remote areas, publication estimates were produced for all persons with at least 48 hours of wear time.

For sleep estimates, this population is as stated above with either:

The first four nights of data for persons living in remote areas having at least two nights of sleep data with >80% wear time, or
all nights of data for persons living in non-remote areas with >80% wear time.

NOTE: Example R and SAS code to handle the different wear times in remote and non-remote areas is available in the DataLab.
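
For users working in Python, the physical activity wear-time rule described above might be sketched as follows. This is not the ABS code referred to in the note: it assumes MMORDDAY is the day-level item, and WEAR_HOURS is a hypothetical name for a per-day wear-time item. The sleep criteria are not covered.

```python
REMOTE_CATEGORIES = {1, 2, 3, 7}  # first three full days plus summed partial days
MIN_WEAR_HOURS = 48

def publication_days(day_records, remote):
    """Return the day records in the estimation window, or [] if wear time is too low.

    day_records: list of dicts with MMORDDAY and a hypothetical WEAR_HOURS item.
    """
    if remote:
        window = [d for d in day_records if d["MMORDDAY"] in REMOTE_CATEGORIES]
    else:
        window = list(day_records)
    total_wear = sum(d["WEAR_HOURS"] for d in window)
    return window if total_wear >= MIN_WEAR_HOURS else []

# Synthetic person: seven day records with invented wear hours.
days = [
    {"MMORDDAY": category, "WEAR_HOURS": hours}
    for category, hours in [(1, 20), (2, 10), (3, 5), (4, 24), (5, 24), (6, 24), (7, 10)]
]

remote_days = publication_days(days, remote=True)        # 45 hours in the window
non_remote_days = publication_days(days, remote=False)   # 117 hours in total
print(len(remote_days), len(non_remote_days))
```

The same person can therefore be in scope under the non-remote rule but out of scope under the remote rule.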

All datasets, except the 5-second and Input 100Hz levels, include columns with record identifiers called ABSHIDD (household ID) and ABSPIDD (person ID), which allow merging. For the 5-second and Input 100Hz levels, the file names contain these identifiers. For more information, see the 5-second and Input 100Hz level sections below.

The following table shows some example use cases for each of the accelerometer levels in the DataLab:

Level	Example uses
Person level	For analysis of physical activity and sleep per week and split by weekday and weekend days. Also includes flags for wear time thresholds, imputation rates and time zone which can be transferred to other levels.
Day level	For analysis of physical activity and sleep by day of the week. This level includes a flag for wear day order (i.e. second 24 hour period, third 24 hour period).
Sleep level	For analysis of main sleep periods, physical activity during sleep periods, and sleep analysis by day of the week.
Hour level	For analysis of physical activity and sleep by ‘time-of-day’ and per day.
Quarter hour level	For analysis of acceleration, luminosity and temperature by time of day in 15-minute increments. Users may apply their own processing methodologies to this input data level, as it has minimal data processing applied. This level is the “long-epoch” dataset created by GGIR in Part 1. Use of this level will result in faster processing times compared to the 5-second and 100Hz levels.
5-second level	For analysis of acceleration in the 5-second (short epoch) time increments. For users who wish to analyse in GGIR with different acceleration thresholds or settings. This level can be used to run GGIR Parts 2-6 with the 5-second epoch set. Use of this semi-processed level will result in faster processing times compared to the 100Hz level.
Input 100Hz level	For detailed analysis of the x, y and z axis. For users who wish to use an analysis program other than GGIR, set their own epoch length or develop modelling algorithms related to accelerometry research.

Day and Sleep levels

The ‘Accelerometer – Day level’ dataset has seven records for each person who participated in the accelerometer study. Each record represents a full 24 hour period from midnight‑to‑midnight during the week the device was worn. Because respondents start and stop wearing the device at different times, the first and last partial days are combined into one complete 24 hour record.

Some sleep data also appears on the ‘Accelerometer – Day level’. However, this does not represent one sleep period because it adds up all the sleep that happened between midnight and midnight. For people who go to sleep before midnight and wake up the next morning, the ‘Day level’ dataset will only include the sleep before midnight (plus any sleep after midnight from the night before).

The ‘Accelerometer – Sleep level’ dataset contains one record for each main sleep period. A main sleep period is defined as the longest period of sustained inactivity (or lack of movement) between midday and midday. Most respondents have 6-7 sleep-level records. See ‘Measured physical activity and sleep (accelerometer)’ in the methodology.

Quarter hour level

The quarter-hour level dataset has up to 168 hours of data for each respondent who participated in the accelerometer study, broken into 15-minute blocks. It is provided as one SAS or CSV file, and all times are shown in the respondent’s local time.

An imputation flag is included to show where imputation was done for the higher-level datasets, but the quarter-hour data itself is not imputed. As data on this level do not have imputation, data cleaning or wear thresholds already applied, the outputs may not exactly match the NATSINPAS publication if different methods are used. Users will need to consider data processing methods when using these data.

5-second level

The 5-second level dataset includes calculated acceleration measures and timestamps (in the respondent’s local time) for every 5-second interval. Only respondents who participated in the accelerometer study are included. There is no imputation on this level, and it does not include any information about wear-time rules. As imputation and wear-time rules have not been applied to data on this level, the outputs may not exactly match the NATSINPAS publication if different methods are used. Users will need to consider data processing methods when using these data.

Each respondent has their own CSV file, named using the format:

“<ABSHIDD>-<ABSPIDD>.csv”

The file name can be used to merge on relevant demography data from higher level datasets.

Each CSV can have up to 120,960 rows of data (which is seven days of 5-second periods) but may have fewer rows if the device was not worn the whole time.
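
One way to recover the identifiers from a file name is sketched below. It assumes the identifiers themselves contain no hyphens or full stops, and the example identifier values are invented.

```python
from pathlib import Path

def ids_from_filename(filename):
    """Split a '<ABSHIDD>-<ABSPIDD>.csv' file name into its two identifiers."""
    name = Path(filename).name
    stem = name.split(".", 1)[0]  # drop the extension
    abshidd, sep, abspidd = stem.partition("-")
    if not sep or not abshidd or not abspidd:
        raise ValueError(f"unexpected file name: {filename!r}")
    return abshidd, abspidd

# Invented identifiers, for illustration only.
print(ids_from_filename("1234567-01.csv"))
```

The returned pair can then be used as a key to look up the matching person record on a higher level.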

100Hz level

The 100Hz level is the highest detail data available from the accelerometer study. It matches the device's measurement rate of 100 readings per second.

Each participating respondent has one compressed gzip (.gz) file. Inside this file is a CSV containing all their data. This gzip file works like a ZIP file but usually gives better compression. These CSV files can be opened in common software such as:

R (using gzfile())
Python (using the ‘gzip’ module)
De-compression tools like 7-Zip.

Each file includes:

acceleration in three axes (x, y and z)
time in milliseconds as an integer since the start of the UNIX epoch.

Similar to the 5-second level, the file name uses ABSHIDD (household ID) and ABSPIDD (person ID) so the data can be matched to other datasets. As data on this level do not have imputation, data cleaning or wear thresholds already applied, the outputs may not exactly match the NATSINPAS publication if different methods are used. Users will need to consider data processing methods when using these data.

Important note: Reading these files uses a lot of computing power. Users should choose a virtual machine that suits their analysis needs. See information about the virtual machine options at Using your workspace. It is strongly recommended not to decompress all files on DataLab unless necessary, because they take a long time to process and use a lot of storage. For example, a typical 7-day file contains up to 60,480,000 rows and is about 242MB when compressed or 2.07GB when uncompressed.
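
To inspect a file without decompressing it to disk, the rows can be streamed directly from the gzip archive. The sketch below reads only the first few rows using the Python standard library; the function name is illustrative, and no assumption is made about column order or whether a header row is present.

```python
import csv
import gzip
from itertools import islice

def read_first_rows(gz_path, n_rows):
    """Stream the first n_rows of a gzip-compressed CSV without unpacking it to disk."""
    with gzip.open(gz_path, mode="rt", newline="") as handle:
        # islice stops reading after n_rows, so the rest of the file is never decompressed.
        return list(islice(csv.reader(handle), n_rows))
```

Calling read_first_rows(path, 5) returns the first five rows as lists of strings, which is usually enough to confirm the layout before committing to a full read.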

Using accelerometer timestamps

The timestamps in the 100Hz level are 9-digit numbers, each representing the milliseconds elapsed since 1 January 1970 at 00:00:00 UTC.

These timestamps are designed so that when converted to the respondent’s local time, the day of the week and the clock time (for example, “Thursday” or “9:19am”) match the survey data. The timestamps do not include the real calendar date of when the data was collected. Month and season of interview, as well as each participant’s local time zone, are available on the ‘Person level’.

Example formulae for converting timestamps to AEST (UTC +10:00):

Microsoft Excel:

=A1 / 1000 / 86400 + DATE(1970,1,1) + TIME(10,0,0)

R (general):

as.POSIXct(timestamp / 1000, origin = "1970-01-01", tz = "Australia/Canberra")

R (GGIR):

Parameters: configtz = "UTC" and desiredtz = "Australia/Canberra"

Python (standard library; requires "from datetime import datetime" and "from zoneinfo import ZoneInfo"):

datetime.fromtimestamp(timestamp / 1000, tz=ZoneInfo("Australia/Canberra"))

SAS:

timestamp / 1000 + HMS(10,00,0) + MDY(1,1,1970) * 86400;
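
As a further illustration, the sketch below applies a fixed UTC+10:00 offset, in the same way as the Excel and SAS formulae, and extracts the day of the week and clock time. The function and constant names are illustrative only. For respondents in other time zones, the offset would need to be replaced with the time zone recorded on the ‘Person level’.

```python
from datetime import datetime, timedelta, timezone

AEST = timezone(timedelta(hours=10), name="AEST")  # fixed UTC+10:00, no daylight saving
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def to_aest(timestamp_ms):
    """Convert integer milliseconds since the UNIX epoch to an AEST datetime."""
    return (EPOCH + timedelta(milliseconds=timestamp_ms)).astimezone(AEST)

example = to_aest(0)                    # the epoch itself, shown in AEST
day_name = example.strftime("%A")       # day of the week
clock_time = example.strftime("%H:%M")  # 24-hour clock time
print(example.isoformat(), day_name, clock_time)
```

Only the day of the week and clock time are meaningful here, since the timestamps do not carry the real calendar date.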

Reliability of estimates

As the survey was conducted on a sample of private households in Australia, it is important to take account of the method of sample selection when deriving estimates from the microdata. This is important because a person's chance of selection in the survey varied depending on the state or territory in which the person lived. If these chances of selection are not accounted for by use of appropriate weights, the results could be biased.

Each household or person record has a main weight (FINHHWT or FINPERWT). This weight indicates how many population units are represented by the sample unit. When producing estimates of sub-populations from the microdata, it is essential that they are calculated by adding the weights of households or persons in each category and not just by counting the sample number in each category. If each household’s or person’s weight were to be ignored when analysing the data to draw inferences about the population, then no account would be taken of a household's or person’s chance of selection or of different response rates across population groups. This could result in estimates produced being biased. The application of weights ensures that estimates will conform to an independently estimated distribution of the population by age, by sex, etc., rather than to the distributions within the sample itself.
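
The difference between summing weights and counting sample records can be sketched as follows. The records and the `group` item are made up for illustration; only the weight name FINPERWT comes from the microdata.

```python
from collections import Counter, defaultdict

# Toy person records, invented for illustration only.
records = [
    {"group": "A", "FINPERWT": 120.0},
    {"group": "A", "FINPERWT": 80.0},
    {"group": "B", "FINPERWT": 300.0},
]


def weighted_totals(rows, category, weight="FINPERWT"):
    """Population estimate per category: the sum of the main weights."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[category]] += row[weight]
    return dict(totals)


def sample_counts(rows, category):
    """Unweighted number of sample records per category (not an estimate)."""
    return dict(Counter(row[category] for row in rows))


print(weighted_totals(records, "group"))  # {'A': 200.0, 'B': 300.0}
print(sample_counts(records, "group"))    # {'A': 2, 'B': 1}
```

In this toy data, group A has twice as many sample records as group B but represents fewer people once the weights are applied, which is why unweighted counts can mislead.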

It is also important to calculate a measure of sampling error for each estimate. Sampling error occurs because only part of the population is surveyed to represent the whole population. Sampling error should be considered when interpreting estimates as this gives an indication of accuracy. It reflects how much confidence can be placed in interpretations based on the estimate. Measures of sampling error include standard error (SE), relative standard error (RSE) and margin of error (MoE). These measures of sampling error can be estimated using the replicate weights. The replicate weight variables provided on the microdata are labelled WHM1XXX (household) and WPM1XXX (person), where XXX represents the number of the given replicate group. The exact number of replicates will vary depending on the survey. The NATSINPAS uses 250 replicate groups for both household and person weights, labelled WHM1001 to WHM1250 (household) and WPM1001 to WPM1250 (person).

Using replicate weights for estimating sampling error

Overview of replication methods

ABS household surveys employ complex sample designs and weighting which require special methods for estimating the variance of survey statistics. Variance estimators for a simple random sample are not appropriate for this survey microdata.

A class of techniques called 'replication methods' provide a general process for estimating variance for the types of complex sample designs and weighting procedures employed in ABS household surveys. The ABS uses a method called the Group Jackknife Replication Method.

A basic idea behind the replication approach is to split the sample into G replicate groups. One replicate group is then dropped from the file and a new set of weights is produced for the remaining sample. This is repeated for all G replicate groups to provide G sets of replicate weights. For each set of replicate weights, the statistic of interest is recalculated and the variance of the full sample statistic is estimated using the variability among the replicate statistics.

The statistics calculated from these replicates are called replicate estimates. Replicate weights provided on the microdata file enable the variance of survey statistics, such as means and medians, to be calculated relatively simply. Further technical explanation can be found in Section 4 of Research Paper: Weighting and Standard Error Estimation for ABS Household Surveys (Methodology Advisory Committee).

How to use replicate weights

To calculate the standard error of any statistic derived from the survey data, the method is as follows:

1. Calculate the estimate of the statistic of interest using the main weight.
2. Repeat the calculation above for each replicate weight, substituting the replicate weight for the main weight and creating G replicate estimates. In the example where there are 250 replicate weights, you will have 250 replicate estimates.
3. Use the outputs from steps 1 and 2 as inputs to the formula below to calculate the estimate of the Standard Error (SE) for the statistic of interest.

\[SE\left( y \right) = \sqrt {\;\left( {\frac{{G - 1}}{G}\;} \right)\;\;\;\sum\limits_{g = 1}^G {{{\left( {{y_{(g)}} - y} \right)}^2}} }\]

[equation 1]

$G$ = Number of replicate groups
$g$ = the replicate group number
$y_{\left(g\right)}$ = Replicate estimate for group g, i.e. the estimate of y calculated using the replicate weight for g
$y$ = the weighted estimate of y from the sample.

From the standard error you can then derive the following measures of sampling error: the relative standard error (RSE) and the margin of error (MoE) of the estimate.

\[Relative\ Standard\ Error \left(RSE\right)=\frac{SE}{Estimate}\]

[equation 2]

\[Margin\ of\ Error\left(MoE\right)=1.96 \times SE\]

[equation 3]
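
The three steps and equations 1 to 3 can be sketched in Python for a weighted mean. This is an illustrative sketch, not the ABS implementation: the toy records, the `VALUE` item and the function names are assumptions, and only three replicate weights are shown where the NATSINPAS provides 250.

```python
import math


def weighted_mean(rows, value, weight):
    """Weighted mean of `value` using the named weight variable."""
    total_weight = sum(row[weight] for row in rows)
    return sum(row[value] * row[weight] for row in rows) / total_weight


def jackknife_se(estimate, replicate_estimates):
    """Equation 1: SE from the main estimate and G replicate estimates."""
    g = len(replicate_estimates)
    squared = sum((rep - estimate) ** 2 for rep in replicate_estimates)
    return math.sqrt((g - 1) / g * squared)


def sampling_error(rows, value, main_weight, replicate_weights):
    """Steps 1 to 3, then RSE (as a %) and MoE from equations 2 and 3."""
    estimate = weighted_mean(rows, value, main_weight)            # step 1
    replicates = [weighted_mean(rows, value, name)                # step 2
                  for name in replicate_weights]
    se = jackknife_se(estimate, replicates)                       # step 3
    return {"estimate": estimate, "se": se,
            "rse_pct": se / estimate * 100, "moe": 1.96 * se}


# Toy data, invented for illustration: each record is its own replicate
# group, so its weight is zero in the replicate that drops it.
toy = [
    {"VALUE": 10, "FINPERWT": 100, "WPM1001": 0,   "WPM1002": 150, "WPM1003": 150},
    {"VALUE": 20, "FINPERWT": 100, "WPM1001": 150, "WPM1002": 0,   "WPM1003": 150},
    {"VALUE": 30, "FINPERWT": 100, "WPM1001": 150, "WPM1002": 150, "WPM1003": 0},
]
result = sampling_error(toy, "VALUE", "FINPERWT",
                        ["WPM1001", "WPM1002", "WPM1003"])
print(result)
```

For the full person-level file, the replicate weight names could be generated with `[f"WPM1{g:03d}" for g in range(1, 251)]`, which yields WPM1001 to WPM1250.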

An example in calculating the SE for an estimate of the mean

Suppose you are calculating the mean value of earnings, y, in a sample. Using the main weight produces an estimate of $500.

You have 5 sets of Group Jackknife replicate weights and using these weights (instead of the main weight) you calculate 5 replicate estimates of $510, $490, $505, $503, $498 respectively.

To calculate the standard error of the estimate you will substitute the following inputs into equation [1]:

$G$ = 5
$y$ = 500
$g$ = 1, $y_{\left(g\right)}$ = 510
$g$ = 2, $y_{\left(g\right)}$ = 490
…

\[\begin{align} SE(y) &= \sqrt {\frac{{5 - 1}}{5}\sum\limits_{g = 1}^5 {{{({y_{(g)}} - 500)}^2}} } \\ SE(y) &= \sqrt {\frac{4}{5}({{(510 - 500)}^2} + {{(490 - 500)}^2} + {{(505 - 500)}^2} + {{(503 - 500)}^2} + {{(498 - 500)}^2})} \\ SE(y) &= \sqrt {\frac{4}{5} \times 238} \\ SE\left( y \right) &= 13.8 \end{align}\]

To calculate the RSE you divide the SE by the estimate of y and multiply by 100 to get a percentage:

\[\begin{equation} \begin{split} RSE\left(y\right) &= \frac{13.8}{500} \times 100 \\ RSE\left(y\right) &= 2.8\% \end{split} \end{equation}\]

To calculate the margin of error you multiply the SE by 1.96:

\[\begin{equation} \begin{split} Margin\ of\ Error\left(y\right) &= 13.8\times1.96 \\ Margin\ of\ Error\left(y\right) &= 27.05 \end{split} \end{equation}\]
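
The arithmetic in this worked example can be checked with a few lines of Python. The variable names are illustrative; the inputs are the figures given in the example.

```python
import math

estimate = 500                            # mean earnings from the main weight
replicates = [510, 490, 505, 503, 498]    # the 5 replicate estimates

g = len(replicates)
sum_sq = sum((rep - estimate) ** 2 for rep in replicates)
se = math.sqrt((g - 1) / g * sum_sq)      # equation 1
rse_pct = se / estimate * 100             # equation 2, as a percentage
moe = 1.96 * se                           # equation 3

print(sum_sq, round(se, 1), round(rse_pct, 1), round(moe, 2))
# 238 13.8 2.8 27.05
```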

Confidentiality steps taken for the microdata

The microdata contains unit records relating to all of the survey respondents. It is released under the Census and Statistics Act 1905, which has provision for the release of data in the form of unit records where the information is not likely to enable the identification of a particular person or organisation.

The NATSINPAS 2023 Detailed Microdata is only accessible to approved data users from within the secure ABS environment, via DataLab.

The NATSINPAS 2023 Basic Microdata is available to approved data users to download and analyse in their own environment. Steps, including the following, have been taken to protect the confidentiality of respondents:

the level of detail of many data items has been reduced by grouping, ranging or top coding values
some unusual records have been changed to protect against identification
some data items that were collected have been excluded.

Given the nature of the changes made and the relatively small number of records involved, the effect on data for analysis purposes is considered negligible.

The changes mean that estimates produced from the basic microdata may differ from those published in National Aboriginal and Torres Strait Islander Nutrition and Physical Activity Survey, 2023 or subsequent publications.

Data downloads

Data item lists

Download all (2.39 MB)

Data files

NATSINPAS 2023 Detailed Microdata Data Item List.xlsx [1.32 MB]
NATSINPAS 2023 Basic Microdata Data Item List.xlsx [796.32 KB]
NATSINPAS 2012-13 Detailed Microdata Data Item List.xls [1.79 MB]
NATSINPAS 2012-13 Basic Microdata Data Item List.xls [1.62 MB]
NATSINPAS 2012-13 TableBuilder Data Item List.xls [2.05 MB]

DataLab Test file

Test file for accelerometer - contains one zip file with an example version of the Quarter hour, 5-second and 100Hz levels (250MB).