Microdata: Smoker Status

Presents pooled data about smoking from multiple household surveys including the National Health Survey

Release date and time

05/12/2022 11:30am AEDT

Introduction

In 2021–22, the National Health Survey (NHS), National Study of Mental Health and Wellbeing (NSMHW), Survey of Income and Housing (SIH), and Survey of Disability and Carers (SDAC) collected a standard set of information that was pooled to produce the Smoker Status dataset.

The Smoker Status dataset provides data on the prevalence of smoking in Australia and can be cross classified by selected demographic and socio-economic characteristics.

Smoking data has previously been pooled for the 2020-21 and 2017-18 financial years. This data can also be accessed using DataLab; however, due to changes in data sources and collection methodologies, comparisons over time should be made with caution.

Using DataLab

The DataLab environment allows real time access to detailed microdata from the Smoker Status pooled dataset.

DataLab is an interactive data analysis solution that allows users to run advanced statistical analyses, such as multiple regression and structural equation modelling. The DataLab environment contains up-to-date versions of SPSS, Stata, SAS and R analytical languages. Controls have been put in place in DataLab to protect against the identification of individuals and organisations. All output from DataLab sessions is cleared by an ABS officer before it is released.

For more information, including prerequisites for DataLab access, please see the About DataLab page.

File Structure

The following table shows the levels available on the DataLab product and the information contained on those levels:

Level name	Information contained on level
1. Household	Geographic classifications, household size and structure, and dwelling characteristics.
2. Selected person	Demographic and socio-economic characteristics of survey respondents, as well as information about smoking.

The following table shows the hierarchical file structure and the relationship between each level:

Level 1	Level 2	Relationship type
Household		One record per in scope household.
	Selected Person	Selected persons in household aged 15 years and over.

Counts and weights

The following table shows the number of records on each level and the weighted counts for the 2017-18, 2020-21, and 2021-22 Smoker Status datasets. This data includes persons aged 15 years and over.

Survey period	Level	Record counts (unweighted)	Weighted counts
2021-22	Household	18,723	9,846,992
	Selected person	26,156	20,398,949
2020-21	Household	30,564	9,782,954
	Selected person	42,117	20,285,817
2017-18	Household	30,787	9,268,534
	Selected person	44,901	19,502,432
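
As a rough illustration of how the two count columns relate, dividing a weighted count by its unweighted record count gives the average number of households or persons that each sample record represents. The sketch below uses the 2021-22 figures from the table above; the function name is illustrative only and is not part of any ABS product.

```python
def average_weight(weighted_count: int, record_count: int) -> float:
    """Average number of population units represented by one sample record."""
    if record_count <= 0:
        raise ValueError("record_count must be positive")
    return weighted_count / record_count


# 2021-22 figures taken from the counts table.
persons_2021_22 = average_weight(20_398_949, 26_156)
households_2021_22 = average_weight(9_846_992, 18_723)

print(f"Average person weight, 2021-22: {persons_2021_22:.1f}")
print(f"Average household weight, 2021-22: {households_2021_22:.1f}")
```

Individual weights on the file will vary around these averages, so this is a sanity check on the totals rather than a substitute for the weight variables themselves.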

There are two weight variables on each Smoker Status dataset. For the 2020-21 and 2021-22 datasets, these are:

Household weight (HSPDHHWT) - benchmarked to produce household estimates.
Person weight (HSPDSPWT) - benchmarked to the total population aged 15 years and over.

For the 2017-18 dataset, the weight variables are:

Household weight (NHIFHHWT) - benchmarked to produce household estimates.
Person weight (NHIFINWT) - benchmarked to the total population aged 15 years and over.
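
To illustrate how the person weight might be applied, the following is a minimal sketch of a weighted prevalence calculation. It assumes person records have been loaded as dictionaries; the weight variable names come from the lists above, while the field name is_current_smoker and the sample values are hypothetical and are not taken from the data item lists.

```python
from typing import Iterable, Mapping


def weighted_prevalence(
    records: Iterable[Mapping[str, float]],
    flag_field: str,
    weight_field: str = "HSPDSPWT",
) -> float:
    """Weighted share of records where flag_field is truthy."""
    total_weight = 0.0
    flagged_weight = 0.0
    for record in records:
        weight = record[weight_field]
        total_weight += weight
        if record[flag_field]:
            flagged_weight += weight
    if total_weight == 0:
        raise ValueError("weights sum to zero")
    return flagged_weight / total_weight


# Made-up person records; 'is_current_smoker' is a hypothetical field name.
sample_persons = [
    {"HSPDSPWT": 800.0, "is_current_smoker": 1},
    {"HSPDSPWT": 600.0, "is_current_smoker": 0},
    {"HSPDSPWT": 600.0, "is_current_smoker": 0},
]

prevalence = weighted_prevalence(sample_persons, "is_current_smoker")
print(f"Weighted prevalence: {prevalence:.1%}")
```

For the 2017-18 dataset, the same function could be called with weight_field="NHIFINWT". Household-level estimates would use the household weight on household records instead.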

File content

Available data items

Data items include:

Demographics, such as Age, Sex, Country of Birth, Main language spoken, Marital status
Household details, such as Type, Size, Household composition, Tenure, SEIFA, Geography
Labour force status
Educational attainment
Self-assessed health status
Visa status
Current smoker status.

Each survey reference period has a data item list available in the Data downloads section. Each list provides information on all available data items and categories for that reference period. There may be different data items available for each period. Please refer to each list to confirm it meets your requirements before purchasing the DataLab product.

Identifiers

Every record on each level of the file has a unique identifier. These identifiers, ABSHIDD and ABSPID, appear on both levels of the file.

Each household has a unique fifteen-digit random identifier, ABSHIDD. This identifier appears on the household level and is repeated on every record pertaining to that household on each other level. The combination of identifiers uniquely identifies a record at a particular level as shown below:

Household = ABSHIDD
Selected person = ABSHIDD + ABSPID

The household record identifier, ABSHIDD, assists with linking people from the same household, and with linking household characteristics such as geography (located on the household level) to the person records.
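
The linking described above can be sketched as a simple lookup on ABSHIDD. This is a minimal example under stated assumptions: records are held as dictionaries, the identifier values are made up, and the field names state and age are hypothetical stand-ins for household-level and person-level data items.

```python
from collections import Counter


def link_persons_to_households(households, persons):
    """Attach household-level fields to each person record via ABSHIDD."""
    # Household key: ABSHIDD must be unique on the household level.
    household_by_id = {}
    for household in households:
        hid = household["ABSHIDD"]
        if hid in household_by_id:
            raise ValueError(f"duplicate household identifier: {hid}")
        household_by_id[hid] = household

    # Person key: the (ABSHIDD, ABSPID) pair must be unique on the person level.
    key_counts = Counter((p["ABSHIDD"], p["ABSPID"]) for p in persons)
    duplicates = [key for key, n in key_counts.items() if n > 1]
    if duplicates:
        raise ValueError(f"duplicate person keys: {duplicates}")

    # Person fields take precedence if a field name appears on both levels.
    return [{**household_by_id[p["ABSHIDD"]], **p} for p in persons]


# Made-up records; 'state' and 'age' are hypothetical field names.
households = [
    {"ABSHIDD": "000000000000001", "state": "NSW"},
    {"ABSHIDD": "000000000000002", "state": "VIC"},
]
persons = [
    {"ABSHIDD": "000000000000001", "ABSPID": "01", "age": 34},
    {"ABSHIDD": "000000000000001", "ABSPID": "02", "age": 36},
    {"ABSHIDD": "000000000000002", "ABSPID": "01", "age": 58},
]

linked = link_persons_to_households(households, persons)
for row in linked:
    print(row["ABSHIDD"], row["ABSPID"], row["state"], row["age"])
```

A person record whose ABSHIDD has no match on the household level raises a KeyError in this sketch, which makes broken links visible rather than silently dropping records.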

Historical comparability

Pooled smoking data was previously released for the 2020-21 and 2017-18 financial years. In 2020-21, the pooled dataset was created using sample from the NHS, General Social Survey (GSS), SIH, Time Use Survey (TUS) and the NSMHW. In 2017-18, the pooled dataset was created using sample from the NHS and SIH only.

While similar in content, each pooled dataset has different data sources and collection methodologies for the financial year, and comparisons over time should be made with caution. In particular, the 2020-21 surveys were conducted during the peak of the COVID-19 pandemic, with the majority of interviews (64%) collected via online, self-complete forms. There were significant impacts on response rates and sample representativeness because interviewer follow-up of non-responding households was not possible. The 2020-21 pooled smoking data is considered a break in series, and reflects the specific time point only.

For more information, see the methodology for each dataset. Links to these datasets can be found in Further Information.

Data downloads

Data item list

Data files

Smoker Status, Australia, 2021-22 Data Items.xlsx [115.82 KB]
Smoker Status, Australia, 2020-21 Data Items.xlsx [127.14 KB]
Smoker Status, Australia, 2017-18 Data Items.xlsx [106.71 KB]

Further Information

See Insights into Australian smokers, 2021-22 for summary results and methodology information.

See Pandemic insights into current Australian smokers, 2020-21 and Smoking, 2017-18 for summary results and methodology information for previous releases.

Previous catalogue number

This release previously used catalogue number 4324.0.55.004.
