Microdata: National Study of Mental Health and Wellbeing

Presents microdata from the National Study of Mental Health and Wellbeing for key mental health statistics including prevalence of mental disorders.


The National Study of Mental Health and Wellbeing (NSMHW) is a survey conducted on an irregular basis and designed to provide a range of information about the mental health of Australians. It provides information on the prevalence of selected lifetime and 12-month mental disorders, by the major disorder groups:

  • Anxiety disorders (e.g., Social Phobia)
  • Affective disorders (e.g., Depression)
  • Substance Use disorders (e.g., Alcohol Harmful Use).

It also provides information on the level of impairment, and health services used for mental health problems.

This information can be cross classified by selected demographic and socioeconomic characteristics.

This product provides information about the 2020-21 and 2007 NSMHW cycles. It includes details about data files, Data Item Lists, and information about the survey methodology. A link to microdata for the 1997 release is also provided.

Due to changes in survey content and the application of diagnostic criteria for mental health disorders, some data are not comparable between collections. For more information, see Comparison between 2020-21, 2007 and 1997 below and 2020-21 Methodology information.

Available products

The following microdata products are available from this survey:

  • Basic microdata – approved users can download and analyse unit record data in their own environment. This product is available for the 1997 and 2007 NSMHW. It is not available for the 2020-21 NSMHW.
  • Detailed microdata - approved users can access a remote desktop environment in DataLab for in-depth and interactive data analysis using a range of statistical software packages. This product is available for the 2007 and 2020-21 NSMHW.

To apply for access, see Microdata Entry Page.

Before you apply for access, read Responsible Use of ABS Microdata, User Guide.

File structure

Estimates from the 2020-21 NSMHW are available at two levels contained in separate data files: Household and Selected Person.

A complete list of data items can be accessed from the Data Item List in the Data downloads section. This contains details for each data item including the output categories and any special codes used.

2020-21 and 2007 NSMHW file structure

The following table shows the levels available in the microdata product and the information contained on those levels:

Level name: information contained on level

  1. Household: Geographic classifications, household size and structure, dwelling characteristics and household income details.
  2. Selected Person: Demographic and socioeconomic characteristics of survey respondents, as well as health, mental health and related information provided by respondents.

The following table shows the hierarchical file structure and the relationship between each level:

  • Level 1, Household: one record per in-scope household
  • Level 2, Selected Persons: one selected person record per household

Counts and Weights

Number of records by level, NSMHW 2020-21


Record Counts (Unweighted)

Weighted Counts




Person (Selected persons)




Number of records by level, NSMHW 2007

Record Counts (Unweighted)

Weighted Counts




Person (Selected persons)



Weight variables

For NSMHW, there are two weight variables on the file:

  • Household Weight (FINWTH) - Household level – Benchmarked
  • Person Weight (FINWTP) - Selected Person level - Benchmarked to population of persons 16-85 years

Using weights

The NSMHW is a sample survey, so to produce estimates for the in-scope population you must use weight fields in your calculations. When analysing a Household level item, you will need to use the household weight. When analysing a Selected Person level item, you will need to use the person weight.
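As a minimal sketch of how the weights enter an estimate, the Python below sums person weights rather than counting records. The records and weight values are invented for illustration, and the item name ANXIETY_12M is hypothetical; only FINWTP is taken from the file description above.

```python
# Illustrative person-level records; the values are invented, not survey data.
# FINWTP is the person weight; ANXIETY_12M is a hypothetical 0/1 indicator.
persons = [
    {"FINWTP": 1200.0, "ANXIETY_12M": 1},
    {"FINWTP": 800.0, "ANXIETY_12M": 0},
    {"FINWTP": 1500.0, "ANXIETY_12M": 1},
    {"FINWTP": 500.0, "ANXIETY_12M": 0},
]


def weighted_total(records, weight_field, predicate=lambda r: True):
    """Sum the weights of the records that satisfy the predicate."""
    return sum(r[weight_field] for r in records if predicate(r))


# Estimated in-scope population: the sum of all person weights.
population = weighted_total(persons, "FINWTP")

# Estimated number of persons with the characteristic.
with_condition = weighted_total(
    persons, "FINWTP", lambda r: r["ANXIETY_12M"] == 1
)

# Weighted proportion versus the unweighted sample proportion.
weighted_share = with_condition / population
unweighted_share = sum(r["ANXIETY_12M"] for r in persons) / len(persons)
```

In this toy data the weighted and unweighted proportions differ, which is the reason the weight field has to be carried through every calculation. A Household level item would be handled the same way with FINWTH.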

File content

Available data items

Data items for 2020-21 include:

  • Demographics including age, sex, gender, variations of sex characteristics, sexual orientation, country of birth, main language spoken, and marital status
  • Household details including household composition, tenure type, landlord type, number of bedrooms, and household income
  • Socio-economic characteristics of people including labour force status, educational attainment, and personal income
  • General health and wellbeing including self-assessed health status, psychological distress, smoking, long term health conditions, social connectedness, and functioning
  • Mental health including depression, mania, panic, social phobia, agoraphobia, generalised anxiety, substance use, obsessive-compulsive disorder, and post-traumatic stress disorder
  • Suicidality
  • Self-harm
  • Disordered eating
  • Use of health and social support services

The Data Item Lists contain a full list of available data items and categories for the 2020-21 and 2007 surveys.


Every record on each level of the file is uniquely identified. See Data Item Lists for details on which ID equates to which level.

Each household has a unique random identifier, ABSHID. This identifier appears on the household level and is repeated on the selected person level. The combination of identifiers uniquely identifies a record at a particular level as shown below.

  1. Household = ABSHID
  2. Person = ABSHID + ABSPID

The household record identifier, ABSHID, assists with linking people from the same household and with attaching household characteristics, such as geography (located on the household level), to the Person records.
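The linkage can be sketched in Python as a lookup on ABSHID. This is an illustration only: the identifier values are invented, and STATE and AGE are hypothetical item names standing in for whatever items the Data Item List provides on each level.

```python
# Illustrative records; identifiers and values are invented.
# ABSHID and ABSPID are named in the text; STATE and AGE are hypothetical items.
households = [
    {"ABSHID": "H001", "STATE": "NSW"},
    {"ABSHID": "H002", "STATE": "VIC"},
]
persons = [
    {"ABSHID": "H001", "ABSPID": "01", "AGE": 34},
    {"ABSHID": "H002", "ABSPID": "01", "AGE": 58},
]

# Index the household level by its unique key.
household_by_id = {h["ABSHID"]: h for h in households}

# Attach household items to each person record via ABSHID.
linked = [{**household_by_id[p["ABSHID"]], **p} for p in persons]

# ABSHID + ABSPID should be unique at the person level.
person_keys = [(p["ABSHID"], p["ABSPID"]) for p in linked]
assert len(person_keys) == len(set(person_keys))
```

After linking, person-level analysis should still use the person weight, since the unit of analysis remains the selected person.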

Multi-response items

Several questions in the survey allowed respondents to provide one or more responses. Each response category for these multi-response data items is treated as a separate data item. In the detailed microdata, these data items share the same identifier (SAS name) prefix but are each separately suffixed with a letter - A for the first response, B for the second response, C for the third response and so on. Where there are more than 26 categories, the next suffix after Z is 0, then 1, 2, etc.

For example, the multi-response data item 'Long-term Health Condition' has thirteen response categories (including 'No long-term health conditions'). There are thirteen data items named LTHCONDA, LTHCONDB, LTHCONDC...LTHCONDM. Each data item in the series will have either a positive response code or a null response code, with the exception of the first item in the series, LTHCONDA. LTHCONDA has the following potential response codes: the positive response code 10 - 'Arthritis', the null response code 0, and the additional response codes 98 - 'Not known' and 99 - 'Refused'. The remaining items, LTHCONDB to LTHCONDM, have just two response codes each. The Data Item List identifies all multi-response items and lists the response category that corresponds to each code.
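The suffix scheme can be sketched as a small Python helper that builds the item names for a given prefix. The function name is hypothetical, and the text does not say how the series continues after the digit 9, so this sketch rejects anything beyond 36 categories rather than guessing.

```python
import string


def multi_response_names(prefix, n_categories):
    """Return item names for a multi-response item with n_categories categories.

    Suffixes run A to Z, then 0, 1, 2, ... as described in the text. How the
    series continues past the digit 9 is not stated, so that case is rejected.
    """
    suffixes = string.ascii_uppercase + string.digits  # 'A'..'Z' then '0'..'9'
    if not 1 <= n_categories <= len(suffixes):
        raise ValueError("n_categories must be between 1 and 36")
    return [prefix + s for s in suffixes[:n_categories]]


# Thirteen categories give LTHCONDA through LTHCONDM.
lthcond = multi_response_names("LTHCOND", 13)
```

A list built this way is convenient for selecting every item in a series at once, although the Data Item List remains the authority on which names exist.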

Note that the sum of individual multi-response categories will be greater than the population applicable to a particular data item as respondents can select more than one response.
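A small Python illustration of this point follows, using invented records for a hypothetical multi-response item with the prefix COND. The coding is an assumption made for the sketch: 0 is the null response and any other code is a positive response.

```python
# Illustrative records for a hypothetical multi-response item with prefix COND.
# Assumed coding for this sketch: 0 is the null response, any other code is
# a positive response. All values are invented.
persons = [
    {"FINWTP": 1000.0, "CONDA": 10, "CONDB": 11, "CONDC": 0},
    {"FINWTP": 2000.0, "CONDA": 0, "CONDB": 11, "CONDC": 12},
    {"FINWTP": 1500.0, "CONDA": 0, "CONDB": 0, "CONDC": 12},
]
items = ["CONDA", "CONDB", "CONDC"]

# Weighted count for each response category.
category_totals = {
    item: sum(p["FINWTP"] for p in persons if p[item] != 0) for item in items
}

# Weighted count of persons with at least one positive response.
applicable_population = sum(
    p["FINWTP"] for p in persons if any(p[item] != 0 for item in items)
)

# Persons with several responses are counted once per category here.
sum_of_categories = sum(category_totals.values())
```

Here the category totals add to 7,500 while the applicable population is 4,500, because two of the three persons gave more than one response.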

Reliability of estimates

As the survey was conducted on a sample of private households in Australia, it is important to take account of the method of sample selection when deriving estimates from the detailed microdata. This is particularly important as a person's chance of selection in the survey varied depending on the state or territory in which the person lived. If these chances of selection are not accounted for by use of appropriate weights, the results will be biased.

Each person record has a main weight (FINWTP). This weight indicates how many population units are represented by the sample units. When producing estimates of sub-populations from the detailed microdata, it is essential that they are calculated by adding the weights of persons in each category and not just by counting the sample number in each category. If each person's weight were to be ignored when analysing the data to draw inferences about the population, then no account would be taken of a person's chance of selection or of different response rates across population groups, with the result that the estimates produced could be biased. The application of weights ensures that estimates will conform to an independently estimated distribution of the population by age, by sex, etc. rather than to the distributions within the sample itself.

Each person record on the detailed microdata contains 30 replicate weights in addition to the main weight. Replicate weights can be used to calculate measures of sampling error.
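The sketch below shows one common way replicate weights are used, in Python. It assumes a delete-a-group jackknife, which this page does not state, so the 2020-21 Methodology information should be checked before relying on the formula. The field name REPWT, the item HAS_ITEM and all values are invented, and only three replicates are shown instead of 30 to keep the example short.

```python
import math


def jackknife_se(main_estimate, replicate_estimates):
    """Standard error under an assumed delete-a-group jackknife.

    SE = sqrt((G - 1) / G * sum((replicate - main) ** 2)),
    where G is the number of replicate estimates.
    """
    g = len(replicate_estimates)
    squared_diffs = sum((rep - main_estimate) ** 2 for rep in replicate_estimates)
    return math.sqrt((g - 1) / g * squared_diffs)


# Illustrative records with a main weight and three replicate weights
# (the file has 30). The replicate field name and all values are invented.
persons = [
    {"FINWTP": 1000.0, "REPWT": [1100.0, 900.0, 1000.0], "HAS_ITEM": 1},
    {"FINWTP": 2000.0, "REPWT": [1900.0, 2100.0, 2000.0], "HAS_ITEM": 0},
    {"FINWTP": 1500.0, "REPWT": [1500.0, 1400.0, 1600.0], "HAS_ITEM": 1},
]


def estimate(weight_of):
    """Weighted total of persons with HAS_ITEM == 1 under the chosen weight."""
    return sum(weight_of(p) for p in persons if p["HAS_ITEM"] == 1)


# Compute the estimate once with the main weight, then once per replicate.
main = estimate(lambda p: p["FINWTP"])
replicates = [estimate(lambda p, i=i: p["REPWT"][i]) for i in range(3)]

standard_error = jackknife_se(main, replicates)
relative_standard_error = standard_error / main
```

The same pattern applies to proportions or means: compute the statistic once per replicate weight, then measure the spread of the replicate estimates around the main estimate.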

Continuous items

Some continuous data items are allocated special codes for certain responses (e.g., 9997 = 'Not applicable'). Any special codes for continuous (summation) data items are listed in the Data Item List and will be found in the categorical version of the continuous item.
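As a brief Python illustration of why special codes matter, the sketch below removes them before summarising a continuous item. The values are invented and the item is hypothetical; 9997 is the 'Not applicable' code used as the example above, and the actual codes for each item should be taken from the Data Item List. Weights are left out to keep the example short.

```python
# Illustrative values of a hypothetical continuous item; 9997 is the
# 'Not applicable' code used as an example in the text.
SPECIAL_CODES = {9997}
values = [12, 40, 9997, 25, 9997, 3]

# Drop special codes so they are not treated as real quantities.
valid = [v for v in values if v not in SPECIAL_CODES]

# The naive mean is inflated by the special codes; the valid mean is not.
naive_mean = sum(values) / len(values)
valid_mean = sum(valid) / len(valid)
```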

Comparison between 2020-21, 2007, and 1997

Data from the 2020-21 survey has been released as the National Study of Mental Health and Wellbeing. The ABS also conducted this survey in 2007 and 1997. Data from the 2007 survey was released as the National Survey of Mental Health and Wellbeing and data from the 1997 survey was released as the National Survey of Mental Health and Wellbeing of Adults.

Comparison between 2020-21 and 2007

The 2020-21 NSMHW was designed to be broadly comparable with the 2007 survey. It used the WMH-CIDI 3.0 questionnaire modules used in 2007 and collected them in the same order as they were collected in 2007. Data collected using the WMH-CIDI 3.0 modules are therefore comparable between 2020-21 and 2007.

Many of the non-diagnostic topics, and the order in which they were collected, differ between the 2020-21 and 2007 surveys. Some topics collected in the 2007 survey were removed and new topics were added. Other topics changed significantly between 2020-21 and 2007. For example, demographic and socio-economic modules were updated to align with current ABS standards and commonly used ABS questions and data items. Data for non-diagnostic topics may not be comparable between 2020-21 and 2007.

Please see the Data Item Lists for each collection for full details.

Due to the change in questions used to collect physical health conditions in the 2020-21 survey, the comorbidity of mental health disorders and physical health conditions is not comparable with 2007.

The diagnoses of mental disorders are based on the WMH-CIDI 3.0 algorithms. The algorithms operationalise criteria from two classification systems: the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV); and the WHO International Classification of Diseases, Tenth Revision (ICD-10).

The version of the algorithms used for the 2020-21 survey was provided by the WHO in 2020. The algorithms are comparable with the version used for the 2007 survey with the following exceptions:

ICD-10 Post-Traumatic Stress Disorder (PTSD):

  • ICD criterion B Part 2 has been updated: Group 2 reactions (unwanted memories, unpleasant dreams, flashbacks, getting very upset when reminded of it, physical reactions) must have occurred at least once a month. The version of the diagnostic algorithms used for the 2007 survey did not include the once-a-month persistence criterion.
  • ICD criterion D Part 2 has been updated: Persistent symptoms of increased psychological sensitivity and arousal shown by any two of the following: difficulty in falling or staying asleep, irritability or outbursts of anger, difficulty in concentrating, hypervigilance, exaggerated startle response; not present before exposure to the stressor, and must have occurred at least once a month. The version of the diagnostic algorithms used for the 2007 survey did not include the once-a-month persistence criterion.
  • Lifetime and 12-month prevalence data items for ICD-10 PTSD are therefore not comparable between 2020-21 and 2007.

ICD-10 Obsessive-Compulsive Disorder (OCD):

  • For an ICD-10 lifetime diagnosis of OCD, obsessions and/or compulsions must be present on most days for at least two weeks. In 2007, the 12-month diagnosis was derived from the lifetime diagnosis, with the added criterion that disorder symptoms must have been present on most days for at least two weeks in the 12 months prior to the survey interview. The version of the algorithms used for the 2020-21 survey did not include the two-week persistence criterion as a condition for meeting the 12-month diagnosis. The 12-month diagnosis in 2020-21 is derived from the lifetime OCD diagnosis together with the presence of OCD symptoms, for any duration, in the past 12 months.
  • 12-month prevalence data items for ICD-10 OCD are therefore not comparable between 2020-21 and 2007.

Both Post-Traumatic Stress Disorder and Obsessive-Compulsive Disorder are classified as Anxiety disorders. Consequently, the ICD-10 lifetime and 12-month Anxiety disorders data items and the ICD-10 lifetime and 12-month Mental disorders data items are also not comparable between 2020-21 and 2007.

2007 re-derived data items:

To enable comparison between 2020-21 and 2007, selected ICD-10 Post-Traumatic Stress Disorder, ICD-10 Obsessive-Compulsive Disorder, ICD-10 Anxiety disorders, and ICD-10 Mental disorders data items, as well as the associated ICD-10 comorbidity and severity data items, have been re-derived using the 2020 definitions. These have been added to the 2007 detailed microdata file (see the Re-derived Items tab of the Data Item List). Estimates produced using these items will not match those included in the National Survey of Mental Health and Wellbeing, 2007 Summary of Results.

Comparison between 2007 and 1997

The 2007 survey was designed to provide national estimates that can be compared internationally, rather than to provide comparisons with the 1997 survey. Due to differences in how the data were collected, care should be exercised when comparing data items from the 1997 survey with the 2007 survey. Particular attention should be given to the definition of the data item, the population, and the reference period that applies (e.g., 12-month versus lifetime). Differences between the two surveys are too substantial to list individually, but include changes to questions and topics, concepts, survey methodology, classifications, and measurements.

Detailed information on the differences between the two surveys is provided in the National Survey of Mental Health and Wellbeing: Users' Guide, 2007.

Data downloads

Data Item Lists

Data files

Previous releases





National Survey of Mental Health and Wellbeing, 2007

Basic microdata

Detailed microdata

Mental Health and Wellbeing of Adults, 1997

Basic microdata


Further information
