National Health Survey: Health Literacy methodology

Explanatory notes


1 This publication presents key indicators from the 2018 Health Literacy Survey (HLS), including information on:

  • nine domains of health literacy (such as how people find, understand and use health information, how they manage their health and how they interact with healthcare providers);
  • key health risk factors and health conditions; and
  • demographic and socioeconomic characteristics.

2 The HLS was conducted throughout Australia from January 2018 to August 2018.

Scope of the survey

3 The HLS was conducted with a sample drawn from respondents 18 years and over who had already participated in the 2017-18 National Health Survey (NHS) and agreed to be contacted for further ABS surveys. As such the HLS data was combined with that of the NHS, and information related to survey scope, coverage, data collection, input coding and data quality issues for both the NHS and HLS are included below where relevant.

4 Urban and rural areas in all states and territories were included, while Very Remote areas of Australia and discrete Aboriginal and Torres Strait Islander communities were excluded. These exclusions are unlikely to affect national estimates, and will only have a minor effect on aggregate estimates produced for individual states and territories, excepting the Northern Territory where the population living in Very Remote areas accounts for around 20.3% of persons.

5 Non-private dwellings such as hotels, motels, hospitals, nursing homes and short-stay caravan parks were excluded from the survey. This may affect estimates of the number of people with some long-term health conditions (for example, conditions which may require periods of hospitalisation or long term care).

6 The HLS was limited to adults aged 18 years and over.

7 The following groups were excluded from the survey:

  • certain diplomatic personnel of overseas governments, customarily excluded from the Census and estimated resident population;
  • persons whose usual place of residence was outside Australia;
  • members of non-Australian Defence forces (and their dependents) stationed in Australia; and
  • visitors to private dwellings.

Sample design

8 Dwellings for the NHS were selected at random using a multistage area sample of private dwellings. The initial sample selected for the survey consisted of approximately 25,109 dwellings. This was reduced to a sample of 21,544 after sample loss (for example, households selected in the survey which had no residents in scope of the survey, vacant or derelict buildings, buildings under construction). Of those remaining dwellings, 16,376 (or 76.0%) were fully or adequately responding, yielding a total sample for the survey of 21,315 persons.

Approached sample, final sample and response rates

NHS 2017-18 households in sample | 3,271 | 2,612 | 3,364 | 1,658 | 1,656 | 1,605 | 1,089 | 1,121 | Total 16,376
Households approached for HLS that agreed to be contacted (after sample loss) | 1,303 | 1,150 | 1,334 | 824 | 765 | 716 | 434 | 698 | Total 7,224
Fully responding HLS sample | 1,021 | 924 | 1,047 | 650 | 601 | 592 | 348 | 608 | Total 5,790
Response rate for HLS only (%) | 78.4 | 80.3 | 78.5 | 78.9 | 72.9 | 82.7 | 80.2 | 87.1 | Total 80.1
Response rate of total NHS sample (%) | 31.2 | 35.3 | 31.1 | 39.2 | 36.3 | 36.9 | 31.9 | 54.2 | Total 35.3


9 The sample for the HLS was taken from the total initial sample of 16,376 fully or adequately responding households enumerated for the NHS. The actual HLS sample approached was 7,224 households. Of these households in the actual sample, 5,790 (80.1%) were fully responding households. This represents an overall response rate of 35.3% when measured against the total initial sample of respondents to the NHS.

Data collection

11 The HLS was conducted by trained ABS interviewers over the telephone using Computer Assisted Telephone Interviewing (CATI). Information gathered through the 2017-18 NHS was collected via personal interviews with selected residents in sampled dwellings. One adult (aged 18 years and over) in each dwelling was selected and interviewed about their own health characteristics as well as information about the household (for example, income of other household members).

Weighting, benchmarking and estimation

12 Weighting is a process of adjusting results from a sample survey to infer results for the in-scope total population. To do this, a weight is allocated to each sample unit; for example, a household or a person. The weight is a value which indicates how many population units are represented by the sample unit. The file contains weights for both the 2017-18 NHS and the HLS. When analysing information from the HLS, the HLS weights must be used.

13 The first step in calculating NHS weights for each person was to assign an initial weight, which was equal to the inverse of the probability of being selected in the survey. For example, if the probability of a person being selected in the survey was 1 in 600, then the person would have an initial weight of 600 (that is, they represent 600 people in the population). An adjustment was then made to these initial weights to account for the time period in which a person was assigned to be enumerated.
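The initial weight described above can be illustrated with a minimal Python sketch. The function name `initial_weight` is hypothetical, and the sketch covers only the inverse-probability step, not the later time-period adjustment or calibration.

```python
def initial_weight(selection_probability: float) -> float:
    """Return the initial weight as the inverse of the selection probability."""
    if not 0 < selection_probability <= 1:
        raise ValueError("selection probability must be in (0, 1]")
    return 1 / selection_probability


# A person selected with probability 1 in 600 starts with a weight of about 600.
weight = initial_weight(1 / 600)
print(round(weight))
```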

14 The weights are calibrated to align with independent estimates of the population of interest, referred to as 'benchmarks', in designated categories of sex by age by area of usual residence. Weights calibrated against population benchmarks in this way compensate for over or under-enumeration of particular categories of persons and ensure that the survey estimates conform to the independently estimated distribution of the population by age, sex and area of usual residence, rather than to the distribution within the sample itself. 

15 The NHS was benchmarked to the estimated resident population living in private dwellings in non-Very Remote areas of Australia at 31 December 2017. Excluded from these benchmarks were persons living in discrete Aboriginal and Torres Strait Islander communities. The benchmarks, and hence the estimates from the survey, do not (and are not intended to) match estimates of the total Australian resident population (which include persons living in Very Remote areas or in non-private dwellings, such as hotels) obtained from other sources.

16 HLS data was re-weighted at the person-level for the population aged 18 years and over. The HLS weights were calibrated to the NHS person-level weights for some key variables (i.e. collapsed highest level of education; one-digit country of birth; current daily smoker status; self-assessed health; collapsed disability status; heart/stroke/vascular disease status; and overweight/obese status).

17 Survey estimates of counts of persons are obtained by summing the weights of persons with the characteristic of interest. Estimates of non-person counts (for example, number of health conditions) are obtained by multiplying the characteristic of interest by the weight of the reporting person and aggregating.
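The two estimation rules in the preceding paragraph can be sketched in Python as follows. The `Respondent` record, its field names and the three sample rows are invented for illustration and do not come from the survey file.

```python
from dataclasses import dataclass


@dataclass
class Respondent:
    weight: float        # number of in-scope people this respondent represents
    daily_smoker: bool   # hypothetical characteristic of interest
    n_conditions: int    # hypothetical count of long-term health conditions


def estimate_persons(sample, has_characteristic):
    """Person count: sum the weights of respondents with the characteristic."""
    return sum(r.weight for r in sample if has_characteristic(r))


def estimate_non_person_count(sample, count_of):
    """Non-person count: multiply each reported count by the weight, then sum."""
    return sum(r.weight * count_of(r) for r in sample)


# Illustrative sample only; weights and values are made up.
sample = [
    Respondent(weight=600.0, daily_smoker=True, n_conditions=2),
    Respondent(weight=450.0, daily_smoker=False, n_conditions=0),
    Respondent(weight=800.0, daily_smoker=True, n_conditions=1),
]

smokers = estimate_persons(sample, lambda r: r.daily_smoker)
conditions = estimate_non_person_count(sample, lambda r: r.n_conditions)
print(smokers, conditions)
```

With these made-up rows, the person estimate is 600 + 800 and the condition estimate is 600 × 2 + 450 × 0 + 800 × 1.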

18 In addition to weighted estimates, this release also includes weighted mean Health Literacy Scores. This summary statistic is expressed as the mean of the values for each health literacy domain, falling within a range of 1-4 or 1-5 depending on the domain.
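A weighted mean score of this kind can be sketched as below. This assumes the conventional weighted-mean formula (sum of weight times score, divided by the sum of weights); the function name, scores and weights are illustrative only.

```python
def weighted_mean(scores, weights):
    """Return the weighted mean of domain scores using person weights."""
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(s * w for s, w in zip(scores, weights)) / total_weight


# Made-up scores on a 1-4 domain for three respondents and their weights.
domain_scores = [3.0, 2.5, 4.0]
person_weights = [600.0, 450.0, 800.0]
print(round(weighted_mean(domain_scores, person_weights), 2))
```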

Reliability of estimates

19 All sample surveys are subject to sampling and non-sampling error.

20 Sampling error is the difference between estimates, derived from a sample of persons, and the value that would have been produced if all persons in scope of the survey had been included. Indications of the level of sampling error for estimates are given by the Relative Standard Error (RSE) and 95% Margin of Error (MoE). For more information refer to the Technical Note - Reliability of Estimates.

21 In this publication, estimates with an RSE of 25% to 50% are preceded by an asterisk (e.g. *3.4) to indicate that the estimate has a high level of sampling error relative to the size of the estimate, and should be used with caution. Estimates with an RSE over 50% are indicated by a double asterisk (e.g. **0.6) and are generally considered too unreliable for most purposes.
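The annotation rule in the preceding paragraph can be expressed as a small Python sketch. The function name is hypothetical, and the thresholds follow the wording of this paragraph (single asterisk from 25% up to and including 50%, double asterisk above 50%); treatment of an RSE of exactly 50% is an assumption.

```python
def annotate_estimate(estimate: float, rse_percent: float) -> str:
    """Prefix an estimate with asterisks according to its RSE."""
    if rse_percent > 50:
        prefix = "**"   # generally considered too unreliable for most purposes
    elif rse_percent >= 25:
        prefix = "*"    # use with caution
    else:
        prefix = ""
    return f"{prefix}{estimate}"


print(annotate_estimate(3.4, 30.0))
print(annotate_estimate(0.6, 62.0))
print(annotate_estimate(12.1, 10.0))
```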

22 Margins of Error are provided for proportions to assist users in assessing the reliability of these data. Estimates of proportions with an MoE more than 10% are annotated to indicate they are subject to high sample variability and particular consideration should be given to the MoE when using these estimates. Depending on how the estimate is to be used, an MoE greater than 10% may be considered too large to inform decisions. In addition, estimates with a corresponding standard 95% confidence interval that includes 0% or 100% are annotated with a # to indicate that they are usually considered unreliable for most purposes.
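The two checks described in the preceding paragraph can be sketched as follows. The function and key names are hypothetical, and the sketch assumes that both the proportion and its MoE are expressed in percentage points.

```python
def proportion_flags(proportion_pct: float, moe_pct: float) -> dict:
    """Flag a proportion with a high MoE or a 95% interval touching 0% or 100%."""
    lower = proportion_pct - moe_pct
    upper = proportion_pct + moe_pct
    return {
        "high_moe": moe_pct > 10,                        # high sample variability
        "interval_includes_bound": lower <= 0 or upper >= 100,  # annotated with '#'
    }


print(proportion_flags(45.0, 12.0))
print(proportion_flags(3.0, 4.0))
```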

23 Non-sampling error may occur in any data collection, whether it is based on a sample or a full count such as a census. Non-sampling errors occur when survey processes work less effectively than intended. Sources of non-sampling error include non-response, errors in reporting by respondents or in recording of answers by interviewers, and errors in coding and processing data.

23 Non-response occurs when people are unable to or do not cooperate, or cannot be contacted. Non-response can affect the reliability of results and can introduce a bias. The magnitude of any bias depends on the rate of non-response and the extent of the difference between the characteristics of those people who responded to the survey and those who did not.

24 The following methods were adopted to reduce the level and impact of non-response for the 2017-18 NHS:

  • face-to-face interviews with respondents;
  • the use of proxy interviews in cases where language difficulties were encountered, noting the interpreter was typically a family member;
  • follow-up of respondents if there was initially no response; and
  • weighting to population benchmarks to reduce non-response bias.

25 To reduce the level and impact of non-response for the HLS, respondents were followed up via telephone on multiple occasions if there was initially no response.

Interpretation of results

26 Care has been taken to ensure that results are as accurate as possible. This includes thorough design and testing of the questionnaire, interviews being conducted by trained ABS interviewers, and quality control procedures throughout data collection, processing and output. There remain, however, other factors which may have affected the reliability of results, and for which no specific adjustments can be made. The following factors should be considered when interpreting these estimates:

  • Information recorded in the survey is essentially 'as reported' by respondents, and hence may differ from information available from other sources or collected using different methodology; for example, information about health conditions is self-reported and, while not directly based on a diagnosis by a medical practitioner in the survey, respondents were asked whether they had ever been told by a doctor or nurse that they had a particular health condition. Conditions which have a greater effect on people's well-being or lifestyle, or those specifically mentioned in survey questions, are expected in general to have been better reported than others; and
  • Some respondents may have provided responses that they felt were expected, rather than those that accurately reflected their own situation. Every effort has been made to minimise such bias through the development and use of appropriate survey methodology.

27 For reporting purposes, the HLS response category of 'difficult' combines three separate response categories contained in the Health Literacy Questionnaire (HLQ): 'sometimes difficult', 'usually difficult' and 'cannot do or always difficult'. This applies to domains 6 to 9 only. For domains 1 to 5, the 'strongly disagree' and 'disagree' categories from the HLQ have been combined and are referred to as 'strongly disagree/disagree'.


28 In the HLS, respondents were told that the term 'healthcare providers' encompassed doctors, nurses, physiotherapists, dieticians and any other health workers that respondents seek advice or treatment from.

29 Long-term health conditions reported by respondents in the NHS are presented using a classification originally developed for the 2001 NHS by the Family Medicine Research Centre, University of Sydney, in conjunction with the ABS. The classification is based on the 10th revision of the International Classification of Diseases (ICD) and is used for all years from 2001 to 2017-18.

30 Country of birth is classified to the Standard Australian Classification of Countries (cat. no. 1269.0).

31 Main language spoken at home is classified according to the Australian Standard Classification of Languages (cat. no. 1267.0).

32 Descriptions of data items such as Body Mass Index and the Kessler Psychological Distress Scale (K10) are included in the Glossary to this publication.


33 The Census and Statistics Act, 1905 provides the authority for the ABS to collect statistical information, and requires that statistical output shall not be published or disseminated in a manner that is likely to enable the identification of a particular person or organisation. This requirement means that the ABS must take care and make assurances that any statistical information about individual respondents cannot be derived from published data.

34 To minimise the risk of identifying individuals in aggregate statistics, a technique known as perturbation is used to randomly adjust cell values. Perturbation involves a small random adjustment of the statistics and is considered the most satisfactory technique for avoiding the release of identifiable statistics while maximising the range of information that can be released. These adjustments have a negligible impact on the underlying pattern of the statistics. After perturbation, a given published cell value will be consistent across all tables. However, adding up cell values to derive a total will not necessarily give the same result as published totals. 

35 Perturbation has been applied to the estimates in this release. Perturbation has not been applied to the mean Health Literacy Scores.


36 Estimates presented in this publication have been rounded. 

37 Proportions presented in this publication are based on unrounded estimates. Calculations using rounded estimates may differ from those published.


38 ABS publications draw extensively on information provided freely by individuals, businesses, governments and other organisations. Their continued cooperation is very much appreciated; without it, the wide range of statistics published by the ABS would not be available. Information received by the ABS is treated in strict confidence as required by the Census and Statistics Act, 1905.

Products and services

39 Summary results from the HLS are available in spreadsheet form from the 'Data downloads' section in this release. The statistics presented are only a selection of the information collected.

40 For users who wish to undertake more detailed analysis, a TableBuilder product for the 2017-18 NHS, which includes HLS data, will be available on 30 April 2019. TableBuilder is an online tool for creating tables from ABS survey data, where variables can be selected for cross-tabulation. It has been developed to complement the existing suite of ABS microdata products and services including Census TableBuilder and CURFs. Further information about ABS microdata, including conditions of use, is available via the Microdata section on the ABS website.

41 Customised tabulations are available on request. Subject to confidentiality and sampling variability constraints, tabulations can be produced from the survey incorporating data items, populations and geographic areas selected to meet individual requirements.

Related publications

42 Current publications and other products released by the ABS are listed on the ABS website. The ABS also issues a daily Release Advice on the website which details products to be released in the week ahead.

Technical note - reliability of estimates

1 Two types of error are possible in an estimate based on a sample survey: sampling error and non-sampling error. The sampling error is a measure of the variability that occurs by chance because a sample, rather than the entire population, is surveyed. Since the estimates in this publication are based on information obtained from occupants of a sample of dwellings they are subject to sampling variability; that is, they may differ from the figures that would have been produced if all dwellings had been included in the survey. One measure of the likely difference is given by the standard error (SE). There are about two chances in three that a sample estimate will differ by less than one SE from the figure that would have been obtained if all dwellings had been included, and about 19 chances in 20 that the difference will be less than two SEs. 

2 Another measure of the likely difference is the relative standard error (RSE), which is obtained by expressing the SE as a percentage of the estimate. The RSE is a useful measure in that it provides an immediate indication of the percentage errors likely to have occurred due to sampling, and thus avoids the need to refer also to the size of the estimate.

\(RSE\% = \left(\frac{SE}{\text{Estimate}}\right) \times 100\)
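As an illustration of this formula, the following Python sketch computes an RSE from an SE and an estimate. The function name and the example values are illustrative only.

```python
def rse_percent(se: float, estimate: float) -> float:
    """Return the relative standard error: the SE as a percentage of the estimate."""
    if estimate == 0:
        raise ValueError("RSE is undefined for an estimate of zero")
    return 100 * se / estimate


# Made-up example: an estimate of 1,000 persons with an SE of 50 persons.
print(rse_percent(50.0, 1000.0))
```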

3 RSEs for published estimates are supplied in Excel data tables, available via the Data downloads section.

4 The smaller the estimate, the higher the RSE. Very small estimates are subject to such high SEs (relative to the size of the estimate) as to detract seriously from their value for most reasonable uses. In the tables in this publication, only estimates with RSEs less than 25% are considered sufficiently reliable for most purposes. However, estimates with larger RSEs, between 25% and less than 50%, have been included and are preceded by an asterisk (e.g. *3.4) to indicate they are subject to high SEs and should be used with caution. Estimates with RSEs of 50% or more are preceded by a double asterisk (e.g. **0.6). Such estimates are considered unreliable for most purposes.

5 The imprecision due to sampling variability, which is measured by the SE, should not be confused with inaccuracies that may occur because of imperfections in reporting by interviewers and respondents and errors made in coding and processing of data. Inaccuracies of this kind are referred to as the non-sampling error, and they may occur in any enumeration, whether it be in a full count or only a sample. In practice, the potential for non-sampling error adds to the uncertainty of the estimates caused by sampling variability. However, it is not possible to quantify the non-sampling error.

Standard errors of proportions and percentages

6 Proportions and percentages formed from the ratio of two estimates are also subject to sampling errors. The size of the error depends on the accuracy of both the numerator and the denominator. For proportions where the denominator is an estimate of the number of persons in a group and the numerator is the number of persons in a sub-group of the denominator group, the formula to approximate the RSE is given below. The formula is only valid when x is a subset of y.

\(RSE\left(\frac{x}{y}\right)=\sqrt{[RSE(x)]^{2}-[RSE(y)]^{2}}\)
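The approximation for the RSE of a proportion, the square root of the squared RSE of the numerator minus the squared RSE of the denominator, can be sketched in Python as follows. The function name and the example RSEs are illustrative, and the sketch assumes, as the formula requires, that x is a subset of y.

```python
import math


def rse_of_proportion(rse_x: float, rse_y: float) -> float:
    """Approximate RSE(x/y) from RSE(x) and RSE(y), where x is a subset of y."""
    difference = rse_x ** 2 - rse_y ** 2
    if difference < 0:
        # For a genuine subset, RSE(x) is normally at least as large as RSE(y).
        raise ValueError("RSE(x) is smaller than RSE(y); the approximation does not apply")
    return math.sqrt(difference)


# Made-up example: RSE of 5% for the sub-group and 3% for the whole group.
print(rse_of_proportion(5.0, 3.0))
```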

Comparison of estimates

7 Published estimates may also be used to calculate the difference between two survey estimates. Such an estimate is subject to sampling error. The sampling error of the difference between two estimates depends on their SEs and the relationship (correlation) between them. An approximate SE of the difference between two estimates (x-y) may be calculated by the following formula:

\(SE(x-y)=\sqrt{[SE(x)]^{2}+[SE(y)]^{2}}\)

8 While the above formula will be exact only for differences between separate and uncorrelated (unrelated) characteristics of sub-populations, it is expected that it will provide a reasonable approximation for all differences likely to be of interest in this publication.
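The approximate SE of a difference can be computed as in the sketch below. It assumes the two estimates are uncorrelated, as the approximation itself does; the function name and the example SEs are illustrative.

```python
import math


def se_of_difference(se_x: float, se_y: float) -> float:
    """Approximate SE(x - y) as the square root of SE(x)^2 + SE(y)^2."""
    return math.hypot(se_x, se_y)


# Made-up example: two estimates with SEs of 3 and 4.
print(se_of_difference(3.0, 4.0))
```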

9 Another measure is the Margin of Error (MOE), which describes the distance from the population value that the sample estimate is likely to be within, and is specified at a given level of confidence. Confidence levels typically used are 90%, 95% and 99%. For example, at the 95% confidence level the MOE indicates that there are about 19 chances in 20 that the estimate will differ by less than the specified MOE from the population value (the figure obtained if all dwellings had been enumerated). The 95% MOE is calculated as 1.96 multiplied by the SE.

10 The 95% MOE can also be calculated from the RSE by:

\(\large MOE(y) \approx \frac{RSE(y) \times y}{100} \times 1.96\)
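The calculation of a 95% MOE from an RSE can be sketched as follows. The function name and the example values are illustrative only.

```python
def moe_95_from_rse(estimate: float, rse_percent: float) -> float:
    """Return the approximate 95% MOE: recover the SE from the RSE, then multiply by 1.96."""
    se = rse_percent * estimate / 100
    return 1.96 * se


# Made-up example: an estimate of 40.0 with an RSE of 5% has an SE of 2.0.
print(moe_95_from_rse(40.0, 5.0))
```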

11 The MOEs in this publication are calculated at the 95% confidence level. This can easily be converted to a 90% confidence level by multiplying the MOE by:

\(\frac{1.645}{1.96}\)

or to a 99% confidence level by multiplying by a factor of:

\(\frac{2.576}{1.96}\)

12 A confidence interval expresses the sampling error as a range in which the population value is expected to lie at a given level of confidence. The confidence interval can easily be constructed from the MOE of the same level of confidence by taking the estimate plus or minus the MOE of the estimate.
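The following Python sketch shows one way to build a confidence interval from an MOE and to rescale a 95% MOE to another confidence level. The multipliers 1.645, 1.96 and 2.576 are the conventional standard-normal values and are an assumption here; the function names and example figures are illustrative.

```python
# Conventional standard-normal multipliers for common confidence levels (assumed).
Z_VALUES = {90: 1.645, 95: 1.96, 99: 2.576}


def convert_moe(moe_95: float, level: int) -> float:
    """Rescale a 95% MOE to the requested confidence level (90, 95 or 99)."""
    return moe_95 * Z_VALUES[level] / Z_VALUES[95]


def confidence_interval(estimate: float, moe: float) -> tuple:
    """Return (lower, upper) as the estimate minus and plus the MOE."""
    return (estimate - moe, estimate + moe)


# Made-up example: an estimate of 40.0 with a 95% MOE of 3.92.
moe_90 = convert_moe(3.92, 90)
lower, upper = confidence_interval(40.0, 3.92)
print(round(moe_90, 2), round(lower, 2), round(upper, 2))
```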

Significance testing

13 For comparing estimates between surveys or between populations within a survey it is useful to determine whether apparent differences are 'real' differences between the corresponding population characteristics or simply the product of differences between the survey samples. One way to examine this is to determine whether the difference between the estimates is statistically significant. This is done by calculating the standard error of the difference between two estimates (x and y) and using that to calculate the test statistic using the formula below:

\(\large\frac{|x-y|}{SE(x-y)}\)

where

\(\large SE(y) \approx \frac{RSE(y) \times y}{100}\)

14 If the value of the statistic is greater than 1.96 then we may say there is good evidence of a statistically significant difference at the 95% confidence level between the two populations with respect to that characteristic. Otherwise, it cannot be stated with confidence that there is a real difference between the populations.
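The test described above can be sketched in Python, starting from two published estimates and their RSEs. The function name and the example values are illustrative, and the sketch assumes the two estimates are uncorrelated.

```python
import math


def significance_test(x: float, rse_x: float, y: float, rse_y: float,
                      critical_value: float = 1.96):
    """Return (test statistic, significant?) for the difference between x and y."""
    se_x = rse_x * x / 100                      # SE recovered from the RSE
    se_y = rse_y * y / 100
    se_diff = math.sqrt(se_x ** 2 + se_y ** 2)  # approximate SE of the difference
    statistic = abs(x - y) / se_diff
    return statistic, statistic > critical_value


# Made-up examples: the first difference is significant at 95%, the second is not.
print(significance_test(40.0, 5.0, 32.0, 6.25))
print(significance_test(40.0, 5.0, 38.0, 5.0))
```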

