10. INTERPRETATION OF RESULTS
For a number of survey data items, some respondents were unwilling or unable to provide the required information. Where responses for a particular data item were missing for a person or household they were recorded in a 'not known' or 'not stated' category for that data item. These 'not known' or 'not stated' categories are not shown in the publication tables, but have been included in the totals.
This chapter explores the reliability of the survey estimates; sampling and non-sampling error; and comparisons of this survey to other data sources.
RELIABILITY OF ESTIMATES
The response rate for the 2007 National Survey of Mental Health and Wellbeing (SMHWB) was 60% nationally. However, the survey sample was designed based on an assumed response rate of 75%. As this rate was not achieved, a smaller sample was available for estimation. More information on 'Sample design' is provided in Chapter 2. The following table provides the differences in response rates for each state/territory based on the achieved sample. It should be noted that 117 records were deleted during processing due to serious inconsistencies in the data, which were not able to be corrected. For example, a person other than the selected person undertook the survey interview and there was an interviewer note to this effect. The exclusion of these records gives an achieved sample of 8,841 fully/adequately responding households.
Sample survey errors
Two types of error are possible in estimates based on a sample survey:
Sampling error occurs because only a small proportion of the total population is used to produce estimates that represent the whole population. Sampling error can be reliably measured as it is calculated based on the scientific methods used to design surveys.
Non-sampling error may occur in any data collection, whether it is based on a sample or a full-count (eg Census). Non-sampling error may occur at any stage throughout the survey process.
Examples of non-sampling error include:
Sampling and non-sampling errors should be considered when interpreting results of the survey. Sampling errors are considered to occur randomly, whereas non-sampling errors may occur randomly and/or systematically.
Sampling error is the expected difference that could occur between the published estimates, derived from repeated random samples of persons, and the value that would have been produced if all persons in scope of the survey had been included.
The magnitude of the sampling error associated with an estimate depends on the following factors:
For more information on 'Sample design' see Chapter 2.
Measures of sampling error
A measure of the sampling error for a given estimate is provided by the Standard Error (SE), which is the extent to which an estimate might have varied by chance because only a sample of persons was obtained. There are about two chances in three that a sample estimate will differ by less than one SE from the figure that would have been obtained if all persons had been included in the survey, and about 19 chances in 20 that the difference will be less than two SEs.
Another measure is Relative Standard Error (RSE), which is the SE expressed as a percentage of the estimate. The RSE is a useful measure as it provides an immediate indication of the percentage errors likely to have occurred due to sampling, and therefore avoids the need to refer also to the size of the estimate.
The smaller the estimate, the higher the RSE. Very small estimates are subject to such high SEs (relative to the size of the estimate) as to detract seriously from their value for most reasonable uses. Only estimates with RSEs of less than 25% are considered sufficiently reliable for most purposes.
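As a minimal sketch of the measures described above, the RSE, an approximate interval of two SEs either side of the estimate, and the 25% rule of thumb could be computed as follows. The function names and the example figures are illustrative assumptions, not values from the survey.

```python
def rse_percent(estimate, se):
    """Relative Standard Error: the SE expressed as a percentage of the estimate."""
    return 100.0 * se / estimate


def approx_95_interval(estimate, se):
    """About 19 chances in 20 that the full-count figure lies within two SEs."""
    return (estimate - 2 * se, estimate + 2 * se)


def sufficiently_reliable(estimate, se, threshold=25.0):
    """Apply the 'RSE of less than 25%' rule of thumb."""
    return rse_percent(estimate, se) < threshold


# Hypothetical figures, for illustration only.
estimate, se = 120_000, 18_000
print(rse_percent(estimate, se))            # 15.0
print(approx_95_interval(estimate, se))     # (84000, 156000)
print(sufficiently_reliable(estimate, se))  # True
```

With the same SE, a smaller estimate gives a higher RSE: an estimate of 60,000 with an SE of 18,000 has an RSE of 30% and would fall outside the 25% guideline.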
RSEs for all estimates published in the National Survey of Mental Health and Wellbeing: Summary of Results, 2007 (cat. no. 4326.0) are available from the ABS website <www.abs.gov.au> in spreadsheet format.
Imprecision due to sampling variability, which is measured by the SE, should not be confused with inaccuracies that may occur because of imperfections in reporting by respondents and interviewers, or random errors made in coding and processing of survey data. These types of inaccuracies contribute to the total non-sampling error and may occur in any enumeration. The potential for random non-sampling error adds to the uncertainty of the estimates caused by sampling variability. However, it is not usually possible to quantify either the random or systematic non-sampling errors.
Standard errors of proportions and percentages
Proportions and percentages formed from the ratio of two estimates are subject to sampling errors. The size of the error depends on the accuracy of both the numerator and the denominator.
The RSEs of proportions and percentages for the National Survey of Mental Health and Wellbeing: Summary of Results, 2007 (cat. no. 4326.0) are calculated using the full delete-a-group jackknife technique, which is described in the following segment (see 'Replicate weights and directly calculated standard errors'). RSEs for all estimates in the summary publication are available in spreadsheet format from the ABS website <www.abs.gov.au>.
For proportions where the denominator is an estimate of the number of persons in a group and the numerator is the number of persons in a sub-group of the denominator group, a formula to approximate the RSE of the proportion x/y is given by:
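A commonly used form of this approximation, stated here as an assumption that is consistent with the remark that follows, is:

```latex
\mathrm{RSE}\!\left(\frac{x}{y}\right) \approx \sqrt{[\mathrm{RSE}(x)]^{2} - [\mathrm{RSE}(y)]^{2}}
```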
From the above formula, the estimated RSE of the proportion or percentage will be lower than the RSE of the estimate of the numerator.
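To illustrate this property, the sketch below assumes the approximation takes the usual form RSE(x/y) = sqrt(RSE(x)^2 - RSE(y)^2). The function name and the input RSEs are hypothetical; the published RSEs for this survey were calculated directly using the jackknife technique rather than with this approximation.

```python
import math


def rse_of_proportion(rse_numerator, rse_denominator):
    """Approximate RSE (%) of x/y, where x counts a sub-group of the group counted by y.

    Assumes RSE(x/y) ~ sqrt(RSE(x)^2 - RSE(y)^2), which needs RSE(x) >= RSE(y).
    """
    if rse_numerator < rse_denominator:
        raise ValueError("approximation requires RSE(x) >= RSE(y)")
    return math.sqrt(rse_numerator ** 2 - rse_denominator ** 2)


# Hypothetical RSEs: 10% for the sub-group estimate, 6% for the group estimate.
print(rse_of_proportion(10.0, 6.0))  # 8.0
```

The result (8%) is lower than the RSE of the numerator (10%), as expected.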
Replicate weights and directly calculated standard errors
Standard Errors (SEs) on estimates from this survey were obtained through the delete-a-group jackknife variance technique. In this technique, the full sample is repeatedly subsampled by successively dropping households from different groups of clusters of households and then the remaining records are re-weighted to the survey benchmark population. Through this technique, the effect of the complex survey design and estimation methodology on the accuracy of the survey estimates is stored in the replicate weights. For the 2007 SMHWB, this process was repeated 60 times to produce 60 replicate weights for each sample unit. The spread of the 60 replicate estimates around the full sample estimate is then used to directly calculate the SE for each full sample estimate.
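A minimal sketch of how an SE could be calculated directly from replicate weights is given below. It assumes the standard delete-a-group jackknife variance formula with G replicate groups, SE = sqrt((G - 1)/G x sum of squared differences between each replicate estimate and the full sample estimate). The function name, the toy data and the use of three groups rather than 60 are illustrative only; Appendix 2 remains the reference for the method actually applied.

```python
import math


def jackknife_se(values, full_weights, replicate_weights):
    """Return (full sample estimate, SE) for a weighted total.

    values            -- one value per sample unit (eg 1 if the person has the
                         characteristic of interest, else 0)
    full_weights      -- full sample weight for each unit
    replicate_weights -- G lists, each holding one replicate weight per unit
    """
    full_estimate = sum(v * w for v, w in zip(values, full_weights))
    g = len(replicate_weights)
    squared_devs = 0.0
    for rep in replicate_weights:
        rep_estimate = sum(v * w for v, w in zip(values, rep))
        squared_devs += (rep_estimate - full_estimate) ** 2
    # Assumed delete-a-group jackknife factor (G - 1) / G.
    return full_estimate, math.sqrt((g - 1) * squared_devs / g)


# Toy data: three units, three replicate groups (the survey used 60).
values = [1, 0, 1]
full_weights = [10, 10, 10]
replicate_weights = [[0, 15, 15], [15, 0, 15], [15, 15, 0]]
print(jackknife_se(values, full_weights, replicate_weights))  # (20, 10.0)
```

In each replicate, one group is dropped (weight 0) and the remaining units are re-weighted so that the weights still sum to the same population total.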
The use of directly calculated SEs for each survey estimate, rather than SEs based on models, provides more information on the sampling variability inherent in a particular estimate. Consequently, estimates of the same magnitude that are based on different sample units will generally have different directly calculated SEs.
For more information on the replicate weights technique see Appendix 2.
Comparison of estimates
Published estimates may also be used to calculate the difference between two survey estimates. Such an estimate is subject to sampling error. The sampling error of the difference between two estimates depends on their Standard Errors (SEs) and the relationship (correlation) between them. An approximate SE of the difference between two estimates (x-y) may be calculated by the following formula:
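The usual form of this approximation, given here as an assumption that matches the treatment of uncorrelated estimates described next, is:

```latex
\mathrm{SE}(x - y) \approx \sqrt{[\mathrm{SE}(x)]^{2} + [\mathrm{SE}(y)]^{2}}
```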
While the above formula will be exact only for differences between separate and uncorrelated (unrelated) characteristics of sub-populations, it is expected that it will provide a reasonable approximation for all differences likely to be of interest in this survey.
For comparing estimates between surveys or between populations within a survey it is useful to determine whether apparent differences are 'real' differences between the corresponding population characteristics or simply the product of differences between the survey samples. One way to examine this is to determine whether the difference between the estimates is statistically significant. This is done by calculating the SE of the difference between two estimates (x and y) and using that to calculate the test statistic using the formula below:
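A standard form of this test statistic, stated here as an assumption, is the absolute difference between the two estimates divided by the SE of that difference:

```latex
\frac{|x - y|}{\mathrm{SE}(x - y)}
```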
If the value of the test statistic is greater than 1.96, the difference between the two estimates is statistically significant at the 95% confidence level. That is, it is unlikely to be simply the product of differences between the survey samples.
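The comparison can be sketched as follows, assuming the SE of the difference is approximated by sqrt(SE(x)^2 + SE(y)^2) and the test statistic is |x - y| / SE(x - y). The function names and the example percentages are hypothetical and are not survey results.

```python
import math


def se_of_difference(se_x, se_y):
    """Approximate SE of (x - y), treating the two estimates as uncorrelated."""
    return math.sqrt(se_x ** 2 + se_y ** 2)


def difference_statistic(x, y, se_x, se_y):
    """Absolute difference between the estimates divided by the SE of the difference."""
    return abs(x - y) / se_of_difference(se_x, se_y)


def significantly_different(x, y, se_x, se_y, critical_value=1.96):
    """True when the difference is significant at the 95% level."""
    return difference_statistic(x, y, se_x, se_y) > critical_value


# Hypothetical percentages with their SEs, for illustration only.
print(se_of_difference(3.0, 4.0))                     # 5.0
print(difference_statistic(30.0, 18.0, 3.0, 4.0))     # 2.4
print(significantly_different(30.0, 18.0, 3.0, 4.0))  # True
```

If the two estimates are positively correlated, this approximation overstates the SE of the difference, so the test is conservative in that case.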
The estimates from the survey are based on information collected from August to December 2007, with enumeration being completed prior to Christmas. Due to seasonal effects the data may not be fully representative of other time periods in the year. For example, the survey included questions on labour force status to determine whether a person was employed. Employment is subject to seasonal variation throughout the year. Therefore, the survey results for employment could have differed if the survey had been conducted over the whole year or in a different part of the year.
Every effort was made to minimise non-sampling error by careful design and testing of questionnaires, intensive training of interviewers, and extensive editing and quality control procedures at all stages of data processing. However, errors can be made in giving and recording information during an interview and these may occur regardless of whether the estimates are derived from a sample or from a full count (eg Census). Inaccuracies of this type are referred to as non-sampling errors. The major sources of non-sampling error are:
These sources of random and/or systematic non-sampling error are discussed in more detail in the following segments.
Errors related to the survey scope
Some dwellings may have been inadvertently included or excluded, for example where it was unclear whether the dwelling was private or non-private. In order to prevent this type of error, dwelling listings are constantly updated. Additionally, some people may have been inadvertently included or excluded because of difficulties in applying the scope rules, for example in identifying a household's usual residents or in the treatment of some overseas visitors. For more information on 'Scope and coverage' see Chapter 2.
In this survey response errors may have arisen from three main sources:
Errors may have been caused by misleading or ambiguous questions, inadequate or inconsistent definitions of terminology, or by poor overall survey design (eg context effects, where responses to a question are directly influenced by the preceding question/s). In order to overcome these types of issues, individual questions and the overall questionnaire were tested before the survey was enumerated. Testing included:
More information on pre- and field testing is provided in Chapter 2.
As a result of testing, modifications were made to:
In considering modifications it was sometimes necessary to balance better response to a particular item/topic against increased interview time, effects on other parts of the survey and the need to minimise changes to ensure international comparability. Therefore, in some instances it was necessary to adopt a workable/acceptable approach rather than an optimum approach. Although changes would have had the effect of minimising response errors due to questionnaire design and content issues, some will have inevitably occurred in the final survey enumeration.
Response errors may also have occurred due to the length of the survey interview (on average 90 minutes) because of interviewer and/or respondent fatigue (ie loss of concentration). While efforts were made to minimise errors arising from deliberate misreporting or non-reporting, some instances will have inevitably occurred.
Inaccurate recall may also have led to response error, particularly in relation to the lifetime questions. Information in this survey is essentially 'as reported', and therefore may differ from information available from other sources or collected using different methodologies. Responses may be affected by imperfect recall or individual interpretation of survey questions. The focus of this survey is on lifetime mental disorders and people who experienced symptoms in the 12 months prior to interview. The reference periods of the questions reflect this emphasis. The questionnaire was designed to strike a balance between minimising recall errors and ensuring the data were meaningful, representative (from both respondent and data use perspectives) and would yield sufficient observations to support reliable estimates. It is possible that the reference periods did not suit every person for every topic, and that difficulty with recall may have led to inaccurate reporting in some instances.
A further source of response error is lack of uniformity in interviewing standards. To ensure uniform interviewing practices and a high level of response accuracy, extensive interviewer training was provided. An advantage of using Computer Assisted Interviewing (CAI) technology to conduct survey interviews is that it potentially reduces non-sampling error. More information on interviews, interviewer training, the survey instrument and CAI is provided in Chapter 2.
Some respondents may have provided responses that they felt were expected, rather than those that accurately reflected their own situation. Every effort has been made to minimise such bias through the development and use of culturally appropriate survey methodology. Non-uniformity of interviewers themselves is also a potential source of error, in that the impression made upon respondents by personal characteristics of individual interviewers such as age, sex, appearance and manner, may influence the answers obtained.
Errors in processing
Errors may occur during data processing, between the initial collection of the data and final compilation of statistics. These may be due to a failure of computer editing programs to detect errors in the data, or may arise during the manipulation of raw data to produce the final survey data files, for example in the course of deriving new data items from raw survey data (eg coding), during the estimation procedures or when weighting the data file.
To minimise the likelihood of these errors occurring a number of processes were used, including:
Non-response may occur when people cannot or will not participate in the survey, or cannot be contacted during the period of enumeration. Unit and item non-response by persons/households selected in the survey can affect both sampling and non-sampling error. The loss of information on persons and/or households (unit non-response) and on particular questions (item non-response) reduces the effective sample and increases sampling error.
Non-response can also introduce systematic non-sampling error by creating a biased sample. The magnitude of any non-response bias depends on the level of non-response and the extent of the difference between the characteristics of those people who responded to the survey and those who did not, as well as the extent to which non-response adjustments can be made during estimation through the use of benchmarks.
To reduce the level and impact of non-response, the following methods were adopted in this survey:
Of the dwellings selected for the 2007 SMHWB, 5,851 (40%) did not respond fully or adequately. Reflecting the sensitive topic of the survey, the average expected interview length (approximately 90 minutes) and the voluntary nature of the survey, about three in five (61%) of these dwellings were full refusals. In more than a quarter (27%) of these dwellings, household details were provided but the selected person did not complete the main questionnaire. The remainder of these dwellings (12%) provided partial or incomplete information.
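The figures quoted in this chapter can be cross-checked with a short calculation. The sketch below assumes that the selected dwellings are simply the 8,841 fully/adequately responding households plus the 5,851 non-responding dwellings; the function names are illustrative, and the counts by type of non-response are approximations back-calculated from the rounded percentages, not published figures.

```python
def response_rate(responding, non_responding):
    """Response rate (%) among responding plus non-responding dwellings."""
    return 100 * responding / (responding + non_responding)


def approximate_counts(total, shares_percent):
    """Back-calculate approximate counts from rounded percentage shares."""
    return {label: round(total * pct / 100) for label, pct in shares_percent.items()}


responding = 8_841      # fully/adequately responding households
non_responding = 5_851  # dwellings that did not respond fully or adequately

print(round(response_rate(responding, non_responding)))  # 60

shares = {
    "full refusal": 61,
    "household details only": 27,
    "partial or incomplete": 12,
}
print(approximate_counts(non_responding, shares))
# {'full refusal': 3569, 'household details only': 1580, 'partial or incomplete': 702}
```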
Reasons for non-response
Categorisation of interviewer remarks from the 2007 SMHWB indicated that the majority of persons who refused to participate stated that they were 'too busy' or 'not interested' in the survey. People also refused to participate as the survey was 'not compulsory', the content was 'too personal', or they expressed an anti-government or anti-survey sentiment (eg invasion of privacy).
As the level of non-response for this survey was higher than expected, extensive non-response analyses to assess the reliability of the survey estimates were undertaken. The non-response analyses included data comparisons with other ABS sources, external data sources and a Non-Response Follow-Up Study (NRFUS).
Non-Response Follow-Up Study (NRFUS)
A purposive small sample/short-form intensive Non-Response Follow-Up Study (NRFUS) was developed for use with non-respondents in Sydney and Perth. The aim of the NRFUS was to provide a qualitative assessment of the likelihood of non-response bias. The NRFUS was conducted from January to February 2008 and yielded 151 respondents or a response rate of 40%. NRFUS respondents were from households that were classified as full non-contacts or full refusals during the enumeration of the 2007 SMHWB.
As the intent of the NRFUS was to provide qualitative information on non-response bias, the sample selection was done in such a way as to minimise costs (eg interviewer travel). There were ten interviewers available in Sydney and five interviewers in Perth. The resulting sample was of convenience, rather than random, but the interviewers were reasonably spread across different areas of the two cities.
In addition, interviewer remarks from the 2007 SMHWB were used to screen for households where there may be a high risk to the safety of interviewers (eg households with aggressive dogs). This resulted in a small number of households being excluded from the NRFUS. The interviewers were also given the opportunity to exclude potentially dangerous households from the approached sample; however, such households were included as 'refusals'.
The NRFUS aimed to achieve 100 fully responding households (persons) within a capped budget during a four-week enumeration period. Interviewers were assigned 401 households (229 in Sydney and 172 in Perth) and achieved 151 fully responding households (77 in Sydney and 74 in Perth). As the NRFUS is not based on a random sample, the results should be interpreted with caution. The following table gives the distribution of responses to the NRFUS.
The NRFUS used a short-form questionnaire containing demographic questions and the Kessler Psychological Distress Scale (K10). The short-form approach used for the NRFUS precluded the use of the full diagnostic assessment modules. However, the K10 was included as a limited proxy for the mental health questions. Respondents to the NRFUS were compared to people who responded fully to the 2007 SMHWB by a number of demographic variables, including age, sex and marital status.
Given the small size and purposive nature of the NRFUS sample, the results of the study were not incorporated into the 2007 SMHWB estimation strategy, but were used for qualitative comparison.
The age and sex distribution of respondents to the 2007 SMHWB was compared with the distribution of the NRFUS respondents. The 2007 SMHWB had a higher proportion of older persons (aged 65-85 years) than the NRFUS. This is expected, as older people were more likely to have responded to the main survey if selected and there was also a higher probability of selection for this age group. The proportion of younger people (aged 16-24 years) in both the 2007 SMHWB and the NRFUS samples (17.8% and 17.9% respectively) was higher than the proportion of younger people in the population (16.5%). However, this could also be explained by the higher probability of selection used for this age group.
The 2007 SMHWB had higher coverage of female respondents than the NRFUS (55% compared to 47%). The NRFUS achieved higher coverage of male respondents (53%), particularly younger males (56%), resulting in additional information being available on the characteristics of these people. The NRFUS also achieved higher relative numbers of responding 'never married' people compared to the 2007 SMHWB (46% compared to 35%).
The analysis undertaken suggests that there may be differences in the direction and magnitude of potential non-response bias between various geographical, age and sex variables that the weighting strategy does not correct for. The magnitude of potential non-response bias appears to be small at the aggregate level, but there is possible underestimation in the prevalence of mental health conditions in Perth, for men, and for young persons. It should be noted that the NRFUS was only conducted with non-respondents to the 2007 SMHWB from Sydney and Perth.
Research has found a strong association between high scores on the K10 and the diagnosis of Anxiety and Affective disorders through the current WMH-CIDI (version 3.0). There is also a lesser, but still significant association between the K10 and other mental disorder categories, or the presence of any current mental disorder (Andrews & Slade, 2001). More information about the K10 is provided in Chapter 6.
Differences in K10 scores may be associated with differences in the prevalence rates of certain mental disorders. Therefore, if K10 scores are underestimated, it is likely that prevalence rates are also underestimated. Comparing the K10 scores from the NRFUS with those from the 2007 SMHWB allows the likely differences to be assessed.
Based on the combined sample from Sydney and Perth, the unweighted mean K10 scores for the NRFUS were higher than the 2007 SMHWB (15.6 compared to 14.4). This indicates that NRFUS respondents had higher levels of distress and therefore may have had marginally higher prevalence of mental disorders. However, the difference between the two samples is small and separate results for Sydney and Perth indicate lower and higher (respectively) K10 scores than the 2007 SMHWB. Given that younger males and people who are 'divorced' or 'never married' characteristically exhibit higher K10 scores, higher scores would be expected in the NRFUS sample.
In order to analyse the sensitivity of changes to the K10 estimates, speculative revised values were obtained by applying the NRFUS K10 scores to the non-respondents of the 2007 SMHWB. This resulted in a revised mean K10 score of 14.8 for the 2007 SMHWB, which is not significantly higher than the current estimate (14.4).
COMPARISONS TO OTHER DATA SOURCES
To ascertain data consistency, the characteristics of the 2007 SMHWB respondents were compared to a number of ABS collections, including:
From this analysis, it was determined that some of the demographic and socio-economic characteristics from the initial weighted data did not align with other ABS estimates. Additional (or 'pseudo') benchmarks were used to adjust for differential undercoverage of educational attainment, labour force status and household composition. For more information on 'Weighting, benchmarking and estimation' see Chapter 2.
Comparisons were also made between the 2007 SMHWB and a number of mental health data sources, including:
Analysis undertaken to compare the results of the 2007 and 1997 SMHWB indicated:
Changes in prevalence between the 1997 and 2007 SMHWB were not included in the National Survey of Mental Health and Wellbeing: Summary of Results, 2007 (cat. no. 4326.0), due to concerns of data comparability. For more information on comparability between these two surveys see 'Comparison with the 1997 survey' in this chapter.
COMPARISON WITH THE 1997 SURVEY
In 1997 the ABS conducted the National Survey of Mental Health and Wellbeing of Adults. The survey provided information on Australians aged 18 years and over on:
The 1997 survey was an initiative of, and was funded by, the then Commonwealth Department of Health and Family Services, as part of the National Mental Health Strategy. A key aim of the survey was to provide prevalence estimates for selected mental disorders in a 12-month time frame. Therefore, diagnostic criteria were assessed solely on a person's experiences in the 12 months prior to the survey interview.
The 2007 survey was designed to provide lifetime prevalence estimates for selected mental disorders. People were asked about experiences throughout their lifetime, with an emphasis on the time when they had the most symptoms or the worst period of this type. Where a number of symptoms were endorsed across a lifetime, the person was asked about the presence of symptoms in the 12 months prior to the survey interview. To be included in the 12-month prevalence rates, a person must have met the criteria for lifetime diagnoses and had symptoms in the 12 months prior to the survey interview. The full diagnostic criteria were not assessed within the 12-month time frame.
Due to the differences described in this chapter and throughout this publication, users should exercise caution when comparing data from the two surveys.
Comparison of diagnoses
The diagnoses of mental disorders for the 2007 survey are based on the WMH-CIDI 3.0, while the 1997 survey diagnoses were based on an earlier version of the CIDI (version 2.1). Apart from the differences in time frames outlined above, the WMH-CIDI 3.0 differs from earlier versions as it:
For example, the number of questions asked about scenarios which may have triggered Post-Traumatic Stress Disorder (PTSD) has increased substantially, from 10 questions in 1997 to 28 questions in 2007. Additionally, the 1997 survey excluded people who said their extremely stressful or upsetting event was only related to:
The WMH-CIDI 3.0 diagnostic assessment criteria according to the ICD-10 and DSM-IV for the 2007 survey are provided in Chapter 3. A summary of the broad differences in the diagnostic assessment criteria between the two surveys is provided in Chapter 4.
Both surveys collected information from persons in private dwellings throughout Australia. The 2007 survey collected information from people aged 16-85 years, while the 1997 survey collected information on people aged 18 years and over. In 2007, overseas visitors who had been working or studying in Australia for the 12 months prior to the survey or were intending to do so were included in the scope. In 1997, overseas visitors were excluded.
The sample sizes and response rates differed between the two surveys. The 2007 survey had a sample of approximately 8,800 people, compared to approximately 10,600 people in 1997. The 2007 survey had a response rate of 60%, compared to 78% in 1997. Additionally, in 1997 extra survey sample was collected for Victoria, Western Australia (WA) and the Australian Capital Territory (ACT). The additional sample for the ACT was included in published results for the 1997 survey, while the additional sample for Victoria and WA was not. For more information on survey methodology see Chapter 2.
The enumeration period of each survey differs, which may impact on data comparisons. Seasonal effects refer to the influence that timing may have on a survey, ie the period when the survey was enumerated may not be fully representative of other time periods in the year (for example, because of fluctuations in employment prior to Christmas). The 2007 survey was undertaken from August to December, while the 1997 survey was undertaken from May to August.
The classifications of several demographic and socio-economic characteristics used in the 2007 SMHWB differ from those used in 1997, including:
Industry of employment was collected for the first time in 2007. For information on population characteristics see Chapter 9 and for information on classifications see Chapter 2.
Several of the scales and measures used to estimate disability and functioning in the 2007 survey differ from those used in 1997. The 2007 survey includes:
In comparison, the 1997 survey collected information on disability and functioning using:
Both surveys contained questions on:
However, the positioning of questions within each survey and the wording of questions vary. Information on physical activity and body mass was collected for the first time in 2007. Therefore, there are no data from the 1997 survey for comparison.
The 2007 survey included a small number of questions on hypochondriasis and somatisation, whereas the 1997 survey assessed somatic disorder, neurasthenia, and the personality characteristic neuroticism (Eysenck Personality Questionnaire). For more information on physical health see Chapter 5.
Other scales and measures
A psychosis screener was included in both surveys. In 1997, there were seven questions about psychotic experiences in the 12 months prior to interview. Reflecting the differing emphasis in time frames, the 2007 survey asked questions about both lifetime and 12-month psychotic experiences.
The 2007 survey contains more detailed information on suicidal behaviour than the 1997 survey. Apart from a small number of questions in the Depression module, the 1997 survey included only three questions about suicidal behaviour. The 2007 survey collected information on lifetime and 12-month suicidal behaviour, including the persistence of the behaviour/s. Detailed information, such as the consequences of attempted suicide (eg medical attention required) was also collected for suicidal behaviour in the 12 months prior to interview.
Both surveys included the Kessler Psychological Distress Scale (K10). However, there were minor differences in the question wording. The Mini-Mental State Examination (MMSE) was included in both surveys. There are some differences in the period referred to for memory problems, as well as differences in the order of tasks performed. For more information on 'Other scales and measures' see Chapter 6.
There are no data from the 1997 survey for comparison with the following items, as these were collected for the first time in 2007:
More information on the data collected in the 1997 survey is provided in the National Survey of Mental Health and Wellbeing of Adults: Users' Guide, 1997 (cat. no. 4327.0).