4159.0.55.002 - General Social Survey: User Guide, Australia, 2010

ARCHIVED ISSUE Released at 11:30 AM (CANBERRA TIME) 07/12/2011

NON-SAMPLING ERROR

Errors made in giving and recording information during an interview can occur regardless of whether the estimates are derived from a sample or from a complete enumeration. Inaccuracies of this kind are referred to as non-sampling errors.

The major sources of non-sampling error are:

errors related to the survey scope;
response errors such as incorrect interpretation or wording of questions, interviewer bias, etc.;
bias due to non-response, as the characteristics of non-responding persons may differ from those of responding persons; and
errors in processing such as mistakes in the recording or coding of the data obtained.

These sources of error are discussed in turn below.

Errors related to survey scope

Some dwellings may have been inadvertently included or excluded because, for example, the distinction between private and non-private dwellings may have been unclear. All efforts were made to overcome such situations by constantly updating lists both before and during the survey. Furthermore, some persons may have been inadvertently included or excluded because of difficulties in applying the scope rules concerning who was identified as a usual resident, and concerning the treatment of some overseas visitors. Other errors which can arise from the application of the scope and coverage rules are outlined in Chapter 3: Survey Methodology.

Response errors

In this survey response errors may have arisen from three main sources: deficiencies in questionnaire design and methodology; deficiencies in interviewing technique; and inaccurate reporting by the respondent.

Errors may be caused by misleading or ambiguous questions, inadequate or inconsistent definitions of terminology used, or by poor overall survey design (e.g. context effects where responses to a question are directly influenced by the preceding questions). In order to overcome problems of this kind, individual questions and the overall questionnaire were thoroughly tested before being finalised for use in the survey.

Testing took two forms:

cognitive interviewing (further explained in Chapter 2: Survey Content); and
field testing, which involved a dress rehearsal conducted in New South Wales, covering 250 to 300 households.

As a result of both forms of testing, modifications were made to question design, wording, ordering and associated prompt cards, and some changes were made to survey procedures. In considering modifications it was sometimes necessary to balance better response to a particular item/topic against increased interview time or effects on other parts of the survey. The result is that in some instances it was necessary to adopt a workable/acceptable approach rather than an optimum approach. Although changes would have had the effect of minimising response errors due to questionnaire design and content issues, some will inevitably have occurred in the final survey enumeration.

Response errors may also have occurred due to the long nature of the interview, resulting in interviewer and/or respondent fatigue (i.e. loss of concentration). While efforts were made to minimise errors arising from deliberate misreporting or non-reporting by respondents (including emphasising the importance of the data and checking consistency within the survey instrument), some instances will have inevitably occurred.

Recall error may also have led to response error. Information recorded in this survey is essentially 'as reported' by respondents, and hence may differ from information available from other sources or collected using different methodologies. Responses may be affected by imperfect recall or individual interpretation of survey questions. Reference periods used in relation to each topic were selected to suit the nature of the information being sought; in particular, to strike the right balance between minimising recall errors and ensuring the period was meaningful, representative (from both respondent and data use perspectives) and would yield sufficient observations in the survey to support reliable estimates. It is possible that the reference periods did not suit every person for every topic, and that difficulty with recall may have led to inaccurate reporting in some instances.

A further source of response error is lack of uniformity in interviewing standards. Methods employed to achieve and maintain uniform interviewing practices included training and re-training programs, and regular supervision and checking of interviewers' work. These programs aimed to ensure that a high level of response accuracy was achieved.

An advantage of the computer-assisted interviewing (CAI) technology used in conducting interviews for this survey is that it potentially reduced non-sampling error by enabling edits to be applied as the data was being collected. The interviewer was alerted immediately if information entered into the computer was either outside the permitted range for a question, or contradictory to information previously recorded during the interview. These edits allowed the interviewer to query respondents and resolve issues during the interview. CAI sequencing of questions was also automated such that respondents were asked only relevant questions and only in the appropriate sequence, eliminating interviewer sequencing errors.
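The two kinds of in-interview edit described above (range edits and consistency edits) can be illustrated with a minimal Python sketch. It is not the survey's actual CAI instrument: the item names, the permitted ranges and the consistency rule are all hypothetical, chosen only to show the idea.

```python
# Hypothetical item names and permitted ranges, for illustration only.
RANGE_EDITS = {
    "age": (0, 120),
    "hours_worked_per_week": (0, 168),
}


def range_failures(answers):
    """Return a message for each answer outside its permitted range."""
    failures = []
    for item, (low, high) in RANGE_EDITS.items():
        value = answers.get(item)
        if value is not None and not (low <= value <= high):
            failures.append(f"{item}={value} is outside {low}-{high}")
    return failures


def consistency_failures(answers):
    """Return a message for each answer that contradicts an earlier answer."""
    failures = []
    age = answers.get("age")
    years = answers.get("years_at_current_address")
    # Hypothetical rule: time at an address cannot exceed the person's age.
    if age is not None and years is not None and years > age:
        failures.append(f"years_at_current_address={years} exceeds age={age}")
    return failures


def run_edits(answers):
    """Run every edit; a non-empty result would prompt the interviewer to query the respondent."""
    return range_failures(answers) + consistency_failures(answers)


if __name__ == "__main__":
    entered = {"age": 34, "hours_worked_per_week": 200, "years_at_current_address": 40}
    for message in run_edits(entered):
        print(message)
```

In this sketch an empty list means the answers passed every edit. Any message returned corresponds to an alert that the interviewer could resolve with the respondent before moving on.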

Some respondents may have provided responses that they felt were expected, rather than those that accurately reflected their own situation. Every effort has been made to minimise such bias through the development and use of culturally appropriate survey methodology. Respondent perception of the personal characteristics of the interviewer can also be a source of error, as the age, sex, appearance and manner of the interviewer may influence the answers obtained.

Non-response bias

One of the main sources of non-sampling error is non-response by persons selected in the survey. Non-response can affect the reliability of results and introduce bias. The magnitude of any bias depends upon the level of non-response and the extent of the difference between the characteristics of those people who responded to the survey and those who did not, as well as the extent to which non-response adjustments can be made during estimation through the use of benchmarks.

To reduce the level and impact of non-response, the following methods were adopted in this survey:

face-to-face interviews with respondents;
the use of interviewers who could speak languages other than English (where necessary);
follow-up of respondents if there was initially no response; and
ensuring the weighted file is representative of the population by aligning the estimates with population benchmarks.
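The last method, aligning weighted estimates with population benchmarks, can be sketched in its simplest form as post-stratification. The estimation procedure actually used for the GSS is not described in this section, so the Python sketch below is only an illustration under assumptions: the function name, the `stratum` and `weight` fields and the figures are hypothetical.

```python
from collections import defaultdict


def poststratify(respondents, benchmarks):
    """Scale weights so each stratum's weighted total equals its benchmark.

    respondents: list of dicts with a 'stratum' label and an initial 'weight'.
    benchmarks:  dict mapping each stratum label to its population total.
    Returns new dicts with adjusted weights; the inputs are not modified.
    """
    weighted_totals = defaultdict(float)
    for person in respondents:
        weighted_totals[person["stratum"]] += person["weight"]

    adjusted = []
    for person in respondents:
        stratum = person["stratum"]
        factor = benchmarks[stratum] / weighted_totals[stratum]
        adjusted.append({**person, "weight": person["weight"] * factor})
    return adjusted


if __name__ == "__main__":
    # Hypothetical respondents and benchmark totals.
    sample = [
        {"stratum": "A", "weight": 100.0},
        {"stratum": "A", "weight": 100.0},
        {"stratum": "B", "weight": 150.0},
    ]
    for person in poststratify(sample, {"A": 300.0, "B": 120.0}):
        print(person)
```

If one group responds at a lower rate than another, its respondents receive a larger adjustment factor, so the group is not under-represented in the weighted file. This only removes bias to the extent that respondents and non-respondents within a stratum are similar.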

Of the dwellings selected in the 2010 GSS, 12.4% did not respond fully or adequately. As the non-response to the GSS was low, the impact of non-response bias is considered to be negligible.
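The dependence of bias on both the level of non-response and the difference between respondents and non-respondents can be made concrete with a standard identity for an unadjusted respondent mean: the bias equals the non-response rate multiplied by the difference between the respondent and non-respondent means. The Python sketch below applies it with the 12.4% figure above. The two means are hypothetical, the function name is illustrative, and treating the dwelling-level rate as a person-level rate is an assumption.

```python
def respondent_mean_bias(nonresponse_rate, mean_respondents, mean_nonrespondents):
    """Bias of the unadjusted respondent mean relative to the full-population mean.

    Follows from: population mean = (1 - rate) * mean_respondents
                                    + rate * mean_nonrespondents.
    """
    if not 0.0 <= nonresponse_rate <= 1.0:
        raise ValueError("nonresponse_rate must be between 0 and 1")
    return nonresponse_rate * (mean_respondents - mean_nonrespondents)


if __name__ == "__main__":
    # 12.4% non-response (from the text); the two proportions are hypothetical.
    bias = respondent_mean_bias(0.124, 0.70, 0.60)
    print(f"Bias in the estimated proportion: {bias:.4f}")
```

With these illustrative values, a 10 percentage point difference between the two groups would shift the estimate by a little over 1 percentage point before any benchmark adjustment. If the groups did not differ, the bias would be zero whatever the non-response rate.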

Errors in processing

Errors may also occur during data processing, between the initial collection of the data and final compilation of statistics. These may be due to a failure of computer editing programs to detect errors in the data, or during the manipulation of raw data to produce the final survey data files; for example, in the course of deriving new data items from raw survey data or during the estimation procedures or weighting of the data file.

To minimise the likelihood of these errors occurring, a number of quality assurance processes were employed, including:

computer editing - edits were devised to ensure that logical sequences were followed in the questionnaires, that necessary items were present and that specific values lay within certain ranges. These edits were designed to detect reporting and recording errors, incorrect relationships between data items or missing data items.
data file checks - at various stages during processing (such as after computer editing or after derivation of new data items) frequency counts and/or tabulations were obtained from the data file showing the distribution of persons for different characteristics. These were used as checks on the content of the data file, to identify unusual values which may have significantly affected estimates and illogical relationships not previously identified. Further checks were conducted to ensure consistency between related data items and in the relevant populations.
comparison with other sources - where possible, checks of the data were also undertaken to ensure consistency of the survey outputs against results of the previous GSS and data available from other sources.
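The data file checks described above (frequency counts to spot unusual values, and checks of consistency between related items) can be sketched in Python as follows. This is a minimal illustration, not the processing system used for the survey: the function names, data items, category codes and the consistency rule are all hypothetical.

```python
from collections import Counter


def frequency_counts(records, item):
    """Count how many records take each value of a data item."""
    return Counter(record.get(item, "MISSING") for record in records)


def unexpected_values(records, item, valid_values):
    """Return the values (with counts) that are not valid categories for the item."""
    counts = frequency_counts(records, item)
    return {value: n for value, n in counts.items() if value not in valid_values}


def inconsistent_records(records, rule):
    """Return the records for which a consistency rule between items does not hold."""
    return [record for record in records if not rule(record)]


if __name__ == "__main__":
    # Hypothetical data file extract.
    data = [
        {"labour_force": "employed", "hours_worked": 38},
        {"labour_force": "unemployed", "hours_worked": 0},
        {"labour_force": "employd", "hours_worked": 20},    # miscoded category
        {"labour_force": "unemployed", "hours_worked": 15},  # illogical relationship
    ]
    print(frequency_counts(data, "labour_force"))
    print(unexpected_values(data, "labour_force", {"employed", "unemployed"}))

    # Hypothetical rule: an unemployed person should report zero hours worked.
    def hours_rule(record):
        return record["labour_force"] != "unemployed" or record["hours_worked"] == 0

    print(inconsistent_records(data, hours_rule))
```

Running checks like these after each processing stage, for example after derivation of new data items, would help show at which stage an unusual value or illogical relationship entered the file.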