**TECHNICAL NOTE**

**DATA QUALITY**

**1** When interpreting the results of a survey it is important to take into account factors that may affect the reliability of estimates. The survey methodology, as well as sampling and non-sampling errors, should be considered. Examining the following quality indicators will help users determine whether the Queensland Water and Energy Use and Conservation Survey is fit for their purpose.

*ESTIMATION PROCEDURE*

**2** The estimates in this publication were obtained using a post-stratification procedure. This procedure ensured that the survey estimates conformed to an independently estimated distribution of the population, by state, part of state, age and sex, rather than the distribution among respondents.
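As an illustration only, post-stratification can be sketched as follows. The strata, benchmark totals and the `has_tank` characteristic are hypothetical and are not taken from the survey; the sketch assumes a simple design in which every respondent in a stratum receives the same weight, which is a simplification of the procedure actually used.

```python
from collections import Counter


def post_stratify(respondent_strata, population_totals):
    """Return one weight per respondent so weighted counts match the benchmarks."""
    sample_counts = Counter(respondent_strata)
    return [population_totals[s] / sample_counts[s] for s in respondent_strata]


# Hypothetical strata (sex by age group) and benchmark totals, for illustration only.
respondents = ["F 18-44", "F 18-44", "F 45+", "M 18-44", "M 45+", "M 45+"]
benchmarks = {"F 18-44": 900, "F 45+": 800, "M 18-44": 850, "M 45+": 750}

weights = post_stratify(respondents, benchmarks)

# Weighted estimate of a hypothetical yes/no characteristic (1 = yes, 0 = no).
has_tank = [1, 0, 1, 0, 1, 0]
estimate = sum(w * v for w, v in zip(weights, has_tank))

print(sum(weights))  # equals the total of the benchmark populations
print(estimate)
```

Because the weights within each stratum sum to that stratum's benchmark, the weighted sample reproduces the independently estimated population distribution regardless of how respondents happened to be distributed.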

*RELIABILITY OF THE ESTIMATES*

**3** The factors that may affect the reliability of estimates can be classified as either sampling or non-sampling error.

*NON-SAMPLING ERRORS*

**4** Errors other than those due to sampling may occur in any type of collection and are referred to as non-sampling error. For this survey, non-sampling error can be introduced through inadequacies in the questionnaire, non-response, inaccurate reporting by respondents, errors in the application of survey procedures, incorrect recording of answers, and errors in data entry and processing. The extent to which non-sampling error affects the results of the survey cannot be precisely quantified. Every effort was made to minimise non-sampling error through careful design and testing of the questionnaire, efficient operating procedures and systems, and the use of appropriate methodology.

*SAMPLING ERRORS*

**5** Since the estimates in this publication are based on information obtained from occupants of a sample of dwellings, they are subject to sampling variability. That is, they may differ from those estimates that would have been produced if all occupants of all dwellings had been included in the survey. One measure of the likely difference is given by the standard error (SE), which indicates the extent to which an estimate might have varied by chance because only a sample of dwellings (or occupants) was included. There are about two chances in three (67%) that a sample estimate will differ by less than one SE from the number that would have been obtained if all dwellings had been included, and about 19 chances in 20 (95%) that the difference will be less than two SEs.
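The interpretation of the SE can be illustrated with a short sketch. The estimate and SE below are hypothetical values chosen for the example, and the function name `sampling_intervals` is an assumption rather than part of any published tool.

```python
def sampling_intervals(estimate, se):
    """Ranges expected to contain the full-enumeration value."""
    return {
        "about 67%": (estimate - se, estimate + se),          # within one SE
        "about 95%": (estimate - 2 * se, estimate + 2 * se),  # within two SEs
    }


# Hypothetical estimate of 120,000 households with an SE of 6,000.
intervals = sampling_intervals(120_000, 6_000)
for chance, (low, high) in intervals.items():
    print(f"{chance}: {low:,} to {high:,}")
```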

**6** Another measure of the likely difference is the relative standard error (RSE), which is obtained by expressing the SE as a percentage of the estimate:

$$\text{RSE}\% = \frac{\text{SE}}{\text{estimate}} \times 100$$
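Since RSEs rather than SEs are published, users will often need to convert between the two. A minimal sketch, using hypothetical figures and assumed helper names:

```python
def rse_percent(estimate, se):
    """Express the SE as a percentage of the estimate."""
    if estimate == 0:
        raise ValueError("RSE is undefined for an estimate of zero")
    return 100 * se / abs(estimate)


def se_from_rse(estimate, rse):
    """Recover the SE from a published estimate and its RSE (in percent)."""
    return abs(estimate) * rse / 100


# Hypothetical figures: an estimate of 120,000 with an SE of 6,000.
rse = rse_percent(120_000, 6_000)
se = se_from_rse(120_000, rse)
print(rse, se)
```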

**7** RSEs for estimates from the 2009 Queensland Water and Energy Use and Conservation Survey are published for each individual data cell. The Jackknife method of variance estimation has been used to produce RSE estimates for this publication. This variance estimation method involves the calculation of 30 'replicate' estimates based on 30 different sub-samples of the original sample. The variability of estimates obtained from these sub-samples is used to estimate the sampling variability of the main estimate.
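This publication does not state the exact variance formula used. The sketch below assumes the common delete-a-group form of the Jackknife, in which the variance is (G - 1)/G times the sum of squared differences between each replicate estimate and the main estimate, with G = 30. The replicate values are hypothetical.

```python
import math


def jackknife_rse(full_estimate, replicate_estimates):
    """Return (SE, RSE%) under an assumed delete-a-group Jackknife."""
    g = len(replicate_estimates)
    squared_diffs = sum((r - full_estimate) ** 2 for r in replicate_estimates)
    variance = (g - 1) / g * squared_diffs  # assumed form, not from the source
    se = math.sqrt(variance)
    return se, 100 * se / full_estimate


# Hypothetical main estimate and 30 replicate estimates scattered around it.
main_estimate = 1000
replicates = [main_estimate + d for d in (-10, 10) * 15]

se, rse = jackknife_rse(main_estimate, replicates)
print(f"SE = {se:.2f}, RSE = {rse:.2f}%")
```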

**8** In the tables in this publication, only estimates (numbers and proportions) with RSEs less than 25% are considered sufficiently reliable for most purposes. Estimates with RSEs of 25% to 50% have been included and are preceded by an asterisk (e.g. \*3.4) to indicate that they are subject to high SEs and should be used with caution. Estimates with RSEs greater than 50% are preceded by a double asterisk (e.g. \*\*2.1) to indicate that they are considered too unreliable for general use. In the data cubes in this publication, RSEs are published for each individual data cell.
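Users working with the data cubes may wish to reproduce the table annotations from the published RSEs. A minimal sketch, assuming the thresholds described above; the function name and the sample values are hypothetical.

```python
def flag_estimate(value, rse):
    """Prefix an estimate with the reliability marker implied by its RSE (%)."""
    if rse > 50:
        prefix = "**"  # considered too unreliable for general use
    elif rse >= 25:
        prefix = "*"   # subject to high SEs, use with caution
    else:
        prefix = ""    # sufficiently reliable for most purposes
    return f"{prefix}{value}"


# Hypothetical (estimate, RSE%) pairs.
cells = [(15.8, 8.2), (3.4, 30.0), (2.1, 62.5)]
flagged = [flag_estimate(value, rse) for value, rse in cells]
print(flagged)
```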

**PROPORTIONS AND PERCENTAGES**

**9** Proportions formed from the ratio of two estimates are also subject to sampling errors. The size of the error depends on the accuracy of both the numerator and the denominator. A formula to approximate the RSE of a proportion is given below. This formula is only valid when x is a subset of y.

$$\text{RSE}\left(\frac{x}{y}\right) \approx \sqrt{[\text{RSE}(x)]^2 - [\text{RSE}(y)]^2}$$
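The approximation for a proportion can be sketched as follows. The RSE values are hypothetical, and the function assumes that the numerator x is a subset of the denominator y, so that RSE(x) is at least as large as RSE(y).

```python
import math


def rse_of_proportion(rse_x, rse_y):
    """Approximate RSE(x/y) as sqrt(RSE(x)^2 - RSE(y)^2), with x a subset of y."""
    diff = rse_x ** 2 - rse_y ** 2
    if diff < 0:
        raise ValueError("RSE(x) is smaller than RSE(y); the approximation does not apply")
    return math.sqrt(diff)


# Hypothetical RSEs: 5% for the numerator and 3% for the denominator.
rse_p = rse_of_proportion(5.0, 3.0)
print(rse_p)
```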

**DIFFERENCES**

**10** The sampling error of the difference between two estimates depends on their SEs and the relationship (correlation) between the estimates. An approximate SE of the difference between two estimates (x-y) may be calculated by the following formula:

$$\text{SE}(x - y) \approx \sqrt{[\text{SE}(x)]^2 + [\text{SE}(y)]^2}$$

**11** While this formula will only be exact for differences between separate and uncorrelated characteristics or subpopulations, it is expected to provide a good approximation for all differences likely to be of interest in this publication.
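For uncorrelated estimates, the SE of a difference is the square root of the sum of the squared SEs. A minimal sketch with hypothetical SEs:

```python
import math


def se_of_difference(se_x, se_y):
    """Approximate SE(x - y), treating x and y as uncorrelated."""
    return math.sqrt(se_x ** 2 + se_y ** 2)


# Hypothetical SEs of 3.0 and 4.0 percentage points.
se_diff = se_of_difference(3.0, 4.0)
print(se_diff)
```

Note that the SE of the difference is always at least as large as the larger of the two individual SEs.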

**SIGNIFICANCE TESTING**

**12** A statistical significance test can be performed to indicate whether the survey provides sufficient evidence that a difference between two survey estimates reflects an actual difference in the population. The following measure, called a "test statistic", can be used to test the statistical significance of a difference between two survey estimates x and y, where the standard error of the difference, SE(x-y), can be calculated using the formula in paragraph 10:

$$\frac{|x - y|}{\text{SE}(x - y)}$$

**13** If the value of this test statistic is greater than 1.96 (the critical value for a two-sided test at the 5% significance level), there is strong evidence that the difference between the survey estimates reflects a real difference in the population.
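The test can be sketched end to end as follows. The estimates and SEs are hypothetical, and the sketch assumes the two estimates are uncorrelated so that the SE of the difference is the square root of the sum of the squared SEs.

```python
import math


def significance_test(x, y, se_x, se_y, critical_value=1.96):
    """Return (test statistic, True if the difference is significant)."""
    se_diff = math.sqrt(se_x ** 2 + se_y ** 2)  # assumes uncorrelated estimates
    statistic = abs(x - y) / se_diff
    return statistic, statistic > critical_value


# Hypothetical proportions (%) for two subpopulations, with their SEs.
stat_a, significant_a = significance_test(52.0, 40.0, 3.0, 4.0)
stat_b, significant_b = significance_test(45.0, 40.0, 3.0, 4.0)
print(stat_a, significant_a)
print(stat_b, significant_b)
```

In the first comparison the difference of 12 points is 2.4 times its SE, so it exceeds the 1.96 threshold. In the second the difference of 5 points equals its SE, so the survey does not provide strong evidence of a real difference.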