APPENDIX: ADJUSTMENT FOR NON-RESPONSE IN LONGITUDINAL ANALYSIS
This appendix describes the weighting methodology used for the longitudinal data presented in the feature article 'Household energy efficient improvements: Intentions, actions and barriers'.
Weighting is the process of adjusting results from a sample survey to infer results for the total in-scope population, whether that be persons or households. To do this, a 'weight' is allocated to each sample unit, e.g. a person or a household. The weight is a value which indicates how many population units are represented by the sample unit.
The estimates presented in the feature article are calculated using weights specifically developed for the article content. All other estimates in this publication use household level weights, as described in paragraphs 66 to 71 of the Explanatory Notes, and in the Household Energy Consumption Survey, User Guide, Australia, 2012 (cat. no. 4671.0).
The sample described in the feature article is a subset of the complete HECS sample, consisting of households who had complete participation in the voluntary longitudinal component. This subset of households participated in all four rounds of follow-up questions, asked every three months following their household interview in January to March 2012, up until January to March 2013.
Significant levels of non-response exist for the voluntary longitudinal component. This introduces the potential for significant non-response bias for analysis based on the longitudinal sample, if those choosing to respond were inherently different to those who did not choose to participate. More information on the voluntary longitudinal component and coverage characteristics of responding records is available in the 'Sample Design, Scope and Coverage' section of the Household Energy Consumption Survey, User Guide, Australia, 2012 (cat. no. 4671.0). Coverage characteristics for the fully participating subsample are presented in Table 1 of the feature article.
This potential for non-response bias in the longitudinal sample was not accounted for in the weighting of the complete HECS sample. Consequently, weights were produced for the longitudinal sample used in the feature article to facilitate estimation with reduced bias from non-response. However, the relatively small sample size for this group (639 households), coupled with the high levels of non-response, poses challenges for the weighting process. Therefore, only the major identified biases have been corrected for in the weighting. The level of bias unaccounted for in the estimates of the longitudinal data items cannot be quantified.
The first step in calculating weights for each unit is to assign an initial weight, which typically is the inverse of the probability of being selected in the survey. For example, if the probability of a household being selected in the survey was one in 600, then the household would have an initial weight of 600 (that is, it represents 600 households).
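The inverse-probability rule above can be sketched in a few lines. This is only an illustration of the arithmetic in the example; the function name `initial_weight` is hypothetical and is not taken from the HECS processing system.

```python
def initial_weight(selection_probability: float) -> float:
    """Return the initial weight as the inverse of the selection probability."""
    if not 0 < selection_probability <= 1:
        raise ValueError("selection probability must be in the interval (0, 1]")
    return 1.0 / selection_probability


# A household selected with probability one in 600 represents
# about 600 households in the population.
example_weight = initial_weight(1 / 600)
print(round(example_weight))
```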
A weighting process was already completed for the full HECS sample (described in paragraphs 66 to 71 of the Explanatory Notes), which accounts for coverage and non-response issues in this sample. These weights were an ideal starting point for the longitudinal weighting process and served as initial weights.
As the longitudinal cohort being weighted consists only of units interviewed in the first quarter of HECS enumeration (January to March 2012), the sum of their final weights, if non-response was not experienced, would be a 'quarter' of the sum of the final HECS full sample weights, i.e. a quarter of the in-scope population. As there was a high level of non-response to the longitudinal component, the sum of the cohort's weights will be even less than a quarter of the population. Consequently, an initial broad adjustment was made to the final HECS weights for the longitudinal cohort so that they represented the in-scope Australian population and inferences could be made about this population.
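One simple way to picture a broad adjustment of this kind is a single ratio factor that scales the cohort's weights so that they sum to the in-scope population total. The sketch below assumes a uniform factor, which the text above does not confirm; the function name and all figures are hypothetical.

```python
def scale_to_population(weights, population_total):
    """Scale weights by one common factor so they sum to population_total.

    Assumption: a single uniform ratio adjustment. The actual adjustment
    applied to the longitudinal cohort may have been more detailed.
    """
    current_total = sum(weights)
    if current_total <= 0:
        raise ValueError("weights must sum to a positive value")
    factor = population_total / current_total
    return [w * factor for w in weights]


# Hypothetical cohort whose full-sample weights sum to well under the
# (hypothetical) in-scope population of 8,000 households.
cohort_weights = [300.0, 450.0, 250.0]
adjusted_weights = scale_to_population(cohort_weights, 8000.0)
print(sum(adjusted_weights))
```

Because every weight is multiplied by the same factor, the relative sizes of the weights set by the full-sample weighting are preserved.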
Note that the quarters are chosen to align with pension indexation dates rather than calendar quarters, and so are of different lengths ranging from 22% to 28% of the calendar year.
The adjusted initial weights for the HECS longitudinal cohort were then calibrated to align with independent estimates of the population of interest, referred to as 'benchmarks'. Weights calibrated against population benchmarks ensure that the representation of the sample conforms to the independently estimated distribution of the population.
The benchmarks used in calibration of the final longitudinal analysis weights were the number of households:
- By state by tenure type
  - Defined as owner with a mortgage, owner without a mortgage and other (predominantly renter), for all states and territories except NT.
- By state by intention to modify dwelling in the next 12 months
  - Defined as either 'yes, intended to modify in the next 12 months' or 'no, did not intend to modify in the next 12 months', for all states except NT.
- By household composition [number of adults (1, 2, 3+) and whether or not the household contains children]
The first two benchmarks were estimated from the full HECS sample. That is, they were 'pseudo' benchmarks: they are estimates themselves and therefore have sampling error (see 'Sampling Variability' for details about sampling error). Pseudo benchmarks are used when no suitable and directly comparable data source free of sampling error is available. For example, intention to modify is not collected in the Census.
The final benchmark, household composition, is from household demography benchmarks based on Census 2006. Note that these demography benchmarks have been formed to align with the scope of HECS and therefore include persons residing in private dwellings only and exclude persons living in very remote areas. These demography benchmarks therefore do not, and are not intended to, match estimates of the Australian resident population published in other ABS publications.
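Calibration to several sets of benchmark totals can be illustrated with raking (iterative proportional fitting), one common calibration technique. The source does not state which calibration method was used, so the sketch below is an assumption for illustration only; the function `rake`, the category labels and all totals are hypothetical.

```python
def rake(weights, categories, benchmarks, max_iter=100, tol=1e-8):
    """Adjust weights so weighted totals match each set of benchmarks.

    weights:    initial weight for each household
    categories: {benchmark name: list of category labels, one per household}
    benchmarks: {benchmark name: {category label: target total}}
    """
    w = list(weights)
    for _ in range(max_iter):
        max_change = 0.0
        for name, labels in categories.items():
            # Current weighted total in each category of this benchmark.
            totals = {}
            for wi, label in zip(w, labels):
                totals[label] = totals.get(label, 0.0) + wi
            # Scale every household so its category hits the target total.
            for i, label in enumerate(labels):
                factor = benchmarks[name][label] / totals[label]
                max_change = max(max_change, abs(factor - 1.0))
                w[i] *= factor
        if max_change < tol:  # all benchmarks already satisfied
            break
    return w


# Six hypothetical households with equal starting weights.
start = [100.0] * 6
cats = {
    "tenure": ["owner", "owner", "owner", "renter", "renter", "renter"],
    "intends_to_modify": ["yes", "no", "no", "yes", "no", "no"],
}
targets = {
    "tenure": {"owner": 600.0, "renter": 400.0},
    "intends_to_modify": {"yes": 300.0, "no": 700.0},
}
calibrated = rake(start, cats, targets)


def weighted_total(w, labels, label):
    """Sum of weights for households in the given category."""
    return sum(wi for wi, lab in zip(w, labels) if lab == label)


print(weighted_total(calibrated, cats["tenure"], "owner"))
```

The benchmark sets must share the same overall total (1,000 households here) for the procedure to satisfy all of them at once. With a small sample, adding more benchmark sets leaves fewer households per category, which is one way the over-constraining problem noted later in this appendix can arise.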
The initial weights were calibrated to these benchmarks based on evidence that they were correlated with key output variables presented in the feature article. Consequently, their inclusion in calibration will improve the accuracy of key estimates. The benchmarks included will not improve the representativeness of the sample for all output categories with respect to what is known from the full HECS sample. For example, as illustrated in the table below, the weighting process attributed very limited improvements to the representativeness or estimation of items such as average household net worth and the proportion of dwellings with insulation, when compared to the complete weighted HECS sample. This table is also available in the "Feature Article - Household energy efficient improvements" datacube in the "Downloads" tab of this product, which also features other characteristics of the sample not presented here.
ESTIMATES OF COMPLETE HECS SAMPLE AND COMPLETE LONGITUDINAL PARTICIPATION RECORDS
Weighted sample estimates
Complete HECS sample (a)
Complete participation sample (b)
|Household characteristics, January - March 2012|
|Intended to make energy efficient improvements over the next 12 months|
|Owner without a mortgage|
|Owner with a mortgage|
|Renter or other tenure type|
|Family composition of household|
|Couple family with dependent children|
|One parent family with dependent children|
|Other one family households|
|Multiple family households|
|State or territory of usual residence|
|New South Wales|
|Australian Capital Territory|
|Energy efficient improvements already made in dwelling|
|Solar electricity or hot water|
|Made energy efficient improvements to dwelling in last 2 years (since January - March 2010)|
|Other types of dwelling|
|Gross weekly average household income ($)|
|Household average net worth ($'000)|
|Average age of household reference person (years)|
|Average number of employed people in household (no.)|
|Total sample (n)|
|Weighted count ('000)|
* estimate has a relative standard error of 25% to 50% and should be used with caution.
** estimate has a relative standard error greater than 50% and is considered too unreliable for general use.
(a) Using full HECS sample household weights.
(b) Using weights described in this appendix.
(c) Estimate is significantly different from the complete HECS sample.
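The reliability annotations in the table notes can be expressed as a small rule. The sketch below assumes the relative standard error (RSE) is the standard error expressed as a percentage of the estimate; the function name `rse_flag` is hypothetical.

```python
def rse_flag(estimate: float, standard_error: float) -> str:
    """Return the table annotation implied by the relative standard error."""
    if estimate == 0:
        raise ValueError("RSE is undefined for an estimate of zero")
    rse = 100.0 * standard_error / abs(estimate)
    if rse > 50.0:
        return "**"  # considered too unreliable for general use
    if rse >= 25.0:
        return "*"   # should be used with caution
    return ""        # no annotation needed


# Hypothetical estimates, each paired with a standard error.
for est, se in [(100.0, 10.0), (100.0, 30.0), (100.0, 60.0)]:
    print(est, se, repr(rse_flag(est, se)))
```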
Full sample estimates for some of these output categories were considered as additional benchmarks to improve estimates of key longitudinal variables for that benchmark category, but were not included as they were detrimental to the accuracy of national estimates and other output categories. Given the small sample size, including further benchmarks also created problems related to over-constraining the weight calibration process.
Weighted longitudinal data is of particular value when the primary purpose, as for the feature article, is to provide descriptive statistics. If relationship analysis is being performed, weights may not be necessary. Instead, it is important that factors associated with non-response and poor coverage are included in models.