4159.0.30.004 - Microdata: General Social Survey, Australia, 2014 Quality Declaration 
Latest ISSUE Released at 11:30 AM (CANBERRA TIME) 17/09/2015  First Issue

FILE STRUCTURE

DATA AVAILABLE BY LEVEL

The GSS 2014 microdata is available across four levels. Some of these levels have a hierarchical relationship:

      1. Household
          2. Person
              3. Voluntary work
              4. Access to services

Broadly, each level provides the following:
  • Household level - information about the household size and structure and household income details
  • Person level - all demographic and socio-economic characteristics of the survey respondents, and most of the information they provided
  • Voluntary work level - information about the characteristics of each episode of volunteering that the survey respondent described
  • Access to services level - information about the types of services that were difficult to access and the reasons why they were described as difficult

WEIGHTS AND ESTIMATION

As the survey was conducted on a sample of households in Australia, it is important to take account of the method of sample selection when deriving estimates. This is particularly important as a person's chance of selection in the survey varied depending on the state or territory in which they lived. Survey 'weights' are values which indicate how many population units are represented by the sample unit.

There are two survey weights provided: a person weight (FINPRSWT) and a household weight (FINWTHH). These should be used when analysing data at the person and household level respectively.

Where estimates are derived, it is essential that they are calculated by adding the weights of persons or households, as appropriate, in each category, and not just by counting the number of records falling into each category. If each person's or household's 'weight' were to be ignored, then no account would be taken of a person's or household's chance of selection in the survey or of different response rates across population groups, with the result that counts produced could be seriously biased. The application of weights ensures that estimates conform to independently estimated distributions rather than to the distributions within the sample itself:
  • person estimates conform to an independently estimated distribution of the population by dwelling type, age, sex, state/territory and part of state
  • household estimates conform to an independently estimated distribution of households by certain characteristics (e.g. by number of adults and children), state/territory and part of state.
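
The difference between adding weights and counting records can be illustrated with a minimal Python sketch. Only the weight name FINPRSWT comes from the text above; the `state` field, the helper functions and all values are made up for illustration and are not taken from the microdata.

```python
from collections import defaultdict

# Hypothetical person level records. Only FINPRSWT is a name from the text;
# the "state" field and every value are invented for illustration.
records = [
    {"state": "NSW", "FINPRSWT": 1200.0},
    {"state": "NSW", "FINPRSWT": 950.0},
    {"state": "NT", "FINPRSWT": 150.0},
    {"state": "NT", "FINPRSWT": 180.0},
    {"state": "NT", "FINPRSWT": 170.0},
]


def weighted_totals(rows, category, weight):
    """Estimate the population in each category by adding the weights."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[category]] += row[weight]
    return dict(totals)


def record_counts(rows, category):
    """Count sample records in each category (not a population estimate)."""
    counts = defaultdict(int)
    for row in rows:
        counts[row[category]] += 1
    return dict(counts)


estimates = weighted_totals(records, "state", "FINPRSWT")
counts = record_counts(records, "state")
print(estimates)
print(counts)
```

In this made-up sample the NT has more records than NSW, yet its weighted estimate is much smaller, because each NT record represents fewer people. Raw counts would reverse the picture.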

COUNTING UNITS AND WEIGHT

The counting unit for level one is the household, for level two the person, for level three the instance of volunteering and for level four the service that was difficult to access. There is a weight attached to each level in order to estimate the total population of the respective counting unit. The weight on level one is the household weight, and the weight for levels two to four is the person weight.

What you count depends on the level from which you select the weight. A household level weight estimates the number of households with a particular characteristic. Likewise the weight included in the person level estimates the number of persons with the selected characteristics. Replicate weights have also been included and these can be used to calculate the standard error. For more information, refer to the Standard Errors section below.
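
As a rough illustration of how the counting unit changes what is estimated, the sketch below uses a hypothetical voluntary work level extract. The identifier `person_id`, the field `org_type` and all values are assumptions made for this example, and it is also assumed that the person weight appears on this level under the name FINPRSWT.

```python
# Hypothetical voluntary work level records: one row per volunteering
# instance, each carrying the person weight of the respondent.
# "person_id" and "org_type" are invented names, not actual data items.
episodes = [
    {"person_id": 1, "org_type": "sport", "FINPRSWT": 1000.0},
    {"person_id": 1, "org_type": "welfare", "FINPRSWT": 1000.0},
    {"person_id": 2, "org_type": "sport", "FINPRSWT": 800.0},
]

# Counting unit = instance of volunteering: add the weight on every row.
episode_estimate = sum(row["FINPRSWT"] for row in episodes)

# Counting unit = person: keep each respondent's weight only once.
weight_by_person = {row["person_id"]: row["FINPRSWT"] for row in episodes}
volunteer_estimate = sum(weight_by_person.values())

print(episode_estimate)    # estimated instances of volunteering
print(volunteer_estimate)  # estimated persons who volunteered
```

Adding the weight across all rows of a lower level estimates instances, not persons. A person estimate needs each respondent to contribute their weight once.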

Standard Errors

Each record on the household level and person level also contains 60 replicate weights and, by using these weights, it is possible to calculate standard errors for weighted estimates produced from the microdata. This method is known as the 60 group Jack-knife variance estimator.

Under the Jack-knife method of replicate weighting, weights were derived as follows:
  • 60 replicate groups were formed with each group formed to mirror the overall sample (where units from a collection district all belong to the same replicate group and a unit can belong to only one replicate group)
  • one replicate group was dropped from the file and then the remaining records were weighted in the same manner as for the full sample
  • records in that group that were dropped received a weight of zero

This process was repeated for each replicate group (i.e. a total of 60 times). Ultimately each record had 60 replicate weights attached to it with one of these being the zero weight.
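
The drop-one-group process described above can be sketched as follows. This is a simplification and not the actual weighting procedure: the re-weighting of the remaining records is replaced here by a plain rescaling so that each set of replicate weights adds to the full sample total. The function name, the group assignment and the weights are hypothetical.

```python
def make_replicate_weights(weights, groups, n_groups=60):
    """Return, for each record, a list of n_groups replicate weights.

    Simplifying assumption: after a group is dropped, the remaining records
    are rescaled to the original weighted total. The real procedure re-runs
    the full weighting process instead.
    """
    total = sum(weights)
    replicates = [[0.0] * n_groups for _ in weights]
    for g in range(n_groups):
        kept_total = sum(w for w, grp in zip(weights, groups) if grp != g)
        factor = total / kept_total
        for i, (w, grp) in enumerate(zip(weights, groups)):
            if grp != g:
                replicates[i][g] = w * factor
            # records in the dropped group keep a replicate weight of zero
    return replicates


# Toy file: 120 records, two per replicate group, each with weight 100.
weights = [100.0] * 120
groups = [i % 60 for i in range(120)]
replicates = make_replicate_weights(weights, groups)

zero_counts = [sum(1 for w in rep if w == 0.0) for rep in replicates]
print(set(zero_counts))
```

Each record ends up with 60 replicate weights, exactly one of which is zero (the one for its own replicate group).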

Replicate weights enable variances of estimates to be calculated relatively simply. They also enable unit record analyses such as chi-square and logistic regression to be conducted which take into account the sample design. For any variable of interest, an estimate can be calculated using each of the 60 sets of replicate weights, giving 60 replicate estimates. The distribution of this set of replicate estimates, in conjunction with the full sample estimate (based on the general weight), is then used to approximate the variance of the full sample estimate.

To obtain the standard error of a weighted estimate y, the same estimate is calculated using each of the 60 replicate weights. The variability between these replicate estimates (denoted y(g) for group number g) is used to measure the standard error of the original weighted estimate y using the formula:

$$SE(y) = \sqrt{\frac{59}{60} \sum_{g=1}^{60} \left( y_{(g)} - y \right)^{2}}$$

where:

g = the replicate group number

y(g) = the weighted estimate, having applied the weights for replicate group g

y = the weighted estimate from the sample.
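
A minimal sketch of this calculation in Python, assuming the usual 60 group delete-a-group Jack-knife form, in which the standard error is the square root of 59/60 times the sum of squared differences between each replicate estimate y(g) and the full sample estimate y. The replicate weight names `REPWT01` to `REPWT60`, the `volunteer` flag and the toy data are hypothetical. Only FINPRSWT is a name taken from the text.

```python
import math

N_GROUPS = 60
# Hypothetical replicate weight names; check the data item list for the
# names actually used on the file.
REP_NAMES = [f"REPWT{g:02d}" for g in range(1, N_GROUPS + 1)]


def jackknife_se(full_estimate, replicate_estimates):
    """Standard error from the full estimate y and replicate estimates y(g)."""
    n = len(replicate_estimates)
    squared_diffs = sum((y_g - full_estimate) ** 2 for y_g in replicate_estimates)
    return math.sqrt((n - 1) / n * squared_diffs)


def weighted_total(rows, weight_name):
    """Weighted estimate of persons with the (invented) volunteer flag set."""
    return sum(row[weight_name] for row in rows if row["volunteer"])


# Toy file: 60 records, one per replicate group, each with weight 60.
rows = []
for i in range(N_GROUPS):
    row = {"volunteer": i < 30, "FINPRSWT": 60.0}
    for g, name in enumerate(REP_NAMES):
        # zero in the record's own group, rescaled weight elsewhere
        row[name] = 0.0 if g == i else 60.0 * N_GROUPS / (N_GROUPS - 1)
    rows.append(row)

y = weighted_total(rows, "FINPRSWT")
y_reps = [weighted_total(rows, name) for name in REP_NAMES]
se = jackknife_se(y, y_reps)
print(y, se)
```

The same function is applied to the full sample weight and to each replicate weight in turn, so any estimate that can be written as a function of a weight column can be passed through the same steps.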

The 60 group Jack-knife method can be applied not just to estimates of the population total, but also where the estimate y is a function of estimates of the population total, such as a proportion, difference or ratio. For more information on the 60 group Jack-knife method of SE estimation, see Research Paper: Weighting and Standard Error Estimation for ABS Household Surveys (Methodology Advisory Committee), July 1999 (cat. no. 1352.0.55.029).
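
For an estimate that is a function of totals, such as a proportion, the function is recalculated under each replicate weight and the same formula is applied to the results. The sketch below assumes the 59/60 scaling of the 60 group Jack-knife and uses hypothetical replicate weight names (`REPWT01` to `REPWT60`), an invented `volunteer` flag and toy data.

```python
import math

N_GROUPS = 60
REP_NAMES = [f"REPWT{g:02d}" for g in range(1, N_GROUPS + 1)]  # hypothetical


def jackknife_se(full_estimate, replicate_estimates):
    """Standard error from the full estimate y and replicate estimates y(g)."""
    n = len(replicate_estimates)
    squared_diffs = sum((y_g - full_estimate) ** 2 for y_g in replicate_estimates)
    return math.sqrt((n - 1) / n * squared_diffs)


def weighted_proportion(rows, weight_name, flag):
    """Ratio of two weighted totals: flagged persons over all persons."""
    numerator = sum(row[weight_name] for row in rows if row[flag])
    denominator = sum(row[weight_name] for row in rows)
    return numerator / denominator


# Toy file: 60 records, one per replicate group, half of them flagged.
rows = []
for i in range(N_GROUPS):
    row = {"volunteer": i < 30, "FINPRSWT": 60.0}
    for g, name in enumerate(REP_NAMES):
        row[name] = 0.0 if g == i else 60.0 * N_GROUPS / (N_GROUPS - 1)
    rows.append(row)

p = weighted_proportion(rows, "FINPRSWT", "volunteer")
p_reps = [weighted_proportion(rows, name, "volunteer") for name in REP_NAMES]
se_p = jackknife_se(p, p_reps)
print(p, se_p)
```

The proportion itself is recomputed for each replicate, rather than combining separate standard errors for the numerator and the denominator.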

Use of the 60 group Jack-knife method for complex estimates, such as regression parameters from a statistical model, is not straightforward and may not be appropriate. The method as described does not apply to investigations where survey weights are not used, such as in unweighted statistical modelling.

