The first four levels of SDAC data are in a hierarchical relationship, where each level is derived from the previous. These levels can be described as follows: a person is a member of an income unit, which is a member of a family, which is a member of a household. A household may have more than one family, a family may have more than one income unit, and an income unit may have more than one person.
Levels 5 to 9 relate to the characteristics of conditions, restrictions and activities, with each being a sub-level of level 4 (Person). That is, a person can have multiple conditions and restrictions, as well as require assistance with one or more activities. Level 10 is a sub-level of level 9 (Broad activities), as it relates to characteristics of the assistance provided for activities identified in level 9. An activity can be undertaken with the assistance of one or more providers.
There are ‘dummy’ or ‘Not Applicable’ records at each of the sub-person levels 5 to 9, which allow for those instances where a person does not contribute a record to a particular level. For example, a person with no conditions has no genuine record on the 'All conditions' level, so a ‘Not Applicable’ record is held for them instead. Because every person is therefore represented on each of these levels, data items on sub-person levels can be used for calculating the total of ‘all persons’.
Additionally, ‘Not Applicable’ records exist at the Assistance Providers level for those people who experience difficulty with a broad activity (i.e. a record exists on level 9), but do not have a provider of assistance for that activity (i.e. no record exists on level 10).
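The role of the 'Not Applicable' records can be illustrated with a minimal sketch. The records, values and field names (`person_id`, `condition`, `weight`) below are made up for illustration and are not actual SDAC data item names.

```python
# Hypothetical person-level records: person id -> person weight (illustrative values).
persons = {1: 250.0, 2: 300.0, 3: 150.0}

# Hypothetical conditions reported by each person; person 3 reported none.
reported = {1: ["Asthma", "Arthritis"], 2: ["Asthma"], 3: []}

NOT_APPLICABLE = "Not applicable"


def build_conditions_level(persons, reported):
    """Return one record per reported condition, plus a dummy record
    for each person who reported no conditions."""
    records = []
    for pid, weight in persons.items():
        conditions = reported.get(pid) or [NOT_APPLICABLE]
        for cond in conditions:
            records.append({"person_id": pid, "condition": cond, "weight": weight})
    return records


conditions_level = build_conditions_level(persons, reported)

# Every person appears on the level at least once, so an 'all persons'
# total can still be derived from the sub-person level by counting
# each person's weight once.
weight_by_person = {r["person_id"]: r["weight"] for r in conditions_level}
all_persons_estimate = sum(weight_by_person.values())
```

Without the dummy record for person 3, the same calculation would cover only persons 1 and 2.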
Broadly, each level provides the following:
The 'one to many' relationships described by levels 5 to 10 are known as repeating datasets, that is, sets of data with a counting unit that may be repeated for a person. For example, the repeating dataset for conditions will have one record per condition reported, because condition is the counting unit. Repeating datasets are only useful when common information is collected for each instance of a counting unit. For example, each condition reported has the data item 'Whether reported condition is main condition' associated with it. Only one of the conditions reported for a particular person has a 1 (Yes) for this item. This enables a table to be run on 'All conditions' by 'Whether reported condition is main condition' to ascertain which conditions cause the most problems.
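As a rough illustration of a repeating dataset, the sketch below holds the 'All conditions' records for one hypothetical person and tabulates them by the main-condition flag. The field names and condition labels are invented, and the code 0 for 'No' is an assumption (the text only states that 1 means Yes). Counts are unweighted to keep the example short.

```python
from collections import Counter

# Hypothetical 'All conditions' records for a single person:
# one record per condition reported (condition is the counting unit).
records = [
    {"person_id": 1, "condition": "Arthritis", "is_main": 1},     # 1 = Yes
    {"person_id": 1, "condition": "Asthma", "is_main": 0},        # 0 = No (assumed code)
    {"person_id": 1, "condition": "Hearing loss", "is_main": 0},
]

# Tabulate records by 'Whether reported condition is main condition'.
table = Counter(r["is_main"] for r in records)
total = sum(table.values())  # counts conditions, not persons

# The condition flagged as the person's main condition.
main_conditions = [r["condition"] for r in records if r["is_main"] == 1]

print("Main condition:", table[1])
print("Not main condition:", table[0])
print("Total conditions:", total)
```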
Note that although the output above only relates to a single person, the totals are a count of all conditions for that person. As with the person level file, some data items in a repeating dataset are only applicable to a particular sub-population of the dataset. For instance, the item 'Whether assistance is always or sometimes required with each activity' from the specific activities level is only applicable for activities where the respondent needs assistance. Records outside the sub-population will appear as 'Not applicable'.
As the survey was conducted on a sample of households in Australia, it is important to take account of the method of sample selection when deriving estimates. This is particularly important as a person's chance of selection in the survey varied depending on the state or territory in which they lived. Survey 'weights' are values which indicate how many population units are represented by the sample unit.
There are two survey weights provided: a person weight (FINWTP) and a household weight (FINWTH). These should be used when analysing data at the person and household level respectively. The household weight should also be used for the family level and income unit level and the person weight for all other levels.
Where estimates are derived, it is essential that they are calculated by adding the weights of persons or households, as appropriate, in each category, and not just by counting the number of records falling into each category. If each person's or household's 'weight' were to be ignored, then no account would be taken of a person's or household's chance of selection in the survey or of different response rates across population groups, with the result that counts produced could be seriously biased. The application of weights ensures that:
The counting unit for level one is the household, for level two the family, for level three the income unit, for level four the person, for level five all conditions, for level six all restrictions, for level seven all specific activities, for level eight recipients of care, for level nine all broad activities and for level ten all assistance providers. There is a weight attached to each level in order to estimate the total population of the respective counting unit. The weight on levels one to three is the household weight and the weight on levels four to ten is the person weight.
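The level-to-weight rule above can be written down as a small lookup. This is only a sketch: the counting units and the weight names FINWTH and FINWTP come from the text, while the dictionary and function names are hypothetical.

```python
HOUSEHOLD_WEIGHT = "FINWTH"
PERSON_WEIGHT = "FINWTP"

# Counting unit for each level, as listed in the text.
COUNTING_UNITS = {
    1: "Household",
    2: "Family",
    3: "Income unit",
    4: "Person",
    5: "All conditions",
    6: "All restrictions",
    7: "All specific activities",
    8: "Recipients of care",
    9: "All broad activities",
    10: "All assistance providers",
}


def weight_for_level(level):
    """Return the name of the weight to use on the given level:
    the household weight on levels 1 to 3, the person weight on levels 4 to 10."""
    if level not in COUNTING_UNITS:
        raise ValueError(f"Unknown level: {level}")
    return HOUSEHOLD_WEIGHT if level <= 3 else PERSON_WEIGHT
```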
What you count depends on the level from which you select the weight. A household level weight estimates the number of households with a particular characteristic. Likewise, the weight included in the family level estimates the number of families, and the weight included in the income unit level estimates the number of income units, with the selected characteristic. Only private dwellings are included at the household, family and income unit levels.
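The difference between summing weights and counting records can be seen in a minimal sketch. The household records, the `tenure` item and the weight values are invented for illustration; only the weight name FINWTH is taken from the text.

```python
# Hypothetical household-level records with a household weight.
households = [
    {"hh_id": "A", "tenure": "Owner", "FINWTH": 420.0},
    {"hh_id": "B", "tenure": "Renter", "FINWTH": 380.0},
    {"hh_id": "C", "tenure": "Owner", "FINWTH": 510.0},
]


def weighted_estimate(records, weight_name, predicate):
    """Estimate the population count in a category by summing the
    weights of the records that fall into it."""
    return sum(r[weight_name] for r in records if predicate(r))


def is_owner(record):
    return record["tenure"] == "Owner"


# Weighted estimate of the number of owner households.
owners_weighted = weighted_estimate(households, "FINWTH", is_owner)

# A raw record count ignores each household's chance of selection.
owners_unweighted = sum(1 for r in households if is_owner(r))
```

Here the two sample records represent an estimated 930 households. The raw count of 2 says nothing about the population.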
A person weight stored on the person level, or below, provides an estimate of the number of persons with the selected characteristic. When the weights from levels five to ten are used, the population is restricted to persons who have a record on the particular level, and the person weight is repeated for each instance of the counting unit. Summing the weights over all records on such a level therefore estimates the number of instances (for example, conditions) rather than the number of persons. Replicate weights have also been included and these can be used to calculate the standard error. For more information, refer to the Standard Errors section below.
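The effect of the person weight being repeated on a sub-person level can be sketched as follows. The records are invented, 'Not Applicable' dummy records are left out for brevity, and only the weight name FINWTP comes from the text.

```python
# Hypothetical 'All conditions' records: the person weight is repeated
# on every record belonging to the same person.
conditions_level = [
    {"person_id": 1, "condition": "Asthma", "FINWTP": 250.0},
    {"person_id": 1, "condition": "Arthritis", "FINWTP": 250.0},
    {"person_id": 2, "condition": "Asthma", "FINWTP": 300.0},
]

# Summing over every record estimates the number of conditions,
# because each instance of the counting unit carries the weight.
est_conditions = sum(r["FINWTP"] for r in conditions_level)

# To estimate persons with at least one record on the level,
# count each person's weight only once.
weight_by_person = {r["person_id"]: r["FINWTP"] for r in conditions_level}
est_persons = sum(weight_by_person.values())
```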
Each record on the household level and person level also contains 60 replicate weights and, by using these weights, it is possible to calculate standard errors for weighted estimates produced from the microdata. This method is known as the 60 group Jack-knife variance estimator.
Under the Jack-knife method of replicate weighting, weights were derived as follows:
This process was repeated for each replicate group (i.e. a total of 60 times). Ultimately each record had 60 replicate weights attached to it with one of these being the zero weight.
Replicate weights enable variances of estimates to be calculated relatively simply. They also enable unit record analyses, such as chi-square tests and logistic regression, to be conducted in a way that takes the sample design into account. For any variable of interest, an estimate can be calculated with each of the 60 sets of replicate weights, giving 60 replicate estimates. The distribution of this set of replicate estimates, in conjunction with the full sample estimate (based on the general weight), is then used to approximate the variance of the full sample estimate.
To obtain the standard error of a weighted estimate y, the same estimate is calculated using each of the 60 replicate weights. The variability between these replicate estimates (denoted y(g) for group number g) is used to measure the standard error of the original weighted estimate y using the formula:

$$SE(y) = \sqrt{\frac{59}{60} \sum_{g=1}^{60} \left(y(g) - y\right)^2}$$

where:
g = the replicate group number
y(g) = the weighted estimate, having applied the weights for replicate group g
y = the weighted estimate from the sample.
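The calculation can be sketched in a few lines, assuming the usual delete-a-group form of the 60 group Jack-knife, SE(y) = sqrt((59/60) × Σ (y(g) − y)²). The data are synthetic: each of 60 records forms its own replicate group, and its replicate weight is set to zero in that group and scaled up by 60/59 elsewhere. That construction is an assumption made for illustration and is not the procedure used to derive the SDAC replicate weights.

```python
import math


def weighted_total(values, weights):
    """Weighted estimate of a population total."""
    return sum(v * w for v, w in zip(values, weights))


def jackknife_se(full_estimate, replicate_estimates):
    """Standard error from replicate estimates, using the
    delete-a-group Jack-knife form sqrt((G-1)/G * sum((y(g) - y)^2))."""
    n_groups = len(replicate_estimates)
    sum_sq = sum((y_g - full_estimate) ** 2 for y_g in replicate_estimates)
    return math.sqrt((n_groups - 1) / n_groups * sum_sq)


N_GROUPS = 60

# Synthetic sample: 60 records, each with a full-sample weight of 10.
# The first 30 records have the characteristic of interest (value 1).
values = [1 if i < 30 else 0 for i in range(N_GROUPS)]
full_weights = [10.0] * N_GROUPS

# Synthetic replicate weights (illustrative assumption): record g gets a
# zero weight in replicate group g, and all other records are scaled up.
replicate_weights = [
    [0.0 if i == g else 10.0 * N_GROUPS / (N_GROUPS - 1) for i in range(N_GROUPS)]
    for g in range(N_GROUPS)
]

y = weighted_total(values, full_weights)
y_reps = [weighted_total(values, w) for w in replicate_weights]
se = jackknife_se(y, y_reps)

print(f"estimate = {y:.1f}, standard error = {se:.2f}")
```

With real microdata, the 60 replicate estimates would instead be produced from the 60 replicate weights stored on each record.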
The 60 group Jack-knife method can be applied not just to estimates of the population total, but also where the estimate y is a function of estimates of the population total, such as a proportion, difference or ratio. For more information on the 60 group Jack-knife method of SE estimation, see Research Paper: Weighting and Standard Error Estimation for ABS Household Surveys (Methodology Advisory Committee), July 1999 (cat. no. 1352.0.55.029).
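For an estimate that is a function of totals, such as a proportion, the same approach applies: the proportion is recomputed with each set of replicate weights and the replicate proportions are fed into the standard error formula. The sketch below assumes the delete-a-group form sqrt((59/60) × Σ (y(g) − y)²) and uses synthetic records and replicate weights, built only for illustration.

```python
import math


def weighted_proportion(flags, weights):
    """Weighted proportion of the population with the characteristic."""
    return sum(w for f, w in zip(flags, weights) if f) / sum(weights)


def jackknife_se(full_estimate, replicate_estimates):
    """Delete-a-group Jack-knife standard error (assumed form)."""
    n_groups = len(replicate_estimates)
    sum_sq = sum((y_g - full_estimate) ** 2 for y_g in replicate_estimates)
    return math.sqrt((n_groups - 1) / n_groups * sum_sq)


N_GROUPS = 60

# Synthetic sample: 15 of 60 records have the characteristic.
flags = [i < 15 for i in range(N_GROUPS)]
full_weights = [10.0] * N_GROUPS

# Synthetic replicate weights (illustrative assumption): one record is
# dropped per replicate group and the rest are scaled up by 60/59.
replicate_weights = [
    [0.0 if i == g else 10.0 * N_GROUPS / (N_GROUPS - 1) for i in range(N_GROUPS)]
    for g in range(N_GROUPS)
]

p = weighted_proportion(flags, full_weights)
p_reps = [weighted_proportion(flags, w) for w in replicate_weights]
se_p = jackknife_se(p, p_reps)

print(f"proportion = {p:.3f}, standard error = {se_p:.4f}")
```

Both the numerator and the denominator are recalculated within each replicate, rather than applying the formula to each total separately.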
Use of the 60 group Jack-knife method for complex estimates, such as regression parameters from a statistical model, is not straightforward and may not be appropriate. The method as described does not apply to investigations where survey weights are not used, such as in unweighted statistical modelling.