4430.0.30.002 - Microdata: Disability, Ageing and Carers, Australia, 2003 (Reissue) Quality Declaration 
ARCHIVED ISSUE Released at 11:30 AM (CANBERRA TIME) 22/07/2005  Reissue

This document was added or updated on 02/10/2012.




The 2003 SDAC microdata contains a set of confidentialised records obtained from the 2003 SDAC. The files are arranged in a hierarchy made up of the following 10 levels:
1. Household
2. Family
3. Income Unit
4. Person
5. All recipients
6. All conditions
7. Restrictions
8. Specific Activities
9. Broad Activities
10. Providers of assistance

Nature of the levels

The first four levels are in a hierarchical relationship: a person is a member of an income unit, which is a member of a family, which is a member of a household (see diagram 1). Levels five to nine are in a hierarchical relationship with the person level and level ten is in a hierarchical relationship with level nine (see diagram 2). All person and lower level records link to a household, family and income unit record; however, lower level records only exist where the person is in the relevant population.

Diagram 1: Levels one to four

Diagram 2: Levels four to ten

Type of information on each level

The Household level contains information about type of dwelling, household structure, amount and sources of income, and presence of a primary carer or a person with a disability. A household is defined as one or more persons usually resident in the same private dwelling; cared-accommodation and non-private dwellings are therefore not included on this level. There are 13,996 records at this level.

The Family level contains information about family type, employment characteristics of partners and lone parents, sources of income, and presence of a primary carer or a person with a disability. A family is defined as two or more related people who usually live together; non-family units, cared-accommodation and non-private dwellings are not included on this level. There are 10,373 records at the Family level.

The Income unit level contains data about income unit type, amount and sources of income, and whether the income unit contained a primary carer or a person with a disability. An income unit is defined as a group of two or more related persons in the same household assumed to pool their income and savings and share the benefits deriving from them equitably; or one person assumed to have sole command over his or her income, consumption and savings. Unlike the Family level, the Income unit level therefore includes one-person households. Cared-accommodation and non-private dwellings are not included on this level. This level contains 17,490 records.

The Person level contains information about geographical area, housing arrangements, education, employment, amount and sources of income, disability status, long-term health condition status and carer status. For persons identified as having a disability or aged 60 and over, this level contains items on level of participation in education, employment and social and community activities, use of personal computers and Internet, need and receipt of assistance with various activities, and use of aids to carry out activities. For persons identified as carers, this level contains information on care provided to others such as type of care provided and the type(s) of tasks for which the carer provided assistance.

The Person level also contains information on primary carers. For persons identified as primary carers, the person identified as the main recipient of care may have been living with the carer or living elsewhere. In situations where the main recipient was living in the same household as the carer, some information was copied from the record of the recipient to the record of the care provider. Items include sex, age, relationship and main activity restrictions of the main recipient of care. Data items about the carer include the range of assistance provided to the main recipient of care; use and availability of respite care; level and forms of support for the carer, both formal and informal; level of need for support in the carer role; and effects of the caring role on the carer's physical, emotional and financial well-being. The Person level file contains 41,233 records.

The Recipient level contains information on all care recipients for a particular carer. This level is not available to TableBuilder users. Data items include age, sex, disability status, long-term health condition status, and main disability, of each recipient of care. It also contains data items on whether the carer provides assistance to each recipient with various activities. As this level is a repeating dataset there is a separate record for each recipient and there is a repeat of the carer's ABSPID where there is more than one care recipient. This allows more powerful tables to be constructed easily such as sex of all recipients (regardless of whether they were the first, second or third reported). The counting unit for this level is all recipients so totals calculated on this level will be total recipients, not total carers.

The All conditions level contains information reported on long-term health conditions. The data contained are on types of conditions reported and whether each condition was identified as the main condition. Information on conditions was reported at various points in the questionnaire but this level combines all that information and enables tables to be run on all conditions regardless of where they were reported. This level is a repeating dataset where a person record may appear a number of times, once for each condition reported. This allows more powerful tables (such as All conditions by Whether main condition) to be run simply. As the counting unit on this level is conditions, totals will be total conditions reported, not total persons.

The Restriction level contains restriction information for all persons who reported a restriction in one or more activities. This level has been labelled 'Restriction level' to follow the terminology used in the questionnaire (i.e. 'Are you restricted in everyday activities because of this condition?') and to avoid confusion with the 'Disability level' from 1998. This level is similar in concept to 'Restricting impairments' from 1998 but includes all restrictions, including hearing loss. The data items on this level include types of disability; total or partial loss of sight/hearing/speech; whether each disability was caused by brain damage and the cause of the brain damage; and whether each disability causes the most problems for the respondent. All items on this level relate to the 'Type of disability' item, so other items should be cross-classified by it. For example, to use the data item 'Whether disability was caused by brain damage' it is necessary to cross-classify it by 'Types of disability' to determine which disability was caused by brain damage. Using the 'Whether disability was caused by brain damage' item on its own will give meaningless data because a respondent may have three disabilities, one of which was caused by brain damage and two of which were not. This level is a repeating dataset and the counting unit is restrictions, so totals calculated on this level will be total restrictions reported, not total persons.
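The brain damage example can be illustrated with a minimal Python sketch. The records, field names and category labels below are invented for illustration and are not the actual data items or classification values on the file.

```python
from collections import Counter

# Hypothetical Restriction-level records for one respondent: one record per
# restriction reported. Field names and labels are invented for illustration.
restrictions = [
    {"person": "P1", "disability_type": "Speech difficulties", "brain_damage": "Yes"},
    {"person": "P1", "disability_type": "Loss of hearing", "brain_damage": "No"},
    {"person": "P1", "disability_type": "Chronic pain", "brain_damage": "No"},
]

# Used on its own, the item only shows one "Yes" and two "No" answers,
# with no indication of which disability each answer refers to.
alone = Counter(r["brain_damage"] for r in restrictions)

# Cross-classified by type of disability, each answer is tied to a restriction.
by_type = Counter((r["disability_type"], r["brain_damage"]) for r in restrictions)

for (dtype, answer), n in sorted(by_type.items()):
    print(f"{dtype:22s} {answer:3s} {n}")
```

The cross-classified counts are counts of restrictions, so the three rows here all belong to the same person.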

The Specific activities level contains information on specific activities where assistance or supervision is needed or difficulty is experienced by the respondent (e.g. getting around away from home, moving about the house or getting in or out of a bed or chair are specific activities of the broad activity 'mobility'). Data items are on types of specific activity, whether assistance or supervision is required or difficulty experienced, and whether assistance or supervision is always or sometimes needed. All other items on this file relate to 'Types of specific activity' so other items should be cross-classified by this item. This file enables powerful matrix tables to be run on all specific activities by whether assistance is always needed, sometimes needed or only difficulty is experienced. This level is a repeating dataset where a single person may be present a number of times, once for each specific activity reported. The counting unit for this level is specific activities so totals on this level will be total specific activities reported, not total persons. Note that for reading/writing/paperwork the information on whether assistance is always or sometimes needed is only available for respondents from cared-accommodation establishments.

The Broad activities level contains information on broad areas of activity where assistance or supervision is needed or difficulty experienced (e.g. mobility). The data items include the broad area of activity, whether assistance or supervision is needed or difficulty experienced with each activity, frequency of need for assistance or supervision with each activity, whether receives formal or informal assistance with each activity, extent to which need for assistance with each activity is met, whether need more formal or informal assistance with each activity, and main reason not receiving more formal or informal assistance with each activity. All other items on this file relate to the 'Broad areas of activity' so other items should be cross-classified by this item. This level is a repeating dataset with a record for each broad area of activity. This enables powerful matrix tables to be run on all broad areas of activity by other items such as the frequency of need for assistance or supervision. As the counting unit on this level is broad activities, totals calculated on this level will be total broad areas of activity reported, not total persons.

The Assistance providers level contains information on the persons or organisations that provide assistance to respondents who require assistance or supervision with a broad area of activity. Data items include provider of assistance for broad activity, broad area of activity for which assistance is provided, type of organisation provider belongs to, whether each provider supplies the main assistance, how recipient found out about each formal provider, and whether each informal provider co-resides with the recipient. All other items on this file relate to 'Providers of assistance' so other items should generally be cross-classified by this item. This level can be linked with the broad areas of activity level (by merging) because information on providers is collected according to broad areas of activity. Linking the levels allows powerful matrix tables to be run such as all broad areas of activity by all providers of assistance by extent to which need for assistance is met. The counting unit for this level is providers of assistance so totals will be total providers of assistance, not total persons. Note that some providers are not applicable for some activities.
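As a rough illustration of linking the two levels by merging, the Python sketch below joins provider records to broad activity records. The identifier and field names (pid, activity, need_met, provider) and all values are hypothetical; the actual linking identifiers should be taken from the data item list.

```python
# Hypothetical Broad activities records: one per person per broad activity.
broad = [
    {"pid": "P1", "activity": "Mobility", "need_met": "Partly"},
    {"pid": "P1", "activity": "Self care", "need_met": "Fully"},
]

# Hypothetical Assistance providers records: one per provider per activity.
providers = [
    {"pid": "P1", "activity": "Mobility", "provider": "Spouse"},
    {"pid": "P1", "activity": "Mobility", "provider": "Home care service"},
    {"pid": "P1", "activity": "Self care", "provider": "Spouse"},
]

# Index the higher level by person and activity, then attach its items
# to each provider record.
broad_index = {(b["pid"], b["activity"]): b for b in broad}
linked = [
    {**p, "need_met": broad_index[(p["pid"], p["activity"])]["need_met"]}
    for p in providers
]

for row in linked:
    print(row["activity"], "|", row["provider"], "|", row["need_met"])
```

The merged file keeps the provider as its counting unit: there are three linked records here, although only one person and two broad activities are involved.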

The following table shows the number of records on each level of the CURF and STB:


Level                         Counting Unit           Number of Records
Household level               Household               13,996
Family level                  Family                  10,373
Income Unit level             Income Unit             17,490
Person level                  Person                  41,233
All conditions level          Conditions
Restrictions level            Restrictions
Specific activities level     Specific activities
All recipients level          Recipients
Broad activities level        Broad Activities
Assistance providers level    Assistance providers

Note: Level 8 (All Recipients) has not been released in TableBuilder due to the complexity involved in using and interpreting data on this level. Unlike other sub-person levels, the All Recipients level contains a ‘many to many’ relationship. This describes the ability for a ‘Recipient of care’ to have multiple 'Carers' while a 'Carer' can also have multiple ‘Recipients of care’. To discuss your data needs relating to this level, please contact the ABS National Information Referral Service (NIRS) on 1300 135 070.

Using repeating datasets

A repeating dataset is a set of data whose counting unit may be repeated for a person. For example, a person may have more than one condition. A repeating dataset for conditions will have one record per condition reported because condition is the counting unit (see Table 1 below). Repeating datasets are only useful when common information is collected for each instance of a counting unit. For example, each condition reported has the data item "Whether reported condition is main condition" associated with it. This data item corresponds to each condition reported. Note that only one of the conditions reported for a particular person has a 1 (Yes) for "Whether reported condition is main condition". This enables a table to be run on All conditions (CODECODC) by Whether each condition is the main condition (WTHMAINC) to ascertain which conditions cause the most problems.

Table 1: Example of 'All conditions' repeating dataset



To run the table mentioned above the following SAS code (or equivalent) can be used with the CURF:


The following output would be produced using the example dataset:



Note that although the output above only relates to a single person, the totals are a count of all conditions for that person.
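For illustration only, the same kind of cross-tabulation can be sketched in Python. This is not the SAS code referred to above: the three records are invented, the condition labels stand in for the real CODECODC codes, and the value 2 for 'No' on WTHMAINC is an assumption (only 1 = Yes is stated in the text).

```python
from collections import Counter

# Invented 'All conditions' records for a single person (ABSPID repeats).
records = [
    {"ABSPID": "0001", "CODECODC": "Arthritis", "WTHMAINC": 1},
    {"ABSPID": "0001", "CODECODC": "Asthma", "WTHMAINC": 2},
    {"ABSPID": "0001", "CODECODC": "Hearing loss", "WTHMAINC": 2},
]

# Cross-tabulate condition by whether it is the main condition.
table = Counter((r["CODECODC"], r["WTHMAINC"]) for r in records)

total_conditions = sum(table.values())            # counting unit: conditions
persons = len({r["ABSPID"] for r in records})     # distinct persons

for (condition, is_main), n in sorted(table.items()):
    print(f"{condition:14s} WTHMAINC={is_main} count={n}")
print("total conditions:", total_conditions, "| persons:", persons)
```

The counts are unweighted here to keep the example small; estimates from the file would sum the person weight instead of counting records.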

As with the Person level file, some data items in a repeating dataset are only applicable to a particular sub-population of the dataset. For instance, the item "Whether assistance is always or sometimes required with each activity" from the specific activities level is only applicable for activities where the respondent needs assistance. Records outside the sub-population will appear as "Not applicable".

Counting units and weights

The counting unit for:
  • level one is the household
  • level two the family
  • level three the income unit
  • level four the person
  • level five recipients of care
  • level six all health conditions
  • level seven all restrictions
  • level eight all specific activities
  • level nine all broad activities
  • level ten all providers of assistance.
There is a weight attached to each level, to estimate the total population of the respective counting unit. Levels one to three use the household weight and levels four to ten the person weight.

The weight used determines what is counted. A household level weight estimates the number of households with a particular characteristic. Likewise, a person weight estimates the number of persons with the selected characteristics. Only private dwellings are included at the household, family and income unit levels.

Replicate weights have also been included and these can be used to calculate the sampling error. For more information, refer to the 'Sampling Error' section below.

Transferred items

Items containing data transferred from one person's record to another's can sometimes be confusing, both in selecting an appropriate item and in understanding the counting unit. For example, analysing the age distribution of a population of primary carers requires the following actions:
  • identify people who are primary carers, e.g. where PIDPCPOP = 1
  • use the 'Age of person' variable (AGEPC) from the person level record
  • select the person weight from the person level (PERS_WT).
The item 'Age of principal carer' (MAPPCAGC) only exists on the care recipient's record and only has positive responses on the records of people who have a principal carer. Using this item with a person weight counts persons with a disability or people aged 60 and over, many of whom are 'not applicable', as they do not have a principal carer; those with positive responses constitute the population of main recipients of care.

Using the 'Age of principal carer' item with an item such as 'Primary carer populations = primary carers' and a person weight does not give an age distribution of all primary carers. Instead, it counts the number of primary carers who are themselves main recipients of care, by the age of their principal carer.
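The three steps for the age distribution of primary carers can be sketched as follows. The item names PIDPCPOP, AGEPC and PERS_WT are those given above; the records, age group labels and weight values are invented, and the code 2 for persons who are not primary carers is an assumption.

```python
from collections import defaultdict

# Invented Person-level records.
persons = [
    {"PIDPCPOP": 1, "AGEPC": "45-54", "PERS_WT": 510.0},
    {"PIDPCPOP": 1, "AGEPC": "65-74", "PERS_WT": 420.5},
    {"PIDPCPOP": 2, "AGEPC": "45-54", "PERS_WT": 480.0},  # not a primary carer
    {"PIDPCPOP": 1, "AGEPC": "45-54", "PERS_WT": 390.0},
]

# Select primary carers, classify by their own age (AGEPC, not the
# transferred item MAPPCAGC) and sum the person weight.
age_dist = defaultdict(float)
for p in persons:
    if p["PIDPCPOP"] == 1:
        age_dist[p["AGEPC"]] += p["PERS_WT"]

for age_group in sorted(age_dist):
    print(age_group, age_dist[age_group])
```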

If your interest is in people with a disability, and you want to analyse them by their main condition, such as dementia or arthritis, etc.:
  • identify people with a disability
  • select the 'Main condition' item from their Person level record
  • select the person weight from the Person level.
Using the item 'Main disabling condition of main care recipient' and the person weight from the Person level identifies a population of primary carers whose main recipient co-resides with them and has dementia, or arthritis, or some other condition as a main condition; using an item such as 'Disability status (1)' with this item provides information on the disability status of persons who are primary carers of people with the specified condition. It is, therefore, important when using items relating to carers and care recipients to pay particular attention to the populations for items.

Using disability populations

Persons were identified as having a disability if they had one or more of the following limitations, restrictions or impairments which had lasted, or were likely to last, for a period of six months or more and restricted everyday activities. These include:
  • loss of sight (not corrected by glasses or contact lenses)
  • loss of hearing where communication is restricted, or an aid to assist with, or substitute for, hearing is used
  • speech difficulties
  • chronic or recurrent pain or discomfort causing restriction
  • shortness of breath or breathing difficulties causing restriction
  • blackouts, fits, or loss of consciousness
  • difficulty learning or understanding
  • incomplete use of arms or fingers
  • difficulty gripping or holding things
  • incomplete use of feet or legs
  • nervous or emotional condition causing restriction
  • restriction in physical activities or in doing physical work
  • disfigurement or deformity
  • mental illness or condition requiring help or supervision
  • long-term effects of head injury, stroke or other brain damage causing restriction
  • receiving treatment or medication for any other long-term condition or ailment, and still restricted
  • any other long-term condition resulting in a restriction.
If a disfigurement or deformity was reported and it was also reported that this did not restrict the person's everyday activities, it was considered to be a non-restricting disfigurement or deformity. Respondents whose only reported disability was a non-restricting disfigurement or deformity are defined as disabled but, due to a sequencing error, were not asked the full range of questions asked of people with a disability. Questions on assistance with self care, mobility and communication; aids used; telephone and fax use; changes made to dwelling; schooling restrictions; employment restrictions; and employer provided arrangements were not asked of anyone in this group. Questions on assistance with health care, emotion or cognition, household chores, home maintenance, meal preparation, transport, reading and writing; and questions about driver's licence, transport use, social and community participation, computer and Internet use, moving house and whether carer needed to move in were asked of this group only if the respondent was aged 60 years or more. Where respondents in this group were not asked questions, they have been assigned the same category as a non-disabled respondent (usually 'Not applicable') for all data items relating to those questions.

To identify the whole disability population, including persons with only a non-restricting disfigurement or deformity, the data item 'Whether has a disability' (sasname WTHRDIS) should be used. To identify the disability population excluding persons with only a non-restricting disfigurement or deformity, the data item 'Whether has a disability (excluding a non-restrictive disfigurement or deformity)' (sasname WTHRDISB) should be used. To identify persons with only a non-restricting disfigurement or deformity, the criteria WTHRDIS=1 and WTHRDISB=2 should be used.
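These selection rules can be written as a small Python sketch. It assumes that code 1 means 'has a disability' and code 2 means 'does not' on both items, which is implied by the WTHRDIS=1 and WTHRDISB=2 criteria but should be checked against the data item list; the function name and group labels are illustrative.

```python
def disability_group(wthrdis: int, wthrdisb: int) -> str:
    """Classify a person record from WTHRDIS and WTHRDISB (assumed codes 1/2)."""
    if wthrdis == 1 and wthrdisb == 2:
        # In the whole disability population but not the restricted one.
        return "non-restricting disfigurement or deformity only"
    if wthrdisb == 1:
        return "disability excluding non-restricting disfigurement or deformity"
    return "no disability"


print(disability_group(1, 2))
print(disability_group(1, 1))
print(disability_group(2, 2))
```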

As the survey was conducted on a sample of households in Australia, it is important to take account of the method of sample selection when deriving estimates from the microdata. This is particularly important as a person's chance of selection in the survey varied depending on the state or territory in which they lived. Survey 'weights' are values which indicate how many population units are represented by the sample unit.

There are two weights provided on the microdata files: a person weight (PERS_WT) and a household weight (HHOLD_WT). These should be used when analysing data at the person and household level respectively. The household weight should also be used for the family level and income unit level and the person weight for all other levels.

Where estimates are derived from the microdata it is essential that they are calculated by adding the weights of persons or households, as appropriate, in each category, and not just by counting the number of records falling into each category. If each person's or household's 'weight' were to be ignored, then no account would be taken of a person's or household's chance of selection in the survey or of different response rates across population groups, with the result that counts produced could be seriously biased. The application of weights ensures that:
  • person estimates conform to an independently estimated distribution of the population by age, sex, state/territory and section of state, and
  • household estimates conform to an independently estimated distribution of households by certain household characteristics (e.g. by number of adults and children), rather than to the distributions within the sample itself.
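The difference between counting records and adding weights can be shown with a short Python sketch. PERS_WT and HHOLD_WT are the weight items named above; the level labels used as dictionary keys, the HAS_DISABILITY field and the weight values are invented for illustration.

```python
# Weight item for each level, as described in the text: the household weight
# for the household, family and income unit levels, the person weight otherwise.
WEIGHT_FOR_LEVEL = {
    "Household": "HHOLD_WT",
    "Family": "HHOLD_WT",
    "Income unit": "HHOLD_WT",
    "Person": "PERS_WT",
    "All recipients": "PERS_WT",
    "All conditions": "PERS_WT",
    "Restrictions": "PERS_WT",
    "Specific activities": "PERS_WT",
    "Broad activities": "PERS_WT",
    "Assistance providers": "PERS_WT",
}


def weighted_estimate(records, level, in_category):
    """Sum the level's weight over the records that fall in the category."""
    weight_item = WEIGHT_FOR_LEVEL[level]
    return sum(r[weight_item] for r in records if in_category(r))


# Invented person records; HAS_DISABILITY is a hypothetical field.
persons = [
    {"HAS_DISABILITY": True, "PERS_WT": 350.0},
    {"HAS_DISABILITY": False, "PERS_WT": 600.0},
    {"HAS_DISABILITY": True, "PERS_WT": 450.0},
]

record_count = sum(1 for p in persons if p["HAS_DISABILITY"])
estimate = weighted_estimate(persons, "Person", lambda p: p["HAS_DISABILITY"])
print("sample records:", record_count, "| estimated persons:", estimate)
```

The record count describes only the sample, while the sum of weights is the population estimate.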

For weighting purposes, the 2003 SDAC was benchmarked to the estimated population at 30 June 2003, based on results from the 2001 Census of Population and Housing. Although persons in remote or sparse areas were not included in the SDAC, these areas were not removed from benchmarks for practical reasons, except in the Northern Territory. The effect on survey estimates is negligible. It should be noted that separate benchmarks were not available for non-private dwellings at the household level. Consequently, estimates for non-private dwellings may be less reliable than those for private dwellings at that level.

Sampling error is the difference between the published estimates, derived from a sample of persons, and the value that would have been produced had all dwellings in scope of the survey been included. In addition to the 'main weight', the CURF also contains 30 'replicate weights'. The purpose of these replicate weights is to enable calculation of the sampling error on each estimate produced.

The basic idea behind the replication approach is to select subsamples repeatedly (30 times) from the whole sample. For each of these subsamples the statistic of interest is calculated. The variance of the full sample statistic is then estimated using the variability among the replicate statistics calculated from these subsamples. The subsamples are called replicate groups and the statistics calculated from these replicates are called replicate estimates.

There are various ways of creating replicate subsamples from the full sample. The replicate weights produced for the 2003 SDAC have been created under the Jackknife method of replication which is described below.

There are numerous advantages to using the replicate weighting approach. These include:
  • the same procedure is applicable to most statistics such as means, percentages, ratios, correlations, derived statistics and regression coefficients
  • it is not necessary for the analyst to have available detailed survey design information if the replicate weights are included with the data file.
Under the Jackknife method of replicate weighting, weights were derived as follows:
  • 30 replicate groups were formed, with each group formed to mirror the overall sample. Units from the same Collection District (CD) all belong to the same replicate group, and a unit can belong to only one replicate group.
  • One replicate group was dropped from the file and then the remaining records were weighted in the same manner as for the full sample
  • The records in that group that were dropped received a weight of zero
  • This process was repeated for each replicate group (i.e. a total of 30 times)
  • Ultimately each record had 30 replicate weights attached to it, with one of these being the zero weight.
Replicate weights enable variances of estimates to be calculated relatively simply. They also enable unit records analyses such as chi-square and logistic regression to be conducted which take into account the sample design.

Estimates for any variable of interest can be calculated using each of the 30 replicate weights, giving 30 replicate estimates. The distribution of this set of replicate estimates, in conjunction with the full sample estimate (based on the general weight), is then used to approximate the variance of the full sample estimate.

The formula for calculating the Standard error (SE) and relative standard error (RSE) of an estimate using this method is shown below.

Image: formula for calculating the Standard error (SE) and relative standard error (RSE)
This method can also be used when modelling relationships from unit record data, regardless of the modelling technique used. In modelling, the full sample would be used to estimate the parameter being studied, such as a regression coefficient, and the 30 replicate groups would be used to provide 30 replicate estimates of that parameter. The variance of the estimate of the parameter from the full sample is then approximated, as above, by the variability of the replicate estimates.
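A minimal Python sketch of the calculation follows. It assumes the commonly used delete-a-group jackknife form, SE(y) = sqrt((G - 1)/G x sum over g of (y(g) - y)^2) with G = 30 replicate groups, and RSE(y) = SE(y)/y x 100. This form is an assumption here and should be checked against the formula shown in the original documentation; the function names and the replicate estimates are invented.

```python
import math


def jackknife_se(full_estimate, replicate_estimates):
    """Standard error from replicate estimates (assumed delete-a-group jackknife)."""
    g = len(replicate_estimates)
    squared_diffs = sum((y_g - full_estimate) ** 2 for y_g in replicate_estimates)
    return math.sqrt((g - 1) / g * squared_diffs)


def rse_percent(full_estimate, standard_error):
    """Relative standard error, expressed as a percentage of the estimate."""
    return 100.0 * standard_error / full_estimate


# Invented figures: a full sample estimate and 30 replicate estimates.
full = 100.0
replicates = [99.0, 101.0] * 15

se = jackknife_se(full, replicates)
rse = rse_percent(full, se)
print(f"SE = {se:.3f}, RSE = {rse:.2f}%")
```

In practice each replicate estimate would be produced by recomputing the statistic with one of the 30 replicate weights in place of the main weight.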

Not all statistical computer packages allow direct calculation of SEs using the Jackknife replicate weights. However, those packages that allow the direct use of Balanced Repeated Replication (BRR) methodology generally include the option of an adjustment factor. This factor can be incorporated to overcome the difference between the variance formulae.


Most data items on the microdata files include a 'Not applicable' category. The classification values of the 'Not applicable' category, where relevant, are shown in the data item list (see the Data Item List in the Downloads tab).

A number of questions included in the survey allowed respondents to provide one or more responses. Each response category for one of these 'multi-response questions' (or data items) is basically treated as a separate data item. These data items have the same general data item identifier (SASName) but are each suffixed with a letter – A for the first response, B for the second response, C for the third response, D for the fourth response and so on.

For example, the multi-response data item 'Purposes for computer use at home in the last 12 months' (with a general SASName of COMPRCM – see data item list), has six response categories. Consequently, six data items have been produced - COMPRCMA, COMPRCMB, COMPRCMC, COMPRCMD, COMPRCME and COMPRCMF.

Each data item in the series (i.e. COMPRCMA to COMPRCMF) has two response codes: a 'Yes' response, coded to the item's position in the series (code 1 for the first item, code 2 for the second, and so on), and a 'Null' response (code 0) indicating that the response was not relevant for the respondent. The last data item in the series represents the 'Not applicable' response, which comprises the respondents who were not asked the question; its 'Yes' code is the position of the last item in the series (e.g. COMPRCMF takes the values 0 or 6).

It should be noted that the sum of individual multi-response categories will be greater than the population or number of people applicable to the particular data item as respondents are able to select more than one response. Multi-response data items can be identified in the data item list as SASNames followed by a range of letters in brackets; for example, COMPRCM(A-F).
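The coding scheme can be illustrated with a short Python sketch that reads the COMPRCMA to COMPRCMF series for two invented respondents. The function name and the records are hypothetical; the code values follow the description above (0 for 'Null', the item's position in the series for 'Yes').

```python
import string


def selected_responses(record, stem="COMPRCM", n_items=6):
    """Return the suffix letters of the items with a 'Yes' code for this record."""
    chosen = []
    for position, letter in enumerate(string.ascii_uppercase[:n_items], start=1):
        # A 'Yes' is coded to the item's position in the series; 0 is 'Null'.
        if record.get(f"{stem}{letter}") == position:
            chosen.append(letter)
    return chosen


# Invented respondents: the first gave two responses, the second was not
# asked the question (last item in the series, code 6).
first = {"COMPRCMA": 1, "COMPRCMB": 0, "COMPRCMC": 3,
         "COMPRCMD": 0, "COMPRCME": 0, "COMPRCMF": 0}
second = {"COMPRCMA": 0, "COMPRCMB": 0, "COMPRCMC": 0,
          "COMPRCMD": 0, "COMPRCME": 0, "COMPRCMF": 6}

responses = [selected_responses(r) for r in (first, second)]
total_responses = sum(len(r) for r in responses)
print(responses, "| responses:", total_responses, "| persons:", len(responses))
```

Because the first respondent gave two responses, the category counts sum to three for two persons, which is why totals across multi-response categories exceed the applicable population.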

Disability is a difficult concept to measure because it depends on a respondent's perception of their ability to perform a range of activities associated with daily life. Factors discussed below should also be considered when interpreting the data.

Information in the survey was based, wherever possible, on the personal response given by the respondent. However, in cases where information was provided by another person, some answers may differ from those the selected person would have provided. In particular, interpretation of the concepts of 'need' and 'difficulty' may be affected by the proxy-interview method.

A number of people may not have reported certain conditions because of:
  • the sensitive nature of the condition (e.g. alcohol and drug-related conditions, schizophrenia, mental retardation or mental degeneration)
  • the episodic or seasonal nature of the condition (e.g. asthma, epilepsy)
  • a lack of awareness of the presence of the condition on the part of the person reporting (e.g. mild diabetes) or a lack of knowledge or understanding of the correct medical terminology for their condition
  • any lack of comprehensive medical information kept by their cared-accommodation establishment.
As certain conditions may not have been reported, data collected from the survey may have underestimated the number of people with one or more disabilities.

The need for help may have been underestimated, as some people may not have admitted needing help because of such things as a desire to remain independent, or may not have realised help was needed with a task because help had always been received with that task.

The criteria by which people assessed whether they had difficulty performing tasks may have varied. Comparisons may have been made with the ability of others of a similar age, or with the respondent's own ability when younger.

The different collection methods used – personal interview for households, and administrator completed forms for cared-accommodation – may have had some effect on the reporting of need for assistance with core activities. As a result there may have been some impact on measures such as disability status. If so, this would have had more impact on the older age groupings because of their increased likelihood of being in cared-accommodation.