Australian Bureau of Statistics
4324.0 - Information Paper: National Health Survey, Basic and Expanded CURF, 2007-08
Latest ISSUE Released at 11:30 AM (CANBERRA TIME) 27/08/2010 Ceased
Unless otherwise indicated in this paper, the term CURF has been used to apply to both the Basic and Expanded versions of the file.
The Remote Access Data Laboratory (RADL) is an on-line batch query system: microdata are held on a server at the ABS, and users submit programs written in SAS, SPSS or Stata to interrogate, analyse and model the data, then access the results.
The 2007-08 NHS Basic CURF contains similar information to the Expanded CURF, except that some items are shown in more detail on the Expanded file than on the Basic, and the Expanded file contains some additional data items which are not on the Basic. A full list of data items on both the Basic and Expanded CURFs can be found in a Datacube attached to this publication. This list of items can also be found on the CD-ROM or under "CURF Documentation" on the RADL Help screen.
The CURF enables purchasers to tabulate, manipulate and analyse data to their own specifications. More detailed information to assist in using the CURF and in interpreting the data is provided in the documentation accompanying the file or otherwise through the ABS Website.
ABOUT THE SURVEY
The 2007-08 NHS is the seventh health survey of its type conducted by the ABS; previous surveys were conducted in 1977-78, 1983, 1989-90, 1995, 2001 and 2004-05. Published results and a Basic CURF are available from each of those surveys. An Expanded CURF is also available for the 2004-05 NHS.
The 2007-08 survey was conducted by the ABS with a sample of 15,787 fully responding, private dwellings across Australia from August 2007 to June 2008. Both urban and rural areas in all states and territories were covered, but very remote areas of Australia were excluded. Non-private dwellings, such as hotels, motels, hospitals, nursing and convalescent homes and short-stay caravan parks were not included in the survey.
Within each selected household a random sub-sample of one or two usual residents was selected for inclusion in the survey.
Trained ABS interviewers conducted personal interviews with the selected adult member of the household, and children aged 15 to 17 years, with parental consent. A parent or guardian was asked to answer questions in respect of their children aged 15 to 17 years who were not personally interviewed, and children aged less than 15 years. This person is referred to as the child proxy in survey documentation.
Information was collected in the survey about the health status of Australians, their use of health services and facilities, and health related aspects of their lifestyle. Information was collected about long term health conditions experienced by respondents, and about consultations with health professionals, both in general and in direct relation to specific conditions reported. These specific conditions are the National Health Priority Area (NHPA) conditions. Data were also collected on other actions people had taken in regard to their health (e.g. taken days away from work, medication used), aspects of their lifestyle and other factors which may affect their health such as smoking, alcohol consumption, diet and exercise. Respondents' physical measurements of height, waist, hip and weight were taken in the 2007-08 NHS for the first time since 1995. Self-reported height and weight measurements were also collected. The survey design enables information for all topics to be analysed in relation to other topics, and in relation to a range of demographic and socioeconomic characteristics.
A cross-section of results from the 2007-08 NHS was published in National Health Survey: Summary of Results, Australia 2007-08 (cat. no. 4364.0). Currently available from the ABS Website are selected publication tables compiled for individual States and the ACT (cat. no. 4362.0) and the Users' Guide (cat. no. 4363.0.55.001). A copy of the questionnaire is available in the Data Reference Package (cat. no. 4363.0.55.002). Further results relating to specific topics covered by the survey will be published progressively, details of which will be posted on the ABS Website.
The Users' Guide contains detailed information about the survey design and operation, survey content, and data quality and interpretation. Further information about the NHS can be obtained by contacting the National Information and Referral Service on 1300 135 070. Further information on the CURF can be obtained by emailing email@example.com.
CHANGES TO THE SURVEY
While the 2007-08 NHS is the same as, or similar to, the 2004-05 and 2001 NHS in many ways (and in part to the 1995 NHS), there are important differences across aspects of the surveys: sample design and coverage, survey methodology and content, definitions and classifications, etc. These differences will affect the degree to which data are directly comparable between the surveys, and hence the interpretation of apparent changes in health characteristics over the 2004-05 to 2007-08 period. The differences are summarised in the Users' Guide to the survey.
Five new topics were covered in the 2007-08 NHS: personal stressors; healthy lifestyle; bodily pain; disabilities; and measured weight, height, waist and hips.
Personal stressors were defined as life events, experienced in the 12 months prior to interview, that the respondent or anyone close to them regarded as stressful e.g. serious illness, death of a family member, mental illness, divorce or separation.
Healthy lifestyle covers the frequency of check-ups and healthy lifestyle discussions with a General Practitioner or another health professional.
Bodily pain refers to severity of bodily or physical pain experienced. Respondents were asked the severity of bodily pain experienced in the 4 weeks prior to interview. If the respondent had experienced any bodily pain they were then asked if the pain had interfered with their normal work, both outside and inside their home.
Respondents were asked a set of questions to determine whether they had any disabilities or restrictions in everyday activities. A disability (or restrictive long-term health condition) exists if a limitation, restriction, impairment, disease or disorder, has lasted, or is expected to last for six months or more, and restricts everyday activities.
The 2007-08 NHS collected information to describe various aspects of the health status of the Australian population, with a particular focus on asthma, cancer, circulatory conditions, diabetes, mental health and musculoskeletal conditions, particularly arthritis and osteoporosis.
To enable the prevalence of all long-term conditions to be established, supplementary information was also collected on other long-term conditions. A long-term condition was defined as one reported by respondents as being a condition which they currently had and which had lasted, or which they expected to last, for 6 months or more. The 2007-08 NHS output all conditions regardless of their long-term or current status. To achieve concordance with previous surveys, in which only long-term conditions were recorded, 'Condition status' (CONDSTAT) must therefore be cross classified with 'Conditions' (EVERCURF).
The 2007-08 NHS collected information on medications for mental health in two ways: as a direct link to a reported mental health condition, and for persons who may or may not have a mental health condition. The latter information was collected after the Kessler 10 questions, so a person may or may not have psychological distress but could still report taking medication for their mental wellbeing. Because medications and conditions data are linked, medication information collected by the latter method (for mental wellbeing) had to be linked to the condition level in the same way as all other medications. A dummy code (00997) exists for this reason. When compiling information at the condition level, ensure that 'No condition - dummy code to link mental wellness medication' (00997) is excluded.
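As a rough illustration of the two cautions above, the following Python sketch (not SAS, and run on made-up records) drops the dummy code and then restricts condition records by condition status. The CONDSTAT values, the second condition code and the numeric storage of 00997 as 997 are assumptions made for the example; the actual categories are given in the data item list.

```python
# Made-up Condition level records; real files carry many more items.
# Assumption: EVERCURF is stored as a number, so 00997 appears as 997.
DUMMY_MENTAL_WELLNESS = 997
# Assumption: CONDSTAT value 1 stands for "current and long-term".
# The real CONDSTAT categories are listed in the data item list.
LONG_TERM_STATUSES = {1}

conditions = [
    {"ABSHID": "H1", "ABSPID": 1, "EVERCURF": 13076, "CONDSTAT": 1},  # anaemia
    {"ABSHID": "H1", "ABSPID": 1, "EVERCURF": 997, "CONDSTAT": 3},    # dummy code
    {"ABSHID": "H2", "ABSPID": 1, "EVERCURF": 20010, "CONDSTAT": 2},  # made-up code
]


def long_term_conditions(records):
    """Drop the dummy code, then keep only long-term condition records."""
    return [
        r for r in records
        if r["EVERCURF"] != DUMMY_MENTAL_WELLNESS
        and r["CONDSTAT"] in LONG_TERM_STATUSES
    ]


selected = long_term_conditions(conditions)
print(len(selected), "long-term condition record(s) kept")
```

The same two filters could be written as a WHERE clause or subsetting IF in a RADL program.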
ABOUT THE MICRODATA
The data are released under the Census and Statistics Act 1905 which has provision for the release of data in the form of unit records where the information is not likely to enable the identification of a particular person or organisation. Accordingly there are no names or addresses of survey respondents on the CURF and other steps have been taken to protect the confidentiality of respondents. This may involve any or all of the following steps: removing some items from the CURF, reducing the level of detail shown on the CURF for some items, swapping some characteristics between records, dropping some non-selected person information and perturbing income.
Steps to confidentialise the data sets made available on the CURF are taken in such a way as to ensure the integrity of the data sets and optimise their content, while maintaining the confidentiality of respondents. Intending purchasers should ensure that the data they require at the level of detail they require are available on the CURF they are intending to use; data obtained in the survey but not contained on the CURF may be available in tabulated form on request. A list of the data items on the CURF is provided as a Datacube attachment to this publication.
The NHS CURF contains 20,788 confidentialised respondent records from the survey. Subject to the limitations of sample size and the data classifications used, it is possible to manipulate the data, produce tabulations and undertake statistical analyses to individual specifications.
Both the Basic and Expanded CURFs are available in SAS, SPSS and Stata. If you obtain a copy of the Basic CURF on CD ROM and your analysis software is other than these you may require the services of a computer programmer to use the ASCII file version of the data.
The Basic CURF on CD ROM contains data files, SAS, SPSS and Stata user files and a set of information files. The Expanded CURF on RADL contains the SAS, SPSS and Stata data sets, test files and information files. A copy of this information paper will be available on-line through RADL. Details of the files are shown in Appendix 1.
Some records were confidentialised by differing amounts for the RADL and CD-ROM CURFs; aggregate estimates therefore vary slightly between the two CURF versions and the main file.
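The 2007-08 NHS CURF contains separate files arranged in a hierarchy made up of the following levels:
1. Household
2. Persons in household (Expanded CURF only)
3. Selected person
4. Alcohol
5. Condition group
6. Condition
7. Medication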
The first three levels are in a hierarchical relationship: a household (household level) comprises a number of residents (All Persons level), from which one or two were the selected respondents to the survey (Selected Person level).
Levels four to seven are in a hierarchical relationship with the (selected) Person level. In addition, the Condition level is linked to the Condition group level, and the Medication level is linked to the Condition level. These levels exist to describe 'one to many' relationships.
Some items relating to the topics covered in these lower level hierarchies are also held at the (selected) Person level, where appropriate. For example, while the detailed alcohol consumption data (day of consumption x type of drink x quantity consumed) are held at the Alcohol level, other alcohol-related items related to the activities of the person such as risk level, summary of drink types and quantity consumed, recorded consumption compared to usual, etc, are held at the (selected) Person level. All (selected) persons have a record on every level whether or not the level is applicable to them.
USING REPEATING DATASETS
The 'one to many' relationships described by levels 4 to 7 are known as repeating datasets i.e. sets of data with a counting unit which may be repeated for a person. For example, a repeating dataset for conditions will have one record per condition reported because condition is the counting unit. Repeating datasets are only useful when common information is collected for each instance of a counting unit.
As with the (selected) Person level file, some data items in a repeating dataset are only applicable to a particular sub-population of the dataset. Records outside the sub-population will appear as a "Not applicable". For example the data item "Duration of medications" was only reported on medications taken for asthma and mental health conditions. Therefore for other condition/medication combinations the data item will have a value of "Not applicable".
Some questions were asked of populations more extensive than in previous years; data on written asthma action plans is available for more than those with current asthma.
The Medication level can be used for the types of medication a person is using, how frequently it is being used and for how long the medication has been used. How frequently and for how long the medication was taken was only reported if the person had asthma or a mental health condition and if they had taken any medication for these particular conditions.
The data relates only to medications (known and reported by respondents) used for the following types of medical conditions or reasons - asthma, arthritis, diabetes, cancer, circulatory conditions and mental health conditions. As a result the data does not indicate the levels of total medication use, nor does it necessarily indicate the total use of a particular medication type, particularly in cases where a medication can be used for a range of different conditions.
There is a hierarchical relationship between the Medication level and the Condition level (i.e. medications taken for a particular condition). To relate the medication used to the particular condition it is used for, use the data item 'Type of medication taken for NHPA conditions' (MEDCODEC).
Mental wellbeing, even though not considered a condition, is noted in the item MEDCODEC for consistency with the reported medications for mental health on the medication level.
The medications used for a particular condition can be any of the medications listed. For example, for the condition diabetes, the medications reported are mainly those commonly used for diabetes, but can also include medications commonly used for heart and blood pressure conditions, analgesics, etc., as reported by the respondent.
COUNTING UNITS AND WEIGHTS
The counting unit for level one is the household, for level two the person, for level three the selected person/s, for level four the types and quantities of alcohol and days consumed, for level five the condition group, for level six the conditions and for level seven the medications. There is a weight attached to the Household level and the (selected) Person level only, to estimate the total population. The weights on these levels are the household weight (NHSFHHWT) and the person weight (NHSFINWT).
While the Expanded CURF contains an (all) Persons in household level, this level should only be used to obtain data on certain characteristics of selected persons and to obtain additional information about all people living in the selected households; it is not intended to be used to produce person estimates in its own right. Only the Household level and the (selected) Person level contain weights.
The person weight can be used on levels 4 to 7 by copying it across from the (selected) Person level. When the weight is used on these levels, the population is restricted to persons who have a record on the particular level, and the weight is repeated for each instance of the counting unit. For example, when weighted estimates are produced from the Condition level, they will represent the weighted number of conditions with the specified characteristics.
A person weight provides an estimate of the number of persons with the selected characteristic. Replicate weights (i.e. WPH0101 to WPH0160, WPM0101 to WPM0160) have also been included and these can be used to calculate the sampling error on any estimate produced from the CURF. For more information, refer to the 'Sampling Error' section below.
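The effect of copying the person weight down to a repeating level can be sketched as follows. This is a minimal Python illustration on made-up records, not ABS code: the weights and the second condition code are invented, and only the identifier and weight item names come from this paper.

```python
# Made-up records; NHSFINWT values are illustrative, not real weights.
persons = [
    {"ABSHID": "H1", "ABSPID": 1, "NHSFINWT": 1200.0},
    {"ABSHID": "H2", "ABSPID": 1, "NHSFINWT": 800.0},
]
conditions = [
    {"ABSHID": "H1", "ABSPID": 1, "EVERCURF": 13076},  # anaemia
    {"ABSHID": "H1", "ABSPID": 1, "EVERCURF": 20010},  # made-up code
    {"ABSHID": "H2", "ABSPID": 1, "EVERCURF": 13076},
]

# Copy the person weight onto every condition record for that person.
weight_by_person = {(p["ABSHID"], p["ABSPID"]): p["NHSFINWT"] for p in persons}
for c in conditions:
    c["NHSFINWT"] = weight_by_person[(c["ABSHID"], c["ABSPID"])]

# Summing the copied weights counts conditions, not persons: the first
# person's weight contributes twice because they have two condition records.
weighted_conditions = sum(c["NHSFINWT"] for c in conditions)
weighted_persons = sum(p["NHSFINWT"] for p in persons)
print(weighted_conditions, weighted_persons)
```

In this toy data the condition-level total exceeds the person-level total, which is why estimates from a repeating level should be described in terms of that level's counting unit.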
There is a series of identifiers that can be used on records at each level of the file.
The identifiers ABSHID, ABSAID, ABSPID, ABSBID, ABSGID, ABSCID, ABSMID appear on all levels of the file (as they are needed to create a hierarchical CSV file). Where the information for the identifier is not relevant for a level, it has a value of 0.
Each household has a unique twelve-digit random identifier (ABSHID). This identifier appears on the Household level, and is repeated on every other level. On the (all) Persons in household level, each person within the household is numbered sequentially (ABSAID). Within the household, each person is numbered sequentially (ABSPID). Items containing this person number then appear on all the levels below the (all) Persons in household level. Identifiers can be used to copy information from one level of the file to another.
For the Expanded CURF the identifiers needed for such a linking purpose are:
1. Household = ABSHID
2. Persons in household = ABSHID, ABSAID
3. Selected person = ABSHID, ABSAID, ABSPID
4. Alcohol = ABSHID, ABSAID, ABSPID, ABSBID
5. Condition group = ABSHID, ABSAID, ABSPID, ABSGID
6. Condition = ABSHID, ABSAID, ABSPID, ABSGID, ABSCID
7. Medication = ABSHID, ABSAID, ABSPID, ABSGID, ABSCID, ABSMID
Please note the (all) Persons in household level is only relevant to the Expanded CURF. The Basic CURF does not have the Persons in household level. Therefore, the identifier for the (all) persons in household level (ABSAID) will be excluded from the Basic CURF and is not needed when linking levels on the Basic CURF.
For the Basic CURF the identifiers needed for such a linking purpose are:
1. Household = ABSHID
3. Selected person = ABSHID, ABSPID
4. Alcohol = ABSHID, ABSPID, ABSBID
5. Condition group = ABSHID, ABSPID, ABSGID
6. Condition = ABSHID, ABSPID, ABSGID, ABSCID
7. Medication = ABSHID, ABSPID, ABSGID, ABSCID, ABSMID
To copy information from a lower level to a level above the following SAS code can be used (or equivalent):
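DATA ANAEMIA (KEEP=ABSHID ABSPID FLAG) ;
SET CONDLEVEL ; *CONDLEVEL is a placeholder for the Condition level dataset, sorted by ABSHID ABSPID ;
BY ABSHID ABSPID ;
RETAIN FLAG ;
IF FIRST.ABSPID THEN FLAG=0 ;
IF EVERCURF=13076 THEN FLAG=1 ; *set flag for condition = anaemia ;
IF LAST.ABSPID THEN OUTPUT ; *keep one record per person ;
RUN ;
PROC SORT DATA=NHS08BSP ;
BY ABSHID ABSPID ;
DATA PERSFLAG ; *PERSFLAG is a placeholder name for the merged Person level output ;
MERGE NHS08BSP ANAEMIA ;
BY ABSHID ABSPID ;
RUN ;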
The ANAEMIA file, using the Condition level dataset, keeps the last record for each ABSPID, i.e. person, and sets the item FLAG to 1 if the person has anaemia as a condition. This newly created flag is then merged on to the Person level file so that this item can now be cross classified or analysed with any other item on the Person level.
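For readers who want to check the logic outside SAS, the same roll-up can be sketched in Python. This is an illustration on made-up records only: the second condition code and the 0 used for "Not applicable" are assumptions, while 13076 (anaemia) and the identifier names come from this paper.

```python
# Made-up records for illustration; real CURF files carry many more items.
persons = [
    {"ABSHID": "H1", "ABSPID": 1},
    {"ABSHID": "H1", "ABSPID": 2},
    {"ABSHID": "H2", "ABSPID": 1},
]
conditions = [
    {"ABSHID": "H1", "ABSPID": 1, "EVERCURF": 13076},  # anaemia
    {"ABSHID": "H1", "ABSPID": 1, "EVERCURF": 20010},  # made-up code
    {"ABSHID": "H1", "ABSPID": 2, "EVERCURF": 20010},
    {"ABSHID": "H2", "ABSPID": 1, "EVERCURF": 0},      # assumed 'Not applicable'
]

ANAEMIA = 13076

# Step 1: collapse the Condition level to one flag per person
# (the counterpart of RETAIN plus output on the last record).
flag_by_person = {}
for c in conditions:
    key = (c["ABSHID"], c["ABSPID"])
    has_anaemia = 1 if c["EVERCURF"] == ANAEMIA else 0
    flag_by_person[key] = max(flag_by_person.get(key, 0), has_anaemia)

# Step 2: merge the flag onto the Person level, one-to-one on the key.
for p in persons:
    p["FLAG"] = flag_by_person.get((p["ABSHID"], p["ABSPID"]), 0)

print([p["FLAG"] for p in persons])
```

Keying on the (ABSHID, ABSPID) pair mirrors the BY statement; on the Expanded CURF the key would follow the identifier lists given earlier.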
EVERCURF is a collection of all reported conditions, regardless of whether a condition was long-term or whether it was diagnosed by a health professional. If only long-term conditions are required, EVERCURF must therefore be cross classified with condition status (CONDSTAT); otherwise all reported conditions will be included.
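A slightly more complicated example follows:
DATA SUMMARY (KEEP=ABSHID ABSPID MEDDUR) ;
SET NHS08ECG ; *assumed sorted by ABSHID ABSPID ;
BY ABSHID ABSPID ;
RETAIN MEDDUR ;
IF FIRST.ABSPID THEN MEDDUR=0 ; *leaves respondents with no medication as 0 ;
IF HOWLNGCF=3 THEN MEDDUR=1 ; *has one or more medications used for 6 months or more ;
*a further statement, not reproduced here, sets MEDDUR=2 for other persons reporting a medication ;
IF LAST.ABSPID THEN OUTPUT ; *keep one record per person ;
RUN ;
DATA PERSDUR ; MERGE NHS08ESP SUMMARY ; BY ABSHID ABSPID ; RUN ; *PERSDUR is a placeholder output name ;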
The SUMMARY file keeps only the last record for each ABSPID on the NHS08ECG file, so the merge is a one-to-one match of person records on the SUMMARY file with records on the NHS08ESP (i.e. person) file. MEDDUR is set to 1 where at least one medication used by a person has been used for 6 months or more. Where MEDDUR does not meet the condition for being set to 1, it is set to 2 if it meets the second condition (in this case, the rest of the people who have identified using a medication); otherwise it is left as 0 (in this case, people who have not used medications for conditions or who have no conditions). This method allows summary information from one level to be used on the level above it in the hierarchy. The item generated above can now be cross-classified by any number of items on the (selected) Person level file.
Note: All persons have a record on the condition and medication levels, whether or not they have a condition and/or use medication. Persons who don't have a condition and/or medication will be coded to "Not applicable" for type of condition/medication. As such, in the above example, people who do not use medication for any condition (incl. those with no condition) will still be set to 0 for MEDDUR by this program and when output from the merged file, rather than having frequency missing as has previously been the case.
To copy information from a higher level to a level below the following SAS code can be used (or equivalent):
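DATA CONDPERS ; MERGE NHS08BSP CONDLEVEL ; *CONDPERS and CONDLEVEL are placeholder names for the output and Condition level datasets ;
BY ABSHID ABSPID ; RUN ;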
Unlike the previous merge, this merge will match one person record to many condition records. Note that the data items copied from the (selected) Person level will now be repeated for the counting unit of the level they have been added to, conditions in this case.
As the survey was conducted on a sample of households in Australia, it is important to take account of the method of sample selection when deriving estimates from the CURF. This is particularly important as a person's chance of selection in the survey varied depending on the state or territory in which they lived. One of the fields on the CURF contains a 'main weight' (NHSFINWT) for each person in the sample. This 'weight' is a value which indicates how many population units are represented by the sample unit.
Where estimates are derived from the CURF it is essential that they are calculated by adding the weights of persons in each category, and not just by counting the number of records falling into each category. If each person's 'weight' were to be ignored, then no account would be taken of a person's chance of selection or of different response rates across population groups, with the result that estimates produced could be seriously biased. The application of weights ensures that estimates will conform to an independently estimated distribution of the population by age, sex, state/territory and section of state, rather than to the distributions within the responding sample itself.
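The difference between counting records and adding weights can be seen in a small Python sketch. The records, the SMOKER item and the weight values are all made up for the example; only the weight item name NHSFINWT comes from this paper.

```python
from collections import defaultdict

# Made-up person records; SMOKER is a hypothetical item name and the
# weights are illustrative only.
persons = [
    {"SMOKER": "yes", "NHSFINWT": 500.0},
    {"SMOKER": "yes", "NHSFINWT": 700.0},
    {"SMOKER": "no", "NHSFINWT": 2400.0},
    {"SMOKER": "no", "NHSFINWT": 2400.0},
]


def tabulate(records, item, weight="NHSFINWT"):
    """Return (record counts, weighted estimates) for each category of item."""
    counts = defaultdict(int)
    estimates = defaultdict(float)
    for r in records:
        counts[r[item]] += 1
        estimates[r[item]] += r[weight]
    return dict(counts), dict(estimates)


counts, estimates = tabulate(persons, "SMOKER")
unweighted_share = counts["yes"] / sum(counts.values())
weighted_share = estimates["yes"] / sum(estimates.values())
print(unweighted_share, weighted_share)
```

In this toy data half of the records are smokers, but they represent only a fifth of the weighted population, which is the kind of bias the paragraph above warns about.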
The sample in the Northern Territory (NT) is at a level such that NT records contribute appropriately to national estimates but are insufficient to support reliable estimates for the NT. Northern Territory estimates should not be used alone; they should only be used to contribute to national estimates.
For weighting purposes, the 2007-08 NHS was benchmarked to the estimated population living in private dwellings at 31 December 2007, adjusted for the scope of the survey. For further information about the benchmarks and weighting, see Chapter 2 of the Users' Guide.
SAMPLING ERROR
Sampling error is the difference between the survey estimate and the value that would have been produced had all dwellings in scope of the survey been included.
In addition to the 'main weight' each record on the CURF contains 60 'replicate weights'. The purpose of these replicate weights is to enable calculation of the sampling error on each estimate produced.
The basic idea behind the replication approach is to select subsamples repeatedly (60 times) from the whole sample. For each of these subsamples the statistic of interest is calculated. The variance of the full sample statistic is then estimated using the variability among the replicate statistics calculated from these subsamples. As well as enabling variances of estimates to be calculated relatively simply, replicate weights also enable unit record analyses such as chi-square and logistic regression to be conducted which take into account the sample design.
Further information about the use of replicate weights is contained in the Users' Guide. It should be noted that not all statistical computer packages may allow direct calculation of SEs using replicate weights. However, those packages that allow the direct use of Balanced Repeated Replication (BRR) methodology generally include the option of an adjustment factor. This factor can be incorporated to overcome the difference between the variance formulae.
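A standard error calculation from replicate weights might look like the Python sketch below. It assumes the delete-a-group jackknife form, in which the squared differences between each replicate estimate and the main estimate are summed and scaled by (G - 1)/G for G replicate groups; that formula is an assumption here, and the Users' Guide remains the authority on the method to use with this CURF. The item names MAINWT and REP1 to REP4 are hypothetical stand-ins for the main and replicate weight items, and four replicates are used instead of 60 to keep the example small.

```python
import math


def weighted_total(records, weight_name, predicate):
    """Sum one weight item over the records that satisfy predicate."""
    return sum(r[weight_name] for r in records if predicate(r))


def jackknife_se(full_estimate, replicate_estimates):
    """Standard error under the assumed delete-a-group jackknife formula."""
    groups = len(replicate_estimates)
    squared = sum((rep - full_estimate) ** 2 for rep in replicate_estimates)
    return math.sqrt((groups - 1) / groups * squared)


def has_flag(record):
    return record["FLAG"] == 1


# Made-up records with hypothetical weight item names.
records = [
    {"FLAG": 1, "MAINWT": 60.0, "REP1": 50.0, "REP2": 70.0, "REP3": 60.0, "REP4": 60.0},
    {"FLAG": 1, "MAINWT": 40.0, "REP1": 40.0, "REP2": 40.0, "REP3": 40.0, "REP4": 40.0},
    {"FLAG": 0, "MAINWT": 90.0, "REP1": 95.0, "REP2": 85.0, "REP3": 90.0, "REP4": 90.0},
]
replicate_names = ["REP1", "REP2", "REP3", "REP4"]

estimate = weighted_total(records, "MAINWT", has_flag)
replicates = [weighted_total(records, name, has_flag) for name in replicate_names]
se = jackknife_se(estimate, replicates)
print(estimate, replicates, se)
```

The estimate is recomputed once per replicate weight using exactly the same tabulation as for the main weight, which is what makes the approach easy to apply to any statistic.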
POINTS ABOUT THE DATA
Detailed information about the data collected, comments regarding data quality and other points to assist in using and interpreting the data are contained in the 2007-08 National Health Survey: Users' Guide (cat. no. 4363.0.55.001), which is available free of charge from the ABS website. It is recommended that relevant parts of the Guide be read in conjunction with the use of the 2007-08 NHS CURF.
In addition, there are some general points regarding the data appearing on the CURF which should be noted.
In addition to these changes, some amendments have been made to the main NHS data file since the release of the summary publication as a result of further processing and validation.
This means that estimates produced from the CURFs may differ from those published in National Health Survey: Summary of Results, Australia 2007-08 (cat. no. 4364.0) or released in tables available electronically from the ABS website. To assist those users of the data who like to compare estimates they produce with published data (e.g. to confirm that appropriate populations are being used), tables showing selected populations have been compiled from the CURFs. These are provided as Appendix 2 to this paper.
In determining the long-term conditions and types of medication to be separately identified on the CURF, thresholds based on the number of observations in the survey have been used. The result was some collapsing of the categories for these items on the CURF as compared with the main data file. This in turn has affected the counts of conditions.
For medications, a duplicate medication record was removed where the condition (from the MEDCODEC item) was also the same. Cases where the same medication was used for different conditions have been kept on the file.
Data items associated with the conditions or medications data items discussed above (e.g. number of long-term conditions) have been re-derived as appropriate based on the classifications shown on the CURF.
Populations identified for condition specific data items collected within the condition modules are based on responses as reported to the questions in the module. As respondents may identify conditions in some modules which are not later coded as an applicable condition to that module, or may identify conditions relevant to that module in other sections of the survey, the data populations achieved in the items may not match those identified using the ICD10 coded condition responses.
Some data items on the CURF directly reflect responses to individual questions contained in the survey questionnaire, while others have been derived from responses to two or more questions. Due to the volume and complexity of the derivations, these derivations are not generally released. However, details can be made available on request.
The dollar ranges covered by deciles in all income items are shown in Appendix 3. In 2007-08, income was collected from persons aged 18 years and over. Household income includes income from all persons aged 15 years and over.
CONDITIONS OF USE OF GEOGRAPHY AND SEIFA ITEMS
To give CURF users greater flexibility in their analyses, the ABS has included one Socio-Economic Indexes for Areas (SEIFA) item and several sub-state geography items on the 2007-08 NHS Expanded CURF.
Cross-tabulations by several of these items simultaneously will produce cells relating to some small geographic regions. Tables showing multiple data items, cross-tabulated by more than one sub-state geography at a time (including SEIFA), are not permitted due to the detailed information about small geographic regions that could be presented. However, simple cross-tabulations of population counts by multiple sub-state geographic data items may be useful for clients in order to determine which geography or SEIFA item to include in their primary analysis, and such output is permitted.
The SEIFA decile on the 2007-08 NHS Expanded CURF is derived using area-based deciles. Area-based deciles are derived by simply grouping Collection Districts (CDs) into 10 equal groups (i.e. an equal number of CDs in each group) then matching the CDs of survey records to those groups. Because CDs are not all equal in size, and because the NHS sample is not selected to ensure an equal distribution at the CD level, this method does not result in an equal number of people in each decile. This is the only method by which SEIFA deciles or quintiles were derived in previous NHSs.
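The area-based method described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the ABS derivation: the CD identifiers, index scores and sample are made up, and CDs are assumed to be ranked from lowest to highest score before being split into ten groups.

```python
from collections import Counter


def area_based_deciles(cd_scores):
    """Map each CD to a decile (1-10) so that deciles hold equal numbers of CDs.

    Assumption: CDs are ranked from lowest to highest index score.
    """
    ranked = sorted(cd_scores, key=cd_scores.get)
    n = len(ranked)
    return {cd: (position * 10) // n + 1 for position, cd in enumerate(ranked)}


# Twenty made-up CDs with made-up index scores.
cd_scores = {f"CD{i:02d}": 900 + 5 * i for i in range(20)}
deciles = area_based_deciles(cd_scores)

# Made-up sample: the CD in which each survey record lives.
sample_cds = ["CD00", "CD00", "CD01", "CD19"]
persons_per_decile = Counter(deciles[cd] for cd in sample_cds)

print(Counter(deciles.values()))  # CDs per decile
print(persons_per_decile)         # sample records per decile
```

Each decile contains the same number of CDs, yet the sample records fall unevenly across deciles, which is the behaviour noted in the paragraph above.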
It should be borne in mind that the characteristics indicated by the SEIFA relate to the area (in this case the CD) in which a population lives, not necessarily to all individuals who live in that area.
For further information about SEIFA see Information Paper: Census of Population and Housing - Socio-Economic Indexes for Areas, Australia (cat. no. 2039.0).
CONDITIONS OF RELEASE
The Microdata: National Health Survey, Basic and Expanded CURF, Australia, 2007-08 is released in accordance with a Ministerial Determination (Clause 7, Statutory Rules 1983, No.19) in pursuance of section 13 of the Census and Statistics Act 1905. As required by the Ministerial Determination, the CURF has been designed so that the information on the file is not likely to enable the identification of the particular person to which it relates.
The Australian Statistician's approval is required for each release of the CURF. In addition, and prior to being granted access to the CURF, all organisations, and individuals within organisations, who request access to the CURF will be required to sign an Undertaking to abide by the legislative restrictions on use. Organisations and individuals who seek access to the Microdata: National Health Survey, Basic and Expanded CURF, Australia, 2007-08 are required to give an undertaking which includes, among other conditions, that in using the CURF data they will:
Use of the data for statistical purposes means use of the content of the CURF to produce information of a statistical nature, i.e. the arrangement and classification of numerical facts or data, including statistical analyses or statistical aggregates. Examples of statistical purposes are:
All CURF users are required to read and abide by the "Responsible Access to ABS Confidentialised Unit Record Files (CURFs) Training Manual" available on the ABS website (see Services, CURF Microdata, Accessing CURF Microdata). Use of the data for unauthorised purposes may render the purchaser liable to severe penalties. Advice on the propriety of any particular intended use of the data is available from the Microdata Access Strategies Section via firstname.lastname@example.org.
Conditions of sale
All ABS products and services are provided subject to the ABS conditions of sale. Any queries relating to these Conditions of Sale should be referred to email@example.com.
While the utmost care is taken in handling each CURF on CD-ROM, deterioration may occur between the time of copying and receipt of the file. Accordingly, if the CD-ROM is unreadable on receipt and this is reported to the ABS within 30 days of receipt, it will be replaced free of charge.
As at 1 January 2009, the recommended retail price of the Microdata: National Health Survey, Basic and Expanded CURF, Australia, 2007-08 on CD-ROM or via the RADL is $1,430 including GST. The Basic and Expanded CURF may be purchased concurrently (bundle) for $2,140 including GST.
University clients should refer to the ABS website (see Services, Services for Universities). The Microdata: National Health Survey, Basic and Expanded CURF, Australia, 2007-08 can be accessed by universities participating in the ABS/Universities Australia CURF agreement for research and teaching purposes.
Other prospective clients should contact the Microdata Access Strategies Section of the ABS via firstname.lastname@example.org or on (02) 6252 7714.
For further information about accessing the CURF, clients should contact the CURF Management Unit of the ABS at email@example.com or on (02) 6252 7714. The CURF is not available on CD-ROM to overseas customers.
This page last updated 17 June 2013