4159.0.55.002 - General Social Survey: User Guide, Australia, 2010  
ARCHIVED ISSUE Released at 11:30 AM (CANBERRA TIME) 07/12/2011   

USING THE CURF DATA

Microdata from the 2010 General Social Survey (GSS) are available in the form of a Basic Confidentialised Unit Record File (CURF) and an Expanded CURF. The Basic CURF is available via CD-ROM or the Remote Access Data Laboratory (RADL). The Expanded CURF contains more detailed data for some variables than the Basic CURF, as well as some additional variables, and is available only via RADL.

The RADL is a secure on-line data query service that clients can access via the ABS web site. Because the CURFs are kept within the ABS environment, the ABS is able to release more detailed data via the RADL than can be made available on CD-ROM. Further information about this facility is available on the ABS web site: <www.abs.gov.au> (see Services, ABS Microdata).

This chapter describes how to use the microdata, the content of the files and the conditions of microdata release.


About the microdata

The 2010 GSS microdata are released under the Census and Statistics Act 1905, which has provision for the release of microdata in the form of unit records where the information is not likely to enable the identification of a particular person or organisation. Accordingly, there are no names or addresses of survey respondents on the CURF, and other steps have been taken to protect the confidentiality of respondents. These include removing some data items from the CURF, reducing the level of detail shown for some items, changing characteristics such as state or age for some records, and perturbing some data. As a consequence, data on the CURFs will not exactly match published data.

Steps to confidentialise the data sets made available on the CURFs are taken in such a way as to ensure the integrity of the data and optimise the content, while maintaining the confidentiality of respondents. Intending purchasers should ensure that the data they require, at the level of detail they require, are available on the CURFs. Data collected in the survey but not contained on the CURFs may be available in tabulated form on request. A list of the data items on both the Basic and Expanded CURFs is provided as a datacube entitled 'GSS 2010 CURF Data items' accompanying this User Guide.


FILE STRUCTURE AND USE

The 2010 GSS Basic and Expanded CURFs each contain a set of two files with confidentialised records. These files provide records at the following levels:

  • Persons
  • Difficulty accessing service providers


USING THE EPISODIC DATASET

The person level contains information about each selected person and the household to which they belong. The person level contains 15,028 records.

The Difficulty Accessing Service Providers level is an episodic dataset. Respondents who indicated that they had had difficulty accessing a service were asked to report, for up to three service provider types, both all the reasons and the main reason for that difficulty. Users can therefore combine the 'Services had difficulty accessing' item on the Difficulty Accessing Service Providers level with the 'All difficulties accessing service providers' or 'Main difficulty accessing service providers' items to examine the difficulties a respondent experienced for each of up to three service provider types.


Use of weights

The 2010 GSS was conducted on a sample of private households in Australia, and users need to take this into account when deriving estimates from the CURFs. Each unit record contains two weights, which indicate how many population units (persons or households) are represented by the sample unit. The person weight identifier is FINPRSWT and the household weight identifier is HHWTPAA. Replicate weights are also included: 60 person replicate weights (WPM0101 - WPM0160) and 60 household replicate weights (WHM0101 - WHM0160). These replicate weights enable calculation of the Relative Standard Error (RSE) for each estimate produced from the CURFs. For more information on RSEs, please refer to Chapter 5: Data Quality.
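As an illustration of how the replicate weights can be used, the sketch below computes the RSE of a weighted person estimate. It assumes a delete-a-group jackknife, in which the variance is (G-1)/G times the sum of squared differences between each replicate estimate and the main estimate. That formula is an assumption made for this example and should be confirmed against Chapter 5: Data Quality. The record layout (a list of dictionaries) and the function names are hypothetical; only the weight names are taken from the CURF.

```python
import math


def weighted_total(records, weight_key, in_category):
    """Sum one weight variable over the records that fall in the category."""
    return sum(r[weight_key] for r in records if in_category(r))


def jackknife_rse(records, in_category, main_weight="FINPRSWT",
                  replicate_prefix="WPM01", n_replicates=60):
    """Return the RSE (per cent) of a weighted total.

    Assumes a delete-a-group jackknife; check Chapter 5 for the exact formula.
    """
    estimate = weighted_total(records, main_weight, in_category)
    if estimate == 0:
        raise ValueError("RSE is undefined for an estimate of zero")
    squared_diffs = 0.0
    for g in range(1, n_replicates + 1):
        replicate_key = f"{replicate_prefix}{g:02d}"  # WPM0101 ... WPM0160
        replicate_estimate = weighted_total(records, replicate_key, in_category)
        squared_diffs += (replicate_estimate - estimate) ** 2
    variance = (n_replicates - 1) / n_replicates * squared_diffs
    return 100.0 * math.sqrt(variance) / estimate
```

For household estimates, the same function could be called with main_weight="HHWTPAA" and replicate_prefix="WHM01".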

Where estimates are derived from the CURF, it is essential that they are calculated by summing the weights of persons or households in each category, and not just by counting the number of records in each category. If person or household weights were ignored when analysing the data to draw inferences about the population, no account would be taken of a person's or household's chance of selection, or of different response rates across population groups, and the resulting estimates could be seriously biased. The application of weights ensures that estimates conform to an independently estimated distribution of the population by age and other characteristics, rather than to the distributions within the sample itself.
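The difference between counting records and summing weights can be seen in a small sketch. The records below are made up for illustration, and the 'GROUP' item and its codes are hypothetical rather than CURF data items; only the weight name FINPRSWT comes from the CURF.

```python
from collections import defaultdict


def weighted_distribution(records, category_key, weight_key="FINPRSWT"):
    """Return each category's share of the weighted population."""
    totals = defaultdict(float)
    for record in records:
        totals[record[category_key]] += record[weight_key]
    population = sum(totals.values())
    return {category: total / population for category, total in totals.items()}


# Made-up records: "GROUP" and its codes are hypothetical, not CURF items.
sample = [
    {"GROUP": "A", "FINPRSWT": 500.0},
    {"GROUP": "A", "FINPRSWT": 500.0},
    {"GROUP": "B", "FINPRSWT": 3000.0},
]

# Share of records in group A (ignores the weights).
unweighted_share_a = sum(r["GROUP"] == "A" for r in sample) / len(sample)
# Share of the weighted population in group A.
weighted_share_a = weighted_distribution(sample, "GROUP")["A"]
```

In this made-up sample, group A accounts for two of the three records but only a quarter of the weighted population, which is the kind of discrepancy that arises when weights are ignored.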

It should be noted that as a result of some of the changes made to protect confidentiality on the CURFs, estimates of benchmarked items produced from the CURFs may not equal benchmarked values.


Identifiers

Each record has an individual person identifier called ABSPID and a service difficulties episode identifier called ABSDID.
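A minimal sketch of linking the two levels is shown below. It assumes that records on the Difficulty Accessing Service Providers level carry the ABSPID of the person they belong to, which should be checked against the data item list. The item names SERVICE_TYPE and MAIN_DIFFICULTY, their values and the function name are hypothetical placeholders, not CURF identifiers.

```python
from collections import defaultdict


def attach_episodes(persons, episodes):
    """Return person records with their episode records attached.

    Assumes each episode record carries the ABSPID of its person.
    """
    episodes_by_person = defaultdict(list)
    for episode in episodes:
        episodes_by_person[episode["ABSPID"]].append(episode)
    return [
        {**person, "episodes": episodes_by_person.get(person["ABSPID"], [])}
        for person in persons
    ]


# Made-up records; SERVICE_TYPE and MAIN_DIFFICULTY are hypothetical item names.
persons = [
    {"ABSPID": "P1", "FINPRSWT": 1200.0},
    {"ABSPID": "P2", "FINPRSWT": 800.0},
]
episodes = [
    {"ABSPID": "P1", "ABSDID": "D1", "SERVICE_TYPE": 1, "MAIN_DIFFICULTY": 3},
    {"ABSPID": "P1", "ABSDID": "D2", "SERVICE_TYPE": 4, "MAIN_DIFFICULTY": 2},
]
linked = attach_episodes(persons, episodes)
```

Persons with no episodes keep an empty list, so person-level estimates can still be weighted over all persons.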


Record types

There is a single record level available on the 2010 GSS CURFs which can be weighted to produce either person or household estimates. Person data exist only for persons aged 18 and over.


Special codes

Details of special codes to be aware of when analysing data are available in the CURF data item lists available on the ABS web site <www.abs.gov.au>.


Multiple response items

There are a number of data items on the 2010 GSS CURFs which have multiple responses. In these instances respondents were able to select one or more response categories, and the output data items are multi-response in nature, i.e. counts will not add to total persons. This section describes such items and provides some information on how to use them.

One example is the 'Source of support in time of crisis' data item, for which a respondent may identify more than one source of support. The first response is captured in the first, or 'A', position (e.g. SCESUPPA), and additional responses are captured in the second, third and higher positions ('B', 'C' and so on, e.g. SCESUPPB, SCESUPPC). If only one response is possible, for example 'No support', then this response may also appear in the 'A' position. If a data item does not apply, e.g. for the multiple response item 'Type of support provided by selected person for children 0-17 living outside the household', where the respondent does not have a child aged 0-17 living outside the household, then the value assigned for 'Not applicable' will appear in the first position (e.g. SUPCHIA). The 'Null response' (value of 0 or 00) is a default code and should be ignored. All of the other categories described above should be used in analysis.
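One way to tabulate a multiple response item is sketched below. It scans every position of the item for each person, discards the null response, and adds the person weight once to each category reported. The category codes in the example records are made up, the function name is hypothetical, and the assumption that the null response is stored as the integer 0 should be checked against the data item list.

```python
from collections import defaultdict


def multi_response_estimates(records, position_keys,
                             weight_key="FINPRSWT", null_codes=(0,)):
    """Return the weighted number of persons reporting each category."""
    totals = defaultdict(float)
    for record in records:
        # A set ensures a person is counted once per category.
        responses = {record[key] for key in position_keys} - set(null_codes)
        for code in responses:
            totals[code] += record[weight_key]
    return dict(totals)


# Made-up records: the category codes 1 and 2 are hypothetical.
people = [
    {"SCESUPPA": 1, "SCESUPPB": 2, "SCESUPPC": 0, "FINPRSWT": 100.0},
    {"SCESUPPA": 2, "SCESUPPB": 0, "SCESUPPC": 0, "FINPRSWT": 50.0},
]
estimates = multi_response_estimates(
    people, ["SCESUPPA", "SCESUPPB", "SCESUPPC"]
)
```

Because the first person reports two categories, the category estimates sum to more than the total weight of the two persons, which is why counts for multiple response items do not add to total persons.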

Please refer to the data item list 'GSS 2010 CURF Data items', which is a datacube accompanying this User Guide, for listings of multiple response items and for specific information on the number of item repeats and the category labels and values.


Geographic items

The Basic CURF includes two geographic items: 'State or territory of usual residence' (STATEUR) and 'Remoteness areas' (ARIACF). To enable Expanded CURF users greater flexibility in their analyses, the Index of Relative Socio-Economic Disadvantage (SEIFADEC, SEIFAQN) and several sub-state geography items ('State or territory of usual residence' (STATEUR); 'Remoteness areas' (ARIACF); 'Area of usual residence' (AREAUR); and 'Section of state' (SOSGSS)) are included on the Expanded GSS 2010 CURF.

Conditions are placed on the use of these items. Tables showing multiple data items, cross-tabulated by more than one sub-state geography at a time, are not permitted due to the detailed information about small geographic regions that could be presented. However, simple cross-tabulations of population counts by sub-state geographic data items may be useful for clients in order to determine which geography item to include in their primary analysis, and such output is permitted. Users are advised that this condition is monitored through the RADL audit process.






