This document was added or updated on 28/03/2017.
CODING NON-STANDARD RESPONSES
Responses provided in statistical and administrative collections do not always match the official or formal names of categories in the Standard Australian Classification of Countries (SACC). For example, "China" is a typical survey response to a question about country of birth, but it does not exactly match the category title "6101 China (excludes SARs and Taiwan)". Within ABS collections, country responses are coded by automated coding systems that link high-frequency responses to their corresponding SACC country categories via a coding index. These automated coding systems are based on the information contained in the SACC population index.
The SACC population index connects more than one thousand high-frequency country responses to their corresponding country codes within the SACC. For example, within the SACC population index the response "Abu Dhabi" is coded to the SACC category "4216 United Arab Emirates" because this category represents the country in which the city of Abu Dhabi is located. Other entries in the index include formal country titles, alternative spellings, misspellings, abbreviations and former titles of countries. The contents of the index are drawn from high-frequency responses identified in statistical surveys and in the 2011 Census.
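As an illustration only, a coding index of this kind can be thought of as a lookup table from response text to SACC code. The sketch below is not the ABS coding system: the names `CODING_INDEX`, `normalise` and `code_response` are hypothetical, the index holds only a handful of entries taken from the examples in this document, and the case and whitespace normalisation is an assumption rather than a documented ABS rule.

```python
from typing import Optional

# Hypothetical miniature coding index. The real SACC population index
# holds more than one thousand entries; these few come from the examples
# in this document.
CODING_INDEX = {
    "china": "6101",                  # 6101 China (excludes SARs and Taiwan)
    "abu dhabi": "4216",              # city within 4216 United Arab Emirates
    "united arab emirates": "4216",
    "syrian arab republic": "4214",   # official title of 4214 Syria
    "tadzhikistan": "7207",           # alternative spelling of 7207 Tajikistan
}


def normalise(response: str) -> str:
    """Lower-case the response and collapse runs of whitespace.

    This normalisation is an assumption made for the sketch.
    """
    return " ".join(response.lower().split())


def code_response(response: str) -> Optional[str]:
    """Return the SACC code for an indexed response, or None if not indexed."""
    return CODING_INDEX.get(normalise(response))
```

Under these assumptions, `code_response("Abu Dhabi")` returns `"4216"`, while a response that is not in the index returns `None` and would need to be coded by applying the coding rules manually.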
The SACC population index may be requested by contacting firstname.lastname@example.org.
The following coding rules outline the parameters used to build the entries of the coding index:
- Responses which match exactly with an index entry are given the code allocated to that index entry
- Responses which contain additional information or official titles compared to SACC category titles are coded to the relevant SACC country code (e.g. 'Syrian Arab Republic' is coded to '4214 Syria')
- Responses which use alternative spelling or common misspelling compared to SACC category titles are coded to the relevant SACC country code (e.g. 'Tadzhikistan' is coded to '7207 Tajikistan')
- Responses which use common abbreviations (e.g. 'Aust', 'Korea sth', 'Eng'), initials (e.g. 'USA', 'NZ', 'UAE'), foreign language titles (e.g. 'Deutschland', 'Espana', 'Ceska Republika', 'Eire'), nationalities (e.g. 'Algerian', 'Indian', 'Malaysian'), or informal titles (e.g. 'Aussie', 'Oz') are coded to the relevant SACC country code
- Responses which relate to the former names of contemporary countries (e.g. 'Persia', 'Ceylon', 'Siam', 'Rhodesia') are coded to the contemporary SACC country code to which the response relates
- Responses which relate to defunct national or political entities are coded in the first instance to the relevant supplementary code (e.g. 'Czechoslovakia' is coded to '0914 Czechoslovakia, nfd'); if required for data output purposes, they may then be secondarily coded to a relevant minor group supplementary code (e.g. '0914 Czechoslovakia, nfd' may be coded to '3300 Eastern Europe, nfd')
- Responses which relate to provinces, cities or regions within countries (e.g. 'Sumatra', 'Rio de Janeiro', 'California') are coded to the SACC country code in which those provinces, cities or regions are located
- Responses which relate to cities or regions that have been subject to changes in national boundaries (e.g. 'Danzig' was previously part of Germany but is presently part of Poland) are coded to the country within whose national boundaries the city or region lies at the time of data collection
- Responses which cannot be identified as relating directly to a separately identified country in the classification are assigned a residual category code or a supplementary nfd code
The coding rules outlined above can also be used as a guide for coding responses that may not already be covered within the index.
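The two-stage treatment of defunct national or political entities described in the rules above can be sketched as follows. This is a minimal illustration, not the ABS implementation: `SUPPLEMENTARY_CODES`, `OUTPUT_ROLLUP` and `code_defunct_entity` are hypothetical names, and the only data used is the Czechoslovakia example given in this document.

```python
from typing import Optional

# Hypothetical tables built from the single example in the coding rules.
SUPPLEMENTARY_CODES = {
    "czechoslovakia": "0914",   # 0914 Czechoslovakia, nfd
}
OUTPUT_ROLLUP = {
    "0914": "3300",             # 3300 Eastern Europe, nfd
}


def code_defunct_entity(response: str, roll_up: bool = False) -> Optional[str]:
    """Code a defunct entity to its supplementary code.

    With roll_up=True, return the minor group supplementary code instead,
    where one is defined. Returns None if the response is not recognised.
    """
    primary = SUPPLEMENTARY_CODES.get(response.strip().lower())
    if primary is None:
        return None
    if roll_up:
        # Fall back to the primary code when no roll-up is defined.
        return OUTPUT_ROLLUP.get(primary, primary)
    return primary
```

Keeping the roll-up as a separate, optional step preserves the more specific supplementary code at the point of coding, so that aggregation to the minor group remains a data output decision.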