Australian Bureau of Statistics

1200.0.55.005 - Language Standards, 2012, Version 1.1  
Latest ISSUE Released at 11:30 AM (CANBERRA TIME) 26/09/2012  First Issue

First Language Spoken - Classification and Coding

The standard classification criteria and structure

Language data in Australia should be collected, aggregated and disseminated using the Australian Standard Classification of Languages (ASCL), 2011 (ABS cat. no. 1267.0). The two classification criteria used to form categories in the ASCL are:

      • genetic affinity, which is the relationship between languages due to evolution from a common ancestral language and
      • geographic proximity of language(s) based on the area in which the language originated or first became recognised as a distinct language.

The classification has a three level hierarchical structure, with the exception of the Australian Indigenous Languages broad group, where an extra level has been added between the narrow group and language levels of the classification.

The most detailed level of the classification consists of 432 (216 Indigenous and 216 non Indigenous) base level units. Included in the 432 base level units are 388 specific languages and 44 'not elsewhere classified' (nec) categories, used to code languages not separately listed in the classification.

The second level of the classification comprises 51 narrow groups of languages that are similar in terms of the classification criteria, including seven 'other' categories which consist of languages which do not fit into a particular narrow group.

For three narrow groups of Australian Indigenous languages (Narrow Group 81 Arnhem Land and Daly River Region Languages, Narrow Group 82 Yolngu Matha and Narrow Group 86 Arandic) three digit levels are positioned between the narrow group and language level of the classification. There are 13 three digit level categories. They provide meaningful and useful groups of languages.

The first and most general level of the classification comprises nine broad groups of languages including one 'other' category. Broad groups are formed by aggregating geographically proximate narrow groups.

The code structure

One, two and four digit codes are assigned to the first, second and third level units of the classification respectively. The first digit identifies the broad group in which each language or narrow group is contained. The first two digits taken together identify the narrow group in which each language is contained.

Within Broad Group 8, Australian Indigenous Languages, there are some narrow groups where an extra level has been added between the narrow group and language levels of the classification. The first three digits taken together identify the additional language groups at this third level.

The four digit codes represent each of the 432 Languages or base level units.

The following examples illustrate the coding scheme:

a.  Broad Group     5     Southern Asian Languages
    Narrow Group    52    Indo-Aryan

b.  Broad Group     8     Australian Indigenous Languages
    Narrow Group    82    Yolngu Matha
    Extra Level     824   Dhuwala
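
The prefix rule described above can be sketched in Python. This is a minimal illustration, not part of the ASCL: the function and type names are hypothetical, the set of narrow groups with an extra level is taken from the three groups named earlier (81, 82 and 86), and the four digit code 8241 is a made-up value used only to show the slicing. The sketch does not handle residual or supplementary codes.

```python
from typing import NamedTuple, Optional

# Narrow groups that the standard says have an extra three digit level
# (Narrow Groups 81, 82 and 86).
THREE_DIGIT_NARROW_GROUPS = {"81", "82", "86"}


class CodeParts(NamedTuple):
    broad_group: str                  # first digit
    narrow_group: str                 # first two digits
    three_digit_group: Optional[str]  # first three digits, where the extra level applies
    language: str                     # full four digit code


def split_language_code(code: str) -> CodeParts:
    """Split a four digit code into the groups identified by its prefixes."""
    if len(code) != 4 or not (code.isascii() and code.isdigit()):
        raise ValueError(f"expected a four digit code, got {code!r}")
    narrow = code[:2]
    extra = code[:3] if narrow in THREE_DIGIT_NARROW_GROUPS else None
    return CodeParts(code[0], narrow, extra, code)


# "8241" is a hypothetical code, chosen only because it falls under 824.
parts = split_language_code("8241")
print(parts.broad_group, parts.narrow_group, parts.three_digit_group)
```

Codes are kept as strings rather than integers so that leading zeros, which matter for the supplementary codes, are never lost.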

Residual categories and codes

Within each narrow group, a four digit code, consisting of the narrow group code followed by the digits '99', is reserved for a residual 'not elsewhere classified' (nec) or 'other' category. Similarly, within the third level classification of Australian Indigenous Languages the three digit group code may be followed by '9' to denote a 'not elsewhere classified' (nec) or 'other' category. All languages which are not separately identified in the classification are included in these residual 'nec' or 'other' categories of the related classification level.

In each broad group, two digit codes are reserved for residual categories at the narrow group level. These codes consist of the broad group code followed by '9'. These categories are termed 'Other Broad Group Name' and consist of separately identified languages which do not fit into any of the narrow groups contained within the broad group, based on the classification criteria.
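
The residual code patterns above can be checked mechanically. The following Python sketch is illustrative only: the function name and return labels are hypothetical, and a match shows only that a code has the reserved pattern, not that the category is actually present in the classification.

```python
from typing import Optional

# Narrow groups with an extra three digit level (81, 82 and 86).
THREE_DIGIT_NARROW_GROUPS = {"81", "82", "86"}


def residual_level(code: str) -> Optional[str]:
    """Return which residual pattern a two or four digit code matches, if any."""
    if not (code.isascii() and code.isdigit()) or len(code) not in (2, 4):
        raise ValueError(f"expected a two or four digit code, got {code!r}")
    if len(code) == 2:
        # Broad group code followed by '9': 'Other' narrow group.
        return "other narrow group" if code[1] == "9" else None
    if code.endswith("99"):
        # Narrow group code followed by '99'.
        return "nec within narrow group"
    if code[:2] in THREE_DIGIT_NARROW_GROUPS and code.endswith("9"):
        # Three digit group code followed by '9'.
        return "nec within three digit group"
    return None


print(residual_level("5299"))  # nec within narrow group
print(residual_level("8249"))  # nec within three digit group
print(residual_level("59"))    # other narrow group
```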

Supplementary codes

Supplementary codes are not part of the classification structure. They exist for operational reasons only, and no data would be coded to them if sufficiently detailed responses were obtained in all instances. They are used to process inadequately described responses in statistical collections. The codes are of two types:

      • four digit codes ending with one, two or three zeros
      • four digit codes commencing with three zeros (operational codes).

Codes ending in zero are described as 'not further defined' (nfd) codes. These codes classify responses to a question about language which cannot be coded to the language level of the classification, but which can be coded to a higher level of the classification structure.

Responses which do not relate directly to a particular language category, but which are within the range of languages relating to a particular narrow group, are coded to that narrow group. Such responses are allocated an 'nfd' code consisting of the two digit code of the narrow group followed by 00.

Language responses which do not directly relate to a particular narrow group or language category, but are within the range of languages relating to a particular broad group, are coded to that broad group. These responses are allocated an 'nfd' code consisting of the one digit code of the broad group followed by 000. Language responses which can only be coded at the broad or narrow group levels of the classification can be processed within a collection coded at the four digit level.

Four digit codes commencing with 000 are supplementary codes included for operational purposes. They facilitate the coding of responses, such as inadequately described languages, that contain insufficient information to be allocated a language, narrow group or broad group code.
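
As a rough illustration of the rules in this section, the Python sketch below sorts a four digit code into the supplementary code types. The function name and labels are hypothetical. The text spells out only the '00' and '000' cases; treating a single trailing zero as 'nfd' at the three digit level, and only within Narrow Groups 81, 82 and 86, is an assumption made for this sketch.

```python
from typing import Optional, Tuple

# Narrow groups with an extra three digit level (81, 82 and 86).
THREE_DIGIT_NARROW_GROUPS = {"81", "82", "86"}


def classify_supplementary(code: str) -> Tuple[str, Optional[str]]:
    """Return (kind, group) for a four digit code."""
    if len(code) != 4 or not (code.isascii() and code.isdigit()):
        raise ValueError(f"expected a four digit code, got {code!r}")
    # Operational codes are tested first so that '0000' is not read as nfd.
    if code.startswith("000"):
        return ("operational", None)
    if code.endswith("000"):
        return ("nfd broad group", code[0])
    if code.endswith("00"):
        return ("nfd narrow group", code[:2])
    # Assumption: one trailing zero marks nfd at the three digit level.
    if code.endswith("0") and code[:2] in THREE_DIGIT_NARROW_GROUPS:
        return ("nfd three digit group", code[:3])
    return ("not supplementary", None)


print(classify_supplementary("5200"))  # ('nfd narrow group', '52')
print(classify_supplementary("5000"))  # ('nfd broad group', '5')
print(classify_supplementary("0001"))  # ('operational', None)
```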

Scope of the variable

The variable First Language Spoken applies to all persons.

Application of the classification to other variables

The ASCL can be used for a variety of variables. These include: Main Language Spoken at Home, Main Language Other Than English Spoken at Home, Languages Spoken at Home, Language of Greatest Competency, and Preferred Language.

Coding procedures and coding index

Language responses to the First Language Spoken question are coded to the ASCL, or to one of the supplementary codes, using the guidelines detailed in that classification.

A coding index has been developed to assist in the implementation and use of the ASCL. It contains a comprehensive list of probable responses to questions relating to language and their correct classification codes. Each language response is matched with an entry in the ASCL Coding Index to determine the correct code. Use of the coding index enables responses to be coded accurately to the appropriate category of the classification.
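
Matching a response against a coding index might look like the following Python sketch. It does not reproduce the ASCL Coding Index: the three entries are illustrative 'nfd' codes derived from the group codes quoted in the examples earlier, the normalisation steps are an assumption, and the fallback code "0000" is a hypothetical placeholder for an operational code commencing with 000.

```python
# Illustrative entries only, NOT the real ASCL Coding Index. Each code is
# a group code from the earlier examples padded with zeros ('nfd').
EXAMPLE_INDEX = {
    "southern asian languages": "5000",
    "indo-aryan": "5200",
    "yolngu matha": "8200",
}

# Hypothetical placeholder for an operational code commencing with 000.
FALLBACK_CODE = "0000"


def normalise(response: str) -> str:
    """Lower-case the response and collapse runs of whitespace."""
    return " ".join(response.casefold().split())


def code_response(response: str, index=EXAMPLE_INDEX) -> str:
    """Look up a response in the index, falling back to the placeholder."""
    return index.get(normalise(response), FALLBACK_CODE)


print(code_response("  Indo-Aryan "))    # 5200
print(code_response("something vague"))  # 0000
```

A real implementation would load the index from the published data cube rather than hard-coding entries.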

Further information about the classification criteria and coding of data about languages can be found in the ASCL.

Copies of the Coding Indexes can be found in the ASCL data cube on the ABS website.


Commonwealth of Australia 2014

Unless otherwise noted, content on this website is licensed under a Creative Commons Attribution 2.5 Australia Licence together with any terms, conditions and exclusions as set out in the website Copyright notice. For permission to do anything beyond the scope of this licence and copyright terms contact us.