Australian Standard Classification of Languages (ASCL)
This classification is used for the collection, storage and dissemination of statistical and administrative data on languages spoken in Australia.
Overview
What's new
This latest release, the Australian Standard Classification of Languages (ASCL) 2025, includes the following changes:
- A new 4-level classification hierarchy and coding structure across the entire classification.
- Sixty-two new stand-alone Languages have been created; 73% (45) of these are Aboriginal and Torres Strait Islander Languages.
- Forty-one Languages have had label changes; 49% (20) of these are Aboriginal and Torres Strait Islander Languages.
- Forty-three Aboriginal and Torres Strait Islander Languages have been linked to another, existing Language group.
- Seventeen Languages have been retired to a related 'not elsewhere classified' (nec) category; 71% (12) of these are Aboriginal and Torres Strait Islander Languages.
These changes were developed through the review undertaken in 2023 and 2024. More information about this review, including its purpose and the process used, is outlined in About the review, below. More detail on the changes made in the ASCL 2025 is available in the What has changed section and the What's new spreadsheet.
This is the fifth edition of the ASCL, and the fourth revision since it was established in 1997. The 2025 classification structure, correspondence tables and What’s new spreadsheet are available from the Data downloads section.
About the classification
Language is one of several useful indicators of the cultural diversity of Australia’s society. The ASCL provides a basis for the standardised collection, publication and analysis of data relating to languages used or spoken by the Australian population. It is used to classify data from the ABS Census of Population and Housing and is also recommended for use in administrative data collections where data on language is collected.
The ASCL is designed for use in the collection, aggregation and dissemination of data relating to language usage in Australia and to classify the following ABS language variables:
- First Language Spoken
- Languages Spoken at Home
- Main Language Spoken and
- Main Language Other than English Spoken at Home
Data classified by language can be used across a range of organisations, including in the fields of health, community services, and education, to understand the diversity of languages used in communities and improve service delivery.
Data from the Census of Population and Housing, classified using the ASCL, also contributes to measuring Target 16 (Cultures and languages are strong, supported and flourishing) in the National Agreement on Closing the Gap.
In the classification, languages are arranged in progressively broader categories based primarily on their relationship to a common ancestral language (genetic affinity). This means that those Languages that are closely related, in terms of their evolution from a common ancestor, are closely aligned in the structure of the classification. Geographic proximity has been used as a secondary criterion, to order groups.
Definition of language
While the ASCL does not attempt to offer an exhaustive definition of language, the following definition encompasses the essential elements of language as used in ASCL.
The Macquarie Dictionary (Sixth Edition, 2013) defines language as:
"Communication in the distinctively human manner, using a system of arbitrary symbols with conventionally assigned meanings, as by voice, writing, or sign language. Any set or system of such symbols as used in a more or less uniform fashion by a number of people, who are thus enabled to communicate intelligibly with one another."
The term "Language" is used to describe the base (finest) level categories in ASCL. Most Languages in the ASCL are those languages which are universally recognised as distinct and separate languages, including creoles, pidgins and sign languages. In a few cases, a language variation (dialect) is included as a stand-alone group if one of the following criteria is met:
- failure to separately include language variations may decrease the usefulness of language data by limiting analysis to the parent language only when a more detailed breakdown is required
- the boundary between a language and its variations is not always clear or agreed
- stakeholders consulted preferred certain variations as separate categories.
In the previous versions of the ASCL a variation was referred to as a dialect. In the ASCL (2025), the following variations have been listed as stand-alone Languages:
- 11113611 Anaiwan,
- 11811112 Gurindji Kriol,
- 11811114 Lockhart River Creole,
- 11991121 Wardaman.
Please refer to the Glossary for language definitions as applied in the ASCL.
Minimum number of speakers threshold
The ASCL does not list all (or even most) of the approximately 6,000 languages spoken worldwide. For a stand-alone group to be created, a language must be universally recognised as a separate language and a minimum number of speakers must be recorded in the Census of Population and Housing or in a suitable alternative data set. The minimum threshold for Aboriginal and Torres Strait Islander languages is three known speakers. The minimum threshold for other languages is 100 speakers.
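The speaker threshold can be expressed as a simple check. The following is a minimal sketch only: the function and parameter names are hypothetical, and it covers just the numeric threshold, not the separate requirement that the language be universally recognised as distinct.

```python
# Thresholds as stated in the text: three known speakers for Aboriginal and
# Torres Strait Islander languages, 100 or more speakers for other languages.
INDIGENOUS_MINIMUM = 3
OTHER_MINIMUM = 100


def meets_speaker_threshold(speakers: int, aboriginal_or_torres_strait_islander: bool) -> bool:
    """Return True if the recorded number of speakers reaches the minimum threshold.

    This hypothetical helper checks the numeric threshold only; recognition
    as a separate language has to be assessed separately.
    """
    if speakers < 0:
        raise ValueError("number of speakers cannot be negative")
    minimum = INDIGENOUS_MINIMUM if aboriginal_or_torres_strait_islander else OTHER_MINIMUM
    return speakers >= minimum


print(meets_speaker_threshold(3, aboriginal_or_torres_strait_islander=True))
print(meets_speaker_threshold(99, aboriginal_or_torres_strait_islander=False))
```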
Classification criteria and their application
Classification criteria are the principles by which categories are aggregated to form broader categories within a classification. The classification criteria used in ASCL are:
- the relationship between languages as a result of their evolution from a common ancestral language (genetic affinity)
- the area in which a language originated (geographic proximity). This also refers to the area where a language was first acknowledged as a distinct entity.
Within the ASCL, languages are firstly grouped into progressively broader categories on the basis of genetic affinity. This allows populations of language speakers whose languages have evolved from common linguistic roots to be grouped in analytically useful ways. Use of geographic proximity at the Narrow group level also enables the formation of more meaningful residual language categories.
For usability purposes in the Australian context, the classification criteria have not been applied strictly in Language family group 91 ‘Other Languages’ (see Residual Categories).
About the review
Purpose of the review
The need for periodic reviews of the Australian Standard Classification of Languages (ASCL), to reflect changes in the languages spoken and used by the Australian population, was foreshadowed when ASCL was first released.
A review of ASCL was undertaken from 2023 to 2024. The purpose of the review was to update the ASCL so that it reflects the current Australian community.
The initial scope of the review considered feedback from stakeholders and the general public, as well as research and data analysis. The initial scope was confirmed following public consultation in 2023.
The objectives of the review were to:
- identify separate and distinct languages, based on Language family
- ensure the ASCL reflects languages spoken in the Australian community
- improve the usability of the classification
- streamline the coding structure
- better accommodate new groups.
Details of the changes implemented in ASCL (2025) are provided below, in the What has changed section.
How the review was done
The ASCL 2023-2024 review used statistical analysis, research and stakeholder consultation to identify the need for changes to the ASCL, as outlined below. The process used in the 2023-24 review is broadly consistent with the process used in previous reviews of ASCL.
Statistical analysis
Analysis of data from the 2011, 2016 and 2021 Census of Population and Housing was conducted. The purpose of this analysis was to identify Languages that have been emerging or diminishing in Australia since the last review in 2016.
Research
Extensive research was conducted to:
- design the new structure of the classification, including identifying relationships between Languages
- confirm the appropriate labels to be used for categories in the classification
- distinguish between variations and languages, particularly for Aboriginal and Torres Strait Islander languages.
The ABS worked closely with the Australian Institute of Aboriginal and Torres Strait Islander Studies (AIATSIS) to improve representation of Aboriginal and Torres Strait Islander Languages in the ASCL. AustLang was used to inform Language labels, identify variations and understand Language family groups.
Issues relating to non-Aboriginal and Torres Strait Islander Language data, including language family groupings, alternative spellings and variations, were informed by the Ethnologue (Languages of the World) database and by feedback from language experts and the general public.
Stakeholder consultation
There were three phases of stakeholder consultation conducted for the 2023-24 review:
- Public consultation on the scope of the review (September to December 2023).
- Consultation with individuals, community and stakeholder groups, data users and academics (May 2024 to September 2024).
- Public consultation on the changes proposed to the ASCL (October to December 2024).
Government agencies, peak bodies, community groups and individuals made submissions to this review.
In addition to what was published on the ABS Consultation Hub (September to December 2024), a small number of additional amendments were made to the ASCL (2025). The most significant of these was the creation of four additional stand-alone Languages.
A number of issues raised throughout this review have not been investigated at this time. These will be considered for inclusion in the next review of the ASCL. The next review is likely to occur after the 2026 Census.
Scope of the classification
All world languages are in scope of the classification and languages with significant numbers of speakers in Australia are identified as stand-alone Languages within the classification structure. Languages which are not separately identified are included in the most appropriate residual category of the classification.
In the ASCL special attention has been given to separately identifying Aboriginal and Torres Strait Islander Languages. Most language variations, with the exception of those listed above, have not been separately identified and are linked to their parent Language.
Extinct or dead languages spoken for religious or academic purposes are included in the most appropriate residual category of the classification. However, if a sufficient number of speakers of an extinct or dead language is identified in Australia, the language is separately identified in the classification, for example Latin (13139911).
Sign languages are defined as a communication system using gestures rather than speech or writing (The Macquarie Dictionary (Sixth Edition, 2013)) and are included in the classification.
Languages not commonly used as a means of general communication between people, such as computer languages, are excluded from ASCL.
What has changed
Summary of changes
The 2023-24 review of the Australian Standard Classification of Languages (ASCL) was a major review and has resulted in substantial changes to the classification. These changes include:
- A new 4-level classification hierarchy and coding structure across the entire classification.
- Sixty-two new stand-alone Languages have been created; 73% (45) of these are Aboriginal and Torres Strait Islander Languages.
- Forty-one Languages have had label changes; 49% (20) of these are Aboriginal and Torres Strait Islander Languages.
- Forty-three Aboriginal and Torres Strait Islander Languages have been linked to another, existing Language group.
- Seventeen Languages have been retired to a related ‘not elsewhere classified’ (nec) category; 71% (12) of these are Aboriginal and Torres Strait Islander Languages.
- Some variations have been individually specified within the ASCL, based on user needs; however, where possible, variations have been linked to their parent Language. Please refer to the Definition of Language section for further details.
- Genetic affinity (the relationship between languages as a result of their evolution from a common ancestral language) is used as the primary criterion for aggregating languages into progressively broader categories.
Changes to Languages (new, amended labels, linked and retired) in ASCL (2025) are detailed in the What’s new tables and the correspondences between the new (2025) and old (2016) versions of ASCL, available in the Data Downloads section.
Changes to the ASCL classification structure
The structure of the ASCL now has four hierarchical levels. The categories at the most detailed level of the classification are termed ‘Languages’. These are grouped together to form Narrow groups, which are grouped to form Sub family groups, which in turn are grouped to form Language family groups. Language family groups are the highest level of the classification. Further detail on each level is provided below.
Language family group (two-digit codes)
The Language family group level is the first, broadest and most general level of the classification and is represented by a two-digit code. Each Language family group is made up of Sub family groups which have originated from the same common ancestral language. The 2025 classification has 16 Language family groups:
11. Aboriginal and Torres Strait Islander Languages
12. Creoles and Pidgins
13. Indo-European Languages
14. Uralic Languages
15. Afro-Asiatic Languages
16. Turkic Languages
17. Niger-Congo Languages
18. Nilo-Saharan Languages
21. Dravidian Languages
22. Language isolates
23. Sino-Tibetan Languages
24. Hmong-Mien Languages
25. Austro-Asiatic Languages
26. Kra-Dai Languages
27. Austronesian Languages
91. Other Languages.
Sub family group (four-digit codes)
Sub family groups (four-digit codes) are the second level of the 2025 classification. The classification contains 49 Sub family groups, created by aggregating the most closely related Narrow groups. Within each Sub family group, Narrow groups are ordered by the similarity of the location where the Languages originated (geographic proximity).
Narrow group (six-digit codes)
Narrow groups (six-digit codes) make up the third level of the classification. The 2025 classification contains 95 Narrow groups, created by aggregating the most closely related Languages. Within Narrow groups, Language groups have been organised alphabetically.
Language groups (eight-digit codes)
The fourth and most detailed level of the classification is the Language level (eight-digit codes). There are 444 Languages at this level of the classification, including 204 Aboriginal and Torres Strait Islander Languages.
Examples of the hierarchy in ASCL (2025) and a comparison with the previous versions of ASCL are provided in Table 1.
ASCL 2016 code length | ASCL 2016 hierarchical level | ASCL 2016 example | ASCL 2025 code length | ASCL 2025 hierarchical level | ASCL 2025 example |
---|---|---|---|---|---|
1-digit | Broad group | 1 Northern European Languages | 2-digit | Language family group | 13 Indo-European Languages |
2-digit | Narrow group (2 digits) | 11 Celtic | 4-digit | Sub family group | 1311 Celtic Languages |
3-digit | Narrow group (3 digits) | n/a | 6-digit | Narrow group | 131111 Insular Languages |
4-digit | Language | 1101 Gaelic (Scotland) | 8-digit | Language | 13111111 Gaelic (Scotland) |
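Because each level of the 2025 structure is a prefix of the level below it, the higher-level groups of a Language can be read straight off its eight-digit code. The sketch below illustrates this using the Gaelic (Scotland) example from Table 1; the function and class names are hypothetical and are not part of any ABS tool.

```python
from typing import NamedTuple


class AsclLevels(NamedTuple):
    language_family: str  # first 2 digits
    sub_family: str       # first 4 digits
    narrow_group: str     # first 6 digits
    language: str         # full 8-digit code


def split_ascl_code(code: str) -> AsclLevels:
    """Split an eight-digit ASCL 2025 Language code into its four levels."""
    if len(code) != 8 or not (code.isascii() and code.isdigit()):
        raise ValueError(f"expected an 8-digit code, got {code!r}")
    return AsclLevels(code[:2], code[:4], code[:6], code)


# 13111111 Gaelic (Scotland): family 13, sub family 1311, narrow group 131111
print(split_ascl_code("13111111"))
```

Codes are kept as strings rather than integers so that leading zeros in supplementary codes such as 00000000 are preserved.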
Comparing current and previous editions of the ASCL
Over time, in addition to changes at the lowest level of the classification (i.e. Language), changes can occur at other levels, usually as a result of classification reviews. Table 2 shows how the number of groups at each level of the classification has changed over time.
Revision | Year of publication | Broad Group | Narrow Group (2-digit) | Narrow Group (3-digit) | Language (inc nec) |
---|---|---|---|---|---|
1st edition | 1997 | 9 | 48 | - | 192 |
2nd edition | 2005 | 9 | 51 | 9 | 366 |
3rd edition | 2011 | 9 | 51 | 13 | 432 |
4th edition | 2016 | 9 | 51 | 13 | 435 |

Revision | Year of publication | Language Family Group | Sub Family group | Narrow Group | Language (inc nec) |
---|---|---|---|---|---|
5th edition | 2025 | 16 | 49 | 95 | 444 |
An important consideration in the development of a classification is the need to build in sufficient robustness to allow for long-term usage. This robustness facilitates meaningful analysis of data over time, and must be balanced against the need for revisions which ensure the classification is contemporary.
Revisions to ASCL occur to identify changes in Languages spoken within Australia. This includes identifying Languages that are emerging or declining, and reflecting changes to Language labels. There have been four revisions to ASCL since its establishment in 1997 and the changes in each revision are outlined in Table 3.
Revision | Year of Publication | Summary of Revision |
---|---|---|
1st edition | 1997 | ASCL established |
2nd edition | 2005 | Changes to the structure of the classification including the addition of a 3-digit Narrow Group and renaming of some groups. |
3rd edition | 2011 | Seventy-five new languages added to the classification, with several changes to labelling. |
4th edition | 2016 | Three new languages added, language names amended and appropriate entries added to the coding index. |
5th edition | 2025 | Re-structure of the classification into Language family groups (2-digit), Sub family groups (4-digit), Narrow groups (6-digit) and Languages (8-digit). |
Correspondence tables are available in the Data downloads section. The correspondence tables itemise the code linkages between groups, detail the links between the different levels of the classification, and indicate movements of particular Languages within the structure. In some instances, there is not a direct relationship between the groupings of the structures of the two editions. Partial linkages within the structure are indicated by including the letter 'p' after the code of the group concerned. One table enables users to convert data from the fifth edition (2025) to the fourth edition (2016) of ASCL. The second enables users to convert data from the fourth edition (2016) to the fifth edition (2025).
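When correspondence tables are processed programmatically, the 'p' marker needs to be separated from the code before any lookup. The sketch below assumes each cell holds a numeric code optionally followed by the letter 'p'; the exact layout of the published spreadsheets is not described here, so the cell format, the function name and the second sample value (1234p, a made-up placeholder) are assumptions.

```python
def parse_linked_code(cell: str) -> tuple[str, bool]:
    """Return (code, is_partial) for a correspondence cell such as '1101' or '1234p'.

    A trailing 'p' marks a partial linkage between the two editions.
    """
    text = cell.strip()
    is_partial = text.endswith(("p", "P"))
    code = text[:-1] if is_partial else text
    if not (code.isascii() and code.isdigit()):
        raise ValueError(f"unrecognised correspondence code: {cell!r}")
    return code, is_partial


print(parse_linked_code("1101"))   # a full link, e.g. 1101 Gaelic (Scotland) in ASCL 2016
print(parse_linked_code("1234p"))  # hypothetical partial link
```

Keeping the partial flag alongside the code lets a conversion routine treat partial links differently, for example by flagging converted records for review rather than silently recoding them.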
Changes to the coding index
Changes to the coding index will reflect the changes across the classification, as outlined above.
Codes for residual categories
Not elsewhere classified (nec)
Any language which is not separately identified in the classification (because it does not meet the threshold for the minimum number of speakers) is included in the residual 'nec' category of the narrow group to which it belongs. These categories are easily identified as 8-digit codes ending with 99. Examples include: 11121199 Nyulnyulan Languages, nec and 13991199 Other Indo-European Languages, nec. The ASCL contains forty-eight "nec" categories.
'Other' Language family group category
Language family group ‘91 Other Languages’ consists of groups of languages which are not linguistically or geographically related and do not fit into other Language family groups.
'Other' Narrow group categories
Some Language family groups also contain residual categories at the Narrow group level. These groups are termed 'Other' or 'Miscellaneous' categories and consist of separately identified Language groups which do not fit into other Narrow groups on the basis of the classification criteria. The classification currently contains twenty such residual categories. Examples include: Narrow Group 111116 Other South West Languages and Narrow Group 159999 Other Afro-Asiatic Languages.
Additional residual categories
Provision exists in the code structure for the creation of additional residual categories. As additional languages are identified as being used in Australia, it may be necessary to include more residual categories in the classification structure over time. Residual categories are part of the classification structure and should not be used to 'dump' responses that are not sufficiently detailed to be coded to a separately identified category of the classification (see Supplementary Codes).
Supplementary codes
Supplementary codes are used to process inadequately described responses in statistical collections. They are not part of the classification structure but are required to enable coding of responses that are not sufficiently detailed to enable them to be coded to a Language group. Supplementary codes are listed separately in the Data downloads section.
The codes are of two types:
- ‘Not further defined’ (nfd) codes ending in two or four zeros
- ‘Operational’ codes commencing with seven zeros.
Supplementary codes ending in zero
Codes ending in two or four zeros are described as 'not further defined' (nfd) codes. These are used to code responses about Language spoken which cannot be coded to the Language group (eight-digit level), but which can be coded to a higher level of the classification structure.
For example, responses which cannot be identified as relating directly to a particular Language, but which are known to fall within a particular Narrow group, are coded to that Narrow group. Such responses are allocated an nfd code consisting of the six-digit code of the Narrow group followed by 00.
Similarly, responses which do not contain sufficient information to be related directly to a Language group, or to a Narrow group, but which are known to be within the range of a particular Sub family group, are coded to that Sub family group. Such responses are allocated an nfd code consisting of the four-digit code of the Sub family group followed by 0000. For instance, the response 'Yanyi' does not contain sufficient information to be related directly to a Language group or to a Narrow group, but it can be coded to the Sub family group 1123 Yanyi Languages, which encompasses all Yanyi languages. It is allocated to Yanyi Languages, nfd, code 11230000.
Where there is a need to code a response within the range of a Language family group, where both the Sub family group and the Narrow group have identical labelling, then the nfd code will be aligned to the Narrow group 6-digit code, followed by 00.
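The construction of nfd codes described above amounts to padding a group code with zeros to eight digits. The following is a minimal sketch with a hypothetical function name, using group codes quoted elsewhere in this document.

```python
def nfd_code(group_code: str) -> str:
    """Build the 'not further defined' supplementary code for a group.

    A four-digit Sub family group code is followed by 0000 and a six-digit
    Narrow group code by 00, giving an eight-digit code in both cases.
    """
    valid_digits = group_code.isascii() and group_code.isdigit()
    if not valid_digits or len(group_code) not in (4, 6):
        raise ValueError(f"expected a 4- or 6-digit group code, got {group_code!r}")
    return group_code.ljust(8, "0")


print(nfd_code("1123"))    # Yanyi Languages, nfd
print(nfd_code("231211"))  # Kuki Chin Languages, nfd
```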
Supplementary codes starting with zero
Eight-digit codes commencing with ‘0000000’ are supplementary codes included for operational purposes only. Inclusion of these codes facilitates the coding of responses which cannot be allocated to one particular group in the classification (Language, Narrow, Sub family or Language family). For example, responses that do not provide enough detail to be categorised to one group in the classification are considered to be ‘Inadequately Described’ (code 00000000).
Currently the only supplementary codes starting with zeros are:
- 00000000 Inadequately Described
- 00000002 Not stated.
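The code patterns described for residual and supplementary codes can be used to tell the kinds of code apart. The sketch below is illustrative only: the function name and the returned labels are hypothetical, and it assumes that no separately identified Language has a code ending in 00 or 99, which is consistent with the examples in this document but should be confirmed against the published structure.

```python
def classify_code(code: str) -> str:
    """Label an eight-digit code by the pattern of its digits."""
    if len(code) != 8 or not (code.isascii() and code.isdigit()):
        raise ValueError(f"expected an 8-digit code, got {code!r}")
    if code.startswith("0000000"):
        return "operational"             # e.g. 00000000 Inadequately Described
    if code.endswith("0000"):
        return "nfd (Sub family group)"  # four-digit group code + 0000
    if code.endswith("00"):
        return "nfd (Narrow group)"      # six-digit group code + 00
    if code.endswith("99"):
        return "nec"                     # residual category within a Narrow group
    return "language"


for example in ("00000002", "11230000", "23121100", "11121199", "13111111"):
    print(example, classify_code(example))
```

The order of the checks matters: operational codes are tested first because 00000000 would otherwise match the nfd pattern.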
Index for coding responses
Why we use it
Responses provided in statistical and administrative collections are not always identical to the labels used to describe the classification categories. Therefore, a coding index is required to link responses to the most appropriate code in the Australian Standard Classification of Languages (ASCL) in a process called "coding" (which can be undertaken by computer or manually). The ASCL coding index contains a comprehensive list of the most likely responses to questions relating to language and their correct classification codes. The coding index is used to code responses to questions such as 'First Language Spoken', 'Languages Spoken at Home', 'Main Language Spoken at Home' and 'Main Language Other Than English Spoken at Home'. The ASCL coding index may be requested by contacting standards@abs.gov.au.
How it was developed
The coding index was developed through literature research, consultation with stakeholders, and analysis of data from responses obtained in ABS statistical collections such as the Census of Population and Housing.
As well as individual languages, a number of entries in the ASCL coding index cover variations and regional language varieties not separately identified in ASCL. Therefore, in addition to its coding function, the numerical index can be used to clarify the nature, extent and varietal content of each language category.
Coding rules
When coding responses in statistical or administrative collections, the following rules apply:
- Responses which match exactly an entry in the coding index are assigned the code allocated to that index entry. For example, a response of "Cambodian" is coded to 25111111 Khmer.
- Responses which relate directly to a language category are coded to that language category. Such instances include responses which are an exact match with the language category title except in terms of:
  - alternative spelling (e.g. responses of "Kaura", "Coorna" and "Koornawarra" are all coded to 11111113 Kaurna)
  - spelling error (e.g. "Japanease" is coded to 22111113 Japanese)
  - the use of abbreviations (e.g. "N.Z Maori" is coded to 27111414 Māori (New Zealand))
  - the use of foreign or idiosyncratic words (e.g. "Nihongo" is coded to 22111113 Japanese and a response of "Deutsch" is coded to 13121115 German)
  - the use of qualifying, modifying or extraneous words in addition to the fundamental or basic language description. For example, a response of "A little Japanese" or "Yes Japanese" is coded to 22111113 Japanese and "South Korean" is coded to 22111114 Korean.
- Responses which relate directly to a language category because they describe a language variation or geographic variation of the language are coded to that language category (e.g. the responses "Swabian", "Viennese" and "Alsatian" are all coded to 13121115 German).
- Responses containing more than one distinct language are coded to the first language stated (e.g. a response of "Polish and German" is coded to 13151217 Polish). The exception to this rule is where it is possible to store more than one language code, in which case the code for each separate language is recorded.
- Responses which cannot be identified as relating to a separately identified language in the classification are assigned a residual category code or a supplementary code. For example, "Chin" and "Chin Burma" are coded to 23121100 Kuki Chin Languages, nfd and "North Queensland Aboriginal" is coded to 11999900 Other Aboriginal and Torres Strait Islander Languages, nfd. Responses such as "Foreign", "Good Speech" and "Truth" cannot be linked to any language and are coded to 00000000 Inadequately Described.
A response should be coded to a residual category only when it is clear that it is a distinct language or variation which cannot be placed in a precise language category. Responses which are not precise enough to be coded to any category should be assigned the appropriate supplementary code.
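Some of the coding rules above can be sketched as a simple lookup. This is a rough illustration, not the ABS coding process: the index below contains only a handful of entries taken from the examples in this section, the function name is hypothetical, and the sketch handles exact matches, extraneous words and the first-language-stated rule but not spelling errors or multi-word index entries embedded in longer responses.

```python
import re

# Tiny illustrative index built from examples quoted in the coding rules.
# The real ASCL coding index is far more comprehensive.
CODING_INDEX = {
    "cambodian": "25111111",  # Khmer
    "nihongo": "22111113",    # Japanese
    "japanese": "22111113",
    "deutsch": "13121115",    # German
    "german": "13121115",
    "polish": "13151217",
}
INADEQUATELY_DESCRIBED = "00000000"


def code_response(response: str) -> str:
    """Return a single ASCL code for a free-text response."""
    # Normalise case, punctuation and whitespace before matching.
    text = re.sub(r"[^\w\s]", " ", response.lower())
    text = re.sub(r"\s+", " ", text).strip()
    if text in CODING_INDEX:              # exact match with an index entry
        return CODING_INDEX[text]
    for word in text.split():             # first language stated wins
        if word in CODING_INDEX:
            return CODING_INDEX[word]
    return INADEQUATELY_DESCRIBED


for answer in ("Cambodian", "A little Japanese", "Polish and German", "Foreign"):
    print(answer, "->", code_response(answer))
```

Scanning the words from left to right means that a response naming more than one language is coded to the first language stated, as the rule requires when only one code can be stored.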
Using the classification
Editing specifications
It is important when validating input codes at the editing stage, manipulating data, and deriving output items, that all valid codes are included in every specification. The full range of valid codes consists of all the codes in the classification structure plus all supplementary codes.
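An edit check of this kind can be sketched as a set membership test. The function names are hypothetical, and the short code lists below are samples drawn from this document; in practice the full lists would be loaded from the structure and supplementary code files in the Data downloads section.

```python
def build_valid_codes(structure_codes, supplementary_codes) -> frozenset:
    """Combine classification structure codes and supplementary codes."""
    return frozenset(structure_codes) | frozenset(supplementary_codes)


def find_invalid_codes(input_codes, valid_codes) -> list:
    """Return the input codes that are not in the valid code set, in input order."""
    return [code for code in input_codes if code not in valid_codes]


# Sample codes only; not the complete classification.
structure = ["13111111", "13121115", "11121199"]
supplementary = ["00000000", "00000002", "11230000"]
valid = build_valid_codes(structure, supplementary)

print(find_invalid_codes(["13111111", "11230000", "1101", "00000002"], valid))
```

Building the valid set from both lists in one place helps ensure that supplementary codes are not accidentally rejected at the editing stage.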
Coding, storage and presentation of data
Data should be collected, classified and stored at the language (eight-digit) level of the classification to allow flexibility of statistical output and more detailed analysis. It also maintains information for future use and enables comparison with previous data using different classifications.
In some instances, concerns about confidentiality or standard errors may not permit the release of data at the finer levels of the classification. The use of a standard classification enhances data comparability even though it may not always be possible to disseminate data at the most detailed level.
The hierarchical structure of the classification gives users the flexibility to output statistics at the level of the classification which best suits their particular purposes. Data can be presented at the Language family level, Sub family level, Narrow group level, or the Language level.
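Because higher-level codes are prefixes of Language codes, data stored at the eight-digit level can be rolled up to any broader level by truncating the code. The sketch below uses made-up counts and hypothetical names purely for illustration; supplementary codes (nfd and operational codes) would need separate handling and are left out here.

```python
from collections import Counter

# Number of leading digits that identify each level of the 2025 structure.
LEVEL_DIGITS = {"language_family": 2, "sub_family": 4, "narrow_group": 6, "language": 8}


def aggregate(counts: dict, level: str) -> Counter:
    """Sum Language-level counts up to the requested level of the hierarchy."""
    width = LEVEL_DIGITS[level]
    totals = Counter()
    for code, number in counts.items():
        totals[code[:width]] += number
    return totals


# Made-up counts for 13111111 Gaelic (Scotland), 13121115 German, 22111113 Japanese.
sample = {"13111111": 5, "13121115": 7, "22111113": 3}
print(aggregate(sample, "language_family"))
print(aggregate(sample, "sub_family"))
```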
Data downloads
ASCL Structure, Correspondences and What's new
A coding index may be of use to anyone seeking to code responses to the Australian Standard Classification of Languages and may be requested by contacting standards@abs.gov.au.
ASCL structure: The detailed structure of ASCL at each hierarchical level.
ASCL correspondence tables: Correspondences between ASCL 2025 and ASCL 2016
What's new: Detailed changes made to ASCL 2025
Abbreviations
ABS Australian Bureau of Statistics
ASCL Australian Standard Classification of Languages
nec not elsewhere classified
nfd not further defined.
Glossary
Language family
A language family is a group of related languages which have evolved from a common ancestral language and share common linguistic features (such as pronunciation, vocabulary, grammar).
Variation
In the ASCL (2025), a variation is defined as a regional or social variety of a language distinguished by pronunciation, grammar, and/or vocabulary from its parent language. In previous versions of the ASCL, a variation was referred to as a dialect. In the 2025 classification, some variations have been individually specified within the ASCL, based on user needs; however, where possible, variations have been linked to their parent Language.
Pidgin
In the ASCL, a pidgin is defined as a language used for communication between groups having different first languages, as between European traders and Indigenous peoples, and which typically has derived features from each of those languages.
Creole
In the ASCL, a creole is defined as a language which has developed from a pidgin to become the primary language of a community.