Standards for Statistics on Cultural and Language Diversity

Latest release

A set of statistical standards used to collect the information necessary for consistent measurement of cultural and language diversity in Australia.

Reference period: Australia
Released: 22/02/2022
Next release: Unknown
First release

About this release

The Australian Bureau of Statistics (ABS) Standards for Statistics on Cultural and Language Diversity (SSCLD) presents a nationally consistent framework for the collection and dissemination of data on cultural and language diversity.

This publication refreshes the previous publication on the new website. The standards have not been changed, but the text and publication format have been streamlined for readability and discoverability.

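The Minimum Core Set of Cultural and Language Indicators consists of four concepts:

  • Country of Birth of Person
  • Main Language Other Than English Spoken at Home
  • Proficiency in Spoken English
  • Indigenous Status

The Standard Set of Cultural and Language Indicators comprises the Minimum Core Set together with the following:

  • Ancestry
  • Country of Birth of Father
  • Country of Birth of Mother
  • First Language Spoken
  • Languages Spoken at Home
  • Main Language Spoken at Home
  • Religious Affiliation
  • Year of Arrival in Australia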

When the Standards for Statistics on Cultural and Language Diversity were first published in 1999, there was an accompanying document called The Guide: Implementing the Standards for Statistics on Cultural and Language Diversity (the Guide). The Statistics Working Group of the Commonwealth Interdepartmental Committee on Multicultural Affairs (IDC) prepared the Guide to assist government departments and agencies to implement the Standards.

The original document for the Guide can be found in the Standards for Statistics on Cultural and Language Diversity, 1999.

This reference has been included to provide historical context. Sections of the Guide may still be relevant; however, when the document was published, newer methods for creating statistics, such as data linkage, were not commonplace and are not reflected in the document.

Introduction

This publication presents a set of statistical standards (referred to further as "Standards") designed to support a nationally consistent framework for the collection and dissemination of data on cultural and language diversity.

It is intended that the Standards be used within and outside the ABS in all relevant data collection activities, as this will improve the compatibility and comparability of data derived from different sources.

The Standards, and the variables which they represent, are considered relevant as cultural and language indicators. The list of variables was developed by the ABS in collaboration with other organisations in response to growing user needs and a request from government for consistent and accurate measurement of cultural diversity in Australia. Links to the standards for each of the identified variables are presented in this publication.

The Standards reference major Australian standard classifications related to language and cultural diversity. Use of these classifications is fundamental to proper application of the standard variables. Links have been provided where relevant.

This publication has four main purposes:

  • to provide standards to identify, define, classify and disseminate particular attributes of a person or group of people that relate to their origins and cultural and language background;
  • to provide a way to identify, measure and monitor service needs associated with advantage or disadvantage related to cultural and language background. For instance, to provide cultural and language data which facilitate access and equity initiatives;
  • to provide information that supports a measure of cultural and language diversity in its broader sense. That is, a measure of the cultural and language communities and groups that make up Australian society; and
  • to provide information to replace non-English speaking background (NESB) and main English-speaking country (MESC), which are widely considered to be no longer appropriate as general purpose indicators.

There are two ways in which the cultural and language variables can be used:

  • an individual variable can be used to collect a particular item of information in a statistical or administrative collection to meet a particular need. This method, while satisfying a particular information need, may provide only a superficial measure of cultural and language diversity.
  • a select set of variables can be used to collect a range of cultural and language information. This multi-dimensional approach can provide a broad and relatively balanced method of measuring the cultural diversity of a population or the cultural and ethnic attributes of an individual.


Most ABS statistical collections and the collections of many other agencies do not have the measurement of cultural and language diversity as a primary focus. Many of these collections do, however, collect information on one or two of the variables included as cultural and language indicators in this publication. This usage is acceptable and, if the statistical standards presented in this publication are used, will provide good quality information that is comparable with many other data collections.

Cultural and language diversity has many elements, all of which must be considered to measure it accurately. Using a single standard variable, such as country of birth, or a non-standard composite concept, such as NESB, is inadequate.

It is recommended that use of the term ‘Non-English Speaking Background’ or acronym NESB be discontinued as a label for data however it is measured. If people are classified into English speakers and non-English speakers on the basis of, for example, the language variable First Language Spoken, they should be described as ‘First Language Spoken English’ and ‘First Language Spoken Other Than English’. Generally, the standard names of the variables and the standard names of classification categories should be used when disseminating data. For instance, the Proficiency in Spoken English of people should be described as ‘Speaks English Very Well’, ‘Speaks English Well’, etc.
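As a minimal sketch of this labelling rule, the Python fragment below derives output labels from the standard variable name instead of an NESB flag. The function name and the plain-text comparison with "English" are illustrative assumptions; they are not part of the Standards, which rely on the relevant ABS classification for coding.

```python
def first_language_label(first_language_spoken: str) -> str:
    """Return an output label built from the standard variable name.

    Assumes the input is a language name recorded for First Language Spoken.
    Matching on the text "english" is a simplification for illustration.
    """
    if first_language_spoken.strip().lower() == "english":
        return "First Language Spoken English"
    return "First Language Spoken Other Than English"


# Hypothetical responses, for illustration only.
labels = [first_language_label(x) for x in ["English", "Vietnamese", " english "]]
print(labels)
```

The point of the design is that the label names the variable actually measured, so a reader of the output can tell which question the grouping came from.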

Development of the cultural and language indicators

History

NESB was previously used as a broad measure of culturally related need or disadvantage. NESB is no longer considered to be an appropriate measure of culturally related disadvantage, in terms of access to government services, for a variety of reasons: 

  • the term has many conflicting definitions
  • it groups people who are relatively disadvantaged with those who are not disadvantaged
  • it is unable to separately identify the many cultural and linguistic groups in Australian society 
  • it has developed negative connotations. 


NESB is seen as an oversimplified indicator of disadvantage which may result in inappropriate service provision and fail to capture the nuances of a culturally diverse population. Consequently, government agencies at all levels sought to develop a more accurate, effective and consistent measure of cultural and language diversity to improve strategic planning and evaluation of service programs. 

At a meeting of the Council of Ministers of Immigration and Multicultural Affairs (COMIMA) in May 1996, Commonwealth and State Ministers noted the problems associated with the use of NESB and agreed that the term and its acronym be dropped, where possible, from official communications. The ability of all government agencies to capture a common core set of cultural indicator data was seen as an area requiring urgent attention. Such data would allow a more precise and meaningful assessment of service uptake by different cultural groups across many different portfolio services, as well as a comparative assessment across agencies. To progress the development and implementation of standardised cultural indicator data across all levels of government, the ABS was engaged to cost, develop and pilot a data collection instrument and trial it in several government agencies.

A working group comprising the Department of Immigration and Multicultural Affairs (DIMA), the Multicultural Affairs Unit of the Department of Premier and Cabinet (Victoria) and the ABS was established and a draft proposal was developed. In November 1997 the draft proposal outlining the objectives, methodology and costs associated with the pilot study was endorsed by the Standing Committee of Immigration and Multicultural Affairs (SCIMA), with the requirement that the project team report back in November 1998.

The overriding aims of the study, known as the Cultural and Language Indicators Pilot Study (CLIP), were to determine the type of data necessary to replace NESB, and to provide a way for cultural and language diversity to be more effectively built into the strategic planning and reporting processes associated with program or service delivery. The objectives were to propose a standard set of variables to measure cultural and language diversity which could be used in all administrative and service provision settings, and to determine a Minimum Core Set of variables from the full standard set that would effectively replace NESB. 

It should be stressed that it was not the intention of the exercise to recommend a single key measure of cultural and language diversity to replace NESB, nor to propose an alternative acronym. It was apparent that precise measurement of cultural and language diversity, and related advantage or disadvantage, required a combination of variables which produced a range of data about a person’s background. 

The study consisted of two main streams of activity. The first was to design and pilot a data collection instrument, which would test a number of indicator variables relating to cultural and language background on surveys and administrative forms in various settings (e.g. hospital admission forms, DIMA Offshore and Onshore processing offices). The second was to undertake analysis of ABS census and survey data, and to conduct supplementary research to assess the performance and suitability of the proposed indicator variables. 

Cultural and language indicators pilot study outcomes

This data collection test involved 5,016 clients who used the services provided by a range of Local, State and Territory and Commonwealth Government agencies from March to September 1998. This sample size allowed for a broad representation of the community and for a detailed analysis of the results in terms of response rates to individual questions, quality of data collected and the degree of respondent burden. Feedback from the participating agencies on respondent reaction to questions, and respondent concern about sensitivity or privacy of the information sought, was also analysed. Evaluation of the performance and suitability of the variables tested found that only one, Visa Category, would be difficult to implement in particular administrative settings. The remaining variables were found to be suitable indicators of cultural and language diversity and could be successfully implemented in a range of statistical and administrative collections. 

Existing ABS census and survey data were analysed to determine whether there was a relationship between individual cultural and language indicator variables used in ABS collections (and trialled in the CLIP data collection test), and other recognised measures of socioeconomic disadvantage, such as unemployment rates. The analysis indicated that population groups from certain countries of birth, people who spoke particular languages and those with certain religions had a relatively high correlation with indicators of socioeconomic disadvantage. However, not all people, or even a majority of people, within these groups necessarily exhibited the characteristics of disadvantage. Therefore, attempts to use a single cultural or language variable as a generalised indicator of advantage or disadvantage suffered from the same limitations associated with the use of NESB. This analysis confirmed the need for a range of cultural and language diversity indicators. 

The CLIP outcomes, supplementary research and subsequent SCIMA discussions were all used to refine and finalise the Minimum Core Set of Cultural and Language Indicators selected out of the full Standard Set of Cultural and Language Indicators. These indicators are designed to replace NESB and to collect a wide range of cultural and language data. 

Because the project was directed primarily towards developing cultural and language indicators to replace NESB, Indigenous Status was not included in the CLIP testing program. However, it was acknowledged that Indigenous Status is a fundamental element of cultural diversity in Australian society and that the existing ABS standard for Indigenous Status, which was at the time the subject of a number of other initiatives, should be included in all data collections where the focus is not restricted to migrants and their descendants. 

The Minimum Core Set consists of four variables: Country of Birth of Person, Main Language Other Than English Spoken at Home, Proficiency in Spoken English and Indigenous Status. Indigenous Status forms part of the core set for those collections which are not specifically focused on migrants to Australia. 

The full Standard Set also includes Ancestry, Country of Birth of Father, Country of Birth of Mother, First Language Spoken, Languages Spoken at Home, Main Language Spoken at Home, Religious Affiliation and Year of Arrival in Australia. Languages Spoken at Home was not included in the CLIP testing program and, therefore, was not endorsed as one of the Standard Set of indicators by COMIMA. It has been included as one of the full Standard Set in this publication because of its value in providing data on the stock of languages actively used in Australian homes. Any of these variables can be added to the Minimum Core Set variables to collect other relevant cultural and language diversity data to meet particular information needs. 

The Concepts section contains more information on these variables and why they were chosen.

Endorsement of the standard set of indicators

The Minimum Core Set and the Standard Set of Cultural and Language Indicators were endorsed by COMIMA in April 1999. COMIMA felt that this range of variables was able to comprehensively measure different aspects of a person’s origins and the extent to which persons from certain cultural and language backgrounds are associated with advantage or disadvantage. The variables were also found to support effective planning, evaluation and monitoring of service programs in all administrative and service provision settings. 

COMIMA recommended that the Minimum Core Set of variables be implemented in all national and state and territory statistical and administrative collections which require information on cultural and language diversity. COMIMA further recommended that additional variables from the full Standard Set be added to the collection where a wider range of information is required.

Using the cultural and language indicators

Government and private organisations currently collect a wide range of cultural diversity data, using many different data collection methodologies (e.g. self-enumerated administrative forms, personal interviews, etc.) and different measures of a person’s cultural background and language use. The Standard Set of Cultural and Language Indicators, with standard questions and data collection procedures, provides significant benefits, including:

  • providing a consistent method for measuring cultural and language diversity in all statistical and administrative collections 
  • allowing data from different sources, and different time periods, to be compared and integrated in a meaningful way
  • improving the quality, relevance and accuracy of data produced
  • reducing development and operational costs for agencies collecting data on cultural and language diversity by providing a ready-made and reliable method for use in all service provision settings. 


The Council of Ministers of Immigration and Multicultural Affairs (COMIMA) recommended that the measurement of cultural and language diversity be based on the use of the Standard Set of Cultural and Language Indicators in statistical and administrative collections across all states and territories. It further recommended that any measures based on the notion of NESB be replaced by the new method. A Minimum Core Set of the recommended indicators is considered necessary to collect the minimum amount of information needed to replace the key measure of NESB. 

Organisations will need to review and adjust their measurement tools and data processing procedures to fully implement the proposed cultural and language indicators in their collections. This could involve: 

  • the deletion or addition of questions
  • changes to question wording and sequencing
  • changes to definitions
  • changes to the classification of responses
  • changes to manual and computer coding systems used to capture and manipulate data. 


The cultural and language indicators consist of a suite of statistical variables for which the ABS has developed statistical standards. The use of statistical standards is considered essential to provide the basis of comparability of information collected within and between agencies, especially comparability with data produced by the ABS. Therefore, to implement the recommendations of COMIMA, organisations will need to accept and use ABS statistical standards for the Minimum Core Set and Standard Set of Cultural and Language Indicators. 

While implementation of the ABS standards may involve some initial costs and inconvenience, consistent use of the standards will ultimately result in developmental and operational savings as well as improving the quality of information collected.

It is not the role of the ABS to anticipate how a particular organisation should use data relating to each variable or use the data taken together. This will depend on the data requirements necessary to support the policy and operational aims and objectives of each organisation. However, the following observations are offered to illustrate how data on cultural diversity can be used.

There are two basic perspectives of data which affect the range and uses of data collected about cultural and language diversity:

  • Data about individuals. 
  • Aggregated data about community groups. 

Data about individuals

One of the aims of using the Standards is to enable agencies to make decisions about a person’s needs on the basis of direct and accurate information about their background, language and English skills, without making unfounded assumptions about individuals on the basis of the general characteristics of the community group to which they belong. Making assumptions about, for example, a person’s language skills on the basis of their country of birth is not recommended. For instance, when analysing the characteristics of a population, relatively high levels of correlation between level of Proficiency in Spoken English and country of birth can be observed when Year of Arrival in Australia is also considered. When applied to an individual, however, this method can produce inaccurate conclusions.

Generally, it can be said that when replacing NESB as a measure of disadvantage, Proficiency in Spoken English provides a similar, but more precise and meaningful, measure. CLIP supported the notion that this variable provides information which is fundamental to target the provision of services to people whose lack of ability in spoken English is potentially a barrier to gaining access to government programs and participating in Australian society on an equal footing with those who are proficient in English. 

Use of a language variable not only provides a filter to Proficiency in Spoken English, but also accurately indicates the language characteristics and origins of an individual. Using Country of Birth of Person with a language variable provides a cultural dimension which is likely to indicate a person’s familiarity with Australian institutions, labour market, etc. Adding Country of Birth of Mother and/or Country of Birth of Father to this mix of variables will further refine this measure of cultural and language background and provide additional information on a person’s potential for accessing services. 

Country of Birth of Person is best used in conjunction with Year of Arrival in Australia as this will provide an indication of the extent to which migrants are likely to have adapted to Australian society. It can also be used to determine which community groups have the most difficulty, or take the longest time, to adapt to Australian society. 

Aggregated data

It is necessary to aggregate data collected from individuals to create community group profiles for the purposes of policy setting, service monitoring, analysis and thematic reporting. In such circumstances, the range of data collected should, where possible, be suitable for the aggregation and analysis undertaken. The use of Country of Birth of Person (and perhaps Country of Birth of Mother and/or Country of Birth of Father), a language variable and Religious Affiliation will accurately identify most cultural and ethnic groups. 

As data from many administrative sources (such as income and occupation data) can be limited, extrapolation of group characteristics using correlations developed from, for example, cross-classified census data is often necessary when studying the social and economic functioning of community groups. Although more meaningful data will be obtained if direct questions are asked of the target population, data derived for community groups can be used because such data do not draw direct conclusions about a particular person’s characteristics. 

Using relationships between language communities and aspects of functioning such as Proficiency in Spoken English, labour force status, educational qualifications, etc. (established using census data), it is possible to draw conclusions about the service needs of particular language communities. This provides useful planning information when these communities are encountered in administrative and service provision settings. Similar relationships can be established between Country of Birth of Person and aspects of functioning for the targeting of service provision to communities originating in particular countries. 

Cultural and language indicator data also provide a useful tool for determining if service providers are successfully targeting the client groups in their catchment area. For instance, if census data indicate a certain number of speakers of a language in a particular region, and the service take-up rate for speakers of that language is proportionally low, it may be deduced that targeting strategies are not working. The usefulness of comparing census data with data from other sources is a major reason that the census language variable Main Language Other Than English Spoken At Home is included in the Minimum Core Set.
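The comparison described above can be sketched as a simple rate calculation. The Python fragment below is illustrative only: the language names, counts and the per-1,000 scaling are hypothetical assumptions, and the comparison is only meaningful if the service data and the census data classify language using the same variable.

```python
def take_up_rate(clients: int, census_population: int) -> float:
    """Clients per 1,000 census-counted speakers of a language in a region."""
    if census_population <= 0:
        raise ValueError("census_population must be positive")
    return 1000 * clients / census_population


# Hypothetical figures for one catchment area (not real data).
census_speakers = {"Language A": 4000, "Language B": 500}
service_clients = {"Language A": 200, "Language B": 5}

rates = {
    language: take_up_rate(service_clients[language], population)
    for language, population in census_speakers.items()
}
for language, rate in rates.items():
    print(f"{language}: {rate:.1f} clients per 1,000 speakers")
```

In this made-up example, Language B has a much lower rate than Language A, which might prompt a review of how the service is targeted to that community. The rate alone does not explain why take-up is low.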

The intention of this framework for measuring cultural and language diversity is not to replace NESB with another single measure that attempts to synthesise a number of attributes of the background or functioning of a person or group of persons. Such blanket measures tend to inaccurately assign the characteristics of a population to individuals and may tend to create an inappropriately negative (or positive) view of all people falling within them. Rather, it is intended that the attributes of each person are determined by direct and accurate measurement.

Collecting cultural and language indicator data

Standards

The ABS has developed and maintains standards for cultural and language variables for use when collecting the data. 

Each variable codifies the concept, definitions, and methods recommended by the ABS for collecting, processing and presenting quality statistics on cultural and language diversity.

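The Minimum Core Set of Cultural and Language Indicators consists of four concepts:

  • Country of Birth of Person
  • Main Language Other Than English Spoken at Home
  • Proficiency in Spoken English
  • Indigenous Status

The Standard Set of Cultural and Language Indicators comprises the Minimum Core Set together with the following:

  • Ancestry
  • Country of Birth of Father
  • Country of Birth of Mother
  • First Language Spoken
  • Languages Spoken at Home
  • Main Language Spoken at Home
  • Religious Affiliation
  • Year of Arrival in Australia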

It is recommended that the Minimum Core Set of variables be collected in all administrative and service provision settings where information on cultural and language diversity is required. However, where the focus is on migrant issues only, it may not be appropriate or useful to include Indigenous Status. In instances where data collection activities are focused specifically on Aboriginal and Torres Strait Islander peoples, Country of Birth of Person could be omitted as the population of interest would almost all be born in Australia.  

Any of the non-core variables in the Standard Set can be added to the Minimum Core Set to create question modules which enable other relevant data to be collected to meet particular information requirements. The additional indicators can be added, either individually or in combination, to the core set. A general principle might be to ask as wide a range of questions as possible to provide a comprehensive picture of an individual’s origins and characteristics. 

The questions outside the core set have not been assigned a hierarchy of priority. Individual requirements will determine which additional questions are needed and the order in which non-core questions are asked.  However, some questions should only be asked of certain populations. For example, Year of Arrival in Australia should only be asked of those people born overseas.
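Question sequencing of this kind can be sketched as simple skip logic. The Python fragment below is an illustration under stated assumptions: the function and parameter names are hypothetical, and the second rule reflects the use of a language variable as a filter for Proficiency in Spoken English described elsewhere in this publication. The sequence guides in the individual standards remain the authoritative source.

```python
from typing import List


def follow_up_questions(born_in_australia: bool,
                        speaks_other_language_at_home: bool) -> List[str]:
    """Return the follow-up questions that apply to one respondent."""
    questions = []
    # Year of Arrival in Australia applies only to people born overseas.
    if not born_in_australia:
        questions.append("Year of Arrival in Australia")
    # Proficiency in Spoken English is filtered by a language variable.
    if speaks_other_language_at_home:
        questions.append("Proficiency in Spoken English")
    return questions


print(follow_up_questions(born_in_australia=True, speaks_other_language_at_home=True))
print(follow_up_questions(born_in_australia=False, speaks_other_language_at_home=False))
```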

Where particular collections currently do not include one or more of the four Minimum Core Set variables, but include other cultural diversity variables which are better suited to the collection, it is not intended that a core variable necessarily replace a variable currently collected. Instead, it is recommended that the missing Minimum Core Set variables be added to the collection. For example, if an organisation collects Country of Birth of Person and Year of Arrival in Australia data because they are useful when used together, it is not proposed that the organisation replace Year of Arrival in Australia with Main Language Other Than English Spoken at Home and Proficiency in Spoken English. Rather, the organisation is encouraged to collect Main Language Other Than English Spoken at Home and Proficiency in Spoken English as well.
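A minimal sketch of this "add, do not replace" recommendation follows. The Python function and its `migrant_focus_only` flag are hypothetical names introduced for illustration; the flag reflects the earlier note that Indigenous Status may not be appropriate where the focus is on migrant issues only.

```python
MINIMUM_CORE_SET = (
    "Country of Birth of Person",
    "Main Language Other Than English Spoken at Home",
    "Proficiency in Spoken English",
    "Indigenous Status",
)


def core_variables_to_add(collected, migrant_focus_only=False):
    """List Minimum Core Set variables missing from an existing collection.

    Variables already collected are never removed; the result only names
    core variables that could be added alongside them.
    """
    required = [
        v for v in MINIMUM_CORE_SET
        if not (migrant_focus_only and v == "Indigenous Status")
    ]
    return [v for v in required if v not in collected]


# The example from the text: a collection that already holds two variables.
existing = {"Country of Birth of Person", "Year of Arrival in Australia"}
to_add = core_variables_to_add(existing, migrant_focus_only=True)
print(to_add)
```

Year of Arrival in Australia stays in the collection; the function only reports what is missing from the core set.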

The Standard Set contains several different language variables which each measure a different concept associated with language usage. The language variable, Main Language Other Than English Spoken at Home is included as part of the Minimum Core Set because it was identified, following extensive consultation with users of language data, as the most useful general purpose language variable. It is the language variable used in the ABS Census of Population and Housing and its use therefore enables administrative data to be directly compared or integrated with census data. This would be more difficult to achieve if different measures of language were used in the two contexts.  

Organisations should use other language variables in addition to Main Language Other Than English Spoken at Home if additional language data are required. The choice of additional language variables, the order in which they are asked and which language variable is used as a filter for Proficiency in Spoken English will depend on particular information requirements. The relative strengths and weaknesses of the four language variables included in the Standard Set are discussed in the next section.

It should be noted that in statistical collections without a cultural focus, individual variables from the Standard Set can be used without having to use the full Standard Set or Minimum Core Set of variables. In such circumstances it is still necessary to use the standards developed for these variables.

When collecting data on cultural and language diversity, the standards should be used and the ordering of questions and sequence guides provided should be followed. This will ensure that compatible and comparable data are collected across statistical and administrative collections for all government and private sector collections. Responses to the cultural and language questions should be classified using standard ABS classifications and associated coding procedures. These classifications are well researched and soundly developed and their use enables the ready comparison of data from different sources. The use of ABS coding indexes designed to complement the classifications will simplify the coding process and improve data quality.

It is recommended that data be captured and stored at the most detailed level of the classification wherever possible. This allows the greatest flexibility for the output of statistics, enables more detailed and complex analysis, facilitates comparisons with previous data using different classifications and preserves information so as to provide maximum flexibility for future use of the data.
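The benefit of storing the most detailed level can be sketched as follows. This Python fragment assumes a hierarchical classification in which broader groups are identified by the leading digits of a detailed code. The codes shown are hypothetical and are not taken from any ABS classification.

```python
from collections import Counter

# Hypothetical 4-digit codes stored at the most detailed level.
# These are NOT real classification codes.
records = ["1201", "1201", "1305", "2102", "2102", "2102"]


def roll_up(codes, digits):
    """Aggregate detailed codes to a broader level by keeping leading digits."""
    return Counter(code[:digits] for code in codes)


detailed = roll_up(records, 4)  # finest level, as stored
broad = roll_up(records, 1)     # broadest level, derived at output time
print(dict(detailed))
print(dict(broad))
```

Broad counts can always be derived from detailed codes in this way, but detailed counts cannot be recovered from data stored only at the broad level.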

It should be noted that the full set of questions need not be used in any one questionnaire. Rather, a combination of variables from the Standard Set, chosen according to the considerations outlined above, can be added to the Minimum Core Set of questions to create a tailored question module.

As far as possible, conceptual and operational consistency is maintained across all of the modules in order to maximise comparability. Inevitably, however, data collected using different question modules will have some limitations in terms of comparability due to differences in the degree of precision and the level of detail collected. 

Concepts

The variables included in the Minimum Core Set and the Standard Set of Cultural and Language Indicators were chosen by the ABS and other agencies because they provide a range of information that is pertinent to the measurement of cultural and language diversity, and of related advantage or disadvantage in terms of access to government and other services. They are variables that are being increasingly used in the statistical and administrative collections of the ABS and other organisations. The fact that ABS standards, with standard definitions, question wording and data collection procedures, already existed for these variables, or were in the process of being developed, supported the choice of variables.

The Minimum Core Set and Standard Set of questions are designed to provide information about the origins and cultural characteristics of an individual or group. They therefore do not include questions about a person’s labour force characteristics, educational qualifications, income, etc. If socioeconomic data are needed, a number of such questions should be asked, and a range of ABS standards is available to assist where such questions are required. It should be noted that making assumptions about the socioeconomic status of a person or a population on the basis of their origins is not recommended. 

The following outlines the main reasons why each variable was chosen for the Standard Set. It also outlines the relative strengths and weaknesses of each language variable, the quality of data collected and the suitability of the language variables as a filter for the Proficiency in Spoken English variable. 

Country of birth of person

This variable provides fundamental and objective information about a person’s origins. It is widely regarded by many organisations as a priority measure of cultural background, and forms a key element of their current data collection practice. The variable readily enables comparison with existing ABS census and survey data, and with overseas data. When used in conjunction with other cultural and language variables, Country of Birth of Person allows for the identification of subgroups within a migrant population. It has limitations in identifying ethnic and cultural groups which form minorities in their country or countries of origin and groups which have significant populations in countries outside their country of origin.

Main language other than English spoken at home

This language variable provides information on the number of people who speak English only and, if one or more other languages are spoken, the main non-English language used in the home. The variable has the merit of capturing a language other than English, where the main language spoken may be English but a language other than English is still used in the home. This maximises numbers for the more established migrant communities. In some cases, however, this measure may not reflect complete language use, for example, when English is the only language spoken in the home but a language other than English is spoken outside the home, within a person’s ethnic or community group. This measure may also record, in the ‘Other Than English’ category, people whose main and preferred language is English but who have learnt another language that is occasionally, but not normally, spoken at home.

Main Language Other Than English Spoken at Home was chosen for the Minimum Core Set not only because of its strengths as a measure of language usage, but also because it is the language variable used in the Census. A major advantage of including this language variable in the Minimum Core Set is that it allows for comparability of language data collected by statistical and administrative collections with ABS census data. This allows, for example, rates of usage of services by particular language groups in particular regions to be calculated on the basis of the size of the group in the catchment area as revealed by census data.
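
As a rough illustration of the rate calculation described above, the following Python sketch divides counts of service clients by the census population of the same language group in a catchment area. The function name, the language groups and all figures are invented for illustration and are not ABS data.

```python
def usage_rate_per_1000(clients_by_language, census_population_by_language):
    """Return service usage per 1,000 residents for each language group.

    Groups with no census population recorded are skipped, because no
    meaningful rate can be calculated for them.
    """
    rates = {}
    for language, clients in clients_by_language.items():
        population = census_population_by_language.get(language, 0)
        if population > 0:
            rates[language] = 1000 * clients / population
    return rates


# Hypothetical figures for one catchment area (not real data).
clients = {"Italian": 120, "Vietnamese": 45, "Arabic": 30}
census = {"Italian": 4000, "Vietnamese": 900, "Arabic": 0}

rates = usage_rate_per_1000(clients, census)
```

A calculation of this kind is only meaningful if the administrative collection and the census use the same language variable and the same classification for both the numerator and the denominator.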

A weakness of Main Language Other Than English Spoken at Home is that it can only capture language use in the home, and may indicate the use of languages other than English for those people whose main language is usually English and whose use of another language is marginal. As such, this variable filters some people whose main language spoken is English to the Proficiency in Spoken English variable, which ideally should only be asked of those for whom English is a second language. It may also exclude some people who speak a language other than English outside the home but who speak only English at home, although not proficiently, from being asked the Proficiency in Spoken English question. However, the extent of these ‘filtering’ problems is not known and many users regard the combination of this language variable and Proficiency in Spoken English as the best measure for identifying service needs and the potentially disadvantaged. This variable is also used as a filter for Proficiency in Spoken English by the ABS in the Census of Population and Housing. 

Indigenous Status

Indigenous Status provides data on the number of people who identify as being of Aboriginal or Torres Strait Islander origin. Indigenous Status is a fundamental element of cultural diversity in Australian society and should be included in all relevant data collections except for those specifically focused on migrants and their descendants.

ABS Census of Population and Housing figures have shown that people’s ‘propensity to identify’ as being of Aboriginal or Torres Strait Islander origin has changed over time. For more detail see Census of Population and Housing: Understanding the Increase in Aboriginal and Torres Strait Islander Counts, 2016.

Ancestry

This variable provides a self-assessed measure of ethnicity and cultural background by identifying a person’s origins and heritage. It can also be used in combination with other variables as a measure of the extent to which people retain the ethnicity and culture of their forebears (e.g. parents and grandparents). However, there are many Australians with origins and heritage which do not, in practice, relate to their current ethnic identity. As such, Ancestry is not considered to be a particularly good measure of service needs and should be used in conjunction with the Country of Birth variables and language variables to provide additional information about a person’s cultural identity. It may be of most value for some analytical purposes when the population of interest is restricted to persons born overseas, or who have one or more parents born overseas. 
One advantage of including the Ancestry variable in the Standard Set is that it will allow comparisons of Ancestry data from administrative sources with ABS Census data.

Country of birth of father

Country of Birth of Father identifies the country in which a person’s father was born. It is regarded as an important variable as it can be used, in association with other cultural and language variables, to determine the extent to which second generation Australians retain their parents’ culture, ethnicity or language.

Country of birth of mother

Country of Birth of Mother identifies the country in which a person’s mother was born. It is regarded as an important variable as it can be used, in association with other cultural and language variables, to determine the extent to which second generation Australians retain their parents’ culture, ethnicity or language.

First language spoken

This variable provides accurate information about a person’s cultural and linguistic background, as First Language Spoken does not change over a person’s lifetime, and is regarded as a good surrogate measure of ethnicity because of its connection with a person’s origins and the origins of his or her parents. This variable also provides a good measure of current language use in the community. ABS data show that 95% of Australians whose first language is a language other than English are still able to use their first language. 
In some instances, however, depending on age, year of arrival in Australia, and living arrangements, a person’s first language spoken will not necessarily be the person’s language of greatest competence or the main language he or she currently speaks at home or in the community. As such, like Main Language Other Than English Spoken at Home, this variable may overstate the real level of usage of languages other than English.

Languages spoken at home

This language variable provides data on the stock of languages actively used in Australian homes. In some cases, however, this measure may not reflect complete language use when, for example, only one language is spoken in the home but other languages are spoken outside the home, within a person’s ethnic community group. This variable puts no restrictions on the number of languages a person can report as being spoken in the home. However, multiple language responses may include languages which play a minor role in a person’s communication because they are not the person’s first language, the language mainly used or the language of greatest competence. The variable does not determine the frequency with which each language, reported as being spoken in the home, is used. 
As it is possible to have multiple responses to this variable which include a mix of English and non-English languages, it does not provide a reliable filter for Proficiency in Spoken English and should not be used as such. 

Main language spoken at home

This variable provides information about the language most frequently used by a person at home. It is a good indicator of the language in which an individual is likely to be most at ease. However, Main Language Spoken at Home tends to understate current community language usage, of languages other than English, amongst the longer standing migrant groups who now mainly use English at home. In some instances, it does not provide information about a person’s cultural and language background but rather information about an aspect of their living arrangements (i.e. the single language most frequently used in the household in which they live).
Main Language Spoken at Home can be used as a filter to Proficiency in Spoken English. However, it will sequence people who mainly speak a non-English language at home but who are proficient in English, which is their main language outside the home, to the Proficiency in Spoken English question. It may also sequence past the Proficiency in Spoken English question some individuals who mainly use English at home but are not fully proficient in English. First Language Spoken and Main Language Other Than English Spoken at Home are better filters to Proficiency in Spoken English.

Religious affiliation

As well as providing data on the number of people who identify with particular religious groups in the Australian community, this variable provides additional data for identifying specific ethnic or cultural groups, when used in conjunction with other cultural and language variables. Some organisations have found data on religious affiliation helpful in delivering culturally relevant services to clients.

Year of arrival in Australia

This variable is used to derive the length of time a person born in another country has spent living in Australia. It is an important variable for many purposes as it gives an indication of how familiar migrants are likely to be with Australian society and practices, how long it took them to overcome settlement difficulties, and how their social characteristics have changed with the length of time they have been here. Year of Arrival in Australia is also related to familiarity with the domestic labour market and may be a major determinant of the economic situation of migrants.
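
The derivation mentioned above can be sketched in a few lines of Python. This is a minimal illustration only: the function name and the choice of reference year are assumptions, and because only the calendar year of arrival is collected, the result can differ from the true length of residence by up to a year.

```python
def years_since_arrival(year_of_arrival, reference_year):
    """Approximate number of years a person has lived in Australia.

    year_of_arrival: calendar year reported by the respondent.
    reference_year: year of the collection (an assumption of this sketch).
    """
    if year_of_arrival > reference_year:
        raise ValueError("year of arrival cannot be after the reference year")
    return reference_year - year_of_arrival


# Hypothetical respondent who arrived in 2010, collection held in 2021.
length_of_residence = years_since_arrival(2010, 2021)
```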

Collection methods

Collection methodology

Collections can be administered using several different methodologies. The questions asked and the way in which they are worded depend on the collection methodology chosen. ABS household collections generally use one of the following methodologies: 

Personal interview (PI), where a trained interviewer personally interviews each person selected; 

Any responsible adult (ARA), where a trained interviewer asks one person in the household (a usual resident aged 18 or over) for information about all persons in the household (both PI and ARA surveys can be conducted face-to-face or by telephone); or 

Self-enumeration, where questionnaires left at the household may be completed either by each person or by one person on behalf of all household members. 

Following is a guide to the relative efficiency of each methodology. 

The PI methodology enables collection of very complex and detailed information, offering great scope and flexibility in manipulating, compiling and analysing the survey data. It usually achieves high response rates and high quality data. However, the costs involved in employing, training and managing interviewers make this method of data collection expensive. Also, for more sensitive topics, respondents might not be inclined to reveal private information using this method. 

The ARA methodology is used when time and cost constraints prevent the use of personal interview. The ABS Monthly Population Survey, which administers the Labour Force Survey and one or more Supplementary Surveys to 35,000 households each month, uses the ARA methodology. A limitation of this methodology for collecting data on cultural and language diversity is the subjectivity of responses. Certain questions rely on the respondent’s opinion, or the respondent may not be aware of some of the information required for variables such as Country of Birth of Father, Country of Birth of Mother or First Language Spoken. This could result in higher non-response rates or guessing on the part of the person who is answering the questions. Cultural diversity data collected by this method may therefore be less precise, less detailed and may support fewer analytical applications than data collected by personal interview. 

Self-enumeration questionnaires such as those used in the Census of Population and Housing must be simple and self-explanatory. The complex sequencing and detailed questions often used in interviewer-administered questionnaires are not feasible in self-enumerated collections. The data quality is dependent on the respondent reading and understanding all of the instructions without the prompts and assistance that an interviewer can provide. This is the least expensive method of data collection, although information obtained from such an enumeration strategy is less precise. 

Question modules

A question module is a set of questions, with response categories and associated sequence guides, designed to collect data for the measurement of a particular variable or group of related variables. More than one question module may be developed for a topic (such as cultural and language diversity) and each of the modules is designed for a specific purpose. 

When designing the question module, there are several factors which need to be taken into consideration. These include: 

  • the collection methodology (how the questionnaire will be administered); 
  • analytical requirements (the information you want to obtain from the data); 
  • time, space and cost constraints; and 
  • provider or respondent load. 

Two question modules have been developed for the measurement of cultural and language diversity. 

  • The Question Module for the Minimum Core Set of Cultural and Language Indicators comprises four questions. This is the minimum question module recommended for obtaining data on cultural or language diversity. 
  • The Question Module for the Standard Set of Cultural and Language Indicators contains all of the standard questions which can be used to measure cultural and language diversity. 

It is not necessary to use the full set of questions in any one questionnaire. Rather, a combination of variables from the Standard Set, chosen on the basis of the factors itemised above, can be added to the Minimum Core Set of questions to create a tailored question module. 

The content of the question module chosen depends on how the data will be used, the collection methodology, and competing claims for interviewer time and space on the questionnaire. As far as possible, conceptual and operational consistency is maintained across all of the modules in order to maximise comparability. Inevitably, however, data collected using different question modules will have some limitations in terms of comparability due to differences in the degree of precision and the level of detail collected. 

Question modules are displayed below with suggested sequencing. 

Most cultural and language variables contain two question options: at least one detailed data question and a minimum data question.  There are usually two forms of detailed question: one that contains a tick box list and an ‘Other—please specify’ response category, and a short form that allows the respondent to write in their response. This second form of detailed question has higher coding costs than the tick box option but elicits more complete information. The question used will depend upon the type of information required from each question. For minimum, short and detailed question modules see the individual standards for each cultural and language variable.

Minimum core set of cultural and language indicators

The following question module is used to collect data on the Minimum Core Set of variables for the measurement of cultural and language diversity. This is the minimum set of questions which should be used when measuring cultural and language diversity.

 

  • Q1. In which country [were you] [was the person] [was (name)] born? 
  • Q2. [Do you] [does the person] [does (name)] [will (name of child under two years)] speak a language other than English at home?
  • Q3. How well [do you] [does the person] speak English? 
  • Note: Question 3 requires a modification when being asked in an interview situation. When collecting data via interview, the following question should be used: 
  • Q3. Do you consider [you speak] [(name) speaks] English very well, well, or not well?
  • Q4. [Are you] [Is the person] [Is (name)] of Aboriginal or Torres Strait Islander origin?

Standard set of cultural and language indicators

The following question module is used to collect data on the full Standard Set of variables for the measurement of cultural and language diversity. The questions are presented in the suggested order with sequencing instructions included. However, it is recognised that all questions would rarely be required in any one collection. This is especially true for the questions relating to languages, where it is likely that collections would require only one or two language variables rather than all four. It is suggested that the variable or combination of variables which best suits information needs be used.

 

  • Q1. In which country [were you] [was the person] [was (name)] born? 
  • Q2. In what year did [you] [the person] [(name)] first arrive in Australia to live here for one year or more? 
  • Note: Any of Questions 3, 5 or 6 can be used as a filter for Question 4. See Concepts section above for further information on the relative efficiencies of each language variable. When using the full set, it is unlikely that all language questions would be needed. Rather the variable, or combinations of variables, that best suit the needs of the collection should be used.
  • Q3. [Do you] [does the person] [does (name)] [will (name of child under two years)] speak a language other than English at home?
  • Q4. How well [do you] [does the person] speak English? 
  • AND/OR
  • Q5. Which language [did you] [did the person] [did (name)] [will (name of child under two years)] first speak as a child?
  • AND/OR
  • Q6. Which language or languages [do you] [does the person] [does (name)] [will (name of child under two years)] speak at home?
  • Q7. Which language [do you] [does the person] [does (name)] [will (name of child under two years)] mainly speak at home?
  • Q8. [Are you] [Is the person] [Is (name)] of Aboriginal or Torres Strait Islander origin? 
  • Q9. In which country was [your] [the person’s] [(name)’s] mother born?
  • Q10. In which country was [your] [the person’s] [(name)’s] father born?
  • Note: If only one question on birthplace of parents can be accommodated, Country of Birth of Mother (Q9) should be collected. 
  • Q11. What is [your] [the person’s] [(name)’s] ancestry?
  • Q12. What is [your] [the person’s] [(name)’s] religion? 

Example of a tailored question module

The following question module is an example of tailoring the set of cultural and language diversity variables to suit particular data requirements. In addition to the questions contained in the Minimum Core Set, it contains Year of Arrival in Australia, Country of Birth of Mother and First Language Spoken.

Year of Arrival in Australia is a useful addition to the Minimum Core Set as it can be used as an indication of the respondent’s likelihood of having adapted to Australian society. Users interested in determining which individuals or community groups have the most difficulty, or take the longest time to adapt to Australian society, may wish to include this question in their question module.

The combination of First Language Spoken and Main Language Other Than English Spoken at Home provides a good measure of changes in language usage over time. Users interested in identifying those communities or individuals that tend to retain their community language may be interested in the inclusion of both of these language variables. This information can be used to assist in identifying individuals and community groups that may require particular services. Either Main Language Other Than English Spoken at Home or First Language Spoken can be used as a filter for Proficiency in Spoken English.

Country of Birth of Mother can be used in conjunction with the Minimum Core Set to provide a finer measure of ethnic or cultural identity (than does Country of Birth of Person used alone) and to give an indication of the retention of a person’s mother’s culture and language. Users with an interest in broad measures of cultural diversity may wish to include this variable in their question module.

Q1. In which country was the person born?
Australia > Go to Q3
England 
New Zealand 
Italy 
Viet Nam 
Scotland 
Greece 
Germany 
Philippines 
Netherlands 
Other—please specify:  
Q2. In what year did the person first arrive in Australia to live here for one year or more?
(Write in the calendar year of arrival or mark the box if here less than one year.) 
  
Will be here less than one year
Q3. Does the person speak a language other than English at home?
(If more than one language, indicate the one that is spoken most often.) 
No, English only
Yes, Italian
Yes, Greek
Yes, Cantonese
Yes, Mandarin
Yes, Arabic
Yes, Vietnamese
Yes, other—please specify: 
Q4. Which language did the person first speak as a child?
(Mark only one box.)  
English > Go to Q6
Italian 
Greek 
Cantonese 
Mandarin 
Arabic 
Vietnamese 
German 
Spanish 
Tagalog (Filipino) 
Other—please specify:  
Q5. How well does the person speak English?
Very well
Well
Not well
Not at all
Q6. Is the person of Aboriginal or Torres Strait Islander origin?
(For persons of both Aboriginal and Torres Strait Islander origin, mark both ‘Yes’ boxes.) 
No
Yes, Aboriginal
Yes, Torres Strait Islander
Q7. In which country was the person’s mother born?
Australia
Other country
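
The sequence guides in the example module above (‘Go to Q3’ and ‘Go to Q6’) can be read as simple routing rules. The following Python sketch is one possible way to express them. The names SKIPS, QUESTION_ORDER, next_question and questions_asked are hypothetical and do not come from any ABS system; responses are matched as plain strings for simplicity.

```python
# Sequence guides from the example module: (question, response) -> next question.
SKIPS = {
    ("Q1", "Australia"): "Q3",  # born in Australia: skip Year of Arrival
    ("Q4", "English"): "Q6",    # first language English: skip Proficiency in Spoken English
}
QUESTION_ORDER = ["Q1", "Q2", "Q3", "Q4", "Q5", "Q6", "Q7"]


def next_question(current, response):
    """Return the next question to ask, or None after the last question."""
    if (current, response) in SKIPS:
        return SKIPS[(current, response)]
    position = QUESTION_ORDER.index(current)
    if position + 1 < len(QUESTION_ORDER):
        return QUESTION_ORDER[position + 1]
    return None


def questions_asked(responses):
    """List the questions one respondent is asked, given their responses.

    responses maps a question id to the answer given (missing answers
    are treated as None and never trigger a skip).
    """
    asked = []
    current = "Q1"
    while current is not None:
        asked.append(current)
        current = next_question(current, responses.get(current))
    return asked


# Hypothetical respondents.
australian_born = questions_asked({"Q1": "Australia", "Q4": "English"})
overseas_born = questions_asked({"Q1": "Italy", "Q4": "Italian"})
```

In this sketch First Language Spoken (Q4) acts as the filter for Proficiency in Spoken English (Q5), as in the example module. A module that filters on Main Language Other Than English Spoken at Home instead would need a different entry in SKIPS.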

Use of statistical standards

Definition of a statistical standard

Many different standards are used in statistics, such as computing, editing, publication, form design and data quality standards. This publication is concerned with standards which facilitate the collection, aggregation and output of cultural and language data. A statistical standard in this context can be defined as a set of components which, when used together, produce consistent and high-quality statistical output (about the concepts which underpin the cultural and language variables) across collections and over time. Statistical standards are the approved rules we apply in the development, collection, processing and dissemination of official statistics.

A statistical standard includes many components which specify standard practice at any point in the cycle of data collection, processing, and dissemination. The essential set of components that comprise and specify a statistical standard for a variable are as follows:

  • Standard Name of the Variable;
  • Standard Definition of the Variable;
  • Standard Question(s);
  • Standard Classification;
  • Standard Coding Procedures; and
  • Standard Output Categories.

This publication provides summary information about each of these essential components. A full specification of each standard can be found in the links provided in the "Collecting cultural and language indicator data" section.

Advantages of using statistical standards

Statistical standards are designed to add quality to the data produced by improving accuracy, reliability, relevance and timeliness. ABS standards are finalised following thorough and rigorous development of concepts, definitions, questions, classifications and processing and dissemination procedures. This rigorous development process is combined with consideration of practical issues such as statistical feasibility and extensive consultation with users of the standard and of the resulting data. This thorough development work leads to long term cost savings by providing an ‘off the shelf solution’ for most data collections.

Standards improve data comparability and comprehensibility. Because ABS standards are designed to harmonise, as far as is possible, with established Australian and international practices, this comparability may apply internationally, across collections, across time, across agencies and within a given subject area. Comprehensibility means the ability to be understood, by users of the standard and by respondents to administrative and statistical collections. It involves clarity of definitions, realism in the sense of modelling the real world, and providing a logical and coherent structure for collecting information.

The widespread use of standards will also provide an integrated statistical picture of Australian society. They facilitate the process of drawing together all the data about a particular topic, variable or population, from the full range of statistical sources, in a meaningful and useful way.

Maintenance of statistical standards

Users of the statistical standards should be aware that the standards are subject to an ongoing testing and maintenance program and may change as a result. In particular, the questions are reviewed regularly and the question response categories are updated on the basis of the latest available data.

The ABS will provide advice on the use and appropriate implementation of the standard variables. Variations from the standards are not encouraged and should only be formulated and used in consultation with the ABS.

Glossary

Administrative collection (data)

This refers to data obtained from existing records and documents such as: hospital admissions forms; employment records; births, deaths and marriages records; trade union membership records, etc. Much social and economic research can be done using such existing data without the need for costly and time consuming surveys. 

Any responsible adult (ARA) interview

This method of data collection involves asking questions of the first responsible adult (usual resident aged 18 or over) contacted by the interviewer. This person is required to answer the questions on behalf of either a single member or all members of the household. This person is generally related to the people/person they are answering for or knows them very well. However, this method may result in lower response rates and data quality compared to a personal interview, where the respondent is directly questioned, as the ARA may not know all details about the respondent or may answer the questions subjectively.

Classification

A classification can be defined as a set of categories structured in a way that allows each unit in a population to be assigned unequivocally to only one category in the set according to characteristics of the unit. 

Categories in a classification must be mutually exclusive and jointly exhaustive of the population of units under consideration. Classifications must also be useful (e.g. relevant) and realistic. 

Classifications often have hierarchical structures with two, three, four or more hierarchical levels. A classification with only one level is termed non-hierarchical or flat. There may need to be rules to ensure categories are not too small, nor too large. There will always be a conflict between keeping a classification up to date with reality and maintaining unbroken time series. 

A standard classification can generally be applied to a number of different variables. For example, the Australian Standard Classification of Languages (ASCL) can be used to classify data collected by the four language variables: Main Language Other Than English Spoken at Home, First Language Spoken, Main Language Spoken at Home, and Languages Spoken at Home.
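
The requirement that categories be mutually exclusive and jointly exhaustive can be checked mechanically. The Python sketch below is an illustration under simple assumptions: each category is represented as a set of unit identifiers, and the function name, category labels and identifiers are all invented.

```python
def check_classification(categories, units):
    """Check a candidate classification against a population of units.

    categories: dict mapping category name -> set of unit identifiers.
    units: iterable of every unit identifier in the population.
    Returns (overlaps, unassigned): units placed in more than one
    category, and units placed in no category.
    """
    assigned = set()
    overlaps = set()
    for members in categories.values():
        overlaps |= assigned & members  # already seen in an earlier category
        assigned |= members
    unassigned = set(units) - assigned
    return overlaps, unassigned


# Hypothetical example: p2 is in two categories and p4 is in none.
categories = {
    "English only": {"p1", "p2"},
    "Other language": {"p2", "p3"},
}
overlaps, unassigned = check_classification(categories, ["p1", "p2", "p3", "p4"])
```

A classification satisfies both conditions for the given population only when both returned sets are empty.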

Coding index

A coding index enables descriptive information such as the responses to one or more survey questions to be coded to the appropriate category in a classification. The coding index is generally a comprehensive, alphabetical list of all the probable responses, usually derived from previous survey responses, together with the appropriate numerical code. 
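
In its simplest form a coding index behaves like a lookup table from response text to a classification code. The following Python sketch assumes exactly that. The codes shown are placeholders invented for illustration and are not ASCL codes, and the names CODING_INDEX, NOT_CODED and code_response are hypothetical.

```python
# Placeholder index: probable responses (lower case) -> invented codes.
CODING_INDEX = {
    "english": "C001",
    "italian": "C002",
    "italiano": "C002",  # a variant response coded to the same category
    "greek": "C003",
}
NOT_CODED = "C000"  # placeholder for responses not found in the index


def code_response(response):
    """Return the code for a written response, or NOT_CODED if unknown."""
    # Normalise case and surrounding or repeated whitespace before lookup.
    key = " ".join(response.lower().split())
    return CODING_INDEX.get(key, NOT_CODED)


coded = [code_response(r) for r in ["Italian", "  italiano ", "GREEK", "xyz"]]
```

Responses that fall through to the placeholder code would, in practice, need to be resolved under the standard coding procedures for the variable concerned.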

Coding procedures

Coding procedures are the rules whereby data collected for a variable (and related question) are assigned to categories of the classification. This coding is generally facilitated by a coding index. Standard coding procedures promote consistency across collections and over time.

Indicator

An indicator is an attribute of a person or set of attributes of a population (defined and measured by a statistical variable) that indicates possible aspects of their situation in Australian society. For instance, the variables Income and Occupation can be used as indicators of socioeconomic status. Similarly, Country of Birth of Person and Main Language Spoken at Home can be used as indicators of a likely connection to a particular cultural or ethnic group. If Proficiency in Spoken English is added to the indicators, broad conclusions about socioeconomic status and the possible degree of advantage or disadvantage in Australian society can also be drawn. 

Whereas a variable is an objective measure of some attribute of a person (or group of people), an indicator uses the information supplied by a variable to indicate or imply, to a greater or lesser extent, a possibility in regard to the person’s (or group of persons’) status or functioning. 

Interview-based collection

This method of data collection involves an interviewer asking questions of the respondent in a personal interview or an ARA interview. The costs involved in employing, training and managing interviewers make this method of data collection more expensive than self-enumeration. Also, for more sensitive topics, respondents might not be inclined to reveal private information using this method. Interview-based collections may be conducted face-to-face or using the telephone. 

Personal interview (PI)

Personal interviews involve an interviewer directly asking the questions of a respondent and recording the responses. Another person does not answer the questions on behalf of the respondent. This method usually achieves high response rates and high quality data. 

Question module

A question module is a set of standard questions, relating to particular variables, designed to collect a range of information. For example, the question module for the Minimum Core Set of Cultural and Language Indicators consists of four standard questions (for the variables Country of Birth of Person, Main Language Other Than English Spoken at Home, Proficiency in Spoken English and Indigenous Status), and is designed to collect fundamental information about a person’s cultural and language diversity. 

Self-enumerated collection

Self-enumerated collections are those in which respondents are required to complete the survey or administrative questions for themselves. Respondents are usually more willing to supply personal or sensitive data with this method than with the personal interview method. However, there may be some confusion and misinterpretation of questions unless appropriate instructions are included. All question wording, instructions and sequencing of responses must be simple and self-explanatory. 

Statistical standard for a variable

A statistical standard for a variable is defined as: 

‘… a set of components which, when used together, produce consistent and high quality statistical output (about concepts which underpin the cultural and language variables) across collections and over time.’ 

The essential set of components that comprise and specify a statistical standard for a variable are as follows: 

  • Standard Name of the Variable; 
  • Standard Definition of the Variable; 
  • Standard Question(s); 
  • Standard Classification; 
  • Standard Coding Procedures; and 
  • Standard Output Categories. 


Statistical standards are designed to add quality to the data produced by improving the accuracy, reliability, relevance and timeliness of data. Statistical standards involve thorough and rigorous development of concepts, definitions, questions, classifications, and processing and dissemination procedures. This rigorous development process is combined with consideration of practical issues such as statistical feasibility and extensive consultation with users of the standard and of the resulting data, and usually leads to long term cost savings by providing an ‘off the shelf solution’ for most statistical collections. 

Standards improve data comparability and comprehensibility. As ABS standards are designed to harmonise as far as possible with established Australian and international practices, comparability applies internationally, across collections, across time, across agencies and within a given subject area. Comprehensibility means the ability to be understood, by users of the standard and by respondents to administrative and statistical collections. It involves clarity of definitions, realism in the sense of modelling the real world, and providing logical and coherent structure. Standards also provide an integrated statistical picture of society and the economy. It means drawing all the data about a particular topic, variable or population together—in a meaningful way—from the full range of statistical sources. 

Variable

A variable, in simple terms, is a concept which can be measured. More precisely, for ABS purposes, a variable is an attribute of a statistical unit whose value can, in theory, be measured in a statistical collection. In this context (i.e. for the Standard Set of Cultural and Language Indicators), each (statistical) variable names and defines a particular attribute relating to the origins of each person questioned. The variable also describes and itemises the set of attributes of a population of people. It is called a variable because the defined attributes may vary from person to person. It is a ‘statistical’ variable because it is used to collect, classify, and aggregate responses relating to a defined attribute of each person in a statistical collection. 

For instance, First Language Spoken is a statistical variable which attributes a defined element of language usage to each person (the first language they spoke as a child) and itemises the total range of languages first spoken by a population of people. Main Language Spoken at Home and Main Language Other Than English Spoken at Home are other language variables. Language itself is not used as a statistical variable because it does not closely define a particular aspect of language usage or refer to a particular entity being counted, and so is not statistically useful. All of the variables recommended in the Standard Set of Cultural and Language Indicators relate to people, but it should be noted that statistical variables can also relate to families, households, activities, businesses, etc., as well as to people. 
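The way a statistical variable is used to collect, classify and aggregate responses can be sketched in a few lines. This is a minimal illustration only: the code list `CODE_LIST`, the residual code `UNCODED` and the function names are hypothetical, and the codes shown are placeholders, not codes from the Australian Standard Classification of Languages.

```python
from collections import Counter

# Hypothetical code list for a First Language Spoken variable.
# These codes are placeholders, NOT real ASCL codes.
CODE_LIST = {"english": "A1", "greek": "B2", "vietnamese": "C3"}
UNCODED = "ZZ"  # assumed residual code for blank or unrecognised responses


def code_response(raw: str) -> str:
    """Classify one raw response by looking it up in the code list."""
    key = raw.strip().lower()
    return CODE_LIST.get(key, UNCODED)


def aggregate(responses: list[str]) -> Counter:
    """Count how many people fall into each coded category."""
    return Counter(code_response(r) for r in responses)


# One response per person: the attribute varies from person to person.
responses = ["English", " greek", "Vietnamese", "english", ""]
counts = aggregate(responses)

print(counts["A1"])  # 2
print(counts["ZZ"])  # 1
```

Each person contributes exactly one value of the defined attribute, and the counts itemise the range of values across the population, which is what distinguishes a statistical variable from the broader concept of language.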

Abbreviations

 
ABS       Australian Bureau of Statistics
ARA       any responsible adult
ASCL      Australian Standard Classification of Languages
CLIP      Cultural and Language Indicators Pilot Study
COMIMA    Council of Ministers of Immigration and Multicultural Affairs
DIMA      Department of Immigration and Multicultural Affairs
IDC       Commonwealth Interdepartmental Committee on Multicultural Affairs
MESC      main English-speaking country
NESB      non-English speaking background
PI        personal interview
SCIMA     Standing Committee of Immigration and Multicultural Affairs
SSCLD     Standards for Statistics on Cultural and Language Diversity

 

Previous catalogue number

This release previously used catalogue number 1289.0

Post release changes

22/02/2022 - The standards have not been changed, but the text and the publication format have been streamlined for readability and discoverability on the ABS website. 
