1331.0 - Statistics - A Powerful Edge!, 1996

ARCHIVED ISSUE Released at 11:30 AM (CANBERRA TIME) 31/07/1998

SAMPLING METHODS

If you survey every person or a whole set of units in a population you are taking a census. However, this method is often impracticable, as it can be very costly in terms of time and money. For example, a survey that asks complicated questions may need to use trained interviewers to ensure questions are understood. This may be too expensive if every person in the population is to be included.

Sometimes taking a census can be impossible. For example, a car manufacturer might want to test the strength of cars being produced. Obviously, each car could not be crash tested to determine its strength!

To overcome these problems, samples are taken from populations, and estimates are made about the total population based on information derived from the sample. A sample must be large enough to give a good representation of the population, but small enough to be manageable. In this section the two major types of sampling, random and non-random, will be examined.

RANDOM SAMPLING

In random sampling, all items have some chance of selection that can be calculated. Random sampling techniques ensure that bias is not introduced regarding who is included in the survey. Five common random sampling techniques are:

simple random sampling,
systematic sampling,
stratified sampling,
cluster sampling, and
multi-stage sampling.

SIMPLE RANDOM SAMPLING

With simple random sampling, each item in a population has an equal chance of inclusion in the sample. For example, each name in a telephone book could be numbered sequentially. If the sample was to include 2,000 people, then 2,000 numbers could be randomly generated by computer or numbers could be picked out of a hat. These numbers could then be matched to names in the telephone book, thereby providing a list of 2,000 people.

A Tattslotto draw is a good example of simple random sampling. A sample of 6 numbers is randomly generated from a population of 45, with each number having an equal chance of being selected.
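
The draw described above can be sketched in Python. This is a minimal illustration only: the function name `simple_random_sample` and the `seed` parameter are assumptions introduced here, not part of the original text.

```python
import random

def simple_random_sample(population, sample_size, seed=None):
    """Draw sample_size distinct items, each equally likely to be chosen."""
    rng = random.Random(seed)  # the seed is only there to make a run repeatable
    return rng.sample(population, sample_size)

# Tattslotto-style draw: 6 distinct numbers from a population of 45.
lotto_numbers = list(range(1, 46))
draw = simple_random_sample(lotto_numbers, 6, seed=1)
print(sorted(draw))
```

The same function would serve for the telephone book example: number the names sequentially and pass that list in with a sample size of 2,000.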

The advantage of simple random sampling is that it is simple and easy to apply when small populations are involved. However, because every person or item in a population has to be listed before the corresponding random numbers can be read, this method is very cumbersome to use for large populations.

SYSTEMATIC SAMPLING

Systematic sampling, sometimes called interval sampling, means that there is a gap, or interval, between each selection. This method is often used in industry, where an item is selected for testing from a production line (say, every fifteen minutes) to ensure that machines and equipment are working to specification.

Alternatively, the manufacturer might decide to select every 20th item on a production line to test for defects and quality. This technique requires the first item to be selected at random as a starting point for testing and, thereafter, every 20th item is chosen.

This technique could also be used when questioning people in a sample survey. A market researcher might select every 10th person who enters a particular store, after selecting a person at random as a starting point; or interview occupants of every 5th house in a street, after selecting a house at random as a starting point.

It may be that a researcher wants to select a fixed size sample. In this case, it is first necessary to know the whole population size from which the sample is being selected. The appropriate sampling interval, I, is then calculated by dividing population size, N, by required sample size, n, as follows:
I = N/n

EXAMPLE

If a systematic sample of 500 students were to be carried out in a university with an enrolled population of 10,000, the sampling interval would be:
I = N/n = 10,000/500 = 20

Note: if I is not a whole number, then it is rounded to the nearest whole number.

All students would be assigned sequential numbers. The starting point would be chosen by selecting a random number between 1 and 20. If this number was 9, then the 9th student on the list of students would be selected along with every following 20th student. The sample of students would be those corresponding to student numbers 9, 29, 49, 69, ..., 9929, 9949, 9969 and 9989.
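
The university example can be reproduced with a short Python sketch. The function name `systematic_sample` and its parameters are assumptions made for illustration. Note also that Python's `round` resolves exact halves to the nearest even number, which may differ from the rounding convention intended in the note above.

```python
import random

def systematic_sample(population_size, sample_size, start=None, seed=None):
    """Return the selected positions (1-based) for a systematic sample."""
    # Sampling interval I = N / n, rounded to a whole number.
    interval = round(population_size / sample_size)
    if start is None:
        # Random starting point between 1 and I inclusive.
        start = random.Random(seed).randint(1, interval)
    # Take the starting unit and every I-th unit after it.
    return list(range(start, population_size + 1, interval))

# 10,000 enrolled students, sample of 500, random start assumed to be 9.
selected = systematic_sample(10_000, 500, start=9)
print(len(selected))                 # 500
print(selected[:4], selected[-4:])   # [9, 29, 49, 69] [9929, 9949, 9969, 9989]
```

When N/n is not a whole number, the rounded interval means the number of units selected can differ slightly from the required sample size.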

The advantage of systematic sampling is that it is simpler to select one random number and then every ‘Ith’ (e.g. 20th) member on the list, than to select as many random numbers as the sample size. It also gives a good spread right across the population. A disadvantage is that you may need a list to start with, if you wish to know your sample size and calculate your sampling interval.

STRATIFIED SAMPLING

A general problem with random sampling is that you could, by chance, miss out a particular group in the sample. However, if you form the population into groups, and sample from each group, you can make sure the sample is representative.

In stratified sampling, the population is divided into groups called strata. A sample is then drawn from within these strata. Some examples of strata commonly used by the ABS are States, Age and Sex. Other strata may be religion, academic ability or marital status.

EXAMPLE

The committee of a school of 1,000 students wishes to assess any reaction to the re-introduction of Pastoral Care into the school timetable. To ensure a representative sample of students from all year levels, the committee uses the stratified sampling technique.

In this case the strata are the year levels. Within each stratum the committee selects a sample. So, in a sample of 100 students, all year levels would be included. The students in the sample would be selected using simple random sampling or systematic sampling within each stratum.
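
A minimal Python sketch of this example follows. The enrolment figures per year level are hypothetical (the text gives only the total of 1,000), and selecting the same proportion from every year level is an assumption; the text does not say how the 100 places are divided among strata.

```python
import random

# Hypothetical enrolments per year level (not from the text); they sum to 1,000.
enrolments = {"Year 7": 200, "Year 8": 190, "Year 9": 180,
              "Year 10": 170, "Year 11": 140, "Year 12": 120}
total_sample = 100
rng = random.Random(0)  # fixed seed so the run is repeatable

population = sum(enrolments.values())
sample = {}
for year, size in enrolments.items():
    students = [f"{year} student {i}" for i in range(1, size + 1)]
    # Assumed proportional allocation; the division is exact for these figures.
    quota = total_sample * size // population
    # Simple random sampling within the stratum.
    sample[year] = rng.sample(students, quota)

for year, chosen in sample.items():
    print(year, len(chosen))
```

Every year level appears in the sample by construction, which is the property the committee is after.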

Stratification is most useful when the stratifying variables are simple to work with, easy to observe and closely related to the topic of the survey.

An important aspect of stratification is that it can be used to select more of one group than another. You may do this if you feel that responses are more likely to vary in one group than another. So, if you know everyone in one group has much the same value, you only need a small sample to get information for that group; whereas in another group, the values may differ widely and a bigger sample is needed.

If you want to combine group level information to get an answer for the whole population, you have to take account of what proportion you selected from each group (see ‘Bias in Estimation’).
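
To see why the selection proportions matter, consider a small worked sketch in Python. All figures are hypothetical: group A is large but lightly sampled, group B is small but heavily sampled. Pooling the sample without weights lets group B dominate; weighting each group's sample mean by its population size corrects this.

```python
# Hypothetical strata: population size, number sampled, and the sample mean.
strata = [
    {"name": "A", "population": 800, "sampled": 20, "sample_mean": 10.0},
    {"name": "B", "population": 200, "sampled": 80, "sample_mean": 30.0},
]

total_population = sum(s["population"] for s in strata)
total_sampled = sum(s["sampled"] for s in strata)

# Unweighted: treats every sampled unit alike, ignoring how it was selected.
unweighted_mean = sum(s["sampled"] * s["sample_mean"] for s in strata) / total_sampled

# Weighted: each group's mean counts in proportion to its share of the population.
weighted_mean = sum(s["population"] * s["sample_mean"] for s in strata) / total_population

print(unweighted_mean, weighted_mean)  # 26.0 14.0
```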

CLUSTER SAMPLING

It is sometimes expensive to spread your sample across the population as a whole. For example, travel can become expensive if you are using interviewers to travel between people spread all over the country. To reduce costs you may choose a cluster sampling technique.

Cluster sampling divides the population into groups, or clusters. A number of clusters are selected randomly to represent the population, and then all units within selected clusters are included in the sample. No units from non-selected clusters are included in the sample. They are represented by those from selected clusters. This differs from stratified sampling, where some units are selected from each group.

Examples of clusters may be factories, schools and geographic areas such as electoral sub-divisions. The selected clusters are then used to represent the population.

EXAMPLE

Suppose an organisation wishes to find out which sports Year 11 students are participating in across Australia. It would be too costly and take too long to survey every student, or even some students from every school. Instead, 100 schools are randomly selected from all over Australia.

These schools are considered to be clusters. Then, every Year 11 student in these 100 schools is surveyed. In effect, students in the sample of 100 schools represent all Year 11 students in Australia.
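
The selection can be sketched in Python as follows. The list of schools is invented for illustration: the total of 2,000 schools and the range of Year 11 enrolments are assumptions, not figures from the text.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical frame: 2,000 schools, each with 20 to 200 Year 11 students.
schools = {
    f"school_{i}": [f"school_{i}_student_{j}" for j in range(rng.randint(20, 200))]
    for i in range(2000)
}

# Randomly select 100 clusters (schools).
selected_schools = rng.sample(sorted(schools), 100)

# Include every Year 11 student in each selected school, and nobody else.
sample = [student for school in selected_schools for student in schools[school]]

print(len(selected_schools), "schools,", len(sample), "students")
```

Note that the random step acts on schools only; once a school is chosen, no further selection takes place within it.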

Cluster sampling has several advantages: reduced costs, simplified field work and more convenient administration. Instead of having a sample scattered over the entire coverage area, the sample is more localised in relatively few centres (clusters).

The disadvantage of cluster sampling is that, for the same sample size, results are often less accurate than for simple random sampling because sampling error is higher (see section Information - Problems with Using). In the above example, you might expect to get more accurate estimates from randomly selecting students across all schools than from randomly selecting 100 schools and taking every student in those chosen.

MULTI-STAGE SAMPLING

Multi-stage sampling is like cluster sampling, but involves selecting a sample within each chosen cluster, rather than including all units in the cluster. Thus, multi-stage sampling involves selecting a sample in at least two stages. In the first stage, large groups or clusters are selected. These clusters are designed to contain more population units than are required for the final sample.

In the second stage, population units are chosen from selected clusters to derive a final sample. If more than two stages are used, the process of choosing population units within clusters continues until the final sample is achieved.

EXAMPLE

An example of multi-stage sampling is where, firstly, electoral sub-divisions (clusters) are sampled from a city or state. Secondly, blocks of houses are selected from within the electoral sub-divisions and, thirdly, individual houses are selected from within the selected blocks of houses.
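
The three stages can be sketched in Python. The counts used here (50 sub-divisions, 40 blocks per sub-division, 25 houses per block, and the numbers selected at each stage) are hypothetical, as is the helper `houses_in`, which stands in for listing the houses of a block only once that block has been selected.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

def houses_in(subdivision, block):
    """Hypothetical lookup: list the 25 houses in one block, on demand."""
    return [(subdivision, block, house) for house in range(25)]

subdivisions = range(50)            # stage 1 units (clusters)
blocks_per_subdivision = range(40)  # stage 2 units within each sub-division

sample = []
for subdivision in rng.sample(subdivisions, 5):           # stage 1: sub-divisions
    for block in rng.sample(blocks_per_subdivision, 4):   # stage 2: blocks
        # Stage 3: individual houses within the selected block.
        sample.extend(rng.sample(houses_in(subdivision, block), 3))

print(len(sample))  # 60
```

Houses are listed for only 20 of the 2,000 blocks, which reflects the point that a complete list of the population is not needed.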

The advantages of multi-stage sampling are convenience, economy and efficiency. Multi-stage sampling does not require a complete list of members in the target population, which greatly reduces sample preparation cost. The list of members is required only for those clusters used in the final stage. The main disadvantage of multi-stage sampling is the same as for cluster sampling: lower accuracy due to higher sampling error (see section Information - Problems with Using).