Once the aims of the survey, data requirements, level of accuracy, data collection method and costs have been worked out, the next step is the sample design. The sample design specifies what the sample consists of and how it is to be obtained. It is concerned with defining the population and the frame, sample size issues, sampling techniques and the data collection method. This chapter discusses populations and frames; the other issues are covered in later chapters.

The population is the aggregate or collection of units about which the survey will be conducted. Units can be people, households, schools, hospitals, businesses, etc. A survey is concerned with two different populations: the target population, the group of units about which information is wanted, and the survey population, the units that we are able to survey. The target population is also known as the scope of the survey, the population that the survey is aimed at; the survey population is also called the coverage, the population the survey actually covers. Ideally the survey population should correspond exactly with the target population. In practice the two populations may not match, in which case conclusions based on survey data apply only to the survey population.

The frame refers to the list of units (eg, persons, households, businesses, etc) in the survey population. Since the selection of the sample is directly based on this list, the frame is one of the most important tools in the design of a survey. It determines how well a target population is covered, and affects the choice of the data collection method. It is also desirable that the frame contains auxiliary information on the units so that a more efficient sample plan can be developed (auxiliary information is discussed in Sample Design). The frame should contain contact points for each of the units listed so that it can be used to access the population. This means that for postal surveys the frame should contain postal addresses; for interviewer-based surveys the frame should contain street addresses; and for telephone surveys the frame should contain telephone numbers.
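
As a minimal sketch of the contact-point requirement, the check below flags frame units that lack the contact field a given collection mode needs. The field names (`postal_address`, `street_address`, `phone`), the mode labels and the records are assumptions made for illustration, not a standard frame layout.

```python
# Hypothetical mapping from collection mode to the contact field it needs.
REQUIRED_CONTACT_FIELD = {
    "postal": "postal_address",
    "interviewer": "street_address",
    "telephone": "phone",
}


def units_missing_contact(frame, mode):
    """Return the ids of frame units that cannot be reached by the given mode."""
    field = REQUIRED_CONTACT_FIELD[mode]
    # Empty strings and None both count as a missing contact point.
    return [unit["id"] for unit in frame if not unit.get(field)]


# Illustrative frame; all records are made up.
frame = [
    {"id": 1, "postal_address": "PO Box 10", "street_address": "1 High St", "phone": "5550 0001"},
    {"id": 2, "postal_address": "PO Box 11", "street_address": "", "phone": None},
    {"id": 3, "postal_address": "", "street_address": "3 Low Rd", "phone": "5550 0003"},
]

for mode in REQUIRED_CONTACT_FIELD:
    print(mode, units_missing_contact(frame, mode))
```

A check of this kind can help show, before a collection method is fixed, how much of the frame each method could actually reach.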

There are two types of frames that may be used for survey design: list frames and area frames.

List Frames
A list frame is a list of all of the selection units in the survey population. List frames are commonly used in surveys of businesses. Examples of list frames include administrative lists, personnel lists, telephone lists, mailing house lists, association membership lists and the electoral roll.

Area Frames
An area frame is a complete and exhaustive list of non-overlapping geographic areas. These areas may be defined by geographic features such as rivers and streets. Area frames are used when it is too expensive or complex to maintain a list frame, or where no list frame exists and it would be too expensive or complex to create one. They are usually used in household surveys, where lists of households are created only for the selected geographic areas. Examples of the geographic areas that may be used to create an area frame include Local Government Areas (LGAs), Census Collector's Districts (CDs), Postcodes and States.
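
The way an area frame avoids listing every household can be sketched as a two-stage selection: areas are selected first, and households are listed and sampled only within the selected areas. The sketch below assumes equal-probability selection at both stages; the function name `two_stage_sample`, the area labels and the household ids are hypothetical, and real designs may select areas differently (for example with probability proportional to size).

```python
import random


def two_stage_sample(areas, list_households, n_areas, n_per_area, seed=0):
    """Select areas first, then list and sample households only in those areas."""
    rng = random.Random(seed)  # fixed seed so the illustration is repeatable
    selected_areas = rng.sample(sorted(areas), n_areas)
    sample = {}
    for area in selected_areas:
        households = list_households(area)  # listing happens only for selected areas
        k = min(n_per_area, len(households))
        sample[area] = rng.sample(households, k)
    return sample


# Made-up field listings; in practice these would not exist until an area is selected.
field_listing = {
    "Area A": ["A-1", "A-2", "A-3", "A-4"],
    "Area B": ["B-1", "B-2"],
    "Area C": ["C-1", "C-2", "C-3"],
    "Area D": ["D-1", "D-2", "D-3", "D-4", "D-5"],
}
listed = []  # records which areas had to be listed


def list_households(area):
    listed.append(area)
    return field_listing[area]


sample = two_stage_sample(field_listing.keys(), list_households, n_areas=2, n_per_area=2)
print(sample)
print("areas listed:", listed)
```

Only two of the four areas are ever listed, which is the cost saving that motivates an area frame.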

For most sampling methodologies it is desirable to have a complete list of units or areas from which to select a sample. In practice, however, it can be difficult to compile such a complete list, and frame bias is introduced (ie the frame is not representative of the target population). The bias can arise in two ways: use of an inappropriate frame, which may include some out-of-scope units or exclude some in-scope units (eg choosing non-retail units from a list of businesses when conducting a survey of retail sales); and problems with the composition of the frame.

Frames can become inaccurate for many reasons:

  • most commonly, populations are subject to continuous change and the frame easily becomes out of date;
  • frames are often compiled from inadequate sources, which can make frame units hard to contact because information is missing.

Some of the problems that can occur in the composition of the frame are described below.

Missing Units
Missing units (eg exporting businesses) are units in the target population that should appear on the frame but do not. These units may have different characteristics from the units that do appear on the frame, so information obtained from the survey will not be representative of the target population. This is referred to as under-coverage and may result in bias.

Out-of-Scope (or Foreign) Units
These are units that appear on the frame but are not part of the target population (eg non-exporting businesses). These units do not contribute to the survey results but they do contribute to costs. When out-of-scope units are selected in the sample, the effective sample size is reduced, causing larger errors in survey estimates.
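
Where a reference list of the target population is available (an assumption; in practice such a list is rarely complete), missing and out-of-scope units can be identified by comparing identifiers. The following is a minimal sketch using set differences; the function name `coverage_report` and the unit ids are hypothetical.

```python
def coverage_report(target_ids, frame_ids):
    """Compare a frame with a reference list of the target population."""
    target, frame = set(target_ids), set(frame_ids)
    missing = target - frame        # in the target population but not on the frame
    out_of_scope = frame - target   # on the frame but not in the target population
    return {
        "missing": sorted(missing),
        "out_of_scope": sorted(out_of_scope),
        "undercoverage_rate": len(missing) / len(target),
        "out_of_scope_rate": len(out_of_scope) / len(frame),
    }


# Made-up identifiers for a survey of exporting businesses.
target_ids = ["B01", "B02", "B03", "B04", "B05"]
frame_ids = ["B01", "B02", "B03", "B06", "B07", "B08"]

report = coverage_report(target_ids, frame_ids)
print(report)
```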

Duplicates
Duplicates are units that appear on the frame more than once. This can be due to:
  • typographical errors in entering units onto the frame;
  • the same unit being entered under slightly different names;
  • merging of several smaller frames when the frame is created.

Duplication means that the probability of selecting each unit on the frame is no longer known. A unit may also be selected more than once in the sample, which reduces the accuracy of the results because fewer distinct units are sampled. It can also damage the professional image of the survey.
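
The sketch below illustrates both points under stated assumptions. `find_duplicates` groups frame entries on a crude normalised name (lower case, punctuation dropped), which catches slightly different spellings of the same name but not genuine typing errors. `inclusion_probability` shows how the chance of selection changes when a unit is listed more than once, assuming simple random sampling without replacement of frame entries. Both function names and all business names are hypothetical.

```python
from collections import defaultdict
from math import comb


def normalise(name):
    """Crude matching key: lower case, punctuation dropped, whitespace collapsed."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower())
    return " ".join(cleaned.split())


def find_duplicates(names):
    """Return normalised names that occur more than once, with their frame positions."""
    groups = defaultdict(list)
    for position, name in enumerate(names):
        groups[normalise(name)].append(position)
    return {key: positions for key, positions in groups.items() if len(positions) > 1}


def inclusion_probability(frame_size, sample_size, listings=1):
    """P(unit selected at least once) when it has `listings` entries on the frame."""
    # Probability that none of the unit's entries is drawn, subtracted from 1.
    return 1 - comb(frame_size - listings, sample_size) / comb(frame_size, sample_size)


names = ["Smith & Sons Pty. Ltd.", "Acme Exports", "SMITH & SONS PTY LTD", "Harbour Freight Co"]
duplicates = find_duplicates(names)
print(duplicates)

# A sample of 2 entries from a frame of 4 entries.
p_single = inclusion_probability(4, 2, listings=1)
p_double = inclusion_probability(4, 2, listings=2)
print(p_single, p_double)
```

In this made-up frame the duplicated unit is noticeably more likely to be selected than a unit listed once. Because the number of listings per unit is generally unknown until duplicates have been identified, the true selection probabilities cannot be worked out from the frame alone.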

Dead Units
A dead unit is a unit that no longer exists in the population (eg an exporting business that has folded). Deaths on the frame have a similar effect on survey results to out-of-scope units. Deaths identified in a sample survey should not be removed from the frame for future surveys, because the deaths in the sample reflect the number of deaths in the non-sampled part of the population. Retaining dead units in the sample provides an indication of the number of dead units in the survey population. Deaths can, however, be removed based on census results.
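
As a rough illustration of why sampled deaths are informative, the sketch below scales the death rate observed in a sample up to the whole frame. It assumes a simple random sample; the function name `estimate_dead_units` and the figures are made up.

```python
def estimate_dead_units(frame_size, sample_size, dead_in_sample):
    """Estimate the number of dead units on the whole frame from a simple random sample."""
    if sample_size <= 0 or not 0 <= dead_in_sample <= sample_size <= frame_size:
        raise ValueError("expected 0 <= dead_in_sample <= sample_size <= frame_size")
    # Scale the sample death count up by the inverse of the sampling fraction.
    return frame_size * dead_in_sample / sample_size


frame_size, sample_size, dead_in_sample = 2000, 200, 14  # made-up figures
estimated_dead = estimate_dead_units(frame_size, sample_size, dead_in_sample)

# Deaths expected among the 1800 frame units that were not sampled.
estimated_dead_unsampled = estimated_dead - dead_in_sample
print(estimated_dead, estimated_dead_unsampled)
```

With these figures, deleting only the 14 deaths found in the sample would leave an estimated 126 dead units on the rest of the frame, and the previously sampled units would no longer be typical of the frame as a whole.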

Nils
Whilst nils are not a frame problem, they are worth mentioning here as they can often be confused with dead units. Nils are units that are operating but have a zero return or no activity for the survey period. The zero return may be due to a seasonal factor inherent to the business (eg beachside ice-cream stalls would report zero turnover during winter). These units should not be removed from the frame, as they will report non-zero returns at a later date.

Solutions to Frame Problems
It is important to be aware that frames do have problems, so the quality of the frame should be investigated. There are various strategies that may be used if the quality of the frame is in doubt:
  • You can use the frame anyway and allow for the problems by increasing the sample size at the selection stage and by adjusting the weights at the estimation stage.
  • If time and resources allow, it may be possible to update the frame.
  • If another frame exists that also closely approximates the target population, it may be better to use the alternative frame. This situation calls for a trade-off between a frame matching the target population and a frame not quite matching the target population but providing the relevant detailed information.
  • It may also be possible to combine the frame with related frames to improve the coverage of the target population. This process is often called supplementation as the current frame is supplemented with another one. However, consideration should be given to overlapping units and any differences between the definition of a unit on the two frames.
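
The first strategy above can be sketched as follows, assuming simple random sampling and an expected out-of-scope rate known from earlier surveys. `inflated_sample_size` increases the number of units selected so that the required number of in-scope units is still expected; `estimate_in_scope_total` is one simple way of allowing for out-of-scope units at the estimation stage, by letting them contribute zero to a weighted total. The function names, the rate and the sample values are all hypothetical.

```python
from math import ceil


def inflated_sample_size(required_in_scope, expected_out_of_scope_rate):
    """Number of frame units to select so the expected in-scope sample meets the requirement."""
    if not 0 <= expected_out_of_scope_rate < 1:
        raise ValueError("rate must be in [0, 1)")
    return ceil(required_in_scope / (1 - expected_out_of_scope_rate))


def estimate_in_scope_total(frame_size, sample):
    """Weighted total for in-scope units; out-of-scope sample units contribute zero."""
    weight = frame_size / len(sample)  # design weight under simple random sampling
    return weight * sum(unit["value"] for unit in sample if unit["in_scope"])


# Selection stage: 400 in-scope responses wanted, 15% of the frame expected to be out of scope.
n_select = inflated_sample_size(400, 0.15)
print(n_select)

# Estimation stage: a made-up sample of 5 units from a frame of 1000 entries.
sample = [
    {"value": 10, "in_scope": True},
    {"value": 20, "in_scope": True},
    {"value": 0, "in_scope": False},
    {"value": 30, "in_scope": True},
    {"value": 0, "in_scope": False},
]
total = estimate_in_scope_total(1000, sample)
print(total)
```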

Since the frame provides the means of accessing the population to obtain a sample, consideration should be given to the quality of the frame. Frames should be evaluated early in the planning stage, since a bad frame will have an effect on the estimates produced at the end.

A good frame is up-to-date, does not have any missing units, contains only relevant units, does not include duplicates, is accessible to frame users and contains sufficient information to uniquely identify and contact each unit.