What is stratified random sampling?
Stratified random sampling is the technique of dividing the population of interest into groups (called strata) and selecting a random sample from within each group. Stratifying helps ensure that a representative mix of units is selected from the population and that enough sample is allocated to the groups you wish to form estimates about. For instance, you may wish to stratify by geography (e.g. state) to ensure a good mix of geographical areas is selected, or to help produce estimates for different geographical areas. For more information on stratification, refer to the Basic Survey Design manual on the ABS website.
As an example, suppose you have an allocated budget of $10,000 to determine how satisfied your customers are. You have decided to run a survey and you want to produce estimates for large, medium and small size business customers. Your database shows that you have 10,000 large business customers, 20,000 medium business customers, and 8,000 small business customers. Last year's survey determined:
- 40% of large size businesses were satisfied
- 50% of medium size businesses were satisfied
- 60% of small size businesses were satisfied
- 49% satisfaction across all businesses
You also decide you want your margin of error for each group to be plus or minus 3 percentage points, with 95% confidence. This means that if the survey were repeated many times, about 95% of the resulting confidence intervals would cover the true value of customer satisfaction.
To determine the total sample size required, you need to enter details into the sample size calculator for each stratum one at a time.
Large business stratum calculation
Sample required for this stratum is 930.
Medium business stratum calculation
Sample required for this stratum is 1,014.
Small business stratum calculation
Sample required for this stratum is 909.
Total sample size required
The total sample size required to meet the three accuracy constraints is 930 + 1,014 + 909 = 2,853.
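The stratum figures above can be reproduced with a short script. This is a minimal sketch, assuming the calculator uses the usual sample size formula for a proportion with a finite population correction and a critical value of 1.96 for 95% confidence; the function name `required_sample_size` is illustrative and not part of the calculator.

```python
import math

Z_95 = 1.96  # assumed normal critical value for 95% confidence


def required_sample_size(population, proportion, margin_of_error, z=Z_95):
    """Sample size needed to estimate a proportion within a margin of error."""
    # Sample size for an effectively infinite population.
    n0 = z**2 * proportion * (1 - proportion) / margin_of_error**2
    # Apply the finite population correction and round up to a whole unit.
    return math.ceil(n0 / (1 + n0 / population))


# (population size, last year's satisfied proportion) for each stratum
strata = {
    "large": (10_000, 0.40),
    "medium": (20_000, 0.50),
    "small": (8_000, 0.60),
}

sample_sizes = {
    name: required_sample_size(pop, p, margin_of_error=0.03)
    for name, (pop, p) in strata.items()
}
total_sample = sum(sample_sizes.values())

print(sample_sizes)
print("Total:", total_sample)
```

Under these assumptions the script gives 930, 1,014 and 909 for the three strata, matching the calculator results quoted above.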
If this sample size is too large for the survey budget, the accuracy constraints could be relaxed for lower priority strata.
What is the quality of the population estimate?
The calculator displays the standard error and relative standard error (RSE) for each stratum. But what about the quality of the estimate for the whole population (i.e. small, medium, and large businesses combined)?
One way to calculate the RSE for the population estimate is to enter the total population size (38,000), the total sample size (2,853), and the proportion of satisfied businesses in the whole population (0.49) into the sample size calculator. This results in an RSE of 1.84%. Note that this RSE is only an estimate, as it does not take into account the impact of stratification.
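The approximate RSE can be checked by hand. The sketch below assumes the calculator uses the standard error of a proportion under simple random sampling with a finite population correction; the function name `srs_standard_error` is illustrative.

```python
import math


def srs_standard_error(population, sample, proportion):
    """Standard error of a proportion under simple random sampling."""
    variance = proportion * (1 - proportion) / sample
    fpc = 1 - sample / population  # finite population correction
    return math.sqrt(variance * fpc)


se_total = srs_standard_error(population=38_000, sample=2_853, proportion=0.49)
rse_total = se_total / 0.49  # RSE = standard error / estimate

print(f"SE = {se_total:.4f}, RSE = {rse_total:.2%}")
```

With these inputs the RSE comes out at about 1.84%, in line with the figure above.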
If required, a more accurate population RSE can be calculated in two steps. The first step calculates the standard error (SE) for the population estimate from the following items:
- The standard error of each stratum (SE1, SE2, SE3)
- The size of the population for each stratum (N1, N2, N3)
- The total population size (N)
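These items combine in the standard formula for the standard error of a stratified estimate, which is assumed here to be the formula intended. Each stratum's squared standard error is weighted by the square of that stratum's share of the population:

```latex
\[
SE = \sqrt{\left(\frac{N_1}{N}\right)^2 SE_1^2
         + \left(\frac{N_2}{N}\right)^2 SE_2^2
         + \left(\frac{N_3}{N}\right)^2 SE_3^2}
\]
```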
To determine the RSE for the population estimate, we now need to divide the standard error by the population estimate (49%).
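In this example, the calculation can be sketched as follows. It assumes the standard stratified combination of the stratum standard errors, and a standard error of about 0.0153 in every stratum, which is what a margin of error of 3 percentage points at 95% confidence implies (0.03 / 1.96):

```latex
\[
SE \approx \sqrt{\left(\tfrac{10{,}000}{38{,}000}\right)^2 (0.0153)^2
              + \left(\tfrac{20{,}000}{38{,}000}\right)^2 (0.0153)^2
              + \left(\tfrac{8{,}000}{38{,}000}\right)^2 (0.0153)^2}
   \approx 0.00956
\]
\[
RSE = \frac{SE}{\text{estimate}} \approx \frac{0.00956}{0.49} \approx 1.95\%
\]
```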
This means that this design would result in a population estimate with an RSE of 1.95%.
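The same result can be obtained in code from the stratum sample sizes. This is a sketch under two assumptions: each stratum's standard error is that of a proportion under simple random sampling with a finite population correction, and the strata are combined by weighting each squared standard error by the squared population share. The function name `stratum_se` is illustrative.

```python
import math


def stratum_se(population, sample, proportion):
    """Standard error of a proportion within one stratum."""
    fpc = 1 - sample / population  # finite population correction
    return math.sqrt(proportion * (1 - proportion) / sample * fpc)


# (population size, sample size, satisfied proportion) for each stratum
strata = [
    (10_000, 930, 0.40),    # large
    (20_000, 1_014, 0.50),  # medium
    (8_000, 909, 0.60),     # small
]

N = sum(pop for pop, _, _ in strata)

# Weight each stratum variance by the squared population share.
se_pop = math.sqrt(
    sum((pop / N) ** 2 * stratum_se(pop, n, p) ** 2 for pop, n, p in strata)
)
rse_pop = se_pop / 0.49  # divide by the population estimate used in the text

print(f"SE = {se_pop:.5f}, RSE = {rse_pop:.2%}")
```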
Why are the RSEs different?
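You can see from the above example that the RSE of 1.95% derived using the formula is slightly larger than the RSE of 1.84% that is obtained in the sample size calculator. This is because the formula takes the stratification into account: each stratum receives a similar sample size even though the strata differ greatly in population size, so the sample is not spread in proportion to the population.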
To show this in a different way, let's examine the case where we don't use stratification but want the same RSE of 1.95% for the total population estimate. In this case, the sample size calculator shows that a sample of only 2,554 businesses is required, which is 299 fewer than the 2,853 required when stratification is used. So although stratification achieved equal quality for each of the business groups, it gives a slightly higher population RSE than a non-stratified design with the same sample size. Put another way, without stratification you could achieve the same overall RSE of 1.95% while sampling 299 fewer businesses. This would reduce the cost of the survey, but you would have no control over the RSEs achieved for each of the business groups you are interested in.
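The unstratified sample size can be checked by inverting the standard error formula. This is a minimal sketch, assuming simple random sampling with a finite population correction; the function name `sample_size_for_rse` is illustrative and not part of the calculator.

```python
import math


def sample_size_for_rse(population, proportion, target_rse):
    """Sample size needed to reach a target RSE for a proportion."""
    target_se = target_rse * proportion  # RSE = SE / estimate
    # Sample size for an effectively infinite population.
    n0 = proportion * (1 - proportion) / target_se**2
    # Apply the finite population correction and round up to a whole unit.
    return math.ceil(n0 / (1 + n0 / population))


unstratified_n = sample_size_for_rse(
    population=38_000, proportion=0.49, target_rse=0.0195
)
saving = 2_853 - unstratified_n  # compared with the stratified design

print(unstratified_n, saving)
```

Under these assumptions the script gives a sample of 2,554 businesses, a saving of 299 on the stratified design.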
Survey managers can read more about stratification and sample allocation on the NSS website, including methods of allocating sample to strata to suit various output requirements.