Limitations of the Data
Geographic. Although the present CPS sample is a State-based design, the sample size of the CPS is sufficient to produce reliable monthly estimates at the national level only. The sample does not permit the production of reliable monthly estimates for the States. However, demographic, social, and economic detail is published annually for the census regions and divisions, all States and the District of Columbia, 50 large metropolitan areas, and selected central cities. The production of subnational labor force and unemployment estimates is discussed in more detail in chapter 4 of this bulletin.
Sources of errors in the survey estimates. There are two types of errors possible in an estimate based on a sample survey: sampling and nonsampling. The mathematical discipline of sampling theory provides methods for estimating standard errors when the probability of selection of each member of a population can be specified. The standard error, a measure of sampling variability, can be used to compute confidence intervals that indicate the range of differences from true population values that can be anticipated because only a sample of the population has been surveyed. Nonsampling errors such as response variability, response bias, and other types of bias occur in complete censuses as well as in sample surveys. In some instances, nonsampling error may be more tightly controlled in a well-conducted survey than in a census, because it is feasible to collect and process the data more skillfully. Estimation of other types of bias is one of the most difficult aspects of survey work, and adequate measures of bias often cannot be made.
Nonsampling error. The full extent of nonsampling error is unknown, but special studies have been conducted to quantify some sources of nonsampling error in the CPS. The effect of nonsampling error should be small on estimates of relative change, such as month-to-month change. Estimates of monthly levels would be more severely affected by nonsampling error.
Nonsampling errors in surveys can be attributed to many sources, including the inability to obtain information about all persons in the sample; differences in the interpretation of questions; inability or unwillingness of respondents to provide correct information; inability to recall information; errors made in collecting and processing the data; errors made in estimating values for missing data; and failure to represent all sample households and all persons within sample households (undercoverage).
The effects of some components of nonsampling error in the CPS data are reflected in the variation in some labor force measures among the rotation groups, each of which is designed to be a representative sample of the population. For example, unemployment estimates from a rotation group tend to be higher in the first and fifth months of interviewing.
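One simple way to examine variation among rotation groups is to express each group's estimate as an index relative to the average over all groups. The sketch below is illustrative only: the function name is hypothetical, the unemployment rates are invented, and it assumes eight month-in-sample groups.

```python
def rotation_group_indexes(estimates_by_month_in_sample):
    """Express each rotation group's estimate as an index (average of all groups = 100)."""
    values = estimates_by_month_in_sample.values()
    average = sum(values) / len(estimates_by_month_in_sample)
    return {
        month: 100.0 * value / average
        for month, value in estimates_by_month_in_sample.items()
    }


# Hypothetical unemployment rates (percent) by month in sample; not published CPS figures.
example_rates = {1: 6.0, 2: 5.6, 3: 5.5, 4: 5.4, 5: 5.8, 6: 5.5, 7: 5.4, 8: 5.2}
indexes = rotation_group_indexes(example_rates)

for month, index in sorted(indexes.items()):
    print(f"month in sample {month}: index {index:.1f}")
```

An index above 100 for a group means that its estimate exceeds the all-group average. In these invented figures, the first and fifth months do so.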
Undercoverage in the CPS results from missed housing units and missed persons within sample households. The noninterview adjustment procedure accounts for missed households. It also is known that the CPS undercoverage of persons varies with age, sex, race, and Hispanic ethnicity. Generally, undercoverage is greater for men than for women and greater for blacks, Hispanics, and other races than for whites. Ratio adjustment to independent age-sex-race-origin population controls, as described previously, partially corrects for the biases due to survey undercoverage. Biases still exist in the estimates to the extent that persons in missed households or missed persons in interviewed households have characteristics different from those of interviewed persons in the same age-sex-race-origin group.
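The idea behind ratio adjustment to independent population controls can be sketched in a single step: within each demographic cell, weights are scaled so that the weighted sample total matches the independent control. This is a simplification under stated assumptions, not the CPS production procedure. The cell labels, weights, and control totals are hypothetical, and each cell is adjusted once, independently of the others.

```python
from collections import defaultdict


def ratio_adjust(records, controls):
    """Scale weights so each cell's weighted total equals its independent control.

    records:  list of dicts with keys "cell" and "weight"
    controls: dict mapping cell -> independent population total
    """
    weighted_totals = defaultdict(float)
    for record in records:
        weighted_totals[record["cell"]] += record["weight"]

    adjusted = []
    for record in records:
        factor = controls[record["cell"]] / weighted_totals[record["cell"]]
        adjusted.append({**record, "weight": record["weight"] * factor})
    return adjusted


# Hypothetical sample persons and controls (illustrative values only).
sample = [
    {"cell": "men_16_24", "weight": 1500.0},
    {"cell": "men_16_24", "weight": 1700.0},
    {"cell": "women_16_24", "weight": 1600.0},
    {"cell": "women_16_24", "weight": 1650.0},
]
population_controls = {"men_16_24": 4000.0, "women_16_24": 3400.0}

adjusted_sample = ratio_adjust(sample, population_controls)
```

Here the cell for men is undercovered (3,200 weighted persons against a control of 4,000), so its weights are raised by a factor of 1.25. The adjustment corrects the cell total but cannot correct for differences in characteristics between covered and missed persons within the cell.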
The independent population estimates used in the estimation procedure may be a source of error, although, on balance, their use substantially improves the statistical reliability of many of the figures. Errors may arise in the independent population estimates because of underenumeration of certain population groups or errors in age reporting in the decennial census (which serves as the base for the estimates) or similar problems in the components of population change (mortality, immigration, and so forth) since that date.
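Sampling error. When a sample, rather than the entire population, is surveyed, estimates differ from the true population values that they represent. This difference, or sampling error, occurs by chance, and its variability is measured by the standard error of the estimate. Sample estimates from a given survey design are unbiased when an average of the estimates from all possible samples would yield, hypothetically, the true population value. In this case, the sample estimate and its standard error can be used to construct approximate confidence intervals, or ranges of values, that include the true population value with known probabilities. If the process of selecting a sample from the population were repeated many times and an estimate and its standard error were calculated for each sample, then approximately 68 percent of the intervals from one standard error below the estimate to one standard error above the estimate would include the true population value; approximately 90 percent of the intervals from 1.6 standard errors below the estimate to 1.6 standard errors above the estimate would include the true population value; and approximately 95 percent of the intervals from two standard errors below the estimate to two standard errors above the estimate would include the true population value.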
Although the estimating methods used in the CPS do not produce unbiased estimates, biases for most estimates are believed to be small enough that these confidence interval statements are approximately true.
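The repeated-sampling interpretation of a confidence interval can be illustrated with a small simulation. The sketch below assumes simple random sampling from a normal population, which is far simpler than the CPS design. The function names, population parameters, and number of trials are all hypothetical. Under normal theory, intervals of 1, 1.6, and 2 standard errors should cover the true value in roughly 68, 90, and 95 percent of samples.

```python
import random
import statistics


def confidence_interval(estimate, standard_error, multiplier=1.6):
    """Return (lower, upper) bounds: estimate plus or minus multiplier * standard error."""
    half_width = multiplier * standard_error
    return (estimate - half_width, estimate + half_width)


def simulated_coverage(multiplier, true_mean=50.0, sd=10.0, n=100, trials=2000, seed=0):
    """Share of repeated samples whose interval includes the true population mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sd) for _ in range(n)]
        estimate = statistics.fmean(sample)
        standard_error = statistics.stdev(sample) / n ** 0.5
        lower, upper = confidence_interval(estimate, standard_error, multiplier)
        if lower <= true_mean <= upper:
            hits += 1
    return hits / trials


for multiplier in (1.0, 1.6, 2.0):
    print(multiplier, simulated_coverage(multiplier))
```

Because the simulated estimator is unbiased, the observed coverage should be close to the nominal values. A biased estimator would shift the intervals and reduce coverage, which is why the size of any bias matters for the interpretation of these intervals.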
Standard error estimates are generated using replicate variance techniques. As computed, these standard error estimates reflect contributions not only from sampling error but also from some types of nonsampling error, particularly response variability. Because replicate variance techniques are somewhat cumbersome, simplified formulas called generalized variance functions (GVFs) have been developed for various types of labor force characteristics. A GVF can be used to approximate an estimate's standard error, but it indicates only the general magnitude of the standard error rather than a precise value. Standard error estimates computed using GVFs are provided in Employment and Earnings and other publications.
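The two approaches can be contrasted in a short sketch. It assumes a GVF of the form se(x) = sqrt(a*x^2 + b*x) for an estimated level x, and a generic replicate estimator that averages squared deviations of replicate estimates from the full-sample estimate. The parameter values, the scale factor, and the function names are hypothetical. Published a and b parameters, and the replication method actually used, should be taken from the survey's technical documentation.

```python
import math


def gvf_standard_error(x, a, b):
    """Approximate the standard error of an estimated level x as sqrt(a*x^2 + b*x)."""
    return math.sqrt(a * x * x + b * x)


def replicate_standard_error(full_estimate, replicate_estimates, scale=1.0):
    """Standard error from replicates: sqrt(scale / R * sum((theta_r - theta)^2)).

    The appropriate scale factor depends on the replication method; 1.0 is a placeholder.
    """
    count = len(replicate_estimates)
    sum_of_squares = sum((r - full_estimate) ** 2 for r in replicate_estimates)
    return math.sqrt(scale * sum_of_squares / count)


# Hypothetical GVF parameters and estimated level (not published values).
approximate_se = gvf_standard_error(7_000_000, a=-0.00001, b=3000.0)

# Hypothetical full-sample estimate and replicate estimates.
replicate_se = replicate_standard_error(10.0, [9.0, 11.0, 9.0, 11.0])

print(round(approximate_se), replicate_se)
```

The GVF needs only the estimate and two parameters, which is what makes it convenient for published tables. The replicate calculation requires re-estimating the statistic once per replicate.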
Last Modified Date: April 17, 2003