SOI Sampling Methodology and Data Limitations - Statistics of Income
Statistics of Income Bulletin, Summer, 2001 by Beth Kilss, Paul McMahon, Robert Wilson
Appendix
This appendix discusses typical sampling procedures used in most Statistics of Income (SOI) programs. Aspects covered briefly include sampling criteria, selection techniques, methods of estimation, and sampling variability. Some of the nonsampling error limitations of the data are also described, as well as the tabular conventions employed.
Additional information on sample design and data limitations for specific SOI studies can be found in the separate SOI reports (see pages 256-257 at the end of this Bulletin). More technical information is available, on request, by writing to the Acting Director, Statistics of Income Division N:ADC:R:S, Internal Revenue Service, P.O. Box 2608, Washington, DC 20013-2608.
Sample Criteria and Selection of Returns
Statistics compiled for the SOI studies are generally based on stratified probability samples of income tax returns or other forms filed with the Internal Revenue Service (IRS). The statistics do not reflect any changes made by the taxpayer through an amended return or by the IRS as a result of an audit. As returns are filed and processed for tax purposes, they are assigned to sampling classes (strata) based on such criteria as industry, presence or absence of a tax form or schedule, accounting period, State from which filed, and various income factors or other measures of economic size (total assets, for example, are among the criteria used for the corporation and partnership statistics). The samples are selected from each stratum over the appropriate filing periods. Thus, sample selection can continue for a given study for several calendar years: 3 for corporations, because of the incidence of fiscal (non-calendar) year reporting and extensions of filing time. Because sampling must take place before the population size is known precisely, the rates of sample selection within each stratum are fixed. This means, in practice, that both the population size and the sample size can differ from those planned. However, these factors do not compromise the validity of the estimates.
The probability of a return being designated depends on its sample class or stratum and may range from a fraction of 1 percent to 100 percent. Considerations in determining the selection probability for each stratum include the number of returns in the stratum, the diversity of returns in the stratum, and interest in the stratum as a separate subject of study. All this is subject to constraints based on the estimated processing costs or the target size of the total sample for the program.
For most SOI studies, returns are designated by computer from the IRS Master Files based on the taxpayer identification number (TIN), which is either the Social Security number (SSN) or the Employer Identification Number (EIN). A fixed and essentially random number is associated with each possible TIN. If that random number falls into a range of numbers specified for a return's sample stratum, then the return is selected and processed for the study. Otherwise, the return is counted (for estimation purposes), but not selected. In some cases, the TIN is used directly by matching specified digits of it against a predetermined list for the sample stratum. A match is required for designation.
Under either method of selection, the TINs designated from one year's sample are, for the most part, selected for the next year's sample, so that a very high proportion of the returns selected in the current year's sample are from taxpayers whose previous years' returns were included in earlier samples. This longitudinal character of the sample design improves the estimates of change from one year to the next.
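The two designation methods described above, and the year-to-year overlap they produce, can be illustrated with a minimal Python sketch. The appendix does not say how the fixed number is derived from a TIN, so the SHA-256 hash used here is only an assumed stand-in, and the function names, the two-digit ending rule, and the 5 percent rate are all hypothetical.

```python
import hashlib


def tin_to_unit_interval(tin: str) -> float:
    """Map a TIN to a fixed, essentially random number in [0, 1).

    Assumption: SHA-256 stands in for whatever transformation is actually
    used; the source only says the number is fixed for each possible TIN.
    """
    digest = hashlib.sha256(tin.encode("ascii")).digest()
    # Keep 53 bits so the result is an exact float strictly below 1.0.
    return (int.from_bytes(digest[:8], "big") >> 11) / 2**53


def is_selected_by_range(tin: str, stratum_rate: float) -> bool:
    """Designate the return if its fixed number falls in [0, stratum_rate)."""
    return tin_to_unit_interval(tin) < stratum_rate


def is_selected_by_digits(tin: str, listed_endings: set) -> bool:
    """Designate the return if its last two digits match a predetermined list.

    Hypothetical rule: the source does not say which digits are matched.
    """
    return tin[-2:] in listed_endings


# Synthetic TINs for illustration only.
tins = [f"{i:09d}" for i in range(10_000)]

# The number attached to a TIN never changes, so applying the same stratum
# rate in two different years designates exactly the same TINs.
year_1 = {t for t in tins if is_selected_by_range(t, 0.05)}
year_2 = {t for t in tins if is_selected_by_range(t, 0.05)}
print(len(year_1), year_1 == year_2)
```

In this sketch a taxpayer drops out of the sample only if the return moves to a stratum with a lower rate, which is consistent with the overlap being high "for the most part" rather than complete.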
Method of Estimation
As noted above, the probability with which a return is selected for inclusion in a sample depends on the sampling rate prescribed for the stratum in which it is classified. "Weights" are, in general, computed by dividing the count of returns filed for a given stratum by the count of sample returns for that same stratum. Weights are used to adjust for the various sampling rates used, relative to the population--the lower the rate, the larger the weight.
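As a minimal sketch of the weight computation just described, the following Python divides the count of returns filed in each stratum by the count of sample returns in that stratum. The stratum names and counts are made-up illustrative figures, not SOI data, and `stratum_weights` is a hypothetical helper.

```python
def stratum_weights(population_counts: dict, sample_counts: dict) -> dict:
    """Weight for each stratum = returns filed / sample returns."""
    weights = {}
    for stratum, filed in population_counts.items():
        sampled = sample_counts.get(stratum, 0)
        if sampled == 0:
            raise ValueError(f"no sample returns in stratum {stratum!r}")
        weights[stratum] = filed / sampled
    return weights


# Hypothetical counts: effective sampling rates of 0.1%, 5%, and 100%.
population_counts = {"small": 100_000, "medium": 5_000, "large": 200}
sample_counts = {"small": 100, "medium": 250, "large": 200}

weights = stratum_weights(population_counts, sample_counts)
print(weights)  # {'small': 1000.0, 'medium': 20.0, 'large': 1.0}
```

The stratum sampled at the lowest rate receives the largest weight, and the fully sampled stratum receives a weight of 1.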
For some studies, it is possible to improve the estimates by subdividing the original sampling classes into "post-strata," based on additional criteria or refinements of those used in the original stratification. Weights are then computed for these post-strata using additional population counts. The data on each sample return in a stratum or post-stratum are then multiplied by the corresponding weight, and the weighted data are summed to produce the published statistical totals.
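The estimation step can be sketched in Python as follows, assuming each sample return carries a label for its post-stratum and that a population count is available for every post-stratum. The field names (`post_stratum`, `income`), the labels, and all figures are hypothetical.

```python
from collections import Counter


def weighted_total(sample_returns, population_counts, key, value):
    """Estimate a population total from weighted sample returns.

    Each return's weight is the population count of its cell (stratum or
    post-stratum) divided by the number of sample returns in that cell.
    """
    sample_counts = Counter(r[key] for r in sample_returns)
    total = 0.0
    for r in sample_returns:
        cell = r[key]
        weight = population_counts[cell] / sample_counts[cell]
        total += weight * r[value]
    return total


# Hypothetical sample: stratum A has been split into post-strata A1 and A2.
sample_returns = [
    {"post_stratum": "A1", "income": 10.0},
    {"post_stratum": "A1", "income": 30.0},
    {"post_stratum": "A2", "income": 100.0},
    {"post_stratum": "B", "income": 5000.0},
]
population_counts = {"A1": 200, "A2": 50, "B": 1}

estimate = weighted_total(sample_returns, population_counts, "post_stratum", "income")
print(estimate)  # 14000.0
```

Here the A1 returns each carry a weight of 100, the A2 return a weight of 50, and the B return a weight of 1, so the estimate is 4,000 + 5,000 + 5,000.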
Sampling Variability
The particular sample used in a study is only one of a large number of possible random samples that could have been selected using the same sample design. Estimates derived from the different samples usually vary. The standard error of the estimate is a measure of the variation among the estimates from all possible samples and is used to measure the precision with which an estimate from a particular sample approximates the average result of the possible samples. The sample estimate and an estimate of its standard error permit the construction of interval estimates with prescribed confidence that this interval includes the actual population value.
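To make the idea of an interval estimate concrete, here is a minimal Python sketch that estimates a total, its standard error, and an approximate 95 percent confidence interval. The appendix gives no variance formula, so the sketch assumes the textbook estimator for stratified simple random sampling without replacement and a normal approximation (multiplier 1.96); the procedures actually used for SOI estimates may differ. All data are hypothetical.

```python
import math
from statistics import mean, variance


def stratified_total_and_se(strata):
    """Return (estimated total, estimated standard error).

    strata: list of (population_count, sample_values) pairs.
    Assumed textbook estimator: sum over strata of
        N_h**2 * (1 - n_h / N_h) * s_h**2 / n_h
    where s_h**2 is the sample variance within stratum h.
    """
    total = 0.0
    var_total = 0.0
    for n_pop, values in strata:
        n = len(values)
        total += n_pop * mean(values)
        if n < n_pop:  # a fully sampled stratum adds no sampling variance
            if n < 2:
                raise ValueError("need at least 2 sample returns per sampled stratum")
            var_total += n_pop**2 * (1 - n / n_pop) * variance(values) / n
    return total, math.sqrt(var_total)


# Hypothetical strata: one sampled at 4 in 1,000, one sampled at 100 percent.
strata = [
    (1000, [10.0, 12.0, 14.0, 16.0]),
    (3, [500.0, 700.0, 900.0]),
]

total, se = stratified_total_and_se(strata)
lower, upper = total - 1.96 * se, total + 1.96 * se
print(f"estimate={total:.0f}  se={se:.1f}  95% CI=({lower:.0f}, {upper:.0f})")
```

Only the first stratum contributes to the standard error here, because every return in the second stratum is in the sample.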



