The Accuracy of Tax Imputations: Estimating Tax Liabilities and Credits Using Linked Survey and Administrative Data (2022)

with Bruce Meyer, Grace Finley, Patrick Langetieg, Carla Medalia, Mark Payne, and Alan Plumley

In Measuring Distribution and Mobility of Income and Wealth (eds. R. Chetty, J. Friedman, J. Gornick, B. Johnson, & A. Kennickell), 459-498. NBER Book Series.

Publisher's Link | NBER Working Paper No. 28229 | Online Appendix

Abstract: This paper calculates accurate estimates of income and payroll taxes using a groundbreaking set of linked survey and administrative tax data that are part of the Comprehensive Income Dataset (CID). We compare our estimates to survey imputations produced by the Census Bureau and those generated using the TAXSIM calculator from the National Bureau of Economic Research. The administrative data include two sets of Internal Revenue Service (IRS) data: (1) a limited set of tax information for the population of individual income tax returns covering selected line items from Forms 1040, W-2, and 1099-R; and (2) an extensive set of population tax records processed by the IRS in 2011, covering nearly every line item on Form 1040 and most lines on a series of third-party information returns. We link these IRS records to the Current Population Survey Annual Social and Economic Supplement (CPS) for reference year 2010. We describe how we form tax units and estimate various types of tax liabilities and credits using these linked data, providing a roadmap for constructing accurate measures of taxes while preserving the survey family as the sharing unit for distributional analyses. We find that aggregate estimates of various tax components using the limited and extensive tax data are close to each other and much closer to public IRS tabulations than either of the imputations using survey data alone. At the individual level, the absolute errors of survey-only imputations of federal income taxes and total taxes are on average 10% and 13%, respectively, of adjusted gross income. In contrast, the limited tax data imputations yield mean absolute errors for federal income taxes and total taxes that are about 2% and 3% of adjusted gross income, respectively. For the Earned Income Tax Credit, the limited tax data imputation is off by less than $20 on average for a typical family (compared to more than $500 using either of the survey-only imputations).

Abstract: This paper is the first to examine changes in poverty over time using a comprehensive set of linked survey and administrative data, implementing recommendations of the Interagency Technical Working Group on Evaluating Alternative Measures of Poverty. Using the Comprehensive Income Dataset (CID), we correct for measurement error in survey-reported incomes, focusing on single parent families from 1995 to 2016. Our preferred estimates indicate that single parent family poverty declined by 62% over time, while it fell by only 45% using survey data alone. Moreover, survey-reported deep poverty among single parent families increased over time, while it fell using the CID.

Abstract: Recent research suggests that rates of extreme poverty, commonly defined as living on less than $2/person/day, are high and rising in the United States. We re-examine the rate of extreme poverty by linking 2011 data from the Survey of Income and Program Participation and Current Population Survey, the sources of recent extreme poverty estimates, to administrative tax and program data. Of the 3.6 million non-homeless households with survey-reported cash income below $2/person/day, we find that more than 90% are not in extreme poverty once we include in-kind transfers, replace survey reports of earnings and transfer receipt with administrative records, and account for the ownership of substantial assets. More than half of all misclassified households have incomes from the administrative data above the poverty line, and several of the largest misclassified groups appear to be at least middle class based on measures of material well-being. In contrast, the households kept from extreme poverty by in-kind transfers appear to be among the most materially deprived Americans. Nearly 80% of all misclassified households are initially categorized as extreme poor due to errors or omissions in reports of cash income. Of the households remaining in extreme poverty, 90% consist of a single individual. An implication of the low recent extreme poverty rate is that it cannot be substantially higher now due to welfare reform, as many commentators have claimed.

Abstract: Schools often have to decide between extending the length of the school year or the school day. This paper examines the effects of changes in the distribution of instructional time on eighth-grade student achievement through a methodological framework that disaggregates total yearly instructional time into separate inputs for days per year and hours per day. This study's dataset brings together nearly 900,000 student observations across eighty countries and four quadrennial testing cycles of the Trends in International Mathematics and Science Study (TIMSS) Assessments (1995–2007). I find that the positive effects of instructional time on student achievement are driven largely by the length of the school day and not by the length of the school year, with diminishing marginal returns to the former. Socioeconomically underprivileged students are most likely to realize gains from a longer school day. Furthermore, isolating the amount of instructional time spent on TIMSS-tested subjects from the rest of the school day reveals spillover effects from time spent in non-tested subjects that are especially meaningful for underprivileged students. In contrast, the effects of time spent in tested subjects are more homogeneous across student groups.

Linking Survey and Administrative Data to Measure Income, Inequality, and Mobility (2019) 

with Carla Medalia, Bruce Meyer, and Amy O'Hara

International Journal of Population Data Science, 4(1): 1-8

Publisher's Link

Winner of the Administrative Data Research Facilities (ADRF) Network 2018 Annual Conference Best Paper Award

Abstract: Income is one of the most important measures of well-being, but it is notoriously difficult to measure accurately. In the United States, income data are available from surveys, tax records, and government programs, but each of these sources has important strengths and major limitations when used alone. We link multiple data sources to develop the Comprehensive Income Dataset (CID), a prototype for a restricted micro-level dataset that combines the demographic detail of survey data with the accuracy of administrative measures. By incorporating information on nearly all taxable income, tax credits, and cash and in-kind government transfers, the CID surpasses previous efforts to provide an accurate and comprehensive measure of income for the population of United States individuals, families, and households. We also evaluate the accuracy of different income sources and imputation methods. While still in development, we envision the CID enhancing Census Bureau surveys and statistics by investigating measurement error, improving imputation methods, and augmenting surveys with the best possible estimates of income. It can also be used for policy related research, such as forecasting and simulating changes in programs and taxes. Finally, the CID has substantial advantages over other sources to analyze numerous research topics, including poverty, inequality, mobility, and the distributional consequences of government transfers and taxes.

Abstract: Many studies examine the anti-poverty effects of social insurance and means-tested transfers, relying solely on survey data with substantial errors. We improve on past work by linking administrative data from Social Security and five large means-tested transfers (SSI, SNAP, Public Assistance, the EITC, and housing assistance) to 2008-2013 Survey of Income and Program Participation data. Using the linked data, we find that Social Security cuts the poverty rate by a third – more than twice the combined effect of the five means-tested transfers. Among means-tested transfers, the EITC and SNAP are most effective. All programs except for the EITC sharply reduce deep poverty (below 50% of the poverty line), while the impact of the EITC is more pronounced at 150% of the poverty line. For the elderly, Social Security single-handedly slashes poverty by 75%, more than 20 times the combined effect of the means-tested transfers. While single parent families benefit more from the EITC, SNAP, and housing assistance, they are still relatively underserved by the safety net, with the six programs together reducing their poverty rate by only 38%. SSI, Public Assistance, and housing assistance have the highest share of benefits going to the pre-transfer poor, while the EITC has the lowest. Finally, the survey data alone provide fairly accurate estimates for the overall population at the poverty line, although they understate the effects of Social Security, SNAP, and Public Assistance. However, there are more striking differences at other income cutoffs and for specific family types. For example, the survey data yield 1) effects of SNAP and Public Assistance on near poverty that are two-thirds and one-half what the administrative data generate and 2) poverty reduction effects of SSI, Social Security, and Public Assistance that are 34-44% of what the administrative data produce for single parent families.

Working Papers

Abstract: How do administrative burdens influence enrollment in different welfare programs? Who is screened out at a given stage? This paper studies the impacts of increased administrative burdens associated with the automation of welfare caseworker assistance, leveraging a unique natural experiment in Indiana in which the IBM Corporation remotely processed applications for two-thirds of all counties. Using linked administrative records covering nearly 3 million program recipients, the results show that SNAP, TANF, and Medicaid enrollments fell by 15%, 24%, and 4% one year after automation, with these heterogeneous declines largely attributable to cross-program differences in recertification costs. Earlier-treated and higher-poverty counties experienced larger declines in welfare receipt. More needy individuals were screened out at exit while less needy individuals were screened out at entry, a novel distinction that would be missed by typical measures of targeting which focus on average changes overall. The decline in Medicaid enrollment exhibited considerable permanence after IBM's automated system was disbanded, suggesting potential long-term consequences of increased administrative burdens.

Abstract: Recipients of government transfers are disadvantaged, yet little is known about how their circumstances evolve leading up to program receipt. Using twenty-five years of survey data and administrative health records, we establish new stylized facts around enrollment in the largest safety net programs in the United States. Incomes fall prior to enrollment in every program, coinciding with declining employment, increased disability, and worse health. Spousal separations increase prior to enrollment even in programs without mechanically related eligibility requirements. These analyses provide a comprehensive and identically measured look across programs to demonstrate that households “slide” into participation through multiple pathways.

Abstract: This paper provides new estimates of poverty in the United States, showing how the bottom of the income distribution changes after correcting for misreporting of survey incomes and accurately incorporating taxes, expenses, and in-kind transfers. As part of the Comprehensive Income Dataset (CID) project, we link a wide range of administrative tax and program microdata to the Current Population Survey Annual Social and Economic Supplement for 2016. At a broad level, using better data shifts up the income distribution at every percentile in the bottom half. Starting from a baseline of survey pre-tax money income, the share of individuals with incomes below official thresholds falls by 26% after broadening the income concept and by an additional 41% after using the CID. Alternatively, poverty thresholds would have to increase by more than a third to keep poverty rates unchanged after using better data and changing the income concept. Relative poverty also falls by 61% after all adjustments. For most analyses, the corrections for measurement error are more important than the conceptual changes to income. We observe a demographic shift in the composition of the poor, with our adjustments leading to more single individuals and fewer children in poverty. Part of the explanation for the large role played by better data is that the poverty reduction effects of government programs are much larger using the CID for a number of programs, including SNAP, housing assistance, and Social Security Disability Insurance.

Race, Ethnicity, and Measurement Error (March 2024)

with Bruce Meyer and Nikolas Mittag

Prepared for NBER Volume on Race, Ethnicity, and Economic Statistics for the 21st Century (eds. R. Akee, L. Katz, & M. Loewenstein) 

Abstract: Large literatures have analyzed racial and ethnic disparities in economic outcomes and access to the safety net. For such analyses that rely on survey data, it is crucial that survey accuracy does not vary by race and ethnicity. Otherwise, the observed disparities may be confounded by differences in survey error. In this paper, we review existing studies that use linked data to assess the reporting of key programs (including SNAP, Social Security, Unemployment Insurance, TANF, Medicaid, Medicare, and private pensions) in major Census Bureau surveys, aiming to extract the evidence on differences in survey accuracy by race and ethnicity. Our key finding is a strong and robust, but previously largely unnoticed, pattern of greater measurement error for Black and Hispanic individuals and households relative to whites. As the dominant error is under-reporting for a wide variety of programs, samples, and surveys, the implication is that the safety net better supports minority groups than the survey data suggest, through higher program receipt and greater poverty reduction. We conclude that racial and ethnic minorities are inadequately served by our large household surveys and that researchers should cautiously interpret survey-based estimates of racial and ethnic differences. We briefly discuss paths forward.

Abstract: Homelessness is arguably the most extreme hardship associated with poverty in the United States, yet people experiencing homelessness are excluded from official poverty statistics and much of the extreme poverty literature. This paper provides the most detailed and accurate portrait to date of the level and persistence of material disadvantage faced by the U.S. homeless population, including the first national estimates of income, employment, and safety net participation based on administrative data. We link restricted-use microdata identifying those recorded as homeless during the 2010 Census to longitudinal tax records and administrative data on the Supplemental Nutrition Assistance Program (SNAP), Medicare, Medicaid, Disability Insurance (DI), Supplemental Security Income (SSI), veterans’ benefits, housing assistance, and mortality. We find that nearly half of these adults had formal employment in the year they were observed as homeless, one-quarter received disability assistance, and more than 85 percent were reached by at least one safety net program, primarily SNAP. Incomes are persistently low for the decade surrounding an observed period of homelessness, suggesting that homelessness tends to arise in the context of long-term, severe deprivation rather than large and sudden losses of income. As our findings illustrate, most people appear to experience homelessness because they are very poor despite being connected to the labor market and safety net, with persistently low incomes leaving them vulnerable to loss of housing when met with even modest disruptions to life circumstances.

Abstract: Individuals are often thought to be more disadvantaged in higher-cost areas. As a result, geographic adjustments for local prices are embedded in many federal payments to states, localities, and individuals and have been proposed or implemented for various poverty measures. This paper proposes a rigorous approach to assess the desirability of geographic adjustments to poverty measures by examining how well they achieve a central objective of a poverty measure: identifying the least advantaged population. Specifically, we compare an exhaustive list of material well-being indicators of those classified as poor under the Supplemental Poverty Measure and the new Comprehensive Income Poverty Measure with and without a geographic adjustment. These well-being indicators are drawn from linked survey and administrative records and include material hardships, appliances owned, home quality issues, food security, public services, health, education, assets, permanent income, and mortality. For nine of the ten domains of well-being indicators, we find that incorporating a geographic adjustment identifies a less deprived poor population. This result can be explained by local prices being sufficiently correlated with public goods and locational amenities, such that those with low incomes appear less disadvantaged in higher-cost areas.

Abstract: The replacement of the Child Tax Credit (CTC) with a child allowance has been advocated by numerous policymakers and researchers. We estimate the anti-poverty, targeting, and labor supply effects of such a change by linking survey data with administrative tax and government program data which form part of the Comprehensive Income Dataset (CID). We focus on the provisions of the 2021 Build Back Better Act, which would have increased maximum benefit amounts to $3,000 or $3,600 per child (up from $2,000 per child) and made the full credit available to all low and middle-income families regardless of earnings or income. Initially ignoring any behavioral responses, we estimate that the replacement of the CTC would reduce child poverty by 34% and deep child poverty by 39%. The change to a child allowance would have a larger anti-poverty effect on children than any existing government program, though at a higher cost per child raised above the poverty line than any other means-tested program. Relatedly, the child allowance would allocate a smaller share of its total dollars to families at the bottom of the income distribution, as well as families with the lowest levels of long-term income, education, or health, than any existing means-tested program with the exception of housing assistance. We then simulate anti-poverty effects accounting for labor supply responses. By replacing the CTC (which contained substantial work incentives akin to the EITC) with a child allowance, the policy change would reduce the return to working at all by at least $2,000 per child for most workers with children. Relying on elasticity estimates consistent with mainstream simulation models and the academic literature, we estimate that this change in policy would lead 1.5 million workers (constituting 2.6% of all working parents) to exit the labor force. The decline in employment and the consequent earnings loss would mean that child poverty would only fall by at most 22% and deep child poverty would not fall at all with the policy change.

Abstract: Using microdata covering the universe of Chicago taxi trips, this paper measures the dramatic changes in the Chicago taxi industry in the midst of the COVID-19 pandemic. The number of taxi rides and active taxi cabs fell by more than 90% between February and May 2020, with little sign of recovery in May even as economic activity began to rebound. Taxi activity shifted from commercial to more residential areas, and from evenings and rush hour to the middle of the day. These patterns suggest a relative increase in essential trips and a decrease in business and leisure travel. Moreover, taxi activity decreased the most in white and high-income areas, with the larger reductions in white areas persisting even after accounting for spatial differences in income, education, employment, and health. Finally, taxi cabs that stopped driving tended to work less prior to the pandemic, pick up more frequently in the busiest areas, and be part of medium-sized taxi companies.

Abstract: As one of the most important federal education programs, Title I accounts for approximately 30% of federal funding to all school districts. Yet, many studies have shown that Title I is associated with insignificant effects on student achievement. This paper tests one potential mechanism behind this effect: that receiving Title I funds causes state and local governments to offset funding they would otherwise have provided. I employ a regression discontinuity design that exploits discontinuities in the Title I funding formula. In particular, Title I grants are divided into four categories, and a school district is only eligible to receive a given subgrant if its poverty rate is above a given threshold. Evidence from the 2010 to 2015 school years suggests that receiving Title I grants leads to reductions in total expenditures, driven by crowding out of local revenues that becomes more pronounced in the years after gaining eligibility.

Research Notes

Pre-Doctoral Research and Policy Reports

Appendix: University of California Online Education Initiative 

In Locus of Authority (by W. Bowen & E. Tobin), Princeton, NJ: Princeton University Press. (2015)