"The Accuracy of Tax Imputations: Estimating Tax Liabilities and Credits Using Linked Survey and Administrative Data"

with Bruce Meyer, Grace Finley, Patrick Langetieg, Carla Medalia, Mark Payne, and Alan Plumley

In Measuring Distribution and Mobility of Income and Wealth (eds. R. Chetty, J. Friedman, J. Gornick, B. Johnson, & A. Kennickell), 459-498. NBER Book Series. (2022)

Publisher's Version | NBER Working Paper No. 28229 | Online Appendix

Abstract: This paper calculates accurate estimates of income and payroll taxes using a groundbreaking set of linked survey and administrative tax data that are part of the Comprehensive Income Dataset (CID). We compare our estimates to survey imputations produced by the Census Bureau and those generated using the TAXSIM calculator from the National Bureau of Economic Research. The administrative data include two sets of Internal Revenue Service (IRS) data: (1) a limited set of tax information for the population of individual income tax returns covering selected line items from Forms 1040, W-2, and 1099-R; and (2) an extensive set of population tax records processed by the IRS in 2011, covering nearly every line item on Form 1040 and most lines on a series of third-party information returns. We link these IRS records to the Current Population Survey Annual Social and Economic Supplement (CPS) for reference year 2010. We describe how we form tax units and estimate various types of tax liabilities and credits using these linked data, providing a roadmap for constructing accurate measures of taxes while preserving the survey family as the sharing unit for distributional analyses. We find that aggregate estimates of various tax components based on the limited and extensive tax data are close to each other and much closer to public IRS tabulations than either of the imputations using survey data alone. At the individual level, the absolute errors of survey-only imputations of federal income taxes and total taxes are on average 10% and 13%, respectively, of adjusted gross income. In contrast, the limited tax data imputations yield mean absolute errors for federal income taxes and total taxes that are about 2% and 3% of adjusted gross income, respectively. For the Earned Income Tax Credit, the limited tax data imputation is off by less than $20 on average for a typical family (compared to more than $500 using either of the survey-only imputations).

Abstract: This paper is the first to examine changes in poverty over time using a comprehensive set of linked survey and administrative data, implementing recommendations of the Interagency Technical Working Group on Evaluating Alternative Measures of Poverty. Using the Comprehensive Income Dataset (CID), we correct for measurement error in survey-reported incomes, focusing on single parent families from 1995 to 2016. Our preferred estimates indicate that single parent family poverty declined by 62% over time, while it fell by only 45% using survey data alone. Moreover, survey-reported deep poverty among single parent families increased over time, while it fell using the CID.

"The Use and Misuse of Income Data and Extreme Poverty in the United States"

with Bruce Meyer, Victoria Mooers, and Carla Medalia 

Journal of Labor Economics, 39(S1): S5-S58. (2021) 

Publisher's Version | NBER Working Paper No. 25907 | Online Appendix | Summary (Cato)

Coverage: The Economist, Slate, Vox 

Abstract: Recent research suggests that rates of extreme poverty, commonly defined as living on less than $2/person/day, are high and rising in the United States. We re-examine the rate of extreme poverty by linking 2011 data from the Survey of Income and Program Participation and Current Population Survey, the sources of recent extreme poverty estimates, to administrative tax and program data. Of the 3.6 million non-homeless households with survey-reported cash income below $2/person/day, we find that more than 90% are not in extreme poverty once we include in-kind transfers, replace survey reports of earnings and transfer receipt with administrative records, and account for the ownership of substantial assets. More than half of all misclassified households have incomes from the administrative data above the poverty line, and several of the largest misclassified groups appear to be at least middle class based on measures of material well-being. In contrast, the households kept from extreme poverty by in-kind transfers appear to be among the most materially deprived Americans. Nearly 80% of all misclassified households are initially categorized as extreme poor due to errors or omissions in reports of cash income. Of the households remaining in extreme poverty, 90% consist of a single individual. An implication of the low recent extreme poverty rate is that it cannot be substantially higher now due to welfare reform, as many commentators have claimed.

Abstract: Schools often have to decide between extending the length of the school year or the school day. This paper examines the effects of changes in the distribution of instructional time on eighth-grade student achievement through a methodological framework that disaggregates total yearly instructional time into separate inputs for days per year and hours per day. This study's dataset brings together nearly 900,000 student observations across eighty countries and four quadrennial testing cycles of the Trends in International Mathematics and Science Study (TIMSS) Assessments (1995–2007). I find that the positive effects of instructional time on student achievement are driven largely by the length of the school day and not by the length of the school year, with diminishing marginal returns to the former. Socioeconomically underprivileged students are most likely to realize gains from a longer school day. Furthermore, isolating the amount of instructional time spent on TIMSS-tested subjects from the rest of the school day reveals spillover effects from time spent in non-tested subjects that are especially meaningful for underprivileged students. In contrast, the effects of time spent in tested subjects are more homogeneous across student groups.

"Linking Survey and Administrative Data to Measure Income, Inequality, and Mobility"

with Carla Medalia, Bruce Meyer, and Amy O'Hara

International Journal of Population Data Science, 4(1): 1-8. (2019)

Publisher's Version

Winner of the Administrative Data Research Facilities (ADRF) Network 2018 Annual Conference Best Paper Award

Abstract: Income is one of the most important measures of well-being, but it is notoriously difficult to measure accurately. In the United States, income data are available from surveys, tax records, and government programs, but each of these sources has important strengths and major limitations when used alone. We link multiple data sources to develop the Comprehensive Income Dataset (CID), a prototype for a restricted micro-level dataset that combines the demographic detail of survey data with the accuracy of administrative measures. By incorporating information on nearly all taxable income, tax credits, and cash and in-kind government transfers, the CID surpasses previous efforts to provide an accurate and comprehensive measure of income for the population of United States individuals, families, and households. We also evaluate the accuracy of different income sources and imputation methods. While still in development, we envision the CID enhancing Census Bureau surveys and statistics by investigating measurement error, improving imputation methods, and augmenting surveys with the best possible estimates of income. It can also be used for policy related research, such as forecasting and simulating changes in programs and taxes. Finally, the CID has substantial advantages over other sources to analyze numerous research topics, including poverty, inequality, mobility, and the distributional consequences of government transfers and taxes.

Abstract: Many studies examine the anti-poverty effects of social insurance and means-tested transfers, relying solely on survey data with substantial errors. We improve on past work by linking administrative data from Social Security and five large means-tested transfers (SSI, SNAP, Public Assistance, the EITC, and housing assistance) to 2008-2013 Survey of Income and Program Participation data. Using the linked data, we find that Social Security cuts the poverty rate by a third – more than twice the combined effect of the five means-tested transfers. Among means-tested transfers, the EITC and SNAP are most effective. All programs except for the EITC sharply reduce deep poverty (below 50% of the poverty line), while the impact of the EITC is more pronounced at 150% of the poverty line. For the elderly, Social Security single-handedly slashes poverty by 75%, more than 20 times the combined effect of the means-tested transfers. While single parent families benefit more from the EITC, SNAP, and housing assistance, they are still relatively underserved by the safety net, with the six programs together reducing their poverty rate by only 38%. SSI, Public Assistance, and housing assistance have the highest share of benefits going to the pre-transfer poor, while the EITC has the lowest. Finally, the survey data alone provide fairly accurate estimates for the overall population at the poverty line, although they understate the effects of Social Security, SNAP, and Public Assistance. However, there are more striking differences at other income cutoffs and for specific family types. For example, the survey data yield 1) effects of SNAP and Public Assistance on near poverty that are two-thirds and one-half what the administrative data generate and 2) poverty reduction effects of SSI, Social Security, and Public Assistance that are 34-44% of what the administrative data produce for single parent families.

Working Papers

Abstract: How do administrative burdens influence enrollment in different welfare programs? Who is screened out at a given stage? This paper studies the impacts of increased administrative burdens associated with the automation of welfare caseworker assistance, leveraging a unique natural experiment in Indiana in which the IBM Corporation remotely processed applications for two-thirds of all counties. Using linked administrative records covering nearly 3 million program recipients, the results show that SNAP, TANF, and Medicaid enrollments fell by 15%, 24%, and 4% one year after automation, with these heterogeneous declines largely attributable to cross-program differences in recertification costs. Earlier-treated and higher-poverty counties experienced larger declines in welfare receipt. More needy individuals were screened out at exit while less needy individuals were screened out at entry, a novel distinction that would be missed by typical measures of targeting which focus on average changes overall. The decline in Medicaid enrollment exhibited considerable permanence after IBM's automated system was disbanded, suggesting potential long-term consequences of increased administrative burdens.

Abstract: Recipients of government transfers are economically disadvantaged, yet little is known about how their circumstances evolve leading up to and around program receipt. Using survey data and administrative health records, we establish three new stylized facts around enrollment in eight large safety net programs in the United States. First, market incomes decline prior to enrolling in almost all studied programs, while post-transfer incomes return to pre-enrollment levels within a year. Second, employment rates decline around program receipt and remain lower over a year later, with these patterns coinciding with increased disability and worse health. Third, spousal separations begin to increase prior to program enrollment, even for social insurance programs without mechanically related eligibility requirements. Taken together, these findings highlight the importance of considering persistent non-income shocks (e.g., to health and marital status) for all programs and suggest the need for future work to incorporate the insurance value of programs beyond their intended risk into welfare calculations.

Abstract: The replacement of the Child Tax Credit (CTC) with a child allowance has been advocated by numerous policymakers and researchers. We estimate the anti-poverty, targeting, and labor supply effects of such a change by linking survey data with administrative tax and government program data which form part of the Comprehensive Income Dataset (CID). We focus on the provisions of the 2021 Build Back Better Act, which would have increased maximum benefit amounts to $3,000 or $3,600 per child (up from $2,000 per child) and made the full credit available to all low and middle-income families regardless of earnings or income. Initially ignoring any behavioral responses, we estimate that the replacement of the CTC would reduce child poverty by 34% and deep child poverty by 39%. The change to a child allowance would have a larger anti-poverty effect on children than any existing government program, though at a higher cost per child raised above the poverty line than any other means-tested program. Relatedly, the child allowance would allocate a smaller share of its total dollars to families at the bottom of the income distribution (as well as families with the lowest levels of long-term income, education, or health) than any existing means-tested program with the exception of housing assistance. We then simulate anti-poverty effects accounting for labor supply responses. By replacing the CTC (which contained substantial work incentives akin to the EITC) with a child allowance, the policy change would reduce the return to working at all by at least $2,000 per child for most workers with children. Relying on elasticity estimates consistent with mainstream simulation models and the academic literature, we estimate that this change in policy would lead 1.5 million workers (constituting 2.6% of all working parents) to exit the labor force. The decline in employment and the consequent earnings loss would mean that child poverty would only fall by at most 22% and deep child poverty would not fall at all with the policy change.

Abstract: Geographic adjustments for local prices are embedded in many federal payments to states, localities, and individuals. Adjustments for geographic cost-of-living differences are also part of the Census Bureau’s Supplemental Poverty Measure and have been proposed for the Official Poverty Measure, yet academic work is divided as to whether or not geographic adjustments are justified. This paper proposes a rigorous approach to assess the desirability of geographic adjustments to poverty measures by examining how well they achieve a central objective of a poverty measure: identifying the least advantaged population. Specifically, we compare an exhaustive list of material well-being indicators of those classified as poor under the Supplemental Poverty Measure and the new Comprehensive Income Poverty Measure with and without a geographic adjustment. These well-being indicators are drawn from the Current Population Survey and the Survey of Income and Program Participation and include material hardships, appliances owned, home quality issues, food security, public services, health, education, assets, permanent income, and mortality. For nine of the ten domains of well-being indicators, we find that incorporating a geographic adjustment identifies a less deprived poor population. These results are broadly consistent across different poverty measures, various ways of implementing a geographic adjustment, and multiples of the poverty line. 

"Learning About Homelessness Using Linked Survey and Administrative Data" (May 2021)

with Bruce Meyer, Angela Wyse, Alexa Gruwaldt, and Carla Medalia

NBER Working Paper No. 28861

Coverage: VoxEU

Abstract: Official poverty statistics and even the extreme poverty literature largely ignore people experiencing homelessness. In this paper, we examine the characteristics, labor market attachment, geographic mobility, earnings, and safety net utilization of this population in order to understand their economic well-being. This paper is the first to examine these outcomes at the national level using administrative data on income and government program receipt. It is part of the Comprehensive Income Dataset project, which combines household survey data with administrative records to improve estimates of income and related statistics. Specifically, we use restricted microdata from the 2010 Decennial Census, which enumerates both sheltered and unsheltered homeless people, the 2006-2016 American Community Survey (ACS), which surveys sheltered homeless people, and longitudinal shelter-use data from several major U.S. cities. We link these data to longitudinal administrative tax records as well as administrative data on the Supplemental Nutrition Assistance Program (SNAP), veterans’ benefits, Medicare, Medicaid, housing assistance, and mortality. Our approach benefits from large samples that offer a guide to national homelessness patterns and allow us to compare estimates between data sources, including the Department of Housing and Urban Development (HUD)’s point-in-time (PIT) counts. By shedding light on issues of data linkage and survey coverage among homeless people, this paper contributes to efforts to better incorporate this hard-to-survey population into income and poverty estimates.

Abstract: Using microdata covering the universe of Chicago taxi trips, this paper measures the dramatic changes in the Chicago taxi industry during the midst of the COVID-19 pandemic. The number of taxi rides and active taxi cabs fell by more than 90% between February and May 2020, with little sign of recovery in May even as economic activity began to rebound. Taxi activity shifted from commercial to more residential areas, and from evenings and rush hour to the middle of the day. These patterns suggest a relative increase in essential trips and a decrease in business and leisure travel. Moreover, taxi activity decreased the most in white and high-income areas, with the larger reductions in white areas persisting even after accounting for spatial differences in income, education, employment, and health. Finally, taxi cabs that stopped driving tended to work less prior to the pandemic, pick up more frequently in the busiest areas, and be part of medium-sized taxi companies.

Abstract: As one of the most important federal education programs, Title I accounts for approximately 30% of federal funding to all school districts. Yet, many studies have shown that Title I is associated with insignificant effects on student achievement. This paper tests one potential mechanism behind this effect - that receiving Title I funds causes state and local governments to offset funding they would otherwise have provided. I employ a regression discontinuity design that exploits discontinuities in the Title I funding formula. In particular, Title I grants are divided into four categories, and a school district is only eligible to receive a given subgrant if its poverty rate is above a given threshold. Evidence from the 2010 to 2015 school years suggests that receiving Title I grants leads to reductions in total expenditures, driven by crowding out of local revenues that becomes more pronounced in the years after gaining eligibility.

Research Notes

Pre-Doctoral Research and Policy Reports

"Appendix: University of California Online Education Initiative" 

In Locus of Authority (by W. Bowen & E. Tobin), Princeton, NJ: Princeton University Press. (2015)