Yuehao Bai

I am an Assistant Professor of Economics at the University of Michigan. My research interests lie in econometric theory. I received my PhD from the University of Chicago in 2020.

## Published or forthcoming

(2021) “Inference in Experiments with Matched Pairs” (with J. P. Romano and A. M. Shaikh), forthcoming in the Journal of the American Statistical Association.

This paper studies inference for the average treatment effect in randomized controlled trials where treatment status is determined according to a “matched pairs” design. By a “matched pairs” design, we mean that units are sampled i.i.d. from the population of interest, paired according to observed, baseline covariates and finally, within each pair, one unit is selected at random for treatment. This type of design is used routinely throughout the sciences, but results about its implications for inference about the average treatment effect are not available. The main requirement underlying our analysis is that pairs are formed so that units within pairs are suitably “close” in terms of the baseline covariates, and we develop novel results to ensure that pairs are formed in a way that satisfies this condition. Under this assumption, we show that, for the problem of testing the null hypothesis that the average treatment effect equals a pre-specified value in such settings, the commonly used two-sample $t$-test and “matched pairs” $t$-test are conservative in the sense that these tests have limiting rejection probability under the null hypothesis no greater than and typically strictly less than the nominal level. We show, however, that a simple adjustment to the standard errors of these tests leads to a test that is asymptotically exact in the sense that its limiting rejection probability under the null hypothesis equals the nominal level. We also study the behavior of randomization tests that arise naturally in these types of settings. When implemented appropriately, we show that this approach also leads to a test that is asymptotically exact in the sense described previously, but additionally has finite-sample rejection probability no greater than the nominal level for certain distributions satisfying the null hypothesis. A simulation study and empirical illustration confirm the practical relevance of our theoretical results.
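The design described above can be illustrated with a minimal sketch. It assumes a single scalar baseline covariate and forms pairs by sorting on it, which is only one simple way to make units within a pair "close". The function names are hypothetical, and the statistic computed is the classic "matched pairs" $t$-statistic, not the adjusted standard error developed in the paper.

```python
import numpy as np

def matched_pairs_assignment(x, rng):
    """Pair units adjacent in the sorted covariate; treat one unit per pair at random.

    Assumption: x is a scalar covariate and the number of units is even.
    """
    x = np.asarray(x, dtype=float)
    if x.size % 2 != 0:
        raise ValueError("need an even number of units")
    order = np.argsort(x, kind="stable")
    pairs = order.reshape(-1, 2)                 # each row: the two unit indices of one pair
    coin = rng.integers(0, 2, size=len(pairs))   # which member of each pair is treated
    treated = pairs[np.arange(len(pairs)), coin]
    d = np.zeros(x.size, dtype=int)
    d[treated] = 1
    return pairs, d

def paired_t_statistic(y, d, pairs, tau0=0.0):
    """Difference-in-means estimate and the classic paired t-statistic for H0: ATE = tau0."""
    y = np.asarray(y, dtype=float)
    d = np.asarray(d)
    first_treated = d[pairs[:, 0]] == 1
    treated_idx = np.where(first_treated, pairs[:, 0], pairs[:, 1])
    control_idx = np.where(first_treated, pairs[:, 1], pairs[:, 0])
    diffs = y[treated_idx] - y[control_idx]      # within-pair treated minus control
    estimate = diffs.mean()                      # equals the difference in means
    se = diffs.std(ddof=1) / np.sqrt(len(diffs))
    return estimate, (estimate - tau0) / se

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 200
    x = rng.normal(size=n)
    pairs, d = matched_pairs_assignment(x, rng)
    y = x + 1.0 * d + rng.normal(size=n)         # simulated outcomes, true effect 1.0
    print(paired_t_statistic(y, d, pairs, tau0=1.0))
```

Because every pair contributes exactly one treated and one control unit, the mean of the within-pair differences coincides with the difference-in-means estimator.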

(2021) “A Two-Step Method for Testing Many Moment Inequalities” (with A. Santos and A. M. Shaikh), forthcoming in the Journal of Business and Economic Statistics.

This paper considers the problem of testing a finite number of moment inequalities. For this problem, Romano et al. (2014) propose a two-step testing procedure. In the first step, the procedure incorporates information about the location of moments using a confidence region. In the second step, the procedure accounts for the use of the confidence region in the first step by adjusting the significance level of the test appropriately. Its justification, however, has so far been limited to settings in which the number of moments is fixed with the sample size. In this paper, we provide weak assumptions under which the same procedure remains valid even in settings in which there are “many” moments in the sense that the number of moments grows rapidly with the sample size. We confirm the practical relevance of our theoretical guarantees in a simulation study. We additionally provide both numerical and theoretical evidence that the procedure compares favorably with the method proposed by Chernozhukov et al. (2019), which has also been shown to be valid in such settings.

(2021) “Inference for Support Vector Regression under $\ell_1$ Regularization” (with H. Ho, G. A. Pouliot, and J. K. C. Shea), AEA Papers and Proceedings, vol. 111, pp. 611-615.

We show that support vector regression (SVR) consistently estimates linear median regression functions and we develop a large sample inference method based on the inversion of a novel test statistic in order to produce error bars for SVR with $\ell_1$-norm regularization. Under a homoskedasticity assumption commonly imposed in the quantile regression literature, the procedure does not involve estimation of densities. It is thus unique amongst large sample inference methods for SVR in that it circumvents the need to select a bandwidth parameter. Simulation studies suggest that our procedure produces narrower error bars than does the standard inference method in quantile regression.
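For orientation, a common textbook way to write SVR with $\ell_1$-norm regularization is sketched below. The notation ($\epsilon$ for the insensitivity width, $\lambda$ for the penalty level) is an assumption made here for illustration and may differ from the paper's:

$$
\hat{\beta} \in \arg\min_{\beta} \; \sum_{i=1}^{n} \max\{0,\; |y_i - x_i'\beta| - \epsilon\} + \lambda \|\beta\|_1 .
$$

With $\epsilon = 0$ the loss reduces to the absolute deviation $|y_i - x_i'\beta|$, so the problem becomes $\ell_1$-penalized least absolute deviations, which is one way to see the link to median regression.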

## Working papers

(2020) “Optimality of Matched-Pair Designs in Randomized Controlled Trials,” revision requested by the American Economic Review.

This paper studies the optimality of matched-pair designs in randomized controlled trials (RCTs). Matched-pair designs are examples of stratified randomization, in which the researcher partitions a set of units into strata based on their observed covariates and assigns a fraction of units in each stratum to treatment. A matched-pair design is such a procedure with two units per stratum. Despite the prevalence of stratified randomization in RCTs, implementations differ vastly. We provide an econometric framework in which, among all stratified randomization procedures, the optimal one in terms of the mean-squared error of the difference-in-means estimator is a matched-pair design that orders units according to a scalar function of their covariates and matches adjacent units. Our framework captures a leading motivation for stratifying in the sense that it shows that the proposed matched-pair design additionally minimizes the magnitude of the ex-post bias, i.e., the bias of the estimator conditional on realized treatment status. We then consider empirical counterparts to the optimal stratification using data from pilot experiments and provide two different procedures depending on whether the sample size of the pilot is large or small. For each procedure, we develop methods for testing the null hypothesis that the average treatment effect equals a prespecified value. Each test we provide is asymptotically exact in the sense that the limiting rejection probability under the null equals the nominal level. We run an experiment on Amazon Mechanical Turk using one of the proposed procedures, replicating one of the treatment arms in DellaVigna and Pope (2018), and find that the standard error decreases by 29%, so that only half of the sample size is required to attain the same standard error.
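The step from a 29% smaller standard error to half the sample size can be checked with a short calculation, assuming (as is standard, though not stated in the abstract) that the standard error scales as $c/\sqrt{n}$ under the original design and as $0.71\,c/\sqrt{n}$ under the proposed one. The sample size $n'$ at which the proposed design matches the original standard error at $n$ solves

$$
\frac{0.71\,c}{\sqrt{n'}} = \frac{c}{\sqrt{n}}
\quad\Longrightarrow\quad
n' = 0.71^2\, n \approx 0.504\, n ,
$$

that is, roughly half of the original sample size.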

(2020) “Why Randomize? Minimax Optimality under Permutation Invariance,” revision requested by the Journal of Econometrics.

This paper studies finite sample minimax optimal randomization schemes and estimation schemes in estimating parameters including the average treatment effect, when treatment effects are heterogeneous. A randomization scheme is a distribution over a group of permutations of a given treatment assignment vector. An estimation scheme is a joint distribution over assignment vectors, linear estimators, and permutations of assignment vectors. The key element in the minimax problem is that the worst case is over a class of distributions of the data which is invariant to a group of permutations. First, I show that given any assignment vector and any estimator, the uniform distribution over the same group of permutations, namely the complete randomization scheme, is minimax optimal. Second, under further assumptions on the class of distributions and the objective function, I show the minimax optimal estimation scheme involves completely randomizing an assignment vector, while the optimal estimator is the difference-in-means under complete invariance and a weighted average of within-block differences under a block structure, and the numbers of treated and untreated units are determined by Neyman allocations.
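Two ingredients named above, complete randomization and Neyman allocation, can be sketched as follows. This is an illustration under stated assumptions, not the paper's procedure: the outcome standard deviations are treated as known, the allocation uses the textbook rule that the ratio of treated to untreated units equals the ratio of their standard deviations, and the permutation group is taken to be the full symmetric group. The function names are hypothetical.

```python
import numpy as np

def neyman_allocation(n, sigma_treated, sigma_control):
    """Split n units so that n1 / n0 is approximately sigma_treated / sigma_control.

    Assumption: both standard deviations are known and positive.
    """
    share = sigma_treated / (sigma_treated + sigma_control)
    n1 = int(round(n * share))
    n1 = min(max(n1, 1), n - 1)      # keep at least one unit in each arm
    return n1, n - n1

def complete_randomization(n, n1, rng):
    """Draw uniformly from all permutations of a vector with n1 ones and n - n1 zeros."""
    base = np.zeros(n, dtype=int)
    base[:n1] = 1
    return rng.permutation(base)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n1, n0 = neyman_allocation(100, sigma_treated=2.0, sigma_control=1.0)
    d = complete_randomization(100, n1, rng)
    print(n1, n0, d.sum())
```

Permuting a fixed assignment vector uniformly at random keeps the number of treated units fixed, which is what distinguishes complete randomization from independent coin flips.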

(2021) “Partial Identification of Treatment Effect Rankings with Instrumental Variables” (with A. M. Shaikh and E. J. Vytlacil), working paper.

This paper develops partial identification and inference methods for treatment effect parameters and the rankings of treatments in an instrumental variable framework while imposing alternative monotonicity restrictions. In particular, we consider a discrete, multi-valued treatment, a binary outcome, and a discrete, possibly multi-valued instrument. We use a linear programming formulation to present a flexible framework and to develop general results for characterizing the testable restrictions and the sharp identification of treatment effect parameters and the rankings of treatments in terms of these parameters that follow from imposing instrument exogeneity while additionally imposing alternative monotonicity restrictions on how the treatments depend on the instruments and how the outcomes depend on the treatments. Our results nest both ordered and unordered treatments. We further characterize leading special cases of our general analysis. We develop methods for simultaneous inference about the consistency of the observed data with our restrictions and the treatment effect ranking when the distribution of the observed data is consistent with our restrictions. We illustrate our methodology with empirical applications to the encouragement design of Behaghel, Crepon and Gurgand (2014) investigating the effects of public vs. private job search assistance; the RCTs with one-sided non-compliance of Angrist, Lang and Oreopoulos (2009) investigating the effects of alternative strategies on academic performance of college students and of Blattman, Jamison, and Sheridan (2017) investigating the effects of cash incentives and therapy on reducing crime in Liberia; and the RCT with close substitutes of Kline and Walters (2016) investigating the effects of alternative early childhood programs.