# SELECTION BIAS: Our Data 2

Table 3 presents the estimated coefficients of the logit model P(X). Variables are included in the model on the basis of two criteria: (a) minimization of classification error when P(X) > Pc is used to predict D = 1 and P(X)

Figure 2 presents the distributions of the estimated P(X) in the {D = 0} and {D = 1} groups. We obtain similar distributions for P(X) using alternative sets of regressors. This figure indicates the potential importance of defining bias on a common support of P(X). For the sample of controls, the histogram of P(X) values has support over the entire [0,1] interval. Surprisingly, however, the mode of the distribution of P(X) for controls is near zero. Many controls have a low estimated probability of participation. In the sample of ENPs, the support of P(X) is concentrated in the interval [0,0.225]. Thus, the bias measure $B_{S_P}$, which is the bias defined conditional on P(X) rather than X, is defined only over a fairly limited interval. As a result of this restriction on the support, any nonexperimental evaluation can nonparametrically estimate program impacts defined only over this interval. As we demonstrate below, the difference between the distributions of the estimated values of P has important implications for understanding the sources of selection bias as conventionally measured. Before presenting this decomposition, we first develop some econometric tools that are used in the empirical results reported in this paper.
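The idea of a common support for P(X) can be illustrated with a small sketch. The code below is only a rough illustration, not the procedure used in the paper: it assumes a crude histogram rule (a bin belongs to the common support if both groups have at least one estimated score in it), and the scores, the bin count, and the function names `bin_index`, `common_support_bins`, and `share_in_support` are all hypothetical.

```python
# Made-up estimated participation probabilities, for illustration only.
P_CONTROLS = [0.02, 0.05, 0.12, 0.40, 0.75, 0.95]  # D = 1 group
P_ENPS = [0.01, 0.03, 0.08, 0.13, 0.21]            # D = 0 group


def bin_index(p, n_bins):
    """Map a probability in [0, 1] to a histogram bin; p = 1 goes in the last bin."""
    return min(int(p * n_bins), n_bins - 1)


def common_support_bins(p_d1, p_d0, n_bins=10):
    """Return the sorted bin indices that contain scores from BOTH groups."""
    bins_d1 = {bin_index(p, n_bins) for p in p_d1}
    bins_d0 = {bin_index(p, n_bins) for p in p_d0}
    return sorted(bins_d1 & bins_d0)


def share_in_support(scores, support_bins, n_bins=10):
    """Fraction of a group's scores that fall inside the common-support bins."""
    inside = [p for p in scores if bin_index(p, n_bins) in support_bins]
    return len(inside) / len(scores)


support = common_support_bins(P_CONTROLS, P_ENPS)
print("common-support bins:", support)
print("share of controls inside:", share_in_support(P_CONTROLS, support))
print("share of ENPs inside:", share_in_support(P_ENPS, support))
```

With these invented scores the overlap is confined to the low end of the unit interval, so a large share of the control sample lies outside it. That is the situation in which a bias measure conditional on P(X) is defined only over a limited interval.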

# SELECTION BIAS: Our Data

The data used in this study come from four training centers participating in a randomized evaluation of the Job Training Partnership Act (JTPA). Along with data on the experimental treatment and control groups, information was collected on a nonexperimental comparison group of persons located in the same four labor markets who were eligible for the program but chose not to participate in it at the time random assignment was conducted. These persons are termed eligible nonparticipants (ENPs).
Random assignment took place at the point where individuals had applied to and been accepted into JTPA (i.e., admitted by a JTPA administrator). Under ideal conditions, randomization at this point identifies parameters (1) and (2). Members of the control group were excluded from receiving JTPA services for 18 months after random assignment. The controls completed the same survey instrument as the ENP comparison group members. This instrument included detailed retrospective questions on labor force participation, job spells, earnings, marital status, and other characteristics. In this paper, we analyze a sample of adult males aged 22 to 54. Table 1 defines the variables used in this study. Appendix B describes the data more fully and gives summary statistics for our sample.

# SELECTION BIAS: Difference-in-Differences 2

Term $B_1$ in (14) arises when $S_{0X} \setminus S_X$ or $S_{1X} \setminus S_X$ is nonempty. In this case we fail to find counterparts to $E(Y_0 \mid X, D = 1)$ in the set $S_{0X} \setminus S_X$ and counterparts to $E(Y_0 \mid X, D = 0)$ in the set $S_{1X} \setminus S_X$. Term $B_2$ arises from the differential weighting of $E(Y_0 \mid X, D = 0)$ by the two densities for $X$ given $D = 1$ and $D = 0$ within the overlap set. Term $B_3$ arises from differences in outcomes that remain even after controlling for observable differences. Selection bias, rigorously defined as $B_{S_X}$, may be of a different magnitude and even a different sign than the conventional measure of bias $B$.
Matching methods that impose the condition of pointwise common support eliminate two of the three sources of bias in (14). Matching only over the common support necessarily eliminates the bias arising from regions of nonoverlapping support given by term $B_1$ in (14). The bias due to different density weighting is eliminated because matching on participant P values effectively reweights the nonparticipant data. Thus $P_X B_{S_X}$ is the only component of (14) that is not eliminated by matching. $B_{S_X}$ is the bias associated with a matching estimator.
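The three terms can be checked numerically on a toy example with discrete X. The sketch below assumes that (14) has the additive form B = B1 + B2 + B3, with each term computed as the text describes it in words. Equation (14) itself is not reproduced here, so the formulas in the code are a reading of that description. The distributions `F1`, `F0`, the conditional means `M1`, `M0`, and the function name `decompose_bias` are hypothetical, and the numbers are invented rather than taken from the JTPA data.

```python
# F1[x] = Pr(X = x | D = 1), F0[x] = Pr(X = x | D = 0)   (made-up values)
# M1[x] = E(Y0 | X = x, D = 1), M0[x] = E(Y0 | X = x, D = 0)
F1 = {1: 0.2, 2: 0.5, 3: 0.3}   # support of X given D = 1 is {1, 2, 3}
F0 = {2: 0.1, 3: 0.3, 4: 0.6}   # support of X given D = 0 is {2, 3, 4}
M1 = {1: 10.0, 2: 12.0, 3: 15.0}
M0 = {2: 11.0, 3: 13.0, 4: 20.0}


def decompose_bias(f1, f0, m1, m0):
    """Split the conventional bias B into B1 + B2 + B3 (assumes nonempty overlap)."""
    s1, s0 = set(f1), set(f0)
    common = s1 & s0  # overlap set S_X

    # Conventional bias: mean of Y0 among D = 1 minus mean of Y0 among D = 0.
    b = sum(m1[x] * f1[x] for x in s1) - sum(m0[x] * f0[x] for x in s0)

    # B1: contribution of the regions outside the common support.
    b1 = (sum(m1[x] * f1[x] for x in s1 - common)
          - sum(m0[x] * f0[x] for x in s0 - common))

    # B2: same D = 0 outcomes weighted by two different densities of X.
    b2 = sum(m0[x] * (f1[x] - f0[x]) for x in common)

    # B3 = P_X * B_SX: outcome differences remaining at the same X.
    p_x = sum(f1[x] for x in common)
    b_sx = sum((m1[x] - m0[x]) * f1[x] for x in common) / p_x

    return {"B": b, "B1": b1, "B2": b2, "B3": p_x * b_sx,
            "P_X": p_x, "B_SX": b_sx}


parts = decompose_bias(F1, F0, M1, M0)
for name, value in parts.items():
    print(f"{name}: {value:.3f}")
```

In this invented example the conventional measure B is negative (about -4.5) while the bias on the common support is positive (about 1.4), so the two differ in both magnitude and sign. Restricting attention to the overlap set and weighting by the D = 1 distribution drops the first two terms and leaves only the third.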