# The Least Squares Estimator

## Finite Sample Properties

### Unbiased Estimation

$$E[b] = E_X\big\{E[b \mid X]\big\} = E_X[\beta] = \beta.$$

The interpretation of this result is that for any particular set of observations X, the least squares estimator has expectation β. Therefore, when we average this over the possible values of X, we find the unconditional mean is β as well.
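
To see this numerically, here is a minimal NumPy simulation sketch; the sample size, true coefficients, and error distribution are illustrative assumptions, not values from the text. Averaging $b$ over many replications recovers $\beta$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 100, 5_000
beta = np.array([1.0, 2.0])
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # fixed regressors

estimates = np.empty((reps, 2))
for r in range(reps):
    eps = rng.normal(size=n)                 # disturbances with mean zero
    y = X @ beta + eps
    estimates[r] = np.linalg.lstsq(X, y, rcond=None)[0]

print(estimates.mean(axis=0))                # close to [1.0, 2.0]
```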

### Omission of Relevant Variables

The estimator is biased when a relevant variable is omitted from the regression:

$$E[b_1 \mid X] = \beta_1 + P_{1.2}\,\beta_2,$$

where

$$P_{1.2} = (X_1'X_1)^{-1}X_1'X_2.$$
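
A short sketch of this bias formula (all data-generating values below are illustrative assumptions): regressing $y$ on $X_1$ alone, when $X_2$ is relevant and correlated with $X_1$, shifts the mean of $b_1$ by $P_{1.2}\,\beta_2$:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
z = rng.normal(size=n)                            # the omitted regressor X2
X1 = np.column_stack([np.ones(n), 0.8 * z + rng.normal(size=n)])
X2 = z.reshape(-1, 1)
beta1 = np.array([1.0, 2.0])
beta2 = np.array([3.0])

# P_{1.2} = (X1'X1)^{-1} X1'X2: the regression of X2 on X1
P12 = np.linalg.solve(X1.T @ X1, X1.T @ X2)
print("implied bias P12 @ beta2:", P12 @ beta2)

reps = 5_000
b1 = np.empty((reps, 2))
for r in range(reps):
    y = X1 @ beta1 + X2 @ beta2 + rng.normal(size=n)
    b1[r] = np.linalg.lstsq(X1, y, rcond=None)[0]  # short regression omits X2
print("mean(b1) - beta1:", b1.mean(axis=0) - beta1)  # close to P12 @ beta2
```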

### Inclusion of Irrelevant Variables

Including irrelevant variables does not bias the estimates of the relevant coefficients. It does, however, inflate the covariance matrix of the estimator, which reduces the efficiency of the estimation. If $z$ is an irrelevant variable with estimated coefficient $c$ and true coefficient $\gamma = 0$, then

$$E\left[\begin{pmatrix} b \\ c \end{pmatrix} \,\middle|\, X, z \right] = \begin{pmatrix} \beta \\ \gamma \end{pmatrix} = \begin{pmatrix} \beta \\ 0 \end{pmatrix}.$$
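
The following sketch (with assumed, illustrative numbers) fits both the correct model and one padded with an irrelevant regressor $z$: the estimates stay unbiased in both, but the slope's sampling variance is larger in the padded model:

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps = 50, 5_000
beta = np.array([1.0, 2.0])
x = rng.normal(size=n)
z = 0.9 * x + rng.normal(scale=0.5, size=n)   # irrelevant, but correlated with x
X = np.column_stack([np.ones(n), x])
Xz = np.column_stack([X, z])

b_short = np.empty((reps, 2))
b_long = np.empty((reps, 3))
for r in range(reps):
    y = X @ beta + rng.normal(size=n)          # true gamma = 0
    b_short[r] = np.linalg.lstsq(X, y, rcond=None)[0]
    b_long[r] = np.linalg.lstsq(Xz, y, rcond=None)[0]

print("means:", b_short.mean(axis=0), b_long.mean(axis=0)[:2])   # both close to beta
print("var of slope:", b_short[:, 1].var(), b_long[:, 1].var())  # larger in long model
```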

### Variance of the Least Squares Estimator

Since $E[b \mid X] = \beta$ and $E[\varepsilon\varepsilon' \mid X] = \sigma^2 I$, we have

$$
\begin{aligned}
\operatorname{Var}[b \mid X] &= E\big[(b - \beta)(b - \beta)' \mid X\big] \\
&= E\big[(X'X)^{-1}X'\varepsilon\varepsilon' X(X'X)^{-1} \mid X\big] \\
&= (X'X)^{-1}X'\,E[\varepsilon\varepsilon' \mid X]\,X(X'X)^{-1} \\
&= (X'X)^{-1}X'(\sigma^2 I)X(X'X)^{-1} \\
&= \sigma^2 (X'X)^{-1}.
\end{aligned}
$$

Theorem (Gauss–Markov Theorem). In the linear regression model with regressor matrix $X$, the least squares estimator $b$ is the minimum variance linear unbiased estimator of $\beta$. For any vector of constants $w$, the minimum variance linear unbiased estimator of $w'\beta$ in the regression model is $w'b$, where $b$ is the least squares estimator.
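
As a numerical check of the variance formula (setup assumed for illustration), holding $X$ fixed across replications, the Monte Carlo covariance of $b$ should match $\sigma^2(X'X)^{-1}$:

```python
import numpy as np

rng = np.random.default_rng(3)
n, reps, sigma = 60, 20_000, 1.5
beta = np.array([1.0, 2.0])
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # fixed across replications

b = np.empty((reps, 2))
for r in range(reps):
    y = X @ beta + rng.normal(scale=sigma, size=n)
    b[r] = np.linalg.lstsq(X, y, rcond=None)[0]

print(np.cov(b, rowvar=False))                # Monte Carlo covariance of b
print(sigma**2 * np.linalg.inv(X.T @ X))      # theoretical sigma^2 (X'X)^{-1}
```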

### Estimating the Variance of the Least Squares Estimator

We do not use

$$\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n} e_i^2,$$

which is biased downward because the residuals carry only $n - K$ degrees of freedom, but instead use

$$s^2 = \frac{e'e}{n - K}, \qquad \text{Est. Var}[b \mid X] = \frac{e'e}{n - K}(X'X)^{-1} = s^2 (X'X)^{-1}.$$

If we assume the disturbances are normally distributed, then the estimator $b$ is normally distributed as well,

$$b \mid X \sim N\big[\beta,\; \sigma^2 (X'X)^{-1}\big].$$
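
A minimal sketch of these estimators on simulated data (the design and coefficients are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(4)
n, K = 100, 2
beta = np.array([1.0, 2.0])
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ beta + rng.normal(size=n)

b = np.linalg.lstsq(X, y, rcond=None)[0]
e = y - X @ b                               # residuals
s2 = (e @ e) / (n - K)                      # unbiased: divides by n - K, not n
est_var_b = s2 * np.linalg.inv(X.T @ X)     # Est. Var[b | X]
print("s^2:", s2)
print("standard errors:", np.sqrt(np.diag(est_var_b)))
```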

## Large Sample Properties

### Consistency of the Estimator

Assume

$$\operatorname{plim}_{n \to \infty} \frac{X'X}{n} = Q,$$

where $Q$ is a positive definite matrix. Then

$$\operatorname{plim}\, b = \beta.$$
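
A quick simulation sketch of consistency (the setup is an illustrative assumption): the deviation of $b$ from $\beta$ shrinks as $n$ grows:

```python
import numpy as np

rng = np.random.default_rng(5)
beta = np.array([1.0, 2.0])
for n in (50, 500, 5_000, 50_000):
    X = np.column_stack([np.ones(n), rng.normal(size=n)])
    y = X @ beta + rng.normal(size=n)
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    print(n, np.abs(b - beta).max())   # maximum deviation shrinks with n
```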

### Consistency and Unbiasedness

Consider a sample $x_1, \ldots, x_n$ from a $N(\mu, \sigma^2)$ population, and suppose we want to estimate $\mu$.

  • Unbiased but not consistent. $x_1$ is an unbiased estimator of $\mu$ since $E[x_1] = \mu$. But $x_1$ is not consistent, since its distribution does not become more concentrated around $\mu$ as the sample size increases; it is always $N(\mu, \sigma^2)$.
  • Consistent but not unbiased. Let $\tilde{x} = \frac{1}{n-1}\sum_{i=1}^{n} x_i$. Since $E[\tilde{x}] = \frac{n}{n-1}\mu \neq \mu$, $\tilde{x}$ is a biased estimator. As $n \to \infty$, $E[\tilde{x}] \to \mu$ and $\operatorname{Var}[\tilde{x}] = \frac{n}{(n-1)^2}\sigma^2 \to 0$, so $\tilde{x}$ is a consistent estimator (see the simulation sketch after this list).
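
Here is a simulation sketch of both bullet points; $\mu$, $\sigma$, and the sample sizes are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(6)
mu, sigma, reps = 5.0, 2.0, 10_000
for n in (10, 100, 1_000):
    samples = rng.normal(mu, sigma, size=(reps, n))
    first_obs = samples[:, 0]                    # x_1: unbiased, not consistent
    x_tilde = samples.sum(axis=1) / (n - 1)      # biased, but consistent
    print(n, first_obs.mean(), first_obs.std(),  # mean ~ mu, spread never shrinks
          x_tilde.mean(), x_tilde.std())         # mean -> mu, spread -> 0
```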

### Asymptotic Normality of the Estimator

Theorem (Asymptotic Distribution of b with Independent Observations). If $\{\varepsilon_i\}$ are independently distributed with mean zero and finite variance $\sigma^2$, and $x_{ik}$ is such that the Grenander conditions are met, then

$$b \stackrel{a}{\sim} N\left[\beta,\; \frac{\sigma^2}{n} Q^{-1}\right].$$

The Grenander conditions are:

  1. For each column of $X$, $x_k$, if $d_{nk}^2 = x_k'x_k$, then $\lim_{n \to \infty} d_{nk}^2 = +\infty$.
  2. $\lim_{n \to \infty} x_{ik}^2 / d_{nk}^2 = 0$ for all $i = 1, 2, \ldots, n$.
  3. Let $R_n$ be the sample correlation matrix of the columns of $X$, excluding the constant term if there is one. Then $\lim_{n \to \infty} R_n = C$, a positive definite matrix.
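
A sketch of the asymptotic normality result under assumed non-normal (uniform) disturbances: the standardized slope estimate behaves approximately like a standard normal variate:

```python
import numpy as np

rng = np.random.default_rng(7)
n, reps = 500, 10_000
beta = np.array([1.0, 2.0])
X = np.column_stack([np.ones(n), rng.normal(size=n)])

slope = np.empty(reps)
for r in range(reps):
    eps = rng.uniform(-1.0, 1.0, size=n)     # mean zero, finite variance, not normal
    y = X @ beta + eps
    slope[r] = np.linalg.lstsq(X, y, rcond=None)[0][1]

z = (slope - beta[1]) / slope.std()
print((np.abs(z) > 1.96).mean())             # tail frequency close to 0.05
```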

## Interval Estimation

The ratio

$$t_k = \frac{b_k - \beta_k}{\sqrt{s^2 S^{kk}}},$$

where $S^{kk}$ denotes the $k$th diagonal element of $(X'X)^{-1}$, has a $t$ distribution with $n - K$ degrees of freedom, and a confidence interval for $\beta_k$ can be formed using

$$\operatorname{Prob}\Big[b_k - t_{(1-\alpha/2),[n-K]}\sqrt{s^2 S^{kk}} \;\le\; \beta_k \;\le\; b_k + t_{(1-\alpha/2),[n-K]}\sqrt{s^2 S^{kk}}\Big] = 1 - \alpha.$$
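
A sketch of this interval on simulated data (the design, coefficients, and $\alpha$ are illustrative assumptions), using scipy.stats.t for the critical value:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(8)
n, K, alpha = 100, 2, 0.05
beta = np.array([1.0, 2.0])
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ beta + rng.normal(size=n)

b = np.linalg.lstsq(X, y, rcond=None)[0]
e = y - X @ b
s2 = (e @ e) / (n - K)
S = np.linalg.inv(X.T @ X)                     # S^{kk} on the diagonal
t_crit = stats.t.ppf(1 - alpha / 2, df=n - K)
k = 1                                          # the slope coefficient
half = t_crit * np.sqrt(s2 * S[k, k])
print(f"{1 - alpha:.0%} CI for beta_{k}: [{b[k] - half:.3f}, {b[k] + half:.3f}]")
```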

## Prediction

The prediction variance is

$$\operatorname{Var}[e^0 \mid X, x^0] = \sigma^2 + x^{0\,\prime}\big[\sigma^2 (X'X)^{-1}\big] x^0,$$

and the prediction interval is

$$\hat{y}^0 \pm t_{(1-\alpha/2),[n-K]}\, \operatorname{se}(e^0).$$
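
A sketch of the prediction interval on simulated data (the new point $x^0$ and all data-generating values are illustrative assumptions); in practice $s^2$ replaces $\sigma^2$ in the variance formula:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(9)
n, K, alpha = 100, 2, 0.05
beta = np.array([1.0, 2.0])
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ beta + rng.normal(size=n)

b = np.linalg.lstsq(X, y, rcond=None)[0]
e = y - X @ b
s2 = (e @ e) / (n - K)
XtX_inv = np.linalg.inv(X.T @ X)

x0 = np.array([1.0, 0.5])                        # new observation (with constant)
y0_hat = x0 @ b
se_e0 = np.sqrt(s2 * (1.0 + x0 @ XtX_inv @ x0))  # se(e^0) with s^2 for sigma^2
t_crit = stats.t.ppf(1 - alpha / 2, df=n - K)
print(f"prediction: {y0_hat:.3f} +/- {t_crit * se_e0:.3f}")
```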