Tuesday, January 12, 2021

Ridge Regression (Basic) (Advanced Econometrics - Recent Developments in Regression Concepts)

 ■ Ridge Regression

Hoerl and Kennard (1970) developed the concept by proposing the class of estimators defined by:

betahat(gamma) = (X'X + gamma I)^(-1) X'y,

which are called the ridge estimators. Here "gamma" is a non-negative scalar and I is the identity matrix.
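The formula above can be computed directly. The following is a minimal sketch in Python with NumPy; the function name ridge_estimator and the simulated data are assumptions made for illustration, and no intercept or standardization of the predictors is handled here.

```python
import numpy as np

def ridge_estimator(X, y, gamma):
    """Closed-form ridge estimator (X'X + gamma*I)^(-1) X'y."""
    p = X.shape[1]
    # Solve the linear system rather than forming the inverse explicitly.
    return np.linalg.solve(X.T @ X + gamma * np.eye(p), X.T @ y)

# Hypothetical simulated data, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=50)

beta_ols = ridge_estimator(X, y, gamma=0.0)     # gamma = 0 gives least squares
beta_ridge = ridge_estimator(X, y, gamma=10.0)  # gamma > 0 shrinks the estimates

print("Least squares:", beta_ols)
print("Ridge:        ", beta_ridge)
```

With gamma = 0 the estimator reduces to ordinary least squares, and for any gamma > 0 the length of the coefficient vector is smaller than in the least squares case.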

In machine learning terms, ridge regression comes under the category of L2 regularisation, because it penalizes the squared L2 (Euclidean) norm of the coefficient vector.
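The link between the L2 penalty and the ridge estimator can be written out as follows. This is the standard penalized least squares formulation, stated here as a sketch; the symbol beta for the coefficient vector is an assumption, since the post only writes betahat.

```latex
\hat{\beta}(\gamma)
  = \arg\min_{\beta} \; \lVert y - X\beta \rVert_2^2 + \gamma \lVert \beta \rVert_2^2 ,
  \qquad \gamma \ge 0 .

% First-order condition:
-2X'(y - X\beta) + 2\gamma\beta = 0
  \;\Longrightarrow\; (X'X + \gamma I)\,\beta = X'y
  \;\Longrightarrow\; \hat{\beta}(\gamma) = (X'X + \gamma I)^{-1} X'y .
```

Setting the derivative to zero gives back exactly the ridge estimator of Hoerl and Kennard.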

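Ridge regression does not remove features from the feature space the way a dimensionality reduction method does. Instead, it shrinks the coefficients towards zero, which keeps the model stable when we have a large number of predictors, especially correlated ones.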

The dataset is generally divided into a training set and a test set. The model is developed on the basis of the training set.

On the training set the model performs well. If we calculate the error on the training set, most of the points lie on or very close to the best-fit line, so the sum of squared residuals is close to zero:

Sigma (yi - yhat_i)^2 ≈ 0

But when we deploy this linear model on the test set, the errors shoot up. Thus we have to penalize the linear model for fitting the training set this closely, because otherwise it makes large errors on the test set.

The problem of overfitting arises when the training accuracy is high and the test accuracy is very low. Ridge regression addresses this by adding a penalty on the size of the coefficients to the sum of squared residuals.
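The training versus test behaviour described above can be reproduced on simulated data. The sketch below is only an illustration under stated assumptions: the data are hypothetical, there are as many predictors as training observations (so least squares fits the training set essentially exactly), and gamma = 5 is an arbitrary choice rather than a tuned value.

```python
import numpy as np

def ridge_estimator(X, y, gamma):
    """Closed-form ridge estimator (X'X + gamma*I)^(-1) X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + gamma * np.eye(p), X.T @ y)

def mse(X, y, beta):
    """Mean squared error of the fitted values X @ beta."""
    resid = y - X @ beta
    return float(resid @ resid) / len(y)

# Hypothetical data: 20 predictors, only the first three matter.
rng = np.random.default_rng(1)
n_train, n_test, p = 20, 200, 20
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -1.0, 0.5]

X_train = rng.normal(size=(n_train, p))
X_test = rng.normal(size=(n_test, p))
y_train = X_train @ beta_true + rng.normal(size=n_train)
y_test = X_test @ beta_true + rng.normal(size=n_test)

# Least squares interpolates the training data because n_train == p.
beta_ols = np.linalg.lstsq(X_train, y_train, rcond=None)[0]
beta_ridge = ridge_estimator(X_train, y_train, gamma=5.0)

for name, b in [("least squares", beta_ols), ("ridge", beta_ridge)]:
    print(f"{name:13s} train MSE = {mse(X_train, y_train, b):.4f}  "
          f"test MSE = {mse(X_test, y_test, b):.4f}")
```

Least squares always has the lower training error, since that is exactly what it minimizes. The test error is where the penalty is expected to help, although how much it helps depends on the data and on the value of gamma.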

● Why were ridge estimators developed?

Amemiya (1998) explains that Hoerl and Kennard chose these estimators because they hoped to alleviate the instability of the least squares estimator caused by the near singularity of X'X, by adding a positive scalar "gamma" to the characteristic roots of X'X.
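The effect on the characteristic roots can be checked numerically. This is a small sketch with hypothetical data in which the second predictor is almost a copy of the first, so that X'X is nearly singular; the value gamma = 0.1 is an arbitrary assumption.

```python
import numpy as np

# Hypothetical nearly collinear predictors.
rng = np.random.default_rng(2)
n = 100
x1 = rng.normal(size=n)
x2 = x1 + 1e-4 * rng.normal(size=n)  # almost identical to x1
X = np.column_stack([x1, x2])

gamma = 0.1
XtX = X.T @ X
XtX_ridge = XtX + gamma * np.eye(2)

# Characteristic roots (eigenvalues) in ascending order.
roots = np.linalg.eigvalsh(XtX)
roots_ridge = np.linalg.eigvalsh(XtX_ridge)

print("roots of X'X:            ", roots)
print("roots of X'X + gamma I:  ", roots_ridge)
print("condition number before: ", np.linalg.cond(XtX))
print("condition number after:  ", np.linalg.cond(XtX_ridge))
```

Every root is shifted up by exactly gamma, so the smallest root moves away from zero and the matrix that has to be inverted becomes much better conditioned.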

They also proposed the ridge trace method for determining the value of "gamma". The ridge trace is the plot of the ridge coefficient estimates against "gamma", and "gamma" is chosen as the smallest value at which the ridge trace stabilizes.

This method has two major weaknesses:

¤ The point at which the ridge trace starts to stabilize cannot always be determined objectively.

¤ The method lacks theoretical justification, inasmuch as its major justification is derived from certain Monte Carlo studies which, though favorable, are not conclusive.
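The ridge trace discussed above can be sketched by computing the ridge estimates over a grid of "gamma" values. The data, the grid, and the function name ridge_estimator are assumptions for illustration; in practice the rows of the table would be plotted against gamma and inspected by eye, which is exactly where the first weakness comes from.

```python
import numpy as np

def ridge_estimator(X, y, gamma):
    """Closed-form ridge estimator (X'X + gamma*I)^(-1) X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + gamma * np.eye(p), X.T @ y)

# Hypothetical data with two nearly collinear predictors.
rng = np.random.default_rng(3)
n = 60
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
y = x1 + x2 + 0.5 * x3 + rng.normal(size=n)

# Grid of gamma values (an arbitrary choice) and the estimates at each one.
gammas = np.logspace(-3, 2, 11)
trace = np.array([ridge_estimator(X, y, g) for g in gammas])

for g, b in zip(gammas, trace):
    print(f"gamma = {g:9.3f}   betahat = {np.round(b, 3)}")
```

Each row is one point on the ridge trace. The code does not pick "gamma" automatically, because the method itself gives no objective rule for deciding when the trace has stabilized.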

In upcoming blogs I will write more about the varieties of ridge estimators known as Generalized Ridge Estimators, some of which involve the empirical Bayes method of determining "gamma".

Note: This is just the basic concept of ridge regression. Upcoming blogs will continue from here, presenting the equational framework and a little more of the concept, including the use of the hyperparameter.

Source: Self made notes on Machine Learning and Advanced Econometrics, Amemiya (1998).

Thank You

Aditya Pokhrel

MBA, MA Economics, MPA

