Dynamic Ridge Regression vs. Lasso Regression: A Comparative Study for Modeling Pakistan's Unemployment Rate

Authors

  • Savera Mubasher, Master of Philosophy in Statistics, Department of Mathematics and Statistics, University of Agriculture, Faisalabad, Pakistan
  • Muhammad Zakria, Master of Philosophy in Statistics, Department of Mathematics and Statistics, University of Agriculture, Faisalabad, Pakistan
  • Amir Shahzad, Master of Philosophy in Statistics, Department of Mathematics and Statistics, University of Agriculture, Faisalabad, Pakistan
  • Nazakat Ali, Master of Philosophy in Statistics, Department of Mathematics and Statistics, University of Agriculture, Faisalabad, Pakistan. ORCID: https://orcid.org/0009-0005-9663-1935
  • Hiba Faisal, Master of Philosophy in Statistics, Department of Mathematics and Statistics, University of Agriculture, Faisalabad, Pakistan

DOI:

https://doi.org/10.61424/gjme.v1i1.125

Keywords:

Ridge Regression, Lasso Regression, Cross-Validation, Unemployment Rate Pakistan

Abstract

The unemployment rate is a key economic indicator: it reflects a country's economic health and influences both policy decisions and citizens' living standards. This study models Pakistan's unemployment rate as the dependent variable, with GDP, exchange rate (ER), inflation rate (INF), foreign direct investment (FDI), exports of goods and services (EGS), general government final consumption expenditure (GFCE), budget deficit (BDF), and population (POP) as independent variables. A Variance Inflation Factor (VIF) analysis is used to diagnose multicollinearity among the predictors. ER has the highest VIF (7.544), indicating strong multicollinearity; EGS and POP show moderate multicollinearity; and GDP, FDI, GFCE, BDF, and INF have low VIFs. Ridge and Lasso regression, each tuned with 2-fold cross-validation, are then used to identify significant predictors and assess their impact on the unemployment rate. For Ridge regression, the lambda that minimizes the cross-validation error is 0.7758532. ER emerges as the most influential variable, with a feature importance score of 100. Lasso regression, with an optimal lambda of 0.1943467, shrinks the coefficients of GDP, EGS, POP, and INF to zero, which simplifies the model and reduces the risk of overfitting. The Ridge model yields an RMSE of 0.32, while the Lasso model achieves a lower RMSE of 0.25, indicating better predictive accuracy. The study underscores the importance of addressing multicollinearity and demonstrates that both Ridge and Lasso regression are effective for predicting the unemployment rate, with each model offering distinct strengths for economic analysis.
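
The workflow summarized in the abstract (VIF diagnosis, then Ridge and Lasso fits with lambda chosen by 2-fold cross-validation and compared by RMSE) can be illustrated with a minimal Python sketch. This is not the authors' code. The data below are synthetic, all function names (`vif`, `fit_ridge`, `fit_lasso`, `cv_rmse`) are hypothetical helpers written for this illustration, and the lambda scaling convention is an assumption, so the selected values are not comparable to the lambdas or RMSEs reported above.

```python
import numpy as np

def vif(X):
    """Variance inflation factors, taken as the diagonal of the
    inverse of the predictors' correlation matrix."""
    corr = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(corr))

def standardize(X_train, X_test):
    """Scale both sets using the training mean and standard deviation."""
    mu, sd = X_train.mean(axis=0), X_train.std(axis=0)
    return (X_train - mu) / sd, (X_test - mu) / sd

def fit_ridge(X, y, lam):
    """Closed-form ridge for (1/2n)||y - Xb||^2 + (lam/2)||b||^2 (assumed scaling)."""
    n, p = X.shape
    return np.linalg.solve(X.T @ X + n * lam * np.eye(p), X.T @ (y - y.mean()))

def fit_lasso(X, y, lam, n_iter=200):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam*||b||_1 (assumed scaling)."""
    n, p = X.shape
    yc = y - y.mean()
    beta = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            partial = yc - X @ beta + X[:, j] * beta[j]  # residual without column j
            rho = X[:, j] @ partial / n
            scale = X[:, j] @ X[:, j] / n
            # Soft-thresholding sets small coefficients exactly to zero.
            beta[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / scale
    return beta

def cv_rmse(fit, X, y, lam, k=2, seed=0):
    """k-fold cross-validated RMSE for one value of lambda."""
    idx = np.random.default_rng(seed).permutation(len(y))
    errors = []
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)
        X_tr, X_te = standardize(X[train], X[fold])
        beta = fit(X_tr, y[train], lam)
        pred = y[train].mean() + X_te @ beta
        errors.append(np.mean((y[fold] - pred) ** 2))
    return float(np.sqrt(np.mean(errors)))

# Synthetic stand-in data (NOT the study's data): two correlated
# predictors plus two independent ones.
rng = np.random.default_rng(42)
n = 60
shared = rng.normal(size=n)
X = np.column_stack([
    shared + 0.3 * rng.normal(size=n),
    shared + 0.3 * rng.normal(size=n),
    rng.normal(size=n),
    rng.normal(size=n),
])
y = 5.0 + 1.0 * X[:, 0] - 0.5 * X[:, 2] + 0.3 * rng.normal(size=n)

lambdas = np.logspace(-3, 1, 30)
best_ridge = min(lambdas, key=lambda lam: cv_rmse(fit_ridge, X, y, lam))
best_lasso = min(lambdas, key=lambda lam: cv_rmse(fit_lasso, X, y, lam))

print("VIF:", np.round(vif(X), 2))
print("Ridge: lambda =", best_ridge, "CV RMSE =", cv_rmse(fit_ridge, X, y, best_ridge))
print("Lasso: lambda =", best_lasso, "CV RMSE =", cv_rmse(fit_lasso, X, y, best_lasso))
```

Standardizing inside each fold keeps the held-out data from influencing the scaling, and using the same fold split for both models keeps their RMSE values comparable. With only two folds the error estimate can be noisy, so the selected lambda may vary with the random split.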

Published

2024-10-25 — Updated on 2024-11-23

How to Cite

Mubasher, S., Zakria, M., Shahzad, A., Ali, N., & Faisal, H. (2024). Dynamic Ridge Regression vs. Lasso Regression: A Comparative Study for Modeling Pakistan’s Unemployment Rate. Global Journal of Mathematics and Statistics, 1(1), 21–45. https://doi.org/10.61424/gjme.v1i1.125 (Original work published October 25, 2024)