Optimizing Variable Selection with Bayesian t-Lasso Regression: A Comparison of Normal-mixture and Uniform-mixture Representations for Outlier Data Analysis

https://doi.org/10.59965/pijme.v2i1.105

Authors

Mohmmad Nasim Wafa, S. Shuja

Keywords:

Regression models, Scale mixture of uniforms, t-lasso regression, Penalized regression, Gibbs sampling algorithm

Abstract

Choosing an optimal model is one of the central issues in regression analysis: it amounts to identifying the important explanatory variables, setting aside the negligible ones, and expressing the relationship between the response variable and the explanatory variables in a simpler form. Given the limitations of classical variable-selection procedures such as stepwise selection, penalized regression methods can be used instead; one such method is lasso regression. When the data contain outlying observations, a Student's t distribution can be used for the errors in place of the normal distribution. In this article, we propose a variable-selection method, the Bayesian t-lasso regression model, for analyzing data in the presence of outliers. The model is investigated under two different representations of the Laplace prior density for the regression coefficients: the Laplace density is first written as a scale mixture of normal distributions and then as a scale mixture of uniform distributions. Using simulation studies and a real-data analysis, we show that the Bayesian t-lasso based on the scale-mixture-of-uniforms representation of the Laplace density outperforms the one based on the scale-mixture-of-normals representation.
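As a brief technical note on the two representations the abstract refers to, the standard mixture identities for the Laplace density can be written as follows. The parameterization (rate λ, and Ga(2, 1) mixing in the uniform case) follows common forms in the literature, e.g. Park and Casella (2008), and may differ from the paper's own.

```latex
% Laplace prior as a scale mixture of normals (Andrews-Mallows identity,
% as used by Park & Casella, 2008): exponential mixing over the variance s.
\frac{\lambda}{2}\,e^{-\lambda|\beta_j|}
  \;=\; \int_0^\infty \frac{1}{\sqrt{2\pi s}}\,e^{-\beta_j^2/(2s)}
        \cdot \frac{\lambda^2}{2}\,e^{-\lambda^2 s/2}\,\mathrm{d}s .

% Laplace prior as a scale mixture of uniforms: with u ~ Ga(2, 1)
% (density u e^{-u}) and beta_j | u ~ U(-u/lambda, u/lambda),
\int_0^\infty \frac{\lambda}{2u}\,\mathbf{1}\{\lambda|\beta_j| < u\}\,
        u\,e^{-u}\,\mathrm{d}u
  \;=\; \frac{\lambda}{2}\int_{\lambda|\beta_j|}^{\infty} e^{-u}\,\mathrm{d}u
  \;=\; \frac{\lambda}{2}\,e^{-\lambda|\beta_j|} .
```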
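As a complement, below is a minimal runnable sketch of a Gibbs sampler for the Bayesian t-lasso under the normal-mixture representation. This is not the authors' code: the full conditionals follow the standard Park-and-Casella-style derivation with t errors introduced through gamma mixing weights, the hyperparameter choices (fixed λ and ν, improper prior on σ²) are illustrative, and the function name t_lasso_gibbs is hypothetical.

```python
import numpy as np

def t_lasso_gibbs(X, y, lam=1.0, nu=4.0, n_iter=5000, seed=0):
    """Gibbs sampler sketch for the Bayesian t-lasso (normal-mixture form).

    Assumed parameterization (illustrative, not necessarily the paper's):
      y_i = x_i' beta + e_i,  e_i | w_i ~ N(0, sigma2 / w_i),
      w_i ~ Ga(nu/2, nu/2)            (so marginally e_i ~ t_nu),
      beta_j | sigma2, tau2_j ~ N(0, sigma2 * tau2_j),
      tau2_j ~ Exp(lam^2 / 2)         (Park & Casella, 2008),
      pi(sigma2) propto 1/sigma2      (improper reference prior).
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    beta, sigma2 = np.zeros(p), 1.0
    tau2, w = np.ones(p), np.ones(n)
    draws = np.empty((n_iter, p))
    for it in range(n_iter):
        # 1) t-error weights: w_i | . ~ Ga((nu+1)/2, rate=(nu + r_i^2/sigma2)/2)
        r = y - X @ beta
        w = rng.gamma((nu + 1) / 2, 2.0 / (nu + r**2 / sigma2))
        # 2) coefficients: beta | . ~ N(A^{-1} X'Wy, sigma2 * A^{-1}) with
        #    A = X'WX + diag(1/tau2); X.T * w scales column i of X.T by w_i.
        XtW = X.T * w
        A_inv = np.linalg.inv(XtW @ X + np.diag(1.0 / tau2))
        beta = rng.multivariate_normal(A_inv @ (XtW @ y), sigma2 * A_inv)
        # 3) error scale: sigma2 | . ~ Inv-Gamma((n+p)/2, rate), sampled as
        #    the reciprocal of a Gamma((n+p)/2, scale=1/rate) draw.
        r = y - X @ beta
        rate = 0.5 * (np.sum(w * r**2) + np.sum(beta**2 / tau2))
        sigma2 = 1.0 / rng.gamma((n + p) / 2, 1.0 / rate)
        # 4) local scales: 1/tau2_j | . ~ inverse-Gaussian with
        #    mean sqrt(lam^2 * sigma2 / beta_j^2) and shape lam^2;
        #    numpy's wald(mean, scale) uses exactly this parameterization.
        tau2 = 1.0 / rng.wald(np.sqrt(lam**2 * sigma2 / beta**2), lam**2)
        draws[it] = beta
    return draws

# Example: 50 observations with t-distributed errors, 8 predictors,
# only the first two truly nonzero; posterior medians shrink the rest.
rng = np.random.default_rng(1)
X = rng.standard_normal((50, 8))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.standard_t(df=4, size=50)
draws = t_lasso_gibbs(X, y, lam=1.0, nu=4.0, n_iter=2000)
print(np.median(draws[500:], axis=0))   # discard 500 draws as burn-in
```

Under the uniform-mixture representation, step 4 is instead a draw of the uniform mixing variable, u_j = λ|β_j| + Exp(1) (a shifted exponential following the identity above), and the β update becomes a normal draw truncated to the box |β_j| < u_j/λ; the paper's comparison is between these two samplers.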


References

Belloni, A., & Chernozhukov, V. (2013). Least squares after model selection in high-dimensional sparse models. Bernoulli, 19(2), 521-547. https://doi.org/10.3150/11-BEJ4

Gelman, A. (2011). Induction and deduction in Bayesian data analysis. RMM, 2, 67–78. https://jlupub.ub.uni-giessen.de/server/api/core/bitstreams/786a421e-1af2-4baf-a8c8-30d208fb21d5/content

Heidelberger, P., & Lewis, P. A. W. (1984). Quantile estimation in dependent sequences. Operations Research, 32(1), 185-209.

Hlavackova-Schindler, K. (2016). Prediction consistency of lasso regression does not need normal errors. British Journal of Mathematics & Computer Science, 19(4), 10-20. https://doi.org/10.9734/BJMCS/2016/29533

Hoerl, A. E., & Kennard, R. W. (1970). Ridge regression: Biased estimation for nonorthogonal problems. Technometrics, 12(1), 55-67. https://homepages.math.uic.edu/~lreyzin/papers/ridge.pdf

Karapetyants, M., & László, S. C. (2024). A Nesterov-type algorithm with double Tikhonov regularization: fast convergence of the function values and strong convergence to the minimal norm solution. Applied Mathematics & Optimization, 90(1), 17. https://doi.org/10.1007/s00245-024-10163-0

Liu, C., & Rubin, D. B. (1995). ML estimation of the t distribution using EM and its extensions, ECM and ECME. Statistica Sinica, 5(1), 19-39.

Park, T., & Casella, G. (2008). The Bayesian lasso. Journal of the American Statistical Association, 103(482), 681-686. https://doi.org/10.1198/016214508000000337

Shadrokh, A., Khadembashiri, Z., & Yarmohammadi, M. (2021). Regression Modeling Via T-Lasso Bayesian Method. Journal of Advanced Mathematical Modeling, 11(2), 365-381. https://doi.org/10.22055/jamm.2021.35112.1859

Steele, S. E., & Lopez-Fernandez, H. (2014). Body size diversity and frequency distributions of Neotropical cichlid fishes (Cichliformes: Cichlidae: Cichlinae). PLoS One, 9(9), e106336. https://doi.org/10.1371/journal.pone.0106336

Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society Series B: Statistical Methodology, 58(1), 267-288. https://webdoc.agsci.colostate.edu/koontz/arec-econ535/papers/Tibshirani%20(JRSS-B%201996).pdf

Wafa, M. N. (2019). Assessing School Students' Mathematic Ability Using DINA and DINO Models. International Journal of Mathematics Trends and Technology-IJMTT, 65(12), 153-165. https://doi.org/10.14445/22315373/IJMTT-V65I12P517

Wafa, M. N., Hussaini, S. A. M., & Pazhman, J. (2020). Evaluation of Students' Mathematical Ability in Afghanistan's Schools Using Cognitive Diagnosis Models. EURASIA Journal of Mathematics, Science and Technology Education, 16(6), 1-12. https://doi.org/10.29333/ejmste/7834

Wafa, M. N., Zia, Z., & Frozan, F. (2023). Consistency and ability of students using DINA and DINO models. European Journal of Mathematics and Statistics, 4(4), 7-13. https://doi.org/10.24018/ejmath.2023.4.4.230

Wafa, M. N., Zia, Z., & Hussaini, S. A. M. (2023). Regression Models According to Birnbaum-Saunders Distribution. European Journal of Mathematics and Statistics, 4(6), 24-30. https://doi.org/10.24018/ejmath.2023.4.6.267

Zhu, Y.-N., Wu, H., Cao, C., & Li, H.-N. (2008). Correlations between mid-infrared, far-infrared, Hα, and FUV luminosities for Spitzer SWIRE field galaxies. The Astrophysical Journal, 686(1), 155-162. https://doi.org/10.1086/591121

Zou, H. (2006). The adaptive lasso and its oracle properties. Journal of the American Statistical Association, 101(476), 1418-1429. https://doi.org/10.1198/016214506000000735

Zou, H., & Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society Series B: Statistical Methodology, 67(2), 301-320. https://doi.org/10.1111/j.1467-9868.2005.00503.x

Published

2024-05-05

How to Cite

Wafa, M. N., & Shuja, S. (2024). Optimizing Variable Selection with Bayesian t-Lasso Regression: A Comparison of Normal-mixture and Uniform-mixture Representations for Outlier Data Analysis. Polyhedron International Journal in Mathematics Education, 2(1), 17–29. https://doi.org/10.59965/pijme.v2i1.105

Issue

Vol. 2 No. 1 (2024)

Section

Articles