Abstract: Additive quantile regression provides a flexible and robust method for modeling non-linear relationships. Existing methods for fitting additive quantile models typically approximate the component functions with splines, which requires knots to be selected in advance, is computationally slow, and is unsuitable for large-scale data problems. Therefore, a nonparametric additive quantile regression model based on the fused Lasso (AQFL) is proposed in this paper. AQFL balances the fused Lasso penalty against the l2 penalty to estimate the additive quantile regression model and select variables. The fused Lasso penalty makes the model fast to compute and locally adaptive, enabling prediction at the desired quantile, including extreme quantiles. Combined with the l2 penalty, AQFL shrinks to zero the function values of covariates with little influence on the response in high-dimensional data, thereby achieving variable selection. Furthermore, a block coordinate alternating direction method of multipliers (BC-ADMM) algorithm guaranteed to converge to the global optimum is presented, and the prediction consistency of AQFL is proved. Experiments on synthetic data and ground pork data demonstrate the superiority of AQFL in prediction accuracy and robustness.
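The quantile regression underlying AQFL fits the conditional tau-quantile by minimizing the pinball (check) loss. The sketch below is a minimal illustration of that loss, not the paper's method; the helper name `pinball_loss` and the grid search over constant predictors are our own constructions, included only to show that the empirical tau-quantile minimizes the loss.

```python
import numpy as np

def pinball_loss(y, y_hat, tau):
    """Pinball (check) loss rho_tau(u) = u * (tau - 1{u < 0}),
    averaged over residuals u = y - y_hat."""
    u = np.asarray(y, dtype=float) - np.asarray(y_hat, dtype=float)
    return float(np.mean(np.where(u >= 0, tau * u, (tau - 1.0) * u)))

# Illustration: over constant predictions, the empirical tau-quantile
# of the sample minimizes the pinball loss (here tau = 0.9).
rng = np.random.default_rng(0)
y = rng.normal(size=1000)
grid = np.linspace(-3.0, 3.0, 601)
best = grid[np.argmin([pinball_loss(y, c, 0.9) for c in grid])]
print(best, np.quantile(y, 0.9))  # the two values are close
```

Replacing the squared-error loss with this asymmetric loss is what lets the model target arbitrary quantiles, including extreme ones, rather than the conditional mean.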