Abstract
This article develops a novel, communication-efficient approach to penalized quantile regression for sparse, high-dimensional distributed data. In each round, the master machine solves a sparse penalized quantile regression problem, which can be done quickly with the proximal alternating direction method of multipliers (ADMM) algorithm, while the worker machines only compute subgradients on their local data. The advantage of the proximal ADMM algorithm is that every update within an iteration has a closed-form expression, even in the high-dimensional case, which greatly speeds up computation. As for communication efficiency, the proposed method does not sacrifice statistical accuracy and provably improves the estimation error obtained by the centralized method, provided the penalty levels are chosen properly. Moreover, the asymptotic properties of the proposed estimator and the convergence of the algorithm are established. Extensive experiments, covering both numerical simulations and an analysis of HIV drug resistance data, confirm the efficiency of the proposed method for quantile regression on distributed data.
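To make the worker-side step concrete, the following is a minimal sketch of how a worker might compute a subgradient of the quantile (check) loss on its local data, and how those subgradients might be combined. The function names `local_subgradient` and `aggregate`, the sample-size weighting, and the toy data are illustrative assumptions rather than the article's implementation; the master's proximal ADMM step is not shown.

```python
def local_subgradient(X, y, beta, tau):
    """Subgradient at beta of the local average check loss
    (1/n) * sum_i rho_tau(y_i - x_i' beta), with rho_tau(u) = u * (tau - 1{u < 0})."""
    n, p = len(X), len(beta)
    grad = [0.0] * p
    for xi, yi in zip(X, y):
        residual = yi - sum(xj * bj for xj, bj in zip(xi, beta))
        # d/du rho_tau(u) = tau - 1{u < 0}; the chain rule contributes -x_i.
        # At residual == 0 any value in [-tau, 1 - tau] is valid; this picks -tau.
        weight = (1.0 if residual < 0 else 0.0) - tau
        for j in range(p):
            grad[j] += weight * xi[j] / n
    return grad


def aggregate(local_grads, local_sizes):
    """Sample-size-weighted average of the workers' subgradients
    (the weighting scheme is an assumption of this sketch)."""
    total = sum(local_sizes)
    p = len(local_grads[0])
    return [
        sum(g[j] * n for g, n in zip(local_grads, local_sizes)) / total
        for j in range(p)
    ]


# Toy example: two workers, two covariates, median regression (tau = 0.5).
beta0 = [0.0, 0.0]
g1 = local_subgradient([[1.0, 0.0], [1.0, 2.0]], [1.0, -1.0], beta0, 0.5)
g2 = local_subgradient([[1.0, 1.0]], [2.0], beta0, 0.5)
g = aggregate([g1, g2], [2, 1])
```

In this sketch each worker transmits only a vector of length p per round, which illustrates why the communication cost of such a scheme scales with the dimension rather than with the local sample size.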