Transfer Learning under High-dimensional Generalized Linear Models.

Ye Tian, Yang Feng

Author Information

Ye Tian: Department of Statistics, Columbia University.
Yang Feng: Department of Biostatistics, School of Global Public Health, New York University.

PMID: 38562655 DOI: 10.1080/01621459.2022.2071278

In this work, we study the transfer learning problem under high-dimensional generalized linear models (GLMs), which aims to improve the fit on target data by borrowing information from useful source data. Given which sources to transfer, we propose a transfer learning algorithm on GLM and derive its ℓ1/ℓ2-estimation error bounds as well as a bound for a prediction error measure. The theoretical analysis shows that when the target and source are sufficiently close to each other, these bounds can be improved over those of the classical penalized estimator using only target data under mild conditions. When we don't know which sources to transfer, a transferable source detection approach is introduced to detect informative sources. The detection consistency is proved under the high-dimensional GLM transfer learning setting. We also propose an algorithm to construct confidence intervals for each coefficient component, and the corresponding theory is provided. Extensive simulations and a real-data experiment verify the effectiveness of our algorithms. We implement the proposed GLM transfer learning algorithms in a new R package glmtrans, which is available on CRAN.
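
To make the idea of borrowing information from source data concrete, here is a minimal sketch in Python. It rests on several assumptions that are not stated in the abstract: it covers only the Gaussian (identity-link) special case of a GLM, it uses a generic "pool, then correct on the target" two-step Lasso scheme, and all function names (`soft_threshold`, `lasso_cd`, `two_step_transfer`) and penalty parameters (`lam_w`, `lam_delta`) are hypothetical. It is not the glmtrans API and should not be read as the authors' exact algorithm.

```python
def soft_threshold(z, t):
    """Soft-thresholding operator used in Lasso coordinate descent."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, offset=None, n_iter=200):
    """Minimize (1/2n) * ||y - offset - X b||^2 + lam * ||b||_1.

    X is a list of rows, y a list of responses. Plain coordinate descent,
    written for clarity rather than speed.
    """
    n, p = len(X), len(X[0])
    if offset is None:
        offset = [0.0] * n
    b = [0.0] * p
    # Residual y - offset - X b, starting from b = 0.
    r = [y[i] - offset[i] for i in range(n)]
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Inner product of column j with the partial residual.
            rho = sum(X[i][j] * r[i] for i in range(n)) / n + col_sq[j] * b[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - b[j]
            if delta != 0.0:
                for i in range(n):
                    r[i] -= X[i][j] * delta
                b[j] = new
    return b


def two_step_transfer(X_t, y_t, sources, lam_w, lam_delta):
    """Assumed two-step scheme: pooled fit, then sparse target correction.

    sources is a list of (X_s, y_s) pairs taken to be informative.
    """
    # Step 1: fit a Lasso on target and source data pooled together.
    X_pool = X_t + [row for X_s, _ in sources for row in X_s]
    y_pool = y_t + [v for _, y_s in sources for v in y_s]
    w = lasso_cd(X_pool, y_pool, lam_w)
    # Step 2: on target data only, fit a sparse correction to the pooled fit.
    offset = [sum(x * wj for x, wj in zip(row, w)) for row in X_t]
    delta = lasso_cd(X_t, y_t, lam_delta, offset=offset)
    return [wj + dj for wj, dj in zip(w, delta)]


# Toy data: one feature, two target rows, one source with two rows.
X_target, y_target = [[1.0], [1.0]], [2.0, 2.0]
X_source, y_source = [[1.0], [1.0]], [1.0, 1.0]
beta_hat = two_step_transfer(
    X_target, y_target, [(X_source, y_source)], lam_w=0.0, lam_delta=0.25
)
print(beta_hat)
```

In this sketch the pooled fit uses the larger combined sample, while the second step shrinks the target-specific correction toward zero, so the final estimate stays near the pooled one unless the target data clearly disagree. Choosing the two penalties (for example by cross-validation) is left out.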

Keywords: Generalized linear models; Lasso; high-dimensional inference; negative transfer; sparsity; transfer learning

R21 AG074205/NIA NIH HHS

Journal Article
