Quantile Function on Scalar Regression Analysis for Distributional Data.

Hojin Yang, Veerabhadran Baladandayuthapani, Arvind U K Rao, Jeffrey S Morris
Author Information
  1. Hojin Yang: Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030.
  2. Veerabhadran Baladandayuthapani: Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030.
  3. Arvind U K Rao: Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030.
  4. Jeffrey S Morris: Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030.

Abstract

Radiomics involves the study of tumor images to identify quantitative markers explaining cancer heterogeneity. The predominant approach is to extract hundreds to thousands of image features, including histogram features comprising summaries of the marginal distribution of pixel intensities, which leads to multiple testing problems and can miss insights not contained in the selected features. In this paper, we present methods to model the entire marginal distribution of pixel intensities via the quantile function as functional data, regressed on a set of demographic, clinical, and genetic predictors to investigate their effects on imaging-based cancer heterogeneity. We call this approach quantile functional regression, regressing subject-specific marginal distributions across repeated measurements on a set of covariates, allowing us to assess which covariates are associated with the distribution in a global sense, as well as to identify distributional features characterizing these differences, including mean, variance, skewness, heavy-tailedness, and various upper and lower quantiles. To account for smoothness in the quantile functions and for intrafunctional correlation, and to gain statistical power, we introduce custom basis functions we call quantlets that are sparse, regularized, near-lossless, and empirically defined, adapting to the features of a given data set and containing a Gaussian subspace so non-Gaussianness can be assessed. We fit this model using a Bayesian framework that uses nonlinear shrinkage of quantlet coefficients to regularize the functional regression coefficients and provides fully Bayesian inference after fitting by Markov chain Monte Carlo.
We demonstrate the benefit of the basis space modeling through simulation studies, and apply the method to a magnetic resonance imaging (MRI)-based radiomic dataset from glioblastoma multiforme to relate imaging-based quantile functions to various demographic, clinical, and genetic predictors, finding specific differences in tumor pixel intensity distribution between males and females and between tumors with and without DDIT3 mutations.
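
The idea of regressing subject-specific quantile functions on covariates can be illustrated with a minimal sketch. This is not the authors' method: it uses simulated Gaussian pixel intensities, a plain pointwise least-squares fit in place of the basis-space Bayesian model with shrinkage, and a two-column basis [1, Phi^{-1}(p)] standing in for the Gaussian subspace mentioned in the abstract. All function and variable names are hypothetical.

```python
import numpy as np
from statistics import NormalDist


def empirical_quantiles(values, probs):
    """Empirical quantile function of one subject's pixel intensities on a grid."""
    return np.quantile(np.asarray(values, dtype=float), probs)


def gaussian_basis(probs):
    """Columns [1, Phi^{-1}(p)]: spans the quantile functions of Gaussian distributions."""
    z = np.array([NormalDist().inv_cdf(float(p)) for p in probs])
    return np.column_stack([np.ones_like(z), z])


def fit_function_on_scalar(Q, X):
    """Pointwise least squares of Q (subjects x grid) on X (subjects x covariates)."""
    B, *_ = np.linalg.lstsq(X, Q, rcond=None)
    return B  # shape: covariates x grid


# Simulated data (assumption): 40 subjects in two groups, 5000 pixels each.
# Group 1 has its intensity distribution shifted up by 1.0 with the same spread.
rng = np.random.default_rng(0)
probs = np.linspace(0.01, 0.99, 99)
n_subjects = 40
group = np.repeat([0.0, 1.0], n_subjects // 2)
Q = np.vstack([
    empirical_quantiles(rng.normal(loc=g, scale=1.0, size=5000), probs)
    for g in group
])

# Design matrix: intercept plus group indicator.
X = np.column_stack([np.ones(n_subjects), group])
B = fit_function_on_scalar(Q, X)

# Project the estimated group-effect function onto the Gaussian basis:
# the first coefficient reflects a mean shift, the second a scale change.
G = gaussian_basis(probs)
effect_coef, *_ = np.linalg.lstsq(G, B[1], rcond=None)
mean_shift, scale_shift = effect_coef
print(f"mean shift: {mean_shift:.3f}, scale shift: {scale_shift:.3f}")
```

In this toy setting the group effect on the quantile function is essentially a constant in p, so it lies almost entirely in the Gaussian subspace; a covariate that changed skewness or tail weight would leave a residual outside that subspace.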

Grants

  1. R01 CA160736/NCI NIH HHS
  2. R01 CA178744/NCI NIH HHS
  3. R01 CA194391/NCI NIH HHS
