A Survey on Multi-View Clustering.

Guoqing Chao, Shiliang Sun, Jinbo Bi
Author Information
  1. Guoqing Chao: School of Computer Science and Technology, Harbin Institute of Technology, Weihai 264209, PR China.
  2. Shiliang Sun: School of Computer Science and Technology, East China Normal University, Shanghai 200062, China.
  3. Jinbo Bi: Department of Computer Science, University of Connecticut, Storrs, CT 06269 USA.

Abstract

Clustering is a machine learning paradigm that divides sample subjects into a number of groups such that subjects in the same group are more similar to each other than to those in other groups. With advances in information acquisition technologies, samples can frequently be viewed from different angles or in different modalities, generating multi-view data. Multi-view clustering (MVC), which clusters subjects into subgroups using multi-view data, has attracted increasing attention. Although MVC methods have developed rapidly, few surveys summarize and analyze the current progress. Therefore, we propose a novel taxonomy of MVC approaches. As with other machine learning methods, we categorize them into generative and discriminative classes. Within the discriminative class, we further split the approaches into five groups according to how the views are integrated: Common Eigenvector Matrix, Common Coefficient Matrix, Common Indicator Matrix, Direct Combination, and Combination After Projection. Furthermore, we relate MVC to other topics: multi-view representation, ensemble clustering, multi-task clustering, and multi-view supervised and semi-supervised learning. Several representative real-world applications are elaborated for practitioners. Some benchmark multi-view datasets are introduced, and representative MVC algorithms from each group are empirically evaluated on them. To promote future development of MVC approaches, we point out several open problems that may require further investigation and thorough examination.
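
To make the notion of multi-view data concrete, the following is a minimal sketch of a naive baseline: each view is standardized, the views are concatenated sample by sample, and plain k-means is run on the result. It is only an illustration under stated assumptions and is not one of the five groups named above; the toy data, the function names (standardize, concat_views, kmeans) and the farthest-first initialization are all hypothetical choices made for this example.

```python
import math


def standardize(view):
    """Scale every feature column of one view to zero mean and unit variance."""
    n = len(view)
    scaled_cols = []
    for col in zip(*view):
        mean = sum(col) / n
        std = math.sqrt(sum((x - mean) ** 2 for x in col) / n) or 1.0
        scaled_cols.append([(x - mean) / std for x in col])
    return [list(row) for row in zip(*scaled_cols)]


def concat_views(views):
    """Join the standardized views into one feature vector per sample."""
    scaled = [standardize(v) for v in views]
    return [[x for row in sample for x in row] for sample in zip(*scaled)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20):
    """Lloyd's k-means with a deterministic farthest-first initialization."""
    centers = [points[0]]
    while len(centers) < k:
        # Pick the point farthest from the centers chosen so far.
        centers.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        )
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center for every sample.
        labels = [
            min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points
        ]
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical toy data: six samples described by two views
# (for example, two different measurement modalities of the same subjects).
view_a = [[1.0, 1.1], [0.9, 1.0], [1.2, 0.8], [5.0, 5.2], [5.1, 4.9], [4.8, 5.0]]
view_b = [[10.0], [11.0], [9.0], [30.0], [31.0], [29.0]]

features = concat_views([view_a, view_b])
labels = kmeans(features, k=2)
print(labels)
```

Concatenation treats all views as one flat feature space, so it ignores the view-specific structure that dedicated MVC methods try to exploit; it is best read as a reference point against which such methods can be compared.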

Grants

  1. K02 DA043063/NIDA NIH HHS
  2. R01 DA037349/NIDA NIH HHS
  3. R01 DA051922/NIDA NIH HHS
  4. R01 MH119678/NIMH NIH HHS
