Semi-supervised information-maximization clustering.

Daniele Calandriello, Gang Niu, Masashi Sugiyama
Author Information
  1. Daniele Calandriello: Politecnico di Milano, Milano, Italy. Electronic address: daniele.calandriello@mail.polimi.it.
  2. Gang Niu: Tokyo Institute of Technology, Tokyo, Japan. Electronic address: gang@sg.cs.titech.ac.jp.
  3. Masashi Sugiyama: Tokyo Institute of Technology, Tokyo, Japan. Electronic address: sugi@cs.titech.ac.jp.

Abstract

Semi-supervised clustering aims to incorporate prior knowledge into the decision process of a clustering algorithm. In this paper, we propose a novel semi-supervised clustering algorithm based on the information-maximization principle. The proposed method extends a previous unsupervised information-maximization clustering algorithm, which is based on squared-loss mutual information, so that it effectively incorporates must-links and cannot-links. The proposed method is computationally efficient because the clustering solution can be obtained analytically via eigendecomposition. Furthermore, the proposed method allows systematic optimization of tuning parameters such as the kernel width, given the degree of belief in the must-links and cannot-links. The usefulness of the proposed method is demonstrated through experiments.
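
The general idea of obtaining a clustering from an eigendecomposition while honoring must-links and cannot-links can be illustrated with a minimal sketch. This is not the authors' formulation, which the abstract does not spell out. It assumes a Gaussian kernel, and it assumes that constraints are injected by adding or subtracting a hypothetical `belief` weight on the corresponding kernel entries. The function names, the `width` and `belief` parameters, and the assignment rule (clip the leading eigenvectors at zero and take the largest entry per point) are all illustrative choices.

```python
import numpy as np


def gaussian_kernel(X, width):
    """Dense Gaussian kernel matrix for the rows of X (illustrative choice)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * width ** 2))


def constrained_spectral_clustering(X, n_clusters, must_links=(),
                                    cannot_links=(), width=1.0, belief=1.0):
    """Cluster rows of X from the leading eigenvectors of a modified kernel.

    Assumption (not taken from the paper): a must-link (i, j) raises the
    similarity K[i, j] by `belief`, and a cannot-link lowers it by `belief`.
    """
    K = gaussian_kernel(X, width)
    for i, j in must_links:
        K[i, j] += belief
        K[j, i] += belief
    for i, j in cannot_links:
        K[i, j] -= belief
        K[j, i] -= belief

    # eigh handles symmetric matrices and returns eigenvalues in ascending
    # order, so the last n_clusters columns are the leading eigenvectors.
    _, vecs = np.linalg.eigh(K)
    top = vecs[:, -n_clusters:]

    # Eigenvectors are defined up to sign: flip each so its entries sum to
    # a non-negative value, then discard negative entries.
    signs = np.sign(top.sum(axis=0))
    signs[signs == 0] = 1.0
    scores = np.maximum(top * signs, 0.0)

    # Each point joins the cluster whose eigenvector gives it the largest score.
    return np.argmax(scores, axis=1)


# Two well-separated groups of unequal size, with one must-link inside the
# first group and one cannot-link across the groups.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [0.1, 0.1], [0.05, 0.05], [0.2, 0.0],
              [100.0, 100.0], [100.1, 100.0],
              [100.0, 100.1], [100.1, 100.1]])
labels = constrained_spectral_clustering(
    X, n_clusters=2, must_links=[(0, 1)], cannot_links=[(0, 6)], belief=0.5)
```

The whole computation is a single symmetric eigendecomposition with no iterative optimization, which is the kind of closed-form solution the abstract refers to. The sketch is deliberately naive: when the leading eigenvalues are nearly equal, the eigenvectors are not uniquely determined and this simple assignment rule can become unreliable.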

Keywords

MeSH Term

Algorithms
Artificial Intelligence
Cluster Analysis
