A Unified Multi-view Clustering Algorithm using Multi-objective Optimization Coupled with Generative Model

Authors: Sayantan Mitra, Md. Hasanuzzaman, Sriparna Saha

Abstract: There is a large body of work on multi-view clustering that exploits multiple representations (or views) of the same input data for better convergence. These multiple views can come from multiple modalities (image, audio, text) or from different feature subsets. Obtaining one consensus partitioning after considering different views is usually a non-trivial task. Recently, multi-objective based multi-view clustering methods have surpassed the performance of single-objective based multi-view clustering techniques. One key problem is that it is difficult to select a single solution from the set of alternative partitionings generated by multi-objective techniques on the final Pareto optimal front. In this paper, we propose a novel multi-objective based multi-view clustering framework which overcomes the problem of selecting a single solution in multi-objective based techniques. In particular, our proposed framework has three major components: (i) a multi-view based multi-objective algorithm, Multiview-AMOSA, for initial clustering of data points; (ii) a generative model for generating a combined solution having probabilistic labels; and (iii) the K-means algorithm for obtaining the final labels. As the first component, we have adopted a recently developed multi-view based multi-objective clustering algorithm to generate different possible consensus partitionings of a given dataset, taking into account different views. A generative model is coupled with the first component to generate a single consensus partitioning after considering multiple solutions. It exploits the latent subsets of the non-dominated solutions obtained from the multi-objective clustering algorithm and combines them to produce a single probabilistic labeled solution. Finally, a simple clustering algorithm, namely K-means, is applied on the generated probabilistic labels to obtain the final cluster labels. Experimental validation of our proposed framework is carried out over several benchmark datasets belonging to four different domains: UCI datasets, multi-view datasets, search result clustering datasets, and patient stratification datasets. Experimental results show that our proposed framework achieves an improvement of around 2%-4% over different evaluation metrics in all four domains in comparison to state-of-the-art methods.
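
Illustrative sketch (not from the paper): the third component described in the abstract, K-means applied to probabilistic labels, can be pictured with a few lines of Python. Everything here is an assumption made for illustration: the `soft_labels` matrix is invented toy data rather than output of Multiview-AMOSA or the paper's generative model, and the `kmeans` helper with its deterministic farthest-point initialisation is a generic Lloyd's-algorithm stand-in, not the authors' implementation.

```python
from math import dist


def kmeans(points, k, iters=100):
    """Plain Lloyd's K-means with deterministic farthest-point initialisation."""
    # Start from the first point, then repeatedly add the point that is
    # farthest from the centroids chosen so far (no random seeding needed).
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(dist(p, c) for c in centroids))
        centroids.append(list(far))

    labels = None
    for _ in range(iters):
        # Assignment step: index of the nearest centroid for each point.
        new_labels = [
            min(range(k), key=lambda j: dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its assigned points.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical probabilistic labels: one row per data point, one column per
# cluster, each row summing to 1. These values are made up for the example.
soft_labels = [
    [0.9, 0.1],
    [0.8, 0.2],
    [0.7, 0.3],
    [0.2, 0.8],
    [0.1, 0.9],
    [0.3, 0.7],
]

final_labels = kmeans(soft_labels, k=2)
print(final_labels)
```

On this toy input the first three points end up in one cluster and the last three in the other, which is the hard labelling one would expect from rows dominated by different columns.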

Publishing Date: August, 2019

Published in: ACM Transactions on Knowledge Discovery from Data