Multi-objective Based Approach for Microblog Summarization

Authors: Naveen Saini, Sriparna Saha, Pushpak Bhattacharyya

Abstract: In recent years, social networking sites such as Twitter have become primary sources of real-time information about ongoing events such as political rallies and natural disasters. During natural disasters, relevant information collected from tweets has been seen to help in different ways. There is therefore a need for an automated microblog/tweet summarization system that selects the relevant tweets. In this paper, we employ the concepts of multi-objective optimization in microblog summarization to produce good-quality summaries. Several statistical quality measures, namely length, tf-idf score of the tweets, and anti-redundancy, each capturing a different aspect of the summary, are optimized simultaneously using the search capability of a multi-objective differential evolution technique. Different types of genetic operators, including a recently developed operator based on the self-organizing map (a type of neural network), are explored in the proposed framework. To measure the similarity between tweets, the word mover's distance is used, which is capable of capturing the semantic similarity between tweets. For evaluation, four benchmark datasets related to disaster events are used, and the results obtained are compared with various state-of-the-art techniques using ROUGE measures. Our algorithm improves over the state-of-the-art techniques by 62.37% and 5.65% in terms of ROUGE-2 and ROUGE-L scores, respectively. The results are also validated using a statistical significance t-test. At the end of the paper, an extension of the proposed approach to the multi-document summarization task is also illustrated.
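As a rough illustration of the objectives named in the abstract, the following sketch scores a candidate summary, encoded as a binary selection vector over the tweets, on length, tf-idf and anti-redundancy, and compares two candidates by Pareto dominance. It is not the paper's implementation: the function names, the whitespace tokenizer, the exact form of each objective and the assumption that all three are maximized are hypothetical, and a simple Jaccard distance stands in for the word mover's distance so that the example needs no word embeddings. The differential evolution search itself is omitted.

```python
import math
from collections import Counter


def tokenize(tweet):
    # Hypothetical tokenizer: lowercase and split on whitespace.
    return tweet.lower().split()


def tfidf_scores(tweets):
    """Return one score per tweet: the sum of tf-idf weights of its tokens."""
    docs = [tokenize(t) for t in tweets]
    n = len(docs)
    # Document frequency: number of tweets containing each token.
    df = Counter(tok for doc in docs for tok in set(doc))
    scores = []
    for doc in docs:
        if not doc:
            scores.append(0.0)
            continue
        tf = Counter(doc)
        scores.append(sum((c / len(doc)) * math.log(n / df[tok])
                          for tok, c in tf.items()))
    return scores


def jaccard_distance(a, b):
    # Stand-in for the word mover's distance (assumption, see lead-in).
    sa, sb = set(tokenize(a)), set(tokenize(b))
    union = sa | sb
    return 1.0 - len(sa & sb) / len(union) if union else 0.0


def objectives(tweets, selected):
    """Return (length, tfidf, anti_redundancy) for a binary selection vector."""
    chosen = [i for i, bit in enumerate(selected) if bit]
    scores = tfidf_scores(tweets)
    length = sum(len(tokenize(tweets[i])) for i in chosen)
    tfidf = sum(scores[i] for i in chosen)
    # Anti-redundancy: mean pairwise distance between the selected tweets.
    pairs = [(i, j) for k, i in enumerate(chosen) for j in chosen[k + 1:]]
    anti_red = (sum(jaccard_distance(tweets[i], tweets[j]) for i, j in pairs)
                / len(pairs)) if pairs else 0.0
    return length, tfidf, anti_red


def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all maximized)."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


tweets = [
    "flood in city center",
    "flood in city center",
    "rescue teams deployed near river",
]
redundant = objectives(tweets, [1, 1, 0])  # two identical tweets
diverse = objectives(tweets, [1, 0, 1])    # two distinct tweets
print(dominates(diverse, redundant))
```

In this toy case the diverse selection is at least as good on every objective and strictly better on at least one, so it dominates the redundant one. A multi-objective search would keep the set of candidates that no other candidate dominates.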

Publishing Date: September, 2019

Published in: IEEE Transactions on Computational Social Systems (accepted).