Abstract
Cluster labeling is the process of assigning appropriate and descriptive titles to clusters of text documents. A suitable label not only explains the central theme of a cluster but also distinguishes it from other clusters. In this paper we propose a cluster labeling technique that assigns a generic label to a cluster; the label may or may not appear in the text of the clustered documents. The technique finds the theme of the documents in a cluster and designates it as the cluster's label. We use term frequency-inverse document frequency (tf-idf) as the baseline weighting, with the term frequency calculation refined by using a thesaurus. WordNet is used as an external resource to generate hypernyms for the terms with the K highest tf-idf scores. The hypernyms with the highest frequency are then taken as the label of the cluster. The paper describes the datasets used for experimentation and compares the results with existing methods; these results reflect the meaningful outcome of our technique.
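
The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the smoothed idf formula, the alphabetical tie-breaking, and the small `TOY_HYPERNYMS` table (a hypothetical stand-in for a WordNet hypernym lookup) are all assumptions made for illustration, and the thesaurus-based refinement of term frequency is omitted.

```python
import math
from collections import Counter

# Hypothetical hypernym table standing in for WordNet (illustration only).
TOY_HYPERNYMS = {
    "dog": ["canine", "animal"],
    "cat": ["feline", "animal"],
    "horse": ["equine", "animal"],
    "car": ["vehicle"],
    "truck": ["vehicle"],
}


def tf_idf_scores(cluster_docs, all_docs):
    """Score each cluster term by cluster-level tf times a smoothed idf."""
    tf = Counter(term for doc in cluster_docs for term in doc)
    n_docs = len(all_docs)
    scores = {}
    for term, freq in tf.items():
        df = sum(1 for doc in all_docs if term in doc)
        # Smoothed idf (an assumption); avoids division by zero when df == 0.
        idf = math.log((1 + n_docs) / (1 + df)) + 1
        scores[term] = freq * idf
    return scores


def label_cluster(cluster_docs, all_docs, k=3, hypernyms=TOY_HYPERNYMS):
    """Return the most frequent hypernym among the k highest tf-idf terms."""
    scores = tf_idf_scores(cluster_docs, all_docs)
    # Sort by descending score; break ties alphabetically for determinism.
    top_terms = sorted(scores, key=lambda t: (-scores[t], t))[:k]
    counts = Counter(h for t in top_terms for h in hypernyms.get(t, []))
    if not counts:
        return None  # no hypernym known for any top term
    best = max(counts.values())
    return min(h for h, c in counts.items() if c == best)


# Usage: documents are pre-tokenized lists of terms.
animals = [["dog", "cat", "pet"], ["dog", "horse", "farm"], ["cat", "horse", "dog"]]
vehicles = [["car", "truck", "road"], ["car", "road", "truck"]]
print(label_cluster(animals, animals + vehicles))
```

In this toy run the label "animal" does not occur in any document of the cluster, which matches the abstract's point that a generic label need not be part of the cluster text. With a real WordNet interface, the `hypernyms` argument would be replaced by a lookup over synsets.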

Syed Muhammad Saqlain, Asif Nawaz, Imran Khan, Faiz Ali Shah, and Muhammad Usman Ashraf. (2016) Text Clusters Labeling using WordNet and Term Frequency-Inverse Document Frequency, Proc. of the PAS: A; 53, Issue 3.