Foong, O.-M. and Ismail, A.N. (2020) Document clustering using hybrid LDA-Kmeans. Advances in Intelligent Systems and Computing, 1226 A. pp. 137-146. ISSN 21945357
Full text not available from this repository.
Abstract
This paper presents a Hybrid Latent Dirichlet Allocation-Kmeans (HLDA-Kmeans) algorithm for document clustering. Information overload has become a challenge for users because of the abundance of information on the Web and its heterogeneous nature. Researchers, such as academics and others involved in text analytics, have difficulty analyzing documents because of ambiguity in keywords and keyphrases. Hence, the objective is to perform document clustering analysis using the HLDA-Kmeans algorithm to discover clusters in unlabelled text data, classify the keyphrases by topic, and visualize the clustering results. Online oil and gas news is used as the dataset, with a 70-30 split for training and testing. The performance of the proposed HLDA-Kmeans algorithm was assessed using precision, recall and F-score. Experimental results show that the proposed HLDA-Kmeans algorithm achieved satisfactory clustering results. © Springer Nature Switzerland AG 2020.
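
The abstract names a 70-30 train/test split and evaluation by precision, recall and F-score, but this record does not give the formulas or how clusters were matched to reference topics. The following is a minimal sketch using the standard definitions, not the authors' implementation. The function names `split_70_30` and `precision_recall_f`, the random seed, and the example labels are all hypothetical.

```python
import random


def split_70_30(items, seed=0):
    """Shuffle a copy of `items` and split it 70% / 30% (assumed procedure)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def precision_recall_f(true_labels, predicted_labels, positive):
    """Standard precision, recall and F-score with `positive` as the target class."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F-score is the harmonic mean of precision and recall.
    if precision + recall:
        f_score = 2 * precision * recall / (precision + recall)
    else:
        f_score = 0.0
    return precision, recall, f_score


# Hypothetical labels: reference topics vs. topics assigned after clustering.
reference = ["oil", "oil", "gas", "gas"]
assigned = ["oil", "gas", "gas", "gas"]
train, test = split_70_30(range(10))
scores = precision_recall_f(reference, assigned, positive="gas")
```

Here the evaluation treats one topic at a time as the positive class; averaging the per-topic scores would be one way to get a single figure, though the record does not say whether the paper did so.
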
Item Type: | Article |
---|---|
Additional Information: | Cited by 1; Conference: 9th Computer Science On-line Conference, CSOC 2020; Conference date: 15 July 2020 through 15 July 2020; Conference code: 243459 |
Uncontrolled Keywords: | Cluster analysis; Information retrieval; Intelligent systems; Statistical tests; Statistics; Clustering results; Document clustering; Latent Dirichlet allocation; Oil and gas; Online news; Text analytics; Text data; Training and testing; Clustering algorithms |
Depositing User: | Mr Ahmad Suhairi UTP |
Date Deposited: | 10 Nov 2023 03:28 |
Last Modified: | 10 Nov 2023 03:28 |
URI: | https://khub.utp.edu.my/scholars/id/eprint/13791 |