Proceedings of the International Conference on Emerging Intelligent Systems for Sustainable Development (ICEIS 2024)

Word Embedding-based Topic Modeling

Authors
Slimane Bellaouar (1, 2, *), Ahmed Itbirene (1, 2), Brahim Chihani (1, 2)
(1) Department of Mathematics and Computer Science, Université de Ghardaia, Bounoura, Algeria
(2) Laboratoire des Mathématiques et Sciences Appliquées (LMSA), Université de Ghardaia, Bounoura, Algeria
*Corresponding author. Email: bellaouar.slimane@univ-ghardaia.dz
Corresponding Author
Slimane Bellaouar
Available Online 31 August 2024.
DOI
10.2991/978-94-6463-496-9_8
Keywords
Topic modelling; Word embeddings; Latent Dirichlet Allocation (LDA); Word2Vec; Topic coherence
Abstract

Extracting topics from unlabeled text has become a challenging task as digitization has advanced significantly. This calls for topic modeling techniques, which are based on unsupervised algorithms. Our paper outlines the concept of topic modeling and its main approaches, including Latent Dirichlet Allocation (LDA), the Embedded Topic Model (ETM), Gaussian LDA (G-LDA), and LDA with Word2Vec (LDA2Vec). In the experimental work, we empirically compare LDA and ETM on the 20 Newsgroups dataset in terms of topic coherence and runtime. The results clearly favor the ETM approach.
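
As a rough illustration of what a topic coherence score measures, the sketch below computes a UMass-style coherence for the top words of a single topic. The abstract does not state which coherence measure the authors use, so the choice of UMass, the function name `umass_coherence`, and the toy corpus are all assumptions made for this example and are not taken from the paper.

```python
import math

def umass_coherence(top_words, documents):
    """UMass-style coherence of one topic (hypothetical helper, not from the paper).

    top_words: the topic's top words, ordered from most to least probable.
    documents: a list of tokenized documents (lists of words).
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents that contain every word in `words`.
        return sum(1 for d in doc_sets if all(w in d for w in words))

    score = 0.0
    for m in range(1, len(top_words)):
        for l in range(m):
            w_m, w_l = top_words[m], top_words[l]
            d_l = doc_freq(w_l)
            if d_l == 0:
                continue  # skip words that never occur in the corpus
            # Smoothed log ratio: co-document count over the count of the higher-ranked word.
            score += math.log((doc_freq(w_m, w_l) + 1) / d_l)
    return score

# Toy corpus (invented for illustration; this is not the 20 Newsgroups data).
toy_docs = [
    ["space", "nasa", "orbit", "launch"],
    ["space", "orbit", "satellite"],
    ["hockey", "team", "season", "game"],
    ["game", "team", "player"],
]

coherent = umass_coherence(["space", "orbit"], toy_docs)
mixed = umass_coherence(["space", "hockey"], toy_docs)
```

On this toy corpus, the word pair that co-occurs in the same documents scores higher than the pair that never does, which is the intuition behind comparing models by their average topic coherence.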

Copyright
© 2024 The Author(s)
Open Access
Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

Volume Title
Proceedings of the International Conference on Emerging Intelligent Systems for Sustainable Development (ICEIS 2024)
Series
Advances in Intelligent Systems Research
Publication Date
31 August 2024
ISBN
978-94-6463-496-9
ISSN
1951-6851

Cite this article

TY  - CONF
AU  - Slimane Bellaouar
AU  - Ahmed Itbirene
AU  - Brahim Chihani
PY  - 2024
DA  - 2024/08/31
TI  - Word Embedding-based Topic Modeling
BT  - Proceedings of the International Conference on Emerging Intelligent Systems for Sustainable Development (ICEIS 2024)
PB  - Atlantis Press
SP  - 89
EP  - 102
SN  - 1951-6851
UR  - https://doi.org/10.2991/978-94-6463-496-9_8
DO  - 10.2991/978-94-6463-496-9_8
ID  - Bellaouar2024
ER  -