BabelNet text classification

5/21/2023

We evaluate our categorization algorithm on a multilabel text categorization problem. The task uses the JRC-Acquis dataset, which is based on the subject domain classification of the European Commission's EuroVoc microthesaurus. We compare the performance of the classifier with a variant of it that uses the original class topics. Furthermore, we compare it with two state-of-the-art supervised algorithms (one each for the multilingual and cross-lingual tasks) using the same dataset. Empirical results obtained on five experimental languages show that categorization with expanded topics outperforms categorization with the original topics by a very wide margin. Our algorithm also outperforms the existing supervised technique that used the same dataset.

The majority of state-of-the-art text categorization algorithms are supervised and therefore require prior training. Besides the rigor involved in developing training datasets and the need to repeat training for different texts, working with multilingual texts poses additional unique challenges. One of these challenges is that the developer must work with the many different languages involved. Term expansion, such as query expansion, has been applied in numerous applications; however, a major drawback of most of these applications is that the actual meaning of terms is not usually taken into consideration. Considering the semantics of terms is necessary because most natural language words are polysemous.

In this paper, as a specific contribution to the document index approach for text categorization, we present a joint multilingual/cross-lingual text categorization algorithm (JointMC) based on semantic expansion of class topic terms through an optimized knowledge-based word sense disambiguation. The lexical knowledge in BabelNet is used for word sense disambiguation and for expansion of the topics' terms. The categorization algorithm computes the distributed semantic similarity between the expanded class topics and the text documents in the test corpus.
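The general idea of disambiguating topic terms, expanding them, and then scoring documents against the expanded topics can be illustrated with a minimal Python sketch. This is not the JointMC implementation and it does not call BabelNet: `TOY_SENSES` is a hypothetical hand-made sense inventory standing in for BabelNet lookups, sense selection is a simple context-overlap heuristic, and plain bag-of-words cosine similarity is swapped in for the distributed semantic similarity mentioned above. The function names and the `threshold` parameter are likewise assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical toy sense inventory standing in for BabelNet lookups.
# Each term maps to its candidate senses; each sense is a set of related terms.
TOY_SENSES = {
    "bank": [
        {"finance", "loan", "credit", "deposit"},  # financial institution
        {"river", "shore", "water"},               # side of a river
    ],
    "fishing": [
        {"fish", "fishery", "catch", "vessel"},
    ],
}


def disambiguate(term, context):
    """Pick the sense whose related terms overlap most with the context terms."""
    senses = TOY_SENSES.get(term, [])
    if not senses:
        return set()
    return max(senses, key=lambda sense: len(sense & context))


def expand_topic(topic_terms):
    """Return the topic terms plus the related terms of each term's chosen sense."""
    terms = set(topic_terms)
    expanded = set(terms)
    for term in terms:
        # The other terms of the same topic serve as disambiguation context.
        expanded |= disambiguate(term, terms - {term})
    return expanded


def cosine(a, b):
    """Cosine similarity between two bags of words given as Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def categorize(document, topics, threshold=0.1, expand=True):
    """Assign every label whose (optionally expanded) topic is similar enough."""
    doc_vec = Counter(document.lower().split())
    labels = []
    for label, terms in topics.items():
        topic_terms = expand_topic(terms) if expand else set(terms)
        if cosine(doc_vec, Counter(topic_terms)) >= threshold:
            labels.append(label)
    return labels


topics = {"finance": ["bank", "loan"], "fisheries": ["fishing"]}
doc = "the vessel returned with a large catch of fish"
print(categorize(doc, topics))                 # prints ['fisheries']
print(categorize(doc, topics, expand=False))   # prints []
```

In this toy run the document shares no word with the original topic term "fishing", so it receives no label without expansion, while the expanded topic matches on "vessel", "catch" and "fish". No training step is involved, and a document may receive several labels, which fits the multilabel setting.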
