KEYWORD EXTRACTION BASED ON WORD SYNONYMS USING WORD2VEC


Ogul I. U., ÖZCAN C., Hakdagli O.

27th Signal Processing and Communications Applications Conference (SIU), Sivas, Türkiye, 24-26 April 2019

Abstract

The volume of data generated by individuals online is growing exponentially. The raw information contained in this data is transformed into meaningful outputs using machine learning and deep learning methods. Supervised learning methods are generally used for information extraction and classification; they depend on the training set on which the classification algorithms are trained. This work proposes a keyword extraction solution that makes text data easier to classify. The solution is based on the Word2Vec algorithm, which takes the semantic meaning of words into account, unlike common approaches that rely on word frequency. The approach uses the word embedding algorithm Word2Vec to calculate word weights, semantic relationships, and the final weights of the vectors. The extracted keywords are used to train Naive Bayes and Decision Tree classifiers, and the performance of the proposed method is demonstrated on a classification example.
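
The abstract does not give the weighting formula, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that a word's final weight is its relative frequency multiplied by its cosine similarity to the mean vector of the document. The names `TOY_VECTORS` and `extract_keywords` are hypothetical, and the three-dimensional vectors are hand-made stand-ins for embeddings that a trained Word2Vec model would supply.

```python
import math
from collections import Counter

# Hypothetical toy embeddings; a real system would load trained Word2Vec vectors.
TOY_VECTORS = {
    "match": [0.9, 0.1, 0.0],
    "goal": [0.8, 0.2, 0.1],
    "team": [0.85, 0.15, 0.05],
    "bank": [0.1, 0.9, 0.1],
    "weather": [0.0, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def extract_keywords(tokens, vectors, top_k=3):
    """Rank words by (relative frequency) x (similarity to the document centroid).

    This scoring rule is an assumption made for illustration only.
    """
    # Ignore tokens that have no embedding.
    known = [t for t in tokens if t in vectors]
    if not known:
        return []
    counts = Counter(known)
    dim = len(vectors[known[0]])
    # Document centroid: mean of the vectors of all known tokens.
    centroid = [sum(vectors[t][i] for t in known) / len(known) for i in range(dim)]
    scores = {
        word: (count / len(known)) * cosine(vectors[word], centroid)
        for word, count in counts.items()
    }
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_k]


tokens = ["team", "goal", "match", "team", "weather", "goal", "team", "bank"]
keywords = extract_keywords(tokens, TOY_VECTORS, top_k=3)
print(keywords)
```

In this toy document the sports-related words score highest because they are both frequent and close to the document centroid, while "bank" and "weather" are pushed down by their low semantic similarity. The selected keywords could then serve as features for a classifier such as Naive Bayes or a decision tree.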