A new methodology for automatic creation of concept maps of Turkish texts


Bayrak M., Dal D.

LANGUAGE RESOURCES AND EVALUATION, 2024 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Publication Date: 2024
  • DOI Number: 10.1007/s10579-023-09713-9
  • Journal Name: LANGUAGE RESOURCES AND EVALUATION
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, FRANCIS, Aerospace Database, Applied Science & Technology Source, Communication Abstracts, Computer & Applied Sciences, EBSCO Education Source, Educational research abstracts (ERA), Humanities Abstracts, INSPEC, Linguistic Bibliography, Linguistics & Language Behavior Abstracts, Metadex, MLA - Modern Language Association Database, Civil Engineering Abstracts
  • Keywords: Concept map, Concept map mining, Natural language processing, Turkish text analysis
  • Affiliated with Atatürk University: Yes

Abstract

Concept maps are two-dimensional visual tools that describe the relationships between concepts belonging to a particular subject. Creating these maps manually is problematic, especially for text-intensive documents: it requires expertise in the relevant field, and it involves minimizing visual complexity and integrating maps. Automatic creation of concept maps is needed to overcome these problems. However, producing a fully automated concept map of human quality from a document has not yet been achieved satisfactorily. Motivated by this observation, this study develops a new methodology for automatically creating concept maps from Turkish text documents, for the first time in the literature. To this end, a new heuristic algorithm has been developed that uses the Turkish Natural Language Processing software chain and the Graphviz tool to automatically extract concept maps from Turkish texts. The proposed algorithm obtains concepts based on the dependencies between Turkish words in sentences. It also determines which sentences are added to the concept map by means of a new sentence scoring mechanism. The algorithm has been applied to a total of 20 data sets in the fields of Turkish Literature, Geography, Science, and Computer Sciences. Its effectiveness has been analyzed with three performance evaluation criteria: precision, recall, and F-score. The findings reveal that the proposed algorithm is quite effective on Turkish texts containing concepts. It has also been observed that the sentence selection algorithm produces results close to the average value in terms of the performance criteria evaluated. According to the findings, the concept maps obtained automatically by the proposed algorithm are quite similar to the concept maps extracted manually. On the other hand, the algorithm has a limitation: because it depends on a natural language processing tool, it requires manual intervention in some cases.
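
To make the evaluation criteria named in the abstract more concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that a concept map can be represented as a set of (concept, linking phrase, concept) triples and that an automatically extracted triple counts as correct only when it exactly matches a triple in the manually built map. The abstract does not state the matching rule actually used. The function names, the triple representation, and the sample data are all hypothetical. The sketch also shows how such triples could be written out as Graphviz DOT text for rendering.

```python
from typing import Set, Tuple

# Assumed representation: (source concept, linking phrase, target concept)
Triple = Tuple[str, str, str]


def precision_recall_f1(predicted: Set[Triple],
                        reference: Set[Triple]) -> Tuple[float, float, float]:
    """Exact-match precision, recall and F-score between two triple sets."""
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def quote(text: str) -> str:
    """Wrap text in double quotes, escaping characters special to DOT strings."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def to_dot(triples: Set[Triple]) -> str:
    """Serialize triples as Graphviz DOT text: a directed graph with edge labels."""
    lines = ["digraph concept_map {"]
    for source, label, target in sorted(triples):
        lines.append(
            f"    {quote(source)} -> {quote(target)} [label={quote(label)}];"
        )
    lines.append("}")
    return "\n".join(lines)


# Invented sample data, for illustration only (not taken from the study).
manual_map = {
    ("bitki", "yapar", "fotosentez"),
    ("fotosentez", "üretir", "oksijen"),
    ("bitki", "içerir", "klorofil"),
}
automatic_map = {
    ("bitki", "yapar", "fotosentez"),
    ("fotosentez", "üretir", "oksijen"),
    ("bitki", "üretir", "besin"),
}

precision, recall, f1 = precision_recall_f1(automatic_map, manual_map)
dot_text = to_dot(automatic_map)
```

The resulting `dot_text` can be saved to a file and rendered with the Graphviz `dot` command, for example `dot -Tpng map.dot -o map.png`. Exact matching is the strictest possible comparison. A more lenient scheme, such as matching concepts only or allowing synonyms, would give higher scores on the same data.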