An investigation into epidemiological situations of COVID-19 with fuzzy K-means and K-prototype clustering methods

Pasin, Ozge; GÖNENÇ, Senem

doi:10.1038/s41598-023-33214-y

Scientific Reports, vol. 13, no. 1, p. 6255, 2023 (SCI-Expanded, Scopus)

Publication Type: Article / Full Article
Volume: 13 Issue: 1
Publication Date: 2023
DOI: 10.1038/s41598-023-33214-y
Journal Name: Scientific Reports
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, BIOSIS, CAB Abstracts, Chemical Abstracts Core, EMBASE, MEDLINE, Veterinary Science Database, Directory of Open Access Journals
Page Numbers: p. 6255
Open Archive Collection: AVESİS Open Access Collection
Affiliated with Atatürk University: Yes

Abstract

The ten most populous countries during the pandemic were clustered on the basis of quantitative COVID-19 figures and policy plans. The Fuzzy K-Means (FKM) and K-prototype algorithms were used for clustering. The analysis included confirmed cases, tests, vaccines, school and workplace closures, event cancellations, gathering restrictions, transport closures, stay-at-home restrictions, international movement restrictions, testing policies, facial coverings, and vaccination policy status. The clusters were evaluated with the Partition Coefficient (PC), Partition Entropy (PE), Xie-Beni (XB), and Silhouette Fuzzy (SIL.F) indices, and the K-prototype clustering was evaluated with the Elbow method. The optimum number of clusters for both methods was found to be two. The first cluster included Brazil, Mexico, Nigeria, Bangladesh, the US, Indonesia, Russia, and Pakistan, while the second cluster comprised India and China. The analysis also examined the relationship between population and confirmed cases, tests, and vaccines; where correlations were significant, the variables were standardized with respect to the country with the largest population. The results showed that the FKM method was superior to the K-prototype method in terms of clustering. In conclusion, it is crucial to evaluate COVID-19 data for countries accurately and to develop appropriate policies. Clustering with the FKM and K-prototype algorithms helps identify groups of countries with similar COVID-19 data and policy plans.
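
As a rough illustration of the FKM approach and two of the validity indices named above (PC and PE), the following is a minimal sketch in Python with NumPy. It is not the authors' implementation: the function names, the fuzzifier m = 2, the convergence settings, and the synthetic two-group data are all assumptions made purely for illustration, and the study's actual country data are not reproduced here.

```python
import numpy as np


def fuzzy_k_means(X, k, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Illustrative fuzzy K-means (fuzzy c-means) sketch.

    Returns the cluster centers from the last iteration and the
    membership matrix U (n_samples x k). All defaults are assumptions.
    """
    rng = np.random.default_rng(seed)
    # Random initial memberships; each row is normalized to sum to 1.
    U = rng.random((X.shape[0], k))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Centers are membership-weighted means of the samples.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Squared Euclidean distance from every sample to every center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)  # guard against division by zero
        # Membership update: u_ij is proportional to d_ij^(-2/(m-1)).
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


def partition_coefficient(U):
    """PC: mean of squared memberships; ranges from 1/k to 1."""
    return float((U ** 2).sum() / U.shape[0])


def partition_entropy(U):
    """PE: mean membership entropy; ranges from 0 to log(k)."""
    U_safe = np.clip(U, 1e-12, 1.0)
    return float(-(U_safe * np.log(U_safe)).sum() / U.shape[0])


# Hypothetical data: ten points in two well-separated groups.
data_rng = np.random.default_rng(42)
X = np.vstack([
    data_rng.normal(loc=0.0, scale=0.5, size=(5, 2)),
    data_rng.normal(loc=10.0, scale=0.5, size=(5, 2)),
])

centers, U = fuzzy_k_means(X, k=2)
labels = U.argmax(axis=1)  # hard assignment from the fuzzy memberships
pc = partition_coefficient(U)
pe = partition_entropy(U)
print("labels:", labels)
print("PC:", pc, "PE:", pe)
```

A PC value close to 1 and a PE value close to 0 indicate a crisper partition, so in a sketch like this one would compare both indices across candidate values of k when choosing the number of clusters.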