A novel simulated annealing-based optimization approach for cluster-based task scheduling

ÇELİK, Esra; DAL, Deniz

doi:10.1007/s10586-021-03275-7

A novel simulated annealing-based optimization approach for cluster-based task scheduling

Atıf İçin Kopyala

ÇELİK E., DAL D.

Cluster Computing, cilt.24, ss.2927-2956, 2021 (SCI-Expanded)

Yayın Türü: Makale / Tam Makale
Cilt numarası: 24
Basım Tarihi: 2021
Doi Numarası: 10.1007/s10586-021-03275-7
Dergi Adı: Cluster Computing
Derginin Tarandığı İndeksler: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, PASCAL, Applied Science & Technology Source, Compendex, Computer & Applied Sciences, INSPEC
Sayfa Sayıları: ss.2927-2956
Anahtar Kelimeler: Cluster, Heterogeneous computing, Cloud computing, Task scheduling, Metaheuristic, Simulated annealing, Parallel computing, Shared memory, OpenMP, INDEPENDENT TASKS, BIG DATA, ALGORITHMS, HEURISTICS
Atatürk Üniversitesi Adresli: Evet

Özet

© 2021, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.Rapidly advancing technology brings a huge volume of data along the way that grows at a staggering pace and cannot be processed with traditional algorithms/hardware. Therefore, storing, processing, and analyzing this data in a timely manner requires distributed data clusters. One of the most critical problems facing these clusters is referred to as task scheduling. In this context, task scheduling is simply the name of the task-cluster node mapping process that will allow the last task to complete its execution as early as possible. Due to the NP-hard nature of the scheduling problem at hand, there is an inevitable need for metaheuristics to solve this problem in such a way that it can produce near-optimal (if possible optimal) solutions at reasonable times. In this study, a simulated annealing-based metaheuristic for cluster-based task scheduling is developed, and serial and parallel (shared memory) versions of the method are implemented in C++. The effectiveness of the proposed approach is demonstrated through twelve famous benchmarks from the Braun dataset. Both the serial and the parallel versions of the approach produce results that are much better than the best latency values ever reported in the literature for all benchmarks within the time constraint of 90 s. For example, the percentage of improvement of the serial version ranges from 0.01% to 0.49%. To decrease the execution time of the developed computer program and improve the quality of the scheduling solutions, different random number generation and perturbation techniques, data structures, early loop termination conditions, exploitation-exploration rates, and compiler effects are also analyzed in detail within the scope of this study.