Xuewen Zhang1, Zhonghao Li1, Gongshen Liu1,*, Jiajun Xu1, Tiankai Xie2, Jan Pan Nees1
CMC-Computers, Materials & Continua, Vol.55, No.3, pp. 405-417, 2018, DOI:10.3970/cmc.2018.02527
Abstract As a major distributed computing system, Spark has been used to solve increasingly complex tasks. However, Spark's native scheduling strategy assumes a homogeneous cluster, which makes it less effective on a heterogeneous cluster. The aim of this study is to find a more effective task-scheduling strategy and add it to the Spark source code. After investigating Spark's scheduling principles and mechanisms, we propose a stratifying algorithm and a node scheduling algorithm to optimize the native scheduling ...