Qinlu He1,*, Fan Zhang1, Genqing Bian1, Weiqi Zhang1, Zhen Li2
CMC-Computers, Materials & Continua, Vol.79, No.2, pp. 2833-2850, 2024, DOI:10.32604/cmc.2024.046807
- 15 May 2024
Abstract Spark, a distributed computing platform, has developed rapidly in the field of big data. Its in-memory computing feature reduces disk read overhead and shortens data processing time, giving it broad application prospects in large-scale computing applications such as machine learning and image processing. However, the performance of the Spark platform still needs improvement. When a large number of tasks are processed simultaneously, Spark’s cache replacement mechanism cannot identify high-value data partitions, so memory resources are not fully utilized and platform performance suffers. To address the problem that…