Yao Zhao¹, Jian Dong¹,*, Hongwei Liu¹, Jin Wu², Yanxin Liu¹
CMC-Computers, Materials & Continua, Vol.68, No.1, pp. 727-741, 2021, DOI:10.32604/cmc.2021.016462
- 22 March 2021
Abstract Efficient cache management plays a vital role in in-memory data-parallel systems, such as Spark, Tez, Storm and HANA. Recent research, notably on the Least Reference Count (LRC) and Most Reference Distance (MRD) policies, has shown that dependency-aware cache management practices that consider the application's directed acyclic graph (DAG) perform well in Spark. However, these practices ignore further relationships between RDDs and cache some redundant RDDs that share the same child RDDs, which degrades memory performance. Hence, in memory-constrained situations, systems may encounter a performance bottleneck due to frequent data block replacement. In addition,…