Open Access
ARTICLE
Massive Files Prefetching Model Based on LSTM Neural Network with Cache Transaction Strategy
1 School of Computer Science and Technology, Harbin Institute of Technology, Weihai, 264209, China.
2 School of Science, Harbin Institute of Technology, Weihai, 264209, China.
3 School of Engineering, Manukau Institute of Technology, Auckland, 2241, New Zealand.
4 College of Mathematics and Computer Science, Xinyu University, Xinyu, 338004, China.
5 College of Information Engineering, Sanming University, Sanming, 365004, China.
6 School of Astronautics, Harbin Institute of Technology, Harbin, 150001, China.
* Corresponding Author: Ning Cao.
Computers, Materials & Continua 2020, 63(2), 979-993. https://doi.org/10.32604/cmc.2020.06478
Received 01 March 2019; Accepted 27 November 2019; Issue published 01 May 2020
Abstract
In distributed storage systems, file access efficiency strongly affects the timeliness of information forensics. A prefetching model is a popular way to improve file access efficiency: it fetches data before the data is needed, according to the file access pattern, which reduces I/O waiting time and increases system concurrency. However, a prefetching model must mine the degree of association between files to ensure prefetching accuracy. When there are massive numbers of small files, the sheer volume of files challenges both the efficiency and the accuracy of relevance mining. In this paper, we propose a massive files prefetching model based on an LSTM neural network with a cache transaction strategy to improve file access efficiency. Firstly, we propose a file clustering algorithm based on temporal locality and spatial locality to reduce the computational complexity. Secondly, we define a cache transaction according to the occurrence of files in the cache, instead of using time-offset distance based methods, to extract file block features accurately. Lastly, we propose a file access prediction algorithm based on an LSTM neural network, which predicts the files that have a high probability of being accessed. Experiments show that, compared with the traditional LRU and the plain grouping methods, the proposed model notably increases the cache hit rate and effectively reduces the I/O wait time.
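To make the comparison between plain LRU and prefetching concrete, the following is a minimal Python sketch under stated assumptions. It is not the model proposed in the article: the LSTM predictor is replaced here by a simple first-order successor table, and the names `LRUCache` and `hit_rate` are hypothetical, introduced only for illustration. The sketch shows how loading a predicted next file ahead of time can raise the cache hit rate on a repetitive access trace.

```python
from collections import Counter, OrderedDict, defaultdict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used file."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def access(self, key):
        """Return True on a hit; on a miss, load the file and return False."""
        if key in self.items:
            self.items.move_to_end(key)  # mark as most recently used
            return True
        self.load(key)
        return False

    def load(self, key):
        """Bring a file into the cache (used for misses and for prefetching)."""
        if key in self.items:
            self.items.move_to_end(key)
            return
        self.items[key] = True
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict least recently used


def hit_rate(trace, capacity, prefetch=False):
    """Replay an access trace and return the fraction of cache hits.

    With prefetch=True, the most frequent observed successor of the current
    file is loaded in advance. This successor table is an assumed stand-in
    for a learned predictor such as an LSTM.
    """
    cache = LRUCache(capacity)
    successors = defaultdict(Counter)  # file -> counts of files seen next
    hits = 0
    prev = None
    for f in trace:
        if cache.access(f):
            hits += 1
        if prev is not None:
            successors[prev][f] += 1
        if prefetch and successors[f]:
            predicted = successors[f].most_common(1)[0][0]
            cache.load(predicted)
        prev = f
    return hits / len(trace)


# A cyclic trace over four files with room for only two in the cache.
trace = ["a", "b", "c", "d"] * 50
print(hit_rate(trace, capacity=2))                 # 0.0
print(hit_rate(trace, capacity=2, prefetch=True))  # 0.975
```

On this deliberately adversarial cyclic trace, plain LRU never hits because each file is evicted before it is requested again, while the prefetching variant misses only during the first five accesses, before the successor table has seen the full cycle. Real workloads are far less regular, so these numbers illustrate the mechanism only and say nothing about the results reported in the article.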
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.