Open Access
ARTICLE
Performance Enhancement of XML Parsing Using Regression and Parallelism
Department of Computer Science, Bahauddin Zakariya University, Multan, 60000, Pakistan
* Corresponding Author: Minhaj Ahmad Khan. Email:
Computer Systems Science and Engineering 2024, 48(2), 287-303. https://doi.org/10.32604/csse.2023.043010
Received 19 June 2023; Accepted 27 October 2023; Issue published 19 March 2024
Abstract
Extensible Markup Language (XML) files, widely used for storing and exchanging information on the web, require efficient parsing mechanisms to improve application performance. With existing Document Object Model (DOM) based parsing, performance degrades due to sequential processing and large memory requirements, which calls for a more efficient XML parser. In this paper, we propose a Parallel XML Tree Generator (PXTG) algorithm for accelerating the parsing of XML files and a Regression-based XML Parsing Framework (RXPF) that analyzes and predicts performance through profiling, regression, and code generation for efficient parsing. The PXTG algorithm divides the XML file into n parts and produces n trees in parallel. The profiling phase of RXPF produces a dataset by measuring the performance of various parsing models, including StAX, SAX, DOM, JDOM, and PXTG, on different cores using multiple file sizes. The regression phase produces the prediction model, from which the code generation phase produces the final code for efficient parsing of XML files. RXPF has shown a performance improvement ranging from 9.54% to 32.34% over the existing models used for parsing XML files.
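The abstract does not say how PXTG chooses its split points or merges its trees, so the following is only a minimal sketch of the general idea of dividing a file into n parts and building n trees concurrently. It assumes a flat document whose root contains non-nested sibling records with a known tag, and that the closing tag never appears inside comments or CDATA. The function names `split_body`, `parse_chunk`, and `parse_in_parts`, the synthetic `<part>` wrapper, and the sample `catalog`/`item` document are all hypothetical and not taken from the paper.

```python
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor


def split_body(body: str, close_tag: str, n: int) -> list:
    """Cut body into at most n chunks, each ending right after a close_tag."""
    chunks, start = [], 0
    target = max(1, len(body) // n)  # approximate chunk size in characters
    while len(chunks) < n - 1:
        # Look for the first record boundary at or beyond the target size.
        end = body.find(close_tag, start + target)
        if end == -1:
            break
        end += len(close_tag)
        chunks.append(body[start:end])
        start = end
    tail = body[start:]
    if tail.strip():  # whatever remains forms the last chunk
        chunks.append(tail)
    return chunks


def parse_chunk(chunk: str) -> ET.Element:
    """Wrap a chunk in a synthetic root (an assumption) and build its tree."""
    return ET.fromstring("<part>" + chunk + "</part>")


def parse_in_parts(xml_text: str, root_tag: str, record_tag: str, n: int) -> list:
    """Return up to n partial trees, in document order."""
    # Strip the root's opening and closing tags to get the record sequence.
    open_end = xml_text.index(">", xml_text.index("<" + root_tag)) + 1
    close_start = xml_text.rindex("</" + root_tag + ">")
    body = xml_text[open_end:close_start]
    chunks = split_body(body, "</" + record_tag + ">", n)
    with ThreadPoolExecutor(max_workers=n) as pool:
        # pool.map preserves input order, so the trees stay in document order.
        return list(pool.map(parse_chunk, chunks))


# Hypothetical sample document with ten flat records.
doc = "<catalog>" + "".join(
    "<item id='%d'>v%d</item>" % (i, i) for i in range(10)
) + "</catalog>"
trees = parse_in_parts(doc, "catalog", "item", 4)
ids = [item.get("id") for tree in trees for item in tree]
```

Threads are used here only to keep the sketch short; in CPython the global interpreter lock limits how much CPU-bound parsing actually overlaps, so a process pool or a runtime with true multithreading would be needed to observe a real speedup.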
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.