Search Results (17)
  • Open Access

    ARTICLE

    Research on Performance Optimization of Spark Distributed Computing Platform

    Qinlu He1,*, Fan Zhang1, Genqing Bian1, Weiqi Zhang1, Zhen Li2

    CMC-Computers, Materials & Continua, Vol.79, No.2, pp. 2833-2850, 2024, DOI:10.32604/cmc.2024.046807 - 15 May 2024

    Abstract Spark, a distributed computing platform, has rapidly developed in the field of big data. Its in-memory computing feature reduces disk read overhead and shortens data processing time, giving it broad application prospects in large-scale computing applications such as machine learning and image processing. However, the performance of the Spark platform still needs to be improved. When a large number of tasks are processed simultaneously, Spark’s cache replacement mechanism cannot identify high-value data partitions, so memory resources are not fully utilized and the performance of the Spark platform suffers. To address the problem that…

  • Open Access

    ARTICLE

    Deep Learning Model for Big Data Classification in Apache Spark Environment

    T. M. Nithya1,*, R. Umanesan2, T. Kalavathidevi3, C. Selvarathi4, A. Kavitha5

    Intelligent Automation & Soft Computing, Vol.37, No.3, pp. 2537-2547, 2023, DOI:10.32604/iasc.2022.028804 - 11 September 2023

    Abstract Big data analytics is a popular research topic due to its applicability in various real-time applications. Recently developed machine learning and deep learning models can be applied to analyze big data with better performance. Since big data involves numerous features and necessitates high computational time, feature selection methodologies using metaheuristic optimization algorithms can be adopted to choose an optimum set of features and thereby improve the overall classification performance. This study proposes a new sigmoid butterfly optimization method with an optimum gated recurrent unit (SBOA-OGRU) model for big data classification in Apache Spark.

  • Open Access

    ARTICLE

    Analysis of CLARANS Algorithm for Weather Data Based on Spark

    Jiahao Zhang, Honglin Wang*

    CMC-Computers, Materials & Continua, Vol.76, No.2, pp. 2427-2441, 2023, DOI:10.32604/cmc.2023.038462 - 30 August 2023

    Abstract With the rapid development of technology, processing the explosively growing volume of meteorological data on traditional standalone machines has become increasingly time-consuming and cannot meet the demands of scientific research and business. Therefore, this paper proposes an implementation of the parallel Clustering Large Applications based upon RANdomized Search (CLARANS) clustering algorithm on the Spark cloud computing platform to cluster China’s climate regions using meteorological data from 1988 to 2018. The aim is to address the challenge of applying clustering algorithms to large datasets. In this paper, the morphological similarity distance is adopted as the similarity measurement…

  • Open Access

    ARTICLE

    A Parallel Approach for Sentiment Analysis on Social Networks Using Spark

    M. Mohamed Iqbal1,*, K. Latha2

    Intelligent Automation & Soft Computing, Vol.35, No.2, pp. 1831-1842, 2023, DOI:10.32604/iasc.2023.029036 - 19 July 2022

    Abstract The public is increasingly using social media platforms such as Twitter and Facebook to express their views on a variety of topics. As a result, social media has emerged as the most effective and largest open source for obtaining public opinion. Single-node computational methods are inefficient for sentiment analysis on such large datasets. Supercomputers and parallel or distributed processing are two options for dealing with such large amounts of data. Most parallel programming frameworks, such as MPI (Message Passing Interface), are difficult to use and scale in environments where supercomputers are expensive. Using the…

  • Open Access

    ARTICLE

    Research on Optimization of Random Forest Algorithm Based on Spark

    Suzhen Wang1, Zhanfeng Zhang1,*, Shanshan Geng1, Chaoyi Pang2

    CMC-Computers, Materials & Continua, Vol.71, No.2, pp. 3721-3731, 2022, DOI:10.32604/cmc.2022.015378 - 07 December 2021

    Abstract As society has developed, increasing amounts of data have been generated by various industries. The random forest algorithm, as a classification algorithm, is widely used because of its superior performance. However, when generating feature subspaces, the random forest algorithm uses a simple random sampling feature selection method that cannot distinguish redundant features, which affects its classification accuracy and results in low data calculation efficiency in stand-alone mode. In response to these problems, related optimization research was conducted with Spark in the present paper. This improved random forest algorithm performs feature extraction according…

  • Open Access

    ARTICLE

    Effects of Spark Energy on Spark Plug Fault Recognition in a Spark Ignition Engine

    A. A. Azrin1,*, I. M. Yusri1,2, M. H. Mat Yasin3, A. Zainal4

    Energy Engineering, Vol.119, No.1, pp. 189-199, 2022, DOI:10.32604/EE.2022.017843 - 22 November 2021

    Abstract The increasing demands for fuel economy and emission reduction have led to the development of lean/diluted combustion strategies for modern Spark Ignition (SI) engines. The new generation of SI engines requires higher spark energy and a longer discharge duration to improve efficiency and reduce the backpressure. However, the increased spark energy has a negative impact on the ignition system, which results in deterioration of the spark plug. Therefore, a numerical model was used to estimate the spark energy of the ignition system based on the breakdown voltage. The trend of spark energy is then recognized by…

  • Open Access

    ARTICLE

    Spark Spectrum Allocation for D2D Communication in Cellular Networks

    Tanveer Ahmad1, Imran Khan2, Azeem Irshad3, Shafiq Ahmad4, Ahmed T. Soliman4, Akber Abid Gardezi5, Muhammad Shafiq6,*, Jin-Ghoo Choi6

    CMC-Computers, Materials & Continua, Vol.70, No.3, pp. 6381-6394, 2022, DOI:10.32604/cmc.2022.019787 - 11 October 2021

    Abstract The device-to-device (D2D) technology performs explicit communication between the terminal and the base station (BS) terminal, so there is no need to transmit data through the BS system. The establishment of a short-distance D2D communication link can greatly reduce the burden on the BS server. At present, D2D is one of the key technologies in 5G and has been studied in depth. D2D communication reuses the resources of cellular users to improve key system parameters such as utilization and throughput. However, repeated use of the spectrum and coexistence of cellular users can cause co-channel interference.

  • Open Access

    ARTICLE

    Applying Apache Spark on Streaming Big Data for Health Status Prediction

    Ahmed Ismail Ebada1, Ibrahim Elhenawy2, Chang-Won Jeong3, Yunyoung Nam4,*, Hazem Elbakry1, Samir Abdelrazek1

    CMC-Computers, Materials & Continua, Vol.70, No.2, pp. 3511-3527, 2022, DOI:10.32604/cmc.2022.019458 - 27 September 2021

    Abstract Big data applications in healthcare have provided a variety of solutions to reduce costs, errors, and waste. This work aims to develop a real-time system based on big medical data processing in the cloud for the prediction of health issues. In the proposed scalable system, medical parameters are sent to Apache Spark to extract attributes from data and apply the proposed machine learning algorithm. In this way, healthcare risks can be predicted and sent as alerts and recommendations to users and healthcare providers. The proposed work also aims to provide an effective recommendation system by…

  • Open Access

    ARTICLE

    Improving Cache Management with Redundant RDDs Eviction in Spark

    Yao Zhao1, Jian Dong1,*, Hongwei Liu1, Jin Wu2, Yanxin Liu1

    CMC-Computers, Materials & Continua, Vol.68, No.1, pp. 727-741, 2021, DOI:10.32604/cmc.2021.016462 - 22 March 2021

    Abstract Efficient cache management plays a vital role in in-memory data-parallel systems, such as Spark, Tez, Storm and HANA. Recent research, notably research on the Least Reference Count (LRC) and Most Reference Distance (MRD) policies, has shown that dependency-aware cache management practices that consider the application’s directed acyclic graph (DAG) perform well in Spark. However, these practices ignore the further relationship between RDDs and cache some redundant RDDs with the same child RDDs, which degrades memory performance. Hence, in memory-constrained situations, systems may encounter a performance bottleneck due to frequent data block replacement. In addition,…

  • Open Access

    ARTICLE

    Deep Learning-Based Hybrid Intelligent Intrusion Detection System

    Muhammad Ashfaq Khan1,2, Yangwoo Kim1,*

    CMC-Computers, Materials & Continua, Vol.68, No.1, pp. 671-687, 2021, DOI:10.32604/cmc.2021.015647 - 22 March 2021

    Abstract Machine learning (ML) algorithms are often used to design effective intrusion detection (ID) systems for appropriate mitigation and effective detection of malicious cyber threats at the host and network levels. However, cybersecurity attacks are still increasing. An ID system can play a vital role in detecting such threats. Existing ID systems are unable to detect malicious threats, primarily because they adopt approaches based on traditional ML techniques, which are less concerned with accurate classification and feature selection. Thus, developing an accurate and intelligent ID system is a priority. The main objective of…

Displaying 1-10 of 17 results (page 1).