Mazhar Javed Awan1,2,*, Mohd Shafry Mohd Rahim2, Haitham Nobanee3,4,5, Ashna Munawar2, Awais Yasin6, Azlan Mohd Zain7
CMC-Computers, Materials & Continua, Vol.67, No.2, pp. 2569-2583, 2021, DOI:10.32604/cmc.2021.014253
- 05 February 2021
Abstract: Big data is the collection of large datasets from traditional and digital sources to identify trends and patterns. The quantity and variety of computer data are growing exponentially for many reasons. For example, retailers are building vast databases of customer sales activity. Organizations are working on logistics and financial services, and users of public social media are sharing a vast quantity of sentiments related to sales prices and products. Challenges of big data include volume and variety in both structured and unstructured data. In this paper, we implemented several machine learning models through Spark MLlib using PySpark, which ...