ISSN:2579-0048(print)
ISSN:2579-0056(online)
Publication Frequency:Continuously
Journal on Big Data was launched at a time when the engineering features of big data are driving a surge of exploration in algorithms, raising new challenges, and spurring integration with industrial development; novel paradigms in this cross-disciplinary field need to be constructed by translating complex, innovative ideas from various fields.
Starting from July 2023, Journal on Big Data will transition to a continuous publication model: accepted articles will be promptly published online upon completion of the peer review and production processes.
Open Access
REVIEW
Journal on Big Data, Vol.6, pp. 1-20, 2024, DOI:10.32604/jbd.2023.046223
Abstract The extraction, transformation, and loading (ETL) process is a crucial and intricate area of study that lies deep within the broad field of data warehousing. This specific, yet crucial, aspect of data management fills the knowledge gap between unprocessed data and useful insights. Starting with basic information unique to this complex field, this study thoroughly examines the many issues that practitioners encounter. These issues include the complexities of ETL procedures, the rigorous pursuit of data quality, and the increasing amounts and variety of data sources present in the modern data environment. The study examines ETL methods, resources, and the crucial…
Open Access
ARTICLE
Journal on Big Data, Vol.4, No.1, pp. 1-25, 2022, DOI:10.32604/jbd.2022.021744
Abstract Market trends have changed rapidly over the last two decades. The primary reason is the newly created opportunities and the increased number of competitors competing to grasp market share using business analysis techniques. Market Basket Analysis (MBA) has a tangible effect in facilitating current change in the market. It is one of the well-known fields that deal with Big Data and Data Mining applications. MBA initially uses Association Rule Learning (ARL) as a means of realization. ARL is of considerable benefit in analyzing market data and understanding customers’ behavior. An important motive of…
Open Access
ARTICLE
Journal on Big Data, Vol.4, No.1, pp. 61-76, 2022, DOI:10.32604/jbd.2022.028078
Abstract Cryo-Electron Microscopy (cryo-EM) has become a powerful method to study the structure and function of biological macromolecules. However, in clustering tasks based on the projection angle of particles in cryo-EM, the noise considerably affects the clustering results. Existing denoising algorithms are ineffective due to the extremely low signal-to-noise ratio (SNR) of cryo-EM images and the complexity of noise types. The noise of a single particle greatly influences the orientation estimation of the subsequent clustering task, and the result of the clustering task directly affects the accuracy of the 3D reconstruction. In this paper, we propose a construction method of cryo-EM…
Open Access
ARTICLE
Journal on Big Data, Vol.4, No.1, pp. 41-60, 2022, DOI:10.32604/jbd.2022.027717
Abstract With the explosive growth of Internet text information, the task of text classification has become more important. As a part of text classification, Chinese news text classification also plays an important role. In public security work, public opinion news classification is an important topic. Effective and accurate classification of public opinion news is a necessary prerequisite for relevant departments to grasp the situation of public opinion and control the trend of public opinion in time. This paper introduces a combined-convolutional neural network text classification model based on word2vec and improved TF-IDF: first, the word vectors are trained with the word2vec model, then…
Open Access
ARTICLE
Journal on Big Data, Vol.4, No.1, pp. 27-39, 2022, DOI:10.32604/jbd.2022.026850
Abstract Machine learning models have advantages in multi-category credit rating classification. They can replace discriminant analysis based on statistical methods, greatly helping to reduce human interference in credit rating and improve rating efficiency. Therefore, we use a variety of machine learning algorithms to study the credit rating of telecom users. This paper conducts data understanding and preprocessing on telecom operator user data, and matches the user’s characteristics and tags based on the time sliding window method. In order to deal with the deviation caused by the imbalance of multi-category data, the SMOTE oversampling method is used to balance the data. Using the…
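The SMOTE step named in this abstract can be illustrated with a minimal sketch. This is a generic, textbook-style version of SMOTE interpolation, not the paper's implementation; the function name `smote`, the neighbour count `k`, and the toy `minority` points are all assumptions made for illustration.

```python
import random

def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic samples by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)

    def dist2(a, b):
        # Squared Euclidean distance between two feature vectors.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of the base point within the minority class.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: dist2(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic

# Hypothetical minority-class samples (two features each).
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote(minority, n_new=6)
```

Because each synthetic point lies on a segment between two real minority samples, the new points stay inside the region the minority class already occupies.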
Open Access
ARTICLE
Journal on Big Data, Vol.4, No.1, pp. 77-86, 2022, DOI:10.32604/jbd.2022.027477
Abstract In 2014, Typhoon Rammasun struck Qinzhou, Guangxi, damaging the wind tower sensor at 80 m in Qinzhou. In order to restore the wind speed at 80 m at that time, this paper used the hourly average wind speed data of the wind tower and the meteorological station from 2017–2019 to construct a wind speed correlation model between the meteorological station and the wind measuring tower in Qinzhou. Then, based on the hourly average wind speed data of Qinzhou Meteorological Station in 2014, it restored the hourly average wind speed of the anemometer tower during Rammasun’s landfall. The results showed…
Open Access
ARTICLE
Journal on Big Data, Vol.4, No.2, pp. 97-111, 2022, DOI:10.32604/jbd.2022.028363
Abstract Today’s world is a data-driven one, with data being produced in vast amounts as a result of the rapid growth of technology that permeates every aspect of our lives. New data processing techniques must be developed and refined over time to gain meaningful insights from this vast continuous volume of produced data in various forms. Machine learning technologies provide promising solutions and potential methods for processing large quantities of data and gaining value from it. This study conducts a literature review on the application of machine learning techniques in big data processing. It provides a general overview of machine learning…
Open Access
ARTICLE
Journal on Big Data, Vol.4, No.2, pp. 135-143, 2022, DOI:10.32604/jbd.2022.031024
Abstract Many countries are paying more and more attention to the protection of water resources, and how to protect water resources has received extensive attention from society. Water quality monitoring is key to water resources protection. How to efficiently collect and analyze water quality monitoring data is an important aspect of water resources protection. In this paper, Python programming tools and regular expressions were used to design a web crawler for the acquisition of water quality monitoring data from Global Freshwater Quality Database (GEMStat) sites, and multi-threaded parallelism was added to improve the efficiency in the…
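The combination of regular expressions and multi-threading mentioned in this abstract might look roughly like the sketch below. It is an offline illustration only: `FAKE_PAGES`, `fetch`, `parse`, `crawl`, and the HTML row layout are hypothetical stand-ins, and nothing here reflects the actual GEMStat page structure or the authors' code.

```python
import re
from concurrent.futures import ThreadPoolExecutor

# Hypothetical page contents standing in for downloaded station pages.
FAKE_PAGES = {
    "station/1": "<tr><td>pH</td><td>7.4</td></tr><tr><td>DO</td><td>8.1</td></tr>",
    "station/2": "<tr><td>pH</td><td>6.9</td></tr>",
}

# One table row: indicator name in the first cell, numeric value in the second.
ROW_PATTERN = re.compile(r"<tr><td>([^<]+)</td><td>([\d.]+)</td></tr>")

def fetch(path):
    """Stub for an HTTP request; a real crawler would download the page here."""
    return FAKE_PAGES[path]

def parse(html):
    """Extract {indicator: value} pairs from one page with the regex."""
    return {name: float(value) for name, value in ROW_PATTERN.findall(html)}

def crawl(paths, workers=4):
    """Fetch pages concurrently in a thread pool, then parse each one."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        pages = list(pool.map(fetch, paths))  # map preserves input order
    return {path: parse(html) for path, html in zip(paths, pages)}

results = crawl(sorted(FAKE_PAGES))
```

Threads suit this workload because page downloads are I/O-bound, so the workers spend most of their time waiting on the network rather than competing for the CPU.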
Open Access
ARTICLE
Journal on Big Data, Vol.4, No.2, pp. 113-123, 2022, DOI:10.32604/jbd.2022.028791
Abstract The application of big data in the medical device industry mainly refers to the analysis and processing of various medical devices, so as to provide patients with better treatment and rehabilitation services. At present, our country already has a relatively mature and reliable large database system. This article studies the application of medical equipment in the big data information platform. The main methods used in this article are survey method, case analysis method, and interview method. The big data information platform and medical devices are studied from different aspects. The survey results show that 41% of people completely agree with…
Open Access
ARTICLE
Journal on Big Data, Vol.4, No.2, pp. 87-95, 2022, DOI:10.32604/jbd.2022.024533
Abstract Community based churn prediction, or the task of recognising the influence of a customer’s community in churn prediction, has become an important concern for firms in many different industries. While churn prediction has until recently focused only on transactional datasets (targeted approach), the untargeted approach through product advisement, digital marketing and expressions of customers’ opinions on social media like Twitter has not been fully harnessed, although this data source has become an important influencing factor with lasting impact on churn management. Since Social Network Analysis (SNA) has become a blended approach for churn prediction and management in modern…
Open Access
ARTICLE
Journal on Big Data, Vol.4, No.2, pp. 125-133, 2022, DOI:10.32604/jbd.2022.030660
Abstract In this paper, 137 “First-class universities” and “First-class discipline” construction universities in China are selected as the objects of investigation to analyze the present situation and characteristics of the game design of university libraries in China. Taking university libraries in other countries as the reference object, this paper compares the differences in game design between university libraries in China and those in other countries, and sums up the deficiencies of gamification service practice in Chinese university libraries. Finally, this paper proposes an optimization path for the gamification design of Chinese university libraries from six aspects of game type, game…
Open Access
ARTICLE
Journal on Big Data, Vol.1, No.1, pp. 1-7, 2019, DOI:10.32604/jbd.2019.05899
Abstract Given the glut of information on the web, it is crucially important to have a system that parses the information appropriately and recommends relevant information to users. This class of systems is known as Recommendation Systems (RS); it is one of the most extensively used classes of systems on the web today. Recently, Deep Learning (DL) models have been used to generate recommendations, as DL has shown state-of-the-art (SoTA) results in the fields of Speech Recognition and Computer Vision in the last decade. However, RS is a much harder problem, as the central variable in the recommendation system’s environment is the…
Open Access
ARTICLE
Journal on Big Data, Vol.1, No.1, pp. 17-24, 2019, DOI:10.32604/jbd.2019.05799
Abstract Railways are the backbone of the Chinese transportation system, but the poor quality of their passenger services causes complaints from time to time. This study first analyzed the factors influencing railway passenger service quality, then explained its quality characteristics, and finally proposed an evaluation system for service quality in railway passenger transport. Through statistical analysis and processing of the basic information in survey data from railway stations, trains, and the official ticket purchase website, the evaluation scores of the questionnaire were converted into scores in the evaluation index system, which was based on SERVQUAL…
Open Access
ARTICLE
Journal on Big Data, Vol.1, No.2, pp. 89-106, 2019, DOI:10.32604/jbd.2019.07235
Abstract Web crawlers are an important part of modern search engines. With the development of the times, data has exploded and humans have entered a “big data era”. For example, Wikipedia carries knowledge from all over the world, records the real-time news that occurs every day, and provides users with a good database, but the large amount of data puts a lot of pressure on users when searching. At present, single-threaded crawling can no longer meet the requirements of text crawling. In order to improve the performance and program versatility of single-threaded crawlers, a…
Open Access
ARTICLE
Journal on Big Data, Vol.2, No.2, pp. 71-84, 2020, DOI:10.32604/jbd.2020.012294
Abstract Adversarial examples are a hot topic in the field of security in deep learning. The features, generation methods, and attack and defense methods of adversarial examples are the focus of current research on adversarial examples. This article explains the key technologies and theories of adversarial examples, covering the concept of adversarial examples, their occurrence, and the methods of attacking with them. This article lists the possible reasons for the adversarial examples. This article also analyzes several typical generation methods of adversarial examples in detail: Limited-memory BFGS (L-BFGS), Fast Gradient Sign Method (FGSM), Basic Iterative Method (BIM), Iterative Least-likely…
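Of the generation methods listed in this abstract, FGSM is the simplest to illustrate: it perturbs the input by a small step `eps` in the direction of the sign of the loss gradient with respect to the input. The sketch below applies that rule to a toy logistic-regression classifier; the model, weights, and input values are assumptions chosen for illustration, not taken from the article.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sign(v):
    # Returns -1, 0, or 1.
    return (v > 0) - (v < 0)

def predict(x, w, b):
    """Probability of class 1 under a logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def fgsm(x, y, w, b, eps):
    """One FGSM step: x_adv = x + eps * sign(d loss / d x).

    For logistic regression with cross-entropy loss, the gradient of the
    loss with respect to input feature i is (p - y) * w_i.
    """
    p = predict(x, w, b)
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

# Hypothetical model parameters and a correctly classified input (label 1).
w, b = [2.0, -1.0], 0.0
x, y = [1.0, 0.5], 1
x_adv = fgsm(x, y, w, b, eps=0.1)
```

In this toy setting the perturbed input lowers the model's confidence in the true class, which is the effect the attack aims for; the Basic Iterative Method repeats such a step several times with a smaller step size.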
Open Access
ARTICLE
Journal on Big Data, Vol.2, No.4, pp. 167-176, 2020, DOI:10.32604/jbd.2020.015357
Abstract Traditional image quality assessment (IQA) methods use hand-crafted features to predict the image quality score, and cannot perform well in many scenes. Since deep learning promotes the development of many computer vision tasks, many IQA methods have started to utilize deep convolutional neural networks (CNN) for the IQA task. In this paper, a CNN-based multi-scale blind image quality predictor is proposed to extract more effective multi-scale distortion features through pyramidal convolution; it consists of two tasks: a distortion recognition task and a quality regression task. For the first task, the image distortion type is obtained by the fully connected layer. For…
Open Access
ARTICLE
Journal on Big Data, Vol.3, No.1, pp. 1-9, 2021, DOI:10.32604/jbd.2021.010364
Abstract In many fields such as signal processing, machine learning, pattern recognition and data mining, it is common practice to process datasets containing huge numbers of features. In such cases, Feature Selection (FS) is often involved. Meanwhile, owing to their excellent global search ability, evolutionary computation (EC) techniques have been widely applied to FS. As a powerful global search method that computes faster than other EC algorithms, Particle Swarm Optimization (PSO) can solve feature selection problems well. However, when facing a large number of features, the efficiency of PSO drops significantly. Therefore, plenty of work has been done to improve this situation.…
Open Access
ARTICLE
Journal on Big Data, Vol.1, No.1, pp. 25-38, 2019, DOI:10.32604/jbd.2019.05800
Abstract Graphical methods are used for construction. Data analysis and visualization are an important area of applications of big data. At the same time, visual analysis is also an important method for big data analysis. Data visualization refers to data that is presented in a visual form, such as a chart or map, to help people understand the meaning of the data. Data visualization helps people extract meaning from data quickly and easily. Visualization can be used to fully demonstrate the patterns, trends, and dependencies of your data, which can be found in other displays. Big data visualization analysis combines the…
Open Access
ARTICLE
Journal on Big Data, Vol.1, No.2, pp. 55-69, 2019, DOI:10.32604/jbd.2019.06110
Abstract Haze concentration prediction, especially for PM2.5, has always been a significant focus of air quality research and merits in-depth study. Aimed at predicting the monthly average concentration of PM2.5 in Beijing, a novel method based on a Monte Carlo model is proposed. In order to fully exploit the value of PM2.5 data, we take the logarithm of the original PM2.5 data and propose two different scales, the daily concentration and the daily chain development speed of PM2.5, respectively. The results show that both are approximately normally distributed. On the basis of these results, a Monte…
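The general idea in this abstract (log-transform the data, treat the day-over-day change as approximately normal, then simulate) can be sketched as follows. Since the abstract is truncated, this is only one plausible reading: the function `simulate_monthly_mean`, the choice to model log ratios of consecutive days, and the `history` values are all illustrative assumptions rather than the paper's model or data.

```python
import math
import random
import statistics

def simulate_monthly_mean(history, days=30, n_paths=2000, seed=0):
    """Monte Carlo estimate of the mean concentration over the next `days`.

    Assumes the log of the day-over-day ratio is normally distributed,
    fits its mean and standard deviation from `history`, and averages
    many simulated paths.
    """
    rng = random.Random(seed)
    # Log of the ratio between consecutive days.
    log_ratios = [math.log(b / a) for a, b in zip(history, history[1:])]
    mu = statistics.mean(log_ratios)
    sigma = statistics.stdev(log_ratios)

    monthly_means = []
    for _ in range(n_paths):
        level = history[-1]
        path = []
        for _ in range(days):
            level *= math.exp(rng.gauss(mu, sigma))  # one simulated day
            path.append(level)
        monthly_means.append(statistics.mean(path))
    return statistics.mean(monthly_means)

# Made-up daily PM2.5 readings, for illustration only.
history = [80.0, 95.0, 70.0, 60.0, 88.0, 102.0, 75.0, 66.0, 90.0, 85.0]
estimate = simulate_monthly_mean(history)
```

Fixing the seed makes the estimate reproducible; in practice the number of paths would be chosen so that repeated runs with different seeds agree to the desired precision.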
Open Access
ARTICLE
Journal on Big Data, Vol.1, No.2, pp. 79-88, 2019, DOI:10.32604/jbd.2019.05806
Abstract In this paper, research advances in ontology and its applications are first reviewed. With the development of ontology technology, subject-oriented web information retrieval technology combining ontology has become one of the hot scientific issues. An innovative method that combines semantic web technology with traditional information retrieval technology is put forward, and a related ontology-based algorithm for judging relevancy to different topics is also presented and has proved to be effective in the given experiments.
Open Access
ARTICLE
Journal on Big Data, Vol.2, No.1, pp. 1-8, 2020, DOI:10.32604/jbd.2020.01001
Abstract Water resources are one of the basic resources for human survival, and water protection has become a major problem for countries around the world. However, most traditional water quality monitoring research is still concerned with the collection of water quality indicators and ignores the analysis of water quality monitoring data and its value. In this paper, by adopting the Laravel and AdminTE frameworks, we introduce how to design and implement a water quality data visualization platform based on Baidu ECharts. Through the deployed water quality sensor, the collected water quality indicator data is transmitted to the big…