In weather modification, it is important to accurately identify the ice crystal particles in ice clouds. Once ice crystal habits are correctly identified, cloud structure can be better understood, and cloud seeding and other weather modification methods can be used to change the microstructure of the cloud. Weather phenomena can then be changed at an appropriate time to support human production and quality of life. However, ice crystal morphology is highly varied. Traditional ice crystal particle classification methods rely on expert experience and on manually set thresholds, which makes category identification subjective and unreliable. In addition, existing deep learning methods struggle to improve classification performance on datasets with unbalanced sample distributions. We therefore designed a Convolutional Neural Network (CNN) embedded with a hypergraph convolution module, named Hy-INet. The hypergraph convolution module captures information from hypergraphs constructed from local and global feature spaces and learns the features of small-sample categories in ice crystal datasets that have unbalanced sample numbers. Experimental results demonstrate that the proposed method achieves superior performance in classifying ice crystal particle habits.
The phase states of clouds are usually divided into ice clouds, water clouds, and mixed clouds. Ice clouds and mixed clouds are critical cloud systems that produce precipitation, and the frequency at which they occur is closely related to the frequency and amount of precipitation. Under different humidity, temperature, and environmental conditions, ice clouds are mainly composed of numerous different shapes and sizes of ice crystal particles. After obtaining accurate ice particle habits, we can further calculate the physical properties of ice clouds and mixed clouds, such as cloud liquid water content and the scale and concentration of ice crystal particles. Through a better understanding of the physical properties of ice clouds and mixed clouds, we can better understand the cloud microphysical process, decide the cloud seeding time, and evaluate the cloud seeding effect including precipitation enhancement and hail suppression. Thus, it is of great importance to accurately classify ice crystal particle images for research on cloud microphysics and weather modification.
In the past, the classification methods for ice crystal particles were all based on expert experience. Traditional methods require considerable time and effort and rely on much subjective empirical knowledge, which leads to inconsistencies and deviations. In addition, traditional methods of automatic classification of ice crystal particles are based on ice crystal particle physical properties, such as particle radius and circumference, to distinguish different categories, such as statistical recognition methods based on probability [
In recent years, deep learning has demonstrated excellent performance in the computer vision field, especially in image classification tasks, achieving extremely high accuracy. In 2012, the Convolutional Neural Network (CNN) AlexNet model, proposed by Krizhevsky et al. [
In addition, graph-based methods [
Furthermore, we adopted hypergraphs [
An overview of our proposed model Hy-INet is shown in
The two main contributions of this paper are as follows:
Based on transfer learning, we propose a convolutional neural network embedded with hypergraph convolution to realize automatic classification of ice crystal particles.
We propose a hypergraph convolution module that combines global and local relations to construct hypergraphs and can be effectively used for ice crystal particle classification.
In this section, we introduce the proposed Hy-INet model, which is composed of a CNN embedded with our proposed hypergraph convolution module. The experimental results have demonstrated that the proposed hypergraph convolution module can effectively obtain feature information from unbalanced samples, thus improving overall classification performance.
First, in the selection of the CNN, we choose the ResNet152 network, which uses a global average pooling layer (GAP) [
Second, in the hypergraph convolution module, we first use the feature maps previously obtained from the traditional convolution layer to construct hypergraphs. Then, we use hypergraph convolution to update the pixel values according to the hypergraph structure. A more specific design and implementation of this module are as follows.
The purpose of the proposed hypergraph convolution module is to explore more diverse and advanced feature information that cannot be obtained by the traditional CNN to improve the performance for ice crystal classification tasks. The design idea of the proposed hypergraph convolution module is as follows.
First, the inputs to the hypergraph convolution module are the feature maps $X \in \mathbb{R}^{B \times C \times H \times W}$ obtained from the previous traditional convolution operation, where B, C, H and W represent the batch size, number of channels, height and width, respectively.
Second, according to the correlation between each vertex in the feature maps, we select vertices from local and global feature spaces to construct hyperedges and then construct hypergraphs (see
Third, we input the reshaped $X \in \mathbb{R}^{N \times C}$ into the hypergraph convolutional layer and update the vertices according to the vertex and hyperedge structure of the constructed hypergraphs. This step helps the network learn higher-level feature information that traditional convolution does not capture.
Finally, after the hypergraph convolution operation is completed, we restore $X \in \mathbb{R}^{N \times C}$ to the shape $X \in \mathbb{R}^{B \times C \times H \times W}$ required by the subsequent traditional convolutional layers.
In the hypergraph convolution module, each hypergraph convolutional layer is followed by a nonlinear activation function and a dropout. In practical applications, we can flexibly set the number of hypergraph convolutional layers. To be embedded into the CNN, the number of input channels of the first hypergraph convolution layer should be equal to the number of output channels in the previous traditional convolutional layer. The number of output channels of the last hypergraph convolution layer should be the same as the number of input channels in the following traditional convolutional layer.
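The reshaping between the feature-map layout and the vertex layout can be sketched as follows. This is a minimal, dependency-free illustration rather than the authors' implementation: it handles a single image, and it assumes that each spatial position is one vertex, so N = H·W. The helper names `to_vertices` and `to_feature_map` are hypothetical.

```python
def to_vertices(fmap):
    """Flatten a C x H x W feature map into N = H*W vertices with C features each.

    Assumption: one vertex per spatial position (a batch would repeat this per image).
    """
    C, H, W = len(fmap), len(fmap[0]), len(fmap[0][0])
    vertices = [[fmap[c][h][w] for c in range(C)]
                for h in range(H) for w in range(W)]
    return vertices, (C, H, W)


def to_feature_map(vertices, shape):
    """Inverse of to_vertices: restore the C x H x W layout for later conv layers."""
    C, H, W = shape
    return [[[vertices[h * W + w][c] for w in range(W)]
             for h in range(H)]
            for c in range(C)]


# Tiny example with C=2, H=2, W=2.
fmap = [[[1, 2], [3, 4]],
        [[5, 6], [7, 8]]]
verts, shape = to_vertices(fmap)       # 4 vertices, 2 features each
restored = to_feature_map(verts, shape)
```

In a tensor library the same round trip would typically be a permute followed by a reshape; plain lists are used here only to keep the sketch self-contained.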
Next, the method for constructing hypergraphs from the local and global feature space is introduced as follows.
Due to the local self-similarity of the image, the pixels adjacent to a pixel are likely to have a greater correlation. Simultaneously, the most noticeable feature used to distinguish categories usually appears in a local area. The category of the image can be inferred by combining the contextual information of surrounding pixels with a central pixel point. To obtain more detailed information, we need to establish feature relationships in local space.
We choose a simple and effective way to build local relationships. We select the eight neighboring pixels of the center pixel to form a hyperedge that represents the characteristic relationship of the local space around the center pixel. Because a hyperedge can contain any number of vertices, we can construct hyperedges with different numbers of vertices for center pixels at different positions. For center pixels that are not on the boundary, each hyperedge contains the four horizontally and vertically adjacent pixels and the four diagonally adjacent pixels, for a total of eight vertices. For center pixels on the image boundary, each hyperedge contains five surrounding pixels (edge positions) or three surrounding pixels (corner positions).
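The local construction described above can be illustrated with a short sketch. It is an assumption-laden example, not the authors' code: vertices are indexed in row-major order, each hyperedge lists only the in-bounds neighbors of its center pixel, and the function name `local_hyperedges` is hypothetical.

```python
def local_hyperedges(height, width):
    """Return one hyperedge per pixel: the indices of its in-bounds 8-neighbours.

    Vertex index = row * width + col (row-major order, an assumption).
    """
    edges = []
    for r in range(height):
        for c in range(width):
            members = [
                (r + dr) * width + (c + dc)
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0)          # skip the centre itself
                and 0 <= r + dr < height       # stay inside the feature map
                and 0 <= c + dc < width
            ]
            edges.append(members)
    return edges


edges = local_hyperedges(4, 4)
sizes = sorted({len(e) for e in edges})  # distinct hyperedge sizes on a 4 x 4 map
```

On a 4 × 4 map the corner, edge and interior pixels yield hyperedges of three, five and eight vertices, respectively, which matches the description in the text.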
In addition to characterizing detailed features, a comprehensive analysis of the image is also crucial to classification. The KNN (K nearest neighbors) method is typically used to calculate the distance between two pixels and to select the K globally nearest pixels to the center vertex through similarity measurement.
Building on KNN, we add a patch-based idea. We expect that using patches brings more vertex information into the construction of hyperedges and thereby improves classification performance. We define an N × N patch centered on each vertex and use the distance between two patches as the distance between the two vertices at their centers. On this basis, we obtain the distance between each vertex and all other vertices and then select the K nearest neighbors of the center vertex, thus constructing a hyperedge that contains K+1 vertices, including the center vertex.
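A patch-based KNN hyperedge construction along these lines might look as follows. This is a sketch under stated assumptions: the feature map has a single channel, out-of-bounds patch cells are treated as zero, and the patch distance is Euclidean. None of these choices is confirmed by the text, and the names `patch` and `global_hyperedges` are hypothetical.

```python
import math


def patch(fmap, r, c, size=3):
    """Collect the size x size patch centred on (r, c) as a flat list.

    Assumption: cells outside the feature map are treated as 0.0.
    """
    half = size // 2
    H, W = len(fmap), len(fmap[0])
    return [
        fmap[r + dr][c + dc] if 0 <= r + dr < H and 0 <= c + dc < W else 0.0
        for dr in range(-half, half + 1)
        for dc in range(-half, half + 1)
    ]


def global_hyperedges(fmap, k=3, size=3):
    """One hyperedge per vertex: the vertex plus its k nearest vertices,
    where the distance between two vertices is the Euclidean distance
    between the patches centred on them."""
    H, W = len(fmap), len(fmap[0])
    patches = [patch(fmap, r, c, size) for r in range(H) for c in range(W)]
    edges = []
    for i, p in enumerate(patches):
        dists = sorted((math.dist(p, q), j)
                       for j, q in enumerate(patches) if j != i)
        edges.append([i] + [j for _, j in dists[:k]])
    return edges


fmap = [[0.0, 0.1, 0.9],
        [0.0, 0.2, 1.0],
        [0.1, 0.1, 0.8]]
g_edges = global_hyperedges(fmap, k=3, size=3)  # each hyperedge has k + 1 = 4 vertices
```

The pairwise comparison is quadratic in the number of vertices, which is acceptable for a sketch but would usually be vectorized in practice.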
A hypergraph is defined as
where
For the hyperedge
We denote the diagonal matrix forms of
According to the definition of the incidence matrix
According to the symmetric Laplacian operator, the convolution expression on the hypergraph structure can be obtained,
where
We use the cross-entropy loss function to learn to predict the ice crystal category. It mainly describes the distance between the actual output and the expected output, which is defined as follows:
where
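For illustration, the cross-entropy loss in its usual softmax form, L = -log(exp(z_t) / Σ_j exp(z_j)) for true class t, can be computed as in the sketch below. The function names are hypothetical, and the softmax formulation is an assumption about how the network outputs are converted to probabilities.

```python
import math


def cross_entropy(logits, target):
    """Softmax cross-entropy for one sample; `target` is the true class index."""
    m = max(logits)  # subtract the max before exponentiating for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


def mean_cross_entropy(batch_logits, targets):
    """Average the per-sample loss over a batch."""
    losses = [cross_entropy(z, t) for z, t in zip(batch_logits, targets)]
    return sum(losses) / len(losses)


# With ten equally scored categories the loss equals log(10).
uniform_loss = cross_entropy([0.0] * 10, 3)
```

A confident, correct prediction drives the loss toward zero, while a uniform prediction over the ten categories gives log(10) ≈ 2.30.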
First, in the hypergraph convolution module, when defining the hyperedge weight matrix W, since we cannot explicitly define an appropriate matrix W, we define the weight of each hyperedge as 1. When constructing the global hypergraph, we set the patch to 3 × 3 and select the three global nearest points of the central vertex to form the hyperedge containing four vertices, including the central vertex. In addition, we choose Leaky ReLU as the activation function, and we set the parameter P of the dropout layer to 0.3.
Second, in the CNN, due to the small amount of ice crystal data, some transfer learning methods [
Third, we found that when the hypergraph convolution module was embedded between Layer1 and Layer2 in the ResNet152 network and was using one hypergraph convolution layer, the prediction accuracy could be maximally improved.
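A single hypergraph convolution layer with unit hyperedge weights can be sketched as below. The sketch assumes the widely used symmetric-normalized form X' = σ(D_v^{-1/2} H W D_e^{-1} H^T D_v^{-1/2} X Θ), where H is the incidence matrix and D_v, D_e are the vertex and hyperedge degree matrices. The Leaky ReLU slope of 0.01 is an assumed default, dropout is omitted (as at inference time), and all function names are hypothetical.

```python
def matmul(A, B):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def incidence(edges, n_vertices):
    """H[v][e] = 1.0 if vertex v belongs to hyperedge e, else 0.0."""
    H = [[0.0] * len(edges) for _ in range(n_vertices)]
    for e, members in enumerate(edges):
        for v in members:
            H[v][e] = 1.0
    return H


def leaky_relu(x, slope=0.01):
    return x if x > 0 else slope * x


def hypergraph_conv(X, H, theta, slope=0.01):
    """Apply sigma(Dv^-1/2 H W De^-1 H^T Dv^-1/2 X Theta) with W = identity.

    Every vertex must belong to at least one hyperedge (non-zero degree).
    """
    n, m = len(H), len(H[0])
    w = [1.0] * m                                                   # unit hyperedge weights
    dv = [sum(H[v][e] * w[e] for e in range(m)) for v in range(n)]  # vertex degrees
    de = [sum(H[v][e] for v in range(n)) for e in range(m)]         # hyperedge degrees
    left = [[H[v][e] * w[e] / de[e] / dv[v] ** 0.5 for e in range(m)]
            for v in range(n)]                                      # Dv^-1/2 H W De^-1
    right = [[H[v][e] / dv[v] ** 0.5 for v in range(n)]
             for e in range(m)]                                     # H^T Dv^-1/2
    out = matmul(matmul(matmul(left, right), X), theta)
    return [[leaky_relu(x, slope) for x in row] for row in out]


# One hyperedge joining all three vertices: the layer averages their features.
H = incidence([[0, 1, 2]], n_vertices=3)
out = hypergraph_conv(X=[[3.0], [0.0], [-6.0]], H=H, theta=[[1.0]])
```

In this toy case every vertex receives the mean feature value (-1.0), which Leaky ReLU then scales by the slope.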
To illustrate the effectiveness and accuracy of the Hy-INet model, we carried out comparative experiments using various models with the same post-processing and the same experimental conditions on the same dataset. The experimental results were evaluated by the same standard. The models compared include VGG16 [
The data used in the experiments in this paper are based on the Ice Particle Database in China (ICDC) [
To evaluate the classification results of the Hy-INet model, we used accuracy to assess the overall classification results. Precision, recall, F1, the precision-recall (PR) curve, the average precision (AP) value, the confusion matrix, the receiver operating characteristic (ROC) curve, and the area under the ROC curve (AUC) were used to evaluate the classification results of each category.
Accuracy refers to the ratio between the number of samples correctly classified by the model and the total number of ice crystal images in a given training or test dataset, and it is defined as
where
Precision (P) and recall (R) are calculated from the counts of true negatives (TN), true positives (TP), false negatives (FN), and false positives (FP). The definitions are as follows:
where
where
Furthermore, the macro average values of P, R and F1 are calculated based on the whole result and are defined as follows:
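As a hedged illustration, accuracy and the per-category and macro-averaged metrics can be computed from a confusion matrix using the standard definitions P = TP / (TP + FP), R = TP / (TP + FN) and F1 = 2PR / (P + R). The sketch assumes that the macro F1 is the mean of the per-category F1 values; the function names are hypothetical and the example matrix is invented for demonstration only.

```python
def per_class_metrics(cm):
    """cm[i][j] = number of samples of true class i predicted as class j.

    Returns a list of (precision, recall, f1) tuples, one per class.
    """
    n = len(cm)
    rows = []
    for k in range(n):
        tp = cm[k][k]
        fp = sum(cm[i][k] for i in range(n)) - tp  # predicted k, true class differs
        fn = sum(cm[k]) - tp                       # true k, predicted otherwise
        p = tp / (tp + fp) if tp + fp else 0.0
        r = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * p * r / (p + r) if p + r else 0.0
        rows.append((p, r, f1))
    return rows


def summarize(cm):
    """Return overall accuracy and the macro averages (macro_P, macro_R, macro_F1)."""
    total = sum(map(sum, cm))
    accuracy = sum(cm[k][k] for k in range(len(cm))) / total
    rows = per_class_metrics(cm)
    macro = tuple(sum(col) / len(rows) for col in zip(*rows))
    return accuracy, macro


# Toy two-class confusion matrix (illustrative values, not from the experiments).
cm = [[8, 2],
      [1, 9]]
accuracy, (macro_p, macro_r, macro_f1) = summarize(cm)
```

Because macro averaging weights every category equally, small categories influence the macro scores as much as large ones, which is why these scores are informative on an unbalanced dataset.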
The PR curve plots precision (vertical axis) against recall (horizontal axis). Recall reflects the classifier's ability to cover positive examples, precision reflects how many of the examples predicted as positive are truly positive, and the PR curve reflects the trade-off between the two. The AP value is the area under the PR curve, which is defined as follows:
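One common way to compute this area for a single category (one-vs-rest) is the step-wise sum AP = Σ_n (R_n − R_{n−1}) P_n over the ranked predictions, sketched below. This discretization is an assumption, as other interpolation schemes exist; the sketch also assumes distinct scores and at least one positive sample. The function name is hypothetical.

```python
def average_precision(scores, labels):
    """Step-wise area under the PR curve for one category.

    scores: predicted probability of the category for each sample.
    labels: 1 if the sample belongs to the category, else 0.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n_pos = sum(labels)
    tp = 0
    ap = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            tp += 1
            # Recall rises by 1 / n_pos here; precision at this rank is tp / rank.
            ap += (tp / rank) / n_pos
    return ap


ap_mixed = average_precision([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
ap_perfect = average_precision([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0])
```

In the first toy ranking the two positives are retrieved at ranks 1 and 3, giving precisions of 1 and 2/3 and an AP of 5/6; a perfect ranking gives an AP of 1.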
The ROC curve plots the true positive rate (TPR) against the false positive rate (FPR). The closer the ROC curve is to the point (0,1), the better the classifier. TPR and FPR are defined as follows:
where
The AUC value is the area under the ROC curve, which can be used to visually evaluate the quality of the classifier. The higher the AUC value, the better the performance of the classifier. The AUC definition is as follows:
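A minimal sketch of the ROC points and the trapezoidal AUC for one category (one-vs-rest) is given below, using the standard definitions TPR = TP / (TP + FN) and FPR = FP / (FP + TN). It assumes distinct scores and at least one positive and one negative sample; the function names are hypothetical.

```python
def roc_points(scores, labels):
    """Return the (FPR, TPR) points obtained by lowering the decision threshold
    one sample at a time, starting from (0, 0)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    tp = fp = 0
    points = [(0.0, 0.0)]
    for i in order:
        if labels[i] == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under a piecewise-linear curve via the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


points = roc_points([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
area = auc(points)
```

In this toy example three of the four positive-negative pairs are ranked correctly, so the AUC is 0.75.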
To better analyze the performance of the proposed model and demonstrate its validity, we use the same hyperparameters and dataset partitioning method as those used for the TL-ResNet152 model. We select approximately 20% of the data of each category in the ICDC dataset, a total of 1456 images, as the test set and use the rest as the training set, ensuring that the training set and the test set do not overlap. We also augment the input data: we randomly flip the original image horizontally and randomly sample the image to create an input image of size 224 × 224. In addition, to improve the generalization ability of the model, the dataset is standardized per pixel: the mean pixel value of the corresponding channel is subtracted from each pixel value, and the result is divided by the standard deviation of that channel. In the experiment, we calculate the mean and standard deviation of each color channel in the training set and in the test set to standardize the input data. The RGB channel means in the training set are 0.035, 0.274 and 0.593, and the standard deviations are 0.069, 0.210 and 0.301, respectively. The means in the test set are 0.036, 0.279 and 0.600, and the standard deviations are 0.070, 0.211 and 0.301, respectively.
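The per-channel standardization and the random horizontal flip can be sketched as follows, using the training-set statistics reported above. The sketch assumes pixel values scaled to [0, 1] and a channel-first layout, and the flip probability of 0.5 is an assumed default that the text does not state. The function names are hypothetical.

```python
import random

# Training-set RGB statistics reported in the text.
TRAIN_MEAN = (0.035, 0.274, 0.593)
TRAIN_STD = (0.069, 0.210, 0.301)


def standardize(image, mean=TRAIN_MEAN, std=TRAIN_STD):
    """image[c][h][w] with values in [0, 1]; returns (x - mean_c) / std_c."""
    return [
        [[(v - mean[c]) / std[c] for v in row] for row in channel]
        for c, channel in enumerate(image)
    ]


def random_hflip(image, p=0.5, rng=random):
    """Mirror every row of every channel with probability p (p is an assumption)."""
    if rng.random() < p:
        return [[row[::-1] for row in channel] for channel in image]
    return image


# A 1 x 2 image: the first pixel equals the channel means, the second is one
# standard deviation above them.
image = [[[0.035, 0.104]], [[0.274, 0.484]], [[0.593, 0.894]]]
standardized = standardize(image)
```

After standardization the first pixel maps to 0 in every channel and the second to approximately 1, which is the expected behavior of a zero-mean, unit-variance transform.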
We use the SGD optimizer (stochastic gradient descent) to optimize the parameters and set its momentum to 0.9. In addition, because the amount of training data is small, we initialize the corresponding convolutional layers of our model with the parameters of a ResNet152 model pre-trained on ImageNet to improve performance and learning efficiency. We train for 30 epochs with an initial learning rate of 0.001 and a batch size of 8. The classification accuracy of the model is evaluated on the test set after each training epoch, and we save the model with the highest test accuracy.
We load three classic classification models, VGG16, DenseNet169, and ResNet152, which were also pre-trained on ImageNet. We change the output size of the final fully connected layer to ten to match the ten categories in our dataset. These three models were trained with the same method and on the same training set as the proposed model. For each model, the checkpoint with the highest test-set accuracy is saved during training for analysis and comparison.
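The optimizer update and the checkpoint selection can be illustrated with the sketch below. It assumes the common momentum formulation v ← μv + g, p ← p − ηv (other libraries scale the terms differently), and the function names and the accuracy values in the example are hypothetical.

```python
def sgd_momentum_step(params, grads, velocity, lr=0.001, momentum=0.9):
    """One in-place SGD-with-momentum update: v <- momentum * v + g, p <- p - lr * v."""
    for i, g in enumerate(grads):
        velocity[i] = momentum * velocity[i] + g
        params[i] -= lr * velocity[i]
    return params, velocity


def best_epoch(test_accuracies):
    """Index of the epoch with the highest test accuracy (first one on ties)."""
    return max(range(len(test_accuracies)), key=test_accuracies.__getitem__)


# Two updates of a single parameter with a constant gradient of 2.0.
params, velocity = [1.0], [0.0]
sgd_momentum_step(params, [2.0], velocity)  # v = 2.0, p = 0.998
sgd_momentum_step(params, [2.0], velocity)  # v = 3.8, p = 0.9942

# Illustrative accuracy history (made-up values) for checkpoint selection.
chosen = best_epoch([0.90, 0.95, 0.93])
```

With a constant gradient the velocity grows from 2.0 to 3.8 across the two steps, showing how momentum accumulates consistent gradient directions.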
We saved Hy-INet, VGG16, DenseNet169 and ResNet152 (which is used in TL-ResNet152 [
We evaluated the accuracy of the four models and the macro average values of precision, recall and F1 on the test set (see
Method | Accuracy | Macro_P | Macro_R | Macro_F1 |
---|---|---|---|---|
VGG16 | 0.9574 | 0.9421 | 0.9559 | 0.9483 |
DenseNet169 | 0.9705 | 0.9723 | 0.9624 | 0.9668 |
ResNet152 | 0.9753 | 0.9703 | 0.9715 | 0.9705 |
In addition, we further evaluated the test results in each category (see
Category | Number | Method | Precision (P) | Recall (R) | F1-score |
---|---|---|---|---|---|
Bud | 200 | VGG16 | 0.9458 | 0.9600 | 0.9529 |
DenseNet169 | 0.9559 | 0.9750 | 0.9653 | ||
ResNet152 | 0.9703 | 0.9751 | |||
Hy-INet | |||||
Cox | 200 | VGG16 | 0.9200 | 0.9485 | |
DenseNet169 | 0.9750 | 0.9750 | 0.9750 | ||
ResNet152 | 0.9657 | 0.9752 | |||
Hy-INet | 0.9752 | ||||
Hoc | 65 | VGG16 | 0.9394 | 0.9538 | 0.9466 |
DenseNet169 | 0.9552 | 0.9697 | |||
ResNet152 | 0.9692 | ||||
Hy-INet | 0.9692 | ||||
Loc | 164 | VGG16 | 0.9390 | 0.9390 | 0.9390 |
DenseNet169 | 0.9756 | 0.9639 | |||
ResNet152 | 0.9266 | 0.9619 | |||
Hy-INet | 0.9371 | ||||
Plt | 128 | VGG16 | 0.9457 | 0.9531 | 0.9494 |
DenseNet169 | 0.9756 | 0.9375 | 0.9562 | ||
ResNet152 | 0.9688 | 0.9841 | |||
Hy-INet | |||||
Ros | 200 | VGG16 | 0.9657 | 0.9752 | |
DenseNet169 | 0.9750 | 0.9848 | |||
ResNet152 | 0.9948 | 0.9650 | 0.9797 | ||
Hy-INet | |||||
Ser | 17 | VGG16 | 0.8000 | 0.8649 | |
DenseNet169 | 0.8824 | 0.9375 | |||
ResNet152 | 0.8889 | 0.9143 | |||
Hy-INet | |||||
Shc | 160 | VGG16 | 0.9375 | 0.9375 | 0.9375 |
DenseNet169 | 0.9565 | 0.9595 | |||
ResNet152 | 0.9868 | 0.9375 | 0.9615 | ||
Hy-INet | 0.9500 | ||||
Sir | 162 | VGG16 | 0.9876 | 0.9815 | 0.9845 |
DenseNet169 | 0.9630 | 0.9781 | |||
ResNet152 | 0.9758 | 0.9847 | |||
Hy-INet | 0.9877 | 0.9877 | |||
Sph | 160 | VGG16 | 0.9814 | 0.9875 | 0.9844 |
DenseNet169 | 0.9636 | 0.9785 | |||
ResNet152 | 0.9750 | 0.9842 | |||
Hy-INet | 0.9875 | 0.9875 |
We also analyzed and compared the PR and ROC curves of the four models (see
In this paper, we propose Hy-INet, a model that can effectively and accurately classify ice crystal particles on a class-imbalanced dataset. The key to this work is the hypergraph convolution module we designed, which improves the model's classification accuracy on small-sample categories. The experimental results show that the hypergraph convolution module learns the feature information of small-sample data well and achieves high precision on the ice crystal classification task. In the future, we will continue this work as follows: (1) We will expand the dataset further to improve the classification performance on ice crystal particles. (2) Based on the ice crystal particle category, we will further calculate the physical properties of clouds, such as cloud liquid water content.