Building on the RetinaNet object detection algorithm, this paper proposes Ghost-RetinaNet, a fast shadow detection method for photovoltaic panels that addresses the extreme target density, large overlap, high cost and poor real-time performance of photovoltaic panel shadow detection. Firstly, a Ghost CSP module based on the Cross Stage Partial (CSP) structure is adopted in the feature extraction network to improve accuracy and detection speed. A recursive feature fusion structure is then applied to the extracted features to enhance the feature information of all objects. The SiLU activation function and CIoU Loss are introduced to increase the learning and generalization ability of the network and to improve the positioning accuracy of bounding box regression, respectively. Finally, to achieve fast detection, the Ghost strategy is used to reduce the model size. Experimental results show that the mean average precision (mAP) of the algorithm reaches 97.17%, the model size is only 8.75 MB and the detection speed reaches 50.8 frames per second (FPS), which meets the real-time speed and accuracy requirements of photovoltaic panel detection in practical environments. The algorithm also provides new research methods and ideas for fault detection in photovoltaic power generation systems.
Nowadays, electricity is extremely important for the development of technology and economy, while the disadvantages of fossil energy are becoming more and more prominent [
The physical characteristics of the photovoltaic module obtained by infrared image, unmanned aerial vehicle electroluminescence image, ultrasonic and other physical means can be adopted to detect the faults in the photovoltaic module. Cubukcu et al. [
Similarly, the fault detection is achieved by estimating theoretical voltage, current and power values and comparing them with the actual measured values, using them or the difference between them as data. Hariharan et al. [
Using the ample information of photovoltaic array current-voltage (I-V) curve and taking these data as input can accurately reflect the characteristics of fault in various situations. Li et al. [
Under different fault conditions, the real-time output voltage and current of photovoltaic array are not the same. The fault diagnosis of photovoltaic array can be carried out by analyzing the variation of time series waveform. The authors of [
The artificial intelligence algorithm is combined with the voltage, current data measured by the sensor in the photovoltaic array or image to establish the mapping relationship between the fault characteristics, the fault location and fault types. Karakose et al. [
Given the complex and changeable installation environment of photovoltaic systems, we looked for a way to judge the shadow problem of a photovoltaic system in real time. Deep learning algorithms have developed rapidly in object detection in recent years, their detection accuracy and speed have been proved in practice, and they have replaced many traditional detection methods, so we choose deep learning to detect photovoltaic shadow. At present, deep learning object detection algorithms are mainly divided into two types. One is the one-stage object detection algorithm, an end-to-end regression-based detection algorithm characterized by fast detection speed. The representative algorithms are YOLO series [
To detect shadows on photovoltaic panels, we improve the RetinaNet algorithm. As a one-stage detection algorithm, RetinaNet is more accurate than many two-stage algorithms. However, it cannot meet the real-time requirement of the photovoltaic panel shadow detection task, its model is too large to deploy in actual scenes, and it performs poorly on photovoltaic panel shadow targets with high density and overlap. Therefore, we propose several innovations.
Compared with the feature extraction network of the original algorithm, we propose a Ghost CSP DenseNet feature extraction network based on the Cross Stage Partial (CSP) structure and the Ghost module, which greatly reduces the model size and increases the detection speed. In feature fusion, we adopt the Ghost module and a recursive feature fusion mechanism and adjust the number of original feature layers to produce a three-scale network output, which improves the detection speed and feature expression ability of the network. SiLU and CIoU Loss are adopted as the activation function and the regression loss function, respectively. SiLU retains the speed of ReLU while improving the learning capability of the network. Compared with the smooth L1 loss, CIoU Loss improves the prediction accuracy and convergence speed of the network.
To suit photovoltaic panel shadow detection in real environments, the following improvements are made to the RetinaNet algorithm in this paper. Firstly, to improve detection speed and accuracy, the feature extraction network is redesigned with the CSP structure and the Ghost module, and also includes Focus and SPP structures. Secondly, the feature fusion structure uses the Ghost module and the recursive feature fusion mechanism to replace the top-down feature fusion network of the original network, and the network parameters are adjusted to enhance the expression of all object features. Thirdly, the ReLU activation function is replaced with the SiLU activation function to improve the learning ability and robustness of the network. Fourthly, CIoU Loss replaces the smooth L1 loss, which improves prediction accuracy and convergence speed. The result is a lightweight shadow detection model for photovoltaic panels. The structure of the algorithm in this paper is shown in
Ghost module [
A feature map of
Firstly, the Ghost module uses standard convolution to generate
Although the residual network can solve the vanishing and exploding gradient problems brought about by deepening the network, its heavy computation, very large number of parameters and relatively small gradients lead to slow detection and a large model, which makes practical application difficult. Therefore, we propose a new feature extraction network based on CSP structure [
As shown in
The feedforward and weight update formulas of CSP DenseNet are shown in
In the backbone network, the Focus structure at the bottom performs down-sampling without information loss. Next is Layer1, which starts with a GhostBottleneck_2 operation followed by the GhostCSP-3 module. The feature maps required for P3 and P4 fusion are then obtained through Layer2 and Layer3, which differ from Layer1 only in the number of GhostBottleneck_1 units used in GhostCSP. Finally, in Layer4, a GhostBottleneck_2 operation is performed first, followed by SPP, which enriches the P5 feature information and enlarges the receptive field to produce the required P5 feature map. The number of convolution channels is 16 in Focus and 24, 40, 80 and 160 in Layer1 to Layer4, with exp values of 36, 90, 240 and 480, respectively.
The backbone network of Ghost-RetinaNet algorithm is shown in
The FPN feature fusion mechanism of RetinaNet is top-down. This fusion method mainly enriches the semantic information of shallow feature maps. High-level feature maps, although rich in semantic information, are relatively poor in location information, and FPN does not address this. Therefore, recursive feature fusion is adopted in this paper, in which bottom-up feature fusion is performed on top of FPN. The advantages of the method are as follows. Firstly, it shortens the transmission path of feature information and lets high-level features make use of the positioning information in low-level features. Secondly, each anchor can draw on the feature information of the whole feature pyramid. Thirdly, it adds sources of feature information, enriching all feature layers. For the dense and overlapping objects in this paper, the enhanced semantic and location information greatly improves the feature expression ability of the algorithm.
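The two-pass idea described above can be illustrated with a minimal Python sketch. It is not the network used in this paper: the helper names are hypothetical, feature maps are plain single-channel 2-D lists, fusion is assumed to be element-wise addition, and nearest-neighbour upsampling and stride-2 subsampling stand in for the learned Ghost and convolution layers.

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2-D feature map (list of rows)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def downsample2x(fmap):
    """Stride-2 subsampling, standing in for a stride-2 convolution."""
    return [row[::2] for row in fmap[::2]]


def add(a, b):
    """Element-wise sum of two feature maps of equal size (assumed fusion op)."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def recursive_fusion(p3, p4, p5):
    # Top-down pass, as in FPN: semantic information flows to shallow maps.
    t5 = p5
    t4 = add(p4, upsample2x(t5))
    t3 = add(p3, upsample2x(t4))
    # Bottom-up pass: location information flows back to the deep maps.
    n3 = t3
    n4 = add(t4, downsample2x(n3))
    n5 = add(t5, downsample2x(n4))
    return n3, n4, n5


def constant_map(size, value):
    return [[value] * size for _ in range(size)]


# Toy three-scale pyramid: each level carries a distinct constant so that
# the contribution of every level can be traced in the fused outputs.
p3, p4, p5 = constant_map(8, 1.0), constant_map(4, 10.0), constant_map(2, 100.0)
n3, n4, n5 = recursive_fusion(p3, p4, p5)
print(len(n3), len(n4), len(n5))
print(n3[0][0], n4[0][0], n5[0][0])
```

In this toy run every output level contains contributions from all three input levels, which is the property the second advantage refers to.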
The recursive feature fusion network structure of Ghost-RetinaNet is shown in
The ReLU activation function sets all negative inputs to zero and passes positive inputs through unchanged. Its advantage is fast convergence and computation. However, for negative inputs both the output and the derivative are always zero, so the corresponding network parameters are no longer updated. Thus, the ReLU activation function limits the learning ability of the network.
The SiLU activation function adopted in this paper has four properties: it is bounded below, unbounded above, smooth and nonmonotonic. The lower bound can enhance the regularization effect of the network. The absence of an upper bound helps the network avoid vanishing gradients caused by saturation. Smoothness improves the generalization ability of the network, makes optimization easier and avoids the problems caused by the non-differentiability of ReLU at the origin. Nonmonotonicity allows some small negative values to be retained, which enhances network interpretability and improves gradient flow.
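These properties can be checked numerically. The sketch below uses the standard definition SiLU(x) = x · sigmoid(x) and compares it with ReLU on a few scalar inputs; the function names are illustrative and the code is independent of the training framework used in this paper.

```python
import math


def relu(x):
    return max(0.0, x)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def silu(x):
    # Standard definition: SiLU(x) = x * sigmoid(x).
    return x * sigmoid(x)


def silu_grad(x):
    # d/dx [x * s(x)] = s(x) + x * s(x) * (1 - s(x)).
    s = sigmoid(x)
    return s * (1.0 + x * (1.0 - s))


for x in (-5.0, -1.0, 0.0, 1.0, 5.0):
    print(f"x={x:+.1f}  relu={relu(x):.4f}  "
          f"silu={silu(x):+.4f}  silu_grad={silu_grad(x):+.4f}")
```

For negative inputs ReLU returns zero with zero gradient, whereas SiLU returns a small negative value with a nonzero gradient, and SiLU(-5) is larger than SiLU(-1), which shows the nonmonotonic behaviour.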
The SiLU activation function suppresses negative values rather than setting them all directly to zero, which improves the learning ability of the network and avoids dead neurons. Its mathematical expression is shown in
Ghost-RetinaNet algorithm uses CIoU Loss [
Although the smooth L1 loss improves on the L1 loss, it computes the loss of the four bounding box coordinates separately and sums them to obtain the final regression loss, which implicitly assumes that the four coordinates are independent of each other. However, the evaluation index for bounding box detection is intersection over union (IoU), and boxes with the same smooth L1 loss can have very different IoU values, so the smooth L1 loss is poorly matched to the evaluation index. Therefore, CIoU Loss is adopted as the bounding box regression loss function, and its mathematical expression is shown in
The loss of the algorithm in this paper consists of regression CIoU Loss and classification Focal Loss. The total loss is the sum of Focal Loss and one-quarter CIoU Loss.
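As an illustration of how the two terms combine, the following sketch computes the CIoU loss in its commonly published form (IoU term, normalized centre distance and aspect-ratio penalty) for a single pair of boxes, then forms the total loss with the one-quarter weight stated above. The `(x1, y1, x2, y2)` box format, the function names and the `eps` stabilizer are assumptions for this example; the focal loss value is simply passed in as a number.

```python
import math


def ciou_loss(box_p, box_g, eps=1e-9):
    """CIoU loss for one predicted and one ground-truth box (x1, y1, x2, y2)."""
    px1, py1, px2, py2 = box_p
    gx1, gy1, gx2, gy2 = box_g
    pw, ph = px2 - px1, py2 - py1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Intersection over union.
    iw = max(0.0, min(px2, gx2) - max(px1, gx1))
    ih = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = iw * ih
    union = pw * ph + gw * gh - inter
    iou = inter / (union + eps)

    # Squared distance between box centres.
    rho2 = (((px1 + px2) - (gx1 + gx2)) ** 2
            + ((py1 + py2) - (gy1 + gy2)) ** 2) / 4.0

    # Squared diagonal of the smallest box enclosing both boxes.
    cw = max(px2, gx2) - min(px1, gx1)
    ch = max(py2, gy2) - min(py1, gy1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4.0 / math.pi ** 2) * (
        math.atan(gw / (gh + eps)) - math.atan(pw / (ph + eps))) ** 2
    alpha = v / (1.0 - iou + v + eps)

    return 1.0 - iou + rho2 / c2 + alpha * v


def total_loss(focal, ciou, ciou_weight=0.25):
    # Total loss as described in the text: Focal Loss + one quarter of CIoU Loss.
    return focal + ciou_weight * ciou


print(ciou_loss((0, 0, 2, 2), (0, 0, 2, 2)))   # identical boxes
print(ciou_loss((0, 0, 2, 2), (1, 1, 3, 3)))   # partially overlapping boxes
print(total_loss(focal=0.4, ciou=0.8))
```

Unlike a per-coordinate loss, every term here depends on the box as a whole, so two predictions with different overlap receive different losses even if their coordinate errors sum to the same value.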
To verify the effectiveness of the proposed algorithm, the experiments are described in terms of the experimental environment, evaluation indexes, dataset construction and processing, model training, result analysis and comparison with other algorithms.
The running and testing environment of all the algorithms in this paper is shown in
OS | CPU | GPU | Python | Tensorflow/keras |
---|---|---|---|---|
Windows 10, 64 bits | i5-10400F, 2.9 GHz | RTX 2070S, 8 GB | 3.6.0 | 1.14/2.1.5 |
Standard evaluation indexes, namely precision, recall, average precision (AP), mean average precision (mAP), frames per second (FPS) and model size, are used to assess algorithm performance objectively, and the proposed algorithm is compared with representative deep learning algorithms to verify its practicality.
Following the binary classification problem, the confusion matrix of the classification results is shown in
Ground truth | Detected as positive | Detected as negative |
---|---|---|
Positive | TP | FN |
Negative | FP | TN |
Precision [
Recall [
The AP is the integral of the precision-recall curve of a certain class under all thresholds, which balances the precision and recall and reflects the comprehensive ability of the algorithm in a certain category, as shown in
The mAP is the average of the AP of all classes, which reflects the overall effect of the algorithm, as shown in
FPS refers to the number of images detected per second, which reflects the detection speed of the algorithm. Model size is the amount of memory occupied by the model, which embodies the requirement of the algorithm for storage space.
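The accuracy indexes above can be sketched in a few lines of Python. Precision and recall follow directly from the confusion matrix counts; for AP the sketch assumes the all-point interpolation commonly used with VOC-style data (area under the monotonically decreasing envelope of the P-R curve), which may differ from the exact integration used in the experiments. Function names and the toy inputs are illustrative.

```python
def precision(tp, fp):
    """Fraction of detections that are correct: TP / (TP + FP)."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Fraction of ground-truth objects that are found: TP / (TP + FN)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def average_precision(recalls, precisions):
    """Area under the P-R curve; recalls must be sorted in increasing order."""
    mrec = [0.0] + list(recalls) + [1.0]
    mpre = [0.0] + list(precisions) + [0.0]
    # Replace each precision by the maximum precision at any higher recall.
    for i in range(len(mpre) - 2, -1, -1):
        mpre[i] = max(mpre[i], mpre[i + 1])
    # Sum the rectangle areas between consecutive recall values.
    return sum((mrec[i] - mrec[i - 1]) * mpre[i] for i in range(1, len(mrec)))


def mean_average_precision(ap_per_class):
    """mAP is the unweighted mean of the per-class AP values."""
    return sum(ap_per_class.values()) / len(ap_per_class)


print(precision(tp=8, fp=2), recall(tp=8, fn=2))
print(average_precision([0.5, 1.0], [1.0, 0.5]))
# Per-class AP values reported for Ghost-RetinaNet in this paper.
print(mean_average_precision({"PVP": 98.37, "PVP_shielding": 95.98}))
```

The last line averages the two per-class AP values reported in this paper and gives 97.175, in line with the reported mAP of 97.17%.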
Photographs of photovoltaic panels with and without shadow in actual applications are obtained in three ways: surveillance video from photovoltaic power stations, drone shooting and manual shooting. Adobe Premiere is used to split the video into individual frames. Large numbers of near-duplicate photos and unusable manually taken pictures are filtered out, and the image pixels are then normalized. The resulting images are labeled with LabelImg. PVP_shielding and PVP stand for photovoltaic panels with and without shadow, respectively. The labeling format is Visual Object Classes (VOC). The dataset contains 8402 images with 52,220 unshaded and 51,269 shaded photovoltaic panel instances. Random scaling, random flipping and random color gamut distortion are applied as online data augmentation to expand the number of samples and to improve the accuracy and generalization ability of the algorithm.
The original anchor is no longer applicable to the photovoltaic panel shadow dataset. Considering the inference speed of the network, we choose an image of 416 × 416 size as the input of the algorithm. Based on this, the anchor of the model is redesigned. For the dataset in this study, the K-means++ clustering results are shown in
As can be seen from
The amount of data in the test set is 10% of the total data set, and the remaining data are automatically divided into a training set and a validation set at a ratio of 9:1 during training. The specific distribution of the dataset is shown in
Class | Train set and validation set number | Test set number |
---|---|---|
PVP | 46203 | 6017 |
PVP_shielding | 46220 | 5049 |
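The image-level split can be reproduced with a short calculation. The sketch below assumes that fractional counts are rounded down, which the text does not state, and the function name is hypothetical. Under that assumption, the 8402 images give a training set whose size matches the 850 iterations per epoch reported for a batch size of 8.

```python
def split_counts(n_images, test_frac=0.1, val_frac=0.1):
    """Hold out the test set first, then split the remainder 9:1 into train/val."""
    n_test = int(n_images * test_frac)      # assumed rounding: floor
    n_rest = n_images - n_test
    n_val = int(n_rest * val_frac)          # 9:1 split of the remaining images
    n_train = n_rest - n_val
    return n_train, n_val, n_test


n_train, n_val, n_test = split_counts(8402)
print(n_train, n_val, n_test)
print(n_train // 8)  # full batches per epoch at batch size 8
```

Note that the table above counts labeled panel instances rather than images, so its per-class test shares need not be exactly 10%.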
Limited by the experimental platform, the batch size is set to 8 and each epoch consists of 850 iterations. The Adam optimizer with a momentum of 0.9 is used, and the initial learning rate is set to 0.0001. The learning rate is adjusted dynamically according to the training loss: it is halved when the loss has not improved for 3 consecutive epochs, and training stops when the loss has not improved for 10 consecutive epochs. During training, the learning rate is adjusted after epochs 56, 68, 73, 79, 82, 87 and 90. The variation of loss is shown in
Testing is conducted on the randomly divided test set. The P-R curve of the detection results of the proposed Ghost-RetinaNet is shown in
The P-R curve is composed of the precision and recall values at all confidence thresholds, and the integral of the curve is the AP. The AP values of PVP and PVP_shielding are 98.37% and 95.98%, respectively.
To verify the effectiveness of the proposed algorithm, it is compared with representative one-stage and two-stage algorithms. The one-stage algorithms are SSD, YOLOv3, YOLOv4, EfficientDet and M2Det. The two-stage algorithms are Faster R-CNN and R-FCN. It is worth mentioning that the default anchors of SSD and Faster R-CNN are not suitable for the dataset, so new anchors are obtained by clustering. All algorithms are run in the same experimental environment on the same photovoltaic panel shadow dataset, and the results are shown in
Algorithms | AP (PVP)/% | AP (PVP_shielding)/% | mAP/% | FPS | Model size (MB) |
---|---|---|---|---|---|
M2Det | 96.89 | 90.21 | 93.55 | 24.8 | 226.66 |
EfficientDet | 86.29 | 39.09 | 62.69 | 30.7 | 16.16 |
SSD-anchor | 63.23 | 52.15 | 57.69 | 42.3 | 100.40 |
R-FCN | 97.89 | 95.47 | 96.68 | 25.0 | 194.42 |
Faster R-CNN-anchor | 97.68 | 94.42 | 96.05 | 18.5 | 301.05 |
YOLOv3 | 95.81 | | 97.07 | 34.9 | 246.30 |
YOLOv4 | 96.12 | 98.09 | 97.10 | 30.2 | 266.30 |
RetinaNet | 96.92 | 93.52 | 95.22 | 15.6 | 139.30 |
Ghost-RetinaNet | 98.37 | 95.98 | 97.17 | 50.8 | 8.75 |
As can be seen from
Ghost-RetinaNet is applied to the photovoltaic panel shadow dataset. The AP values of PVP and PVP_shielding are 98.37% and 95.98%, respectively, the mAP reaches 97.17% and the detection speed reaches 50.8 FPS. Compared with RetinaNet, the mAP is improved by 1.95 percentage points and the detection speed increases from 15.6 to 50.8 FPS, roughly 3.3 times as fast. The model is lightweight enough to meet the requirements of actual detection.
The detection effect on the dense photovoltaic panel shadow dataset is shown in
In this paper, we introduce a convolutional neural network into photovoltaic panel state detection and propose the Ghost-RetinaNet algorithm. The proposed algorithm addresses the low detection accuracy and slow speed caused by target density and bounding box overlap in photovoltaic panel shadow detection, and provides a new method for photovoltaic panel fault detection.
The Ghost-RetinaNet algorithm uses a CSP feature extraction network, a recursive feature fusion network, the SiLU activation function, the CIoU Loss function and the Ghost module. These improve the detection speed, accuracy and robustness of the photovoltaic panel shadow detection model and, in particular, greatly reduce the model size. The experimental results show that object positions are predicted more accurately and that detection accuracy is higher. Compared with RetinaNet, the AP values of unshaded and shaded photovoltaic panels increase by 1.45 and 2.46 percentage points, respectively, and the overall mAP improves by 1.95 percentage points. Meanwhile, the model size decreases from 139.3 MB to 8.75 MB, and the detection time of a single image is reduced from 64.1 ms to 19.7 ms. Overall, the proposed algorithm offers a better balance of accuracy, speed and model size than the compared algorithms.
The algorithm presented in this paper performs well on the shadow detection of photovoltaic panels. In principle, it should also be applicable to other panel states such as ash deposition and damage. These states have not been studied in this paper; relevant data can be added to the dataset built in this study for further experiments.
The authors would like to thank Editor-in-Chief, Editors, and anonymous Reviewers for their valuable reviews.