Mammography is a key imaging modality for accurate breast cancer detection. Content-based image retrieval (CBIR) classifies a query mammogram and retrieves similar mammographic images from a database, which helps physicians plan treatment. Retrieving similar images requires a good description of the local features of the input image. Existing methods are inefficient and inaccurate because they analyze local features poorly, so a more efficient digital mammography image retrieval method is needed. This paper proposes a method for reliable retrieval of mammographic images from a database. It removes noise with a Kalman filter, extracts features with the scale-invariant feature transform (SIFT), and classifies them with a Crow Search Optimization-based deep belief network (CSO-DBN). The proposed technique reduces complexity, cost, energy, and time consumption. The model is trained with a deep belief network and then validated. In testing, it performs better than existing techniques. The accuracy of the proposed CSO-DBN is 0.9344, compared with 0.5434 for the support vector machine (SVM), 0.7014 for naïve Bayes (NB), 0.8156 for the Butterfly Optimization Algorithm (BOA), and 0.8852 for Cat Swarm Optimization (CSO).
In medical image processing, large numbers of images are acquired from medical institutions. Managing and accessing these images successfully depends on a set of parameters [
The essential component of a content-based image retrieval (CBIR) system is extracting image features and representing them as a feature vector. Retrieval starts from a query image, for which a feature vector is computed. This query vector is compared with the feature vectors stored in the database, and the system returns the database images whose feature vectors are at minimum distance from, or most closely match, the query vector. Feature extraction therefore plays a vital role in image retrieval [
Many approaches have been proposed, but they remain inaccurate at detecting similar images in a large data set. To improve similarity detection and reduce the average computing time, this paper proposes a classifier that combines the crow search optimization algorithm with a deep belief network (CSO-DBN). Features are extracted using SIFT, and their dimensionality is reduced using the crow search optimization algorithm. The contributions of this work are as follows: (i) retrieval of similar images based on the crow search optimization algorithm; (ii) to improve accuracy, pre-processing with the Kalman filter and feature extraction with the SIFT algorithm; and (iii) retrieval of the similar or actual image using the Euclidean distance metric.
The article is organized as follows: Section 2 reviews traditional works, Section 3 presents the proposed model for image retrieval, Section 4 discusses the experimental outcome, and the last section concludes the work and outlines future directions.
With recent advances in technology and the growing use of multimedia, smartphones, and digital cameras, image data gathered from various areas is stored securely in databases. Retrieving similar images helps physicians diagnose disease in minimal time [
This paper proposed [
This paper [
Author | Database used | Feature extraction | Feature selection | Classifier |
---|---|---|---|---|
Dutta et al. [ | The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO) databases | ---- | ---- | Cox regression analysis |
Chowdhary et al. [ | Mammographic Image Analysis Society (MIAS) | Region of interest (ROI) | ---- | DT (decision |
Prakash Singh et al. [ | MIAS | GLCM, Haralick texture | Principal component analysis (PCA) | Fuzzy C-Means |
Hinton et al. [ | BI-RADS | ---- | PCA | DT (decision |
Lakshmitha et al. [ | MIAS | Extreme learning machine (ELM) | ---- | Deep belief network |
Arora et al. [ | MIAS | Grey-level co-occurrence matrix (GLCM) | Minimum redundancy maximum relevance (mRMR) | ----- |
The proposed CSO-DBN work has two phases, online and offline. The framework of the proposed work is given in
Mammographic images are difficult to interpret for diagnosis, so pre-processing is needed. In this work, pre-processing removes noise and the pectoral muscle. The pectoral muscle is a thick, fan-shaped muscle at the posterior upper margin of the image that appears as a triangular opacity. Removing it reduces bias in the mammographic density estimate and lets the detection technique operate on the specified regions.
The Kalman filter is applied to identify inaccurate values and noise in the mammographic image. The filter takes a mathematical approach that models neighboring data as a linear system with Gaussian errors and updates its estimate continuously, refining the best current estimate from the neighboring values. Each pixel value of the mammographic image is spatially dependent on the values of its neighboring pixels, and its mathematical model is:
where denotes the neighboring pixel range value of the mammographic image, which is used to evaluate the linear sum. Indicates the coordinate value of the image, which represents the noise, and the importance of noise in the image is zero mean when the absolute pixel value of the image is selected. Removal of noises in the mammographic image by adding additive noise and blurred noise. Then the original image is represented by:
Here,
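As an illustration of the idea described above, the following is a minimal sketch of a scalar Kalman filter run along each image row, where each pixel estimate is updated from its left-hand neighbors. It is not the authors' implementation (the paper reports using MATLAB R2018a); the function names and the `process_var` and `meas_var` values are assumptions chosen only for illustration.

```python
def kalman_smooth_row(row, process_var=1e-3, meas_var=0.05):
    """Run a scalar Kalman filter left to right along one image row.

    process_var and meas_var are illustrative assumptions, not values
    taken from the paper.
    """
    estimate = float(row[0])  # initial state: the first observed pixel
    error_var = 1.0           # initial uncertainty of the estimate
    smoothed = []
    for observed in row:
        # Predict: the pixel is assumed to equal its neighbor, and the
        # uncertainty grows by the process variance.
        error_var += process_var
        # Update: blend prediction and observation using the Kalman gain.
        gain = error_var / (error_var + meas_var)
        estimate += gain * (observed - estimate)
        error_var *= 1.0 - gain
        smoothed.append(estimate)
    return smoothed


def kalman_denoise(image, **kwargs):
    """Apply the row-wise filter to every row of a 2-D list of pixels."""
    return [kalman_smooth_row(row, **kwargs) for row in image]
```

A row-wise pass is the simplest way to treat neighboring pixels as a sequence; a full 2-D treatment would also need to account for vertical neighbors.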
Retrieving mammographic images effectively from a large dataset requires removing artifacts. Many mammographic images are affected by artifacts such as labels, scratches, tags, scanning artifacts, and opaque markers. The procedure used in this work to remove label artifacts is given below:
The pectoral muscle in a mammogram is a thick, fan-shaped structure that appears as a triangular opacity. Removing it reduces bias in the mammographic density estimate and aids lesion detection. The procedure for removing the pectoral muscle is given below:
The purpose of feature extraction is to decrease retrieval time over the image dataset while improving the quality and accuracy of the results. Feature extraction derives a subset of attributes from the original attributes. This paper extracts shape features using SIFT. The scale-invariant feature transform (SIFT) is a technique for detecting and describing an image's local features. SIFT features are invariant to scale and rotation and robust to changes in illumination.
where,
For detecting the local minima and maxima of
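For reference, the scale space used by SIFT is commonly written as follows. This is the textbook formulation and is stated here as an assumption: the symbols $L$, $G$, $D$, $I$, $\sigma$, and $k$ are not taken from this paper and may differ from the authors' notation.

```latex
L(x, y, \sigma) = G(x, y, \sigma) * I(x, y), \qquad
G(x, y, \sigma) = \frac{1}{2\pi\sigma^{2}}
    \exp\!\left(-\frac{x^{2} + y^{2}}{2\sigma^{2}}\right)

D(x, y, \sigma) = L(x, y, k\sigma) - L(x, y, \sigma)
```

In this formulation, $I$ is the input image, $*$ denotes convolution, and keypoint candidates are the local minima and maxima of the difference-of-Gaussians function $D$, found by comparing each sample with its 26 neighbors: 8 in the same scale and 9 in each of the scales above and below.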
Crows are intelligent birds that can recognize faces and remember where they store food. Crows in a flock show similar behavior patterns and follow one another to find food. In the optimization algorithm, the crows' search for food corresponds to exploring the search space (the environment) for the best feasible solution (a position in the environment). The best food source is regarded as the global solution, and the quality of a food source is given by the fitness function. The crow search optimization algorithm is governed by two main factors: diversification and intensification. The parameter that balances these two factors is the awareness probability (AP). Diversification visits unexplored areas of the search space, while intensification searches the best region found so far for the best solution.
To apply the algorithm to the dataset, each crow must be encoded. Each crow is encoded as a sequence of values. For m data points grouped into C clusters, a single crow is the string formed by concatenating the cluster centers. If the data has d dimensions, each crow has length C × d. The initial population is generated randomly, and each member represents a vector of cluster centers. It can be depicted in
The crow search optimization pseudocode above calculates the fitness function of each crow. Two crows are selected and their fitness is evaluated. The result is compared with the awareness probability (AP), and if it is higher, the new position of the crow is generated using:
If the awareness probability (AP) is low, the crow is aware of its follower and, to fool follower crow i, a random position is chosen. The new position of the crow is then checked and its position is updated.
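The search loop described above can be sketched as follows. This is a minimal, generic version of the crow search algorithm in its commonly published form, not the authors' code: the update rule `x + r * flight_length * (memory_j - x)`, the function name, and the default values of `awareness_prob` and `flight_length` are assumptions for illustration, and the fitness function is supplied by the caller.

```python
import random


def crow_search(fitness, dim, bounds, n_crows=20, n_iter=100,
                awareness_prob=0.1, flight_length=2.0, seed=0):
    """Minimize `fitness` with a basic crow search (illustrative sketch)."""
    rng = random.Random(seed)
    low, high = bounds

    def random_position():
        return [rng.uniform(low, high) for _ in range(dim)]

    positions = [random_position() for _ in range(n_crows)]
    memory = [p[:] for p in positions]          # best position each crow remembers
    memory_fit = [fitness(p) for p in memory]
    history = [min(memory_fit)]                 # best fitness per iteration

    for _ in range(n_iter):
        for i in range(n_crows):
            j = rng.randrange(n_crows)          # crow i follows a random crow j
            if rng.random() >= awareness_prob:
                # Crow j is unaware: move toward its remembered food source.
                r = rng.random()
                candidate = [x + r * flight_length * (m - x)
                             for x, m in zip(positions[i], memory[j])]
            else:
                # Crow j is aware: crow i is fooled into a random position.
                candidate = random_position()
            # Accept the move only if it stays inside the search space.
            if all(low <= v <= high for v in candidate):
                positions[i] = candidate
                fit = fitness(candidate)
                if fit < memory_fit[i]:         # update memory on improvement
                    memory[i], memory_fit[i] = candidate[:], fit
        history.append(min(memory_fit))

    best = min(range(n_crows), key=memory_fit.__getitem__)
    return memory[best], memory_fit[best], history
```

For feature selection or cluster-center search, `fitness` would score a candidate encoding; for example, `crow_search(lambda v: sum(x * x for x in v), dim=3, bounds=(-5, 5))` minimizes a simple sphere function.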
A DBN is built from layers with undirected connections, known as restricted Boltzmann machines (RBMs). The DBN stacks several RBMs, and the network is trained with an unsupervised training process. In this proposed work, the DBN contains one visible layer and multiple hidden layers. The visible nodes are, and the hidden layer nodes are in the visible layer. The features of the visible and hidden layer are and. The bias of visible nodes is, and the biases of the hidden nodes are. In an RBM, connections are restricted: they exist only between the visible layer and the hidden layer, not within a layer. To transmit the input data to the hidden layer, the RBM layer communicates with previous and subsequent layers [
In the
First, the biases and synaptic weights of all neurons in the RBM are initialized. Training the input neurons in the visible layer consists of a positive and a negative phase. The positive phase transforms the data from the visible layer to the hidden layer, and the negative phase converts the data from the hidden layer back to the visible layer. The activation function for the positive and negative phases is evaluated using
Compared with the standard DBN model, the proposed work optimizes the weights and parameter values until the maximum number of epochs is reached. During training, all parameter values are optimized using
where,
The process described above trains one RBM. The same process is repeated until all RBMs are trained. Classifying the features of the mammographic image using crow search optimization with a deep belief network makes detecting mammographic images in a large data set efficient.
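The positive and negative phases for a single RBM can be sketched as follows. This is a minimal example of standard one-step contrastive divergence (CD-1) with sigmoid activations, which is an assumption: it does not include the crow search weight optimization of the proposed CSO-DBN, and the class name `TinyRBM`, the learning rate, and the weight initialization are hypothetical.

```python
import math
import random


def sigmoid(x):
    x = max(-60.0, min(60.0, x))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-x))


class TinyRBM:
    """One restricted Boltzmann machine trained with CD-1 (sketch)."""

    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = random.Random(seed)
        self.w = [[self.rng.gauss(0.0, 0.1) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.b_vis = [0.0] * n_visible
        self.b_hid = [0.0] * n_hidden

    def hidden_probs(self, v):
        # Positive direction: visible layer -> hidden layer.
        return [sigmoid(self.b_hid[j] + sum(v[i] * self.w[i][j]
                                            for i in range(len(v))))
                for j in range(len(self.b_hid))]

    def visible_probs(self, h):
        # Negative direction: hidden layer -> visible layer.
        return [sigmoid(self.b_vis[i] + sum(h[j] * self.w[i][j]
                                            for j in range(len(h))))
                for i in range(len(self.b_vis))]

    def cd1_step(self, v0, lr=0.1):
        h0 = self.hidden_probs(v0)                       # positive phase
        h_sample = [1.0 if self.rng.random() < p else 0.0 for p in h0]
        v1 = self.visible_probs(h_sample)                # negative phase
        h1 = self.hidden_probs(v1)
        for i in range(len(v0)):
            for j in range(len(h0)):
                self.w[i][j] += lr * (v0[i] * h0[j] - v1[i] * h1[j])
        for i in range(len(v0)):
            self.b_vis[i] += lr * (v0[i] - v1[i])
        for j in range(len(h0)):
            self.b_hid[j] += lr * (h0[j] - h1[j])
        # Squared reconstruction error, useful for monitoring training.
        return sum((a - b) ** 2 for a, b in zip(v0, v1))
```

In a DBN, the hidden activations of one trained RBM would serve as the visible input of the next, which matches the layer-by-layer repetition described above.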
The pre-processing step filters noise from the input image and removes the pectoral muscle. These steps make feature extraction and feature classification more accurate. Optimization-based selection picks the relevant and optimal features, which improves accuracy. Overall, the proposed deep belief network is an efficient way to retrieve mammographic images. Some real-time prediction strategy is discussed in the article.
To retrieve images from the large dataset for a query mammographic image, the Euclidean distance metric is used. The formula for the Euclidean distance is:
where,
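A minimal sketch of distance-based retrieval follows. It assumes each database image has already been reduced to a fixed-length feature vector; the function names and the dictionary layout (image identifier mapped to feature vector) are hypothetical and used only for illustration.

```python
import math


def euclidean_distance(a, b):
    """Square root of the summed squared differences between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def retrieve_similar(query_vec, database, top_k=3):
    """Return the top_k (image_id, distance) pairs closest to the query.

    `database` maps an image identifier to its stored feature vector.
    """
    scored = sorted(
        (euclidean_distance(query_vec, vec), image_id)
        for image_id, vec in database.items()
    )
    return [(image_id, dist) for dist, image_id in scored[:top_k]]
```

Smaller distances indicate more similar images, so the first entry of the returned list is the best match for the query.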
The extraction and classification techniques are implemented in MATLAB R2018a. The data for this work come from two publicly available datasets: Mammographic Image Analysis Society (MIAS)/Mini-MIAS and Digital Database for Screening Mammography (DDSM)/CBIS-DDSM. The MIAS database was digitized at a 50-micron pixel edge, then reduced to a 200-micron pixel edge, and each image was clipped to a fixed pixel size. The CBIS-DDSM dataset is provided in 16-bit DICOM format with a resolution of 3131 × 5295 pixels.
In
These performance metrics are computed to assess how effectively the proposed CSO-DBN retrieves similar images from the extensive data set. The proposed work is compared with the existing SVM, naïve Bayes (NB) classifier, butterfly optimization algorithm (BOA), and crow search optimization (CSO) algorithms.
Sensitivity is a statistical performance metric, also called the true positive (TP) rate. It is the proportion of similar mammographic images that are correctly recognized in the data set. Specificity, also termed the true negative (TN) rate, is the proportion of dissimilar mammographic images that are correctly recognized. Accuracy is the proportion of mammographic images that are categorized correctly.
Precision is also called the positive predictive value (PPV). It is the proportion of true positives among all positive predictions, evaluated using
It evaluates true negatives for all negative values by using
The F-score combines recall and precision into a single value; its maximum is 1 and its minimum is 0. The Matthews correlation coefficient (MCC) takes values between −1 and +1.
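The metrics described above can be computed from the four confusion-matrix counts, as in the sketch below. It uses the standard definitions of each metric; the function name and the example counts in the usage note are hypothetical and are not results from this paper.

```python
import math


def retrieval_metrics(tp, tn, fp, fn):
    """Standard metrics from true/false positive and negative counts.

    Assumes every denominator except the MCC one is nonzero.
    """
    sensitivity = tp / (tp + fn)            # true positive rate (recall)
    specificity = tn / (tn + fp)            # true negative rate
    precision = tp / (tp + fp)              # positive predictive value
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    f_score = 2 * precision * sensitivity / (precision + sensitivity)
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_den if mcc_den else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f_score": f_score,
        "mcc": mcc,
    }
```

For example, with 40 true positives, 45 true negatives, 5 false positives, and 10 false negatives, the sensitivity is 0.80, the specificity is 0.90, and the accuracy is 0.85.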
Algorithm | Sensitivity | Specificity |
---|---|---|
SVM | 81.40% | 77.20% |
NB | 84.20% | 89.50% |
BOA | 88.81% | 90.21% |
CSO | 91.68% | 90.54% |
CSO-DBN (Proposed) | 96.80% | 91.70% |
From the
From the
From the
Algorithm | Precision | Recall | F-Score |
---|---|---|---|
SVM | 72.61% | 81.56% | 77.11% |
NB | 81.15% | 82.25% | 84.43% |
BOA | 84.35% | 88.67% | 84.45% |
CSO | 82.78% | 86.56% | 87.88% |
CSO-DBN (Proposed) | 86.32% | 91.62% | 95.34% |
The proposed CSO-DBN achieves the best precision, 86.32%, and the best recall, 91.62%, compared with SVM, NB, BOA, and CSO. It also outperforms them with an F-score of 95.34%. After the Kalman filter is applied to remove noise from the mammographic image, the peak signal-to-noise ratio (PSNR) is evaluated to assess the quality of the image using:
where M and N denote the number of rows and columns, respectively.
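A minimal sketch of the PSNR computation follows, using the common definition based on the mean squared error over the M rows and N columns. The function name and the default `max_value=255.0` (8-bit images) are assumptions; for 16-bit data such as CBIS-DDSM, the maximum value would be 65535.

```python
import math


def psnr(original, processed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    rows, cols = len(original), len(original[0])   # M rows, N columns
    mse = sum(
        (original[i][j] - processed[i][j]) ** 2
        for i in range(rows)
        for j in range(cols)
    ) / (rows * cols)
    if mse == 0:
        return math.inf  # identical images: no noise at all
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher PSNR values indicate that the filtered image is closer to the reference image.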
Observation of
From the
Metric | SVM | NB | BOA | CSO | CSO-DBN (Proposed) |
---|---|---|---|---|---|
Mean | 0.3633 | 0.4267 | 0.2537 | 0.2512 | 0.2473 |
Standard deviation | 0.0276 | 0.0487 | 0.0183 | 0.0165 | 0.0137 |
Best fitness | 0.1346 | 0.1789 | 0.1065 | 0.1085 | 0.1036 |
Worst fitness | 0.2859 | 0.2421 | 0.2112 | 0.2134 | 0.2103 |
Average fitness | 0.2378 | 0.2518 | 0.2034 | 0.2168 | 0.2015 |
The results of the proposed CSO-DBN algorithm in this
This paper presented digital mammography image retrieval using optimized feature selection with a classifier. Data were collected from the publicly available MIAS/Mini-MIAS and Digital Database for Screening Mammography (DDSM)/CBIS-DDSM datasets. In the pre-processing phase, the Kalman filter removes noise, and the SIFT algorithm performs feature extraction. The most relevant features are selected using the crow search optimization algorithm and classified using a deep belief network, which yields accurate and efficient retrieval of mammographic images from a large dataset. The accuracy of the proposed CSO-DBN is 0.9344, compared with 0.5434 for SVM, 0.7014 for NB, 0.8156 for BOA, and 0.8852 for CSO. The proposed work also gives better results on the error rate, computation time, MCC, and FRR metrics. In the future, this work may be extended by implementing the classification with other optimization techniques.
The authors received no specific funding for this study.
The authors declare no conflict of interest regarding the publication of the paper.