Skin is the largest organ and the outer enclosure of the integumentary system, which protects the human body from pathogens. Skin cancer is one of the most commonly diagnosed cancers worldwide and is either melanoma or non-melanoma. Melanoma is far more often fatal than non-melanoma skin cancer, but the chances of survival are high when it is diagnosed and treated early. The main aim of this work is to analyze the performance of the Non-Subsampled Bendlet Transform (NSBT) with various classifiers for detecting melanoma from dermoscopic images. NSBT is a multiscale and multidirectional transform based on a second-order shearlet system, which classifies curvature more precisely than other directional representation systems. A two-phase classification is employed using k-Nearest Neighbour (kNN), Naive Bayes (NB), Decision Trees (DT) and Support Vector Machines (SVM). The first phase classifies the images of the PH2 database into normal and abnormal images, and the second phase classifies the abnormal images into benign and malignant. Experimental results show an improvement in classification accuracy, sensitivity and specificity compared with state-of-the-art methods.
Skin performs important functions in the human body, such as sensation, cooling and protection against physical damage. The three main layers of skin are the outer epidermis, the middle dermis and the inner hypodermis. These layers contain three major cell types, namely squamous cells, basal cells and melanocytic cells, as shown in
Sometimes these skin cells multiply uncontrollably and form a large mass of cells called skin cancer. Melanoma arises from melanocytic cells, which produce the melanin pigment that determines the color of the skin. Detecting melanoma at an early stage is essential to reduce the death rate. According to cancer statistics of 2021 [
Tan et al. [
Serten et al. [
Short term lesion changes detection [
This article is organized as follows. Section 2 presents the materials and methods used to detect melanoma from dermoscopic images, with a detailed description of the preprocessing method that removes noise and hair. Section 3 discusses the non-subsampled bendlet transform, which extracts anisotropic features from the images. Section 4 illustrates the step-by-step algorithm of the non-subsampled bendlet transform. Section 5 explains the classifiers used to classify the NSBT features. Section 6 compares the performance of NSBT with different classifiers. Finally, Section 7 provides the conclusion.
The work flow of the proposed method is illustrated in
The dermoscopic images contain noise and hair, which have to be removed before the feature extraction step. The noise is mainly salt-and-pepper noise, and both the noise and the hair degrade the overall performance of the system. A non-linear filter is therefore needed that removes both from the dermoscopic images. Here, median filtering is used: the pixels covered by the mask are sorted in ascending order, and the median value replaces the center pixel value.
The preprocessed output is subjected to the non-subsampled bendlet transform, and texture features are extracted from the subbands at various levels and directions. The energy of each subband is calculated and given as input to the classifiers. kNN, NB, DT and SVM classifiers are used to classify the images into normal and abnormal in the first phase, and the abnormal images into benign and malignant in the second phase.
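The sort-and-replace rule of the median filter can be sketched as follows. This is a minimal pure-Python illustration, not the article's implementation: the function name `median_filter`, the 3 × 3 mask size and the edge-clamping border rule are all assumptions, since the text does not state the mask size or how borders are handled.

```python
from statistics import median

def median_filter(image, size=3):
    """Apply a size x size median filter to a 2-D list of grey levels.

    Border pixels are handled by clamping coordinates to the image edge
    (an assumption; the article does not state its border rule).
    """
    rows, cols = len(image), len(image[0])
    half = size // 2
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = []
            for dr in range(-half, half + 1):
                for dc in range(-half, half + 1):
                    rr = min(max(r + dr, 0), rows - 1)
                    cc = min(max(c + dc, 0), cols - 1)
                    window.append(image[rr][cc])
            # median() sorts the window and returns its middle value,
            # which replaces the center pixel.
            out[r][c] = median(window)
    return out

# Toy image: uniform background with one "salt" (255) and one "pepper" (0) pixel.
noisy = [
    [10, 10, 10, 10],
    [10, 255, 10, 10],
    [10, 10, 0, 10],
    [10, 10, 10, 10],
]
clean = median_filter(noisy)
```

In this toy case both outlier pixels are replaced by the background value, because each 3 × 3 window contains at most one outlier of each kind.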
Dermoscopic images consist of regions separated by lines and curves, and these curves provide information about the image features. The existing directional representation systems such as curvelet [
where j is the scale, θ is the orientation, and k1, k2 give the location of the curvelets.
Contourlets are an extension of curvelets and are constructed by combining a Laplacian pyramid with directional filter banks to capture anisotropic features. The Laplacian pyramid captures the discontinuities, and the directional filter banks then link point singularities into linear structures. The contourlet transform is shift variant because of the subsampling in both stages. Shearlets are a multilevel and multidirectional transform constructed by applying parabolic scaling, shearing and translation to a few generating functions.
where,
The scaling matrix are given as
The above mentioned transforms are based on scale, shearing and translation whereas in Bendlet system [
NSBT is a type of second-order shearlet system that captures anisotropic features such as edges, curves and other discontinuities in images more precisely and classifies them accurately. Its bending parameter gives the system elements a quadratic dependence on the spatial coordinates, and it uses alpha scaling instead of parabolic scaling. With alpha scaling, it gives the location, orientation and curvature of the system elements. The alpha scaling is given by
where a > 0 and α ∈ [0, 1]
The value of α decides the type of scaling, and values in the range 0 < α < 1/2 are used to extract the curves from the dermoscopic images.
α = 0 corresponds to directional scaling, as used in ridgelets.
α = 0.5 corresponds to parabolic scaling, as used in curvelets and shearlets.
α = 1 corresponds to isotropic scaling, as used in wavelets.
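For reference, the α-scaling matrix in the α-shearlet and bendlet literature is usually written in the form below. This is a reconstruction consistent with the three special cases listed above. The symbol A_{a,α} is an assumption and may differ from the article's own notation.

```latex
A_{a,\alpha} =
\begin{pmatrix}
a & 0 \\
0 & a^{\alpha}
\end{pmatrix},
\qquad a > 0,\quad \alpha \in [0, 1]
```

With α = 0 only one axis is scaled (directional), with α = 1/2 the second axis scales as the square root of the first (parabolic), and with α = 1 both axes are scaled equally (isotropic).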
The shearing matrix is given as
Here, s is an integer
The
When,
The bendlet system is represented by the equation as
where a, s, b and t represent scale, shear, bend and location, respectively.
The cone-adapted NSBT system is expressed as
The heuristic argument of NSBT system illustrates the following points by Lessig et al. [ The magnitude coefficients of bendlet system will be zero if the point p does not intersect the boundary curve ∂ The point p intersects the boundary curve ∂ The point p intersects the boundary curve ∂ The point p intersects the boundary curve ∂
The decay rates are given in the
Conditions | Decay rates |
---|---|
In recent years, various directional representation systems have been used to extract curves from dermoscopic images for classification, but their drawback is that they cannot precisely classify and characterize the curvature. The proposed NSBT not only extracts the anisotropic features but also classifies them effectively, improving the classification accuracy. The main steps of the proposed NSBT are as follows:
Step 1: The pre-processed image of size 768 × 560 is resized to 256 × 256, as shown in
Step 2: Initialize the parameter values for the cone-adapted bendlet transform with α = 0.33 and obtain the coefficients at various levels and directions. To obtain the coefficients for Level 1 and Direction 2, assign the values for scale, shear and bend as given below:
Cone = 1:2
Scale = 1 to nScale [1]
Shear = −nShear to nShear [−1 0 1]
Bend = −nBend to nBend [−1 0 1]
Step 3: Design the shorter low pass band based on the scale and alpha values. The frequency response of the low pass filter is shown in
Step 4: Generate the other wavelet, depending on scale, using the Daubechies wavelet db8, as given in
Step 5: Multiply the wavelets obtained in Steps 3 and 4 and apply zero padding to get the high pass bands, as shown in
Step 6: Obtain the directional components for the various parameter values of shearing and bending, as shown in
Step 7: Finally, the bendlet system that precisely captures the curvature at Level 1 and Direction 2 is obtained; the result for the horizontal cone is shown in
Step 8: Generate the sub bands by convolving the input image with the bendlet system. In total, 19 sub bands are obtained at Level 1 and Direction 2, as shown in
Step 9: Repeat Steps 2 to 8 to generate the sub bands for the other levels and directions.
The total number of sub bands generated at each level and directions during NSBT decomposition are given in
No. of directions | Level 1 | Level 2 | Level 3 | Level 4 |
---|---|---|---|---|
2 | 19 | 37 | 55 | 73 |
4 | 31 | 61 | 91 | 121 |
8 | 43 | 85 | 127 | 169 |
16 | 55 | 109 | 163 | 217 |
32 | 67 | 133 | 199 | 265 |
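The sub band counts in the table follow a regular pattern that can be checked with a short sketch. The counting rule used here is inferred from the table and from the Level 1, Direction 2 parameters (2 cones, shear values −1 to 1, bend values −1 to 1, plus one low pass band); it is not stated in the article. In particular, taking nShear = log2(number of directions) and keeping nBend = 1 are assumptions, and the function name `nsbt_subband_count` is hypothetical.

```python
from math import log2

def nsbt_subband_count(levels, directions, n_bend=1, n_cones=2):
    """Count sub bands under an assumed rule inferred from the table.

    Assumption: n_shear = log2(directions), so 2 directions give shear
    values -1..1, 4 directions give -2..2, and so on.
    """
    n_shear = int(log2(directions))
    shears = 2 * n_shear + 1      # shear values -n_shear..n_shear
    bends = 2 * n_bend + 1        # bend values -n_bend..n_bend
    # One directional band per (level, cone, shear, bend), plus one low pass band.
    return levels * n_cones * shears * bends + 1

# Rebuild the table: one row per number of directions, one column per level.
table = {d: [nsbt_subband_count(lv, d) for lv in range(1, 5)]
         for d in (2, 4, 8, 16, 32)}
```

Under these assumptions the sketch gives 19 sub bands for Level 1 and Direction 2 and matches every entry of the table.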
where R is the number of rows and C is the number of columns of the sub band.
Here R and C are both 256. The mean of the magnitudes of the coefficients in each sub band (sub image) gives its energy feature. The energy features extracted from the PH2 dataset are given as input to the classifier stage.
The detection of melanoma from the PH2 dataset is carried out in two independent classification phases. In the first phase, 120 abnormal and 80 normal images were used, and in the second phase, 80 benign and 40 malignant images were used for training. kNN, NB, DT and SVM classifiers were used to classify the images into their corresponding categories.
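The energy feature described above, the mean magnitude over an R × C sub band, can be sketched as follows. The function names `subband_energy` and `energy_features` are hypothetical, and sub bands are represented as plain nested lists for illustration only.

```python
def subband_energy(subband):
    """Mean of the coefficient magnitudes of one R x C sub band."""
    rows, cols = len(subband), len(subband[0])
    total = sum(abs(value) for row in subband for value in row)
    return total / (rows * cols)

def energy_features(subbands):
    """One energy value per sub band, forming the feature vector."""
    return [subband_energy(sb) for sb in subbands]

# Toy 2 x 2 sub bands (real sub bands would be 256 x 256).
features = energy_features([
    [[1, -2], [3, -4]],
    [[0, 0], [0, 8]],
])
```

Because `abs` also returns the modulus of complex numbers, the same sketch works for complex-valued coefficients.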
kNN is a supervised machine learning algorithm, also called a lazy learning algorithm. It works by calculating the Euclidean distance between feature vectors. For two vectors u = (x1, x2) and v = (y1, y2), the distance between them is calculated by the equation
Here the
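The kNN rule with Euclidean distance can be sketched as follows. For u = (x1, x2) and v = (y1, y2) the Euclidean distance is sqrt((x1 − y1)² + (x2 − y2)²), which `math.dist` computes. The function name `knn_predict`, the choice k = 3 and the toy feature values are assumptions made for illustration and are not taken from the article.

```python
from collections import Counter
from math import dist  # Euclidean distance, available from Python 3.8

def knn_predict(train_x, train_y, query, k=3):
    """Label a query vector by majority vote among its k nearest neighbours."""
    order = sorted(range(len(train_x)), key=lambda i: dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Hypothetical two-dimensional energy features.
train_x = [(0.10, 0.20), (0.20, 0.10), (0.15, 0.15), (0.90, 0.80), (0.80, 0.90)]
train_y = ["normal", "normal", "normal", "abnormal", "abnormal"]
label = knn_predict(train_x, train_y, (0.85, 0.85), k=3)
```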
Naive Bayes is a probabilistic classifier that applies Bayes' theorem to classify the images. It assumes that the occurrence of a feature in a class is unrelated to the presence of any other feature, so each feature contributes independently to the class probability. The Bayes formula is given by
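As a hedged illustration, Bayes' theorem states P(c | x) = P(x | c) P(c) / P(x), and the naive independence assumption lets P(x | c) be written as a product of per-feature likelihoods. The sketch below applies this to made-up numbers. The priors follow the 80 benign and 40 malignant images of the second phase, while the likelihood values and the function name `naive_bayes_posterior` are purely hypothetical.

```python
from math import prod

def naive_bayes_posterior(priors, likelihoods):
    """Posterior P(class | features) under the feature-independence assumption.

    priors:      {class: P(class)}
    likelihoods: {class: [P(feature_i | class), ...]}
    """
    joint = {c: priors[c] * prod(likelihoods[c]) for c in priors}
    evidence = sum(joint.values())          # P(x), the normalizing constant
    return {c: p / evidence for c, p in joint.items()}

priors = {"benign": 80 / 120, "malignant": 40 / 120}
likelihoods = {"benign": [0.2, 0.5], "malignant": [0.6, 0.5]}  # hypothetical
posterior = naive_bayes_posterior(priors, likelihoods)
predicted = max(posterior, key=posterior.get)
```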
Decision Trees are a family of supervised machine learning algorithms that predict the target variable using a tree representation in which each leaf node carries a class label. A decision tree starts at the root node and splits into branch nodes. The feature tested at each node is chosen by an attribute selection measure, and the splitting is repeated until a leaf node is reached. The entropy and Gini index impurity measures are given by
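In their standard form, the entropy of a node is −Σ p_i log2 p_i and the Gini index is 1 − Σ p_i², where p_i is the fraction of samples of class i at the node. The sketch below computes both. It uses these textbook definitions, which may differ in notation from the article's equations, and the function names are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of the class labels at a node."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """Gini index impurity of the class labels at a node."""
    n = len(labels)
    return 1 - sum((c / n) ** 2 for c in Counter(labels).values())

# Class mix of the second phase: 80 benign and 40 malignant images.
node = ["benign"] * 80 + ["malignant"] * 40
node_entropy = entropy(node)
node_gini = gini(node)
```

A pure node has entropy 0 and Gini index 0, and a balanced two-class node has entropy 1 bit and Gini index 0.5.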
A Support Vector Machine classifies the data points by constructing a hyperplane between the classes with maximum margin. Two marginal lines are drawn parallel to the hyperplane, one on each side, and the distance between them should be as large as possible to reduce classification errors. The PH2 dataset contains data points that are not linearly separable by a hyperplane. In this case, SVM kernels are used to map the data points into a higher dimensional space. The Gaussian kernel, also called the radial basis function (RBF) kernel, is used to separate the data points into two classes, as shown in
The radial basis function is given by
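A common form of the RBF kernel is K(x, y) = exp(−γ ‖x − y‖²), with γ = 1/(2σ²) in the Gaussian parameterization. The sketch below uses this form. Which parameterization and which value of γ or σ the article uses is not recoverable from the text, so the default `gamma=1.0` and the function name `rbf_kernel` are assumptions.

```python
from math import exp

def rbf_kernel(x, y, gamma=1.0):
    """RBF (Gaussian) kernel between two feature vectors of equal length."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return exp(-gamma * squared_distance)

same = rbf_kernel((1.0, 2.0), (1.0, 2.0))   # identical points
near = rbf_kernel((0.0, 0.0), (1.0, 0.0))   # unit distance apart
far = rbf_kernel((0.0, 0.0), (3.0, 4.0))    # distance 5 apart
```

The kernel value is 1 for identical points and decays towards 0 as the points move apart, which is what lets the SVM separate classes that are not linearly separable in the original feature space.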
Validation is carried out using k-fold cross validation to assess the performance of the model. The original dataset is partitioned into k subsets, or folds. In the first phase of classification, 200 dermoscopic images were used with k = 10, giving 10 folds of 20 images each. Each fold contains 12 abnormal images and 8 normal images, as shown in
Training is carried out with k − 1 folds and validation with the remaining fold. The experiment is repeated so that each of the ten folds serves once as the validation fold, and the average of the ten accuracies gives the overall accuracy for Phase I classification.
In Phase II classification, the 120 abnormal images are considered, of which 80 are benign and 40 are malignant. These 120 images are divided with k = 10 to give 10 folds of 12 images each, as shown in
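The fold construction can be sketched as a stratified split that keeps the class proportions equal in every fold. This is a simplified illustration: the function name `stratified_folds` is hypothetical, and the images are dealt into folds in index order without shuffling, whereas the article does not say how images were assigned to folds.

```python
def stratified_folds(labels, k=10):
    """Split sample indices into k folds with equal class proportions."""
    folds = [[] for _ in range(k)]
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    # Deal the indices of each class round-robin across the folds.
    for indices in by_class.values():
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds

phase1_labels = ["abnormal"] * 120 + ["normal"] * 80
phase1_folds = stratified_folds(phase1_labels, k=10)

phase2_labels = ["benign"] * 80 + ["malignant"] * 40
phase2_folds = stratified_folds(phase2_labels, k=10)

# In each round, one fold is held out for validation and the other
# k - 1 folds are used for training.
validation = phase1_folds[0]
training = [i for fold in phase1_folds[1:] for i in fold]
```

With the Phase I class sizes, every fold receives 12 abnormal and 8 normal images, matching the split described above.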
The performance is evaluated using sensitivity, specificity and accuracy, which are computed from four quantities: true positives, true negatives, false positives and false negatives.
True Positive (TP) is the number of abnormal cases correctly classified as abnormal.
True Negative (TN) is the number of normal cases correctly classified as normal.
False Positive (FP) is the number of normal cases misclassified as abnormal.
False Negative (FN) is the number of abnormal cases misclassified as normal.
Sensitivity (Sn) is defined as the ability of the model to predict the true positives out of the actual number of positives tested.
Specificity (Sp) is defined as the ability of the model to predict the true negatives out of the actual number of negatives tested.
Accuracy (Ac) is defined as the ability of the model to predict the test data correctly out of the total number of cases tested.
In each phase of classification, the number of NSBT levels is varied from 1 to 4, and at every level the directional features are extracted for 2 to 32 directions, i.e., in powers of 2.
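The three measures follow from the standard definitions Sn = TP / (TP + FN), Sp = TN / (TN + FP) and Ac = (TP + TN) / (TP + TN + FP + FN). The sketch below evaluates them as percentages. The function name is hypothetical. The example counts are not reported in the article: they are derived here from its 120 abnormal and 80 normal images and its best first-phase percentages.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return sensitivity, specificity and accuracy as percentages."""
    sensitivity = 100 * tp / (tp + fn)
    specificity = 100 * tn / (tn + fp)
    accuracy = 100 * (tp + tn) / (tp + tn + fp + fn)
    return sensitivity, specificity, accuracy

# Derived example: 117 of 120 abnormal and all 80 normal images classified correctly.
sn, sp, ac = classification_metrics(tp=117, tn=80, fp=0, fn=3)
```

These counts give a sensitivity of 97.50%, a specificity of 100% and an accuracy of 98.50%.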
No. of levels | No. of directions | kNN Sn (%) | kNN Sp (%) | kNN Ac (%) | NB Sn (%) | NB Sp (%) | NB Ac (%) | DT Sn (%) | DT Sp (%) | DT Ac (%) | SVM Sn (%) | SVM Sp (%) | SVM Ac (%) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 2 | 62.50 | 68.75 | 66.67 | 72.50 | 73.75 | 73.33 | 77.50 | 78.75 | 78.33 | 85 | 83.75 | 84.17 |
1 | 4 | 67.50 | 72.50 | 70.83 | 75 | 76.25 | 75.83 | 80 | 81.25 | 80.83 | 87.50 | 86.25 | 86.67 |
1 | 8 | 75 | 80 | 78.33 | 80 | 83.75 | 82.50 | 82.50 | 88.75 | 86.67 | 90 | 92.50 | 91.67 |
1 | 16 | 67.50 | 76.25 | 73.33 | 75 | 81.25 | 79.17 | 77.50 | 85 | 82.50 | 85 | 90 | 88.33 |
1 | 32 | 65 | 66.25 | 65.83 | 72.50 | 68.75 | 70 | 77.50 | 73.75 | 75 | 85 | 78.75 | 80.83 |
2 | 2 | 67.50 | 73.75 | 71.67 | 75 | 77.50 | 76.67 | 82.50 | 82.50 | 82.50 | 90 | 87.50 | 88.33 |
2 | 4 | 75 | 81.25 | 79.17 | 80 | 83.75 | 82.50 | 85 | 87.50 | 86.67 | 92.50 | 92.50 | 92.50 |
2 | 8 | 82.50 | 86.25 | 85 | 87.50 | 88.75 | 88.33 | 92.50 | 91.25 | 91.67 | 97.50 | 96.50 | 96.67 |
2 | 16 | 72.50 | 82.50 | 79.17 | 77.50 | 85 | 82.50 | 80 | 87.50 | 85 | 90 | 92.50 | 91.67 |
2 | 32 | 55 | 80 | 71.67 | 65 | 83.75 | 77.50 | 72.50 | 86.25 | 81.67 | 82.50 | 90 | 87.50 |
3 | 2 | 75 | 75 | 75 | 80 | 80 | 80 | 87.50 | 85 | 85.83 | 95 | 88.75 | 90.83 |
3 | 4 | 82.50 | 86.25 | 85 | 87.50 | 88.75 | 88.33 | 92.50 | 92.50 | 92.50 | 95 | 96.25 | 95.83 |
3 | 8 | 87.50 | 93.75 | 91.67 | 95 | 96.25 | 95.83 | 97.50 | 98.75 | 98.33 | 100 | 100 | 100 |
3 | 16 | 75 | 83.75 | 80.83 | 85 | 87.50 | 86.67 | 90 | 90 | 90 | 92.50 | 93.75 | 93.33 |
3 | 32 | 72.50 | 73.75 | 73.33 | 77.50 | 78.75 | 78.33 | 82.50 | 82.50 | 82.50 | 90 | 90 | 90 |
4 | 2 | 62.50 | 80 | 74.17 | 70 | 85 | 80 | 75 | 88.75 | 84.17 | 82.50 | 92.50 | 89.17 |
4 | 4 | 77.50 | 83.75 | 81.67 | 85 | 86.25 | 85.83 | 90 | 91.25 | 90.83 | 92.50 | 95 | 94.17 |
4 | 8 | 85 | 91.25 | 89.17 | 90 | 93.75 | 92.50 | 92.50 | 96.25 | 95 | 97.50 | 97.50 | 97.50 |
4 | 16 | 75 | 85 | 81.67 | 85 | 88.75 | 87.50 | 87.50 | 88.75 | 88.33 | 90 | 93.75 | 92.50 |
4 | 32 | 57.50 | 78.75 | 71.67 | 65 | 85 | 78.33 | 72.50 | 87.50 | 82.50 | 80 | 92.50 | 88.33 |
The performance of NSBT with decision trees is better than with kNN and Naive Bayes, and SVM performs best overall. In the first phase classification, the maximum accuracy of 98.50%, with a sensitivity of 97.50% and a specificity of 100%, is achieved with NSBT and SVM at Level 3 and Direction 8. In the second phase classification, the abnormal cases are further categorized into benign and malignant. The features are extracted at various levels and directions using NSBT and classified with the various classifiers. It is observed from
No. of levels | No. of directions | kNN Sn (%) | kNN Sp (%) | kNN Ac (%) | NB Sn (%) | NB Sp (%) | NB Ac (%) | DT Sn (%) | DT Sp (%) | DT Ac (%) | SVM Sn (%) | SVM Sp (%) | SVM Ac (%) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 2 | 68.33 | 72.50 | 70 | 70.83 | 75 | 72.50 | 73.33 | 78.75 | 75.50 | 75 | 81.25 | 77.50 |
1 | 4 | 70 | 76.25 | 72.5 | 74.17 | 77.50 | 75.50 | 75 | 81.25 | 77.50 | 78.33 | 83.75 | 80.50 |
1 | 8 | 75 | 78.75 | 76.50 | 76.67 | 80 | 78 | 79.07 | 82.50 | 80.50 | 82.50 | 85 | 85 |
1 | 16 | 72.50 | 77.50 | 74.50 | 74.17 | 77.50 | 75.50 | 75.83 | 82.50 | 78.50 | 78.33 | 85 | 81 |
1 | 32 | 65.83 | 75 | 69.50 | 67.50 | 75 | 70.50 | 70 | 81.25 | 74.50 | 71.67 | 83.75 | 76.50 |
2 | 2 | 72.50 | 75 | 73.50 | 74.17 | 75.50 | 75.50 | 77.50 | 81.50 | 79 | 80.83 | 87.50 | 83.50 |
2 | 4 | 75 | 77.50 | 76 | 78.33 | 80 | 79 | 80.83 | 82.50 | 81.50 | 85 | 90 | 87 |
2 | 8 | 79.17 | 81.25 | 80 | 80.33 | 86.25 | 83 | 85.83 | 88.75 | 87 | 90 | 93.75 | 91.50 |
2 | 16 | 76.67 | 78.75 | 77.50 | 79.17 | 82.50 | 80.50 | 80 | 85 | 82 | 83.33 | 88.75 | 85.50 |
2 | 32 | 72.50 | 77.50 | 74.50 | 73.33 | 81.25 | 76.50 | 75 | 85 | 79 | 77.50 | 88.75 | 82 |
3 | 2 | 76.67 | 86.25 | 80.50 | 80.83 | 90 | 84.50 | 83.33 | 92.50 | 87 | 86.67 | 96.25 | 90.50 |
3 | 4 | 81.67 | 88.75 | 84.50 | 83.33 | 92.50 | 87 | 88.33 | 93.75 | 90.50 | 91.67 | 98.75 | 94.50 |
3 | 8 | 84.17 | 91.25 | 87 | 87.50 | 93.75 | 90 | 92.50 | 97.50 | 94.50 | 97.50 | 100 | 98.50 |
3 | 16 | 80.83 | 86.25 | 83 | 84.17 | 90 | 86.50 | 87.50 | 92.50 | 89.50 | 90.83 | 96.25 | 93 |
3 | 32 | 74.17 | 85 | 78.50 | 75 | 87.50 | 80 | 79.17 | 90 | 83.50 | 85 | 93.75 | 88.50 |
4 | 2 | 72.50 | 85 | 77.50 | 75.83 | 87.50 | 80.50 | 77.50 | 90 | 82.50 | 80.83 | 98.75 | 88 |
4 | 4 | 79.17 | 86.25 | 82 | 80.83 | 90 | 84.50 | 86.67 | 92.50 | 89 | 90 | 98.75 | 93.50 |
4 | 8 | 82.50 | 90 | 85.50 | 86.67 | 93.75 | 89.50 | 90 | 95 | 92 | 91.67 | 98.75 | 94.50 |
4 | 16 | 79.17 | 88.75 | 83 | 82.50 | 92.50 | 86.50 | 84.17 | 95 | 88.50 | 86.67 | 98.75 | 91.50 |
4 | 32 | 68.33 | 83.75 | 74.50 | 72.50 | 88.75 | 79 | 75 | 92.50 | 82 | 80 | 97.50 | 87 |
The above results illustrate the effectiveness of using NSBT with different classifiers. A performance comparison of NSBT with the different classifiers is shown in
From the performance comparison in
In this article, a computerized diagnosis method for the identification of melanoma is developed with three modules, namely preprocessing, feature extraction and classification. The anisotropic features are represented by the non-subsampled bendlet transform, and the energies are calculated from the sub bands. The energy features obtained at various levels and directions preserve the necessary information in the image, making them good features for classification. The energy features are classified in two phases using the kNN, NB, DT and SVM classifiers. Experimental results show that NSBT with SVM-RBF provides the best classification results in both phases. The first phase gives a sensitivity of 97.50%, a specificity of 100% and an accuracy of 98.50%, and the second phase gives a sensitivity of 100%, a specificity of 100% and an accuracy of 100%. Thus, the stated objectives are achieved by developing a noninvasive computerized diagnostic method.