Open Access

ARTICLE


IoT-Driven Optimal Lightweight RetinaNet-Based Object Detection for Visually Impaired People

Mesfer Alduhayyem1,*, Mrim M. Alnfiai2,3, Nabil Almalki4, Fahd N. Al-Wesabi5, Anwer Mustafa Hilal6, Manar Ahmed Hamza6

1 Department of Computer Science, College of Sciences and Humanities-Aflaj, Prince Sattam bin Abdulaziz University, Saudi Arabia
2 King Salman Center for Disability Research, Riyadh, Al-Hayāṯim 16273, Saudi Arabia
3 Department of Information Technology, College of Computers and Information Technology, Taif University, P.O. Box 11099, Taif, 21944, Saudi Arabia
4 Department of Special Education, College of Education, King Saud University, Riyadh, 12372, Saudi Arabia
5 Department of Computer Science, College of Science & Arts, King Khaled University, Ar-Riyad 12372, Saudi Arabia
6 Department of Computer and Self Development, Preparatory Year Deanship, Prince Sattam bin Abdulaziz University, AlKharj, Saudi Arabia

* Corresponding Author: Mesfer Alduhayyem. Email: email

Computer Systems Science and Engineering 2023, 46(1), 475-489. https://doi.org/10.32604/csse.2023.034067

Abstract

Visual impairment is one of the major problems among people of all age groups across the globe. Visually Impaired Persons (VIPs) require help from others to carry out their day-to-day tasks. Since they experience several problems in their daily lives, technical intervention can help them resolve these challenges. Against this background, an automatic object detection tool is needed to empower VIPs to navigate safely, and recent advances in the Internet of Things (IoT) and Deep Learning (DL) techniques make this possible. The current study proposes an IoT-assisted Transient Search Optimization with Lightweight RetinaNet-based object detection (TSOLWR-ODVIP) model to help VIPs. The primary aim of the presented TSOLWR-ODVIP technique is to identify different objects surrounding VIPs and to convey the information to them via audio messages. For data acquisition, IoT devices are used in this study. Then, the Lightweight RetinaNet (LWR) model is applied to detect objects accurately. Next, the TSO algorithm is employed for fine-tuning the hyperparameters involved in the LWR model. Finally, the Long Short-Term Memory (LSTM) model is exploited for classifying objects. The performance of the proposed TSOLWR-ODVIP technique was evaluated using a set of objects, and the results were examined under distinct aspects. The comparison study outcomes confirmed that the TSOLWR-ODVIP model could effectually detect and classify objects, enhancing the quality of life of VIPs.

Keywords


1  Introduction

Visual impairment is one of the major health complications that affect humankind across the globe. Vision loss or vision impairment is characterized by the loss of an individual's visual capacity or visual-sensing capability and cannot be corrected by wearing spectacles [1]. In such cases, navigation in unknown locations is highly challenging for Visually Impaired Persons (VIPs), other than within their own homes. Vision impairment occurs for multiple reasons, such as diabetic retinopathy, uncorrected refractive errors, glaucoma, age-related eye problems, trachoma, cataracts, unaddressed presbyopia and corneal opacity [2]. Beyond medical treatment, people use several aids for social inclusion, rehabilitation, education and work. VIPs depend on their auditory perception and somatosensation, basically sound and braille, to obtain information from their environment, and they use assistive gadgets like canes to recognize impediments [3]. However, nearly 28.22% of the global population lives with some form of visual impairment, whereas VIP-accessible facilities are not implemented globally. This leads to social discrimination, since VIPs' activities are constrained. In particular, VIPs cannot independently handle unpredictable circumstances outdoors, which limits their activities and movements both indoors and in their neighborhoods [4].

VIPs face significant difficulties in their day-to-day activities: object detection and recognition, mobility/navigation and safety, translation, and identification of currency and textual information (signs, symbols) [5]. In the literature, various methods, applications, systems and devices have been proposed, developed and validated as assistive technologies to help VIPs execute a task. These solutions usually consist of electronic gadgets equipped with microprocessors, cameras and sensors. Such gadgets can take decisions and offer auditory or tactile feedback to the end-user, i.e., the VIP [6,7]. Though the existing object detection and recognition mechanisms are highly accurate, they lack the essential data and associated attributes needed to track VIPs and ensure their safe movement. Since blind persons do not know what type of objects surround them, such devices are highly helpful.

Moreover, a tracking system should be developed so that VIPs' family members can monitor their movements [8]. Various authors have proposed numerous methods for developing assistive gadgets for VIPs. The prevailing gadgets rely on technologies such as low-energy Bluetooth beacons, vision-based sensors (cameras), non-vision sensors (magnetic, infrared, inertial, ultrasonic, etc.), and many more [9]. In the past, several research works attempted to address the following challenges: identifying the nature and position of obstacles along a travelling path, understanding the environment, raising perception, and easing the movement of VIPs in indoor and outdoor environments. Generally, advanced technologies include functions to choose routes and detect problems automatically [10]. Yet, there is still a demand to design intelligent systems that help in object detection and recognition and enable the secure movement of VIPs.

The current study develops an IoT-assisted Transient Search Optimization with a Lightweight RetinaNet-based object detection (TSOLWR-ODVIP) model to help VIPs. For data acquisition, IoT devices are used in this study. Lightweight RetinaNet (LWR) model is applied to detect objects accurately. Next, the TSO algorithm is employed for fine-tuning the hyperparameters involved in the LWR model. Finally, the Long Short-Term Memory (LSTM) model is exploited for classifying objects. The performance of the proposed TSOLWR-ODVIP technique was validated using a set of objects, and the results were examined under distinct aspects.

2  Related Works

Liu et al. [11] modelled a fuzzy-enabled solution for vision challenges. This study leveraged a fuzzy-aided mechanism to detect targets that are poorly tracked, with the help of the response matrices of the samples. In poor tracking, the target tends to get relocated with respect to the stored template. The solution was tested using the OTB100 dataset, and the experimental outcomes reveal that the auxiliary solution was effective for vision challenges. Alon et al. [12] utilized different gadgets for multiple purposes, such as a Pi camera as a capturing device, a Raspberry Pi 4 as a microcontroller and a speaker to announce an identified bill through audio. EyeBill-PH achieved an overall testing accuracy of 86.3%. Su et al. [13] formulated a finger-worn gadget, called Chinese FingerReader, worn by VIPs to recognize traditional Chinese characters on a micro Internet of Things (IoT) processor. The Chinese FingerReader, installed on the index finger, has small buttons and a camera. The small camera captures images by finding the relative position of the index finger in the printed text. The buttons allow the VIPs to trigger image capture, whereas the audio output of the respective Chinese characters is provided to the VIPs in the form of a voice prompt.

In [14], the authors developed a new indoor object detector with the help of a Deep Convolutional Neural Network (CNN)-based structure. The structure was constructed based on a deep CNN called 'RetinaNet'. The proposed model was validated with several backbones, such as VGGNet, ResNet and DenseNet, to enhance the processing time and detection performance. Bai et al. [15] introduced a wearable assistive device that empowers VIPs to navigate quickly and safely in unfamiliar environments and to recognize objects in both outdoor and indoor settings. A lightweight CNN-based object recognition mechanism was formulated and deployed on a smartphone to increase the perception capability of VIPs and support their navigation. Jiang et al. [16] innovatively used an image quality assessment method to select the images captured via vision sensors, which ensures the quality of the input scenes for the final identification mechanism. Primarily, binocular vision sensors were used to capture images at a fixed frequency and choose the informative ones based on stereo image quality values. The selected images were then forwarded to the cloud for further computation, where a big-data-driven CNN produced the identification results for all images from the earlier stages.

3  The Proposed Object Detection Model

In this study, a new TSOLWR-ODVIP technique has been developed to detect and classify objects for VIPs. The major aim of the presented TSOLWR-ODVIP technique is to identify different objects surrounding the VIP and convey the information via an audio message.

3.1 Object Detection: LWR Model

In this study, the LWR model is applied to detect objects accurately [17]. The RetinaNet structure encompasses a backbone, a Feature Pyramid Network (FPN) and a detection backend, as shown in Fig. 1. Initially, the images are processed by the backbone, which is generally a ResNet structure. It is important to note that although MobileNet is on par with ResNet in terms of classification efficiency, it is not accurate enough to be considered a direct replacement for ResNet. When MobileNet is utilized as the backbone for detection tasks, its confidence scores decrease as the price paid for the lower computation cost, so a MobileNet-based backbone is not the desired choice for a highly accurate object detection network. The backbone and the following FPN together form an encoder-decoder network. The advantage of the FPN is that it combines the characteristics of successive layers from the coarsest to the finest level, so that features at different scales and levels are efficiently transmitted to the subsequent layers. Next, the multi-scale pyramid features (P3-P7) feed into the backend, in which two detection subnetworks are applied for object classification and bounding box regression (a minimal sketch of such a backend is given after Fig. 1).


Figure 1: Structure of RetinaNet
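As an illustration of the detection backend described above, the short PyTorch sketch below applies a shared pair of classification and box-regression subnetworks to a list of pyramid features. It is not the authors' implementation; the channel count, anchor count and layer depths are illustrative assumptions (the eight classes simply mirror the dataset used later in Section 4).

```python
import torch
import torch.nn as nn

class DetectionBackend(nn.Module):
    """Illustrative RetinaNet-style backend: the same pair of small subnets
    (classification and box regression) is applied to every pyramid level P3-P7."""
    def __init__(self, channels: int = 256, num_classes: int = 8, num_anchors: int = 9):
        super().__init__()
        self.cls_head = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, num_anchors * num_classes, 3, padding=1))
        self.box_head = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, num_anchors * 4, 3, padding=1))

    def forward(self, pyramid_features):
        # pyramid_features: list of tensors for P3..P7, each of shape (N, channels, Hi, Wi)
        return [(self.cls_head(p), self.box_head(p)) for p in pyramid_features]

# Example with two hypothetical pyramid levels.
feats = [torch.randn(1, 256, 64, 64), torch.randn(1, 256, 32, 32)]
outputs = DetectionBackend()(feats)
```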

Intuitively, the filter size is reduced to achieve the FLOP reduction. The D-block-v1 adopts the MobileNet structure: a 1 × 1 convolution layer and a 3 × 3 depthwise (DW) convolution layer replace one original layer. The D-block-v2 places 1 × 1 and 3 × 3 kernels consecutively; following the spirit of YOLOv1, the 3 × 3 kernels are replaced without introducing residual blocks, and the number of filters is fixed to a constant value across the layers. The D-block-v3 is the most aggressive because it substitutes each 3 × 3 convolution with a 1 × 1 convolution. In every case, the lightweight block trades a decline in accuracy for lower computational cost. To compensate for the reduced accuracy, the fully-shared weight scheme is replaced with a partially-shared weight scheme in the new RetinaNet model. A partially-shared weight scheme has two primary benefits: the D3 branch keeps independent weight parameters, so it learns additionally tailored characteristics for that branch and can offset the accuracy lost to the reduced computation; and it leaves the remaining network untouched, since the heavy bottleneck blocks are handled within the branch itself.
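The following is a minimal sketch, under stated assumptions rather than the authors' released code, of a D-block-v1-style replacement: one dense 3 × 3 convolution is substituted by a 1 × 1 pointwise convolution followed by a 3 × 3 depthwise convolution, which is where the FLOP reduction comes from. The class name, normalization layer and channel counts are illustrative.

```python
import torch
import torch.nn as nn

class DBlockV1(nn.Module):
    """Illustrative depthwise-separable block in the spirit of D-block-v1:
    a 1x1 pointwise convolution followed by a 3x3 depthwise convolution,
    replacing a single dense 3x3 convolution to cut FLOPs."""
    def __init__(self, in_channels: int, out_channels: int):
        super().__init__()
        self.pointwise = nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=False)
        self.depthwise = nn.Conv2d(out_channels, out_channels, kernel_size=3,
                                   padding=1, groups=out_channels, bias=False)
        self.bn = nn.BatchNorm2d(out_channels)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.pointwise(x)
        x = self.depthwise(x)
        return self.act(self.bn(x))

# Example: a hypothetical P3 pyramid feature map of 256 channels passed through the block.
p3 = torch.randn(1, 256, 64, 64)
block = DBlockV1(in_channels=256, out_channels=256)
print(block(p3).shape)  # torch.Size([1, 256, 64, 64])
```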

3.2 Hyperparameter Tuning: TSO Algorithm

In this stage, the TSO algorithm is employed for fine-tuning the hyperparameters involved in the LWR model [18]. The TSO procedure is modelled as follows: 1) initialize the search agents between the upper and lower limits of the search area; 2) search for the optimal solution (exploration); and 3) converge towards the optimal solution or steady state (exploitation). At first, the positions of the search agents are generated arbitrarily. The exploration behaviour of TSO then mimics the oscillation of a second-order RLC circuit around zero, whereas the exploitation behaviour follows the exponential decay of a first-order discharge. An arbitrary value $r_{1}$ is applied to balance the exploitation ($r_{1} < 0.5$) and exploration ($r_{1} \geq 0.5$) phases. The optimal solution $Y_{l}^{*}$ mimics the final (steady-state) value of the electric circuit, with the constants $B_{1} = B_{2} = \left|Y_{l} - C_{1}\,Y_{l}^{*}\right|$.

$Y_{l} = lb + rand \times (ub - lb)$ (1)

$Y_{l+1} = Y_{l}^{*} + \left(Y_{l} - C_{1}\,Y_{l}^{*}\right) e^{-T}, \quad r_{1} < 0.5$ (2)

$Y_{l+1} = Y_{l}^{*} + e^{-T}\left[\cos(2\pi T) + \sin(2\pi T)\right]\left|Y_{l} - C_{1}\,Y_{l}^{*}\right|, \quad r_{1} \geq 0.5$ (3)

$T = 2 \times z \times r_{2} - z$ (4)

$C_{1} = k \times z \times r_{3} + 1$ (5)

$z = 2 - 2\left(l / L_{max}\right)$ (6)

In these expressions, $lb$ and $ub$ indicate the lower and upper limits respectively, $rand$ denotes a uniformly-distributed arbitrary value, $z$ denotes a parameter that decays from 2 to 0, $T$ and $C_{1}$ denote arbitrary coefficients, $r_{1}$, $r_{2}$ and $r_{3}$ denote uniformly-distributed arbitrary values in $[0, 1]$, $Y_{l}$ indicates the location of the searching agent, $Y_{l}^{*}$ represents the optimal location found so far, $l$ is the iteration index, $k$ refers to a constant ($k = 0, 1, 2$) and $L_{max}$ denotes the maximal iteration count. The balance between the exploration and exploitation phases is governed by the coefficient $T$, which varies in the range $[-2, 2]$: the exploitation procedure is performed when $T > 0$, and the exploration procedure otherwise. It is worth noting that the transient response starts from a high value and decays towards a minimum when $T > 0$, whereas it oscillates around the steady state when $T < 0$. The approach is not complicated and can be applied to update the locations while balancing the exploitation and exploration phases.
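A minimal NumPy sketch of the TSO update described by Eqs. (1)-(6) is given below; it is an illustrative reading of the procedure, not the authors' code. The fitness function, bounds, population size and the boundary clipping are assumptions; in the proposed pipeline the fitness would score a candidate set of LWR hyperparameters (e.g., by validation loss).

```python
import numpy as np

def tso(fitness, lb, ub, n_agents=20, max_iter=100, k=2, seed=0):
    """Minimal Transient Search Optimization sketch following Eqs. (1)-(6).
    `fitness` maps a position vector to a scalar value to be minimized."""
    rng = np.random.default_rng(seed)
    lb, ub = np.asarray(lb, float), np.asarray(ub, float)
    dim = lb.size
    Y = lb + rng.random((n_agents, dim)) * (ub - lb)            # Eq. (1): initialization
    scores = np.array([fitness(y) for y in Y])
    best_idx = scores.argmin()
    best, best_score = Y[best_idx].copy(), scores[best_idx]
    for l in range(max_iter):
        z = 2 - 2 * (l / max_iter)                              # Eq. (6)
        for i in range(n_agents):
            r1, r2, r3 = rng.random(3)
            T = 2 * z * r2 - z                                  # Eq. (4)
            C1 = k * z * r3 + 1                                 # Eq. (5)
            if r1 < 0.5:                                        # exploitation, Eq. (2)
                Y[i] = best + (Y[i] - C1 * best) * np.exp(-T)
            else:                                               # exploration, Eq. (3)
                Y[i] = best + np.exp(-T) * (np.cos(2 * np.pi * T) + np.sin(2 * np.pi * T)) \
                       * np.abs(Y[i] - C1 * best)
            Y[i] = np.clip(Y[i], lb, ub)                        # keep agents inside the bounds
            scores[i] = fitness(Y[i])
            if scores[i] < best_score:
                best, best_score = Y[i].copy(), scores[i]
    return best, best_score

# Toy usage: minimize a quadratic over a 2-D box (stand-in for a validation-loss fitness).
best, score = tso(lambda y: float(np.sum((y - 0.3) ** 2)), lb=[0.0, 0.0], ub=[1.0, 1.0])
```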


3.3 Object Classification: LSTM Model

Finally, the LSTM model is exploited for classifying objects [19]. The Recurrent Neural Network (RNN) is an Artificial Intelligence technique widely applied in real-time applications. A conventional RNN is generally employed to model temporal datasets; however, it struggles with exploding and vanishing gradients during training. The LSTM method was developed to resolve these issues by applying a memory cell to substitute the hidden RNN unit. Fig. 2 shows the architecture of the LSTM mechanism used in this work. The LSTM technique has three major gates: input, forget and output.


Figure 2: LSTM model

The forget gate identifies which information should be forgotten; its sigmoid output lies in the range [0, 1], where a value of 1 means the information is 'completely retained' and 0 means it is 'completely discarded'. Here, $x_{t}$ refers to the input at time $t$, and the present cell state is characterized by $c_{t}$:

$h_{t} = \sigma\left(W x_{t} + U h_{t-1} + b^{(h)}\right)$ (7)

$f_{t} = \sigma\left(W^{(f)} x_{t} + U^{(f)} h_{t-1} + b^{(f)}\right)$ (8)

Here, $x_{t}$ indicates the current training input, whereas the output of the preceding cell is characterized by $h_{t-1}$. The forget gate is specified as $f_{t}$; the crucial parameters of the LSTM are the weights $W^{(f)}$ and $U^{(f)}$, and $b^{(f)}$ denotes the bias. The data is updated at the input gate using two functions, $\sigma$ and $\tanh$: the $\sigma$ function selects which data is required for the update, whereas the $\tanh$ function produces the candidate data for the update.

$i_{t} = \sigma\left(W^{(i)} x_{t} + U^{(i)} h_{t-1} + b^{(i)}\right),$

$m_{t} = \tanh\left(W^{(m)} x_{t} + U^{(m)} h_{t-1} + b^{(m)}\right),$ (9)

$c_{t} = i_{t} \odot m_{t} + f_{t} \odot c_{t-1}.$

The cell state $c_{t-1}$ from the preceding cell is updated to obtain the new cell state $c_{t}$: the information to be discarded is dropped through $f_{t} \odot c_{t-1}$, which is then combined with $i_{t} \odot m_{t}$ to yield the following cell state.

$o_{t} = \sigma\left(W^{(o)} x_{t} + U^{(o)} h_{t-1} + b^{(o)}\right)$ (10)

$h_{t} = o_{t} \odot \tanh(c_{t})$ (11)

In Eqs. (10) and (11), $o_{t}$ denotes the output gate, while the weight matrices of the Neural Network (NN) are signified by $W$ and $U$. The $\sigma$ function decides which data forms the output, whereas $\tanh$ scales the cell state to produce the concluding output.
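To make the gate equations concrete, the sketch below performs a single LSTM step following Eqs. (7)-(11) in NumPy. It is illustrative only; the parameter dictionary layout, dimensions and initialization are assumptions rather than the authors' configuration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM step following Eqs. (7)-(11). `p` holds the weight matrices W*, U*
    and biases b* of the forget (f), input (i), candidate (m) and output (o) gates."""
    f_t = sigmoid(p["Wf"] @ x_t + p["Uf"] @ h_prev + p["bf"])    # forget gate, Eq. (8)
    i_t = sigmoid(p["Wi"] @ x_t + p["Ui"] @ h_prev + p["bi"])    # input gate, Eq. (9)
    m_t = np.tanh(p["Wm"] @ x_t + p["Um"] @ h_prev + p["bm"])    # candidate memory, Eq. (9)
    c_t = i_t * m_t + f_t * c_prev                               # cell state update, Eq. (9)
    o_t = sigmoid(p["Wo"] @ x_t + p["Uo"] @ h_prev + p["bo"])    # output gate, Eq. (10)
    h_t = o_t * np.tanh(c_t)                                     # hidden state, Eq. (11)
    return h_t, c_t

# Toy usage with illustrative dimensions (input size 8, hidden size 16).
rng = np.random.default_rng(0)
n_in, n_hid = 8, 16
p = {}
for g in "fimo":
    p["W" + g] = rng.standard_normal((n_hid, n_in)) * 0.1
    p["U" + g] = rng.standard_normal((n_hid, n_hid)) * 0.1
    p["b" + g] = np.zeros(n_hid)
h, c = np.zeros(n_hid), np.zeros(n_hid)
h, c = lstm_step(rng.standard_normal(n_in), h, c, p)
```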

4  Experimental Validation

The current section experimentally validates the proposed TSOLWR-ODVIP model using a dataset composed of 1,600 samples belonging to eight distinct classes. Table 1 provides the details of the dataset. The proposed model was simulated using Python 3.6.5 on a PC with the following configuration: i5-8600K CPU, GeForce GTX 1050 Ti 4 GB GPU, 16 GB RAM, 250 GB SSD, and 1 TB HDD. The parameter settings are as follows: learning rate 0.01, dropout 0.5, batch size 5, epoch count 50, and ReLU activation.
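For reference, the reported settings can be collected into a small configuration dictionary; only the listed values come from the text above, and anything beyond them (such as the optimizer choice) would be an assumption, so it is omitted here.

```python
# Illustrative training configuration mirroring the reported settings.
train_config = {
    "learning_rate": 0.01,
    "dropout": 0.5,
    "batch_size": 5,
    "epochs": 50,
    "activation": "relu",
}
```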


Fig. 3 illustrates the confusion matrices generated by the TSOLWR-ODVIP model on the entire dataset and on the Training (TR) and Testing (TS) splits. The results demonstrate that the proposed TSOLWR-ODVIP model recognized all the class labels proficiently.


Figure 3: Confusion matrices of TSOLWR-ODVIP model (a) entire dataset, (b) 70% of TR data and (c) 30% of TS data

Table 2 provides the comprehensive object detection outcomes achieved by the proposed TSOLWR-ODVIP model. Fig. 4 portrays the results accomplished by the TSOLWR-ODVIP model on the entire dataset under distinct classes. The TSOLWR-ODVIP model categorized C-1 samples with accuracy, precision, recall, F1-score and Matthews Correlation Coefficient (MCC) values of 99.62%, 97.09%, 100%, 98.52% and 98.32% respectively. Meanwhile, the proposed TSOLWR-ODVIP method classified C-2 samples with accuracy, precision, recall, F1-score and MCC values of 99.19%, 93.90%, 100%, 96.85% and 96.45% correspondingly. Ultimately, the presented TSOLWR-ODVIP technique categorized C-3 samples with accuracy, precision, recall, F1-score and MCC values of 99.25%, 99.47%, 94.50%, 96.92% and 96.54% correspondingly.
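As a minimal illustration of how the per-class measures in Table 2 can be computed, the sketch below derives accuracy, precision, recall, F1-score and MCC for one class in a one-vs-rest fashion using scikit-learn; the label arrays are random placeholders, not the study's data.

```python
import numpy as np
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, matthews_corrcoef)

# Placeholder labels for the eight classes; in practice these would be the model's
# predictions and the ground truth of the 30% test split.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 8, size=480)
y_pred = np.where(rng.random(480) < 0.95, y_true, rng.integers(0, 8, size=480))

# One-vs-rest view of class C-1 (label 0), as reported per class in Table 2.
c = 0
t = (y_true == c).astype(int)
p = (y_pred == c).astype(int)
print("accuracy :", accuracy_score(t, p))
print("precision:", precision_score(t, p, zero_division=0))
print("recall   :", recall_score(t, p, zero_division=0))
print("F1-score :", f1_score(t, p, zero_division=0))
print("MCC      :", matthews_corrcoef(t, p))
```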


Figure 4: Overall results of the TSOLWR-ODVIP model on the entire dataset

Fig. 5 shows the outcomes accomplished by the proposed TSOLWR-ODVIP approach on the different classes of the 70% TR data. The presented TSOLWR-ODVIP system classified C-1 samples with accuracy, precision, recall, F1-score and MCC values of 99.55%, 96.43%, 100%, 98.18% and 97.95% correspondingly. Meanwhile, the TSOLWR-ODVIP method categorized C-2 samples with accuracy, precision, recall, F1-score and MCC values of 99.11%, 93.42%, 100%, 96.60% and 96.16% correspondingly. Eventually, the proposed TSOLWR-ODVIP algorithm classified C-3 samples with accuracy, precision, recall, F1-score and MCC values of 99.20%, 100%, 92.97%, 96.36% and 95.99% correspondingly.


Figure 5: Overall results of TSOLWR-ODVIP model on 70% of TR dataset

Fig. 6 reviews the outcomes achieved by the TSOLWR-ODVIP algorithm on the different classes of the 30% TS data. The proposed TSOLWR-ODVIP approach categorized C-1 samples with accuracy, precision, recall, F1-score and MCC values of 99.79%, 98.48%, 100%, 99.24% and 99.12% correspondingly. Meanwhile, the TSOLWR-ODVIP system classified C-2 samples with accuracy, precision, recall, F1-score and MCC values of 99.38%, 95.08%, 100%, 97.48% and 97.16% correspondingly. Eventually, the proposed TSOLWR-ODVIP technique categorized C-3 samples with accuracy, precision, recall, F1-score and MCC values of 99.38%, 98.59%, 97.22%, 97.90% and 97.54% correspondingly.


Figure 6: Overall results of the TSOLWR-ODVIP model on 30% of testing dataset

Fig. 7 provides the ROC curve examination results of the TSOLWR-ODVIP model. The figure demonstrates that the proposed TSOLWR-ODVIP model accomplished enhanced ROC values under all the classes.


Figure 7: ROC analysis results of the TSOLWR-ODVIP model

Finally, Table 3 provides an overview of the detailed comparative analysis of the proposed TSOLWR-ODVIP model against other existing DL models. Fig. 8 portrays the comparative accuracy outcomes of the TSOLWR-ODVIP model and the other DL models. The figure implies that the AlexNet, VGG-16 and VGG-19 models reported the lowest accuracy values of 85.89%, 86.50% and 84.43% respectively. At the same time, the GoogleNet model achieved a slightly enhanced accuracy of 88.77%. Next, the SSD-MobileNet and YOLO-v3 models attained considerable outcomes with accuracy values of 98.85% and 95.55%, respectively. However, the proposed TSOLWR-ODVIP model attained effectual outcomes with an increased accuracy of 99.69%.


Figure 8: Comparative accuracy examination outcomes of the TSOLWR-ODVIP model

Fig. 9 shows the comparative precision results of the proposed TSOLWR-ODVIP model and other DL techniques. The figure infers that the AlexNet, VGG-16 and VGG-19 approaches reported the lowest precision values of 86.54%, 85.70% and 84.06%, correspondingly. Simultaneously, the GoogleNet algorithm accomplished a slightly improved precision of 86.80%. The SSD-MobileNet and YOLO-v3 techniques accomplished significant outcomes with precision values of 98.33% and 95.16%, correspondingly. However, the proposed TSOLWR-ODVIP method obtained efficient results with an improved precision of 98.80%.


Figure 9: Comparative precision examination results of the TSOLWR-ODVIP model and other models

Fig. 10 shows the comparative recall outcomes of the TSOLWR-ODVIP model and other DL methods. The figure infers that the AlexNet, VGG-16 and VGG-19 techniques reported the lowest recall values of 84.81%, 87.92% and 87.33% correspondingly. On the other hand, the GoogleNet approach achieved an improved recall of 89.05%. Subsequently, the SSD-MobileNet and YOLO-v3 algorithms achieved substantial outcomes with recall values of 98.17% and 95.77%, correspondingly. However, the proposed TSOLWR-ODVIP system attained efficient results with an improved recall of 98.70%.


Figure 10: Comparative recall examination outcomes of the TSOLWR-ODVIP model and other techniques

Fig. 11 portrays the comparative F1-score results of the TSOLWR-ODVIP model and other DL approaches. The figure shows that the AlexNet, VGG-16 and VGG-19 systems accomplished F1-score values of 89.53%, 89.11% and 87.22% correspondingly, while the GoogleNet algorithm yielded an F1-score of 86.66%. The SSD-MobileNet and YOLO-v3 techniques also accomplished significant results, with F1-score values of 98.38% and 95.80%, correspondingly. However, the proposed TSOLWR-ODVIP method achieved effective outcomes with a high F1-score of 98.74%. Therefore, the experimental values confirm the improved outcomes of the TSOLWR-ODVIP approach compared to the other existing models.


Figure 11: Comparative F1-score examination results of the TSOLWR-ODVIP model and other techniques

5  Conclusion

In this study, a new TSOLWR-ODVIP technique has been developed to detect and classify objects to help VIPs. The major aim of the presented TSOLWR-ODVIP technique is to identify different objects surrounding the VIPs and convey the information to them via audio messages. For data acquisition, IoT devices are used in this study. Then, the LWR model is applied for accurate detection of the objects, and the TSO algorithm is employed for fine-tuning the hyperparameters involved in the LWR model. Finally, the LSTM model is exploited for the classification of objects. The performance of the proposed TSOLWR-ODVIP technique was validated using a set of objects, and the results were examined under distinct aspects. The proposed model achieved a high accuracy of 99.69%. The comparison study outcomes confirmed that the TSOLWR-ODVIP model effectually detects and classifies objects, so it can be exploited to enhance the quality of life of VIPs. In future, hybrid DL models can be used to boost the detection efficacy of the TSOLWR-ODVIP technique.

Funding Statement: The authors extend their appreciation to the King Salman Center for Disability Research for funding this work through Research Group no. KSRG-2022-030.

Conflicts of Interest: The authors declare that they have no conflicts of interest to report regarding the present study.

References

 1.  R. R. A. Bourne, S. R. Flaxman and T. Braithwaite, “Magnitude, temporal trends, and projections of the global prevalence of blindness and distance and near vision impairment: A systematic review and meta-analysis,” The Lancet Global Health, vol. 5, no. 9, pp. 888–897, 2017. [Google Scholar]

2.  E. Cardillo, V. Di Mattia, G. Manfredi, P. Russo, A. De Leo et al., “An electromagnetic sensor prototype to assist visually impaired and blind people in autonomous walking,” IEEE Sensors Journal, vol. 18, no. 6, pp. 2568–2576, 2018. [Google Scholar]

 3.  O. Arslan and D. E. Koditschek, “Sensor-based reactive navigation in unknown convex sphere worlds,” The International Journal of Robotics Research, vol. 38, no. 2–3, pp. 196–223, 2019. [Google Scholar]

 4.  S. Caraiman, O. Zvoristeanu, A. Burlacu and P. Herghelegiu, “Stereo vision based sensory substitution for the visually impaired,” Sensors, vol. 19, no. 12, pp. 2771, 2019. [Google Scholar]

5.  R. A. Z. Daou, J. Chehade, G. A. Haydar, A. Hayek, J. Boercsoek et al., “Design and implementation of smart shoes for blind and visually impaired people for more secure movements,” in 32nd Int. Conf. on Microelectronics (ICM), Aqaba, Jordan, pp. 1–6, 2020. [Google Scholar]

 6.  S. Shaikh, “Assistive object recognition system for visually impaired,” International Journal of Engineering and Technical Research, vol. V9, no. 9, pp. 736–140, 2020. [Google Scholar]

 7.  V. V. Meshram, K. Patil, V. A. Meshram and F. C. Shu, “An astute assistive device for mobility and object recognition for visually impaired people,” IEEE Transactions on Human-Machine Systems, vol. 49, no. 5, pp. 449–460, 2019. [Google Scholar]

 8.  M. M. Islam, M. S. Sadi, K. Z. Zamli and M. M. Ahmed, “Developing walking assistants for visually impaired people: A review,” IEEE Sensors Journal, vol. 19, no. 8, pp. 2814–2828, 2019. [Google Scholar]

9.  S. S. A. Zaidi, M. S. Ansari, A. Aslam, N. Kanwal, M. Asghar et al., “A survey of modern deep learning based object detection models,” Digital Signal Processing, vol. 126, pp. 103514, 2022. [Google Scholar]

10. P. M. Kumar, U. Gandhi, R. Varatharajan, G. Manogaran and T. Vadivel, “Intelligent face recognition and navigation system using neural learning for smart security in internet of things,” Cluster Computing, vol. 22, no. S4, pp. 7733–7744, 2019. [Google Scholar]

11. S. Liu, X. Liu, S. Wang and K. Muhammad, “Fuzzy-aided solution for out-of-view challenge in visual tracking under IoT-assisted complex environment,” Neural Computing and Applications, vol. 33, no. 4, pp. 1055–1065, 2021. [Google Scholar]

12. A. S. Alon, R. M. Dellosa, N. U. Pilueta, H. D. Grimaldo and E. T. Manansala, “EyeBill-PH: A machine vision of assistive philippine bill recognition device for visually impaired,” in 11th IEEE Control and System Graduate Research Colloquium (ICSGRC), Shah Alam, Malaysia, pp. 312–317, 2020. [Google Scholar]

13. Y. S. Su, C. H. Chou, Y. L. Chu and Z. Y. Yang, “A finger-worn device for exploring Chinese printed text with using CNN algorithm on a micro IoT processor,” IEEE Access, vol. 7, pp. 116529–116, 2019. [Google Scholar]

14. M. Afif, R. Ayachi, Y. Said, E. Pissaloux and M. Atri, “An evaluation of RetinaNet on indoor object detection for blind and visually impaired persons assistance navigation,” Neural Processing Letters, vol. 51, no. 3, pp. 2265–2279, 2020. [Google Scholar]

15. J. Bai, Z. Liu, Y. Lin, Y. Li and S. Lian, “Wearable travel aid for environment perception and navigation of visually impaired people,” Electronics, vol. 8, no. 6, pp. 1–27, 2019. [Google Scholar]

16. B. Jiang, J. Yang, Z. Lv and H. Song, “Wearable vision assistance system based on binocular sensors for visually impaired users,” IEEE Internet of Things Journal, vol. 6, no. 2, pp. 1375–1383, 2019. [Google Scholar]

17. Y. Li, A. Dua and F. Ren, “Lightweight retinanet for object detection on edge devices,” in IEEE 6th World Forum on Internet of Things (WF-IoT), New Orleans, LA, USA, pp. 1–6, 2020. [Google Scholar]

18. M. H. Qais, H. M. Hasanien and S. Alghuwainem, “Transient search optimization: A new meta-heuristic optimization algorithm,” Applied Intelligence, vol. 50, no. 11, pp. 3926–3941, 2020. [Google Scholar]

19. F. Karim, S. Majumdar, H. Darabi and S. Chen, “LSTM fully convolutional networks for time series classification,” IEEE Access, vol. 6, pp. 1662–1, 2018. [Google Scholar]



Copyright © 2023 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.