Autonomous driving aims to let a vehicle sense its environment effectively and apply suitable strategies to navigate without human intervention. This requires a prediction of the background scene, which leads to a discontinuity between the predicted and planned outputs. An optimal prediction engine is therefore required that reads the background objects reliably and makes optimal decisions. In this paper, the authors develop an autonomous driving model for large Sport Utility Vehicles (SUVs) using an ensemble model built from three modules: (a) a recognition model, (b) a planning model and (c) a prediction model. The study develops a direct realization method for autonomous vehicle driving, designed as a behavioral model whose recognition, planning and prediction modules regulate the trajectory processing of input video datasets. A deep learning algorithm is used to classify known and unknown objects along the line of sight. The model is compared with conventional deep learning classifiers in terms of recall rate and root mean square error (RMSE) to estimate its efficacy. Simulation results on different traffic environments show that the Ensemble Convolutional Network Reinforcement Learning (E-CNN-RL) model offers an increased accuracy of 95.45%, a reduced RMSE and an increased recall rate compared with the existing Ensemble Convolutional Neural Network (CNN) and Ensemble Stacked CNN.
In the recent past, autonomous vehicles have improved in efficiency and safety potential [
To achieve a minimal error rate, several computer vision systems have so far been used in autonomous driving [
Problem Definition: The main problems associated with policy-based autonomous driving are the large volume of data that must be collected to train the required model and an inappropriate mapping between the input and the derived action output [
However, most deep learning systems fail to predict the environment after they have recognized or classified the objects in it. This is especially true of video recognition systems, where the deep learning system has to infer the objects along the trajectory of the path in motion.
Objectives: The application of deep learning with an ensemble-based model motivates the present study to classify objects quickly and accurately, so that path prediction and planning for autonomous SUVs enable faster, collision-resistant vehicle movement. The objectives of this paper are: to develop a direct realization method for autonomous vehicle driving using a hybrid ensemble model; to incorporate recognition, planning and prediction modules that regulate the input trajectory of video datasets; to use an ensemble Convolutional Neural Network (CNN) to classify objects along the video trajectories in the line of sight; and to achieve optimal prediction for path planning using Reinforcement Learning (RL).
The main contributions are as follows. The authors develop an autonomous driving system that deploys an ensemble classifier combining CNN and RL. An Ensemble CNN (E-CNN) model is developed using deep learning and trained on driving patterns to predict objects, which enables smooth driving based on the input trajectories so that the vehicle is driven in an automated way. The E-CNN is designed as the base classifier, and the ensemble system adds two further layers: a stacking layer and a meta-classifier. The stacking ensemble layer collects the features and sends them to the RL for final classification. The meta-classifier uses RL classifiers in parallel so that the real-time features extracted from the datasets are processed quickly. The ensemble learning classifier is executed on a path planning ensemble model for careful autonomous driving, specifically on terrain and rugged surfaces. The performance of the ensemble classifier is evaluated on rugged and terrain areas in terms of object classification accuracy, recall rate and Root Mean Square Error (RMSE).
The remainder of the paper is organized as follows: Section 2 reviews related work and the existing models used to improve the automated detection of road accidents. Section 3 details the ensemble deep learning model, with the steps by which the E-CNN-RL works. Section 4 evaluates the ensemble prediction model on a large video dataset covering different landscapes. Section 5 concludes the work.
Complex planning and decision-making, where definite driving performance has remained a barrier, have led to the integration of machine learning and deep learning modules. Using deep learning models [
Data production without supervision would impact a model trained with redundant training sets. In order to recognise the environment in which predictable and unpredictable things are regarded, we must limit the data required to train the reinforcement learning model [
In this section, a framework is designed around an ensemble model that incorporates recognition, prediction and planning. The algorithms are selected carefully so that the recognition and prediction of optimal trajectories are carried out accurately. The architecture in
It is not sufficient for an autonomous prediction model merely to recognize its environment while the vehicle is moving at high speed. It is therefore essential to develop an internal model that can predict future environmental conditions. With this level of intelligence, the hybrid ensemble learning model operates the car autonomously on terrain and rugged surfaces.
The Ensemble-Convolutional Neural Network-Reinforcement Learning (E-CNN-RL) classifier is equipped with recognition, prediction and planning.
It uses a CNN classifier [
The data from the fully convolutional layer is then sent to the reinforcement model, which processes each action of the CNN with its actor-critic model ( Estimate the objective function in the planning models to achieve high-quality prediction output
In the recognition phase, we use a high-speed camera to collect all of the items in the surrounding settings for training and testing. In the algorithm, redundant frames, such as items that are not on the trajectories, are ignored. A small percentage of residual data, which includes all objects, is used as an input to CNN.
Increasing the number of convolutional layers yields a more defined movement. Moving objects are detected by the convolution layer after training on ImageNet. Also, it removes static objects in the foreground of the incoming video frames.
The ensemble base layer in the present model consists of multiple CNNs that generate multiple labels based on a supervised learning sequence. The multiple labels are then processed by the stacked ensemble layer according to its weighted function.
A stacked ensemble is an advanced ensemble model designed to improve the precision and accuracy of prediction, and the present study adopts it together with an ensemble learning algorithm. The meta-learning stacked ensemble learns how best to combine the predictions from two or more CNN base classifiers. Stacking harnesses the classification ability of the CNNs and uses RL to predict how their combination will perform better than any single model in the ensemble.
The predictions of the multiple CNN classifiers with respect to the road trajectories support accurate autonomous driving without driver intervention. Stacking with a multi-label classifier applies binary relevance and treats the CNN predictions as meta-level features for the final planning model. This ensures correlation among the supervised (labelled) data in the planning model, that is, at the meta-level. Overfitting is avoided by cross-validating the planning classifier: the data is split into N disjoint parts, and the base classifier is generated N times, each time using N − 1 partitions for training while the remaining partition is used for planning via RL. Because the feature space varies across the individual CNN classifiers, the ensemble classifier is diversified. The stacked ensemble learning exploits the local and global pairwise label correlations for prediction and planning.
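The cross-validation step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the helper names (`k_fold_indices`, `out_of_fold_predictions`, `fit_threshold`) are hypothetical, and a toy threshold rule stands in for a CNN base classifier so that the example stays self-contained. The point shown is only that each sample's meta-level feature comes from a base classifier that never saw that sample during training.

```python
from typing import Callable, List, Sequence

Predictor = Callable[[float], float]


def k_fold_indices(n_samples: int, n_folds: int) -> List[List[int]]:
    """Split sample indices into n_folds disjoint, near-equal parts."""
    folds: List[List[int]] = [[] for _ in range(n_folds)]
    for i in range(n_samples):
        folds[i % n_folds].append(i)
    return folds


def out_of_fold_predictions(
    X: Sequence[float],
    y: Sequence[int],
    fit: Callable[[List[float], List[int]], Predictor],
    n_folds: int,
) -> List[float]:
    """Train on N - 1 parts, predict the held-out part, repeat N times."""
    meta = [0.0] * len(X)
    for held_out in k_fold_indices(len(X), n_folds):
        held = set(held_out)
        train_X = [x for i, x in enumerate(X) if i not in held]
        train_y = [t for i, t in enumerate(y) if i not in held]
        predict = fit(train_X, train_y)
        for i in held_out:
            # Meta-level feature for sample i from a model that never saw it.
            meta[i] = predict(X[i])
    return meta


def fit_threshold(train_X: List[float], train_y: List[int]) -> Predictor:
    """Toy stand-in for a CNN base classifier (assumption, not the paper's model):
    threshold at the midpoint between the two class means."""
    pos = [x for x, t in zip(train_X, train_y) if t == 1]
    neg = [x for x, t in zip(train_X, train_y) if t == 0]
    threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda x: 1.0 if x > threshold else 0.0


# Hypothetical one-dimensional "object scores" with binary labels.
X = [0.1, 0.9, 0.2, 0.8, 0.3, 0.7]
y = [0, 1, 0, 1, 0, 1]
meta_features = out_of_fold_predictions(X, y, fit_threshold, n_folds=3)
```

In a full system, `meta_features` from several base classifiers would be stacked column-wise and passed to the meta-level (here, the RL planner).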
The weighted stacked ensemble tends to reduce the distance between the predicted score and the target vector (representing the ground truth information in the label space). This is represented as the minimum Euclidean linear least-square problem, which is given as in
To improve performance, various data types at the meta-level are taken into account, which can be either discrete or continuous. Irrelevant or non-labelled objects classified by the CNN can be introduced into the RL engine. In such cases, the information is uncorrelated with the prediction of the CNN engine (the base classifier), so classification performance degrades as non-labelled information and noise are added. To counter this, the present study introduces weights based on the confidence scores obtained from the various CNN base classifiers for the different labels.
Reinforcement Learning (RL) is an approach that learns about the environment through interaction in order to increase the cumulative reward signal. A learning agent interacts actively with the environment at all states (
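As an illustration of confidence-based weighting, the sketch below fuses per-label scores from several base classifiers and measures the squared Euclidean distance between the fused score and a target vector. The function names, the per-label normalisation of the confidences and all numeric values are assumptions made for the example; the paper's exact weighting scheme is not reproduced here.

```python
from typing import List


def confidence_weights(confidences: List[List[float]]) -> List[List[float]]:
    """Normalise confidences so that, for each label, the weights sum to 1.

    confidences[k][j] is the (hypothetical) confidence of classifier k on label j.
    """
    n_classifiers = len(confidences)
    n_labels = len(confidences[0])
    weights = [[0.0] * n_labels for _ in range(n_classifiers)]
    for j in range(n_labels):
        total = sum(c[j] for c in confidences)
        for k in range(n_classifiers):
            # Fall back to uniform weights if no classifier is confident.
            weights[k][j] = confidences[k][j] / total if total > 0 else 1.0 / n_classifiers
    return weights


def fuse_scores(scores: List[List[float]], weights: List[List[float]]) -> List[float]:
    """Weighted sum of the base-classifier scores, label by label."""
    n_labels = len(scores[0])
    return [
        sum(weights[k][j] * scores[k][j] for k in range(len(scores)))
        for j in range(n_labels)
    ]


def squared_distance(pred: List[float], target: List[float]) -> float:
    """Squared Euclidean distance between predicted scores and the target vector."""
    return sum((p - t) ** 2 for p, t in zip(pred, target))


# Hypothetical scores of three base classifiers on two labels.
scores = [[0.9, 0.2], [0.8, 0.1], [0.3, 0.7]]
confidences = [[0.9, 0.8], [0.8, 0.9], [0.2, 0.3]]  # third classifier is unreliable
target = [1.0, 0.0]

weighted = fuse_scores(scores, confidence_weights(confidences))
uniform = fuse_scores(scores, [[1 / 3, 1 / 3]] * 3)
weighted_error = squared_distance(weighted, target)
uniform_error = squared_distance(uniform, target)
```

In this toy case the unreliable third classifier receives a small weight, so the fused score lies closer to the target than a plain average does.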
Initialize state 0:
Repeat
Initialize the state
Choose an action
Assume
Repeat at each iteration
Take action
Consider the next action
Assume
Repeat the action until entire state
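The loop above (choose an action, take it, consider the next action, repeat) resembles an on-policy temporal-difference update. The following is a minimal tabular SARSA-style sketch under that assumption; it is not the authors' actor-critic model. The five-cell corridor environment, the reward values and the hyperparameters are all hypothetical and chosen only to keep the example small.

```python
import random
from typing import List, Tuple

N_STATES = 5          # hypothetical corridor: cells 0..4, goal at cell 4
GOAL = N_STATES - 1
LEFT, RIGHT = 0, 1


def step(state: int, action: int) -> Tuple[int, float, bool]:
    """Move one cell; reward +1 at the goal, small penalty otherwise."""
    next_state = max(0, state - 1) if action == LEFT else state + 1
    if next_state == GOAL:
        return next_state, 1.0, True
    return next_state, -0.01, False


def epsilon_greedy(q_row: List[float], epsilon: float, rng: random.Random) -> int:
    """Explore with probability epsilon, otherwise pick the best-valued action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=lambda a: q_row[a])


def train_sarsa(episodes: int = 2000, alpha: float = 0.1, gamma: float = 0.9,
                epsilon: float = 0.1, seed: int = 0) -> List[List[float]]:
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _episode in range(episodes):
        state = 0                                        # initialize the state
        action = epsilon_greedy(q[state], epsilon, rng)  # choose an action
        for _step in range(200):                         # cap the episode length
            next_state, reward, done = step(state, action)           # take action
            next_action = epsilon_greedy(q[next_state], epsilon, rng)  # next action
            target = reward if done else reward + gamma * q[next_state][next_action]
            q[state][action] += alpha * (target - q[state][action])
            state, action = next_state, next_action
            if done:
                break
    return q


q_table = train_sarsa()
greedy_policy = [max((LEFT, RIGHT), key=lambda a: q_table[s][a]) for s in range(GOAL)]
```

After training, `greedy_policy` lists the preferred action in each non-terminal cell; in this corridor the agent should learn to head towards the goal.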
The weighted labels obtained from the video frame sequence by the base layer and then the stacked layer are used to train the RL to learn the path planning model. This model steers the car forward, sideways (left or right) or backward, slows it down, speeds it up, and inclines it by a certain angle so that it moves without any obstacle in its foreground. The decision is based on the weighted labelled sequence from each base classifier, which is set to achieve optimal detection of objects in motion (both temporal and spatial objects). The autonomous driving decision is learned through repeated training in which each action of the RL is marked with a reward or a penalty. The driving rules that depend on the obstacle are designed autonomously by a Fuzzy Logic controller. If the car moves along the proper path, the RL is assigned a reward vector; otherwise it is assigned a penalty. In this way, the RL learns the entire planning process, and the predefined actions based on the labelled sequence are embedded during training. Finally, the RL is set up for testing and validation.
In this section, the validation of the E-CNN-RL model is presented. The entire model is coded in Python scripts using the PyTorch framework. In
The model is trained on various environments, which are listed below:
In this paper, a hybrid E-CNN and RL model is shown to adapt well to recognizing objects along the input trajectory and to making optimal decisions for a vehicle moving in rugged and terrain environments. The CNN offers improved detection of objects with its state and action mechanism, and its error rate falls as the number of iterations increases. Training and testing the ensemble meta-classifier, i.e., the RL, yields optimal planning and prediction based on the improved E-CNN object detection in fast-moving cars. The experiments show that accuracy is higher at reduced speed; nevertheless, the objective of high accuracy at high speed is achieved using E-CNN object detection and RL path planning in rugged and terrain environments. The validation confirms optimal path planning with accident-free driving. The supervised E-CNN learning by the base classifier helps the entire ensemble approach reach optimal path planning decisions. The faster classification of objects along the trajectories, both labelled and non-labelled, with multiple base classifiers provides more effective path planning than previous systems.
The study can be further improved by considering the video saliency associated with object detection, covering both static and dynamic video saliency, in future prediction systems. This can especially be tested on off-road vehicles with torsio-elastic suspension applied to the front, rear and both axles [