Interest in automated data classification and identification systems has increased over the past years in conjunction with the high demand for artificial intelligence and security applications. In particular, recognizing human activities accurately has become a topic of high interest. Although current tools have achieved remarkable success, the problem remains challenging because of uncontrolled environments and conditions. In this paper, two statistical frameworks based on nonparametric hierarchical Bayesian models and the Gamma distribution are proposed to address real-world applications. Specifically, two nonparametric hierarchical Bayesian models based on the Dirichlet process and the Pitman-Yor process are developed. These models are then applied to the problem of modelling grouped data, where observations are organized into groups and these groups are statistically linked by sharing mixture components. The choice of Gamma mixtures is motivated by their flexibility for modelling heavy-tailed distributions. In addition, the Dirichlet process prior is justified by its ability to determine the number of components automatically and by its convenient mathematical properties. Moreover, a learning step based on variational Bayes is presented. The priors over the parameters are selected appropriately and the posteriors are approximated in closed form. Experimental results on real-life applications concerning texture classification and human action recognition show the capabilities and effectiveness of the proposed frameworks.
Data clustering has been the subject of wide research to the present days [
Unfortunately, the inadequacy of the finite mixture model becomes apparent when selecting the appropriate number of mixture components. In other words, model selection (i.e., determining the model's complexity) is one of the difficult problems with finite mixture models. The crucial question of “how many groups are in the dataset?” remains of great interest in various data mining fields, since choosing an inappropriate number of clusters may lead to poor generalization capability. This problem can be solved by considering an infinite number of components via nonparametric Bayesian methods [
Two sound hierarchical Bayesian alternatives to the conventional DP, namely the hierarchical Dirichlet process (HDP) and the hierarchical Pitman-Yor process (HPYP), have shown encouraging results, especially when modeling grouped data [
Thus, the main contribution of this manuscript is to extend our previous work on the Gamma mixture by investigating two efficient nonparametric hierarchical Bayesian models based on Dirichlet and Pitman-Yor process mixtures of Gamma distributions. Indeed, to improve modeling performance, we consider the Gamma distribution, which is able to cover long-tailed distributions and to approximate visual feature vectors accurately. Another critical issue when dealing with mixture models is parameter estimation. Accordingly, we develop an effective variational Bayes learning algorithm to estimate the parameters of the implemented models. It is worth noting that variational inference is computationally less demanding than Markov chain Monte Carlo (MCMC)-based Bayesian inference and converges faster. Finally, the implemented hierarchical Bayesian models and the variational inference approaches are validated on challenging real-life problems, namely human activity recognition and texture categorization.
This manuscript is organized as follows. In Section 2, we introduce the hierarchical DP and PYP mixtures of Gamma distributions, which are based on the stick-breaking construction. In Section 3, we describe the details of our variational Bayes learning framework. Section 4 reports the results obtained on two challenging applications to verify the merits and effectiveness of our framework, and Section 5 concludes the manuscript.
In this section, we start by briefly presenting the finite Gamma mixture model and then present our nonparametric frameworks based on hierarchical Dirichlet and Pitman-Yor process mixtures.
If a
For a random vector
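As an illustration of the finite Gamma mixture model, the following minimal sketch evaluates a mixture density for a positive-valued vector. It assumes a shape/rate parameterization and independent dimensions within each component; the function and argument names (`gamma_log_pdf`, `gamma_mixture_pdf`, `weights`, `shapes`, `rates`) are hypothetical and do not come from the original implementation.

```python
import math


def gamma_log_pdf(y, shape, rate):
    """Log-density of a univariate Gamma (shape/rate parameterization, assumed)."""
    return (shape * math.log(rate) - math.lgamma(shape)
            + (shape - 1.0) * math.log(y) - rate * y)


def gamma_mixture_pdf(y, weights, shapes, rates):
    """Density of a finite Gamma mixture at a positive vector y.

    weights: mixing coefficients (non-negative, summing to one)
    shapes, rates: one list of per-dimension parameters per component
    Dimensions are treated as independent within a component (assumption).
    """
    total = 0.0
    for w, shape_k, rate_k in zip(weights, shapes, rates):
        # Sum per-dimension log-densities, then exponentiate once.
        log_comp = sum(gamma_log_pdf(y_d, a, b)
                       for y_d, a, b in zip(y, shape_k, rate_k))
        total += w * math.exp(log_comp)
    return total
```

Working in the log domain per component avoids underflow when the dimensionality of the feature vector is large.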
The hierarchical Dirichlet process (HDP) is an effective nonparametric Bayesian method for modelling grouped data, which allows the mixture models to share components. Here, observed data are arranged into groups (i.e., mixture models) that we want to link statistically. The HDP is built on the Dirichlet process (DP), as described in [
Here, the hierarchical Dirichlet process is represented using the stick-breaking construction [
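The stick-breaking construction of a single Dirichlet process can be sketched as follows. This is a minimal, truncated version using only the Python standard library: each stick proportion is drawn from Beta(1, concentration), and the last component absorbs the remaining mass so that the weights sum to one. The names `stick_breaking_weights`, `concentration`, and `truncation` are illustrative assumptions, not the paper's notation.

```python
import random


def stick_breaking_weights(concentration, truncation, rng):
    """Truncated stick-breaking weights for a Dirichlet process.

    Each proportion v_k ~ Beta(1, concentration); the k-th weight is
    v_k times the stick length left over by the previous breaks.
    """
    weights = []
    remaining = 1.0
    for _ in range(truncation - 1):
        v = rng.betavariate(1.0, concentration)
        weights.append(remaining * v)
        remaining *= 1.0 - v
    # Truncation: the final component takes whatever mass is left.
    weights.append(remaining)
    return weights


if __name__ == "__main__":
    rng = random.Random(0)
    print(stick_breaking_weights(concentration=1.0, truncation=10, rng=rng))
```

A smaller concentration value tends to put most of the mass on the first few components, whereas a larger value spreads it over more components.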
Next, we introduce a latent indicator
Given that
According to
To complete the description of the HDP mixture model, given a grouped observation (data)
As each
That is, the indicator
Based on the stick-breaking construction of the DP (see
Finally, according to
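To make the role of the two indicator variables more concrete, the sketch below samples component assignments from a truncated two-level stick-breaking construction. It is an assumption-laden illustration rather than the authors' procedure: `global_conc` and `group_conc` stand in for the global and group-level concentration parameters, `K` and `T` for the global and group-level truncation levels, and the helper names are hypothetical.

```python
import random


def stick_breaking(concentration, truncation, rng):
    """Truncated stick-breaking weights with Beta(1, concentration) proportions."""
    weights, remaining = [], 1.0
    for _ in range(truncation - 1):
        v = rng.betavariate(1.0, concentration)
        weights.append(remaining * v)
        remaining *= 1.0 - v
    weights.append(remaining)
    return weights


def sample_hdp_assignments(n_groups, n_per_group, global_conc, group_conc,
                           K, T, seed=0):
    """Sample global component labels for grouped observations.

    Global level: weights psi over K shared components.
    Group level: weights pi_j over T group-level atoms; one indicator maps
    each atom to a global component, a second indicator assigns each
    observation to an atom of its group.
    """
    rng = random.Random(seed)
    psi = stick_breaking(global_conc, K, rng)
    assignments = []
    for _ in range(n_groups):
        pi_j = stick_breaking(group_conc, T, rng)
        # Atom-to-component indicator: atom t points to a shared component.
        atom_to_component = rng.choices(range(K), weights=psi, k=T)
        # Observation-to-atom indicator: each observation picks an atom.
        obs_to_atom = rng.choices(range(T), weights=pi_j, k=n_per_group)
        assignments.append([atom_to_component[t] for t in obs_to_atom])
    return psi, assignments
```

Because every group draws its atoms from the same global weights, different groups can reuse the same mixture components, which is how the groups become statistically linked.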
The Pitman-Yor process (PYP) [
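For comparison with the Dirichlet process, a truncated stick-breaking sketch for the Pitman-Yor process is given below. It relies on the standard two-parameter construction in which the k-th proportion follows Beta(1 - discount, concentration + k * discount); setting the discount to zero recovers the Dirichlet process case. The function and parameter names are illustrative assumptions.

```python
import random


def pitman_yor_stick_breaking(discount, concentration, truncation, rng):
    """Truncated stick-breaking weights for a Pitman-Yor process.

    Requires 0 <= discount < 1 and concentration > -discount.
    The k-th proportion (k = 1, 2, ...) is Beta(1 - discount,
    concentration + k * discount).
    """
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must lie in [0, 1)")
    if concentration <= -discount:
        raise ValueError("concentration must exceed -discount")
    weights, remaining = [], 1.0
    for k in range(1, truncation):
        v = rng.betavariate(1.0 - discount, concentration + k * discount)
        weights.append(remaining * v)
        remaining *= 1.0 - v
    # Truncation: the final component takes the leftover mass.
    weights.append(remaining)
    return weights
```

A positive discount produces weights that decay more slowly, following a power law, which is the usual motivation for preferring the Pitman-Yor process when many small clusters are expected.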
In this subsection, we introduce two hierarchical infinite mixture models with Gamma distributions. In this case, each vector
Next, we have to place conjugate distributions over the unknown parameters
Variational inference [
In particular, we adopt one of the most successful variational inference techniques, namely the factorial approximation (or mean-field approximation) [
For a specific variational factor
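For reference, the generic mean-field result can be written as follows. This is the standard textbook form under the factorization assumption, not a reconstruction of the paper's own equations; the symbols $\mathcal{X}$ (observed data), $\Theta$ (all latent variables and parameters), and $q_s$ (the $s$-th variational factor) are assumed notation.

```latex
\begin{align}
\mathcal{L}(q) &= \int q(\Theta)\,\ln\frac{p(\mathcal{X},\Theta)}{q(\Theta)}\,d\Theta,
\qquad q(\Theta) = \prod_{s} q_s(\Theta_s), \\
\ln q_s^{*}(\Theta_s) &= \mathbb{E}_{j \neq s}\!\left[\ln p(\mathcal{X},\Theta)\right] + \text{const}.
\end{align}
```

Here $\mathcal{L}(q)$ is the variational lower bound on $\ln p(\mathcal{X})$, and the expectation in the second line is taken with respect to all factors other than $q_s$. Cycling through the factors with this update never decreases the lower bound, which is why convergence can be monitored by tracking $\mathcal{L}(q)$.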
The principal purpose of the experimental section is to investigate the performance of the two developed frameworks, based on the HDP mixture and HPYP mixture models with Gamma distributions. Hence, we compare them with other statistical models on two challenging applications: texture categorization and human action categorization. In all these experiments, the global truncation level
In this work we are primarily motivated by the problem of modeling and classifying texture images. Contrary to natural images, which include certain objects and structures, texture images are a very special case of images that do not include a well-defined shape. Texture pattern is one of the most important elements in visual multimedia content analysis. It forms the basis for solving complex machine learning and computer vision tasks. In particular, texture classification supports a wide range of applications, including information retrieval, image categorization [
For this application, we start by extracting features from input images and then we model them using the proposed HDPGaM and HPYPGaM. Each image The Matlab code for the features is available at
We conducted our texture classification experiments using the proposed HDP Gamma mixture (referred to as HDPGaM) and HPYP Gamma mixture (referred to as HPYPGaM) on three publicly available databases. The first one, namely UIUCTex [
In order to quantify the performance of the proposed frameworks (HDPGaM and HPYPGaM), we proceed by evaluating and comparing the obtained results with seven other methods namely infinite mixture of Gaussian distribution (inGM), infinite mixture of generalized Gaussian distribution (inGGM), infinite mixture of multivariate generalized Gaussian distribution (inMGGM), Hierarchical Dirichlet Process mixture of Gaussian distribution (HDPGM), hierarchical Pitman-Yor process mixture of Gaussian distribution (HPYPGM), Hierarchical Dirichlet Process mixture of generalized Gaussian distribution (HDPGGM), and hierarchical Pitman-Yor process mixture of generalized Gaussian distribution (HPYPGGM).
We run all methods 30 times and calculate the average classification accuracy, which is depicted in
Method | KTH-TIPS | UIUCTex | UMD |
---|---|---|---|
inGM | 80.94 | 85.33 | 84.14 |
inGGM | 83.30 | 87.21 | 87.17 |
inMGGM | 85.91 | 89.03 | 89.10 |
HDPGM | 85.92 | 89.10 | 89.17 |
HPYPGM | 85.95 | 89.13 | 89.20 |
HDPGGM | 85.95 | 89.21 | 89.26 |
HPYPGGM | 85.97 | 89.26 | 89.28 |
HDPGaM (our method) | 86.03 | 89.77 | 89.62 |
HPYPGaM (our method) | 86.12 | 89.82 | 89.70 |
Method | KTH-TIPS | UIUCTex | UMD |
---|---|---|---|
inGM | 81.24 | 86.03 | 84.84 |
inGGM | 83.94 | 87.84 | 87.86 |
inMGGM | 86.29 | 89.81 | 89.97 |
HDPGM | 86.32 | 89.85 | 90.01 |
HPYPGM | 86.35 | 89.88 | 90.05 |
HDPGGM | 86.41 | 89.93 | 90.07 |
HPYPGGM | 86.44 | 89.98 | 90.09 |
HDPGaM (our method) | 86.46 | 90.05 | 90.13 |
HPYPGaM (our method) | 86.50 | 90.15 | 90.18 |
Method | KTH-TIPS | UIUCTex | UMD |
---|---|---|---|
inGM | 84.13 | 89.77 | 87.12 |
inGGM | 86.88 | 90.67 | 90.20 |
inMGGM | 88.91 | 92.11 | 92.12 |
HDPGM | 88.93 | 92.11 | 92.11 |
HPYPGM | 88.96 | 92.13 | 92.17 |
HDPGGM | 88.99 | 92.15 | 92.19 |
HPYPGGM | 88.99 | 92.17 | 92.20 |
HDPGaM (our method) | 88.99 | 92.19 | 92.21 |
HPYPGaM (our method) | 89.05 | 92.23 | 92.30 |
Visual multimedia recognition is a challenging research topic with many applications, such as action recognition [
Here we perform the recognition of human activities using the proposed frameworks HDPGaM and HPYPGaM. Our methodology is outlined as follows: first, we extract and normalize SIFT3D descriptors [
On the other hand, a global vocabulary is generated and shared between all groups via the global-model
Our purpose in this application is to show the advantages of our proposed hierarchical models HDPGaM and HPYPGaM over other conventional hierarchical mixtures and other methods from the state of the art. Therefore, we first evaluated the performance of HDPGaM and HPYPGaM against the hierarchical Dirichlet process mixture of Gaussian distributions (HDPGM), the hierarchical Pitman-Yor process mixture of Gaussian distributions (HPYPGM), the hierarchical Dirichlet process mixture of generalized Gaussian distributions (HDPGGM), and the hierarchical Pitman-Yor process mixture of generalized Gaussian distributions (HPYPGGM). Note that all the implemented models were learned using variational Bayes. The average recognition performances of our frameworks and of the models based on the HDP mixture and HPYP mixture are depicted in
Method | Recognition rate (%) |
---|---|
HDPGM | 80.12 |
HPYPGM | 80.19 |
HDPGGM | 81.33 |
HPYPGGM | 81.39 |
HDPGaM (our method) | 82.13 |
HPYPGaM (our method) | 82.27 |
As we can see in this table, the proposed frameworks offer the highest recognition rates (82.27% for HPYPGaM and 82.13% for HDPGaM) among all tested models. Across the different runs, we obtain p-values < 0.05, and therefore the differences in accuracy between our frameworks and the other models are statistically significant according to Student's t-test. Next, we compared our models against other mixture models (here the finite Gaussian mixture (GMM) and the finite generalized Gaussian mixture (GGMM)) and methods from the literature. The obtained results are given in
Method | Recognition rate (%) |
---|---|
GMM | 72.51 |
GGMM | 73.89 |
SVM-linear | 73.24 |
SVM-polynomial | 71.44 |
SVM-radial basis | 79.33 |
Wong et al. [ | 73.24 |
Fan et al. [ | 79.33 |
Schuldt et al. [ | 71.71 |
Dollár et al. [ | 78.50 |
MGGMM-FS [ | 82.01 |
HDPGaM (our method) | 82.13 |
HPYPGaM (our method) | 82.27 |
Accordingly, we can observe that our models are again able to provide a higher discrimination rate than the other methods. These results support the effectiveness of our frameworks for activity modeling and recognition compared to conventional Dirichlet and Pitman-Yor process mixtures based on the Gaussian distribution. Another remark is that our model HPYPGaM outperforms our second model HDPGaM for this specific application, which suggests an advantage of the hierarchical Pitman-Yor process, flexible enough for such recognition problems, over the Dirichlet process.
In this paper, two nonparametric Bayesian frameworks based on hierarchical Dirichlet and Pitman-Yor processes and the Gamma distribution are proposed. The Gamma distribution is considered because of its flexibility for modelling semi-bounded data. Both frameworks are learned using variational inference, which has certain advantages such as easy assessment of convergence and easy optimization, offering a trade-off between frequentist techniques and MCMC-based ones. An important property of our approach is that it does not require the number of mixture components to be specified in advance. We carried out experiments on texture categorization and human action recognition to demonstrate the performance of our models, which can be used further in a variety of other computer vision and pattern recognition applications.
The authors would like to thank Taif University Researchers Supporting Project number (TURSP-2020/26), Taif University, Taif, Saudi Arabia. They would also like to thank Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2022R40), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.