Commercial housing is exposed to a number of risks, and it is critical for the government, the real estate industry, and consumers to establish an objective early warning indicator system for these risks and to study how they can be measured and anticipated. In this paper, we examine the commodity housing market and construct a risk index at three levels: the market, the real estate industry, and the national economy. After synthesizing the risk index with the CRITIC objective weighting method, we use the Bootstrap aggregating-grey wolf optimizer-support vector machine (Bagging-GWO-SVM) model to monitor the commodity housing market for risks and issue early warnings. The empirical study leads to the following conclusions: (1) the commodity housing market risk index accurately reflects the actual risk situation in Tianjin; (2) compared with other models, the Bagging-GWO-SVM model provides higher early warning accuracy. A final set of suggestions is presented based on the empirical study.
Since China’s reform and opening up, the real estate industry has become one of the country’s most important macroeconomic drivers. However, the industry is subject to a variety of risks, including price risk, inventory risk, and market risk, and these can have a significant adverse effect on the national economy. For consumers, the high cost of commodity housing means that residents’ wealth is largely absorbed by real estate, which reduces quality of life and increases investment risk. For real estate companies, commodity housing market risks limit their own development and harm the industry as a whole. Consequently, it is of great importance for consumers, real estate companies, and the country in general to establish a commodity housing risk warning system.
Risk factors in the industry of commercial real estate are numerous. One major factor that influences the commodity housing market is the house price. According to Liu et al. [
To assess whether the real estate market is at risk, indicator systems, statistical tests, and models are commonly employed. Wu et al. [
Currently, the most widely used early warning models for the commodity housing market are the STV Cross Section Regression Model (STV) [
Machine learning methods are popular among scholars due to their high accuracy in prediction. Artificial Neural Networks (ANN) and Support Vector Machines (SVM) are the most commonly used models in current practice [
In recent years, scholars have increasingly focused on the Long Short Term Memory (LSTM) [
In this paper, an SVM model is used to forecast market risk, and the Grey Wolf Optimizer (GWO) algorithm is used to determine its hyperparameters, which yields the grey wolf optimizer-support vector machine (GWO-SVM) model. The GWO-SVM model is then combined with Bagging to improve prediction accuracy and generalization capability.
This paper is structured as follows: Section 2 presents the principles of the SVM, GWO, and Bagging methods. Section 3 analyzes the factors influencing the commodity housing market and establishes the index system and risk index. Section 4 establishes an early warning system for the commodity housing market in Tianjin. Section 5 presents the research findings and makes policy recommendations. Section 6 concludes.
The Support Vector Machine (SVM) can be used both for the classification of discrete variables and for the prediction of continuous variables. Support Vector Classification (SVC) is the classification algorithm within the SVM framework. Because it can map a low-dimensional, linearly inseparable space into a high-dimensional, linearly separable one, this algorithm generally achieves higher prediction accuracy than other single classification algorithms (such as Logistic Regression, Decision Trees, Naive Bayes, KNN, etc.) [
As a result, SVM models classify small samples with good generalization ability and robustness, while avoiding the curse of dimensionality and local optima.
Suppose the training sample set has n samples:
The above equation can be expressed by adding the penalty factor C and the slack variable as:
Introducing the Lagrangian function gives:
It is typical for SVM to use kernel functions to map the original data into high-dimensional space, converting nonlinear problems into linear ones for solving. In this paper, the polynomial kernel (POLY) is selected since it has better performance in nonlinear datasets than other kernel types: select
From
Set the optimal Lagrange multiplier
In light of the above analysis, it is possible to obtain an optimal classification function:
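For reference, the steps described above (training set, soft-margin problem with penalty factor C and slack variables, Lagrangian, polynomial kernel, dual problem, and classification function) correspond to the standard soft-margin SVM formulation, which can be written as follows. The notation ($w$, $b$, $\xi_i$, $\alpha_i$, $\mu_i$, $\gamma$, $r$, $d$) follows common textbook usage and is an assumption; it is not necessarily the notation of the paper's own equations.

```latex
% Training set with n samples (assumed notation)
T = \{(x_i, y_i)\}_{i=1}^{n}, \quad x_i \in \mathbb{R}^m,\; y_i \in \{-1, +1\}

% Soft-margin primal problem with penalty factor C and slack variables \xi_i
\min_{w, b, \xi} \; \frac{1}{2}\lVert w \rVert^2 + C \sum_{i=1}^{n} \xi_i
\quad \text{s.t.} \quad y_i (w^\top x_i + b) \ge 1 - \xi_i,\; \xi_i \ge 0

% Lagrangian with multipliers \alpha_i \ge 0 and \mu_i \ge 0
L(w, b, \xi, \alpha, \mu) = \frac{1}{2}\lVert w \rVert^2 + C \sum_{i=1}^{n} \xi_i
  - \sum_{i=1}^{n} \alpha_i \left[ y_i (w^\top x_i + b) - 1 + \xi_i \right]
  - \sum_{i=1}^{n} \mu_i \xi_i

% Polynomial kernel of degree d
K(x_i, x_j) = (\gamma\, x_i^\top x_j + r)^d

% Dual problem obtained from the Lagrangian
\max_{\alpha} \; \sum_{i=1}^{n} \alpha_i
  - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j K(x_i, x_j)
\quad \text{s.t.} \quad \sum_{i=1}^{n} \alpha_i y_i = 0,\; 0 \le \alpha_i \le C

% Classification function with optimal multipliers \alpha_i^* and bias b^*
f(x) = \operatorname{sign}\left( \sum_{i=1}^{n} \alpha_i^* y_i K(x_i, x) + b^* \right)
```

In this form, the hyperparameters that remain to be tuned are the penalty factor $C$ and the kernel parameters, which is the role played by the optimizer introduced next.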
The Grey Wolf Optimizer (GWO) algorithm was first proposed by Seyedali Mirjalili et al. [
The hunting process of wolves is generally divided into three stages: tracking, chasing, and approaching the prey; pursuing, encircling, and harassing the prey until it stops moving; and finally attacking the prey. The process is described in detail below.
Grey wolves encircle their prey as follows:
where,
In this equation,
When the gray wolf identifies the prey location, the wolf
The step size and direction of the wolf
In the event that the prey ceases moving, the wolf begins to attack. In the iterative process, the
The specific steps of the GWO optimization process are outlined below (
Step 1: Initialize the parameters of the GWO algorithm, including the population size N and the maximum number of iterations
Step 2: Train the SVM model and calculate the value of its fitness function, i.e., the position of the
Step 3: Calculate the prey position based on the wolf positions and fitness values, and start the hunting search for the prey.
Step 4: If the maximum number of iterations has not been reached, repeat Steps 2–4 until it is reached and the iteration is complete. At that point, the global optimal position corresponds to the optimal values of the parameter C and parameter
Step 5: Use the optimal parameters to build the SVM model.
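The steps above can be sketched in plain Python. This is a minimal illustration that assumes the commonly published GWO update rules (a control parameter `a` decreasing linearly from 2 to 0, and each wolf moving to the average of three leader-guided positions); the function name `gwo_minimize` and its parameters are hypothetical, and a simple quadratic stands in for the SVM error that the paper would minimize over its hyperparameters.

```python
import random


def gwo_minimize(objective, bounds, n_wolves=20, max_iter=100, seed=0):
    """Minimize `objective` over the box `bounds` with a basic Grey Wolf Optimizer."""
    rng = random.Random(seed)

    def clip(value, lo, hi):
        return max(lo, min(hi, value))

    # Step 1: initialize the pack uniformly inside the search box.
    wolves = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_wolves)]
    leaders = []  # best three (fitness, position) pairs found so far

    for t in range(max_iter):
        # Step 2: evaluate every wolf and refresh the alpha, beta, delta leaders.
        scored = [(objective(w), list(w)) for w in wolves]
        leaders = sorted(leaders + scored, key=lambda fw: fw[0])[:3]

        # Control parameter a decreases linearly from 2 toward 0.
        a = 2.0 * (1.0 - t / max_iter)

        # Step 3: move each wolf toward the average of the leader-guided positions.
        for w in wolves:
            for j, (lo, hi) in enumerate(bounds):
                guided = []
                for _, leader in leaders:
                    A = 2.0 * a * rng.random() - a   # step coefficient in [-a, a]
                    C = 2.0 * rng.random()           # random weight on the leader
                    D = abs(C * leader[j] - w[j])    # distance to the leader
                    guided.append(leader[j] - A * D)
                w[j] = clip(sum(guided) / len(guided), lo, hi)

    # Step 4 ends here; also score the positions produced by the last move.
    scored = [(objective(w), list(w)) for w in wolves]
    best_fitness, best_position = min(leaders + scored, key=lambda fw: fw[0])
    return best_position, best_fitness


if __name__ == "__main__":
    # Toy objective standing in for a cross-validated SVM error (assumption).
    position, fitness = gwo_minimize(lambda x: sum(v * v for v in x), [(-5.0, 5.0)] * 2)
    print(position, fitness)
```

For hyperparameter search, `objective` would map a candidate pair of SVM parameters to a validation error, and `bounds` would hold the search range of each parameter.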
The goal of ensemble learning is to combine multiple weak classifiers into a single robust classifier in order to improve overall generalization. Here, weak classifiers are classifiers whose accuracy is only slightly higher than random guessing.
Bootstrap aggregating (Bagging) was originally proposed by Leo Breiman in 1996. It is usually combined with regression and classification algorithms to improve accuracy and stability, which reduces errors and helps avoid overfitting. Essentially, it relies on bootstrap sampling: each bootstrap sample is drawn at random with replacement from the N data points, N times in total. As shown in the figure below, the base learner adopts the GWO-SVM model described in the previous section, which creates an ensemble GWO-SVM model (
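The bootstrap-and-vote mechanism can be illustrated with a short sketch. The helper names are hypothetical, and a one-dimensional threshold rule stands in for the GWO-SVM base learner purely to keep the example self-contained; the toy data are made up for illustration.

```python
import random
from collections import Counter


def bootstrap_sample(data, rng):
    """Draw len(data) items from data uniformly at random, with replacement."""
    return [rng.choice(data) for _ in range(len(data))]


def bagging_fit(data, fit_base, n_estimators=11, seed=0):
    """Train one base learner per bootstrap sample and return the list of models."""
    rng = random.Random(seed)
    return [fit_base(bootstrap_sample(data, rng)) for _ in range(n_estimators)]


def bagging_predict(models, x):
    """Combine the base learners' predictions by majority vote."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]


def fit_threshold_stump(sample):
    """Stand-in base learner (assumption): cut at the midpoint of the class means.

    Assumes class 1 tends to have larger feature values than class 0.
    """
    pos = [x for x, y in sample if y == 1]
    neg = [x for x, y in sample if y == 0]
    if not pos or not neg:  # degenerate bootstrap sample: only one class was drawn
        only = 1 if pos else 0
        return lambda x: only
    cut = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda x: 1 if x > cut else 0


if __name__ == "__main__":
    toy = [(0.1, 0), (0.2, 0), (0.3, 0), (0.4, 0), (0.6, 1), (0.7, 1), (0.8, 1), (0.9, 1)]
    models = bagging_fit(toy, fit_threshold_stump)
    print(bagging_predict(models, 0.95), bagging_predict(models, 0.05))
```

Because each learner sees a different resample, their individual errors are partly independent, and the vote averages them out.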
There is no unified standard for establishing indicators in the research on real estate early warning, and the same is true for establishing indicators for detecting risks in the commercial housing market. Presently, scholars are expected to adhere to six general principles when selecting risk indicators, including comprehensiveness, sensitivity, operability, independence, and effectiveness.
The literature shows that relative indicators are more indicative of the development of real estate market risk than absolute indicators. This paper therefore uses relative indicators to quantify the risk associated with the commercial housing market. Based on the above analysis and a literature review, a total of 12 indicators make up the indicator system of this paper [
| Tier 1 indicators | Secondary indicators | Serial number |
|---|---|---|
| Market level | Population growth rate | |
| | Growth rate of commodity housing prices/growth rate of disposable income of urban residents | |
| | Year-on-year growth rate of completed commercial housing area | |
| | Year-on-year growth rate of commodity house sales prices | |
| | Year-on-year growth rate of sales area of commercial properties | |
| | House price to income ratio | |
| | Growth rate of new construction area of commercial buildings | |
| | Real estate development investment/GDP | |
| Intra-industry coordination | Domestic loans/total investment in real estate development | |
| | Year-on-year growth rate of real estate development loans | |
| Coordination with the national economy | M2 year-on-year growth rate | |
| | Growth rate of land area for commercial housing acquisition/GDP growth rate | |
Based on data from the National Bureau of Statistics, Tianjin Bureau of Statistics and the CHOICE database, the above indicators span the period from January 2010 to December 2020.
The 12 indicators selected initially are then screened. A number of methods exist for selecting features, including grey relational analysis, lasso, and principal component analysis. Random Forest (RF) achieves high accuracy in feature selection and high consistency of feature subsets [
The RF method determines the importance of each feature to be selected and selects the features based on the ranking results. Assume that the original number of samples is
With the Gini index as the splitting criterion, the “Gini importance” is computed as a measure of the importance of each feature, which is expressed as:
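As a minimal sketch of the quantity behind Gini importance, the snippet below computes the Gini impurity of a node and the impurity decrease produced by one split. It assumes the usual definition (impurity equals one minus the sum of squared class proportions); the function names are hypothetical. In a random forest, the importance of a feature is then commonly obtained by accumulating such decreases over all nodes that split on that feature and averaging over the trees.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a node: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(parent, left, right):
    """Impurity decrease obtained by splitting `parent` into `left` and `right`."""
    n = len(parent)
    weighted_children = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(parent) - weighted_children


if __name__ == "__main__":
    parent = ["risk", "risk", "safe", "safe"]
    # A split that separates the classes perfectly removes all impurity.
    print(gini_decrease(parent, ["risk", "risk"], ["safe", "safe"]))
    # A split that leaves both children mixed removes none.
    print(gini_decrease(parent, ["risk", "safe"], ["risk", "safe"]))
```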
The risk early warning index is established after the final screening of the initial indicators. Given the late start of China’s real estate industry, as well as the short time span and poor stability of the relevant data, any risk warning research based on historical data may be subject to significant errors. An effective risk warning index can mitigate these problems while retaining its flexibility. Compared with other objective weighting methods such as the entropy method and the standard deviation method, the CRITIC method considers both the contrast intensity of each indicator and the conflict between indicators. For this reason, the risk warning index was established using the CRITIC objective weighting method. Its calculation formula is as follows:
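The CRITIC weighting described above can be sketched as follows. The sketch assumes the commonly used form of the method: min-max normalization, contrast intensity measured by the standard deviation, and conflict measured by one minus the Pearson correlation. The function names are hypothetical, all indicators are treated as benefit-type for simplicity, and how the paper scales the weighted sum into its final index is not specified here.

```python
import math


def critic_weights(matrix):
    """CRITIC weights for a samples-by-indicators matrix given as a list of rows."""
    n, m = len(matrix), len(matrix[0])
    cols = [[row[j] for row in matrix] for j in range(m)]

    # Min-max normalization (assumes benefit-type, non-constant indicators).
    norm = []
    for col in cols:
        lo, hi = min(col), max(col)
        norm.append([(v - lo) / (hi - lo) for v in col])

    means = [sum(c) / n for c in norm]
    # Contrast intensity: sample standard deviation of each normalized indicator.
    stds = [
        math.sqrt(sum((v - mu) ** 2 for v in c) / (n - 1))
        for c, mu in zip(norm, means)
    ]

    def corr(j, k):
        """Pearson correlation between normalized indicators j and k."""
        cov = sum(
            (a - means[j]) * (b - means[k]) for a, b in zip(norm[j], norm[k])
        ) / (n - 1)
        return cov / (stds[j] * stds[k])

    # Information content: contrast times total conflict with the other indicators.
    info = [stds[j] * sum(1.0 - corr(j, k) for k in range(m)) for j in range(m)]
    total = sum(info)
    return [c / total for c in info]


def risk_index(row, weights):
    """Weighted sum of one observation's normalized indicator values."""
    return sum(w * v for w, v in zip(weights, row))


if __name__ == "__main__":
    toy = [[1, 10, 3], [2, 8, 1], [3, 9, 4], [4, 5, 2]]  # made-up values
    print(critic_weights(toy))
```

An indicator receives a larger weight when it varies strongly across samples and is weakly or negatively correlated with the other indicators.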
Tianjin is a pioneer of reform and opening up, a coastal open city, and the economic center of the Bohai Rim region. As a representative of the Northern Region, the Tianjin housing market is considered indicative of the wider market, and the measurement and early warning research on commodity housing market risk in Tianjin presented in this paper can help real estate markets in the Northern Region prepare for such risk in advance. In this study, twelve primary indicators are selected, ranked by importance, and then filtered by RF (
| Tier 1 indicators | Secondary indicators | RF screening results | Serial number |
|---|---|---|---|
| Market level | Population growth rate | 0.17145 | |
| | Growth rate of commodity housing prices/growth rate of disposable income of urban residents | 0.07497 | |
| | Year-on-year growth rate of completed commercial housing area | 0.10793 | |
| | Year-on-year growth rate of sales area of commercial properties | 0.08075 | |
| | House price to income ratio | 0.15954 | |
| | Growth rate of new construction area of commercial buildings | 0.10816 | |
| Intra-industry coordination | Domestic loans/total investment in real estate development | 0.07895 | |
| Coordination with the national economy | Growth rate of land area for commercial housing acquisition/GDP growth rate | 0.08498 | |
Once the risk index has been synthesized, the question arises of which range of the index should be regarded as safe. Currently, three methods are available for resolving this problem. In the first method, an alarm is raised when the risk index exceeds the mean plus 1 to 1.5 standard deviations. In the second method, an alarm is raised if the risk index surpasses 90% of the norm. In the third method, an existing reference index is identified, and a crisis is considered to be present when the risk index is higher than the reference index [
Commercial Housing Market Risk Index | 1.49–5.43 | >5.43 | <1.49 |
---|---|---|---|
Grade | Security | Risks | Risks |
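The grading rule in the table can be expressed as a small function. The bounds 1.49 and 5.43 are taken from the table; the helper `mean_std_band` illustrates the first threshold method under the assumptions that the band is symmetric around the mean and that the population standard deviation is used, neither of which is stated in the text. Both function names are hypothetical.

```python
import statistics

LOWER, UPPER = 1.49, 5.43  # safe-interval bounds taken from the table


def risk_grade(index_value, lower=LOWER, upper=UPPER):
    """Return "Security" inside [lower, upper] and "Risks" outside it."""
    return "Security" if lower <= index_value <= upper else "Risks"


def mean_std_band(values, k=1.0):
    """Band of mean minus/plus k population standard deviations (assumed form)."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)
    return mu - k * sigma, mu + k * sigma


if __name__ == "__main__":
    for value in (1.0, 3.0, 6.0):
        print(value, risk_grade(value))
```

Values below the lower bound are graded as risky as well, which matches the over-cold market conditions discussed for the most recent phase.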
The reliability of a commodity housing market risk index must be established before the index is applied. Based on Cardarelli [
As a whole, the commodity housing market in Tianjin is subject to cyclical fluctuations, and is divided into four stages of development, as shown below.
Phase 1 (January 2010 to December 2012): Commodity housing risk in Tianjin first rose and then declined. Early in 2010, real estate prices in Tianjin continued to rise, creating greater risk. To mitigate the risk of excessive price increases, the State Council issued the “ten national articles” for the first time on April 17, and Tianjin issued the “Ten Articles of Tianjin” on July 4, establishing policies intended to address the city’s real estate issues. Tianjin then introduced a series of policies aimed at controlling rising housing prices, such as the “Notice on Issues Related to the Purchase of Newly Built Commercial Housing in the Six Districts of the City by Families of Residents of the City and Foreign Provinces” issued by the Municipal Land Resources and Housing Administration on October 13. By early 2011, the rate of house price increases had slowed, and the risk was reduced through the continued implementation of financial and land policies.
Phase 2 (January 2013 to January 2016): Commodity housing risk in Tianjin first rose and then declined. The housing market began to rebound in 2013, and the risk of commercial housing increased after the real estate market exchanged price for volume. In response, the municipal government of Tianjin issued the “Five Articles of Tianjin” to cool and stabilize the real estate market. To address the imbalance between supply and demand for commodity housing, Tianjin abolished the purchase restriction policy in 2014 and encouraged residents to purchase properties through land, monetary, and tax policies. De-stocking came to a head in 2015, and regulations on finance, taxation, and home ownership continued to be relaxed, which led to a heating up of the market and a reduction of the risk of commodity housing.
Phase 3 (February 2016 to August 2019): Commodity housing risk in Tianjin first rose and then declined. In 2016, the housing market continued to experience high prices and risks, and Tianjin introduced the “three articles of Tianjin” to strengthen the management of housing and the market, as well as the “930” new policy on September 30 to implement regionalized purchase restrictions and differential credit policies. In 2017, in accordance with General Secretary Xi’s “no speculation in housing” policy, Tianjin further restricted housing purchases, controlled prices, and added other market management measures, resulting in a gradual stabilization of the commodity housing market. In the following two years, Tianjin strictly implemented its real estate regulation and control policy. The Ministry of Housing and Urban-Rural Development maintained the positioning that “the house is for living, not speculation”, implemented the long-term property management mechanism, and promoted the steady and balanced development of the commodity housing market.
Phase 4 (September 2019 to December 2020): Since the end of 2019, the COVID-19 outbreak has had a considerable impact on China’s commercial housing market, reducing the willingness and ability of residents to purchase commercial housing and leading to a period of excessively cold conditions for real estate firms and the market.
Comparing the above phases with the actual real estate risk events reveals that the fluctuations of the risk index synthesized in this paper roughly coincide with the actual evolution of market risk. The results show that the risk index of the commodity housing market synthesized in this paper can represent the actual risk situation in Tianjin, which establishes its reliability.
This paper presents a Bagging-GWO-SVM early warning model in which the risk index provides the label values, so all the data required by the model are available. In all, 132 data samples are classified into two categories: “safety” and “risk”. Seventy percent of the data samples are selected at random to serve as the training set, while the rest are used as the test set. The maximum number of iterations of the model is set to 30, degree = 4, and the parameters
Sample set | Accuracy |
---|---|
Training set | 0.99 |
Test set | 0.95 |
The ROC curve is used to evaluate the classifier, and the critical value is determined based on its generalization performance. AUC represents the area under the ROC curve. The closer the AUC is to 1, the better a model’s performance in terms of generalization.
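As a minimal sketch of what the AUC measures, the function below computes it directly from its rank interpretation: the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half. The function name and the example scores are illustrative assumptions, not values from this study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Three of the four positive/negative pairs are ordered correctly.
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

This pairwise form is quadratic in the number of samples, which is acceptable for a data set of the size used here.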
The main objective of this paper is to evaluate the early warning effect of the Bagging-GWO-SVM model. For comparison, SVM, K-Nearest Neighbor (KNN), Naive Bayes (NB), Logistic Regression (LR), and Bagging-SVM were chosen, and each was trained 30 times, as shown in
| Model | Training accuracy | Training precision | Training recall | Training F1-score | Test accuracy | Test precision | Test recall | Test F1-score |
|---|---|---|---|---|---|---|---|---|
| Bagging-GWO-SVM | 0.989 | 0.989 | 0.989 | 0.989 | 0.95 | 0.953 | 0.95 | 0.948 |
| Bagging-SVM | 0.902 | 0.913 | 0.902 | 0.893 | 0.925 | 0.932 | 0.925 | 0.920 |
| GWO-SVM | 1 | 1 | 1 | 1 | 0.9 | 0.9 | 0.9 | 0.9 |
| SVM | 0.880 | 0.897 | 0.880 | 0.860 | 0.8 | 0.844 | 0.8 | 0.763 |
| KNN | 0.902 | 0.905 | 0.902 | 0.895 | 0.9 | 0.898 | 0.9 | 0.896 |
| NB | 0.913 | 0.911 | 0.913 | 0.912 | 0.9 | 0.9 | 0.9 | 0.9 |
| LR | 0.837 | 0.845 | 0.837 | 0.81 | 0.925 | 0.931 | 0.925 | 0.92 |
Based on the data in the table, it is possible to draw the following conclusions.
(1) On the training set, the accuracy, precision, recall, and F1-score of Bagging-GWO-SVM and GWO-SVM are all greater than 98%. The metrics of Bagging-SVM, NB, and KNN are around 90%, while those of SVM and LR are below 90%.
(2) On the test set, the accuracy, precision, recall, and F1-score of Bagging-GWO-SVM, Bagging-SVM, GWO-SVM, LR, and NB are all at least 90%, with Bagging-GWO-SVM scoring the highest at 95% accuracy; KNN and SVM do not reach 90% in all metrics.
(3) On the basis of the above analysis, Bagging-GWO-SVM performs well on both the training and test sets and achieves the best test-set results among the compared models. GWO-SVM alone fits the training set perfectly but scores lower on the test set, which is consistent with Bagging improving generalization.
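The four metrics in the table can be computed as in the sketch below. In every row of the table the recall equals the accuracy, which is what support-weighted averaging over the two classes produces; the sketch therefore assumes weighted averaging, which is an inference from the table rather than something stated in the text. The function name and the label vectors in the usage example are hypothetical.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall and F1 over all classes."""
    n = len(y_true)
    support = Counter(y_true)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n
    precision = recall = f1 = 0.0
    for cls, count in support.items():
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        predicted = sum(p == cls for p in y_pred)
        prec_c = tp / predicted if predicted else 0.0
        rec_c = tp / count
        f1_c = 2 * prec_c * rec_c / (prec_c + rec_c) if prec_c + rec_c else 0.0
        weight = count / n  # share of samples that truly belong to this class
        precision += weight * prec_c
        recall += weight * rec_c
        f1 += weight * f1_c
    return accuracy, precision, recall, f1


if __name__ == "__main__":
    # Hypothetical 40-sample test set: 10 "risk" (1) and 30 "safety" (0) samples,
    # with two risk samples predicted as safe.
    y_true = [1] * 10 + [0] * 30
    y_pred = [1] * 8 + [0] * 2 + [0] * 30
    print([round(v, 3) for v in weighted_metrics(y_true, y_pred)])
    # [0.95, 0.953, 0.95, 0.948]
```

With this hypothetical class split, the output coincides to three decimals with the reported test-set row for Bagging-GWO-SVM; the split itself is an illustrative choice and is not taken from the paper.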
Commodity housing market risk has important implications for the real estate industry, governments, and consumers. In this paper, an index system for measuring commodity housing market risk is constructed, and a Bagging-GWO-SVM model is developed to provide early warning signals. The results are as follows.
(1) Eight key factors influencing commodity housing market risk are identified. The initial index system comprises three levels and 12 indicators. Screening these 12 indicators with the RF method yields eight key influencing factors: population growth rate, growth rate of commodity housing prices/growth rate of disposable income of urban residents, year-on-year growth rate of completed commercial housing area, year-on-year growth rate of sales area of commercial properties, house price to income ratio, growth rate of new construction area of commercial buildings, domestic loans/total investment in real estate development, and growth rate of land area for commercial housing acquisition/growth rate of GDP. According to the findings of the empirical study, the selected key factors reflect the current state of the Tianjin real estate market.
(2) A commodity housing market risk index corresponding to the current market conditions in Tianjin is formulated. Using the CRITIC method, the index is derived from the eight key influencing factors. Comparing the risk index with actual commodity housing market events in Tianjin shows that the index is consistent with actual market conditions, which demonstrates its reliability.
(3) A Bagging-GWO-SVM model is developed to provide early warning of commodity housing market risk in Tianjin. Compared with SVM, KNN, NB, LR, and Bagging-SVM, the Bagging-GWO-SVM model provides higher early warning accuracy and is applicable to the early warning of commodity housing market risk in Tianjin.
This study demonstrates that the risk index is consistent with the actual market situation and that the Bagging-GWO-SVM early warning model is highly accurate. Consequently, the model may be used as an early warning tool for subsequent commodity housing market risk management. The following recommendations are made regarding the management of commodity housing market risks.
(1) Integrate short-term regulation goals with medium-term and long-term regulatory objectives. A variety of areas, such as finance, taxation, and insurance, are closely linked to real estate. The macro regulation of Tianjin’s commercial housing market therefore needs to coordinate government decision making, the construction market, and commercial housing transactions. Moreover, regulation methods that take into account the long cycle and the different phases of market development are needed to ensure the health of the market.
(2) Plan urban land rationally. According to the government, “Houses are for living, not for speculation.” Because the central city of Tianjin is relatively small and has limited land resources, speculation in both housing and land should be strictly prohibited. The general and detailed design of a project must take into account the overarching principles of Tianjin’s planning as well as its specific requirements.
(3) Enhance regional price monitoring. The direction, scope, and duration of price variations in Tianjin’s various regions are interdependent. By strengthening its monitoring of price changes in each region, Tianjin will be able to gauge price trends better and formulate measures as required.