Volatility modeling and prediction: the role of price impact

In this paper, we explore the role of price impact, derived from the order book, in modeling and predicting stock volatility. This is motivated by the market microstructure literature that examines the mechanics of price formation and its relevance to market quality. Using a comprehensive dataset of intraday bids, asks, and three levels of market depth for 148 stocks on the Shanghai Stock Exchange from 2005 to 2016, we find substantial intraday impact from incoming bid and ask limit and market orders on stock prices. More importantly, the permanent price impact at the daily level is a significant determinant of stock volatility dynamics, as suggested by the panel VAR estimation. Furthermore, when we augment traditional volatility models with the time series of daily price impact, the augmented models produce significantly more accurate volatility predictions at the one-day ahead forecasting horizon. These volatility predictions also offer economic gains to a mean-variance utility investor in a portfolio setting.


Introduction
Intraday price formation and variation have been a central topic in the market microstructure literature for several decades. French and Roll (1986) is an early effort which shows that stock volatility is significantly higher during trading hours than during non-trading hours and attributes this to microstructure phenomena. Both theoretical and empirical studies focus on examining the sources of these intraday price variations. Many argue that order book events are a conduit for volatility information and, intuitively, stock return volatility is partially determined by microstructure noise generated in the trading process. Madhavan et al. (1997) develop a theoretical model and decompose determinants of stock volatility into public news and microstructure-induced noise, i.e. the effective bid-ask spread; and Ahn et al. (2001) establish a bilateral relation between transitory volatility and order flow. These results are in line with those from Foucault (1999), Foucault et al. (2007), and Handa and Schwartz (1996).
Motivated by this strand of the literature, we explore in this paper whether the information content of order book events such as the arrival of limit and market orders and trades is an important driver of stock volatility in-sample. If the answer is affirmative, we are also interested in knowing whether the time series of price impact is able to improve the precision of out-of-sample volatility predictions, both in statistical and economic terms. Hence our paper crosses over between two important fields in finance, i.e. market microstructure and volatility prediction, and extends the existing literature in which microstructure information such as the bid-ask spread (Bollerslev and Melvin 1994), information flow (Gallo and Pacini 2000), and trading volume (Wagner and Marsh 2005) is adopted for volatility modeling and forecasting in a GARCH-X type model.
* Corresponding author. Email: xiaoquan.liu@nottingham.edu.cn
Our empirical analysis is conducted on the Chinese stock market, a major order-driven emerging market that has enjoyed exponential growth since its inception. Established in 1990, the Shanghai Stock Exchange (SSE) started with only eight listed stocks, but in less than three decades it has grown to trade more than 1400 stocks with a total market capitalization of RMB 30 trillion as of July 2018. † During this period, the market has experienced a number of major policy shocks such as the ownership structure reform in 2005 and, more recently, the ill-fated circuit breaker regulation in 2016. The less-than-stable institutional environment may induce a different information reflection process in equity prices. Equally importantly, the market is characterized by a disproportionately high share of individual investors compared with developed equity markets; hence the price dynamics from order book events could be different, as suggested by the asymmetric information theory (Milgrom and Stokey 1982, Kyle 1985). Our sample consists of 148 firms traded on the SSE, all of which are component stocks of the Chinese CSI 300 index, and covers a variety of sectors. The data are intraday bid and ask quotes over three depth levels over a long sample period from January 2005 to August 2016. † We contribute to the literature by offering a comprehensive study that explores the price impact of order book events in this young, dynamic yet important emerging market, and reveals how the price impact of incoming orders affects volatility and improves its prediction accuracy. ‡
† See http://english.sse.com.cn/indices/statistics/market/.
Methodologically, we first follow Hautsch and Huang (2012) and estimate the price impact of incoming limit and market orders in a vector autoregression (VAR) framework. The econometric framework is able to consider the short- and long-run impact of buy and sell orders via impulse response functions.
We then investigate whether the price impact significantly affects stock volatility via a panel VAR model that allows us to evaluate the price impact on the volatility of all sample stocks simultaneously. Finally, we include the price impact as an additional variable in two commonly used volatility models, the standard GARCH model of Bollerslev (1986) and the heterogeneous autoregressive (HAR) realized volatility model of Corsi (2009), and compare the forecasting accuracy of the augmented models with that of the original models in statistical and economic terms.
We reveal a host of interesting findings. First, we document substantial price impact of incoming limit and market orders similar in magnitude to that of developed markets (Hautsch and Huang 2012), and a significant relation between price impact and volatility in the Chinese equity market. Both ask and bid prices tend to shift significantly after the arrival of a buy or sell limit or market order. The impact is, however, asymmetric: we show that the magnitude of price impact induced by a market order is generally larger than that induced by a limit order. We also notice that the arrival of a sell market order gives rise to a larger impact on ask or bid prices than a buy market order. The permanent price impact induced by incoming limit and market orders is highly significant, indicating that incoming orders contain substantial information and contribute to the price discovery process. This is consistent with the existing finding that order book events play an important role in the price formation process in many developed markets.
† Our sample compares favorably to 50 US stocks for a sample period of 21 days in Cont et al. (2014); 30 stocks on the Euronext Amsterdam exchange with a two-month sample period in Hautsch and Huang (2012); and 100 stocks on Nasdaq over two years in Engle and Patton (2004).
‡ The only related study is Jain and Jiang (2014), which shows that the limit order book slope consistently and significantly predicts future price volatility. However, that paper neither models the price impact of incoming orders nor evaluates the forecast accuracy of volatility.
Second, adopting a panel VAR model which allows us to gauge the effect of permanent price impact series on all sample stocks, we show that changes in aggregate daily price impact cause significant changes in stock volatility. This is the first piece of evidence on the link between stock volatility and the price impact of incoming limit or market orders, and it is in line with the theoretical framework in Madhavan et al. (1997) that microstructure noise is an integral part of the information source for volatility.
Third, adding daily permanent price impact to GARCH and HAR models significantly improves the out-of-sample accuracy of volatility forecasts. We adopt the popular Diebold and Mariano (1995) pairwise comparison and show that the augmented GARCH-X and HAR-X models with the time series of price impact consistently produce statistically smaller forecasting errors across three different loss functions. Furthermore, for a mean-variance utility investor who allocates her wealth between a stock and the risk-free asset, the volatility predictions from augmented models lead to significantly higher annualized portfolio returns, Sharpe ratios, and certainty equivalent returns across a range of risk aversion levels. These novel findings support our conjecture that the price impact of incoming order book events contains valuable information for volatility and that adding this information improves volatility forecasting precision in statistical and economic terms.
The rest of the paper is organized as follows. Section 2 reviews relevant literature in market microstructure and volatility modeling. Section 3 describes the methodology adopted in this study. In Section 4, we introduce the data and analyze empirical results. Finally, Section 5 concludes. Additional materials are provided in the Appendix.

Literature review
In an order-driven market there is no designated market maker for liquidity provision. Instead, traders choose to submit limit and/or market orders which will automatically be matched by an electronic trading system and thus change the pending volume and the best bid or ask quotes. Glosten (1994) derives the equilibrium price determined by bid and ask quotes in an open order book for an order-driven market; while Foucault (1999), Foucault et al. (2005), Goettler et al. (2005), Goettler et al. (2009), and Rosu (2009) capture the dynamics of a limit-order market via game theoretic models. To date, limit order trading has been examined worldwide in the NASDAQ (Eisler et al. 2012, Cont et al. 2014), the Deutsche Boerse (Riordan and Storkenmaier 2012), the Oslo Stock Exchange (Naes and Skjeltorp 2006), the Paris Bourse (Biais et al. 1995, 1999), the Euronext Amsterdam (Hautsch and Huang 2012), the Tokyo Stock Exchange (Lehmann and Modest 1994, Hamao and Hasbrouck 1995), the Australian Stock Exchange (Cao et al. 2009), and the Heng Seng Stock Exchange (Ahn et al. 2001).
Thanks to the availability of high frequency data, one important line of research on order-driven markets in recent years seeks to understand the price impact of orders, since it is a fundamental mechanism of price formation (Cont et al. 2014, Wilinski et al. 2015, Gencay et al. 2018). Dufour and Engle (2000), Easley et al. (1996), Engle and Patton (2004), Hasbrouck (1991) and Jang and Venkatesh (1991) explore how characteristics of trades such as frequency, size, order flows and bid-ask spread contribute to price formation. However, focusing only on trades misses the rich information contained in quotes, which provide a more detailed picture of price formation (Engle and Lunde 2003). For example, Weber and Rosenow (2005) show that arriving limit orders play an important role in determining price dynamics; Knez and Ready (1996) argue that outstanding limit orders significantly affect individual orders; and Cont et al. (2014) investigate the instantaneous impact of order book events on equity prices and conclude that price changes are mainly driven by the order flow imbalance. Most relevant to our paper, Hautsch and Huang (2012) quantify price impact based on the framework of Hasbrouck (1991) and Engle and Patton (2004). They measure the price impact of limit orders as the implied expected short- and long-run shifts of ask and bid quotes after submission. Their novel econometric framework captures relevant trading characteristics and provides a comprehensive description of the order book.
Meanwhile, the importance of volatility, which is central to portfolio allocation, derivative valuation, and risk management, is well documented. The literature on volatility modeling has advanced significantly since the seminal work of Engle (1982) (see Andersen et al. 2003, Hansen and Lunde 2011, Engle et al. 2013). One strand in this literature extends volatility modeling by incorporating market microstructure variables in low-order ARCH-family volatility models. Bollerslev and Melvin (1994) document empirical evidence that the size of the bid-ask spread in the foreign exchange market is related to exchange rate volatility in a GARCH framework. This is consistent with theories of asymmetric information in bid-ask spreads. Adding a measure of overnight information flow between market close and open, Gallo and Pacini (2000) reveal a significant relation between this measure and stock volatility in GARCH and EGARCH settings. Furthermore, trading volume is another popular microstructure measure extensively explored in the volatility literature and shown to relate to asset volatility (see Lamoureux and Lastrapes 1990, Wagner and Marsh 2005, Fleming et al. 2008). Our paper is motivated by and contributes to both strands of the literature.

Econometric framework
We first follow Hautsch and Huang (2012) in modeling and quantifying price impact via a restricted VAR model. † The vector of variables includes the logarithmic values of best bid/ask limit and market quotes, the best three volumes on both sides of bid and ask for limit and market orders, and trades. The short- and long-run price impacts are estimated via the impulse response function of the VAR, and the long-run impact is considered the permanent price impact and included in the volatility forecasting exercise. The details of the VAR model are summarized in Appendix 1. In this section, we focus on the panel VAR (PVAR) model to examine the relevance of price impact to stock volatility in-sample, and how the information can be utilized in out-of-sample forecasting exercises.
† The restrictions are specified in Appendix 1.

The PVAR model
Given the theoretical and empirical evidence on the relationship between market microstructure variables and asset volatility, we hypothesize that the price impact of incoming orders exerts significant impact on stock volatility. We use permanent price impact of incoming limit and market orders since it represents equilibrium price changes induced by order book events. We adopt a VAR framework in which all variables are treated as endogenous and interdependent both in a dynamic and a static sense. The impulse response function of the VAR system is able to reflect the change in one variable driven by the change in others.
We construct the VAR system that includes daily stock volatility estimated by the GARCH or the HAR model, and daily permanent price impact induced by arriving bid and ask limit and market orders. We use daily data as it is the most commonly used frequency in the volatility forecasting literature. The price impact series for each stock are estimated at the intraday frequency through the impulse response function in equation (A8) based on the estimation of the VAR model in equation (A2), and aggregated to daily level by adding intraday observations.
It is tedious to estimate the VAR stock by stock with permanent price impact series. It is also difficult to draw a general conclusion on the relationship between price impact and volatility through individual estimation. To overcome this difficulty, we implement a PVAR model, which has the same structure as a VAR model but adds a cross-sectional dimension to the representation. PVAR models have been increasingly applied in the finance and economics literature (see Holtz-Eakin et al. 1998, Love and Zicchino 2006, Canova et al. 2007, Beetsma and Giuliodori 2011, Canova and Ciccarelli 2012). They are particularly suited for questions such as incorporating time variation in the coefficients and in the variance of shocks, accounting for the cross-sectional dynamic heterogeneity, and identifying links across units in an unrestricted fashion (Canova and Ciccarelli 2013). We take advantage of the cross-sectional feature in PVAR models by including all sample stocks and evaluating their volatility dynamics in the presence of the time series of price impact. This allows us to obtain a comprehensive picture of the relationship.
Following Abrigo and Love (2016a), we define the k-variate homogeneous PVAR model of order p with panel-specific fixed effects as follows:
$$Y_{id} = Y_{i,d-1}A_1 + Y_{i,d-2}A_2 + \cdots + Y_{i,d-p+1}A_{p-1} + Y_{i,d-p}A_p + u_i + e_{id}, \qquad (1)$$
where $i = 1, 2, \ldots, N$, and $N$ is the number of panels, i.e. the number of stocks in our sample; $d = 1, 2, \ldots, D_i$, and $D_i$ is the number of days in the sample for each stock $i$. For each panel, $Y_{id}$ is a $1 \times k$ vector of dependent variables; $u_i$ and $e_{id}$ are $1 \times k$ vectors of dependent variable-specific panel fixed effects and idiosyncratic errors, respectively; and the $k \times k$ matrices $A_1, A_2, \ldots, A_{p-1}, A_p$ are parameters to be estimated. Consistent parameter estimates are obtained via an equation-by-equation generalized method of moments (GMM) procedure (Abrigo and Love 2016a).
To investigate the relationship between price impact and volatility, the impulse response function implied by the PVAR model is of great interest. Re-writing the model as an infinite vector moving average (VMA), the simple impulse response function at horizon $l$ is given by the corresponding VMA parameters (equation (2)). In our study, we adopt the bootstrap re-sampling method following Kapetanios (2008) with 100 Monte Carlo draws to estimate the confidence interval of the impulse response function. The PVAR system in equation (3) is built from the following variables. For stock $i$ on day $d$, $Vol_{id}$ denotes stock volatility, which is proxied by the GARCH volatility in equation (5) or the realized volatility in equation (6) specified below. Furthermore, $LBid2P_{id}$ and $LAsk2P_{id}$ are the permanent price impact incurred by bid limit orders and ask limit orders, respectively; and $MBid2P_{id}$ and $MAsk2P_{id}$ are the permanent price impact incurred by bid market orders and ask market orders, respectively. They are obtained by aggregating intraday price impact to the daily level following equation (A8). †
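As a minimal sketch of how simple impulse responses follow from estimated VAR coefficient matrices, the snippet below applies the standard VMA recursion under the row-vector convention of the PVAR model above: the response at horizon 0 is the identity matrix, and the response at horizon $l$ is the sum over $j = 1, \ldots, \min(l, p)$ of the response at horizon $l - j$ times $A_j$. The function name `simple_irf`, the symbol `Phi` and the toy coefficient matrix are illustrative assumptions, not the paper's estimates, and no bootstrap confidence band is computed.

```python
import numpy as np

def simple_irf(coefs, horizon):
    """Return [Phi_0, ..., Phi_horizon] for Y_d = sum_j Y_{d-j} A_j + e_d.

    coefs is the list [A_1, ..., A_p] of k x k matrices (row-vector form).
    """
    k = coefs[0].shape[0]
    p = len(coefs)
    phis = [np.eye(k)]  # Phi_0 is the identity
    for l in range(1, horizon + 1):
        phi = np.zeros((k, k))
        # Phi_l = sum_{j=1}^{min(l, p)} Phi_{l-j} A_j
        for j in range(1, min(l, p) + 1):
            phi += phis[l - j] @ coefs[j - 1]
        phis.append(phi)
    return phis

# Toy bivariate first-order system, purely illustrative
A1 = np.array([[0.5, 0.1],
               [0.2, 0.3]])
irf = simple_irf([A1], horizon=5)
```

For a first-order system the recursion collapses to matrix powers of `A1`, which gives a quick sanity check on the implementation.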

Volatility modeling and forecasting
The GARCH model.
The generalized autoregressive conditional heteroskedasticity (GARCH) model of Bollerslev (1986) takes account of the time-varying volatility clustering of most financial time series and has been widely applied in many studies (see Glosten et al. 1993, Andersen and Bollerslev 1998b, Martens 2001, Chortareas et al. 2011, Jiang et al. 2017). We use the most parsimonious GARCH(1,1) specification in our study. In this specification, $\eta_d$ represents the daily return series, computed as the difference between the logarithmic prices on day $d$ and day $d-1$; $\mu$ is the mean; and $\varepsilon_d$ is the innovation conditional on the information set, which follows a t-distribution denoted by $t_v$ with zero mean, variance $\sigma_d^2$, and $v$ degrees of freedom. In addition, $\beta$ is the GARCH component coefficient and $\alpha$ is the ARCH component coefficient. The GARCH model requires that $\alpha + \beta < 1$ for the volatility process to be stationary. Note that the volatility $\sigma_d$ estimated in equation (5) is used in the PVAR of equation (3).
† We follow Abrigo and Love (2016b) and Schnücker (2016)

The HAR model.
Proposed by Corsi (2009), the HAR model is a simple AR-type model in realized volatility that considers different volatility components realized over different time horizons. We choose this model for its ability to capture the main empirical characteristics of financial returns such as long memory, fat tails and multi-scaling, which cannot be handled by traditional short-memory models such as the GARCH. The HAR model also overcomes undesirable features of fractional integration models such as artificially mixing long- and short-term characteristics, difficulty in estimation, and inability to handle the multi-scaling feature (Comte and Renault 1998). Most importantly, it exhibits remarkable forecasting performance (Corsi 2009) and hence has been widely adopted in the literature (see Hillebrand and Medeiros 2010, Chiriac and Voev 2011, Fernandes et al. 2014, Dimpfl and Jank 2016, for example).
The model includes an additive cascade of volatility components defined over different time horizons, where $RV_d^{(d)}$, $RV_d^{(w)}$ and $RV_d^{(m)}$ represent the daily, weekly and monthly volatility components, respectively, on day $d$. The daily realized volatility $RV_d^{(d)}$ is calculated by aggregating intraday squared returns as shown in equation (9). The weekly and monthly realized volatilities are simple averages of the daily realized volatility. Irrespective of their actual frequency, volatility quantities are annualized to facilitate comparison between different frequencies. Note that the realized volatility $RV_d^{(d)}$ estimated in equation (6) is used in the PVAR of equation (3).
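The two baseline models can be illustrated with a short sketch. It assumes the textbook GARCH(1,1) variance recursion with a constant called `omega` and, for the HAR components, the 5-day and 22-day averaging windows commonly used with Corsi's model; the constant's name, the window lengths, the start-up value and all function names are assumptions made for illustration and are not taken from the paper. The GARCH function only filters the variance for given parameters and does not estimate them.

```python
import numpy as np

def garch11_variance(returns, mu, omega, alpha, beta):
    """Filter sigma2_d = omega + alpha * eps_{d-1}^2 + beta * sigma2_{d-1}."""
    eps = np.asarray(returns, dtype=float) - mu
    sigma2 = np.empty_like(eps)
    sigma2[0] = eps.var()  # assumption: start from the sample variance
    for d in range(1, len(eps)):
        sigma2[d] = omega + alpha * eps[d - 1] ** 2 + beta * sigma2[d - 1]
    return sigma2

def har_regressors(rv, week=5, month=22):
    """Daily, weekly and monthly RV components, all dated on the same day d."""
    rv = np.asarray(rv, dtype=float)
    days = np.arange(month - 1, len(rv))  # first day with a full monthly window
    daily = rv[days]
    weekly = np.array([rv[d - week + 1: d + 1].mean() for d in days])
    monthly = np.array([rv[d - month + 1: d + 1].mean() for d in days])
    return daily, weekly, monthly

def fit_har(rv, week=5, month=22):
    """OLS of next-day RV on a constant and the three components."""
    daily, weekly, monthly = har_regressors(rv, week, month)
    X = np.column_stack([np.ones(len(daily) - 1),
                         daily[:-1], weekly[:-1], monthly[:-1]])
    y = np.asarray(rv, dtype=float)[month:]  # RV on day d + 1
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef  # [constant, daily, weekly, monthly]
```

Because the HAR regression is linear in its components, ordinary least squares is enough here, whereas a GARCH model with t-distributed innovations would normally be fitted by maximum likelihood.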

Proxy for latent volatility dynamics.
The true volatility is an unobservable latent variable. In the literature, the most popular proxy is the realized volatility (RV) proposed by Andersen and Bollerslev (1998a). This is obtained by aggregating intraday squared returns. We follow this approach and construct a realized volatility series from logarithmic returns sampled at 4-6 second intervals (equation (9)), where $\hat{\sigma}_{rv,d}$ is the realized volatility for day $d$ and $r_{d,t}^2$ is the squared intraday logarithmic return on day $d$ for time index $t$ ($t = 1, 2, \ldots, T$). We use $\hat{\sigma}_{rv,d}$ as the proxy for the true volatility to evaluate the accuracy of out-of-sample forecasts with the loss functions in equations (13)-(15).
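The aggregation step can be sketched in a few lines. The snippet assumes a one-dimensional array of intraday prices for a single trading day; the function names and the 252-day annualization factor are illustrative assumptions rather than the paper's choices.

```python
import numpy as np

def realized_variance(prices):
    """Sum of squared intraday log returns for one trading day."""
    log_p = np.log(np.asarray(prices, dtype=float))
    r = np.diff(log_p)  # intraday log returns r_{d,t}
    return float(np.sum(r ** 2))

def annualized_realized_vol(prices, trading_days=252):
    """Square root of the annualized realized variance (252 days assumed)."""
    return float(np.sqrt(trading_days * realized_variance(prices)))

# Toy price path: up one percent, then back down
rv_day = realized_variance([100.0, 101.0, 100.0])
```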

Forecasting models.
To incorporate the information content of price impact into volatility forecasting, we add the time series of permanent price impact of incoming buy and sell limit and market orders to the baseline GARCH and HAR models in equations (5) and (6), respectively, and formulate the GARCH-X and HAR-X models to produce out-of-sample volatility forecasts. The factor X in the GARCH-X and HAR-X models is either the permanent price impact of buy and sell limit orders or the permanent price impact of buy and sell market orders. In the GARCH(1,1)-X model, we include the price impact of limit and market orders separately to distinguish their information content for volatility modeling for each stock, resulting in 148 × 2 estimations. We form the HAR-X model in the same way.

Forecast evaluation.
In-sample coefficient significance does not always translate into out-of-sample forecasting accuracy, which is more relevant for investors and traders. Hence we compare the out-of-sample performance between benchmark GARCH and HAR models and augmented GARCH-X and HAR-X models. For each stock, we select the first 80% of data for the in-sample estimation and use the remainder for out-of-sample prediction. We use a rolling window scheme and compute one-day ahead forecasts. We evaluate the forecasting accuracy using three popular loss functions: the root mean squared error (RMSE), the mean absolute percentage error (MAPE), and the mean absolute error (MAE). In these loss functions, $M$ is the number of days in the out-of-sample period, $var_{d+1}$ is the one-day ahead forecasted variance obtained either from equations 11(a,b) or 12(a,b), and $\hat{\sigma}^2_{rv,d+1}$ is the proxy for the true variance in equation (9).
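For concreteness, the three loss functions can be computed as in the sketch below, which uses their textbook definitions over the $M$ out-of-sample days. The function name, the dictionary layout and the toy numbers are illustrative assumptions, and MAPE is expressed as a fraction rather than a percentage.

```python
import numpy as np

def forecast_losses(forecast_var, realized_var):
    """RMSE, MAPE and MAE of variance forecasts against the realized proxy."""
    f = np.asarray(forecast_var, dtype=float)
    a = np.asarray(realized_var, dtype=float)
    err = f - a
    return {
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
        "MAPE": float(np.mean(np.abs(err / a))),  # requires a > 0
        "MAE": float(np.mean(np.abs(err))),
    }

# Toy example with two out-of-sample days
losses = forecast_losses([2.0, 4.0], [1.0, 5.0])
```

Comparing the three measures is useful because RMSE penalizes large misses more heavily, while MAPE scales each error by the level of the realized variance.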
The model with the smaller forecasting error is not necessarily superior to competing models, as the difference between two forecasts can be statistically insignificant. To take this into account, Diebold and Mariano (1995) (DM henceforth) propose a pairwise comparison test between two forecasting models. The DM statistic follows an asymptotic standard normal distribution under the null hypothesis. We implement the test to assess whether an augmented volatility model produces statistically more accurate forecasts than the benchmark model. In the test statistic, $\Delta Loss_{d+1}$ is the difference of forecasting errors between the benchmark and competing models, and the statistic is standardized by a consistent estimate of the asymptotic variance of $M^{-0.5}\sum_{d=1}^{M}\Delta Loss_{d+1}$. The null hypothesis is $H_0: E[\Delta Loss_{d+1}] = 0$. A positive (negative) and significant t-statistic suggests that the competing (benchmark) model significantly outperforms the counterpart model and is preferred for its more accurate volatility forecasts.
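A minimal sketch of the pairwise comparison follows. It assumes one-day ahead forecasts, so the long-run variance of the loss differential is approximated by its sample variance with no autocovariance correction; that simplification, the function name and the toy loss series are assumptions for illustration only. The sign convention matches the text: the differential is the benchmark loss minus the competing loss, so a positive statistic favors the competing model.

```python
import math
import numpy as np

def diebold_mariano(loss_benchmark, loss_competing):
    """DM statistic and two-sided p-value for equal predictive accuracy."""
    d = (np.asarray(loss_benchmark, dtype=float)
         - np.asarray(loss_competing, dtype=float))
    m = len(d)
    var_d = d.var(ddof=0)  # lag-0 autocovariance only (one-step forecasts)
    stat = d.mean() / math.sqrt(var_d / m)
    # Two-sided p-value under the asymptotic standard normal distribution
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))
    return stat, p_value

# Toy per-day losses for a benchmark and an augmented model
dm_stat, dm_p = diebold_mariano([1.0, 2.0, 3.0, 4.0], [0.5, 1.0, 2.5, 3.0])
```

For multi-step forecasts the loss differentials are serially correlated, and a heteroskedasticity and autocorrelation consistent variance estimate would be needed instead.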

Portfolio exercise.
Strong statistical performance does not necessarily imply economic gains to investors. Therefore we analyze the economic value of volatility forecasts assuming a mean-variance utility investor who allocates her wealth between one of the Chinese stocks in our sample and a risk-free asset. We follow Rapach et al. (2010) and Wang et al. (2016) to construct the utility function in equation (17), where on day $d$, $w_d$ is the weight of the stock in the portfolio, $r_d$ is the stock return in excess of the risk-free rate $r_{d,f}$, and $\gamma$ denotes the level of risk aversion. We maximize the utility function $U_d(r_d)$ with respect to the weight $w_d$ and obtain the ex ante optimal weight for day $d+1$:
$$\hat{w}_d = \frac{1}{\gamma}\,\frac{\hat{r}_{d+1}}{\hat{\sigma}^2_{d+1}},$$
where $\hat{r}_{d+1}$ and $\hat{\sigma}^2_{d+1}$ are the forecasted mean and variance, respectively, of excess returns to the stock. The risk-free rate is the short-term government lending rate.
Following Rapach et al. (2010) and Wang et al. (2016), we take the historical average as the mean forecast for returns, $\hat{r}_{d+1} = \frac{1}{d}\sum_{j=1}^{d} r_j$. Hence, for each level of risk aversion $\gamma$, the optimal weight $\hat{w}_d = (1/\gamma)(\hat{r}_{d+1}/\hat{\sigma}^2_{d+1})$ of the portfolio is determined only by the volatility forecasts, as different strategies share the same mean forecasts of returns. We use the Sharpe ratio (SR) and the certainty equivalent return (CER) to evaluate the performance of the portfolio. In these measures, $\hat{\mu}_p$ denotes the mean portfolio excess return, and $\hat{\sigma}_p$ and $\hat{\sigma}^2_p$ denote the standard deviation and variance of portfolio excess returns, respectively, evaluated at $d$ and $d+1$. For robustness, we adopt $\gamma$ = 3, 6, and 9 to capture different levels of investor risk aversion.
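The allocation rule and the two performance measures can be sketched as follows. The weight formula is the one stated above; the Sharpe ratio is taken as the mean over the standard deviation of portfolio excess returns, and the CER as the mean minus half of $\gamma$ times the variance, which are the usual definitions in this literature and are assumed here because the source equations are not reproduced. Function names and toy inputs are illustrative, and returns are not annualized.

```python
import numpy as np

def optimal_weights(mean_forecast, var_forecast, gamma):
    """Mean-variance weight w = (1 / gamma) * (mean forecast / variance forecast)."""
    mean_forecast = np.asarray(mean_forecast, dtype=float)
    var_forecast = np.asarray(var_forecast, dtype=float)
    return mean_forecast / (gamma * var_forecast)

def portfolio_performance(weights, excess_returns, gamma):
    """Sharpe ratio and certainty equivalent return of the realized portfolio."""
    rp = np.asarray(weights, dtype=float) * np.asarray(excess_returns, dtype=float)
    mu_p = rp.mean()
    var_p = rp.var(ddof=1)
    sharpe = mu_p / np.sqrt(var_p)
    cer = mu_p - 0.5 * gamma * var_p  # assumed CER definition
    return float(sharpe), float(cer)

# Toy example with risk aversion gamma = 3
w = optimal_weights([0.001], [0.0004], gamma=3)
sr, cer = portfolio_performance([1.0, 1.0, 1.0], [0.01, 0.03, 0.02], gamma=3)
```

Since every strategy uses the same historical-average mean forecast, any difference in `w` across models comes from the variance forecast alone, which is what lets the exercise isolate the economic value of the volatility predictions.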

Data
Our intraday data are obtained from the China Security Market Trade & Quote (Level 1) of the China Stock Market & Accounting Research (CSMAR) database. We use stocks listed on the Shanghai Stock Exchange which are component stocks of the Chinese CSI 300 index and are also the largest and most liquid stocks across different sectors. We exclude the financial and banking sector and companies with less than three years of data. Our final sample includes 148 stocks with starting dates ranging from August 2005 to May 2012; the ending date is 31 August 2016 for most stocks. Table 1 summarizes descriptive statistics of 60 randomly selected stocks from our sample. The selected stocks cover 12 industries with a variety of sizes, turnovers, and growth prospects. The number of observations ranges from 1,890,581 to 7,095,527 due to different starting dates. In table 2 we provide a cross-sectional snapshot of all sample firms by year. With such a comprehensive sample, our empirical findings are free from biases due to stock characteristics or sample period.
Because of the information disclosure restriction of the China Securities Regulatory Commission, all publicly available stock price data in China only contain aggregate order book information over at best four-second intervals and at most five levels of depth volume in terms of turnover, without a clear indication of whether the order is a buy or a sell. A snapshot of the raw data is provided in table A1. Hence, we need to classify raw data into equivalent order book events before performing our analysis. We follow Ellis et al. (2000) and adopt their algorithm, which is shown to be more accurate than the well-known Lee and Ready (1991) procedure. Details of the algorithm are provided in Appendix 2. Table A2 tabulates the descriptive statistics of the daily permanent price impact of bid limit orders (LBid2P), bid market orders (MBid2P), ask limit orders (LAsk2P) and ask market orders (MAsk2P), respectively, for the first 30 selected stocks (code from 600009 to 600690) that are also shown in table 1. These permanent price impacts are the long-term impulse responses to incoming limit and market orders obtained via the VAR model. The average value of the price impact tends to be small, with an order of magnitude of $10^{-3}$. Not surprisingly, the price impact from buy limit and market orders is positive, whereas that from sell limit and market orders is negative.
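The classification step can be illustrated with a short sketch of the rule commonly attributed to Ellis et al. (2000): a trade at the prevailing ask is a buy, a trade at the prevailing bid is a sell, and any other trade is signed by a tick test against the previous trade price. The paper's exact procedure is in its Appendix 2, which is not reproduced here, so the function name, its arguments and the zero-tick handling below are assumptions.

```python
def classify_trade(price, bid, ask, prev_price, prev_direction=0):
    """Return +1 for a buy and -1 for a sell (0 if the sign is unknown)."""
    # Quote rule: trades executed exactly at a quote are signed by that quote
    if price == ask:
        return 1
    if price == bid:
        return -1
    # Tick test for all other trades
    if price > prev_price:
        return 1
    if price < prev_price:
        return -1
    # Zero tick: carry forward the last known direction (assumption)
    return prev_direction

# Toy quotes: bid 10.00, ask 10.02
at_ask = classify_trade(10.02, 10.00, 10.02, prev_price=10.01)
inside_uptick = classify_trade(10.01, 10.00, 10.02, prev_price=10.00)
```

Exact equality on prices is reasonable for data quoted on a fixed tick grid; with floating-point prices that have been transformed, a comparison with a small tolerance would be safer.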

Empirical analysis
The average price impact shows that, in most cases, market orders generate greater price impact than limit orders. Figure 1 illustrates typical price impact in the Chinese equity market. In this figure, we plot the instantaneous price impact for the Shanghai Electric Power company (Stock ID 600021) on a randomly selected trading day. The price impact is measured as the change in the bid/ask price, in basis points, induced by a change in buy/sell limit or market orders equal to half the magnitude of level one depth, plotted against event time. We also show the 95% confidence interval of the price impact. We notice some interesting patterns. First, there clearly exists substantial impact from incoming limit or market orders on prices, both in the short and the long run. This is consistent with the findings documented in Hautsch and Huang (2012) and suggests that in China, a young and emerging order-driven market, the price impact of order book events is as great as, if not greater than, that in well developed equity markets.
Second, the market order depicted in (c) and (d) of figure 1 gives rise to greater permanent price impact compared with the limit order shown in (a) and (b). In terms of basis points, the price impact of the limit order is between −1 and 2.5, whereas for market orders it is between −8 and 4. This result is in line with the theoretical prediction in Rosu (2016) that informed traders choose to submit market orders when the mispricing between the privately held fundamental asset value and the publicly expected fundamental value is substantial, which leads to greater price impact for market orders.
Third, the sell market order drives greater price impact than the buy market order, both in the short and the long run. This may be linked to the asymmetric effect that negative news, signaled by sell market orders, tends to cause larger price changes than positive news, signaled by buy market orders. For limit orders, the buy order exhibits greater impact on price than the sell order. Since a limit order could be submitted by informed or uninformed traders, it is ambiguous whether differences in the price impact between buy and sell limit orders exist.
Finally, for limit orders in (a) and (b) the price impact on the bid price converges to the permanent impact more quickly than that on the ask price; for market orders in (c) and (d), however, the price impact on the ask price converges to the permanent impact sooner. This reflects different speeds of the price discovery process for different orders, i.e. buy or sell limit or market orders carry different information.
Although figure 1 shows the price impact for one stock on a particular day, the patterns are representative of the price impact of order book events for the whole market. Once we establish the existence of substantial price impact in the Chinese equity market, we are interested in exploring how and to what extent the information content can be utilized in gauging the quality of the stock market, i.e. the volatility. We focus on the volatility as Ahn et al. (2001) and Madhavan et al. (1997) show in their theoretical framework as well as empirical evidence a link between intraday microstructure variables such as the bid-ask spread and order flow and intraday transitory volatility. We go one step further to explore whether this relation exists at the daily level between price impact and stock volatility.
Figure 1. Price impact of limit and market orders for stock 600021. This figure illustrates the changes in bid/ask prices (price impact) induced by buy/sell limit/market orders with the size equal to half the depth of their corresponding first levels on 1 June 2006 for stock 600021 via the VAR estimation. (a) Changes in bid and ask prices induced by bid and ask limit orders, respectively. (b) Changes in ask and bid prices induced by bid and ask limit orders, respectively. (c) Changes in bid and ask prices induced by bid and ask market orders, respectively. (d) Changes in ask and bid prices induced by bid and ask market orders, respectively.
In figure 2 we depict the impulse response of the volatility from the GARCH and HAR models induced by a change in the permanent price impact in the panel VAR framework, where the estimation is conducted simultaneously on all sample stocks. For clarity we plot the impulse response separately for limit and market orders, although the estimation is conducted in one go for (a) and (b) in the GARCH framework and for (c) and (d) in the HAR setting. We observe a substantial response of GARCH/HAR volatility, measured in standard deviation units, to a one standard deviation change in the price impact induced by buy and sell limit and market orders. Furthermore, the bid limit order shows a stronger impact on volatility, about twice as large as that of the ask limit order; it also dies out slightly more slowly. The same pattern can be observed when we examine the influence of price impact on the HAR volatility in figure 2(c): the impact from the bid limit order is greater in magnitude than that from the ask limit order, and dies out more slowly. However, if we look at the price impact of bid and ask market orders on volatility, we notice that the impact of the ask market order is greater in magnitude than that of the bid market order for both GARCH and HAR volatilities.
Overall figure 2 supports our hypothesis that the price impact is part of the information source that drives stock volatility.

Table 3 summarizes the in-sample parameter estimates for the GARCH-X and HAR-X models for selected stocks. For the GARCH-X model, we note that estimates for GARCH parameters α and β are both highly significant, and they add up to less than one indicating a stationary GARCH process. The coefficients γ1 and γ2, which are of great interest, capture the loading of permanent price impact in the volatility model and they are highly significant at the 1% level for the majority of stocks, suggesting substantial impact of these variables on the in-sample volatility estimation. Meanwhile, for the HAR-X model, we find that the β coefficient for the weekly realized volatility is highly significant at the 1% level consistently; whereas it is hardly significant for the daily and monthly realized volatilities. The in-sample estimation summarized in this table suggests that the price impact, when aggregated to the daily level, makes a significant contribution to volatility estimation.
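To make the HAR-X estimation concrete, the following is a minimal sketch of an ordinary least squares fit. It assumes the standard HAR form with daily, weekly (5-day) and monthly (22-day) realized volatility averages, plus one or more exogenous price impact columns. The function names `har_x_design` and `fit_har_x`, the window lengths and the use of plain OLS are illustrative assumptions, not the specification estimated in table 3.

```python
import numpy as np

def har_x_design(rv, x):
    """Build HAR-X regressors and the one-day-ahead target.

    rv : realized volatility series, shape (T,)
    x  : exogenous price impact series, shape (T,) or (T, m)  (assumed input)
    """
    rv = np.asarray(rv, dtype=float)
    x = np.asarray(x, dtype=float).reshape(len(rv), -1)
    rows, target = [], []
    for t in range(21, len(rv) - 1):
        daily = rv[t]
        weekly = rv[t - 4:t + 1].mean()     # average over the last 5 days
        monthly = rv[t - 21:t + 1].mean()   # average over the last 22 days
        rows.append(np.concatenate(([1.0, daily, weekly, monthly], x[t])))
        target.append(rv[t + 1])
    return np.array(rows), np.array(target)

def fit_har_x(rv, x):
    """OLS estimates ordered as [const, daily, weekly, monthly, exogenous...]."""
    design, target = har_x_design(rv, x)
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef

# Synthetic check: simulate a noise-free HAR-X path and recover its coefficients.
rng = np.random.default_rng(0)
true_coef = np.array([0.1, 0.3, 0.3, 0.2, 0.5])
T = 300
impact = rng.uniform(0.0, 1.0, T)
rv_sim = np.empty(T)
rv_sim[:22] = rng.uniform(0.5, 1.5, 22)
for t in range(21, T - 1):
    regressors = [1.0, rv_sim[t], rv_sim[t - 4:t + 1].mean(),
                  rv_sim[t - 21:t + 1].mean(), impact[t]]
    rv_sim[t + 1] = float(np.dot(true_coef, regressors))
estimated = fit_har_x(rv_sim, impact)
```

In this sketch the last coefficient plays the role of the loading on the price impact series; with two price impact series, `x` would simply have two columns.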
In-sample parameter significance does not always translate into out-of-sample forecasting improvement. Hence we conduct the Diebold and Mariano (1995) (DM) pairwise comparison to evaluate the out-of-sample forecasting accuracy of the benchmark GARCH and HAR models against the augmented GARCH-X and HAR-X models, respectively. In table 4 we report the average forecasting errors for three loss functions, i.e. RMSE, MAPE, and MAE, and conduct a simple test of whether the cross-sectional average over the 148 individual stocks differs significantly between the GARCH and GARCH-X (HAR and HAR-X) models. In addition, we provide descriptive statistics, including the mean, minimum and maximum, of the DM t-statistics for individual stocks. The results exhibit clear patterns. First, the average prediction error is reduced substantially when the benchmark GARCH and HAR models are augmented with the time series of limit and market order price impact. For example, when RMSE is the loss function, the average volatility prediction error is reduced from 3.49 for the GARCH model to 0.50 for the market order price impact-augmented GARCH model, and the difference is significant at the 1% level. Similarly, regardless of which loss function we examine and whether we focus on the augmented model with price impact of limit or market orders, forecasting errors drop significantly. Second, the summary of DM t-statistics shows that the differences between benchmark and augmented models are invariably significant. For example, the one-day ahead forecasts of the market order price impact-augmented HAR model are strongly preferred to those of the HAR model, with an average t-statistic of 10.17 using the MAE and a minimum t-statistic of 8.01. These results support our conjecture that the information content of the price impact inferred from order book events, when aggregated to the daily level, is highly relevant and able to substantially improve the out-of-sample volatility prediction accuracy.

Statistical improvement does not necessarily indicate economic gains to investors when volatility predictions are used in trading strategies. Hence we conduct a simple portfolio exercise to gauge the economic value of volatility forecasts.

Table 3. In-sample volatility modeling with price impact series.

Notes to table 4: This table reports volatility forecasting errors between the benchmark GARCH and HAR models and augmented models with price impact induced by limit and market orders. We report three loss functions, namely the root mean square error (RMSE), the mean absolute percentage error (MAPE), and the mean absolute error (MAE). The t-statistic of the cross-sectional mean comparison test between forecast errors from the benchmark and augmented models is reported. The mean (Avg), minimum (Min) and maximum (Max) of the Diebold and Mariano (1995) t-statistic for individual stocks are also reported. The values of forecast errors are in units of $10^{-5}$. GARCH(HAR) + LO(MO) refers to the GARCH(HAR) model with price impact from limit (market) orders, respectively. *** denotes statistical significance at the 1% level.

Notes to table 5: This table reports the descriptive statistics of annualized excess return (Ret), the Sharpe ratio (SR), and the certainty equivalent return (CER) of portfolios for a mean-variance utility investor who allocates her wealth between an individual stock and the risk-free asset with risk aversion level captured by γ. Volatility forecasts are obtained from either the benchmark GARCH (HAR) model or the augmented GARCH (HAR) model when the time series of permanent price impact from limit orders (LO) and market orders (MO) are incorporated. The t-statistics of a simple mean difference test between the benchmark and augmented models across 148 stocks are reported in parentheses.
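For reference, the Diebold and Mariano (1995) comparison behind table 4 can be sketched as follows for one stock. This is a minimal illustration under stated assumptions: the function name `dm_statistic` is hypothetical, the squared-error loss stands in for the RMSE criterion, and for one-day ahead forecasts the loss differential is treated as serially uncorrelated, so no long-run variance correction is applied. A positive statistic favours the augmented model.

```python
import numpy as np

def dm_statistic(actual, f_bench, f_aug, loss="mae"):
    """DM t-statistic for benchmark vs. augmented one-step-ahead forecasts."""
    actual = np.asarray(actual, dtype=float)
    e_bench = actual - np.asarray(f_bench, dtype=float)
    e_aug = actual - np.asarray(f_aug, dtype=float)
    if loss == "mae":
        l_bench, l_aug = np.abs(e_bench), np.abs(e_aug)
    elif loss == "mse":
        l_bench, l_aug = e_bench ** 2, e_aug ** 2
    elif loss == "mape":
        l_bench, l_aug = np.abs(e_bench / actual), np.abs(e_aug / actual)
    else:
        raise ValueError(f"unknown loss: {loss}")
    d = l_bench - l_aug                      # loss differential series
    # Assumption: no autocorrelation in d at horizon one, so the
    # sample variance is used as the variance estimate.
    return d.mean() / np.sqrt(d.var(ddof=1) / d.size)

# Toy example with made-up numbers (not taken from the paper's data).
proxy = np.array([1.0, 2.0, 3.0, 4.0])
bench = proxy + np.array([1.0, 2.0, 1.0, 2.0])
augmented = proxy + 0.5
stat = dm_statistic(proxy, bench, augmented, loss="mae")
```

Repeating this per stock and summarizing the resulting statistics by their mean, minimum and maximum gives the kind of cross-sectional summary described above.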
As we assume that expected returns to individual assets are the same as their historical average, overall portfolio returns as well as the weight on the stock hinge upon the accuracy of stock volatility forecasts and investor risk aversion. In table 5 we summarize the cross-sectional average of annualized portfolio returns, the Sharpe ratio and the certainty equivalent return with three different risk aversion levels for benchmark GARCH and HAR models and augmented GARCH-X and HAR-X models for our sample stocks. The first thing we notice is that the cross-sectional average of portfolio return, the Sharpe ratio and the certainty equivalent return all increase significantly when we move from benchmark models to augmented models. This is shown by the high t-statistics in parentheses. For example, when the level of risk aversion is low at γ = 3, the market order price impact-augmented GARCH model offers an average annualized return of 6.12%, significantly higher than the 5.32% delivered by the benchmark GARCH model (t-statistic = 27.50). The Sharpe ratio increases from 0.28 to 0.33 (t-statistic = 25.58), whereas the certainty equivalent return goes up from 2.51% to 3.12% (t-statistic = 27.85). As the risk aversion level increases from 3 to 9, the returns and adjusted returns gradually drop, but the pattern that the augmented models offer significantly improved portfolio returns, Sharpe ratios and certainty equivalent returns remains unchanged. This attests to the enhanced economic value of volatility forecasts when they contain information implied in the price impact.
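The portfolio exercise can be illustrated with a short sketch. It assumes the textbook mean-variance rule, in which the weight on the stock is the expected excess return divided by γ times the forecast variance, and the textbook certainty equivalent, the mean portfolio return minus γ/2 times its variance. The function names, the 252-day annualization and the absence of any weight constraints are assumptions made for illustration and may differ from the exact setup behind table 5.

```python
import numpy as np

def mv_weights(mu_excess, var_forecast, gamma):
    """Weight on the stock: expected excess return / (gamma * forecast variance)."""
    return mu_excess / (gamma * np.asarray(var_forecast, dtype=float))

def evaluate_portfolio(excess_returns, weights, gamma, periods_per_year=252):
    """Annualized excess return, Sharpe ratio and certainty equivalent return."""
    rp = np.asarray(weights, dtype=float) * np.asarray(excess_returns, dtype=float)
    mean, var = rp.mean(), rp.var(ddof=1)
    return {
        "ret": mean * periods_per_year,
        "sharpe": mean / np.sqrt(var) * np.sqrt(periods_per_year),
        "cer": (mean - 0.5 * gamma * var) * periods_per_year,
    }

# Toy inputs (hypothetical numbers, not from the paper).
gamma = 3
realized_excess = np.array([0.01, -0.02, 0.03])
variance_forecasts = np.array([0.0004, 0.0004, 0.0004])
w = mv_weights(0.001, variance_forecasts, gamma)
stats = evaluate_portfolio(realized_excess, w, gamma)
```

Because the expected return is held fixed, the only input that differs between the benchmark and augmented models is `variance_forecasts`, which is why forecast accuracy feeds directly into the weights and hence into the three performance measures.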

Conclusion
In this paper, we examine order book events and study the role of their price impact in stock volatility. This is motivated by the rich market microstructure literature that explores the mechanics of price formation in both quote- and order-driven markets. Furthermore, as volatility is shown to be partly driven by market microstructure related information, we are interested in knowing whether the information content of price impact extracted from order book events is relevant to volatility estimation and forecasting. We take these questions to the data and utilize quotes and three levels of market depths for 148 stocks traded on the Shanghai Stock Exchange, which are also component stocks of the Chinese CSI 300 index, from August 2005 to August 2016. Based on an econometric framework including the VAR and panel VAR models, we reveal a number of interesting findings. We show that there is substantial impact of incoming order book events on bid and ask prices in China, which is consistent with evidence in the existing literature on other equity markets. We further find that the time series of price impact are significant factors when added to traditional GARCH and HAR volatility models. More interestingly, the information content of the time series of price impact is able to significantly improve volatility prediction accuracy for individual stocks and offer economic gains to a mean-variance utility investor. Our comprehensive examination of the order book events is thus relevant to traders, fund managers and regulators alike.
(A6)
These four shock vectors, corresponding to common trading scenarios faced by market participants, are adopted in our study.† The short-run price impact on bid and ask prices induced by incoming limit/market orders can thus be quantified as the implied expected short-run shift of the bid/ask prices after the submission of the orders. This can be captured by the impulse response function (IRF) of the system in equation (A1), where $h$ is the number of periods measured in order event time and $\delta_t$ is the shock vector defined above.
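As an illustration of how such an impulse response can be traced out, the sketch below assumes the system has been written as a VAR in levels with known coefficient matrices (a VEC model can always be rewritten in this form). In a linear system, the difference between the shocked and unshocked expected paths follows the same recursion as the model itself, started at the shock vector. The function name `var_impulse_response` and the coefficient values are hypothetical.

```python
import numpy as np

def var_impulse_response(coefs, delta, horizon):
    """Expected shift of y at horizons 0..horizon after a one-off shock delta.

    coefs : list of (k, k) matrices A_1, ..., A_p of a VAR(p) in levels
    delta : shock vector of length k applied at horizon 0
    """
    delta = np.asarray(delta, dtype=float)
    response = np.zeros((horizon + 1, delta.size))
    response[0] = delta
    for h in range(1, horizon + 1):
        for lag, a in enumerate(coefs, start=1):
            if h - lag >= 0:
                response[h] += np.asarray(a, dtype=float) @ response[h - lag]
    return response

# Two-variable toy system with a unit shock to the first variable.
a1 = np.array([[0.5, 0.0],
               [0.1, 0.5]])
irf = var_impulse_response([a1], delta=[1.0, 0.0], horizon=2)
```

Row `h` of `irf` is the expected shift of each variable `h` events after the shock. In a cointegrated system the rows converge to a non-zero limit, which corresponds to the permanent impact discussed next.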
For the long-run price impact, we apply the Engle and Granger (1987) Representation theorem to decompose the VEC model in equation (A2) into long-run components that obey equilibrium constraints and short-run components that exhibit a flexible dynamic specification.
and $L$ is the lag operator. The Engle-Granger representation theorem decomposes $y_t$ into three components: a random walk $C$, a stationary process $C_1$, and a deterministic term $V$. Since $C_1(z)$ is convergent for $|z| < 1 + \epsilon$ ($\epsilon > 0$), the impulse response incurred by this component is zero in the long run. The deterministic term $V$, which depends on initial values such that $\beta^{\top} V = 0$, is irrelevant to the impulse response when $h \to \infty$. The permanent response of $y_t$ is therefore determined by $C \sum_{i=1}^{t} (u_i + \mu)$ and is used as the price impact in volatility modeling and forecasting exercises.
For each stock, we use the highest frequency available in our dataset, i.e. four to six seconds, and implement the above procedures. We obtain eight short-run price impact series: buy limit/market order on the bid price, buy limit/market order on the ask price, sell limit/market order on the bid price, and sell limit/market order on the ask price; and four permanent price impact series: buy limit/market order on prices and sell limit/market order on prices.

† We have also explored alternative specifications whereby the limit and market orders are of one fourth (three fourths) of $v_t^{b,1}$ or $v_t^{a,1}$ and found that the instantaneous and permanent impacts are smaller (bigger) than in the case we study. These are consistent with the scenario analysis in Hautsch and Huang (2012).

Appendix 2. Order classification
To identify the equivalent order book events, we group order book activities into two categories: the placement of a buy/sell limit order, and the execution of a buy/sell market order. Both categories include two scenarios: depth changes and bid/ask price changes. Two adjacent order book records are denoted as $OB_t$ and $OB_{t+1}$. Different scenarios are described below and illustrated in figure A1.
(1) The placement of buy/sell limit order

• Depth changes
If two adjacent order book records have the same bid and ask prices while the depths of $OB_{t+1}$ at the bid or ask side are deeper than the ones of $OB_t$, as illustrated in figure A1(a), we assign an equivalent buy or sell limit order event at the current best bid or ask price between two order book records.
• Bid/ask price changes
If the bid or ask price of $OB_{t+1}$ is higher or lower than that of $OB_t$, as illustrated in figure A1(b), we assign an equivalent buy or sell limit order event at the best bid or ask price of $OB_{t+1}$ between two order book records.
(2) The execution of buy/sell market order

• Depth changes
If two adjacent order book records have the same bid and ask prices while the depths of $OB_{t+1}$ at the bid or ask side are lower than the ones of $OB_t$, as illustrated in figure A1(c), we assign an equivalent sell or buy market order event, which immediately results in a seller- or buyer-initiated trade and eats part of the depth at the best bid or ask price, between two order book records. In this scenario, we do not consider the order cancellation event, which leads to the same result as does the market order event. Due to the limitation of the data, distinguishing between cancellation and execution is not achievable.
• Bid/ask price changes
If the bid or ask price of $OB_{t+1}$ is lower or higher than the one of $OB_t$, as illustrated in figure A1(d), we assign an equivalent sell or buy market order event, which immediately results in a seller- or buyer-initiated trade and eats all depth at the best bid or ask price, between two order book records. Similarly, we do not consider the cancellation of the placed order in this scenario due to the lack of information.

Figure A1. Reconstruction of the order book events. This figure shows four scenarios of order event identification. (a) bid or ask depth increase is equivalent to bid or ask limit order event; (b) bid price increase or ask price decrease is equivalent to bid or ask limit order event; (c) bid or ask depth decrease is equivalent to ask or bid market order event; (d) bid price decrease or ask price increase is equivalent to ask or bid market order event.
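The four scenarios above can be expressed as a small classification routine. The sketch below is illustrative only: the `OrderBook` record, its field names and the `classify` function are hypothetical, only the first level of the book is used, and the price attached to a price-changing market order (the previous best quote) is an assumption. As in the text, cancellations are not distinguished from executions.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass(frozen=True)
class OrderBook:
    bid: float       # best bid price
    ask: float       # best ask price
    bid_depth: int   # depth at the best bid
    ask_depth: int   # depth at the best ask

def classify(prev: OrderBook, curr: OrderBook) -> List[Tuple[str, str, float]]:
    """Return (side, order type, price) events implied by two adjacent records."""
    events = []
    if prev.bid == curr.bid and prev.ask == curr.ask:
        # Depth changes at unchanged prices: scenarios (a) and (c).
        if curr.bid_depth > prev.bid_depth:
            events.append(("buy", "limit", curr.bid))
        elif curr.bid_depth < prev.bid_depth:
            events.append(("sell", "market", curr.bid))
        if curr.ask_depth > prev.ask_depth:
            events.append(("sell", "limit", curr.ask))
        elif curr.ask_depth < prev.ask_depth:
            events.append(("buy", "market", curr.ask))
        return events
    # Bid/ask price changes: scenarios (b) and (d).
    if curr.bid > prev.bid:
        events.append(("buy", "limit", curr.bid))
    elif curr.bid < prev.bid:
        events.append(("sell", "market", prev.bid))
    if curr.ask < prev.ask:
        events.append(("sell", "limit", curr.ask))
    elif curr.ask > prev.ask:
        events.append(("buy", "market", prev.ask))
    return events

# Example: depth at the best bid grows while both prices are unchanged.
book_t = OrderBook(bid=10.00, ask=10.02, bid_depth=500, ask_depth=300)
book_t1 = OrderBook(bid=10.00, ask=10.02, bid_depth=800, ask_depth=300)
example_events = classify(book_t, book_t1)
```

Applying `classify` to every pair of consecutive snapshots yields the sequence of equivalent limit and market order events that feeds the VAR estimation.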