## Authors and Contributors

Dr. Sébastien Laurent, the copyright owner of G@RCH, holds a Ph.D. (2002) in Financial Econometrics from Maastricht University (The Netherlands). In 2002-2003 he worked as a post-doctoral researcher at CREST (Laboratory of Finance-Insurance) in Paris (France). He spent six years as an associate professor in Econometrics at the Faculté Notre-Dame de la Paix in Namur (Belgium) and is currently an associate professor in financial econometrics at Maastricht University, School of Business and Economics. He is also a fellow of CORE in Louvain-la-Neuve (Belgium). In 2008 and 2009 he was a visiting professor at the London School of Economics (Finance Department). Sébastien has published in the Journal of Business and Economic Statistics, Journal of Applied Econometrics, Journal of Financial Econometrics, European Economic Review and Journal of Empirical Finance, among others.
Web Site: http://www.core.ucl.ac.be/~laurent/
Email: s.laurent@maastrichtuniversity.nl

### Contributors

Dr. Kris Boudt holds a Ph.D. (2008) in Applied Economics from K.U.Leuven, Belgium. He is an affiliated researcher at K.U.Leuven and, since 2008, an assistant professor of Finance at Lessius University College. His main research interests are empirical finance, risk management and robust statistics. His research has been published in Computational Statistics and Data Analysis, Metrika, Journal of Risk and RISK magazine, among others.
Web Site: http://www.econ.kuleuven.be/public/N06054/
Email: Kris.Boudt@econ.kuleuven.be

Jerome Lahaye graduated from the HEC Management School of the University of Liège (Belgium), where he subsequently worked as a teaching and research assistant (2000-2003). He holds a master of advanced studies in economics from the University of Leuven (2004) and a Ph.D. (2009) in Economics from the Faculté Notre-Dame de la Paix in Namur (Belgium). He is currently a post-doctoral researcher at HEC Lausanne. His research interests are at the crossroads of applied econometrics and international finance. In particular, his research has focused on the link between economic information disclosure and discontinuities in financial time series.
Email: jerome.lahaye@fundp.ac.be

Jean-Philippe Peters was involved in the G@RCH project until version 4.2. He graduated from the HEC Management School of the University of Liège (Belgium) in 2000 with a M.Sc. in Management Sciences. His final dissertation, entitled “Development of a Package in the Ox Environment for GARCH Models and Two Empirical Applications”, was the founding ground of the current package. From 2000 to 2002 he worked as a researcher in the Operations Research and Production Management unit of the University of Liège. In October 2002 he joined the Advisory and Consulting department of Deloitte Luxembourg, where he is involved in consulting activities on risk management for the financial industry. Jean-Philippe also holds a Ph.D. (2010) in Economics from the HEC Management School of the University of Liège (Belgium).

Dr. Jeroen VK Rombouts holds a Ph.D. (2004) in Econometrics from the Catholic University of Louvain and has been an assistant professor at HEC Montreal since 2004. He is an elected member of the International Statistical Institute, a researcher at CIRANO (Canada) and a research associate at CORE (Belgium). Jeroen has published in Econometric Theory, Journal of Applied Econometrics, Econometric Reviews and The Econometrics Journal, among others.
Web Site: http://neumann.hec.ca/pages/jeroen.rombouts/
Email: Jeroen.Rombouts@hec.ca

Francesco Violante graduated from Bocconi University, Milan (Italy), where he worked as a teaching and research assistant in 2004 at the Centre of Research on Innovation and Internationalization (CESPRI). He holds a master of arts in economics from the Université catholique de Louvain (Belgium) and has been a Ph.D. candidate of the Louvain Academy at the University of Namur since 2006. His research interests are focused on applied econometrics and finance, in particular realized volatility and forecasting, multivariate GARCH models, and volatility model comparison and selection. In 2007 he was a visiting scholar at the HEC Management School of the University of Montreal (Canada).
Email: francesco.violante@fundp.ac.be

## Chapter 1 Introduction

This book documents G@RCH 6.1, an OxMetrics module (see Doornik, 2009) dedicated to the estimation and forecasting of univariate and multivariate ARCH-type models (Engle, 1982), including some of the most recent contributions in this field. Univariate GARCH (Bollerslev, 1986), EGARCH (Nelson, 1991), GJR (Glosten, Jagannathan, and Runkle, 1993), APARCH (Ding, Granger, and Engle, 1993), IGARCH (Engle and Bollerslev, 1986), RiskMetrics (J.P.Morgan, 1996), FIGARCH (Baillie, Bollerslev, and Mikkelsen, 1996a and Chung, 1999), FIEGARCH (Bollerslev and Mikkelsen, 1996), FIAPARCH (Tse, 1998) and HYGARCH (Davidson, 2001) specifications are available for the conditional variance. Moreover, an AR(FI)MA process can be specified in the conditional mean (see Baillie, Chung, and Tieslau, 1996b, Tschernig, 1995, Teyssière, 1997, or Lecourt, 2000, for further details about ARFIMA models). All these models can be adapted to add an “ARCH-in-mean” term in the conditional mean as suggested by Engle, Lilien, and Robins (1987).
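As a side note, the conditional-variance recursion that all of these specifications build on can be illustrated with a short GARCH(1,1) simulation. The sketch below is plain Python, not G@RCH code, and the parameter values are arbitrary:

```python
# Simulate a GARCH(1,1) process:
# sigma2_t = omega + alpha*eps_{t-1}^2 + beta*sigma2_{t-1},
# with Gaussian innovations. Illustration only -- parameters are made up.
import math
import random

def simulate_garch11(T, omega=0.05, alpha=0.10, beta=0.85, seed=42):
    rng = random.Random(seed)
    sigma2 = omega / (1.0 - alpha - beta)   # start at the unconditional variance
    eps = []
    for _ in range(T):
        e = math.sqrt(sigma2) * rng.gauss(0.0, 1.0)
        eps.append(e)
        sigma2 = omega + alpha * e * e + beta * sigma2   # GARCH(1,1) recursion
    return eps

returns = simulate_garch11(1000)
```

With α + β < 1 the process is covariance-stationary, and the simulated series displays the volatility clustering that GARCH-type models are designed to capture.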

G@RCH 6.1 allows the estimation of some of the most widespread multivariate GARCH models, including the scalar BEKK, diagonal BEKK (Engle and Kroner, 1995), RiskMetrics (J.P.Morgan, 1996), CCC (Bollerslev, 1990), DCC (Engle, 2002 and Tse and Tsui, 2002), DECO (Engle and Kelly, 2007), OGARCH (Kariya, 1988 and Alexander and Chibumba, 1997) and GOGARCH models (van der Weide, 2002 and Boswijk and van der Weide, 2006).

G@RCH 6.1 also allows the estimation of univariate and multivariate non-parametric estimators of the quadratic variation and the integrated volatility. These estimators include the realized volatility, bipower variation and realized outlyingness weighted variance. Several daily and intradaily tests for jumps, as well as intraday periodicity filters, are also provided.

This software has been developed with Ox 6.0, a matrix programming language described in Doornik (2007b). G@RCH 6.1 should be compatible with many platforms, including Windows, Mac OS X, Linux, Unix and Solaris, when it is used in combination with the Ox Console. Furthermore, G@RCH 6.1 provides a menu-driven graphical interface for Microsoft Windows, Mac OS X and Linux users, as well as comprehensive HTML documentation. For most specifications estimation is very fast, and one of the package's main characteristics is its ease of use.

This book is structured as follows:

• This chapter introduces the G@RCH software.
• Chapter 2 explains how to get started with G@RCH.
• Chapter 3 presents the simple univariate ARCH model. Comments on estimation procedures (parameter constraints, distributions, standard deviation estimation methods, tests, forecasting procedures and the accuracy of the package) are also provided.
• Chapter 4 presents more sophisticated ARCH-type models as well as more advanced estimation techniques.
• Then, Chapter 5 explains how to estimate these models using both the Batch editor of OxMetrics and the Ox programming language. Several illustrations are also provided.
• After a brief summary of the concept of Value-at-Risk (VaR), Chapter 6 shows how the package can be used to forecast the VaR and to test the adequacy of the selected models.
• Chapter 7 (written in collaboration with Kris Boudt and Jerome Lahaye) presents the concepts of realized volatility, some of its (robust to jumps) univariate and multivariate extensions and (robust to jumps) intraday periodicity filters.
• Chapter 8 explains how to use G@RCH (and in particular RE@LIZED) to apply the concepts presented in the previous chapter.
• Chapter 9 reviews the multivariate GARCH models available in G@RCH and gives several illustrations on how to estimate these models with the rolling menus, Batch language and finally the Ox language combined with the MGarch class.
• Finally, the structure and the functions of the Garch, MGarch and Realized classes are detailed in Chapter 10.

### 1.1 G@RCH

#### 1.1.1 Definition

G@RCH 6.1 is software dedicated to the estimation and forecasting of univariate and multivariate (G)ARCH models and many of their extensions. It can be used within OxMetrics or through ordinary Ox programs (using OxEdit, for instance) for those who have access to the Ox programming language.

The available univariate models are all ARCH-type models. These include ARCH, GARCH, EGARCH, GJR, APARCH, IGARCH, RiskMetrics, FIGARCH, FIEGARCH, FIAPARCH and HYGARCH. They can be estimated by approximate (Quasi-) Maximum Likelihood under one of the four proposed distributions for the errors (Normal, Student-t, GED or skewed-Student). Moreover, ARCH-in-mean models are also available, and explanatory variables can enter the conditional mean and/or the conditional variance equations.

G@RCH 6.1 also proposes several multivariate GARCH specifications, including the scalar BEKK, diagonal BEKK, full BEKK, RiskMetrics, CCC, DCC, DECO, OGARCH and GOGARCH models.

Finally, h-step-ahead forecasts of both equations are available, as well as many univariate and multivariate misspecification tests (Nyblom, sign bias tests, Pearson goodness-of-fit, Box-Pierce, residual-based diagnostic for conditional heteroscedasticity, Hosking’s portmanteau test, Li and McLeod test, constant correlation tests, etc.).

#### 1.1.2 Program

This documentation refers to version 6.1 of G@RCH, which can be used in three different ways. The first one is through the menu-driven environment of OxMetrics, similarly to PcGive, PcGets or STAMP.

Second, one can estimate the models via the “Batch Editor” of OxMetrics.

Finally, Ox users can write Ox code based on the Garch, MGarch and Realized classes. This solution is particularly useful when non-standard operations are considered (such as running a whole catalogue of estimations). A set of examples is provided with the G@RCH package. Please note, however, that this solution requires a registered version of Ox Professional or its console version. Check the Ox web site (http://www.doornik.com) for details.

The first method is illustrated in Chapters 3 and 4 while the next two are presented in Chapter 5.

#### 1.1.3 What’s new in G@RCH 6.1 ?

• Bug fixed: Thanks to Charles Bos and Janus Pawel, a bug in the estimation of the EGARCH with Student-t errors has been fixed.
• A useful feature of the DCC models is that the parameters governing the variance and correlation dynamics can be estimated separately. For the correlation part, one typically first estimates Q as the empirical correlation matrix of the standardized residuals, say ut. The parameters α and β of the DCC model are then usually estimated by Gaussian quasi-maximum likelihood. Aielli (2009) has recently shown that estimating Q as the empirical correlation matrix of ut is inconsistent and proposed a correction to the DCC that circumvents this problem. This model, called cDCC, is now available in G@RCH 6.1.
• A new procedure to simulate the cDCC is also available.
• Small modification in the BV estimator. The estimator is now computed as BVt(Δ) = μ1⁻² [M∕(M − 1)] Σ(i=2..M) |rt,i| |rt,i−1|. The correction factor M∕(M − 1) was ignored in version 6.0 (but is close to one when M is large). A similar correction is applied to the Tri-power and Quad-power quarticity estimators.
• The MinRV and MedRV of Andersen, Dobrev, and Schaumburg (2008) are now available in RE@LIZED as well as the corresponding tests for jumps based on the difference between RV and these two robust to jumps estimates of the integrated volatility.
• The function Graphs_RV has been redefined.
• The Batch language for univariate realized volatility estimates and for the LM test for jumps has been improved.
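The modified BV estimator can be sketched as follows. This is a plain-Python illustration, not the G@RCH implementation, and it assumes the correction term referred to above is the finite-sample factor M/(M − 1):

```python
# Realized bipower variation over one day of M intraday returns:
# BV = mu1^(-2) * [M/(M-1)] * sum_{i=2}^{M} |r_i| |r_{i-1}|,
# where mu1 = E|Z| = sqrt(2/pi) for a standard normal Z.
import math

def bipower_variation(r, correct=True):
    M = len(r)
    mu1 = math.sqrt(2.0 / math.pi)
    s = sum(abs(r[i]) * abs(r[i - 1]) for i in range(1, M))
    bv = s / mu1 ** 2
    if correct:
        bv *= M / (M - 1.0)   # finite-sample factor, close to one for large M
    return bv

intraday = [0.1, -0.2, 0.15, 0.05, -0.1, 0.3, -0.25, 0.12]   # made-up returns
bv = bipower_variation(intraday)
```

Because the correction multiplies the uncorrected statistic by M/(M − 1), the difference vanishes as the number of intraday returns M grows.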

#### 1.1.4 What’s new in G@RCH 6.0 ?

• Bug fixed: G@RCH experienced convergence problems when returns were not expressed in %. This is now fixed.
• G@RCH proposes a new module called RE@LIZED whose aim is to provide a full set of procedures to compute non-parametric estimates of the quadratic variation, integrated volatility and jumps using intraday data. The methods implemented in G@RCH 6.1 are based on recent papers by Andersen, Bollerslev, Diebold and coauthors, Barndorff-Nielsen and Shephard, and Boudt, Croux and Laurent. They include univariate and multivariate versions of the realized volatility, bipower variation and realized outlyingness weighted variance. Daily and intradaily tests for jumps are also implemented. The ‘Realized’ class allows the user to apply these estimators and tests to real data using the Ox programming language. Importantly, they are also accessible through the rolling menus of G@RCH. Interestingly, as for the other modules, Ox code can be generated after using the rolling menus. The Model/Ox Batch Code command (or Alt+O) activates a new dialog box called ‘Generate Ox Code’ that allows the user to select an item for which to generate Ox code. Here is an example of code generated after the computation of the Lee and Mykland (2008) test for intraday jump detection.

#include <oxstd.h>
#import <packages/Garch6/garch>
#include <oxdraw.h>

main()
{
    //--- Ox code for RE@LIZED
    decl model = new Realized();

    model.Deterministic(-1);

    model.Select(Y_VAR, {"Ret_1", 0, 0});
    ...
    model.Select(Y_VAR, {"Ret_288", 0, 0});

    model.SetModelClass(MC_RV);
    model.RV(1);
    model.IV(1);
    model.OPTIONS_JUMPS_TEST_BV(1,0,2,0.999);

    model.SetSelSampleByDates(dayofcalendar(1987, 1, 5), dayofcalendar(1998, 7, 3));
    model.Estimate();
    model.Graphs_RV(0,0,1,0,0,0,0,1,1,0,0,10,0,0.999);
    model.Append_in(model.m_vIV,"BV");
    model.Append_in(model.m_vRJ,"RJ_BV");
    model.Save("C:\\Data\\Simulated_cont_GARCH_jumps_FFF_K_0.2_M_0.3.in7");

    delete model;
}

• Non-parametric and parametric (robust to jumps) intraday periodicity filters are also provided.
• The DCC-DECO model of Engle and Kelly (2008) is now documented in this manual.
• Conditional means, variances, covariances and correlations of MGARCH models can now be edited in a basic matrix or array editor.
• Bug fixed (thanks to Charles Bos). Several functions of the MGarch class had not been included in the oxo file, e.g. GetVarf_vec, Append_in, Append_out, etc.
• Bug fixed. DCC models: the empirical correlation matrix used when applying ‘Correlation Targeting’ was computed on the residuals and not the devolatilized residuals as it should be.

#### 1.1.5 What’s new in G@RCH 5.1 ?

• The Mac OS X and Linux versions of G@RCH are now available.
• Several tests have been added to the menu ‘Descriptive Statistics’, i.e. the Variance Ratio test of Lo and MacKinlay (1988), the Runs test originally used by Fama (1965) and the Rescaled Range tests of Mandelbrot (1972) and Lo (1991). These tests are described in Section 3.12. The size and power of these tests are studied through some Monte Carlo experiments. The Ox code used for these simulations is available in the folder packages/Garch5/samples/.
• Bug fixed (thanks to Stefan De Wachter): Time-axis did not show the right time period when selecting a subsample. The time-axis started from the beginning of the dataset, regardless of what the selected estimation sample was.
• Problem fixed: annoying Internet Explorer 7 warning about running ActiveX controls when running the OxMetrics help.
• Thanks to Robert A. Yaffee, failure rates for short positions in the output of the Kupiec LR test are now labelled success rates.
• Thanks to Christian Conrad, non-negativity conditions for FIGARCH and HYGARCH models are implemented in the following example files: Stat_Constr_FIGARCH_Conrad_Haag_Check.ox and Stat_Constr_HYGARCH_Conrad_Check.ox, for a FIGARCH (1,d,1) and a HYGARCH (1,d,1) respectively. While the conditions are tested on the estimated parameters in these two files, they are imposed during the estimation in Stat_Constr_FIGARCH_Conrad_Haag_Impose.ox and its HYGARCH counterpart.
• Bug fixed: explanatory variables in the intraday seasonality FFF filter, i.e. the procedure FFF_filter(), were not taken into account. The example file FFF.ox has been extended and can now be run using the simulated dataset FX_USD_1990_1994.in7. Furthermore, the normalizing constant N2 in equation (7.37) has been changed. Indeed, the analytic solution for a sum of squares of integers is Σ(k=1..M) k² = (2M³ + 3M² + M)∕6, which leads to the value N2 = (2M² + 3M + 1)∕6, and not N2 = (M + 1)(M + 2)∕6 as in Andersen and Bollerslev (1997), when computing the mean instead of the sum. Note that this change has only minor effects on the results in practice.
• The DECO (Dynamic EquiCOrrelation) model of Engle and Kelly (2007) has been added to MG@RCH (but is not yet documented).
• New function in MG@RCH: GetParEst(). The version without argument returns a K × 3 matrix with respectively the estimated parameters, standard errors and t-statistics. The version with one argument equal to 1, i.e. GetParEst(1), returns the names of the estimated parameters together with the K × 3 matrix described above. An example is provided in the file MGarchEstim_GetParEst.ox. The relevant lines of code are
mgarchobj.DoEstimation();

decl names,out;
[names,out]=mgarchobj.GetParEst(1);
println("%r",names,"%c", {"MLE","Std-error","t-stat"},out);
delete mgarchobj;

This will produce the following output:

MLE    Std-error       t-stat
Cst(M)            0.044056     0.018751       2.3495
AR(1)             0.026688     0.030135      0.88560
Cst(V)           0.0013516    0.0017330      0.77995
ARCH(Alpha1)      0.011494    0.0076399       1.5045
GARCH(Beta1)       0.98520     0.010640       92.590
Cst(M)            0.067069     0.025062       2.6761
AR(1)              0.23581     0.028032       8.4121
Cst(V)            0.069550     0.047670       1.4590
ARCH(Alpha1)       0.10320     0.043605       2.3667
GARCH(Beta1)       0.79178      0.10844       7.3015
rho_21             0.67544     0.091076       7.4162
alpha             0.011016    0.0055463       1.9862
beta               0.98470     0.011019       89.360
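The sum-of-squares identity behind the change to the normalizing constant N2 in the FFF filter (see the bug fix listed above) is easy to check numerically. The snippet below is a plain-Python illustration:

```python
# Check the identity sum_{k=1}^{M} k^2 = (2M^3 + 3M^2 + M)/6 and the
# resulting mean N2 = (2M^2 + 3M + 1)/6, which differs from (M+1)(M+2)/6.
def sum_of_squares(M):
    return sum(k * k for k in range(1, M + 1))

M = 288  # e.g. the number of 5-minute intervals in a 24-hour trading day
assert 6 * sum_of_squares(M) == 2 * M**3 + 3 * M**2 + M   # exact identity
n2 = sum_of_squares(M) / M                                # mean instead of sum
assert abs(n2 - (2 * M**2 + 3 * M + 1) / 6) < 1e-9
assert n2 != (M + 1) * (M + 2) / 6
```

The first assertion verifies the closed form for the sum; dividing by M then gives the mean-based constant used in the corrected filter.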

#### 1.1.6 What’s new in G@RCH 5.0 ?

This manual has been revised to account for the following new features:

• The estimation of EGARCH-type models and ARCH-in-mean model has been considerably improved.
• G@RCH now provides some simulation capabilities through the Monte-Carlo menu. GARCH, GJR, APARCH and EGARCH models are available, with an ARMA specification in the conditional mean and normal, Student-t, GED or skewed-Student errors. The simulated data can be plotted and stored in a separate dataset.
• Three unit root tests (ADF, KPSS and SP) and two long-memory tests (GPH and GSP) have been added (available through the ‘descriptive statistics’ menu).
• In- and out-of-sample VaR forecasts can be stored.
• Out-of-sample VaR forecasts can be plotted via the menu Test/Forecast.
• A new function is provided to compute the statistics of Lee and Mykland (2008) to detect jumps at ultra-high frequency.
• Several multivariate GARCH models are available including the scalar BEKK, diagonal BEKK, RiskMetrics, CCC, DCC, OGARCH and GOGARCH models.
• G@RCH also provides some specific multivariate misspecification tests, like a multivariate normality test, Hosking’s portmanteau test, the Li and McLeod test, two constant correlation tests, etc.
• G@RCH also provides some functions to simulate BEKK, CCC and DCC models.
• A new feature of G@RCH 5.0 is that Ox code can be generated. The Model/Ox Batch Code command (or Alt+O) activates a new dialog box called ‘Generate Ox Code’ that allows the user to select an item for which to generate Ox code.

### 1.2 General Information

Suggestions, mistakes, typos and possible improvements can be reported to the first author via e-mail: Sebastien.Laurent@fundp.ac.be

The most appropriate way to discuss problems and issues with the G@RCH package is the ox-users discussion list. The ox-users discussion group is an email-based forum to discuss any problems related to Ox programming, and to share code and programming solutions. Consult the online help of OxMetrics for information on joining the list.

#### 1.2.2 Availability and Citation

The full independent version of G@RCH is exclusively distributed by Timberlake Consultants.
Contact information for the UK:
Phone: +44 (0)20 8697 3377 or
Internet: http://www.timberlake.co.uk

For the US:
Phone: +1 908 686 1251 or
Internet: http://www.timberlake-consultancy.com
For other countries, please go to the Timberlake web sites for contact information.

There are several versions of G@RCH. A console version of G@RCH is available on the G@RCH web site http://www.garch.org.

This (and only this) version of G@RCH may be used freely for academic research and teaching purposes only. The Professional version (i.e. using OxRun, OxDebug, OxMetrics or oxli) is not free.

Commercial users and others who do not qualify for the free version must purchase the Professional version of G@RCH with documentation, regardless of which version they use (i.e., even with Linux or Unix). Redistribution of G@RCH in any form is not permitted.

Moreover, for easier validation and replication of empirical findings, please cite this documentation (or Laurent and Peters, 2002) in all reports and publications involving the use of G@RCH. Failure to cite the use of G@RCH in published work may result in loss of the right to use the free version, and an invoice at the full commercial price.

All company and product names referred to in this documentation are either trademarks or registered trademarks of their associated companies.

#### 1.2.3 World Wide Web

Check www.garch.org for information on releases, online help and other relevant information on G@RCH.

To install the Professional version of G@RCH 6.1, run the installation file and follow the instructions.

To install the Console version of G@RCH, unzip the file Garch6.zip (available below) in the ox/packages directory. Note that the free console version of G@RCH 6.1 only includes the Garch class. The MGarch and Realized classes are only available in the Professional version of G@RCH.

For OxMetrics enterprise users, note that the Console version of G@RCH 6.1 comes along with OxMetrics enterprise and is directly installed in the ox/packages directory.

THIS PACKAGE IS FUNCTIONAL BUT NO WARRANTY IS GIVEN WHATSOEVER. YOU USE IT AT YOUR OWN RISK!

## Chapter 2 Getting Started

We now discuss some of the basic skills required to get started with G@RCH. We will give a brief introduction to the functionalities provided by OxMetrics. The aim of this section is to explain how to load data and make graphs: for instructions on how to transform data using the calculator or algebra, consult the OxMetrics manuals.

We assume that you have the basic skills to operate programs under the Windows operating system (the OxMetrics books provide some hints). G@RCH runs on several versions of Windows, i.e. Windows Vista/XP/NT/2000, but also on Mac OS X and Linux. The screen appearance of the program and dialogs will reflect the operating system you are using. The screen captures in this manual reflect the Windows XP operating system but should not be very different on the other systems.

### 2.1 Starting G@RCH

When launching OxMetrics, all the available modules (PcGive, G@RCH and Stamp) are automatically loaded and appear in the Modules group in the workspace window on the left-hand side of OxMetrics. This process is displayed in the following capture.

G@RCH cannot operate without data. Once OxMetrics is activated, the very first step is to load data. Let us illustrate this with an example, loading nasdaq.in7. These are daily returns of the Nasdaq stock index. The covered period is 10/11/1984 - 12/21/2000, which represents 4093 daily observations. Daily returns (in %) are defined as

yt = 100 × [ln(pt) − ln(pt−1)]    (2.1)

where pt is the price series at time t.
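Assuming the usual log-difference definition of the daily return in %, the computation can be sketched in plain Python (the prices below are made up):

```python
# Daily returns in %: r_t = 100 * (ln p_t - ln p_{t-1}).
import math

def pct_log_returns(prices):
    return [100.0 * (math.log(prices[t]) - math.log(prices[t - 1]))
            for t in range(1, len(prices))]

prices = [100.0, 101.5, 100.8, 102.3]   # made-up price series
rets = pct_log_returns(prices)
```

A convenient property of log returns is that they are additive over time: the three returns above sum to 100 × ln(102.3/100), the return over the whole period.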

The IN7 extension indicates an OxMetrics file. The IN7 file is a human-readable file, describing the data. There is a companion BN7 file, which holds the actual numbers (in binary format, so this file cannot be edited). OxMetrics can handle a wide range of data files, among them Excel and Lotus files and, of course, plain human-readable (ASCII) files. You can also cut and paste data between Excel and OxMetrics. More details are available in OxMetrics handbooks.

To load this dataset, click File then Open in the main menu of OxMetrics, then select the data file. The default datasets provided by G@RCH are stored in the subdirectory \G@RCH\data of OxMetrics’s main directory. Locate that directory and select nasdaq.in7.

Double clicking on the variable name shows the documentation of the variable.

The data can be manipulated, much like in a spreadsheet program. Here we shall not need these facilities, so minimize the database window again: click on the first button in the right-hand side of the window. Do not click on the cross: this closes the database, thus removing it from OxMetrics and hence from G@RCH.

### 2.3 OxMetrics Graphics

The graphics facilities of OxMetrics are powerful yet easy to use. This section will show you how to make time-plots of variables in the database. OxMetrics offers automatic selections of scaling etc., but you will be able to edit these graphs, and change the default layout such as line colors and line types. Graphs can also be saved in a variety of formats for later use in a word processor or for reloading to OxMetrics.

#### 2.3.1 A First Graph

Graphics is the first entry on the Model menu. Activate the command to see the following dialog box (or click on the cross-plot graphics icon on the toolbar).

In this example we select Nasdaq and then press the << button to move the database variables to the selection listbox, as shown above. Then press the button labelled Actual series. Figure 2.1 should appear on the screen. OxMetrics can also draw multiple graphs simultaneously on-screen. For other types of graphs, select All plot types to open the related menu, as seen below (additional details on graphics are available in Doornik, 2009).

#### 2.3.2 Graph Saving and Printing

To print a graph directly to the printer, click on the printer icon in OxMetrics. You can preview the result first using the Print Preview command from the File menu.

Graphs can be saved in various formats. See Doornik (2009) for more details.

#### 2.3.3 Including Graphs in LaTeX Documents

Saving graphs in the .EPS format is useful for including them in a LaTeX document. The example given below is the LaTeX code used to include Figure 2.1 in the current document. Note that the command \includegraphics requires the package graphicx, which can be loaded with the \usepackage{graphicx} command in the preamble of the LaTeX document.

## Chapter 3 Introduction to the Univariate ARCH Model

In this chapter, we first review the specification of the conditional mean equation. Then the simplest univariate ARCH specification is presented, together with estimation and testing issues in G@RCH. More recent developments in ARCH-type models are discussed in the next chapter.

### 3.1 Visual Inspection

Panel 1 of Figure 3.1 indicates that the NASDAQ exhibits volatility clustering, as periods of low volatility mingle with periods of high volatility. This is a clear sign of the presence of ARCH effects in the series.

Panel 2 also shows that the unconditional distribution of the NASDAQ returns (full line) is not normal. Indeed, it is more peaked than the normal density (dashed line) and it has fatter tails, as indicated by panels 3 and 4 (this visual indication is validated by the computation of the kurtosis, which equals 9.34). While not obvious on a visual basis, the skewness coefficient (-0.627) indicates a negatively skewed behavior.

Recall that the skewness coefficient (SK) equals 0 for a symmetric distribution while the kurtosis coefficient (KU) equals 3 for the normal distribution. The excess kurtosis equals KU - 3. Formally these coefficients are expressed as

SK = (1∕T) Σ(t=1..T) [(yt − ȳ)∕σ̂]³,  KU = (1∕T) Σ(t=1..T) [(yt − ȳ)∕σ̂]⁴    (3.1)
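These standardized third and fourth sample moments can be computed as follows (a plain-Python sketch on made-up data):

```python
# Sample skewness (SK) and kurtosis (KU): standardized third and fourth
# central moments. SK = 0 for a symmetric sample, KU = 3 for a normal one.
import math

def skewness_kurtosis(y):
    T = len(y)
    m = sum(y) / T
    s = math.sqrt(sum((v - m) ** 2 for v in y) / T)
    sk = sum(((v - m) / s) ** 3 for v in y) / T
    ku = sum(((v - m) / s) ** 4 for v in y) / T
    return sk, ku

sk, ku = skewness_kurtosis([1.0, 2.0, 2.0, 3.0, 10.0])   # right-skewed sample
```

The single large observation drags SK well above zero, the signature of a right-skewed sample; the excess kurtosis is KU − 3.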

### 3.2 Preliminary Graphics

When the error term is not independent of previous errors, it is said to be autocorrelated. Autocorrelation of order h is computed as follows

ρ̂(h) = Σ(t=h+1..T) (yt − ȳ)(yt−h − ȳ) ∕ Σ(t=1..T) (yt − ȳ)²    (3.2)

Plotting the autocorrelation function against the lag gives a first visual impression of the magnitude of the autocorrelation problem. This plot is called “autocorrelogram” and it provides information about simple correlation coefficients. When the errors are strongly time dependent, the autocorrelations tend to be fairly large even for large values of h. The next screen shot shows how to obtain an autocorrelogram in OxMetrics.

The obtained graphic is displayed in Figure 3.2. It suggests that the daily return series of the NASDAQ is a short memory process (in the level) and an AR(1) term might be needed in the conditional mean equation.

We have previously seen from the visual inspection that the NASDAQ exhibits volatility clustering as periods of low volatility mingle with periods of high volatility. In addition to this inspection, one can plot the autocorrelogram of the squared (or absolute) returns to highlight the presence of ARCH effects in the data. This can be done using the calculator of OxMetrics. We can compute squared returns of the NASDAQ, as shown in the screen shot below.

Figure 3.3 shows the autocorrelogram (with 100 lags) of the squared returns. It suggests that squared returns are strongly autocorrelated, exhibiting volatility clustering.
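The sample autocorrelation plotted in these autocorrelograms can be sketched as follows (plain Python on a made-up series):

```python
# Sample autocorrelation at lag h:
# rho_h = sum_{t=h+1}^{T} (y_t - ybar)(y_{t-h} - ybar)
#         / sum_{t=1}^{T} (y_t - ybar)^2
def autocorrelation(y, h):
    T = len(y)
    m = sum(y) / T
    num = sum((y[t] - m) * (y[t - h] - m) for t in range(h, T))
    den = sum((v - m) ** 2 for v in y)
    return num / den

series = [1.0, 2.0, 3.0, 4.0, 5.0, 4.0, 3.0, 2.0, 1.0]   # made-up, smooth series
acf = [autocorrelation(series, h) for h in range(4)]
```

By construction acf[0] equals 1, and the clearly positive value at lag 1 reflects the smoothness of this short series; an autocorrelogram simply plots these values against h.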

### 3.3 Preliminary Tests

G@RCH 6.1 provides several tests on the raw data (in addition to the possibilities offered by OxMetrics: ACF, PACF, QQ-plots,…):

• Basic Statistics: number of (missing) observations, minimum, maximum, mean and standard deviation.
• The value of the skewness and the kurtosis of the series, their t-tests and p-values. Moreover, the Jarque-Bera normality test (Jarque and Bera, 1987) is also reported.
• The Box-Pierce statistics at lag l* for both the series, i.e. BP, and the squared series, i.e. BP2. Under the null hypothesis of no autocorrelation, the statistics BP and BP2 should be evaluated against a χ²(l*) distribution. A Monte Carlo study of the size and power of this test is given in the example file Simulation_Q.ox.
• Engle’s LM ARCH test (Engle, 1982) to test the presence of ARCH effects in a series. For each specified order, the squared series is regressed on p of its own lags. The test statistic is distributed χ2(p) under the null hypothesis of no ARCH effects.
• Unit root tests on yt.
1. The Augmented Dickey-Fuller test of H0 : yt is I(1) against H1 : yt is I(0) (see Dickey and Fuller, 1981) is provided by the t-statistic on γ in:

Δyt = μ + βt + γ yt−1 + Σ(i=1..q) φi Δyt−i + et    (3.3)

where q is the number of lagged first differences included in the ADF regression. The constant and the trend are optional and can be excluded if needed. Note that et is assumed to be a white noise. Critical values are taken from MacKinnon (1991).
2. The KPSS test of Kwiatkowski, Phillips, Schmidt, and Shin (1992) tests H0 : yt is I(0) against H1 : yt is I(1). The regression model with a time trend has the form

yt = β0 + β1 t + k Σ(i=1..t) xi + et    (3.4)

where et is assumed to be stationary with mean 0 and variance σe², and xi is an i.i.d. process with E(xi) = 0 and V(xi) = 1. When k ≠ 0, yt is integrated, while when k = 0 it is trend-stationary (or stationary if β1 = 0). As a consequence, H0 implies k = 0 while H1 implies k ≠ 0. After the estimation of the above model under H0 by OLS (with or without a time trend), the KPSS test rejects H0 in favor of H1 for large values of the statistic

η = T⁻² Σ(t=1..T) St² ∕ σ̂²(l)    (3.5)

where St = Σ(i=1..t) êi and σ̂²(l) is an estimator of the spectral density at frequency zero (a nonparametric estimator of the so-called long-run variance). This estimator is based on a weighted sum of the autocovariances, where the weights are defined by a kernel function. We thus need to specify a truncation lag in the covariances, or bandwidth, denoted l. G@RCH 6.1 provides a Bartlett kernel. The bandwidth can be specified by the user. In their original work, Kwiatkowski, Phillips, Schmidt, and Shin (1992) considered three values of l as a function of T: l0 = 0, l4 = integer[4(T∕100)^(1/4)], and l12 = integer[12(T∕100)^(1/4)].
3. Schmidt and Phillips (1992) have proposed a test for the null hypothesis of a unit root when a deterministic linear trend is present. These authors end up with two statistics, denoted Z(rho) and Z(tau), based on a nonparametric estimator of the so-called long-run variance (see above) that involves a truncation lag in the covariances, or bandwidth (l). Critical values for these two statistics are reported in Schmidt and Phillips (1992).

• Long-memory tests
1. GPH: implements the log-periodogram regression method for estimating the long-memory parameter as discussed in Geweke and Porter-Hudak (1983). Periodogram points are evaluated at the Fourier frequencies λj = 2πj∕T, j = 1, …, l, where l is the number of low-frequency periodogram points used in estimation (the bandwidth).
2. GSP: implements the Gaussian semi-parametric method for estimating the long-memory parameter as discussed in Robinson and Henry (1998).
• Other Long-memory tests: Rescaled Range Tests of Mandelbrot (1972) and Lo (1991). See Section 3.12.
• Variance-ratio test of Lo and MacKinlay (1988). See Section 3.12.
• Runs tests. See Section 3.12.
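Several of the tests above are simple auxiliary regressions. As an illustrative sketch in Python (G@RCH itself is written in Ox; the function name and implementation details below are ours, not part of the package), Engle's LM ARCH test can be computed as T·R² from a regression of the squared demeaned series on a constant and p of its own lags:

```python
import numpy as np

def arch_lm(y, p):
    """Engle's (1982) LM test: regress the squared (demeaned) series on a
    constant and p of its own lags; T * R^2 is chi2(p) under the null of
    no ARCH effects."""
    e2 = (np.asarray(y, dtype=float) - np.mean(y)) ** 2
    T = len(e2) - p
    # design matrix: constant plus p lags of the squared series
    X = np.column_stack([np.ones(T)] +
                        [e2[p - i:len(e2) - i] for i in range(1, p + 1)])
    z = e2[p:]
    beta, *_ = np.linalg.lstsq(X, z, rcond=None)
    resid = z - X @ beta
    r2 = 1.0 - resid @ resid / ((z - z.mean()) @ (z - z.mean()))
    return T * r2
```

Under the null, the statistic is compared with the χ²(p) critical value (e.g. 5.99 at the 5% level for p = 2).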

To launch the descriptive statistics, click on G@RCH in the Modules group in the workspace window on the left-hand side of OxMetrics. Then change the Category to Other models and Model Class to Descriptive Statistics using G@RCH and click on Formulate.

A new dialog box Formulate is launched and the user is asked to select the series to be tested. Selected series are labelled ‘T’.

Next, the user selects the tests to be run. Lag values for the ARCH LM and Box-Pierce tests can be changed by the user. The subsequent window displays options related to the sample period. If the date variable is well formatted in the database, sample selection is made easier by allowing the user to pick a starting and an ending date. See Chapter 9 of Doornik (2007a) for details on OxMetrics file formats.

Once the OK button of the Estimate - Descriptive Statistics dialog box is pressed, the tests are launched and the results are printed in OxMetrics. Note that if a unit root test has been selected, an additional dialog box is launched and the user is asked to select some options related to the test (whether to include a constant or a trend, for instance, and the lag length for differences or the bandwidth). The results should be similar to the output of “Box 1 - Descriptive Statistics”. They confirm that NASDAQ returns exhibit ARCH-type effects (see the ARCH LM tests and the Q-statistics on squared data). Looking at the Q-statistics on raw data, one concludes that an ARMA-type model seems justified. Finally, the Jarque-Bera statistic clearly rejects the normality assumption for the unconditional distribution of the NASDAQ returns, while the KPSS statistic indicates that the series is likely to be I(0).

Box 1 - Descriptive Statistics for the NASDAQ
---- Database information ----
Sample:    1984-10-12 - 2000-12-21 (4093 observations)
Frequency: 1
Variables: 4

Variable        #obs  #miss    type          min         mean          max      std.dev
Date            4093      0    date       1984-10-12            2000-12-21
Nasdaq          4093      0  double      -12.043     0.055166       9.9636       1.2617
Constant        4093      0  double            1            1            1            0
Trend           4093      0  double            1         2047         4093       1181.5

Series #1/1: Nasdaq
---------
Normality Test

Statistic       t-Test      P-Value
Skewness            -0.74128       19.368  1.4336e-083
Excess Kurtosis       11.255       147.07      0.00000
Jarque-Bera           21979.         .NaN      0.00000
---------------
ARCH 1-2 test:    F(2,4088)=   420.80 [0.0000]**
ARCH 1-5 test:    F(5,4082)=   228.90 [0.0000]**
ARCH 1-10 test:   F(10,4072)=   118.16 [0.0000]**
---------------

Q-Statistics on Raw data
Q(  5) =  41.8697   [0.0000001]
Q( 10) =  50.9695   [0.0000002]
Q( 20) =  83.6251   [0.0000000]
Q( 50) =  167.368   [0.0000000]
H0 : No serial correlation ==> Accept H0 when prob. is High [Q < Chisq(lag)]
---------------
Q-Statistics on Squared data
Q(  5) =  1988.77   [0.0000000]
Q( 10) =  2874.40   [0.0000000]
Q( 20) =  3748.75   [0.0000000]
Q( 50) =  5491.27   [0.0000000]
H0 : No serial correlation ==> Accept H0 when prob. is High [Q < Chisq(lag)]
---------------------------------------------

KPSS Test without trend and 2 lags
H0: Nasdaq is I(0)

KPSS Statistics: 0.0743478
Asymptotic critical values of Kwiatkowski et al. (1992), JoE, 54,1, p. 159-178

1%    5%   10%
0.739 0.463 0.347

----  Log Periodogram Regression ----
d parameter         0.0691465 (0.015793) [0.0000]
No of observations: 4093; no of periodogram points: 2046

### 3.4 Conditional Mean Specification

Let us consider a univariate time series yt. If Ωt−1 is the information set at time t − 1, we can define its functional form as:

yt = E(yt|Ωt−1) + εt,  (3.6)

where E(·|·) denotes the conditional expectation operator and εt is the disturbance term (or unpredictable part), with E(εt) = 0 and E(εtεs) = 0, ∀ t ≠ s.

Equation (3.6) is the conditional mean equation, which has been studied and modelled in many ways. Two of the most famous specifications are the Autoregressive (AR) and Moving Average (MA) models. Combining these two processes and introducing n1 explanatory variables in the equation, we obtain the ARMAX(n,s) process

Ψ(L)(yt − μt) = Θ(L)εt,  with μt = μ + Σ_{i=1}^{n1} δi xi,t,  (3.7)

where L is the lag operator1, Ψ(L) = 1 − Σ_{i=1}^{n} ψi L^i and Θ(L) = 1 + Σ_{j=1}^{s} θj L^j. The ARMA residuals are obtained using the arma0 function of Ox on the demeaned series (yt − μt). See the Ox manual for more details about arma0.

Several studies have shown that the dependent variables (interest rate returns, exchange rate returns, etc.) may exhibit significant autocorrelation between observations widely separated in time. In such a case, yt is said to display long memory (or long-term dependence) and is best modelled by a fractionally integrated ARMA process. This ARFIMA process was initially developed in Granger (1980) and Granger and Joyeux (1980) among others.2 The ARFIMA(n,ζ,s) is given by:

Ψ(L)(1 − L)^ζ (yt − μt) = Θ(L)εt,  (3.8)

where the operator (1 − L)^ζ accounts for the long memory of the process and is defined as:

(1 − L)^ζ = 1 − Σ_{k=1}^{∞} ck(ζ) L^k = 1 − ζL − (1∕2)ζ(1 − ζ)L² − …,  (3.9)

with 0 < ζ < 1, c1(ζ) = ζ, c2(ζ) = (1∕2)ζ(1 − ζ), more generally ck(ζ) = −Γ(k − ζ)∕[Γ(−ζ)Γ(k + 1)], and Γ(·) denoting the Gamma function (see Baillie, 1996, for a survey on this topic). The truncation order of the infinite summation is set to t − 1. Note that the ARFIMA residuals are obtained by applying the diffpow function to the ARMA residuals. See the Ox manual for more details about diffpow.
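To make the truncated expansion concrete, here is a small Python sketch (G@RCH itself relies on Ox's diffpow; the function names below are illustrative). The weights follow the standard recursion πk = πk−1(k − 1 − ζ)∕k, so that πk = −ck(ζ) for k ≥ 1 in the notation above:

```python
import numpy as np

def frac_diff_weights(zeta, n):
    """Coefficients pi_k of (1 - L)^zeta = sum_k pi_k L^k, truncated at n,
    via the recursion pi_k = pi_{k-1} * (k - 1 - zeta) / k (pi_0 = 1)."""
    w = np.empty(n + 1)
    w[0] = 1.0
    for k in range(1, n + 1):
        w[k] = w[k - 1] * (k - 1 - zeta) / k
    return w

def frac_diff(x, zeta):
    """Apply (1 - L)^zeta to a series, truncating the expansion at t - 1
    as described in the text."""
    w = frac_diff_weights(zeta, len(x) - 1)
    return np.array([np.dot(w[:t + 1], x[t::-1]) for t in range(len(x))])
```

For ζ = 1 the filter collapses to ordinary first differencing, which provides a quick sanity check of the recursion.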

It is worth noting that Doornik and Ooms (1999) provided an Ox package for estimating, forecasting and simulating ARFIMA models. However, contrary to G@RCH, the conditional variance is assumed to be constant over time.3

A feature of G@RCH 6.1 is the availability of ARCH “in-mean” models. If we introduce the conditional variance (or the conditional standard deviation) as an additional explanatory variable in Equation (3.7), we get the ARCH-in-Mean model of Engle, Lilien, and Robbins (1987), i.e.

yt = μt + ϑ σt^k + εt,  (3.10)

with k = 1 to include the conditional standard deviation and k = 2 for the conditional variance.

The ARCH-M model is often used in financial applications where the expected return on an asset is related to the expected asset risk. The estimated coefficient of the expected risk is a measure of the risk-return tradeoff.

### 3.5 Conditional Variance Specification: the ARCH Model

The εt term in Equations (3.6)-(3.8) is the innovation of the process. More than two decades ago, Engle (1982) defined as an Autoregressive Conditional Heteroscedastic (ARCH) process any εt of the form:

εt = zt σt,  (3.11)

where zt is an independently and identically distributed (i.i.d.) process with E(zt) = 0 and Var(zt) = 1. By assumption, εt is serially uncorrelated with mean zero, but its conditional variance equals σt², which may therefore change over time, contrary to what is assumed in the standard regression model.

The models provided by our program are all ARCH-type.4 They differ on the functional form of σt2 but the basic principles are the same (see the next chapter for details). Besides the traditional ARCH and GARCH models, we focus mainly on two kinds of models: the asymmetric models and the fractionally integrated models. The former are defined to account for the so-called “leverage effect” observed in many stock returns, while the latter allows for long-memory in the variance. Early evidence of the “leverage effect” can be found in Black (1976), while persistence in volatility is a common finding of many empirical studies; see for instance the excellent surveys on ARCH models proposed in Bollerslev, Chou, and Kroner (1992), Bera and Higgins (1993) or Palm (1996).

The ARCH(q) model can be expressed as:

εt = zt σt,  zt ~ D(0,1),
σt² = ω + Σ_{i=1}^{q} αi εt−i²,  (3.12)

where D(·) is a probability density function with mean 0 and unit variance (see Section 3.6.2).

The ARCH model can describe volatility clustering. The conditional variance of εt is indeed an increasing function of the square of the shock that occurred in t − 1. Consequently, if εt−1 was large in absolute value, σt² and thus εt is expected to be large (in absolute value) as well. Notice that even if the conditional variance of an ARCH model is time-varying, i.e. σt² = E(εt²|ψt−1), the unconditional variance of εt is constant and, provided that ω > 0 and Σ_{i=1}^{q} αi < 1, we have:

σ² ≡ E(εt²) = ω ∕ (1 − Σ_{i=1}^{q} αi).  (3.13)
If zt is normally distributed, E(zt³) = 0 and E(zt⁴) = 3. Consequently, E(εt³) = 0 and the skewness of yt is zero. The kurtosis of the ARCH(1) exists if α1 < √(1∕3) ≈ 0.577 and equals 3(1 − α1²)∕(1 − 3α1²). In this case, the unconditional distribution of the returns features fat tails whenever α1 > 0.

In most applications, the excess kurtosis implied by the ARCH model (coupled with a normal density) is not enough to mimic what we observe on real data. Other distributions are possible. For example, we could assume that zt follows a Student distribution with unit variance and υ degrees of freedom, i.e. zt is ST(0,1,υ). In that case, the unconditional kurtosis of the ARCH(1) is λ(1 − α1²)∕(1 − λα1²), with λ = 3(υ − 2)∕(υ − 4) being the kurtosis of zt (provided υ > 4). Because of the additional coefficient υ, the ARCH(1) model based on the Student distribution features fatter tails than the corresponding model based on the normal distribution. See Section 3.6.2 for more details.

The computation of σt² in Equation (3.12) depends on past squared residuals εt², which are not observed for t = 0, −1, …, −q + 1. To initialize the process, the unobserved squared residuals are set to their sample mean.
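As an illustration of this initialization, here is a minimal Python sketch of the σt² recursion of Equation (3.12) (the function name is ours, not G@RCH's):

```python
import numpy as np

def arch_sigma2(eps, omega, alpha):
    """sigma2_t = omega + sum_i alpha_i * eps_{t-i}^2 for an ARCH(q) model,
    with unobserved presample squared residuals set to the sample mean of
    eps^2, as described in the text."""
    eps = np.asarray(eps, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    q = len(alpha)
    e2 = np.concatenate([np.full(q, np.mean(eps ** 2)), eps ** 2])
    # e2[t:t + q][::-1] holds eps_{t-1}^2, ..., eps_{t-q}^2 for observation t
    return np.array([omega + alpha @ e2[t:t + q][::-1]
                     for t in range(len(eps))])
```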

#### 3.5.1 Explanatory Variables

Explanatory variables can be introduced in the conditional variance as follows:

σt² = ωt + Σ_{i=1}^{q} αi εt−i²,  with ωt = ω + Σ_{i=1}^{n2} ωi xi,t.  (3.14)

For ease of presentation, we will use the notation ω for both ω and ωt in all subsequent equations.5

#### 3.5.2 Positivity Constraints

σt² obviously has to be positive for all t. Sufficient conditions to ensure that the conditional variance in Equation (3.12) is positive are ω > 0 and αi ≥ 0.6 Furthermore, when explanatory variables enter the ARCH equation, these positivity constraints are no longer sufficient (but the conditional variance is still required to be non-negative).

#### 3.5.3 Variance Targeting

A simple trick to reduce the number of parameters to be estimated is referred to as variance targeting by Engle and Mezrich (1996).

The conditional variance of the ARCH model (and of most of its generalizations) may be expressed in terms of the unconditional variance and the other parameters. One can thus reparametrize the model using the unconditional variance and replace it by a consistent estimator (before maximizing the likelihood).

Applying variance targeting to the ARCH model implies replacing ω by σ²(1 − Σ_{i=1}^{q} αi), where σ² is the unconditional variance of εt, which can be consistently estimated by its sample counterpart.

If explanatory variables appear in the ARCH equation, ω is then replaced by σ²(1 − Σ_{i=1}^{q} αi) − Σ_{i=1}^{n2} ωi x̄i, where x̄i is the sample average of variable xi,t (assuming the stationarity of the n2 explanatory variables). In other words, the explanatory variables are centered.
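A sketch of the reparametrization without explanatory variables, in Python with illustrative names:

```python
import numpy as np

def variance_target_omega(eps, alpha):
    """Variance targeting for an ARCH(q): instead of estimating omega,
    set omega = sigma2_hat * (1 - sum(alpha)) so that the implied
    unconditional variance omega / (1 - sum(alpha)) equals the sample
    variance of eps."""
    return np.var(eps) * (1.0 - np.sum(alpha))
```

The intercept is then no longer a free parameter, which removes one dimension from the likelihood maximization.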

### 3.6 Estimation

Estimating (3.12) with G@RCH is intended to be very simple.

Note that in the example we will use a Monday dummy and a Friday dummy. These dummies are easily created with the Calculator (recall that the dataset is dated). First, use the function dayofweek() to create a variable DAY with DAY = 2 on Mondays, 3 on Tuesdays, etc. Then, create the Monday and Friday dummies as follows: Monday = DAY == 2 and Friday = DAY == 6.
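The same dummies can be built outside OxMetrics; here is a stdlib Python sketch mimicking the weekday coding of dayofweek() (Monday = 2, Friday = 6), with hypothetical sample dates:

```python
from datetime import date, timedelta

start = date(2000, 1, 3)  # hypothetical sample start (a Monday)
dates = [start + timedelta(days=i) for i in range(10)]
# isoweekday() codes Monday as 1, so add 1 to mimic DAY = 2 on Mondays
day = [d.isoweekday() + 1 for d in dates]
monday = [int(v == 2) for v in day]  # Monday = DAY == 2
friday = [int(v == 6) for v in day]  # Friday = DAY == 6
```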

To specify the model, first click on G@RCH in the Modules group in the workspace window on the left-hand side of OxMetrics. Next change the Category to Models for financial data and the Model class to Univariate GARCH models using G@RCH, then click on Formulate.

A list with all the variables of the database appears in the Database frame. To select variables that will enter your model, click on the variable name and then click on the << button. There are three possible statuses for each variable (see the list of statuses under the Selection frame): dependent variable (Y variable), regressor in the conditional mean (Mean), or regressor in the conditional variance (Variance). In the univariate module, only one Y variable per model is accepted. However one can include several regressors in the conditional mean and the conditional variance equations and the same variable can be a regressor in both equations. In the example a Monday dummy is included in the conditional mean and a Friday dummy in the conditional variance equation.

Once the OK button is pressed, the Model Settings box automatically appears. This box allows the user to select the specification of the model: AR(FI)MA orders for the mean equation, GARCH orders, type of GARCH model for the variance equation and the distribution. The default specification is an ARMA(0,0)-GARCH(1,1) with normal errors. In our application, we select an ARMA(1,0)-GARCH(0,1) specification, or equivalently an ARMA(1,0)-ARCH(1).7

#### 3.6.2 Distributions

Estimation of ARCH-type models is commonly done by maximum likelihood so that one has to make an additional assumption about the innovation process zt, i.e. choosing a density function D(0,1) with a mean 0 and a unit variance.

Weiss (1986) and Bollerslev and Wooldridge (1992) show that under the normality assumption, the quasi-maximum likelihood (QML) estimator is consistent if the conditional mean and the conditional variance are correctly specified. This estimator is, however, inefficient with the degree of inefficiency increasing with the degree of departure from normality (Engle and González-Rivera, 1991).

As reported by Palm (1996), Pagan (1996) and Bollerslev, Chou, and Kroner (1992), the use of fat-tailed distributions is widespread in the literature. In particular, Bollerslev (1987), Hsieh (1989), Baillie and Bollerslev (1989) and Palm and Vlaar (1997), among others, show that these distributions better capture the high observed kurtosis.

As shown below, four distributions are available in G@RCH 6.1 : the usual Gaussian (normal) distribution, the Student-t distribution, the Generalized Error distribution (GED) and the skewed-Student distribution.

The logic of ML is to interpret the density as a function of the parameter set, conditional on a set of sample outcomes. This function is called the likelihood function. It is quite evident from Equation (3.12) (and all the following equations of Section 3) that the recursive evaluation of this function is conditional on unobserved values. The ML estimation is therefore not perfectly exact. To solve the problem of unobserved values, we have set these quantities to their unconditional expected values. For this reason we talk about approximate (or conditional) ML and not exact ML.

If we express the mean equation as in Equation (3.6) and εt = ztσt, the log-likelihood function of the standard normal distribution is given by:

Lnorm = −(1∕2) Σ_{t=1}^{T} [ ln(2π) + ln(σt²) + zt² ],  (3.15)

where T is the number of observations.
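Equation (3.15) translates directly into code; a minimal Python version (the function name is ours):

```python
import numpy as np

def gaussian_loglik(eps, sig2):
    """Gaussian log-likelihood of Equation (3.15):
    -0.5 * sum( ln(2*pi) + ln(sigma2_t) + z_t^2 ), with z_t = eps_t / sigma_t."""
    eps, sig2 = np.asarray(eps, dtype=float), np.asarray(sig2, dtype=float)
    return -0.5 * np.sum(np.log(2 * np.pi) + np.log(sig2) + eps ** 2 / sig2)
```

In practice this function would be maximized over the model parameters, with sig2 produced by the chosen conditional variance recursion.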

For a Student-t distribution, the log-likelihood is:

Lstud = T { ln Γ((υ+1)∕2) − ln Γ(υ∕2) − (1∕2) ln[π(υ − 2)] } − (1∕2) Σ_{t=1}^{T} [ ln(σt²) + (1 + υ) ln(1 + zt²∕(υ − 2)) ],  (3.16)

where υ is the number of degrees of freedom, 2 < υ ≤ ∞, and Γ(·) is the gamma function.

The GED log-likelihood function of a normalized random variable is given by:

LGED = Σ_{t=1}^{T} [ ln(υ∕λυ) − (1∕2) |zt∕λυ|^υ − (1 + 1∕υ) ln 2 − ln Γ(1∕υ) − (1∕2) ln(σt²) ],  (3.17)

where 0 < υ < ∞ and

λυ ≡ [ Γ(1∕υ) 2^(−2∕υ) ∕ Γ(3∕υ) ]^(1∕2).

The main drawback of the last two densities is that even if they may account for fat tails, they are symmetric. Skewness and kurtosis are important in financial applications in many respects (in asset pricing models, portfolio selection, option pricing theory or Value-at-Risk among others). Quite recently, Lambert and Laurent (2000, 2001) applied and extended the skewed-Student density proposed by Fernández and Steel (1998) to the GARCH framework.

The log-likelihood of a standardized (zero mean and unit variance) skewed-Student is:

LSkSt = T { ln Γ((υ+1)∕2) − ln Γ(υ∕2) − (1∕2) ln[π(υ − 2)] + ln[2∕(ξ + 1∕ξ)] + ln(s) } − (1∕2) Σ_{t=1}^{T} { ln(σt²) + (1 + υ) ln[1 + (s zt + m)² ξ^(−2It) ∕ (υ − 2)] },  (3.18)

where

It = 1 if zt ≥ −m∕s and It = −1 if zt < −m∕s,

ξ is the asymmetry parameter, υ is the number of degrees of freedom of the distribution,

m = [Γ((υ − 1)∕2) √(υ − 2) ∕ (√π Γ(υ∕2))] (ξ − 1∕ξ),

and

s = √(ξ² + 1∕ξ² − 1 − m²).

Note that G@RCH does not estimate ξ but log(ξ) to facilitate inference about the null hypothesis of symmetry (since the skewed-Student equals the symmetric Student distribution when ξ = 1 or log(ξ) = 0). The estimated value of log(ξ) is reported in the output under the label “Asymmetry”. See Lambert and Laurent (2001) and Bauwens and Laurent (2005) for more details.

The gradient vector and the Hessian matrix can be obtained numerically or by evaluating their analytical expressions. Due to the large number of models and distributions available, G@RCH uses numerical techniques to approximate the derivatives of the log-likelihood function with respect to the parameter vector.

The next window is (Starting Values) . The user is asked to make a choice regarding the starting values of the parameters to be estimated: (1) let the program choose the starting values, (2) enter them manually, element by element (the one selected here), or (3) enter the starting values in a vector form (one value per row) and potentially fix some parameters at their starting value.

The first method is obviously the easiest to use and is recommended for inexperienced users since it prevents them from entering aberrant values. The last two methods are useful if the user wants specific starting values for the estimation. The third method is particularly useful when one wants to fix some parameters.

Next, the Estimate window proposes options on two important characteristics of the model: the sample size and the estimation method.

When the variable corresponding to the date is correctly formatted, the sample can conveniently be set using a starting and an ending date (see Chapter 9 of Doornik, 2007, for details). A number of observations can also be withheld from the estimation sample when out-of-sample forecasting is to be performed.

The models are estimated using a maximum likelihood (ML) approach. Basically, three methods are proposed to estimate these models (see the option Optimization Methods in the Model Settings window).

1. By default, a standard ML approach is selected. This method uses the quasi-Newton method of Broyden, Fletcher, Goldfarb and Shanno (BFGS). This function is the well-known MaxBFGS function provided by Ox. See the Ox documentation for details.
2. A constrained optimization technique that uses the MaxSQPF algorithm. MaxSQPF implements a sequential quadratic programming technique to maximize a non-linear function subject to non-linear constraints, similar to Algorithm 18.7 in Nocedal and Wright (1999). MaxSQPF is particularly useful to impose stationarity and/or positivity constraints such as α1 ≥ 0 in the ARCH(1) model (see Sections 3.9 and 5.3.2).
3. And finally, a simulated annealing algorithm for optimizing non-smooth functions with possible multiple local maxima, ported to Ox by Charles Bos and called MaxSA. See http://www.tinbergen.nl/~cbos/software/maxsa.html for more details.

G@RCH provides three methods to compute the covariance matrix of the estimates: Second Derivatives (based on the inverse of the Hessian), Outer-Product of Gradients and Robust standard errors, also known as Quasi-Maximum Likelihood (QML) or ‘Sandwich formula’. This choice is accessible by clicking on the button ‘Options...’. By default, Robust standard errors are reported.

Note that a new feature of G@RCH 5.0 is that the default settings are updated after each estimation (the default model is thus the previously estimated one). To disable this option, disable ‘Update Default Settings’. Option ‘Reset default Settings’ is self-explanatory.

Once all the options have been selected, the estimation procedure is launched if the default starting values are used. Otherwise, a new dialog box appears. Let us assume that the second method has been selected for the starting values, i.e. ‘Select (Individual Form)’. This new window contains an exhaustive list of parameters used in the different models. Depending on the specification, some parameters have a value, others do not.

The user should replace only the former, since they correspond to the parameters to be estimated for the specified model. Note that it is crucial to respect the format of these initial values. More specifically, when two values are associated with a single parameter (say, both ARCH coefficients in an ARCH(2) model), the mandatory format is “value1 ; value2”. Here is an example:

If the third method is selected for the starting values, i.e. ‘Select (Matrix Form)’, the corresponding window contains a column with the default starting values of the parameter vector (column ‘Value’) and a column ‘FIX’ that allows the user to fix some parameters at their starting value.

Note that estimating the same model with different starting values is an interesting way to check the robustness of the maximum likelihood procedure.

Once this step is completed, the program starts the iteration process. Depending on the options selected8, G@RCH prints intermediate iteration results or not. The final output is divided in two main parts: first, the model specification reminder; second, the estimated values of the parameters and other useful statistics.9 The outputs “Box 2 - Output of the ARMA(1,0)-GARCH(0,1)” and “Box 3 - Output of the ARMA(1,0)-GARCH(0,1) by QMLE” correspond respectively to the estimation results of the ARMA(1,0)-GARCH(0,1) model by ML and QMLE.

Box 2 - Output of the ARMA(1,0)-GARCH(0,1)
******************************
** G@RCH( 1) SPECIFICATIONS **
******************************
Dependent variable : Nasdaq
Mean Equation : ARMA (1, 0) model.
1 regressor(s) in the mean.
Variance Equation : GARCH (0, 1) model.
1 regressor(s) in the variance.
The distribution is a Gauss distribution.

Strong convergence using numerical derivatives
Log-likelihood = -6106.36
Please wait : Computing the Std Errors ...

Maximum Likelihood Estimation (Std.Errors based on Second derivatives)
Coefficient  Std.Error  t-value  t-prob
Cst(M)               0.188314   0.016176    11.64  0.0000
Monday (M)          -0.143636   0.033629   -4.271  0.0000
AR(1)               -0.021047   0.017378   -1.211  0.2259
Cst(V)               0.737026   0.028148    26.18  0.0000
Friday (V)          -0.289467   0.040311   -7.181  0.0000
ARCH(Alpha1)         0.746336   0.046922    15.91  0.0000

No. Observations :      4093  No. Parameters  :         6
Mean (Y)         :   0.05517  Variance (Y)    :   1.59189
Skewness (Y)     :  -0.74128  Kurtosis (Y)    :  14.25531
Log Likelihood   : -6106.357  Alpha[1]+Beta[1]:   0.74614

The sample mean of squared residuals was used to start recursion.
Positivity & stationarity constraints are not computed because
there are explanatory variables in the conditional variance equation.

Estimated Parameters Vector :
0.188314;-0.143636;-0.021047; 0.737026;-0.289467; 0.746341

Elapsed Time : 1.542 seconds (or 0.0257 minutes).

Parameters labelled ‘(M)’ relate to the conditional mean while those labelled ‘(V)’ relate to the conditional variance equation. AR(1) and ARCH(Alpha1) correspond to ψ1 and α1, respectively.

Surprisingly, the AR(1) coefficient is not significantly different from 0 (we will come back to this issue later), while it was expected to be significantly negative. Interestingly, returns and volatility are, on average, found to be lower on Mondays and on Fridays, respectively. Furthermore, the ARCH coefficient α1 is highly significant (rejecting the null of no ARCH effects) but incompatible with the existence of the kurtosis (since it is above 0.577). The log-likelihood value is -6106.357.

Box 3 - Output of the ARMA(1,0)-GARCH(0,1) by QMLE
******************************
** G@RCH( 2) SPECIFICATIONS **
******************************
Dependent variable : Nasdaq
Mean Equation : ARMA (1, 0) model.
1 regressor(s) in the mean.
Variance Equation : GARCH (0, 1) model.
1 regressor(s) in the variance.
The distribution is a Gauss distribution.

Strong convergence using numerical derivatives
Log-likelihood = -6106.36
Please wait : Computing the Std Errors ...

Robust Standard Errors (Sandwich formula)
Coefficient  Std.Error  t-value  t-prob
Cst(M)               0.188314   0.025747    7.314  0.0000
Monday (M)          -0.143636   0.049195   -2.920  0.0035
AR(1)               -0.021047   0.055644  -0.3782  0.7053
Cst(V)               0.737026   0.062300    11.83  0.0000
Friday (V)          -0.289467   0.081034   -3.572  0.0004
ARCH(Alpha1)         0.746336    0.10484    7.119  0.0000

No. Observations :      4093  No. Parameters  :         6
Mean (Y)         :   0.05517  Variance (Y)    :   1.59189
Skewness (Y)     :  -0.74128  Kurtosis (Y)    :  14.25531
Log Likelihood   : -6106.357  Alpha[1]+Beta[1]:   0.74634

The sample mean of squared residuals was used to start recursion.
Positivity & stationarity constraints are not computed because
there are explanatory variables in the conditional variance equation.

Estimated Parameters Vector :
0.188314;-0.143636;-0.021047; 0.737026;-0.289467; 0.746341

Elapsed Time : 1.522 seconds (or 0.0253667 minutes).

Ex-post, it is desirable to test the adequacy of the ARCH model. New options are thus available after the estimation of the model when clicking on the Test... button of the main G@RCH box: Tests, Graphic Analysis, Forecasts, Exclusion Restrictions, Linear Restrictions and Store.

### 3.7 Graphics

The Graphic Analysis... option allows the user to plot various graphs.

Figure 3.4 plots the conditional variance (σ̂t²) as well as the histogram of the standardized residuals (ẑt = ε̂t∕σ̂t) obtained with the AR(1)-ARCH(1) model, together with a kernel estimate of their unconditional distribution (solid line) and the N(0,1) density (dotted line).

Just as any other graphs in the OxMetrics environment, all graphs plotted from G@RCH can be easily edited (color, size,…) and exported in many formats (.eps, .ps, .wmf, .emf and .gwg).

### 3.8 Misspecification Tests

The Tests... option allows the user to run different tests but also to print the variance-covariance matrix of the estimated parameters.

Once again, in addition to the possibilities offered by OxMetrics (ACF, PACF, QQ-plots,…), several misspecification tests are indeed provided in G@RCH:

• Four Information Criteria (divided by the number of observations):10
• The value of the skewness and the kurtosis of the standardized residuals of the estimated model (ẑt), their t-tests and p-values. Moreover, the Jarque-Bera normality test (Jarque and Bera, 1987) is also reported.
• The Box-Pierce statistics at lag l* for both the standardized residuals, i.e. BP, and the squared standardized residuals, i.e. BP2. Under the null hypothesis of no autocorrelation, the statistics BP and BP2 should be evaluated against the χ²(l* − n − s) and χ²(l* − p − q) distributions, respectively (see McLeod and Li, 1983).
• Engle’s LM ARCH test (Engle, 1982) to test the presence of ARCH effects in a series. For each specified order, the squared residual series is regressed on p of its own lags. The test statistic is distributed χ2(p) under the null hypothesis of no ARCH effects.
• The diagnostic test of Engle and Ng (1993) that investigates possible misspecification of the conditional variance equation. Let St⁻ denote a dummy variable that takes the value 1 when ε̂t < 0 and 0 otherwise (and St⁺ = 1 − St⁻). The Sign Bias Test (SBT) examines whether ẑt² can be predicted by St−1⁻, St−1⁻ε̂t−1 and/or St−1⁺ε̂t−1. To test the presence of leverage effects, Engle and Ng (1993) propose to run the following regressions:

ẑt² = a0 + a1 St−1⁻ + ut,
ẑt² = b0 + b1 St−1⁻ ε̂t−1 + ut,
ẑt² = c0 + c1 St−1⁺ ε̂t−1 + ut,

and test the significance of a1, b1 and c1 through a t-test. The tests are called, respectively, the Sign Bias Test (SBT), the Negative Sign Bias Test (NSBT) and the Positive Sign Bias Test (PSBT). The NSBT and PSBT also test whether the effect of negative and positive shocks on the conditional variance depends on their size.

Instead of running three different regressions, G@RCH follows Engle and Ng (1993) in estimating the three effects jointly, i.e.

ẑt² = d0 + d1 St−1⁻ + d2 St−1⁻ ε̂t−1 + d3 St−1⁺ ε̂t−1 + ut.  (3.19)

T-statistics corresponding to H0 : di = 0 (H1 : di ≠ 0), i = 1, 2 and 3, are reported, as well as their p-values. Finally, a joint test of H0 : d1 = d2 = d3 = 0 is also provided.
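Regression (3.19) amounts to a simple OLS on constructed regressors. A Python sketch (names are illustrative; the coefficients are returned in the order d0, …, d3):

```python
import numpy as np

def sign_bias_joint(z, eps):
    """Joint Engle-Ng (1993) regression of Equation (3.19): regress z_t^2
    on a constant, S_{t-1}^-, S_{t-1}^- * eps_{t-1} and
    S_{t-1}^+ * eps_{t-1}; returns the OLS coefficients (d0, d1, d2, d3)."""
    z, eps = np.asarray(z, dtype=float), np.asarray(eps, dtype=float)
    s_neg = (eps[:-1] < 0).astype(float)
    X = np.column_stack([np.ones(len(eps) - 1),
                         s_neg,                      # sign bias regressor
                         s_neg * eps[:-1],           # negative sign bias
                         (1.0 - s_neg) * eps[:-1]])  # positive sign bias
    d, *_ = np.linalg.lstsq(X, z[1:] ** 2, rcond=None)
    return d
```

The t-statistics on d1, d2 and d3 and the joint test would then be computed from the usual OLS covariance matrix.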

• The adjusted Pearson goodness-of-fit test that compares the empirical distribution of the innovations with the theoretical one. In order to carry out this testing procedure, it is necessary to first classify the residuals in cells according to their magnitude.11 Let n be the number of observations, r the number of categories we consider, pi (i = 1, …, r) the observed proportion of observations in the ith category and pit (i = 1, …, r) the theoretical probability for an observation to be in the ith category. The Pearson goodness-of-fit test has the null H0 : p1 = p1t, p2 = p2t, …, pr = prt. The statistic is computed as

P(g) = Σ_{i=1}^{r} (ni − Eni)² ∕ Eni,  (3.20)

where ni is the number of observations in the sample that fall into the ith category and Eni is the number of observations expected in the ith category when H0 is true. The Pearson statistic is therefore “small” when all of the observed counts (proportions) are close to the expected counts (proportions) and “large” when one or more observed counts (proportions) differ noticeably from what is expected under H0.12 For i.i.d. observations, Palm and Vlaar (1997) show that under the null of a correct distribution the asymptotic distribution of P(g) is bounded between a χ²(r − 1) and a χ²(r − k − 1), where k is the number of estimated parameters. As explained by Palm and Vlaar (1997), the choice of r is far from obvious.13 According to König and Gaab (1982), the number of cells must increase at a rate equal to T^0.4.
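With equiprobable cells, Equation (3.20) can be sketched as follows: the probability-integral transform maps the residuals to [0,1], so that cells of equal width on [0,1] are equiprobable under the assumed distribution (function names are illustrative):

```python
import numpy as np

def pearson_gof(z, cells, cdf):
    """Pearson goodness-of-fit statistic of Equation (3.20): classify the
    standardized residuals into `cells` equiprobable classes of the assumed
    distribution (via its cdf) and compare observed and expected counts."""
    n = len(z)
    u = cdf(np.asarray(z, dtype=float))       # probability transform
    counts, _ = np.histogram(u, bins=np.linspace(0.0, 1.0, cells + 1))
    expected = n / cells                      # equiprobable cells
    return np.sum((counts - expected) ** 2 / expected)
```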

• The Nyblom test (Nyblom, 1989 and Lee and Hansen, 1994) can be used to check the constancy of parameters over time. See also Hansen (1994) for an overview of this test.
• The Residual-Based Diagnostic (RBD) for Conditional Heteroscedasticity of Tse (2002). The Box-Pierce portmanteau statistic is perhaps the most widely used diagnostic for conditional heteroscedasticity models. Although it has been noted that the portmanteau statistics do not have an asymptotic χ² distribution, many authors nonetheless apply the χ² distribution as an approximation (the problem lies in the fact that estimated residuals are used to calculate the portmanteau statistics). To overcome this problem, Tse (2002) proposes a Residual-Based Diagnostic for conditional heteroscedasticity. The diagnostic involves running artificial regressions and testing for the statistical significance of the regression parameters. The key problem is that since the regressors are estimated, the usual ordinary least squares (OLS) results do not apply.

The idea of the test is the following: after estimating the model, the standardized residuals ẑt = ε̂t∕σ̂t can be computed. Obviously, ẑt depends on the set of estimated parameters. Since E(ẑt²) = 1 by construction, we can run a regression of ẑt² − 1 on some information variables and examine the statistical significance of the regression parameters. Tse (2002) proposes the following OLS regression to test for remaining heteroscedasticity in the standardized residuals:
ẑt² − 1 = d1 ẑt−1² + … + dM ẑt−M² + ut. Since the regressors are not observed (but estimated), the standard OLS inference procedures are invalid. Tse (2002) derives the asymptotic distribution of the estimated parameters and shows that a joint test of significance of d1, …, dM is χ²(M) distributed. Notice that in G@RCH the maximum lag values M of the test are set by default to 2, 5 and 10 but can be changed by the user.

Monte Carlo experiment

To study the performance of this test, let us consider a first simulation study (see Simulation_RBD_TSE.ox). We simulate 1000 series of T = 2000 observations following a N-AR(0)-GARCH(1,1) model with μ = 0.1, i.e. (y_t − μ) = ε_t, where ε_t ~ N(0, σ_t²) and σ_t² = 0.2 + 0.1ε²_{t−1} + 0.8σ²_{t−1}. We then estimate a N-AR(0)-GARCH(1,1) model on the simulated data and apply Tse (2002)’s test on the squared standardized residuals with M = 1, 2, 3 and 4. We report the test both with and without the RBD correction.

The number of times that the null is rejected at the α% nominal size (called the empirical size of the test) is reported below.

RBD TEST OF TSE (2002)
Empirical size with alpha=0.10 and M=1: 9.7
Empirical size with alpha=0.05 and M=1: 5.4
Empirical size with alpha=0.01 and M=1: 2.2

Empirical size with alpha=0.10 and M=2: 8.5
Empirical size with alpha=0.05 and M=2: 4.5
Empirical size with alpha=0.01 and M=2: 2.9

Empirical size with alpha=0.10 and M=3: 8.4
Empirical size with alpha=0.05 and M=3: 5.7
Empirical size with alpha=0.01 and M=3: 3

Empirical size with alpha=0.10 and M=4: 8.3
Empirical size with alpha=0.05 and M=4: 6.1
Empirical size with alpha=0.01 and M=4: 3.4

RBD TEST OF TSE (2002) without the correction
Empirical size with alpha=0.10 and M=1: 0.9
Empirical size with alpha=0.05 and M=1: 0.3
Empirical size with alpha=0.01 and M=1: 0.1

Empirical size with alpha=0.10 and M=2: 3
Empirical size with alpha=0.05 and M=2: 1.1
Empirical size with alpha=0.01 and M=2: 0.3

Empirical size with alpha=0.10 and M=3: 4.7
Empirical size with alpha=0.05 and M=3: 2.1
Empirical size with alpha=0.01 and M=3: 0.4

Empirical size with alpha=0.10 and M=4: 4.1
Empirical size with alpha=0.05 and M=4: 2.5
Empirical size with alpha=0.01 and M=4: 0.4

These results suggest that the version of the test that implements the RBD correction has a reasonable size for α = 10% and 5% and suffers from a positive size distortion for α = 1%. However, the ‘OLS’ version of the test (that does not take into account the uncertainty of the estimation of the GARCH model) is clearly inadequate because it has a tendency to under-reject the null whatever the values of M and α. For instance, for M = 1 and α = 10%, the empirical size is 0.9%, which suggests a strong size distortion.
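The empirical sizes reported in such outputs are simply rejection frequencies across replications. A small helper of this kind (hypothetical, not part of G@RCH) computes them from a vector of simulated p-values:

```python
import numpy as np

def empirical_size(pvalues, alphas=(0.10, 0.05, 0.01)):
    """Share of Monte Carlo replications (in %) whose p-value falls
    below each nominal level alpha: the empirical size of the test."""
    p = np.asarray(pvalues, dtype=float)
    return {a: 100.0 * np.mean(p < a) for a in alphas}
```

For a correctly sized test the p-values are uniform on [0,1], so the empirical size should be close to 100α.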

In the example, 6 tests have been selected. The output is reported in Box 4 - Misspecification Tests.

Box 4 - Misspecification Tests
TESTS :
------------
Information Criterium (to be minimized)
Akaike          2.986737  Shibata         2.986733
Schwarz         2.995997  Hannan-Quinn    2.990016
---------------

Statistic       t-Test      P-Value
Skewness            -0.28941       7.5616      0.00000
Excess Kurtosis       5.9872       78.235      0.00000
Jarque-Bera           6170.5         .NaN      0.00000
---------------

Q-Statistics on Standardized Residuals
--> P-values adjusted by 1 degree(s) of freedom
Q(  5) =  62.3925   [0.0000000]
Q( 10) =  68.3636   [0.0000000]
Q( 20) =  84.2257   [0.0000000]
Q( 50) =  125.350   [0.0000000]
H0 : No serial correlation ==> Accept H0 when prob. is High [Q < Chisq(lag)]
---------------

Q-Statistics on Squared Standardized Residuals
--> P-values adjusted by 1 degree(s) of freedom
Q(  5) =  323.697   [0.0000000]
Q( 10) =  628.454   [0.0000000]
Q( 20) =  1035.75   [0.0000000]
Q( 50) =  1695.16   [0.0000000]
H0 : No serial correlation ==> Accept H0 when prob. is High [Q < Chisq(lag)]
---------------

# Cells(g)  Statistic      P-Value(g-1)     P-Value(g-k-1)
40      526.9756         0.000000          0.000000
50      540.9238         0.000000          0.000000
60      565.9641         0.000000          0.000000
Rem.: k = 6 = # estimated parameters
---------------

Residual-Based Diagnostic for Conditional Heteroskedasticity of Tse (2001)

RBD( 2) =  104.753   [0.0000000]
RBD( 5) =  246.916   [0.0000000]
RBD(10) =  303.489   [0.0000000]
------------------------------------------------
P-values in brackets

Without going too deeply into the analysis of these results, we can briefly argue that the model does not capture the dynamics of the NASDAQ.

The Q-statistics on standardized and squared standardized residuals, the RBD test with various lag values, as well as the adjusted Pearson chi-square goodness-of-fit test (with different cell numbers) all reject the null hypothesis of a correct specification. This result is not very surprising. Early empirical evidence has indeed shown that a high ARCH order has to be selected to catch the dynamics of the conditional variance (thus involving the estimation of numerous parameters).

### 3.9 Parameter Constraints

When numerical optimization is used to maximize the log-likelihood function with respect to the vector of parameters Ψ, the inspected range of the parameter space is unbounded. The problem is that some parameters might have to be constrained in a smaller interval. For instance, it is convenient to constrain the αi parameters of an ARCH(q) model to be positive. Constraining parameters to lie between given lower and upper bounds is easily done by selecting the ‘BFGS-BOUNDS’ option in the Estimate - GARCH Models dialog box in G@RCH.

If the user fixes bounds on parameters, the program uses MaxSQPF instead of MaxBFGS (Broyden, Fletcher, Goldfarb and Shanno quasi-Newton method) to optimize the likelihood function. MaxSQPF enforces all iterates to be feasible, using the algorithm by Lawrence and Tits (2001). If a starting point is infeasible, MaxSQPF will try to minimize the squared constraint violations to find a feasible point. See Doornik (2007b) for more details on these two optimization functions.

To change the values of the bounds, select option ‘Select (Matrix Form)’ in the ‘Starting Value’ dialog box together with option ‘BFGS-BOUNDS’ in the Estimate - GARCH Models dialog box. In addition to columns ‘FIX’ and ‘Value’, two new columns appear with the default lower and upper bounds.

Furthermore, nonlinear constraints (like the stationarity constraint of the GARCH (1,1) model, α1 + β1 < 1) can be imposed during the estimation with the Console version. For more details, see 5.3.2.

### 3.10 Forecasts

The main purpose of building and estimating a model with financial data is probably to produce a forecast. With the Forecast... option, G@RCH 6.1 also provides forecasting tools: forecasts of both the conditional mean and the conditional variance are available as well as several forecast error measures.

The first parameter to specify is the horizon h of the h-step-ahead forecasts. The default value is 10. Three options are available:

1. print several forecast error measures;
2. print the forecasts;
3. make a graph of the forecasts.

Finally, graphical options are available for the standard error bands (error bands, bars or fans).

#### 3.10.1 Forecasting the Conditional Mean

Our first goal is to give the optimal h-step-ahead predictor of yt+h given the information we have up to time t.

For instance, in an AR(1) process such as

yt = μ + ψ1(yt-1 - μ) + εt,

the optimal14 h-step-ahead predictor of yt+h (i.e. ŷ_{t+h|t}) is its conditional expectation at time t (given the estimated parameters and the information available at time t):

ŷ_{t+h|t} = μ̂ + ψ̂₁(ŷ_{t+h−1|t} − μ̂),  (3.21)

where ŷ_{t+i|t} = y_{t+i} for i ≤ 0.

For the AR(1), the optimal 1-step-ahead forecast equals μ̂ + ψ̂₁(y_t − μ̂). For h > 1, the optimal forecast can be obtained recursively or directly as ŷ_{t+h|t} = μ̂ + ψ̂₁^h(y_t − μ̂).
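The equivalence between the recursive and the direct formula is easy to verify numerically. A minimal sketch (the function name is hypothetical):

```python
def ar1_forecast(y_t, mu, psi1, h):
    """Recursive h-step-ahead AR(1) forecast:
    y_hat(t+h|t) = mu + psi1 * (y_hat(t+h-1|t) - mu),
    which collapses to the direct formula mu + psi1**h * (y_t - mu)."""
    f = y_t
    for _ in range(h):
        f = mu + psi1 * (f - mu)
    return f
```

For |ψ₁| < 1 the forecast converges to the unconditional mean μ as h grows, as expected for a stationary AR(1).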

More generally, for the ARFIMA(n,ζ,s) model described Equation (3.8), the optimal h-step-ahead predictor of yt+h is:

Recall that when exogenous variables enter the conditional mean equation, μ becomes μ_t = μ + Σ_{i=1}^{n₁} δ_i x_{i,t} and consequently, provided that the information x_{i,t+h} is available at time t (which is the case for instance if x_{i,t} is a “day-of-the-week” dummy variable), μ̂_{t+h|t} is also available at time t. When there is no exogenous variable in the ARFIMA model and n = 1, s = 0 and ζ = 0 (c_k = 0), the forecast of the AR(1) process given in Equation (3.21) is recovered.

Note that for ARCH-in-mean models, ŷ_{t+h|t} = μ + Σ_{i=1}^{n₁} δ_i x_{i,t+h|t} + ϑσ^k_{t+h|t} (with k = 1 or 2).

#### 3.10.2 Forecasting the Conditional Variance

Independently from the conditional mean, one can forecast the conditional variance. In the simple ARCH(q) case, the optimal h-step-ahead forecast of the conditional variance, i.e. σ̂²_{t+h|t}, is given by:

σ̂²_{t+h|t} = ω̂ + Σ_{i=1}^{q} α̂_i ε²_{t+h−i|t},  (3.23)

where ε²_{t+i|t} = σ²_{t+i|t} for i > 0 while ε²_{t+i|t} = ε²_{t+i} for i ≤ 0. Equation (3.23) is usually computed recursively, even if a closed-form solution of σ²_{t+h|t} can be obtained by recursive substitution in Equation (3.23).
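The recursion in Equation (3.23) can be sketched in a few lines (a sketch under the stated convention, with a hypothetical function name; in-sample squared residuals feed the forecasts for i ≤ 0, and forecasts replace them for i > 0):

```python
def arch_forecast(eps2, omega, alpha, h):
    """Recursive h-step-ahead conditional-variance forecasts for an
    ARCH(q) model: sigma2(t+h|t) = omega + sum_i alpha_i * e2(t+h-i|t),
    where e2(t+j|t) is sigma2(t+j|t) for j > 0 and the observed squared
    residual for j <= 0.  `eps2` holds at least q in-sample eps_t^2."""
    e2 = list(eps2)            # extended with forecasts as we go
    out = []
    for _ in range(h):
        s2 = omega + sum(a * e2[-i] for i, a in enumerate(alpha, start=1))
        e2.append(s2)
        out.append(s2)
    return out
```

For a stationary ARCH(1), the forecasts converge to the unconditional variance ω/(1 − α₁) as h grows.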

Leaving out the last 10 observations for the forecasting experiment (as shown in the Estimate Model window below) and performing a 10-step-ahead forecast of the Nasdaq based on the AR(1)-ARCH(1) produces Figure 3.5.

The forecasted bands are ±2σ̂_{t+h|t}, which gives an approximate 95% confidence interval (note that the critical value 2 can be changed).

Furthermore, if we leave enough observations for the out-of-sample period, G@RCH can report some standard measures of forecasting performance derived from the forecast errors (see Box 5). Note that certain criteria are not computed on both the mean and the variance (because they are not relevant, e.g. Percentage Correct Sign (PCS) for the variance). In such a case, G@RCH reports a .NaN.

Box 5 - Forecast Evaluation Measures
Mean    Variance
Mean Squared Error(MSE)                              16.81       438.5
Median Squared Error(MedSE)                          11.71        72.5
Mean Error(ME)                                      -1.781       13.72
Mean Absolute Error(MAE)                              3.58       14.45
Root Mean Squared Error(RMSE)                          4.1       20.94
Mean Absolute Percentage Error(MAPE)                  .NaN       4.602
Adjusted Mean Absolute Percentage Error(AMAPE)        .NaN       0.674
Percentage Correct Sign(PCS)                           0.3        .NaN
Theil Inequality Coefficient(TIC)                   0.9795      0.8254
Logarithmic Loss Function(LL)                         .NaN       4.344

### 3.11 Further Options

Finally, three options are also available in the Test menu: an Exclusion Restrictions dialog box, a Linear Restrictions dialog box and a Store in database dialog.

#### 3.11.1 Exclusion Restrictions Dialog Box

The Exclusion Restrictions dialog box option allows you to select explanatory variables and test whether they are jointly significant. A more general form is the test for linear restrictions.

Mark all the variables you wish to include in the test in this Multiple-Selection List box. G@RCH tests whether the selected variables can be deleted from the model.

#### 3.11.2 Linear Restrictions Dialog Box

Tests for linear restrictions are specified in the form of a matrix R, and a vector r. These are entered as one matrix [R : r] in the dialog. (This is more general than testing for exclusion restrictions).

The first four columns are the columns of R, specifying two restrictions. The last column is r, which specifies what the restrictions should add up to.

The dimensions of the matrix must be specified in the rows and columns fields. It is your responsibility to specify the right values, G@RCH will not try to work it out (because elements of a row may be spread over several lines).

• Rows : The number of rows in the matrix.
• Columns : The number of columns in the matrix.
• Matrix : This window is a basic text editor in which you can edit a matrix file. Here you can enter the R:r matrix as in the above example.
• Set to zero : This could be useful to create an initial matrix. Select variables in the model box (this is a multiple-selection list box) and press this button to specify the R : r matrix which corresponds to the restriction that each selected variable has coefficient zero (so one row for each selected variable).
• Load : Enables you to load an existing matrix file into the editor. Any existing matrix in the editor will be lost.
• Save : Enables you to save the contents of the editor in a matrix file, so that it can be used again.

#### 3.11.3 Store in Database Dialog

Finally, the residuals, the squared residuals, the standardized residuals, the conditional variance and the h-step-ahead forecasts (of the conditional mean and the conditional variance) can be stored in the database as new variables. When selecting this option, a first window appears and the user selects the series to be stored. A default name is then proposed for this series.

### 3.12 The random walk hypothesis (RWH)

This section has been largely inspired by Taylor (1995).

Are prices (or returns) of financial assets forecastable? According to the random walk hypothesis (RWH), they are not. The steps of a random walk are unpredictable. The RWH associates steps with returns, so that returns cannot be predicted from past values. In G@RCH, the RWH is defined by two statements about the stochastic process that generates observed returns: the mean is stationary, E(r_t) = μ, and returns are uncorrelated, corr(r_t, r_{t−τ}) = 0 for all τ ≠ 0.

When RWH is true, the latest return and all past returns are irrelevant if we attempt to predict future returns using linear predictors. In this case, the variables can be assumed to be a martingale difference and thus expected returns do not depend on the history of time series information.15

#### 3.12.1 The Variance-ratio test

The Variance-ratio test of Lo and MacKinlay (1988) is designed to detect either mean reversion or price-trend behaviour. Let x_t = log(p_t) be the price logarithm (with any dividends reinvested). Then one-period returns are r_t = x_t − x_{t−1} and N-period returns are r_t + … + r_{t+N−1} = x_{t+N−1} − x_{t−1}. Also, assuming the returns process is stationary, let V(N) = var(r_t + … + r_{t+N−1}). When the RWH is true, V(N) = N·V(1). The test attempts to decide if this property is true. N is any integer ≥ 2.

For any stationary process, the variance-ratio is

V(N)/[N·V(1)] = 1 + 2 Σ_{τ=1}^{N−1} (1 − τ/N) ρ_τ,

by using the formula for the variance of a linear combination, with ρ_τ the autocorrelation of returns at lag τ. Consequently, the variance-ratio test is appropriate when any dependence occurs within the first N − 1 lags and when the autocorrelations all have the same sign. It is a good test when the alternative to the RWH is an ARMA(1,1) process that has a positive autoregressive parameter for returns.

Implementation of the test requires:

1. Estimates of V (1),V (N).
2. An estimate of the standard error of the variance-ratio estimate.
3. Five steps to implement the tests.

Assume there are n returns.

Step 1 : Calculate the average return r̄ = n⁻¹ Σ_{t=1}^{n} r_t and unbiased variance estimates V̂(1) and V̂(N) of the one-period and N-period returns.

Step 2 : Calculate the sample variance-ratio as

V̂R(N) = V̂(N)/[N·V̂(1)].

Reject the RWH if V̂R(N) is far from 1.

Recall that

V̂R(N) ≈ 1 + 2 Σ_{τ=1}^{N−1} (1 − τ/N) ρ̂_τ

and, under H0, E[V̂R(N)] = 1, so that

Var[V̂R(N)] = 4 Σ_{τ=1}^{N−1} (1 − τ/N)² Var(ρ̂_τ),

since the sample autocorrelations ρ̂_τ are asymptotically uncorrelated under the null. An appropriate estimate of n·Var(ρ̂_τ) is provided by b_τ.

Step 3 : Estimate variances for the sample autocorrelations ρ̂_τ of the n returns, for 1 ≤ τ ≤ N − 1, under the assumption that the RWH is true.

A very good approximation is given by b_τ ≈ 1 + (k_u − 1)ρ̂_{τ,s}, where k_u is the sample kurtosis of the returns r_t and ρ̂_{τ,s} is the sample autocorrelation at lag τ of the time series defined by s_t = (r_t − r̄)².

Step 4 : Estimate a variance for the variance-ratio V̂R(N), under the assumption that the RWH is true. This estimate is

V̂ar[V̂R(N)] = (4/n) Σ_{τ=1}^{N−1} (1 − τ/N)² b_τ.

Step 5 : Calculate the test statistic

z = [V̂R(N) − 1] / √(V̂ar[V̂R(N)]).

The asymptotic distribution of this statistic is N(0,1) when the RWH is true.
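The five steps can be sketched as follows in the simplified homoskedastic case where b_τ = 1 for all τ (the classical statistic under i.i.d. returns, for which the variance collapses to 2(2N−1)(N−1)/(3Nn)). This is an illustrative sketch with a hypothetical function name, not G@RCH's implementation, which uses the b_τ weights above:

```python
import numpy as np

def vr_test(r, N):
    """Variance-ratio z-statistic under the iid null (b_tau = 1).
    V(N) is estimated from overlapping N-period returns; the
    asymptotic variance is then 2(2N-1)(N-1)/(3Nn)."""
    r = np.asarray(r, dtype=float)
    n = len(r)
    mu = r.mean()
    v1 = np.sum((r - mu) ** 2) / (n - 1)             # V(1) estimate
    rN = np.convolve(r, np.ones(N), mode="valid")    # overlapping N-period sums
    vN = np.sum((rN - N * mu) ** 2) / (len(rN) - 1)  # V(N) estimate
    vr = vN / (N * v1)                               # sample variance-ratio
    se = np.sqrt(2 * (2 * N - 1) * (N - 1) / (3 * N * n))
    return (vr - 1) / se
```

Positive serial correlation pushes the ratio above 1 (price trends); mean reversion pushes it below 1.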

Monte Carlo experiment

To study the performance of this test, let us consider a first simulation study (see Simulation_VR.ox). We simulate 1000 series of T = 2000 observations following a N-AR(0) model with μ = 0.1, i.e. (y_t − μ) = ε_t, where ε_t ~ N(0, σ²) and σ = 0.2. We then apply the Variance-ratio test with N = 2, 3 and 10. The 1000 realisations of the VR statistics are plotted in Figure 3.6 while the empirical distribution of the VR statistics is plotted in Figure 3.7 (together with a N(0,1)).

We see from Figure 3.6 that the VR statistic varies between -3 and 3. Furthermore, Figure 3.7 suggests that the VR statistic is close to being standard normally distributed.

The number of times that the null is rejected at the α% nominal size (called the empirical size of the test) is reported below.

Empirical size with alpha=0.10 and M=2:8.1
Empirical size with alpha=0.05 and M=2:3.3
Empirical size with alpha=0.01 and M=2:0.4
Empirical size with alpha=0.10 and M=3:7.5
Empirical size with alpha=0.05 and M=3:3.7
Empirical size with alpha=0.01 and M=3:0.8
Empirical size with alpha=0.10 and M=10:8.2
Empirical size with alpha=0.05 and M=10:4.7
Empirical size with alpha=0.01 and M=10:1.4

These results suggest that the test suffers from a minor size distortion for the three considered values of M.

To study the power of the test, we now simulate T = 2000 observations of a N-AR(1) model with μ = 0.1 and ρ = 0.05, i.e. (1 − ρL)(y_t − μ) = ε_t, where ε_t ~ N(0, σ²) and σ = 0.2. To obtain this configuration of the model, choose the following options:

decl m_vAR=<0.05>;
decl m_cAR=1;

The empirical distribution of the VR statistics (under the alternative) is plotted in Figure 3.8 (together with a N(0,1)).

Both Figure 3.8 and the output here below suggest that the VR test has good power to detect the presence of serial correlation in the series.

Empirical power with alpha=0.10 and M=2:69.5
Empirical power with alpha=0.05 and M=2:58.4
Empirical power with alpha=0.01 and M=2:33.3
Empirical power with alpha=0.10 and M=3:61.5
Empirical power with alpha=0.05 and M=3:50.4
Empirical power with alpha=0.01 and M=3:28
Empirical power with alpha=0.10 and M=10:32.1
Empirical power with alpha=0.05 and M=10:22.9
Empirical power with alpha=0.01 and M=10:8.9

A potential weakness of the VR test is that it assumes a constant variance while it is clear that most financial time series do not have constant conditional variance. To study the impact of the presence of GARCH effects on the VR test, let us now simulate T = 2000 observations of a N-AR(0)-GARCH(1,1) model with μ = 0.1, i.e. (yt - μ) = εt, where εt ~ N(0,σt2), σt2 = ω + α1εt-12 + β1σt-12 with ω = 0.2,α1 = 0.1 and β1 = 0.8.

Again, we simulate 1000 series following this DGP and apply the VR test with the same values for M.

Now choose the following options in Simulation_VR.ox:

decl m_cModel=1;
decl m_cAR=0;

Empirical size with alpha=0.10 and M=2:12.1
Empirical size with alpha=0.05 and M=2:6.2
Empirical size with alpha=0.01 and M=2:0.9
Empirical size with alpha=0.10 and M=3:10.4
Empirical size with alpha=0.05 and M=3:5
Empirical size with alpha=0.01 and M=3:0.7
Empirical size with alpha=0.10 and M=10:8.5
Empirical size with alpha=0.05 and M=10:3.5
Empirical size with alpha=0.01 and M=10:0.7

The empirical distribution of the VR statistics is plotted in Figure 3.9 and the empirical size in the output just below. The main conclusion is that the presence of GARCH effects has a negligible impact on the size of the test.

However, the power of the test is affected by the presence of GARCH effects. Indeed, adding an AR(1) component (with ρ = 0.05) to this GARCH model reduces the power of the test, as suggested below.

To obtain these results, choose the following options in Simulation_VR.ox:

decl m_cModel=1;
decl m_vAR=<0.05>;
decl m_cAR=1;

Empirical power with alpha=0.10 and M=2:59.2
Empirical power with alpha=0.05 and M=2:47.4
Empirical power with alpha=0.01 and M=2:25.4
Empirical power with alpha=0.10 and M=3:53.1
Empirical power with alpha=0.05 and M=3:40.6
Empirical power with alpha=0.01 and M=3:21.4
Empirical power with alpha=0.10 and M=10:29.9
Empirical power with alpha=0.05 and M=10:19.7
Empirical power with alpha=0.01 and M=10:8.4

To overcome this problem some authors have proposed to apply the VR test on the rescaled or standardized residuals of a GARCH-type model.

This is illustrated in Simulation_VR_GARCH.ox.

As suggested by the next output, applying the VR test on the rescaled returns restores the original power of the test (i.e. that of the AR(1) case without GARCH effects).

Empirical power with alpha=0.10 and M=2:69.5
Empirical power with alpha=0.05 and M=2:58.4
Empirical power with alpha=0.01 and M=2:33.3
Empirical power with alpha=0.10 and M=3:61.5
Empirical power with alpha=0.05 and M=3:50.4
Empirical power with alpha=0.01 and M=3:28
Empirical power with alpha=0.10 and M=10:32.1
Empirical power with alpha=0.05 and M=10:22.9
Empirical power with alpha=0.01 and M=10:8.9

See Taylor (1995) for more details about the VR tests as well as some applications on real data.

#### 3.12.2 Runs test

Another popular test of the RWH is the runs test, first used by Fama (1965). The runs test (also called the Wald-Wolfowitz test) is a non-parametric test that checks the randomness hypothesis of a two- or three-valued data sequence. It is similar to a first-order autocorrelation test but is non-parametric and requires no assumption of normality or stationarity of the distribution. A ‘run’ is a sequence of adjacent equal elements. For example, the sequence “1111000111001111110000” is divided into six runs, three of which consist of 1’s and the others of 0’s. If the 1’s and 0’s alternate randomly, the number of runs is a random variable whose asymptotic distribution is normal.

Let q_t be the sign of the return r_t: q_t equals 1, 0 or −1, respectively, for positive, zero and negative values of r_t. Let c_t be 1 if q_t ≠ q_{t+1} and 0 otherwise. The total number of runs of all types is

C = 1 + Σ_{t=1}^{n−1} c_t.

Suppose there are n₁ positive returns, n₂ zero returns, and n₃ negative returns in a series of n returns (when the mean is not zero, apply the test to r_t − r̄). Then the mean and variance of the random variable generating C, conditional upon n₁, n₂ and n₃, are

E(C) = [n(n + 1) − Σ_i n_i²]/n,
Var(C) = [Σ_i n_i² (Σ_i n_i² + n(n + 1)) − 2n Σ_i n_i³ − n³] / [n²(n − 1)],

when the signs q_t are generated by i.i.d. variables (Mood, 1940).

The statistic C has an approximately normal distribution for large n. Tests can then be decided by evaluating

K = [C − E(C)] / √Var(C),

with the RWH rejected at the 5% level if |K| > 1.96. The runs test avoids all the problems created by conditional heteroskedasticity. Note that trends in prices (vs. no trend) would give fewer runs than expected.
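A compact sketch of the statistic K, assuming Mood's conditional moments as given above (the function name is hypothetical):

```python
import numpy as np

def runs_test(r):
    """Runs (Wald-Wolfowitz) test statistic K for the signs of returns,
    using Mood's conditional mean and variance of the number of runs C
    given the category counts n_i (positive, zero, negative)."""
    q = np.sign(np.asarray(r, dtype=float))    # +1, 0, -1
    n = len(q)
    C = 1 + np.sum(q[1:] != q[:-1])            # total number of runs
    counts = np.array([np.sum(q == v) for v in (1.0, 0.0, -1.0)])
    s2 = np.sum(counts ** 2)
    s3 = np.sum(counts ** 3)
    mean_C = (n * (n + 1) - s2) / n
    var_C = (s2 * (s2 + n * (n + 1)) - 2 * n * s3 - n ** 3) / (n ** 2 * (n - 1))
    return (C - mean_C) / np.sqrt(var_C)
```

Because only the signs of the returns are used, the statistic is insensitive to volatility clustering, at the cost of discarding magnitude information.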

To study the performance of the runs test, let us consider a first simulation study (see Simulation_VR.ox). We simulate 1000 series of T = 2000 observations following a N-AR(0) model with μ = 0.1, i.e. (y_t − μ) = ε_t, where ε_t ~ N(0, σ²) and σ = 0.2. We then apply the runs test. The empirical distribution of the K statistic is plotted in Figure 3.10 (together with a N(0,1)).

The output reported here below also suggests that the test has a correct size at the three chosen significance levels. Results not reported here suggest (as expected) that the test is not affected by the presence of GARCH effects.

Empirical size with alpha=0.10:11.1
Empirical size with alpha=0.05:5.1
Empirical size with alpha=0.01:0.9

Let us now study the power of the runs test by considering a N-AR(1) model. The results suggest that this test suffers from a reduction in power (compared to the VR test on the same DGP) due to the loss of information in the transformation from returns to their signs.

Empirical power with alpha=0.10:41.8
Empirical power with alpha=0.05:30.7
Empirical power with alpha=0.01:12.5

#### 3.12.3 Rescaled Range Tests

Range statistics that have power to detect long-term dependence were first developed for hydrological data by Hurst (1951) and later applied to financial returns.

Lo (1991) provides many references and a rigorous description of appropriate tests when the preferred alternative to randomness is long-term dependence. The range defined by a set of n returns is

R = max_{1≤k≤n} Σ_{t=1}^{k} (r_t − r̄) − min_{1≤k≤n} Σ_{t=1}^{k} (r_t − r̄).

R/S-test statistics are ranges divided by scaled standard deviations. Two special cases define Mandelbrot (1972)’s and Lo (1991)’s test statistics:

(R∕S)_Man = R/(s√n)

and

(R∕S)_Lo = R/(σ̂(q)√n),

with s² the sample variance of returns and σ̂²(q) a long-run variance estimate that adds to s² the sample autocovariances up to lag q, weighted by 1 − j/(q + 1).

Under certain assumptions, the distributions of these statistics converge, as n and q increase, to that of the range of a Brownian bridge on the unit interval. The rule is the following: do not reject the null when the statistic falls in

• [0.861 ; 1.747] at the 10% level,
• [0.809 ; 1.862] at the 5% level,
• [0.721 ; 2.098] at the 1% level.

The null hypothesis of an uncorrelated process can be tested using (R∕S)Man, while Lo focuses on the null hypothesis of a short-memory process, for which the appropriate test statistic is (R∕S)Lo.
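Both statistics can be sketched as follows (a sketch under one common convention for the sample autocovariances, with a hypothetical function name):

```python
import numpy as np

def rs_statistics(r, q):
    """Rescaled-range statistics: Mandelbrot's classical R/S, and Lo's
    modified version which replaces the sample standard deviation by a
    long-run estimate with Bartlett weights up to lag q.  Both are
    normalized by sqrt(n)."""
    r = np.asarray(r, dtype=float)
    n = len(r)
    d = r - r.mean()
    partial = np.cumsum(d)
    R = partial.max() - partial.min()          # the range
    s2 = np.mean(d ** 2)                       # sample variance
    lrv = s2 + 2 * sum((1 - j / (q + 1)) * np.mean(d[j:] * d[:-j])
                       for j in range(1, q + 1))
    rs_man = R / (np.sqrt(n) * np.sqrt(s2))
    rs_lo = R / (np.sqrt(n) * np.sqrt(lrv))
    return rs_man, rs_lo
```

The Bartlett weights guarantee a nonnegative long-run variance estimate; for an uncorrelated series the two statistics are close, since the weighted autocovariances are then near zero.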

To study the performance of the R/S tests, let us consider the following simulation study (see Simulation_RS.ox). We simulate 1000 series of T = 2000 observations following a N-AR(0) model with μ = 0.1, i.e. (y_t − μ) = ε_t, where ε_t ~ N(0, σ²) and σ = 0.2. We then apply the (R∕S)Man test and the (R∕S)Lo test with q = 10 and 20.

The empirical size of the tests is reported below. The results suggest that the tests are correctly sized.

(R/S)Man Test
Empirical size with alpha=0.10:10.6
Empirical size with alpha=0.05:5.3
Empirical size with alpha=0.01:1.4

(R/S)Lo Test
Empirical size with alpha=0.10 and q=10:9.5
Empirical size with alpha=0.05 and q=10:5.3
Empirical size with alpha=0.01 and q=10:1.5
Empirical size with alpha=0.10 and q=20:9.1
Empirical size with alpha=0.05 and q=20:5
Empirical size with alpha=0.01 and q=20:1.1

The next output corresponds to the AR(1) case with ρ = 0.05, obtained with the following options

decl m_vAR=<0.05>;
decl m_cAR=1;

It is clear that the tests have no power against short memory processes.

(R/S)Man Test
Empirical power with alpha=0.10:11
Empirical power with alpha=0.05:5.5
Empirical power with alpha=0.01:1.4

(R/S)Lo Test
Empirical power with alpha=0.10 and q=10:9.7
Empirical power with alpha=0.05 and q=10:5.4
Empirical power with alpha=0.01 and q=10:1.4
Empirical power with alpha=0.10 and q=20:8.9
Empirical power with alpha=0.05 and q=20:4.8
Empirical power with alpha=0.01 and q=20:1.1

With a much higher value of ρ, i.e. ρ = 0.5, we see that the statistic of Lo (1991) remains close to its nominal size while the one of Mandelbrot (1972) has some power to detect short memory.

(R/S)Man Test
Empirical power with alpha=0.10:74.4
Empirical power with alpha=0.05:65.6
Empirical power with alpha=0.01:45.1

(R/S)Lo Test
Empirical power with alpha=0.10 and q=10:10.6
Empirical power with alpha=0.05 and q=10:6.2
Empirical power with alpha=0.01 and q=10:1.2
Empirical power with alpha=0.10 and q=20:9.4
Empirical power with alpha=0.05 and q=20:5.5
Empirical power with alpha=0.01 and q=20:1.2

Results not reported here suggest that these two statistics are not affected by the presence of GARCH effects.

To study the ability of the (R∕S)Lo statistic to detect long-range dependence, we now use the ARFIMA package of Doornik and Ooms (1999) to simulate T = 2000 observations of a N-ARFIMA(0,0.2,0) model with μ = 0.1, i.e. (1 − L)^0.2(y_t − μ) = ε_t, where ε_t ~ N(0, σ²) and σ = 0.2.

The results of this simulation have been obtained using the example file Simulation_RS.ox with the following options:

decl m_cAR=0;
decl m_cD=1;
decl m_vD=0.2;

The next output suggests that the (R∕S)Lo statistic indeed has power to detect long memory.

(R/S)Man Test
Empirical size with alpha=0.10:96.1
Empirical size with alpha=0.05:94.6
Empirical size with alpha=0.01:87.2

(R/S)Lo Test
Empirical size with alpha=0.10 and q=10:64.4
Empirical size with alpha=0.05 and q=10:55.9
Empirical size with alpha=0.01 and q=10:37.6
Empirical size with alpha=0.10 and q=20:47.5
Empirical size with alpha=0.05 and q=20:37.9
Empirical size with alpha=0.01 and q=20:22

## Chapter 4: Further Univariate GARCH Models

While Engle (1982) is certainly the most important contribution in financial econometrics, the ARCH model is rarely used in practice due to its simplicity.

A useful generalization of this model is the GARCH model introduced by Bollerslev (1986). This model is also a weighted average of past squared residuals, but it has declining weights that never go completely to zero. This model is more parsimonious than the ARCH and, even in its simplest form, has proven surprisingly successful in predicting conditional variances.

The GARCH model is not the only extension. Indeed, G@RCH 6.1 proposes no less than 12 specifications. As shown below, choosing an alternative model is extremely easy in G@RCH.

### 4.1 GARCH Model

The Generalized ARCH (GARCH) model of Bollerslev (1986) is based on an infinite ARCH specification and allows one to reduce the number of estimated parameters by imposing nonlinear restrictions on them. The GARCH(p,q) model can be expressed as:

σ_t² = ω + Σ_{i=1}^{q} α_i ε²_{t−i} + Σ_{j=1}^{p} β_j σ²_{t−j}.  (4.1)

Using the lag (or backshift) operator L, the GARCH(p,q) model becomes:

σ_t² = ω + α(L)ε_t² + β(L)σ_t²,

with α(L) = α₁L + α₂L² + … + α_qL^q and β(L) = β₁L + β₂L² + … + β_pL^p.

If all the roots of the polynomial |1 − β(L)| = 0 lie outside the unit circle, we have:

σ_t² = ω[1 − β(L)]⁻¹ + α(L)[1 − β(L)]⁻¹ ε_t²,  (4.2)

which may be seen as an ARCH(∞) process since the conditional variance linearly depends on all previous squared residuals. In this case, the conditional variance of y_t can become larger than the unconditional variance given by

σ² = ω / [1 − Σ_{i=1}^{q} α_i − Σ_{j=1}^{p} β_j]

if past realizations of ε_t² are larger than σ² (Palm, 1996).

Applying variance targeting to the GARCH model implies replacing ω by σ²[1 − α(1) − β(1)], where σ² is the unconditional variance of εt, which can be consistently estimated by its sample counterpart.1
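In code, variance targeting amounts to one line: fix ω at the value that makes the implied unconditional variance match the sample variance of the residuals, so only the dynamic parameters are left to estimate. A minimal GARCH(1,1) sketch (hypothetical helper name):

```python
import numpy as np

def targeted_omega(eps, alpha1, beta1):
    """Variance targeting for a GARCH(1,1): set omega so that the
    implied unconditional variance omega/(1 - alpha1 - beta1) equals
    the sample variance of the residuals."""
    sigma2_hat = np.var(eps)        # consistent estimate of var(eps)
    return sigma2_hat * (1.0 - alpha1 - beta1)
```

This removes ω from the numerical optimization, which often improves stability when α₁ + β₁ is close to 1.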

Bollerslev (1986) has shown that, for a GARCH(1,1) with normal innovations, the kurtosis of y is 3[1 − (α₁ + β₁)²] / [1 − (α₁ + β₁)² − 2α₁²] > 3. The autocorrelations of ε_t² have been derived by Bollerslev (1986). For a stationary GARCH(1,1), ρ₁ = α₁ + α₁²β₁/(1 − 2α₁β₁ − β₁²), and ρ_k = (α₁ + β₁)^{k−1} ρ₁ for k = 2, 3, … In other words, the autocorrelations decline exponentially with a decay factor of α₁ + β₁.
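These closed-form moments are easy to evaluate. For instance, with the values α₁ = 0.1 and β₁ = 0.8 used in the simulation studies above:

```python
# GARCH(1,1) with normal innovations: population kurtosis of y and the
# autocorrelation function of eps^2, from the formulas above.
a1, b1 = 0.1, 0.8
p = a1 + b1                                            # persistence (0.9)
kurt = 3 * (1 - p ** 2) / (1 - p ** 2 - 2 * a1 ** 2)   # > 3: fat tails
rho1 = a1 + a1 ** 2 * b1 / (1 - 2 * a1 * b1 - b1 ** 2)
acf = [rho1 * p ** (k - 1) for k in range(1, 6)]       # geometric decay
```

Here the kurtosis is about 3.35 and ρ₁ = 0.14, after which the autocorrelations of ε_t² shrink by a factor 0.9 per lag.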

As in the ARCH case, some restrictions are needed to ensure σ_t² to be positive for all t. Bollerslev (1986) shows that imposing ω > 0, α_i ≥ 0 (for i = 1,…,q) and β_j ≥ 0 (for j = 1,…,p) is sufficient for the conditional variance to be positive. In practice, the GARCH parameters are often estimated without the positivity constraints. Nelson and Cao (1992) argue that imposing all coefficients to be nonnegative is too restrictive and that some of these coefficients are found to be negative in practice while the conditional variance remains positive (by checking on a case-by-case basis). Consequently, they relaxed this constraint and gave sufficient conditions for the GARCH(1,q) and GARCH(2,q) cases based on the infinite representation given in Equation (4.2). Indeed, the conditional variance is strictly positive provided ω[1 − β(1)]⁻¹ is positive and all the coefficients of the infinite polynomial α(L)[1 − β(L)]⁻¹ in Equation (4.2) are nonnegative. The positivity constraints proposed by Bollerslev (1986) can be imposed during the estimation (see Section 3.9). If not, these constraints, as well as the ones implied by the ARCH(∞) representation, will be tested a posteriori and reported in the output (if there is no explanatory variable in the conditional variance equation).

Estimation results of an ARMA(1,0)-GARCH(1,1) model by QML are reported in the next Output (labelled Box 6).

Box 6 - Output of the ARMA(1,0)-GARCH(1,1) by QMLE
******************************
** G@RCH( 3) SPECIFICATIONS **
******************************
Dependent variable : Nasdaq
Mean Equation : ARMA (1, 0) model.
1 regressor(s) in the mean.
Variance Equation : GARCH (1, 1) model.
1 regressor(s) in the variance.
The distribution is a Gauss distribution.

Strong convergence using numerical derivatives
Log-likelihood = -5370.86
Please wait : Computing the Std Errors ...

Robust Standard Errors (Sandwich formula)
Coefficient  Std.Error  t-value  t-prob
Cst(M)               0.116210   0.015919    7.300  0.0000
Monday (M)          -0.174801   0.028920   -6.044  0.0000
AR(1)                0.193905   0.017177    11.29  0.0000
Cst(V)               0.009292   0.012566   0.7394  0.4597
Friday (V)           0.067380   0.059745    1.128  0.2595
ARCH(Alpha1)         0.164808   0.028030    5.880  0.0000
GARCH(Beta1)         0.826238   0.026107    31.65  0.0000

No. Observations :      4093  No. Parameters  :         7
Mean (Y)         :   0.05517  Variance (Y)    :   1.59189
Skewness (Y)     :  -0.74128  Kurtosis (Y)    :  14.25531
Log Likelihood   : -5370.858  Alpha[1]+Beta[1]:   0.99105

The sample mean of squared residuals was used to start recursion.

Estimated Parameters Vector :
0.116210;-0.174801; 0.193905; 0.009292; 0.067380; 0.164808; 0.826243

Elapsed Time : 2.975 seconds (or 0.0495833 minutes).

Interestingly, the log-likelihood now equals -5370.858, against -6106.357 for the corresponding ARCH(1) model. A likelihood ratio test (LRT), asymptotically distributed as χ²(1) under the null, strongly rejects the ARCH(1) model in favour of the GARCH(1,1).
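The arithmetic of this test can be sketched in plain Python, using the two log-likelihoods quoted above:

```python
# Likelihood ratio test of ARCH(1) against GARCH(1,1); under the null
# (beta1 = 0) the statistic is asymptotically chi-squared with 1 degree of freedom.
ll_garch11 = -5370.858
ll_arch1 = -6106.357
lrt = 2 * (ll_garch11 - ll_arch1)
print(round(lrt, 3))  # 1470.998, far above the 5% chi2(1) critical value of 3.84
```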

Furthermore, we report below the same 6 misspecification tests as for the ARCH(1) model. The output is reported in Box 7 - Misspecification Tests.

Box 7 - Misspecification Tests
TESTS :
------------
Information Criterium (to be minimized)
Akaike          2.627832  Shibata         2.627826
Schwarz         2.638636  Hannan-Quinn    2.631657
---------------

Statistic       t-Test      P-Value
Skewness            -0.65157       17.024  5.4349e-065
Excess Kurtosis       2.6513       34.645  5.2745e-263
Jarque-Bera           1488.4         .NaN      0.00000
---------------

Q-Statistics on Standardized Residuals
--> P-values adjusted by 1 degree(s) of freedom
Q(  5) =  4.57820   [0.3333749]
Q( 10) =  5.22369   [0.8143890]
Q( 20) =  18.7431   [0.4734248]
Q( 50) =  58.5113   [0.1657199]
H0 : No serial correlation ==> Accept H0 when prob. is High [Q < Chisq(lag)]
---------------

Q-Statistics on Squared Standardized Residuals
--> P-values adjusted by 2 degree(s) of freedom
Q(  5) =  3.79989   [0.2838986]
Q( 10) =  7.15170   [0.5203577]
Q( 20) =  16.8830   [0.5311632]
Q( 50) =  46.2948   [0.5429339]
H0 : No serial correlation ==> Accept H0 when prob. is High [Q < Chisq(lag)]
---------------

# Cells(g)  Statistic      P-Value(g-1)     P-Value(g-k-1)
40      201.6787         0.000000          0.000000
50      221.5492         0.000000          0.000000
60      230.5035         0.000000          0.000000
Rem.: k = 7 = # estimated parameters
---------------

Residual-Based Diagnostic for Conditional Heteroskedasticity of Tse(2001)

RBD( 2) =  2.02370   [0.3635463]
RBD( 5) =  3.62445   [0.6046453]
RBD(10) =  6.84825   [0.7396890]
------------------------------------------------
P-values in brackets

Unlike the ARCH(1) model, the Q-statistics on standardized and squared standardized residuals, as well as the RBD test at various lag values, suggest that the GARCH(1,1) does a good job in modelling the dynamics of the first two conditional moments of the NASDAQ.

However, the adjusted Pearson Chi-square goodness-of-fit test (with different cell numbers) still points out some misspecification of the overall conditional distribution.

Several authors have proposed to use a Student-t or GED distribution in combination with a GARCH model to capture the fat tails of high-frequency financial time-series. Furthermore, since the NASDAQ seems to be skewed, a skewed-Student distribution might be justified (see Section 6 for a discussion about non-normal distributions).

### 4.2 EGARCH Model

The Exponential GARCH (EGARCH) model, originally introduced by Nelson (1991), is re-expressed in Bollerslev and Mikkelsen (1996) as follows:

log(σ_t²) = ω + [1 − β(L)]⁻¹ [1 + α(L)] g(z_{t−1})  (4.3)

The value of g(z_t) depends on several elements. Nelson (1991) notes that, “to accommodate the asymmetric relation between stock returns and volatility changes (...) the value of g(z_t) must be a function of both the magnitude and the sign of z_t”. That is why he suggests expressing the function g(.) as

g(z_t) = θ₁ z_t + θ₂ [|z_t| − E|z_t|]  (4.4)

E|z_t| depends on the assumption made on the unconditional density of z_t. Indeed, for the normal distribution,

E|z_t| = √(2/π)  (4.5)

For the skewed-Student distribution,

 (4.6)

where ξ = 1 for the symmetric Student.

For the GED, we have

 (4.7)

ξ, υ and λ_υ concern the shape of the non-normal densities and were defined in Section 3.6.2.

Note that the use of a log transformation of the conditional variance ensures that σt2 is always positive.
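A minimal sketch of an EGARCH(1,1)-type recursion illustrates this point (Python, with a simplified textbook parameterization rather than the exact Bollerslev-Mikkelsen polynomial form estimated by G@RCH; names are illustrative):

```python
import math

SQRT_2_OVER_PI = math.sqrt(2.0 / math.pi)  # E|z| under a standard normal

def egarch_g(z, theta1, theta2):
    """News impact function g(z): sign effect (theta1) plus magnitude effect (theta2)."""
    return theta1 * z + theta2 * (abs(z) - SQRT_2_OVER_PI)

def egarch11_sigma2(z, omega, beta1, theta1, theta2, logsig2_0=0.0):
    """Simplified EGARCH(1,1) recursion on log sigma^2;
    exponentiating guarantees a positive conditional variance."""
    logsig2, out = logsig2_0, []
    for zt in z:
        out.append(math.exp(logsig2))
        logsig2 = omega + beta1 * logsig2 + egarch_g(zt, theta1, theta2)
    return out
```

Whatever the parameter values, exponentiating the recursion keeps σ_t² strictly positive, which is the point of the log transformation.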

Applying variance targeting to the EGARCH model implies replacing ω by log(σ2), where σ2 is the unconditional variance of εt, which can be consistently estimated by its sample counterpart.3

The output reported below corresponds to the ARMA(1,0)-EGARCH(1,1) with a GED distribution. Interestingly, both θ1 and θ2 are significant. Note that the degree of freedom of the GED distribution is significantly lower than 2, confirming that the standardized residuals are fat-tailed.

Box 8 - EGARCH
******************************
** G@RCH( 4) SPECIFICATIONS **
******************************
Dependent variable : Nasdaq
Mean Equation : ARMA (1, 0) model.
1 regressor(s) in the mean.
Variance Equation : EGARCH (1, 1) model.
1 regressor(s) in the variance.
The distribution is a GED distribution, with a tail coefficient of 1.31814.

Strong convergence using numerical derivatives
Log-likelihood = -5233.92
Please wait : Computing the Std Errors ...

Robust Standard Errors (Sandwich formula)
Coefficient  Std.Error  t-value  t-prob
Cst(M)               0.133901   0.010949    12.23  0.0000
Monday (M)          -0.166042   0.026034   -6.378  0.0000
AR(1)                0.188276   0.014182    13.28  0.0000
Cst(V)              -0.342968    0.21183   -1.619  0.1055
Friday (V)          -0.203655   0.079556   -2.560  0.0105
ARCH(Alpha1)        -0.355466    0.14267   -2.491  0.0128
GARCH(Beta1)         0.987227  0.0058283    169.4  0.0000
EGARCH(Theta1)      -0.085730   0.020159   -4.253  0.0000
EGARCH(Theta2)       0.305338   0.037692    8.101  0.0000
G.E.D.(DF)           1.318137   0.052628    25.05  0.0000

No. Observations :      4093  No. Parameters  :        10
Mean (Y)         :   0.05517  Variance (Y)    :   1.59189
Skewness (Y)     :  -0.74128  Kurtosis (Y)    :  14.25531
Log Likelihood   : -5233.919

The sample mean of squared residuals was used to start recursion.

Estimated Parameters Vector :
0.133901;-0.166042; 0.188276;-0.342968;-0.203655;-0.355466;
0.987227;-0.085730; 0.305338; 1.318142

### 4.3 GJR Model

This popular model is proposed by Glosten, Jagannathan, and Runkle (1993). Its generalized version is given by:

σ_t² = ω + Σ_{i=1}^q (α_i ε²_{t−i} + γ_i S⁻_{t−i} ε²_{t−i}) + Σ_{j=1}^p β_j σ²_{t−j}  (4.8)

where S⁻_{t−i} is a dummy variable that takes the value 1 when ε_{t−i} is negative and 0 otherwise.

In this model, it is assumed that the impact of εt2 on the conditional variance σt2 is different when εt is positive or negative. The TGARCH model of Zakoian (1994) is very similar to the GJR but models the conditional standard deviation instead of the conditional variance. Ling and McAleer (2002) provide, among other stationarity conditions for various GARCH models, the conditions of existence of the second and fourth moment of the GJR.
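The asymmetric recursion can be sketched as follows (Python, illustrative names), where negative past shocks receive the extra loading γ₁:

```python
def gjr11_sigma2(eps, omega, alpha1, beta1, gamma1, sigma2_0):
    """GJR(1,1) recursion:
    sigma2_t = omega + (alpha1 + gamma1 * 1{eps_{t-1} < 0}) * eps_{t-1}^2
               + beta1 * sigma2_{t-1}."""
    sigma2 = [sigma2_0]
    for e in eps[:-1]:
        s_minus = 1.0 if e < 0 else 0.0  # dummy: 1 for negative past shocks
        sigma2.append(omega + (alpha1 + gamma1 * s_minus) * e * e + beta1 * sigma2[-1])
    return sigma2
```

With γ₁ > 0, a negative shock of a given size raises next period's variance more than a positive shock of the same size, which is the leverage effect discussed below.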

Applying variance targeting to the GJR model implies replacing ω by σ², where σ² is the unconditional variance of ε_t, which can be consistently estimated by its sample counterpart. Note that E(S⁻_t) equals 0.5 for symmetric distributions (i.e. Normal, Student and GED) and 1/(1 + ξ²) for the skewed-Student (SKST).4

A nice feature of the GJR model is that the null hypothesis of no leverage effect is easy to test. Indeed, γ₁ = … = γ_q = 0 implies that the news impact curve is symmetric, i.e. past positive shocks have the same impact on today’s volatility as past negative shocks.

The output reported below suggests the presence of such an effect on the NASDAQ since γ̂₁ = 0.107988 with a robust t-value of 2.826.

Box 9 - GJR
******************************
** G@RCH( 5) SPECIFICATIONS **
******************************
Dependent variable : Nasdaq
Mean Equation : ARMA (1, 0) model.
1 regressor(s) in the mean.
Variance Equation : GJR (1, 1) model.
1 regressor(s) in the variance.
The distribution is a Gauss distribution.

Strong convergence using numerical derivatives
Log-likelihood = -5354.08
Please wait : Computing the Std Errors ...

Robust Standard Errors (Sandwich formula)
Coefficient  Std.Error  t-value  t-prob
Cst(M)               0.093388   0.015714    5.943  0.0000
Monday (M)          -0.163882   0.028471   -5.756  0.0000
AR(1)                0.203798   0.018417    11.07  0.0000
Cst(V)               0.014065   0.013479    1.044  0.2968
Friday (V)           0.053779   0.060397   0.8904  0.3733
ARCH(Alpha1)         0.105963   0.018002    5.886  0.0000
GARCH(Beta1)         0.825532   0.026830    30.77  0.0000
GJR(Gamma1)          0.107988   0.038216    2.826  0.0047

No. Observations :      4093  No. Parameters  :         8
Mean (Y)         :   0.05517  Variance (Y)    :   1.59189
Skewness (Y)     :  -0.74128  Kurtosis (Y)    :  14.25531
Log Likelihood   : -5354.076

The sample mean of squared residuals was used to start recursion.

Estimated Parameters Vector :
0.093388;-0.163882; 0.203798; 0.014065; 0.053779; 0.105963; 0.825532; 0.107993

Elapsed Time : 4.086 seconds (or 0.0681 minutes).

### 4.4 APARCH Model

This model has been introduced by Ding, Granger, and Engle (1993). The APARCH (p,q) model can be expressed as:

σ_t^δ = ω + Σ_{i=1}^q α_i (|ε_{t−i}| − γ_i ε_{t−i})^δ + Σ_{j=1}^p β_j σ_{t−j}^δ  (4.9)

where δ > 0 and -1 < γi < 1 (i = 1,...,q).

The parameter δ plays the role of a Box-Cox transformation of σt while γi reflects the so-called leverage effect. Properties of the APARCH model are studied in He and Teräsvirta (1999a, 1999b).
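A sketch of the APARCH(1,1) recursion on σ^δ (Python, illustrative names; the Box-Cox power δ and the leverage parameter γ₁ enter exactly as in Equation (4.9)):

```python
def aparch11_sigma_delta(eps, omega, alpha1, beta1, gamma1, delta, sd_0):
    """APARCH(1,1) recursion:
    sigma_t^delta = omega + alpha1 * (|eps_{t-1}| - gamma1 * eps_{t-1})**delta
                    + beta1 * sigma_{t-1}^delta."""
    sd = [sd_0]  # series of sigma^delta
    for e in eps[:-1]:
        sd.append(omega + alpha1 * (abs(e) - gamma1 * e) ** delta + beta1 * sd[-1])
    return sd
```

Setting δ = 2 and γ₁ = 0 recovers the GARCH(1,1) recursion, while δ = 1 models the conditional standard deviation.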

The APARCH includes seven other ARCH extensions as special cases:5

- the ARCH of Engle (1982) when δ = 2, γ_i = 0 (i = 1,…,q) and β_j = 0 (j = 1,…,p);
- the GARCH of Bollerslev (1986) when δ = 2 and γ_i = 0;
- the GARCH of Taylor (1986)/Schwert (1990) when δ = 1 and γ_i = 0;
- the GJR of Glosten, Jagannathan, and Runkle (1993) when δ = 2;
- the TARCH of Zakoian (1994) when δ = 1;
- the NARCH of Higgins and Bera (1992) when γ_i = 0 and β_j = 0;
- the Log-ARCH of Geweke (1986) and Pantula (1986) when δ → 0.

Following Ding, Granger, and Engle (1993), if ω > 0 and Σ_{i=1}^q α_i E(|z| − γ_i z)^δ + Σ_{j=1}^p β_j < 1, a stationary solution for Equation (4.9) exists and is expressed as:

E(σ_t^δ) = ω / [1 − Σ_{i=1}^q α_i E(|z| − γ_i z)^δ − Σ_{j=1}^p β_j]  (4.10)

Notice that if we set γ_i = 0 and δ = 2, and if z_t has zero mean and unit variance, we recover the usual stationarity condition of the GARCH(1,1) model (α₁ + β₁ < 1). However, if γ_i ≠ 0 and/or δ ≠ 2, this condition depends on the assumption made on the innovation process.

Ding, Granger, and Engle (1993) derive a closed-form solution to κ_i = E(|z| − γ_i z)^δ in the Gaussian case. Lambert and Laurent (2001) show the corresponding expression for the standardized skewed-Student:6

For the GED, we can show that:

Note that ξ, υ and λ_υ concern the shape of the non-normal densities and were defined in Section 3.6.2.

Applying variance targeting to the APARCH model implies replacing ω by σδ, where σ is the unconditional standard deviation of εt, which can be consistently estimated by its sample counterpart.7

Once again, the APARCH model suggests the presence of a leverage effect on the NASDAQ (see Box 10). Importantly, δ̂ = 1.146366 is significantly different from 2 but not significantly different from 1. This suggests that, instead of modelling the conditional variance (GARCH), it is more relevant in this case to model the conditional standard deviation. This result is in line with those of Taylor (1986), Schwert (1990) and Ding, Granger, and Engle (1993), who indicate that there is substantially more correlation between absolute returns than between squared returns, a stylized fact of high-frequency financial returns (often called ‘long memory’).

Box 10 - APARCH
******************************
** G@RCH( 6) SPECIFICATIONS **
******************************
Dependent variable : Nasdaq
Mean Equation : ARMA (1, 0) model.
1 regressor(s) in the mean.
Variance Equation : APARCH (1, 1) model.
1 regressor(s) in the variance.
The distribution is a Gauss distribution.

Strong convergence using numerical derivatives
Log-likelihood = -5341.37
Please wait : Computing the Std Errors ...

Robust Standard Errors (Sandwich formula)
Coefficient  Std.Error  t-value  t-prob
Cst(M)               0.088279   0.019660    4.490  0.0000
Monday (M)          -0.167957   0.028627   -5.867  0.0000
AR(1)                0.194576   0.018647    10.43  0.0000
Cst(V)               0.020683   0.011917    1.736  0.0827
Friday (V)           0.028304   0.050441   0.5611  0.5747
ARCH(Alpha1)         0.154370   0.024000    6.432  0.0000
GARCH(Beta1)         0.855355   0.024859    34.41  0.0000
APARCH(Gamma1)       0.288072   0.076177    3.782  0.0002
APARCH(Delta)        1.146366    0.20559    5.576  0.0000

No. Observations :      4093  No. Parameters  :         9
Mean (Y)         :   0.05517  Variance (Y)    :   1.59189
Skewness (Y)     :  -0.74128  Kurtosis (Y)    :  14.25531
Log Likelihood   : -5341.369

The sample mean of squared residuals was used to start recursion.

Estimated Parameters Vector :
0.088279;-0.167957; 0.194576; 0.020683; 0.028304; 0.154370; 0.855355;
0.288072; 1.146371

Elapsed Time : 7.941 seconds (or 0.13235 minutes).

### 4.5 IGARCH Model

In many high-frequency time-series applications, the conditional variance estimated using a GARCH(p,q) process exhibits strong persistence, that is, Σ_{j=1}^p β_j + Σ_{i=1}^q α_i is close to one.

If Σ_{j=1}^p β_j + Σ_{i=1}^q α_i < 1, the process (ε_t) is second-order stationary, and a shock to the conditional variance σ_t² has a decaying impact on σ²_{t+h} as h increases, and is asymptotically negligible. Indeed, let us rewrite the ARCH(∞) representation of the GARCH(p,q), given in Equation (4.2), as follows:

 (4.11)

where ω* = ω[1 − β(1)]⁻¹, λ(L) = α(L)[1 − β(L)]⁻¹ = Σ_{i=1}^∞ λ_i L^i and the λ_i are lag coefficients depending nonlinearly on the α_i and β_i. For a GARCH(1,1), λ_i = α₁β₁^{i−1}. Recall that this model is said to be second-order stationary provided that α₁ + β₁ < 1, since this implies that the unconditional variance exists and equals ω/(1 − α₁ − β₁). As shown by Davidson (2004), the amplitude of the GARCH(1,1) is measured by S = Σ_{i=1}^∞ λ_i = α₁/(1 − β₁), which determines “how large the variations in the conditional variance can be” (and hence the order of the existing moments). This concept is often confused with the memory of the model, which determines “how long shocks to the volatility take to dissipate”. In this respect, the GARCH(1,1) model has a geometric memory ρ = 1/β₁, since λ_i = O(β₁^i). In practice, we often find α₁ + β₁ close to 1. In this case, we are confronted with an Integrated GARCH (IGARCH) model.
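A quick numerical check of the amplitude S, using the GARCH(1,1) estimates of Box 6 (α₁ = 0.164808, β₁ = 0.826238):

```python
alpha1, beta1 = 0.164808, 0.826238  # Box 6 estimates
# lambda_i = alpha1 * beta1^(i-1): the ARCH(infinity) lag weights of a GARCH(1,1)
lam = [alpha1 * beta1 ** (i - 1) for i in range(1, 5001)]
S_truncated = sum(lam)
S_closed = alpha1 / (1 - beta1)     # closed form of sum_i lambda_i
print(round(S_closed, 4))           # about 0.9485, i.e. S < 1
```

The truncated sum and the closed form agree to machine precision here, and S is just below one, consistent with the near-integrated behaviour discussed in the text.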

The GARCH(p,q) model can be expressed as an ARMA process. Using the lag operator L, we can rearrange Equation (4.1) as:

[1 − α(L) − β(L)] ε_t² = ω + [1 − β(L)] v_t,

where v_t = ε_t² − σ_t² is the innovation of this ARMA representation. When the polynomial 1 − α(L) − β(L) contains a unit root, i.e. the sum of all the α_i and the β_j is one, we have the IGARCH(p,q) model of Engle and Bollerslev (1986). It can then be written as:

ϕ(L)(1 − L) ε_t² = ω + [1 − β(L)] v_t  (4.12)

where ϕ(L) = [1 − α(L) − β(L)](1 − L)⁻¹ is of order max{p,q} − 1.

We can rearrange Equation (4.12) to express the conditional variance as a function of the squared residuals. After some manipulations, we have its ARCH(∞) representation:

σ_t² = ω[1 − β(1)]⁻¹ + {1 − [1 − β(L)]⁻¹ ϕ(L)(1 − L)} ε_t²  (4.13)

For this model, S = 1 and thus the second moment does not exist. However, this process is still short-memory. To show this, Davidson (2004) considers an IGARCH(0,1) model defined as ε_t = σ_t z_t and σ_t² = ε²_{t−1}. This process is often wrongly compared to a random walk since the long-range forecast is σ̂²_{t+h|t} = ε_t², for any h. However, ε_t = z_t ε_{t−1}, which means that the memory of a large deviation persists for only one period.

### 4.6 RiskMetrics™

In October 1994, the risk management group at J.P. Morgan released a technical document describing its internal market risk management methodology (J.P. Morgan, 1996). This methodology, called RiskMetrics™, soon became a standard in market risk measurement due to its simplicity.

Basically, the RiskMetricsTM model is an IGARCH(1,1) model where the ARCH and GARCH coefficients are fixed.

The model is given by:

σ_t² = ω + (1 − λ) ε²_{t−1} + λ σ²_{t−1}  (4.14)

where ω = 0 and λ is generally set to 0.94 with daily data and to 0.97 with weekly data.8 Note also that Equation 4.14 is the very basic conditional variance model of the RiskMetricsTM methodology, but there exist many extensions of the original model. See the RiskMetrics Group website for details.
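The basic recursion can be sketched in a few lines (Python, illustrative names):

```python
def riskmetrics_sigma2(returns, lam=0.94, sigma2_0=1.0):
    """RiskMetrics EWMA: sigma2_t = (1 - lam) * eps_{t-1}^2 + lam * sigma2_{t-1},
    i.e. an IGARCH(1,1) with omega = 0 and a fixed smoothing parameter lam."""
    sigma2, out = sigma2_0, []
    for r in returns:
        out.append(sigma2)
        sigma2 = (1 - lam) * r * r + lam * sigma2
    return out
```

With λ = 0.94 (daily data), the weights on past squared returns decay geometrically, and there is no parameter to estimate, which explains both the popularity of the scheme and its lack of flexibility.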

To illustrate the need for flexible ARCH-type models, here is the output of the Box-Pierce test on squared standardized residuals and the RBD test applied after the estimation of the RiskMetrics model (including an AR(1) term and the two dummy variables). There is no doubt that the RiskMetrics specification is not appropriate.

Q-Statistics on Squared Standardized Residuals
--> P-values adjusted by 2 degree(s) of freedom
Q(  5) =  82.0593   [0.0000]
Q( 10) =  84.1246   [0.0000]
Q( 20) =  93.3285   [0.0000]
Q( 50) =  119.353   [0.0000]
H0 : No serial correlation ==> Accept H0 when prob. is High [Q < Chisq(lag)]
---------------

Residual-Based Diagnostic for Conditional Heteroskedasticity of Tse (2001)

RBD( 2) =  54.0450   [0.0000]
RBD( 5) =  54.6773   [0.0000]
RBD(10) =  59.6488   [0.0000]
------------------------------------------------
P-values in brackets

### 4.7 Fractionally Integrated Models

Volatility tends to change quite slowly over time, and, as shown in Ding, Granger, and Engle (1993) among others, the effects of a shock can take a considerable time to decay.9 Therefore the distinction between stationary and unit root processes seems far too restrictive. Indeed, the propagation of shocks in a stationary process occurs at an exponential rate of decay (so that it only captures short memory), while for a unit root process the persistence of shocks is infinite. In the conditional mean, the ARFIMA specification has been proposed to fill the gap between short and complete persistence, so that the short-run behavior of the time-series is captured by the ARMA parameters, while the fractional differencing parameter allows for modelling the long-run dependence.10

To mimic the behavior of the correlogram of the observed volatility, Baillie, Bollerslev, and Mikkelsen (1996) (hereafter denoted BBM) introduce the Fractionally Integrated GARCH (FIGARCH) model by replacing the first difference operator of Equation (4.13) by (1 − L)^d.

The conditional variance of the FIGARCH (p,d,q) is given by:

σ_t² = ω[1 − β(1)]⁻¹ + {1 − [1 − β(L)]⁻¹ ϕ(L)(1 − L)^d} ε_t²  (4.15)

or σ_t² = ω* + Σ_{i=1}^∞ λ_i L^i ε_t² = ω* + λ(L) ε_t², with 0 ≤ d ≤ 1. It is fairly easy to show that ω > 0, β₁ − d ≤ ϕ₁ ≤ (2 − d)/3 and d[ϕ₁ − (1 − d)/2] ≤ β₁(ϕ₁ − β₁ + d) are sufficient to ensure that the conditional variance of the FIGARCH(1,d,1) is positive almost surely for all t. Setting ϕ₁ = 0 gives the condition for the FIGARCH(1,d,0). See Section 3.9 on how to impose these constraints.

When estimating a FIGARCH(1,d,1) model by QML on the NASDAQ dataset (see the next output, labelled Box 11), one sees that d̂ is significantly different from both 0 and 1, while ϕ̂₁ is only marginally significant. Importantly, comparing its log-likelihood with that of the simple GARCH(1,1), i.e. -5359.179 vs. -5370.858, justifies the use of a long-memory process in the conditional variance.

Box 11 - FIGARCH-BBM
******************************
** G@RCH( 7) SPECIFICATIONS **
******************************
Dependent variable : Nasdaq
Mean Equation : ARMA (1, 0) model.
1 regressor(s) in the mean.
Variance Equation : FIGARCH (1, d, 1) model estimated with BBM’s method
(Truncation order : 1000).
1 regressor(s) in the variance.
The distribution is a Gauss distribution.

Strong convergence using numerical derivatives
Log-likelihood = -5359.18
Please wait : Computing the Std Errors ...

Robust Standard Errors (Sandwich formula)
Coefficient  Std.Error  t-value  t-prob
Cst(M)               0.119253   0.016127    7.394  0.0000
Monday (M)          -0.174214   0.028834   -6.042  0.0000
AR(1)                0.195242   0.017134    11.40  0.0000
Cst(V)               0.023438   0.015515    1.511  0.1309
Friday (V)           0.074531   0.065791    1.133  0.2573
d-Figarch            0.581512   0.079626    7.303  0.0000
ARCH(Phi1)           0.144459   0.075900    1.903  0.0571
GARCH(Beta1)         0.536264   0.093287    5.749  0.0000

No. Observations :      4093  No. Parameters  :         8
Mean (Y)         :   0.05517  Variance (Y)    :   1.59189
Skewness (Y)     :  -0.74128  Kurtosis (Y)    :  14.25531
Log Likelihood   : -5359.179

The sample mean of squared residuals was used to start recursion.

Estimated Parameters Vector :
0.119253;-0.174214; 0.195242; 0.023438; 0.074531; 0.581512; 0.144459; 0.536269

Elapsed Time : 45.115 seconds (or 0.751917 minutes).

Davidson (2004) notes the interesting and counterintuitive fact that the memory parameter of this process is −d, and is increasing as d approaches zero, while in the ARFIMA model the memory increases when d increases. According to Davidson (2004), the unexpected behavior of the FIGARCH model may be due less to any inherent paradoxes than to the fact that, embodying restrictions appropriate to a model in levels, it has been transplanted into a model of volatility. The main characteristic of this model is that it is not stationary when d > 0. Indeed,

(1 − L)^d = 1 − Σ_{k=1}^∞ c_k(d) L^k, where c₁(d) = d, c₂(d) = d(1 − d)/2, etc. By construction, Σ_{k=1}^∞ c_k(d) = 1 for any value of d > 0 and, consequently, the FIGARCH belongs to the same “knife-edge nonstationary” class as the IGARCH model. To test whether this nonstationarity feature holds, Davidson (2004) proposes a generalized version of the FIGARCH, the HYperbolic GARCH (HYGARCH). The HYGARCH is given by Equation (4.15) with λ(L) replaced by 1 − [1 − β(L)]⁻¹ ϕ(L){1 + α[(1 − L)^d − 1]}. Note that G@RCH reports log(α) and not α. The c_k(d) coefficients are thus weighted by α. Interestingly, the HYGARCH nests the FIGARCH when α = 1 (or equivalently when log(α) = 0) and the process is stationary when α < 1 (or equivalently when log(α) < 0), in which case the GARCH component observes the usual covariance stationarity restrictions (see Davidson, 2004 for more details).
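The coefficients of the fractional difference operator satisfy a simple recursion that follows from the binomial expansion of (1 − L)^d, which makes the truncated expansion easy to compute (a Python sketch with illustrative names):

```python
def frac_diff_coeffs(d, n):
    """First n coefficients c_k(d) in (1 - L)^d = 1 - sum_k c_k(d) L^k,
    via the recursion c_k = c_{k-1} * (k - 1 - d) / k with c_1 = d."""
    c = [d]
    for k in range(2, n + 1):
        c.append(c[-1] * (k - 1 - d) / k)
    return c

c = frac_diff_coeffs(0.5, 20000)
print(c[0], c[1])  # 0.5 and d(1-d)/2 = 0.125
print(sum(c))      # partial sums approach 1 as the truncation grows
```

The slow hyperbolic decay of the c_k(d) is exactly what generates the long-memory behaviour of the FIGARCH conditional variance.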

Note that when estimating the HYGARCH model on the NASDAQ dataset (see the next output, labelled Box 12), one cannot reject the FIGARCH specification in favour of the HYGARCH since log(α̂) = -0.006962 with a robust standard error of 0.029515.

Box 12 - HYGARCH
******************************
** G@RCH( 8) SPECIFICATIONS **
******************************
Dependent variable : Nasdaq
Mean Equation : ARMA (1, 0) model.
1 regressor(s) in the mean.
Variance Equation : HYGARCH (1, d, 1) model of Davidson
(Truncation order : 1000).
1 regressor(s) in the variance.
The distribution is a Gauss distribution.

Strong convergence using numerical derivatives
Log-likelihood = -5359.13
Please wait : Computing the Std Errors ...

Robust Standard Errors (Sandwich formula)
Coefficient  Std.Error  t-value  t-prob
Cst(M)               0.119324   0.016057    7.431  0.0000
Monday (M)          -0.174009   0.028839   -6.034  0.0000
AR(1)                0.195311   0.017189    11.36  0.0000
Cst(V)               0.025119   0.017519    1.434  0.1517
Friday (V)           0.074334   0.065858    1.129  0.2591
d-Figarch            0.589652   0.081351    7.248  0.0000
ARCH(Phi1)           0.143596   0.075199    1.910  0.0563
GARCH(Beta1)         0.541468   0.091022    5.949  0.0000
Log Alpha (HY)      -0.006962   0.029515  -0.2359  0.8135

No. Observations :      4093  No. Parameters  :         9
Mean (Y)         :   0.05517  Variance (Y)    :   1.59189
Skewness (Y)     :  -0.74128  Kurtosis (Y)    :  14.25531
Log Likelihood   : -5359.126

The sample mean of squared residuals was used to start recursion.

Estimated Parameters Vector :
0.119324;-0.174009; 0.195311; 0.025119; 0.074334; 0.589652;
0.143596; 0.541468;-0.006957

Elapsed Time : 51.875 seconds (or 0.864583 minutes).

Chung (1999) underscores some additional drawbacks in the BBM model: there is a structural problem in the BBM specification since the parallel with the ARFIMA framework of the conditional mean equation is not perfect, leading to difficult interpretations of the estimated parameters. Indeed the fractional differencing operator applies to the constant term in the mean equation (ARFIMA) while it does not in the variance equation (FIGARCH). Chung (1999) proposes a slightly different process:

ϕ(L)(1 − L)^d (ε_t² − σ²) = [1 − β(L)](ε_t² − σ_t²)  (4.17)

where σ2 is the unconditional variance of εt .

Applying variance targeting to this model implies replacing σ² by its sample counterpart.

If we keep the same definition of λ(L) as in Equation (4.15), we can formulate the conditional variance as:

σ_t² = σ² + {1 − [1 − β(L)]⁻¹ ϕ(L)(1 − L)^d}(ε_t² − σ²)

or

σ_t² = σ² + λ(L)(ε_t² − σ²)  (4.18)

Chung (1999) shows that σ² > 0 and 0 ≤ ϕ₁ ≤ β₁ ≤ d ≤ 1 are sufficient to ensure the positivity of Equation (4.18) when p = q = 1.11

λ(L) is an infinite summation which, in practice, has to be truncated. BBM propose to truncate λ(L) at 1000 lags (this truncation order has been implemented as the default value in our package, but it may be changed by the user) and to replace the unobserved ε_t²’s by the empirical counterpart of E(ε_t²), i.e. (1/T) Σ_{t=1}^T ε̂_t². Contrary to BBM, Chung (1999) proposes to truncate λ(L) at the size of the information set (T − 1) and to initialize the unobserved (ε_t² − σ²) at 0 (this quantity is small in absolute value and has a zero mean).12

Recently, Conrad and Haag (2006) and Conrad (2007) have derived non-negativity conditions for the FIGARCH and HYGARCH models (respectively). Thanks to Christian Conrad, these conditions are implemented in the following examples: Stat_Constr_FIGARCH_Conrad_Haag_Check.ox and Stat_Constr_HYGARCH_Conrad_Check.ox for a FIGARCH(1,d,1) and a HYGARCH(1,d,1) respectively. While the conditions are tested on the estimated parameters in these two example files, they are imposed during the estimation in Stat_Constr_FIGARCH_Conrad_Haag_Impose.ox and its HYGARCH counterpart.

The idea of fractional integration has been extended to other GARCH types of models, including the Fractionally Integrated EGARCH (FIEGARCH) of Bollerslev and Mikkelsen (1996) and the Fractionally Integrated APARCH (FIAPARCH) of Tse (1998).13

Similarly to the GARCH(p,q) process, the EGARCH(p,q) of Equation (4.3) can be extended to account for long memory by factorizing the autoregressive polynomial as [1 − β(L)] = ϕ(L)(1 − L)^d, where all the roots of ϕ(z) = 0 lie outside the unit circle. The FIEGARCH(p,d,q) is specified as follows:

log(σ_t²) = ω + ϕ(L)⁻¹ (1 − L)^{−d} [1 + α(L)] g(z_{t−1})  (4.19)

Finally, the FIAPARCH (p,d,q) model can be written as:14

σ_t^δ = ω[1 − β(1)]⁻¹ + {1 − [1 − β(L)]⁻¹ ϕ(L)(1 − L)^d}(|ε_t| − γ ε_t)^δ  (4.20)

To illustrate, the next outputs, labelled Box 13 and 14, report the estimation results of a FIAPARCH(1,d,1) and a FIEGARCH(1,d,1) respectively. Once again, long-memory is detected in the conditional variance equation.

Box 13 - FIAPARCH
******************************
** G@RCH( 9) SPECIFICATIONS **
******************************
Dependent variable : Nasdaq
Mean Equation : ARMA (1, 0) model.
1 regressor(s) in the mean.
Variance Equation : FIAPARCH (1, d, 1) model estimated with BBM’s method
(Truncation order : 1000).
1 regressor(s) in the variance.
The distribution is a Gauss distribution.

Strong convergence using numerical derivatives
Log-likelihood = -5319.06
Please wait : Computing the Std Errors ...

Robust Standard Errors (Sandwich formula)
Coefficient  Std.Error  t-value  t-prob
Cst(M)               0.090488   0.016690    5.422  0.0000
Monday (M)          -0.155500   0.027083   -5.742  0.0000
AR(1)                0.206201   0.018850    10.94  0.0000
Cst(V)               0.069862   0.027352    2.554  0.0107
Friday (V)           0.032353   0.062970   0.5138  0.6074
d-Figarch            0.442479   0.048626    9.100  0.0000
ARCH(Alpha1)         0.175341   0.085768    2.044  0.0410
GARCH(Beta1)         0.444875    0.10163    4.377  0.0000
APARCH(Gamma1)       0.358940   0.080621    4.452  0.0000
APARCH(Delta)        1.500260    0.12889    11.64  0.0000

No. Observations :      4093  No. Parameters  :        10
Mean (Y)         :   0.05517  Variance (Y)    :   1.59189
Skewness (Y)     :  -0.74128  Kurtosis (Y)    :  14.25531
Log Likelihood   : -5319.060

The sample mean of squared residuals was used to start recursion.

Estimated Parameters Vector :
0.090488;-0.155500; 0.206201; 0.069862; 0.032353; 0.442479; 0.175341;
0.444875; 0.358940; 1.500265

Elapsed Time : 173.209 seconds (or 2.88682 minutes).

Box 14 - FIEGARCH
******************************
** G@RCH(10) SPECIFICATIONS **
******************************
Dependent variable : Nasdaq
Mean Equation : ARMA (1, 0) model.
1 regressor(s) in the mean.
Variance Equation : FIEGARCH (1, d, 1) model
(Truncation order : 1000).
1 regressor(s) in the variance.
The distribution is a Gauss distribution.

Strong convergence using numerical derivatives
Log-likelihood = -5308.76
Please wait : Computing the Std Errors ...

Robust Standard Errors (Sandwich formula)
Coefficient  Std.Error  t-value  t-prob
Cst(M)               0.096537   0.015428    6.257  0.0000
Monday (M)          -0.172178   0.028035   -6.142  0.0000
AR(1)                0.192984   0.017946    10.75  0.0000
Cst(V)               0.588445    0.28539    2.062  0.0393
Friday (V)          -0.100769   0.096499   -1.044  0.2964
d-Figarch            0.493427   0.045323    10.89  0.0000
ARCH(Alpha1)        -0.246505    0.25960  -0.9496  0.3424
GARCH(Beta1)         0.603661    0.14050    4.297  0.0000
EGARCH(Theta1)      -0.119277   0.026220   -4.549  0.0000
EGARCH(Theta2)       0.302285   0.048533    6.228  0.0000

No. Observations :      4093  No. Parameters  :        10
Mean (Y)         :   0.05517  Variance (Y)    :   1.59189
Skewness (Y)     :  -0.74128  Kurtosis (Y)    :  14.25531
Log Likelihood   : -5308.762

The sample mean of squared residuals was used to start recursion.

Estimated Parameters Vector :
0.096537;-0.172178; 0.192984; 0.588445;-0.100769; 0.493427;
-0.246505; 0.603661;-0.119277; 0.302290

Elapsed Time : 242.118 seconds (or 4.0353 minutes).

### 4.8 Forecasting the Conditional Variance of GARCH-type models

As for the simple ARCH(q) model, it is rather easy to obtain h-step-ahead forecasts of these more complicated models. In the simple GARCH(p,q) case, the optimal h-step-ahead forecast of the conditional variance, i.e. σ̂²_{t+h|t}, is given by:

σ̂²_{t+h|t} = ω + Σ_{i=1}^q α_i ε²_{t+h−i|t} + Σ_{j=1}^p β_j σ²_{t+h−j|t}  (4.21)

where ε²_{t+i|t} = σ²_{t+i|t} for i > 0, while ε²_{t+i|t} = ε²_{t+i} and σ²_{t+i|t} = σ²_{t+i} for i ≤ 0. Equation (4.21) is usually computed recursively, even if a closed-form solution of σ²_{t+h|t} can be obtained by recursive substitution in Equation (4.21).
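For the GARCH(1,1), Equation (4.21) reduces to a one-dimensional recursion, sketched below (Python, illustrative names; for h ≥ 2 the forecast of ε² equals the forecast of σ²):

```python
def garch11_forecast(omega, alpha1, beta1, eps_t2, sigma2_t, h):
    """h-step-ahead GARCH(1,1) variance forecasts, computed recursively."""
    f = [omega + alpha1 * eps_t2 + beta1 * sigma2_t]  # one-step-ahead forecast
    for _ in range(h - 1):
        # beyond one step, E[eps^2] = E[sigma^2], so alpha1 and beta1 combine
        f.append(omega + (alpha1 + beta1) * f[-1])
    return f

fc = garch11_forecast(0.009292, 0.164808, 0.826238, 1.0, 1.0, 3)
```

The recursion converges geometrically (at rate α₁ + β₁) towards the unconditional variance ω/(1 − α₁ − β₁), which is the closed-form solution mentioned above.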

Similarly, one can easily obtain the h-step-ahead forecast of the conditional variance of an ARCH, IGARCH and FIGARCH model. By contrast, for threshold models, the computation of out-of-sample forecasts is more complicated. Indeed, for EGARCH, GJR and APARCH models (as well as for their long-memory counterparts), the assumption made on the innovation process may have an effect on the forecast (especially for h > 1).

For instance, for the GJR (p,q) model, we have

σ²_{t+h|t} = ω + Σ_{i=1}^q (α_i + γ_i S⁻_{t+h−i|t}) ε²_{t+h−i|t} + Σ_{j=1}^p β_j σ²_{t+h−j|t}  (4.22)

When γ_i = 0 for all i, we obtain the forecast of the GARCH model. Otherwise, S⁻_{t+h−i|t} has to be computed. Note first that S⁻_{t+i|t} = S⁻_{t+i} for i ≤ 0. However, when i > 0, S⁻_{t+i|t} depends on the choice of the distribution of z_t. When the distribution of z_t is symmetric around 0 (as for the Gaussian, Student and GED densities), the probability that ε_{t+i} is negative is S⁻_{t+i|t} = 0.5. If z_t is (standardized) skewed-Student distributed with asymmetry parameter ξ and degree of freedom υ, S⁻_{t+i|t} = 1/(1 + ξ²), since ξ² is the ratio of probability masses above and below the mode.
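The probability S⁻ used in this forecast recursion is a one-liner (Python; ξ = 1 recovers the symmetric case):

```python
def prob_below_mode(xi=1.0):
    """P(z < mode) = 1 / (1 + xi^2) for the skewed-Student with asymmetry xi;
    equals 0.5 for symmetric densities (xi = 1)."""
    return 1.0 / (1.0 + xi * xi)

print(prob_below_mode())               # 0.5
print(round(prob_below_mode(1.2), 4))  # about 0.41: right skew, less mass below the mode
```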

For the APARCH (p,q) model,

where E[(|ε_{t+k}| − γ_i ε_{t+k})^δ | I_t] = κ_i σ^δ_{t+k|t}, for k > 1, and κ_i = E(|z| − γ_i z)^δ (see Section 3.6.2).

For the EGARCH (p,q) model,

where ĝ(z_{t+k|t}) = g(z_{t+k}) for k ≤ 0 and 0 for k > 0.

Finally, the h-step-ahead forecasts of the FIAPARCH and FIEGARCH models are obtained in a similar way.

### 4.9 Constrained Maximum Likelihood and Simulated Annealing

As explained in Section 3.9, it is possible to constrain the parameters to range between a lower and an upper bound by selecting the option MaxBFGS - Bounded Parameters in the Estimate window.

If the user selects the option Select (Individual or Matrix Form) in the Starting Values window, the program automatically opens a new box dialog before launching the estimation. Analogous to the procedure for the starting values, the user can change the bounds manually.

In the following example, an APARCH specification has been used. By default, we propose to impose the constraints −1 < γ_i < 1, i = 1,…,q, and 0 < δ < 3.

Finally, if the option MaxSA - Unbounded Parameters is selected in the Model Settings window, a new box dialog appears before the estimation.15 In this box dialog, the user may change the default values of the parameters used in the simulated annealing algorithm.

Simulated annealing is a global optimization method that can distinguish between different local optima. Starting from an initial point, the algorithm takes a step and the function is evaluated. When minimizing a function, any downhill step is accepted and the process repeats from this new point. An uphill step may also be accepted, which allows the algorithm to escape from local optima. This uphill decision is made by the Metropolis criterion. As the optimization process proceeds, the length of the steps declines and the algorithm closes in on the global optimum. Since the algorithm makes very few assumptions about the function to be optimized, it is quite robust with respect to non-quadratic surfaces, and the degree of robustness can be adjusted by the user. In fact, simulated annealing can also be used as a local optimizer for difficult functions. The implementation used here is the one described in Goffe, Ferrier, and Rogers (1994), who find it to be competitive, if not superior, to multiple restarts of conventional optimization routines for difficult optimization problems. The price to pay is that it is slower than the previous two techniques. A comprehensive help is available on the MaxSA web page:
www.tinbergen.nl/~cbos/software/maxsa.html.
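The accept/reject rule described above is easy to sketch. The following Python toy (an illustration of the general technique only, not the MaxSA code; the function `anneal` and its defaults are hypothetical) minimizes a scalar function using the Metropolis criterion and a geometric cooling schedule:

```python
import math
import random

def anneal(f, x0, t0=5.0, rt=0.5, n_steps=200, step=1.0, seed=42):
    """Toy simulated annealing for a scalar function f.
    Downhill moves are always accepted; uphill moves are accepted
    with the Metropolis probability exp(-delta / T)."""
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    best, fbest = x, fx
    t = t0
    while t > 1e-6:
        for _ in range(n_steps):
            cand = x + rng.uniform(-step, step)   # random step around x
            delta = f(cand) - fx
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                x, fx = cand, fx + delta
                if fx < fbest:
                    best, fbest = x, fx
        t *= rt        # temperature reduction factor (cf. dRt in MaxSA)
        step *= 0.9    # shorten the steps as the search closes in
    return best, fbest
```

At high temperature almost any uphill move is accepted and the search roams freely; as the temperature drops, the rule converges to plain downhill search, which is why the method can escape local optima early on yet still refine the best point found.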

### 4.10 Accuracy of G@RCH

McCullough and Vinod (1999) and Brooks, Burke, and Persand (2001) use the daily German mark/British pound exchange rate data of Bollerslev and Ghysels (1996) to compare the accuracy of GARCH model estimation across several econometric software packages. They choose the GARCH(1,1) model described in Fiorentini, Calzolari, and Panattoni (1996) (hereafter denoted FCP) as the benchmark. In this section, we use the same methodology with the same dataset to check the accuracy of our procedures. Coefficient and standard error estimates of G@RCH 4.2 are reported in Table 4.1 together with the results of McCullough and Vinod (1999) (FCP in the table).

| Parameter | Coef. (G@RCH) | Coef. (FCP) | Hessian S.E. (G@RCH) | Hessian S.E. (FCP) | Robust S.E. (G@RCH) | Robust S.E. (FCP) |
|-----------|---------------|-------------|----------------------|--------------------|---------------------|-------------------|
| μ         | -0.006184     | -0.006190   | 0.008462             | 0.008462           | 0.009187            | 0.009189          |
| ω         | 0.010760      | 0.010761    | 0.002851             | 0.002852           | 0.006484            | 0.006493          |
| α1        | 0.153407      | 0.153134    | 0.026569             | 0.026523           | 0.053595            | 0.053532          |
| β1        | 0.805879      | 0.805974    | 0.033542             | 0.033553           | 0.072386            | 0.072461          |

Table 4.1: Accuracy of the GARCH procedure

G@RCH gives very satisfactory results since the first four digits (at least) are the same as those of the benchmark for all but two estimates.

Moreover, to investigate the accuracy of our forecasting procedures, we have run 8-step-ahead forecasts of the model, as in Brooks, Burke, and Persand (2001). Table 4 in Brooks, Burke, and Persand (2001) reports the conditional variance forecasts given by six well-known software packages together with the correct values. G@RCH hits the benchmarks for all steps to the third decimal.

Finally, Lombardi and Gallo (2001) extend the work of Fiorentini, Calzolari, and Panattoni (1996) to the FIGARCH model of Baillie, Bollerslev, and Mikkelsen (1996) and derive the analytic Hessian matrices of this long-memory process. For the same DEM/UKP database as in the previous example, Table 4.2 reports the coefficient estimates and their standard errors for our package (using numerical gradients and the BFGS optimization method) and for Lombardi and Gallo (2001) (using analytical gradients and the Newton-Raphson algorithm).

| Parameter | Coef. (G@RCH) | Coef. (LG) | Hessian S.E. (G@RCH) | Hessian S.E. (LG) |
|-----------|---------------|------------|----------------------|-------------------|
| μ         | 0.003606      | 0.003621   | 0.009985             | 0.009985          |
| ω         | 0.015772      | 0.015764   | 0.003578             | 0.003581          |
| α1        | 0.198134      | 0.198448   | 0.042508             | 0.042444          |
| β1        | 0.675652      | 0.675251   | 0.051800             | 0.051693          |
| d         | 0.570702      | 0.569951   | 0.075039             | 0.074762          |

Table 4.2: Accuracy of the FIGARCH procedure

Results show that G@RCH provides excellent numerical estimates that are quite close to the analytical ones, even for an advanced model such as the FIGARCH.16

### 4.11 Simulations

G@RCH 6.1 allows the simulation of four models, namely the GARCH, GJR, APARCH and EGARCH models. An ARMA specification is allowed in the conditional mean and four different distributions are available for the error term, i.e. normal, Student-t, GED or skewed-Student.

Simulation can be done either in Ox or through the rolling menus. Simul_Garch.ox is an example of Ox code using the Garch class. Alternatively, to simulate a GARCH model using the rolling menus, click on G@RCH in the Modules group in the workspace window on the left-hand side of OxMetrics. Then change the Category to Monte Carlo and Model Class to Simulation of GARCH Models and click on Formulate.

Then, select the model of interest and change the default values of the corresponding parameters (next window). The simulated series can be stored in a separate dataset and are plotted if requested.
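For intuition, the recursion behind the simplest of these simulators can be sketched in a few lines. This is an illustrative Python analogue (the function name and defaults are hypothetical; G@RCH performs the simulation in Ox, e.g. in Simul_Garch.ox):

```python
import math
import random

def simulate_garch11(omega, alpha, beta, n, seed=0):
    """Simulate y_t = sigma_t * z_t with Gaussian z_t and
    sigma_t^2 = omega + alpha * y_{t-1}^2 + beta * sigma_{t-1}^2."""
    rng = random.Random(seed)
    sigma2 = omega / (1.0 - alpha - beta)  # start at the unconditional variance
    y, h = [], []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        yt = math.sqrt(sigma2) * z
        y.append(yt)
        h.append(sigma2)
        sigma2 = omega + alpha * yt ** 2 + beta * sigma2
    return y, h
```

The other supported models (GJR, APARCH, EGARCH) only change the variance recursion; likewise, the innovation z_t may be drawn from the Student-t, GED or skewed-Student distribution instead of the normal.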

## Chapter 5 Estimating Univariate Models using the Batch and Ox Versions

G@RCH is primarily a menu-driven module for OxMetrics. However, there are two additional ways of estimating GARCH-type models with G@RCH. The first makes use of the batch facilities of OxMetrics. In the second, an Ox program is written that imports the Garch class code into the program.

This chapter is devoted to the illustration of these two methods. No detailed analysis of the results will be undertaken as we concentrate on the mechanics of the procedures.

### 5.1 Using the Batch Version

A useful feature in G@RCH 6.1 is the possibility of using the algebra and batch languages of OxMetrics (see the OxMetrics handbook for more details). The batch language gives some control over OxMetrics through a command language. This allows automating repetitive tasks, or serves as a quick way to get to a certain point in your analysis. The syntax of these commands is described below.

There are five ways of running Batch commands.

• Batch Editor (Alt+b)
The Model/Batch... command activates the edit window in which you can edit/load/save a set of batch commands. The file extension used for batch files is .FL.
• Batch from Results windows (Ctrl+b)
A text selection containing Batch commands can be run directly from that window using the Edit/Run as Batch command (or Ctrl+b as short cut).
• Batch from the File/Open command
If you use File/Open, and set the file type to Run Batch file (*.fl), then the batch file is run immediately.
• Batch from the Windows Explorer
You can double click on a .FL file in the Windows Explorer to run the file directly. If OxMetrics is not active yet, it will be started automatically.
• Batch from a Batch file
A batch file can be called from another batch file using the command

The most intuitive way is probably to estimate a model first and then click on Tools/Batch Editor ... in OxMetrics (or Alt+b). Then, a new dialog box appears with the batch code corresponding to this model. To illustrate, the next picture shows part of the batch code corresponding to a simple (without explanatory variable) ARMA(1,0)-GARCH(1,1) model.

To estimate a slightly different model, one has to change the batch code accordingly. Here is an example batch code used to estimate an ARMA(1,1)-GJR(1,1) model using the same database (note that only the commands ARMA_ORDERS() and MODEL() have been modified).

Batch code for an ARMA(1,1)-GJR(1,1)
// Batch code for G@RCH( 1)
module("G@RCH");
package("Garch", "GARCH");
usedata("nasdaq.xls");
system
{
Y = Nasdaq;
}
CSTS(1,1);
DISTRI(0);
ARMA_ORDERS(1,0);
ARFIMA(0);
GARCH_ORDERS(0,1);
MODEL("GJR");
MLE(2);
MaxControl(1000,0,1);
MaxControlEps(0.0001,0.005);
estimate("BFGS", 1984-10-12, 0, 2000-12-21, 0);

Here is a list of G@RCH specific commands (see Section 10.3 - G@RCH Members Functions for more details on these options).

1. CSTS: Specifies if constants are wanted respectively in the conditional mean and in the conditional variance equations.
2. DISTRI: Specifies the desired distribution, i.e. 0 for the Gaussian (normal) distribution, 1 for the Student-t distribution, 2 for the Generalized Error Distribution (GED) and 3 for the skewed Student-t.
3. ARMA_ORDERS: Specifies respectively the AR and MA orders.
4. ARFIMA: 1 to add a long-memory component in the conditional mean equation, 0 otherwise.
5. GARCH_ORDERS: Specifies the orders of the GARCH(p,q)-type model.
6. MODEL: specification choice (string).
• "RISKMETRICS";
• "GARCH";
• "EGARCH";
• "GJR";
• "APARCH";
• "IGARCH";
• "FIGARCH_BBM";
• "FIGARCH_CHUNG";
• "FIEGARCH";
• "FIAPARCH_BBM";
• "FIAPARCH_CHUNG";
• "HYGARCH".
7. ARCH_IN_MEAN: 1 or 2 to add the conditional variance or standard deviation in the conditional mean equation, 0 otherwise.
8. estimate(method, year1=-1, period1=0, year2=-1, period2=0); Estimates the model.
The first argument method = "BFGS" for unconstrained ML, "BFGS-BOUNDS" for Constrained ML (lower and upper bounds) and "MaxSA" for Simulated annealing.
year1(period1) – year2(period2) is the estimation sample. Setting year1 to -1 will result in the earliest possible year1(period1), setting year2 to -1 will result in the latest possible year2(period2).
9. TRUNC, NYBLOM, ITER and Maxsa are used as described in Section 10.3.
10. KUPIEC_TEST; 1 to apply the Kupiec test on the in-sample VaR. The VaR levels are specified using the option VaR_LEVELS, where the VaR levels are separated by a comma, e.g. VaR_LEVELS(0.95,0.99).
11. DQT; 1 to apply the Dynamic Quantile test presented in Section 6.2.2. The first argument is one to apply the test and the second is the number of lags of the Hit variable to include in the regression. The VaR levels are also specified using option VaR_LEVELS (see above).
12. COVAR, BOXPIERCE, ARCHLAGS, RBD, PEARSON, SBT, MLE are also used as described in Section 10.3 except that the lag orders are now separated by a comma, e.g. BOXPIERCE(10,20,30);
13. Tests(); To launch the misspecification tests selected above.
14. FORECAST; The first argument is 1 to run the forecasting procedure, the second argument is the forecasting horizon and the third argument is one to print the forecasts.
15. Print_VaR_out_of_sample_Forecast; Print the VaR forecasts. The arguments are the VaR levels (see VaR_LEVELS).
16. For_Graphs. See Chapter 10.
17. MaxControl and MaxControlEps. See the ox documentation.

Here is another example batch code.

Batch code for an ARMA(1,0)-APARCH(1,1) with various tests
// Batch code for G@RCH( 1)
module("G@RCH");
package("Garch", "GARCH");
usedata("nasdaq.xls");
system
{
Y = Nasdaq;
}
CSTS(1,1);
DISTRI(0);
ARMA_ORDERS(1,0);
ARFIMA(0);
GARCH_ORDERS(1,1);
MODEL("APARCH");
MLE(2);
MaxControl(1000,0,1);
MaxControlEps(0.0001,0.005);
estimate("BFGS", 1984-10-12, 0, 2000-12-21, 0);
BOXPIERCE(5,10,20,50);
ARCHLAGS(2,5,10);
NYBLOM(1);
SBT(1);
RBD(2,5,10);
PEARSON(40,50,60);
KUPIEC_TEST(1);
DQT(1,5);
VaR_LEVELS(0.95,0.975,0.99,0.995,0.9975);
COVAR(1);
Tests();

TestGraphicAnalysis(1,1,1,1,1,1);
QuantileGraphs(2,0.025,0.975);
FORECAST(1,10, 1);
For_Graphs(0,10,2,2,1);
Print_VaR_out_of_sample_Forecast(0.05,0.95);

Feel free to contact us if you need more flexibility for the batch mode.

### 5.2 Importing the Garch Class in Ox

This section explains how to use the Garch class in Ox. We assume that you have installed Ox (version 5 or later). For more details about Ox, see Jurgen Doornik’s web site (http://www.doornik.com).

#### 5.2.1 GarchEstim.ox example

G@RCH is built upon the concept of object-oriented programming. For non-specialists, object-oriented programming might sound rather daunting at first, but it is in fact quite easy to get the whole picture.

A major component of object-oriented programming is the “class”. Several useful classes are supplied with Ox, such as Database, Modelbase and Simulation classes. Garch is an additional class that helps in the estimation of GARCH-type models. Section 10.1 provides additional details on this concept.

GarchEstim.ox is an example of Ox code using the Garch class. As a file editor, we strongly recommend using OxMetrics or OxEdit.1

The GarchEstim.ox file is displayed in the following Box.

GarchEstim.ox
#import <packages/Garch6/Garch>

main()
{
decl garchobj;
garchobj = new Garch();
//*** DATA ***//
garchobj.Info();
garchobj.Select(Y_VAR, {"Nasdaq",0,0});
garchobj.SetSelSample(-1, 1, 2000, 1);
//*** SPECIFICATIONS ***//
garchobj.CSTS(1,1);             // cst in Mean (1 or 0), cst in Variance (1 or 0)
garchobj.DISTRI(3);             // 0 for Gauss, 1 for Student, 2 for GED, 3 for Skewed-Student
garchobj.ARMA_ORDERS(0,0);      // AR order (p), MA order (q).
garchobj.ARFIMA(0);             // 1 if Arfima wanted, 0 otherwise
garchobj.GARCH_ORDERS(1,1);     // p order, q order
garchobj.ARCH_in_mean(0);       // ARCH-in-mean: 1 or 2 to add the variance or std. dev.
garchobj.MODEL("GARCH");        //  0: RISKMETRICS  1:GARCH     2:EGARCH    3:GJR   4:APARCH
//  5:IGARCH 6:FIGARCH-BBM   7:FIGARCH-CHUNG   8:FIEGARCH
//  9:FIAPARCH-BBM  10: FIAPARCH-CHUNG  11: HYGARCH
garchobj.TRUNC(1000);           // Truncation order (only F.I. models with BBM method)

//*** TESTS & FORECASTS ***//
garchobj.BOXPIERCE(<10;15;20>); // Lags for the Box-Pierce Q-statistics, <> otherwise
garchobj.ARCHLAGS(<2;5;10>);    // Lags for Engle’s LM ARCH test, <> otherwise
garchobj.NYBLOM(1);             // 1 to compute the Nyblom stability test, 0 otherwise
garchobj.SBT(1);                // 1 to compute the Sign Bias test, 0 otherwise
garchobj.PEARSON(<40;50;60>);   // Cells for the adjusted Pearson Chi-square GoF test,
garchobj.RBD(<10;15;20>);       // Lags for the Residual-Based Diagnostic test of Tse
garchobj.FORECAST(0,15,1);      // Arg.1 : 1 to launch the forecasting procedure, 0 otherwise
// Arg.2 : Number of forecasts
// Arg.3 : 1 to Print the forecasts, 0 otherwise
//*** OUTPUT ***//
garchobj.MLE(2);                // 0 : MLE (Second derivatives), 1 : MLE (OPG Matrix), 2 : QMLE
garchobj.COVAR(0);              // if 1, prints variance-covariance matrix of the parameters.
garchobj.ITER(0);               // Interval of iterations between printed intermediary results
garchobj.TESTS(0,0);            // Arg. 1 : 1 to run tests PRIOR to estimation, 0 otherwise
// Arg. 2 : 1 to run tests AFTER estimation, 0 otherwise
garchobj.GRAPHS(0,0,"");        // Arg.1 : if 1, displays graphics of the estimations
// Arg.2 : if 1, saves these graphics in a EPS file
// Arg.3 : Name of the saved file.
garchobj.FOREGRAPHS(1,0,"");    // Same as GRAPHS(p,s,n) but for the graphics of the forecasts.
//*** PARAMETERS ***//
garchobj.BOUNDS(0);             // 1 if bounded parameters wanted, 0 otherwise
garchobj.FIXPARAM(0,<0;0;0;0;1;0>);
// Arg.1 : 1 to fix some parameters to their starting values, 0 otherwise
// Arg.2 : 1 to fix (see garchobj.DoEstimation(<>)) and 0 to estimate
//         the corresponding parameter

//*** ESTIMATION ***//
garchobj.Maxsa(0,5,0.5,20,5,2,1);
// Arg.1 : 1 to use the MaxSA algorithm of Goffe, Ferrier and Rogers (1994)
//         and implemented in Ox by Charles Bos
// Arg.2 : dT=initial temperature
// Arg.3 : dRt=temperature reduction factor
// Arg.4 : iNS=number of cycles
// Arg.5 : iNT=Number of iterations before temperature reduction
// Arg.6 : vC=step length adjustment
// Arg.7 : vM=step length vector used in initial step
garchobj.Initialization(<>);

garchobj.PrintStartValues(0);    // 1: Prints the S.V. in a table form; 2: Individually;
// 3: in a Ox code to use in StartValues
garchobj.DoEstimation(<>);
garchobj.Output();
garchobj.STORE(0,0,0,1,1,"01",0);  //  Arg.1,2,3,4,5 : if 1 -> stored.
//  (Res-SqRes-CondV-MeanFor-VarFor)
//  Arg.6 : Suffix. The name of the saved series
//  will be "Res_ARG6".
//  Arg.7 : if 0, saves as an Excel spreadsheet (.xls).
//  If 1, saves  as a OxMetrics dataset (.in7)
delete garchobj;
}

Let us study this file in more detail. The #import statement indicates that this file is linked to the Garch.oxo and Garch.h files. In the body of the file (after the main() instruction), a new Garch object is first created and a database is loaded. The user has to enter the correct path of the database, but also has to pay attention to the structure of the database to be used. For instance, to use a Microsoft Excel file, the format of the spreadsheet is of crucial importance. The following convention has to be adopted when loading an Excel spreadsheet: variables are in columns, columns with variables are labelled, there is an unlabelled column containing the dates (with the form Year-Period) and the data form a contiguous sample. Here is an example:2

|   | A      | B       | C   | D   |
|---|--------|---------|-----|-----|
| 1 |        | RET     | MON | HOL |
| 2 | 1990-1 | 0.0439  | 1   | 0   |
| 3 | 1990-2 | -0.0302 | 0   | 0   |
| 4 | 1990-3 | 0.0845  | 0   | 1   |

OxMetrics also supports databases having a date variable which provides the dates for daily data (or possibly times for higher frequencies). Then selections are made by date, and the dates shown in the graphics. Dates can be read from and saved to Excel files.

We note then that the dependent variable (Y), the regressor(s) in the mean equation (X) and the regressor(s) in the variance equation (Z) are specified with the Select function. Ox being case-sensitive, the exact name of the variable has to be entered. The second and third arguments denote the starting and ending observations to be considered. By default, “0” and “0” mean that all the observations are selected. From this selection, a sample can be extracted with the SetSelSample function. The arguments are ordered as (StartYear, StartPeriod, EndYear, EndPeriod) and the default (-1, 1, -1, 1) means all the selected observations.3

The GarchEstim.ox file is made up of six parts:4

• the “Data” part deals with the database, the sample and the variables selection;
• the “Specification” part is related to the choice of the model, the lag orders and the distribution;
• the “Tests & Forecasts” part allows computing different tests and parameterizing the forecasting part. Note that BOXPIERCE, ARCHLAGS, RBD and PEARSON all require a vector of integers corresponding to the lags used in the computation of the statistics;
• the “Output” part includes several output options, including MLE, which refers to the computation method of the standard deviations of the estimated parameters, TESTS, which is useful when you want to run some tests on the raw series prior to any estimation, and GRAPHS (resp. FOREGRAPHS), which plots graphs for the estimation (resp. forecasting) process;
• the “Parameters” part consists in two procedures. BOUNDS indicates if constraints are imposed on several parameters (see Section 3.9) while FixParam allows fixing some parameters to their starting values;
• the “Estimation” part runs the model. Initialization initializes the model, StartValues is used to modify the starting value of one or more parameter(s), DoEstimation launches the estimation of the model, Output prints results of the estimation, forecasting and tests procedures and, finally, STORE allows storing some series. The argument of DoEstimation is a vector containing starting values of the parameters in a pre-specified order (note that the user can always let G@RCH use default values).

#### 5.2.2 Running an Ox Program

If you want to use the “Console Version” of G@RCH 6.1, all you need is an Ox executable. To illustrate the various ways to run Ox code, we keep our GarchEstim.ox example.

##### Command Prompt

First, you can type

oxl GarchEstim

at the command prompt. This method is mainly for users of Ox Console.

Updating the environment

Skip this section if you managed to run the Ox programs in this booklet. Otherwise, under Windows and Unix you may wish to set the PATH environment variable.

The executable (oxl.exe etc.) is in the ox\bin folder, for example by default it is in:

C:\Program files\OxMetrics5\Ox\bin

So, update your PATH variable if necessary.5

Without these, you can still run GarchEstim.ox, but more typing is needed:

"C:\Program files\OxMetrics5\Ox\bin\oxl"
"-iC:\Program files\OxMetrics5\Ox\include"
GarchEstim.ox

The double quotes are required because of the spaces in the paths.

##### OxEdit

Alternatively, you may use OxEdit.6 Ox tools should be installed automatically. If not, run ox\bin\oxedit\oxcons.tool. To run the program, use the Modules/Ox command (for Ox Console) or Modules/OxRun (for OxMetrics), as shown below.

When using OxEdit, OxMetrics is needed to display the graphics but is not mandatory for running the estimation. Alternatively, graphs can be displayed on the screen using GnuDraw and GnuPlot. Indeed, G@RCH 6.1 can be combined with GnuDraw, an Ox package for creating GnuPlot graphics developed by Charles Bos.8 The interface is completely based on the OxDraw package (even the documentation - gnudraw.html - uses the same structure as Jurgen Doornik’s oxdraw.html).9 Usage of GnuDraw is intended to be simple as the syntax is similar to the original Ox drawing routines. You just have to add the lines

#include <oxstd.h>
#include <packages/gnudraw/gnudraw.h>
before
#import <packages/Garch6/Garch>.
As GnuPlot can be called automatically from within Ox, ShowDrawWindow can be used, displaying graphs on screen.

See Cribari-Neto and Zarkos (2003) for a comprehensive overview of the GnuDraw package.

##### OxMetrics

A third possibility is to launch Ox programs directly from OxMetrics, as shown below.10 There are different ways of running Ox code under OxMetrics. The first is to use OxRun.

A new feature of G@RCH 6.1 (and OxMetrics 5 in general) is that Ox code can be generated.

The Model/Ox Batch Code command (or Alt+O) activates a new dialog box called ‘Generate Ox Code’ that allows the user to select an item for which to generate Ox code.

The code is then opened in a new window and, provided Ox Professional is available, this code can be run from OxMetrics, OxEdit or the command line. This option is also available for the ‘Descriptive Statistics’ and ‘Simulation’ modules.

OxBatch_1.ox is an example of Ox Batch code generated by G@RCH 6.1 after the estimation of an APARCH model while OxBatch_2.ox and OxBatch_3.ox are two examples of Ox Batch code generated after the use of the ‘Descriptive Statistics’ and ‘Simulation’ modules.

OxBatch_1.ox
#include <oxstd.h>
#import <packages/Garch6/garch>
#include <oxdraw.h>

main()
{
//--- Ox code for G@RCH( 3)
decl model = new Garch();

model.Deterministic(-1);

model.Select(Y_VAR, {"Nasdaq", 0, 0});

model.CSTS(1,1);
model.DISTRI(3);
model.ARMA_ORDERS(2,0);
model.ARFIMA(0);
model.GARCH_ORDERS(1,1);
model.MODEL(APARCH);
model.MLE(2);
model.ITER(0);

model.SetSelSampleByDates(dayofcalendar(1984, 10, 12), dayofcalendar(2000, 12, 21));
model.Initialization(<>);
model.DoEstimation(<>);
model.Output();

model.INFO_CRITERIA(1);
model.NORMALITY_TEST(1);
model.BOXPIERCE(<5;10;20;50>);
model.ARCHLAGS(<2;5;10>);
model.NYBLOM(1);
model.SBT(1);
model.RBD(<2;5;10>);
model.PEARSON(<40;50;60>);
model.KUPIEC_TEST(1);
model.DQT(1,5);
model.VaR_LEVELS(<0.95;0.975;0.99;0.995;0.9975>);
model.COVAR(1);
model.Tests();

model.TestGraphicAnalysis(1,1,1,1,1,1,0);
model.QuantileGraphs(2,<0.025;0.975>,6);
SetDrawWindow("G@RCH Graphics");
ShowDrawWindow();

model.FORECAST(1,10, 1);
model.Forecasting();
model.For_Graphs(0,10,2,2,1);
SetDrawWindow("G@RCH Forecasting");
ShowDrawWindow();
model.Print_VaR_out_of_sample_Forecast(<0.05;0.95>);

model.Append_in(model.m_vE,"Res");
model.Append_in(model.m_vE.^2,"SqRes");
model.Append_in(model.m_vStandErrors,"StdRes");
model.Append_in(model.m_vSigma2,"CondV");
model.Append_out(model.m_mForc[][0],"ForY");
model.Append_out(model.m_mForc[][1],"ForVar");
model.Save("D:\\G@RCH5\\Compilation\\Oxmetrics5\\data\\nasdaq.xls");

delete model;
}

OxBatch_2.ox
#include <oxstd.h>
#import <packages/Garch6/garch>
#include <oxdraw.h>
#import <lib/testres>
#include <arma.h>
#import <Database>

main()
{
//--- Ox code for G@RCH( 1)
decl model = new Garch();

model.Deterministic(-1);

model.Select(Y_VAR, {"Nasdaq", 0, 0});

model.SetSelSampleByDates(dayofcalendar(1984, 10, 12), dayofcalendar(2000, 12, 21));
decl series =   model.GetGroup(Y_VAR),names,y,name;
model.GetGroupNames(Y_VAR, &names);
decl m_cT=rows(series);
decl nbser=columns(series);
model.Info();
for (decl i = 0; i < nbser; ++i)
{
y=series[][i];
name=names[i];
println("Series #", i+1,"/",nbser, ": ", name);
print("---------");
model.Normality(y);
decl m_cLagArch_test=<2;5;10>;
for (decl l=0; l < sizer(m_cLagArch_test); l++)
ArchTest(y, 1, m_cLagArch_test'[l],  0, TRUE);
println("Q-Statistics on Raw data");
model.BoxPQ(y,<5;10;20;50>, 0);
println("Q-Statistics on Squared data");
model.BoxPQ(y.^2,<5;10;20;50>, 0);
model.EstimateGSP(y,2046,1);
}

delete model;
}

OxBatch_3.ox
#include <oxstd.h>
#import <packages/Garch6/garch>
#include <oxdraw.h>

main()
{
//--- Ox code for G@RCH( 4)
decl model = new Garch();

model.Deterministic(-1);

model.Select(Y_VAR, {"Nasdaq", 0, 0});

decl z,eps,sigma2,y,y_all=<>,sigma2_all=<>;
decl new_name_simul_y=new array[3];
decl new_name_simul_sigma2=new array[3];
for (decl i=0;i<3;++i)
{
z=rann(2000,1);
model.Simulate_APARCH(0.05,<0.1>,<0.8>,<0.01>,<2>, z, 0, &eps, &sigma2);
y=eps+0.01;
y_all~=y;
sigma2_all~=sigma2;
new_name_simul_y[i]=sprint("$y_t$","-",i+1);
new_name_simul_sigma2[i]=sprint("$\sigma_t^2$","-",i+1);
}
decl plot=0;
for (decl i=0;i<3;++i)
{
DrawTMatrix(plot++, y_all[][0]’, new_name_simul_y[i]);
DrawTMatrix(plot++, sigma2_all[][0]’, new_name_simul_sigma2[i]);
}
ShowDrawWindow();
decl dbase = new Database();
dbase.Create(1,1,1,rows(y),1);
dbase.Append(y_all,new_name_simul_y);
dbase.Append(sigma2_all,new_name_simul_sigma2);
dbase.Save("simulation.in7");
delete  dbase;

delete model;
}

#### 5.3.1 Forecast.ox example

To illustrate the potential of writing Ox code based on G@RCH and its Garch class, we also provide Forecast.ox, an example that estimates an ARMA(1,0)-APARCH(1,1) model on the CAC40. In this program, we analyze the French CAC40 stock index for the years 1995-1999 (1249 daily observations). Daily returns in percentage are defined as 100 times the first difference of the log of the closing prices.

After the estimation of the ARMA(1,0)-APARCH(1,1) model with a skewed-student likelihood on the first 800 observations, 448 one-step-ahead forecasts of the conditional mean and conditional variance are computed.

Forecasting Performance

One of the most popular measures to check the forecasting performance of the ARCH-type models is the Mincer-Zarnowitz regression, i.e. ex-post volatility regression:

$$\hat{\sigma}_t^2 = a_0 + a_1\,\tilde{\sigma}_t^2 + u_t \qquad (5.1)$$

where $\hat{\sigma}_t^2$ is the ex-post volatility, $\tilde{\sigma}_t^2$ is the forecasted volatility and $a_0$ and $a_1$ are parameters to be estimated. If the model for the conditional variance is correctly specified (and the parameters are known) and if $E(\hat{\sigma}_t^2) = \tilde{\sigma}_t^2$, we have $a_0 = 0$ and $a_1 = 1$. The $R^2$ of this regression is often used as a simple measure of the degree of predictability of the ARCH-type model.
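The regression amounts to a simple OLS fit of the proxy on a constant and the forecast. A minimal Python sketch (illustrative only; within G@RCH the regression is run by the MZ procedure shown later in this chapter):

```python
import numpy as np

def mincer_zarnowitz(proxy, forecast):
    """OLS of proxy_t = a0 + a1 * forecast_t + u_t.
    Returns (a0, a1, R^2); unbiased forecasts imply a0 = 0, a1 = 1."""
    X = np.column_stack([np.ones_like(forecast), forecast])
    coef, *_ = np.linalg.lstsq(X, proxy, rcond=None)
    resid = proxy - X @ coef
    r2 = 1.0 - resid.var() / proxy.var()
    return coef[0], coef[1], r2
```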

But $\hat{\sigma}_t^2$ is never observed. It is thus common to use $\hat{\sigma}_t^2 = (y_t - \bar{y})^2$, where $\bar{y}$ is the sample mean of $y_t$. The $R^2$ of this regression is often lower than 5% and this could lead to the conclusion that GARCH models produce poor forecasts of the volatility (see, among others, Schwert, 1990, or Jorion, 1996). But, as described in Andersen and Bollerslev (1998a), the reason for these poor results is the choice of what is considered as the “true” volatility. Instead, they propose to compute the daily realized volatility as the sum of squared intraday returns and use it as the “true” volatility. Actually, Andersen and Bollerslev (1998a) show that this measure is a more proper one than squared daily returns. Therefore, using h-minute returns (5-minute for instance), the realized volatility can be expressed as:

$$\hat{\sigma}_t^2 = \sum_{k=1}^{K} y_{k,t}^2 \qquad (5.2)$$

where $y_{k,t}$ is the return of the $k$-th $h$-minute interval of day $t$ and $K$ is the number of $h$-minute intervals per day. See Section 7.3 for more details about realized volatility.
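In code, Equation (5.2) is just a per-day sum of squares. A small Python sketch (illustrative only; it assumes the intraday returns arrive as a flat list with K intervals per day):

```python
def realized_volatility(intraday_returns):
    """Realized volatility of one day: the sum of squared intraday returns."""
    return sum(r ** 2 for r in intraday_returns)

def daily_rv(returns, K):
    """Split a flat list of intraday returns into days of K intervals
    each and compute the realized volatility of every day."""
    assert len(returns) % K == 0, "need a whole number of days"
    return [realized_volatility(returns[d * K:(d + 1) * K])
            for d in range(len(returns) // K)]
```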

Coming back to our illustration, the CAC40 index is computed by the exchange as a weighted measure of the prices of its components and was originally available in our database on an intraday basis with the price index being computed every 15 minutes. For the time period under review, the opening hours of the French stock market were 10.00 am to 5.00 pm, thus 7 hours of trading per day. This translates into 28 intraday returns used to compute the daily realized volatility.11 Because the exchange is closed from 5.00 pm to 10.00 am the next day, the first intraday return is the first difference between the log price at 10.15 am and the log price at 5.00 pm the day before. Then, the intraday data are used to compute the daily realized volatility as the sum of the 28 squared intraday returns as shown in Equation (5.2).

Finally, to compare the adequacy of the different distributions in the selected forecasting model, a well-known tool is the density forecasts tests developed in Diebold, Gunther, and Tay (1998). The idea of density forecasts is quite simple.12 Let $\{f_i(y_i|\Omega_i)\}_{i=1}^m$ be a sequence of m one-step-ahead density forecasts produced by a given model, where $\Omega_i$ is the conditioning information set, and $\{p_i(y_i|\Omega_i)\}_{i=1}^m$ the sequence of densities defining the data generating process of $y_i$ (which is never observed). Testing whether this density is a good approximation of the true density p(.) is equivalent to testing:

$$H_0: \; f_i(y_i|\Omega_i) = p_i(y_i|\Omega_i), \quad i = 1,\ldots,m \qquad (5.3)$$

Diebold, Gunther, and Tay (1998) use the fact that, under Equation (5.3), the probability integral transform $\hat{\xi}_i = \int_{-\infty}^{y_i} f_i(t)\,dt$ is i.i.d. U(0,1), i.e. independent and identically distributed uniform. To check $H_0$, they propose to use both a goodness-of-fit test and an independence test for i.i.d. U(0,1). The i.i.d.-ness property of $\hat{\xi}_i$ can be evaluated by plotting the correlograms of $(\hat{\xi}_i - \bar{\xi})^j$, for j = 1,2,3,4,..., to detect potential dependence in the conditional mean, variance, skewness, kurtosis, etc. Departure from uniformity can also be evaluated by plotting a histogram of $\hat{\xi}_i$. According to Bauwens, Giot, Grammig, and Veredas (2000), a humped shape of the $\hat{\xi}$-histogram would indicate that the issued forecasts are too narrow and that the tails of the true density are not accounted for. On the other hand, a U-shape of the histogram would suggest that the model issues forecasts that either under- or overestimate too frequently. Moreover, Lambert and Laurent (2001) show that an inverted S shape of the histogram would indicate that the errors are skewed, i.e. the true density is probably not symmetric.13 An illustration is provided in Section 5 with some formal tests and graphical tools.

Results

The Mincer-Zarnowitz regression and some out-of-sample density forecast tests (as suggested by Diebold, Gunther, and Tay, 1998) are also performed.

The program Forecast.ox is printed in the next box. This code has been used to produce Figure 5.1 and the output labelled “Density Forecast Test on Standardized Forecast Errors”. In the first four panels of Figure 5.1, we show the correlograms of $(\hat{\xi}_i - \bar{\xi})^j$, for j = 1,2,3,4, where $\hat{\xi}_i$ is the probability integral transform for the out-of-sample period (see the paragraph below Equation (5.3) for more details). This graphical tool has been proposed by Diebold, Gunther, and Tay (1998) to detect potential remaining dependence in the conditional mean, variance, skewness, kurtosis. In our example, it seems that the probability integral transform is independently distributed.

Forecast.ox
#import <packages/Garch6/Garch>

main() {
decl garchobj;
garchobj = new Garch();
//*** DATA ***//
garchobj.Select(Y_VAR, {"CAC40",0,0} );
garchobj.SetSelSample(-1, 1, 800, 1);
//*** SPECIFICATIONS ***//
garchobj.DISTRI(3);             // 0 for Gauss, 1 for Student, 2 for GED, 3 for Skewed-Student
garchobj.ARMA_ORDERS(1,0);      // AR order (p), MA order (q).
garchobj.GARCH_ORDERS(1,1);     // p order, q order
garchobj.MODEL(4);              // 0:RISKMETRICS  1:GARCH      2:EGARCH    3:GJR   4:APARCH
// 5:IGARCH 6:FIGARCH-BBM    7:FIGARCH-CHUNG   8:FIEGARCH
// 9:FIAPARCH-BBM   10: FIAPARCH-CHUNG  11: HYGARCH
garchobj.MLE(1);                // 0 : MLE (Second derivatives), 1 : MLE (OPG Matrix), 2 : QMLE
garchobj.Initialization(<>) ;
garchobj.DoEstimation(<>);
garchobj.Output() ;
//************************************************************************************
decl forc=<>,h,yfor=<>,shape=<>;
decl number_of_forecasts=448; // number of h_step_ahead forecasts
decl step=1;                  // specify h (h-step-ahead forecasts)
decl T=garchobj.GetcT();
println("!!! Please Wait while computing the forecasts !!!");

decl distri=garchobj.GetDistri();
if (distri==1 || distri==2)
shape=garchobj.GetValue("m_cV");
else if (distri==3)
shape=garchobj.GetValue("m_cA")|garchobj.GetValue("m_cV");
for (h=0; h<number_of_forecasts; ++h)
{
garchobj.FORECAST(1,step,0);
garchobj.SetSelSample(-1, 1, T+h, 1);
garchobj.InitData();
yfor|=garchobj.GetForcData(Y_VAR, step);
forc|=garchobj.FORECASTING();
}
decl Hfor = garchobj.GetVar("REALVOLA")[801:];  // Realized volatility
//  decl Hfor = (yfor - meanc(yfor)).^2;            // Squared returns (in deviation)
decl cd=garchobj.CD(yfor-forc[][0],forc[][1],garchobj.GetDistri(),shape);
println("Density Forecast Test on Standardized Forecast Errors");
garchobj.APGT(cd,20|30,garchobj.GetValue("m_cPar"));
garchobj.AUTO(cd, number_of_forecasts, -0.1, 0.1, 0);
garchobj.confidence_limits_uniform(cd,30,0.95,1,4);
DrawTitle(5, "Conditional variance forecast and realized volatility");
Draw(5, (Hfor~forc[][1])’);
ShowDrawWindow();
garchobj.MZ(Hfor, forc[][1], number_of_forecasts);
garchobj.FEM(forc, yfor~Hfor);
savemat("MeanFor.xls",forc[][0]);     // Saves the mean forecasts in an Excel file.
savemat("VarFor.xls",forc[][1]);      // Saves the variance forecasts in an Excel file.
delete garchobj;
}

Panel 5 of Figure 5.1 also shows the histogram (with 30 cells) of the probability integral transforms, together with the 95% confidence bands. From this figure, it is clear that the ARMA(1,0)-APARCH(1,1) model coupled with a skewed-Student distribution for the innovations performs very well on the dataset we have investigated. This conclusion is reinforced by the results reported in the Density Forecast Test on Standardized Forecast Errors box. The adjusted Pearson chi-square goodness-of-fit test provides a statistical version of the graphical test presented in Figure 5.1. In addition, the program also performs the Mincer-Zarnowitz regression (see Equation (5.1)), which regresses the observed volatility (in our case the realized volatility) on a constant and a vector of 448 one-step-ahead forecasts of the conditional variance (produced by the APARCH model).14 These results suggest that the APARCH model gives good forecasts of the conditional variance. Based on the significance of the β parameter of the regression, one can indeed hardly conclude that the APARCH model provides biased forecasts. Moreover, the R2 of this regression is about 33%, versus about 14% when we use the daily squared returns (in deviation) as a proxy of the observed volatility.15

Density Forecast Test on Standardized Forecast Errors

Cells(g)  Statistic      P-Value(g-1)     P-Value(g-k-1)
20       21.0179         0.335815          0.020969
30       26.5089         0.598181          0.149654
Rem.: k = 9 = Number of estimated parameters
---------------
Probability Integral Transform The Lower and Upper bounds are
0.46875 and 1.54018

Mincer-Zarnowitz regression on the forecasted volatility
Coefficient  Std.Error  t-value  t-prob
Alpha                0.027698    0.23253   0.1191  0.9052
Beta                 1.239532    0.16052    7.722  0.0000
R2: 0.329662 Note: S.E. are
Heteroskedastic Consistent (White, 1980)
---------------
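The regression behind this output is ordinary least squares of the realized volatility on a constant and the variance forecasts. A minimal illustration in Python on synthetic data (the series below is made up for illustration, not the Nasdaq results):

```python
# Minimal Mincer-Zarnowitz regression sketch: regress "realized" volatility
# on a constant and the variance forecasts.  Unbiased forecasts correspond
# to alpha = 0 and beta = 1.  Synthetic data, not the Nasdaq series.
import random

random.seed(42)

n = 448
forecast = [0.5 + 0.4 * random.random() for _ in range(n)]           # variance forecasts
realized = [0.1 + 1.0 * f + random.gauss(0, 0.2) for f in forecast]  # noisy realized proxy

# OLS of realized on [1, forecast]
mf = sum(forecast) / n
mr = sum(realized) / n
sxx = sum((f - mf) ** 2 for f in forecast)
sxy = sum((f - mf) * (r - mr) for f, r in zip(forecast, realized))
beta = sxy / sxx
alpha = mr - beta * mf
resid = [r - alpha - beta * f for f, r in zip(forecast, realized)]
r2 = 1 - sum(e ** 2 for e in resid) / sum((r - mr) ** 2 for r in realized)
```

A low R2 in this regression typically signals a noisy volatility proxy (such as squared returns) rather than bad forecasts, which is why the realized-volatility proxy gives the higher R2 reported above.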

#### 5.3.2 Imposing Nonlinear Constraints

As mentioned in Section 3.9, the inspected range of the parameter space is unbounded when numerical optimization is used to maximize the log-likelihood function with respect to the vector of parameters Ψ. Sometimes these parameters have to be constrained to a smaller interval.

We have shown how to impose these lower and upper bounds and how to change these bounds in the OxPack version of G@RCH (see Section 4.9).

The same feature is available for the Console version. Furthermore, the Console version allows the user to impose any nonlinear constraint on the parameters (like the stationarity constraints).

The example StartValues_Bounds.ox shows how to print the code needed to change the starting values and the bounds for an ARMA(0,0)-GARCH(1,1) with a Student likelihood.

First Step : StartValues_Bounds.ox
#import <packages/Garch6/Garch>

StartValues(const object)
{
object.GetPara();                                   // DO NOT REMOVE THIS LINE
object.Initialization(object.GetValue("m_vPar"));   // DO NOT REMOVE THIS LINE
}

main()
{
decl garchobj;
garchobj = new Garch();
//*** DATA ***//
garchobj.Info();
garchobj.Select(Y_VAR, {"Nasdaq",0,0});
garchobj.SetSelSample(-1, 1, 2000, 1);
//*** SPECIFICATIONS ***//
garchobj.CSTS(1,1);             // cst in Mean (1 or 0), cst in Variance (1 or 0)
garchobj.DISTRI(1);             // 0 for Gauss, 1 for Student, 2 for GED, 3 for Skewed-Student
garchobj.ARMA_ORDERS(0,0);      // AR order (p), MA order (q).
garchobj.GARCH_ORDERS(1,1);     // p order, q order
garchobj.MODEL("GARCH");        //  0:RISKMETRICS  1:GARCH     2:EGARCH    3:GJR   4:APARCH
//  5:IGARCH 6:FIGARCH-BBM   7:FIGARCH-CHUNG   8:FIEGARCH
//  9:FIAPARCH-BBM  10: FIAPARCH-CHUNG  11: HYGARCH
//*** OUTPUT ***//
garchobj.MLE(2);                // 0 : MLE (Second derivatives), 1 : MLE (OPG Matrix), 2 : QMLE
//*** PARAMETERS ***//
garchobj.BOUNDS(1);             // 1 if bounded parameters wanted, 0 otherwise
//*** ESTIMATION ***//
garchobj.Initialization(<>);
StartValues(garchobj);
garchobj.PrintBounds(2);          // 1: Prints the bounds in a matrix form,
// 2: to write the Ox code to use in StartValues
garchobj.PrintStartValues(3);     // 1: Prints the S.V. in a table form;
// 2: Individually;  3: in a Ox code to use in StartValues
exit(0);
garchobj.DoEstimation(<>);
garchobj.Output();
delete garchobj;
}

Let us focus now on the following lines:

//*** PARAMETERS ***//
garchobj.BOUNDS(1);               // 1 if bounded parameters wanted, 0 otherwise
//*** ESTIMATION ***//
garchobj.Initialization(<>);
StartValues(garchobj);
garchobj.PrintBounds(2);          // 1: Prints the bounds in a matrix form,
// 2: to write the Ox code to use in StartValues
garchobj.PrintStartValues(3);     // 1: Prints the S.V. in a table form;
// 2: Individually;  3: in a Ox code to use in StartValues
exit(0);

The function BOUNDS(1) selects the constrained optimizer. PrintBounds(2) prints the code needed to change the bounds, while PrintStartValues(3) prints the code needed to change the starting values. The function exit(0) stops the program before the estimation starts, so that the starting values and the bounds can be modified first.
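One standard way to implement such box constraints, sketched here in Python as an illustration of the general technique (not necessarily how G@RCH's optimizer does it internally), is to optimize over an unconstrained value and map it into the interval with a logistic transform:

```python
import math

def to_bounded(x, lo, hi):
    """Map an unconstrained real x into the open interval (lo, hi)."""
    return lo + (hi - lo) / (1.0 + math.exp(-x))

def to_unbounded(p, lo, hi):
    """Inverse map: recover the unconstrained value from p in (lo, hi)."""
    return math.log((p - lo) / (hi - p))

# The optimizer iterates on x while the likelihood is evaluated at
# to_bounded(x, 0, 100), so a parameter bounded like m_calpha0 can
# never leave its [0, 100] interval during the search.
```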

The output labelled First Step : Output indicates that the default starting value of the degrees-of-freedom parameter of the Student density (m_cV) is 6 and that m_calpha0 (i.e. ω, the constant of the GARCH model) is bounded between 0 and 100.

First Step : Output
Bounds
======
object.SetBounds("m_clevel", -100, 100);
object.SetBounds("m_calpha0", 0, 100);
object.SetBounds("m_valphav", 0, 1);
object.SetBounds("m_vbetav", 0, 1);
object.SetBounds("m_cV", 2, 100);
-------------------------------------
Starting Values
===============
object.SetStartValue("m_clevel",0.01);
object.SetStartValue("m_calpha0",0.05);
object.SetStartValue("m_valphav",<0.1>);
object.SetStartValue("m_vbetav",<0.8>);
object.SetStartValue("m_cV",6);

To change the starting value of m_cV to, say, 5 and to set the bounds of m_calpha0 to [-100;100] (instead of [0;100]), just copy-paste the lines

object.SetBounds("m_calpha0", 0, 100);
object.SetStartValue("m_cV",6);

at the top of the function StartValues(const object) and change the values as shown below:16

#import <packages/Garch6/Garch>

StartValues(const object)
{
object.SetBounds("m_calpha0", -100, 100);
object.SetStartValue("m_cV",5);
object.GetPara();
object.Initialization(object.GetValue("m_vPar"));
}

Furthermore, nonlinear constraints (like the stationarity constraint of the GARCH(1,1) model, i.e. α1 + β1 < 1) can be imposed during the estimation in the Console version. The example Stat_Constr_GARCH shows how to impose this restriction.

Stat_Constr_GARCH.ox

#import <packages/Garch6/Garch>

StartValues(const object)
{
object.GetPara();                                   // DO NOT REMOVE THIS LINE
object.Initialization(object.GetValue("m_vPar"));   // DO NOT REMOVE THIS LINE
}

class GARCH_C : Garch                                   // CONSTRUCTOR
{                                                       // This function defines
GARCH_C();                                          // a new class called GARCH_C
cfunc_gt0(const avF, const vP);                     // and launches the cfunc_gt0 function
};

GARCH_C::GARCH_C()
{
this.Garch();                                       // This function defines GARCH_C
m_iModelClass = MC_GARCH;                           // as a Garch object that thus inherits
Init_Globals();                                     // all the properties of Garch
}

GARCH_C::cfunc_gt0(const avF, const vP)
{
SplitPara(vP);  // Do not remove this line
decl matconstr=new matrix[1][1];
matconstr[0] = 1.000001 - m_valphav[0] - m_vbetav[0];     // alpha_1 + beta_1 < 1
avF[0] = matconstr;
return 1;
}

main()
{
decl garchobj;
garchobj = new GARCH_C();
...
garchobj.GARCH_ORDERS(1,1);     // p order, q order
garchobj.MODEL(1);              //  0:RISKMETRICS  1:GARCH     2:EGARCH    3:GJR   4:APARCH
//  5:IGARCH 6:FIGARCH(BBM) 7:FIGARCH(Chung) 8:FIEGARCH(BM only)
//  9:FIAPARCH(BBM) 10: FIAPARCH(Chung) 11: HYGARCH(BBM)
//*** OUTPUT ***//
garchobj.MLE(1);                // 0 : both, 1 : MLE, 2 : QMLE
//*** PARAMETERS ***//
garchobj.BOUNDS(1);             // 1 if bounded parameters wanted, 0 otherwise
//*** ESTIMATION ***//
garchobj.Initialization(<>) ;
garchobj.DoEstimation() ;
garchobj.Output() ;
delete garchobj;
}

In Stat_Constr_GARCH, a new class GARCH_C is created that inherits all the members and functions of the Garch class. The option BOUNDS(1) is selected and the virtual function cfunc_gt0 is replaced by a function that imposes the corresponding restrictions:

GARCH_C::cfunc_gt0(const avF, const vP)
{
SplitPara(vP);  // Do not remove this line
decl matconstr=new matrix[1][1];
matconstr[0] = 1.000001 - m_valphav[0] - m_vbetav[0];
// alpha_1 + beta_1 < 1
avF[0] = matconstr;
return 1;
}

In this case the nonlinear constraint α1 + β1 < 1 is evaluated by the cfunc_gt0 function.17

Indeed, the cfunc_gt0 argument can be zero, or a function evaluating the nonlinear constraints (which will be constrained to be positive) with the following format:

cfunc_gt0(const avF, const vP);
avF
out: m x 1 matrix with constraints at vP
vP
in: p x 1 matrix with coefficients
returns
1: successful, 0: constraint evaluation failed
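The calling convention is easy to mimic in other languages. A hypothetical Python analogue of the stationarity check (the constraint value must come out positive for an admissible parameter vector, mirroring the role of cfunc_gt0):

```python
def cfunc_gt0(alpha1, beta1, eps=1e-6):
    """Return the constraint values that the optimizer keeps positive.

    Here a single constraint: 1 - alpha_1 - beta_1 > 0, i.e. the
    GARCH(1,1) covariance-stationarity condition, with a small eps
    margin as in the Ox listing above.
    """
    return [1.0 + eps - alpha1 - beta1]

# (0.1, 0.8) is stationary: the constraint value is positive.
# (0.5, 0.6) is not: the constraint value is negative, so the
# optimizer would reject that point.
```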

In Stat_Constr_FIGARCH.ox, the stationarity constraints of the FIGARCH(1,d,0) and FIGARCH(1,d,1) models are imposed during the estimation. Interestingly, imposing the constraint was found to speed up the estimation by about 25% (while the estimates are exactly the same without imposing it).

Stat_Constr_FIGARCH.ox
...
GARCH_C::cfunc_gt0(const avF, const vP)
{
SplitPara(vP);  // Do not remove this line
decl matconstr;
if ((m_cMod==6)&&(m_cP == 1)&&(m_cQ < 2))         // FIGARCH-BBM
{
matconstr=new matrix[3][1];
decl a,b,c,d;
// Remark: FIGARCH(1,d,0) --> m_valphav = 0
a = m_vbetav - m_dD;
b = ((2-m_dD)/3);
c = m_dD * (m_valphav - ((1-m_dD)/2));
d = m_vbetav * (m_valphav - m_vbetav + m_dD);

matconstr[0][0] =  m_valphav - a + 0.000001;  // beta_1 - d <= phi_1
matconstr[1][0] =  b - m_valphav + 0.000001;  // phi_1 <= (2-d)/3
matconstr[2][0] =  d - c + 0.000001;          // d*[phi_1-(1-d)/2] <= beta_1*(phi_1-beta_1+d)
}
else if ((m_cMod==7)&&(m_cP == 1)&&(m_cQ < 2))    // FIGARCH-Chung
{
// Remark: FIGARCH(1,d,0) --> m_valphav = 0
if (m_cQ == 0)
{
matconstr = new matrix[3][1];
matconstr[0][0] =  m_vbetav;              // 0 <= beta_1
}
else
{
matconstr = new matrix[4][1];
matconstr[0][0] =  m_vbetav - m_valphav;  // phi_1 <= beta_1
matconstr[3][0] =  m_valphav;             // 0 <= phi_1
}
matconstr[1][0] =  m_dD - m_vbetav;           // beta_1 <= d
matconstr[2][0] =  1 - m_dD;                  // d <= 1
}
avF[0] = matconstr;
return 1;
}
...
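Stripped of G@RCH's member matrices, the Chung branch above simply checks the ordering 0 ≤ φ1 ≤ β1 ≤ d ≤ 1. A hypothetical Python rendering of the same checks, using plain scalars:

```python
def figarch_chung_constraints(phi1, beta1, d):
    """Constraint values (all must be positive) for FIGARCH(1,d,1), Chung version.

    Encodes the ordering 0 <= phi_1 <= beta_1 <= d <= 1 from the Ox
    listing above; for FIGARCH(1,d,0) one would simply set phi1 = 0.
    """
    return [beta1 - phi1,   # phi_1 <= beta_1
            d - beta1,      # beta_1 <= d
            1.0 - d,        # d <= 1
            phi1]           # 0 <= phi_1

# (0.2, 0.4, 0.6) satisfies the ordering; (0.5, 0.4, 0.6)
# violates phi_1 <= beta_1 and yields a negative constraint value.
```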

### 5.4 G@RCH and OxGauss

OxGauss is a program available with recent versions of Ox. It provides a way to run Gauss18 programs in the Ox environment or to call an existing Gauss procedure from Ox.

Depending on the goal of the analysis and the user’s experience, both features are noteworthy and useful. From an Ox-G@RCH user point of view, the main objective of OxGauss is to allow existing Gauss programs to be called from Ox with only a minimum number of changes to these programs. This is beneficial to both Ox and Gauss users. It provides more visibility to both and hence increases the potential use of the underlying statistical technique. Furthermore, it can help with the migration from Gauss to Ox.

Running a pure Gauss code with OxGauss is attractive for non-Gauss and potentially even for non-Ox users because it allows the replication of published work using the console version of Ox. This is an interesting feature since the replicability of simulation and empirical results in econometrics is recognized as being an important aspect of research. An increasing number of researchers in econometrics are making their programs and routines freely available to the econometrics community. As such, OxGauss also provides much added value in that it provides the researcher with a free and rather simple solution to run Gauss programs. See Laurent and Urbain (2003) for a review of OxGauss and the M@ximize web site at http://www.core.ucl.ac.be/~laurent/.19

The main goal of this section is to illustrate the usefulness of OxGauss for G@RCH users.

#### 5.4.1 Calling Gauss Programs from Ox

The first use of OxGauss is to allow Gauss procedures to be called from Ox. This helps in the transition to Ox, and it increases the amount of code available to Ox users.

To illustrate how Gauss programs can be called from Ox, we consider a small project that mixes both Gauss and Ox programs. The first file, Gaussprocs.src, consists of a code file that features the procedure gengarch(omega,alpha,beta,nu,T_0,T,n), which simulates a GARCH model. This procedure has been written by Dick van Dijk (see Franses and van Dijk, 2000) and is downloadable from his web site
http://www.few.eur.nl/few/people/djvandijk/nltsmef/nltsmef.htm.

/*
**  gengarch.src (written by dick van dijk)
**  purpose :  generate realizations from garch(p,q) processes
**  format : eps = gengarch(omega,alpha,beta,t_0,t,n);
**
**  input : omega   : scalar, constant in conditional variance equation
**          alpha   : (qx1) vector, parameters of lagged squared residuals
**                    in conditional variance equation
**          beta    : (px1) vector, parameters of lagged conditional variance
**                    in conditional variance equation
**          nu      : scalar, indicating whether disturbances are to be
**                    drawn from standard normal distribution (nu=0), or from
**                    t-distribution with nu degrees of freedom
**          t_0     : scalar, number of initial observations to be discarded
**          t       : scalar, the length of the series to be generated
**          n       : scalar, the number of generated series
**
**  output  : eps  : t x n, the generated series
*/
proc gengarch(omega,alpha,beta,nu,t_0,t,n);
local p,q,eps_1,h_1,eps,i,z;

p=rows(beta); q=rows(alpha);

if (rows(n) gt 1);
z=n;
n=cols(n);
endif;

/* starting values for lagged residuals are set equal to zero, while lagged
conditional variances are set equal to unconditional variance */
eps_1=zeros(q+1,n);
if (sumc(alpha)+sumc(beta) /= 1);
h_1=ones(p+1,n)*(omega/(1-sumc(alpha)-sumc(beta)));
else;
h_1=ones(p+1,n);
endif;
alpha=alpha|0;
beta=beta|0;

eps=zeros(t_0+t,n);
i=0;
do until (i==t_0+t);
i=i+1;
h_1=(omega + alpha'(eps_1.*eps_1) + beta'h_1)|trimr(h_1,0,1);

if (nu==-1);
eps_1=(sqrt(h_1[1,.]).*z[i,.])|trimr(eps_1,0,1);
elseif (nu==0);
eps_1=(sqrt(h_1[1,.]).*rndn(1,n))|trimr(eps_1,0,1);
else;
eps_1=(sqrt((nu-2)/nu).*sqrt(h_1[1,.]).*rndn(1,n)./
sqrt(sumc(rndn(nu,n).^2)'./nu))|trimr(eps_1,0,1);
endif;
eps[i,.]=eps_1[1,.];
endo;

retp(eps[t_0+1:t_0+t,.]);
endp;
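For readers without Gauss, the core recursion is easy to reproduce. A minimal Python sketch of a GARCH(1,1) with normal innovations (simplified relative to gengarch, which also handles higher orders and Student-t draws):

```python
import math
import random

def sim_garch11(omega, alpha, beta, t_burn, t, seed=0):
    """Simulate t draws from a GARCH(1,1), discarding t_burn initial draws.

    h_t = omega + alpha*eps_{t-1}^2 + beta*h_{t-1},  eps_t = sqrt(h_t)*z_t,
    with z_t standard normal and h started at the unconditional variance.
    """
    rng = random.Random(seed)
    h = omega / (1.0 - alpha - beta)   # unconditional variance as start value
    eps_prev = 0.0
    out = []
    for i in range(t_burn + t):
        h = omega + alpha * eps_prev ** 2 + beta * h
        eps_prev = math.sqrt(h) * rng.gauss(0, 1)
        if i >= t_burn:
            out.append(eps_prev)
    return out

y = sim_garch11(0.2, 0.1, 0.8, 1000, 20000)
# sample variance should be close to omega/(1-alpha-beta) = 2.0
```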

To call this procedure from Ox code, one first has to create a header file. This header file declares the functions, constants and external variables so that they are known when required. This is mandatory to avoid compilation errors in Ox, since functions and global variables have to be explicitly declared before use. The header file corresponding to our example is Gaussprocs.h.

Gaussprocs.h
#include <oxstd.h>
namespace gauss
{
gengarch(const omega,const alpha,const beta,const nu,const T_0,
const T, const n);
}

Additional procedures can be added in Gaussprocs.src, but the header file has to be modified accordingly.20 It is recommended to use the .src extension for the Gauss programs and .h for the header files.

In the example /packages/OxGauss/GarchEstim.ox, we use a Gauss procedure to generate 20,000 observations from a GARCH(1,1) process with Student-t errors. Then, we rely on G@RCH to estimate a GARCH(1,1) model by Gaussian Quasi-Maximum likelihood. To do this, the Gauss code must be imported into the Ox program, along with the G@RCH package. The #import command has been extended so that OxGauss imports are defined by prefixing the file name with gauss::.

GarchEstim.ox
#include <oxstd.h>
#import <packages/Garch6/Garch>
#import "gauss::Gaussprocs"                                         //  <---
main()
{
decl omega=0.2; decl alpha=0.1; decl beta=0.8; decl nu=10;
decl T_0=1000; decl T=20000; decl n=1;
decl y=gauss::gengarch(omega,alpha,beta,nu,T_0,T,n);            //  <---
decl garchobj;
garchobj = new Garch();
garchobj.Create(1, 1, 1, T, 1);
garchobj.Append(y, "Y");
garchobj.Select(Y_VAR, {"Y",0,0} );
garchobj.SetSelSample(-1, 1, -1, 1);
garchobj.DISTRI(0);             // 0 for Normal
garchobj.GARCH_ORDERS(1,1);     // p order, q order
garchobj.MODEL(1);              // 1: GARCH
garchobj.Initialization(<>);
garchobj.DoEstimation(<>);
garchobj.Output();
delete garchobj;
}

Note that when OxGauss functions or variables are accessed, they must also be prefixed with the identifier gauss::.

#### 5.4.2 Understanding OxGauss

When an OxGauss program is run, it automatically includes the ox\include\oxgauss.ox file. This file in turn imports the required files:21

\include\oxgauss.ox
/*--------------------------------------------------------------------------
* oxgauss.ox - master file for functions mapped from Gauss to Ox
*
*       (C) Jurgen Doornik 2000-2001
*
* this file is automatically inserted when OxGauss code is run
*--------------------------------------------------------------------------*/
#define OX_GAUSS
#import <g2ox>
#import <gauss::oxgauss>

These import statements ensure that ox\include\g2ox.h and oxgauss.h are being included. Most of the OxGauss run-time system is in ox\include\g2ox.ox while the keywords are largely in oxgauss.src.

Most of the programs that link Gauss functions to Ox are gathered in the file \include\g2ox.ox. For instance, the output of the Gauss function cumprodc(x) is an N ×K matrix with the cumulative products of the columns of the N ×K matrix x. The Ox code given below (copied from the file g2ox.ox) shows how OxGauss interprets this function.

part of g2ox.ox
cumprodc(const mx)
{
return ::cumprod(mx);
}

As indicated in this example, OxGauss does not translate the Gauss code into Ox. Instead, it makes a link between the Gauss function (here cumprodc) and its Ox counterpart (cumprod). When no corresponding Ox function exists, Ox code that computes what the original Gauss function computed is written between the braces. It is important to note that not all Gauss functions are supported by OxGauss. For instance, there is no equivalent of the Gauss function intgrat2 (for the computation of double integrals) in Ox 5. For this reason, the corresponding procedure in g2ox.ox just reports the error message intgrat2() unsupported (see below). However, if such a function becomes available in a future version of Ox, mapping intgrat2 to the corresponding function in Ox will be very easy.

#### 5.4.3 Graphics Support in OxGauss

An important aspect of OxGauss is that it supports most of the graphical features of the Gauss library pgraph. As for the standard functions (see Section 5.4.2), the file \oxgauss\src\pgraph.ox now makes the link between pgraph and the Ox graphical package (oxdraw). For instance, the Gauss function xy() is linked to its Ox counterpart DrawXMatrix().

Here is an example of a Gauss program (grGauss.prg) that draws a simple graph.

grGauss.prg
library pgraph;
x=seqa(1,1,1000);
y=rndn(1000,1);
xlabel("X-axis");
ylabel("Y-axis: Normal(0,1) draws");
call xy(x, y);
end;

However, only Ox Professional for Windows supports on-screen graphics (through OxMetrics). By default, non-Windows versions of Ox and Ox Console have no support for graphs. Nevertheless, the user can rely on the Ox package GnuDraw developed by Charles Bos that allows the creation of GnuPlot (see http://www.gnuplot.info) graphics from Ox (see Section 5.2.2). Interestingly, GnuDraw allows the use of the console version for a quick check of the graphical output of Gauss code. Therefore, academic institutions do not have to license the full professional version of Ox if Gauss programs only need to be replicated.

## Chapter 6 Value-at-Risk (VaR) estimation using G@RCH

In recent years, the tremendous growth of trading activity and the widely publicized trading losses of well-known financial institutions (see Jorion, 2000, for a brief history of these events) have led financial regulators and supervisory authorities to favor quantitative techniques which appraise the possible loss that these institutions can incur. Value-at-Risk has become one of the most sought-after techniques as it provides a simple answer to the following question: with a given probability (say α), what is my predicted financial loss over a given time horizon? The answer is the VaR at level α, which gives an amount in the currency of the traded assets (in dollar terms for example) and is thus easily understandable.

It turns out that the VaR has a simple statistical definition: the VaR at level α for a sample of returns is defined as the corresponding empirical quantile at α%. Because of the definition of the quantile, we have that, with probability 1 − α, the returns will be larger than the VaR. In other words, with probability 1 − α, the losses will be smaller than the dollar amount given by the VaR.1 From an empirical point of view, the computation of the VaR for a collection of returns thus requires the computation of the empirical quantile at level α of the distribution of the returns of the portfolio.
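This definition translates directly into code. A minimal sketch of the empirical-quantile VaR on a toy return sample (a simple non-interpolated quantile; production code would interpolate):

```python
def empirical_var(returns, alpha):
    """One-sided empirical VaR: the alpha-quantile of the return sample.

    With probability 1 - alpha, realized losses are smaller (in absolute
    value) than the VaR returned here (a negative number for the long side).
    """
    s = sorted(returns)
    k = max(0, min(len(s) - 1, int(alpha * len(s))))  # non-interpolated quantile
    return s[k]

returns = [-3.0, -1.5, -0.8, -0.2, 0.1, 0.4, 0.9, 1.3, 2.0, 2.5]
# At alpha = 0.10 the long VaR is the worst-10% cutoff of this sample: -1.5
```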

In this section, we show how to use G@RCH to predict the VaR and test the adequacy of the selected model in forecasting the VaR of the investigated series. The material used in the current section is based on Giot and Laurent (2003) and is devoted to showing that models relying on a symmetric density for the error term can underperform skewed density models when both the left and right tails of the distribution of returns must be modelled. Thus, the VaR for traders having both long and short positions is not adequately modelled using the usual normal or Student distributions. We suggest using an APARCH model based on the skewed-Student distribution to fully take into account the fat left and right tails of the returns distribution. This allows an adequate modelling of large returns defined on long and short trading positions. The performance of the model is assessed on daily data of the NASDAQ (11/10/1984 - 21/12/2000).

### 6.1 VaR Models

To characterize the models, we consider a collection of daily returns (in %), yt = 100 [ln(pt) − ln(pt−1)], where t = 1,…,T, and pt is the price at time t. Because daily returns are known to exhibit some serial autocorrelation, we fit an AR(2) structure on the yt series for all specifications. Accordingly, the conditional mean of yt, i.e. μt, is equal to μ + ψ1(yt−1 − μ) + ψ2(yt−2 − μ). We now consider several specifications for the conditional variance of εt.

#### 6.1.1 RiskMetricsTM

In its simplest form, it can be shown that the basic RiskMetricsTM model is equivalent to a normal Integrated GARCH (IGARCH) model where the autoregressive parameter is set at a pre-specified value λ and the coefficient of ε²t−1 is equal to 1 − λ. In the RiskMetricsTM specification for daily data, λ is fixed to 0.94 and we then have:

 yt = μt + εt,  εt = ztσt   (6.1)

where zt is IID N(0,1) and σt2 is defined as in Equation (4.14).
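In practice the RiskMetricsTM variance is just an exponentially weighted moving average of squared residuals. A minimal sketch, assuming the standard recursion σ²t = λσ²t−1 + (1 − λ)ε²t−1 described above:

```python
def riskmetrics_variance(eps, lam=0.94):
    """EWMA conditional variance: sigma2_t = lam*sigma2_{t-1} + (1-lam)*eps_{t-1}^2.

    Returns one sigma2 per observation; sigma2[0] is initialized with the
    first squared residual (one common initialization choice).
    """
    sigma2 = [eps[0] ** 2]
    for e in eps[:-1]:
        sigma2.append(lam * sigma2[-1] + (1 - lam) * e ** 2)
    return sigma2

# A single large shock raises the variance, which then decays at rate lam.
```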

The long side of the daily VaR is defined as the VaR level for traders having long positions in the relevant equity index: this is the “usual” VaR where traders incur losses when negative returns are observed. Correspondingly, the short side of the daily VaR is the VaR level for traders having short positions, i.e. traders who incur losses when stock prices increase.2

How good a model is at predicting long VaR is thus related to its ability to model large negative returns, while its performance regarding the short side of the VaR is based on its ability to take into account large positive returns.

For the RiskMetricsTM model, the one-step-ahead VaR computed at time t − 1 for long trading positions is given by μt + zασt; for short trading positions it is equal to μt + z1−ασt, with zα being the left quantile at α% of the normal distribution and z1−α the right quantile at α%.3
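These two formulas are one-liners once a normal quantile is available; a sketch using the Python standard library (the μt and σt values below are hypothetical inputs):

```python
from statistics import NormalDist

def var_normal(mu_t, sigma_t, alpha):
    """One-step-ahead VaR under normality: mu_t + z_alpha * sigma_t."""
    z = NormalDist().inv_cdf(alpha)        # left quantile, e.g. about -1.645 at 5%
    return mu_t + z * sigma_t

long_var = var_normal(0.05, 1.2, 0.05)     # long side: left 5% quantile
short_var = var_normal(0.05, 1.2, 0.95)    # short side: right 5% quantile
# By symmetry of the normal, the two VaRs are equidistant from mu_t.
```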

#### 6.1.2 Normal APARCH

The normal APARCH (Ding, Granger, and Engle, 1993) is an extension of the GARCH model of Bollerslev (1986). It is a very flexible ARCH-type model as it nests at least seven GARCH specifications (see Section 4.4). The APARCH(1,1) is:

 σtδ = ω + α1(|εt−1| − γ1εt−1)δ + β1σt−1δ   (6.2)

where ω, α1, γ1, β1 and δ are parameters to be estimated.

For the normal APARCH model, the one-step-ahead VaR is computed as for the RiskMetricsTM model, except that the conditional standard deviation σt is now given by Equation (6.2) (evaluated at its MLE).

#### 6.1.3 Student APARCH

Previous empirical studies on VaR have shown that models based on the normal distribution usually cannot fully take into account the ‘fat tails’ of the distribution of the returns. To alleviate this problem, the Student APARCH (or ST APARCH) is introduced:

 yt = μt + εt,  εt = ztσt   (6.3)

where zt is IID t(0,1,υ) and σt is defined as in Equation (6.2).

For the Student APARCH model, the VaR for long and short positions is given by μt + stα,υσt and μt + st1-α,υσt, with stα,υ being the left quantile at α% for the (standardized) Student distribution with (estimated) number of degrees of freedom υ and st1-α,υ is the right quantile at α% for this same distribution. Note that because zα = -z1-α for the normal distribution and stα,υ = -st1-α,υ for the Student distribution, the forecasted long and short VaR will be equal in both cases.

#### 6.1.4 Skewed-Student APARCH

To account for the excess skewness and kurtosis, Fernández and Steel (1998) propose to extend the Student distribution by adding a skewness parameter.4 Their procedure allows the introduction of skewness in any continuous unimodal and symmetric (about zero) distribution g(.) by changing the scale on each side of the mode. The main drawback of this density is that it is expressed in terms of the mode and the dispersion. In order to stay in the ARCH ‘tradition’, Lambert and Laurent (2001) re-expressed the skewed-Student density in terms of the mean and the variance, i.e. they re-parameterized this density in such a way that the innovation process has zero mean and unit variance. Otherwise, it would be difficult to separate the fluctuations in the mean and variance from the fluctuations in the shape of the conditional density (Hansen, 1994).

The innovation process z is said to be (standardized) skewed-Student distributed if:

 f(zt|ξ,υ) = [2/(ξ + 1/ξ)] s g[ξ(szt + m)|υ]   if zt < −m/s
 f(zt|ξ,υ) = [2/(ξ + 1/ξ)] s g[(szt + m)/ξ|υ]  if zt ≥ −m/s   (6.4)

where g(.|υ) is the symmetric (unit variance) Student density and ξ is the asymmetry coefficient.5 m and s2 are respectively the mean and the variance of the non-standardized skewed-Student and are defined in Section 3.6.2.

Notice also that the density f(zt|1/ξ,υ) is the “mirror” of f(zt|ξ,υ) with respect to the (zero) mean, i.e. f(zt|1/ξ,υ) = f(−zt|ξ,υ). Therefore, the sign of log(ξ) indicates the direction of the skewness: the third moment is positive (negative), and the density is skewed to the right (left), if log(ξ) > 0 (< 0).

Lambert and Laurent (2000) show that the quantile function skst*α,υ,ξ of a non-standardized skewed-Student density is:

 skst*α,υ,ξ = (1/ξ) st[α(1+ξ²)/2],υ     if α < 1/(1+ξ²)
 skst*α,υ,ξ = −ξ st[(1−α)(1+ξ⁻²)/2],υ   if α ≥ 1/(1+ξ²)   (6.5)

where stα,υ is the quantile function of the (unit variance) Student-t density. It is then straightforward to obtain the quantile function of the standardized skewed-Student: skstα,υ,ξ = (skst*α,υ,ξ − m)/s.6

For the skewed-Student APARCH model, the VaR for long and short positions is given by μt + skstα,υ,ξσt and μt + skst1-α,υ,ξσt, with skstα,υ,ξ being the left quantile at α% for the skewed-Student distribution with υ degrees of freedom and asymmetry coefficient ξ; skst1-α,υ,ξ is the corresponding right quantile. If log(ξ) is smaller than zero (or ξ < 1), |skstα,υ,ξ| > |skst1-α,υ,ξ| and the VaR for long trading positions will be larger (for the same conditional variance) than the VaR for short trading positions. When log(ξ) is positive, we have the opposite result.
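The skewed quantile construction of Lambert and Laurent (2000) is mechanical once a symmetric base quantile is available: the left branch evaluates (1/ξ)·st at α(1+ξ²)/2 and the right branch −ξ·st at (1−α)(1+ξ⁻²)/2. A sketch of the ξ mechanics using the normal quantile from the Python standard library as a stand-in for the Student quantile stα,υ (the logic is identical with a Student base):

```python
from statistics import NormalDist

def skewed_quantile(alpha, xi, base_inv_cdf=NormalDist().inv_cdf):
    """Non-standardized skewed quantile: xi < 1 (log xi < 0) fattens the left tail."""
    if alpha < 1.0 / (1.0 + xi ** 2):
        return (1.0 / xi) * base_inv_cdf(0.5 * alpha * (1.0 + xi ** 2))
    return -xi * base_inv_cdf(0.5 * (1.0 - alpha) * (1.0 + xi ** -2))

# xi = 1 recovers the symmetric quantile; xi < 1 makes the left quantile
# larger in absolute value than the right one, hence a larger long VaR.
```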

### 6.2 Application

#### 6.2.1 Model for VaR assessment

The first step of the application consists in estimating the various models. Then, the adequacy in forecasting the in-sample VaR will be tested. The code used to perform this illustration is given in the file VaR_insample.ox. The first model to be estimated is an ARMA(2,0)-RiskMetricsTM model (with a normal density). The second, third and fourth are ARMA(2,0)-APARCH(1,1) models with respectively a normal, Student and skewed-Student densities. The results are given in the four boxes below. Several comments are in order:

• the autoregressive effect in the volatility specification is strong as β1 is around 0.8, suggesting strong memory effects. Indeed, α1E(|z| − γ1z)δ + β1 is around 1 (0.944317 and 0.963684 respectively for the N-APARCH and SKST-APARCH).
• γ1 is positive and significant, indicating a leverage effect for negative returns in the conditional variance specification;
• log(ξ) (or “Asymmetry” in the output) is negative and significant which implies that the asymmetry in the Student distribution is needed to fully model the distribution of returns. Likelihood ratio tests also clearly favor the skewed-Student density compared to the normal and the Student.
• δ is about 1.35, significantly different from 2 but not significantly different from 1. This suggests that, instead of modelling the conditional variance (GARCH), it is more relevant in this case to model the conditional standard deviation. This result is in line with those of Taylor (1986), Schwert (1990) and Ding, Granger, and Engle (1993) who indicate that there is substantially more correlation between absolute returns than squared returns, a stylized fact of high frequency financial returns (often called ‘long memory’).
• Comparing the likelihoods of the RiskMetricsTM and the N-APARCH models suggests that the additional flexibility of the APARCH specification is empirically relevant (it increases the likelihood by more than 100 points!).

To summarize, these results indicate the need for a model featuring a negative leverage effect (conditional asymmetry) in the conditional variance combined with an asymmetric distribution for the underlying error term (unconditional asymmetry). The skewed-Student APARCH model delivers such a specification.

Does this mean that this model improves on symmetric GARCH models when the VaR for long and short positions is needed?

VaR_insample.ox
#import <packages/Garch5/Garch>

main()
{
decl garchobj;
garchobj = new Garch();

//*** DATA ***//
garchobj.Select(Y_VAR, {"Nasdaq",0,0} );
garchobj.SetSelSample(1, 1, 2000, 1);
garchobj.Info();

//*** SPECIFICATIONS ***//
garchobj.CSTS(1,1);             // cst in Mean (1 or 0), cst in Variance (1 or 0)
garchobj.DISTRI(0);             // 0 for Gauss, 1 for Student, 2 for GED, 3 for Skewed-Student
garchobj.ARMA_ORDERS(2,0);      // AR order (p), MA order (q).
garchobj.ARFIMA(0);             // 1 if Arfima wanted, 0 otherwise
garchobj.GARCH_ORDERS(1,1);     // p order, q order
garchobj.MODEL(0);              //  0:RISKMETRICS  1:GARCH     2:EGARCH    3:GJR   4:APARCH
//  5:IGARCH 6:FIGARCH(BBM) 7:FIGARCH(Chung) 8:FIEGARCH(BBM only)
//  9:FIAPARCH(BBM) 10: FIAPARCH(Chung) 11: HYGARCH(BBM)

//*** OUTPUT ***//
garchobj.MLE(0);                // 0 : Second Derivatives, 1 : OPG, 2 : QMLE
garchobj.ITER(0);               // Interval of iterations between printed intermediary
// results (if no intermediary results wanted, enter '0')

//*** PARAMETERS ***//
garchobj.BOUNDS(0);             // 1 if bounded parameters wanted, 0 otherwise
garchobj.Initialization(<>);
garchobj.DoEstimation(<>);
garchobj.Output();

decl quan=<0.95,0.975,0.99,0.995,0.9975>; // Quantiles investigated

decl Y=garchobj.GetGroup(Y_VAR);
decl T=garchobj.GetcT();
decl m_cA,m_cV,i,j;
decl qu_pos,qu_neg,m_vSigma2,dfunc,m_vPar,cond_mean;
decl m_Dist=garchobj.GetValue("m_cDist");

println("Infos");
println("Number of observations: ", T);
println("Investigated quantiles: ",quan);
println("In-sample VaR");

decl emp_quan_in_pos=new matrix[T][columns(quan)];
decl emp_quan_in_neg=new matrix[T][columns(quan)];
m_vSigma2=garchobj.GetValue("m_vSigma2");
cond_mean=Y-garchobj.GetValue("m_vE");

if (m_Dist==0)
{
qu_pos=quann(quan)';
qu_neg=quann(1-quan)';
}
if (m_Dist==1)
{
m_cV=garchobj.GetValue("m_cV");
qu_pos=sqrt((m_cV-2)/m_cV)*quant(quan,m_cV)';
qu_neg=sqrt((m_cV-2)/m_cV)*quant(1-quan,m_cV)';
}
if (m_Dist==3)
{
m_cV=garchobj.GetValue("m_cV");
m_cA=garchobj.GetValue("m_cA");
qu_pos=qu_neg=<>;
for (i = 0; i < columns(quan) ; ++i)
{
qu_pos|=garchobj.INVCDFTA(quan[i],m_cA,m_cV);
qu_neg|=garchobj.INVCDFTA(1-quan[i],m_cA,m_cV);
}
}

emp_quan_in_pos=(cond_mean + sqrt(m_vSigma2).*qu_pos');
emp_quan_in_neg=(cond_mean + sqrt(m_vSigma2).*qu_neg');

println("In-sample Value-at-Risk");
garchobj.VaR_DynQuant(Y, emp_quan_in_pos, emp_quan_in_neg, quan, 5, 0,0);
garchobj.VaR_Test(Y, emp_quan_in_pos, emp_quan_in_neg, quan);

delete garchobj;
}

Output RiskMetrics - VaR_insample.ox
********************
** SPECIFICATIONS **
********************
Dependent variable : Nasdaq
Mean Equation : ARMA (2, 0) model.
No regressor in the mean
Variance Equation : RiskMetrics (lambda=0.94).
No regressor in the variance
The distribution is a Gauss distribution.

Strong convergence using numerical derivatives
Log-likelihood = -2222.83
Please wait : Computing the Std Errors ...

Maximum Likelihood Estimation (Std.Errors based on Second derivatives)
Coefficient  Std.Error  t-value  t-prob
Cst(M)               0.044980   0.020337    2.212  0.0271
AR(1)                0.269636   0.024409    11.05  0.0000
AR(2)               -0.011516   0.024168  -0.4765  0.6338
ARCH(Alpha1)         0.060000
GARCH(Beta1)         0.940000

No. Observations :      2000  No. Parameters  :         5
Mean (Y)         :   0.04326  Variance (Y)    :   0.83876
Skewness (Y)     :  -2.33675  Kurtosis (Y)    :  33.73842
Log Likelihood   : -2222.834

The sample mean of squared residuals was used to start recursion.

Estimated Parameters Vector :
0.044980; 0.269636;-0.011516; 0.060000; 0.940000

Elapsed Time : 0.21 seconds (or 0.0035 minutes).

Output N-APARCH(1,1) - VaR_insample.ox
********************
** SPECIFICATIONS **
********************
Dependent variable : Nasdaq
Mean Equation : ARMA (2, 0) model.
No regressor in the mean
Variance Equation : APARCH (1, 1) model.
No regressor in the variance
The distribution is a Gauss distribution.

Strong convergence using numerical derivatives
Log-likelihood = -2119.06
Please wait : Computing the Std Errors ...

Maximum Likelihood Estimation (Std.Errors based on Second derivatives)
Coefficient  Std.Error  t-value  t-prob
Cst(M)               0.047817   0.019917    2.401  0.0165
AR(1)                0.285268   0.025034    11.40  0.0000
AR(2)                0.005233   0.024285   0.2155  0.8294
Cst(V)               0.045054  0.0081194    5.549  0.0000
ARCH(Alpha1)         0.169705   0.018395    9.226  0.0000
GARCH(Beta1)         0.799843   0.024229    33.01  0.0000
APARCH(Gamma1)       0.286858   0.072419    3.961  0.0001
APARCH(Delta)        1.357528    0.24113    5.630  0.0000

No. Observations :      2000  No. Parameters  :         8
Mean (Y)         :   0.04326  Variance (Y)    :   0.83876
Skewness (Y)     :  -2.33675  Kurtosis (Y)    :  33.73842
Log Likelihood   : -2119.063

The sample mean of squared residuals was used to start recursion.
The condition for existence of E(sigma^delta) and E(|e^delta|) is observed.
The constraint equals 0.944317 and should be < 1.

Estimated Parameters Vector :
0.047817; 0.285268; 0.005233; 0.045054; 0.169705; 0.799843; 0.286858; 1.357528

Elapsed Time : 2.654 seconds (or 0.0442333 minutes).

Output SKST-APARCH(1,1) - VaR_insample.ox
********************
** SPECIFICATIONS **
********************
Dependent variable : Nasdaq
Mean Equation : ARMA (2, 0) model.
No regressor in the mean
Variance Equation : APARCH (1, 1) model.
No regressor in the variance
The distribution is a skewed-Student distribution, with
a tail coefficient of 5.23817 and an asymmetry coefficient of -0.16141.

Strong convergence using numerical derivatives
Log-likelihood = -2001.87
Please wait : Computing the Std Errors ...

Maximum Likelihood Estimation (Std.Errors based on Second derivatives)
Coefficient  Std.Error  t-value  t-prob
Cst(M)               0.054586   0.019069    2.863  0.0042
AR(1)                0.268316   0.023425    11.45  0.0000
AR(2)               -0.011101   0.023234  -0.4778  0.6329
Cst(V)               0.037921  0.0098793    3.838  0.0001
ARCH(Alpha1)         0.153786   0.023273    6.608  0.0000
GARCH(Beta1)         0.820957   0.029905    27.45  0.0000
APARCH(Gamma1)       0.219447   0.087570    2.506  0.0123
APARCH(Delta)        1.367004    0.27287    5.010  0.0000
Asymmetry           -0.161410   0.031849   -5.068  0.0000
Tail                 5.238169    0.60680    8.632  0.0000

No. Observations :      2000  No. Parameters  :        10
Mean (Y)         :   0.04326  Variance (Y)    :   0.83876
Skewness (Y)     :  -2.33675  Kurtosis (Y)    :  33.73842
Log Likelihood   : -2001.868

The sample mean of squared residuals was used to start recursion.
The condition for existence of E(sigma^delta) and E(|e^delta|) is observed.
The constraint equals 0.963684 and should be < 1.

Estimated Parameters Vector :
0.054586; 0.268316;-0.011101; 0.037921; 0.153786; 0.820957;
0.219447; 1.367004;-0.161410; 5.238169

Elapsed Time : 5.457 seconds (or 0.09095 minutes).

Figure 6.1 plots the in-sample VaR for the 5% and 95% quantiles of the AR(2)-APARCH-SKST model. Using the rolling menus, this graph is easily obtained via the Test.../Graphic Analysis/In-Sample VaR Forecasts command, choosing the option 'Theoretical Quantiles' with the following quantiles: 0.05; 0.95.

#### 6.2.2 In-sample VaR

All models are tested with a VaR level α ranging from 5% to 0.25% (quan = <0.95, 0.975, 0.99, 0.995, 0.9975>), and their performance is then assessed by computing the failure rate for the returns y_t. By definition, the failure rate is the percentage of times returns exceed (in absolute value) the forecasted VaR. If the VaR model is correctly specified, the failure rate should equal the pre-specified VaR level. In our empirical application, we define the failure rate f_l for long trading positions as the percentage of negative returns smaller than the one-step-ahead VaR for long positions. Correspondingly, we define the failure rate f_s for short trading positions as the percentage of positive returns larger than the one-step-ahead VaR for short positions.

Because the computation of the empirical failure rate defines a sequence of yes/no observations, it is possible to test H_0 : f = α against H_1 : f ≠ α, where f is the failure rate (estimated by f̂, the empirical failure rate). At the 5% level and if T yes/no observations are available, a confidence interval for f̂ is given by f̂ ± 1.96 √(f̂(1 − f̂)/T). In the literature on VaR models, this test is also called the Kupiec (1995) LR test when the hypothesis is tested using a likelihood ratio test. The LR statistic is LR = −2 log[(1 − α)^(T−N) α^N] + 2 log[(1 − N/T)^(T−N) (N/T)^N], where N is the number of VaR violations, T is the total number of observations and α is the theoretical failure rate. Under the null hypothesis that f is the true failure rate, the LR test statistic is asymptotically distributed as a χ²(1).
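For readers who want to check these computations outside of G@RCH, the Kupiec LR statistic is easily reproduced. The following Python sketch (illustrative only, not part of the package) computes the statistic from the violation count; with 113 violations out of 2,000 observations at the 5% level (a failure rate of 0.0565) it gives approximately 1.71.

```python
import math

def kupiec_lr(n_violations, n_obs, alpha):
    """Kupiec (1995) unconditional coverage LR statistic.

    Under H0 (the true failure rate equals alpha) the statistic is
    asymptotically chi-squared with one degree of freedom.
    """
    f_hat = n_violations / n_obs  # empirical failure rate
    # log-likelihood of the hit sequence under H0 and under the empirical rate
    log_l0 = (n_obs - n_violations) * math.log(1 - alpha) + n_violations * math.log(alpha)
    log_l1 = (n_obs - n_violations) * math.log(1 - f_hat) + n_violations * math.log(f_hat)
    return -2.0 * (log_l0 - log_l1)

# 113 violations in 2,000 days at the 5% VaR level (failure rate 0.0565)
lr = kupiec_lr(113, 2000, 0.05)
```

The same violation count also gives the asymptotic confidence interval f̂ ± 1.96 √(f̂(1 − f̂)/T) directly.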

In this application, these tests are successively applied to the failure rate fl for long trading positions and then to fs, the failure rate for short trading positions using the procedure VaR_Test.

While an extremely easy-to-understand concept, Value-at-Risk has some drawbacks. One of these is that it is not a coherent measure of risk in the sense of Artzner, Delbaen, Eber, and Heath (1999). A somewhat related measure of risk is the so-called expected shortfall (see Scaillet, 2000). Expected shortfall is a coherent measure of risk, defined as the expected value of the losses conditional on the loss being larger than the VaR. The average multiple of tail event to risk measure "measures the degree to which events in the tail of the distribution typically exceed the VaR measure by calculating the average multiple of these outcomes to their corresponding VaR measures" (Hendricks, 1996).7 These measures are reported when calling the procedure VaR_Test and are labelled ESF1 and ESF2 respectively.
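A minimal sketch of how such tail measures can be computed is given below, in Python for illustration. We take ESF1 to be the average return beyond the VaR and ESF2 the average multiple of those returns to their corresponding VaR; the exact conventions used internally by VaR_Test may differ.

```python
import numpy as np

def tail_measures(returns, var_forecasts):
    """Expected shortfall (ESF1) and average multiple of tail events to
    VaR (ESF2) for long positions, in the spirit of Scaillet (2000) and
    Hendricks (1996). `var_forecasts` holds the (negative) one-step-ahead
    VaR for long positions; a violation occurs when the return falls
    below it."""
    returns = np.asarray(returns, dtype=float)
    var_forecasts = np.asarray(var_forecasts, dtype=float)
    hits = returns < var_forecasts                       # VaR violations
    if not hits.any():
        return np.nan, np.nan
    esf1 = returns[hits].mean()                          # average loss beyond the VaR
    esf2 = (returns[hits] / var_forecasts[hits]).mean()  # average multiple of tail events
    return esf1, esf2

# toy example with a constant VaR of -2.0: violations are -2.5 and -3.5
r = np.array([0.5, -2.5, -1.0, -3.5, 0.2])
esf1, esf2 = tail_measures(r, np.full(r.shape, -2.0))
```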

Complete VaR results for the four models are reported in the following outputs.

Output RiskMetrics - VaR_insample.ox
- Short positions -
Quantile   Failure rate   Kupiec LRT    P-value        ESF1         ESF2

0.95000      0.96250       7.1752    0.0073918       1.5238       1.3202
0.97500      0.98350       6.7238    0.0095137       1.9570       1.3399
0.99000      0.99000      0.00000       1.0000       2.1653       1.2828
0.99500      0.99300       1.4293      0.23188       2.3557       1.2509
0.99750      0.99450       5.3641     0.020555       2.6115       1.2019
- Long positions -
Quantile   Failure rate   Kupiec LRT    P-value        ESF1         ESF2

0.050000     0.056500       1.7103      0.19094      -1.8823       1.6088
0.025000     0.040500       16.648  4.4986e-005      -2.1481       1.5187
0.010000     0.025500       33.969  5.5985e-009      -2.2529       1.4934
0.0050000     0.020000       51.358  7.6961e-013      -2.4338       1.4575
0.0025000     0.015000       57.820  2.8755e-014      -2.8104       1.4673

Output N-APARCH - VaR_insample.ox
- Short positions -
Quantile   Failure rate   Kupiec LRT    P-value        ESF1         ESF2

0.95000      0.96700       13.757   0.00020807       1.7191       1.2706
0.97500      0.98500       9.5549    0.0019942       1.9285       1.2721
0.99000      0.99250       1.3822      0.23973       2.2827       1.2493
0.99500      0.99550      0.10401      0.74707       2.4319       1.2518
0.99750      0.99700      0.18836      0.66429       2.6958       1.2570
- Long positions -
Quantile   Failure rate   Kupiec LRT    P-value        ESF1         ESF2

0.050000     0.053000      0.37198      0.54193      -1.8833       1.5075
0.025000     0.032500       4.2230     0.039879      -2.1592       1.4640
0.010000     0.018000       10.450    0.0012263      -2.7534       1.4800
0.0050000     0.012500       15.928  6.5802e-005      -3.0119       1.5042
0.0025000    0.0095000       22.829  1.7712e-006      -3.2831       1.5016

Output ST-APARCH - VaR_insample.ox
- Short positions -
Quantile   Failure rate   Kupiec LRT    P-value        ESF1         ESF2

0.95000      0.96400       9.1060    0.0025477       1.7063       1.2772
0.97500      0.98750       15.662  7.5739e-005       2.2696       1.2699
0.99000      0.99600       9.4119    0.0021558       2.5200       1.2352
0.99500      0.99900       9.5944    0.0019518       3.7156       1.4563
0.99750      0.99900       2.3393      0.12614       3.7156       1.2326
- Long positions -
Quantile   Failure rate   Kupiec LRT    P-value        ESF1         ESF2

0.050000     0.066500       10.434    0.0012371      -1.7154       1.5324
0.025000     0.034500       6.6333     0.010009      -2.1324       1.4559
0.010000     0.014000       2.8748     0.089975      -2.9631       1.4620
0.0050000    0.0080000       3.0582     0.080329      -3.5505       1.4453
0.0025000    0.0050000       3.8755     0.048996      -4.0783       1.3946

Output SKST-APARCH - VaR_insample.ox
- Short positions -
Quantile   Failure rate   Kupiec LRT    P-value        ESF1         ESF2

0.95000      0.95050     0.010560      0.91815       1.5490       1.3057
0.97500      0.98000       2.1997      0.13804       2.0181       1.2966
0.99000      0.99150      0.47890      0.48892       2.1671       1.2303
0.99500      0.99700       1.8781      0.17055       2.6958       1.2974
0.99750      0.99900       2.3393      0.12614       3.7156       1.4868
- Long positions -
Quantile   Failure rate   Kupiec LRT    P-value        ESF1         ESF2

0.050000     0.051500     0.093853      0.75934      -1.9159       1.5136
0.025000     0.024500     0.020647      0.88574      -2.4564       1.4604
0.010000    0.0090000      0.20904      0.64752      -3.3903       1.5069
0.0050000    0.0060000      0.37773      0.53882      -3.8849       1.4071
0.0025000    0.0045000       2.5882      0.10766      -4.1698       1.2782

These results indicate that:

• VaR models based on the normal distribution (RiskMetricsTM and the normal APARCH model) have difficulty modelling large positive and negative returns.
• The symmetric Student APARCH model sometimes improves on the performance of the normal-based models, but its performance is still not satisfactory in all cases; it can even be worse than that of the normal-based models. The reason is that the critical values of the Student distribution t_{α,υ} and t_{1−α,υ} are very large in this case, which leads to high long and short VaR levels: the model is often rejected because it is too conservative.
• The skewed-Student APARCH model improves on all other specifications for both negative and positive returns. The improvement is substantial, as the switch to a skewed-Student distribution alleviates almost all problems due to the 'conservativeness' of the symmetric Student distribution. The model performs correctly in 100% of the cases for negative returns (long VaR) and for positive returns (short VaR).

Besides the failure rate, a relevant VaR model should feature a sequence of indicator functions (VaR violations) that is not serially correlated. With the new variables Hit_t(α) = I(y_t < VaR_t(α)) − α and Hit_t(1 − α) = I(y_t > VaR_t(1 − α)) − α, Engle and Manganelli (1999) suggest testing jointly that:

• A1: E(Hit_t(α)) = 0 (respectively E(Hit_t(1 − α)) = 0) in the case of long trading positions (short trading positions);
• A2: Hit_t(α) (or Hit_t(1 − α)) is uncorrelated with the variables included in the information set.

According to Engle and Manganelli (1999), testing A1 and A2 can be done using the artificial regression Hit_t = Xλ + ϵ_t, where X is a T × k matrix whose first column is a column of ones, the next p columns are Hit_{t−1}, …, Hit_{t−p} and the k − p − 1 remaining columns are additional independent variables (including the VaR itself). Engle and Manganelli (1999) also show that, under the null hypotheses A1 and A2, the Dynamic Quantile test statistic DQ = λ̂′X′Xλ̂ / (α(1 − α)) is asymptotically distributed as a χ²(k), where λ̂ is the OLS estimate of λ. A small sample version of this test (an F-test) is readily obtained, but the difference is negligible since the sample size is larger than 1,000 observations. Note that while Engle and Manganelli (1999) only consider long trading positions, we also use this test when computing the VaR of short trading positions.
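As an illustration of the mechanics of this test (not of the actual implementation in G@RCH), the statistic can be sketched in Python as follows; the choice of regressors (a constant, p lagged hits and the VaR itself) is one common specification.

```python
import numpy as np

def dq_stat(returns, var_forecasts, alpha, p=5):
    """Dynamic Quantile statistic of Engle and Manganelli for long
    positions: under the null, asymptotically chi-squared with k degrees
    of freedom, where k is the number of regressors."""
    returns = np.asarray(returns, dtype=float)
    var_forecasts = np.asarray(var_forecasts, dtype=float)
    hit = (returns < var_forecasts).astype(float) - alpha   # Hit_t(alpha)
    T = len(hit)
    # regressors: constant, Hit_{t-1}, ..., Hit_{t-p}, and the VaR itself
    rows = [np.r_[1.0, hit[t - p:t][::-1], var_forecasts[t]] for t in range(p, T)]
    X = np.array(rows)
    y = hit[p:]
    XtX = X.T @ X
    lam = np.linalg.solve(XtX, X.T @ y)                     # OLS estimate of lambda
    dq = float(lam @ XtX @ lam) / (alpha * (1.0 - alpha))
    return dq, X.shape[1]

# toy data: heteroskedastic returns with a correctly specified 5% VaR
rng = np.random.default_rng(3)
sigma = np.exp(0.2 * np.sin(np.arange(1000) / 25.0))  # slowly varying volatility
r = sigma * rng.standard_normal(1000)
var_long = -1.645 * sigma                             # 5% normal VaR for long positions
dq, k = dq_stat(r, var_long, alpha=0.05)
```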

The G@RCH package also provides a function called VaR_DynQuant that implements this test.

Complete VaR results corresponding to the Dynamic Quantile test are reported in the following outputs.

Output RiskMetrics - VaR_insample.ox
- Short positions -
Quantile        Stat.      P-value

0.95000       13.994     0.051297
0.97500       9.9820      0.18960
0.99000       4.4913      0.72177
0.99500       2.4473      0.93101
0.99750       7.9630      0.33587

- Long positions -
Quantile        Stat.      P-value

0.050000       22.591    0.0020078
0.025000       44.824  1.4796e-007
0.010000       95.823      0.00000
0.0050000       121.21      0.00000
0.0025000       171.78      0.00000

Remark: In the Dynamic Quantile Regression, p=5.

Output N-APARCH - VaR_insample.ox
- Short positions -
Quantile        Stat.      P-value

0.95000       15.924     0.025818
0.97500       10.064      0.18495
0.99000       6.9102      0.43829
0.99500      0.31019      0.99989
0.99750      0.31617      0.99988

- Long positions -
Quantile        Stat.      P-value

0.050000       1.9588      0.96209
0.025000       5.9063      0.55073
0.010000       18.798    0.0088448
0.0050000       32.636  3.0952e-005
0.0025000       56.664  6.9706e-010

Remark: In the Dynamic Quantile Regression, p=5.

Output ST-APARCH - VaR_insample.ox
- Short positions -
Quantile        Stat.      P-value

0.95000       13.398     0.062986
0.97500       15.831     0.026710
0.99000       7.3396      0.39440
0.99500       6.4381      0.48962
0.99750       1.8166      0.96929

- Long positions -
Quantile        Stat.      P-value

0.050000       15.164     0.033953
0.025000       10.682      0.15308
0.010000       8.2543      0.31070
0.0050000       14.335     0.045531
0.0025000       6.4057      0.49326

Remark: In the Dynamic Quantile Regression, p=5.

Output SKST-APARCH - VaR_insample.ox
- Short positions -
Quantile        Stat.      P-value

0.95000       3.0185      0.88328
0.97500       11.787      0.10777
0.99000       5.2936      0.62418
0.99500       1.6636      0.97610
0.99750       1.8136      0.96943

- Long positions -
Quantile        Stat.      P-value

0.050000       1.6258      0.97763
0.025000       3.0760      0.87789
0.010000       4.9101      0.67093
0.0050000       1.6573      0.97636
0.0025000       4.1900      0.75765

Remark: In the Dynamic Quantile Regression, p=5.

The same conclusion holds for the SKST-APARCH model, i.e. it performs very well for the NASDAQ, unlike its symmetric competitors.

#### 6.2.3 Out-of-sample VaR

The testing methodology of the previous subsection is equivalent to back-testing the model on the estimation sample. It can therefore be argued that it is favorable to the tested model. In a 'real life situation', VaR models are used to deliver out-of-sample forecasts: the model is estimated on the known returns (up to time t, for example) and the VaR forecast is made for the period [t + 1, t + h], where h is the forecast horizon. In this subsection we implement this testing procedure for the long and short VaR with h = 1 day. Note that the code used for this illustration is VaR_outofsample.ox.

We use an iterative procedure where the skewed-Student APARCH model is estimated to predict the one-day-ahead VaR. The first estimation sample is the complete sample for which the data is available, less the last five years. The predicted one-day-ahead VaR (both for long and short positions) is then compared with the observed return, and both results are recorded for later assessment using the statistical tests. At the i-th iteration, where i goes from 2 to 5 × 252 (five years of data), the estimation sample is augmented to include one more day and the VaRs are forecasted and recorded. Whenever i is a multiple of 50, the model is re-estimated to update the skewed-Student APARCH parameters. In other words, we update the model parameters every 50 trading days and thus assume a 'stability window' of 50 days for the parameters. We iterate the procedure until all days (less the last one) have been included in the estimation sample. The corresponding failure rates are then computed by comparing the forecasted long and short VaR_{t+1} with the observed return y_{t+1} for all days in the five-year period. We use the same statistical tests as in the previous subsection.

VaR_outofsample.ox
#import <packages/Garch5/Garch>

main()
{
decl garchobj;
garchobj = new Garch();
//*** DATA ***//
garchobj.Select(Y_VAR, {"Nasdaq",0,0} );
garchobj.SetSelSample(1, 1, 4093-5*252, 1);
garchobj.Info();
//*** SPECIFICATIONS ***//
garchobj.CSTS(1,1);             // cst in Mean (1 or 0), cst in Variance (1 or 0)
garchobj.DISTRI(1);             // 0 for Gauss, 1 for Student, 2 for GED, 3 for Skewed-Student
garchobj.ARMA_ORDERS(2,0);      // AR order (p), MA order (q).
garchobj.GARCH_ORDERS(1,1);     // p order, q order
garchobj.MODEL(4);              //  0:RISKMETRICS  1:GARCH      2:EGARCH    3:GJR   4:APARCH
//  5:IGARCH 6:FIGARCH(BBM) 7:FIGARCH(Chung) 8:FIEGARCH(BBM only)
//  9:FIAPARCH(BBM) 10: FIAPARCH(Chung) 11: HYGARCH(BBM)
//*** OUTPUT ***//
garchobj.MLE(0);                // 0 : Second Derivatives, 1 : OPG, 2 : QMLE
garchobj.ITER(0);
//*** PARAMETERS ***//
garchobj.Initialization(<>) ;
garchobj.DoEstimation(<>) ;
garchobj.Output();

decl quan=<0.95,0.975,0.99,0.995,0.9975>; // Quantiles investigated
decl out_of_sample_VaR=1; // 1 to compute the out-of-sample VaR, 0 otherwise
decl numb_out_of_sample=5*252; // number of observations in the out-of-sample period
decl reestimate_every=50; // k: to reestimate the model every k observations
decl Y=garchobj.GetGroup(Y_VAR);
decl T=garchobj.GetcT();
decl m_cA,m_cV,i,j;
decl qu_pos,qu_neg,m_vSigma2,dfunc,m_vPar,cond_mean;
decl m_Dist=garchobj.GetValue("m_cDist");
println("Infos");
println("Number of observations: ", T);
println("Investigated quantiles: ",quan);
println("Number of forecasts: ", numb_out_of_sample);
println("Out-of-sample: re-estimate every ", reestimate_every, " observations");
m_Dist=garchobj.GetValue("m_cDist");
m_vPar=garchobj.GetValue("m_vPar");
decl emp_quan_out_pos=new matrix[numb_out_of_sample][columns(quan)];
decl emp_quan_out_neg=new matrix[numb_out_of_sample][columns(quan)];
decl k=0;
for (j = 0; j < numb_out_of_sample ; ++j)
{
garchobj.SetSelSample(-1, 1, T+j, 1);
if (k==reestimate_every-1)
{
println("..... Remaining Steps (Out-of-sample): ",numb_out_of_sample-j, " .....");
garchobj.Initialization(m_vPar) ;
garchobj.DoEstimation(<>);
k*=0;
}
else
{
garchobj.InitData();
garchobj.FigLL(garchobj.GetFreePar(), &dfunc, 0,0);
++k;
}
m_vPar=garchobj.GetValue("m_vPar");
garchobj.SetSelSample(-1, 1, T+j+1, 1);
garchobj.InitData();
Y=garchobj.GetGroup(Y_VAR);
garchobj.Res_Var();
m_vSigma2=garchobj.GetValue("m_vSigma2")[T+j];
cond_mean=Y[T+j][]-garchobj.GetValue("m_vE")[T+j][];
if (m_Dist==0)
{
qu_pos=quann(quan)';
qu_neg=quann(1-quan)';
}

...VaR_outofsample.ox
if (m_Dist==1)
{
m_cV=garchobj.GetValue("m_cV");
qu_pos=sqrt((m_cV-2)/m_cV)*quant(quan,m_cV)';
qu_neg=sqrt((m_cV-2)/m_cV)*quant(1-quan,m_cV)';
}
if (m_Dist==3)
{
m_cV=garchobj.GetValue("m_cV");
m_cA=garchobj.GetValue("m_cA");
qu_pos=qu_neg=<>;
for (i = 0; i < columns(quan) ; ++i)
{
qu_pos|=garchobj.INVCDFTA(quan[i],m_cA,m_cV);
qu_neg|=garchobj.INVCDFTA(1-quan[i],m_cA,m_cV);
}
}
emp_quan_out_pos[j][]=(cond_mean + sqrt(m_vSigma2).*qu_pos)';
emp_quan_out_neg[j][]=(cond_mean + sqrt(m_vSigma2).*qu_neg)';
}
println("Out-of-sample Value-at-Risk (rolling regression)");
garchobj.VaR_DynQuant(Y[T:T+numb_out_of_sample-1],emp_quan_out_pos,emp_quan_out_neg,quan,5,0,0);
garchobj.VaR_Test(Y[T:T+numb_out_of_sample-1],emp_quan_out_pos,emp_quan_out_neg,quan);
delete garchobj;
}

Empirical results for the SKST-APARCH model are given below. Broadly speaking, the results are quite similar to (although not as good as) those obtained with the in-sample testing procedure.

Output - VaR_APARCH_SKST_outofsample.ox
- Short positions -
Quantile        Stat.      P-value

0.95000       42.683  3.8399e-007
0.97500       28.903   0.00015072
0.99000       6.6717      0.46384
0.99500      0.32859      0.99986
0.99750       3.5069      0.83449

- Long positions -
Quantile        Stat.      P-value

0.050000       38.467  2.4696e-006
0.025000       43.893  2.2419e-007
0.010000       7.6759      0.36204
0.0050000       3.5150      0.83363
0.0025000      0.61369      0.99891

Remark: In the Dynamic Quantile Regression, p=5.

- Short positions -
Quantile   Failure rate   Kupiec LRT    P-value        ESF1         ESF2

0.95000      0.91667       24.765  6.4776e-007       2.7630       1.3391
0.97500      0.95635       14.760   0.00012207       3.1518       1.2488
0.99000      0.98730      0.85382      0.35547       3.7643       1.2290
0.99500      0.99444     0.075438      0.78358       4.4206       1.1863
0.99750      0.99683      0.21171      0.64543       2.7719       1.0845
- Long positions -
Quantile   Failure rate   Kupiec LRT    P-value        ESF1         ESF2

0.050000     0.069048       8.6470    0.0032760      -3.0879       1.4279
0.025000     0.033333       3.2553     0.071193      -3.5326       1.3442
0.010000     0.011905      0.43522      0.50944      -4.4876       1.3198
0.0050000    0.0063492      0.42458      0.51466      -4.5203       1.2693
0.0025000    0.0031746      0.21171      0.64543      -6.2120       1.2574

## Chapter 7 Realized Volatility and Intraday Periodicity

(written with the collaboration of Kris Boudt and Jerome Lahaye)

The recent widespread availability of databases providing the intradaily prices of financial assets (stocks, stock indices, bonds, currencies, …) has led to new developments in applied econometrics and quantitative finance to model daily and intradaily volatility.

The aim of this chapter is to introduce and illustrate several concepts. We first sketch the basic notions needed to understand diffusion models before moving on to an explanation of relevant volatility measures. We introduce the notion of integrated volatility and one popular estimator: realized volatility. We then consider the issue of jumps in the price process. Next we present an important stylized fact of intraday returns, namely intraday periodicity in volatility, and show how to estimate it consistently in both the presence and absence of jumps. We also review two estimators of the integrated volatility that are robust to jumps, namely the bi-power variation and the realized outlyingness weighted variation.

We also review two different approaches to jump detection. The first tests whether the contribution of jumps to the daily variance is zero. The second tests, at the intraday level, whether a given return is affected by price jumps.

While G@RCH 6.1 is primarily a menu-driven module for OxMetrics, it also provides a set of Ox classes that allow the user to write Ox programs that import these classes. G@RCH contains a set of procedures, gathered in the Realized class, devoted to the computation (or estimation) of

• the realized volatility and its univariate and multivariate extensions;
• intraday or intraweek periodicity in volatility;

and the simulation of continuous-time stochastic volatility processes.

Most concepts are directly implemented in the menu-driven application. As usual, several Ox example programs are also provided (available in /packages/Realized/samples/).

This chapter has been inspired by several excellent sources of information, namely Chapter 6 (Continuous-Time Models and Their Applications) of Tsay (2002), the recent survey of Andersen, Bollerslev, Christoffersen, and Diebold (2006) and more generally by the seminal papers of Andersen, Bollerslev, Diebold and coauthors, Barndorff-Nielsen and Shephard, as well as Lee and Mykland (2008). This chapter is also based on Boudt, Croux, and Laurent (2008a,2008b).

### 7.1 Introduction to diffusion models

It is useful to think of returns as arising from an underlying continuous-time process. In particular, suppose that this underlying model involves a continuous sample path for the (logarithmic) price process. The return process may then, under general assumptions, be written in standard stochastic differential equation (sde) form as,

dp(t) = μ(t)dt + σ(t)dW(t)    (7.1)

where dp(t) denotes the logarithmic price increment, μ(t) the drift, σ(t) refers to the point-in-time or spot volatility, and W(t) is a standard Brownian motion.

#### 7.1.1 Standard Brownian motion / Wiener process

In a discrete-time econometric model, we assume that shocks are driven by a white noise process, which is not predictable. What is the counterpart of shocks in a continuous-time model? The answer is increments of a Wiener process W(t) (also known as a standard Brownian motion).

A continuous-time stochastic process W(t) is a Wiener process if it satisfies:

1. W(0) = 0;
2. ΔW(t) ≡ W(t + Δ) − W(t) = ϵ√Δ, where ϵ is an N(0,1) random variable;
3. ΔW(t) is independent of ΔW(j), j < t;
4. W(t) is a continuous process, i.e. there are no jumps in its sample paths,

where Δ is a small increment in time.

To simulate a continuous process, it is of course necessary to discretize time. We simulate the motion at times 0, Δ, 2Δ, 3Δ, …. Moreover, the definition of a Brownian motion implies that {W(kΔ) − W((k − 1)Δ) : k ∈ ℕ} is a collection of i.i.d. N(0, Δ) random variables. So to simulate a trajectory of the Brownian motion up to time mΔ, one needs to generate m independent N(0,1) random variables {Z_k : k ∈ {1, 2, …, m}}.

Since W(0) = 0, we simulate the trajectory with the recursion W(kΔ) = W((k − 1)Δ) + √Δ Z_k, for k ∈ {1, 2, …, m}.

This is called the Euler method to simulate a Brownian motion.
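The recursion above translates directly into code. A short Python sketch (the example file itself is written in Ox) is:

```python
import numpy as np

def simulate_wiener(m, delta, seed=None):
    """Simulate W(0), W(delta), ..., W(m*delta) with the Euler recursion
    W(k*delta) = W((k-1)*delta) + sqrt(delta) * Z_k, Z_k i.i.d. N(0,1)."""
    rng = np.random.default_rng(seed)
    increments = np.sqrt(delta) * rng.standard_normal(m)
    return np.r_[0.0, np.cumsum(increments)]  # prepend W(0) = 0

# 5 "days" of 10,000 observations each, as in Simul_wiener.ox
w = simulate_wiener(50_000, 1.0 / 10_000, seed=42)
```

Each increment has variance Δ, so the sample variance of np.diff(w) should be close to 1/10,000.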

The example file Simul_wiener.ox uses the procedure
Simul_Wiener_process(const m, const deltat) to simulate four Wiener processes consisting of 5 days with 10,000 observations per day (m = 5 × 10,000 = 50,000 and Δ = 1∕10,000). This procedure is available in /packages/Realized/samples/procs_simul.ox.

Figure 7.1 represents the four simulated Wiener processes starting with W(0) = 0.

The Wiener process is a special stochastic process with zero drift and variance proportional to the length of time interval. This means that the rate of change in expectation is 0 and the rate of change in variance is 1. Even though this process is of crucial importance in finance, it is desirable to generalize it and allow the mean and the variance to evolve over time in a more realistic manner.

#### 7.1.2 Generalized Wiener Process

Let us now consider the generalized Wiener process p*(t): its expectation has a drift rate μ, while the rate of variance change is σ².

The generalized Wiener process can be written as follows:

dp*(t) = μdt + σdW(t)    (7.2)

where W(t) is a Wiener process, and p*(t) is the price (and not its log). Then, E[p*(t) − p*(0)] = μt and Var[p*(t) − p*(0)] = σ²t. The parameters μ and σ² are often referred to as the drift and volatility parameters of the generalized Wiener process p*(t).
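These two moment conditions are easy to verify by Monte Carlo simulation, since p*(t) − p*(0) = μt + σW(t) with W(t) ~ N(0, t). A Python sketch with arbitrary illustrative parameter values:

```python
import numpy as np

# Monte Carlo check that E[p*(t)-p*(0)] = mu*t and Var[p*(t)-p*(0)] = sigma^2*t
# for the generalized Wiener process (parameter values chosen for illustration).
mu, sigma, t, n_paths = 0.1, 0.5, 2.0, 200_000
rng = np.random.default_rng(0)
increments = mu * t + sigma * np.sqrt(t) * rng.standard_normal(n_paths)
mean_hat, var_hat = float(increments.mean()), float(increments.var())
# expected: mean 0.1*2 = 0.2 and variance 0.5^2 * 2 = 0.5
```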

The main limitation of the generalized Wiener process is that both the drift and volatility parameters (μ and σ²) are constant over time, while it is well known that daily returns, for instance (i.e. t = 1), are heteroscedastic.

The Ito process overcomes this drawback, allowing both the drift and volatility parameters to be time-varying. It is usually written as follows:

dp*(t) = μ(p*(t), t)dt + σ(p*(t), t)dW(t)    (7.3)

where W(t) is a Wiener process. This process can be rewritten as p*(t) = p*(0) + ∫_{0}^{t} μ(p*(s), s)ds + ∫_{0}^{t} σ(p*(s), s)dW(s).

A particular version of this process is commonly used in the literature to describe the behavior of prices and is known as geometric Brownian motion:

dp*(t) = μp*(t)dt + σp*(t)dW(t)    (7.4)

In other words, a geometric Brownian motion is a continuous-time stochastic process in which the logarithm of the randomly varying quantity follows a Brownian motion. It is easy to show that the random variable log(p*(t)∕p*(0)) is normally distributed with mean (μ − σ²/2)t and variance σ²t, which reflects the fact that increments of a geometric Brownian motion are normal relative to the current price, the reason why the process is called geometric. Setting t = 1, e.g. daily data, we have that the one-period log-returns are normally distributed, with mean (μ − σ²/2) and variance σ². See Chapter 6 of Tsay (2002) for more details.
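Because the log-price has this exact normal distribution, a geometric Brownian motion can be simulated without discretization error. The following Python sketch (illustrative parameter values) draws terminal log-returns and checks their moments:

```python
import numpy as np

# Exact simulation of GBM at horizon t via its log-normal solution:
# p*(t) = p*(0) * exp((mu - sigma^2/2) t + sigma W(t)),
# so log(p*(t)/p*(0)) ~ N((mu - sigma^2/2) t, sigma^2 t).
mu, sigma, t, n_paths = 0.08, 0.2, 1.0, 500_000
rng = np.random.default_rng(1)
log_ret = (mu - 0.5 * sigma**2) * t + sigma * np.sqrt(t) * rng.standard_normal(n_paths)
prices = 1.0 * np.exp(log_ret)   # p*(t) with p*(0) = 1
# expected: mean(log_ret) = 0.06, var(log_ret) = 0.04, mean(prices) = exp(0.08)
```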

### 7.2 Integrated Volatility

A drawback of the geometric Brownian motion is that it is not able to generate the well-known ARCH effects (volatility clustering). It is thus desirable to make both μ and σ² time-varying.

#### 7.2.1 Theoretical background

Let us now consider the following diffusion process for the logarithmic price p(t) = log(p*(t)):

dp(t) = μ(t)dt + σ(t)dW(t)    (7.5)

where μ(t) denotes the drift, σ(t) refers to the point-in-time or spot volatility, and W(t) denotes a standard Brownian motion. Intuitively, over (infinitesimal) small time intervals,

p(t) − p(t − Δ) ≃ μ(t − Δ)Δ + σ(t − Δ)ΔW(t)    (7.6)

where ΔW(t) = W(t) - W(t - Δ) ~ N(0,Δ).

For the one-period daily return,

p(t) − p(t − 1) = ∫_{t−1}^{t} μ(s)ds + ∫_{t−1}^{t} σ(s)dW(s)    (7.7)

From Equation (7.7), it is clear that the volatility for the continuous-time process over [t- 1,t] is linked to the evolution of the spot volatility σ(t). Furthermore, conditional on the sample path of the drift and the spot volatility processes,

p(t) − p(t − 1) ~ N(∫_{t−1}^{t} μ(s)ds, IV_t)    (7.8)

where IV_t denotes the so-called integrated variance (volatility), defined as IV_t = ∫_{t−1}^{t} σ²(s)ds.

#### 7.2.2 Illustration of the concept of integrated volatility

To visualize the difference between the spot volatility, the integrated volatility, the conditional volatility of a GARCH(1,1) model and daily squared returns, let us consider the following simulation study.

The simulated model is designed to induce temporal dependencies consistent with the GARCH(1,1) model. The simulated continuous-time GARCH diffusion is formally defined by

$$dp(t) = \sigma(t)\,dW_p(t) \tag{7.9}$$

$$d\sigma^2(t) = \theta\left(\omega - \sigma^2(t)\right)dt + \sqrt{2\lambda\theta}\,\sigma^2(t)\,dW_d(t) \tag{7.10}$$

where $W_p(t)$ and $W_d(t)$ denote two independent Brownian motions.1

The Realized class contains a function Simul_Continuous_GARCH devoted to the simulation of the above continuous-time GARCH diffusion process. This procedure uses a standard Euler discretization scheme to generate the continuous-time GARCH diffusion process, i.e.

$$p(t + \Delta) = p(t) + \sigma(t)\sqrt{\Delta}\,Z_p(t)$$

$$\sigma^2(t + \Delta) = \theta\omega\Delta + (1 - \theta\Delta)\,\sigma^2(t) + \sqrt{2\lambda\theta\Delta}\,\sigma^2(t)\,Z_d(t),$$

where $Z_p(t)$ and $Z_d(t)$ denote two independent standard normal random variables.

In the example file Simul_Continuous_GARCH.ox, a continuous-time GARCH diffusion process is generated for which the constants are set to θ = 0.035, ω = 0.636 and λ = 0.296. The same model has previously been analyzed in Andersen and Bollerslev (1998a). To simulate exchange rates, we choose Δ = 1/2880, corresponding to 10 observations per 5-minute interval. The number of simulated days is 510, but the first ten days are discarded, giving a total of 500 simulated days. Furthermore, we use the following initial values for the log-price and spot volatility: p(0) = 1 and σ²(0) = 0.1.
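A minimal Python sketch of this Euler scheme is given below (the actual G@RCH code is in Ox; the drift/diffusion forms assumed here are those of Equations (7.9)-(7.10), and for speed we simulate 60 days instead of the 510 used in the example file).

```python
import numpy as np

# Euler discretization of the continuous-time GARCH diffusion:
# dp = sigma dW_p, dsigma^2 = theta(omega - sigma^2)dt + sqrt(2*lam*theta) sigma^2 dW_d
rng = np.random.default_rng(0)
theta, omega, lam = 0.035, 0.636, 0.296
steps_per_day = 2880                 # 10 steps per 5-minute interval
delta = 1.0 / steps_per_day
n_days, burn_in = 60, 10             # example file uses 510 days, 10 burn-in
p, sig2 = 1.0, 0.1                   # initial log-price and spot variance
log_prices, spot_var = [], []
for _ in range(n_days * steps_per_day):
    zp, zd = rng.standard_normal(2)
    p += np.sqrt(sig2 * delta) * zp
    sig2 += theta * (omega - sig2) * delta \
        + np.sqrt(2.0 * lam * theta * delta) * sig2 * zd
    sig2 = max(sig2, 1e-12)          # guard against a negative Euler step
    log_prices.append(p)
    spot_var.append(sig2)
# discard the burn-in days, as in the example file
log_prices = np.array(log_prices[burn_in * steps_per_day:])
spot_var = np.array(spot_var[burn_in * steps_per_day:])
print(len(spot_var) // steps_per_day)  # 50 days retained
```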

Figure 7.2 graphs the simulated instantaneous log-prices, 5-min prices and daily prices.

Figure 7.3 plots the corresponding returns, i.e. instantaneous returns, 5-min returns and daily returns.

Finally, Figure 7.4 plots five volatility measures:

1. The top left panel of Figure 7.4 displays the simulated spot volatility $\sigma^2(s)$, i.e. the 2880 values per day generated by the Euler recursion.
2. The top right panel displays the corresponding daily integrated volatility $IV_t$. Since $IV_t = \int_{t-1}^{t}\sigma^2(s)\,ds$, the 'daily' $IV_t$ is computed as $\sum_{j=1}^{1/\Delta} \sigma^2(t - j\Delta)\Delta$, where $1/\Delta = 2880$.
3. The bottom left panel displays the conditional variance obtained by estimating a GARCH(1,1) model on the daily returns.
4. Finally, the bottom right panel plots the squared daily returns.

• Even though the daily squared returns are known to be an unbiased measure of the volatility, they are extremely noisy.
• Unlike daily squared returns, the conditional variance of the GARCH(1,1) is much less noisy. Indeed, it generally tracks the level of the spot volatility very well.

We have seen in Section 5.3.1 that one of the most popular measures to check the forecasting performance of the ARCH-type models is the Mincer-Zarnowitz regression, i.e. ex-post volatility regression:

$$\check{\sigma}_t^2 = a_0 + a_1\hat{\sigma}_t^2 + u_t \tag{7.11}$$

where $\check{\sigma}_t^2$ is the ex-post volatility, $\hat{\sigma}_t^2$ is the forecasted volatility and $a_0$ and $a_1$ are parameters to be estimated. Recall that if the model for the conditional variance is correctly specified (and the parameters are known) and if $E(\check{\sigma}_t^2) = \sigma_t^2$, we have $a_0 = 0$ and $a_1 = 1$. Furthermore, the $R^2$ of this regression is often used as a simple measure of the degree of predictability of the ARCH-type model.

To judge the quality of the GARCH forecasts it is common to use $\check{\sigma}_t^2 = (r_t - \bar{r})^2$, where $\bar{r}$ is the sample mean of $r_t$ (the daily return at time t). The $R^2$ of this regression is often lower than 5% and this could lead to the conclusion that GARCH models produce poor forecasts of the volatility (see, among others, Schwert, 1990, or Jorion, 1996). In their seminal paper, Andersen and Bollerslev (1998a) have shown that if $r_t$ follows a GARCH(1,1), the $R^2$ of this regression is nothing but

$$R^2 = \frac{\alpha_1^2}{1 - \beta_1^2 - 2\alpha_1\beta_1}.$$

If $\kappa$ is the kurtosis of $z_t$, we must have $\kappa\alpha_1^2 + \beta_1^2 + 2\alpha_1\beta_1 < 1$ to ensure the existence of the unconditional kurtosis of $r_t$. It follows that $\kappa\alpha_1^2 < 1 - \beta_1^2 - 2\alpha_1\beta_1$ and,

$$R^2 = \frac{\alpha_1^2}{1 - \beta_1^2 - 2\alpha_1\beta_1} < \frac{1}{\kappa}.$$

If the innovations ($z_t$) are i.i.d. N(0,1), $\kappa = 3$ and the $R^2$ is thus necessarily lower than 1/3.

In example file Simul_Continuous_GARCH_R2.ox, we simulate the same diffusion process as before and run the above regression to illustrate the findings of Andersen and Bollerslev (1998a). When using the daily squared returns to measure the observed daily volatility, the $R^2$ of the regression is found to be extremely low, i.e. 0.0374596, even though $a_0$ and $a_1$ are not significantly different from 0 and 1 respectively. However, if we consider the integrated volatility instead of the squared daily returns as an ex-post volatility measure, the $R^2$ now equals 0.421871, suggesting that the GARCH model explains more than 40% of the variability of the true volatility.
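The mechanics of the Mincer-Zarnowitz regression can be sketched as follows in Python (a hypothetical toy example, not the Ox example file): with a noisy proxy such as squared returns the $R^2$ is small even when the forecast is the true variance, while an accurate ex-post measure restores it.

```python
import numpy as np

# Mincer-Zarnowitz regression: regress an ex-post volatility proxy on the
# forecast and report a0, a1 and the R^2.
def mincer_zarnowitz(proxy, forecast):
    X = np.column_stack([np.ones_like(forecast), forecast])
    coef, *_ = np.linalg.lstsq(X, proxy, rcond=None)
    resid = proxy - X @ coef
    return coef[0], coef[1], 1.0 - resid.var() / proxy.var()  # a0, a1, R^2

rng = np.random.default_rng(1)
true_var = 0.5 + 0.4 * np.abs(np.sin(np.arange(500) / 25.0))  # toy "true" variance
ret = rng.standard_normal(500) * np.sqrt(true_var)
_, _, r2_noisy = mincer_zarnowitz(ret ** 2, true_var)   # squared-return proxy
_, _, r2_exact = mincer_zarnowitz(true_var, true_var)   # ideal (IV-like) proxy
print(r2_noisy < 0.2, round(r2_exact, 4))  # low R^2 with the noisy proxy
```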

### 7.3 Realized Volatility

It is clear from Section 7.2 that the integrated volatility is an ideal ex-post measure of volatility, and thus the natural benchmark for assessing the quality of conditional volatility forecasts, like those of GARCH models. However, the integrated volatility is never observed in practice.

Inspired by the recent widespread availability of intraday prices of various financial assets, Andersen and Bollerslev (1998a) have proposed to proxy the integrated volatility by the so-called realized volatility.

The most fundamental feature of realized volatility is that it provides a consistent non parametric estimate of the price variability that has transpired over a given discrete interval.

The intuition behind the use of realized volatility is most readily conveyed within the popular continuous-time diffusion presented in Equation (7.5) for which the key quantity of daily volatility is the integral of the squared spot variance. This integral is called the integrated variance.

We assume we dispose of T days of $M \equiv \lfloor 1/\Delta \rfloor$ equally-spaced intraday returns. Now denote the i-th return of day t by $r_{t,i} \equiv p(t + i\Delta) - p(t + (i-1)\Delta)$, where $i = 1,\ldots,M$. The daily Realized Variance (RVar) of day t, denoted $RV_t(\Delta)$, is then defined as:

$$RV_t(\Delta) \equiv \sum_{i=1}^{M} r_{t,i}^2 \tag{7.12}$$

By the theory of quadratic variation, when Δ → 0 and when the true diffusion process belongs to the family of processes presented in Equation (7.5) (i.e. absence of jumps):

$$\operatorname*{plim}_{\Delta \to 0} RV_t(\Delta) = IV_t = \int_{t-1}^{t} \sigma^2(s)\,ds \tag{7.13}$$

In other words, under suitable conditions (like the absence of serial correlation in the intraday returns)2, the realized volatility is consistent for the integrated volatility in the sense that when Δ → 0, $RV_t(\Delta)$ measures the latent integrated volatility $IV_t$ perfectly.

G@RCH contains a set of procedures devoted to the computation of the realized volatility (and other related concepts that we will review in the next sections) gathered in the Realized class. For instance, the example file Simul_Continuous_GARCH_RV.ox simulates the previous GARCH-diffusion process and uses the procedure Compute_RV to compute the realized volatility from the simulated 5-minute returns, i.e. 288 returns per day.
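Conceptually, the computation performed by Compute_RV amounts to summing squared intraday returns day by day. The following Python sketch illustrates this (the (days × returns) array layout is an assumption of this example, not the G@RCH interface):

```python
import numpy as np

# Realized variance (Equation 7.12): per-day sum of squared intraday returns.
def realized_variance(intraday_returns):
    return (np.asarray(intraday_returns) ** 2).sum(axis=1)

rng = np.random.default_rng(7)
M = 288                                              # 5-minute returns per day
r = rng.standard_normal((2, M)) * np.sqrt(1.0 / M)   # two days with IV = 1
rv = realized_variance(r)
print(rv.shape)  # (2,)
```

With 288 returns per day, each realized variance lands close to the integrated variance of 1 used to generate the data.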

Figure 7.5 plots four volatility measures obtained when running
Simul_Continuous_GARCH_RV.ox.

1. The top left panel displays the simulated spot volatility $\sigma^2(t)$.
2. The top right panel displays the corresponding daily integrated volatility.
3. The bottom left panel plots the daily realized volatility.
4. And finally, the bottom right panel displays the GARCH(1,1) conditional variance.

The quality of the approximation of IV t by RV t(Δ) is evident from this graph.

Note that the example file Simul_Continuous_GARCH_R2.ox also reported the R2 of the Mincer-Zarnowitz regression using the realized volatility as endogenous variable. Not surprisingly, the R2 is very close to the previous value obtained for the IV t, i.e. 0.4219 vs. 0.4121.

### 7.4 Jumps

Empirical studies have shown that a continuous diffusion model as in Equation (7.5) fails to explain some characteristics of asset returns. Furthermore, standard GARCH models are not able to fully explain the excess kurtosis found in most financial time series. In a continuous time framework, the inadequacy of the standard stochastic diffusion model has led to developments of alternative models. Jump diffusion and stochastic volatility models have been proposed in the literature to overcome this inadequacy.

As common in the literature on non parametric estimation of the continuous and jump components in the quadratic variation (QVar), we suppose that the log-price process belongs either to the Brownian SemiMartingale (BSM) or the BSM with Jumps (BSMJ) families of models. Under the BSMJ model, the diffusion component captures the smooth variation of the price process, while the jump component accounts for the rare, large discontinuities in the observed prices. Andersen, Bollerslev, and Dobrev (2007) cite the work of several authors who found that this is a realistic model for the price series of many financial assets.

A BSMJ log-price diffusion admits the representation

$$dp(t) = \mu(t)\,dt + \sigma(t)\,dW(t) + \kappa(t)\,dq(t) \tag{7.14}$$

where μ(t) is a continuous locally bounded variation process, σ(t) is a strictly positive and càdlàg (right-continuous with left limits) stochastic volatility process, W(t) is a standard Brownian motion, and dq(t) is a counting process with dq(t) = 1 corresponding to a jump at time t and dq(t) = 0 otherwise. The (possibly time-varying) jump intensity is λ(t) and κ(t) is the size of the corresponding jump.

Jumps in stock prices are often assumed to follow a probability law. For instance, the jumps may follow a Poisson process, which is a continuous-time discrete process. For a given time t, let $X_t$ be the number of times a special event occurs during the time period [0,t]. Then $X_t$ is a Poisson process with intensity λ if:

$$P(X_t = k) = \frac{e^{-\lambda t}(\lambda t)^k}{k!}, \qquad k = 0, 1, 2, \ldots$$

The λ parameter governs the occurrence of the special event and is referred to as the rate or intensity of the process. Recall also that $E(X_t) = \lambda t$.
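A quick simulation confirms this mean (illustrative Python, arbitrary parameter values):

```python
import numpy as np

# Poisson counting process: the number of events in [0, t] is Poisson
# distributed with mean lambda * t, so E(X_t) = lambda * t.
rng = np.random.default_rng(3)
lam, t = 0.2, 5.0                 # on average one event every 5 time units
counts = rng.poisson(lam * t, size=50000)
print(round(counts.mean(), 2))    # close to lam * t = 1.0
```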

G@RCH also provides a procedure Simul_Continuous_GARCH_JUMPS to simulate a continuous-time GARCH diffusion process with jumps, formally defined by

$$dp(t) = \sigma(t)\,dW_p(t) + \kappa(t)\,dq(t) \tag{7.15}$$

$$d\sigma^2(t) = \theta\left(\omega - \sigma^2(t)\right)dt + \sqrt{2\lambda\theta}\,\sigma^2(t)\,dW_d(t) \tag{7.16}$$

$$dq(t) \sim \text{Poisson process with jump intensity } l \tag{7.17}$$

$$\kappa(t) = m\,\sigma(t)\,u(t), \qquad u(t) \sim \text{i.i.d. } U([-2,-1] \cup [1,2]) \tag{7.18}$$

where $W_p(t)$ and $W_d(t)$ denote two independent Brownian motions. The jump size κ(t) is thus modeled as m times the product of σ(t) and a uniformly distributed random variable on [-2,-1] ∪ [1,2]; the parameter m determines the magnitude of the jumps.

Figure 7.6 plots $IV_t$ and $RV_t$ for 500 days of simulated intraday returns. The DGP is given by Equations (7.15)-(7.18) with the same GARCH parameters as before, but m is set to 0.5 and l is chosen such that there is on average one jump every 5 days. The Ox code used to get this figure is Simul_Continuous_GARCH_JUMPS_IV_RV.ox.

Importantly, it is clearly visible that the realized volatility does not match the spot and integrated volatility.

This result is not surprising since we know by the theory of quadratic variation that for Δ → 0, we have the following convergence in probability:

$$\operatorname*{plim}_{\Delta \to 0} RV_t(\Delta) = \int_{t-1}^{t} \sigma^2(s)\,ds + \sum_{t-1 < s \le t} \kappa^2(s) \tag{7.19}$$

In other words, in the absence of jumps the realized volatility is a consistent estimator of the integrated volatility, but in the presence of jumps it converges to the sum of the integrated volatility and the cumulative squared jumps. Alternatives to the realized volatility that are robust to jumps are discussed in Section 7.6.
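The decomposition in Equation (7.19) can be illustrated numerically (a Python sketch with arbitrary jump sizes, not the G@RCH simulation code):

```python
import numpy as np

# With jumps, realized variance estimates the integrated variance PLUS the
# sum of squared jumps, so it overshoots IV.
rng = np.random.default_rng(5)
M = 2880
r = rng.standard_normal(M) * np.sqrt(1.0 / M)   # continuous part, IV = 1
jumps = np.zeros(M)
jumps[[700, 2000]] = [0.5, -0.4]                # two jumps within the day
rv = ((r + jumps) ** 2).sum()
print(round(rv, 2))   # roughly IV + 0.5**2 + 0.4**2 = 1.41
```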

### 7.5 Intraday Periodicity

Temporal dependence in volatility is one of the most striking features of financial series recorded at various frequencies. Interestingly, new phenomena become visible as one proceeds from daily returns to intraday returns.

The aim of this section is to show how G@RCH can be used to estimate the intraday (or intraweek) periodicity in volatility.

The codes used to prepare this section are also available in the folder /packages/Realized/samples/.

Note that the dataset has been simulated in such a way that the outputs and graphs obtained when running this example should be very similar to the ones reported in this section.

#### 7.5.1 Data

Like Andersen and Bollerslev (1997), Andersen and Bollerslev (1998a), Martens, Chang, and Taylor (2002) and Taylor and Xu (1997), we study the periodicity in intraday volatility of 5-minute EUR/USD returns.3

Our data set ranges from January 2000 to October 2004 and time is expressed in Eastern Time (EST = GMT - 5), taking into account the daylight saving time shifts in Europe and the USA. All the returns from Friday 16.05 EST through Sunday 16.00 EST are excluded. We also exclude from the sample some regular (Christmas, New Year...) and irregular (Easter...) holidays. We finally delete days where the data quality is dubious. We are left with T = 1175 days of M = 288 equally-spaced and continuously compounded 5-minute return observations $r_{t,i}$, as currencies are traded around the clock. The data consist of the last mid-point of each interval (the last average of log(bid) and log(ask)). The first interval of day t is 16.05 EST, the last is 16.00 EST.

#### 7.5.2 Evidence of intraday periodicity

Figure 7.7 (obtained by running example file Periodicity.ox) plots the lag 1 through 1440 (i.e. 288 × 5 = 1440, or five days) sample autocorrelations for the five-minute absolute returns $|r_{t,i}|$.

This graph displays a distinct U-shaped pattern in the ACF of absolute returns. Standard ARCH models imply a geometric decay in the absolute return autocorrelation structure and simply cannot accommodate strong regular cyclical patterns of the sort displayed in Figure 7.7.

The volatility dynamics of intraday financial time series are complex. There are intraday volatility patterns reflecting the daily activity cycle of the regional centers as well as weekend effects, macroeconomic announcement effects (immediately following the release) and standard volatility clustering at the daily level.

The behaviour of a time series is called periodic if it shows a periodic structure in addition to less regular movements. The foreign exchange (FX) market shows strong periodic effects caused by the presence of the traders in the three major markets, depending on the hour of the day, the day of the week and daylight saving time. The global FX market consists of three major markets, i.e., Asia, Europe and North America, and the major movements of intradaily return volatility can be attributed to the passage of market activity around the globe.

Figure 7.8 depicts the average absolute returns over the (288) five-minute intervals. This intraday pattern is quite similar across all days of the week with discrete changes in quoting activity marking the opening and closing of business hours in the three major regional centres, all of which have their own activity pattern.

The following hours can be used as indicative: the Far East is open from 16:00 EST (21:00 GMT) to 1:00 EST (6:00 GMT), Europe trades between 2:00 EST (7:00 GMT) and 11:00 EST (16:00 GMT) and trading in North America occurs from 7:00 EST (12:00 GMT) to 16:00 EST (21:00 GMT). Using the discussion of market opening and closures presented above, we explain the intraday periodic volatility as follows. At 19:00 EST, the Far Eastern market has already been trading for around three hours and market activity is high. From 19:00 EST until about 22:00 EST, activity levels and volatility remain high. The lunchtime in Tokyo (22:00 EST - 23:45 EST) is the point of the day corresponding to the most prominent feature of the series. Volatility drops sharply and regains its former value at about 0:00 EST. Generally, Europe begins to contribute to activity at around 2:00 EST as the Far Eastern market begins to wane: there is a small peak in volatility. During European lunch hours (starting around 6:30 EST), both activity and volatility show a slight lull. The most active period of the day is clearly when both the European and North American markets are open (between 7:00 EST and 11:00 EST). Volatility starts to decline as first the European and then US markets wind down. At around 16:00 EST, the Pacific market begins to trade again and the daily cycle is repeated after midnight. This intraday pattern is consistent with previous evidence reported in the literature, see Andersen and Bollerslev (1998b) among others.

Failure to take account of this intra-daily periodicity is likely to result in misleading statistical analyses as shown by Andersen and Bollerslev (1997) (in the sense that intraday returns do not conform at all to the theoretical aggregation results for the GARCH models).

The periodic phenomena in the volatility of FX markets can be modelled in a variety of ways. A GARCH specification with dummy variables can be used for modelling the conditional volatility. For our dataset, this would require estimating 288 time-of-day parameters, if one dummy variable were created for each five-minute interval. The number of parameters required is very large, and such a specification is unlikely to be effective in capturing the complexity of the periodic patterns.

#### 7.5.3 Classical and robust estimation of intraday periodicity

Recall that we dispose of T days of M equally-spaced and continuously compounded intraday returns and that rt,i is the i-th return of day t. We assume the log-price follows a Brownian SemiMartingale with Finite Activity Jumps (BSMFAJ) diffusion.

We assume also that, for small values of Δ, the returns $r_{t,i}$ in an interval without jumps in the underlying price diffusion are conditionally normally distributed with mean zero and variance $\sigma_{t,i}^2 = \int_{t+(i-1)\Delta}^{t+i\Delta} \sigma^2(s)\,ds$.

Due to the weekly cycle of opening, lunch and closing times of the financial centers around the world, the high-frequency return variance $\sigma_{t,i}^2$ has a periodic component $f_{t,i}^2$.

Depending on the nature of the analysis, there exists a natural window length for which almost all variation in $\sigma_{t,i}^2$ during the window can be attributed to $f_{t,i}^2$, such that $s_{t,i}^2 \equiv \sigma_{t,i}^2 / f_{t,i}^2$ is approximately constant over the local window. Henceforth G@RCH follows the suggestion of Andersen and Bollerslev (1997, 1998b), Andersen, Bollerslev, and Dobrev (2007) and Lee and Mykland (2008) to use local windows of one day.4

Andersen and Bollerslev (1997, 1998b) suggest to estimate $s_{t,i}$ by $\hat{s}_{t,i} = \sqrt{h_t/M}$, $i = 1,\ldots,M$, where $h_t$ is the conditional variance of day t obtained by estimating a GARCH model on daily returns. Under the BSM model, a more efficient estimator for $s_{t,i}$ is $\hat{s}_{t,i} = \sqrt{RV_t/M}$.

As we will see in Section 7.6.1, under the BSMFAJ model, $s_{t,i}$ is better approximated using a normalized version of Barndorff-Nielsen and Shephard (2004b)'s realized bi-power variation, i.e. $\hat{s}_{t,i} = \sqrt{BV_t/M}$, where $BV_t$ is the bi-power variation computed on all the intraday returns of day t (see Equation (7.38)).

To ensure identifiability of the periodicity factor $f_{t,i}$ with respect to the average variance of day t, we impose that the squared periodicity factor has mean one over the local window.

Under this model we have that when Δ → 0 and if $r_{t,i}$ is not affected by jumps, the standardized high-frequency return $\bar{r}_{t,i} = r_{t,i}/\hat{s}_{t,i}$ is conditionally normally distributed with mean zero and variance equal to the squared periodicity factor. This result suggests estimating the periodicity factor using either a non parametric or a parametric estimator of the scale of the standardized returns. The estimator has to be robust to price jumps.

##### Non parametric estimation of periodicity

The non parametric periodicity estimator is based on a scale estimate of the standardized returns $\bar{r}_{t,i} = r_{t,i}/\hat{s}_{t,i}$ that share the same periodicity factor.

Let $\bar{r}_{1;t,i},\ldots,\bar{r}_{n_{t,i};t,i}$ be the set of $n_{t,i}$ returns having the same periodicity factor as $\bar{r}_{t,i}$. If we condition the estimation of the periodicity factor only on the calendar effects, $\bar{r}_{1;t,i},\ldots,\bar{r}_{n_{t,i};t,i}$ are the returns observed at the same time of the day and day of the week as $\bar{r}_{t,i}$.

The classical periodicity estimator is based on the standard deviation

$$\hat{f}_{t,i}^{SD} = \frac{SD_{t,i}}{\sqrt{\frac{1}{M}\sum_{j=1}^{M} SD_{t,j}^2}} \tag{7.20}$$

where $SD_{t,i} = \sqrt{\frac{1}{n_{t,i}}\sum_{j=1}^{n_{t,i}} \bar{r}_{j;t,i}^2}$. This estimator is similar to Taylor and Xu (1997)'s periodicity estimate based on averages of squared returns.

In the absence of jumps, the standard deviation is efficient if the standardized returns are normally distributed. In the presence of jumps, however, this estimator is useless: a single observation affected by a jump suffices to make the periodicity estimate arbitrarily large.
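The SD-based filter and its normalization can be sketched as follows in Python (the (days × intraday bins) data layout and the smooth test pattern are assumptions of this illustration):

```python
import numpy as np

# SD-based (Taylor-Xu type) periodicity filter: per-bin scale of the
# standardized returns, normalized so that mean(f^2) = 1.
def sd_periodicity(std_returns):
    sd = np.sqrt((std_returns ** 2).mean(axis=0))   # scale per intraday bin
    return sd / np.sqrt((sd ** 2).mean())           # impose mean(f^2) = 1

rng = np.random.default_rng(11)
T, M = 500, 48
true_f = 0.5 + np.abs(np.sin(np.linspace(0.0, np.pi, M)))  # smooth daily pattern
true_f /= np.sqrt((true_f ** 2).mean())
r = rng.standard_normal((T, M)) * true_f            # returns with periodic scale
f_hat = sd_periodicity(r)
print(round(float((f_hat ** 2).mean()), 6))         # 1.0 by construction
```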

Because of the non-robustness of the previous estimator in the presence of jumps, Boudt, Croux, and Laurent (2008b) propose to replace the standard deviation in (7.20) by a robust non parametric estimator.

A first candidate is the so-called median absolute deviation (MAD). The MAD of a sequence of observations $y_1,\ldots,y_n$ is defined as

$$\text{MAD}(y_1,\ldots,y_n) = 1.486 \cdot \underset{i}{\text{med}}\left|y_i - \underset{j}{\text{med}}\, y_j\right| \tag{7.21}$$

where 1.486 is a correction factor to guarantee that the MAD is a consistent scale estimator at the normal distribution. The MAD estimator for the periodicity factor of $\bar{r}_{t,i}$ equals

$$\hat{f}_{t,i}^{MAD} = \frac{\text{MAD}_{t,i}}{\sqrt{\frac{1}{M}\sum_{j=1}^{M} \text{MAD}_{t,j}^2}} \tag{7.22}$$

where $\text{MAD}_{t,i}$ denotes the MAD of $\bar{r}_{1;t,i},\ldots,\bar{r}_{n_{t,i};t,i}$.

Amongst the large number of robust scale estimators available in the literature (see Maronna, Martin, and Yohai, 2006, for an overview), Boudt, Croux, and Laurent (2008b) also recommend the use of the Shortest Half scale estimator proposed by Rousseeuw and Leroy (1988), because it remains consistent in the presence of infinitesimal contamination by jumps in the data. Importantly, it has the property of being, among a wide class of scale estimators, the estimator for which jumps can cause the smallest maximum bias possible. Under normality, it has the same efficiency as the MAD and the interquartile range. A final advantage is that it is computationally convenient and does not need any location estimation. For the definition of the Shortest Half scale estimator, we need the corresponding order statistics $\bar{r}_{(1);t,i},\ldots,\bar{r}_{(n_{t,i});t,i}$ such that $\bar{r}_{(1);t,i} \le \bar{r}_{(2);t,i} \le \ldots \le \bar{r}_{(n_{t,i});t,i}$. The shortest half scale is the smallest length of all "halves" consisting of $h_{t,i} = \lfloor n_{t,i}/2 \rfloor + 1$ contiguous order statistics. These halves equal $\{\bar{r}_{(1);t,i},\ldots,\bar{r}_{(h_{t,i});t,i}\}$, ..., $\{\bar{r}_{(n_{t,i}-h_{t,i}+1);t,i},\ldots,\bar{r}_{(n_{t,i});t,i}\}$, and their lengths are $\bar{r}_{(h_{t,i});t,i} - \bar{r}_{(1);t,i}$, ..., $\bar{r}_{(n_{t,i});t,i} - \bar{r}_{(n_{t,i}-h_{t,i}+1);t,i}$, respectively. The corresponding scale estimator (corrected for consistency under normality) equals the minimum of these lengths:

$$\text{ShortH}_{t,i} = 0.741 \cdot \min\left\{\bar{r}_{(h_{t,i});t,i} - \bar{r}_{(1);t,i},\;\ldots,\;\bar{r}_{(n_{t,i});t,i} - \bar{r}_{(n_{t,i}-h_{t,i}+1);t,i}\right\} \tag{7.23}$$

The Shortest Half estimator for the periodicity factor of $\bar{r}_{t,i}$ equals

$$\hat{f}_{t,i}^{ShortH} = \frac{\text{ShortH}_{t,i}}{\sqrt{\frac{1}{M}\sum_{j=1}^{M} \text{ShortH}_{t,j}^2}} \tag{7.24}$$
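The Shortest Half scale itself is simple to compute. A Python sketch of the estimator (with an arbitrary contamination scenario to illustrate its robustness):

```python
import numpy as np

# Shortest Half scale: the minimum length over all contiguous "halves" of the
# order statistics, times 0.741 for consistency at the normal distribution.
def shorth_scale(y):
    y = np.sort(np.asarray(y))
    n = len(y)
    h = n // 2 + 1
    lengths = y[h - 1:] - y[:n - h + 1]   # lengths of all halves
    return 0.741 * lengths.min()

rng = np.random.default_rng(13)
clean = rng.standard_normal(2000)
contaminated = clean.copy()
contaminated[:20] += 50.0                    # 1% of observations hit by huge jumps
print(round(shorth_scale(clean), 1))         # roughly 1
print(round(shorth_scale(contaminated), 1))  # still roughly 1: robust to jumps
```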

The shortest half dispersion is highly robust to jumps, but it has only a 37% efficiency under normality of the $\bar{r}_{t,i}$'s. Boudt, Croux, and Laurent (2008b) show that a better trade-off between the efficiency of the standard deviation under normality and the high robustness to jumps of the shortest half dispersion is offered by the standard deviation applied to the returns, weighted as a function of their outlyingness under the ShortH estimate, i.e.

$$\hat{f}_{t,i}^{WSD} = \frac{\text{WSD}_{t,i}}{\sqrt{\frac{1}{M}\sum_{j=1}^{M} \text{WSD}_{t,j}^2}} \tag{7.25}$$

where

$$\text{WSD}_{t,i} = \sqrt{1.081 \cdot \frac{\sum_{j=1}^{n_{t,i}} w_{j;t,i}\,\bar{r}_{j;t,i}^2}{\sum_{j=1}^{n_{t,i}} w_{j;t,i}}}.$$

Because the weighting is applied to the squared standardized returns, which are extremely large in the presence of jumps, Boudt, Croux, and Laurent (2008b) recommend the use of hard rejection with threshold equal to the 99% quantile of the $\chi^2$ distribution with one degree of freedom, i.e.

$$w_{j;t,i} = w\!\left(\frac{\bar{r}_{j;t,i}^2}{\text{ShortH}_{t,i}^2}\right), \qquad w(z) = \begin{cases} 1 & \text{if } z \le 6.635 \\ 0 & \text{otherwise.} \end{cases} \tag{7.26}$$

The factor 1.081 is needed to ensure consistency of the estimator under normality. The Weighted Standard Deviation (WSD) in (7.25) has a 69% efficiency under normality of the $\bar{r}_{t,i}$'s.
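The WSD idea can be sketched in a few lines of Python. For simplicity the first-step robust scale below is the MAD rather than the ShortH estimate used in the text; the contamination scenario is an arbitrary illustration.

```python
import numpy as np

# Weighted standard deviation: hard-reject squared standardized values above
# the chi-square(1) 99% quantile (6.635) and rescale by 1.081 for consistency.
def wsd_scale(y):
    y = np.asarray(y)
    mad = 1.486 * np.median(np.abs(y - np.median(y)))   # robust first-step scale
    w = (y / mad) ** 2 <= 6.635                         # hard-rejection weights
    return np.sqrt(1.081 * (y[w] ** 2).sum() / w.sum())

rng = np.random.default_rng(17)
clean = rng.standard_normal(5000)
contaminated = clean.copy()
contaminated[:50] += 30.0                 # 1% jump contamination
print(round(wsd_scale(clean), 1))         # roughly 1
print(round(wsd_scale(contaminated), 1))  # still roughly 1
```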

Note that when the dataset is 'dated', the length of the periodicity filter equals 5 days; otherwise, it is equal to one day.

##### Parametric estimation of periodicity

The non parametric estimators for the periodic component of intraday volatility use only the subset of the data for which the returns have the same periodicity factor. Andersen and Bollerslev (1997) show that more efficient estimates can be obtained if the whole time series dimension of the data is used for the estimation of the periodicity process. Andersen and Bollerslev (1997) use the result that, under the assumption that returns are not affected by jumps (i.e. the BSM model), $r_{t,i} = f_{t,i}\,s_{t,i}\,u_{t,i}$, where $u_{t,i} \sim N(0,1)$, and so $\log\left(|r_{t,i}|/\hat{s}_{t,i}\right) = \log f_{t,i} + \log|u_{t,i}|$, which allows them to isolate $f_{t,i}$ as follows,

$$\log\left(\left|r_{t,i}\right|/\hat{s}_{t,i}\right) - c = \log f_{t,i} + \varepsilon_{t,i} \tag{7.27}$$

where the error term $\varepsilon_{t,i}$ is i.i.d. with mean zero and has the density function of the centered absolute value of the log of a standard normal random variable, i.e.

$$g(z) = \sqrt{\frac{2}{\pi}}\exp\!\left(z + c - \frac{1}{2}\exp\left(2(z + c)\right)\right) \tag{7.28}$$

The parameter c = -0.63518 equals the mean of the log of the absolute value of a standard normal random variable. Andersen and Bollerslev (1997) then propose to model $\log f_{t,i}$ as a function h of a vector of variables x (such as sinusoid and polynomial transformations of the time of the day) that is linear in the parameter vector θ

$$\log f_{t,i} = h\left(x_{t,i}; \theta\right) = x_{t,i}'\theta \tag{7.29}$$

Combining (7.27) with (7.29), we obtain the following regression equation

$$\log\left(\left|r_{t,i}\right|/\hat{s}_{t,i}\right) - c = x_{t,i}'\theta + \varepsilon_{t,i} \tag{7.30}$$

It is common to estimate the parameter θ in (7.30) by the OLS estimator. This approach is neither efficient nor robust, since the error terms are not normally distributed. Denote the loss functions of the OLS and Maximum Likelihood (ML) estimators by $\rho_{OLS}(z) = z^2$ and by

$$\rho_{ML}(z) = -\log g(z) = \frac{1}{2}\exp\left(2(z + c)\right) - z - c - \frac{1}{2}\log\frac{2}{\pi},$$

respectively. The OLS and ML estimates equal

$$\hat{\theta} = \arg\min_{\theta} \sum_{t=1}^{T}\sum_{i=1}^{M} \rho\left(\varepsilon_{t,i}\right), \tag{7.31}$$

where $\varepsilon_{t,i}$ is a function of θ (see Equation (7.30)).

Boudt, Croux, and Laurent (2008b) find that the ML estimator has a large bias in the presence of jumps. The non-robustness of the OLS and ML estimators to jumps is due to the unbounded effect an observation can have on their loss function. These loss functions are plotted in Figure 7.9. As mentioned by Martens, Chang, and Taylor (2002), the effect of jumps on the OLS and ML estimators is attenuated because the regression is based on the log of the standardized returns.

As an alternative to the OLS and ML estimators, Boudt, Croux, and Laurent (2008b) propose to use the Truncated Maximum Likelihood (TML) estimator introduced by Marazzi and Yohai (2004). This estimator gives a zero weight to observations that are outliers according to the value of the ML loss function evaluated at the corresponding residual computed under the robust non parametric estimator WSD in (7.25). Let

$$e_{t,i}^{WSD} = \log\left(\left|r_{t,i}\right|/\hat{s}_{t,i}\right) - c - \log \hat{f}_{t,i}^{WSD} \tag{7.32}$$

Observations for which $\rho_{ML}(e_{t,i}^{WSD})$ is large have a low likelihood and are therefore likely to be outliers (Marazzi and Yohai, 2004). Denote by q an extreme upper quantile of the distribution of $\varepsilon_{t,i}$. The TML estimator is defined as

$$\hat{\theta}^{TML} = \arg\min_{\theta} \sum_{t=1}^{T}\sum_{i=1}^{M} w_{t,i}\,\rho_{ML}\left(\varepsilon_{t,i}\right) \tag{7.33}$$

with

$$w_{t,i} = \begin{cases} 1 & \text{if } \rho_{ML}\left(e_{t,i}^{WSD}\right) \le \rho_{ML}(q) \\ 0 & \text{otherwise.} \end{cases}$$

The truncation on the basis of the value of the ML loss function is illustrated in Figure 7.9 for the 99.5% quantile. All observations with $\rho_{ML}(e_{t,i}^{WSD}) > 3.36$ receive a zero weight in the objective function of the TML estimator.

As for the non parametric periodicity estimators, we impose that the squared periodicity factor has mean one in the local window. The parametric estimate for the periodicity factor thus equals

$$\hat{f}_{t,i}^{TML} = \frac{\exp\left(x_{t,i}'\hat{\theta}^{TML}\right)}{\sqrt{\frac{1}{M}\sum_{j=1}^{M}\exp\left(2\,x_{t,j}'\hat{\theta}^{TML}\right)}} \tag{7.34}$$

and similarly for $\hat{f}_{t,i}^{OLS}$ and $\hat{f}_{t,i}^{ML}$.

#### 7.5.4 First illustration on simulated data

In this section we show how to use G@RCH 6.1 to estimate the periodicity filters presented above on simulated data.

The non parametric periodicity filters presented in Section 7.5.3 are implemented in the functions TaylorXu(...) and Robust_periodicity(...). Their use is illustrated in example file NP_periodicity_filters.ox. This example uses the simulated data obtained by running the files Store_Cont_GARCH_...ox available in the directory /packages/Realized/samples/. There are four different files simulating 5-min (FX) returns from a continuous-time GARCH(1,1) diffusion with or without intraday periodicity in volatility and with or without jumps. The simulated data are then stored in the directory /packages/Realized/samples/data/.

Let us concentrate first on Store_Cont_GARCH_FFF.ox. The aim of this program is to generate 5-minute returns following a continuous-time stochastic volatility model without drift, and with spot volatility σ(s) specified as a multiplicative process of the periodicity function f, which depends only on the time of day $s - \lfloor s \rfloor$, and a GARCH diffusion process (see Equation 7.9), i.e.

$$\sigma(s) = f\left(s - \lfloor s \rfloor\right)\,\sigma_{garch}(s) \tag{7.35}$$

where $\log f(s - \lfloor s \rfloor) = \theta' x(s - \lfloor s \rfloor)$ and $s - \lfloor s \rfloor$ denotes intraday time.

$\sigma_{garch}(s)$ is generated using the values θ = 0.035, ω = 0.636 and λ = 0.296. The function $f(s - \lfloor s \rfloor)$ used in the simulation is based on $\log f(s - \lfloor s \rfloor) = \theta' x(s - \lfloor s \rfloor)$, with x a vector holding sinusoid transformations of intraday time, i.e.

$$\log f_{t,i} = \sum_{p=1}^{P}\left[\gamma_p \cos\left(\frac{2\pi p\, i}{M}\right) + \delta_p \sin\left(\frac{2\pi p\, i}{M}\right)\right] \tag{7.36}$$

where M = 1/Δ, P is set to 4 and θ = (γ₁, δ₁, ..., γ₄, δ₄) = (-0.24422, -0.49756, -0.054171, 0.073907, -0.26098, 0.32408, -0.11591, -0.21442). In this case, the length of the periodicity cycle is one day and the periodicity is thus the same for all the days of the week. To simulate exchange rates, we choose Δ = 1/2880, corresponding to 10 observations per 5-minute interval. The number of simulated days is 3010, but the first ten days are discarded, giving a total of 3000 simulated days.
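The sinusoid specification (7.36) is easy to evaluate directly. The Python sketch below uses the θ values quoted above (at the 5-minute grid, M = 288) and applies the mean-one normalization of the squared factor imposed in the text:

```python
import numpy as np

# Evaluate log f as P = 4 cosine/sine pairs with the theta vector from the
# text, then normalize f so that mean(f^2) = 1 over the day.
M, P = 288, 4
theta = [-0.24422, -0.49756, -0.054171, 0.073907,
         -0.26098, 0.32408, -0.11591, -0.21442]   # (gamma_p, delta_p) pairs
i = np.arange(1, M + 1)
log_f = np.zeros(M)
for p in range(1, P + 1):
    gamma_p, delta_p = theta[2 * p - 2], theta[2 * p - 1]
    log_f += gamma_p * np.cos(2 * np.pi * p * i / M) \
        + delta_p * np.sin(2 * np.pi * p * i / M)
f = np.exp(log_f)
f /= np.sqrt((f ** 2).mean())        # impose mean(f^2) = 1
print(round(float((f ** 2).mean()), 6), bool(f.min() > 0))  # 1.0 True
```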

After running this file, the simulated data are automatically saved to ./data/Simulated_cont_GARCH_no_jumps_FFF.in7.

Figure 7.10 plots the true periodicity (dotted line) and the estimated periodicity (solid line) obtained by applying the four non parametric filters on 3000 days of 5-min simulated returns (288 returns per day). Since this model is compatible with the BSM model, the four methods should provide good estimates of the periodicity, and indeed all four do so. In a more comprehensive Monte Carlo study, Boudt, Croux, and Laurent (2008b) have shown that in the class of non parametric estimators, the SD is the most efficient estimator.

Figure 7.11 plots the true periodicity (dotted line) and the estimated periodicity (solid line) obtained by applying the three parametric filters on the same data (using the file P_periodicity_filters.ox). Table 7.1 reports the estimated values for the three methods.

As expected, we see that the three methods provide excellent estimates of the periodicity.

 Table 7.1: Estimation results for the periodicity filter. DGP is a GARCH(1,1) with periodicity

|               | TRUE      | OLS       | ML        | TML       |
|---------------|-----------|-----------|-----------|-----------|
| cos(2π·1·i/M) | -0.24422  | -0.24059  | -0.23810  | -0.23804  |
| sin(2π·1·i/M) | -0.49756  | -0.49946  | -0.49561  | -0.49699  |
| cos(2π·2·i/M) | -0.054171 | -0.055155 | -0.055193 | -0.054463 |
| sin(2π·2·i/M) | 0.073907  | 0.073229  | 0.072583  | 0.073019  |
| cos(2π·3·i/M) | -0.26098  | -0.27287  | -0.27080  | -0.27080  |
| sin(2π·3·i/M) | 0.32408   | 0.31672   | 0.31546   | 0.31562   |
| cos(2π·4·i/M) | -0.11591  | -0.10828  | -0.10778  | -0.10728  |
| sin(2π·4·i/M) | -0.21442  | -0.21937  | -0.21617  | -0.21666  |

Boudt, Croux, and Laurent (2008b) also show that all parametric estimators are more precise than the non parametric estimators. Among the parametric estimators, they find that the ML estimator is the most efficient. The TML estimator is found to be only slightly less efficient than the ML estimator and, as expected, the OLS estimator (Gaussian QML) is the least precise of all parametric estimators.

Let us now consider the case of a BSMJ model (see Equations (7.15)-(7.18), with the same GARCH parameters as before), with μ(s) = 0 and σ(s) specified as the same multiplicative process as in (7.35) (with the same parameter values as in the previous simulation).

The jump size κ(s) is modeled as the product of σ(s) and a uniformly distributed random variable on [-2,-1] ∪ [1,2]. The parameter m determines the magnitude of the jumps; we set m equal to 0.3 (rather small jumps). Finally, the jump occurrences q(s) are specified as a Poisson process with on average K = 2 jumps per day. Importantly, occurrences of jumps are concentrated on the parts of the day when volatility is periodically very low (f < 0.45). Figure 7.12 plots the number of simulated jumps per intraday period of time. The simulated data, i.e. Simulated_cont_GARCH_jumps_FFF_K_2_M_0.3.in7, are obtained by running Store_Cont_GARCH_Jumps_FFF.ox with the options K = 2, M = 0.3 and threshold = 0.45.5

Figure 7.13 plots the estimated periodicity obtained by applying the four non parametric filters on the simulated 5-min returns (using NP_periodicity_filters.ox).

In this case the SD is expected to be inappropriate because of its non-robustness to jumps. This is illustrated in Figure 7.13, where we see that Taylor and Xu (1997)'s periodicity filter (i.e. based on the SD) is very inaccurate, while the three robust alternatives are as good as in the previous case without jumps. Again, in a more comprehensive Monte Carlo study, Boudt, Croux, and Laurent (2008b) show that the robust non parametric estimators are little affected by the inclusion of jumps in the price process.

Figure 7.14 plots the true periodicity (dotted line) and the estimated periodicity (solid line) obtained by applying the three parametric filters on the same data (using the file P_periodicity_filters.ox). Table 7.2 reports the estimated values for the three methods. The ML estimator is found to be extremely sensitive to jumps; its optimality is thus restricted to the model without jumps. Since jumps do occur in practice, we do not recommend its use.

 Table 7.2: Estimation results for the periodicity filter. DGP is a GARCH(1,1) with periodicity and jumps

|               | TRUE      | OLS       | ML        | TML       |
|---------------|-----------|-----------|-----------|-----------|
| cos(2π·1·i/M) | -0.24422  | -0.24059  | -0.23810  | -0.23804  |
| sin(2π·1·i/M) | -0.49756  | -0.49946  | -0.49561  | -0.49699  |
| cos(2π·2·i/M) | -0.054171 | -0.055155 | -0.055193 | -0.054463 |
| sin(2π·2·i/M) | 0.073907  | 0.073229  | 0.072583  | 0.073019  |
| cos(2π·3·i/M) | -0.26098  | -0.27287  | -0.27080  | -0.27080  |
| sin(2π·3·i/M) | 0.32408   | 0.31672   | 0.31546   | 0.31562   |
| cos(2π·4·i/M) | -0.11591  | -0.10828  | -0.10778  | -0.10728  |
| sin(2π·4·i/M) | -0.21442  | -0.21937  | -0.21617  | -0.21666  |

The robustness of the OLS estimator is surprising at first sight, but it corroborates Martens, Chang, and Taylor (2002)’s intuition that the log-transformation shrinks the outliers and makes the estimators based on a regression of the log absolute returns more robust to jumps. However, while being much better than the ML estimator, the OLS (or Gaussian QML) estimator is quite affected by jumps, which is apparently not the case for the TML estimator (for which the fit is again almost perfect).

The main message of this simulation study is that the non parametric WSD and parametric TML estimators have a relatively high efficiency in the absence of jumps. If jumps are present in the process, they are the most accurate of all non parametric and parametric estimators considered, respectively. In all cases considered here, the TML estimator based on the correctly specified periodicity function is more efficient than the WSD estimator. While the advantage of the TML over the WSD is clear when the functional form of the periodicity is known, its major drawbacks are that this functional form is not known a priori in practical applications and its estimation is much more time consuming.

#### 7.5.5 Second illustration on EUR/USD data

In this section, we use the dataset described in Section 7.5.1, i.e. 5-min returns of the Euro - US dollar (EUR-USD) exchange rate from January 2000 to October, 2004.

To model the periodic component, we use a slightly more general specification than the one used in the previous simulation studies, i.e.

 (7.37)

where M = 1/Δ, P is set to 4, and N₁ = (M + 1)/2 and N₂ = (2M² + 3M + 1)/6 are normalizing constants. Note that Andersen and Bollerslev (1997) originally used the normalizing constant N₂ = (M + 1)(M + 2)/6. However, the analytic solution for the sum of squared integers is Σ_{k=1}^{M} k² = (2M³ + 3M² + M)/6, which leads to the value reported above when computing the mean instead of the sum.

The smooth periodicity generated from the Fourier terms is unlikely to cope well with the sharp drop in volatility, for instance around lunch in the Far East, or with day-of-the-week dependencies. To fill this gap, the term Σ_{k=1}^{D} λ_k I_{t,i}^{(k)} is included, where I_{t,i}^{(k)} is an indicator variable for event k during interval i of day t. The events may be calendar effects (day-of-the-week dummies, etc.) as well as announcement effects. For instance, to account for announcement effects we have included two dummies that equal one on Fridays at 8:35 and 8:40 EST.
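For concreteness, the regressor matrix implied by (7.37) can be sketched as follows. This is our own helper, not the G@RCH code: a linear and a quadratic term in i normalized by N₁ and N₂, P cosine/sine pairs, and optional event dummies.

```python
import numpy as np

def fff_regressors(M, P=4, dummies=None):
    """Fourier flexible form regressors for M intraday intervals."""
    i = np.arange(1, M + 1)
    N1 = (M + 1) / 2.0                  # mean of i over one day
    N2 = (2 * M**2 + 3 * M + 1) / 6.0   # mean of i^2 over one day
    cols = [i / N1, i**2 / N2]
    for p in range(1, P + 1):           # P cosine/sine pairs
        cols.append(np.cos(2 * np.pi * p * i / M))
        cols.append(np.sin(2 * np.pi * p * i / M))
    if dummies is not None:             # one column per calendar/news event
        cols.extend(np.asarray(d, dtype=float) for d in dummies)
    return np.column_stack(cols)

X = fff_regressors(288, P=4)            # 5-min data: M = 288 intervals
```

By construction the normalized trend columns i/N₁ and i²/N₂ each have mean one over the day, which is the point of using the means N₁ and N₂ rather than the sums.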

Figure 7.15 plots f̂_{t,i}^{SD}, f̂_{t,i}^{ShortH} and f̂_{t,i}^{WSD} from Monday to Friday (i.e. 288 × 5 values), while Figure 7.16 plots f̂_{t,i}^{OLS}, f̂_{t,i}^{ML} and f̂_{t,i}^{TML}. These graphs have been obtained by running the example file Periodicity.ox. Note that for ease of comparison between non parametric and parametric methods, Figure 7.16 also plots f̂_{t,i}^{WSD} on each graph.

We see that, while not radically different, the three non parametric estimates are not equivalent. For instance, the periodicity factor on Friday at 8:35 (peak around observation 1350) is found to be much higher for the non-robust f̂_{t,i}^{SD} estimator than for the robust ones.

The same comment applies to the parametric estimates: f̂_{t,i}^{ML} and f̂_{t,i}^{OLS} seem to overestimate the periodicity on that particular interval. Looking at Table 7.3 we see that the estimated coefficient for the 'Friday 8:35' dummy is about 1, 1.3 and 0.55 for the OLS, ML and TML methods, respectively.

Note also that, although the parametric and non parametric estimates are very close to each other, the specification chosen for the parametric methods does not fully capture the intraday periodicity and needs further refinement.

 Table 7.3: Estimation results for the parametric periodicity filters

| | OLS | ML | TML |
|---|---|---|---|
| Monday Morning | -0.033321 | 0.0025972 | -0.0050530 |
| Tuesday | 0.0066779 | -0.019706 | -0.015382 |
| Wednesday | 0.026778 | -0.020484 | -0.0056105 |
| Thursday | 0.014360 | -0.028472 | -0.022191 |
| Friday | -0.036773 | -0.058797 | -0.047153 |
| Friday 8:35 | 1.0090 | 1.3076 | 0.55753 |
| Friday 8:40 | 0.44006 | 0.42802 | 0.33174 |
| i/N1 | -0.99591 | -0.072911 | -0.45299 |
| i²/N2 | 0.69094 | 0.055195 | 0.33335 |
| cos(2π·1·i/M) | -0.37557 | -0.13875 | -0.24023 |
| sin(2π·1·i/M) | -0.36311 | -0.31468 | -0.30830 |
| cos(2π·2·i/M) | -0.088932 | -0.040887 | -0.061628 |
| sin(2π·2·i/M) | -0.052102 | -0.036089 | -0.037174 |
| cos(2π·3·i/M) | -0.17503 | -0.11762 | -0.13831 |
| sin(2π·3·i/M) | 0.083212 | 0.071998 | 0.084067 |
| cos(2π·4·i/M) | 0.061097 | 0.069227 | 0.066619 |
| sin(2π·4·i/M) | -0.080584 | -0.067013 | -0.070710 |

### 7.6 Robust to jumps volatility measures

We have seen in Section 7.4 that in the absence of jumps, realized volatility is a consistent estimator of integrated volatility. However, in the presence of jumps it is no longer consistent. Furthermore, it has been shown in the literature (see Andersen, Bollerslev, and Diebold, 2007) that jumps are much less persistent (and predictable) than the continuous sample path, or integrated volatility, component of the quadratic variation process. It is thus useful to separate the two components of the quadratic variation process. This was first done by Barndorff-Nielsen and Shephard (2004b) through the use of the so-called Bi-Power Variation (BPVar). In this section we also review an alternative to the BPVar, namely the Realized Outlyingness Weighted Variation (ROWVar), recently proposed by Boudt, Croux, and Laurent (2008a).

#### 7.6.1 Bi-Power Variation

Barndorff-Nielsen and Shephard (2004b) show that, for a subclass of BSMJ price diffusions (i.e. BSM with Finite Activity Jumps), the normalized sum of products of the absolute value of contiguous returns (i.e. bi-power variation) is a consistent estimator for the IVar.

The BPVar (implemented in the procedure Compute_BV()) is defined as:

 (7.38)

where μ₁ = √(2/π) ≈ 0.79788 and M = 1/Δ.

Unlike the RVar, the BPVar is designed to be robust to jumps because its building block is the product of the absolute values of two consecutive returns instead of the squared return. If one of the returns corresponds to a jump and the next one follows the BSM diffusion process, then this product has a small impact on the BPVar, which is the sum of many such building blocks. If the jump process has finite activity, then almost surely jumps cannot affect two contiguous returns as Δ → 0 (or equivalently M → ∞), and the jump process has a negligible impact on the probability limit of the BPVar, which coincides with the IVar. Under the BSM with Finite Activity Jumps (BSMFAJ) one has
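A minimal Python sketch of (7.38), assuming the common finite-sample scaling μ₁⁻² · M/(M−1) · Σ|r_i||r_{i−1}| (the exact factor used by Compute_BV() may differ):

```python
import numpy as np

MU1 = np.sqrt(2.0 / np.pi)  # E|u| for u ~ N(0,1), about 0.79788

def bpv(r):
    """Bi-power variation of one day of M high-frequency returns."""
    a = np.abs(np.asarray(r, dtype=float))
    M = len(a)
    return MU1**-2 * M / (M - 1.0) * np.sum(a[1:] * a[:-1])
```

Because a jump enters each cross-product only once (multiplied by an ordinary-sized neighbor), a single large return inflates the BPVar far less than it inflates the realized variance Σr².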

 (7.39)

To illustrate the applicability of the above results, let us consider the following picture, obtained by running the example file Simul_Continuous_GARCH_Jumps_1.ox. Note that the DGP is a continuous-time GARCH(1,1) with on average one jump every 5 days (with m = 0.5).

Looking at Figure 7.17, we see that, unlike RV_t(Δ) (see Figure 7.6), BV_t(Δ) is a good estimate of the integrated volatility (IV_t) in the presence of jumps.

#### 7.6.2 Realized Outlyingness Weighted Variance

For the impact of jumps on the BPVar to be negligible, the high-frequency returns need to be observed over extremely short time intervals (Δ → 0), so that jumps do not affect two contiguous returns and their effect on the price process is not smeared out over adjacent intervals. When returns are computed over longer time intervals such as 5 or 15 minutes, these conditions may be violated. Furthermore, the BPVar is sensitive to the occurrence of "zero" returns in the sample. The multivariate extension of the BPVar has the additional disadvantages that it is not affine equivariant and not necessarily positive semidefinite, which are undesirable properties for covolatility estimators. For these reasons, Boudt, Croux, and Laurent (2008a) have proposed an alternative to the BPVar that is robust to jumps affecting two or more contiguous returns. This estimator, called Realized Outlyingness Weighted Variance (ROWVar), is also more efficient than the BPVar under the BSM model, and its multivariate extension has attractive properties.

The computation of the ROWVar proceeds as follows.
Step 1: Estimation of local outlyingness
Boudt, Croux, and Laurent (2008a) measure the outlyingness d_{t,i} of return r_{t,i} as the square of the robustly studentized return:

 (7.40)

where σ̂_{t,i} is a robust estimate of the instantaneous volatility computed from all the returns belonging to the same local window as r_{t,i}. In Boudt, Croux, and Laurent (2008a), σ̂_{t,i} is the Median Absolute Deviation about the median (MAD) of all the returns belonging to the same local window, here equal to one day.

The consistency of this local volatility estimate requires the volatility to be approximately constant over the local window. Because of the strong intraday periodicity in volatility (see Section 7.5.3), this assumption does not hold when the length of that window exceeds a few minutes. Recall that, following Andersen and Bollerslev (1997, 1998b), Andersen, Bollerslev, and Dobrev (2007) and Lee and Mykland (2008), G@RCH uses local windows of length M, i.e. one day.

For this reason, in their empirical application, Boudt, Croux, and Laurent (2008b) propose to compute d_{t,i} on filtered returns r̄_{t,i} = r_{t,i}/f̂_{t,i} instead of the raw returns r_{t,i}, where f̂_{t,i} is any of the robust periodicity filters presented in Section 7.5.3.

For practical reasons, G@RCH only considers non parametric estimates for f̂_{t,i}, i.e. f̂_{t,i}^{MAD}, f̂_{t,i}^{ShortH} and f̂_{t,i}^{WSD}. Note that when the dataset is 'dated', the length of the periodicity filter is set to 5 days; otherwise it is equal to one day.
Step 2: Computation of ROWVar

Once one has such an outlyingness measure d_{t,i} for all the high-frequency returns in the interval [t − 1, t], a weight function has to be chosen. This weight function equals one for returns whose outlyingness d_{t,i} raises no suspicion that the corresponding return r_{t,i} might be affected by jumps, and it goes to 0 the more extreme d_{t,i} is. Popular weight functions are the hard rejection (HR) weight function

 (7.41)

and the soft rejection (SR) Huber weight function

 (7.42)

with k a tuning parameter to be selected. Under the BSM model and some weak assumptions stated in Boudt, Croux, and Laurent (2008a), the outlyingness measure is asymptotically chi-squared distributed with 1 degree of freedom (χ²_1). The outlyingness of a return affected by a jump will then lie in the extreme tail of the χ²_1 distribution. Consequently, for the ROWVar the rejection threshold k can be set to χ²_1(β), the β quantile of the χ²_1 distribution. Boudt, Croux, and Laurent (2008a) recommend using the SR weight function with β = 95% as a good compromise between robustness and efficiency.

Given a series of high-frequency returns, their estimated outlyingness and a weight function, one can then compute the ROWVar as follows:

 (7.43)

The correction factor cw in (7.43) ensures that the ROWVar is consistent for the IVar under the BSM model. Table 7.4 (see below) reports the correction factors for the hard and soft rejection weight functions for several values of the critical level β.

| β | 1 | 0.99 | 0.975 | 0.95 | 0.925 | 0.90 | 0.85 | 0.80 |
|---|---|---|---|---|---|---|---|---|
| c_w (HR) | 1 | 1.081 | 1.175 | 1.318 | 1.459 | 1.605 | 1.921 | 2.285 |
| c_w (SR) | 1 | 1.017 | 1.041 | 1.081 | 1.122 | 1.165 | 1.257 | 1.358 |
| θ (HR) | 2 | 2.897 | 3.707 | 4.674 | 5.406 | 5.998 | 6.917 | 7.592 |
| θ (SR) | 2 | 2.072 | 2.184 | 2.367 | 2.539 | 2.699 | 2.989 | 3.245 |
| d_w (HR) | 0.333 | 0.440 | 0.554 | 0.741 | 0.945 | 1.177 | 1.760 | 2.566 |

 Table 7.4: c_w, θ and d_w for different critical levels β (such that the threshold k = χ²_1(β), with χ²_1(β) the β quantile of the χ²_1 distribution)

Let us now compute the ROWVar on the same simulated data as in the previous section. Figure 7.18, obtained by running the example file Simul_Continuous_GARCH_Jumps_2.ox, suggests that ROWVar_t(Δ) also does a good job in estimating the integrated volatility (IV_t) in the presence of jumps.
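The two ROWVar steps can be sketched as follows, with the MAD-based outlyingness of Step 1 and the SR weights of Step 2. This is an illustration, not the G@RCH routine: the χ²_1(0.95) quantile 3.841 is hard-coded, the default correction c_w = 1.081 is the SR value for β = 0.95 from Table 7.4, and 0.6745 is the usual constant making the MAD consistent at the normal distribution.

```python
import numpy as np

def rowvar(r, k=3.841, c_w=1.081):
    """Realized Outlyingness Weighted Variance of one day of M returns,
    with soft-rejection (Huber) weights w = min(1, k/d)."""
    r = np.asarray(r, dtype=float)
    M = len(r)
    # Step 1: outlyingness d_i = (r_i / robust scale)^2, MAD-based
    mad = np.median(np.abs(r - np.median(r))) / 0.6745
    d = (r / mad) ** 2
    # Step 2: soft-rejection weights, then rescale for consistency
    w = np.ones_like(d)
    w[d > k] = k / d[d > k]
    return c_w * M * np.sum(w * r**2) / np.sum(w)
```

A return hit by a jump gets weight k/d, so its contribution w·r² is capped at k·mad²; a single huge return therefore barely moves the estimate, unlike the realized variance.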

A more formal comparison between bi-power variation and ROWVar will be presented in Section 7.7.

#### 7.6.3 MinRV and MedRV

Two attractive robust estimators of the integrated volatility have been recently proposed by Andersen, Dobrev, and Schaumburg (2008). The MinRV and MedRV (implemented in the procedures Compute_MinRV() and Compute_MedRV()) are defined as follows:

MinRV_t(Δ) = μ₂ (M/(M − 1)) Σ_{i=1}^{M−1} min(|r_{t,i}|, |r_{t,i+1}|)²  (7.44)

MedRV_t(Δ) = μ₃ (M/(M − 2)) Σ_{i=2}^{M−1} med(|r_{t,i−1}|, |r_{t,i}|, |r_{t,i+1}|)²  (7.45)

where μ₂ = π/(π − 2), μ₃ = π/(6 − 4√3 + π), min stands for minimum and med for median.

The MedRV is meant to be less exposed to zero returns than the BPVar.
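The two estimators can be sketched as follows, assuming the usual finite-sample factors M/(M−1) and M/(M−2); an illustration, not the Compute_MinRV()/Compute_MedRV() code:

```python
import numpy as np

def minrv(r):
    """MinRV: scaled sum of squared minima of adjacent absolute returns."""
    a = np.abs(np.asarray(r, dtype=float))
    M = len(a)
    return np.pi / (np.pi - 2) * M / (M - 1.0) * np.sum(np.minimum(a[1:], a[:-1]) ** 2)

def medrv(r):
    """MedRV: scaled sum of squared medians of three adjacent absolute returns."""
    a = np.abs(np.asarray(r, dtype=float))
    M = len(a)
    trip = np.stack([a[:-2], a[1:-1], a[2:]])
    return np.pi / (6 - 4 * np.sqrt(3) + np.pi) * M / (M - 2.0) * np.sum(np.median(trip, axis=0) ** 2)
```

An isolated jump is wiped out entirely: the minimum of the jump and its ordinary neighbor (or the median of the jump and its two neighbors) ignores the jump return altogether.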

### 7.7 Daily jump tests

Evidently, the difference between RV_t and any robust-to-jumps estimator of IV_t, denoted Ĉ_t(Δ), is an estimate of the jump contribution, or realized jumps.

More formally,

 (7.46)

where Ĉ_t(Δ) is for instance BV_t(Δ) or ROWVar_t(Δ).

Based on the theoretical results of Barndorff-Nielsen and Shephard (2006) that

 (7.47)

Andersen, Bollerslev, and Diebold (2007) have developed a formal test for (daily) jumps, i.e.

 (7.48)

where ÎQ_t is a robust-to-jumps estimate of the integrated quarticity IQ_t ≡ ∫_{t−1}^{t} σ⁴(s) ds.

Andersen, Bollerslev, and Diebold (2007) consider the case Ĉ_t(Δ) = BV_t(Δ) and use the so-called Tri-Power Quarticity TQ_t(Δ) to estimate IQ_t, where

 (7.49)

with μ_{4/3} ≡ 2^{2/3} Γ(7/6) Γ(1/2)^{−1}.

Another popular estimator for IQ_t, in the spirit of the bi-power (or multi-power) variation, is the Quad-Power Quarticity QQ_t(Δ), i.e.

 (7.50)

When Ĉ_t(Δ) = BV_t(Δ), θ = μ₁^{−4} + 2μ₁^{−2} − 3 ≈ 2.609. Note that TQ_t(Δ) and QQ_t(Δ) are implemented in the procedures Compute_TQ() and Compute_QQ(), respectively.
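A sketch of (7.49) and (7.50), assuming the standard finite-sample factors M/(M−2) and M/(M−3); not the Compute_TQ()/Compute_QQ() implementations:

```python
import math
import numpy as np

MU1 = math.sqrt(2.0 / math.pi)                                  # E|u|
MU43 = 2 ** (2.0 / 3) * math.gamma(7.0 / 6) / math.gamma(0.5)   # E|u|^(4/3)

def tq(r):
    """Tri-power quarticity: products of three adjacent |r|^(4/3)."""
    a = np.abs(np.asarray(r, dtype=float)) ** (4.0 / 3)
    M = len(a)
    return M * MU43**-3 * M / (M - 2.0) * np.sum(a[2:] * a[1:-1] * a[:-2])

def qq(r):
    """Quad-power quarticity: products of four adjacent |r|."""
    a = np.abs(np.asarray(r, dtype=float))
    M = len(a)
    return M * MU1**-4 * M / (M - 3.0) * np.sum(a[3:] * a[2:-1] * a[1:-2] * a[:-3])
```

Both estimators spread a potential jump over several adjacent factors, which is what makes them robust to a single jump but, as noted below, also more sensitive to zero returns than the BPVar.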

The main drawback of TQ_t(Δ) and QQ_t(Δ) is that, like BV_t(Δ), they are downward biased in the presence of zero returns. Furthermore, the bias is expected to be even larger in finite samples than for BV_t(Δ), because they are made up of products of three and four consecutive absolute returns, respectively.

To overcome this problem, Boudt, Croux, and Laurent (2008a) have proposed to replace Ĉ_t(Δ) in Equation (7.48) by ROWVar_t(Δ) and ÎQ_t(Δ) by the Realized Outlyingness Weighted Quarticity

 (7.51)

where w(.) is the hard rejection weight function. The correction factor dw and the asymptotic variance of the ROWVar θ are reported in Table 7.4 for several choices of the critical level β (used to get the outlyingness threshold k).

In the spirit of their MinRV and MedRV estimators, Andersen, Dobrev, and Schaumburg (2008) propose two alternative robust estimators of the integrated quarticity, namely the MinRQ and MedRQ. The formulas are given below (and implemented in the procedures Compute_MinRQ() and Compute_MedRQ()):

where μ₄ = π/(3π − 8) and μ₅ = 3π/(9π + 72 − 52√3).

Note also that Andersen, Dobrev, and Schaumburg (2008) show that both the MinRV and MedRV satisfy (7.47), where θ equals 3.81 for the former and 2.96 for the latter (the MedRV being asymptotically more efficient than the MinRV in the absence of jumps).

Barndorff-Nielsen and Shephard (2006) advocated the use of a log version of the Z_t statistic. According to them, the following statistic

 (7.54)

has better finite sample properties.

They also proposed a max version of logZt, denoted maxlogZt, where

 (7.55)

Under the null of no jump on day t, Z_t, log Z_t and max log Z_t are asymptotically (as Δ → 0) standard normal. The sequences {Z_t(Δ)}_{t=1}^{T}, {log Z_t}_{t=1}^{T} and {max log Z_t}_{t=1}^{T} provide evidence on the daily occurrence of jumps in the price process.

The outlyingness measure d_{t,i} can also be used to build a statistic for daily jump detection. Indeed, under the null of no jump during day t, d_{t,i} ~ χ²(1), i = 1, …, M (see Equation 7.40). In the spirit of the test proposed by Lee and Mykland (2008) (see next section), one can show that the daily maximum of √d_{t,i}, suitably normalized, follows a Gumbel distribution under the null. More specifically, we reject the null of no jump during day t at the α% critical level if

where G^{−1}(1 − α) is the 1 − α quantile function of the standard Gumbel distribution, C_n = (2 log n)^{1/2} − [log π + log(log n)] / [2(2 log n)^{1/2}] and S_n = 1/(2 log n)^{1/2}. When n = M or n = MT, the expected number of spuriously detected (daily) jumps equals αT and α, respectively.
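The critical value S_n·G^{−1}(1 − α) + C_n can be computed as follows, assuming the Lee and Mykland (2008) normalizations C_n = (2 log n)^{1/2} − [log π + log log n]/[2(2 log n)^{1/2}] and S_n = (2 log n)^{−1/2}; an illustrative helper, not the G@RCH code:

```python
import math

def gumbel_threshold(n, alpha):
    """Critical value for the max-based jump test at level alpha,
    with n the number of maxima-forming observations."""
    c = math.sqrt(2 * math.log(n))
    C_n = c - (math.log(math.pi) + math.log(math.log(n))) / (2 * c)
    S_n = 1.0 / c
    beta_star = -math.log(-math.log(1 - alpha))  # Gumbel (1-alpha) quantile
    return S_n * beta_star + C_n
```

The threshold grows with n and shrinks with α, so taking n = MT rather than n = M buys a more conservative test at the cost of power.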

If It,α(Δ) is a dummy variable that equals 1 if a jump has been detected on day t at the α (e.g. 0.999) significance level (using any of the tests presented above) and 0 otherwise, a better estimate of the realized jumps is given by:

 (7.56)

To make sure that the jump component added to the continuous one equals the realized volatility, we define as an estimator of the integrated variance

 (7.57)

Note that when α = 0, C_{t,α}(Δ) = Ĉ_t(Δ), ∀t.

### 7.8 Intraday jump tests

So far, we have considered the estimation of functions of jumps over given time intervals. The approach taken in this section is different. We explain how to test whether any given intraday return r_{t,i} comes from a purely continuous diffusion or is instead due to a jump in the price process. What follows is mainly based on the work of Lee and Mykland (2008) and Boudt, Croux, and Laurent (2008b).

The idea of Lee and Mykland (2008) for jump detection is intuitive. If a return contains a jump component, it should be abnormally big. However, what constitutes an abnormally big return depends on the volatility condition prevailing at the time of the tested return. Indeed, in times of high volatility, an abnormal return is bigger than an abnormal return in times of low volatility. Hence, Lee and Mykland (2008) study the properties of the ratio of the tested return to a measure of local volatility. They derive asymptotic theory for the statistic and a rejection region under the null of no jump at the tested time. In short, they propose a powerful, parsimonious methodology that allows one to test whether any return contains a jump component, and to determine the sign of the jump and its exact timing.

The statistic J_{t,i} tests whether a jump occurred between intraday periods i − 1 and i of day t. It is defined as the absolute return divided by an estimate of the local standard deviation σ̂_{t,i}, i.e.

 (7.58)

Under the null of no jump at the testing time, provided the process belongs to the family of BSMJ models described in Equation (7.14) and the window size for the local volatility is suitably chosen, J_{t,i} asymptotically follows the same distribution as the absolute value of a standard normal variable.

Note that Lee and Mykland (2008) recommend replacing σ̂_{t,i} by ŝ_{t,i} = √(BV_t/M), where BV_t is the bi-power variation computed on all the intraday returns of day t. Boudt, Croux, and Laurent (2008b) propose to account for the strong periodicity in volatility and show that replacing ŝ_{t,i} by f̂_{t,i} ŝ_{t,i} is more appropriate. They show that ignoring periodic volatility patterns leads to spurious jump identification. Indeed, the original Lee-Mykland statistic (which neglects the periodicity) tends to overdetect (underdetect) jumps in periods of high (low) intraday periodic volatility. G@RCH 6.1 gives the choice between the three robust non parametric estimation methods described in Section 7.5.3 to estimate f̂_{t,i}, i.e. f̂_{t,i}^{MAD}, f̂_{t,i}^{ShortH} and f̂_{t,i}^{WSD}. Recall that when the dataset is 'dated', the length of the periodicity filter is set to 5 days; otherwise it is equal to one day.

Under the null of no jump and with a consistent estimate σ̂_{t,i}, J_{t,i} follows the same distribution as the absolute value of a standard normal variable. Brownlees and Gallo (2006) propose comparing J_{t,i} with the 1 − α/2 quantile of the standard normal distribution. This rule might spuriously detect many jumps, however. Andersen, Bollerslev, and Dobrev (2007) use a Bonferroni correction to minimize spurious jump detection. To minimize the risk of falsely finding jumps, Lee and Mykland (2008) propose inferring jumps from a conservative critical value, which they obtain from the distribution of the statistic's maximum over the sample size. If the statistic exceeds a plausible maximum, one rejects the null of no jump. Under the stated assumptions and no jump in the interval [i − 1, i] of day t, as Δ → 0 the suitably normalized sample maximum of the jump statistic J_{t,i} follows a Gumbel distribution. We reject the null of no jump if

 (7.59)

where G^{−1}(1 − α) is the 1 − α quantile function of the standard Gumbel distribution, C_n = (2 log n)^{1/2} − [log π + log(log n)] / [2(2 log n)^{1/2}] and S_n = 1/(2 log n)^{1/2}. When n = 1, the test is similar to the one of Brownlees and Gallo (2006) in the sense that the expected number of spuriously detected jumps (under the null) can be extremely large, i.e. αMT. When n = M (i.e. the number of observations per day) or n = MT (i.e. the total number of observations), this number equals αT and α (i.e. ≈ 0), respectively. So if we choose a significance level of α = 0.0001, we reject the null of no jump at testing time if J_{t,i} > S_n β* + C_n, with β* such that P(ψ ≤ β*) = exp(−e^{−β*}) = 0.9999, i.e. β* = −log(−log(0.9999)) = 9.21.
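The unfiltered test can be sketched as follows, assuming the daily bi-power-based local scale ŝ = (BV_t/M)^{1/2}; names and finite-sample scalings are ours, not the Compute_LeeMykJump code:

```python
import math
import numpy as np

def lee_mykland_flags(r, alpha=0.0001):
    """Flag intraday jumps in one day of M returns via the max-based rule."""
    r = np.asarray(r, dtype=float)
    M = len(r)
    a = np.abs(r)
    # daily bi-power variation as the local variance estimate (assumed scaling)
    bv = (np.pi / 2) * M / (M - 1.0) * np.sum(a[1:] * a[:-1])
    s = math.sqrt(bv / M)                    # per-interval local std. dev.
    J = a / s                                # studentized absolute returns
    # Gumbel critical value with n = M
    c = math.sqrt(2 * math.log(M))
    C_n = c - (math.log(math.pi) + math.log(math.log(M))) / (2 * c)
    S_n = 1.0 / c
    beta_star = -math.log(-math.log(1 - alpha))
    return J > S_n * beta_star + C_n         # True where a jump is flagged
```

Because the scale comes from the jump-robust bi-power variation, a single large return stands out against a denominator that it barely inflates.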

The L&M test is implemented in the function Compute_LeeMykJump of the Realized class. An example is provided in the file
Jumps_JumpDetection_LeeMykland.ox. In this example, we simulate a GARCH diffusion with jumps and intraday periodicity (see Section 7.4) and apply the L&M test.

The first graph on Figure 7.19 shows the simulated 5-min log-prices. The other graphs plot the simulated (plus) and detected (squares) jumps using the L&M test. Note that the presence of intraday periodicity is ignored in the second graph. The third graph corresponds to the filtered L&M test that implements the correction proposed by Boudt, Croux, and Laurent (2008b), i.e. replacing ŝ_{t,i} by f̂_{t,i} ŝ_{t,i}, where in this example we chose the WSD periodicity filter f̂_{t,i}^{WSD}.

We see that neglecting the presence of intraday periodicity has a strong impact on the power of the test. Indeed, while all the simulated jumps are correctly detected for the filtered L&M test, most jumps (i.e. small ones) are not detected when ignoring the intraday periodicity. See Boudt, Croux, and Laurent (2008b) for a more comprehensive simulation study.

### 7.9 Multivariate case

In the case where r_{t,i} is an N-dimensional return vector generated by the multivariate counterpart of the BSMJ price diffusion model in (7.14), the processes p(s), μ(s) and q(s) are all N-dimensional vector processes and w(s) is a vector of N independent Brownian motions. Denote by Ω(s) the N × N càdlàg process such that Σ(s) = Ω(s)Ω′(s) is the spot covariance matrix process of the continuous component of the price diffusion. Let K be the N × N process controlling the magnitude and transmission of jumps, such that K(s)dq(s) is the contribution of the jump process to the price diffusion. We then have that an N-dimensional log-price diffusion can be decomposed as follows:

The integrated covariance matrix (ICov) over [t - 1,t] is the matrix
 (7.61)

Denote by κj the contribution of the j-th jump in [t - 1,t] to the price diffusion.

#### 7.9.1 Realized Quadratic Covariation

Andersen, Bollerslev, Diebold, and Labys (2003) have shown that the Realized quadratic covariation (RCov)

 (7.62)

is a consistent estimator for the sum of the ICov and the realized jump variability

 (7.63)

where j_t = ∫_{t−1}^{t} dq*(s), with q*(s) the univariate counting process derived from q(s), such that q*(s) increases by 1 whenever q(s) changes.

#### 7.9.2 Realized BiPower Covariation

For disentangling the continuous and jump components in the RCov, we need an additional estimator for the ICov that is robust to jumps. To this purpose, Barndorff-Nielsen and Shephard (2004a) introduce the Realized BiPower Covariation process (RBPCov) as the process whose value at time t is the N-dimensional square matrix with k,l-th element equal to

where r_{t,i}^{(k)} is the k-th component of the return vector r_{t,i}. The factor π/8 ensures that the RBPCov converges to the ICov under the BSMFAJ model:
 (7.65)

Like the BPVar, the RBPCov is highly affected by jumps when these affect two contiguous returns. The RBPCov, being a multivariate scale estimator, has the additional disadvantages that it is not affine equivariant and not necessarily positive semidefinite, which are undesirable properties for covolatility estimators.
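The k,l element of the RBPCov can be sketched via the polarization identity in (7.64); on the diagonal it reduces to the bi-power variation (up to the finite-sample factor). An illustration, not the G@RCH implementation:

```python
import numpy as np

def rbpcov_element(rk, rl):
    """k,l element of the RBPCov from two aligned return series."""
    rk, rl = np.asarray(rk, float), np.asarray(rl, float)
    plus = np.abs(rk + rl)
    minus = np.abs(rk - rl)
    return np.pi / 8 * np.sum(plus[1:] * plus[:-1] - minus[1:] * minus[:-1])

def rbpcov(R):
    """R: M x N matrix of high-frequency returns; returns the N x N RBPCov."""
    N = R.shape[1]
    return np.array([[rbpcov_element(R[:, k], R[:, l]) for l in range(N)]
                     for k in range(N)])
```

Setting k = l gives (π/8)·Σ|2r_i||2r_{i−1}| = (π/2)·Σ|r_i||r_{i−1}|, i.e. the univariate bi-power building block, which is how the π/8 factor arises.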

In the next subsection, we show that the inclusion of an appropriate weight function in the RCov, that downweights extreme returns, leads to an estimator that has the following properties:

1. it is affine equivariant and yields positive semidefinite matrices;
2. at the BSM model it is consistent for the ICov and has a high efficiency;
3. at the BSMFAJ model it is consistent for the ICov.

#### 7.9.3 ROWQCov

The computation of the ROWQCov is analogous to the computation of the ROWVar, but here rt,i is a return vector of dimension N.
Step 1: estimation of local multivariate outlyingness

We estimate the multivariate outlyingness of the return vector r_{t,i} by the Mahalanobis distance between r_{t,i} and 0, using a highly robust estimator of the multivariate scale S(r_{t,i}) of the returns belonging to the same local window as r_{t,i}. Formally, the outlyingness of r_{t,i} equals

 (7.66)

Under the BSM model and some weak assumptions stated in Boudt, Croux, and Laurent (2008a), the Mahalanobis outlyingness measure is asymptotically chi-squared distributed with N degrees of freedom (χ²_N), provided that the spot covariance matrix is constant over the local window.

For the same reasons as those evoked in Sections 7.6.2 and 7.8, the assumption of a constant covariance is not realistic for local windows of length one day. For that reason, the user is given the choice to compute d_{t,i} on raw returns (i.e. Equation (7.66)) or on filtered returns. The three non parametric techniques discussed in Section 7.5.3 are offered, i.e. f̂_{t,i}^{MAD}, f̂_{t,i}^{ShortH} and f̂_{t,i}^{WSD}. Note that the N periodicity filters are estimated separately.

A common choice for a highly robust multivariate scale estimator is the Minimum Covariance Determinant (MCD) covariance estimate proposed by Rousseeuw and Driessen (1999). It is defined as follows. We call a halfsample of a local window a subsample of h = ⌊0.5(M + N + 1)⌋ returns belonging to that local window. Denote by S the covariance matrix computed from the halfsample whose covariance matrix has the smallest determinant. Then compute, for the i-th return belonging to the same local window as r_{t,i}, the weight w_i = w_{HR}[r′_{t,i}(c_{0.5}S)^{−1}r_{t,i}]. The MCD multivariate scale estimator is then given by the weighted covariance matrix of all returns belonging to the local window

The scalars c_{0.5} and c_{0.95} are the correction factors needed for consistency at the multivariate normal distribution. One has c_β = β / F_{χ²_{N+2}}(χ²_N(β)), where F_{χ²_{N+2}}(·) is the χ²_{N+2} distribution function (Croux and Haesbroeck, 1999).

The MCD covariance estimator is our preferred choice to construct the outlyingness. However, one can also use other robust multivariate scale estimators, such as the average RBPCov (see above). Note that the positive semidefiniteness of the ROWQCov is preserved when d_{t,i} is built upon the RBPCov.
Step 2: computation of ROWQCov

Because the Mahalanobis outlyingness measures are scalar-valued and asymptotically χ²_N distributed, one can use the hard and soft rejection weight functions in (7.41)-(7.42), with a threshold equal to an extreme quantile of the χ²_N distribution, to downweight the high-frequency returns that are outlying with respect to the majority of returns in their local window. The ROWQCov is then defined as follows.

Definition 2 Let r_{t,i}, for i = 1, …, M, be a sample of M high-frequency returns and d_{t,i} their outlyingness values based on local windows of length one day. For a given weight function w(·), the Realized Outlyingness Weighted Quadratic Covariation is defined as

 (7.67)
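Given precomputed outlyingness values, (7.67) can be sketched as a weighted covariance. This illustration uses hard-rejection weights with a user-supplied χ²_N threshold and a correction factor from Table 7.5, and omits the MCD-based outlyingness step:

```python
import numpy as np

def rowqcov(R, d, k, c_w):
    """R: M x N returns, d: outlyingness per row, k: chi2_N(beta) threshold,
    c_w: consistency correction (Table 7.5). Returns the N x N ROWQCov."""
    R = np.asarray(R, dtype=float)
    d = np.asarray(d, dtype=float)
    w = (d <= k).astype(float)              # hard-rejection weights
    M = R.shape[0]
    weighted = (R * w[:, None]).T @ R       # sum_i w_i r_i r_i'
    return c_w * M * weighted / w.sum()
```

Since the output is a nonnegatively weighted sum of rank-one matrices r_i r_i′ times a positive scalar, positive semidefiniteness is automatic, which is one of the advantages over the RBPCov.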

#### 7.9.4 Correction factor for ROWVar and ROWQCov

The correction factor c_w in (7.67) ensures that the ROWQCov is consistent for the ICov at the BSM and BSMFAJ models. It depends on the dimension N of the process and on the weight function used. Denote by E[g(u)] the expectation of g(u), where u is an N-dimensional standard normal vector. The correction factor c_w is then given by

 c_w = N E[w(u′u)] / E[w(u′u) u′u].

For the HR weight function defined in (7.41) with threshold k = χ²_N(1 − α), c_w = (1 − α)/F_{χ²_{N+2}}(χ²_N(1 − α)), where F_{χ²_{N+2}}(·) is the χ²_{N+2} distribution function. Table 7.5 reports the correction factors for the hard and soft rejection weight functions. For N = 1, the correction factor c_w is identical to the one used in (7.43).

 Table 7.5: Correction factors cw when the dimension of the series is N and for the hard and soft rejection weight functions with threshold k