Authors and Contributors

About the Author

Dr. Sébastien Laurent, the copyright owner of G@RCH, holds a Ph.D. (2002) in Financial Econometrics from Maastricht University (The Netherlands). In 2002-2003 he worked as a post-doctoral researcher at CREST (Laboratory of Finance-Insurance) in Paris (France). He spent six years as an associate professor in Econometrics at the Facultés Universitaires Notre-Dame de la Paix in Namur (Belgium) and is currently associate professor in financial econometrics at Maastricht University, School of Business and Economics. He is also a fellow of CORE in Louvain-la-Neuve (Belgium). In 2008 and 2009 he was a visiting professor at the London School of Economics (Finance Department). Sébastien has published in the Journal of Business and Economic Statistics, Journal of Applied Econometrics, Journal of Financial Econometrics, European Economic Review and Journal of Empirical Finance, among others.


Dr. Kris Boudt holds a Ph.D. (2008) in Applied Economics from K.U.Leuven, Belgium. He is an affiliated researcher at K.U.Leuven and, since 2008, assistant professor of Finance at Lessius University College. His main research interests are empirical finance, risk management and robust statistics. His research has been published in Computational Statistics and Data Analysis, Metrika, the Journal of Risk and RISK magazine, among others.

Jerome Lahaye graduated from the HEC Management School of the University of Liège (Belgium), where he subsequently worked as a teaching and research assistant (2000-2003). He holds a master of advanced studies in economics from the University of Leuven (2004) and a Ph.D. (2009) in Economics from the Facultés Universitaires Notre-Dame de la Paix in Namur (Belgium). He is currently a post-doctoral researcher at HEC Lausanne. His research interests are at the crossroads of applied econometrics and international finance. In particular, his research has focused on the link between economic information disclosure and discontinuities in financial time series.

Jean-Philippe Peters was involved in the G@RCH project until version 4.2. He graduated from the HEC Management School of the University of Liège (Belgium) in 2000 with a M.Sc. in Management Sciences. His final dissertation, entitled “Development of a Package in the Ox Environment for GARCH Models and Two Empirical Applications”, was the founding ground of the current package. From 2000 to 2002, he worked as a researcher in the Operations Research and Production Management unit of the University of Liège. In October 2002 he joined the Advisory and Consulting department of Deloitte Luxembourg, where he is involved in consulting activities on risk management for the financial industry. Jean-Philippe also holds a Ph.D. (2010) in Economics from the HEC Management School of the University of Liège (Belgium).

Dr. Jeroen VK Rombouts holds a Ph.D. (2004) in Econometrics from the Catholic University of Louvain and has been, since 2004, assistant professor at HEC Montréal. He is an elected member of the International Statistical Institute, a researcher at CIRANO (Canada) and a research associate at CORE (Belgium). Jeroen has published in Econometric Theory, the Journal of Applied Econometrics, Econometric Reviews and The Econometrics Journal, among others.

Francesco Violante graduated from Bocconi University, Milan (Italy), where he worked in 2004 as a teaching and research assistant at the Centre of Research on Innovation and Internationalization (CESPRI). He holds a master of arts in economics from the Université catholique de Louvain (Belgium) and has been, since 2006, a Ph.D. candidate of the Louvain Academy at the University of Namur. His research interests are applied econometrics and finance, in particular realized volatility and forecasting, multivariate GARCH models, and volatility model comparison and selection. In 2007, he was a visiting scholar at the HEC Management School of the University of Montreal (Canada).

Chapter 1

This book documents G@RCH 6.1, an OxMetrics module (see Doornik, 2009) dedicated to the estimation and forecasting of univariate and multivariate ARCH-type models (Engle, 1982), including some of the most recent contributions in this field. Univariate GARCH (Bollerslev, 1986), EGARCH (Nelson, 1991), GJR (Glosten, Jagannathan, and Runkle, 1993), APARCH (Ding, Granger, and Engle, 1993), IGARCH (Engle and Bollerslev, 1986), RiskMetrics (J.P.Morgan, 1996), FIGARCH (Baillie, Bollerslev, and Mikkelsen, 1996a and Chung, 1999), FIEGARCH (Bollerslev and Mikkelsen, 1996), FIAPARCH (Tse, 1998) and HYGARCH (Davidson, 2001) specifications are available for the conditional variance. Moreover, an AR(FI)MA process can be specified in the conditional mean (see Baillie, Chung, and Tieslau, 1996b, Tschernig, 1995, Teyssière, 1997, or Lecourt, 2000, for further details about ARFIMA models). All these models can be adapted to add an “ARCH-in-mean” term in the conditional mean as suggested by Engle, Lilien, and Robbins (1987).

G@RCH 6.1 allows the estimation of some of the most widespread multivariate GARCH models, including the scalar BEKK, diagonal BEKK (Engle and Kroner, 1995), RiskMetrics (J.P.Morgan, 1996), CCC (Bollerslev, 1990), DCC (Engle, 2002 and Tse and Tsui, 2002), DECO (Engle and Kelly, 2007), OGARCH (Kariya, 1988 and Alexander and Chibumba, 1997) and GOGARCH models (van der Weide, 2002 and Boswijk and van der Weide, 2006).

G@RCH 6.1 also allows the estimation of univariate and multivariate non-parametric estimators of the quadratic variation and the integrated volatility. These estimators include the realized volatility, bi-power variation and realized outlyingness weighted variance. Several daily and intradaily tests for jumps, as well as intraday periodicity filters, are also provided.

This software has been developed with Ox 6.0, a matrix programming language described in Doornik (2007b).1 G@RCH 6.1 should be compatible with many platforms, including Windows, Mac OS X, Linux, Unix and Solaris, when used in combination with the Ox Console. Furthermore, G@RCH 6.1 provides a menu-driven graphical interface for Microsoft Windows, Mac OS X and Linux users, as well as comprehensive HTML documentation. For most specifications it is very fast, and one of its main characteristics is its ease of use.

This book is structured as follows:

1.1 G@RCH

1.1.1 Definition

G@RCH 6.1 is a software package dedicated to the estimation and forecasting of univariate and multivariate (G)ARCH models and many of their extensions. It can be used within OxMetrics or programmatically (using OxEdit, for instance) by those who have access to the Ox programming language.

The available univariate models are all ARCH-type models. These include ARCH, GARCH, EGARCH, GJR, APARCH, IGARCH, RiskMetrics, FIGARCH, FIEGARCH, FIAPARCH and HYGARCH. They can be estimated by approximate (Quasi-) Maximum Likelihood under one of the four proposed distributions for the errors (Normal, Student-t, GED or skewed-Student). Moreover, ARCH-in-mean models are also available, and explanatory variables can enter the conditional mean and/or the conditional variance equations.

G@RCH 6.1 also proposes some multivariate GARCH specifications, including the scalar BEKK, diagonal BEKK, full BEKK, RiskMetrics, CCC, DCC, DECO, OGARCH and GOGARCH models.

Finally, h-step-ahead forecasts of both equations are available, as well as many univariate and multivariate misspecification tests (Nyblom, sign bias tests, Pearson goodness-of-fit, Box-Pierce, residual-based diagnostic for conditional heteroscedasticity, Hosking’s portmanteau test, Li and McLeod test, constant correlation test, etc.).

1.1.2 Program

This documentation refers to version 6.1 of G@RCH, which can be used in three different ways. The first is through the menu-driven environment of OxMetrics, similarly to PcGive, PcGets or STAMP.

Second, one can estimate the models via the “Batch Editor” of OxMetrics.

Finally, Ox users can write Ox code based on the Garch, MGarch and Realized classes. This solution is particularly interesting when non-standard operations are considered (such as running a whole catalogue of estimations). A set of examples is provided with the G@RCH package. Please note, however, that this solution requires a registered version of Ox Professional or its console version. Check the Ox web site for details.

The first method is illustrated in Chapters 3 and 4 while the next two are presented in Chapter 5.

1.1.3 What’s new in G@RCH 6.1 ?

1.1.4 What’s new in G@RCH 6.0 ?

#include <oxstd.h>
#include <oxdraw.h>
#import <packages/Garch6/garch>

//--- Ox code for RE@LIZED
main()
{
    decl model = new Realized();
    model.Select(Y_VAR, {"Ret_1", 0, 0});    // first intraday return series
    model.Select(Y_VAR, {"Ret_288", 0, 0});  // last (288th) intraday return series
    model.SetSelSampleByDates(dayofcalendar(1987, 1, 5), dayofcalendar(1998, 7, 3));
    delete model;
}

1.1.5 What’s new in G@RCH 5.1 ?

1.1.6 What’s new in G@RCH 5.0 ?

This manual has been revised to account for the following new features:

1.2 General Information

1.2.1 Queries about G@RCH

Suggestions, mistakes, typos and possible improvements can be reported to the first author via e-mail:

The most appropriate place to discuss problems and issues related to the G@RCH package is the ox-users discussion list, an email-based forum to discuss any problems related to Ox programming and share code and programming solutions. Consult the online help of OxMetrics for information on joining the list.

1.2.2 Availability and Citation

The full independent version of G@RCH is exclusively distributed by Timberlake Consultants.
Contact information for the UK:
Phone: +44 (0)20 8697 3377 or

For the US:
Phone: +1 908 686 1251 or
For other countries, please go to the Timberlake web sites for contact information.

There are different versions of G@RCH. A console version of G@RCH is available on the G@RCH web site.

This (and only this) version of G@RCH may be used freely for academic research and teaching purposes only. The Professional version (i.e. using OxRun, OxDebug, OxMetrics or oxli) is not free.

Commercial users and others who do not qualify for the free version must purchase the Professional version of G@RCH with documentation, regardless of which version they use (i.e., even with Linux or Unix). Redistribution of G@RCH in any form is not permitted.

Moreover, for easier validation and replication of empirical findings, please cite this documentation (or Laurent and Peters, 2002) in all reports and publications involving the use of G@RCH. Failure to cite the use of G@RCH in published work may result in loss of the right to use the free version, and an invoice at the full commercial price.

All company and product names referred to in this documentation are either trademarks or registered trademarks of their associated companies.

1.2.3 World Wide Web

Check the G@RCH web site for information on releases, online help and other relevant material on G@RCH.

1.3 Download G@RCH console 6.1

To install the Professional version of G@RCH 6.1, run the installation file and follow the instructions.

To install the Console version of G@RCH, unzip the file (available below) into the ox/packages directory. Note that the free console version of G@RCH 6.1 only includes the Garch class. The MGarch and Realized classes are only available in the Professional version of G@RCH.

Download G@RCH Console 6.1
Download Ox Console

For OxMetrics enterprise users, note that the Console version of G@RCH 6.1 comes along with OxMetrics enterprise and is directly installed in the ox/packages directory.


Chapter 2
Getting Started

We now discuss some of the basic skills required to get started with G@RCH. We will give a brief introduction to the functionalities provided by OxMetrics. The aim of this section is to explain how to load data and make graphs: for instructions on how to transform data using the calculator or algebra, consult the OxMetrics manuals.

We assume that you have the basic skills to operate programs under the Windows operating system (the OxMetrics books provide some hints). G@RCH runs on several versions of Windows, i.e. Windows Vista/XP/NT/2000, but also on Mac OS X and Linux.1 The screen appearance of the program and dialogs will reflect the operating system you are using. The screen captures in this manual reflect the Windows XP operating system but should not differ much on the other systems.

2.1 Starting G@RCH

When launching OxMetrics, all the available modules (PcGive, G@RCH and Stamp) are automatically loaded and appear in the Modules group in the workspace window on the left-hand side of OxMetrics. This process is displayed in the following capture.

2.2 Loading and Viewing the Tutorial Data Set

G@RCH cannot operate without data. Once OxMetrics is activated, the very first step is to load data. Let us illustrate this with an example, loading nasdaq.in7. These are daily returns of the Nasdaq stock index. The covered period is 10/11/1984 - 12/21/2000, which represents 4093 daily observations. Daily returns (in %) are defined as

yt = 100 [log(pt)- log(pt-1)],

where pt is the price series at time t.
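The return transformation above is easy to replicate outside OxMetrics. Here is a minimal sketch in Python (illustrative only, with made-up prices; the package itself is written in Ox):

```python
import math

def log_returns(prices):
    """Daily returns (in %): y_t = 100 * (log(p_t) - log(p_{t-1}))."""
    return [100.0 * (math.log(p1) - math.log(p0))
            for p0, p1 in zip(prices, prices[1:])]

# Hypothetical price series for illustration only.
prices = [100.0, 101.0, 99.5]
rets = log_returns(prices)  # one fewer observation than prices
```

Applied to the 4094 Nasdaq closing prices, this yields the 4093 daily returns mentioned above.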


The IN7 extension indicates an OxMetrics file. The IN7 file is a human-readable file, describing the data. There is a companion BN7 file, which holds the actual numbers (in binary format, so this file cannot be edited). OxMetrics can handle a wide range of data files, among them Excel and Lotus files and, of course, plain human-readable (ASCII) files. You can also cut and paste data between Excel and OxMetrics. More details are available in OxMetrics handbooks.

To load this dataset, click File then Open in the main menu of OxMetrics, then select the data file. The default datasets provided by G@RCH are stored in the subdirectory \G@RCH\data of OxMetrics’s main directory. Locate that directory and select nasdaq.in7.




Double clicking on the variable name shows the documentation of the variable.

The data can be manipulated, much like in a spreadsheet program. Here we shall not need these facilities, so minimize the database window again: click on the first button in the right-hand side of the window. Do not click on the cross: this closes the database, thus removing it from OxMetrics and hence from G@RCH.

2.3 OxMetrics Graphics

The graphics facilities of OxMetrics are powerful yet easy to use. This section will show you how to make time-plots of variables in the database. OxMetrics offers automatic selections of scaling etc., but you will be able to edit these graphs, and change the default layout such as line colors and line types. Graphs can also be saved in a variety of formats for later use in a word processor or for reloading to OxMetrics.

2.3.1 A First Graph

Graphics is the first entry on the Model menu. Activate the command to see the following dialog box (or click on the cross-plot graphics icon on the toolbar).


In this example we select Nasdaq and then press the << button to move the database variables to the selection listbox, as shown above. Then press the button labelled Actual series. Figure 2.1 should appear on the screen. OxMetrics can also draw multiple graphs simultaneously on-screen. For other types of graphs, select All plot types to open the related menu, as seen hereunder (additional details on graphics are available in Doornik, 2009).

2.3.2 Graph Saving and Printing

To print a graph directly to the printer, click on the printer icon in OxMetrics. You can preview the result first using the Print Preview command from the File menu.

Graphs can be saved in various formats. See Doornik (2009) for more details.

2.3.3 Including Graphs in LATEX Documents

Saving graphs in the .EPS format is useful for including them in a LATEX document. The example given below is the LATEX code used to include Figure 2.1 in the current document. Note that the \includegraphics command requires the graphicx package, which can be loaded with the \usepackage{graphicx} command in the preamble of the LATEX document.



Figure 2.1: Daily returns (in %) of the NASDAQ

Chapter 3
Introduction to the Univariate ARCH Model

In this chapter, our attention is first devoted to reviewing the specifications of the conditional mean equation. Then, the simplest univariate ARCH specification is presented, together with estimation and testing issues in G@RCH. More recent developments in ARCH-type models are discussed in the next chapter.

3.1 Visual Inspection

Panel 1 of Figure 3.1 indicates that the NASDAQ exhibits volatility clustering, as periods of low volatility mingle with periods of high volatility. This is a clear sign of the presence of ARCH effects in the series.

Panel 2 also shows that the unconditional distribution of the NASDAQ returns (full line) is not normal. Indeed, it is more peaked than the normal density (dashed line) and has fatter tails, as indicated by panels 3 and 4 (this visual indication is validated by the computation of the kurtosis, which is equal to 9.34). While not obvious on a visual basis, the skewness coefficient (-0.627) indicates negatively skewed behavior.

Recall that the skewness coefficient (SK) equals 0 for a symmetric distribution while the kurtosis coefficient (KU) equals 3 for the normal distribution. The excess kurtosis equals KU - 3. Formally these coefficients are expressed as

SK = E[(y − μ)^3] / σ^3   and   KU = E[(y − μ)^4] / σ^4.
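The sample analogues of the skewness and kurtosis coefficients are straightforward to compute; applied to the NASDAQ returns, this computation gives the values -0.627 and 9.34 quoted above. An illustrative sketch in Python (the package itself is written in Ox):

```python
def skewness_kurtosis(y):
    """Sample analogues of SK = E[(y - mu)^3]/sigma^3 and KU = E[(y - mu)^4]/sigma^4."""
    n = len(y)
    mu = sum(y) / n
    m2 = sum((x - mu) ** 2 for x in y) / n
    m3 = sum((x - mu) ** 3 for x in y) / n
    m4 = sum((x - mu) ** 4 for x in y) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2

# A symmetric toy sample: SK = 0, while KU need not equal 3.
sk, ku = skewness_kurtosis([-2.0, -1.0, 0.0, 1.0, 2.0])
```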


Figure 3.1: Daily Returns of the NASDAQ and Unconditional Density Estimation using OxMetrics

3.2 Preliminary Graphics

When the error term is not independent of previous errors, it is said to be autocorrelated. Autocorrelation of order h is computed as follows

r_h = Cov(y_t, y_{t−h}) / (σ_{y_t} σ_{y_{t−h}})

Plotting the autocorrelation function against the lag gives a first visual impression of the magnitude of the autocorrelation problem. This plot is called “autocorrelogram” and it provides information about simple correlation coefficients. When the errors are strongly time dependent, the autocorrelations tend to be fairly large even for large values of h. The next screen shot shows how to obtain an autocorrelogram in OxMetrics.

The obtained graphic is displayed in Figure 3.2. It suggests that the daily return series of the NASDAQ is a short memory process (in the level) and an AR(1) term might be needed in the conditional mean equation.

We have previously seen from the visual inspection that the NASDAQ exhibits volatility clustering as periods of low volatility mingle with periods of high volatility. In addition to this inspection, one can plot the autocorrelogram of the squared (or absolute) returns to highlight the presence of ARCH effects in the data. This can be done using the calculator of OxMetrics. We can compute squared returns of the NASDAQ, as shown in the screen shot below.
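The sample autocorrelations behind these correlograms can also be computed directly; a small illustrative sketch in Python (G@RCH itself is written in Ox). Passing the squared series instead of the raw one gives the ACF used to detect ARCH effects:

```python
def acf(y, max_lag):
    """Sample autocorrelations r_h = cov(y_t, y_{t-h}) / var(y_t), h = 1..max_lag."""
    n = len(y)
    mu = sum(y) / n
    d = [x - mu for x in y]
    g0 = sum(v * v for v in d) / n                  # lag-0 autocovariance (variance)
    return [sum(d[t] * d[t - h] for t in range(h, n)) / n / g0
            for h in range(1, max_lag + 1)]

# Toy alternating series: strong negative first-order autocorrelation.
r = acf([1.0, -1.0, 1.0, -1.0], 2)
```

For squared returns one would call `acf([x * x for x in returns], 100)` to mimic the 100-lag correlogram discussed below.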



Figure 3.2: Autocorrelogram on daily returns (in %) of the NASDAQ



Figure 3.3 shows the autocorrelogram (with 100 lags) of the squared returns. It suggests that squared returns are strongly autocorrelated, exhibiting volatility clustering.


Figure 3.3: Autocorrelogram on daily squared returns (in %) of the NASDAQ

3.3 Preliminary Tests

G@RCH 6.1 provides several tests on the raw data (in addition to the possibilities offered by OxMetrics: ACF, PACF, QQ-plots,…):

To launch the descriptive statistics, click on G@RCH in the Modules group in the workspace window on the left-hand side of OxMetrics. Then change the Category to Other models and Model Class to Descriptive Statistics using G@RCH and click on Formulate.


A new dialog box Formulate is launched and the user is asked to select the series to be tested. Selected series are labelled ‘T’.


Next, the user selects the tests to be run. Lag values of the ARCH LM and Box-Pierce tests can be changed by the user. The subsequent window displays options related to the sample period. As can be seen, if the date variable is well formatted in the database, sample selection is made easier by allowing the user to pick a starting date and an ending date. See Chapter 9 of Doornik (2007a) for details on OxMetrics file formats.

Once the OK button of the Estimate - Descriptive Statistics dialog box is pressed, the tests are launched and the results are printed out in OxMetrics. Note that if a unit root test has been selected, an additional dialog box is launched and the user is asked to select some options related to the test (whether to include a constant or a trend, for instance, and the lag length for differences or the bandwidth). The results should be similar to the output of “Box 1 - Descriptive Statistics”. They confirm that NASDAQ returns exhibit ARCH-type effects (see the ARCH LM tests and the Q-statistics on squared data). Looking at the Q-statistics on raw data, one concludes that an ARMA-type model seems justified. Finally, the Jarque-Bera statistic clearly rejects the normality assumption for the unconditional distribution of the NASDAQ, while the KPSS statistic indicates that the series is likely to be I(0).



                                Box 1 - Descriptive Statistics for the NASDAQ  
---- Database information ----  
Sample:    1984-10-12 - 2000-12-21 (4093 observations)  
Frequency: 1  
Variables: 4  
Variable        #obs  #miss    type          min         mean          max      std.dev  
Date            4093      0    date       1984-10-12            2000-12-21  
Nasdaq          4093      0  double      -12.043     0.055166       9.9636       1.2617  
Constant        4093      0  double            1            1            1            0  
Trend           4093      0  double            1         2047         4093       1181.5  
Series #1/1: Nasdaq  
Normality Test  
                   Statistic       t-Test      P-Value  
Skewness            -0.74128       19.368  1.4336e-083  
Excess Kurtosis       11.255       147.07      0.00000  
Jarque-Bera           21979.         .NaN      0.00000  
ARCH 1-2 test:    F(2,4088)=   420.80 [0.0000]**  
ARCH 1-5 test:    F(5,4082)=   228.90 [0.0000]**  
ARCH 1-10 test:   F(10,4072)=   118.16 [0.0000]**  

Q-Statistics on Raw data  
  Q(  5) =  41.8697   [0.0000001]  
  Q( 10) =  50.9695   [0.0000002]  
  Q( 20) =  83.6251   [0.0000000]  
  Q( 50) =  167.368   [0.0000000]  
H0 : No serial correlation ==> Accept H0 when prob. is High [Q < Chisq(lag)]  
Q-Statistics on Squared data  
  Q(  5) =  1988.77   [0.0000000]  
  Q( 10) =  2874.40   [0.0000000]  
  Q( 20) =  3748.75   [0.0000000]  
  Q( 50) =  5491.27   [0.0000000]  
H0 : No serial correlation ==> Accept H0 when prob. is High [Q < Chisq(lag)]  
KPSS Test without trend and 2 lags  
H0: Nasdaq is I(0)  
KPSS Statistics: 0.0743478  
Asymptotic critical values of Kwiatkowski et al. (1992), JoE, 54,1, p. 159-178  
    1%    5%   10%  
 0.739 0.463 0.347  
----  Log Periodogram Regression ----  
d parameter         0.0691465 (0.015793) [0.0000]  
No of observations: 4093; no of periodogram points: 2046

3.4 Conditional Mean Specification

Let us consider a univariate time series yt. If Ωt-1 is the information set at time t- 1, we can define its functional form as:

yt = E(yt|Ωt-1)+ εt,

where E(·|·) denotes the conditional expectation operator and εt is the disturbance term (or unpredictable part), with E(εt) = 0 and E(εt εs) = 0 for all t ≠ s.

Equation (3.6) is the conditional mean equation which has been studied and modelled in many ways. Two of the most famous specifications are the Autoregressive (AR) and Moving Average (MA) models. Combining these two processes and introducing n1 explanatory variables in the equation, we obtain the ARMAX(n,s) process

Ψ(L)(yt − μt) = Θ(L)εt,
μt = μ + ∑_{i=1}^{n1} δi xi,t,

where L is the lag operator,1 Ψ(L) = 1 − ∑_{i=1}^{n} ψi L^i and Θ(L) = 1 + ∑_{j=1}^{s} θj L^j. The ARMA residuals are obtained by applying the arma0 function of Ox to the demeaned series (yt − μt). See the Ox manual for more details about arma0.

Several studies have shown that the dependent variables (interest rate returns, exchange rate returns, etc.) may exhibit significant autocorrelation between observations widely separated in time. In such a case, yt is said to display long memory (or long-term dependence) and is best modelled by a fractionally integrated ARMA process. This ARFIMA process was initially developed in Granger (1980) and Granger and Joyeux (1980) among others.2 The ARFIMA(n,ζ,s) is given by:

Ψ(L)(1 − L)^ζ (yt − μt) = Θ(L)εt,

where the operator (1 - L)ζ accounts for the long memory of the process and is defined as:

(1 − L)^ζ = ∑_{k=0}^{∞} [Γ(ζ + 1) / (Γ(k + 1)Γ(ζ − k + 1))] (−L)^k
          = 1 − ζL − (1/2)ζ(1 − ζ)L^2 − (1/6)ζ(1 − ζ)(2 − ζ)L^3 − …
          = 1 − ∑_{k=1}^{∞} ck(ζ) L^k,                              (3.9)

with 0 < ζ < 1, c1(ζ) = ζ, c2(ζ) = (1/2)ζ(1 − ζ), and Γ(·) denoting the Gamma function (see Baillie, 1996, for a survey on this topic). The truncation order of the infinite summation is set to t − 1. Note that the ARFIMA residuals are obtained by applying the diffpow function to the ARMA residuals. See the Ox manual for more details about diffpow.
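The coefficients ck(ζ) obey the simple recursion ck = ck-1 (k − 1 − ζ)/k with c1 = ζ, which makes the truncated filter easy to implement. A sketch in Python, for illustration only (the package relies on Ox's diffpow for this step):

```python
def fracdiff_coeffs(zeta, m):
    """Coefficients c_k(zeta), k = 1..m, in (1 - L)^zeta = 1 - sum_k c_k(zeta) L^k,
    via the recursion c_k = c_{k-1} * (k - 1 - zeta) / k with c_1 = zeta."""
    c = [zeta]
    for k in range(2, m + 1):
        c.append(c[-1] * (k - 1 - zeta) / k)
    return c

def fracdiff(y, zeta):
    """Apply (1 - L)^zeta to a series, truncating the expansion at t - 1 terms."""
    c = fracdiff_coeffs(zeta, max(len(y) - 1, 1))
    return [y[t] - sum(c[k - 1] * y[t - k] for k in range(1, t + 1))
            for t in range(len(y))]

coeffs = fracdiff_coeffs(0.4, 3)         # [0.4, 0.12, 0.064]
diffs = fracdiff([1.0, 3.0, 6.0], 1.0)   # zeta = 1 gives plain first differences
```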

It is worth noting that Doornik and Ooms (1999) provided an Ox package for estimating, forecasting and simulating ARFIMA models. However, contrary to G@RCH, the conditional variance is assumed to be constant over time.3

A feature of G@RCH 6.1 is the availability of ARCH “in-mean” models. If we introduce the conditional variance (or the conditional standard deviation) as an additional explanatory variable in the conditional mean equation, we get the ARCH-in-Mean model of Engle, Lilien, and Robbins (1987), i.e.

μt = μ + ∑_{i=1}^{n1} δi xi,t + ϑ σt^k,

with k = 1 to include the conditional standard deviation and k = 2 for the conditional variance.

The ARCH-M model is often used in financial applications where the expected return on an asset is related to the expected asset risk. The estimated coefficient of the expected risk is a measure of the risk-return tradeoff.

3.5 Conditional Variance Specification: the ARCH Model

The εt term in Equations (3.6)-(3.8) is the innovation of the process. More than two decades ago, Engle (1982) defined an Autoregressive Conditional Heteroscedastic (ARCH) process as any εt of the form:

εt = ztσt,

where zt is an independently and identically distributed (i.i.d.) process with E(zt) = 0 and V ar(zt) = 1. By assumption, εt is serially uncorrelated with a mean equal to zero but its conditional variance equals σt2. Therefore, it may change over time, contrary to what is assumed in the standard regression model.

The models provided by our program are all ARCH-type.4 They differ on the functional form of σt2 but the basic principles are the same (see the next chapter for details). Besides the traditional ARCH and GARCH models, we focus mainly on two kinds of models: the asymmetric models and the fractionally integrated models. The former are defined to account for the so-called “leverage effect” observed in many stock returns, while the latter allows for long-memory in the variance. Early evidence of the “leverage effect” can be found in Black (1976), while persistence in volatility is a common finding of many empirical studies; see for instance the excellent surveys on ARCH models proposed in Bollerslev, Chou, and Kroner (1992), Bera and Higgins (1993) or Palm (1996).

The ARCH(q) model can be expressed as:

εt = zt σt,
zt ~ i.i.d. D(0, 1),
σt^2 = ω + ∑_{i=1}^{q} αi εt-i^2,                  (3.12)

where D(·) is a probability density function with mean 0 and unit variance (see Section 3.6.2).

The ARCH model can describe volatility clustering. The conditional variance of εt is indeed an increasing function of the square of the shock that occurred in t − 1. Consequently, if εt-1 was large in absolute value, σt^2 and thus εt are expected to be large (in absolute value) as well. Notice that even though the conditional variance of an ARCH model is time-varying, i.e. σt^2 = E(εt^2|ψt-1), the unconditional variance of εt is constant and, provided that ω > 0 and ∑_{i=1}^{q} αi < 1, we have:

σ^2 ≡ E[E(εt^2|ψt-1)] = ω / (1 − ∑_{i=1}^{q} αi).                (3.13)

If zt is normally distributed, E(zt^3) = 0 and E(zt^4) = 3. Consequently, E(εt^3) = 0 and the skewness of y is zero. The kurtosis coefficient of the ARCH(1) is 3(1 − α1^2)/(1 − 3α1^2), provided that α1 < √(1/3) ≈ 0.577. In this case, the unconditional distribution of the returns features fat tails whenever α1 > 0.

In most applications, the excess kurtosis implied by the ARCH model (coupled with a normal density) is not enough to mimic what we observe on real data. Other distributions are possible. For example, we could assume that zt follows a Student distribution with unit variance and υ degrees of freedom, i.e. zt is ST(0,1,υ). In that case, the unconditional kurtosis of the ARCH(1) is λ(1 − α1^2)/(1 − λα1^2), with λ = 3(υ − 2)/(υ − 4). Because of the additional coefficient υ, the ARCH(1) model based on the Student distribution does feature fatter tails than the corresponding model based on the Normal distribution. See Section 3.6.2 for more details.

The computation of σt^2 in Equation (3.12) depends on past squared residuals (εt-i^2), which are not observed for t = 0, −1, …, −q + 1. To initialize the process, the unobserved squared residuals are set to their sample mean.
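To make the recursion in Equation (3.12) concrete, here is an illustrative ARCH(q) simulator in Python (the package itself is written in Ox). As a simplification, it initializes the presample squared residuals with the unconditional variance implied by the parameters rather than a sample mean:

```python
import random

def simulate_arch(omega, alphas, n, seed=42):
    """Simulate eps_t = z_t * sigma_t with sigma_t^2 = omega + sum_i alpha_i * eps_{t-i}^2,
    z_t ~ i.i.d. N(0, 1)."""
    rng = random.Random(seed)
    q = len(alphas)
    uncond = omega / (1.0 - sum(alphas))  # unconditional variance, eq. (3.13)
    eps2 = [uncond] * q                   # presample squared residuals
    eps, sig2 = [], []
    for _ in range(n):
        # alpha_1 multiplies the most recent squared residual, hence reversed().
        s2 = omega + sum(a * e2 for a, e2 in zip(alphas, reversed(eps2[-q:])))
        e = rng.gauss(0.0, 1.0) * s2 ** 0.5
        sig2.append(s2)
        eps.append(e)
        eps2.append(e * e)
    return eps, sig2

eps, sig2 = simulate_arch(0.2, [0.3], 500)
```

With ω = 0.2 and α1 = 0.3, the sample variance of the simulated series should hover around the unconditional value 0.2/0.7.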

3.5.1 Explanatory Variables

Explanatory variables can be introduced in the conditional variance as follows:

ωt = ω + ∑_{i=1}^{n2} ωi xi,t

For the ease of presentation, we will use the notation ω for both ω and ωt in all subsequent equations.5

3.5.2 Positivity Constraints

σt^2 obviously has to be positive for all t. Sufficient conditions to ensure that the conditional variance in Equation (3.12) is positive are ω > 0 and αi ≥ 0 for all i.6 Note, however, that when explanatory variables enter the ARCH equation, these positivity constraints are no longer valid (but the conditional variance is still required to be non-negative).

3.5.3 Variance Targeting

A simple trick to reduce the number of parameters to be estimated is referred to as variance targeting by Engle and Mezrich (1996).

The conditional variance of the ARCH model (and most of its generalizations) may be expressed in terms of the unconditional variance and the other parameters. Doing so, one can reparametrize the model using the unconditional variance and replace it by a consistent estimator (before maximizing the likelihood).

Applying variance targeting to the ARCH model implies replacing ω by σ^2 (1 − ∑_{i=1}^{q} αi), where σ^2 is the unconditional variance of εt, which can be consistently estimated by its sample counterpart.

If explanatory variables appear in the ARCH equation, ω is then replaced by σ^2 (1 − ∑_{i=1}^{q} αi) − ∑_{i=1}^{n2} ωi x̄i, where x̄i is the sample average of variable xi,t (assuming the stationarity of the n2 explanatory variables). In other words, the explanatory variables are centered.

3.6 Estimation

3.6.1 G@RCH menus

Estimating (3.12) with G@RCH is intended to be very simple.

Note that in the example we will use a Monday dummy and a Friday dummy. These dummies are easily created with the Calculator (recall that the dataset is dated). First, use the function dayofweek() to create a variable DAY with DAY = 2 on Mondays, 3 on Tuesdays, etc. Then, create the Monday and Friday dummies as follows: Monday = DAY == 2 and Friday = DAY == 6.



To specify the model, first click on G@RCH in the Modules group in the workspace window on the left-hand side of OxMetrics. Next change the Category to Models for financial data and the Model class to Univariate GARCH models using G@RCH, then click on Formulate.

A list with all the variables of the database appears in the Database frame. To select variables that will enter your model, click on the variable name and then click on the << button. There are three possible statuses for each variable (see the list of statuses under the Selection frame): dependent variable (Y variable), regressor in the conditional mean (Mean), or regressor in the conditional variance (Variance). In the univariate module, only one Y variable per model is accepted. However one can include several regressors in the conditional mean and the conditional variance equations and the same variable can be a regressor in both equations. In the example a Monday dummy is included in the conditional mean and a Friday dummy in the conditional variance equation.

Once the OK button is pressed, the Model Settings box automatically appears. This box allows to select the specification of the model: AR(FI)MA orders for the mean equation, GARCH orders, type of GARCH model for the variance equation and the distribution. The default specification is an ARMA(0,0)-GARCH(1,1) with normal errors. In our application, we select an ARMA(1,0)-GARCH(0,1) specification or equivalently an ARMA(1,0)-ARCH(1).7




3.6.2 Distributions

Estimation of ARCH-type models is commonly done by maximum likelihood, so that one has to make an additional assumption about the innovation process z_t, i.e. choose a density function D(0,1) with mean 0 and unit variance.

Weiss (1986) and Bollerslev and Wooldridge (1992) show that under the normality assumption, the quasi-maximum likelihood (QML) estimator is consistent if the conditional mean and the conditional variance are correctly specified. This estimator is, however, inefficient with the degree of inefficiency increasing with the degree of departure from normality (Engle and González-Rivera, 1991).

As reported by Palm (1996), Pagan (1996) and Bollerslev, Chou, and Kroner (1992), the use of fat-tailed distributions is widespread in the literature. In particular, Bollerslev (1987), Hsieh (1989), Baillie and Bollerslev (1989) and Palm and Vlaar (1997), among others, show that these distributions better capture the high observed kurtosis.

As shown below, four distributions are available in G@RCH 6.1 : the usual Gaussian (normal) distribution, the Student-t distribution, the Generalized Error distribution (GED) and the skewed-Student distribution.


The logic of ML is to interpret the density as a function of the parameter set, conditional on a set of sample outcomes. This function is called the likelihood function. It is quite evident from Equation (3.12) (and all the following equations of Section 3) that the recursive evaluation of this function is conditional on unobserved values. The ML estimation is therefore not perfectly exact. To solve the problem of unobserved values, we have set these quantities to their unconditional expected values. For this reason we talk about approximate (or conditional) ML and not exact ML.

If we express the mean equation as in Equation (3.6) and εt = ztσt, the log-likelihood function of the standard normal distribution is given by:

L_norm = −(1/2) ∑_{t=1}^{T} [ log(2π) + log(σ_t²) + z_t² ],

where T is the number of observations.
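As a numerical check, the Gaussian log-likelihood above translates directly into Python (a didactic sketch; G@RCH itself is written in Ox):

```python
import math

def loglik_normal(z, sigma2):
    # Sum over t of the per-observation contributions of L_norm:
    # -0.5 * [log(2*pi) + log(sigma_t^2) + z_t^2]
    return -0.5 * sum(math.log(2 * math.pi) + math.log(s) + x * x
                      for x, s in zip(z, sigma2))
```

For a single observation with z = 0 and σ² = 1 this gives −0.5 log(2π), the maximum of the standard normal log-density.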

For a Student-t distribution, the log-likelihood is:

L_Stud = T [ log Γ((υ+1)/2) − log Γ(υ/2) − (1/2) log(π(υ−2)) ]
         − (1/2) ∑_{t=1}^{T} [ log(σ_t²) + (1+υ) log(1 + z_t²/(υ−2)) ],        (3.16)
where υ is the number of degrees of freedom (2 < υ ≤ ∞) and Γ(·) is the gamma function.
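The Student-t log-likelihood in (3.16) can be coded in the same way (a sketch; as υ → ∞ it converges to the Gaussian log-likelihood, which gives a convenient sanity check):

```python
import math

def loglik_student(z, sigma2, v):
    # Equation (3.16): the constant term times T, minus the per-observation sum.
    c = (math.lgamma((v + 1) / 2) - math.lgamma(v / 2)
         - 0.5 * math.log(math.pi * (v - 2)))
    body = sum(math.log(s) + (1 + v) * math.log(1 + x * x / (v - 2))
               for x, s in zip(z, sigma2))
    return len(z) * c - 0.5 * body
```

For a very large υ (say 10⁶) the value is numerically indistinguishable from the Gaussian log-likelihood of the same data.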

The GED log-likelihood function of a normalized random variable is given by:

L_GED = ∑_{t=1}^{T} [ log(υ/λ_υ) − 0.5 |z_t/λ_υ|^υ − (1 + υ⁻¹) log(2) − log Γ(1/υ) − 0.5 log(σ_t²) ],

where 0 < υ < ∞ and

λ_υ ≡ [ Γ(1/υ) 2^(−2/υ) / Γ(3/υ) ]^(1/2).
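A useful property for checking an implementation of the GED log-likelihood is that υ = 2 recovers the Gaussian case (λ_2 = 1). A Python sketch (illustrative, not G@RCH's code):

```python
import math

def loglik_ged(z, sigma2, v):
    # GED log-likelihood of standardized residuals z with variances sigma2.
    lam = math.sqrt(math.gamma(1 / v) * 2.0 ** (-2 / v) / math.gamma(3 / v))
    return sum(math.log(v / lam) - 0.5 * abs(x / lam) ** v
               - (1 + 1 / v) * math.log(2) - math.lgamma(1 / v)
               - 0.5 * math.log(s)
               for x, s in zip(z, sigma2))
```

Evaluated at υ = 2 this matches the Gaussian log-likelihood term by term, since λ_2 = 1 and the constants collapse to −0.5 log(2π).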

The main drawback of the last two densities is that even if they may account for fat tails, they are symmetric. Skewness and kurtosis are important in financial applications in many respects (in asset pricing models, portfolio selection, option pricing theory or Value-at-Risk among others). Quite recently, Lambert and Laurent (2000, 2001) applied and extended the skewed-Student density proposed by Fernández and Steel (1998) to the GARCH framework.

The log-likelihood of a standardized (zero mean and unit variance) skewed-Student is:

L_SkSt = T [ log Γ((υ+1)/2) − log Γ(υ/2) − 0.5 log(π(υ−2)) + log(2/(ξ + 1/ξ)) + log(s) ]
         − 0.5 ∑_{t=1}^{T} { log(σ_t²) + (1+υ) log[ 1 + ((s z_t + m)²/(υ−2)) ξ^(−2I_t) ] },        (3.18)

where

I_t = 1 if z_t ≥ −m/s,  and  I_t = −1 if z_t < −m/s,

ξ is the asymmetry parameter and υ the number of degrees of freedom of the distribution,

m = [ Γ((υ+1)/2) √(υ−2) / (√π Γ(υ/2)) ] (ξ − 1/ξ),

and

s = √( ξ² + 1/ξ² − 1 − m² ).

Note that G@RCH does not estimate ξ but log(ξ) to facilitate inference about the null hypothesis of symmetry (since the skewed-Student equals the symmetric Student distribution when ξ = 1 or log(ξ) = 0). The estimated value of log(ξ) is reported in the output under the label “Asymmetry”. See Lambert and Laurent (2001) and Bauwens and Laurent (2005) for more details.

The gradient vector and the Hessian matrix can be obtained numerically or by evaluating their analytical expressions. Given the large number of models and distributions available, G@RCH uses numerical techniques to approximate the derivatives of the log-likelihood function with respect to the parameter vector.
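The idea of numerical derivatives can be sketched with a central-difference approximation (illustrative; Ox's MaxBFGS applies its own step-size rules):

```python
def num_gradient(f, psi, h=1e-6):
    # Central differences: df/dpsi_i ~ [f(psi + h*e_i) - f(psi - h*e_i)] / (2h)
    grad = []
    for i in range(len(psi)):
        up, dn = list(psi), list(psi)
        up[i] += h
        dn[i] -= h
        grad.append((f(up) - f(dn)) / (2 * h))
    return grad
```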

The next window is Starting Values. The user is asked to make a choice regarding the starting values of the parameters to be estimated: (1) let the program choose the starting values, (2) enter them manually, element by element (the one selected here), or (3) enter the starting values in vector form (one value per row) and possibly fix some parameters at their starting value.


The first method is obviously the easiest to use and is recommended for inexperienced users since it prevents them from entering aberrant values. The last two methods are useful when the user wants specific starting values for the estimation. The third method is particularly useful when one wants to fix some parameters.

Next, the Estimate window proposes options on two important characteristics of the model: the sample size and the estimation method.


When the variable corresponding to the date is correctly formatted, the sample can conveniently be fixed based on starting and ending dates (see Chapter 9 of Doornik, 2007, for details). The number of forecasts can also be subtracted when out-of-sample forecasting is to be performed.

The models are estimated using a maximum likelihood (ML) approach. Basically, three methods are proposed to estimate these models (see the option Optimization Methods in the Model Settings window).

  1. By default, a standard ML approach is selected. This method uses the quasi-Newton method of Broyden, Fletcher, Goldfarb and Shanno (BFGS). This function is the well-known MaxBFGS function provided by Ox. See the Ox documentation for details.
  2. A constrained optimization technique that uses the MaxSQPF algorithm. MaxSQPF implements a sequential quadratic programming technique to maximize a non-linear function subject to non-linear constraints, similar to Algorithm 18.7 in Nocedal and Wright (1999). MaxSQPF is particularly useful to impose stationarity and/or positivity constraints such as α1 ≥ 0 in the ARCH(1) model (see Sections 3.9 and 5.3.2).
  3. And finally, a simulated annealing algorithm for optimizing non-smooth functions with possible multiple local maxima, ported to Ox by Charles Bos and called MaxSA. See for more details.

G@RCH provides three methods to compute the covariance matrix of the estimates: Second Derivatives (based on the inverse of the Hessian), Outer-Product of Gradients and Robust standard errors, also known as Quasi-Maximum Likelihood (QML) or ‘Sandwich formula’. This choice is accessible by clicking on the ‘Options...’ button. By default, Robust standard errors are reported.



Note that a new feature of G@RCH 5.0 is that the default settings are updated after each estimation (the default model is thus the previously estimated one). To disable this option, disable ‘Update Default Settings’. Option ‘Reset default Settings’ is self-explanatory.

Once all the options have been selected, the estimation procedure is launched if the default starting values are used. Otherwise, a new dialog box appears. Let us assume that the second method has been selected for the starting values, i.e. ‘Select (Individual Form)’. This new window contains an exhaustive list of the parameters used in the different models. Depending on the specification, some parameters have a value and others do not.

The user should replace only the former, since they correspond to the parameters to be estimated for the specified model. Note that it is crucial to respect the format of these initial values. More specifically, when two values are associated with a single parameter (say, both ARCH coefficients in an ARCH(2) model), the mandatory format is “value1 ; value2”. Here is an example:



If the third method is selected for the starting values, i.e. ‘Select (Matrix Form)’, the corresponding window contains a column with the default starting values of the parameter vector (column ‘Value’) and a column ‘FIX’ that allows the user to fix some parameters at their starting value.


Note that estimating the same model with different starting values is a useful check of the robustness of the maximum likelihood procedure.

Once this step is completed, the program starts the iteration process. Depending on the options selected8 , G@RCH prints intermediate iteration results or not. The final output is divided into two main parts: first, the model specification reminder; second, the estimated values of the parameters and other useful statistics.9 The outputs “Box 2 - Output of the ARMA(1,0)-GARCH(0,1)” and
“Box 3 - Output of the ARMA(1,0)-GARCH(0,1) by QMLE” correspond respectively to the estimation results of the ARMA(1,0)-GARCH(0,1) model by ML and QMLE.

                                   Box 2 - Output of the ARMA(1,0)-GARCH(0,1)  
Dependent variable : Nasdaq  
Mean Equation : ARMA (1, 0) model.  
1 regressor(s) in the mean.  
Variance Equation : GARCH (0, 1) model.  
1 regressor(s) in the variance.  
The distribution is a Gauss distribution.  
Strong convergence using numerical derivatives  
Log-likelihood = -6106.36  
Please wait : Computing the Std Errors ...  
 Maximum Likelihood Estimation (Std.Errors based on Second derivatives)  
                  Coefficient  Std.Error  t-value  t-prob  
Cst(M)               0.188314   0.016176    11.64  0.0000  
Monday (M)          -0.143636   0.033629   -4.271  0.0000  
AR(1)               -0.021047   0.017378   -1.211  0.2259  
Cst(V)               0.737026   0.028148    26.18  0.0000  
Friday (V)          -0.289467   0.040311   -7.181  0.0000  
ARCH(Alpha1)         0.746336   0.046922    15.91  0.0000  
No. Observations :      4093  No. Parameters  :         6  
Mean (Y)         :   0.05517  Variance (Y)    :   1.59189  
Skewness (Y)     :  -0.74128  Kurtosis (Y)    :  14.25531  
Log Likelihood   : -6106.357  Alpha[1]+Beta[1]:   0.74614  
The sample mean of squared residuals was used to start recursion.  
Positivity & stationarity constraints are not computed because  
there are explanatory variables in the conditional variance equation.  
Estimated Parameters Vector :  
 0.188314;-0.143636;-0.021047; 0.737026;-0.289467; 0.746341  
Elapsed Time : 1.542 seconds (or 0.0257 minutes).

Parameters labelled ‘(M)’ relate to the conditional mean while those labelled ‘(V)’ relate to the conditional variance equation. AR(1) and ARCH(Alpha1) correspond to ψ1 and α1, respectively.

Surprisingly, the AR(1) coefficient is not significantly different from 0 (we will come back to this issue later), while it was expected to be significantly negative. Interestingly, returns and volatility are, on average, found to be lower on Mondays and Fridays, respectively. Furthermore, the ARCH coefficient α1 is highly significant (rejecting the null of no ARCH effects) but incompatible with the existence of a finite kurtosis, since it exceeds 1/√3 ≈ 0.577. The log-likelihood value is -6106.357.

                           Box 3 - Output of the ARMA(1,0)-GARCH(0,1) by QMLE  
Dependent variable : Nasdaq  
Mean Equation : ARMA (1, 0) model.  
1 regressor(s) in the mean.  
Variance Equation : GARCH (0, 1) model.  
1 regressor(s) in the variance.  
The distribution is a Gauss distribution.  
Strong convergence using numerical derivatives  
Log-likelihood = -6106.36  
Please wait : Computing the Std Errors ...  
 Robust Standard Errors (Sandwich formula)  
                  Coefficient  Std.Error  t-value  t-prob  
Cst(M)               0.188314   0.025747    7.314  0.0000  
Monday (M)          -0.143636   0.049195   -2.920  0.0035  
AR(1)               -0.021047   0.055644  -0.3782  0.7053  
Cst(V)               0.737026   0.062300    11.83  0.0000  
Friday (V)          -0.289467   0.081034   -3.572  0.0004  
ARCH(Alpha1)         0.746336    0.10484    7.119  0.0000  
No. Observations :      4093  No. Parameters  :         6  
Mean (Y)         :   0.05517  Variance (Y)    :   1.59189  
Skewness (Y)     :  -0.74128  Kurtosis (Y)    :  14.25531  
Log Likelihood   : -6106.357  Alpha[1]+Beta[1]:   0.74634  
The sample mean of squared residuals was used to start recursion.  
Positivity & stationarity constraints are not be computed because  
there are explanatory variables in the conditional variance equation.  
Estimated Parameters Vector :  
 0.188314;-0.143636;-0.021047; 0.737026;-0.289467; 0.746341  
Elapsed Time : 1.522 seconds (or 0.0253667 minutes).

Ex-post, it is desirable to test the adequacy of the ARCH model. New options are thus available after the estimation of the model when clicking on the Test... button of the main G@RCH box: Tests, Graphic Analysis, Forecasts, Exclusion Restrictions, Linear Restrictions and Store.



3.7 Graphics

The Graphic Analysis... option allows the user to plot various graphics.


Figure 3.4 plots the conditional variance (σ̂_t²) as well as the histogram of the standardized residuals (ẑ_t = ε̂_t/σ̂_t) obtained with the AR(1)-ARCH(1) model, together with a kernel estimate of their unconditional distribution (solid line) and the N(0,1) density (dotted line).


Figure 3.4: Conditional variance of the NASDAQ and estimated unconditional density of the standardized residuals

Just as any other graph in the OxMetrics environment, all graphs plotted from G@RCH can easily be edited (color, size, …) and exported in many formats (.eps, .ps, .wmf, .emf and .gwg).

3.8 Misspecification Tests

The Tests... option allows the user to run different tests but also to print the variance-covariance matrix of the estimated parameters.


Once again, in addition to the possibilities offered by OxMetrics (ACF, PACF, QQ-plots, …), several misspecification tests are provided in G@RCH:

In the example, 6 tests have been selected. The output is reported in
“Box 4 - Misspecification Tests”.

                                               Box 4 - Misspecification Tests  
Information Criterium (to be minimized)  
Akaike          2.986737  Shibata         2.986733  
Schwarz         2.995997  Hannan-Quinn    2.990016  
                   Statistic       t-Test      P-Value  
Skewness            -0.28941       7.5616      0.00000  
Excess Kurtosis       5.9872       78.235      0.00000  
Jarque-Bera           6170.5         .NaN      0.00000  
Q-Statistics on Standardized Residuals  
  --> P-values adjusted by 1 degree(s) of freedom  
  Q(  5) =  62.3925   [0.0000000]  
  Q( 10) =  68.3636   [0.0000000]  
  Q( 20) =  84.2257   [0.0000000]  
  Q( 50) =  125.350   [0.0000000]  
H0 : No serial correlation ==> Accept H0 when prob. is High [Q < Chisq(lag)]  
Q-Statistics on Squared Standardized Residuals  
  --> P-values adjusted by 1 degree(s) of freedom  
  Q(  5) =  323.697   [0.0000000]  
  Q( 10) =  628.454   [0.0000000]  
  Q( 20) =  1035.75   [0.0000000]  
  Q( 50) =  1695.16   [0.0000000]  
H0 : No serial correlation ==> Accept H0 when prob. is High [Q < Chisq(lag)]  
Adjusted Pearson Chi-square Goodness-of-fit test  
# Cells(g)  Statistic      P-Value(g-1)     P-Value(g-k-1)  
    40      526.9756         0.000000          0.000000  
    50      540.9238         0.000000          0.000000  
    60      565.9641         0.000000          0.000000  
Rem.: k = 6 = # estimated parameters  
Residual-Based Diagnostic for Conditional Heteroskedasticity of Tse (2001)  
  RBD( 2) =  104.753   [0.0000000]  
  RBD( 5) =  246.916   [0.0000000]  
  RBD(10) =  303.489   [0.0000000]  
P-values in brackets

Without going too deeply into the analysis of these results, we can briefly argue that the model does not capture the dynamics of the NASDAQ.

The Q-statistics on standardized and squared standardized residuals, the RBD test with various lag values, as well as the adjusted Pearson Chi-square goodness-of-fit test (with different cell numbers) all reject the null hypothesis of a correct specification. This result is not very surprising: early empirical evidence has shown that a high ARCH order must be selected to capture the dynamics of the conditional variance (thus involving the estimation of numerous parameters).

3.9 Parameter Constraints

When numerical optimization is used to maximize the log-likelihood function with respect to the vector of parameters Ψ, the inspected range of the parameter space is (-∞, ∞). The problem is that some parameters might have to be constrained to a smaller interval. For instance, it is convenient to constrain the α_i parameters of an ARCH(q) to be positive. Constraining parameters to lie between given lower and upper bounds is easily done by selecting the ‘BFGS-BOUNDS’ option in the Estimate - GARCH Models dialog box in G@RCH.

If the user fixes bounds on parameters, the program uses MaxSQPF instead of MaxBFGS (Broyden, Fletcher, Goldfarb and Shanno quasi-Newton method) to optimize the likelihood function. MaxSQPF enforces all iterates to be feasible, using the algorithm by Lawrence and Tits (2001). If a starting point is infeasible, MaxSQPF will try to minimize the squared constraint violations to find a feasible point. See Doornik (2007b) for more details on these two optimization functions.

To change the values of the bounds, select option ‘Select (Matrix Form)’ in the ‘Starting Value’ dialog box together with option ‘BFGS-BOUNDS’ in the Estimate - GARCH Models dialog box. In addition to columns ‘FIX’ and ‘Value’, two new columns appear with the default lower and upper bounds.

Furthermore, nonlinear constraints (like the stationarity constraint of the GARCH(1,1) model, α1 + β1 < 1) can be imposed during the estimation with the Console version. For more details, see Section 5.3.2.


3.10 Forecasts

The main purpose of building and estimating a model with financial data is probably to produce a forecast. With the Forecast... option, G@RCH 6.1 also provides forecasting tools: forecasts of both the conditional mean and the conditional variance are available as well as several forecast error measures.

The first parameter to specify is the horizon h of the h-step-ahead forecasts. The default value is 10. Three options are available to:

  1. print several forecast error measures;
  2. print the forecasts;
  3. and make a graph of the forecasts.

Finally, graphical options are available for the standard error bands (error bands, bars or fans).


3.10.1 Forecasting the Conditional Mean

Our first goal is to give the optimal h-step-ahead predictor of yt+h given the information we have up to time t.

For instance, in an AR(1) process such as

yt = μ + ψ1(yt-1 - μ) + εt,

the optimal14 h-step-ahead predictor of y_{t+h} (i.e. ŷ_{t+h|t}) is its conditional expectation at time t (given the estimated parameters μ̂ and ψ̂1):

ŷ_{t+h|t} = μ̂ + ψ̂1(ŷ_{t+h−1|t} − μ̂),

where ŷ_{t+i|t} = y_{t+i} for i ≤ 0.

For the AR(1), the optimal 1-step-ahead forecast equals μ̂ + ψ̂1(y_t − μ̂). For h > 1, the optimal forecast can be obtained recursively or directly as ŷ_{t+h|t} = μ̂ + ψ̂1^h (y_t − μ̂).
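The equivalence of the recursive and direct formulas is easy to verify numerically (a Python sketch with made-up parameter values):

```python
def ar1_forecast_recursive(y_t, mu, psi1, h):
    # Iterate y-hat_{t+i|t} = mu + psi1 * (y-hat_{t+i-1|t} - mu), starting from y_t.
    yhat = y_t
    for _ in range(h):
        yhat = mu + psi1 * (yhat - mu)
    return yhat

def ar1_forecast_direct(y_t, mu, psi1, h):
    # Closed form: mu + psi1**h * (y_t - mu).
    return mu + psi1 ** h * (y_t - mu)
```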

More generally, for the ARFIMA(n,ζ,s) model described in Equation (3.8), the optimal h-step-ahead predictor of y_{t+h} is:

ŷ_{t+h|t} = μ̂_{t+h|t} + ∑_{k=1}^{∞} ĉ_k (ŷ_{t+h−k} − μ̂_{t+h|t})
           + ∑_{i=1}^{n} ψ̂_i { ŷ_{t+h−i} − μ̂_{t+h|t} + ∑_{k=1}^{∞} ĉ_k (ŷ_{t+h−i−k} − μ̂_{t+h|t}) }
           + ∑_{j=1}^{s} θ̂_j (ŷ_{t+h−j} − ŷ_{t+h−j|t}).                         (3.22)

Recall that when exogenous variables enter the conditional mean equation, μ becomes μ_t = μ + ∑_{i=1}^{n1} δ_i x_{i,t} and consequently, provided that the information x_{i,t+h} is available at time t (which is the case for instance if x_{i,t} is a “day-of-the-week” dummy variable), μ̂_{t+h|t} is also available at time t. When there is no exogenous variable in the ARFIMA model and n = 1, s = 0 and ζ = 0 (c_k = 0), the forecast of the AR(1) process given in Equation (3.21) is recovered.

Note that for ARCH-in-mean models, μ̂_{t+h|t} = μ̂ + ∑_{i=1}^{n1} δ̂_i x_{i,t+h} + ϑ̂ σ̂^k_{t+h|t} (with k = 1 or 2).

3.10.2 Forecasting the Conditional Variance

Independently of the conditional mean, one can forecast the conditional variance. In the simple ARCH(q) case, the optimal h-step-ahead forecast of the conditional variance, i.e. σ̂²_{t+h|t}, is given by:

σ̂²_{t+h|t} = ω̂ + ∑_{i=1}^{q} α̂_i ε̂²_{t+h−i|t},                         (3.23)

where ε̂²_{t+i|t} = σ̂²_{t+i|t} for i > 0 while ε̂²_{t+i|t} = ε²_{t+i} for i ≤ 0. Equation (3.23) is usually computed recursively, even if a closed-form solution of σ̂²_{t+h|t} can be obtained by recursive substitution in Equation (3.23).
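The recursion in Equation (3.23) can be sketched as follows (Python, illustrative parameter values; unknown future squared shocks are replaced by their own variance forecasts):

```python
def arch_variance_forecast(eps2, omega, alphas, h):
    # eps2 holds the most recent squared residuals, eps2[-1] being at time t.
    # For i > 0 the unknown eps^2_{t+i} is replaced by its forecast sigma^2_{t+i|t}.
    q = len(alphas)
    hist = list(eps2)
    out = []
    for _ in range(h):
        # zip alpha_1, ..., alpha_q with the q most recent (squared) values.
        s2 = omega + sum(a * e for a, e in zip(alphas, hist[::-1][:q]))
        hist.append(s2)
        out.append(s2)
    return out
```

For an ARCH(1) with ω̂ = 0.2, α̂₁ = 0.5 and ε̂²_t = 1, the 1- and 2-step forecasts are 0.7 and 0.55.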

Leaving out the last 10 observations for the forecasting experiment (as shown in the Estimate Model window here below) and performing a 10-step-ahead forecast of the Nasdaq based on the AR(1)-ARCH(1) produces Figure 3.5.



Figure 3.5: 10-step-ahead forecasts of the Nasdaq

The forecast bands are ±2σ̂_{t+h|t}, which gives an approximate 95% confidence interval (note that the critical value 2 can be changed).

Furthermore, if we leave enough observations for the out-of-sample period, G@RCH can report some standard measures of forecasting performance derived from the forecast errors (see Box 5). Note that certain criteria are not computed for both the mean and the variance (because they are not relevant, e.g. the Percentage Correct Sign (PCS) for the variance). In such a case, G@RCH reports a .NaN.

                                         Box 5 - Forecast Evaluation Measures  
                                                      Mean    Variance  
Mean Squared Error(MSE)                              16.81       438.5  
Median Squared Error(MedSE)                          11.71        72.5  
Mean Error(ME)                                      -1.781       13.72  
Mean Absolute Error(MAE)                              3.58       14.45  
Root Mean Squared Error(RMSE)                          4.1       20.94  
Mean Absolute Percentage Error(MAPE)                  .NaN       4.602  
Adjusted Mean Absolute Percentage Error(AMAPE)        .NaN       0.674  
Percentage Correct Sign(PCS)                           0.3        .NaN  
Theil Inequality Coefficient(TIC)                   0.9795      0.8254  
Logarithmic Loss Function(LL)                         .NaN       4.344

3.11 Further Options

Finally, three options are also available in the Test menu: an Exclusion Restrictions dialog box, a Linear Restrictions dialog box and a Store in database dialog.

3.11.1 Exclusion Restrictions Dialog Box

The Exclusion Restrictions dialog box option allows you to select explanatory variables and test whether they are jointly significant. A more general form is the test for linear restrictions.

Mark all the variables you wish to include in the test in this Multiple-Selection List box. G@RCH tests whether the selected variables can be deleted from the model.

3.11.2 Linear Restrictions Dialog Box

Tests for linear restrictions are specified in the form of a matrix R, and a vector r. These are entered as one matrix [R : r] in the dialog. (This is more general than testing for exclusion restrictions).

The first four columns are the columns of R, specifying two restrictions. The last column is r, which specifies what the restrictions should add up to.

The dimensions of the matrix must be specified in the rows and columns fields. It is your responsibility to specify the right values, G@RCH will not try to work it out (because elements of a row may be spread over several lines).

3.11.3 Store in Database Dialog

Finally, the residuals, the squared residuals, the standardized residuals, the conditional variance and the h-step-ahead forecasts (of the conditional mean and the conditional variance) can be stored in the database as new variables. When selecting this option, a first window appears and the user selects the series to be stored. A default name is then proposed for this series.


3.12 The random walk hypothesis (RWH)

This section has been largely inspired by Taylor (1995).

Are prices (or returns) of financial assets forecastable? According to the random walk hypothesis (RWH), they are not. The steps of a random walk are unpredictable. The RWH associates steps with returns, so that returns cannot be predicted from past values. In G@RCH, the RWH is defined by two statements about the stochastic process {r_t} that generates observed returns: the mean is stationary, E[r_t] = μ, and returns are uncorrelated, corr(r_t, r_{t+τ}) = 0 for all τ ≠ 0.

When the RWH is true, the latest return and all past returns are irrelevant if we attempt to predict future returns using linear predictors. In this case, the returns can be assumed to form a martingale difference sequence and thus expected returns do not depend on the history of time series information.15

3.12.1 The Variance-ratio test

The Variance-ratio test of Lo and MacKinlay (1988) is designed to detect either mean reversion or price-trend behaviour. Let x_t = log(p_t) be the price logarithm (with any dividends reinvested). Then one-period returns are r_t = x_t − x_{t−1} and N-period returns are r_t + … + r_{t+N−1} = x_{t+N−1} − x_{t−1}. Also, assuming the returns process is stationary, let V(N) = var(r_t + … + r_{t+N−1}). When the RWH is true, V(N) = N V(1). The test attempts to decide whether this property holds. N is any integer ≥ 2.

For any stationary process, the variance-ratio is

V(N)/(N V(1)) = 1 + 2 ∑_{τ=1}^{N−1} (N−τ)/N · ρ_τ,

by using the formula for the variance of a linear combination. Consequently, the variance-ratio test is appropriate when any dependence occurs at the first N − 1 lags and when the autocorrelations all have the same sign. It is a good test when the alternative to the RWH is an ARMA(1,1) process that has a positive autoregressive parameter for returns.

Implementation of the test requires:

  1. Estimates of V(1) and V(N).
  2. An estimate of the standard error of the variance-ratio estimate.

The test is implemented in five steps.

Assume there are n returns.

Step 1 : Calculate the average return and unbiased variance estimates, thus:

V̂(1) = [1/(n−1)] ∑_{t=1}^{n} (r_t − r̄)²,

V̂(N) = { n / [(n−N)(n−N+1)] } ∑_{t=1}^{n−N+1} (r_t + … + r_{t+N−1} − N r̄)².

Step 2 : Calculate the sample variance ratio as

V̂R(N) = V̂(N) / (N V̂(1)).

Reject the RWH if V̂R(N) is far from 1.

Recall that

VR(N) ≡ V(N)/(N V(1)) = 1 + 2 ∑_{τ=1}^{N−1} (N−τ)/N · ρ_τ,

and its sample counterpart is

V̂R(N) ≡ V̂(N)/(N V̂(1)) ≅ 1 + 2 ∑_{τ=1}^{N−1} (N−τ)/N · ρ̂_τ.

Under H0, E[V̂R(N)] = 1 and thus Var[V̂R(N)] = Var[ 2 ∑_{τ=1}^{N−1} (N−τ)/N · ρ̂_τ ]. An appropriate estimate of n Var(ρ̂_τ) is provided by b_τ.

Step 3 : Estimate variances for the sample autocorrelations, ρ̂_τ, of the n returns, for 1 ≤ τ ≤ N − 1, under the assumption that the RWH is true.

b_τ = n ∑_{t=1}^{n−τ} s_t s_{t+τ} / ( ∑_{t=1}^{n} s_t )² = n ∑_{t=1}^{n−τ} (r_t − r̄)² (r_{t+τ} − r̄)² / [ ∑_{t=1}^{n} (r_t − r̄)² ]²,

where s_t = (r_t − r̄)².

A very good approximation is given by b_τ ≅ 1 + (k_u − 1) ρ̂_{τ,s}, where k_u is the sample kurtosis of the returns r_t and ρ̂_{τ,s} is the sample autocorrelation at lag τ of the time series {s_1, …, s_n} defined by s_t = (r_t − r̄)².

Step 4 : Estimate a variance for the variance-ratio V̂R(N), under the assumption that the RWH is true. This estimate is denoted v_N. Then

v_N = ∑_{τ=1}^{N−1} [ 2(N−τ)/N ]² b_τ = n Var(V̂R),

or equivalently

v_N = (4/N²) ∑_{τ=1}^{N−1} (N−τ)² b_τ.

Step 5 : Calculate the test statistic,

z_N = ( V̂R(N) − 1 ) / √( v_N / n ).

The asymptotic distribution of this statistic is N(0,1) when RWH is true.
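The five steps can be collected into one short function (a didactic Python sketch of the test as described above, not G@RCH's implementation):

```python
import math

def variance_ratio_test(r, N):
    # Steps 1-5: unbiased variance estimates, sample variance ratio,
    # heteroskedasticity weights b_tau, variance v_N, and the z statistic.
    n = len(r)
    rbar = sum(r) / n
    v1 = sum((x - rbar) ** 2 for x in r) / (n - 1)
    blocks = [sum(r[t:t + N]) for t in range(n - N + 1)]
    vN = n * sum((x - N * rbar) ** 2 for x in blocks) / ((n - N) * (n - N + 1))
    vr = vN / (N * v1)
    s = [(x - rbar) ** 2 for x in r]
    S2 = sum(s) ** 2
    b = [n * sum(s[t] * s[t + tau] for t in range(n - tau)) / S2
         for tau in range(1, N)]
    v = sum((2 * (N - tau) / N) ** 2 * b[tau - 1] for tau in range(1, N))
    z = (vr - 1) / math.sqrt(v / n)
    return vr, z
```

For a perfectly mean-reverting series of alternating returns +1, −1, +1, …, adjacent returns cancel, so V̂(2) = 0 and the z statistic is strongly negative.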

Monte Carlo experiment

To study the performance of this test, let us consider a first simulation study (see Simulation_VR.ox). We simulate 1000 series of T = 2000 observations following a N-AR(0) model with μ = 0.1, i.e. (y_t − μ) = ε_t, where ε_t ~ N(0, σ²) and σ = 0.2. We then apply the Variance-ratio test with N = 2, 3 and 10. The 1000 realisations of the VR statistic are plotted in Figure 3.6 while the empirical distribution of the VR statistic is plotted in Figure 3.7 (together with a N(0,1)).

We see from Figure 3.6 that the VR statistic varies between -3 and 3. Furthermore, Figure 3.7 suggests that the VR statistic is close to being standard normally distributed.


Figure 3.6: VR statistics with N = 2,3 and 10. DGP is a N-AR(0)


Figure 3.7: SIZE: Empirical distribution of the VR test with N = 2,3 and 10. DGP is a N-AR(0)

The number of times that the null is rejected at the α% nominal size (called the empirical size of the test) is reported below.

Empirical size with alpha=0.10 and M=2:8.1  
Empirical size with alpha=0.05 and M=2:3.3  
Empirical size with alpha=0.01 and M=2:0.4  
Empirical size with alpha=0.10 and M=3:7.5  
Empirical size with alpha=0.05 and M=3:3.7  
Empirical size with alpha=0.01 and M=3:0.8  
Empirical size with alpha=0.10 and M=10:8.2  
Empirical size with alpha=0.05 and M=10:4.7  
Empirical size with alpha=0.01 and M=10:1.4

These results suggest that the test suffers from a minor size distortion for the three considered values of M.
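The size experiment is easy to reproduce on a smaller scale (a self-contained Python sketch with a reduced number of replications and observations; Simulation_VR.ox performs the full exercise in Ox):

```python
import math
import random

def vr_z(r, N=2):
    # Variance-ratio z statistic (same five steps as described above).
    n = len(r)
    rbar = sum(r) / n
    v1 = sum((x - rbar) ** 2 for x in r) / (n - 1)
    blocks = [sum(r[t:t + N]) for t in range(n - N + 1)]
    vN = n * sum((x - N * rbar) ** 2 for x in blocks) / ((n - N) * (n - N + 1))
    s = [(x - rbar) ** 2 for x in r]
    b = [n * sum(s[t] * s[t + tau] for t in range(n - tau)) / sum(s) ** 2
         for tau in range(1, N)]
    v = sum((2 * (N - tau) / N) ** 2 * b[tau - 1] for tau in range(1, N))
    return (vN / (N * v1) - 1) / math.sqrt(v / n)

# Simulate i.i.d. N(0.1, 0.2^2) returns under H0 and count rejections at 10%.
rng = random.Random(42)
reps, n, rejections = 200, 500, 0
for _ in range(reps):
    r = [0.1 + 0.2 * rng.gauss(0, 1) for _ in range(n)]
    if abs(vr_z(r, 2)) > 1.6449:  # two-sided 10% critical value
        rejections += 1
size = 100.0 * rejections / reps  # empirical size in percent
```

With these (smaller) settings the empirical size should land in the same ballpark as the figures reported above, i.e. slightly below the nominal 10%.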

To study the power of the test, we now simulate T = 2000 observations of a N-AR(1) model with μ = 0.1 and ρ = 0.1, i.e. (1 − ρL)(y_t − μ) = ε_t, where ε_t ~ N(0, σ²) and σ = 0.2. To obtain this configuration of the model, choose the following options:

decl m_vAR=<0.05>;  
decl m_cAR=1;

The empirical distribution of the VR statistics (under the alternative) is plotted in Figure 3.8 (together with a N(0,1)).


Figure 3.8: POWER: Empirical distribution of the VR test with N = 2,3 and 10. DGP is a N-AR(1)

Both Figure 3.8 and the output here below suggest that the VR test has good power to detect the presence of serial correlation in the series.

Empirical power with alpha=0.10 and M=2:69.5  
Empirical power with alpha=0.05 and M=2:58.4  
Empirical power with alpha=0.01 and M=2:33.3  
Empirical power with alpha=0.10 and M=3:61.5  
Empirical power with alpha=0.05 and M=3:50.4  
Empirical power with alpha=0.01 and M=3:28  
Empirical power with alpha=0.10 and M=10:32.1  
Empirical power with alpha=0.05 and M=10:22.9  
Empirical power with alpha=0.01 and M=10:8.9

A potential weakness of the VR test is that it assumes a constant variance, while it is clear that most financial time series do not have a constant conditional variance. To study the impact of the presence of GARCH effects on the VR test, let us now simulate T = 2000 observations of a N-AR(0)-GARCH(1,1) model with μ = 0.1, i.e. (y_t − μ) = ε_t, where ε_t ~ N(0, σ_t²), σ_t² = ω + α1 ε²_{t−1} + β1 σ²_{t−1} with ω = 0.2, α1 = 0.1 and β1 = 0.8.

Again, we simulate 1000 series following this DGP and apply the VR test with the same values for M.

Now choose the following options in Simulation_VR.ox:

decl m_cModel=1;  
decl m_cAR=0;


Figure 3.9: Empirical size of the VR test with N = 2,3 and 10. DGP is a N-AR(0)-GARCH(1,1)

Empirical size with alpha=0.10 and M=2:12.1  
Empirical size with alpha=0.05 and M=2:6.2  
Empirical size with alpha=0.01 and M=2:0.9  
Empirical size with alpha=0.10 and M=3:10.4  
Empirical size with alpha=0.05 and M=3:5  
Empirical size with alpha=0.01 and M=3:0.7  
Empirical size with alpha=0.10 and M=10:8.5  
Empirical size with alpha=0.05 and M=10:3.5  
Empirical size with alpha=0.01 and M=10:0.7

The empirical distribution of the VR statistics is plotted in Figure 3.9 and the empirical size in the output just below. The main conclusion is that the presence of GARCH effects has a negligible impact on the size of the test.

However, the power of the test is affected by the presence of GARCH effects: adding an AR(1) component (with ρ = 0.05) to this GARCH model noticeably reduces the power of the test, as the results below suggest.

To obtain these results, choose the following options in Simulation_VR.ox:

decl m_cModel=1;  
decl m_vAR=<0.05>;  
decl m_cAR=1;

Empirical power with alpha=0.10 and M=2:59.2  
Empirical power with alpha=0.05 and M=2:47.4  
Empirical power with alpha=0.01 and M=2:25.4  
Empirical power with alpha=0.10 and M=3:53.1  
Empirical power with alpha=0.05 and M=3:40.6  
Empirical power with alpha=0.01 and M=3:21.4  
Empirical power with alpha=0.10 and M=10:29.9  
Empirical power with alpha=0.05 and M=10:19.7  
Empirical power with alpha=0.01 and M=10:8.4

To overcome this problem, some authors have proposed applying the VR test to the rescaled (standardized) residuals of a GARCH-type model.

This is illustrated in Simulation_VR_GARCH.ox.

As suggested by the next output, applying the VR test to the rescaled returns restores the original power of the test (i.e. the AR(1) case without GARCH effects).

Empirical power with alpha=0.10 and M=2:69.5  
Empirical power with alpha=0.05 and M=2:58.4  
Empirical power with alpha=0.01 and M=2:33.3  
Empirical power with alpha=0.10 and M=3:61.5  
Empirical power with alpha=0.05 and M=3:50.4  
Empirical power with alpha=0.01 and M=3:28  
Empirical power with alpha=0.10 and M=10:32.1  
Empirical power with alpha=0.05 and M=10:22.9  
Empirical power with alpha=0.01 and M=10:8.9

See Taylor (1995) for more details about the VR tests as well as some applications on real data.

3.12.2 Runs test

Another popular test of the RWH is the Runs test, first used by Fama (1965). The runs test (also called the Wald-Wolfowitz test) is a non-parametric test that checks the randomness hypothesis of a two- or three-valued data sequence. It is similar to a first-order autocorrelation test but, being non-parametric, requires no assumption of normality or stationarity of the distribution. A ‘run’ is a sequence of adjacent equal elements. For example, the sequence “1111000111001111110000” is divided into six runs, three of which consist of 1’s and the others of 0’s. If the 1’s and 0’s alternate randomly, the number of runs is a random variable whose asymptotic distribution is normal.

Let qt be the sign of the return rt: qt equals 1, 0, -1 for positive, zero and negative values of rt, respectively. Let ct be 1 if qt ≠ qt+1 and 0 otherwise. The total number of runs of all types is

C = 1 + \sum_{t=1}^{n-1} c_t.

Suppose there are n1 positive returns, n2 zero returns, and n3 negative returns in a series of n returns (if the mean is not zero, apply the test on rt - r̄). Then the mean and variance of the random variable generating C, conditional upon n1, n2, and n3, are

E(C) = n + 1 - \frac{1}{n}\sum_{i=1}^{3} n_i^2,

\mathrm{Var}(C) = \frac{\sum_{i=1}^{3} n_i^2\left(\sum_{i=1}^{3} n_i^2 + n(n+1)\right) - 2n\sum_{i=1}^{3} n_i^3 - n^3}{n^2(n-1)},

when the signs qt are generated by i.i.d. variables (Mood, 1940).

The statistic C has an approximate normal distribution for large n. Tests can then be decided by evaluating

K = \frac{C - E(C)}{\sqrt{\mathrm{Var}(C)}},

with the RWH rejected at the 5% level if |K| > 1.96. The Runs test avoids all the problems created by conditional heteroskedasticity. Note that trends in prices (vs. no trend) would give fewer runs than expected.

To study the performance of the Runs test, let us consider a first simulation study (see Simulation_VR.ox). We simulate 1000 series of T = 2000 observations following a N-AR(0) model with μ = 0.1, i.e. (yt - μ) = εt, where εt ~ N(0,σ2) and σ = 0.2. We then apply the Runs test on each simulated series. The empirical distribution of the K statistic is plotted in Figure 3.10 (together with a N(0,1)).


Figure 3.10: Empirical size of the Runs test statistics K. DGP is a N-AR(0)

The output reported here below also suggests that the test has a correct size at the three chosen significance levels. Results not reported here suggest (as expected) that the test is not affected by the presence of GARCH effects.

Empirical size with alpha=0.10:11.1  
Empirical size with alpha=0.05:5.1  
Empirical size with alpha=0.01:0.9

Let us now study the power of the Runs test by considering a N-AR(1) model. The results suggest that this test suffers from a reduction in power (compared to the VR test on the same DGP) due to the loss of information in the transformation from returns to their signs.

Empirical power with alpha=0.10:41.8  
Empirical power with alpha=0.05:30.7  
Empirical power with alpha=0.01:12.5

3.12.3 Rescaled Range Tests

Range statistics that have power to detect long-term dependence were first developed for hydrological data by Hurst (1951) and later applied to financial returns.

Lo (1991) provides many references and a rigorous description of appropriate tests when the preferred alternative to randomness is long-term dependence. The range defined by a set of returns {r1,...,rn} is:

M_n = \max_{1 \le T \le n} \sum_{t=1}^{T} (r_t - \bar r) \;-\; \min_{1 \le T \le n} \sum_{t=1}^{T} (r_t - \bar r).

R/S-test statistics are ranges divided by scaled standard deviations:

\frac{R}{S} = \frac{1}{\sqrt{n}\,\hat\sigma} M_n.

Two special cases define Mandelbrot (1972)’s and Lo (1991)’s test statistics:

\hat\sigma = s \quad \text{defines } (R/S)_{Man},

\hat\sigma^2 = s^2\left[1 + 2\sum_{j=1}^{q}\left(1 - \frac{j}{q+1}\right)\hat\rho_j\right] \quad \text{defines } (R/S)_{Lo},

with s2 the sample variance of the returns and ρ̂j the lag-j sample autocorrelation.

Under certain assumptions, the distributions of these statistics converge, as n and q increase, to that of the range of a Brownian bridge on the unit interval. The rule is the following: do not reject the null when

R/S ∈ [0.861, 1.747] at the 10% level,
R/S ∈ [0.809, 1.862] at the 5% level,
R/S ∈ [0.721, 2.098] at the 1% level.

The null hypothesis of an uncorrelated process can be tested using (R∕S)Man, while Lo (1991) focuses on the null hypothesis of a short-memory process, for which the appropriate test statistic is (R∕S)Lo.
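As an illustration, both statistics can be sketched in a few lines of Python following the definitions above (illustrative code, not the G@RCH implementation):

```python
import numpy as np

def rs_statistics(r, q=10):
    """Mandelbrot's (R/S)_Man and Lo's modified (R/S)_Lo statistics.

    M_n is the range of the partial sums of demeaned returns; each
    statistic divides M_n by sqrt(n) times an estimated scale."""
    n = len(r)
    d = r - r.mean()
    cs = np.cumsum(d)
    Mn = cs.max() - cs.min()
    s2 = np.mean(d**2)                        # sample variance
    # Bartlett-weighted autocovariances, as in Lo (1991)
    acov = np.array([np.sum(d[j:] * d[:n - j]) / n for j in range(1, q + 1)])
    w = 1 - np.arange(1, q + 1) / (q + 1)
    s2_lo = s2 + 2 * np.sum(w * acov)
    rs_man = Mn / (np.sqrt(n) * np.sqrt(s2))
    rs_lo = Mn / (np.sqrt(n) * np.sqrt(s2_lo))
    return rs_man, rs_lo

rs_man, rs_lo = rs_statistics(np.random.default_rng(0).standard_normal(2000))
```

For i.i.d. data the two statistics are close; the autocorrelation correction in (R/S)_Lo only matters when the series exhibits short-memory dynamics.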

To study the performance of the R/S tests, let us consider the following simulation study (see Simulation_RS.ox). We simulate 1000 series of T = 2000 observations following a N-AR(0) model with μ = 0.1, i.e. (yt - μ) = εt, where εt ~ N(0,σ2) and σ = 0.2. We then apply the (R∕S)Man test and the (R∕S)Lo test with q = 10 and 20.

The empirical size of the tests is reported below. The results suggest that the tests are correctly sized.

(R/S)Man Test  
Empirical size with alpha=0.10:10.6  
Empirical size with alpha=0.05:5.3  
Empirical size with alpha=0.01:1.4  
(R/S)Lo Test  
Empirical size with alpha=0.10 and q=10:9.5  
Empirical size with alpha=0.05 and q=10:5.3  
Empirical size with alpha=0.01 and q=10:1.5  
Empirical size with alpha=0.10 and q=20:9.1  
Empirical size with alpha=0.05 and q=20:5  
Empirical size with alpha=0.01 and q=20:1.1

The next output corresponds to the AR(1) case with ρ = 0.05, obtained with the following options

decl m_vAR=<0.05>;  
decl m_cAR=1;

It is clear that the tests have essentially no power against this weak short-memory process.

(R/S)Man Test  
Empirical power with alpha=0.10:11  
Empirical power with alpha=0.05:5.5  
Empirical power with alpha=0.01:1.4  
(R/S)Lo Test  
Empirical power with alpha=0.10 and q=10:9.7  
Empirical power with alpha=0.05 and q=10:5.4  
Empirical power with alpha=0.01 and q=10:1.4  
Empirical power with alpha=0.10 and q=20:8.9  
Empirical power with alpha=0.05 and q=20:4.8  
Empirical power with alpha=0.01 and q=20:1.1

With a much higher value of ρ, i.e. ρ = 0.5, we see that the statistic of Lo (1991) remains close to its nominal size while the one of Mandelbrot (1972) has some power to detect short memory.

(R/S)Man Test  
Empirical power with alpha=0.10:74.4  
Empirical power with alpha=0.05:65.6  
Empirical power with alpha=0.01:45.1  
(R/S)Lo Test  
Empirical power with alpha=0.10 and q=10:10.6  
Empirical power with alpha=0.05 and q=10:6.2  
Empirical power with alpha=0.01 and q=10:1.2  
Empirical power with alpha=0.10 and q=20:9.4  
Empirical power with alpha=0.05 and q=20:5.5  
Empirical power with alpha=0.01 and q=20:1.2

Results not reported here suggest that these two statistics are not affected by the presence of GARCH effects.

To study the ability of the (R∕S)Lo statistic to detect long-range dependence, we now use the ARFIMA package of Doornik and Ooms (1999) to simulate T = 2000 observations of a N-ARFIMA(0,0.2,0) model with μ = 0.1, i.e. (1 - L)0.2(yt - μ) = εt, where εt ~ N(0,σ2) and σ = 0.2.

The results of this simulation have been obtained using example file Simulation_RS.ox with the following options:

decl m_cAR=0;  
decl m_cD=1;  
decl m_vD=0.2;

The next output suggests that the (R∕S)Lo statistic indeed has power to detect long memory. (The rejection frequencies below are labelled ‘Empirical size’ in the program output, but under this long-memory DGP they measure power.)

(R/S)Man Test  
Empirical size with alpha=0.10:96.1  
Empirical size with alpha=0.05:94.6  
Empirical size with alpha=0.01:87.2  
(R/S)Lo Test  
Empirical size with alpha=0.10 and q=10:64.4  
Empirical size with alpha=0.05 and q=10:55.9  
Empirical size with alpha=0.01 and q=10:37.6  
Empirical size with alpha=0.10 and q=20:47.5  
Empirical size with alpha=0.05 and q=20:37.9  
Empirical size with alpha=0.01 and q=20:22

Chapter 4
Further Univariate GARCH Models

While Engle (1982) is certainly the most important contribution in financial econometrics, the ARCH model is rarely used in practice because its simple structure typically requires many lags to capture the persistence of volatility.

A useful generalization of this model is the GARCH model introduced by Bollerslev (1986). This model is also a weighted average of past squared residuals, but it has declining weights that never go completely to zero. This model is more parsimonious than the ARCH and, even in its simplest form, has proven surprisingly successful in predicting conditional variances.

The GARCH model is not the only extension. Indeed, G@RCH 6.1 proposes no less than 12 specifications. As shown below, choosing an alternative model is extremely easy in G@RCH.

4.1 GARCH Model

The Generalized ARCH (GARCH) model of Bollerslev (1986) is based on an infinite ARCH specification and reduces the number of estimated parameters by imposing nonlinear restrictions on them. The GARCH (p,q) model can be expressed as:

\sigma_t^2 = \omega + \sum_{i=1}^{q} \alpha_i \varepsilon_{t-i}^2 + \sum_{j=1}^{p} \beta_j \sigma_{t-j}^2.

Using the lag (or backshift) operator L, the GARCH (p,q) model becomes:

\sigma_t^2 = \omega + \alpha(L)\varepsilon_t^2 + \beta(L)\sigma_t^2,

with \alpha(L) = \alpha_1 L + \alpha_2 L^2 + \cdots + \alpha_q L^q and \beta(L) = \beta_1 L + \beta_2 L^2 + \cdots + \beta_p L^p.


If all the roots of the polynomial |1 - \beta(L)| = 0 lie outside the unit circle, we have:

\sigma_t^2 = \omega\,[1 - \beta(L)]^{-1} + \alpha(L)\,[1 - \beta(L)]^{-1}\varepsilon_t^2,

which may be seen as an ARCH(∞) process since the conditional variance linearly depends on all previous squared residuals. In this case, the conditional variance of yt can become larger than the unconditional variance, given by

\sigma^2 \equiv E(\varepsilon_t^2) = \frac{\omega}{1 - \sum_{i=1}^{q}\alpha_i - \sum_{j=1}^{p}\beta_j},

if past realizations of εt2 are larger than σ2 (Palm, 1996).

Applying variance targeting to the GARCH model implies replacing ω by \sigma^2\left(1 - \sum_{i=1}^{q}\alpha_i - \sum_{j=1}^{p}\beta_j\right), where σ2 is the unconditional variance of εt, which can be consistently estimated by its sample counterpart.1

Bollerslev (1986) has shown that for a GARCH(1,1) with normal innovations, the kurtosis of y is 3[1 - (\alpha_1 + \beta_1)^2] / [1 - (\alpha_1 + \beta_1)^2 - 2\alpha_1^2] > 3. The autocorrelations of εt2 have also been derived by Bollerslev (1986). For a stationary GARCH(1,1), \rho_1 = \alpha_1 + \alpha_1^2\beta_1/(1 - 2\alpha_1\beta_1 - \beta_1^2), and \rho_k = (\alpha_1 + \beta_1)^{k-1}\rho_1 for k = 2,3,\ldots In other words, the autocorrelations decline exponentially with a decay factor of \alpha_1 + \beta_1.
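These closed-form moments are easy to evaluate numerically. The following Python sketch (illustrative, not part of G@RCH) computes the kurtosis and the autocorrelation function of εt2 for a Gaussian GARCH(1,1):

```python
def garch11_moments(alpha1, beta1):
    """Kurtosis of y and ACF of eps^2 for a Gaussian GARCH(1,1)
    (Bollerslev, 1986). Requires 1 - (a+b)^2 - 2a^2 > 0 so that the
    fourth moment exists."""
    ab = alpha1 + beta1
    denom = 1 - ab**2 - 2 * alpha1**2
    if denom <= 0:
        raise ValueError("fourth moment does not exist")
    kurtosis = 3 * (1 - ab**2) / denom
    rho1 = alpha1 + alpha1**2 * beta1 / (1 - 2 * alpha1 * beta1 - beta1**2)
    rho = lambda k: ab**(k - 1) * rho1   # geometric decay at rate a+b
    return kurtosis, rho

kurt, rho = garch11_moments(0.1, 0.8)
```

For α1 = 0.1 and β1 = 0.8 this gives a kurtosis above 3 (fat tails) and ρ1 = 0.14, with subsequent autocorrelations shrinking by the factor 0.9 per lag.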

As in the ARCH case, some restrictions are needed to ensure that σt2 is positive for all t. Bollerslev (1986) shows that imposing ω > 0, αi ≥ 0 (for i = 1,…,q) and βj ≥ 0 (for j = 1,…,p) is sufficient for the conditional variance to be positive. In practice, the GARCH parameters are often estimated without the positivity constraints. Nelson and Cao (1992) argue that imposing all coefficients to be nonnegative is too restrictive and that some of these coefficients are found to be negative in practice while the conditional variance remains positive (by checking on a case-by-case basis). Consequently, they relaxed this constraint and gave sufficient conditions for the GARCH(1,q) and GARCH(2,q) cases based on the infinite representation given in Equation (4.2). Indeed, the conditional variance is strictly positive provided ω[1 - β(1)]^{-1} is positive and all the coefficients of the infinite polynomial α(L)[1 - β(L)]^{-1} in Equation (4.2) are nonnegative. The positivity constraints proposed by Bollerslev (1986) can be imposed during the estimation (see Section 3.9). If not, these constraints, as well as the ones implied by the ARCH(∞) representation, will be tested a posteriori and reported in the output (if there is no explanatory variable in the conditional variance equation).

Estimation results of an ARMA(1,0)-GARCH(1,1) model by QML are reported in the next Output (labelled Box 6).

                           Box 6 - Output of the ARMA(1,0)-GARCH(1,1) by QMLE  
Dependent variable : Nasdaq  
Mean Equation : ARMA (1, 0) model.  
1 regressor(s) in the mean.  
Variance Equation : GARCH (1, 1) model.  
1 regressor(s) in the variance.  
The distribution is a Gauss distribution.  
Strong convergence using numerical derivatives  
Log-likelihood = -5370.86  
Please wait : Computing the Std Errors ...  
 Robust Standard Errors (Sandwich formula)  
                  Coefficient  Std.Error  t-value  t-prob  
Cst(M)               0.116210   0.015919    7.300  0.0000  
Monday (M)          -0.174801   0.028920   -6.044  0.0000  
AR(1)                0.193905   0.017177    11.29  0.0000  
Cst(V)               0.009292   0.012566   0.7394  0.4597  
Friday (V)           0.067380   0.059745    1.128  0.2595  
ARCH(Alpha1)         0.164808   0.028030    5.880  0.0000  
GARCH(Beta1)         0.826238   0.026107    31.65  0.0000  
No. Observations :      4093  No. Parameters  :         7  
Mean (Y)         :   0.05517  Variance (Y)    :   1.59189  
Skewness (Y)     :  -0.74128  Kurtosis (Y)    :  14.25531  
Log Likelihood   : -5370.858  Alpha[1]+Beta[1]:   0.99105  
The sample mean of squared residuals was used to start recursion.  
Estimated Parameters Vector :  
 0.116210;-0.174801; 0.193905; 0.009292; 0.067380; 0.164808; 0.826243  
Elapsed Time : 2.975 seconds (or 0.0495833 minutes).

Interestingly, the log-likelihood now equals -5370.858 against -6106.357 for the corresponding ARCH(1) model. Any likelihood ratio test (LRT), asymptotically ~ χ2(1), would reject the ARCH(1) model in favour of the GARCH(1,1).

Furthermore, we report below the same 6 misspecification tests as for the ARCH(1) model. The output is reported in Box 7 - Misspecification Tests.

                                               Box 7 - Misspecification Tests  
Information Criterium (to be minimized)  
Akaike          2.627832  Shibata         2.627826  
Schwarz         2.638636  Hannan-Quinn    2.631657  
                   Statistic       t-Test      P-Value  
Skewness            -0.65157       17.024  5.4349e-065  
Excess Kurtosis       2.6513       34.645  5.2745e-263  
Jarque-Bera           1488.4         .NaN      0.00000  
Q-Statistics on Standardized Residuals  
  --> P-values adjusted by 1 degree(s) of freedom  
  Q(  5) =  4.57820   [0.3333749]  
  Q( 10) =  5.22369   [0.8143890]  
  Q( 20) =  18.7431   [0.4734248]  
  Q( 50) =  58.5113   [0.1657199]  
H0 : No serial correlation ==> Accept H0 when prob. is High [Q < Chisq(lag)]  
Q-Statistics on Squared Standardized Residuals  
  --> P-values adjusted by 2 degree(s) of freedom  
  Q(  5) =  3.79989   [0.2838986]  
  Q( 10) =  7.15170   [0.5203577]  
  Q( 20) =  16.8830   [0.5311632]  
  Q( 50) =  46.2948   [0.5429339]  
H0 : No serial correlation ==> Accept H0 when prob. is High [Q < Chisq(lag)]  
Adjusted Pearson Chi-square Goodness-of-fit test  
# Cells(g)  Statistic      P-Value(g-1)     P-Value(g-k-1)  
    40      201.6787         0.000000          0.000000  
    50      221.5492         0.000000          0.000000  
    60      230.5035         0.000000          0.000000  
Rem.: k = 7 = # estimated parameters  
Residual-Based Diagnostic for Conditional Heteroskedasticity of Tse(2001)  
  RBD( 2) =  2.02370   [0.3635463]  
  RBD( 5) =  3.62445   [0.6046453]  
  RBD(10) =  6.84825   [0.7396890]  
P-values in brackets

Unlike the ARCH(1) model, the Q-Statistics on standardized and squared standardized residuals, as well as the RBD test with various lag values suggest that the GARCH(1,1) does a good job in modelling the dynamics of the first two conditional moments of the NASDAQ.

However, the adjusted Pearson Chi-square goodness-of-fit test (with different cell numbers) still points out some misspecification of the overall conditional distribution.

Several authors have proposed to use a Student-t or GED distribution in combination with a GARCH model to capture the fat tails of high-frequency financial time-series. Furthermore, since the NASDAQ seems to be skewed, a skewed-Student distribution might be justified (see Section 6 for a discussion of non-normal distributions).

4.2 EGARCH Model

The Exponential GARCH (EGARCH) model, originally introduced by Nelson (1991), is re-expressed in Bollerslev and Mikkelsen (1996) as follows:

\log \sigma_t^2 = \omega + [1 - \beta(L)]^{-1}[1 + \alpha(L)]\,g(z_{t-1}).

The value of g(zt) depends on several elements. Nelson (1991) notes that, “to accommodate the asymmetric relation between stock returns and volatility changes (...) the value of g(zt) must be a function of both the magnitude and the sign of zt”.2 That is why he suggests expressing the function g(.) as

g(z_t) \equiv \underbrace{\gamma_1 z_t}_{\text{sign effect}} + \underbrace{\gamma_2\left[\,|z_t| - E|z_t|\,\right]}_{\text{magnitude effect}}.

E|zt| depends on the assumption made on the unconditional density of zt. Indeed, for the normal distribution,

E(|z_t|) = \sqrt{2/\pi}.

For the skewed-Student distribution,

E(|z_t|) = \frac{\xi^2 + 1/\xi^2}{\xi + 1/\xi}\;\frac{2\,\Gamma\!\left(\frac{\upsilon+1}{2}\right)\sqrt{\upsilon-2}}{(\upsilon-1)\sqrt{\pi}\,\Gamma\!\left(\frac{\upsilon}{2}\right)},

where ξ = 1 for the symmetric Student.

For the GED, we have

E(|z_t|) = 2^{1/\upsilon}\,\lambda_\upsilon\,\frac{\Gamma(2/\upsilon)}{\Gamma(1/\upsilon)}.

ξ, υ and λυ concern the shape of the non-normal densities and were defined in Section 3.6.2.

Note that the use of a log transformation of the conditional variance ensures that σt2 is always positive.
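As an illustration, the news impact function g(.) is straightforward to evaluate. The Python sketch below (illustrative, not G@RCH code) uses the Gaussian E|z| = √(2/π); the parameter names are chosen to match the Theta1/Theta2 labels of the estimation output:

```python
import math
import numpy as np

E_ABS_Z_GAUSS = math.sqrt(2 / math.pi)  # E|z| for z ~ N(0,1)

def g(z, theta1, theta2):
    """EGARCH news impact function: sign effect theta1*z plus
    magnitude effect theta2*(|z| - E|z|), Gaussian innovations assumed."""
    return theta1 * z + theta2 * (np.abs(z) - E_ABS_Z_GAUSS)
```

With a negative sign-effect coefficient (as estimated below for the NASDAQ), a negative shock raises log-volatility more than a positive shock of the same size, i.e. g(-1) > g(1).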

Applying variance targeting to the EGARCH model implies replacing ω by log(σ2), where σ2 is the unconditional variance of εt, which can be consistently estimated by its sample counterpart.3

The output reported below corresponds to the ARMA(1,0)-EGARCH(1,1) with a GED distribution. Interestingly, both θ1 and θ2 are significant. Note that the degree of freedom of the GED distribution is significantly lower than 2, confirming that the standardized residuals are fat-tailed.

                                                               Box 8 - EGARCH  
Dependent variable : Nasdaq  
Mean Equation : ARMA (1, 0) model.  
1 regressor(s) in the mean.  
Variance Equation : EGARCH (1, 1) model.  
1 regressor(s) in the variance.  
The distribution is a GED distribution, with a tail coefficient of 1.31814.  
Strong convergence using numerical derivatives  
Log-likelihood = -5233.92  
Please wait : Computing the Std Errors ...  
 Robust Standard Errors (Sandwich formula)  
                  Coefficient  Std.Error  t-value  t-prob  
Cst(M)               0.133901   0.010949    12.23  0.0000  
Monday (M)          -0.166042   0.026034   -6.378  0.0000  
AR(1)                0.188276   0.014182    13.28  0.0000  
Cst(V)              -0.342968    0.21183   -1.619  0.1055  
Friday (V)          -0.203655   0.079556   -2.560  0.0105  
ARCH(Alpha1)        -0.355466    0.14267   -2.491  0.0128  
GARCH(Beta1)         0.987227  0.0058283    169.4  0.0000  
EGARCH(Theta1)      -0.085730   0.020159   -4.253  0.0000  
EGARCH(Theta2)       0.305338   0.037692    8.101  0.0000  
G.E.D.(DF)           1.318137   0.052628    25.05  0.0000  
No. Observations :      4093  No. Parameters  :        10  
Mean (Y)         :   0.05517  Variance (Y)    :   1.59189  
Skewness (Y)     :  -0.74128  Kurtosis (Y)    :  14.25531  
Log Likelihood   : -5233.919  
The sample mean of squared residuals was used to start recursion.  
Estimated Parameters Vector :  
0.133901;-0.166042; 0.188276;-0.342968;-0.203655;-0.355466;  
0.987227;-0.085730; 0.305338; 1.318142

4.3 GJR Model

This popular model is proposed by Glosten, Jagannathan, and Runkle (1993). Its generalized version is given by:

\sigma_t^2 = \omega + \sum_{i=1}^{q}\left(\alpha_i \varepsilon_{t-i}^2 + \gamma_i S_{t-i}^{-}\varepsilon_{t-i}^2\right) + \sum_{j=1}^{p}\beta_j \sigma_{t-j}^2,

where S_t^{-} is a dummy variable that takes the value 1 when εt is negative and 0 otherwise.

In this model, it is assumed that the impact of εt2 on the conditional variance σt2 is different when εt is positive or negative. The TGARCH model of Zakoian (1994) is very similar to the GJR but models the conditional standard deviation instead of the conditional variance. Ling and McAleer (2002) provide, among other stationarity conditions for various GARCH models, the conditions of existence of the second and fourth moment of the GJR.
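As a quick illustration of this asymmetry, here is a minimal Python sketch of the GJR(1,1) variance recursion (illustrative, not the G@RCH implementation):

```python
import numpy as np

def gjr_variance(eps, omega, alpha1, gamma1, beta1):
    """Conditional variance of a GJR(1,1): a negative lagged shock
    (S = 1) receives the extra coefficient gamma1."""
    sig2 = np.empty(len(eps))
    sig2[0] = np.mean(eps**2)          # start the recursion at the sample mean
    for t in range(1, len(eps)):
        s_neg = 1.0 if eps[t - 1] < 0 else 0.0
        sig2[t] = (omega + (alpha1 + gamma1 * s_neg) * eps[t - 1]**2
                   + beta1 * sig2[t - 1])
    return sig2
```

With γ1 > 0, a negative shock of a given size raises next-period variance by more than a positive shock of the same size, which is the leverage effect the test below examines.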

Applying variance targeting to the GJR model implies replacing ω by \sigma^2\left[1 - \sum_{i=1}^{q}\left(\alpha_i + \gamma_i E(S_t^{-})\right) - \sum_{j=1}^{p}\beta_j\right], where σ2 is the unconditional variance of εt, which can be consistently estimated by its sample counterpart, and E(S_t^{-}) equals 0.5 for symmetric distributions (i.e. Normal, Student and GED) and \frac{1}{1+\xi^2} for the SKST.4

A nice feature of the GJR model is that the null hypothesis of no leverage effect is easy to test. Indeed, γ1 = ⋯ = γq = 0 implies that the news impact curve is symmetric, i.e. past positive shocks have the same impact on today’s volatility as past negative shocks.

The output reported below suggests the presence of such an effect on the NASDAQ since ˆγ1 = 0.107988 with a robust t-value of 2.826.

                                                                  Box 9 - GJR  
Dependent variable : Nasdaq  
Mean Equation : ARMA (1, 0) model.  
1 regressor(s) in the mean.  
Variance Equation : GJR (1, 1) model.  
1 regressor(s) in the variance.  
The distribution is a Gauss distribution.  
Strong convergence using numerical derivatives  
Log-likelihood = -5354.08  
Please wait : Computing the Std Errors ...  
 Robust Standard Errors (Sandwich formula)  
                  Coefficient  Std.Error  t-value  t-prob  
Cst(M)               0.093388   0.015714    5.943  0.0000  
Monday (M)          -0.163882   0.028471   -5.756  0.0000  
AR(1)                0.203798   0.018417    11.07  0.0000  
Cst(V)               0.014065   0.013479    1.044  0.2968  
Friday (V)           0.053779   0.060397   0.8904  0.3733  
ARCH(Alpha1)         0.105963   0.018002    5.886  0.0000  
GARCH(Beta1)         0.825532   0.026830    30.77  0.0000  
GJR(Gamma1)          0.107988   0.038216    2.826  0.0047  
No. Observations :      4093  No. Parameters  :         8  
Mean (Y)         :   0.05517  Variance (Y)    :   1.59189  
Skewness (Y)     :  -0.74128  Kurtosis (Y)    :  14.25531  
Log Likelihood   : -5354.076  
The sample mean of squared residuals was used to start recursion.  
Estimated Parameters Vector :  
0.093388;-0.163882; 0.203798; 0.014065; 0.053779; 0.105963; 0.825532; 0.107993  
Elapsed Time : 4.086 seconds (or 0.0681 minutes).

4.4 APARCH Model

This model has been introduced by Ding, Granger, and Engle (1993). The APARCH (p,q) model can be expressed as:

\sigma_t^\delta = \omega + \sum_{i=1}^{q}\alpha_i\left(|\varepsilon_{t-i}| - \gamma_i\varepsilon_{t-i}\right)^\delta + \sum_{j=1}^{p}\beta_j\sigma_{t-j}^\delta,

where δ > 0 and -1 < γi < 1 (i = 1,…,q).

The parameter δ plays the role of a Box-Cox transformation of σt while γi reflects the so-called leverage effect. Properties of the APARCH model are studied in He and Teräsvirta (1999a, 1999b).

The APARCH includes seven other ARCH extensions as special cases:5

Following Ding, Granger, and Engle (1993), if ω > 0 and \sum_{i=1}^{q}\alpha_i E(|z| - \gamma_i z)^\delta + \sum_{j=1}^{p}\beta_j < 1, a stationary solution for Equation (4.9) exists and is expressed as:

E(\sigma_t^\delta) = \frac{\omega}{1 - \sum_{i=1}^{q}\alpha_i E(|z| - \gamma_i z)^\delta - \sum_{j=1}^{p}\beta_j}.

Notice that if we set γi = 0, δ = 2 and if zt has zero mean and unit variance, we have the usual stationarity condition of the GARCH(1,1) model (α1 + β1 < 1). However, if γi ≠ 0 and/or δ ≠ 2, this condition depends on the assumption made on the innovation process.

Ding, Granger, and Engle (1993) derive a closed-form solution to κi = E(|z| - γiz)δ in the Gaussian case. Lambert and Laurent (2001) show that for the standardized skewed-Student:6

\kappa_i = \left\{\xi^{-(1+\delta)}(1+\gamma_i)^\delta + \xi^{1+\delta}(1-\gamma_i)^\delta\right\} \frac{\Gamma\!\left(\frac{\delta+1}{2}\right)\Gamma\!\left(\frac{\upsilon-\delta}{2}\right)(\upsilon-2)^{\frac{1+\delta}{2}}}{\left(\xi + \frac{1}{\xi}\right)\sqrt{(\upsilon-2)\pi}\,\Gamma\!\left(\frac{\upsilon}{2}\right)}.

For the GED, we can show that:

\kappa_i = \left[(1+\gamma_i)^\delta + (1-\gamma_i)^\delta\right] 2^{\frac{\delta-\upsilon}{\upsilon}}\,\frac{\Gamma\!\left(\frac{\delta+1}{\upsilon}\right)\lambda_\upsilon^\delta}{\Gamma\!\left(\frac{1}{\upsilon}\right)}.

Note that ξ, υ and λυ concern the shape of the non-normal densities and were defined in Section 3.6.2.
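As a sanity check, the stationarity condition can be evaluated numerically in the Gaussian case, where κi = E(|z| - γiz)δ has the standard closed form [(1+γ)δ + (1-γ)δ] 2^{δ/2-1} Γ((δ+1)/2)/√π (which reduces to E z² = 1 for δ = 2, γ = 0). A small illustrative Python sketch:

```python
import math

def kappa_gauss(gamma, delta):
    """E(|z| - gamma*z)**delta for z ~ N(0,1):
    [(1+g)**d + (1-g)**d] * 2**(d/2 - 1) * Gamma((d+1)/2) / sqrt(pi)."""
    return ((1 + gamma)**delta + (1 - gamma)**delta) \
        * 2**(delta / 2 - 1) * math.gamma((delta + 1) / 2) / math.sqrt(math.pi)

def aparch_stationary(alphas, gammas, betas, delta):
    """Check sum_i alpha_i * kappa_i + sum_j beta_j < 1 (Gaussian case)."""
    s = sum(a * kappa_gauss(g, delta) for a, g in zip(alphas, gammas))
    return s + sum(betas) < 1
```

Plugging in the APARCH estimates reported below for the NASDAQ (α1 ≈ 0.154, γ1 ≈ 0.288, β1 ≈ 0.855, δ ≈ 1.146) gives a value just below one, i.e. a stationary but very persistent process.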

Applying variance targeting to the APARCH model implies replacing ω by \sigma^\delta\left(1 - \sum_{i=1}^{q}\kappa_i\alpha_i - \sum_{j=1}^{p}\beta_j\right), where σ is the unconditional standard deviation of εt, which can be consistently estimated by its sample counterpart.7

Once again, the APARCH model suggests the presence of a leverage effect on the NASDAQ (see Box 10). Importantly, δ is significantly different from 2 (δ̂ = 1.146366) but not significantly different from 1. This suggests that, instead of modelling the conditional variance (GARCH), it is more relevant in this case to model the conditional standard deviation. This result is in line with those of Taylor (1986), Schwert (1990) and Ding, Granger, and Engle (1993), who indicate that there is substantially more correlation between absolute returns than between squared returns, a stylized fact of high-frequency financial returns (often called ‘long memory’).

                                                              Box 10 - APARCH  
Dependent variable : Nasdaq  
Mean Equation : ARMA (1, 0) model.  
1 regressor(s) in the mean.  
Variance Equation : APARCH (1, 1) model.  
1 regressor(s) in the variance.  
The distribution is a Gauss distribution.  
Strong convergence using numerical derivatives  
Log-likelihood = -5341.37  
Please wait : Computing the Std Errors ...  
 Robust Standard Errors (Sandwich formula)  
                  Coefficient  Std.Error  t-value  t-prob  
Cst(M)               0.088279   0.019660    4.490  0.0000  
Monday (M)          -0.167957   0.028627   -5.867  0.0000  
AR(1)                0.194576   0.018647    10.43  0.0000  
Cst(V)               0.020683   0.011917    1.736  0.0827  
Friday (V)           0.028304   0.050441   0.5611  0.5747  
ARCH(Alpha1)         0.154370   0.024000    6.432  0.0000  
GARCH(Beta1)         0.855355   0.024859    34.41  0.0000  
APARCH(Gamma1)       0.288072   0.076177    3.782  0.0002  
APARCH(Delta)        1.146366    0.20559    5.576  0.0000  
No. Observations :      4093  No. Parameters  :         9  
Mean (Y)         :   0.05517  Variance (Y)    :   1.59189  
Skewness (Y)     :  -0.74128  Kurtosis (Y)    :  14.25531  
Log Likelihood   : -5341.369  
The sample mean of squared residuals was used to start recursion.  
Estimated Parameters Vector :  
0.088279;-0.167957; 0.194576; 0.020683; 0.028304; 0.154370; 0.855355;  
0.288072; 1.146371  
Elapsed Time : 7.941 seconds (or 0.13235 minutes).

4.5 IGARCH Model

In many high-frequency time-series applications, the conditional variance estimated using a GARCH(p,q) process exhibits a strong persistence, that is:

\sum_{j=1}^{p}\beta_j + \sum_{i=1}^{q}\alpha_i \approx 1.

If \sum_{j=1}^{p}\beta_j + \sum_{i=1}^{q}\alpha_i < 1, the process (εt) is second-order stationary, and a shock to the conditional variance σt2 has a decaying impact on σ2t+h when h increases, and is asymptotically negligible. Indeed, let us rewrite the ARCH(∞) representation of the GARCH(p,q), given in Equation (4.2), as follows:

\sigma_t^2 = \omega^{*} + \lambda(L)\varepsilon_t^2,

where \omega^{*} = \omega[1 - \beta(L)]^{-1} and \lambda(L) = \alpha(L)[1 - \beta(L)]^{-1} = \sum_{i=1}^{\infty}\lambda_i L^i, the λi being lag coefficients that depend nonlinearly on the αi and βj. For a GARCH(1,1), \lambda_i = \alpha_1\beta_1^{i-1}. Recall that this model is said to be second-order stationary provided that α1 + β1 < 1, since this implies that the unconditional variance exists and equals ω/(1 - α1 - β1). As shown by Davidson (2004), the amplitude of the GARCH(1,1) is measured by S = \sum_{i=1}^{\infty}\lambda_i = \alpha_1/(1 - \beta_1), which determines “how large the variations in the conditional variance can be” (and hence the order of the existing moments). This concept is often confused with the memory of the model, which determines “how long shocks to the volatility take to dissipate”. In this respect, the GARCH(1,1) model has a geometric memory ρ = 1/β1, with \lambda_i = O(\rho^{-i}). In practice, estimates of α1 + β1 are often very close to one. When α1 + β1 = 1, we are confronted with an Integrated GARCH (IGARCH) model.
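The amplitude and memory of the GARCH(1,1) are easy to visualize from its ARCH(∞) weights. A small illustrative Python sketch (not part of G@RCH):

```python
import numpy as np

def garch11_arch_inf_weights(alpha1, beta1, k=200):
    """First k weights lambda_i = alpha1 * beta1**(i-1) of the ARCH(inf)
    representation of a GARCH(1,1)."""
    i = np.arange(1, k + 1)
    return alpha1 * beta1**(i - 1)

lam = garch11_arch_inf_weights(0.1, 0.8)
S = 0.1 / (1 - 0.8)   # Davidson's amplitude: alpha1 / (1 - beta1)
```

For α1 = 0.1 and β1 = 0.8 the weights decay geometrically at rate β1 = 0.8 (the memory) and sum to S = 0.5 (the amplitude).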

The GARCH(p,q) model can be expressed as an ARMA process. Using the lag operator L, we can rearrange Equation (4.1) as:

[1 - \alpha(L) - \beta(L)]\,\varepsilon_t^2 = \omega + [1 - \beta(L)]\left(\varepsilon_t^2 - \sigma_t^2\right).

When the [1- α (L )- β(L)] polynomial contains a unit root, i.e. the sum of all the αi and the βj is one, we have the IGARCH(p,q) model of Engle and Bollerslev (1986). It can then be written as:

\phi(L)(1 - L)\,\varepsilon_t^2 = \omega + [1 - \beta(L)]\left(\varepsilon_t^2 - \sigma_t^2\right),

where \phi(L) = [1 - \alpha(L) - \beta(L)](1 - L)^{-1} is of order max{p,q} - 1.

We can rearrange Equation (4.12) to express the conditional variance as a function of the squared residuals. After some manipulations, we have its ARCH(∞) representation:

\sigma_t^2 = \frac{\omega}{1 - \beta(L)} + \left\{1 - \phi(L)(1 - L)[1 - \beta(L)]^{-1}\right\}\varepsilon_t^2.

For this model, S = 1 and thus the second moment does not exist. However, this process is still short-memory. To show this, Davidson (2004) considers an IGARCH(0,1) model defined as εt = σtzt and σt2 = ε2t-1. This process is often wrongly compared to a random walk since the long-range forecast is σ2t+h = εt2 for any h. However, εt = zt|εt-1|, which means that the memory of a large deviation persists for only one period.

4.6 RiskMetricsTM

In October 1994, the risk management group at J.P. Morgan released a technical document describing its internal market risk management methodology (J.P.Morgan, 1996). This methodology, called RiskMetricsTM, soon became a standard in market risk measurement due to its simplicity.

Basically, the RiskMetricsTM model is an IGARCH(1,1) model where the ARCH and GARCH coefficients are fixed.

The model is given by:

\sigma_t^2 = \omega + (1 - \lambda)\varepsilon_{t-1}^2 + \lambda\sigma_{t-1}^2,

where ω = 0 and λ is generally set to 0.94 with daily data and to 0.97 with weekly data.8 Note also that Equation (4.14) is the very basic conditional variance model of the RiskMetricsTM methodology; there exist many extensions of the original model. See the RiskMetrics Group website for details.
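The RiskMetrics recursion is trivial to implement. A minimal illustrative Python sketch with λ = 0.94 (not the RiskMetrics Group's own code):

```python
import numpy as np

def riskmetrics_variance(eps, lam=0.94):
    """EWMA conditional variance: sig2_t = (1-lam)*eps_{t-1}^2 + lam*sig2_{t-1},
    i.e. the basic RiskMetrics recursion with omega = 0."""
    sig2 = np.empty(len(eps))
    sig2[0] = np.mean(eps**2)   # start the recursion at the sample mean
    for t in range(1, len(eps)):
        sig2[t] = (1 - lam) * eps[t - 1]**2 + lam * sig2[t - 1]
    return sig2
```

Because the two coefficients sum to one and are fixed rather than estimated, the recursion is an IGARCH(1,1) with no free variance parameters, which explains why it cannot adapt to the dynamics the misspecification tests below detect.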

To illustrate the need for flexible ARCH-type models, here is the output of the Box-Pierce test on squared standardized residuals and the RBD test applied after the estimation of the RiskMetrics model (including an AR(1) term and the two dummy variables). There is no doubt that the RiskMetrics specification is not appropriate.

Q-Statistics on Squared Standardized Residuals  
  --> P-values adjusted by 2 degree(s) of freedom  
  Q(  5) =  82.0593   [0.0000]  
  Q( 10) =  84.1246   [0.0000]  
  Q( 20) =  93.3285   [0.0000]  
  Q( 50) =  119.353   [0.0000]  
H0 : No serial correlation ==> Accept H0 when prob. is High [Q < Chisq(lag)]  
Residual-Based Diagnostic for Conditional Heteroskedasticity of Tse (2001)  
  RBD( 2) =  54.0450   [0.0000]  
  RBD( 5) =  54.6773   [0.0000]  
  RBD(10) =  59.6488   [0.0000]  
P-values in brackets  

4.7 Fractionally Integrated Models

Volatility tends to change quite slowly over time, and, as shown in Ding, Granger, and Engle (1993) among others, the effects of a shock can take a considerable time to decay.9 Therefore the distinction between stationary and unit root processes seems to be far too restrictive. Indeed, the propagation of shocks in a stationary process occurs at an exponential rate of decay (so that it only captures the short memory), while for a unit root process the persistence of shocks is infinite. In the conditional mean, the ARFIMA specification has been proposed to fill the gap between short and complete persistence, so that the short-run behavior of the time series is captured by the ARMA parameters, while the fractional differencing parameter allows for modelling the long-run dependence.10

To mimic the behavior of the correlogram of the observed volatility, Baillie, Bollerslev, and Mikkelsen (1996) (hereafter denoted BBM) introduce the Fractionally Integrated GARCH (FIGARCH) model by replacing the first difference operator of Equation (4.13) by (1 - L)d.

The conditional variance of the FIGARCH (p,d,q) is given by:

σ²_t = ω[1 − β(L)]^{-1} + {1 − [1 − β(L)]^{-1}ϕ(L)(1 − L)^d} ε²_t,    (4.15)

or σ²_t = ω* + λ(L)ε²_t = ω* + Σ_{i=1}^∞ λ_i L^i ε²_t, with 0 ≤ d ≤ 1. It is fairly easy to show that ω > 0, β_1 − d ≤ ϕ_1 ≤ (2 − d)/3 and d[ϕ_1 − (1 − d)/2] ≤ β_1(ϕ_1 − β_1 + d) are sufficient to ensure that the conditional variance of the FIGARCH(1,d,1) is positive almost surely for all t. Setting ϕ_1 = 0 gives the condition for the FIGARCH(1,d,0). See Section 3.9 to see how to impose these constraints.
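These sufficient conditions are straightforward to check mechanically. The following sketch (Python, illustrative only and not part of G@RCH) verifies them for a FIGARCH(1,d,1); setting phi1 = 0 gives the FIGARCH(1,d,0) case:

```python
# Sufficient positivity conditions of BBM for the FIGARCH(1,d,1), as stated above
# (an illustrative checker, not part of G@RCH).
def figarch_positivity_ok(omega, d, phi1, beta1):
    cond1 = omega > 0.0
    cond2 = (beta1 - d) <= phi1 <= (2.0 - d) / 3.0
    cond3 = d * (phi1 - (1.0 - d) / 2.0) <= beta1 * (phi1 - beta1 + d)
    return cond1 and cond2 and cond3
```

For instance, the QML estimates reported in Box 11 below (ω = 0.023, d = 0.582, ϕ_1 = 0.144, β_1 = 0.536) satisfy all three conditions.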

When estimating a FIGARCH(1,d,1) model by QML on the NASDAQ dataset (see the next output, labelled Box 11), one sees that d is significantly different from 0 and 1, while ϕ_1 is barely significant. Importantly, comparing its log-likelihood with that of the simple GARCH(1,1), i.e. -5359.179 vs. -5370.858, justifies the use of a long-memory process in the conditional variance.

                                                         Box 11 - FIGARCH-BBM  
Dependent variable : Nasdaq  
Mean Equation : ARMA (1, 0) model.  
1 regressor(s) in the mean.  
Variance Equation : FIGARCH (1, d, 1) model estimated with BBM’s method  
(Truncation order : 1000).  
1 regressor(s) in the variance.  
The distribution is a Gauss distribution.  
Strong convergence using numerical derivatives  
Log-likelihood = -5359.18  
Please wait : Computing the Std Errors ...  
 Robust Standard Errors (Sandwich formula)  
                  Coefficient  Std.Error  t-value  t-prob  
Cst(M)               0.119253   0.016127    7.394  0.0000  
Monday (M)          -0.174214   0.028834   -6.042  0.0000  
AR(1)                0.195242   0.017134    11.40  0.0000  
Cst(V)               0.023438   0.015515    1.511  0.1309  
Friday (V)           0.074531   0.065791    1.133  0.2573  
d-Figarch            0.581512   0.079626    7.303  0.0000  
ARCH(Phi1)           0.144459   0.075900    1.903  0.0571  
GARCH(Beta1)         0.536264   0.093287    5.749  0.0000  
No. Observations :      4093  No. Parameters  :         8  
Mean (Y)         :   0.05517  Variance (Y)    :   1.59189  
Skewness (Y)     :  -0.74128  Kurtosis (Y)    :  14.25531  
Log Likelihood   : -5359.179  
The sample mean of squared residuals was used to start recursion.  
Estimated Parameters Vector :  
0.119253;-0.174214; 0.195242; 0.023438; 0.074531; 0.581512; 0.144459; 0.536269  
Elapsed Time : 45.115 seconds (or 0.751917 minutes).

Davidson (2004) notes the interesting and counterintuitive fact that the memory parameter of this process is -d, and is increasing as d approaches zero, while in the ARFIMA model the memory increases when d increases. According to Davidson (2004), the unexpected behavior of the FIGARCH model may be due less to any inherent paradoxes than to the fact that, embodying restrictions appropriate to a model in levels, it has been transplanted into a model of volatility. The main characteristic of this model is that it is not stationary when d > 0. Indeed,

(1 − L)^d = Σ_{k=0}^∞ [Γ(d + 1) / (Γ(k + 1)Γ(d − k + 1))] (−L)^k
          = 1 − dL − (1/2)d(1 − d)L² − (1/6)d(1 − d)(2 − d)L³ − …
          = 1 − Σ_{k=1}^∞ c_k(d)L^k,    (4.16)
where c_1(d) = d, c_2(d) = d(1 − d)/2, etc. By construction, Σ_{k=1}^∞ c_k(d) = 1 for any value of d, and consequently the FIGARCH belongs to the same “knife-edge-nonstationary” class represented by the IGARCH model. To test whether this nonstationarity feature holds, Davidson (2004) proposes a generalized version of the FIGARCH and calls it the HYperbolic GARCH. The HYGARCH is given by Equation (4.15), with λ(L) replaced by 1 − [1 − β(L)]^{-1}ϕ(L){1 + α[(1 − L)^d − 1]}. Note that G@RCH reports log(α) and not α. The c_k(d) coefficients are thus weighted by α. Interestingly, the HYGARCH nests the FIGARCH when α = 1 (or equivalently when log(α) = 0), and the process is stationary when α < 1 (or equivalently when log(α) < 0), in which case the GARCH component observes the usual covariance stationarity restrictions (see Davidson, 2004, for more details).
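The c_k(d) coefficients in Equation (4.16) follow from the Gamma-function ratios via the simple recursion c_1(d) = d and c_k(d) = c_{k-1}(d)(k − 1 − d)/k. A Python sketch (illustrative only, not G@RCH code):

```python
# Coefficients c_k(d) of (1 - L)^d = 1 - sum_{k>=1} c_k(d) L^k, computed by the
# recursion c_1 = d, c_k = c_{k-1} * (k - 1 - d) / k (illustrative sketch).
def ck_weights(d, n):
    c = [d]
    for k in range(2, n + 1):
        c.append(c[-1] * (k - 1 - d) / k)
    return c
```

For d = 0.5 the first weights are 0.5, 0.125, 0.0625, …; summing even the first 20,000 of them still leaves a visible gap below 1, which reflects the slow hyperbolic decay that distinguishes the fractional filter from the exponential decay of a stationary GARCH.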

Note that when estimating the HYGARCH model on the NASDAQ dataset (see the next output, labelled Box 12), one cannot reject the FIGARCH specification in favour of the HYGARCH, since log(α̂) = -0.006962 with a robust standard error of 0.029515.

                                                         Box 12 - HYGARCH  
Dependent variable : Nasdaq  
Mean Equation : ARMA (1, 0) model.  
1 regressor(s) in the mean.  
Variance Equation : HYGARCH (1, d, 1) model of Davidson  
(Truncation order : 1000).  
1 regressor(s) in the variance.  
The distribution is a Gauss distribution.  
Strong convergence using numerical derivatives  
Log-likelihood = -5359.13  
Please wait : Computing the Std Errors ...  

 Robust Standard Errors (Sandwich formula)  
                  Coefficient  Std.Error  t-value  t-prob  
Cst(M)               0.119324   0.016057    7.431  0.0000  
Monday (M)          -0.174009   0.028839   -6.034  0.0000  
AR(1)                0.195311   0.017189    11.36  0.0000  
Cst(V)               0.025119   0.017519    1.434  0.1517  
Friday (V)           0.074334   0.065858    1.129  0.2591  
d-Figarch            0.589652   0.081351    7.248  0.0000  
ARCH(Phi1)           0.143596   0.075199    1.910  0.0563  
GARCH(Beta1)         0.541468   0.091022    5.949  0.0000  
Log Alpha (HY)      -0.006962   0.029515  -0.2359  0.8135  
No. Observations :      4093  No. Parameters  :         9  
Mean (Y)         :   0.05517  Variance (Y)    :   1.59189  
Skewness (Y)     :  -0.74128  Kurtosis (Y)    :  14.25531  
Log Likelihood   : -5359.126  
The sample mean of squared residuals was used to start recursion.  
Estimated Parameters Vector :  
0.119324;-0.174009; 0.195311; 0.025119; 0.074334; 0.589652;  
0.143596; 0.541468;-0.006957  
Elapsed Time : 51.875 seconds (or 0.864583 minutes).

Chung (1999) underscores some additional drawbacks in the BBM model: there is a structural problem in the BBM specification since the parallel with the ARFIMA framework of the conditional mean equation is not perfect, leading to difficult interpretations of the estimated parameters. Indeed the fractional differencing operator applies to the constant term in the mean equation (ARFIMA) while it does not in the variance equation (FIGARCH). Chung (1999) proposes a slightly different process:

ϕ(L)(1 − L)^d (ε²_t − σ²) = [1 − β(L)](ε²_t − σ²_t),    (4.17)

where σ² is the unconditional variance of ε_t.

Applying variance targeting to this model implies replacing σ² by its sample counterpart.

If we keep the same definition of λ(L) as in Equation (4.15), we can formulate the conditional variance as:

σ²_t = σ² + {1 − [1 − β(L)]^{-1}ϕ(L)(1 − L)^d} (ε²_t − σ²),    (4.18)

or

σ²_t = σ² + λ(L)(ε²_t − σ²).

Chung (1999) shows that σ² > 0 and 0 ≤ ϕ_1 ≤ β_1 ≤ d ≤ 1 are sufficient to ensure the positivity of Equation (4.18) when p = q = 1.11

λ(L) is an infinite summation which, in practice, has to be truncated. BBM propose to truncate λ(L) at 1000 lags (this truncation order has been implemented as the default value in our package, but it may be changed by the user) and to replace the unobserved ε²_t's by the empirical counterpart of E(ε²_t), i.e. (1/T) Σ_{t=1}^T ε̂²_t. Contrary to BBM, Chung (1999) proposes to truncate λ(L) at the size of the information set (T − 1) and to initialize the unobserved (ε²_t − σ²) at 0 (this quantity is small in absolute value and has a zero mean).12
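For the FIGARCH(1,d,1), the truncated λ_i weights can be obtained numerically by composing the three lag polynomials. A Python sketch (illustrative names only; G@RCH's internal implementation may differ):

```python
# Truncated ARCH(infinity) weights of a FIGARCH(1,d,1):
# lambda(L) = 1 - [1 - beta1*L]^{-1} * (1 - phi1*L) * (1 - L)^d  (illustrative sketch).
def figarch_lambda(d, phi1, beta1, trunc=1000):
    # delta_k: coefficients of (1 - L)^d
    delta = [1.0]
    for k in range(1, trunc + 1):
        delta.append(delta[-1] * (k - 1 - d) / k)
    # a(L) = (1 - phi1*L) * (1 - L)^d
    a = [delta[k] - (phi1 * delta[k - 1] if k > 0 else 0.0) for k in range(trunc + 1)]
    # b(L) = a(L) / (1 - beta1*L) by polynomial long division, then lambda(L) = 1 - b(L)
    b = [a[0]]
    for k in range(1, trunc + 1):
        b.append(a[k] + beta1 * b[-1])
    return [-b[k] for k in range(1, trunc + 1)]   # lambda_1, ..., lambda_trunc
```

The first weight reproduces the familiar λ_1 = ϕ_1 − β_1 + d, and when the sufficient positivity conditions hold all truncated weights are non-negative.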

Recently, Conrad and Haag (2006) and Conrad (2007) have derived non-negativity conditions for FIGARCH and HYGARCH models (respectively). Thanks to Christian Conrad, these conditions are implemented in the following examples: Stat_Constr_FIGARCH_Conrad_Haag_Check.ox and
Stat_Constr_HYGARCH_Conrad_Check.ox for a FIGARCH and HYGARCH(1,d,1) respectively. While the conditions are tested on the estimated parameters in the previous two example files, the conditions are imposed during the estimation in the next two, i.e. Stat_Constr_FIGARCH_Conrad_Haag_Impose.ox and

The idea of fractional integration has been extended to other GARCH-type models, including the Fractionally Integrated EGARCH (FIEGARCH) of Bollerslev and Mikkelsen (1996) and the Fractionally Integrated APARCH (FIAPARCH) of Tse (1998).13

Similarly to the GARCH(p,q) process, the EGARCH(p,q) of Equation (4.3) can be extended to account for long memory by factorizing the autoregressive polynomial [1 − β(L)] = ϕ(L)(1 − L)^d, where all the roots of ϕ(z) = 0 lie outside the unit circle. The FIEGARCH(p,d,q) is specified as follows:

log(σ²_t) = ω + ϕ(L)^{-1}(1 − L)^{-d}[1 + α(L)] g(z_{t-1}).    (4.19)

Finally, the FIAPARCH (p,d,q) model can be written as:14

σ^δ_t = ω + {1 − [1 − β(L)]^{-1}ϕ(L)(1 − L)^d} (|ε_t| − γε_t)^δ.    (4.20)

To illustrate, the next outputs, labelled Box 13 and 14, report the estimation results of a FIAPARCH(1,d,1) and a FIEGARCH(1,d,1) respectively. Once again, long-memory is detected in the conditional variance equation.

                                                            Box 13 - FIAPARCH  
Dependent variable : Nasdaq  
Mean Equation : ARMA (1, 0) model.  
1 regressor(s) in the mean.  
Variance Equation : FIAPARCH (1, d, 1) model estimated with BBM’s method  
(Truncation order : 1000).  
1 regressor(s) in the variance.  
The distribution is a Gauss distribution.  
Strong convergence using numerical derivatives  
Log-likelihood = -5319.06  
Please wait : Computing the Std Errors ...  
 Robust Standard Errors (Sandwich formula)  
                  Coefficient  Std.Error  t-value  t-prob  
Cst(M)               0.090488   0.016690    5.422  0.0000  
Monday (M)          -0.155500   0.027083   -5.742  0.0000  
AR(1)                0.206201   0.018850    10.94  0.0000  
Cst(V)               0.069862   0.027352    2.554  0.0107  
Friday (V)           0.032353   0.062970   0.5138  0.6074  
d-Figarch            0.442479   0.048626    9.100  0.0000  
ARCH(Alpha1)         0.175341   0.085768    2.044  0.0410  
GARCH(Beta1)         0.444875    0.10163    4.377  0.0000  
APARCH(Gamma1)       0.358940   0.080621    4.452  0.0000  
APARCH(Delta)        1.500260    0.12889    11.64  0.0000  
No. Observations :      4093  No. Parameters  :        10  
Mean (Y)         :   0.05517  Variance (Y)    :   1.59189  
Skewness (Y)     :  -0.74128  Kurtosis (Y)    :  14.25531  
Log Likelihood   : -5319.060  
The sample mean of squared residuals was used to start recursion.  
Estimated Parameters Vector :  
0.090488;-0.155500; 0.206201; 0.069862; 0.032353; 0.442479; 0.175341;  
0.444875; 0.358940; 1.500265  
Elapsed Time : 173.209 seconds (or 2.88682 minutes).

                                                            Box 14 - FIEGARCH  
Dependent variable : Nasdaq  
Mean Equation : ARMA (1, 0) model.  
1 regressor(s) in the mean.  
Variance Equation : FIEGARCH (1, d, 1) model  
(Truncation order : 1000).  
1 regressor(s) in the variance.  
The distribution is a Gauss distribution.  
Strong convergence using numerical derivatives  
Log-likelihood = -5308.76  
Please wait : Computing the Std Errors ...  
 Robust Standard Errors (Sandwich formula)  
                  Coefficient  Std.Error  t-value  t-prob  
Cst(M)               0.096537   0.015428    6.257  0.0000  
Monday (M)          -0.172178   0.028035   -6.142  0.0000  
AR(1)                0.192984   0.017946    10.75  0.0000  
Cst(V)               0.588445    0.28539    2.062  0.0393  
Friday (V)          -0.100769   0.096499   -1.044  0.2964  
d-Figarch            0.493427   0.045323    10.89  0.0000  
ARCH(Alpha1)        -0.246505    0.25960  -0.9496  0.3424  
GARCH(Beta1)         0.603661    0.14050    4.297  0.0000  
EGARCH(Theta1)      -0.119277   0.026220   -4.549  0.0000  
EGARCH(Theta2)       0.302285   0.048533    6.228  0.0000  
No. Observations :      4093  No. Parameters  :        10  
Mean (Y)         :   0.05517  Variance (Y)    :   1.59189  
Skewness (Y)     :  -0.74128  Kurtosis (Y)    :  14.25531  
Log Likelihood   : -5308.762  
The sample mean of squared residuals was used to start recursion.  
Estimated Parameters Vector :  
0.096537;-0.172178; 0.192984; 0.588445;-0.100769; 0.493427;  
-0.246505; 0.603661;-0.119277; 0.302290  
Elapsed Time : 242.118 seconds (or 4.0353 minutes).

4.8 Forecasting the Conditional Variance of GARCH-type models

As for the simple ARCH(q) model, it is rather easy to obtain h-step-ahead forecasts for these more complicated models. In the simple GARCH(p,q) case, the optimal h-step-ahead forecast of the conditional variance, σ̂²_{t+h|t}, is given by:

σ̂²_{t+h|t} = ω̂ + Σ_{i=1}^q α̂_i ε²_{t+h-i|t} + Σ_{j=1}^p β̂_j σ²_{t+h-j|t},    (4.21)

where ε²_{t+i|t} = σ²_{t+i|t} for i > 0, while ε²_{t+i|t} = ε²_{t+i} and σ²_{t+i|t} = σ²_{t+i} for i ≤ 0. Equation (4.21) is usually computed recursively, even if a closed-form solution of σ²_{t+h|t} can be obtained by recursive substitution in Equation (4.21).
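For a GARCH(1,1), the recursion in Equation (4.21) collapses to σ̂²_{t+h|t} = ω̂ + (α̂ + β̂)σ̂²_{t+h-1|t} for h > 1, so the forecast reverts geometrically to the unconditional variance ω̂/(1 − α̂ − β̂). A Python sketch (illustrative names, not G@RCH code):

```python
# h-step-ahead conditional variance forecast for a GARCH(1,1) (illustrative sketch).
def garch11_forecast(omega, alpha, beta, eps2_t, sigma2_t, h):
    f = omega + alpha * eps2_t + beta * sigma2_t    # one-step-ahead forecast
    for _ in range(h - 1):                          # eps2_{t+i|t} = sigma2_{t+i|t}, i > 0
        f = omega + (alpha + beta) * f
    return f
```

Equivalently, by recursive substitution, σ̂²_{t+h|t} = σ̄² + (α + β)^{h-1}(σ̂²_{t+1|t} − σ̄²) with σ̄² = ω/(1 − α − β).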

Similarly, one can easily obtain the h-step-ahead forecast of the conditional variance of an ARCH, IGARCH and FIGARCH model. By contrast, for threshold models, the computation of out-of-sample forecasts is more complicated. Indeed, for EGARCH, GJR and APARCH models (as well as for their long-memory counterparts), the assumption made on the innovation process may have an effect on the forecast (especially for h > 1).

For instance, for the GJR (p,q) model, we have

σ̂²_{t+h|t} = ω̂ + Σ_{i=1}^q (α̂_i ε²_{t+h-i|t} + γ̂_i S⁻_{t+h-i|t} ε²_{t+h-i|t}) + Σ_{j=1}^p β̂_j σ²_{t+h-j|t}.    (4.22)

When γ_i = 0 for all i, we obtain the forecast of the GARCH model. Otherwise, S⁻_{t+h-i|t} has to be computed. Note first that S⁻_{t+i|t} = S⁻_{t+i} for i ≤ 0. However, when i > 0, S⁻_{t+i|t} depends on the choice of the distribution of z_t. When the distribution of z_t is symmetric around 0 (as for the Gaussian, Student and GED densities), the probability that ε_{t+i} is negative is S⁻_{t+i|t} = 0.5. If z_t is (standardized) skewed-Student distributed with asymmetry parameter ξ and υ degrees of freedom, S⁻_{t+i|t} = 1/(1 + ξ²), since ξ² is the ratio of probability masses above and below the mode.
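In practice this amounts to replacing the future negativity indicator by its expectation S⁻. A GJR(1,1) sketch in Python (illustrative names, not G@RCH code):

```python
# h-step-ahead variance forecast for a GJR(1,1) (illustrative sketch).
# For future periods the negativity indicator is replaced by its expectation:
# 0.5 under a symmetric density, 1/(1 + xi^2) under the skewed-Student.
def gjr11_forecast(omega, alpha, gamma, beta, eps_t, sigma2_t, h, xi=None):
    s_minus = 0.5 if xi is None else 1.0 / (1.0 + xi * xi)
    indicator = 1.0 if eps_t < 0.0 else 0.0        # observed at time t
    f = omega + (alpha + gamma * indicator) * eps_t ** 2 + beta * sigma2_t
    for _ in range(h - 1):
        f = omega + (alpha + gamma * s_minus + beta) * f
    return f
```

With γ = 0 this reduces to the GARCH(1,1) forecast, as noted above.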

For the APARCH (p,q) model,

σ̂^δ̂_{t+h|t} = E(σ^δ̂_{t+h} | Ω_t)
            = E(ω̂ + Σ_{i=1}^q α̂_i (|ε_{t+h-i}| − γ̂_i ε_{t+h-i})^δ̂ + Σ_{j=1}^p β̂_j σ^δ̂_{t+h-j} | Ω_t)
            = ω̂ + Σ_{i=1}^q α̂_i E[(|ε_{t+h-i}| − γ̂_i ε_{t+h-i})^δ̂ | Ω_t] + Σ_{j=1}^p β̂_j σ̂^δ̂_{t+h-j|t},    (4.23)

where E[(|ε_{t+k}| − γ̂_i ε_{t+k})^δ̂ | Ω_t] = κ_i σ̂^δ̂_{t+k|t} for k > 1, and κ_i = E(|z| − γ̂_i z)^δ̂ (see Section 3.6.2).
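The constant κ_i can be evaluated by numerical integration against the density of z. The sketch below (Python, illustrative only) assumes Gaussian innovations; for the Student or skewed-Student densities the integrand would change accordingly:

```python
import math

# kappa = E(|z| - gamma*z)^delta for standard normal z, by composite trapezoidal
# quadrature on [-lim, lim] (illustrative sketch; assumes Gaussian innovations).
def kappa(gamma, delta, n=20001, lim=8.0):
    h = 2.0 * lim / (n - 1)
    total = 0.0
    for i in range(n):
        z = -lim + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0
        pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        total += weight * (abs(z) - gamma * z) ** delta * pdf
    return total * h
```

Sanity checks: κ = E(z²) = 1 for γ = 0, δ = 2, and κ = E|z| = √(2/π) for γ = 0, δ = 1.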

For the EGARCH (p,q) model,

log σ̂²_{t+h|t} = E(log σ²_{t+h} | Ω_t)
              = E{ω̂ + [1 − β̂(L)]^{-1}[1 + α̂(L)] ĝ(z_{t+h-1}) | Ω_t}
              = [1 − β̂(L)]ω̂ + β̂(L) log σ̂²_{t+h|t} + [1 + α̂(L)] ĝ(z_{t+h-1|t}),    (4.24)

where ĝ(z_{t+k|t}) = ĝ(z_{t+k}) for k ≤ 0 and 0 for k > 0.
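For an EGARCH(1,1) this yields a short recursion: only the observed ĝ(z_t) and ĝ(z_{t-1}) enter the first steps, after which the log-variance forecast decays toward ω at rate β, since future ĝ terms have zero expectation. A Python sketch (illustrative only; the exact lag bookkeeping depends on the orders p and q):

```python
# h-step-ahead forecast of log(sigma2) for an EGARCH(1,1) (illustrative sketch).
# g_t and g_tm1 are the observed values of g(z_t) and g(z_{t-1});
# future g(z) terms are replaced by their expectation, which is zero.
def egarch11_logvar_forecast(omega, alpha, beta, g_t, g_tm1, log_sigma2_t, h):
    f = log_sigma2_t
    for k in range(1, h + 1):
        g1 = g_t if k == 1 else 0.0                          # g(z_{t+k-1|t})
        g2 = g_tm1 if k == 1 else (g_t if k == 2 else 0.0)   # g(z_{t+k-2|t})
        f = (1.0 - beta) * omega + beta * f + g1 + alpha * g2
    return f
```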

Finally, the h-step-ahead forecast of the FIAPARCH and FIEGARCH models are obtained in a similar way.

4.9 Constrained Maximum Likelihood and Simulated Annealing

As explained in Section 3.9, it is possible to constrain the parameters to range between a lower and an upper bound by selecting the option MaxBFGS - Bounded Parameters in the Estimate window.


If the user selects the option Select (Individual or Matrix Form) in the Starting Values window, the program automatically opens a new box dialog before launching the estimation. Analogous to the procedure for the starting values, the user can change the bounds manually.

In the following example, an APARCH specification has been used. By default, we propose to impose the constraints -1 < γ_i < 1, i = 1, …, q, and 0 < δ < 3.

Finally, if the option MaxSA - Unbounded Parameters is selected in the Model Settings window, a new box dialog appears before the estimation.15 In this box dialog, the user may change the default values of the parameters used in the simulated annealing algorithm.


Simulated annealing is a global optimization method that potentially distinguishes between different local optima. Starting from an initial point, the algorithm takes a step and the function is evaluated. When minimizing a function, any downhill step is accepted and the process repeats from this new point. An uphill step may also be accepted; the algorithm can thus escape from local optima. This uphill decision is made by the Metropolis criterion. As the optimization process proceeds, the length of the steps declines and the algorithm closes in on the global optimum. Since the algorithm makes very few assumptions regarding the function to be optimized, it is quite robust with respect to non-quadratic surfaces. The degree of robustness can be adjusted by the user. In fact, simulated annealing can also be used as a local optimizer for difficult functions. The implementation used here is the one of Goffe, Ferrier, and Rogers (1994). Briefly, it is found to be competitive, if not superior, to multiple restarts of conventional optimization routines for difficult optimization problems. The price to pay is that it is slower than the previous two techniques. Comprehensive help is available on the MaxSA web page:
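The accept/reject logic described above can be made concrete with a toy minimizer (Python, purely illustrative; G@RCH relies on the MaxSA implementation of Goffe, Ferrier, and Rogers (1994), not on this sketch):

```python
import math
import random

# Toy simulated annealing minimizer with Metropolis acceptance (illustration only).
def anneal(f, x0, t0=5.0, rt=0.85, n_steps=2000, step=1.0, seed=42):
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    best, fbest = x, fx
    t = t0
    for i in range(n_steps):
        cand = x + rng.uniform(-step, step)
        fc = f(cand)
        # downhill moves are always accepted; uphill moves with prob. exp(-dF/T)
        if fc <= fx or rng.random() < math.exp(-(fc - fx) / t):
            x, fx = cand, fc
        if fx < fbest:
            best, fbest = x, fx
        if (i + 1) % 100 == 0:
            t *= rt        # cool down: uphill moves become rarer
            step *= 0.95   # shrink the step length as the search localizes
    return best, fbest
```

Early on, the high temperature lets the search jump between basins of attraction; as the temperature and step length fall, it behaves like a local optimizer around the best basin found.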



4.10 Accuracy of G@RCH

McCullough and Vinod (1999) and Brooks, Burke, and Persand (2001) use the daily German mark/British pound exchange rate data of Bollerslev and Ghysels (1996) to compare the accuracy of GARCH model estimation among several econometric software packages. They choose the GARCH(1,1) model described in Fiorentini, Calzolari, and Panattoni (1996) (hereafter denoted FCP) as the benchmark. In this section, we use the same methodology with the same dataset to check the accuracy of our procedures. Coefficient and standard error estimates of G@RCH 4.2 are reported in Table 4.1 together with the results of McCullough and Vinod (1999) (FCP in the table).

        Coefficients               Std. Errors              Robust Std. Errors
        G@RCH       FCP            G@RCH      FCP           G@RCH      FCP
μ      -0.006184   -0.006190      0.008462   0.008462      0.009187   0.009189
ω       0.010760    0.010761      0.002851   0.002852      0.006484   0.006493
α1      0.153407    0.153134      0.026569   0.026523      0.053595   0.053532
β1      0.805879    0.805974      0.033542   0.033553      0.072386   0.072461

Table 4.1: Accuracy of the GARCH procedure

G@RCH gives very satisfactory results since the first four digits (at least) are the same as those of the benchmark for all but two estimations.

Moreover, to investigate the accuracy of our forecasting procedures, we have run 8-step-ahead forecasts of the model, similar to Brooks, Burke, and Persand (2001). Table 4 in Brooks, Burke, and Persand (2001) reports the conditional variance forecasts given by six well-known software packages together with the correct values. G@RCH hits the benchmarks for all steps to the third decimal.

Finally, Lombardi and Gallo (2001) extend the work of Fiorentini, Calzolari, and Panattoni (1996) to the FIGARCH model of Baillie, Bollerslev, and Mikkelsen (1996) and develop the analytic Hessian matrices of this long-memory process. For the same DEM/UKP database as in the previous example, Table 4.2 reports the coefficient estimates and their standard errors for our package (using numerical gradients and the BFGS optimization method) and for Lombardi and Gallo (2001) (using analytical gradients and the Newton-Raphson algorithm).



Table 4.2: Accuracy of the FIGARCH procedure

Results show that G@RCH provides excellent numerical estimates that are quite close to the analytical ones, even for an advanced model such as the FIGARCH.16

4.11 Simulations

G@RCH 6.1 allows the simulation of four models, namely the GARCH, GJR, APARCH and EGARCH models. An ARMA specification is allowed in the conditional mean and four different distributions are available for the error term, i.e. the normal, Student, GED and skewed-Student distributions.
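For readers who want to replicate the mechanics outside OxMetrics, the simulation of the simplest case, a Gaussian GARCH(1,1), can be sketched as follows (Python, illustrative names only; within G@RCH, Simul_Garch.ox is the natural starting point):

```python
import math
import random

# Simulate a Gaussian GARCH(1,1): eps_t = sigma_t * z_t,
# sigma2_t = omega + alpha*eps2_{t-1} + beta*sigma2_{t-1}  (illustrative sketch).
def simulate_garch11(omega, alpha, beta, n, burn=500, seed=1):
    rng = random.Random(seed)
    sigma2 = omega / (1.0 - alpha - beta)   # start at the unconditional variance
    eps = 0.0
    out = []
    for t in range(n + burn):               # discard a burn-in to kill start-up effects
        sigma2 = omega + alpha * eps * eps + beta * sigma2
        eps = math.sqrt(sigma2) * rng.gauss(0.0, 1.0)
        if t >= burn:
            out.append(eps)
    return out
```

The sample variance of a long simulated path should hover around ω/(1 − α − β), the unconditional variance.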

Simulation can be done either in Ox or through the rolling menus. Simul_Garch.ox is an example of Ox code using the Garch class. Alternatively, to simulate a GARCH model using the rolling menus, click on G@RCH in the Modules group in the workspace window on the left-hand side of OxMetrics. Then change the Category to Monte Carlo and the Model Class to Simulation of GARCH Models and click on Formulate.


Then, select the model of interest and change the default values of the corresponding parameters (next window). The simulated series can be stored in a separate dataset and are plotted if requested.



Figure 4.1: DGP: AR(1)-GJR(1,1) with SKST errors. T=1000. The generated series and their conditional variances are plotted on the left- and right-hand side, respectively

Chapter 5
Estimating Univariate Models using the Batch and Ox Versions

G@RCH is primarily a menu-driven module for OxMetrics. However, there are two additional ways of estimating GARCH-type models with G@RCH. The first makes use of the batch facilities of OxMetrics. In the second, an Ox program is written that imports the Garch class code into the program.

This chapter is devoted to the illustration of these two methods. No detailed analysis of the results will be undertaken as we concentrate on the mechanics of the procedures.

5.1 Using the Batch Version

A useful feature in G@RCH 6.1 is the possibility to use the algebra and batch languages in OxMetrics (see the OxMetrics handbook for more details). The batch language gives some control over OxMetrics through a command language. This allows for automating repetitive tasks, or as a quick way to get to a certain point in your analysis. The syntax of these commands is described below.

There are five ways of running Batch commands.

The most intuitive way is probably to estimate a model first and then click on Tools/Batch Editor ... in OxMetrics (or Alt+b). Then, a new box dialog appears with the batch code corresponding to this model. To illustrate, the next picture shows part of the batch code corresponding to a simple (without explanatory variable) ARMA(1,0)-GARCH(1,1) model.


To estimate a slightly different model, one has to change the batch code accordingly. Here is an example batch code used to estimate an ARMA(1,1)-GJR(1,1) model using the same database (note that only the commands ARMA_ORDERS() and MODEL() have been modified).

                                         Batch code for an ARMA(1,1)-GJR(1,1)  
// Batch code for G@RCH( 1)  
package("Garch", "GARCH");  
    Y = Nasdaq;  
estimate("BFGS", 1984-10-12, 0, 2000-12-21, 0);

Here is a list of G@RCH specific commands (see Section 10.3 - G@RCH Members Functions for more details on these options).

  1. CSTS: Specifies if constants are wanted respectively in the conditional mean and in the conditional variance equations.
  2. DISTRI: Specifies the desired distribution, i.e. 0 for the Gaussian (Normal) distribution 1, for the Student-t distribution, 2, for the Generalized Error distribution (GED) and 3 for the skewed Student-t.
  3. ARMA_ORDERS: Specifies respectively the AR and MA orders.
  4. ARFIMA: 1 to add a long-memory component in the conditional mean equation, 0 otherwise.
  5. GARCH_ORDERS: Specifies the orders of the GARCH(p,q)-type model.
  6. MODEL: specification choice (string).
  7. ARCH_IN_MEAN: 1 or 2 to add the conditional variance or standard deviation in the conditional mean equation, 0 otherwise.
  8. estimate(method, year1=-1, period1=0, year2=-1, period2=0); Estimates the model.
    The first argument method = "BFGS" for unconstrained ML, "BFGS-BOUNDS" for Constrained ML (lower and upper bounds) and "MaxSA" for Simulated annealing.
    year1(period1) – year2(period2) is the estimation sample. Setting year1 to -1 will result in the earliest possible year1(period1), setting year2 to -1 will result in the latest possible year2(period2).
  9. TRUNC, NYBLOM, ITER and Maxsa are used as described in Section 10.3.
  10. KUPIEC_TEST; 1 to apply the KUPIEC test on the in-sample VaR. The VaR levels are specified using option VaR_LEVELS, where the VaR levels are separated by a comma, e.g. VaR_LEVELS(0.95,0.99).
  11. DQT; 1 to apply the Dynamic Quantile test presented in Section 6.2.2. The first argument is one to apply the test and the second is the number of lags of the Hit variable to include in the regression. The VaR levels are also specified using option VaR_LEVELS (see above).
  12. COVAR, BOXPIERCE, ARCHLAGS, RBD, PEARSON, SBT, MLE are also used as described in Section 10.3 except that the lag orders are now separated by a comma, e.g. BOXPIERCE(10,20,30);
  13. Tests(); To launch the misspecification tests selected above.
  14. FORECAST; The first argument is 1 to run the forecasting procedure, the second argument is the forecasting horizon and the third argument is one to print the forecasts.
  15. Print_VaR_out_of_sample_Forecast; Print the VaR forecasts. The arguments are the VaR levels (see VaR_LEVELS).
  16. For_Graphs. See Chapter 10.
  17. MaxControl and MaxControlEps. See the ox documentation.

Here is another example batch code.

             Batch code for an ARMA(1,0)-APARCH(1,1) with various tests  
// Batch code for G@RCH( 1)  
package("Garch", "GARCH");  
    Y = Nasdaq;  
estimate("BFGS", 1984-10-12, 0, 2000-12-21, 0);  
FORECAST(1,10, 1);  

Feel free to contact us if you need more flexibility for the batch mode.

5.2 Importing the Garch Class in Ox

This section explains how to use the Garch class in Ox. We assume that you have installed Ox (version 5 or later). For more details about Ox, see Jurgen Doornik’s web site.

5.2.1 GarchEstim.ox example

G@RCH is built upon the concept of object-oriented programming. For non-specialists, object-oriented programming might sound rather daunting at first, but it is in fact quite easy to get the whole picture.

A major component of object-oriented programming is the “class”. Several useful classes are supplied with Ox, such as Database, Modelbase and Simulation classes. Garch is an additional class that helps in the estimation of GARCH-type models. Section 10.1 provides additional details on this concept.

GarchEstim.ox is an example of Ox code using the Garch class. As a file editor, we strongly recommend using OxMetrics or OxEdit.1

The GarchEstim.ox file is displayed in the following Box.

#import <packages/Garch6/Garch>  
    decl garchobj;  
    garchobj = new Garch();  
//*** DATA ***//  
    garchobj.Select(Y_VAR, {"Nasdaq",0,0});  
    garchobj.SetSelSample(-1, 1, 2000, 1);  
//*** SPECIFICATIONS ***//  
    garchobj.CSTS(1,1);             // cst in Mean (1 or 0), cst in Variance (1 or 0)  
    garchobj.DISTRI(3);             // 0 for Gauss, 1 for Student, 2 for GED, 3 for Skewed-Student  
    garchobj.ARMA_ORDERS(0,0);      // AR order (p), MA order (q).  
    garchobj.ARFIMA(0);             // 1 if Arfima wanted, 0 otherwise  
    garchobj.GARCH_ORDERS(1,1);     // p order, q order  
    garchobj.ARCH_in_mean(0);       // ARCH-in-mean: 1 or 2 to add the variance or std. dev.  
    garchobj.MODEL("GARCH");        //  0: RISKMETRICS  1:GARCH     2:EGARCH    3:GJR   4:APARCH  
                                    //  5:IGARCH 6:FIGARCH-BBM   7:FIGARCH-CHUNG   8:FIEGARCH  
                                    //  9:FIAPARCH-BBM  10: FIAPARCH-CHUNG  11: HYGARCH  
    garchobj.TRUNC(1000);           // Truncation order (only F.I. models with BBM method)  
//*** TESTS & FORECASTS ***//  
    garchobj.BOXPIERCE(<10;15;20>); // Lags for the Box-Pierce Q-statistics, <> otherwise  
    garchobj.ARCHLAGS(<2;5;10>);    // Lags for Engle’s LM ARCH test, <> otherwise  
    garchobj.NYBLOM(1);             // 1 to compute the Nyblom stability test, 0 otherwise  
    garchobj.SBT(1);                // 1 to compute the Sign Bias test, 0 otherwise  
    garchobj.PEARSON(<40;50;60>);   // Cells for the adjusted Pearson Chi-square GoF test,  
    garchobj.RBD(<10;15;20>);       // Lags for the Residual-Based Diagnostic test of Tse  
    garchobj.FORECAST(0,15,1);      // Arg.1 : 1 to launch the forecasting procedure, 0 otherwize  
                                    // Arg.2 : Number of forecasts  
                                    // Arg.3 : 1 to Print the forecasts, 0 otherwise  
//*** OUTPUT ***//  
    garchobj.MLE(2);                // 0 : MLE (Second derivatives), 1 : MLE (OPG Matrix), 2 : QMLE  
    garchobj.COVAR(0);              // if 1, prints variance-covariance matrix of the parameters.  
    garchobj.ITER(0);               // Interval of iterations between printed intermediary results  
    garchobj.TESTS(0,0);            // Arg. 1 : 1 to run tests PRIOR to estimation, 0 otherwise  
                                    // Arg. 2 : 1 to run tests AFTER estimation, 0 otherwise  
    garchobj.GRAPHS(0,0,"");        // Arg.1 : if 1, displays graphics of the estimations  
                                    // Arg.2 : if 1, saves these graphics in a EPS file  
                                    // Arg.3 : Name of the saved file.  
    garchobj.FOREGRAPHS(1,0,"");    // Same as GRAPHS(p,s,n) but for the graphics of the forecasts.  
//*** PARAMETERS ***//  
    garchobj.BOUNDS(0);             // 1 if bounded parameters wanted, 0 otherwise  
            // Arg.1 : 1 to fix some parameters to their starting values, 0 otherwize  
            // Arg.2 : 1 to fix (see garchobj.DoEstimation(<>)) and 0 to estimate  
            //         the corresponding parameter

//*** ESTIMATION ***//  
            // Arg.1 : 1 to use the MaxSA algorithm of Goffe, Ferrier and Rogers (1994)  
            //         and implemented in Ox by Charles Bos  
            // Arg.2 : dT=initial temperature  
            // Arg.3 : dRt=temperature reduction factor  
            // Arg.4 : iNS=number of cycles  
            // Arg.5 : iNT=Number of iterations before temperature reduction  
            // Arg.6 : vC=step length adjustment  
            // Arg.7 : vM=step length vector used in initial step  
    garchobj.PrintStartValues(0);    // 1: Prints the S.V. in a table form; 2: Individually;  
                                     // 3: in a Ox code to use in StartValues  
    garchobj.STORE(0,0,0,1,1,"01",0);  //  Arg.1,2,3,4,5 : if 1 -> stored.  
                                       //  (Res-SqRes-CondV-MeanFor-VarFor)  
                                       //  Arg.6 : Suffix. The name of the saved series  
                                       //  will be "Res_ARG6".  
                                       //  Arg.7 : if 0, saves as an Excel spreadsheet (.xls).  
                                       //  If 1, saves  as a OxMetrics dataset (.in7)  
    delete garchobj;  

Let us study this file in more detail. The #import statement indicates that this file is linked to the Garch.oxo and Garch.h files. In the body of the file (after the main() instruction), a new Garch object is first created and a database is loaded. The user has to enter the correct path of the database, but also has to pay attention to the structure of the database to be used. For instance, to use a Microsoft Excel file, the format of the spreadsheet is of crucial importance. The following convention has to be adopted when loading an Excel spreadsheet: variables are in columns, columns with variables are labelled, there is an unlabelled column containing the dates (with the form Year-Period) and the data form a contiguous sample. Here is an example:2



1990-1     0.0439    1    0
1990-2    -0.0302    0    0
1990-3     0.0845    0    1

OxMetrics also supports databases having a date variable which provides the dates for daily data (or possibly times for higher frequencies). Then selections are made by date, and the dates shown in the graphics. Dates can be read from and saved to Excel files.

We note then that the dependent variable (Y), the regressor(s) in the mean equation (X) and the regressor(s) in the variance equation (Z) are specified with the Select function. Ox being case-sensitive, the exact name of the variable has to be entered. The second and third arguments denote the starting and ending observations to be considered. By default, “0” and “0” mean that all the observations are selected. From this selection, a sample can be extracted with the SetSelSample function. The arguments are ordered as (StartYear, StartPeriod, EndYear, EndPeriod) and the default (-1, 1, -1, 1) means all the selected observations.3

The GarchEstim.ox file is made up of six parts, each delimited by a //*** ... ***// comment (DATA, SPECIFICATIONS, etc.):4

5.2.2 Running an Ox Program

If you want to use the “Console Version” of G@RCH 6.1, all you need is an Ox executable. To illustrate the various ways to run Ox code, we keep our GarchEstim.ox example.

Command Prompt

First, you can type

oxl GarchEstim

at the command prompt. This method is mainly for users of Ox Console.

Updating the environment

Skip this section if you managed to run the Ox programs in this booklet. Otherwise, under Windows and Unix you may wish to set the PATH environment variable.

The executable (oxl.exe etc.) is in the ox\bin folder, for example by default it is in:

    C:\Program files\OxMetrics5\Ox\bin

So, update your PATH variable if necessary.5

Without these, you can still run GarchEstim.ox, but more typing is needed:

    "C:\Program files\OxMetrics5\Ox\bin\oxl"  
    "-iC:\Program files\OxMetrics5\Ox\include" GarchEstim.ox  

The double quotes are required because of the spaces in the path.


Alternatively, you may use OxEdit.6 The Ox tools should be installed automatically; if not, run ox\bin\oxedit\oxcons.tool. To run the program, use Modules/Ox (for Ox Console) or Modules/OxRun (for OxMetrics), as shown below.


When using OxEdit, OxMetrics is needed to display the graphics but is not mandatory for running the estimation. Alternatively, graphs can be displayed on screen using GnuDraw and GnuPlot. Indeed, G@RCH 6.1 can be combined with GnuDraw, an Ox package for creating GnuPlot graphics7 developed by Charles Bos.8 The interface is completely based on the OxDraw package (even the documentation - gnudraw.html - uses the same structure as Jurgen Doornik’s oxdraw.html).9 GnuDraw is intended to be simple to use, as its syntax is similar to that of the original Ox drawing routines. You just have to add the lines

#include <oxstd.h>
#include <packages/gnudraw/gnudraw.h>
#import <packages/Garch6/Garch>
As GnuPlot can be called automatically from within Ox, ShowDrawWindow can be used, displaying graphs on screen.

See Cribari-Neto and Zarkos (2003) for a comprehensive overview of the GnuDraw package.


A third possibility is to launch Ox programs directly from OxMetrics, as shown below.10 There are different ways for running an Ox code under OxMetrics. The first way is to use OxRun.



A new feature of G@RCH 6.1 (and OxMetrics 5 in general) is that Ox code can be generated.

The Model/Ox Batch Code command (or Alt+O) activates a new dialog box called ‘Generate Ox Code’ that allows the user to select an item for which to generate Ox code.


The code is then opened in a new window and, provided Ox Professional is available, it can be run either from OxMetrics, or from OxEdit or the command line. This option is also available for the ‘Descriptive Statistics’ and ‘Simulation’ modules.

OxBatch_1.ox is an example of Ox Batch code generated by G@RCH 6.1 after the estimation of an APARCH model while OxBatch_2.ox and OxBatch_3.ox are two examples of Ox Batch code generated after the use of the ‘Descriptive Statistics’ and ‘Simulation’ modules.

#include <oxstd.h>  
#import <packages/Garch6/garch>  
#include <oxdraw.h>  
main()  
{  
    //--- Ox code for G@RCH( 3)  
    decl model = new Garch();  
    model.Select(Y_VAR, {"Nasdaq", 0, 0});  
    model.SetSelSampleByDates(dayofcalendar(1984, 10, 12), dayofcalendar(2000, 12, 21));  
    SetDrawWindow("G@RCH Graphics");  
    model.FORECAST(1,10, 1);  
    SetDrawWindow("G@RCH Forecasting");  
    delete model;  
}  

#include <oxstd.h>  
#import <packages/Garch6/garch>  
#include <oxdraw.h>  
#import <lib/testres>  
#include <arma.h>  
#import <Database>  
main()  
{  
    //--- Ox code for G@RCH( 1)  
    decl model = new Garch();  
    model.Select(Y_VAR, {"Nasdaq", 0, 0});  
    model.SetSelSampleByDates(dayofcalendar(1984, 10, 12), dayofcalendar(2000, 12, 21));  
    decl series = model.GetGroup(Y_VAR), names, y, name;  
    model.GetGroupNames(Y_VAR, &names);  
    decl m_cT = rows(series);  
    decl nbser = columns(series);  
    for (decl i = 0; i < nbser; ++i)  
    {  
        y = series[][i];            // i-th selected series  
        name = names[i];  
        println("Series #", i+1, "/", nbser, ": ", name);  
        decl m_cLagArch_test = <2;5;10>;  
        for (decl l = 0; l < sizer(m_cLagArch_test); l++)  
            ArchTest(y, 1, m_cLagArch_test'[l], 0, TRUE);  
        println("Q-Statistics on Raw data");  
        model.BoxPQ(y, <5;10;20;50>, 0);  
        println("Q-Statistics on Squared data");  
        model.BoxPQ(y.^2, <5;10;20;50>, 0);  
    }  
    delete model;  
}  

#include <oxstd.h>  
#import <packages/Garch6/garch>  
#include <oxdraw.h>  
main()  
{  
    //--- Ox code for G@RCH( 4)  
    decl model = new Garch();  
    model.Select(Y_VAR, {"Nasdaq", 0, 0});  
    decl z, eps, sigma2, y, y_all = <>, sigma2_all = <>;  
    decl new_name_simul_y = new array[3];  
    decl new_name_simul_sigma2 = new array[3];  
    for (decl i = 0; i < 3; ++i)  
    {  
        model.Simulate_APARCH(0.05, <0.1>, <0.8>, <0.01>, <2>, z, 0, &eps, &sigma2);  
        y_all ~= eps;               // store the simulated series  
        sigma2_all ~= sigma2;       // and their conditional variances  
    }  
    decl plot = 0;  
    for (decl i = 0; i < 3; ++i)  
    {  
        DrawTMatrix(plot++, y_all[][i]', new_name_simul_y[i]);  
        DrawTMatrix(plot++, sigma2_all[][i]', new_name_simul_sigma2[i]);  
    }  
    decl dbase = new Database();  
    delete dbase;  
    delete model;  
}  

5.3 Advanced Ox Usage

5.3.1 Forecast.ox example

To illustrate the potential of writing Ox code based on G@RCH and its Garch class, we also provide Forecast.ox, an example that estimates an ARMA(1,0)-APARCH(1,1) model on the CAC40. In this program, we analyze the French CAC40 stock index for the years 1995-1999 (1249 daily observations). Daily returns in percentage are defined as 100 times the first difference of the log of the closing prices.

After the estimation of the ARMA(1,0)-APARCH(1,1) model with a skewed-student likelihood on the first 800 observations, 448 one-step-ahead forecasts of the conditional mean and conditional variance are computed.
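This recursive scheme (keep the parameter estimates fixed, extend the sample one observation at a time, and store each one-step-ahead forecast) is generic. The sketch below illustrates it in Python with a plain Gaussian GARCH(1,1) variance recursion rather than the ARMA(1,0)-APARCH(1,1) model of the text; all parameter values are hypothetical.

```python
import numpy as np

def garch11_filter(y, omega, alpha, beta):
    """GARCH(1,1) variance recursion; h[t] is the variance forecast for period t."""
    h = np.empty(len(y) + 1)
    h[0] = np.var(y) if len(y) else omega / (1 - alpha - beta)
    for t in range(len(y)):
        h[t + 1] = omega + alpha * y[t] ** 2 + beta * h[t]
    return h

# simulate a toy return series from hypothetical GARCH(1,1) parameters
rng = np.random.default_rng(0)
omega, alpha, beta = 0.05, 0.10, 0.85
y = np.empty(1248)
h = omega / (1 - alpha - beta)          # start at the unconditional variance
for t in range(len(y)):
    y[t] = np.sqrt(h) * rng.standard_normal()
    h = omega + alpha * y[t] ** 2 + beta * h

# fixed parameters, expanding sample: one one-step-ahead forecast per extension
T0 = 800
forecasts = [garch11_filter(y[:t], omega, alpha, beta)[-1]
             for t in range(T0, len(y))]
```

With 1248 observations and an initial window of 800, the loop produces 448 one-step-ahead variance forecasts, mirroring the setup used in Forecast.ox.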

Forecasting Performance

One of the most popular measures to check the forecasting performance of the ARCH-type models is the Mincer-Zarnowitz regression, i.e. ex-post volatility regression:

σ̌²_t = a₀ + a₁ σ̂²_t + u_t,                                    (5.1)

where σ̌²_t is the ex-post volatility, σ̂²_t the forecasted volatility, and a₀ and a₁ are parameters to be estimated. If the model for the conditional variance is correctly specified (and the parameters are known) and if E(σ̌²_t) = σ̂²_t, then a₀ = 0 and a₁ = 1. The R² of this regression is often used as a simple measure of the degree of predictability of the ARCH-type model.
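As an illustration of regression (5.1), the Python sketch below runs a Mincer-Zarnowitz regression by OLS on simulated data (all numbers are made up): when the proxy is an unbiased but noisy measurement of the forecast, the estimates should land near a₀ = 0 and a₁ = 1.

```python
import numpy as np

def mincer_zarnowitz(vola_proxy, vola_forecast):
    """Regress the ex-post volatility proxy on a constant and the forecast (OLS)."""
    X = np.column_stack([np.ones_like(vola_forecast), vola_forecast])
    a, *_ = np.linalg.lstsq(X, vola_proxy, rcond=None)
    resid = vola_proxy - X @ a
    tss = ((vola_proxy - vola_proxy.mean()) ** 2).sum()
    r2 = 1 - (resid @ resid) / tss
    return a[0], a[1], r2

rng = np.random.default_rng(1)
sig2_for = rng.uniform(0.5, 2.0, 448)                  # hypothetical forecasts
sig2_obs = sig2_for + 0.3 * rng.standard_normal(448)   # noisy ex-post proxy
a0, a1, r2 = mincer_zarnowitz(sig2_obs, sig2_for)
```

A noisier proxy leaves a₀ and a₁ roughly unchanged in expectation but drags the R² down, which is exactly the point made below about squared daily returns.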

But σ̌²_t is never observed. It is thus common to use σ̌²_t = (y_t − ȳ)², where ȳ is the sample mean of y_t. The R² of this regression is often lower than 5%, which could lead to the conclusion that GARCH models produce poor forecasts of the volatility (see, among others, Schwert, 1990, or Jorion, 1996). But, as described in Andersen and Bollerslev (1998a), the reason for these poor results is the choice of what is considered to be the “true” volatility. Instead, they propose to compute the daily realized volatility as the sum of squared intraday returns and to use it as the “true” volatility. Indeed, Andersen and Bollerslev (1998a) show that this measure is more accurate than squared daily returns. Therefore, using h-minute returns (5-minute returns, for instance), the realized volatility can be expressed as:

σ̌²_t = Σ_{k=1}^{K} y²_{k,t},                                   (5.2)

where y_{k,t} is the return of the k-th h-minute interval of day t and K is the number of h-minute intervals per day. See Section 7.3 for more details about realized volatility.
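Equation (5.2) is straightforward to mirror in code: with the K intraday returns of each day stored row-wise, the realized volatility is a row sum of squares. The sketch below uses simulated 15-minute returns (purely illustrative numbers).

```python
import numpy as np

rng = np.random.default_rng(2)
T, K = 5, 28                  # 5 days, 28 intraday intervals per day
intraday = 0.2 * rng.standard_normal((T, K))   # toy 15-minute returns, in %

# Equation (5.2): realized volatility of day t = sum over k of y_{k,t}^2
realized_vol = (intraday ** 2).sum(axis=1)

# for comparison, the squared daily return is a much noisier volatility proxy
daily_return_sq = intraday.sum(axis=1) ** 2
```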

Coming back to our illustration, the CAC40 index is computed by the exchange as a weighted measure of the prices of its components and was originally available in our database on an intraday basis, with the price index being computed every 15 minutes. For the time period under review, the opening hours of the French stock market were 10.00 am to 5.00 pm, thus 7 hours of trading per day. This translates into 28 intraday returns used to compute the daily realized volatility.11 Because the exchange is closed from 5.00 pm to 10.00 am the next day, the first intraday return is the difference between the log price at 10.15 am and the log price at 5.00 pm the day before. The intraday data are then used to compute the daily realized volatility as the sum of the 28 squared intraday returns, as shown in Equation (5.2).

Finally, to compare the adequacy of the different distributions in the selected forecasting model, a well-known tool is the density forecast test developed in Diebold, Gunther, and Tay (1998). The idea of density forecasts is quite simple.12 Let {f_i(y_i|Ω_i)}_{i=1}^m be a sequence of m one-step-ahead density forecasts produced by a given model, where Ω_i is the conditioning information set, and let {p_i(y_i|Ω_i)}_{i=1}^m be the sequence of densities defining the Data Generating Process of y_i (which is never observed). Testing whether the forecast density is a good approximation of the true density p(·) is equivalent to testing:

H₀ : {f_i(y_i|Ω_i)}_{i=1}^m = {p_i(y_i|Ω_i)}_{i=1}^m.          (5.3)

Diebold, Gunther, and Tay (1998) use the fact that, under Equation (5.3), the probability integral transform ζ̂_i = ∫_{−∞}^{y_i} f_i(t) dt is i.i.d. U(0,1), i.e. independent and identically distributed uniform. To check H₀, they propose to use both a goodness-of-fit test and an independence test for i.i.d. U(0,1). The i.i.d.-ness property of ζ̂_i can be evaluated by plotting the correlograms of (ζ̂_i − ζ̄)^j, for j = 1,2,3,4,..., to detect potential dependence in the conditional mean, variance, skewness, kurtosis, etc. Departure from uniformity can also be evaluated by plotting a histogram of ζ̂_i. According to Bauwens, Giot, Grammig, and Veredas (2000), a humped shape of the ζ̂-histogram would indicate that the issued forecasts are too narrow and that the tails of the true density are not accounted for. On the other hand, a U-shape of the histogram would suggest that the model issues forecasts that either under- or overestimate too frequently. Moreover, Lambert and Laurent (2001) show that an inverted-S shape of the histogram would indicate that the errors are skewed, i.e. the true density is probably not symmetric.13 An illustration is provided in Section 5 with some formal tests and graphical tools.
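To make the probability integral transform concrete, here is a small standard-library Python sketch: data are drawn from the same N(0,1) density that serves as the forecast density, so H₀ holds by construction and the ζ̂ series should look i.i.d. U(0,1); a Pearson statistic over g equiprobable histogram cells then behaves like a χ²(g−1) draw (this mimics the idea of the adjusted Pearson goodness-of-fit test reported later, not its exact implementation in G@RCH).

```python
import math
import random

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

random.seed(3)
m = 2000
# draws from N(0,1), evaluated through the N(0,1) forecast CDF: H0 holds
zeta = [norm_cdf(random.gauss(0.0, 1.0)) for _ in range(m)]

# departure from uniformity: Pearson statistic over g equiprobable cells
g = 20
expected = m / g
counts = [0] * g
for z in zeta:
    counts[min(int(z * g), g - 1)] += 1
pearson = sum((c - expected) ** 2 / expected for c in counts)
```

Under H₀ the statistic fluctuates around g − 1 = 19; a misspecified forecast density (e.g. a Gaussian fit to fat-tailed data) would inflate it and produce the humped or U-shaped histograms described above.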


The Mincer-Zarnowitz regression and some out-of-sample density forecast tests (as suggested by Diebold, Gunther, and Tay, 1998) are also performed.

The program Forecast.ox is printed in the next box. This code has been used to produce Figure 5.1 and the output labelled “Density Forecast Test on Standardized Forecast Errors”. In the first four panels of Figure 5.1, we show the correlograms of (ζ̂_i − ζ̄)^j, for j = 1,2,3,4, where ζ̂ is the probability integral transform for the out-of-sample period (see the paragraph below Equation (5.3) for more details). This graphical tool has been proposed by Diebold, Gunther, and Tay (1998) to detect potential remaining dependence in the conditional mean, variance, skewness and kurtosis. In our example, the probability integral transform seems to be independently distributed.

#import <packages/Garch6/Garch>  
main() {  
    decl garchobj;  
    garchobj = new Garch();  
//*** DATA ***//  
    garchobj.Select(Y_VAR, {"CAC40",0,0} );  
    garchobj.SetSelSample(-1, 1, 800, 1);  
//*** SPECIFICATIONS ***//  
    garchobj.DISTRI(3);             // 0 for Gauss, 1 for Student, 2 for GED, 3 for Skewed-Student  
    garchobj.ARMA_ORDERS(1,0);      // AR order (p), MA order (q).  
    garchobj.GARCH_ORDERS(1,1);     // p order, q order  
    garchobj.MODEL(4);              // 0:RISKMETRICS  1:GARCH      2:EGARCH    3:GJR   4:APARCH  
                                    // 5:IGARCH 6:FIGARCH-BBM    7:FIGARCH-CHUNG   8:FIEGARCH  
                                    // 9:FIAPARCH-BBM   10: FIAPARCH-CHUNG  11: HYGARCH  
    garchobj.MLE(1);                // 0 : MLE (Second derivatives), 1 : MLE (OPG Matrix), 2 : QMLE  
    garchobj.Initialization(<>) ;  
    garchobj.Output() ;  
    decl forc=<>,h,yfor=<>,shape=<>;  
    decl number_of_forecasts=448; // number of h_step_ahead forecasts  
    decl step=1;                  // specify h (h-step-ahead forecasts)  
    decl T=garchobj.GetcT();  
    println("!!! Please Wait while computing the forecasts !!!");  

    decl distri=garchobj.GetDistri();  
    if (distri==1 || distri==2)  
    else if (distri==3)  
    for (h=0; h<number_of_forecasts; ++h)  
    {  
        garchobj.SetSelSample(-1, 1, T+h, 1);  
        yfor |= garchobj.GetForcData(Y_VAR, step);  
    }  
    decl Hfor = garchobj.GetVar("REALVOLA")[801:];  // Realized volatility  
//  decl Hfor = (yfor - meanc(yfor)).^2;            // Squared returns (in deviation)  
    decl cd=garchobj.CD(yfor-forc[][0],forc[][1],garchobj.GetDistri(),shape);  
    println("Density Forecast Test on Standardized Forecast Errors");  
    garchobj.AUTO(cd, number_of_forecasts, -0.1, 0.1, 0);  
    DrawTitle(5, "Conditional variance forecast and realized volatility");  
    Draw(5, (Hfor~forc[][1])’);  
    garchobj.MZ(Hfor, forc[][1], number_of_forecasts);  
    garchobj.FEM(forc, yfor~Hfor);  
    savemat("MeanFor.xls",forc[][0]);     // Saves the mean forecasts in an Excel file.  
    savemat("VarFor.xls",forc[][1]);      // Saves the variance forecasts in an Excel file.  
    delete garchobj;  
}  


Figure 5.1: GARCH Models - Forecasting Analysis of the CAC40

Panel 5 of Figure 5.1 also shows the histogram (with 30 cells) of ζ̂ with the 95% confidence bands. From this figure, it is clear that the ARMA(1,0)-APARCH(1,1) model coupled with a skewed-Student distribution for the innovations performs very well on the dataset we have investigated. This conclusion is reinforced by the results reported in the Density Forecast Test on Standardized Forecast Errors box. The adjusted Pearson Chi-square goodness-of-fit test provides a statistical version of the graphical test presented in Figure 5.1. In addition, the program also performs the Mincer-Zarnowitz regression (see Equation (5.1)) that regresses the observed volatility (in our case the realized volatility) on a constant and a vector of 448 one-step-ahead forecasts of the conditional variance (produced by the APARCH model).14 These results suggest that the APARCH model gives good forecasts of the conditional variance. Based on the significance of the β parameter of the regression, one can indeed hardly conclude that the APARCH model provides biased forecasts. Moreover, the R² of this regression is about 33%, versus 14% when we use the daily squared returns (in deviation) as a proxy for the observed volatility.15

                        Density Forecast Test on Standardized Forecast Errors  
Adjusted Pearson Chi-square Goodness-of-fit test  
 Cells(g)  Statistic      P-Value(g-1)     P-Value(g-k-1)  
    20       21.0179         0.335815          0.020969  
    30       26.5089         0.598181          0.149654  
Rem.: k = 9 = Number of estimated parameters  
Probability Integral Transform The Lower and Upper bounds are  
0.46875 and 1.54018  
Mincer-Zarnowitz regression on the forecasted volatility  
                  Coefficient  Std.Error  t-value  t-prob  
Alpha                0.027698    0.23253   0.1191  0.9052  
Beta                 1.239532    0.16052    7.722  0.0000  
R2: 0.329662 Note: S.E. are  
Heteroskedastic Consistent (White, 1980)  

5.3.2 Imposing Nonlinear Constraints

As mentioned in Section 3.9, the inspected range of the parameter space is ]−∞; +∞[ when numerical optimization is used to maximize the log-likelihood function with respect to the vector of parameters Ψ. Sometimes these parameters have to be constrained to a smaller interval.
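A standard way to keep a parameter inside a finite interval while letting the optimizer search over ]−∞; +∞[ is a logistic reparameterization. Whether G@RCH uses exactly this transform internally is not stated here, so treat the sketch below as a generic illustration of the idea.

```python
import math

def to_bounded(x, lb, ub):
    """Map an unconstrained x in (-inf, inf) into (lb, ub) via a logistic transform."""
    return lb + (ub - lb) / (1.0 + math.exp(-x))

def to_unbounded(p, lb, ub):
    """Inverse transform: recover the unconstrained value from p in (lb, ub)."""
    return math.log((p - lb) / (ub - p))

# e.g. keep the GARCH constant omega inside its default [0, 100] interval
x = 1.3                         # what the optimizer actually manipulates
p = to_bounded(x, 0.0, 100.0)   # what enters the likelihood
```

The optimizer then works on x without ever producing an inadmissible value of p, and the estimate is mapped back with to_unbounded.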

We have shown how to impose these lower and upper bounds and how to change these bounds in the OxPack version of G@RCH (see Section 4.9).

The same feature is available for the Console version. Furthermore, the Console version allows the user to impose any nonlinear constraint on the parameters (like the stationarity constraints).

The example StartValues_Bounds.ox shows how to print the code needed to change the starting values and the bounds for an ARMA(0,0)-GARCH(1,1) with a Student likelihood.

                                                                      First Step : StartValues_Bounds.ox  
#import <packages/Garch6/Garch>  
StartValues(const object)  
{  
    object.GetPara();                                   // DO NOT REMOVE THIS LINE  
    object.Initialization(object.GetValue("m_vPar"));   // DO NOT REMOVE THIS LINE  
}  
main()  
{  
    decl garchobj;  
    garchobj = new Garch();  
//*** DATA ***//  
    garchobj.Select(Y_VAR, {"Nasdaq",0,0});  
    garchobj.SetSelSample(-1, 1, 2000, 1);  
//*** SPECIFICATIONS ***//  
    garchobj.CSTS(1,1);             // cst in Mean (1 or 0), cst in Variance (1 or 0)  
    garchobj.DISTRI(1);             // 0 for Gauss, 1 for Student, 2 for GED, 3 for Skewed-Student  
    garchobj.ARMA_ORDERS(0,0);      // AR order (p), MA order (q).  
    garchobj.GARCH_ORDERS(1,1);     // p order, q order  
    garchobj.MODEL("GARCH");        //  0:RISKMETRICS  1:GARCH     2:EGARCH    3:GJR   4:APARCH  
                                    //  5:IGARCH 6:FIGARCH-BBM   7:FIGARCH-CHUNG   8:FIEGARCH  
                                    //  9:FIAPARCH-BBM  10: FIAPARCH-CHUNG  11: HYGARCH  
//*** OUTPUT ***//  
    garchobj.MLE(2);                // 0 : MLE (Second derivatives), 1 : MLE (OPG Matrix), 2 : QMLE  
//*** PARAMETERS ***//  
    garchobj.BOUNDS(1);             // 1 if bounded parameters wanted, 0 otherwise  
//*** ESTIMATION ***//  
    garchobj.PrintBounds(2);          // 1: Prints the bounds in a matrix form,  
                                      // 2: to write the Ox code to use in StartValues  
    garchobj.PrintStartValues(3);     // 1: Prints the S.V. in a table form;  
                                      // 2: Individually;  3: in a Ox code to use in StartValues  
    delete garchobj;  
}  

Let us focus now on the following lines:

//*** PARAMETERS ***//  
    garchobj.BOUNDS(1);               // 1 if bounded parameters wanted, 0 otherwise  
//*** ESTIMATION ***//  
    garchobj.PrintBounds(2);          // 1: Prints the bounds in a matrix form,  
                                      // 2: to write the Ox code to use in StartValues  
    garchobj.PrintStartValues(3);     // 1: Prints the S.V. in a table form;  
                                      // 2: Individually;  3: in a Ox code to use in StartValues  

The function BOUNDS(1) selects the constrained optimizer. PrintBounds(2) prints the code needed to change the bounds while PrintStartValues(3) prints the code needed to change the starting values. The function exit(0) stops the program before starting the estimation (since the starting values and the bounds have to be modified).

The output labelled First Step : Output indicates that the default starting value of the degree of freedom of the Student density (m_cV) is 6 and that m_calpha0 (i.e. ω, the constant of the GARCH model) is bounded between 0 and 100.

                                                          First Step : Output  
object.SetBounds("m_clevel", -100, 100);  
object.SetBounds("m_calpha0", 0, 100);  
object.SetBounds("m_valphav", 0, 1);  
object.SetBounds("m_vbetav", 0, 1);  
object.SetBounds("m_cV", 2, 100);  
Starting Values  

To change the starting value of m_cV to, say, 5 and to fix the bounds of m_calpha0 to [-100;100] (instead of [0;100]), just copy-paste the line

object.SetBounds("m_calpha0", 0, 100);  

at the top of the function StartValues(const object) and change the values as shown below:16

#import <packages/Garch6/Garch>  
StartValues(const object)  
    object.SetBounds("m_calpha0", -100, 100);  

Furthermore, nonlinear constraints (like the stationarity constraint of the GARCH (1,1) model, i.e. α1 + β1 < 1) can be imposed during the estimation in the Console version. The example Stat_Constr_GARCH shows how to impose this restriction.

#import <packages/Garch6/Garch>  
StartValues(const object)  
{  
    object.GetPara();                                   // DO NOT REMOVE THIS LINE  
    object.Initialization(object.GetValue("m_vPar"));   // DO NOT REMOVE THIS LINE  
}  
class GARCH_C : Garch                       // This defines a new class called GARCH_C  
{  
    GARCH_C();                              // constructor  
    cfunc_gt0(const avF, const vP);         // the function evaluating the constraints  
};  
GARCH_C::GARCH_C()  
{  
    this.Garch();                           // GARCH_C is a Garch object and thus  
    m_iModelClass = MC_GARCH;               // inherits all the properties  
    Init_Globals();                         // of the Garch class  
}  
GARCH_C::cfunc_gt0(const avF, const vP)  
{  
    SplitPara(vP);  // Do not remove this line  
    decl matconstr=new matrix[1][1];  
    matconstr[0] = 1.000001 - m_valphav[0] - m_vbetav[0];     // alpha_1 + beta_1 < 1  
    avF[0] = matconstr;  
    return 1;  
}  
main()  
{  
    decl garchobj;  
    garchobj = new GARCH_C();  
    garchobj.GARCH_ORDERS(1,1);     // p order, q order  
    garchobj.MODEL(1);              //  0:RISKMETRICS  1:GARCH     2:EGARCH    3:GJR   4:APARCH  
                                    //  5:IGARCH 6:FIGARCH(BBM) 7:FIGARCH(Chung) 8:FIEGARCH(BM only)  
                                    //  9:FIAPARCH(BBM) 10: FIAPARCH(Chung) 11: HYGARCH(BBM)  
//*** OUTPUT ***//  
    garchobj.MLE(1);                // 0 : both, 1 : MLE, 2 : QMLE  
//*** PARAMETERS ***//  
    garchobj.BOUNDS(1);             // 1 if bounded parameters wanted, 0 otherwise  
//*** ESTIMATION ***//  
    garchobj.Initialization(<>) ;  
    garchobj.DoEstimation() ;  
    garchobj.Output() ;  
    delete garchobj;  
}  

In Stat_Constr_GARCH, a new object GARCH_C is created and inherits all the members and functions of the GARCH class. The option Bounds(1) is selected while the virtual function cfunc_gt0 is replaced by the function that imposes the corresponding restrictions:

GARCH_C::cfunc_gt0(const avF, const vP)  
{  
    SplitPara(vP);  // Do not remove this line  
    decl matconstr=new matrix[1][1];  
    matconstr[0] = 1.000001 - m_valphav[0] - m_vbetav[0];  
                   // alpha_1 + beta_1 < 1  
    avF[0] = matconstr;  
    return 1;  
}  

In this case the nonlinear constraint α1 + β1 < 1 is evaluated by the cfunc_gt0 function.17
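The contract of cfunc_gt0 (return a vector of constraint values that the optimizer must keep strictly positive) translates directly into other languages. The Python sketch below mirrors the GARCH(1,1) stationarity constraint above, with hypothetical parameter values.

```python
# cfunc_gt0 idea: return the constraint values that must stay strictly positive
def cfunc_gt0(params):
    omega, alpha1, beta1 = params
    return [1.000001 - alpha1 - beta1]      # positive iff alpha_1 + beta_1 < 1

def feasible(params):
    """A parameter vector is admissible when every constraint value is positive."""
    return all(c > 0 for c in cfunc_gt0(params))

print(feasible((0.05, 0.10, 0.85)))   # stationary: alpha_1 + beta_1 = 0.95
print(feasible((0.05, 0.30, 0.75)))   # not stationary: alpha_1 + beta_1 = 1.05
```

Adding a constraint is just a matter of appending another entry to the returned vector, exactly as the FIGARCH example below stacks several inequalities into matconstr.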

Indeed, the cfunc_gt0 argument can be zero, or a function evaluating the nonlinear constraints (which will be constrained to be positive) with the following format:

cfunc_gt0(const avF, const vP);  
    avF     in:  address  
            out: m x 1 matrix with the constraints evaluated at vP  
    vP      in:  p x 1 matrix with the coefficients  
    returns 1: successful, 0: constraint evaluation failed

In Stat_Constr_FIGARCH.ox, the stationarity constraints of the FIGARCH(1,d,0) and FIGARCH(1,d,1) models are imposed during the estimation. Interestingly, imposing the constraint was found to speed up the estimation by about 25% (while the estimates are exactly the same as without it).

GARCH_C::cfunc_gt0(const avF, const vP)  
{  
    SplitPara(vP);  // Do not remove this line  
    decl matconstr;  
    if ((m_cMod==6) && (m_cP == 1) && (m_cQ < 2))         // FIGARCH-BBM  
    {  
        matconstr=new matrix[3][1];  
        decl a,b,c,d;  
        // Remark: FIGARCH(1,d,0) --> m_valphav = 0  
        a = m_vbetav - m_dD;  
        b = ((2-m_dD)/3);  
        c = m_dD * (m_valphav - ((1-m_dD)/2));  
        d = m_vbetav * (m_valphav - m_vbetav + m_dD);  
        matconstr[0][0] =  m_valphav - a + 0.000001;  // beta_1 - d <= phi_1  
        matconstr[1][0] =  b - m_valphav + 0.000001;  // phi_1 <= (2-d)/3  
        matconstr[2][0] =  d - c + 0.000001;          // d*[phi_1-(1-d)/2] <= beta_1*(phi_1-beta_1+d)  
    }  
    else if ((m_cMod==7) && (m_cP == 1) && (m_cQ < 2))    // FIGARCH-Chung  
    {  
        // Remark: FIGARCH(1,d,0) --> m_valphav = 0  
        if (m_cQ == 0)  
        {  
            matconstr = new matrix[3][1];  
            matconstr[0][0] =  m_vbetav;              // 0 <= beta_1  
        }  
        else  
        {  
            matconstr = new matrix[4][1];  
            matconstr[0][0] =  m_vbetav - m_valphav;  // phi_1 <= beta_1  
            matconstr[3][0] =  m_valphav;             // 0 <= phi_1  
        }  
        matconstr[1][0] =  m_dD - m_vbetav;           // beta_1 <= d  
        matconstr[2][0] =  1 - m_dD;                  // d <= 1  
    }  
    avF[0] = matconstr;  
    return 1;  
}  

5.4 G@RCH and OxGauss

OxGauss is a program available with recent versions of Ox. It provides a way to run Gauss18 programs in the Ox environment or to call existing Gauss procedures from Ox.

Depending on the goal of the analysis and the user’s experience, both features are noteworthy and useful. From an Ox-G@RCH user point of view, the main objective of OxGauss is to allow existing Gauss programs to be called from Ox with only a minimum number of changes to these programs. This is beneficial to both Ox and Gauss users. It provides more visibility to both and hence increases the potential use of the underlying statistical technique. Furthermore, it can help with the migration from Gauss to Ox.

Running a pure Gauss code with OxGauss is attractive for non-Gauss and potentially even for non-Ox users because it allows the replication of published work using the console version of Ox. This is an interesting feature since the replicability of simulation and empirical results in econometrics is recognized as being an important aspect of research. An increasing number of researchers in econometrics are making their programs and routines freely available to the econometrics community. As such, OxGauss also provides much added value in that it provides the researcher with a free and rather simple solution to run Gauss programs. See Laurent and Urbain (2003) for a review of OxGauss and the M@ximize web site at

The main goal of this section is to illustrate the usefulness of OxGauss for G@RCH users.

5.4.1 Calling Gauss Programs from Ox

The first use of OxGauss is to allow Gauss procedures to be called from Ox. This helps in the transition to Ox, and it increases the amount of code available to Ox users.

To illustrate how Gauss programs can be called from Ox, we consider a small project that mixes both Gauss and Ox programs. The first file, Gaussprocs.src, consists of a code file that features the procedure gengarch(omega,alpha,beta,nu,T_0,T,n), which simulates a GARCH model. This procedure has been written by Dick van Dijk (see Franses and van Dijk, 2000) and is downloadable from his web site

**  gengarch.src (written by dick van dijk)  
**  purpose :  generate realizations from garch(p,q) processes  
**  format : eps = gengarch(omega,alpha,beta,t_0,t,n);  
**  input : omega   : scalar, constant in conditional variance equation  
**          alpha   : (qx1) vector, parameters of lagged squared residuals  
**                    in conditional variance equation  
**          beta    : (px1) vector, parameters of lagged conditional variance  
**                    in conditional variance equation  
**          nu      : scalar, indicating whether disturbances are to be  
**                    drawn from standard normal distribution (nu=0), or from  
**                    t-distribution with nu degrees of freedom  
**          t_0     : scalar, number of initial observations to be discarded  
**          t       : scalar, the length of the series to be generated  
**          n       : scalar, the number of generated series  
**  output  : eps  : t x n, the generated series  
proc gengarch(omega,alpha,beta,nu,t_0,t,n);  
  local p,q,eps_1,h_1,eps,i,z;  
p=rows(beta); q=rows(alpha);  
if (rows(n) gt 1);  
/* starting values for lagged residuals are set equal to zero, while lagged  
   conditional variances are set equal to unconditional variance */  
if (sumc(alpha)+sumc(beta) /= 1);  
do until (i==t_0+t);  
  h_1=(omega + alpha'(eps_1.*eps_1) + beta'h_1)|trimr(h_1,0,1);  
  if (nu==-1);  
  elseif (nu==0);  

To call this procedure from Ox codes, one first has to create a header file. This header file allows the declaration of the functions, constants and external variables so that these are known when required. This is also mandatory to avoid compilation errors in Ox, since functions and global variables have to be explicitly declared before their use. The header file corresponding to our example is Gaussprocs.h.

#include <oxstd.h>  
namespace gauss  
{  
    gengarch(const omega, const alpha, const beta, const nu, const T_0,  
             const T, const n);  
    // Add new procedures here  
}  

Additional procedures can be added in Gaussprocs.src, but the header file has to be modified accordingly.20 It is recommended to use the .src extension for the Gauss programs and .h for the header files.

In the example /packages/OxGauss/GarchEstim.ox, we use a Gauss procedure to generate 20,000 observations from a GARCH(1,1) process with Student-t errors. Then, we rely on G@RCH to estimate a GARCH(1,1) model by Gaussian Quasi-Maximum likelihood. To do this, the Gauss code must be imported into the Ox program, along with the G@RCH package. The #import command has been extended so that OxGauss imports are defined by prefixing the file name with gauss::.

#include <oxstd.h>  
#import <packages/Garch6/Garch>  
#import "gauss::Gaussprocs"                                         //  <---  
main()  
{  
    decl omega=0.2; decl alpha=0.1; decl beta=0.8; decl nu=10;  
    decl T_0=1000; decl T=20000; decl n=1;  
    decl y=gauss::gengarch(omega,alpha,beta,nu,T_0,T,n);            //  <---  
    decl garchobj;  
    garchobj = new Garch();  
    garchobj.Create(1, 1, 1, T, 1);  
    garchobj.Append(y, "Y");  
    garchobj.Select(Y_VAR, {"Y",0,0} );  
    garchobj.SetSelSample(-1, 1, -1, 1);  
    garchobj.DISTRI(0);             // 0 for Normal  
    garchobj.GARCH_ORDERS(1,1);     // p order, q order  
    garchobj.MODEL(1);              // 1: GARCH  
    delete garchobj;  
}  

Note that when OxGauss functions or variables are accessed, they must also be prefixed with the identifier gauss::.

5.4.2 Understanding OxGauss

When an OxGauss program is run, it automatically includes the ox\include\oxgauss.ox file. This by itself imports the required files:21

/* oxgauss.ox - master file for functions mapped from Gauss to Ox  
**      (C) Jurgen Doornik 2000-2001  
** this file is automatically inserted when OxGauss code is run  
*/  
#define OX_GAUSS  
#import <g2ox>  
#import <gauss::oxgauss>

These import statements ensure that ox\include\g2ox.h and oxgauss.h are being included. Most of the OxGauss run-time system is in ox\include\g2ox.ox while the keywords are largely in oxgauss.src.

Most of the programs that link Gauss functions to Ox are gathered in the file \include\g2ox.ox. For instance, the output of the Gauss function cumprodc(x) is an N ×K matrix with the cumulative products of the columns of the N ×K matrix x. The Ox code given below (copied from the file g2ox.ox) shows how OxGauss interprets this function.

                                                              part of g2ox.ox  
cumprodc(const mx)  
{  
    return ::cumprod(mx);  
}  

As indicated in this example, OxGauss does not translate the Gauss code into Ox. Instead, it links the Gauss function (here cumprodc) to its Ox counterpart (cumprod). When no corresponding Ox function exists, the body of the mapping procedure contains Ox code that reproduces what the original Gauss function computes. It is important to note that not all Gauss functions are supported by OxGauss. For instance, there is no equivalent of the Gauss function intgrat2 (for the computation of double integrals) in Ox 5. For this reason, the corresponding procedure in g2ox.ox just reports the error message intgrat2() unsupported. However, if such a function becomes available in a future version of Ox, mapping intgrat2 to the corresponding function in Ox will be very easy.
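To make the mapping concrete, the following Python sketch (not part of G@RCH or OxGauss; numpy's cumprod plays the role of Ox's cumprod here) reproduces what cumprodc computes, namely column-wise cumulative products:

```python
import numpy as np

def cumprodc(x):
    # Column-wise cumulative product of an N x K matrix: the operation
    # that OxGauss maps the Gauss function cumprodc() onto.
    return np.cumprod(np.asarray(x, dtype=float), axis=0)

m = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
print(cumprodc(m))
# column 0: 1, 3, 15; column 1: 2, 8, 48
```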

5.4.3 Graphics Support in OxGauss

An important aspect of OxGauss is that it supports most of the graphical features of the Gauss library pgraph. As for the standard functions (see Section 5.4.2), the file \oxgauss\src\pgraph.ox now makes the link between pgraph and the Ox graphical package (oxdraw). For instance, the Gauss function xy() is linked to its Ox counterpart DrawXMatrix().

Here is an example of a Gauss program (grGauss.prg) that draws a simple graph.

library pgraph;  
x = seqa(1, 1, 100);    /* hypothetical data: not shown in the original listing */  
y = rndn(100, 1);  
ylabel("Y-axis: Normal(0,1) draws");  
call xy(x, y);  

However, only Ox Professional for Windows supports on-screen graphics (through OxMetrics). By default, non-Windows versions of Ox and Ox Console have no support for graphs. Nevertheless, the user can rely on the Ox package GnuDraw, developed by Charles Bos, which allows the creation of gnuplot graphics from Ox (see Section 5.2.2). Interestingly, GnuDraw allows the use of the console version for a quick check of the graphical output of Gauss code. Therefore, academic institutions do not have to license the full professional version of Ox if they only need to replicate Gauss programs.

Chapter 6
Value-at-Risk (VaR) estimation using G@RCH

In recent years, the tremendous growth of trading activity and the widely publicized trading losses of well-known financial institutions (see Jorion, 2000, for a brief history of these events) have led financial regulators and supervisory authorities to favor quantitative techniques which appraise the possible loss that these institutions can incur. Value-at-Risk has become one of the most sought-after techniques as it provides a simple answer to the following question: with a given probability (say α), what is my predicted financial loss over a given time horizon? The answer is the VaR at level α, which gives an amount in the currency of the traded assets (in dollar terms for example) and is thus easily understandable.

It turns out that the VaR has a simple statistical definition: the VaR at level α for a sample of returns is defined as the corresponding empirical quantile at α%. Because of the definition of the quantile, we have that, with probability 1 -α, the returns will be larger than the VaR. In other words, with probability 1 -α, the losses will be smaller than the dollar amount given by the VaR.1 From an empirical point of view, the computation of the VaR for a collection of returns thus requires the computation of the empirical quantile at level α of the distribution of the returns of the portfolio.
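The empirical-quantile definition can be sketched directly in a few lines of Python (simulated returns, not the NASDAQ series used later):

```python
import numpy as np

rng = np.random.default_rng(0)
returns = rng.standard_normal(10_000)   # simulated daily returns (in %)

alpha = 0.05
var_long = np.quantile(returns, alpha)        # left tail: long positions
var_short = np.quantile(returns, 1 - alpha)   # right tail: short positions

# With probability 1 - alpha, the return is larger than var_long,
# i.e. the loss of a long position is smaller than |var_long|.
print(var_long, var_short)
```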

In this section, we show how to use G@RCH to predict the VaR and test the adequacy of the selected model in forecasting the VaR of the investigated series. The material used in the current section is based on Giot and Laurent (2003) and is devoted to showing that models relying on a symmetric density for the error term can underperform with respect to skewed density models when the left and right tails of the distribution of returns must be modelled. Thus, VaR for traders having both long and short positions is not adequately modelled using the usual normal or Student distributions. We suggest using an APARCH model based on the skewed-Student distribution to fully take into account the fat left and right tails of the returns distribution. This allows for an adequate modelling of large returns defined on long and short trading positions. The performance of the model is assessed on daily data of the NASDAQ (11/10/1984 - 21/12/2000).

6.1 VaR Models

To characterize the models, we consider a collection of daily returns (in %), yt = 100 [log(pt) - log(pt-1)], where t = 1, ..., T, and pt is the price at time t. Because daily returns are known to exhibit some serial autocorrelation, we fit an AR(2) structure on the yt series for all specifications. Accordingly, the conditional mean of yt, i.e. μt, is equal to μ + ψ1(yt-1 - μ) + ψ2(yt-2 - μ). We now consider several specifications for the conditional variance of εt.

6.1.1 RiskMetricsTM

In its simplest form, it can be shown that the basic RiskMetricsTM model is equivalent to a normal Integrated GARCH (IGARCH) model where the autoregressive parameter is set at a pre-specified value λ and the coefficient of ε²t-1 is equal to 1 - λ. In the RiskMetricsTM specification for daily data, λ is fixed to 0.94 and we then have:

εt = σtzt

where zt is IID N(0,1) and σt2 is defined as in Equation (4.14).
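A minimal sketch of the resulting variance recursion, σ²t = λσ²t-1 + (1 - λ)ε²t-1 with λ = 0.94, started at the sample mean of the squared residuals (as in the G@RCH output below); the residual series here is a hypothetical toy example:

```python
import numpy as np

def riskmetrics_sigma2(eps, lam=0.94):
    # EWMA / IGARCH(1,1) recursion:
    #   sigma2_t = lam * sigma2_{t-1} + (1 - lam) * eps_{t-1}^2,
    # started at the sample mean of the squared residuals.
    sigma2 = np.empty(len(eps))
    sigma2[0] = np.mean(eps ** 2)
    for t in range(1, len(eps)):
        sigma2[t] = lam * sigma2[t - 1] + (1.0 - lam) * eps[t - 1] ** 2
    return sigma2

eps = np.array([1.0, -2.0, 0.5, 1.5])    # toy residuals
print(riskmetrics_sigma2(eps))
```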

The long side of the daily VaR is defined as the VaR level for traders having long positions in the relevant equity index: this is the “usual” VaR where traders incur losses when negative returns are observed. Correspondingly, the short side of the daily VaR is the VaR level for traders having short positions, i.e. traders who incur losses when stock prices increase.2

How good a model is at predicting long VaR is thus related to its ability to model large negative returns, while its performance regarding the short side of the VaR is based on its ability to take into account large positive returns.

For the RiskMetricsTM model, the one-step-ahead VaR computed at time t - 1 for long trading positions is given by μt + zασt, while for short trading positions it is equal to μt + z1-ασt, with zα being the left quantile at α% of the normal distribution and z1-α the right quantile at α%.3
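These two formulas can be sketched with the standard-normal quantile from the Python standard library (the μt and σt values below are purely illustrative):

```python
from statistics import NormalDist

def normal_var(mu_t, sigma_t, alpha=0.05):
    # Long VaR:  mu_t + z_alpha * sigma_t   (z_alpha < 0 for alpha < 0.5)
    # Short VaR: mu_t + z_{1-alpha} * sigma_t, with z_{1-alpha} = -z_alpha
    z_alpha = NormalDist().inv_cdf(alpha)
    return mu_t + z_alpha * sigma_t, mu_t - z_alpha * sigma_t

var_long, var_short = normal_var(0.045, 1.0, 0.05)   # hypothetical mu_t, sigma_t
print(var_long, var_short)
```

For the normal distribution the two quantiles are mirror images, so the long and short VaR are symmetric around μt.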

6.1.2 Normal APARCH

The normal APARCH (Ding, Granger, and Engle, 1993) is an extension of the GARCH model of Bollerslev (1986). It is a very flexible ARCH-type model as it nests at least seven GARCH specifications (see Section 4.4). The APARCH(1,1) is:

σt^δ = ω + α1 (|εt-1| - γ1 εt-1)^δ + β1 σt-1^δ         (6.2)
where ω, α1, γ1, β1 and δ are parameters to be estimated.

For the normal APARCH model, the one-step-ahead VaR is computed as for the RiskMetricsTM model, except that the conditional standard deviation σt is now given by Equation (6.2) (evaluated at its MLE).
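The recursion of Equation (6.2) can be sketched as follows; the parameter values are hypothetical (chosen close in magnitude to the estimates reported below), and the example only illustrates the leverage effect implied by γ1 > 0:

```python
import numpy as np

def aparch_sigma(eps, omega, alpha1, gamma1, beta1, delta, sigma0=1.0):
    # APARCH(1,1) recursion of Equation (6.2):
    #   sigma_t^delta = omega + alpha1 * (|eps_{t-1}| - gamma1*eps_{t-1})^delta
    #                   + beta1 * sigma_{t-1}^delta
    sig_d = np.empty(len(eps))
    sig_d[0] = sigma0 ** delta
    for t in range(1, len(eps)):
        sig_d[t] = (omega
                    + alpha1 * (abs(eps[t - 1]) - gamma1 * eps[t - 1]) ** delta
                    + beta1 * sig_d[t - 1])
    return sig_d ** (1.0 / delta)

# With gamma1 > 0, a negative shock raises next-period volatility more
# than a positive shock of the same size (leverage effect):
p = dict(omega=0.045, alpha1=0.17, gamma1=0.29, beta1=0.80, delta=1.36)
s_neg = aparch_sigma(np.array([-1.0, 0.0]), **p)[1]
s_pos = aparch_sigma(np.array([+1.0, 0.0]), **p)[1]
print(s_neg, s_pos)
```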

6.1.3 Student APARCH

Previous empirical studies on VaR have shown that models based on the normal distribution usually cannot fully take into account the ‘fat tails’ of the distribution of the returns. To alleviate this problem, the Student APARCH (or ST APARCH) is introduced:

εt = σtzt

where zt is IID t(0,1,υ) and σt is defined as in Equation (6.2).

For the Student APARCH model, the VaR for long and short positions is given by μt + stα,υσt and μt + st1-α,υσt, with stα,υ being the left quantile at α% for the (standardized) Student distribution with (estimated) number of degrees of freedom υ and st1-α,υ is the right quantile at α% for this same distribution. Note that because zα = -z1-α for the normal distribution and stα,υ = -st1-α,υ for the Student distribution, the forecasted long and short VaR will be equal in both cases.
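The standardization and symmetry of stα,υ can be illustrated by simulation with numpy only (an exact implementation would use a Student-t inverse CDF; ν = 6 is a hypothetical choice):

```python
import numpy as np

nu = 6                      # hypothetical degrees of freedom
rng = np.random.default_rng(42)
# Student-t draws rescaled to unit variance, since Var(t_nu) = nu/(nu-2):
z = rng.standard_t(nu, size=200_000) * np.sqrt((nu - 2) / nu)

alpha = 0.05
st_left = np.quantile(z, alpha)        # approximates st_{alpha,nu}
st_right = np.quantile(z, 1 - alpha)   # approximates st_{1-alpha,nu}

# Symmetry st_{alpha,nu} = -st_{1-alpha,nu}: long and short VaR,
# mu_t + st * sigma_t, are mirror images around mu_t.
print(st_left, st_right)
```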

6.1.4 Skewed-Student APARCH

To account for the excess skewness and kurtosis, Fernández and Steel (1998) propose to extend the Student distribution by adding a skewness parameter.4 Their procedure allows the introduction of skewness in any continuous unimodal and symmetric (about zero) distribution g(.) by changing the scale at each side of the mode. The main drawback of this density is that it is expressed in terms of the mode and the dispersion. In order to keep in the ARCH ‘tradition’, Lambert and Laurent (2001) re-expressed the skewed-Student density in terms of the mean and the variance, i.e. re-parameterized this density in such a way that the innovation process has zero mean and unit variance. Otherwise, it would be difficult to separate the fluctuations in the mean and variance from the fluctuations in the shape of the conditional density (Hansen, 1994).

The innovation process z is said to be (standardized) skewed-Student distributed if:

f(z|ξ,υ) = { [2/(ξ + 1/ξ)] s g[ξ(sz + m)|υ]     if z < -m/s
           { [2/(ξ + 1/ξ)] s g[(sz + m)/ξ|υ]    if z ≥ -m/s

where g(.|υ) is the symmetric (unit variance) Student density and ξ is the asymmetry coefficient.5 m and s2 are respectively the mean and the variance of the non-standardized skewed-Student and are defined in Section 3.6.2.

Notice also that the density f(zt|1/ξ,υ) is the “mirror” of f(zt|ξ,υ) with respect to the (zero) mean, i.e. f(zt|1/ξ,υ) = f(-zt|ξ,υ). Therefore, the sign of log(ξ) indicates the direction of the skewness: the third moment is positive (negative), and the density is skewed to the right (left), if log(ξ) > 0 (< 0).

Lambert and Laurent (2000) show that the quantile function skst*α,υ,ξ of a non-standardized skewed-Student density is:

skst*α,υ,ξ = { (1/ξ) st((α/2)(1 + ξ²), υ)          if α < 1/(1 + ξ²)
            { -ξ st(((1 - α)/2)(1 + ξ^-2), υ)      if α ≥ 1/(1 + ξ²)

where st(α, υ) is the quantile function of the (unit variance) Student-t density. It is straightforward to obtain the quantile function of the standardized skewed-Student: skstα,υ,ξ = (skst*α,υ,ξ - m)/s.

For the skewed-Student APARCH model, the VaR for long and short positions is given by μt + skstα,υ,ξσt and μt + skst1-α,υ,ξσt, with skstα,υ,ξ being the left quantile at α% for the skewed-Student distribution with υ degrees of freedom and asymmetry coefficient ξ; skst1-α,υ,ξ is the corresponding right quantile. If log(ξ) is smaller than zero (or ξ < 1), |skstα,υ,ξ| > |skst1-α,υ,ξ| and the VaR for long trading positions will be larger (for the same conditional variance) than the VaR for short trading positions. When log(ξ) is positive, we have the opposite result.
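The piecewise quantile formula can be sketched as follows; an empirical Student quantile from simulation stands in for an exact st(α, υ) (so the numbers are approximate), and ξ = exp(-0.16) is a hypothetical value mimicking the negative log(ξ) estimated below:

```python
import numpy as np

nu = 6
xi = np.exp(-0.16)                       # log(xi) < 0: left skew
rng = np.random.default_rng(1)
z = rng.standard_t(nu, size=400_000) * np.sqrt((nu - 2) / nu)

def st_q(a):
    # Empirical quantile of the unit-variance Student (proxy for st(a, nu)).
    return np.quantile(z, a)

def skst_q_star(a, xi):
    # Quantile of the non-standardized skewed-Student
    # (piecewise formula of Lambert and Laurent, 2000):
    if a < 1.0 / (1.0 + xi ** 2):
        return (1.0 / xi) * st_q(0.5 * a * (1.0 + xi ** 2))
    return -xi * st_q(0.5 * (1.0 - a) * (1.0 + xi ** -2))

# With xi < 1 the left quantile is larger in absolute value,
# so the long VaR exceeds the short VaR for a given sigma_t:
print(skst_q_star(0.05, xi), skst_q_star(0.95, xi))
```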

6.2 Application

6.2.1 Model for VaR assessment

The first step of the application consists in estimating the various models. Then, the adequacy in forecasting the in-sample VaR will be tested. The code used to perform this illustration is given in the file VaR_insample.ox. The first model to be estimated is an ARMA(2,0)-RiskMetricsTM model (with a normal density). The second, third and fourth are ARMA(2,0)-APARCH(1,1) models with a normal, a Student and a skewed-Student density, respectively. The results are given in the four boxes below. Several comments are in order:

To summarize, these results indicate the need for a model featuring a negative leverage effect (conditional asymmetry) for the conditional variance combined with an asymmetric distribution for the underlying error term (unconditional asymmetry). The skewed-Student APARCH model delivers such specifications.

Does this mean that this model improves on symmetric GARCH models when the VaR for long and short positions is needed?

#import <packages/Garch5/Garch>  
    decl garchobj;  
    garchobj = new Garch();  
//*** DATA ***//  
    garchobj.Load("/data/nasdaq.xls"); // -> B4094  
    garchobj.Select(Y_VAR, {"Nasdaq",0,0} );  
    garchobj.SetSelSample(1, 1, 2000, 1);  
//*** SPECIFICATIONS ***//  
    garchobj.CSTS(1,1);             // cst in Mean (1 or 0), cst in Variance (1 or 0)  
    garchobj.DISTRI(0);             // 0 for Gauss, 1 for Student, 2 for GED, 3 for Skewed-Student  
    garchobj.ARMA_ORDERS(2,0);      // AR order (p), MA order (q).  
    garchobj.ARFIMA(0);             // 1 if Arfima wanted, 0 otherwise  
    garchobj.GARCH_ORDERS(1,1);     // p order, q order  
    garchobj.MODEL(0);              //  0:RISKMETRICS  1:GARCH     2:EGARCH    3:GJR   4:APARCH  
                                    //  5:IGARCH 6:FIGARCH(BBM) 7:FIGARCH(Chung) 8:FIEGARCH(BBM only)  
                                    //  9:FIAPARCH(BBM) 10: FIAPARCH(Chung) 11: HYGARCH(BBM)  
//*** OUTPUT ***//  
    garchobj.MLE(0);                // 0 : Second Derivates, 1 : OPG, 2 : QMLE  
    garchobj.ITER(0);               // Interval of iterations between printed intermediary  
                                    // results (if no intermediary results wanted, enter '0')  
//*** PARAMETERS ***//  
    garchobj.BOUNDS(0);             // 1 if bounded parameters wanted, 0 otherwise  
    decl quan=<0.95,0.975,0.99,0.995,0.9975>; // Quantiles investigated  
    decl Y=garchobj.GetGroup(Y_VAR);  
    decl T=garchobj.GetcT();  
    decl m_cA,m_cV,i,j;  
    decl qu_pos,qu_neg,m_vSigma2,dfunc,m_vPar,cond_mean;  
    decl m_Dist=garchobj.GetValue("m_cDist");  
    println("Number of observations: ", T);  
    println("Investigated quantiles: ",quan);  
    println("In-sample VaR");  
    decl emp_quan_in_pos=new matrix[T][columns(quan)];  
    decl emp_quan_in_neg=new matrix[T][columns(quan)];  

    // Computation of the qu_pos and qu_neg quantile vectors for the
    // estimated density (branch bodies omitted from this listing):
    if (m_Dist==0)  
        ...                                 // Gauss quantiles  
    if (m_Dist==1)  
        ...                                 // Student quantiles  
    if (m_Dist==3)  
        for (i = 0; i < columns(quan) ; ++i)  
            ...                             // skewed-Student quantiles  
    emp_quan_in_pos=(cond_mean + sqrt(m_vSigma2).*qu_pos');  
    emp_quan_in_neg=(cond_mean + sqrt(m_vSigma2).*qu_neg');  
    println("In-sample Value-at-Risk");  
    garchobj.VaR_DynQuant(Y, emp_quan_in_pos, emp_quan_in_neg, quan, 5, 0,0);  
    garchobj.VaR_Test(Y, emp_quan_in_pos, emp_quan_in_neg, quan);  
    delete garchobj;  

                                         Output RiskMetrics - VaR_insample.ox  
Dependent variable : Nasdaq  
Mean Equation : ARMA (2, 0) model.  
No regressor in the mean  
Variance Equation : RiskMetrics (lambda=0.94).  
 No regressor in the variance  
The distribution is a Gauss distribution.  
Strong convergence using numerical derivatives  
Log-likelihood = -2222.83  
Please wait : Computing the Std Errors ...  
 Maximum Likelihood Estimation (Std.Errors based on Second derivatives)  
                  Coefficient  Std.Error  t-value  t-prob  
Cst(M)               0.044980   0.020337    2.212  0.0271  
AR(1)                0.269636   0.024409    11.05  0.0000  
AR(2)               -0.011516   0.024168  -0.4765  0.6338  
ARCH(Alpha1)         0.060000  
GARCH(Beta1)         0.940000  
No. Observations :      2000  No. Parameters  :         5  
Mean (Y)         :   0.04326  Variance (Y)    :   0.83876  
Skewness (Y)     :  -2.33675  Kurtosis (Y)    :  33.73842  
Log Likelihood   : -2222.834  
The sample mean of squared residuals was used to start recursion.  
Estimated Parameters Vector :  
 0.044980; 0.269636;-0.011516; 0.060000; 0.940000  
Elapsed Time : 0.21 seconds (or 0.0035 minutes).

                                       Output N-APARCH(1,1) - VaR_insample.ox  
Dependent variable : Nasdaq  
Mean Equation : ARMA (2, 0) model.  
No regressor in the mean  
Variance Equation : APARCH (1, 1) model.  
 No regressor in the variance  
The distribution is a Gauss distribution.  
Strong convergence using numerical derivatives  
Log-likelihood = -2119.06  
Please wait : Computing the Std Errors ...  
 Maximum Likelihood Estimation (Std.Errors based on Second derivatives)  
                  Coefficient  Std.Error  t-value  t-prob  
Cst(M)               0.047817   0.019917    2.401  0.0165  
AR(1)                0.285268   0.025034    11.40  0.0000  
AR(2)                0.005233   0.024285   0.2155  0.8294  
Cst(V)               0.045054  0.0081194    5.549  0.0000  
ARCH(Alpha1)         0.169705   0.018395    9.226  0.0000  
GARCH(Beta1)         0.799843   0.024229    33.01  0.0000  
APARCH(Gamma1)       0.286858   0.072419    3.961  0.0001  
APARCH(Delta)        1.357528    0.24113    5.630  0.0000  
No. Observations :      2000  No. Parameters  :         8  
Mean (Y)         :   0.04326  Variance (Y)    :   0.83876  
Skewness (Y)     :  -2.33675  Kurtosis (Y)    :  33.73842  
Log Likelihood   : -2119.063  
The sample mean of squared residuals was used to start recursion.  
The condition for existence of E(sigma^delta) and E(|e^delta|) is observed.  
The constraint equals 0.944317 and should be < 1.  
Estimated Parameters Vector :  
0.047817; 0.285268; 0.005233; 0.045054; 0.169705; 0.799843; 0.286858; 1.357528  
Elapsed Time : 2.654 seconds (or 0.0442333 minutes).

                                    Output SKST-APARCH(1,1) - VaR_insample.ox  
Dependent variable : Nasdaq  
Mean Equation : ARMA (2, 0) model.  
No regressor in the mean  
Variance Equation : APARCH (1, 1) model.  
 No regressor in the variance  
The distribution is a skewed-Student distribution, with  
a tail coefficient of 5.23817 and an asymmetry coefficient of -0.16141.  
Strong convergence using numerical derivatives  
Log-likelihood = -2001.87  
Please wait : Computing the Std Errors ...  
 Maximum Likelihood Estimation (Std.Errors based on Second derivatives)  
                  Coefficient  Std.Error  t-value  t-prob  
Cst(M)               0.054586   0.019069    2.863  0.0042  
AR(1)                0.268316   0.023425    11.45  0.0000  
AR(2)               -0.011101   0.023234  -0.4778  0.6329  
Cst(V)               0.037921  0.0098793    3.838  0.0001  
ARCH(Alpha1)         0.153786   0.023273    6.608  0.0000  
GARCH(Beta1)         0.820957   0.029905    27.45  0.0000  
APARCH(Gamma1)       0.219447   0.087570    2.506  0.0123  
APARCH(Delta)        1.367004    0.27287    5.010  0.0000  
Asymmetry           -0.161410   0.031849   -5.068  0.0000  
Tail                 5.238169    0.60680    8.632  0.0000  
No. Observations :      2000  No. Parameters  :        10  
Mean (Y)         :   0.04326  Variance (Y)    :   0.83876  
Skewness (Y)     :  -2.33675  Kurtosis (Y)    :  33.73842  
Log Likelihood   : -2001.868  
The sample mean of squared residuals was used to start recursion.  
The condition for existence of E(sigma^delta) and E(|e^delta|) is observed.  
The constraint equals 0.963684 and should be < 1.  
Estimated Parameters Vector :  
0.054586; 0.268316;-0.011101; 0.037921; 0.153786; 0.820957;  
0.219447; 1.367004;-0.161410; 5.238169  
Elapsed Time : 5.457 seconds (or 0.09095 minutes).

Figure 6.1 plots the in-sample VaR for the 5% and 95% quantiles for the AR(2)-APARCH-SKST model. Using the rolling menus, this graph is easily obtained with the Test.../Graphic Analysis/In-Sample VaR Forecasts command, choosing the option ‘Theoretical Quantiles’ with the following quantiles: 0.05; 0.95.


Figure 6.1: In-sample VaR for the NASDAQ given by an AR(2)-APARCH-SKST model

6.2.2 In-sample VaR

All models are tested with a VaR level α which ranges from 5% to 0.25% (quan = < 0.95, 0.975, 0.99, 0.995, 0.9975 >) and their performance is then assessed by computing the failure rate for the returns yt. By definition, the failure rate is the proportion of times that returns exceed (in absolute value) the forecasted VaR. If the VaR model is correctly specified, the failure rate should be equal to the pre-specified VaR level. In our empirical application, we define fl, the failure rate for long trading positions, as the percentage of negative returns smaller than the one-step-ahead VaR for long positions. Correspondingly, we define fs, the failure rate for short trading positions, as the percentage of positive returns larger than the one-step-ahead VaR for short positions.

Because the computation of the empirical failure rate defines a sequence of yes/no observations, it is possible to test H0 : f = α against H1 : f ≠ α, where f is the failure rate (estimated by f̂, the empirical failure rate). At the 5% level and if T yes/no observations are available, a confidence interval for f̂ is given by [f̂ - 1.96 √(f̂(1 - f̂)/T), f̂ + 1.96 √(f̂(1 - f̂)/T)]. In the literature on VaR models, this test is also called the Kupiec (1995) LR test when the hypothesis is tested using a likelihood ratio test. The LR statistic is LR = -2 log[α^N (1 - α)^(T-N) / (f̂^N (1 - f̂)^(T-N))], where N is the number of VaR violations, T is the total number of observations and α is the theoretical failure rate. Under the null hypothesis that f is the true failure rate, the LR test statistic is asymptotically distributed as a χ2(1).
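The confidence interval and the Kupiec LR statistic can be sketched with the standard library only (the violation count N = 113 out of T = 2000 below is illustrative):

```python
import math

def kupiec_lr(N, T, alpha):
    # Kupiec (1995) LR test of H0: f = alpha, given N violations in T
    # observations; asymptotically chi2(1) under H0.
    f = N / T
    if not 0.0 < f < 1.0:
        return float("inf")
    loglik = lambda p: N * math.log(p) + (T - N) * math.log(1.0 - p)
    return -2.0 * (loglik(alpha) - loglik(f))

def failure_rate_ci(f_hat, T):
    # 95% confidence interval: f_hat +/- 1.96 * sqrt(f_hat(1-f_hat)/T)
    half = 1.96 * math.sqrt(f_hat * (1.0 - f_hat) / T)
    return f_hat - half, f_hat + half

print(kupiec_lr(113, 2000, 0.05))
print(failure_rate_ci(113 / 2000, 2000))
```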

In this application, these tests are successively applied to the failure rate fl for long trading positions and then to fs, the failure rate for short trading positions using the procedure VaR_Test.

While an extremely easy-to-understand concept, the Value-at-Risk nevertheless has some drawbacks. One of these is that it is not a coherent measure of risk in the sense of Artzner, Delbaen, Eber, and Heath (1999). A somewhat related measure of risk is the so-called expected shortfall (see Scaillet, 2000). Expected shortfall is a coherent measure of risk and is defined as the expected value of the losses conditional on the loss being larger than the VaR. The average multiple of tail event to risk measure “measures the degree to which events in the tail of the distribution typically exceed the VaR measure by calculating the average multiple of these outcomes to their corresponding VaR measures” (Hendricks, 1996).7 These measures are reported when calling the procedure VaR_Test and are labelled ESF1 and ESF2 respectively.
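A simulation sketch of these two tail measures, in the spirit of the ESF1 and ESF2 columns reported below (the exact G@RCH definitions may differ slightly; the returns are simulated, not the NASDAQ series):

```python
import numpy as np

rng = np.random.default_rng(3)
returns = rng.standard_normal(100_000)   # simulated returns
alpha = 0.05
var_long = np.quantile(returns, alpha)

tail = returns[returns < var_long]       # returns beyond the long VaR
esf = tail.mean()                        # expected shortfall (ESF1-like)
avg_multiple = (tail / var_long).mean()  # Hendricks' average multiple (ESF2-like)
print(esf, avg_multiple)
```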

Complete VaR results for the four models are reported in the following outputs.

                                         Output RiskMetrics - VaR_insample.ox  
                               - Short positions -  
     Quantile   Failure rate   Kupiec LRT    P-value        ESF1         ESF2  
      0.95000      0.96250       7.1752    0.0073918       1.5238       1.3202  
      0.97500      0.98350       6.7238    0.0095137       1.9570       1.3399  
      0.99000      0.99000      0.00000       1.0000       2.1653       1.2828  
      0.99500      0.99300       1.4293      0.23188       2.3557       1.2509  
      0.99750      0.99450       5.3641     0.020555       2.6115       1.2019  
                                - Long positions -  
     Quantile   Failure rate   Kupiec LRT    P-value        ESF1         ESF2  
     0.050000     0.056500       1.7103      0.19094      -1.8823       1.6088  
     0.025000     0.040500       16.648  4.4986e-005      -2.1481       1.5187  
     0.010000     0.025500       33.969  5.5985e-009      -2.2529       1.4934  
    0.0050000     0.020000       51.358  7.6961e-013      -2.4338       1.4575  
    0.0025000     0.015000       57.820  2.8755e-014      -2.8104       1.4673

                                            Output N-APARCH - VaR_insample.ox  
                               - Short positions -  
     Quantile   Failure rate   Kupiec LRT    P-value        ESF1         ESF2  
      0.95000      0.96700       13.757   0.00020807       1.7191       1.2706  
      0.97500      0.98500       9.5549    0.0019942       1.9285       1.2721  
      0.99000      0.99250       1.3822      0.23973       2.2827       1.2493  
      0.99500      0.99550      0.10401      0.74707       2.4319       1.2518  
      0.99750      0.99700      0.18836      0.66429       2.6958       1.2570  
                                - Long positions -  
     Quantile   Failure rate   Kupiec LRT    P-value        ESF1         ESF2  
     0.050000     0.053000      0.37198      0.54193      -1.8833       1.5075  
     0.025000     0.032500       4.2230     0.039879      -2.1592       1.4640  
     0.010000     0.018000       10.450    0.0012263      -2.7534       1.4800  
    0.0050000     0.012500       15.928  6.5802e-005      -3.0119       1.5042  
    0.0025000    0.0095000       22.829  1.7712e-006      -3.2831       1.5016

                                           Output ST-APARCH - VaR_insample.ox  
                               - Short positions -  
     Quantile   Failure rate   Kupiec LRT    P-value        ESF1         ESF2  
      0.95000      0.96400       9.1060    0.0025477       1.7063       1.2772  
      0.97500      0.98750       15.662  7.5739e-005       2.2696       1.2699  
      0.99000      0.99600       9.4119    0.0021558       2.5200       1.2352  
      0.99500      0.99900       9.5944    0.0019518       3.7156       1.4563  
      0.99750      0.99900       2.3393      0.12614       3.7156       1.2326  
                                - Long positions -  
     Quantile   Failure rate   Kupiec LRT    P-value        ESF1         ESF2  
     0.050000     0.066500       10.434    0.0012371      -1.7154       1.5324  
     0.025000     0.034500       6.6333     0.010009      -2.1324       1.4559  
     0.010000     0.014000       2.8748     0.089975      -2.9631       1.4620  
    0.0050000    0.0080000       3.0582     0.080329      -3.5505       1.4453  
    0.0025000    0.0050000       3.8755     0.048996      -4.0783       1.3946

                                         Output SKST-APARCH - VaR_insample.ox  
                               - Short positions -  
     Quantile   Failure rate   Kupiec LRT    P-value        ESF1         ESF2  
      0.95000      0.95050     0.010560      0.91815       1.5490       1.3057  
      0.97500      0.98000       2.1997      0.13804       2.0181       1.2966  
      0.99000      0.99150      0.47890      0.48892       2.1671       1.2303  
      0.99500      0.99700       1.8781      0.17055       2.6958       1.2974  
      0.99750      0.99900       2.3393      0.12614       3.7156       1.4868  
                                - Long positions -  
     Quantile   Failure rate   Kupiec LRT    P-value        ESF1         ESF2  
     0.050000     0.051500     0.093853      0.75934      -1.9159       1.5136  
     0.025000     0.024500     0.020647      0.88574      -2.4564       1.4604  
     0.010000    0.0090000      0.20904      0.64752      -3.3903       1.5069  
    0.0050000    0.0060000      0.37773      0.53882      -3.8849       1.4071  
    0.0025000    0.0045000       2.5882      0.10766      -4.1698       1.2782

These results indicate that:

Besides the failure rate, a relevant VaR model should feature a sequence of indicator functions (VaR violations) that is not serially correlated. With the new variables Hitt(α) = I(yt < VaRt(α)) - α and Hitt(1 - α) = I(yt > VaRt(1 - α)) - α, Engle and Manganelli (1999) suggest testing jointly that:

According to Engle and Manganelli (1999), testing A1 and A2 can be done using the artificial regression Hitt = Xλ + ϵt, where X is a T × k matrix whose first column is a column of ones, the next p columns are Hitt-1, ..., Hitt-p and the k - p - 1 remaining columns are additional independent variables (including the VaR itself). Engle and Manganelli (1999) also show that, under the null A1 and A2, the Dynamic Quantile test statistic λ̂′X′Xλ̂/[α(1 - α)] ~ χ2(k), where λ̂ is the OLS estimate of λ. A small sample version of this test (an F-test) is readily obtained, but the difference is negligible since the sample size is larger than 1,000 observations. Note that while Engle and Manganelli (1999) only consider long trading positions, we also use this test when computing the VaR of short trading positions.
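The artificial regression can be sketched as follows; this is a hypothetical implementation on simulated data where the null holds, not the G@RCH VaR_DynQuant code:

```python
import numpy as np

def dq_test(y, var_t, alpha, p=5):
    # Hit_t = I(y_t < VaR_t) - alpha, regressed on a constant, p lags of
    # Hit and the contemporaneous VaR.  Under the null of a correct,
    # serially uncorrelated VaR model,
    #   DQ = lam' X'X lam / (alpha(1 - alpha))  ~  chi2(k),  k = p + 2.
    hit = (y < var_t).astype(float) - alpha
    T = len(hit)
    X = np.column_stack(
        [np.ones(T - p)]
        + [hit[p - j:T - j] for j in range(1, p + 1)]   # Hit_{t-1}..Hit_{t-p}
        + [var_t[p:]])
    lam, *_ = np.linalg.lstsq(X, hit[p:], rcond=None)
    return lam @ X.T @ X @ lam / (alpha * (1.0 - alpha))

rng = np.random.default_rng(7)
sigma = 0.5 + rng.random(2000)           # hypothetical conditional st.dev.
y = sigma * rng.standard_normal(2000)
var5 = -1.6449 * sigma                   # correct 5% long VaR
print(dq_test(y, var5, 0.05))
```

Feeding the function a series whose violations are clustered in time produces a very large statistic, which is exactly what the test is designed to detect.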

The G@RCH package also provides a function called VaR_DynQuant that implements this test.

Complete VaR results corresponding to the Dynamic Quantile test are reported in the following outputs.

                                         Output RiskMetrics - VaR_insample.ox  
           - Short positions -  
     Quantile        Stat.      P-value  
      0.95000       13.994     0.051297  
      0.97500       9.9820      0.18960  
      0.99000       4.4913      0.72177  
      0.99500       2.4473      0.93101  
      0.99750       7.9630      0.33587  
            - Long positions -  
     Quantile        Stat.      P-value  
     0.050000       22.591    0.0020078  
     0.025000       44.824  1.4796e-007  
     0.010000       95.823      0.00000  
    0.0050000       121.21      0.00000  
    0.0025000       171.78      0.00000  
Remark: In the Dynamic Quantile Regression, p=5.

                                            Output N-APARCH - VaR_insample.ox  
           - Short positions -  
     Quantile        Stat.      P-value  
      0.95000       15.924     0.025818  
      0.97500       10.064      0.18495  
      0.99000       6.9102      0.43829  
      0.99500      0.31019      0.99989  
      0.99750      0.31617      0.99988  
            - Long positions -  
     Quantile        Stat.      P-value  
     0.050000       1.9588      0.96209  
     0.025000       5.9063      0.55073  
     0.010000       18.798    0.0088448  
    0.0050000       32.636  3.0952e-005  
    0.0025000       56.664  6.9706e-010  
Remark: In the Dynamic Quantile Regression, p=5.

                                           Output ST-APARCH - VaR_insample.ox  
            - Short positions -  
     Quantile        Stat.      P-value  
      0.95000       13.398     0.062986  
      0.97500       15.831     0.026710  
      0.99000       7.3396      0.39440  
      0.99500       6.4381      0.48962  
      0.99750       1.8166      0.96929  
            - Long positions -  
     Quantile        Stat.      P-value  
     0.050000       15.164     0.033953  
     0.025000       10.682      0.15308  
     0.010000       8.2543      0.31070  
    0.0050000       14.335     0.045531  
    0.0025000       6.4057      0.49326  
Remark: In the Dynamic Quantile Regression, p=5.

                                         Output SKST-APARCH - VaR_insample.ox  
           - Short positions -  
     Quantile        Stat.      P-value  
      0.95000       3.0185      0.88328  
      0.97500       11.787      0.10777  
      0.99000       5.2936      0.62418  
      0.99500       1.6636      0.97610  
      0.99750       1.8136      0.96943  
            - Long positions -  
     Quantile        Stat.      P-value  
     0.050000       1.6258      0.97763  
     0.025000       3.0760      0.87789  
     0.010000       4.9101      0.67093  
    0.0050000       1.6573      0.97636  
    0.0025000       4.1900      0.75765  
Remark: In the Dynamic Quantile Regression, p=5.

The same conclusion holds for the SKST-APARCH model, i.e. it performs very well for the NASDAQ, unlike its symmetric competitors.

6.2.3 Out-of-sample VaR

The testing methodology in the previous subsection is equivalent to back-testing the model on the estimation sample. Therefore it can be argued that this should be favorable to the tested model. In a ‘real life situation’, VaR models are used to deliver out-of-sample forecasts, where the model is estimated on the known returns (up to time t for example) and the VaR forecast is made for period [t + 1,t + h], where h is the time horizon of the forecasts. In this subsection we implement this testing procedure for the long and short VaR with h = 1 day. Note that the code used for this illustration is VaR_outofsample.ox.

We use an iterative procedure where the skewed-Student APARCH model is estimated to predict the one-day-ahead VaR. The first estimation sample is the complete sample for which the data are available, less the last five years. The predicted one-day-ahead VaR (both for long and short positions) is then compared with the observed return and both results are recorded for later assessment using the statistical tests. At the i-th iteration, where i goes from 2 to 5 × 252 (five years of data), the estimation sample is augmented to include one more day and the VaRs are forecasted and recorded. Whenever i is a multiple of 50, the model is re-estimated to update the skewed-Student APARCH parameters. In other words, we update the model parameters every 50 trading days and thus assume a ‘stability window’ of 50 days for our parameters. We iterate the procedure until all days (less the last one) have been included in the estimation sample. Corresponding failure rates are then computed by comparing the long and short forecasted VaRt+1 with the observed return yt+1 for all days in the five-year period. We use the same statistical tests as in the previous subsection.
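The mechanics of this expanding-window scheme can be sketched as follows; fit and forecast are placeholders standing in for the skewed-Student APARCH estimation and the one-day-ahead VaR formula:

```python
def rolling_var_forecasts(returns, n_out, fit, forecast, reestimate_every=50):
    # Expanding-window out-of-sample forecasting: the information set
    # grows by one observation per iteration, and the model parameters
    # are re-estimated every `reestimate_every` days.
    T = len(returns)
    start = T - n_out
    forecasts, params = [], None
    for i in range(n_out):
        t = start + i
        if i % reestimate_every == 0:
            params = fit(returns[:t])            # update the 'stability window'
        forecasts.append(forecast(params, returns[:t]))
    return forecasts

# Dummy fit/forecast to show the mechanics (no real estimation):
fits = []
dummy_fit = lambda r: fits.append(len(r)) or len(r)
dummy_forecast = lambda par, r: (par, len(r))
out = rolling_var_forecasts(list(range(1000)), 200, dummy_fit, dummy_forecast)
print(len(out), len(fits))
```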

#import <packages/Garch5/Garch>
main()
{
    decl garchobj;
    garchobj = new Garch();
//*** DATA ***//
    garchobj.Load("/data/nasdaq.xls"); // -> B4094
    garchobj.Select(Y_VAR, {"Nasdaq",0,0} );
    garchobj.SetSelSample(1, 1, 4093-5*252, 1);
//*** SPECIFICATIONS ***//
    garchobj.CSTS(1,1);             // cst in Mean (1 or 0), cst in Variance (1 or 0)
    garchobj.DISTRI(1);             // 0 for Gauss, 1 for Student, 2 for GED, 3 for Skewed-Student
    garchobj.ARMA_ORDERS(2,0);      // AR order (p), MA order (q)
    garchobj.GARCH_ORDERS(1,1);     // p order, q order
    garchobj.MODEL(4);              //  0:RISKMETRICS  1:GARCH      2:EGARCH    3:GJR   4:APARCH
                                    //  5:IGARCH 6:FIGARCH(BBM) 7:FIGARCH(Chung) 8:FIEGARCH(BBM only)
                                    //  9:FIAPARCH(BBM) 10: FIAPARCH(Chung) 11: HYGARCH(BBM)
//*** OUTPUT ***//
    garchobj.MLE(0);                // 0 : Second Derivatives, 1 : OPG, 2 : QMLE
//*** PARAMETERS ***//
    garchobj.Initialization(<>);
    garchobj.DoEstimation(<>);
    decl quan=<0.95,0.975,0.99,0.995,0.9975>; // Quantiles investigated
    decl out_of_sample_VaR=1;       // 1 to compute the out-of-sample VaR, 0 otherwise
    decl numb_out_of_sample=5*252;  // number of observations in the out-of-sample period
    decl reestimate_every=50;       // k: re-estimate the model every k observations
    decl Y=garchobj.GetGroup(Y_VAR);
    decl T=garchobj.GetcT();
    decl m_cA,m_cV,i,j;
    decl qu_pos,qu_neg,m_vSigma2,dfunc,m_vPar,cond_mean;
    decl m_Dist=garchobj.GetValue("m_cDist");
    println("Number of observations: ", T);
    println("Investigated quantiles: ",quan);
    println("Number of forecasts: ", numb_out_of_sample);
    println("Out-of-sample: re-estimate every ", reestimate_every, " observations");
    decl emp_quan_out_pos=new matrix[numb_out_of_sample][columns(quan)];
    decl emp_quan_out_neg=new matrix[numb_out_of_sample][columns(quan)];
    decl k=0;
    for (j = 0; j < numb_out_of_sample ; ++j)
    {
        garchobj.SetSelSample(-1, 1, T+j, 1);
        if (k==reestimate_every-1)  // full re-estimation every k observations
        {
            println("..... Remaining Steps (Out-of-sample): ",numb_out_of_sample-j, " .....");
            garchobj.Initialization(m_vPar);
            garchobj.FigLL(garchobj.GetFreePar(), &dfunc, 0, 0);
            // ... (remainder of the re-estimation step elided in the original listing)
        }
        garchobj.SetSelSample(-1, 1, T+j+1, 1);
        // ... one-step-ahead forecasts of cond_mean and m_vSigma2
        //     (elided in the original listing) ...
        if (m_Dist==0)
        {   // Gaussian quantiles qu_pos and qu_neg (body elided in the original listing)
        }
        if (m_Dist==1)
        {   // Student quantiles (body elided in the original listing)
        }
        if (m_Dist==3)
            for (i = 0; i < columns(quan) ; ++i)
            {   // skewed-Student quantiles (body elided in the original listing)
            }
        emp_quan_out_pos[j][]=(cond_mean + sqrt(m_vSigma2).*qu_pos)';
        emp_quan_out_neg[j][]=(cond_mean + sqrt(m_vSigma2).*qu_neg)';
    }
    println("Out-of-sample Value-at-Risk (rolling regression)");
    delete garchobj;
}

Empirical results for the SKST-APARCH are given below. Broadly speaking, the results are quite similar to (although not as good as) those obtained with the in-sample testing procedure.

                                      Output - VaR_APARCH_SKST_outofsample.ox  
           - Short positions -  
     Quantile        Stat.      P-value  
      0.95000       42.683  3.8399e-007  
      0.97500       28.903   0.00015072  
      0.99000       6.6717      0.46384  
      0.99500      0.32859      0.99986  
      0.99750       3.5069      0.83449  
            - Long positions -  
     Quantile        Stat.      P-value  
     0.050000       38.467  2.4696e-006  
     0.025000       43.893  2.2419e-007  
     0.010000       7.6759      0.36204  
    0.0050000       3.5150      0.83363  
    0.0025000      0.61369      0.99891  
Remark: In the Dynamic Quantile Regression, p=5.  
                               - Short positions -  
     Quantile   Failure rate   Kupiec LRT    P-value        ESF1         ESF2  
      0.95000      0.91667       24.765  6.4776e-007       2.7630       1.3391  
      0.97500      0.95635       14.760   0.00012207       3.1518       1.2488  
      0.99000      0.98730      0.85382      0.35547       3.7643       1.2290  
      0.99500      0.99444     0.075438      0.78358       4.4206       1.1863  
      0.99750      0.99683      0.21171      0.64543       2.7719       1.0845  
                                - Long positions -  
     Quantile   Failure rate   Kupiec LRT    P-value        ESF1         ESF2  
     0.050000     0.069048       8.6470    0.0032760      -3.0879       1.4279  
     0.025000     0.033333       3.2553     0.071193      -3.5326       1.3442  
     0.010000     0.011905      0.43522      0.50944      -4.4876       1.3198  
    0.0050000    0.0063492      0.42458      0.51466      -4.5203       1.2693  
    0.0025000    0.0031746      0.21171      0.64543      -6.2120       1.2574

Chapter 7
Realized Volatility and Intraday Periodicity

(written with the collaboration of Kris Boudt and Jerome Lahaye)

The recent widespread availability of databases providing the intradaily prices of financial assets (stocks, stock indices, bonds, currencies, …) has led to new developments in applied econometrics and quantitative finance to model daily and intradaily volatility.

The aim of this chapter is to introduce and illustrate several concepts. We first sketch basic notions to understand diffusion models before moving on to an explanation of relevant volatility measures. We introduce the notion of integrated volatility and one popular estimator: realized volatility. We then consider the issues of the presence of jumps in the price process. We then present an important stylized fact of intraday returns, namely intraday periodicity in volatility, and show how to consistently estimate it in the presence and absence of jumps. We also review two estimators of the integrated volatility that are robust in the presence of jumps, i.e. the bi-power variation and the realized outlyingness weighted variation.

We also review two different approaches for jumps detection. The first one tests whether the contribution of jumps to the daily variance is zero. The second one tests at the intraday level whether a given return is affected by price jumps.

While G@RCH 6.1 is primarily a menu-driven module for OxMetrics, it also provides a set of Ox classes that allow the user to write Ox programs importing these classes. G@RCH contains a set of procedures, gathered in the Realized class, devoted to the computation (or estimation) of the measures reviewed in this chapter and to the simulation of continuous-time stochastic volatility processes.

Most concepts are directly implemented in the menu-driven application. As usual, several Ox example programs are also provided (they are available in /packages/Realized/samples/).

This chapter has been inspired by several excellent sources of information, namely Chapter 6 (Continuous-Time Models and Their Applications) of Tsay (2002), the recent survey of Andersen, Bollerslev, Christoffersen, and Diebold (2006) and more generally by the seminal papers of Andersen, Bollerslev, Diebold and coauthors, Barndorff-Nielsen and Shephard, as well as Lee and Mykland (2008). This chapter is also based on Boudt, Croux, and Laurent (2008a,2008b).

7.1 Introduction to diffusion models

It is useful to think of returns as arising from an underlying continuous-time process. In particular, suppose that this underlying model involves a continuous sample path for the (logarithmic) price process. The return process may then, under general assumptions, be written in standard stochastic differential equation (sde) form as,

dp(t) = μ(t)dt + σ(t)dW(t),  t ≥ 0,

where dp(t) denotes the logarithmic price increment, μ(t) the drift, σ(t) refers to the point-in-time or spot volatility, and W(t) is a standard Brownian motion.

7.1.1 Standard Brownian motion / Wiener process

In a discrete-time econometric model, we assume that shocks are driven by a white noise process, which is not predictable. What is the counterpart of shocks in a continuous-time model? The answer is increments of a Wiener process W(t) (also known as a standard Brownian motion).

A continuous-time stochastic process W(t) is a Wiener process if it satisfies:

  1. W(0) = 0;
  2. ΔW(t) ≡ W(t + Δ) − W(t) = ϵ√Δ, where ϵ is an N(0,1) random variable;
  3. ΔW(t) is independent of ΔW(j), j ≤ t;
  4. W(t) is a continuous process, i.e. there are no jumps in its sample paths,

where Δ is a small increment in time.

To simulate a continuous process, it is of course necessary to discretize time. We simulate the motion at times 0, Δ, 2Δ, .... Moreover, the definition of a Brownian motion implies that {W(kΔ) − W((k − 1)Δ) : k ∈ ℕ} is a collection of i.i.d. N(0,Δ) random variables. So to simulate a trajectory of the Brownian motion up to time mΔ, one needs to generate m independent N(0,1) random variables {Zk : k ∈ {1,2,...,m}}.

Since W(0) ≡ 0, we simulate, for k ∈ {1,2,...,m}:

W_0 ≡ 0,   W_{kΔ} ≡ W_{(k−1)Δ} + Δ^{1∕2} Z_k.

This is called the Euler method to simulate a Brownian motion.

The example file Simul_wiener.ox uses the procedure
Simul_Wiener_process(const m, const deltat) to simulate four Wiener processes consisting of 5 days with 10,000 observations per day (m = 5 × 10,000 = 50,000 and Δ = 1∕10,000). This procedure is available in /packages/Realized/samples/procs_simul.ox.
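For readers who prefer a language-neutral illustration, the Euler recursion above can be sketched in a few lines of Python (an illustrative stand-in, not the Ox procedure itself):

```python
import numpy as np

def simul_wiener_process(m, deltat, seed=0):
    """Euler simulation of a Wiener process:
    W_0 = 0 and W_{k*deltat} = W_{(k-1)*deltat} + sqrt(deltat) * Z_k,
    with Z_k i.i.d. N(0,1)."""
    rng = np.random.default_rng(seed)
    increments = np.sqrt(deltat) * rng.standard_normal(m)
    return np.concatenate(([0.0], np.cumsum(increments)))

# 5 days with 10,000 observations per day, as in Simul_wiener.ox
W = simul_wiener_process(m=5 * 10_000, deltat=1 / 10_000)
```

Each increment has variance Δ, so the variance of the process grows linearly with the length of the time interval, as stated below.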


Figure 7.1: Four simulated Wiener processes

Figure 7.1 represents the four simulated Wiener processes starting with W(0) = 0.

The Wiener process is a special stochastic process with zero drift and variance proportional to the length of time interval. This means that the rate of change in expectation is 0 and the rate of change in variance is 1. Even though this process is of crucial importance in finance, it is desirable to generalize it and allow the mean and the variance to evolve over time in a more realistic manner.

7.1.2 Generalized Wiener Process

Let us now consider the generalized Wiener process p*(t): its expectation has a drift rate μ while the rate of variance change is σ2.

The generalized Wiener process can be written as follows:

dp*(t) = μdt+ σdW (t),t ≥ 0,

where W(t) is a Wiener process, and p*(t) is the price (and not its log). Then, E[p*(t) -p*(0)] = μt and Var[p*(t) -p*(0)] = σ2t. μ and σ2 are often referred to as the drift and volatility parameters of the generalized Wiener process p*(t).

The main limitation of the generalized Wiener process is that both the drift and volatility parameters (μ and σ2) are constant over time while it is well known that daily returns for instance (i.e. t = 1) are heteroscedastic.

The Ito process overcomes this drawback, allowing both the drift and volatility parameters to be time-varying. It is usually written as follows:

dp*(t) = μ(p*(t),t)dt+ σ(p*(t),t)dW (t),t ≥ 0,

where W(t) is a Wiener process. This process can be rewritten as p*(t) = p*(0) + ∫₀ᵗ μ(p*(s),s)ds + ∫₀ᵗ σ(p*(s),s)dW(s).

A particular version of this process is commonly used in the literature to describe the behavior of prices and is known as geometric Brownian motion:

dp*(t) = p*(t)μdt+ p*(t)σdW (t),t ≥ 0.

In other words, a geometric Brownian motion is a continuous-time stochastic process in which the logarithm of the randomly varying quantity follows a Brownian motion. It is easy to show that the random variable log(p*(t)∕p*(0)) is normally distributed with mean (μ − σ²∕2)t and variance σ²t, which reflects the fact that increments of a geometric Brownian motion are normal relative to the current price; this is the reason why it is called geometric. Setting t = 1, e.g. daily data, we have that the one-period-ahead returns are normally distributed, with mean (μ − σ²∕2) and variance σ². See Chapter 6 of Tsay (2002) for more details.
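The distributional result for log(p*(t)∕p*(0)) can be verified by Monte Carlo. The sketch below (Python; the parameter values μ = 0.05 and σ = 0.2 are illustrative, not taken from the text) simulates GBM log-returns with the exact discretization and checks their mean and standard deviation:

```python
import numpy as np

# Monte Carlo check that log(p*(t)/p*(0)) of a geometric Brownian motion is
# N((mu - sigma^2/2) t, sigma^2 t); mu and sigma are illustrative values only.
mu, sigma, t = 0.05, 0.2, 1.0
n_paths, n_steps = 20_000, 250
dt = t / n_steps
rng = np.random.default_rng(42)
Z = rng.standard_normal((n_paths, n_steps))
# exact discretization: log p*_{k+1} = log p*_k + (mu - sigma^2/2) dt + sigma sqrt(dt) Z_k
log_ret = ((mu - 0.5 * sigma ** 2) * dt + sigma * np.sqrt(dt) * Z).sum(axis=1)
sample_mean, sample_std = log_ret.mean(), log_ret.std()
```

The sample mean should be close to (μ − σ²∕2)t = 0.03 and the sample standard deviation close to σ√t = 0.2.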

7.2 Integrated Volatility

A drawback of the geometric Brownian motion is that it is not able to generate the well-known ARCH effects (volatility clustering). It is thus desirable to make both μ and σ² time-varying.

7.2.1 Theoretical background

Let us now consider the following diffusion process for the logarithmic price p(t) = log(p*(t)):

dp(t) = μ(t)dt+ σ(t)dW (t),t ≥ 0,

where μ(t) denotes the drift, σ(t) refers to the point-in-time or spot volatility, and W(t) denotes a standard Brownian motion. Intuitively, over (infinitesimal) small time intervals,

r(t,Δ) ≡ p(t) − p(t − Δ) ≈ μ(t − Δ)Δ + σ(t − Δ)ΔW(t),

where ΔW(t) = W(t) − W(t − Δ) ~ N(0,Δ).

For the one-period daily return,

r_t ≡ p(t) − p(t − 1) = ∫_{t−1}^{t} μ(s)ds + ∫_{t−1}^{t} σ(s)dW(s).

From Equation (7.7), it is clear that the volatility for the continuous-time process over [t- 1,t] is linked to the evolution of the spot volatility σ(t). Furthermore, conditional on the sample path of the drift and the spot volatility processes,

r_t ~ N( ∫_{t−1}^{t} μ(s)ds, IV_t ),

where IV t denotes the so-called integrated variance (volatility), and is defined as follows:

IV_t ≡ ∫_{t−1}^{t} σ²(s)ds.

7.2.2 Illustration of the concept of integrated volatility

To visualize the difference between the spot volatility, the integrated volatility, the conditional volatility of a GARCH(1,1) model and daily squared returns, let us consider the following simulation study.

The simulated model is designed to induce temporal dependencies consistent with the GARCH(1,1) model. The simulated continuous-time GARCH diffusion is formally defined by

 dp(t)  =  σ(t)dW_p(t),                                  (7.9)
dσ²(t)  =  θ[ω − σ²(t)]dt + (2λθ)^{1∕2} σ²(t)dW_d(t),    (7.10)

where W_p(t) and W_d(t) denote two independent Brownian motions.1

The Realized class contains a function Simul_Continuous_GARCH devoted to the simulation of the above continuous-time GARCH diffusion process. This procedure uses a standard Euler discretization scheme to generate the continuous-time GARCH diffusion process, i.e. p(t + Δ) = p(t) + σ(t)√Δ Z_p(t) and σ²(t + Δ) = θωΔ + σ²(t)[1 − θΔ + √(2λθΔ) Z_d(t)], where Z_p(t) and Z_d(t) denote two independent standard normal variables.

In the example file Simul_Continuous_GARCH.ox, a continuous-time GARCH diffusion process is generated for which the constants are set to θ = 0.035, ω = 0.636 and λ = 0.296. The same model has previously been analyzed in Andersen and Bollerslev (1998a). To simulate exchange rates, we choose Δ = 1∕2880, corresponding to 10 observations per 5-minute interval. The number of simulated days is 510, but the first ten days are discarded, giving a total of 500 simulated days. Furthermore, we use the following initial values for the log-price and spot volatility: p(0) = 1 and σ²(0) = 0.1.
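A minimal Python sketch of the same Euler scheme, using the parameter values θ = 0.035, ω = 0.636 and λ = 0.296 quoted above (the function name and interface are illustrative, not those of Simul_Continuous_GARCH):

```python
import numpy as np

def simul_continuous_garch(n_days, n_per_day, theta=0.035, omega=0.636,
                           lam=0.296, p0=1.0, sig2_0=0.1, seed=0):
    """Euler discretization of the GARCH diffusion (7.9)-(7.10):
    p(t+D)    = p(t) + sigma(t) * sqrt(D) * Zp(t)
    sig2(t+D) = theta*omega*D + sig2(t)*(1 - theta*D + sqrt(2*lam*theta*D)*Zd(t))."""
    D = 1.0 / n_per_day
    rng = np.random.default_rng(seed)
    m = n_days * n_per_day
    p = np.empty(m + 1)
    sig2 = np.empty(m + 1)
    p[0], sig2[0] = p0, sig2_0
    for k in range(m):
        zp, zd = rng.standard_normal(2)
        p[k + 1] = p[k] + np.sqrt(sig2[k] * D) * zp
        sig2[k + 1] = theta * omega * D + sig2[k] * (
            1.0 - theta * D + np.sqrt(2.0 * lam * theta * D) * zd)
    return p, sig2

p, sig2 = simul_continuous_garch(n_days=10, n_per_day=288)
```

Aggregating p to daily or 5-minute sampling then gives the series plotted in Figures 7.2 and 7.3.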

Figure 7.2 graphs the simulated instantaneous log-prices, 5-min prices and daily prices.


Figure 7.2: Simulated log-prices of a continuous-time GARCH diffusion model. The figure plots respectively the instantaneous log-prices, 5-min prices and daily prices

Figure 7.3 plots the corresponding returns, i.e. instantaneous returns, 5-min returns and daily returns.


Figure 7.3: Simulated returns of a continuous-time GARCH diffusion model. The figure plots respectively the instantaneous returns, 5-min returns and daily returns

Finally, Figure 7.4 plots five volatility measures:

  1. The top left panel of Figure 7.4 displays the simulated spot volatility σ²(s). The spot variance is σ²(t + Δ) = θωΔ + σ²(t)[1 − θΔ + √(2λθΔ) Z_d(t)]. We thus have 2880 values per day.
  2. The top right panel displays the corresponding daily integrated volatility, i.e. IV_t. Given the fact that IV_t ≡ ∫_{t−1}^{t} σ²(s)ds, the ‘daily’ IV_t is computed as Σ_{j=1}^{1∕Δ} σ²(t − jΔ)Δ, where 1∕Δ = 2880.
  3. The bottom left panel displays the conditional variance obtained by estimating a GARCH(1,1) model on the daily returns.
  4. Finally, the bottom right panel plots the squared daily returns.


Figure 7.4: Four volatility measures: Spot volatility, Integrated Volatility, GARCH(1,1) on daily returns and daily squared returns

Two comments are in order.

We have seen in Section 5.3.1 that one of the most popular measures to check the forecasting performance of the ARCH-type models is the Mincer-Zarnowitz regression, i.e. ex-post volatility regression:

σ̌²_t = a_0 + a_1 σ̂²_t + u_t,

where σ̌²_t is the ex-post volatility, σ̂²_t is the forecasted volatility and a_0 and a_1 are parameters to be estimated. Recall that if the model for the conditional variance is correctly specified (and the parameters are known) and if E(σ̌²_t) = σ̂²_t, we have a_0 = 0 and a_1 = 1. Furthermore, the R² of this regression is often used as a simple measure of the degree of predictability of the ARCH-type model.

To judge the quality of the GARCH forecasts it is common to use σ̌²_t = (r_t − r̄)², where r̄ is the sample mean of the daily returns r_t. The R² of this regression is often lower than 5% and this could lead to the conclusion that GARCH models produce poor forecasts of the volatility (see, among others, Schwert, 1990, or Jorion, 1996). In their seminal paper, Andersen and Bollerslev (1998a) have shown that if r_t follows a GARCH(1,1), the R² of this regression is nothing but α₁²∕(1 − β₁² − 2α₁β₁). If κ is the kurtosis of z_t, we have that κα₁² + β₁² + 2α₁β₁ < 1 to ensure the existence of the unconditional kurtosis of r_t. It follows then that κα₁² < 1 − β₁² − 2α₁β₁ and,

R² ≡ α₁²∕(1 − β₁² − 2α₁β₁) < 1∕κ.

If the innovations (z_t) are i.i.d. N(0,1), κ = 3 and the R² is thus necessarily lower than 1∕3.
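A quick numerical check of this bound, for hypothetical GARCH(1,1) parameters α₁ = 0.1 and β₁ = 0.8 with Gaussian innovations (κ = 3):

```python
# Check of the Andersen-Bollerslev bound R^2 = a1^2 / (1 - b1^2 - 2*a1*b1) < 1/kappa
# for illustrative (hypothetical) GARCH(1,1) parameters.
alpha1, beta1, kappa = 0.1, 0.8, 3.0
# condition for the unconditional kurtosis of r_t to exist
assert kappa * alpha1 ** 2 + beta1 ** 2 + 2 * alpha1 * beta1 < 1
r2 = alpha1 ** 2 / (1 - beta1 ** 2 - 2 * alpha1 * beta1)
print(r2)  # 0.05, well below the 1/3 bound
```

Even for this perfectly specified model, the population R² of the Mincer-Zarnowitz regression with squared returns as proxy is only 5%.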

In the example file Simul_Continuous_GARCH_R2.ox, we simulate the same diffusion process as before and run the above regression to illustrate the findings of Andersen and Bollerslev (1998a). When using the daily squared returns to measure the observed daily volatility, the R² of the regression is found to be extremely low, i.e. 0.0374596, even though a_0 and a_1 are not significantly different from 0 and 1 respectively. However, if we consider the integrated volatility instead of the squared daily returns as an ex-post volatility measure, the R² now equals 0.421871, suggesting that the GARCH model explains more than 40% of the variability of the true volatility.

7.3 Realized Volatility

It is clear from Section 7.2 that the integrated volatility is an ideal ex-post measure of volatility and is thus useful for assessing the quality of conditional volatility forecasts, such as those produced by GARCH models. However, the integrated volatility is never observed in practice.

Inspired by the recent widespread availability of intraday prices of various financial assets, Andersen and Bollerslev (1998a) have proposed to proxy the integrated volatility by the so-called realized volatility.

The most fundamental feature of realized volatility is that it provides a consistent non parametric estimate of the price variability that has transpired over a given discrete interval.

The intuition behind the use of realized volatility is most readily conveyed within the popular continuous-time diffusion presented in Equation (7.5) for which the key quantity of daily volatility is the integral of the squared spot variance. This integral is called the integrated variance.

We assume we dispose of T days of M ≡ ⌊1∕Δ⌋ equally-spaced intraday returns. Now denote the i-th return of day t by r_{t,i} ≡ p(t + iΔ) − p(t + (i − 1)Δ), where i = 1,...,M. The daily Realized Variance (RVar) of day t, denoted RV_t(Δ), is then defined as:

RV_t(Δ) ≡ Σ_{i=1}^{M} r²_{t,i}.

By the theory of quadratic variation, when Δ → 0 and when the true diffusion process belongs to the family of processes presented in Equation (7.5) (i.e. absence of jumps):

RV_t(Δ) → ∫_{t−1}^{t} σ²(s)ds.

In other words, under suitable conditions (like the absence of serial correlation in the intraday returns)2, the realized volatility is consistent for the integrated volatility in the sense that when Δ → 0, RV_t(Δ) measures the latent integrated volatility IV_t perfectly.

G@RCH contains a set of procedures devoted to the computation of the realized volatility (and other related concepts that we will review in the next sections) gathered in the Realized class. For instance, the example file Simul_Continuous_GARCH_RV.ox simulates the previous GARCH-diffusion process and uses the procedure Compute_RV to compute the realized volatility from the simulated 5-minute returns, i.e. 288 returns per day.
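The definition of RV_t is straightforward to implement. The Python sketch below (an illustrative analogue of Compute_RV, not the G@RCH procedure) computes daily realized variances and checks them against a known constant integrated variance:

```python
import numpy as np

def compute_rv(intraday_returns):
    """Daily realized variance: RV_t = sum_i r_{t,i}^2, one value per day.
    `intraday_returns` has shape (T days, M intraday returns per day)."""
    return (np.asarray(intraday_returns, dtype=float) ** 2).sum(axis=1)

# toy check: constant integrated variance of 0.5 per day, M = 288 returns/day
rng = np.random.default_rng(1)
r = rng.normal(0.0, np.sqrt(0.5 / 288), size=(500, 288))
rv = compute_rv(r)
```

The average of the 500 daily RV values should be very close to the true integrated variance of 0.5, illustrating the consistency result above.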


Figure 7.5: Four volatility measures: Spot volatility, Integrated Volatility, realized volatility based on 5-min returns and GARCH(1,1) on daily returns

Figure 7.5 plots the four volatility measures obtained when running this program:

  1. The top left panel displays the simulated spot volatility σ2(t).
  2. The top right panel displays the corresponding daily integrated volatility.
  3. The bottom left panel plots the daily realized volatility.
  4. And finally, the bottom right panel displays GARCH(1,1) conditional variance.

The quality of the approximation of IV t by RV t(Δ) is evident from this graph.

Note that the example file Simul_Continuous_GARCH_R2.ox also reported the R2 of the Mincer-Zarnowitz regression using the realized volatility as endogenous variable. Not surprisingly, the R2 is very close to the previous value obtained for the IV t, i.e. 0.4219 vs. 0.4121.

7.4 Jumps

Empirical studies have shown that a continuous diffusion model as in Equation (7.5) fails to explain some characteristics of asset returns. Furthermore, standard GARCH models are not able to fully explain the excess kurtosis found in most financial time series. In a continuous time framework, the inadequacy of the standard stochastic diffusion model has led to developments of alternative models. Jump diffusion and stochastic volatility models have been proposed in the literature to overcome this inadequacy.

As common in the literature on non parametric estimation of the continuous and jump components in the quadratic variation (QVar), we suppose that the log-price process belongs either to the Brownian SemiMartingale (BSM) or the BSM with Jumps (BSMJ) families of models. Under the BSMJ model, the diffusion component captures the smooth variation of the price process, while the jump component accounts for the rare, large discontinuities in the observed prices. Andersen, Bollerslev, and Dobrev (2007) cite the work of several authors who found that this is a realistic model for the price series of many financial assets.

A BSMJ log-price diffusion admits the representation

dp(t) = μ(t)dt + σ(t)dW(t) + κ(t)dq(t),  0 ≤ t ≤ T,

where μ(t) is a continuous locally bounded variation process, σ(t) is a strictly positive and càdlàg (right-continuous with left limits) stochastic volatility process, W(t) is a standard Brownian motion, dq(t) is a counting process with dq(t) = 1 corresponding to a jump at time t and dq(t) = 0 otherwise. The (possibly time-varying) jump intensity is l(t) and κ(t) is the size of the corresponding jump.

Jumps in stock prices are often assumed to follow a probability law. For instance, the jumps may follow a Poisson process, which is a continuous-time discrete process. For a given time t, let Xt be the number of times a special event occurs during the time period [0,t]. Then Xt is a Poisson process if:

Pr(X_t = m) = ((lt)^m ∕ m!) exp(−lt),  l > 0.

The l parameter governs the occurrence of the special event and is referred to as the rate or intensity of the process. Recall also that E(X_t) = lt.
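As a sanity check of the Poisson law above, a short Python sketch (illustrative, with hypothetical rate l = 0.2 and horizon t = 5, i.e. one event per 5 days on average):

```python
import math

def poisson_pmf(m, l, t):
    """Pr(X_t = m) = (l*t)^m * exp(-l*t) / m! for a Poisson process with rate l."""
    return (l * t) ** m * math.exp(-l * t) / math.factorial(m)

# probabilities over all counts sum to one, and the mean count is l*t
l, t = 0.2, 5.0
total = sum(poisson_pmf(m, l, t) for m in range(100))
mean = sum(m * poisson_pmf(m, l, t) for m in range(100))
```

With l = 0.2 and t = 5, the expected number of events is lt = 1, the configuration used for the jump simulation below.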

G@RCH also provides a procedure Simul_Continuous_GARCH_JUMPS to simulate a continuous-time GARCH diffusion process with jumps, formally defined by

 dp(t)  =  σ(t)dW_p(t) + κ(t)dq(t),                      (7.15)
dσ²(t)  =  θ[ω − σ²(t)]dt + (2λθ)^{1∕2} σ²(t)dW_d(t),    (7.16)
  κ(t)  ~  σ(t) · U(√m ([−2,−1] ∪ [1,2])),               (7.17)
 dq(t)  ~  Poisson(l).                                   (7.18)

The jump size κ(t) is modeled as the product between σ(t) and a uniformly distributed random variable on √m ([−2,−1] ∪ [1,2]). The parameter m determines the magnitude of the jumps.
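A Python sketch of this jump diffusion (an illustrative stand-in for Simul_Continuous_GARCH_JUMPS; the default intensity l = 0.2 corresponds to one jump every 5 days on average):

```python
import numpy as np

def simul_garch_jumps(n_days, n_per_day, theta=0.035, omega=0.636, lam=0.296,
                      m_jump=0.5, l_intensity=0.2, sig2_0=0.1, seed=0):
    """Euler sketch of the jump diffusion (7.15)-(7.18): the GARCH diffusion
    plus compound-Poisson jumps with intensity l and jump size
    sigma(t) * sqrt(m) * U([-2,-1] U [1,2])."""
    D = 1.0 / n_per_day
    rng = np.random.default_rng(seed)
    M = n_days * n_per_day
    dp = np.empty(M)
    sig2 = sig2_0
    for k in range(M):
        zp, zd = rng.standard_normal(2)
        jump = 0.0
        if rng.random() < l_intensity * D:     # dq(t) = 1 with probability l*dt
            u = rng.uniform(1.0, 2.0) * rng.choice([-1.0, 1.0])
            jump = np.sqrt(sig2) * np.sqrt(m_jump) * u
        dp[k] = np.sqrt(sig2 * D) * zp + jump
        sig2 = theta * omega * D + sig2 * (1.0 - theta * D
                                           + np.sqrt(2.0 * lam * theta * D) * zd)
    return dp.reshape(n_days, n_per_day)       # intraday returns, one row per day

ret = simul_garch_jumps(n_days=10, n_per_day=288)
```

Summing the squared rows of `ret` gives the RV series whose divergence from IV_t is visible in Figure 7.6.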

Figure 7.6 plots IV_t and RV_t for 500 days of simulated intraday returns. The DGP is given by Equations (7.15)-(7.18) with the same GARCH parameters as before, but m is set to 0.5 and l is chosen such that there is on average one jump every 5 days. The Ox code used to get this figure is Simul_Continuous_GARCH_JUMPS_IV_RV.ox.

Importantly, it is clearly visible that the realized volatility does not match the spot and integrated volatility.


Figure 7.6: Integrated Volatility based and Realized Volatility for a continuous-time GARCH model with jumps

This result is not surprising since we know by the theory of quadratic variation that for Δ → 0, we have the following convergence in probability:

RV_t(Δ) → ∫_{t−1}^{t} σ²(s)ds + Σ_{t−1<s≤t} κ²(s).

In other words, in the absence of jumps, the realized volatility is a consistent estimator of the integrated volatility, but this is no longer true in the presence of jumps. Alternatives to the realized volatility that are robust to jumps are discussed in Section 7.6.

7.5 Intraday Periodicity

Temporal dependence in volatility is one of the most striking features of financial series recorded at various frequencies. Interestingly, new phenomena become visible as one proceeds from daily returns to intraday returns.

The aim of this section is to show how G@RCH can be used to estimate the intraday (or intraweek) periodicity in volatility.

The codes used to prepare this section are also available in the folder /packages/Realized/samples/.

Note that the dataset has been simulated in such a way that the outputs and graphs obtained when running this example should be very similar to the ones reported in this section.

7.5.1 Data

Like Andersen and Bollerslev (1997), Andersen and Bollerslev (1998a), Martens, Chang, and Taylor (2002) and Taylor and Xu (1997), we study the periodicity in intraday volatility of 5-minute EUR/USD returns.3

Our data set ranges from January 2000 to October 2004 and time is expressed in Eastern Time (EST, EST = GMT − 5), taking into account the daylight saving time shifts in Europe and the USA. All the returns from Friday 16.05 EST through Sunday 16.00 EST are excluded. We also exclude from the sample some regular (Christmas, New Year...) and irregular (Easter...) holidays. We finally delete days for which the data quality is dubious. We are left with T = 1175 days of M = 288 equally-spaced and continuously compounded 5-minute return observations r_{t,i}, as currencies are traded around the clock. The data consists of the last mid point of the interval (last average of log(bid) and log(ask)). The first interval of day t is 16.05 EST, the last is 16.00 EST.

7.5.2 Evidence of intraday periodicity

Figure 7.7 (obtained by running the example file Periodicity.ox) plots the lag 1 through 1440 (i.e. 288 × 5, or 5 days) sample autocorrelations for the five-minute absolute returns |r_{t,i}|.


Figure 7.7: ACF of 5-minute absolute returns of the EUR-USD from January 2000, through October 2004

This graph displays a distinct U-shaped pattern in the ACF of absolute returns. Standard ARCH models imply a geometric decay in the absolute return autocorrelation structure and simply cannot accommodate strong regular cyclical patterns of the sort displayed in Figure 7.7.

The volatility dynamics of intraday financial time series are complex. There are intraday volatility patterns reflecting the daily activity cycle of the regional centers, as well as weekend effects, macroeconomic announcement effects (immediately following the release) and standard volatility clustering at the daily level.

The behaviour of a time series is called periodic if it shows a periodic structure in addition to less regular movements. The foreign exchange (FX) market shows strong periodic effects caused by the presence of the traders in the three major markets according to the hour of the day, the day of the week and the Daylight Saving Times. The global FX market consists of three major markets, i.e., Asia, Europe and North America, and the major movements of intradaily return volatility can be attributed to the passage of market activity around the globe.


Figure 7.8: Intraday average absolute returns for the EUR-USD

Figure 7.8 depicts the average absolute returns over the (288) five-minute intervals. This intraday pattern is quite similar across all days of the week with discrete changes in quoting activity marking the opening and closing of business hours in the three major regional centres, all of which have their own activity pattern.

The following hours can be used as indicative: the Far East is open from 16:00 EST (21:00 GMT) to 1:00 EST (6:00 GMT), Europe trades between 2:00 EST (7:00 GMT) and 11:00 EST (16:00 GMT) and trading in North America occurs from 7:00 EST (12:00 GMT) to 16:00 EST (21:00 GMT). Using the discussion of market openings and closures presented above, we explain the intraday periodic volatility as follows. At 19:00 EST, the Far Eastern market has already been trading for around three hours and market activity is high. From 19:00 EST until about 22:00 EST, activity levels and volatility remain high. The lunchtime in Tokyo (22:00 EST - 23:45 EST) is the point of the day corresponding to the most prominent feature of the series. Volatility drops sharply and regains its former value at about 0:00 EST. Generally, Europe begins to contribute to activity at around 2:00 EST as the Far Eastern market begins to wane: there is a small peak in volatility. During European lunch hours (starting around 6:30 EST), both activity and volatility show a slight lull. The most active period of the day is clearly when both the European and North American markets are open (between 7:00 EST and 11:00 EST). Volatility starts to decline as first the European and then the US markets wind down. At around 16:00 EST, the Pacific market begins to trade again and the daily cycle is repeated after midnight. This intraday pattern is consistent with previous evidence reported in the literature, see Andersen and Bollerslev (1998b) among others.

Failure to take account of this intra-daily periodicity is likely to result in misleading statistical analyses as shown by Andersen and Bollerslev (1997) (in the sense that intraday returns do not conform at all to the theoretical aggregation results for the GARCH models).

The periodic phenomena in the volatility of FX markets can be modelled in a variety of ways. A GARCH specification with dummy variables can be used for modelling the conditional volatility. For our dataset, this would require estimating 288 time-of-day parameters if one dummy variable were created for each five-minute interval. Such a large number of parameters is unlikely to be effective in capturing the complexity of the periodic patterns.

7.5.3 Classical and robust estimation of intraday periodicity

Recall that we dispose of T days of M equally-spaced and continuously compounded intraday returns and that rt,i is the i-th return of day t. We assume the log-price follows a Brownian SemiMartingale with Finite Activity Jumps (BSMFAJ) diffusion.

We assume also that, for small values of Δ, the returns r_{t,i} in an interval without jumps in the underlying price diffusion are conditionally normally distributed with mean zero and variance σ²_{t,i} = ∫_{t+(i−1)Δ}^{t+iΔ} σ²(s)ds.

Due to the weekly cycle of opening, lunch and closing times of the financial centers around the world, the high-frequency return variance σt,i2 has a periodic component ft,i2.

Depending on the nature of the analysis, there exists a natural window length for which almost all variation in σ²_{t,i} during the window can be attributed to f²_{t,i}, such that s²_{t,i} ≡ σ²_{t,i}∕f²_{t,i} is approximately constant over the local window. Henceforth G@RCH follows the suggestion of Andersen and Bollerslev (1997, 1998b), Andersen, Bollerslev, and Dobrev (2007) and Lee and Mykland (2008) to use local windows of one day.4

Andersen and Bollerslev (1997, 1998b) suggest to estimate s_{t,i} by ŝ_t = √(h_t∕M), i = 1,...,M, where h_t is the conditional variance of day t obtained by estimating a GARCH model on daily returns. Under the BSM model, a more efficient estimator for s_{t,i} is ŝ_t = √(RV_t∕M).

As we will see in Section 7.6.1, under the BSMJAJ model, st,i is better approximated using a normalized version of Barndorff-Nielsen and Shephard (2004b)’s realized bi-power variation, i.e. ŝt,i = ∘ --1-----
   M-1BVt where BV t is the bi-power variation computed on all the intraday returns of day t (see Equation (7.38)).

To ensure identifiability of the periodicity factor ft,i with respect to the average variance of day t, we impose that the squared periodicity factor has mean one over the local window.

Under this model, when Δ → 0 and if r_{t,i} is not affected by jumps, the standardized high-frequency return r̄_{t,i} = r_{t,i}/ŝ_{t,i} is conditionally normally distributed with mean zero and variance equal to the squared periodicity factor. This result suggests estimating the periodicity factor using either a non parametric or a parametric estimator of the scale of the standardized returns. The estimator has to be robust to price jumps.

Non parametric estimation of periodicity

The non parametric periodicity estimator is based on a scale estimate of the standardized returns r̄_{t,i} = r_{t,i}/ŝ_{t,i} that share the same periodicity factor.

Let r̄_{1;t,i},…,r̄_{n_{t,i};t,i} be the set of n_{t,i} returns having the same periodicity factor as r̄_{t,i}. If we condition the estimation of the periodicity factor only on calendar effects, r̄_{1;t,i},…,r̄_{n_{t,i};t,i} are the returns observed at the same time of the day and day of the week as r̄_{t,i}.

The classical periodicity estimator is based on the standard deviation

f̂^SD_{t,i} = SD_{t,i} / √((1/M) Σ_{j=1}^M SD²_{t,j}),   (7.20)

where SD_{t,i} = √((1/n_{t,i}) Σ_{j=1}^{n_{t,i}} r̄²_{j;t,i}). This estimator is similar to Taylor and Xu (1997)’s periodicity estimate based on averages of squared returns.
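As a sketch, the filter amounts to a root mean square per intraday bin followed by a mean-one normalization. The snippet below is illustrative Python, not the package’s Ox code; the array layout (one column per time-of-day bin, grouping by time-of-day only) and the function name are assumptions:

```python
import numpy as np

def sd_periodicity(std_returns):
    # std_returns: (T, M) array of returns already standardized by the
    # daily scale s_t; column i collects the returns sharing factor f_i
    sd = np.sqrt(np.mean(std_returns ** 2, axis=0))   # SD_i per intraday bin
    return sd / np.sqrt(np.mean(sd ** 2))             # impose mean(f^2) = 1

# on homoskedastic data the estimated factor should be close to one
rng = np.random.default_rng(0)
f_hat = sd_periodicity(rng.standard_normal((5000, 288)))
```

Conditioning on the day of the week as well, as the text describes, would simply mean applying the same computation to five separate (T/5, M) sub-arrays.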

In the absence of jumps, the standard deviation is efficient if the standardized returns are normally distributed. In the presence of jumps, this estimator is useless, since it suffices that one observation in the sample is affected by a jump to make the periodicity estimate arbitrarily large.

Because of the non-robustness of the previous estimator in the presence of jumps, Boudt, Croux, and Laurent (2008b) propose to replace the standard deviation in (7.20) by a robust non parametric estimator.

A first candidate is the so-called median absolute deviation (MAD). The MAD of a sequence of observations y_1,…,y_n is defined as

1.486 · median_i |y_i − median_j y_j|,

where 1.486 is a correction factor that guarantees that the MAD is a consistent scale estimator at the normal distribution. The MAD estimator for the periodicity factor of r̄_{t,i} equals

f̂^MAD_{t,i} = MAD_{t,i} / √((1/M) Σ_{j=1}^M MAD²_{t,j}),

where MAD_{t,i} is the MAD of r̄_{1;t,i},…,r̄_{n_{t,i};t,i}.

Amongst the large number of robust scale estimators available in the literature (see Maronna, Martin, and Yohai 2006, for an overview), Boudt, Croux, and Laurent (2008b) also recommend the use of the Shortest Half scale estimator proposed by Rousseeuw and Leroy (1988), because it remains consistent in the presence of infinitesimal contaminations by jumps in the data. Importantly, among a wide class of scale estimators, it is the estimator for which jumps can cause the smallest maximum bias possible. Under normality, it has the same efficiency as the MAD and the interquartile range. A final advantage is that it is computationally convenient and does not need any location estimation. For the definition of the Shortest Half scale estimator, we need the corresponding order statistics r̄_{(1);t,i},…,r̄_{(n_{t,i});t,i} such that r̄_{(1);t,i} ≤ r̄_{(2);t,i} ≤ … ≤ r̄_{(n_{t,i});t,i}. The shortest half scale is the smallest length of all “halves” consisting of h_{t,i} = ⌊n_{t,i}/2⌋ + 1 contiguous order statistics. These halves equal {r̄_{(1);t,i},…,r̄_{(h_{t,i});t,i}}, …, {r̄_{(n_{t,i}-h_{t,i}+1);t,i},…,r̄_{(n_{t,i});t,i}}, and their lengths are r̄_{(h_{t,i});t,i} − r̄_{(1);t,i}, …, r̄_{(n_{t,i});t,i} − r̄_{(n_{t,i}-h_{t,i}+1);t,i}, respectively. The corresponding scale estimator (corrected for consistency under normality) equals the minimum of these lengths:

ShortH_{t,i} = 0.741 · min{r̄_{(h_{t,i});t,i} − r̄_{(1);t,i}, …, r̄_{(n_{t,i});t,i} − r̄_{(n_{t,i}-h_{t,i}+1);t,i}}.

The Shortest Half estimator for the periodicity factor of r̄_{t,i} equals

f̂^ShortH_{t,i} = ShortH_{t,i} / √((1/M) Σ_{j=1}^M ShortH²_{t,j}).
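The shortest-half computation is easy to express directly on the sorted data. The following Python sketch is illustrative, not the G@RCH implementation; it also shows the estimator’s robustness to a cluster of large outliers:

```python
import numpy as np

def shorth_scale(x):
    x = np.sort(np.asarray(x))
    n = x.size
    h = n // 2 + 1                        # size of a contiguous "half"
    lengths = x[h - 1:] - x[:n - h + 1]   # length of every such half
    return 0.741 * lengths.min()          # 0.741 = normal consistency factor

rng = np.random.default_rng(1)
clean = rng.standard_normal(100_000)
contaminated = np.concatenate([clean, 10.0 + rng.standard_normal(5_000)])
s_clean = shorth_scale(clean)         # close to the true scale 1
s_cont = shorth_scale(contaminated)   # barely moved by ~5% far outliers
```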

The shortest half dispersion is highly robust to jumps, but it has only a 37% efficiency under normality of the r̄_{t,i}’s. Boudt, Croux, and Laurent (2008b) show that a better trade-off between the efficiency of the standard deviation under normality and the high robustness to jumps of the shortest half dispersion is offered by the standard deviation applied to the returns, weighted according to their outlyingness under the ShortH estimate, i.e.

f̂^WSD_{t,i} = WSD_{t,i} / √((1/M) Σ_{j=1}^M WSD²_{t,j}),   (7.25)

where

WSD_{t,j} = √(1.081 · [Σ_{l=1}^{n_{t,j}} w[(r̄_{l;t,j}/f̂^ShortH_{t,j})²] r̄²_{l;t,j}] / [Σ_{l=1}^{n_{t,j}} w[(r̄_{l;t,j}/f̂^ShortH_{t,j})²]]).

Because the weighting is applied to the squared standardized returns, which are extremely large in the presence of jumps, Boudt, Croux, and Laurent (2008b) recommend the use of hard rejection with threshold equal to the 99% quantile of the χ² distribution with one degree of freedom, i.e.

w(z) = 1 if z ≤ 6.635, and 0 otherwise.

The factor 1.081 is needed to ensure consistency of the estimator under normality. The Weighted Standard Deviation (WSD) in (7.25) has a 69% efficiency under normality of the rt,i’s.
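The WSD step itself can be sketched as follows (illustrative Python, not the package’s Ox code; in practice the initial scale would be the ShortH-based estimate, here simply passed in as an argument):

```python
import numpy as np

def wsd_scale(x, initial_scale):
    # hard-reject observations whose squared standardized value exceeds
    # the 99% quantile of chi2(1), i.e. 6.635, then rescale by 1.081
    keep = (x / initial_scale) ** 2 <= 6.635
    return np.sqrt(1.081 * np.sum(x[keep] ** 2) / np.sum(keep))

rng = np.random.default_rng(2)
x = rng.standard_normal(100_000)
s = wsd_scale(x, 1.0)   # approximately 1 on clean N(0,1) data
```

The factor 1.081 compensates for the variance lost by trimming the 1% most extreme observations of a normal sample.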

Note that when the dataset is ’dated’, the length of the periodicity filter equals 5 days, otherwise it is equal to one day.

Parametric estimation of periodicity

The non parametric estimators for the periodic component of intraday volatility use only the subset of the data for which the returns have the same periodicity factor. Andersen and Bollerslev (1997) show that more efficient estimates can be obtained if the whole time series dimension of the data is used for the estimation of the periodicity process. Andersen and Bollerslev (1997) use the result that, under the assumption that returns are not affected by jumps (i.e. the BSM model), r_{t,i} ≈ f_{t,i} s_{t,i} u_{t,i}, where u_{t,i} ~ N(0,1), and so log(|r_{t,i}/s_{t,i}|) ≈ log f_{t,i} + log|u_{t,i}|, which allows one to isolate f_{t,i} as follows,

log(|r_{t,i}/s_{t,i}|) − c = log f_{t,i} + ε_{t,i},   (7.27)

where the error term εt,i is i.i.d. distributed with mean zero and has the density function of the centered absolute value of the log of a standard normal random variable, i.e.

g(z) = √(2/π) exp[z + c − 0.5 exp(2(z + c))].
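The centering constant c used here has a closed form, c = −(γ + log 2)/2 with γ the Euler-Mascheroni constant, which can be checked numerically:

```python
import numpy as np

# closed form: E[log|Z|] = -(Euler-Mascheroni + log 2)/2 for Z ~ N(0,1)
c = -(np.euler_gamma + np.log(2.0)) / 2.0

# Monte Carlo cross-check of the same quantity
rng = np.random.default_rng(3)
c_mc = float(np.mean(np.log(np.abs(rng.standard_normal(2_000_000)))))
```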

The parameter c = -0.63518 equals the mean of the log of the absolute value of a standard normal random variable. Andersen and Bollerslev (1997) then propose to model log ft,i as a function h of a vector of variables x (such as sinusoid and polynomial transformations of the time of the day) that is linear in the parameter vector θ

log f_{t,i} = h(x_{t,i}; θ) = x′_{t,i} θ.   (7.29)

Combining (7.27) with (7.29), we obtain the following regression equation

log(|r̄_{t,i}|) − c = x′_{t,i} θ + ε_{t,i}.   (7.30)

It is common to estimate the parameter θ in (7.30) by the OLS estimator. This approach is neither efficient nor robust, since the error terms are not normally distributed. Denote the loss functions of the OLS and Maximum Likelihood (ML) estimators by ρ_OLS(z) = z² and by

ρ_ML(z) = −0.5 log(2/π) − z − c + 0.5 exp(2(z + c)),

respectively. The OLS and ML estimates equal

θ̂^OLS = argmin_θ (1/(MT)) Σ_{t=1}^T Σ_{i=1}^M ρ_OLS(ε_{t,i})  and  θ̂^ML = argmin_θ (1/(MT)) Σ_{t=1}^T Σ_{i=1}^M ρ_ML(ε_{t,i}),

where εt,i is a function of θ (see Equation (7.30)).
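As an illustration of the OLS step in (7.30), the following Python sketch simulates a known periodicity pattern and recovers θ by least squares; the two-harmonic design, the parameter values and all names are assumptions for the example, not the G@RCH code:

```python
import numpy as np

rng = np.random.default_rng(4)
T, M, P = 500, 288, 2
i = np.arange(1, M + 1)
X = np.column_stack(
    [np.cos(2 * np.pi * j * i / M) for j in range(1, P + 1)]
    + [np.sin(2 * np.pi * j * i / M) for j in range(1, P + 1)])
theta_true = np.array([-0.25, 0.07, -0.50, 0.32])
f = np.exp(X @ theta_true)                   # true periodicity factor

c = -0.63518
u = rng.standard_normal((T, M))
y = np.log(np.abs(f * u)) - c                # log|r/s| - c, day by day
theta_ols, *_ = np.linalg.lstsq(np.tile(X, (T, 1)), y.ravel(), rcond=None)
```

OLS is consistent here because E[log|u|] = c, so the error in (7.30) has mean zero; the point of the section is that it is neither efficient nor robust.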

Boudt, Croux, and Laurent (2008b) find that the ML estimator has a large bias in the presence of jumps. The non-robustness of the OLS and ML estimators to jumps is due to the unbounded effect an observation can have on their loss functions. These loss functions are plotted in Figure 7.9. As mentioned by Martens, Chang, and Taylor (2002), the effect of jumps on the OLS and ML estimators is nevertheless attenuated because the regression is based on the log of the standardized returns.

Figure 7.9: Loss functions associated with the OLS and ML estimators. The vertical lines denote the likelihood threshold and the upper and lower truncation levels based on the 99.5% quantile, used by the truncated ML estimator

As an alternative to the OLS and ML estimators, Boudt, Croux, and Laurent (2008b) propose to use the Truncated Maximum Likelihood (TML) estimator introduced by Marazzi and Yohai (2004). This estimator gives a zero weight to observations that are outliers according to the value of the ML loss function evaluated at the corresponding residual computed under the robust non parametric estimator f̂^WSD in (7.25). Let

e^WSD_{t,i} = log|r̄_{t,i}| − c − log f̂^WSD_{t,i}.

Observations for which ρ_ML(e^WSD_{t,i}) is large have a low likelihood and are therefore likely to be outliers (Marazzi and Yohai, 2004). Let q denote an extreme upper quantile of the distribution of ε_{t,i}. The TML estimator is defined as

θ̂^TML = argmin_θ [1 / (Σ_{t=1}^T Σ_{i=1}^M w_{t,i})] Σ_{t=1}^T Σ_{i=1}^M w_{t,i} ρ_ML(ε_{t,i}),

with

w_{t,i} = 1 if ρ_ML(e^WSD_{t,i}) ≤ ρ_ML(q), and 0 otherwise.
The truncation on the basis of the value of the ML loss function is illustrated in Figure 7.9 for the 99.5% quantile. All observations with ρ_ML(e^WSD_{t,i}) > 3.36 receive a zero weight in the objective function of the TML estimator.

As for the non parametric periodicity estimators, we impose that the squared periodicity factor has mean one in the local window. The parametric estimate of the periodicity factor thus equals

f̂^TML_{t,i} = exp(x′_{t,i} θ̂^TML) / √((1/M) Σ_{j=1}^M (exp(x′_{t,j} θ̂^TML))²),

and similarly for f̂^OLS_{t,i} and f̂^ML_{t,i}.

7.5.4 First illustration on simulated data

In this section we show how to use G@RCH 6.1 to estimate the periodicity filters presented above on simulated data.

The non parametric periodicity filters presented in Section 7.5.3 are implemented in the functions TaylorXu(...) and Robust_periodicity(...). Their use is illustrated in example file NP_periodicity_filters.ox. This example uses the simulated data obtained by running the files Store_Cont_GARCH_...ox available in the directory /packages/Realized/samples/. There are four different files simulating 5-min (FX) returns from a continuous-time GARCH(1,1) diffusion with or without intraday periodicity in volatility and with or without jumps. The simulated data are then stored in the directory /packages/Realized/samples/data/.

Let us concentrate first on Store_Cont_GARCH_FFF.ox. The aim of this program is to generate 5-minute returns following a continuous-time stochastic volatility model without drift, where the spot volatility σ(s) is specified as the product of the periodicity function f, which depends only on the intraday time s − ⌊s⌋, and a GARCH diffusion process (see Equation 7.9), i.e.

σ(s) = f(s − ⌊s⌋) σ_garch(s),

where log f(s − ⌊s⌋) = θ′x(s − ⌊s⌋) and s − ⌊s⌋ is the intraday time.

σ_garch(s) is generated using the values θ = 0.035, ω = 0.636 and λ = 0.296. The function f(s − ⌊s⌋) used in the simulation is based on log f(s − ⌊s⌋) = θ′x(s), with x(s) a vector holding sinusoid transformations of s, i.e.

log f_{t,i} = Σ_{j=1}^P (γ_j cos(2πij/M) + δ_j sin(2πij/M)),

where M = 1/Δ, P is set to 4 and θ = (γ_1, δ_1, …, γ_4, δ_4) = (−0.24422; −0.49756; −0.054171; 0.073907; −0.26098; 0.32408; −0.11591; −0.21442). In this case, the length of the periodicity cycle is one day and the periodicity is thus the same for all the days of the week. To simulate exchange rates, we choose Δ = 1/2880, corresponding to 10 observations per 5-minute interval. The number of simulated days is 3010, but the first ten days are discarded, which leaves a total of 3000 simulated days.

After running this file, the simulated data are automatically saved to ./data/Simulated_cont_GARCH_no_jumps_FFF.in7.

Figure 7.10 plots the true periodicity (dotted line) and the estimated periodicity (solid line) obtained by applying the four non parametric filters to 3000 days of 5-min simulated returns (288 returns per day). Since this model is compatible with the BSM model, the four methods should provide good estimates of the periodicity, and indeed all four deliver excellent estimates. In a more comprehensive Monte Carlo study, Boudt, Croux, and Laurent (2008b) have shown that, in the class of non parametric estimators, the SD is the most efficient estimator.

Figure 7.11 plots the true periodicity (dotted line) and the estimated periodicity (solid line) obtained by applying the three parametric filters on the same data (using the file P_periodicity_filters.ox). Table 7.1 reports the estimated values for the three methods.

As expected, we see that the three methods provide excellent estimates of the periodicity.

Table 7.1: Estimation results for the periodicity filter. DGP is a GARCH(1,1) with periodicity

             True       OLS        ML         TML
cos(2PI1i/M) -0.24422   -0.24059   -0.23810   -0.23804
sin(2PI1i/M) -0.49756   -0.49946   -0.49561   -0.49699
sin(2PI2i/M)  0.073907   0.073229   0.072583   0.073019
cos(2PI3i/M) -0.26098   -0.27287   -0.27080   -0.27080
sin(2PI3i/M)  0.32408    0.31672    0.31546    0.31562
cos(2PI4i/M) -0.11591   -0.10828   -0.10778   -0.10728
sin(2PI4i/M) -0.21442   -0.21937   -0.21617   -0.21666

Boudt, Croux, and Laurent (2008b) also show that all parametric estimators are more precise than the non parametric estimators. Among the parametric estimators, they find that the ML estimator is efficient. The TML estimator is found to be only slightly less efficient than the ML estimator and, as expected, the OLS estimator (Gaussian QML) is the least precise of all parametric estimators.


Figure 7.10: Non parametric periodicity filters. DGP=GARCH(1,1)


Figure 7.11: Parametric periodicity filters. DGP=GARCH(1,1)

Let us now consider the case of a BSMJ model (see Equations (7.15)-(7.18), with the same GARCH parameters as before), with μ(s) = 0 and σ(s) specified as the same multiplicative process as in (7.35) (with the same parameter values as in the previous simulation).

The jump size κ(s) is modeled as the product between σ(s) and a uniformly distributed random variable on √m·([−2,−1] ∪ [1,2]). The parameter m determines the magnitude of the jumps. We set m equal to 0.3 (rather small jumps). Finally, the jump occurrences q(s) are specified as a Poisson process with on average K = 2 jumps per day. Importantly, occurrences of jumps are concentrated in the parts of the day when volatility is periodically very low (f < 0.45). Figure 7.12 plots the number of simulated jumps per intraday period of time. The simulated data, i.e. Simulated_cont_GARCH_jumps_FFF_K_2_M_0.3.in7, are obtained by running Store_Cont_GARCH_Jumps_FFF.ox with the options K = 2, M = 0.3 and threshold = 0.45.5

Figure 7.13 plots the estimated periodicity obtained by applying the four non parametric filters on the simulated 5-min returns (using NP_periodicity_filters.ox).


Figure 7.12: Number of simulated jumps per intraday period of time. Jumps only occur when f < 0.45

In this case the SD is expected to be inappropriate because of its non robustness to jumps. This is illustrated in Figure 7.13, where we see that Taylor and Xu (1997)’s periodicity filter (i.e. based on the SD) is very inaccurate while the three robust alternatives are as good as in the previous case of no jump. Again, in a more comprehensive Monte Carlo study, Boudt, Croux, and Laurent (2008b) show that the robust non parametric estimators are little affected by the inclusion of jumps in the price process.

Figure 7.14 plots the true periodicity (dotted line) and the estimated periodicity (solid line) obtained by applying the three parametric filters to the same data (using the file P_periodicity_filters.ox). Table 7.2 reports the estimated values for the three methods. The ML estimator is found to be extremely sensitive to jumps; its optimality is thus restricted to the model without jumps. Since jumps do occur in practice, we do not recommend its use.

Table 7.2: Estimation results for the periodicity filter. DGP is a GARCH(1,1) with periodicity and jumps

             True       OLS        ML         TML
cos(2PI1i/M) -0.24422   -0.24059   -0.23810   -0.23804
sin(2PI1i/M) -0.49756   -0.49946   -0.49561   -0.49699
sin(2PI2i/M)  0.073907   0.073229   0.072583   0.073019
cos(2PI3i/M) -0.26098   -0.27287   -0.27080   -0.27080
sin(2PI3i/M)  0.32408    0.31672    0.31546    0.31562
cos(2PI4i/M) -0.11591   -0.10828   -0.10778   -0.10728
sin(2PI4i/M) -0.21442   -0.21937   -0.21617   -0.21666

The robustness of the OLS estimator is surprising at first sight, but it corroborates Martens, Chang, and Taylor (2002)’s intuition that the log-transformation shrinks the outliers and makes the estimators based on a regression of the log absolute returns more robust to jumps. However, while being much better than the ML estimator, the OLS (or Gaussian QML) estimator is quite affected by jumps, which is apparently not the case for the TML estimator (for which the fit is again almost perfect).

The main message of this simulation study is that the non parametric WSD and parametric TML estimators have a relatively high efficiency in the absence of jumps. If jumps are present in the process, they are the most accurate of all non parametric and parametric estimators considered, respectively. In all cases considered here, the TML estimator based on the correctly specified periodicity function is more efficient than the WSD estimator. While the advantage of the TML over the WSD is clear when the functional form of the periodicity is known, its major drawbacks are that this functional form is not known a priori in practical applications and its estimation is much more time consuming.


Figure 7.13: Non parametric periodicity filters. DGP=GARCH(1,1) with jumps


Figure 7.14: Parametric periodicity filters. DGP=GARCH(1,1) with jumps

7.5.5 Second illustration on EUR/USD data

In this section, we use the dataset described in Section 7.5.1, i.e. 5-min returns of the Euro - US dollar (EUR-USD) exchange rate from January 2000 to October 2004.

To model the periodic component, we use a slightly more general specification than the one used in the previous simulation studies, i.e.

log f_{t,i} = μ_1 (i/N_1) + μ_2 (i²/N_2) + Σ_{j=1}^P (γ_j cos(2πij/M) + δ_j sin(2πij/M)) + Σ_{k=1}^D λ_k I(k)_{t,i},

where M = 1/Δ, P is set to 4, and N_1 = (M+1)/2 and N_2 = (2M² + 3M + 1)/6 are normalizing constants. Note that Andersen and Bollerslev (1997) originally used the normalizing constant N_2 = (M+1)(M+2)/6. However, the analytic solution for the sum of squared integers is Σ_{k=1}^M k² = (2M³ + 3M² + M)/6, which leads to the value reported above when computing the mean instead of the sum.
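The two normalizing constants are simply the means of i and i² over i = 1,…,M, which is easy to verify numerically:

```python
# sanity check of the normalizing constants for M = 288 five-minute bins
M = 288
N1 = (M + 1) / 2                  # mean of i over i = 1..M
N2 = (2 * M**2 + 3 * M + 1) / 6   # mean of i^2 over i = 1..M
```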

The smooth periodicity generated by the Fourier terms is unlikely to cope well with the sharp drop in volatility around lunch in the Far East, for instance, or with day-of-the-week dependencies. To fill this gap, the term Σ_{k=1}^D λ_k I(k)_{t,i} is included, where I(k)_{t,i} is an indicator variable for event k during interval i of day t. The events may be calendar effects (day-of-the-week dummies, etc.) as well as announcement effects. For instance, to account for announcement effects we have included two dummies that equal one on Friday at 8:35 and 8:40 EST.6

Figure 7.15 plots f̂^SD_{t,i}, f̂^ShortH_{t,i} and f̂^WSD_{t,i} from Monday to Friday (i.e. 288 × 5 values), while Figure 7.16 plots f̂^OLS_{t,i}, f̂^ML_{t,i} and f̂^TML_{t,i}. These graphs have been obtained by running the example file Periodicity.ox. Note that, for ease of comparison between non parametric and parametric methods, Figure 7.16 also plots f̂^WSD_{t,i} on each graph.

We see that, while not radically different, the three non parametric estimates are not equivalent. For instance, the periodicity factor is found to be much higher on Friday at 8:35 (peak around observation 1350) for the non-robust f̂^SD_{t,i} estimator than for the robust ones.

The same comment applies to the parametric estimates: f̂^OLS_{t,i} and f̂^ML_{t,i} seem to overestimate the periodicity on that particular interval. Looking at Table 7.3, we see that the estimated coefficient for the ‘Friday 8:35’ dummy is about 1, 1.3, and 0.55 for the OLS, ML and TML methods respectively.

Note also that, while the parametric and non parametric estimates are very close to each other, the specification chosen for the parametric methods does not fully capture the intraday periodicity and needs further refinement.


Figure 7.15: Estimated non parametric periodicity filters for the EUR-USD


Figure 7.16: Estimated parametric periodicity filters for the EUR-USD

Table 7.3: Estimation results for the parametric periodicity filters

                 OLS         ML          TML
Monday Morning  -0.033321    0.0025972  -0.0050530
Tuesday          0.0066779  -0.019706   -0.015382
Wednesday        0.026778   -0.020484   -0.0056105
Thursday         0.014360   -0.028472   -0.022191
Friday          -0.036773   -0.058797   -0.047153
Friday 8:35      1.0090      1.3076      0.55753
Friday 8:40      0.44006     0.42802     0.33174
i/N1            -0.99591    -0.072911   -0.45299
i2/N2            0.69094     0.055195    0.33335
cos(2PI1i/M)    -0.37557    -0.13875    -0.24023
sin(2PI1i/M)    -0.36311    -0.31468    -0.30830
cos(2PI2i/M)    -0.088932   -0.040887   -0.061628
sin(2PI2i/M)    -0.052102   -0.036089   -0.037174
cos(2PI3i/M)    -0.17503    -0.11762    -0.13831
sin(2PI3i/M)     0.083212    0.071998    0.084067
cos(2PI4i/M)     0.061097    0.069227    0.066619
sin(2PI4i/M)    -0.080584   -0.067013   -0.070710

7.6 Robust to jumps volatility measures

We have seen in Section 7.4 that, in the absence of jumps, realized volatility is a consistent estimator of integrated volatility. In the presence of jumps, however, it is no longer consistent. Furthermore, it has been shown in the literature (see Andersen, Bollerslev, and Diebold, 2007) that jumps are much less persistent (and predictable) than the continuous sample path, or integrated volatility, component of the quadratic variation process. It is thus useful to separate the two components of the quadratic variation process. This was first done by Barndorff-Nielsen and Shephard (2004b) through the use of the so-called Bi-Power Variation (BPVar). In this section we also review an alternative to the BPVar, namely the Realized Outlyingness Weighted Variance (ROWVar), recently proposed by Boudt, Croux, and Laurent (2008a).

7.6.1 Bi-Power Variation

Barndorff-Nielsen and Shephard (2004b) show that, for a subclass of BSMJ price diffusions (i.e. BSM with Finite Activity Jumps), the normalized sum of products of the absolute value of contiguous returns (i.e. bi-power variation) is a consistent estimator for the IVar.

The BPVar (implemented in the procedure Compute_BV()) is defined as:

BV_t(Δ) ≡ μ_1^{-2} [M/(M−1)] Σ_{i=2}^M |r_{t,i}| |r_{t,i−1}|,   (7.38)

where μ_1 ≡ √(2/π) ≈ 0.79788 and M = 1/Δ.
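A minimal Python sketch (illustrative, not the package’s Compute_BV() procedure) shows why the product of contiguous absolute returns dampens a single jump, whereas the squared-return building block of the RVar does not:

```python
import numpy as np

mu1 = np.sqrt(2 / np.pi)

def rv(r):
    return float(np.sum(r ** 2))

def bv(r):
    M = r.size
    a = np.abs(r)
    return float(mu1 ** -2 * M / (M - 1) * np.sum(a[1:] * a[:-1]))

rng = np.random.default_rng(5)
r = rng.standard_normal(288) / np.sqrt(288)   # one day, unit IV, no jumps
r_jump = r.copy()
r_jump[100] += 2.0                            # add a single large jump
```

The jump enters RV as its square, but enters BV only multiplied by the (small) adjacent absolute returns.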

Unlike the RVar, the BPVar is designed to be robust to jumps because its building block is the product of two consecutive absolute returns instead of a squared return. If one of the returns corresponds to a jump and the next one follows the BSM diffusion process, then the product has a small impact on the BPVar, which is the sum of many of these building blocks. If the jump process has finite activity,7 then almost surely jumps cannot affect two contiguous returns as Δ → 0 (or equivalently M → ∞), and the jump process has a negligible impact on the probability limit of the BPVar, which coincides with the IVar. Under the BSM with Finite Activity Jumps (BSMFAJ) model one has

plim_{Δ→0} BV_t(Δ) = ∫_{t−1}^t σ²(s) ds.

To illustrate the applicability of the above results, let us consider the following figure, obtained by running the example file Simul_Continuous_GARCH_Jumps_1.ox. The DGP is a continuous-time GARCH(1,1) with on average one jump every 5 days (with m = 0.5).

Looking at Figure 7.17, we see that, unlike RV_t(Δ) (see Figure 7.6), BV_t(Δ) remains a good estimate of the integrated volatility (IV_t) in the presence of jumps.


Figure 7.17: Integrated Volatility and Bi-Power Variation for a continuous-GARCH jump process

7.6.2 Realized Outlyingness Weighted Variance

For the impact of jumps on the BPVar to be negligible, the high-frequency returns need to be observed over extremely short time intervals (Δ → 0), so that jumps do not affect two contiguous returns and the effect of jumps on the price process is not smeared out over adjacent intervals. When returns are computed over longer time intervals, such as 5 or 15 minutes, these conditions may be violated. Furthermore, the BPVar is sensitive to the occurrence of “zero” returns in the sample.8 The multivariate extension of the BPVar has the additional disadvantage that it is not affine equivariant and not necessarily positive semidefinite, which are undesirable properties for covolatility estimators. For these reasons, Boudt, Croux, and Laurent (2008a) have proposed an alternative to the BPVar that is robust to jumps affecting two or more contiguous returns. This estimator, called the Realized Outlyingness Weighted Variance (ROWVar), is also more efficient than the BPVar under the BSM model, and its multivariate extension has attractive properties.

The computation of the ROWVar proceeds as follows.
Step 1: Estimation of local outlyingness
Boudt, Croux, and Laurent (2008a) measure the outlyingness dt,i of return rt,i as the square of the robustly studentized return:9

d_{t,i} = (r_{t,i}/σ̂_{t,i})²,   (7.40)

where ˆσt,i is a robust estimate of the instantaneous volatility computed from all the returns belonging to the same local window as rt,i. In Boudt, Croux, and Laurent (2008a), ˆσt,i is the Median Absolute Deviation about the median (MAD)10 of all the returns belonging to the same local window, which is here equal to one day.11

Because of the presence of a strong intraday periodicity in volatility (see Section 7.5.3), the assumption of locally constant volatility does not hold when the length of the window is greater than a few minutes. Recall that, following Andersen and Bollerslev (1997, 1998b), Andersen, Bollerslev, and Dobrev (2007) and Lee and Mykland (2008), G@RCH uses local windows of length M, i.e. one day.

For this reason, in their empirical application, Boudt, Croux, and Laurent (2008b) propose to compute d_{t,i} on filtered returns r̃_{t,i} = r_{t,i}/f̂_{t,i} instead of the raw returns r_{t,i}, where f̂_{t,i} is any of the robust periodicity filters presented in Section 7.5.3.

For practical reasons, G@RCH only considers non parametric estimates for f̂_{t,i}, i.e. f̂^MAD_{t,i}, f̂^ShortH_{t,i} and f̂^WSD_{t,i}. Note that when the dataset is ’dated’, the length of the periodicity filter is set to 5 days; otherwise it is equal to one day.
Step 2: Computation of ROWVar

Once one has such an outlyingness measure d_{t,i} for all the high-frequency returns in the interval [t−1, t], a weight function has to be chosen. This weight function equals one for returns whose outlyingness d_{t,i} does not raise any suspicion that the corresponding return r_{t,i} might be affected by jumps, and goes to 0 the more extreme d_{t,i} is. Popular weight functions are the Hard Rejection weight function

w_HR(z) = 1 if z ≤ k, and 0 otherwise,

and the Soft Rejection Huber weight function

wSR(z) = min{1,k∕z},

with k a tuning parameter to be selected. Under the BSM model and some weak assumptions stated in Boudt, Croux, and Laurent (2008a), the outlyingness measure is asymptotically chi-squared distributed with 1 degree of freedom (χ²_1). The outlyingness of a return affected by a jump will then be in the extreme tails of the χ²_1 distribution. Consequently, for the ROWVar the rejection threshold k can be set to χ²_1(β), the β quantile of the χ²_1 distribution function. Boudt, Croux, and Laurent (2008a) strongly recommend using the SR weight function with β = 95% as a good compromise between robustness and efficiency.

Given a series of high-frequency returns, their estimated outlyingness and a weight function, one can then compute the ROWVar as follows:

ROWVar_t(Δ) = c_w [Σ_{i=1}^M w(d_{t,i}) r²_{t,i}] / [(1/M) Σ_{i=1}^M w(d_{t,i})].   (7.43)
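Putting the two steps together, here is a Python sketch of the ROWVar under the soft-rejection weight, with k = 3.841 (the 95% χ²_1 quantile) and c_w = 1.081 from Table 7.4; the MAD is used as the local scale and all names are illustrative, not the G@RCH implementation:

```python
import numpy as np

def rowvar(r, k=3.841, c_w=1.081):
    # Step 1: outlyingness from a robust local scale (1.486 = MAD factor)
    mad = 1.486 * np.median(np.abs(r - np.median(r)))
    d = (r / mad) ** 2
    # Step 2: soft-rejection weights and corrected weighted sum (7.43)
    w = np.minimum(1.0, k / d)
    return float(c_w * np.sum(w * r ** 2) / np.mean(w))

rng = np.random.default_rng(6)
r = rng.standard_normal(2880) / np.sqrt(2880)   # unit-IV day, no jumps
r_jump = r.copy()
r_jump[500] += 0.5
v_clean = rowvar(r)       # close to the integrated variance 1
v_jump = rowvar(r_jump)   # barely affected by the jump
```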

The correction factor cw in (7.43) ensures that the ROWVar is consistent for the IVar under the BSM model. Table 7.4 (see below) reports the correction factors for the hard and soft rejection weight functions for several values of the critical level β.

β      1      0.99   0.975  0.95   0.925  0.90   0.85   0.80
c_w HR 1      1.081  1.175  1.318  1.459  1.605  1.921  2.285
c_w SR 1      1.017  1.041  1.081  1.122  1.165  1.257  1.358
θ HR   2      2.897  3.707  4.674  5.406  5.998  6.917  7.592
θ SR   2      2.072  2.184  2.367  2.539  2.699  2.989  3.245
d_w HR 0.333  0.440  0.554  0.741  0.945  1.177  1.760  2.566

Table 7.4: c_w, θ and d_w for different critical levels β (such that the threshold k = χ²_1(β), with χ²_1(β) the β quantile of the χ²_1 distribution)

Let us now compute the ROWVar on the same simulated data as in the previous section. Figure 7.18, obtained by running the example file Simul_Continuous_GARCH_Jumps_2.ox, suggests that ROWVar_t(Δ) also does a good job in estimating the integrated volatility (IV_t) in the presence of jumps.

A more formal comparison between bi-power variation and ROWVar will be presented in Section 7.7.


Figure 7.18: Integrated Volatility and ROWVar for a continuous-GARCH jump process

7.6.3 MinRV and MedRV

Two attractive robust estimators of the integrated volatility have recently been proposed by Andersen, Dobrev, and Schaumburg (2008). The MinRV and MedRV (implemented in the procedures Compute_MinRV() and Compute_MedRV()) are defined as follows:

MinRV_t(Δ) ≡ μ_2 [M/(M−1)] Σ_{i=2}^M min(|r_{t,i}|, |r_{t,i−1}|)²,   (7.44)
MedRV_t(Δ) ≡ μ_3 [M/(M−2)] Σ_{i=3}^M med(|r_{t,i}|, |r_{t,i−1}|, |r_{t,i−2}|)²,   (7.45)

where μ_2 ≡ π/(π − 2), μ_3 ≡ π/(6 − 4√3 + π), min stands for minimum and med for median.

The MedRV is meant to be less exposed to zero returns than the BPVar.
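Both estimators are short to sketch in Python (illustrative, not the package’s procedures); a single jump barely moves either of them, since min and med discard the contaminated return:

```python
import numpy as np

mu2 = np.pi / (np.pi - 2)
mu3 = np.pi / (6 - 4 * np.sqrt(3) + np.pi)

def minrv(r):
    M = r.size
    a = np.abs(r)
    return float(mu2 * M / (M - 1) * np.sum(np.minimum(a[1:], a[:-1]) ** 2))

def medrv(r):
    M = r.size
    a = np.abs(r)
    triples = np.stack([a[2:], a[1:-1], a[:-2]])
    return float(mu3 * M / (M - 2) * np.sum(np.median(triples, axis=0) ** 2))

rng = np.random.default_rng(8)
r = rng.standard_normal(2880) / np.sqrt(2880)   # unit-IV day
r_jump = r.copy()
r_jump[700] += 1.0
```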

7.7 Daily jump tests

Evidently, the difference between RV_t and any robust-to-jumps estimator of IV_t, denoted ÎV_t(Δ), is an estimate of the jump contribution, or realized jumps. More formally,

RJ_t(Δ) ≡ RV_t(Δ) − ÎV_t(Δ) → Σ_{t−1<s≤t} κ²(s),

where ÎV_t(Δ) is for instance BV_t(Δ) or ROWVar_t(Δ).

Based on the theoretical result of Barndorff-Nielsen and Shephard (2006) that

Δ^{-1/2} (RV_t(Δ) − IV_t, ÎV_t(Δ) − IV_t)′ →d MN(0, [2 2; 2 θ] IQ_t)  as Δ → 0,   (7.47)

Andersen, Bollerslev, and Diebold (2007) have developed a formal test for (daily) jumps, i.e.

Z_t ≡ [RV_t(Δ) − ÎV_t(Δ)] / √((θ − 2) (1/M) ÎQ_t),   (7.48)

where ÎQ_t is a robust-to-jumps estimate of the integrated quarticity IQ_t ≡ ∫_{t−1}^t σ⁴(s) ds.

Andersen, Bollerslev, and Diebold (2007) consider the case ÎV_t(Δ) = BV_t(Δ) and use the so-called Tri-Power Quarticity TQ_t(Δ) to estimate IQ_t, where

TQ_t(Δ) ≡ M [M/(M−2)] μ_{4/3}^{-3} Σ_{i=3}^M |r_{t,i}|^{4/3} |r_{t,i−1}|^{4/3} |r_{t,i−2}|^{4/3},

with μ_{4/3} ≡ 2^{2/3} Γ(7/6) Γ(1/2)^{-1}.

Another popular estimator for IQ_t, in the spirit of the bi-power (or multi-power) variation, is the Quad-Power Quarticity QQ_t(Δ), i.e.

QQ_t(Δ) ≡ M [M/(M−3)] μ_1^{-4} Σ_{i=4}^M |r_{t,i}| |r_{t,i−1}| |r_{t,i−2}| |r_{t,i−3}|.

When ÎV_t(Δ) = BV_t(Δ), θ = μ_1^{-4} + 2μ_1^{-2} − 3 ≈ 2.609. Note that TQ_t(Δ) and QQ_t(Δ) are implemented in the procedures Compute_TQ() and Compute_QQ(), respectively.
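Both quarticity estimators can be sketched as follows (Python, illustrative, not the package’s procedures); under constant unit spot volatility the integrated quarticity is 1, which both estimates should approach:

```python
import numpy as np
from math import gamma, pi, sqrt

mu1 = sqrt(2 / pi)
mu43 = 2 ** (2 / 3) * gamma(7 / 6) / gamma(1 / 2)   # E|Z|^(4/3), Z ~ N(0,1)

def tq(r):
    M = r.size
    a = np.abs(r)
    return float(M * M / (M - 2) * mu43 ** -3
                 * np.sum((a[2:] * a[1:-1] * a[:-2]) ** (4 / 3)))

def qq(r):
    M = r.size
    a = np.abs(r)
    return float(M * M / (M - 3) * mu1 ** -4
                 * np.sum(a[3:] * a[2:-1] * a[1:-2] * a[:-3]))

rng = np.random.default_rng(9)
r = rng.standard_normal(2880) / np.sqrt(2880)   # constant vol: IQ = 1
```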

The main drawback of TQ_t(Δ) and QQ_t(Δ) is that, like BV_t(Δ), they are downward biased in the presence of zero returns. Furthermore, the bias is expected to be even larger in finite samples than for BV_t(Δ), because they are made up of products of respectively three and four consecutive absolute returns.

To overcome this problem, Boudt, Croux, and Laurent (2008a) propose to replace ÎV_t(Δ) in Equation (7.48) by ROWVar_t(Δ) and ÎQ_t(Δ) by the Realized Outlyingness Weighted Quarticity

ROWQuarticity_t(Δ) = d_w M [Σ_{i=1}^M w(d_{t,i}) r⁴_{t,i}] / [(1/M) Σ_{i=1}^M w(d_{t,i})],

where w(·) is the hard rejection weight function. The correction factor d_w and the asymptotic variance θ of the ROWVar are reported in Table 7.4 for several choices of the critical level β (used to obtain the outlyingness threshold k).

In the spirit of their MinRV and MedRV estimators, Andersen, Dobrev, and Schaumburg (2008) propose two alternative robust estimators of the integrated quarticity, namely the MinRQ and MedRQ. The formulas are given herebelow (and implemented in the procedures Compute_MinRQ() and Compute_MedRQ()):

\begin{align*}
MinRQ_t(\Delta) &\equiv M\,\frac{M}{M-1}\,\mu_4\sum_{i=2}^{M}\min(|r_{t,i}|,|r_{t,i-1}|)^4, \tag{7.52}\\
MedRQ_t(\Delta) &\equiv M\,\frac{M}{M-2}\,\mu_5\sum_{i=3}^{M}\operatorname{med}(|r_{t,i}|,|r_{t,i-1}|,|r_{t,i-2}|)^4, \tag{7.53}
\end{align*}

where $\mu_4 = \pi/(3\pi-8)$ and $\mu_5 = 3\pi/(9\pi + 72 - 52\sqrt{3})$.
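Under the same caveats as before (an illustrative Python sketch with our own function names, not the Compute_MinRQ()/Compute_MedRQ() Ox code), the two estimators read:

```python
import math
from statistics import median

def min_rq(r):
    # MinRQ: scaled sum of the fourth power of the minimum of two
    # adjacent absolute returns; an isolated jump never enters the minimum.
    M = len(r)
    mu4 = math.pi / (3.0 * math.pi - 8.0)
    return M * (M / (M - 1)) * mu4 * sum(
        min(abs(r[i]), abs(r[i - 1])) ** 4 for i in range(1, M))

def med_rq(r):
    # MedRQ: same idea with the median of three adjacent absolute returns.
    M = len(r)
    mu5 = 3.0 * math.pi / (9.0 * math.pi + 72.0 - 52.0 * math.sqrt(3.0))
    return M * (M / (M - 2)) * mu5 * sum(
        median([abs(r[i]), abs(r[i - 1]), abs(r[i - 2])]) ** 4 for i in range(2, M))
```

Replacing one return by a large jump leaves both estimates unchanged, because an isolated jump is always discarded by the min and the median.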

Note also that Andersen, Dobrev, and Schaumburg (2008) show that both the MinRV and MedRV satisfy (7.47), where θ equals 3.81 for the former and 2.96 for the latter (the MedRV being asymptotically more efficient than the MinRV in the absence of jumps).

Barndorff-Nielsen and Shephard (2006) advocated the use of a log version of the $Z_t$ statistic. According to them, the following statistic

\[ \log Z_t \equiv \frac{\log(RV_t(\Delta)) - \log(\widehat{IV}_t(\Delta))}{\sqrt{(\theta-2)\frac{1}{M}\,\widehat{IQ}_t(\Delta)\,\widehat{IV}_t(\Delta)^{-2}}}, \]

has better finite sample properties.

They also proposed a max version of logZt, denoted maxlogZt, where

\[ \max\log Z_t \equiv \frac{\log(RV_t(\Delta)) - \log(\widehat{IV}_t(\Delta))}{\sqrt{(\theta-2)\frac{1}{M}\,\max\{1,\widehat{IQ}_t(\Delta)\,\widehat{IV}_t(\Delta)^{-2}\}}}. \]

Under the null of no jump on day $t$, $Z_t$, $\log Z_t$ and $\max\log Z_t$ are asymptotically (as $\Delta \to 0$) standard normal. The sequences $\{Z_t(\Delta)\}_{t=1}^{T}$, $\{\log Z_t\}_{t=1}^{T}$ and $\{\max\log Z_t\}_{t=1}^{T}$ provide evidence on the daily occurrence of jumps in the price process.
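Given precomputed values of $RV_t$, $\widehat{IV}_t$ and $\widehat{IQ}_t$, the log and max-log versions are one-liners (illustrative Python; argument names are ours):

```python
import math

def log_z(rv, iv_hat, iq_hat, M, theta=2.609):
    # Log version of the daily jump statistic.
    return (math.log(rv) - math.log(iv_hat)) / math.sqrt(
        (theta - 2.0) / M * iq_hat / iv_hat ** 2)

def max_log_z(rv, iv_hat, iq_hat, M, theta=2.609):
    # Max version: the variance ratio IQ/IV^2 is floored at 1, which can
    # only shrink the statistic, making the test more conservative.
    return (math.log(rv) - math.log(iv_hat)) / math.sqrt(
        (theta - 2.0) / M * max(1.0, iq_hat / iv_hat ** 2))
```

Whenever $\widehat{IQ}_t\widehat{IV}_t^{-2} \ge 1$, the two statistics coincide; otherwise the max version is the smaller of the two.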

The outlyingness measure $d_{t,i}$ can also be used to build a statistic for daily jump detection. Indeed, under the null of no jump during day $t$, $d_{t,i} \sim \chi^2(1)$, $i = 1,\ldots,M$ (see Equation 7.40). In the spirit of the test proposed by Lee and Mykland (2008) (see next section), we can show that $\max_i \sqrt{d_{t,i}}$ follows a Gumbel distribution under the null. More specifically, we reject the null of no jump during day $t$ at the $\alpha\%$ critical level if

\[ \max_i \sqrt{d_{t,i}} > G^{-1}(1-\alpha)\,S_n + C_n, \]

where $G^{-1}(1-\alpha)$ is the $1-\alpha$ quantile function of the standard Gumbel distribution, $C_n = (2\log n)^{0.5} - \frac{\log(\pi) + \log(\log n)}{2(2\log n)^{0.5}}$ and $S_n = \frac{1}{(2\log n)^{0.5}}$. When $n = M$ or $n = MT$, the expected number of spuriously detected (daily) jumps equals $\alpha T$ and $\alpha$, respectively.
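The Gumbel-based critical value is easy to compute. A minimal Python sketch (function names are ours):

```python
import math

def gumbel_quantile(p):
    # Quantile function of the standard Gumbel distribution.
    return -math.log(-math.log(p))

def gumbel_threshold(n, alpha):
    # Critical value G^{-1}(1 - alpha) * S_n + C_n for the max-based jump test,
    # with C_n and S_n as defined in the text.
    two_log_n = 2.0 * math.log(n)
    Cn = math.sqrt(two_log_n) - (math.log(math.pi) + math.log(math.log(n))) / (2.0 * math.sqrt(two_log_n))
    Sn = 1.0 / math.sqrt(two_log_n)
    return gumbel_quantile(1.0 - alpha) * Sn + Cn
```

The threshold grows slowly with $n$ (through $C_n$), so testing over more observations demands a larger extreme before a jump is declared.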

If $I_{t,\alpha}(\Delta)$ is a dummy variable that equals 1 if a jump has been detected on day $t$ at the $\alpha$ (e.g. 0.999) significance level (using any of the tests presented above) and 0 otherwise, a better estimate of the realized jumps is given by:

\[ J_{t,\alpha}(\Delta) = I_{t,\alpha}(\Delta)\cdot\left[RV_t(\Delta) - \widehat{IV}_t(\Delta)\right]. \]

To make sure that the jump component added to the continuous one equals the realized volatility, we define as an estimator of the integrated variance

\[ C_{t,\alpha}(\Delta) = \left[1 - I_{t,\alpha}(\Delta)\right]\cdot RV_t(\Delta) + I_{t,\alpha}(\Delta)\cdot\widehat{IV}_t(\Delta). \]

Note that when $\alpha = 0$, $C_{t,\alpha}(\Delta) = \widehat{IV}_t(\Delta)$, $\forall t$.
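This decomposition guarantees $J_{t,\alpha}(\Delta) + C_{t,\alpha}(\Delta) = RV_t(\Delta)$ by construction, as the following sketch makes explicit (illustrative Python; the function name is ours):

```python
def split_rv(rv, iv_hat, jump_detected):
    # Split realized variance into a jump part J and a continuous part C
    # such that J + C = RV, following the I_{t,alpha} dummy logic above.
    I = 1 if jump_detected else 0
    J = I * (rv - iv_hat)
    C = (1 - I) * rv + I * iv_hat
    return J, C
```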

7.8 Intraday jump tests

So far, we have considered the estimation of functions of jumps over given time intervals. The approach taken in this section is different. We explain how we can test whether any given intra-day return rt,i is from a purely continuous diffusion or is rather due to a jump in the price process. What follows is mainly based on the work of Lee and Mykland (2008) and Boudt, Croux, and Laurent (2008b).

The idea of Lee and Mykland (2008) for jump detection is intuitive. If a return contains a jump component, it should be abnormally big. However, what constitutes an abnormally big return depends on the volatility prevailing at the time of the tested return: in times of high volatility, an abnormal return is bigger than in times of low volatility. Hence, Lee and Mykland (2008) study the properties of the ratio of the tested return over a measure of local volatility. They derive the asymptotic theory for the statistic and a rejection region under the null of no jump at the tested time. In short, they propose a powerful, parsimonious methodology that makes it possible to test whether any return contains a jump component, and to determine the sign and exact timing of each jump.

The statistic $J_{t,i}$ tests whether a jump occurred between intraday periods $i-1$ and $i$ of day $t$. It is defined as the absolute return divided by an estimate of the local standard deviation $\hat\sigma_{t,i}$, i.e.

\[ J_{t,i} = \frac{|r_{t,i}|}{\hat\sigma_{t,i}}. \]

Under the null of no jump at the testing time, provided that the process belongs to the family of BSMJ models described in Equation (7.14) and that the window size for the local volatility is suitably chosen, $r_{t,i}/\hat\sigma_{t,i}$ asymptotically follows a standard normal distribution.

Note that Lee and Mykland (2008) recommend replacing $\hat\sigma_{t,i}$ by $\hat s_{t,i} = \sqrt{\frac{1}{M-1}BV_t}$, where $BV_t$ is the bi-power variation computed on all the intraday returns of day $t$. Boudt, Croux, and Laurent (2008b) propose to account for the strong periodicity in volatility and show that replacing $\hat\sigma_{t,i}$ by $\hat f_{t,i}\,\hat s_{t,i}$ is more appropriate. They show that ignoring periodic volatility patterns leads to spurious jump identification. Indeed, the original Lee/Mykland statistic (which neglects the periodicity) tends to overdetect (underdetect) jumps in periods of high (low) intraday periodic volatility. G@RCH 6.1 gives the choice between the three robust non-parametric estimation methods described in Section 7.5.3 to estimate $\hat f_{t,i}$, i.e. $\hat f_{t,i}^{MAD}$, $\hat f_{t,i}^{ShortH}$ and $\hat f_{t,i}^{WSD}$. Recall that when the dataset is ‘dated’, the length of the periodicity filter is set to 5 days; otherwise it is equal to one day.

Under the null of no jump and with a consistent estimate $\hat\sigma_{t,i}$, $J_{t,i}$ follows the same distribution as the absolute value of a standard normal variable. Brownlees and Gallo (2006) propose comparing $J_{t,i}$ with the $1-\alpha/2$ quantile of the standard normal distribution. This rule might spuriously detect many jumps, however. Andersen, Bollerslev, and Dobrev (2007) use a Bonferroni correction to minimize spurious jump detection. To minimize the risk of falsely finding jumps, Lee and Mykland (2008) propose inferring jumps from a conservative critical value, which they obtain from the distribution of the statistic's maximum over the sample size. If the statistic exceeds a plausible maximum, one rejects the null of no jump. Under the stated assumptions and no jump in the interval $(i-1, i)$ of day $t$, when $\Delta \to 0$, the sample maximum of the absolute value of a standard normal variable (i.e. the jump statistic $J_{t,i}$) follows a Gumbel distribution. We reject the null of no jump if

\[ J_{t,i} > G^{-1}(1-\alpha)\,S_n + C_n, \]

where $G^{-1}(1-\alpha)$ is the $1-\alpha$ quantile function of the standard Gumbel distribution, $C_n = (2\log n)^{0.5} - \frac{\log(\pi) + \log(\log n)}{2(2\log n)^{0.5}}$ and $S_n = \frac{1}{(2\log n)^{0.5}}$. When $n = 1$, the test is similar to the one of Brownlees and Gallo (2006) in the sense that the expected number of spuriously detected jumps (under the null) can be extremely large, i.e. $\alpha MT$. When $n = M$ (i.e. the number of observations per day) or $n = MT$ (i.e. the total number of observations), this number equals respectively $\alpha T$ and $\alpha$ (i.e. close to 0). So if we choose a significance level of $\alpha = 0.0001$, we reject the null of no jump at the testing time if $J_{t,i} > S_n\beta^* + C_n$, with $\beta^*$ such that $P(\psi \le \beta^*) = \exp(-e^{-\beta^*}) = 0.9999$, i.e. $\beta^* = -\log(-\log(0.9999)) = 9.21$.
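Putting the pieces together, a minimal version of the L&M test for one day can be sketched as follows (illustrative Python, not the Compute_LeeMykJump Ox code; the optional `f` argument stands for the periodicity factors $\hat f_{t,i}$ of the filtered variant):

```python
import math

def lee_mykland_flags(r, alpha=0.0001, f=None):
    # Flag intraday returns whose standardized size exceeds the Gumbel
    # critical value. Local volatility is proxied by sqrt(BV_t / (M - 1)),
    # as recommended by Lee and Mykland (2008).
    M = len(r)
    mu1 = math.sqrt(2.0 / math.pi)
    bv = mu1 ** -2 * (M / (M - 1)) * sum(abs(r[i]) * abs(r[i - 1]) for i in range(1, M))
    s = math.sqrt(bv / (M - 1))
    n = M  # with n = M, about alpha * T spurious detections are expected
    two_log_n = 2.0 * math.log(n)
    Cn = math.sqrt(two_log_n) - (math.log(math.pi) + math.log(math.log(n))) / (2.0 * math.sqrt(two_log_n))
    Sn = 1.0 / math.sqrt(two_log_n)
    crit = -math.log(-math.log(1.0 - alpha)) * Sn + Cn
    if f is None:
        f = [1.0] * M  # no periodicity correction (original L&M statistic)
    return [abs(r[i]) / (f[i] * s) > crit for i in range(M)]
```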


Figure 7.19: Simulated and detected jumps

The L&M test is implemented in the function Compute_LeeMykJump of the Realized class. An example is provided in the file
Jumps_JumpDetection_LeeMykland.ox. In this example, we simulate a GARCH diffusion with jumps and intraday periodicity (see Section 7.4) and apply the L&M test.

The first graph on Figure 7.19 shows the simulated 5-min log-prices. The other graphs plot the simulated (plus) and detected (squares) jumps using the L&M test. Note that the presence of intraday periodicity is ignored in the second graph. The third graph corresponds to the filtered L&M test that implements the correction proposed by Boudt, Croux, and Laurent (2008b), i.e. replacing $\hat\sigma_{t,i}$ by $\hat f_{t,i}\,\hat s_{t,i}$, where in this example we chose the WSD periodicity filter $\hat f_{t,i}^{WSD}$.

We see that neglecting the presence of intraday periodicity has a strong impact on the power of the test. Indeed, while all the simulated jumps are correctly detected by the filtered L&M test, most jumps (i.e. the small ones) go undetected when the intraday periodicity is ignored. See Boudt, Croux, and Laurent (2008b) for a more comprehensive simulation study.

7.9 Multivariate case

In the case where $r_{t,i}$ is an $N$-dimensional return vector generated by the multivariate counterpart of the BSMJ price diffusion model in (7.14), the processes $p(s)$, $\mu(s)$ and $q(s)$ are all $N$-dimensional vector processes and $w(s)$ is a vector of $N$ independent Brownian motions. Denote by $\Omega(s)$ the $N \times N$ càdlàg process such that $\Sigma(s) = \Omega(s)\Omega(s)'$ is the spot covariance matrix process of the continuous component of the price diffusion. Let $K$ be the $N \times N$ process controlling the magnitude and transmission of jumps, such that $K(s)dq(s)$ is the contribution of the jump process to the price diffusion. An $N$-dimensional log-price diffusion can then be decomposed as follows:

\[ \text{BSMJ:}\quad dp(s) = \mu(s)\,ds + \Omega(s)\,dw(s) + K(s)\,dq(s). \tag{7.60} \]

The integrated covariance matrix (ICov) over $[t-1,t]$ is the matrix

\[ ICov_t = \int_{t-1}^{t}\Sigma(s)\,ds. \]

Denote by κj the contribution of the j-th jump in [t - 1,t] to the price diffusion.

7.9.1 Realized Quadratic Covariation

Andersen, Bollerslev, Diebold, and Labys (2003) have shown that the Realized quadratic covariation (RCov)

\[ RCov_t(\Delta) \equiv \sum_{i=1}^{M} r_{t,i}\,r_{t,i}' \]

is a consistent estimator for the sum of the ICov and the realized jump variability

\[ \operatorname{plim}_{\Delta\to 0} RCov_t(\Delta) = ICov_t + \sum_{j=1}^{j_t}\kappa_j\,\kappa_j', \]

where $j_t = \int_{t-1}^{t} dq^*(s)$, with $q^*(s)$ the univariate counting process derived from $q(s)$ such that $q^*(s)$ increases by 1 whenever $q(s)$ changes.
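In code, RCov is just the sum of outer products of the intraday return vectors (illustrative Python/NumPy sketch; the function name is ours):

```python
import numpy as np

def rcov(returns):
    # Realized covariance: sum over i of the outer products r_{t,i} r_{t,i}'.
    # `returns` is an (M, N) array of M intraday N-dimensional return vectors.
    R = np.asarray(returns)
    return R.T @ R
```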

7.9.2 Realized BiPower Covariation

For disentangling the continuous and jump components in the RCov, we need an additional estimator for the ICov that is robust to jumps. To this purpose, Barndorff-Nielsen and Shephard (2004a) introduce the Realized BiPower Covariation process (RBPCov) as the process whose value at time t is the N-dimensional square matrix with k,l-th element equal to

\[ \frac{\pi}{8}\sum_{i=2}^{M}\Big(\big|r_{(k)t,i} + r_{(l)t,i}\big|\,\big|r_{(k)t,i-1} + r_{(l)t,i-1}\big| - \big|r_{(k)t,i} - r_{(l)t,i}\big|\,\big|r_{(k)t,i-1} - r_{(l)t,i-1}\big|\Big), \tag{7.64} \]

where $r_{(k)t,i}$ is the $k$-th component of the return vector $r_{t,i}$. The factor $\pi/8$ ensures that the RBPCov converges to the ICov under the BSMFAJ model:

\[ \operatorname{plim}_{\Delta\to 0} RBPCov_t(\Delta) = \int_{t-1}^{t}\Sigma(s)\,ds. \]
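The element-by-element construction in (7.64) can be sketched as follows (illustrative Python/NumPy, not the G@RCH code; on the diagonal it reduces to the univariate bi-power variation, up to the small-sample correction):

```python
import numpy as np

def rbpcov(returns):
    # Realized bi-power covariation: each (k, l) element combines sums and
    # differences of the k-th and l-th return components, as in (7.64).
    R = np.asarray(returns)
    M, N = R.shape
    out = np.zeros((N, N))
    for k in range(N):
        for l in range(N):
            s = 0.0
            for i in range(1, M):
                s += (abs(R[i, k] + R[i, l]) * abs(R[i - 1, k] + R[i - 1, l])
                      - abs(R[i, k] - R[i, l]) * abs(R[i - 1, k] - R[i - 1, l]))
            out[k, l] = (np.pi / 8.0) * s
    return out
```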

Like the BPVar, the RBPCov is highly affected by jumps when these jumps affect two contiguous returns. The RBPCov, being a multivariate scale estimator, has the additional disadvantage that it is not affine equivariant and it is not necessarily positive semidefinite, which are undesirable properties for covolatility estimators.

In the next subsection, we show that the inclusion of an appropriate weight function in the RCov, that downweights extreme returns, leads to an estimator that has the following properties:

  1. it is affine equivariant and yields positive semidefinite matrices;
  2. at the BSM model it is consistent for the ICov and has a high efficiency;
  3. at the BSMFAJ model it is consistent for the ICov.

7.9.3 ROWQCov

The computation of the ROWQCov is analogous to the computation of the ROWVar, but here rt,i is a return vector of dimension N.
Step 1: estimation of local multivariate outlyingness

We estimate the multivariate outlyingness of the return vector rt,i by the Mahalanobis distance between rt,i and 0 using a highly robust estimator of the multivariate scale S(rt,i) of the returns belonging to the same local window as rt,i.12 Formally, the outlyingness of rt,i equals

\[ d_{t,i} = r_{t,i}'\,S(r_{t,i})^{-1}\,r_{t,i}. \tag{7.66} \]

Under the BSM model and some weak assumptions stated in Boudt, Croux, and Laurent (2008a), the Mahalanobis outlyingness measure is asymptotically chi squared distributed with N degrees of freedom (χN2) provided that the spot covariance matrix is constant over the local window.

For the same reasons as those evoked in Sections 7.6.2 and 7.8, the assumption of a constant covariance is not realistic for local windows of length one day. For that reason, the user is given the choice to compute $d_{t,i}$ on raw returns (i.e. Equation (7.66)) or on filtered returns. The three non-parametric techniques discussed in Section 7.5.3 are offered, i.e. $\hat f_{t,i}^{MAD}$, $\hat f_{t,i}^{ShortH}$ or $\hat f_{t,i}^{WSD}$. Note that the $N$ periodicity filters are estimated separately.

A common choice for a highly robust multivariate scale estimator is the Minimum Covariance Determinant (MCD) covariance estimate proposed by Rousseeuw and Driessen (1999). It is defined as follows. We call a halfsample of a local window a subsample of $h = 0.5(M + N + 1)$ returns belonging to that local window. Denote now by $S$ the covariance matrix computed from the halfsample having the smallest value of the determinant of the covariance matrix computed from it.13 Then compute, for the $i$-th return belonging to the same local window as $r_{t,i}$, the weight $w_i = w_{HR}[r_{t,i}'(c_{0.5}S)^{-1}r_{t,i}]$. The MCD multivariate scale estimator is then given by the weighted covariance matrix of all returns belonging to the local window

\[ S_{MCD}(r_{t,i}) = \frac{c_{0.95}}{\sum_{i=1}^{M} w_i}\sum_{i=1}^{M} w_i\,r_{t,i}\,r_{t,i}'. \]

The scalars $c_{0.5}$ and $c_{0.95}$ are the correction factors needed for consistency at the multivariate normal distribution. One has $c_\beta = \beta/F_{\chi^2_{N+2}}(\chi^2_N(\beta))$, where $F_{\chi^2_{N+2}}(\cdot)$ is the $\chi^2_{N+2}$ distribution function (Croux and Haesbroeck, 1999).

The MCD covariance estimator is our preferred choice to construct the outlyingness. However, one can also use other robust multivariate scale estimators, such as the average RBPCov (see above). Note that the positive semidefiniteness of the ROWQCov is preserved when $d_{t,i}$ is built upon the RBPCov.
Step 2: computation of ROWQCov

Because the Mahalanobis outlyingness measures are scalar-valued and asymptotically $\chi^2_N$ distributed, one can use the hard and soft rejection weight functions in (7.41)-(7.42), with threshold equal to an extreme quantile of the $\chi^2_N$ distribution, to downweight the high-frequency returns that are outlying with respect to the majority of the returns in their local window. The ROWQCov is then defined as follows.

Definition 2 Let $r_{t,i}$, for $i = 1,\ldots,M$, be a sample of $M$ high-frequency returns and $d_{t,i}$ their outlyingness values based on local windows of length one day. For a given weight function $w(\cdot)$, the Realized Outlyingness Weighted Quadratic Covariation is defined as

\[ ROWQCov_t(\Delta) = c_w\,\frac{\sum_{i=1}^{M} w(d_{t,i})\,r_{t,i}\,r_{t,i}'}{\frac{1}{M}\sum_{i=1}^{M} w(d_{t,i})}. \tag{7.67} \]
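A minimal sketch of the two steps with a hard-rejection weight function (illustrative Python/NumPy; for simplicity the local robust scale $S$ is replaced here by the ordinary sample covariance, whereas the text uses the MCD or the RBPCov, and $c_w$ is left as an argument, to be taken from Table 7.5):

```python
import numpy as np

def rowqcov_hr(returns, k, c_w=1.0):
    # Step 1: Mahalanobis outlyingness d_i = r_i' S^{-1} r_i, with the plain
    # sample covariance standing in for a robust local scale estimate.
    R = np.asarray(returns)
    M = R.shape[0]
    S = np.cov(R, rowvar=False)
    Sinv = np.linalg.inv(S)
    d = np.einsum('ij,jk,ik->i', R, Sinv, R)
    # Step 2: hard-rejection weights w(d) = 1{d <= k}, then the weighted
    # sum of outer products, rescaled by the average weight.
    w = (d <= k).astype(float)
    num = (R * w[:, None]).T @ R          # sum_i w_i r_i r_i'
    return c_w * num / (w.sum() / M)
```

With $N = 2$, a threshold of $k = 13.8$ (roughly the 99.9% quantile of the $\chi^2_2$ distribution) discards a grossly outlying return vector, so the estimate stays close to the covariation of the continuous part.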

7.9.4 Correction factor for ROWVar and ROWQCov

The correction factor $c_w$ in (7.67) ensures that the ROWQCov is consistent for the ICov at the BSM and BSMFAJ models. It depends on the dimension $N$ of the process and on the weight function used. Denote by $E[g(u)]$ the expectation of $g(u)$, where $u$ is an $N$-dimensional standard normal vector. The correction factor $c_w$ is then given by

\[ c_w = N\,\frac{E[w(u'u)]}{E[w(u'u)\,u'u]}. \tag{7.68} \]

For the HR weight function defined in (7.41) with threshold $k = \chi^2_N(1-\alpha)$, $c_w = (1-\alpha)/F_{\chi^2_{N+2}}(\chi^2_N(1-\alpha))$, where $F_{\chi^2_{N+2}}(\cdot)$ is the $\chi^2_{N+2}$ distribution function. Table 7.5 reports the correction factors for the hard and soft rejection weight functions. For $N = 1$, the correction factor $c_w$ is identical to the one used in (7.43).

Table 7.5: Correction factors cw when the dimension of the series is N and for the hard and soft rejection weight functions with threshold k