Chapter 9 tsdistributions package

9.1 Introduction

The objective of the tsdistributions package is to provide a set of location-scale invariant distributions re-parameterized in mean-variance space. The distributions in the package are based on those of the rugarch package and re-written to more closely follow the convention used by the distributions in stats. Maximum likelihood estimation is included and makes use of automatic differentiation via the functionality of the TMB package. It is part of a longer term plan for a full re-write of rugarch, with this package forming a foundational set piece.

9.2 Distributions

9.2.1 The Normal Distribution

The Normal Distribution is a spherical distribution described completely by it first two moments, the mean and variance. Formally, the random variable \(x\) is said to be normally distributed with mean \(\mu\) and variance \(\sigma^2\) (both of which may be time varying), with density given by, \[ f\left( x \right) = \frac{{{e^{\frac{{ - 0.5{{\left( {x - \mu } \right)}^2}}} {{{\sigma ^2}}}}}}} {{\sigma \sqrt {2\pi } }}. \]

Following some mean filtration or whitening process, the residuals \(\varepsilon\), standardized by \(\sigma\) yield the standard normal density given by, \[ f\left( {\frac{{x - \mu }}{\sigma }} \right) = \frac{1} {\sigma }f\left( z \right) = \frac{1} {\sigma }\left( {\frac{{{e^{ - 0.5{z^2}}}}} {{\sqrt {2\pi } }}} \right). \]

To obtain the conditional likelihood of some generating process at each point in time (\(LL_t\)), the conditional standard deviation \(\sigma_t\) (this could also be constant), acts as a scaling factor on the density, so that: \[ L{L_t}\left( {{z_t};{\sigma _t}} \right) = \frac{1} {{{\sigma _t}}}f\left( {{z_t}} \right) \]

which illustrates the importance of the scaling property. Finally, the normal distribution has zero skewness and zero excess kurtosis.

9.2.2 The Student Distribution

It Student distribution is described completely by a shape parameter \(\nu\), but for standardization we proceed by using its 3 parameter representation as follows:

\[ f\left( x \right) = \frac{{\Gamma \left( {\frac{{\nu + 1}} {2}} \right)}} {{\sqrt {\beta \nu \pi } \Gamma \left( {\frac{\nu } {2}} \right)}}{\left( {1 + \frac{{{{\left( {x - \alpha } \right)}^2}}} {{\beta \nu }}} \right)^{ - \left( {\frac{{\nu + 1}} {2}} \right)}} \]

where \(\alpha\), \(\beta\), and \(\nu\) are the location, scale²⁰ and shape parameters respectively, and \(\Gamma\) is the Gamma function. Similar to the GED distribution described later, this is a unimodal and symmetric distribution where the location parameter \(\alpha\) is the mean (and mode) of the distribution while the variance is:

\[ Var\left( x \right) = \frac{{\beta \nu }}{{\left( {\nu - 2} \right)}}. \]

For the purposes of standardization we require that: \[ \begin{gathered} Var(x) = \frac{{\beta \nu }} {{\left( {\nu - 2} \right)}} = 1 \hfill \\ \therefore \beta = \frac{{\nu - 2}} {\nu } \hfill \\ \end{gathered} \]

Substituting \(\frac{(\nu- 2)}{\nu }\) we obtain the standardized Student’s distribution:

\[ f\left( {\frac{{x - \mu }}{\sigma }} \right) = \frac{1} {\sigma }f\left( z \right) = \frac{1} {\sigma }\frac{{\Gamma \left( {\frac{{\nu + 1}} {2}} \right)}}{{\sqrt {\left( {\nu - 2} \right)\pi } \Gamma \left( {\frac{\nu } {2}} \right)}}{\left( {1 + \frac{{{z^2}}} {{\left( {\nu - 2} \right)}}} \right)^{ - \left( {\frac{{\nu + 1}} {2}} \right)}}. \]

In terms of R’s standard implementation of the Student density (dt), and including a scaling by the standard deviation, this can be represented as:

\[ \frac{{dt\left( {\frac{{{\varepsilon _t}}} {{\sigma \sqrt {\left( {v - 2} \right)/\nu } }},\nu } \right)}} {{\sigma \sqrt {\left( {v - 2} \right)/\nu } }} \]

The Student distribution has zero skewness and excess kurtosis equal to \(6/(\nu - 4)\) for \(\nu > 4\).

9.2.3 The Generalized Error Distribution

The Generalized Error Distribution is a 3 parameter distribution belonging to the exponential family with conditional density given by,

\[ f\left( x \right) = \frac{{\kappa {e^{ - 0.5{{\left| {\frac{{x - \alpha }} {\beta }} \right|}^\kappa }}}}} {{{2^{1 + {\kappa ^{ - 1}}}}\beta \Gamma \left( {{\kappa ^{ - 1}}} \right)}} \]

with \(\alpha\), \(\beta\) and \(\kappa\) representing the location, scale and shape parameters. Since the distribution is symmetric and unimodal the location parameter is also the mode, median and mean of the distribution (i.e. \(\mu\)). By symmetry, all odd moments beyond the mean are zero. The variance and kurtosis are given by, \[ \begin{gathered} Var\left( x \right) = {\beta ^2}{2^{2/\kappa }}\frac{{\Gamma \left( {3{\kappa ^{ - 1}}} \right)}} {{\Gamma \left( {{\kappa ^{ - 1}}} \right)}} \hfill \\ Ku\left( x \right) = \frac{{\Gamma \left( {5{\kappa ^{ - 1}}} \right)\Gamma \left( {{\kappa ^{ - 1}}} \right)}} {{\Gamma \left( {3{\kappa ^{ - 1}}} \right)\Gamma \left( {3{\kappa ^{ - 1}}} \right)}} \hfill \\ \end{gathered} \]

As \(\kappa\) decreases the density gets flatter and flatter while in the limit as \(\kappa \to \infty\), the distribution tends towards the uniform. Special cases are the Normal when \(\kappa=2\), the Laplace when \(\kappa=1\). Standardization is simple and involves re-scaling the density to have unit standard deviation: \[ \begin{gathered} Var\left( x \right) = {\beta ^2}{2^{2/\kappa }}\frac{{\Gamma \left( {3{\kappa ^{ - 1}}} \right)}} {{\Gamma \left( {{\kappa ^{ - 1}}} \right)}} = 1 \hfill \\ \therefore \beta = \sqrt {{2^{ - 2/\kappa }}\frac{{\Gamma \left( {{\kappa ^{ - 1}}} \right)}} {{\Gamma \left( {3{\kappa ^{ - 1}}} \right)}}} \hfill \\ \end{gathered} \]

Finally, substituting into the scaled density of \(z\):

\[ f\left( {\frac{{x - \mu }} {\sigma }} \right) = \frac{1} {\sigma }f\left( z \right) = \frac{1} {\sigma }\frac{{\kappa {e^{ - 0.5{{\left| {\sqrt {{2^{ - 2/\kappa }}\frac{{\Gamma \left( {{\kappa ^{ - 1}}} \right)}} {{\Gamma \left( {3{\kappa ^{ - 1}}} \right)}}} z} \right|}^\kappa }}}}} {{\sqrt {{2^{ - 2/\kappa }}\frac{{\Gamma \left( {{\kappa ^{ - 1}}} \right)}} {{\Gamma \left( {3{\kappa ^{ - 1}}} \right)}}} {2^{1 + {\kappa ^{ - 1}}}}\Gamma \left( {{\kappa ^{ - 1}}} \right)}} \]

9.2.4 Skewed Distributions by Inverse Scale Factors

Fernandez and Steel (1998) proposed introducing skewness into unimodal and symmetric distributions by introducing inverse scale factors in the positive and negative real half lines. Given a skew parameter, \(\xi\), the density of a random variable z can be represented as:

\[ f\left( {z|\xi } \right) = \frac{2} {{\xi + {\xi ^{ - 1}}}}\left[ {f\left( {\xi z} \right)H\left( { - z} \right) + f\left( {{\xi ^{ - 1}}z} \right)H\left( z \right)} \right] \]

where \(\xi \in {\mathbb{R}^ + }\) and \(H(.)\) is the Heaviside function. The absolute moments, required for deriving the central moments, are generated from the following function:

\[ {M_r} = 2\int_0^\infty {{z^r}f\left( z \right)dz}. \]

The mean and variance are then defined as: \[ \begin{gathered} E\left( z \right) = {M_1}\left( {\xi - {\xi ^{ - 1}}} \right) \hfill \\ Var\left( z \right) = \left( {{M_2} - M_1^2} \right)\left( {{\xi ^2} + {\xi ^{ - 2}}} \right) + 2M_1^2 - {M_2} \hfill \\ \end{gathered} \]

The Normal, Student and GED distributions have skew variants which have been standardized to zero mean, unit variance by making use of the moment conditions given above. These are named as snorm sged and sstd in the package.

9.2.5 The Generalized Hyperbolic Distribution and Sub-Families

In distributions where the expected moments are functions of all the parameters, it is not immediately obvious how to perform such a transformation. In the case of the ghyp distribution, because of the existence of location and scale invariant parameterizations and the possibility of expressing the variance in terms of one of those, namely the \((\zeta, \rho)\), the task of standardizing and estimating the density can be broken down to one of estimating those 2 parameters, representing a combination of shape and skewness, followed by a series of transformation steps to demean, scale and then translate the parameters into the \((\alpha, \beta, \delta, \mu)\) parameterization for which standard formula exist for the likelihood function. The \((\xi, \chi)\) parameterization, which is a simple transformation of the \((\zeta, \rho)\), could also be used in the first step and then transformed into the latter before proceeding further. The only difference is the kind of ‘immediate’ inference one can make from the different parameterizations, each providing a different direct insight into the kind of dynamics produced and their place in the overall ghyp family particularly with regards to the limit cases.

The package performs estimation using the \((\zeta, \rho)\) parameterization, after which a series of steps transform those parameters into the \((\alpha, \beta, \delta, \mu)\) while at the same time including the necessary recursive substitution of parameters in order to standardize the resulting distribution.

Proof. The Standardized Generalized Hyperbolic Distribution. Let \(\varepsilon_t\) be a r.v. with mean \((0)\) and variance \(({\sigma}^2)\) distributed as \(ghyp(\zeta, \rho)\), and let \(z\) be a scaled version of the r.v. \(\varepsilon\) with variance \((1)\) and also distributed as \(ghyp(\zeta, \rho)\).²¹ The density \(f(.)\) of \(z\) can be expressed as

\[ f(\frac{\varepsilon_t}{\sigma}; \zeta ,\rho ) = \frac{1}{\sigma}f_t(z;\zeta ,\rho ) = \frac{1}{\sigma}f_t(z;\tilde \alpha, \tilde \beta, \tilde \delta ,\tilde \mu ), \]

where we make use of the \((\alpha, \beta, \delta, \mu)\) parameterization since we can only naturally express the density in that parameterization. The steps to transforming from the \((\zeta, \rho)\) to the \((\alpha, \beta, \delta, \mu)\) parameterization, while at the same time standardizing for zero mean and unit variance are given henceforth. Let \[ \begin{eqnarray} \zeta & = & \delta \sqrt {{\alpha ^2} - {\beta ^2}} \hfill \\ \rho & = & \frac{\beta }{\alpha }, \hfill \end{eqnarray} \]

which after some substitution may be also written in terms of \(\alpha\) and \(\beta\) as,

\[ \begin{eqnarray} \alpha & = & \frac{\zeta }{{\delta \sqrt {(1 - {\rho ^2})} }},\hfill\\ \beta & = &\alpha \rho.\hfill \end{eqnarray} \]

For standardization we require that,

\[ \begin{eqnarray} E\left(X\right) & = & \mu + \frac{{\beta \delta }}{{\sqrt {{\alpha ^2} - {\beta ^2}} }}\frac{{{K_{\lambda + 1}}\left(\zeta \right)}}{{{K_\lambda }\left(\zeta \right)}} = \mu + \frac{{\beta {\delta ^2}}}{\zeta }\frac{{{K_{\lambda + 1}}\left(\zeta \right)}}{{{K_\lambda }\left(\zeta \right)}} = 0 \hfill \\ \therefore \mu & = & - \frac{{\beta {\delta ^2}}}{\zeta }\frac{{{K_{\lambda + 1}}\left(\zeta \right)}}{{{K_\lambda }\left(\zeta \right)}}\hfill \\ Var\left(X\right) & = & {\delta ^2}\left(\frac{{{K_{\lambda + 1}}\left(\zeta \right)}}{{\zeta {K_\lambda }\left(\zeta \right)}} + \frac{{{\beta ^2}}}{{{\alpha ^2} - {\beta ^2}}}\left(\frac{{{K_{\lambda + 2}}\left(\zeta \right)}}{{{K_\lambda }\left(\zeta \right)}} - {\left(\frac{{{K_{\lambda + 1}}\left(\zeta \right)}}{{{K_\lambda }\left(\zeta \right)}}\right)^2}\right)\right) = 1 \hfill\nonumber \\ \therefore \delta & = & {\left(\frac{{{K_{\lambda + 1}}\left(\zeta \right)}}{{\zeta {K_\lambda }\left(\zeta \right)}} + \frac{{{\beta ^2}}}{{{\alpha ^2} - {\beta ^2}}}\left(\frac{{{K_{\lambda + 2}}\left(\zeta \right)}}{{{K_\lambda }\left(\zeta \right)}} - {\left(\frac{{{K_{\lambda + 1}}\left(\zeta \right)}}{{{K_\lambda }\left(\zeta \right)}}\right)^2}\right)\right)^{ - 0.5}} \hfill \end{eqnarray} \]

Since we can express, \(\beta^2/\left(\alpha^2 - \beta^2\right)\) as,

\[ \frac{{{\beta ^2}}}{{{\alpha ^2} - {\beta ^2}}} = \frac{{{\alpha ^2}{\rho ^2}}}{{{a^2} - {\alpha ^2}{\rho ^2}}} = \frac{{{\alpha ^2}{\rho ^2}}}{{{a^2}\left(1 - {\rho ^2}\right)}} = \frac{{{\rho ^2}}}{{\left(1 - {\rho ^2}\right)}}, \]

then we can re-write the formula for \(\delta\) in terms of the estimated parameters \(\hat\zeta\) and \(\hat\rho\) as,

\[ \delta = {\left(\frac{{{K_{\lambda + 1}}\left(\hat \zeta \right)}}{{\hat \zeta {K_\lambda }\left(\hat \zeta \right)}} + \frac{{{{\hat \rho }^2}}}{{\left(1 - {{\hat \rho }^2}\right)}}\left(\frac{{{K_{\lambda + 2}}\left(\hat \zeta \right)}}{{{K_\lambda }\left(\hat \zeta \right)}} - {\left(\frac{{{K_{\lambda + 1}}\left(\hat \zeta \right)}}{{{K_\lambda }\left(\hat \zeta \right)}}\right)^2}\right)\right)^{ - 0.5}} \]

Transforming into the \((\tilde \alpha ,\tilde \beta ,\tilde \delta ,\tilde \mu )\) parameterization proceeds by first substituting \(\ref{sgh44}\) into \(\ref{sgh31}\) and simplifying,

\[ \begin{eqnarray} \tilde \alpha & = & \,{\frac{{\hat \zeta \left( {\frac{{{{\text{K}}_{\lambda + 1}}\left( {\hat \zeta } \right)}}{{\hat \zeta {{\text{K}}_\lambda }\left( {\hat \zeta } \right)}} + \frac{{{{\hat \rho }^2}\left( {\frac{{{{\text{K}}_{\lambda + 2}}\left( {\hat \zeta } \right)}}{{{{\text{K}}_\lambda }\left( {\hat \zeta } \right)}} - \frac{{{{\left( {{{\text{K}}_{\lambda + 1}}\left( {\hat \zeta } \right)} \right)}^2}}}{{{{\left( {{{\text{K}}_\lambda }\left( {\hat \zeta } \right)} \right)}^2}}}} \right)}}{{\left( {1 - {{\hat \rho }^2}} \right)}}} \right)}}{{\sqrt {(1 - {{\hat \rho }^2})} }}^{0.5}}, \hfill\nonumber \\ & = &\,{\frac{{\left( {\frac{{\hat \zeta {{\text{K}}_{\lambda + 1}}\left( {\hat \zeta } \right)}}{{{{\text{K}}_\lambda }\left( {\hat \zeta } \right)}} + \frac{{{{\hat \zeta }^2}{{\hat \rho }^2}\left( {\frac{{{{\text{K}}_{\lambda + 2}}\left( {\hat \zeta } \right)}}{{{{\text{K}}_\lambda }\left( {\hat \zeta } \right)}} - \frac{{{{\left( {{{\text{K}}_{\lambda + 1}}\left( {\hat \zeta } \right)} \right)}^2}}}{{{{\left( {{{\text{K}}_\lambda }\left( {\hat \zeta } \right)} \right)}^2}}}} \right)}}{{\left( {1 - {{\hat \rho }^2}} \right)}}} \right)}}{{\sqrt {(1 - {\hat \rho ^2})} }}^{0.5}}, \hfill\nonumber \\ & = & {\left( {\left. {\frac{{\frac{{\hat \zeta {{\text{K}}_{\lambda + 1}}\left( {\hat \zeta } \right)}}{{{{\text{K}}_\lambda }\left( {\hat \zeta } \right)}}}}{{(1 - {{\hat \rho }^2})}} + \frac{{{\hat \zeta ^2}{\hat \rho ^2}\left( {\frac{{{{\text{K}}_{\lambda + 2}}\left( {\hat \zeta } \right)}}{{{{\text{K}}_{\lambda + 1}}\left( {\hat \zeta } \right)}}\frac{{{{\text{K}}_{\lambda + 1}}\left( {\hat \zeta } \right)}}{{{{\text{K}}_\lambda }\left( {\hat \zeta } \right)}} - \frac{{{{\left( {{{\text{K}}_{\lambda + 1}}\left( {\hat \zeta } \right)} \right)}^2}}}{{{{\left( {{{\text{K}}_\lambda }\left( {\hat \zeta } \right)} \right)}^2}}}} \right)}}{{{{\left( {1 - {{\hat \rho }^2}} \right)}^2}}}} \right)} \right.^{0.5}}, \hfill\nonumber \\ & = & {\left( {\left. {\frac{{\frac{{\hat \zeta {{\text{K}}_{\lambda + 1}}\left( {\hat \zeta } \right)}}{{{{\text{K}}_\lambda }\left( {\hat \zeta } \right)}}}}{{(1 - {{\hat \rho }^2})}}\left(1 + \frac{{\hat \zeta {{\hat \rho }^2}\left( {\frac{{{{\text{K}}_{\lambda + 2}}\left( {\hat \zeta } \right)}}{{{{\text{K}}_{\lambda + 1}}\left( {\hat \zeta } \right)}} - \frac{{{{\text{K}}_{\lambda + 1}}\left( {\hat \zeta } \right)}}{{{{\text{K}}_\lambda }\left( {\hat \zeta } \right)}}} \right)}}{{\left( {1 - {{\hat \rho }^2}} \right)}}\right)} \right)} \right.^{0.5}}. \hfill \end{eqnarray} \]

Finally, the rest of the parameters are derived recursively from \(\tilde\alpha\) and the previous results,

\[ \begin{eqnarray} \tilde \beta & = & \tilde \alpha \hat \rho,\hfill\\ \tilde \delta & = & \frac{{\hat \zeta }}{{\tilde \alpha \sqrt {1 - {{\hat \rho }^2}} }}, \hfill\\ \tilde \mu & = & \frac{{ - \tilde \beta {{\tilde \delta }^2}{K_{\lambda + 1}}\left(\hat \zeta \right)}}{{\hat \zeta {K_\lambda }\left(\hat \zeta \right)}}.\hfill \end{eqnarray} \]

For the use of the \((\xi, \chi)\) parameterization in estimation, the additional preliminary steps of converting to the \((\zeta, \rho)\) are,

\[ \begin{eqnarray} \zeta & = & \frac{1}{{{{\hat \xi }^2}}} - 1, \hfill\\ \rho & = & \frac{{\hat \chi }}{{\hat \xi }}. \hfill \end{eqnarray} \]

9.2.6 The Generalized Hyperbolic Skew Student Distribution

The ghyp Skew-Student distribution was popularized by Aas and Haff (2006) because of its uniqueness in the ghyp family in having one tail with polynomial and one with exponential behavior. This distribution is a limiting case of the ghyp when \(\alpha \to \left| \beta \right|\) and \(\lambda=-\nu/2\), where \(\nu\) is the shape parameter of the Student distribution. The domain of variation of the parameters is \(\beta \in \mathbb{R}\) and \(\nu>0\), but for the variance to be finite \(\nu>4\), while for the existence of skewness and kurtosis, \(\nu>6\) and \(\nu>8\) respectively. The density of the random variable \(x\) is then given by:

\[ f\left( x \right) = \frac{{{2^{\left( {1 - \nu } \right)/2}}{\delta ^\nu }{{\left| \beta \right|}^{\left( {\nu + 1} \right)/2}}{K_{\left( {\nu + 1} \right)/2}}\left( {\sqrt {{\beta ^2}\left( {{\delta ^2} + {{\left( {x - \mu } \right)}^2}} \right)} } \right)\exp \left( {\beta \left( {x - \mu } \right)} \right)}} {{\Gamma \left( {\nu /2} \right)\sqrt \pi {{\left( {\sqrt {{\delta ^2} + {{\left( {x - \mu } \right)}^2}} } \right)}^{\left( {\nu + 1} \right)/2}}}} \]

To standardize the distribution to have zero mean and unit variance, I make use of the first two moment conditions for the distribution which are: \[ \begin{gathered} E\left( x \right) = \mu + \frac{{\beta {\delta ^2}}} {{\nu - 2}} \hfill \\ Var\left( x \right) = \frac{{2{\beta ^2}{\delta ^4}}} {{{{\left( {\nu - 2} \right)}^2}\left( {\nu - 4} \right)}} + \frac{{{\delta ^2}}} {{\nu - 2}} \hfill \\ \end{gathered} \]

We require that \(Var(x)=1\), thus:

\[ \delta = {\left( {\frac{{2{{\bar \beta }^2}}} {{{{\left( {\nu - 2} \right)}^2}\left( {\nu - 4} \right)}} + \frac{1} {{\nu - 2}}} \right)^{ - 1/2}} \]

where I have made use of the \(4^{th}\) parameterization of the ghyp distribution given in Prause (1999) where \(\hat \beta = \beta \delta\). The location parameter is then rescaled by substituting into the first moment formula \(\delta\) so that it has zero mean:

\[ \bar \mu = - \frac{{\beta {\delta ^2}}}{{\nu - 2}} \]

Therefore, we model the ghyp Skew-Student using the location-scale invariant parameterization \((\bar \beta, \nu)\) and then translate the parameters into the usual ghyp distribution’s \((\alpha, \beta, \delta, \mu)\), setting \(\alpha = abs(\beta)+1e-12\).

The quantile function is calculated using the SkewHyperbolic package of D. Scott and Grimson (2018) using the spline method (for speed), as is the cumulative distribution function.

9.2.7 Johnson’s reparameterized SU Distribution

The reparameterized Johnson SU distribution, discussed in Rigby and Stasinopoulos (2005), is a four parameter distribution denoted by \(JSU\left(\mu,\sigma,\nu,\tau\right)\), with mean \(\mu\) and standard deviation \(\sigma\) for all values of the skew and shape parameters \(\nu\) and \(\tau\) respectively. The implementation is taken from the GAMLSS package of Stasinopoulos, Rigby, and Akantziliotou (2009) and the reader is referred there for further details.

9.3 A note on analytic higher moments

The skewness and kurtosis of all distributions is based on analytic forms, with the exception of the skew student distribution for which a bijection between this and Hansen (1994) Generalized Skew-T distribution is used.²²

9.4 Implementation and Code Examples

The package implements standard d,p,q,r functions for all distributions in addition to the wrapper functions ddist,pdist,qdist and rdist. The skewness and kurtosis are provider in wrapper functions for all distributions in dskewness and dkurtosis as well as the tsmoments function on an estimated object of class tsdistribution.estimate. The following short demo showcases some of the available functionality.

library(tsdistributions)
dist <- "ghst"
r <- rdist(dist, 2000, mu = 0.1, sigma = 0.3, skew = -10, shape = 5)
spec <- distribution_modelspec(r, dist)
mod <- estimate(spec)
summary(mod)

## 
## 
## Parameter    Est[Value]   Std. Error   t value   Pr(>|t|)
## ----------  -----------  -----------  --------  ---------
## mu               0.1061       0.0047   22.6870     0.0000
## sigma            0.2222       0.0247    8.9969     0.0000
## skew           -11.2932       3.9833   -2.8352     0.0046
## shape            5.9021       0.5673   10.4041     0.0000

pars <- coef(mod)
# equivalence
rx <- (r - pars[1])/pars[2]
test_llh <- sum(-log(ddist(dist, rx, mu = 0, sigma = 1, skew = pars[3], 
                           shape = pars[4])/pars[2]))
all.equal(test_llh, mod$llh)

## [1] TRUE

References

Aas, K., and I.H. Haff. 2006. “The Generalized Hyperbolic Skew Student’s t-Distribution.” Journal of Financial Econometrics 4 (2): 275–309.

Fernandez, C., and M.F. Steel. 1998. “On Bayesian Modeling of Fat Tails and Skewness.” Journal of the American Statistical Association 93 (441): 359–71.

Hansen, Bruce E. 1994. “Autoregressive Conditional Density Estimation.” International Economic Review, 705–30.

Prause, K. 1999. “The Generalized Hyperbolic Model: Estimation, Financial Derivatives, and Risk Measures.” PhD thesis, University of Freiburg.

Rigby, R.A., and D.M. Stasinopoulos. 2005. “Generalized Additive Models for Location, Scale and Shape.” Journal of the Royal Statistical Society: Series C (Applied Statistics) 54 (3): 507–54.

Scott, David, and Fiona Grimson. 2018. SkewHyperbolic: The Skew Hyperbolic Student t-Distribution. https://r-forge.r-project.org/projects/rmetrics/.

Stasinopoulos, D.M., B.A. Rigby, and C. Akantziliotou. 2009. Gamlss: Generalized Additive Models for Location, Scale and Shape. 1.11 ed.

In some representations, mostly Bayesian, this is represented in its inverse form to denote the precision.↩︎
The parameters \(\zeta\) and \(\rho\) do not change as a result of being location and scale invariant↩︎
Based on code from Michael Rockinger at [here]{https://www.hec.unil.ch/matlabcodes/}↩︎