Distributions
The Normal Distribution (norm
)
The Normal Distribution is a spherical distribution described completely by it first two moments, the mean and variance. Formally, the random variable \(x\) is said to be normally distributed with mean \(\mu\) and variance \(\sigma^2\) (both of which may be time varying), with density given by,
\[
f\left( x \right) = \frac{{{e^{\frac{{ - 0.5{{\left( {x - \mu } \right)}^2}}}
{{{\sigma ^2}}}}}}}
{{\sigma \sqrt {2\pi } }}.
\tag{1}\]
Following some mean filtration or whitening process, the residuals \(\varepsilon\), standardized by \(\sigma\) yield the standard normal density given by,
\[
f\left( {\frac{{x - \mu }}{\sigma }} \right) = \frac{1}
{\sigma }f\left( z \right) = \frac{1}
{\sigma }\left( {\frac{{{e^{ - 0.5{z^2}}}}}
{{\sqrt {2\pi } }}} \right).
\tag{2}\]
To obtain the conditional likelihood of some generating process at each point in time, the conditional standard deviation \(\sigma_t\), acts as a scaling factor on the density, so that:
\[
g\left(\varepsilon_t|\sigma_t \right) = \frac{1}
{\sigma _t}f\left(\frac{\varepsilon_t}{\sigma_t}\right)
\tag{3}\]
which illustrates the importance of the scaling property. Finally, the normal distribution has zero skewness and zero excess kurtosis.
The Student Distribution (std
)
It Student distribution is described completely by a shape parameter \(\nu\), but for standardization we proceed by using its 3 parameter representation as follows:
\[
f\left( x \right) = \frac{{\Gamma \left( {\frac{{\nu + 1}}
{2}} \right)}}
{{\sqrt {\beta \nu \pi } \Gamma \left( {\frac{\nu }
{2}} \right)}}{\left( {1 + \frac{{{{\left( {x - \alpha } \right)}^2}}}
{{\beta \nu }}} \right)^{ - \left( {\frac{{\nu + 1}}
{2}} \right)}}
\tag{4}\]
where \(\alpha\), \(\beta\), and \(\nu\) are the location, scale and shape parameters respectively, and \(\Gamma\) is the Gamma function. Similar to the GED distribution described later, this is a unimodal and symmetric distribution where the location parameter \(\alpha\) is the mean (and mode) of the distribution while the variance is:
\[
Var\left( x \right) = \frac{{\beta \nu }}{{\left( {\nu - 2} \right)}}.
\tag{5}\]
For the purposes of standardization we require that:
\[
\begin{gathered}
Var(x) = \frac{{\beta \nu }}
{{\left( {\nu - 2} \right)}} = 1 \hfill \\
\therefore \beta = \frac{{\nu - 2}}
{\nu } \hfill \\
\end{gathered}
\tag{6}\]
Substituting \(\frac{(\nu- 2)}{\nu }\) we obtain the standardized Student’s distribution:
\[
f\left( {\frac{{x - \mu }}{\sigma }} \right) = \frac{1}
{\sigma }f\left( z \right) = \frac{1}
{\sigma }\frac{{\Gamma \left( {\frac{{\nu + 1}}
{2}} \right)}}{{\sqrt {\left( {\nu - 2} \right)\pi } \Gamma \left( {\frac{\nu }
{2}} \right)}}{\left( {1 + \frac{{{z^2}}}
{{\left( {\nu - 2} \right)}}} \right)^{ - \left( {\frac{{\nu + 1}}
{2}} \right)}}.
\tag{7}\]
In terms of R’s standard implementation of the Student density (dt
), and including a scaling by the standard deviation, this can be represented as:
\[
\frac{{dt\left( {\frac{{{\varepsilon _t}}}
{{\sigma \sqrt {\left( {v - 2} \right)/\nu } }},\nu } \right)}}
{{\sigma \sqrt {\left( {v - 2} \right)/\nu } }}
\tag{8}\]
The Student distribution has zero skewness and excess kurtosis equal to \(6/(\nu - 4)\) for \(\nu > 4\).
The Generalized Error Distribution (ged
)
The Generalized Error Distribution is a 3 parameter distribution belonging to the exponential family with conditional density given by,
\[
f\left( x \right) = \frac{{\kappa {e^{ - 0.5{{\left| {\frac{{x - \alpha }}
{\beta }} \right|}^\kappa }}}}}
{{{2^{1 + {\kappa ^{ - 1}}}}\beta \Gamma \left( {{\kappa ^{ - 1}}} \right)}}
\tag{9}\]
with \(\alpha\), \(\beta\) and \(\kappa\) representing the location, scale and shape parameters. Since the distribution is symmetric and unimodal the location parameter is also the mode, median and mean of the distribution (i.e. \(\mu\)). By symmetry, all odd moments beyond the mean are zero. The variance and kurtosis are given by,
\[
\begin{gathered}
Var\left( x \right) = {\beta ^2}{2^{2/\kappa }}\frac{{\Gamma \left( {3{\kappa ^{ - 1}}} \right)}}
{{\Gamma \left( {{\kappa ^{ - 1}}} \right)}} \hfill \\
Ku\left( x \right) = \frac{{\Gamma \left( {5{\kappa ^{ - 1}}} \right)\Gamma \left( {{\kappa ^{ - 1}}} \right)}}
{{\Gamma \left( {3{\kappa ^{ - 1}}} \right)\Gamma \left( {3{\kappa ^{ - 1}}} \right)}} \hfill \\
\end{gathered}
\tag{10}\]
As \(\kappa\) decreases the density gets flatter and flatter while in the limit as \(\kappa \to \infty\), the distribution tends towards the uniform. Special cases are the Normal when \(\kappa=2\), the Laplace when \(\kappa=1\). Standardization is simple and involves re-scaling the density to have unit standard deviation:
\[
\begin{gathered}
Var\left( x \right) = {\beta ^2}{2^{2/\kappa }}\frac{{\Gamma \left( {3{\kappa ^{ - 1}}} \right)}}
{{\Gamma \left( {{\kappa ^{ - 1}}} \right)}} = 1 \hfill \\
\therefore \beta = \sqrt {{2^{ - 2/\kappa }}\frac{{\Gamma \left( {{\kappa ^{ - 1}}} \right)}}
{{\Gamma \left( {3{\kappa ^{ - 1}}} \right)}}} \hfill \\
\end{gathered}
\tag{11}\]
Finally, substituting into the scaled density of \(z\):
\[
f\left( {\frac{{x - \mu }}
{\sigma }} \right) = \frac{1}
{\sigma }f\left( z \right) = \frac{1}
{\sigma }\frac{{\kappa {e^{ - 0.5{{\left| {\sqrt {{2^{ - 2/\kappa }}\frac{{\Gamma \left( {{\kappa ^{ - 1}}} \right)}}
{{\Gamma \left( {3{\kappa ^{ - 1}}} \right)}}} z} \right|}^\kappa }}}}}
{{\sqrt {{2^{ - 2/\kappa }}\frac{{\Gamma \left( {{\kappa ^{ - 1}}} \right)}}
{{\Gamma \left( {3{\kappa ^{ - 1}}} \right)}}} {2^{1 + {\kappa ^{ - 1}}}}\Gamma \left( {{\kappa ^{ - 1}}} \right)}}
\tag{12}\]
Skewed Distributions by Inverse Scale Factors
Fernandez and Steel (1998) proposed introducing skewness into unimodal and symmetric distributions by introducing inverse scale factors in the positive and negative real half lines. Given a skew parameter, \(\xi\), the density of a random variable z can be represented as:
\[
f\left( {z|\xi } \right) = \frac{2}
{{\xi + {\xi ^{ - 1}}}}\left[ {f\left( {\xi z} \right)H\left( { - z} \right) + f\left( {{\xi ^{ - 1}}z} \right)H\left( z \right)} \right]
\tag{13}\]
where \(\xi \in {\mathbb{R}^ + }\) and \(H(.)\) is the Heaviside function. The absolute moments, required for deriving the central moments, are generated from the following function:
\[
{M_r} = 2\int_0^\infty {{z^r}f\left( z \right)dz}.
\tag{14}\]
The mean and variance are then defined as:
\[
\begin{gathered}
E\left( z \right) = {M_1}\left( {\xi - {\xi ^{ - 1}}} \right) \hfill \\
Var\left( z \right) = \left( {{M_2} - M_1^2} \right)\left( {{\xi ^2} + {\xi ^{ - 2}}} \right) + 2M_1^2 - {M_2} \hfill \\
\end{gathered}
\tag{15}\]
The Normal, Student and GED distributions have skew variants which have been standardized to zero mean, unit variance by making use of the moment conditions given above. These are named as snorm
, sged
and sstd
in the package.
The Generalized Hyperbolic Distribution (ghyp
)
In distributions where the expected moments are functions of all the parameters, it is not immediately obvious how to perform such a transformation. In the case of the Generalized Hyperbolic (GH) distribution, because of the existence of location and scale invariant parameterizations and the possibility of expressing the variance in terms of one of those, namely the \((\zeta, \rho)\), the task of standardizing and estimating the density can be broken down to one of estimating those 2 parameters, representing a combination of shape and skewness, followed by a series of transformation steps to demean, scale and then translate the parameters into the \((\alpha, \beta, \delta, \mu)\) parameterization for which standard formula exist for the likelihood function. The \((\xi, \chi)\) parameterization, which is a simple transformation of the \((\zeta, \rho)\), could also be used in the first step and then transformed into the latter before proceeding further. The only difference is the kind of ‘immediate’ inference one can make from the different parameterizations, each providing a different direct insight into the kind of dynamics produced and their place in the overall GH family particularly with regards to the limit cases.
The package performs estimation using the \((\zeta, \rho)\) parameterization, after which a series of steps transform those parameters into the \((\alpha, \beta, \delta, \mu)\) while at the same time including the necessary recursive substitution of parameters in order to standardize the resulting distribution.
Proof. The Standardized Generalized Hyperbolic Distribution. Let \(\varepsilon_t\) be a r.v. with mean \((0)\) and variance \(({\sigma}^2)\) distributed as \(\textrm{GH}(\zeta, \rho)\), and let \(z\) be a scaled version of the r.v. \(\varepsilon\) with variance \((1)\) and also distributed as \(\textrm{GH}(\zeta, \rho)\). The density \(f(.)\) of \(z\) can be expressed as
\[
f(\frac{\varepsilon_t}{\sigma}; \zeta ,\rho ) = \frac{1}{\sigma}f_t(z;\zeta ,\rho ) =
\frac{1}{\sigma}f_t(z;\tilde \alpha, \tilde \beta, \tilde \delta ,\tilde \mu ),
\tag{16}\]
where we make use of the \((\alpha, \beta, \delta, \mu)\) parameterization since we can only naturally express the density in that parameterization. The steps to transforming from the \((\zeta, \rho)\) to the \((\alpha, \beta, \delta, \mu)\) parameterization, while at the same time standardizing for zero mean and unit variance are given henceforth. Let
\[
\begin{eqnarray}
\zeta & = & \delta \sqrt {{\alpha ^2} - {\beta ^2}} \hfill \\
\rho & = & \frac{\beta }{\alpha }, \hfill
\end{eqnarray}
\tag{17}\]
which after some substitution may be also written in terms of \(\alpha\) and \(\beta\) as,
\[
\begin{eqnarray}
\alpha & = & \frac{\zeta }{{\delta \sqrt {(1 - {\rho ^2})} }},\hfill\\
\beta & = &\alpha \rho.\hfill
\end{eqnarray}
\tag{18}\]
For standardization we require that,
\[
\begin{eqnarray}
E\left(X\right) & = & \mu + \frac{{\beta \delta }}{{\sqrt {{\alpha ^2} - {\beta ^2}} }}\frac{{{K_{\lambda + 1}}\left(\zeta \right)}}{{{K_\lambda }\left(\zeta \right)}} = \mu + \frac{{\beta {\delta ^2}}}{\zeta }\frac{{{K_{\lambda + 1}}\left(\zeta \right)}}{{{K_\lambda }\left(\zeta \right)}} = 0 \hfill \\
\therefore \mu & = & - \frac{{\beta {\delta ^2}}}{\zeta }\frac{{{K_{\lambda + 1}}\left(\zeta \right)}}{{{K_\lambda }\left(\zeta \right)}}\hfill \\
Var\left(X\right) & = & {\delta ^2}\left(\frac{{{K_{\lambda + 1}}\left(\zeta \right)}}{{\zeta {K_\lambda }\left(\zeta \right)}} + \frac{{{\beta ^2}}}{{{\alpha ^2} - {\beta ^2}}}\left(\frac{{{K_{\lambda + 2}}\left(\zeta \right)}}{{{K_\lambda }\left(\zeta \right)}} - {\left(\frac{{{K_{\lambda + 1}}\left(\zeta \right)}}{{{K_\lambda }\left(\zeta \right)}}\right)^2}\right)\right) = 1 \hfill\nonumber \\
\therefore \delta & = & {\left(\frac{{{K_{\lambda + 1}}\left(\zeta \right)}}{{\zeta {K_\lambda }\left(\zeta \right)}} + \frac{{{\beta ^2}}}{{{\alpha ^2} - {\beta ^2}}}\left(\frac{{{K_{\lambda + 2}}\left(\zeta \right)}}{{{K_\lambda }\left(\zeta \right)}} - {\left(\frac{{{K_{\lambda + 1}}\left(\zeta \right)}}{{{K_\lambda }\left(\zeta \right)}}\right)^2}\right)\right)^{ - 0.5}} \hfill
\end{eqnarray}
\tag{19}\]
Since we can express, \(\beta^2/\left(\alpha^2 - \beta^2\right)\) as,
\[
\frac{{{\beta ^2}}}{{{\alpha ^2} - {\beta ^2}}} = \frac{{{\alpha ^2}{\rho ^2}}}{{{a^2} - {\alpha ^2}{\rho ^2}}} = \frac{{{\alpha ^2}{\rho ^2}}}{{{a^2}\left(1 - {\rho ^2}\right)}} = \frac{{{\rho ^2}}}{{\left(1 - {\rho ^2}\right)}},
\tag{20}\]
then we can re-write the formula for \(\delta\) in terms of the estimated parameters \(\hat\zeta\) and \(\hat\rho\) as,
\[
\delta = {\left(\frac{{{K_{\lambda + 1}}\left(\hat \zeta \right)}}{{\hat \zeta {K_\lambda }\left(\hat \zeta \right)}} + \frac{{{{\hat \rho }^2}}}{{\left(1 - {{\hat \rho }^2}\right)}}\left(\frac{{{K_{\lambda + 2}}\left(\hat \zeta \right)}}{{{K_\lambda }\left(\hat \zeta \right)}} - {\left(\frac{{{K_{\lambda + 1}}\left(\hat \zeta \right)}}{{{K_\lambda }\left(\hat \zeta \right)}}\right)^2}\right)\right)^{ - 0.5}}
\tag{21}\]
Transforming into the \((\tilde \alpha ,\tilde \beta ,\tilde \delta ,\tilde \mu )\) parameterization proceeds by first substituting Equation 21 into \(\alpha\) in Equation 18 and simplifying,
\[
\begin{eqnarray}
\tilde \alpha & = & \,{\frac{{\hat \zeta \left( {\frac{{{{\text{K}}_{\lambda + 1}}\left( {\hat \zeta } \right)}}{{\hat \zeta {{\text{K}}_\lambda }\left( {\hat \zeta } \right)}} + \frac{{{{\hat \rho }^2}\left( {\frac{{{{\text{K}}_{\lambda + 2}}\left( {\hat \zeta } \right)}}{{{{\text{K}}_\lambda }\left( {\hat \zeta } \right)}} - \frac{{{{\left( {{{\text{K}}_{\lambda + 1}}\left( {\hat \zeta } \right)} \right)}^2}}}{{{{\left( {{{\text{K}}_\lambda }\left( {\hat \zeta } \right)} \right)}^2}}}} \right)}}{{\left( {1 - {{\hat \rho }^2}} \right)}}} \right)}}{{\sqrt {(1 - {{\hat \rho }^2})} }}^{0.5}}, \hfill\nonumber \\
& = &\,{\frac{{\left( {\frac{{\hat \zeta {{\text{K}}_{\lambda + 1}}\left( {\hat \zeta } \right)}}{{{{\text{K}}_\lambda }\left( {\hat \zeta } \right)}} + \frac{{{{\hat \zeta }^2}{{\hat \rho }^2}\left( {\frac{{{{\text{K}}_{\lambda + 2}}\left( {\hat \zeta } \right)}}{{{{\text{K}}_\lambda }\left( {\hat \zeta } \right)}} - \frac{{{{\left( {{{\text{K}}_{\lambda + 1}}\left( {\hat \zeta } \right)} \right)}^2}}}{{{{\left( {{{\text{K}}_\lambda }\left( {\hat \zeta } \right)} \right)}^2}}}} \right)}}{{\left( {1 - {{\hat \rho }^2}} \right)}}} \right)}}{{\sqrt {(1 - {\hat \rho ^2})} }}^{0.5}}, \hfill\nonumber \\
& = & {\left( {\left. {\frac{{\frac{{\hat \zeta {{\text{K}}_{\lambda + 1}}\left( {\hat \zeta } \right)}}{{{{\text{K}}_\lambda }\left( {\hat \zeta } \right)}}}}{{(1 - {{\hat \rho }^2})}} + \frac{{{\hat \zeta ^2}{\hat \rho ^2}\left( {\frac{{{{\text{K}}_{\lambda + 2}}\left( {\hat \zeta } \right)}}{{{{\text{K}}_{\lambda + 1}}\left( {\hat \zeta } \right)}}\frac{{{{\text{K}}_{\lambda + 1}}\left( {\hat \zeta } \right)}}{{{{\text{K}}_\lambda }\left( {\hat \zeta } \right)}} - \frac{{{{\left( {{{\text{K}}_{\lambda + 1}}\left( {\hat \zeta } \right)} \right)}^2}}}{{{{\left( {{{\text{K}}_\lambda }\left( {\hat \zeta } \right)} \right)}^2}}}} \right)}}{{{{\left( {1 - {{\hat \rho }^2}} \right)}^2}}}} \right)} \right.^{0.5}}, \hfill\nonumber \\
& = & {\left( {\left. {\frac{{\frac{{\hat \zeta {{\text{K}}_{\lambda + 1}}\left( {\hat \zeta } \right)}}{{{{\text{K}}_\lambda }\left( {\hat \zeta } \right)}}}}{{(1 - {{\hat \rho }^2})}}\left(1 + \frac{{\hat \zeta {{\hat \rho }^2}\left( {\frac{{{{\text{K}}_{\lambda + 2}}\left( {\hat \zeta } \right)}}{{{{\text{K}}_{\lambda + 1}}\left( {\hat \zeta } \right)}} - \frac{{{{\text{K}}_{\lambda + 1}}\left( {\hat \zeta } \right)}}{{{{\text{K}}_\lambda }\left( {\hat \zeta } \right)}}} \right)}}{{\left( {1 - {{\hat \rho }^2}} \right)}}\right)} \right)} \right.^{0.5}}. \hfill
\end{eqnarray}
\tag{22}\]
Finally, the rest of the parameters are derived recursively from \(\tilde\alpha\) and the previous results,
\[
\begin{eqnarray}
\tilde \beta & = & \tilde \alpha \hat \rho,\hfill\\
\tilde \delta & = & \frac{{\hat \zeta }}{{\tilde \alpha \sqrt {1 - {{\hat \rho }^2}} }}, \hfill\\
\tilde \mu & = & \frac{{ - \tilde \beta {{\tilde \delta }^2}{K_{\lambda + 1}}\left(\hat \zeta \right)}}{{\hat \zeta {K_\lambda }\left(\hat \zeta \right)}}.\hfill
\end{eqnarray}
\tag{23}\]
For the use of the \((\xi, \chi)\) parameterization in estimation, the additional preliminary steps of converting to the \((\zeta, \rho)\) are,
\[
\begin{eqnarray}
\zeta & = & \frac{1}{{{{\hat \xi }^2}}} - 1, \hfill\\
\rho & = & \frac{{\hat \chi }}{{\hat \xi }}. \hfill
\end{eqnarray}
\tag{24}\]
The Generalized Hyperbolic Skew Student Distribution (ghst
)
The GH Skew-Student distribution was popularized by Aas and Haff (2006) because of its uniqueness in the GH family in having one tail with polynomial and one with exponential behavior. This distribution is a limiting case of the GH when \(\alpha \to \left| \beta \right|\) and \(\lambda=-\nu/2\), where \(\nu\) is the shape parameter of the Student distribution. The domain of variation of the parameters is \(\beta \in \mathbb{R}\) and \(\nu>0\), but for the variance to be finite \(\nu>4\), while for the existence of skewness and kurtosis, \(\nu>6\) and \(\nu>8\) respectively. The density of the random variable \(x\) is then given by:
\[
f\left( x \right) = \frac{{{2^{\left( {1 - \nu } \right)/2}}{\delta ^\nu }{{\left| \beta \right|}^{\left( {\nu + 1} \right)/2}}{K_{\left( {\nu + 1} \right)/2}}\left( {\sqrt {{\beta ^2}\left( {{\delta ^2} + {{\left( {x - \mu } \right)}^2}} \right)} } \right)\exp \left( {\beta \left( {x - \mu } \right)} \right)}}
{{\Gamma \left( {\nu /2} \right)\sqrt \pi {{\left( {\sqrt {{\delta ^2} + {{\left( {x - \mu } \right)}^2}} } \right)}^{\left( {\nu + 1} \right)/2}}}}
\tag{25}\]
To standardize the distribution to have zero mean and unit variance, I make use of the first two moment conditions for the distribution which are:
\[
\begin{gathered}
E\left( x \right) = \mu + \frac{{\beta {\delta ^2}}}
{{\nu - 2}} \hfill \\
Var\left( x \right) = \frac{{2{\beta ^2}{\delta ^4}}}
{{{{\left( {\nu - 2} \right)}^2}\left( {\nu - 4} \right)}} + \frac{{{\delta ^2}}}
{{\nu - 2}} \hfill \\
\end{gathered}
\tag{26}\]
We require that \(Var(x)=1\), thus:
\[
\delta = {\left( {\frac{{2{{\bar \beta }^2}}}
{{{{\left( {\nu - 2} \right)}^2}\left( {\nu - 4} \right)}} + \frac{1}
{{\nu - 2}}} \right)^{ - 1/2}}
\tag{27}\]
where I have made use of the \(4^{th}\) parameterization of the GH distribution given in Prause (1999) where \(\hat \beta = \beta \delta\). The location parameter is then rescaled by substituting into the first moment formula \(\delta\) so that it has zero mean:
\[
\bar \mu = - \frac{{\beta {\delta ^2}}}{{\nu - 2}}
\tag{28}\]
Therefore, we model the GH Skew-Student using the location-scale invariant parameterization \((\bar \beta, \nu)\) and then translate the parameters into the usual GH distribution’s \((\alpha, \beta, \delta, \mu)\), setting \(\alpha = abs(\beta)+1e-12\).
The quantile function is calculated using the SkewHyperbolic package of Scott and Grimson (2018) using the spline method (for speed), as is the cumulative distribution function.
Johnson’s reparameterized SU Distribution (jsu
)
The reparameterized Johnson SU distribution, discussed in Rigby and Stasinopoulos (2005), is a four parameter distribution denoted by \(\textrm{JSU}\left(\mu,\sigma,\nu,\tau\right)\), with mean \(\mu\) and standard deviation \(\sigma\) for all values of the skew and shape parameters \(\nu\) and \(\tau\) respectively. The implementation is taken from the GAMLSS package of Stasinopoulos, Rigby, and Akantziliotou (2009) and the reader is referred there for further details.