Generalized normal distribution and Skew normal distribution

Posted 2020-12-03 sddai


Density Function

The Generalized Gaussian density has the following form:

$$p(x;\rho) = \frac{1}{2\,\Gamma(1+1/\rho)}\,\exp\left(-|x|^{\rho}\right)$$

where $\rho$ (rho) is the "shape parameter". The density is plotted in the following figure:

Matlab code used to generate this figure is available here: ggplot.m.

Adding an arbitrary location parameter, $\mu$, and inverse scale parameter, $\beta$, the density has the form,

$$p(x;\mu,\beta,\rho) = \frac{\beta}{2\,\Gamma(1+1/\rho)}\,\exp\left(-\beta^{\rho}\,|x-\mu|^{\rho}\right)$$

Matlab code used to generate this figure is available here: ggplot2.m.

Generating Random Samples

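Samples from the Generalized Gaussian can be generated by a transformation of Gamma random samples, using the fact that if $Y$ is a $\mathrm{Gamma}(1/\rho, 1)$ distributed random variable, and $S$ is an independent random variable taking the value -1 or +1 with equal probability, then,

$$X = S\,Y^{1/\rho}$$

is distributed according to the Generalized Gaussian density with shape parameter $\rho$ (zero location, unit scale). That is,

$$p_Y(y) = \frac{1}{\rho\,\Gamma(1+1/\rho)}\,y^{1/\rho-1}e^{-y},\ y>0 \quad\Longrightarrow\quad p_X(x) = \frac{1}{2\,\Gamma(1+1/\rho)}\,e^{-|x|^{\rho}}$$

where the density of $Y$ is written in a non-standard but suggestive form.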
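The Gamma transformation described above can be sketched in a few lines of Python. This is a minimal illustration rather than the `gg6.m` routine: the function names `sample_gg` and `gg_variance` are made up for this example, and only the unit-scale, zero-location case is covered. It relies on the standard-library call `random.gammavariate(shape, scale)`.

```python
import math
import random


def sample_gg(rho, n, rng=None):
    """Draw n samples from the density exp(-|x|**rho) / (2 * Gamma(1 + 1/rho))."""
    rng = rng or random.Random()
    samples = []
    for _ in range(n):
        y = rng.gammavariate(1.0 / rho, 1.0)       # Y ~ Gamma(shape=1/rho, scale=1)
        s = 1.0 if rng.random() < 0.5 else -1.0    # random sign S, +1 or -1
        samples.append(s * y ** (1.0 / rho))       # X = S * Y**(1/rho)
    return samples


def gg_variance(rho):
    """Theoretical variance of the unit-scale density: Gamma(3/rho) / Gamma(1/rho)."""
    return math.gamma(3.0 / rho) / math.gamma(1.0 / rho)


if __name__ == "__main__":
    xs = sample_gg(2.0, 50000, random.Random(0))
    print(sum(x * x for x in xs) / len(xs), gg_variance(2.0))
```

Comparing the sample second moment with `gg_variance(rho)` is a quick sanity check; a location and scale can be applied afterwards by shifting and rescaling the returned samples.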

Matlab Code

Matlab code to generate random variates from the Generalized Gaussian density with parameters as described here is here:

gg6.m

As an example, we generate random samples from the example Generalized Gaussian densities shown above.

Matlab code used to generate this figure is available here: ggplot3.m.

Mixture Densities

A more general family of densities can be constructed from mixtures of Generalized Gaussians. A mixture density, $p_M(x)$, is made up of $m$ constituent densities $p_j(x)$ together with probabilities $\pi_j$ associated with each constituent density.

$$p_M(x) = \sum_{j=1}^{m} \pi_j\, p_j(x)$$

The densities $p_j(x)$ have different forms, or parameter values. A random variable with a mixture density can be thought of as being generated by a two-part process: first a decision is made as to which constituent density to draw from, where the $j$th density is chosen with probability $\pi_j$, then the value of the random variable is drawn from the chosen density. Independent repetitions of this process result in a sample having the mixture density $p_M(x)$.
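The two-part process can be sketched as follows. The helper name `sample_mixture` and the two-component example are hypothetical; the components are plain Gaussians (the shape-2 case of the Generalized Gaussian) only to keep the sketch short.

```python
import random


def sample_mixture(weights, samplers, n, rng=None):
    """Two-part mixture sampling: pick a component, then draw from it."""
    rng = rng or random.Random()
    draws, labels = [], []
    for _ in range(n):
        # Step 1: choose component j with probability weights[j].
        j = rng.choices(range(len(weights)), weights=weights)[0]
        # Step 2: draw the value from the chosen constituent density.
        draws.append(samplers[j](rng))
        labels.append(j)
    return draws, labels


# Hypothetical example: two Gaussian components with weights 0.3 and 0.7.
components = [
    lambda r: r.gauss(-2.0, 0.5),
    lambda r: r.gauss(3.0, 1.0),
]

if __name__ == "__main__":
    draws, labels = sample_mixture([0.3, 0.7], components, 10000, random.Random(1))
    print(labels.count(0) / len(labels), sum(draws) / len(draws))
```

Any sampler that accepts a `random.Random` instance can be used as a component, including a Generalized Gaussian sampler.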

As an example consider the density,

[equation image missing from the source]

Matlab code used to generate these figures is available here: ggplot4.m.

The generalized normal distribution or generalized Gaussian distribution (GGD) is either of two families of parametric continuous probability distributions on the real line. Both families add a shape parameter to the normal distribution. To distinguish the two families, they are referred to below as "version 1" and "version 2". However this is not a standard nomenclature.

Version 1

Generalized Normal (version 1)
Probability density function
Cumulative distribution function
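Parameters	$\mu$ location (real); $\alpha$ scale (positive, real); $\beta$ shape (positive, real)
Support	$x \in (-\infty, +\infty)$
PDF	$\frac{\beta}{2\alpha\Gamma(1/\beta)}\, e^{-(|x-\mu|/\alpha)^{\beta}}$, where $\Gamma$ denotes the gamma function
CDF	$\frac{1}{2} + \frac{\operatorname{sgn}(x-\mu)}{2\Gamma(1/\beta)}\, \gamma\left(\frac{1}{\beta}, \left(\frac{|x-\mu|}{\alpha}\right)^{\beta}\right)$, where $\gamma$ denotes the lower incomplete gamma function
Mean	$\mu$
Median	$\mu$
Mode	$\mu$
Variance	$\frac{\alpha^{2}\,\Gamma(3/\beta)}{\Gamma(1/\beta)}$
Skewness	0
Ex. kurtosis	$\frac{\Gamma(5/\beta)\,\Gamma(1/\beta)}{\Gamma(3/\beta)^{2}} - 3$
Entropy	$\frac{1}{\beta} - \log\left[\frac{\beta}{2\alpha\Gamma(1/\beta)}\right]$ ^[1]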

Known also as the exponential power distribution, or the generalized error distribution, this is a parametric family of symmetric distributions. It includes all normal and Laplace distributions, and as limiting cases it includes all continuous uniform distributions on bounded intervals of the real line.

This family includes the normal distribution when $\beta = 2$ (with mean $\mu$ and variance $\alpha^2/2$) and it includes the Laplace distribution when $\beta = 1$. As $\beta \to \infty$, the density converges pointwise to a uniform density on $(\mu - \alpha, \mu + \alpha)$.

This family allows for tails that are either heavier than normal (when $\beta < 2$) or lighter than normal (when $\beta > 2$). It is a useful way to parametrize a continuum of symmetric, platykurtic densities spanning from the normal ($\beta = 2$) to the uniform density ($\beta = \infty$), and a continuum of symmetric, leptokurtic densities spanning from the Laplace ($\beta = 1$) to the normal density ($\beta = 2$).
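The special cases can be checked numerically. The sketch below assumes the usual exponential power parameterization, with density $\frac{\beta}{2\alpha\Gamma(1/\beta)} e^{-(|x-\mu|/\alpha)^{\beta}}$, variance $\alpha^2\Gamma(3/\beta)/\Gamma(1/\beta)$ and excess kurtosis $\Gamma(5/\beta)\Gamma(1/\beta)/\Gamma(3/\beta)^2 - 3$; all function names are illustrative.

```python
import math


def gennorm_pdf(x, mu=0.0, alpha=1.0, beta=2.0):
    """Version-1 generalized normal density (exponential power form)."""
    coef = beta / (2.0 * alpha * math.gamma(1.0 / beta))
    return coef * math.exp(-((abs(x - mu) / alpha) ** beta))


def gennorm_variance(alpha, beta):
    return alpha ** 2 * math.gamma(3.0 / beta) / math.gamma(1.0 / beta)


def gennorm_excess_kurtosis(beta):
    num = math.gamma(5.0 / beta) * math.gamma(1.0 / beta)
    return num / math.gamma(3.0 / beta) ** 2 - 3.0


def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def laplace_pdf(x, mu, b):
    return math.exp(-abs(x - mu) / b) / (2.0 * b)


if __name__ == "__main__":
    # beta = 2 should match a normal with standard deviation alpha / sqrt(2).
    print(gennorm_pdf(0.7, 0.2, 1.5, 2.0), normal_pdf(0.7, 0.2, 1.5 / math.sqrt(2.0)))
    # beta = 1 should match a Laplace density with scale alpha.
    print(gennorm_pdf(0.7, 0.2, 1.5, 1.0), laplace_pdf(0.7, 0.2, 1.5))
```

The excess kurtosis is positive for $\beta < 2$, zero at $\beta = 2$ and negative for $\beta > 2$, which matches the leptokurtic and platykurtic ranges described above.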

Parameter estimation

Parameter estimation via maximum likelihood and the method of moments has been studied.^[2] The estimates do not have a closed form and must be obtained numerically. Estimators that do not require numerical calculation have also been proposed.^[3]

The generalized normal log-likelihood function has infinitely many continuous derivatives (i.e. it belongs to the class C^∞ of smooth functions) only if $\beta$ is a positive, even integer. Otherwise, the function has $\lfloor \beta \rfloor$ continuous derivatives. As a result, the standard results for consistency and asymptotic normality of maximum likelihood estimates of $\beta$ only apply when $\beta \ge 2$.

Maximum likelihood estimator

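It is possible to fit the generalized normal distribution adopting an approximate maximum likelihood method.^[4]^[5] With $\mu$ initially set to the sample first moment (the sample mean), $\beta$ is estimated by using a Newton–Raphson iterative procedure, starting from an initial guess of $\beta = \beta_0$,

$$\beta_0 = \frac{m_1}{\sqrt{m_2}},$$

where

$$m_1 = \frac{1}{N}\sum_{i=1}^{N} |x_i|$$

is the first statistical moment of the absolute values and $m_2$ is the second statistical moment. The iteration is

$$\beta_{k+1} = \beta_k - \frac{g(\beta_k)}{g'(\beta_k)},$$

where

$$g(\beta) = 1 + \frac{\psi(1/\beta)}{\beta} - \frac{\sum_{i=1}^{N} |x_i-\mu|^{\beta} \log|x_i-\mu|}{\sum_{i=1}^{N} |x_i-\mu|^{\beta}} + \frac{\log\left(\frac{\beta}{N}\sum_{i=1}^{N} |x_i-\mu|^{\beta}\right)}{\beta},$$

and

$$g'(\beta) = -\frac{\psi(1/\beta)}{\beta^{2}} - \frac{\psi'(1/\beta)}{\beta^{3}} + \frac{1}{\beta^{2}} - \frac{\sum_{i=1}^{N} |x_i-\mu|^{\beta} (\log|x_i-\mu|)^{2}}{\sum_{i=1}^{N} |x_i-\mu|^{\beta}} + \frac{\left(\sum_{i=1}^{N} |x_i-\mu|^{\beta} \log|x_i-\mu|\right)^{2}}{\left(\sum_{i=1}^{N} |x_i-\mu|^{\beta}\right)^{2}} + \frac{\sum_{i=1}^{N} |x_i-\mu|^{\beta} \log|x_i-\mu|}{\beta\sum_{i=1}^{N} |x_i-\mu|^{\beta}} - \frac{\log\left(\frac{\beta}{N}\sum_{i=1}^{N} |x_i-\mu|^{\beta}\right)}{\beta^{2}},$$

and where $\psi$ and $\psi'$ are the digamma function and trigamma function.

Given a value for $\beta$, it is possible to estimate $\mu$ by finding the minimum of:

$$\min_{\mu} \sum_{i=1}^{N} |x_i-\mu|^{\beta}$$

Finally $\alpha$ is evaluated as

$$\alpha = \left(\frac{\beta}{N}\sum_{i=1}^{N} |x_i-\mu|^{\beta}\right)^{1/\beta}.$$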
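A simplified version of this fit can be sketched in Python. It departs from the procedure above in three ways, so it should be read as an illustration rather than the cited method: the root of $g(\beta) = 1 + \psi(1/\beta)/\beta - \sum|x_i-\mu|^{\beta}\log|x_i-\mu| / \sum|x_i-\mu|^{\beta} + \log\left(\frac{\beta}{N}\sum|x_i-\mu|^{\beta}\right)/\beta$ is found by bisection instead of Newton–Raphson (so the trigamma function is not needed), $\mu$ is held at the sample mean instead of being refined, and the digamma function is approximated by a recurrence plus an asymptotic series because the standard library does not provide one. The names `digamma`, `g` and `fit_gennorm` and the search bracket `[0.3, 10]` are assumptions of this sketch.

```python
import math
import random


def digamma(x):
    """Digamma for x > 0: upward recurrence, then an asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result + math.log(x) - 0.5 / x - series


def g(beta, logs):
    """g(beta) from the text; logs holds log|x_i - mu| for nonzero deviations."""
    n = len(logs)
    powers = [math.exp(beta * lg) for lg in logs]        # |x_i - mu|**beta
    s = sum(powers)
    s_log = sum(p * lg for p, lg in zip(powers, logs))
    return 1.0 + digamma(1.0 / beta) / beta - s_log / s + math.log(beta * s / n) / beta


def fit_gennorm(xs, lo=0.3, hi=10.0, tol=1e-8):
    """Return (mu, alpha, beta), with beta solving g(beta) = 0 by bisection."""
    mu = sum(xs) / len(xs)                               # mu held at the sample mean
    logs = [math.log(abs(x - mu)) for x in xs if x != mu]
    g_lo = g(lo, logs)
    if g_lo * g(hi, logs) > 0.0:
        raise ValueError("g does not change sign on [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = g(mid, logs)
        if g_lo * g_mid <= 0.0:
            hi = mid
        else:
            lo, g_lo = mid, g_mid
    beta = 0.5 * (lo + hi)
    mean_pow = sum(math.exp(beta * lg) for lg in logs) / len(logs)
    alpha = (beta * mean_pow) ** (1.0 / beta)            # scale from the last formula
    return mu, alpha, beta


if __name__ == "__main__":
    data = [random.gauss(1.0, 2.0) for _ in range(20000)]
    print(fit_gennorm(data))
```

For normally distributed data with standard deviation $\sigma$, the fit should return $\beta$ close to 2 and $\alpha$ close to $\sigma\sqrt{2}$. Bisection trades speed for robustness, since it cannot step outside the bracket the way a Newton update can.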

Applications

This version of the generalized normal distribution has been used in modeling when the concentration of values around the mean and the tail behavior are of particular interest.^[6]^[7] Other families of distributions can be used if the focus is on other deviations from normality. If the symmetry of the distribution is the main interest, the skew normal family or version 2 of the generalized normal family discussed below can be used. If the tail behavior is the main interest, the Student's t family can be used, which approaches the normal distribution as the number of degrees of freedom grows to infinity. The t distribution, unlike this generalized normal distribution, obtains heavier-than-normal tails without acquiring a cusp at the origin.

Properties

The multivariate generalized normal distribution, i.e. the product of $n$ exponential power distributions with the same $\beta$ and $\alpha$ parameters, is the only probability density that can be written in the form $p(\mathbf{x}) = g(\|\mathbf{x}\|_{\beta})$ and has independent marginals.^[8] The result for the special case of the multivariate normal distribution is originally attributed to Maxwell.^[9]

Version 2

Generalized Normal (version 2)
Probability density function
Cumulative distribution function
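Parameters	$\xi$ location (real); $\alpha$ scale (positive, real); $\kappa$ shape (real)
Support	$x \in (-\infty, \xi + \alpha/\kappa)$ if $\kappa > 0$; $x \in (-\infty, \infty)$ if $\kappa = 0$; $x \in (\xi + \alpha/\kappa, +\infty)$ if $\kappa < 0$
PDF	$\frac{\phi(y)}{\alpha - \kappa(x-\xi)}$, where $y = -\frac{1}{\kappa}\log\left[1 - \frac{\kappa(x-\xi)}{\alpha}\right]$ if $\kappa \neq 0$ and $y = \frac{x-\xi}{\alpha}$ if $\kappa = 0$; $\phi$ is the standard normal pdf
CDF	$\Phi(y)$, where $y$ is defined as for the PDF and $\Phi$ is the standard normal CDF
Mean	$\xi - \frac{\alpha}{\kappa}\left(e^{\kappa^{2}/2} - 1\right)$
Median	$\xi$
Variance	$\frac{\alpha^{2}}{\kappa^{2}}\, e^{\kappa^{2}}\left(e^{\kappa^{2}} - 1\right)$
Skewness	$\frac{3e^{\kappa^{2}} - e^{3\kappa^{2}} - 2}{\left(e^{\kappa^{2}} - 1\right)^{3/2}}\operatorname{sign}(\kappa)$
Ex. kurtosis	$e^{4\kappa^{2}} + 2e^{3\kappa^{2}} + 3e^{2\kappa^{2}} - 6$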

This is a family of continuous probability distributions in which the shape parameter can be used to introduce skew.^[10]^[11] When the shape parameter is zero, the normal distribution results. Positive values of the shape parameter yield left-skewed distributions bounded to the right, and negative values of the shape parameter yield right-skewed distributions bounded to the left. Only when the shape parameter is zero is the density function for this distribution positive over the whole real line: in this case the distribution is a normal distribution, otherwise the distributions are shifted and possibly reversed log-normal distributions.
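The "shifted and possibly reversed log-normal" description suggests a direct way to simulate this family. Assuming the usual parameterization with location $\xi$, scale $\alpha$ and shape $\kappa$, in which $y = -\frac{1}{\kappa}\log\left[1 - \kappa(x-\xi)/\alpha\right]$ is standard normal, inverting that relation gives $x = \xi + \frac{\alpha}{\kappa}\left(1 - e^{-\kappa z}\right)$ for a standard normal $z$. The sketch below uses this inversion; the function names are illustrative.

```python
import math
import random


def sample_gennorm2(xi, alpha, kappa, n, rng=None):
    """Draw n samples by transforming standard normal variates."""
    rng = rng or random.Random()
    out = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        if kappa == 0.0:
            out.append(xi + alpha * z)                   # plain normal case
        else:
            out.append(xi + (alpha / kappa) * (1.0 - math.exp(-kappa * z)))
    return out


def gennorm2_mean(xi, alpha, kappa):
    """Theoretical mean: xi - (alpha / kappa) * (exp(kappa**2 / 2) - 1)."""
    if kappa == 0.0:
        return xi
    return xi - (alpha / kappa) * (math.exp(kappa ** 2 / 2.0) - 1.0)


if __name__ == "__main__":
    xs = sample_gennorm2(0.0, 1.0, 0.5, 40000, random.Random(7))
    print(max(xs), sum(xs) / len(xs), gennorm2_mean(0.0, 1.0, 0.5))
```

With $\kappa > 0$ every sample stays below $\xi + \alpha/\kappa$, and the sample median stays near $\xi$, in line with the bounded, left-skewed behavior described above.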

Parameter estimation

Parameters can be estimated via maximum likelihood estimation or the method of moments. The parameter estimates do not have a closed form, so numerical calculations must be used to compute the estimates. Since the sample space (the set of real numbers where the density is non-zero) depends on the true value of the parameter, some standard results about the performance of parameter estimates will not automatically apply when working with this family.

Applications

This family of distributions can be used to model values that may be normally distributed, or that may be either right-skewed or left-skewed relative to the normal distribution. The skew normal distribution is another distribution that is useful for modeling deviations from normality due to skew. Other distributions used to model skewed data include the gamma, lognormal, and Weibull distributions, but these do not include the normal distributions as special cases.

Other distributions related to the normal

The two generalized normal families described here, like the skew normal family, are parametric families that extend the normal distribution by adding a shape parameter. Due to the central role of the normal distribution in probability and statistics, many distributions can be characterized in terms of their relationship to the normal distribution. For example, the lognormal, folded normal, and inverse normal distributions are defined as transformations of a normally-distributed value, but unlike the generalized normal and skew-normal families, these do not include the normal distributions as special cases.
In fact, all distributions with finite variance are closely related to the normal distribution in the limit. The Student-t distribution, the Irwin–Hall distribution and the Bates distribution also extend the normal distribution, and include the normal distribution in the limit. So there is no strong reason to prefer the "generalized" normal distribution of type 1, e.g. over a combination of Student-t and a normalized extended Irwin–Hall; this would include e.g. the triangular distribution (which cannot be modeled by the generalized Gaussian type 1).
A symmetric distribution that can model both tail behavior (long and short) and center behavior (such as flat, triangular or Gaussian) completely independently could be derived e.g. by using X = IH/chi.

Skew normal distribution

Skew Normal
Probability density function
Cumulative distribution function
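Parameters	$\xi$ location (real); $\omega$ scale (positive, real); $\alpha$ shape (real)
Support	$x \in (-\infty, +\infty)$
PDF	$\frac{2}{\omega\sqrt{2\pi}}\, e^{-\frac{(x-\xi)^{2}}{2\omega^{2}}} \int_{-\infty}^{\alpha\left(\frac{x-\xi}{\omega}\right)} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{t^{2}}{2}}\, dt$
CDF	$\Phi\left(\frac{x-\xi}{\omega}\right) - 2T\left(\frac{x-\xi}{\omega}, \alpha\right)$, where $T(h, a)$ is Owen's T function
Mean	$\xi + \omega\delta\sqrt{\frac{2}{\pi}}$ where $\delta = \frac{\alpha}{\sqrt{1+\alpha^{2}}}$
Variance	$\omega^{2}\left(1 - \frac{2\delta^{2}}{\pi}\right)$
Skewness	$\frac{4-\pi}{2}\, \frac{\left(\delta\sqrt{2/\pi}\right)^{3}}{\left(1 - 2\delta^{2}/\pi\right)^{3/2}}$
Ex. kurtosis	$2(\pi - 3)\, \frac{\left(\delta\sqrt{2/\pi}\right)^{4}}{\left(1 - 2\delta^{2}/\pi\right)^{2}}$
MGF	$M_X(t) = 2\exp\left(\xi t + \frac{\omega^{2}t^{2}}{2}\right)\Phi(\omega\delta t)$
CF	$e^{it\xi - t^{2}\omega^{2}/2}\left(1 + i\,\operatorname{Erfi}\left(\frac{\delta\omega t}{\sqrt{2}}\right)\right)$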

In probability theory and statistics, the skew normal distribution is a continuous probability distribution that generalises the normal distribution to allow for non-zero skewness.

Definition

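Let $\phi(x)$ denote the standard normal probability density function

$$\phi(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^{2}/2}$$

with the cumulative distribution function given by

$$\Phi(x) = \int_{-\infty}^{x} \phi(t)\, dt = \frac{1}{2}\left[1 + \operatorname{erf}\left(\frac{x}{\sqrt{2}}\right)\right],$$

where erf is the error function. Then the probability density function (pdf) of the skew-normal distribution with parameter $\alpha$ is given by

$$f(x) = 2\phi(x)\Phi(\alpha x).$$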

This distribution was first introduced by O'Hagan and Leonard (1976). A popular alternative parameterization is due to Mudholkar and Hutson (2000), which has a form of the c.d.f. that is easily inverted such that there is a closed form solution to the quantile function.

A stochastic process that underpins the distribution was described by Andel, Netuka and Zvara (1984).^[1] Both the distribution and its stochastic process underpinnings were consequences of the symmetry argument developed in Chan and Tong (1986), which applies to multivariate cases beyond normality, e.g. skew multivariate t distribution and others. The distribution is a particular case of a general class of distributions with probability density functions of the form f(x)=2 φ(x) Φ(x) where φ() is any PDF symmetric about zero and Φ() is any CDF whose PDF is symmetric about zero.^[2]

To add location and scale parameters to this, one makes the usual transform $x \to \frac{x-\xi}{\omega}$. One can verify that the normal distribution is recovered when $\alpha = 0$, and that the absolute value of the skewness increases as the absolute value of $\alpha$ increases. The distribution is right skewed if $\alpha > 0$ and is left skewed if $\alpha < 0$. The probability density function with location $\xi$, scale $\omega$, and parameter $\alpha$ becomes

$$f(x) = \frac{2}{\omega}\,\phi\left(\frac{x-\xi}{\omega}\right)\Phi\left(\alpha\,\frac{x-\xi}{\omega}\right).$$

Note, however, that the skewness of the distribution is limited to the interval $(-1, 1)$.
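The location-scale density can be evaluated with nothing more than `math.erf`. The sketch below assumes the form $f(x) = \frac{2}{\omega}\phi(z)\Phi(\alpha z)$ with $z = (x-\xi)/\omega$, together with the mean $\xi + \omega\delta\sqrt{2/\pi}$ where $\delta = \alpha/\sqrt{1+\alpha^2}$; the function names are chosen for this example only.

```python
import math


def std_normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def skewnorm_pdf(x, xi=0.0, omega=1.0, alpha=0.0):
    """Skew normal density with location xi, scale omega and shape alpha."""
    z = (x - xi) / omega
    return 2.0 / omega * std_normal_pdf(z) * std_normal_cdf(alpha * z)


def skewnorm_mean(xi, omega, alpha):
    delta = alpha / math.sqrt(1.0 + alpha * alpha)
    return xi + omega * delta * math.sqrt(2.0 / math.pi)


if __name__ == "__main__":
    # Crude Riemann sum to confirm that the density integrates to about 1.
    h = 0.001
    grid = [-20.0 + i * h for i in range(45000)]
    print(sum(skewnorm_pdf(x, 1.0, 2.0, 4.0) for x in grid) * h)
```

Setting `alpha=0.0` reduces the function to the ordinary normal density with mean `xi` and standard deviation `omega`.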

Estimation

Maximum likelihood estimates for $\xi$, $\omega$, and $\alpha$ can be computed numerically, but no closed-form expression for the estimates is available unless $\alpha = 0$. If a closed-form expression is needed, the method of moments can be applied to estimate $\alpha$ from the sample skew, by inverting the skewness equation. This yields the estimate

$$|\delta| = \sqrt{\frac{\pi}{2}\, \frac{|\hat{\gamma}_3|^{2/3}}{|\hat{\gamma}_3|^{2/3} + \left((4-\pi)/2\right)^{2/3}}}$$

where $\delta = \frac{\alpha}{\sqrt{1+\alpha^{2}}}$, and $\hat{\gamma}_3$ is the sample skew. The sign of $\delta$ is the same as the sign of $\hat{\gamma}_3$. Consequently, $\hat{\alpha} = \delta / \sqrt{1-\delta^{2}}$.

The maximum (theoretical) skewness is obtained by setting $\delta = 1$ in the skewness equation, giving $\gamma_1 \approx 0.9952717$. However it is possible that the sample skewness is larger, and then $\alpha$ cannot be determined from these equations. When using the method of moments in an automatic fashion, for example to give starting values for maximum likelihood iteration, one should therefore let (for example) $|\hat{\gamma}_3| = \min\left(0.99, \left|\frac{1}{n}\sum_{i} \left(\frac{x_i - \mu}{\sigma}\right)^{3}\right|\right)$, where $\mu$ and $\sigma$ here denote the sample mean and standard deviation.
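The moment-based estimate, including the clipping of the sample skewness at 0.99, can be sketched as follows. It assumes the skewness equation $\gamma_1 = \frac{4-\pi}{2}\,\frac{(\delta\sqrt{2/\pi})^3}{(1-2\delta^2/\pi)^{3/2}}$ with $\delta = \alpha/\sqrt{1+\alpha^2}$; the function and constant names are illustrative.

```python
import math

MAX_ABS_SKEW = 0.99  # clip level, just below the theoretical maximum of about 0.9953


def sample_skewness(xs):
    """Third standardized sample moment (divisor n)."""
    n = len(xs)
    m = sum(xs) / n
    s2 = sum((x - m) ** 2 for x in xs) / n
    return sum((x - m) ** 3 for x in xs) / n / s2 ** 1.5


def skewness_from_alpha(alpha):
    """Skewness of the skew normal distribution for a given shape alpha."""
    delta = alpha / math.sqrt(1.0 + alpha * alpha)
    u = delta * math.sqrt(2.0 / math.pi)
    return (4.0 - math.pi) / 2.0 * u ** 3 / (1.0 - u * u) ** 1.5


def alpha_from_skewness(g3):
    """Invert the skewness equation, clipping |g3| so that |delta| stays below 1."""
    a = min(MAX_ABS_SKEW, abs(g3)) ** (2.0 / 3.0)
    c = ((4.0 - math.pi) / 2.0) ** (2.0 / 3.0)
    delta = math.copysign(math.sqrt(0.5 * math.pi * a / (a + c)), g3)
    return delta / math.sqrt(1.0 - delta * delta)


if __name__ == "__main__":
    for alpha in (-5.0, -1.0, 0.5, 3.0):
        print(alpha, alpha_from_skewness(skewness_from_alpha(alpha)))
```

The round trip recovers the shape parameter for moderate values, and any sample skewness beyond the clip level maps to the same large but finite value of $\hat{\alpha}$, which is usually adequate as a starting point for maximum likelihood iteration.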

Concern has been expressed about the impact of skew normal methods on the reliability of inferences based upon them.^[3]

Differential equation

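The differential equation leading to the pdf of the skew normal distribution is

$$\omega^{4} f''(x) + \left(\alpha^{2}+2\right)\omega^{2}(x-\xi)\, f'(x) + f(x)\left(\left(\alpha^{2}+1\right)(x-\xi)^{2} + \omega^{2}\right) = 0,$$

with initial conditions

$$f(0) = \frac{e^{-\xi^{2}/(2\omega^{2})}\, \operatorname{erfc}\left(\frac{\alpha\xi}{\sqrt{2}\,\omega}\right)}{\sqrt{2\pi}\,\omega}, \qquad f'(0) = \frac{\xi}{\omega^{2}}\, f(0) + \frac{\alpha}{\pi\omega^{2}}\, e^{-\left(\alpha^{2}+1\right)\xi^{2}/(2\omega^{2})}.$$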

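Generalized Gaussian distribution: sub-Gaussian, Gaussian, and super-Gaussian signals

The Gaussianity of a signal is defined through its kurtosis. For a zero-mean signal x, the kurtosis is defined as:

kurt(x)=E{x^4}-3[E{x^2}]^2

kurt(x) < 0: sub-Gaussian signal

kurt(x) = 0: Gaussian signal

kurt(x) > 0: super-Gaussian signal

Given a sample of an arbitrary signal x, its kurtosis can be computed as follows to assess its Gaussianity:

Assume x is a 1*N row vector:

x=x-mean(x)*ones(1,N); % remove the mean

KurtX=mean(x.^4)-3*(mean(x.^2))^2; % compute the kurtosis

A uniformly distributed signal is sub-Gaussian, and a Laplace-distributed signal is super-Gaussian. Speech signals are super-Gaussian. By the central limit theorem, a sum (linear mixture) of N independent signals with different distributions tends toward a Gaussian distribution, which is why the non-Gaussianity of a signal is a good optimization criterion for blind source separation.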
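The kurtosis test can also be written in Python using only the standard library. The sketch below follows the definition kurt(x)=E{x^4}-3[E{x^2}]^2 after removing the mean; the function names and the Laplace sampler (an exponential variate with a random sign) are choices made for this example.

```python
import random


def kurt(xs):
    """Fourth cumulant E{x^4} - 3*(E{x^2})^2 of the mean-removed sample."""
    n = len(xs)
    m = sum(xs) / n
    centered = [x - m for x in xs]                 # remove the mean
    m2 = sum(v * v for v in centered) / n
    m4 = sum(v ** 4 for v in centered) / n
    return m4 - 3.0 * m2 ** 2


def laplace_sample(rng):
    """Unit-scale Laplace variate: exponential magnitude with a random sign."""
    magnitude = rng.expovariate(1.0)
    return magnitude if rng.random() < 0.5 else -magnitude


if __name__ == "__main__":
    rng = random.Random(3)
    uniform = [rng.uniform(-1.0, 1.0) for _ in range(20000)]
    gaussian = [rng.gauss(0.0, 1.0) for _ in range(20000)]
    laplace = [laplace_sample(rng) for _ in range(20000)]
    for name, xs in (("uniform", uniform), ("gaussian", gaussian), ("laplace", laplace)):
        print(name, kurt(xs))
```

In theory the value is -2/15 for a uniform signal on (-1, 1), 0 for a Gaussian signal and 12 for a unit-scale Laplace signal, so the sign of the estimate separates the sub-Gaussian and super-Gaussian cases when the sample is large enough.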

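Compared with a Gaussian signal, a sub-Gaussian signal is flatter and often multimodal, while a super-Gaussian signal is more sharply peaked and has longer tails.

For a Gaussian signal, second-order statistics are sufficient to describe its properties. Typical communication signals, however, are usually sub-Gaussian, so second-order statistics are not sufficient and higher-order statistics are needed to describe them.

Non-stationary signal: loosely, a signal whose distribution parameters or distribution law change over time.

Gaussian signal: a signal whose amplitude distribution is normal.

Non-stationary Gaussian signal: a signal whose distribution law does not change over time (it is always Gaussian), but whose distribution parameters (mean and variance) do change over time.

Non-stationary signals are generally handled with time-frequency analysis and wavelet analysis.

Addendum:

A Gaussian signal is a signal whose amplitude values occur according to a Gaussian distribution.

From the ICA point of view, the drawback of Gaussian signals is that they look like a pile of corn (incidentally, the probability density curve really does look like a corn pile). Pour one pile of corn onto another and the result is still a single pile of corn, with nothing to show that it was mixed from two piles, so in theory the sources cannot be separated.

A super-Gaussian distribution is more concentrated than a Gaussian distribution.

A sub-Gaussian distribution is flatter than a Gaussian distribution.

Super-Gaussian: fourth-order cumulant greater than 0

Sub-Gaussian: fourth-order cumulant less than 0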
