The Inverse Gaussian distribution \(\mathsf{IG}(\mu,\lambda)\) is a probability law with density
$$f(x;\mu,\lambda)=\sqrt{\frac{\lambda}{2 \pi x^{3}}}\, e^{- \frac{\lambda \left(x - \mu\right)^{2}}{2 \mu^{2} x}},\quad x>0.$$
This distribution arises as the passage time of random walks [1]. For a survey of its basic properties, see [2]. In this note we explore in detail its approximate normality, which holds when \( \lambda/\mu \to \infty\).
Consider the random variable \(X\sim \mathsf{IG}(\mu,\lambda)\) and its standardised version $$Y = \frac{X-\mathbf{E}[X]}{\sqrt{\mathbf{Var}[X]}}.$$
Below we establish normal approximation in various metrics.
Convergence in Distribution
Since the characteristic function of \(X\) is \(\phi_X(t) = e^{\frac{\lambda \left(1 - \sqrt{1 - \frac{2 i \mu^{2} t}{\lambda}}\right)}{\mu}}\), and \(\mathbf{E}[X] = \mu\), \(\mathbf{Var}[X]=\mu^3/\lambda\), the characteristic function of the standardised variable \(Y\) equals
$$ \phi_Y(t) = e^{k^{2} \left(1 - \sqrt{1 - \frac{2 i t}{k}}\right)} e^{- i k t}, $$
where \( k \triangleq \sqrt{\lambda/\mu}\). Expanding the exponent in powers of \(1/k\) gives \(k^{2}\left(1-\sqrt{1-\tfrac{2it}{k}}\right) - ikt = -\frac{t^2}{2} + O\!\left(\frac{|t|^{3}}{k}\right)\), so the limit as \(k\to\infty\) equals \(e^{-\frac{t^2}{2}}\), the characteristic function of the standard normal distribution; by Lévy's continuity theorem this proves convergence in distribution:
$$Y \overset{d}{\rightarrow} \mathsf{N}(0,1),\quad \lambda / \mu \to \infty.$$
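This expansion is easy to double-check with SymPy; here is a small sketch that substitutes \(k = 1/\varepsilon\) and expands the log-characteristic function of \(Y\) around \(\varepsilon = 0\):
import sympy as sm
t = sm.symbols('t', real=True)
eps = sm.symbols('epsilon', positive=True)  # epsilon = 1/k
# log-characteristic function of Y with k = 1/epsilon
log_phi = (1 - sm.sqrt(1 - 2*sm.I*t*eps))/eps**2 - sm.I*t/eps
sm.series(log_phi, eps, 0, 2)  # -t**2/2 - I*t**3*eps/2 + O(eps**2)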
Convergence in Total Variation
We can also look at the pdf of the standardised variable \(Y\); with \(k\) as before,
$$ f_Y(y) = \frac{\sqrt{2} k^{\frac{3}{2}} e^{- \frac{k y^{2}}{2 \left(k + y\right)}}}{2 \sqrt{\pi} \left(k + y\right)^{\frac{3}{2}}},\quad y>-k$$
which can be calculated with SymPy:
import sympy as sm
x, y, t, lmbd, mu, k = sm.symbols('x y t lambda mu k', positive=True)
# inverse Gaussian density f(x; mu, lambda)
pdf = sm.Mul(sm.sqrt(lmbd/(2*sm.pi*x**3), evaluate=False),
             sm.exp(-lmbd*(x-mu)**2/(2*mu**2*x)), evaluate=False)
mean = mu
var = mu**3/lmbd
# change of variables y = (x - mean)/sqrt(var), with the Jacobian factor sqrt(var)
norm_pdf = sm.sqrt(var) * pdf.subs({x: y*sm.sqrt(var) + mu})
# substitute lambda = k**2*mu, i.e. k = sqrt(lambda/mu)
norm_pdf.subs({lmbd: k**2*mu}).simplify()
and compare it to that of the standard normal distribution. Namely, we have
$$ f_Y(y) \rightarrow \frac{1}{\sqrt{2\pi}} e^{-\frac{y^2}{2}}, \quad k=\sqrt{\lambda / \mu} \to \infty, $$
that is, the densities converge pointwise to that of the standard normal distribution. By Scheffé's theorem [3] this proves convergence in total variation (stronger than convergence in distribution):
$$Y\overset{TV}{\rightarrow} \mathsf{N}(0,1), \quad \lambda/\mu \to \infty.$$
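Continuing the SymPy session, the pointwise limit of the density can be verified directly:
# standardised pdf computed above, as an explicit expression in y and k
f_Y = sm.sqrt(2)*k**sm.Rational(3,2)*sm.exp(-k*y**2/(2*(k+y)))/(2*sm.sqrt(sm.pi)*(k+y)**sm.Rational(3,2))
sm.limit(f_Y, k, sm.oo)  # sqrt(2)*exp(-y**2/2)/(2*sqrt(pi)), the standard normal pdf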
Kolmogorov Distance
We now derive concrete rate estimates. The ratio of the characteristic function \(\phi_k\) of \(Y\) to the characteristic function \(\phi_{\infty}\) of the standard normal equals
$$\phi_k(t)/\phi_{\infty}(t) = e^{ k^2 h(t /k ) }, \quad h(x) \triangleq 1-\sqrt{1-2 i x}- ix + \frac{x^2}{2}.$$
Observe that \(h(x)\) expands into a convergent power series around \(x=0\) when \(|x|<\frac{1}{2}\), with leading term \(h(x) = -\frac{i}{2}x^{3} + O(x^{4})\). Thus, for \(|t| \leq \sqrt{k}\) and \(k\) large enough, \(|k^2 h(t /k )| = O(|t|^{3}/k) \leq \frac{t^2}{4}\), and the elementary bound \(|e^{z}-1| \leq |z| e^{|z|}\) yields
$$ \left| \phi_k(t)-\phi_{\infty}(t) \right| = e^{-\frac{t^2}{2}} \left| e^{ k^2 h(t /k )} -1 \right| = O(1)\cdot \frac{|t|^{3}}{k}\, e^{-\frac{t^2}{4}}. $$
This gives the integral estimate
$$ \int_{|t|\leq \sqrt{k}} \left| \frac{\phi_k(t)-\phi_{\infty}(t)}{t} \right| \mathrm{d} t = O(1/k). $$
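The power series of \(h\) can also be checked symbolically; continuing the SymPy session, the expansion below in addition exhibits the real part \(\Re h(x) = \frac{5}{8}x^{4} + O(x^{5})\) used in the next step:
h = 1 - sm.sqrt(1 - 2*sm.I*x) - sm.I*x + x**2/2
sm.series(h, x, 0, 5)  # -I*x**3/2 + 5*x**4/8 + O(x**5)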
We now handle the regime \(\sqrt{k} < |t| \leq k/2\). Observe that \(\Re h(x) = \Theta(1)\, x^{4}\) on \(|x| \leq \frac{1}{2}\) by the power series expansion, so \(k^{2} \Re h(t/k) = \Theta(1)\, t^{2} (t/k)^{2} \leq c\, t^{2}\) for some constant \(c < \frac{1}{2}\) when \(|t| \leq k/2\). It follows that \(|\phi_k(t)| = e^{-\frac{t^2}{2} + k^{2} \Re h(t/k)} \leq e^{-(\frac{1}{2}-c) t^{2}}\), and therefore
$$\int_{\sqrt{k} < |t| \leq k/2} \left| \frac{\phi_k(t)-\phi_{\infty}(t)}{t} \right| \mathrm{d} t \leq O(1) \int_{|t| > \sqrt{k}} \frac{e^{-(\frac{1}{2}-c) t^{2}}}{|t|}\, \mathrm{d}t = e^{-\Omega(k)} = o(1/k). $$
The integral Berry–Esseen inequality [4] now implies the following convergence rate in the Kolmogorov–Smirnov distance:
$$d_{KS}(Y,\mathsf{N}(0,1))=O(1/k) = O(\sqrt{\mu/\lambda}).$$
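As a quick numerical sanity check, here is a small sketch using SciPy (its invgauss with the default scale=1 has \(\lambda = 1\), so \(\lambda/\mu = 1/\texttt{mu}\)); the reported distances should scale like \(1/k = \sqrt{\mu/\lambda}\):
import numpy as np
from scipy.stats import invgauss, norm
for mu_ in [0.1, 0.01, 0.001]:
    Z = invgauss(mu=mu_)  # lambda = 1, hence k = 1/sqrt(mu_)
    mean, std = Z.mean(), Z.std()
    xs = np.linspace(-8, 8, 20001)
    # Kolmogorov-Smirnov distance between standardised IG and N(0,1) on a grid
    d_ks = np.max(np.abs(Z.cdf(mean + std*xs) - norm.cdf(xs)))
    print(f"lambda/mu={1/mu_:6.0f}  d_KS={d_ks:.5f}  1/k={mu_**0.5:.5f}")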
Total Variation Distance
Observe that the ratio of the densities equals
$$\frac{f_Y(y)}{f_{\infty}(y)} = \frac{k^{\frac{3}{2}} e^{\frac{y^{3}}{2 \left(k + y\right)}}}{\left(k + y\right)^{\frac{3}{2}}},\quad y>-k.$$
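This identity can be verified in the SymPy session by dividing the expression for \(f_Y\) above by the standard normal density:
f_inf = sm.exp(-y**2/2)/sm.sqrt(2*sm.pi)  # standard normal pdf
sm.simplify(f_Y/f_inf)  # k**(3/2)*exp(y**3/(2*(k + y)))/(k + y)**(3/2)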
By power series expansions we have, uniformly in \(k\),
$$\frac{f_Y(y)}{f_{\infty}(y)} = 1 + \frac{y^{3}-3y}{2k} + O\!\left(\frac{1+y^{6}}{k^{2}}\right),\quad |y|\leq k^{1/3},$$
and because for \(|y| \geq k^{1/3}\), \(k \geq 8\) the ratio increases in \(y\) and tends to \(1\) monotonically as \(k \to \infty\) (decreasing in \(k\) for \(y>0\), increasing for \(y<0\)), as seen from the derivatives
$$\frac{\partial}{\partial y} \frac{f_Y(y)}{f_{\infty}(y)} = \frac{k^{\frac{3}{2}} \left(y^{2} \left(3 k + 2 y\right) - 3 k - 3 y\right) e^{\frac{y^{3}}{2 \left(k + y\right)}}}{2 \left(k + y\right)^{\frac{7}{2}}}, \qquad \frac{\partial}{\partial k} \frac{f_Y(y)}{f_{\infty}(y)} = - \frac{\sqrt{k}\, y \left(k y^{2} - 3 k - 3 y\right) e^{\frac{y^{3}}{2 \left(k + y\right)}}}{2 \left(k + y\right)^{\frac{7}{2}}},$$
we also have
$$ \frac{f_Y(y)}{f_{\infty}(y)} > 1,\quad y \geq k^{1/3},k\geq 8$$ and
$$ \frac{f_Y(y)}{f_{\infty}(y)} < 1,\quad y \leq -k^{1/3},k\geq 8.$$ Therefore, splitting the integral at \(\pm k^{1/3}\), and using that both densities integrate to one to control the right tail, we obtain
$$ \int \left| f_Y(y)-f_{\infty}(y) \right| \mathrm{d}y \leq \frac{O(1)}{k}\int_{|y|<k^{1/3}} \left(1+|y|^{3}\right) f_{\infty}(y)\, \mathrm{d} y + 2\int_{y<-k^{1/3}} f_{\infty}(y)\, \mathrm{d}y = O(1/k). $$
This proves that the convergence rate in total variation is \(O(1/k)\):
$$d_{TV}(Y,\mathsf{N}(0,1)) \leq O(1/k) = O(\sqrt{\mu/\lambda})$$ (in fact, the first-order term \(\frac{y^{3}-3y}{2k}\) in the expansion above also yields a matching lower bound).
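The first-order term of the density ratio can likewise be double-checked with SymPy, substituting \(k = 1/\varepsilon\) as before:
eps = sm.symbols('epsilon', positive=True)  # epsilon = 1/k
ratio = k**sm.Rational(3,2)*sm.exp(y**3/(2*(k+y)))/(k+y)**sm.Rational(3,2)
sm.series(ratio.subs(k, 1/eps), eps, 0, 2)  # 1 + (y**3 - 3*y)/2 * eps + O(eps**2)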
Numerical Evaluation
Numerical results match the theoretical analysis: the standardised densities approach the normal curve, and the total variation distance decays at the rate \((\lambda/\mu)^{-1/2}\) predicted above. For completeness, here is the script:
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import invgauss, norm

fig, axs = plt.subplots(1, 2, figsize=(12, 4))

# Part 1: evaluate total variation
mus = np.linspace(0.5, 0.00001, 100)
out = []
for mu in mus:
    # construct Inverse Gaussian (scipy's default scale=1 means lambda=1, so lambda/mu = 1/mu)
    Z = invgauss(mu=mu)
    mean = Z.mean()
    std = Z.std()
    k = (mean/std)**2
    # standardise and compare with normal
    Z_std = invgauss(mu=mu, loc=-mean/std, scale=1/std)
    Norm = norm(0, 1)
    xs = np.linspace(-10, 10, 1000)
    # total variation = half the L1 distance, via a Riemann sum on [-10, 10]
    metric = 0.5*np.abs(Z_std.pdf(xs) - Norm.pdf(xs)).mean()*20
    out.append((k, metric))
ax = axs[1]
ax.plot(*zip(*out))
ax.set_xscale('log')
ax.set_xlabel(r'$\lambda / \mu$')
ax.set_yscale('log')
ax.set_ylabel('total-variation to normal')

# Part 2: plot pdf
ax = axs[0]
for mu in [0.2, 0.1, 0.05, 0.01, 0.0001]:
    # construct Inverse Gaussian
    Z = invgauss(mu=mu)
    mean = Z.mean()
    std = Z.std()
    k = (mean/std)**2
    # standardise and plot
    Z_std = invgauss(mu=mu, loc=-mean/std, scale=1/std)
    xs = np.linspace(-5, 5, 10000)
    ax.plot(xs, Z_std.pdf(xs), label=rf'$\lambda/\mu$={k:.2f}')
ax.legend(title="standardised distribution")
ax.set_xlabel('$x$')
ax.set_ylabel('pdf')
plt.tight_layout()
plt.show()
References
- 1. Hartzell FZ. Sequential Analysis. Agronomy Journal. Published online July 1947:640-641. doi:10.2134/agronj1947.00021962003900070011x
- 2. Folks JL, Chhikara RS. The Inverse Gaussian Distribution and its Statistical Application—A Review. Journal of the Royal Statistical Society: Series B (Methodological). Published online July 1978:263-275. doi:10.1111/j.2517-6161.1978.tb01039.x
- 3. Scheffe H. A Useful Convergence Theorem for Probability Distributions. Ann Math Statist. Published online September 1947:434-438. doi:10.1214/aoms/1177730390
- 4. Bobkov SG. Proximity of probability distributions in terms of Fourier-Stieltjes transforms. Russ Math Surv. Published online December 31, 2016:1021-1079. doi:10.1070/rm9749