Approximating Tails of Beta Distribution

Beta distribution is ubiquitous in statistics, but particularly popular in real-world modeling. The beta-binomial model is perhaps the most known example, given the recent interest in Bayesian inference. But it was in use nearly 50 years ago, for example in toxicology.

Unfortunately, computing probabilities from the density depends on intractable incomplete beta integrals. This creates a demand for closed-form approximations, particularly for probability cfs/tails. The goal is to obtain an exponential concentration inequality in of Bernstein-type

$$\Pr[|X-\mathbf{E}[X]|>\epsilon]\leq \mathrm{e}^{-\frac{\epsilon^2}{2v^2+2c\epsilon}}.$$

Such bounds have been studied few times, the last one being the sub-gaussian approximation, that is when \(c=0\). Recently, I have further improved to optimal \(v\) (most important) and some good value of \(c\) (less important, but possibly worth further improvement). This gives a more accurate approximation when the distribution is very skewed (this happens when we model rare events, like conversion). For example with Beta(2,998) we get this:

The trick is to obtain a recursion scheme on central moments, and bound their growth by a geometric progression. The details are in my paper, and the code is shared in this notebook.

Published by mskorski

Scientist, Consultant, Learning Enthusiast

Leave a comment

Your email address will not be published. Required fields are marked *