For many problems in mathematics and theoretical computer science, it is of interest to estimate the concentration of sums of heterogeneous Boolean random variables. This note presents some new results obtained through majorization theory, with computer-aided proofs in SymPy.
Results
The Poisson Binomial distribution \(Y\sim \mathrm{PB}(p) \) is defined for \(p=(p_1,\ldots, p_n)\) as $$ Y = \sum_{i=1}^n X_i,\quad X_i \sim \mathrm{Bern}(p_i), $$ where the \(X_i\) are independent.
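Concretely, the distribution of \(Y\) is the convolution of the \(n\) Bernoulli factors, so its probability mass function can be computed by convolving them one at a time. Here is a minimal sketch (the helper name pb_pmf is ours, for illustration):

```python
import numpy as np

def pb_pmf(p):
    """Return the pmf of PB(p): pmf[k] = P{Y = k} for k = 0, ..., n."""
    pmf = np.array([1.0])                     # distribution of the empty sum
    for pi in p:
        pmf = np.convolve(pmf, [1 - pi, pi])  # add an independent Bern(pi)
    return pmf

print(pb_pmf([0.2, 0.5, 0.9]))  # pmf over {0, 1, 2, 3}
```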
We are interested in the concentration properties of \(Y\) and their dependence on \( p=(p_1,\ldots,p_n)\). To (partly) answer this question, we show that for any convex and non-decreasing function \(f\), the \(f\)-expectation of the PB distribution is Schur-concave with respect to the parameters: $$ \mathbf{E} f(\mathrm{PB}(p) ) \leq \mathbf{E} f(\mathrm{PB}(p') ),\quad p' \prec p, $$ where \(p' \prec p\) denotes vector majorization (\(p\) majorizes \(p'\)). In particular, since the constant vector \((\bar{p},\ldots,\bar{p})\) is majorized by \(p\), this means that $$ \mathbf{E} f(\mathrm{PB}(p) ) \leq \mathbf{E}f(\mathrm{Binom}(n,\bar{p})),\quad \bar{p} = \frac{1}{n}\sum_{i=1}^{n} p_i, $$ so the Binomial case is the “most dispersed” one. As for related work, see the recent survey (Tang and Tang, 2023), in particular the Schur-concavity of the tails. The result above is more general: by focusing on functional expectations, it yields various concentration bounds and applies to setups with limited dependence, such as \(k\)-wise independence. Indeed, by using \(f(x) = \max\{x,0\}^d\) and assuming that every \(d\) of the \(X_i\) are independent, we can improve previous concentration bounds. Numerically optimal results may be computed with recent formulas for binomial moments (Skorski, 2024).
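As a quick numerical sanity check of the last inequality (with illustrative values, reusing the pb_pmf helper sketched above):

```python
import numpy as np

p = np.array([0.1, 0.4, 0.7, 0.9])
f = lambda y: np.maximum(y, 0.0) ** 3        # convex and non-decreasing
ks = np.arange(len(p) + 1)                   # support {0, ..., n}

lhs = f(ks) @ pb_pmf(p)                      # E f(PB(p))
rhs = f(ks) @ pb_pmf([p.mean()] * len(p))    # E f(Binom(n, mean(p)))
print(lhs, rhs, lhs <= rhs)                  # expect lhs <= rhs
```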
Proof Sketch
To prove the claim, we check the Schur-Ostrowski criterion, namely that $$ \Delta_{i,j}\triangleq \left(p_i - p_j\right) \left(\frac{\partial \mathbf{E} f(\mathrm{PB}(p) ) }{ \partial p_i} - \frac{\partial \mathbf{E} f(\mathrm{PB}(p) ) }{ \partial p_j}\right) \leq 0. $$ Due to the independence of the summands, $$\mathbf{E}f(Y) = \sum_{y}\mathbf{P}\{Y-X_i-X_j= y\}\, \mathbf{E}_{X_i,X_j}[f(X_i + X_j + y)],$$ so it suffices to test the criterion for any pair of variables, while treating the sum of the remaining variables \(Y-X_i-X_j\) as fixed. Since the convexity and monotonicity of \(f\) are invariant under translation, we can work with \(\mathbf{E} f(X_i+ X_j)\) directly. Without loss of generality, we can test this for \(i=1, j=2\), as the setup is symmetric under relabeling the variables. Now, let’s implement this in Python!
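Below is a minimal SymPy sketch of this two-variable check, assuming independent Bernoulli factors; the symbol names are illustrative:

```python
import sympy as sp

p1, p2 = sp.symbols('p1 p2', nonnegative=True)
f = sp.Function('f')

# E f(X1 + X2) for independent X1 ~ Bern(p1), X2 ~ Bern(p2)
Ef = ((1 - p1) * (1 - p2) * f(0)
      + (p1 * (1 - p2) + (1 - p1) * p2) * f(1)
      + p1 * p2 * f(2))

# Schur-Ostrowski expression for the pair (i, j) = (1, 2)
Delta = sp.expand((p1 - p2) * (sp.diff(Ef, p1) - sp.diff(Ef, p2)))
print(Delta)
```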
We obtained $$ \Delta_{i,j}=\left(p_{1} p_{2} - p_{1} - p_{2}^{2}\right) f(2) + \left(- 2 p_{1} p_{2} + 3 p_{1} + 2 p_{2}^{2} - p_{2} - 1\right) f(1) + \left(p_{1} p_{2} - 2 p_{1} - p_{2}^{2} + p_{2} + 1\right) f(0). $$ To simplify this expression and make use of the remaining assumptions, let us introduce \(\delta_{convex} = \frac{f(2)+f(0)}{2}-f(1)\) and \(\delta_{monotone} = f(1)-f(0)\), both non-negative. Further calculations in SymPy then give $$ \Delta_{i,j} = \left(p_1 - p_2 - 1\right)\delta_{monotone} - 2\left(p_1(1-p_2) + p_2^2\right)\delta_{convex}, $$ where both coefficients are non-positive for \(p_1, p_2 \in [0,1]\), which proves \(\Delta_{i,j} \leq 0\).
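The substitution \(f(1) = f(0) + \delta_{monotone}\), \(f(2) = f(0) + 2\delta_{monotone} + 2\delta_{convex}\) behind this simplification can be checked in SymPy as follows (a minimal sketch; symbol names are illustrative):

```python
import sympy as sp

p1, p2, f0, d_mon, d_con = sp.symbols('p1 p2 f0 delta_mon delta_con')

# Express f(1) and f(2) through f(0) and the two non-negative deltas
f1 = f0 + d_mon
f2 = f0 + 2 * d_mon + 2 * d_con

# The Delta obtained above, with the substitution applied
Delta = ((p1 * p2 - p1 - p2**2) * f2
         + (-2 * p1 * p2 + 3 * p1 + 2 * p2**2 - p2 - 1) * f1
         + (p1 * p2 - 2 * p1 - p2**2 + p2 + 1) * f0)

# f0 cancels; both remaining coefficients are <= 0 on [0, 1]^2
print(sp.collect(sp.expand(Delta), [d_mon, d_con]))
# delta_mon*(p1 - p2 - 1) + delta_con*(2*p1*p2 - 2*p1 - 2*p2**2)
```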
Skorski, M. (2024) ‘Handy formulas for binomial moments’, Modern Stochastics: Theory and Applications, pp. 1–15. Available at: https://doi.org/10.15559/24-VMSTA260.
Tang, W. and Tang, F. (2023) ‘The Poisson Binomial Distribution – Old & New’, Statistical Science, 38(1). Available at: https://doi.org/10.1214/22-STS852.