Fourier Analysis of Linear Extractors

Extracting from bit streams remains an active research area in modern cryptography. Recently, there has been renewed interest in debasing with structured linear mappings, namely by linear codes. Linear codes were proposed as extractors for independent but biased bits in the theory community . Decades later, they were popularized among the hardware designers , and …

On Schur-Concavity of Poisson-Binomial

For many problems in mathematics and theoretical computer science, it is of interest to estimate concentration of sums of heterogenous boolean random variables. This note shows some new results obtained through majorization theory with computer-aided proof in SymPy. Results The Poisson Binomial distribution is defined for as$$Y = \sum_{i=1}^n X_i,\quad X_i \sim \mathrm{Bern}(p_i).$$ We are …

Correct Confidence Intervals for Quantiles

In this post, I overview the optimal non-parametric approach to quantile confidence intervals. We will discuss the theoretical background as well as an efficient algorithm implemented in Python. Consider an ordered iid sample from a distribution , and let be the -th quantile, that is . Using our samples we want to build a possibly …

Demographic Data with Zero-Truncated Poisson

The zero-truncated Poisson distribution is useful to model certain demographic data, such as sizes of households and families ​1​. This post illustrates this on historical data from the United Kingdom and demonstrates how to assess the goodness-of-fit in Python. The zero-truncated Poisson distribution is defined as\begin{align}\mathbf{P}\{Y=x\} = \mathbf{P}\{ X = x | X > 0\}, …

Diagnosing regression with interactive QQ plots

QQ plots, despite their importance, are not well-supported in Python, as of the moment of writing this note. The package statsmodels offers plots of rather poor quality and non-interactive, and even Plotly doesn’t come with its own recipe. In this note, I am sharing how to build an interactive QQ plot and apply it to …

On Importance of Log-Normality in Hypothesis Testing

Normality of data is important in statistics, so it is not a surprise that transforming data to look closer to a normal distribution can be beneficial in many ways. There are ways of finding the right transform, such as the popular Box-Cox method ​1,2​. In this post, I will share an elegant example on how …

Optimizing Docker Images – A Case Study

Docker is a modern technology for containerizing software. While packaging to a working version is relatively easy, creating a lightweight Docker image presents a challenge. In this note, I will briefly discuss an example from my university class on software engineering. The task was to containerize a card game web application, written by students as …

Approximate Nash Equilibrium by Regret Matching

Two-player games can be solved by following a very intuitive algorithm called Regret Matching ​1​. Players modify their action probabilities according to the so-called regrets or advantages, which can be thought as consequences of alternative choices. For a good overview of the topic, see the friendly yet detailed introduction by Neller and Lanctot ​2​. The …