Extracting from bit streams remains an active research area in modern cryptography. Recently, there has been renewed interest in debasing with structured linear mappings, namely by linear codes. Linear codes were proposed as extractors for independent but biased bits in the theory community . Decades later, they were popularized among the hardware designers , and …

# Author archives: mskorski

## On Schur-Concavity of Poisson-Binomial

For many problems in mathematics and theoretical computer science, it is of interest to estimate concentration of sums of heterogenous boolean random variables. This note shows some new results obtained through majorization theory with computer-aided proof in SymPy. Results The Poisson Binomial distribution is defined for as$$Y = \sum_{i=1}^n X_i,\quad X_i \sim \mathrm{Bern}(p_i).$$ We are …

## Correct Confidence Intervals for Quantiles

In this post, I overview the optimal non-parametric approach to quantile confidence intervals. We will discuss the theoretical background as well as an efficient algorithm implemented in Python. Consider an ordered iid sample from a distribution , and let be the -th quantile, that is . Using our samples we want to build a possibly …

Continue reading “Correct Confidence Intervals for Quantiles”

## Demographic Data with Zero-Truncated Poisson

The zero-truncated Poisson distribution is useful to model certain demographic data, such as sizes of households and families 1. This post illustrates this on historical data from the United Kingdom and demonstrates how to assess the goodness-of-fit in Python. The zero-truncated Poisson distribution is defined as\begin{align}\mathbf{P}\{Y=x\} = \mathbf{P}\{ X = x | X > 0\}, …

Continue reading “Demographic Data with Zero-Truncated Poisson”

## Diagnosing regression with interactive QQ plots

QQ plots, despite their importance, are not well-supported in Python, as of the moment of writing this note. The package statsmodels offers plots of rather poor quality and non-interactive, and even Plotly doesn’t come with its own recipe. In this note, I am sharing how to build an interactive QQ plot and apply it to …

Continue reading “Diagnosing regression with interactive QQ plots”

## On Importance of Log-Normality in Hypothesis Testing

Normality of data is important in statistics, so it is not a surprise that transforming data to look closer to a normal distribution can be beneficial in many ways. There are ways of finding the right transform, such as the popular Box-Cox method 1,2. In this post, I will share an elegant example on how …

Continue reading “On Importance of Log-Normality in Hypothesis Testing”

## Risk Matrix Method in Python

The risk matrix, a popular risk assessment method 1, presents risks in the form of a likelihood / impact heatmap. In this post, I share a convenient and elegant Python snippet for constructing risk matrices, used to generate this hypothetical project risk assessment: Below are the code details:

## Normality of Inverse Gaussian

The post discusses normal approximations to the inverse gaussian distribution.

## Optimizing Docker Images – A Case Study

Docker is a modern technology for containerizing software. While packaging to a working version is relatively easy, creating a lightweight Docker image presents a challenge. In this note, I will briefly discuss an example from my university class on software engineering. The task was to containerize a card game web application, written by students as …

## Approximate Nash Equilibrium by Regret Matching

Two-player games can be solved by following a very intuitive algorithm called Regret Matching 1. Players modify their action probabilities according to the so-called regrets or advantages, which can be thought as consequences of alternative choices. For a good overview of the topic, see the friendly yet detailed introduction by Neller and Lanctot 2. The …

Continue reading “Approximate Nash Equilibrium by Regret Matching”