Correct Confidence Intervals for Quantiles

In this post, I overview the optimal non-parametric approach to quantile confidence intervals. We will discuss the theoretical background as well as an efficient algorithm implemented in Python. Consider an ordered iid sample from a distribution , and let be the -th quantile, that is . Using our samples we want to build a possibly …

Demographic Data with Zero-Truncated Poisson

The zero-truncated Poisson distribution is useful to model certain demographic data, such as sizes of households and families ​1​. This post illustrates this on historical data from the United Kingdom and demonstrates how to assess the goodness-of-fit in Python. The zero-truncated Poisson distribution is defined as\begin{align}\mathbf{P}\{Y=x\} = \mathbf{P}\{ X = x | X > 0\}, …

Diagnosing regression with interactive QQ plots

QQ plots, despite their importance, are not well-supported in Python, as of the moment of writing this note. The package statsmodels offers plots of rather poor quality and non-interactive, and even Plotly doesn’t come with its own recipe. In this note, I am sharing how to build an interactive QQ plot and apply it to …

On Importance of Log-Normality in Hypothesis Testing

Normality of data is important in statistics, so it is not a surprise that transforming data to look closer to a normal distribution can be beneficial in many ways. There are ways of finding the right transform, such as the popular Box-Cox method ​1,2​. In this post, I will share an elegant example on how …

Optimizing Docker Images – A Case Study

Docker is a modern technology for containerizing software. While packaging to a working version is relatively easy, creating a lightweight Docker image presents a challenge. In this note, I will briefly discuss an example from my university class on software engineering. The task was to containerize a card game web application, written by students as …

Approximate Nash Equilibrium by Regret Matching

Two-player games can be solved by following a very intuitive algorithm called Regret Matching ​1​. Players modify their action probabilities according to the so-called regrets or advantages, which can be thought as consequences of alternative choices. For a good overview of the topic, see the friendly yet detailed introduction by Neller and Lanctot ​2​. The …

Compiling Torch Code – Example on Memory Reduction

Staring from the version 2.x PyTorch, a popular deep-learning framework, introduces a JIT compiler torch.compile. In this post, I am sharing a non-trivial example demonstrating how this tool can reduce memory footprint on GPU. The point of departure is a sub-routine which computes similarity, similar to covariance but not as friendly to compute. For two …