How to look like a statistician
A developer's guide to probabilistic programming
Evelina Gabašová
@evelgab
Probabilistic programming
what is probabilistic programming
It's about making probability distributions, and the operations on them, first-class citizens in a programming language
But why would you use that?
Salary distribution
What is your current annual salary, in [local currency]? Please enter a whole number in the box below, without any punctuation. If you prefer not to answer, please leave the box empty/blank.
Theory
Some people reported their monthly salary instead of their annual salary
Probability distributions
in probabilistic programming
are an integral part of any probabilistic programming language, and you can manipulate them directly
Mixture distribution: formally
\[\text{Salary} = p \; \mathcal{N}\left(\mu, \sigma^2\right) + \left(1-p\right) \; \frac{1}{12} \; \mathcal{N}\left(\mu, \sigma^2\right)\]
unknown: \(\mu\) , \(\sigma^2\) , \(p\)
Sampling
is an efficient way of performing inference
Sampling: example
Monty Hall problem
Demo
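The demo in the talk is in F#; as a rough Python sketch of the same idea, the Monty Hall problem can be solved by Monte Carlo simulation rather than by analytic reasoning (function and variable names here are illustrative, not from the talk):

```python
import random

def monty_hall(switch, trials=100_000, seed=0):
    """Estimate the win probability of the stay/switch strategy by simulation."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        car = rng.randrange(3)    # door hiding the car
        pick = rng.randrange(3)   # contestant's first pick
        # Host opens a door that is neither the pick nor the car
        opened = next(d for d in range(3) if d != pick and d != car)
        if switch:
            pick = next(d for d in range(3) if d != pick and d != opened)
        wins += (pick == car)
    return wins / trials

print(monty_hall(switch=False))  # stay: close to 1/3
print(monty_hall(switch=True))   # switch: close to 2/3
```

The simulation recovers the classic answer: switching wins about two thirds of the time, with no probability theory needed beyond counting samples.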
Monte Carlo sampling
Demo
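The general pattern behind the demo can be sketched in a few lines of Python (a minimal illustration, not the talk's F# code): draw many samples and count how often the event of interest occurs.

```python
import random

def estimate(event, sample, trials=100_000, seed=1):
    """Monte Carlo: draw samples and count how often the event holds."""
    rng = random.Random(seed)
    return sum(event(sample(rng)) for _ in range(trials)) / trials

# Probability that the sum of two dice is at least 10 (exact value: 6/36)
p = estimate(event=lambda s: s >= 10,
             sample=lambda rng: rng.randint(1, 6) + rng.randint(1, 6))
print(p)
```

The estimate converges to the true probability as the number of samples grows, at a rate of roughly one over the square root of the sample count.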
Representing probability distributions
with computation expressions
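The talk builds this with F# computation expressions; a rough Python analogue (the class and names are mine, not the talk's) represents a distribution by its sampling function and composes distributions with a monadic bind:

```python
import random

class Dist:
    """A probability distribution represented by its sampling function."""
    def __init__(self, sampler):
        self.sample = sampler

    def bind(self, f):
        """Monadic bind: draw a value, pass it to f, then draw from
        the distribution f returns -- so distributions compose."""
        return Dist(lambda rng: f(self.sample(rng)).sample(rng))

def gaussian(mu, sigma):
    return Dist(lambda rng: rng.gauss(mu, sigma))

def bernoulli(p):
    return Dist(lambda rng: rng.random() < p)

# Composition: the result of one draw parameterises the next draw
noisy = gaussian(0.0, 1.0).bind(lambda x: gaussian(x, 1.0))

rng = random.Random(0)
draws = [noisy.sample(rng) for _ in range(100_000)]
mean = sum(draws) / len(draws)
```

This is the same structure an F# computation expression builder provides with `let!` and `return`: `bind` is what lets you write a probabilistic model as ordinary-looking sequential code.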
Mixture distribution: formally
\[\text{Salary} = p \; \mathcal{N}\left(\mu, \sigma^2\right) + \left(1-p\right) \; \frac{1}{12} \; \mathcal{N}\left(\mu, \sigma^2\right)\]
unknown: \(\mu\) , \(\sigma^2\) , \(p\)
Mixture distribution: informally
\[\text{Salary} = p(\text{correct}) \times \text{Annual salary} \;+\; p\left(\text{mistake}\right) \times \frac{1}{12} \times \text{Annual salary}\]
unknown: \(p(\text{correct})\) , Annual salary
Inference
for mixture distributions
Easy if we knew the values of the unknown parameters
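For instance, if the parameters were known, the posterior probability that a particular answer is a monthly figure would follow directly from Bayes' rule. A sketch, assuming annual salaries are \(\mathcal{N}(\mu, \sigma^2)\) and a reported monthly figure is therefore \(\mathcal{N}(\mu/12, (\sigma/12)^2)\) (all numeric values below are illustrative):

```python
from math import exp, pi, sqrt

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    return exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * sqrt(2 * pi))

def p_mistake(y, mu, sigma, p_correct):
    """Posterior probability that y is a monthly (1/12) report, given
    known mixture parameters, via Bayes' rule."""
    annual = p_correct * normal_pdf(y, mu, sigma)
    monthly = (1 - p_correct) * normal_pdf(y, mu / 12, sigma / 12)
    return monthly / (annual + monthly)

print(p_mistake(2500, mu=30000, sigma=5000, p_correct=0.8))   # close to 1
print(p_mistake(32000, mu=30000, sigma=5000, p_correct=0.8))  # close to 0
```

The hard part of inference is that \(\mu\), \(\sigma\), and \(p\) are exactly what we don't know.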
Mixture distributions
Probabilistic programming
let salary = Gaussian(mean, variance)
let mistake = Bernoulli(probability)
let observed =
    if mistake then
        1.0 / 12.0 * salary
    else
        salary
much simpler
This is an example of a potential probabilistic programming language
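Read as a generative sampler, the sketch above translates directly into ordinary Python (parameter values are placeholders, not the survey's actual estimates):

```python
import random

def observed_salary(rng, mean=30000.0, sd=5000.0, p_mistake=0.2):
    """Generative model: draw an annual salary, then with some
    probability report the monthly figure instead."""
    salary = rng.gauss(mean, sd)
    mistake = rng.random() < p_mistake
    return salary / 12 if mistake else salary

rng = random.Random(42)
data = [observed_salary(rng) for _ in range(10)]
print(data)
```

Running the model forwards like this is easy; the point of a probabilistic language is that the same description can also be run backwards, inferring the parameters from observed data.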
Manipulating probability distributions directly
Demo
Modelling probability distributions with computation expressions
Modelling
mixture distributions
But how do we get the parameters?
You get a PhD from Cambridge for this stuff
This is exactly why people use probabilistic programming: you don't have to know how the inference works underneath
The world's slowest probability inference engine
Try different parameter values
Discretize
Compare two discrete distributions
Demo
The world's slowest probabilistic language
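The three steps above can be sketched in Python as a brute-force grid search (my own minimal rendering of the idea, with illustrative parameters and a synthetic stand-in for the survey data): sample from the model under each candidate parameter value, discretize both the simulated and observed data into histograms, and keep the parameter whose histogram is closest.

```python
import random
from collections import Counter

def sample_model(rng, p_mistake, n, mean=30000.0, sd=5000.0):
    """Forward-sample n observations from the salary mixture model."""
    out = []
    for _ in range(n):
        s = rng.gauss(mean, sd)
        out.append(s / 12 if rng.random() < p_mistake else s)
    return out

def discretize(xs, width=2000.0):
    """Histogram: map bin index to relative frequency."""
    counts = Counter(int(x // width) for x in xs)
    total = len(xs)
    return {b: c / total for b, c in counts.items()}

def distance(h1, h2):
    """L1 distance between two discretized distributions."""
    bins = set(h1) | set(h2)
    return sum(abs(h1.get(b, 0.0) - h2.get(b, 0.0)) for b in bins)

# Synthetic "observed" data, generated with p_mistake = 0.3
rng = random.Random(0)
observed = discretize(sample_model(rng, 0.3, 20000))

# Try different parameter values; keep the closest match
best = min((p / 10 for p in range(11)),
           key=lambda p: distance(observed,
                                  discretize(sample_model(rng, p, 20000))))
print(best)
```

With enough samples, the grid search typically recovers a value near the true mixing proportion. It is spectacularly inefficient compared to a real inference engine, which is exactly the point of the slide title.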
Probabilistic programming
in the real world
data {
  int<lower=0> N;
  vector[N] y;
}
parameters {
  vector[2] mu;
  real<lower=0> sigma[2];
  real<lower=0, upper=1> theta;
}
model {
  sigma ~ normal(0, 2);
  mu ~ normal(0, 2);
  theta ~ uniform(0.0, 1.0);
  for (n in 1:N)
    target += log_mix(theta,
                      normal_lpdf(y[n] | mu[1], sigma[1]),
                      normal_lpdf(y[n] | mu[2], sigma[2]));
}
Evelina Gabašová
@evelgab
evelinag.com