# Univariate Distribution

In this post, we will show the basic usage of TensorFlow Probability (tfp) and how to create univariate distributions. This is a summary of the lecture "Probabilistic Deep Learning with TensorFlow 2" from Imperial College London.

TensorFlow Probability (tfp for short) is a library for probabilistic reasoning and statistical analysis in TensorFlow. It is part of the wider TensorFlow ecosystem, so it can easily be combined with TensorFlow core.

```
import tensorflow as tf
import tensorflow_probability as tfp
import numpy as np
import matplotlib.pyplot as plt
tfd = tfp.distributions
plt.rcParams['figure.figsize'] = (10, 6)
```

```
print("Tensorflow Version: ", tf.__version__)
print("Tensorflow Probability Version: ", tfp.__version__)
```

From Wikipedia,

> In statistics, a univariate distribution is a probability distribution of only one random variable. This is in contrast to a multivariate distribution, the probability distribution of a random vector (consisting of multiple random variables).

One of the simplest univariate distributions is the normal distribution (also known as the Gaussian distribution). We can create it with TensorFlow Probability.

```
normal = tfd.Normal(loc=0, scale=1)
normal
```

Note that `loc` stands for the mean ($\mu$) of the distribution, and `scale` is its standard deviation ($\sigma$). After creating the normal distribution object, we can generate data from it by calling `sample`.

```
normal.sample()
```

It can also generate multiple samples at once.

```
normal.sample(5)
```

If we generate 10,000 samples and plot them, the histogram will be bell-shaped.

```
samples = normal.sample(10000)
plt.hist(samples.numpy(), bins=50, density=True)
plt.show()
```
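As a quick sanity check, we can reproduce this with NumPy's own sampler (a sketch independent of tfp): with enough samples, the empirical mean and standard deviation should be close to `loc=0` and `scale=1`.

```python
import numpy as np

# Draw 10,000 standard-normal samples with a fixed seed for reproducibility
rng = np.random.default_rng(42)
samples = rng.normal(loc=0.0, scale=1.0, size=10_000)

# The empirical moments should be close to the distribution parameters
print(np.mean(samples))  # close to 0
print(np.std(samples))   # close to 1
```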

Since the normal distribution is continuous, `prob` evaluates its probability density function at a given value.

```
normal.prob(0)
```

Or you can evaluate the log probability with `log_prob`.

```
normal.log_prob(0)
```
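For the standard normal, the density being evaluated is $f(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$. A minimal sketch of the same computation in plain Python, for comparison with `normal.prob(0)` and `normal.log_prob(0)`:

```python
import math

def standard_normal_pdf(x):
    # Density of N(0, 1): exp(-x^2 / 2) / sqrt(2 * pi)
    return math.exp(-x ** 2 / 2) / math.sqrt(2 * math.pi)

p = standard_normal_pdf(0)  # ~0.3989, the density at the mean
log_p = math.log(p)         # ~-0.9189, its logarithm
print(p, log_p)
```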

Another example of a univariate distribution is the exponential distribution. It has a rate parameter $\lambda$, and its density can be written as

$$ f(x; \lambda) = \begin{cases} \lambda e^{-\lambda x} & x \ge 0, \\ 0 & x < 0 \end{cases} $$
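This density is straightforward to implement directly; a small sketch for comparison with `tfd.Exponential(rate=1).prob`:

```python
import math

def exponential_pdf(x, lam=1.0):
    # f(x; lambda) = lambda * exp(-lambda * x) for x >= 0, else 0
    if x < 0:
        return 0.0
    return lam * math.exp(-lam * x)

print(exponential_pdf(0.0))   # 1.0 when lambda = 1
print(exponential_pdf(1.0))   # exp(-1), about 0.3679
print(exponential_pdf(-1.0))  # 0.0, outside the support
```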

```
exponential = tfd.Exponential(rate=1)
```

```
exponential.sample(5)
```

```
plt.hist(exponential.sample(10000).numpy(), bins=50, density=True)
plt.show()
```

Another univariate example is the Bernoulli distribution, a discrete distribution that takes the value 1 with probability $p$ and 0 with probability $1 - p$.

```
bernoulli = tfd.Bernoulli(probs=0.8)
```

```
bernoulli.sample(5)
```

This distribution generates only two values, 0 and 1.

```
for k in [0, 0.5, 1, -1]:
    print('Probability result {} for k = {}'.format(bernoulli.prob(k), k))
```

We already defined the probability of 1 as 0.8, so the probability of 0 is 0.2. Note that evaluating `prob` at values outside the support, such as 0.5 or -1, still returns a number, but it is not a meaningful probability.
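One way to see where those strange numbers come from is the continuous extension of the Bernoulli pmf, $p^k (1-p)^{1-k}$, which agrees with the true pmf at $k \in \{0, 1\}$ but is defined for any real $k$. A sketch of this closed form (used here as an illustration, not necessarily tfp's exact implementation):

```python
def bernoulli_pmf(k, p=0.8):
    # Continuous extension of the Bernoulli pmf: p^k * (1 - p)^(1 - k)
    # Only k = 0 and k = 1 correspond to real probabilities.
    return p ** k * (1 - p) ** (1 - k)

for k in [0, 0.5, 1, -1]:
    print(k, bernoulli_pmf(k))
```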

One advantage of TensorFlow Probability is that it makes it easy to create a batch of distributions with different parameters.

```
bernoulli_batch = tfd.Bernoulli(probs=[0.1, 0.25, 0.5, 0.75, 0.9])
```

```
bernoulli_batch
```

```
bernoulli_batch.sample(5)
```

We can make a 2D batch of distributions by passing a higher-rank tensor as `probs`.

```
probs = [[[0.5, 0.5],
          [0.8, 0.3],
          [0.25, 0.75]]]
bernoulli_batch_2D = tfd.Bernoulli(probs=probs)
bernoulli_batch_2D
```

```
bernoulli_batch_2D.sample(5)
```
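To see how the sample and batch shapes combine, here is a NumPy analogue (a sketch, not tfp itself): drawing 5 samples from a Bernoulli batch of shape (1, 3, 2) yields an array of shape (5, 1, 3, 2).

```python
import numpy as np

probs = np.array([[[0.5, 0.5],
                   [0.8, 0.3],
                   [0.25, 0.75]]])  # batch shape (1, 3, 2)

rng = np.random.default_rng(0)
# Each of the 5 draws samples one Bernoulli per probability in the batch;
# probs broadcasts against the requested output shape
batch_samples = rng.binomial(n=1, p=probs, size=(5,) + probs.shape)
print(batch_samples.shape)  # (5, 1, 3, 2)
```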

```
bernoulli_batch_2D.prob([[[1, 0],
                          [0, 0],
                          [1, 1]]])
```