TensorFlow Probability (TFP for short) is a library for probabilistic reasoning and statistical analysis in TensorFlow. It is part of the wider TensorFlow ecosystem, so it can easily be combined with core TensorFlow.

## Packages

import tensorflow as tf
import tensorflow_probability as tfp

import numpy as np
import matplotlib.pyplot as plt

tfd = tfp.distributions
plt.rcParams['figure.figsize'] = (10, 6)

print("Tensorflow Version: ", tf.__version__)
print("Tensorflow Probability Version: ", tfp.__version__)

Tensorflow Version:  2.4.0
Tensorflow Probability Version:  0.11.1


## Univariate Distribution

From Wikipedia:

In statistics, a univariate distribution is a probability distribution of only one random variable. This is in contrast to a multivariate distribution, the probability distribution of a random vector (consisting of multiple random variables).

### Normal Distribution

One of the simplest univariate distributions is the normal distribution (also known as the Gaussian distribution). We can create it with TensorFlow Probability.

normal = tfd.Normal(loc=0, scale=1)
normal

<tfp.distributions.Normal 'Normal' batch_shape=[] event_shape=[] dtype=float32>

Note that loc is the mean ($\mu$) of the distribution and scale is its standard deviation ($\sigma$). This gives us a normal distribution object. To generate data from the distribution, we need to sample from it.

normal.sample()

<tf.Tensor: shape=(), dtype=float32, numpy=0.9464462>

Passing an integer to sample draws that many samples at once.

normal.sample(5)

<tf.Tensor: shape=(5,), dtype=float32, numpy=
array([-0.73940355,  0.79343444,  0.29724854,  0.6113488 ,  0.02649165],
dtype=float32)>

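If we draw 10,000 samples and plot their histogram, the result is bell-shaped.

# Draw the samples first; the original cell plotted an undefined variable.
samples = normal.sample(10000)
plt.hist(samples.numpy(), bins=50, density=True)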
plt.show()


The prob method evaluates the probability density function at a given value. For the standard normal distribution, the density at 0 is $1/\sqrt{2\pi} \approx 0.3989$.

normal.prob(0)

<tf.Tensor: shape=(), dtype=float32, numpy=0.3989423>

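The log_prob method returns the natural logarithm of the density, which is numerically more stable when many probabilities have to be multiplied together.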

normal.log_prob(0)

<tf.Tensor: shape=(), dtype=float32, numpy=-0.9189385>
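To see where these two numbers come from, here is a minimal sketch that evaluates the normal density and log-density by hand using only the Python standard library. The helper names `normal_pdf` and `normal_log_pdf` are made up for this illustration and are not part of TFP.

```python
import math

def normal_pdf(x, loc=0.0, scale=1.0):
    """Density of a Normal(loc, scale) distribution at x."""
    z = (x - loc) / scale
    return math.exp(-0.5 * z * z) / (scale * math.sqrt(2.0 * math.pi))

def normal_log_pdf(x, loc=0.0, scale=1.0):
    """Log-density, written out term by term instead of taking log(pdf)."""
    z = (x - loc) / scale
    return -0.5 * z * z - math.log(scale) - 0.5 * math.log(2.0 * math.pi)

print(normal_pdf(0.0))      # approximately 0.3989423
print(normal_log_pdf(0.0))  # approximately -0.9189385
```

Both values agree with the prob and log_prob outputs above up to float32 precision.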

### Exponential Distribution

Another example of a univariate distribution is the exponential distribution. It has a single rate parameter $\lambda$, and its density is

$$f(x; \lambda) = \begin{cases} \lambda e^{-\lambda x} & x \ge 0, \\ 0 & x < 0 \end{cases}$$

exponential = tfd.Exponential(rate=1)

exponential.sample(5)

<tf.Tensor: shape=(5,), dtype=float32, numpy=
array([0.23124246, 0.28650132, 0.10770323, 0.6426723 , 0.34070757],
dtype=float32)>

plt.hist(exponential.sample(10000).numpy(), bins=50, density=True)
plt.show()
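As a cross-check on the density formula, the following sketch evaluates it directly and draws samples by inverse-transform sampling with the standard library. This is an illustration under stated assumptions, not how TFP samples internally; the helper names and the seed are chosen only for this example.

```python
import math
import random

def exponential_pdf(x, rate=1.0):
    """Density of an Exponential(rate) distribution; zero for negative x."""
    return rate * math.exp(-rate * x) if x >= 0 else 0.0

def sample_exponential(n, rate=1.0, seed=0):
    """Inverse-transform sampling: if U ~ Uniform[0, 1), then -log(1 - U) / rate is Exponential(rate)."""
    rng = random.Random(seed)
    return [-math.log(1.0 - rng.random()) / rate for _ in range(n)]

draws = sample_exponential(10000, rate=1.0)
sample_mean = sum(draws) / len(draws)

print(exponential_pdf(0.0), exponential_pdf(-1.0))
print(sample_mean)  # should be close to 1 / rate
```

The sample mean should land near $1/\lambda$, which is the mean of the exponential distribution.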


### Bernoulli Distribution

The Bernoulli distribution is another univariate distribution. It is fully described by a single parameter, the probability $p$ that the outcome is 1; otherwise the outcome is 0.

$$f(k; p) = \begin{cases} p & \text{if } k = 1, \\ q = 1 - p & \text{if } k = 0 \end{cases}$$

bernoulli = tfd.Bernoulli(probs=0.8)

bernoulli.sample(5)

<tf.Tensor: shape=(5,), dtype=int32, numpy=array([1, 0, 1, 1, 1])>

This distribution generates only two values, 0 and 1.

for k in [0, 0.5, 1, -1]:
    print('Probability result {} for k = {}'.format(bernoulli.prob(k), k))

Probability result 0.20000000298023224 for k = 0
Probability result 0.4000000059604645 for k = 0.5
Probability result 0.800000011920929 for k = 1
Probability result 0.05000000074505806 for k = -1


We set the probability of 1 to 0.8, so the probability of 0 is 0.2. Values outside the support {0, 1}, such as 0.5 and -1, still return a number, but that number is not a meaningful probability.
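The Bernoulli mass function is often written in the closed form $p^k (1 - p)^{1 - k}$. The sketch below evaluates that expression for the same four inputs. It reproduces the printed values, which suggests (as an assumption about this TFP version, not a documented guarantee) that the strange results come from evaluating an expression of this kind outside the support. `bernoulli_closed_form` is a hypothetical helper, not a TFP function.

```python
def bernoulli_closed_form(k, p):
    """Evaluate p**k * (1 - p)**(1 - k); only a valid probability for k in {0, 1}."""
    return (p ** k) * ((1.0 - p) ** (1.0 - k))

for k in [0, 0.5, 1, -1]:
    # Rounded for display; values are approximately 0.2, 0.4, 0.8 and 0.05.
    print(k, round(bernoulli_closed_form(k, 0.8), 6))
```

Only the results for k = 0 and k = 1 should be read as probabilities.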

### Batch Distributions

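One advantage of TensorFlow Probability distributions is that a single object can represent a batch of distributions. Passing a list of five probabilities creates five independent Bernoulli distributions, which shows up as a batch_shape of [5].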

bernoulli_batch = tfd.Bernoulli(probs=[0.1, 0.25, 0.5, 0.75, 0.9])

bernoulli_batch

<tfp.distributions.Bernoulli 'Bernoulli' batch_shape=[5] event_shape=[] dtype=int32>

bernoulli_batch.sample(5)

<tf.Tensor: shape=(5, 5), dtype=int32, numpy=
array([[0, 1, 1, 1, 1],
[0, 0, 0, 1, 1],
[0, 1, 1, 0, 1],
[0, 0, 1, 1, 1],
[0, 0, 1, 0, 1]])>
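The (5, 5) shape above follows the TFP convention that a sample has shape sample_shape + batch_shape + event_shape. A minimal sketch of that rule, where `sample_output_shape` is a hypothetical helper written for this post and not a TFP function:

```python
def sample_output_shape(sample_shape, batch_shape, event_shape=()):
    """Concatenate the three shape components in TFP's documented order."""
    return tuple(sample_shape) + tuple(batch_shape) + tuple(event_shape)

# Five draws from a batch of five scalar distributions.
print(sample_output_shape((5,), (5,)))       # (5, 5)

# Five draws from a rank-3 batch of scalar distributions.
print(sample_output_shape((5,), (1, 3, 2)))  # (5, 1, 3, 2)
```

In the result above, each row is one draw and each column corresponds to one of the five probabilities, which is why the first column is mostly 0 and the last is mostly 1.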

We can also build a multi-dimensional batch by passing a higher-rank probs. Here probs has shape (1, 3, 2), so the resulting batch_shape is [1, 3, 2].

probs = [[[0.5, 0.5],
          [0.8, 0.3],
          [0.25, 0.75]]]
bernoulli_batch_2D = tfd.Bernoulli(probs=probs)
bernoulli_batch_2D

<tfp.distributions.Bernoulli 'Bernoulli' batch_shape=[1, 3, 2] event_shape=[] dtype=int32>

bernoulli_batch_2D.sample(5)

<tf.Tensor: shape=(5, 1, 3, 2), dtype=int32, numpy=
array([[[[1, 1],
[1, 0],
[0, 1]]],

[[[1, 1],
[0, 1],
[0, 1]]],

[[[0, 1],
[1, 0],
[0, 0]]],

[[[1, 1],
[1, 0],
[0, 0]]],

[[[1, 0],
[1, 0],
[0, 1]]]])>

bernoulli_batch_2D.prob([[[1, 0],
                          [0, 0],
                          [1, 1]]])

<tf.Tensor: shape=(1, 3, 2), dtype=float32, numpy=
array([[[0.5 , 0.5 ],
[0.2 , 0.7 ],
[0.25, 0.75]]], dtype=float32)>
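Each entry of this result is the probability of the corresponding value under the distribution at the same batch position: p when the value is 1 and 1 - p when it is 0. The following plain-Python sketch reproduces the table elementwise. The helper names are hypothetical, and it only handles values and probs with identical nesting, not TFP's full broadcasting.

```python
def bernoulli_prob(k, p):
    """Probability of outcome k (0 or 1) under Bernoulli(p)."""
    return p if k == 1 else 1.0 - p

def elementwise_prob(values, probs):
    """Recurse through identically nested lists and pair each value with its p."""
    if isinstance(probs, list):
        return [elementwise_prob(v, p) for v, p in zip(values, probs)]
    return bernoulli_prob(values, probs)

probs = [[[0.5, 0.5], [0.8, 0.3], [0.25, 0.75]]]
values = [[[1, 0], [0, 0], [1, 1]]]

result = elementwise_prob(values, probs)
print(result)  # approximately [[[0.5, 0.5], [0.2, 0.7], [0.25, 0.75]]]
```

For example, the middle row pairs the values [0, 0] with probs [0.8, 0.3], giving 1 - 0.8 = 0.2 and 1 - 0.3 = 0.7.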