Custom Loss Function in TensorFlow 2
In this post, we will learn how to build custom loss functions, both as plain functions and as classes. This is a summary of the lecture "Custom Models, Layers and Loss functions with Tensorflow" from DeepLearning.AI.
import tensorflow as tf
from tensorflow.keras.utils import plot_model
from tensorflow.keras import backend as K
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
Part 1 - Huber Loss
In this section, we'll walk through how to create custom loss functions. In particular, we'll code the Huber Loss and use that in training the model.
# inputs
xs = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0], dtype=float)
# labels (the relationship with the inputs is y = 2x - 1)
ys = np.array([-3.0, -1.0, 1.0, 3.0, 5.0, 7.0], dtype=float)
plt.scatter(xs, ys);
model = tf.keras.Sequential([
    tf.keras.layers.Dense(1, input_shape=[1])
])
model.compile(optimizer='sgd', loss='mean_squared_error')
model.fit(xs, ys, epochs=500, verbose=0)
y_mse = model.predict([10.0])
y_mse
plt.scatter(xs, ys)
plt.scatter(10.0, y_mse, c='r');
Training with Custom Loss
Now let's see how we can use a custom loss. We first define a function that accepts the ground truth labels (y_true) and model predictions (y_pred) as parameters. We then compute and return the loss value in the function definition.
The Huber loss is defined as:

$$ L_{\delta}(a) = \begin{cases} \frac{1}{2} a^2 \quad & \text{ for } \vert a \vert \le \delta, \\ \delta \left( \vert a \vert - \frac{1}{2} \delta \right) \quad & \text{ otherwise} \end{cases} $$

where $a = y - f(x)$ is the error between the label and the prediction, and $\delta$ is the threshold.

def my_huber_loss(y_true, y_pred):
    threshold = 1.
    error = y_true - y_pred
    is_small_error = tf.abs(error) <= threshold
    small_error_loss = tf.square(error) / 2
    big_error_loss = threshold * (tf.abs(error) - threshold / 2)
    return tf.where(is_small_error, small_error_loss, big_error_loss)
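As a quick sanity check, we can evaluate the function on a couple of hand-picked errors. With the threshold of 1 hard-coded above, an absolute error of 0.5 should land in the quadratic branch and an absolute error of 3 in the linear branch:

# quick sanity check on hand-picked errors of 0.5 and 3.0
y_true = tf.constant([0.0, 0.0])
y_pred = tf.constant([0.5, 3.0])

# expected element-wise losses: [0.5**2 / 2, 1 * (3 - 0.5)] = [0.125, 2.5]
print(my_huber_loss(y_true, y_pred).numpy())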
Using the custom loss is as simple as specifying it in the loss argument of model.compile().
model = tf.keras.Sequential([
    tf.keras.layers.Dense(units=1, input_shape=[1,])
])
model.compile(optimizer='sgd', loss=my_huber_loss)
model.fit(xs, ys, epochs=500, verbose=0)
y_hl = model.predict([10.0])
y_hl
plt.scatter(xs, ys);
plt.scatter(10.0, y_mse, label='mse');
plt.scatter(10.0, y_hl, label='huber_loss');
plt.grid()
plt.legend();
Part 2 - Huber Loss Hyperparameter and Loss class
In this section, we'll extend our previous Huber loss function and show how you can include hyperparameters in defining loss functions. We'll also look at how to implement a custom loss as an object by inheriting the Loss class.
Custom loss with hyperparameter
The loss argument in model.compile() only accepts functions that accept two parameters: the ground truth (y_true) and the model predictions (y_pred). If we want to include a hyperparameter that we can tune, we can define a wrapper function that accepts this hyperparameter.
def my_huber_loss_with_threshold(threshold):
    # function that accepts the ground truth and predictions
    def my_huber_loss(y_true, y_pred):
        error = y_true - y_pred
        is_small_error = tf.abs(error) <= threshold
        small_error_loss = tf.square(error) / 2
        big_error_loss = threshold * (tf.abs(error) - (threshold / 2))
        return tf.where(is_small_error, small_error_loss, big_error_loss)

    # return the inner function tuned by the hyperparameter
    return my_huber_loss
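Because the threshold is captured in the closure, each call to the wrapper gives us an independent loss function. As a rough check, an absolute error of 2 should fall in the linear branch when the threshold is 1, but in the quadratic branch when the threshold is 3:

y_true = tf.constant([0.0])
y_pred = tf.constant([2.0])

# threshold=1 -> linear branch: 1 * (2 - 0.5) = 1.5
print(my_huber_loss_with_threshold(threshold=1.0)(y_true, y_pred).numpy())

# threshold=3 -> quadratic branch: 2**2 / 2 = 2.0
print(my_huber_loss_with_threshold(threshold=3.0)(y_true, y_pred).numpy())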
We can now specify the loss as the wrapper function above. Notice that we can now set the threshold value. Try varying this value and see the results you get.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(1, input_shape=[1])
])
model.compile(optimizer='sgd', loss=my_huber_loss_with_threshold(threshold=1.2))
model.fit(xs, ys, epochs=500, verbose=0)
y_hlt = model.predict([10.0])
y_hlt
plt.scatter(xs, ys);
plt.scatter(10.0, y_mse, label='mse');
plt.scatter(10.0, y_hl, label='huber_loss');
plt.scatter(10.0, y_hlt, label='huber_loss_1.2')
plt.grid()
plt.legend();
Custom loss as a class
We can also implement our custom loss as a class that inherits from the Keras Loss class. The threshold becomes an instance attribute, and the loss computation goes in the call() method.
from tensorflow.keras.losses import Loss
class MyHuberLoss(Loss):
    # initialize instance attributes
    def __init__(self, threshold=1):
        super(MyHuberLoss, self).__init__()
        self.threshold = threshold

    # compute loss
    def call(self, y_true, y_pred):
        error = y_true - y_pred
        is_small_error = tf.abs(error) <= self.threshold
        small_error_loss = tf.square(error) / 2
        big_error_loss = self.threshold * (tf.abs(error) - self.threshold / 2)
        return tf.where(is_small_error, small_error_loss, big_error_loss)
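With the same threshold, the class-based loss should agree with the function version. One detail to keep in mind: calling a Loss object applies the reduction configured on the base class (by default, the mean over the batch), whereas our plain function returns the element-wise losses. A quick comparison:

y_true = tf.constant([0.0, 0.0])
y_pred = tf.constant([0.5, 3.0])

# the Loss object reduces to a scalar: mean of [0.125, 2.5] = 1.3125
print(MyHuberLoss(threshold=1.0)(y_true, y_pred).numpy())

# taking the mean of the element-wise function output gives the same value
print(tf.reduce_mean(my_huber_loss(y_true, y_pred)).numpy())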
You can specify the loss by instantiating an object from your custom loss class.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(1, input_shape=[1,])
])
model.compile(optimizer='sgd', loss=MyHuberLoss(threshold=1.02))
model.fit(xs, ys, epochs=500, verbose=0)
y_hltc = model.predict([10.0])
y_hltc