Custom Layers in TensorFlow 2
Custom layers give you the flexibility to implement models that use non-standard layers. In this post, we will practice building off of existing standard layers to create custom layers for our models. This is a summary of the lecture "Custom Models, Layers, and Loss Functions with TensorFlow" from DeepLearning.AI.
- Packages
- Part 1 - Lambda Layer
- Part 2 - Building a Custom Dense Layer
- Activation in a custom layer
- Application - Implement a Quadratic Layer
import tensorflow as tf
from tensorflow.keras.utils import plot_model
from tensorflow.keras import backend as K
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
Part 1 - Lambda Layer
This section shows how you can define custom layers with the Lambda layer. You can either use lambda functions within the Lambda layer or define a custom function that the Lambda layer will call.
mnist = tf.keras.datasets.mnist
(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train, X_test = X_train / 255.0, X_test / 255.0
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128),
    tf.keras.layers.Lambda(lambda x: tf.abs(x)),
    tf.keras.layers.Dense(10, activation='softmax')
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
plot_model(model, show_layer_names=True, show_shapes=True, show_dtype=True, to_file='./image/lambda_model.png')
model.fit(X_train, y_train, epochs=5)
model.evaluate(X_test, y_test)
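To see what the Lambda layer does in isolation, we can apply it directly to a small tensor (a quick check added here for illustration; it is not part of the original lecture):
lambda_abs = tf.keras.layers.Lambda(lambda x: tf.abs(x))
print(lambda_abs(tf.constant([-2.0, 0.0, 3.0])).numpy())  # [2. 0. 3.]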
Another way to use the Lambda layer is to pass in a function defined outside the model. The code below shows how a custom ReLU function is used as a custom layer in the model.
def my_relu(x):
    return K.maximum(-0.1, x)
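Note that my_relu floors values at -0.1 rather than 0, so it is a slight variation on the standard ReLU. A quick check (added here for illustration):
print(my_relu(tf.constant([-1.0, 0.5])).numpy())  # [-0.1  0.5]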
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128),
    tf.keras.layers.Lambda(my_relu),
    tf.keras.layers.Dense(10, activation='softmax')
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(X_train, y_train, epochs=5)
model.evaluate(X_test, y_test)
Part 2 - Building a Custom Dense Layer
In this section, we'll walk through how to create a custom layer that inherits from the Layer class. Unlike the simple Lambda layers you used previously, the custom layer here will contain weights that can be updated during training.
xs = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0], dtype=float)
ys = np.array([-3.0, -1.0, 1.0, 3.0, 5.0, 7.0], dtype=float)
Custom Layer with weights
To make a custom layer that is trainable, we need to define a class that inherits the Layer base class from Keras. The Python syntax is shown below in the class declaration. This class requires three functions: __init__(), build() and call(). These ensure that our custom layer has a state and computation that can be accessed during training or inference.
from tensorflow.keras.layers import Layer
class SimpleDense(Layer):
    def __init__(self, units=32):
        '''
        Initialize the instance attributes
        '''
        super(SimpleDense, self).__init__()
        self.units = units

    def build(self, input_shape):
        '''
        Create the state of the layer (weights)
        '''
        w_init = tf.random_normal_initializer()
        self.w = tf.Variable(name='kernel',
                             initial_value=w_init(shape=(input_shape[-1], self.units), dtype='float32'),
                             trainable=True)
        # initialize bias
        b_init = tf.zeros_initializer()
        self.b = tf.Variable(name='bias',
                             initial_value=b_init(shape=(self.units,), dtype='float32'),
                             trainable=True)

    def call(self, inputs):
        '''
        Defines the computation from inputs to outputs
        '''
        return tf.matmul(inputs, self.w) + self.b
Now we can use our custom layer as shown below:
my_dense = SimpleDense(units=1)
# define an input and feed into the layer
x = tf.ones((1, 1))
y = my_dense(x)
my_dense.variables
Let's now try using it in a simple network:
my_layer = SimpleDense(units=1)
model = tf.keras.Sequential([my_layer])
# configure and train the model
model.compile(optimizer='sgd', loss='mean_squared_error')
model.fit(xs, ys, epochs=500, verbose=0)
model.predict([10.0])
my_layer.variables
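As a sanity check (an observation added here, not part of the original lecture): the training data follows y = 2x - 1, so the prediction for 10.0 should be close to 19, and the learned kernel and bias should be close to 2 and -1, respectively.
kernel, bias = my_layer.variables
print(kernel.numpy(), bias.numpy())  # expect roughly [[2.]] and [-1.]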
Adding an activation layer
To use the built-in activations in Keras, we can specify an activation parameter in the __init__() method of our custom layer class. From there, we can initialize it by using the tf.keras.activations.get() method. This takes in a string identifier that corresponds to one of the available activations in Keras. We can then pass the forward computation through this activation in the call() method.
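For example, tf.keras.activations.get('relu') returns the ReLU function itself (a quick illustration added here):
relu_fn = tf.keras.activations.get('relu')
print(relu_fn(tf.constant([-3.0, 4.0])).numpy())  # [0. 4.]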
class SimpleDense(Layer):
    # add an activation parameter
    def __init__(self, units=32, activation=None):
        super(SimpleDense, self).__init__()
        self.units = units
        # define the activation to get from the built-in activation layers in Keras
        self.activation = tf.keras.activations.get(activation)

    def build(self, input_shape):
        # initialize the weights
        w_init = tf.random_normal_initializer()
        self.w = tf.Variable(name='kernel',
                             initial_value=w_init(shape=(input_shape[-1], self.units)),
                             trainable=True)
        # initialize the bias
        b_init = tf.zeros_initializer()
        self.b = tf.Variable(name='bias',
                             initial_value=b_init(shape=(self.units,)),
                             trainable=True)

    def call(self, inputs):
        # pass the computation to the activation layer
        return self.activation(tf.matmul(inputs, self.w) + self.b)
We can now pass in an activation parameter to our custom layer. The string identifier is mostly the same as the function name, so 'relu' below will get tf.keras.activations.relu.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    SimpleDense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax')
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
plot_model(model, show_shapes=True, show_layer_names=True, to_file='./image/model_simpleDense.png')
model.fit(X_train, y_train, epochs=5)
model.evaluate(X_test, y_test)
Define the quadratic layer
Implement a simple quadratic layer. It has 3 state variables: $a$, $b$ and $c$. The computation returned is $ax^2 + bx + c$. Make sure it can also accept an activation function.
__init__
- Call super(my_fun, self) to access the base class of my_fun, and call the __init__() function to initialize that base class. In this case, my_fun is SimpleQuadratic and its base class is Layer.
- self.units: set this using one of the function parameters.
- self.activation: the function parameter activation will be passed in as a string. To get the TensorFlow object associated with the string, please use tf.keras.activations.get().

build
The following are suggested steps for writing your code. If you prefer to use fewer lines to implement it, feel free to do so. Either way, you'll want to set self.a, self.b and self.c.
- a_init: set this to TensorFlow's random_normal_initializer().
- a_init_val: use the random_normal_initializer() that you just created and invoke it, setting the shape and dtype.
  - The shape of a should have its row dimension equal to the last dimension of input_shape, and its column dimension equal to the number of units in the layer. This is because you'll be matrix multiplying $x^2 a$, so the dimensions should be compatible.
  - Set the dtype to 'float32'.
- self.a: create a tensor using tf.Variable, setting the initial_value and setting trainable to True.
- b_init, b_init_val, and self.b: these will be set in the same way that you implemented a_init, a_init_val and self.a.
- c_init: set this to tf.zeros_initializer.
- c_init_val: set this by calling the tf.zeros_initializer that you just instantiated, and set the shape and dtype.
  - shape: this will be a vector whose length equals the number of units. This expects a tuple, and remember that a tuple like (9,) includes a comma.
  - dtype: set to 'float32'.
- self.c: create a tensor using tf.Variable, and set the parameters initial_value and trainable.

call
The following steps perform the computation $x^2 a + x b + c$. The steps are broken down for clarity, but you can also perform this calculation in fewer lines if you prefer.
- x_squared: use tf.math.square().
- x_squared_times_a: use tf.matmul().
  - If you see an error saying InvalidArgumentError: Matrix size-incompatible, check the order of the matrix multiplication to make sure that the matrix dimensions line up.
- x_times_b: use tf.matmul().
- x2a_plus_xb_plus_c: add the three terms together.
- activated_x2a_plus_xb_plus_c: apply the class's activation to the sum of the three terms.
class SimpleQuadratic(Layer):
    def __init__(self, units=32, activation=None):
        '''Initializes the class and sets up the internal variables'''
        super(SimpleQuadratic, self).__init__()
        self.units = units
        self.activation = tf.keras.activations.get(activation)

    def build(self, input_shape):
        '''Create the state of the layer (weights)'''
        a_init = tf.random_normal_initializer()
        a_init_val = a_init(shape=(input_shape[-1], self.units),
                            dtype='float32')
        self.a = tf.Variable(name='a',
                             initial_value=a_init_val,
                             trainable=True)

        b_init = tf.random_normal_initializer()
        b_init_val = b_init(shape=(input_shape[-1], self.units),
                            dtype='float32')
        self.b = tf.Variable(name='b',
                             initial_value=b_init_val,
                             trainable=True)

        c_init = tf.zeros_initializer()
        c_init_val = c_init(shape=(self.units,),
                            dtype='float32')
        self.c = tf.Variable(name='c',
                             initial_value=c_init_val,
                             trainable=True)

        super().build(input_shape)

    def call(self, inputs):
        '''Defines the computation from inputs to outputs'''
        x_squared = tf.math.square(inputs)
        x_squared_times_a = tf.matmul(x_squared, self.a)
        x_times_b = tf.matmul(inputs, self.b)
        x2a_plus_xb_plus_c = x_squared_times_a + x_times_b + self.c
        activated_x2a_plus_xb_plus_c = self.activation(x2a_plus_xb_plus_c)
        return activated_x2a_plus_xb_plus_c
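Before training, we can sanity-check the layer's output shape on a dummy batch (a quick check added here; it is not part of the original lab):
quad = SimpleQuadratic(units=4, activation='relu')
print(quad(tf.ones((2, 3))).shape)  # expect (2, 4): 2 samples, 4 units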
Train your model with the SimpleQuadratic layer that you just implemented.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    SimpleQuadratic(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax')
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
plot_model(model, show_shapes=True, show_layer_names=True, to_file='./image/model_simpleQuadratic.png')
model.fit(X_train, y_train, epochs=5)
model.evaluate(X_test, y_test)