import numpy as np
import scipy
import imageio
import matplotlib.pyplot as plt
from pprint import pprint

plt.rcParams['figure.figsize'] = (10, 8)

Image data

Load images

In this chapter, we'll work with sections of a computed tomography (CT) scan from The Cancer Imaging Archive. CT uses a rotating X-ray tube to create a 3D image of the target area.

The actual content of the image depends on the instrument used: photographs measure visible light, x-ray and CT measure radiation absorbance, and MRI scanners measure magnetic fields.

To warm up, use the imageio package to load a single DICOM image from the scan volume and check out a few of its attributes.

im = imageio.imread('./dataset/tcia-chest-ct-sample/chest-220.dcm')

# Print image attributes
print('Image type:', type(im))
print('Shape of image array:', im.shape)
Image type: <class 'imageio.core.util.Array'>
Shape of image array: (512, 512)

imageio is a versatile package. It can read in a variety of image data, including JPEG, PNG, and TIFF. But it's especially useful for its ability to handle DICOM files.


ImageIO reads in data as Image objects. These are standard NumPy arrays with a dictionary of metadata.

Metadata can be quite rich in medical images and can include:

  • Patient demographics: name, age, sex, clinical information
  • Acquisition information: image shape, sampling rates, data type, modality (such as X-Ray, CT or MRI)

Start this exercise by reading in the chest image and listing the available fields in the meta dictionary.

Dict([('TransferSyntaxUID', '1.2.840.10008.1.2'),
      ('SOPClassUID', '1.2.840.10008.'),
      ('StudyDate', '20040529'),
      ('SeriesDate', '20040515'),
      ('ContentDate', '20040515'),
      ('StudyTime', '115208'),
      ('SeriesTime', '115254'),
      ('ContentTime', '115325'),
      ('Modality', 'CT'),
      ('Manufacturer', 'GE MEDICAL SYSTEMS'),
      ('StudyDescription', 'PET CT with registered MR'),
      ('SeriesDescription', 'CT IMAGES - RESEARCH'),
      ('PatientName', 'STS_007'),
      ('PatientID', 'STS_007'),
      ('PatientBirthDate', ''),
      ('PatientSex', 'F '),
      ('PatientWeight', 82.0),
      ('SeriesNumber', 2),
      ('AcquisitionNumber', 1),
      ('InstanceNumber', 57),
      ('ImagePositionPatient', (-250.0, -250.0, -180.62)),
      ('ImageOrientationPatient', (1.0, 0.0, 0.0, 0.0, 1.0, 0.0)),
      ('SamplesPerPixel', 1),
      ('Rows', 512),
      ('Columns', 512),
      ('PixelSpacing', (0.976562, 0.976562)),
      ('BitsAllocated', 16),
      ('BitsStored', 16),
      ('HighBit', 15),
      ('PixelRepresentation', 0),
      ('RescaleIntercept', -1024.0),
      ('RescaleSlope', 1.0),
       b'Data converted to numpy array, raw data removed to preserve memory'),
      ('shape', (512, 512)),
      ('sampling', (0.976562, 0.976562))])

Plot images

Perhaps the most critical principle of image analysis is: look at your images!

Matplotlib's imshow() function gives you a simple way to do this. Knowing a few simple arguments will help:

  • cmap controls the color mappings for each value. The "gray" colormap is common, but many others are available.
  • vmin and vmax control the color contrast between values. Changing these can reduce the influence of extreme values.
  • plt.axis('off') removes axis and tick labels from the image.

For this exercise, plot the CT scan and investigate the effect of a few different parameters.

fig, ax = plt.subplots(1, 3, figsize=(15, 10))
# Draw the image in grayscale
ax[0].imshow(im, cmap='gray');

# Draw the image with greater contrast
ax[1].imshow(im, cmap='gray', vmin=-200, vmax=200);

# Remove axis ticks and labels
ax[2].imshow(im, cmap='gray', vmin=-200, vmax=200);

N-dimensional images

  • Shape, sampling and field of view
    • Image shape: number of elements along each axis
    • Sampling rate: physical space covered by each element
    • Field of view: physical space covered along each axis

Stack images

Image "stacks" are a useful metaphor for understanding multi-dimensional data. Each higher dimension is a stack of lower dimensional arrays.


In this exercise, we will use NumPy's stack() function to combine several 2D arrays into a 3D volume. By convention, volumetric data should be stacked along the first dimension: vol[plane, row, col].

Note: performing any operations on an ImageIO Image object will convert it to a numpy.ndarray, stripping its metadata.

im1 = imageio.imread('./dataset/tcia-chest-ct-sample/chest-220.dcm')
im2 = imageio.imread('./dataset/tcia-chest-ct-sample/chest-221.dcm')
im3 = imageio.imread('./dataset/tcia-chest-ct-sample/chest-222.dcm')

# Stack images into a volume
vol = np.stack([im1, im2, im3], axis=0)
print('Volume dimensions:', vol.shape)
Volume dimensions: (3, 512, 512)

Load volumes

ImageIO's volread() function can load multi-dimensional datasets and create 3D volumes from a folder of images. It can also aggregate metadata across these multiple images.

For this exercise, read in an entire volume of brain data from the "./dataset/tcia-chest-ct-sample" folder, which contains 5 DICOM images.

vol = imageio.volread('./dataset/tcia-chest-ct-sample/')

# Print image attributes
print('Available metadata:', vol.meta.keys())
print('Shape of image array:', vol.shape)
Reading DICOM (examining files): 1/5 files (20.0%5/5 files (100.0%)
  Found 1 correct series.
Reading DICOM (loading data): 5/5  (100.0%)
Available metadata: odict_keys(['TransferSyntaxUID', 'SOPClassUID', 'SOPInstanceUID', 'StudyDate', 'SeriesDate', 'ContentDate', 'StudyTime', 'SeriesTime', 'ContentTime', 'Modality', 'Manufacturer', 'StudyDescription', 'SeriesDescription', 'PatientName', 'PatientID', 'PatientBirthDate', 'PatientSex', 'PatientWeight', 'StudyInstanceUID', 'SeriesInstanceUID', 'SeriesNumber', 'AcquisitionNumber', 'InstanceNumber', 'ImagePositionPatient', 'ImageOrientationPatient', 'SamplesPerPixel', 'Rows', 'Columns', 'PixelSpacing', 'BitsAllocated', 'BitsStored', 'HighBit', 'PixelRepresentation', 'RescaleIntercept', 'RescaleSlope', 'PixelData', 'shape', 'sampling'])
Shape of image array: (5, 512, 512)

Field of view

The amount of physical space covered by an image is its field of view, which is calculated from two properties:

  • Array shape, the number of data elements on each axis. Can be accessed with the shape attribute.
  • Sampling resolution, the amount of physical space covered by each pixel. Sometimes available in metadata (e.g., meta['sampling']).

Advanced plotting

  • To plot N-dimensional data, slice it!
  • Non-standard views
    • Axial (Plain)
    • Coronal (Row)
    • sagittal (Col)

Generate subplots

You can draw multiple images in one figure to explore data quickly. Use plt.subplots() to generate an array of subplots.

fig, axes = plt.subplots(nrows=2, ncols=2)


To draw an image on a subplot, call the plotting method directly from the subplot object rather than through PyPlot: axes[0,0].imshow(im) rather than plt.imshow(im).

fig, axes = plt.subplots(nrows=1, ncols=2)

# Draw an images on each subplot
axes[0].imshow(im1, cmap='gray');
axes[1].imshow(im2, cmap='gray');

# Remove ticks/labels and render

Slice 3D images

The simplest way to plot 3D and 4D images by slicing them into many 2D frames. Plotting many slices sequentially can create a "fly-through" effect that helps you understand the image as a whole.


To select a 2D frame, pick a frame for the first axis and select all data from the remaining two: vol[0, :, :]

fig, axes = plt.subplots(1, 5, figsize=(15, 10))

# Loop through subplots and draw image
for ii in range(5):
    im = vol[ii, :, :]
    axes[ii].imshow(im, cmap='gray', vmin=-200, vmax=200)

Plot other views

Any two dimensions of an array can form an image, and slicing along different axes can provide a useful perspective. However, unequal sampling rates can create distorted images.


Changing the aspect ratio can address this by increasing the width of one of the dimensions.

For this exercise, plot images that slice along the second and third dimensions of vol. Explicitly set the aspect ratio to generate undistorted images.

im1 = vol[:, 256, :]
im2 = vol[:, :, 256]

# Compute aspect ratios
d0, d1, d2 = vol.meta['sampling']
asp1 = d0 / d2
asp2 = d0 / d1

# Plot the images on a subplots array
fig, axes = plt.subplots(2, 1, figsize=(15, 8))
axes[0].imshow(im1, cmap='gray', aspect=asp1);
axes[1].imshow(im2, cmap='gray', aspect=asp2);

Note: Sample dataset doesn’t contain enough amount to plot whole images.