Introduction to Matplotlib
This chapter introduces the Matplotlib visualization library and demonstrates how to use it with data. This is a summary of the lecture "Introduction to Data Visualization with Matplotlib" from DataCamp.
import matplotlib.pyplot as plt
import pandas as pd
plt.rcParams['figure.figsize'] = (10, 5)
Using the matplotlib.pyplot interface
There are many ways to use Matplotlib. In this course, we will focus on the pyplot interface, which provides the most flexibility in creating and customizing data visualizations.
Initially, we will use the pyplot interface to create two kinds of objects: Figure objects and Axes objects.
fig, ax = plt.subplots()
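To make the two kinds of objects concrete, the following minimal sketch inspects what plt.subplots() returns. Selecting the Agg backend is an assumption made here so the snippet runs without a display; it is not part of the lecture.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import matplotlib.pyplot as plt
from matplotlib.axes import Axes
from matplotlib.figure import Figure

# With no arguments, plt.subplots() creates one Figure holding one Axes
fig, ax = plt.subplots()

is_figure = isinstance(fig, Figure)  # the container for everything drawn
is_axes = isinstance(ax, Axes)       # the region that holds the data
owns_axes = fig.axes == [ax]         # the Figure keeps a list of its Axes
print(is_figure, is_axes, owns_axes)
```

The Figure is the overall canvas, while the Axes is the object whose methods are called to add data.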
Adding data to an Axes object
Adding data to a figure is done by calling methods of the Axes object. In this exercise, we will use the plot method to add data about rainfall in two American cities: Seattle, WA and Austin, TX.
The data are stored in two Pandas DataFrame objects, which the code below reads from CSV files: seattle_weather stores information about the weather in Seattle, and austin_weather stores information about the weather in Austin. Each of the DataFrames has a MONTH column that stores the three-letter names of the months. Each also has a column named "MLY-PRCP-NORMAL" that stores the average rainfall in each month during a ten-year period.
In this exercise, you will create a visualization that will allow you to compare the rainfall in these two cities.
seattle_weather = pd.read_csv('./dataset/seattle_weather.csv', index_col='DATE')
austin_weather = pd.read_csv('./dataset/austin_weather.csv', index_col='DATE')
fig, ax = plt.subplots()
# Plot MLY-PRCP-NORMAL from seattle_weather against the MONTH
ax.plot(seattle_weather["MONTH"], seattle_weather["MLY-PRCP-NORMAL"]);
# Plot MLY-PRCP-NORMAL from austin_weather against MONTH
ax.plot(austin_weather['MONTH'], austin_weather["MLY-PRCP-NORMAL"]);
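The CSV files are not reproduced here, so the sketch below repeats the same pattern on two small hand-made DataFrames. The names city_a and city_b and all values in them are placeholders invented for illustration, not real rainfall data; only the column names match the ones used above.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import matplotlib.pyplot as plt
import pandas as pd

# Placeholder values for illustration only; NOT real climate data
city_a = pd.DataFrame({"MONTH": ["Jan", "Feb", "Mar"],
                       "MLY-PRCP-NORMAL": [3.0, 2.0, 1.0]})
city_b = pd.DataFrame({"MONTH": ["Jan", "Feb", "Mar"],
                       "MLY-PRCP-NORMAL": [1.0, 1.5, 2.5]})

fig, ax = plt.subplots()
# Each call to ax.plot adds one line to the same Axes
ax.plot(city_a["MONTH"], city_a["MLY-PRCP-NORMAL"])
ax.plot(city_b["MONTH"], city_b["MLY-PRCP-NORMAL"])

n_lines = len(ax.get_lines())
print(n_lines)
```

Because both calls target the same Axes, the two series end up on one set of axes, which is what makes the comparison possible.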
Customizing data appearance
We can customize the appearance of data in our plots as we add it, by passing keyword arguments to the plot method.
In this exercise, you will customize the appearance of the markers, the linestyle that is used, and the color of the lines and markers for your data.
As before, the data are provided in Pandas DataFrame objects: seattle_weather and austin_weather. Each has a MONTH column and a "MLY-PRCP-NORMAL" column that you will plot against each other.
fig, ax = plt.subplots();
# Plot Seattle data, setting data appearance
ax.plot(seattle_weather["MONTH"], seattle_weather["MLY-PRCP-NORMAL"], color='b', marker='o', linestyle='--');
# Plot Austin data, setting data appearance
ax.plot(austin_weather["MONTH"], austin_weather["MLY-PRCP-NORMAL"], color='r', marker='v', linestyle='--');
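One way to see what these keyword arguments do is to inspect the line object that plot returns. The sketch below is a minimal illustration using placeholder values (not data from the lecture) and assumes a non-interactive backend.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import matplotlib.pyplot as plt

months = ["Jan", "Feb", "Mar"]
values = [1.0, 2.0, 1.5]  # placeholder values, for illustration only

fig, ax = plt.subplots()
# ax.plot returns a list of Line2D objects; unpack the single line
(line,) = ax.plot(months, values, color="b", marker="o", linestyle="--")

# The keyword arguments are stored as properties of the line
print(line.get_color(), line.get_marker(), line.get_linestyle())
```

The same properties can be changed later through the matching setters, such as line.set_color.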
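Customizing axis labels and adding a title
Axis labels and a title are set with the set_xlabel, set_ylabel, and set_title methods of the Axes object.
fig, ax = plt.subplots();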
ax.plot(seattle_weather["MONTH"], seattle_weather["MLY-PRCP-NORMAL"]);
ax.plot(austin_weather["MONTH"], austin_weather["MLY-PRCP-NORMAL"]);
# Customize the x-axis label
ax.set_xlabel("Time (months)");
# Customize the y-axis label
ax.set_ylabel("Precipitation (inches)");
# Add the title
ax.set_title("Weather patterns in Austin and Seattle");
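The labeling calls above can be checked without looking at the rendered figure, since each setter has a matching getter on the Axes object. This is a small sketch assuming a non-interactive backend; the label strings are the ones used above.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.set_xlabel("Time (months)")
ax.set_ylabel("Precipitation (inches)")
ax.set_title("Weather patterns in Austin and Seattle")

# Read the text back from the Axes
labels = (ax.get_xlabel(), ax.get_ylabel(), ax.get_title())
print(labels)
```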
Creating small multiples with plt.subplots
Small multiples are used to plot several datasets side-by-side. In Matplotlib, small multiples can be created using the plt.subplots() function. The first argument is the number of rows in the array of Axes objects to generate, and the second argument is the number of columns. In this exercise, you will use the Austin and Seattle data to practice creating and populating an array of subplots.
The data are given to you in DataFrames: seattle_weather and austin_weather. Each has a MONTH column, a MLY-PRCP-NORMAL column (for average precipitation), and a MLY-TAVG-NORMAL column (for average temperature). In this exercise, you will plot each city's monthly average precipitation and average temperature in its own subplot.
fig, ax = plt.subplots(2,2);
# Addressing the top left Axes as index 0, 0, plot month and Seattle precipitation
ax[0, 0].plot(seattle_weather['MONTH'], seattle_weather['MLY-PRCP-NORMAL']);
# In the top right (index 0,1), plot month and Seattle temperatures
ax[0, 1].plot(seattle_weather['MONTH'], seattle_weather['MLY-TAVG-NORMAL']);
# In the bottom left (1, 0) plot month and Austin precipitation
ax[1, 0].plot(austin_weather['MONTH'], austin_weather['MLY-PRCP-NORMAL']);
# In the bottom right (1, 1) plot month and Austin temperatures
ax[1, 1].plot(austin_weather['MONTH'], austin_weather['MLY-TAVG-NORMAL']);
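The indexing used above follows from the shape of the object that plt.subplots returns. The sketch below, which assumes a non-interactive backend, shows that a 2-by-2 grid comes back as a two-dimensional NumPy array, while a grid with a single column is squeezed to one dimension by default.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import matplotlib.pyplot as plt
import numpy as np

# Two rows and two columns: a 2-D array, indexed as ax[row, column]
fig, ax = plt.subplots(2, 2)
grid_shape = ax.shape
is_array = isinstance(ax, np.ndarray)

# Two rows and one column: a 1-D array, indexed as ax_col[row]
fig_col, ax_col = plt.subplots(2, 1)
column_shape = ax_col.shape

print(grid_shape, column_shape)
```

This is why a single-column or single-row layout is addressed with one index, such as ax[0], rather than two.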
Small multiples with shared y axis
When creating small multiples, it is often preferable to display the different plots with the same scale on the y-axis. This can be configured by setting the sharey keyword argument to True.
In this exercise, you will create a Figure with two Axes objects that share their y-axis.
fig, ax = plt.subplots(2, 1, sharey=True);
# Plot Seattle precipitation data in the top axes
ax[0].plot(seattle_weather['MONTH'], seattle_weather['MLY-PRCP-NORMAL'], color = 'b');
ax[0].plot(seattle_weather['MONTH'], seattle_weather['MLY-PRCP-25PCTL'], color = 'b', linestyle = '--');
ax[0].plot(seattle_weather['MONTH'], seattle_weather['MLY-PRCP-75PCTL'], color = 'b', linestyle = '--');
# Plot Austin precipitation data in the bottom axes
ax[1].plot(austin_weather['MONTH'], austin_weather['MLY-PRCP-NORMAL'], color = 'r');
ax[1].plot(austin_weather['MONTH'], austin_weather['MLY-PRCP-25PCTL'], color = 'r', linestyle = '--');
ax[1].plot(austin_weather['MONTH'], austin_weather['MLY-PRCP-75PCTL'], color = 'r', linestyle = '--');
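The effect of sharey can be seen by comparing y-axis limits with and without it. The sketch below uses placeholder values with deliberately different ranges (not data from the lecture) and assumes a non-interactive backend.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import matplotlib.pyplot as plt

x = [0, 1, 2]
wide = [0.0, 5.0, 10.0]   # placeholder values, for illustration only
narrow = [1.0, 2.0, 3.0]  # placeholder values, for illustration only

# With sharey=True, both Axes use one common y-axis scale
fig, ax = plt.subplots(2, 1, sharey=True)
ax[0].plot(x, wide)
ax[1].plot(x, narrow)
same_limits = ax[0].get_ylim() == ax[1].get_ylim()

# Without it, each Axes scales to its own data
fig2, ax2 = plt.subplots(2, 1)
ax2[0].plot(x, wide)
ax2[1].plot(x, narrow)
independent_limits = ax2[0].get_ylim() != ax2[1].get_ylim()

print(same_limits, independent_limits)
```

A shared scale makes the heights of the lines directly comparable across subplots, at the cost of compressing the series with the smaller range.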