Creating and joining GeoDataFrames
In this chapter, you'll work with GeoJSON to create polygonal plots, learn about projections and coordinate reference systems, and practice spatially joining data. This is a summary of the lecture "Visualizing Geospatial Data in Python" from DataCamp.
import pandas as pd
import geopandas as gpd
import matplotlib.pyplot as plt
plt.rcParams['figure.figsize'] = (10,5)
school_districts = gpd.read_file('./dataset/school_districts.geojson')
lgnd_kwds = {'title': 'School Districts', 'loc': 'upper left',
'bbox_to_anchor': (1, 1.03), 'ncol':1}
# Plot the school districts using the tab20 colormap (qualitative)
school_districts.plot(column='district', cmap='tab20', legend=True, legend_kwds=lgnd_kwds, figsize=(7,7));
plt.xlabel('Longitude');
plt.ylabel('Latitude');
plt.title('Nashville School Districts');
school_districts.plot(column='district', cmap='summer', legend=True, legend_kwds=lgnd_kwds);
plt.xlabel('Longitude');
plt.ylabel('Latitude');
plt.title('Nashville School Districts');
school_districts.plot(cmap='Set3', legend=True, legend_kwds=lgnd_kwds);
plt.xlabel('Longitude');
plt.ylabel('Latitude');
plt.title('Nashville School Districts');
neighborhoods = gpd.read_file('./dataset/neighborhoods.geojson')
# Print the first few rows of neighborhoods
print(neighborhoods.head())
# Plot the neighborhoods, color according to name and use the Dark2 colormap
neighborhoods.plot(column='name', cmap='Dark2');
print(school_districts.head(1))
print(school_districts.crs)
# Reproject the whole GeoDataFrame to epsg:3857 (Web Mercator, units in meters)
school_districts = school_districts.to_crs(epsg=3857)
# Print the first row of school districts GeoDataFrame and the crs
print(school_districts.head(1))
print(school_districts.crs)
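You can change the coordinate reference system of a GeoDataFrame by calling its to_crs() method, which reprojects the geometries. Merely assigning a new value to the crs attribute relabels the data without transforming the coordinates. Notice that the units of the geometry change when you change the CRS: EPSG:4326 uses decimal degrees, while EPSG:3857 uses meters. Always make sure two GeoDataFrames share the same CRS before you spatially join them.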
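To make the change of units concrete, here is a minimal sketch that reprojects a single point. The point and its label are hypothetical values chosen for illustration (roughly downtown Nashville), not rows from the course datasets.

```python
import geopandas as gpd
from shapely.geometry import Point

# One hypothetical point, stored as (longitude, latitude) in decimal degrees
pts = gpd.GeoDataFrame(
    {"label": ["sample point"]},
    geometry=[Point(-86.78, 36.16)],
    crs="EPSG:4326",
)

# to_crs() returns a new GeoDataFrame with transformed coordinates (meters)
pts_3857 = pts.to_crs(epsg=3857)

print(pts.crs, pts.geometry.iloc[0])
print(pts_3857.crs, pts_3857.geometry.iloc[0])
```

The first line printed shows coordinates in degrees, and the second shows the same location as x and y values in the millions of meters.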
Construct a GeoDataFrame from a DataFrame
In this exercise, you will construct a geopandas GeoDataFrame from the Nashville Public Art DataFrame. You will need to import the Point constructor from the shapely.geometry module to create a geometry column in art before you can create a GeoDataFrame from art. This will get you ready to spatially join the art data and the neighborhoods data in order to discover which neighborhood has the most art.
# Load the public art data as a regular pandas DataFrame
art = pd.read_csv('./dataset/public_art.csv')
from shapely.geometry import Point
# Print the first few rows of the art DataFrame
print(art.head())
# Create a geometry column from lng & lat
art['geometry'] = art.apply(lambda x: Point(float(x.Longitude), float(x.Latitude)), axis=1)
# Create a GeoDataFrame from art and verify the type
art_geo = gpd.GeoDataFrame(art, crs=neighborhoods.crs, geometry=art.geometry)
print(type(art_geo))
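As an alternative to building each Point with apply, geopandas provides points_from_xy, which creates the geometry column in one vectorized call. The sketch below uses a small hypothetical table in place of the public art data and assumes the coordinates are in EPSG:4326. The exercise above takes the CRS from neighborhoods instead.

```python
import pandas as pd
import geopandas as gpd

# Hypothetical rows standing in for the public art table
art_df = pd.DataFrame({
    "Title": ["Sample mural", "Sample sculpture"],
    "Latitude": [36.16, 36.15],
    "Longitude": [-86.78, -86.80],
})

# points_from_xy takes x (longitude) first, then y (latitude)
art_gdf = gpd.GeoDataFrame(
    art_df,
    geometry=gpd.points_from_xy(art_df["Longitude"], art_df["Latitude"]),
    crs="EPSG:4326",  # assumed CRS for this illustration
)

print(type(art_gdf))
print(art_gdf.geometry.iloc[0])
```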
Now that the public art data is in a GeoDataFrame we can join it to the neighborhoods with a special kind of join called a spatial join. Let's go learn about how that's done!
# geopandas 0.10+ uses `predicate`; the older `op` keyword was deprecated and later removed
art_intersect_neighborhoods = gpd.sjoin(art_geo, neighborhoods, predicate='intersects')
# Print the shape property of art_intersect_neighborhoods
print(art_intersect_neighborhoods.shape)
art_within_neighborhoods = gpd.sjoin(art_geo, neighborhoods, predicate='within')
# Print the shape property of art_within_neighborhoods
print(art_within_neighborhoods.shape)
art_containing_neighborhoods = gpd.sjoin(art_geo, neighborhoods, predicate='contains')
# Print the shape property of art_containing_neighborhoods
print(art_containing_neighborhoods.shape)
No neighborhood polygon is contained within an artwork's point location, so the 'contains' join returns no rows.
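The difference between the three join types is easier to see on toy data. This sketch uses two hypothetical square polygons and three hypothetical points, and assumes geopandas 0.10 or later, where the keyword is predicate (older releases called it op).

```python
import geopandas as gpd
from shapely.geometry import Point, Polygon

# Two hypothetical square "neighborhoods"
polys = gpd.GeoDataFrame(
    {"name": ["A", "B"]},
    geometry=[
        Polygon([(0, 0), (2, 0), (2, 2), (0, 2)]),
        Polygon([(3, 0), (5, 0), (5, 2), (3, 2)]),
    ],
    crs="EPSG:4326",
)

# Three hypothetical "artworks": one inside A, one inside B, one outside both
points = gpd.GeoDataFrame(
    {"Title": ["p1", "p2", "p3"]},
    geometry=[Point(1, 1), Point(4, 1), Point(10, 10)],
    crs="EPSG:4326",
)

# Count the rows each predicate returns for a points-to-polygons join
counts = {}
for predicate in ("intersects", "within", "contains"):
    joined = gpd.sjoin(points, polys, predicate=predicate)
    counts[predicate] = len(joined)

print(counts)
```

Here intersects and within each match the two points that fall inside a square, while contains matches nothing, because a point cannot contain a polygon.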
# Join each artwork to the neighborhood polygon it falls within
neighborhood_art = gpd.sjoin(art_geo, neighborhoods, predicate='within')
# Print the first few rows
print(neighborhood_art.head())
Now that you have successfully joined art and neighborhoods you can see the title and other information about the artwork along with the name of the neighborhood where it is located. Next you'll do the work to see what art is in which neighborhood!
neighborhood_art_grouped = neighborhood_art[['name', 'Title']].groupby('name')
# Aggregate the grouped data and count the artworks within each polygon
print(neighborhood_art_grouped.agg('count').sort_values(by='Title', ascending=False))
It looks like most of the public art is in the Urban Residents neighborhood. Next you'll subset neighborhood art and neighborhoods to get only the Urban Residents art and neighborhood.
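The grouping and counting step can be tried on its own with plain pandas. The table below is hypothetical and only mimics the two columns kept from the joined data. The neighborhood names and titles are made up.

```python
import pandas as pd

# Hypothetical joined rows: one row per artwork, with its neighborhood name
joined = pd.DataFrame({
    "name": ["Downtown", "Downtown", "Midtown", "Downtown", "Eastside", "Midtown"],
    "Title": ["Work 1", "Work 2", "Work 3", "Work 4", "Work 5", "Work 6"],
})

# Count artworks per neighborhood and sort from most to fewest
counts = (
    joined.groupby("name")["Title"]
    .count()
    .sort_values(ascending=False)
)

print(counts)

# The first index label is the neighborhood with the most artworks
top_neighborhood = counts.index[0]
print(top_neighborhood)
```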
Plotting the Urban Residents neighborhood and art
Now you know that most art is in the Urban Residents neighborhood. In this exercise, you'll create a plot of art in that neighborhood. First you will subset just the urban_art from neighborhood_art, and you'll subset the urban_polygon from neighborhoods. Then you will create a plot of the polygon as ax before adding a plot of the art.
urban_art = neighborhood_art.loc[neighborhood_art.name == 'Urban Residents']
# Get just the Urban Residents neighborhood polygon and save it as urban_polygon
urban_polygon = neighborhoods.loc[neighborhoods.name == "Urban Residents"]
# Plot the urban_polygon as ax
ax = urban_polygon.plot(color = 'lightgreen', figsize=(9, 9))
# Add a plot of the urban_art and show it
urban_art.plot( ax = ax, column = 'Type', legend = True);
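The same layering pattern works with any pair of GeoDataFrames that share a CRS. Below is a minimal sketch with a hypothetical polygon and hypothetical artworks. The names and coordinates are illustrative only, and categorical=True is passed explicitly so the Type column is treated as categories.

```python
import geopandas as gpd
from shapely.geometry import Point, Polygon

# Hypothetical neighborhood polygon
polygon = gpd.GeoDataFrame(
    {"name": ["Toy Neighborhood"]},
    geometry=[Polygon([(0, 0), (4, 0), (4, 4), (0, 4)])],
    crs="EPSG:4326",
)

# Hypothetical artworks located inside the polygon
artworks = gpd.GeoDataFrame(
    {"Type": ["Mural", "Sculpture", "Mural"]},
    geometry=[Point(1, 1), Point(2, 3), Point(3, 2)],
    crs="EPSG:4326",
)

# The first plot call creates the axes; the second draws onto it via ax=
ax = polygon.plot(color="lightgreen", figsize=(6, 6))
artworks.plot(ax=ax, column="Type", categorical=True, legend=True)
ax.set_title("Artworks layered over a polygon")
```

In a notebook the figure renders automatically. In a script, call matplotlib.pyplot.show() afterwards to display it.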