Creating a choropleth building permit density in Nashville
In this chapter, you will learn about a special map called a choropleth. Then you will learn and practice building choropleths using two different packages - geopandas and folium. This is the Summary of lecture "Visualizing Geospatial Data in Python", via datacamp.
- What is choropleth?
- Choropleths with geopandas
- Choropleths with folium
- Folium choropleth with markers and popups
import pandas as pd
import geopandas as gpd
import numpy as np
import matplotlib.pyplot as plt
import folium
from shapely.geometry import Point
from IPython.display import IFrame
plt.rcParams['figure.figsize'] = (10,5)
permits = gpd.read_file('./dataset/building_permits_2017.csv')
council_districts = gpd.read_file('./dataset/council_districts.geojson')
permits['geometry'] = permits.apply(lambda x: Point((float(x.lng), float(x.lat))), axis=1)
# Build a GeoDataFrame: permits_geo
permits_geo = gpd.GeoDataFrame(permits, crs = council_districts.crs, geometry=permits.geometry)
# Spatial join of permits_geo and council_districts
permits_by_district = gpd.sjoin(permits_geo, council_districts, op='within')
print(permits_by_district.head(2))
# Create permit_counts
permit_counts = permits_by_district.groupby(['district']).size()
print(permit_counts)
Now you have a count of building permits issued for each council district. Next you'll get the area of each council_district.
Council district areas and permit counts
In order to create a normalized value for the building permits issued in each council district, you will need to find the area of each council district. Remember that you can leverage the area
attribute of a GeoSeries to do this. You will need to convert permit_counts
to a DataFrame so you can merge it with the council_districts
data.
council_districts['area'] = council_districts.geometry.area
print(council_districts.head(2))
# Convert permit_counts to a DataFrame
permits_df = permit_counts.to_frame()
print(permits_df.head(2))
# Reset index and column names
permits_df.reset_index(inplace=True)
permits_df.columns = ['district', 'bldg_permits']
print(permits_df.head(2))
# Merge council_districts and permits_df
districts_and_permits = pd.merge(council_districts, permits_df, on='district')
print(districts_and_permits.head(2))
You have created a column with the area in the council_districts Geo DataFrame and built a DataFrame from the permit_counts. You have merged all the information into a single GeoDataFrame. Next you will calculate the permits by area for each council district.
print(type(districts_and_permits))
# Create permit_density column in districts_and_permits
districts_and_permits['permit_density'] = districts_and_permits.apply(lambda row: row.bldg_permits/row.area, axis=1)
# Print the head of districts_and_permits
print(districts_and_permits.head())
districts_and_permits.plot(column='permit_density', legend=True);
districts_and_permits.plot(column = 'permit_density', cmap = 'BuGn', edgecolor = 'black', legend = True)
plt.xlabel('longitude')
plt.ylabel('latitude')
plt.xticks(rotation = 'vertical')
plt.title('2017 Building Project Density by Council District')
You can see that most development has happened close to the city center! The scale for permit density is hard to interpret. We'll fix that next.
Area in km squared, geometry in decimal degrees
In this exercise, you'll start again with the council_districts
GeoDataFrame and the permits
DataFrame. You will change the council_districts
to use the EPSG 3857 coordinate reference system before creating a column for area
. Once the area
column has been created, you will change the CRS back to EPSG 4326 so that the geometry is in decimal degrees.
council_districts = council_districts.to_crs(epsg=3857)
print(council_districts.crs)
print(council_districts.head())
# Create area in square km
sqm_to_sqkm = 10 ** 6
council_districts['area'] = council_districts.geometry.area / sqm_to_sqkm
# Change council_districts crs back to epsg 4326
council_districts = council_districts.to_crs(epsg=4326)
print(council_districts.crs)
print(council_districts.head())
Spatially joining and getting counts
You will continue preparing your dataset for plotting a geopandas
choropleth by creating a GeoDataFrame of the building permits spatially joined to the council districts. After that, you will be able to get counts of the building permits issued in each council district.
permits_geo = gpd.GeoDataFrame(permits, crs=council_districts.crs, geometry=permits.geometry)
# Spatially join permits_geo and council_districts
permits_by_district = gpd.sjoin(permits_geo, council_districts, op='within')
# Count permits in each district
permits_counts = permits_by_district.groupby(['district']).size()
# Convert permit_counts to a df with 2 columns: district and bldg_permits
counts_df = permits_counts.to_frame()
counts_df.reset_index(inplace=True)
counts_df.columns = ['district', 'bldg_permits']
print(counts_df.head(2))
You have done all the work to get the area units in km squared, latitude and longitude in decimal degrees, and a count of building permits issued for each council district all in the same dataset. All that's left is to calculate a normalized value and plot the choropleth!
Building a polished Geopandas choropleth
After merging the counts_df
with permits_by_district
, you will create a column with normalized permit_density
by dividing the count of permits in each council district by the area of that council district. Then you will plot your final geopandas choropleth of the building projects in each council district.
districts_and_permits = pd.merge(permits_by_district, counts_df, on='district')
# Create permit_density column
districts_and_permits['permit_density'] = districts_and_permits.apply(lambda row: row.bldg_permits / row.area, axis=1)
print(districts_and_permits.head(2))
# Create choropleth plot
districts_and_permits.plot(column='permit_density', cmap='OrRd', edgecolor='black', legend=True);
# Add axis labels and title
plt.xlabel('longitude');
plt.ylabel('latitude');
plt.title('2017 Building Project Density by Council District');
nashville = [36.1636,-86.7823]
# Create map
m = folium.Map(location=nashville, zoom_start=10)
# Build choropleth
folium.Choropleth(
geo_data=r'./dataset/council_districts.geojson',
name='geometry',
data=districts_and_permits,
columns=['district', 'permit_density'],
key_on='feature.properties.district',
fill_color='Reds',
fill_opacity=0.5,
line_opacity=1.0,
legend_name='2017 Permitted Building Projects per km squared',
).add_to(m)
folium.LayerControl().add_to(m)
m.save('./html/permits_choropleth.html')
districts_and_permits['center'] = districts_and_permits.geometry.centroid
distinct = []
# Build markers and popups
for row in districts_and_permits.iterrows():
row_values = row[1]
if row_values['district'] not in distinct:
center_point = row_values['center']
location = [center_point.y, center_point.x]
popup = ('Council District: ' + str(row_values['district']) +
'; ' + 'permits issued: ' + str(row_values['bldg_permits']))
marker = folium.Marker(location = location, popup = popup)
marker.add_to(m)
distinct.append(row_values['district'])
# Display the map
m.save('./html/permits_choropleth2.html')