Exploring the relationship between gender and policing
A summary of the lecture "Analyzing Police Activity with pandas", via DataCamp
- Do the genders commit different violations?
- Does gender affect who gets a ticket for speeding?
- Does gender affect whose vehicle is searched?
- Does gender affect who is frisked during a search?
import pandas as pd
# Read 'police.csv' into a DataFrame named ri
ri = pd.read_csv('./dataset/police.csv')
# Count the unique values in 'violation'
print(ri['violation'].value_counts())
# Express the counts as proportions
print(ri['violation'].value_counts(normalize=True))
# Create a DataFrame of female drivers
female = ri[ri['driver_gender'] == 'F']
# Create a DataFrame of male drivers
male = ri[ri['driver_gender'] == 'M']
# Compute the violations by female drivers (as proportions)
print(female['violation'].value_counts(normalize=True))
# Compute the violations by male drivers (as proportions)
print(male['violation'].value_counts(normalize=True))
# Create a DataFrame of female drivers stopped for speeding
female_and_speeding = ri[(ri['driver_gender'] == 'F') & (ri['violation'] == 'Speeding')]
# Create a DataFrame of male drivers stopped for speeding
male_and_speeding = ri[(ri['driver_gender'] == 'M') & (ri['violation'] == 'Speeding')]
# Compute the stop outcomes for female drivers (as proportions)
print(female_and_speeding.stop_outcome.value_counts(normalize=True))
# Compute the stop outcomes for male drivers (as proportions)
print(male_and_speeding.stop_outcome.value_counts(normalize=True))
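The two proportion tables above can also be read side by side. The following is a minimal sketch using `pd.crosstab` with `normalize='index'`; the `stops` DataFrame is hypothetical toy data standing in for the speeding stops in `police.csv`, so the numbers are illustrative only.

```python
import pandas as pd

# Hypothetical toy data (not from police.csv): speeding stops only
stops = pd.DataFrame({
    'driver_gender': ['F', 'F', 'F', 'F', 'M', 'M', 'M', 'M'],
    'stop_outcome': ['Citation', 'Citation', 'Citation', 'Warning',
                     'Citation', 'Citation', 'Warning', 'Warning'],
})

# normalize='index' turns each row (one gender) into proportions that sum to 1
outcome_by_gender = pd.crosstab(
    stops['driver_gender'], stops['stop_outcome'], normalize='index'
)
print(outcome_by_gender)
```

Each row corresponds to one of the `value_counts(normalize=True)` outputs, which makes the outcomes easier to compare column by column.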
# Check the data type of 'search_conducted'
print(ri['search_conducted'].dtype)
# Calculate the search rate by counting the values
print(ri['search_conducted'].value_counts(normalize=True))
# Calculate the search rate by taking the mean
print(ri['search_conducted'].mean())
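Taking the mean works as a rate because pandas treats `True` as 1 and `False` as 0 when averaging a Boolean column. A small sketch on a made-up Series (the values are hypothetical, not taken from the dataset):

```python
import pandas as pd

# Hypothetical toy column: one search in four stops
search_conducted = pd.Series([False, True, False, False])

# Proportion of True values, computed by hand
manual_rate = search_conducted.sum() / len(search_conducted)

# The mean of a Boolean Series gives the same proportion
mean_rate = search_conducted.mean()

print(manual_rate, mean_rate)
```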
# Calculate the search rate for female drivers
print(ri[ri['driver_gender'] == 'F'].search_conducted.mean())
# Calculate the search rate for male drivers
print(ri[ri['driver_gender'] == 'M'].search_conducted.mean())
# Calculate the search rate for both groups simultaneously
print(ri.groupby('driver_gender').search_conducted.mean())
# Calculate the search rate for each combination of gender and violation
print(ri.groupby(['driver_gender', 'violation']).search_conducted.mean())
# Reverse the ordering to group by violation before gender
print(ri.groupby(['violation', 'driver_gender']).search_conducted.mean())
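Grouping by two columns returns a Series with a two-level index, which can be hard to scan. One option, sketched here on hypothetical toy data rather than the real dataset, is to pivot the gender level into columns with `unstack` so that each violation becomes a single row.

```python
import pandas as pd

# Hypothetical toy data (not from police.csv)
stops = pd.DataFrame({
    'driver_gender': ['F', 'F', 'M', 'M', 'F', 'M', 'M', 'F'],
    'violation': ['Speeding', 'Speeding', 'Speeding', 'Speeding',
                  'Equipment', 'Equipment', 'Equipment', 'Equipment'],
    'search_conducted': [False, False, False, True, False, True, True, True],
})

# Search rate for each (violation, gender) pair
rates = stops.groupby(['violation', 'driver_gender'])['search_conducted'].mean()

# Move the gender level into columns: one row per violation
table = rates.unstack('driver_gender')
print(table)
```

Reading across a row then compares female and male search rates for the same violation.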
During a vehicle search, the police officer may pat down the driver to check if they have a weapon. This is known as a "protective frisk."
In this exercise, you'll first check to see how many times "Protective Frisk" was the only search type. Then, you'll use a string method to locate all instances in which the driver was frisked.
# Count the 'search_type' values
print(ri['search_type'].value_counts())
# Check if 'search_type' contains the string 'Protective Frisk'
ri['frisk'] = ri.search_type.str.contains('Protective Frisk', na=False)
# Check the data type of 'frisk'
print(ri['frisk'].dtype)
# Take the sum of 'frisk' to count the stops that included a frisk
print(ri['frisk'].sum())
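The difference between an exact match and a substring match is easiest to see on a small example. The sketch below uses made-up `search_type` strings (the comma-separated format is an assumption for illustration) and shows why `na=False` matters: missing values become `False`, so the result is a clean Boolean column.

```python
import numpy as np
import pandas as pd

# Hypothetical toy values; NaN stands for stops with no search
search_type = pd.Series([
    'Protective Frisk',
    'Incident to Arrest,Protective Frisk',
    'Probable Cause',
    np.nan,
    np.nan,
])

# Exact comparison only finds rows where a frisk was the sole search type
exact_only = (search_type == 'Protective Frisk').sum()

# Substring match finds every row that mentions a frisk; na=False maps NaN to False
frisk = search_type.str.contains('Protective Frisk', na=False)

print(exact_only, frisk.sum(), frisk.dtype)
```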
In this exercise, you'll compare the rates at which female and male drivers are frisked during a search. Are males frisked more often than females, perhaps because police officers consider them to be higher risk?
Before doing any calculations, it's important to filter the DataFrame to only include the relevant subset of data, namely stops in which a search was conducted.
# Create a DataFrame of stops in which a search was conducted
searched = ri[ri.search_conducted == True]
# Calculate the overall frisk rate by taking the mean of 'frisk'
print(searched.frisk.mean())
# Calculate the frisk rate for each gender
print(searched.groupby('driver_gender').frisk.mean())
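To see why the filtering step matters, compare the frisk rate over all stops with the rate over searched stops only. This is a minimal sketch on hypothetical toy data; the column names mirror the ones used above, but the values are invented.

```python
import pandas as pd

# Hypothetical toy data (not from police.csv)
stops = pd.DataFrame({
    'driver_gender': ['F', 'F', 'F', 'M', 'M', 'M', 'M', 'M'],
    'search_conducted': [False, False, True, False, True, True, True, True],
    'frisk': [False, False, True, False, True, False, False, True],
})

# Without filtering, stops with no search dilute the rate
rate_all_stops = stops['frisk'].mean()

# Keep only the stops in which a search was conducted
searched = stops[stops['search_conducted']]
rate_searched = searched['frisk'].mean()

# Frisk rate per gender, among searched stops only
by_gender = searched.groupby('driver_gender')['frisk'].mean()

print(rate_all_stops, rate_searched)
print(by_gender)
```

Because a frisk can only happen during a search, the unfiltered rate mixes two questions: how often drivers are searched, and how often a search includes a frisk.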