September 4, 2017

Murder Accountability Project - Part 2 Looking at Weapons and Relationships

Part 2 - Weapon Trends
At this point we can start looking at some of the weapon trends.

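crime_time = pd.pivot_table(data2, index=["Year", "Weapon"], values=["Record ID"], aggfunc=[len])
# Move Year and Weapon out of the index so the three column names below line up
crime_time = crime_time.reset_index()
crime_time.columns = ['Year', 'Weapon', 'Count']
crime_time.dtypes
# Year        object
# Weapon    category
# Count      float64
# dtype: object
# Encode each year as an integer position (0 for the first year, and so on)
crime_time['Year'] = pd.factorize(crime_time['Year'])[0]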
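
As a small illustration of what the pd.factorize call above does, here is a minimal sketch on a made-up Series. The name `years` and its values are assumptions for demonstration only and are not taken from the dataset.

```python
import pandas as pd

# Hypothetical stand-in for the 'Year' column (values are made up)
years = pd.Series(["1980", "1980", "1981", "1982", "1981"])

# factorize returns integer codes plus the unique labels they refer to;
# codes are assigned in order of first appearance
codes, uniques = pd.factorize(years)

year_codes = codes.tolist()     # [0, 0, 1, 2, 1]
year_labels = uniques.tolist()  # ['1980', '1981', '1982']
```

Because the codes follow order of first appearance, they only match chronological order when the years are already sorted, as they are after the pivot.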

Use a seaborn FacetGrid to show the use of each weapon over time, one panel per weapon.

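# Create a dataset with many short random walks
# (carried over from the seaborn FacetGrid example; the plot below uses crime_time, not these arrays)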
rs = np.random.RandomState(4)
pos = rs.randint(-1, 2, (20, 5)).cumsum(axis=1)
pos -= pos[:, 0, np.newaxis]
step = np.tile(range(5), 20)
walk = np.repeat(range(20), 5)

# Initialize a grid of plots with an Axes for each weapon
# (the height argument was named size before seaborn 0.9)
grid = sns.FacetGrid(crime_time, col="Weapon", hue="Weapon", col_wrap=7, height=2.5)

# Draw a horizontal reference line on each panel
grid.map(plt.axhline, y=.5, ls=":", c=".5")

# Draw a line plot of the yearly count for each weapon
grid.map(plt.plot, "Year", "Count", marker="o", ms=4)

# Adjust the tick positions and labels
# xlim spans the factorized years, 0 to 34
grid.set(xticks=np.arange(5), yticks=[-1, 1],
     xlim=(-1, 35), ylim=(0, 2500))

# Adjust the arrangement of the plots
grid.fig.tight_layout(w_pad=1)

Use of firearms as a weapon increased steadily from 1980 to 2014. 'Gun' also increased, and it should really be rolled up into 'Firearm'; 'Rifle' and 'Shotgun' could arguably be folded under 'Firearm' as well. Counts for 'Knife' and 'Blunt Object' are also high, and the 'Unknown' weapon type has grown over the years.
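
The roll-up suggested above could be sketched as follows. This is only an illustration on a made-up sample: the `weapons` Series and the `firearm_labels` mapping are assumptions, and whether 'Handgun' should also be folded in is left open, as in the text.

```python
import pandas as pd

# Hypothetical sample of the 'Weapon' column; labels mirror those named in the text
weapons = pd.Series(
    ["Handgun", "Gun", "Rifle", "Knife", "Shotgun", "Firearm", "Blunt Object"],
    dtype="category",
)

# Assumed mapping: fold these labels into 'Firearm'
firearm_labels = {"Gun": "Firearm", "Rifle": "Firearm", "Shotgun": "Firearm"}

# Convert to plain strings before replacing, then back to a categorical
rolled_up = weapons.astype(str).replace(firearm_labels).astype("category")

# Count rows per label after the roll-up
counts = rolled_up.value_counts()
```

Applied to data2['Weapon'] before building the pivot table, the same mapping would put all of these labels on a single 'Firearm' panel.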

Understanding victim demographics through relationships and weapons
Let's plot the number of cases and the average ages of perpetrators and victims by weapon and relationship.

# Keep only the rows where the relationship is known
known_relationships = data2.loc[data2['Relationship'] != 'Unknown']
len(data2)  # 638454
len(known_relationships)  # 365441

Use a pivot table to tabulate the average victim age, average perpetrator age, and total number of crimes for each relationship and weapon pair.

relationships_weapons = pd.pivot_table(
    known_relationships,
    index=["Relationship", "Weapon"],
    values=['Agency Code', 'Perpetrator Age', 'Victim Age'],
    # Agency Code is only used as a column to count rows per pair
    aggfunc={'Agency Code': len, 'Perpetrator Age': 'mean', 'Victim Age': 'mean'})

Sort the pivot table by the number of records.

rw = relationships_weapons.sort_values(by='Agency Code', ascending=False)
rw.rename(columns={'Agency Code': 'Number Cases'}, inplace=True)

Visualize the ages of victims and perpetrators for the links that have at least 1,000 cases

Each link pairs a relationship label with a weapon. The labels read most naturally as who the victim was to the perpetrator: the youngest victims fall under 'Son' and 'Daughter' and the oldest under 'Father' and 'Mother'. Sadly, the youngest victims appear in 'Son-Blunt Object', 'Daughter-Blunt Object', 'Son-Handgun', 'Son-Unknown', 'Daughter-Unknown', and 'Daughter-Handgun'. The oldest victims appear in the 'Father-Handgun' and 'Mother-Knife' relationship-weapon pairs.
It's also interesting that in these high-frequency crimes the perpetrators generally show a younger age profile (more yellows and turquoise) than the victims, except in the cases mentioned above.

rwt = (rw.loc[(rw['Number Cases'] >= 1000)]).transpose()

sns.set_context("notebook", font_scale=2.0)
plt.figure(figsize=(35, 8))
sns_plot = sns.heatmap(data=(rwt.iloc[1:]), cmap="YlGnBu")

Plot by total cases

cases_1000 = ((rw.loc[(rw['Number Cases'] >= 1000)]).reset_index())
cases_1000['Link'] = cases_1000['Relationship'].astype(str) + '-' + cases_1000['Weapon'].astype(str)
cases_1000.plot(x='Link', y='Number Cases', kind='bar', figsize=(35,8))

Visualize the ages of victims and perpetrators for the links that have 10 or fewer cases

Here, we're only looking at links with 10 or fewer cases each, for example Wife-Explosives. In the high-frequency murders (1,000 or more cases), the color profile showed perpetrators and victims at fairly similar ages, mostly in the blue-turquoise shades. Here, the victims' average ages show more dark blues, indicating a much older victim demographic than perpetrators in these less frequent murders, which involve more unusual weapons and combinations, like:

  • Girlfriend-Poison
  • Employer-Strangulation
  • In-Law-Poison
  • Employer-Drugs
  • Boyfriend-Fall
  • Husband-Drowning

sns.set_context("notebook", font_scale=2.0)
plt.figure(figsize=(40, 8))
sns_plot = sns.heatmap(data=rw.loc[rw['Number Cases'] <= 10].transpose().iloc[1:], cmap="YlGnBu")
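
The heatmap comparison above relies on reading colors. One way to put a number on it is to compute the victim-minus-perpetrator age gap per link and average it within each frequency band. The sketch below assumes a table shaped like rw. The `rw_demo` rows are invented purely for illustration and are not results from the dataset, and the 'Age Gap' column name is hypothetical.

```python
import pandas as pd

# Made-up rows shaped like `rw`; the numbers are illustrative only, not results
rw_demo = pd.DataFrame(
    {
        "Number Cases": [5200, 1300, 8, 3],
        "Perpetrator Age": [30.0, 28.0, 35.0, 40.0],
        "Victim Age": [31.0, 25.0, 50.0, 60.0],
    },
    index=pd.MultiIndex.from_tuples(
        [("Acquaintance", "Handgun"), ("Friend", "Knife"),
         ("Employer", "Drugs"), ("In-Law", "Poison")],
        names=["Relationship", "Weapon"],
    ),
)

# Positive gap: victims are older on average than perpetrators for that link
rw_demo["Age Gap"] = rw_demo["Victim Age"] - rw_demo["Perpetrator Age"]

# Average gap for frequent links (>= 1000 cases) and rare links (<= 10 cases)
frequent_gap = rw_demo.loc[rw_demo["Number Cases"] >= 1000, "Age Gap"].mean()
rare_gap = rw_demo.loc[rw_demo["Number Cases"] <= 10, "Age Gap"].mean()
```

Running the same two lines against the real rw table would show whether the rare links really do have older victims relative to perpetrators, rather than relying on the color scale alone.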