We analyze the START Global Terrorism Database (GTD) using Plotly, ANOVA test volcano plots, Topological Data Analysis (TDA) and Google Data Studio. Our notebook is published on Google Colab. Our Google Data Studio report is at link. Our team includes Ly Tran (github) and Tuc Nguyen (github).
- GDSCTools and dependencies
- Bokeh
- Scikit-tda Kepler-Mapper and dependencies
- Plotly
- Preprocessing data
  - We use `pandas.get_dummies` to one-hot encode categorical data.
  - Our features include Attack Type, Target Type, Victim Nationality, Group Name, Weapon Type, Province State, Region.
  - Our labels include Number Killed, Number Killed US, Success, Damaged Property Value, Number Wounded, Number Wounded US, Number Hostages Kidnapped, Number Hostages Kidnapped US, Number of Days Kidnapped, Number of Hostages Released, Ransom Amount, Ransom Amount from US sources, Ransom Paid, Ransom Paid from US sources.
  - We split our labels and features data by region for faster ANOVA test runs with the `gdsctools` standalone application.
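The preprocessing steps above can be sketched as follows. This is a minimal illustration, not the notebook's code: the helper `encode_and_split`, the column names (`region_txt`, `attacktype1_txt`, `weaptype1_txt`, `nkill`, `nwound`) and the toy rows are assumptions made for the example.

```python
import pandas as pd


def encode_and_split(df, feature_cols, label_cols, region_col):
    """One-hot encode the features, then split features and labels by region.

    Hypothetical helper: returns {region: (features_df, labels_df)}.
    """
    # Encode on the full frame so every region shares the same dummy columns.
    features = pd.get_dummies(df[feature_cols])
    groups = df.groupby(region_col).groups  # maps region -> row index
    return {
        region: (features.loc[idx], df.loc[idx, label_cols])
        for region, idx in groups.items()
    }


# Toy rows invented for illustration only (not taken from the dataset).
toy = pd.DataFrame({
    "region_txt": ["South Asia", "South Asia", "North America", "Southeast Asia"],
    "attacktype1_txt": ["Armed Assault", "Bombing/Explosion",
                        "Armed Assault", "Facility/Infrastructure Attack"],
    "weaptype1_txt": ["Firearms", "Explosives", "Firearms", "Incendiary"],
    "nkill": [3, 10, 0, 1],
    "nwound": [1, 25, 2, 0],
})

by_region = encode_and_split(
    toy,
    feature_cols=["attacktype1_txt", "weaptype1_txt"],
    label_cols=["nkill", "nwound"],
    region_col="region_txt",
)
features, labels = by_region["South Asia"]
print(features.columns.tolist())
```

Encoding before splitting keeps the dummy columns identical across regions, which makes the per-region results easier to compare.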
- Visualizing ANOVA test results with volcano plots
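A volcano plot places each feature at (effect size, -log10 p-value), so strong and significant associations stand out in the upper corners. The sketch below computes those coordinates with `scipy.stats.f_oneway` as a stand-in; it is not the `gdsctools` API, and the helper `volcano_table`, the feature names and the synthetic data are assumptions made for the example.

```python
import numpy as np
import pandas as pd
from scipy.stats import f_oneway


def volcano_table(features, label):
    """Per binary feature: signed effect size and -log10(p) from a one-way ANOVA.

    Hypothetical helper, a simplified stand-in for the gdsctools ANOVA.
    """
    label = np.asarray(label, dtype=float)
    rows = []
    for name in features.columns:
        mask = features[name].to_numpy().astype(bool)
        pos, neg = label[mask], label[~mask]
        if len(pos) < 2 or len(neg) < 2:
            continue  # too few events on one side to run the test
        _, p_value = f_oneway(pos, neg)
        # Pooled variance, used for a standardized mean difference.
        pooled_var = (
            (len(pos) - 1) * pos.var(ddof=1) + (len(neg) - 1) * neg.var(ddof=1)
        ) / (len(pos) + len(neg) - 2)
        effect = (pos.mean() - neg.mean()) / np.sqrt(pooled_var) if pooled_var > 0 else 0.0
        rows.append({"feature": name, "effect": effect,
                     "neg_log10_p": -np.log10(p_value)})
    return pd.DataFrame(rows)


# Synthetic data: weapon_A shifts the label, weapon_B is unrelated to it.
rng = np.random.default_rng(0)
toy_features = pd.DataFrame({
    "weapon_A": rng.integers(0, 2, 200),
    "weapon_B": rng.integers(0, 2, 200),
})
toy_label = 5.0 * toy_features["weapon_A"].to_numpy() + rng.normal(size=200)

table = volcano_table(toy_features, toy_label)
print(table)
```

Plotting `effect` on the x-axis against `neg_log10_p` on the y-axis gives the volcano shape.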
- About the data
  - The columns used to build the graph are `['nkill','nkillus','nkillter','nwound','nwoundus','nwoundte','propvalue','nhostkid','nhostkidus','ndays','ransomamt','ransomamtus','ransompaid','ransompaidus','nreleased']`
  - We set `NaN` values to zero
  - We have two different graph setups
  - Graph 1:
    - We combine two lenses: a 1-D lens from `sklearn.ensemble.IsolationForest` and a 1-D lens from the `kmapper.KeplerMapper` L2-norm (`l2norm`) projection
    - We use `sklearn.cluster.KMeans` as the clusterer in our graph
  - Graph 2:
    - We use the `kmapper` projection with `sklearn.manifold.TSNE`
    - We use `sklearn.cluster.DBSCAN` as the clusterer in our graph
  - We set the tooltips and `color_function` to number killed and weapon type
  - Links to example graphs:
    - Link to graph l2norm kmeans, tooltip weapon type: https://tamlthari.github.io/START-global-terrorism/kepler-weapontype1-l2normkmeans/tooltip-weapontype/index.html#
    - Link to graph l2norm kmeans, tooltip number killed: https://tamlthari.github.io/START-global-terrorism/kepler-weapontype1-l2normkmeans/tooltip-nkill/index.html#
    - Link to graph tsne dbscan, tooltip weapon type: https://tamlthari.github.io/START-global-terrorism/kepler-weapontype1-tsnedbscan/tooltipweapontype/index.html#
    - Link to graph tsne dbscan, tooltip number killed: https://tamlthari.github.io/START-global-terrorism/kepler-weapontype1-tsnedbscan/tooltipnkill/index.html#
- Pearson correlation of number killed and firearms fraction
  - For each node we calculate the average number killed and the fraction of events with firearms as the weapon type, over all node members.
  - We use `scipy.stats.pearsonr` to calculate the Pearson correlation between these two per-node values.
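The per-node calculation above can be sketched as follows, assuming the graph's nodes are available as a mapping from node id to member row indices (the layout Kepler Mapper uses for `graph["nodes"]`). The helper `node_correlation` and the toy values are hypothetical.

```python
import numpy as np
from scipy.stats import pearsonr


def node_correlation(nodes, nkill, is_firearm):
    """Correlate per-node mean killed with per-node firearms fraction.

    nodes: {node_id: [row indices]}; nkill, is_firearm: per-event arrays.
    """
    nkill = np.asarray(nkill, dtype=float)
    is_firearm = np.asarray(is_firearm, dtype=float)
    mean_killed = [nkill[members].mean() for members in nodes.values()]
    firearm_frac = [is_firearm[members].mean() for members in nodes.values()]
    r, p_value = pearsonr(mean_killed, firearm_frac)
    return r, p_value


# Toy graph with four nodes of two events each (invented values).
toy_nodes = {"a": [0, 1], "b": [2, 3], "c": [4, 5], "d": [6, 7]}
toy_nkill = [0, 2, 4, 6, 8, 10, 1, 3]
toy_firearm = [0, 0, 1, 0, 1, 1, 0, 1]

r, p_value = node_correlation(toy_nodes, toy_nkill, toy_firearm)
print(r, p_value)
```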
- Geomap:
  - We use `plotly.graph_objects` to create the global map of terrorist events. We enable hover text to show event details, including number killed, group, motive, and date.
- Successful attacks by region:
  - We use `plotly.graph_objects` to create an interactive plot showing yearly event counts by region.
- Number of successful/failed attacks in North Africa & Middle East, South Asia, North America, and Southeast Asia:
  - We use `plotly.graph_objects` to create stacked bar charts showing the number of successful/failed events for the years 1970-2017.
- Stacked bar charts of successful attacks by weapon type and year:
  - We use `plotly.graph_objects` to create stacked bar charts of the number of successful attacks per weapon type for the years 1970-2017.