Knowledge Discovery for Interpretability and Insight in Network Approaches to Automatic Fake News Detection

Machine learning classifiers that utilize network-based features for fake news detection have resulted in surprisingly low verification/test error rates. By fake new detection, we mean a supervised-learning approaching to classifying conversational threads as being binary rumor or non-rumor. Given the disproportionate number of fake news article and the resources available to determine their veracity, automated approaches to classification are one compelling approach to solving this problem. Instead of focusing on improving accuracy, our project aims to interpret and understand what features are most important in many machine-learning classifiers and why are some features better than others. Our contributions include: detailed code on engineering 58 features available in the Twitter API from the PHEME Rumor Dataset, several visualization idioms for interpreting whether or not features will be effective in fake news classification, and accuracy results from many common machine-learning classification models.

Converting PHEME raw data into CSV files

The pheme-rnr-dataset directory contains tabular subsets of data from the PHEME Rumor Non-Rumor dataset in CSV format. These CSV files were created by opening an interactive Python 3 shell from the repository root, which is one directory up, and running the following commands:

Python 3.6.7 (default, Oct 22 2018, 11:32:17)
[GCC 8.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from lib.pheme_parsing import pheme_to_csv
>>> pheme_to_csv("ottawashooting")

Project Presentation

For our project presentation, we're using Jupyter Notebook's slides features

Converting Notebooks to Slides

Convert .ipynb files into HTML slideshows with the following command:

$ jupyter nbconvert my_notebook.ipynb --to slides

Running Presentation

Slides are in the docs director.

Name		Name	Last commit message	Last commit date
Latest commit History 88 Commits
data		data
docs		docs
html		html
lib		lib
raw		raw
.gitignore		.gitignore
Classification models.ipynb		Classification models.ipynb
Latent-Factor Models.ipynb		Latent-Factor Models.ipynb
README.md		README.md
data_cleaning.ipynb		data_cleaning.ipynb
data_cleaning.ipynb.orig		data_cleaning.ipynb.orig
exploratory_data_analysis.ipynb		exploratory_data_analysis.ipynb
feature_analysis.ipynb		feature_analysis.ipynb
intro.ipynb		intro.ipynb

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Knowledge Discovery for Interpretability and Insight in Network Approaches to Automatic Fake News Detection

Converting PHEME raw data into CSV files

Project Presentation

Converting Notebooks to Slides

Running Presentation

About

Releases 1

Packages

Languages

steve-kasica/pheme-rnr-knowledge-discovery

Folders and files

Latest commit

History

Repository files navigation

Knowledge Discovery for Interpretability and Insight in Network Approaches to Automatic Fake News Detection

Converting PHEME raw data into CSV files

Project Presentation

Converting Notebooks to Slides

Running Presentation

About

Topics

Resources

Stars

Watchers

Forks

Releases 1

Packages 0

Languages

Packages