This project was created to show basic analysis of public fake news datasets. The main idea is to make each analysis replicable, so anyone can add their own analysis and use the datasets for their own experiments and data mining. Every dataset has its own Python Jupyter notebook with a simple analysis, which can help in choosing an appropriate dataset.
To run all Jupyter notebooks with the appropriate libraries installed, we recommend using Docker.
With Docker installed, run the following command to build the Docker image and start the container:

```bash
./scripts/run.sh -b
```
Note: the next time, when no build is needed (because the image has already been built), you can start the container by omitting the `-b` argument.
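For example, a typical workflow might look like this (both commands come straight from the instructions above):

```bash
# First run: build the image and start the container
./scripts/run.sh -b

# Subsequent runs: the image already exists, so the build step can be skipped
./scripts/run.sh
```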
A list of all processed datasets with a simple comparison is stored in the `datasets/README.md` file.
All dataset analyses are stored in the `datasets/` folder. Each dataset has its own folder with a simple description in a README file and a Jupyter notebook (a folder can also include other files, e.g. the data itself).
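As a rough sketch of the expected layout (using the hypothetical dataset name `some_dataset`; the exact contents follow from the contribution steps below):

```
datasets/
├── README.md                  # comparison table of all datasets
└── some_dataset/              # one folder per dataset
    ├── README.md              # link, potential tasks, description, attributes
    ├── some_dataset.ipynb     # Jupyter notebook with the analysis
    └── data/                  # dataset files (.csv/.tsv, stored via Git LFS)
```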
Dataset files (e.g. `.csv` or `.tsv` files) are stored using Git LFS (see Git LFS for more information).
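Note that in a fresh clone without Git LFS set up, the dataset files may be present only as small pointer files. A minimal sketch of fetching the real files, using standard Git LFS commands (nothing project-specific):

```bash
# One-time setup: install the Git LFS hooks for your user
git lfs install

# Download the actual dataset files, replacing the LFS pointer files
git lfs pull
```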
When adding a new dataset, please follow these steps (a command-line walkthrough is sketched after the list):
- Call the `./scripts/create_structure.sh {name}` script with the name argument supplied in `snake_case` format (e.g. `fake_news_detection_kaggle`). This script will create the needed folders and files in the `datasets/{name}` folder.
- Add the data into the `datasets/{name}/data` directory.
- Update the `datasets/{name}/README.md` file to provide a link, potential tasks, a description, and attribute descriptions. Please follow the template file structure.
- Update the `datasets/{name}/{name}.ipynb` file with an analysis of the dataset. Please follow the template file structure.
- Add the dataset and its details into the table of datasets in the `datasets/README.md` file (please follow alphabetical order).
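For illustration, adding a hypothetical dataset named `fake_news_detection_kaggle` might look like this (the source path of the data file is made up; only `create_structure.sh` and the target paths come from the steps above):

```bash
# Create datasets/fake_news_detection_kaggle/ with the template files
./scripts/create_structure.sh fake_news_detection_kaggle

# Put the data into the dataset's data/ directory (tracked via Git LFS)
cp ~/Downloads/fake_news.csv datasets/fake_news_detection_kaggle/data/

# Then fill in the generated README and notebook:
#   datasets/fake_news_detection_kaggle/README.md
#   datasets/fake_news_detection_kaggle/fake_news_detection_kaggle.ipynb
# and add a row for the dataset to the table in datasets/README.md
```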
Prepared datasets still to be finished:
- coaid
- that_is_a_known_lie
- fake_health
- fake_covid