Detection is based on a collection of Twitter posts that contain either rumour or non-rumour (fake or real) information.
In addition to the plain text of each tweet, graph data (social interactions) is taken into account, e.g. retweet count, share count, etc.
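
As a rough illustration of how text and social-interaction signals could be combined into one feature set, here is a minimal sketch. It is not the project's actual pipeline: the function names, the bag-of-words tokenization, the log scaling, and the `retweet_count` / `shares_count` parameter names are all assumptions made for this example.

```python
from collections import Counter
import math
import re


def text_features(text: str) -> Counter:
    """Bag-of-words counts over lowercase word tokens (hypothetical helper)."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(f"word={token}" for token in tokens)


def interaction_features(retweet_count: int, shares_count: int) -> dict:
    """Log-scaled interaction counts, so very viral tweets do not dominate."""
    return {
        "log_retweets": math.log1p(retweet_count),
        "log_shares": math.log1p(shares_count),
    }


def build_features(text: str, retweet_count: int, shares_count: int) -> dict:
    """Merge text and interaction features into a single flat dictionary."""
    features = dict(text_features(text))
    features.update(interaction_features(retweet_count, shares_count))
    return features


# Made-up tweet, for illustration only.
example = build_features(
    "Breaking: breaking news confirmed", retweet_count=0, shares_count=3
)
```

A flat dictionary like this can be passed to most vectorizers or classifiers; whether the repository uses this representation is not stated here.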
- Create a venv with your favourite tool
- Activate it
- Run `python install.py`
- Provide the `dataset.key` file in the `raw` directory
- In the `raw` directory, run `bash prepare_dataset.sh` (this will initialize the raw dataset)
- Run `python setup_dataset.py` (this will create `dataset.csv`)
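
Before running the steps above, it can help to check that the files they rely on are in place. The following is a small sketch of such a check, not part of the repository: the helper name `missing_setup_files` is hypothetical, and it assumes that `install.py` and `setup_dataset.py` sit in the project root and that `raw` is a subdirectory of it.

```python
from pathlib import Path
import tempfile


def missing_setup_files(project_root: Path) -> list:
    """Return the expected setup files that are not present under project_root."""
    # Assumed layout: scripts in the root, key and shell script in raw/.
    expected = [
        Path("install.py"),
        Path("raw") / "dataset.key",
        Path("raw") / "prepare_dataset.sh",
        Path("setup_dataset.py"),
    ]
    return [p.as_posix() for p in expected if not (project_root / p).is_file()]


# Demonstration on a throwaway directory that only contains raw/prepare_dataset.sh.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "raw").mkdir()
    (root / "raw" / "prepare_dataset.sh").touch()
    demo_missing = missing_setup_files(root)
```

In a real checkout you would call `missing_setup_files(Path("."))` from the project root; an empty list means every file the steps mention is present.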
The original dataset (PHEME) belongs to Elena Kochkina, Maria Liakata & Arkaitz Zubiaga.
It was downloaded from here and then encrypted because it contains sensitive data.