Developed with 💛 at Expert.ai Research Lab
- License: ISC
- Paper: arXiv
This repository contains the code and the data download instructions for the research paper "Capturing Pertinent Symbolic Features for Enhanced Content-Based Misinformation Detection". It is organized as follows:
- data/: Contains instructions for obtaining the data necessary to run the experiments.
- notebooks/: Contains Jupyter notebooks for running the paper's experiments: preprocessing, analyzing, annotating, and modeling misinformation data.
In this work, we propose harnessing symbolic linguistic resources inspired by insights from social science research to automate the detection of content-based misinformation. Our experiments leverage a suite of off-the-shelf freely available symbolic models tailored to identify layered linguistic attributes:
- writeprint,
- sentiment analysis,
- emotional traits,
- behavioral traits,
- hate speech,
- radicalization narratives.
This information is subsequently combined with the capabilities of language models. Our method is validated across a range of datasets, carefully selected and analyzed to represent the heterogeneous misinformation phenomenon.
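As a rough illustration of what combining symbolic attributes with a language model representation can look like, the sketch below concatenates a text embedding with one score per attribute family. It is not the method used in the paper: the fusion strategy (plain concatenation), the `SYMBOLIC_FEATURES` names, the `fuse` function, and the placeholder embedding values are all assumptions made for this example, and each attribute family is reduced to a single score for brevity.

```python
from typing import Dict, List, Sequence

# Hypothetical feature names, one per attribute family listed above.
# Real symbolic models typically emit many features per family.
SYMBOLIC_FEATURES = [
    "writeprint",
    "sentiment",
    "emotional_traits",
    "behavioral_traits",
    "hate_speech",
    "radicalization_narratives",
]


def fuse(embedding: Sequence[float], symbolic: Dict[str, float]) -> List[float]:
    """Concatenate a text embedding with symbolic scores in a fixed order.

    Missing symbolic features default to 0.0 so every fused vector
    has the same length regardless of which annotators fired.
    """
    symbolic_vector = [float(symbolic.get(name, 0.0)) for name in SYMBOLIC_FEATURES]
    return list(embedding) + symbolic_vector


# Placeholder values standing in for a language model embedding.
embedding = [0.1, -0.3, 0.7]
# Placeholder symbolic annotations for one document.
symbolic = {"sentiment": -0.8, "hate_speech": 1.0}

fused = fuse(embedding, symbolic)
print(len(fused))
```

The fused vector could then be passed to any downstream classifier. Keeping the symbolic features in a fixed order means each position in the vector always refers to the same attribute.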
Data: Follow the instructions in the README located within the data/ directory to obtain both the raw and already preprocessed datasets required to conduct the experiments.
Code: The code to preprocess, analyze, annotate, and model the data is temporarily shared as Jupyter notebooks in the notebooks/ directory.
To cite this research please use the following:
@inproceedings{merenda2023capturing,
  title={Capturing Pertinent Symbolic Features for Enhanced Content-Based Misinformation Detection},
  author={Merenda, Flavio and Gomez-Perez, Jose Manuel},
  booktitle={Proceedings of the 12th Knowledge Capture Conference 2023},
  pages={61--69},
  year={2023}
}
At Expert.ai we turn language into data so humans can make better decisions. Take a look here!