Claim Detection with BERT on Semi-Structured Text


Trained Models and datasets

Trained models and the datasets used can be found here: https://1drv.ms/u/s!AsshhlQM3x93qw6Bn44LfmTBreFz?e=Mdnpx2

Simply extract the folders into the project directory.

Requirements

  • Python 3.9.5
  • VS Code with the Jupyter extension installed (using version v2021.8.1046824664), or alternatively any other .ipynb (iPython notebook) viewer

  • This is sufficient to view the iPython sessions that have been recorded in the Trained Models directory

Executing setup.sh creates a Python virtual environment and installs the necessary packages into it.

  • After doing this you can run all code from the command line (this is NOT RECOMMENDED)
  • Instead, follow the steps below to set up the venv with VS Code, then run the scripts in Python interactive mode
  • To install on Windows or Mac, simply skip the first 3 lines in the setup.sh script and install the Python packages without a virtual environment

If a GPU is present it must have at least 8GB of memory; otherwise, in each of the Claim Detection files, change device = torch.device("cuda" if torch.cuda.is_available() else "cpu") to device = "cpu". Using the CPU will increase the memory requirement from 2GB to 10GB; a sketch of the change is shown below.
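
For reference, a minimal sketch of the device change described above (only the device line is taken from the scripts; the import is added here so the snippet stands alone):

    import torch

    # Original line in each Claim Detection file: prefer the GPU when available
    # (the GPU needs at least 8GB of memory)
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # To force CPU execution instead, replace the line above with:
    # device = "cpu"
    # (expect the memory requirement to rise from 2GB to 10GB)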

VENV

Be sure to change the interpreter to the Python executable in the venv folder before running iPython snippets:

  1. Open VS Code
  2. Ctrl+Shift+P
  3. Python: Select Interpreter
  4. Select venv/bin/python

Order Of Execution

  1. Twitter/1. Fetch Tweets.py
  2. Twitter/2. Twitter Clean.py
  3. Twitter/Wenija Ma et al/extract_data.py
  4. Twitter/3. Label Tweets.py
  5. Claim Detection/Masked LM.py in both UKP and Twitter modes
  6. Claim Detection/NSP.py in both UKP and Twitter modes (after updating the pre-trained model paths with those produced in step 5)
  7. Claim Detection/Claim Detection.py in all modes (after updating the pre-trained model paths with those produced in step 6)

Viewing Training Output

There are 8 notebooks corresponding to the 8/9 tasks which were developed solely for this project; each has the trained model.

Notebooks

The session output from each training task has been logged in an iPython notebook; these files are denoted .ipynb and can be found under each training directory.

TensorBoard

The loss and evaluation metrics have also been logged for each training task. To open these:

  1. Press Ctrl+Shift+P
  2. Type Python: Launch TensorBoard
  3. Select use current directory
  4. TensorBoard will now launch; the exact task can be selected by checking the relevant task in the runs selection menu

Reconfigure Mode

There are various modes for each of the 3 ML files (Masked LM, NSP and Claim Detection); each is described in the code and corresponds to a task referenced in our implementation. All can be changed by substituting the appropriate string from the list of supported modes, as sketched below.
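
As a sketch, a mode switch looks like the following (the constant name and mode strings here are hypothetical; the actual supported values are listed in the comments of each script):

    # Hypothetical illustration of a mode switch; consult the comments in
    # Masked LM.py, NSP.py and Claim Detection.py for the real constant name
    # and the list of supported mode strings.
    MODE = "UKP"        # current task
    # MODE = "Twitter"  # substitute another supported string to change task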

Changing pre-trained model path

If a new model is trained, the path to the model in the following step must be updated. For instance, if a new Masked LM model is trained for Twitter, the Twitter NSP task must be updated to load the appropriate model path. This can be done by simply changing MASKED_LM_PATH to the new model's path, as sketched below.
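
As a sketch (MASKED_LM_PATH is the constant named above; the example path is hypothetical and should point to wherever the new model was saved):

    # In the Twitter NSP task, point MASKED_LM_PATH at the newly trained model.
    # The path below is only an illustrative placeholder.
    MASKED_LM_PATH = "path/to/new/masked-lm-model"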
