This is the code and data for the article "SCTc-TE: A Comprehensive Formulation and Benchmark for Temporal Event Forecasting".
The repo includes:
- An automated data construction pipeline for the Structured, Complex, and Time-complete Temporal Events (SCTc-TE).
- Two large-scale complex event datasets, named MidEast-TE and GDELT-TE.
- A proposed model, LoGo, that leverages both local and global contexts for the newly formulated task of SCTc-TE forecasting.
The code has been tested with python=3.9, pytorch=1.12, and cuda=11.3.
To create a new environment, you can use the commands below:
conda create --name logo python=3.9
conda activate logo
conda install pytorch==1.12.0 -c pytorch
pip install pandas
pip install tensorboard
conda install tqdm
conda install -c dglteam dgl-cuda11.3
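After installation, a quick sanity check can confirm that the packages above are visible to the interpreter. The snippet below is a minimal sketch and not part of the repo: the helper names `check_environment` and `format_report` are hypothetical, and the check only tests whether each module can be located, not whether its version or CUDA build matches the tested setup.

```python
import importlib.util
import sys

# Import names of the packages installed by the commands above.
REQUIRED_MODULES = ("torch", "pandas", "tensorboard", "tqdm", "dgl")


def check_environment(modules=REQUIRED_MODULES):
    """Return {module_name: bool}, True when the module can be located."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}


def format_report(status):
    """Render the check result as a short, human-readable report."""
    lines = [f"python {sys.version_info.major}.{sys.version_info.minor}"]
    for name, found in status.items():
        lines.append(f"{name}: {'found' if found else 'MISSING'}")
    return "\n".join(lines)
```

Running `print(format_report(check_environment()))` inside the activated `logo` environment lists any package that still needs to be installed.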
The input data files need to be unzipped first:
unzip data/MIDEAST_CE.zip -d data
unzip data/GDELT_CE.zip -d data
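On systems without the `unzip` tool, the same step can be done from Python. This is a minimal sketch using the standard-library `zipfile` module; the function name `unzip_datasets` is hypothetical, and it assumes the two archives sit in the `data` folder as in the commands above.

```python
import zipfile
from pathlib import Path

# Archive names taken from the unzip commands above.
ARCHIVES = ("MIDEAST_CE.zip", "GDELT_CE.zip")


def unzip_datasets(data_dir="data", archives=ARCHIVES):
    """Extract each archive into data_dir, mirroring `unzip <zip> -d data`.

    Returns the names of the archives that were extracted.
    Archives that are not present are skipped with a message.
    """
    data_dir = Path(data_dir)
    extracted = []
    for name in archives:
        archive = data_dir / name
        if not archive.is_file():
            print(f"skipping missing archive: {archive}")
            continue
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(data_dir)
        extracted.append(name)
    return extracted
```

Calling `unzip_datasets()` from the repo root extracts both archives into `data`.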
The data folder then contains:
- CAEMO: Information from the CAMEO ontology, organized as Python dictionaries.
- MIDEAST_CE: SCTc-TE extracted by the Vicuna model.
- GDELT_CE: SCTc-TE constructed from GDELT data.
The pipeline for constructing the MIDEAST_CE and GDELT_CE data from raw data is collected and described in ./dataset_construction.
Before training the model, you need to generate both the local and global graphs:
python generate_graphs_ce.py --dataset MIDEAST_CE
python generate_graphs_ce.py --dataset GDELT_CE
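The two graph-generation runs can also be scripted so that both datasets are processed in one go. The following is a minimal sketch, assuming it is run from the repo root; the helper names `graph_command` and `generate_all_graphs` are hypothetical, and only the script name and `--dataset` flag come from the commands above.

```python
import subprocess
import sys

# Dataset names accepted by generate_graphs_ce.py, per the commands above.
DATASETS = ("MIDEAST_CE", "GDELT_CE")


def graph_command(dataset):
    """Build the argument list for one graph-generation run."""
    return [sys.executable, "generate_graphs_ce.py", "--dataset", dataset]


def generate_all_graphs(datasets=DATASETS, runner=subprocess.run):
    """Run graph generation for each dataset in order.

    With the default runner, check=True raises CalledProcessError
    as soon as one run exits with a non-zero status.
    """
    for dataset in datasets:
        runner(graph_command(dataset), check=True)
```

The `runner` parameter exists only so the loop can be exercised without launching the real script; calling `generate_all_graphs()` uses `subprocess.run`.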
The tuned training hyperparameters are stored in config.yaml.
Hyperparameters can also be set on the command line; see get_cmd() in train_logo_early.py for detailed usage.
In the commands below, [dataset_name] is MIDEAST_CE or GDELT_CE.
- LoGo:
python train_logo_early.py -d [dataset_name] --m LoGo_sep
- LoGolocal:
python train_logo_late.py -d [dataset_name] --local_only
- LoGoglobal:
python train_logo_late.py -d [dataset_name] --global_only
- LoGoshare:
python train_logo_early.py -d [dataset_name] --m LoGo_share
- LoGolate:
python train_logo_late.py -d [dataset_name] --m LoGo_sep
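When running several variants in sequence, it can help to keep the commands above in one lookup table. The sketch below is illustrative only: `VARIANTS` and `train_command` are hypothetical names, the key "LoGo" is an assumed label for the train_logo_early.py run with LoGo_sep, and the scripts and flags are copied verbatim from the commands above.

```python
DATASETS = ("MIDEAST_CE", "GDELT_CE")

# Variant name -> (training script, extra arguments), copied from the commands above.
# "LoGo" is an assumed key for the train_logo_early.py / LoGo_sep command.
VARIANTS = {
    "LoGo": ("train_logo_early.py", ["--m", "LoGo_sep"]),
    "LoGolocal": ("train_logo_late.py", ["--local_only"]),
    "LoGoglobal": ("train_logo_late.py", ["--global_only"]),
    "LoGoshare": ("train_logo_early.py", ["--m", "LoGo_share"]),
    "LoGolate": ("train_logo_late.py", ["--m", "LoGo_sep"]),
}


def train_command(variant, dataset):
    """Return the argument list that trains `variant` on `dataset`."""
    if dataset not in DATASETS:
        raise ValueError(f"unknown dataset: {dataset!r}")
    if variant not in VARIANTS:
        raise ValueError(f"unknown variant: {variant!r}")
    script, extra = VARIANTS[variant]
    return ["python", script, "-d", dataset, *extra]
```

For example, `" ".join(train_command("LoGolocal", "GDELT_CE"))` gives `python train_logo_late.py -d GDELT_CE --local_only`, which can be pasted into a shell or passed as a list to `subprocess.run`.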
Training loss and runs can be monitored with TensorBoard:
tensorboard --logdir=./runs/
Results are saved in the ./logs folder.