Two-stage Discourse Parser

This is a refactored implementation of the RST discourse parser described in A Two-stage Parsing Method for Text-level Discourse Analysis. Because of the license on the RST corpus, the training data is not included in this repository. To reproduce the results in the paper, download the corpus from the LDC and preprocess it as described below.

Usage:

  1. Preprocess the data:

    python3 preprocess.py RST_DATA_DIR RST_DEST_DIR
    
  2. Train model:

    python3 main.py --train --train_dir TRAIN_DIR
    
  3. Evaluate model:

    python3 main.py --eval --eval_dir EVAL_DIR
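
The flags used in the commands above (`--train`, `--train_dir`, `--eval`, `--eval_dir`) could be parsed along the following lines. This is only a minimal sketch using the standard `argparse` module; `build_parser` and `validate` are hypothetical helper names, and the pairing rules they enforce are assumptions, not necessarily how `main.py` is actually written.

```python
import argparse


def build_parser():
    """Build a parser for the four flags shown in the usage commands.

    Hypothetical sketch: the real main.py may define more options.
    """
    parser = argparse.ArgumentParser(description="Two-stage discourse parser (sketch)")
    parser.add_argument("--train", action="store_true", help="train the models")
    parser.add_argument("--train_dir", help="directory with preprocessed training data")
    parser.add_argument("--eval", action="store_true", help="evaluate the models")
    parser.add_argument("--eval_dir", help="directory with preprocessed evaluation data")
    return parser


def validate(args):
    """Return a list of problems with the parsed arguments (empty if none).

    Assumption: each mode flag must come with its matching directory.
    """
    problems = []
    if not (args.train or args.eval):
        problems.append("pass --train and/or --eval")
    if args.train and not args.train_dir:
        problems.append("--train requires --train_dir")
    if args.eval and not args.eval_dir:
        problems.append("--eval requires --eval_dir")
    return problems


# Example: the arguments from step 2, passed explicitly instead of via sys.argv.
args = build_parser().parse_args(["--train", "--train_dir", "TRAIN_DIR"])
print(validate(args))
```

Checking the flag pairing up front means a missing directory is reported before any data is loaded, rather than partway through a training or evaluation run.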
    

Requirements:

The code currently runs under Python 3.7. The models have been rewritten in scikit-learn (sklearn). See requirements.txt for the full list of dependencies.