The SpecNFS dataset is a collection of 1198 NFS-RFC specifications annotated with their SpecIR representations.

The datasets in SpecNFS, with brief descriptions, are as follows:

  1. ./dataset/NER_dataset.csv

    This dataset contains the spans and types of the entities in each sentence.

    | Column | Description |
    | --- | --- |
    | Sentence | Sentence number |
    | Token | Token in the sentence |
    | Token_ID | Unique identifier of the entity span in a sentence |
    | Token_no | Sequence number of the token in the sentence |
    | Partition | Partition number of the sentence, as used for the 5-fold cross-validation reported in the paper |

  2. ./dataset/Link_dataset.csv

    This dataset contains the dependency links between entities in each sentence.

    | Column | Description |
    | --- | --- |
    | Head_id | Entity span identifier of the head of the link |
    | Dep_id | Entity span identifier of the dependent of the link |
    | Link | Type of link between the head and the dependent entity |
    | Sentence_no | Sentence number of the link |
    | Partition | Partition number of the sentence, as used for the 5-fold cross-validation reported in the paper |

  3. ./dataset/token_type2link_type.csv

    This dataset lists the valid link types between each pair of entity types.

    | Column | Description |
    | --- | --- |
    | H_Token_TYPE | Head entity type |
    | D_Token_TYPE | Dependent entity type |
    | Link_TYPE | Valid link types between the head and dependent entity types |
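
The `Partition` column can be used to rebuild cross-validation splits by holding out one partition at a time. The following is a minimal sketch, not the authors' code: it assumes the CSV files are comma-delimited with a header row matching the column names above, and the sample rows and the helper names `read_rows` and `cross_validation_folds` are hypothetical, made up for illustration. The actual partition labels are not documented here, so the sketch derives them from the data instead of hard-coding them.

```python
import csv
import io
from collections import defaultdict

# Made-up rows for illustration only; the real data is in ./dataset/Link_dataset.csv.
SAMPLE_LINKS = """Head_id,Dep_id,Link,Sentence_no,Partition
h1,d1,link_a,1,1
h2,d2,link_b,2,2
h3,d3,link_a,3,1
"""


def read_rows(handle):
    """Read a CSV file object into a list of dicts keyed by column name."""
    return list(csv.DictReader(handle))


def cross_validation_folds(rows, partition_column="Partition"):
    """Yield (held_out, train_rows, test_rows) once per distinct partition value."""
    by_partition = defaultdict(list)
    for row in rows:
        by_partition[row[partition_column]].append(row)
    for held_out in sorted(by_partition):
        test_rows = by_partition[held_out]
        train_rows = [
            row
            for partition, members in by_partition.items()
            if partition != held_out
            for row in members
        ]
        yield held_out, train_rows, test_rows


rows = read_rows(io.StringIO(SAMPLE_LINKS))
for held_out, train_rows, test_rows in cross_validation_folds(rows):
    print(held_out, len(train_rows), len(test_rows))
```

To run this on the real data, replace the `io.StringIO(SAMPLE_LINKS)` argument with a file opened as `open("./dataset/Link_dataset.csv", newline="")`. The same function should work for `NER_dataset.csv`, since it also has a `Partition` column.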