中文 | English



A collection of resources from the Joint Laboratory of HIT and iFLYTEK Research (HFL).

Pre-trained Language Model

| Name | Description |
| --- | --- |
| VLE | Multimodal Vision-Language Encoder |
| MiniRBT | Chinese MiniRBT models (a series of small pre-trained models) |
| LERT | Chinese LERT models (small-level, base-level, large-level) |
| PERT | Chinese and English PERT models (base-level, large-level) |
| Chinese-MobileBERT | Chinese MobileBERT (base-level, large-level) (for archival purposes only) |
| CINO | Pre-trained Language Models for Chinese Minority Languages |
| MacBERT | Chinese pre-trained MacBERT models (MacBERT-base, MacBERT-large) |
| CharBERT | English pre-trained CharBERT models |
| Chinese-ELECTRA | Chinese pre-trained ELECTRA models (ELECTRA-base, ELECTRA-small) with code support for six tasks: CMRC 2018, DRCD, XNLI, ChnSentiCorp, LCQMC, BQCorpus |
| Chinese-XLNet | Chinese pre-trained XLNet models: XLNet-mid, XLNet-base |
| Chinese-BERT-wwm | Chinese BERT with Whole Word Masking (wwm), including BERT-wwm, BERT-wwm-ext, RoBERTa-wwm-ext, RoBERTa-wwm-ext-large, RBT3, RBTL3 |

Dataset

| Name | Type | Paper |
| --- | --- | --- |
| CCTC | Text Correction | Wang et al., 2022 |
| CTC 2021 | Text Correction | Wang et al., 2022 |
| ExpMRC | Reading Comprehension | Cui et al., 2021 |
| AdvRACE | Reading Comprehension | Si et al., 2020 |
| CMRC 2019 | Reading Comprehension | Cui et al., 2020 |
| CJRC | Reading Comprehension | Duan et al., 2019 |
| CMRC 2018 | Reading Comprehension | Cui et al., 2019 |
| CMRC 2017 | Reading Comprehension | Cui et al., 2018 |
| PD&CFT | Reading Comprehension | Cui et al., 2016 |

Toolkit

| Name | Description | Paper |
| --- | --- | --- |
| TextPruner | Model Pruning for NLP | Yang et al., 2022 |
| TextBrewer | Knowledge Distillation for NLP | Yang et al., 2020 |

System Demonstration

| Name | Description | Paper |
| --- | --- | --- |
| IFlyEA | A Chinese Essay Assessment System with Automated Rating, Review Generation, and Recommendation | Gong et al., 2021 |
| iFLYChecker | A Chinese Grammar Checking System | - |
| IFlyLegal | A Chinese Legal System for Consultation & Law Searching | Wang et al., 2019 |

Evaluation Campaign

| Name | Description | Live Leaderboard |
| --- | --- | --- |
| CMRC 2022 | Explainable Reading Comprehension | |
| CTC 2021 | Chinese Text Correction | |
| CAIL 2020 | Judiciary Reading Comprehension | |
| CMRC 2019 | Sentence Cloze Reading Comprehension | |
| CAIL 2019 | Judiciary Reading Comprehension | |
| CMRC 2018 | Span-Extraction Reading Comprehension | |
| CMRC 2017 | Cloze-style Reading Comprehension | |

Paper

| Year | Paper | Author List | Published in | Note |
| --- | --- | --- | --- | --- |
| 2022 | Visualizing Attention Zones in Machine Reading Comprehension Models | Yiming Cui, Wei-Nan Zhang, Ting Liu | STAR Protocols | GitHub |
| 2022 | Multilingual Multi-Aspect Explainability Analyses on Machine Reading Comprehension Models | Yiming Cui, Wei-Nan Zhang, Wanxiang Che, Ting Liu, Zhigang Chen, Shijin Wang | iScience | GitHub |
| 2021 | ExpMRC: Explainability Evaluation for Machine Reading Comprehension | Yiming Cui, Ting Liu, Wanxiang Che, Zhigang Chen, Shijin Wang | Heliyon | GitHub |
| 2022 | Teaching Machines to Read, Answer and Explain | Yiming Cui, Ting Liu, Wanxiang Che, Zhigang Chen, Shijin Wang | IEEE/ACM TASLP | |
| 2022 | PERT: Pre-training BERT with Permuted Language Model | Yiming Cui, Ziqing Yang, Ting Liu | | GitHub |
| 2022 | A Static and Dynamic Attention Framework for Multi Turn Dialogue Generation | Wei-Nan Zhang, Yiming Cui, Kaiyan Zhang, Yifa Wang, Qingfu Zhu, Lingzhi Li, Ting Liu | ACM TOIS | |
| 2022 | Cross-Lingual Text Classification with Multilingual Distillation and Zero-Shot-Aware Training | Ziqing Yang, Yiming Cui, Zhigang Chen, Shijin Wang | | |
| 2022 | CINO: A Chinese Minority Pre-trained Language Model | Ziqing Yang, Zihang Xu, Yiming Cui, Baoxin Wang, Min Lin, Dayong Wu, Zhigang Chen | | GitHub |
| 2022 | HFL at SemEval-2022 Task 8: A Linguistics-inspired Regression Model with Data Augmentation for Multilingual News Similarity | Zihang Xu, Ziqing Yang, Yiming Cui, Zhigang Chen | SemEval 2022 | GitHub |
| 2022 | HIT at SemEval-2022 Task 2: Pre-trained Language Model for Idioms Detection | Zheng Chu, Ziqing Yang, Yiming Cui, Zhigang Chen, Ming Liu | SemEval 2022 | |
| 2022 | TextPruner: A Model Pruning Toolkit for Pre-trained Language Models | Ziqing Yang, Yiming Cui, Zhigang Chen | ACL 2022 Demo | GitHub |
| 2022 | Interactive Gated Decoder for Machine Reading Comprehension | Yiming Cui, Wanxiang Che, Ziqing Yang, Ting Liu, Bing Qin, Shijin Wang, Guoping Hu | ACM TALLIP | |
| 2021 | IFlyEA: A Chinese Essay Assessment System with Automated Rating, Review Generation, and Recommendation | Jiefu Gong, Xiao Hu, Wei Song, Ruiji Fu, Zhichao Sheng, Bo Zhu, Shijin Wang, Ting Liu | ACL 2021 Demo | |
| 2021 | Dynamic Connected Networks for Chinese Spelling Check | Baoxin Wang, Wanxiang Che, Dayong Wu, Shijin Wang, Guoping Hu, Ting Liu | Findings of ACL 2021 | |
| 2021 | Various Legal Factors Extraction Based on Machine Reading Comprehension | Beichen Wang, Ziyue Wang, Baoxin Wang, Dayong Wu, Zhigang Chen, Shijin Wang, Guoping Hu | CCIR 2021 | |
| 2021 | 利用深层语言分析改进中文作文自动评分方法 (Improving Automated Chinese Essay Scoring with Deep Linguistic Analysis) | Si Wei, Jiefu Gong, Wei Song, Ziyao Song, Shijin Wang | Journal of Chinese Information Processing (中文信息学报) | |
| 2021 | Bilingual Alignment Pre-training for Zero-shot Cross-lingual Transfer | Ziqing Yang, Wentao Ma, Yiming Cui, Jiani Ye, Wanxiang Che, Shijin Wang | MRQA 2021 | |
| 2021 | Adversarial Training for Machine Reading Comprehension with Virtual Embeddings | Ziqing Yang, Yiming Cui, Chenglei Si, Wanxiang Che, Ting Liu, Shijin Wang, Guoping Hu | *SEM 2021 | |
| 2021 | Pre-Training with Whole Word Masking for Chinese BERT | Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Ziqing Yang | IEEE/ACM TASLP | GitHub1, GitHub2 |
| 2021 | Benchmarking Robustness of Machine Reading Comprehension Models | Chenglei Si, Ziqing Yang, Yiming Cui, Wentao Ma, Ting Liu, Shijin Wang | Findings of ACL 2021 | GitHub |
| 2020 | A Sentence Cloze Dataset for Chinese Machine Reading Comprehension | Yiming Cui, Ting Liu, Ziqing Yang, Zhipeng Chen, Wentao Ma, Wanxiang Che, Shijin Wang, Guoping Hu | COLING 2020 | GitHub |
| 2020 | CharBERT: Character-aware Pre-trained Language Model | Wentao Ma, Yiming Cui, Chenglei Si, Ting Liu, Shijin Wang, Guoping Hu | COLING 2020 | GitHub |
| 2020 | Revisiting Pre-Trained Models for Chinese Natural Language Processing | Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Shijin Wang, Guoping Hu | Findings of EMNLP 2020 | GitHub |
| 2020 | Is Graph Structure Necessary for Multi-hop Question Answering? | Nan Shao, Yiming Cui, Ting Liu, Shijin Wang, Guoping Hu | EMNLP 2020 | - |
| 2020 | TextBrewer: An Open-Source Knowledge Distillation Toolkit for Natural Language Processing | Ziqing Yang, Yiming Cui, Zhipeng Chen, Wanxiang Che, Ting Liu, Shijin Wang, Guoping Hu | ACL 2020 Demo | GitHub |
| 2020 | Conversational Word Embedding for Retrieval-based Dialog System | Wentao Ma, Yiming Cui, Ting Liu, Dong Wang, Shijin Wang, Guoping Hu | ACL 2020 | GitHub |
| 2020 | Discriminative Sentence Modeling for Story Ending Prediction | Yiming Cui, Wanxiang Che, Wei-Nan Zhang, Ting Liu, Shijin Wang, Guoping Hu | AAAI 2020 | - |
| 2019 | Cross-Lingual Machine Reading Comprehension | Yiming Cui, Wanxiang Che, Ting Liu, Bing Qin, Shijin Wang, Guoping Hu | EMNLP 2019 | GitHub |
| 2019 | A Span-Extraction Dataset for Chinese Machine Reading Comprehension | Yiming Cui, Ting Liu, Wanxiang Che, Li Xiao, Zhipeng Chen, Wentao Ma, Shijin Wang, Guoping Hu | EMNLP 2019 | GitHub |
| 2019 | IFlyLegal: A Chinese Legal System for Consultation, Law Searching, and Document Analysis | Ziyue Wang, Baoxin Wang, Xingyi Duan, Dayong Wu, Shijin Wang, Guoping Hu, Ting Liu | EMNLP 2019 Demo | - |
| 2019 | TripleNet: Triple Attention Network for Multi-Turn Response Selection in Retrieval-based Chatbots | Wentao Ma, Yiming Cui, Nan Shao, Su He, Wei-Nan Zhang, Ting Liu, Shijin Wang, Guoping Hu | CoNLL 2019 | GitHub |
| 2019 | Improving Machine Reading Comprehension via Adversarial Training | Ziqing Yang, Yiming Cui, Wanxiang Che, Ting Liu, Shijin Wang, Guoping Hu | - | - |
| 2019 | Contextual Recurrent Units for Cloze-style Reading Comprehension | Yiming Cui, Wei-Nan Zhang, Wanxiang Che, Ting Liu, Zhipeng Chen, Shijin Wang, Guoping Hu | - | - |
| 2019 | CJRC: A Reliable Human-Annotated Benchmark DataSet for Chinese Judicial Reading Comprehension | Xingyi Duan, Baoxin Wang, Ziyue Wang, Wentao Ma, Yiming Cui, Dayong Wu, Shijin Wang, Ting Liu, Tianxiang Huo, Zhen Hu, Heng Wang, Zhiyuan Liu | CCL 2019 | GitHub |
| 2019 | Convolutional Spatial Attention Model for Reading Comprehension with Multiple-Choice Questions | Zhipeng Chen, Yiming Cui, Wentao Ma, Shijin Wang, Guoping Hu | AAAI 2019 | - |
| 2018 | Disconnected Recurrent Neural Networks for Text Categorization | Baoxin Wang | ACL 2018 | - |
| 2018 | HFL-RC System at SemEval-2018 Task 11: Hybrid Multi-Aspects Model for Commonsense Reading Comprehension | Zhipeng Chen, Yiming Cui*, Wentao Ma, Shijin Wang, Ting Liu, Guoping Hu | - | - |
| 2018 | Dataset for the First Evaluation on Chinese Machine Reading Comprehension | Yiming Cui, Ting Liu, Zhipeng Chen, Wentao Ma, Shijin Wang, Guoping Hu | LREC 2018 | GitHub |
| 2018 | Chinese Grammatical Error Diagnosis using Statistical and Prior Knowledge driven Features with Probabilistic Ensemble Enhancement | Ruiji Fu, Zhengqi Pei, Jiefu Gong, Wei Song, Dechuan Teng, Wanxiang Che, Shijin Wang, Guoping Hu, Ting Liu | NLP-TEA@ACL 2018 | - |
| 2017 | 面向作文自动评分的优美句识别 (Identifying Elegant Sentences for Automated Essay Scoring) | Ruiji Fu, Dong Wang, Shijin Wang, Guoping Hu, Ting Liu | Journal of Chinese Information Processing (中文信息学报) | - |
| 2017 | Attention-over-Attention Neural Networks for Reading Comprehension | Yiming Cui, Zhipeng Chen, Si Wei, Shijin Wang, Ting Liu, Guoping Hu | ACL 2017 | - |
| 2017 | Generating and Exploiting Large-scale Pseudo Training Data for Zero Pronoun Resolution | Ting Liu, Yiming Cui, Qingyu Yin, Wei-Nan Zhang, Shijin Wang, Guoping Hu | ACL 2017 | - |
| 2016 | Consensus Attention-based Neural Networks for Chinese Reading Comprehension | Yiming Cui, Ting Liu, Zhipeng Chen, Shijin Wang, Guoping Hu | COLING 2016 | GitHub |
| 2016 | LSTM Neural Reordering Feature for Statistical Machine Translation | Yiming Cui, Shijin Wang, Jianfeng Li | NAACL 2016 | - |

Follow Us

Follow our official WeChat account to stay up to date with our latest technologies!