forked from primeqa/primeqa

The prime repository for state-of-the-art Multilingual Question Answering research and development.

mengxiayu/primeqa

primeqa

The Prime Repository for State-of-the-Art Multilingual Question Answering Research and Development.

License: Apache 2.0

PrimeQA is a public open-source repository that enables researchers and developers to train state-of-the-art models for question answering (QA). With PrimeQA, a researcher can replicate the experiments described in a paper published at a recent NLP conference, download pre-trained models from an online repository, and run them on custom data. PrimeQA is built on top of the Transformers toolkit and uses datasets and models that are directly downloadable.

The models within PrimeQA support end-to-end question answering. PrimeQA answers questions via

Some examples of supported models (applicable to benchmark datasets) are:

🏅 Top of the Leaderboard

PrimeQA is at the top of several leaderboards: XOR-TyDi, TyDiQA-main, OTT-QA and HybridQA.

✔️ Getting Started

Installation

Installation doc

# cd to project root

# If you want to run on GPU make sure to install torch appropriately

# E.g. for torch 1.11 + CUDA 11.3:
pip install 'torch~=1.11.0' --extra-index-url https://download.pytorch.org/whl/cu113

# Install as editable (-e) or non-editable using pip, with extras (e.g. tests) as desired
# Example installation commands:

# Minimal install (non-editable)
pip install .

# GPU support (quotes keep shells such as zsh from expanding the brackets)
pip install '.[gpu]'

# Full install (editable)
pip install -e '.[all]'

Please note that dependencies (specified in setup.py) are pinned to provide a stable experience. When installing from source, these pins can be modified; however, this is not officially supported.
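Since the pins matter, it can help to see which versions actually ended up in the environment. The following is a minimal sketch rather than part of PrimeQA: `installed_versions` is a hypothetical helper, and the distribution names passed to it (`primeqa`, `torch`, `faiss-cpu`, `faiss-gpu`) are assumptions that may need adjusting, for example when faiss comes from conda rather than pip.

```python
from importlib import metadata


def installed_versions(distributions):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in distributions:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Assumed distribution names; adjust them to match your environment.
    names = ["primeqa", "torch", "faiss-cpu", "faiss-gpu"]
    for name, version in installed_versions(names).items():
        print(f"{name}: {version or 'not installed'}")
```

Comparing the printed versions against the pins in setup.py shows quickly whether a manual change has drifted from the supported set.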

Note: in many environments, conda-forge based faiss libraries perform substantially better than the default ones installed with pip. To install faiss libraries from conda-forge, use the following steps:

  • Create and activate a conda environment
  • Install the faiss libraries with the following command:

conda install -c conda-forge faiss=1.7.0 faiss-gpu=1.7.0

  • In setup.py, remove the faiss-related lines:
"faiss-cpu~=1.7.2": ["install", "gpu"],
"faiss-gpu~=1.7.2": ["gpu"],
  • Continue with the pip install commands as described above.
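After switching to the conda-forge build, a short smoke test can confirm that faiss imports and can run a search. This is a sketch under assumptions, not a PrimeQA utility: `faiss_smoke_test` is a hypothetical name, and `get_num_gpus` is looked up defensively because it may be missing from CPU-only builds.

```python
import importlib.util


def faiss_smoke_test(dim=8, n_vectors=32, k=3):
    """Return a small report on the importable faiss build, or None if faiss is absent."""
    if importlib.util.find_spec("faiss") is None:
        return None

    import faiss
    import numpy as np

    # Random float32 vectors; faiss expects float32 input.
    rng = np.random.default_rng(0)
    vectors = rng.random((n_vectors, dim), dtype=np.float32)

    index = faiss.IndexFlatL2(dim)  # exact L2 index, no training needed
    index.add(vectors)
    _, ids = index.search(vectors[:1], k)

    # get_num_gpus may be absent in CPU-only builds, so look it up defensively.
    get_num_gpus = getattr(faiss, "get_num_gpus", None)
    return {
        "version": getattr(faiss, "__version__", "unknown"),
        "gpus": get_num_gpus() if get_num_gpus else 0,
        # Querying with the first stored vector should return that vector first.
        "nearest_is_self": int(ids[0][0]) == 0,
    }


if __name__ == "__main__":
    print(faiss_smoke_test() or "faiss is not installed")
```

A `gpus` value of 0 on a GPU machine suggests that the CPU-only package was picked up instead of the GPU build.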

Java requirements

Java 11 is required for BM25 retrieval.

Download the Java 11 package from https://jdk.java.net/archive/ and uncompress it.

Set JAVA_HOME:

export JAVA_HOME=<jdk-dir>
export PATH=$JAVA_HOME/bin:$PATH
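To confirm that the exported JAVA_HOME really points at a Java 11 runtime, the version can be checked from Python before running BM25 retrieval. The helpers below (`parse_java_major`, `detect_java_major`) are hypothetical names for a minimal sketch that assumes a Unix-like layout with the binary at `$JAVA_HOME/bin/java`.

```python
import os
import re
import shutil
import subprocess


def parse_java_major(version_output):
    """Extract the major Java version from `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    major = int(match.group(1))
    # Releases before Java 9 report themselves as 1.x (e.g. "1.8.0_292").
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


def detect_java_major():
    """Return the major version of the java binary under JAVA_HOME or on PATH."""
    java_home = os.environ.get("JAVA_HOME")
    java = os.path.join(java_home, "bin", "java") if java_home else shutil.which("java")
    if not java or not os.path.exists(java):
        return None
    try:
        # `java -version` usually prints to stderr, so inspect both streams.
        result = subprocess.run([java, "-version"], capture_output=True, text=True)
    except OSError:
        return None
    return parse_java_major(result.stderr + result.stdout)


if __name__ == "__main__":
    major = detect_java_major()
    print("Java 11 found" if major == 11 else f"Expected Java 11, found: {major}")
```

Keeping the parsing separate from the subprocess call makes the version logic easy to test without a JDK installed.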

🧪 Unit Tests

Testing doc

To run the unit tests you first need to install PrimeQA. Make sure to install with the [tests] or [all] extras from pip.

From there you can run the tests via pytest, for example:

pytest --cov PrimeQA --cov-config .coveragerc tests/

For more information, see the testing doc.

🔭 Learn more

| Section | Description |
|---|---|
| 📒 Documentation | Full API documentation and tutorials |
| 🏁 Quick tour: Entry Points for PrimeQA | Different entry points for PrimeQA: Information Retrieval, Reading Comprehension, TableQA and Question Generation |
| 📓 Tutorials: Jupyter Notebooks | Notebooks to get started on QA tasks |
| 💻 Examples: Applying PrimeQA on various QA tasks | Example scripts for fine-tuning PrimeQA models on a range of QA tasks |
| 🤗 Model sharing and uploading | Upload and share your fine-tuned models with the community |
| Pull Request | PrimeQA Pull Request |
| 📄 Generate Documentation | How Documentation works |
| 🛠 Orchestrator Service REST Microservice | Proof-of-concept code for PrimeQA Orchestrator microservice |
| 📖 Tooling UI | Demo UI |

❤️ PrimeQA collaborators include

  • Stanford NLP
  • University of Illinois
  • University of Stuttgart
  • University of Notre Dame
  • Ohio State University
  • Carnegie Mellon University
  • University of Massachusetts
  • IBM Research



