
A service that wraps DeepPavlov NER models for quick entity extraction from the cells of long tabular data


nicolay-r/bulk-ner


bulk-ner 0.25.1


A no-strings inference framework for Named Entity Recognition (NER) that serves wrapped AI models.

The key features of this framework are:

  1. ☑️ Native support for batching;
  2. ☑️ Native handling of long input contexts.

Installation

From PyPI:

pip install bulk-ner

or the latest version from GitHub:

pip install git+https://github.com/nicolay-r/bulk-ner@main

Usage

API

Please take a look at the related Wiki page

Shell

NOTE: You have to install the source-iter package.

This example uses DeepPavlov==1.3.0 as the adapter for NER models; the adapter is passed via the --adapter parameter:

python -m bulk_ner.annotate \
    --src "test/data/test.tsv" \
    --prompt "{text}" \
    --batch-size 10 \
    --adapter "dynamic:models/dp_130.py:DeepPavlovNER" \
    --output "test-annotated.jsonl" \
    %%m \
    --model "ner_ontonotes_bert_mult"
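The command above reads its input from a TSV file and fills the `{text}` placeholder of `--prompt` from it. The following is a minimal sketch of preparing such a file, assuming the placeholder refers to a column named `text` in the header row; the helper name `rows_to_tsv` and the file name `my-input.tsv` are hypothetical and not part of bulk-ner.

```python
import csv
import io


def rows_to_tsv(texts):
    """Serialize texts into TSV content with a single `text` column.

    Assumption: bulk-ner resolves `{text}` in --prompt against a column
    of the same name; check test/data/test.tsv for the actual layout.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["text"])      # header row naming the column
    for text in texts:
        writer.writerow([text])    # one cell of text per row
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical file name; pass it to the CLI via --src.
    with open("my-input.tsv", "w", encoding="utf-8", newline="") as f:
        f.write(rows_to_tsv(["Alice met Bob in Paris.", "No entities here."]))
```

The resulting file could then be passed via `--src "my-input.tsv"`.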

You can choose other models via the --model parameter.

The list of supported models is available here: https://docs.deeppavlov.ai/en/master/features/models/NER.html

Deploy your model

Quick example: Check out the default DeepPavlov wrapper implementation

All you have to do is subclass BaseNER and implement the following protected method:

  • _forward(sequences) -- expected to return two lists of the same length:
    • terms -- related to the list of atomic elements of the text (usually words)
    • labels -- B-I-O labels for each term.
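The contract described above can be sketched with a toy model. This is an illustration only: the `BaseNER` class below is a stand-in defined locally so the example runs on its own (the real class ships with bulk-ner and may have a different constructor and import path), and `CapitalizedWordNER` is a hypothetical rule-based tagger rather than a real model. The sketch also assumes that `terms` and `labels` each hold one inner list per input sequence.

```python
class BaseNER:
    """Local stand-in for bulk-ner's BaseNER (assumption, not the real class)."""

    def _forward(self, sequences):
        raise NotImplementedError


class CapitalizedWordNER(BaseNER):
    """Toy tagger: every capitalized word is treated as part of an entity."""

    def _forward(self, sequences):
        terms, labels = [], []
        for text in sequences:
            words = text.split()           # atomic elements of the text
            tags = []
            inside_entity = False
            for word in words:
                if word[0].isupper():
                    # B- opens an entity, I- continues the previous one.
                    tags.append("I-ENT" if inside_entity else "B-ENT")
                    inside_entity = True
                else:
                    tags.append("O")       # outside of any entity
                    inside_entity = False
            terms.append(words)
            labels.append(tags)
        # Two lists of the same length: one entry per input sequence.
        return terms, labels


if __name__ == "__main__":
    terms, labels = CapitalizedWordNER()._forward(["Alice met Bob Smith in Paris"])
    for term, label in zip(terms[0], labels[0]):
        print(term, label)
```

A real wrapper would replace the capitalization rule with a call to the underlying model, as the default DeepPavlov wrapper does.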

Powered by

The pipeline construction components were taken from AREkit [github]