Skip to content

Latest commit

 

History

History
52 lines (43 loc) · 1.89 KB

README.md

File metadata and controls

52 lines (43 loc) · 1.89 KB

solipCysme

spaCy pipeline for french fictions or first person point of view texts (with a focus on personal pronouns).

Feature Description
Language french
Name fr_solipcysme
Version 3.7.0
spaCy >=3.7.6,<3.8.0
Default Pipeline presque_normalizer, tokentype, morphologizer, viceverser_lemmatizer, sentencizer, parser
Components quelquhui_tokenizer, presque_normalizer, tokentype, morphologizer, viceverser_lemmatizer, sentencizer, parser
Vectors 504297 keys, 504297 unique vectors (100 dimensions)
Sources Corpus narraFEATS (morphologizer), corpus cabillaUD (parser), corpus attirail (vectors).
License CC BY-NC-SA 4.0
Author thjbdvlt

installation

# from github release
pip install https://github.com/thjbdvlt/solipCysme/releases/download/solipCysme-v3.7.0/fr_solipcysme-3.7.0-py3-none-any.whl

# from huggingface
pip install https://huggingface.co/thjbdvlt/fr_solipcysme/resolve/main/fr_solipcysme-any-py3-none-any.whl

usage

import spacy

nlp = spacy.load("fr_solipcysme")

for i in nlp(
    "la MACHINE à (b)rouiller le temps s'est peuuut-etre déraillée..?"
):
    print(
        i, 
        i.norm_, 
        i.pos_, 
        i.morph, 
        i.lemma_, 
        i.dep_, 
        i._.tokentype,
        i._.vv_pos,
        i._.vv_morph
    )