Named entity extraction from Portuguese web text

My master's dissertation on named entity extraction from Portuguese web text, at FEUP (Faculty of Engineering of the University of Porto).

The work applies well-established tools (OpenNLP, Stanford CoreNLP, spaCy and NLTK) to entity extraction for the Portuguese language, specifically to the news section of SIGARRA, the University of Porto's information system, and all its subdomains.

Author: André Ricardo Oliveira Pires

Supervisor: Sérgio Nunes

Co-supervisor: José Devezas

In collaboration with: FEUP InfoLab and INESC TEC

For more information on the development process, the guidelines for each tool, the results obtained, the resources created (trained NER models and annotated dataset) and more, check the wiki.
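
The wiki remains the reference for how the results were actually computed. As a rough illustration of how the output of different NER tools could be compared against an annotated dataset, the following is a minimal sketch of exact-match, entity-level precision, recall and F1. The function name `entity_prf`, the `(start, end, label)` tuple representation and the sample sentence are assumptions made for this example and are not taken from the dissertation.

```python
from typing import Set, Tuple

# Hypothetical representation of one entity: (start_char, end_char, label)
Entity = Tuple[int, int, str]


def entity_prf(gold: Set[Entity], predicted: Set[Entity]) -> Tuple[float, float, float]:
    """Exact-match precision, recall and F1 over entity spans.

    An entity counts as correct only if both its boundaries and its
    label match an entity in the gold annotation.
    """
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Illustrative data only: character offsets refer to this sentence.
text = "A FEUP fica no Porto."
gold = {(2, 6, "ORG"), (15, 20, "LOC")}
# A hypothetical tool finds both spans but mislabels "Porto".
predicted = {(2, 6, "ORG"), (15, 20, "PER")}

precision, recall, f1 = entity_prf(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Representing entities as character offsets plus a label keeps the comparison independent of each tool's tokenizer, which matters when the tools split Portuguese text differently.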
