
Possible Improvements

Marco Fossati edited this page Oct 5, 2015 · 1 revision
  • Gazetteer: add campionato ("championship") combined with nationality adjectives [tedesco, spagnolo, ...]
  • Should giocare come difensore ("to play as a defender") be modeled as a separate frame?
  • Feature: tag each n-gram with its ontology class when one is available, otherwise keep ENT
  • Gazetteer: run it over the whole input sentence; it currently works at token level
  • Training: use a sentence splitter to obtain single-sentence examples
  • Evaluation: use n-grams instead of links
  • Supervised: if a sentence is not in the gold standard, the classifier should discard it (abstention)
  • Unsupervised: is the SportsEvent mapping harmful?
  • esordire ("to debut") may also trigger Partita
  • Gold standard: add the words that are missing because of errors in the wikiextractor (a third-party library)
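
The gazetteer item above (sentence-level instead of token-level lookup) could be sketched roughly as follows. This is only an illustration, not the project's actual code: the function name `match_gazetteer`, the `max_n` parameter, the greedy longest-match strategy, and the `COMPETITION` label are all assumptions made for the example.

```python
def match_gazetteer(tokens, gazetteer, max_n=3):
    """Greedy longest-match lookup of n-grams over a tokenized sentence.

    Returns a list of (start, end, ngram, label) tuples, where `end` is
    exclusive. `gazetteer` maps lowercased n-grams to labels (hypothetical).
    """
    matches = []
    i = 0
    while i < len(tokens):
        # Try the longest n-gram starting at position i first.
        for n in range(min(max_n, len(tokens) - i), 0, -1):
            ngram = " ".join(tokens[i:i + n]).lower()
            if ngram in gazetteer:
                matches.append((i, i + n, ngram, gazetteer[ngram]))
                i += n  # skip past the matched span
                break
        else:
            i += 1  # nothing matched at this position
    return matches


# Hypothetical gazetteer entries and label, for illustration only.
GAZETTEER = {
    "campionato": "COMPETITION",
    "campionato tedesco": "COMPETITION",
}

tokens = "Esordisce nel campionato tedesco".split()

# Sentence-level lookup finds the full multi-word entry.
sentence_level = match_gazetteer(tokens, GAZETTEER)

# Setting max_n=1 mimics token-level lookup and only finds "campionato".
token_level = match_gazetteer(tokens, GAZETTEER, max_n=1)
```

With `max_n=1` the multi-word entry is never considered, which is the limitation the gazetteer item describes; raising `max_n` lets entries such as campionato tedesco match as a single span.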