- Use existing corpus and data search and retrieval software as a backend.
- Obtain and compile information about:
  - a single word,
  - two or more words compared with each other,
  - a word's translation.
- Explore text metadata statistics, time-based trends, word cloud-based data, and more.
- Combine statistics from different corpora.
- Use the results of one resource as input for another resource.
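
Combining statistics from different corpora can be illustrated with a small sketch. This is not WaG's actual implementation: the names `CorpusFreq`, `to_ipm` and `merge_corpus_freqs` are hypothetical, and the sketch assumes each backend reports an absolute hit count together with the corpus size in tokens. Normalizing to instances per million tokens (ipm) makes counts from corpora of different sizes comparable.

```python
from dataclasses import dataclass


@dataclass
class CorpusFreq:
    """Absolute frequency of a word in one corpus (hypothetical shape)."""
    corpus: str
    abs_freq: int      # number of hits for the word
    corpus_size: int   # total number of tokens in the corpus


def to_ipm(abs_freq: int, corpus_size: int) -> float:
    """Convert an absolute frequency to instances per million tokens."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return abs_freq * 1_000_000 / corpus_size


def merge_corpus_freqs(items: list[CorpusFreq]) -> list[tuple[str, float]]:
    """Return (corpus, ipm) pairs sorted from most to least frequent."""
    merged = [(it.corpus, to_ipm(it.abs_freq, it.corpus_size)) for it in items]
    return sorted(merged, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Made-up numbers, used only to show the normalization step.
    data = [
        CorpusFreq("corpus_a", abs_freq=500, corpus_size=100_000_000),
        CorpusFreq("corpus_b", abs_freq=120, corpus_size=10_000_000),
    ]
    for name, ipm in merge_corpus_freqs(data):
        print(f"{name}: {ipm:.1f} ipm")
```

In this made-up example, `corpus_b` ranks first despite its lower absolute count, because the word is relatively more frequent there.
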
- KonText
- MQuery
- NoSketch Engine
- Treq
- Clarin FCS Core 1
- Datamuse API
- Leipzig Corpora Collection (REST API) (LCC)
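
To give a sense of the kind of request such a backend receives, here is a minimal sketch that builds a query URL for the public Datamuse API. It only illustrates the service itself; how WaG is configured to call it is not shown here, and the helper name `similar_words_url` is hypothetical. The `ml` ("means like") and `max` parameters belong to the Datamuse `/words` endpoint.

```python
from urllib.parse import urlencode

# Public Datamuse endpoint for word queries.
DATAMUSE_WORDS_URL = "https://api.datamuse.com/words"


def similar_words_url(word: str, max_items: int = 10) -> str:
    """Build a Datamuse request URL for words with a similar meaning.

    Hypothetical helper: it only constructs the URL and sends no request.
    """
    query = urlencode({"ml": word, "max": max_items})
    return f"{DATAMUSE_WORDS_URL}?{query}"


if __name__ == "__main__":
    print(similar_words_url("ocean", 5))
```

The resulting URL can be fetched with any HTTP client; the service responds with a JSON list of word entries.
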
WaG | KonText | MQuery | NoSkE | Treq | Clarin FCS | Datamuse | ElasticSearch | LCC | |
---|---|---|---|---|---|---|---|---|---|
collocations | ⭐ | 🚧 | ⭐ | ⭐ | |||||
concFilter | ⭐ | ||||||||
concordance | ⭐ | ⭐ | ⭐ | ⭐ | |||||
freqBar | ⭐ | ⭐ | |||||||
freqComparison | ⭐ | ⭐ | |||||||
freqPie | ⭐ | ⭐ | |||||||
geoAreas | ⭐ | ⭐ | |||||||
multiWordGeoAreas | ⭐ | ⭐ | |||||||
html | ⭐ | ⭐ | |||||||
matchingDocuments | ⭐ | ⭐ | |||||||
mergeCorpFreq | ⭐ | ⭐ | |||||||
speeches | ⭐ | ||||||||
syntacticColls | ⭐ | ||||||||
timeDistrib | ⭐ | ⭐ | |||||||
multiWordtimeDistrib | ⭐ | ⭐ | |||||||
translations | ⭐ | ||||||||
treqSubsets | ⭐ | ||||||||
wordForms | ⭐ | ⭐ | 🚧 | ||||||
wordFreq | ⭐ | ⭐ | 🚧 | ||||||
wordSim | ⭐ | 🚧 | ⭐ | ⭐ |
WaG can run either as a self-hosted application or within a compatible web page. The self-hosted variant requires the following:
- Node.js and the npm package manager
- HTTP proxy server (Nginx, HAProxy, Apache)
- a core word frequency database:
For more information, please refer to INSTALL.md.
Tomáš Machálek (2020): Word at a Glance: Modular Word Profile Aggregator. In: Proceedings of LREC 2020, pp. 7011–7016.
@InProceedings{machalek2020lrec,
  author = {Tomáš Machálek},
  title = "{Word at a Glance: Modular Word Profile Aggregator}",
  booktitle = {Proceedings of the Twelfth International Conference on Language Resources and Evaluation (LREC 2020)},
  year = {2020},
  pages = {7011--7016},
  publisher = {European Language Resources Association (ELRA)},
  language = {english}
}