We introduce Geometric Document Vectors (Geo-Vec), a novel document representation learning model. Inspired by recent developments in geometric deep learning, our model encodes each document as a graph and treats an entire corpus as samples from a latent document topology manifold. Using a modified graph auto-encoder (GAE), our approach propagates complex word relations through shared weights, producing a semantically rich latent space. An attention module serves as a topic filter that compresses the learned embeddings. We compare Geo-Vec to several classic document representation learning models on an information retrieval task, and show that it performs on par with or outperforms them. The shared weights of the model depend only on the vocabulary, which enables training on very large corpora. Additionally, inference on unseen documents can be done efficiently with a single forward pass.
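The core pipeline described above (encode a document's word graph with a shared-weight graph auto-encoder, then reconstruct its structure from the latent space) can be sketched as follows. This is a minimal NumPy illustration of a standard GAE forward pass, not the repository's actual implementation; the two-layer GCN encoder, the inner-product decoder, and all function names here are assumptions for illustration.

```python
import numpy as np

def normalize_adjacency(A):
    # Symmetric normalization with self-loops: D^{-1/2} (A + I) D^{-1/2}
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt

def gae_encode(A, X, W1, W2):
    # Two-layer GCN encoder: Z = A_norm @ relu(A_norm @ X @ W1) @ W2
    # W1, W2 are shared across all documents, so their size depends
    # only on the vocabulary, not on the corpus size.
    A_norm = normalize_adjacency(A)
    H = np.maximum(A_norm @ X @ W1, 0.0)  # ReLU
    return A_norm @ H @ W2

def gae_decode(Z):
    # Inner-product decoder: reconstruct edge probabilities as sigmoid(Z Z^T)
    return 1.0 / (1.0 + np.exp(-(Z @ Z.T)))

# Toy word co-occurrence graph for one document (3 words in the vocabulary)
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
X = np.eye(3)                           # one-hot word features
rng = np.random.default_rng(0)
W1 = rng.standard_normal((3, 4))        # shared weights (vocab -> hidden)
W2 = rng.standard_normal((4, 2))        # shared weights (hidden -> latent)

Z = gae_encode(A, X, W1, W2)            # latent word embeddings, shape (3, 2)
A_rec = gae_decode(Z)                   # reconstructed adjacency, shape (3, 3)
```

Because the weights are tied to the vocabulary rather than to any specific document, embedding an unseen document is just this forward pass with its own co-occurrence graph `A`.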
gverkes/Geo-Vec-Model
About
Implementation of the Geo-Vec model for embedding documents