Word2Vec

This is a re-implementation of Word2Vec that relies on TensorFlow Estimators and Datasets.

Works with Python >= 3.6 and TensorFlow v2.0.

Install

via pip:

pip3 install tf-word2vec

or, after a git clone:

python3 setup.py install

Get data

You can download a sample of the English Wikipedia here (the file is a 7z archive and needs to be extracted to a .txt file before training):

wget http://129.194.21.122/~kabbach/enwiki.20190120.sample10.0.balanced.txt.7z

Train Word2Vec

w2v train \
  --data /absolute/path/to/enwiki.20190120.sample10.0.balanced.txt \
  --outputdir /absolute/path/to/word2vec/models \
  --alpha 0.025 \
  --neg 5 \
  --window 2 \
  --epochs 5 \
  --size 300 \
  --min-count 50 \
  --sample 1e-5 \
  --train-mode skipgram \
  --t-num-threads 20 \
  --p-num-threads 25 \
  --keep-checkpoint-max 3 \
  --batch 1 \
  --shuffling-buffer-size 10000 \
  --save-summary-steps 10000 \
  --save-checkpoints-steps 100000 \
  --log-step-count-steps 10000
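
To give a rough idea of what --min-count, --sample and --window control, here is a minimal pure-Python sketch. It is not the code used by this package: the function names are hypothetical, the subsampling formula is assumed to be the one from the original Word2Vec paper, and the window is kept fixed rather than randomly shrunk as some implementations do.

```python
import math
from collections import Counter


def build_vocab(tokens, min_count):
    """Count tokens and keep those seen at least `min_count` times."""
    counts = Counter(tokens)
    return {word: n for word, n in counts.items() if n >= min_count}


def keep_probability(count, total, sample):
    """Probability of keeping one occurrence of a word under subsampling.

    Assumption: the formula from the original Word2Vec paper,
    P(keep) = min(1, sqrt(sample / frequency)).
    """
    frequency = count / total
    return min(1.0, math.sqrt(sample / frequency))


def skipgram_pairs(tokens, window):
    """Return (target, context) pairs for tokens at most `window` positions apart."""
    pairs = []
    for i, target in enumerate(tokens):
        start = max(0, i - window)
        stop = min(len(tokens), i + window + 1)
        for j in range(start, stop):
            if j != i:  # a word is not its own context
                pairs.append((target, tokens[j]))
    return pairs


if __name__ == "__main__":
    sentence = "the cat sat on the mat".split()
    print(build_vocab(sentence, min_count=2))
    print(skipgram_pairs(sentence, window=2)[:4])
    print(keep_probability(count=10_000, total=1_000_000, sample=1e-5))
```

Under these assumptions, a higher --min-count shrinks the vocabulary, a lower --sample discards more occurrences of frequent words, and a larger --window produces more (target, context) pairs per token.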
