IWord2Vec does not work #2

naoa · 2023-02-08T01:04:15Z

The following code does not work correctly.
Is this project still under development, and is it expected not to work yet?

from torch.utils.data import DataLoader
from rivertext.models.iw2v import IWord2Vec
from rivertext.utils import TweetStream
from tqdm import tqdm
ts = TweetStream("a.txt")
dataloader = DataLoader(ts, batch_size=32)
iw2v = IWord2Vec(window_size=3, vocab_size=3, emb_size=3, sg=1, neg_samples_sum=1, device="cuda:0")
for batch in tqdm(dataloader):
    iw2v.learn_many(batch)
print(iw2v.vocab2dict())

0it [00:00, ?it/s]
/home/pf/glove/rivertext/rivertext/models/iword2vec/unigram_table.py:83: RuntimeWarning: invalid value encountered in true_divide
  nums = self.max_size * counts_pow / z
0it [00:00, ?it/s]
Traceback (most recent call last):
  File "/home/pf/glove/rivertext/a.py", line 9, in <module>
    iw2v.learn_many(batch)
  File "/home/pf/glove/rivertext/rivertext/models/iw2v.py", line 267, in learn_many
    batch = self.prep(tokens)
  File "/home/pf/glove/rivertext/rivertext/models/iword2vec/preprocessing.py", line 261, in __call__
    self.update_unigram_table(token)
  File "/home/pf/glove/rivertext/rivertext/models/iword2vec/preprocessing.py", line 98, in update_unigram_table
    self.rebuild_unigram_table()
  File "/home/pf/glove/rivertext/rivertext/models/iword2vec/preprocessing.py", line 78, in rebuild_unigram_table
    self.unigram_table.build(self.vocab, self.alpha)
  File "/home/pf/glove/rivertext/rivertext/models/iword2vec/unigram_table.py", line 84, in build
    nums = np.vectorize(round_number)(nums)
  File "/home/pf/.local/lib/python3.10/site-packages/numpy/lib/function_base.py", line 2304, in __call__
    return self._vectorize_call(func=func, args=vargs)
  File "/home/pf/.local/lib/python3.10/site-packages/numpy/lib/function_base.py", line 2382, in _vectorize_call
    ufunc, otypes = self._get_ufunc_and_otypes(func=func, args=args)
  File "/home/pf/.local/lib/python3.10/site-packages/numpy/lib/function_base.py", line 2342, in _get_ufunc_and_otypes
    outputs = func(*inputs)
  File "/home/pf/glove/rivertext/rivertext/utils/rand.py", line 17, in round_number
    c = ceil(num)
ValueError: cannot convert float NaN to integer
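
The warning and the final error are consistent with the normalizer `z` being zero (or non-finite) when the unigram table is built: `0 / 0` yields NaN, and `ceil` then refuses to convert it. The traceback alone does not show why `z` ended up that way, so the following is only a minimal sketch of the mechanism, not the rivertext implementation. The function names, the all-zero counts, and `alpha=0.75` are assumptions made for illustration.

```python
import math

import numpy as np


def table_sizes(counts, max_size, alpha=0.75):
    """Hypothetical stand-in for the normalization step in the traceback."""
    counts_pow = np.asarray(counts, dtype=float) ** alpha
    z = counts_pow.sum()
    # With all-zero counts, z == 0 and 0 / 0 evaluates to NaN.
    with np.errstate(invalid="ignore"):  # silence the RuntimeWarning here
        return max_size * counts_pow / z


def safe_table_sizes(counts, max_size, alpha=0.75):
    """Same computation, but returns zeros when it cannot normalize."""
    counts_pow = np.asarray(counts, dtype=float) ** alpha
    z = counts_pow.sum()
    if z == 0 or not np.isfinite(z):
        return np.zeros_like(counts_pow)
    return max_size * counts_pow / z


# Assumed input: three vocabulary slots that have no counts yet.
nums = table_sizes([0, 0, 0], max_size=10)

error_message = None
try:
    math.ceil(float(nums[0]))
except ValueError as exc:
    # Same message as the last line of the traceback.
    error_message = str(exc)

print(nums, error_message)
print(safe_table_sizes([0, 0, 0], max_size=10))
```

A guard like the one in `safe_table_sizes` only hides the symptom; whether zero counts are actually the cause here would need to be checked against the input file and the library code.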

naoa · 2023-02-10T10:42:23Z

I have confirmed that the code works with your test case.
Thanks.

giturra pinned this issue Feb 8, 2023

naoa closed this as completed Feb 10, 2023
