IWord2Vec does not work #2

Closed
naoa opened this issue Feb 8, 2023 · 1 comment

naoa commented Feb 8, 2023

The following code does not work correctly.
Is this project still under development, and not yet expected to work?

```python
from torch.utils.data import DataLoader
from rivertext.models.iw2v import IWord2Vec
from rivertext.utils import TweetStream
from tqdm import tqdm
ts = TweetStream("a.txt")
dataloader = DataLoader(ts, batch_size=32)
iw2v = IWord2Vec(window_size=3, vocab_size=3, emb_size=3, sg=1, neg_samples_sum=1, device="cuda:0")
for batch in tqdm(dataloader):
    iw2v.learn_many(batch)
print(iw2v.vocab2dict())
```

```
0it [00:00, ?it/s]/home/pf/glove/rivertext/rivertext/models/iword2vec/unigram_table.py:83: RuntimeWarning: invalid value encountered in true_divide
  nums = self.max_size * counts_pow / z
0it [00:00, ?it/s]
Traceback (most recent call last):
  File "/home/pf/glove/rivertext/a.py", line 9, in <module>
    iw2v.learn_many(batch)
  File "/home/pf/glove/rivertext/rivertext/models/iw2v.py", line 267, in learn_many
    batch = self.prep(tokens)
  File "/home/pf/glove/rivertext/rivertext/models/iword2vec/preprocessing.py", line 261, in __call__
    self.update_unigram_table(token)
  File "/home/pf/glove/rivertext/rivertext/models/iword2vec/preprocessing.py", line 98, in update_unigram_table
    self.rebuild_unigram_table()
  File "/home/pf/glove/rivertext/rivertext/models/iword2vec/preprocessing.py", line 78, in rebuild_unigram_table
    self.unigram_table.build(self.vocab, self.alpha)
  File "/home/pf/glove/rivertext/rivertext/models/iword2vec/unigram_table.py", line 84, in build
    nums = np.vectorize(round_number)(nums)
  File "/home/pf/.local/lib/python3.10/site-packages/numpy/lib/function_base.py", line 2304, in __call__
    return self._vectorize_call(func=func, args=vargs)
  File "/home/pf/.local/lib/python3.10/site-packages/numpy/lib/function_base.py", line 2382, in _vectorize_call
    ufunc, otypes = self._get_ufunc_and_otypes(func=func, args=args)
  File "/home/pf/.local/lib/python3.10/site-packages/numpy/lib/function_base.py", line 2342, in _get_ufunc_and_otypes
    outputs = func(*inputs)
  File "/home/pf/glove/rivertext/rivertext/utils/rand.py", line 17, in round_number
    c = ceil(num)
ValueError: cannot convert float NaN to integer
```
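
For readers hitting the same traceback: the warning and the final error are consistent with the normalizer `z` being zero when the unigram table is built, so that NumPy evaluates `0 / 0` to NaN and `ceil` then rejects it. Whether that is what actually happened here is not confirmed by this thread. The sketch below is a pure-Python stand-in, not rivertext's code: `slot_counts`, `ceil_of_nan_message`, the default `alpha=0.75`, and the plain `ceil` rounding are all assumptions made for illustration.

```python
import math


def slot_counts(counts, max_size, alpha=0.75):
    """Hypothetical stand-in for the table-size step (not rivertext's code).

    Computes ceil(max_size * count**alpha / z) per word, where z is the sum
    of all count**alpha values.
    """
    counts_pow = [c ** alpha for c in counts]
    z = sum(counts_pow)
    # Guard: with an empty vocabulary or all-zero counts, z is 0. The NumPy
    # expression `max_size * counts_pow / z` would then yield NaN with an
    # "invalid value" RuntimeWarning instead of raising.
    if z == 0 or not math.isfinite(z):
        return [0 for _ in counts]
    return [math.ceil(max_size * p / z) for p in counts_pow]


def ceil_of_nan_message():
    """Return the message Python raises when ceil() is given NaN."""
    try:
        math.ceil(float("nan"))
    except ValueError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(slot_counts([3, 1], max_size=8, alpha=1.0))
    print(slot_counts([0, 0], max_size=8))
    print(ceil_of_nan_message())
```

If the cause is indeed a zero normalizer, a guard of this kind (or checking that the input stream yields tokens before the first table rebuild) would avoid the crash; treat this as a diagnostic aid rather than a fix endorsed by the maintainers.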
giturra pinned this issue Feb 8, 2023
naoa closed this as completed Feb 10, 2023

naoa commented Feb 10, 2023

I have confirmed that the code works with your test case.
Thanks.
