Using nas.ContextualWordEmbsForSentenceAug with gpt2 or distilgpt2 returns nonsense #346

Open
djaszak opened this issue May 4, 2024 · 0 comments

Hi,

I use this library for several augmentation operations, and all of them work fine. However, nas.ContextualWordEmbsForSentenceAug with gpt2 or distilgpt2 returns nonsense.
This is my code:

from enum import Enum

import nlpaug.augmenter.sentence as nas


# In my project this enum is defined in another module;
# it is shown here so the snippet is self-contained.
class AugmentationMethods(Enum):
    GENERATIVE_GPT2 = nas.ContextualWordEmbsForSentenceAug(
        model_path="gpt2", device="cuda"
    )
    GENERATIVE_DISTILGPT2 = nas.ContextualWordEmbsForSentenceAug(
        model_path="distilgpt2", device="cuda"
    )


text = 'The quick brown fox jumps over the lazy dog.'
# Augment the same sentence five times with each model.
for _ in range(5):
    print(f"GPT2: {AugmentationMethods.GENERATIVE_GPT2.value.augment(text)}")
    print(f"DISTILLGPT2: {AugmentationMethods.GENERATIVE_DISTILGPT2.value.augment(text)}")

The output is:

GPT2: ['The quick brown fox jumps over the lazy dog. - first ( all .']
DISTILLGPT2: ['The quick brown fox jumps over the lazy dog. F A F I A R - W C A The U B S It The B The A M The The B In to The E A In This']
GPT2: ['The quick brown fox jumps over the lazy dog. New next " the it in : .']
DISTILLGPT2: ['The quick brown fox jumps over the lazy dog. 1 The This L .']
GPT2: ['The quick brown fox jumps over the lazy dog. is ( , the " way all a The one was , for of same \' " to in most last I for that two government .']
DISTILLGPT2: ['The quick brown fox jumps over the lazy dog. B I In A E W It the T U The To G L T In We D R We A C We H In A Image It It If']
GPT2: ['The quick brown fox jumps over the lazy dog. next number other same last last in is " for , of ( ( of I and New \' a two .']
DISTILLGPT2: ['The quick brown fox jumps over the lazy dog. W R I , S D and J .']
GPT2: ['The quick brown fox jumps over the lazy dog. it a not not first same of A A people and last a will the is that New world two next \'s government that that two ( " the and']
DISTILLGPT2: ['The quick brown fox jumps over the lazy dog. K - A This This N A R J N A The J Image A The P You W H A The We K In The G A If 2']
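
In every run the input sentence is returned unchanged and the nonsense appears only in the appended text. To compare runs, it can help to isolate that appended part. The following is a minimal sketch, assuming augment returns a list of strings that begin with the original text, as in the output above. The helpers split_continuation and short_token_ratio are hypothetical names for illustration and are not part of nlpaug.

```python
def split_continuation(original, augmented):
    """Return the text appended after `original` in each augmented string."""
    continuations = []
    for candidate in augmented:
        if candidate.startswith(original):
            continuations.append(candidate[len(original):].strip())
        else:
            # The input itself was altered; keep the full string for inspection.
            continuations.append(candidate)
    return continuations


def short_token_ratio(continuation, max_len=2):
    """Fraction of whitespace-separated tokens with at most `max_len` characters."""
    tokens = continuation.split()
    if not tokens:
        return 0.0
    return sum(len(t) <= max_len for t in tokens) / len(tokens)


text = "The quick brown fox jumps over the lazy dog."
# One of the distilgpt2 results from the output above.
sample = ["The quick brown fox jumps over the lazy dog. W R I , S D and J ."]

tail = split_continuation(text, sample)[0]
print(tail)
print(short_token_ratio(tail))
```

A high ratio of one- or two-character tokens is only a rough signal that the continuation is a run of isolated tokens rather than a sentence; it does not explain the cause.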

Do you see any mistakes I made, or do you have any idea why the output is so broken?
Thanks in advance!
