Welcome to Transformers-for-NLP-2nd-Edition Discussions! #1
Replies: 11 comments 1 reply
-
Hey, I'm Jonah! I'm working as a data scientist and am getting my hands dirty deploying some BERT fine-tunings. I came into my position with very little knowledge of NLP, so this book has been a great help in my onboarding. I think it will become much more popular as a text as more people adopt transformer language models.
-
Dear Jonah,
Thank you very much for sharing your project with me and your kind words. 😊
You made my day!
Best,
Denis
-
In Chapter02 there is an issue: TypeError: 'Word2Vec' object is not subscriptable
-
I ran
https://github.com/Denis2054/Transformers-for-NLP-2nd-Edition/blob/main/Chapter02/positional_encoding.ipynb
which contains the code you mentioned. It runs fine on my side.
Make sure that you have downloaded the text.txt file that goes with this
notebook into the same directory as the notebook.
If the text.txt file is not present, the program will generate an error
message.
Please contact me with any further questions you may have.
On Tue, Mar 21, 2023 at 1:01 PM Denis Rothman ***@***.***> wrote:
> Dear Karl,
> Thank you for your feedback.
> I'll check this out and get back to you in the next few days.
> Denis
> On Tue, Mar 21, 2023, 11:14 AM Karl Estermann ***@***.***> wrote:
>> In Chapter02 there is an issue:
>> TypeError Traceback (most recent call last)
>> Cell In[42], line 44
>> 42 print(word1)
>> 43 print(model2)
>> ---> 44 a=model2[word1]
>> 45 b=model2[word2]
>> 47 if(dprint==1):
>>
>> TypeError: 'Word2Vec' object is not subscriptable
>> What is the reason for this sequence?
>> Thanks, Karl
-
For me, it runs only with this replacement:
a=model2.wv[word1]
b=model2.wv[word2]
Kind regards,
Karl Estermann
-
For me, it runs only with this replacement:
a=model2.wv[word1]
b=model2.wv[word2]
You also run into a problem with this statement:
from keras.preprocessing.sequence import pad_sequences
Correct is:
from keras.preprocessing import text
from keras.utils import np_utils
import keras
from keras import preprocessing
import tensorflow as tf
and in the function generate_context_word_pairs:
x = tf.keras.preprocessing.sequence.pad_sequences(context_words, maxlen=context_length) # instead of your original
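The import path moved because `keras.preprocessing.sequence` was dropped from recent Keras releases; the same function is still reachable through the `tf.keras` namespace. A minimal sketch, assuming TensorFlow 2.x, with toy stand-ins for the notebook's `context_words` and `context_length`:

```python
import tensorflow as tf

# TF 2.x exposes pad_sequences under tf.keras.preprocessing.sequence;
# newer releases expose the same function as tf.keras.utils.pad_sequences.
try:
    pad_sequences = tf.keras.preprocessing.sequence.pad_sequences
except AttributeError:
    pad_sequences = tf.keras.utils.pad_sequences

context_words = [[3, 7], [1, 4, 9], [5]]  # toy stand-in for the notebook's data
context_length = 4
x = pad_sequences(context_words, maxlen=context_length)
print(x.tolist())  # shorter sequences are left-padded with zeros by default
```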
Kind regards,
Karl Estermann
-
Thank you for this feedback.
Whether a program runs depends on the environment installed.
I work on Google Colab, which provides pre-installed packages; I also
install packages with versions I sometimes freeze
because of conflicts with other libraries.
To make sure we are commenting on the same program, here is the
log of the run I just made:
1. this program:
https://github.com/Denis2054/Transformers-for-NLP-2nd-Edition/blob/main/Chapter02/positional_encoding.ipynb
2. on Google Colab (free version), opened with the link provided in
the readme.md of the repository; I copied the notebook to my drive
3. uploaded the text.txt file in the directory to Google Colab
4. executed the runtime with "Run all" with no intervention on my part
5. the program ran with no errors on my side, including the Keras references
The possible misunderstandings could be:
- we are not commenting on the same program; in that case, please
provide the name of the notebook
- we are not running the notebooks in the same environment
- the text.txt file is not uploaded
Please ask any other questions if necessary.
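One quick way to rule out an environment mismatch is to compare interpreter and package versions on both sides. A minimal sketch using only the standard library (the package names checked here are examples):

```python
import sys
from importlib import metadata

# Print the Python version and the installed versions of the packages the
# notebook depends on; "not installed" points at a likely mismatch.
print("python", sys.version.split()[0])
for pkg in ("gensim", "tensorflow"):
    try:
        print(pkg, metadata.version(pkg))
    except metadata.PackageNotFoundError:
        print(pkg, "not installed")
```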
-
Thanks for the answer. Colab is a no-go for me; I work with DataSpell from JetBrains.
Kind regards,
Karl Estermann
-
Thanks for your feedback.
Your comment is valuable for those who develop on DataSpell from JetBrains:
https://www.jetbrains.com/help/dataspell/quick-start-guide.html
For those who develop on Google Colab, there are no changes to make:
https://colab.research.google.com/
Thanks again.
-
Your understanding of cosine similarity is generally correct, as it
measures the cosine of the angle between two vectors, which in this context
represent the words "black" and "brown". A cosine similarity close to 1
indicates a high similarity, while a value close to -1 indicates
dissimilarity.
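Cosine similarity is just the dot product of the two vectors divided by the product of their norms. A minimal NumPy sketch on made-up vectors (not trained embeddings):

```python
import numpy as np

def cosine_similarity(u, v):
    # cos(theta) = (u . v) / (|u| * |v|)
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

u = np.array([1.0, 2.0, 3.0])
v = np.array([2.0, 4.0, 6.0])     # parallel to u -> similarity 1.0
w = np.array([-1.0, -2.0, -3.0])  # opposite direction -> similarity -1.0
print(round(cosine_similarity(u, v), 6), round(cosine_similarity(u, w), 6))
```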
However, there are a few nuances to consider regarding your results and
approach:
1. **Small Corpus Issue**: Your model is trained on a single sentence,
which is an extremely small corpus. Word2Vec models typically require a
large corpus to learn meaningful word representations. With just one
sentence, the model cannot capture the semantic relationship between words
accurately, especially for words like "black" and "brown", which require
broader context to understand their similarity.
2. **Expectation of Similarity**: While "black" and "brown" are both
colors, the semantic similarity in a Word2Vec space depends on their usage
in the training corpus. In a large and diverse corpus, they might be used
in similar contexts (describing objects, colors, etc.), leading to a higher
cosine similarity. However, in a single sentence, there's not enough
context to establish this similarity.
3. **Training on a Single Sentence**: Training a Word2Vec model on a single
sentence does not allow the model to generalize well. The model's
understanding is limited to the exact contexts provided in that sentence,
which does not suffice for capturing the semantic similarity between words
effectively.
4. **Vector Size and Window**: The parameters `vector_size` and `window`
are crucial for training. While these values can be experimented with, the
vector size of 512 might be too large for such a small dataset, and the
window size of 5 might not be optimal given the sentence's length.
To improve your results, consider the following:
- **Use a Larger Corpus**: To get meaningful similarity scores, use a
larger and more representative corpus to train your model.
- **Pre-trained Models**: Instead of training your model on a small corpus,
consider using a pre-trained Word2Vec model. These models have been trained
on large corpora and can provide more accurate representations for semantic
similarity.
- **Understand the Limitations**: Word2Vec captures syntactic and semantic
word relationships based on context. Two words can be similar in one
context and dissimilar in another. It's not always the case that two words
you expect to be similar based on one attribute (like color) will have high
cosine similarity in Word2Vec space.
By addressing these points, you can get a more accurate measure of the
semantic similarity between "black" and "brown" using Word2Vec models.
…On Sat, Mar 30, 2024, 4:56 PM satyesh88 ***@***.***> wrote:
I am trying to generate the Cosine similarity between two words in a
sentence. The sentence is "The black cat sat on the couch and the brown dog
slept on the rug".
My Python code is below:
from nltk.tokenize import sent_tokenize, word_tokenize
import warnings
warnings.filterwarnings(action = 'ignore')
import gensim
from gensim.models import Word2Vec
from sklearn.metrics.pairwise import cosine_similarity
sentence = "The black cat sat on the couch and the brown dog slept on the rug"
# Replaces escape character with space
f = sentence.replace("\n", " ")
data = []
# sentence parsing
for i in sent_tokenize(f):
    temp = []
    # tokenize the sentence into words
    for j in word_tokenize(i):
        temp.append(j.lower())
    data.append(temp)
print(data)
# Creating Skip Gram model
model2 = gensim.models.Word2Vec(data, min_count=1, vector_size=512, window=5, sg=1)
# Print results
print("Cosine similarity between 'black' " +
      "and 'brown' - Skip Gram : ",
      model2.wv.similarity('black', 'brown'))
As "black" and "brown" are of colour type, their cosine similarity should
be maximum (somewhere around 1). But my result shows following:
[['the', 'black', 'cat', 'sat', 'on', 'the', 'couch', 'and', 'the', 'brown', 'dog', 'slept', 'on', 'the', 'rug']]
Cosine similarity between 'black' and 'brown' - Skip Gram : 0.008911405
Any idea what is wrong here? Is my understanding about cosine similarity
correct?
|
Beta Was this translation helpful? Give feedback.
-
Here is the explanation:
In the positional encoding notebook, reshaping the embedding
vectors a and b to a shape of (1, 512) doesn't change the data
itself but rather prepares the vectors for further
operations. Here's what's happening:
The original embedding vectors a and b, representing word embeddings, are
likely 1D arrays of length 512 each. This is a common size for
embedding vectors in many NLP models, as it offers a balance between
capturing enough semantic information and computational efficiency.
The reshaping to (1, 512) transforms these 1D arrays into 2D matrices. This
transformation doesn't alter the data itself but changes how it's
organized. The reason for this reshaping is usually related to requirements
of subsequent operations that expect inputs in a specific shape. For
instance:
1. Matrix operations: Certain matrix operations or functions in libraries
like NumPy or PyTorch may expect inputs with specific dimensions. Reshaping
ensures compatibility with these operations.
2. Batch processing: Even if you're processing a single embedding at a
time, many deep learning libraries are optimized for batch processing,
where the first dimension is typically reserved for the batch size. By
reshaping to (1, 512), you're essentially creating a batch with a single
item, which makes it compatible with these batch processing expectations.
The reshaping to (1, 512) doesn't change the underlying data; it simply
adds an extra dimension, turning the vector into a matrix with one row and
512 columns. It's a common preprocessing step to align with the
requirements of various operations or functions that will be applied to the
data later in the pipeline.
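To see that concretely (a minimal NumPy sketch, not the notebook's exact code):

```python
import numpy as np

a = np.random.rand(512)      # 1D embedding vector, shape (512,)
aa = a.reshape(1, 512)       # 2D matrix with one row, shape (1, 512)

print(a.shape)    # (512,)
print(aa.shape)   # (1, 512)

# The 512 values themselves are untouched; only the shape metadata changes.
print(np.array_equal(a, aa[0]))  # True
```

So the 512-dimensional information is fully preserved; the vector is merely wrapped as a batch of size one.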
…On Sun, Mar 31, 2024 at 1:02 AM satyesh88 ***@***.***> wrote:
Thanks for the comment. I was going through your code
Chapter02/positional_encoding.ipynb
I found out that you are reshaping the output of Embedding Vector as
follows:
aa = a.reshape(1,512)
ba = b.reshape(1,512)
And then you calculated the positional vector and, further, the positional
encoding. Doesn't reshaping the vector change the dimension from 512 to 1?
May I know the reason behind that step?
|
Beta Was this translation helpful? Give feedback.
-
👋 Welcome!
We’re using Discussions as a place to connect with other members of our community. We hope that you build together 💪.
To get started, comment below with an introduction of yourself and tell us about what you do with this community.
Beta Was this translation helpful? Give feedback.