What is NLL?
Negative Log Likelihood (NLL) is the negative logarithm of the probability a model assigns to the observed data, so it measures how well the model's predicted distribution fits that data. In the context of LLMs, it measures how well the model predicts the sequences in the evaluation dataset.
The NLL metric is computed from the probability the LLM assigns to each text in a corpus given its context. If the corpus is representative of a specific LLM capability, such as multi-round conversation, instruction following, math problem solving, or role-playing, then the NLL on that corpus offers a quantitative measure of that ability.
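Written as a formula, the averaged form of this metric can be sketched as follows. The symbols are notation introduced here for illustration, not taken from the original post: N is the number of evaluation examples, s_i and c_i are the i-th sentence and its context, and p_θ is the probability assigned by the model. The logarithm is assumed to be the natural log.

```latex
\mathrm{NLL} = -\frac{1}{N} \sum_{i=1}^{N} \log p_\theta(s_i \mid c_i)
```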
How is it calculated in the code?
The provided code calculates the NLL of a set of sentences given their contexts.
    import math

    def compute_nll(sentences, contexts):
        N = len(sentences)
        total_log_prob = 0
        for i in range(N):
            sentence = sentences[i]
            context = contexts[i]
            # This is a placeholder and you'd need to replace this with the actual model's probability function
            prob = model_probability(sentence, context)
            total_log_prob += math.log(prob)
        return -total_log_prob / N

    def model_probability(sentence, context):
        # Placeholder function: This should return the probability of the sentence given the context using your LLM
        # For this example, we're returning a dummy probability
        return 0.5
The compute_nll function takes two lists: sentences and contexts. Each sentence corresponds to the context at the same index.
For each sentence-context pair, the probability of the sentence given the context is computed using the model_probability function.
The logarithm of each probability is taken and summed up.
Finally, the negative of the average log probability per sentence is returned as the NLL.
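An LLM usually produces a probability for each token rather than for a whole sentence. The following is a minimal sketch of one way to fill in the placeholder, assuming the per-token probabilities are already available as lists of floats. The function names `sequence_log_prob` and `compute_nll_from_token_probs` and the numbers in `example` are hypothetical and not part of the original code.

```python
import math

def sequence_log_prob(token_probs):
    # The probability of a sequence is the product of its token probabilities,
    # so its log probability is the sum of the per-token log probabilities.
    # Summing logs also avoids underflow on long sequences.
    return sum(math.log(p) for p in token_probs)

def compute_nll_from_token_probs(per_sentence_token_probs):
    # Same averaging as compute_nll: negative mean log probability per sentence.
    n = len(per_sentence_token_probs)
    total_log_prob = sum(sequence_log_prob(tp) for tp in per_sentence_token_probs)
    return -total_log_prob / n

# Hypothetical per-token probabilities for two sentences (made-up numbers).
example = [[0.5, 0.5], [0.25]]
nll = compute_nll_from_token_probs(example)
print(nll)
```

Both sentences in this example have a total probability of 0.25, so the result equals -log(0.25), the same value compute_nll would return if model_probability returned 0.25 for each pair.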
Why use NLL for LLMs?
When evaluating LLMs, it's crucial to know how confidently the model predicts sequences. A low NLL indicates that the model assigns high probabilities to the observed sequences, suggesting it models that data well. Conversely, a high NLL indicates that the model assigns low probabilities to the observed sequences.
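To make the low-versus-high comparison concrete, here is a small sketch that applies the same averaging to made-up probabilities. The helper `nll_for_constant_prob` is hypothetical and simply assumes every sentence receives the same probability.

```python
import math

def nll_for_constant_prob(prob, n=3):
    # NLL when each of n sentences is assigned the same probability `prob`.
    return -sum(math.log(prob) for _ in range(n)) / n

confident = nll_for_constant_prob(0.9)  # model assigns high probability
dummy = nll_for_constant_prob(0.5)      # the placeholder value used above
unsure = nll_for_constant_prob(0.1)     # model assigns low probability

print(f"p=0.9 -> NLL {confident:.3f}")
print(f"p=0.5 -> NLL {dummy:.3f}")
print(f"p=0.1 -> NLL {unsure:.3f}")
```

The NLL falls toward 0 as the assigned probability approaches 1 and grows without bound as the probability approaches 0.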