Enhance stopping criterion options and handle infinite log-likelihoods #103
Conversation
Hi, thank you for these contributions! I have a few follow up questions. Don't be discouraged by my skepticism, I'm really excited for someone to use my package 😍

*Do we need this?*

> :stability: introduces a new criterion where the algorithm stops when the log-likelihood stops changing within a given margin (this is important in my use case).

Can you explain why it is important in your use case? Do you face a scenario where the loglikelihood goes down?

> To address a bug where infinite log-likelihood values would break the code

Do you have a reproducible example of code breaking in that case? I think -Inf values for the loglikelihood are perfectly fine. And if you encounter +Inf it is possibly an issue with your model and not with my library. I'd happily take a look if you can share an MWE.

*How do we do this if we need it?*

> :convergence: maintains the original behavior of stopping when the log-likelihood stops increasing.

That's kind of a weird name for a stopping criterion, either way we're checking convergence. Maybe we could use :max_increase vs :max_variation?

> I added logic to handle these cases by replacing infinite values with the log of the largest representable float number.

This logic cannot be merged in the current state, because it is located in one of the central functions of the package (obs_logdensities!), and this function must remain non-allocating. At the moment the tests will probably fail, because findall(i -> i < -log(-nextfloat(-Inf)), logb) creates a new array of indices, which is very inefficient. If we are to change anything in this function (and I'm really doubtful whether we should), it has to be inside the loop.
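For illustration, a minimal sketch of the contrast being made (the function below is hypothetical and is not the package's actual obs_logdensities! implementation; it only shows how a clamp inside the existing loop avoids the allocation that findall introduces):

```julia
using Distributions: logpdf

# Hypothetical sketch: fill log-densities of one observation across all states
# without allocating, clamping infinities in place instead of collecting their
# indices with findall (which builds a fresh index array on every call).
function fill_logdensities!(logb::AbstractVector{<:AbstractFloat}, dists, obs)
    upper = log(floatmax(eltype(logb)))   # same bound as log(-nextfloat(-Inf))
    for i in eachindex(logb, dists)
        lb = logpdf(dists[i], obs)
        logb[i] = clamp(lb, -upper, upper)
    end
    return logb
end
```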
Dear Guillaume,

Regarding the stopping criterion:
- In my use case, I printed the log-likelihood at each iteration to better understand what was happening. Interestingly, it initially decreased before shooting up significantly after a few iterations, likely due to initialization effects. This shift ultimately led to much better results. I also completely agree with your naming suggestion.

As for the log-likelihood case:
- I encountered a state where all values were identical (and equal to 0), which resulted in a normal distribution with zero variance. When calculating the log-likelihood of a zero observation (for that state), I got an Inf value, and for non-zero observations, it returned -Inf. These extremes caused NaN values in the forward calculations, as there was an attempt to invert a zero value. Thanks to these changes, I was able to successfully run my use case. (A minimal reproduction is sketched after this message.)

Regarding the code efficiency:
- I agree that doing it inside the loop would be better.

André
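A minimal reproduction of the degenerate case described above (assuming a Distributions.jl Normal emission; the full model from the use case is not shown here):

```julia
using Distributions

# A state whose observations are all identical (and equal to 0) is fitted
# with a zero-variance normal distribution.
d = Normal(0.0, 0.0)

logpdf(d, 0.0)   # +Inf: observation equal to the degenerate mean
logpdf(d, 1.0)   # -Inf: any other observation
# Once these infinities enter the forward recursion, the normalization step
# divides by zero and the forward variables turn into NaN.
```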
Okay, this seems to confirm that we actually don't need the modifications you suggest.

*Stopping criterion*

> Interestingly, it initially decreased before shooting up significantly after a few iterations, likely due to initialization effects.

This should never happen, at least in standard cases. The Baum-Welch algorithm is a version of the EM algorithm, which is *guaranteed* to increase the loglikelihood (up to floating point errors). The option to check that this increase happens is actually a really useful debugging tool, and should basically never be turned off. If you provide me with a Minimum Working Example, I might be able to help you debug. Was it a model where you implemented a custom fitting step?

> In my use case, I printed the log-likelihood at each iteration to better understand what was happening.

Also you don't need to do that, Baum-Welch returns the vector of loglikelihood values at the end.

*Infinite loglikelihood*

> I encountered a state where all values were identical (and equal to 0), which resulted in a normal distribution with zero variance.

Indeed, zero-variance Gaussians are one of the cases where you may encounter a loglikelihood of +Inf, I struggled with that one too. While it may sound appealing, truncating the computed loglikelihood will only mask the symptoms of the problem. It won't solve the underlying issue, namely the degenerate emission distribution, which stems from the model and/or the data. Please check out the debugging section <https://gdalle.github.io/HiddenMarkovModels.jl/stable/debugging/#Numerical-underflow> of the docs, where I give a few tips to solve it.
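For reference, a minimal sketch of retrieving that vector (assuming the two-return-value form mentioned above; the guess model and data below are placeholders, not the use case from this thread):

```julia
using Distributions, HiddenMarkovModels

# Placeholder guess and data, only to illustrate the return values.
hmm_guess = HMM([0.5, 0.5], [0.9 0.1; 0.1 0.9], [Normal(-1.0), Normal(1.0)])
obs_seq = randn(200)

hmm_est, loglikelihood_evolution = baum_welch(hmm_guess, obs_seq)
loglikelihood_evolution  # one value per iteration, no need to print inside the loop
```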
My use case is one in which the transition matrices vary over time (and there is a large number of observations). Maybe that is why the guarantee that the loglikelihood will increase does not hold. The difference between iterations is really substantial (the values go roughly 9000, 4000, 2000, 5000, 17000, 29000, etc.). Unfortunately I cannot share my example right now, but I can create an artificial case to show you.

The challenge arises because we actually want the model to estimate a state with zero values. So, even if I didn't need to run the Baum-Welch algorithm and already knew the transition matrices and distributions, the forward algorithm would still fail under these conditions.
Even with time-dependent matrices the loglikelihood should still increase. I'd love it if you could show me an artificial example to debug it together.

By zero values I'm assuming you mean the variances. If that is the case, I guess you should use a different distribution object (like a Dirac mass) to accurately describe these degenerate emissions.
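A quick illustration of that suggestion (assuming the Dirac distribution from Distributions.jl as the degenerate emission):

```julia
using Distributions

d = Dirac(0.0)   # point mass at 0 instead of Normal(0.0, 0.0)

logpdf(d, 0.0)   # 0.0: finite, unlike the +Inf density of a zero-variance Normal
logpdf(d, 1.0)   # -Inf: observation outside the support, as expected
```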
Can we schedule a meeting so that I can show you the problem? Maybe tomorrow at 10 am (GMT-4)?
10:30 AM GMT-4 (NYC time) is perfect for me. Can you send me an invite to my email, found here (https://gdalle.github.io/)?
Perfect! Just sent the invitation.
Hello Guillaume,

Did you receive the invitation at the correct email? If not, here is the link to the meeting: meet.google.com/nux-eyym-szk
Customizable Stopping Criterion:
Previously, the package only allowed stopping when the log-likelihood stops increasing. I've introduced a new argument stopping_criterion that accepts two symbols (see the sketch below):
- :convergence: maintains the original behavior of stopping when the log-likelihood stops increasing.
- :stability: introduces a new criterion where the algorithm stops when the log-likelihood stops changing within a given margin (this is important in my use case).
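A minimal sketch of the two rules (the function name, argument order, and tolerance handling below are illustrative, not the implementation proposed in this PR):

```julia
# Hypothetical helper contrasting the two stopping rules described above.
function should_stop(logL_prev::Real, logL::Real, atol::Real, criterion::Symbol)
    if criterion === :convergence      # original behavior: stop once the increase is negligible
        return logL - logL_prev < atol
    elseif criterion === :stability    # new behavior: stop once the value stops changing at all
        return abs(logL - logL_prev) < atol
    else
        throw(ArgumentError("unknown stopping criterion: $criterion"))
    end
end

should_stop(100.0, 99.5, 1e-6, :convergence)  # true: the loglikelihood went down, so it "converged"
should_stop(100.0, 99.5, 1e-6, :stability)    # false: the value is still changing, keep iterating
```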
Handling Infinite Log-Likelihoods:
To address a bug where infinite log-likelihood values would break the code, I added logic to handle these cases by replacing infinite values with the log of the largest representable float number. This prevents the process from crashing while still allowing it to continue with large values.