
Runge's corrections (#55)
* made the changes in implementation runge wanted

* made all the requested changes from runge
sebastianbot6969 authored Dec 18, 2024
1 parent ddc1671 commit 977c159
Showing 3 changed files with 121 additions and 83 deletions.
26 changes: 14 additions & 12 deletions report/src/deprecated/02-preliminaries.tex
@@ -116,19 +116,21 @@ \subsubsection{Key Properties}
These parametric formulations allow a single \gls{pctmc} to represent a broad class of \glspl{ctmc}, where the specific model instance is determined by fixing the parameter values.

\subsection{Baum-Welch Algorithm}\label{subsec:baum-welch}
The Baum-Welch algorithm is an iterative method used to estimate the parameters of a \gls{pctmc} from observed data.
The algorithm aims to find the parameter values that maximize the likelihood of the observed data under the model.
This process is based on the Expectation-Maximization (EM) framework and involves two main steps:
The Baum-Welch algorithm is a key method for estimating the probabilities of an \gls{hmm} from observed data.
The probabilities of an \gls{hmm} are the emission matrix $\omega$, the transition matrix $P$, and the initial distribution $\pi$.
It was chosen for this project because it can estimate the probabilities of an \gls{hmm} without knowing the hidden states that generated the observations, and because it is the standard method for training \glspl{hmm}.
It can also be applied to other Markov models, such as \glspl{mc}, to estimate their parameters from observed data, which further supports its choice for this project.
It leverages the Expectation-Maximization (EM) framework and consists of two iterative steps:

\begin{enumerate}
\item \textbf{Expectation Step (E-step)}: Compute the expected values of the latent variables, which are the unobserved state sequences corresponding to the observations.
\item \textbf{Maximization Step (M-step)}: Update the parameter values to maximize the likelihood of the observed data, using the expected latent variables computed in the E-step.
\item \textbf{Expectation Step (E-step)}: Compute the forward and backward variables for each state $s$ and time $t$. These variables represent the likelihood of being in state $s$ at time $t$ given the observed data up to time $t$, and the likelihood of observing the remaining data from time $t$ onwards given state $s$ at time $t$, respectively.
\item \textbf{Maximization Step (M-step)}: Update the model parameters (emission matrix $\omega$, transition matrix $P$, and initial distribution $\pi$) to maximize the likelihood of the observed data, using the expected values computed in the E-step; a code sketch of both steps follows this list.
\item Repeat the E-step and M-step until convergence.
\end{enumerate}
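
To make the two steps concrete, the sketch below shows a single Baum-Welch iteration for a discrete-time \gls{hmm} over integer-coded observation symbols. It is a minimal illustration under simplified assumptions rather than the implementation used in this report; the name \texttt{baum\_welch\_update} and its argument layout are hypothetical. The forward and backward variables $\alpha$ and $\beta$ are taken as inputs, and their computation is sketched in \autoref{subsec:forward-backwards_algorithm}.

\begin{verbatim}
import numpy as np

def baum_welch_update(pi, P, omega, obs, alpha, beta):
    # One E-step/M-step pass for a discrete-time HMM.
    # pi: (S,) initial distribution, P: (S, S) transition matrix,
    # omega: (S, M) emission matrix, obs: (T,) integer array of symbols,
    # alpha, beta: (T, S) forward and backward variables.
    T, S = len(obs), len(pi)

    # E-step posteriors: gamma[t, s] = P(state s at time t | obs),
    # xi[t, i, j] = P(state i at t and state j at t + 1 | obs).
    gamma = alpha * beta
    gamma /= gamma.sum(axis=1, keepdims=True)
    xi = np.zeros((T - 1, S, S))
    for t in range(T - 1):
        xi[t] = (alpha[t][:, None] * P
                 * omega[:, obs[t + 1]][None, :] * beta[t + 1][None, :])
        xi[t] /= xi[t].sum()

    # M-step: closed-form re-estimates of pi, P and omega that maximize
    # the expected log-likelihood under the posteriors above.
    new_pi = gamma[0]
    new_P = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    new_omega = np.zeros_like(omega)
    for k in range(omega.shape[1]):
        new_omega[:, k] = gamma[obs == k].sum(axis=0)
    new_omega /= gamma.sum(axis=0)[:, None]
    return new_pi, new_P, new_omega
\end{verbatim}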

The Baum-Welch algorithm is particularly useful for estimating the parameters of a \gls{pctmc} when the underlying state sequence is unknown or partially observed.
The Baum-Welch algorithm is particularly useful for estimating the probabilities of the emission and transition matrices of an \gls{hmm}, given a set of observations, without knowing the hidden states that generated the observations.

Given a multiset of observations $\mathcal{O}$ and initial parameters $\textbf{x}_0$, the Baum-Welch algorithm estimates the parameters of a \gls{pctmc} $\mathcal{P}$ by iteratively improving the current hypothesis $\textbf{x}_n$ using the previous estimate $\textbf{x}_{n-1}$ until a convergence criterion is met.
Given a multiset of observations $\mathcal{O}$ and initial parameters $\textbf{x}_0$, the Baum-Welch algorithm estimates the parameters of an \gls{hmm} $\mathcal{P}$ by iteratively improving the current hypothesis $\textbf{x}_n$ using the previous estimate $\textbf{x}_{n-1}$ until a convergence criterion is met.
A hypothesis refers to a specific set of values for the parameters $\mathbf{x}$.

Each iteration of the algorithm produces a new hypothesis, denoted as $\textbf{x}_n$, which is the algorithm's current best guess for the parameter values based on the observed data.
@@ -137,10 +139,10 @@ \subsection{Baum-Welch Algorithm}\label{subsec:baum-welch}
This is typically evaluated using a convergence criterion such as:

\begin{equation}
||\textbf{x}_n - \textbf{x}_{n-1}|| < \epsilon\label{eq:convergence-criterion}
|l(\textbf{x}_n) - l(\textbf{x}_{n-1})| < \epsilon\label{eq:convergence-criterion}
\end{equation}

where $\epsilon > 0$ is a small threshold, and $\textbf{x}_n$ denotes the parameter values at the $n$-th iteration.
where $\epsilon > 0$ is a small threshold, and $l(\textbf{x}_n)$ denotes the likelihood of the observed data given the parameter values at the $n$-th iteration.
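
In practice the criterion is usually evaluated on log-likelihoods rather than raw likelihoods, since the likelihood of a long observation sequence underflows floating-point arithmetic. A minimal sketch of this common variant (the helper names are hypothetical; the likelihood is read off the final forward variables, cf.\ \autoref{subsec:forward-backwards_algorithm}):

\begin{verbatim}
import numpy as np

def log_likelihood(alpha):
    # l(x): the likelihood of the whole observation sequence is the sum
    # of the forward variables at the final time step; log for stability.
    return np.log(alpha[-1].sum())

def criterion(ll_new, ll_old, eps=1e-6):
    # Convergence test corresponding to the equation above,
    # applied to log-likelihood values.
    return abs(ll_new - ll_old) < eps
\end{verbatim}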

The algorithm stops when the change in likelihood is sufficiently small, indicating that the model has converged to a local maximum of the likelihood function.
The parameter estimation procedure is outlined in \autoref{alg:parameter-estimation}.
@@ -149,7 +151,7 @@ \subsection{Baum-Welch Algorithm}\label{subsec:baum-welch}
\begin{codebox}
\Procname{$\proc{Estimate-Parameters}(\mathcal{P}, \mathbf{x}_0, \mathcal{O})$}
\li $\mathbf{x}_n \gets \mathbf{x}_0$
\li \While $\neg\proc{Criterion}(\mathbf{x}{n-1}, \mathbf{x}n)$
\li \While $\neg\proc{Criterion}(\mathbf{x}_{n-1}, \mathbf{x}_n)$
\li \Do $\mathbf{x}_{n - 1} \gets \mathbf{x}_n$
\li $(\alpha, \beta) = \proc{Forward-Backward}(\mathcal{P}(\mathbf{x}_n), \mathcal{O})$
\li $\mathbf{x}_n = \proc{Update}(\mathcal{P}(\mathbf{x}_n), \mathcal{O}, \alpha, \beta)$ \End
\label{alg:parameter-estimation}
\end{algorithm}
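
A direct transcription of \autoref{alg:parameter-estimation} into Python might look as follows. The $\proc{Forward-Backward}$, $\proc{Update}$, and $\proc{Criterion}$ procedures are passed in as callables, since their concrete definitions depend on the model class; the function name and signature are hypothetical.

\begin{verbatim}
def estimate_parameters(x0, observations, forward_backward, update, criterion):
    # Iteratively improve the hypothesis x_n from the previous estimate
    # until the convergence criterion is met, then return x_n.
    x_n = x0
    x_prev = None
    while x_prev is None or not criterion(x_prev, x_n):
        x_prev = x_n
        alpha, beta = forward_backward(x_n, observations)  # E-step
        x_n = update(x_n, observations, alpha, beta)       # M-step
    return x_n
\end{verbatim}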

Starting with initial parameters $\mathbf{x}_0$, the parameter estimation procedure iteratively improves the current hypothesis $\mathbf{x}_n$ using the previous estimate $\mathbf{x}_{n-1}$ until a specified criterion for convergence is met.
The specifics of the $\proc{Forward-Backward}$ and $\proc{Update}$ procedures are detailed in \autoref{subsec:forward-backwards_algorithm} and \autoref{subsec:update-algorithm} from~\cite{p7}.
Starting with initial parameters $\mathbf{x}_0$, the parameter estimation procedure iteratively improves the current hypothesis $\mathbf{x}_n$ using the previous estimate $\mathbf{x}_{n-1}$ until a specified criterion for convergence is met; the algorithm then returns the final estimate $\mathbf{x}_n$.
The specifics of the $\proc{Forward-Backward}$ and $\proc{Update}$ procedures are detailed in \autoref{subsec:forward-backwards_algorithm} and \autoref{subsec:update-algorithm} from~\cite{baum1970maximization}.

\subsection{The Forward-Backward Algorithm}\label{subsec:forward-backwards_algorithm}
For a given \gls{ctmc} $\mathcal{M}$, the forward-backward algorithm computes the forward and backward variables, $\alpha_s(t)$ and $\beta_s(t)$, for each observation sequence $o_0, o_1, \dots, o_{|\mathbf{o}|-1} = \mathbf{o} \in \mathcal{O}$.
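
As a point of reference, the standard discrete-time recursions for the forward and backward variables can be sketched as follows. This is a simplified, hypothetical illustration for a single integer-coded observation sequence over a discrete-time \gls{hmm}; the \gls{ctmc} setting treated in this subsection requires additional care not shown here.

\begin{verbatim}
import numpy as np

def forward_backward(pi, P, omega, obs):
    # alpha[t, s]: likelihood of obs[0..t] and being in state s at time t.
    # beta[t, s]:  likelihood of obs[t+1..] given state s at time t.
    T, S = len(obs), len(pi)
    alpha = np.zeros((T, S))
    beta = np.zeros((T, S))

    alpha[0] = pi * omega[:, obs[0]]
    for t in range(1, T):  # forward recursion
        alpha[t] = (alpha[t - 1] @ P) * omega[:, obs[t]]

    beta[T - 1] = 1.0
    for t in range(T - 2, -1, -1):  # backward recursion
        beta[t] = P @ (omega[:, obs[t + 1]] * beta[t + 1])

    return alpha, beta
\end{verbatim}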