Commit

typo and wording corrections within lecture03
wallscheid committed Mar 25, 2023
1 parent 71e7a51 commit adbd4d0
Showing 3 changed files with 4 additions and 3 deletions.
2 changes: 1 addition & 1 deletion lecture_slides/main.tex
@@ -177,7 +177,7 @@

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%Lecture Include Onlys%%%
-%\includeonly{tex/Lecture02}
+%\includeonly{tex/Lecture03}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\begin{document}
1 change: 1 addition & 0 deletions lecture_slides/tex/Lecture02.tex
@@ -598,6 +598,7 @@ \section{Finite Markov Decision Processes}
\frame{\frametitle{Bellman Expectation Equation (3)}
Inserting \eqref{eq:q_MDP_finite} into \eqref{eq:v_MDP_finite} directly results in:
\begin{equation}
+\label{eq:Bellman_MDP_linear_non_matrix}
v_\pi(x_k) = \sum_{u_k\in\mathcal{U}}\pi(u_k|x_k)\left(\mathcal{R}^u_x + \gamma\sum_{x_{k+1}\in\mathcal{X}}p_{xx'}^u v_\pi(x_{k+1})\right) \, .
\end{equation}
\pause
4 changes: 2 additions & 2 deletions lecture_slides/tex/Lecture03.tex
@@ -221,11 +221,11 @@ \section{Policy Evaluation}
%% Iterative Policy Evaluation by Richardson Iteration (1)%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\frame{\frametitle{Iterative Policy Evaluation by Richardson Iteration (1)}
-General form for any $x_k\in\mathcal{X}$ at iteration $i$ is given as:
+Applying the Richardson iteration \eqref{eq:richardson_general} to the Bellman equation \eqref{eq:Bellman_MDP_linear_non_matrix} for any $x_k\in\mathcal{X}$ at iteration $i$ results in:
\begin{equation}
v_{i+1}(x_k) = \sum_{u_k\in\mathcal{U}}\bm{\pi}(u_k|x_k)\left(\mathcal{R}^u_x + \gamma\sum_{x_{k+1}\in\mathcal{X}}p_{xx'}^u v_{i}(x_{k+1})\right)\, .
\end{equation}\pause
-Matrix form then is:
+Matrix form based on \eqref{eq:Bellman_MDP_linear} then is:
\begin{equation}
\label{eq:iterative_policy_eval_matrix}
\bm{v}_{\mathcal{X},i+1}^{\pi} =\bm{r}_{\mathcal{X}}^{\pi}+\gamma\bm{\mathcal{P}}_{xx'}^{\pi}\bm{v}_{\mathcal{X},i}^{\pi}\, .
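
The element-wise update shown in this hunk can be sketched in a few lines of plain Python. This is only an illustrative sketch, not code from the lecture repository: the function name `policy_evaluation`, the nested-list layout of `policy`, `rewards`, and `transitions`, the stopping tolerance, and the small two-state example are all assumptions made here.

```python
def policy_evaluation(policy, rewards, transitions, gamma, tol=1e-10, max_iter=10_000):
    """Iterative policy evaluation (hypothetical helper, not from the repository).

    Repeats v_{i+1}(x) = sum_u pi(u|x) * (R[x][u] + gamma * sum_x' p[x][u][x'] * v_i(x'))
    until the largest change between two sweeps drops below tol.

    policy[x][u]         -- probability of action u in state x (assumed layout)
    rewards[x][u]        -- expected reward for action u in state x (assumed layout)
    transitions[x][u][y] -- probability of reaching state y from x under u (assumed layout)
    """
    n_states = len(rewards)
    v = [0.0] * n_states  # initial guess v_0
    for _ in range(max_iter):
        v_next = []
        for x in range(n_states):
            total = 0.0
            for u, pi_ux in enumerate(policy[x]):
                # Expected value of the successor state under the current estimate v_i
                expected_next = sum(p * v[y] for y, p in enumerate(transitions[x][u]))
                total += pi_ux * (rewards[x][u] + gamma * expected_next)
            v_next.append(total)
        delta = max(abs(a - b) for a, b in zip(v_next, v))
        v = v_next  # synchronous update: all states use the values of iteration i
        if delta < tol:
            break
    return v


# Made-up two-state, two-action example with a uniform random policy.
policy = [[0.5, 0.5], [0.5, 0.5]]
rewards = [[1.0, 0.0], [0.0, 2.0]]
# Action 0 always leads to state 0, action 1 always leads to state 1.
transitions = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 0.0], [0.0, 1.0]],
]
values = policy_evaluation(policy, rewards, transitions, gamma=0.9)
print(values)
```

Because the sweep builds `v_next` from the previous `v` only, it mirrors the matrix form, where the whole value vector of iteration $i$ is mapped to iteration $i+1$ at once.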