-
Notifications
You must be signed in to change notification settings - Fork 3
/
xworkload.tex
68 lines (60 loc) · 3.97 KB
/
xworkload.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
% -*- mode: latex; mode: visual-line; fill-column: 9999; coding: utf-8 -*-
\subsection{Effect of \RcompIO on Scaling Performance}
\label{sec:increasedworkload}
For an $X$-fold increase in workload, we expected the workload for the computation to scale with $X$ as $\tcomp(X) = N_{\text{frames}}^{\text{total}} X \overline{\tcomp^{\text{frame}}}$ while the read I/O workload $\tIO(X) = N_{\text{frames}}^{\text{total}} \overline{\tIO^{\text{frame}}}$ (number of frames times the average time to read a frame) should remain independent of $X$.
Therefore, the ratio for any $X$ should be $\RcompIO(X) = \tcomp(X)/\tIO(X) = X \RcompIO(X=1)$, i.e., $\RcompIO$ should just linearly scale with the workload factor $X$.
The measured $\RcompIO$ ratios of 11, 19, 27 for the increased computational workloads agreed with this theoretical analysis, as shown in Table \ref{tab:load-ratio}.
\begin{SCtable}[1.0][!htb]
\centering
\caption[Change in load-ratio with RMSD workload]{Change in $\RcompIO$ ratio with change in the RMSD workload $X$.
The RMSD workload was artificially increased in order to examine the effect of compute to I/O ratio on the performance.
The reported compute and I/O time were measured based on the serial version using one core.
The theoretical $\RcompIO$ (see text) is provided for comparison.}
\label{tab:load-ratio}
\begin{tabular}{rrrrr}
\toprule
\bfseries\thead{Workload $X$} & \bfseries\thead{$\tcomp$ (s)} & \bfseries\thead{$\tIO$ (s)}
& \multicolumn{2}{c}{\bfseries\thead{$\RcompIO$}}\\
& & & \thead{measured} & \thead{theoretical}\\
\midrule
$1\times$ & 226 & 791 & 0.29 & \\
$40\times$ & 8655 & 791 & 11 & 11\\
$70\times$ & 15148 & 791 & 19 & 20\\
$100\times$ & 21639 & 791 & 27 & 29\\
\bottomrule
\end{tabular}
\end{SCtable}
\begin{figure}[!htb]
\centering
\begin{subfigure} {.3\textwidth}
\includegraphics[width=\linewidth]{figures/Compute_to_IO_ratio_on_performance_2d_v17.pdf}
\caption{Speed-Up}
\label{fig:S1_tcomp_tIO_effect}
\end{subfigure}
\hfill
\begin{subfigure}{.3\textwidth}
\includegraphics[width=\linewidth]{figures/Compute_to_IO_ratio_on_performance_2d_2_v17.pdf}
\caption{Speed-Up}
\label{fig:S2_tcomp_tIO_effect}
\end{subfigure}
\hfill
\begin{subfigure}{.3\textwidth}
\includegraphics[width=\linewidth]{figures/Compute_to_IO_ratio_on_performance_2d_3_v17.pdf}
\caption{Efficiency}
\label{fig:E_tcomp_tIO_effect}
\end{subfigure}
\caption{Effect of $\RcompIO$ ratio on performance of the RMSD task on \emph{SDSC Comet}. We tested performance for $\RcompIO$ ratios of 0.3, 11, 19, 27, which correspond to $1\times$ RMSD, $40\times$ RMSD, $70\times$ RMSD, and $100\times$ RMSD respectively.
(a) Effect of $\RcompIO$ on the speed-up.
(b) Change in speed-up with respect to $\RcompIO$ for different processor counts.
(c) Change in the efficiency with respect to $\RcompIO$ for different processor counts.}
\label{fig:tcomp_tIO_effect}
\end{figure}
We performed the experiments with increased workload to measure the effect of the $\RcompIO$ ratio (Eq.~\ref{eq:Compute-IO}) on performance (Figure~\ref{fig:tcomp_tIO_effect}).
The strong scaling performance as measured by the speed-up $S(N)$ improved with increasing $\RcompIO$ ratio (Figure \ref{fig:S1_tcomp_tIO_effect}).
The calculations consistently showed better scaling performance to larger numbers of cores for higher $\RcompIO$ ratios, e.g., $N=56$ cores for the $70\times$ RMSD task.
The speed-up and efficiency approached their ideal value for each processor count with increasing $\RcompIO$ ratio (Figures \ref{fig:S2_tcomp_tIO_effect} and \ref{fig:E_tcomp_tIO_effect}).
Even for moderately compute-bound workloads, such as the $40\times$ and $70\times$ RMSD tasks, increasing the computational workload over I/O reduced the impact of stragglers even though they still contributed to large variations in timing across different ranks and thus irregular scaling.
%%% Local Variables:
%%% mode: latex
%%% TeX-master: "main"
%%% End: