2022.08.11 Meeting Notes
Philipp Grete edited this page Aug 12, 2022 · 2 revisions
- Individual/group updates
- Developer meeting
- Load balance results
- Slides: Parthenon_Perf_Aug10-11.pdf
- Goals: build an intuitive model for how different input params affect performance, and use it for
- load-balancing/optimizing problems
- assessing the potential for using in-network capabilities
- for Phoebus relativistic blast wave problem ~100k messages are exchanged per timestep (at 512 ranks)
- interesting use case because relativistic sim makes load per block non-uniform
- 128^3 RG with 16^3 blocks
- on 512 CPU cores about 30k steps (4.5 hours)
- on avg 500ms per step
- rough times:
- load balancing (LB) is negligible
- boundary comm has a lot of variance
- AllGather in UpdateMeshBlockTree is most expensive
- overall this problem is representative of AMR workloads
- meshblock count per rank is
- a good proxy for the flux correction phase
- not a good proxy for boundary comm (still huge variance, biggest impact on stragglers)
- a good proxy for FillDerived
- AllGather as global "barrier" collects all imbalance of the previous three phases
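A quick sanity check of the numbers above, plus a toy model of why a collective such as the AllGather ends up absorbing the imbalance of the phases before it. This is only a sketch, not Parthenon code: it assumes "RG" means root grid, and the function names, phase means, and jitter value are made up for illustration rather than measured.

```python
import random

# Figures quoted in the notes.
ROOT_GRID = 128     # cells per dimension (assuming "RG" = root grid)
BLOCK = 16          # cells per dimension of one meshblock
STEPS = 30_000      # "about 30k steps"
WALL_HOURS = 4.5    # total wall time on 512 CPU cores


def root_blocks(grid=ROOT_GRID, block=BLOCK):
    """Number of meshblocks that tile the root grid, before any refinement."""
    per_dim = grid // block
    return per_dim ** 3


def avg_step_seconds(hours=WALL_HOURS, steps=STEPS):
    """Average wall time per step implied by the totals."""
    return hours * 3600.0 / steps


def simulate_step(n_ranks, phase_means, jitter, rng):
    """Toy model: every rank runs the phases on its own, then all ranks
    meet at a collective. Returns (time the collective can complete,
    mean time a rank spends waiting for the slowest rank)."""
    arrival = []
    for _ in range(n_ranks):
        # Each phase takes its mean time, perturbed by +/- jitter (relative).
        t = sum(rng.uniform(m * (1 - jitter), m * (1 + jitter))
                for m in phase_means)
        arrival.append(t)
    slowest = max(arrival)
    mean_wait = sum(slowest - t for t in arrival) / n_ranks
    return slowest, mean_wait


if __name__ == "__main__":
    print("root grid meshblocks:", root_blocks())
    print("avg seconds per step:", avg_step_seconds())
    # Hypothetical per-phase means in seconds (flux corr, boundary comm,
    # FillDerived) and a hypothetical 30% jitter; not measured values.
    rng = random.Random(0)
    slowest, wait = simulate_step(512, [0.1, 0.2, 0.1], 0.3, rng)
    print("collective completes at:", slowest, "mean wait per rank:", wait)
```

The root grid works out to 8^3 = 512 meshblocks, i.e. one per rank at 512 ranks before refinement, and 4.5 hours over 30k steps gives 0.54 s per step, consistent with the ~500 ms average. In the toy model the wait time shows up entirely at the collective, even though the variance comes from the earlier phases.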
Need to finalize list of participants soon (especially foreign nationals) so that they can be processed.
PG (just keeping a note here so that I don't forget)
- Have almost all scaling data I want for the paper
- Fixed data transpose for outputs
- Allow adding params from cmd line
- SZ3 compression for outputs (PR tbd)
- (Cylindrical & spherical coordinate work in AthenaPK thanks to SL)
Next meeting in two weeks.