2023.09.21 Meeting Notes
Philipp Grete edited this page Oct 5, 2023 · 3 revisions
- Individual/group updates
- review non-WIP PRs
LR
- further working on multigrid
- CG and BiCGStab now work on uniform grids for Poisson
- Poisson also seems to work for AMR
- still fighting with spatially dependent diffusion coeff
- code not optimized for performance yet
- still tweaking task list (to reduce downstream interface)
- now time to get the infrastructure (pieces) merged into main; concerns:
- logical block sizes
- comm patterns
- logical locations/negative levels
- will create final PR with additional docs, and then ping people for reviews
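For readers less familiar with the solvers mentioned above, here is a minimal sketch of unpreconditioned CG on a 1D Poisson problem (-u'' = f, homogeneous Dirichlet boundaries, uniform grid). It is plain Python written for illustration only and is not the Parthenon implementation; all function names are hypothetical.

```python
import math


def apply_laplacian(u, h):
    """Apply the negative 1D Laplacian (-u'') with zero Dirichlet boundaries."""
    n = len(u)
    out = []
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        out.append((2.0 * u[i] - left - right) / (h * h))
    return out


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def conjugate_gradient(rhs, h, tol=1e-10, max_iter=1000):
    """Solve -u'' = rhs on a uniform grid; returns (solution, iterations used)."""
    u = [0.0] * len(rhs)
    r = list(rhs)  # residual r = rhs - A u for the initial guess u = 0
    p = list(r)    # first search direction
    rr = dot(r, r)
    for it in range(max_iter):
        if rr ** 0.5 < tol:
            return u, it
        Ap = apply_laplacian(p, h)
        alpha = rr / dot(p, Ap)
        u = [ui + alpha * pi for ui, pi in zip(u, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rr_new = dot(r, r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return u, max_iter


# Example: f = pi^2 sin(pi x) on (0, 1), exact solution u = sin(pi x)
n = 31
h = 1.0 / (n + 1)
xs = [(i + 1) * h for i in range(n)]
rhs = [math.pi ** 2 * math.sin(math.pi * x) for x in xs]
u, iters = conjugate_gradient(rhs, h)
max_err = max(abs(ui - math.sin(math.pi * x)) for ui, x in zip(u, xs))
```

The remaining error in `max_err` comes from the second-order discretization, not from the solver. A spatially dependent diffusion coefficient would change `apply_laplacian` only; the CG loop itself stays the same as long as the operator remains symmetric positive definite.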
PM
- Josh discovered a performance regression on CPUs when rebasing Riot to main
- up to 30% impact on integrated performance just because of buffer packing kernels
- Changing kernel structure fixes regression, but results in a slowdown on GPU runs
- Path forward: downstream codes should test the impact of those kernels and then we can decide how to proceed (general versus specialized solution, ...)
- Also compared parthenon-hydro to AthenaK to identify performance impact from different block handling and load balancing. To be reported once more data is available.
JD
- added capability to pack subset of blocks, useful, for example, if there's significant load imbalance (e.g., when nothing happens in some part of the domain)
- -> soft disabling blocks, might also be useful for adaptive timestepping or multigrid
- working on PR for timer based load balancing
- through timer objects inside kernels
- works with (and requires) hierarchical parallelism (so that the timer is at the outer level and reports back to a view in device memory)
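A rough sketch of what timer-based load balancing could look like once per-block timings are available. It assumes (this is not stated in the notes) that blocks are kept in a fixed order and each rank receives a contiguous range of them; the function name and the timing values are made up for illustration.

```python
def partition_by_cost(costs, nranks):
    """Split an ordered list of per-block costs into nranks contiguous ranges.

    Returns a list of (start, stop) index pairs. A range is closed as soon as
    the running cost reaches that rank's share of the total cost.
    """
    total = sum(costs)
    bounds = []
    start = 0
    running = 0.0
    for i, c in enumerate(costs):
        running += c
        rank = len(bounds)
        if rank < nranks - 1 and running >= total * (rank + 1) / nranks:
            bounds.append((start, i + 1))
            start = i + 1
    bounds.append((start, len(costs)))
    # Pad with empty ranges if there were fewer cut points than ranks
    while len(bounds) < nranks:
        bounds.append((len(costs), len(costs)))
    return bounds


# Hypothetical timings: nothing happens in the first four blocks
timings = [0.0, 0.0, 0.0, 0.0, 3.0, 1.0, 1.0, 1.0]
ranges = partition_by_cost(timings, 2)
loads = [sum(timings[a:b]) for a, b in ranges]
```

With these made-up timings, splitting by block count would give the two ranks loads of 0 and 6, while splitting by measured cost gives 3 and 3. This greedy split is only one possible strategy and does not guarantee an optimal balance.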
BP
- preparing new kharma release with AMR and semi-implicit stepping for viscosity
- added a couple of small PRs to Parthenon along the way (QOL, and bug fixes)
- PEP1 is ready for merge (to customize packages, e.g., allows for customizing source terms/streamlining driver)
FG
- working on coordinates
- cyl. coordinates are working (in separate AthenaPK branch for testing)
- sph. should work, but needs testing
- coordinates almost working with yt already
- next step: write example on Parthenon for regression testing and review
- kernel timing for AthenaPK, looks like there's room for improvement both around MPI collectives and individual kernel performance (based on initial roofline models, e.g., 30% of HBM -- compared to 70-80% in K-Athena)
- looks like kernel size is a key issue (on AMD GPUs) due to register pressure (also observed by BP in kharma)
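For context on the roofline numbers above, a small sketch of the two quantities involved: the roofline bound on attainable performance and the fraction of peak memory bandwidth a kernel achieves. All hardware numbers and kernel measurements below are hypothetical placeholders, not AthenaPK or K-Athena data.

```python
def attainable_gflops(arithmetic_intensity, peak_gflops, peak_bandwidth_gbs):
    """Roofline bound: min(compute peak, intensity [flop/byte] * bandwidth [GB/s])."""
    return min(peak_gflops, arithmetic_intensity * peak_bandwidth_gbs)


def bandwidth_fraction(bytes_moved, seconds, peak_bandwidth_gbs):
    """Fraction of peak memory bandwidth achieved by a kernel."""
    achieved_gbs = bytes_moved / seconds / 1e9
    return achieved_gbs / peak_bandwidth_gbs


# Hypothetical device: 10 TFLOP/s peak compute, 1600 GB/s HBM bandwidth
bound_low_ai = attainable_gflops(0.25, 10000.0, 1600.0)    # memory bound
bound_high_ai = attainable_gflops(100.0, 10000.0, 1600.0)  # compute bound

# Hypothetical kernel: moves 4.8 GB in 10 ms
frac = bandwidth_fraction(4.8e9, 0.01, 1600.0)
```

A kernel with low arithmetic intensity sits on the bandwidth-limited part of the roofline, so the achieved fraction of HBM bandwidth is the relevant figure of merit for it.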
PG
- got funding for dedicated CI machine with Nvidia and AMD GPU
- should be ready in about 4 weeks
- PG will review small PRs
- multigrid and packing PR as discussed above