TACC Open Hackathon 2024


Ben Prather edited this page Oct 8, 2024 · 15 revisions

Some notes for organizing our efforts

!!! Need at least three people for every day

Agenda (all times CST)

Tues Oct 8 10 AM – 11:30 AM online
- Meet with mentor
Tues Oct 15 9 AM – 5 PM online
- Cluster intro
- Introductory team presentations
- Work with mentor
Tues Oct 22 – Thurs Oct 24 9 AM – 5 PM hybrid
- Work on code with mentor

Our Goals

Primary

Improve MPI scaling for Parthenon applications with many separately enrolled fields

Ideas
- Use smaller fixed-size communication buffers that are greedily filled and sent repeatedly until all data is exchanged
- Use contiguous buffers large enough to accommodate all fields (ignoring sparsity, i.e. sized as if every sparse field were allocated)
- Others?
Example problem (candidates): parthenon_vibe, advection, fine_advection
- Modify example to vary number of separately enrolled fields at runtime
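
Rough sketch of the first idea above (small fixed-size buffers, greedily filled and sent), just to pin down what "greedy fill" could mean. This is plain C++ with no MPI and no Parthenon types. `FieldData`, `SendInChunks`, and the `send` callback are hypothetical names invented for illustration; in a real implementation `send` would presumably wrap a nonblocking MPI send, and the staging buffer could not be reused until that send completes (so it would need a pool of buffers or a wait).

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for the boundary data of one separately enrolled field.
using FieldData = std::vector<double>;

// Greedily pack all fields into one fixed-capacity staging buffer.
// `send` is called each time the buffer becomes full, and once more at the
// end if a partially filled buffer remains. Returns the number of sends.
std::size_t SendInChunks(const std::vector<FieldData> &fields, std::size_t capacity,
                         const std::function<void(const std::vector<double> &)> &send) {
  if (capacity == 0) throw std::invalid_argument("capacity must be positive");

  std::vector<double> buffer;
  buffer.reserve(capacity);
  std::size_t nsends = 0;

  for (const auto &field : fields) {
    std::size_t offset = 0;
    while (offset < field.size()) {
      // Copy as much of this field as still fits in the buffer.
      const std::size_t n =
          std::min(capacity - buffer.size(), field.size() - offset);
      buffer.insert(buffer.end(), field.begin() + offset,
                    field.begin() + offset + n);
      offset += n;

      // Buffer is full: ship it and start refilling from empty.
      if (buffer.size() == capacity) {
        send(buffer);
        ++nsends;
        buffer.clear();
      }
    }
  }

  // Flush whatever is left over.
  if (!buffer.empty()) {
    send(buffer);
    ++nsends;
  }
  return nsends;
}
```

With this scheme a field can straddle two sends, so the receiver has to unpack in the same field order and track its own offset. The number of messages scales with total data volume divided by buffer capacity rather than with the number of enrolled fields, which is the property we would want to measure against the current per-field behavior.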

Secondary

Improve kernel launching for large meshblocks on single nodes

Example problem:

Diagnose (and improve?) particle efficiency at scale

Example problem: particles-example

Multigrid performance

Example problem:

NCCL/RCCL evaluation

This would be a heavy lift to fully implement
Example problem:

CUDA asynchronous memory copies

Example problem:

Team

Ben Ryan

Secondary goal interests
- Particle scaling

Luke

Secondary goal interests

Philipp

Secondary goal interests

Patrick

Secondary goal interests

Alex

Secondary goal interests

Nirmal

Secondary goal interests

Ben Prather

Secondary goal interests
- Single-meshblock bottlenecks
- Interface for downstreams to add CUDA async copies?

Jonah

Secondary goal interests