Unlimited param count (device inputs) #691

devreal · 2024-11-01T02:51:01Z

Description

The number of flows of a device task is capped by the compile-time constant PARSEC_MAX_PARAM_COUNT, which sizes several member arrays in parsec_task_class_t, parsec_task_t, and parsec_gpu_task_t. The last one is relevant for TTG because it is the only place where we actually use flows, and it limits the number of inputs we can handle per task. PARSEC_MAX_PARAM_COUNT can be controlled through a CMake variable, but we typically do not know the count up front. Even if we did, different tasks have different numbers of inputs, so raising PARSEC_MAX_PARAM_COUNT enlarges every task type, including those that do not need it.

We have not hit this limitation yet, but there is a good chance that we will. For example, some users of MADNESS compute in 6 dimensions, which would need 64+ inputs. We will also look into batching of tasks in 3D, which would quickly exceed the current default of 20 (batching just two levels again needs upwards of 64 inputs).

Describe the solution you'd like

Replace the fixed-size arrays with pointers to dynamically sized arrays. In the case of parsec_gpu_task_t, we would turn the struct of arrays into an array of structs (bundling flow_nb_elts, flow_dc, and sources) and allocate one array of them. For task classes, the extra dynamic allocation should not matter.
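As a rough illustration of the array-of-structs layout described above, here is a minimal sketch. All type and function names (gpu_flow_info_t, gpu_task_sketch_t, and the helpers) are hypothetical and not part of PaRSEC; the opaque types merely stand in for whatever the existing per-flow arrays hold.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical opaque stand-ins for the element types of the existing arrays. */
typedef struct data_collection_s data_collection_t;
typedef struct data_copy_s data_copy_t;

/* One entry per flow, replacing parallel fixed-size arrays (assumed layout). */
typedef struct gpu_flow_info_s {
    size_t             nb_elts; /* cf. flow_nb_elts[i] */
    data_collection_t *dc;      /* cf. flow_dc[i] */
    data_copy_t       *source;  /* cf. sources[i] */
} gpu_flow_info_t;

/* Hypothetical GPU task: a flow count plus a pointer instead of fixed arrays. */
typedef struct gpu_task_sketch_s {
    size_t           nb_flows;
    gpu_flow_info_t *flows; /* NULL when nb_flows == 0 */
} gpu_task_sketch_t;

/* Allocate zero-initialized per-flow entries; returns 0 on success, -1 on failure. */
static int gpu_task_sketch_init(gpu_task_sketch_t *t, size_t nb_flows)
{
    t->nb_flows = 0;
    t->flows = NULL;
    if (nb_flows > 0) {
        t->flows = calloc(nb_flows, sizeof(*t->flows));
        if (NULL == t->flows) return -1;
    }
    t->nb_flows = nb_flows;
    return 0;
}

/* Bounds-checked access to the entry of flow i. */
static gpu_flow_info_t *gpu_task_sketch_flow(gpu_task_sketch_t *t, size_t i)
{
    assert(i < t->nb_flows);
    return &t->flows[i];
}

/* Release the per-flow entries and reset the task to zero flows. */
static void gpu_task_sketch_fini(gpu_task_sketch_t *t)
{
    free(t->flows);
    t->flows = NULL;
    t->nb_flows = 0;
}
```

With this layout a task with 64 inputs pays for 64 entries, while a task without flows pays only for the count and a NULL pointer.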

We should not use flexible array members because they make it impossible to embed a parsec_task_t into another structure and are generally not supported by C++. Instead, adding a pointer that can point to extra memory at the end of the task structure (or even be NULL for zero flows, like regular tasks in TTG) would be preferable. This would also shrink the footprint of tasks in general, since most applications in PTG use small numbers of inputs.
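The pointer-to-trailing-memory idea could look roughly like the sketch below. It assumes a single allocation that holds the enclosing object followed by the per-flow entries; task_sketch_t, task_data_t, ttg_like_task_t, and task_sketch_alloc are hypothetical names used only for illustration, and `_Alignof` requires C11.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-flow entry. */
typedef struct task_data_s {
    void *data_in;
    void *data_out;
} task_data_t;

/* Hypothetical task: a pointer instead of a fixed or flexible array member,
 * so the struct can still be embedded in other structs (and used from C++). */
typedef struct task_sketch_s {
    size_t       nb_flows;
    task_data_t *data; /* NULL, or points just past the enclosing object */
} task_sketch_t;

/* Example of a derived task embedding the base task as a regular member. */
typedef struct ttg_like_task_s {
    task_sketch_t base;
    int           payload;
} ttg_like_task_t;

/* Allocate obj_size bytes for the enclosing object plus trailing storage for
 * nb_flows entries. task_offset is the offset of the embedded task_sketch_t.
 * Returns the zero-initialized enclosing object, or NULL on failure. */
static void *task_sketch_alloc(size_t obj_size, size_t task_offset, size_t nb_flows)
{
    assert(task_offset + sizeof(task_sketch_t) <= obj_size);
    /* Round the object size up so the trailing entries are properly aligned. */
    const size_t align = _Alignof(task_data_t);
    const size_t data_offset = (obj_size + align - 1) / align * align;
    char *base = calloc(1, data_offset + nb_flows * sizeof(task_data_t));
    if (NULL == base) return NULL;
    task_sketch_t *task = (task_sketch_t *)(base + task_offset);
    task->nb_flows = nb_flows;
    task->data = (nb_flows > 0) ? (task_data_t *)(base + data_offset) : NULL;
    return base;
}
```

A single free() of the returned pointer releases both the object and its trailing entries; a task with zero flows carries only the count and a NULL pointer.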

I understand that bitmaps are used to encode which flows are in use, so that would have to change as well: we would have to introduce more flexible bitmaps that are not tied to the width of a single integer.
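One possible shape for such a flexible bitmap is an array of 64-bit words sized from the flow count. This is only a sketch under that assumption; flow_bitmap_t and its helpers are hypothetical names and do not reflect how PaRSEC currently encodes flow masks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical bitmap with one bit per flow, stored in 64-bit words. */
typedef struct flow_bitmap_s {
    size_t    nb_bits;
    uint64_t *words; /* NULL when nb_bits == 0 */
} flow_bitmap_t;

/* Number of 64-bit words needed to hold nb_bits bits. */
static size_t flow_bitmap_nb_words(size_t nb_bits)
{
    return (nb_bits + 63) / 64;
}

/* Allocate a cleared bitmap; returns 0 on success, -1 on failure. */
static int flow_bitmap_init(flow_bitmap_t *bm, size_t nb_bits)
{
    bm->nb_bits = 0;
    bm->words = NULL;
    if (nb_bits > 0) {
        bm->words = calloc(flow_bitmap_nb_words(nb_bits), sizeof(uint64_t));
        if (NULL == bm->words) return -1;
    }
    bm->nb_bits = nb_bits;
    return 0;
}

/* Mark flow i as used. */
static void flow_bitmap_set(flow_bitmap_t *bm, size_t i)
{
    assert(i < bm->nb_bits);
    bm->words[i / 64] |= UINT64_C(1) << (i % 64);
}

/* Return 1 if flow i is marked as used, 0 otherwise. */
static int flow_bitmap_test(const flow_bitmap_t *bm, size_t i)
{
    assert(i < bm->nb_bits);
    return (int)((bm->words[i / 64] >> (i % 64)) & 1u);
}

/* Release the word storage. */
static void flow_bitmap_fini(flow_bitmap_t *bm)
{
    free(bm->words);
    bm->words = NULL;
    bm->nb_bits = 0;
}
```

For small flow counts, a variant could keep a single inline word and fall back to heap storage only above 64 flows, which would avoid the extra allocation in the common case.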

Describe alternatives you've considered

The current design would work if we chose a high compile-time default in MRA once we start batching, but that is wasteful for most tasks in the system. We would also still be limited by whatever integer type PaRSEC uses for the bitmap.



devreal added the enhancement New feature or request label Nov 1, 2024
