When using the heuristic scheduler, we segfault in the scheduler code when there is no suitable back-end for an operator:
I0615 23:11:03.220222 21852 scheduler_dynamic.cc:558] The minimum cost of running the DAG: 100001
I0615 23:11:03.220257 21852 scheduler_dynamic.cc:562] Cur cost: 100001
Program received signal SIGSEGV, Segmentation fault.
0x000000000059c494 in musketeer::scheduling::SchedulerDynamic::ComputeOptimal (this=0x83c800, serial_dag=std::vector of length 3, capacity 4 = {...})
at /home/malte/Projects/musketeer/src/scheduling/scheduler_dynamic.cc:563
563 uint32_t prev_jobs_exec = parent[cur_cost][cur_jobs_exec];
To reproduce, craft a minimal DAG that contains an operator that cannot be expressed in any of the available execution engines (i.e., for which the framework's scoring method returns FLAGS_max_scheduler_cost), and try to schedule it using the heuristic scheduler.
The solution is two-fold:
1. Fail in a more sensible way.
2. Notice that the DAG is impossible and bail out early, before even invoking the scheduling heuristic.
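
A rough sketch of what the early bail-out could look like, under stated assumptions: this is not code from the Musketeer tree. `FindUnschedulableOperator`, the `costs` table layout, and `kMaxSchedulerCost` (a stand-in for `FLAGS_max_scheduler_cost`, with an assumed value) are all hypothetical names introduced for illustration. The idea is simply to check, before the heuristic runs, that every operator has at least one framework whose score is below the maximum cost. It needs C++17 for `std::optional`.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for FLAGS_max_scheduler_cost; the value is an
// assumption made for this sketch only.
constexpr uint32_t kMaxSchedulerCost = 100000;

// costs[i][j] is the (hypothetical) score of running operator i on
// framework j. A score >= kMaxSchedulerCost means "framework j cannot
// run operator i".
//
// Returns the index of the first operator that no framework can run, or
// std::nullopt if every operator has at least one feasible back-end.
std::optional<std::size_t> FindUnschedulableOperator(
    const std::vector<std::vector<uint32_t>>& costs) {
  for (std::size_t i = 0; i < costs.size(); ++i) {
    const bool feasible =
        std::any_of(costs[i].begin(), costs[i].end(),
                    [](uint32_t c) { return c < kMaxSchedulerCost; });
    if (!feasible) {
      return i;  // No back-end for operator i: the DAG cannot be scheduled.
    }
  }
  return std::nullopt;
}
```

A caller would run this check first and, if it returns an index, log which operator has no suitable back-end and exit cleanly instead of entering the scheduling heuristic. That would cover both points above: the failure becomes an explicit error message rather than a segfault, and it happens before any scheduling work is done.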
This is a workaround for the problem in #4: when the workflow includes
an operator that none of the available frameworks support, the DAG cost
will be in excess of FLAGS_max_scheduler_cost, and the DP algorithm no
longer works.
Thanks @n1v0lg for reporting this issue.