Exhaustive scheduler dispatches jobs backwards #7

Open
ms705 opened this issue Jul 6, 2016 · 0 comments

This issue occurs only in @n1v0lg's fork of Musketeer, when using the non-mergeable operators of the Viff framework. The equivalent job in stock Musketeer does not exhibit this issue, possibly because the GroupBy operator that replaces GroupBySEC is mergeable.

The input is this Mindi program.

Trace:

$ build/musketeer --dry_run --run_daemon=0 --beer_query=tests/foo.rap --root_dir=/tmp/ --output_ir_dag_gv --use_frameworks="hadoop-viff" --use_heuristic=false
I0706 18:22:53.907582  1994 musketeer.cc:184] Adding Hadoop Framework
I0706 18:22:53.907663  1994 musketeer.cc:208] Adding Viff (MPC) Framework
I0706 18:22:53.907677  1994 musketeer.cc:267] Looking for new Job to schedule
digraph OpDAG {
node [shape=box]; edges_sel [label="15"]; 
edges_sel->sum_group_by [label="edges_sel"];
sum_group_by->sec_sum_group_by [label="sum_group_by"];
}
I0706 18:22:53.908088  1994 musketeer.cc:315] Scheduling entire DAG
I0706 18:22:53.908100  1994 scheduler_dynamic.cc:170] Determine inputs size for DAG
IsGeneratedByOp 0
I0706 18:22:53.908126  1994 utils.cc:373] edges is an input
IsGeneratedByOp 1
I0706 18:22:53.908154  1994 utils.cc:378] edges_sel is not an input
IsGeneratedByOp 1
I0706 18:22:53.908176  1994 utils.cc:378] sum_group_by is not an input
I0706 18:22:53.908200  1994 scheduler_dynamic.cc:195] Size of: edges is: 0
I0706 18:22:53.908217  1994 scheduler_dynamic.cc:240] DynamicSchedule DAG
I0706 18:22:53.908227  1994 scheduler_dynamic.cc:439] Topological order: edges_sel
I0706 18:22:53.908238  1994 scheduler_dynamic.cc:439] Topological order: sum_group_by
I0706 18:22:53.908252  1994 scheduler_dynamic.cc:439] Topological order: sec_sum_group_by
I0706 18:22:53.908264  1994 utils.cc:260] Node order after optimisation: edges_sel
I0706 18:22:53.908273  1994 utils.cc:260] Node order after optimisation: sum_group_by
I0706 18:22:53.908280  1994 utils.cc:260] Node order after optimisation: sec_sum_group_by
I0706 18:22:53.908291  1994 scheduler_dynamic.cc:342] Refresh rel size of edges_sel is 0
I0706 18:22:53.908303  1994 scheduler_dynamic.cc:342] Refresh rel size of sum_group_by is 0
I0706 18:22:53.908315  1994 scheduler_dynamic.cc:342] Refresh rel size of sec_sum_group_by is 0
SELECT
IsGeneratedByOp 0
I0706 18:22:53.930142  1994 utils.cc:373] edges is an input
IsGeneratedByOp 1
I0706 18:22:53.930179  1994 utils.cc:378] edges_sel is not an input
IsGeneratedByOp 1
I0706 18:22:53.930198  1994 utils.cc:378] sum_group_by is not an input
SELECT
size of node set to schedule: 1
###DAG###
I0706 18:22:53.930243  1994 utils.cc:100] DAG input node: edges_sel
I0706 18:22:53.930253  1994 utils.cc:114] DAG edge: edges_sel sum_group_by
I0706 18:22:53.930263  1994 utils.cc:114] DAG edge: sum_group_by sec_sum_group_by
###DAG###
AGG
IsGeneratedByOp 1
I0706 18:22:53.930299  1994 utils.cc:378] edges_sel is not an input
IsGeneratedByOp 1
I0706 18:22:53.930320  1994 utils.cc:378] sum_group_by is not an input
AGG
size of node set to schedule: 1
###DAG###
I0706 18:22:53.930359  1994 utils.cc:100] DAG input node: sum_group_by
I0706 18:22:53.930371  1994 utils.cc:114] DAG edge: sum_group_by sec_sum_group_by
###DAG###
SELECT
AGG
IsGeneratedByOp 0
I0706 18:22:53.930418  1994 utils.cc:373] edges is an input
IsGeneratedByOp 1
I0706 18:22:53.930444  1994 utils.cc:378] edges_sel is not an input
IsGeneratedByOp 1
I0706 18:22:53.930461  1994 utils.cc:378] sum_group_by is not an input
SELECT
AGG
size of node set to schedule: 2
###DAG###
I0706 18:22:53.930505  1994 utils.cc:100] DAG input node: edges_sel
I0706 18:22:53.930516  1994 utils.cc:114] DAG edge: edges_sel sum_group_by
I0706 18:22:53.930526  1994 utils.cc:114] DAG edge: sum_group_by sec_sum_group_by
###DAG###
AGG_SEC
IsGeneratedByOp 1
I0706 18:22:53.930559  1994 utils.cc:378] sum_group_by is not an input
AGG_SEC
size of node set to schedule: 1
###DAG###
I0706 18:22:53.930595  1994 utils.cc:100] DAG input node: sec_sum_group_by
###DAG###
IsGeneratedByOp 1
I0706 18:22:53.930624  1994 utils.cc:378] sum_group_by is not an input
Secure operator detected.
SELECT
AGG_SEC
SELECT
AGG_SEC
size of node set to schedule: 2
###DAG###
I0706 18:22:53.930680  1994 utils.cc:100] DAG input node: edges_sel
I0706 18:22:53.930691  1994 utils.cc:114] DAG edge: edges_sel sum_group_by
I0706 18:22:53.930702  1994 utils.cc:114] DAG edge: sum_group_by sec_sum_group_by
###DAG###
AGG
AGG_SEC
IsGeneratedByOp 1
I0706 18:22:53.930747  1994 utils.cc:378] edges_sel is not an input
IsGeneratedByOp 1
I0706 18:22:53.930768  1994 utils.cc:378] sum_group_by is not an input
AGG
AGG_SEC
size of node set to schedule: 2
###DAG###
I0706 18:22:53.930814  1994 utils.cc:100] DAG input node: sum_group_by
I0706 18:22:53.930824  1994 utils.cc:114] DAG edge: sum_group_by sec_sum_group_by
###DAG###
SELECT
AGG
AGG_SEC
IsGeneratedByOp 0
I0706 18:22:53.930876  1994 utils.cc:373] edges is an input
IsGeneratedByOp 1
I0706 18:22:53.930897  1994 utils.cc:378] edges_sel is not an input
IsGeneratedByOp 1
I0706 18:22:53.930918  1994 utils.cc:378] sum_group_by is not an input
SELECT
AGG
AGG_SEC
size of node set to schedule: 3
###DAG###
I0706 18:22:53.930970  1994 utils.cc:100] DAG input node: edges_sel
I0706 18:22:53.930981  1994 utils.cc:114] DAG edge: edges_sel sum_group_by
I0706 18:22:53.930992  1994 utils.cc:114] DAG edge: sum_group_by sec_sum_group_by
###DAG###
I0706 18:22:53.931013  1994 scheduler_dynamic.cc:558] The minimum cost of running the DAG: 21
I0706 18:22:53.931025  1994 scheduler_dynamic.cc:562] Cur cost: 21
I0706 18:22:53.931032  1994 scheduler_dynamic.cc:565] ---------- Job boundary ----------
I0706 18:22:53.931041  1994 scheduler_dynamic.cc:569] edges_sel
I0706 18:22:53.931049  1994 scheduler_dynamic.cc:569] sum_group_by
I0706 18:22:53.931061  1994 scheduler_dynamic.cc:562] Cur cost: 1
I0706 18:22:53.931068  1994 scheduler_dynamic.cc:565] ---------- Job boundary ----------
I0706 18:22:53.931077  1994 scheduler_dynamic.cc:569] sec_sum_group_by
OUTPUT OUTPUT
viff
I0706 18:22:53.941830  1994 utils.cc:100] DAG input node: sec_sum_group_by
hadoop
I0706 18:22:53.941892  1994 utils.cc:100] DAG input node: edges_sel
I0706 18:22:53.941901  1994 utils.cc:100] DAG input node: sum_group_by
I0706 18:22:53.941912  1994 utils.cc:114] DAG edge: edges_sel sum_group_by
I0706 18:22:53.941922  1994 utils.cc:114] DAG edge: sum_group_by sec_sum_group_by
SCHEDULER TIME: 0.033615
I0706 18:22:53.942006  1994 scheduler_dynamic.cc:300] Dispatching relation sec_sum_group_by in framework viff
I0706 18:22:53.942023  1994 translator_viff.cc:116] Viff generate code
I0706 18:22:53.942040  1994 translator_viff.cc:102] Job input: /tmp/sum_group_by/
FileInputFormat.addInputPath(job, new Path("/tmp/sum_group_by/"));
 String[] sum_group_by = value.toString().trim().split(" ");

I0706 18:22:53.945426  1994 scheduler_dynamic.cc:170] Determine inputs size for DAG
IsGeneratedByOp 1
I0706 18:22:53.945461  1994 utils.cc:378] sum_group_by is not an input
I0706 18:22:53.945487  1994 scheduler_dynamic.cc:761] Size of output: sec_sum_group_by is: 0
I0706 18:22:53.945516  1994 scheduler_dynamic.cc:283] Running operators 1 1 on viff
I0706 18:22:53.945529  1994 scheduler_dynamic.cc:289] Number of operators scheduled: 1
I0706 18:22:53.945544  1994 scheduler_dynamic.cc:342] Refresh rel size of sum_group_by is 0
I0706 18:22:53.945554  1994 scheduler_dynamic.cc:342] Refresh rel size of sec_sum_group_by is 0
AGG
IsGeneratedByOp 1
I0706 18:22:53.961684  1994 utils.cc:378] edges_sel is not an input
IsGeneratedByOp 1
I0706 18:22:53.961711  1994 utils.cc:378] sum_group_by is not an input
AGG
size of node set to schedule: 1
###DAG###
I0706 18:22:53.961752  1994 utils.cc:100] DAG input node: sum_group_by
I0706 18:22:53.961765  1994 utils.cc:114] DAG edge: sum_group_by sec_sum_group_by
###DAG###
AGG_SEC
IsGeneratedByOp 1
I0706 18:22:53.961807  1994 utils.cc:378] sum_group_by is not an input
AGG_SEC
size of node set to schedule: 1
###DAG###
I0706 18:22:53.961844  1994 utils.cc:100] DAG input node: sec_sum_group_by
###DAG###
IsGeneratedByOp 1
I0706 18:22:53.961874  1994 utils.cc:378] sum_group_by is not an input
Secure operator detected.
AGG
AGG_SEC
IsGeneratedByOp 1
I0706 18:22:53.961918  1994 utils.cc:378] edges_sel is not an input
IsGeneratedByOp 1
I0706 18:22:53.961941  1994 utils.cc:378] sum_group_by is not an input
AGG
AGG_SEC
size of node set to schedule: 2
###DAG###
I0706 18:22:53.961987  1994 utils.cc:100] DAG input node: sum_group_by
I0706 18:22:53.961997  1994 utils.cc:114] DAG edge: sum_group_by sec_sum_group_by
###DAG###
I0706 18:22:53.962016  1994 scheduler_dynamic.cc:558] The minimum cost of running the DAG: 21
I0706 18:22:53.962025  1994 scheduler_dynamic.cc:562] Cur cost: 21
I0706 18:22:53.962033  1994 scheduler_dynamic.cc:565] ---------- Job boundary ----------
I0706 18:22:53.962043  1994 scheduler_dynamic.cc:569] sum_group_by
I0706 18:22:53.962052  1994 scheduler_dynamic.cc:562] Cur cost: 1
I0706 18:22:53.962059  1994 scheduler_dynamic.cc:565] ---------- Job boundary ----------
I0706 18:22:53.962067  1994 scheduler_dynamic.cc:569] sec_sum_group_by
OUTPUT OUTPUT
viff
I0706 18:22:53.970327  1994 utils.cc:100] DAG input node: sec_sum_group_by
hadoop
I0706 18:22:53.970353  1994 utils.cc:100] DAG input node: sum_group_by
I0706 18:22:53.970367  1994 utils.cc:114] DAG edge: sum_group_by sec_sum_group_by
SCHEDULER TIME: 0.024822
I0706 18:22:53.970415  1994 scheduler_dynamic.cc:300] Dispatching relation sec_sum_group_by in framework viff
I0706 18:22:53.970428  1994 translator_viff.cc:116] Viff generate code
I0706 18:22:53.970444  1994 translator_viff.cc:102] Job input: /tmp/sum_group_by/
FileInputFormat.addInputPath(job, new Path("/tmp/sum_group_by/"));
 String[] sum_group_by = value.toString().trim().split(" ");

I0706 18:22:53.973784  1994 scheduler_dynamic.cc:170] Determine inputs size for DAG
IsGeneratedByOp 1
I0706 18:22:53.973822  1994 utils.cc:378] sum_group_by is not an input
I0706 18:22:53.973845  1994 scheduler_dynamic.cc:761] Size of output: sec_sum_group_by is: 0
I0706 18:22:53.973867  1994 scheduler_dynamic.cc:283] Running operators 2 2 on viff
I0706 18:22:53.973878  1994 scheduler_dynamic.cc:289] Number of operators scheduled: 1
I0706 18:22:53.973892  1994 scheduler_dynamic.cc:342] Refresh rel size of sec_sum_group_by is 0
AGG_SEC
IsGeneratedByOp 1
I0706 18:22:53.988021  1994 utils.cc:378] sum_group_by is not an input
AGG_SEC
size of node set to schedule: 1
###DAG###
I0706 18:22:53.988067  1994 utils.cc:100] DAG input node: sec_sum_group_by
###DAG###
IsGeneratedByOp 1
I0706 18:22:53.988098  1994 utils.cc:378] sum_group_by is not an input
Secure operator detected.
I0706 18:22:53.988116  1994 scheduler_dynamic.cc:558] The minimum cost of running the DAG: 1
I0706 18:22:53.988126  1994 scheduler_dynamic.cc:562] Cur cost: 1
I0706 18:22:53.988137  1994 scheduler_dynamic.cc:565] ---------- Job boundary ----------
I0706 18:22:53.988147  1994 scheduler_dynamic.cc:569] sec_sum_group_by
OUTPUT OUTPUT
viff
I0706 18:22:53.996474  1994 utils.cc:100] DAG input node: sec_sum_group_by
SCHEDULER TIME: 0.022593
I0706 18:22:53.996526  1994 scheduler_dynamic.cc:300] Dispatching relation sec_sum_group_by in framework viff
I0706 18:22:53.996539  1994 translator_viff.cc:116] Viff generate code
I0706 18:22:53.996556  1994 translator_viff.cc:102] Job input: /tmp/sum_group_by/
FileInputFormat.addInputPath(job, new Path("/tmp/sum_group_by/"));
 String[] sum_group_by = value.toString().trim().split(" ");

I0706 18:22:53.999891  1994 scheduler_dynamic.cc:170] Determine inputs size for DAG
IsGeneratedByOp 1
I0706 18:22:53.999929  1994 utils.cc:378] sum_group_by is not an input
I0706 18:22:53.999953  1994 scheduler_dynamic.cc:761] Size of output: sec_sum_group_by is: 0
I0706 18:22:53.999974  1994 scheduler_dynamic.cc:283] Running operators 3 3 on viff
I0706 18:22:53.999984  1994 scheduler_dynamic.cc:289] Number of operators scheduled: 1
I0706 18:22:54.000041  1994 musketeer.cc:339] Finished scheduling job

Note that sec_sum_group_by is dispatched first (the first "Dispatching relation" line in the trace targets it on viff), before sum_group_by has run. This is the wrong way around, since sec_sum_group_by depends on the output of sum_group_by.
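
To make the expected behaviour concrete, here is a minimal sketch of dispatching jobs in dependency order. It is not Musketeer's scheduler code: the types `Op`, `Job` and `Edge` and the function `DispatchOrder` are hypothetical names invented for this illustration, and the sketch assumes each job is simply a list of operator names and that the DAG edges are available as producer/consumer pairs. It collapses the operator DAG into a job-level graph and orders the jobs with Kahn's algorithm.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <queue>
#include <set>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical types for illustration only; not Musketeer's data structures.
using Op = std::string;
using Job = std::vector<Op>;     // operators merged into one job
using Edge = std::pair<Op, Op>;  // producer -> consumer

// Returns indices into `jobs`, ordered so that every job comes after all
// jobs that produce its inputs (Kahn's algorithm on the job-level graph).
std::vector<std::size_t> DispatchOrder(const std::vector<Job>& jobs,
                                       const std::vector<Edge>& edges) {
  // Map each operator to the job that contains it.
  std::map<Op, std::size_t> job_of;
  for (std::size_t j = 0; j < jobs.size(); ++j) {
    for (const Op& op : jobs[j]) {
      assert(job_of.count(op) == 0 && "operator assigned to two jobs");
      job_of[op] = j;
    }
  }

  // Lift operator edges to job edges, skipping edges inside a single job.
  std::vector<std::set<std::size_t>> succ(jobs.size());
  std::vector<std::size_t> indegree(jobs.size(), 0);
  for (const Edge& e : edges) {
    auto from_it = job_of.find(e.first);
    auto to_it = job_of.find(e.second);
    // Edges touching operators outside any job are ignored.
    if (from_it == job_of.end() || to_it == job_of.end()) continue;
    std::size_t from = from_it->second;
    std::size_t to = to_it->second;
    if (from != to && succ[from].insert(to).second) ++indegree[to];
  }

  // Jobs with no unscheduled producers are ready to dispatch.
  std::queue<std::size_t> ready;
  for (std::size_t j = 0; j < jobs.size(); ++j) {
    if (indegree[j] == 0) ready.push(j);
  }

  std::vector<std::size_t> order;
  while (!ready.empty()) {
    std::size_t j = ready.front();
    ready.pop();
    order.push_back(j);
    for (std::size_t next : succ[j]) {
      if (--indegree[next] == 0) ready.push(next);
    }
  }
  if (order.size() != jobs.size()) {
    throw std::runtime_error("cyclic dependency between jobs");
  }
  return order;
}
```

Given the two jobs from the trace, {edges_sel, sum_group_by} on hadoop and {sec_sum_group_by} on viff, together with the two DAG edges printed above, this ordering places the hadoop job first even if the viff job is listed first in the input.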
