Discussion:
The child procedure is scheduled in the reversed order in Procedure V2 Framework
OpenInx
2018-11-28 14:42:36 UTC
Permalink
Hi :

I read parts of the procedure v2 framework, and found that if a procedure
has 3 added child procedure,
then it's children will be schedued in the reversed order.

Let me give an example. if a procedure A added 3 child procedure: B, C ,
D.

a.addChildProcedure(B, C, D)

The procedure framework will add the B,C,D child produre in a dequeue to
schedule

( Code Path --- ProcedureExecutor#execProcedure
-- submitChildrenProcedures -- dequeue#addFront )

So the dequeue will be : (front) D, C, B (back)

Then we will poll each procedure from the dequeue, and dispatch to the
executor to run ..

In the final, the procedure executing order will be : D, C, B, which is
the revered order compared to the addChildProcedure order.

My question is : is it designed intentionally ? Or unintentionally doing
the wrong implementation ?

Seems most the child procedure are region assign or unassign, looks like no
problem now..

Please correct me if I am wrong or missing something.

Thanks.
Allan Yang
2018-11-28 15:13:16 UTC
Permalink
Yes, you are right, every procedure will be add to front, so the final
execution order may be reversed, But actually, there will be more than one
worker threads, so likely, they will be executed at the same time. I think
the design is unintentionally, since all the child procedure should be
independent and won't depend on each other, so they can be executed at any
order. And more, after HBASE-21375
<https://issues.apache.org/jira/browse/HBASE-21375> checked in all 2.x
branch, the worker thread will execute every possible procedure in the
queue, so the front ones won't block, so this won't be a problem.
Best Regards
Allan Yang
Post by OpenInx
I read parts of the procedure v2 framework, and found that if a procedure
has 3 added child procedure,
then it's children will be schedued in the reversed order.
Let me give an example. if a procedure A added 3 child procedure: B, C ,
D.
a.addChildProcedure(B, C, D)
The procedure framework will add the B,C,D child produre in a dequeue to
schedule
( Code Path --- ProcedureExecutor#execProcedure
-- submitChildrenProcedures -- dequeue#addFront )
So the dequeue will be : (front) D, C, B (back)
Then we will poll each procedure from the dequeue, and dispatch to the
executor to run ..
In the final, the procedure executing order will be : D, C, B, which is
the revered order compared to the addChildProcedure order.
My question is : is it designed intentionally ? Or unintentionally doing
the wrong implementation ?
Seems most the child procedure are region assign or unassign, looks like no
problem now..
Please correct me if I am wrong or missing something.
Thanks.
OpenInx
2018-11-29 04:09:52 UTC
Permalink
Thanks Allan's reply

So if we can ensure that all children are independent, then it won't be a
problem.
But the reversed shcedule order is some counterintuitive. I think a minor
change can
help fix this:

diff --git
a/hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureExecutor.java
b/hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureExecutor.java
index cbdb9b8..b688ec3 100644
---
a/hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureExecutor.java
+++
b/hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureExecutor.java
@@ -1794,7 +1794,7 @@ public class ProcedureExecutor<TEnvironment> {
}

private void submitChildrenProcedures(Procedure<TEnvironment>[]
subprocs) {
- for (int i = 0; i < subprocs.length; ++i) {
+ for (int i = subprocs.length - 1; i >= 0; i--) {
Procedure<TEnvironment> subproc = subprocs[i];
subproc.updateMetricsOnSubmit(getEnvironment());
assert !procedures.containsKey(subproc.getProcId());

Thanks.
Post by Allan Yang
Yes, you are right, every procedure will be add to front, so the final
execution order may be reversed, But actually, there will be more than one
worker threads, so likely, they will be executed at the same time. I think
the design is unintentionally, since all the child procedure should be
independent and won't depend on each other, so they can be executed at any
order. And more, after HBASE-21375
<https://issues.apache.org/jira/browse/HBASE-21375> checked in all 2.x
branch, the worker thread will execute every possible procedure in the
queue, so the front ones won't block, so this won't be a problem.
Best Regards
Allan Yang
Post by OpenInx
I read parts of the procedure v2 framework, and found that if a
procedure
Post by OpenInx
has 3 added child procedure,
then it's children will be schedued in the reversed order.
Let me give an example. if a procedure A added 3 child procedure: B, C ,
D.
a.addChildProcedure(B, C, D)
The procedure framework will add the B,C,D child produre in a dequeue to
schedule
( Code Path --- ProcedureExecutor#execProcedure
-- submitChildrenProcedures -- dequeue#addFront )
So the dequeue will be : (front) D, C, B (back)
Then we will poll each procedure from the dequeue, and dispatch to the
executor to run ..
In the final, the procedure executing order will be : D, C, B, which
is
Post by OpenInx
the revered order compared to the addChildProcedure order.
My question is : is it designed intentionally ? Or unintentionally
doing
Post by OpenInx
the wrong implementation ?
Seems most the child procedure are region assign or unassign, looks like
no
Post by OpenInx
problem now..
Please correct me if I am wrong or missing something.
Thanks.
Misty Linville
2018-12-01 05:55:54 UTC
Permalink
I like your solution in principle, but I think it would be better to loop
until the queue reports that it's empty.
Post by OpenInx
Thanks Allan's reply
So if we can ensure that all children are independent, then it won't be a
problem.
But the reversed shcedule order is some counterintuitive. I think a minor
change can
diff --git
a/hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureExecutor.java
b/hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureExecutor.java
index cbdb9b8..b688ec3 100644
---
a/hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureExecutor.java
+++
b/hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureExecutor.java
@@ -1794,7 +1794,7 @@ public class ProcedureExecutor<TEnvironment> {
}
private void submitChildrenProcedures(Procedure<TEnvironment>[]
subprocs) {
- for (int i = 0; i < subprocs.length; ++i) {
+ for (int i = subprocs.length - 1; i >= 0; i--) {
Procedure<TEnvironment> subproc = subprocs[i];
subproc.updateMetricsOnSubmit(getEnvironment());
assert !procedures.containsKey(subproc.getProcId());
Thanks.
Post by Allan Yang
Yes, you are right, every procedure will be add to front, so the final
execution order may be reversed, But actually, there will be more than
one
Post by Allan Yang
worker threads, so likely, they will be executed at the same time. I
think
Post by Allan Yang
the design is unintentionally, since all the child procedure should be
independent and won't depend on each other, so they can be executed at
any
Post by Allan Yang
order. And more, after HBASE-21375
<https://issues.apache.org/jira/browse/HBASE-21375> checked in all 2.x
branch, the worker thread will execute every possible procedure in the
queue, so the front ones won't block, so this won't be a problem.
Best Regards
Allan Yang
Post by OpenInx
I read parts of the procedure v2 framework, and found that if a
procedure
Post by OpenInx
has 3 added child procedure,
then it's children will be schedued in the reversed order.
Let me give an example. if a procedure A added 3 child procedure: B,
C ,
Post by Allan Yang
Post by OpenInx
D.
a.addChildProcedure(B, C, D)
The procedure framework will add the B,C,D child produre in a dequeue
to
Post by Allan Yang
Post by OpenInx
schedule
( Code Path --- ProcedureExecutor#execProcedure
-- submitChildrenProcedures -- dequeue#addFront )
So the dequeue will be : (front) D, C, B (back)
Then we will poll each procedure from the dequeue, and dispatch to the
executor to run ..
In the final, the procedure executing order will be : D, C, B,
which
Post by Allan Yang
is
Post by OpenInx
the revered order compared to the addChildProcedure order.
My question is : is it designed intentionally ? Or unintentionally
doing
Post by OpenInx
the wrong implementation ?
Seems most the child procedure are region assign or unassign, looks
like
Post by Allan Yang
no
Post by OpenInx
problem now..
Please correct me if I am wrong or missing something.
Thanks.
张铎(Duo Zhang)
2018-12-01 09:44:04 UTC
Permalink
I think it is fine to keep the current implementation? What is the problem
if we schedule in reverse order? We do not guarantee any order when
executing the sub procedures...
Post by Misty Linville
I like your solution in principle, but I think it would be better to loop
until the queue reports that it's empty.
Post by OpenInx
Thanks Allan's reply
So if we can ensure that all children are independent, then it won't be a
problem.
But the reversed shcedule order is some counterintuitive. I think a
minor
Post by OpenInx
change can
diff --git
a/hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureExecutor.java
b/hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureExecutor.java
Post by OpenInx
index cbdb9b8..b688ec3 100644
---
a/hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureExecutor.java
Post by OpenInx
+++
b/hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureExecutor.java
Post by OpenInx
@@ -1794,7 +1794,7 @@ public class ProcedureExecutor<TEnvironment> {
}
private void submitChildrenProcedures(Procedure<TEnvironment>[]
subprocs) {
- for (int i = 0; i < subprocs.length; ++i) {
+ for (int i = subprocs.length - 1; i >= 0; i--) {
Procedure<TEnvironment> subproc = subprocs[i];
subproc.updateMetricsOnSubmit(getEnvironment());
assert !procedures.containsKey(subproc.getProcId());
Thanks.
Post by Allan Yang
Yes, you are right, every procedure will be add to front, so the final
execution order may be reversed, But actually, there will be more than
one
Post by Allan Yang
worker threads, so likely, they will be executed at the same time. I
think
Post by Allan Yang
the design is unintentionally, since all the child procedure should be
independent and won't depend on each other, so they can be executed at
any
Post by Allan Yang
order. And more, after HBASE-21375
<https://issues.apache.org/jira/browse/HBASE-21375> checked in all 2.x
branch, the worker thread will execute every possible procedure in the
queue, so the front ones won't block, so this won't be a problem.
Best Regards
Allan Yang
Post by OpenInx
I read parts of the procedure v2 framework, and found that if a
procedure
Post by OpenInx
has 3 added child procedure,
then it's children will be schedued in the reversed order.
Let me give an example. if a procedure A added 3 child procedure: B,
C ,
Post by Allan Yang
Post by OpenInx
D.
a.addChildProcedure(B, C, D)
The procedure framework will add the B,C,D child produre in a dequeue
to
Post by Allan Yang
Post by OpenInx
schedule
( Code Path --- ProcedureExecutor#execProcedure
-- submitChildrenProcedures -- dequeue#addFront )
So the dequeue will be : (front) D, C, B (back)
Then we will poll each procedure from the dequeue, and dispatch to
the
Post by OpenInx
Post by Allan Yang
Post by OpenInx
executor to run ..
In the final, the procedure executing order will be : D, C, B,
which
Post by Allan Yang
is
Post by OpenInx
the revered order compared to the addChildProcedure order.
My question is : is it designed intentionally ? Or unintentionally
doing
Post by OpenInx
the wrong implementation ?
Seems most the child procedure are region assign or unassign, looks
like
Post by Allan Yang
no
Post by OpenInx
problem now..
Please correct me if I am wrong or missing something.
Thanks.
Misty Linville
2018-12-01 15:26:09 UTC
Permalink
People could come to rely on the order.

The reason I suggested you should rely on the queue being empty to break
the loop instead of getting the number of elements at the start is to avoid
falling off the end of the queue in the presence of multiple threads.
Post by 张铎(Duo Zhang)
I think it is fine to keep the current implementation? What is the problem
if we schedule in reverse order? We do not guarantee any order when
executing the sub procedures...
Post by Misty Linville
I like your solution in principle, but I think it would be better to loop
until the queue reports that it's empty.
Post by OpenInx
Thanks Allan's reply
So if we can ensure that all children are independent, then it won't
be a
Post by Misty Linville
Post by OpenInx
problem.
But the reversed shcedule order is some counterintuitive. I think a
minor
Post by OpenInx
change can
diff --git
a/hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureExecutor.java
b/hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureExecutor.java
Post by Misty Linville
Post by OpenInx
index cbdb9b8..b688ec3 100644
---
a/hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureExecutor.java
Post by Misty Linville
Post by OpenInx
+++
b/hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureExecutor.java
Post by Misty Linville
Post by OpenInx
@@ -1794,7 +1794,7 @@ public class ProcedureExecutor<TEnvironment> {
}
private void submitChildrenProcedures(Procedure<TEnvironment>[]
subprocs) {
- for (int i = 0; i < subprocs.length; ++i) {
+ for (int i = subprocs.length - 1; i >= 0; i--) {
Procedure<TEnvironment> subproc = subprocs[i];
subproc.updateMetricsOnSubmit(getEnvironment());
assert !procedures.containsKey(subproc.getProcId());
Thanks.
Post by Allan Yang
Yes, you are right, every procedure will be add to front, so the
final
Post by Misty Linville
Post by OpenInx
Post by Allan Yang
execution order may be reversed, But actually, there will be more
than
Post by Misty Linville
Post by OpenInx
one
Post by Allan Yang
worker threads, so likely, they will be executed at the same time. I
think
Post by Allan Yang
the design is unintentionally, since all the child procedure should
be
Post by Misty Linville
Post by OpenInx
Post by Allan Yang
independent and won't depend on each other, so they can be executed
at
Post by Misty Linville
Post by OpenInx
any
Post by Allan Yang
order. And more, after HBASE-21375
<https://issues.apache.org/jira/browse/HBASE-21375> checked in all
2.x
Post by Misty Linville
Post by OpenInx
Post by Allan Yang
branch, the worker thread will execute every possible procedure in
the
Post by Misty Linville
Post by OpenInx
Post by Allan Yang
queue, so the front ones won't block, so this won't be a problem.
Best Regards
Allan Yang
Post by OpenInx
I read parts of the procedure v2 framework, and found that if a
procedure
Post by OpenInx
has 3 added child procedure,
then it's children will be schedued in the reversed order.
B,
Post by Misty Linville
Post by OpenInx
C ,
Post by Allan Yang
Post by OpenInx
D.
a.addChildProcedure(B, C, D)
The procedure framework will add the B,C,D child produre in a
dequeue
Post by Misty Linville
Post by OpenInx
to
Post by Allan Yang
Post by OpenInx
schedule
( Code Path --- ProcedureExecutor#execProcedure
-- submitChildrenProcedures -- dequeue#addFront )
So the dequeue will be : (front) D, C, B (back)
Then we will poll each procedure from the dequeue, and dispatch to
the
Post by OpenInx
Post by Allan Yang
Post by OpenInx
executor to run ..
In the final, the procedure executing order will be : D, C, B,
which
Post by Allan Yang
is
Post by OpenInx
the revered order compared to the addChildProcedure order.
My question is : is it designed intentionally ? Or
unintentionally
Post by Misty Linville
Post by OpenInx
Post by Allan Yang
doing
Post by OpenInx
the wrong implementation ?
Seems most the child procedure are region assign or unassign, looks
like
Post by Allan Yang
no
Post by OpenInx
problem now..
Please correct me if I am wrong or missing something.
Thanks.
Loading...