Message ID | 20201204163315.68538-1-luc.vanoostenryck@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | experimental: code sinking | expand |
On Fri, Dec 4, 2020 at 8:34 AM Luc Van Oostenryck <luc.vanoostenryck@gmail.com> wrote: > > A lot of the false 'context imbalance' warnings are caused by > a potential jump-threading being blocked between 2 conditional > branches on the same condition because the second CBR belong > to a non-empty BB. Often the offending instructions can be moved > to some other BB, sometimes even with some added advantages. I hope you don't just use the context imbalance as a sign of this being a good idea, because a lot of the context imbalances are likely real and valid. I do think moving the instruction to the (single) user sounds like a good idea in some cases, but I'm a bit worried about doing it quite this mindlessly. It can expand on liveness a lot - while the liveness of the result of the sunk instruction shrinks, the liveness of the sources to the sunk instruction can grow a lot. That's obviously a non-issue for the use of sparse as an analysis tool (and that's clearly the primary use), but I'd still like to think that code generation might matter. So I think this might be better with more heuristics. And explaining them. Right now you have one heuristic: you only sink instructions from bb's that end with a conditional branch. I'm not entirely sure that I understand the reason for that heuristic, it smells a bit arbitrary to me (I suspect it was the case you saw when looking at examples). On that note: would also be lovely to actually see examples of what this results in - and not necessarily about just the context imbalance again. There might be cases where instruction sinking makes sense even outside the "can we empty this bb entirely" issue. Not that I can think of any, but I wonder if this could be used to actually shrink liveness regions (if both the inputs to the sunk instruction are live _anyway_ at the target, then sinking the instruction should actually improve liveness in general, for example). Linus
On Fri, Dec 04, 2020 at 09:13:50AM -0800, Linus Torvalds wrote: > On Fri, Dec 4, 2020 at 8:34 AM Luc Van Oostenryck > <luc.vanoostenryck@gmail.com> wrote: > > > > A lot of the false 'context imbalance' warnings are caused by > > a potential jump-threading being blocked between 2 conditional > > branches on the same condition because the second CBR belong > > to a non-empty BB. Often the offending instructions can be moved > > to some other BB, sometimes even with some added advantages. > > I hope you don't just use the context imbalance as a sign of this > being a good idea, because a lot of the context imbalances are likely > real and valid. Not exactly, but I confess that currently I'm focusing a lot on the '*false* context imbalance' (those I can see in the C code or in the IR that should be OK but sparse warning on them nevertheless). I've not a very clear idea of how much of those warnings are real (but I'm begin to think more and more than most are real). By far, the most common cause of these false warnings is a CBR-CBR on the same condition that is not simplified away because the second one is not empty. > I do think moving the instruction to the (single) user sounds like a > good idea in some cases, but I'm a bit worried about doing it quite > this mindlessly. It can expand on liveness a lot - while the liveness > of the result of the sunk instruction shrinks, the liveness of the > sources to the sunk instruction can grow a lot. > > That's obviously a non-issue for the use of sparse as an analysis tool > (and that's clearly the primary use), but I'd still like to think that > code generation might matter. Yes, I agree for both points. > So I think this might be better with more heuristics. And explaining > them. Right now you have one heuristic: you only sink instructions > from bb's that end with a conditional branch. I'm not entirely sure > that I understand the reason for that heuristic, it smells a bit > arbitrary to me (I suspect it was the case you saw when looking at > examples). Yes, indeed, it's arbitrary because it's the only case I'm interested about for these 'false context imbalance'. But no worries, as explained in the patch description, it's not my intention to merge this, certainly not as-is. It's more a kind of experiment to play with, a kind of exploratory tool. > On that note: would also be lovely to actually see examples of what > this results in - and not necessarily about just the context imbalance > again. Yes, I'll add this (but I'm not sure to have anything significant not related to emptying a BB ending with a CBR). > There might be cases where instruction sinking makes sense even > outside the "can we empty this bb entirely" issue. Not that I can > think of any, but I wonder if this could be used to actually shrink > liveness regions (if both the inputs to the sunk instruction are live > _anyway_ at the target, then sinking the instruction should actually > improve liveness in general, for example). I don't think I understand this. In the case of an UNOP, sinking it increase the liveness and decrease the liveness of the output, so it should not matter much. In the case of an BINOP or select, sinking it will decrease the liveness of the unique output but increase the liveness of the inputs. So, it seems to me that sinking would globally increase the liveness (contrary to moving up instructions). Am I missing something? -- Luc
On Fri, Dec 4, 2020 at 9:45 AM Luc Van Oostenryck <luc.vanoostenryck@gmail.com> wrote: > > > There might be cases where instruction sinking makes sense even > > outside the "can we empty this bb entirely" issue. Not that I can > > think of any, but I wonder if this could be used to actually shrink > > liveness regions (if both the inputs to the sunk instruction are live > > _anyway_ at the target, then sinking the instruction should actually > > improve liveness in general, for example). > > I don't think I understand this. In the case of an UNOP, sinking it > increase the liveness and decrease the liveness of the output, so > it should not matter much. Right. The UNOP case should be a no-op from a liveness perspective, but: > In the case of an BINOP or select, sinking > it will decrease the liveness of the unique output but increase the > liveness of the inputs. So, it seems to me that sinking would > globally increase the liveness (contrary to moving up instructions). > Am I missing something? No, moving a binop could actually *shrink* liveness under the right circumstances - namely when the sources of the binop are live regardless. Completely stupid example that makes no sense, and only exists to illustrate the issue: int diff(int x, int y); int fn2(int x, int y, int sum, int diff); int test(int x, int y) { int sum = x+y; return fn2(x, y, sum, diff(x,y)); } which generates add.32 %r3 <- %arg1, %arg2 call.32 %r9 <- fn1, %arg1, %arg2 call.32 %r10 <- fn2, %arg1, %arg2, %r3, %r9 ret.32 %r10 but it would actually improve liveness if that "add" was moved down - because even though it "expands" the liveness of %arg1/arg2 by moving the use of those down, both of those argument pseudos have later uses _anyway_. So that expansion of liveness is a non-issue. Instead, it shrinks the liveness region of %r3. Ergo, it actually shrinks liveness region in the big picture. Now, the above stupid example is one single bb, so in that sense it's not really relevant for your inter-bb movement, but that doesn't actually change the argument at all. Insert a conditional in there to get a multi-bb case. Linus
On Fri, Dec 04, 2020 at 10:15:30AM -0800, Linus Torvalds wrote: > On Fri, Dec 4, 2020 at 9:45 AM Luc Van Oostenryck > <luc.vanoostenryck@gmail.com> wrote: > > > > > There might be cases where instruction sinking makes sense even > > > outside the "can we empty this bb entirely" issue. Not that I can > > > think of any, but I wonder if this could be used to actually shrink > > > liveness regions (if both the inputs to the sunk instruction are live > > > _anyway_ at the target, then sinking the instruction should actually > > > improve liveness in general, for example). > > > > I don't think I understand this. In the case of an UNOP, sinking it > > increase the liveness and decrease the liveness of the output, so > > it should not matter much. > > Right. The UNOP case should be a no-op from a liveness perspective, but: > > > In the case of an BINOP or select, sinking > > it will decrease the liveness of the unique output but increase the > > liveness of the inputs. So, it seems to me that sinking would > > globally increase the liveness (contrary to moving up instructions). > > Am I missing something? > > No, moving a binop could actually *shrink* liveness under the right > circumstances - namely when the sources of the binop are live > regardless. Hmm yes, indeed. I thought about that but I also implicitly thought there was a dual effect for the output, but there isn't. And so the 'cost' is not the same before the 'last occurrence of other uses' than after it. In short: "moving down is good but only when not too low, unless it's an unop". Anyway, the idea would be to do such moves only if they would effectively empty and even then it's clear what is the exact advantage (other than for imbalance). Also, moving it to the point where it is needed was very easy. Moving it just past the CBR (or somewhere in the middle) will be more complicated. I'll see. -- Luc
diff --git a/Makefile b/Makefile index 313664467151..5ba54659f625 100644 --- a/Makefile +++ b/Makefile @@ -35,6 +35,7 @@ LIB_OBJS := LIB_OBJS += allocate.o LIB_OBJS += builtin.o LIB_OBJS += char.o +LIB_OBJS += code-sink.o LIB_OBJS += compat-$(OS).o LIB_OBJS += cse.o LIB_OBJS += dissect.o diff --git a/code-sink.c b/code-sink.c new file mode 100644 index 000000000000..566ddec028a0 --- /dev/null +++ b/code-sink.c @@ -0,0 +1,92 @@ +#include "optimize.h" +#include "lib.h" +#include "linearize.h" + + +static inline struct instruction *get_user(pseudo_t p) +{ + struct pseudo_user *pu; + + FOR_EACH_PTR(p->users, pu) { + if (!pu) + continue; + return pu->insn; + } END_FOR_EACH_PTR(pu); + return NULL; +} + +static bool sink_insn(struct instruction *insn, struct basic_block *bb) +{ + struct instruction *curr; + + FOR_EACH_PTR(bb->insns, curr) { + if (!curr->bb) + continue; + if (curr->opcode == OP_PHI) + continue; + INSERT_CURRENT(insn, curr); + insn->bb = bb; + return true; + } END_FOR_EACH_PTR(curr); + return false; +} + +static int code_sink_bb(struct basic_block *bb) +{ + struct instruction *insn; + int changed = 0; + + FOR_EACH_PTR_REVERSE(bb->insns, insn) { + struct instruction *user; + pseudo_t target; + + if (!insn->bb) + continue; + switch (insn->opcode) { + case OP_BINARY ... OP_BINCMP_END: + case OP_UNOP ... OP_UNOP_END: + case OP_SYMADDR: + case OP_SLICE: + case OP_SEL: case OP_FMADD: + case OP_LABEL: case OP_SETVAL: case OP_SETFVAL: + break; + case OP_CBR: + case OP_INLINED_CALL: + case OP_NOP: + continue; + default: + continue; + } + + target = insn->target; + if (!one_use(target)) + continue; + user = get_user(target); + if (!user || !user->bb || user->bb == bb) + continue; + if (!sink_insn(insn, user->bb)) + continue; + DELETE_CURRENT_PTR(insn); + changed = 1; + } END_FOR_EACH_PTR_REVERSE(insn); + return changed; +} + +int code_sink(struct entrypoint *ep) +{ + struct basic_block *bb; + int changed = 0; + + FOR_EACH_PTR(ep->bbs, bb) { + struct instruction *last; + + if (!bb->ep) + continue; + last = last_instruction(bb->insns); + switch (last->opcode) { + case OP_CBR: + changed |= code_sink_bb(bb); + } + } END_FOR_EACH_PTR(bb); + return changed; +} diff --git a/optimize.c b/optimize.c index 3351e67b9d5e..b652b0e76d2a 100644 --- a/optimize.c +++ b/optimize.c @@ -105,6 +105,8 @@ repeat: pack_basic_blocks(ep); if (repeat_phase & REPEAT_CFG_CLEANUP) cleanup_cfg(ep); + if (code_sink(ep)) + repeat_phase |= REPEAT_CSE; } while (repeat_phase); vrfy_flow(ep); diff --git a/optimize.h b/optimize.h index 31e2cf081704..d9ac9cd48ea2 100644 --- a/optimize.h +++ b/optimize.h @@ -6,4 +6,7 @@ struct entrypoint; /* optimize.c */ void optimize(struct entrypoint *ep); +/* sink.c */ +int code_sink(struct entrypoint *ep); + #endif
A lot of the false 'context imbalance' warnings are caused by a potential jump-threading being blocked between 2 conditional branches on the same condition because the second CBR belong to a non-empty BB. Often the offending instructions can be moved to some other BB, sometimes even with some added advantages. This patch help a bit with these false warnings by doing a limited form of code sinking: blocking instructions with a single user are moved in the BB where they're used, possibly making the original BB empty and thus making the jump-threading possible. Note: It's not the intention to use the patch as is. Ideally, it should first be checked if the original BB can be made empty before moving the instructions around, but this should be coordinated with other ways of moving these instructions. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> --- Makefile | 1 + code-sink.c | 92 +++++++++++++++++++++++++++++++++++++++++++++++++++++ optimize.c | 2 ++ optimize.h | 3 ++ 4 files changed, 98 insertions(+) create mode 100644 code-sink.c