mbox series

[00/10] simplify and canonicalize signed compares

Message ID 20210126220432.58265-1-luc.vanoostenryck@gmail.com (mailing list archive)
Headers show
Series simplify and canonicalize signed compares | expand

Message

Luc Van Oostenryck Jan. 26, 2021, 10:04 p.m. UTC
This series fixes and improves the simplification and the
canonicalization of signed compares.

Luc Van Oostenryck (10):
  cmps: make clearer we're using the operands' size
  cmps: fix simplification of sext(x) + signed compare of {SMAX,SMIN}
  cmpu: fix canonicalization of unsigned (x {<,>=} C) --> (x {<=,>} C-1)
  cmps: add testcases for simplification of signed compares
  cmps: simplify signed compares with SMIN or SMAX
  cmps: canonicalize signed compares with SMIN/SMAX
  cmps: canonicalize SMIN/SMAX +- 1 --> EQ/NE
  cmps: canonicalize signed compares with constant
  cmps: canonicalize SEL(x {<,<=} y, a, b) --> SEL(x {>=,>} y, b, a)
  cmps: canonicalize SEL(x > 0, a, -a) --> SEL(x >= 0, a, -a)

 simplify.c                               | 73 +++++++++++++++++++++---
 validation/optim/canonical-abs.c         | 11 ++++
 validation/optim/canonical-cmpe-minmax.c | 16 ++++++
 validation/optim/canonical-cmps-minmax.c | 16 ++++++
 validation/optim/canonical-cmps-sel.c    | 25 ++++++++
 validation/optim/canonical-cmps.c        | 16 ++++++
 validation/optim/cmp-sext-simm.c         | 46 +++++++++++----
 validation/optim/cmps-minmax.c           | 16 ++++++
 8 files changed, 200 insertions(+), 19 deletions(-)
 create mode 100644 validation/optim/canonical-abs.c
 create mode 100644 validation/optim/canonical-cmpe-minmax.c
 create mode 100644 validation/optim/canonical-cmps-minmax.c
 create mode 100644 validation/optim/canonical-cmps-sel.c
 create mode 100644 validation/optim/canonical-cmps.c
 create mode 100644 validation/optim/cmps-minmax.c


base-commit: 0fb77bb6e5429575f52b5e26f06db031f93de057

Comments

Linus Torvalds April 17, 2021, 5:16 p.m. UTC | #1
On Tue, Jan 26, 2021 at 7:45 PM Luc Van Oostenryck
<luc.vanoostenryck@gmail.com> wrote:
>
> This series fixes and improves the simplification and the
> canonicalization of signed compares.

Hmm. Sorry for not replying earlier, but I just checked the most
common simplification of signed compares, and it didn't work.

This:

    _Bool test(int a)
    {
        return a >=0 && a < 16;
    }

should simplify to be the same as

    _Bool test(int a)
    {
        return (unsigned)a < 16;
    }

but it doesn't. It generates the silly - but straightforward - "two
comparisons and a 'and' of the result".

In fact, the recent canonicalizations means that the compare against
zero is actually pessimised, and ">= 0" becomes "> 0xffffffff", which
is often a much more expensive operation.

This came up because I was looking at some kernel code that did
exactly that "check that a signed value is within proper bounds", and
the zero check is a very common bound.

               Linus
Luc Van Oostenryck April 17, 2021, 6:20 p.m. UTC | #2
On Sat, Apr 17, 2021 at 10:16:31AM -0700, Linus Torvalds wrote:
> On Tue, Jan 26, 2021 at 7:45 PM Luc Van Oostenryck
> <luc.vanoostenryck@gmail.com> wrote:
> >
> > This series fixes and improves the simplification and the
> > canonicalization of signed compares.
> 
> Hmm. Sorry for not replying earlier, but I just checked the most
> common simplification of signed compares, and it didn't work.
> 
> This:
> 
>     _Bool test(int a)
>     {
>         return a >=0 && a < 16;
>     }
> 
> should simplify to be the same as
> 
>     _Bool test(int a)
>     {
>         return (unsigned)a < 16;
>     }
> 
> but it doesn't. It generates the silly - but straightforward - "two
> comparisons and a 'and' of the result".

Yes, I've a draft for this but I still needs to add some tests and such.

> In fact, the recent canonicalizations means that the compare against
> zero is actually pessimised, and ">= 0" becomes "> 0xffffffff", which
> is often a much more expensive operation.

Yes, I'm aware of the problem, more or less.
When I did the canonicalization, I wondered what was better:
* canonicalize toward 0
* canonicalize toward the smallest

I choose the later because at the moment it was somehow advantageous
because it reduced by 2 some patterns (it allows to eliminate all >=
and all <; when doing it toward 0 you can eliminate one set for positive
and the other one for negatives values so you need both sets). The
compare with -1 instead of with 0 didn't looked much problematic and
relatively rare. But in the following days I realized it was in fact
quite annoying.

So, yes, it's something I'll change very soon.

-- Luc