Message ID | 20201216222448.2054-1-luc.vanoostenryck@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | warn when zero-extending a negation | expand |
On Wed, Dec 16, 2020 at 2:25 PM Luc Van Oostenryck <luc.vanoostenryck@gmail.com> wrote: > > I suppose that it is fine for you that I your SoB instead of the > 'Originally-by' I used here? Either works for me. Some of the cases I saw (from my very quick look) were because of annoying zero extensions that should have been optimized away. Example: static inline unsigned char bytemask(int bit) { return ~(1<<bit); } unsigned char maskout(unsigned char a, int bit) { return a & bytemask(bit); } note how this is all safe, because everything is operating on just bytes. But sparse complains: t.c:3:21: warning: zero-extending a negation - upper bits not negated because obviously the usual C type expansion to 'int'. But that really is because sparse is benign stupid, and leaves those type expansions around even though they then get truncated down again. You can see it in the code generation too: Zero-extend, and then truncate: zext.32 %r2 <- (8) %arg1 shl.32 %r5 <- $1, %arg2 trunc.8 %r6 <- (32) %r5 then do the 'not' in 8 bits, because we did that part ok: not.8 %r7 <- %r6 and then the zero-extend and truncate thing again: zext.32 %r9 <- (8) %r7 and.32 %r10 <- %r2, %r9 trunc.8 %r11 <- (32) %r10 and then the return in 8 bits: ret.8 %r11 because sparse doesn't do the simplification to just do the shl and and in 8 bits (but sparse *does* do the simplification to do the 'not' in 8 bits). So the warning comes from the fact that we kept that zero extend around, even though it really wasn't relevant.. I don't know how many of the false positives were due to things like this, but at least a couple were. Linus
On Wed, Dec 16, 2020 at 02:37:04PM -0800, Linus Torvalds wrote: > On Wed, Dec 16, 2020 at 2:25 PM Luc Van Oostenryck > <luc.vanoostenryck@gmail.com> wrote: > > > > I suppose that it is fine for you that I your SoB instead of the > > 'Originally-by' I used here? > > Either works for me. > > Some of the cases I saw (from my very quick look) were because of > annoying zero extensions that should have been optimized away. ... > Zero-extend, and then truncate: > > zext.32 %r2 <- (8) %arg1 > shl.32 %r5 <- $1, %arg2 > trunc.8 %r6 <- (32) %r5 > > then do the 'not' in 8 bits, because we did that part ok: > > not.8 %r7 <- %r6 > > and then the zero-extend and truncate thing again: > > zext.32 %r9 <- (8) %r7 > and.32 %r10 <- %r2, %r9 > trunc.8 %r11 <- (32) %r10 > > and then the return in 8 bits: > > ret.8 %r11 > > because sparse doesn't do the simplification to just do the shl and > and in 8 bits (but sparse *does* do the simplification to do the 'not' > in 8 bits). But replacing a trunc + zext by the corresponding masking, very little, if anything is done for such 'mixed-width' expressions. So, I'm even a bit surprised by the not.8 but well ... I also confess, that coming from an ARM background, seeing a not.8 or a shl.8 seems quite unnatural to me. I would tend to force everything to at least the same width as 'int'. > So the warning comes from the fact that we kept that zero extend > around, even though it really wasn't relevant.. > > I don't know how many of the false positives were due to things like > this, but at least a couple were. Yes, most probably. I suppose my (old old) series about bitfield simplification will help a little bit here. I'll try to look at this during the holidays. -- Luc
On Thu, Dec 17, 2020 at 12:51:52AM +0100, Luc Van Oostenryck wrote: > > But replacing a trunc + zext by the corresponding masking, very > little, if anything is done for such 'mixed-width' expressions. > So, I'm even a bit surprised by the not.8 but well ... This bothered me a bit and kept me awake, so I had to check. I think that the situation is caused by some premature optimization for the ~ operator in expression.c:cast_to(). It saves the allocation and initialization of one expression but makes things more complicated at linearization and simplification. If this is disabled, then the IR simplification returns what I was expecting: maskout: zext.32 %r2 <- (8) %arg1 shl.32 %r5 <- $1, %arg2 not.32 %r6 <- %r5 and.32 %r9 <- %r6, $255 and.32 %r10 <- %r2, %r9 trunc.8 %r11 <- (32) %r10 ret.8 %r11 and some reassociation patches (coming soon) will simplify away the masking with $255 and the trunc.8 -- Luc
diff --git a/linearize.c b/linearize.c index 0250c6bb17ef..b9faac78ebb7 100644 --- a/linearize.c +++ b/linearize.c @@ -2520,6 +2520,27 @@ static void check_tainted_insn(struct instruction *insn) } } +static void check_zero_extend(struct instruction *insn) +{ + struct instruction *def; + pseudo_t src = insn->src1; + + if (!Wzero_extend_negation) + return; + if (src->type != PSEUDO_REG) + return; + def = src->def; + if (!def) + return; + switch (def->opcode) { + case OP_NEG: case OP_NOT: + warning(insn->pos, "zero-extending a negation - upper bits not negated"); + break; + default: + break; + } +} + /// // issue warnings after all possible DCE static void late_warnings(struct entrypoint *ep) @@ -2537,6 +2558,10 @@ static void late_warnings(struct entrypoint *ep) // Check for illegal offsets. check_access(insn); break; + case OP_ZEXT: + // Check for missing sign extension.. + check_zero_extend(insn); + break; } } END_FOR_EACH_PTR(insn); } END_FOR_EACH_PTR(bb); diff --git a/options.c b/options.c index 17da5f367e24..5323ddc05861 100644 --- a/options.c +++ b/options.c @@ -139,6 +139,7 @@ int Wunion_cast = 0; int Wuniversal_initializer = 0; int Wunknown_attribute = 0; int Wvla = 1; +int Wzero_extend_negation = 1; //////////////////////////////////////////////////////////////////////////////// // Helpers for option parsing @@ -884,6 +885,7 @@ static const struct flag warnings[] = { { "universal-initializer", &Wuniversal_initializer }, { "unknown-attribute", &Wunknown_attribute }, { "vla", &Wvla }, + { "zero-extend-negation", &Wzero_extend_negation }, { } }; diff --git a/options.h b/options.h index 0aec8764d27d..3403c9518ead 100644 --- a/options.h +++ b/options.h @@ -138,6 +138,7 @@ extern int Wunion_cast; extern int Wuniversal_initializer; extern int Wunknown_attribute; extern int Wvla; +extern int Wzero_extend_negation; extern char **handle_switch(char *arg, char **next); extern void handle_switch_finalize(void); diff --git a/sparse.1 b/sparse.1 index 430b3710b260..928e3513b9b6 100644 --- a/sparse.1 +++ b/sparse.1 @@ -494,6 +494,14 @@ Warn on casts to union types. Sparse does not issues these warnings by default. . +.TP +.B -Wzero-extend-negation +Warn when an unsigned value is first negated (logical or arithmetical) +and then converted to a wider type. + +Sparse issues these warnings by default. +To turn them off, use \fB-Wno-zero-extend-negation\fR. +. .SH MISC OPTIONS .TP .B \-\-arch=\fIARCH\fR
When an unsigned value is negated (logical or arithmetical) and then converted to a wider type, this value will be zero-extended, not sign-extended. In other words, upper bits won't be negated. This may be the intention but may also be a source of errors. So, add a warning for this. Also, because this warning may be too noise because most catches will possibly be false-positives, add a specific warning flag to disable it: -Wno-zero-extend-negation. Link: https://lore.kernel.org/r/CAHk-=wjiC6UejP6xob9BMQy98O6OLGDhy-qDfaFcOJxo90iOFg@mail.gmail.com Originally-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> --- On my usual test setup (defconfig + allyesconfig) this gives 199 warnings. I only checked a couple of them, they we're false positives but somehow error-prone if some definitions are changed. For example: * struct super_block::s_flags is defined as 'unsigned long', all flags are hold in 32-bits but struct fs_context::sb_flags_mask is defined as 'unsigned int'. * struct inode::i_stat is defined as 'unsigned long', all I_* are defined as (signed) 'int' but some code do 'unsigned dirty = I_DIRTY;' For the moment, I've left the warning enabled by default but it should probably only be enabled at W=1. @Linus, I suppose that it is fine for you that I your SoB instead of the 'Originally-by' I used here? -- Luc linearize.c | 25 +++++++++++++++++++++++++ options.c | 2 ++ options.h | 1 + sparse.1 | 8 ++++++++ 4 files changed, 36 insertions(+)