diff mbox series

block-sha1: remove use of assembly

Message ID 20220307232552.2799122-1-sandals@crustytoothpaste.net (mailing list archive)
State Superseded
Headers show
Series block-sha1: remove use of assembly | expand

Commit Message

brian m. carlson March 7, 2022, 11:25 p.m. UTC
In the block SHA-1 code, we have special assembly code for i386 and
amd64 to perform rotations with assembly.  This is supposed to help pick
the correct rotation operation depending on which rotation is smaller,
which can help some systems perform slightly better, since any circular
rotation can be specified as either a rotate left or a rotate right.
However, this isn't needed, so we should remove it.

First, SHA-1, like SHA-2, uses fixed constant rotates.  Thus, all
rotation amounts are known at compile time and are in fact baked into
the code.  Fortunately, peephole optimizers recognize rotations
specified in the normal way and automatically emit the correct code,
including a preference for choosing a rotate left versus a rotate right.
This has been the case for well over a decade, and is a standard example
of the utility of a peephole optimizer.

Moreover, all modern CPUs, with the exception of extremely limited
embedded CPUs such as some Cortex-M processors, provide a barrel
shifter, which lets the CPU perform rotates of any bit amount in
constant time.  This is valuable for many cryptographic algorithms to
improve performance, and is required to prevent timing attacks in
algorithms which use data-dependent rotations (which don't include the
hash algorithms we use).  As a result, even though the compiler does the
correct optimization, it isn't even needed here and either a left or a
right rotate is equally acceptable.

In fact, the SHA-256 code already takes this into account and just
writes the simple code using an inline function to let the compiler
optimize it for us.

The downside of using this code, however, is that it uses a GCC
extension, which makes the compiler complain when using -pedantic unless
it's prefixed with __extension__.  We could fix that, but since it's
not needed, let's just remove it.  We haven't noticed this because
almost everyone uses the SHA1DC code instead, but it still shows up for
some people.
---
Normally here, I would say, just use the SHA1DC code.  However, at
GitHub, this is wired up to a special piece of code in our Git
implementation that doesn't process attacker controlled data but needs
to persist its internal hash state (to avoid needing to rehash a
potentially large file, a log of operations performed, to which data is
only appended).

Thus, I'm sending in a fix here to avoid the warning, even though it's
my ultimate intention to remove it in favor of the SHA1DC code in the
future.

 block-sha1/sha1.c | 17 -----------------
 1 file changed, 17 deletions(-)

Comments

Taylor Blau March 7, 2022, 11:32 p.m. UTC | #1
On Mon, Mar 07, 2022 at 11:25:52PM +0000, brian m. carlson wrote:
> In the block SHA-1 code, we have special assembly code for i386 and
> amd64 to perform rotations with assembly.  This is supposed to help pick
> the correct rotation operation depending on which rotation is smaller,
> which can help some systems perform slightly better, since any circular
> rotation can be specified as either a rotate left or a rotate right.
> However, this isn't needed, so we should remove it.

At -O1 or higher (at least on GCC) this optimization is indeed
performed. Here's a Godbolt example that shows this:

    https://godbolt.org/z/9zMP93hr1

so I agree that this code isn't helping us at all. And in the
meantime...

> The downside of using this code, however, is that it uses a GCC
> extension, which makes the compiler complain when using -pedantic unless
> it's prefixed with __extension__.  We could fix that, but since it's
> not needed, let's just remove it.  We haven't noticed this because
> almost everyone uses the SHA1DC code instead, but it still shows up for
> some people.

...it makes it impossible to compile git if you have
`BLK_SHA1=YesPlease` and `DEVELOPER=1` in your environment. So I am
happy to see this go.

On another note: missing Signed-off-by?

Thanks,
Taylor
brian m. carlson March 8, 2022, 2:20 a.m. UTC | #2
On 2022-03-07 at 23:32:42, Taylor Blau wrote:
> On Mon, Mar 07, 2022 at 11:25:52PM +0000, brian m. carlson wrote:
> > In the block SHA-1 code, we have special assembly code for i386 and
> > amd64 to perform rotations with assembly.  This is supposed to help pick
> > the correct rotation operation depending on which rotation is smaller,
> > which can help some systems perform slightly better, since any circular
> > rotation can be specified as either a rotate left or a rotate right.
> > However, this isn't needed, so we should remove it.
> 
> At -O1 or higher (at least on GCC) this optimization is indeed
> performed. Here's a Godbolt example that shows this:
> 
>     https://godbolt.org/z/9zMP93hr1
> 
> so I agree that this code isn't helping us at all. And in the
> meantime...

Thanks for providing a link.  I also did a similar test there with
slightly different code (and unfortunately closed the window before
saving the link) but it demonstrated the same thing: that the compiler
can optimize this case adequately.  My (substantially) past experience
with testing GCC in this case has shown the same thing.

> > The downside of using this code, however, is that it uses a GCC
> > extension, which makes the compiler complain when using -pedantic unless
> > it's prefixed with __extension__.  We could fix that, but since it's
> > not needed, let's just remove it.  We haven't noticed this because
> > almost everyone uses the SHA1DC code instead, but it still shows up for
> > some people.
> 
> ...it makes it impossible to compile git if you have
> `BLK_SHA1=YesPlease` and `DEVELOPER=1` in your environment. So I am
> happy to see this go.
> 
> On another note: missing Signed-off-by?

I'll send an otherwise unchanged v2 with that in a second.
diff mbox series

Patch

diff --git a/block-sha1/sha1.c b/block-sha1/sha1.c
index 1bb6e7c069..5974cd7dd3 100644
--- a/block-sha1/sha1.c
+++ b/block-sha1/sha1.c
@@ -11,27 +11,10 @@ 
 
 #include "sha1.h"
 
-#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
-
-/*
- * Force usage of rol or ror by selecting the one with the smaller constant.
- * It _can_ generate slightly smaller code (a constant of 1 is special), but
- * perhaps more importantly it's possibly faster on any uarch that does a
- * rotate with a loop.
- */
-
-#define SHA_ASM(op, x, n) ({ unsigned int __res; __asm__(op " %1,%0":"=r" (__res):"i" (n), "0" (x)); __res; })
-#define SHA_ROL(x,n)	SHA_ASM("rol", x, n)
-#define SHA_ROR(x,n)	SHA_ASM("ror", x, n)
-
-#else
-
 #define SHA_ROT(X,l,r)	(((X) << (l)) | ((X) >> (r)))
 #define SHA_ROL(X,n)	SHA_ROT(X,n,32-(n))
 #define SHA_ROR(X,n)	SHA_ROT(X,32-(n),n)
 
-#endif
-
 /*
  * If you have 32 registers or more, the compiler can (and should)
  * try to change the array[] accesses into registers. However, on