buffer: Add cast in grow_buffers() to avoid a multiplication libcall

Message ID: 20231128-avoid-muloti4-grow_buffers-v1-1-bc3d0f0ec483@kernel.org
State: New, archived

Commit Message

Nathan Chancellor Nov. 28, 2023, 11:55 p.m. UTC
When building with clang after commit 697607935295 ("buffer: fix
grow_buffers() for block size > PAGE_SIZE"), there is an error at link
time due to the generation of a 128-bit multiplication libcall:

  ld.lld: error: undefined symbol: __muloti4
  >>> referenced by buffer.c:0 (fs/buffer.c:0)
  >>>               fs/buffer.o:(bdev_getblk) in archive vmlinux.a

Due to the width mismatch between the factors and the sign mismatch
between the factors and the result, clang generates IR that performs
this overflow check with 65-bit signed multiplication and LLVM does not
improve on it during optimization, so the 65-bit multiplication is
extended to 128-bit during legalization, resulting in the libcall on
most targets.

To avoid the initial situation that causes clang to generate the
problematic IR, cast size (which is an 'unsigned int') to the same
type/width as block (which is currently a 'u64'/'unsigned long long').
GCC appears to already do this internally because there is no binary
difference with the cast for arm, arm64, riscv, or x86_64.

Link: https://github.com/ClangBuiltLinux/linux/issues/1958
Link: https://github.com/llvm/llvm-project/issues/38013
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Closes: https://lore.kernel.org/CA+G9fYuA_PTd7R2NsBvtNb7qjwp4avHpCmWi4=OmY4jndDcQYA@mail.gmail.com/
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
---
I am aware the hash in the commit message is not stable due to being on
the mm-unstable branch but I figured I would write the commit message as
if it would be standalone, in case this should not be squashed into the
original change. I did not add a comment to the source around this
workaround but I can if so desired.
---
 fs/buffer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


---
base-commit: 5cdba94229e58a39ca389ad99763af29e6b0c5a5
change-id: 20231128-avoid-muloti4-grow_buffers-5664204f5429

Best regards,

Comments

Andrew Morton Nov. 29, 2023, 1:40 a.m. UTC | #1
On Tue, 28 Nov 2023 16:55:43 -0700 Nathan Chancellor <nathan@kernel.org> wrote:

> When building with clang after commit 697607935295 ("buffer: fix
> grow_buffers() for block size > PAGE_SIZE"), there is an error at link
> time due to the generation of a 128-bit multiplication libcall:
> 
>   ld.lld: error: undefined symbol: __muloti4
>   >>> referenced by buffer.c:0 (fs/buffer.c:0)
>   >>>               fs/buffer.o:(bdev_getblk) in archive vmlinux.a
> 
> Due to the width mismatch between the factors and the sign mismatch
> between the factors and the result, clang generates IR that performs
> this overflow check with 65-bit signed multiplication and LLVM does not
> improve on it during optimization, so the 65-bit multiplication is
> extended to 128-bit during legalization, resulting in the libcall on
> most targets.
> 
> To avoid the initial situation that causes clang to generate the
> problematic IR, cast size (which is an 'unsigned int') to the same
> type/width as block (which is currently a 'u64'/'unsigned long long').
> GCC appears to already do this internally because there is no binary
> difference with the cast for arm, arm64, riscv, or x86_64.
> 
> ...
>
> I am aware the hash in the commit message is not stable due to being on
> the mm-unstable branch but I figured I would write the commit message as
> if it would be standalone, in case this should not be squashed into the
> original change. I did not add a comment to the source around this
> workaround but I can if so desired.

That's good.  Yes, I'll squash it into the base patch, but the Link: to
this fix will appear in the permanent record, for the inquisitive.

> --- a/fs/buffer.c
> +++ b/fs/buffer.c
> @@ -1091,7 +1091,7 @@ static bool grow_buffers(struct block_device *bdev, sector_t block,
>  	 * Check for a block which lies outside our maximum possible
>  	 * pagecache index.
>  	 */
> -	if (check_mul_overflow(block, size, &pos) || pos > MAX_LFS_FILESIZE) {
> +	if (check_mul_overflow(block, (sector_t)size, &pos) || pos > MAX_LFS_FILESIZE) {
>  		printk(KERN_ERR "%s: requested out-of-range block %llu for device %pg\n",
>  			__func__, (unsigned long long)block,
>  			bdev);

This seems appropriate.  Changing the type of incoming arg `size' feels
a bit fake - this is the per-bdev buffer_head size and expressing that
as a sector_t is misleading and unrealistic.

Patch

diff --git a/fs/buffer.c b/fs/buffer.c
index 4eb44ccdc6be..3a8c8322ed28 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -1091,7 +1091,7 @@ static bool grow_buffers(struct block_device *bdev, sector_t block,
 	 * Check for a block which lies outside our maximum possible
 	 * pagecache index.
 	 */
-	if (check_mul_overflow(block, size, &pos) || pos > MAX_LFS_FILESIZE) {
+	if (check_mul_overflow(block, (sector_t)size, &pos) || pos > MAX_LFS_FILESIZE) {
 		printk(KERN_ERR "%s: requested out-of-range block %llu for device %pg\n",
 			__func__, (unsigned long long)block,
 			bdev);
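
Illustration (not part of the patch): a minimal userspace sketch of the
pattern the cast produces, assuming check_mul_overflow() behaves like the
compiler builtin __builtin_mul_overflow(). The function names
block_pos_overflows() and block_pos_example() are hypothetical, uint64_t
stands in for sector_t/u64, and int64_t is assumed as the signed result
type. It only shows the overflow check with both factors widened to the
same unsigned 64-bit type; it does not reproduce the libcall itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical userspace stand-in for the check in grow_buffers().
 * uint64_t models sector_t/u64 and int64_t models the signed result type;
 * both are assumptions made for this sketch.
 */
static bool block_pos_overflows(uint64_t block, unsigned int size, int64_t *pos)
{
	/* Widen size first so both factors have the same type and width. */
	return __builtin_mul_overflow(block, (uint64_t)size, pos);
}

/* Small usage example; the values are illustrative only. */
static void block_pos_example(void)
{
	int64_t pos;

	/* 2^20 blocks of 4096 bytes: the product 2^32 fits. */
	assert(!block_pos_overflows(UINT64_C(1) << 20, 4096, &pos));
	assert(pos == INT64_C(1) << 32);

	/* 2^51 blocks of 4096 bytes: the product 2^63 does not fit in int64_t. */
	assert(block_pos_overflows(UINT64_C(1) << 51, 4096, &pos));
}
```

The builtin reports overflow whenever the mathematically exact product
cannot be represented in the result type, so the cast changes only the
operand types the compiler sees, not the outcome of the check.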