Message ID | 20231128-avoid-muloti4-grow_buffers-v1-1-bc3d0f0ec483@kernel.org (mailing list archive) |
---|---|
State | New, archived |
Series | buffer: Add cast in grow_buffers() to avoid a multiplication libcall |
On Tue, 28 Nov 2023 16:55:43 -0700 Nathan Chancellor <nathan@kernel.org> wrote:

> When building with clang after commit 697607935295 ("buffer: fix
> grow_buffers() for block size > PAGE_SIZE"), there is an error at link
> time due to the generation of a 128-bit multiplication libcall:
>
>   ld.lld: error: undefined symbol: __muloti4
>   >>> referenced by buffer.c:0 (fs/buffer.c:0)
>   >>>               fs/buffer.o:(bdev_getblk) in archive vmlinux.a
>
> Due to the width mismatch between the factors and the sign mismatch
> between the factors and the result, clang generates IR that performs
> this overflow check with 65-bit signed multiplication and LLVM does not
> improve on it during optimization, so the 65-bit multiplication is
> extended to 128-bit during legalization, resulting in the libcall on
> most targets.
>
> To avoid the initial situation that causes clang to generate the
> problematic IR, cast size (which is an 'unsigned int') to the same
> type/width as block (which is currently a 'u64'/'unsigned long long').
> GCC appears to already do this internally because there is no binary
> difference with the cast for arm, arm64, riscv, or x86_64.
>
> ...
>
> I am aware the hash in the commit message is not stable due to being on
> the mm-unstable branch but I figured I would write the commit message as
> if it would be standalone, in case this should not be squashed into the
> original change. I did not add a comment to the source around this
> workaround but I can if so desired.

That's good. Yes, I'll squash it into the base patch, but the Link: to
this fix will appear in the permanent record, for the inquisitive.

> --- a/fs/buffer.c
> +++ b/fs/buffer.c
> @@ -1091,7 +1091,7 @@ static bool grow_buffers(struct block_device *bdev, sector_t block,
>  	 * Check for a block which lies outside our maximum possible
>  	 * pagecache index.
>  	 */
> -	if (check_mul_overflow(block, size, &pos) || pos > MAX_LFS_FILESIZE) {
> +	if (check_mul_overflow(block, (sector_t)size, &pos) || pos > MAX_LFS_FILESIZE) {
>  		printk(KERN_ERR "%s: requested out-of-range block %llu for device %pg\n",
>  			__func__, (unsigned long long)block,
>  			bdev);

This seems appropriate. Changing the type of incoming arg `size' feels
a bit fake - this is the per-bdev buffer_head size and expressing that
as a sector_t is misleading and unrealistic.
diff --git a/fs/buffer.c b/fs/buffer.c
index 4eb44ccdc6be..3a8c8322ed28 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -1091,7 +1091,7 @@ static bool grow_buffers(struct block_device *bdev, sector_t block,
 	 * Check for a block which lies outside our maximum possible
 	 * pagecache index.
 	 */
-	if (check_mul_overflow(block, size, &pos) || pos > MAX_LFS_FILESIZE) {
+	if (check_mul_overflow(block, (sector_t)size, &pos) || pos > MAX_LFS_FILESIZE) {
 		printk(KERN_ERR "%s: requested out-of-range block %llu for device %pg\n",
 			__func__, (unsigned long long)block,
 			bdev);
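
For readers who want to poke at the cast outside the kernel, here is a minimal userspace sketch. It is not the kernel code: it calls __builtin_mul_overflow() directly instead of check_mul_overflow(), and the typedefs demo_sector_t and demo_pos_t plus both function names are hypothetical stand-ins. In particular, the signed 64-bit result type is an assumption based on the "sign mismatch between the factors and the result" wording in the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins, for illustration only. */
typedef uint64_t demo_sector_t;	/* plays the role of sector_t (u64) */
typedef int64_t demo_pos_t;	/* assumed signed 64-bit result type */

/*
 * Factors of different widths (u64 * unsigned int) stored into a signed
 * result. This is the shape the commit message describes as problematic
 * for clang. Depending on the clang version and target, linking this
 * variant may require __muloti4 from compiler-rt.
 */
static bool demo_mul_overflow_mixed(demo_sector_t block, unsigned int size,
				    demo_pos_t *pos)
{
	return __builtin_mul_overflow(block, size, pos);
}

/*
 * Same check with size cast to the width of block, mirroring the
 * (sector_t)size cast in the patch.
 */
static bool demo_mul_overflow_cast(demo_sector_t block, unsigned int size,
				   demo_pos_t *pos)
{
	return __builtin_mul_overflow(block, (demo_sector_t)size, pos);
}
```

Both helpers return the same answer for every input, because the builtin is defined in terms of the exact mathematical product and whether it fits in *pos. The cast only changes the operand types the compiler starts from, which is why it can alter code generation without altering behaviour.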