Message ID | 20250127044721.GD1977892@ZenIV (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [git,pull] d_revalidate pile | expand |
On Mon, Jan 27, 2025 at 04:47:21AM +0000, Al Viro wrote: >->d_revalidate() series, along with ->d_iname preliminary work. >One trivial conflict in fs/afs/dir.c - afs_do_lookup_one() has lost >one argument in mainline and switched another from dentry to qstr >in this series. > >The following changes since commit 40384c840ea1944d7c5a392e8975ed088ecf0b37: > > Linux 6.13-rc1 (2024-12-01 14:28:56 -0800) > >are available in the Git repository at: > > git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git tags/pull-revalidate > >for you to fetch changes up to a8ea90bfec66b239dad9a478fc444aa32d3961bc: > > 9p: fix ->rename_sem exclusion (2025-01-25 11:51:57 -0500) > >---------------------------------------------------------------- >Provide stable parent and name to ->d_revalidate() instances > >Most of the filesystem methods where we care about dentry name >and parent have their stability guaranteed by the callers; >->d_revalidate() is the major exception. > >It's easy enough for callers to supply stable values for >expected name and expected parent of the dentry being >validated. That kills quite a bit of boilerplate in >->d_revalidate() instances, along with a bunch of races >where they used to access ->d_name without sufficient >precautions. > >Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Hey Al, With this pulled on top of Linus's tree, LKFT is managing to trigger kfence warnings: <3>[ 62.180289] BUG: KFENCE: out-of-bounds read in d_same_name+0x4c/0xd0 <3>[ 62.180289] <3>[ 62.182647] Out-of-bounds read at 0x00000000eedd4b55 (64B right of kfence-#174): <4>[ 62.184178] d_same_name+0x4c/0xd0 <4>[ 62.184717] d_lookup+0x40/0x78 <4>[ 62.185378] lookup_dcache+0x34/0xb0 <4>[ 62.185980] lookup_one_qstr_excl+0x2c/0xe0 <4>[ 62.186523] do_renameat2+0x324/0x568 <4>[ 62.186948] __arm64_sys_renameat+0x58/0x78 <4>[ 62.187484] invoke_syscall+0x50/0x120 <4>[ 62.188220] el0_svc_common.constprop.0+0x48/0xf0 <4>[ 62.189031] do_el0_svc_compat+0x24/0x48 <4>[ 62.189635] el0_svc_compat+0x34/0xd0 <4>[ 62.190018] el0t_32_sync_handler+0xb0/0x138 <4>[ 62.190537] el0t_32_sync+0x19c/0x1a0 <3>[ 62.190946] <4>[ 62.191399] kfence-#174: 0x0000000012d508d5-0x0000000023355f7e, size=64, cache=kmalloc-rcl-64 <4>[ 62.191399] <4>[ 62.192260] allocated by task 1 on cpu 0 at 62.177313s (0.014839s ago): <4>[ 62.193504] __d_alloc+0x15c/0x1d0 <4>[ 62.193925] d_alloc+0x24/0x90 <4>[ 62.194204] lookup_one_qstr_excl+0x68/0xe0 <4>[ 62.194741] filename_create+0xc0/0x1b0 <4>[ 62.195129] do_symlinkat+0x68/0x150 <4>[ 62.195657] __arm64_sys_symlinkat+0x50/0x70 <4>[ 62.195954] invoke_syscall+0x50/0x120 <4>[ 62.196461] el0_svc_common.constprop.0+0x48/0xf0 <4>[ 62.197053] do_el0_svc_compat+0x24/0x48 <4>[ 62.197411] el0_svc_compat+0x34/0xd0 <4>[ 62.197849] el0t_32_sync_handler+0xb0/0x138 <4>[ 62.198422] el0t_32_sync+0x19c/0x1a0 <3>[ 62.198857] <3>[ 62.199577] CPU: 0 UID: 0 PID: 1 Comm: systemd Not tainted 6.13.0 #1 <3>[ 62.200435] Hardware name: linux,dummy-virt (DT) The full log is at: https://qa-reports.linaro.org/lkft/sashal-linus-next/build/v6.13-rc7-8584-gd4639f3659ae/testrun/27028572/suite/log-parser-test/test/kfence-bug-kfence-out-of-bounds-read-in-d_same_name/log LMK if I should attempt a bisection.
On Mon, Jan 27, 2025 at 12:19:54PM -0500, Sasha Levin wrote: > The full log is at: https://qa-reports.linaro.org/lkft/sashal-linus-next/build/v6.13-rc7-8584-gd4639f3659ae/testrun/27028572/suite/log-parser-test/test/kfence-bug-kfence-out-of-bounds-read-in-d_same_name/log > > LMK if I should attempt a bisection. Could you try your setup on 58cf9c383c5c "dcache: back inline names with a struct-wrapped array of unsigned long"?
On Mon, 27 Jan 2025 at 09:19, Sasha Levin <sashal@kernel.org> wrote: > > With this pulled on top of Linus's tree, LKFT is managing to trigger > kfence warnings: > > <3>[ 62.180289] BUG: KFENCE: out-of-bounds read in d_same_name+0x4c/0xd0 > <3>[ 62.180289] > <3>[ 62.182647] Out-of-bounds read at 0x00000000eedd4b55 (64B right of kfence-#174): > <4>[ 62.184178] d_same_name+0x4c/0xd0 Bah. I've said this before, but I really wish LKFT would use debug builds and run the warnings through scripts/decode_stacktrace.sh. Getting filenames and line numbers (and inlining information!) for stack traces can be really really useful. I think you are using KernelCI builds (at least that was the case last time), and apparently they are non-debug builds. And that's possibly due to just resource issues (the debug info does take a lot more disk space and makes link times much longer too). So it might not be easily fixable. But let's see if it might be an option to get this capability. So I'm adding the kernelci list to see if somebody goes "Oh, that was just an oversight" and might easily be made to happen. Fingers crossed. Linus
On Mon, Jan 27, 2025 at 11:12:11AM -0800, Linus Torvalds wrote: > On Mon, 27 Jan 2025 at 09:19, Sasha Levin <sashal@kernel.org> wrote: > > With this pulled on top of Linus's tree, LKFT is managing to trigger > > kfence warnings: > > <3>[ 62.180289] BUG: KFENCE: out-of-bounds read in d_same_name+0x4c/0xd0 > > <3>[ 62.180289] > > <3>[ 62.182647] Out-of-bounds read at 0x00000000eedd4b55 (64B right of kfence-#174): > > <4>[ 62.184178] d_same_name+0x4c/0xd0 > Bah. I've said this before, but I really wish LKFT would use debug > builds and run the warnings through scripts/decode_stacktrace.sh. > Getting filenames and line numbers (and inlining information!) for > stack traces can be really really useful. > I think you are using KernelCI builds (at least that was the case last > time), and apparently they are non-debug builds. And that's possibly > due to just resource issues (the debug info does take a lot more disk > space and makes link times much longer too). So it might not be easily > fixable. They're not, they're using their own builds done with their tuxsuite service which is a cloud front end for their tuxmake tool, that does have the ability to save the vmlinux. Poking around the LKFT output it does look like they're doing that for the LKFT builds: https://qa-reports.linaro.org/lkft/sashal-linus-next/build/v6.13-rc7-8584-gd4639f3659ae/testrun/27027254/suite/build/test/gcc-13-tinyconfig/details/ https://storage.tuxsuite.com/public/linaro/lkft/builds/2sDW1jDhjHPNl1XNezFhsjSlvpI/ so hopefully the information is all there and it's just a question of people doing the decode when reporting issues from LKFT. > But let's see if it might be an option to get this capability. So I'm > adding the kernelci list to see if somebody goes "Oh, that was just an > oversight" and might easily be made to happen. Fingers crossed. The issue with KernelCI has been that it's not storing the vmlinux, this was indeed done due to space issues like you suggest. With the new infrastructure that's been rolled out as part of the KernelCI 2.0 revamp the storage should be a lot more scaleable and so this should hopefully be a cost issue rather than actual space limits like it used to be so more tractable. AFAICT we haven't actually revisited making the required changes to include the vmlinux in the stored output though, I filed a ticket: https://github.com/kernelci/kernelci-project/issues/509 The builds themselves are generally using standard defconfigs and derivatives of that so will normally have enough debug info for decode_stacktrace.sh. Where they don't we should probably just change that upstream.
On Mon, Jan 27, 2025 at 05:36:34PM +0000, Al Viro wrote: >On Mon, Jan 27, 2025 at 12:19:54PM -0500, Sasha Levin wrote: > >> The full log is at: https://qa-reports.linaro.org/lkft/sashal-linus-next/build/v6.13-rc7-8584-gd4639f3659ae/testrun/27028572/suite/log-parser-test/test/kfence-bug-kfence-out-of-bounds-read-in-d_same_name/log >> >> LMK if I should attempt a bisection. > >Could you try your setup on 58cf9c383c5c "dcache: back inline names >with a struct-wrapped array of unsigned long"? It looks like we didn't trigger a warnings on that commit, but I'm not sure if the issue reproduces easily. I'll start a bisection and see where it takes me...
On Mon, Jan 27, 2025 at 03:52:16PM -0500, Sasha Levin wrote: > On Mon, Jan 27, 2025 at 05:36:34PM +0000, Al Viro wrote: > > On Mon, Jan 27, 2025 at 12:19:54PM -0500, Sasha Levin wrote: > > > > > The full log is at: https://qa-reports.linaro.org/lkft/sashal-linus-next/build/v6.13-rc7-8584-gd4639f3659ae/testrun/27028572/suite/log-parser-test/test/kfence-bug-kfence-out-of-bounds-read-in-d_same_name/log > > > > > > LMK if I should attempt a bisection. > > > > Could you try your setup on 58cf9c383c5c "dcache: back inline names > > with a struct-wrapped array of unsigned long"? > > It looks like we didn't trigger a warnings on that commit, but I'm not > sure if the issue reproduces easily. > > I'll start a bisection and see where it takes me... Interesting... The thing is, that's the only commit that goes anywhere near ->d_name reassignments. That access smells like access just one byte past struct external_name... wait a minute. Could that be load_unaligned_zeropad() stepping just over the end of external name? If so, then a) it's a false positive (and IIRC, it's not the first time kfence gets confused by that) b) your bisection will probably converge to bdd9951f60f9 "dissolve external_name.u into separate members" which is where we'd ended up with offsetof(struct external_name, name) being 4 modulo 8. As a quick test, try to flip the order of head and count in struct external_name and see if that makes the warning go away. If it does, I'm pretty certain that theory above is correct.
[ Adding in the LKFT folks ] On Mon, Jan 27, 2025 at 08:38:50PM +0000, Mark Brown wrote: >On Mon, Jan 27, 2025 at 11:12:11AM -0800, Linus Torvalds wrote: >> On Mon, 27 Jan 2025 at 09:19, Sasha Levin <sashal@kernel.org> wrote: > >> > With this pulled on top of Linus's tree, LKFT is managing to trigger >> > kfence warnings: > >> > <3>[ 62.180289] BUG: KFENCE: out-of-bounds read in d_same_name+0x4c/0xd0 >> > <3>[ 62.180289] >> > <3>[ 62.182647] Out-of-bounds read at 0x00000000eedd4b55 (64B right of kfence-#174): >> > <4>[ 62.184178] d_same_name+0x4c/0xd0 > >> Bah. I've said this before, but I really wish LKFT would use debug >> builds and run the warnings through scripts/decode_stacktrace.sh. > >> Getting filenames and line numbers (and inlining information!) for >> stack traces can be really really useful. > >> I think you are using KernelCI builds (at least that was the case last >> time), and apparently they are non-debug builds. And that's possibly >> due to just resource issues (the debug info does take a lot more disk >> space and makes link times much longer too). So it might not be easily >> fixable. > >They're not, they're using their own builds done with their tuxsuite >service which is a cloud front end for their tuxmake tool, that does >have the ability to save the vmlinux. Poking around the LKFT output it >does look like they're doing that for the LKFT builds: > > https://qa-reports.linaro.org/lkft/sashal-linus-next/build/v6.13-rc7-8584-gd4639f3659ae/testrun/27027254/suite/build/test/gcc-13-tinyconfig/details/ > https://storage.tuxsuite.com/public/linaro/lkft/builds/2sDW1jDhjHPNl1XNezFhsjSlvpI/ > >so hopefully the information is all there and it's just a question of >people doing the decode when reporting issues from LKFT. My understanding was that becuase CONFIG_DEBUG_INFO_NONE=y is set, we actually don't have enough info to resolve line numbers. I've tried running decode_stacktrace.sh on the vmlinux image linked above, and indeed we can't get line numbers.
On Mon, Jan 27, 2025 at 09:34:56PM +0000, Al Viro wrote: > If so, then > a) it's a false positive (and IIRC, it's not the first time > kfence gets confused by that) > b) your bisection will probably converge to bdd9951f60f9 > "dissolve external_name.u into separate members" which is where we'd > ended up with offsetof(struct external_name, name) being 4 modulo 8. > > As a quick test, try to flip the order of head and count in > struct external_name and see if that makes the warning go away. > If it does, I'm pretty certain that theory above is correct. Not quite... dentry_string_cmp() assumes that ->d_name.name is word-aligned, so load_unaligned_zeropad() is done only to the second string (the one we compare against). Linus, does the following look sane to you as replacement for bdd9951f60f9? I'd rather have explicit __aligned(), along with the comment spelling the constraints out... diff --git a/fs/dcache.c b/fs/dcache.c index f387dc97df86..f8d6a2557736 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -295,12 +295,16 @@ static inline int dentry_cmp(const struct dentry *dentry, const unsigned char *c return dentry_string_cmp(cs, ct, tcount); } +/* + * long names are allocated separately from dentry and never modified. + * Refcounted, freeing is RCU-delayed. See take_dentry_name_snapshot() + * for the reason why ->count and ->head can't be combined into a union. + * dentry_string_cmp() relies upon ->name[] being word-aligned. + */ struct external_name { - struct { - atomic_t count; // ->count and ->head can't be combined - struct rcu_head head; // see take_dentry_name_snapshot() - } u; - unsigned char name[]; + atomic_t count; + struct rcu_head head; + unsigned char name[] __aligned(sizeof(unsigned long)); }; static inline struct external_name *external_name(struct dentry *dentry) @@ -344,7 +348,7 @@ void take_dentry_name_snapshot(struct name_snapshot *name, struct dentry *dentry struct external_name *p; p = container_of(s, struct external_name, name[0]); // get a valid reference - if (unlikely(!atomic_inc_not_zero(&p->u.count))) + if (unlikely(!atomic_inc_not_zero(&p->count))) goto retry; name->name.name = s; } @@ -361,8 +365,8 @@ void release_dentry_name_snapshot(struct name_snapshot *name) if (unlikely(name->name.name != name->inline_name.string)) { struct external_name *p; p = container_of(name->name.name, struct external_name, name[0]); - if (unlikely(atomic_dec_and_test(&p->u.count))) - kfree_rcu(p, u.head); + if (unlikely(atomic_dec_and_test(&p->count))) + kfree_rcu(p, head); } } EXPORT_SYMBOL(release_dentry_name_snapshot); @@ -400,7 +404,7 @@ static void dentry_free(struct dentry *dentry) WARN_ON(!hlist_unhashed(&dentry->d_u.d_alias)); if (unlikely(dname_external(dentry))) { struct external_name *p = external_name(dentry); - if (likely(atomic_dec_and_test(&p->u.count))) { + if (likely(atomic_dec_and_test(&p->count))) { call_rcu(&dentry->d_u.d_rcu, __d_free_external); return; } @@ -1681,7 +1685,7 @@ static struct dentry *__d_alloc(struct super_block *sb, const struct qstr *name) kmem_cache_free(dentry_cache, dentry); return NULL; } - atomic_set(&p->u.count, 1); + atomic_set(&p->count, 1); dname = p->name; } else { dname = dentry->d_shortname.string; @@ -2774,15 +2778,15 @@ static void copy_name(struct dentry *dentry, struct dentry *target) if (unlikely(dname_external(dentry))) old_name = external_name(dentry); if (unlikely(dname_external(target))) { - atomic_inc(&external_name(target)->u.count); + atomic_inc(&external_name(target)->count); dentry->d_name = target->d_name; } else { dentry->d_shortname = target->d_shortname; dentry->d_name.name = dentry->d_shortname.string; dentry->d_name.hash_len = target->d_name.hash_len; } - if (old_name && likely(atomic_dec_and_test(&old_name->u.count))) - kfree_rcu(old_name, u.head); + if (old_name && likely(atomic_dec_and_test(&old_name->count))) + kfree_rcu(old_name, head); } /*
On Mon, 27 Jan 2025 at 14:41, Al Viro <viro@zeniv.linux.org.uk> wrote: > > Linus, does the following look sane to you as replacement for > bdd9951f60f9? I'd rather have explicit __aligned(), along > with the comment spelling the constraints out... Ack, looks good to me. Linus
On Mon, Jan 27, 2025 at 10:40:59PM +0000, Al Viro wrote: >On Mon, Jan 27, 2025 at 09:34:56PM +0000, Al Viro wrote: > >> If so, then >> a) it's a false positive (and IIRC, it's not the first time >> kfence gets confused by that) >> b) your bisection will probably converge to bdd9951f60f9 >> "dissolve external_name.u into separate members" which is where we'd >> ended up with offsetof(struct external_name, name) being 4 modulo 8. >> >> As a quick test, try to flip the order of head and count in >> struct external_name and see if that makes the warning go away. >> If it does, I'm pretty certain that theory above is correct. > >Not quite... dentry_string_cmp() assumes that ->d_name.name is >word-aligned, so load_unaligned_zeropad() is done only to the >second string (the one we compare against). Sorry for the silence on my end: this issue doesn't reproduce consistently, so I need to do more runs for these tests.
On Mon, Jan 27, 2025 at 06:26:58PM -0500, Sasha Levin wrote: > On Mon, Jan 27, 2025 at 10:40:59PM +0000, Al Viro wrote: > > On Mon, Jan 27, 2025 at 09:34:56PM +0000, Al Viro wrote: > > > > > If so, then > > > a) it's a false positive (and IIRC, it's not the first time > > > kfence gets confused by that) > > > b) your bisection will probably converge to bdd9951f60f9 > > > "dissolve external_name.u into separate members" which is where we'd > > > ended up with offsetof(struct external_name, name) being 4 modulo 8. > > > > > > As a quick test, try to flip the order of head and count in > > > struct external_name and see if that makes the warning go away. > > > If it does, I'm pretty certain that theory above is correct. > > > > Not quite... dentry_string_cmp() assumes that ->d_name.name is > > word-aligned, so load_unaligned_zeropad() is done only to the > > second string (the one we compare against). > > Sorry for the silence on my end: this issue doesn't reproduce > consistently, so I need to do more runs for these tests. Updated version force-pushed; delta is diff --git a/fs/dcache.c b/fs/dcache.c index 695406e48937..903142b324e9 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -295,10 +295,16 @@ static inline int dentry_cmp(const struct dentry *dentry, const unsigned char *c return dentry_string_cmp(cs, ct, tcount); } +/* + * long names are allocated separately from dentry and never modified. + * Refcounted, freeing is RCU-delayed. See take_dentry_name_snapshot() + * for the reason why ->count and ->head can't be combined into a union. + * dentry_string_cmp() relies upon ->name[] being word-aligned. + */ struct external_name { - struct rcu_head head; // ->head and ->count can't be combined - atomic_t count; // see take_dentry_name_snapshot() - unsigned char name[]; + atomic_t count; + struct rcu_head head; + unsigned char name[] __aligned(sizeof(unsigned long)); }; static inline struct external_name *external_name(struct dentry *dentry) Could you recheck that one (23e8b451dea4)? I'll send an update pull request if nothing wrong shows up.
On Tue, Jan 28, 2025 at 12:26:59AM +0000, Al Viro wrote:
> Could you recheck that one (23e8b451dea4)? I'll send an update pull request
s/update/updated/...
On Mon, 27 Jan 2025 at 16:27, Al Viro <viro@zeniv.linux.org.uk> wrote: > > struct external_name { > atomic_t count; > struct rcu_head head; > unsigned char name[] __aligned(sizeof(unsigned long)); > }; Btw, now that the external name looks like this, and has a 32-bit hole on 64-bit, I wonder if we should just add a length to the external name. It would be free on 64-bit, and it would actually mean that we could atomically load the name and the length for external names and never worry about any overruns. The internal name would still race, since the length in the dentry can change at any time under us, but any code that really needs to avoid an overrun and doesn't take any locks can still do things like if (name == &dentry->d_iname && len >= DNAME_INLINE_LEN) .. we know we raced .. and for things like d_path() or similar at worst it can just limit len to DNAME_INLINE_LEN-1 or whatever. I'm thinking mainly dentry_string_cmp(), which currently does things one byte at a time at least partially for the "I must not overrun a NUL character because 'len' may be bogus" reason. Maybe I'm just being silly. Linus
On Mon, Jan 27, 2025 at 04:43:13PM -0800, Linus Torvalds wrote: > I'm thinking mainly dentry_string_cmp(), which currently does things > one byte at a time at least partially for the "I must not overrun a > NUL character because 'len' may be bogus" reason. Umm... On some architectures it does, but mostly that's the ones where unaligned word loads are costly. Which target do you have in mind?
On Mon, 27 Jan 2025 at 17:21, Al Viro <viro@zeniv.linux.org.uk> wrote: > > Umm... On some architectures it does, but mostly that's the ones > where unaligned word loads are costly. Which target do you have > in mind? I was more thinking that we could just make the fallback case be a 'memcmp()'. It's not like this particular place matters - as you say, that byte-at-a-time code is only used on architectures that don't enable the dcache word-at-a-time code (that requires the special "do loads that can fault" zeropad helper), but we've had some other places where we'd worry about the string length. Look at d_path() for another example. That copy_from_kernel_nofault() in prepend_copy()... Linus
On Mon, Jan 27, 2025 at 05:27:08PM -0800, Linus Torvalds wrote: > On Mon, 27 Jan 2025 at 17:21, Al Viro <viro@zeniv.linux.org.uk> wrote: > > > > Umm... On some architectures it does, but mostly that's the ones > > where unaligned word loads are costly. Which target do you have > > in mind? > > I was more thinking that we could just make the fallback case be a 'memcmp()'. > > It's not like this particular place matters - as you say, that > byte-at-a-time code is only used on architectures that don't enable > the dcache word-at-a-time code (that requires the special "do loads > that can fault" zeropad helper), but we've had some other places where > we'd worry about the string length. > > Look at d_path() for another example. That copy_from_kernel_nofault() > in prepend_copy()... Hmm... So something like /* * Returns a pointer to name and a length; length might be * inaccurate in case of race with dentry renaming, but * it will not exceed the distance from returned pointer * to the end of containing object. * Caller MUST hold rcu_read_lock(). * Caller MUST NOT expect the contents of name to remain * stable - it can change at any time. */ const char *__d_name_rcu(struct dentry *dentry, int *p) { const char *name = smp_load_acquire(&dentry->d_name.name); if (unlikely(name != &dentry->d_shortname.string)) *p = container_of(name, struct external_name, name)->len; else if (unlikely((*p = dentry->d_name.name) >= DNAME_INLINE_LEN) *p = DNAME_INLINE_LEN - 1; return name; } with very limited accessibility (basically, dcache.c and d_path.c) prepend_name() might be able to use that...
On 27/01/2025 9:38 pm, Mark Brown wrote: >> But let's see if it might be an option to get this capability. So I'm >> adding the kernelci list to see if somebody goes "Oh, that was just an >> oversight" and might easily be made to happen. Fingers crossed. > The issue with KernelCI has been that it's not storing the vmlinux, this > was indeed done due to space issues like you suggest. With the new > infrastructure that's been rolled out as part of the KernelCI 2.0 revamp > the storage should be a lot more scaleable and so this should hopefully > be a cost issue rather than actual space limits like it used to be so > more tractable. AFAICT we haven't actually revisited making the > required changes to include the vmlinux in the stored output though, I > filed a ticket: > > https://github.com/kernelci/kernelci-project/issues/509 > > The builds themselves are generally using standard defconfigs and > derivatives of that so will normally have enough debug info for > decode_stacktrace.sh. Where they don't we should probably just change > that upstream. One approach that was suggested a while ago was to do extra debug builds in automated post-processing jobs whenever a failure is detected. This came as an evolution of the automated bisection which had checks for the good and bad revisions: if a stacktrace was found while testing the "bad" kernel then it could easily be decoded since bisections do incremental builds and keep the vmlinux at hand. As Sasha mentioned in his email, some particular configs are required in order to decode the stacktrace (IIRC this is enabled with arm64_defconfig but not x86). Debug builds also make larger binaries and affect runtime behaviour, as we all know. So one post-processing check would be to do a special debug build with the right configs for decoding stacktraces as well as maybe some sanitizers and extra useful things to add more information. Builds from bisections or any extra jobs should still be uploaded to public storage so they would be available for manual investigation too. That way, the impact on storage costs and compute resources would be minimal without any real drawback - it might take 30min to get the post-processing job to complete but even that could be optimized and it seems a lot more efficient than doing debug builds and uploading large vmlinux images all the time. Hope this helps! Cheers, Guillaume
On Mon, Jan 27, 2025 at 05:32:18PM -0500, Sasha Levin wrote: > [ Adding in the LKFT folks ] Oops, sorry - didn't realise they weren't already on the report since it was on LKFT or I'd have done the same. > On Mon, Jan 27, 2025 at 08:38:50PM +0000, Mark Brown wrote: > > have the ability to save the vmlinux. Poking around the LKFT output it > > does look like they're doing that for the LKFT builds: > My understanding was that becuase CONFIG_DEBUG_INFO_NONE=y is set, we > actually don't have enough info to resolve line numbers. The arm64 and arm defconfigs which are the main ones I'd end up looking at both set CONFIG_DEBUG_INFO (_REDUCED in the case of arm64), the trace you posted was from arm64 so unless it was some config that overrode things there ought to be info. x86_64 which I guess you might use more indeed doesn't have it. > I've tried running decode_stacktrace.sh on the vmlinux image linked > above, and indeed we can't get line numbers. That was a random build I pulled out which turns out to be a tinyconfig rather than the specific build that was used - if we look at an arm64 defconfig (your trace looked to be from arm64): https://qa-reports.linaro.org/lkft/sashal-linus-next/build/v6.13-rc7-8584-gd4639f3659ae/testrun/27028517/suite/build/test/clang-nightly-defconfig-40bc7ee5/details/ https://storage.tuxsuite.com/public/linaro/lkft/builds/2sDW1oYDQrsEOOs4L6yoysbu9aS/ I'm able to decode (just feeding a random log message in, no idea what specific build generated the log message so the line number is almost certainly wrong): $ echo [ 62.184178] d_same_name+0x4c/0xd0 | ./scripts/decode_stacktrace.sh /tmp/vmlinux [ 62.184178] d_same_name (fs/dcache.c:2127)
On Mon, Jan 27, 2025 at 05:32:18PM -0500, Sasha Levin wrote:
> [ Adding in the LKFT folks ]
Ugh... The website is pretty difficult to navigate. I've filed a
ticket to hopefully avoid this going forward. It's a bit late for
the line numbers to be any use but here they are:
Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
<3>[ 62.179009] ==================================================================
<3>[ 62.180289] BUG: KFENCE: out-of-bounds read in d_same_name (include/asm-generic/rwonce.h:86 fs/dcache.c:243 fs/dcache.c:295 fs/dcache.c:2129)
<3>[ 62.180289]
<3>[ 62.182647] Out-of-bounds read at 0x00000000eedd4b55 (64B right of kfence-#174):
<4>[ 62.184178] d_same_name (include/asm-generic/rwonce.h:86 fs/dcache.c:243 fs/dcache.c:295 fs/dcache.c:2129)
<4>[ 62.184717] d_lookup (fs/dcache.c:2292)
<4>[ 62.185378] lookup_dcache (fs/namei.c:1654)
<4>[ 62.185980] lookup_one_qstr_excl (fs/namei.c:1678)
<4>[ 62.186523] do_renameat2 (fs/namei.c:5167)
<4>[ 62.186948] __arm64_sys_renameat (fs/namei.c:5264)
<4>[ 62.187484] invoke_syscall (arch/arm64/include/asm/current.h:19 arch/arm64/kernel/syscall.c:54)
<4>[ 62.188220] el0_svc_common.constprop.0 (include/linux/thread_info.h:135 (discriminator 2) arch/arm64/kernel/syscall.c:140 (discriminator 2))
<4>[ 62.189031] do_el0_svc_compat (arch/arm64/kernel/syscall.c:159)
<4>[ 62.189635] el0_svc_compat (arch/arm64/include/asm/irqflags.h:82 (discriminator 1) arch/arm64/include/asm/irqflags.h:123 (discriminator 1) arch/arm64/include/asm/irqflags.h:136 (discriminator 1) arch/arm64/kernel/entry-common.c:165 (discriminator 1) arch/arm64/kernel/entry-common.c:178 (discriminator 1) arch/arm64/kernel/entry-common.c:888 (discriminator 1))
<4>[ 62.190018] el0t_32_sync_handler (arch/arm64/kernel/entry-common.c:933)
<4>[ 62.190537] el0t_32_sync (arch/arm64/kernel/entry.S:605)
<3>[ 62.190946]
<4>[ 62.191399] kfence-#174: 0x0000000012d508d5-0x0000000023355f7e, size=64, cache=kmalloc-rcl-64
<4>[ 62.191399]
<4>[ 62.192260] allocated by task 1 on cpu 0 at 62.177313s (0.014839s ago):
<4>[ 62.193504] __d_alloc (fs/dcache.c:1678)
<4>[ 62.193925] d_alloc (fs/dcache.c:1737)
<4>[ 62.194204] lookup_one_qstr_excl (fs/namei.c:1689)
<4>[ 62.194741] filename_create (fs/namei.c:4083)
<4>[ 62.195129] do_symlinkat (fs/namei.c:4690)
<4>[ 62.195657] __arm64_sys_symlinkat (fs/namei.c:4710)
<4>[ 62.195954] invoke_syscall (arch/arm64/include/asm/current.h:19 arch/arm64/kernel/syscall.c:54)
<4>[ 62.196461] el0_svc_common.constprop.0 (include/linux/thread_info.h:135 (discriminator 2) arch/arm64/kernel/syscall.c:140 (discriminator 2))
<4>[ 62.197053] do_el0_svc_compat (arch/arm64/kernel/syscall.c:159)
<4>[ 62.197411] el0_svc_compat (arch/arm64/include/asm/irqflags.h:82 (discriminator 1) arch/arm64/include/asm/irqflags.h:123 (discriminator 1) arch/arm64/include/asm/irqflags.h:136 (discriminator 1) arch/arm64/kernel/entry-common.c:165 (discriminator 1) arch/arm64/kernel/entry-common.c:178 (discriminator 1) arch/arm64/kernel/entry-common.c:888 (discriminator 1))
<4>[ 62.197849] el0t_32_sync_handler (arch/arm64/kernel/entry-common.c:933)
<4>[ 62.198422] el0t_32_sync (arch/arm64/kernel/entry.S:605)
<3>[ 62.198857]
<3>[ 62.199577] CPU: 0 UID: 0 PID: 1 Comm: systemd Not tainted 6.13.0 #1
<3>[ 62.200435] Hardware name: linux,dummy-virt (DT)
<3>[ 62.201130] ==================================================================
[?2004hroot@runner-vwmj3eza-project-40964107-concurrent-3:~#
regards,
dan carpenter
On Tue, Jan 28, 2025 at 12:14:07PM +0000, 'Mark Brown' via lkft wrote: > I'm able to decode (just feeding a random log message in, no idea what > specific build generated the log message so the line number is almost > certainly wrong): > All the kernels that we're planning to boot have the DEBUG_INFO enabled. Here is the config and the vmlinux.xz for this one. https://storage.tuxsuite.com/public/linaro/lkft/builds/2sDW1u8fB268uU3L32J8FqAxYYR/ I don't want to explain how to find this URL because I've filed a ticket to make it prominent so the instructions will change soon. And ideally we would just have the line numbers on the webpage dmesg itself. regards, dan carpenter
On Tue, Jan 28, 2025 at 03:33:14PM +0300, Dan Carpenter wrote: >On Mon, Jan 27, 2025 at 05:32:18PM -0500, Sasha Levin wrote: >> [ Adding in the LKFT folks ] > >Ugh... The website is pretty difficult to navigate. I've filed a >ticket to hopefully avoid this going forward. It's a bit late for >the line numbers to be any use but here they are: Thanks Dan & Mark! I think I've figured out (and scripted) it for next time :)
->d_revalidate() series, along with ->d_iname preliminary work. One trivial conflict in fs/afs/dir.c - afs_do_lookup_one() has lost one argument in mainline and switched another from dentry to qstr in this series. The following changes since commit 40384c840ea1944d7c5a392e8975ed088ecf0b37: Linux 6.13-rc1 (2024-12-01 14:28:56 -0800) are available in the Git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git tags/pull-revalidate for you to fetch changes up to a8ea90bfec66b239dad9a478fc444aa32d3961bc: 9p: fix ->rename_sem exclusion (2025-01-25 11:51:57 -0500) ---------------------------------------------------------------- Provide stable parent and name to ->d_revalidate() instances Most of the filesystem methods where we care about dentry name and parent have their stability guaranteed by the callers; ->d_revalidate() is the major exception. It's easy enough for callers to supply stable values for expected name and expected parent of the dentry being validated. That kills quite a bit of boilerplate in ->d_revalidate() instances, along with a bunch of races where they used to access ->d_name without sufficient precautions. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> ---------------------------------------------------------------- Al Viro (20): make sure that DNAME_INLINE_LEN is a multiple of word size dcache: back inline names with a struct-wrapped array of unsigned long make take_dentry_name_snapshot() lockless dissolve external_name.u into separate members ext4 fast_commit: make use of name_snapshot primitives generic_ci_d_compare(): use shortname_storage Pass parent directory inode and expected name to ->d_revalidate() afs_d_revalidate(): use stable name and parent inode passed by caller ceph_d_revalidate(): use stable parent inode passed by caller ceph_d_revalidate(): propagate stable name down into request encoding fscrypt_d_revalidate(): use stable parent inode passed by caller exfat_d_revalidate(): use stable parent inode passed by caller vfat_revalidate{,_ci}(): use stable parent inode passed by caller fuse_dentry_revalidate(): use stable parent inode and name passed by caller gfs2_drevalidate(): use stable parent inode and name passed by caller nfs{,4}_lookup_validate(): use stable parent inode passed by caller nfs: fix ->d_revalidate() UAF on ->d_name accesses ocfs2_dentry_revalidate(): use stable parent inode and name passed by caller orangefs_d_revalidate(): use stable parent inode and name passed by caller 9p: fix ->rename_sem exclusion Documentation/filesystems/locking.rst | 7 +- Documentation/filesystems/porting.rst | 16 +++++ Documentation/filesystems/vfs.rst | 24 ++++++- fs/9p/v9fs.h | 2 +- fs/9p/vfs_dentry.c | 26 +++++++- fs/afs/dir.c | 40 ++++-------- fs/ceph/dir.c | 25 ++------ fs/ceph/mds_client.c | 9 ++- fs/ceph/mds_client.h | 2 + fs/coda/dir.c | 3 +- fs/crypto/fname.c | 22 ++----- fs/dcache.c | 95 ++++++++++++++++------------ fs/ecryptfs/dentry.c | 18 ++++-- fs/exfat/namei.c | 11 +--- fs/ext4/fast_commit.c | 29 ++------- fs/ext4/fast_commit.h | 3 +- fs/fat/namei_vfat.c | 19 +++--- fs/fuse/dir.c | 20 +++--- fs/gfs2/dentry.c | 31 ++++----- fs/hfs/sysdep.c | 3 +- fs/jfs/namei.c | 3 +- fs/kernfs/dir.c | 3 +- fs/libfs.c | 15 +++-- fs/namei.c | 18 +++--- fs/nfs/dir.c | 62 ++++++++---------- fs/nfs/namespace.c | 2 +- fs/nfs/nfs3proc.c | 5 +- fs/nfs/nfs4proc.c | 20 +++--- fs/nfs/proc.c | 6 +- fs/ocfs2/dcache.c | 14 ++-- fs/orangefs/dcache.c | 22 +++---- fs/overlayfs/super.c | 22 ++++++- fs/proc/base.c | 6 +- fs/proc/fd.c | 3 +- fs/proc/generic.c | 6 +- fs/proc/proc_sysctl.c | 3 +- fs/smb/client/dir.c | 3 +- fs/tracefs/inode.c | 3 +- fs/vboxsf/dir.c | 3 +- include/linux/dcache.h | 23 +++++-- include/linux/fscrypt.h | 7 +- include/linux/nfs_xdr.h | 2 +- tools/testing/selftests/bpf/progs/find_vma.c | 2 +- 43 files changed, 352 insertions(+), 306 deletions(-)