Revert "raid: Remove now superfluous sentinel element from ctl_table array"

Message ID 20231221044925.10178-1-colyli@suse.de (mailing list archive)
State New, archived

Commit Message

Coly Li Dec. 21, 2023, 4:49 a.m. UTC
This reverts commit dd6291c506490c195620b394dc96763675e7e5f4.

With that commit applied, a kernel oops is triggered when creating an md device:
[  311.224353][ T3545] BUG: unable to handle page fault for address: 000003e800030d40
[  311.314951][ T3545] #PF: supervisor read access in kernel mode
[  311.384748][ T3545] #PF: error_code(0x0000) - not-present page
[  311.454538][ T3545] PGD 12be1c067 P4D 0
[  311.501451][ T3545] Oops: 0000 [#1] PREEMPT SMP NOPTI
[  311.561888][ T3545] CPU: 19 PID: 3545 Comm: modprobe [snipped...]
[  311.869958][ T3545] RIP: 0010:string+0x48/0xe0
[  311.923116][ T3545] Code: 3b 45 89 d1 45 31 c0 49 01 f9 66 45 85 d2 75 1a eb 1f 48 39 f7 73 02 88 07 48 83 c7 01 41 83 c0 01 48 83 c2 01 4c 39 cf 74 07 <0f> b6 02 84 c0 75 e1 48 89 f2 44 89 c6 e9 c6 e3 ff ff 48 c7 c0 3d
[  312.156194][ T3545] RSP: 0018:ffa000000b877a70 EFLAGS: 00010086
[  312.227025][ T3545] RAX: 000003e80002fd40 RBX: ffa000000b877b86 RCX: ffff0a00ffffff04
[  312.320737][ T3545] RDX: 000003e800030d40 RSI: ffa000000b877b68 RDI: ffa000000b877b86
[  312.414449][ T3545] RBP: ffa000000b877b48 R08: 0000000000000000 R09: ffa000010b877b85
[  312.508160][ T3545] R10: ffffffffffffffff R11: 0000000000000040 R12: ffa000000b877b68
[  312.601873][ T3545] R13: ffffffff99c221fa R14: 0000000000000008 R15: ffffffff99c221fa
[  312.695583][ T3545] FS:  00007fea7a856740(0000) GS:ff11000fffd80000(0000) knlGS:0000000000000000
[  312.800733][ T3545] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  312.877805][ T3545] CR2: 000003e800030d40 CR3: 0000000123790001 CR4: 0000000000771ee0
[  312.971518][ T3545] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  313.065229][ T3545] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  313.158940][ T3545] PKRU: 55555554
[  313.199610][ T3545] Call Trace:
[  313.237162][ T3545]  <TASK>
[  313.270554][ T3545]  ? __die+0x23/0x70
[  313.315391][ T3545]  ? page_fault_oops+0x14d/0x490
[  313.372701][ T3545]  ? update_load_avg+0x7e/0x7d0
[  313.428972][ T3545]  ? exc_page_fault+0x71/0x160
[  313.484203][ T3545]  ? asm_exc_page_fault+0x26/0x30
[  313.542555][ T3545]  ? string+0x48/0xe0
[  313.588426][ T3545]  vsnprintf+0x2d5/0x5a0
[  313.637417][ T3545]  vprintk_store+0x15e/0x4b0
[  313.690567][ T3545]  ? schedule_timeout+0x147/0x160
[  313.748918][ T3545]  ? wait_for_completion_killable+0x1a6/0x1d0
[  313.819750][ T3545]  vprintk_emit+0xc9/0x230
[  313.870823][ T3545]  _printk+0x5c/0x80
[  313.915657][ T3545]  sysctl_err+0x6a/0x90
[  313.963610][ T3545]  ? __kmalloc+0x4d/0x150
[  314.013639][ T3545]  __register_sysctl_table+0x144/0x7d0
[  314.077192][ T3545]  ? kmalloc_trace+0x2a/0xa0
[  314.130341][ T3545]  md_init+0xd2/0xff0 [snipped...]
[  314.228226][ T3545]  ? __pfx_md_init+0x10/0x10 [snipped...]
[  314.333383][ T3545]  do_one_initcall+0x47/0x220
[  314.387576][ T3545]  ? kmalloc_trace+0x2a/0xa0
[  314.440726][ T3545]  do_init_module+0x60/0x240
[  314.493878][ T3545]  __do_sys_finit_module+0xac/0x120
[  314.554308][ T3545]  do_syscall_64+0x5d/0x90
[  314.605380][ T3545]  ? ksys_lseek+0x66/0xb0
[  314.655411][ T3545]  ? syscall_exit_to_user_mode+0x2b/0x40
[  314.721042][ T3545]  ? do_syscall_64+0x6c/0x90
[  314.774194][ T3545]  ? exit_to_user_mode_prepare+0x142/0x1f0
[  314.841906][ T3545]  ? syscall_exit_to_user_mode+0x2b/0x40
[  314.907535][ T3545]  ? do_syscall_64+0x6c/0x90
[  314.960685][ T3545]  ? exc_page_fault+0x71/0x160
[  315.015917][ T3545]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
[  315.084667][ T3545] RIP: 0033:0x7fea79f161bd

The last NULL element in raid_table[] is necessary; after reverting that
commit, the above oops no longer occurs.

Fixes: dd6291c50649 ("raid: Remove now superfluous sentinel element from ctl_table array")
Signed-off-by: Coly Li <colyli@suse.de>
Cc: Joel Granados <j.granados@samsung.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
---
 drivers/md/md.c | 1 +
 1 file changed, 1 insertion(+)

Comments

Yu Kuai Dec. 21, 2023, 6:19 a.m. UTC | #1
Hi,

On 2023/12/21 12:49, Coly Li wrote:
> This reverts commit dd6291c506490c195620b394dc96763675e7e5f4.
> 
> With this patch, a kernel oops triggered when creating a md device,
> [  311.224353][ T3545] BUG: unable to handle page fault for address: 000003e800030d40
> [  311.314951][ T3545] #PF: supervisor read access in kernel mode
> [  311.384748][ T3545] #PF: error_code(0x0000) - not-present page
> [  311.454538][ T3545] PGD 12be1c067 P4D 0
> [  311.501451][ T3545] Oops: 0000 [#1] PREEMPT SMP NOPTI
> [  311.561888][ T3545] CPU: 19 PID: 3545 Comm: modprobe [snipped...]
> [  311.869958][ T3545] RIP: 0010:string+0x48/0xe0
> [  311.923116][ T3545] Code: 3b 45 89 d1 45 31 c0 49 01 f9 66 45 85 d2 75 1a eb 1f 48 39 f7 73 02 88 07 48 83 c7 01 41 83 c0 01 48 83 c2 01 4c 39 cf 74 07 <0f> b6 02 84 c0 75 e1 48 89 f2 44 89 c6 e9 c6 e3 ff ff 48 c7 c0 3d
> [  312.156194][ T3545] RSP: 0018:ffa000000b877a70 EFLAGS: 00010086
> [  312.227025][ T3545] RAX: 000003e80002fd40 RBX: ffa000000b877b86 RCX: ffff0a00ffffff04
> [  312.320737][ T3545] RDX: 000003e800030d40 RSI: ffa000000b877b68 RDI: ffa000000b877b86
> [  312.414449][ T3545] RBP: ffa000000b877b48 R08: 0000000000000000 R09: ffa000010b877b85
> [  312.508160][ T3545] R10: ffffffffffffffff R11: 0000000000000040 R12: ffa000000b877b68
> [  312.601873][ T3545] R13: ffffffff99c221fa R14: 0000000000000008 R15: ffffffff99c221fa
> [  312.695583][ T3545] FS:  00007fea7a856740(0000) GS:ff11000fffd80000(0000) knlGS:0000000000000000
> [  312.800733][ T3545] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  312.877805][ T3545] CR2: 000003e800030d40 CR3: 0000000123790001 CR4: 0000000000771ee0
> [  312.971518][ T3545] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [  313.065229][ T3545] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [  313.158940][ T3545] PKRU: 55555554
> [  313.199610][ T3545] Call Trace:
> [  313.237162][ T3545]  <TASK>
> [  313.270554][ T3545]  ? __die+0x23/0x70
> [  313.315391][ T3545]  ? page_fault_oops+0x14d/0x490
> [  313.372701][ T3545]  ? update_load_avg+0x7e/0x7d0
> [  313.428972][ T3545]  ? exc_page_fault+0x71/0x160
> [  313.484203][ T3545]  ? asm_exc_page_fault+0x26/0x30
> [  313.542555][ T3545]  ? string+0x48/0xe0
> [  313.588426][ T3545]  vsnprintf+0x2d5/0x5a0
> [  313.637417][ T3545]  vprintk_store+0x15e/0x4b0
> [  313.690567][ T3545]  ? schedule_timeout+0x147/0x160
> [  313.748918][ T3545]  ? wait_for_completion_killable+0x1a6/0x1d0
> [  313.819750][ T3545]  vprintk_emit+0xc9/0x230
> [  313.870823][ T3545]  _printk+0x5c/0x80
> [  313.915657][ T3545]  sysctl_err+0x6a/0x90
> [  313.963610][ T3545]  ? __kmalloc+0x4d/0x150
> [  314.013639][ T3545]  __register_sysctl_table+0x144/0x7d0
> [  314.077192][ T3545]  ? kmalloc_trace+0x2a/0xa0
> [  314.130341][ T3545]  md_init+0xd2/0xff0 [snipped...]
> [  314.228226][ T3545]  ? __pfx_md_init+0x10/0x10 [snipped...]
> [  314.333383][ T3545]  do_one_initcall+0x47/0x220
> [  314.387576][ T3545]  ? kmalloc_trace+0x2a/0xa0
> [  314.440726][ T3545]  do_init_module+0x60/0x240
> [  314.493878][ T3545]  __do_sys_finit_module+0xac/0x120
> [  314.554308][ T3545]  do_syscall_64+0x5d/0x90
> [  314.605380][ T3545]  ? ksys_lseek+0x66/0xb0
> [  314.655411][ T3545]  ? syscall_exit_to_user_mode+0x2b/0x40
> [  314.721042][ T3545]  ? do_syscall_64+0x6c/0x90
> [  314.774194][ T3545]  ? exit_to_user_mode_prepare+0x142/0x1f0
> [  314.841906][ T3545]  ? syscall_exit_to_user_mode+0x2b/0x40
> [  314.907535][ T3545]  ? do_syscall_64+0x6c/0x90
> [  314.960685][ T3545]  ? exc_page_fault+0x71/0x160
> [  315.015917][ T3545]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
> [  315.084667][ T3545] RIP: 0033:0x7fea79f161bd
> 
> The last NULL element in raid_table[] is necessary, after reverting this

Based on the commit message, avoiding the last NULL element is exactly
what [1] did. If that is not true, can you explain in more detail how
sysctl_err() is called from md_init()? I can't find this by code review,
and I think maybe it's better to fix this in the sysctl error path.

Thanks,
Kuai

[1] https://lore.kernel.org/all/ZO5Yx5JFogGi%2FcBo@bombadil.infradead.org/
> patch, the above oops message is removed.
> 
> Fixes: dd6291c50649 ("raid: Remove now superfluous sentinel element from ctl_table array")
> Signed-off-by: Coly Li <colyli@suse.de>
> Cc: Joel Granados <j.granados@samsung.com>
> Cc: Luis Chamberlain <mcgrof@kernel.org>
> ---
>   drivers/md/md.c | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 9bdd57324c37..90481ed6fdbb 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -312,6 +312,7 @@ static struct ctl_table raid_table[] = {
>   		.mode		= S_IRUGO|S_IWUSR,
>   		.proc_handler	= proc_dointvec,
>   	},
> +	{ }
>   };
>   
>   static int start_readonly;
>
Coly Li Dec. 21, 2023, 7:11 a.m. UTC | #2
> On Dec 21, 2023, at 14:19, Yu Kuai <yukuai1@huaweicloud.com> wrote:
> 
> Hi,
> 
> On 2023/12/21 12:49, Coly Li wrote:
>> This reverts commit dd6291c506490c195620b394dc96763675e7e5f4.
>> With this patch, a kernel oops triggered when creating a md device,
>> [  311.224353][ T3545] BUG: unable to handle page fault for address: 000003e800030d40
>> [  311.314951][ T3545] #PF: supervisor read access in kernel mode
>> [  311.384748][ T3545] #PF: error_code(0x0000) - not-present page
>> [  311.454538][ T3545] PGD 12be1c067 P4D 0
>> [  311.501451][ T3545] Oops: 0000 [#1] PREEMPT SMP NOPTI
>> [  311.561888][ T3545] CPU: 19 PID: 3545 Comm: modprobe [snipped...]
>> [  311.869958][ T3545] RIP: 0010:string+0x48/0xe0
>> [  311.923116][ T3545] Code: 3b 45 89 d1 45 31 c0 49 01 f9 66 45 85 d2 75 1a eb 1f 48 39 f7 73 02 88 07 48 83 c7 01 41 83 c0 01 48 83 c2 01 4c 39 cf 74 07 <0f> b6 02 84 c0 75 e1 48 89 f2 44 89 c6 e9 c6 e3 ff ff 48 c7 c0 3d
>> [  312.156194][ T3545] RSP: 0018:ffa000000b877a70 EFLAGS: 00010086
>> [  312.227025][ T3545] RAX: 000003e80002fd40 RBX: ffa000000b877b86 RCX: ffff0a00ffffff04
>> [  312.320737][ T3545] RDX: 000003e800030d40 RSI: ffa000000b877b68 RDI: ffa000000b877b86
>> [  312.414449][ T3545] RBP: ffa000000b877b48 R08: 0000000000000000 R09: ffa000010b877b85
>> [  312.508160][ T3545] R10: ffffffffffffffff R11: 0000000000000040 R12: ffa000000b877b68
>> [  312.601873][ T3545] R13: ffffffff99c221fa R14: 0000000000000008 R15: ffffffff99c221fa
>> [  312.695583][ T3545] FS:  00007fea7a856740(0000) GS:ff11000fffd80000(0000) knlGS:0000000000000000
>> [  312.800733][ T3545] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [  312.877805][ T3545] CR2: 000003e800030d40 CR3: 0000000123790001 CR4: 0000000000771ee0
>> [  312.971518][ T3545] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> [  313.065229][ T3545] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> [  313.158940][ T3545] PKRU: 55555554
>> [  313.199610][ T3545] Call Trace:
>> [  313.237162][ T3545]  <TASK>
>> [  313.270554][ T3545]  ? __die+0x23/0x70
>> [  313.315391][ T3545]  ? page_fault_oops+0x14d/0x490
>> [  313.372701][ T3545]  ? update_load_avg+0x7e/0x7d0
>> [  313.428972][ T3545]  ? exc_page_fault+0x71/0x160
>> [  313.484203][ T3545]  ? asm_exc_page_fault+0x26/0x30
>> [  313.542555][ T3545]  ? string+0x48/0xe0
>> [  313.588426][ T3545]  vsnprintf+0x2d5/0x5a0
>> [  313.637417][ T3545]  vprintk_store+0x15e/0x4b0
>> [  313.690567][ T3545]  ? schedule_timeout+0x147/0x160
>> [  313.748918][ T3545]  ? wait_for_completion_killable+0x1a6/0x1d0
>> [  313.819750][ T3545]  vprintk_emit+0xc9/0x230
>> [  313.870823][ T3545]  _printk+0x5c/0x80
>> [  313.915657][ T3545]  sysctl_err+0x6a/0x90
>> [  313.963610][ T3545]  ? __kmalloc+0x4d/0x150
>> [  314.013639][ T3545]  __register_sysctl_table+0x144/0x7d0
>> [  314.077192][ T3545]  ? kmalloc_trace+0x2a/0xa0
>> [  314.130341][ T3545]  md_init+0xd2/0xff0 [snipped...]
>> [  314.228226][ T3545]  ? __pfx_md_init+0x10/0x10 [snipped...]
>> [  314.333383][ T3545]  do_one_initcall+0x47/0x220
>> [  314.387576][ T3545]  ? kmalloc_trace+0x2a/0xa0
>> [  314.440726][ T3545]  do_init_module+0x60/0x240
>> [  314.493878][ T3545]  __do_sys_finit_module+0xac/0x120
>> [  314.554308][ T3545]  do_syscall_64+0x5d/0x90
>> [  314.605380][ T3545]  ? ksys_lseek+0x66/0xb0
>> [  314.655411][ T3545]  ? syscall_exit_to_user_mode+0x2b/0x40
>> [  314.721042][ T3545]  ? do_syscall_64+0x6c/0x90
>> [  314.774194][ T3545]  ? exit_to_user_mode_prepare+0x142/0x1f0
>> [  314.841906][ T3545]  ? syscall_exit_to_user_mode+0x2b/0x40
>> [  314.907535][ T3545]  ? do_syscall_64+0x6c/0x90
>> [  314.960685][ T3545]  ? exc_page_fault+0x71/0x160
>> [  315.015917][ T3545]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
>> [  315.084667][ T3545] RIP: 0033:0x7fea79f161bd
>> The last NULL element in raid_table[] is necessary, after reverting this
> 

Hi Kuai,

> Based on the commit message, avoiding the last NULL element is exactly
> what [1] did. If that is not true, can you explain in more detail how
> sysctl_err() is called from md_init()? I can't find this by code review,
> and I think maybe it's better to fix this in the sysctl error path.
> 

I think you are right! The test was based on a backport to a stable tree, and the register_sysctl() related code was not included.
After looking at the sysctl changes, I think the oops should go away once those changes are applied as well.

Thanks, and please ignore the noise.

Coly Li

[snipped]
Luis Chamberlain Dec. 21, 2023, 7:17 p.m. UTC | #3
On Thu, Dec 21, 2023 at 02:19:56PM +0800, Yu Kuai wrote:
> I can't find this by code review, and I think
> maybe it's better to fix this in sysctl error path.

Indeed, we want to fix anything in the way to remove the empty sentinel,
we continue to do that in queued work on sysctl-next [0]. Although I
won't be able to diagnose this right away, could you try the out of
bounds fix by Joel [1] instead?

We want to identify what caused this and fix it within sysctl code.

[0] https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux.git/log/?h=sysctl-next
[1] https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux.git/commit/?h=sysctl-next&id=fd696ee2395755a292f7d49bf4c701a5bab2f076

  Luis
Coly Li Dec. 22, 2023, 10:17 a.m. UTC | #4
> On Dec 22, 2023, at 03:17, Luis Chamberlain <mcgrof@kernel.org> wrote:
> 
> On Thu, Dec 21, 2023 at 02:19:56PM +0800, Yu Kuai wrote:
>> I can't find this by code review, and I think
>> maybe it's better to fix this in sysctl error path.
> 
> Indeed, we want to fix anything in the way to remove the empty sentinel,
> we continue to do that in queued work on sysctl-next [0]. Although I
> won't be able to diagnose this right away, could you try the out of
> bounds fix by Joel [1] instead?
> 
> We want to identify what caused this and fix it within sysctl code.
> 
> [0] https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux.git/log/?h=sysctl-next
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux.git/commit/?h=sysctl-next&id=fd696ee2395755a292f7d49bf4c701a5bab2f076

Hi Luis,

Thanks for the above information. IMHO your code is good. When I cherry-picked the upstream md code for testing, the sysctl-related change escaped my notice. Please ignore my noise.

Coly Li
Luis Chamberlain Dec. 22, 2023, 5:31 p.m. UTC | #5
On Fri, Dec 22, 2023 at 06:17:47PM +0800, Coly Li wrote:
> 
> 
> > On Dec 22, 2023, at 03:17, Luis Chamberlain <mcgrof@kernel.org> wrote:
> > 
> > On Thu, Dec 21, 2023 at 02:19:56PM +0800, Yu Kuai wrote:
> >> I can't find this by code review, and I think
> >> maybe it's better to fix this in sysctl error path.
> > 
> > Indeed, we want to fix anything in the way to remove the empty sentinel,
> > we continue to do that in queued work on sysctl-next [0]. Although I
> > won't be able to diagnose this right away, could you try the out of
> > bounds fix by Joel [1] instead?
> > 
> > We want to identify what caused this and fix it within sysctl code.
> > 
> > [0] https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux.git/log/?h=sysctl-next
> > [1] https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux.git/commit/?h=sysctl-next&id=fd696ee2395755a292f7d49bf4c701a5bab2f076
> 
> Hi Luis,
> 
> Thanks for the above information. IMHO your code is good. When I
> cherry-picked the upstream md code for testing, the sysctl-related
> change escaped my notice. Please ignore my noise.

Great thanks for the heads up! Happy holidays.

  Luis
Joel Granados Dec. 23, 2023, 12:25 p.m. UTC | #6
On Fri, Dec 22, 2023 at 09:31:18AM -0800, Luis Chamberlain wrote:
> On Fri, Dec 22, 2023 at 06:17:47PM +0800, Coly Li wrote:
> > 
> > 
> > > On Dec 22, 2023, at 03:17, Luis Chamberlain <mcgrof@kernel.org> wrote:
> > > 
> > > On Thu, Dec 21, 2023 at 02:19:56PM +0800, Yu Kuai wrote:
> > >> I can't find this by code review, and I think
> > >> maybe it's better to fix this in sysctl error path.
> > > 
> > > Indeed, we want to fix anything in the way to remove the empty sentinel,
> > > we continue to do that in queued work on sysctl-next [0]. Although I
> > > won't be able to diagnose this right away, could you try the out of
> > > bounds fix by Joel [1] instead?
> > > 
> > > We want to identify what caused this and fix it within sysctl code.
> > > 
> > > [0] https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux.git/log/?h=sysctl-next
> > > [1] https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux.git/commit/?h=sysctl-next&id=fd696ee2395755a292f7d49bf4c701a5bab2f076
> > 
> > Hi Luis,
> > 
> > Thanks for the above information. IMHO your code is good. When I
> > cherry-picked the upstream md code for testing, the sysctl-related
> > change escaped my notice. Please ignore my noise.

So this was triggered because the tree was missing the changes that
actually handled the removal of the sentinel?

Get back to me if the oops persists even after you have included the
changes in sysctl-next

Best

> 
> Great thanks for the heads up! Happy holidays.
> 
>   Luis
Coly Li Dec. 23, 2023, 12:42 p.m. UTC | #7
> On Dec 23, 2023, at 20:25, Joel Granados <j.granados@samsung.com> wrote:
> 
> On Fri, Dec 22, 2023 at 09:31:18AM -0800, Luis Chamberlain wrote:
>> On Fri, Dec 22, 2023 at 06:17:47PM +0800, Coly Li wrote:
>>> 
>>> 
>>>> On Dec 22, 2023, at 03:17, Luis Chamberlain <mcgrof@kernel.org> wrote:
>>>> 
>>>> On Thu, Dec 21, 2023 at 02:19:56PM +0800, Yu Kuai wrote:
>>>>> I can't find this by code review, and I think
>>>>> maybe it's better to fix this in sysctl error path.
>>>> 
>>>> Indeed, we want to fix anything in the way to remove the empty sentinel,
>>>> we continue to do that in queued work on sysctl-next [0]. Although I
>>>> won't be able to diagnose this right away, could you try the out of
>>>> bounds fix by Joel [1] instead?
>>>> 
>>>> We want to identify what caused this and fix it within sysctl code.
>>>> 
>>>> [0] https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux.git/log/?h=sysctl-next
>>>> [1] https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux.git/commit/?h=sysctl-next&id=fd696ee2395755a292f7d49bf4c701a5bab2f076
>>> 
>>> Hi Luis,
>>> 
>>> Thanks for the above information. IMHO your code is good. When I
>>> cherry-picked the upstream md code for testing, the sysctl-related
>>> change escaped my notice. Please ignore my noise.
> 
> So this was triggered because the tree was missing the changes that
> actually handled the removal of the sentinel?
> 

It came from a regular update of the subsystem I maintain for our own product. This happens quite commonly when a tree-wide change lands and subsystem maintainers of downstream products are not aware of the changes outside their subsystem.

Just as I said, please ignore the noise.

How can such unnecessary noise be avoided? Maybe patches to subsystems should carry more information about the tree-wide change they belong to, e.g. "this patch goes with the core change xxx; when you pick it for backport, please also be aware of the changes in xxxx".

But this is a suggestion, not mandatory. In the end, the developer who picked only part of the patches for a backport will find out where the mistake was made, just like this time.


> Get back to me if the oops persists even after you have included the
> changes in sysctl-next

Thanks for the help. I do appreciate it :-)

Coly Li
Joel Granados Dec. 23, 2023, 12:58 p.m. UTC | #8
On Sat, Dec 23, 2023 at 08:42:35PM +0800, Coly Li wrote:
> 
> 
> > On Dec 23, 2023, at 20:25, Joel Granados <j.granados@samsung.com> wrote:
> > 
> > On Fri, Dec 22, 2023 at 09:31:18AM -0800, Luis Chamberlain wrote:
> >> On Fri, Dec 22, 2023 at 06:17:47PM +0800, Coly Li wrote:
> >>> 
> >>> 
> >>>> On Dec 22, 2023, at 03:17, Luis Chamberlain <mcgrof@kernel.org> wrote:
> >>>> 
> >>>> On Thu, Dec 21, 2023 at 02:19:56PM +0800, Yu Kuai wrote:
> >>>>> I can't find this by code review, and I think
> >>>>> maybe it's better to fix this in sysctl error path.
> >>>> 
> >>>> Indeed, we want to fix anything in the way to remove the empty sentinel,
> >>>> we continue to do that in queued work on sysctl-next [0]. Although I
> >>>> won't be able to diagnose this right away, could you try the out of
> >>>> bounds fix by Joel [1] instead?
> >>>> 
> >>>> We want to identify what caused this and fix it within sysctl code.
> >>>> 
> >>>> [0] https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux.git/log/?h=sysctl-next
> >>>> [1] https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux.git/commit/?h=sysctl-next&id=fd696ee2395755a292f7d49bf4c701a5bab2f076
> >>> 
> >>> Hi Luis,
> >>> 
> >>> Thanks for the above information. IMHO your code is good. When I
> >>> cherry-picked the upstream md code for testing, the sysctl-related
> >>> change escaped my notice. Please ignore my noise.
> > 
> > So this was triggered because the tree was missing the changes that
> > actually handled the removal of the sentinel?
> > 
> 
> It came from a regular update of the subsystem I maintain for our own
> product. This happens quite commonly when a tree-wide change lands
> and subsystem maintainers of downstream products are not aware of
> the changes outside their subsystem.
> 
> Just as I said, please ignore the noise.
> 
> How can such unnecessary noise be avoided? Maybe patches to subsystems
> should carry more information about the tree-wide change they belong
> to, e.g. "this patch goes with the core change xxx; when you pick it
> for backport, please also be aware of the changes in xxxx".

I'll take this into account for what is missing from the sentinel
changes. It is good to leave breadcrumbs in the commit.

Thx for the suggestion. Just have to remember to do it when I post it.

Best
> 
> But this is a suggestion, not mandatory. In the end, the developer who
> picked only part of the patches for a backport will find out where the
> mistake was made, just like this time.
> 
> 
> > Get back to me if the oops persists even after you have included the
> > changes in sysctl-next
> 
> Thanks for the help. I do appreciate it :-)
> 
> Coly Li

Patch

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 9bdd57324c37..90481ed6fdbb 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -312,6 +312,7 @@  static struct ctl_table raid_table[] = {
 		.mode		= S_IRUGO|S_IWUSR,
 		.proc_handler	= proc_dointvec,
 	},
+	{ }
 };
 
 static int start_readonly;
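
The mismatch discussed in the thread can be illustrated with a small user-space sketch. This is not the kernel's code: `struct demo_ctl_table`, `demo_count_sentinel()`, `demo_count_sized()` and `DEMO_ARRAY_SIZE()` are hypothetical names made up for the illustration, and the entry names are only placeholders. The sketch assumes the older convention walks the table until it meets an entry with a NULL `procname`, while the newer convention is told the element count by the caller, which is what makes the trailing `{ }` superfluous.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's struct ctl_table. */
struct demo_ctl_table {
	const char *procname;	/* NULL marks the sentinel in the old scheme */
	int mode;
};

/* Illustrative equivalent of an ARRAY_SIZE()-style element count. */
#define DEMO_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Table written for the old convention: terminated by an empty entry. */
static const struct demo_ctl_table demo_with_sentinel[] = {
	{ .procname = "speed_limit_min", .mode = 0644 },
	{ .procname = "speed_limit_max", .mode = 0644 },
	{ .procname = NULL }	/* sentinel; the kernel source writes this as { } */
};

/* Table written for the new convention: no terminating entry. */
static const struct demo_ctl_table demo_without_sentinel[] = {
	{ .procname = "speed_limit_min", .mode = 0644 },
	{ .procname = "speed_limit_max", .mode = 0644 },
};

/*
 * Old-style walk: stop at the first entry whose procname is NULL.
 * Only valid for tables that really end with a sentinel; on a table
 * without one it reads past the end of the array.
 */
static size_t demo_count_sentinel(const struct demo_ctl_table *table)
{
	size_t n = 0;

	assert(table != NULL);
	while (table[n].procname)
		n++;
	return n;
}

/*
 * New-style walk: the caller supplies the element count, so the loop
 * never touches memory beyond the array. Entries with a NULL procname
 * are skipped, so a leftover sentinel is harmless.
 */
static size_t demo_count_sized(const struct demo_ctl_table *table,
			       size_t table_size)
{
	size_t n = 0;

	assert(table != NULL);
	for (size_t i = 0; i < table_size; i++) {
		if (table[i].procname)
			n++;
	}
	return n;
}
```

Both `demo_count_sentinel(demo_with_sentinel)` and `demo_count_sized(demo_without_sentinel, DEMO_ARRAY_SIZE(demo_without_sentinel))` are well defined. Passing `demo_without_sentinel` to `demo_count_sentinel()` is not: the loop runs off the end of the array and treats whatever follows as a `procname` pointer. That appears consistent with the oops above, where `string()` faults on a bogus address while `sysctl_err()` formats a message, and with the explanation that the backport took the md change without the matching sysctl core changes.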