[net,0/2] net/smc: Two fixes for smc fallback

Message ID 1650614179-11529-1-git-send-email-guwen@linux.alibaba.com (mailing list archive)

Message

Wen Gu April 22, 2022, 7:56 a.m. UTC
This patch set includes two fixes for smc fallback:

Patch 1/2 introduces simple helpers that wrap replacing and
restoring clcsock's callback functions, and makes sure that only
the original callbacks are saved and never overwritten.
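
To illustrate the "save only the original" idea, here is a minimal
userspace sketch. It is not the code from the patch: struct mock_sock,
replace_cb(), restore_cb() and callback_demo() are hypothetical names
chosen for illustration, and the locking a kernel implementation would
need is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a socket; only the callback slot matters. */
struct mock_sock {
	void (*sk_error_report)(struct mock_sock *sk);
};

/* Install new_cb in *target_cb. The previous callback is saved only if
 * nothing has been saved yet, so calling this twice cannot overwrite
 * the original with an already-replaced callback.
 */
static void replace_cb(void (**target_cb)(struct mock_sock *),
		       void (*new_cb)(struct mock_sock *),
		       void (**saved_cb)(struct mock_sock *))
{
	if (!*saved_cb)
		*saved_cb = *target_cb;
	*target_cb = new_cb;
}

/* Put the saved callback back and clear the save slot. Does nothing
 * if no callback was saved.
 */
static void restore_cb(void (**target_cb)(struct mock_sock *),
		       void (**saved_cb)(struct mock_sock *))
{
	if (!*saved_cb)
		return;
	*target_cb = *saved_cb;
	*saved_cb = NULL;
}

static void orig_report(struct mock_sock *sk) { (void)sk; }
static void wrapped_report(struct mock_sock *sk) { (void)sk; }

/* Replace twice, then restore: the original callback must survive. */
static void callback_demo(void)
{
	struct mock_sock sk = { .sk_error_report = orig_report };
	void (*saved)(struct mock_sock *) = NULL;

	replace_cb(&sk.sk_error_report, wrapped_report, &saved);
	replace_cb(&sk.sk_error_report, wrapped_report, &saved);
	assert(saved == orig_report);

	restore_cb(&sk.sk_error_report, &saved);
	assert(sk.sk_error_report == orig_report);
	assert(saved == NULL);
}
```

Without the `if (!*saved_cb)` check, the second replace_cb() call would
save wrapped_report as the "original" and the real one would be lost.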

Patch 2/2 fixes a slab-out-of-bounds issue reported by syzbot, in
which smc_fback_error_report() accesses an already freed smc sock (see
https://lore.kernel.org/r/00000000000013ca8105d7ae3ada@google.com/).
The patch fixes it by resetting sk_user_data and restoring the clcsock
callback functions promptly in the fallback case.
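
The general pattern behind this fix can be sketched in userspace as
follows. This is an illustration under assumptions, not the patch
itself: struct fb_sock, struct fb_owner, fb_attach() and fb_detach()
are hypothetical names, and the callback locking that kernel code
would need is omitted.

```c
#include <assert.h>
#include <stddef.h>

struct fb_sock;

/* Hypothetical owner object (plays the role of the smc sock). */
struct fb_owner {
	int error_reports;
	void (*saved_error_report)(struct fb_sock *sk);
};

/* Hypothetical underlying socket (plays the role of clcsock). */
struct fb_sock {
	void *user_data;
	void (*error_report)(struct fb_sock *sk);
	int plain_reports;
};

static void plain_error_report(struct fb_sock *sk)
{
	sk->plain_reports++;
}

/* Wrapper callback: never touches the owner once user_data is reset. */
static void fb_error_report(struct fb_sock *sk)
{
	struct fb_owner *owner = sk->user_data;

	if (!owner)
		return;
	owner->error_reports++;
	if (owner->saved_error_report)
		owner->saved_error_report(sk);
}

/* Hook the wrapper in, saving the original callback only once. */
static void fb_attach(struct fb_sock *sk, struct fb_owner *owner)
{
	sk->user_data = owner;
	if (!owner->saved_error_report)
		owner->saved_error_report = sk->error_report;
	sk->error_report = fb_error_report;
}

/* Call before the owner is freed: restore the original callback and
 * reset user_data so nothing can reach the owner through the socket.
 */
static void fb_detach(struct fb_sock *sk)
{
	struct fb_owner *owner = sk->user_data;

	if (!owner)
		return;
	if (owner->saved_error_report) {
		sk->error_report = owner->saved_error_report;
		owner->saved_error_report = NULL;
	}
	sk->user_data = NULL;
}
```

After fb_detach(), an error report on the socket runs only the original
callback, and even a stale call to fb_error_report() returns early
instead of dereferencing the owner.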

Note, however, that although patch 2/2 fixes the issue
'slab-out-of-bounds/use-after-free in smc_fback_error_report',
the series does not pass the syzbot reproducer test: with these
two patches applied on upstream, the syzbot reproducer triggers
another known issue:

==================================================================
BUG: KASAN: use-after-free in tcp_retransmit_timer+0x2ef3/0x3360 net/ipv4/tcp_timer.c:511
Read of size 8 at addr ffff888020328380 by task udevd/4158

CPU: 1 PID: 4158 Comm: udevd Not tainted 5.18.0-rc3-syzkaller-00074-gb05a5683eba6-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 <IRQ>
  __dump_stack lib/dump_stack.c:88 [inline]
  dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
  print_address_description.constprop.0.cold+0xeb/0x467 mm/kasan/report.c:313
  print_report mm/kasan/report.c:429 [inline]
  kasan_report.cold+0xf4/0x1c6 mm/kasan/report.c:491
  tcp_retransmit_timer+0x2ef3/0x3360 net/ipv4/tcp_timer.c:511
  tcp_write_timer_handler+0x5e6/0xbc0 net/ipv4/tcp_timer.c:622
  tcp_write_timer+0xa2/0x2b0 net/ipv4/tcp_timer.c:642
  call_timer_fn+0x1a5/0x6b0 kernel/time/timer.c:1421
  expire_timers kernel/time/timer.c:1466 [inline]
  __run_timers.part.0+0x679/0xa80 kernel/time/timer.c:1737
  __run_timers kernel/time/timer.c:1715 [inline]
  run_timer_softirq+0xb3/0x1d0 kernel/time/timer.c:1750
  __do_softirq+0x29b/0x9c2 kernel/softirq.c:558
  invoke_softirq kernel/softirq.c:432 [inline]
  __irq_exit_rcu+0x123/0x180 kernel/softirq.c:637
  irq_exit_rcu+0x5/0x20 kernel/softirq.c:649
  sysvec_apic_timer_interrupt+0x93/0xc0 arch/x86/kernel/apic/apic.c:1097
 </IRQ>
 ...
(the detailed report can be found at https://syzkaller.appspot.com/text?tag=CrashReport&x=15406b44f00000)

IMHO, the above issue is the same as this known one: https://syzkaller.appspot.com/bug?extid=694120e1002c117747ed,
and it does not seem to be related to SMC. The discussion of this known issue is ongoing and can be found at
https://lore.kernel.org/bpf/000000000000f75af905d3ba0716@google.com/T/.

With the temporary solution mentioned in the above discussion added on
top of my two patches, the syzbot reproducer for 'slab-out-of-bounds/
use-after-free in smc_fback_error_report' no longer triggers any issue.

Wen Gu (2):
  net/smc: Only save the original clcsock callback functions
  net/smc: Fix slab-out-of-bounds issue in fallback

 net/smc/af_smc.c    | 135 ++++++++++++++++++++++++++++++++++++----------------
 net/smc/smc.h       |  29 +++++++++++
 net/smc/smc_close.c |   5 +-
 3 files changed, 126 insertions(+), 43 deletions(-)

Comments

Karsten Graul April 23, 2022, 10:26 a.m. UTC | #1
On 22/04/2022 09:56, Wen Gu wrote:
> This patch set includes two fixes for smc fallback:
> 
> Patch 1/2 introduces some simple helpers to wrap the replacement
> and restore of clcsock's callback functions. Make sure that only
> the original callbacks will be saved and not overwritten.
> 
> Patch 2/2 fixes a syzbot reporting slab-out-of-bound issue where
> smc_fback_error_report() accesses the already freed smc sock (see
> https://lore.kernel.org/r/00000000000013ca8105d7ae3ada@google.com/).
> The patch fixes it by resetting sk_user_data and restoring clcsock
> callback functions timely in fallback situation.

Thank you for the analysis and the fix!

For the series:
Acked-by: Karsten Graul <kgraul@linux.ibm.com>

patchwork-bot+netdevbpf@kernel.org April 25, 2022, 7:10 p.m. UTC | #2
Hello:

This series was applied to netdev/net.git (master)
by Jakub Kicinski <kuba@kernel.org>:

On Fri, 22 Apr 2022 15:56:17 +0800 you wrote:
> This patch set includes two fixes for smc fallback:
> 
> Patch 1/2 introduces some simple helpers to wrap the replacement
> and restore of clcsock's callback functions. Make sure that only
> the original callbacks will be saved and not overwritten.
> 
> Patch 2/2 fixes a syzbot reporting slab-out-of-bound issue where
> smc_fback_error_report() accesses the already freed smc sock (see
> https://lore.kernel.org/r/00000000000013ca8105d7ae3ada@google.com/).
> The patch fixes it by resetting sk_user_data and restoring clcsock
> callback functions timely in fallback situation.
> 
> [...]

Here is the summary with links:
  - [net,1/2] net/smc: Only save the original clcsock callback functions
    https://git.kernel.org/netdev/net/c/97b9af7a7093
  - [net,2/2] net/smc: Fix slab-out-of-bounds issue in fallback
    https://git.kernel.org/netdev/net/c/0558226cebee

You are awesome, thank you!