Message ID | 20210918064822.58208-1-xuanzhuo@linux.alibaba.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net] napi: fix race inside napi_enable | expand |
Context | Check | Description |
---|---|---|
netdev/cover_letter | success | Link |
netdev/fixes_present | fail | Series targets non-next tree, but doesn't contain any Fixes tags |
netdev/patch_count | success | Link |
netdev/tree_selection | success | Clearly marked for net |
netdev/subject_prefix | success | Link |
netdev/cc_maintainers | success | CCed 12 of 12 maintainers |
netdev/source_inline | success | Was 0 now: 0 |
netdev/verify_signedoff | success | Link |
netdev/module_param | success | Was 0 now: 0 |
netdev/build_32bit | success | Errors and warnings before: 6 this patch: 6 |
netdev/kdoc | success | Errors and warnings before: 0 this patch: 0 |
netdev/verify_fixes | success | Link |
netdev/checkpatch | success | total: 0 errors, 0 warnings, 0 checks, 24 lines checked |
netdev/build_allmodconfig_warn | success | Errors and warnings before: 6 this patch: 6 |
netdev/header_inline | success | Link |
On 2021/9/18 14:48, Xuan Zhuo wrote: > The process will cause napi.state to contain NAPI_STATE_SCHED and > not in the poll_list, which will cause napi_disable() to get stuck. > > The prefix "NAPI_STATE_" is removed in the figure below, and > NAPI_STATE_HASHED is ignored in napi.state. > > CPU0 | CPU1 | napi.state > =============================================================================== > napi_disable() | | SCHED | NPSVC > napi_enable() | | > { | | > smp_mb__before_atomic(); | | > clear_bit(SCHED, &n->state); | | NPSVC > | napi_schedule_prep() | SCHED | NPSVC > | napi_poll() | > | napi_complete_done() | > | { | > | if (n->state & (NPSVC | | (1) > | _BUSY_POLL))) | > | return false; | > | ................ | > | } | SCHED | NPSVC > | | > clear_bit(NPSVC, &n->state); | | SCHED > } | | > | | > napi_schedule_prep() | | SCHED | MISSED (2) > > (1) Here return direct. Because of NAPI_STATE_NPSVC exists. > (2) NAPI_STATE_SCHED exists. So not add napi.poll_list to sd->poll_list > > Since NAPI_STATE_SCHED already exists and napi is not in the > sd->poll_list queue, NAPI_STATE_SCHED cannot be cleared and will always > exist. > > 1. This will cause this queue to no longer receive packets. > 2. If you encounter napi_disable under the protection of rtnl_lock, it > will cause the entire rtnl_lock to be locked, affecting the overall > system. > > This patch uses cmpxchg to implement napi_enable(), which ensures that > there will be no race due to the separation of clear two bits. > a Fixes tag is needed here. > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> > Reviewed-by: Dust Li <dust.li@linux.alibaba.com> > --- > net/core/dev.c | 17 ++++++++++++----- > 1 file changed, 12 insertions(+), 5 deletions(-) > > diff --git a/net/core/dev.c b/net/core/dev.c > index 74fd402d26dd..3457ae964b8c 100644 > --- a/net/core/dev.c > +++ b/net/core/dev.c > @@ -6923,12 +6923,19 @@ EXPORT_SYMBOL(napi_disable); > */ > void napi_enable(struct napi_struct *n) > { > + unsigned long val, new; > + > BUG_ON(!test_bit(NAPI_STATE_SCHED, &n->state)); > - smp_mb__before_atomic(); > - clear_bit(NAPI_STATE_SCHED, &n->state); > - clear_bit(NAPI_STATE_NPSVC, &n->state); > - if (n->dev->threaded && n->thread) > - set_bit(NAPI_STATE_THREADED, &n->state); > + > + do { > + val = READ_ONCE(n->state); It seems we might need to move the above BUG_ON here to preserve the orginial BUG_ON behavior? > + > + new = val & ~(NAPIF_STATE_SCHED | NAPIF_STATE_NPSVC); > + > + if (n->dev->threaded && n->thread) > + new |= NAPIF_STATE_THREADED; > + > + } while (cmpxchg(&n->state, val, new) != val); > } > EXPORT_SYMBOL(napi_enable); > >
diff --git a/net/core/dev.c b/net/core/dev.c index 74fd402d26dd..3457ae964b8c 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -6923,12 +6923,19 @@ EXPORT_SYMBOL(napi_disable); */ void napi_enable(struct napi_struct *n) { + unsigned long val, new; + BUG_ON(!test_bit(NAPI_STATE_SCHED, &n->state)); - smp_mb__before_atomic(); - clear_bit(NAPI_STATE_SCHED, &n->state); - clear_bit(NAPI_STATE_NPSVC, &n->state); - if (n->dev->threaded && n->thread) - set_bit(NAPI_STATE_THREADED, &n->state); + + do { + val = READ_ONCE(n->state); + + new = val & ~(NAPIF_STATE_SCHED | NAPIF_STATE_NPSVC); + + if (n->dev->threaded && n->thread) + new |= NAPIF_STATE_THREADED; + + } while (cmpxchg(&n->state, val, new) != val); } EXPORT_SYMBOL(napi_enable);