From patchwork Mon Oct 10 03:41:17 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Linus Torvalds
X-Patchwork-Id: 9368951
References: <20161010005105.GA18349@breakpoint.cc>
From: Linus Torvalds
Date: Sun, 9 Oct 2016 20:41:17 -0700
Subject: Re: slab corruption with current -git (was Re: [git pull] vfs pile 1 (splice))
To: Aaron Conole
Cc: Florian Westphal, Al Viro, Andrew Morton, Jens Axboe, "Ted Ts'o",
 Christoph Lameter, David Miller, Pablo Neira Ayuso,
 Linux Kernel Mailing List, linux-fsdevel, Network Development, NetFilter
Sender: linux-fsdevel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org

On Sun, Oct 9, 2016 at 7:49 PM, Linus Torvalds wrote:
>
> There is one *correct* way to remove an entry from a singly linked
> list, and it looks like this:
>
>         struct entry **pp, *p;
>
>         pp = &head;
>         while ((p = *pp) != NULL) {
>                 if (right_entry(p)) {
>                         *pp = p->next;
>                         break;
>                 }
>                 pp = &p->next;
>         }
>
> and that's it. Nothing else.

This COMPLETELY UNTESTED patch tries to fix the nf_hook_entry code to do this.

I repeat: it's ENTIRELY UNTESTED. I just converted the insertion and
deletion to the proper pattern, but I could easily have gotten the
insertion priority test the wrong way around entirely, for example. Or
it could simply have some other completely broken bug in it. It
compiles for me, but that's all I actually checked.

Note that the "correct way" of doing list operations also almost
inevitably is the shortest way by far, since it gets rid of all the
special cases. So the patch looks nice. It gets rid of the magic
"nf_set_hooks_head()" thing too, because once you do list following
right, the head is no different from any other pointer in the list.

So the patch stats look good:

 net/netfilter/core.c | 108 ++++++++++++++++-----------------------------------
 1 file changed, 33 insertions(+), 75 deletions(-)

but again, it's entirely *entirely* untested. Please consider this
just a "this is generally how list insert/delete operations should be
done, avoiding special cases for the first entry".

ALSO NOTE! The code assumes that the "nf_hook_mutex" locking only
protects the actual *lists*, and that the address to the list can be
looked up without holding the lock. That's generally how things are
done, and it simplifies error handling (because you can do the "there
is no such list at all" test before you do anything else). But again, I
don't actually know the code, and if there is something that actually
expands the number of lists etc that depends on that mutex, then the
list head lookup may need to be inside the lock too.

                Linus

 net/netfilter/core.c | 108 ++++++++++++++++-----------------------------------
 1 file changed, 33 insertions(+), 75 deletions(-)

Reviewed-by: Aaron Conole

diff --git a/net/netfilter/core.c b/net/netfilter/core.c
index c9d90eb64046..814258641fcc 100644
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@ -65,49 +65,24 @@ static DEFINE_MUTEX(nf_hook_mutex);
 #define nf_entry_dereference(e) \
 	rcu_dereference_protected(e, lockdep_is_held(&nf_hook_mutex))
 
-static struct nf_hook_entry *nf_hook_entry_head(struct net *net,
-						const struct nf_hook_ops *reg)
+static struct nf_hook_entry __rcu **nf_hook_entry_head(struct net *net, const struct nf_hook_ops *reg)
 {
-	struct nf_hook_entry *hook_head = NULL;
-
 	if (reg->pf != NFPROTO_NETDEV)
-		hook_head = nf_entry_dereference(net->nf.hooks[reg->pf]
-						 [reg->hooknum]);
-	else if (reg->hooknum == NF_NETDEV_INGRESS) {
+		return net->nf.hooks[reg->pf]+reg->hooknum;
+
 #ifdef CONFIG_NETFILTER_INGRESS
+	if (reg->hooknum == NF_NETDEV_INGRESS) {
 		if (reg->dev && dev_net(reg->dev) == net)
-			hook_head =
-				nf_entry_dereference(
-					reg->dev->nf_hooks_ingress);
-#endif
+			return &reg->dev->nf_hooks_ingress;
 	}
-	return hook_head;
-}
-
-/* must hold nf_hook_mutex */
-static void nf_set_hooks_head(struct net *net, const struct nf_hook_ops *reg,
-			      struct nf_hook_entry *entry)
-{
-	switch (reg->pf) {
-	case NFPROTO_NETDEV:
-#ifdef CONFIG_NETFILTER_INGRESS
-		/* We already checked in nf_register_net_hook() that this is
-		 * used from ingress.
-		 */
-		rcu_assign_pointer(reg->dev->nf_hooks_ingress, entry);
 #endif
-		break;
-	default:
-		rcu_assign_pointer(net->nf.hooks[reg->pf][reg->hooknum],
-				   entry);
-		break;
-	}
+	return NULL;
 }
 
 int nf_register_net_hook(struct net *net, const struct nf_hook_ops *reg)
 {
-	struct nf_hook_entry *hooks_entry;
-	struct nf_hook_entry *entry;
+	struct nf_hook_entry __rcu **pp;
+	struct nf_hook_entry *entry, *p;
 
 	if (reg->pf == NFPROTO_NETDEV) {
 #ifndef CONFIG_NETFILTER_INGRESS
@@ -119,6 +94,10 @@ int nf_register_net_hook(struct net *net, const struct nf_hook_ops *reg)
 			return -EINVAL;
 	}
 
+	pp = nf_hook_entry_head(net, reg);
+	if (!pp)
+		return -EINVAL;
+
 	entry = kmalloc(sizeof(*entry), GFP_KERNEL);
 	if (!entry)
 		return -ENOMEM;
@@ -128,26 +107,15 @@ int nf_register_net_hook(struct net *net, const struct nf_hook_ops *reg)
 	entry->next	= NULL;
 
 	mutex_lock(&nf_hook_mutex);
-	hooks_entry = nf_hook_entry_head(net, reg);
-
-	if (hooks_entry && hooks_entry->orig_ops->priority > reg->priority) {
-		/* This is the case where we need to insert at the head */
-		entry->next = hooks_entry;
-		hooks_entry = NULL;
-	}
-
-	while (hooks_entry &&
-		reg->priority >= hooks_entry->orig_ops->priority &&
-		nf_entry_dereference(hooks_entry->next)) {
-		hooks_entry = nf_entry_dereference(hooks_entry->next);
-	}
 
-	if (hooks_entry) {
-		entry->next = nf_entry_dereference(hooks_entry->next);
-		rcu_assign_pointer(hooks_entry->next, entry);
-	} else {
-		nf_set_hooks_head(net, reg, entry);
+	/* Find the spot in the list */
+	while ((p = nf_entry_dereference(*pp)) != NULL) {
+		if (reg->priority < p->orig_ops->priority)
+			break;
+		pp = &p->next;
 	}
+	rcu_assign_pointer(entry->next, p);
+	rcu_assign_pointer(*pp, entry);
 
 	mutex_unlock(&nf_hook_mutex);
 #ifdef CONFIG_NETFILTER_INGRESS
@@ -163,33 +131,23 @@ EXPORT_SYMBOL(nf_register_net_hook);
 
 void nf_unregister_net_hook(struct net *net, const struct nf_hook_ops *reg)
 {
-	struct nf_hook_entry *hooks_entry;
+	struct nf_hook_entry __rcu **pp;
+	struct nf_hook_entry *p;
 
-	mutex_lock(&nf_hook_mutex);
-	hooks_entry = nf_hook_entry_head(net, reg);
-	if (hooks_entry && hooks_entry->orig_ops == reg) {
-		nf_set_hooks_head(net, reg,
-				  nf_entry_dereference(hooks_entry->next));
-		goto unlock;
-	}
-	while (hooks_entry && nf_entry_dereference(hooks_entry->next)) {
-		struct nf_hook_entry *next =
-			nf_entry_dereference(hooks_entry->next);
-		struct nf_hook_entry *nnext;
+	pp = nf_hook_entry_head(net, reg);
+	if (WARN_ON_ONCE(!pp))
+		return;
 
-		if (next->orig_ops != reg) {
-			hooks_entry = next;
-			continue;
+	mutex_lock(&nf_hook_mutex);
+	while ((p = nf_entry_dereference(*pp)) != NULL) {
+		if (p->orig_ops == reg) {
+			rcu_assign_pointer(*pp, p->next);
+			break;
 		}
-		nnext = nf_entry_dereference(next->next);
-		rcu_assign_pointer(hooks_entry->next, nnext);
-		hooks_entry = next;
-		break;
+		pp = &p->next;
 	}
-
-unlock:
 	mutex_unlock(&nf_hook_mutex);
-	if (!hooks_entry) {
+	if (!p) {
 		WARN(1, "nf_unregister_net_hook: hook not found!\n");
 		return;
 	}
@@ -201,10 +159,10 @@ void nf_unregister_net_hook(struct net *net, const struct nf_hook_ops *reg)
 	static_key_slow_dec(&nf_hooks_needed[reg->pf][reg->hooknum]);
 #endif
 	synchronize_net();
-	nf_queue_nf_hook_drop(net, hooks_entry);
+	nf_queue_nf_hook_drop(net, p);
 	/* other cpu might still process nfqueue verdict that used reg */
 	synchronize_net();
-	kfree(hooks_entry);
+	kfree(p);
 }
 EXPORT_SYMBOL(nf_unregister_net_hook);
 
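
For anyone who wants to try the pattern outside the kernel, here is a minimal userspace sketch of the pointer-to-pointer insert and delete described in the message above. It is an illustration under assumptions, not the netfilter code: "struct entry", list_insert() and list_remove() are hypothetical names, there is no RCU or locking, and the ordering rule (insert before the first entry with a strictly higher priority) is only assumed to mirror the priority test in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list node for illustration; not a kernel type. */
struct entry {
	int priority;
	struct entry *next;
};

/*
 * Insert 'e' before the first node whose priority is strictly higher,
 * so nodes with equal priority keep their insertion order.
 * 'pp' starts as the address of the head pointer and then walks the
 * 'next' fields, so the head needs no special case.
 */
static void list_insert(struct entry **pp, struct entry *e)
{
	struct entry *p;

	assert(pp != NULL && e != NULL);

	while ((p = *pp) != NULL) {
		if (e->priority < p->priority)
			break;
		pp = &p->next;
	}
	e->next = p;	/* new node points at its successor (or NULL) */
	*pp = e;	/* predecessor link (or head) points at new node */
}

/* Unlink 'target' if it is on the list; return it, or NULL if absent. */
static struct entry *list_remove(struct entry **pp, const struct entry *target)
{
	struct entry *p;

	while ((p = *pp) != NULL) {
		if (p == target) {
			*pp = p->next;
			break;
		}
		pp = &p->next;
	}
	return p;
}
```

A caller passes the address of its head pointer, for example list_insert(&head, &node), and the same loop covers an empty list, the first entry and every later position. A NULL return from list_remove() plays the role of the "hook not found" check in the patch.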