Message ID | CA+55aFzcweyWCqq-fRAnJkFB7neKwKsXH=TMcLtRQRs2Wn6XEQ@mail.gmail.com (mailing list archive) |
---|---|
State | New, archived |
On Mon, Oct 10, 2016 at 10:39 PM, Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
> I guess I will have to double-check that the slub corruption is gone
> still with that fixed.

So I'm not getting any warnings now from SLUB debugging. So the
original bug seems to not have re-surfaced, and the registration bug
is gone, so now the unregistration doesn't warn about anything either.

But I only rebooted three times.

Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Mon, 10 Oct 2016 22:47:50 -0700

> On Mon, Oct 10, 2016 at 10:39 PM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>>
>> I guess I will have to double-check that the slub corruption is gone
>> still with that fixed.
>
> So I'm not getting any warnings now from SLUB debugging. So the
> original bug seems to not have re-surfaced, and the registration bug
> is gone, so now the unregistration doesn't warn about anything either.
>
> But I only rebooted three times.

Looks good to me, I applied it to my tree with your signoff and will
send you a pull request right now.

Thanks!
On 2016.10.11 at 04:57 -0400, David Miller wrote:
> From: Linus Torvalds <torvalds@linux-foundation.org>
> Date: Mon, 10 Oct 2016 22:47:50 -0700
>
> > On Mon, Oct 10, 2016 at 10:39 PM, Linus Torvalds
> > <torvalds@linux-foundation.org> wrote:
> >>
> >> I guess I will have to double-check that the slub corruption is gone
> >> still with that fixed.
> >
> > So I'm not getting any warnings now from SLUB debugging. So the
> > original bug seems to not have re-surfaced, and the registration bug
> > is gone, so now the unregistration doesn't warn about anything either.
> >
> > But I only rebooted three times.
>
> Looks good to me, I applied it to my tree with your signoff and will
> send you a pull request right now.

I'm still seeing:

nf_conntrack version 0.5.0 (4096 buckets, 16384 max)
ctnetlink v0.93: registering with nfnetlink.
ip_tables: (C) 2000-2006 Netfilter Core Team
WARNING: kmemcheck: Caught 64-bit read from uninitialized memory (ffff88001e605480)
4055601e0088ffff000000000000000090686d81ffffffff0000000000000000
 u u u u u u u u u u u u u u u u i i i i i i i i u u u u u u u u
 ^
RIP: 0010:[<ffffffff8166e561>]  [<ffffffff8166e561>] nf_register_net_hook+0x51/0x160
RSP: 0018:ffffc9000000bcc0  EFLAGS: 00010286
RAX: ffff88001e5af9c0 RBX: ffff88001e605480 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff88001e5b0a20
RBP: ffffc9000000bcd8 R08: 000000001fd0e000 R09: 0000000000000000
R10: ffff88001e5b09c0 R11: 0000000000000067 R12: ffff88001e5af9c0
R13: ffffffff81c5c0c8 R14: 0000000000000003 R15: ffff88001e605480
FS:  0000000000000000(0000) GS:ffff88001fa00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffff88001f45ca18 CR3: 0000000001c07000 CR4: 00000000000006f0
 [<ffffffff8166e561>] nf_register_net_hook+0x51/0x160
 [<ffffffff8166eaaf>] nf_register_net_hooks+0x3f/0xa0
 [<ffffffff816d6715>] ipt_register_table+0xe5/0x110
 [<ffffffff816d6815>] iptable_filter_table_init.part.1+0x55/0x80
 [<ffffffff816d688b>] iptable_filter_net_init+0x2b/0x30
 [<ffffffff8163edd7>] ops_init+0x47/0x150
 [<ffffffff8163f0c6>] register_pernet_operations+0xd6/0x170
 [<ffffffff8163fb77>] register_pernet_subsys+0x27/0x40
 [<ffffffff81cb9de3>] iptable_filter_init+0x33/0x4b
 [<ffffffff81c8bef0>] do_one_initcall+0x8b/0x113
 [<ffffffff81c8c091>] kernel_init_freeable+0x119/0x1a1
 [<ffffffff816efd09>] kernel_init+0x9/0x100
 [<ffffffff816f4e12>] ret_from_fork+0x22/0x30
 [<ffffffffffffffff>] 0xffffffffffffffff
NET: Registered protocol family 17
9pnet: Installing 9P2000 support
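For anyone decoding the kmemcheck dump above by hand: the hex line holds 32 bytes, and the shadow line underneath marks each byte as u (uninitialized) or i (initialized). The following is a minimal userspace sketch, assuming x86-64 little-endian byte order and assuming the dump starts at the reported address. The helper names dump_word() and print_dump() are made up for illustration and are not kernel functions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Convert one hex digit to its value; returns -1 for a non-hex character. */
static int hex_val(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Decode the 64-bit little-endian word that starts at byte offset `off`
 * of a hex dump string. Returns 0 on success, -1 if the dump is too
 * short or contains a non-hex character.
 */
int dump_word(const char *hex, size_t off, uint64_t *out)
{
	uint64_t v = 0;
	int i;

	assert(hex != NULL && out != NULL);
	if (strlen(hex) < 2 * (off + 8))
		return -1;

	/* Byte 7 is the most significant one, so walk from the top down. */
	for (i = 7; i >= 0; i--) {
		int hi = hex_val(hex[2 * (off + (size_t)i)]);
		int lo = hex_val(hex[2 * (off + (size_t)i) + 1]);

		if (hi < 0 || lo < 0)
			return -1;
		v = (v << 8) | (uint64_t)(hi * 16 + lo);
	}
	*out = v;
	return 0;
}

/* Print every complete 8-byte word of the dump next to its offset. */
void print_dump(const char *hex)
{
	uint64_t w;
	size_t off;

	for (off = 0; dump_word(hex, off, &w) == 0; off += 8)
		printf("0x%02zx: 0x%016llx\n", off, (unsigned long long)w);
}
```

Under those assumptions, the only word the shadow line marks as initialized is the one at offset 0x10, which decodes to 0xffffffff816d6890. The words at offsets 0, 8 and 0x18 are the ones flagged as uninitialized.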
On 2016.10.13 at 08:02 +0200, Markus Trippelsdorf wrote:
> On 2016.10.11 at 04:57 -0400, David Miller wrote:
> > From: Linus Torvalds <torvalds@linux-foundation.org>
> > Date: Mon, 10 Oct 2016 22:47:50 -0700
> >
> > > On Mon, Oct 10, 2016 at 10:39 PM, Linus Torvalds
> > > <torvalds@linux-foundation.org> wrote:
> > >>
> > >> I guess I will have to double-check that the slub corruption is gone
> > >> still with that fixed.
> > >
> > > So I'm not getting any warnings now from SLUB debugging. So the
> > > original bug seems to not have re-surfaced, and the registration bug
> > > is gone, so now the unregistration doesn't warn about anything either.
> > >
> > > But I only rebooted three times.
> >
> > Looks good to me, I applied it to my tree with your signoff and will
> > send you a pull request right now.
>
> I'm still seeing:
>
> nf_conntrack version 0.5.0 (4096 buckets, 16384 max)
> ctnetlink v0.93: registering with nfnetlink.
> ip_tables: (C) 2000-2006 Netfilter Core Team
> WARNING: kmemcheck: Caught 64-bit read from uninitialized memory (ffff88001e605480)
> 4055601e0088ffff000000000000000090686d81ffffffff0000000000000000
>  u u u u u u u u u u u u u u u u i i i i i i i i u u u u u u u u
>  ^
> RIP: 0010:[<ffffffff8166e561>]  [<ffffffff8166e561>] nf_register_net_hook+0x51/0x160

This is nf_register_net_hook at net/netfilter/core.c:106
On 2016.10.12 at 23:18 -0700, Linus Torvalds wrote:
> On Oct 12, 2016 23:07, "Markus Trippelsdorf" <markus@trippelsdorf.de> wrote:
> >
> > This is nf_register_net_hook at net/netfilter/core.c:106
>
> The "*reg" access?

Yeah.

105         entry->orig_ops = reg;
106         entry->ops      = *reg;
107         entry->next     = NULL;

--
Markus
On Wed, Oct 12, 2016 at 11:27 PM, Markus Trippelsdorf <markus@trippelsdorf.de> wrote:
>
> Yeah.
>
> 105         entry->orig_ops = reg;
> 106         entry->ops      = *reg;
> 107         entry->next     = NULL;

So ipt_register_table() does:

    ret = nf_register_net_hooks(net, ops, hweight32(table->valid_hooks));

and then nf_register_net_hooks() just does

    for (i = 0; i < n; i++) {
        err = nf_register_net_hook(net, &reg[i]);

so if the *reg is uninitialized, it means that it's the 'ops[]' array
that isn't actually really valid in "valid_hooks". Odd. They should
all be initialized by xt_hook_ops_alloc(), no?

That said, xt_hook_ops_alloc() itself is odd. Lookie here, this is the
loop that initializes things:

    for (i = 0, hooknum = 0; i < num_hooks && hook_mask != 0;
         hook_mask >>= 1, ++hooknum) {

and it makes no sense to me how that tests *both* "i < num_hooks" and
"hook_mask != 0".

Why? Because

    num_hooks = hweight32(hook_mask);

so it's entirely redundant. num_hooks is already how many bits are on
in hook_mask, so that test is just duplicating the same thing twice
("have we done less than that number of bits" and "do we have any bits
less").

I don't know. There's something odd going on. Regardless, this is a
different problem from the nf_register_net_hook() list handling, so
I'll leave it to the networking people. David?

Linus
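The redundancy argument above can be checked in userspace. This is a minimal sketch, not the kernel code: popcount32() stands in for hweight32(), collect_hooks() is a hypothetical name, and the loop body (skip clear bits, record the bit index, bump i) is an assumption, since the message only quotes the loop header.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_HOOKS 32

/* Userspace stand-in for the kernel's hweight32(): count the set bits. */
unsigned int popcount32(uint32_t w)
{
	unsigned int n = 0;

	for (; w; w &= w - 1)	/* each step clears the lowest set bit */
		n++;
	return n;
}

/*
 * Walk hook_mask with the quoted loop header and record the bit index of
 * every set bit. When check_count is zero the "i < num_hooks" test is
 * dropped and only "hook_mask != 0" remains.
 * Returns the number of entries written to hooknums[].
 */
unsigned int collect_hooks(uint32_t hook_mask, int check_count,
			   unsigned int hooknums[MAX_HOOKS])
{
	unsigned int num_hooks = popcount32(hook_mask);
	unsigned int i, hooknum;

	assert(hooknums != 0);
	for (i = 0, hooknum = 0;
	     (!check_count || i < num_hooks) && hook_mask != 0;
	     hook_mask >>= 1, ++hooknum) {
		if (!(hook_mask & 1))
			continue;		/* assumed body: skip clear bits */
		hooknums[i++] = hooknum;	/* assumed body: one slot per set bit */
	}
	return i;
}
```

With either setting of check_count the function visits the same bits and stops at the same point, because i reaches num_hooks on exactly the iteration whose shift empties the mask. For the filter table's mask of bits 1, 2 and 3, both variants yield three entries.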
On Thu, Oct 13, 2016 at 12:49:33PM -0700, Linus Torvalds wrote:

> That said, xt_hook_ops_alloc() itself is odd. Lookie here, this is the
> loop that initializes things:
>
>         for (i = 0, hooknum = 0; i < num_hooks && hook_mask != 0;
>              hook_mask >>= 1, ++hooknum) {
>
> and it makes no sense to me how that tests *both* "i < num_hooks" and
> "hook_mask != 0".
>
> Why? Because
>
>         num_hooks = hweight32(hook_mask);
>
> so it's entirely redundant. num_hooks is already how many bits are on
> in hook_mask, so that test is just duplicating the same thing twice
> ("have we done less than that number of bits" and "do we have any bits
> less").
>
> I don't know. There's something odd going on. Regardless, this is a
> different problem from the nf_register_net_hook() list handling, so
> I'll leave it to the networking people. David?

Hey, I remember looking through that stuff. <checks> There it is, in
a thread started by Krause Randomness(tm)...

Short version: nf_hook_ops is a mess - it's embedded into different
objects, with different subsets of fields used depending on the containing
object and I would seriously suggest moving some of those into those
containing objects.

--------------------------------------------------------------------------
On Thu, Sep 01, 2016 at 08:10:44AM -0500, Eric Sandeen wrote:
> On 8/4/16 8:57 AM, Al Viro wrote:
>
> > Don't feed the troll.  On all paths leading to that place we have
> >         result->name = kname;
> >         len = strncpy_from_user(kname, filename, EMBEDDED_NAME_MAX);
> > or
> >         result->name = kname;
> >         len = strncpy_from_user(kname, filename, PATH_MAX);
> > with failure exits taken if strncpy_from_user() returns an error, which means
> > that the damn thing has already been copied into.
> >
> > FWIW, it looks a lot like buggered kmemcheck; as usual, he can't be bothered
> > to mention which kernel version would it be (let alone how to reproduce it
> > on the kernel in question), but IIRC davej had run into some instrumentation
> > breakage lately.
>
> The original report is in https://bugzilla.kernel.org/show_bug.cgi?id=120651
> if anyone is interested in it.

What the hell does that one have to do with getname_flags(), other than
having attracted the same... something on the edge of failing the Turing Test?

FWIW, looking at the netfilter one...  That's nf_register_net_hook() hitting
        entry->ops = *reg;
with reg pointing to something uninitialized (according to kmemcheck, that is,
and presuming that it's not an instrumentation bug).

With the callchain in report, it came (all in the same assumptions) from
        nf_register_net_hooks(net, ops, hweight32(table->valid_hooks))
with hweight32(table->valid_hooks) being greater than the amount of initialized
entries in ops[] (call site in ipt_register_table()).  This "ops" ought to be
net/ipv4/netfilter/iptable_filter.c:filter_ops, allocated by
        filter_ops = xt_hook_ops_alloc(&packet_filter, iptable_filter_hook);
in iptable_filter_init().  "table" is &packet_filter and its contents ought
to be unchanged, so ->valid_hooks in there is FILTER_VALID_HOOKS, i.e.
((1 << NF_INET_LOCAL_IN) | (1 << NF_INET_FORWARD) | (1 << NF_INET_LOCAL_OUT)).
Which is to say, filter_ops[] had fewer than 3 initialized elements when it
got to the call of iptable_filter_table_init()...

Since filter_ops hadn't been NULL, the xt_hook_ops_alloc() call above must've
already been done.  Said xt_hook_ops_alloc() should've allocated a 3-element
array and hooked through all of it, so it's not a wholesale uninitialized
element, it's uninitialized parts of one...  What gets initialized is
->hook, ->pf, ->hooknum and ->priority.  Let's figure out the offsets:
        0:    list (two pointers, i.e. 16 bytes)
        0x10: hook (8)
        0x18: dev (8)
        0x20: priv (8)
        0x28: pf (1)
        0x29: padding (3)
        0x2c: hooknum (4)
        0x30: priority (4)
        0x34: padding (4)

OK...  The address of the damn thing is apparently ffff880037b4bd80 and
we see complaint about the accesses at offsets 0, 0x18, 8, 0x20 and then the
same pattern with 0x38 and 0x70 added (i.e. the same fields in the next two
elements of the same array).  Then there are similar complaints, but with
a different call chain (iptable_mangle instead of iptable_filter).

These offsets are ->list, ->dev and ->priv, and those are exactly the ones
not initialized by xt_hook_ops_alloc().  Looking at the nf_register_net_hook(),
we have
        list_add_rcu(&entry->ops.list, elem->list.prev);
a bit further down the road.  ->dev and ->priv are left uninitialized (and
very likely - unused).

I would say it's a false positive.  struct nf_hook_ops is embedded into a
bunch of different objects, with different subsets of fields getting used.
IMO it's a bad idea (in particular, I really wonder if ->list would've been
better off moved into (some of) the containing suckers), but it's not a bug
per se, just a design choice asking for trouble.

One way of getting kmemcheck off your back would be to switch
xt_hook_ops_alloc() from
        ops = kmalloc(sizeof(*ops) * num_hooks, GFP_KERNEL);
to
        ops = kcalloc(num_hooks, sizeof(*ops), GFP_KERNEL);
which might have some merits beyond making kmemcheck STFU...
--------------------------------------------------------------------------
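The offsets and the kcalloc() suggestion above can be tried out in userspace. This is a rough sketch under stated assumptions: struct hook_ops_mock is a hypothetical mirror of the field order given in the message (not the kernel's struct nf_hook_ops), an LP64 ABI such as x86-64 is assumed for the offsets, calloc() plays the role of kcalloc(), and hooknum is simplified to the array index rather than derived from a hook mask.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins; only field order and sizes follow the thread. */
struct list_mock {
	struct list_mock *next, *prev;
};

struct hook_ops_mock {
	struct list_mock list;		/* 0x00, not set by the allocator */
	unsigned int (*hook)(void);	/* 0x10 */
	void *dev;			/* 0x18, not set by the allocator */
	void *priv;			/* 0x20, not set by the allocator */
	uint8_t pf;			/* 0x28, followed by 3 bytes of padding */
	unsigned int hooknum;		/* 0x2c */
	int priority;			/* 0x30, followed by tail padding */
};

/* Returns 1 if this ABI reproduces the offsets listed in the message. */
int layout_matches_thread(void)
{
	return offsetof(struct hook_ops_mock, list) == 0x00 &&
	       offsetof(struct hook_ops_mock, hook) == 0x10 &&
	       offsetof(struct hook_ops_mock, dev) == 0x18 &&
	       offsetof(struct hook_ops_mock, priv) == 0x20 &&
	       offsetof(struct hook_ops_mock, pf) == 0x28 &&
	       offsetof(struct hook_ops_mock, hooknum) == 0x2c &&
	       offsetof(struct hook_ops_mock, priority) == 0x30 &&
	       sizeof(struct hook_ops_mock) == 0x38;
}

/*
 * Allocate n zeroed elements, then fill only the four fields the message
 * says get initialized. Because of calloc(), list, dev and priv start out
 * as zero bytes instead of leftover heap contents.
 */
struct hook_ops_mock *ops_alloc_zeroed(size_t n, unsigned int (*fn)(void),
				       uint8_t pf, int priority)
{
	struct hook_ops_mock *ops = calloc(n, sizeof(*ops));
	size_t i;

	if (!ops)
		return NULL;
	for (i = 0; i < n; i++) {
		ops[i].hook = fn;
		ops[i].pf = pf;
		ops[i].hooknum = (unsigned int)i;	/* simplification */
		ops[i].priority = priority;
	}
	return ops;
}

/* Whole-struct copy, like "entry->ops = *reg": it reads every field. */
struct hook_ops_mock ops_copy(const struct hook_ops_mock *reg)
{
	assert(reg != NULL);
	return *reg;
}
```

The struct copy is why a checker sees reads of list, dev and priv even though nothing uses their values at that point. With a zeroing allocator those reads hit defined memory, which is the effect the kcalloc() switch is after.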
diff --git a/net/netfilter/core.c b/net/netfilter/core.c
index c9d90eb64046..fcb5d1df11e9 100644
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@ -65,49 +65,24 @@ static DEFINE_MUTEX(nf_hook_mutex);
 #define nf_entry_dereference(e) \
 	rcu_dereference_protected(e, lockdep_is_held(&nf_hook_mutex))
 
-static struct nf_hook_entry *nf_hook_entry_head(struct net *net,
-						const struct nf_hook_ops *reg)
+static struct nf_hook_entry __rcu **nf_hook_entry_head(struct net *net, const struct nf_hook_ops *reg)
 {
-	struct nf_hook_entry *hook_head = NULL;
-
 	if (reg->pf != NFPROTO_NETDEV)
-		hook_head = nf_entry_dereference(net->nf.hooks[reg->pf]
-						 [reg->hooknum]);
-	else if (reg->hooknum == NF_NETDEV_INGRESS) {
+		return net->nf.hooks[reg->pf]+reg->hooknum;
+
 #ifdef CONFIG_NETFILTER_INGRESS
+	if (reg->hooknum == NF_NETDEV_INGRESS) {
 		if (reg->dev && dev_net(reg->dev) == net)
-			hook_head =
-				nf_entry_dereference(
-					reg->dev->nf_hooks_ingress);
-#endif
+			return &reg->dev->nf_hooks_ingress;
 	}
-	return hook_head;
-}
-
-/* must hold nf_hook_mutex */
-static void nf_set_hooks_head(struct net *net, const struct nf_hook_ops *reg,
-			      struct nf_hook_entry *entry)
-{
-	switch (reg->pf) {
-	case NFPROTO_NETDEV:
-#ifdef CONFIG_NETFILTER_INGRESS
-		/* We already checked in nf_register_net_hook() that this is
-		 * used from ingress.
-		 */
-		rcu_assign_pointer(reg->dev->nf_hooks_ingress, entry);
 #endif
-		break;
-	default:
-		rcu_assign_pointer(net->nf.hooks[reg->pf][reg->hooknum],
-				   entry);
-		break;
-	}
+	return NULL;
 }
 
 int nf_register_net_hook(struct net *net, const struct nf_hook_ops *reg)
 {
-	struct nf_hook_entry *hooks_entry;
-	struct nf_hook_entry *entry;
+	struct nf_hook_entry __rcu **pp;
+	struct nf_hook_entry *entry, *p;
 
 	if (reg->pf == NFPROTO_NETDEV) {
 #ifndef CONFIG_NETFILTER_INGRESS
@@ -119,6 +94,10 @@ int nf_register_net_hook(struct net *net, const struct nf_hook_ops *reg)
 			return -EINVAL;
 	}
 
+	pp = nf_hook_entry_head(net, reg);
+	if (!pp)
+		return -EINVAL;
+
 	entry = kmalloc(sizeof(*entry), GFP_KERNEL);
 	if (!entry)
 		return -ENOMEM;
@@ -128,26 +107,15 @@ int nf_register_net_hook(struct net *net, const struct nf_hook_ops *reg)
 	entry->next	= NULL;
 
 	mutex_lock(&nf_hook_mutex);
-	hooks_entry = nf_hook_entry_head(net, reg);
-
-	if (hooks_entry && hooks_entry->orig_ops->priority > reg->priority) {
-		/* This is the case where we need to insert at the head */
-		entry->next = hooks_entry;
-		hooks_entry = NULL;
-	}
-
-	while (hooks_entry &&
-		reg->priority >= hooks_entry->orig_ops->priority &&
-		nf_entry_dereference(hooks_entry->next)) {
-		hooks_entry = nf_entry_dereference(hooks_entry->next);
-	}
 
-	if (hooks_entry) {
-		entry->next = nf_entry_dereference(hooks_entry->next);
-		rcu_assign_pointer(hooks_entry->next, entry);
-	} else {
-		nf_set_hooks_head(net, reg, entry);
+	/* Find the spot in the list */
+	while ((p = nf_entry_dereference(*pp)) != NULL) {
+		if (reg->priority < p->orig_ops->priority)
+			break;
+		pp = &p->next;
 	}
+	rcu_assign_pointer(entry->next, p);
+	rcu_assign_pointer(*pp, entry);
 
 	mutex_unlock(&nf_hook_mutex);
 #ifdef CONFIG_NETFILTER_INGRESS
@@ -163,33 +131,23 @@ EXPORT_SYMBOL(nf_register_net_hook);
 
 void nf_unregister_net_hook(struct net *net, const struct nf_hook_ops *reg)
 {
-	struct nf_hook_entry *hooks_entry;
+	struct nf_hook_entry __rcu **pp;
+	struct nf_hook_entry *p;
 
-	mutex_lock(&nf_hook_mutex);
-	hooks_entry = nf_hook_entry_head(net, reg);
-	if (hooks_entry && hooks_entry->orig_ops == reg) {
-		nf_set_hooks_head(net, reg,
-				  nf_entry_dereference(hooks_entry->next));
-		goto unlock;
-	}
-	while (hooks_entry && nf_entry_dereference(hooks_entry->next)) {
-		struct nf_hook_entry *next =
-			nf_entry_dereference(hooks_entry->next);
-		struct nf_hook_entry *nnext;
+	pp = nf_hook_entry_head(net, reg);
+	if (WARN_ON_ONCE(!pp))
+		return;
 
-		if (next->orig_ops != reg) {
-			hooks_entry = next;
-			continue;
+	mutex_lock(&nf_hook_mutex);
+	while ((p = nf_entry_dereference(*pp)) != NULL) {
+		if (p->orig_ops == reg) {
+			rcu_assign_pointer(*pp, p->next);
+			break;
 		}
-		nnext = nf_entry_dereference(next->next);
-		rcu_assign_pointer(hooks_entry->next, nnext);
-		hooks_entry = next;
-		break;
+		pp = &p->next;
 	}
-
-unlock:
 	mutex_unlock(&nf_hook_mutex);
-	if (!hooks_entry) {
+	if (!p) {
 		WARN(1, "nf_unregister_net_hook: hook not found!\n");
 		return;
 	}
@@ -201,10 +159,10 @@ void nf_unregister_net_hook(struct net *net, const struct nf_hook_ops *reg)
 	static_key_slow_dec(&nf_hooks_needed[reg->pf][reg->hooknum]);
 #endif
 	synchronize_net();
-	nf_queue_nf_hook_drop(net, hooks_entry);
+	nf_queue_nf_hook_drop(net, p);
 	/* other cpu might still process nfqueue verdict that used reg */
 	synchronize_net();
-	kfree(hooks_entry);
+	kfree(p);
 }
 EXPORT_SYMBOL(nf_unregister_net_hook);
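The patch replaces the special-cased head handling with a pointer-to-pointer walk, so inserting or unlinking at the head needs no separate branch. Here is a minimal userspace sketch of that idiom, assuming plain pointers with no RCU or locking. struct entry, entry_insert() and entry_remove() are illustrative names, not the kernel's.

```c
#include <assert.h>
#include <stddef.h>

struct entry {
	int priority;
	struct entry *next;
};

/*
 * Insert e in ascending priority order. pp always points at the link that
 * would have to change: first the list head, then some node's ->next.
 * Stopping only on a strictly greater priority keeps entries of equal
 * priority in registration order.
 */
void entry_insert(struct entry **pp, struct entry *e)
{
	struct entry *p;

	assert(pp != NULL && e != NULL);
	while ((p = *pp) != NULL) {
		if (e->priority < p->priority)
			break;
		pp = &p->next;
	}
	e->next = p;	/* link the new entry to its successor first */
	*pp = e;	/* then publish it through the predecessor's link */
}

/*
 * Unlink e from the list. Returns e if it was found and removed, or NULL
 * if it was not on the list.
 */
struct entry *entry_remove(struct entry **pp, const struct entry *e)
{
	struct entry *p;

	assert(pp != NULL);
	while ((p = *pp) != NULL) {
		if (p == e) {
			*pp = p->next;
			break;
		}
		pp = &p->next;
	}
	return p;
}
```

The kernel version additionally wraps the loads and stores in nf_entry_dereference() and rcu_assign_pointer() and holds nf_hook_mutex, none of which this sketch models. The return value of entry_remove() mirrors how the patched nf_unregister_net_hook() uses a NULL p to report that the hook was not found.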