Message ID | DM2PR21MB0089CA7DCF4845DB02E0E05FCBC80@DM2PR21MB0089.namprd21.prod.outlook.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Fri, Sep 23, 2016 at 1:16 PM, Matthew Wilcox <mawilcox@microsoft.com> wrote: > > #ifdef CONFIG_RADIX_TREE_MULTIORDER > if (radix_tree_is_internal_node(entry)) { > - unsigned long siboff = get_slot_offset(parent, entry); > + unsigned long siboff = get_slot_offset(parent, > + (void **)entry_to_node(entry)); I feel that it is *this* part that I think needs a huge honking comment. If you are going to make get_slot_offset() different, then you could just rewrite get_slot_offset() to do unsigned long diff = (unsigned long) slot - (unsigned long)parent->slots; return diff / sizeof(void *); and add a comment to say "don't do this as a pointer diff, because 'slot' may not be an aligned pointer". No BUG_ON() necessary, because it "just works". At that point, gcc should just generate the right code, because it doesn't see it as a pointer subtraction followed by a pointer addition. And yes, that crazy " (void **)entry_to_node(entry)" fixes it *too*, but it needs a *comment*. Why is that special, when all the other uses of get_slot_offset() don't have that? *That* is what should be explained. Not some internal detail. That said, if this code isn't even used, as Konstantin says (THP selects it - doesn't THP use it?), then the fix really should be to just remove the odd code instead of adding to it. Looking around for uses that set "order" to anything but zero, I really don't see it. So maybe we should just do *that* trivial thing instead, and remove CONFIG_RADIX_TREE_MULTIORDER, since it's appears to be buggy and always has been. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sat, Sep 24, 2016 at 01:21:36PM -0700, Linus Torvalds wrote: > On Fri, Sep 23, 2016 at 1:16 PM, Matthew Wilcox <mawilcox@microsoft.com> wrote: > > > > #ifdef CONFIG_RADIX_TREE_MULTIORDER > > if (radix_tree_is_internal_node(entry)) { > > - unsigned long siboff = get_slot_offset(parent, entry); > > + unsigned long siboff = get_slot_offset(parent, > > + (void **)entry_to_node(entry)); > > I feel that it is *this* part that I think needs a huge honking comment. > > If you are going to make get_slot_offset() different, then you could > just rewrite get_slot_offset() to do > > unsigned long diff = (unsigned long) slot - (unsigned > long)parent->slots; > return diff / sizeof(void *); > > and add a comment to say "don't do this as a pointer diff, because > 'slot' may not be an aligned pointer". No BUG_ON() necessary, because > it "just works". > > At that point, gcc should just generate the right code, because it > doesn't see it as a pointer subtraction followed by a pointer > addition. > > And yes, that crazy " (void **)entry_to_node(entry)" fixes it *too*, > but it needs a *comment*. > > Why is that special, when all the other uses of get_slot_offset() > don't have that? *That* is what should be explained. Not some internal > detail. > > That said, if this code isn't even used, as Konstantin says (THP > selects it - doesn't THP use it?), then the fix really should be to > just remove the odd code instead of adding to it. > > Looking around for uses that set "order" to anything but zero, I > really don't see it. So maybe we should just do *that* trivial thing > instead, and remove CONFIG_RADIX_TREE_MULTIORDER, since it's appears > to be buggy and always has been. Well, my ext4-with-huge-pages patchset[1] uses multi-order entries. It also converts shmem-with-huge-pages and hugetlb to them. I'm okay with converting it to other mechanism, but I need something. (I looked into Konstantin's RFC patchset[2]. It looks okay, but I don't feel myself qualified to review it as I don't know much about radix-tree internals.) [1] http://lkml.kernel.org/r/20160915115523.29737-1-kirill.shutemov@linux.intel.com [2] http://lkml.kernel.org/r/147230727479.9957.1087787722571077339.stgit@zurg
On Sat, Sep 24, 2016 at 2:04 PM, Kirill A. Shutemov <kirill.shutemov@linux.intel.com> wrote: > > Well, my ext4-with-huge-pages patchset[1] uses multi-order entries. > It also converts shmem-with-huge-pages and hugetlb to them. Ok, so that code actually has a chance of being used. I guess we'll not remove it. But I *would* like this subtle issue to have a comment around that odd cast/and/mask thing. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/lib/radix-tree.c b/lib/radix-tree.c index 1b7bf73..368f641 100644 --- a/lib/radix-tree.c +++ b/lib/radix-tree.c @@ -91,9 +91,15 @@ static inline bool is_sibling_entry(struct radix_tree_node *parent, void *node) } #endif +/* + * The slot pointer must be a real pointer as GCC will optimise + * through inlined functions and may deduce that + * parent->slots + get_slot_offset(parent, slot) == slot + */ static inline unsigned long get_slot_offset(struct radix_tree_node *parent, void **slot) { + BUG_ON(radix_tree_exception(slot)); return slot - parent->slots; } @@ -101,11 +107,12 @@ static unsigned int radix_tree_descend(struct radix_tree_node *parent, struct radix_tree_node **nodep, unsigned long index) { unsigned int offset = (index >> parent->shift) & RADIX_TREE_MAP_MASK; - void **entry = rcu_dereference_raw(parent->slots[offset]); + void *entry = rcu_dereference_raw(parent->slots[offset]); #ifdef CONFIG_RADIX_TREE_MULTIORDER if (radix_tree_is_internal_node(entry)) { - unsigned long siboff = get_slot_offset(parent, entry); + unsigned long siboff = get_slot_offset(parent, + (void **)entry_to_node(entry)); if (siboff < RADIX_TREE_MAP_SIZE) { offset = siboff; entry = rcu_dereference_raw(parent->slots[offset]); @@ -113,7 +120,7 @@ static unsigned int radix_tree_descend(struct radix_tree_node *parent, } #endif - *nodep = (void *)entry; + *nodep = entry; return offset; } diff --git a/tools/testing/radix-tree/Makefile b/tools/testing/radix-tree/Makefile index 3b53046..9d0919ed 100644 --- a/tools/testing/radix-tree/Makefile +++ b/tools/testing/radix-tree/Makefile @@ -1,5 +1,5 @@ -CFLAGS += -I. -g -Wall -D_LGPL_SOURCE +CFLAGS += -I. -g -O2 -Wall -D_LGPL_SOURCE LDFLAGS += -lpthread -lurcu TARGETS = main OFILES = main.o radix-tree.o linux.o test.o tag_check.o find_next_bit.o \