Message ID | 1351450948-15618-9-git-send-email-levinsasha928@gmail.com (mailing list archive)
---|---
State | New, archived
* Sasha Levin (levinsasha928@gmail.com) wrote:
> Switch cache to use the new hashtable implementation. This reduces the amount of
> generic unrelated code in the cache implementation.
>
> Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
> ---
>  net/sunrpc/cache.c | 20 +++++++++-----------
>  1 file changed, 9 insertions(+), 11 deletions(-)
>
> diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
> index fc2f7aa..0490546 100644
> --- a/net/sunrpc/cache.c
> +++ b/net/sunrpc/cache.c
> @@ -28,6 +28,7 @@
>  #include <linux/workqueue.h>
>  #include <linux/mutex.h>
>  #include <linux/pagemap.h>
> +#include <linux/hashtable.h>
>  #include <asm/ioctls.h>
>  #include <linux/sunrpc/types.h>
>  #include <linux/sunrpc/cache.h>
> @@ -524,19 +525,18 @@ EXPORT_SYMBOL_GPL(cache_purge);
>   * it to be revisited when cache info is available
>   */
>
> -#define DFR_HASHSIZE (PAGE_SIZE/sizeof(struct list_head))
> -#define DFR_HASH(item) ((((long)item)>>4 ^ (((long)item)>>13)) % DFR_HASHSIZE)
> +#define DFR_HASH_BITS 9

If we look at a bit of history, mainly commit:

commit 1117449276bb909b029ed0b9ba13f53e4784db9d
Author: NeilBrown <neilb@suse.de>
Date:   Thu Aug 12 17:04:08 2010 +1000

    sunrpc/cache: change deferred-request hash table to use hlist.

we'll notice that the only reason why the prior DFR_HASHSIZE was using
(PAGE_SIZE/sizeof(struct list_head)) instead of
(PAGE_SIZE/sizeof(struct hlist_head)) is because it has been forgotten in
that commit. The intent there is to make the hash table array fit the page
size.

By defining DFR_HASH_BITS arbitrarily to "9", this indeed fulfills this
purpose on architectures with 4kB page size and 64-bit pointers, but not on
some powerpc configurations, and Tile architectures, which have more exotic
64kB page size, and of course on the far less exotic 32-bit pointer
architectures.

So defining e.g.:

#include <linux/log2.h>

#define DFR_HASH_BITS (PAGE_SHIFT - ilog2(BITS_PER_LONG))

would keep the intended behavior in all cases: use one page for the hash
array.

Thanks,

Mathieu
On Mon, Oct 29, 2012 at 5:42 AM, Mathieu Desnoyers
<mathieu.desnoyers@efficios.com> wrote:
>
> So defining e.g.:
>
> #include <linux/log2.h>
>
> #define DFR_HASH_BITS (PAGE_SHIFT - ilog2(BITS_PER_LONG))
>
> would keep the intended behavior in all cases: use one page for the hash
> array.

Well, since that wasn't true before either because of the long-time
bug you point out, clearly the page size isn't all that important. I
think it's more important to have small and simple code, and "9" is
certainly that, compared to playing ilog2 games with not-so-obvious
things.

Because there's no reason to believe that '9' is in any way a worse
random number than something page-shift-related, is there? And getting
away from *previous* overly-complicated size calculations that had
been broken because they were too complicated and random, sounds like
a good idea.

               Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* Linus Torvalds (torvalds@linux-foundation.org) wrote:
> On Mon, Oct 29, 2012 at 5:42 AM, Mathieu Desnoyers
> <mathieu.desnoyers@efficios.com> wrote:
> >
> > So defining e.g.:
> >
> > #include <linux/log2.h>
> >
> > #define DFR_HASH_BITS (PAGE_SHIFT - ilog2(BITS_PER_LONG))
> >
> > would keep the intended behavior in all cases: use one page for the hash
> > array.
>
> Well, since that wasn't true before either because of the long-time
> bug you point out, clearly the page size isn't all that important. I
> think it's more important to have small and simple code, and "9" is
> certainly that, compared to playing ilog2 games with not-so-obvious
> things.
>
> Because there's no reason to believe that '9' is in any way a worse
> random number than something page-shift-related, is there? And getting
> away from *previous* overly-complicated size calculations that had
> been broken because they were too complicated and random, sounds like
> a good idea.

Good point. I agree that unless we really care about the precise number
of TLB entries and cache lines used by this hash table, we might want to
stay away from page-size and pointer-size based calculation.

It might not hurt to explain this in the patch changelog though.

Thanks,

Mathieu
On Mon, Oct 29, 2012 at 11:13:43AM -0400, Mathieu Desnoyers wrote:
> * Linus Torvalds (torvalds@linux-foundation.org) wrote:
> > On Mon, Oct 29, 2012 at 5:42 AM, Mathieu Desnoyers
> > <mathieu.desnoyers@efficios.com> wrote:
> > >
> > > So defining e.g.:
> > >
> > > #include <linux/log2.h>
> > >
> > > #define DFR_HASH_BITS (PAGE_SHIFT - ilog2(BITS_PER_LONG))
> > >
> > > would keep the intended behavior in all cases: use one page for the hash
> > > array.
> >
> > Well, since that wasn't true before either because of the long-time
> > bug you point out, clearly the page size isn't all that important. I
> > think it's more important to have small and simple code, and "9" is
> > certainly that, compared to playing ilog2 games with not-so-obvious
> > things.
> >
> > Because there's no reason to believe that '9' is in any way a worse
> > random number than something page-shift-related, is there? And getting
> > away from *previous* overly-complicated size calculations that had
> > been broken because they were too complicated and random, sounds like
> > a good idea.
>
> Good point. I agree that unless we really care about the precise number
> of TLB entries and cache lines used by this hash table, we might want to
> stay away from page-size and pointer-size based calculation.
>
> It might not hurt to explain this in the patch changelog though.

I'd also be happy to take that as a separate patch now.

--b.
* J. Bruce Fields (bfields@fieldses.org) wrote:
> On Mon, Oct 29, 2012 at 11:13:43AM -0400, Mathieu Desnoyers wrote:
> > * Linus Torvalds (torvalds@linux-foundation.org) wrote:
> > > On Mon, Oct 29, 2012 at 5:42 AM, Mathieu Desnoyers
> > > <mathieu.desnoyers@efficios.com> wrote:
> > > >
> > > > So defining e.g.:
> > > >
> > > > #include <linux/log2.h>
> > > >
> > > > #define DFR_HASH_BITS (PAGE_SHIFT - ilog2(BITS_PER_LONG))
> > > >
> > > > would keep the intended behavior in all cases: use one page for the hash
> > > > array.
> > >
> > > Well, since that wasn't true before either because of the long-time
> > > bug you point out, clearly the page size isn't all that important. I
> > > think it's more important to have small and simple code, and "9" is
> > > certainly that, compared to playing ilog2 games with not-so-obvious
> > > things.
> > >
> > > Because there's no reason to believe that '9' is in any way a worse
> > > random number than something page-shift-related, is there? And getting
> > > away from *previous* overly-complicated size calculations that had
> > > been broken because they were too complicated and random, sounds like
> > > a good idea.
> >
> > Good point. I agree that unless we really care about the precise number
> > of TLB entries and cache lines used by this hash table, we might want to
> > stay away from page-size and pointer-size based calculation.
> >
> > It might not hurt to explain this in the patch changelog though.
>
> I'd also be happy to take that as a separate patch now.

FWIW, I've made a nice boo-boo above. It should have been:

#define DFR_HASH_BITS (PAGE_SHIFT - ilog2(sizeof(struct hlist_head)))

because we happen to have memory indexed in bytes, not in bits. I guess
this goes a long way proving Linus' point about the virtues of trivial
code. ;-)

Thanks,

Mathieu
On Mon, 29 Oct 2012 07:49:42 -0700 Linus Torvalds
<torvalds@linux-foundation.org> wrote:

> Because there's no reason to believe that '9' is in any way a worse
> random number than something page-shift-related, is there?

9 is much better than PAGE_SHIFT. PAGE_SIZE can vary by a factor of 16,
depending on config. Everyone thinks 4k, and tests only for that. There's
potential for very large performance and behavior changes when their code
gets run on a 64k PAGE_SIZE machine.
diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index fc2f7aa..0490546 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -28,6 +28,7 @@
 #include <linux/workqueue.h>
 #include <linux/mutex.h>
 #include <linux/pagemap.h>
+#include <linux/hashtable.h>
 #include <asm/ioctls.h>
 #include <linux/sunrpc/types.h>
 #include <linux/sunrpc/cache.h>
@@ -524,19 +525,18 @@ EXPORT_SYMBOL_GPL(cache_purge);
  * it to be revisited when cache info is available
  */
 
-#define DFR_HASHSIZE (PAGE_SIZE/sizeof(struct list_head))
-#define DFR_HASH(item) ((((long)item)>>4 ^ (((long)item)>>13)) % DFR_HASHSIZE)
+#define DFR_HASH_BITS 9
 #define DFR_MAX 300 /* ??? */
 
 static DEFINE_SPINLOCK(cache_defer_lock);
 static LIST_HEAD(cache_defer_list);
-static struct hlist_head cache_defer_hash[DFR_HASHSIZE];
+static DEFINE_HASHTABLE(cache_defer_hash, DFR_HASH_BITS);
 static int cache_defer_cnt;
 
 static void __unhash_deferred_req(struct cache_deferred_req *dreq)
 {
-	hlist_del_init(&dreq->hash);
+	hash_del(&dreq->hash);
 	if (!list_empty(&dreq->recent)) {
 		list_del_init(&dreq->recent);
 		cache_defer_cnt--;
@@ -545,10 +545,7 @@ static void __unhash_deferred_req(struct cache_deferred_req *dreq)
 static void __hash_deferred_req(struct cache_deferred_req *dreq,
 				struct cache_head *item)
 {
-	int hash = DFR_HASH(item);
-
-	INIT_LIST_HEAD(&dreq->recent);
-	hlist_add_head(&dreq->hash, &cache_defer_hash[hash]);
+	hash_add(cache_defer_hash, &dreq->hash, (unsigned long)item);
 }
 
 static void setup_deferral(struct cache_deferred_req *dreq,
@@ -600,7 +597,7 @@ static void cache_wait_req(struct cache_req *req, struct cache_head *item)
 	 * to clean up
 	 */
 	spin_lock(&cache_defer_lock);
-	if (!hlist_unhashed(&sleeper.handle.hash)) {
+	if (hash_hashed(&sleeper.handle.hash)) {
 		__unhash_deferred_req(&sleeper.handle);
 		spin_unlock(&cache_defer_lock);
 	} else {
@@ -671,12 +668,11 @@ static void cache_revisit_request(struct cache_head *item)
 	struct cache_deferred_req *dreq;
 	struct list_head pending;
 	struct hlist_node *lp, *tmp;
-	int hash = DFR_HASH(item);
 
 	INIT_LIST_HEAD(&pending);
 	spin_lock(&cache_defer_lock);
 
-	hlist_for_each_entry_safe(dreq, lp, tmp, &cache_defer_hash[hash], hash)
+	hash_for_each_possible_safe(cache_defer_hash, dreq, lp, tmp, hash, (unsigned long)item)
 		if (dreq->item == item) {
 			__unhash_deferred_req(dreq);
 			list_add(&dreq->recent, &pending);
@@ -1636,6 +1632,8 @@ static int create_cache_proc_entries(struct cache_detail *cd, struct net *net)
 void __init cache_initialize(void)
 {
 	INIT_DEFERRABLE_WORK(&cache_cleaner, do_cache_clean);
+
+	hash_init(cache_defer_hash);
 }
 
 int cache_register_net(struct cache_detail *cd, struct net *net)
Switch cache to use the new hashtable implementation. This reduces the amount of
generic unrelated code in the cache implementation.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
---
 net/sunrpc/cache.c | 20 +++++++++-----------
 1 file changed, 9 insertions(+), 11 deletions(-)