diff mbox

[v7,09/16] SUNRPC/cache: use new hashtable implementation

Message ID 1351450948-15618-9-git-send-email-levinsasha928@gmail.com (mailing list archive)
State New, archived
Headers show

Commit Message

Sasha Levin Oct. 28, 2012, 7:02 p.m. UTC
Switch cache to use the new hashtable implementation. This reduces the amount of
generic unrelated code in the cache implementation.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
---
 net/sunrpc/cache.c | 20 +++++++++-----------
 1 file changed, 9 insertions(+), 11 deletions(-)

Comments

Mathieu Desnoyers Oct. 29, 2012, 12:42 p.m. UTC | #1
* Sasha Levin (levinsasha928@gmail.com) wrote:
> Switch cache to use the new hashtable implementation. This reduces the amount of
> generic unrelated code in the cache implementation.
> 
> Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
> ---
>  net/sunrpc/cache.c | 20 +++++++++-----------
>  1 file changed, 9 insertions(+), 11 deletions(-)
> 
> diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
> index fc2f7aa..0490546 100644
> --- a/net/sunrpc/cache.c
> +++ b/net/sunrpc/cache.c
> @@ -28,6 +28,7 @@
>  #include <linux/workqueue.h>
>  #include <linux/mutex.h>
>  #include <linux/pagemap.h>
> +#include <linux/hashtable.h>
>  #include <asm/ioctls.h>
>  #include <linux/sunrpc/types.h>
>  #include <linux/sunrpc/cache.h>
> @@ -524,19 +525,18 @@ EXPORT_SYMBOL_GPL(cache_purge);
>   * it to be revisited when cache info is available
>   */
>  
> -#define	DFR_HASHSIZE	(PAGE_SIZE/sizeof(struct list_head))
> -#define	DFR_HASH(item)	((((long)item)>>4 ^ (((long)item)>>13)) % DFR_HASHSIZE)
> +#define	DFR_HASH_BITS	9

If we look at a bit of history, mainly commit:

commit 1117449276bb909b029ed0b9ba13f53e4784db9d
Author: NeilBrown <neilb@suse.de>
Date:   Thu Aug 12 17:04:08 2010 +1000

    sunrpc/cache: change deferred-request hash table to use hlist.


we'll notice that the only reason why the prior DFR_HASHSIZE was using

  (PAGE_SIZE/sizeof(struct list_head))

instead of

  (PAGE_SIZE/sizeof(struct hlist_head))

is because it has been forgotten in that commit. The intent there is to
make the hash table array fit the page size.

By defining DFR_HASH_BITS arbitrarily to "9", this indeed fulfills this
purpose on architectures with 4kB page size and 64-bit pointers, but not
on some powerpc configurations, and Tile architectures, which have more
exotic 64kB page size, and of course on the far less exotic 32-bit
pointer architectures.

So defining e.g.:

#include <linux/log2.h>

#define DFR_HASH_BITS  (PAGE_SHIFT - ilog2(BITS_PER_LONG))

would keep the intended behavior in all cases: use one page for the hash
array.

Thanks,

Mathieu
Linus Torvalds Oct. 29, 2012, 2:49 p.m. UTC | #2
On Mon, Oct 29, 2012 at 5:42 AM, Mathieu Desnoyers
<mathieu.desnoyers@efficios.com> wrote:
>
> So defining e.g.:
>
> #include <linux/log2.h>
>
> #define DFR_HASH_BITS  (PAGE_SHIFT - ilog2(BITS_PER_LONG))
>
> would keep the intended behavior in all cases: use one page for the hash
> array.

Well, since that wasn't true before either because of the long-time
bug you point out, clearly the page size isn't all that important. I
think it's more important to have small and simple code, and "9" is
certainly that, compared to playing ilog2 games with not-so-obvious
things.

Because there's no reason to believe that '9' is in any way a worse
random number than something page-shift-related, is there? And getting
away from *previous* overly-complicated size calculations that had
been broken because they were too complicated and random, sounds like
a good idea.

            Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Mathieu Desnoyers Oct. 29, 2012, 3:13 p.m. UTC | #3
* Linus Torvalds (torvalds@linux-foundation.org) wrote:
> On Mon, Oct 29, 2012 at 5:42 AM, Mathieu Desnoyers
> <mathieu.desnoyers@efficios.com> wrote:
> >
> > So defining e.g.:
> >
> > #include <linux/log2.h>
> >
> > #define DFR_HASH_BITS  (PAGE_SHIFT - ilog2(BITS_PER_LONG))
> >
> > would keep the intended behavior in all cases: use one page for the hash
> > array.
> 
> Well, since that wasn't true before either because of the long-time
> bug you point out, clearly the page size isn't all that important. I
> think it's more important to have small and simple code, and "9" is
> certainly that, compared to playing ilog2 games with not-so-obvious
> things.
> 
> Because there's no reason to believe that '9' is in any way a worse
> random number than something page-shift-related, is there? And getting
> away from *previous* overly-complicated size calculations that had
> been broken because they were too complicated and random, sounds like
> a good idea.

Good point. I agree that unless we really care about the precise number
of TLB entries and cache lines used by this hash table, we might want to
stay away from page-size and pointer-size based calculation.

It might not hurt to explain this in the patch changelog though.

Thanks,

Mathieu
J. Bruce Fields Oct. 29, 2012, 3:16 p.m. UTC | #4
On Mon, Oct 29, 2012 at 11:13:43AM -0400, Mathieu Desnoyers wrote:
> * Linus Torvalds (torvalds@linux-foundation.org) wrote:
> > On Mon, Oct 29, 2012 at 5:42 AM, Mathieu Desnoyers
> > <mathieu.desnoyers@efficios.com> wrote:
> > >
> > > So defining e.g.:
> > >
> > > #include <linux/log2.h>
> > >
> > > #define DFR_HASH_BITS  (PAGE_SHIFT - ilog2(BITS_PER_LONG))
> > >
> > > would keep the intended behavior in all cases: use one page for the hash
> > > array.
> > 
> > Well, since that wasn't true before either because of the long-time
> > bug you point out, clearly the page size isn't all that important. I
> > think it's more important to have small and simple code, and "9" is
> > certainly that, compared to playing ilog2 games with not-so-obvious
> > things.
> > 
> > Because there's no reason to believe that '9' is in any way a worse
> > random number than something page-shift-related, is there? And getting
> > away from *previous* overly-complicated size calculations that had
> > been broken because they were too complicated and random, sounds like
> > a good idea.
> 
> Good point. I agree that unless we really care about the precise number
> of TLB entries and cache lines used by this hash table, we might want to
> stay away from page-size and pointer-size based calculation.
>
> It might not hurt to explain this in the patch changelog though.

I'd also be happy to take that as a separate patch now.

--b.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Mathieu Desnoyers Oct. 29, 2012, 3:41 p.m. UTC | #5
* J. Bruce Fields (bfields@fieldses.org) wrote:
> On Mon, Oct 29, 2012 at 11:13:43AM -0400, Mathieu Desnoyers wrote:
> > * Linus Torvalds (torvalds@linux-foundation.org) wrote:
> > > On Mon, Oct 29, 2012 at 5:42 AM, Mathieu Desnoyers
> > > <mathieu.desnoyers@efficios.com> wrote:
> > > >
> > > > So defining e.g.:
> > > >
> > > > #include <linux/log2.h>
> > > >
> > > > #define DFR_HASH_BITS  (PAGE_SHIFT - ilog2(BITS_PER_LONG))
> > > >
> > > > would keep the intended behavior in all cases: use one page for the hash
> > > > array.
> > > 
> > > Well, since that wasn't true before either because of the long-time
> > > bug you point out, clearly the page size isn't all that important. I
> > > think it's more important to have small and simple code, and "9" is
> > > certainly that, compared to playing ilog2 games with not-so-obvious
> > > things.
> > > 
> > > Because there's no reason to believe that '9' is in any way a worse
> > > random number than something page-shift-related, is there? And getting
> > > away from *previous* overly-complicated size calculations that had
> > > been broken because they were too complicated and random, sounds like
> > > a good idea.
> > 
> > Good point. I agree that unless we really care about the precise number
> > of TLB entries and cache lines used by this hash table, we might want to
> > stay away from page-size and pointer-size based calculation.
> >
> > It might not hurt to explain this in the patch changelog though.
> 
> I'd also be happy to take that as a separate patch now.

FYIW: I've made a nice boo-boo above. It should have been:

#define DFR_HASH_BITS   (PAGE_SHIFT - ilog2(sizeof(struct hlist_head)))

Because we happen to have a memory indexed in bytes, not in bits. I
guess this goes a long way proving Linus' point about virtues of trivial
code. ;-)

Thanks,

Mathieu
Andrew Morton Oct. 29, 2012, 4:27 p.m. UTC | #6
On Mon, 29 Oct 2012 07:49:42 -0700 Linus Torvalds <torvalds@linux-foundation.org> wrote:

> Because there's no reason to believe that '9' is in any way a worse
> random number than something page-shift-related, is there?

9 is much better than PAGE_SHIFT.  PAGE_SIZE can vary by a factor of
16, depending on config.

Everyone thinks 4k, and tests only for that.  There's potential for
very large performance and behavior changes when their code gets run
on a 64k PAGE_SIZE machine.
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index fc2f7aa..0490546 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -28,6 +28,7 @@ 
 #include <linux/workqueue.h>
 #include <linux/mutex.h>
 #include <linux/pagemap.h>
+#include <linux/hashtable.h>
 #include <asm/ioctls.h>
 #include <linux/sunrpc/types.h>
 #include <linux/sunrpc/cache.h>
@@ -524,19 +525,18 @@  EXPORT_SYMBOL_GPL(cache_purge);
  * it to be revisited when cache info is available
  */
 
-#define	DFR_HASHSIZE	(PAGE_SIZE/sizeof(struct list_head))
-#define	DFR_HASH(item)	((((long)item)>>4 ^ (((long)item)>>13)) % DFR_HASHSIZE)
+#define	DFR_HASH_BITS	9
 
 #define	DFR_MAX	300	/* ??? */
 
 static DEFINE_SPINLOCK(cache_defer_lock);
 static LIST_HEAD(cache_defer_list);
-static struct hlist_head cache_defer_hash[DFR_HASHSIZE];
+static DEFINE_HASHTABLE(cache_defer_hash, DFR_HASH_BITS);
 static int cache_defer_cnt;
 
 static void __unhash_deferred_req(struct cache_deferred_req *dreq)
 {
-	hlist_del_init(&dreq->hash);
+	hash_del(&dreq->hash);
 	if (!list_empty(&dreq->recent)) {
 		list_del_init(&dreq->recent);
 		cache_defer_cnt--;
@@ -545,10 +545,7 @@  static void __unhash_deferred_req(struct cache_deferred_req *dreq)
 
 static void __hash_deferred_req(struct cache_deferred_req *dreq, struct cache_head *item)
 {
-	int hash = DFR_HASH(item);
-
-	INIT_LIST_HEAD(&dreq->recent);
-	hlist_add_head(&dreq->hash, &cache_defer_hash[hash]);
+	hash_add(cache_defer_hash, &dreq->hash, (unsigned long)item);
 }
 
 static void setup_deferral(struct cache_deferred_req *dreq,
@@ -600,7 +597,7 @@  static void cache_wait_req(struct cache_req *req, struct cache_head *item)
 		 * to clean up
 		 */
 		spin_lock(&cache_defer_lock);
-		if (!hlist_unhashed(&sleeper.handle.hash)) {
+		if (hash_hashed(&sleeper.handle.hash)) {
 			__unhash_deferred_req(&sleeper.handle);
 			spin_unlock(&cache_defer_lock);
 		} else {
@@ -671,12 +668,11 @@  static void cache_revisit_request(struct cache_head *item)
 	struct cache_deferred_req *dreq;
 	struct list_head pending;
 	struct hlist_node *lp, *tmp;
-	int hash = DFR_HASH(item);
 
 	INIT_LIST_HEAD(&pending);
 	spin_lock(&cache_defer_lock);
 
-	hlist_for_each_entry_safe(dreq, lp, tmp, &cache_defer_hash[hash], hash)
+	hash_for_each_possible_safe(cache_defer_hash, dreq, lp, tmp, hash, (unsigned long)item)
 		if (dreq->item == item) {
 			__unhash_deferred_req(dreq);
 			list_add(&dreq->recent, &pending);
@@ -1636,6 +1632,8 @@  static int create_cache_proc_entries(struct cache_detail *cd, struct net *net)
 void __init cache_initialize(void)
 {
 	INIT_DEFERRABLE_WORK(&cache_cleaner, do_cache_clean);
+
+	hash_init(cache_defer_hash);
 }
 
 int cache_register_net(struct cache_detail *cd, struct net *net)