[v7,15/16] openvswitch: use new hashtable implementation

Message ID	1351450948-15618-15-git-send-email-levinsasha928@gmail.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <linux-nfs-owner@vger.kernel.org> From: Sasha Levin <levinsasha928@gmail.com> To: torvalds@linux-foundation.org Cc: tj@kernel.org, akpm@linux-foundation.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, paul.gortmaker@windriver.com, davem@davemloft.net, rostedt@goodmis.org, mingo@elte.hu, ebiederm@xmission.com, aarcange@redhat.com, ericvh@gmail.com, netdev@vger.kernel.org, josh@joshtriplett.org, eric.dumazet@gmail.com, mathieu.desnoyers@efficios.com, axboe@kernel.dk, agk@redhat.com, dm-devel@redhat.com, neilb@suse.de, ccaulfie@redhat.com, teigland@redhat.com, Trond.Myklebust@netapp.com, bfields@fieldses.org, fweisbec@gmail.com, jesse@nicira.com, venkat.x.venkatsubra@oracle.com, ejt@redhat.com, snitzer@redhat.com, edumazet@google.com, linux-nfs@vger.kernel.org, dev@openvswitch.org, rds-devel@oss.oracle.com, lw@cn.fujitsu.com, Sasha Levin <levinsasha928@gmail.com> Subject: [PATCH v7 15/16] openvswitch: use new hashtable implementation Date: Sun, 28 Oct 2012 15:02:27 -0400 Message-Id: <1351450948-15618-15-git-send-email-levinsasha928@gmail.com> In-Reply-To: <1351450948-15618-1-git-send-email-levinsasha928@gmail.com> References: <1351450948-15618-1-git-send-email-levinsasha928@gmail.com> Sender: linux-nfs-owner@vger.kernel.org Precedence: bulk

Sasha Levin Oct. 28, 2012, 7:02 p.m. UTC

Switch openvswitch to use the new hashtable implementation. This reduces the amount of
generic unrelated code in openvswitch.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
---
 net/openvswitch/vport.c | 34 +++++++++++++---------------------
 1 file changed, 13 insertions(+), 21 deletions(-)

Mathieu Desnoyers Oct. 29, 2012, 1:29 p.m. UTC | #1

* Sasha Levin (levinsasha928@gmail.com) wrote:
[...]
> -static struct hlist_head *hash_bucket(struct net *net, const char *name)
> -{
> -	unsigned int hash = jhash(name, strlen(name), (unsigned long) net);
> -	return &dev_table[hash & (VPORT_HASH_BUCKETS - 1)];
> -}
> -
>  /**
>   *	ovs_vport_locate - find a port that has already been created
>   *
> @@ -84,13 +76,12 @@ static struct hlist_head *hash_bucket(struct net *net, const char *name)
>   */
>  struct vport *ovs_vport_locate(struct net *net, const char *name)
>  {
> -	struct hlist_head *bucket = hash_bucket(net, name);
>  	struct vport *vport;
>  	struct hlist_node *node;
> +	int key = full_name_hash(name, strlen(name));
>  
> -	hlist_for_each_entry_rcu(vport, node, bucket, hash_node)
> -		if (!strcmp(name, vport->ops->get_name(vport)) &&
> -		    net_eq(ovs_dp_get_net(vport->dp), net))
> +	hash_for_each_possible_rcu(dev_table, vport, node, hash_node, key)

Is applying hash_32() on top of full_name_hash() needed and expected ?

Thanks,

Mathieu

> +		if (!strcmp(name, vport->ops->get_name(vport)))
>  			return vport;
>  
>  	return NULL;
> @@ -174,7 +165,8 @@ struct vport *ovs_vport_add(const struct vport_parms *parms)
>  
>  	for (i = 0; i < ARRAY_SIZE(vport_ops_list); i++) {
>  		if (vport_ops_list[i]->type == parms->type) {
> -			struct hlist_head *bucket;
> +			int key;
> +			const char *name;
>  
>  			vport = vport_ops_list[i]->create(parms);
>  			if (IS_ERR(vport)) {
> @@ -182,9 +174,9 @@ struct vport *ovs_vport_add(const struct vport_parms *parms)
>  				goto out;
>  			}
>  
> -			bucket = hash_bucket(ovs_dp_get_net(vport->dp),
> -					     vport->ops->get_name(vport));
> -			hlist_add_head_rcu(&vport->hash_node, bucket);
> +			name = vport->ops->get_name(vport);
> +			key = full_name_hash(name, strlen(name));
> +			hash_add_rcu(dev_table, &vport->hash_node, key);
>  			return vport;
>  		}
>  	}
> @@ -225,7 +217,7 @@ void ovs_vport_del(struct vport *vport)
>  {
>  	ASSERT_RTNL();
>  
> -	hlist_del_rcu(&vport->hash_node);
> +	hash_del_rcu(&vport->hash_node);
>  
>  	vport->ops->destroy(vport);
>  }
> -- 
> 1.7.12.4
>

Sasha Levin Oct. 29, 2012, 3:43 p.m. UTC | #2

Hi Mathieu,

On Mon, Oct 29, 2012 at 9:29 AM, Mathieu Desnoyers
<mathieu.desnoyers@efficios.com> wrote:
> * Sasha Levin (levinsasha928@gmail.com) wrote:
> [...]
>> -static struct hlist_head *hash_bucket(struct net *net, const char *name)
>> -{
>> -     unsigned int hash = jhash(name, strlen(name), (unsigned long) net);
>> -     return &dev_table[hash & (VPORT_HASH_BUCKETS - 1)];
>> -}
>> -
>>  /**
>>   *   ovs_vport_locate - find a port that has already been created
>>   *
>> @@ -84,13 +76,12 @@ static struct hlist_head *hash_bucket(struct net *net, const char *name)
>>   */
>>  struct vport *ovs_vport_locate(struct net *net, const char *name)
>>  {
>> -     struct hlist_head *bucket = hash_bucket(net, name);
>>       struct vport *vport;
>>       struct hlist_node *node;
>> +     int key = full_name_hash(name, strlen(name));
>>
>> -     hlist_for_each_entry_rcu(vport, node, bucket, hash_node)
>> -             if (!strcmp(name, vport->ops->get_name(vport)) &&
>> -                 net_eq(ovs_dp_get_net(vport->dp), net))
>> +     hash_for_each_possible_rcu(dev_table, vport, node, hash_node, key)
>
> Is applying hash_32() on top of full_name_hash() needed and expected ?

Since this was pointed out in several of the patches, I'll answer it
just once here.

I've intentionally "allowed" double hashing with hash_32 to keep the
code simple.

hash_32() is pretty simple and gcc optimizes it to be almost nothing,
so doing that costs us a multiplication and a shift. On the other
hand, we benefit from keeping our code simple - how would we avoid
doing this double hash? adding a different hashtable function for
strings? or a new function for already hashed keys? I think we benefit
a lot from having to mul/shr instead of adding extra lines of code
here.


Thanks,
Sasha
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Mathieu Desnoyers Oct. 29, 2012, 3:59 p.m. UTC | #3

* Sasha Levin (levinsasha928@gmail.com) wrote:
> Hi Mathieu,
> 
> On Mon, Oct 29, 2012 at 9:29 AM, Mathieu Desnoyers
> <mathieu.desnoyers@efficios.com> wrote:
> > * Sasha Levin (levinsasha928@gmail.com) wrote:
> > [...]
> >> -static struct hlist_head *hash_bucket(struct net *net, const char *name)
> >> -{
> >> -     unsigned int hash = jhash(name, strlen(name), (unsigned long) net);
> >> -     return &dev_table[hash & (VPORT_HASH_BUCKETS - 1)];
> >> -}
> >> -
> >>  /**
> >>   *   ovs_vport_locate - find a port that has already been created
> >>   *
> >> @@ -84,13 +76,12 @@ static struct hlist_head *hash_bucket(struct net *net, const char *name)
> >>   */
> >>  struct vport *ovs_vport_locate(struct net *net, const char *name)
> >>  {
> >> -     struct hlist_head *bucket = hash_bucket(net, name);
> >>       struct vport *vport;
> >>       struct hlist_node *node;
> >> +     int key = full_name_hash(name, strlen(name));
> >>
> >> -     hlist_for_each_entry_rcu(vport, node, bucket, hash_node)
> >> -             if (!strcmp(name, vport->ops->get_name(vport)) &&
> >> -                 net_eq(ovs_dp_get_net(vport->dp), net))
> >> +     hash_for_each_possible_rcu(dev_table, vport, node, hash_node, key)
> >
> > Is applying hash_32() on top of full_name_hash() needed and expected ?
> 
> Since this was pointed out in several of the patches, I'll answer it
> just once here.
> 
> I've intentionally "allowed" double hashing with hash_32 to keep the
> code simple.
> 
> hash_32() is pretty simple and gcc optimizes it to be almost nothing,
> so doing that costs us a multiplication and a shift. On the other
> hand, we benefit from keeping our code simple - how would we avoid
> doing this double hash? adding a different hashtable function for
> strings? or a new function for already hashed keys? I think we benefit
> a lot from having to mul/shr instead of adding extra lines of code
> here.

This could be done, as I pointed out in another email within this
thread, by changing the "key" argument from add/for_each_possible to an
expected "hash" value, and let the caller invoke hash_32() if they want.
I doubt this would add a significant amount of complexity for users of
this API, but would allow much more flexibility to choose hash
functions.

Thanks,

Mathieu

Sasha Levin Oct. 29, 2012, 5:35 p.m. UTC | #4

On Mon, Oct 29, 2012 at 11:59 AM, Mathieu Desnoyers
<mathieu.desnoyers@efficios.com> wrote:
> * Sasha Levin (levinsasha928@gmail.com) wrote:
>> Hi Mathieu,
>>
>> On Mon, Oct 29, 2012 at 9:29 AM, Mathieu Desnoyers
>> <mathieu.desnoyers@efficios.com> wrote:
>> > * Sasha Levin (levinsasha928@gmail.com) wrote:
>> > [...]
>> >> -static struct hlist_head *hash_bucket(struct net *net, const char *name)
>> >> -{
>> >> -     unsigned int hash = jhash(name, strlen(name), (unsigned long) net);
>> >> -     return &dev_table[hash & (VPORT_HASH_BUCKETS - 1)];
>> >> -}
>> >> -
>> >>  /**
>> >>   *   ovs_vport_locate - find a port that has already been created
>> >>   *
>> >> @@ -84,13 +76,12 @@ static struct hlist_head *hash_bucket(struct net *net, const char *name)
>> >>   */
>> >>  struct vport *ovs_vport_locate(struct net *net, const char *name)
>> >>  {
>> >> -     struct hlist_head *bucket = hash_bucket(net, name);
>> >>       struct vport *vport;
>> >>       struct hlist_node *node;
>> >> +     int key = full_name_hash(name, strlen(name));
>> >>
>> >> -     hlist_for_each_entry_rcu(vport, node, bucket, hash_node)
>> >> -             if (!strcmp(name, vport->ops->get_name(vport)) &&
>> >> -                 net_eq(ovs_dp_get_net(vport->dp), net))
>> >> +     hash_for_each_possible_rcu(dev_table, vport, node, hash_node, key)
>> >
>> > Is applying hash_32() on top of full_name_hash() needed and expected ?
>>
>> Since this was pointed out in several of the patches, I'll answer it
>> just once here.
>>
>> I've intentionally "allowed" double hashing with hash_32 to keep the
>> code simple.
>>
>> hash_32() is pretty simple and gcc optimizes it to be almost nothing,
>> so doing that costs us a multiplication and a shift. On the other
>> hand, we benefit from keeping our code simple - how would we avoid
>> doing this double hash? adding a different hashtable function for
>> strings? or a new function for already hashed keys? I think we benefit
>> a lot from having to mul/shr instead of adding extra lines of code
>> here.
>
> This could be done, as I pointed out in another email within this
> thread, by changing the "key" argument from add/for_each_possible to an
> expected "hash" value, and let the caller invoke hash_32() if they want.
> I doubt this would add a significant amount of complexity for users of
> this API, but would allow much more flexibility to choose hash
> functions.

Most callers do need to do the hashing though, so why add an
additional step for all callers instead of doing another hash_32 for
the ones that don't really need it?

Another question is why do you need flexibility? I think that
simplicity wins over flexibility here.

Thanks,
Sasha
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Mathieu Desnoyers Oct. 29, 2012, 6:16 p.m. UTC | #5

* Sasha Levin (levinsasha928@gmail.com) wrote:
> On Mon, Oct 29, 2012 at 11:59 AM, Mathieu Desnoyers
> <mathieu.desnoyers@efficios.com> wrote:
> > * Sasha Levin (levinsasha928@gmail.com) wrote:
> >> Hi Mathieu,
> >>
> >> On Mon, Oct 29, 2012 at 9:29 AM, Mathieu Desnoyers
> >> <mathieu.desnoyers@efficios.com> wrote:
> >> > * Sasha Levin (levinsasha928@gmail.com) wrote:
> >> > [...]
> >> >> -static struct hlist_head *hash_bucket(struct net *net, const char *name)
> >> >> -{
> >> >> -     unsigned int hash = jhash(name, strlen(name), (unsigned long) net);
> >> >> -     return &dev_table[hash & (VPORT_HASH_BUCKETS - 1)];
> >> >> -}
> >> >> -
> >> >>  /**
> >> >>   *   ovs_vport_locate - find a port that has already been created
> >> >>   *
> >> >> @@ -84,13 +76,12 @@ static struct hlist_head *hash_bucket(struct net *net, const char *name)
> >> >>   */
> >> >>  struct vport *ovs_vport_locate(struct net *net, const char *name)
> >> >>  {
> >> >> -     struct hlist_head *bucket = hash_bucket(net, name);
> >> >>       struct vport *vport;
> >> >>       struct hlist_node *node;
> >> >> +     int key = full_name_hash(name, strlen(name));
> >> >>
> >> >> -     hlist_for_each_entry_rcu(vport, node, bucket, hash_node)
> >> >> -             if (!strcmp(name, vport->ops->get_name(vport)) &&
> >> >> -                 net_eq(ovs_dp_get_net(vport->dp), net))
> >> >> +     hash_for_each_possible_rcu(dev_table, vport, node, hash_node, key)
> >> >
> >> > Is applying hash_32() on top of full_name_hash() needed and expected ?
> >>
> >> Since this was pointed out in several of the patches, I'll answer it
> >> just once here.
> >>
> >> I've intentionally "allowed" double hashing with hash_32 to keep the
> >> code simple.
> >>
> >> hash_32() is pretty simple and gcc optimizes it to be almost nothing,
> >> so doing that costs us a multiplication and a shift. On the other
> >> hand, we benefit from keeping our code simple - how would we avoid
> >> doing this double hash? adding a different hashtable function for
> >> strings? or a new function for already hashed keys? I think we benefit
> >> a lot from having to mul/shr instead of adding extra lines of code
> >> here.
> >
> > This could be done, as I pointed out in another email within this
> > thread, by changing the "key" argument from add/for_each_possible to an
> > expected "hash" value, and let the caller invoke hash_32() if they want.
> > I doubt this would add a significant amount of complexity for users of
> > this API, but would allow much more flexibility to choose hash
> > functions.
> 
> Most callers do need to do the hashing though, so why add an
> additional step for all callers instead of doing another hash_32 for
> the ones that don't really need it?
> 
> Another question is why do you need flexibility? I think that
> simplicity wins over flexibility here.

I usually try to make things as simple as possible, but not simplistic
compared to the problem tackled. In this case, I would ask the following
question: by standardizing the hash function of all those pieces of
kernel infrastructure to "hash_32()", including submodules part of the
kernel network infrastructure, parts of the kernel that can be fed
values coming from user-space (through the VFS), how can you guarantee
that hash_32() won't be the cause of a DoS attack based on the fact that
this algorithm is a) known by an attacker, and b) does not have any
randomness. It's been a recent trend to perform DoS attacks on poorly
implemented hashing functions.

This is just one example in an attempt to show why different hash table
users may have different constraints: for a hash table entirely
populated by keys generated internally by the kernel, a random seed
might not be required, but for cases where values are fed by user-space
and from the NIC, I would argue that flexibility to implement a
randomizable hash function beats implementation simplicity any time.

And you could keep the basic use-case simple by providing hints to the
hash_32()/hash_64()/hash_ulong() helpers in comments.

Thoughts ?

Thanks,

Mathieu

Tejun Heo Oct. 29, 2012, 6:22 p.m. UTC | #6

Hello,

On Mon, Oct 29, 2012 at 02:16:48PM -0400, Mathieu Desnoyers wrote:
> This is just one example in an attempt to show why different hash table
> users may have different constraints: for a hash table entirely
> populated by keys generated internally by the kernel, a random seed
> might not be required, but for cases where values are fed by user-space
> and from the NIC, I would argue that flexibility to implement a
> randomizable hash function beats implementation simplicity any time.
> 
> And you could keep the basic use-case simple by providing hints to the
> hash_32()/hash_64()/hash_ulong() helpers in comments.

If all you need is throwing in a salt value to avoid attacks, can't
you just do that from caller side?  Scrambling the key before feeding
it into hash_*() should work, no?

Thanks.

Mathieu Desnoyers Oct. 29, 2012, 6:35 p.m. UTC | #7

* Tejun Heo (tj@kernel.org) wrote:
> Hello,
> 
> On Mon, Oct 29, 2012 at 02:16:48PM -0400, Mathieu Desnoyers wrote:
> > This is just one example in an attempt to show why different hash table
> > users may have different constraints: for a hash table entirely
> > populated by keys generated internally by the kernel, a random seed
> > might not be required, but for cases where values are fed by user-space
> > and from the NIC, I would argue that flexibility to implement a
> > randomizable hash function beats implementation simplicity any time.
> > 
> > And you could keep the basic use-case simple by providing hints to the
> > hash_32()/hash_64()/hash_ulong() helpers in comments.
> 
> If all you need is throwing in a salt value to avoid attacks, can't
> you just do that from caller side?  Scrambling the key before feeding
> it into hash_*() should work, no?

Yes, I think salting the "key" parameter would work.

Thanks,

Mathieu

> 
> Thanks.
> 
> -- 
> tejun

[v7,15/16] openvswitch: use new hashtable implementation

Commit Message

Comments

Patch