diff mbox

[rdma-next,v1,1/1] IB/core: Protect against concurrent access to hardware stats

Message ID 1521530899-9581-1-git-send-email-markb@mellanox.com (mailing list archive)
State Changes Requested
Headers show

Commit Message

Mark Bloch March 20, 2018, 7:28 a.m. UTC
Currently access to hardware stats buffer isn't protected, this can
result in multiple writes and reads at the same time to the same
memory location. This can lead to providing an incorrect value to
the user. Add a mutex to protect against it.

Fixes: b40f4757daa1 ("IB/core: Make device counter infrastructure dynamic")
Signed-off-by: Mark Bloch <markb@mellanox.com>
---
We might want to protect setting/reading lifespan as well, but it seems
not needed as it shouldn't be changed too often, and even if it does not by
multiple threads at the same time.

v0 -> v1: Removed RFC tag and added a Fixes line.
---
 drivers/infiniband/core/sysfs.c | 10 ++++++++--
 include/rdma/ib_verbs.h         |  1 +
 2 files changed, 9 insertions(+), 2 deletions(-)

Comments

Jason Gunthorpe March 21, 2018, 5:57 p.m. UTC | #1
On Tue, Mar 20, 2018 at 07:28:19AM +0000, Mark Bloch wrote:
> Currently access to hardware stats buffer isn't protected, this can
> result in multiple writes and reads at the same time to the same
> memory location. This can lead to providing an incorrect value to
> the user. Add a mutex to protect against it.
> 
> Fixes: b40f4757daa1 ("IB/core: Make device counter infrastructure dynamic")
> Signed-off-by: Mark Bloch <markb@mellanox.com>
> We might want to protect setting/reading lifespan as well, but it seems
> not needed as it shouldn't be changed too often, and even if it does not by
> multiple threads at the same time.
> 
> v0 -> v1: Removed RFC tag and added a Fixes line.
>  drivers/infiniband/core/sysfs.c | 10 ++++++++--
>  include/rdma/ib_verbs.h         |  1 +
>  2 files changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/infiniband/core/sysfs.c b/drivers/infiniband/core/sysfs.c
> index 8ae1308e..b769d61 100644
> +++ b/drivers/infiniband/core/sysfs.c
> @@ -810,10 +810,15 @@ static ssize_t show_hw_stats(struct kobject *kobj, struct attribute *attr,
>  		dev = port->ibdev;
>  		stats = port->hw_stats;
>  	}
> +	mutex_lock(&stats->lock);
>  	ret = update_hw_stats(dev, stats, hsa->port_num, hsa->index);
>  	if (ret)
> -		return ret;
> -	return print_hw_stat(stats, hsa->index, buf);
> +		goto unlock;
> +	ret = print_hw_stat(stats, hsa->index, buf);
> +unlock:
> +	mutex_unlock(&stats->lock);
> +
> +	return ret;
>  }

I think we are going to add a lock, it should cover all readers
too. There is no real reason not too that I can see.

So switch stats->lock to a rwlock and hold it in

show_hw_stats (write)
show_stats_lifespan
set_stats_lifespan

Which is all the places that touch stats..

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Christoph Lameter (Ampere) March 21, 2018, 6:04 p.m. UTC | #2
On Wed, 21 Mar 2018, Jason Gunthorpe wrote:

> I think we are going to add a lock, it should cover all readers
> too. There is no real reason not too that I can see.

The general approach in the kernel is to not use locks for access to
statistics and recognize that they are snapshots and may be inaccurate
because unserialized things happened after or at the time of the snapshot.

 > > So switch stats->lock to a rwlock and hold it in
>
> show_hw_stats (write)
> show_stats_lifespan
> set_stats_lifespan
>
> Which is all the places that touch stats..

I think this just causes unnecessary trouble and may slow down the access
to stats in particular in large systems.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jason Gunthorpe March 21, 2018, 7:19 p.m. UTC | #3
On Wed, Mar 21, 2018 at 01:04:45PM -0500, Christopher Lameter wrote:
> On Wed, 21 Mar 2018, Jason Gunthorpe wrote:
> 
> > I think we are going to add a lock, it should cover all readers
> > too. There is no real reason not too that I can see.
> 
> The general approach in the kernel is to not use locks for access to
> statistics and recognize that they are snapshots and may be inaccurate
> because unserialized things happened after or at the time of the snapshot.

Maybe, but other places aren't recording their statistics in u64's
then, there is no way to do that without locks portably.

>  > > So switch stats->lock to a rwlock and hold it in
> >
> > show_hw_stats (write)
> > show_stats_lifespan
> > set_stats_lifespan
> >
> > Which is all the places that touch stats..
> 
> I think this just causes unnecessary trouble and may slow down the access
> to stats in particular in large systems.

We already are adding a lock around the main reader, I think that
ship has sailed..

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Christoph Lameter (Ampere) March 21, 2018, 8:13 p.m. UTC | #4
On Wed, 21 Mar 2018, Jason Gunthorpe wrote:

> On Wed, Mar 21, 2018 at 01:04:45PM -0500, Christopher Lameter wrote:
> > On Wed, 21 Mar 2018, Jason Gunthorpe wrote:
> >
> > > I think we are going to add a lock, it should cover all readers
> > > too. There is no real reason not too that I can see.
> >
> > The general approach in the kernel is to not use locks for access to
> > statistics and recognize that they are snapshots and may be inaccurate
> > because unserialized things happened after or at the time of the snapshot.
>
> Maybe, but other places aren't recording their statistics in u64's
> then, there is no way to do that without locks portably.

Umm... Ok so we are concerned about 32 bits here? True the core kernel
uses unsigned long for these counters.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jason Gunthorpe March 21, 2018, 8:31 p.m. UTC | #5
On Wed, Mar 21, 2018 at 03:13:25PM -0500, Christopher Lameter wrote:
> On Wed, 21 Mar 2018, Jason Gunthorpe wrote:
> 
> > On Wed, Mar 21, 2018 at 01:04:45PM -0500, Christopher Lameter wrote:
> > > On Wed, 21 Mar 2018, Jason Gunthorpe wrote:
> > >
> > > > I think we are going to add a lock, it should cover all readers
> > > > too. There is no real reason not too that I can see.
> > >
> > > The general approach in the kernel is to not use locks for access to
> > > statistics and recognize that they are snapshots and may be inaccurate
> > > because unserialized things happened after or at the time of the snapshot.
> >
> > Maybe, but other places aren't recording their statistics in u64's
> > then, there is no way to do that without locks portably.
> 
> Umm... Ok so we are concerned about 32 bits here? True the core kernel
> uses unsigned long for these counters.

Sort of. For core kernel code like this I certainly don't want to see
us cavalierly break things to solve imaginary problems.

In this case I can't see a real performance problem here..

The extra two sysfs that need locking are not used frequently I think,
and I doubt much software is polling sysfs files multi-threaded at a
high rate of speed. That is a pretty nonsense thing to do.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Mark Bloch March 22, 2018, 8:38 a.m. UTC | #6
On 21/03/2018 19:57, Jason Gunthorpe wrote:
> On Tue, Mar 20, 2018 at 07:28:19AM +0000, Mark Bloch wrote:
>> Currently access to hardware stats buffer isn't protected, this can
>> result in multiple writes and reads at the same time to the same
>> memory location. This can lead to providing an incorrect value to
>> the user. Add a mutex to protect against it.
>>
>> Fixes: b40f4757daa1 ("IB/core: Make device counter infrastructure dynamic")
>> Signed-off-by: Mark Bloch <markb@mellanox.com>
>> We might want to protect setting/reading lifespan as well, but it seems
>> not needed as it shouldn't be changed too often, and even if it does not by
>> multiple threads at the same time.
>>
>> v0 -> v1: Removed RFC tag and added a Fixes line.
>>  drivers/infiniband/core/sysfs.c | 10 ++++++++--
>>  include/rdma/ib_verbs.h         |  1 +
>>  2 files changed, 9 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/infiniband/core/sysfs.c b/drivers/infiniband/core/sysfs.c
>> index 8ae1308e..b769d61 100644
>> +++ b/drivers/infiniband/core/sysfs.c
>> @@ -810,10 +810,15 @@ static ssize_t show_hw_stats(struct kobject *kobj, struct attribute *attr,
>>  		dev = port->ibdev;
>>  		stats = port->hw_stats;
>>  	}
>> +	mutex_lock(&stats->lock);
>>  	ret = update_hw_stats(dev, stats, hsa->port_num, hsa->index);
>>  	if (ret)
>> -		return ret;
>> -	return print_hw_stat(stats, hsa->index, buf);
>> +		goto unlock;
>> +	ret = print_hw_stat(stats, hsa->index, buf);
>> +unlock:
>> +	mutex_unlock(&stats->lock);
>> +
>> +	return ret;
>>  }
> 
> I think we are going to add a lock, it should cover all readers
> too. There is no real reason not too that I can see.

I'll protect setting/reading lifespan as well.
> 
> So switch stats->lock to a rwlock and hold it in

Why rwlock?
seeing as setting/reading lifespan probably isn't going to be
used a lot (as opposed to reading a counter) and from the driver
point of view, reading a counter will usually mean "Do a FW query command"
I really don't think rwlock is the right option here.

> 
> show_hw_stats (write)
> show_stats_lifespan
> set_stats_lifespan
> 
> Which is all the places that touch stats..
> 
> Jason
> 

Mark
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jason Gunthorpe March 23, 2018, 8:54 p.m. UTC | #7
On Thu, Mar 22, 2018 at 10:38:01AM +0200, Mark Bloch wrote:
> 
> 
> On 21/03/2018 19:57, Jason Gunthorpe wrote:
> > On Tue, Mar 20, 2018 at 07:28:19AM +0000, Mark Bloch wrote:
> >> Currently access to hardware stats buffer isn't protected, this can
> >> result in multiple writes and reads at the same time to the same
> >> memory location. This can lead to providing an incorrect value to
> >> the user. Add a mutex to protect against it.
> >>
> >> Fixes: b40f4757daa1 ("IB/core: Make device counter infrastructure dynamic")
> >> Signed-off-by: Mark Bloch <markb@mellanox.com>
> >> We might want to protect setting/reading lifespan as well, but it seems
> >> not needed as it shouldn't be changed too often, and even if it does not by
> >> multiple threads at the same time.
> >>
> >> v0 -> v1: Removed RFC tag and added a Fixes line.
> >>  drivers/infiniband/core/sysfs.c | 10 ++++++++--
> >>  include/rdma/ib_verbs.h         |  1 +
> >>  2 files changed, 9 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/drivers/infiniband/core/sysfs.c b/drivers/infiniband/core/sysfs.c
> >> index 8ae1308e..b769d61 100644
> >> +++ b/drivers/infiniband/core/sysfs.c
> >> @@ -810,10 +810,15 @@ static ssize_t show_hw_stats(struct kobject *kobj, struct attribute *attr,
> >>  		dev = port->ibdev;
> >>  		stats = port->hw_stats;
> >>  	}
> >> +	mutex_lock(&stats->lock);
> >>  	ret = update_hw_stats(dev, stats, hsa->port_num, hsa->index);
> >>  	if (ret)
> >> -		return ret;
> >> -	return print_hw_stat(stats, hsa->index, buf);
> >> +		goto unlock;
> >> +	ret = print_hw_stat(stats, hsa->index, buf);
> >> +unlock:
> >> +	mutex_unlock(&stats->lock);
> >> +
> >> +	return ret;
> >>  }
> > 
> > I think we are going to add a lock, it should cover all readers
> > too. There is no real reason not too that I can see.
> 
> I'll protect setting/reading lifespan as well.
> > 
> > So switch stats->lock to a rwlock and hold it in
> 
> Why rwlock?
> seeing as setting/reading lifespan probably isn't going to be
> used a lot (as opposed to reading a counter) and from the driver
> point of view, reading a counter will usually mean "Do a FW query command"
> I really don't think rwlock is the right option here.

No specific reason, mutex too

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/infiniband/core/sysfs.c b/drivers/infiniband/core/sysfs.c
index 8ae1308e..b769d61 100644
--- a/drivers/infiniband/core/sysfs.c
+++ b/drivers/infiniband/core/sysfs.c
@@ -810,10 +810,15 @@  static ssize_t show_hw_stats(struct kobject *kobj, struct attribute *attr,
 		dev = port->ibdev;
 		stats = port->hw_stats;
 	}
+	mutex_lock(&stats->lock);
 	ret = update_hw_stats(dev, stats, hsa->port_num, hsa->index);
 	if (ret)
-		return ret;
-	return print_hw_stat(stats, hsa->index, buf);
+		goto unlock;
+	ret = print_hw_stat(stats, hsa->index, buf);
+unlock:
+	mutex_unlock(&stats->lock);
+
+	return ret;
 }
 
 static ssize_t show_stats_lifespan(struct kobject *kobj,
@@ -951,6 +956,7 @@  static void setup_hw_stats(struct ib_device *device, struct ib_port *port,
 		sysfs_attr_init(hsag->attrs[i]);
 	}
 
+	mutex_init(&stats->lock);
 	/* treat an error here as non-fatal */
 	hsag->attrs[i] = alloc_hsa_lifespan("lifespan", port_num);
 	if (hsag->attrs[i])
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 73b2387..33fbccb 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -484,6 +484,7 @@  enum ib_port_speed {
  *   filled in by the drivers get_stats routine
  */
 struct rdma_hw_stats {
+	struct mutex lock;
 	unsigned long	timestamp;
 	unsigned long	lifespan;
 	const char * const *names;