
[v2,06/14] locks: don't walk inode->i_flock list in locks_show

Message ID 1370948948-31784-7-git-send-email-jlayton@redhat.com (mailing list archive)
State New, archived

Commit Message

Jeff Layton June 11, 2013, 11:09 a.m. UTC
When we convert over to using the i_lock to protect the i_flock list,
that will introduce a potential lock inversion problem in locks_show.
When we want to walk the i_flock list, we'll need to take the i_lock.

Rather than do that, just walk the global blocked_locks list and print
out any that are blocked on the given lock.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
---
 fs/locks.c |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

Comments

J. Bruce Fields June 13, 2013, 7:45 p.m. UTC | #1
On Tue, Jun 11, 2013 at 07:09:00AM -0400, Jeff Layton wrote:
> When we convert over to using the i_lock to protect the i_flock list,
> that will introduce a potential lock inversion problem in locks_show.
> When we want to walk the i_flock list, we'll need to take the i_lock.
> 
> Rather than do that, just walk the global blocked_locks list and print
> out any that are blocked on the given lock.

I'm OK with this as obviously /proc/locks shouldn't be the common case,
but it still bugs me a bit that we're suddenly making it something like

	O(number of held locks * number of waiters)

where it used to be

	O(number of held locks + number of waiters)

I wonder if there's any solution that's just as easy and avoids scanning
the blocked list each time.

--b.

> 
> Signed-off-by: Jeff Layton <jlayton@redhat.com>
> ---
>  fs/locks.c |    6 ++++--
>  1 files changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/locks.c b/fs/locks.c
> index e451d18..3fd27f0 100644
> --- a/fs/locks.c
> +++ b/fs/locks.c
> @@ -2249,8 +2249,10 @@ static int locks_show(struct seq_file *f, void *v)
>  
>  	lock_get_status(f, fl, *((loff_t *)f->private), "");
>  
> -	list_for_each_entry(bfl, &fl->fl_block, fl_block)
> -		lock_get_status(f, bfl, *((loff_t *)f->private), " ->");
> +	list_for_each_entry(bfl, &blocked_list, fl_link) {
> +		if (bfl->fl_next == fl)
> +			lock_get_status(f, bfl, *((loff_t *)f->private), " ->");
> +	}
>  
>  	return 0;
>  }
> -- 
> 1.7.1
> 
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jeff Layton June 13, 2013, 8:26 p.m. UTC | #2
On Thu, 13 Jun 2013 15:45:46 -0400
"J. Bruce Fields" <bfields@fieldses.org> wrote:

> On Tue, Jun 11, 2013 at 07:09:00AM -0400, Jeff Layton wrote:
> > When we convert over to using the i_lock to protect the i_flock list,
> > that will introduce a potential lock inversion problem in locks_show.
> > When we want to walk the i_flock list, we'll need to take the i_lock.
> > 
> > Rather than do that, just walk the global blocked_locks list and print
> > out any that are blocked on the given lock.
> 
> I'm OK with this as obviously /proc/locks shouldn't be the common case,
> but it still bugs me a bit that we're suddenly making it something like
> 
> 	O(number of held locks * number of waiters)
> 
> where it used to be
> 
> 	O(number of held locks + number of waiters)
> 
> I wonder if there's any solution that's just as easy and avoids scanning
> the blocked list each time.
> 
> --b.
> 
> > 
> > Signed-off-by: Jeff Layton <jlayton@redhat.com>
> > ---
> >  fs/locks.c |    6 ++++--
> >  1 files changed, 4 insertions(+), 2 deletions(-)
> > 
> > diff --git a/fs/locks.c b/fs/locks.c
> > index e451d18..3fd27f0 100644
> > --- a/fs/locks.c
> > +++ b/fs/locks.c
> > @@ -2249,8 +2249,10 @@ static int locks_show(struct seq_file *f, void *v)
> >  
> >  	lock_get_status(f, fl, *((loff_t *)f->private), "");
> >  
> > -	list_for_each_entry(bfl, &fl->fl_block, fl_block)
> > -		lock_get_status(f, bfl, *((loff_t *)f->private), " ->");
> > +	list_for_each_entry(bfl, &blocked_list, fl_link) {
> > +		if (bfl->fl_next == fl)
> > +			lock_get_status(f, bfl, *((loff_t *)f->private), " ->");
> > +	}
> >  
> >  	return 0;
> >  }
> > -- 
> > 1.7.1
> > 

Yeah, it's ugly, but I don't see a real alternative. We could try to
use RCU for this, but that would slow down the list manipulation just
to optimize a rarely read procfile.

Now that I look, though, it occurs to me that we have a problem here
anyway: only blocked POSIX requests go onto that list currently, so
this misses any blocked flock requests.

The only real solution I can think of is to put flock locks into the
blocked_list/blocked_hash too, or maybe giving them a simple hlist to
sit on.

I'll fix that up in the next iteration. It'll probably make flock()
tests run slower, but such is the cost of preserving this procfile...
Jeff Layton June 15, 2013, 11:05 a.m. UTC | #3
On Fri, 14 Jun 2013 07:52:44 -0400
Simo <idra@samba.org> wrote:

> On 06/13/2013 04:26 PM, Jeff Layton wrote:
> > The only real solution I can think of is to put flock locks into the
> > blocked_list/blocked_hash too, or maybe giving them a simple hlist to
> > sit on.
> >
> > I'll fix that up in the next iteration. It'll probably make flock()
> > tests run slower, but such is the cost of preserving this procfile...
> 
> How hard would it be to make the procfile stuff optional?
> So that those who need the performance can decide not to use it?
> Maybe even something that can be disabled at run time? Not just compile
> time.
> 

(re-adding back the cc lists...)

It'd be tricky, especially if you want to do it at runtime. The
procfile itself is not a problem per se. The real problem is the
tracking you have to do in order to eventually present the procfile. So
a boot-time or compile-time switch might be reasonable, but a runtime
switch probably never will be.

I have a new patchset that I'm testing now though that should address
Bruce's concerns about iterating over that global list. So far, it
seems to be at least as fast as the latest patchset I posted.

It makes the (spin)locking a bit more complex, but hopefully I can
document this well enough that it's not a great concern.

Stay tuned...
simo June 15, 2013, 3:04 p.m. UTC | #4
On 06/15/2013 07:05 AM, Jeff Layton wrote:
> On Fri, 14 Jun 2013 07:52:44 -0400
> Simo <idra@samba.org> wrote:
>
>> On 06/13/2013 04:26 PM, Jeff Layton wrote:
>>> The only real solution I can think of is to put flock locks into the
>>> blocked_list/blocked_hash too, or maybe giving them a simple hlist to
>>> sit on.
>>>
>>> I'll fix that up in the next iteration. It'll probably make flock()
>>> tests run slower, but such is the cost of preserving this procfile...
>> How hard would it be to make the procfile stuff optional?
>> So that those who need the performance can decide not to use it?
>> Maybe even something that can be disabled at run time? Not just compile
>> time.
>>
> (re-adding back the cc lists...)
>
> It'd be tricky, especially if you want to do it at runtime. The
> procfile itself is not a problem per se. The real problem is the
> tracking you have to do in order to eventually present the procfile. So
> a boot-time or compile-time switch might be reasonable, but a runtime
> switch probably never will be.

Just to be clear, I meant a switch to turn it off at runtime; I
understand very well that it would be way too hard to turn it on at
runtime. But killing the perf problem might be desirable on a system you
cannot just reboot.

> I have a new patchset that I'm testing now though that should address
> Bruce's concerns about iterating over that global list. So far, it
> seems to be at least as fast as the latest patchset I posted.
>
> It makes the (spin)locking a bit more complex, but hopefully I can
> document this well enough that it's not a great concern.
>
> Stay tuned...

Thanks Jeff,
this is very valuable work.

Simo.


Patch

diff --git a/fs/locks.c b/fs/locks.c
index e451d18..3fd27f0 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -2249,8 +2249,10 @@  static int locks_show(struct seq_file *f, void *v)
 
 	lock_get_status(f, fl, *((loff_t *)f->private), "");
 
-	list_for_each_entry(bfl, &fl->fl_block, fl_block)
-		lock_get_status(f, bfl, *((loff_t *)f->private), " ->");
+	list_for_each_entry(bfl, &blocked_list, fl_link) {
+		if (bfl->fl_next == fl)
+			lock_get_status(f, bfl, *((loff_t *)f->private), " ->");
+	}
 
 	return 0;
 }