Message ID | 1370056054-25449-12-git-send-email-jlayton@redhat.com (mailing list archive)
---|---
State | New, archived
Hi Jeff,

> There's no reason we have to protect the blocked_hash and file_lock_list
> with the same spinlock. With the tests I have, breaking it in two gives
> a barely measurable performance benefit, but it seems reasonable to make
> this locking as granular as possible.

as file_lock_{list,lock} is only used for debugging (/proc/locks) after this
change, I guess it would be possible to use RCU instead of a spinlock.

@others: this was the related discussion on IRC
(http://irclog.samba.org/) about this:

16:02 < metze> jlayton: do you have time to discuss your file_lock_lock changes?
16:02 < jlayton> metze: sure, what's up?
16:03 < jlayton> metze: note that it won't help vl's thundering herd problems...
16:03 < metze> is it correct that after your last patch file_lock_lock is only used for /proc/locks?
16:03 < jlayton> well, it's only used to protect the list that is used for /proc/locks
16:04 < jlayton> it still gets taken whenever a lock is acquired or released in order to manipulate that list
16:04 < metze> would it be a good idea to use rcu instead of a spin lock?
16:04 < jlayton> I tried using RCU, but it turned out to slow everything down
16:04 < jlayton> this is not a read-mostly workload unfortunately
16:04 < jlayton> so doing it with mutual exclusion turns out to be faster
16:04 < metze> ok
16:05 < jlayton> I might play around with it again sometime, but I don't think it really helps. What we need to ensure is that we optimize the code that manipulates that list, and RCU list manipulations have larger overhead
16:06 < jlayton> metze: that's a good question though so if you want to ask it on the list, please do
16:06 < jlayton> others will probably be wondering the same thing
16:08 < metze> maybe it's worth a comment in commit message and the code
16:08 < metze> btw, why don't you remove the ' /* Protects the file_lock_list and the blocked_hash */' comment?

metze
On Tue, 04 Jun 2013 16:19:53 +0200
"Stefan (metze) Metzmacher" <metze@samba.org> wrote:

> Hi Jeff,
>
> > There's no reason we have to protect the blocked_hash and file_lock_list
> > with the same spinlock. With the tests I have, breaking it in two gives
> > a barely measurable performance benefit, but it seems reasonable to make
> > this locking as granular as possible.
>
> as file_lock_{list,lock} is only used for debugging (/proc/locks) after this
> change, I guess it would be possible to use RCU instead of a spinlock.
>
> @others: this was the related discussion on IRC
> (http://irclog.samba.org/) about this:
>
> 16:02 < metze> jlayton: do you have time to discuss your file_lock_lock
> changes?
> 16:02 < jlayton> metze: sure, what's up?
> 16:03 < jlayton> metze: note that it won't help vl's thundering herd
> problems...
> 16:03 < metze> is it correct that after your last patch file_lock_lock
> is only used for /proc/locks?
> 16:03 < jlayton> well, it's only used to protect the list that is used
> for /proc/locks
> 16:04 < jlayton> it still gets taken whenever a lock is acquired or
> released in order to manipulate that list
> 16:04 < metze> would it be a good idea to use rcu instead of a spin lock?
> 16:04 < jlayton> I tried using RCU, but it turned out to slow everything
> down
> 16:04 < jlayton> this is not a read-mostly workload unfortunately
> 16:04 < jlayton> so doing it with mutual exclusion turns out to be faster
> 16:04 < metze> ok
> 16:05 < jlayton> I might play around with it again sometime, but I don't
> think it really helps. What we need to ensure is
> that we optimize the code that manipulates that list,
> and RCU list manipulations have larger overhead
> 16:06 < jlayton> metze: that's a good question though so if you want to
> ask it on the list, please do
> 16:06 < jlayton> others will probably be wondering the same thing
> 16:08 < metze> maybe it's worth a comment in commit message and the code

I'm not sure it's worth commenting about RCU in the code. In the future
it may be possible to do this -- who knows. It does seem to have been
consistently slower in my testing here though.

> 16:08 < metze> btw, why don't you remove the ' /* Protects the
> file_lock_list and the blocked_hash */' comment?

Removed in my git tree -- thanks for pointing that out.
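To make the trade-off in that IRC exchange concrete, here is a rough sketch of what an RCU conversion of the global file_lock_list could have looked like (purely illustrative; the lock and function names below are invented and none of this is code from the series). The point is that RCU only relieves the read side: every lock acquire/release still serializes on an update-side spinlock, and additionally pays for the RCU list primitives and grace-period-deferred freeing, which is why it measured slower on this write-heavy path.

/* Hypothetical sketch, not from the series: RCU-protected global list. */
static DEFINE_SPINLOCK(file_lock_update_lock);  /* invented name */

static void locks_insert_global_locks_rcu(struct file_lock *fl)
{
        /* updaters must still serialize among themselves */
        spin_lock(&file_lock_update_lock);
        hlist_add_head_rcu(&fl->fl_link, &file_lock_list);
        spin_unlock(&file_lock_update_lock);
}

static void locks_delete_global_locks_rcu(struct file_lock *fl)
{
        spin_lock(&file_lock_update_lock);
        hlist_del_rcu(&fl->fl_link);
        spin_unlock(&file_lock_update_lock);
        /* fl may only be freed after an RCU grace period has elapsed */
}

/* only the rare /proc/locks reader would run lockless */
static void locks_walk_global_rcu(void (*fn)(struct file_lock *))
{
        struct file_lock *fl;

        rcu_read_lock();
        hlist_for_each_entry_rcu(fl, &file_lock_list, fl_link)
                fn(fl);
        rcu_read_unlock();
}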
Having RCU for modification-mostly workloads is never a good idea, so I
don't think it makes sense to mention it here.

If you care about the overhead it's worth trying to use per-cpu lists,
though.
On Tue, Jun 04, 2013 at 07:46:40AM -0700, Christoph Hellwig wrote:
> Having RCU for modification-mostly workloads is never a good idea, so
> I don't think it makes sense to mention it here.
>
> If you care about the overhead it's worth trying to use per-cpu lists,
> though.

Yes. The lock and unlock could happen on different CPUs--but I think
you can make the rule that the lock stays associated with the list it
was first put on, and then it's correct in general and hopefully quick
in the common case.

--b.
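Bruce's rule can be illustrated with a short sketch (hypothetical types, names, and a hypothetical fl_link_cpu field; this is not code from the series): the insert path records which CPU's list the lock went on, and the delete path locks that CPU's list rather than the current one.

/* Hypothetical sketch of per-cpu file_lock lists, not actual kernel code. */
struct file_lock_pcpu_list {
        spinlock_t              lock;
        struct hlist_head       head;
};

static DEFINE_PER_CPU(struct file_lock_pcpu_list, file_lock_pcpu);

/* assumes an added "int fl_link_cpu;" member in struct file_lock */
static void locks_insert_global_locks(struct file_lock *fl)
{
        struct file_lock_pcpu_list *l;

        /* remember which CPU's list this lock went on... */
        fl->fl_link_cpu = get_cpu();
        l = this_cpu_ptr(&file_lock_pcpu);
        spin_lock(&l->lock);
        hlist_add_head(&fl->fl_link, &l->head);
        spin_unlock(&l->lock);
        put_cpu();
}

static void locks_delete_global_locks(struct file_lock *fl)
{
        /* ...so the delete path can find the right list from any CPU */
        struct file_lock_pcpu_list *l =
                per_cpu_ptr(&file_lock_pcpu, fl->fl_link_cpu);

        spin_lock(&l->lock);
        hlist_del_init(&fl->fl_link);
        spin_unlock(&l->lock);
}

In the common case a lock is released on the CPU that acquired it, so inserts and deletes stay CPU-local and uncontended; only /proc/locks would have to walk all of the per-cpu lists.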
On Tue, 4 Jun 2013 07:46:40 -0700
Christoph Hellwig <hch@infradead.org> wrote:

> Having RCU for modification-mostly workloads is never a good idea, so
> I don't think it makes sense to mention it here.
>
> If you care about the overhead it's worth trying to use per-cpu lists,
> though.

Yeah, I looked at those too in an earlier set and it did help some.
Moving to a hashtable for the blocked_list really seemed to help the
most there, but percpu lists with lglocks or something might help a lot
on the file_lock_list.
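The lglock variant mentioned here would look roughly like the following (again a hedged sketch, not code from the series; lglock.h did exist in the tree at the time). Insert and delete would mirror the previous per-cpu sketch, with lg_local_lock() on insert and lg_local_lock_cpu() on the recorded fl_link_cpu at delete; the rare /proc/locks reader then freezes all the per-cpu lists at once:

/* Hypothetical sketch: percpu file_lock_list hlists guarded by an lglock. */
static DEFINE_STATIC_LGLOCK(file_lock_lglock);
static DEFINE_PER_CPU(struct hlist_head, file_lock_list);

/* the rare /proc/locks reader takes every CPU's lock at once */
static void locks_walk_all(void (*fn)(struct file_lock *))
{
        struct file_lock *fl;
        int cpu;

        lg_global_lock(&file_lock_lglock);
        for_each_possible_cpu(cpu)
                hlist_for_each_entry(fl, per_cpu_ptr(&file_lock_list, cpu),
                                     fl_link)
                        fn(fl);
        lg_global_unlock(&file_lock_lglock);
}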
On Tue, 4 Jun 2013 10:53:22 -0400
"J. Bruce Fields" <bfields@fieldses.org> wrote:

> On Tue, Jun 04, 2013 at 07:46:40AM -0700, Christoph Hellwig wrote:
> > Having RCU for modification-mostly workloads is never a good idea, so
> > I don't think it makes sense to mention it here.
> >
> > If you care about the overhead it's worth trying to use per-cpu lists,
> > though.
>
> Yes. The lock and unlock could happen on different CPUs--but I think
> you can make the rule that the lock stays associated with the list it
> was first put on, and then it's correct in general and hopefully quick
> in the common case.

It's important to distinguish between the blocked_list/hash and the
file_lock_list. Yes, they use the same embedded list_head or hlist_node
in the file_lock, but they have very different characteristics and use
cases.

In the testing I did, having a hashtable for the blocked locks helped a
lot more than a percpu list. The trick with deadlock detection is to
ensure that you don't spend a lot of time walking the lists. Since we
do deadlock detection frequently, we need to optimize for that case.

For the file_lock_list, it might make sense to have percpu hlists with
an lglock, however. The thing to note here is that the file_lock_list
is almost never read. Only /proc/locks uses it, so anything we can do
to optimize the list manipulation is probably worth it.

All that said, I'm leery about changing too much of this code too fast.
It's pretty old and poorly understood, so I think we need to be
cautious and take an incremental approach to changing it.
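The reason the hash helps deadlock detection is visible in the shape of the lookup it enables: finding what a blocked owner is waiting for only scans the one bucket that posix_owner_key() maps to, rather than every blocked lock in the system. A sketch along the lines of the helper the series uses (details may differ from the actual patch):

/* Sketch of the owner-keyed lookup; may differ from the actual series. */
static struct file_lock *what_owner_is_waiting_for(struct file_lock *block_fl)
{
        struct file_lock *fl;

        /* walk only the bucket this owner hashes to */
        hash_for_each_possible(blocked_hash, fl, fl_link,
                               posix_owner_key(block_fl)) {
                if (posix_same_owner(fl, block_fl))
                        return fl->fl_next;     /* the lock being waited on */
        }
        return NULL;
}

With BLOCKED_HASH_BITS = 7 that is 128 buckets, so each step of the walk in posix_locks_deadlock() touches only a small fraction of the blocked locks.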
diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking
index ee351ac..8d8d040 100644
--- a/Documentation/filesystems/Locking
+++ b/Documentation/filesystems/Locking
@@ -359,20 +359,20 @@ prototypes:
 
 locking rules:
 
-                        inode->i_lock   file_lock_lock  may block
-lm_compare_owner:       yes             maybe           no
-lm_owner_key            yes             yes             no
-lm_notify:              yes             no              no
-lm_grant:               no              no              no
-lm_break:               yes             no              no
-lm_change               yes             no              no
+                        inode->i_lock   blocked_hash_lock       may block
+lm_compare_owner:       yes             maybe                   no
+lm_owner_key            yes             yes                     no
+lm_notify:              yes             no                      no
+lm_grant:               no              no                      no
+lm_break:               yes             no                      no
+lm_change               yes             no                      no
 
 ->lm_compare_owner and ->lm_owner_key are generally called with *an*
 inode->i_lock held. It may not be the i_lock of the inode associated with
 either file_lock argument! This is the case with deadlock detection, since
 the code has to chase down the owners of locks that may be entirely
 unrelated to the one on which the lock is being acquired.
-For deadlock detection however, the file_lock_lock is also held. The
+For deadlock detection however, the blocked_hash_lock is also held. The
 fact that these locks are held ensures that the file_locks do not
 disappear out from under you while doing the comparison or generating an
 owner key.
diff --git a/fs/locks.c b/fs/locks.c
index 8219187..520f32b 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -172,12 +172,13 @@ int lease_break_time = 45;
  */
 #define BLOCKED_HASH_BITS       7
 
+static DEFINE_SPINLOCK(blocked_hash_lock);
 static DEFINE_HASHTABLE(blocked_hash, BLOCKED_HASH_BITS);
 
+static DEFINE_SPINLOCK(file_lock_lock);
 static HLIST_HEAD(file_lock_list);
 
 /* Protects the file_lock_list and the blocked_hash */
-static DEFINE_SPINLOCK(file_lock_lock);
 
 static struct kmem_cache *filelock_cache __read_mostly;
 
@@ -503,17 +504,17 @@ posix_owner_key(struct file_lock *fl)
 static inline void
 locks_insert_global_blocked(struct file_lock *waiter)
 {
-       spin_lock(&file_lock_lock);
+       spin_lock(&blocked_hash_lock);
        hash_add(blocked_hash, &waiter->fl_link, posix_owner_key(waiter));
-       spin_unlock(&file_lock_lock);
+       spin_unlock(&blocked_hash_lock);
 }
 
 static inline void
 locks_delete_global_blocked(struct file_lock *waiter)
 {
-       spin_lock(&file_lock_lock);
+       spin_lock(&blocked_hash_lock);
        hash_del(&waiter->fl_link);
-       spin_unlock(&file_lock_lock);
+       spin_unlock(&blocked_hash_lock);
 }
 
 static inline void
@@ -739,7 +740,7 @@ static int posix_locks_deadlock(struct file_lock *caller_fl,
        int i = 0;
        int ret = 0;
 
-       spin_lock(&file_lock_lock);
+       spin_lock(&blocked_hash_lock);
        while ((block_fl = what_owner_is_waiting_for(block_fl))) {
                if (i++ > MAX_DEADLK_ITERATIONS)
                        break;
@@ -748,7 +749,7 @@ static int posix_locks_deadlock(struct file_lock *caller_fl,
                        break;
                }
        }
-       spin_unlock(&file_lock_lock);
+       spin_unlock(&blocked_hash_lock);
        return ret;
 }
 
@@ -2300,10 +2301,12 @@ static int locks_show(struct seq_file *f, void *v)
 
        lock_get_status(f, fl, *((loff_t *)f->private), "");
 
+       spin_lock(&blocked_hash_lock);
        hash_for_each(blocked_hash, bkt, bfl, fl_link) {
                if (bfl->fl_next == fl)
                        lock_get_status(f, bfl, *((loff_t *)f->private), " ->");
        }
+       spin_unlock(&blocked_hash_lock);
 
        return 0;
 }
There's no reason we have to protect the blocked_hash and file_lock_list
with the same spinlock. With the tests I have, breaking it in two gives
a barely measurable performance benefit, but it seems reasonable to make
this locking as granular as possible.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
---
 Documentation/filesystems/Locking | 16 ++++++++--------
 fs/locks.c                        | 17 ++++++++++-------
 2 files changed, 18 insertions(+), 15 deletions(-)