Message ID | 20250123174124.24554-3-nilay@linux.ibm.com |
---|---|
State | New |
Series | block: remove q->sysfs_dir_lock and fix race updating nr_hw_queue |
On Thu, Jan 23, 2025 at 11:10:24PM +0530, Nilay Shroff wrote:
>
> +	mutex_lock(&q->tag_set->tag_list_lock);
> +
>  	queue_for_each_hw_ctx(q, hctx, i) {
>  		ret = blk_mq_register_hctx(hctx);
> -		if (ret)
> +		if (ret) {
> +			mutex_unlock(&q->tag_set->tag_list_lock);
>  			goto unreg;
> +		}
>  	}
>
> +	mutex_unlock(&q->tag_set->tag_list_lock);
>
>  out:
>  	return ret;
>
>  unreg:
> +	mutex_lock(&q->tag_set->tag_list_lock);
> +

No real need for an unlock/lock cycle here, I think.  Also, as something
that is really just a nitpick, I love to keep the locks for the critical
sections close to the code they're protecting, e.g. format this as:

	if (ret < 0)
		return ret;

	mutex_lock(&q->tag_set->tag_list_lock);
	queue_for_each_hw_ctx(q, hctx, i) {
		ret = blk_mq_register_hctx(hctx);
		if (ret)
			goto out_unregister;
	}
	mutex_unlock(&q->tag_set->tag_list_lock);
	return 0;

out_unregister:
	queue_for_each_hw_ctx(q, hctx, j) {
		if (j < i)
			blk_mq_unregister_hctx(hctx);
	}
	mutex_unlock(&q->tag_set->tag_list_lock);

	...

(and similar for blk_mq_sysfs_unregister).

I assume you did run this through blktests and xfstests with lockdep
enabled to catch whether we created some new lock ordering problems?
I can't think of any right now, but it's good to validate that.
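[For reference, a sketch of what the unregister side could look like with the
same lock placement suggested above; this simply mirrors the
blk_mq_sysfs_unregister() hunk in the patch below, with the mutex kept tight
around the loop it protects, and is not the final submitted code.]

void blk_mq_sysfs_unregister(struct gendisk *disk)
{
	struct request_queue *q = disk->queue;
	struct blk_mq_hw_ctx *hctx;
	unsigned long i;

	/* Serialize against __blk_mq_update_nr_hw_queues() reallocating hctxs. */
	mutex_lock(&q->tag_set->tag_list_lock);
	queue_for_each_hw_ctx(q, hctx, i)
		blk_mq_unregister_hctx(hctx);
	mutex_unlock(&q->tag_set->tag_list_lock);

	kobject_uevent(q->mq_kobj, KOBJ_REMOVE);
	kobject_del(q->mq_kobj);
}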
On 1/28/25 11:19 AM, Christoph Hellwig wrote:
> On Thu, Jan 23, 2025 at 11:10:24PM +0530, Nilay Shroff wrote:
>>
>> +	mutex_lock(&q->tag_set->tag_list_lock);
>> +
>>  	queue_for_each_hw_ctx(q, hctx, i) {
>>  		ret = blk_mq_register_hctx(hctx);
>> -		if (ret)
>> +		if (ret) {
>> +			mutex_unlock(&q->tag_set->tag_list_lock);
>>  			goto unreg;
>> +		}
>>  	}
>>
>> +	mutex_unlock(&q->tag_set->tag_list_lock);
>>
>>  out:
>>  	return ret;
>>
>>  unreg:
>> +	mutex_lock(&q->tag_set->tag_list_lock);
>> +
>
> No real need for an unlock/lock cycle here, I think.  Also, as something
> that is really just a nitpick, I love to keep the locks for the critical
> sections close to the code they're protecting, e.g. format this as:
>
> 	if (ret < 0)
> 		return ret;
>
> 	mutex_lock(&q->tag_set->tag_list_lock);
> 	queue_for_each_hw_ctx(q, hctx, i) {
> 		ret = blk_mq_register_hctx(hctx);
> 		if (ret)
> 			goto out_unregister;
> 	}
> 	mutex_unlock(&q->tag_set->tag_list_lock);
> 	return 0;
>
> out_unregister:
> 	queue_for_each_hw_ctx(q, hctx, j) {
> 		if (j < i)
> 			blk_mq_unregister_hctx(hctx);
> 	}
> 	mutex_unlock(&q->tag_set->tag_list_lock);
>
> 	...
>
> (and similar for blk_mq_sysfs_unregister).

Yes, looks good. I will update the code and send the next patch.

> I assume you did run this through blktests and xfstests with lockdep
> enabled to catch whether we created some new lock ordering problems?
> I can't think of any right now, but it's good to validate that.
>
Yeah, I ran blktests with lockdep enabled before submitting the patch to
ensure that lockdep doesn't generate any warning with these changes.
However, I didn't run xfstests. Anyway, I will now run both blktests and
xfstests before submitting the next patch.

Thanks,
--Nilay
diff --git a/block/blk-mq-sysfs.c b/block/blk-mq-sysfs.c
index 6113328abd70..2b9df2b73bf6 100644
--- a/block/blk-mq-sysfs.c
+++ b/block/blk-mq-sysfs.c
@@ -229,22 +229,31 @@ int blk_mq_sysfs_register(struct gendisk *disk)
 
 	kobject_uevent(q->mq_kobj, KOBJ_ADD);
 
+	mutex_lock(&q->tag_set->tag_list_lock);
+
 	queue_for_each_hw_ctx(q, hctx, i) {
 		ret = blk_mq_register_hctx(hctx);
-		if (ret)
+		if (ret) {
+			mutex_unlock(&q->tag_set->tag_list_lock);
 			goto unreg;
+		}
 	}
 
+	mutex_unlock(&q->tag_set->tag_list_lock);
 
 out:
 	return ret;
 
 unreg:
+	mutex_lock(&q->tag_set->tag_list_lock);
+
 	queue_for_each_hw_ctx(q, hctx, j) {
 		if (j < i)
 			blk_mq_unregister_hctx(hctx);
 	}
+	mutex_unlock(&q->tag_set->tag_list_lock);
+
 	kobject_uevent(q->mq_kobj, KOBJ_REMOVE);
 	kobject_del(q->mq_kobj);
 	return ret;
@@ -256,10 +265,13 @@ void blk_mq_sysfs_unregister(struct gendisk *disk)
 	struct blk_mq_hw_ctx *hctx;
 	unsigned long i;
 
+	mutex_lock(&q->tag_set->tag_list_lock);
 	queue_for_each_hw_ctx(q, hctx, i)
 		blk_mq_unregister_hctx(hctx);
 
+	mutex_unlock(&q->tag_set->tag_list_lock);
+
 	kobject_uevent(q->mq_kobj, KOBJ_REMOVE);
 	kobject_del(q->mq_kobj);
 }
The nr_hw_queue update could potentially race with disk addition/removal
while registering/unregistering hctx sysfs files. The
__blk_mq_update_nr_hw_queues() runs with the tag_set's tag_list_lock held,
so to avoid it racing with disk addition/removal we should acquire
q->tag_set->tag_list_lock while registering/unregistering hctx sysfs files.

With this patch, blk_mq_sysfs_register() (called during disk addition) and
blk_mq_sysfs_unregister() (called during disk removal) now run with
q->tag_set->tag_list_lock held so that they avoid racing with
__blk_mq_update_nr_hw_queues().

Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
---
 block/blk-mq-sysfs.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)
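[For context, a rough sketch of the nr_hw_queues update path that the commit
message refers to: it already serializes on the same tag_list_lock, which is
why taking that lock in the sysfs register/unregister paths closes the race.
The exact upstream body may differ slightly; this is only an illustration.]

/*
 * Sketch of the update path: tag_list_lock is held across
 * __blk_mq_update_nr_hw_queues(), which may free and reallocate the
 * queue's hardware contexts.  Holding the same lock while hctx sysfs
 * files are registered/unregistered keeps the two paths from racing.
 */
void blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set, int nr_hw_queues)
{
	mutex_lock(&set->tag_list_lock);
	__blk_mq_update_nr_hw_queues(set, nr_hw_queues);
	mutex_unlock(&set->tag_list_lock);
}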