Message ID: 20190814202027.18735-4-daniel.vetter@ffwll.ch
State:      New, archived
Series:     hmm & mmu_notifier debug/lockdep annotations
On Wed, Aug 14, 2019 at 10:20:25PM +0200, Daniel Vetter wrote:
> We need to make sure implementations don't cheat and don't have a
> possible schedule/blocking point deeply burried where review can't
> catch it.
>
> I'm not sure whether this is the best way to make sure all the
> might_sleep() callsites trigger, and it's a bit ugly in the code flow.
> But it gets the job done.
>
> Inspired by an i915 patch series which did exactly that, because the
> rules haven't been entirely clear to us.

I thought lockdep already was able to detect:

spin_lock()
might_sleep();
spin_unlock()

Am I mistaken? If yes, couldn't this patch just inject a dummy lockdep
spinlock?

Jason
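For reference, the "dummy spinlock" idea being suggested would look roughly
like the sketch below, dropped into the same spot the posted patch touches.
This is illustrative only: the lock name is invented and this is not what the
patch actually does. Taking a spinlock raises preempt_count (the sleep
debugging option selects CONFIG_PREEMPT_COUNT), so any might_sleep() reached
while it is held would trip the sleeping-in-atomic check:

    /* hypothetical debug-only lock, name invented for illustration */
    static DEFINE_SPINLOCK(nonblock_debug_lock);

    if (!mmu_notifier_range_blockable(range))
            spin_lock(&nonblock_debug_lock);
    _ret = mn->ops->invalidate_range_start(mn, range);
    if (!mmu_notifier_range_blockable(range))
            spin_unlock(&nonblock_debug_lock);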
On Wed, Aug 14, 2019 at 09:00:29PM -0300, Jason Gunthorpe wrote:
> On Wed, Aug 14, 2019 at 10:20:25PM +0200, Daniel Vetter wrote:
> > We need to make sure implementations don't cheat and don't have a
> > possible schedule/blocking point deeply burried where review can't
> > catch it.
> >
> > I'm not sure whether this is the best way to make sure all the
> > might_sleep() callsites trigger, and it's a bit ugly in the code flow.
> > But it gets the job done.
> >
> > Inspired by an i915 patch series which did exactly that, because the
> > rules haven't been entirely clear to us.
>
> I thought lockdep already was able to detect:
>
> spin_lock()
> might_sleep();
> spin_unlock()
>
> Am I mistaken? If yes, couldn't this patch just inject a dummy lockdep
> spinlock?

Hm ... assuming I didn't get lost in the maze I think might_sleep (well
___might_sleep) doesn't do any lockdep checking at all. And we want
might_sleep, since that catches a lot more than lockdep. Maybe you mixed
it up with the hard/softirq context stuff that lockdep tracks and
complains about if you get it wrong?
-Daniel
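To make the distinction concrete: with CONFIG_DEBUG_ATOMIC_SLEEP enabled,
might_sleep() expands roughly as below (paraphrased from memory of the
include/linux/kernel.h of that era; exact form varies by kernel version):

    #ifdef CONFIG_DEBUG_ATOMIC_SLEEP
    # define might_sleep() \
            do { __might_sleep(__FILE__, __LINE__, 0); might_resched(); } while (0)
    #else
    # define might_sleep() do { might_resched(); } while (0)
    #endif

__might_sleep() lands in ___might_sleep() in kernel/sched/core.c, which, as
far as I can tell, gates the splat on preempt_count() and irqs_disabled()
state rather than on lockdep's held-lock tracking.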
On Sat, Aug 17, 2019 at 5:26 PM Jason Gunthorpe <jgg@ziepe.ca> wrote:
>
> On Thu, Aug 15, 2019 at 09:02:49AM +0200, Daniel Vetter wrote:
> > On Wed, Aug 14, 2019 at 09:00:29PM -0300, Jason Gunthorpe wrote:
> > > On Wed, Aug 14, 2019 at 10:20:25PM +0200, Daniel Vetter wrote:
> > > > We need to make sure implementations don't cheat and don't have a
> > > > possible schedule/blocking point deeply burried where review can't
> > > > catch it.
> > > >
> > > > I'm not sure whether this is the best way to make sure all the
> > > > might_sleep() callsites trigger, and it's a bit ugly in the code flow.
> > > > But it gets the job done.
> > > >
> > > > Inspired by an i915 patch series which did exactly that, because the
> > > > rules haven't been entirely clear to us.
> > >
> > > I thought lockdep already was able to detect:
> > >
> > > spin_lock()
> > > might_sleep();
> > > spin_unlock()
> > >
> > > Am I mistaken? If yes, couldn't this patch just inject a dummy lockdep
> > > spinlock?
> >
> > Hm ... assuming I didn't get lost in the maze I think might_sleep (well
> > ___might_sleep) doesn't do any lockdep checking at all. And we want
> > might_sleep, since that catches a lot more than lockdep.
>
> Don't know how it works, but it sure looks like it does:
>
> This:
>         spin_lock(&file->uobjects_lock);
>         down_read(&file->hw_destroy_rwsem);
>         up_read(&file->hw_destroy_rwsem);
>         spin_unlock(&file->uobjects_lock);
>
> Causes:
>
> [ 33.324729] BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:1444
> [ 33.325599] in_atomic(): 1, irqs_disabled(): 0, pid: 247, name: ibv_devinfo
> [ 33.326115] 3 locks held by ibv_devinfo/247:
> [ 33.326556]  #0: 000000009edf8379 (&uverbs_dev->disassociate_srcu){....}, at: ib_uverbs_open+0xff/0x5f0 [ib_uverbs]
> [ 33.327657]  #1: 000000005e0eddf1 (&uverbs_dev->lists_mutex){+.+.}, at: ib_uverbs_open+0x16c/0x5f0 [ib_uverbs]
> [ 33.328682]  #2: 00000000505f509e (&(&file->uobjects_lock)->rlock){+.+.}, at: ib_uverbs_open+0x31a/0x5f0 [ib_uverbs]
>
> And this:
>
>         spin_lock(&file->uobjects_lock);
>         might_sleep();
>         spin_unlock(&file->uobjects_lock);
>
> Causes:
>
> [ 16.867211] BUG: sleeping function called from invalid context at drivers/infiniband/core/uverbs_main.c:1095
> [ 16.867776] in_atomic(): 1, irqs_disabled(): 0, pid: 245, name: ibv_devinfo
> [ 16.868098] 3 locks held by ibv_devinfo/245:
> [ 16.868383]  #0: 000000004c5954ff (&uverbs_dev->disassociate_srcu){....}, at: ib_uverbs_open+0xf8/0x600 [ib_uverbs]
> [ 16.868938]  #1: 0000000020a6fae2 (&uverbs_dev->lists_mutex){+.+.}, at: ib_uverbs_open+0x16c/0x600 [ib_uverbs]
> [ 16.869568]  #2: 00000000036e6a97 (&(&file->uobjects_lock)->rlock){+.+.}, at: ib_uverbs_open+0x317/0x600 [ib_uverbs]
>
> I think this is done in some very expensive way, so it probably only
> works when lockdep is enabled..

This is the might_sleep debug infrastructure (both of them), not lockdep.
Disable CONFIG_PROVE_LOCKING and you should still get these.
-Daniel
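Both of the quoted examples splat the same way because down_read() begins
with a might_sleep() of its own, so each snippet reduces to "might_sleep()
with preemption disabled". A minimal self-contained illustration (hypothetical
test code, lock and rwsem names invented, not taken from the thread):

    static DEFINE_SPINLOCK(test_lock);
    static DECLARE_RWSEM(test_sem);

    spin_lock(&test_lock);          /* raises preempt_count */
    down_read(&test_sem);           /* down_read() calls might_sleep() -> splat */
    up_read(&test_sem);
    spin_unlock(&test_lock);

As far as I can tell, the "BUG: sleeping function called from invalid
context" line comes from CONFIG_DEBUG_ATOMIC_SLEEP, while the "3 locks held
by ..." listing is only printed when lockdep's held-lock tracking happens to
be built in, which is presumably what made it look like a lockdep report.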
diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
index 16f1cbc775d0..43a76d030164 100644
--- a/mm/mmu_notifier.c
+++ b/mm/mmu_notifier.c
@@ -174,7 +174,13 @@ int __mmu_notifier_invalidate_range_start(struct mmu_notifier_range *range)
 	id = srcu_read_lock(&srcu);
 	hlist_for_each_entry_rcu(mn, &range->mm->mmu_notifier_mm->list, hlist) {
 		if (mn->ops->invalidate_range_start) {
-			int _ret = mn->ops->invalidate_range_start(mn, range);
+			int _ret;
+
+			if (!mmu_notifier_range_blockable(range))
+				non_block_start();
+			_ret = mn->ops->invalidate_range_start(mn, range);
+			if (!mmu_notifier_range_blockable(range))
+				non_block_end();
 			if (_ret) {
 				pr_info("%pS callback failed with %d in %sblockable context.\n",
 					mn->ops->invalidate_range_start, _ret,
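For context, non_block_start()/non_block_end() come from an earlier patch in
this same series; as I recall they are defined roughly as below (sketch from
memory, see the series for the exact form). They bump a per-task counter that
___might_sleep() also checks, so any might_sleep() reached inside the
bracketed region produces the usual sleeping-in-invalid-context splat even
though no lock is actually held:

    /* rough sketch of the companion kernel.h patch in this series */
    # define non_block_start() (current->non_block_count++)
    # define non_block_end()   (WARN_ON(current->non_block_count-- == 0))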