Message ID | 20210113111707.756662-1-nborisov@suse.com (mailing list archive) |
---|---|
Headers | show |
Series | Remove mrlock | expand |
Pavel has looked into this before and got stuck on the allocator workqueue offloads: [PATCH v13 0/4] xfs: Remove wrappers for some semaphores On Wed, Jan 13, 2021 at 01:17:03PM +0200, Nikolay Borisov wrote: > This series removes mrlock_t and directly replaces i_lock and i_mmaplock with > rw_semaphore in xfs_inode. My end game with this is to eventually lift i_mmaplock > in VFS and use it from there. The necessity for the latter came up since BTRFS > is also about to get its own private version of i_mmaplock for the same use case. > This will mean that all 3 major filesystems on linux (ext4/xfs/btrfs) wil share > the same lock. Christoph naturally suggested for the lock to be lifted to VFS. > > Before proceeding with this work I'd like to get the opinion of XFS developers > whether doing that is acceptable for them. I've heard that Dave wants to eventually > convert the mmapsem to a range lock for XFS and implement a callback mechanism > for VFS to call into every filesystem... > > I've only compile tested this and also the way the rwsem is checked for write > is admittedly a bit hackish but it can easily be changed to utilize lockdep. > I'm aware of https://lore.kernel.org/linux-xfs/20201102194135.174806-1-preichl@redhat.com/ > but frankly that series went too far up to rev 10 which is a bit mind boggling... > > Nikolay Borisov (3): > xfs: Add is_rwsem_write_locked function > xfs: Convert i_lock/i_mmaplock to rw_semaphore > xfs: Remove mrlock > > fs/xfs/mrlock.h | 78 ---------------------------------------- > fs/xfs/xfs_inode.c | 48 ++++++++++++++----------- > fs/xfs/xfs_inode.h | 6 ++-- > fs/xfs/xfs_linux.h | 1 - > fs/xfs/xfs_qm_syscalls.c | 2 +- > fs/xfs/xfs_super.c | 7 ++-- > 6 files changed, 34 insertions(+), 108 deletions(-) > delete mode 100644 fs/xfs/mrlock.h > > -- > 2.25.1 > ---end quoted text---
On 13.01.21 г. 13:27 ч., Christoph Hellwig wrote: > Pavel has looked into this before and got stuck on the allocator > workqueue offloads: > > [PATCH v13 0/4] xfs: Remove wrappers for some semaphores I haven't looked into his series but I fail to see how lifting rwsemaphore out of the nested structure can change the behavior ? It just removes a level of indirection. My patches are semantically identical to the original code. <snip>
On Wed, Jan 13, 2021 at 01:41:09PM +0200, Nikolay Borisov wrote: > > > On 13.01.21 ??. 13:27 ??., Christoph Hellwig wrote: > > Pavel has looked into this before and got stuck on the allocator > > workqueue offloads: > > > > [PATCH v13 0/4] xfs: Remove wrappers for some semaphores > > I haven't looked into his series but I fail to see how lifting > rwsemaphore out of the nested structure can change the behavior ? It > just removes a level of indirection. My patches are semantically > identical to the original code. mrlocks have the mr_writer field that annotate that the is a writer locking the lock. The XFS asserts use it to assert that the lock that the current thread holds it for exclusive protection, which isn't actually what the field says, and this breaks when XFS uses synchronous execution of work_struct as basically an extension of the kernel stack.
On 13.01.21 г. 14:09 ч., Christoph Hellwig wrote: > On Wed, Jan 13, 2021 at 01:41:09PM +0200, Nikolay Borisov wrote: >> >> >> On 13.01.21 ??. 13:27 ??., Christoph Hellwig wrote: >>> Pavel has looked into this before and got stuck on the allocator >>> workqueue offloads: >>> >>> [PATCH v13 0/4] xfs: Remove wrappers for some semaphores >> >> I haven't looked into his series but I fail to see how lifting >> rwsemaphore out of the nested structure can change the behavior ? It >> just removes a level of indirection. My patches are semantically >> identical to the original code. > > mrlocks have the mr_writer field that annotate that the is a writer > locking the lock. The XFS asserts use it to assert that the lock that > the current thread holds it for exclusive protection, which isn't > actually what the field says, and this breaks when XFS uses synchronous > execution of work_struct as basically an extension of the kernel stack. I'm still failing to see what's the problem of checking the last bit of the rwsem ->count field. It is set when the sem is held for writing (identical to what mr_write does). As I mention in the cover letter this might be considered a bit hacky because it exposes an internal detail of the rwsem i.e that the bit of interest is bit 0. But I believe the same can be achieved using lockdep_is_held_type(xx, 0/1) and making XFS's debug routines depend on lockdep being on. >
On Wed, Jan 13, 2021 at 02:17:29PM +0200, Nikolay Borisov wrote: > > > On 13.01.21 г. 14:09 ч., Christoph Hellwig wrote: > > On Wed, Jan 13, 2021 at 01:41:09PM +0200, Nikolay Borisov wrote: > >> > >> > >> On 13.01.21 ??. 13:27 ??., Christoph Hellwig wrote: > >>> Pavel has looked into this before and got stuck on the allocator > >>> workqueue offloads: > >>> > >>> [PATCH v13 0/4] xfs: Remove wrappers for some semaphores > >> > >> I haven't looked into his series but I fail to see how lifting > >> rwsemaphore out of the nested structure can change the behavior ? It > >> just removes a level of indirection. My patches are semantically > >> identical to the original code. > > > > mrlocks have the mr_writer field that annotate that the is a writer > > locking the lock. The XFS asserts use it to assert that the lock that > > the current thread holds it for exclusive protection, which isn't > > actually what the field says, and this breaks when XFS uses synchronous > > execution of work_struct as basically an extension of the kernel stack. > > I'm still failing to see what's the problem of checking the last bit of > the rwsem ->count field. It is set when the sem is held for writing > (identical to what mr_write does). As I mention in the cover letter this > might be considered a bit hacky because it exposes an internal detail of > the rwsem i.e that the bit of interest is bit 0. I don't want to tear into the implementation details of rwsems if I can avoid it. Just because we all have one big happy address space doesn't mean shortcuts won't hose everyone. > But I believe the same > can be achieved using lockdep_is_held_type(xx, 0/1) and making XFS's > debug routines depend on lockdep being on. Pavel Reichl tried that two months ago in: https://lore.kernel.org/linux-xfs/20201102194135.174806-2-preichl@redhat.com/ which resulted in fstests regressions: https://lore.kernel.org/linux-xfs/20201104005345.GC7115@magnolia/ TLDR: the ILOCK semaphore is data-centric, but lockdep's debugging chains are task-centric, which causes incorrect lock validation reports. The solutions as I see them are: (a) figure out if we really still need to defer bmbt splits to workers to avoid overflowing the kernel stack; or (b) making it possible to transfer rwsem ownership to shut up lockdep; or (c) fix the is_held predicate to ignore ownership. --D > > >