Message ID | 20200129180324.24099-1-dave@stgolabs.net (mailing list archive) |
---|---|
State | New, archived |
Series | btrfs: optimize barrier usage for Rmw atomics |

On 29.01.20 20:03, Davidlohr Bueso wrote:
> Use smp_mb__after_atomic() instead of smp_mb() and avoid the
> unnecessary barrier for non LL/SC architectures, such as x86.
>
> Signed-off-by: Davidlohr Bueso <dbueso@suse.de>

While on this topic, I've been sitting on the following local patch for about a year, care to review the barriers:

On Wed, Jan 29, 2020 at 10:03:24AM -0800, Davidlohr Bueso wrote:
> Use smp_mb__after_atomic() instead of smp_mb() and avoid the
> unnecessary barrier for non LL/SC architectures, such as x86.

So that's conflicting advice from what we got when discussing which
barriers to use in 6282675e6708ec78518cc0e9ad1f1f73d7c5c53d and the
memory is still fresh. My first idea was to take the
smp_mb__after_atomic and __before_atomic variants and after discussion
with various people the plain smp_wmb/smp_rmb were suggested and used in
the end.

I can dig up the email threads and excerpts from irc conversations, maybe
Nik has them at hand too. We do want to get rid of all unnecessary and
uncommented barriers in btrfs code, so I appreciate your patch.

> Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
> ---
>  fs/btrfs/btrfs_inode.h | 2 +-
>  fs/btrfs/file.c        | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/fs/btrfs/btrfs_inode.h b/fs/btrfs/btrfs_inode.h
> index 4e12a477d32e..54e0d2ae22cc 100644
> --- a/fs/btrfs/btrfs_inode.h
> +++ b/fs/btrfs/btrfs_inode.h
> @@ -325,7 +325,7 @@ struct btrfs_dio_private {
>  static inline void btrfs_inode_block_unlocked_dio(struct btrfs_inode *inode)
>  {
>  	set_bit(BTRFS_INODE_READDIO_NEED_LOCK, &inode->runtime_flags);
> -	smp_mb();
> +	smp_mb__after_atomic();

In this case I think we should use the smp_wmb/smp_rmb pattern rather
than the full barrier.

>  }
>
>  static inline void btrfs_inode_resume_unlocked_dio(struct btrfs_inode *inode)
> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> index a16da274c9aa..ea79ab068079 100644
> --- a/fs/btrfs/file.c
> +++ b/fs/btrfs/file.c
> @@ -2143,7 +2143,7 @@ int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
>  	}
>  	atomic_inc(&root->log_batch);
>
> -	smp_mb();
> +	smp_mb__after_atomic();

That's the problem with uncommented barriers: it's not clear what they
are related to. In this case it's not the atomic_inc above that would
justify __after_atomic.

The patch that added it is years old so any change to that barrier would
require deeper analysis.

>  	if (btrfs_inode_in_log(BTRFS_I(inode), fs_info->generation) ||
>  	    BTRFS_I(inode)->last_trans <= fs_info->last_trans_committed) {
>  		/*

On Wed, 29 Jan 2020, David Sterba wrote:

>On Wed, Jan 29, 2020 at 10:03:24AM -0800, Davidlohr Bueso wrote:
>> Use smp_mb__after_atomic() instead of smp_mb() and avoid the
>> unnecessary barrier for non LL/SC architectures, such as x86.
>
>So that's conflicting advice from what we got when discussing which
>barriers to use in 6282675e6708ec78518cc0e9ad1f1f73d7c5c53d and the
>memory is still fresh. My first idea was to take the
>smp_mb__after_atomic and __before_atomic variants and after discussion
>with various people the plain smp_wmb/smp_rmb were suggested and used in
>the end.

So the patch you mention deals with test_bit(), which is out of the scope
of smp_mb__{before,after}_atomic() as it's not a RMW operation. atomic_inc()
and set_bit() are, however, meant to use these barriers.

>
>I can dig up the email threads and excerpts from irc conversations, maybe
>Nik has them at hand too. We do want to get rid of all unnecessary and
>uncommented barriers in btrfs code, so I appreciate your patch.

Yeah, I struggled with the amount of undocumented barriers, and decided
not to go down that rabbit hole. This patch is only an equivalent of
what is currently there. When possible, getting rid of barriers is of
course better.

Thanks,
Davidlohr
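
The RMW versus non-RMW distinction drawn in the reply above can be sketched in userspace with C11 atomics. This is only a rough analogy under stated assumptions, not kernel code: `runtime_flags`, `NEED_LOCK_MASK`, `block_unlocked_dio()`, `need_lock()` and `flag_demo()` are hypothetical stand-ins, and a C11 `seq_cst` fence is used as an approximation of `smp_mb__after_atomic()`. The point is that the fence pairs with the read-modify-write (`set_bit()`, `atomic_inc()`), while the `test_bit()` analog is a plain load with nothing for `__after_atomic` to attach to.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for BTRFS_INODE_READDIO_NEED_LOCK. */
#define NEED_LOCK_MASK (1UL << 0)

/* Hypothetical stand-in for inode->runtime_flags. */
static atomic_ulong runtime_flags;

/*
 * Rough analog of set_bit() followed by smp_mb__after_atomic():
 * an unordered (relaxed) read-modify-write, then a full fence that
 * orders it against later accesses.
 */
static void block_unlocked_dio(void)
{
	atomic_fetch_or_explicit(&runtime_flags, NEED_LOCK_MASK,
				 memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
}

/*
 * Rough analog of test_bit(): a plain load, not a read-modify-write,
 * so the __before/__after_atomic helpers do not apply to it.
 */
static bool need_lock(void)
{
	unsigned long v = atomic_load_explicit(&runtime_flags,
					       memory_order_relaxed);
	return (v & NEED_LOCK_MASK) != 0;
}

/* Single-threaded sanity check of the flag logic only (not of ordering). */
static bool flag_demo(void)
{
	atomic_store_explicit(&runtime_flags, 0, memory_order_relaxed);
	assert(!need_lock());
	block_unlocked_dio();
	return need_lock();
}
```

The sketch checks only that the bit is set and observed; it says nothing about which barrier the reader side needs, which is the open question in this thread. On x86 the locked RMW instruction already acts as a full barrier, which is why the kernel's `smp_mb__after_atomic()` costs nothing extra there.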

On 2020/1/30 3:25 AM, Davidlohr Bueso wrote:
> On Wed, 29 Jan 2020, David Sterba wrote:
>
>> On Wed, Jan 29, 2020 at 10:03:24AM -0800, Davidlohr Bueso wrote:
>>> Use smp_mb__after_atomic() instead of smp_mb() and avoid the
>>> unnecessary barrier for non LL/SC architectures, such as x86.
>>
>> So that's conflicting advice from what we got when discussing which
>> barriers to use in 6282675e6708ec78518cc0e9ad1f1f73d7c5c53d and the
>> memory is still fresh. My first idea was to take the
>> smp_mb__after_atomic and __before_atomic variants and after discussion
>> with various people the plain smp_wmb/smp_rmb were suggested and used in
>> the end.
>
> So the patch you mention deals with test_bit(), which is out of the scope
> of smp_mb__{before,after}_atomic() as it's not a RMW operation.
> atomic_inc()
> and set_bit() are, however, meant to use these barriers.

Exactly!
I'm still not convinced to use a full barrier for test_bit() and I see no
reason to use any barrier for test_bit().
All mb should only be needed between two or more memory accesses, thus mb
should sit between set/clear_bit() and other operations, not around
test_bit().

>
>>
>> I can dig up the email threads and excerpts from irc conversations, maybe
>> Nik has them at hand too. We do want to get rid of all unnecessary and
>> uncommented barriers in btrfs code, so I appreciate your patch.
>
> Yeah, I struggled with the amount of undocumented barriers, and decided
> not to go down that rabbit hole. This patch is only an equivalent of
> what is currently there. When possible, getting rid of barriers is of
> course better.

BTW, is there any convincing method to do proper mb examination?

I really found it hard to convince others or even myself when mb is
involved.

Thanks,
Qu

>
> Thanks,
> Davidlohr

On 30.01.20 1:55, Qu Wenruo wrote:
>
>
> On 2020/1/30 3:25 AM, Davidlohr Bueso wrote:
>> On Wed, 29 Jan 2020, David Sterba wrote:
>>
>>> On Wed, Jan 29, 2020 at 10:03:24AM -0800, Davidlohr Bueso wrote:
>>>> Use smp_mb__after_atomic() instead of smp_mb() and avoid the
>>>> unnecessary barrier for non LL/SC architectures, such as x86.
>>>
>>> So that's conflicting advice from what we got when discussing which
>>> barriers to use in 6282675e6708ec78518cc0e9ad1f1f73d7c5c53d and the
>>> memory is still fresh. My first idea was to take the
>>> smp_mb__after_atomic and __before_atomic variants and after discussion
>>> with various people the plain smp_wmb/smp_rmb were suggested and used in
>>> the end.
>>
>> So the patch you mention deals with test_bit(), which is out of the scope
>> of smp_mb__{before,after}_atomic() as it's not a RMW operation.
>> atomic_inc()
>> and set_bit() are, however, meant to use these barriers.
>
> Exactly!
> I'm still not convinced to use a full barrier for test_bit() and I see no
> reason to use any barrier for test_bit().
> All mb should only be needed between two or more memory accesses, thus mb
> should sit between set/clear_bit() and other operations, not around
> test_bit().
>
>>
>>>
>>> I can dig up the email threads and excerpts from irc conversations, maybe
>>> Nik has them at hand too. We do want to get rid of all unnecessary and
>>> uncommented barriers in btrfs code, so I appreciate your patch.
>>
>> Yeah, I struggled with the amount of undocumented barriers, and decided
>> not to go down that rabbit hole. This patch is only an equivalent of
>> what is currently there. When possible, getting rid of barriers is of
>> course better.
>
> BTW, is there any convincing method to do proper mb examination?
>
> I really found it hard to convince others or even myself when mb is
> involved.

Yes there is: the LKMM, you can write a litmus test. Check out
tools/memory-model

>
> Thanks,
> Qu
>
>>
>> Thanks,
>> Davidlohr
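
As an illustration of the litmus-test suggestion above, here is a minimal sketch of a store-buffering test in the C-flavoured litmus format that herd7 reads with the kernel's tools/memory-model configuration. It is an assumption-laden model, not the btrfs code: the test name, the second variable `y` and the whole of `P1` are hypothetical, and only the `atomic_inc()` plus `smp_mb__after_atomic()` pairing in `P0` mirrors the hunk in fs/btrfs/file.c. It is a litmus file, not compilable C.

```
C SB+atomic_inc-mb__after_atomic+mb

(*
 * Hypothetical model: P0 mirrors atomic_inc(&root->log_batch) followed
 * by smp_mb__after_atomic() and a later read; P1 is an invented peer
 * that writes y, issues smp_mb(), then reads the counter.
 *)

{}

P0(atomic_t *log_batch, int *y)
{
	int r0;

	atomic_inc(log_batch);
	smp_mb__after_atomic();
	r0 = READ_ONCE(*y);
}

P1(atomic_t *log_batch, int *y)
{
	int r1;

	WRITE_ONCE(*y, 1);
	smp_mb();
	r1 = atomic_read(log_batch);
}

exists (0:r0=0 /\ 1:r1=0)
```

Assuming herd7 is installed, running `herd7 -conf linux-kernel.cfg` on this file from within tools/memory-model reports whether the `exists` clause (both sides missing the other's update) can be satisfied. Swapping the barrier in `P0` for a weaker one, or removing it, and rerunning is one way to argue about a barrier with something more convincing than intuition.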

diff --git a/fs/btrfs/btrfs_inode.h b/fs/btrfs/btrfs_inode.h
index 4e12a477d32e..54e0d2ae22cc 100644
--- a/fs/btrfs/btrfs_inode.h
+++ b/fs/btrfs/btrfs_inode.h
@@ -325,7 +325,7 @@ struct btrfs_dio_private {
 static inline void btrfs_inode_block_unlocked_dio(struct btrfs_inode *inode)
 {
 	set_bit(BTRFS_INODE_READDIO_NEED_LOCK, &inode->runtime_flags);
-	smp_mb();
+	smp_mb__after_atomic();
 }
 
 static inline void btrfs_inode_resume_unlocked_dio(struct btrfs_inode *inode)
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index a16da274c9aa..ea79ab068079 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -2143,7 +2143,7 @@ int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync)
 	}
 	atomic_inc(&root->log_batch);
 
-	smp_mb();
+	smp_mb__after_atomic();
 	if (btrfs_inode_in_log(BTRFS_I(inode), fs_info->generation) ||
 	    BTRFS_I(inode)->last_trans <= fs_info->last_trans_committed) {
 		/*

Use smp_mb__after_atomic() instead of smp_mb() and avoid the
unnecessary barrier for non LL/SC architectures, such as x86.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
---
 fs/btrfs/btrfs_inode.h | 2 +-
 fs/btrfs/file.c        | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)