Message ID: 20180218130506.GW695913@devbig577.frc2.facebook.com (mailing list archive)
State: New, archived
On Sun, Feb 18, 2018 at 05:05:06AM -0800, Tejun Heo wrote:
> sbitmap is used to allocate tags. The free and alloc paths use a
> memory ordering scheme similar to the one used by waitqueue, where the
> waiter and waker synchronize around set_current_state().
>
> This doesn't seem sufficient for sbitmap given that a tag may get
> released and re-acquired without the allocator blocking at all. Once
> the bit for the tag is cleared, the tag may be reused immediately and
> due to the lack of memory ordering around bit clearing, its memory
> accesses may race against the ones from before the clearing.
>
> Given that the bits are the primary synchronization mechanism, they
> should be ordered memory-wise. This patch replaces waitqueue-style
> memory barriers with clear_bit_unlock() in sbitmap_clear_bit() and
> test_and_set_bit_lock() in __sbitmap_get_word().
>
> Signed-off-by: Tejun Heo <tj@kernel.org>
> ---
> Hello,
>
> Spotted this while verifying the timeout fix. I didn't check the
> whole code, so although unlikely it's possible that the removed mb's
> are needed from elsewhere, so the RFC. Only boot tested.
>
> Thanks.
>
>  include/linux/sbitmap.h |  3 ++-
>  lib/sbitmap.c           | 17 ++++++-----------
>  2 files changed, 8 insertions(+), 12 deletions(-)
>
> --- a/include/linux/sbitmap.h
> +++ b/include/linux/sbitmap.h
> @@ -297,7 +297,8 @@ static inline void sbitmap_set_bit(struc
>
>  static inline void sbitmap_clear_bit(struct sbitmap *sb, unsigned int bitnr)
>  {
> -	clear_bit(SB_NR_TO_BIT(sb, bitnr), __sbitmap_word(sb, bitnr));
> +	/* paired with test_and_set_bit_lock() in __sbitmap_get_word() */
> +	clear_bit_unlock(SB_NR_TO_BIT(sb, bitnr), __sbitmap_word(sb, bitnr));
>  }

I agree that we want this, but for a different reason than you described
in your changelog: a sbitmap can be used without a sbitmap_queue, so it
should provide the proper memory ordering. For the sbitmap_queue case,
the compiler/processor could also reorder something after the clear_bit()
and before the smp_mb__after_atomic(), which is also wrong.

>  static inline int sbitmap_test_bit(struct sbitmap *sb, unsigned int bitnr)
> --- a/lib/sbitmap.c
> +++ b/lib/sbitmap.c
> @@ -100,7 +100,8 @@ static int __sbitmap_get_word(unsigned l
>  			return -1;
>  		}
>
> -		if (!test_and_set_bit(nr, word))
> +		/* paired with clear_bit_unlock() in sbitmap_clear_bit() */
> +		if (!test_and_set_bit_lock(nr, word))
>  			break;

test_and_set_bit_lock() is an ACQUIRE operation which has weaker
guarantees than the full barrier implied by test_and_set_bit(), but
ACQUIRE is enough here, so I agree with this part, too.

>  		hint = nr + 1;
> @@ -432,14 +433,9 @@ static void sbq_wake_up(struct sbitmap_q
>  	int wait_cnt;
>
>  	/*
> -	 * Pairs with the memory barrier in set_current_state() to ensure the
> -	 * proper ordering of clear_bit()/waitqueue_active() in the waker and
> -	 * test_and_set_bit()/prepare_to_wait()/finish_wait() in the waiter. See
> -	 * the comment on waitqueue_active(). This is __after_atomic because we
> -	 * just did clear_bit() in the caller.
> +	 * Memory ordering is handled by sbitmap_clear_bit() and
> +	 * __sbitmap_get_word().
>  	 */
> -	smp_mb__after_atomic();
> -

This, I'm not convinced that we can get rid of. clear_bit_unlock() is a
RELEASE operation, not a full barrier, so the waitqueue_active() read can
be reordered before the clear_bit() store. Imagine we get this
interleaving:

waitqueue_active() -> false |
                            | /* bitmap is full, allocation fails */
                            | prepare_to_wait()
clear_bit_unlock()          |

We should've woken up the waiter, but we didn't.

>  	ws = sbq_wake_ptr(sbq);
>  	if (!ws)
>  		return;
> @@ -481,10 +477,9 @@ void sbitmap_queue_wake_all(struct sbitm
>  	int i, wake_index;
>
>  	/*
> -	 * Pairs with the memory barrier in set_current_state() like in
> -	 * sbq_wake_up().
> +	 * Memory ordering is handled by sbitmap_clear_bit() and
> +	 * __sbitmap_get_word().
>  	 */
> -	smp_mb();

Same idea here.

>  	wake_index = atomic_read(&sbq->wake_index);
>  	for (i = 0; i < SBQ_WAIT_QUEUES; i++) {
>  		struct sbq_wait_state *ws = &sbq->ws[wake_index];

So I think we want a patch for the test_and_set_bit_lock() and
clear_bit_unlock(), but the rest should stay as-is.
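The lost wakeup above is a store-load reordering: a RELEASE operation
orders only the accesses before it, so nothing stops the
waitqueue_active() read from being satisfied before the bit clear is
visible. A minimal userspace sketch in C11 atomics, with word, waiters,
and free_tag_and_wake() as illustrative stand-ins rather than the kernel
API, shows where the full barrier earns its keep:

#include <stdatomic.h>

static atomic_ulong word;	/* stand-in for one sbitmap word */
static atomic_int waiters;	/* stand-in for the waitqueue_active() state */

static void free_tag_and_wake(unsigned int nr)
{
	/* like clear_bit_unlock(): RELEASE orders earlier accesses only */
	atomic_fetch_and_explicit(&word, ~(1UL << nr),
				  memory_order_release);

	/*
	 * Rough analogue of the smp_mb__after_atomic() under discussion.
	 * Without it, the load of waiters below may be reordered before
	 * the bit clear, reproducing the interleaving above.
	 */
	atomic_thread_fence(memory_order_seq_cst);

	if (atomic_load_explicit(&waiters, memory_order_relaxed)) {
		/* wake_up(ws) would go here */
	}
}

Store-load is also the one reordering that even x86-TSO permits, so the
missing barrier is not a weakly-ordered-architecture-only concern.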
Hello, Omar.

On Mon, Feb 26, 2018 at 02:14:44PM -0800, Omar Sandoval wrote:
> >  	wake_index = atomic_read(&sbq->wake_index);
> >  	for (i = 0; i < SBQ_WAIT_QUEUES; i++) {
> >  		struct sbq_wait_state *ws = &sbq->ws[wake_index];
>
> So I think we want a patch for the test_and_set_bit_lock() and
> clear_bit_unlock(), but the rest should stay as-is.

Yeah, that makes sense to me. Would you be interested in updating the
patch?

Thanks.
On Tue, Feb 27, 2018 at 10:14:04AM -0800, Tejun Heo wrote:
> Hello, Omar.
>
> On Mon, Feb 26, 2018 at 02:14:44PM -0800, Omar Sandoval wrote:
> > >  	wake_index = atomic_read(&sbq->wake_index);
> > >  	for (i = 0; i < SBQ_WAIT_QUEUES; i++) {
> > >  		struct sbq_wait_state *ws = &sbq->ws[wake_index];
> >
> > So I think we want a patch for the test_and_set_bit_lock() and
> > clear_bit_unlock(), but the rest should stay as-is.
>
> Yeah, that makes sense to me. Would you be interested in updating the
> patch?

Yup, I'll send it up.

Thanks, Tejun!
--- a/include/linux/sbitmap.h
+++ b/include/linux/sbitmap.h
@@ -297,7 +297,8 @@ static inline void sbitmap_set_bit(struc
 
 static inline void sbitmap_clear_bit(struct sbitmap *sb, unsigned int bitnr)
 {
-	clear_bit(SB_NR_TO_BIT(sb, bitnr), __sbitmap_word(sb, bitnr));
+	/* paired with test_and_set_bit_lock() in __sbitmap_get_word() */
+	clear_bit_unlock(SB_NR_TO_BIT(sb, bitnr), __sbitmap_word(sb, bitnr));
 }
 
 static inline int sbitmap_test_bit(struct sbitmap *sb, unsigned int bitnr)
--- a/lib/sbitmap.c
+++ b/lib/sbitmap.c
@@ -100,7 +100,8 @@ static int __sbitmap_get_word(unsigned l
 			return -1;
 		}
 
-		if (!test_and_set_bit(nr, word))
+		/* paired with clear_bit_unlock() in sbitmap_clear_bit() */
+		if (!test_and_set_bit_lock(nr, word))
 			break;
 
 		hint = nr + 1;
@@ -432,14 +433,9 @@ static void sbq_wake_up(struct sbitmap_q
 	int wait_cnt;
 
 	/*
-	 * Pairs with the memory barrier in set_current_state() to ensure the
-	 * proper ordering of clear_bit()/waitqueue_active() in the waker and
-	 * test_and_set_bit()/prepare_to_wait()/finish_wait() in the waiter. See
-	 * the comment on waitqueue_active(). This is __after_atomic because we
-	 * just did clear_bit() in the caller.
+	 * Memory ordering is handled by sbitmap_clear_bit() and
+	 * __sbitmap_get_word().
 	 */
-	smp_mb__after_atomic();
-
 	ws = sbq_wake_ptr(sbq);
 	if (!ws)
 		return;
@@ -481,10 +477,9 @@ void sbitmap_queue_wake_all(struct sbitm
 	int i, wake_index;
 
 	/*
-	 * Pairs with the memory barrier in set_current_state() like in
-	 * sbq_wake_up().
+	 * Memory ordering is handled by sbitmap_clear_bit() and
+	 * __sbitmap_get_word().
 	 */
-	smp_mb();
 
 	wake_index = atomic_read(&sbq->wake_index);
 	for (i = 0; i < SBQ_WAIT_QUEUES; i++) {
 		struct sbq_wait_state *ws = &sbq->ws[wake_index];
sbitmap is used to allocate tags. The free and alloc paths use a
memory ordering scheme similar to the one used by waitqueue, where the
waiter and waker synchronize around set_current_state().

This doesn't seem sufficient for sbitmap given that a tag may get
released and re-acquired without the allocator blocking at all. Once
the bit for the tag is cleared, the tag may be reused immediately and
due to the lack of memory ordering around bit clearing, its memory
accesses may race against the ones from before the clearing.

Given that the bits are the primary synchronization mechanism, they
should be ordered memory-wise. This patch replaces waitqueue-style
memory barriers with clear_bit_unlock() in sbitmap_clear_bit() and
test_and_set_bit_lock() in __sbitmap_get_word().

Signed-off-by: Tejun Heo <tj@kernel.org>
---
Hello,

Spotted this while verifying the timeout fix. I didn't check the
whole code, so although unlikely it's possible that the removed mb's
are needed from elsewhere, so the RFC. Only boot tested.

Thanks.

 include/linux/sbitmap.h |  3 ++-
 lib/sbitmap.c           | 17 ++++++-----------
 2 files changed, 8 insertions(+), 12 deletions(-)
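The reuse race the changelog describes can be made concrete with a small
userspace C11 sketch. Here get_tag(), put_tag(), and payload are
hypothetical stand-ins for a caller's per-tag data, chosen only to
mirror what test_and_set_bit_lock() and clear_bit_unlock() provide; this
is an illustration of the pairing, not the kernel implementation:

#include <stdatomic.h>
#include <stdbool.h>

static atomic_ulong word;	/* one word of the bitmap */
static int payload[64];		/* hypothetical per-tag data, plain accesses */

/*
 * Like __sbitmap_get_word() with test_and_set_bit_lock(): ACQUIRE on
 * success, so the new owner's accesses to payload[nr] cannot be
 * reordered before the bit is observed clear.
 */
static bool get_tag(unsigned int nr)
{
	unsigned long mask = 1UL << nr;

	return !(atomic_fetch_or_explicit(&word, mask,
					  memory_order_acquire) & mask);
}

/*
 * Like sbitmap_clear_bit() with clear_bit_unlock(): RELEASE, so the old
 * owner's last store to payload[nr] is visible before the tag can be
 * re-acquired.
 */
static void put_tag(unsigned int nr)
{
	payload[nr] = 0;	/* old owner's final access */
	atomic_fetch_and_explicit(&word, ~(1UL << nr),
				  memory_order_release);
}

An owner that wins get_tag(nr) may use payload[nr] freely until it calls
put_tag(nr); the release/acquire pair guarantees the next owner sees all
of the previous owner's stores, which is exactly the ordering the
unadorned clear_bit() did not provide.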