Message ID | 20240509125716.1268016-1-leitao@debian.org (mailing list archive) |
---|---|
State | New, archived |
Series | fuse: annotate potential data-race in num_background |
On Thu, 9 May 2024 at 14:57, Breno Leitao <leitao@debian.org> wrote:

> Annotated the reader with READ_ONCE() and the writer with WRITE_ONCE()
> to avoid such complaint from KCSAN.

I'm not sure the write side part is really needed, since the lock is
properly protecting against concurrent readers/writers within the
locked region.

Does KCSAN still complain if you just add the READ_ONCE() to fuse_readahead()?

Thanks,
Miklos

Hello Miklos,

On Fri, May 10, 2024 at 11:21:19AM +0200, Miklos Szeredi wrote:
> On Thu, 9 May 2024 at 14:57, Breno Leitao <leitao@debian.org> wrote:
>
> > Annotated the reader with READ_ONCE() and the writer with WRITE_ONCE()
> > to avoid such complaint from KCSAN.
>
> I'm not sure the write side part is really needed, since the lock is
> properly protecting against concurrent readers/writers within the
> locked region.

I understand that num_background is read from an unlocked region
(fuse_readahead()).

> Does KCSAN still complain if you just add the READ_ONCE() to fuse_readahead()?

I haven't checked, but the documentation says that both parts need to
be marked. Here is an example very similar to ours, from
tools/memory-model/Documentation/access-marking.txt:

	Lock-Protected Writes With Lockless Reads
	-----------------------------------------

	For another example, suppose a shared variable "foo" is updated only
	while holding a spinlock, but is read locklessly. The code might look
	as follows:

		int foo;
		DEFINE_SPINLOCK(foo_lock);

		void update_foo(int newval)
		{
			spin_lock(&foo_lock);
			WRITE_ONCE(foo, newval);
			ASSERT_EXCLUSIVE_WRITER(foo);
			do_something(newval);
			spin_unlock(&foo_lock);
		}

		int read_foo(void)
		{
			do_something_else();
			return READ_ONCE(foo);
		}

	Because foo is read locklessly, all accesses are marked.

From my understanding, we need a WRITE_ONCE() inside the lock, because
the bg_lock taken in fuse_request_end() is invisible to
fuse_readahead(), and fuse_readahead() might read a num_background that
was written non-atomically/corrupted (if there is no WRITE_ONCE()).

That said, if the reader (fuse_readahead()) can handle possibly
corrupted data, we can mark it with the data_race() annotation. Then I
understand we don't need to mark the write with WRITE_ONCE().

Here is what access-marking.txt says about this case:

	Here are some situations where data_race() should be used instead of
	READ_ONCE() and WRITE_ONCE():

	1. Data-racy loads from shared variables whose values are used only
	   for diagnostic purposes.

	2. Data-racy reads whose values are checked against marked reload.

	3. Reads whose values feed into error-tolerant heuristics.

	4. Writes setting values that feed into error-tolerant heuristics.

Anyway, I am more than happy to test with only a READ_ONCE() on the
reader side, if that is the approach you prefer.

Thanks!

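For readers who want to try the marked-access pattern outside the kernel, here is a minimal userspace sketch. It is an illustration under stated assumptions, not the fuse code: struct conn_model and the model_*() functions are hypothetical names, a pthread mutex stands in for bg_lock, and the READ_ONCE()/WRITE_ONCE() macros are simplified volatile-cast approximations of the kernel ones (they rely on the GCC/Clang __typeof__ extension).

```c
#include <assert.h>
#include <pthread.h>

/*
 * Simplified userspace approximations of the kernel macros.
 * Assumption: GCC/Clang __typeof__; the kernel definitions do more.
 */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical stand-in for the relevant fields of struct fuse_conn. */
struct conn_model {
	pthread_mutex_t bg_lock;
	unsigned int num_background;
	unsigned int congestion_threshold;
};

/* Writer side: models the decrement done in fuse_request_end(). */
static void model_request_end(struct conn_model *fc)
{
	unsigned int num_background;

	pthread_mutex_lock(&fc->bg_lock);
	num_background = READ_ONCE(fc->num_background);
	/* Marked store, because a reader may look at it without the lock. */
	WRITE_ONCE(fc->num_background, num_background - 1);
	pthread_mutex_unlock(&fc->bg_lock);
}

/* Reader side: models the lockless check done in fuse_readahead(). */
static int model_congested(struct conn_model *fc)
{
	return READ_ONCE(fc->num_background) >= fc->congestion_threshold;
}

/* Single-threaded usage example; it only checks the bookkeeping. */
static void model_selfcheck(void)
{
	struct conn_model fc = {
		.bg_lock = PTHREAD_MUTEX_INITIALIZER,
		.num_background = 2,
		.congestion_threshold = 2,
	};

	assert(model_congested(&fc));
	model_request_end(&fc);
	assert(fc.num_background == 1);
	assert(!model_congested(&fc));
}
```

The sketch only shows where the marks go. It does not demonstrate a race, and it says nothing about whether KCSAN would still complain with the write side left unmarked, which is the open question above.
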
On Mon, 13 May 2024 at 14:41, Breno Leitao <leitao@debian.org> wrote:

> That said, if the reader (fuse_readahead()) can handle possible
> corrupted data, we can mark is with data_race() annotation. Then I
> understand we don't need to mark the write with WRITE_ONCE().

Adding Willy, since the readahead code in fuse is fairly special.

I don't think it actually matters if "fc->num_background >=
fc->congestion_threshold" returns false positive or false negative,
but I don't have a full understanding of how readahead works.

Willy, can you please look at fuse_readahead() to confirm that
breaking out of the loop is okay if (rac->ra->async_size >=
readahead_count(rac)) no matter what?

Thanks,
Miklos

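To make the false positive/false negative question concrete, here is a small sketch that models the condition at the top of the fuse_readahead() loop as a pure function. The function name and the sample numbers are hypothetical; only the shape of the condition is taken from the patch context. It says nothing about what the readahead core does after the loop exits, which is the part still to be confirmed.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of the skip check in fuse_readahead():
 *   fc->num_background >= fc->congestion_threshold &&
 *   rac->ra->async_size >= readahead_count(rac)
 * "remaining" stands in for readahead_count(rac).
 */
static bool model_skip_readahead(unsigned int num_background,
				 unsigned int congestion_threshold,
				 unsigned int async_size,
				 unsigned int remaining)
{
	return num_background >= congestion_threshold &&
	       async_size >= remaining;
}

/* Usage example with made-up numbers. */
static void model_skip_selfcheck(void)
{
	/* More than the async pages remain: no skip, whatever the counter is. */
	assert(!model_skip_readahead(0, 10, 4, 8));
	assert(!model_skip_readahead(100, 10, 4, 8));

	/* Only async pages left: an off-by-one counter flips the outcome. */
	assert(model_skip_readahead(10, 10, 8, 8));
	assert(!model_skip_readahead(9, 10, 8, 8));
}
```

In this model a stale num_background can only change the result in the second case, where only async pages are left, so an inexact read would matter only if skipping or not skipping those pages is itself a problem.
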
diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index 3ec8bb5e68ff..8e63dba49eff 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -282,6 +282,7 @@ void fuse_request_end(struct fuse_req *req)
 	struct fuse_mount *fm = req->fm;
 	struct fuse_conn *fc = fm->fc;
 	struct fuse_iqueue *fiq = &fc->iq;
+	unsigned int num_background;
 
 	if (test_and_set_bit(FR_FINISHED, &req->flags))
 		goto put_request;
@@ -301,7 +302,8 @@ void fuse_request_end(struct fuse_req *req)
 	if (test_bit(FR_BACKGROUND, &req->flags)) {
 		spin_lock(&fc->bg_lock);
 		clear_bit(FR_BACKGROUND, &req->flags);
-		if (fc->num_background == fc->max_background) {
+		num_background = READ_ONCE(fc->num_background);
+		if (num_background == fc->max_background) {
 			fc->blocked = 0;
 			wake_up(&fc->blocked_waitq);
 		} else if (!fc->blocked) {
@@ -315,7 +317,7 @@ void fuse_request_end(struct fuse_req *req)
 				wake_up(&fc->blocked_waitq);
 		}
 
-		fc->num_background--;
+		WRITE_ONCE(fc->num_background, num_background - 1);
 		fc->active_background--;
 		flush_bg_queue(fc);
 		spin_unlock(&fc->bg_lock);
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index b57ce4157640..07331889bbf3 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -1002,7 +1002,7 @@ static void fuse_readahead(struct readahead_control *rac)
 		struct fuse_io_args *ia;
 		struct fuse_args_pages *ap;
 
-		if (fc->num_background >= fc->congestion_threshold &&
+		if (READ_ONCE(fc->num_background) >= fc->congestion_threshold &&
 		    rac->ra->async_size >= readahead_count(rac))
 			/*
 			 * Congested and only async pages left, so skip the

A data race occurs when two concurrent data paths potentially access
fuse_conn->num_background simultaneously.

Specifically, fuse_request_end() accesses and modifies ->num_background
while holding the bg_lock, whereas fuse_readahead() reads
->num_background without acquiring any lock beforehand. This potential
data race is flagged by KCSAN:

	BUG: KCSAN: data-race in fuse_readahead [fuse] / fuse_request_end [fuse]

	read-write to 0xffff8883a6666598 of 4 bytes by task 113809 on cpu 39:
	fuse_request_end (fs/fuse/dev.c:318) fuse
	fuse_dev_do_write (fs/fuse/dev.c:?) fuse
	fuse_dev_write (fs/fuse/dev.c:?) fuse
	...

	read to 0xffff8883a6666598 of 4 bytes by task 113787 on cpu 8:
	fuse_readahead (fs/fuse/file.c:1005) fuse
	read_pages (mm/readahead.c:166)
	page_cache_ra_unbounded (mm/readahead.c:?)
	...

	value changed: 0x00000001 -> 0x00000000

Annotated the reader with READ_ONCE() and the writer with WRITE_ONCE()
to avoid such complaint from KCSAN.

Suggested-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Breno Leitao <leitao@debian.org>
---
 fs/fuse/dev.c  | 6 ++++--
 fs/fuse/file.c | 2 +-
 2 files changed, 5 insertions(+), 3 deletions(-)