
[v3] 9p/trans_fd: mark concurrent reads and writes to p9_conn->err

Message ID 20250318-p9_conn_err_benign_data_race-v3-1-290bb18335cc@iencinas.com (mailing list archive)
State New
Headers show
Series [v3] 9p/trans_fd: mark concurrent reads and writes to p9_conn->err

Commit Message

Ignacio Encinas Rubio March 18, 2025, 9:39 p.m. UTC
Writes to a connection's error value are spinlock-protected inside
p9_conn_cancel, but lockless reads are present elsewhere to avoid
performing unnecessary work after an error has occurred.

Mark the write and lockless reads to make KCSAN happy. Mark the write as
exclusive following the recommendation in "Lock-Protected Writes with
Lockless Reads" in tools/memory-model/Documentation/access-marking.txt
while we are at it.

For stylistic consistency, also mark the m->err reads in p9_fd_request
and p9_conn_cancel, even though they do not race with concurrent writes.

Reported-by: syzbot+d69a7cc8c683c2cb7506@syzkaller.appspotmail.com
Reported-by: syzbot+483d6c9b9231ea7e1851@syzkaller.appspotmail.com
Signed-off-by: Ignacio Encinas <ignacio@iencinas.com>
---
Changes in v3:

- Introduce a couple of extra READ_ONCEs to maintain consistency across
  m->err reads (noted in the commit message too for future reference)
- Remove racy read from p9_fd_request by reusing the previously read
  error (arguably, the lock was never of much use)

- Link to v2: https://lore.kernel.org/r/20250313-p9_conn_err_benign_data_race-v2-1-0bb9f45f6bb2@iencinas.com
- Link to v1: https://lore.kernel.org/r/20250308-p9_conn_err_benign_data_race-v1-1-729e57d5832b@iencinas.com
---
 net/9p/trans_fd.c | 17 ++++++++++-------
 1 file changed, 10 insertions(+), 7 deletions(-)


---
base-commit: 2a520073e74fbb956b5564818fc5529dcc7e9f0e
change-id: 20250308-p9_conn_err_benign_data_race-2758fe8bbed0

Best regards,

Comments

Dominique Martinet March 18, 2025, 10:01 p.m. UTC | #1
Ignacio Encinas wrote on Tue, Mar 18, 2025 at 10:39:02PM +0100:
> Writes to a connection's error value are spinlock-protected inside
> p9_conn_cancel, but lockless reads are present elsewhere to avoid
> performing unnecessary work after an error has occurred.
> 
> Mark the write and lockless reads to make KCSAN happy. Mark the write as
> exclusive following the recommendation in "Lock-Protected Writes with
> Lockless Reads" in tools/memory-model/Documentation/access-marking.txt
> while we are at it.
> 
> For stylistic consistency, also mark the m->err reads in p9_fd_request
> and p9_conn_cancel, even though they do not race with concurrent writes.
> 
> Reported-by: syzbot+d69a7cc8c683c2cb7506@syzkaller.appspotmail.com
> Reported-by: syzbot+483d6c9b9231ea7e1851@syzkaller.appspotmail.com
> Signed-off-by: Ignacio Encinas <ignacio@iencinas.com>
> ---
> Changes in v3:
> 
> - Introduce a couple of extra READ_ONCEs to maintain consistency across
>   m->err reads (noted in the commit message too for future reference)
> - Remove racy read from p9_fd_request by reusing the previously read
>   error (arguably, the lock was never of much use)

Thank you!

I've updated the patch in my -next branch, and it'll go to Linus in a
couple of weeks with the 6.15 merge window

(our mails crossed, feel free to ignore the other one)
Ignacio Encinas Rubio March 18, 2025, 10:12 p.m. UTC | #2
On 18/3/25 23:01, Dominique Martinet wrote:
> Thank you!
> 
> I've updated the patch in my -next branch, and it'll go to Linus in a
> couple of weeks with the 6.15 merge window

Thank you! 

> (our mails crossed, feel free to ignore the other one)

How ironic... it seems we had a race condition! :)

Patch

diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c
index 196060dc6138af10e99ad04a76ee36a11f770c65..791e4868f2d4e16b87bfc6038132b4e8a2a5fb9d 100644
--- a/net/9p/trans_fd.c
+++ b/net/9p/trans_fd.c
@@ -191,12 +191,13 @@  static void p9_conn_cancel(struct p9_conn *m, int err)
 
 	spin_lock(&m->req_lock);
 
-	if (m->err) {
+	if (READ_ONCE(m->err)) {
 		spin_unlock(&m->req_lock);
 		return;
 	}
 
-	m->err = err;
+	WRITE_ONCE(m->err, err);
+	ASSERT_EXCLUSIVE_WRITER(m->err);
 
 	list_for_each_entry_safe(req, rtmp, &m->req_list, req_list) {
 		list_move(&req->req_list, &cancel_list);
@@ -283,7 +284,7 @@  static void p9_read_work(struct work_struct *work)
 
 	m = container_of(work, struct p9_conn, rq);
 
-	if (m->err < 0)
+	if (READ_ONCE(m->err) < 0)
 		return;
 
 	p9_debug(P9_DEBUG_TRANS, "start mux %p pos %zd\n", m, m->rc.offset);
@@ -450,7 +451,7 @@  static void p9_write_work(struct work_struct *work)
 
 	m = container_of(work, struct p9_conn, wq);
 
-	if (m->err < 0) {
+	if (READ_ONCE(m->err) < 0) {
 		clear_bit(Wworksched, &m->wsched);
 		return;
 	}
@@ -622,7 +623,7 @@  static void p9_poll_mux(struct p9_conn *m)
 	__poll_t n;
 	int err = -ECONNRESET;
 
-	if (m->err < 0)
+	if (READ_ONCE(m->err) < 0)
 		return;
 
 	n = p9_fd_poll(m->client, NULL, &err);
@@ -665,6 +666,7 @@  static void p9_poll_mux(struct p9_conn *m)
 static int p9_fd_request(struct p9_client *client, struct p9_req_t *req)
 {
 	__poll_t n;
+	int err;
 	struct p9_trans_fd *ts = client->trans;
 	struct p9_conn *m = &ts->conn;
 
@@ -673,9 +675,10 @@  static int p9_fd_request(struct p9_client *client, struct p9_req_t *req)
 
 	spin_lock(&m->req_lock);
 
-	if (m->err < 0) {
+	err = READ_ONCE(m->err);
+	if (err < 0) {
 		spin_unlock(&m->req_lock);
-		return m->err;
+		return err;
 	}
 
 	WRITE_ONCE(req->status, REQ_STATUS_UNSENT);