diff mbox series

[RFC,2/2] watch_queue: Fix pipe accounting

Message ID a8d8f11a-0fea-4b74-893b-905d6ef841e6@redhat.com (mailing list archive)
State New
Headers show
Series watch_queue: Fix pipe allocation and accounting | expand

Commit Message

Eric Sandeen Feb. 14, 2025, 3:38 p.m. UTC
Currently, watch_queue_set_size() modifies the pipe buffers charged to
user->pipe_bufs without updating the pipe->nr_accounted on the pipe
itself, due to the if (!pipe_has_watch_queue()) test in
pipe_resize_ring(). This means that when the pipe is ultimately freed,
we decrement user->pipe_bufs by something other than what than we had
charged to it, potentially leading to an underflow. This in turn can
cause subsequent too_many_pipe_buffers_soft() tests to fail with -EPERM.

Fixes: e95aada4cb93d ("pipe: wakeup wr_wait after setting max_usage")
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
---

Comments

David Howells Feb. 14, 2025, 4:49 p.m. UTC | #1
Eric Sandeen <sandeen@redhat.com> wrote:

> -	if (!pipe_has_watch_queue(pipe)) {
> -		pipe->max_usage = nr_slots;
> -		pipe->nr_accounted = nr_slots;
> -	}
> +	pipe->max_usage = nr_slots;
> +	pipe->nr_accounted = nr_slots;

Hmmm...   The pipe ring is supposed to have some spare capacity when used as a
watch queue so that you can bung at least a few messages into it.

David
Eric Sandeen Feb. 14, 2025, 4:58 p.m. UTC | #2
On 2/14/25 10:49 AM, David Howells wrote:
> Eric Sandeen <sandeen@redhat.com> wrote:
> 
>> -	if (!pipe_has_watch_queue(pipe)) {
>> -		pipe->max_usage = nr_slots;
>> -		pipe->nr_accounted = nr_slots;
>> -	}
>> +	pipe->max_usage = nr_slots;
>> +	pipe->nr_accounted = nr_slots;
> 
> Hmmm...   The pipe ring is supposed to have some spare capacity when used as a
> watch queue so that you can bung at least a few messages into it.

If the pipe has an original number of buffers on it, and
then watch_queue_set_size adjusts that number - where does
the spare capacity come from? Who adds it / where?

Thanks,
-Eric

> David
> 
>
Eric Sandeen Feb. 14, 2025, 5:36 p.m. UTC | #3
On 2/14/25 10:58 AM, Eric Sandeen wrote:
> On 2/14/25 10:49 AM, David Howells wrote:
>> Eric Sandeen <sandeen@redhat.com> wrote:
>>
>>> -	if (!pipe_has_watch_queue(pipe)) {
>>> -		pipe->max_usage = nr_slots;
>>> -		pipe->nr_accounted = nr_slots;
>>> -	}
>>> +	pipe->max_usage = nr_slots;
>>> +	pipe->nr_accounted = nr_slots;
>>
>> Hmmm...   The pipe ring is supposed to have some spare capacity when used as a
>> watch queue so that you can bung at least a few messages into it.
> 
> If the pipe has an original number of buffers on it, and
> then watch_queue_set_size adjusts that number - where does
> the spare capacity come from? Who adds it / where?

If that was resizing the ring to nr_notes not nr_pages, then ... I guess we
could keep the pipe_has_watch_queue() test as above, skip updating nr_accounted
there, and set it manually to the user-charged number in watch_queue_set_size()?

Something like this? I'm not sure about max_usage, maybe it should be set
as well.

diff --git a/kernel/watch_queue.c b/kernel/watch_queue.c
index 5267adeaa403..07a161ecc8a8 100644
--- a/kernel/watch_queue.c
+++ b/kernel/watch_queue.c
@@ -269,6 +269,14 @@ long watch_queue_set_size(struct pipe_inode_info *pipe, unsigned int nr_notes)
 	if (ret < 0)
 		goto error;
 
+	/*
+	 * pipe_resize_ring does not update nr_accounted for watch_queue pipes,
+	 * because the above vastly overprovisions. Set nr_accounted on
+	 * this pipe to the number that was charged to the user above
+	 * here manually.
+	 */
+	pipe->nr_accounted = nr_pages;
+
 	ret = -ENOMEM;
 	pages = kcalloc(nr_pages, sizeof(struct page *), GFP_KERNEL);
 	if (!pages)




> Thanks,
> -Eric
> 
>> David
>>
>>
> 
>
diff mbox series

Patch

diff --git a/fs/pipe.c b/fs/pipe.c
index 94b59045ab44..072e2e003165 100644
--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -1317,10 +1317,8 @@  int pipe_resize_ring(struct pipe_inode_info *pipe, unsigned int nr_slots)
 	pipe->tail = tail;
 	pipe->head = head;
 
-	if (!pipe_has_watch_queue(pipe)) {
-		pipe->max_usage = nr_slots;
-		pipe->nr_accounted = nr_slots;
-	}
+	pipe->max_usage = nr_slots;
+	pipe->nr_accounted = nr_slots;
 
 	spin_unlock_irq(&pipe->rd_wait.lock);