Message ID | 20220104171058.22580-1-avagin@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | fs/pipe: use kvcalloc to allocate a pipe_buffer array | expand |
On 1/4/22 17:10, Andrei Vagin wrote: > Right now, kcalloc is used to allocate a pipe_buffer array. The size of > the pipe_buffer struct is 40 bytes. kcalloc allows allocating reliably > chunks with sizes less or equal to PAGE_ALLOC_COSTLY_ORDER (3). It means > that the maximum pipe size is 3.2MB in this case. > > In CRIU, we use pipes to dump processes memory. CRIU freezes a target > process, injects a parasite code into it and then this code splices > memory into pipes. If a maximum pipe size is small, we need to > do many iterations or create many pipes. > > kvcalloc attempt to allocate physically contiguous memory, but upon > failure, fall back to non-contiguous (vmalloc) allocation and so it > isn't limited by PAGE_ALLOC_COSTLY_ORDER. > > The maximum pipe size for non-root users is limited by > the /proc/sys/fs/pipe-max-size sysctl that is 1MB by default, so only > the root user will be able to trigger vmalloc allocations. > > Cc: Dmitry Safonov <0x7f454c46@gmail.com> > Signed-off-by: Andrei Vagin <avagin@gmail.com> Good idea! I wonder if you need to apply this on the top: diff --git a/fs/pipe.c b/fs/pipe.c index 45565773ec33..b4ccafffa350 100644 --- a/fs/pipe.c +++ b/fs/pipe.c @@ -605,7 +605,7 @@ pipe_write(struct kiocb *iocb, struct iov_iter *from) static long pipe_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) { struct pipe_inode_info *pipe = filp->private_data; - int count, head, tail, mask; + unsigned int count, head, tail, mask; switch (cmd) { case FIONREAD: @@ -827,7 +827,7 @@ struct pipe_inode_info *alloc_pipe_info(void) void free_pipe_info(struct pipe_inode_info *pipe) { - int i; + unsigned int i; #ifdef CONFIG_WATCH_QUEUE if (pipe->watch_queue) { --->8--- Otherwise this loop in free_pipe_info() may become lockup on some ugly platforms with INTMAX allocation reachable, I think. I may be wrong :-) Thanks, Dmitry
On Tue, Jan 4, 2022 at 10:54 PM Dmitry Safonov <0x7f454c46@gmail.com> wrote: > > On 1/4/22 17:10, Andrei Vagin wrote: > > Right now, kcalloc is used to allocate a pipe_buffer array. The size of > > the pipe_buffer struct is 40 bytes. kcalloc allows allocating reliably > > chunks with sizes less or equal to PAGE_ALLOC_COSTLY_ORDER (3). It means > > that the maximum pipe size is 3.2MB in this case. > > > > In CRIU, we use pipes to dump processes memory. CRIU freezes a target > > process, injects a parasite code into it and then this code splices > > memory into pipes. If a maximum pipe size is small, we need to > > do many iterations or create many pipes. > > > > kvcalloc attempt to allocate physically contiguous memory, but upon > > failure, fall back to non-contiguous (vmalloc) allocation and so it > > isn't limited by PAGE_ALLOC_COSTLY_ORDER. > > > > The maximum pipe size for non-root users is limited by > > the /proc/sys/fs/pipe-max-size sysctl that is 1MB by default, so only > > the root user will be able to trigger vmalloc allocations. > > > > Cc: Dmitry Safonov <0x7f454c46@gmail.com> > > Signed-off-by: Andrei Vagin <avagin@gmail.com> > > Good idea! > > I wonder if you need to apply this on the top: > > diff --git a/fs/pipe.c b/fs/pipe.c > index 45565773ec33..b4ccafffa350 100644 > --- a/fs/pipe.c > +++ b/fs/pipe.c > @@ -605,7 +605,7 @@ pipe_write(struct kiocb *iocb, struct iov_iter *from) > static long pipe_ioctl(struct file *filp, unsigned int cmd, unsigned > long arg) > { > struct pipe_inode_info *pipe = filp->private_data; > - int count, head, tail, mask; > + unsigned int count, head, tail, mask; > > switch (cmd) { > case FIONREAD: > @@ -827,7 +827,7 @@ struct pipe_inode_info *alloc_pipe_info(void) > > void free_pipe_info(struct pipe_inode_info *pipe) > { > - int i; > + unsigned int i; > > #ifdef CONFIG_WATCH_QUEUE > if (pipe->watch_queue) { > --->8--- > > Otherwise this loop in free_pipe_info() may become lockup on some ugly > platforms with INTMAX allocation reachable, I think. I may be wrong :-) This change looks reasonable, it makes types of local variables consistent with proper fields of pipe_inode_info. But right now, the maximum pipe size is limited by (1<<31) (look at round_pipe_size) and so we don't have a real issue here. Thanks, Andrei
On 1/5/22 21:47, Andrei Vagin wrote: > On Tue, Jan 4, 2022 at 10:54 PM Dmitry Safonov <0x7f454c46@gmail.com> wrote: [..] >> Otherwise this loop in free_pipe_info() may become lockup on some ugly >> platforms with INTMAX allocation reachable, I think. I may be wrong :-) > > This change looks reasonable, it makes types of local variables consistent > with proper fields of pipe_inode_info. But right now, the maximum pipe size > is limited by (1<<31) (look at round_pipe_size) and so we don't have a real > issue here. Right you are, I haven't noticed it. Hmm, : if (size > (1U << 31)) : return 0; Isn't size == (1U << 31) a negative integer on 32-bit? [probably, the question is not very related to your patch, though] Thanks, Dmitry
On 1/4/22 17:10, Andrei Vagin wrote: > Right now, kcalloc is used to allocate a pipe_buffer array. The size of > the pipe_buffer struct is 40 bytes. kcalloc allows allocating reliably > chunks with sizes less or equal to PAGE_ALLOC_COSTLY_ORDER (3). It means > that the maximum pipe size is 3.2MB in this case. > > In CRIU, we use pipes to dump processes memory. CRIU freezes a target > process, injects a parasite code into it and then this code splices > memory into pipes. If a maximum pipe size is small, we need to > do many iterations or create many pipes. > > kvcalloc attempt to allocate physically contiguous memory, but upon > failure, fall back to non-contiguous (vmalloc) allocation and so it > isn't limited by PAGE_ALLOC_COSTLY_ORDER. > > The maximum pipe size for non-root users is limited by > the /proc/sys/fs/pipe-max-size sysctl that is 1MB by default, so only > the root user will be able to trigger vmalloc allocations. > > Cc: Dmitry Safonov <0x7f454c46@gmail.com> > Signed-off-by: Andrei Vagin <avagin@gmail.com> Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com> > --- > fs/pipe.c | 9 ++++----- > 1 file changed, 4 insertions(+), 5 deletions(-) > > diff --git a/fs/pipe.c b/fs/pipe.c > index 6d4342bad9f1..45565773ec33 100644 > --- a/fs/pipe.c > +++ b/fs/pipe.c > @@ -802,7 +802,7 @@ struct pipe_inode_info *alloc_pipe_info(void) > if (too_many_pipe_buffers_hard(user_bufs) && pipe_is_unprivileged_user()) > goto out_revert_acct; > > - pipe->bufs = kcalloc(pipe_bufs, sizeof(struct pipe_buffer), > + pipe->bufs = kvcalloc(pipe_bufs, sizeof(struct pipe_buffer), > GFP_KERNEL_ACCOUNT); > > if (pipe->bufs) { > @@ -845,7 +845,7 @@ void free_pipe_info(struct pipe_inode_info *pipe) > } > if (pipe->tmp_page) > __free_page(pipe->tmp_page); > - kfree(pipe->bufs); > + kvfree(pipe->bufs); > kfree(pipe); > } > > @@ -1260,8 +1260,7 @@ int pipe_resize_ring(struct pipe_inode_info *pipe, unsigned int nr_slots) > if (nr_slots < n) > return -EBUSY; > > - bufs = kcalloc(nr_slots, sizeof(*bufs), > - GFP_KERNEL_ACCOUNT | __GFP_NOWARN); > + bufs = kvcalloc(nr_slots, sizeof(*bufs), GFP_KERNEL_ACCOUNT); > if (unlikely(!bufs)) > return -ENOMEM; > > @@ -1288,7 +1287,7 @@ int pipe_resize_ring(struct pipe_inode_info *pipe, unsigned int nr_slots) > head = n; > tail = 0; > > - kfree(pipe->bufs); > + kvfree(pipe->bufs); > pipe->bufs = bufs; > pipe->ring_size = nr_slots; > if (pipe->max_usage > nr_slots) Thanks, Dmitry
On Tue, Jan 4, 2022 at 9:11 AM Andrei Vagin <avagin@gmail.com> wrote: > > Right now, kcalloc is used to allocate a pipe_buffer array. The size of > the pipe_buffer struct is 40 bytes. kcalloc allows allocating reliably > chunks with sizes less or equal to PAGE_ALLOC_COSTLY_ORDER (3). It means > that the maximum pipe size is 3.2MB in this case. > > In CRIU, we use pipes to dump processes memory. CRIU freezes a target > process, injects a parasite code into it and then this code splices > memory into pipes. If a maximum pipe size is small, we need to > do many iterations or create many pipes. > > kvcalloc attempt to allocate physically contiguous memory, but upon > failure, fall back to non-contiguous (vmalloc) allocation and so it > isn't limited by PAGE_ALLOC_COSTLY_ORDER. > > The maximum pipe size for non-root users is limited by > the /proc/sys/fs/pipe-max-size sysctl that is 1MB by default, so only > the root user will be able to trigger vmalloc allocations. > > Cc: Dmitry Safonov <0x7f454c46@gmail.com> > Signed-off-by: Andrei Vagin <avagin@gmail.com> Alexander and Andrew, Do you have any objections? Could you merge this patch? Dmitry's comments have been addressed in the separate patch: https://lkml.org/lkml/2022/1/6/463 BTW: I found that fuse already uses kvmalloc to allocate pipe buffers (e. g. fuse_dev_splice_read). Thanks, Andrei
diff --git a/fs/pipe.c b/fs/pipe.c index 6d4342bad9f1..45565773ec33 100644 --- a/fs/pipe.c +++ b/fs/pipe.c @@ -802,7 +802,7 @@ struct pipe_inode_info *alloc_pipe_info(void) if (too_many_pipe_buffers_hard(user_bufs) && pipe_is_unprivileged_user()) goto out_revert_acct; - pipe->bufs = kcalloc(pipe_bufs, sizeof(struct pipe_buffer), + pipe->bufs = kvcalloc(pipe_bufs, sizeof(struct pipe_buffer), GFP_KERNEL_ACCOUNT); if (pipe->bufs) { @@ -845,7 +845,7 @@ void free_pipe_info(struct pipe_inode_info *pipe) } if (pipe->tmp_page) __free_page(pipe->tmp_page); - kfree(pipe->bufs); + kvfree(pipe->bufs); kfree(pipe); } @@ -1260,8 +1260,7 @@ int pipe_resize_ring(struct pipe_inode_info *pipe, unsigned int nr_slots) if (nr_slots < n) return -EBUSY; - bufs = kcalloc(nr_slots, sizeof(*bufs), - GFP_KERNEL_ACCOUNT | __GFP_NOWARN); + bufs = kvcalloc(nr_slots, sizeof(*bufs), GFP_KERNEL_ACCOUNT); if (unlikely(!bufs)) return -ENOMEM; @@ -1288,7 +1287,7 @@ int pipe_resize_ring(struct pipe_inode_info *pipe, unsigned int nr_slots) head = n; tail = 0; - kfree(pipe->bufs); + kvfree(pipe->bufs); pipe->bufs = bufs; pipe->ring_size = nr_slots; if (pipe->max_usage > nr_slots)
Right now, kcalloc is used to allocate a pipe_buffer array. The size of the pipe_buffer struct is 40 bytes. kcalloc allows allocating reliably chunks with sizes less or equal to PAGE_ALLOC_COSTLY_ORDER (3). It means that the maximum pipe size is 3.2MB in this case. In CRIU, we use pipes to dump processes memory. CRIU freezes a target process, injects a parasite code into it and then this code splices memory into pipes. If a maximum pipe size is small, we need to do many iterations or create many pipes. kvcalloc attempt to allocate physically contiguous memory, but upon failure, fall back to non-contiguous (vmalloc) allocation and so it isn't limited by PAGE_ALLOC_COSTLY_ORDER. The maximum pipe size for non-root users is limited by the /proc/sys/fs/pipe-max-size sysctl that is 1MB by default, so only the root user will be able to trigger vmalloc allocations. Cc: Dmitry Safonov <0x7f454c46@gmail.com> Signed-off-by: Andrei Vagin <avagin@gmail.com> --- fs/pipe.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-)