Message ID | 20240905174541.392785-1-joannelkoong@gmail.com (mailing list archive) |
---|---|
State | New |
Series | [v2,RESEND] fuse: Enable dynamic configuration of fuse max pages limit (FUSE_MAX_MAX_PAGES) |
On 9/5/24 1:45 PM, Joanne Koong wrote:
> Introduce the capability to dynamically configure the fuse max pages
> limit (formerly #defined as FUSE_MAX_MAX_PAGES) through a sysctl.
> This enhancement allows system administrators to adjust the value
> based on system-specific requirements.
>
> This removes the previous static limit of 256 max pages, which limits
> the max write size of a request to 1 MiB (on 4096 pagesize systems).
> Having the ability to up the max write size beyond 1 MiB allows for the
> perf improvements detailed in this thread [1].
>
> $ sysctl -a | grep max_pages_limit
> fs.fuse.max_pages_limit = 256
>
> $ sysctl -n fs.fuse.max_pages_limit
> 256
>
> $ echo 1024 | sudo tee /proc/sys/fs/fuse/max_pages_limit
> 1024
>
> $ sysctl -n fs.fuse.max_pages_limit
> 1024
>
> $ echo 65536 | sudo tee /proc/sys/fs/fuse/max_pages_limit
> tee: /proc/sys/fs/fuse/max_pages_limit: Invalid argument
>
> $ echo 0 | sudo tee /proc/sys/fs/fuse/max_pages_limit
> tee: /proc/sys/fs/fuse/max_pages_limit: Invalid argument
>
> $ echo 65535 | sudo tee /proc/sys/fs/fuse/max_pages_limit
> 65535
>
> $ sysctl -n fs.fuse.max_pages_limit
> 65535
>
> v2 (original):
> https://lore.kernel.org/linux-fsdevel/20240702014627.4068146-1-joannelkoong@gmail.com/
>
> v1:
> https://lore.kernel.org/linux-fsdevel/20240628001355.243805-1-joannelkoong@gmail.com/
>
> Changes from v1:
> - Rename fuse_max_max_pages to fuse_max_pages_limit internally
> - Rename /proc/sys/fs/fuse/fuse_max_max_pages to
>   /proc/sys/fs/fuse/max_pages_limit
> - Restrict fuse max_pages_limit sysctl values to between 1 and 65535
>   (inclusive)
>
> [1] https://lore.kernel.org/linux-fsdevel/20240124070512.52207-1-jefflexu@linux.alibaba.com/T/#u
>
> Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
> Reviewed-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me>

Thanks for doing this!!

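The size figures in the commit message follow from simple arithmetic. As a minimal userspace sketch, assuming the largest payload of a request is just max_pages multiplied by the page size (the helper name `fuse_max_write_bytes` is made up for illustration and is not a kernel function):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): the largest payload, in
 * bytes, that a single request can carry for a given max_pages value.
 */
uint64_t fuse_max_write_bytes(uint32_t max_pages, uint32_t page_size)
{
    /* A zero page size would make the result meaningless. */
    assert(page_size > 0);

    /* Widen before multiplying so large inputs cannot overflow 32 bits. */
    return (uint64_t)max_pages * page_size;
}
```

Under that assumption, `fuse_max_write_bytes(256, 4096)` is 1048576 bytes (the 1 MiB mentioned above), and the new upper bound of 65535 pages gives 268431360 bytes, one page short of 256 MiB.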
Hi Joanne,

On 9/5/24 19:45, Joanne Koong wrote:
> Introduce the capability to dynamically configure the fuse max pages
> limit (formerly #defined as FUSE_MAX_MAX_PAGES) through a sysctl.
> This enhancement allows system administrators to adjust the value
> based on system-specific requirements.
>
> This removes the previous static limit of 256 max pages, which limits
> the max write size of a request to 1 MiB (on 4096 pagesize systems).
> Having the ability to up the max write size beyond 1 MiB allows for the
> perf improvements detailed in this thread [1].

the change itself looks good to me, but have you seen this discussion here?

https://lore.kernel.org/lkml/CAJfpegs10SdtzNXJfj3=vxoAZMhksT5A1u5W5L6nKL-P2UOuLQ@mail.gmail.com/T/

Miklos is basically worried about page pinning and accounting for that
for unprivileged user processes.

Thanks,
Bernd

>
> $ sysctl -a | grep max_pages_limit
> fs.fuse.max_pages_limit = 256
>
> $ sysctl -n fs.fuse.max_pages_limit
> 256
>
> $ echo 1024 | sudo tee /proc/sys/fs/fuse/max_pages_limit
> 1024
>
> $ sysctl -n fs.fuse.max_pages_limit
> 1024
>
> $ echo 65536 | sudo tee /proc/sys/fs/fuse/max_pages_limit
> tee: /proc/sys/fs/fuse/max_pages_limit: Invalid argument
>
> $ echo 0 | sudo tee /proc/sys/fs/fuse/max_pages_limit
> tee: /proc/sys/fs/fuse/max_pages_limit: Invalid argument
>
> $ echo 65535 | sudo tee /proc/sys/fs/fuse/max_pages_limit
> 65535
>
> $ sysctl -n fs.fuse.max_pages_limit
> 65535
>
> v2 (original):
> https://lore.kernel.org/linux-fsdevel/20240702014627.4068146-1-joannelkoong@gmail.com/
>
> v1:
> https://lore.kernel.org/linux-fsdevel/20240628001355.243805-1-joannelkoong@gmail.com/
>
> Changes from v1:
> - Rename fuse_max_max_pages to fuse_max_pages_limit internally
> - Rename /proc/sys/fs/fuse/fuse_max_max_pages to
>   /proc/sys/fs/fuse/max_pages_limit
> - Restrict fuse max_pages_limit sysctl values to between 1 and 65535
>   (inclusive)
>
> [1] https://lore.kernel.org/linux-fsdevel/20240124070512.52207-1-jefflexu@linux.alibaba.com/T/#u
>
> Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
> Reviewed-by: Josef Bacik <josef@toxicpanda.com>
> ---
>  Documentation/admin-guide/sysctl/fs.rst | 10 +++++++
>  fs/fuse/Makefile                        |  2 +-
>  fs/fuse/fuse_i.h                        | 14 +++++++--
>  fs/fuse/inode.c                         | 11 ++++++-
>  fs/fuse/ioctl.c                         |  4 ++-
>  fs/fuse/sysctl.c                        | 40 +++++++++++++++++++++++++
>  6 files changed, 75 insertions(+), 6 deletions(-)
>  create mode 100644 fs/fuse/sysctl.c
>
> diff --git a/Documentation/admin-guide/sysctl/fs.rst b/Documentation/admin-guide/sysctl/fs.rst
> index 47499a1742bd..fa25d7e718b3 100644
> --- a/Documentation/admin-guide/sysctl/fs.rst
> +++ b/Documentation/admin-guide/sysctl/fs.rst
> @@ -332,3 +332,13 @@ Each "watch" costs roughly 90 bytes on a 32-bit kernel, and roughly 160 bytes
>  on a 64-bit one.
>  The current default value for ``max_user_watches`` is 4% of the
>  available low memory, divided by the "watch" cost in bytes.
> +
> +5. /proc/sys/fs/fuse - Configuration options for FUSE filesystems
> +=====================================================================
> +
> +This directory contains the following configuration options for FUSE
> +filesystems:
> +
> +``/proc/sys/fs/fuse/max_pages_limit`` is a read/write file for
> +setting/getting the maximum number of pages that can be used for servicing
> +requests in FUSE.
> diff --git a/fs/fuse/Makefile b/fs/fuse/Makefile
> index 6e0228c6d0cb..cd4ef3e08ebf 100644
> --- a/fs/fuse/Makefile
> +++ b/fs/fuse/Makefile
> @@ -7,7 +7,7 @@ obj-$(CONFIG_FUSE_FS) += fuse.o
>  obj-$(CONFIG_CUSE) += cuse.o
>  obj-$(CONFIG_VIRTIO_FS) += virtiofs.o
>
> -fuse-y := dev.o dir.o file.o inode.o control.o xattr.o acl.o readdir.o ioctl.o
> +fuse-y := dev.o dir.o file.o inode.o control.o xattr.o acl.o readdir.o ioctl.o sysctl.o
>  fuse-y += iomode.o
>  fuse-$(CONFIG_FUSE_DAX) += dax.o
>  fuse-$(CONFIG_FUSE_PASSTHROUGH) += passthrough.o
> diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
> index f23919610313..bb252a3ea37b 100644
> --- a/fs/fuse/fuse_i.h
> +++ b/fs/fuse/fuse_i.h
> @@ -35,9 +35,6 @@
>  /** Default max number of pages that can be used in a single read request */
>  #define FUSE_DEFAULT_MAX_PAGES_PER_REQ 32
>
> -/** Maximum of max_pages received in init_out */
> -#define FUSE_MAX_MAX_PAGES 256
> -
>  /** Bias for fi->writectr, meaning new writepages must not be sent */
>  #define FUSE_NOWRITE INT_MIN
>
> @@ -47,6 +44,9 @@
>  /** Number of dentries for each connection in the control filesystem */
>  #define FUSE_CTL_NUM_DENTRIES 5
>
> +/** Maximum of max_pages received in init_out */
> +extern unsigned int fuse_max_pages_limit;
> +
>  /** List of active connections */
>  extern struct list_head fuse_conn_list;
>
> @@ -1472,4 +1472,12 @@ ssize_t fuse_passthrough_splice_write(struct pipe_inode_info *pipe,
>                                        size_t len, unsigned int flags);
>  ssize_t fuse_passthrough_mmap(struct file *file, struct vm_area_struct *vma);
>
> +#ifdef CONFIG_SYSCTL
> +extern int fuse_sysctl_register(void);
> +extern void fuse_sysctl_unregister(void);
> +#else
> +#define fuse_sysctl_register()          (0)
> +#define fuse_sysctl_unregister()        do { } while (0)
> +#endif /* CONFIG_SYSCTL */
> +
>  #endif /* _FS_FUSE_I_H */
> diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
> index 99e44ea7d875..973e58df816a 100644
> --- a/fs/fuse/inode.c
> +++ b/fs/fuse/inode.c
> @@ -35,6 +35,8 @@ DEFINE_MUTEX(fuse_mutex);
>
>  static int set_global_limit(const char *val, const struct kernel_param *kp);
>
> +unsigned int fuse_max_pages_limit = 256;
> +
>  unsigned max_user_bgreq;
>  module_param_call(max_user_bgreq, set_global_limit, param_get_uint,
>                    &max_user_bgreq, 0644);
> @@ -932,7 +934,7 @@ void fuse_conn_init(struct fuse_conn *fc, struct fuse_mount *fm,
>          fc->pid_ns = get_pid_ns(task_active_pid_ns(current));
>          fc->user_ns = get_user_ns(user_ns);
>          fc->max_pages = FUSE_DEFAULT_MAX_PAGES_PER_REQ;
> -        fc->max_pages_limit = FUSE_MAX_MAX_PAGES;
> +        fc->max_pages_limit = fuse_max_pages_limit;
>
>          if (IS_ENABLED(CONFIG_FUSE_PASSTHROUGH))
>                  fuse_backing_files_init(fc);
> @@ -2039,8 +2041,14 @@ static int __init fuse_fs_init(void)
>          if (err)
>                  goto out3;
>
> +        err = fuse_sysctl_register();
> +        if (err)
> +                goto out4;
> +
>          return 0;
>
> + out4:
> +        unregister_filesystem(&fuse_fs_type);
>   out3:
>          unregister_fuseblk();
>   out2:
> @@ -2053,6 +2061,7 @@ static void fuse_fs_cleanup(void)
>  {
>          unregister_filesystem(&fuse_fs_type);
>          unregister_fuseblk();
> +        fuse_sysctl_unregister();
>
>          /*
>           * Make sure all delayed rcu free inodes are flushed before we
> diff --git a/fs/fuse/ioctl.c b/fs/fuse/ioctl.c
> index 572ce8a82ceb..a6c8ee551635 100644
> --- a/fs/fuse/ioctl.c
> +++ b/fs/fuse/ioctl.c
> @@ -10,6 +10,8 @@
>  #include <linux/fileattr.h>
>  #include <linux/fsverity.h>
>
> +#define FUSE_VERITY_ENABLE_ARG_MAX_PAGES 256
> +
>  static ssize_t fuse_send_ioctl(struct fuse_mount *fm, struct fuse_args *args,
>                                 struct fuse_ioctl_out *outarg)
>  {
> @@ -140,7 +142,7 @@ static int fuse_setup_enable_verity(unsigned long arg, struct iovec *iov,
>  {
>          struct fsverity_enable_arg enable;
>          struct fsverity_enable_arg __user *uarg = (void __user *)arg;
> -        const __u32 max_buffer_len = FUSE_MAX_MAX_PAGES * PAGE_SIZE;
> +        const __u32 max_buffer_len = FUSE_VERITY_ENABLE_ARG_MAX_PAGES * PAGE_SIZE;
>
>          if (copy_from_user(&enable, uarg, sizeof(enable)))
>                  return -EFAULT;
> diff --git a/fs/fuse/sysctl.c b/fs/fuse/sysctl.c
> new file mode 100644
> index 000000000000..b272bb333005
> --- /dev/null
> +++ b/fs/fuse/sysctl.c
> @@ -0,0 +1,40 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * linux/fs/fuse/fuse_sysctl.c
> + *
> + * Sysctl interface to fuse parameters
> + */
> +#include <linux/sysctl.h>
> +
> +#include "fuse_i.h"
> +
> +static struct ctl_table_header *fuse_table_header;
> +
> +/* Bound by fuse_init_out max_pages, which is a u16 */
> +static unsigned int sysctl_fuse_max_pages_limit = 65535;
> +
> +static struct ctl_table fuse_sysctl_table[] = {
> +        {
> +                .procname       = "max_pages_limit",
> +                .data           = &fuse_max_pages_limit,
> +                .maxlen         = sizeof(fuse_max_pages_limit),
> +                .mode           = 0644,
> +                .proc_handler   = proc_douintvec_minmax,
> +                .extra1         = SYSCTL_ONE,
> +                .extra2         = &sysctl_fuse_max_pages_limit,
> +        },
> +};
> +
> +int fuse_sysctl_register(void)
> +{
> +        fuse_table_header = register_sysctl("fs/fuse", fuse_sysctl_table);
> +        if (!fuse_table_header)
> +                return -ENOMEM;
> +        return 0;
> +}
> +
> +void fuse_sysctl_unregister(void)
> +{
> +        unregister_sysctl_table(fuse_table_header);
> +        fuse_table_header = NULL;
> +}

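The bounds behaviour shown in the shell session of the commit message can be modelled in a few lines of userspace C. This is only a sketch of the range check that the table entry's `extra1`/`extra2` bounds ask for, not the kernel's handler; the names `max_pages_limit`, `set_max_pages_limit` and `replay_commit_message_session` are made up for the example, and it assumes a rejected write leaves the stored value untouched.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Bounds from the patch: 1 up to 65535 (max_pages in fuse_init_out is a u16). */
#define MAX_PAGES_LIMIT_MIN 1u
#define MAX_PAGES_LIMIT_MAX UINT16_MAX

/* Userspace stand-in for the global limit; 256 is the default in the patch. */
unsigned int max_pages_limit = 256;

/*
 * Hypothetical model of a write to the sysctl: out-of-range values are
 * rejected with -EINVAL ("Invalid argument") and change nothing.
 */
int set_max_pages_limit(unsigned long requested)
{
    if (requested < MAX_PAGES_LIMIT_MIN || requested > MAX_PAGES_LIMIT_MAX)
        return -EINVAL;
    max_pages_limit = (unsigned int)requested;
    return 0;
}

/* Replays the session from the commit message against the model. */
void replay_commit_message_session(void)
{
    assert(max_pages_limit == 256);                /* default value */
    assert(set_max_pages_limit(1024) == 0);
    assert(max_pages_limit == 1024);
    assert(set_max_pages_limit(65536) == -EINVAL); /* above the u16 bound */
    assert(set_max_pages_limit(0) == -EINVAL);     /* below the minimum of 1 */
    assert(max_pages_limit == 1024);               /* failed writes changed nothing */
    assert(set_max_pages_limit(65535) == 0);
    assert(max_pages_limit == 65535);
}
```

Calling `replay_commit_message_session()` once on a fresh process walks through the same accept/reject sequence as the `tee` commands above.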
On Thu, Sep 5, 2024 at 2:16 PM Bernd Schubert <bernd.schubert@fastmail.fm> wrote:
>
> Hi Joanne,
>
> On 9/5/24 19:45, Joanne Koong wrote:
> > Introduce the capability to dynamically configure the fuse max pages
> > limit (formerly #defined as FUSE_MAX_MAX_PAGES) through a sysctl.
> > This enhancement allows system administrators to adjust the value
> > based on system-specific requirements.
> >
> > This removes the previous static limit of 256 max pages, which limits
> > the max write size of a request to 1 MiB (on 4096 pagesize systems).
> > Having the ability to up the max write size beyond 1 MiB allows for the
> > perf improvements detailed in this thread [1].
>
> the change itself looks good to me, but have you seen this discussion here?
>
> https://lore.kernel.org/lkml/CAJfpegs10SdtzNXJfj3=vxoAZMhksT5A1u5W5L6nKL-P2UOuLQ@mail.gmail.com/T/
>
>
> Miklos is basically worried about page pinning and accounting for that
> for unprivileged user processes.
>

Thanks for the link to the previous discussion, Bernd. I'll cc Xu and
Jingbo, who started that thread, to this email.

I'm curious whether this sysctl approach might mitigate those worries
here since modifying the sysctl value will require admin privileges to
explicitly opt into this. I liked Sweet Tea's comment

"Perhaps, in analogy to soft and hard limits on pipe size,
FUSE_MAX_MAX_PAGES could be increased and treated as the maximum
possible hard limit for max_write; and the default hard limit could stay
at 1M, thereby allowing folks to opt into the new behavior if they care
about the performance more than the memory?"

where something like this could let admins choose what aspects they'd
like to optimize for.

My understanding of how bigger write buffers affects pinning is that
each chunk of the write will pin a higher number of contiguous pages
at one time (but overall for the duration of the write request, the
number for total pages that get pinned are the same). My understanding
is that the pages only get pinned when we do the copying to the fuse
server's buffer (and is not pinned while the server is servicing the
request).


Thanks,
Joanne

>
> Thanks,
> Bernd

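The point about chunk size versus total pinned pages can be made concrete with a little arithmetic. A minimal sketch, assuming a page-aligned buffered write that is split into requests of at most max_pages pages each (the struct and the function `split_write` are hypothetical, for illustration only):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical summary of how one large write is carved into requests. */
struct write_split {
    uint64_t total_pages; /* pages touched over the whole write */
    uint64_t requests;    /* number of requests needed */
    uint64_t peak_pages;  /* most pages held by any single request */
};

/* Assumes the write starts on a page boundary. */
struct write_split split_write(uint64_t len, uint32_t page_size,
                               uint32_t max_pages)
{
    struct write_split s;

    assert(page_size > 0 && max_pages > 0);

    /* Round up: a partial trailing page still occupies a page. */
    s.total_pages = (len + page_size - 1) / page_size;
    s.requests = (s.total_pages + max_pages - 1) / max_pages;
    s.peak_pages = s.total_pages < max_pages ? s.total_pages : max_pages;
    return s;
}
```

For a 16 MiB write with 4096-byte pages, a limit of 256 yields 16 requests of up to 256 pages, while a limit of 1024 yields 4 requests of up to 1024 pages. In both cases `total_pages` is 4096, which matches the description above: the per-request peak grows, the total does not.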
On Thu, Sep 5, 2024 at 3:32 PM Joanne Koong <joannelkoong@gmail.com> wrote:
>
> On Thu, Sep 5, 2024 at 2:16 PM Bernd Schubert
> <bernd.schubert@fastmail.fm> wrote:
> >
> > Hi Joanne,
> >
> > On 9/5/24 19:45, Joanne Koong wrote:
> > > Introduce the capability to dynamically configure the fuse max pages
> > > limit (formerly #defined as FUSE_MAX_MAX_PAGES) through a sysctl.
> > > This enhancement allows system administrators to adjust the value
> > > based on system-specific requirements.
> > >
> > > This removes the previous static limit of 256 max pages, which limits
> > > the max write size of a request to 1 MiB (on 4096 pagesize systems).
> > > Having the ability to up the max write size beyond 1 MiB allows for the
> > > perf improvements detailed in this thread [1].
> >
> > the change itself looks good to me, but have you seen this discussion here?
> >
> > https://lore.kernel.org/lkml/CAJfpegs10SdtzNXJfj3=vxoAZMhksT5A1u5W5L6nKL-P2UOuLQ@mail.gmail.com/T/
> >
> >
> > Miklos is basically worried about page pinning and accounting for that
> > for unprivileged user processes.
> >
>
> Thanks for the link to the previous discussion, Bernd. I'll cc Xu and
> Jingbo, who started that thread, to this email.
>
> I'm curious whether this sysctl approach might mitigate those worries
> here since modifying the sysctl value will require admin privileges to
> explicitly opt into this. I liked Sweet Tea's comment
>
> "Perhaps, in analogy to soft and hard limits on pipe size,
> FUSE_MAX_MAX_PAGES could be increased and treated as the maximum
> possible hard limit for max_write; and the default hard limit could stay
> at 1M, thereby allowing folks to opt into the new behavior if they care
> about the performance more than the memory?"
>
> where something like this could let admins choose what aspects they'd
> like to optimize for.
>
> My understanding of how bigger write buffers affects pinning is that
> each chunk of the write will pin a higher number of contiguous pages
> at one time (but overall for the duration of the write request, the
> number for total pages that get pinned are the same). My understanding
> is that the pages only get pinned when we do the copying to the fuse
> server's buffer (and is not pinned while the server is servicing the

Just realized my wording of "when we do the copying to the fuse
server's buffer" can be interpreted in different ways. By this I mean
from when we do the "fuse_pages_alloc()" call in fuse_perform_write()
to when we finish fuse_send_write_pages().

> request).
>
>
> Thanks,
> Joanne
>
> >
> > Thanks,
> > Bernd
> >
> >

On 9/6/24 5:16 AM, Bernd Schubert wrote:
> Hi Joanne,
>
> On 9/5/24 19:45, Joanne Koong wrote:
>> Introduce the capability to dynamically configure the fuse max pages
>> limit (formerly #defined as FUSE_MAX_MAX_PAGES) through a sysctl.
>> This enhancement allows system administrators to adjust the value
>> based on system-specific requirements.
>>
>> This removes the previous static limit of 256 max pages, which limits
>> the max write size of a request to 1 MiB (on 4096 pagesize systems).
>> Having the ability to up the max write size beyond 1 MiB allows for the
>> perf improvements detailed in this thread [1].
>
> the change itself looks good to me, but have you seen this discussion here?
>
> https://lore.kernel.org/lkml/CAJfpegs10SdtzNXJfj3=vxoAZMhksT5A1u5W5L6nKL-P2UOuLQ@mail.gmail.com/T/
>
>
> Miklos is basically worried about page pinning and accounting for that
> for unprivileged user processes.

Yeah, this is the blocking issue when attempting to increase the max
pages limit.

For background requests, maybe this could be fixed by taking the max
pages limit into account when calculating max_user_bgreq (currently
max_user_bgreq is calculated "assuming 392 bytes per request").

For synchronous requests, I'm not sure whether this really matters. At
least it won't be as severe, since the pinned memory is limited by the
number of user processes in this case.

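To illustrate the suggestion about max_user_bgreq, here is a rough userspace sketch of what "taking the max pages limit into account" could look like. Only the 392-byte per-request figure comes from the message above; the idea of a byte budget, the per-request cost model (392 bytes plus a full payload of max_pages pages) and both function names are assumptions made for this example, not kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Per-request bookkeeping cost quoted in the thread. */
#define REQ_BOOKKEEPING_BYTES 392u

/* Assumed cost of one background request that carries a full payload. */
static uint64_t per_request_bytes(uint32_t max_pages, uint32_t page_size)
{
    return REQ_BOOKKEEPING_BYTES + (uint64_t)max_pages * page_size;
}

/* Hypothetical: worst-case memory held by `bgreq` full-sized requests. */
uint64_t bgreq_worst_case_bytes(uint32_t bgreq, uint32_t max_pages,
                                uint32_t page_size)
{
    assert(page_size > 0);
    return (uint64_t)bgreq * per_request_bytes(max_pages, page_size);
}

/* Hypothetical: how many such requests fit into a given byte budget. */
uint32_t bgreq_cap_for_budget(uint64_t budget_bytes, uint32_t max_pages,
                              uint32_t page_size)
{
    uint64_t cap;

    assert(page_size > 0);
    cap = budget_bytes / per_request_bytes(max_pages, page_size);
    return cap > UINT32_MAX ? UINT32_MAX : (uint32_t)cap;
}
```

With max_pages set to 0 the model falls back to counting only the 392 bytes of bookkeeping; with 256 pages of 4096 bytes each request is charged 1048968 bytes, so the same budget admits far fewer background requests.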
On 9/6/24 6:32 AM, Joanne Koong wrote:
> On Thu, Sep 5, 2024 at 2:16 PM Bernd Schubert
> <bernd.schubert@fastmail.fm> wrote:
>>
>> Hi Joanne,
>>
>> On 9/5/24 19:45, Joanne Koong wrote:
>>> Introduce the capability to dynamically configure the fuse max pages
>>> limit (formerly #defined as FUSE_MAX_MAX_PAGES) through a sysctl.
>>> This enhancement allows system administrators to adjust the value
>>> based on system-specific requirements.
>>>
>>> This removes the previous static limit of 256 max pages, which limits
>>> the max write size of a request to 1 MiB (on 4096 pagesize systems).
>>> Having the ability to up the max write size beyond 1 MiB allows for the
>>> perf improvements detailed in this thread [1].
>>
>> the change itself looks good to me, but have you seen this discussion here?
>>
>> https://lore.kernel.org/lkml/CAJfpegs10SdtzNXJfj3=vxoAZMhksT5A1u5W5L6nKL-P2UOuLQ@mail.gmail.com/T/
>>
>>
>> Miklos is basically worried about page pinning and accounting for that
>> for unprivileged user processes.
>>
>
> Thanks for the link to the previous discussion, Bernd. I'll cc Xu and
> Jingbo, who started that thread, to this email.
>
> I'm curious whether this sysctl approach might mitigate those worries
> here since modifying the sysctl value will require admin privileges to
> explicitly opt into this. I liked Sweet Tea's comment
>
> "Perhaps, in analogy to soft and hard limits on pipe size,
> FUSE_MAX_MAX_PAGES could be increased and treated as the maximum
> possible hard limit for max_write; and the default hard limit could stay
> at 1M, thereby allowing folks to opt into the new behavior if they care
> about the performance more than the memory?"
>
> where something like this could let admins choose what aspects they'd
> like to optimize for.

The upper bound of the max_pages limit is indeed raised by the sysadmin,
but that doesn't change the fact that an unprivileged fuse server could
still configure max_pages up to that bound and thus cause the memory
pinning issue.

>
> My understanding of how bigger write buffers affects pinning is that
> each chunk of the write will pin a higher number of contiguous pages
> at one time (but overall for the duration of the write request, the
> number for total pages that get pinned are the same). My understanding
> is that the pages only get pinned when we do the copying to the fuse
> server's buffer (and is not pinned while the server is servicing the
> request).
>

I think "possible memory that an unprivileged user is allowed to pin",
as Miklos put it, means the memory that an unprivileged (even malicious)
fuse server could pin?

Hi Miklos, would you mind giving more details on this?

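The concern above can be sketched as a clamp. This is a userspace illustration that assumes the value a server asks for at init time is simply clamped into the range from 1 to the per-connection limit; the function name `negotiate_max_pages` is hypothetical, and the 16-bit parameter reflects the patch's comment that max_pages in fuse_init_out is a u16.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model: the server asks for `requested` pages per request,
 * and the connection grants at least 1 and at most `limit`.
 */
uint32_t negotiate_max_pages(uint16_t requested, uint32_t limit)
{
    uint32_t pages = requested;

    assert(limit >= 1); /* the sysctl never allows a limit of 0 */

    if (pages < 1)
        pages = 1;
    if (pages > limit)
        pages = limit;
    return pages;
}
```

With the default limit of 256, a server that asks for 65535 is granted 256. Once an administrator raises the limit to 65535, the same request is granted in full, whether or not that particular server is privileged, which is the situation described in the reply above.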
Hi Joanne,

kernel test robot noticed the following build warnings:

[auto build test WARNING on mszeredi-fuse/for-next]
[also build test WARNING on linus/master v6.11-rc6 next-20240906]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Joanne-Koong/fuse-Enable-dynamic-configuration-of-fuse-max-pages-limit-FUSE_MAX_MAX_PAGES/20240906-014722
base:   https://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse.git for-next
patch link:    https://lore.kernel.org/r/20240905174541.392785-1-joannelkoong%40gmail.com
patch subject: [PATCH v2 RESEND] fuse: Enable dynamic configuration of fuse max pages limit (FUSE_MAX_MAX_PAGES)
config: arm-randconfig-003-20240907 (https://download.01.org/0day-ci/archive/20240907/202409070207.bAwolF9v-lkp@intel.com/config)
compiler: arm-linux-gnueabi-gcc (GCC) 14.1.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240907/202409070207.bAwolF9v-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202409070207.bAwolF9v-lkp@intel.com/

All warnings (new ones prefixed by >>):

   fs/fuse/sysctl.c:28:30: error: macro "fuse_sysctl_register" passed 1 arguments, but takes just 0
      28 | int fuse_sysctl_register(void)
         |                              ^
   In file included from fs/fuse/sysctl.c:9:
   fs/fuse/fuse_i.h:1473:9: note: macro "fuse_sysctl_register" defined here
    1473 | #define fuse_sysctl_register()          (0)
         |         ^~~~~~~~~~~~~~~~~~~~
   fs/fuse/sysctl.c:29:1: error: expected '=', ',', ';', 'asm' or '__attribute__' before '{' token
      29 | {
         | ^
   fs/fuse/sysctl.c:36:33: error: macro "fuse_sysctl_unregister" passed 1 arguments, but takes just 0
      36 | void fuse_sysctl_unregister(void)
         |                                 ^
   fs/fuse/fuse_i.h:1474:9: note: macro "fuse_sysctl_unregister" defined here
    1474 | #define fuse_sysctl_unregister()        do { } while (0)
         |         ^~~~~~~~~~~~~~~~~~~~~~
   fs/fuse/sysctl.c:37:1: error: expected '=', ',', ';', 'asm' or '__attribute__' before '{' token
      37 | {
         | ^
>> fs/fuse/sysctl.c:16:25: warning: 'fuse_sysctl_table' defined but not used [-Wunused-variable]
      16 | static struct ctl_table fuse_sysctl_table[] = {
         |                         ^~~~~~~~~~~~~~~~~
>> fs/fuse/sysctl.c:11:33: warning: 'fuse_table_header' defined but not used [-Wunused-variable]
      11 | static struct ctl_table_header *fuse_table_header;
         |                                 ^~~~~~~~~~~~~~~~~


vim +/fuse_sysctl_table +16 fs/fuse/sysctl.c

    10
  > 11  static struct ctl_table_header *fuse_table_header;
    12
    13  /* Bound by fuse_init_out max_pages, which is a u16 */
    14  static unsigned int sysctl_fuse_max_pages_limit = 65535;
    15
  > 16  static struct ctl_table fuse_sysctl_table[] = {
    17          {
    18                  .procname       = "max_pages_limit",
    19                  .data           = &fuse_max_pages_limit,
    20                  .maxlen         = sizeof(fuse_max_pages_limit),
    21                  .mode           = 0644,
    22                  .proc_handler   = proc_douintvec_minmax,
    23                  .extra1         = SYSCTL_ONE,
    24                  .extra2         = &sysctl_fuse_max_pages_limit,
    25          },
    26  };
    27

Hi Joanne,

kernel test robot noticed the following build errors:

[auto build test ERROR on mszeredi-fuse/for-next]
[also build test ERROR on linus/master v6.11-rc6 next-20240906]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Joanne-Koong/fuse-Enable-dynamic-configuration-of-fuse-max-pages-limit-FUSE_MAX_MAX_PAGES/20240906-014722
base:   https://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse.git for-next
patch link:    https://lore.kernel.org/r/20240905174541.392785-1-joannelkoong%40gmail.com
patch subject: [PATCH v2 RESEND] fuse: Enable dynamic configuration of fuse max pages limit (FUSE_MAX_MAX_PAGES)
config: arm-randconfig-004-20240907 (https://download.01.org/0day-ci/archive/20240907/202409070339.Pg9v3MA4-lkp@intel.com/config)
compiler: clang version 16.0.6 (https://github.com/llvm/llvm-project 7cbf1a2591520c2491aa35339f227775f4d3adf6)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240907/202409070339.Pg9v3MA4-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202409070339.Pg9v3MA4-lkp@intel.com/

All errors (new ones prefixed by >>):

>> fs/fuse/sysctl.c:28:26: error: too many arguments provided to function-like macro invocation
   int fuse_sysctl_register(void)
                            ^
   fs/fuse/fuse_i.h:1473:9: note: macro 'fuse_sysctl_register' defined here
   #define fuse_sysctl_register()          (0)
           ^
>> fs/fuse/sysctl.c:28:25: error: expected ';' after top level declarator
   int fuse_sysctl_register(void)
                           ^
                           ;
   fs/fuse/sysctl.c:36:29: error: too many arguments provided to function-like macro invocation
   void fuse_sysctl_unregister(void)
                               ^
   fs/fuse/fuse_i.h:1474:9: note: macro 'fuse_sysctl_unregister' defined here
   #define fuse_sysctl_unregister()        do { } while (0)
           ^
   3 errors generated.


vim +28 fs/fuse/sysctl.c

    27
  > 28  int fuse_sysctl_register(void)
    29  {
    30          fuse_table_header = register_sysctl("fs/fuse", fuse_sysctl_table);
    31          if (!fuse_table_header)
    32                  return -ENOMEM;
    33          return 0;
    34  }
    35

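Both reports point at the same clash: in these configurations the header supplies function-like macros for `fuse_sysctl_register`/`fuse_sysctl_unregister`, yet `sysctl.c` is still compiled, so its function definitions are swallowed by the macros. One common way to avoid this kind of clash, shown here as a self-contained userspace sketch and not as what a later revision of the patch actually did, is to use static inline stubs and to compile the real definitions only when the option is enabled. All `demo_` names and the `DEMO_CONFIG_SYSCTL` switch are made up for the example.

```c
#include <assert.h>

/* Stand-in for the kernel option; build with -DDEMO_CONFIG_SYSCTL to enable. */
#ifdef DEMO_CONFIG_SYSCTL
int demo_sysctl_register(void);
void demo_sysctl_unregister(void);
#else
/* Inline stubs are real functions, so no macro can eat a later definition. */
static inline int demo_sysctl_register(void) { return 0; }
static inline void demo_sysctl_unregister(void) { }
#endif

#ifdef DEMO_CONFIG_SYSCTL
/* The real definitions exist only when the option is on. */
static int demo_registered;

int demo_sysctl_register(void)
{
    demo_registered = 1;
    return 0;
}

void demo_sysctl_unregister(void)
{
    demo_registered = 0;
}
#endif

/* Callers look the same in both configurations. */
int demo_init(void)
{
    int err = demo_sysctl_register();

    if (err)
        return err;
    return 0;
}

void demo_cleanup(void)
{
    demo_sysctl_unregister();
}

/* Load then unload, as a module would; works with or without the option. */
void demo_load_unload(void)
{
    int err = demo_init();

    assert(err == 0);
    demo_cleanup();
}
```

The sketch compiles both with and without `-DDEMO_CONFIG_SYSCTL`. In the kernel tree, an equivalent effect is often achieved by building the object file only when the option is set.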
Hi Joanne,

kernel test robot noticed the following build errors:

[auto build test ERROR on mszeredi-fuse/for-next]
[also build test ERROR on linus/master v6.11-rc6 next-20240906]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Joanne-Koong/fuse-Enable-dynamic-configuration-of-fuse-max-pages-limit-FUSE_MAX_MAX_PAGES/20240906-014722
base:   https://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse.git for-next
patch link:    https://lore.kernel.org/r/20240905174541.392785-1-joannelkoong%40gmail.com
patch subject: [PATCH v2 RESEND] fuse: Enable dynamic configuration of fuse max pages limit (FUSE_MAX_MAX_PAGES)
config: hexagon-randconfig-001-20240907 (https://download.01.org/0day-ci/archive/20240907/202409070557.u6rI9J9K-lkp@intel.com/config)
compiler: clang version 20.0.0git (https://github.com/llvm/llvm-project 05f5a91d00b02f4369f46d076411c700755ae041)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240907/202409070557.u6rI9J9K-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202409070557.u6rI9J9K-lkp@intel.com/

All errors (new ones prefixed by >>):

   In file included from fs/fuse/sysctl.c:9:
   In file included from fs/fuse/fuse_i.h:22:
   In file included from include/linux/mm.h:2228:
   include/linux/vmstat.h:514:36: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
     514 |         return node_stat_name(NR_LRU_BASE + lru) + 3; // skip "nr_"
         |                               ~~~~~~~~~~~ ^ ~~~
   In file included from fs/fuse/sysctl.c:9:
   In file included from fs/fuse/fuse_i.h:23:
   In file included from include/linux/backing-dev.h:16:
   In file included from include/linux/writeback.h:13:
   In file included from include/linux/blk_types.h:10:
   In file included from include/linux/bvec.h:10:
   In file included from include/linux/highmem.h:12:
   In file included from include/linux/hardirq.h:11:
   In file included from ./arch/hexagon/include/generated/asm/hardirq.h:1:
   In file included from include/asm-generic/hardirq.h:17:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:14:
   In file included from arch/hexagon/include/asm/io.h:328:
   include/asm-generic/io.h:548:31: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     548 |         val = __raw_readb(PCI_IOBASE + addr);
         |                           ~~~~~~~~~~ ^
   include/asm-generic/io.h:561:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     561 |         val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr));
         |                                                         ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/little_endian.h:37:51: note: expanded from macro '__le16_to_cpu'
      37 | #define __le16_to_cpu(x) ((__force __u16)(__le16)(x))
         |                                                   ^
   In file included from fs/fuse/sysctl.c:9:
   In file included from fs/fuse/fuse_i.h:23:
   In file included from include/linux/backing-dev.h:16:
   In file included from include/linux/writeback.h:13:
   In file included from include/linux/blk_types.h:10:
   In file included from include/linux/bvec.h:10:
   In file included from include/linux/highmem.h:12:
   In file included from include/linux/hardirq.h:11:
   In file included from ./arch/hexagon/include/generated/asm/hardirq.h:1:
   In file included from include/asm-generic/hardirq.h:17:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:14:
   In file included from arch/hexagon/include/asm/io.h:328:
   include/asm-generic/io.h:574:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     574 |         val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
         |                                                         ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/little_endian.h:35:51: note: expanded from macro '__le32_to_cpu'
      35 | #define __le32_to_cpu(x) ((__force __u32)(__le32)(x))
         |                                                   ^
   In file included from fs/fuse/sysctl.c:9:
   In file included from fs/fuse/fuse_i.h:23:
   In file included from include/linux/backing-dev.h:16:
   In file included from include/linux/writeback.h:13:
   In file included from include/linux/blk_types.h:10:
   In file included from include/linux/bvec.h:10:
   In file included from include/linux/highmem.h:12:
   In file included from include/linux/hardirq.h:11:
   In file included from ./arch/hexagon/include/generated/asm/hardirq.h:1:
   In file included from include/asm-generic/hardirq.h:17:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:14:
   In file included from arch/hexagon/include/asm/io.h:328:
   include/asm-generic/io.h:585:33: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     585 |         __raw_writeb(value, PCI_IOBASE + addr);
         |                             ~~~~~~~~~~ ^
   include/asm-generic/io.h:595:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     595 |         __raw_writew((u16 __force)cpu_to_le16(value), PCI_IOBASE + addr);
         |                                                       ~~~~~~~~~~ ^
   include/asm-generic/io.h:605:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     605 |         __raw_writel((u32 __force)cpu_to_le32(value), PCI_IOBASE + addr);
         |                                                       ~~~~~~~~~~ ^
   fs/fuse/sysctl.c:28:26: error: too many arguments provided to function-like macro invocation
      28 | int fuse_sysctl_register(void)
         |                          ^
   fs/fuse/fuse_i.h:1473:9: note: macro 'fuse_sysctl_register' defined here
    1473 | #define fuse_sysctl_register() (0)
         |         ^
   fs/fuse/sysctl.c:28:25: error: expected ';' after top level declarator
      28 | int fuse_sysctl_register(void)
         |                         ^
         |                         ;
   fs/fuse/sysctl.c:36:29: error: too many arguments provided to function-like macro invocation
      36 | void fuse_sysctl_unregister(void)
         |                             ^
   fs/fuse/fuse_i.h:1474:9: note: macro 'fuse_sysctl_unregister' defined here
    1474 | #define fuse_sysctl_unregister() do { } while (0)
         |         ^
>> fs/fuse/sysctl.c:36:6: error: variable has incomplete type 'void'
      36 | void fuse_sysctl_unregister(void)
         |      ^
   fs/fuse/sysctl.c:36:28: error: expected ';' after top level declarator
      36 | void fuse_sysctl_unregister(void)
         |                            ^
         |                            ;
   7 warnings and 5 errors generated.


vim +/void +36 fs/fuse/sysctl.c

    35	
  > 36	void fuse_sysctl_unregister(void)
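Both reports point at the same collision: with CONFIG_SYSCTL disabled, fs/fuse/fuse_i.h turns fuse_sysctl_register() and fuse_sysctl_unregister() into function-like macros, while fs/fuse/sysctl.c is still built and its function definitions get macro-expanded. The following is a minimal userspace sketch of one way to avoid that, assuming the definitions are wrapped in the same #ifdef as the declarations (building sysctl.o only when CONFIG_SYSCTL is set would have the same effect). The names sysctl_registered and fuse_fs_init_sketch() are hypothetical and exist only for illustration; this is not necessarily the fix the author chose.

```c
/* Header side, mirroring the stubs from the patch's fuse_i.h hunk. */
#ifdef CONFIG_SYSCTL
int fuse_sysctl_register(void);
void fuse_sysctl_unregister(void);
#else
#define fuse_sysctl_register()   (0)
#define fuse_sysctl_unregister() do { } while (0)
#endif

/*
 * Source side: the real definitions exist only when CONFIG_SYSCTL is set,
 * so the macros above never get a chance to expand inside them.
 */
#ifdef CONFIG_SYSCTL
static int sysctl_registered; /* hypothetical stand-in for the table header */

int fuse_sysctl_register(void)
{
	sysctl_registered = 1;
	return 0;
}

void fuse_sysctl_unregister(void)
{
	sysctl_registered = 0;
}
#endif

/* Hypothetical caller shaped like the init/cleanup path in the patch. */
static int fuse_fs_init_sketch(void)
{
	int err = fuse_sysctl_register();

	if (err)
		return err;
	fuse_sysctl_unregister();
	return 0;
}
```

The sketch compiles both with and without -DCONFIG_SYSCTL, which is the property the randconfig builds above were checking.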
diff --git a/Documentation/admin-guide/sysctl/fs.rst b/Documentation/admin-guide/sysctl/fs.rst
index 47499a1742bd..fa25d7e718b3 100644
--- a/Documentation/admin-guide/sysctl/fs.rst
+++ b/Documentation/admin-guide/sysctl/fs.rst
@@ -332,3 +332,13 @@ Each "watch" costs roughly 90 bytes on a 32-bit kernel, and roughly 160 bytes
 on a 64-bit one.
 The current default value for ``max_user_watches`` is 4% of the
 available low memory, divided by the "watch" cost in bytes.
+
+5. /proc/sys/fs/fuse - Configuration options for FUSE filesystems
+=====================================================================
+
+This directory contains the following configuration options for FUSE
+filesystems:
+
+``/proc/sys/fs/fuse/max_pages_limit`` is a read/write file for
+setting/getting the maximum number of pages that can be used for servicing
+requests in FUSE.
diff --git a/fs/fuse/Makefile b/fs/fuse/Makefile
index 6e0228c6d0cb..cd4ef3e08ebf 100644
--- a/fs/fuse/Makefile
+++ b/fs/fuse/Makefile
@@ -7,7 +7,7 @@ obj-$(CONFIG_FUSE_FS) += fuse.o
 obj-$(CONFIG_CUSE) += cuse.o
 obj-$(CONFIG_VIRTIO_FS) += virtiofs.o
 
-fuse-y := dev.o dir.o file.o inode.o control.o xattr.o acl.o readdir.o ioctl.o
+fuse-y := dev.o dir.o file.o inode.o control.o xattr.o acl.o readdir.o ioctl.o sysctl.o
 fuse-y += iomode.o
 fuse-$(CONFIG_FUSE_DAX) += dax.o
 fuse-$(CONFIG_FUSE_PASSTHROUGH) += passthrough.o
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index f23919610313..bb252a3ea37b 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -35,9 +35,6 @@
 /** Default max number of pages that can be used in a single read request */
 #define FUSE_DEFAULT_MAX_PAGES_PER_REQ 32
 
-/** Maximum of max_pages received in init_out */
-#define FUSE_MAX_MAX_PAGES 256
-
 /** Bias for fi->writectr, meaning new writepages must not be sent */
 #define FUSE_NOWRITE INT_MIN
 
@@ -47,6 +44,9 @@
 /** Number of dentries for each connection in the control filesystem */
 #define FUSE_CTL_NUM_DENTRIES 5
 
+/** Maximum of max_pages received in init_out */
+extern unsigned int fuse_max_pages_limit;
+
 /** List of active connections */
 extern struct list_head fuse_conn_list;
 
@@ -1472,4 +1472,12 @@ ssize_t fuse_passthrough_splice_write(struct pipe_inode_info *pipe,
 				      size_t len, unsigned int flags);
 ssize_t fuse_passthrough_mmap(struct file *file, struct vm_area_struct *vma);
 
+#ifdef CONFIG_SYSCTL
+extern int fuse_sysctl_register(void);
+extern void fuse_sysctl_unregister(void);
+#else
+#define fuse_sysctl_register() (0)
+#define fuse_sysctl_unregister() do { } while (0)
+#endif /* CONFIG_SYSCTL */
+
 #endif /* _FS_FUSE_I_H */
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index 99e44ea7d875..973e58df816a 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -35,6 +35,8 @@ DEFINE_MUTEX(fuse_mutex);
 
 static int set_global_limit(const char *val, const struct kernel_param *kp);
 
+unsigned int fuse_max_pages_limit = 256;
+
 unsigned max_user_bgreq;
 module_param_call(max_user_bgreq, set_global_limit, param_get_uint,
 		  &max_user_bgreq, 0644);
@@ -932,7 +934,7 @@ void fuse_conn_init(struct fuse_conn *fc, struct fuse_mount *fm,
 	fc->pid_ns = get_pid_ns(task_active_pid_ns(current));
 	fc->user_ns = get_user_ns(user_ns);
 	fc->max_pages = FUSE_DEFAULT_MAX_PAGES_PER_REQ;
-	fc->max_pages_limit = FUSE_MAX_MAX_PAGES;
+	fc->max_pages_limit = fuse_max_pages_limit;
 
 	if (IS_ENABLED(CONFIG_FUSE_PASSTHROUGH))
 		fuse_backing_files_init(fc);
@@ -2039,8 +2041,14 @@ static int __init fuse_fs_init(void)
 	if (err)
 		goto out3;
 
+	err = fuse_sysctl_register();
+	if (err)
+		goto out4;
+
 	return 0;
 
+ out4:
+	unregister_filesystem(&fuse_fs_type);
  out3:
 	unregister_fuseblk();
  out2:
@@ -2053,6 +2061,7 @@ static void fuse_fs_cleanup(void)
 {
 	unregister_filesystem(&fuse_fs_type);
 	unregister_fuseblk();
+	fuse_sysctl_unregister();
 
 	/*
 	 * Make sure all delayed rcu free inodes are flushed before we
diff --git a/fs/fuse/ioctl.c b/fs/fuse/ioctl.c
index 572ce8a82ceb..a6c8ee551635 100644
--- a/fs/fuse/ioctl.c
+++ b/fs/fuse/ioctl.c
@@ -10,6 +10,8 @@
 #include <linux/fileattr.h>
 #include <linux/fsverity.h>
 
+#define FUSE_VERITY_ENABLE_ARG_MAX_PAGES 256
+
 static ssize_t fuse_send_ioctl(struct fuse_mount *fm, struct fuse_args *args,
 			       struct fuse_ioctl_out *outarg)
 {
@@ -140,7 +142,7 @@ static int fuse_setup_enable_verity(unsigned long arg, struct iovec *iov,
 {
 	struct fsverity_enable_arg enable;
 	struct fsverity_enable_arg __user *uarg = (void __user *)arg;
-	const __u32 max_buffer_len = FUSE_MAX_MAX_PAGES * PAGE_SIZE;
+	const __u32 max_buffer_len = FUSE_VERITY_ENABLE_ARG_MAX_PAGES * PAGE_SIZE;
 
 	if (copy_from_user(&enable, uarg, sizeof(enable)))
 		return -EFAULT;
diff --git a/fs/fuse/sysctl.c b/fs/fuse/sysctl.c
new file mode 100644
index 000000000000..b272bb333005
--- /dev/null
+++ b/fs/fuse/sysctl.c
@@ -0,0 +1,40 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * linux/fs/fuse/fuse_sysctl.c
+ *
+ * Sysctl interface to fuse parameters
+ */
+#include <linux/sysctl.h>
+
+#include "fuse_i.h"
+
+static struct ctl_table_header *fuse_table_header;
+
+/* Bound by fuse_init_out max_pages, which is a u16 */
+static unsigned int sysctl_fuse_max_pages_limit = 65535;
+
+static struct ctl_table fuse_sysctl_table[] = {
+	{
+		.procname = "max_pages_limit",
+		.data = &fuse_max_pages_limit,
+		.maxlen = sizeof(fuse_max_pages_limit),
+		.mode = 0644,
+		.proc_handler = proc_douintvec_minmax,
+		.extra1 = SYSCTL_ONE,
+		.extra2 = &sysctl_fuse_max_pages_limit,
+	},
+};
+
+int fuse_sysctl_register(void)
+{
+	fuse_table_header = register_sysctl("fs/fuse", fuse_sysctl_table);
+	if (!fuse_table_header)
+		return -ENOMEM;
+	return 0;
+}
+
+void fuse_sysctl_unregister(void)
+{
+	unregister_sysctl_table(fuse_table_header);
+	fuse_table_header = NULL;
+}
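The table entry bounds writes with extra1 = SYSCTL_ONE and extra2 pointing at 65535, so values outside that range are rejected, which matches the "Invalid argument" results for 0 and 65536 in the patch description. Below is a rough userspace model of that range check and of the request-size arithmetic from the changelog. It assumes a plain helper rather than the kernel's proc_douintvec_minmax; model_set_max_pages_limit(), model_max_pages_limit, and max_request_bytes() are hypothetical names introduced only for this sketch.

```c
#include <errno.h>

/* Bounds taken from the patch: SYSCTL_ONE and the u16 ceiling of max_pages. */
#define MODEL_LIMIT_MIN 1UL
#define MODEL_LIMIT_MAX 65535UL

/* Models the global from fs/fuse/inode.c, including its default of 256. */
static unsigned int model_max_pages_limit = 256;

/* Hypothetical helper: accept val only if it lies in [1, 65535]. */
static int model_set_max_pages_limit(unsigned long val)
{
	if (val < MODEL_LIMIT_MIN || val > MODEL_LIMIT_MAX)
		return -EINVAL; /* rejected, stored value is left unchanged */
	model_max_pages_limit = (unsigned int)val;
	return 0;
}

/* Largest request payload, in bytes, for a given page count and page size. */
static unsigned long long max_request_bytes(unsigned int pages,
					    unsigned long page_size)
{
	return (unsigned long long)pages * page_size;
}
```

With the default of 256 pages and a 4096-byte page size, max_request_bytes() yields 1048576 bytes (1 MiB), the static limit described in the changelog; a limit of 1024 pages yields 4 MiB.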