Message ID | 7-v6-a196d26f289e+11787-iommufd_jgg@nvidia.com (mailing list archive) |
---|---|
State | New, archived |
Series | IOMMUFD Generic interface |
On Tue, Nov 29, 2022 at 04:29:30PM -0400, Jason Gunthorpe wrote:
> Following the pattern of io_uring, perf, skb, and bpf, iommfd will use
> user->locked_vm for accounting pinned pages. Ensure the value is included
> in the struct and export free_uid() as iommufd is modular.
>
> user->locked_vm is the good accounting to use for ulimit because it is
> per-user, and the security sandboxing of locked pages is not supposed to
> be per-process. Other places (vfio, vdpa and infiniband) have used
> mm->pinned_vm and/or mm->locked_vm for accounting pinned pages, but this
> is only per-process and inconsistent with the new FOLL_LONGTERM users in
> the kernel.
>
> Concurrent work is underway to try to put this in a cgroup, so everything
> can be consistent and the kernel can provide a FOLL_LONGTERM limit that
> actually provides security.
>
> Tested-by: Nicolin Chen <nicolinc@nvidia.com>
> Tested-by: Yi Liu <yi.l.liu@intel.com>
> Tested-by: Lixiao Yang <lixiao.yang@intel.com>
> Tested-by: Matthew Rosato <mjrosato@linux.ibm.com>
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> Reviewed-by: Eric Auger <eric.auger@redhat.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

Just curious: why does the subject say "user::locked_vm"? As opposed to
user->locked_vm? Made me think it's somehow related to rust in kernel or
whatever.

> ---
>  include/linux/sched/user.h | 2 +-
>  kernel/user.c              | 1 +
>  2 files changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/sched/user.h b/include/linux/sched/user.h
> index f054d0360a7533..4cc52698e214e2 100644
> --- a/include/linux/sched/user.h
> +++ b/include/linux/sched/user.h
> @@ -25,7 +25,7 @@ struct user_struct {
>
>  #if defined(CONFIG_PERF_EVENTS) || defined(CONFIG_BPF_SYSCALL) || \
>      defined(CONFIG_NET) || defined(CONFIG_IO_URING) || \
> -    defined(CONFIG_VFIO_PCI_ZDEV_KVM)
> +    defined(CONFIG_VFIO_PCI_ZDEV_KVM) || IS_ENABLED(CONFIG_IOMMUFD)
>  	atomic_long_t locked_vm;
>  #endif
>  #ifdef CONFIG_WATCH_QUEUE
> diff --git a/kernel/user.c b/kernel/user.c
> index e2cf8c22b539a7..d667debeafd609 100644
> --- a/kernel/user.c
> +++ b/kernel/user.c
> @@ -185,6 +185,7 @@ void free_uid(struct user_struct *up)
>  	if (refcount_dec_and_lock_irqsave(&up->__count, &uidhash_lock, &flags))
>  		free_user(up, flags);
>  }
> +EXPORT_SYMBOL_GPL(free_uid);
>
>  struct user_struct *alloc_uid(kuid_t uid)
>  {
> --
> 2.38.1
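
The per-user accounting described in the quoted commit message can be illustrated with a small user-space sketch. This is not the iommufd code: struct user_acct, charge_locked_vm() and uncharge_locked_vm() are hypothetical names, the limit is passed in as a plain page count, and C11 atomics stand in for the kernel's atomic_long_t helpers. The point it tries to show is that one counter shared by all of a user's processes bounds the total, whereas a per-process counter would let each process pin up to the limit on its own.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical stand-in for struct user_struct: one counter shared by
 * every process that belongs to the same user. */
struct user_acct {
	atomic_long locked_vm;	/* pages currently charged to this user */
};

/*
 * Try to charge npages against the user's limit (both in pages).
 * Returns 0 on success, -ENOMEM if the charge would exceed the limit.
 */
static int charge_locked_vm(struct user_acct *user, long npages, long limit)
{
	long cur = atomic_load(&user->locked_vm);
	long next;

	do {
		next = cur + npages;
		if (next > limit)
			return -ENOMEM;
		/* On failure the compare-exchange reloads cur and we retry. */
	} while (!atomic_compare_exchange_weak(&user->locked_vm, &cur, next));

	return 0;
}

/* Give back npages that were charged earlier. */
static void uncharge_locked_vm(struct user_acct *user, long npages)
{
	atomic_fetch_sub(&user->locked_vm, npages);
}

/* Two "processes" of the same user draw from a single budget. */
static void demo_shared_limit(void)
{
	struct user_acct user;
	const long limit = 10;

	atomic_init(&user.locked_vm, 0);

	/* Process A pins 6 pages. */
	assert(charge_locked_vm(&user, 6, limit) == 0);
	/* Process B, same user, is refused because 6 + 6 > 10. */
	assert(charge_locked_vm(&user, 6, limit) == -ENOMEM);

	uncharge_locked_vm(&user, 6);
}
```

With a per-process counter (the mm->pinned_vm style mentioned in the commit message), the second charge in demo_shared_limit() would succeed, because each process would start from zero.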

On Tue, Nov 29, 2022 at 03:42:23PM -0500, Michael S. Tsirkin wrote:
> Just curious: why does the subject say "user::locked_vm"? As opposed to
> user->locked_vm? Made me think it's somehow related to rust in kernel or
> whatever.

:: is the C++ way to say "member of a type", I suppose it is a typo
and should be user_struct::locked_vm

The use of -> otherwise was to have some clarity about mm vs user
structs.

Jason

On Tue, Nov 29, 2022 at 04:48:16PM -0400, Jason Gunthorpe wrote:
> On Tue, Nov 29, 2022 at 03:42:23PM -0500, Michael S. Tsirkin wrote:
> > Just curious: why does the subject say "user::locked_vm"? As opposed to
> > user->locked_vm? Made me think it's somehow related to rust in kernel or
> > whatever.
>
> :: is the C++ way to say "member of a type", I suppose it is a typo
> and should be user_struct::locked_vm
>
> The use of -> otherwise was to have some clarity about mm vs user
> structs.
>
> Jason

I note that commit log says user->locked_vm and that's clear enough IMHO,
I'd leave C++ alone - IIRC yes you can write ptr->type::field but no one
does so it's not idiomatic, :: is more commonly used with static members
there. So this confuses more than it clarifies.

But whatever, hardly a blocker. Feel free to ignore.

diff --git a/include/linux/sched/user.h b/include/linux/sched/user.h
index f054d0360a7533..4cc52698e214e2 100644
--- a/include/linux/sched/user.h
+++ b/include/linux/sched/user.h
@@ -25,7 +25,7 @@ struct user_struct {
 
 #if defined(CONFIG_PERF_EVENTS) || defined(CONFIG_BPF_SYSCALL) || \
     defined(CONFIG_NET) || defined(CONFIG_IO_URING) || \
-    defined(CONFIG_VFIO_PCI_ZDEV_KVM)
+    defined(CONFIG_VFIO_PCI_ZDEV_KVM) || IS_ENABLED(CONFIG_IOMMUFD)
 	atomic_long_t locked_vm;
 #endif
 #ifdef CONFIG_WATCH_QUEUE
diff --git a/kernel/user.c b/kernel/user.c
index e2cf8c22b539a7..d667debeafd609 100644
--- a/kernel/user.c
+++ b/kernel/user.c
@@ -185,6 +185,7 @@ void free_uid(struct user_struct *up)
 	if (refcount_dec_and_lock_irqsave(&up->__count, &uidhash_lock, &flags))
 		free_user(up, flags);
 }
+EXPORT_SYMBOL_GPL(free_uid);
 
 struct user_struct *alloc_uid(kuid_t uid)
 {
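
The hunk above uses IS_ENABLED(CONFIG_IOMMUFD) where the neighbouring options use defined(). A likely reason, given that the commit message calls iommufd modular, is that an option built as a module (=m) only gets its CONFIG_..._MODULE macro defined, so a plain defined(CONFIG_IOMMUFD) test would be false in that configuration. The stand-alone sketch below imitates that behaviour in user space. The DEMO_* macros are hypothetical, simplified look-alikes of the kernel's helpers, and the two #define lines at the top simulate what a generated config header would contain.

```c
#include <assert.h>

/* Simulated configuration: IOMMUFD built as a module (=m), NET built in (=y).
 * For =m, only the _MODULE variant of the symbol is defined. */
#define CONFIG_IOMMUFD_MODULE 1
#define CONFIG_NET 1

/* Hypothetical, simplified version of the preprocessor trick: if the option
 * expands to 1, the placeholder inserts "0," and shifts a 1 into the second
 * argument slot; otherwise the second argument stays 0. */
#define DEMO_PLACEHOLDER_1 0,
#define DEMO_SECOND(ignored, val, ...) val
#define DEMO_IS_DEFINED(x) DEMO_IS_DEFINED_(x)
#define DEMO_IS_DEFINED_(val) DEMO_IS_DEFINED__(DEMO_PLACEHOLDER_##val)
#define DEMO_IS_DEFINED__(arg1_or_junk) DEMO_SECOND(arg1_or_junk 1, 0, 0)

/* True when the option is built in (=y) or built as a module (=m). */
#define DEMO_IS_ENABLED(opt) \
	(DEMO_IS_DEFINED(opt) || DEMO_IS_DEFINED(opt##_MODULE))

/* What a plain defined() test sees for the modular option. */
#if defined(CONFIG_IOMMUFD)
#define SEEN_BY_DEFINED 1
#else
#define SEEN_BY_DEFINED 0
#endif

/* What the IS_ENABLED-style test sees for the same option. */
#if DEMO_IS_ENABLED(CONFIG_IOMMUFD)
#define SEEN_BY_IS_ENABLED 1
#else
#define SEEN_BY_IS_ENABLED 0
#endif

/* Returns 1 when only the IS_ENABLED-style test notices the modular option. */
static int modular_option_needs_is_enabled(void)
{
	return SEEN_BY_DEFINED == 0 && SEEN_BY_IS_ENABLED == 1;
}
```

In this simulated configuration, a guard written with defined(CONFIG_IOMMUFD) would leave locked_vm out of the struct even though the module needs it, while the IS_ENABLED-style guard keeps it in.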