diff mbox series

RDMA/umem: Minor optimization

Message ID d9bf7632363df45eced5407855545176b81a1b0e.1537474903.git.dledford@redhat.com (mailing list archive)
State Superseded
Headers show
Series RDMA/umem: Minor optimization | expand

Commit Message

Doug Ledford Sept. 20, 2018, 8:23 p.m. UTC
Noticed while reviewing d4b4dd1b9706 ("RDMA/umem: Do not use
current->tgid to track the mm_struct") patch.  Why would we take a lock,
adjust a protected variable, drop the lock, and *then* check the input
into our protected variable adjustment?  Then we have to take the lock
again on our error unwind.  Let's just check the input early and skip
taking the locks needlessly if the input isn't valid.

Signed-off-by: Doug Ledford <dledford@redhat.com>
---
 drivers/infiniband/core/umem.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

Comments

Parav Pandit Sept. 20, 2018, 8:51 p.m. UTC | #1
> -----Original Message-----
> From: linux-rdma-owner@vger.kernel.org <linux-rdma-
> owner@vger.kernel.org> On Behalf Of Doug Ledford
> Sent: Thursday, September 20, 2018 3:23 PM
> To: linux-rdma@vger.kernel.org
> Cc: Doug Ledford <dledford@redhat.com>
> Subject: [PATCH] RDMA/umem: Minor optimization
> 
> Noticed while reviewing d4b4dd1b9706 ("RDMA/umem: Do not use
> current->tgid to track the mm_struct") patch.  Why would we take a lock,
> adjust a protected variable, drop the lock, and *then* check the input into
> our protected variable adjustment?  Then we have to take the lock again on
> our error unwind.  Let's just check the input early and skip taking the locks
> needlessly if the input isn't valid.
> 
> Signed-off-by: Doug Ledford <dledford@redhat.com>
> ---
>  drivers/infiniband/core/umem.c | 9 ++++-----
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/infiniband/core/umem.c
> b/drivers/infiniband/core/umem.c index c32a3e27a896..34d7256f5ba0
> 100644
> --- a/drivers/infiniband/core/umem.c
> +++ b/drivers/infiniband/core/umem.c
> @@ -147,6 +147,10 @@ struct ib_umem *ib_umem_get(struct ib_ucontext
> *context, unsigned long addr,
>  		umem->hugetlb = 0;
> 
>  	npages = ib_umem_num_pages(umem);
> +	if (npages == 0 || npages > UINT_MAX) {
> +		ret = -EINVAL;
> +		goto out;
> +	}
> 
>  	lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
> 
> @@ -161,11 +165,6 @@ struct ib_umem *ib_umem_get(struct ib_ucontext
> *context, unsigned long addr,
> 
>  	cur_base = addr & PAGE_MASK;
> 
> -	if (npages == 0 || npages > UINT_MAX) {
> -		ret = -EINVAL;
> -		goto vma;
> -	}
> -
>  	ret = sg_alloc_table(&umem->sg_head, npages, GFP_KERNEL);
>  	if (ret)
>  		goto vma;
> --
> 2.17.1
Reviewed-by: Parav Pandit <parav@mellanox.com>
Jason Gunthorpe Sept. 20, 2018, 8:52 p.m. UTC | #2
On Thu, Sep 20, 2018 at 04:23:00PM -0400, Doug Ledford wrote:
> Noticed while reviewing d4b4dd1b9706 ("RDMA/umem: Do not use
> current->tgid to track the mm_struct") patch.  Why would we take a lock,
> adjust a protected variable, drop the lock, and *then* check the input
> into our protected variable adjustment?  Then we have to take the lock
> again on our error unwind.  Let's just check the input early and skip
> taking the locks needlessly if the input isn't valid.
>
> Signed-off-by: Doug Ledford <dledford@redhat.com>
>  drivers/infiniband/core/umem.c | 9 ++++-----
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
> index c32a3e27a896..34d7256f5ba0 100644
> +++ b/drivers/infiniband/core/umem.c
> @@ -147,6 +147,10 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
>  		umem->hugetlb = 0;
>
>  	npages = ib_umem_num_pages(umem);
> +	if (npages == 0 || npages > UINT_MAX) {
> +		ret = -EINVAL;
> +		goto out;
> +	}
>  
>  	lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;

Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>

Though this bit is doing the same nonsense:

	down_write(&current->mm->mmap_sem);
	current->mm->pinned_vm += npages;
	if ((current->mm->pinned_vm > lock_limit) && !capable(CAP_IPC_LOCK)) {
		up_write(&current->mm->mmap_sem);
		ret = -ENOMEM;
		goto vma;

And would be better written like

if (check_add_overflow(current->mm->pinned_vm, npages, &newpinned) ||
    (newpinned > lock_limit && !capable(CAP_IPC_LOCK))) {
		up_write(&current->mm->mmap_sem);
		ret = -ENOMEM;
		goto out;
}
current->mm->pinned_vm = newpinned;

Ie we have a bug here, if a process creates a large enough MR it can
overflow pinned_vm and violate lock_limit. :(

Jason
diff mbox series

Patch

diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index c32a3e27a896..34d7256f5ba0 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -147,6 +147,10 @@  struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
 		umem->hugetlb = 0;
 
 	npages = ib_umem_num_pages(umem);
+	if (npages == 0 || npages > UINT_MAX) {
+		ret = -EINVAL;
+		goto out;
+	}
 
 	lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
 
@@ -161,11 +165,6 @@  struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
 
 	cur_base = addr & PAGE_MASK;
 
-	if (npages == 0 || npages > UINT_MAX) {
-		ret = -EINVAL;
-		goto vma;
-	}
-
 	ret = sg_alloc_table(&umem->sg_head, npages, GFP_KERNEL);
 	if (ret)
 		goto vma;