
[v2] userfaultfd: release page in error path to avoid BUG_ON

Message ID: 20210428230858.348400-1-axelrasmussen@google.com (mailing list archive)
State: New, archived

Commit Message

Axel Rasmussen April 28, 2021, 11:08 p.m. UTC
Consider the following sequence of events:

1. Userspace issues a UFFD ioctl, which ends up calling into
   shmem_mfill_atomic_pte(). We successfully account the blocks, we
   shmem_alloc_page(), but then the copy_from_user() fails. We return
   -ENOENT. We don't release the page we allocated.
2. Our caller detects this error code, tries the copy_from_user() after
   dropping the mmap_lock, and retries, calling back into
   shmem_mfill_atomic_pte().
3. Meanwhile, let's say another process filled up the tmpfs being used.
4. So shmem_mfill_atomic_pte() fails to account blocks this time, and
   immediately returns - without releasing the page.

This triggers a BUG_ON in our caller, which asserts that the page
should always be consumed, unless -ENOENT is returned.

To fix this, detect if we have such a "dangling" page when accounting
fails, and if so, release it before returning.

Fixes: cb658a453b93 ("userfaultfd: shmem: avoid leaking blocks and used blocks in UFFDIO_COPY")
Reported-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
---
 mm/shmem.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)
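
The sequence in the commit message can be modeled in userspace. The
following is only a minimal sketch, not the kernel code: every name in it
(struct page as a bare refcount, alloc_page_model(), put_page_model(),
acct_block_model(), mfill_model(), run_scenario(), and the with_fix switch)
is a hypothetical stand-in invented for illustration. It assumes the fill
function un-accounts the block before returning -ENOENT, so the retry has
to account it again, and it checks the caller's expectation that the page
pointer is NULL after any return value other than -ENOENT.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct page; only the refcount is modeled. */
struct page { int refcount; };

static int live_pages;   /* pages allocated and not yet freed */
static int free_blocks;  /* remaining capacity of the modeled tmpfs */

static struct page *alloc_page_model(void)
{
	struct page *p = malloc(sizeof(*p));

	if (p) {
		p->refcount = 1;
		live_pages++;
	}
	return p;
}

static void put_page_model(struct page *p)
{
	if (--p->refcount == 0) {
		free(p);
		live_pages--;
	}
}

static bool acct_block_model(void)
{
	if (free_blocks <= 0)
		return false;
	free_blocks--;
	return true;
}

/*
 * Model of the fill step. copy_ok simulates whether the copy from userspace
 * succeeds; with_fix enables the release added by this patch.
 */
static int mfill_model(struct page **pagep, bool copy_ok, bool with_fix)
{
	struct page *page;

	if (!acct_block_model()) {
		/* The fix: drop a page left over from an earlier -ENOENT. */
		if (with_fix && *pagep) {
			put_page_model(*pagep);
			*pagep = NULL;
		}
		return -ENOMEM;
	}

	if (!*pagep) {
		page = alloc_page_model();
		if (!page) {
			free_blocks++;
			return -ENOMEM;
		}
		if (!copy_ok) {
			/* Hand the page back to the caller and ask for a retry. */
			*pagep = page;
			free_blocks++;
			return -ENOENT;
		}
	} else {
		page = *pagep;
		*pagep = NULL;
	}

	/* Success: the page is consumed. The model simply drops its reference. */
	put_page_model(page);
	return 0;
}

/* Returns true if the page pointer is NULL after the final -ENOMEM. */
static bool run_scenario(bool with_fix)
{
	struct page *page = NULL;
	bool consumed;
	int err;

	live_pages = 0;
	free_blocks = 1;

	err = mfill_model(&page, false, with_fix);  /* step 1: copy fails */
	assert(err == -ENOENT && page != NULL);

	free_blocks = 0;                            /* step 3: tmpfs fills up */

	err = mfill_model(&page, true, with_fix);   /* steps 2 and 4: retry */
	assert(err == -ENOMEM);

	consumed = (page == NULL);
	if (page)
		put_page_model(page);               /* keep the model leak-free */
	return consumed;
}
```

In this model, run_scenario(false) returns false, which corresponds to the
state that trips the BUG_ON in the caller, while run_scenario(true) returns
true because the leftover page is released before -ENOMEM is returned.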

Comments

Hugh Dickins April 28, 2021, 11:56 p.m. UTC | #1
On Wed, 28 Apr 2021, Axel Rasmussen wrote:

> Consider the following sequence of events:
> 
> 1. Userspace issues a UFFD ioctl, which ends up calling into
>    shmem_mfill_atomic_pte(). We successfully account the blocks, we
>    shmem_alloc_page(), but then the copy_from_user() fails. We return
>    -ENOENT. We don't release the page we allocated.
> 2. Our caller detects this error code, tries the copy_from_user() after
>    dropping the mmap_lock, and retries, calling back into
>    shmem_mfill_atomic_pte().
> 3. Meanwhile, let's say another process filled up the tmpfs being used.
> 4. So shmem_mfill_atomic_pte() fails to account blocks this time, and
>    immediately returns - without releasing the page.
> 
> This triggers a BUG_ON in our caller, which asserts that the page
> should always be consumed, unless -ENOENT is returned.
> 
> To fix this, detect if we have such a "dangling" page when accounting
> fails, and if so, release it before returning.
> 
> Fixes: cb658a453b93 ("userfaultfd: shmem: avoid leaking blocks and used blocks in UFFDIO_COPY")
> Reported-by: Hugh Dickins <hughd@google.com>
> Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>

Acked-by: Hugh Dickins <hughd@google.com>

Thanks!

> ---
>  mm/shmem.c | 12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 26c76b13ad23..8def03d3f32a 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -2375,8 +2375,18 @@ static int shmem_mfill_atomic_pte(struct mm_struct *dst_mm,
>  	pgoff_t offset, max_off;
>  
>  	ret = -ENOMEM;
> -	if (!shmem_inode_acct_block(inode, 1))
> +	if (!shmem_inode_acct_block(inode, 1)) {
> +		/*
> +		 * We may have got a page, returned -ENOENT triggering a retry,
> +		 * and now we find ourselves with -ENOMEM. Release the page, to
> +		 * avoid a BUG_ON in our caller.
> +		 */
> +		if (unlikely(*pagep)) {
> +			put_page(*pagep);
> +			*pagep = NULL;
> +		}
>  		goto out;
> +	}
>  
>  	if (!*pagep) {
>  		page = shmem_alloc_page(gfp, info, pgoff);
> -- 
> 2.31.1.498.g6c1eba8ee3d-goog

Peter Xu April 29, 2021, 12:08 a.m. UTC | #2
On Wed, Apr 28, 2021 at 04:08:58PM -0700, Axel Rasmussen wrote:
> Consider the following sequence of events:
> 
> 1. Userspace issues a UFFD ioctl, which ends up calling into
>    shmem_mfill_atomic_pte(). We successfully account the blocks, we
>    shmem_alloc_page(), but then the copy_from_user() fails. We return
>    -ENOENT. We don't release the page we allocated.
> 2. Our caller detects this error code, tries the copy_from_user() after
>    dropping the mmap_lock, and retries, calling back into
>    shmem_mfill_atomic_pte().
> 3. Meanwhile, let's say another process filled up the tmpfs being used.
> 4. So shmem_mfill_atomic_pte() fails to account blocks this time, and
>    immediately returns - without releasing the page.
> 
> This triggers a BUG_ON in our caller, which asserts that the page
> should always be consumed, unless -ENOENT is returned.
> 
> To fix this, detect if we have such a "dangling" page when accounting
> fails, and if so, release it before returning.
> 
> Fixes: cb658a453b93 ("userfaultfd: shmem: avoid leaking blocks and used blocks in UFFDIO_COPY")
> Reported-by: Hugh Dickins <hughd@google.com>
> Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>

Reviewed-by: Peter Xu <peterx@redhat.com>

Thanks,

Axel Rasmussen May 5, 2021, 10:13 p.m. UTC | #3
On Wed, Apr 28, 2021 at 4:09 PM Axel Rasmussen <axelrasmussen@google.com> wrote:
>
> Consider the following sequence of events:
>
> 1. Userspace issues a UFFD ioctl, which ends up calling into
>    shmem_mfill_atomic_pte(). We successfully account the blocks, we
>    shmem_alloc_page(), but then the copy_from_user() fails. We return
>    -ENOENT. We don't release the page we allocated.
> 2. Our caller detects this error code, tries the copy_from_user() after
>    dropping the mmap_lock, and retries, calling back into
>    shmem_mfill_atomic_pte().
> 3. Meanwhile, let's say another process filled up the tmpfs being used.
> 4. So shmem_mfill_atomic_pte() fails to account blocks this time, and
>    immediately returns - without releasing the page.
>
> This triggers a BUG_ON in our caller, which asserts that the page
> should always be consumed, unless -ENOENT is returned.
>
> To fix this, detect if we have such a "dangling" page when accounting
> fails, and if so, release it before returning.
>
> Fixes: cb658a453b93 ("userfaultfd: shmem: avoid leaking blocks and used blocks in UFFDIO_COPY")
> Reported-by: Hugh Dickins <hughd@google.com>
> Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>

Apologies, I should have added this line:

Cc: stable@vger.kernel.org

I believe this fix ought to go into the 4.14 and later stable branches
(the commit referenced in Fixes: was introduced in 4.11).

I can resend with this included, if that would be easier.

> ---
>  mm/shmem.c | 12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 26c76b13ad23..8def03d3f32a 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -2375,8 +2375,18 @@ static int shmem_mfill_atomic_pte(struct mm_struct *dst_mm,
>         pgoff_t offset, max_off;
>
>         ret = -ENOMEM;
> -       if (!shmem_inode_acct_block(inode, 1))
> +       if (!shmem_inode_acct_block(inode, 1)) {
> +               /*
> +                * We may have got a page, returned -ENOENT triggering a retry,
> +                * and now we find ourselves with -ENOMEM. Release the page, to
> +                * avoid a BUG_ON in our caller.
> +                */
> +               if (unlikely(*pagep)) {
> +                       put_page(*pagep);
> +                       *pagep = NULL;
> +               }
>                 goto out;
> +       }
>
>         if (!*pagep) {
>                 page = shmem_alloc_page(gfp, info, pgoff);
> --
> 2.31.1.498.g6c1eba8ee3d-goog
>

Patch

diff --git a/mm/shmem.c b/mm/shmem.c
index 26c76b13ad23..8def03d3f32a 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2375,8 +2375,18 @@ static int shmem_mfill_atomic_pte(struct mm_struct *dst_mm,
 	pgoff_t offset, max_off;
 
 	ret = -ENOMEM;
-	if (!shmem_inode_acct_block(inode, 1))
+	if (!shmem_inode_acct_block(inode, 1)) {
+		/*
+		 * We may have got a page, returned -ENOENT triggering a retry,
+		 * and now we find ourselves with -ENOMEM. Release the page, to
+		 * avoid a BUG_ON in our caller.
+		 */
+		if (unlikely(*pagep)) {
+			put_page(*pagep);
+			*pagep = NULL;
+		}
 		goto out;
+	}
 
 	if (!*pagep) {
 		page = shmem_alloc_page(gfp, info, pgoff);