diff mbox series

[mm-unstable,fix] mm: userfaultfd: add new UFFDIO_POISON ioctl: fix

Message ID bc7bba61-d34f-ad3a-ccf1-c191585ef851@google.com (mailing list archive)
State New
Headers show
Series [mm-unstable,fix] mm: userfaultfd: add new UFFDIO_POISON ioctl: fix | expand

Commit Message

Hugh Dickins July 12, 2023, 1:27 a.m. UTC
Smatch has observed that pte_offset_map_lock() is now allowed to fail,
and then ptl should not be unlocked.  Use -EAGAIN here like elsewhere.

Signed-off-by: Hugh Dickins <hughd@google.com>
---
Axel, Peter: this seems right as a fix to the patch in mm-unstable;
but in preparing this, I noticed mfill_atomic()'s code before calling
mfill_atomic_pte(), and think that my original choice of -EFAULT was
therefore better than -EAGAIN for all of these; and that mfill_atomic()'s
BUG_ONs there would be better deleted (and is its BUG_ON(folio) safe??).
Something one of us should address, after this fixup is in akpm's tree.

 mm/userfaultfd.c | 4 ++++
 1 file changed, 4 insertions(+)

Comments

Axel Rasmussen July 12, 2023, 6:16 p.m. UTC | #1
On Tue, Jul 11, 2023 at 6:27 PM Hugh Dickins <hughd@google.com> wrote:
>
> Smatch has observed that pte_offset_map_lock() is now allowed to fail,
> and then ptl should not be unlocked.  Use -EAGAIN here like elsewhere.
>
> Signed-off-by: Hugh Dickins <hughd@google.com>

Looks correct to me, thanks Hugh! If it's useful, feel free to take:

Reviewed-by: Axel Rasmussen <axelrasmussen@google.com>

> ---
> Axel, Peter: this seems right as a fix to the patch in mm-unstable;
> but in preparing this, I noticed mfill_atomic()'s code before calling
> mfill_atomic_pte(), and think that my original choice of -EFAULT was
> therefore better than -EAGAIN for all of these; and that mfill_atomic()'s
> BUG_ONs there would be better deleted (and is its BUG_ON(folio) safe??).
> Something one of us should address, after this fixup is in akpm's tree.

Agreed, dealing with the BUG_ONs is something I have been meaning to
do as checkpatch bothers me about it frequently.

What this code is trying to do is to enforce the contract that: if
mfill_atomic_pte failed, then it must have released *folio and set it
to NULL. But, you're exactly right that if we had such a bug, it would
mean we leaked one page in an unusual error case - not great, but not
worth crashing the kernel either.

For UFFDIO_POISON this doesn't really apply because it doesn't
allocate or get a reference to a page in any case.

For other cases like UFFDIO_COPY where it does matter, I think it's
fine as is regardless of the error code returned (as long as it's not
-ENOENT :) ). mfill_atomic_pte_copy() is sure to set `*foliop = NULL`
before it attempts this lock (in mfill_atomic_install_pte). So -EAGAIN
or -EFAULT should work equally well at least from that perspective. I
somewhat prefer -EAGAIN, but maybe I'm missing something.

>
>  mm/userfaultfd.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> --- a/mm/userfaultfd.c
> +++ b/mm/userfaultfd.c
> @@ -300,7 +300,10 @@ static int mfill_atomic_pte_poison(pmd_t *dst_pmd,
>         spinlock_t *ptl;
>
>         _dst_pte = make_pte_marker(PTE_MARKER_POISONED);
> +       ret = -EAGAIN;
>         dst_pte = pte_offset_map_lock(dst_mm, dst_pmd, dst_addr, &ptl);
> +       if (!dst_pte)
> +               goto out;
>
>         if (mfill_file_over_size(dst_vma, dst_addr)) {
>                 ret = -EFAULT;
> @@ -319,6 +322,7 @@ static int mfill_atomic_pte_poison(pmd_t *dst_pmd,
>         ret = 0;
>  out_unlock:
>         pte_unmap_unlock(dst_pte, ptl);
> +out:
>         return ret;
>  }
>
Hugh Dickins July 14, 2023, 2:40 a.m. UTC | #2
On Wed, 12 Jul 2023, Axel Rasmussen wrote:
> On Tue, Jul 11, 2023 at 6:27 PM Hugh Dickins <hughd@google.com> wrote:
> >
> > Smatch has observed that pte_offset_map_lock() is now allowed to fail,
> > and then ptl should not be unlocked.  Use -EAGAIN here like elsewhere.
> >
> > Signed-off-by: Hugh Dickins <hughd@google.com>
> 
> Looks correct to me, thanks Hugh! If it's useful, feel free to take:
> 
> Reviewed-by: Axel Rasmussen <axelrasmussen@google.com>

Thanks, that will have helped Andrew (but would vanish into the merge
with your original when it moves from mm-unstable to mm-stable).

> 
> > ---
> > Axel, Peter: this seems right as a fix to the patch in mm-unstable;
> > but in preparing this, I noticed mfill_atomic()'s code before calling
> > mfill_atomic_pte(), and think that my original choice of -EFAULT was
> > therefore better than -EAGAIN for all of these; and that mfill_atomic()'s
> > BUG_ONs there would be better deleted (and is its BUG_ON(folio) safe??).
> > Something one of us should address, after this fixup is in akpm's tree.
> 
> Agreed, dealing with the BUG_ONs is something I have been meaning to
> do as checkpatch bothers me about it frequently.
> 
> What this code is trying to do is to enforce the contract that: if
> mfill_atomic_pte failed, then it must have released *folio and set it
> to NULL. But, you're exactly right that if we had such a bug, it would
> mean we leaked one page in an unusual error case - not great, but not
> worth crashing the kernel either.
> 
> For UFFDIO_POISON this doesn't really apply because it doesn't
> allocate or get a reference to a page in any case.
> 
> For other cases like UFFDIO_COPY where it does matter, I think it's
> fine as is regardless of the error code returned (as long as it's not
> -ENOENT :) ). mfill_atomic_pte_copy() is sure to set `*foliop = NULL`
> before it attempts this lock (in mfill_atomic_install_pte). So -EAGAIN
> or -EFAULT should work equally well at least from that perspective. I
> somewhat prefer -EAGAIN, but maybe I'm missing something.

Right, when I said -EFAULT better than -EAGAIN, I was only basing that
on -EFAULT being the code which was already used for pmd_trans_huge()
in the check above mfill_atomic()'s call to mfill_atomic_pte().

I don't disagree with you, and Peter too preferred -EAGAIN: it's just
better if they all use the same code (or maybe mfill_atomic()'s first
check for pmd_trans_huge() can be deleted along with the BUG_ONs - but
I have not checked through the cases, there may be very good reason to
keep that preliminary check there).

Hugh
Peter Xu July 19, 2023, 9:53 p.m. UTC | #3
Hello, Hugh, Axel,

Sorry for a very late reply after a long PTO..

On Thu, Jul 13, 2023 at 07:40:30PM -0700, Hugh Dickins wrote:
> On Wed, 12 Jul 2023, Axel Rasmussen wrote:
> > On Tue, Jul 11, 2023 at 6:27 PM Hugh Dickins <hughd@google.com> wrote:
> > >
> > > Smatch has observed that pte_offset_map_lock() is now allowed to fail,
> > > and then ptl should not be unlocked.  Use -EAGAIN here like elsewhere.
> > >
> > > Signed-off-by: Hugh Dickins <hughd@google.com>
> > 
> > Looks correct to me, thanks Hugh! If it's useful, feel free to take:
> > 
> > Reviewed-by: Axel Rasmussen <axelrasmussen@google.com>
> 
> Thanks, that will have helped Andrew (but would vanish into the merge
> with your original when it moves from mm-unstable to mm-stable).
> 
> > 
> > > ---
> > > Axel, Peter: this seems right as a fix to the patch in mm-unstable;
> > > but in preparing this, I noticed mfill_atomic()'s code before calling
> > > mfill_atomic_pte(), and think that my original choice of -EFAULT was
> > > therefore better than -EAGAIN for all of these; and that mfill_atomic()'s
> > > BUG_ONs there would be better deleted (and is its BUG_ON(folio) safe??).
> > > Something one of us should address, after this fixup is in akpm's tree.
> > 
> > Agreed, dealing with the BUG_ONs is something I have been meaning to
> > do as checkpatch bothers me about it frequently.
> > 
> > What this code is trying to do is to enforce the contract that: if
> > mfill_atomic_pte failed, then it must have released *folio and set it
> > to NULL. But, you're exactly right that if we had such a bug, it would
> > mean we leaked one page in an unusual error case - not great, but not
> > worth crashing the kernel either.

Yes I also agree the BUG_ON might be a bit too aggressive, even if still
accurate (so we should really not trigger that).  WARN_ON_ONCE() should be
enough.

> > 
> > For UFFDIO_POISON this doesn't really apply because it doesn't
> > allocate or get a reference to a page in any case.
> > 
> > For other cases like UFFDIO_COPY where it does matter, I think it's
> > fine as is regardless of the error code returned (as long as it's not
> > -ENOENT :) ). mfill_atomic_pte_copy() is sure to set `*foliop = NULL`
> > before it attempts this lock (in mfill_atomic_install_pte). So -EAGAIN
> > or -EFAULT should work equally well at least from that perspective. I
> > somewhat prefer -EAGAIN, but maybe I'm missing something.
> 
> Right, when I said -EFAULT better than -EAGAIN, I was only basing that
> on -EFAULT being the code which was already used for pmd_trans_huge()
> in the check above mfill_atomic()'s call to mfill_atomic_pte().
> 
> I don't disagree with you, and Peter too preferred -EAGAIN: it's just
> better if they all use the same code (or maybe mfill_atomic()'s first
> check for pmd_trans_huge() can be deleted along with the BUG_ONs - but
> I have not checked through the cases, there may be very good reason to
> keep that preliminary check there).

It's just that -EFAULT will definitely fail most of the uffd based apps,
because no app can handle an -EFAULT over any ioctl(UFFDIO_*).  So if we'd
delete the 1st pmd_trans_huge() check we may want to make the 2nd return
-EAGAIN.

I think this -EFAULT (of 2nd pmd_trans_huge() check) should really also be
an -EAGAIN, I'd bet ten dollars if it's really hit somewhere before it'll
crash the app as explained.

With Hugh's recent rework on pte_offset_map_lock() (which should contain a
thp check within the map_lock()) IMHO we can safely drop the 2nd check as a
whole (while keeping the 1st check as a fast path to detect existing thps),
and then we should reliably return an -EAGAIN if a thp raced with us, so
the userapp should just retry when race with anything else.

Thanks,
diff mbox series

Patch

--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -300,7 +300,10 @@  static int mfill_atomic_pte_poison(pmd_t *dst_pmd,
 	spinlock_t *ptl;
 
 	_dst_pte = make_pte_marker(PTE_MARKER_POISONED);
+	ret = -EAGAIN;
 	dst_pte = pte_offset_map_lock(dst_mm, dst_pmd, dst_addr, &ptl);
+	if (!dst_pte)
+		goto out;
 
 	if (mfill_file_over_size(dst_vma, dst_addr)) {
 		ret = -EFAULT;
@@ -319,6 +322,7 @@  static int mfill_atomic_pte_poison(pmd_t *dst_pmd,
 	ret = 0;
 out_unlock:
 	pte_unmap_unlock(dst_pte, ptl);
+out:
 	return ret;
 }