Message ID | 49d92b15-3442-4e84-39bd-c77c316bf844@google.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | mm: allow pte_offset_map[_lock]() to fail | expand |
Hi, Hugh, On Sun, May 21, 2023 at 10:07:35PM -0700, Hugh Dickins wrote: > mfill_atomic_install_pte() and mfill_atomic_pte_zeropage() treat > failed pte_offset_map_lock() as -EFAULT, with no attempt to retry. Could you help explain why it should be -EFAULT, not -EAGAIN or -EEXIST? IIUC right now if pte existed we have -EEXIST returned as part of the userfault ABI, no matter whether it's pte or thp. IMHO it may boil down to my limited knowledge on how pte_offset_map_lock() is used after this part 2 series, and I assume the core changes will be in your 3rd series (besides this one and the arch one). Please shed some light if there's quick answers (IIUC this is for speeding up collapsing shmem thps, but still no much clue here), or I can also wait for reading the 3rd part if it'll come soon in any form. Thanks,
On Wed, 24 May 2023, Peter Xu wrote: > On Sun, May 21, 2023 at 10:07:35PM -0700, Hugh Dickins wrote: > > mfill_atomic_install_pte() and mfill_atomic_pte_zeropage() treat > > failed pte_offset_map_lock() as -EFAULT, with no attempt to retry. > > Could you help explain why it should be -EFAULT, not -EAGAIN or -EEXIST? Thanks a lot for looking, Peter. No good justification for -EFAULT: I just grabbed the closest, fairly neutral, error code that I could see already being in use there: but now that you mention -EAGAIN, which I can see being used from mfill_atomic(), yes, that would be ideal - and consistent with how it's already being used. I'll make that change, thanks for suggesting. (And it had bugged me how my fs/userfaultfd.c was electing to retry, but this one electing to fail.) > > IIUC right now if pte existed we have -EEXIST returned as part of the > userfault ABI, no matter whether it's pte or thp. It might or might not correspond to -EEXIST - it might even end up as -EFAULT on a retry after -EAGAIN: I see mfill_atomic() contains both -EEXIST and -EFAULT cases for pmd_trans_huge(). Actually, I could say that the -EFAULT case there corresponds to the -EFAULT in this 15/31 patch, but that would be by coincidence not design: I'm happier with your -EAGAIN suggestion. > > IMHO it may boil down to my limited knowledge on how pte_offset_map_lock() > is used after this part 2 series, and I assume the core changes will be in > your 3rd series (besides this one and the arch one). > > Please shed some light if there's quick answers (IIUC this is for speeding > up collapsing shmem thps, but still no much clue here), or I can also wait > for reading the 3rd part if it'll come soon in any form. It wouldn't be particularly easy to deduce from the third series of patches, rather submerged in implementation details. Just keep in mind that, like in the "old" pmd_trans_unstable() cases, there may be instants at which, when trying to get the lock on a page table, that page table might already have gone, or been replaced by something else e.g. a THP, and a retry necessary at the outer level (if it's important to persist). Hugh
On Thu, May 25, 2023 at 03:06:27PM -0700, Hugh Dickins wrote: > On Wed, 24 May 2023, Peter Xu wrote: > > On Sun, May 21, 2023 at 10:07:35PM -0700, Hugh Dickins wrote: > > > mfill_atomic_install_pte() and mfill_atomic_pte_zeropage() treat > > > failed pte_offset_map_lock() as -EFAULT, with no attempt to retry. > > > > Could you help explain why it should be -EFAULT, not -EAGAIN or -EEXIST? > > Thanks a lot for looking, Peter. > > No good justification for -EFAULT: I just grabbed the closest, fairly > neutral, error code that I could see already being in use there: but now > that you mention -EAGAIN, which I can see being used from mfill_atomic(), > yes, that would be ideal - and consistent with how it's already being used. > > I'll make that change, thanks for suggesting. (And it had bugged me how > my fs/userfaultfd.c was electing to retry, but this one electing to fail.) Thanks. > > > > > IIUC right now if pte existed we have -EEXIST returned as part of the > > userfault ABI, no matter whether it's pte or thp. > > It might or might not correspond to -EEXIST - it might even end up as > -EFAULT on a retry after -EAGAIN: I see mfill_atomic() contains both > -EEXIST and -EFAULT cases for pmd_trans_huge(). Actually, I could > say that the -EFAULT case there corresponds to the -EFAULT in this > 15/31 patch, but that would be by coincidence not design: I'm happier > with your -EAGAIN suggestion. I had a feeling that that 2nd -EFAULT there could crash some userapp already if it got returned somewhere, because the userapp shouldn't expect that. IMHO it should also return -EAGAIN, or even -EEXIST because even if user retries, we should highly possibly see that thp again, so the -EEXIST should possibly follow anyway. Not a big deal here I think - if an userapp can trigger that -EFAULT I'd say it's also a user bug because it made two decisions already on resolving page fault for single VA, and it's racy between them.. > > > > > IMHO it may boil down to my limited knowledge on how pte_offset_map_lock() > > is used after this part 2 series, and I assume the core changes will be in > > your 3rd series (besides this one and the arch one). > > > > Please shed some light if there's quick answers (IIUC this is for speeding > > up collapsing shmem thps, but still no much clue here), or I can also wait > > for reading the 3rd part if it'll come soon in any form. > > It wouldn't be particularly easy to deduce from the third series of > patches, rather submerged in implementation details. Just keep in mind > that, like in the "old" pmd_trans_unstable() cases, there may be instants > at which, when trying to get the lock on a page table, that page table > might already have gone, or been replaced by something else e.g. a THP, > and a retry necessary at the outer level (if it's important to persist). I'm actually still curious how the 3rd series will look like; would love to read it when it comes. Thanks,
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c index e97a0b4889fc..b1554286a31c 100644 --- a/mm/userfaultfd.c +++ b/mm/userfaultfd.c @@ -76,14 +76,16 @@ int mfill_atomic_install_pte(pmd_t *dst_pmd, if (flags & MFILL_ATOMIC_WP) _dst_pte = pte_mkuffd_wp(_dst_pte); + ret = -EFAULT; dst_pte = pte_offset_map_lock(dst_mm, dst_pmd, dst_addr, &ptl); + if (!dst_pte) + goto out; if (vma_is_shmem(dst_vma)) { /* serialize against truncate with the page table lock */ inode = dst_vma->vm_file->f_inode; offset = linear_page_index(dst_vma, dst_addr); max_off = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE); - ret = -EFAULT; if (unlikely(offset >= max_off)) goto out_unlock; } @@ -121,6 +123,7 @@ int mfill_atomic_install_pte(pmd_t *dst_pmd, ret = 0; out_unlock: pte_unmap_unlock(dst_pte, ptl); +out: return ret; } @@ -212,13 +215,15 @@ static int mfill_atomic_pte_zeropage(pmd_t *dst_pmd, _dst_pte = pte_mkspecial(pfn_pte(my_zero_pfn(dst_addr), dst_vma->vm_page_prot)); + ret = -EFAULT; dst_pte = pte_offset_map_lock(dst_vma->vm_mm, dst_pmd, dst_addr, &ptl); + if (!dst_pte) + goto out; if (dst_vma->vm_file) { /* the shmem MAP_PRIVATE case requires checking the i_size */ inode = dst_vma->vm_file->f_inode; offset = linear_page_index(dst_vma, dst_addr); max_off = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE); - ret = -EFAULT; if (unlikely(offset >= max_off)) goto out_unlock; } @@ -231,6 +236,7 @@ static int mfill_atomic_pte_zeropage(pmd_t *dst_pmd, ret = 0; out_unlock: pte_unmap_unlock(dst_pte, ptl); +out: return ret; }
mfill_atomic_install_pte() and mfill_atomic_pte_zeropage() treat failed pte_offset_map_lock() as -EFAULT, with no attempt to retry. Signed-off-by: Hugh Dickins <hughd@google.com> --- mm/userfaultfd.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)