Message ID | 1592363698-4266-1-git-send-email-jrdr.linux@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Series | [RFC] xen/privcmd: Convert get_user_pages*() to pin_user_pages*() |
On 6/16/20 11:14 PM, Souptick Joarder wrote:
> In 2019, we introduced pin_user_pages*() and now we are converting
> get_user_pages*() to the new API as appropriate. [1] & [2] could
> be referred for more information.
>
> [1] Documentation/core-api/pin_user_pages.rst
>
> [2] "Explicit pinning of user-space pages":
>     https://lwn.net/Articles/807108/
>
> Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
> Cc: John Hubbard <jhubbard@nvidia.com>
> ---
> Hi,
>
> I have compile tested this patch but unable to run-time test,
> so any testing help is much appriciated.
>
> Also have a question, why the existing code is not marking the
> pages dirty (since it did FOLL_WRITE) ?

Indeed, seems to me it should. Paul?

>
>  drivers/xen/privcmd.c | 7 ++-----
>  1 file changed, 2 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/xen/privcmd.c b/drivers/xen/privcmd.c
> index a250d11..543739e 100644
> --- a/drivers/xen/privcmd.c
> +++ b/drivers/xen/privcmd.c
> @@ -594,7 +594,7 @@ static int lock_pages(
>  		if (requested > nr_pages)
>  			return -ENOSPC;
>
> -		pinned = get_user_pages_fast(
> +		pinned = pin_user_pages_fast(
>  			(unsigned long) kbufs[i].uptr,
>  			requested, FOLL_WRITE, pages);
>  		if (pinned < 0)
> @@ -614,10 +614,7 @@ static void unlock_pages(struct page *pages[], unsigned int nr_pages)
>  	if (!pages)
>  		return;
>
> -	for (i = 0; i < nr_pages; i++) {
> -		if (pages[i])
> -			put_page(pages[i]);
> -	}
> +	unpin_user_pages(pages, nr_pages);

Why are you no longer checking for valid pages?

-boris
On Wed, Jun 17, 2020 at 11:29 PM Boris Ostrovsky <boris.ostrovsky@oracle.com> wrote:
>
> On 6/16/20 11:14 PM, Souptick Joarder wrote:
> > In 2019, we introduced pin_user_pages*() and now we are converting
> > get_user_pages*() to the new API as appropriate. [1] & [2] could
> > be referred for more information.
> >
> > [1] Documentation/core-api/pin_user_pages.rst
> >
> > [2] "Explicit pinning of user-space pages":
> >     https://lwn.net/Articles/807108/
> >
> > Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
> > Cc: John Hubbard <jhubbard@nvidia.com>
> > ---
> > Hi,
> >
> > I have compile tested this patch but unable to run-time test,
> > so any testing help is much appriciated.
> >
> > Also have a question, why the existing code is not marking the
> > pages dirty (since it did FOLL_WRITE) ?
>
>
> Indeed, seems to me it should. Paul?
>
>
> >
> >  drivers/xen/privcmd.c | 7 ++-----
> >  1 file changed, 2 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/xen/privcmd.c b/drivers/xen/privcmd.c
> > index a250d11..543739e 100644
> > --- a/drivers/xen/privcmd.c
> > +++ b/drivers/xen/privcmd.c
> > @@ -594,7 +594,7 @@ static int lock_pages(
> >  		if (requested > nr_pages)
> >  			return -ENOSPC;
> >
> > -		pinned = get_user_pages_fast(
> > +		pinned = pin_user_pages_fast(
> >  			(unsigned long) kbufs[i].uptr,
> >  			requested, FOLL_WRITE, pages);
> >  		if (pinned < 0)
> > @@ -614,10 +614,7 @@ static void unlock_pages(struct page *pages[], unsigned int nr_pages)
> >  	if (!pages)
> >  		return;
> >
> > -	for (i = 0; i < nr_pages; i++) {
> > -		if (pages[i])
> > -			put_page(pages[i]);
> > -	}
> > +	unpin_user_pages(pages, nr_pages);
>
>
> Why are you no longer checking for valid pages?

My understanding is that if lock_pages() ends up returning with only part
of the pages mapped, we should pass the number of pages actually mapped to
unlock_pages(), not nr_pages. This will avoid the extra check of pages[i].

And if lock_pages() returns 0 on success, all pages[i] are valid anyway.
I will try to correct it in v2.

But I agree, there is no harm in checking pages[i], and I believe
unpin_user_pages() is the right place to do so.

John, any thoughts?
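The idea in the reply above (release exactly as many pages as were pinned, so no per-entry NULL check is needed) can be sketched in a small userspace model. This is not kernel code: `struct fake_page`, `model_pin()`, `model_unpin()` and `model_pin_then_unpin()` are hypothetical stand-ins invented for illustration, and the `limit` parameter is only an assumed way of simulating a partial pin.

```c
#include <assert.h>
#include <errno.h>

#define MODEL_MAX_PAGES 8

/* Userspace stand-in for struct page; NOT the kernel type. */
struct fake_page {
	int pin_count;
};

/*
 * Hypothetical stand-in for a pin call: pins at most `limit` of the
 * `requested` pages and returns how many were actually pinned.
 */
static int model_pin(struct fake_page *pool, int requested, int limit,
		     struct fake_page **pages)
{
	int n = requested < limit ? requested : limit;

	for (int i = 0; i < n; i++) {
		pool[i].pin_count++;
		pages[i] = &pool[i];
	}
	return n;
}

/*
 * Releases exactly `npinned` entries. No NULL check is needed because
 * the caller passes only the prefix of the array that was filled in.
 */
static void model_unpin(struct fake_page **pages, int npinned)
{
	for (int i = 0; i < npinned; i++)
		pages[i]->pin_count--;
}

/*
 * Pins, then unpins using the pinned count rather than the array size.
 * Returns the sum of leftover pin counts (0 means balanced), or
 * -ENOSPC if the request does not fit in the model's page array.
 */
static int model_pin_then_unpin(int requested, int limit)
{
	struct fake_page pool[MODEL_MAX_PAGES] = { { 0 } };
	struct fake_page *pages[MODEL_MAX_PAGES] = { 0 };
	int pinned, leftover = 0;

	if (requested > MODEL_MAX_PAGES)
		return -ENOSPC;

	pinned = model_pin(pool, requested, limit, pages);
	model_unpin(pages, pinned);

	for (int i = 0; i < MODEL_MAX_PAGES; i++)
		leftover += pool[i].pin_count;
	return leftover;
}
```

In this model a partial pin (for example 3 of 8 pages) is still released cleanly, because the unpin loop never walks past the entries that were filled in.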
On 2020-06-18 20:12, Souptick Joarder wrote:
> On Wed, Jun 17, 2020 at 11:29 PM Boris Ostrovsky
> <boris.ostrovsky@oracle.com> wrote:
>>
>> On 6/16/20 11:14 PM, Souptick Joarder wrote:
>>> In 2019, we introduced pin_user_pages*() and now we are converting
>>> get_user_pages*() to the new API as appropriate. [1] & [2] could
>>> be referred for more information.

Ideally, the commit description should say which case, in
pin_user_pages.rst, that this is.

>>>
>>> [1] Documentation/core-api/pin_user_pages.rst
>>>
>>> [2] "Explicit pinning of user-space pages":
>>>     https://lwn.net/Articles/807108/
>>>
>>> Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
>>> Cc: John Hubbard <jhubbard@nvidia.com>
>>> ---
>>> Hi,
>>>
>>> I have compile tested this patch but unable to run-time test,
>>> so any testing help is much appriciated.
>>>
>>> Also have a question, why the existing code is not marking the
>>> pages dirty (since it did FOLL_WRITE) ?
>>
>>
>> Indeed, seems to me it should. Paul?

Definitely good to get an answer from an expert in this code, but
meanwhile, it's reasonable to just mark them dirty. Below...

>>
>>
>>>
>>>  drivers/xen/privcmd.c | 7 ++-----
>>>  1 file changed, 2 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/drivers/xen/privcmd.c b/drivers/xen/privcmd.c
>>> index a250d11..543739e 100644
>>> --- a/drivers/xen/privcmd.c
>>> +++ b/drivers/xen/privcmd.c
>>> @@ -594,7 +594,7 @@ static int lock_pages(
>>>  		if (requested > nr_pages)
>>>  			return -ENOSPC;
>>>
>>> -		pinned = get_user_pages_fast(
>>> +		pinned = pin_user_pages_fast(
>>>  			(unsigned long) kbufs[i].uptr,
>>>  			requested, FOLL_WRITE, pages);
>>>  		if (pinned < 0)
>>> @@ -614,10 +614,7 @@ static void unlock_pages(struct page *pages[], unsigned int nr_pages)
>>>  	if (!pages)
>>>  		return;
>>>
>>> -	for (i = 0; i < nr_pages; i++) {
>>> -		if (pages[i])
>>> -			put_page(pages[i]);
>>> -	}
>>> +	unpin_user_pages(pages, nr_pages);

...so just use unpin_user_pages_dirty_lock() here, I think.

>>
>>
>> Why are you no longer checking for valid pages?
>
> My understanding is, in case of lock_pages() end up returning partial
> mapped pages,
> we should pass no. of partial mapped pages to unlock_pages(), not nr_pages.
> This will avoid checking extra check to validate the pages[i].
>
> and if lock_pages() returns 0 in success, anyway we have all the pages[i] valid.
> I will try to correct it in v2.
>
> But I agree, there is no harm to check for pages[i] and I believe,

Generally, it *is* harmful to do unnecessary checks, in most code, but especially
in most kernel code. If you can convince yourself that the check for null pages
is redundant here, then please let's remove that check. The code then
becomes shorter, simpler, and faster.

> unpin_user_pages()
> is the right place to do so.
>
> John any thought ?

So far I haven't seen any cases to justify changing the implementation of
unpin_user_pages().

thanks,
> -----Original Message-----
> From: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Sent: 17 June 2020 18:57
> To: Souptick Joarder <jrdr.linux@gmail.com>; jgross@suse.com; sstabellini@kernel.org
> Cc: xen-devel@lists.xenproject.org; linux-kernel@vger.kernel.org; John Hubbard <jhubbard@nvidia.com>;
> paul@xen.org
> Subject: Re: [RFC PATCH] xen/privcmd: Convert get_user_pages*() to pin_user_pages*()
>
> On 6/16/20 11:14 PM, Souptick Joarder wrote:
> > In 2019, we introduced pin_user_pages*() and now we are converting
> > get_user_pages*() to the new API as appropriate. [1] & [2] could
> > be referred for more information.
> >
> > [1] Documentation/core-api/pin_user_pages.rst
> >
> > [2] "Explicit pinning of user-space pages":
> >     https://lwn.net/Articles/807108/
> >
> > Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
> > Cc: John Hubbard <jhubbard@nvidia.com>
> > ---
> > Hi,
> >
> > I have compile tested this patch but unable to run-time test,
> > so any testing help is much appriciated.
> >
> > Also have a question, why the existing code is not marking the
> > pages dirty (since it did FOLL_WRITE) ?
>
>
> Indeed, seems to me it should. Paul?
>

Yes, it looks like that was an oversight. The hypercall may well result in
data being copied back into the buffers so the whole pages array should be
considered dirty.

  Paul

>
> >
> >  drivers/xen/privcmd.c | 7 ++-----
> >  1 file changed, 2 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/xen/privcmd.c b/drivers/xen/privcmd.c
> > index a250d11..543739e 100644
> > --- a/drivers/xen/privcmd.c
> > +++ b/drivers/xen/privcmd.c
> > @@ -594,7 +594,7 @@ static int lock_pages(
> >  		if (requested > nr_pages)
> >  			return -ENOSPC;
> >
> > -		pinned = get_user_pages_fast(
> > +		pinned = pin_user_pages_fast(
> >  			(unsigned long) kbufs[i].uptr,
> >  			requested, FOLL_WRITE, pages);
> >  		if (pinned < 0)
> > @@ -614,10 +614,7 @@ static void unlock_pages(struct page *pages[], unsigned int nr_pages)
> >  	if (!pages)
> >  		return;
> >
> > -	for (i = 0; i < nr_pages; i++) {
> > -		if (pages[i])
> > -			put_page(pages[i]);
> > -	}
> > +	unpin_user_pages(pages, nr_pages);
>
>
> Why are you no longer checking for valid pages?
>
>
> -boris
>
>
>
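To illustrate the point that every pinned page should be treated as dirty once the hypercall may have written into the buffers, here is a minimal userspace model of a "mark dirty, then unpin" release path. It only mirrors the argument shape of the kernel's unpin_user_pages_dirty_lock(pages, npages, make_dirty); `struct fake_page`, `model_unpin_dirty()` and `model_dirty_after_release()` are hypothetical names for this sketch, and no real page locking is modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_POOL_SIZE 4

/* Userspace stand-in for struct page; NOT the kernel type. */
struct fake_page {
	int pin_count;
	bool dirty;
};

/*
 * Optionally marks each page dirty, then drops its pin. The real
 * kernel helper also takes the page lock while dirtying; that part
 * is deliberately left out of this model.
 */
static void model_unpin_dirty(struct fake_page **pages, size_t npages,
			      bool make_dirty)
{
	for (size_t i = 0; i < npages; i++) {
		if (make_dirty)
			pages[i]->dirty = true;
		pages[i]->pin_count--;
	}
}

/*
 * Pins up to MODEL_POOL_SIZE model pages, releases them, and returns
 * how many ended up marked dirty.
 */
static size_t model_dirty_after_release(size_t npages, bool make_dirty)
{
	struct fake_page pool[MODEL_POOL_SIZE] = { { 0, false } };
	struct fake_page *pages[MODEL_POOL_SIZE];
	size_t dirty = 0;

	if (npages > MODEL_POOL_SIZE)
		npages = MODEL_POOL_SIZE;

	for (size_t i = 0; i < npages; i++) {
		pool[i].pin_count = 1;
		pages[i] = &pool[i];
	}

	model_unpin_dirty(pages, npages, make_dirty);

	for (size_t i = 0; i < MODEL_POOL_SIZE; i++)
		dirty += pool[i].dirty ? 1 : 0;
	return dirty;
}
```

With `make_dirty` set, every released page is flagged; without it, none are, which is the behaviour the thread identifies as the oversight in the existing code.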
On Fri, Jun 19, 2020 at 1:00 PM John Hubbard <jhubbard@nvidia.com> wrote:
>
> On 2020-06-18 20:12, Souptick Joarder wrote:
> > On Wed, Jun 17, 2020 at 11:29 PM Boris Ostrovsky
> > <boris.ostrovsky@oracle.com> wrote:
> >>
> >> On 6/16/20 11:14 PM, Souptick Joarder wrote:
> >>> In 2019, we introduced pin_user_pages*() and now we are converting
> >>> get_user_pages*() to the new API as appropriate. [1] & [2] could
> >>> be referred for more information.
>
>
> Ideally, the commit description should say which case, in
> pin_user_pages.rst, that this is.
>

Ok.

>
> >>>
> >>> [1] Documentation/core-api/pin_user_pages.rst
> >>>
> >>> [2] "Explicit pinning of user-space pages":
> >>>     https://lwn.net/Articles/807108/
> >>>
> >>> Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
> >>> Cc: John Hubbard <jhubbard@nvidia.com>
> >>> ---
> >>> Hi,
> >>>
> >>> I have compile tested this patch but unable to run-time test,
> >>> so any testing help is much appriciated.
> >>>
> >>> Also have a question, why the existing code is not marking the
> >>> pages dirty (since it did FOLL_WRITE) ?
> >>
> >>
> >> Indeed, seems to me it should. Paul?
>
> Definitely good to get an answer from an expert in this code, but
> meanwhile, it's reasonable to just mark them dirty. Below...
>
> >>
> >>
> >>>
> >>>  drivers/xen/privcmd.c | 7 ++-----
> >>>  1 file changed, 2 insertions(+), 5 deletions(-)
> >>>
> >>> diff --git a/drivers/xen/privcmd.c b/drivers/xen/privcmd.c
> >>> index a250d11..543739e 100644
> >>> --- a/drivers/xen/privcmd.c
> >>> +++ b/drivers/xen/privcmd.c
> >>> @@ -594,7 +594,7 @@ static int lock_pages(
> >>>  		if (requested > nr_pages)
> >>>  			return -ENOSPC;
> >>>
> >>> -		pinned = get_user_pages_fast(
> >>> +		pinned = pin_user_pages_fast(
> >>>  			(unsigned long) kbufs[i].uptr,
> >>>  			requested, FOLL_WRITE, pages);
> >>>  		if (pinned < 0)
> >>> @@ -614,10 +614,7 @@ static void unlock_pages(struct page *pages[], unsigned int nr_pages)
> >>>  	if (!pages)
> >>>  		return;
> >>>
> >>> -	for (i = 0; i < nr_pages; i++) {
> >>> -		if (pages[i])
> >>> -			put_page(pages[i]);
> >>> -	}
> >>> +	unpin_user_pages(pages, nr_pages);
>
>
> ...so just use unpin_user_pages_dirty_lock() here, I think.
>
>
> >>
> >>
> >> Why are you no longer checking for valid pages?
> >
> > My understanding is, in case of lock_pages() end up returning partial
> > mapped pages,
> > we should pass no. of partial mapped pages to unlock_pages(), not nr_pages.
> > This will avoid checking extra check to validate the pages[i].
> >
> > and if lock_pages() returns 0 in success, anyway we have all the pages[i] valid.
> > I will try to correct it in v2.
> >
> > But I agree, there is no harm to check for pages[i] and I believe,
>
>
> Generally, it *is* harmful to do unnecessary checks, in most code, but especially
> in most kernel code. If you can convince yourself that the check for null pages
> is redundant here, then please let's remove that check. The code becomes then
> becomes shorter, simpler, and faster.

I read the code again. I think this check is needed to handle the scenario
where lock_pages() returns -ENOSPC. Better to keep this check. Let me post
v2 of this RFC for a clearer view.

>
>
> > unpin_user_pages()
> > is the right place to do so.
> >
> > John any thought ?
>
>
> So far I haven't seen any cases to justify changing the implementation of
> unpin_user_pages().
>
>
> thanks,
> --
> John Hubbard
> NVIDIA
On 6/22/20 2:52 PM, Souptick Joarder wrote:
>
> I read the code again. I think, this check is needed to handle a scenario when
> lock_pages() return -ENOSPC. Better to keep this check. Let me post v2 of this
> RFC for a clear view.

Actually, error handling seems to be somewhat broken here. If
lock_pages() returns the number of pinned pages, then that's what we end up
returning from privcmd_ioctl_dm_op(), all the way to the user's ioctl(),
which I don't think is right; we should return a proper (negative) error.

Do you mind fixing that as well? Then you should be able to avoid
testing pages in a loop.

-boris
On 6/22/20 3:28 PM, Souptick Joarder wrote:
> On Tue, Jun 23, 2020 at 12:40 AM Boris Ostrovsky
> <boris.ostrovsky@oracle.com> wrote:
>> On 6/22/20 2:52 PM, Souptick Joarder wrote:
>>> I read the code again. I think, this check is needed to handle a scenario when
>>> lock_pages() return -ENOSPC. Better to keep this check. Let me post v2 of this
>>> RFC for a clear view.
>>
>> Actually, error handling seems to be somewhat broken here. If
>> lock_pages() returns number of pinned pages then that's what we end up
>> returning from privcmd_ioctl_dm_op(), all the way to user ioctl(). Which
>> I don't think is right, we should return proper (negative) error.
>>
> What -ERRNO is more appropriate here ? -ENOSPC ?

You can simply pass along the error code that get_user_pages_fast() returned.

-boris
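The error-handling concern above can be modeled in userspace as follows: the lock step returns 0 or a negative errno, and reports the pinned count through an out-parameter so the caller always knows how much to release. This is a sketch under stated assumptions, not the driver's code. `model_lock_pages()`, `pin_fn` and the three `pin_*` helpers are hypothetical, and mapping a short (partial) pin to -EFAULT is an assumption of this sketch; the thread itself only settles on passing a negative error from the pin call through unchanged.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical stand-in for the pin call: returns a negative errno on
 * failure, otherwise the number of pages pinned (possibly fewer than
 * requested).
 */
typedef int (*pin_fn)(int requested);

static int pin_all(int requested)  { return requested; }
static int pin_half(int requested) { return requested / 2; }
static int pin_fail(int requested) { (void)requested; return -ENOMEM; }

/*
 * Returns 0 on success or a negative errno, never a positive count.
 * *pinned always holds the number of pages the caller must release.
 */
static int model_lock_pages(pin_fn pin, int requested, int capacity,
			    int *pinned)
{
	int ret;

	*pinned = 0;

	if (requested > capacity)
		return -ENOSPC;

	ret = pin(requested);
	if (ret < 0)
		return ret;		/* pass the pin error along unchanged */

	*pinned = ret;			/* remember what must be released */

	if (ret < requested)
		return -EFAULT;		/* assumption: short pin is an error */

	return 0;
}
```

A caller written against this shape can unpin `*pinned` pages on every exit path and return the negative value to user space, so a positive page count never leaks out of the ioctl.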
On Tue, Jun 23, 2020 at 12:40 AM Boris Ostrovsky <boris.ostrovsky@oracle.com> wrote:
>
> On 6/22/20 2:52 PM, Souptick Joarder wrote:
> >
> > I read the code again. I think, this check is needed to handle a scenario when
> > lock_pages() return -ENOSPC. Better to keep this check. Let me post v2 of this
> > RFC for a clear view.
>
>
> Actually, error handling seems to be somewhat broken here. If
> lock_pages() returns number of pinned pages then that's what we end up
> returning from privcmd_ioctl_dm_op(), all the way to user ioctl(). Which
> I don't think is right, we should return proper (negative) error.
>

Which -ERRNO is more appropriate here? -ENOSPC?

>
> Do you mind fixing that we well? Then you should be able to avoid
> testing pages in a loop.

Ok, let me try to fix it.

>
>
> -boris
>
diff --git a/drivers/xen/privcmd.c b/drivers/xen/privcmd.c
index a250d11..543739e 100644
--- a/drivers/xen/privcmd.c
+++ b/drivers/xen/privcmd.c
@@ -594,7 +594,7 @@ static int lock_pages(
 		if (requested > nr_pages)
 			return -ENOSPC;
 
-		pinned = get_user_pages_fast(
+		pinned = pin_user_pages_fast(
 			(unsigned long) kbufs[i].uptr,
 			requested, FOLL_WRITE, pages);
 		if (pinned < 0)
@@ -614,10 +614,7 @@ static void unlock_pages(struct page *pages[], unsigned int nr_pages)
 	if (!pages)
 		return;
 
-	for (i = 0; i < nr_pages; i++) {
-		if (pages[i])
-			put_page(pages[i]);
-	}
+	unpin_user_pages(pages, nr_pages);
 }
 
 static long privcmd_ioctl_dm_op(struct file *file, void __user *udata)
In 2019, we introduced pin_user_pages*() and now we are converting
get_user_pages*() to the new API as appropriate. See [1] and [2]
for more information.

[1] Documentation/core-api/pin_user_pages.rst

[2] "Explicit pinning of user-space pages":
    https://lwn.net/Articles/807108/

Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Cc: John Hubbard <jhubbard@nvidia.com>
---
Hi,

I have compile-tested this patch but was unable to run-time test it,
so any testing help is much appreciated.

I also have a question: why is the existing code not marking the
pages dirty (since it used FOLL_WRITE)?

 drivers/xen/privcmd.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)