Message ID | 20190910103016.14290-3-osalvador@suse.de (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Hwpoison soft-offline rework | expand |
On 10.09.19 12:30, Oscar Salvador wrote: > From: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> > > Currently madvise_inject_error() pins the target via get_user_pages_fast. > The call to get_user_pages_fast is only to get the respective page > of a given address, but it is the job of the memory-poisoning handler > to deal with races, so drop the refcount grabbed by get_user_pages_fast. > > Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> > Signed-off-by: Oscar Salvador <osalvador@suse.de> > --- > mm/madvise.c | 25 +++++++++++-------------- > 1 file changed, 11 insertions(+), 14 deletions(-) > > diff --git a/mm/madvise.c b/mm/madvise.c > index 6e023414f5c1..fbe6d402232c 100644 > --- a/mm/madvise.c > +++ b/mm/madvise.c > @@ -883,6 +883,16 @@ static int madvise_inject_error(int behavior, > ret = get_user_pages_fast(start, 1, 0, &page); > if (ret != 1) > return ret; > + /* > + * The get_user_pages_fast() is just to get the pfn of the > + * given address, and the refcount has nothing to do with > + * what we try to test, so it should be released immediately. > + * This is racy but it's intended because the real hardware > + * errors could happen at any moment and memory error handlers > + * must properly handle the race. > + */ > + put_page(page); > + I wonder if it would be clearer to do that after the page has been fully used - e.g. after getting the pfn and the order (and then e.g., symbolically setting the page pointer to 0). I guess the important part of this patch is to not have an elevated refcount while calling soft_offline_page(). > pfn = page_to_pfn(page); > > /* > @@ -892,16 +902,11 @@ static int madvise_inject_error(int behavior, > */ > order = compound_order(compound_head(page)); > > - if (PageHWPoison(page)) { > - put_page(page); > - continue; > - } This change is not reflected in the changelog. I would have expected that only the put_page() would go. If this should go completely, I suggest a separate patch. > - > if (behavior == MADV_SOFT_OFFLINE) { > pr_info("Soft offlining pfn %#lx at process virtual address %#lx\n", > pfn, start); > > - ret = soft_offline_page(page, MF_COUNT_INCREASED); > + ret = soft_offline_page(page, 0); > if (ret) > return ret; > continue; > @@ -909,14 +914,6 @@ static int madvise_inject_error(int behavior, > > pr_info("Injecting memory failure for pfn %#lx at process virtual address %#lx\n", > pfn, start); > - > - /* > - * Drop the page reference taken by get_user_pages_fast(). In > - * the absence of MF_COUNT_INCREASED the memory_failure() > - * routine is responsible for pinning the page to prevent it > - * from being released back to the page allocator. > - */ > - put_page(page); > ret = memory_failure(pfn, 0); > if (ret) > return ret; >
On 10.09.19 12:30, Oscar Salvador wrote: > From: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> > > Currently madvise_inject_error() pins the target via get_user_pages_fast. > The call to get_user_pages_fast is only to get the respective page > of a given address, but it is the job of the memory-poisoning handler > to deal with races, so drop the refcount grabbed by get_user_pages_fast. > Oh, and another question "it is the job of the memory-poisoning handler" - is that already properly implemented? (newbee question ¯\_(ツ)_/¯)
Hi David, On Wed, Sep 11, 2019 at 12:23:24PM +0200, David Hildenbrand wrote: > On 10.09.19 12:30, Oscar Salvador wrote: > > From: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> > > > > Currently madvise_inject_error() pins the target via get_user_pages_fast. > > The call to get_user_pages_fast is only to get the respective page > > of a given address, but it is the job of the memory-poisoning handler > > to deal with races, so drop the refcount grabbed by get_user_pages_fast. > > > > Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> > > Signed-off-by: Oscar Salvador <osalvador@suse.de> > > --- > > mm/madvise.c | 25 +++++++++++-------------- > > 1 file changed, 11 insertions(+), 14 deletions(-) > > > > diff --git a/mm/madvise.c b/mm/madvise.c > > index 6e023414f5c1..fbe6d402232c 100644 > > --- a/mm/madvise.c > > +++ b/mm/madvise.c > > @@ -883,6 +883,16 @@ static int madvise_inject_error(int behavior, > > ret = get_user_pages_fast(start, 1, 0, &page); > > if (ret != 1) > > return ret; > > + /* > > + * The get_user_pages_fast() is just to get the pfn of the > > + * given address, and the refcount has nothing to do with > > + * what we try to test, so it should be released immediately. > > + * This is racy but it's intended because the real hardware > > + * errors could happen at any moment and memory error handlers > > + * must properly handle the race. > > + */ > > + put_page(page); > > + > > I wonder if it would be clearer to do that after the page has been fully > used - e.g. after getting the pfn and the order (and then e.g., > symbolically setting the page pointer to 0). Yes, this could be called just after page_to_pfn() below. > I guess the important part of this patch is to not have an elevated > refcount while calling soft_offline_page(). > That's right. > > pfn = page_to_pfn(page); > > > > /* > > @@ -892,16 +902,11 @@ static int madvise_inject_error(int behavior, > > */ > > order = compound_order(compound_head(page)); > > > > - if (PageHWPoison(page)) { > > - put_page(page); > > - continue; > > - } > > This change is not reflected in the changelog. I would have expected > that only the put_page() would go. If this should go completely, I > suggest a separate patch. > I forget why I tried to remove the if block, and now I think only the put_page() should go as you point out. Thanks for the comment. - Naoya
On Wed, Sep 11, 2019 at 12:27:22PM +0200, David Hildenbrand wrote: > On 10.09.19 12:30, Oscar Salvador wrote: > > From: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> > > > > Currently madvise_inject_error() pins the target via get_user_pages_fast. > > The call to get_user_pages_fast is only to get the respective page > > of a given address, but it is the job of the memory-poisoning handler > > to deal with races, so drop the refcount grabbed by get_user_pages_fast. > > > > Oh, and another question "it is the job of the memory-poisoning handler" > - is that already properly implemented? (newbee question ¯\_(ツ)_/¯) The above description might be confusing, sorry. It's intended likes The call to get_user_pages_fast is only to get the pointer to struct page of a given address, pinning it is memory-poisoning handler's job, so drop the refcount grabbed by get_user_pages_fast. And pinning is done in get_hwpoison_page() for hard-offline and get_any_page() for soft-offline. For soft-offline case, the semantics of refcount of poisoned pages is what this patchset tries to change/improve. Thanks, Naoya Horiguchi
On 12.09.19 03:37, Naoya Horiguchi wrote: > On Wed, Sep 11, 2019 at 12:27:22PM +0200, David Hildenbrand wrote: >> On 10.09.19 12:30, Oscar Salvador wrote: >>> From: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> >>> >>> Currently madvise_inject_error() pins the target via get_user_pages_fast. >>> The call to get_user_pages_fast is only to get the respective page >>> of a given address, but it is the job of the memory-poisoning handler >>> to deal with races, so drop the refcount grabbed by get_user_pages_fast. >>> >> >> Oh, and another question "it is the job of the memory-poisoning handler" >> - is that already properly implemented? (newbee question ¯\_(ツ)_/¯) > > The above description might be confusing, sorry. It's intended likes > > The call to get_user_pages_fast is only to get the pointer to struct > page of a given address, pinning it is memory-poisoning handler's job, > so drop the refcount grabbed by get_user_pages_fast. > > And pinning is done in get_hwpoison_page() for hard-offline and > get_any_page() for soft-offline. For soft-offline case, the semantics of > refcount of poisoned pages is what this patchset tries to change/improve. Thanks for the clarification!
diff --git a/mm/madvise.c b/mm/madvise.c index 6e023414f5c1..fbe6d402232c 100644 --- a/mm/madvise.c +++ b/mm/madvise.c @@ -883,6 +883,16 @@ static int madvise_inject_error(int behavior, ret = get_user_pages_fast(start, 1, 0, &page); if (ret != 1) return ret; + /* + * The get_user_pages_fast() is just to get the pfn of the + * given address, and the refcount has nothing to do with + * what we try to test, so it should be released immediately. + * This is racy but it's intended because the real hardware + * errors could happen at any moment and memory error handlers + * must properly handle the race. + */ + put_page(page); + pfn = page_to_pfn(page); /* @@ -892,16 +902,11 @@ static int madvise_inject_error(int behavior, */ order = compound_order(compound_head(page)); - if (PageHWPoison(page)) { - put_page(page); - continue; - } - if (behavior == MADV_SOFT_OFFLINE) { pr_info("Soft offlining pfn %#lx at process virtual address %#lx\n", pfn, start); - ret = soft_offline_page(page, MF_COUNT_INCREASED); + ret = soft_offline_page(page, 0); if (ret) return ret; continue; @@ -909,14 +914,6 @@ static int madvise_inject_error(int behavior, pr_info("Injecting memory failure for pfn %#lx at process virtual address %#lx\n", pfn, start); - - /* - * Drop the page reference taken by get_user_pages_fast(). In - * the absence of MF_COUNT_INCREASED the memory_failure() - * routine is responsible for pinning the page to prevent it - * from being released back to the page allocator. - */ - put_page(page); ret = memory_failure(pfn, 0); if (ret) return ret;