mbox series

[v2,0/2] A couple hugetlbfs fixes

Message ID 20190328234704.27083-1-mike.kravetz@oracle.com (mailing list archive)
Headers show
Series A couple hugetlbfs fixes | expand

Message

Mike Kravetz March 28, 2019, 11:47 p.m. UTC
I stumbled on these two hugetlbfs issues while looking at other things:
- The 'restore reserve' functionality at page free time should not
  be adjusting subpool counts.
- A BUG can be triggered (not easily) due to temporarily mapping a
  page before doing a COW.

Both are described in detail in the commit message of the patches.
I would appreciate comments from Davidlohr Bueso as one patch is
directly related to code he added in commit 8382d914ebf7.

I did not cc stable as the first problem has been around since reserves
were added to hugetlbfs and nobody has noticed.  The second is very hard
to hit/reproduce.

v2 - Update definition and all callers of hugetlb_fault_mutex_hash as
     the arguments mm and vma are no longer used or necessary.

Mike Kravetz (2):
  huegtlbfs: on restore reserve error path retain subpool reservation
  hugetlb: use same fault hash key for shared and private mappings

 fs/hugetlbfs/inode.c    |  7 ++-----
 include/linux/hugetlb.h |  4 +---
 mm/hugetlb.c            | 43 +++++++++++++++++++++--------------------
 mm/userfaultfd.c        |  3 +--
 4 files changed, 26 insertions(+), 31 deletions(-)

Comments

Naoya Horiguchi April 3, 2019, 5:29 a.m. UTC | #1
On Thu, Mar 28, 2019 at 04:47:02PM -0700, Mike Kravetz wrote:
> I stumbled on these two hugetlbfs issues while looking at other things:
> - The 'restore reserve' functionality at page free time should not
>   be adjusting subpool counts.
> - A BUG can be triggered (not easily) due to temporarily mapping a
>   page before doing a COW.
> 
> Both are described in detail in the commit message of the patches.
> I would appreciate comments from Davidlohr Bueso as one patch is
> directly related to code he added in commit 8382d914ebf7.
> 
> I did not cc stable as the first problem has been around since reserves
> were added to hugetlbfs and nobody has noticed.  The second is very hard
> to hit/reproduce.
> 
> v2 - Update definition and all callers of hugetlb_fault_mutex_hash as
>      the arguments mm and vma are no longer used or necessary.
> 
> Mike Kravetz (2):
>   huegtlbfs: on restore reserve error path retain subpool reservation
>   hugetlb: use same fault hash key for shared and private mappings
> 
>  fs/hugetlbfs/inode.c    |  7 ++-----
>  include/linux/hugetlb.h |  4 +---
>  mm/hugetlb.c            | 43 +++++++++++++++++++++--------------------
>  mm/userfaultfd.c        |  3 +--
>  4 files changed, 26 insertions(+), 31 deletions(-)

Both fixes look fine to me.

Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Davidlohr Bueso April 8, 2019, 7:48 p.m. UTC | #2
On Thu, 28 Mar 2019, Mike Kravetz wrote:

>- A BUG can be triggered (not easily) due to temporarily mapping a
>  page before doing a COW.

But you actually _have_ seen it? Do you have the traces? I ask
not because of the patches perse, but because it would be nice
to have a real snipplet in the Changelog for patch 2.

Thanks,
Davidlohr
Mike Kravetz April 9, 2019, 3:30 a.m. UTC | #3
On 4/8/19 12:48 PM, Davidlohr Bueso wrote:
> On Thu, 28 Mar 2019, Mike Kravetz wrote:
> 
>> - A BUG can be triggered (not easily) due to temporarily mapping a
>>  page before doing a COW.
> 
> But you actually _have_ seen it? Do you have the traces? I ask
> not because of the patches perse, but because it would be nice
> to have a real snipplet in the Changelog for patch 2.

Yes, I actually saw this problem.  It happened while I was debugging and
testing some patches for hugetlb migration.  The BUG I hit was in
unaccount_page_cache_page(): VM_BUG_ON_PAGE(page_mapped(page), page).

Stack trace was something like:
unaccount_page_cache_page
  __delete_from_page_cache
    delete_from_page_cache
      remove_huge_page
        remove_inode_hugepages
          hugetlbfs_punch_hole
            hugetlbfs_fallocate

When I hit that, it took me a while to figure out how it could happen.
i.e. How could a page be mapped at that point in remove_inode_hugepages?
It checks page_mapped and we are holding the fault mutex.  With some
additional debug code (strategic udelays) I could hit the issue on a
somewhat regular basis and verified another thread was in the
hugetlb_no_page/hugetlb_cow path for the same page at the same time.

Unfortunately, I did not save the traces.  I am trying to recreate now.
However, my test system was recently updated and it might take a little
time to recreate.