Message ID | 20200331085604.1260162-1-ying.huang@intel.com (mailing list archive) |
---|---|
State | New, archived |
Series | /proc/PID/smaps: Add PMD migration entry parsing |
On 31/03/2020 11.56, Huang, Ying wrote:
> From: Huang Ying <ying.huang@intel.com>
>
> Now, when read /proc/PID/smaps, the PMD migration entry in page table is simply
> ignored. To improve the accuracy of /proc/PID/smaps, its parsing and processing
> is added.
>
> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
> Cc: Andrea Arcangeli <aarcange@redhat.com>
> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Cc: Zi Yan <ziy@nvidia.com>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Alexey Dobriyan <adobriyan@gmail.com>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
> Cc: "Jérôme Glisse" <jglisse@redhat.com>
> Cc: Yang Shi <yang.shi@linux.alibaba.com>
> ---
>  fs/proc/task_mmu.c | 16 ++++++++++++----
>  1 file changed, 12 insertions(+), 4 deletions(-)
>
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index 8d382d4ec067..b5b3aef8cb3b 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -548,8 +548,17 @@ static void smaps_pmd_entry(pmd_t *pmd, unsigned long addr,
>  	bool locked = !!(vma->vm_flags & VM_LOCKED);
>  	struct page *page;

	struct page *page = NULL;

>
> -	/* FOLL_DUMP will return -EFAULT on huge zero page */
> -	page = follow_trans_huge_pmd(vma, addr, pmd, FOLL_DUMP);
> +	if (pmd_present(*pmd)) {
> +		/* FOLL_DUMP will return -EFAULT on huge zero page */
> +		page = follow_trans_huge_pmd(vma, addr, pmd, FOLL_DUMP);
> +	} else if (unlikely(is_swap_pmd(*pmd))) {
> +		swp_entry_t entry = pmd_to_swp_entry(*pmd);
> +
> +		VM_BUG_ON(!is_migration_entry(entry));
> +		page = migration_entry_to_page(entry);

	if (is_migration_entry(entry))
		page = migration_entry_to_page(entry);

Seems safer and doesn't add much code.

> +	} else {
> +		return;
> +	}
>  	if (IS_ERR_OR_NULL(page))
>  		return;
>  	if (PageAnon(page))
> @@ -578,8 +587,7 @@ static int smaps_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
>
>  	ptl = pmd_trans_huge_lock(pmd, vma);
>  	if (ptl) {
> -		if (pmd_present(*pmd))
> -			smaps_pmd_entry(pmd, addr, walk);
> +		smaps_pmd_entry(pmd, addr, walk);
>  		spin_unlock(ptl);
>  		goto out;
>  	}
>
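For reference, the two suggestions above combine like this in smaps_pmd_entry(): page initialized to NULL, and the migration-entry check turned into a guard instead of an assertion. This is an illustrative sketch against the posted diff, not a revision taken from the thread:

	/* Sketch only: smaps_pmd_entry() with the suggested defensive guard. */
	static void smaps_pmd_entry(pmd_t *pmd, unsigned long addr,
				    struct mm_walk *walk)
	{
		struct mem_size_stats *mss = walk->private;
		struct vm_area_struct *vma = walk->vma;
		bool locked = !!(vma->vm_flags & VM_LOCKED);
		struct page *page = NULL;	/* stays NULL for any unhandled state */

		if (pmd_present(*pmd)) {
			/* FOLL_DUMP will return -EFAULT on huge zero page */
			page = follow_trans_huge_pmd(vma, addr, pmd, FOLL_DUMP);
		} else if (unlikely(is_swap_pmd(*pmd))) {
			swp_entry_t entry = pmd_to_swp_entry(*pmd);

			/* Guard rather than assert: skip anything unexpected. */
			if (is_migration_entry(entry))
				page = migration_entry_to_page(entry);
		}
		/* No else-return needed: a NULL page falls through to here. */
		if (IS_ERR_OR_NULL(page))
			return;
		/* ... per-page accounting continues as in the posted patch ... */
	}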
On 31 Mar 2020, at 4:56, Huang, Ying wrote:

>
> From: Huang Ying <ying.huang@intel.com>
>
> Now, when read /proc/PID/smaps, the PMD migration entry in page table is simply
> ignored. To improve the accuracy of /proc/PID/smaps, its parsing and processing
> is added.
>
> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
> Cc: Andrea Arcangeli <aarcange@redhat.com>
> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Cc: Zi Yan <ziy@nvidia.com>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Alexey Dobriyan <adobriyan@gmail.com>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
> Cc: "Jérôme Glisse" <jglisse@redhat.com>
> Cc: Yang Shi <yang.shi@linux.alibaba.com>
> ---
>  fs/proc/task_mmu.c | 16 ++++++++++++----
>  1 file changed, 12 insertions(+), 4 deletions(-)
>
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index 8d382d4ec067..b5b3aef8cb3b 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -548,8 +548,17 @@ static void smaps_pmd_entry(pmd_t *pmd, unsigned long addr,
>  	bool locked = !!(vma->vm_flags & VM_LOCKED);
>  	struct page *page;

Like Konstantin pointed out in another email, you could initialize page to NULL here.
Plus you do not need the “else-return” below, if you do that.

>
> -	/* FOLL_DUMP will return -EFAULT on huge zero page */
> -	page = follow_trans_huge_pmd(vma, addr, pmd, FOLL_DUMP);
> +	if (pmd_present(*pmd)) {
> +		/* FOLL_DUMP will return -EFAULT on huge zero page */
> +		page = follow_trans_huge_pmd(vma, addr, pmd, FOLL_DUMP);
> +	} else if (unlikely(is_swap_pmd(*pmd))) {

Should be:

	} else if (unlikely(thp_migration_support() && is_swap_pmd(*pmd))) {

Otherwise, when THP migration is disabled and the PMD is under splitting, VM_BUG_ON
will be triggered.

> +		swp_entry_t entry = pmd_to_swp_entry(*pmd);
> +
> +		VM_BUG_ON(!is_migration_entry(entry));
> +		page = migration_entry_to_page(entry);
> +	} else {
> +		return;
> +	}
>  	if (IS_ERR_OR_NULL(page))
>  		return;
>  	if (PageAnon(page))
> @@ -578,8 +587,7 @@ static int smaps_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
>
>  	ptl = pmd_trans_huge_lock(pmd, vma);
>  	if (ptl) {
> -		if (pmd_present(*pmd))
> -			smaps_pmd_entry(pmd, addr, walk);
> +		smaps_pmd_entry(pmd, addr, walk);
>  		spin_unlock(ptl);
>  		goto out;
>  	}
> --
> 2.25.0

Everything else looks good to me. Thanks.

With the fixes mentioned above, you can add
Reviewed-by: Zi Yan <ziy@nvidia.com>

—
Best Regards,
Yan Zi
Zi Yan <ziy@nvidia.com> writes:

> On 31 Mar 2020, at 4:56, Huang, Ying wrote:
>>
>> From: Huang Ying <ying.huang@intel.com>
>>
>> Now, when read /proc/PID/smaps, the PMD migration entry in page table is simply
>> ignored. To improve the accuracy of /proc/PID/smaps, its parsing and processing
>> is added.
>>
>> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
>> Cc: Andrea Arcangeli <aarcange@redhat.com>
>> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>> Cc: Zi Yan <ziy@nvidia.com>
>> Cc: Vlastimil Babka <vbabka@suse.cz>
>> Cc: Alexey Dobriyan <adobriyan@gmail.com>
>> Cc: Michal Hocko <mhocko@suse.com>
>> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
>> Cc: "Jérôme Glisse" <jglisse@redhat.com>
>> Cc: Yang Shi <yang.shi@linux.alibaba.com>
>> ---
>>  fs/proc/task_mmu.c | 16 ++++++++++++----
>>  1 file changed, 12 insertions(+), 4 deletions(-)
>>
>> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
>> index 8d382d4ec067..b5b3aef8cb3b 100644
>> --- a/fs/proc/task_mmu.c
>> +++ b/fs/proc/task_mmu.c
>> @@ -548,8 +548,17 @@ static void smaps_pmd_entry(pmd_t *pmd, unsigned long addr,
>>  	bool locked = !!(vma->vm_flags & VM_LOCKED);
>>  	struct page *page;
>
> Like Konstantin pointed out in another email, you could initialize page to NULL here.
> Plus you do not need the “else-return” below, if you do that.

Yes. That looks better. Will change this in the next version.

>>
>> -	/* FOLL_DUMP will return -EFAULT on huge zero page */
>> -	page = follow_trans_huge_pmd(vma, addr, pmd, FOLL_DUMP);
>> +	if (pmd_present(*pmd)) {
>> +		/* FOLL_DUMP will return -EFAULT on huge zero page */
>> +		page = follow_trans_huge_pmd(vma, addr, pmd, FOLL_DUMP);
>> +	} else if (unlikely(is_swap_pmd(*pmd))) {
>
> Should be:
> } else if (unlikely(thp_migration_support() && is_swap_pmd(*pmd))) {
>
> Otherwise, when THP migration is disabled and the PMD is under splitting, VM_BUG_ON
> will be triggered.

We hold the PMD page table lock when call smaps_pmd_entry(). How does
PMD splitting trigger VM_BUG_ON()?

>> +		swp_entry_t entry = pmd_to_swp_entry(*pmd);
>> +
>> +		VM_BUG_ON(!is_migration_entry(entry));
>> +		page = migration_entry_to_page(entry);
>> +	} else {
>> +		return;
>> +	}
>>  	if (IS_ERR_OR_NULL(page))
>>  		return;
>>  	if (PageAnon(page))
>> @@ -578,8 +587,7 @@ static int smaps_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
>>
>>  	ptl = pmd_trans_huge_lock(pmd, vma);
>>  	if (ptl) {
>> -		if (pmd_present(*pmd))
>> -			smaps_pmd_entry(pmd, addr, walk);
>> +		smaps_pmd_entry(pmd, addr, walk);
>>  		spin_unlock(ptl);
>>  		goto out;
>>  	}
>> --
>> 2.25.0
>
> Everything else looks good to me. Thanks.
>
> With the fixes mentioned above, you can add
> Reviewed-by: Zi Yan <ziy@nvidia.com>

Thanks!

Best Regards,
Huang, Ying
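For context, the lock Huang Ying refers to is taken by the caller and is visible in the second hunk of the posted diff: pmd_trans_huge_lock() returns with the PMD spinlock held, and the PMD split path takes the same lock, so *pmd cannot change underneath smaps_pmd_entry(). Annotated excerpt of that hunk:

	/* From smaps_pte_range() in the posted diff (comments added). */
	ptl = pmd_trans_huge_lock(pmd, vma);	/* non-NULL for a huge or swap PMD */
	if (ptl) {
		/* *pmd is stable here: splitting must take the same PMD lock. */
		smaps_pmd_entry(pmd, addr, walk);
		spin_unlock(ptl);
		goto out;
	}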
Konstantin Khlebnikov <khlebnikov@yandex-team.ru> writes:

> On 31/03/2020 11.56, Huang, Ying wrote:
>> From: Huang Ying <ying.huang@intel.com>
>>
>> Now, when read /proc/PID/smaps, the PMD migration entry in page table is simply
>> ignored. To improve the accuracy of /proc/PID/smaps, its parsing and processing
>> is added.
>>
>> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
>> Cc: Andrea Arcangeli <aarcange@redhat.com>
>> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>> Cc: Zi Yan <ziy@nvidia.com>
>> Cc: Vlastimil Babka <vbabka@suse.cz>
>> Cc: Alexey Dobriyan <adobriyan@gmail.com>
>> Cc: Michal Hocko <mhocko@suse.com>
>> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
>> Cc: "Jérôme Glisse" <jglisse@redhat.com>
>> Cc: Yang Shi <yang.shi@linux.alibaba.com>
>> ---
>>  fs/proc/task_mmu.c | 16 ++++++++++++----
>>  1 file changed, 12 insertions(+), 4 deletions(-)
>>
>> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
>> index 8d382d4ec067..b5b3aef8cb3b 100644
>> --- a/fs/proc/task_mmu.c
>> +++ b/fs/proc/task_mmu.c
>> @@ -548,8 +548,17 @@ static void smaps_pmd_entry(pmd_t *pmd, unsigned long addr,
>>  	bool locked = !!(vma->vm_flags & VM_LOCKED);
>>  	struct page *page;
>
> 	struct page *page = NULL;

Looks good. Will do this in the next version.

>> -	/* FOLL_DUMP will return -EFAULT on huge zero page */
>> -	page = follow_trans_huge_pmd(vma, addr, pmd, FOLL_DUMP);
>> +	if (pmd_present(*pmd)) {
>> +		/* FOLL_DUMP will return -EFAULT on huge zero page */
>> +		page = follow_trans_huge_pmd(vma, addr, pmd, FOLL_DUMP);
>> +	} else if (unlikely(is_swap_pmd(*pmd))) {
>> +		swp_entry_t entry = pmd_to_swp_entry(*pmd);
>> +
>> +		VM_BUG_ON(!is_migration_entry(entry));
>> +		page = migration_entry_to_page(entry);
>
> 	if (is_migration_entry(entry))
> 		page = migration_entry_to_page(entry);
>
> Seems safer and doesn't add much code.

With this, we lose an opportunity to capture some bugs during debugging.
Right?

Best Regards,
Huang, Ying

>> +	} else {
>> +		return;
>> +	}
>>  	if (IS_ERR_OR_NULL(page))
>>  		return;
>>  	if (PageAnon(page))
>> @@ -578,8 +587,7 @@ static int smaps_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
>>  	ptl = pmd_trans_huge_lock(pmd, vma);
>>  	if (ptl) {
>> -		if (pmd_present(*pmd))
>> -			smaps_pmd_entry(pmd, addr, walk);
>> +		smaps_pmd_entry(pmd, addr, walk);
>>  		spin_unlock(ptl);
>>  		goto out;
>>  	}
>>
On 31 Mar 2020, at 22:24, Huang, Ying wrote:

> Zi Yan <ziy@nvidia.com> writes:
>
>> On 31 Mar 2020, at 4:56, Huang, Ying wrote:
>>>
>>> From: Huang Ying <ying.huang@intel.com>
>>>
>>> Now, when read /proc/PID/smaps, the PMD migration entry in page table is simply
>>> ignored. To improve the accuracy of /proc/PID/smaps, its parsing and processing
>>> is added.
>>>
>>> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
>>> Cc: Andrea Arcangeli <aarcange@redhat.com>
>>> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>>> Cc: Zi Yan <ziy@nvidia.com>
>>> Cc: Vlastimil Babka <vbabka@suse.cz>
>>> Cc: Alexey Dobriyan <adobriyan@gmail.com>
>>> Cc: Michal Hocko <mhocko@suse.com>
>>> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
>>> Cc: "Jérôme Glisse" <jglisse@redhat.com>
>>> Cc: Yang Shi <yang.shi@linux.alibaba.com>
>>> ---
>>>  fs/proc/task_mmu.c | 16 ++++++++++++----
>>>  1 file changed, 12 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
>>> index 8d382d4ec067..b5b3aef8cb3b 100644
>>> --- a/fs/proc/task_mmu.c
>>> +++ b/fs/proc/task_mmu.c
>>> @@ -548,8 +548,17 @@ static void smaps_pmd_entry(pmd_t *pmd, unsigned long addr,
>>>  	bool locked = !!(vma->vm_flags & VM_LOCKED);
>>>  	struct page *page;
>>
>> Like Konstantin pointed out in another email, you could initialize page to NULL here.
>> Plus you do not need the “else-return” below, if you do that.
>
> Yes. That looks better. Will change this in the next version.

Thanks.

>
>>>
>>> -	/* FOLL_DUMP will return -EFAULT on huge zero page */
>>> -	page = follow_trans_huge_pmd(vma, addr, pmd, FOLL_DUMP);
>>> +	if (pmd_present(*pmd)) {
>>> +		/* FOLL_DUMP will return -EFAULT on huge zero page */
>>> +		page = follow_trans_huge_pmd(vma, addr, pmd, FOLL_DUMP);
>>> +	} else if (unlikely(is_swap_pmd(*pmd))) {
>>
>> Should be:
>> } else if (unlikely(thp_migration_support() && is_swap_pmd(*pmd))) {
>>
>> Otherwise, when THP migration is disabled and the PMD is under splitting, VM_BUG_ON
>> will be triggered.
>
> We hold the PMD page table lock when call smaps_pmd_entry(). How does
> PMD splitting trigger VM_BUG_ON()?

Oh, I missed that. Your original code is right. Thank you for the explanation.

—
Best Regards,
Yan Zi
On 01/04/2020 05.31, Huang, Ying wrote:
> Konstantin Khlebnikov <khlebnikov@yandex-team.ru> writes:
>
>> On 31/03/2020 11.56, Huang, Ying wrote:
>>> From: Huang Ying <ying.huang@intel.com>
>>>
>>> Now, when read /proc/PID/smaps, the PMD migration entry in page table is simply
>>> ignored. To improve the accuracy of /proc/PID/smaps, its parsing and processing
>>> is added.
>>>
>>> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
>>> Cc: Andrea Arcangeli <aarcange@redhat.com>
>>> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>>> Cc: Zi Yan <ziy@nvidia.com>
>>> Cc: Vlastimil Babka <vbabka@suse.cz>
>>> Cc: Alexey Dobriyan <adobriyan@gmail.com>
>>> Cc: Michal Hocko <mhocko@suse.com>
>>> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
>>> Cc: "Jérôme Glisse" <jglisse@redhat.com>
>>> Cc: Yang Shi <yang.shi@linux.alibaba.com>
>>> ---
>>>  fs/proc/task_mmu.c | 16 ++++++++++++----
>>>  1 file changed, 12 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
>>> index 8d382d4ec067..b5b3aef8cb3b 100644
>>> --- a/fs/proc/task_mmu.c
>>> +++ b/fs/proc/task_mmu.c
>>> @@ -548,8 +548,17 @@ static void smaps_pmd_entry(pmd_t *pmd, unsigned long addr,
>>>  	bool locked = !!(vma->vm_flags & VM_LOCKED);
>>>  	struct page *page;
>>
>> 	struct page *page = NULL;
>
> Looks good. Will do this in the next version.
>
>>> -	/* FOLL_DUMP will return -EFAULT on huge zero page */
>>> -	page = follow_trans_huge_pmd(vma, addr, pmd, FOLL_DUMP);
>>> +	if (pmd_present(*pmd)) {
>>> +		/* FOLL_DUMP will return -EFAULT on huge zero page */
>>> +		page = follow_trans_huge_pmd(vma, addr, pmd, FOLL_DUMP);
>>> +	} else if (unlikely(is_swap_pmd(*pmd))) {
>>> +		swp_entry_t entry = pmd_to_swp_entry(*pmd);
>>> +
>>> +		VM_BUG_ON(!is_migration_entry(entry));
>>> +		page = migration_entry_to_page(entry);
>>
>> 	if (is_migration_entry(entry))
>> 		page = migration_entry_to_page(entry);
>>
>> Seems safer and doesn't add much code.
>
> With this, we lose an opportunity to capture some bugs during debugging.
> Right?

You can keep VM_BUG_ON or VM_WARN_ON_ONCE

Off-by-page in statistics isn't a big deal and not a good reason to crash
(even debug) kernel. But for normal build should use safe behaviour if
this isn't hard.

>
> Best Regards,
> Huang, Ying
>
>>> +	} else {
>>> +		return;
>>> +	}
>>>  	if (IS_ERR_OR_NULL(page))
>>>  		return;
>>>  	if (PageAnon(page))
>>> @@ -578,8 +587,7 @@ static int smaps_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
>>>  	ptl = pmd_trans_huge_lock(pmd, vma);
>>>  	if (ptl) {
>>> -		if (pmd_present(*pmd))
>>> -			smaps_pmd_entry(pmd, addr, walk);
>>> +		smaps_pmd_entry(pmd, addr, walk);
>>>  		spin_unlock(ptl);
>>>  		goto out;
>>>  	}
>>>
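A sketch of the branch with this advice applied: a one-shot warning for debug kernels plus a safe fallthrough for normal builds. Illustrative only; the exact form of the follow-up patch is not shown in this thread:

	} else if (unlikely(is_swap_pmd(*pmd))) {
		swp_entry_t entry = pmd_to_swp_entry(*pmd);

		/* Complain once under CONFIG_DEBUG_VM, but do not crash. */
		VM_WARN_ON_ONCE(!is_migration_entry(entry));
		if (is_migration_entry(entry))
			page = migration_entry_to_page(entry);
	}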
Konstantin Khlebnikov <khlebnikov@yandex-team.ru> writes:

> On 01/04/2020 05.31, Huang, Ying wrote:
>> Konstantin Khlebnikov <khlebnikov@yandex-team.ru> writes:
>>
>>> On 31/03/2020 11.56, Huang, Ying wrote:
>>>> From: Huang Ying <ying.huang@intel.com>
>>>>
>>>> Now, when read /proc/PID/smaps, the PMD migration entry in page table is simply
>>>> ignored. To improve the accuracy of /proc/PID/smaps, its parsing and processing
>>>> is added.
>>>>
>>>> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
>>>> Cc: Andrea Arcangeli <aarcange@redhat.com>
>>>> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>>>> Cc: Zi Yan <ziy@nvidia.com>
>>>> Cc: Vlastimil Babka <vbabka@suse.cz>
>>>> Cc: Alexey Dobriyan <adobriyan@gmail.com>
>>>> Cc: Michal Hocko <mhocko@suse.com>
>>>> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
>>>> Cc: "Jérôme Glisse" <jglisse@redhat.com>
>>>> Cc: Yang Shi <yang.shi@linux.alibaba.com>
>>>> ---
>>>>  fs/proc/task_mmu.c | 16 ++++++++++++----
>>>>  1 file changed, 12 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
>>>> index 8d382d4ec067..b5b3aef8cb3b 100644
>>>> --- a/fs/proc/task_mmu.c
>>>> +++ b/fs/proc/task_mmu.c
>>>> @@ -548,8 +548,17 @@ static void smaps_pmd_entry(pmd_t *pmd, unsigned long addr,
>>>>  	bool locked = !!(vma->vm_flags & VM_LOCKED);
>>>>  	struct page *page;
>>>
>>> 	struct page *page = NULL;
>>
>> Looks good. Will do this in the next version.
>>
>>>> -	/* FOLL_DUMP will return -EFAULT on huge zero page */
>>>> -	page = follow_trans_huge_pmd(vma, addr, pmd, FOLL_DUMP);
>>>> +	if (pmd_present(*pmd)) {
>>>> +		/* FOLL_DUMP will return -EFAULT on huge zero page */
>>>> +		page = follow_trans_huge_pmd(vma, addr, pmd, FOLL_DUMP);
>>>> +	} else if (unlikely(is_swap_pmd(*pmd))) {
>>>> +		swp_entry_t entry = pmd_to_swp_entry(*pmd);
>>>> +
>>>> +		VM_BUG_ON(!is_migration_entry(entry));
>>>> +		page = migration_entry_to_page(entry);
>>>
>>> 	if (is_migration_entry(entry))
>>> 		page = migration_entry_to_page(entry);
>>>
>>> Seems safer and doesn't add much code.
>>
>> With this, we lose an opportunity to capture some bugs during debugging.
>> Right?
>
> You can keep VM_BUG_ON or VM_WARN_ON_ONCE
>
> Off-by-page in statistics isn't a big deal and not a good reason to crash
> (even debug) kernel. But for normal build should use safe behaviour if
> this isn't hard.

Sounds reasonable! Will revise the code.

Thanks!

Best Regards,
Huang, Ying

>>
>> Best Regards,
>> Huang, Ying
>>
>>>> +	} else {
>>>> +		return;
>>>> +	}
>>>>  	if (IS_ERR_OR_NULL(page))
>>>>  		return;
>>>>  	if (PageAnon(page))
>>>> @@ -578,8 +587,7 @@ static int smaps_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
>>>>  	ptl = pmd_trans_huge_lock(pmd, vma);
>>>>  	if (ptl) {
>>>> -		if (pmd_present(*pmd))
>>>> -			smaps_pmd_entry(pmd, addr, walk);
>>>> +		smaps_pmd_entry(pmd, addr, walk);
>>>>  		spin_unlock(ptl);
>>>>  		goto out;
>>>>  	}
>>>>
On Tue, 31 Mar 2020 16:56:04 +0800 "Huang, Ying" <ying.huang@intel.com> wrote:

> From: Huang Ying <ying.huang@intel.com>
>
> Now, when read /proc/PID/smaps, the PMD migration entry in page table is simply
> ignored. To improve the accuracy of /proc/PID/smaps, its parsing and processing
> is added.

It would be helpful to show the before-and-after output in the changelog.

> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -548,8 +548,17 @@ static void smaps_pmd_entry(pmd_t *pmd, unsigned long addr,
>  	bool locked = !!(vma->vm_flags & VM_LOCKED);
>  	struct page *page;
>
> -	/* FOLL_DUMP will return -EFAULT on huge zero page */
> -	page = follow_trans_huge_pmd(vma, addr, pmd, FOLL_DUMP);
> +	if (pmd_present(*pmd)) {
> +		/* FOLL_DUMP will return -EFAULT on huge zero page */
> +		page = follow_trans_huge_pmd(vma, addr, pmd, FOLL_DUMP);
> +	} else if (unlikely(is_swap_pmd(*pmd))) {
> +		swp_entry_t entry = pmd_to_swp_entry(*pmd);
> +
> +		VM_BUG_ON(!is_migration_entry(entry));

I don't think this justifies nuking the kernel. A
WARN()-and-recover would be better.

> +		page = migration_entry_to_page(entry);
> +	} else {
> +		return;
> +	}
>  	if (IS_ERR_OR_NULL(page))
>  		return;
>  	if (PageAnon(page))
> @@ -578,8 +587,7 @@ static int smaps_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
>
>  	ptl = pmd_trans_huge_lock(pmd, vma);
>  	if (ptl) {
> -		if (pmd_present(*pmd))
> -			smaps_pmd_entry(pmd, addr, walk);
> +		smaps_pmd_entry(pmd, addr, walk);
>  		spin_unlock(ptl);
>  		goto out;
>  	}
Andrew Morton <akpm@linux-foundation.org> writes:

> On Tue, 31 Mar 2020 16:56:04 +0800 "Huang, Ying" <ying.huang@intel.com> wrote:
>
>> From: Huang Ying <ying.huang@intel.com>
>>
>> Now, when read /proc/PID/smaps, the PMD migration entry in page table is simply
>> ignored. To improve the accuracy of /proc/PID/smaps, its parsing and processing
>> is added.
>
> It would be helpful to show the before-and-after output in the changelog.

OK.

>> --- a/fs/proc/task_mmu.c
>> +++ b/fs/proc/task_mmu.c
>> @@ -548,8 +548,17 @@ static void smaps_pmd_entry(pmd_t *pmd, unsigned long addr,
>>  	bool locked = !!(vma->vm_flags & VM_LOCKED);
>>  	struct page *page;
>>
>> -	/* FOLL_DUMP will return -EFAULT on huge zero page */
>> -	page = follow_trans_huge_pmd(vma, addr, pmd, FOLL_DUMP);
>> +	if (pmd_present(*pmd)) {
>> +		/* FOLL_DUMP will return -EFAULT on huge zero page */
>> +		page = follow_trans_huge_pmd(vma, addr, pmd, FOLL_DUMP);
>> +	} else if (unlikely(is_swap_pmd(*pmd))) {
>> +		swp_entry_t entry = pmd_to_swp_entry(*pmd);
>> +
>> +		VM_BUG_ON(!is_migration_entry(entry));
>
> I don't think this justifies nuking the kernel. A
> WARN()-and-recover would be better.

Yes. Will change this in the next version.

Best Regards,
Huang, Ying

>> +		page = migration_entry_to_page(entry);
>> +	} else {
>> +		return;
>> +	}
>>  	if (IS_ERR_OR_NULL(page))
>>  		return;
>>  	if (PageAnon(page))
>> @@ -578,8 +587,7 @@ static int smaps_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
>>
>>  	ptl = pmd_trans_huge_lock(pmd, vma);
>>  	if (ptl) {
>> -		if (pmd_present(*pmd))
>> -			smaps_pmd_entry(pmd, addr, walk);
>> +		smaps_pmd_entry(pmd, addr, walk);
>>  		spin_unlock(ptl);
>>  		goto out;
>>  	}
Zi Yan <ziy@nvidia.com> writes:

> On 31 Mar 2020, at 4:56, Huang, Ying wrote:
>>
>> From: Huang Ying <ying.huang@intel.com>
>> -	/* FOLL_DUMP will return -EFAULT on huge zero page */
>> -	page = follow_trans_huge_pmd(vma, addr, pmd, FOLL_DUMP);
>> +	if (pmd_present(*pmd)) {
>> +		/* FOLL_DUMP will return -EFAULT on huge zero page */
>> +		page = follow_trans_huge_pmd(vma, addr, pmd, FOLL_DUMP);
>> +	} else if (unlikely(is_swap_pmd(*pmd))) {
>
> Should be:
> } else if (unlikely(thp_migration_support() && is_swap_pmd(*pmd))) {

Thought this again. This can reduce the code size if
thp_migration_support() isn't enabled. I will do this in the next
version.

Best Regards,
Huang, Ying
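The code-size point holds because the helper is a compile-time constant. In mainline the helper is spelled thp_migration_supported() and is defined roughly as below, so when the config option is off the condition folds to false and the compiler can drop the whole else-if branch:

	/* include/linux/huge_mm.h (approximate definition) */
	static inline bool thp_migration_supported(void)
	{
		/* Constant-folds to false when THP migration is not configured. */
		return IS_ENABLED(CONFIG_ARCH_ENABLE_THP_MIGRATION);
	}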
On Wed 01-04-20 16:04:32, Andrew Morton wrote:
> On Tue, 31 Mar 2020 16:56:04 +0800 "Huang, Ying" <ying.huang@intel.com> wrote:
>
> > From: Huang Ying <ying.huang@intel.com>
> >
> > Now, when read /proc/PID/smaps, the PMD migration entry in page table is simply
> > ignored. To improve the accuracy of /proc/PID/smaps, its parsing and processing
> > is added.
>
> It would be helpful to show the before-and-after output in the changelog.

Migration entries are ephemeral. Is this observable in practice? I
suspect this is just primarily motivated by reading the code more than
hitting the actual problem.
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 8d382d4ec067..b5b3aef8cb3b 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -548,8 +548,17 @@ static void smaps_pmd_entry(pmd_t *pmd, unsigned long addr,
 	bool locked = !!(vma->vm_flags & VM_LOCKED);
 	struct page *page;
 
-	/* FOLL_DUMP will return -EFAULT on huge zero page */
-	page = follow_trans_huge_pmd(vma, addr, pmd, FOLL_DUMP);
+	if (pmd_present(*pmd)) {
+		/* FOLL_DUMP will return -EFAULT on huge zero page */
+		page = follow_trans_huge_pmd(vma, addr, pmd, FOLL_DUMP);
+	} else if (unlikely(is_swap_pmd(*pmd))) {
+		swp_entry_t entry = pmd_to_swp_entry(*pmd);
+
+		VM_BUG_ON(!is_migration_entry(entry));
+		page = migration_entry_to_page(entry);
+	} else {
+		return;
+	}
 	if (IS_ERR_OR_NULL(page))
 		return;
 	if (PageAnon(page))
@@ -578,8 +587,7 @@ static int smaps_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
 
 	ptl = pmd_trans_huge_lock(pmd, vma);
 	if (ptl) {
-		if (pmd_present(*pmd))
-			smaps_pmd_entry(pmd, addr, walk);
+		smaps_pmd_entry(pmd, addr, walk);
 		spin_unlock(ptl);
 		goto out;
 	}
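Putting the thread's feedback together (NULL-initialized page, a thp_migration_support() guard, and warn-and-recover instead of VM_BUG_ON()), a follow-up version would look roughly like the sketch below. This reflects where the review converged, not the actual v2 posting:

	static void smaps_pmd_entry(pmd_t *pmd, unsigned long addr,
				    struct mm_walk *walk)
	{
		struct mem_size_stats *mss = walk->private;
		struct vm_area_struct *vma = walk->vma;
		bool locked = !!(vma->vm_flags & VM_LOCKED);
		struct page *page = NULL;

		if (pmd_present(*pmd)) {
			/* FOLL_DUMP will return -EFAULT on huge zero page */
			page = follow_trans_huge_pmd(vma, addr, pmd, FOLL_DUMP);
		} else if (unlikely(thp_migration_supported() && is_swap_pmd(*pmd))) {
			swp_entry_t entry = pmd_to_swp_entry(*pmd);

			/* Debug builds warn on the impossible case; all builds recover. */
			VM_WARN_ON_ONCE(!is_migration_entry(entry));
			if (is_migration_entry(entry))
				page = migration_entry_to_page(entry);
		}
		if (IS_ERR_OR_NULL(page))
			return;
		if (PageAnon(page))
			mss->anonymous_thp += HPAGE_PMD_SIZE;
		/* ... remaining accounting (locked, shmem, file THP) as in the posted patch ... */
	}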