Message ID | 20191128010321.21730-2-richardw.yang@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
Series | [1/2] mm/page_vma_mapped: use PMD_SIZE instead of calculating it |
On Thu, Nov 28, 2019 at 09:03:21AM +0800, Wei Yang wrote:
> The check here is to guarantee pvmw->address iteration is limited in one
> page table boundary. To be specific, here the address range should be in
> one PMD_SIZE.
>
> If my understanding is correct, this check is already done in the above
> check:
>
>     address >= __vma_address(page, vma) + PMD_SIZE
>
> The boundary check here seems not necessary.
>
> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>

NAK.

THP can be mapped with PTE not aligned to PMD_SIZE. Consider mremap().

> Test:
>    more than 48 hours kernel build test shows this code is not touched.

Not an argument. I doubt mremap(2) is ever called in kernel build
workload.
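To illustrate the NAK above, here is a minimal sketch that models only the address stepping of the loop. It is not kernel code. The constants assume x86-64 with 4 KiB base pages and a 2 MiB PMD_SIZE, and the DEMO_ macros and walk_crosses_pmd_boundary() are names made up for this illustration. It suggests that bounding the walk to start + PMD_SIZE does not keep it inside one PTE page table once the start address itself is not PMD_SIZE aligned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values (x86-64, 4 KiB base pages); other configurations differ. */
#define DEMO_PAGE_SIZE 4096UL
#define DEMO_PMD_SIZE  (2UL << 20)

/*
 * Hypothetical helper, not a kernel function. Steps one base page at a
 * time from 'start' and stops before 'start + len', mirroring the bound
 * "address >= start + PMD_SIZE" in the patch. Returns true if any stepped
 * address is PMD_SIZE aligned, i.e. the walk would enter the next PTE
 * page table.
 */
static bool walk_crosses_pmd_boundary(uintptr_t start, uintptr_t len)
{
	assert(len % DEMO_PAGE_SIZE == 0);

	for (uintptr_t addr = start + DEMO_PAGE_SIZE; addr < start + len;
	     addr += DEMO_PAGE_SIZE) {
		if (addr % DEMO_PMD_SIZE == 0)
			return true;
	}
	return false;
}
```

With a 2 MiB-aligned start the helper reports no crossing. Shifting the start by a single base page is enough to make the walk hit an aligned address before the length bound triggers, which is the case the removed "Did we cross page table boundary?" check handles.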
On Thu, Nov 28, 2019 at 11:31:43AM +0300, Kirill A. Shutemov wrote:
>On Thu, Nov 28, 2019 at 09:03:21AM +0800, Wei Yang wrote:
>> The check here is to guarantee pvmw->address iteration is limited in one
>> page table boundary. To be specific, here the address range should be in
>> one PMD_SIZE.
>>
>> If my understanding is correct, this check is already done in the above
>> check:
>>
>>     address >= __vma_address(page, vma) + PMD_SIZE
>>
>> The boundary check here seems not necessary.
>>
>> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
>
>NAK.
>
>THP can be mapped with PTE not aligned to PMD_SIZE. Consider mremap().
>

Hi, Kirill

Thanks for your comment during Thanks Giving Day. Happy holiday:-)

I didn't think about this case before, thanks for reminding. Then I tried to
understand your concern.

mremap() would expand/shrink a memory mapping. In this case, probably shrink
is in concern. Since pvmw->page and pvmw->vma are not changed in the loop, the
case you mentioned maybe pvmw->page is the head of a THP but part of it is
unmapped.

This means the following condition stands:

    vma->vm_start <= vma_address(page)
    vma->vm_end <= vma_address(page) + page_size(page)

Since we have checked address with vm_end, do you think this case is also
guarded?

Not sure my understanding is correct, look forward your comments.

>> Test:
>>    more than 48 hours kernel build test shows this code is not touched.
>
>Not an argument. I doubt mremap(2) is ever called in kernel build
>workload.
>
>--
> Kirill A. Shutemov
On Thu, Nov 28, 2019 at 09:09:45PM +0000, Wei Yang wrote:
> On Thu, Nov 28, 2019 at 11:31:43AM +0300, Kirill A. Shutemov wrote:
> >On Thu, Nov 28, 2019 at 09:03:21AM +0800, Wei Yang wrote:
> >> The check here is to guarantee pvmw->address iteration is limited in one
> >> page table boundary. To be specific, here the address range should be in
> >> one PMD_SIZE.
> >>
> >> If my understanding is correct, this check is already done in the above
> >> check:
> >>
> >>     address >= __vma_address(page, vma) + PMD_SIZE
> >>
> >> The boundary check here seems not necessary.
> >>
> >> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
> >
> >NAK.
> >
> >THP can be mapped with PTE not aligned to PMD_SIZE. Consider mremap().
> >
>
> Hi, Kirill
>
> Thanks for your comment during Thanks Giving Day. Happy holiday:-)
>
> I didn't think about this case before, thanks for reminding. Then I tried to
> understand your concern.
>
> mremap() would expand/shrink a memory mapping. In this case, probably shrink
> is in concern. Since pvmw->page and pvmw->vma are not changed in the loop, the
> case you mentioned maybe pvmw->page is the head of a THP but part of it is
> unmapped.

mremap() can also move a mapping, see MREMAP_FIXED.

> This means the following condition stands:
>
>     vma->vm_start <= vma_address(page)
>     vma->vm_end <= vma_address(page) + page_size(page)
>
> Since we have checked address with vm_end, do you think this case is also
> guarded?
>
> Not sure my understanding is correct, look forward your comments.
>
> >> Test:
> >>    more than 48 hours kernel build test shows this code is not touched.
> >
> >Not an argument. I doubt mremap(2) is ever called in kernel build
> >workload.
> >
> >--
> > Kirill A. Shutemov
>
> --
> Wei Yang
> Help you, Help me
>
On Thu, Nov 28, 2019 at 02:39:04PM -0800, Matthew Wilcox wrote:
>On Thu, Nov 28, 2019 at 09:09:45PM +0000, Wei Yang wrote:
>> On Thu, Nov 28, 2019 at 11:31:43AM +0300, Kirill A. Shutemov wrote:
>> >On Thu, Nov 28, 2019 at 09:03:21AM +0800, Wei Yang wrote:
>> >> The check here is to guarantee pvmw->address iteration is limited in one
>> >> page table boundary. To be specific, here the address range should be in
>> >> one PMD_SIZE.
>> >>
>> >> If my understanding is correct, this check is already done in the above
>> >> check:
>> >>
>> >>     address >= __vma_address(page, vma) + PMD_SIZE
>> >>
>> >> The boundary check here seems not necessary.
>> >>
>> >> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
>> >
>> >NAK.
>> >
>> >THP can be mapped with PTE not aligned to PMD_SIZE. Consider mremap().
>> >
>>
>> Hi, Kirill
>>
>> Thanks for your comment during Thanks Giving Day. Happy holiday:-)
>>
>> I didn't think about this case before, thanks for reminding. Then I tried to
>> understand your concern.
>>
>> mremap() would expand/shrink a memory mapping. In this case, probably shrink
>> is in concern. Since pvmw->page and pvmw->vma are not changed in the loop, the
>> case you mentioned maybe pvmw->page is the head of a THP but part of it is
>> unmapped.
>
>mremap() can also move a mapping, see MREMAP_FIXED.

Hi, Matthew

Thanks for your comment.

I took a look into the MREMAP_FIXED case, but still not clear in which case it
fall into the situation Kirill mentioned.

Per my understanding, move mapping is achieved in two steps:

    * unmap some range in old vma if old_len >= new_len
    * move vma

If the length doesn't change, we are expecting to have the "copy" of old
vma. This doesn't change the THP PMD mapping.

So the change still happens in the unmap step, if I am correct.

Would you mind giving me more hint on the case when we would have the
situation as Kirill mentioned?

>
>> This means the following condition stands:
>>
>>     vma->vm_start <= vma_address(page)
>>     vma->vm_end <= vma_address(page) + page_size(page)
>>
>> Since we have checked address with vm_end, do you think this case is also
>> guarded?
>>
>> Not sure my understanding is correct, look forward your comments.
>>
>> >> Test:
>> >>    more than 48 hours kernel build test shows this code is not touched.
>> >
>> >Not an argument. I doubt mremap(2) is ever called in kernel build
>> >workload.
>> >
>> >--
>> > Kirill A. Shutemov
>>
>> --
>> Wei Yang
>> Help you, Help me
>>
On Fri, Nov 29, 2019 at 04:30:02PM +0800, Wei Yang wrote:
> On Thu, Nov 28, 2019 at 02:39:04PM -0800, Matthew Wilcox wrote:
> >On Thu, Nov 28, 2019 at 09:09:45PM +0000, Wei Yang wrote:
> >> On Thu, Nov 28, 2019 at 11:31:43AM +0300, Kirill A. Shutemov wrote:
> >> >On Thu, Nov 28, 2019 at 09:03:21AM +0800, Wei Yang wrote:
> >> >> The check here is to guarantee pvmw->address iteration is limited in one
> >> >> page table boundary. To be specific, here the address range should be in
> >> >> one PMD_SIZE.
> >> >>
> >> >> If my understanding is correct, this check is already done in the above
> >> >> check:
> >> >>
> >> >>     address >= __vma_address(page, vma) + PMD_SIZE
> >> >>
> >> >> The boundary check here seems not necessary.
> >> >>
> >> >> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
> >> >
> >> >NAK.
> >> >
> >> >THP can be mapped with PTE not aligned to PMD_SIZE. Consider mremap().
> >> >
> >>
> >> Hi, Kirill
> >>
> >> Thanks for your comment during Thanks Giving Day. Happy holiday:-)
> >>
> >> I didn't think about this case before, thanks for reminding. Then I tried to
> >> understand your concern.
> >>
> >> mremap() would expand/shrink a memory mapping. In this case, probably shrink
> >> is in concern. Since pvmw->page and pvmw->vma are not changed in the loop, the
> >> case you mentioned maybe pvmw->page is the head of a THP but part of it is
> >> unmapped.
> >
> >mremap() can also move a mapping, see MREMAP_FIXED.
>
> Hi, Matthew
>
> Thanks for your comment.
>
> I took a look into the MREMAP_FIXED case, but still not clear in which case it
> fall into the situation Kirill mentioned.
>
> Per my understanding, move mapping is achieved in two steps:
>
>     * unmap some range in old vma if old_len >= new_len
>     * move vma
>
> If the length doesn't change, we are expecting to have the "copy" of old
> vma. This doesn't change the THP PMD mapping.
>
> So the change still happens in the unmap step, if I am correct.
>
> Would you mind giving me more hint on the case when we would have the
> situation as Kirill mentioned?

Set up a THP mapping.
Move it to an address which is no longer 2MB aligned.
Unmap it.
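The three steps above could be driven from userspace roughly as follows. This is only a sketch under stated assumptions: a 2 MiB huge page size, and a kernel that honours the MADV_HUGEPAGE hint (the code cannot tell whether a THP was actually used). thp_move_unaligned_demo(), align_up_hpage() and DEMO_HPAGE_SIZE are hypothetical names, and leftover mappings are not cleaned up.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define DEMO_HPAGE_SIZE (2UL << 20)	/* assumed THP/PMD size (x86-64) */

/* Round p up to the next DEMO_HPAGE_SIZE boundary. */
static char *align_up_hpage(char *p)
{
	uintptr_t v = (uintptr_t)p;

	return (char *)((v + DEMO_HPAGE_SIZE - 1) &
			~(uintptr_t)(DEMO_HPAGE_SIZE - 1));
}

/*
 * Hypothetical demo. Returns 0 if the region was moved to an address that
 * is not 2 MiB aligned, kept its contents, and was then unmapped; returns
 * -1 on any failure.
 */
static int thp_move_unaligned_demo(void)
{
	size_t page = (size_t)sysconf(_SC_PAGESIZE);

	assert(page < DEMO_HPAGE_SIZE);

	/* Step 1: over-allocate so a 2 MiB-aligned window is sure to fit. */
	char *raw = mmap(NULL, 2 * DEMO_HPAGE_SIZE, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (raw == MAP_FAILED)
		return -1;
	char *src = align_up_hpage(raw);
	madvise(src, DEMO_HPAGE_SIZE, MADV_HUGEPAGE);	/* hint only */
	memset(src, 0x5a, DEMO_HPAGE_SIZE);		/* fault the range in */

	/* Reserve room for the target, one base page past a 2 MiB boundary. */
	char *hole = mmap(NULL, 3 * DEMO_HPAGE_SIZE, PROT_NONE,
			  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (hole == MAP_FAILED)
		return -1;
	char *dst = align_up_hpage(hole) + page;

	/* Step 2: move the mapping to the unaligned address. */
	char *moved = mremap(src, DEMO_HPAGE_SIZE, DEMO_HPAGE_SIZE,
			     MREMAP_MAYMOVE | MREMAP_FIXED, dst);
	if (moved == MAP_FAILED)
		return -1;
	int ok = ((uintptr_t)moved % DEMO_HPAGE_SIZE != 0) && moved[0] == 0x5a;

	/* Step 3: unmap it, which is where the page walk would run. */
	if (munmap(moved, DEMO_HPAGE_SIZE) != 0)
		return -1;
	return ok ? 0 : -1;
}
```

After step 2 the 2 MiB range straddles a PMD boundary, so if it is backed by a THP the page can only be mapped with PTEs that live in two different page tables.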
On Fri, Nov 29, 2019 at 03:18:01AM -0800, Matthew Wilcox wrote:
>On Fri, Nov 29, 2019 at 04:30:02PM +0800, Wei Yang wrote:
>> On Thu, Nov 28, 2019 at 02:39:04PM -0800, Matthew Wilcox wrote:
>> >On Thu, Nov 28, 2019 at 09:09:45PM +0000, Wei Yang wrote:
>> >> On Thu, Nov 28, 2019 at 11:31:43AM +0300, Kirill A. Shutemov wrote:
>> >> >On Thu, Nov 28, 2019 at 09:03:21AM +0800, Wei Yang wrote:
>> >> >> The check here is to guarantee pvmw->address iteration is limited in one
>> >> >> page table boundary. To be specific, here the address range should be in
>> >> >> one PMD_SIZE.
>> >> >>
>> >> >> If my understanding is correct, this check is already done in the above
>> >> >> check:
>> >> >>
>> >> >>     address >= __vma_address(page, vma) + PMD_SIZE
>> >> >>
>> >> >> The boundary check here seems not necessary.
>> >> >>
>> >> >> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
>> >> >
>> >> >NAK.
>> >> >
>> >> >THP can be mapped with PTE not aligned to PMD_SIZE. Consider mremap().
>> >> >
>> >>
>> >> Hi, Kirill
>> >>
>> >> Thanks for your comment during Thanks Giving Day. Happy holiday:-)
>> >>
>> >> I didn't think about this case before, thanks for reminding. Then I tried to
>> >> understand your concern.
>> >>
>> >> mremap() would expand/shrink a memory mapping. In this case, probably shrink
>> >> is in concern. Since pvmw->page and pvmw->vma are not changed in the loop, the
>> >> case you mentioned maybe pvmw->page is the head of a THP but part of it is
>> >> unmapped.
>> >
>> >mremap() can also move a mapping, see MREMAP_FIXED.
>>
>> Hi, Matthew
>>
>> Thanks for your comment.
>>
>> I took a look into the MREMAP_FIXED case, but still not clear in which case it
>> fall into the situation Kirill mentioned.
>>
>> Per my understanding, move mapping is achieved in two steps:
>>
>>     * unmap some range in old vma if old_len >= new_len
>>     * move vma
>>
>> If the length doesn't change, we are expecting to have the "copy" of old
>> vma. This doesn't change the THP PMD mapping.
>>
>> So the change still happens in the unmap step, if I am correct.
>>
>> Would you mind giving me more hint on the case when we would have the
>> situation as Kirill mentioned?
>
>Set up a THP mapping.
>Move it to an address which is no longer 2MB aligned.
>Unmap it.

Thanks Matthew

I got the point, thanks a lot :-)
diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
index 76e03650a3ab..25aada8a1271 100644
--- a/mm/page_vma_mapped.c
+++ b/mm/page_vma_mapped.c
@@ -163,7 +163,6 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
 			return not_found(pvmw);
 		return true;
 	}
-restart:
 	pgd = pgd_offset(mm, pvmw->address);
 	if (!pgd_present(*pgd))
 		return false;
@@ -225,17 +224,7 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
 					__vma_address(pvmw->page, pvmw->vma) +
 					PMD_SIZE)
 				return not_found(pvmw);
-			/* Did we cross page table boundary? */
-			if (pvmw->address % PMD_SIZE == 0) {
-				pte_unmap(pvmw->pte);
-				if (pvmw->ptl) {
-					spin_unlock(pvmw->ptl);
-					pvmw->ptl = NULL;
-				}
-				goto restart;
-			} else {
-				pvmw->pte++;
-			}
+			pvmw->pte++;
 		} while (pte_none(*pvmw->pte));
 
 		if (!pvmw->ptl) {
The check here is to guarantee pvmw->address iteration is limited in one
page table boundary. To be specific, here the address range should be in
one PMD_SIZE.

If my understanding is correct, this check is already done in the above
check:

    address >= __vma_address(page, vma) + PMD_SIZE

The boundary check here seems not necessary.

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
---
Test:
   more than 48 hours kernel build test shows this code is not touched.
---
 mm/page_vma_mapped.c | 13 +------------
 1 file changed, 1 insertion(+), 12 deletions(-)