Message ID | 1580174873-18117-1-git-send-email-anshuman.khandual@arm.com (mailing list archive)
---|---
State | New, archived
Series | [V12] mm/debug: Add tests validating architecture page table helpers
> On Jan 27, 2020, at 8:28 PM, Anshuman Khandual <Anshuman.Khandual@arm.com> wrote: > > This adds tests which will validate architecture page table helpers and > other accessors in their compliance with expected generic MM semantics. > This will help various architectures in validating changes to existing > page table helpers or addition of new ones. > > This test covers basic page table entry transformations including but not > limited to old, young, dirty, clean, write, write protect etc at various > level along with populating intermediate entries with next page table page > and validating them. > > Test page table pages are allocated from system memory with required size > and alignments. The mapped pfns at page table levels are derived from a > real pfn representing a valid kernel text symbol. This test gets called > right after page_alloc_init_late(). > > This gets build and run when CONFIG_DEBUG_VM_PGTABLE is selected along with > CONFIG_VM_DEBUG. Architectures willing to subscribe this test also need to > select CONFIG_ARCH_HAS_DEBUG_VM_PGTABLE which for now is limited to x86 and > arm64. Going forward, other architectures too can enable this after fixing > build or runtime problems (if any) with their page table helpers. What’s the value of this block of new code? It only supports x86 and arm64 which are supposed to be good now. Did those tests ever find any regression or this is almost only useful for new architectures which only happened once in a few years? The worry if not many people will use this config and code those that much in the future because it is inefficient to find bugs, it will simply be rotten like a few other debugging options out there we have in the mainline that will be a pain to remove later on.
On 01/28/2020 07:41 AM, Qian Cai wrote: > > >> On Jan 27, 2020, at 8:28 PM, Anshuman Khandual <Anshuman.Khandual@arm.com> wrote: >> >> This adds tests which will validate architecture page table helpers and >> other accessors in their compliance with expected generic MM semantics. >> This will help various architectures in validating changes to existing >> page table helpers or addition of new ones. >> >> This test covers basic page table entry transformations including but not >> limited to old, young, dirty, clean, write, write protect etc at various >> level along with populating intermediate entries with next page table page >> and validating them. >> >> Test page table pages are allocated from system memory with required size >> and alignments. The mapped pfns at page table levels are derived from a >> real pfn representing a valid kernel text symbol. This test gets called >> right after page_alloc_init_late(). >> >> This gets build and run when CONFIG_DEBUG_VM_PGTABLE is selected along with >> CONFIG_VM_DEBUG. Architectures willing to subscribe this test also need to >> select CONFIG_ARCH_HAS_DEBUG_VM_PGTABLE which for now is limited to x86 and >> arm64. Going forward, other architectures too can enable this after fixing >> build or runtime problems (if any) with their page table helpers. Hello Qian, > > What’s the value of this block of new code? It only supports x86 and arm64 > which are supposed to be good now. We have been over the usefulness of this code many times before as the patch is already in it's V12. Currently it is enabled on arm64, x86 (except PAE), arc and ppc32. There are build time or runtime problems with other archs which prevent enablement of this test (for the moment) but then the goal is to integrate all of them going forward. The test not only validates platform's adherence to the expected semantics from generic MM but also helps in keeping it that way during code changes in future as well. > Did those tests ever find any regression or this is almost only useful for new The test has already found problems with s390 page table helpers. > architectures which only happened once in a few years? Again, not only it validates what exist today but its also a tool to make sure that all platforms continue adhere to a common agreed upon semantics as reflected through the tests here. > The worry if not many people will use this config and code those that much in Debug features or tests in the kernel are used when required. These are never or should not be enabled by default. AFAICT this is true even for entire DEBUG_VM packaged tests. Do you have any particular data or precedence to substantiate the fact that this test will be used any less often than the other similar ones in the tree ? I can only speak for arm64 platform but the very idea for this test came from Catalin when we were trying to understand the semantics for THP helpers while enabling THP migration without split. Apart from going over the commit messages from the past, there were no other way to figure out how any particular page table helper is suppose to change given page table entry. This test tries to formalize those semantics. > the future because it is inefficient to find bugs, it will simply be rotten Could you be more specific here ? What parts of the test are inefficient ? I am happy to improve upon the test. Do let me know you if you have suggestions. > like a few other debugging options out there we have in the mainline that will be a pain to remove later on. 
> Even though I am not agreeing to your assessment about the usefulness of the test without any substantial data backing up the claims, the test case in itself is very much compartmentalized, staying clear from generic MM and debug_vm_pgtable() is only function executing the test which is getting called from kernel_init_freeable() path. - Anshuman
> On Jan 27, 2020, at 10:06 PM, Anshuman Khandual <anshuman.khandual@arm.com> wrote: > > > > On 01/28/2020 07:41 AM, Qian Cai wrote: >> >> >>> On Jan 27, 2020, at 8:28 PM, Anshuman Khandual <Anshuman.Khandual@arm.com> wrote: >>> >>> This adds tests which will validate architecture page table helpers and >>> other accessors in their compliance with expected generic MM semantics. >>> This will help various architectures in validating changes to existing >>> page table helpers or addition of new ones. >>> >>> This test covers basic page table entry transformations including but not >>> limited to old, young, dirty, clean, write, write protect etc at various >>> level along with populating intermediate entries with next page table page >>> and validating them. >>> >>> Test page table pages are allocated from system memory with required size >>> and alignments. The mapped pfns at page table levels are derived from a >>> real pfn representing a valid kernel text symbol. This test gets called >>> right after page_alloc_init_late(). >>> >>> This gets build and run when CONFIG_DEBUG_VM_PGTABLE is selected along with >>> CONFIG_VM_DEBUG. Architectures willing to subscribe this test also need to >>> select CONFIG_ARCH_HAS_DEBUG_VM_PGTABLE which for now is limited to x86 and >>> arm64. Going forward, other architectures too can enable this after fixing >>> build or runtime problems (if any) with their page table helpers. > > Hello Qian, > >> >> What’s the value of this block of new code? It only supports x86 and arm64 >> which are supposed to be good now. > > We have been over the usefulness of this code many times before as the patch is > already in it's V12. Currently it is enabled on arm64, x86 (except PAE), arc and > ppc32. There are build time or runtime problems with other archs which prevent I am not sure if I care too much about arc and ppc32 which are pretty much legacy platforms. > enablement of this test (for the moment) but then the goal is to integrate all > of them going forward. The test not only validates platform's adherence to the > expected semantics from generic MM but also helps in keeping it that way during > code changes in future as well. Another option maybe to get some decent arches on board first before merging this thing, so it have more changes to catch regressions for developers who might run this. > >> Did those tests ever find any regression or this is almost only useful for new > > The test has already found problems with s390 page table helpers. Hmm, that is pretty weak where s390 is not even official supported with this version. > >> architectures which only happened once in a few years? > > Again, not only it validates what exist today but its also a tool to make > sure that all platforms continue adhere to a common agreed upon semantics > as reflected through the tests here. > >> The worry if not many people will use this config and code those that much in > > Debug features or tests in the kernel are used when required. These are never or > should not be enabled by default. AFAICT this is true even for entire DEBUG_VM > packaged tests. Do you have any particular data or precedence to substantiate > the fact that this test will be used any less often than the other similar ones > in the tree ? I can only speak for arm64 platform but the very idea for this > test came from Catalin when we were trying to understand the semantics for THP > helpers while enabling THP migration without split. 
Apart from going over the > commit messages from the past, there were no other way to figure out how any > particular page table helper is suppose to change given page table entry. This > test tries to formalize those semantics. I am thinking about how we made so many mistakes before by merging too many of those debugging options that many of them have been broken for many releases proving that nobody actually used them regularly. We don’t need to repeat the same mistake again. I am actually thinking about to remove things like page_poisoning often which is almost are never found any bug recently and only cause pains when interacting with other new features that almost nobody will test them together to begin with. We even have some SLUB debugging code sit there for almost 15 years that almost nobody used it and maintainers refused to remove it. > >> the future because it is inefficient to find bugs, it will simply be rotten > Could you be more specific here ? What parts of the test are inefficient ? I > am happy to improve upon the test. Do let me know you if you have suggestions. > >> like a few other debugging options out there we have in the mainline that > will be a pain to remove later on. >> > > Even though I am not agreeing to your assessment about the usefulness of the > test without any substantial data backing up the claims, the test case in > itself is very much compartmentalized, staying clear from generic MM and > debug_vm_pgtable() is only function executing the test which is getting > called from kernel_init_freeable() path. I am thinking exactly the other way around. You are proposing to merge this tests without proving how useful it will be able to find regressions for future developers to make sure it will actually get used.
On 01/28/2020 09:03 AM, Qian Cai wrote: > > >> On Jan 27, 2020, at 10:06 PM, Anshuman Khandual <anshuman.khandual@arm.com> wrote: >> >> >> >> On 01/28/2020 07:41 AM, Qian Cai wrote: >>> >>> >>>> On Jan 27, 2020, at 8:28 PM, Anshuman Khandual <Anshuman.Khandual@arm.com> wrote: >>>> >>>> This adds tests which will validate architecture page table helpers and >>>> other accessors in their compliance with expected generic MM semantics. >>>> This will help various architectures in validating changes to existing >>>> page table helpers or addition of new ones. >>>> >>>> This test covers basic page table entry transformations including but not >>>> limited to old, young, dirty, clean, write, write protect etc at various >>>> level along with populating intermediate entries with next page table page >>>> and validating them. >>>> >>>> Test page table pages are allocated from system memory with required size >>>> and alignments. The mapped pfns at page table levels are derived from a >>>> real pfn representing a valid kernel text symbol. This test gets called >>>> right after page_alloc_init_late(). >>>> >>>> This gets build and run when CONFIG_DEBUG_VM_PGTABLE is selected along with >>>> CONFIG_VM_DEBUG. Architectures willing to subscribe this test also need to >>>> select CONFIG_ARCH_HAS_DEBUG_VM_PGTABLE which for now is limited to x86 and >>>> arm64. Going forward, other architectures too can enable this after fixing >>>> build or runtime problems (if any) with their page table helpers. >> >> Hello Qian, >> >>> >>> What’s the value of this block of new code? It only supports x86 and arm64 >>> which are supposed to be good now. >> >> We have been over the usefulness of this code many times before as the patch is >> already in it's V12. Currently it is enabled on arm64, x86 (except PAE), arc and >> ppc32. There are build time or runtime problems with other archs which prevent > > I am not sure if I care too much about arc and ppc32 which are pretty much legacy > platforms. Okay but FWIW the maintainers for all these enabled platforms cared for this test at the least and really helped in shaping the test to it's current state. Besides I am still failing to understand your point here about evaluating particular feature's usefulness based on it's support on relative and perceived importance of some platforms compared to others. Again the idea is to integrate all platforms eventually but we had discovered build and runtime issues which needs to be resolved at platform level first. Unless I am mistaken, debug feature like this which is putting down a framework while also benefiting some initial platforms to start with, will be a potential candidate for eventual inclusion in the mainline. Otherwise, please point to any other agreed upon community criteria for debug feature's mainline inclusion which I will try to adhere. I wonder if all other similar debug features from the past ever met 'the all inclusive at the beginning' criteria which you are trying to propose here. This test also adds a feature file, enlisting all supported archs as suggested by Ingo for the exact same reason. This is not the first time, a feature is listing out archs which are supported and archs which are not. > >> enablement of this test (for the moment) but then the goal is to integrate all >> of them going forward. The test not only validates platform's adherence to the >> expected semantics from generic MM but also helps in keeping it that way during >> code changes in future as well. 
> > Another option maybe to get some decent arches on board first before merging this > thing, so it have more changes to catch regressions for developers who might run this. > >> >>> Did those tests ever find any regression or this is almost only useful for new >> >> The test has already found problems with s390 page table helpers. > > Hmm, that is pretty weak where s390 is not even official supported with this version. And there were valid reasons why s390 could not be enabled just yet as explained by s390 folks during our previous discussions. I just pointed out an example where this test was useful as you had asked previously. Not being official supported in this version does not take away the fact the it was indeed useful for that platform in discovering a bug. > >> >>> architectures which only happened once in a few years? >> >> Again, not only it validates what exist today but its also a tool to make >> sure that all platforms continue adhere to a common agreed upon semantics >> as reflected through the tests here. >> >>> The worry if not many people will use this config and code those that much in >> >> Debug features or tests in the kernel are used when required. These are never or >> should not be enabled by default. AFAICT this is true even for entire DEBUG_VM >> packaged tests. Do you have any particular data or precedence to substantiate >> the fact that this test will be used any less often than the other similar ones >> in the tree ? I can only speak for arm64 platform but the very idea for this >> test came from Catalin when we were trying to understand the semantics for THP >> helpers while enabling THP migration without split. Apart from going over the >> commit messages from the past, there were no other way to figure out how any >> particular page table helper is suppose to change given page table entry. This >> test tries to formalize those semantics. > > I am thinking about how we made so many mistakes before by merging too many of > those debugging options that many of them have been broken for many releases > proving that nobody actually used them regularly. We don’t need to repeat the same Again will ask for some data to substantiate these claims. Though I am not really sure but believe that there are integration test frameworks out there which regularly validates each of these code path on multiple platforms. One such automation found that V11 of the test was broken on X86 PAE platform which I fixed. Nonetheless, I can speak only for arm64 platform and we intend to use this test to validate arm64 exported page table helpers. Citing unsubstantiated past examples should not really block these enabled platforms (arm64 at the very least) from getting this debug feature which has already demonstrated it's usefulness during arm64 THP migration development and on s390 platforms as well. > mistake again. I am actually thinking about to remove things like page_poisoning often > which is almost are never found any bug recently and only cause pains when interacting > with other new features that almost nobody will test them together to begin with. > We even have some SLUB debugging code sit there for almost 15 years that almost > nobody used it and maintainers refused to remove it. Unlike those, the proposed test here is isolated as a stand alone test and stays clear off from any other code path. 
I have not been involved in or aware of the usefulness of existing MM debug features and hence will just leave them upto the judgment of the maintainers whether to keep or discard them. > >> >>> the future because it is inefficient to find bugs, it will simply be rotten >> Could you be more specific here ? What parts of the test are inefficient ? I >> am happy to improve upon the test. Do let me know you if you have suggestions. >> >>> like a few other debugging options out there we have in the mainline that >> will be a pain to remove later on. >>> >> >> Even though I am not agreeing to your assessment about the usefulness of the >> test without any substantial data backing up the claims, the test case in >> itself is very much compartmentalized, staying clear from generic MM and >> debug_vm_pgtable() is only function executing the test which is getting >> called from kernel_init_freeable() path. > > I am thinking exactly the other way around. You are proposing to merge this tests > without proving how useful it will be able to find regressions for future developers > to make sure it will actually get used. As I had mentioned before, the test attempts to formalize page table helper semantics as expected from generic MM code paths and intend to catch deviations when enabled on a given platform. How else should we test semantics errors otherwise ? There are past examples of usefulness for this procedure on arm64 and on s390. I am wondering how else to prove the usefulness of a debug feature if these references are not enough. >
> On Jan 27, 2020, at 11:58 PM, Anshuman Khandual <Anshuman.Khandual@arm.com> wrote: > > As I had mentioned before, the test attempts to formalize page table helper semantics > as expected from generic MM code paths and intend to catch deviations when enabled on > a given platform. How else should we test semantics errors otherwise ? There are past > examples of usefulness for this procedure on arm64 and on s390. I am wondering how > else to prove the usefulness of a debug feature if these references are not enough. Not saying it will not be useful. As you mentioned, it actually found a bug or two in the past. The problem is that there is always a cost to maintain something like this, and nobody knows how things could break in the future, even for the isolated code you mentioned, given how complicated the kernel code base is. I am not so positive that many developers would enable this debug feature and use it on a regular basis, based on the information you have given so far. On the other hand, it might be just as well to maintain this thing out of tree by yourself anyway, because if it is not going to be used by many developers, few people are going to contribute to it or even notice when it is broken. What’s the point of getting this merged apart from getting some meaningless credit?
Le 28/01/2020 à 04:33, Qian Cai a écrit : > > >> On Jan 27, 2020, at 10:06 PM, Anshuman Khandual <anshuman.khandual@arm.com> wrote: >> >> >> >> On 01/28/2020 07:41 AM, Qian Cai wrote: >>> >>> >>>> On Jan 27, 2020, at 8:28 PM, Anshuman Khandual <Anshuman.Khandual@arm.com> wrote: >>>> >>>> This adds tests which will validate architecture page table helpers and >>>> other accessors in their compliance with expected generic MM semantics. >>>> This will help various architectures in validating changes to existing >>>> page table helpers or addition of new ones. >>>> >>>> This test covers basic page table entry transformations including but not >>>> limited to old, young, dirty, clean, write, write protect etc at various >>>> level along with populating intermediate entries with next page table page >>>> and validating them. >>>> >>>> Test page table pages are allocated from system memory with required size >>>> and alignments. The mapped pfns at page table levels are derived from a >>>> real pfn representing a valid kernel text symbol. This test gets called >>>> right after page_alloc_init_late(). >>>> >>>> This gets build and run when CONFIG_DEBUG_VM_PGTABLE is selected along with >>>> CONFIG_VM_DEBUG. Architectures willing to subscribe this test also need to >>>> select CONFIG_ARCH_HAS_DEBUG_VM_PGTABLE which for now is limited to x86 and >>>> arm64. Going forward, other architectures too can enable this after fixing >>>> build or runtime problems (if any) with their page table helpers. >> >> Hello Qian, >> >>> >>> What’s the value of this block of new code? It only supports x86 and arm64 >>> which are supposed to be good now. >> >> We have been over the usefulness of this code many times before as the patch is >> already in it's V12. Currently it is enabled on arm64, x86 (except PAE), arc and >> ppc32. There are build time or runtime problems with other archs which prevent > > I am not sure if I care too much about arc and ppc32 which are pretty much legacy > platforms. > >> enablement of this test (for the moment) but then the goal is to integrate all >> of them going forward. The test not only validates platform's adherence to the >> expected semantics from generic MM but also helps in keeping it that way during >> code changes in future as well. > > Another option maybe to get some decent arches on board first before merging this > thing, so it have more changes to catch regressions for developers who might run this. > ppc32 an indecent / legacy platform ? Are you kidying ? Powerquicc II PRO for instance is fully supported by the manufacturer and widely used in many small networking devices. Christophe
Le 28/01/2020 à 06:48, Qian Cai a écrit : > > >> On Jan 27, 2020, at 11:58 PM, Anshuman Khandual <Anshuman.Khandual@arm.com> wrote: >> >> As I had mentioned before, the test attempts to formalize page table helper semantics >> as expected from generic MM code paths and intend to catch deviations when enabled on >> a given platform. How else should we test semantics errors otherwise ? There are past >> examples of usefulness for this procedure on arm64 and on s390. I am wondering how >> else to prove the usefulness of a debug feature if these references are not enough. > > Not saying it will not be useful. As you mentioned, it actually found a bug or two in the past. The problem is that there is always a cost to maintain something like this, and nobody knows how things could break in the future, even for the isolated code you mentioned, given how complicated the kernel code base is. I am not so positive that many developers would enable this debug feature and use it on a regular basis, based on the information you have given so far. > > On the other hand, it might be just as well to maintain this thing out of tree by yourself anyway, because if it is not going to be used by many developers, few people are going to contribute to it or even notice when it is broken. What’s the point of getting this merged apart from getting some meaningless credit? > It is 'default y' so there is not much risk that it is forgotten; at least all test suites run with 'allyesconfig' will trigger the test, so I think it is really a good feature. Christophe
> On Jan 28, 2020, at 1:17 AM, Christophe Leroy <christophe.leroy@c-s.fr> wrote: > > It is 'default y' so there is not much risk that it is forgotten; at least all test suites run with 'allyesconfig' will trigger the test, so I think it is really a good feature. This thing depends on DEBUG_VM which I don’t see selected by any defconfig. Am I missing anything?
On 01/28/2020 12:06 PM, Qian Cai wrote: > > >> On Jan 28, 2020, at 1:17 AM, Christophe Leroy <christophe.leroy@c-s.fr> wrote: >> >> It is 'default y' so there is not much risk that it is forgotten; at least all test suites run with 'allyesconfig' will trigger the test, so I think it is really a good feature. > > This thing depends on DEBUG_VM which I don’t see selected by any defconfig. Am I missing anything? > 'allyesconfig' makes 'DEBUG_VM = y' which in turn will enable 'DEBUG_VM_PGTABLE = y' on platforms that subscribe to ARCH_HAS_DEBUG_VM_PGTABLE.
> On Jan 28, 2020, at 2:03 AM, Anshuman Khandual <Anshuman.Khandual@arm.com> wrote: > > 'allyesconfig' makes 'DEBUG_VM = y' which in turn will enable 'DEBUG_VM_PGTABLE = y' > on platforms that subscribe to ARCH_HAS_DEBUG_VM_PGTABLE. Isn’t that only for compile testing? Who is booting such a beast and making sure everything works as expected?
> On Jan 28, 2020, at 1:13 AM, Christophe Leroy <christophe.leroy@c-s.fr> wrote: > > ppc32 an indecent / legacy platform ? Are you kidying ? > > Powerquicc II PRO for instance is fully supported by the manufacturer and widely used in many small networking devices. Of course I forgot about embedded devices. The problem is that how many developers are actually going to run this debug option on embedded devices?
On Tue, Jan 28, 2020 at 02:12:56AM -0500, Qian Cai wrote: > > On Jan 28, 2020, at 1:13 AM, Christophe Leroy <christophe.leroy@c-s.fr> wrote: > > ppc32 an indecent / legacy platform ? Are you kidying ? > > Powerquicc II PRO for instance is fully supported by the > > manufacturer and widely used in many small networking devices. > Of course I forgot about embedded devices. The problem is that how > many developers are actually going to run this debug option on > embedded devices? Much fewer if the code isn't upstream than if it is. This isn't something that every developer is going to enable all the time but that doesn't mean it's not useful, it's more for people doing work on the architectures or on memory management (or who suspect they're running into a relevant problem), and I'm sure some of the automated testing people will enable it. The more barriers there are in place to getting the testsuite up and running the less likely it is that any of these groups will run it regularly.
Hello Qian, On Mon, Jan 27, 2020 at 10:33:08PM -0500, Qian Cai wrote: > > > On Jan 27, 2020, at 10:06 PM, Anshuman Khandual <anshuman.khandual@arm.com> wrote: > > > > enablement of this test (for the moment) but then the goal is to integrate all > > of them going forward. The test not only validates platform's adherence to the > > expected semantics from generic MM but also helps in keeping it that way during > > code changes in future as well. > > Another option maybe to get some decent arches on board first before merging this > thing, so it have more changes to catch regressions for developers who might run this. Aren't x86 and arm64 decent enough? Even if this test could be used to detect regressions only on these two platforms, the test is valuable.
Le 28/01/2020 à 02:27, Anshuman Khandual a écrit : > This adds tests which will validate architecture page table helpers and > other accessors in their compliance with expected generic MM semantics. > This will help various architectures in validating changes to existing > page table helpers or addition of new ones. > > This test covers basic page table entry transformations including but not > limited to old, young, dirty, clean, write, write protect etc at various > level along with populating intermediate entries with next page table page > and validating them. > > Test page table pages are allocated from system memory with required size > and alignments. The mapped pfns at page table levels are derived from a > real pfn representing a valid kernel text symbol. This test gets called > right after page_alloc_init_late(). > > This gets build and run when CONFIG_DEBUG_VM_PGTABLE is selected along with > CONFIG_VM_DEBUG. Architectures willing to subscribe this test also need to > select CONFIG_ARCH_HAS_DEBUG_VM_PGTABLE which for now is limited to x86 and > arm64. Going forward, other architectures too can enable this after fixing > build or runtime problems (if any) with their page table helpers. > > Folks interested in making sure that a given platform's page table helpers > conform to expected generic MM semantics should enable the above config > which will just trigger this test during boot. Any non conformity here will > be reported as an warning which would need to be fixed. This test will help > catch any changes to the agreed upon semantics expected from generic MM and > enable platforms to accommodate it thereafter. > [...] > > Tested-by: Christophe Leroy <christophe.leroy@c-s.fr> #PPC32 Also tested on PPC64 (under QEMU): book3s/64 64k pages, book3s/64 4k pages and book3e/64 > Reviewed-by: Ingo Molnar <mingo@kernel.org> > Suggested-by: Catalin Marinas <catalin.marinas@arm.com> > Signed-off-by: Andrew Morton <akpm@linux-foundation.org> > Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> > Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> > --- [...] 
> > diff --git a/Documentation/features/debug/debug-vm-pgtable/arch-support.txt b/Documentation/features/debug/debug-vm-pgtable/arch-support.txt > new file mode 100644 > index 000000000000..f3f8111edbe3 > --- /dev/null > +++ b/Documentation/features/debug/debug-vm-pgtable/arch-support.txt > @@ -0,0 +1,35 @@ > +# > +# Feature name: debug-vm-pgtable > +# Kconfig: ARCH_HAS_DEBUG_VM_PGTABLE > +# description: arch supports pgtable tests for semantics compliance > +# > + ----------------------- > + | arch |status| > + ----------------------- > + | alpha: | TODO | > + | arc: | ok | > + | arm: | TODO | > + | arm64: | ok | > + | c6x: | TODO | > + | csky: | TODO | > + | h8300: | TODO | > + | hexagon: | TODO | > + | ia64: | TODO | > + | m68k: | TODO | > + | microblaze: | TODO | > + | mips: | TODO | > + | nds32: | TODO | > + | nios2: | TODO | > + | openrisc: | TODO | > + | parisc: | TODO | > + | powerpc/32: | ok | > + | powerpc/64: | TODO | You can change the two above lines by powerpc: ok > + | riscv: | TODO | > + | s390: | TODO | > + | sh: | TODO | > + | sparc: | TODO | > + | um: | TODO | > + | unicore32: | TODO | > + | x86: | ok | > + | xtensa: | TODO | > + ----------------------- > diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig > index 1ec34e16ed65..253dcab0bebc 100644 > --- a/arch/powerpc/Kconfig > +++ b/arch/powerpc/Kconfig > @@ -120,6 +120,7 @@ config PPC > # > select ARCH_32BIT_OFF_T if PPC32 > select ARCH_HAS_DEBUG_VIRTUAL > + select ARCH_HAS_DEBUG_VM_PGTABLE if PPC32 Remove the 'if PPC32' as we now know it also work on PPC64. > select ARCH_HAS_DEVMEM_IS_ALLOWED > select ARCH_HAS_ELF_RANDOMIZE > select ARCH_HAS_FORTIFY_SOURCE > diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtable_64.h > index 0b6c4042942a..fb0e76d254b3 100644 > --- a/arch/x86/include/asm/pgtable_64.h > +++ b/arch/x86/include/asm/pgtable_64.h > @@ -53,6 +53,12 @@ static inline void sync_initial_page_table(void) { } > > struct mm_struct; > > +#define mm_p4d_folded mm_p4d_folded > +static inline bool mm_p4d_folded(struct mm_struct *mm) > +{ > + return !pgtable_l5_enabled(); > +} > + For me this should be part of another patch, it is not directly linked to the tests. > void set_pte_vaddr_p4d(p4d_t *p4d_page, unsigned long vaddr, pte_t new_pte); > void set_pte_vaddr_pud(pud_t *pud_page, unsigned long vaddr, pte_t new_pte); > > diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h > index 798ea36a0549..e0b04787e789 100644 > --- a/include/asm-generic/pgtable.h > +++ b/include/asm-generic/pgtable.h > @@ -1208,6 +1208,12 @@ static inline bool arch_has_pfn_modify_check(void) > # define PAGE_KERNEL_EXEC PAGE_KERNEL > #endif > > +#ifdef CONFIG_DEBUG_VM_PGTABLE Not sure it is a good idea to put that in include/asm-generic/pgtable.h By doing this you are forcing a rebuild of almost all files, whereas only init/main.o and mm/debug_vm_pgtable.o should be rebuilt when activating this config option. > +extern void debug_vm_pgtable(void); Please don't use the 'extern' keyword, it is useless and not to be used for functions declaration. 
> +#else > +static inline void debug_vm_pgtable(void) { } > +#endif > + > #endif /* !__ASSEMBLY__ */ > > #ifndef io_remap_pfn_range > diff --git a/init/main.c b/init/main.c > index da1bc0b60a7d..5e59e6ac0780 100644 > --- a/init/main.c > +++ b/init/main.c > @@ -1197,6 +1197,7 @@ static noinline void __init kernel_init_freeable(void) > sched_init_smp(); > > page_alloc_init_late(); > + debug_vm_pgtable(); Wouldn't it be better to call debug_vm_pgtable() in kernel_init() between the call to async_synchronise_full() and ftrace_free_init_mem() ? > /* Initialize page ext after all struct pages are initialized. */ > page_ext_init(); > > diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug > index 5ffe144c9794..7cceae923c05 100644 > --- a/lib/Kconfig.debug > +++ b/lib/Kconfig.debug > @@ -653,6 +653,12 @@ config SCHED_STACK_END_CHECK > data corruption or a sporadic crash at a later stage once the region > is examined. The runtime overhead introduced is minimal. > > +config ARCH_HAS_DEBUG_VM_PGTABLE > + bool > + help > + An architecture should select this when it can successfully > + build and run DEBUG_VM_PGTABLE. > + > config DEBUG_VM > bool "Debug VM" > depends on DEBUG_KERNEL > @@ -688,6 +694,22 @@ config DEBUG_VM_PGFLAGS > > If unsure, say N. > > +config DEBUG_VM_PGTABLE > + bool "Debug arch page table for semantics compliance" > + depends on MMU > + depends on DEBUG_VM Does it really need to depend on DEBUG_VM ? I think we could make it standalone and 'default y if DEBUG_VM' instead. > + depends on ARCH_HAS_DEBUG_VM_PGTABLE > + default y > + help > + This option provides a debug method which can be used to test > + architecture page table helper functions on various platforms in > + verifying if they comply with expected generic MM semantics. This > + will help architecture code in making sure that any changes or > + new additions of these helpers still conform to expected > + semantics of the generic MM. > + > + If unsure, say N. > + Does it make sense to make it 'default y' and say 'If unsure, say N' ? > config ARCH_HAS_DEBUG_VIRTUAL > bool > Christophe
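For context on the x86 mm_p4d_folded() hunk questioned in the review above, here is a minimal sketch (the function name and surrounding details are hypothetical, not the patch's actual code) of how such a helper lets a test like this skip p4d-level checks when that level is folded:

/*
 * Sketch only: with 4-level paging the p4d level is folded, so
 * p4d-specific assertions would be meaningless and get skipped.
 * Assumes a kernel build providing <linux/mm.h> and <asm/pgtable.h>.
 */
static void __init p4d_level_sketch(struct mm_struct *mm, p4d_t *p4dp)
{
	if (mm_p4d_folded(mm))
		return;

	/* Only reached when the architecture really has a p4d level. */
	WARN_ON(!p4d_none(READ_ONCE(*p4dp)));
}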
On Mon, Jan 27, 2020 at 09:11:53PM -0500, Qian Cai wrote: > On Jan 27, 2020, at 8:28 PM, Anshuman Khandual <Anshuman.Khandual@arm.com> wrote: > > This adds tests which will validate architecture page table helpers and > > other accessors in their compliance with expected generic MM semantics. > > This will help various architectures in validating changes to existing > > page table helpers or addition of new ones. [...] > What’s the value of this block of new code? It only supports x86 and > arm64 which are supposed to be good now. Did those tests ever find any > regression or this is almost only useful for new architectures which > only happened once in a few years? The primary goal here is not finding regressions but having clearly defined semantics of the page table accessors across architectures. x86 and arm64 are a good starting point and other architectures will be enabled as they are aligned to the same semantics. See for example this past discussion: https://lore.kernel.org/linux-mm/20190628102003.GA56463@arrakis.emea.arm.com/ These tests should act as the 'contract' between the generic mm code and the architecture port. Without clear semantics, some bugs may be a lot subtler than a boot failure. FTR, I fully support this patch (and I should get around to review it properly; thanks for the reminder ;)).
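To make the 'contract' idea concrete, here is a minimal sketch of the kind of assertion such a test makes. The function name is illustrative and this is not the patch's actual code, only the style of check being discussed (pte_mkwrite() is shown with its single-argument form as it existed at the time of this thread):

/*
 * Illustrative sketch of a pte-level "contract" check. Assumes a
 * kernel build providing the generic pgtable helpers via <linux/mm.h>.
 */
static void __init pte_contract_sketch(unsigned long pfn, pgprot_t prot)
{
	pte_t pte = pfn_pte(pfn, prot);

	WARN_ON(!pte_same(pte, pte));
	WARN_ON(!pte_young(pte_mkyoung(pte)));
	WARN_ON(!pte_dirty(pte_mkdirty(pte)));
	WARN_ON(!pte_write(pte_mkwrite(pte)));
	WARN_ON(pte_young(pte_mkold(pte)));
	WARN_ON(pte_dirty(pte_mkclean(pte)));
	WARN_ON(pte_write(pte_wrprotect(pte)));
}

An architecture whose helpers deviate from these expectations would then trigger a warning at boot rather than a subtle misbehaviour later.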
> On Jan 28, 2020, at 12:47 PM, Catalin Marinas <catalin.marinas@arm.com> wrote: > > The primary goal here is not finding regressions but having clearly > defined semantics of the page table accessors across architectures. x86 > and arm64 are a good starting point and other architectures will be > enabled as they are aligned to the same semantics. This still does not answer the fundamental question. If this test is simply inefficient to find bugs, who wants to spend time to use it regularly? If this is just one off test that may get running once in a few years (when introducing a new arch), how does it justify the ongoing cost to maintain it? I do agree there could be a need to clearly define this thing but that belongs to documentation rather than testing purpose. It is confusing to mix this with other config options which have somewhat a different purpose, it will then be a waste of time for people who mistakenly enable this for regular automatic testing and never found any bug from it.
On Tue, Jan 28, 2020 at 02:07:10PM -0500, Qian Cai wrote: > On Jan 28, 2020, at 12:47 PM, Catalin Marinas <catalin.marinas@arm.com> wrote: > > The primary goal here is not finding regressions but having clearly > > defined semantics of the page table accessors across architectures. x86 > > and arm64 are a good starting point and other architectures will be > > enabled as they are aligned to the same semantics. > > This still does not answer the fundamental question. If this test is > simply inefficient to find bugs, Who said this is inefficient (other than you)? > who wants to spend time to use it regularly? Arch maintainers, mm maintainers introducing new macros or assuming certain new semantics of the existing macros. > If this is just one off test that may get running once in a few years > (when introducing a new arch), how does it justify the ongoing cost to > maintain it? You are really missing the point. It's not only for a new arch but changes to existing arch code. And if the arch code churn in this area is relatively small, I'd expect a similarly small cost of maintaining this test. If you only turn DEBUG_VM on once every few years, don't generalise this to the rest of the kernel developers (as others pointed out, this test is default y if DEBUG_VM). Anyway, I think that's a pointless discussion, so not going to reply further (unless you have technical content to add).
> On Jan 29, 2020, at 5:36 AM, Catalin Marinas <catalin.marinas@arm.com> wrote: > > On Tue, Jan 28, 2020 at 02:07:10PM -0500, Qian Cai wrote: >> On Jan 28, 2020, at 12:47 PM, Catalin Marinas <catalin.marinas@arm.com> wrote: >>> The primary goal here is not finding regressions but having clearly >>> defined semantics of the page table accessors across architectures. x86 >>> and arm64 are a good starting point and other architectures will be >>> enabled as they are aligned to the same semantics. >> >> This still does not answer the fundamental question. If this test is >> simply inefficient to find bugs, > > Who said this is inefficient (other than you)? Inefficient of finding bugs. It said only found a bug or two in its lifetime? > >> who wants to spend time to use it regularly? > > Arch maintainers, mm maintainers introducing new macros or assuming > certain new semantics of the existing macros. > >> If this is just one off test that may get running once in a few years >> (when introducing a new arch), how does it justify the ongoing cost to >> maintain it? > > You are really missing the point. It's not only for a new arch but > changes to existing arch code. And if the arch code churn in this area > is relatively small, I'd expect a similarly small cost of maintaining > this test. > > If you only turn DEBUG_VM on once every few years, don't generalise this > to the rest of the kernel developers (as others pointed out, this test > is default y if DEBUG_VM). Quite the opposite, I am running DEBUG_VM almost daily for regression workload while I felt strongly this thing does not add any value mixing there. So, I would suggest to decouple this away from DEBUG_VM, and clearly document that this test is not something intended for automated regression workloads, so those people don’t need to waste time running this. > > Anyway, I think that's a pointless discussion, so not going to reply > further (unless you have technical content to add). > > -- > Catalin
On Tue, 28 Jan 2020 06:57:53 +0530 Anshuman Khandual <anshuman.khandual@arm.com> wrote: > This adds tests which will validate architecture page table helpers and > other accessors in their compliance with expected generic MM semantics. > This will help various architectures in validating changes to existing > page table helpers or addition of new ones. > > This test covers basic page table entry transformations including but not > limited to old, young, dirty, clean, write, write protect etc at various > level along with populating intermediate entries with next page table page > and validating them. > > Test page table pages are allocated from system memory with required size > and alignments. The mapped pfns at page table levels are derived from a > real pfn representing a valid kernel text symbol. This test gets called > right after page_alloc_init_late(). > > This gets build and run when CONFIG_DEBUG_VM_PGTABLE is selected along with > CONFIG_VM_DEBUG. Architectures willing to subscribe this test also need to > select CONFIG_ARCH_HAS_DEBUG_VM_PGTABLE which for now is limited to x86 and > arm64. Going forward, other architectures too can enable this after fixing > build or runtime problems (if any) with their page table helpers. > > Folks interested in making sure that a given platform's page table helpers > conform to expected generic MM semantics should enable the above config > which will just trigger this test during boot. Any non conformity here will > be reported as an warning which would need to be fixed. This test will help > catch any changes to the agreed upon semantics expected from generic MM and > enable platforms to accommodate it thereafter. > [...] > > Tested-by: Christophe Leroy <christophe.leroy@c-s.fr> #PPC32 Tested-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> # s390 Thanks again for this effort, and for keeping up the spirit against all odds and even after 12 iterations :-) > > diff --git a/Documentation/features/debug/debug-vm-pgtable/arch-support.txt b/Documentation/features/debug/debug-vm-pgtable/arch-support.txt > new file mode 100644 > index 000000000000..f3f8111edbe3 > --- /dev/null > +++ b/Documentation/features/debug/debug-vm-pgtable/arch-support.txt > @@ -0,0 +1,35 @@ > +# > +# Feature name: debug-vm-pgtable > +# Kconfig: ARCH_HAS_DEBUG_VM_PGTABLE > +# description: arch supports pgtable tests for semantics compliance > +# > + ----------------------- > + | arch |status| > + ----------------------- > + | alpha: | TODO | > + | arc: | ok | > + | arm: | TODO | > + | arm64: | ok | > + | c6x: | TODO | > + | csky: | TODO | > + | h8300: | TODO | > + | hexagon: | TODO | > + | ia64: | TODO | > + | m68k: | TODO | > + | microblaze: | TODO | > + | mips: | TODO | > + | nds32: | TODO | > + | nios2: | TODO | > + | openrisc: | TODO | > + | parisc: | TODO | > + | powerpc/32: | ok | > + | powerpc/64: | TODO | > + | riscv: | TODO | > + | s390: | TODO | s390 is ok now, with my patches included in v5.5-rc1. 
So you can now add --- a/Documentation/features/debug/debug-vm-pgtable/arch-support.txt +++ b/Documentation/features/debug/debug-vm-pgtable/arch-support.txt @@ -25,7 +25,7 @@ | powerpc/32: | ok | | powerpc/64: | TODO | | riscv: | TODO | - | s390: | TODO | + | s390: | ok | | sh: | TODO | | sparc: | TODO | | um: | TODO | --- a/arch/s390/Kconfig +++ b/arch/s390/Kconfig @@ -59,6 +59,7 @@ config KASAN_SHADOW_OFFSET config S390 def_bool y select ARCH_BINFMT_ELF_STATE + select ARCH_HAS_DEBUG_VM_PGTABLE select ARCH_HAS_DEVMEM_IS_ALLOWED select ARCH_HAS_ELF_RANDOMIZE select ARCH_HAS_FORTIFY_SOURCE
On Mon, 27 Jan 2020 22:33:08 -0500 Qian Cai <cai@lca.pw> wrote: > > > >> Did those tests ever find any regression or this is almost only useful for new > > > > The test has already found problems with s390 page table helpers. > > Hmm, that is pretty weak where s390 is not even official supported with this version. > I first had to get the three patches upstream, each fixing non-conforming behavior on s390, and each issue was found by this extremely useful test: 2416cefc504b s390/mm: add mm_pxd_folded() checks to pxd_free() 2d1fc1eb9b54 s390/mm: simplify page table helpers for large entries 1c27a4bc817b s390/mm: make pmd/pud_bad() report large entries as bad I did not see any direct effect of this misbehavior yet, but I am very happy that this could be found and fixed in order to prevent future issues. And this is exactly the value of this test, to make sure that all architectures have a common understanding of how the various page table helpers are supposed to work. For example, who would have thought that pXd_bad() is supposed to report large entries as bad? It's not really documented anywhere, so we just checked them for sanity like normal entries, which apparently worked fine so far, but for how long?
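As an illustration of the pXd_bad() expectation mentioned above, here is a hedged sketch of the kind of check that flags large entries as bad. It is neither the s390 fix nor the patch's exact code, and it is guarded the way THP-only helpers usually are:

/*
 * Sketch only: generic MM expects a "good" pmd to point to a pte
 * table, so a large (leaf) entry should be reported as bad.
 */
static void __init pmd_leaf_bad_sketch(unsigned long pfn, pgprot_t prot)
{
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
	pmd_t pmd;

	if (!has_transparent_hugepage())
		return;

	pmd = pmd_mkhuge(pfn_pmd(pfn, prot));
	WARN_ON(!pmd_bad(pmd));	/* a leaf entry is not a page table pointer */
#endif
}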
On Wed, Jan 29, 2020 at 11:20:44PM +0100, Gerald Schaefer wrote: > On Mon, 27 Jan 2020 22:33:08 -0500 > > For example, who would have thought that pXd_bad() is supposed to > report large entries as bad? It's not really documented anywhere, A bit off-topic, @Anshuman, maybe you could start a Documentation/ patch that describes at least some of the pXd_whaterver()? Or that would be too much to ask? ;-) > so we just checked them for sanity like normal entries, which > apparently worked fine so far, but for how long?
On 01/28/2020 10:35 PM, Christophe Leroy wrote: > > > Le 28/01/2020 à 02:27, Anshuman Khandual a écrit : >> This adds tests which will validate architecture page table helpers and >> other accessors in their compliance with expected generic MM semantics. >> This will help various architectures in validating changes to existing >> page table helpers or addition of new ones. >> >> This test covers basic page table entry transformations including but not >> limited to old, young, dirty, clean, write, write protect etc at various >> level along with populating intermediate entries with next page table page >> and validating them. >> >> Test page table pages are allocated from system memory with required size >> and alignments. The mapped pfns at page table levels are derived from a >> real pfn representing a valid kernel text symbol. This test gets called >> right after page_alloc_init_late(). >> >> This gets build and run when CONFIG_DEBUG_VM_PGTABLE is selected along with >> CONFIG_VM_DEBUG. Architectures willing to subscribe this test also need to >> select CONFIG_ARCH_HAS_DEBUG_VM_PGTABLE which for now is limited to x86 and >> arm64. Going forward, other architectures too can enable this after fixing >> build or runtime problems (if any) with their page table helpers. >> >> Folks interested in making sure that a given platform's page table helpers >> conform to expected generic MM semantics should enable the above config >> which will just trigger this test during boot. Any non conformity here will >> be reported as an warning which would need to be fixed. This test will help >> catch any changes to the agreed upon semantics expected from generic MM and >> enable platforms to accommodate it thereafter. >> > > [...] > >> >> Tested-by: Christophe Leroy <christophe.leroy@c-s.fr> #PPC32 > > Also tested on PPC64 (under QEMU): book3s/64 64k pages, book3s/64 4k pages and book3e/64 Hmm but earlier Michael Ellerman had reported some problems while running these tests on PPC64, a soft lock up in hash__pte_update() and a kernel BUG (radix MMU). Are those problems gone away now ? Details in this thread - https://patchwork.kernel.org/patch/11214603/ > >> Reviewed-by: Ingo Molnar <mingo@kernel.org> >> Suggested-by: Catalin Marinas <catalin.marinas@arm.com> >> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> >> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> >> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> >> --- > > [...] 
> >> >> diff --git a/Documentation/features/debug/debug-vm-pgtable/arch-support.txt b/Documentation/features/debug/debug-vm-pgtable/arch-support.txt >> new file mode 100644 >> index 000000000000..f3f8111edbe3 >> --- /dev/null >> +++ b/Documentation/features/debug/debug-vm-pgtable/arch-support.txt >> @@ -0,0 +1,35 @@ >> +# >> +# Feature name: debug-vm-pgtable >> +# Kconfig: ARCH_HAS_DEBUG_VM_PGTABLE >> +# description: arch supports pgtable tests for semantics compliance >> +# >> + ----------------------- >> + | arch |status| >> + ----------------------- >> + | alpha: | TODO | >> + | arc: | ok | >> + | arm: | TODO | >> + | arm64: | ok | >> + | c6x: | TODO | >> + | csky: | TODO | >> + | h8300: | TODO | >> + | hexagon: | TODO | >> + | ia64: | TODO | >> + | m68k: | TODO | >> + | microblaze: | TODO | >> + | mips: | TODO | >> + | nds32: | TODO | >> + | nios2: | TODO | >> + | openrisc: | TODO | >> + | parisc: | TODO | >> + | powerpc/32: | ok | >> + | powerpc/64: | TODO | > > You can change the two above lines by > > powerpc: ok > >> + | riscv: | TODO | >> + | s390: | TODO | >> + | sh: | TODO | >> + | sparc: | TODO | >> + | um: | TODO | >> + | unicore32: | TODO | >> + | x86: | ok | >> + | xtensa: | TODO | >> + ----------------------- > >> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig >> index 1ec34e16ed65..253dcab0bebc 100644 >> --- a/arch/powerpc/Kconfig >> +++ b/arch/powerpc/Kconfig >> @@ -120,6 +120,7 @@ config PPC >> # >> select ARCH_32BIT_OFF_T if PPC32 >> select ARCH_HAS_DEBUG_VIRTUAL >> + select ARCH_HAS_DEBUG_VM_PGTABLE if PPC32 > > Remove the 'if PPC32' as we now know it also work on PPC64. But in case there is a subset of PPC64 which still does not work (problem reported earlier) with the test, will have to adjust the config accordingly. > >> select ARCH_HAS_DEVMEM_IS_ALLOWED >> select ARCH_HAS_ELF_RANDOMIZE >> select ARCH_HAS_FORTIFY_SOURCE > >> diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtable_64.h >> index 0b6c4042942a..fb0e76d254b3 100644 >> --- a/arch/x86/include/asm/pgtable_64.h >> +++ b/arch/x86/include/asm/pgtable_64.h >> @@ -53,6 +53,12 @@ static inline void sync_initial_page_table(void) { } >> struct mm_struct; >> +#define mm_p4d_folded mm_p4d_folded >> +static inline bool mm_p4d_folded(struct mm_struct *mm) >> +{ >> + return !pgtable_l5_enabled(); >> +} >> + > > For me this should be part of another patch, it is not directly linked to the tests. We did discuss about this earlier and Kirril mentioned its not worth a separate patch. https://lore.kernel.org/linux-arm-kernel/20190913091305.rkds4f3fqv3yjhjy@box/ > >> void set_pte_vaddr_p4d(p4d_t *p4d_page, unsigned long vaddr, pte_t new_pte); >> void set_pte_vaddr_pud(pud_t *pud_page, unsigned long vaddr, pte_t new_pte); >> diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h >> index 798ea36a0549..e0b04787e789 100644 >> --- a/include/asm-generic/pgtable.h >> +++ b/include/asm-generic/pgtable.h >> @@ -1208,6 +1208,12 @@ static inline bool arch_has_pfn_modify_check(void) >> # define PAGE_KERNEL_EXEC PAGE_KERNEL >> #endif >> +#ifdef CONFIG_DEBUG_VM_PGTABLE > > Not sure it is a good idea to put that in include/asm-generic/pgtable.h Logically that is the right place, as it is related to page table but not something platform related. > > By doing this you are forcing a rebuild of almost all files, whereas only init/main.o and mm/debug_vm_pgtable.o should be rebuilt when activating this config option. I agreed but whats the alternative ? 
We could move these into init/main.c to make things simpler but will that be a right place, given its related to generic page table. > >> +extern void debug_vm_pgtable(void); > > Please don't use the 'extern' keyword, it is useless and not to be used for functions declaration. Really ? But, there are tons of examples doing the same thing both in generic and platform code as well. > >> +#else >> +static inline void debug_vm_pgtable(void) { } >> +#endif >> + >> #endif /* !__ASSEMBLY__ */ >> #ifndef io_remap_pfn_range >> diff --git a/init/main.c b/init/main.c >> index da1bc0b60a7d..5e59e6ac0780 100644 >> --- a/init/main.c >> +++ b/init/main.c >> @@ -1197,6 +1197,7 @@ static noinline void __init kernel_init_freeable(void) >> sched_init_smp(); >> page_alloc_init_late(); >> + debug_vm_pgtable(); > > Wouldn't it be better to call debug_vm_pgtable() in kernel_init() between the call to async_synchronise_full() and ftrace_free_init_mem() ? IIRC, proposed location is the earliest we could call debug_vm_pgtable(). Is there any particular benefit or reason to move it into kernel_init() ? > >> /* Initialize page ext after all struct pages are initialized. */ >> page_ext_init(); >> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug >> index 5ffe144c9794..7cceae923c05 100644 >> --- a/lib/Kconfig.debug >> +++ b/lib/Kconfig.debug >> @@ -653,6 +653,12 @@ config SCHED_STACK_END_CHECK >> data corruption or a sporadic crash at a later stage once the region >> is examined. The runtime overhead introduced is minimal. >> +config ARCH_HAS_DEBUG_VM_PGTABLE >> + bool >> + help >> + An architecture should select this when it can successfully >> + build and run DEBUG_VM_PGTABLE. >> + >> config DEBUG_VM >> bool "Debug VM" >> depends on DEBUG_KERNEL >> @@ -688,6 +694,22 @@ config DEBUG_VM_PGFLAGS >> If unsure, say N. >> +config DEBUG_VM_PGTABLE >> + bool "Debug arch page table for semantics compliance" >> + depends on MMU >> + depends on DEBUG_VM > > Does it really need to depend on DEBUG_VM ? No. It seemed better to package this test along with DEBUG_VM (although I dont remember the conversation around it) and hence this dependency. > I think we could make it standalone and 'default y if DEBUG_VM' instead. Which will yield the same result like before but in a different way. But yes, this test could go about either way but unless there is a good enough reason why change the current one. > >> + depends on ARCH_HAS_DEBUG_VM_PGTABLE >> + default y >> + help >> + This option provides a debug method which can be used to test >> + architecture page table helper functions on various platforms in >> + verifying if they comply with expected generic MM semantics. This >> + will help architecture code in making sure that any changes or >> + new additions of these helpers still conform to expected >> + semantics of the generic MM. >> + >> + If unsure, say N. >> + > > Does it make sense to make it 'default y' and say 'If unsure, say N' ? No it does. Not when it defaults 'y' unconditionally. Will drop the last sentence "If unsure, say N". Nice catch, thank you. > >> config ARCH_HAS_DEBUG_VIRTUAL >> bool >> > > Christophe >
On 01/30/2020 03:50 AM, Gerald Schaefer wrote: > On Tue, 28 Jan 2020 06:57:53 +0530 > Anshuman Khandual <anshuman.khandual@arm.com> wrote: > >> This adds tests which will validate architecture page table helpers and >> other accessors in their compliance with expected generic MM semantics. >> This will help various architectures in validating changes to existing >> page table helpers or addition of new ones. >> >> This test covers basic page table entry transformations including but not >> limited to old, young, dirty, clean, write, write protect etc at various >> level along with populating intermediate entries with next page table page >> and validating them. >> >> Test page table pages are allocated from system memory with required size >> and alignments. The mapped pfns at page table levels are derived from a >> real pfn representing a valid kernel text symbol. This test gets called >> right after page_alloc_init_late(). >> >> This gets build and run when CONFIG_DEBUG_VM_PGTABLE is selected along with >> CONFIG_VM_DEBUG. Architectures willing to subscribe this test also need to >> select CONFIG_ARCH_HAS_DEBUG_VM_PGTABLE which for now is limited to x86 and >> arm64. Going forward, other architectures too can enable this after fixing >> build or runtime problems (if any) with their page table helpers. >> >> Folks interested in making sure that a given platform's page table helpers >> conform to expected generic MM semantics should enable the above config >> which will just trigger this test during boot. Any non conformity here will >> be reported as an warning which would need to be fixed. This test will help >> catch any changes to the agreed upon semantics expected from generic MM and >> enable platforms to accommodate it thereafter. >> > > [...] > >> >> Tested-by: Christophe Leroy <christophe.leroy@c-s.fr> #PPC32 > > Tested-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> # s390 Thanks for testing. > > Thanks again for this effort, and for keeping up the spirit against > all odds and even after 12 iterations :-) > >> >> diff --git a/Documentation/features/debug/debug-vm-pgtable/arch-support.txt b/Documentation/features/debug/debug-vm-pgtable/arch-support.txt >> new file mode 100644 >> index 000000000000..f3f8111edbe3 >> --- /dev/null >> +++ b/Documentation/features/debug/debug-vm-pgtable/arch-support.txt >> @@ -0,0 +1,35 @@ >> +# >> +# Feature name: debug-vm-pgtable >> +# Kconfig: ARCH_HAS_DEBUG_VM_PGTABLE >> +# description: arch supports pgtable tests for semantics compliance >> +# >> + ----------------------- >> + | arch |status| >> + ----------------------- >> + | alpha: | TODO | >> + | arc: | ok | >> + | arm: | TODO | >> + | arm64: | ok | >> + | c6x: | TODO | >> + | csky: | TODO | >> + | h8300: | TODO | >> + | hexagon: | TODO | >> + | ia64: | TODO | >> + | m68k: | TODO | >> + | microblaze: | TODO | >> + | mips: | TODO | >> + | nds32: | TODO | >> + | nios2: | TODO | >> + | openrisc: | TODO | >> + | parisc: | TODO | >> + | powerpc/32: | ok | >> + | powerpc/64: | TODO | >> + | riscv: | TODO | >> + | s390: | TODO | > > s390 is ok now, with my patches included in v5.5-rc1. 
So you can now add > > --- a/Documentation/features/debug/debug-vm-pgtable/arch-support.txt > +++ b/Documentation/features/debug/debug-vm-pgtable/arch-support.txt > @@ -25,7 +25,7 @@ > | powerpc/32: | ok | > | powerpc/64: | TODO | > | riscv: | TODO | > - | s390: | TODO | > + | s390: | ok | > | sh: | TODO | > | sparc: | TODO | > | um: | TODO | > --- a/arch/s390/Kconfig > +++ b/arch/s390/Kconfig > @@ -59,6 +59,7 @@ config KASAN_SHADOW_OFFSET > config S390 > def_bool y > select ARCH_BINFMT_ELF_STATE > + select ARCH_HAS_DEBUG_VM_PGTABLE > select ARCH_HAS_DEVMEM_IS_ALLOWED > select ARCH_HAS_ELF_RANDOMIZE > select ARCH_HAS_FORTIFY_SOURCE Sure, will add this up.
On 01/30/2020 12:57 PM, Mike Rapoport wrote: > On Wed, Jan 29, 2020 at 11:20:44PM +0100, Gerald Schaefer wrote: >> On Mon, 27 Jan 2020 22:33:08 -0500 >> >> For example, who would have thought that pXd_bad() is supposed to >> report large entries as bad? It's not really documented anywhere, > > A bit off-topic, > > @Anshuman, maybe you could start a Documentation/ patch that describes at > least some of the pXd_whaterver()? > Or that would be too much to ask? ;-) No, it would not be :) I have been documenting the expected semantics for the helpers in the test itself. The idea is to collate them all (have been working on some additional tests but waiting for this one to get merged first) here and once most of the test gets settled, will move semantics documentation from here into Documentation/ directory in a proper format. /* * Basic operations * * mkold(entry) = An old and not a young entry * mkyoung(entry) = A young and not an old entry * mkdirty(entry) = A dirty and not a clean entry * mkclean(entry) = A clean and not a dirty entry * mkwrite(entry) = A write and not a write protected entry * wrprotect(entry) = A write protected and not a write entry * pxx_bad(entry) = A mapped and non-table entry * pxx_same(entry1, entry2) = Both entries hold the exact same value */ > >> so we just checked them for sanity like normal entries, which >> apparently worked fine so far, but for how long? >
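As a quick illustration of how the less obvious rules above read as assertions, using the same names the test uses (pfn, prot, mm, pudp and pmdp are the values the test harness in the patch sets up):

	pte_t pte = pfn_pte(pfn, prot);

	/* pxx_same(): an entry always compares equal to itself */
	WARN_ON(!pte_same(pte, pte));

	/*
	 * pxx_bad(): a huge (mapped) pmd does not point to a next level
	 * page table page, so it must qualify as bad ...
	 */
	WARN_ON(!pmd_bad(pmd_mkhuge(pfn_pmd(pfn, prot))));

	/* ... while a pud just populated with a page table page must not */
	pud_populate(mm, pudp, pmdp);
	WARN_ON(pud_bad(READ_ONCE(*pudp)));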
Le 30/01/2020 à 14:04, Anshuman Khandual a écrit : > > On 01/28/2020 10:35 PM, Christophe Leroy wrote: >> >> >> Le 28/01/2020 à 02:27, Anshuman Khandual a écrit : >>> diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtable_64.h >>> index 0b6c4042942a..fb0e76d254b3 100644 >>> --- a/arch/x86/include/asm/pgtable_64.h >>> +++ b/arch/x86/include/asm/pgtable_64.h >>> @@ -53,6 +53,12 @@ static inline void sync_initial_page_table(void) { } >>> struct mm_struct; >>> +#define mm_p4d_folded mm_p4d_folded >>> +static inline bool mm_p4d_folded(struct mm_struct *mm) >>> +{ >>> + return !pgtable_l5_enabled(); >>> +} >>> + >> >> For me this should be part of another patch, it is not directly linked to the tests. > > We did discuss about this earlier and Kirril mentioned its not worth > a separate patch. > > https://lore.kernel.org/linux-arm-kernel/20190913091305.rkds4f3fqv3yjhjy@box/ For me it would make sense to not mix this patch which implement tests, and changes that are needed for the test to work (or even build) on the different architectures. But that's up to you. > >> >>> void set_pte_vaddr_p4d(p4d_t *p4d_page, unsigned long vaddr, pte_t new_pte); >>> void set_pte_vaddr_pud(pud_t *pud_page, unsigned long vaddr, pte_t new_pte); >>> diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h >>> index 798ea36a0549..e0b04787e789 100644 >>> --- a/include/asm-generic/pgtable.h >>> +++ b/include/asm-generic/pgtable.h >>> @@ -1208,6 +1208,12 @@ static inline bool arch_has_pfn_modify_check(void) >>> # define PAGE_KERNEL_EXEC PAGE_KERNEL >>> #endif >>> +#ifdef CONFIG_DEBUG_VM_PGTABLE >> >> Not sure it is a good idea to put that in include/asm-generic/pgtable.h > > Logically that is the right place, as it is related to page table but > not something platform related. I can't see any debug related features in that file. > >> >> By doing this you are forcing a rebuild of almost all files, whereas only init/main.o and mm/debug_vm_pgtable.o should be rebuilt when activating this config option. > > I agreed but whats the alternative ? We could move these into init/main.c > to make things simpler but will that be a right place, given its related > to generic page table. What about linux/mmdebug.h instead ? (I have not checked if it would reduce the impact, but that's where things related to CONFIG_DEBUG_VM seems to be). Otherwise, you can just create new file, for instance <linux/mmdebug-pgtable.h> and include that file only in the init/main.c and mm/debug_vm_pgtable.c > >> >>> +extern void debug_vm_pgtable(void); >> >> Please don't use the 'extern' keyword, it is useless and not to be used for functions declaration. > > Really ? But, there are tons of examples doing the same thing both in > generic and platform code as well. Yes, but how can we improve if we blindly copy the errors from the past ? Having tons of 'extern' doesn't mean we must add more. I think checkpatch.pl usually complains when a patch brings a new unreleval extern symbol. 
> >> >>> +#else >>> +static inline void debug_vm_pgtable(void) { } >>> +#endif >>> + >>> #endif /* !__ASSEMBLY__ */ >>> #ifndef io_remap_pfn_range >>> diff --git a/init/main.c b/init/main.c >>> index da1bc0b60a7d..5e59e6ac0780 100644 >>> --- a/init/main.c >>> +++ b/init/main.c >>> @@ -1197,6 +1197,7 @@ static noinline void __init kernel_init_freeable(void) >>> sched_init_smp(); >>> page_alloc_init_late(); >>> + debug_vm_pgtable(); >> >> Wouldn't it be better to call debug_vm_pgtable() in kernel_init() between the call to async_synchronise_full() and ftrace_free_init_mem() ? > > IIRC, proposed location is the earliest we could call debug_vm_pgtable(). > Is there any particular benefit or reason to move it into kernel_init() ? It would avoid having it lost in the middle of drivers logs, would be close to the end of init, at a place we can't miss it, close to the result of other tests like CONFIG_DEBUG_RODATA_TEST for instance. At the moment, you have to look for it to be sure the test is done and what the result is. > >> >>> /* Initialize page ext after all struct pages are initialized. */ >>> page_ext_init(); >>> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug >>> index 5ffe144c9794..7cceae923c05 100644 >>> --- a/lib/Kconfig.debug >>> +++ b/lib/Kconfig.debug >>> @@ -653,6 +653,12 @@ config SCHED_STACK_END_CHECK >>> data corruption or a sporadic crash at a later stage once the region >>> is examined. The runtime overhead introduced is minimal. >>> +config ARCH_HAS_DEBUG_VM_PGTABLE >>> + bool >>> + help >>> + An architecture should select this when it can successfully >>> + build and run DEBUG_VM_PGTABLE. >>> + >>> config DEBUG_VM >>> bool "Debug VM" >>> depends on DEBUG_KERNEL >>> @@ -688,6 +694,22 @@ config DEBUG_VM_PGFLAGS >>> If unsure, say N. >>> +config DEBUG_VM_PGTABLE >>> + bool "Debug arch page table for semantics compliance" >>> + depends on MMU >>> + depends on DEBUG_VM >> >> Does it really need to depend on DEBUG_VM ? > > No. It seemed better to package this test along with DEBUG_VM (although I > dont remember the conversation around it) and hence this dependency. Yes but it perfectly work as standalone. The more easy it is to activate and the more people will use it. DEBUG_VM obliges to rebuild the kernel entirely and could modify the behaviour. Could the helpers we are testing behave differently when DEBUG_VM is not set ? I think it's good the test things as close as possible to final config. > >> I think we could make it standalone and 'default y if DEBUG_VM' instead. > > Which will yield the same result like before but in a different way. But > yes, this test could go about either way but unless there is a good enough > reason why change the current one. I think if we want people to really use it on other architectures it must be possible to activate it without having to modify Kconfig. Otherwise people won't even know the test exists and the architecture fails the test. The purpose of a test suite is to detect bugs. If you can't run the test until you have fixed the bugs, I guess nobody will ever detect the bugs and they will never be fixed. So I think: - the test should be 'default y' when ARCH_HAS_DEBUG_VM_PGTABLE is selected - the test should be 'default n' when ARCH_HAS_DEBUG_VM_PGTABLE is not selected, and it should be user selectable if EXPERT is selected. 
Something like: config DEBUG_VM_PGTABLE bool "Debug arch page table for semantics compliance" if ARCH_HAS_DEBUG_VM_PGTABLE || EXPERT depends on MMU default 'n' if !ARCH_HAS_DEBUG_VM_PGTABLE default 'y' if DEBUG_VM > >> >>> + depends on ARCH_HAS_DEBUG_VM_PGTABLE >>> + default y >>> + help >>> + This option provides a debug method which can be used to test >>> + architecture page table helper functions on various platforms in >>> + verifying if they comply with expected generic MM semantics. This >>> + will help architecture code in making sure that any changes or >>> + new additions of these helpers still conform to expected >>> + semantics of the generic MM. >>> + >>> + If unsure, say N. >>> + >> >> Does it make sense to make it 'default y' and say 'If unsure, say N' ? > > No it does. Not when it defaults 'y' unconditionally. Will drop the last > sentence "If unsure, say N". Nice catch, thank you. Well I was not asking if 'default y' was making sense but only if 'If unsure say N' was making sense due to the 'default y'. You got it. Christophe
On 01/28/2020 06:57 AM, Anshuman Khandual wrote: > This adds tests which will validate architecture page table helpers and > other accessors in their compliance with expected generic MM semantics. > This will help various architectures in validating changes to existing > page table helpers or addition of new ones. > > This test covers basic page table entry transformations including but not > limited to old, young, dirty, clean, write, write protect etc at various > level along with populating intermediate entries with next page table page > and validating them. > > Test page table pages are allocated from system memory with required size > and alignments. The mapped pfns at page table levels are derived from a > real pfn representing a valid kernel text symbol. This test gets called > right after page_alloc_init_late(). > > This gets build and run when CONFIG_DEBUG_VM_PGTABLE is selected along with > CONFIG_VM_DEBUG. Architectures willing to subscribe this test also need to > select CONFIG_ARCH_HAS_DEBUG_VM_PGTABLE which for now is limited to x86 and > arm64. Going forward, other architectures too can enable this after fixing > build or runtime problems (if any) with their page table helpers. > > Folks interested in making sure that a given platform's page table helpers > conform to expected generic MM semantics should enable the above config > which will just trigger this test during boot. Any non conformity here will > be reported as an warning which would need to be fixed. This test will help > catch any changes to the agreed upon semantics expected from generic MM and > enable platforms to accommodate it thereafter. > > Cc: Andrew Morton <akpm@linux-foundation.org> > Cc: Vlastimil Babka <vbabka@suse.cz> > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> > Cc: Thomas Gleixner <tglx@linutronix.de> > Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> > Cc: Jason Gunthorpe <jgg@ziepe.ca> > Cc: Dan Williams <dan.j.williams@intel.com> > Cc: Peter Zijlstra <peterz@infradead.org> > Cc: Michal Hocko <mhocko@kernel.org> > Cc: Mark Rutland <mark.rutland@arm.com> > Cc: Mark Brown <broonie@kernel.org> > Cc: Steven Price <Steven.Price@arm.com> > Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> > Cc: Masahiro Yamada <yamada.masahiro@socionext.com> > Cc: Kees Cook <keescook@chromium.org> > Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> > Cc: Matthew Wilcox <willy@infradead.org> > Cc: Sri Krishna chowdary <schowdary@nvidia.com> > Cc: Dave Hansen <dave.hansen@intel.com> > Cc: Russell King - ARM Linux <linux@armlinux.org.uk> > Cc: Michael Ellerman <mpe@ellerman.id.au> > Cc: Paul Mackerras <paulus@samba.org> > Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> > Cc: Heiko Carstens <heiko.carstens@de.ibm.com> > Cc: "David S. Miller" <davem@davemloft.net> > Cc: Vineet Gupta <vgupta@synopsys.com> > Cc: James Hogan <jhogan@kernel.org> > Cc: Paul Burton <paul.burton@mips.com> > Cc: Ralf Baechle <ralf@linux-mips.org> > Cc: Kirill A. Shutemov <kirill@shutemov.name> > Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com> > Cc: Christophe Leroy <christophe.leroy@c-s.fr> > Cc: Ingo Molnar <mingo@kernel.org> > Cc: linux-snps-arc@lists.infradead.org > Cc: linux-mips@vger.kernel.org > Cc: linux-arm-kernel@lists.infradead.org > Cc: linux-ia64@vger.kernel.org > Cc: linuxppc-dev@lists.ozlabs.org > Cc: linux-s390@vger.kernel.org > Cc: linux-sh@vger.kernel.org > Cc: sparclinux@vger.kernel.org > Cc: x86@kernel.org > Cc: linux-kernel@vger.kernel.org I should have included mailing lists for all missing platforms here. 
Will add them to the patch next time around, but for now I am just adding them here explicitly, in the hope that some of them can build and run the test successfully on their respective platforms. ALPHA: + linux-alpha@vger.kernel.org + Richard Henderson <rth@twiddle.net> + Ivan Kokshaysky <ink@jurassic.park.msu.ru> + Matt Turner <mattst88@gmail.com> C6X: + linux-c6x-dev@linux-c6x.org + Mark Salter <msalter@redhat.com> + Aurelien Jacquiot <jacquiot.aurelien@gmail.com> H8300: + uclinux-h8-devel@lists.sourceforge.jp + Yoshinori Sato <ysato@users.sourceforge.jp> HEXAGON: + linux-hexagon@vger.kernel.org + Brian Cain <bcain@codeaurora.org> M68K: + linux-m68k@lists.linux-m68k.org + Geert Uytterhoeven <geert@linux-m68k.org> MICROBLAZE: + Michal Simek <monstr@monstr.eu> RISCV: + linux-riscv@lists.infradead.org + Paul Walmsley <paul.walmsley@sifive.com> + Palmer Dabbelt <palmer@dabbelt.com> UNICORE32: + Guan Xuetao <gxt@pku.edu.cn> XTENSA: + linux-xtensa@linux-xtensa.org + Chris Zankel <chris@zankel.net> + Max Filippov <jcmvbkbc@gmail.com> Please feel free to add anyone I have missed.
On 01/30/2020 07:43 PM, Christophe Leroy wrote: > > > Le 30/01/2020 à 14:04, Anshuman Khandual a écrit : >> >> On 01/28/2020 10:35 PM, Christophe Leroy wrote: >>> >>> >>> Le 28/01/2020 à 02:27, Anshuman Khandual a écrit : >>>> diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtable_64.h >>>> index 0b6c4042942a..fb0e76d254b3 100644 >>>> --- a/arch/x86/include/asm/pgtable_64.h >>>> +++ b/arch/x86/include/asm/pgtable_64.h >>>> @@ -53,6 +53,12 @@ static inline void sync_initial_page_table(void) { } >>>> struct mm_struct; >>>> +#define mm_p4d_folded mm_p4d_folded >>>> +static inline bool mm_p4d_folded(struct mm_struct *mm) >>>> +{ >>>> + return !pgtable_l5_enabled(); >>>> +} >>>> + >>> >>> For me this should be part of another patch, it is not directly linked to the tests. >> >> We did discuss about this earlier and Kirril mentioned its not worth >> a separate patch. >> >> https://lore.kernel.org/linux-arm-kernel/20190913091305.rkds4f3fqv3yjhjy@box/ > > For me it would make sense to not mix this patch which implement tests, and changes that are needed for the test to work (or even build) on the different architectures. > > But that's up to you. > >> >>> >>>> void set_pte_vaddr_p4d(p4d_t *p4d_page, unsigned long vaddr, pte_t new_pte); >>>> void set_pte_vaddr_pud(pud_t *pud_page, unsigned long vaddr, pte_t new_pte); >>>> diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h >>>> index 798ea36a0549..e0b04787e789 100644 >>>> --- a/include/asm-generic/pgtable.h >>>> +++ b/include/asm-generic/pgtable.h >>>> @@ -1208,6 +1208,12 @@ static inline bool arch_has_pfn_modify_check(void) >>>> # define PAGE_KERNEL_EXEC PAGE_KERNEL >>>> #endif >>>> +#ifdef CONFIG_DEBUG_VM_PGTABLE >>> >>> Not sure it is a good idea to put that in include/asm-generic/pgtable.h >> >> Logically that is the right place, as it is related to page table but >> not something platform related. > > I can't see any debug related features in that file. > >> >>> >>> By doing this you are forcing a rebuild of almost all files, whereas only init/main.o and mm/debug_vm_pgtable.o should be rebuilt when activating this config option. >> >> I agreed but whats the alternative ? We could move these into init/main.c >> to make things simpler but will that be a right place, given its related >> to generic page table. > > What about linux/mmdebug.h instead ? (I have not checked if it would reduce the impact, but that's where things related to CONFIG_DEBUG_VM seems to be). > > Otherwise, you can just create new file, for instance <linux/mmdebug-pgtable.h> and include that file only in the init/main.c and mm/debug_vm_pgtable.c IMHO it might not be wise to add yet another header file for this purpose. Instead lets use <linux/mmdebug.h> in line with DEBUG_VM, DEBUG_VM_PGFLAGS, DEBUG_VIRTUAL (which is also a stand alone test). A simple grep shows that the impact of mmdebug.h would be less than generic pgtable.h header. > > > >> >>> >>>> +extern void debug_vm_pgtable(void); >>> >>> Please don't use the 'extern' keyword, it is useless and not to be used for functions declaration. >> >> Really ? But, there are tons of examples doing the same thing both in >> generic and platform code as well. > > Yes, but how can we improve if we blindly copy the errors from the past ? Having tons of 'extern' doesn't mean we must add more. > > I think checkpatch.pl usually complains when a patch brings a new unreleval extern symbol. Sure np, will drop it. But checkpatch.pl never complained. 
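If the declaration does move into <linux/mmdebug.h>, it could be as small as the sketch below (same shape as in the patch, just relocated and without the extern):

	/* include/linux/mmdebug.h -- sketch of the relocated declaration */
	#ifdef CONFIG_DEBUG_VM_PGTABLE
	void debug_vm_pgtable(void);
	#else
	static inline void debug_vm_pgtable(void) { }
	#endif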
> >> >>> >>>> +#else >>>> +static inline void debug_vm_pgtable(void) { } >>>> +#endif >>>> + >>>> #endif /* !__ASSEMBLY__ */ >>>> #ifndef io_remap_pfn_range >>>> diff --git a/init/main.c b/init/main.c >>>> index da1bc0b60a7d..5e59e6ac0780 100644 >>>> --- a/init/main.c >>>> +++ b/init/main.c >>>> @@ -1197,6 +1197,7 @@ static noinline void __init kernel_init_freeable(void) >>>> sched_init_smp(); >>>> page_alloc_init_late(); >>>> + debug_vm_pgtable(); >>> >>> Wouldn't it be better to call debug_vm_pgtable() in kernel_init() between the call to async_synchronise_full() and ftrace_free_init_mem() ? >> >> IIRC, proposed location is the earliest we could call debug_vm_pgtable(). >> Is there any particular benefit or reason to move it into kernel_init() ? > > It would avoid having it lost in the middle of drivers logs, would be close to the end of init, at a place we can't miss it, close to the result of other tests like CONFIG_DEBUG_RODATA_TEST for instance. > > At the moment, you have to look for it to be sure the test is done and what the result is. Sure, will move it. > >> >>> >>>> /* Initialize page ext after all struct pages are initialized. */ >>>> page_ext_init(); >>>> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug >>>> index 5ffe144c9794..7cceae923c05 100644 >>>> --- a/lib/Kconfig.debug >>>> +++ b/lib/Kconfig.debug >>>> @@ -653,6 +653,12 @@ config SCHED_STACK_END_CHECK >>>> data corruption or a sporadic crash at a later stage once the region >>>> is examined. The runtime overhead introduced is minimal. >>>> +config ARCH_HAS_DEBUG_VM_PGTABLE >>>> + bool >>>> + help >>>> + An architecture should select this when it can successfully >>>> + build and run DEBUG_VM_PGTABLE. >>>> + >>>> config DEBUG_VM >>>> bool "Debug VM" >>>> depends on DEBUG_KERNEL >>>> @@ -688,6 +694,22 @@ config DEBUG_VM_PGFLAGS >>>> If unsure, say N. >>>> +config DEBUG_VM_PGTABLE >>>> + bool "Debug arch page table for semantics compliance" >>>> + depends on MMU >>>> + depends on DEBUG_VM >>> >>> Does it really need to depend on DEBUG_VM ? >> >> No. It seemed better to package this test along with DEBUG_VM (although I >> dont remember the conversation around it) and hence this dependency. > > Yes but it perfectly work as standalone. The more easy it is to activate and the more people will use it. DEBUG_VM obliges to rebuild the kernel entirely and could modify the behaviour. Could the helpers we are testing behave differently when DEBUG_VM is not set ? I think it's good the test things as close as possible to final config. Makes sense. There is no functional dependency for the individual tests here on DEBUG_VM. > >> >>> I think we could make it standalone and 'default y if DEBUG_VM' instead. >> >> Which will yield the same result like before but in a different way. But >> yes, this test could go about either way but unless there is a good enough >> reason why change the current one. > > I think if we want people to really use it on other architectures it must be possible to activate it without having to modify Kconfig. Otherwise people won't even know the test exists and the architecture fails the test. > > The purpose of a test suite is to detect bugs. If you can't run the test until you have fixed the bugs, I guess nobody will ever detect the bugs and they will never be fixed. > > So I think: > - the test should be 'default y' when ARCH_HAS_DEBUG_VM_PGTABLE is selected > - the test should be 'default n' when ARCH_HAS_DEBUG_VM_PGTABLE is not selected, and it should be user selectable if EXPERT is selected. 
> > Something like: > > config DEBUG_VM_PGTABLE > bool "Debug arch page table for semantics compliance" if ARCH_HAS_DEBUG_VM_PGTABLE || EXPERT > depends on MMU (ARCH_HAS_DEBUG_VM_PGTABLE || EXPERT) be moved along side MMU on the same line ? > default 'n' if !ARCH_HAS_DEBUG_VM_PGTABLE > default 'y' if DEBUG_VM This looks good, at least until we get all platforms enabled. Will do all these changes along with s390 enablement and re-spin. > > >> >>> >>>> + depends on ARCH_HAS_DEBUG_VM_PGTABLE >>>> + default y >>>> + help >>>> + This option provides a debug method which can be used to test >>>> + architecture page table helper functions on various platforms in >>>> + verifying if they comply with expected generic MM semantics. This >>>> + will help architecture code in making sure that any changes or >>>> + new additions of these helpers still conform to expected >>>> + semantics of the generic MM. >>>> + >>>> + If unsure, say N. >>>> + >>> >>> Does it make sense to make it 'default y' and say 'If unsure, say N' ? >> >> No it does. Not when it defaults 'y' unconditionally. Will drop the last >> sentence "If unsure, say N". Nice catch, thank you. > > Well I was not asking if 'default y' was making sense but only if 'If unsure say N' was making sense due to the 'default y'. You got it. > > Christophe > >
On 01/30/2020 06:34 PM, Anshuman Khandual wrote: > On 01/28/2020 10:35 PM, Christophe Leroy wrote: >> >> Le 28/01/2020 à 02:27, Anshuman Khandual a écrit : >>> This adds tests which will validate architecture page table helpers and >>> other accessors in their compliance with expected generic MM semantics. >>> This will help various architectures in validating changes to existing >>> page table helpers or addition of new ones. >>> >>> This test covers basic page table entry transformations including but not >>> limited to old, young, dirty, clean, write, write protect etc at various >>> level along with populating intermediate entries with next page table page >>> and validating them. >>> >>> Test page table pages are allocated from system memory with required size >>> and alignments. The mapped pfns at page table levels are derived from a >>> real pfn representing a valid kernel text symbol. This test gets called >>> right after page_alloc_init_late(). >>> >>> This gets build and run when CONFIG_DEBUG_VM_PGTABLE is selected along with >>> CONFIG_VM_DEBUG. Architectures willing to subscribe this test also need to >>> select CONFIG_ARCH_HAS_DEBUG_VM_PGTABLE which for now is limited to x86 and >>> arm64. Going forward, other architectures too can enable this after fixing >>> build or runtime problems (if any) with their page table helpers. >>> >>> Folks interested in making sure that a given platform's page table helpers >>> conform to expected generic MM semantics should enable the above config >>> which will just trigger this test during boot. Any non conformity here will >>> be reported as an warning which would need to be fixed. This test will help >>> catch any changes to the agreed upon semantics expected from generic MM and >>> enable platforms to accommodate it thereafter. >>> >> [...] >> >>> Tested-by: Christophe Leroy <christophe.leroy@c-s.fr> #PPC32 >> Also tested on PPC64 (under QEMU): book3s/64 64k pages, book3s/64 4k pages and book3e/64 > Hmm but earlier Michael Ellerman had reported some problems while > running these tests on PPC64, a soft lock up in hash__pte_update() > and a kernel BUG (radix MMU). Are those problems gone away now ? > > Details in this thread - https://patchwork.kernel.org/patch/11214603/ > It is always better to have more platforms enabled than not. But lets keep this test disabled on PPC64 for now, if there is any inconsistency between results while running this under QEMU and on actual systems.
Le 02/02/2020 à 08:18, Anshuman Khandual a écrit : > On 01/30/2020 07:43 PM, Christophe Leroy wrote: >> >> >> Le 30/01/2020 à 14:04, Anshuman Khandual a écrit : >>> >>> On 01/28/2020 10:35 PM, Christophe Leroy wrote: >> >>> >>>> I think we could make it standalone and 'default y if DEBUG_VM' instead. >>> >>> Which will yield the same result like before but in a different way. But >>> yes, this test could go about either way but unless there is a good enough >>> reason why change the current one. >> >> I think if we want people to really use it on other architectures it must be possible to activate it without having to modify Kconfig. Otherwise people won't even know the test exists and the architecture fails the test. >> >> The purpose of a test suite is to detect bugs. If you can't run the test until you have fixed the bugs, I guess nobody will ever detect the bugs and they will never be fixed. >> >> So I think: >> - the test should be 'default y' when ARCH_HAS_DEBUG_VM_PGTABLE is selected >> - the test should be 'default n' when ARCH_HAS_DEBUG_VM_PGTABLE is not selected, and it should be user selectable if EXPERT is selected. >> >> Something like: >> >> config DEBUG_VM_PGTABLE >> bool "Debug arch page table for semantics compliance" if ARCH_HAS_DEBUG_VM_PGTABLE || EXPERT >> depends on MMU > > (ARCH_HAS_DEBUG_VM_PGTABLE || EXPERT) be moved along side MMU on the same line ? Yes could also go along side MMU, or could be a depend by itself: depends on ARCH_HAS_DEBUG_VM_PGTABLE || EXPERT > >> default 'n' if !ARCH_HAS_DEBUG_VM_PGTABLE >> default 'y' if DEBUG_VM > > This looks good, at least until we get all platforms enabled. Will do all these > changes along with s390 enablement and re-spin. Christophe
> On Jan 30, 2020, at 9:13 AM, Christophe Leroy <christophe.leroy@c-s.fr> wrote: > > config DEBUG_VM_PGTABLE > bool "Debug arch page table for semantics compliance" if ARCH_HAS_DEBUG_VM_PGTABLE || EXPERT > depends on MMU > default 'n' if !ARCH_HAS_DEBUG_VM_PGTABLE > default 'y' if DEBUG_VM Is it really necessary to potentially force all the bots to run this? Syzbot, kernel test robot, etc.? Does it ever pay off for all the machine time spent there?
Le 02/02/2020 à 12:26, Qian Cai a écrit : > > >> On Jan 30, 2020, at 9:13 AM, Christophe Leroy <christophe.leroy@c-s.fr> wrote: >> >> config DEBUG_VM_PGTABLE >> bool "Debug arch page table for semantics compliance" if ARCH_HAS_DEBUG_VM_PGTABLE || EXPERT >> depends on MMU >> default 'n' if !ARCH_HAS_DEBUG_VM_PGTABLE >> default 'y' if DEBUG_VM > > Is it really necessary to potentially force all the bots to run this? Syzbot, kernel test robot, etc.? Does it ever pay off for all the machine time spent there? > Machine time? On a 32-bit powerpc running at 132 MHz, the tests take less than 10 ms. Is it worth taking the risk of not detecting faults by not selecting it by default? [ 5.656916] debug_vm_pgtable: debug_vm_pgtable: Validating architecture page table helpers [ 5.665661] debug_vm_pgtable: debug_vm_pgtable: Validated architecture page table helpers Christophe
On Mon, 2020-02-03 at 16:14 +0100, Christophe Leroy wrote: > > Le 02/02/2020 à 12:26, Qian Cai a écrit : > > > > > > > On Jan 30, 2020, at 9:13 AM, Christophe Leroy <christophe.leroy@c-s.fr> wrote: > > > > > > config DEBUG_VM_PGTABLE > > > bool "Debug arch page table for semantics compliance" if ARCH_HAS_DEBUG_VM_PGTABLE || EXPERT > > > depends on MMU > > > default 'n' if !ARCH_HAS_DEBUG_VM_PGTABLE > > > default 'y' if DEBUG_VM > > > > Is it really necessary to potentially force all the bots to run this? Syzbot, kernel test robot, etc.? Does it ever pay off for all the machine time spent there? > > > > Machine time? > > On a 32-bit powerpc running at 132 MHz, the tests take less than 10 ms. > Is it worth taking the risk of not detecting faults by not selecting it > by default? The risk is quite low; as Catalin mentioned, this is not meant to detect regressions but is rather a tool for arch/mm maintainers. I do appreciate the effort to get as many people as possible to run this, so that it gets noticed more quickly once something breaks. However, DEBUG_VM has become such a generic Kconfig option these days that it is even enabled by default in Fedora Linux, so I would rather see a more sensible default chosen, even though the test runtime is fairly quick on a small machine for now. > > [ 5.656916] debug_vm_pgtable: debug_vm_pgtable: Validating > architecture page table helpers > [ 5.665661] debug_vm_pgtable: debug_vm_pgtable: Validated > architecture page table helpers
On Tue, Jan 28, 2020 at 06:57:53AM +0530, Anshuman Khandual wrote: > This gets build and run when CONFIG_DEBUG_VM_PGTABLE is selected along with > CONFIG_VM_DEBUG. Architectures willing to subscribe this test also need to > select CONFIG_ARCH_HAS_DEBUG_VM_PGTABLE which for now is limited to x86 and > arm64. Going forward, other architectures too can enable this after fixing > build or runtime problems (if any) with their page table helpers. It may be worth posting the next version to linux-arch to reach out to other arch maintainers. Also I've seen that you posted a v13 but it hasn't reached linux-arm-kernel (likely held in moderation because of the large amount of addresses cc'ed) and I don't normally follow LKML. I'm not cc'ed to this patch either (which is fine as long as you post to a list that I read). Since I started the reply on v12 about a week ago, I'll follow up here. When you post a v14, please trim the people on cc only to those strictly necessary (e.g. arch maintainers, linux-mm, linux-arch and lkml). > diff --git a/Documentation/features/debug/debug-vm-pgtable/arch-support.txt b/Documentation/features/debug/debug-vm-pgtable/arch-support.txt > new file mode 100644 > index 000000000000..f3f8111edbe3 > --- /dev/null > +++ b/Documentation/features/debug/debug-vm-pgtable/arch-support.txt > @@ -0,0 +1,35 @@ > +# > +# Feature name: debug-vm-pgtable > +# Kconfig: ARCH_HAS_DEBUG_VM_PGTABLE > +# description: arch supports pgtable tests for semantics compliance > +# > + ----------------------- > + | arch |status| > + ----------------------- > + | alpha: | TODO | > + | arc: | ok | > + | arm: | TODO | I'm sure you can find some arm32 hardware around (or a VM) to give this a try ;). > diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtable_64.h > index 0b6c4042942a..fb0e76d254b3 100644 > --- a/arch/x86/include/asm/pgtable_64.h > +++ b/arch/x86/include/asm/pgtable_64.h [...] > @@ -1197,6 +1197,7 @@ static noinline void __init kernel_init_freeable(void) > sched_init_smp(); > > page_alloc_init_late(); > + debug_vm_pgtable(); > /* Initialize page ext after all struct pages are initialized. */ > page_ext_init(); I guess you could even make debug_vm_pgtable() an early_initcall(). I don't have a strong opinion either way. > diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c > new file mode 100644 > index 000000000000..0f37f32d15f1 > --- /dev/null > +++ b/mm/debug_vm_pgtable.c > @@ -0,0 +1,388 @@ [...] > +/* > + * Basic operations > + * > + * mkold(entry) = An old and not a young entry > + * mkyoung(entry) = A young and not an old entry > + * mkdirty(entry) = A dirty and not a clean entry > + * mkclean(entry) = A clean and not a dirty entry > + * mkwrite(entry) = A write and not a write protected entry > + * wrprotect(entry) = A write protected and not a write entry > + * pxx_bad(entry) = A mapped and non-table entry > + * pxx_same(entry1, entry2) = Both entries hold the exact same value > + */ > +#define VMFLAGS (VM_READ|VM_WRITE|VM_EXEC) > + > +/* > + * On s390 platform, the lower 12 bits are used to identify given page table > + * entry type and for other arch specific requirements. But these bits might > + * affect the ability to clear entries with pxx_clear(). So while loading up > + * the entries skip all lower 12 bits in order to accommodate s390 platform. > + * It does not have affect any other platform. > + */ > +#define RANDOM_ORVALUE (0xfffffffffffff000UL) I'd suggest you generate this mask with something like GENMASK(BITS_PER_LONG, PAGE_SHIFT). 
> +#define RANDOM_NZVALUE (0xff) > + > +static void __init pte_basic_tests(unsigned long pfn, pgprot_t prot) > +{ > + pte_t pte = pfn_pte(pfn, prot); > + > + WARN_ON(!pte_same(pte, pte)); > + WARN_ON(!pte_young(pte_mkyoung(pte))); > + WARN_ON(!pte_dirty(pte_mkdirty(pte))); > + WARN_ON(!pte_write(pte_mkwrite(pte))); > + WARN_ON(pte_young(pte_mkold(pte))); > + WARN_ON(pte_dirty(pte_mkclean(pte))); > + WARN_ON(pte_write(pte_wrprotect(pte))); Given that you start with rwx permissions set, some of these ops would not have any effect. For example, on arm64 at least, mkwrite clears a bit already cleared here. You could try with multiple rwx combinations values (e.g. all set and all cleared) or maybe something like below: WARN_ON(!pte_write(pte_mkwrite(pte_wrprotect(pte)))); You could also try something like this: WARN_ON(!pte_same(pte_wrprotect(pte), pte_wrprotect(pte_mkwrite(pte)))); though the above approach may not work for arm64 ptep_set_wrprotect() on a dirty pte (if you extend these tests later). > +} > + > +#ifdef CONFIG_TRANSPARENT_HUGEPAGE > +static void __init pmd_basic_tests(unsigned long pfn, pgprot_t prot) > +{ > + pmd_t pmd = pfn_pmd(pfn, prot); > + > + WARN_ON(!pmd_same(pmd, pmd)); > + WARN_ON(!pmd_young(pmd_mkyoung(pmd))); > + WARN_ON(!pmd_dirty(pmd_mkdirty(pmd))); > + WARN_ON(!pmd_write(pmd_mkwrite(pmd))); > + WARN_ON(pmd_young(pmd_mkold(pmd))); > + WARN_ON(pmd_dirty(pmd_mkclean(pmd))); > + WARN_ON(pmd_write(pmd_wrprotect(pmd))); > + /* > + * A huge page does not point to next level page table > + * entry. Hence this must qualify as pmd_bad(). > + */ > + WARN_ON(!pmd_bad(pmd_mkhuge(pmd))); > +} > + > +#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD > +static void __init pud_basic_tests(unsigned long pfn, pgprot_t prot) > +{ > + pud_t pud = pfn_pud(pfn, prot); > + > + WARN_ON(!pud_same(pud, pud)); > + WARN_ON(!pud_young(pud_mkyoung(pud))); > + WARN_ON(!pud_write(pud_mkwrite(pud))); > + WARN_ON(pud_write(pud_wrprotect(pud))); > + WARN_ON(pud_young(pud_mkold(pud))); > + > + if (mm_pmd_folded(mm) || __is_defined(ARCH_HAS_4LEVEL_HACK)) > + return; > + > + /* > + * A huge page does not point to next level page table > + * entry. Hence this must qualify as pud_bad(). > + */ > + WARN_ON(!pud_bad(pud_mkhuge(pud))); > +} > +#else > +static void __init pud_basic_tests(unsigned long pfn, pgprot_t prot) { } > +#endif > +#else > +static void __init pmd_basic_tests(unsigned long pfn, pgprot_t prot) { } > +static void __init pud_basic_tests(unsigned long pfn, pgprot_t prot) { } > +#endif > + > +static void __init p4d_basic_tests(unsigned long pfn, pgprot_t prot) > +{ > + p4d_t p4d; > + > + memset(&p4d, RANDOM_NZVALUE, sizeof(p4d_t)); > + WARN_ON(!p4d_same(p4d, p4d)); > +} > + > +static void __init pgd_basic_tests(unsigned long pfn, pgprot_t prot) > +{ > + pgd_t pgd; > + > + memset(&pgd, RANDOM_NZVALUE, sizeof(pgd_t)); > + WARN_ON(!pgd_same(pgd, pgd)); > +} > + > +#ifndef __ARCH_HAS_4LEVEL_HACK This macro doesn't exist in the kernel anymore (it's a 5LEVEL now). But can you not use the __PAGETABLE_PUD_FOLDED instead? 
> +static void __init pud_clear_tests(struct mm_struct *mm, pud_t *pudp) > +{ > + pud_t pud = READ_ONCE(*pudp); > + > + if (mm_pmd_folded(mm)) > + return; > + > + pud = __pud(pud_val(pud) | RANDOM_ORVALUE); > + WRITE_ONCE(*pudp, pud); > + pud_clear(pudp); > + pud = READ_ONCE(*pudp); > + WARN_ON(!pud_none(pud)); > +} > + > +static void __init pud_populate_tests(struct mm_struct *mm, pud_t *pudp, > + pmd_t *pmdp) > +{ > + pud_t pud; > + > + if (mm_pmd_folded(mm)) > + return; > + /* > + * This entry points to next level page table page. > + * Hence this must not qualify as pud_bad(). > + */ > + pmd_clear(pmdp); > + pud_clear(pudp); > + pud_populate(mm, pudp, pmdp); > + pud = READ_ONCE(*pudp); > + WARN_ON(pud_bad(pud)); > +} > +#else > +static void __init pud_clear_tests(struct mm_struct *mm, pud_t *pudp) { } > +static void __init pud_populate_tests(struct mm_struct *mm, pud_t *pudp, > + pmd_t *pmdp) > +{ > +} > +#endif > + > +#ifndef __ARCH_HAS_5LEVEL_HACK Could you use __PAGETABLE_P4D_FOLDED instead? > +static void __init p4d_clear_tests(struct mm_struct *mm, p4d_t *p4dp) > +{ > + p4d_t p4d = READ_ONCE(*p4dp); > + > + if (mm_pud_folded(mm)) > + return; > + > + p4d = __p4d(p4d_val(p4d) | RANDOM_ORVALUE); > + WRITE_ONCE(*p4dp, p4d); > + p4d_clear(p4dp); > + p4d = READ_ONCE(*p4dp); > + WARN_ON(!p4d_none(p4d)); > +} Otherwise the patch looks fine. As per the comment on v13, make sure you don't break the build on any architecture, so this could either be an opt-in or patch those architectures before this patch is applied. Thanks.
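On the initcall remark above: one way to do that would be a small wrapper like the sketch below (the wrapper name is made up here, and it would live at the bottom of mm/debug_vm_pgtable.c). If the test must still run after page_alloc_init_late(), as the commit message states, a late initcall level would be the safer choice, since early initcalls run before that point in kernel_init_freeable():

	/* sketch only: run the test via the initcall machinery */
	static int __init debug_vm_pgtable_init(void)
	{
		debug_vm_pgtable();
		return 0;
	}
	late_initcall(debug_vm_pgtable_init);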
On 02/10/2020 09:07 PM, Catalin Marinas wrote: > On Tue, Jan 28, 2020 at 06:57:53AM +0530, Anshuman Khandual wrote: >> This gets build and run when CONFIG_DEBUG_VM_PGTABLE is selected along with >> CONFIG_VM_DEBUG. Architectures willing to subscribe this test also need to >> select CONFIG_ARCH_HAS_DEBUG_VM_PGTABLE which for now is limited to x86 and >> arm64. Going forward, other architectures too can enable this after fixing >> build or runtime problems (if any) with their page table helpers. > > It may be worth posting the next version to linux-arch to reach out to > other arch maintainers. Sure, will do. > > Also I've seen that you posted a v13 but it hasn't reached > linux-arm-kernel (likely held in moderation because of the large amount > of addresses cc'ed) and I don't normally follow LKML. I'm not cc'ed to > this patch either (which is fine as long as you post to a list that I > read). Right, the CC list on V13 was a disaster. I did not realize that it will exceed the permitted limit when the lists will start refusing to take. In fact, it looks like LKML did not get the email either. > > Since I started the reply on v12 about a week ago, I'll follow up here. > When you post a v14, please trim the people on cc only to those strictly > necessary (e.g. arch maintainers, linux-mm, linux-arch and lkml). Sure, will do. > >> diff --git a/Documentation/features/debug/debug-vm-pgtable/arch-support.txt b/Documentation/features/debug/debug-vm-pgtable/arch-support.txt >> new file mode 100644 >> index 000000000000..f3f8111edbe3 >> --- /dev/null >> +++ b/Documentation/features/debug/debug-vm-pgtable/arch-support.txt >> @@ -0,0 +1,35 @@ >> +# >> +# Feature name: debug-vm-pgtable >> +# Kconfig: ARCH_HAS_DEBUG_VM_PGTABLE >> +# description: arch supports pgtable tests for semantics compliance >> +# >> + ----------------------- >> + | arch |status| >> + ----------------------- >> + | alpha: | TODO | >> + | arc: | ok | >> + | arm: | TODO | > > I'm sure you can find some arm32 hardware around (or a VM) to give this > a try ;). It does not build on arm32 and we dont have an agreement on how to go about that either, hence will disable this test on IA64 and ARM (32) in order to prevent the known build failures (as Andrew had requested). > >> diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtable_64.h >> index 0b6c4042942a..fb0e76d254b3 100644 >> --- a/arch/x86/include/asm/pgtable_64.h >> +++ b/arch/x86/include/asm/pgtable_64.h > [...] >> @@ -1197,6 +1197,7 @@ static noinline void __init kernel_init_freeable(void) >> sched_init_smp(); >> >> page_alloc_init_late(); >> + debug_vm_pgtable(); >> /* Initialize page ext after all struct pages are initialized. */ >> page_ext_init(); > > I guess you could even make debug_vm_pgtable() an early_initcall(). I > don't have a strong opinion either way. > >> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c >> new file mode 100644 >> index 000000000000..0f37f32d15f1 >> --- /dev/null >> +++ b/mm/debug_vm_pgtable.c >> @@ -0,0 +1,388 @@ > [...] 
>> +/* >> + * Basic operations >> + * >> + * mkold(entry) = An old and not a young entry >> + * mkyoung(entry) = A young and not an old entry >> + * mkdirty(entry) = A dirty and not a clean entry >> + * mkclean(entry) = A clean and not a dirty entry >> + * mkwrite(entry) = A write and not a write protected entry >> + * wrprotect(entry) = A write protected and not a write entry >> + * pxx_bad(entry) = A mapped and non-table entry >> + * pxx_same(entry1, entry2) = Both entries hold the exact same value >> + */ >> +#define VMFLAGS (VM_READ|VM_WRITE|VM_EXEC) >> + >> +/* >> + * On s390 platform, the lower 12 bits are used to identify given page table >> + * entry type and for other arch specific requirements. But these bits might >> + * affect the ability to clear entries with pxx_clear(). So while loading up >> + * the entries skip all lower 12 bits in order to accommodate s390 platform. >> + * It does not have affect any other platform. >> + */ >> +#define RANDOM_ORVALUE (0xfffffffffffff000UL) > > I'd suggest you generate this mask with something like > GENMASK(BITS_PER_LONG, PAGE_SHIFT). IIRC the lower 12 bits constrains on s390 platform might not be really related to it's PAGE_SHIFT which can be a variable, but instead just a constant number. But can definitely use GENMASK or it's variants here. https://lkml.org/lkml/2019/9/5/862 > >> +#define RANDOM_NZVALUE (0xff) >> + >> +static void __init pte_basic_tests(unsigned long pfn, pgprot_t prot) >> +{ >> + pte_t pte = pfn_pte(pfn, prot); >> + >> + WARN_ON(!pte_same(pte, pte)); >> + WARN_ON(!pte_young(pte_mkyoung(pte))); >> + WARN_ON(!pte_dirty(pte_mkdirty(pte))); >> + WARN_ON(!pte_write(pte_mkwrite(pte))); >> + WARN_ON(pte_young(pte_mkold(pte))); >> + WARN_ON(pte_dirty(pte_mkclean(pte))); >> + WARN_ON(pte_write(pte_wrprotect(pte))); > > Given that you start with rwx permissions set, > some of these ops would not have any effect. For example, on arm64 at > least, mkwrite clears a bit already cleared here. You could try with PTE_RDONLY ! > multiple rwx combinations values (e.g. all set and all cleared) or maybe Which will require running the sequence of tests multiple times, each time with different prot value (e.g all set or all clear). Wondering if that would be better than the proposed single pass. > something like below: > > WARN_ON(!pte_write(pte_mkwrite(pte_wrprotect(pte)))); Hmm, we should run invert functions first for each function we are trying to test ? That makes sense because any platform specific bit combination (clear or set) for the function to be tested, will first be flipped with it's invert function. > > You could also try something like this: > > WARN_ON(!pte_same(pte_wrprotect(pte), pte_wrprotect(pte_mkwrite(pte)))); > > though the above approach may not work for arm64 ptep_set_wrprotect() on > a dirty pte (if you extend these tests later). Okay, will use the previous method (invert function -> actual function) for basic tests on each level. > >> +} >> + >> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE >> +static void __init pmd_basic_tests(unsigned long pfn, pgprot_t prot) >> +{ >> + pmd_t pmd = pfn_pmd(pfn, prot); >> + >> + WARN_ON(!pmd_same(pmd, pmd)); >> + WARN_ON(!pmd_young(pmd_mkyoung(pmd))); >> + WARN_ON(!pmd_dirty(pmd_mkdirty(pmd))); >> + WARN_ON(!pmd_write(pmd_mkwrite(pmd))); >> + WARN_ON(pmd_young(pmd_mkold(pmd))); >> + WARN_ON(pmd_dirty(pmd_mkclean(pmd))); >> + WARN_ON(pmd_write(pmd_wrprotect(pmd))); >> + /* >> + * A huge page does not point to next level page table >> + * entry. Hence this must qualify as pmd_bad(). 
>> + */ >> + WARN_ON(!pmd_bad(pmd_mkhuge(pmd))); >> +} >> + >> +#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD >> +static void __init pud_basic_tests(unsigned long pfn, pgprot_t prot) >> +{ >> + pud_t pud = pfn_pud(pfn, prot); >> + >> + WARN_ON(!pud_same(pud, pud)); >> + WARN_ON(!pud_young(pud_mkyoung(pud))); >> + WARN_ON(!pud_write(pud_mkwrite(pud))); >> + WARN_ON(pud_write(pud_wrprotect(pud))); >> + WARN_ON(pud_young(pud_mkold(pud))); >> + >> + if (mm_pmd_folded(mm) || __is_defined(ARCH_HAS_4LEVEL_HACK)) >> + return; >> + >> + /* >> + * A huge page does not point to next level page table >> + * entry. Hence this must qualify as pud_bad(). >> + */ >> + WARN_ON(!pud_bad(pud_mkhuge(pud))); >> +} >> +#else >> +static void __init pud_basic_tests(unsigned long pfn, pgprot_t prot) { } >> +#endif >> +#else >> +static void __init pmd_basic_tests(unsigned long pfn, pgprot_t prot) { } >> +static void __init pud_basic_tests(unsigned long pfn, pgprot_t prot) { } >> +#endif >> + >> +static void __init p4d_basic_tests(unsigned long pfn, pgprot_t prot) >> +{ >> + p4d_t p4d; >> + >> + memset(&p4d, RANDOM_NZVALUE, sizeof(p4d_t)); >> + WARN_ON(!p4d_same(p4d, p4d)); >> +} >> + >> +static void __init pgd_basic_tests(unsigned long pfn, pgprot_t prot) >> +{ >> + pgd_t pgd; >> + >> + memset(&pgd, RANDOM_NZVALUE, sizeof(pgd_t)); >> + WARN_ON(!pgd_same(pgd, pgd)); >> +} >> + >> +#ifndef __ARCH_HAS_4LEVEL_HACK > > This macro doesn't exist in the kernel anymore (it's a 5LEVEL now). But I was aware about the work to drop __ARCH_HAS_4LEVEL_HACK but did not realize that it has already merged. > can you not use the __PAGETABLE_PUD_FOLDED instead? Sure, will try. > >> +static void __init pud_clear_tests(struct mm_struct *mm, pud_t *pudp) >> +{ >> + pud_t pud = READ_ONCE(*pudp); >> + >> + if (mm_pmd_folded(mm)) >> + return; >> + >> + pud = __pud(pud_val(pud) | RANDOM_ORVALUE); >> + WRITE_ONCE(*pudp, pud); >> + pud_clear(pudp); >> + pud = READ_ONCE(*pudp); >> + WARN_ON(!pud_none(pud)); >> +} >> + >> +static void __init pud_populate_tests(struct mm_struct *mm, pud_t *pudp, >> + pmd_t *pmdp) >> +{ >> + pud_t pud; >> + >> + if (mm_pmd_folded(mm)) >> + return; >> + /* >> + * This entry points to next level page table page. >> + * Hence this must not qualify as pud_bad(). >> + */ >> + pmd_clear(pmdp); >> + pud_clear(pudp); >> + pud_populate(mm, pudp, pmdp); >> + pud = READ_ONCE(*pudp); >> + WARN_ON(pud_bad(pud)); >> +} >> +#else >> +static void __init pud_clear_tests(struct mm_struct *mm, pud_t *pudp) { } >> +static void __init pud_populate_tests(struct mm_struct *mm, pud_t *pudp, >> + pmd_t *pmdp) >> +{ >> +} >> +#endif >> + >> +#ifndef __ARCH_HAS_5LEVEL_HACK > > Could you use __PAGETABLE_P4D_FOLDED instead? Sure, will try. Initial tests with __PAGETABLE_PUD_FOLDED and __PAGETABLE_P4D_FOLDED replacement looks okay. > >> +static void __init p4d_clear_tests(struct mm_struct *mm, p4d_t *p4dp) >> +{ >> + p4d_t p4d = READ_ONCE(*p4dp); >> + >> + if (mm_pud_folded(mm)) >> + return; >> + >> + p4d = __p4d(p4d_val(p4d) | RANDOM_ORVALUE); >> + WRITE_ONCE(*p4dp, p4d); >> + p4d_clear(p4dp); >> + p4d = READ_ONCE(*p4dp); >> + WARN_ON(!p4d_none(p4d)); >> +} > > Otherwise the patch looks fine. As per the comment on v13, make sure you > don't break the build on any architecture, so this could either be an > opt-in or patch those architectures before this patch is applied. We already have an opt-in method through ARCH_HAS_DEBUG_VM_PGTABLE config. 
But lately (v13) we had decided to also enable the test through CONFIG_EXPERT, for better adaptability on unsupported platforms without requiring a Kconfig change there. This exposed the existing build failures on IA64 and ARM. I will probably disable the test on those platforms, as agreed upon in the V13 thread. > > Thanks. >
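For the record, the agreed "run the inverse helper first, then the helper under test" pattern would make the pte level checks look roughly like the sketch below; the same shape would apply at the pmd and pud levels:

	static void __init pte_basic_tests(unsigned long pfn, pgprot_t prot)
	{
		pte_t pte = pfn_pte(pfn, prot);

		WARN_ON(!pte_same(pte, pte));
		/* clear first, then set, so the setter has real work to do */
		WARN_ON(!pte_young(pte_mkyoung(pte_mkold(pte))));
		WARN_ON(!pte_dirty(pte_mkdirty(pte_mkclean(pte))));
		WARN_ON(!pte_write(pte_mkwrite(pte_wrprotect(pte))));
		/* and the other direction: set first, then clear */
		WARN_ON(pte_young(pte_mkold(pte_mkyoung(pte))));
		WARN_ON(pte_dirty(pte_mkclean(pte_mkdirty(pte))));
		WARN_ON(pte_write(pte_wrprotect(pte_mkwrite(pte))));
	}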
On Wed, 12 Feb 2020 15:12:54 +0530 Anshuman Khandual <anshuman.khandual@arm.com> wrote: > >> +/* > >> + * On s390 platform, the lower 12 bits are used to identify given page table > >> + * entry type and for other arch specific requirements. But these bits might > >> + * affect the ability to clear entries with pxx_clear(). So while loading up > >> + * the entries skip all lower 12 bits in order to accommodate s390 platform. > >> + * It does not have affect any other platform. > >> + */ > >> +#define RANDOM_ORVALUE (0xfffffffffffff000UL) > > > > I'd suggest you generate this mask with something like > > GENMASK(BITS_PER_LONG, PAGE_SHIFT). > > IIRC the lower 12 bits constrains on s390 platform might not be really related > to it's PAGE_SHIFT which can be a variable, but instead just a constant number. > But can definitely use GENMASK or it's variants here. > > https://lkml.org/lkml/2019/9/5/862 PAGE_SHIFT would be fine, it is 12 on s390. However, in order to be more precise, we do not really need all 12 bits, only the last 4 bits. So, something like this would work: #define RANDOM_ORVALUE GENMASK(BITS_PER_LONG - 1, 4) The text in the comment could then also be changed from 12 to 4, and be a bit more specific on the fact that the impact on pxx_clear() results from the dynamic page table folding logic on s390: /* * On s390 platform, the lower 4 bits are used to identify given page table * entry type. But these bits might affect the ability to clear entries with * pxx_clear() because of how dynamic page table folding works on s390. So * while loading up the entries do not change the lower 4 bits. * It does not have affect any other platform. */
On 02/12/2020 11:25 PM, Gerald Schaefer wrote: > On Wed, 12 Feb 2020 15:12:54 +0530 > Anshuman Khandual <anshuman.khandual@arm.com> wrote: > >>>> +/* >>>> + * On s390 platform, the lower 12 bits are used to identify given page table >>>> + * entry type and for other arch specific requirements. But these bits might >>>> + * affect the ability to clear entries with pxx_clear(). So while loading up >>>> + * the entries skip all lower 12 bits in order to accommodate s390 platform. >>>> + * It does not have affect any other platform. >>>> + */ >>>> +#define RANDOM_ORVALUE (0xfffffffffffff000UL) >>> >>> I'd suggest you generate this mask with something like >>> GENMASK(BITS_PER_LONG, PAGE_SHIFT). >> >> IIRC the lower 12 bits constrains on s390 platform might not be really related >> to it's PAGE_SHIFT which can be a variable, but instead just a constant number. >> But can definitely use GENMASK or it's variants here. >> >> https://lkml.org/lkml/2019/9/5/862 > > PAGE_SHIFT would be fine, it is 12 on s390. However, in order to be > more precise, we do not really need all 12 bits, only the last 4 bits. > So, something like this would work: > > #define RANDOM_ORVALUE GENMASK(BITS_PER_LONG - 1, 4) > > The text in the comment could then also be changed from 12 to 4, and > be a bit more specific on the fact that the impact on pxx_clear() > results from the dynamic page table folding logic on s390: > > /* > * On s390 platform, the lower 4 bits are used to identify given page table > * entry type. But these bits might affect the ability to clear entries with > * pxx_clear() because of how dynamic page table folding works on s390. So > * while loading up the entries do not change the lower 4 bits. > * It does not have affect any other platform. > */ Sure, will update accordingly.
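For reference, the proposed define expands as illustrated below (GENMASK() comes from <linux/bits.h>); on a 64-bit build it sets bits 4 through 63 and leaves the low 4 bits clear:

	/* 0xfffffffffffffff0UL on 64-bit: lower 4 bits left alone for s390 */
	#define RANDOM_ORVALUE	GENMASK(BITS_PER_LONG - 1, 4)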
diff --git a/Documentation/features/debug/debug-vm-pgtable/arch-support.txt b/Documentation/features/debug/debug-vm-pgtable/arch-support.txt new file mode 100644 index 000000000000..f3f8111edbe3 --- /dev/null +++ b/Documentation/features/debug/debug-vm-pgtable/arch-support.txt @@ -0,0 +1,35 @@ +# +# Feature name: debug-vm-pgtable +# Kconfig: ARCH_HAS_DEBUG_VM_PGTABLE +# description: arch supports pgtable tests for semantics compliance +# + ----------------------- + | arch |status| + ----------------------- + | alpha: | TODO | + | arc: | ok | + | arm: | TODO | + | arm64: | ok | + | c6x: | TODO | + | csky: | TODO | + | h8300: | TODO | + | hexagon: | TODO | + | ia64: | TODO | + | m68k: | TODO | + | microblaze: | TODO | + | mips: | TODO | + | nds32: | TODO | + | nios2: | TODO | + | openrisc: | TODO | + | parisc: | TODO | + | powerpc/32: | ok | + | powerpc/64: | TODO | + | riscv: | TODO | + | s390: | TODO | + | sh: | TODO | + | sparc: | TODO | + | um: | TODO | + | unicore32: | TODO | + | x86: | ok | + | xtensa: | TODO | + ----------------------- diff --git a/arch/arc/Kconfig b/arch/arc/Kconfig index 26108ea785c2..6a76f3b609b2 100644 --- a/arch/arc/Kconfig +++ b/arch/arc/Kconfig @@ -6,6 +6,7 @@ config ARC def_bool y select ARC_TIMERS + select ARCH_HAS_DEBUG_VM_PGTABLE select ARCH_HAS_DMA_PREP_COHERENT select ARCH_HAS_PTE_SPECIAL select ARCH_HAS_SETUP_DMA_OPS diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index e688dfad0b72..876c9a672db1 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -11,6 +11,7 @@ config ARM64 select ACPI_PPTT if ACPI select ARCH_CLOCKSOURCE_DATA select ARCH_HAS_DEBUG_VIRTUAL + select ARCH_HAS_DEBUG_VM_PGTABLE select ARCH_HAS_DEVMEM_IS_ALLOWED select ARCH_HAS_DMA_PREP_COHERENT select ARCH_HAS_ACPI_TABLE_UPGRADE if ACPI diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 1ec34e16ed65..253dcab0bebc 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -120,6 +120,7 @@ config PPC # select ARCH_32BIT_OFF_T if PPC32 select ARCH_HAS_DEBUG_VIRTUAL + select ARCH_HAS_DEBUG_VM_PGTABLE if PPC32 select ARCH_HAS_DEVMEM_IS_ALLOWED select ARCH_HAS_ELF_RANDOMIZE select ARCH_HAS_FORTIFY_SOURCE diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 5e8949953660..218536a4581d 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -61,6 +61,7 @@ config X86 select ARCH_CLOCKSOURCE_INIT select ARCH_HAS_ACPI_TABLE_UPGRADE if ACPI select ARCH_HAS_DEBUG_VIRTUAL + select ARCH_HAS_DEBUG_VM_PGTABLE if !X86_PAE select ARCH_HAS_DEVMEM_IS_ALLOWED select ARCH_HAS_ELF_RANDOMIZE select ARCH_HAS_FAST_MULTIPLIER diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtable_64.h index 0b6c4042942a..fb0e76d254b3 100644 --- a/arch/x86/include/asm/pgtable_64.h +++ b/arch/x86/include/asm/pgtable_64.h @@ -53,6 +53,12 @@ static inline void sync_initial_page_table(void) { } struct mm_struct; +#define mm_p4d_folded mm_p4d_folded +static inline bool mm_p4d_folded(struct mm_struct *mm) +{ + return !pgtable_l5_enabled(); +} + void set_pte_vaddr_p4d(p4d_t *p4d_page, unsigned long vaddr, pte_t new_pte); void set_pte_vaddr_pud(pud_t *pud_page, unsigned long vaddr, pte_t new_pte); diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h index 798ea36a0549..e0b04787e789 100644 --- a/include/asm-generic/pgtable.h +++ b/include/asm-generic/pgtable.h @@ -1208,6 +1208,12 @@ static inline bool arch_has_pfn_modify_check(void) # define PAGE_KERNEL_EXEC PAGE_KERNEL #endif +#ifdef CONFIG_DEBUG_VM_PGTABLE +extern void debug_vm_pgtable(void); +#else +static inline 
void debug_vm_pgtable(void) { } +#endif + #endif /* !__ASSEMBLY__ */ #ifndef io_remap_pfn_range diff --git a/init/main.c b/init/main.c index da1bc0b60a7d..5e59e6ac0780 100644 --- a/init/main.c +++ b/init/main.c @@ -1197,6 +1197,7 @@ static noinline void __init kernel_init_freeable(void) sched_init_smp(); page_alloc_init_late(); + debug_vm_pgtable(); /* Initialize page ext after all struct pages are initialized. */ page_ext_init(); diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug index 5ffe144c9794..7cceae923c05 100644 --- a/lib/Kconfig.debug +++ b/lib/Kconfig.debug @@ -653,6 +653,12 @@ config SCHED_STACK_END_CHECK data corruption or a sporadic crash at a later stage once the region is examined. The runtime overhead introduced is minimal. +config ARCH_HAS_DEBUG_VM_PGTABLE + bool + help + An architecture should select this when it can successfully + build and run DEBUG_VM_PGTABLE. + config DEBUG_VM bool "Debug VM" depends on DEBUG_KERNEL @@ -688,6 +694,22 @@ config DEBUG_VM_PGFLAGS If unsure, say N. +config DEBUG_VM_PGTABLE + bool "Debug arch page table for semantics compliance" + depends on MMU + depends on DEBUG_VM + depends on ARCH_HAS_DEBUG_VM_PGTABLE + default y + help + This option provides a debug method which can be used to test + architecture page table helper functions on various platforms in + verifying if they comply with expected generic MM semantics. This + will help architecture code in making sure that any changes or + new additions of these helpers still conform to expected + semantics of the generic MM. + + If unsure, say N. + config ARCH_HAS_DEBUG_VIRTUAL bool diff --git a/mm/Makefile b/mm/Makefile index 1937cc251883..eba423a26b19 100644 --- a/mm/Makefile +++ b/mm/Makefile @@ -86,6 +86,7 @@ obj-$(CONFIG_HWPOISON_INJECT) += hwpoison-inject.o obj-$(CONFIG_DEBUG_KMEMLEAK) += kmemleak.o obj-$(CONFIG_DEBUG_KMEMLEAK_TEST) += kmemleak-test.o obj-$(CONFIG_DEBUG_RODATA_TEST) += rodata_test.o +obj-$(CONFIG_DEBUG_VM_PGTABLE) += debug_vm_pgtable.o obj-$(CONFIG_PAGE_OWNER) += page_owner.o obj-$(CONFIG_CLEANCACHE) += cleancache.o obj-$(CONFIG_MEMORY_ISOLATION) += page_isolation.o diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c new file mode 100644 index 000000000000..0f37f32d15f1 --- /dev/null +++ b/mm/debug_vm_pgtable.c @@ -0,0 +1,388 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * This kernel test validates architecture page table helpers and + * accessors and helps in verifying their continued compliance with + * expected generic MM semantics. + * + * Copyright (C) 2019 ARM Ltd. 
+ *
+ * Author: Anshuman Khandual <anshuman.khandual@arm.com>
+ */
+#define pr_fmt(fmt) "debug_vm_pgtable: %s: " fmt, __func__
+
+#include <linux/gfp.h>
+#include <linux/highmem.h>
+#include <linux/hugetlb.h>
+#include <linux/kernel.h>
+#include <linux/kconfig.h>
+#include <linux/mm.h>
+#include <linux/mman.h>
+#include <linux/mm_types.h>
+#include <linux/module.h>
+#include <linux/pfn_t.h>
+#include <linux/printk.h>
+#include <linux/random.h>
+#include <linux/spinlock.h>
+#include <linux/swap.h>
+#include <linux/swapops.h>
+#include <linux/start_kernel.h>
+#include <linux/sched/mm.h>
+#include <asm/pgalloc.h>
+#include <asm/pgtable.h>
+
+/*
+ * Basic operations
+ *
+ * mkold(entry)			= An old and not a young entry
+ * mkyoung(entry)		= A young and not an old entry
+ * mkdirty(entry)		= A dirty and not a clean entry
+ * mkclean(entry)		= A clean and not a dirty entry
+ * mkwrite(entry)		= A write and not a write protected entry
+ * wrprotect(entry)		= A write protected and not a write entry
+ * pxx_bad(entry)		= A mapped and non-table entry
+ * pxx_same(entry1, entry2)	= Both entries hold the exact same value
+ */
+#define VMFLAGS	(VM_READ|VM_WRITE|VM_EXEC)
+
+/*
+ * On s390 platform, the lower 12 bits are used to identify given page table
+ * entry type and for other arch specific requirements. But these bits might
+ * affect the ability to clear entries with pxx_clear(). So while loading up
+ * the entries skip all lower 12 bits in order to accommodate s390 platform.
+ * It does not affect any other platform.
+ */
+#define RANDOM_ORVALUE	(0xfffffffffffff000UL)
+#define RANDOM_NZVALUE	(0xff)
+
+static void __init pte_basic_tests(unsigned long pfn, pgprot_t prot)
+{
+	pte_t pte = pfn_pte(pfn, prot);
+
+	WARN_ON(!pte_same(pte, pte));
+	WARN_ON(!pte_young(pte_mkyoung(pte)));
+	WARN_ON(!pte_dirty(pte_mkdirty(pte)));
+	WARN_ON(!pte_write(pte_mkwrite(pte)));
+	WARN_ON(pte_young(pte_mkold(pte)));
+	WARN_ON(pte_dirty(pte_mkclean(pte)));
+	WARN_ON(pte_write(pte_wrprotect(pte)));
+}
+
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+static void __init pmd_basic_tests(unsigned long pfn, pgprot_t prot)
+{
+	pmd_t pmd = pfn_pmd(pfn, prot);
+
+	WARN_ON(!pmd_same(pmd, pmd));
+	WARN_ON(!pmd_young(pmd_mkyoung(pmd)));
+	WARN_ON(!pmd_dirty(pmd_mkdirty(pmd)));
+	WARN_ON(!pmd_write(pmd_mkwrite(pmd)));
+	WARN_ON(pmd_young(pmd_mkold(pmd)));
+	WARN_ON(pmd_dirty(pmd_mkclean(pmd)));
+	WARN_ON(pmd_write(pmd_wrprotect(pmd)));
+	/*
+	 * A huge page does not point to next level page table
+	 * entry. Hence this must qualify as pmd_bad().
+	 */
+	WARN_ON(!pmd_bad(pmd_mkhuge(pmd)));
+}
+
+#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
+static void __init pud_basic_tests(unsigned long pfn, pgprot_t prot)
+{
+	pud_t pud = pfn_pud(pfn, prot);
+
+	WARN_ON(!pud_same(pud, pud));
+	WARN_ON(!pud_young(pud_mkyoung(pud)));
+	WARN_ON(!pud_write(pud_mkwrite(pud)));
+	WARN_ON(pud_write(pud_wrprotect(pud)));
+	WARN_ON(pud_young(pud_mkold(pud)));
+
+	if (mm_pmd_folded(mm) || __is_defined(ARCH_HAS_4LEVEL_HACK))
+		return;
+
+	/*
+	 * A huge page does not point to next level page table
+	 * entry. Hence this must qualify as pud_bad().
+	 */
+	WARN_ON(!pud_bad(pud_mkhuge(pud)));
+}
+#else
+static void __init pud_basic_tests(unsigned long pfn, pgprot_t prot) { }
+#endif
+#else
+static void __init pmd_basic_tests(unsigned long pfn, pgprot_t prot) { }
+static void __init pud_basic_tests(unsigned long pfn, pgprot_t prot) { }
+#endif
+
+static void __init p4d_basic_tests(unsigned long pfn, pgprot_t prot)
+{
+	p4d_t p4d;
+
+	memset(&p4d, RANDOM_NZVALUE, sizeof(p4d_t));
+	WARN_ON(!p4d_same(p4d, p4d));
+}
+
+static void __init pgd_basic_tests(unsigned long pfn, pgprot_t prot)
+{
+	pgd_t pgd;
+
+	memset(&pgd, RANDOM_NZVALUE, sizeof(pgd_t));
+	WARN_ON(!pgd_same(pgd, pgd));
+}
+
+#ifndef __ARCH_HAS_4LEVEL_HACK
+static void __init pud_clear_tests(struct mm_struct *mm, pud_t *pudp)
+{
+	pud_t pud = READ_ONCE(*pudp);
+
+	if (mm_pmd_folded(mm))
+		return;
+
+	pud = __pud(pud_val(pud) | RANDOM_ORVALUE);
+	WRITE_ONCE(*pudp, pud);
+	pud_clear(pudp);
+	pud = READ_ONCE(*pudp);
+	WARN_ON(!pud_none(pud));
+}
+
+static void __init pud_populate_tests(struct mm_struct *mm, pud_t *pudp,
+				      pmd_t *pmdp)
+{
+	pud_t pud;
+
+	if (mm_pmd_folded(mm))
+		return;
+	/*
+	 * This entry points to next level page table page.
+	 * Hence this must not qualify as pud_bad().
+	 */
+	pmd_clear(pmdp);
+	pud_clear(pudp);
+	pud_populate(mm, pudp, pmdp);
+	pud = READ_ONCE(*pudp);
+	WARN_ON(pud_bad(pud));
+}
+#else
+static void __init pud_clear_tests(struct mm_struct *mm, pud_t *pudp) { }
+static void __init pud_populate_tests(struct mm_struct *mm, pud_t *pudp,
+				      pmd_t *pmdp)
+{
+}
+#endif
+
+#ifndef __ARCH_HAS_5LEVEL_HACK
+static void __init p4d_clear_tests(struct mm_struct *mm, p4d_t *p4dp)
+{
+	p4d_t p4d = READ_ONCE(*p4dp);
+
+	if (mm_pud_folded(mm))
+		return;
+
+	p4d = __p4d(p4d_val(p4d) | RANDOM_ORVALUE);
+	WRITE_ONCE(*p4dp, p4d);
+	p4d_clear(p4dp);
+	p4d = READ_ONCE(*p4dp);
+	WARN_ON(!p4d_none(p4d));
+}
+
+static void __init p4d_populate_tests(struct mm_struct *mm, p4d_t *p4dp,
+				      pud_t *pudp)
+{
+	p4d_t p4d;
+
+	if (mm_pud_folded(mm))
+		return;
+
+	/*
+	 * This entry points to next level page table page.
+	 * Hence this must not qualify as p4d_bad().
+	 */
+	pud_clear(pudp);
+	p4d_clear(p4dp);
+	p4d_populate(mm, p4dp, pudp);
+	p4d = READ_ONCE(*p4dp);
+	WARN_ON(p4d_bad(p4d));
+}
+
+static void __init pgd_clear_tests(struct mm_struct *mm, pgd_t *pgdp)
+{
+	pgd_t pgd = READ_ONCE(*pgdp);
+
+	if (mm_p4d_folded(mm))
+		return;
+
+	pgd = __pgd(pgd_val(pgd) | RANDOM_ORVALUE);
+	WRITE_ONCE(*pgdp, pgd);
+	pgd_clear(pgdp);
+	pgd = READ_ONCE(*pgdp);
+	WARN_ON(!pgd_none(pgd));
+}
+
+static void __init pgd_populate_tests(struct mm_struct *mm, pgd_t *pgdp,
+				      p4d_t *p4dp)
+{
+	pgd_t pgd;
+
+	if (mm_p4d_folded(mm))
+		return;
+
+	/*
+	 * This entry points to next level page table page.
+	 * Hence this must not qualify as pgd_bad().
+	 */
+	p4d_clear(p4dp);
+	pgd_clear(pgdp);
+	pgd_populate(mm, pgdp, p4dp);
+	pgd = READ_ONCE(*pgdp);
+	WARN_ON(pgd_bad(pgd));
+}
+#else
+static void __init p4d_clear_tests(struct mm_struct *mm, p4d_t *p4dp) { }
+static void __init pgd_clear_tests(struct mm_struct *mm, pgd_t *pgdp) { }
+static void __init p4d_populate_tests(struct mm_struct *mm, p4d_t *p4dp,
+				      pud_t *pudp)
+{
+}
+static void __init pgd_populate_tests(struct mm_struct *mm, pgd_t *pgdp,
+				      p4d_t *p4dp)
+{
+}
+#endif
+
+static void __init pte_clear_tests(struct mm_struct *mm, pte_t *ptep)
+{
+	pte_t pte = READ_ONCE(*ptep);
+
+	pte = __pte(pte_val(pte) | RANDOM_ORVALUE);
+	WRITE_ONCE(*ptep, pte);
+	pte_clear(mm, 0, ptep);
+	pte = READ_ONCE(*ptep);
+	WARN_ON(!pte_none(pte));
+}
+
+static void __init pmd_clear_tests(struct mm_struct *mm, pmd_t *pmdp)
+{
+	pmd_t pmd = READ_ONCE(*pmdp);
+
+	pmd = __pmd(pmd_val(pmd) | RANDOM_ORVALUE);
+	WRITE_ONCE(*pmdp, pmd);
+	pmd_clear(pmdp);
+	pmd = READ_ONCE(*pmdp);
+	WARN_ON(!pmd_none(pmd));
+}
+
+static void __init pmd_populate_tests(struct mm_struct *mm, pmd_t *pmdp,
+				      pgtable_t pgtable)
+{
+	pmd_t pmd;
+
+	/*
+	 * This entry points to next level page table page.
+	 * Hence this must not qualify as pmd_bad().
+	 */
+	pmd_clear(pmdp);
+	pmd_populate(mm, pmdp, pgtable);
+	pmd = READ_ONCE(*pmdp);
+	WARN_ON(pmd_bad(pmd));
+}
+
+static unsigned long __init get_random_vaddr(void)
+{
+	unsigned long random_vaddr, random_pages, total_user_pages;
+
+	total_user_pages = (TASK_SIZE - FIRST_USER_ADDRESS) / PAGE_SIZE;
+
+	random_pages = get_random_long() % total_user_pages;
+	random_vaddr = FIRST_USER_ADDRESS + random_pages * PAGE_SIZE;
+
+	return random_vaddr;
+}
+
+void __init debug_vm_pgtable(void)
+{
+	struct mm_struct *mm;
+	pgd_t *pgdp;
+	p4d_t *p4dp, *saved_p4dp;
+	pud_t *pudp, *saved_pudp;
+	pmd_t *pmdp, *saved_pmdp, pmd;
+	pte_t *ptep;
+	pgtable_t saved_ptep;
+	pgprot_t prot;
+	phys_addr_t paddr;
+	unsigned long vaddr, pte_aligned, pmd_aligned;
+	unsigned long pud_aligned, p4d_aligned, pgd_aligned;
+
+	pr_info("Validating architecture page table helpers\n");
+	prot = vm_get_page_prot(VMFLAGS);
+	vaddr = get_random_vaddr();
+	mm = mm_alloc();
+	if (!mm) {
+		pr_err("mm_struct allocation failed\n");
+		return;
+	}
+
+	/*
+	 * PFN for mapping at PTE level is determined from a standard kernel
+	 * text symbol. But pfns for higher page table levels are derived by
+	 * masking lower bits of this real pfn. These derived pfns might not
+	 * exist on the platform but that does not really matter as pfn_pxx()
+	 * helpers will still create appropriate entries for the test. This
+	 * helps avoid large memory block allocations to be used for mapping
+	 * at higher page table levels.
+	 */
+	paddr = __pa(&start_kernel);
+
+	pte_aligned = (paddr & PAGE_MASK) >> PAGE_SHIFT;
+	pmd_aligned = (paddr & PMD_MASK) >> PAGE_SHIFT;
+	pud_aligned = (paddr & PUD_MASK) >> PAGE_SHIFT;
+	p4d_aligned = (paddr & P4D_MASK) >> PAGE_SHIFT;
+	pgd_aligned = (paddr & PGDIR_MASK) >> PAGE_SHIFT;
+	WARN_ON(!pfn_valid(pte_aligned));
+
+	pgdp = pgd_offset(mm, vaddr);
+	p4dp = p4d_alloc(mm, pgdp, vaddr);
+	pudp = pud_alloc(mm, p4dp, vaddr);
+	pmdp = pmd_alloc(mm, pudp, vaddr);
+	ptep = pte_alloc_map(mm, pmdp, vaddr);
+
+	/*
+	 * Save all the page table page addresses as the page table
+	 * entries will be used for testing with random or garbage
+	 * values. These saved addresses will be used for freeing
+	 * page table pages.
+	 */
+	pmd = READ_ONCE(*pmdp);
+	saved_p4dp = p4d_offset(pgdp, 0UL);
+	saved_pudp = pud_offset(p4dp, 0UL);
+	saved_pmdp = pmd_offset(pudp, 0UL);
+	saved_ptep = pmd_pgtable(pmd);
+
+	pte_basic_tests(pte_aligned, prot);
+	pmd_basic_tests(pmd_aligned, prot);
+	pud_basic_tests(pud_aligned, prot);
+	p4d_basic_tests(p4d_aligned, prot);
+	pgd_basic_tests(pgd_aligned, prot);
+
+	pte_clear_tests(mm, ptep);
+	pmd_clear_tests(mm, pmdp);
+	pud_clear_tests(mm, pudp);
+	p4d_clear_tests(mm, p4dp);
+	pgd_clear_tests(mm, pgdp);
+
+	pte_unmap(ptep);
+
+	pmd_populate_tests(mm, pmdp, saved_ptep);
+	pud_populate_tests(mm, pudp, saved_pmdp);
+	p4d_populate_tests(mm, p4dp, saved_pudp);
+	pgd_populate_tests(mm, pgdp, saved_p4dp);
+
+	p4d_free(mm, saved_p4dp);
+	pud_free(mm, saved_pudp);
+	pmd_free(mm, saved_pmdp);
+	pte_free(mm, saved_ptep);
+
+	mm_dec_nr_puds(mm);
+	mm_dec_nr_pmds(mm);
+	mm_dec_nr_ptes(mm);
+	mmdrop(mm);
+}
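
For anyone who wants to try the test on an architecture that is not wired up yet, the subscription pattern is simply the arm64/x86/arc/ppc32 Kconfig hunks above. A minimal sketch follows; the architecture entry is a placeholder for illustration only and is not part of this series:

    config <YOUR_ARCH>
            select ARCH_HAS_DEBUG_VM_PGTABLE

together with a .config fragment along the lines of:

    CONFIG_DEBUG_VM=y
    CONFIG_DEBUG_VM_PGTABLE=y

Because DEBUG_VM_PGTABLE defaults to y once ARCH_HAS_DEBUG_VM_PGTABLE is selected and DEBUG_VM is enabled, no further wiring should be needed: the test runs once during boot right after page_alloc_init_late(), announces itself with the "Validating architecture page table helpers" pr_info() line, and any helper that deviates from the expected semantics surfaces as a WARN_ON() backtrace in the boot log.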