Message ID: 20200318220634.32100-2-mike.kravetz@oracle.com (mailing list archive)
State: New, archived
Series: Clean up hugetlb boot command line processing
On Wed, Mar 18, 2020 at 03:06:31PM -0700, Mike Kravetz wrote:
> The architecture independent routine hugetlb_default_setup sets up
> the default huge pages size. It has no way to verify if the passed
> value is valid, so it accepts it and attempts to validate at a later
> time. This requires undocumented cooperation between the arch specific
> and arch independent code.
>
> For architectures that support more than one huge page size, provide
> a routine arch_hugetlb_valid_size to validate a huge page size.
> hugetlb_default_setup can use this to validate passed values.
>
> arch_hugetlb_valid_size will also be used in a subsequent patch to
> move processing of the "hugepagesz=" in arch specific code to a common
> routine in arch independent code.
>
> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
> ---
>  arch/arm64/include/asm/hugetlb.h   |  2 ++
>  arch/arm64/mm/hugetlbpage.c        | 19 ++++++++++++++-----
>  arch/powerpc/include/asm/hugetlb.h |  3 +++
>  arch/powerpc/mm/hugetlbpage.c      | 20 +++++++++++++-------
>  arch/riscv/include/asm/hugetlb.h   |  3 +++
>  arch/riscv/mm/hugetlbpage.c        | 28 ++++++++++++++++++----------
>  arch/s390/include/asm/hugetlb.h    |  3 +++
>  arch/s390/mm/hugetlbpage.c         | 18 +++++++++++++-----
>  arch/sparc/include/asm/hugetlb.h   |  3 +++
>  arch/sparc/mm/init_64.c            | 23 ++++++++++++++++-------
>  arch/x86/include/asm/hugetlb.h     |  3 +++
>  arch/x86/mm/hugetlbpage.c          | 21 +++++++++++++++------
>  include/linux/hugetlb.h            |  7 +++++++
>  mm/hugetlb.c                       | 16 +++++++++++++---
>  14 files changed, 126 insertions(+), 43 deletions(-)
>
> diff --git a/arch/arm64/include/asm/hugetlb.h b/arch/arm64/include/asm/hugetlb.h
> index 2eb6c234d594..3248f35213ee 100644
> --- a/arch/arm64/include/asm/hugetlb.h
> +++ b/arch/arm64/include/asm/hugetlb.h
> @@ -59,6 +59,8 @@ extern void huge_pte_clear(struct mm_struct *mm, unsigned long addr,
>  extern void set_huge_swap_pte_at(struct mm_struct *mm, unsigned long addr,
>  				 pte_t *ptep, pte_t pte, unsigned long sz);
>  #define set_huge_swap_pte_at set_huge_swap_pte_at
> +extern bool __init arch_hugetlb_valid_size(unsigned long long size);
> +#define arch_hugetlb_valid_size arch_hugetlb_valid_size
>
>  #include <asm-generic/hugetlb.h>
>
> diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
> index bbeb6a5a6ba6..da30127086d0 100644
> --- a/arch/arm64/mm/hugetlbpage.c
> +++ b/arch/arm64/mm/hugetlbpage.c
> @@ -462,23 +462,32 @@ static int __init hugetlbpage_init(void)
>  }
>  arch_initcall(hugetlbpage_init);
>
> -static __init int setup_hugepagesz(char *opt)
> +bool __init arch_hugetlb_valid_size(unsigned long long size)
>  {
> -	unsigned long ps = memparse(opt, &opt);
> -
> -	switch (ps) {
> +	switch (size) {
>  #ifdef CONFIG_ARM64_4K_PAGES
>  	case PUD_SIZE:
>  #endif
>  	case CONT_PMD_SIZE:
>  	case PMD_SIZE:
>  	case CONT_PTE_SIZE:
> +		return true;
> +	}
> +
> +	return false;
> +}
> +
> +static __init int setup_hugepagesz(char *opt)
> +{
> +	unsigned long long ps = memparse(opt, &opt);
> +
> +	if arch_hugetlb_valid_size(ps)) {

Please compile your changes if you're touching multiple architectures. You
can get cross-compiler binaries from:

  https://mirrors.edge.kernel.org/pub/tools/crosstool/
  https://toolchains.bootlin.com/

Will
Hi Mike,

The series looks like a great idea to me.  One nit on the x86 bits,
though...

> diff --git a/arch/x86/mm/hugetlbpage.c b/arch/x86/mm/hugetlbpage.c
> index 5bfd5aef5378..51e6208fdeec 100644
> --- a/arch/x86/mm/hugetlbpage.c
> +++ b/arch/x86/mm/hugetlbpage.c
> @@ -181,16 +181,25 @@ hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
>  #endif /* CONFIG_HUGETLB_PAGE */
>
>  #ifdef CONFIG_X86_64
> +bool __init arch_hugetlb_valid_size(unsigned long long size)
> +{
> +	if (size == PMD_SIZE)
> +		return true;
> +	else if (size == PUD_SIZE && boot_cpu_has(X86_FEATURE_GBPAGES))
> +		return true;
> +	else
> +		return false;
> +}

I'm pretty sure it's possible to have a system without 2M/PMD page
support.  We even have a handy-dandy comment about it in
arch/x86/include/asm/required-features.h:

	#ifdef CONFIG_X86_64
	#ifdef CONFIG_PARAVIRT
	/* Paravirtualized systems may not have PSE or PGE available */
	#define NEED_PSE	0
	...

I *think* you need an X86_FEATURE_PSE check here to be totally correct.

	if (size == PMD_SIZE && cpu_feature_enabled(X86_FEATURE_PSE))
		return true;

BTW, I prefer cpu_feature_enabled() to boot_cpu_has() because it
includes disabled-features checking.  I don't think any of it matters
for these specific features, but I generally prefer it on principle.
On 3/18/20 3:09 PM, Will Deacon wrote: > On Wed, Mar 18, 2020 at 03:06:31PM -0700, Mike Kravetz wrote: >> The architecture independent routine hugetlb_default_setup sets up >> the default huge pages size. It has no way to verify if the passed >> value is valid, so it accepts it and attempts to validate at a later >> time. This requires undocumented cooperation between the arch specific >> and arch independent code. >> >> For architectures that support more than one huge page size, provide >> a routine arch_hugetlb_valid_size to validate a huge page size. >> hugetlb_default_setup can use this to validate passed values. >> >> arch_hugetlb_valid_size will also be used in a subsequent patch to >> move processing of the "hugepagesz=" in arch specific code to a common >> routine in arch independent code. >> >> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> >> --- >> arch/arm64/include/asm/hugetlb.h | 2 ++ >> arch/arm64/mm/hugetlbpage.c | 19 ++++++++++++++----- >> arch/powerpc/include/asm/hugetlb.h | 3 +++ >> arch/powerpc/mm/hugetlbpage.c | 20 +++++++++++++------- >> arch/riscv/include/asm/hugetlb.h | 3 +++ >> arch/riscv/mm/hugetlbpage.c | 28 ++++++++++++++++++---------- >> arch/s390/include/asm/hugetlb.h | 3 +++ >> arch/s390/mm/hugetlbpage.c | 18 +++++++++++++----- >> arch/sparc/include/asm/hugetlb.h | 3 +++ >> arch/sparc/mm/init_64.c | 23 ++++++++++++++++------- >> arch/x86/include/asm/hugetlb.h | 3 +++ >> arch/x86/mm/hugetlbpage.c | 21 +++++++++++++++------ >> include/linux/hugetlb.h | 7 +++++++ >> mm/hugetlb.c | 16 +++++++++++++--- >> 14 files changed, 126 insertions(+), 43 deletions(-) >> >> diff --git a/arch/arm64/include/asm/hugetlb.h b/arch/arm64/include/asm/hugetlb.h >> index 2eb6c234d594..3248f35213ee 100644 >> --- a/arch/arm64/include/asm/hugetlb.h >> +++ b/arch/arm64/include/asm/hugetlb.h <snip> >> + >> +static __init int setup_hugepagesz(char *opt) >> +{ >> + unsigned long long ps = memparse(opt, &opt); >> + >> + if arch_hugetlb_valid_size(ps)) { > > 
> Please compile your changes if you're touching multiple architectures. You
> can get cross-compiler binaries from:

My apologies.  I only cross compiled the result of the series on each
architecture.  The above code is obviously bad.
On 3/18/20 3:15 PM, Dave Hansen wrote:
> Hi Mike,
>
> The series looks like a great idea to me.  One nit on the x86 bits,
> though...
>
>> diff --git a/arch/x86/mm/hugetlbpage.c b/arch/x86/mm/hugetlbpage.c
>> index 5bfd5aef5378..51e6208fdeec 100644
>> --- a/arch/x86/mm/hugetlbpage.c
>> +++ b/arch/x86/mm/hugetlbpage.c
>> @@ -181,16 +181,25 @@ hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
>>  #endif /* CONFIG_HUGETLB_PAGE */
>>
>>  #ifdef CONFIG_X86_64
>> +bool __init arch_hugetlb_valid_size(unsigned long long size)
>> +{
>> +	if (size == PMD_SIZE)
>> +		return true;
>> +	else if (size == PUD_SIZE && boot_cpu_has(X86_FEATURE_GBPAGES))
>> +		return true;
>> +	else
>> +		return false;
>> +}
>
> I'm pretty sure it's possible to have a system without 2M/PMD page
> support.  We even have a handy-dandy comment about it in
> arch/x86/include/asm/required-features.h:
>
> 	#ifdef CONFIG_X86_64
> 	#ifdef CONFIG_PARAVIRT
> 	/* Paravirtualized systems may not have PSE or PGE available */
> 	#define NEED_PSE	0
> 	...
>
> I *think* you need an X86_FEATURE_PSE check here to be totally correct.
>
> 	if (size == PMD_SIZE && cpu_feature_enabled(X86_FEATURE_PSE))
> 		return true;
>
> BTW, I prefer cpu_feature_enabled() to boot_cpu_has() because it
> includes disabled-features checking.  I don't think any of it matters
> for these specific features, but I generally prefer it on principle.

Sounds good.  I'll incorporate those changes into a v2, unless someone
else has a different opinion.

BTW, this patch should not really change the way the code works today.
It is mostly a movement of code.  Unless I am missing something, the
existing code will always allow setup of PMD_SIZE hugetlb pages.
On 3/18/20 3:52 PM, Mike Kravetz wrote:
> Sounds good.  I'll incorporate those changes into a v2, unless someone
> else has a different opinion.
>
> BTW, this patch should not really change the way the code works today.
> It is mostly a movement of code.  Unless I am missing something, the
> existing code will always allow setup of PMD_SIZE hugetlb pages.

Hah, I totally skipped over the old code in the diff.

It looks like we'll disable hugetlbfs *entirely* if PSE isn't supported.
I think this is actually wrong, but nobody ever noticed.  I think you'd
have to be running as a guest under a hypervisor that's lying about PSE
not being supported *and* care about 1GB pages.  Nobody does that.
Hi Mike, I love your patch! Yet something to improve: [auto build test ERROR on next-20200318] [also build test ERROR on v5.6-rc6] [cannot apply to arm64/for-next/core powerpc/next sparc/master linus/master sparc-next/master v5.6-rc6 v5.6-rc5 v5.6-rc4] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system. BTW, we also suggest to use '--base' option to specify the base tree in git format-patch, please see https://stackoverflow.com/a/37406982] url: https://github.com/0day-ci/linux/commits/Mike-Kravetz/Clean-up-hugetlb-boot-command-line-processing/20200319-060943 base: 47780d7892b77e922bbe19b5dea99cde06b2f0e5 config: riscv-allyesconfig (attached as .config) compiler: riscv64-linux-gcc (GCC) 9.2.0 reproduce: wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # save the attached .config to linux build tree GCC_VERSION=9.2.0 make.cross ARCH=riscv If you fix the issue, kindly add following tag Reported-by: kbuild test robot <lkp@intel.com> All error/warnings (new ones prefixed by >>): arch/riscv/mm/hugetlbpage.c: In function 'arch_hugetlb_valid_size': >> arch/riscv/mm/hugetlbpage.c:19:39: error: 'ps' undeclared (first use in this function) 19 | else if (IS_ENABLED(CONFIG_64BIT) && ps == PUD_SIZE) | ^~ arch/riscv/mm/hugetlbpage.c:19:39: note: each undeclared identifier is reported only once for each function it appears in >> arch/riscv/mm/hugetlbpage.c:20:3: error: 'retrurn' undeclared (first use in this function) 20 | retrurn true; | ^~~~~~~ >> arch/riscv/mm/hugetlbpage.c:20:10: error: expected ';' before 'true' 20 | retrurn true; | ^~~~~ | ; In file included from include/linux/printk.h:7, from include/linux/kernel.h:15, from include/asm-generic/bug.h:19, from arch/riscv/include/asm/bug.h:75, from include/linux/bug.h:5, from arch/riscv/include/asm/cmpxchg.h:9, from arch/riscv/include/asm/atomic.h:19, from include/linux/atomic.h:7, from 
include/linux/mm_types_task.h:13, from include/linux/mm_types.h:5, from include/linux/hugetlb.h:5, from arch/riscv/mm/hugetlbpage.c:2: arch/riscv/mm/hugetlbpage.c: In function 'setup_hugepagesz': include/linux/kern_levels.h:5:18: warning: format '%lu' expects argument of type 'long unsigned int', but argument 2 has type 'long long unsigned int' [-Wformat=] 5 | #define KERN_SOH "\001" /* ASCII Start Of Header */ | ^~~~~~ include/linux/kern_levels.h:11:18: note: in expansion of macro 'KERN_SOH' 11 | #define KERN_ERR KERN_SOH "3" /* error conditions */ | ^~~~~~~~ include/linux/printk.h:304:9: note: in expansion of macro 'KERN_ERR' 304 | printk(KERN_ERR pr_fmt(fmt), ##__VA_ARGS__) | ^~~~~~~~ arch/riscv/mm/hugetlbpage.c:35:2: note: in expansion of macro 'pr_err' 35 | pr_err("hugepagesz: Unsupported page size %lu M\n", ps >> 20); | ^~~~~~ arch/riscv/mm/hugetlbpage.c:35:46: note: format string is defined here 35 | pr_err("hugepagesz: Unsupported page size %lu M\n", ps >> 20); | ~~^ | | | long unsigned int | %llu arch/riscv/mm/hugetlbpage.c: In function 'arch_hugetlb_valid_size': >> arch/riscv/mm/hugetlbpage.c:23:1: warning: control reaches end of non-void function [-Wreturn-type] 23 | } | ^ vim +/ps +19 arch/riscv/mm/hugetlbpage.c 14 15 bool __init arch_hugetlb_valid_size(unsigned long long size) 16 { 17 if (size == HPAGE_SIZE) 18 return true; > 19 else if (IS_ENABLED(CONFIG_64BIT) && ps == PUD_SIZE) > 20 retrurn true; 21 else 22 return false; > 23 } 24 25 static __init int setup_hugepagesz(char *opt) 26 { 27 unsigned long long ps = memparse(opt, &opt); 28 29 if (arch_hugetlb_valid_size(ps)) { 30 hugetlb_add_hstate(ilog2(ps) - PAGE_SHIFT); 31 return 1; 32 } 33 34 hugetlb_bad_size(); > 35 pr_err("hugepagesz: Unsupported page size %lu M\n", ps >> 20); 36 return 0; 37 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
Hi Mike,
I love your patch! Yet something to improve:
[auto build test ERROR on next-20200318]
[also build test ERROR on v5.6-rc6]
[cannot apply to arm64/for-next/core powerpc/next sparc/master linus/master sparc-next/master v5.6-rc6 v5.6-rc5 v5.6-rc4]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]
url: https://github.com/0day-ci/linux/commits/Mike-Kravetz/Clean-up-hugetlb-boot-command-line-processing/20200319-060943
base: 47780d7892b77e922bbe19b5dea99cde06b2f0e5
config: i386-defconfig (attached as .config)
compiler: gcc-7 (Debian 7.5.0-5) 7.5.0
reproduce:
# save the attached .config to linux build tree
make ARCH=i386
If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <lkp@intel.com>
All errors (new ones prefixed by >>):
ld: mm/hugetlb.o: in function `default_hugepagesz_setup':
>> hugetlb.c:(.init.text+0x16): undefined reference to `arch_hugetlb_valid_size'
Le 18/03/2020 à 23:06, Mike Kravetz a écrit : > The architecture independent routine hugetlb_default_setup sets up > the default huge pages size. It has no way to verify if the passed > value is valid, so it accepts it and attempts to validate at a later > time. This requires undocumented cooperation between the arch specific > and arch independent code. > > For architectures that support more than one huge page size, provide > a routine arch_hugetlb_valid_size to validate a huge page size. > hugetlb_default_setup can use this to validate passed values. > > arch_hugetlb_valid_size will also be used in a subsequent patch to > move processing of the "hugepagesz=" in arch specific code to a common > routine in arch independent code. > > Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> > --- > arch/arm64/include/asm/hugetlb.h | 2 ++ > arch/arm64/mm/hugetlbpage.c | 19 ++++++++++++++----- > arch/powerpc/include/asm/hugetlb.h | 3 +++ > arch/powerpc/mm/hugetlbpage.c | 20 +++++++++++++------- > arch/riscv/include/asm/hugetlb.h | 3 +++ > arch/riscv/mm/hugetlbpage.c | 28 ++++++++++++++++++---------- > arch/s390/include/asm/hugetlb.h | 3 +++ > arch/s390/mm/hugetlbpage.c | 18 +++++++++++++----- > arch/sparc/include/asm/hugetlb.h | 3 +++ > arch/sparc/mm/init_64.c | 23 ++++++++++++++++------- > arch/x86/include/asm/hugetlb.h | 3 +++ > arch/x86/mm/hugetlbpage.c | 21 +++++++++++++++------ > include/linux/hugetlb.h | 7 +++++++ > mm/hugetlb.c | 16 +++++++++++++--- > 14 files changed, 126 insertions(+), 43 deletions(-) > [snip] > diff --git a/arch/powerpc/include/asm/hugetlb.h b/arch/powerpc/include/asm/hugetlb.h > index bd6504c28c2f..3b5939016955 100644 > --- a/arch/powerpc/include/asm/hugetlb.h > +++ b/arch/powerpc/include/asm/hugetlb.h > @@ -64,6 +64,9 @@ static inline void arch_clear_hugepage_flags(struct page *page) > { > } > > +#define arch_hugetlb_valid_size arch_hugetlb_valid_size > +extern bool __init arch_hugetlb_valid_size(unsigned long long size); Don't add 'extern' 
keyword, it is irrelevant for a function declaration. checkpatch --strict doesn't like it either (https://openpower.xyz/job/snowpatch/job/snowpatch-linux-checkpatch/12318//artifact/linux/checkpatch.log) > + > #include <asm-generic/hugetlb.h> > > #else /* ! CONFIG_HUGETLB_PAGE */ > diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c > index 33b3461d91e8..b78f660252f3 100644 > --- a/arch/powerpc/mm/hugetlbpage.c > +++ b/arch/powerpc/mm/hugetlbpage.c > @@ -558,7 +558,7 @@ unsigned long vma_mmu_pagesize(struct vm_area_struct *vma) > return vma_kernel_pagesize(vma); > } > > -static int __init add_huge_page_size(unsigned long long size) > +bool __init arch_hugetlb_valid_size(unsigned long long size) > { > int shift = __ffs(size); > int mmu_psize; > @@ -566,20 +566,26 @@ static int __init add_huge_page_size(unsigned long long size) > /* Check that it is a page size supported by the hardware and > * that it fits within pagetable and slice limits. */ > if (size <= PAGE_SIZE || !is_power_of_2(size)) > - return -EINVAL; > + return false; > > mmu_psize = check_and_get_huge_psize(shift); > if (mmu_psize < 0) > - return -EINVAL; > + return false; > > BUG_ON(mmu_psize_defs[mmu_psize].shift != shift); > > - /* Return if huge page size has already been setup */ > - if (size_to_hstate(size)) > - return 0; > + return true; > +} > > - hugetlb_add_hstate(shift - PAGE_SHIFT); > +static int __init add_huge_page_size(unsigned long long size) > +{ > + int shift = __ffs(size); > + > + if (!arch_hugetlb_valid_size(size)) > + return -EINVAL; > > + if (!size_to_hstate(size)) > + hugetlb_add_hstate(shift - PAGE_SHIFT); > return 0; > } > [snip] > diff --git a/arch/x86/mm/hugetlbpage.c b/arch/x86/mm/hugetlbpage.c > index 5bfd5aef5378..51e6208fdeec 100644 > --- a/arch/x86/mm/hugetlbpage.c > +++ b/arch/x86/mm/hugetlbpage.c > @@ -181,16 +181,25 @@ hugetlb_get_unmapped_area(struct file *file, unsigned long addr, > #endif /* CONFIG_HUGETLB_PAGE */ > > #ifdef CONFIG_X86_64 > 
+bool __init arch_hugetlb_valid_size(unsigned long long size) > +{ > + if (size == PMD_SIZE) > + return true; > + else if (size == PUD_SIZE && boot_cpu_has(X86_FEATURE_GBPAGES)) > + return true; > + else > + return false; > +} > + > static __init int setup_hugepagesz(char *opt) > { > - unsigned long ps = memparse(opt, &opt); > - if (ps == PMD_SIZE) { > - hugetlb_add_hstate(PMD_SHIFT - PAGE_SHIFT); > - } else if (ps == PUD_SIZE && boot_cpu_has(X86_FEATURE_GBPAGES)) { > - hugetlb_add_hstate(PUD_SHIFT - PAGE_SHIFT); > + unsigned long long ps = memparse(opt, &opt); > + > + if (arch_hugetlb_valid_size(ps)) { > + hugetlb_add_hstate(ilog2(ps) - PAGE_SHIFT); > } else { > hugetlb_bad_size(); > - printk(KERN_ERR "hugepagesz: Unsupported page size %lu M\n", > + printk(KERN_ERR "hugepagesz: Unsupported page size %llu M\n", > ps >> 20); Nowadays we use pr_err() instead of printk. It would also likely allow you to have everything fit on a single line. > return 0; > } > diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h > index b831e9fa1a26..33343eb980d0 100644 > --- a/include/linux/hugetlb.h > +++ b/include/linux/hugetlb.h > @@ -678,6 +678,13 @@ static inline spinlock_t *huge_pte_lockptr(struct hstate *h, > return &mm->page_table_lock; > } > > +#ifndef arch_hugetlb_valid_size > +static inline bool arch_hugetlb_valid_size(unsigned long long size) > +{ > + return (size == HPAGE_SIZE); Not sure the ( ) are necessary. 
> +} > +#endif > + > #ifndef hugepages_supported > /* > * Some platform decide whether they support huge pages at boot > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > index d8ebd876871d..2f99359b93af 100644 > --- a/mm/hugetlb.c > +++ b/mm/hugetlb.c > @@ -3224,12 +3224,22 @@ static int __init hugetlb_nrpages_setup(char *s) > } > __setup("hugepages=", hugetlb_nrpages_setup); > > -static int __init hugetlb_default_setup(char *s) > +static int __init default_hugepagesz_setup(char *s) > { > - default_hstate_size = memparse(s, &s); > + unsigned long long size; Why unsigned long long ? default_hstate_size is long. I can't imagine 32 bits platforms having a hugepage with a 64 bits size. > + char *saved_s = s; > + > + size = memparse(s, &s); The updated s is not reused after that so you can pass NULL instead of &s and then you don't need the saved_s. > + > + if (!arch_hugetlb_valid_size(size)) { > + pr_err("HugeTLB: unsupported default_hugepagesz %s\n", saved_s); > + return 0; > + } > + > + default_hstate_size = size; > return 1; > } > -__setup("default_hugepagesz=", hugetlb_default_setup); > +__setup("default_hugepagesz=", default_hugepagesz_setup); > > static unsigned int cpuset_mems_nr(unsigned int *array) > { > Christophe
On 3/19/20 12:00 AM, Christophe Leroy wrote: > > Le 18/03/2020 à 23:06, Mike Kravetz a écrit : >> The architecture independent routine hugetlb_default_setup sets up >> the default huge pages size. It has no way to verify if the passed >> value is valid, so it accepts it and attempts to validate at a later >> time. This requires undocumented cooperation between the arch specific >> and arch independent code. >> >> For architectures that support more than one huge page size, provide >> a routine arch_hugetlb_valid_size to validate a huge page size. >> hugetlb_default_setup can use this to validate passed values. >> >> arch_hugetlb_valid_size will also be used in a subsequent patch to >> move processing of the "hugepagesz=" in arch specific code to a common >> routine in arch independent code. >> >> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> >> --- >> arch/arm64/include/asm/hugetlb.h | 2 ++ >> arch/arm64/mm/hugetlbpage.c | 19 ++++++++++++++----- >> arch/powerpc/include/asm/hugetlb.h | 3 +++ >> arch/powerpc/mm/hugetlbpage.c | 20 +++++++++++++------- >> arch/riscv/include/asm/hugetlb.h | 3 +++ >> arch/riscv/mm/hugetlbpage.c | 28 ++++++++++++++++++---------- >> arch/s390/include/asm/hugetlb.h | 3 +++ >> arch/s390/mm/hugetlbpage.c | 18 +++++++++++++----- >> arch/sparc/include/asm/hugetlb.h | 3 +++ >> arch/sparc/mm/init_64.c | 23 ++++++++++++++++------- >> arch/x86/include/asm/hugetlb.h | 3 +++ >> arch/x86/mm/hugetlbpage.c | 21 +++++++++++++++------ >> include/linux/hugetlb.h | 7 +++++++ >> mm/hugetlb.c | 16 +++++++++++++--- >> 14 files changed, 126 insertions(+), 43 deletions(-) >> > > [snip] > >> diff --git a/arch/powerpc/include/asm/hugetlb.h b/arch/powerpc/include/asm/hugetlb.h >> index bd6504c28c2f..3b5939016955 100644 >> --- a/arch/powerpc/include/asm/hugetlb.h >> +++ b/arch/powerpc/include/asm/hugetlb.h >> @@ -64,6 +64,9 @@ static inline void arch_clear_hugepage_flags(struct page *page) >> { >> } >> +#define arch_hugetlb_valid_size arch_hugetlb_valid_size 
>> +extern bool __init arch_hugetlb_valid_size(unsigned long long size); > > Don't add 'extern' keyword, it is irrelevant for a function declaration. > Will do. One of the other arch's did this and I got into a bad habit. > checkpatch --strict doesn't like it either (https://openpower.xyz/job/snowpatch/job/snowpatch-linux-checkpatch/12318//artifact/linux/checkpatch.log) > >> + >> #include <asm-generic/hugetlb.h> >> #else /* ! CONFIG_HUGETLB_PAGE */ >> diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c >> index 33b3461d91e8..b78f660252f3 100644 >> --- a/arch/powerpc/mm/hugetlbpage.c >> +++ b/arch/powerpc/mm/hugetlbpage.c >> @@ -558,7 +558,7 @@ unsigned long vma_mmu_pagesize(struct vm_area_struct *vma) >> return vma_kernel_pagesize(vma); >> } >> -static int __init add_huge_page_size(unsigned long long size) >> +bool __init arch_hugetlb_valid_size(unsigned long long size) >> { >> int shift = __ffs(size); >> int mmu_psize; >> @@ -566,20 +566,26 @@ static int __init add_huge_page_size(unsigned long long size) >> /* Check that it is a page size supported by the hardware and >> * that it fits within pagetable and slice limits. 
*/ >> if (size <= PAGE_SIZE || !is_power_of_2(size)) >> - return -EINVAL; >> + return false; >> mmu_psize = check_and_get_huge_psize(shift); >> if (mmu_psize < 0) >> - return -EINVAL; >> + return false; >> BUG_ON(mmu_psize_defs[mmu_psize].shift != shift); >> - /* Return if huge page size has already been setup */ >> - if (size_to_hstate(size)) >> - return 0; >> + return true; >> +} >> - hugetlb_add_hstate(shift - PAGE_SHIFT); >> +static int __init add_huge_page_size(unsigned long long size) >> +{ >> + int shift = __ffs(size); >> + >> + if (!arch_hugetlb_valid_size(size)) >> + return -EINVAL; >> + if (!size_to_hstate(size)) >> + hugetlb_add_hstate(shift - PAGE_SHIFT); >> return 0; >> } >> > > [snip] > >> diff --git a/arch/x86/mm/hugetlbpage.c b/arch/x86/mm/hugetlbpage.c >> index 5bfd5aef5378..51e6208fdeec 100644 >> --- a/arch/x86/mm/hugetlbpage.c >> +++ b/arch/x86/mm/hugetlbpage.c >> @@ -181,16 +181,25 @@ hugetlb_get_unmapped_area(struct file *file, unsigned long addr, >> #endif /* CONFIG_HUGETLB_PAGE */ >> #ifdef CONFIG_X86_64 >> +bool __init arch_hugetlb_valid_size(unsigned long long size) >> +{ >> + if (size == PMD_SIZE) >> + return true; >> + else if (size == PUD_SIZE && boot_cpu_has(X86_FEATURE_GBPAGES)) >> + return true; >> + else >> + return false; >> +} >> + >> static __init int setup_hugepagesz(char *opt) >> { >> - unsigned long ps = memparse(opt, &opt); >> - if (ps == PMD_SIZE) { >> - hugetlb_add_hstate(PMD_SHIFT - PAGE_SHIFT); >> - } else if (ps == PUD_SIZE && boot_cpu_has(X86_FEATURE_GBPAGES)) { >> - hugetlb_add_hstate(PUD_SHIFT - PAGE_SHIFT); >> + unsigned long long ps = memparse(opt, &opt); >> + >> + if (arch_hugetlb_valid_size(ps)) { >> + hugetlb_add_hstate(ilog2(ps) - PAGE_SHIFT); >> } else { >> hugetlb_bad_size(); >> - printk(KERN_ERR "hugepagesz: Unsupported page size %lu M\n", >> + printk(KERN_ERR "hugepagesz: Unsupported page size %llu M\n", >> ps >> 20); > > Nowadays we use pr_err() instead of printk. 
> > It would also likely allow you to have everything fit on a single line. I may just leave this 'as is' as it will be removed in a later patch. >> return 0; >> } >> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h >> index b831e9fa1a26..33343eb980d0 100644 >> --- a/include/linux/hugetlb.h >> +++ b/include/linux/hugetlb.h >> @@ -678,6 +678,13 @@ static inline spinlock_t *huge_pte_lockptr(struct hstate *h, >> return &mm->page_table_lock; >> } >> +#ifndef arch_hugetlb_valid_size >> +static inline bool arch_hugetlb_valid_size(unsigned long long size) >> +{ >> + return (size == HPAGE_SIZE); > > Not sure the ( ) are necessary. Likely not. I will look at removing. > >> +} >> +#endif >> + >> #ifndef hugepages_supported >> /* >> * Some platform decide whether they support huge pages at boot >> diff --git a/mm/hugetlb.c b/mm/hugetlb.c >> index d8ebd876871d..2f99359b93af 100644 >> --- a/mm/hugetlb.c >> +++ b/mm/hugetlb.c >> @@ -3224,12 +3224,22 @@ static int __init hugetlb_nrpages_setup(char *s) >> } >> __setup("hugepages=", hugetlb_nrpages_setup); >> -static int __init hugetlb_default_setup(char *s) >> +static int __init default_hugepagesz_setup(char *s) >> { >> - default_hstate_size = memparse(s, &s); >> + unsigned long long size; > > Why unsigned long long ? > > default_hstate_size is long. Only because memparse is defined as unsigned long long. I actually took this from the existing powerpc hugetlb setup code. There are no compiler warnings/issues assigning unsigned long long to long on 64 bit builds. Thought there would be on 32 bit platformes. That was also the reason for making the argument to arch_hugetlb_valid_size be unsigned long long. So that it would match the type from memparse. I suppose making these unsigned long and casting would be OK based on the expected sizes. > > I can't imagine 32 bits platforms having a hugepage with a 64 bits size. 
> >> + char *saved_s = s; >> + >> + size = memparse(s, &s); > > The updated s is not reused after that so you can pass NULL instead of &s and then you don't need the saved_s. > Thanks for this and all the comments. I will incorporate in v2.
On Wed, Mar 18, 2020 at 3:07 PM Mike Kravetz <mike.kravetz@oracle.com> wrote: > > The architecture independent routine hugetlb_default_setup sets up > the default huge pages size. It has no way to verify if the passed > value is valid, so it accepts it and attempts to validate at a later > time. This requires undocumented cooperation between the arch specific > and arch independent code. > > For architectures that support more than one huge page size, provide > a routine arch_hugetlb_valid_size to validate a huge page size. > hugetlb_default_setup can use this to validate passed values. > > arch_hugetlb_valid_size will also be used in a subsequent patch to > move processing of the "hugepagesz=" in arch specific code to a common > routine in arch independent code. > > Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> > --- > arch/arm64/include/asm/hugetlb.h | 2 ++ > arch/arm64/mm/hugetlbpage.c | 19 ++++++++++++++----- > arch/powerpc/include/asm/hugetlb.h | 3 +++ > arch/powerpc/mm/hugetlbpage.c | 20 +++++++++++++------- > arch/riscv/include/asm/hugetlb.h | 3 +++ > arch/riscv/mm/hugetlbpage.c | 28 ++++++++++++++++++---------- > arch/s390/include/asm/hugetlb.h | 3 +++ > arch/s390/mm/hugetlbpage.c | 18 +++++++++++++----- > arch/sparc/include/asm/hugetlb.h | 3 +++ > arch/sparc/mm/init_64.c | 23 ++++++++++++++++------- > arch/x86/include/asm/hugetlb.h | 3 +++ > arch/x86/mm/hugetlbpage.c | 21 +++++++++++++++------ > include/linux/hugetlb.h | 7 +++++++ > mm/hugetlb.c | 16 +++++++++++++--- > 14 files changed, 126 insertions(+), 43 deletions(-) > With build fixes: Acked-by: Mina Almasry <almasrymina@google.com>
On 2020/3/19 6:52, Mike Kravetz wrote: > On 3/18/20 3:15 PM, Dave Hansen wrote: >> Hi Mike, >> >> The series looks like a great idea to me. One nit on the x86 bits, >> though... >> >>> diff --git a/arch/x86/mm/hugetlbpage.c b/arch/x86/mm/hugetlbpage.c >>> index 5bfd5aef5378..51e6208fdeec 100644 >>> --- a/arch/x86/mm/hugetlbpage.c >>> +++ b/arch/x86/mm/hugetlbpage.c >>> @@ -181,16 +181,25 @@ hugetlb_get_unmapped_area(struct file *file, unsigned long addr, >>> #endif /* CONFIG_HUGETLB_PAGE */ >>> >>> #ifdef CONFIG_X86_64 >>> +bool __init arch_hugetlb_valid_size(unsigned long long size) >>> +{ >>> + if (size == PMD_SIZE) >>> + return true; >>> + else if (size == PUD_SIZE && boot_cpu_has(X86_FEATURE_GBPAGES)) >>> + return true; >>> + else >>> + return false; >>> +} >> >> I'm pretty sure it's possible to have a system without 2M/PMD page >> support. We even have a handy-dandy comment about it in >> arch/x86/include/asm/required-features.h: >> >> #ifdef CONFIG_X86_64 >> #ifdef CONFIG_PARAVIRT >> /* Paravirtualized systems may not have PSE or PGE available */ >> #define NEED_PSE 0 >> ... >> >> I *think* you need an X86_FEATURE_PSE check here to be totally correct. >> >> if (size == PMD_SIZE && cpu_feature_enabled(X86_FEATURE_PSE)) >> return true; >> >> BTW, I prefer cpu_feature_enabled() to boot_cpu_has() because it >> includes disabled-features checking. I don't think any of it matters >> for these specific features, but I generally prefer it on principle. > > Sounds good. I'll incorporate those changes into a v2, unless someone > else with has a different opinion. > > BTW, this patch should not really change the way the code works today. > It is mostly a movement of code. Unless I am missing something, the > existing code will always allow setup of PMD_SIZE hugetlb pages. > Hi Mike, Inspired by Dave's opinion, it seems the x86-specific hugepages_supported should also need to use cpu_feature_enabled instead. Also, I wonder if the hugepages_supported is correct ? 
There're two arch specific hugepages_supported:

x86:
	#define hugepages_supported() boot_cpu_has(X86_FEATURE_PSE)
and s390:
	#define hugepages_supported() (MACHINE_HAS_EDAT1)

Is it possible that x86 has X86_FEATURE_GBPAGES but hasn't X86_FEATURE_PSE,
or s390 has MACHINE_HAS_EDAT2 but hasn't MACHINE_HAS_EDAT1?

---
Regards,
Longpeng(Mike)
On 25.03.20 03:58, Longpeng (Mike, Cloud Infrastructure Service Product Dept.) wrote:
[...]
> Hi Mike,
>
> Inspired by Dave's opinion, it seems the x86-specific hugepages_supported
> should also need to use cpu_feature_enabled instead.
>
> Also, I wonder if the hugepages_supported is correct ? There're two arch
> specific hugepages_supported:
> x86:
> #define hugepages_supported() boot_cpu_has(X86_FEATURE_PSE)
> and
> s390:
> #define hugepages_supported() (MACHINE_HAS_EDAT1)
>
> Is it possible that x86 has X86_FEATURE_GBPAGES but hasn't X86_FEATURE_PSE
> or s390 has MACHINE_HAS_EDAT2 but hasn't MACHINE_HAS_EDAT1 ?

The s390 architecture says that when EDAT-2 applies, the following function
is available in the DAT process:
- EDAT-1 applies.
[..]

So if the machine has EDAT-2 it also has EDAT-1.
On 3/18/20 4:36 PM, Dave Hansen wrote:
> On 3/18/20 3:52 PM, Mike Kravetz wrote:
>> Sounds good.  I'll incorporate those changes into a v2, unless someone
>> else has a different opinion.
>>
>> BTW, this patch should not really change the way the code works today.
>> It is mostly a movement of code.  Unless I am missing something, the
>> existing code will always allow setup of PMD_SIZE hugetlb pages.
>
> Hah, I totally skipped over the old code in the diff.
>
> It looks like we'll disable hugetlbfs *entirely* if PSE isn't supported.
> I think this is actually wrong, but nobody ever noticed.  I think you'd
> have to be running as a guest under a hypervisor that's lying about PSE
> not being supported *and* care about 1GB pages.  Nobody does that.

Actually, !PSE will disable hugetlbfs a little later in the boot process.
You are talking about hugepages_supported(), correct?

I think something really bad could happen in this situation (!PSE and
X86_FEATURE_GBPAGES).  When parsing 'hugepages=' for gigantic pages, we
immediately allocate from bootmem.  This happens before the later checks
in hugetlb_init for hugepages_supported().  So, I think we would end up
allocating GB pages from bootmem and not be able to use or free them. :(

Perhaps it would be best to check hugepages_supported() when parsing
hugetlb command line options.  If not enabled, throw an error.  This
will be much easier to do after moving all command line parsing to
arch-independent code.

Is that a sufficient way to address this concern?  I think it is a good
change in any case.
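[Editorial note] The hazard described above, and the proposed fix of validating during command-line parsing, can be sketched in plain userspace C. All names below (huge_pages_supported, bootmem_gigantic_pages, parse_hugepages) are stand-ins for illustration, not the kernel's actual symbols:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Both globals are stand-ins for kernel state, purely illustrative. */
static bool huge_pages_supported;	/* models hugepages_supported() */
static int bootmem_gigantic_pages;	/* models pages grabbed from bootmem */

/*
 * Models the proposed ordering: check support while parsing
 * "hugepages=" so we never touch bootmem for pages we cannot use.
 */
static int parse_hugepages(const char *arg)
{
	if (!huge_pages_supported)
		return 0;			/* reject at parse time */
	bootmem_gigantic_pages += atoi(arg);	/* allocate only if valid */
	return 1;
}
```

The point of the ordering is that the allocation happens strictly after the support check, so an unsupported configuration leaves bootmem untouched.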
On 3/26/20 2:56 PM, Mike Kravetz wrote:
> Perhaps it would be best to check hugepages_supported() when parsing
> hugetlb command line options.  If not enabled, throw an error.  This
> will be much easier to do after moving all command line parsing to
> arch independent code.

Yeah, that sounds sane.

> Is that a sufficient way to address this concern?  I think it is a good
> change in any case.

(Thanks to Kirill for pointing this out.)

So, it turns out the x86 huge page enumeration is totally buggered.
X86_FEATURE_PSE is actually meaningless on 64-bit (and 32-bit PAE).  All
CPUs architecturally support 2MB pages regardless of X86_FEATURE_PSE and
the state of CR4.PSE.

So, on x86_64 at least, hugepages_supported() should *always* return 1.
1GB page support can continue to be dependent on X86_FEATURE_GBPAGES.
diff --git a/arch/arm64/include/asm/hugetlb.h b/arch/arm64/include/asm/hugetlb.h
index 2eb6c234d594..3248f35213ee 100644
--- a/arch/arm64/include/asm/hugetlb.h
+++ b/arch/arm64/include/asm/hugetlb.h
@@ -59,6 +59,8 @@ extern void huge_pte_clear(struct mm_struct *mm, unsigned long addr,
 extern void set_huge_swap_pte_at(struct mm_struct *mm, unsigned long addr,
 				 pte_t *ptep, pte_t pte, unsigned long sz);
 #define set_huge_swap_pte_at set_huge_swap_pte_at
+extern bool __init arch_hugetlb_valid_size(unsigned long long size);
+#define arch_hugetlb_valid_size arch_hugetlb_valid_size
 
 #include <asm-generic/hugetlb.h>
 
diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index bbeb6a5a6ba6..da30127086d0 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -462,23 +462,32 @@ static int __init hugetlbpage_init(void)
 }
 arch_initcall(hugetlbpage_init);
 
-static __init int setup_hugepagesz(char *opt)
+bool __init arch_hugetlb_valid_size(unsigned long long size)
 {
-	unsigned long ps = memparse(opt, &opt);
-
-	switch (ps) {
+	switch (size) {
 #ifdef CONFIG_ARM64_4K_PAGES
 	case PUD_SIZE:
 #endif
 	case CONT_PMD_SIZE:
 	case PMD_SIZE:
 	case CONT_PTE_SIZE:
+		return true;
+	}
+
+	return false;
+}
+
+static __init int setup_hugepagesz(char *opt)
+{
+	unsigned long long ps = memparse(opt, &opt);
+
+	if (arch_hugetlb_valid_size(ps)) {
 		add_huge_page_size(ps);
 		return 1;
 	}
 
 	hugetlb_bad_size();
-	pr_err("hugepagesz: Unsupported page size %lu K\n", ps >> 10);
+	pr_err("hugepagesz: Unsupported page size %llu K\n", ps >> 10);
 	return 0;
 }
 __setup("hugepagesz=", setup_hugepagesz);
diff --git a/arch/powerpc/include/asm/hugetlb.h b/arch/powerpc/include/asm/hugetlb.h
index bd6504c28c2f..3b5939016955 100644
--- a/arch/powerpc/include/asm/hugetlb.h
+++ b/arch/powerpc/include/asm/hugetlb.h
@@ -64,6 +64,9 @@ static inline void arch_clear_hugepage_flags(struct page *page)
 {
 }
 
+#define arch_hugetlb_valid_size arch_hugetlb_valid_size
+extern bool __init arch_hugetlb_valid_size(unsigned long long size);
+
 #include <asm-generic/hugetlb.h>
 
 #else /* ! CONFIG_HUGETLB_PAGE */
diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index 33b3461d91e8..b78f660252f3 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -558,7 +558,7 @@ unsigned long vma_mmu_pagesize(struct vm_area_struct *vma)
 	return vma_kernel_pagesize(vma);
 }
 
-static int __init add_huge_page_size(unsigned long long size)
+bool __init arch_hugetlb_valid_size(unsigned long long size)
 {
 	int shift = __ffs(size);
 	int mmu_psize;
@@ -566,20 +566,26 @@ static int __init add_huge_page_size(unsigned long long size)
 	/* Check that it is a page size supported by the hardware and
 	 * that it fits within pagetable and slice limits. */
 	if (size <= PAGE_SIZE || !is_power_of_2(size))
-		return -EINVAL;
+		return false;
 
 	mmu_psize = check_and_get_huge_psize(shift);
 	if (mmu_psize < 0)
-		return -EINVAL;
+		return false;
 
 	BUG_ON(mmu_psize_defs[mmu_psize].shift != shift);
 
-	/* Return if huge page size has already been setup */
-	if (size_to_hstate(size))
-		return 0;
+	return true;
+}
 
-	hugetlb_add_hstate(shift - PAGE_SHIFT);
+static int __init add_huge_page_size(unsigned long long size)
+{
+	int shift = __ffs(size);
+
+	if (!arch_hugetlb_valid_size(size))
+		return -EINVAL;
+
+	if (!size_to_hstate(size))
+		hugetlb_add_hstate(shift - PAGE_SHIFT);
 
 	return 0;
 }
diff --git a/arch/riscv/include/asm/hugetlb.h b/arch/riscv/include/asm/hugetlb.h
index 728a5db66597..ebd6f5a35d26 100644
--- a/arch/riscv/include/asm/hugetlb.h
+++ b/arch/riscv/include/asm/hugetlb.h
@@ -5,6 +5,9 @@
 #include <asm-generic/hugetlb.h>
 #include <asm/page.h>
 
+extern bool __init arch_hugetlb_valid_size(unsigned long long size);
+#define arch_hugetlb_valid_size arch_hugetlb_valid_size
+
 static inline int is_hugepage_only_range(struct mm_struct *mm,
 					 unsigned long addr,
 					 unsigned long len) {
diff --git a/arch/riscv/mm/hugetlbpage.c b/arch/riscv/mm/hugetlbpage.c
index a6189ed36c5f..f1990882f16c 100644
--- a/arch/riscv/mm/hugetlbpage.c
+++ b/arch/riscv/mm/hugetlbpage.c
@@ -12,21 +12,29 @@ int pmd_huge(pmd_t pmd)
 	return pmd_leaf(pmd);
 }
 
+bool __init arch_hugetlb_valid_size(unsigned long long size)
+{
+	if (size == HPAGE_SIZE)
+		return true;
+	else if (IS_ENABLED(CONFIG_64BIT) && size == PUD_SIZE)
+		return true;
+	else
+		return false;
+}
+
 static __init int setup_hugepagesz(char *opt)
 {
-	unsigned long ps = memparse(opt, &opt);
+	unsigned long long ps = memparse(opt, &opt);
 
-	if (ps == HPAGE_SIZE) {
-		hugetlb_add_hstate(HPAGE_SHIFT - PAGE_SHIFT);
-	} else if (IS_ENABLED(CONFIG_64BIT) && ps == PUD_SIZE) {
-		hugetlb_add_hstate(PUD_SHIFT - PAGE_SHIFT);
-	} else {
-		hugetlb_bad_size();
-		pr_err("hugepagesz: Unsupported page size %lu M\n", ps >> 20);
-		return 0;
+	if (arch_hugetlb_valid_size(ps)) {
+		hugetlb_add_hstate(ilog2(ps) - PAGE_SHIFT);
+		return 1;
 	}
 
-	return 1;
+	hugetlb_bad_size();
+	pr_err("hugepagesz: Unsupported page size %llu M\n", ps >> 20);
+	return 0;
 }
 __setup("hugepagesz=", setup_hugepagesz);
diff --git a/arch/s390/include/asm/hugetlb.h b/arch/s390/include/asm/hugetlb.h
index de8f0bf5f238..a3dd457d2167 100644
--- a/arch/s390/include/asm/hugetlb.h
+++ b/arch/s390/include/asm/hugetlb.h
@@ -15,6 +15,9 @@
 #define hugetlb_free_pgd_range			free_pgd_range
 #define hugepages_supported()			(MACHINE_HAS_EDAT1)
 
+extern bool __init arch_hugetlb_valid_size(unsigned long long size);
+#define arch_hugetlb_valid_size arch_hugetlb_valid_size
+
 void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
 		     pte_t *ptep, pte_t pte);
 pte_t huge_ptep_get(pte_t *ptep);
diff --git a/arch/s390/mm/hugetlbpage.c b/arch/s390/mm/hugetlbpage.c
index 5674710a4841..d92e8c5c3e71 100644
--- a/arch/s390/mm/hugetlbpage.c
+++ b/arch/s390/mm/hugetlbpage.c
@@ -251,16 +251,24 @@ follow_huge_pud(struct mm_struct *mm, unsigned long address,
 	return pud_page(*pud) + ((address & ~PUD_MASK) >> PAGE_SHIFT);
 }
 
+bool __init arch_hugetlb_valid_size(unsigned long long size)
+{
+	if (MACHINE_HAS_EDAT1 && size == PMD_SIZE)
+		return true;
+	else if (MACHINE_HAS_EDAT2 && size == PUD_SIZE)
+		return true;
+	else
+		return false;
+}
+
 static __init int setup_hugepagesz(char *opt)
 {
-	unsigned long size;
+	unsigned long long size;
 	char *string = opt;
 
 	size = memparse(opt, &opt);
-	if (MACHINE_HAS_EDAT1 && size == PMD_SIZE) {
-		hugetlb_add_hstate(PMD_SHIFT - PAGE_SHIFT);
-	} else if (MACHINE_HAS_EDAT2 && size == PUD_SIZE) {
-		hugetlb_add_hstate(PUD_SHIFT - PAGE_SHIFT);
+	if (arch_hugetlb_valid_size(size)) {
+		hugetlb_add_hstate(ilog2(size) - PAGE_SHIFT);
 	} else {
 		hugetlb_bad_size();
 		pr_err("hugepagesz= specifies an unsupported page size %s\n",
diff --git a/arch/sparc/include/asm/hugetlb.h b/arch/sparc/include/asm/hugetlb.h
index 3963f80d1cb3..0d4f4adaffaf 100644
--- a/arch/sparc/include/asm/hugetlb.h
+++ b/arch/sparc/include/asm/hugetlb.h
@@ -10,6 +10,9 @@ struct pud_huge_patch_entry {
 	unsigned int insn;
 };
 extern struct pud_huge_patch_entry __pud_huge_patch, __pud_huge_patch_end;
+
+extern bool __init arch_hugetlb_valid_size(unsigned long long size);
+#define arch_hugetlb_valid_size arch_hugetlb_valid_size
 #endif
 
 #define __HAVE_ARCH_HUGE_SET_HUGE_PTE_AT
diff --git a/arch/sparc/mm/init_64.c b/arch/sparc/mm/init_64.c
index 1cf0d666dea3..4cc248817b19 100644
--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -360,17 +360,13 @@ static void __init pud_huge_patch(void)
 	__asm__ __volatile__("flush %0" : : "r" (addr));
 }
 
-static int __init setup_hugepagesz(char *string)
+bool __init arch_hugetlb_valid_size(unsigned long long size)
 {
-	unsigned long long hugepage_size;
-	unsigned int hugepage_shift;
+	unsigned int hugepage_shift = ilog2(size);
 	unsigned short hv_pgsz_idx;
 	unsigned int hv_pgsz_mask;
 	int rc = 0;
 
-	hugepage_size = memparse(string, &string);
-	hugepage_shift = ilog2(hugepage_size);
-
 	switch (hugepage_shift) {
 	case HPAGE_16GB_SHIFT:
 		hv_pgsz_mask = HV_PGSZ_MASK_16GB;
@@ -397,7 +393,20 @@ static int __init setup_hugepagesz(char *string)
 		hv_pgsz_mask = 0;
 	}
 
-	if ((hv_pgsz_mask & cpu_pgsz_mask) == 0U) {
+	if ((hv_pgsz_mask & cpu_pgsz_mask) == 0U)
+		return false;
+
+	return true;
+}
+
+static int __init setup_hugepagesz(char *string)
+{
+	unsigned long long hugepage_size;
+	int rc = 0;
+
+	hugepage_size = memparse(string, &string);
+
+	if (!arch_hugetlb_valid_size(hugepage_size)) {
 		hugetlb_bad_size();
 		pr_err("hugepagesz=%llu not supported by MMU.\n",
 			hugepage_size);
diff --git a/arch/x86/include/asm/hugetlb.h b/arch/x86/include/asm/hugetlb.h
index f65cfb48cfdd..8ed96e4010ec 100644
--- a/arch/x86/include/asm/hugetlb.h
+++ b/arch/x86/include/asm/hugetlb.h
@@ -7,6 +7,9 @@
 
 #define hugepages_supported() boot_cpu_has(X86_FEATURE_PSE)
 
+extern bool __init arch_hugetlb_valid_size(unsigned long long size);
+#define arch_hugetlb_valid_size arch_hugetlb_valid_size
+
 static inline int is_hugepage_only_range(struct mm_struct *mm,
 					 unsigned long addr,
 					 unsigned long len) {
diff --git a/arch/x86/mm/hugetlbpage.c b/arch/x86/mm/hugetlbpage.c
index 5bfd5aef5378..51e6208fdeec 100644
--- a/arch/x86/mm/hugetlbpage.c
+++ b/arch/x86/mm/hugetlbpage.c
@@ -181,16 +181,25 @@ hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
 #endif /* CONFIG_HUGETLB_PAGE */
 
 #ifdef CONFIG_X86_64
+bool __init arch_hugetlb_valid_size(unsigned long long size)
+{
+	if (size == PMD_SIZE)
+		return true;
+	else if (size == PUD_SIZE && boot_cpu_has(X86_FEATURE_GBPAGES))
+		return true;
+	else
+		return false;
+}
+
 static __init int setup_hugepagesz(char *opt)
 {
-	unsigned long ps = memparse(opt, &opt);
-	if (ps == PMD_SIZE) {
-		hugetlb_add_hstate(PMD_SHIFT - PAGE_SHIFT);
-	} else if (ps == PUD_SIZE && boot_cpu_has(X86_FEATURE_GBPAGES)) {
-		hugetlb_add_hstate(PUD_SHIFT - PAGE_SHIFT);
+	unsigned long long ps = memparse(opt, &opt);
+
+	if (arch_hugetlb_valid_size(ps)) {
+		hugetlb_add_hstate(ilog2(ps) - PAGE_SHIFT);
 	} else {
 		hugetlb_bad_size();
-		printk(KERN_ERR "hugepagesz: Unsupported page size %lu M\n",
+		printk(KERN_ERR "hugepagesz: Unsupported page size %llu M\n",
 			ps >> 20);
 		return 0;
 	}
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index b831e9fa1a26..33343eb980d0 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -678,6 +678,13 @@ static inline spinlock_t *huge_pte_lockptr(struct hstate *h,
 	return &mm->page_table_lock;
 }
 
+#ifndef arch_hugetlb_valid_size
+static inline bool arch_hugetlb_valid_size(unsigned long long size)
+{
+	return (size == HPAGE_SIZE);
+}
+#endif
+
 #ifndef hugepages_supported
 /*
  * Some platform decide whether they support huge pages at boot
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index d8ebd876871d..2f99359b93af 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3224,12 +3224,22 @@ static int __init hugetlb_nrpages_setup(char *s)
 }
 __setup("hugepages=", hugetlb_nrpages_setup);
 
-static int __init hugetlb_default_setup(char *s)
+static int __init default_hugepagesz_setup(char *s)
 {
-	default_hstate_size = memparse(s, &s);
+	unsigned long long size;
+	char *saved_s = s;
+
+	size = memparse(s, &s);
+
+	if (!arch_hugetlb_valid_size(size)) {
+		pr_err("HugeTLB: unsupported default_hugepagesz %s\n", saved_s);
+		return 0;
+	}
+
+	default_hstate_size = size;
 	return 1;
 }
-__setup("default_hugepagesz=", hugetlb_default_setup);
+__setup("default_hugepagesz=", default_hugepagesz_setup);
 
 static unsigned int cpuset_mems_nr(unsigned int *array)
 {
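[Editorial note] The core idea of the patch — validate a requested huge page size through arch_hugetlb_valid_size() before recording it as the default — can be exercised as a standalone userspace sketch. It mirrors the x86 validator from the diff; has_gbpages here is a stand-in for boot_cpu_has(X86_FEATURE_GBPAGES), and memparse() is omitted (sizes are passed as numbers):

```c
#include <assert.h>
#include <stdbool.h>

#define PMD_SIZE (2ULL << 20)	/* 2 MB */
#define PUD_SIZE (1ULL << 30)	/* 1 GB */

/* Stand-in for boot_cpu_has(X86_FEATURE_GBPAGES); hypothetical here. */
static bool has_gbpages = true;

/* Mirrors the x86 arch_hugetlb_valid_size() logic from the patch. */
static bool arch_hugetlb_valid_size(unsigned long long size)
{
	if (size == PMD_SIZE)
		return true;
	else if (size == PUD_SIZE && has_gbpages)
		return true;
	else
		return false;
}

static unsigned long long default_hstate_size;

/* Models default_hugepagesz_setup(): validate before recording. */
static int default_hugepagesz_setup(unsigned long long size)
{
	if (!arch_hugetlb_valid_size(size))
		return 0;	/* rejected at parse time, nothing recorded */
	default_hstate_size = size;
	return 1;
}
```

An invalid size is rejected before default_hstate_size is touched, which is exactly the ordering the old hugetlb_default_setup() could not guarantee.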
The architecture independent routine hugetlb_default_setup sets up
the default huge pages size.  It has no way to verify if the passed
value is valid, so it accepts it and attempts to validate at a later
time.  This requires undocumented cooperation between the arch specific
and arch independent code.

For architectures that support more than one huge page size, provide
a routine arch_hugetlb_valid_size to validate a huge page size.
hugetlb_default_setup can use this to validate passed values.

arch_hugetlb_valid_size will also be used in a subsequent patch to
move processing of the "hugepagesz=" in arch specific code to a common
routine in arch independent code.

Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 arch/arm64/include/asm/hugetlb.h   |  2 ++
 arch/arm64/mm/hugetlbpage.c        | 19 ++++++++++++++-----
 arch/powerpc/include/asm/hugetlb.h |  3 +++
 arch/powerpc/mm/hugetlbpage.c      | 20 +++++++++++++-------
 arch/riscv/include/asm/hugetlb.h   |  3 +++
 arch/riscv/mm/hugetlbpage.c        | 28 ++++++++++++++++++----------
 arch/s390/include/asm/hugetlb.h    |  3 +++
 arch/s390/mm/hugetlbpage.c         | 18 +++++++++++++-----
 arch/sparc/include/asm/hugetlb.h   |  3 +++
 arch/sparc/mm/init_64.c            | 23 ++++++++++++++++-------
 arch/x86/include/asm/hugetlb.h     |  3 +++
 arch/x86/mm/hugetlbpage.c          | 21 +++++++++++++++------
 include/linux/hugetlb.h            |  7 +++++++
 mm/hugetlb.c                       | 16 +++++++++++++---
 14 files changed, 126 insertions(+), 43 deletions(-)
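[Editorial note] The include/linux/hugetlb.h hunk relies on a common kernel header idiom: an arch header defines arch_hugetlb_valid_size and then `#define`s the name to itself, so the generic header's `#ifndef` fallback is compiled out. A compressed userspace sketch of that pattern (x86_hugetlb_valid_size and the size constants are illustrative stand-ins):

```c
#include <assert.h>
#include <stdbool.h>

#define HPAGE_SIZE (1ULL << 21)

/* "Arch header": provides its own validator and advertises it. */
static bool x86_hugetlb_valid_size(unsigned long long size)
{
	return size == (1ULL << 21) || size == (1ULL << 30);
}
#define arch_hugetlb_valid_size x86_hugetlb_valid_size

/* "Generic header": fallback compiled only when no arch override exists. */
#ifndef arch_hugetlb_valid_size
static inline bool arch_hugetlb_valid_size(unsigned long long size)
{
	return size == HPAGE_SIZE;
}
#endif
```

With the `#define` in place, callers of arch_hugetlb_valid_size() transparently get the arch version; architectures with a single huge page size simply omit both lines and inherit the HPAGE_SIZE check.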