Message ID | 1455069975-14291-1-git-send-email-toshi.kani@hpe.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
* Toshi Kani <toshi.kani@hpe.com> wrote: > x86 does not define ARCH_HAS_VALID_PHYS_ADDR_RANGE, which > leads /dev/mem to use the default valid_phys_addr_range() > and valid_mmap_phys_addr_range() in drivers/char/mem.c. > > The default valid_phys_addr_range() allows any range lower > than __pa(high_memory), which is the end of system RAM, and > disallows any range higher than it. > > Persistent memory may be located at lower and/or higher > address of __pa(high_memory) depending on their memory slots. > When using crash(8) via /dev/mem for analyzing data in > persistent memory, it can only access to the one lower than > __pa(high_memory). > > Add x86 valid_phys_addr_range() and valid_mmap_phys_addr_range() > to provide better checking: > - Physical address range is valid when it is fully backed by > IORESOURCE_MEM, regardless of __pa(high_memory). > - Other ranges, including holes, are invalid. > > This also allows crash(8) to access persistent memory ranges > via /dev/mem (with a minor change to remove high_memory check > from crash itself). > > Note, /dev/mem makes additional check with devmem_is_allowed() > for read/write when CONFIG_STRICT_DEVMEM is set, and does always > for mmap. CONFIG_IO_STRICT_DEVMEM provides further restriction. > > Signed-off-by: Toshi Kani <toshi.kani@hpe.com> > Cc: Thomas Gleixner <tglx@linutronix.de> > Cc: Ingo Molnar <mingo@redhat.com> > Cc: "H. Peter Anvin" <hpa@zytor.com> > Cc: Borislav Petkov <bp@suse.de> > Cc: Dan Williams <dan.j.williams@intel.com> > --- > This patch applies on top of the patch-set below, and is based > on the tip tree. > https://lkml.org/lkml/2016/1/26/886 > --- > arch/x86/include/asm/io.h | 3 +++ > arch/x86/mm/init.c | 24 ++++++++++++++++++++++++ > 2 files changed, 27 insertions(+) > > diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h > index de25aad..189901a 100644 > --- a/arch/x86/include/asm/io.h > +++ b/arch/x86/include/asm/io.h > @@ -36,6 +36,7 @@ > > #define ARCH_HAS_IOREMAP_WC > #define ARCH_HAS_IOREMAP_WT > +#define ARCH_HAS_VALID_PHYS_ADDR_RANGE > > #include <linux/string.h> > #include <linux/compiler.h> > @@ -326,6 +327,8 @@ extern void __iomem *ioremap_wc(resource_size_t offset, unsigned long size); > extern void __iomem *ioremap_wt(resource_size_t offset, unsigned long size); > > extern bool is_early_ioremap_ptep(pte_t *ptep); > +extern int valid_phys_addr_range(phys_addr_t addr, size_t size); > +extern int valid_mmap_phys_addr_range(unsigned long pfn, size_t size); > > #ifdef CONFIG_XEN > #include <xen/xen.h> > diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c > index 493f541..35cf96f 100644 > --- a/arch/x86/mm/init.c > +++ b/arch/x86/mm/init.c > @@ -624,6 +624,30 @@ void __init init_mem_mapping(void) > early_memtest(0, max_pfn_mapped << PAGE_SHIFT); > } > > +/** > + * valid_phys_addr_range - check phys addr for /dev/mem read and write > + * > + * Return true if a target physical address is marked as IORESOURCE_MEM. > + */ > +int valid_phys_addr_range(phys_addr_t addr, size_t size) > +{ > + return (region_intersects(addr, size, IORESOURCE_MEM, > + IORES_DESC_NONE) == REGION_INTERSECTS); > +} > + > +/** > + * valid_mmap_phys_addr_range - check phys addr for /dev/mem mmap > + * > + * Return true if a target physical address is marked as IORESOURCE_MEM. > + */ > +int valid_mmap_phys_addr_range(unsigned long pfn, size_t size) > +{ > + resource_size_t addr = pfn << PAGE_SHIFT; > + > + return (region_intersects(addr, size, IORESOURCE_MEM, > + IORES_DESC_NONE) == REGION_INTERSECTS); > +} > + > /* > * devmem_is_allowed() checks to see if /dev/mem access to a certain address > * is valid. The argument is a physical page number. So it's hard to judge the quality of these new APIs without seeing their actual usecases. So please Cc: me to whatever work this is used in, and I'll have a look in that context. Thanks, Ingo
On Wed, 2016-02-17 at 10:29 +0100, Ingo Molnar wrote: > * Toshi Kani <toshi.kani@hpe.com> wrote: > > > x86 does not define ARCH_HAS_VALID_PHYS_ADDR_RANGE, which > > leads /dev/mem to use the default valid_phys_addr_range() > > and valid_mmap_phys_addr_range() in drivers/char/mem.c. > > > > The default valid_phys_addr_range() allows any range lower > > than __pa(high_memory), which is the end of system RAM, and > > disallows any range higher than it. > > > > Persistent memory may be located at lower and/or higher > > address of __pa(high_memory) depending on their memory slots. > > When using crash(8) via /dev/mem for analyzing data in > > persistent memory, it can only access to the one lower than > > __pa(high_memory). > > > > Add x86 valid_phys_addr_range() and valid_mmap_phys_addr_range() > > to provide better checking: > > - Physical address range is valid when it is fully backed by > > IORESOURCE_MEM, regardless of __pa(high_memory). > > - Other ranges, including holes, are invalid. > > > > This also allows crash(8) to access persistent memory ranges > > via /dev/mem (with a minor change to remove high_memory check > > from crash itself). > > > > Note, /dev/mem makes additional check with devmem_is_allowed() > > for read/write when CONFIG_STRICT_DEVMEM is set, and does always > > for mmap. CONFIG_IO_STRICT_DEVMEM provides further restriction. : > So it's hard to judge the quality of these new APIs without seeing their > actual usecases. So please Cc: me to whatever work this is used in, and > I'll have a look in that context. Great! The source code of crash(8) is available in the github below. https://github.com/crash-utility/crash crash is a tool to analyze memory data via a crashdump file or /dev/mem. I am trying to make this tool works on NVDIMM via /dev/mem. (NVDIMM ranges are not saved to a crashdump file, but are persistent anyway.) We had BTT metadata corruptions, and this tool can be helpful to verify such data until we have better tools. https://lkml.org/lkml/2016/2/11/674 When /dev/mem is specified as the source, read_dev_mem() is set to pc- >readmem as the read function to /dev/mem. read_dev_mem() is at line 2268 of the file blow. https://github.com/crash-utility/crash/blob/master/memory.c read_dev_mem() has the same high_memory check (i.e. mirroring the /dev/mem restriction), which should also be removed after this patch is accepted. Falling back to /dev/kmem does not help since read_kmem() in the kernel has check to high_memory in the function itself. With this patch, read_dev_mem() works on NVDIMM ranges. Let me know if you have any question. Thanks, -Toshi
On Tue, Feb 9, 2016 at 6:06 PM, Toshi Kani <toshi.kani@hpe.com> wrote: > x86 does not define ARCH_HAS_VALID_PHYS_ADDR_RANGE, which > leads /dev/mem to use the default valid_phys_addr_range() > and valid_mmap_phys_addr_range() in drivers/char/mem.c. > > The default valid_phys_addr_range() allows any range lower > than __pa(high_memory), which is the end of system RAM, and > disallows any range higher than it. > > Persistent memory may be located at lower and/or higher > address of __pa(high_memory) depending on their memory slots. > When using crash(8) via /dev/mem for analyzing data in > persistent memory, it can only access to the one lower than > __pa(high_memory). > > Add x86 valid_phys_addr_range() and valid_mmap_phys_addr_range() > to provide better checking: > - Physical address range is valid when it is fully backed by > IORESOURCE_MEM, regardless of __pa(high_memory). > - Other ranges, including holes, are invalid. > > This also allows crash(8) to access persistent memory ranges > via /dev/mem (with a minor change to remove high_memory check > from crash itself). If we're modifying crash(8) can't we also teach it to mmap /dev/pmemX directly? With commit 90a545e98126 "restrict /dev/mem to idle io memory ranges" /dev/mem should not have access to active pmem ranges.
On Wed, Feb 17, 2016 at 11:35 AM, Toshi Kani <toshi.kani@hpe.com> wrote: > On Wed, 2016-02-17 at 09:58 -0800, Dan Williams wrote: >> On Tue, Feb 9, 2016 at 6:06 PM, Toshi Kani <toshi.kani@hpe.com> wrote: >> > x86 does not define ARCH_HAS_VALID_PHYS_ADDR_RANGE, which >> > leads /dev/mem to use the default valid_phys_addr_range() >> > and valid_mmap_phys_addr_range() in drivers/char/mem.c. >> > >> > The default valid_phys_addr_range() allows any range lower >> > than __pa(high_memory), which is the end of system RAM, and >> > disallows any range higher than it. >> > >> > Persistent memory may be located at lower and/or higher >> > address of __pa(high_memory) depending on their memory slots. >> > When using crash(8) via /dev/mem for analyzing data in >> > persistent memory, it can only access to the one lower than >> > __pa(high_memory). >> > >> > Add x86 valid_phys_addr_range() and valid_mmap_phys_addr_range() >> > to provide better checking: >> > - Physical address range is valid when it is fully backed by >> > IORESOURCE_MEM, regardless of __pa(high_memory). >> > - Other ranges, including holes, are invalid. >> > >> > This also allows crash(8) to access persistent memory ranges >> > via /dev/mem (with a minor change to remove high_memory check >> > from crash itself). >> >> If we're modifying crash(8) can't we also teach it to mmap /dev/pmemX >> directly? With commit 90a545e98126 "restrict /dev/mem to idle io >> memory ranges" /dev/mem should not have access to active pmem ranges. > > Yes, I am aware of the commit. Unloading drivers while using crash(8) to > analyze NVDIMM via /dev/mem makes sense. /dev/mem does not require any > other drivers be loaded. Ah, ok. I thought this patch was bypassing that safety check. If it requires the driver to be unloaded first then I'm fine with this. > Using /dev/pmemX, on the other hand, requires the driver to be loaded, > which can be problematic. For instance, when btt_init() fails due to some > corruption in arena, it fails to create any pmem device file. A dev file > also restricts access range within the dev file. > > Thanks, > -Toshi > > ps. > Looking at iomem_is_exclusive(), it only checks the top-level iomem > entries. I think the pmem/btt driver only marks a child entry busy... > It looks to me that next_resource(), via r_next(), walks child ranges.
On Wed, 2016-02-17 at 09:58 -0800, Dan Williams wrote: > On Tue, Feb 9, 2016 at 6:06 PM, Toshi Kani <toshi.kani@hpe.com> wrote: > > x86 does not define ARCH_HAS_VALID_PHYS_ADDR_RANGE, which > > leads /dev/mem to use the default valid_phys_addr_range() > > and valid_mmap_phys_addr_range() in drivers/char/mem.c. > > > > The default valid_phys_addr_range() allows any range lower > > than __pa(high_memory), which is the end of system RAM, and > > disallows any range higher than it. > > > > Persistent memory may be located at lower and/or higher > > address of __pa(high_memory) depending on their memory slots. > > When using crash(8) via /dev/mem for analyzing data in > > persistent memory, it can only access to the one lower than > > __pa(high_memory). > > > > Add x86 valid_phys_addr_range() and valid_mmap_phys_addr_range() > > to provide better checking: > > - Physical address range is valid when it is fully backed by > > IORESOURCE_MEM, regardless of __pa(high_memory). > > - Other ranges, including holes, are invalid. > > > > This also allows crash(8) to access persistent memory ranges > > via /dev/mem (with a minor change to remove high_memory check > > from crash itself). > > If we're modifying crash(8) can't we also teach it to mmap /dev/pmemX > directly? With commit 90a545e98126 "restrict /dev/mem to idle io > memory ranges" /dev/mem should not have access to active pmem ranges. Yes, I am aware of the commit. Unloading drivers while using crash(8) to analyze NVDIMM via /dev/mem makes sense. /dev/mem does not require any other drivers be loaded. Using /dev/pmemX, on the other hand, requires the driver to be loaded, which can be problematic. For instance, when btt_init() fails due to some corruption in arena, it fails to create any pmem device file. A dev file also restricts access range within the dev file. Thanks, -Toshi ps. Looking at iomem_is_exclusive(), it only checks the top-level iomem entries. I think the pmem/btt driver only marks a child entry busy...
On Wed, 2016-02-17 at 11:05 -0800, Dan Williams wrote: > On Wed, Feb 17, 2016 at 11:35 AM, Toshi Kani <toshi.kani@hpe.com> wrote: > > On Wed, 2016-02-17 at 09:58 -0800, Dan Williams wrote: > > > On Tue, Feb 9, 2016 at 6:06 PM, Toshi Kani <toshi.kani@hpe.com> > > > wrote: > > > > x86 does not define ARCH_HAS_VALID_PHYS_ADDR_RANGE, which > > > > leads /dev/mem to use the default valid_phys_addr_range() > > > > and valid_mmap_phys_addr_range() in drivers/char/mem.c. > > > > > > > > The default valid_phys_addr_range() allows any range lower > > > > than __pa(high_memory), which is the end of system RAM, and > > > > disallows any range higher than it. > > > > > > > > Persistent memory may be located at lower and/or higher > > > > address of __pa(high_memory) depending on their memory slots. > > > > When using crash(8) via /dev/mem for analyzing data in > > > > persistent memory, it can only access to the one lower than > > > > __pa(high_memory). > > > > > > > > Add x86 valid_phys_addr_range() and valid_mmap_phys_addr_range() > > > > to provide better checking: > > > > - Physical address range is valid when it is fully backed by > > > > IORESOURCE_MEM, regardless of __pa(high_memory). > > > > - Other ranges, including holes, are invalid. > > > > > > > > This also allows crash(8) to access persistent memory ranges > > > > via /dev/mem (with a minor change to remove high_memory check > > > > from crash itself). > > > > > > If we're modifying crash(8) can't we also teach it to mmap /dev/pmemX > > > directly? With commit 90a545e98126 "restrict /dev/mem to idle io > > > memory ranges" /dev/mem should not have access to active pmem ranges. > > > > Yes, I am aware of the commit. Unloading drivers while using crash(8) > > to analyze NVDIMM via /dev/mem makes sense. /dev/mem does not require > > any other drivers be loaded. > > Ah, ok. I thought this patch was bypassing that safety check. If it > requires the driver to be unloaded first then I'm fine with this. Yes, crash(8) is used for analyzing static data (such as a crashdump file). It does not allow writing data unless user specifically enable it. > > Using /dev/pmemX, on the other hand, requires the driver to be loaded, > > which can be problematic. For instance, when btt_init() fails due to > > some corruption in arena, it fails to create any pmem device file. A > > dev file also restricts access range within the dev file. > > > > Thanks, > > -Toshi > > > > ps. > > Looking at iomem_is_exclusive(), it only checks the top-level iomem > > entries. I think the pmem/btt driver only marks a child entry busy... > > > > It looks to me that next_resource(), via r_next(), walks child ranges. Oh, sorry, I missed the r_next(). Thanks, -Toshi
diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h index de25aad..189901a 100644 --- a/arch/x86/include/asm/io.h +++ b/arch/x86/include/asm/io.h @@ -36,6 +36,7 @@ #define ARCH_HAS_IOREMAP_WC #define ARCH_HAS_IOREMAP_WT +#define ARCH_HAS_VALID_PHYS_ADDR_RANGE #include <linux/string.h> #include <linux/compiler.h> @@ -326,6 +327,8 @@ extern void __iomem *ioremap_wc(resource_size_t offset, unsigned long size); extern void __iomem *ioremap_wt(resource_size_t offset, unsigned long size); extern bool is_early_ioremap_ptep(pte_t *ptep); +extern int valid_phys_addr_range(phys_addr_t addr, size_t size); +extern int valid_mmap_phys_addr_range(unsigned long pfn, size_t size); #ifdef CONFIG_XEN #include <xen/xen.h> diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c index 493f541..35cf96f 100644 --- a/arch/x86/mm/init.c +++ b/arch/x86/mm/init.c @@ -624,6 +624,30 @@ void __init init_mem_mapping(void) early_memtest(0, max_pfn_mapped << PAGE_SHIFT); } +/** + * valid_phys_addr_range - check phys addr for /dev/mem read and write + * + * Return true if a target physical address is marked as IORESOURCE_MEM. + */ +int valid_phys_addr_range(phys_addr_t addr, size_t size) +{ + return (region_intersects(addr, size, IORESOURCE_MEM, + IORES_DESC_NONE) == REGION_INTERSECTS); +} + +/** + * valid_mmap_phys_addr_range - check phys addr for /dev/mem mmap + * + * Return true if a target physical address is marked as IORESOURCE_MEM. + */ +int valid_mmap_phys_addr_range(unsigned long pfn, size_t size) +{ + resource_size_t addr = pfn << PAGE_SHIFT; + + return (region_intersects(addr, size, IORESOURCE_MEM, + IORES_DESC_NONE) == REGION_INTERSECTS); +} + /* * devmem_is_allowed() checks to see if /dev/mem access to a certain address * is valid. The argument is a physical page number.
x86 does not define ARCH_HAS_VALID_PHYS_ADDR_RANGE, which leads /dev/mem to use the default valid_phys_addr_range() and valid_mmap_phys_addr_range() in drivers/char/mem.c. The default valid_phys_addr_range() allows any range lower than __pa(high_memory), which is the end of system RAM, and disallows any range higher than it. Persistent memory may be located at lower and/or higher address of __pa(high_memory) depending on their memory slots. When using crash(8) via /dev/mem for analyzing data in persistent memory, it can only access to the one lower than __pa(high_memory). Add x86 valid_phys_addr_range() and valid_mmap_phys_addr_range() to provide better checking: - Physical address range is valid when it is fully backed by IORESOURCE_MEM, regardless of __pa(high_memory). - Other ranges, including holes, are invalid. This also allows crash(8) to access persistent memory ranges via /dev/mem (with a minor change to remove high_memory check from crash itself). Note, /dev/mem makes additional check with devmem_is_allowed() for read/write when CONFIG_STRICT_DEVMEM is set, and does always for mmap. CONFIG_IO_STRICT_DEVMEM provides further restriction. Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Borislav Petkov <bp@suse.de> Cc: Dan Williams <dan.j.williams@intel.com> --- This patch applies on top of the patch-set below, and is based on the tip tree. https://lkml.org/lkml/2016/1/26/886 --- arch/x86/include/asm/io.h | 3 +++ arch/x86/mm/init.c | 24 ++++++++++++++++++++++++ 2 files changed, 27 insertions(+)