Message ID | 29ed519902512319bcc62e071d52b712fa97e306.1549469965.git.robin.murphy@arm.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | mm/memory-hotplug: Add sysfs hot-remove trigger | expand |
On Wed, Feb 06, 2019 at 05:03:53PM +0000, Robin Murphy wrote: > ARCH_MEMORY_PROBE is a useful thing for testing and debugging hotplug, > but being able to exercise the (arguably trickier) hot-remove path would > be even more useful. Extend the feature to allow removal of offline > sections to be triggered manually to aid development. > > Signed-off-by: Robin Murphy <robin.murphy@arm.com> > --- > > This is inspired by a previous proposal[1], but in coming up with a > more robust interface I ended up rewriting the whole thing from > scratch. The lack of documentation is semi-deliberate, since I don't > like the idea of anyone actually relying on this interface as ABI, but > as a handy tool it felt useful enough to be worth sharing :) > > Robin. > > [1] https://lore.kernel.org/lkml/22d34fe30df0fbacbfceeb47e20cb1184af73585.1511433386.git.ar@linux.vnet.ibm.com/ > > drivers/base/memory.c | 42 +++++++++++++++++++++++++++++++++++++++++- > 1 file changed, 41 insertions(+), 1 deletion(-) You have to add new Documentation/ABI entries for each new sysfs file you add :(
On 2019-02-06 6:45 pm, Greg KH wrote: > On Wed, Feb 06, 2019 at 05:03:53PM +0000, Robin Murphy wrote: >> ARCH_MEMORY_PROBE is a useful thing for testing and debugging hotplug, >> but being able to exercise the (arguably trickier) hot-remove path would >> be even more useful. Extend the feature to allow removal of offline >> sections to be triggered manually to aid development. >> >> Signed-off-by: Robin Murphy <robin.murphy@arm.com> >> --- >> >> This is inspired by a previous proposal[1], but in coming up with a >> more robust interface I ended up rewriting the whole thing from >> scratch. The lack of documentation is semi-deliberate, since I don't >> like the idea of anyone actually relying on this interface as ABI, but >> as a handy tool it felt useful enough to be worth sharing :) >> >> Robin. >> >> [1] https://lore.kernel.org/lkml/22d34fe30df0fbacbfceeb47e20cb1184af73585.1511433386.git.ar@linux.vnet.ibm.com/ >> >> drivers/base/memory.c | 42 +++++++++++++++++++++++++++++++++++++++++- >> 1 file changed, 41 insertions(+), 1 deletion(-) > > You have to add new Documentation/ABI entries for each new sysfs file > you add :( Ah, the documentation provided is adapted from the documentation for the existing "probe" attribute it builds upon ;) But yeah, perhaps I should have tagged this post as an RFC. Much as I dislike it being anywhere near an ABI, if everyone thinks the feature is merge-worthy I'll go ahead and write up some proper doc entries too. Robin.
On Wed, Feb 06, 2019 at 05:03:53PM +0000, Robin Murphy wrote: > ARCH_MEMORY_PROBE is a useful thing for testing and debugging hotplug, > but being able to exercise the (arguably trickier) hot-remove path would > be even more useful. Extend the feature to allow removal of offline > sections to be triggered manually to aid development. > > Signed-off-by: Robin Murphy <robin.murphy@arm.com> > --- > > This is inspired by a previous proposal[1], but in coming up with a > more robust interface I ended up rewriting the whole thing from > scratch. The lack of documentation is semi-deliberate, since I don't > like the idea of anyone actually relying on this interface as ABI, but > as a handy tool it felt useful enough to be worth sharing :) Hi Robin, I think this might come in handy, especially when trying to test hot-remove on arch's that do not have any means to hot-remove memory, or even on virtual platforms that do not have yet support for hot-remove depending on the platform, like qemu/arm64. I could have used this while testing hot-remove on other archs for [1] > > Robin. > > [1] https://lore.kernel.org/lkml/22d34fe30df0fbacbfceeb47e20cb1184af73585.1511433386.git.ar@linux.vnet.ibm.com/ > > + if (mem->state != MEM_OFFLINE) > + return -EBUSY; We do have the helper "is_memblock_offlined()", although it is only used in one place now. So, I would rather use it here as well. > + > + ret = lock_device_hotplug_sysfs(); > + if (ret) > + return ret; > + > + if (device_remove_file_self(dev, attr)) { > + __remove_memory(pfn_to_nid(start_pfn), PFN_PHYS(start_pfn), > + MIN_MEMORY_BLOCK_SIZE * sections_per_block); Sorry, I am not into sysfs inners, but I thought that: device_del::device_remove_attrs::device_remove_groups::sysfs_remove_groups would be enough to remove the dev attributes. I guess in this case that is not enough, could you explain why? [1] https://patchwork.kernel.org/patch/10775339/
On 07/02/2019 13:36, Oscar Salvador wrote: > On Wed, Feb 06, 2019 at 05:03:53PM +0000, Robin Murphy wrote: >> ARCH_MEMORY_PROBE is a useful thing for testing and debugging hotplug, >> but being able to exercise the (arguably trickier) hot-remove path would >> be even more useful. Extend the feature to allow removal of offline >> sections to be triggered manually to aid development. >> >> Signed-off-by: Robin Murphy <robin.murphy@arm.com> >> --- >> >> This is inspired by a previous proposal[1], but in coming up with a >> more robust interface I ended up rewriting the whole thing from >> scratch. The lack of documentation is semi-deliberate, since I don't >> like the idea of anyone actually relying on this interface as ABI, but >> as a handy tool it felt useful enough to be worth sharing :) > > Hi Robin, > > I think this might come in handy, especially when trying to test hot-remove > on arch's that do not have any means to hot-remove memory, or even on virtual > platforms that do not have yet support for hot-remove depending on the platform, > like qemu/arm64. > > > I could have used this while testing hot-remove on other archs for [1] > >> >> Robin. >> >> [1] https://lore.kernel.org/lkml/22d34fe30df0fbacbfceeb47e20cb1184af73585.1511433386.git.ar@linux.vnet.ibm.com/ >> > >> + if (mem->state != MEM_OFFLINE) >> + return -EBUSY; > > We do have the helper "is_memblock_offlined()", although it is only used in one place now. > So, I would rather use it here as well. Ooh, if I'd actually noticed that that helper existed, I would indeed have used it - fixed. >> + >> + ret = lock_device_hotplug_sysfs(); >> + if (ret) >> + return ret; >> + >> + if (device_remove_file_self(dev, attr)) { >> + __remove_memory(pfn_to_nid(start_pfn), PFN_PHYS(start_pfn), >> + MIN_MEMORY_BLOCK_SIZE * sections_per_block); > > Sorry, I am not into sysfs inners, but I thought that: > device_del::device_remove_attrs::device_remove_groups::sysfs_remove_groups > would be enough to remove the dev attributes. > I guess in this case that is not enough, could you explain why? As I found out the hard way, since the "remove" attribute itself belongs to the device being removed, the standard device teardown callchain would end up trying to remove the file from its own method, which results in deadlock. Fortunately, the PCI sysfs code has a similar "remove" attribute which showed me how it should be handled - following the kerneldoc breadcrumb trail to kernfs_remove_self() hopefully explains it more completely. Thanks, Robin. > > > [1] https://patchwork.kernel.org/patch/10775339/ >
On 02/07/2019 09:02 PM, Robin Murphy wrote: > On 07/02/2019 13:36, Oscar Salvador wrote: >> On Wed, Feb 06, 2019 at 05:03:53PM +0000, Robin Murphy wrote: >>> ARCH_MEMORY_PROBE is a useful thing for testing and debugging hotplug, >>> but being able to exercise the (arguably trickier) hot-remove path would >>> be even more useful. Extend the feature to allow removal of offline >>> sections to be triggered manually to aid development. >>> >>> Signed-off-by: Robin Murphy <robin.murphy@arm.com> >>> --- >>> >>> This is inspired by a previous proposal[1], but in coming up with a >>> more robust interface I ended up rewriting the whole thing from >>> scratch. The lack of documentation is semi-deliberate, since I don't >>> like the idea of anyone actually relying on this interface as ABI, but >>> as a handy tool it felt useful enough to be worth sharing :) >> >> Hi Robin, >> >> I think this might come in handy, especially when trying to test hot-remove >> on arch's that do not have any means to hot-remove memory, or even on virtual >> platforms that do not have yet support for hot-remove depending on the platform, >> like qemu/arm64. >> >> >> I could have used this while testing hot-remove on other archs for [1] >> >>> >>> Robin. >>> >>> [1] https://lore.kernel.org/lkml/22d34fe30df0fbacbfceeb47e20cb1184af73585.1511433386.git.ar@linux.vnet.ibm.com/ >>> >> >>> + if (mem->state != MEM_OFFLINE) >>> + return -EBUSY; >> >> We do have the helper "is_memblock_offlined()", although it is only used in one place now. >> So, I would rather use it here as well. > > Ooh, if I'd actually noticed that that helper existed, I would indeed have used it - fixed. > >>> + >>> + ret = lock_device_hotplug_sysfs(); >>> + if (ret) >>> + return ret; >>> + >>> + if (device_remove_file_self(dev, attr)) { >>> + __remove_memory(pfn_to_nid(start_pfn), PFN_PHYS(start_pfn), >>> + MIN_MEMORY_BLOCK_SIZE * sections_per_block); >> >> Sorry, I am not into sysfs inners, but I thought that: >> device_del::device_remove_attrs::device_remove_groups::sysfs_remove_groups >> would be enough to remove the dev attributes. >> I guess in this case that is not enough, could you explain why? > > As I found out the hard way, since the "remove" attribute itself belongs to the device being removed, the standard device teardown callchain would end up trying to remove the file from its own method, which results in deadlock. Fortunately, the PCI sysfs code has a similar "remove" attribute which showed me how it should be handled - following the kerneldoc breadcrumb trail to kernfs_remove_self() hopefully explains it more completely. Instead we could have an interface like /sys/devices/system/memory/[unprobe|remove] which would take a memory block start address looking into the existing ones and attempt to remove [addr, addr + memory_block_size] provided its already offlined. This will be exact opposite for /sys/devices/system/memory/probe except the fact that it can trigger onlining of the memory automatically (even this new one could trigger offlining automatically as well). But I dont have a preference between the proposed one or this one. Either of them should be okay.
diff --git a/drivers/base/memory.c b/drivers/base/memory.c index 048cbf7d5233..26344cb9f045 100644 --- a/drivers/base/memory.c +++ b/drivers/base/memory.c @@ -521,7 +521,44 @@ static ssize_t probe_store(struct device *dev, struct device_attribute *attr, } static DEVICE_ATTR_WO(probe); -#endif + +#ifdef CONFIG_MEMORY_HOTREMOVE +static ssize_t remove_store(struct device *dev, struct device_attribute *attr, + const char *buf, size_t count) +{ + struct memory_block *mem = to_memory_block(dev); + unsigned long start_pfn = section_nr_to_pfn(mem->start_section_nr); + bool remove; + int ret; + + ret = kstrtobool(buf, &remove); + if (ret) + return ret; + if (!remove) + return count; + + if (mem->state != MEM_OFFLINE) + return -EBUSY; + + ret = lock_device_hotplug_sysfs(); + if (ret) + return ret; + + if (device_remove_file_self(dev, attr)) { + __remove_memory(pfn_to_nid(start_pfn), PFN_PHYS(start_pfn), + MIN_MEMORY_BLOCK_SIZE * sections_per_block); + ret = count; + } else { + ret = -EBUSY; + } + + unlock_device_hotplug(); + return ret; +} + +static DEVICE_ATTR_WO(remove); +#endif /* CONFIG_MEMORY_HOTREMOVE */ +#endif /* CONFIG_ARCH_MEMORY_PROBE */ #ifdef CONFIG_MEMORY_FAILURE /* @@ -615,6 +652,9 @@ static struct attribute *memory_memblk_attrs[] = { &dev_attr_removable.attr, #ifdef CONFIG_MEMORY_HOTREMOVE &dev_attr_valid_zones.attr, +#ifdef CONFIG_ARCH_MEMORY_PROBE + &dev_attr_remove.attr, +#endif #endif NULL };
ARCH_MEMORY_PROBE is a useful thing for testing and debugging hotplug, but being able to exercise the (arguably trickier) hot-remove path would be even more useful. Extend the feature to allow removal of offline sections to be triggered manually to aid development. Signed-off-by: Robin Murphy <robin.murphy@arm.com> --- This is inspired by a previous proposal[1], but in coming up with a more robust interface I ended up rewriting the whole thing from scratch. The lack of documentation is semi-deliberate, since I don't like the idea of anyone actually relying on this interface as ABI, but as a handy tool it felt useful enough to be worth sharing :) Robin. [1] https://lore.kernel.org/lkml/22d34fe30df0fbacbfceeb47e20cb1184af73585.1511433386.git.ar@linux.vnet.ibm.com/ drivers/base/memory.c | 42 +++++++++++++++++++++++++++++++++++++++++- 1 file changed, 41 insertions(+), 1 deletion(-)