Message ID: 20210506152623.178731-1-zi.yan@sent.com (mailing list archive)
Series: Memory hotplug/hotremove at subsection size
On 06.05.21 17:26, Zi Yan wrote: > From: Zi Yan <ziy@nvidia.com> > > Hi all, > > This patchset tries to remove the restriction on memory hotplug/hotremove > granularity, which is always greater or equal to memory section size[1]. > With the patchset, kernel is able to online/offline memory at a size independent > of memory section size, as small as 2MB (the subsection size). ... which doesn't make any sense as we can only online/offline whole memory block devices. > > The motivation is to increase MAX_ORDER of the buddy allocator and pageblock > size without increasing memory hotplug/hotremove granularity at the same time, Gah, no. Please no. No. > so that the kernel can allocator 1GB pages using buddy allocator and utilizes > existing pageblock based anti-fragmentation, paving the road for 1GB THP > support[2]. Not like this, please no. > > The patchset utilizes the existing subsection support[3] and changes the > section size alignment checks to subsection size alignment checks. There are > also changes to pageblock code to support partial pageblocks, when pageblock > size is increased along with MAX_ORDER. Increasing pageblock size can enable > kernel to utilize existing anti-fragmentation mechanism for gigantic page > allocations. Please not like this. > > The last patch increases SECTION_SIZE_BITS to demonstrate the use of memory > hotplug/hotremove subsection, but is not intended to be merged as is. It is > there in case one wants to try this out and will be removed during the final > submission. > > Feel free to give suggestions and comments. I am looking forward to your > feedback. Please not like this.
On 6 May 2021, at 11:31, David Hildenbrand wrote: > On 06.05.21 17:26, Zi Yan wrote: >> From: Zi Yan <ziy@nvidia.com> >> >> Hi all, >> >> This patchset tries to remove the restriction on memory hotplug/hotremove >> granularity, which is always greater or equal to memory section size[1]. >> With the patchset, kernel is able to online/offline memory at a size independent >> of memory section size, as small as 2MB (the subsection size). > > ... which doesn't make any sense as we can only online/offline whole memory block devices. Why limit the memory block size to section size? Patch 3 removes the restriction by using (start_pfn, nr_pages) to allow memory block size goes below section size. Also we have subsection bitmap available to tell us which subsection is online, there is no reason to force memory block size to match section size. > >> >> The motivation is to increase MAX_ORDER of the buddy allocator and pageblock >> size without increasing memory hotplug/hotremove granularity at the same time, > > Gah, no. Please no. No. > >> so that the kernel can allocator 1GB pages using buddy allocator and utilizes >> existing pageblock based anti-fragmentation, paving the road for 1GB THP >> support[2]. > > Not like this, please no. > >> >> The patchset utilizes the existing subsection support[3] and changes the >> section size alignment checks to subsection size alignment checks. There are >> also changes to pageblock code to support partial pageblocks, when pageblock >> size is increased along with MAX_ORDER. Increasing pageblock size can enable >> kernel to utilize existing anti-fragmentation mechanism for gigantic page >> allocations. > > Please not like this. > >> >> The last patch increases SECTION_SIZE_BITS to demonstrate the use of memory >> hotplug/hotremove subsection, but is not intended to be merged as is. It is >> there in case one wants to try this out and will be removed during the final >> submission. >> >> Feel free to give suggestions and comments. I am looking forward to your >> feedback. > > Please not like this. Do you mind sharing more useful feedback instead of just saying a lot of No? Thanks. — Best Regards, Yan Zi
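For readers following along: the "subsection bitmap" referred to here is the per-section map recording which 2MB subsections are in use. Below is a minimal userspace-style sketch of the concept only; the structure and function names are illustrative, not the kernel's actual sparse.c code, and the 128MB/2MB constants are the common x86-64 defaults.

#include <stdbool.h>
#include <stdint.h>

/* Illustrative constants: 128 MiB sections, 2 MiB subsections (x86-64 defaults). */
#define SECTION_BYTES        (128UL << 20)
#define SUBSECTION_BYTES     (2UL << 20)
#define SUBSECTIONS_PER_SEC  (SECTION_BYTES / SUBSECTION_BYTES)   /* 64 */

/* One 64-bit word covers the 64 subsections of a section. */
struct section_usage {
        uint64_t subsection_online_map;
};

/* Mark [offset, offset + size) within one section online, 2 MiB at a time. */
void online_subsections(struct section_usage *us,
                        unsigned long offset, unsigned long size)
{
        unsigned long first = offset / SUBSECTION_BYTES;
        unsigned long nr = size / SUBSECTION_BYTES;

        for (unsigned long i = first; i < first + nr; i++)
                us->subsection_online_map |= 1ULL << i;
}

/* A memory block smaller than a section is online iff its bits are set. */
bool subsection_online(const struct section_usage *us, unsigned long offset)
{
        return us->subsection_online_map & (1ULL << (offset / SUBSECTION_BYTES));
}

With such a map, onlining a 2MB block only flips the corresponding bits rather than requiring a whole section, which is the decoupling the reply above argues for.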
On 06.05.21 17:31, David Hildenbrand wrote: > On 06.05.21 17:26, Zi Yan wrote: >> From: Zi Yan <ziy@nvidia.com> >> >> Hi all, >> >> This patchset tries to remove the restriction on memory hotplug/hotremove >> granularity, which is always greater or equal to memory section size[1]. >> With the patchset, kernel is able to online/offline memory at a size independent >> of memory section size, as small as 2MB (the subsection size). > > ... which doesn't make any sense as we can only online/offline whole > memory block devices. > >> >> The motivation is to increase MAX_ORDER of the buddy allocator and pageblock >> size without increasing memory hotplug/hotremove granularity at the same time, > > Gah, no. Please no. No. > >> so that the kernel can allocator 1GB pages using buddy allocator and utilizes >> existing pageblock based anti-fragmentation, paving the road for 1GB THP >> support[2]. > > Not like this, please no. > >> >> The patchset utilizes the existing subsection support[3] and changes the >> section size alignment checks to subsection size alignment checks. There are >> also changes to pageblock code to support partial pageblocks, when pageblock >> size is increased along with MAX_ORDER. Increasing pageblock size can enable >> kernel to utilize existing anti-fragmentation mechanism for gigantic page >> allocations. > > Please not like this. > >> >> The last patch increases SECTION_SIZE_BITS to demonstrate the use of memory >> hotplug/hotremove subsection, but is not intended to be merged as is. It is >> there in case one wants to try this out and will be removed during the final >> submission. >> >> Feel free to give suggestions and comments. I am looking forward to your >> feedback. > > Please not like this. > And just to be clear (I think I mentioned this already to you?): Nack to increasing the section size. Nack to increasing the pageblock order. Please find different ways to group on gigantic-pages level. There are alternative ideas floating around. Semi-nack to increasing MAX_ORDER. I first want to see alloc_contig_range() be able to fully and cleanly handle allocations < MAX_ORDER in all cases (especially !CMA and !ZONE_MOVABLE) before we go down that path.
>>> The last patch increases SECTION_SIZE_BITS to demonstrate the use of memory >>> hotplug/hotremove subsection, but is not intended to be merged as is. It is >>> there in case one wants to try this out and will be removed during the final >>> submission. >>> >>> Feel free to give suggestions and comments. I am looking forward to your >>> feedback. >> >> Please not like this. > > Do you mind sharing more useful feedback instead of just saying a lot of No? I remember reasoning about this already in another thread, no? Either you're ignoring my previous feedback or my mind is messing with me.
On 6 May 2021, at 11:26, Zi Yan wrote:

> From: Zi Yan <ziy@nvidia.com>
>
> Hi all,
>
> This patchset tries to remove the restriction on memory hotplug/hotremove
> granularity, which is always greater or equal to memory section size[1].
> With the patchset, kernel is able to online/offline memory at a size independent
> of memory section size, as small as 2MB (the subsection size).
>
> The motivation is to increase MAX_ORDER of the buddy allocator and pageblock
> size without increasing memory hotplug/hotremove granularity at the same time,
> so that the kernel can allocator 1GB pages using buddy allocator and utilizes
> existing pageblock based anti-fragmentation, paving the road for 1GB THP
> support[2].
>
> The patchset utilizes the existing subsection support[3] and changes the
> section size alignment checks to subsection size alignment checks. There are
> also changes to pageblock code to support partial pageblocks, when pageblock
> size is increased along with MAX_ORDER. Increasing pageblock size can enable
> kernel to utilize existing anti-fragmentation mechanism for gigantic page
> allocations.
>
> The last patch increases SECTION_SIZE_BITS to demonstrate the use of memory
> hotplug/hotremove subsection, but is not intended to be merged as is. It is
> there in case one wants to try this out and will be removed during the final
> submission.
>
> Feel free to give suggestions and comments. I am looking forward to your
> feedback.
>
> Thanks.

Added the missing references.

[1] https://lore.kernel.org/linux-mm/4b3006cf-3391-6839-904e-b415613198cb@redhat.com/
[2] https://lore.kernel.org/linux-mm/20200928175428.4110504-1-zi.yan@sent.com/
[3] https://patchwork.kernel.org/project/linux-nvdimm/cover/156092349300.979959.17603710711957735135.stgit@dwillia2-desk3.amr.corp.intel.com/

>
> Zi Yan (7):
>   mm: sparse: set/clear subsection bitmap when pages are
>     onlined/offlined.
>   mm: set pageblock_order to the max of HUGETLB_PAGE_ORDER and
>     MAX_ORDER-1
>   mm: memory_hotplug: decouple memory_block size with section size.
>   mm: pageblock: allow set/unset migratetype for partial pageblock
>   mm: memory_hotplug, sparse: enable memory hotplug/hotremove
>     subsections
>   arch: x86: no MAX_ORDER exceeds SECTION_SIZE check for 32bit vdso.
>   [not for merge] mm: increase SECTION_SIZE_BITS to 31
>
>  arch/ia64/Kconfig                |   1 -
>  arch/powerpc/Kconfig             |   1 -
>  arch/x86/Kconfig                 |  15 +++
>  arch/x86/entry/vdso/Makefile     |   1 +
>  arch/x86/include/asm/sparsemem.h |   2 +-
>  drivers/base/memory.c            | 176 +++++++++++++++----------------
>  drivers/base/node.c              |   2 +-
>  include/linux/memory.h           |   8 +-
>  include/linux/mmzone.h           |   2 +
>  include/linux/page-isolation.h   |   8 +-
>  include/linux/pageblock-flags.h  |   9 --
>  mm/Kconfig                       |   7 --
>  mm/memory_hotplug.c              |  22 ++--
>  mm/page_alloc.c                  |  40 ++++---
>  mm/page_isolation.c              |  30 +++---
>  mm/sparse.c                      |  55 ++++++++--
>  16 files changed, 219 insertions(+), 160 deletions(-)
>
> --
> 2.30.2

—
Best Regards,
Yan Zi
On 6 May 2021, at 11:40, David Hildenbrand wrote: >>>> The last patch increases SECTION_SIZE_BITS to demonstrate the use of memory >>>> hotplug/hotremove subsection, but is not intended to be merged as is. It is >>>> there in case one wants to try this out and will be removed during the final >>>> submission. >>>> >>>> Feel free to give suggestions and comments. I am looking forward to your >>>> feedback. >>> >>> Please not like this. >> >> Do you mind sharing more useful feedback instead of just saying a lot of No? > > I remember reasoning about this already in another thread, no? Either you're ignoring my previous feedback or my mind is messing with me. I definitely remember all your suggestions: 1. do not use CMA allocation for 1GB THP. 2. section size defines the minimum size in which we can add_memory(), so we cannot increase it. I am trying an alternative here. I am not using CMA allocation and not increasing the minimum size of add_memory() by decoupling the memory block size from section size, so that add_memory() can add a memory block smaller (as small as 2MB, the subsection size) than section size. In this way, section size can be increased freely. I do not see the strong tie between add_memory() and section size, especially we have subsection bitmap support. — Best Regards, Yan Zi
On 06.05.21 17:50, Zi Yan wrote: > On 6 May 2021, at 11:40, David Hildenbrand wrote: > >>>>> The last patch increases SECTION_SIZE_BITS to demonstrate the use of memory >>>>> hotplug/hotremove subsection, but is not intended to be merged as is. It is >>>>> there in case one wants to try this out and will be removed during the final >>>>> submission. >>>>> >>>>> Feel free to give suggestions and comments. I am looking forward to your >>>>> feedback. >>>> >>>> Please not like this. >>> >>> Do you mind sharing more useful feedback instead of just saying a lot of No? >> >> I remember reasoning about this already in another thread, no? Either you're ignoring my previous feedback or my mind is messing with me. > > I definitely remember all your suggestions: > > 1. do not use CMA allocation for 1GB THP. > 2. section size defines the minimum size in which we can add_memory(), so we cannot increase it. > > I am trying an alternative here. I am not using CMA allocation and not increasing the minimum size of add_memory() by decoupling the memory block size from section size, so that add_memory() can add a memory block smaller (as small as 2MB, the subsection size) than section size. In this way, section size can be increased freely. I do not see the strong tie between add_memory() and section size, especially we have subsection bitmap support. Okay, let me express my thoughts, I could have sworn I explained back then why I am not a friend of messing with the existing pageblock size: 1. Pageblock size There are a couple of features that rely on the pageblock size to be reasonably small to work as expected. One example is virtio-balloon free page reporting, then there is virtio-mem (still also glued MAX_ORDER) and we have CMA (still also glued to MAX_ORDER). Most probably there are more. We track movability/ page isolation per pageblock; it's the smallest granularity you can effectively isolate pages or mark them as CMA (MIGRATE_ISOLATE, MIGRATE_CMA). Well, and there are "ordinary" THP / huge pages most of our applications use and will use, especially on smallish systems. Assume you bump up the pageblock order to 1G. Small VMs won't be able to report any free pages to the hypervisor. You'll take the "fine-grained" out of virtio-mem. Each CMA area will have to be at least 1G big, which turns CMA essentially useless on smallish systems (like we have on arm64 with 64k base pages -- pageblock_size is 512MB and I hate it). Then, imagine systems that have like 4G of main memory. By stopping grouping at 2M and instead grouping at 1G you can very easily find yourself in the system where all your 4 pageblocks are unmovable and you essentially don't optimize for huge pages in that environment any more. Long story short: we need a different mechanism on top and shall leave the pageblock size untouched, it's too tightly integrated with page isolation, ordinary THP, and CMA. 2. Section size I assume the only reason you want to touch that is because pageblock_size <= section_size, and I guess that's one of the reasons I dislike it so much. Messing with the section size really only makes sense when we want to manage metadata for larger granularity within a section. We allocate metadata per section. We mark whole sections early/online/present/.... Yes, in case of vmemmap, we manage the memmap in smaller granularity using the sub-section map, some kind of hack to support some ZONE_DEVICE cases better. Let's assume we introduce something new "gigapage_order", corresponding to 1G. 
We could either decide to squeeze the metadata into sections, having to increase the section size, or manage that metadata differently. Managing it differently certainly makes the necessary changes easier. Instead of adding more hacks into sections, rather manage that metadata at differently place / in a different way. See [1] for an alternative. Not necessarily what I would dream off, but just to showcase that there might be alternative to group pages. 3. Grouping pages > pageblock_order There are other approaches that would benefit from grouping at > pageblock_order and having bigger MAX_ORDER. And that doesn't necessarily mean to form gigantic pages only, we might want to group in multiple granularity on a single system. Memory hot(un)plug is one example, but also optimizing memory consumption by powering down DIMM banks. Also, some architectures support differing huge page sizes (aarch64) that could be improved without CMA. Why not have more than 2 THP sizes on these systems? Ideally, we'd have a mechanism that tries grouping on different granularity, like for every order in pageblock_order ... max_pageblock_order (e.g., 1 GiB), and not only add one new level of grouping (or increase the single grouping size). [1] https://lkml.kernel.org/r/20210414023803.937-1-lipeifeng@oppo.com
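To make the pageblock sizes quoted in this message concrete (2MB on x86-64, 512MB on arm64 with 64k base pages): the pageblock size is the base page size shifted by pageblock_order, which with hugetlb enabled commonly equals the PMD huge page order. A quick arithmetic check, assuming those default orders (order 9 for x86-64, order 13 for arm64/64k):

#include <stdio.h>

/* pageblock_size = PAGE_SIZE << pageblock_order. */
unsigned long pageblock_bytes(unsigned int page_shift, unsigned int pageblock_order)
{
        return (1UL << page_shift) << pageblock_order;
}

int main(void)
{
        /* x86-64: 4 KiB pages, order-9 pageblocks -> 2 MiB */
        printf("x86-64:    %lu MiB\n", pageblock_bytes(12, 9) >> 20);
        /* arm64 with 64 KiB pages: order-13 pageblocks -> 512 MiB */
        printf("arm64/64k: %lu MiB\n", pageblock_bytes(16, 13) >> 20);
        return 0;
}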
On 6 May 2021, at 12:28, David Hildenbrand wrote: > On 06.05.21 17:50, Zi Yan wrote: >> On 6 May 2021, at 11:40, David Hildenbrand wrote: >> >>>>>> The last patch increases SECTION_SIZE_BITS to demonstrate the use of memory >>>>>> hotplug/hotremove subsection, but is not intended to be merged as is. It is >>>>>> there in case one wants to try this out and will be removed during the final >>>>>> submission. >>>>>> >>>>>> Feel free to give suggestions and comments. I am looking forward to your >>>>>> feedback. >>>>> >>>>> Please not like this. >>>> >>>> Do you mind sharing more useful feedback instead of just saying a lot of No? >>> >>> I remember reasoning about this already in another thread, no? Either you're ignoring my previous feedback or my mind is messing with me. >> >> I definitely remember all your suggestions: >> >> 1. do not use CMA allocation for 1GB THP. >> 2. section size defines the minimum size in which we can add_memory(), so we cannot increase it. >> >> I am trying an alternative here. I am not using CMA allocation and not increasing the minimum size of add_memory() by decoupling the memory block size from section size, so that add_memory() can add a memory block smaller (as small as 2MB, the subsection size) than section size. In this way, section size can be increased freely. I do not see the strong tie between add_memory() and section size, especially we have subsection bitmap support. > > Okay, let me express my thoughts, I could have sworn I explained back then why I am not a friend of messing with the existing pageblock size: Thanks for writing down your thoughts in detail. I will clarify my high-level plan below too. > > 1. Pageblock size > > There are a couple of features that rely on the pageblock size to be reasonably small to work as expected. One example is virtio-balloon free page reporting, then there is virtio-mem (still also glued MAX_ORDER) and we have CMA (still also glued to MAX_ORDER). Most probably there are more. We track movability/ page isolation per pageblock; it's the smallest granularity you can effectively isolate pages or mark them as CMA (MIGRATE_ISOLATE, MIGRATE_CMA). Well, and there are "ordinary" THP / huge pages most of our applications use and will use, especially on smallish systems. > > Assume you bump up the pageblock order to 1G. Small VMs won't be able to report any free pages to the hypervisor. You'll take the "fine-grained" out of virtio-mem. Each CMA area will have to be at least 1G big, which turns CMA essentially useless on smallish systems (like we have on arm64 with 64k base pages -- pageblock_size is 512MB and I hate it). I understand the issue of having large pageblock in small systems. My plan for this issue is to make MAX_ORDER a variable (pageblock size would be set according to MAX_ORDER) that can be adjusted based on total memory and via boot time parameter. My apology since I did not state this clearly in my cover letter and it confused you. When we have a boot time adjustable MAX_ORDER, large pageblock like 1GB would only appear for systems with large memory. For small VMs, pageblock size would stay at 2MB, so all your concerns on smallish systems should go away. > > Then, imagine systems that have like 4G of main memory. By stopping grouping at 2M and instead grouping at 1G you can very easily find yourself in the system where all your 4 pageblocks are unmovable and you essentially don't optimize for huge pages in that environment any more. 
> > Long story short: we need a different mechanism on top and shall leave the pageblock size untouched, it's too tightly integrated with page isolation, ordinary THP, and CMA. I think it is better to make pageblock size adjustable based on total memory of a system. It is not reasonable to have the same pageblock size across systems with memory sizes from <1GB to several TBs. Do you agree? > > 2. Section size > > I assume the only reason you want to touch that is because pageblock_size <= section_size, and I guess that's one of the reasons I dislike it so much. Messing with the section size really only makes sense when we want to manage metadata for larger granularity within a section. Perhaps it is worth checking if it is feasible to make pageblock_size > section_size, so we can still have small sections when pageblock_size are large. One potential issue for that is when PFN discontinues at section boundary, we might have partial pageblock when pageblock_size is big. I guess supporting partial pageblock (or different pageblock sizes like you mentioned below ) would be the right solution. > > We allocate metadata per section. We mark whole sections early/online/present/.... Yes, in case of vmemmap, we manage the memmap in smaller granularity using the sub-section map, some kind of hack to support some ZONE_DEVICE cases better. > > Let's assume we introduce something new "gigapage_order", corresponding to 1G. We could either decide to squeeze the metadata into sections, having to increase the section size, or manage that metadata differently. > > Managing it differently certainly makes the necessary changes easier. Instead of adding more hacks into sections, rather manage that metadata at differently place / in a different way. Can you elaborate on managing it differently? > > See [1] for an alternative. Not necessarily what I would dream off, but just to showcase that there might be alternative to group pages. I saw this patch too. It is an interesting idea to separate different allocation orders into different regions, but it would not work for gigantic page allocations unless we have large pageblock size to utilize existing anti-fragmentation mechanisms. > > 3. Grouping pages > pageblock_order > > There are other approaches that would benefit from grouping at > pageblock_order and having bigger MAX_ORDER. And that doesn't necessarily mean to form gigantic pages only, we might want to group in multiple granularity on a single system. Memory hot(un)plug is one example, but also optimizing memory consumption by powering down DIMM banks. Also, some architectures support differing huge page sizes (aarch64) that could be improved without CMA. Why not have more than 2 THP sizes on these systems? > > Ideally, we'd have a mechanism that tries grouping on different granularity, like for every order in pageblock_order ... max_pageblock_order (e.g., 1 GiB), and not only add one new level of grouping (or increase the single grouping size). I agree. In some sense, supporting partial pageblock and increasing pageblock size (e.g., to 1GB) is, at the high level, quite similar to having multiple pageblock sizes. But I am not sure if we really want to support multiple pageblock sizes, since it creates pageblock fragmentation when we want to change migratetype for part of a pageblock. This means we would break a large pageblock into small ones if we just want to steal a subset of pages from MOVEABLE for UNMOVABLE allocations. Then pageblock loses its most useful anti-fragmentation feature. 
Also it seems to be a replication of buddy allocator functionalities when it comes to pageblock split and merge. The above is really a nice discussion with you on pageblock, section, memory hotplug/hotremove, which also helps me understand more on the issues with increasing MAX_ORDER to enable 1GB page allocation. In sum, if I get it correctly, the issues I need to address are: 1. large pageblock size (which is needed when we bump MAX_ORDER for gigantic page allocation from buddy allocator) is not good for machines with small memory. 2. pageblock size is currently tied with section size (which made me want to bump section size). For 1, I think making MAX_ORDER a variable that can be set based on total memory size and adjustable via boot time parameter should solve the problem. For small machines, we will keep MAX_ORDER as small as we have now like 4MB, whereas for large machines, we can increase MAX_ORDER to utilize gigantic pages. For 2, supporting partial pageblock and allow a pageblock to cross multiple sections would break the tie between pageblock size and section to solve the issue. I am going to look into them. What do you think? — Best Regards, Yan Zi
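A rough sketch of what the boot-time-adjustable MAX_ORDER floated above might look like. Everything here is hypothetical: the parameter name, the variable, and the 11..20 clamp are invented for illustration and are not part of the posted series.

#include <linux/init.h>
#include <linux/kernel.h>

/*
 * Hypothetical sketch only: a buddy maximum order chosen at boot instead of
 * the compile-time MAX_ORDER constant.
 */
static unsigned int buddy_max_order __read_mostly = 11;        /* today's default */

static int __init set_buddy_max_order(char *p)
{
        unsigned int order;

        if (kstrtouint(p, 0, &order))
                return -EINVAL;

        /* Keep small machines at the current default, cap large requests. */
        buddy_max_order = clamp(order, 11U, 20U);
        return 0;
}
early_param("buddy_max_order", set_buddy_max_order);        /* invented name */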
>> >> 1. Pageblock size >> >> There are a couple of features that rely on the pageblock size to be reasonably small to work as expected. One example is virtio-balloon free page reporting, then there is virtio-mem (still also glued MAX_ORDER) and we have CMA (still also glued to MAX_ORDER). Most probably there are more. We track movability/ page isolation per pageblock; it's the smallest granularity you can effectively isolate pages or mark them as CMA (MIGRATE_ISOLATE, MIGRATE_CMA). Well, and there are "ordinary" THP / huge pages most of our applications use and will use, especially on smallish systems. >> >> Assume you bump up the pageblock order to 1G. Small VMs won't be able to report any free pages to the hypervisor. You'll take the "fine-grained" out of virtio-mem. Each CMA area will have to be at least 1G big, which turns CMA essentially useless on smallish systems (like we have on arm64 with 64k base pages -- pageblock_size is 512MB and I hate it). > > I understand the issue of having large pageblock in small systems. My plan for this issue is to make MAX_ORDER a variable (pageblock size would be set according to MAX_ORDER) that can be adjusted based on total memory and via boot time parameter. My apology since I did not state this clearly in my cover letter and it confused you. When we have a boot time adjustable MAX_ORDER, large pageblock like 1GB would only appear for systems with large memory. For small VMs, pageblock size would stay at 2MB, so all your concerns on smallish systems should go away. I have to admit that I am not really a friend of that. I still think our target goal should be to have gigantic THP *in addition to* ordinary THP. Use gigantic THP where enabled and possible, and just use ordinary THP everywhere else. Having one pageblock granularity is a real limitation IMHO and requires us to hack the system to support it to some degree. > >> >> Then, imagine systems that have like 4G of main memory. By stopping grouping at 2M and instead grouping at 1G you can very easily find yourself in the system where all your 4 pageblocks are unmovable and you essentially don't optimize for huge pages in that environment any more. >> >> Long story short: we need a different mechanism on top and shall leave the pageblock size untouched, it's too tightly integrated with page isolation, ordinary THP, and CMA. > > I think it is better to make pageblock size adjustable based on total memory of a system. It is not reasonable to have the same pageblock size across systems with memory sizes from <1GB to several TBs. Do you agree? > I suggest an additional mechanism on top. Please bear in mind that ordinary THP will most probably be still the default for 99.9% of all application/library cases, even when you have gigantic THP around. >> >> 2. Section size >> >> I assume the only reason you want to touch that is because pageblock_size <= section_size, and I guess that's one of the reasons I dislike it so much. Messing with the section size really only makes sense when we want to manage metadata for larger granularity within a section. > > Perhaps it is worth checking if it is feasible to make pageblock_size > section_size, so we can still have small sections when pageblock_size are large. One potential issue for that is when PFN discontinues at section boundary, we might have partial pageblock when pageblock_size is big. I guess supporting partial pageblock (or different pageblock sizes like you mentioned below ) would be the right solution. > >> >> We allocate metadata per section. 
We mark whole sections early/online/present/.... Yes, in case of vmemmap, we manage the memmap in smaller granularity using the sub-section map, some kind of hack to support some ZONE_DEVICE cases better. >> >> Let's assume we introduce something new "gigapage_order", corresponding to 1G. We could either decide to squeeze the metadata into sections, having to increase the section size, or manage that metadata differently. >> >> Managing it differently certainly makes the necessary changes easier. Instead of adding more hacks into sections, rather manage that metadata at differently place / in a different way. > > Can you elaborate on managing it differently? Let's keep it simple. Assume you track on a 1G gigpageblock MOVABLE vs. !movable in addition to existing pageblocks. A 64 TB system would have 64*1024 gigpageblocks. One bit per gigapageblock would require 8k a.k.a. 2 pages. If you need more states, it would maybe double. No need to manage that using sparse memory sections IMHO. Just allocate 2/4 pages during boot for the bitmap. > >> >> See [1] for an alternative. Not necessarily what I would dream off, but just to showcase that there might be alternative to group pages. > > I saw this patch too. It is an interesting idea to separate different allocation orders into different regions, but it would not work for gigantic page allocations unless we have large pageblock size to utilize existing anti-fragmentation mechanisms. Right, any anti-fragmentation mechanism on top. >> 3. Grouping pages > pageblock_order >> >> There are other approaches that would benefit from grouping at > pageblock_order and having bigger MAX_ORDER. And that doesn't necessarily mean to form gigantic pages only, we might want to group in multiple granularity on a single system. Memory hot(un)plug is one example, but also optimizing memory consumption by powering down DIMM banks. Also, some architectures support differing huge page sizes (aarch64) that could be improved without CMA. Why not have more than 2 THP sizes on these systems? >> >> Ideally, we'd have a mechanism that tries grouping on different granularity, like for every order in pageblock_order ... max_pageblock_order (e.g., 1 GiB), and not only add one new level of grouping (or increase the single grouping size). > > I agree. In some sense, supporting partial pageblock and increasing pageblock size (e.g., to 1GB) is, at the high level, quite similar to having multiple pageblock sizes. But I am not sure if we really want to support multiple pageblock sizes, since it creates pageblock fragmentation when we want to change migratetype for part of a pageblock. This means we would break a large pageblock into small ones if we just want to steal a subset of pages from MOVEABLE for UNMOVABLE allocations. Then pageblock loses its most useful anti-fragmentation feature. Also it seems to be a replication of buddy allocator functionalities when it comes to pageblock split and merge. Let's assume for simplicity that you have a 4G machine, maximum 4 gigantic pages. The first gigantic page will be impossible either way due to the kernel, boot time allocations etc. So you're left with 3 gigantic pages you could use at best. Obviously, you want to make sure that the remaining parts of the first gigantic page are used as best as possible for ordinary huge pages, so you would actually want to group them in 2 MiB chunks and avoid fragmentation there. Obviously, supporting two pageblock types would require core modifications to support it natively. 
(not pushing for the idea of two pageblock orders, just motivating why we actually want to keep grouping for ordinary THP). > > > The above is really a nice discussion with you on pageblock, section, memory hotplug/hotremove, which also helps me understand more on the issues with increasing MAX_ORDER to enable 1GB page allocation. > > In sum, if I get it correctly, the issues I need to address are: > > 1. large pageblock size (which is needed when we bump MAX_ORDER for gigantic page allocation from buddy allocator) is not good for machines with small memory. > > 2. pageblock size is currently tied with section size (which made me want to bump section size). > > > For 1, I think making MAX_ORDER a variable that can be set based on total memory size and adjustable via boot time parameter should solve the problem. For small machines, we will keep MAX_ORDER as small as we have now like 4MB, whereas for large machines, we can increase MAX_ORDER to utilize gigantic pages. > > For 2, supporting partial pageblock and allow a pageblock to cross multiple sections would break the tie between pageblock size and section to solve the issue. > > I am going to look into them. What do you think? I am not sure that's really the right direction as stated above.
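A quick check of the metadata-sizing arithmetic earlier in this message (one bit per 1 GiB "gigapageblock" on a 64 TiB machine comes out to 8 KiB, i.e. two 4 KiB pages):

#include <stdio.h>

int main(void)
{
        unsigned long long mem_bytes = 64ULL << 40;           /* 64 TiB */
        unsigned long long giga_block = 1ULL << 30;           /* 1 GiB grouping unit */
        unsigned long long nblocks = mem_bytes / giga_block;  /* 65536 */
        unsigned long long bitmap_bytes = nblocks / 8;        /* one bit each */

        /* 65536 blocks -> 8192 bytes -> two 4 KiB pages, as stated above. */
        printf("blocks=%llu, bitmap=%llu bytes = %llu pages\n",
               nblocks, bitmap_bytes, bitmap_bytes / 4096);
        return 0;
}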
On Thu, May 06, 2021 at 09:10:52PM +0200, David Hildenbrand wrote: > I have to admit that I am not really a friend of that. I still think our > target goal should be to have gigantic THP *in addition to* ordinary THP. > Use gigantic THP where enabled and possible, and just use ordinary THP > everywhere else. Having one pageblock granularity is a real limitation IMHO > and requires us to hack the system to support it to some degree. You're thinking too small with only two THP sizes ;-) I'm aiming to support arbitrary power-of-two memory allocations. I think there's a fruitful discussion to be had about how that works for anonymous memory -- with page cache, we have readahead to tell us when our predictions of use are actually fulfilled. It doesn't tell us what percentage of the pages allocated were actually used, but it's a hint. It's a big lift to go from 2MB all the way to 1GB ... if you can look back to see that the previous 1GB was basically fully populated, then maybe jump up from allocating 2MB folios to allocating a 1GB folio, but wow, that's a big step. This goal really does mean that we want to allocate from the page allocator, and so we do want to grow MAX_ORDER. I suppose we could do somethig ugly like if (order <= MAX_ORDER) alloc_page() else alloc_really_big_page() but that feels like unnecessary hardship to place on the user. I know that for the initial implementation, we're going to rely on hints from the user to use 1GB pages, but it'd be nice to not do that.
On 06.05.21 21:30, Matthew Wilcox wrote: > On Thu, May 06, 2021 at 09:10:52PM +0200, David Hildenbrand wrote: >> I have to admit that I am not really a friend of that. I still think our >> target goal should be to have gigantic THP *in addition to* ordinary THP. >> Use gigantic THP where enabled and possible, and just use ordinary THP >> everywhere else. Having one pageblock granularity is a real limitation IMHO >> and requires us to hack the system to support it to some degree. > > You're thinking too small with only two THP sizes ;-) I'm aiming to Well, I raised in my other mail that we will have multiple different use cases, including multiple different THP e.g., on aarch64 ;) > support arbitrary power-of-two memory allocations. I think there's a > fruitful discussion to be had about how that works for anonymous memory -- > with page cache, we have readahead to tell us when our predictions of use > are actually fulfilled. It doesn't tell us what percentage of the pages Right, and I think we have to think about a better approach than just increasing the pageblock_order. > allocated were actually used, but it's a hint. It's a big lift to go from > 2MB all the way to 1GB ... if you can look back to see that the previous > 1GB was basically fully populated, then maybe jump up from allocating > 2MB folios to allocating a 1GB folio, but wow, that's a big step. > > This goal really does mean that we want to allocate from the page > allocator, and so we do want to grow MAX_ORDER. I suppose we could > do somethig ugly like > > if (order <= MAX_ORDER) > alloc_page() > else > alloc_really_big_page() > > but that feels like unnecessary hardship to place on the user. I had something similar for the sort term in mind, relying on alloc_contig_pages() (and maybe ZONE_MOVABLE to make allocations more likely to succeed). Devil's in the details (page migration, ...).
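Both messages above sketch the same shape of wrapper: buddy allocation below MAX_ORDER, a contiguous-range allocation beyond it. Slightly fleshed out as a hedged kernel-style sketch; the helper name is invented, and alloc_contig_pages() is only built with CONFIG_CONTIG_ALLOC, may sleep, and may fail on unmovable pages, so this is illustration rather than a drop-in API.

#include <linux/gfp.h>
#include <linux/mm.h>

/*
 * Illustrative helper (name invented for this sketch): buddy allocation for
 * order < MAX_ORDER, otherwise migrate a physically contiguous range out of
 * the way. Pages from the slow path must be freed with free_contig_range(),
 * not __free_pages(), so callers need to remember which path they took.
 */
static struct page *alloc_pages_any_order(gfp_t gfp, unsigned int order, int nid)
{
        if (order < MAX_ORDER)
                return alloc_pages_node(nid, gfp, order);

        /* May block, and may fail if the chosen range contains unmovable pages. */
        return alloc_contig_pages(1UL << order, gfp, nid, NULL);
}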
[I haven't read through respective patches due to lack of time but let me comment on the general idea and the underlying justification] On Thu 06-05-21 17:31:09, David Hildenbrand wrote: > On 06.05.21 17:26, Zi Yan wrote: > > From: Zi Yan <ziy@nvidia.com> > > > > Hi all, > > > > This patchset tries to remove the restriction on memory hotplug/hotremove > > granularity, which is always greater or equal to memory section size[1]. > > With the patchset, kernel is able to online/offline memory at a size independent > > of memory section size, as small as 2MB (the subsection size). > > ... which doesn't make any sense as we can only online/offline whole memory > block devices. Agreed. The subsection thingy is just a hack to workaround pmem alignement problems. For the real memory hotplug it is quite hard to argue for reasonable hotplug scenarios for very small physical memory ranges wrt. to the existing sparsemem memory model. > > The motivation is to increase MAX_ORDER of the buddy allocator and pageblock > > size without increasing memory hotplug/hotremove granularity at the same time, > > Gah, no. Please no. No. Agreed. Those are completely independent concepts. MAX_ORDER is can be really arbitrary irrespective of the section size with vmemmap sparse model. The existing restriction is due to old sparse model not being able to do page pointer arithmetic across memory sections. Is there any reason to stick with that memory model for an advance feature you are working on?
On 07.05.21 13:55, Michal Hocko wrote: > [I haven't read through respective patches due to lack of time but let > me comment on the general idea and the underlying justification] > > On Thu 06-05-21 17:31:09, David Hildenbrand wrote: >> On 06.05.21 17:26, Zi Yan wrote: >>> From: Zi Yan <ziy@nvidia.com> >>> >>> Hi all, >>> >>> This patchset tries to remove the restriction on memory hotplug/hotremove >>> granularity, which is always greater or equal to memory section size[1]. >>> With the patchset, kernel is able to online/offline memory at a size independent >>> of memory section size, as small as 2MB (the subsection size). >> >> ... which doesn't make any sense as we can only online/offline whole memory >> block devices. > > Agreed. The subsection thingy is just a hack to workaround pmem > alignement problems. For the real memory hotplug it is quite hard to > argue for reasonable hotplug scenarios for very small physical memory > ranges wrt. to the existing sparsemem memory model. > >>> The motivation is to increase MAX_ORDER of the buddy allocator and pageblock >>> size without increasing memory hotplug/hotremove granularity at the same time, >> >> Gah, no. Please no. No. > > Agreed. Those are completely independent concepts. MAX_ORDER is can be > really arbitrary irrespective of the section size with vmemmap sparse > model. The existing restriction is due to old sparse model not being > able to do page pointer arithmetic across memory sections. Is there any > reason to stick with that memory model for an advance feature you are > working on? > I gave it some more thought yesterday. I guess the first thing we should look into is increasing MAX_ORDER and leaving pageblock_order and section size as is -- finding out what we have to tweak to get that up and running. Once we have that in place, we can actually look into better fragmentation avoidance etc. One step at a time. Because that change itself might require some thought. Requiring that bigger MAX_ORDER depends on SPARSE_VMEMMAP is something reasonable to do. As stated somewhere here already, we'll have to look into making alloc_contig_range() (and main users CMA and virtio-mem) independent of MAX_ORDER and mainly rely on pageblock_order. The current handling in alloc_contig_range() is far from optimal as we have to isolate a whole MAX_ORDER - 1 page -- and on ZONE_NORMAL we'll fail easily if any part contains something unmovable although we don't even want to allocate that part. I actually have that on my list (to be able to fully support pageblock_order instead of MAX_ORDER -1 chunks in virtio-mem), however didn't have time to look into it. Further, page onlining / offlining code and early init code most probably also needs care if MAX_ORDER - 1 crosses sections. Memory holes we might suddenly have in MAX_ORDER - 1 pages might become a problem and will have to be handled. Not sure which other code has to be tweaked (compaction? page isolation?). Figuring out what needs care itself might take quite some effort. One thing I was thinking about as well: The bigger our MAX_ORDER, the slower it could be to allocate smaller pages. If we have 1G pages, splitting them down to 4k then takes 8 additional steps if I'm, not wrong. Of course, that's the worst case. Would be interesting to evaluate.
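A quick check of the splitting arithmetic above, assuming 4 KiB base pages: a 1 GiB page is order 18, today's largest buddy page (MAX_ORDER - 1 = 10) is 4 MiB, and splitting all the way down to order 0 costs one step per order level, hence 8 extra steps in the worst case.

#include <stdio.h>

int main(void)
{
        unsigned int page_shift = 12;                 /* 4 KiB base pages */
        unsigned int order_1g = 30 - page_shift;      /* 1 GiB -> order 18 */
        unsigned int order_cur_max = 10;              /* MAX_ORDER - 1 today (4 MiB) */

        /* One split per order level when going all the way down to order 0. */
        printf("splits from 1G: %u, from 4M: %u, extra: %u\n",
               order_1g, order_cur_max, order_1g - order_cur_max);
        return 0;
}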
On 7 May 2021, at 10:00, David Hildenbrand wrote: > On 07.05.21 13:55, Michal Hocko wrote: >> [I haven't read through respective patches due to lack of time but let >> me comment on the general idea and the underlying justification] >> >> On Thu 06-05-21 17:31:09, David Hildenbrand wrote: >>> On 06.05.21 17:26, Zi Yan wrote: >>>> From: Zi Yan <ziy@nvidia.com> >>>> >>>> Hi all, >>>> >>>> This patchset tries to remove the restriction on memory hotplug/hotremove >>>> granularity, which is always greater or equal to memory section size[1]. >>>> With the patchset, kernel is able to online/offline memory at a size independent >>>> of memory section size, as small as 2MB (the subsection size). >>> >>> ... which doesn't make any sense as we can only online/offline whole memory >>> block devices. >> >> Agreed. The subsection thingy is just a hack to workaround pmem >> alignement problems. For the real memory hotplug it is quite hard to >> argue for reasonable hotplug scenarios for very small physical memory >> ranges wrt. to the existing sparsemem memory model. >> >>>> The motivation is to increase MAX_ORDER of the buddy allocator and pageblock >>>> size without increasing memory hotplug/hotremove granularity at the same time, >>> >>> Gah, no. Please no. No. >> >> Agreed. Those are completely independent concepts. MAX_ORDER is can be >> really arbitrary irrespective of the section size with vmemmap sparse >> model. The existing restriction is due to old sparse model not being >> able to do page pointer arithmetic across memory sections. Is there any >> reason to stick with that memory model for an advance feature you are >> working on? No. I just want to increase MAX_ORDER. If the existing restriction can be removed, that will be great. > > I gave it some more thought yesterday. I guess the first thing we should look into is increasing MAX_ORDER and leaving pageblock_order and section size as is -- finding out what we have to tweak to get that up and running. Once we have that in place, we can actually look into better fragmentation avoidance etc. One step at a time. It makes sense to me. > > Because that change itself might require some thought. Requiring that bigger MAX_ORDER depends on SPARSE_VMEMMAP is something reasonable to do. OK, if with SPARSE_VMEMMAP MAX_ORDER can be set to be bigger than SECTION_SIZE, it is perfectly OK to me. Since 1GB THP support, which I want to add ultimately, will require SPARSE_VMEMMAP too (otherwise, all page++ will need to be changed to nth_page(page,1)). > > As stated somewhere here already, we'll have to look into making alloc_contig_range() (and main users CMA and virtio-mem) independent of MAX_ORDER and mainly rely on pageblock_order. The current handling in alloc_contig_range() is far from optimal as we have to isolate a whole MAX_ORDER - 1 page -- and on ZONE_NORMAL we'll fail easily if any part contains something unmovable although we don't even want to allocate that part. I actually have that on my list (to be able to fully support pageblock_order instead of MAX_ORDER -1 chunks in virtio-mem), however didn't have time to look into it. So in your mind, for gigantic page allocation (> MAX_ORDER), alloc_contig_range() should be used instead of buddy allocator while pageblock_order is kept at a small granularity like 2MB. Is that the case? Isn’t it going to have high fail rate when any of the pageblocks within a gigantic page range (like 1GB) becomes unmovable? 
Are you thinking additional mechanism/policy to prevent such thing happening as an additional step for gigantic page allocation? Like your ZONE_PREFER_MOVABLE idea? > > Further, page onlining / offlining code and early init code most probably also needs care if MAX_ORDER - 1 crosses sections. Memory holes we might suddenly have in MAX_ORDER - 1 pages might become a problem and will have to be handled. Not sure which other code has to be tweaked (compaction? page isolation?). Can you elaborate it a little more? From what I understand, memory holes mean valid PFNs are not contiguous before and after a hole, so pfn++ will not work, but struct pages are still virtually contiguous assuming SPARSE_VMEMMAP, meaning page++ would still work. So when MAX_ORDER - 1 crosses sections, additional code would be needed instead of simple pfn++. Is there anything I am missing? BTW, to test a system with memory holes, do you know is there an easy of adding random memory holes to an x86_64 VM, which can help reveal potential missing pieces in the code? Changing BIOS-e820 table might be one way, but I have no idea on how to do it on QEMU. > > Figuring out what needs care itself might take quite some effort. > > One thing I was thinking about as well: The bigger our MAX_ORDER, the slower it could be to allocate smaller pages. If we have 1G pages, splitting them down to 4k then takes 8 additional steps if I'm, not wrong. Of course, that's the worst case. Would be interesting to evaluate. Sure. I am planning to check it too. As a simple start, I am going to run will it scale benchmarks to see if there is any performance difference between different MAX_ORDERs. Thank you for all these valuable inputs. They are very helpful. I appreciate them. — Best Regards, Yan Zi
>> >> As stated somewhere here already, we'll have to look into making alloc_contig_range() (and main users CMA and virtio-mem) independent of MAX_ORDER and mainly rely on pageblock_order. The current handling in alloc_contig_range() is far from optimal as we have to isolate a whole MAX_ORDER - 1 page -- and on ZONE_NORMAL we'll fail easily if any part contains something unmovable although we don't even want to allocate that part. I actually have that on my list (to be able to fully support pageblock_order instead of MAX_ORDER -1 chunks in virtio-mem), however didn't have time to look into it. > > So in your mind, for gigantic page allocation (> MAX_ORDER), alloc_contig_range() > should be used instead of buddy allocator while pageblock_order is kept at a small > granularity like 2MB. Is that the case? Isn’t it going to have high fail rate > when any of the pageblocks within a gigantic page range (like 1GB) becomes unmovable? > Are you thinking additional mechanism/policy to prevent such thing happening as > an additional step for gigantic page allocation? Like your ZONE_PREFER_MOVABLE idea? > I am not fully sure yet where the journey will go , I guess nobody knows. Ultimately, having buddy support for >= current MAX_ORDER (IOW, increasing MAX_ORDER) will most probably happen, so it would be worth investigating what has to be done to get that running as a first step. Of course, we could temporarily think about wiring it up in the buddy like if (order < MAX_ORDER) __alloc_pages()... else alloc_contig_pages() but it doesn't really improve the situation IMHO, just an API change. So I think we should look into increasing MAX_ORDER, seeing what needs to be done to have that part running while keeping the section size and the pageblock order as is. I know that at least memory onlining/offlining, cma, alloc_contig_range(), ... needs tweaking, especially when we don't increase the section size (but also if we would due to the way page isolation is currently handled). Having a MAX_ORDER -1 page being partially in different nodes might be another thing to look into (I heard that it can already happen right now, but I don't remember the details). The next step after that would then be better fragmentation avoidance for larger granularity like 1G THP. >> >> Further, page onlining / offlining code and early init code most probably also needs care if MAX_ORDER - 1 crosses sections. Memory holes we might suddenly have in MAX_ORDER - 1 pages might become a problem and will have to be handled. Not sure which other code has to be tweaked (compaction? page isolation?). > > Can you elaborate it a little more? From what I understand, memory holes mean valid > PFNs are not contiguous before and after a hole, so pfn++ will not work, but > struct pages are still virtually contiguous assuming SPARSE_VMEMMAP, meaning page++ > would still work. So when MAX_ORDER - 1 crosses sections, additional code would be > needed instead of simple pfn++. Is there anything I am missing? I think there are two cases when talking about MAX_ORDER and memory holes: 1. Hole with a valid memmap: the memmap is initialize to PageReserved() and the pages are not given to the buddy. pfn_valid() and pfn_to_page() works as expected. 2. Hole without a valid memmam: we have that CONFIG_HOLES_IN_ZONE thing already, see include/linux/mmzone.h. pfn_valid_within() checks are required. Doesn't win a beauty contest, but gets the job done in existing setups that seem to care. 
"If it is possible to have holes within a MAX_ORDER_NR_PAGES, then we need to check pfn validity within that MAX_ORDER_NR_PAGES block. pfn_valid_within() should be used in this case; we optimise this away when we have no holes within a MAX_ORDER_NR_PAGES block." CONFIG_HOLES_IN_ZONE is just a bad name for this. (increasing the section size implies that we waste more memory for the memmap in holes. increasing MAX_ORDER means that we might have to deal with holes within MAX_ORDER chunks) We don't have too many pfn_valid_within() checks. I wonder if we could add something that is optimized for "holes are a power of two and properly aligned", because pfn_valid_within() right not deals with holes of any kind which makes it somewhat inefficient IIRC. > > BTW, to test a system with memory holes, do you know is there an easy of adding > random memory holes to an x86_64 VM, which can help reveal potential missing pieces > in the code? Changing BIOS-e820 table might be one way, but I have no idea on > how to do it on QEMU. It might not be very easy that way. But I heard that some arm64 systems have crazy memory layouts -- maybe there, it's easier to get something nasty running? :) https://lkml.kernel.org/r/YJpEwF2cGjS5mKma@kernel.org I remember there was a way to define the e820 completely on kernel cmdline, but I might be wrong ...
On 10 May 2021, at 10:36, Zi Yan wrote: > On 7 May 2021, at 10:00, David Hildenbrand wrote: > >> On 07.05.21 13:55, Michal Hocko wrote: >>> [I haven't read through respective patches due to lack of time but let >>> me comment on the general idea and the underlying justification] >>> >>> On Thu 06-05-21 17:31:09, David Hildenbrand wrote: >>>> On 06.05.21 17:26, Zi Yan wrote: >>>>> From: Zi Yan <ziy@nvidia.com> >>>>> >>>>> Hi all, >>>>> >>>>> This patchset tries to remove the restriction on memory hotplug/hotremove >>>>> granularity, which is always greater or equal to memory section size[1]. >>>>> With the patchset, kernel is able to online/offline memory at a size independent >>>>> of memory section size, as small as 2MB (the subsection size). >>>> >>>> ... which doesn't make any sense as we can only online/offline whole memory >>>> block devices. >>> >>> Agreed. The subsection thingy is just a hack to workaround pmem >>> alignement problems. For the real memory hotplug it is quite hard to >>> argue for reasonable hotplug scenarios for very small physical memory >>> ranges wrt. to the existing sparsemem memory model. >>> >>>>> The motivation is to increase MAX_ORDER of the buddy allocator and pageblock >>>>> size without increasing memory hotplug/hotremove granularity at the same time, >>>> >>>> Gah, no. Please no. No. >>> >>> Agreed. Those are completely independent concepts. MAX_ORDER is can be >>> really arbitrary irrespective of the section size with vmemmap sparse >>> model. The existing restriction is due to old sparse model not being >>> able to do page pointer arithmetic across memory sections. Is there any >>> reason to stick with that memory model for an advance feature you are >>> working on? > > No. I just want to increase MAX_ORDER. If the existing restriction can > be removed, that will be great. > >> >> I gave it some more thought yesterday. I guess the first thing we should look into is increasing MAX_ORDER and leaving pageblock_order and section size as is -- finding out what we have to tweak to get that up and running. Once we have that in place, we can actually look into better fragmentation avoidance etc. One step at a time. > > It makes sense to me. > >> >> Because that change itself might require some thought. Requiring that bigger MAX_ORDER depends on SPARSE_VMEMMAP is something reasonable to do. > > OK, if with SPARSE_VMEMMAP MAX_ORDER can be set to be bigger than > SECTION_SIZE, it is perfectly OK to me. Since 1GB THP support, which I > want to add ultimately, will require SPARSE_VMEMMAP too (otherwise, > all page++ will need to be changed to nth_page(page,1)). > >> >> As stated somewhere here already, we'll have to look into making alloc_contig_range() (and main users CMA and virtio-mem) independent of MAX_ORDER and mainly rely on pageblock_order. The current handling in alloc_contig_range() is far from optimal as we have to isolate a whole MAX_ORDER - 1 page -- and on ZONE_NORMAL we'll fail easily if any part contains something unmovable although we don't even want to allocate that part. I actually have that on my list (to be able to fully support pageblock_order instead of MAX_ORDER -1 chunks in virtio-mem), however didn't have time to look into it. > > So in your mind, for gigantic page allocation (> MAX_ORDER), alloc_contig_range() > should be used instead of buddy allocator while pageblock_order is kept at a small > granularity like 2MB. Is that the case? 
> Isn’t it going to have high fail rate when any of the pageblocks within a
> gigantic page range (like 1GB) becomes unmovable? Are you thinking additional
> mechanism/policy to prevent such thing happening as an additional step for
> gigantic page allocation? Like your ZONE_PREFER_MOVABLE idea?
>
>> Further, page onlining / offlining code and early init code most probably
>> also needs care if MAX_ORDER - 1 crosses sections. Memory holes we might
>> suddenly have in MAX_ORDER - 1 pages might become a problem and will have to
>> be handled. Not sure which other code has to be tweaked (compaction? page
>> isolation?).
>
> Can you elaborate it a little more? From what I understand, memory holes mean
> valid PFNs are not contiguous before and after a hole, so pfn++ will not work,
> but struct pages are still virtually contiguous assuming SPARSE_VMEMMAP,
> meaning page++ would still work. So when MAX_ORDER - 1 crosses sections,
> additional code would be needed instead of simple pfn++. Is there anything I
> am missing?
>
> BTW, to test a system with memory holes, do you know is there an easy of
> adding random memory holes to an x86_64 VM, which can help reveal potential
> missing pieces in the code? Changing BIOS-e820 table might be one way, but I
> have no idea on how to do it on QEMU.
>
>> Figuring out what needs care itself might take quite some effort.
>>
>> One thing I was thinking about as well: The bigger our MAX_ORDER, the slower
>> it could be to allocate smaller pages. If we have 1G pages, splitting them
>> down to 4k then takes 8 additional steps if I'm, not wrong. Of course, that's
>> the worst case. Would be interesting to evaluate.
>
> Sure. I am planning to check it too. As a simple start, I am going to run
> will it scale benchmarks to see if there is any performance difference
> between different MAX_ORDERs.

I ran vm-scalability and memory-related will-it-scale on a server with 256GB
memory to see the impact of increasing MAX_ORDER and didn’t see much difference
for most of the workloads like page_fault1, page_fault2, and page_fault3 from
will-it-scale. But feel free to check the attached complete results and let me
know what should be looked into.

Thanks.

# Environment

Dell R630 with 2x 12-core E5-2650 v4 and 256GB memory.

# Kernel changes

On top of v5.13-rc1-mmotm-2021-05-13-17-18, with SECTION_SIZE_BITS set to 31
and MAX_ORDER set to 11 and 20 respectively.
# Results of page_fault1, page_fault2, and page_fault3

All runs: gcc-10 / defconfig / debian rootfs / dellr630 / will-it-scale.
"workload" is will-it-scale.workload (identical to the .N.threads or
.N.processes counter; per-task ops is workload divided by the task count and
changes by the same percentage). "idle" is the threads_idle / processes_idle
percentage. Columns compare 5.13.0-rc1-mm1-max-order-11+ against
5.13.0-rc1-mm1-max-order-20+.

mode     nr_task    testcase     workload (order-11)  %change  workload (order-20)  idle (order-11)  %change  idle (order-20)
thread   50% (24)   page_fault3   3199850 ± 2%         +6.0%    3390866 ± 3%         54.94            +1.7%    55.85
thread   50% (24)   page_fault2   2016984              -6.6%    1883075 ± 2%         69.69            -4.4%    66.64
thread   50% (24)   page_fault1   2138067              -1.3%    2109865              63.34            +1.1%    64.06
thread   16         page_fault3   3216287 ± 3%         +4.8%    3369356 ± 10%        69.18            +0.5%    69.51
thread   16         page_fault2   2005510              -2.7%    1950620 ± 2%         78.77            -0.2%    78.64
thread   16         page_fault1   2332446              -6.5%    2179823 ± 2%         77.57            -2.0%    76.03
thread   100% (48)  page_fault3   3236057 ± 2%         -4.5%    3089222 ± 4%         24.64 ± 7%       -3.3%    23.83 ± 2%
thread   100% (48)  page_fault2   1611363              -0.1%    1609891              47.42 ± 2%       +1.2%    48.01
thread   100% (48)  page_fault1   1776494 ± 3%         -2.6%    1730693              43.36 ± 4%       +0.5%    43.57 ± 2%
process  50% (24)   page_fault3  15235214              -0.3%   15185167              49.63            -0.4%    49.45
process  50% (24)   page_fault2   6700813              -0.6%    6662570              49.17            +0.0%    49.18
process  50% (24)   page_fault1   8052059              -1.2%    7952172              49.48            -0.4%    49.29
process  16         page_fault3  10152559              +0.7%   10221240              66.10            -0.0%    66.09
process  16         page_fault2   4621434              -1.0%    4576959              66.14            -0.2%    65.98
process  16         page_fault1   5546153              -1.3%    5474778              66.02            -0.1%    65.98
process  100% (48)  page_fault3  20575719              +0.4%   20651992               0.06            +5.6%     0.06 ± 7%
process  100% (48)  page_fault2   6984071              -1.1%    6906022               0.07            +4.8%     0.07 ± 6%

=========================================================================================
compiler/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-10/defconfig/process/100%/debian/dellr630/page_fault1/will-it-scale
commit:
  5.13.0-rc1-mm1-max-order-11+
5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 7527654 -1.7% 7399284 will-it-scale.48.processes 0.07 +0.0% 0.07 will-it-scale.48.processes_idle 156826 -1.7% 154151 will-it-scale.per_process_ops 7527654 -1.7% 7399284 will-it-scale.workload — Best Regards, Yan, Zi ========================================================================================= compiler/kconfig/rootfs/runtime/tbox_group/test/testcase: gcc-10/defconfig/debian/300s/dellr630/small-allocs/vm-scalability commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 357824 +1.5% 363199 vm-scalability.median 8.33 ± 16% -0.0 8.29 ± 5% vm-scalability.stddev% 17180066 +1.5% 17435528 vm-scalability.throughput 4.832e+09 +0.0% 4.832e+09 vm-scalability.workload ========================================================================================= compiler/kconfig/rootfs/runtime/tbox_group/test/testcase: gcc-10/defconfig/debian/300s/dellr630/small-allocs-mt/vm-scalability commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 20010 ± 2% +2.8% 20573 ± 2% vm-scalability.median 2.52 ± 15% -0.6 1.91 ± 14% vm-scalability.stddev% 960542 ± 2% +2.8% 987552 ± 2% vm-scalability.throughput 2.886e+08 ± 2% +2.8% 2.967e+08 ± 2% vm-scalability.workload ========================================================================================= compiler/kconfig/rootfs/runtime/tbox_group/test/testcase: gcc-10/defconfig/debian/300s/dellr630/mremap-xread-rand-mt/vm-scalability commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 105253 +0.2% 105515 vm-scalability.median 8.48 ± 3% +0.3 8.77 vm-scalability.stddev% 5053394 +0.2% 5065570 vm-scalability.throughput 1.52e+09 +0.2% 1.523e+09 vm-scalability.workload ========================================================================================= compiler/kconfig/rootfs/runtime/tbox_group/test/testcase: gcc-10/defconfig/debian/300s/dellr630/mmap-xread-seq-mt/vm-scalability commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 1406301 -0.1% 1404676 vm-scalability.median 0.03 ± 33% -0.0 0.02 ± 23% vm-scalability.stddev% 67502299 -0.1% 67424356 vm-scalability.throughput 2.031e+10 -0.3% 2.025e+10 vm-scalability.workload ========================================================================================= compiler/kconfig/rootfs/runtime/tbox_group/test/testcase: gcc-10/defconfig/debian/300s/dellr630/mmap-xread-rand-mt/vm-scalability commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 68136 +0.1% 68179 vm-scalability.median 13.73 ± 4% +0.2 13.94 vm-scalability.stddev% 3271260 +0.1% 3273712 vm-scalability.throughput 9.834e+08 -0.0% 9.834e+08 vm-scalability.workload ========================================================================================= compiler/kconfig/rootfs/runtime/tbox_group/test/testcase: 
gcc-10/defconfig/debian/300s/dellr630/mmap-pread-seq/vm-scalability commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 78.13 -16.2% 65.44 vm-scalability.free_time 1209399 -1.0% 1197800 vm-scalability.median 1.05 ±140% -1.0 0.00 ± 23% vm-scalability.stddev% 58047054 -1.0% 57494408 vm-scalability.throughput 4.832e+09 +0.0% 4.832e+09 vm-scalability.workload ========================================================================================= compiler/kconfig/rootfs/runtime/tbox_group/test/testcase: gcc-10/defconfig/debian/300s/dellr630/mmap-pread-seq-mt/vm-scalability commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 54.47 ± 10% +12.1% 61.04 ± 35% vm-scalability.free_time 1040180 -1.5% 1024808 ± 2% vm-scalability.median 273.39 ± 82% +140.2 413.55 ± 63% vm-scalability.stddev% 48474981 ± 4% -2.1% 47462936 ± 5% vm-scalability.throughput 1.541e+10 -0.3% 1.537e+10 vm-scalability.workload ========================================================================================= compiler/kconfig/rootfs/runtime/tbox_group/test/testcase: gcc-10/defconfig/debian/300s/dellr630/mmap-pread-rand/vm-scalability commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 78.92 -16.3% 66.08 vm-scalability.free_time 67291 +0.1% 67339 vm-scalability.median 36.43 ± 61% +6.4 42.83 ± 54% vm-scalability.stddev% 3226783 -0.0% 3226730 vm-scalability.throughput 9.703e+08 -0.1% 9.697e+08 vm-scalability.workload ========================================================================================= compiler/kconfig/rootfs/runtime/tbox_group/test/testcase: gcc-10/defconfig/debian/300s/dellr630/mmap-pread-rand-mt/vm-scalability commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 42.14 ± 3% +4.2% 43.92 vm-scalability.free_time 66589 +0.4% 66850 vm-scalability.median 89.06 ± 8% -14.8 74.29 ± 29% vm-scalability.stddev% 3188137 +0.2% 3195092 vm-scalability.throughput 9.573e+08 +0.3% 9.597e+08 vm-scalability.workload ========================================================================================= compiler/kconfig/rootfs/runtime/tbox_group/test/testcase: gcc-10/defconfig/debian/300s/dellr630/lru-file-readtwice/vm-scalability commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 144610 +2.6% 148318 vm-scalability.median 650.58 ± 22% -273.4 377.19 ± 7% vm-scalability.stddev% 14175414 +1.5% 14382055 vm-scalability.throughput 4.253e+09 +1.5% 4.315e+09 vm-scalability.workload ========================================================================================= compiler/kconfig/rootfs/runtime/tbox_group/test/testcase: gcc-10/defconfig/debian/300s/dellr630/lru-file-readonce/vm-scalability commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 430327 -0.7% 427117 vm-scalability.median 27.99 ± 11% +3.3 31.25 ± 21% vm-scalability.stddev% 
20676239 -0.7% 20537224 vm-scalability.throughput 4.295e+09 +0.0% 4.295e+09 vm-scalability.workload ========================================================================================= compiler/kconfig/rootfs/runtime/tbox_group/test/testcase: gcc-10/defconfig/debian/300s/dellr630/lru-file-mmap-read/vm-scalability commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 1.73 -5.9% 1.63 ± 6% vm-scalability.free_time 284645 +1.4% 288596 vm-scalability.median 24.72 ± 12% -1.4 23.34 ± 10% vm-scalability.stddev% 13663327 +1.4% 13853326 vm-scalability.throughput 4.107e+09 +1.4% 4.163e+09 vm-scalability.workload ========================================================================================= compiler/kconfig/rootfs/runtime/tbox_group/test/testcase: gcc-10/defconfig/debian/300s/dellr630/lru-file-mmap-read-rand/vm-scalability commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 2.63 ± 52% +0.7% 2.65 ± 51% vm-scalability.free_time 35249 -1.1% 34877 vm-scalability.median 70.25 ± 47% -9.6 60.65 ± 46% vm-scalability.stddev% 1701952 -1.3% 1680521 vm-scalability.throughput 5.372e+08 +0.0% 5.372e+08 vm-scalability.workload ========================================================================================= compiler/kconfig/rootfs/runtime/tbox_group/test/testcase: gcc-10/defconfig/debian/300s/dellr630/anon-rx-seq-mt/vm-scalability commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 2071445 +0.0% 2072475 vm-scalability.median 99429376 +0.0% 99478816 vm-scalability.throughput 2.989e+10 -0.1% 2.986e+10 vm-scalability.workload ========================================================================================= compiler/kconfig/rootfs/runtime/tbox_group/test/testcase: gcc-10/defconfig/debian/300s/dellr630/anon-rx-rand-mt/vm-scalability commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 94892 +0.2% 95062 vm-scalability.median 42.22 ± 39% +9.0 51.25 ± 20% vm-scalability.stddev% 4535789 +0.2% 4545805 vm-scalability.throughput 1.364e+09 +0.1% 1.366e+09 vm-scalability.workload ========================================================================================= compiler/kconfig/rootfs/runtime/tbox_group/test/testcase: gcc-10/defconfig/debian/300s/dellr630/anon-r-seq/vm-scalability commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 5.48 -4.7% 5.22 ± 2% vm-scalability.free_time 1469914 -0.0% 1469383 vm-scalability.median 6.76 ± 22% -0.4 6.34 ± 17% vm-scalability.stddev% 70532973 -0.0% 70502407 vm-scalability.throughput 2.12e+10 +0.0% 2.12e+10 vm-scalability.workload ========================================================================================= compiler/kconfig/rootfs/runtime/tbox_group/test/testcase: gcc-10/defconfig/debian/300s/dellr630/anon-r-seq-mt/vm-scalability commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev 
%change %stddev \ | \ 2.46 ± 13% -10.1% 2.21 ± 12% vm-scalability.free_time 980150 -2.6% 954707 ± 2% vm-scalability.median 213.08 ± 19% +109.3 322.40 ± 14% vm-scalability.stddev% 47055326 -3.4% 45456419 vm-scalability.throughput 1.459e+10 -3.7% 1.405e+10 ± 2% vm-scalability.workload ========================================================================================= compiler/kconfig/rootfs/runtime/tbox_group/test/testcase: gcc-10/defconfig/debian/300s/dellr630/anon-r-rand/vm-scalability commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 3.15 -0.1% 3.15 vm-scalability.free_time 90700 +0.3% 90950 vm-scalability.median 90.49 ± 32% +59.7 150.15 ± 19% vm-scalability.stddev% 4336824 +0.0% 4337090 vm-scalability.throughput 1.305e+09 -0.1% 1.303e+09 vm-scalability.workload ========================================================================================= compiler/kconfig/rootfs/runtime/tbox_group/test/testcase: gcc-10/defconfig/debian/300s/dellr630/anon-r-rand-mt/vm-scalability commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 8.08 ± 2% -0.3% 8.06 ± 2% vm-scalability.free_time 54716 +0.9% 55195 ± 2% vm-scalability.median 256.82 ± 7% -21.9 234.93 ± 14% vm-scalability.stddev% 2708434 +0.8% 2729708 ± 3% vm-scalability.throughput 8.147e+08 +0.7% 8.205e+08 ± 3% vm-scalability.workload ========================================================================================= compiler/kconfig/rootfs/runtime/size/tbox_group/test/testcase: gcc-10/defconfig/debian/300s/8T/dellr630/anon-wx-seq-mt/vm-scalability commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 262182 ± 2% -7.2% 243353 ± 4% vm-scalability.median 12584768 ± 2% -7.2% 11680976 ± 4% vm-scalability.throughput 3.781e+09 ± 2% -7.2% 3.509e+09 ± 4% vm-scalability.workload ========================================================================================= compiler/kconfig/rootfs/runtime/size/tbox_group/test/testcase: gcc-10/defconfig/debian/300s/8T/dellr630/anon-w-seq/vm-scalability commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 1.82 -0.2% 1.82 vm-scalability.free_time 671396 +0.2% 672973 vm-scalability.median 0.18 ± 8% +0.0 0.22 ± 22% vm-scalability.median_stddev% 0.24 ± 4% +0.0 0.26 ± 12% vm-scalability.stddev% 32200410 +0.2% 32276241 vm-scalability.throughput 6.945e+09 -0.0% 6.945e+09 vm-scalability.workload ========================================================================================= compiler/kconfig/rootfs/runtime/size/tbox_group/test/testcase: gcc-10/defconfig/debian/300s/8T/dellr630/anon-w-seq-mt/vm-scalability commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 0.29 +2.2% 0.29 vm-scalability.free_time 280375 +0.3% 281262 vm-scalability.median 4.07 ± 7% +0.5 4.58 ± 19% vm-scalability.median_stddev% 3.97 ± 8% +0.7 4.67 ± 20% vm-scalability.stddev% 13856067 +0.4% 13917746 vm-scalability.throughput 2.955e+09 -1.7% 2.906e+09 ± 2% vm-scalability.workload 
========================================================================================= compiler/kconfig/rootfs/runtime/size/tbox_group/test/testcase: gcc-10/defconfig/debian/300s/8T/dellr630/anon-cow-seq/vm-scalability commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 394969 -0.5% 392986 ± 3% vm-scalability.median 0.98 ± 5% -0.3 0.69 ± 18% vm-scalability.median_stddev% 1.02 ± 4% -0.3 0.69 ± 19% vm-scalability.stddev% 18910179 -0.5% 18811746 ± 3% vm-scalability.throughput 4.536e+09 -0.0% 4.536e+09 ± 3% vm-scalability.workload ========================================================================================= compiler/kconfig/rootfs/runtime/size/tbox_group/test/testcase: gcc-10/defconfig/debian/300s/8T/dellr630/anon-cow-seq-mt/vm-scalability commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 80651 -1.2% 79710 vm-scalability.median 3871280 -1.2% 3826080 vm-scalability.throughput 1.163e+09 -1.1% 1.15e+09 vm-scalability.workload ========================================================================================= compiler/kconfig/rootfs/runtime/size/tbox_group/test/testcase: gcc-10/defconfig/debian/300s/512G/dellr630/anon-wx-rand-mt/vm-scalability commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 61777 +0.5% 62081 vm-scalability.median 0.86 ± 28% -0.3 0.56 ± 12% vm-scalability.median_stddev% 0.46 ± 82% -0.2 0.22 ± 17% vm-scalability.stddev% 2947653 +0.1% 2949945 vm-scalability.throughput 8.866e+08 -0.0% 8.866e+08 vm-scalability.workload ========================================================================================= compiler/kconfig/rootfs/runtime/size/tbox_group/test/testcase: gcc-10/defconfig/debian/300s/512G/dellr630/anon-w-rand/vm-scalability commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 2.08 -7.2% 1.93 ± 6% vm-scalability.free_time 62997 +0.0% 63014 vm-scalability.median 0.84 ± 38% -0.3 0.56 ± 35% vm-scalability.median_stddev% 1.32 ± 31% -0.2 1.09 ± 47% vm-scalability.stddev% 3003429 -0.2% 2997175 vm-scalability.throughput 8.866e+08 -0.0% 8.866e+08 vm-scalability.workload ========================================================================================= compiler/kconfig/rootfs/runtime/size/tbox_group/test/testcase: gcc-10/defconfig/debian/300s/512G/dellr630/anon-w-rand-mt/vm-scalability commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 4.38 -12.9% 3.82 ± 9% vm-scalability.free_time 56381 +0.1% 56460 vm-scalability.median 1.13 ± 25% -0.4 0.71 ± 20% vm-scalability.median_stddev% 1.40 ± 33% -0.4 0.99 ± 12% vm-scalability.stddev% 2692710 +0.1% 2695449 vm-scalability.throughput 7.389e+08 -0.0% 7.389e+08 vm-scalability.workload ========================================================================================= compiler/kconfig/rootfs/runtime/size/tbox_group/test/testcase: gcc-10/defconfig/debian/300s/512G/dellr630/anon-cow-rand/vm-scalability commit: 5.13.0-rc1-mm1-max-order-11+ 
5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 53787 -1.1% 53191 vm-scalability.median 1.87 ±115% +1.1 2.99 ± 48% vm-scalability.median_stddev% 0.38 ± 54% -0.0 0.36 ± 29% vm-scalability.stddev% 2437373 -0.2% 2432934 vm-scalability.throughput 7.238e+08 -0.0% 7.238e+08 vm-scalability.workload ========================================================================================= compiler/kconfig/rootfs/runtime/size/tbox_group/test/testcase: gcc-10/defconfig/debian/300s/512G/dellr630/anon-cow-rand-mt/vm-scalability commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 68313 -0.3% 68094 vm-scalability.median 197.31 ± 3% -0.4 196.89 vm-scalability.stddev% 3190330 +0.2% 3195172 vm-scalability.throughput 9.588e+08 +0.1% 9.598e+08 vm-scalability.workload ========================================================================================= compiler/kconfig/rootfs/runtime/size/tbox_group/test/testcase: gcc-10/defconfig/debian/300s/2T/dellr630/shm-xread-seq/vm-scalability commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 318628 -0.6% 316839 vm-scalability.median 0.00 +0.0 0.00 ±141% vm-scalability.stddev% 15294160 -0.6% 15208286 vm-scalability.throughput 4.593e+09 -0.6% 4.565e+09 vm-scalability.workload ========================================================================================= compiler/kconfig/rootfs/runtime/size/tbox_group/test/testcase: gcc-10/defconfig/debian/300s/2T/dellr630/shm-xread-seq-mt/vm-scalability commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 224701 -0.1% 224411 vm-scalability.median 10785648 -0.1% 10771744 vm-scalability.throughput 3.24e+09 -0.1% 3.237e+09 vm-scalability.workload ========================================================================================= compiler/kconfig/rootfs/runtime/size/tbox_group/test/testcase: gcc-10/defconfig/debian/300s/2T/dellr630/shm-pread-seq/vm-scalability commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 44.80 -8.0% 41.20 vm-scalability.free_time 312877 +1.1% 316454 vm-scalability.median 0.00 ±141% -0.0 0.00 ±141% vm-scalability.stddev% 15018126 +1.1% 15189806 vm-scalability.throughput 4.513e+09 +1.2% 4.569e+09 vm-scalability.workload ========================================================================================= compiler/kconfig/rootfs/runtime/size/tbox_group/test/testcase: gcc-10/defconfig/debian/300s/2T/dellr630/shm-pread-seq-mt/vm-scalability commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 31.66 ± 32% +7.2% 33.93 ± 33% vm-scalability.free_time 339602 +0.2% 340334 vm-scalability.median 0.00 ±141% +0.0 0.00 ±141% vm-scalability.stddev% 16300930 +0.2% 16336068 vm-scalability.throughput 4.898e+09 +0.1% 4.904e+09 vm-scalability.workload ========================================================================================= 
compiler/kconfig/rootfs/runtime/size/tbox_group/test/testcase: gcc-10/defconfig/debian/300s/256G/dellr630/msync/vm-scalability commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 41194 ± 5% -3.3% 39825 ± 6% vm-scalability.median 19.45 ± 11% -2.4 17.02 ± 23% vm-scalability.median_stddev% 19.35 ± 15% -3.5 15.83 ± 26% vm-scalability.stddev% 2065321 ± 4% -3.7% 1988720 ± 6% vm-scalability.throughput 5.418e+08 ± 12% -9.1% 4.926e+08 ± 14% vm-scalability.workload ========================================================================================= compiler/kconfig/rootfs/runtime/size/tbox_group/test/testcase: gcc-10/defconfig/debian/300s/256G/dellr630/msync-mt/vm-scalability commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 145547 +2.3% 148921 ± 2% vm-scalability.median 0.01 ±141% -0.0 0.00 ±141% vm-scalability.stddev% 6986266 +2.3% 7148204 ± 2% vm-scalability.throughput 4.557e+09 -0.4% 4.539e+09 vm-scalability.workload ========================================================================================= compiler/kconfig/rootfs/runtime/size/tbox_group/test/testcase: gcc-10/defconfig/debian/300s/256G/dellr630/lru-shm-rand/vm-scalability commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 0.14 ± 2% -0.7% 0.14 ± 2% vm-scalability.free_time 48663 +0.2% 48778 vm-scalability.median 0.65 ± 32% -0.1 0.57 ± 73% vm-scalability.median_stddev% 0.66 ± 34% +0.3 0.92 ± 58% vm-scalability.stddev% 2333942 +0.0% 2334978 vm-scalability.throughput 5.911e+08 -0.0% 5.911e+08 vm-scalability.workload ========================================================================================= compiler/kconfig/rootfs/runtime/size/tbox_group/test/testcase: gcc-10/defconfig/debian/300s/1T/dellr630/lru-shm/vm-scalability commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 0.09 -0.5% 0.09 vm-scalability.free_time 589821 -0.3% 588261 vm-scalability.median 0.19 ± 11% +0.0 0.24 ± 16% vm-scalability.median_stddev% 0.21 ± 18% +0.0 0.24 ± 10% vm-scalability.stddev% 28276616 -0.3% 28192741 vm-scalability.throughput 2.364e+09 -2.1% 2.315e+09 ± 3% vm-scalability.workload ========================================================================================= compiler/kconfig/rootfs/runtime/size/tbox_group/test/testcase: gcc-10/defconfig/debian/300s/16G/dellr630/shm-xread-rand/vm-scalability commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 80388 +0.9% 81089 vm-scalability.median 53.48 ± 26% +15.3 68.78 vm-scalability.stddev% 3856313 +0.7% 3881571 vm-scalability.throughput 1.159e+09 +0.7% 1.167e+09 vm-scalability.workload ========================================================================================= compiler/kconfig/rootfs/runtime/size/tbox_group/test/testcase/unit_size: gcc-10/defconfig/debian/300s/16G/dellr630/shm-xread-rand-mt/vm-scalability/1G commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 
---------------- --------------------------- %stddev %change %stddev \ | \ 83938 -0.1% 83823 vm-scalability.median 10.55 ± 4% +0.3 10.90 ± 4% vm-scalability.stddev% 4030473 -0.2% 4024362 vm-scalability.throughput 1.212e+09 -0.2% 1.209e+09 vm-scalability.workload ========================================================================================= compiler/kconfig/rootfs/runtime/size/tbox_group/test/testcase: gcc-10/defconfig/debian/300s/16G/dellr630/shm-pread-rand/vm-scalability commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 64.56 -7.2% 59.91 vm-scalability.free_time 80319 +0.5% 80758 vm-scalability.median 35.63 ± 16% +11.3 46.92 ± 12% vm-scalability.stddev% 3853520 +0.7% 3880131 vm-scalability.throughput 1.158e+09 +0.7% 1.166e+09 vm-scalability.workload ========================================================================================= compiler/kconfig/rootfs/runtime/size/tbox_group/test/testcase/unit_size: gcc-10/defconfig/debian/300s/16G/dellr630/shm-pread-rand-mt/vm-scalability/1G commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 0.32 ± 9% -21.3% 0.25 ± 7% vm-scalability.free_time 46429 +0.7% 46754 vm-scalability.median 0.49 ± 22% +0.2 0.72 ± 19% vm-scalability.median_stddev% 0.51 ± 16% +0.2 0.71 ± 22% vm-scalability.stddev% 2224755 +0.7% 2240920 vm-scalability.throughput 6.795e+08 +0.0% 6.795e+08 vm-scalability.workload ========================================================================================= compiler/kconfig/rootfs/runtime/size/tbox_group/test/testcase: gcc-10/defconfig/debian/300s/128G/dellr630/truncate/vm-scalability commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 1.055e+10 ± 2% +2.7% 1.083e+10 ± 5% vm-scalability.median 4.25 ±125% -0.7 3.54 ± 58% vm-scalability.median_stddev% 4.25 ±125% -0.7 3.54 ± 58% vm-scalability.stddev% 1.055e+10 ± 2% +2.7% 1.083e+10 ± 5% vm-scalability.throughput 5.254e+08 -0.0% 5.254e+08 vm-scalability.workload ========================================================================================= compiler/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase: gcc-10/defconfig/thread/50%/debian/dellr630/page_fault3/will-it-scale commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 3199850 ± 2% +6.0% 3390866 ± 3% will-it-scale.24.threads 54.94 +1.7% 55.85 will-it-scale.24.threads_idle 133326 ± 2% +6.0% 141285 ± 3% will-it-scale.per_thread_ops 3199850 ± 2% +6.0% 3390866 ± 3% will-it-scale.workload ========================================================================================= compiler/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase: gcc-10/defconfig/thread/50%/debian/dellr630/page_fault2/will-it-scale commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 2016984 -6.6% 1883075 ± 2% will-it-scale.24.threads 69.69 -4.4% 66.64 will-it-scale.24.threads_idle 84040 -6.6% 78461 ± 2% will-it-scale.per_thread_ops 2016984 -6.6% 1883075 ± 2% will-it-scale.workload 
========================================================================================= compiler/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase: gcc-10/defconfig/thread/50%/debian/dellr630/page_fault1/will-it-scale commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 2138067 -1.3% 2109865 will-it-scale.24.threads 63.34 +1.1% 64.06 will-it-scale.24.threads_idle 89085 -1.3% 87910 will-it-scale.per_thread_ops 2138067 -1.3% 2109865 will-it-scale.workload ========================================================================================= compiler/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase: gcc-10/defconfig/thread/16/debian/dellr630/page_fault3/will-it-scale commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 3216287 ± 3% +4.8% 3369356 ± 10% will-it-scale.16.threads 69.18 +0.5% 69.51 will-it-scale.16.threads_idle 201017 ± 3% +4.8% 210584 ± 10% will-it-scale.per_thread_ops 3216287 ± 3% +4.8% 3369356 ± 10% will-it-scale.workload ========================================================================================= compiler/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase: gcc-10/defconfig/thread/16/debian/dellr630/page_fault2/will-it-scale commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 2005510 -2.7% 1950620 ± 2% will-it-scale.16.threads 78.77 -0.2% 78.64 will-it-scale.16.threads_idle 125344 -2.7% 121913 ± 2% will-it-scale.per_thread_ops 2005510 -2.7% 1950620 ± 2% will-it-scale.workload ========================================================================================= compiler/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase: gcc-10/defconfig/thread/16/debian/dellr630/page_fault1/will-it-scale commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 2332446 -6.5% 2179823 ± 2% will-it-scale.16.threads 77.57 -2.0% 76.03 will-it-scale.16.threads_idle 145777 -6.5% 136238 ± 2% will-it-scale.per_thread_ops 2332446 -6.5% 2179823 ± 2% will-it-scale.workload ========================================================================================= compiler/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase: gcc-10/defconfig/thread/100%/debian/dellr630/page_fault3/will-it-scale commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 3236057 ± 2% -4.5% 3089222 ± 4% will-it-scale.48.threads 24.64 ± 7% -3.3% 23.83 ± 2% will-it-scale.48.threads_idle 67417 ± 2% -4.5% 64358 ± 4% will-it-scale.per_thread_ops 3236057 ± 2% -4.5% 3089222 ± 4% will-it-scale.workload ========================================================================================= compiler/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase: gcc-10/defconfig/thread/100%/debian/dellr630/page_fault2/will-it-scale commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 1611363 -0.1% 1609891 will-it-scale.48.threads 47.42 ± 2% 
+1.2% 48.01 will-it-scale.48.threads_idle 33569 -0.1% 33539 will-it-scale.per_thread_ops 1611363 -0.1% 1609891 will-it-scale.workload ========================================================================================= compiler/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase: gcc-10/defconfig/thread/100%/debian/dellr630/page_fault1/will-it-scale commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 1776494 ± 3% -2.6% 1730693 will-it-scale.48.threads 43.36 ± 4% +0.5% 43.57 ± 2% will-it-scale.48.threads_idle 37010 ± 3% -2.6% 36055 will-it-scale.per_thread_ops 1776494 ± 3% -2.6% 1730693 will-it-scale.workload ========================================================================================= compiler/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase: gcc-10/defconfig/process/50%/debian/dellr630/page_fault3/will-it-scale commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 15235214 -0.3% 15185167 will-it-scale.24.processes 49.63 -0.4% 49.45 will-it-scale.24.processes_idle 634800 -0.3% 632715 will-it-scale.per_process_ops 15235214 -0.3% 15185167 will-it-scale.workload ========================================================================================= compiler/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase: gcc-10/defconfig/process/50%/debian/dellr630/page_fault2/will-it-scale commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 6700813 -0.6% 6662570 will-it-scale.24.processes 49.17 +0.0% 49.18 will-it-scale.24.processes_idle 279200 -0.6% 277606 will-it-scale.per_process_ops 6700813 -0.6% 6662570 will-it-scale.workload ========================================================================================= compiler/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase: gcc-10/defconfig/process/50%/debian/dellr630/page_fault1/will-it-scale commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 8052059 -1.2% 7952172 will-it-scale.24.processes 49.48 -0.4% 49.29 will-it-scale.24.processes_idle 335502 -1.2% 331340 will-it-scale.per_process_ops 8052059 -1.2% 7952172 will-it-scale.workload ========================================================================================= compiler/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase: gcc-10/defconfig/process/16/debian/dellr630/page_fault3/will-it-scale commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 10152559 +0.7% 10221240 will-it-scale.16.processes 66.10 -0.0% 66.09 will-it-scale.16.processes_idle 634534 +0.7% 638827 will-it-scale.per_process_ops 10152559 +0.7% 10221240 will-it-scale.workload ========================================================================================= compiler/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase: gcc-10/defconfig/process/16/debian/dellr630/page_fault2/will-it-scale commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- 
--------------------------- %stddev %change %stddev \ | \ 4621434 -1.0% 4576959 will-it-scale.16.processes 66.14 -0.2% 65.98 will-it-scale.16.processes_idle 288839 -1.0% 286059 will-it-scale.per_process_ops 4621434 -1.0% 4576959 will-it-scale.workload ========================================================================================= compiler/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase: gcc-10/defconfig/process/16/debian/dellr630/page_fault1/will-it-scale commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 5546153 -1.3% 5474778 will-it-scale.16.processes 66.02 -0.1% 65.98 will-it-scale.16.processes_idle 346634 -1.3% 342173 will-it-scale.per_process_ops 5546153 -1.3% 5474778 will-it-scale.workload ========================================================================================= compiler/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase: gcc-10/defconfig/process/100%/debian/dellr630/page_fault3/will-it-scale commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 20575719 +0.4% 20651992 will-it-scale.48.processes 0.06 +5.6% 0.06 ± 7% will-it-scale.48.processes_idle 428660 +0.4% 430249 will-it-scale.per_process_ops 20575719 +0.4% 20651992 will-it-scale.workload ========================================================================================= compiler/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase: gcc-10/defconfig/process/100%/debian/dellr630/page_fault2/will-it-scale commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 6984071 -1.1% 6906022 will-it-scale.48.processes 0.07 +4.8% 0.07 ± 6% will-it-scale.48.processes_idle 145501 -1.1% 143875 will-it-scale.per_process_ops 6984071 -1.1% 6906022 will-it-scale.workload ========================================================================================= compiler/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase: gcc-10/defconfig/process/100%/debian/dellr630/page_fault1/will-it-scale commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 7527654 -1.7% 7399284 will-it-scale.48.processes 0.07 +0.0% 0.07 will-it-scale.48.processes_idle 156826 -1.7% 154151 will-it-scale.per_process_ops 7527654 -1.7% 7399284 will-it-scale.workload ========================================================================================= compiler/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase: gcc-10/defconfig/thread/50%/debian/dellr630/mmap2/will-it-scale commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 107582 -2.2% 105195 ± 2% will-it-scale.24.threads 86.69 +0.2% 86.90 will-it-scale.24.threads_idle 4482 -2.2% 4382 ± 2% will-it-scale.per_thread_ops 107582 -2.2% 105195 ± 2% will-it-scale.workload ========================================================================================= compiler/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase: gcc-10/defconfig/thread/50%/debian/dellr630/mmap1/will-it-scale commit: 5.13.0-rc1-mm1-max-order-11+ 
5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 99087 -6.6% 92582 ± 4% will-it-scale.24.threads 89.30 +0.9% 90.14 will-it-scale.24.threads_idle 4128 -6.6% 3857 ± 4% will-it-scale.per_thread_ops 99087 -6.6% 92582 ± 4% will-it-scale.workload ========================================================================================= compiler/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase: gcc-10/defconfig/thread/50%/debian/dellr630/malloc2/will-it-scale commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 1.692e+09 +0.0% 1.692e+09 will-it-scale.24.threads 49.55 -0.0% 49.54 will-it-scale.24.threads_idle 70479299 +0.0% 70497886 will-it-scale.per_thread_ops 1.692e+09 +0.0% 1.692e+09 will-it-scale.workload ========================================================================================= compiler/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase: gcc-10/defconfig/thread/16/debian/dellr630/mmap2/will-it-scale commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 108693 -3.4% 104954 ± 2% will-it-scale.16.threads 91.00 +0.1% 91.10 will-it-scale.16.threads_idle 6793 -3.4% 6559 ± 2% will-it-scale.per_thread_ops 108693 -3.4% 104954 ± 2% will-it-scale.workload ========================================================================================= compiler/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase: gcc-10/defconfig/thread/16/debian/dellr630/mmap1/will-it-scale commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 100512 -0.1% 100370 will-it-scale.16.threads 91.18 -0.5% 90.76 will-it-scale.16.threads_idle 6281 -0.1% 6272 will-it-scale.per_thread_ops 100512 -0.1% 100370 will-it-scale.workload ========================================================================================= compiler/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase: gcc-10/defconfig/thread/16/debian/dellr630/malloc2/will-it-scale commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 1.136e+09 +0.0% 1.136e+09 will-it-scale.16.threads 66.30 -0.2% 66.19 will-it-scale.16.threads_idle 70977943 +0.0% 70978047 will-it-scale.per_thread_ops 1.136e+09 +0.0% 1.136e+09 will-it-scale.workload ========================================================================================= compiler/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase: gcc-10/defconfig/thread/100%/debian/dellr630/mmap2/will-it-scale commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 89113 +1.9% 90829 ± 3% will-it-scale.48.threads 89.65 -0.9% 88.88 will-it-scale.48.threads_idle 1856 +1.9% 1891 ± 3% will-it-scale.per_thread_ops 89113 +1.9% 90829 ± 3% will-it-scale.workload ========================================================================================= compiler/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase: gcc-10/defconfig/thread/100%/debian/dellr630/mmap1/will-it-scale commit: 
5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 77550 -2.5% 75624 will-it-scale.48.threads 91.39 +0.3% 91.63 will-it-scale.48.threads_idle 1615 -2.5% 1575 will-it-scale.per_thread_ops 77550 -2.5% 75624 will-it-scale.workload ========================================================================================= compiler/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase: gcc-10/defconfig/thread/100%/debian/dellr630/malloc2/will-it-scale commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 1.702e+09 -0.0% 1.702e+09 will-it-scale.48.threads 0.03 +11.1% 0.03 ± 14% will-it-scale.48.threads_idle 35468051 -0.0% 35463537 will-it-scale.per_thread_ops 1.702e+09 -0.0% 1.702e+09 will-it-scale.workload ========================================================================================= compiler/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase: gcc-10/defconfig/process/50%/debian/dellr630/mmap2/will-it-scale commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 670141 -1.0% 663139 will-it-scale.24.processes 49.32 -0.5% 49.10 will-it-scale.24.processes_idle 27922 -1.0% 27630 will-it-scale.per_process_ops 670141 -1.0% 663139 will-it-scale.workload ========================================================================================= compiler/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase: gcc-10/defconfig/process/50%/debian/dellr630/mmap1/will-it-scale commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 821022 -0.1% 820465 will-it-scale.24.processes 49.09 -0.0% 49.09 will-it-scale.24.processes_idle 34208 -0.1% 34185 will-it-scale.per_process_ops 821022 -0.1% 820465 will-it-scale.workload ========================================================================================= compiler/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase: gcc-10/defconfig/process/50%/debian/dellr630/malloc2/will-it-scale commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 1.694e+09 -0.2% 1.691e+09 will-it-scale.24.processes 49.75 -0.3% 49.62 will-it-scale.24.processes_idle 70576503 -0.2% 70455796 will-it-scale.per_process_ops 1.694e+09 -0.2% 1.691e+09 will-it-scale.workload ========================================================================================= compiler/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase: gcc-10/defconfig/process/50%/debian/dellr630/malloc1/will-it-scale commit: 5.13.0-rc1-mm1-max-order-11+ 5.13.0-rc1-mm1-max-order-20+ 5.13.0-rc1-mm1-m 5.13.0-rc1-mm1-max-order-20 ---------------- --------------------------- %stddev %change %stddev \ | \ 195637 +0.6% 196726 will-it-scale.24.processes 49.07 -0.0% 49.05 will-it-scale.24.processes_idle 8151 +0.6% 8196 will-it-scale.per_process_ops 195637 +0.6% 196726 will-it-scale.workload ========================================================================================= compiler/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase: 
  gcc-10/defconfig/process/16/debian/dellr630/mmap2/will-it-scale
commit: 5.13.0-rc1-mm1-max-order-11+ vs. 5.13.0-rc1-mm1-max-order-20+

  max-order-11+      %change    max-order-20+
       724970          -2.3%         708248          will-it-scale.16.processes
        66.04          -0.3%          65.85          will-it-scale.16.processes_idle
        45310          -2.3%          44264          will-it-scale.per_process_ops
       724970          -2.3%         708248          will-it-scale.workload

=========================================================================================
compiler/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-10/defconfig/process/16/debian/dellr630/mmap1/will-it-scale
commit: 5.13.0-rc1-mm1-max-order-11+ vs. 5.13.0-rc1-mm1-max-order-20+

  max-order-11+      %change    max-order-20+
       873427          +0.7%         879407          will-it-scale.16.processes
        65.81          +0.0%          65.81          will-it-scale.16.processes_idle
        54588          +0.7%          54962          will-it-scale.per_process_ops
       873427          +0.7%         879407          will-it-scale.workload

=========================================================================================
compiler/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-10/defconfig/process/16/debian/dellr630/malloc2/will-it-scale
commit: 5.13.0-rc1-mm1-max-order-11+ vs. 5.13.0-rc1-mm1-max-order-20+

  max-order-11+      %change    max-order-20+
    1.136e+09          +0.1%      1.137e+09          will-it-scale.16.processes
        66.32          -0.2%          66.22          will-it-scale.16.processes_idle
     70999478          +0.1%       71072433          will-it-scale.per_process_ops
    1.136e+09          +0.1%      1.137e+09          will-it-scale.workload

=========================================================================================
compiler/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-10/defconfig/process/16/debian/dellr630/malloc1/will-it-scale
commit: 5.13.0-rc1-mm1-max-order-11+ vs. 5.13.0-rc1-mm1-max-order-20+

  max-order-11+      %change    max-order-20+
       205149          +2.1%         209554          will-it-scale.16.processes
        65.76          +0.1%          65.81          will-it-scale.16.processes_idle
        12821          +2.1%          13096          will-it-scale.per_process_ops
       205149          +2.1%         209554          will-it-scale.workload

=========================================================================================
compiler/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-10/defconfig/process/100%/debian/dellr630/mmap2/will-it-scale
commit: 5.13.0-rc1-mm1-max-order-11+ vs. 5.13.0-rc1-mm1-max-order-20+

  max-order-11+      %change    max-order-20+
       606469          -3.5%         585485 ± 3%     will-it-scale.48.processes
         0.06          +0.0%           0.06          will-it-scale.48.processes_idle
        12634          -3.5%          12197 ± 3%     will-it-scale.per_process_ops
       606469          -3.5%         585485 ± 3%     will-it-scale.workload

=========================================================================================
compiler/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-10/defconfig/process/100%/debian/dellr630/mmap1/will-it-scale
commit: 5.13.0-rc1-mm1-max-order-11+ vs. 5.13.0-rc1-mm1-max-order-20+

  max-order-11+      %change    max-order-20+
       745891          -0.9%         739042          will-it-scale.48.processes
         0.06          +5.6%           0.06 ± 7%     will-it-scale.48.processes_idle
        15538          -0.9%          15396          will-it-scale.per_process_ops
       745891          -0.9%         739042          will-it-scale.workload

=========================================================================================
compiler/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-10/defconfig/process/100%/debian/dellr630/malloc2/will-it-scale
commit: 5.13.0-rc1-mm1-max-order-11+ vs. 5.13.0-rc1-mm1-max-order-20+

  max-order-11+      %change    max-order-20+
    1.706e+09          -0.1%      1.704e+09          will-it-scale.48.processes
         0.06          +0.0%           0.06          will-it-scale.48.processes_idle
     35535756          -0.1%       35492836          will-it-scale.per_process_ops
    1.706e+09          -0.1%      1.704e+09          will-it-scale.workload

=========================================================================================
compiler/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-10/defconfig/process/100%/debian/dellr630/malloc1/will-it-scale
commit: 5.13.0-rc1-mm1-max-order-11+ vs. 5.13.0-rc1-mm1-max-order-20+

  max-order-11+      %change    max-order-20+
       174384          +1.0%         176162          will-it-scale.48.processes
         0.06          +0.0%           0.06          will-it-scale.48.processes_idle
         3632          +1.0%           3669          will-it-scale.per_process_ops
       174384          +1.0%         176162          will-it-scale.workload

=========================================================================================
compiler/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-10/defconfig/thread/50%/debian/dellr630/brk1/will-it-scale
commit: 5.13.0-rc1-mm1-max-order-11+ vs. 5.13.0-rc1-mm1-max-order-20+

  max-order-11+      %change    max-order-20+
       296768          -4.5%         283480 ± 5%     will-it-scale.24.threads
        77.96          +1.7%          79.29 ± 2%     will-it-scale.24.threads_idle
        12365          -4.5%          11811 ± 5%     will-it-scale.per_thread_ops
       296768          -4.5%         283480 ± 5%     will-it-scale.workload

=========================================================================================
compiler/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-10/defconfig/thread/16/debian/dellr630/brk1/will-it-scale
commit: 5.13.0-rc1-mm1-max-order-11+ vs. 5.13.0-rc1-mm1-max-order-20+

  max-order-11+      %change    max-order-20+
  267562 ± 4%         +13.1%         302654 ± 9%     will-it-scale.16.threads
        86.35          -4.6%          82.35 ± 4%     will-it-scale.16.threads_idle
   16722 ± 4%         +13.1%          18915 ± 9%     will-it-scale.per_thread_ops
  267562 ± 4%         +13.1%         302654 ± 9%     will-it-scale.workload

=========================================================================================
compiler/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-10/defconfig/thread/100%/debian/dellr630/brk1/will-it-scale
commit: 5.13.0-rc1-mm1-max-order-11+ vs. 5.13.0-rc1-mm1-max-order-20+

  max-order-11+      %change    max-order-20+
  232568 ± 8%          -4.3%         222509 ± 11%    will-it-scale.48.threads
   83.85 ± 5%          +2.3%          85.75 ± 4%     will-it-scale.48.threads_idle
    4845 ± 8%          -4.3%           4635 ± 11%    will-it-scale.per_thread_ops
  232568 ± 8%          -4.3%         222509 ± 11%    will-it-scale.workload

=========================================================================================
compiler/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-10/defconfig/process/50%/debian/dellr630/brk1/will-it-scale
commit: 5.13.0-rc1-mm1-max-order-11+ vs. 5.13.0-rc1-mm1-max-order-20+

  max-order-11+      %change    max-order-20+
      1645683          +1.7%        1673456          will-it-scale.24.processes
        49.01          +0.2%          49.09          will-it-scale.24.processes_idle
        68569          +1.7%          69726          will-it-scale.per_process_ops
      1645683          +1.7%        1673456          will-it-scale.workload

=========================================================================================
compiler/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-10/defconfig/process/16/debian/dellr630/brk1/will-it-scale
commit: 5.13.0-rc1-mm1-max-order-11+ vs. 5.13.0-rc1-mm1-max-order-20+

  max-order-11+      %change    max-order-20+
      1768679          -2.4%        1726099 ± 3%     will-it-scale.16.processes
        66.06          -0.5%          65.73          will-it-scale.16.processes_idle
       110542          -2.4%         107880 ± 3%     will-it-scale.per_process_ops
      1768679          -2.4%        1726099 ± 3%     will-it-scale.workload

=========================================================================================
compiler/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-10/defconfig/process/100%/debian/dellr630/brk1/will-it-scale
commit: 5.13.0-rc1-mm1-max-order-11+ vs. 5.13.0-rc1-mm1-max-order-20+

  max-order-11+      %change    max-order-20+
      1467894          -1.0%        1453898 ± 2%     will-it-scale.48.processes
   0.06 ± 7%           -5.3%           0.06          will-it-scale.48.processes_idle
        30581          -1.0%          30289 ± 2%     will-it-scale.per_process_ops
      1467894          -1.0%        1453898 ± 2%     will-it-scale.workload
On 02.06.21 17:56, Zi Yan wrote:
> On 10 May 2021, at 10:36, Zi Yan wrote:
>
>> On 7 May 2021, at 10:00, David Hildenbrand wrote:
>>
>>> On 07.05.21 13:55, Michal Hocko wrote:
>>>> [I haven't read through respective patches due to lack of time but let
>>>> me comment on the general idea and the underlying justification]
>>>>
>>>> On Thu 06-05-21 17:31:09, David Hildenbrand wrote:
>>>>> On 06.05.21 17:26, Zi Yan wrote:
>>>>>> From: Zi Yan <ziy@nvidia.com>
>>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> This patchset tries to remove the restriction on memory hotplug/hotremove
>>>>>> granularity, which is always greater or equal to memory section size[1].
>>>>>> With the patchset, kernel is able to online/offline memory at a size independent
>>>>>> of memory section size, as small as 2MB (the subsection size).
>>>>>
>>>>> ... which doesn't make any sense as we can only online/offline whole memory
>>>>> block devices.
>>>>
>>>> Agreed. The subsection thingy is just a hack to workaround pmem
>>>> alignement problems. For the real memory hotplug it is quite hard to
>>>> argue for reasonable hotplug scenarios for very small physical memory
>>>> ranges wrt. to the existing sparsemem memory model.
>>>>
>>>>>> The motivation is to increase MAX_ORDER of the buddy allocator and pageblock
>>>>>> size without increasing memory hotplug/hotremove granularity at the same time,
>>>>>
>>>>> Gah, no. Please no. No.
>>>>
>>>> Agreed. Those are completely independent concepts. MAX_ORDER is can be
>>>> really arbitrary irrespective of the section size with vmemmap sparse
>>>> model. The existing restriction is due to old sparse model not being
>>>> able to do page pointer arithmetic across memory sections. Is there any
>>>> reason to stick with that memory model for an advance feature you are
>>>> working on?
>>
>> No. I just want to increase MAX_ORDER. If the existing restriction can
>> be removed, that will be great.
>>
>>> I gave it some more thought yesterday. I guess the first thing we should
>>> look into is increasing MAX_ORDER and leaving pageblock_order and section
>>> size as is -- finding out what we have to tweak to get that up and running.
>>> Once we have that in place, we can actually look into better fragmentation
>>> avoidance etc. One step at a time.
>>
>> It makes sense to me.
>>
>>> Because that change itself might require some thought. Requiring that
>>> bigger MAX_ORDER depends on SPARSE_VMEMMAP is something reasonable to do.
>>
>> OK, if with SPARSE_VMEMMAP MAX_ORDER can be set to be bigger than
>> SECTION_SIZE, it is perfectly OK to me. Since 1GB THP support, which I
>> want to add ultimately, will require SPARSE_VMEMMAP too (otherwise,
>> all page++ will need to be changed to nth_page(page,1)).
>>
>>> As stated somewhere here already, we'll have to look into making
>>> alloc_contig_range() (and main users CMA and virtio-mem) independent of
>>> MAX_ORDER and mainly rely on pageblock_order. The current handling in
>>> alloc_contig_range() is far from optimal as we have to isolate a whole
>>> MAX_ORDER - 1 page -- and on ZONE_NORMAL we'll fail easily if any part
>>> contains something unmovable although we don't even want to allocate that
>>> part. I actually have that on my list (to be able to fully support
>>> pageblock_order instead of MAX_ORDER -1 chunks in virtio-mem), however
>>> didn't have time to look into it.
>>
>> So in your mind, for gigantic page allocation (> MAX_ORDER), alloc_contig_range()
>> should be used instead of buddy allocator while pageblock_order is kept at a small
>> granularity like 2MB. Is that the case? Isn’t it going to have high fail rate
>> when any of the pageblocks within a gigantic page range (like 1GB) becomes unmovable?
>> Are you thinking additional mechanism/policy to prevent such thing happening as
>> an additional step for gigantic page allocation? Like your ZONE_PREFER_MOVABLE idea?
>>
>>> Further, page onlining / offlining code and early init code most probably
>>> also needs care if MAX_ORDER - 1 crosses sections. Memory holes we might
>>> suddenly have in MAX_ORDER - 1 pages might become a problem and will have
>>> to be handled. Not sure which other code has to be tweaked (compaction?
>>> page isolation?).
>>
>> Can you elaborate it a little more? From what I understand, memory holes mean valid
>> PFNs are not contiguous before and after a hole, so pfn++ will not work, but
>> struct pages are still virtually contiguous assuming SPARSE_VMEMMAP, meaning page++
>> would still work. So when MAX_ORDER - 1 crosses sections, additional code would be
>> needed instead of simple pfn++. Is there anything I am missing?
>>
>> BTW, to test a system with memory holes, do you know is there an easy of adding
>> random memory holes to an x86_64 VM, which can help reveal potential missing pieces
>> in the code? Changing BIOS-e820 table might be one way, but I have no idea on
>> how to do it on QEMU.
>>
>>> Figuring out what needs care itself might take quite some effort.
>>>
>>> One thing I was thinking about as well: The bigger our MAX_ORDER, the
>>> slower it could be to allocate smaller pages. If we have 1G pages,
>>> splitting them down to 4k then takes 8 additional steps if I'm, not wrong.
>>> Of course, that's the worst case. Would be interesting to evaluate.
>>
>> Sure. I am planning to check it too. As a simple start, I am going to run will it scale
>> benchmarks to see if there is any performance difference between different MAX_ORDERs.
>
> I ran vm-scalablity and memory-related will-it-scale on a server with 256GB memory to
> see the impact of increasing MAX_ORDER and didn’t see much difference for most of
> the workloads like page_fault1, page_fault2, and page_fault3 from will-it-scale.
> But feel free to check the attached complete results and let me know what should be
> looked into. Thanks.

Right, for will-it-scale it looks like there are mostly minor differences, although
I am not sure if the results are really stable (reaching from -6% to +6%).

For vm-scalability the numbers seem to vary even more (e.g., stddev of ± 63%), so I
have no idea how expressive they are.

But I guess for these benchmarks, the net change won't really be significant.

Thanks!
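For reference, a rough back-of-the-envelope sketch (plain user-space C, not kernel code; it assumes 4KiB base pages and the max-order-11 vs. 1GiB-page configurations discussed above) of where the "8 additional steps" worst case in the quoted discussion comes from:

/*
 * Number of buddy splits needed to carve a single 4KiB page out of the
 * largest buddy page: today MAX_ORDER = 11 gives an order-10 (4MiB) max
 * page, while a 1GiB page is order 18 with 4KiB base pages.
 */
#include <stdio.h>

int main(void)
{
	const unsigned int page_shift  = 12;               /* 4KiB base pages */
	const unsigned int order_1g    = 30 - page_shift;  /* 1GiB = 2^30 B -> order 18 */
	const unsigned int order_max11 = 11 - 1;           /* MAX_ORDER = 11 -> order 10 (4MiB) */

	/* One split per order when breaking a page all the way down to 4KiB. */
	printf("splits from a 1GiB page down to 4KiB: %u\n", order_1g);
	printf("splits from a 4MiB page down to 4KiB: %u\n", order_max11);
	printf("additional steps in the worst case:   %u\n", order_1g - order_max11);
	return 0;
}

Compiled and run, this prints 18, 10 and 8 splits respectively, matching the figure mentioned in the thread.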
From: Zi Yan <ziy@nvidia.com>

Hi all,

This patchset tries to remove the restriction on memory hotplug/hotremove
granularity, which is always greater than or equal to the memory section size[1].
With the patchset, the kernel is able to online/offline memory at a size independent
of the memory section size, as small as 2MB (the subsection size).

The motivation is to increase MAX_ORDER of the buddy allocator and the pageblock
size without increasing the memory hotplug/hotremove granularity at the same time,
so that the kernel can allocate 1GB pages from the buddy allocator and utilize the
existing pageblock-based anti-fragmentation, paving the road for 1GB THP
support[2].

The patchset builds on the existing subsection support[3] and changes the
section size alignment checks to subsection size alignment checks. There are
also changes to the pageblock code to support partial pageblocks, for when the
pageblock size is increased along with MAX_ORDER. Increasing the pageblock size
enables the kernel to use the existing anti-fragmentation mechanism for gigantic
page allocations.

The last patch increases SECTION_SIZE_BITS to demonstrate memory
hotplug/hotremove at subsection size, but is not intended to be merged as is.
It is there in case one wants to try this out and will be removed during the
final submission.

Feel free to give suggestions and comments. I am looking forward to your
feedback. Thanks.

Zi Yan (7):
  mm: sparse: set/clear subsection bitmap when pages are onlined/offlined.
  mm: set pageblock_order to the max of HUGETLB_PAGE_ORDER and MAX_ORDER-1
  mm: memory_hotplug: decouple memory_block size with section size.
  mm: pageblock: allow set/unset migratetype for partial pageblock
  mm: memory_hotplug, sparse: enable memory hotplug/hotremove subsections
  arch: x86: no MAX_ORDER exceeds SECTION_SIZE check for 32bit vdso.
  [not for merge] mm: increase SECTION_SIZE_BITS to 31

 arch/ia64/Kconfig                |   1 -
 arch/powerpc/Kconfig             |   1 -
 arch/x86/Kconfig                 |  15 +++
 arch/x86/entry/vdso/Makefile     |   1 +
 arch/x86/include/asm/sparsemem.h |   2 +-
 drivers/base/memory.c            | 176 +++++++++++++++----------------
 drivers/base/node.c              |   2 +-
 include/linux/memory.h           |   8 +-
 include/linux/mmzone.h           |   2 +
 include/linux/page-isolation.h   |   8 +-
 include/linux/pageblock-flags.h  |   9 --
 mm/Kconfig                       |   7 --
 mm/memory_hotplug.c              |  22 ++--
 mm/page_alloc.c                  |  40 ++++---
 mm/page_isolation.c              |  30 +++---
 mm/sparse.c                      |  55 ++++++++--
 16 files changed, 219 insertions(+), 160 deletions(-)
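As a side note on the last patch, here is a minimal sketch (plain user-space C with assumed values, not kernel code) of the SPARSEMEM constraint that pushes SECTION_SIZE_BITS up to 31 once MAX_ORDER is raised to 20, roughly the check the series has to satisfy:

/*
 * SPARSEMEM requires (roughly) that a MAX_ORDER - 1 buddy page fits in
 * one memory section:
 *     MAX_ORDER - 1 + PAGE_SHIFT <= SECTION_SIZE_BITS
 * With 4KiB pages (PAGE_SHIFT = 12) and MAX_ORDER = 20 (largest order 19,
 * i.e. 2GiB), SECTION_SIZE_BITS must be at least 31 (2GiB sections),
 * which is what "[not for merge] mm: increase SECTION_SIZE_BITS to 31" does.
 */
#define PAGE_SHIFT		12
#define MAX_ORDER		20
#define SECTION_SIZE_BITS	31

_Static_assert(MAX_ORDER - 1 + PAGE_SHIFT <= SECTION_SIZE_BITS,
	       "Allocator MAX_ORDER exceeds SECTION_SIZE");

int main(void) { return 0; }

With the usual x86-64 value of SECTION_SIZE_BITS = 27, the same inequality caps MAX_ORDER at 16, which is why the section size has to grow together with MAX_ORDER in this series.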