Message ID | 20230612085535.275206-1-jaypatel@linux.ibm.com
---|---
State | New
Series | [RFC] mm/slub: Reducing slub memory wastage
On Mon, Jun 12, 2023 at 02:25:35PM +0530, Jay Patel wrote:
> 3) If the minimum order is less than the slub_max_order, iterate through
> a loop from minimum order to slub_max_order and check if the condition
> (rem <= slab_size / fract_leftover) holds true. Here, slab_size is
> calculated as (PAGE_SIZE << order), rem is (slab_size % object_size),
> and fract_leftover can have values of 16, 8, or 4. If the condition is
> true, select that order for the slab.
>
> However, in point 3, when calculating the fraction left over, it can
> result in a large range of values (like 256 bytes to 1 Kb on 4K page
> size & 4 Kb to 16 Kb on 64K page size at order 0, and it goes on
> increasing with higher orders) when compared to the remainder (rem).
> This can lead to the selection of an order that results in more memory
> wastage. To mitigate such wastage, we have modified point 3 as follows:
> instead of selecting the first order that satisfies the condition (rem
> <= slab_size / fract_leftover), we iterate through the loop from
> min_order to slub_max_order and choose the order that minimizes memory
> wastage for the slab.

Hi Jay,

If I understand correctly, slub currently chooses an order if it
does not waste too much memory, but the order could be sub-optimal
because there can be an order that wastes less memory. Right?

Hmm, the new code might choose a larger order than before, as SLUB
previously wasted more memory instead of increasing the order.

BUT the maximum slub order is still bound by slub_max_order,
so that looks fine to me. If using a high order for less fragmentation
becomes a problem, slub_max_order should be changed.

<...snip...>

> I conducted tests on systems with 160 CPUs and 16 CPUs, using 4K and
> 64K page sizes. Through these tests, it was observed that the patch
> successfully reduces the wastage of slab memory without any noticeable
> performance degradation in the hackbench test report. However, it
> should be noted that the patch also increases the total number of
> objects, leading to an overall increase in total slab memory usage.

<...snip...>

Then my question is: why is this a useful change if total memory
usage is increased?

> Test results are as follows:
> 3) On 16 CPUs with 4K Page size
>
> +-----------------+----------------+------------------+
> |             Total wastage in slub memory            |
> +-----------------+----------------+------------------+
> |                 | After Boot     | After Hackbench  |
> | Normal          | 666 Kb         | 902 Kb           |
> | With Patch      | 533 Kb         | 694 Kb           |
> | Wastage reduce  | ~20%           | ~23%             |
> +-----------------+----------------+------------------+
>
> +-----------------+----------------+----------------+
> |                Total slub memory                  |
> +-----------------+----------------+----------------+
> |                 | After Boot     | After Hackbench|
> | Normal          | 82360          | 122532         |
> | With Patch      | 87372          | 129180         |
> | Memory increase | ~6%            | ~5%            |
> +-----------------+----------------+----------------+

How should we understand this data?
Reducing the amount of memory wastage by increasing the slab order
might not reduce total SLUB memory usage?

> hackbench-process-sockets
> +-------+----+---------+---------+-----------+
> | Amean | 1  | 1.4983  | 1.4867  | (  0.78%) |
> | Amean | 4  | 5.6613  | 5.6793  | ( -0.32%) |
> | Amean | 7  | 9.9813  | 9.9873  | ( -0.06%) |
> | Amean | 12 | 17.6963 | 17.8527 | ( -0.88%) |
> | Amean | 21 | 31.2017 | 31.2060 | ( -0.01%) |
> | Amean | 30 | 44.0297 | 44.1750 | ( -0.33%) |
> | Amean | 48 | 70.2073 | 69.6210 | (  0.84%) |
> | Amean | 64 | 92.3257 | 93.7410 | ( -1.53%) |
> +-------+----+---------+---------+-----------+
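To make the threshold ranges quoted in the message above concrete, here
is a small standalone sketch (userspace C, not kernel code; the two page
sizes and the fract_leftover values 16/8/4 come from the text above,
everything else is illustrative):

#include <stdio.h>

int main(void)
{
        /* Order-0 slab sizes for the two page sizes discussed. */
        unsigned int slab_sizes[] = { 4096, 65536 };
        unsigned int fract, i;

        for (i = 0; i < 2; i++)
                /* Current code accepts an order once rem <= slab_size / fract. */
                for (fract = 16; fract >= 4; fract /= 2)
                        printf("slab_size %5u, fract %2u -> threshold %5u bytes\n",
                               slab_sizes[i], fract, slab_sizes[i] / fract);
        return 0;
}

This prints acceptance thresholds of 256/512/1024 bytes for a 4K page
and 4096/8192/16384 bytes for a 64K page at order 0, matching the ranges
quoted above; any remainder below the threshold is accepted even when a
higher order would waste less.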
On Mon, 2023-06-12 at 19:17 +0900, Hyeonggon Yoo wrote:
> On Mon, Jun 12, 2023 at 02:25:35PM +0530, Jay Patel wrote:
> > 3) If the minimum order is less than the slub_max_order, iterate
> > through a loop from minimum order to slub_max_order and check if
> > the condition (rem <= slab_size / fract_leftover) holds true. Here,
> > slab_size is calculated as (PAGE_SIZE << order), rem is
> > (slab_size % object_size), and fract_leftover can have values of
> > 16, 8, or 4. If the condition is true, select that order for the
> > slab.
> >
> > However, in point 3, when calculating the fraction left over, it
> > can result in a large range of values (like 256 bytes to 1 Kb on 4K
> > page size & 4 Kb to 16 Kb on 64K page size at order 0, and it goes
> > on increasing with higher orders) when compared to the remainder
> > (rem). This can lead to the selection of an order that results in
> > more memory wastage. To mitigate such wastage, we have modified
> > point 3 as follows: instead of selecting the first order that
> > satisfies the condition (rem <= slab_size / fract_leftover), we
> > iterate through the loop from min_order to slub_max_order and
> > choose the order that minimizes memory wastage for the slab.
>
> Hi Jay,
>
> If I understand correctly, slub currently chooses an order if it
> does not waste too much memory, but the order could be sub-optimal
> because there can be an order that wastes less memory. Right?
>
> Hmm, the new code might choose a larger order than before, as SLUB
> previously wasted more memory instead of increasing the order.
>
> BUT the maximum slub order is still bound by slub_max_order,
> so that looks fine to me. If using a high order for less
> fragmentation becomes a problem, slub_max_order should be changed.

Hi Hyeonggon,

Based on my understanding, the slub_max_order parameter is derived from
the PAGE_ALLOC_COSTLY_ORDER configuration option, and any change to it
can have an impact on system performance. If you reduce the value of
slub_max_order, smaller contiguous memory will be used for certain slab
caches. However, this can lead to a situation where the minimum number
of objects required (min_objects) no longer fits within those smaller
slabs, so performance issues may arise (a small sketch of this fit
constraint follows this message).

> <...snip...>
>
> > I conducted tests on systems with 160 CPUs and 16 CPUs, using 4K
> > and 64K page sizes. Through these tests, it was observed that the
> > patch successfully reduces the wastage of slab memory without any
> > noticeable performance degradation in the hackbench test report.
> > However, it should be noted that the patch also increases the total
> > number of objects, leading to an overall increase in total slab
> > memory usage.
>
> <...snip...>
>
> Then my question is: why is this a useful change if total memory
> usage is increased?

This patch, aimed at reducing memory wastage, can potentially increase
the slab order for a slab cache. Consequently, the higher page order
can result in more objects per slab, reducing wastage and leading to
more efficient utilization of memory. This enhancement is advantageous
since the presence of unused objects can be leveraged in the future,
depending on varying workloads.
> > Test results are as follows:
> > 3) On 16 CPUs with 4K Page size
> >
> > +-----------------+----------------+------------------+
> > |             Total wastage in slub memory            |
> > +-----------------+----------------+------------------+
> > |                 | After Boot     | After Hackbench  |
> > | Normal          | 666 Kb         | 902 Kb           |
> > | With Patch      | 533 Kb         | 694 Kb           |
> > | Wastage reduce  | ~20%           | ~23%             |
> > +-----------------+----------------+------------------+
> >
> > +-----------------+----------------+----------------+
> > |                Total slub memory                  |
> > +-----------------+----------------+----------------+
> > |                 | After Boot     | After Hackbench|
> > | Normal          | 82360          | 122532         |
> > | With Patch      | 87372          | 129180         |
> > | Memory increase | ~6%            | ~5%            |
> > +-----------------+----------------+----------------+
>
> How should we understand this data?
> Reducing the amount of memory wastage by increasing the slab order
> might not reduce total SLUB memory usage?

Indeed, the total slub memory increases with this patch, but memory
utilization is improved. The slub memory comprises both active objects
and unused objects, along with memory wastage. With this patch, the
memory wastage is reduced, leading to a higher number of unused objects
(the number of active objects mostly remains the same). The presence of
these unused objects provides the opportunity for their utilization,
which can vary depending on different workloads.

> > hackbench-process-sockets
> > +-------+----+---------+---------+-----------+
> > | Amean | 1  | 1.4983  | 1.4867  | (  0.78%) |
> > | Amean | 4  | 5.6613  | 5.6793  | ( -0.32%) |
> > | Amean | 7  | 9.9813  | 9.9873  | ( -0.06%) |
> > | Amean | 12 | 17.6963 | 17.8527 | ( -0.88%) |
> > | Amean | 21 | 31.2017 | 31.2060 | ( -0.01%) |
> > | Amean | 30 | 44.0297 | 44.1750 | ( -0.33%) |
> > | Amean | 48 | 70.2073 | 69.6210 | (  0.84%) |
> > | Amean | 64 | 92.3257 | 93.7410 | ( -1.53%) |
> > +-------+----+---------+---------+-----------+

--
Jay Patel
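Jay's point above, that min_objects may no longer fit in a single slab
when slub_max_order is lowered, can be illustrated with a small sketch
(the page size, object size, and min_objects value here are
hypothetical; order_objects() only mirrors the intent of the kernel
helper of the same name):

#include <stdio.h>

#define PAGE_SIZE 4096U

/* How many objects a slab of the given order can hold. */
static unsigned int order_objects(unsigned int order, unsigned int size)
{
        return (PAGE_SIZE << order) / size;
}

int main(void)
{
        unsigned int size = 2048;       /* hypothetical object size */
        unsigned int min_objects = 16;  /* hypothetical target */
        unsigned int order;

        for (order = 0; order <= 3; order++)
                printf("order %u: %2u objects %s\n", order,
                       order_objects(order, size),
                       order_objects(order, size) >= min_objects ?
                       "(holds min_objects)" : "(too small)");
        return 0;
}

With these numbers only order 3 holds all 16 objects in one slab, so
capping slub_max_order below 3 forces the order calculation to retry
with progressively smaller min_objects values.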
On Tue, Jun 13, 2023 at 06:25:48PM +0530, Jay Patel wrote:
> > <...snip...>
> > > I conducted tests on systems with 160 CPUs and 16 CPUs, using 4K
> > > and 64K page sizes. Through these tests, it was observed that the
> > > patch successfully reduces the wastage of slab memory without any
> > > noticeable performance degradation in the hackbench test report.
> > > However, it should be noted that the patch also increases the
> > > total number of objects, leading to an overall increase in total
> > > slab memory usage.
> >
> > <...snip...>
> >
> > Then my question is: why is this a useful change if total memory
> > usage is increased?
>
> This patch, aimed at reducing memory wastage, can potentially
> increase the slab order for a slab cache. Consequently, the higher
> page order can result in more objects per slab, reducing wastage and
> leading to more efficient utilization of memory.

If you define utilization as the percentage of memory that is being
used out of total memory, utilization becomes worse... (based on the
data you provided)

I think 'less memory wastage' is a useful feature only if the total
memory usage is reduced so that it could be used for other purposes.

I mean, if it consumes more memory on the same workload (in most
cases), who would like it?

> This enhancement is advantageous since the presence of unused
> objects can be leveraged in the future, depending on varying
> workloads.

At least we need to know when it is leveraged and what kinds of
workloads would benefit...

Thanks,
On Mon, 2023-06-19 at 12:25 +0900, Hyeonggon Yoo wrote:
> On Tue, Jun 13, 2023 at 06:25:48PM +0530, Jay Patel wrote:
> > > <...snip...>
> > > > I conducted tests on systems with 160 CPUs and 16 CPUs, using
> > > > 4K and 64K page sizes. Through these tests, it was observed
> > > > that the patch successfully reduces the wastage of slab memory
> > > > without any noticeable performance degradation in the hackbench
> > > > test report. However, it should be noted that the patch also
> > > > increases the total number of objects, leading to an overall
> > > > increase in total slab memory usage.
> > >
> > > <...snip...>
> > >
> > > Then my question is: why is this a useful change if total memory
> > > usage is increased?
> >
> > This patch, aimed at reducing memory wastage, can potentially
> > increase the slab order for a slab cache. Consequently, the higher
> > page order can result in more objects per slab, reducing wastage
> > and leading to more efficient utilization of memory.
>
> If you define utilization as the percentage of memory that is being
> used out of total memory, utilization becomes worse... (based on the
> data you provided)
>
> I think 'less memory wastage' is a useful feature only if the total
> memory usage is reduced so that it could be used for other purposes.
>
> I mean, if it consumes more memory on the same workload (in most
> cases), who would like it?

Hi Hyeonggon,

Thank you for your response. I acknowledge your feedback, and I have
made the necessary modifications to the patch. I am pleased to inform
you that I have sent the updated version [1], which addresses the
issues we discussed.

[1] https://lore.kernel.org/linux-mm/20230628095740.589893-1-jaypatel@linux.ibm.com/T/#u

Thanks,
Jay Patel

> > This enhancement is advantageous since the presence of unused
> > objects can be leveraged in the future, depending on varying
> > workloads.
>
> At least we need to know when it is leveraged and what kinds of
> workloads would benefit...
>
> Thanks,
diff --git a/mm/slub.c b/mm/slub.c
index c87628cd8a9a..e0b465173ed3 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4087,11 +4087,10 @@ static unsigned int slub_min_objects;
  * the smallest order which will fit the object.
  */
 static inline unsigned int calc_slab_order(unsigned int size,
-                unsigned int min_objects, unsigned int max_order,
-                unsigned int fract_leftover)
+                unsigned int min_objects, unsigned int max_order)
 {
         unsigned int min_order = slub_min_order;
-        unsigned int order;
+        unsigned int order, min_wastage = size, min_wastage_order = slub_max_order+1;
 
         if (order_objects(min_order, size) > MAX_OBJS_PER_PAGE)
                 return get_order(size * MAX_OBJS_PER_PAGE) - 1;
@@ -4104,11 +4103,17 @@ static inline unsigned int calc_slab_order(unsigned int size,
 
                 rem = slab_size % size;
 
-                if (rem <= slab_size / fract_leftover)
-                        break;
+                if (rem < min_wastage) {
+                        min_wastage = rem;
+                        min_wastage_order = order;
+                }
         }
 
-        return order;
+        if (min_wastage_order <= slub_max_order)
+                return min_wastage_order;
+        else
+                return order;
+
 }
 
 static inline int calculate_order(unsigned int size)
@@ -4145,32 +4150,18 @@ static inline int calculate_order(unsigned int size)
         max_objects = order_objects(slub_max_order, size);
         min_objects = min(min_objects, max_objects);
 
-        while (min_objects > 1) {
-                unsigned int fraction;
-
-                fraction = 16;
-                while (fraction >= 4) {
-                        order = calc_slab_order(size, min_objects,
-                                        slub_max_order, fraction);
-                        if (order <= slub_max_order)
-                                return order;
-                        fraction /= 2;
-                }
+        while (min_objects >= 1) {
+                order = calc_slab_order(size, min_objects,
+                                slub_max_order);
+                if (order <= slub_max_order)
+                        return order;
                 min_objects--;
         }
 
-        /*
-         * We were unable to place multiple objects in a slab. Now
-         * lets see if we can place a single object there.
-         */
-        order = calc_slab_order(size, 1, slub_max_order, 1);
-        if (order <= slub_max_order)
-                return order;
-
         /*
          * Doh this slab cannot be placed using slub_max_order.
          */
-        order = calc_slab_order(size, 1, MAX_ORDER, 1);
+        order = calc_slab_order(size, 1, MAX_ORDER);
         if (order <= MAX_ORDER)
                 return order;
         return -ENOSYS;
In the current implementation of the slub memory allocator, the slab
order selection process follows these criteria:

1) Determine the minimum order required to serve the minimum number of
objects (min_objects). This calculation is based on the formula
(order = min_objects * object_size / PAGE_SIZE).

2) If the minimum order is greater than the maximum allowed order
(slub_max_order), set slub_max_order as the order for this slab.

3) If the minimum order is less than the slub_max_order, iterate
through a loop from minimum order to slub_max_order and check if the
condition (rem <= slab_size / fract_leftover) holds true. Here,
slab_size is calculated as (PAGE_SIZE << order), rem is
(slab_size % object_size), and fract_leftover can have values of 16, 8,
or 4. If the condition is true, select that order for the slab.

However, in point 3, when calculating the fraction left over, it can
result in a large range of values (like 256 bytes to 1 Kb on 4K page
size & 4 Kb to 16 Kb on 64K page size at order 0, and it goes on
increasing with higher orders) when compared to the remainder (rem).
This can lead to the selection of an order that results in more memory
wastage. To mitigate such wastage, we have modified point 3 as follows:
instead of selecting the first order that satisfies the condition
(rem <= slab_size / fract_leftover), we iterate through the loop from
min_order to slub_max_order and choose the order that minimizes memory
wastage for the slab.

Let's consider an example using mm_struct on 160 CPUs, so min_objects
is 32, and let's assume a page size of 64K. The size of mm_struct is
1536 bytes, which means a single page can serve 42 objects, exceeding
the min_objects requirement. With the current logic, order 0 is
selected for this slab, since the remainder (rem) is 1 Kb
(64 Kb % 1536 bytes), which is less than 4 Kb (slab_size of 64 Kb /
fraction_size of 16). However, this results in wasting 1 Kb of memory
for each mm_struct slab. With this patch, order 1 (2 pages) is chosen
instead, wasting only 512 bytes of memory for each mm_struct slab.
Consequently, reducing memory wastage for this slab increases the
number of objects per slab.

I conducted tests on systems with 160 CPUs and 16 CPUs, using 4K and
64K page sizes. Through these tests, it was observed that the patch
successfully reduces the wastage of slab memory without any noticeable
performance degradation in the hackbench test report. However, it
should be noted that the patch also increases the total number of
objects, leading to an overall increase in total slab memory usage.
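To make the mm_struct example above concrete, the following standalone
sketch (userspace C, not the kernel code; slub_max_order is assumed to
be 3 here, and only the fraction-16 pass of the current code is
modeled) contrasts the current first-fit selection with the
minimal-waste scan introduced by this patch:

#include <limits.h>
#include <stdio.h>

#define PAGE_SIZE      (64U * 1024)    /* 64K page, per the example */
#define SLUB_MAX_ORDER 3               /* assumed cap for the sketch */

int main(void)
{
        unsigned int size = 1536;      /* mm_struct size from the example */
        unsigned int first_fit = UINT_MAX;
        unsigned int best_order = 0, best_rem = PAGE_SIZE % size;
        unsigned int order;

        for (order = 0; order <= SLUB_MAX_ORDER; order++) {
                unsigned int slab_size = PAGE_SIZE << order;
                unsigned int rem = slab_size % size;

                printf("order %u: %3u objects, %4u bytes wasted\n",
                       order, slab_size / size, rem);

                /* Current code: accept once rem <= slab_size / 16. */
                if (first_fit == UINT_MAX && rem <= slab_size / 16)
                        first_fit = order;

                /* Patched code: remember the least wasteful order. */
                if (rem < best_rem) {
                        best_rem = rem;
                        best_order = order;
                }
        }

        printf("first fit picks order %u, minimal waste picks order %u\n",
               first_fit, best_order);
        return 0;
}

Running it reproduces the numbers above: order 0 holds 42 objects and
wastes 1024 bytes, which already passes the slab_size/16 test, while
order 1 holds 85 objects and wastes only 512 bytes, so the patched scan
picks order 1.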
Test results are as follows:

1) On 160 CPUs with 4K Page size

+----------------+----------------+----------------+
|           Total wastage in slub memory           |
+----------------+----------------+----------------+
|                | After Boot     | After Hackbench|
| Normal         | 1819 Kb        | 3056 Kb        |
| With Patch     | 1288 Kb        | 2217 Kb        |
| Wastage reduce | ~29%           | ~27%           |
+----------------+----------------+----------------+

+-----------------+----------------+----------------+
|                 Total slub memory                  |
+-----------------+----------------+----------------+
|                 | After Boot     | After Hackbench|
| Normal          | 469336         | 725960         |
| With Patch      | 488032         | 726416         |
| Memory increase | ~4%            | ~0.06%         |
+-----------------+----------------+----------------+

hackbench-process-sockets
+-------+-----+----------+----------+-----------+
|       |     | Normal   |With Patch|           |
+-------+-----+----------+----------+-----------+
| Amean | 1   | 1.2887   | 1.2143   | (  5.77%) |
| Amean | 4   | 1.5633   | 1.5993   | ( -2.30%) |
| Amean | 7   | 2.3993   | 2.3813   | (  0.75%) |
| Amean | 12  | 3.9543   | 3.9637   | ( -0.24%) |
| Amean | 21  | 6.9723   | 6.9290   | (  0.62%) |
| Amean | 30  | 10.1407  | 10.1067  | (  0.34%) |
| Amean | 48  | 16.6730  | 16.6697  | (  0.02%) |
| Amean | 79  | 28.6743  | 28.8970  | ( -0.78%) |
| Amean | 110 | 39.0990  | 39.1857  | ( -0.22%) |
| Amean | 141 | 51.2667  | 51.2003  | (  0.13%) |
| Amean | 172 | 62.0797  | 62.3190  | ( -0.39%) |
| Amean | 203 | 73.5273  | 74.3567  | ( -1.13%) |
| Amean | 234 | 84.7130  | 85.7940  | ( -1.28%) |
| Amean | 265 | 97.0863  | 96.5810  | (  0.52%) |
| Amean | 296 | 108.4597 | 108.2987 | (  0.15%) |
+-------+-----+----------+----------+-----------+

2) On 160 CPUs with 64K Page size

+-----------------+----------------+----------------+
|            Total wastage in slub memory            |
+-----------------+----------------+----------------+
|                 | After Boot     | After Hackbench|
| Normal          | 729 Kb         | 1597 Kb        |
| With Patch      | 512 Kb         | 1066 Kb        |
| Wastage reduce  | ~30%           | ~33%           |
+-----------------+----------------+----------------+

+-----------------+----------------+----------------+
|                 Total slub memory                  |
+-----------------+----------------+----------------+
|                 | After Boot     | After Hackbench|
| Normal          | 1612608        | 2667200        |
| With Patch      | 2147456        | 3500096        |
| Memory increase | ~33%           | ~31%           |
+-----------------+----------------+----------------+

hackbench-process-sockets
+-------+-----+----------+----------+-----------+
| Amean | 1   | 1.2667   | 1.2053   | (  4.84%) |
| Amean | 4   | 1.5997   | 1.6453   | ( -2.85%) |
| Amean | 7   | 2.3797   | 2.4017   | ( -0.92%) |
| Amean | 12  | 3.9763   | 3.9987   | ( -0.56%) |
| Amean | 21  | 6.9760   | 6.9917   | ( -0.22%) |
| Amean | 30  | 10.2150  | 10.2093  | (  0.06%) |
| Amean | 48  | 16.8080  | 16.7707  | (  0.22%) |
| Amean | 79  | 28.2237  | 28.1583  | (  0.23%) |
| Amean | 110 | 39.7710  | 39.8420  | ( -0.18%) |
| Amean | 141 | 51.3563  | 51.9233  | ( -1.10%) |
| Amean | 172 | 63.4027  | 63.7463  | ( -0.54%) |
| Amean | 203 | 74.4970  | 74.9327  | ( -0.58%) |
| Amean | 234 | 86.1483  | 85.9420  | (  0.24%) |
| Amean | 265 | 97.5137  | 97.6100  | ( -0.10%) |
| Amean | 296 | 109.2327 | 110.2417 | ( -0.92%) |
+-------+-----+----------+----------+-----------+

3) On 16 CPUs with 4K Page size

+-----------------+----------------+------------------+
|             Total wastage in slub memory            |
+-----------------+----------------+------------------+
|                 | After Boot     | After Hackbench  |
| Normal          | 666 Kb         | 902 Kb           |
| With Patch      | 533 Kb         | 694 Kb           |
| Wastage reduce  | ~20%           | ~23%             |
+-----------------+----------------+------------------+

+-----------------+----------------+----------------+
|                 Total slub memory                  |
+-----------------+----------------+----------------+
|                 | After Boot     | After Hackbench|
| Normal          | 82360          | 122532         |
| With Patch      | 87372          | 129180         |
| Memory increase | ~6%            | ~5%            |
+-----------------+----------------+----------------+

hackbench-process-sockets
+-------+----+---------+---------+-----------+
| Amean | 1  | 1.4983  | 1.4867  | (  0.78%) |
| Amean | 4  | 5.6613  | 5.6793  | ( -0.32%) |
| Amean | 7  | 9.9813  | 9.9873  | ( -0.06%) |
| Amean | 12 | 17.6963 | 17.8527 | ( -0.88%) |
| Amean | 21 | 31.2017 | 31.2060 | ( -0.01%) |
| Amean | 30 | 44.0297 | 44.1750 | ( -0.33%) |
| Amean | 48 | 70.2073 | 69.6210 | (  0.84%) |
| Amean | 64 | 92.3257 | 93.7410 | ( -1.53%) |
+-------+----+---------+---------+-----------+

4) On 16 CPUs with 64K Page size

+----------------+----------------+----------------+
|           Total wastage in slub memory           |
+----------------+----------------+----------------+
|                | After Boot     | After Hackbench|
| Normal         | 239 Kb         | 484 Kb         |
| With Patch     | 135 Kb         | 234 Kb         |
| Wastage reduce | ~43%           | ~51%           |
+----------------+----------------+----------------+

+-----------------+----------------+----------------+
|                 Total slub memory                  |
+-----------------+----------------+----------------+
|                 | After Boot     | After Hackbench|
| Normal          | 227136         | 328110         |
| With Patch      | 284352         | 451391         |
| Memory increase | ~25%           | ~37%           |
+-----------------+----------------+----------------+

hackbench-process-sockets
+-------+----+---------+---------+-----------+
| Amean | 1  | 1.3597  | 1.3583  | (  0.10%) |
| Amean | 4  | 5.2633  | 5.2503  | (  0.25%) |
| Amean | 7  | 9.2700  | 9.1710  | (  1.07%) |
| Amean | 12 | 16.3730 | 16.3103 | (  0.38%) |
| Amean | 21 | 28.7140 | 28.7510 | ( -0.13%) |
| Amean | 30 | 40.3987 | 40.4940 | ( -0.24%) |
| Amean | 48 | 63.8477 | 63.9457 | ( -0.15%) |
| Amean | 64 | 86.4917 | 85.3810 | (  1.28%) |
+-------+----+---------+---------+-----------+

Signed-off-by: Jay Patel <jaypatel@linux.ibm.com>
---
 mm/slub.c | 43 +++++++++++++++++--------------------------
 1 file changed, 17 insertions(+), 26 deletions(-)