Message ID | cover.1640077468.git.baolin.wang@linux.alibaba.com (mailing list archive) |
---|---|
Series | Add a new scheme to support demotion on tiered memory system |
Hi Baolin,

On Tue, 21 Dec 2021 17:18:02 +0800 Baolin Wang <baolin.wang@linux.alibaba.com> wrote:

> Hi,
>
> Now on tiered memory system with different memory types, the reclaim path in
> shrink_page_list() already supports demoting pages to slow memory node instead
> of discarding the pages. However, at that time the fast memory node memory
> watermark is already tense, which will increase the memory allocation latency
> during page demotion. So a new method from user space demoting cold pages
> proactively will be more helpful.
>
> We can rely on the DAMON in user space to help to monitor the cold memory on
> fast memory node, and demote the cold pages to slow memory node proactively to
> keep the fast memory node in a healthy state.
>
> This patch set introduces a new scheme named DAMOS_DEMOTE to support this feature,
> and works well from my testing. Any comments are welcome. Thanks.

I like the idea, thank you for these patches!  If possible, could you share
some details about your tests?

Thanks,
SJ

>
>
> Baolin Wang (2):
>   mm: Export the alloc_demote_page() function
>   mm/damon: Add a new scheme to support demotion on tiered memory system
>
>  include/linux/damon.h |   3 +
>  mm/damon/dbgfs.c      |   1 +
>  mm/damon/vaddr.c      | 156 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  mm/internal.h         |   1 +
>  mm/vmscan.c           |   2 +-
>  5 files changed, 162 insertions(+), 1 deletion(-)
>
> --
> 1.8.3.1
On 12/21/2021 9:26 PM, SeongJae Park wrote:
> Hi Baolin,
>
> On Tue, 21 Dec 2021 17:18:02 +0800 Baolin Wang <baolin.wang@linux.alibaba.com> wrote:
>
>> Hi,
>>
>> Now on tiered memory system with different memory types, the reclaim path in
>> shrink_page_list() already supports demoting pages to slow memory node instead
>> of discarding the pages. However, at that time the fast memory node memory
>> watermark is already tense, which will increase the memory allocation latency
>> during page demotion. So a new method from user space demoting cold pages
>> proactively will be more helpful.
>>
>> We can rely on the DAMON in user space to help to monitor the cold memory on
>> fast memory node, and demote the cold pages to slow memory node proactively to
>> keep the fast memory node in a healthy state.
>>
>> This patch set introduces a new scheme named DAMOS_DEMOTE to support this feature,
>> and works well from my testing. Any comments are welcome. Thanks.
>
> I like the idea, thank you for these patches!  If possible, could you share
> some details about your tests?

Sure, sorry for not adding more information about my tests.

My machine has 64G DRAM + 256G AEP (persistent memory), and you
should first enable demotion with:
echo "true" > /sys/kernel/mm/numa/demotion_enabled

Then I just wrote a simple test case, shown below, that mmaps some anonymous
memory and then keeps reading and writing only half of the mmap buffer, so
that the other half becomes cold enough to demote.

#include <stdio.h>
#include <unistd.h>
#include <sys/mman.h>

int main()
{
        size_t len = 50 * 1024 * 1024;
        size_t scan_len = len / 2;
        size_t i;
        /* volatile so the read loop below is not optimized away */
        volatile unsigned long j = 0;
        unsigned long *p;

        p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED) {
                printf("failed to get memory\n");
                return -1;
        }

        /* Touch the whole buffer once so every page is populated */
        for (i = 0; i < len / sizeof(*p); i++)
                p[i] = 0x55aa;

        /* Keep the first half hot; the other half is left to go cold */
        do {
                for (i = 0; i < scan_len / sizeof(*p); i++)
                        p[i] = 0x55aa;

                sleep(2);

                for (i = 0; i < scan_len / sizeof(*p); i++)
                        j += p[i] >> 2;
        } while (1);

        munmap(p, len);
        return 0;
}

After setting the attrs/schemes/target_ids, then start monitoring:
echo 100000 1000000 1000000 10 1000 > /sys/kernel/debug/damon/attrs
echo 4096 8192000 0 5 10 2000 5 1000 2097152 5000 0 0 0 0 0 3 2 1 > /sys/kernel/debug/damon/schemes

After a while, you can check the demote statistics with the command below, and
you can see that the demote scheme is applied by demoting some cold pages to
the slow memory (AEP) node.

cat /proc/vmstat | grep "demote"
pgdemote_direct 6881
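
The vmstat check above can also be done programmatically, for example to take a snapshot before and after a test run and compare the two. The following is only a minimal sketch: `pgdemote_value()`, `sum_pgdemote()`, and `sum_pgdemote_file()` are hypothetical helper names, not part of the patch set, and the code assumes that every demotion counter in /proc/vmstat is a "name value" line whose name starts with "pgdemote_", as pgdemote_direct does in the output above.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Return the value of one "pgdemote_* <n>" line, or 0 for any other line. */
unsigned long long pgdemote_value(const char *line)
{
        static const char prefix[] = "pgdemote_";
        const char *space, *eol;

        if (strncmp(line, prefix, sizeof(prefix) - 1) != 0)
                return 0;
        space = strchr(line, ' ');
        eol = strchr(line, '\n');
        /* Only parse when the separator belongs to this line. */
        if (!space || (eol && space > eol))
                return 0;
        return strtoull(space + 1, NULL, 10);
}

/* Sum all pgdemote_* counters in vmstat-formatted text held in memory. */
unsigned long long sum_pgdemote(const char *text)
{
        unsigned long long total = 0;
        const char *line = text;

        while (line && *line) {
                total += pgdemote_value(line);
                line = strchr(line, '\n');
                if (line)
                        line++;         /* step past the newline */
        }
        return total;
}

/* Same sum, reading line by line from an open stream such as /proc/vmstat. */
unsigned long long sum_pgdemote_file(FILE *fp)
{
        char buf[256];
        unsigned long long total = 0;

        while (fgets(buf, sizeof(buf), fp))
                total += pgdemote_value(buf);
        return total;
}
```

A caller would open /proc/vmstat with fopen(), pass the stream to sum_pgdemote_file(), and repeat after the workload has run; since the counters accumulate from boot, only the difference between the two readings reflects the test.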
On Tue, 21 Dec 2021 22:32:24 +0800 Baolin Wang <baolin.wang@linux.alibaba.com> wrote:

>
>
> On 12/21/2021 9:26 PM, SeongJae Park wrote:
> > Hi Baolin,
> >
> > On Tue, 21 Dec 2021 17:18:02 +0800 Baolin Wang <baolin.wang@linux.alibaba.com> wrote:
> >
> >> Hi,
> >>
> >> Now on tiered memory system with different memory types, the reclaim path in
> >> shrink_page_list() already supports demoting pages to slow memory node instead
> >> of discarding the pages. However, at that time the fast memory node memory
> >> watermark is already tense, which will increase the memory allocation latency
> >> during page demotion. So a new method from user space demoting cold pages
> >> proactively will be more helpful.
> >>
> >> We can rely on the DAMON in user space to help to monitor the cold memory on
> >> fast memory node, and demote the cold pages to slow memory node proactively to
> >> keep the fast memory node in a healthy state.
> >>
> >> This patch set introduces a new scheme named DAMOS_DEMOTE to support this feature,
> >> and works well from my testing. Any comments are welcome. Thanks.
> >
> > I like the idea, thank you for these patches!  If possible, could you share
> > some details about your tests?
>
> Sure, sorry for not adding more information about my tests.

No problem!

>
> My machine has 64G DRAM + 256G AEP (persistent memory), and you
> should first enable demotion with:
> echo "true" > /sys/kernel/mm/numa/demotion_enabled
>
> Then I just wrote a simple test case, shown below, that mmaps some anonymous
> memory and then keeps reading and writing only half of the mmap buffer, so
> that the other half becomes cold enough to demote.
>
> #include <stdio.h>
> #include <unistd.h>
> #include <sys/mman.h>
>
> int main()
> {
>         size_t len = 50 * 1024 * 1024;
>         size_t scan_len = len / 2;
>         size_t i;
>         /* volatile so the read loop below is not optimized away */
>         volatile unsigned long j = 0;
>         unsigned long *p;
>
>         p = mmap(NULL, len, PROT_READ | PROT_WRITE,
>                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
>         if (p == MAP_FAILED) {
>                 printf("failed to get memory\n");
>                 return -1;
>         }
>
>         /* Touch the whole buffer once so every page is populated */
>         for (i = 0; i < len / sizeof(*p); i++)
>                 p[i] = 0x55aa;
>
>         /* Keep the first half hot; the other half is left to go cold */
>         do {
>                 for (i = 0; i < scan_len / sizeof(*p); i++)
>                         p[i] = 0x55aa;
>
>                 sleep(2);
>
>                 for (i = 0; i < scan_len / sizeof(*p); i++)
>                         j += p[i] >> 2;
>         } while (1);
>
>         munmap(p, len);
>         return 0;
> }
>
> After setting the attrs/schemes/target_ids, then start monitoring:
> echo 100000 1000000 1000000 10 1000 > /sys/kernel/debug/damon/attrs
> echo 4096 8192000 0 5 10 2000 5 1000 2097152 5000 0 0 0 0 0 3 2 1 > /sys/kernel/debug/damon/schemes
>
> After a while, you can check the demote statistics with the command below, and
> you can see that the demote scheme is applied by demoting some cold pages to
> the slow memory (AEP) node.
>
> cat /proc/vmstat | grep "demote"
> pgdemote_direct 6881

Thank you for sharing these great details!

I was just wondering if you have tested and measured the effects of the memory
allocation latency increase during the page demotion, which is invoked by
shrink_page_list(), and also if you have measured how much improvement can be
achieved with DAMON-based demotion in the scenario.  Seems that's not the case,
and I personally think that information is not essential for this patch, so I
see no problem here.  But, if you have tested or have a plan to do that, and if
you could, I think sharing the results on this cover letter would make this
even greater.

Thanks,
SJ
On 12/22/2021 4:54 PM, SeongJae Park wrote:

[snip]

>>
>> My machine has 64G DRAM + 256G AEP (persistent memory), and you
>> should first enable demotion with:
>> echo "true" > /sys/kernel/mm/numa/demotion_enabled
>>
>> Then I just wrote a simple test case, shown below, that mmaps some anonymous
>> memory and then keeps reading and writing only half of the mmap buffer, so
>> that the other half becomes cold enough to demote.
>>
>> #include <stdio.h>
>> #include <unistd.h>
>> #include <sys/mman.h>
>>
>> int main()
>> {
>>         size_t len = 50 * 1024 * 1024;
>>         size_t scan_len = len / 2;
>>         size_t i;
>>         /* volatile so the read loop below is not optimized away */
>>         volatile unsigned long j = 0;
>>         unsigned long *p;
>>
>>         p = mmap(NULL, len, PROT_READ | PROT_WRITE,
>>                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
>>         if (p == MAP_FAILED) {
>>                 printf("failed to get memory\n");
>>                 return -1;
>>         }
>>
>>         /* Touch the whole buffer once so every page is populated */
>>         for (i = 0; i < len / sizeof(*p); i++)
>>                 p[i] = 0x55aa;
>>
>>         /* Keep the first half hot; the other half is left to go cold */
>>         do {
>>                 for (i = 0; i < scan_len / sizeof(*p); i++)
>>                         p[i] = 0x55aa;
>>
>>                 sleep(2);
>>
>>                 for (i = 0; i < scan_len / sizeof(*p); i++)
>>                         j += p[i] >> 2;
>>         } while (1);
>>
>>         munmap(p, len);
>>         return 0;
>> }
>>
>> After setting the attrs/schemes/target_ids, then start monitoring:
>> echo 100000 1000000 1000000 10 1000 > /sys/kernel/debug/damon/attrs
>> echo 4096 8192000 0 5 10 2000 5 1000 2097152 5000 0 0 0 0 0 3 2 1 > /sys/kernel/debug/damon/schemes
>>
>> After a while, you can check the demote statistics with the command below, and
>> you can see that the demote scheme is applied by demoting some cold pages to
>> the slow memory (AEP) node.
>>
>> cat /proc/vmstat | grep "demote"
>> pgdemote_direct 6881
>
> Thank you for sharing these great details!
>
> I was just wondering if you have tested and measured the effects of the memory
> allocation latency increase during the page demotion, which is invoked by
> shrink_page_list(), and also if you have measured how much improvement can be
> achieved with DAMON-based demotion in the scenario.
> Seems that's not the case,

I have not yet tested a real workload with the DAMON demote scheme, and I
think DAMON lacks some functions needed to tune performance on tiered memory
systems. At the least, I think we also need to add a new promotion scheme for
DAMON to promote hot memory from the slow memory node to the fast memory node,
which is on my TODO list.

> and I personally think that information is not essential for this patch, so I
> see no problem here.  But, if you have tested or have a plan to do that, and if
> you could, I think sharing the results on this cover letter would make this
> even greater.

Sure, will do if we find some interesting results with DAMON on tiered memory
systems in the future. Thanks.