Message ID | 1467104499-27517-4-git-send-email-pl@kamp.de (mailing list archive) |
---|---|
State | New, archived |
On 28/06/2016 11:01, Peter Lieven wrote:
> evaluation with the recently introduced maximum stack size monitoring revealed
> that the actual used stack size was never above 4kB so allocating 1MB stack
> for each coroutine is a lot of wasted memory. So reduce the stack size to
> 64kB which should still give enough head room.

If we make the stack this much smaller, there is a non-zero chance of
smashing it. You must add a guard page if you do this (actually more
than one because QEMU will happily have stack frames as big as 16 KB).
The stack counts for RSS but it's not actually allocated memory, so why
does it matter?

Paolo
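As an illustration of the guard-page idea Paolo describes, here is a minimal sketch, assuming POSIX mmap() and mprotect(); alloc_guarded_stack is a made-up name for this example, not QEMU's API. The lowest page of the allocation is made inaccessible so that overflowing a small stack faults immediately instead of silently corrupting adjacent memory.

#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#define COROUTINE_STACK_SIZE (1 << 16)   /* 64 kB usable stack, as in the patch */

/* Allocate a stack with one inaccessible guard page at its low end. */
static void *alloc_guarded_stack(size_t *usable)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t total = COROUTINE_STACK_SIZE + page;

    void *base = mmap(NULL, total, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) {
        return NULL;
    }
    /* Overflowing the stack now hits a PROT_NONE page and faults. */
    if (mprotect(base, page, PROT_NONE) != 0) {
        munmap(base, total);
        return NULL;
    }
    *usable = COROUTINE_STACK_SIZE;
    return (char *)base + page;          /* usable stack starts above the guard */
}

int main(void)
{
    size_t size, page = (size_t)sysconf(_SC_PAGESIZE);
    void *stack = alloc_guarded_stack(&size);
    if (!stack) {
        return 1;
    }
    printf("stack at %p, %zu usable bytes\n", stack, size);
    /* Freeing must cover the guard page as well. */
    munmap((char *)stack - page, size + page);
    return 0;
}

Paolo's remark about 16 KB stack frames would argue for more than one guard page, or a larger stack, but the mechanism is the same.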
* Paolo Bonzini (pbonzini@redhat.com) wrote:
>
>
> On 28/06/2016 11:01, Peter Lieven wrote:
> > evaluation with the recently introduced maximum stack size monitoring revealed
> > that the actual used stack size was never above 4kB so allocating 1MB stack
> > for each coroutine is a lot of wasted memory. So reduce the stack size to
> > 64kB which should still give enough head room.
>
> If we make the stack this much smaller, there is a non-zero chance of
> smashing it. You must add a guard page if you do this (actually more
> than one because QEMU will happily have stack frames as big as 16 KB).
> The stack counts for RSS but it's not actually allocated memory, so why
> does it matter?

I think I'd be interested in seeing the /proc/.../smaps before and after this
change to see if anything is visible and if we can see the difference
in rss etc.

Dave

>
> Paolo
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
On 28.06.2016 at 12:54, Paolo Bonzini wrote:
>
> On 28/06/2016 11:01, Peter Lieven wrote:
>> evaluation with the recently introduced maximum stack size monitoring revealed
>> that the actual used stack size was never above 4kB so allocating 1MB stack
>> for each coroutine is a lot of wasted memory. So reduce the stack size to
>> 64kB which should still give enough head room.
> If we make the stack this much smaller, there is a non-zero chance of
> smashing it. You must add a guard page if you do this (actually more
> than one because QEMU will happily have stack frames as big as 16 KB).
> The stack counts for RSS but it's not actually allocated memory, so why
> does it matter?

Is there an easy way to determine how much of the RSS is actually
allocated? I erroneously assumed it was all allocated.

So as for the stack: is MAP_GROWSDOWN really important? Will the kernel
allocate all pages of the stack otherwise if the last page is written?

I am asking because I don't know if MAP_GROWSDOWN is a good idea, as Peter
mentioned there were discussions to deprecate it.

Peter
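Peter's question about whether writing the last page forces the whole stack to be allocated can be checked with a small stand-alone experiment (illustrative only, not QEMU code): anonymous mmap() memory is demand-paged, so touching one byte near the end of a 1 MB region should back roughly one page, not the full megabyte.

#include <stdio.h>
#include <sys/mman.h>

/* Resident set size of this process in pages (second field of /proc/self/statm). */
static long rss_pages(void)
{
    long size = 0, rss = 0;
    FILE *f = fopen("/proc/self/statm", "r");
    if (!f || fscanf(f, "%ld %ld", &size, &rss) != 2) {
        rss = -1;
    }
    if (f) {
        fclose(f);
    }
    return rss;
}

int main(void)
{
    const size_t len = 1 << 20;                 /* 1 MB, the old stack size */
    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        return 1;
    }

    long before = rss_pages();
    ((volatile char *)p)[len - 1] = 1;          /* touch only the last page */
    long after = rss_pages();

    /* Typically reports 1 page (allowing a little noise from libc itself). */
    printf("RSS grew by %ld page(s)\n", after - before);
    munmap(p, len);
    return 0;
}

MAP_GROWSDOWN should not change this picture: it affects how the mapping can be extended downwards, not whether untouched pages consume physical memory.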
On 28.06.2016 at 12:57, Dr. David Alan Gilbert wrote:
> * Paolo Bonzini (pbonzini@redhat.com) wrote:
>>
>> On 28/06/2016 11:01, Peter Lieven wrote:
>>> evaluation with the recently introduced maximum stack size monitoring revealed
>>> that the actual used stack size was never above 4kB so allocating 1MB stack
>>> for each coroutine is a lot of wasted memory. So reduce the stack size to
>>> 64kB which should still give enough head room.
>> If we make the stack this much smaller, there is a non-zero chance of
>> smashing it. You must add a guard page if you do this (actually more
>> than one because QEMU will happily have stack frames as big as 16 KB).
>> The stack counts for RSS but it's not actually allocated memory, so why
>> does it matter?
> I think I'd be interested in seeing the /proc/.../smaps before and after this
> change to see if anything is visible and if we can see the difference
> in rss etc.

Can you advise what in smaps should especially be looked at?

As for RSS, I can report that the long-term usage is significantly lower.
I had the strange observation that when the VM has been running for some minutes,
the RSS suddenly increases to the whole stack size.

Peter
----- Original Message -----
> From: "Peter Lieven" <pl@kamp.de>
> To: "Paolo Bonzini" <pbonzini@redhat.com>, qemu-devel@nongnu.org
> Cc: kwolf@redhat.com, "peter maydell" <peter.maydell@linaro.org>, mst@redhat.com, dgilbert@redhat.com,
> mreitz@redhat.com, kraxel@redhat.com
> Sent: Tuesday, June 28, 2016 1:13:26 PM
> Subject: Re: [PATCH 03/15] coroutine-ucontext: reduce stack size to 64kB
>
> On 28.06.2016 at 12:54, Paolo Bonzini wrote:
> >
> > On 28/06/2016 11:01, Peter Lieven wrote:
> >> evaluation with the recently introduced maximum stack size monitoring
> >> revealed
> >> that the actual used stack size was never above 4kB so allocating 1MB
> >> stack
> >> for each coroutine is a lot of wasted memory. So reduce the stack size to
> >> 64kB which should still give enough head room.
> > If we make the stack this much smaller, there is a non-zero chance of
> > smashing it. You must add a guard page if you do this (actually more
> > than one because QEMU will happily have stack frames as big as 16 KB).
> > The stack counts for RSS but it's not actually allocated memory, so why
> > does it matter?
>
> Is there an easy way to determinate how much of the RSS is actually
> allocated? I erroneously it was all allocated....
>
> So as for the stack, the MAP_GROWSDOWN is it really important? Will the
> kernel
> allocate all pages of the stack otherwise if the last page is written?
>
> I am asking because I don't know if MAP_GROWSDOWN is a good idea as Peter
> mentioned there were discussions to deprecate it.

I don't know, I found those discussions too. However, I've also seen an
interesting patch to ensure a guard page is kept at the bottom of the VMA.

But thinking more about it: if you use MAP_GROWSDOWN you don't know anymore
where the bottom of the stack is, so you cannot free it correctly, can you?
Or am I completely misunderstanding the purpose of the flag?

I guess it's better to steer clear of it unless we're ready to look at
kernel code for a while...

Paolo
* Peter Lieven (pl@kamp.de) wrote:
> On 28.06.2016 at 12:57, Dr. David Alan Gilbert wrote:
> > * Paolo Bonzini (pbonzini@redhat.com) wrote:
> > >
> > > On 28/06/2016 11:01, Peter Lieven wrote:
> > > > evaluation with the recently introduced maximum stack size monitoring revealed
> > > > that the actual used stack size was never above 4kB so allocating 1MB stack
> > > > for each coroutine is a lot of wasted memory. So reduce the stack size to
> > > > 64kB which should still give enough head room.
> > > If we make the stack this much smaller, there is a non-zero chance of
> > > smashing it. You must add a guard page if you do this (actually more
> > > than one because QEMU will happily have stack frames as big as 16 KB).
> > > The stack counts for RSS but it's not actually allocated memory, so why
> > > does it matter?
> > I think I'd be interested in seeing the /proc/.../smaps before and after this
> > change to see if anything is visible and if we can see the difference
> > in rss etc.
>
> Can you advise what in smaps should be especially looked at.
>
> As for RSS I can report hat the long term usage is significantly lower.
> I had the strange observation that when the VM is running for some minutes
> the RSS suddenly increases to the whole stack size.

You can see the Rss of each mapping; if you knew where your stacks were
it would be easy to see if it was the stacks that were Rss and if
there was anything else odd about them.
If you set the mapping as growsdown then you can see the area that has a 'gd'
in its VmFlags.

Dave

>
> Peter
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
On 28.06.2016 at 13:35, Dr. David Alan Gilbert wrote:
> * Peter Lieven (pl@kamp.de) wrote:
>> On 28.06.2016 at 12:57, Dr. David Alan Gilbert wrote:
>>> * Paolo Bonzini (pbonzini@redhat.com) wrote:
>>>> On 28/06/2016 11:01, Peter Lieven wrote:
>>>>> evaluation with the recently introduced maximum stack size monitoring revealed
>>>>> that the actual used stack size was never above 4kB so allocating 1MB stack
>>>>> for each coroutine is a lot of wasted memory. So reduce the stack size to
>>>>> 64kB which should still give enough head room.
>>>> If we make the stack this much smaller, there is a non-zero chance of
>>>> smashing it. You must add a guard page if you do this (actually more
>>>> than one because QEMU will happily have stack frames as big as 16 KB).
>>>> The stack counts for RSS but it's not actually allocated memory, so why
>>>> does it matter?
>>> I think I'd be interested in seeing the /proc/.../smaps before and after this
>>> change to see if anything is visible and if we can see the difference
>>> in rss etc.
>> Can you advise what in smaps should be especially looked at.
>>
>> As for RSS I can report hat the long term usage is significantly lower.
>> I had the strange observation that when the VM is running for some minutes
>> the RSS suddenly increases to the whole stack size.
> You can see the Rss of each mapping; if you knew where your stacks were
> it would be easy to see if it was the stacks that were Rss and if
> there was anything else odd about them.
> If you set hte mapping as growsdown then you can see the area that has a 'gd'
> in it's VmFlags.

Would you expect to see each 1MB allocation in smaps, or is it possible that
the kernel merges some mappings into bigger ones?

And more importantly, if the regions are merged, Paolo's comment that we
do not need a guard page would not be true, because a coroutine stack could
grow into another coroutine's stack. Looking at the commit from Linus, it
would also be good for that guard page not to have the gd flag.

Some of the regions above 1024kB have an RSS of exactly 4kB * (Size / 1024kB),
which leads to the assumption that each is a coroutine stack where exactly one
page has been allocated.

I am asking because this is what I see, e.g., for a Qemu VM in the mappings
with the "gd" flag:

cat /proc/5031/smaps | grep -B18 gd

7f808aee7000-7f808b9e6000 rw-p 00000000 00:00 0
Size: 11264 kB
Rss: 44 kB
Pss: 44 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 44 kB
Referenced: 44 kB
Anonymous: 44 kB
AnonHugePages: 0 kB
Shared_Hugetlb: 0 kB
Private_Hugetlb: 0 kB
Swap: 0 kB
SwapPss: 0 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Locked: 0 kB
VmFlags: rd wr mr mw me gd ac sd
--
7f808bb01000-7f8090000000 rw-p 00000000 00:00 0
Size: 70656 kB
Rss: 276 kB
Pss: 276 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 276 kB
Referenced: 276 kB
Anonymous: 276 kB
AnonHugePages: 0 kB
Shared_Hugetlb: 0 kB
Private_Hugetlb: 0 kB
Swap: 0 kB
SwapPss: 0 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Locked: 0 kB
VmFlags: rd wr mr mw me gd ac sd
--
7f80940ff000-7f80943fe000 rw-p 00000000 00:00 0
Size: 3072 kB
Rss: 12 kB
Pss: 12 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 12 kB
Referenced: 12 kB
Anonymous: 12 kB
AnonHugePages: 0 kB
Shared_Hugetlb: 0 kB
Private_Hugetlb: 0 kB
Swap: 0 kB
SwapPss: 0 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Locked: 0 kB
VmFlags: rd wr mr mw me gd ac sd
--
7f8095700000-7f80957ff000 rw-p 00000000 00:00 0
Size: 1024 kB
Rss: 4 kB
Pss: 4 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 4 kB
Referenced: 4 kB
Anonymous: 4 kB
AnonHugePages: 0 kB
Shared_Hugetlb: 0 kB
Private_Hugetlb: 0 kB
Swap: 0 kB
SwapPss: 0 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Locked: 0 kB
VmFlags: rd wr mr mw me gd ac sd
--
7f8097301000-7f8097400000 rw-p 00000000 00:00 0
Size: 1024 kB
Rss: 4 kB
Pss: 4 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 4 kB
Referenced: 4 kB
Anonymous: 4 kB
AnonHugePages: 0 kB
Shared_Hugetlb: 0 kB
Private_Hugetlb: 0 kB
Swap: 0 kB
SwapPss: 0 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Locked: 0 kB
VmFlags: rd wr mr mw me gd ac sd
--
7f80974df000-7f80975de000 rw-p 00000000 00:00 0
Size: 1024 kB
Rss: 4 kB
Pss: 4 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 4 kB
Referenced: 4 kB
Anonymous: 4 kB
AnonHugePages: 0 kB
Shared_Hugetlb: 0 kB
Private_Hugetlb: 0 kB
Swap: 0 kB
SwapPss: 0 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Locked: 0 kB
VmFlags: rd wr mr mw me gd ac sd
7f809760c000-7f809770b000 rw-p 00000000 00:00 0
Size: 1024 kB
Rss: 4 kB
Pss: 4 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 4 kB
Referenced: 4 kB
Anonymous: 4 kB
AnonHugePages: 0 kB
Shared_Hugetlb: 0 kB
Private_Hugetlb: 0 kB
Swap: 0 kB
SwapPss: 0 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Locked: 0 kB
VmFlags: rd wr mr mw me gd ac sd
--
7f8097901000-7f8097a00000 rw-p 00000000 00:00 0
Size: 1024 kB
Rss: 4 kB
Pss: 4 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 4 kB
Referenced: 4 kB
Anonymous: 4 kB
AnonHugePages: 0 kB
Shared_Hugetlb: 0 kB
Private_Hugetlb: 0 kB
Swap: 0 kB
SwapPss: 0 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Locked: 0 kB
VmFlags: rd wr mr mw me gd ac sd
--
7f8097b01000-7f8097c00000 rw-p 00000000 00:00 0
Size: 1024 kB
Rss: 4 kB
Pss: 4 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 4 kB
Referenced: 4 kB
Anonymous: 4 kB
AnonHugePages: 0 kB
Shared_Hugetlb: 0 kB
Private_Hugetlb: 0 kB
Swap: 0 kB
SwapPss: 0 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Locked: 0 kB
VmFlags: rd wr mr mw me gd ac sd
--
7f8097d01000-7f8097e00000 rw-p 00000000 00:00 0
Size: 1024 kB
Rss: 4 kB
Pss: 4 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 4 kB
Referenced: 4 kB
Anonymous: 4 kB
AnonHugePages: 0 kB
Shared_Hugetlb: 0 kB
Private_Hugetlb: 0 kB
Swap: 0 kB
SwapPss: 0 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Locked: 0 kB
VmFlags: rd wr mr mw me gd ac sd
--
7f8197f01000-7f8198000000 rw-p 00000000 00:00 0
Size: 1024 kB
Rss: 4 kB
Pss: 4 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 4 kB
Referenced: 4 kB
Anonymous: 4 kB
AnonHugePages: 0 kB
Shared_Hugetlb: 0 kB
Private_Hugetlb: 0 kB
Swap: 0 kB
SwapPss: 0 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Locked: 0 kB
VmFlags: rd wr mr mw me gd ac sd
--
7f81b4001000-7f81b4200000 rw-p 00000000 00:00 0
Size: 2048 kB
Rss: 20 kB
Pss: 20 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 20 kB
Referenced: 20 kB
Anonymous: 20 kB
AnonHugePages: 0 kB
Shared_Hugetlb: 0 kB
Private_Hugetlb: 0 kB
Swap: 0 kB
SwapPss: 0 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Locked: 0 kB
VmFlags: rd wr mr mw me gd ac sd
--
7ffd337e2000-7ffd33805000 rw-p 00000000 00:00 0 [stack]
Size: 144 kB
Rss: 64 kB
Pss: 64 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 64 kB
Referenced: 64 kB
Anonymous: 64 kB
AnonHugePages: 0 kB
Shared_Hugetlb: 0 kB
Private_Hugetlb: 0 kB
Swap: 0 kB
SwapPss: 0 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Locked: 0 kB
VmFlags: rd wr mr mw me gd ac

Peter
* Peter Lieven (pl@kamp.de) wrote:
> On 28.06.2016 at 13:35, Dr. David Alan Gilbert wrote:
> > * Peter Lieven (pl@kamp.de) wrote:
> > > On 28.06.2016 at 12:57, Dr. David Alan Gilbert wrote:
> > > > * Paolo Bonzini (pbonzini@redhat.com) wrote:
> > > > > On 28/06/2016 11:01, Peter Lieven wrote:
> > > > > > evaluation with the recently introduced maximum stack size monitoring revealed
> > > > > > that the actual used stack size was never above 4kB so allocating 1MB stack
> > > > > > for each coroutine is a lot of wasted memory. So reduce the stack size to
> > > > > > 64kB which should still give enough head room.
> > > > > If we make the stack this much smaller, there is a non-zero chance of
> > > > > smashing it. You must add a guard page if you do this (actually more
> > > > > than one because QEMU will happily have stack frames as big as 16 KB).
> > > > > The stack counts for RSS but it's not actually allocated memory, so why
> > > > > does it matter?
> > > > I think I'd be interested in seeing the /proc/.../smaps before and after this
> > > > change to see if anything is visible and if we can see the difference
> > > > in rss etc.
> > > Can you advise what in smaps should be especially looked at.
> > >
> > > As for RSS I can report hat the long term usage is significantly lower.
> > > I had the strange observation that when the VM is running for some minutes
> > > the RSS suddenly increases to the whole stack size.
> > You can see the Rss of each mapping; if you knew where your stacks were
> > it would be easy to see if it was the stacks that were Rss and if
> > there was anything else odd about them.
> > If you set hte mapping as growsdown then you can see the area that has a 'gd'
> > in it's VmFlags.
>
> Would you expect to see each 1MB allocation in smaps or is it possible that
> the kernel merges some mappings to bigger ones?
>
> And more importantly if the regions are merged Paolos comment about we
> do not need a guard page would not be true because a coroutine stack could
> grow into annother coroutines stack. Looking at the commit from Linus it
> would also be good to have that guard page not having the gd flag.

Hmm I'm not sure; one for Paolo.

> Some of the regions above 1024kB have an RSS of exactly 4kB * (Size / 1024kB)
> which leads to the assumption that it is a corouine stack where exactly one page
> has been allocated.
>
> I am asking because this is what I e.g. see for a Qemu VM with flags "gd":

However, what that does show is that if you add up all the Rss, it's still
near-enough nothing worth worrying about.

Maybe it looks different in the old world before you mmap'd it; you could try
going back to the g_malloc'd version but printf'ing the address you get, then
comparing that with smaps to see what the malloc'd world ended up with mapped.
Dave

> cat /proc/5031/smaps | grep -B18 gd
> [...]
>
> Peter
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
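Dave's "add up all the Rss" check can also be automated. Here is a small helper (illustrative, not from the thread) that sums the Rss of every mapping whose VmFlags contain gd in a smaps file like the one Peter posted:

#include <stdio.h>
#include <string.h>

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s /proc/<pid>/smaps\n", argv[0]);
        return 1;
    }
    FILE *f = fopen(argv[1], "r");
    if (!f) {
        perror("fopen");
        return 1;
    }

    char line[512];
    long rss = 0, total_gd = 0;
    while (fgets(line, sizeof(line), f)) {
        if (sscanf(line, "Rss: %ld kB", &rss) == 1) {
            continue;                 /* remember the Rss of the current mapping */
        }
        if (strncmp(line, "VmFlags:", 8) == 0 && strstr(line, " gd")) {
            total_gd += rss;          /* VmFlags ends the entry; count it if 'gd' */
        }
    }
    fclose(f);

    printf("Rss of growsdown mappings: %ld kB\n", total_gd);
    return 0;
}

On the dump above, this comes to a few hundred kB in total, which matches Dave's observation that the resident part of the stacks is negligible.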
On 28.06.2016 at 16:20, Dr. David Alan Gilbert wrote:
> * Peter Lieven (pl@kamp.de) wrote:
>> On 28.06.2016 at 13:35, Dr. David Alan Gilbert wrote:
>>> * Peter Lieven (pl@kamp.de) wrote:
>>>> On 28.06.2016 at 12:57, Dr. David Alan Gilbert wrote:
>>>>> * Paolo Bonzini (pbonzini@redhat.com) wrote:
>>>>>> On 28/06/2016 11:01, Peter Lieven wrote:
>>>>>>> evaluation with the recently introduced maximum stack size monitoring revealed
>>>>>>> that the actual used stack size was never above 4kB so allocating 1MB stack
>>>>>>> for each coroutine is a lot of wasted memory. So reduce the stack size to
>>>>>>> 64kB which should still give enough head room.
>>>>>> If we make the stack this much smaller, there is a non-zero chance of
>>>>>> smashing it. You must add a guard page if you do this (actually more
>>>>>> than one because QEMU will happily have stack frames as big as 16 KB).
>>>>>> The stack counts for RSS but it's not actually allocated memory, so why
>>>>>> does it matter?
>>>>> I think I'd be interested in seeing the /proc/.../smaps before and after this
>>>>> change to see if anything is visible and if we can see the difference
>>>>> in rss etc.
>>>> Can you advise what in smaps should be especially looked at.
>>>>
>>>> As for RSS I can report hat the long term usage is significantly lower.
>>>> I had the strange observation that when the VM is running for some minutes
>>>> the RSS suddenly increases to the whole stack size.
>>> You can see the Rss of each mapping; if you knew where your stacks were
>>> it would be easy to see if it was the stacks that were Rss and if
>>> there was anything else odd about them.
>>> If you set hte mapping as growsdown then you can see the area that has a 'gd'
>>> in it's VmFlags.
>> Would you expect to see each 1MB allocation in smaps or is it possible that
>> the kernel merges some mappings to bigger ones?
>>
>> And more importantly if the regions are merged Paolos comment about we
>> do not need a guard page would not be true because a coroutine stack could
>> grow into annother coroutines stack. Looking at the commit from Linus it
>> would also be good to have that guard page not having the gd flag.
> Hmm I'm not sure; one for Paolo.

My fault. The second mmap call with the pointer to the stack must carry the
MAP_FIXED flag.

Peter
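A sketch of what Peter seems to mean by the second mmap needing MAP_FIXED (my reading of the remark, not the actual patch; it also leaves out MAP_GROWSDOWN, which the gd mappings above suggest the real code used): reserve the whole range PROT_NONE first, then map the usable stack over its upper part with MAP_FIXED so the kernel cannot place it elsewhere, leaving the untouched low page as a guard.

#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t stack_size = 1 << 16;              /* 64 kB usable */
    size_t total = stack_size + page;         /* plus one low guard page */

    /* Step 1: reserve the whole range, inaccessible for now. */
    void *base = mmap(NULL, total, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) {
        return 1;
    }

    /* Step 2: map the usable part read/write at a fixed address inside the
     * reservation.  Without MAP_FIXED the kernel treats the address only as
     * a hint and may place the second mapping somewhere else entirely. */
    void *stack = mmap((char *)base + page, stack_size,
                       PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    if (stack == MAP_FAILED) {
        munmap(base, total);
        return 1;
    }

    printf("guard page at %p, stack at %p\n", base, stack);
    munmap(base, total);                      /* frees guard and stack together */
    return 0;
}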
diff --git a/util/coroutine-ucontext.c b/util/coroutine-ucontext.c
index 27c61f3..7f1d541 100644
--- a/util/coroutine-ucontext.c
+++ b/util/coroutine-ucontext.c
@@ -88,7 +88,7 @@ static void coroutine_trampoline(int i0, int i1)
     }
 }
 
-#define COROUTINE_STACK_SIZE (1 << 20)
+#define COROUTINE_STACK_SIZE (1 << 16)
 
 Coroutine *qemu_coroutine_new(void)
 {
Evaluation with the recently introduced maximum stack size monitoring revealed
that the actual used stack size was never above 4kB, so allocating a 1MB stack
for each coroutine is a lot of wasted memory. Reduce the stack size to 64kB,
which should still give enough headroom.

Signed-off-by: Peter Lieven <pl@kamp.de>
---
 util/coroutine-ucontext.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)