Message ID: 20200311034441.23243-1-jaewon31.kim@samsung.com (mailing list archive)
Series: meminfo: introduce extra meminfo
On Wed, Mar 11, 2020 at 12:44:38PM +0900, Jaewon Kim wrote:
> /proc/meminfo and show_free_areas do not show the full system-wide
> memory usage. There seems to be a large amount of hidden memory,
> especially on embedded Android systems, because they usually have HW
> IP blocks which have no internal memory and instead use common DRAM.
>
> On Android, most of that hidden memory appears to be vmalloc pages,
> ION system heap memory, graphics memory, and memory for DRAM-based
> compressed swap storage. Some of it may be visible through other
> nodes, but it would be useful if /proc/meminfo showed all of this
> extra memory information, and show_mem also needs to print it in an
> OOM situation.
>
> Fortunately, vmalloc pages are already shown since commit 97105f0ab7b8
> ("mm: vmalloc: show number of vmalloc pages in /proc/meminfo"). Swap
> memory using zsmalloc can be seen through vmstat since commit
> 91537fee0013 ("mm: add NR_ZSMALLOC to vmstat"), but not in
> /proc/meminfo.
>
> The memory usage of a specific driver can vary, so showing the usage
> through the upstream meminfo.c is not easy. To print the extra memory
> usage of a driver, introduce the following APIs. Each driver needs to
> maintain its count as an atomic_long_t.
>
> int register_extra_meminfo(atomic_long_t *val, int shift,
>                            const char *name);
> int unregister_extra_meminfo(atomic_long_t *val);
>
> Currently the ION system heap allocator and zsmalloc pages are
> registered. Additionally tested on a local graphics driver.
>
> e.g.) cat /proc/meminfo | tail -3
> IonSystemHeap:    242620 kB
> ZsPages:          203860 kB
> GraphicDriver:    196576 kB
>
> e.g.) show_mem on oom
> <6>[  420.856428]  Mem-Info:
> <6>[  420.856433]  IonSystemHeap:32813kB ZsPages:44114kB GraphicDriver::13091kB
> <6>[  420.856450]  active_anon:957205 inactive_anon:159383 isolated_anon:0

The idea is nice and helpful, but I'm sure that the interface will be
abused almost immediately. I expect that every driver will register
with such an API.

First it will be done by "large" drivers, and after that everyone will
copy/paste.

Thanks
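For illustration, driver-side usage of the proposed interface might
look like the sketch below. This is only a sketch under stated
assumptions: the foo_* names are hypothetical, only the two
register/unregister prototypes come from the cover letter, and shift
is assumed to convert the counter's unit to bytes (PAGE_SHIFT for a
counter kept in pages).

    #include <linux/atomic.h>
    #include <linux/gfp.h>
    #include <linux/mm.h>
    #include <linux/module.h>

    /* Hypothetical driver keeping its own page counter; only the
     * register/unregister prototypes are from the cover letter. */
    static atomic_long_t foo_pages = ATOMIC_LONG_INIT(0);

    static struct page *foo_alloc(unsigned int order)
    {
            struct page *page = alloc_pages(GFP_KERNEL, order);

            if (page)
                    atomic_long_add(1UL << order, &foo_pages);
            return page;
    }

    static void foo_free(struct page *page, unsigned int order)
    {
            __free_pages(page, order);
            atomic_long_sub(1UL << order, &foo_pages);
    }

    static int __init foo_init(void)
    {
            /* counter is in pages; shift by PAGE_SHIFT to get bytes
             * (assumed semantics of 'shift') */
            return register_extra_meminfo(&foo_pages, PAGE_SHIFT,
                                          "FooDriver");
    }

    static void __exit foo_exit(void)
    {
            unregister_extra_meminfo(&foo_pages);
    }

On unload, the driver must unregister before the counter's storage goes
away, which is presumably why unregistration is keyed on the counter
pointer.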
On Mar 11, 2020 at 16:25, Leon Romanovsky wrote:
> On Wed, Mar 11, 2020 at 12:44:38PM +0900, Jaewon Kim wrote:
>> /proc/meminfo and show_free_areas do not show the full system-wide
>> memory usage. There seems to be a large amount of hidden memory,
>> especially on embedded Android systems. [...]
> The idea is nice and helpful, but I'm sure that the interface will be
> abused almost immediately. I expect that every driver will register
> with such an API.
>
> First it will be done by "large" drivers, and after that everyone will
> copy/paste.
>
> Thanks

I thought using it would be up to driver developers. If it is abused,
/proc/meminfo will show too much info for that device. What about a new
node, /proc/meminfo_extra, to gather this info without cluttering the
original /proc/meminfo?

Thank you
On Fri, Mar 13, 2020 at 01:39:14PM +0900, Jaewon Kim wrote:
> On Mar 11, 2020 at 16:25, Leon Romanovsky wrote:
> > The idea is nice and helpful, but I'm sure that the interface will
> > be abused almost immediately. I expect that every driver will
> > register with such an API.
> >
> > First it will be done by "large" drivers, and after that everyone
> > will copy/paste.
> I thought using it would be up to driver developers. If it is abused,
> /proc/meminfo will show too much info for that device. What about a
> new node, /proc/meminfo_extra, to gather this info without cluttering
> the original /proc/meminfo?

I don't know if it is applicable for all users, but for drivers this
info is better placed in /sys/ as a separate file (for example
/sys/class/net/wlp3s0/*), with driver core responsible for
register/unregister. That would ensure all drivers expose the info
without needing to register, and would keep /proc/meminfo and
/proc/meminfo_extra from growing too large.

Thanks
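To make this concrete, the per-device file Leon describes could be an
ordinary read-only sysfs attribute. A minimal sketch, assuming a
hypothetical foo_priv driver structure and attribute name; no such
driver-core auto-registration exists today:

    #include <linux/device.h>
    #include <linux/mm.h>

    struct foo_priv {
            atomic_long_t pages;    /* pages held by this device */
    };

    /* Read-only "meminfo" attribute; for a stable name across drivers,
     * driver core would have to create one of these per device. */
    static ssize_t meminfo_show(struct device *dev,
                                struct device_attribute *attr, char *buf)
    {
            struct foo_priv *priv = dev_get_drvdata(dev);

            return scnprintf(buf, PAGE_SIZE, "%lu kB\n",
                             atomic_long_read(&priv->pages) <<
                             (PAGE_SHIFT - 10));
    }
    static DEVICE_ATTR_RO(meminfo);

A tool could then sum such files across devices, which is exactly the
find(1) discoverability concern raised in the next reply.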
+CC linux-api, please include in future versions as well

On 3/11/20 4:44 AM, Jaewon Kim wrote:
> /proc/meminfo and show_free_areas do not show the full system-wide
> memory usage. There seems to be a large amount of hidden memory,
> especially on embedded Android systems. [...]
>
> int register_extra_meminfo(atomic_long_t *val, int shift,
>                            const char *name);
> int unregister_extra_meminfo(atomic_long_t *val);
> [...]
> e.g.) show_mem on oom
> <6>[  420.856428]  Mem-Info:
> <6>[  420.856433]  IonSystemHeap:32813kB ZsPages:44114kB GraphicDriver::13091kB
> <6>[  420.856450]  active_anon:957205 inactive_anon:159383 isolated_anon:0

I like the idea and the dynamic nature of this, so that drivers not
present wouldn't add lots of useless zeroes to the output. It also
simplifies the decision of "what is important enough to need its own
meminfo entry".

The suggestion of hunting for per-driver /sys files would only work if
there were a common name for such files so one can find(1) them easily.
It also doesn't work for the oom/failed-alloc warning output.

I think a new meminfo_extra file is a reasonable compromise, as there
might be tools periodically reading /proc/meminfo, and this would limit
the overhead on them.

> Jaewon Kim (3):
>   proc/meminfo: introduce extra meminfo
>   mm: zsmalloc: include zs page size in proc/meminfo
>   android: ion: include system heap size in proc/meminfo
>
>  drivers/staging/android/ion/ion.c             |   2 +
>  drivers/staging/android/ion/ion.h             |   1 +
>  drivers/staging/android/ion/ion_system_heap.c |   2 +
>  fs/proc/meminfo.c                             | 103 ++++++++++++++++++++++++++
>  include/linux/mm.h                            |   4 +
>  lib/show_mem.c                                |   1 +
>  mm/zsmalloc.c                                 |   2 +
>  7 files changed, 115 insertions(+)
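The read side of such a registry is straightforward seq_file code. A
sketch of one assumed shape (not the actual patch), consistent with
treating shift as a unit-to-bytes conversion:

    #include <linux/atomic.h>
    #include <linux/list.h>
    #include <linux/seq_file.h>
    #include <linux/spinlock.h>

    struct extra_meminfo {
            struct list_head list;
            atomic_long_t *val;
            int shift;              /* counter unit -> bytes */
            char name[16];
    };

    static LIST_HEAD(extra_meminfo_list);
    static DEFINE_SPINLOCK(extra_meminfo_lock);

    /* Called from the /proc/meminfo_extra show routine: walk the
     * registered entries and print each counter in kB. */
    static void extra_meminfo_show(struct seq_file *m)
    {
            struct extra_meminfo *mi;

            spin_lock(&extra_meminfo_lock);
            list_for_each_entry(mi, &extra_meminfo_list, list)
                    seq_printf(m, "%s: %8lu kB\n", mi->name,
                               (atomic_long_read(mi->val) << mi->shift)
                               >> 10);
            spin_unlock(&extra_meminfo_lock);
    }

The oom-path show_mem() output would walk the same list, which is one
reason a non-sleeping lock is assumed here.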
On Fri, Mar 13, 2020 at 04:19:36PM +0100, Vlastimil Babka wrote:
> +CC linux-api, please include in future versions as well
>
> On 3/11/20 4:44 AM, Jaewon Kim wrote:
> > /proc/meminfo and show_free_areas do not show the full system-wide
> > memory usage. [...]
>
> I like the idea and the dynamic nature of this, so that drivers not
> present wouldn't add lots of useless zeroes to the output. It also
> simplifies the decision of "what is important enough to need its own
> meminfo entry".
>
> The suggestion of hunting for per-driver /sys files would only work if
> there were a common name for such files so one can find(1) them
> easily. It also doesn't work for the oom/failed-alloc warning output.

Of course there is a need for a stable name for such output; this is
why driver core should be responsible for it, not driver authors.

The use case I had in mind is slightly different from watching for OOM.

I'm interested in optimizing our drivers' memory footprint to allow
better scale in SR-IOV mode, where one device creates many separate
copies of itself. Those copies can easily take gigabytes of RAM due to
the need to optimize for high-performance networking. Sometimes it is
the amount of memory, not the HW, that actually limits the scale
factor.

So I would imagine this feature being used as an aid for driver
developers and not for runtime decisions.

My 2-cents.

Thanks
On Mar 14, 2020 at 02:48, Leon Romanovsky wrote:
> On Fri, Mar 13, 2020 at 04:19:36PM +0100, Vlastimil Babka wrote:
> > +CC linux-api, please include in future versions as well
> > [...]
> Of course there is a need for a stable name for such output; this is
> why driver core should be responsible for it, not driver authors.
>
> The use case I had in mind is slightly different from watching for
> OOM.
>
> I'm interested in optimizing our drivers' memory footprint to allow
> better scale in SR-IOV mode, where one device creates many separate
> copies of itself. Those copies can easily take gigabytes of RAM due to
> the need to optimize for high-performance networking. Sometimes it is
> the amount of memory, not the HW, that actually limits the scale
> factor.
>
> So I would imagine this feature being used as an aid for driver
> developers and not for runtime decisions.
>
> My 2-cents.
>
> Thanks

Thank you for your comment.
My idea, I think, can help each driver developer see their memory
usage, but I'd also like to see overall memory usage through one node.

Let me know if you have more comments.
I am planning to move my logic to a new node, /proc/meminfo_extra, in
v2.

Thank you
Jaewon Kim
On Mon, Mar 16, 2020 at 01:07:08PM +0900, Jaewon Kim wrote:
> On Mar 14, 2020 at 02:48, Leon Romanovsky wrote:
> > [...]
> > So I would imagine this feature being used as an aid for driver
> > developers and not for runtime decisions.
>
> Thank you for your comment.
> My idea, I think, can help each driver developer see their memory
> usage.
> But I'd also like to see overall memory usage through one node.

It is more than enough :).

> Let me know if you have more comments.
> I am planning to move my logic to a new node, /proc/meminfo_extra, in
> v2.

Can you please help me understand what that file will look like once
many drivers start to use this interface? Will I see multiple lines?

Something like:
driver1 ....
driver2 ....
driver3 ....
...
driver1000 ....

How can we extend it to support subsystem core code?

Thanks

> Thank you
> Jaewon Kim
On Mon, Mar 16, 2020 at 5:32 PM, Leon Romanovsky <leon@kernel.org> wrote:
>
> On Mon, Mar 16, 2020 at 01:07:08PM +0900, Jaewon Kim wrote:
> > [...]
> > Let me know if you have more comments.
> > I am planning to move my logic to a new node, /proc/meminfo_extra,
> > in v2.
>
> Can you please help me understand what that file will look like once
> many drivers start to use this interface? Will I see multiple lines?
>
> Something like:
> driver1 ....
> driver2 ....
> driver3 ....
> ...
> driver1000 ....
>
> How can we extend it to support subsystem core code?

I do not have a plan to support subsystem core code.

I just want /proc/meminfo_extra to show the size of alloc_pages-based
allocations rather than slab usage; it is meant to expose hidden huge
memory. I think most drivers do not need to register their size with
/proc/meminfo_extra, because drivers usually use the slab APIs rather
than alloc_pages, and /proc/slabinfo already shows slab usage in
detail.

As candidates for /proc/meminfo_extra, I expect only the few drivers
using huge memory, like over 100 MB obtained from alloc_pages.

As you say, if there were a static node in /sys for each driver, it
could be used by all drivers, and a sysfs class might be better for
showing categorized sums. But /proc/meminfo_extra can be another way
to show that hidden huge memory; I mean, your idea and mine are not
exclusive.

Thank you
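The slab-vs-alloc_pages distinction drawn above can be seen in two
lines of hypothetical driver code; the kmalloc() below is already
accounted under Slab: in /proc/meminfo and in /proc/slabinfo, while the
raw page allocation is attributed nowhere by default:

    #include <linux/gfp.h>
    #include <linux/slab.h>

    static void visibility_example(void)
    {
            /* Visible today: counted in /proc/slabinfo and in the
             * Slab: line of /proc/meminfo. */
            void *buf = kmalloc(4096, GFP_KERNEL);

            /* Hidden today: an order-8 allocation (1 MB with 4 KB
             * pages) that no meminfo field attributes to the caller;
             * this is the memory /proc/meminfo_extra would expose. */
            struct page *pages = alloc_pages(GFP_KERNEL, 8);

            kfree(buf);
            if (pages)
                    __free_pages(pages, 8);
    }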
On Tue, Mar 17, 2020 at 12:04:46PM +0900, Jaewon Kim wrote:
> [...]
> I do not have a plan to support subsystem core code.

Fair enough.

> I just want /proc/meminfo_extra to show the size of alloc_pages-based
> allocations rather than slab usage; it is meant to expose hidden huge
> memory. I think most drivers do not need to register their size with
> /proc/meminfo_extra, because drivers usually use the slab APIs rather
> than alloc_pages, and /proc/slabinfo already shows slab usage in
> detail.

The problem with this statement is that the drivers consuming memory
are exactly the ones interested in this interface. I may not be
accurate here, but I think all RDMA and major NIC drivers will want to
export this information.

On my machine, that is something like 6 devices.

> As candidates for /proc/meminfo_extra, I expect only the few drivers
> using huge memory, like over 100 MB obtained from alloc_pages.
>
> As you say, if there were a static node in /sys for each driver, it
> could be used by all drivers, and a sysfs class might be better for
> showing categorized sums. But /proc/meminfo_extra can be another way
> to show that hidden huge memory; I mean, your idea and mine are not
> exclusive.

It is just better to have one interface.

Thanks
On Mar 17, 2020 at 23:37, Leon Romanovsky wrote:
> On Tue, Mar 17, 2020 at 12:04:46PM +0900, Jaewon Kim wrote:
> > [...]
> > But /proc/meminfo_extra can be another way to show that hidden huge
> > memory; I mean, your idea and mine are not exclusive.
>
> It is just better to have one interface.

Regarding having one interface: if we need to create a
meminfo_extra-like node in /sys, then I think further discussion with
more people is needed. If there is no logical problem with creating
/proc/meminfo_extra, I'd like to prepare a v2 patch and gather more
comments on it. Please help again with further discussion.

Thank you
On Wed, Mar 18, 2020 at 05:58:51PM +0900, Jaewon Kim wrote:
> [...]
> Regarding having one interface: if we need to create a
> meminfo_extra-like node in /sys, then I think further discussion with
> more people is needed. If there is no logical problem with creating
> /proc/meminfo_extra, I'd like to prepare a v2 patch and gather more
> comments on it. Please help again with further discussion.

No problem, but can you please put a summary of this discussion in the
cover letter of v2 and add Greg KH as the driver core maintainer? It
will save us from going in circles.

Thanks
On 03/11/20 at 12:44pm, Jaewon Kim wrote:
> /proc/meminfo and show_free_areas do not show the full system-wide
> memory usage. There seems to be a large amount of hidden memory,
> especially on embedded Android systems. [...]
>
> int register_extra_meminfo(atomic_long_t *val, int shift,
>                            const char *name);
> int unregister_extra_meminfo(atomic_long_t *val);
> [...]

Kdump is also a use case for better memory usage info: it runs with
limited memory, and we see more OOM cases from device drivers than from
userspace processes. I think this could be helpful if drivers implement
and register the hook.

But it would be ideal if we had some tracing code to trace memory
alloc/free and gather the usage info automatically.

Anyway, this proposal is better than none, thumbs up! Let me cc Kairui,
who is working on kdump OOM issues.

Thanks
Dave
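The tracing idea could in principle be prototyped on the existing kmem
tracepoints. A rough sketch, with the caveats that the attribution
policy here is a stub (everything goes into one counter, and freeing is
ignored for brevity), and that mm_page_alloc is assumed to be reachable
from the caller; it is not exported to modules on all kernels, so treat
this as built-in pseudocode:

    #include <linux/atomic.h>
    #include <linux/module.h>
    #include <trace/events/kmem.h>

    static atomic_long_t traced_pages = ATOMIC_LONG_INIT(0);

    /* Probe attached to kmem:mm_page_alloc; a real implementation
     * would attribute pages to a driver (e.g. by call site) instead of
     * using a single global counter. */
    static void probe_page_alloc(void *data, struct page *page,
                                 unsigned int order, gfp_t gfp_flags,
                                 int migratetype)
    {
            atomic_long_add(1UL << order, &traced_pages);
    }

    static int __init tracer_init(void)
    {
            return register_trace_mm_page_alloc(probe_page_alloc, NULL);
    }

    static void __exit tracer_exit(void)
    {
            unregister_trace_mm_page_alloc(probe_page_alloc, NULL);
            tracepoint_synchronize_unregister();
    }

    module_init(tracer_init);
    module_exit(tracer_exit);
    MODULE_LICENSE("GPL");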