Message ID | 20230410073228.23043-1-jaewon31.kim@samsung.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [v3] dma-buf/heaps: system_heap: avoid too much allocation | expand |
Sorry for being late. I know there was some pre-existing discussion around that but I didn't have time to participate. On Mon 10-04-23 16:32:28, Jaewon Kim wrote: > @@ -350,6 +350,9 @@ static struct dma_buf *system_heap_allocate(struct dma_heap *heap, > struct page *page, *tmp_page; > int i, ret = -ENOMEM; > > + if (len / PAGE_SIZE > totalram_pages()) > + return ERR_PTR(-ENOMEM); > + This is an antipattern imho. Check 7661809d493b ("mm: don't allow oversized kvmalloc() calls") how kvmalloc has dealt with a similar issue. totalram_pages doesn't really tell you anything about incorrect users. You might be on a low memory system where the request size is sane normally, it just doesn't fit into memory on that particular machine. > buffer = kzalloc(sizeof(*buffer), GFP_KERNEL); > if (!buffer) > return ERR_PTR(-ENOMEM); > -- > 2.17.1
>Sorry for being late. I know there was some pre-existing discussion >around that but I didn't have time to participate. > >On Mon 10-04-23 16:32:28, Jaewon Kim wrote: >> @@ -350,6 +350,9 @@ static struct dma_buf *system_heap_allocate(struct dma_heap *heap, >> struct page *page, *tmp_page; >> int i, ret = -ENOMEM; >> >> + if (len / PAGE_SIZE > totalram_pages()) >> + return ERR_PTR(-ENOMEM); >> + > >This is an antipattern imho. Check 7661809d493b ("mm: don't allow >oversized kvmalloc() calls") how kvmalloc has dealt with a similar Hello Thank you for the information. I tried to search the macro of INT_MAX. include/vdso/limits.h #define INT_MAX ((int)(~0U >> 1)) AFAIK the dma-buf system heap user can request that huge size more than 2GB. So I think totalram_pages() is better than INT_MAX in this case. >issue. totalram_pages doesn't really tell you anything about incorrect >users. You might be on a low memory system where the request size is >sane normally, it just doesn't fit into memory on that particular >machine. Sorry maybe I'm not fully understand what you meant. User may requested a huge size like 3GB on 2GB ram device. But I think that should be rejected because it is bigger than the device ram size. Jaewon Kim > > >> buffer = kzalloc(sizeof(*buffer), GFP_KERNEL); >> if (!buffer) >> return ERR_PTR(-ENOMEM); >> -- >> 2.17.1 > >-- >Michal Hocko >SUSE Labs
On Wed 12-04-23 17:57:26, Jaewon Kim wrote: > >Sorry for being late. I know there was some pre-existing discussion > >around that but I didn't have time to participate. > > > >On Mon 10-04-23 16:32:28, Jaewon Kim wrote: > >> @@ -350,6 +350,9 @@ static struct dma_buf *system_heap_allocate(struct dma_heap *heap, > >> struct page *page, *tmp_page; > >> int i, ret = -ENOMEM; > >> > >> + if (len / PAGE_SIZE > totalram_pages()) > >> + return ERR_PTR(-ENOMEM); > >> + > > > >This is an antipattern imho. Check 7661809d493b ("mm: don't allow > >oversized kvmalloc() calls") how kvmalloc has dealt with a similar > > Hello Thank you for the information. > > I tried to search the macro of INT_MAX. > > include/vdso/limits.h > #define INT_MAX ((int)(~0U >> 1)) > > AFAIK the dma-buf system heap user can request that huge size more than 2GB. Do you have any pointers? This all is unreclaimable memory, right? How are those users constrained to not go overboard? > So > I think totalram_pages() is better than INT_MAX in this case. > > >issue. totalram_pages doesn't really tell you anything about incorrect > >users. You might be on a low memory system where the request size is > >sane normally, it just doesn't fit into memory on that particular > >machine. > > Sorry maybe I'm not fully understand what you meant. User may requested > a huge size like 3GB on 2GB ram device. But I think that should be rejected > because it is bigger than the device ram size. Even totalram_pages/10 can be just unfeasible amount of data to be allocated without a major disruption. totalram_pages is no measure of the memory availability. If you want to have a ballpark estimation then si_mem_available might be something you are looking for. But I thought the sole purpose of this patch is to catch obviously buggy callers (like sign overflow lenght etc) rather than any memory consumption sanity check.
>On Wed 12-04-23 17:57:26, Jaewon Kim wrote: >> >Sorry for being late. I know there was some pre-existing discussion >> >around that but I didn't have time to participate. >> > >> >On Mon 10-04-23 16:32:28, Jaewon Kim wrote: >> >> @@ -350,6 +350,9 @@ static struct dma_buf *system_heap_allocate(struct dma_heap *heap, >> >> struct page *page, *tmp_page; >> >> int i, ret = -ENOMEM; >> >> >> >> + if (len / PAGE_SIZE > totalram_pages()) >> >> + return ERR_PTR(-ENOMEM); >> >> + >> > >> >This is an antipattern imho. Check 7661809d493b ("mm: don't allow >> >oversized kvmalloc() calls") how kvmalloc has dealt with a similar >> >> Hello Thank you for the information. >> >> I tried to search the macro of INT_MAX. >> >> include/vdso/limits.h >> #define INT_MAX ((int)(~0U >> 1)) >> >> AFAIK the dma-buf system heap user can request that huge size more than 2GB. > >Do you have any pointers? This all is unreclaimable memory, right? How >are those users constrained to not go overboard? Correct dma-buf system heap memory is unreclaimable. To avoid that huge request, this patch includes __GFP_RETRY_MAYFAIL. #define LOW_ORDER_GFP (GFP_HIGHUSER | __GFP_ZERO | __GFP_RETRY_MAYFAIL) > >> So >> I think totalram_pages() is better than INT_MAX in this case. >> >> >issue. totalram_pages doesn't really tell you anything about incorrect >> >users. You might be on a low memory system where the request size is >> >sane normally, it just doesn't fit into memory on that particular >> >machine. >> >> Sorry maybe I'm not fully understand what you meant. User may requested >> a huge size like 3GB on 2GB ram device. But I think that should be rejected >> because it is bigger than the device ram size. > >Even totalram_pages/10 can be just unfeasible amount of data to be >allocated without a major disruption. totalram_pages is no measure of >the memory availability. >If you want to have a ballpark estimation then si_mem_available might be >something you are looking for. But I thought the sole purpose of this >patch is to catch obviously buggy callers (like sign overflow lenght >etc) rather than any memory consumption sanity check. Yes if we want to avoid some big size, si_mem_available could be one option. Actually I tried to do totalram_pages() / 2 like the old ion system heap in the previous patch version. Anyway totalram_pages in this patch is used to avoid the buggy size. And as we discussed in v2 patch, __GFP_RETRY_MAYFAIL was added. And I think the gfp makes us feel better in memory perspective. > >-- >Michal Hocko >SUSE Labs
On Wed 12-04-23 18:44:40, Jaewon Kim wrote: > >On Wed 12-04-23 17:57:26, Jaewon Kim wrote: > >> >Sorry for being late. I know there was some pre-existing discussion > >> >around that but I didn't have time to participate. > >> > > >> >On Mon 10-04-23 16:32:28, Jaewon Kim wrote: > >> >> @@ -350,6 +350,9 @@ static struct dma_buf *system_heap_allocate(struct dma_heap *heap, > >> >> struct page *page, *tmp_page; > >> >> int i, ret = -ENOMEM; > >> >> > >> >> + if (len / PAGE_SIZE > totalram_pages()) > >> >> + return ERR_PTR(-ENOMEM); > >> >> + > >> > > >> >This is an antipattern imho. Check 7661809d493b ("mm: don't allow > >> >oversized kvmalloc() calls") how kvmalloc has dealt with a similar > >> > >> Hello Thank you for the information. > >> > >> I tried to search the macro of INT_MAX. > >> > >> include/vdso/limits.h > >> #define INT_MAX ((int)(~0U >> 1)) > >> > >> AFAIK the dma-buf system heap user can request that huge size more than 2GB. > > > >Do you have any pointers? This all is unreclaimable memory, right? How > >are those users constrained to not go overboard? > > Correct dma-buf system heap memory is unreclaimable. To avoid that huge request, > this patch includes __GFP_RETRY_MAYFAIL. __GFP_RETRY_MAYFAIL doesn't avoud huge requests. It will drain the free available memory to the edge of OOM (especially for low order requests) so effectively anybody else requesting any memory (GFP_KERNEL like req.) will hit the oom killer very likely). > #define LOW_ORDER_GFP (GFP_HIGHUSER | __GFP_ZERO | __GFP_RETRY_MAYFAIL) > > > > >> So > >> I think totalram_pages() is better than INT_MAX in this case. > >> > >> >issue. totalram_pages doesn't really tell you anything about incorrect > >> >users. You might be on a low memory system where the request size is > >> >sane normally, it just doesn't fit into memory on that particular > >> >machine. > >> > >> Sorry maybe I'm not fully understand what you meant. User may requested > >> a huge size like 3GB on 2GB ram device. But I think that should be rejected > >> because it is bigger than the device ram size. > > > >Even totalram_pages/10 can be just unfeasible amount of data to be > >allocated without a major disruption. totalram_pages is no measure of > >the memory availability. > >If you want to have a ballpark estimation then si_mem_available might be > >something you are looking for. But I thought the sole purpose of this > >patch is to catch obviously buggy callers (like sign overflow lenght > >etc) rather than any memory consumption sanity check. > > Yes if we want to avoid some big size, si_mem_available could be one option. > Actually I tried to do totalram_pages() / 2 like the old ion system heap in > the previous patch version. Anyway totalram_pages in this patch is used to > avoid the buggy size. So let me repeat that totalram_pages is a wrong thing to do(tm). This is not a subsystem I would feel like nacking a patch, but consider this feedback as strong of a rejection as somebody external can give you. A mm internal allocator would get an outright nack. What you are doing is just wrong and an antipattern to what other allocators do. Either use something like INT_MAX to catch overflows or do not try to catch buggy code but pretend a better memory consumer citizen by using something like si_mem_available (ideally think of other potential memory users so do not allow any request to use all of it). The later might require much more involved interface and I do rememeber some attempts to account and limit dmabuf memory better. > And as we discussed in v2 patch, __GFP_RETRY_MAYFAIL was added. And I think > the gfp makes us feel better in memory perspective. wishful thinking that is.
>On Wed 12-04-23 18:44:40, Jaewon Kim wrote: >> >On Wed 12-04-23 17:57:26, Jaewon Kim wrote: >> >> >Sorry for being late. I know there was some pre-existing discussion >> >> >around that but I didn't have time to participate. >> >> > >> >> >On Mon 10-04-23 16:32:28, Jaewon Kim wrote: >> >> >> @@ -350,6 +350,9 @@ static struct dma_buf *system_heap_allocate(struct dma_heap *heap, >> >> >> struct page *page, *tmp_page; >> >> >> int i, ret = -ENOMEM; >> >> >> >> >> >> + if (len / PAGE_SIZE > totalram_pages()) >> >> >> + return ERR_PTR(-ENOMEM); >> >> >> + >> >> > >> >> >This is an antipattern imho. Check 7661809d493b ("mm: don't allow >> >> >oversized kvmalloc() calls") how kvmalloc has dealt with a similar >> >> >> >> Hello Thank you for the information. >> >> >> >> I tried to search the macro of INT_MAX. >> >> >> >> include/vdso/limits.h >> >> #define INT_MAX ((int)(~0U >> 1)) >> >> >> >> AFAIK the dma-buf system heap user can request that huge size more than 2GB. >> > >> >Do you have any pointers? This all is unreclaimable memory, right? How >> >are those users constrained to not go overboard? >> >> Correct dma-buf system heap memory is unreclaimable. To avoid that huge request, >> this patch includes __GFP_RETRY_MAYFAIL. > >__GFP_RETRY_MAYFAIL doesn't avoud huge requests. It will drain the free >available memory to the edge of OOM (especially for low order requests) >so effectively anybody else requesting any memory (GFP_KERNEL like req.) >will hit the oom killer very likely). > >> #define LOW_ORDER_GFP (GFP_HIGHUSER | __GFP_ZERO | __GFP_RETRY_MAYFAIL) >> >> > >> >> So >> >> I think totalram_pages() is better than INT_MAX in this case. >> >> >> >> >issue. totalram_pages doesn't really tell you anything about incorrect >> >> >users. You might be on a low memory system where the request size is >> >> >sane normally, it just doesn't fit into memory on that particular >> >> >machine. >> >> >> >> Sorry maybe I'm not fully understand what you meant. User may requested >> >> a huge size like 3GB on 2GB ram device. But I think that should be rejected >> >> because it is bigger than the device ram size. >> > >> >Even totalram_pages/10 can be just unfeasible amount of data to be >> >allocated without a major disruption. totalram_pages is no measure of >> >the memory availability. >> >If you want to have a ballpark estimation then si_mem_available might be >> >something you are looking for. But I thought the sole purpose of this >> >patch is to catch obviously buggy callers (like sign overflow lenght >> >etc) rather than any memory consumption sanity check. >> >> Yes if we want to avoid some big size, si_mem_available could be one option. >> Actually I tried to do totalram_pages() / 2 like the old ion system heap in >> the previous patch version. Anyway totalram_pages in this patch is used to >> avoid the buggy size. > >So let me repeat that totalram_pages is a wrong thing to do(tm). > >This is not a subsystem I would feel like nacking a patch, but consider >this feedback as strong of a rejection as somebody external can give >you. A mm internal allocator would get an outright nack. > >What you are doing is just wrong and an antipattern to what other >allocators do. Either use something like INT_MAX to catch overflows or >do not try to catch buggy code but pretend a better memory consumer >citizen by using something like si_mem_available (ideally think of >other potential memory users so do not allow any request to use all >of it). The later might require much more involved interface and I do >rememeber some attempts to account and limit dmabuf memory better. > >> And as we discussed in v2 patch, __GFP_RETRY_MAYFAIL was added. And I think >> the gfp makes us feel better in memory perspective. > >wishful thinking that is. >-- >Michal Hocko >SUSE Labs Yes I think you're right. As a allocator, dma-buf system heap looks to be loose in memory allocation. Limiting dmabuf memory may be required. But I think there is no nice and reasonable way so far. And the dma-buf system heap is being widely used in Android mobile system. AFAIK the camera consumes huge memory through this dma-buf system heap. I actually even looked a huge size request over 2GB in one dma-buf request. Jaewon Kim
On Wed 12-04-23 20:37:59, Jaewon Kim wrote: > Limiting dmabuf memory may be required. But I think there > is no nice and reasonable way so far. If that is really the way then the patch doesn't really add a big benefit. It doesn't really prevent OOMs (or panics due to OOM) as the allocator still allows to consume arbitrary amount of memory. The provided check is not able to tell between buggy and legit calls.
>On Wed 12-04-23 20:37:59, Jaewon Kim wrote: >> Limiting dmabuf memory may be required. But I think there >> is no nice and reasonable way so far. > >If that is really the way then the patch doesn't really add a big >benefit. It doesn't really prevent OOMs (or panics due to OOM) as the >allocator still allows to consume arbitrary amount of memory. The >provided check is not able to tell between buggy and legit calls. >-- >Michal Hocko >SUSE Labs Yes it could be. Though the buggy call is blocked by totalram_pages check, mm may suffer memory shortage due to the huge memory consumption through dma-buf system heap. We just hope Android LMKD or oomk kills the memory hoggers prior to oom panic. IMO if possible mm should be able to track the dma-buf size as stat in mm_rss_stat for each process. Then oomk is able to know size through get_mm_rss. Actually in mobile device environment, there is one more this kind of allocator. That is graphics drivers. They also consumes quite much memory and they may allocate as userspace wants. Jaewon Kim
On Wed 12-04-23 21:35:32, Jaewon Kim wrote: > >On Wed 12-04-23 20:37:59, Jaewon Kim wrote: > >> Limiting dmabuf memory may be required. But I think there > >> is no nice and reasonable way so far. > > > >If that is really the way then the patch doesn't really add a big > >benefit. It doesn't really prevent OOMs (or panics due to OOM) as the > >allocator still allows to consume arbitrary amount of memory. The > >provided check is not able to tell between buggy and legit calls. > >-- > >Michal Hocko > >SUSE Labs > > Yes it could be. Though the buggy call is blocked by totalram_pages check, It seems our definitions of buggy differ here. I do not see much difference between totalram_pages +- PAGE_SIZE (or any epsilon for that matter). Both would put the system down to its knees without a way out other than panic. > mm may suffer memory shortage due to the huge memory consumption through > dma-buf system heap. We just hope Android LMKD or oomk kills the memory > hoggers prior to oom panic. You seem to be missing an important point. If the global OOM killer is not able to find a victim the LMKD or oomk are highly unlikely as well (unless they ignore OOM_SCORE_ADJ_MIN). > IMO if possible mm should be able to track the dma-buf size as stat in > mm_rss_stat for each process. I do remember some proposals from the past and IIRC the main problem was how to attribute those buffers to the actual owner. I believe I have give you some arguments to consider. The rest is up to you. As I've said I do not have any stakes in dmabuf. The patch itself is not actively harmful, it is just adding an illusion of a fix while it doesn't give much.
On Wed, Apr 12, 2023 at 4:38 AM Jaewon Kim <jaewon31.kim@samsung.com> wrote: > Yes I think you're right. As a allocator, dma-buf system heap looks to be loose > in memory allocation. Limiting dmabuf memory may be required. But I think there > is no nice and reasonable way so far. And the dma-buf system heap is being > widely used in Android mobile system. AFAIK the camera consumes huge memory > through this dma-buf system heap. I actually even looked a huge size request > over 2GB in one dma-buf request. > Hey can you point me to where you saw a request that big? That's a non-buggy request?!
It was one of camera scenarios. I internally asked and heard that was not a bug but normal. I think 2GB looks too big for one graphics buffer but it could be for other purposes like camera. I think the system heap should support that. 2023년 4월 13일 (목) 오전 1:49, T.J. Mercier <tjmercier@google.com>님이 작성: > On Wed, Apr 12, 2023 at 4:38 AM Jaewon Kim <jaewon31.kim@samsung.com> > wrote: > > > Yes I think you're right. As a allocator, dma-buf system heap looks to > be loose > > in memory allocation. Limiting dmabuf memory may be required. But I > think there > > is no nice and reasonable way so far. And the dma-buf system heap is > being > > widely used in Android mobile system. AFAIK the camera consumes huge > memory > > through this dma-buf system heap. I actually even looked a huge size > request > > over 2GB in one dma-buf request. > > > Hey can you point me to where you saw a request that big? That's a > non-buggy request?! >
>On Wed, Apr 12, 2023 at 4:38?AM Jaewon Kim <jaewon31.kim@samsung.com> wrote: > >> Yes I think you're right. As a allocator, dma-buf system heap looks to be loose >> in memory allocation. Limiting dmabuf memory may be required. But I think there >> is no nice and reasonable way so far. And the dma-buf system heap is being >> widely used in Android mobile system. AFAIK the camera consumes huge memory >> through this dma-buf system heap. I actually even looked a huge size request >> over 2GB in one dma-buf request. >> >Hey can you point me to where you saw a request that big? That's a >non-buggy request?! (let me resend as plain text) It was one of camera scenarios. I internally asked and heard that was not a bug but normal. I think 2GB looks too big for one graphics buffer but it could be for other purposes like camera. I think the system heap should support that. Regarding __GFP_RETRY_MAYFAIL, we may need to say dma-buf system heap was designed to gather many pages up to a requested size. If mm returns NULL due to __GFP_RETRY_MAYFAIL, dma-buf system heap will release other already allocated pages, so that it may help to avoid oom.
On Thu 13-04-23 09:16:58, Jaewon Kim wrote: > >On Wed, Apr 12, 2023 at 4:38?AM Jaewon Kim <jaewon31.kim@samsung.com> wrote: > > > >> Yes I think you're right. As a allocator, dma-buf system heap looks to be loose > >> in memory allocation. Limiting dmabuf memory may be required. But I think there > >> is no nice and reasonable way so far. And the dma-buf system heap is being > >> widely used in Android mobile system. AFAIK the camera consumes huge memory > >> through this dma-buf system heap. I actually even looked a huge size request > >> over 2GB in one dma-buf request. > >> > >Hey can you point me to where you saw a request that big? That's a > >non-buggy request?! > > (let me resend as plain text) > It was one of camera scenarios. I internally asked and heard that was not a bug > but normal. I think 2GB looks too big for one graphics buffer but it could be > for other purposes like camera. I think the system heap should support that. Is that any of the upstream drivers or something sitting out of the tree. > Regarding __GFP_RETRY_MAYFAIL, we may need to say dma-buf system heap was > designed to gather many pages up to a requested size. If mm returns NULL due to > __GFP_RETRY_MAYFAIL, dma-buf system heap will release other already allocated > pages, so that it may help to avoid oom. This really depends on the other activity on the system. If you have a more concurrent memory demand at the time then you might be just out of the luck. Really, claiming huge portion of the memory shouldn't be done nilly willy.
>On Thu 13-04-23 09:16:58, Jaewon Kim wrote: >> >On Wed, Apr 12, 2023 at 4:38?AM Jaewon Kim <jaewon31.kim@samsung.com> wrote: >> > >> >> Yes I think you're right. As a allocator, dma-buf system heap looks to be loose >> >> in memory allocation. Limiting dmabuf memory may be required. But I think there >> >> is no nice and reasonable way so far. And the dma-buf system heap is being >> >> widely used in Android mobile system. AFAIK the camera consumes huge memory >> >> through this dma-buf system heap. I actually even looked a huge size request >> >> over 2GB in one dma-buf request. >> >> >> >Hey can you point me to where you saw a request that big? That's a >> >non-buggy request?! >> >> (let me resend as plain text) >> It was one of camera scenarios. I internally asked and heard that was not a bug >> but normal. I think 2GB looks too big for one graphics buffer but it could be >> for other purposes like camera. I think the system heap should support that. > >Is that any of the upstream drivers or something sitting out of the >tree. I don't think so. I guess that is userspace library rather than kernel driver. The user library directly might request the size through dma-buf ioctl. Even though that is kernel driver, I think the driver may not be upstreamed. > >> Regarding __GFP_RETRY_MAYFAIL, we may need to say dma-buf system heap was >> designed to gather many pages up to a requested size. If mm returns NULL due to >> __GFP_RETRY_MAYFAIL, dma-buf system heap will release other already allocated >> pages, so that it may help to avoid oom. > >This really depends on the other activity on the system. If you have a >more concurrent memory demand at the time then you might be just out of >the luck. Really, claiming huge portion of the memory shouldn't be done >nilly willy. I agree on that. >-- >Michal Hocko >SUSE Labs
diff --git a/drivers/dma-buf/heaps/system_heap.c b/drivers/dma-buf/heaps/system_heap.c index 920db302a273..583da8948679 100644 --- a/drivers/dma-buf/heaps/system_heap.c +++ b/drivers/dma-buf/heaps/system_heap.c @@ -41,7 +41,7 @@ struct dma_heap_attachment { bool mapped; }; -#define LOW_ORDER_GFP (GFP_HIGHUSER | __GFP_ZERO) +#define LOW_ORDER_GFP (GFP_HIGHUSER | __GFP_ZERO | __GFP_RETRY_MAYFAIL) #define HIGH_ORDER_GFP (((GFP_HIGHUSER | __GFP_ZERO | __GFP_NOWARN \ | __GFP_NORETRY) & ~__GFP_RECLAIM) \ | __GFP_COMP) @@ -350,6 +350,9 @@ static struct dma_buf *system_heap_allocate(struct dma_heap *heap, struct page *page, *tmp_page; int i, ret = -ENOMEM; + if (len / PAGE_SIZE > totalram_pages()) + return ERR_PTR(-ENOMEM); + buffer = kzalloc(sizeof(*buffer), GFP_KERNEL); if (!buffer) return ERR_PTR(-ENOMEM);