Message ID: 20230809100754.3094517-2-shikemeng@huaweicloud.com
State: New
Series: [1/2] mm/page_alloc: remove track of active PCP lists range in bulk free
Hi Kemeng,

Can you confirm this patch has no intended functional change?

I have a patch sitting in my tree for a while related to this
count vs pcp->count. The BPF function hook can potentially change
pcp->count and make count out of sync with pcp->count, which causes
a dead loop.

Maybe I can send mine out alongside yours for discussion?
I don't mind my patch being combined with yours.

Your change looks fine to me. There is more that can be done
on the cleanup.

Chris

On Wed, Aug 09, 2023 at 06:07:53PM +0800, Kemeng Shi wrote:
> After commit fd56eef258a17 ("mm/page_alloc: simplify how many pages are
> selected per pcp list during bulk free"), we will drain all pages in the
> selected pcp list. And we ensured the passed count is < pcp->count. Then
> the search will finish before wrap-around, and the tracking of the active
> PCP lists range, intended for the wrap-around case, is no longer needed.
>
> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
> ---
>  mm/page_alloc.c | 15 +++------------
>  1 file changed, 3 insertions(+), 12 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 96b7c1a7d1f2..1ddcb2707d05 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1207,8 +1207,6 @@ static void free_pcppages_bulk(struct zone *zone, int count,
>  			int pindex)
>  {
>  	unsigned long flags;
> -	int min_pindex = 0;
> -	int max_pindex = NR_PCP_LISTS - 1;
>  	unsigned int order;
>  	bool isolated_pageblocks;
>  	struct page *page;
> @@ -1231,17 +1229,10 @@ static void free_pcppages_bulk(struct zone *zone, int count,
>
>  	/* Remove pages from lists in a round-robin fashion. */
>  	do {
> -		if (++pindex > max_pindex)
> -			pindex = min_pindex;
> +		if (++pindex > NR_PCP_LISTS - 1)
> +			pindex = 0;
>  		list = &pcp->lists[pindex];
> -		if (!list_empty(list))
> -			break;
> -
> -		if (pindex == max_pindex)
> -			max_pindex--;
> -		if (pindex == min_pindex)
> -			min_pindex++;
> -	} while (1);
> +	} while (list_empty(list));
>
>  	order = pindex_to_order(pindex);
>  	nr_pages = 1 << order;
> --
> 2.30.0
>
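The changelog's termination argument is that the caller never passes a
count larger than the number of pages actually queued, so the simplified
round-robin scan always hits a non-empty list before a full wrap. Below is
a minimal userspace model of the new scan; the list sizes and NR_PCP_LISTS
value are illustrative assumptions, not real kernel state:

#include <stdio.h>

#define NR_PCP_LISTS 4

int main(void)
{
	int list_count[NR_PCP_LISTS] = { 0, 3, 0, 2 };	/* assumed pcp list sizes */
	int count = 5;		/* precondition: count <= total queued pages */
	int pindex = -1;

	while (count > 0) {
		/* The patch's simplified scan: plain wrap, no min/max window. */
		do {
			if (++pindex > NR_PCP_LISTS - 1)
				pindex = 0;
		} while (list_count[pindex] == 0);

		printf("draining a page from list %d\n", pindex);
		list_count[pindex]--;
		count--;
	}
	return 0;
}

As long as the precondition holds, the inner do/while visits at most
NR_PCP_LISTS entries before finding a non-empty one, so the min/max window
the old code maintained buys nothing.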
On 8/16/2023 1:45 AM, Chris Li wrote:
> Hi Kemeng,
>
> Can you confirm this patch has no intended functional change?
>
Hi Chris, there is no functional change intended in this patch. As I
mentioned in the changelog, there is no wrap-around during the list
iteration, so the active PCP lists range will never be used.

> I have a patch sitting in my tree for a while related to this
> count vs pcp->count. The BPF function hook can potentially change
> pcp->count and make count out of sync with pcp->count, which causes
> a dead loop.
>
I guess pcp->count is set bigger than it should be. In that case, we will
keep trying to get pages while all pages in the pcp lists have already
been taken off, and the dead loop will happen. The dead loop will happen
with or without this patch, as the root cause is that we try to get more
pages than the pcp lists own.

> Maybe I can send mine out alongside yours for discussion?
> I don't mind my patch being combined with yours.
>
Either way is acceptable to me, just feel free to choose the one you
like, and I'd like to see what more we could do here.

> Your change looks fine to me. There is more that can be done
> on the cleanup.
>
Thanks for the feedback; more cleanup is welcome.

> Chris
>
> On Wed, Aug 09, 2023 at 06:07:53PM +0800, Kemeng Shi wrote:
>> After commit fd56eef258a17 ("mm/page_alloc: simplify how many pages are
>> selected per pcp list during bulk free"), we will drain all pages in the
>> selected pcp list. And we ensured the passed count is < pcp->count. Then
>> the search will finish before wrap-around, and the tracking of the active
>> PCP lists range, intended for the wrap-around case, is no longer needed.
>>
>> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
>> ---
>>  mm/page_alloc.c | 15 +++------------
>>  1 file changed, 3 insertions(+), 12 deletions(-)
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 96b7c1a7d1f2..1ddcb2707d05 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -1207,8 +1207,6 @@ static void free_pcppages_bulk(struct zone *zone, int count,
>>  			int pindex)
>>  {
>>  	unsigned long flags;
>> -	int min_pindex = 0;
>> -	int max_pindex = NR_PCP_LISTS - 1;
>>  	unsigned int order;
>>  	bool isolated_pageblocks;
>>  	struct page *page;
>> @@ -1231,17 +1229,10 @@ static void free_pcppages_bulk(struct zone *zone, int count,
>>
>>  	/* Remove pages from lists in a round-robin fashion. */
>>  	do {
>> -		if (++pindex > max_pindex)
>> -			pindex = min_pindex;
>> +		if (++pindex > NR_PCP_LISTS - 1)
>> +			pindex = 0;
>>  		list = &pcp->lists[pindex];
>> -		if (!list_empty(list))
>> -			break;
>> -
>> -		if (pindex == max_pindex)
>> -			max_pindex--;
>> -		if (pindex == min_pindex)
>> -			min_pindex++;
>> -	} while (1);
>> +	} while (list_empty(list));
>>
>>  	order = pindex_to_order(pindex);
>>  	nr_pages = 1 << order;
>> --
>> 2.30.0
>>
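Kemeng's point that the dead loop is independent of this patch can be seen
in the same kind of model: if count claims a page that the lists no longer
hold, the new scan simply never exits. A sketch with assumed values, not
kernel code:

#include <stdbool.h>

#define NR_PCP_LISTS 4

static bool list_is_empty[NR_PCP_LISTS] = { true, true, true, true };

/* With the patched loop, a stale count (all lists empty but count > 0)
 * no longer walks out of bounds -- it just spins here forever. */
static int next_nonempty_pindex(int pindex)
{
	do {
		if (++pindex > NR_PCP_LISTS - 1)
			pindex = 0;
	} while (list_is_empty[pindex]);	/* dead loop in this scenario */
	return pindex;
}

int main(void)
{
	/* Calling next_nonempty_pindex(-1) with the state above would hang,
	 * which is exactly the out-of-sync count scenario under discussion. */
	return 0;
}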
Hi Kemeng,

On Wed, Aug 16, 2023 at 7:22 PM Kemeng Shi <shikemeng@huaweicloud.com> wrote:
> Hi Chris, there is no functional change intended in this patch. As I
> mentioned in the changelog, there is no wrap-around during the list
> iteration, so the active PCP lists range will never be used.
>
>> I have a patch sitting in my tree for a while related to this
>> count vs pcp->count. The BPF function hook can potentially change
>> pcp->count and make count out of sync with pcp->count, which causes
>> a dead loop.

In this case the BPF hook allocates a page inside the spin_lock, so
"pcp->count" becomes smaller than "count". The loop condition only checks
"count > 0", but all pages on pcp->lists have already been freed:
pcp->count is 0 while "count" is 1. After a few wrap-arounds, pindex_max
becomes smaller than pindex_min and then reaches -1, causing an invalid
page fault.

> I guess pcp->count is set bigger than it should be. In that case, we will
> keep trying to get pages while all pages in the pcp lists have already
> been taken off, and the dead loop will happen. The dead loop will happen
> with or without this patch, as the root cause is that we try to get more
> pages than the pcp lists own.
>
>> Maybe I can send mine out alongside yours for discussion?

My patch is split into two parts: the first is a functional change to
allow pcp->count to drop below "count". The other is just a cleanup and
should be functionally the same. Sure, I will send it out and CC you for
discussion.

>> I don't mind my patch being combined with yours.
>
> Either way is acceptable to me, just feel free to choose the one you
> like, and I'd like to see what more we could do here.

Thanks

Chris
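For contrast, here is a userspace model of the old min/max-window loop
under the same stale-count state (all values are illustrative
assumptions). The window collapses until pindex leaves the valid range; in
this toy trace it escapes upward, while Chris reports it reaching -1 in
the real kernel, where the entry state differs:

#include <stdio.h>
#include <stdbool.h>

#define NR_PCP_LISTS 4

int main(void)
{
	bool list_empty[NR_PCP_LISTS] = { true, true, true, true };
	int count = 1;			/* stale: lists are actually empty */
	int min_pindex = 0;
	int max_pindex = NR_PCP_LISTS - 1;
	int pindex = NR_PCP_LISTS - 1;

	while (count > 0) {
		/* The old scan with the shrinking min/max window. */
		do {
			if (++pindex > max_pindex)
				pindex = min_pindex;
			if (pindex < 0 || pindex >= NR_PCP_LISTS) {
				printf("pindex %d is out of bounds\n", pindex);
				return 1;	/* the kernel touches a bad list here */
			}
			if (!list_empty[pindex])
				break;
			if (pindex == max_pindex)
				max_pindex--;
			if (pindex == min_pindex)
				min_pindex++;
		} while (1);
		count--;		/* never reached in this scenario */
	}
	return 0;
}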
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 96b7c1a7d1f2..1ddcb2707d05 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1207,8 +1207,6 @@ static void free_pcppages_bulk(struct zone *zone, int count,
 			int pindex)
 {
 	unsigned long flags;
-	int min_pindex = 0;
-	int max_pindex = NR_PCP_LISTS - 1;
 	unsigned int order;
 	bool isolated_pageblocks;
 	struct page *page;
@@ -1231,17 +1229,10 @@ static void free_pcppages_bulk(struct zone *zone, int count,
 
 	/* Remove pages from lists in a round-robin fashion. */
 	do {
-		if (++pindex > max_pindex)
-			pindex = min_pindex;
+		if (++pindex > NR_PCP_LISTS - 1)
+			pindex = 0;
 		list = &pcp->lists[pindex];
-		if (!list_empty(list))
-			break;
-
-		if (pindex == max_pindex)
-			max_pindex--;
-		if (pindex == min_pindex)
-			min_pindex++;
-	} while (1);
+	} while (list_empty(list));
 
 	order = pindex_to_order(pindex);
 	nr_pages = 1 << order;
After commit fd56eef258a17 ("mm/page_alloc: simplify how many pages are
selected per pcp list during bulk free"), we will drain all pages in the
selected pcp list. And we ensured the passed count is < pcp->count. Then
the search will finish before wrap-around, and the tracking of the active
PCP lists range, intended for the wrap-around case, is no longer needed.

Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
---
 mm/page_alloc.c | 15 +++------------
 1 file changed, 3 insertions(+), 12 deletions(-)