Message ID | 20230524153311.3625329-6-dhowells@redhat.com (mailing list archive) |
---|---|
State | New |
Series | splice, net: Replace sendpage with sendmsg(MSG_SPLICE_PAGES), part 3 |
On Wed, 24 May 2023 16:33:04 +0100 David Howells wrote:
> Make the page_frag_cache allocator handle __GFP_ZERO itself rather than
> passing it off to the page allocator. There may be a mix of callers, some
> specifying __GFP_ZERO and some not - and even if all specify __GFP_ZERO, we
> might refurbish the page, in which case the returned memory doesn't get
> cleared.

I think it's pretty clear that page frag allocator was never supposed
to support GFP_ZERO, as we don't need it in networking. So maybe
you're better off adding the memset() in nvme?

CCing Alex, who I think would say something along those lines :)
IDK how much we still care given that most networking drivers are
migrating to page_pool these days.
On Fri, May 26, 2023 at 5:57 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Wed, 24 May 2023 16:33:04 +0100 David Howells wrote:
> > Make the page_frag_cache allocator handle __GFP_ZERO itself rather than
> > passing it off to the page allocator. There may be a mix of callers, some
> > specifying __GFP_ZERO and some not - and even if all specify __GFP_ZERO, we
> > might refurbish the page, in which case the returned memory doesn't get
> > cleared.
>
> I think it's pretty clear that page frag allocator was never supposed
> to support GFP_ZERO, as we don't need it in networking. So maybe
> you're better off adding the memset() in nvme?
>
> CCing Alex, who I think would say something along those lines :)
> IDK how much we still care given that most networking drivers are
> migrating to page_pool these days.

Yeah, the page frag allocator wasn't meant to handle things like this.
Generally the cache was meant to be used within one context so that
the GFP flags used were consistent between calls. Currently the only
thing passed appears to be GFP_ATOMIC.

Also I am not a big fan of pulling this out of page_alloc.c. The fact
is that is where the allocation functions live, so it makes sense to
just leave it there. From my point of view there isn't enough code
added to justify creating yet another file and making the git history
harder to track as a result.
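
For readers following the thread, the caller-side alternative raised in the replies above (clear the fragment in the caller rather than teach the allocator about __GFP_ZERO) might look roughly like the userspace sketch below. Everything in it is an assumption made for illustration: `frag_alloc_fn`, `frag_zalloc()`, `struct dirty_cache` and `dirty_alloc()` are hypothetical names, not kernel or nvme code, and the toy allocator only stands in for page_frag_alloc().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical allocator signature; stands in for page_frag_alloc(). */
typedef void *(*frag_alloc_fn)(void *cache, size_t fragsz);

/*
 * Caller-side helper (hypothetical): allocate a fragment with no zeroing
 * flag at all, then clear exactly the bytes the caller asked for.
 */
static void *frag_zalloc(frag_alloc_fn alloc, void *cache, size_t fragsz)
{
	void *p;

	assert(alloc != NULL);

	p = alloc(cache, fragsz);
	if (p)
		memset(p, 0, fragsz);
	return p;
}

/* Toy cache for demonstration only: one fixed buffer. */
struct dirty_cache {
	unsigned char buf[128];
};

/*
 * Toy allocator: always hands back the same buffer, deliberately filled
 * with a non-zero pattern to mimic recycled memory.
 */
static void *dirty_alloc(void *cache, size_t fragsz)
{
	struct dirty_cache *dc = cache;

	if (fragsz > sizeof(dc->buf))
		return NULL;
	memset(dc->buf, 0x5A, sizeof(dc->buf));
	return dc->buf;
}
```

Calling `frag_zalloc(dirty_alloc, &dc, 32)` on a `struct dirty_cache dc` returns a pointer whose first 32 bytes are zero, while the rest of the buffer keeps the stale pattern. The design point is that the zeroing cost is paid only by the caller that needs it, and the allocator's flag handling stays untouched.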
diff --git a/mm/page_frag_alloc.c b/mm/page_frag_alloc.c
index ffd68bfb677d..2b73c7f5d9a9 100644
--- a/mm/page_frag_alloc.c
+++ b/mm/page_frag_alloc.c
@@ -23,7 +23,10 @@ static struct folio *page_frag_cache_refill(struct page_frag_cache *nc,
 					    gfp_t gfp_mask)
 {
 	struct folio *folio = NULL;
-	gfp_t gfp = gfp_mask;
+	gfp_t gfp;
+
+	gfp_mask &= ~__GFP_ZERO;
+	gfp = gfp_mask;
 
 #if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE)
 	gfp_mask |= __GFP_NOWARN | __GFP_NORETRY | __GFP_NOMEMALLOC;
@@ -71,6 +74,7 @@ void *page_frag_alloc_align(struct page_frag_cache *nc,
 {
 	struct folio *folio = nc->folio;
 	size_t offset;
+	void *p;
 
 	WARN_ON_ONCE(!is_power_of_2(align));
 
@@ -133,7 +137,10 @@ void *page_frag_alloc_align(struct page_frag_cache *nc,
 	offset &= ~(align - 1);
 	nc->offset = offset;
 
-	return folio_address(folio) + offset;
+	p = folio_address(folio) + offset;
+	if (gfp_mask & __GFP_ZERO)
+		return memset(p, 0, fragsz);
+	return p;
 }
 EXPORT_SYMBOL(page_frag_alloc_align);
Make the page_frag_cache allocator handle __GFP_ZERO itself rather than
passing it off to the page allocator. There may be a mix of callers, some
specifying __GFP_ZERO and some not - and even if all specify __GFP_ZERO, we
might refurbish the page, in which case the returned memory doesn't get
cleared.

This is a potential bug in the nvme over TCP driver.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Jeroen de Borst <jeroendb@google.com>
cc: Catherine Sullivan <csully@google.com>
cc: Shailend Chand <shailend@google.com>
cc: Felix Fietkau <nbd@nbd.name>
cc: John Crispin <john@phrozen.org>
cc: Sean Wang <sean.wang@mediatek.com>
cc: Mark Lee <Mark-MC.Lee@mediatek.com>
cc: Lorenzo Bianconi <lorenzo@kernel.org>
cc: Matthias Brugger <matthias.bgg@gmail.com>
cc: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
cc: Keith Busch <kbusch@kernel.org>
cc: Jens Axboe <axboe@fb.com>
cc: Christoph Hellwig <hch@lst.de>
cc: Sagi Grimberg <sagi@grimberg.me>
cc: Chaitanya Kulkarni <kch@nvidia.com>
cc: Andrew Morton <akpm@linux-foundation.org>
cc: Matthew Wilcox <willy@infradead.org>
cc: netdev@vger.kernel.org
cc: linux-arm-kernel@lists.infradead.org
cc: linux-mediatek@lists.infradead.org
cc: linux-nvme@lists.infradead.org
cc: linux-mm@kvack.org
---
 mm/page_frag_alloc.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)
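
The problem described in the commit message can be reproduced with a small userspace model. This is a sketch under stated assumptions, not the kernel implementation: `struct toy_frag_cache`, `toy_frag_alloc()`, `TOY_PAGE_SIZE` and `TOY_ZERO` are hypothetical names, `calloc()` stands in for a page allocator honouring __GFP_ZERO, and the refurbish step simply rewinds the offset, assuming all earlier fragments have already been released (the real allocator checks the page reference count before reusing the page).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define TOY_PAGE_SIZE 4096u /* hypothetical size of the backing page */
#define TOY_ZERO      0x1u  /* hypothetical flag; stands in for __GFP_ZERO */

/* Toy cache: one backing page, fragments carved from the end downwards. */
struct toy_frag_cache {
	unsigned char *page; /* NULL until the first allocation */
	size_t offset;       /* start of the most recently returned fragment */
};

/*
 * zero_in_allocator == 0 models the old behaviour: the zero flag only
 * reaches the page allocator, so it only affects a freshly allocated page.
 * zero_in_allocator != 0 models the patch: the fragment itself is cleared.
 */
static void *toy_frag_alloc(struct toy_frag_cache *nc, size_t fragsz,
			    unsigned int flags, int zero_in_allocator)
{
	void *p;

	assert(fragsz > 0 && fragsz <= TOY_PAGE_SIZE);

	if (!nc->page) {
		/* First use: the zero flag is honoured here and only here. */
		nc->page = (flags & TOY_ZERO) ? calloc(1, TOY_PAGE_SIZE)
					      : malloc(TOY_PAGE_SIZE);
		if (!nc->page)
			return NULL;
		nc->offset = TOY_PAGE_SIZE;
	} else if (nc->offset < fragsz) {
		/*
		 * Page exhausted: "refurbish" it by rewinding the offset.
		 * No new page is allocated, so nothing is cleared.
		 */
		nc->offset = TOY_PAGE_SIZE;
	}

	nc->offset -= fragsz;
	p = nc->page + nc->offset;

	/* The patch's approach: clear the returned fragment when asked. */
	if (zero_in_allocator && (flags & TOY_ZERO))
		memset(p, 0, fragsz);
	return p;
}
```

With `zero_in_allocator` set to 0, a fragment taken after the page has been refurbished still contains whatever the previous user wrote, even though every call passed `TOY_ZERO`. With it set to 1, the fragment comes back cleared regardless of whether the page was fresh or reused.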