Message ID | 20240621162227.215412-8-cel@kernel.org (mailing list archive) |
---|---|
State | New |
Series | Fixes for pNFS SCSI layout PR key registration |
On Fri, Jun 21, 2024 at 12:22:30PM -0400, cel@kernel.org wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
>
> nfs4_get_device_info() frequently requests more than a few pages
> when provisioning a nfs4_deviceid_node object. Make this more
> efficient by using alloc_pages_bulk_array(). This API is known to be
> several times faster than an open-coded loop around alloc_page().
>
> release_pages() is folio-enabled so it is also more efficient than
> repeatedly invoking __free_pages().

This isn't really a pnfs fix, right? Just a little optimization.

It does look fine to me:

Reviewed-by: Christoph Hellwig <hch@lst.de>

But I'd really wish we could do better than this with lazy decoding
in ->alloc_deviceid_node, which (at least for blocklayout) knows
roughly how much we need to decode after the first value parsed.

Or at least cache it if it is that frequent (which it really
shouldn't be due to the device id cache, or am I missing something?)
On Sat, Jun 22, 2024 at 07:08:12AM +0200, Christoph Hellwig wrote:
> On Fri, Jun 21, 2024 at 12:22:30PM -0400, cel@kernel.org wrote:
> > From: Chuck Lever <chuck.lever@oracle.com>
> >
> > nfs4_get_device_info() frequently requests more than a few pages
> > when provisioning a nfs4_deviceid_node object. Make this more
> > efficient by using alloc_pages_bulk_array(). This API is known to be
> > several times faster than an open-coded loop around alloc_page().
> >
> > release_pages() is folio-enabled so it is also more efficient than
> > repeatedly invoking __free_pages().
>
> This isn't really a pnfs fix, right? Just a little optimization.

It doesn't say "fix" anywhere and doesn't include a Fixes: tag. And the
subsequent patches in the series are also clearly not fixes. I can make
it clearer that this one is only an optimization.

> It does look fine to me:
>
> Reviewed-by: Christoph Hellwig <hch@lst.de>

Thank you!

> But I'd really wish we could do better than this with lazy
> decoding in ->alloc_deviceid_node, which (at least for blocklayout)
> knows roughly how much we need to decode after the first value
> parsed.

Agreed. And it's not the only culprit in NFS and RPC of this kind of
temporary "just in case" overallocation.

> Or at least cache it if it is that frequent (which it
> really shouldn't be due to the device id cache, or am I missing
> something?)

It's not a frequent operation; it's done the first time pNFS encounters
a new block device. But the alloc_page() loop is slow and takes and
releases an IRQ spinlock repeatedly (IIRC), so it's an opportunity for
IRQs to run and delay get_device_info considerably.
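For context on the API being discussed: alloc_pages_bulk_array() returns
the number of array slots it managed to populate and skips entries that
are already non-NULL, so a caller that cannot tolerate a short allocation
can simply call it again. Below is a minimal sketch of that retry pattern;
the helper name fill_page_array() and the retry policy are hypothetical
and not part of the patch, which instead treats a short return as an
allocation failure.

#include <linux/errno.h>
#include <linux/gfp.h>
#include <linux/mm.h>
#include <linux/sched.h>
#include <linux/sched/signal.h>

/*
 * Illustrative sketch only -- not from the patch. Repeatedly tops up a
 * page array until every slot is populated.
 */
static int fill_page_array(struct page **pages, unsigned long max_pages)
{
	unsigned long filled = 0;

	while (filled < max_pages) {
		/*
		 * alloc_pages_bulk_array() fills only NULL slots and
		 * returns the total number of populated entries, so a
		 * partial result can be retried without extra bookkeeping.
		 */
		filled = alloc_pages_bulk_array(GFP_KERNEL, max_pages, pages);
		if (filled == max_pages)
			break;
		if (fatal_signal_pending(current))
			return -EINTR;
		cond_resched();	/* let reclaim make progress before retrying */
	}
	return 0;
}

The patch below takes the simpler route: a short return is treated as a
failure, and release_pages(pages, i) frees the i pages that were handed
back in a single call.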
diff --git a/fs/nfs/pnfs_dev.c b/fs/nfs/pnfs_dev.c
index 178001c90156..26a78d69acab 100644
--- a/fs/nfs/pnfs_dev.c
+++ b/fs/nfs/pnfs_dev.c
@@ -101,9 +101,8 @@ nfs4_get_device_info(struct nfs_server *server,
 	struct nfs4_deviceid_node *d = NULL;
 	struct pnfs_device *pdev = NULL;
 	struct page **pages = NULL;
+	int rc, i, max_pages;
 	u32 max_resp_sz;
-	int max_pages;
-	int rc, i;
 
 	/*
 	 * Use the session max response size as the basis for setting
@@ -125,11 +124,9 @@ nfs4_get_device_info(struct nfs_server *server,
 	if (!pages)
 		goto out_free_pdev;
 
-	for (i = 0; i < max_pages; i++) {
-		pages[i] = alloc_page(gfp_flags);
-		if (!pages[i])
-			goto out_free_pages;
-	}
+	i = alloc_pages_bulk_array(GFP_KERNEL, max_pages, pages);
+	if (i != max_pages)
+		goto out_free_pages;
 
 	memcpy(&pdev->dev_id, dev_id, sizeof(*dev_id));
 	pdev->layout_type = server->pnfs_curr_ld->id;
@@ -154,8 +151,8 @@ nfs4_get_device_info(struct nfs_server *server,
 	set_bit(NFS_DEVICEID_NOCACHE, &d->flags);
 
 out_free_pages:
-	while (--i >= 0)
-		__free_page(pages[i]);
+	if (i)
+		release_pages(pages, i);
 	kfree(pages);
 out_free_pdev:
 	kfree(pdev);