Message ID | 20230117120638.72254-1-anuj20.g@samsung.com (mailing list archive)
---|---
Series | enable pcpu bio-cache for IRQ uring-passthru I/O
On 1/17/23 5:06 AM, Anuj Gupta wrote:
> This series extends bio pcpu caching for normal / IRQ-driven
> uring-passthru I/Os. Earlier, only polled uring-passthru I/Os could
> leverage bio-cache. After the series from Pavel[1], bio-cache can be
> leveraged by normal / IRQ driven I/Os as well. t/io_uring with an Optane
> SSD setup shows +7.21% for batches of 32 requests.
>
> [1] https://lore.kernel.org/io-uring/cover.1666347703.git.asml.silence@gmail.com/
>
> IRQ, 128/32/32, cache off

Tests here -

before:

polled=0, fixedbufs=1/0, register_files=1, buffered=1, QD=128
Engine=io_uring, sq_ring=128, cq_ring=128
IOPS=62.88M, BW=30.70GiB/s, IOS/call=32/31
IOPS=62.95M, BW=30.74GiB/s, IOS/call=32/31
IOPS=62.52M, BW=30.53GiB/s, IOS/call=32/32
IOPS=62.61M, BW=30.57GiB/s, IOS/call=31/32
IOPS=62.52M, BW=30.53GiB/s, IOS/call=32/31
IOPS=62.40M, BW=30.47GiB/s, IOS/call=32/32

after:

polled=0, fixedbufs=1/0, register_files=1, buffered=1, QD=128
Engine=io_uring, sq_ring=128, cq_ring=128
IOPS=76.58M, BW=37.39GiB/s, IOS/call=31/31
IOPS=79.42M, BW=38.78GiB/s, IOS/call=32/32
IOPS=78.06M, BW=38.12GiB/s, IOS/call=31/31
IOPS=77.64M, BW=37.91GiB/s, IOS/call=32/31
IOPS=77.17M, BW=37.68GiB/s, IOS/call=32/32
IOPS=76.73M, BW=37.47GiB/s, IOS/call=31/31
IOPS=76.94M, BW=37.57GiB/s, IOS/call=32/31

Note that this includes Pavel's fix as well:

https://lore.kernel.org/linux-block/80d4511011d7d4751b4cf6375c4e38f237d935e3.1673955390.git.asml.silence@gmail.com/

But this mirrors the improvement seen on the non-passthrough side as well.
I'd say that's a pass :-)
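For context, the cache being exercised here is the per-cpu bio cache gated by REQ_ALLOC_CACHE. The allocation fast path looks roughly like this (a trimmed excerpt in the spirit of block/bio.c around v6.2, not the exact source; comments added here):

	struct bio *bio_alloc_bioset(struct block_device *bdev,
				     unsigned short nr_vecs, blk_opf_t opf,
				     gfp_t gfp_mask, struct bio_set *bs)
	{
		struct bio *bio;
		...
		if (opf & REQ_ALLOC_CACHE) {
			if (bs->cache && nr_vecs <= BIO_INLINE_VECS) {
				/* Fast path: pop a recycled bio off this
				 * CPU's cache, no mempool round trip. */
				bio = bio_alloc_percpu_cache(bdev, nr_vecs,
							     opf, gfp_mask, bs);
				if (bio)
					return bio;
				/* Cache miss: fall through to the regular
				 * allocation below; the bio keeps
				 * REQ_ALLOC_CACHE so it is recycled into the
				 * per-cpu cache on completion. */
			} else {
				/* Too many vecs or no cache configured:
				 * disable caching for this bio. */
				opf &= ~REQ_ALLOC_CACHE;
			}
		}
		/* ... regular mempool-backed allocation ... */
	}

Pavel's series[1] is what allowed this path to recycle bios completed from IRQ context, which is why non-polled passthru I/O can now benefit from it.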
On Tue, 17 Jan 2023 17:36:36 +0530, Anuj Gupta wrote:
> This series extends bio pcpu caching for normal / IRQ-driven
> uring-passthru I/Os. Earlier, only polled uring-passthru I/Os could
> leverage bio-cache. After the series from Pavel[1], bio-cache can be
> leveraged by normal / IRQ driven I/Os as well. t/io_uring with an Optane
> SSD setup shows +7.21% for batches of 32 requests.
>
> [1] https://lore.kernel.org/io-uring/cover.1666347703.git.asml.silence@gmail.com/
>
> [...]

Applied, thanks!

[1/2] nvme: set REQ_ALLOC_CACHE for uring-passthru request
      commit: 988136a307157de9e6e9d27ee9f7ea24ee374f32
[2/2] block: extend bio-cache for non-polled requests
      commit: 934f178446b11f621ab52e83211ebf399896db47

Best regards,
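Patch 1/2 is small: the nvme uring-passthru submission path simply starts asking for cached bios. The shape of the change is roughly the following (a paraphrase of the idea, not the literal diff; see the commit above for the real change):

	/* drivers/nvme/host/ioctl.c, nvme_uring_cmd_io() -- approximate shape */
	blk_opf_t rq_flags = REQ_ALLOC_CACHE;	/* previously initialized to 0 */
	blk_mq_req_flags_t blk_flags = 0;

	if (issue_flags & IO_URING_F_NONBLOCK) {
		rq_flags |= REQ_NOWAIT;
		blk_flags = BLK_MQ_REQ_NOWAIT;
	}
	if (issue_flags & IO_URING_F_IOPOLL)
		rq_flags |= REQ_POLLED;

With REQ_ALLOC_CACHE set unconditionally, patch 2/2 then lets the block layer honor it for non-polled requests as well.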
On Tue, Jan 17, 2023 at 10:11:08AM -0700, Jens Axboe wrote:
> On 1/17/23 5:06 AM, Anuj Gupta wrote:
>> This series extends bio pcpu caching for normal / IRQ-driven
>> uring-passthru I/Os. Earlier, only polled uring-passthru I/Os could
>> leverage bio-cache. After the series from Pavel[1], bio-cache can be
>> leveraged by normal / IRQ driven I/Os as well. t/io_uring with an Optane
>> SSD setup shows +7.21% for batches of 32 requests.
>>
>> [1] https://lore.kernel.org/io-uring/cover.1666347703.git.asml.silence@gmail.com/
>>
>> IRQ, 128/32/32, cache off
>
> Tests here -
>
> [...]
>
> Note that this includes Pavel's fix as well:
>
> https://lore.kernel.org/linux-block/80d4511011d7d4751b4cf6375c4e38f237d935e3.1673955390.git.asml.silence@gmail.com/

So I was wondering whether we need this fix for the passthru path too.
We do not. For the block path, blk_mq_get_cached_request() hit a
mismatch because the hctx types differed (read vs default). For
passthru, blk_mq_alloc_cached_request() sees no mismatch, since a
passthrough opf is not treated as a read (default vs default).
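For reference, the hctx-type mapping behind that check looks essentially like this (paraphrased from block/blk-mq.h as of that era; explanatory comments added here):

	static inline enum hctx_type blk_mq_get_hctx_type(blk_opf_t opf)
	{
		enum hctx_type type = HCTX_TYPE_DEFAULT;

		/* Polled I/O is routed to the poll queues. */
		if (opf & REQ_POLLED)
			type = HCTX_TYPE_POLL;
		/*
		 * Only a literal REQ_OP_READ selects the read queues.
		 * Passthrough ops (REQ_OP_DRV_IN / REQ_OP_DRV_OUT) fall
		 * through to HCTX_TYPE_DEFAULT, so a cached passthru request
		 * compares equal (default vs default) and reuse succeeds.
		 */
		else if ((opf & REQ_OP_MASK) == REQ_OP_READ)
			type = HCTX_TYPE_READ;

		return type;
	}

Since both the cached request and a new passthru allocation resolve to HCTX_TYPE_DEFAULT, the mismatch that Pavel's fix addressed on the normal block path cannot occur here.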