Message ID | 20230809194306.170979-1-axboe@kernel.dk (mailing list archive) |
---|---|
Series | io-wq locking improvements |
On 8/10/23 03:43, Jens Axboe wrote:
> Hi,
>
> In chatting with someone who was trying to use io_uring to read
> maildirs, they found that a test case that does:
>
> open file, statx file, read file, close file
>
> ran slower than expected. The culprit here is statx, and arguments
> aside on whether it makes sense to statx in the first place, it does
> highlight that io-wq is pretty locking intensive.
>
> This (very lightly tested [1]) patchset attempts to improve this
> situation by reducing the frequency of grabbing wq->lock and
> acct->lock.
>
> The first patch gets rid of wq->lock on work insertion. io-wq grabs it
> to iterate the free worker list, but that is not necessary.
>
> The second patch reduces the frequency of acct->lock grabs when we need
> to run the queue and process new work. We currently grab the lock and
> check for work, then drop it, then grab it again to process the work.
> That is unnecessary.
>
> The final patch just optimizes how we activate new workers. It's not
> related to the locking itself, just reducing the overhead of activating
> a new worker.
>
> Running the above test case on a directory with 50K files, each being
> between 10 and 4096 bytes, we spend 160-170ms running the workload
> before these patches. With this patchset, we spend 90-100ms doing the
> same work. A bit of profile information is included in the patch commit
> messages.
>
> Can also be found here:
>
> https://git.kernel.dk/cgit/linux/log/?h=io_uring-wq-lock
>
> [1] Runs the test suite just fine, with PROVE_LOCKING enabled and raw
> lockdep as well.

Haven't got time to test it, but it looks good from the code itself.

Reviewed-by: Hao Xu <howeyxu@tencent.com>
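For context, a minimal sketch of what such a reproducer might look like with liburing is shown below: one ring, and for each file an openat, statx, read, and close submitted through io_uring, waiting for each completion in turn. The helper names, queue depth, buffer size, and error handling here are illustrative assumptions, not the original test case from the thread.

```c
/*
 * Hypothetical reconstruction of the maildir-style workload described
 * above: open, statx, read, and close each file through io_uring.
 * Build with: gcc -O2 -o wq_test wq_test.c -luring
 */
#define _GNU_SOURCE
#include <liburing.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>

/* Submit the queued SQE and return the completion result. */
static int submit_and_wait(struct io_uring *ring)
{
	struct io_uring_cqe *cqe;
	int ret;

	io_uring_submit(ring);
	ret = io_uring_wait_cqe(ring, &cqe);
	if (ret < 0)
		return ret;
	ret = cqe->res;
	io_uring_cqe_seen(ring, cqe);
	return ret;
}

static int handle_file(struct io_uring *ring, const char *path)
{
	struct io_uring_sqe *sqe;
	struct statx stx;
	char buf[4096];
	int fd, ret;

	/* open the file */
	sqe = io_uring_get_sqe(ring);
	io_uring_prep_openat(sqe, AT_FDCWD, path, O_RDONLY, 0);
	fd = submit_and_wait(ring);
	if (fd < 0)
		return fd;

	/* statx it -- the operation that gets punted to io-wq */
	sqe = io_uring_get_sqe(ring);
	io_uring_prep_statx(sqe, AT_FDCWD, path, 0, STATX_BASIC_STATS, &stx);
	ret = submit_and_wait(ring);
	if (ret < 0)
		return ret;

	/* read the contents */
	sqe = io_uring_get_sqe(ring);
	io_uring_prep_read(sqe, fd, buf, sizeof(buf), 0);
	ret = submit_and_wait(ring);
	if (ret < 0)
		return ret;

	/* and close it again */
	sqe = io_uring_get_sqe(ring);
	io_uring_prep_close(sqe, fd);
	return submit_and_wait(ring);
}

int main(int argc, char *argv[])
{
	struct io_uring ring;
	int i;

	if (io_uring_queue_init(8, &ring, 0))
		return 1;
	for (i = 1; i < argc; i++)
		if (handle_file(&ring, argv[i]) < 0)
			fprintf(stderr, "failed on %s\n", argv[i]);
	io_uring_queue_exit(&ring);
	return 0;
}
```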
On 8/10/23 10:00 PM, Hao Xu wrote:
> On 8/10/23 03:43, Jens Axboe wrote:
>> Hi,
>>
>> In chatting with someone who was trying to use io_uring to read
>> maildirs, [...]
>
> Haven't got time to test it, but it looks good from the code itself.
>
> Reviewed-by: Hao Xu <howeyxu@tencent.com>

Thanks, added.
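As a side note on the second patch in the quoted cover letter: the pattern it removes is taking the lock once to check for work, dropping it, and taking it again to dequeue. The sketch below is a userspace analogy using a pthread mutex with made-up names (`work_queue`, `run_one_before`, `run_one_after`); it is not the io-wq code, just an illustration of folding the check and the dequeue into a single critical section.

```c
/*
 * Userspace analogy of the acct->lock change: one lock acquisition
 * instead of two per work item.  Build with: gcc -pthread lockdemo.c
 */
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

struct work_queue {
	pthread_mutex_t lock;
	int pending;			/* stand-in for the work list */
};

/* Before: two lock round-trips per item. */
static bool run_one_before(struct work_queue *q)
{
	bool have_work;

	pthread_mutex_lock(&q->lock);	/* first acquisition: just to peek */
	have_work = q->pending > 0;
	pthread_mutex_unlock(&q->lock);
	if (!have_work)
		return false;

	pthread_mutex_lock(&q->lock);	/* second acquisition: dequeue */
	q->pending--;
	pthread_mutex_unlock(&q->lock);
	printf("processed one item (before)\n");
	return true;
}

/* After: check and dequeue under a single acquisition. */
static bool run_one_after(struct work_queue *q)
{
	bool have_work;

	pthread_mutex_lock(&q->lock);
	have_work = q->pending > 0;
	if (have_work)
		q->pending--;		/* dequeue while still holding the lock */
	pthread_mutex_unlock(&q->lock);
	if (have_work)
		printf("processed one item (after)\n");
	return have_work;
}

int main(void)
{
	struct work_queue q = { PTHREAD_MUTEX_INITIALIZER, 3 };

	while (run_one_before(&q))
		;
	q.pending = 3;
	while (run_one_after(&q))
		;
	return 0;
}
```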