Message ID: 20221004151748.293388-1-longman@redhat.com
Series: blk-cgroup: Optimize blkcg_rstat_flush()
On 10/6/22 06:11, Hillf Danton wrote:
> On 4 Oct 2022 11:17:48 -0400 Waiman Long <longman@redhat.com>
>> For a system with many CPUs and block devices, the time to do
>> blkcg_rstat_flush() from cgroup_rstat_flush() can be rather long. It
>> can be especially problematic as interrupts are disabled during the
>> flush. It was reported that it might take seconds to complete in some
>> extreme cases, leading to hard lockup messages.
>>
>> As it is likely that not all the percpu blkg_iostat_set's have been
>> updated since the last flush, those stale blkg_iostat_set's don't need
>> to be flushed in this case. This patch optimizes blkcg_rstat_flush()
>> by keeping a lockless list of recently updated blkg_iostat_set's in a
>> newly added percpu blkcg->lhead pointer.
>>
>> The blkg_iostat_set is added to a sentinel lockless list on the update
>> side in blk_cgroup_bio_start(). It is removed from the sentinel
>> lockless list when flushed in blkcg_rstat_flush(). Due to racing, it
>> is possible that blkg_iostat_set's in the lockless list may have no
>> new IO stats to be flushed, but that is OK.
>
> So it is likely that another flag, updated when a bis is added to or
> deleted from the llist, can cut 1/3 off without raising the risk of
> overcomplicating your patch.
>
>> struct blkg_iostat_set {
>> 	struct u64_stats_sync	sync;
>> +	struct llist_node	lnode;
>> +	struct blkcg_gq		*blkg;
> +	atomic_t		queued;
>
>> 	struct blkg_iostat	cur;
>> 	struct blkg_iostat	last;
>> };

Yes, by introducing a flag to record the lockless-list state, it is
possible to just use the current llist implementation. Maybe I can
rework it for now without the sentinel variant and post a separate
llist patch for that later on.

Cheers,
Longman
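
To make the flag idea above concrete, here is a minimal, untested sketch
of how an atomic queued flag could gate llist insertion, assuming the
blkg_iostat_set fields from the quoted hunk. blkcg_queue_bis() and
blkcg_flush_bis() are hypothetical helper names used only for
illustration; llist_add(), llist_del_all() and
llist_for_each_entry_safe() are the stock <linux/llist.h> primitives.

#include <linux/atomic.h>
#include <linux/llist.h>

/*
 * Update side (e.g. from blk_cgroup_bio_start()): only the 0 -> 1
 * transition of ->queued wins the right to put the bis on the per-cpu
 * lockless list, so each bis appears at most once between flushes.
 */
static void blkcg_queue_bis(struct blkg_iostat_set *bis,
			    struct llist_head *lhead)
{
	if (atomic_cmpxchg(&bis->queued, 0, 1) == 0)
		llist_add(&bis->lnode, lhead);
}

/*
 * Flush side (e.g. from blkcg_rstat_flush()): detach the whole list in
 * one atomic exchange, then walk it. Clearing ->queued before reading
 * the stats lets a racing updater requeue the bis; at worst it is
 * flushed again with nothing new, which the patch already tolerates.
 */
static void blkcg_flush_bis(struct llist_head *lhead)
{
	struct llist_node *lnode = llist_del_all(lhead);
	struct blkg_iostat_set *bis, *next;

	llist_for_each_entry_safe(bis, next, lnode, lnode) {
		atomic_set(&bis->queued, 0);
		/* ... fold bis->cur - bis->last into the parent stats ... */
	}
}

With the flag in place, an entry never has to stay resident on the list,
so the plain llist_add()/llist_del_all() pair suffices and the sentinel
llist variant is no longer needed.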