Message ID | 1498855388-16990-12-git-send-email-bcache@lists.ewheeler.net (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 2017/7/1 上午4:43, bcache@lists.ewheeler.net wrote: > From: Tang Junhui <tang.junhui@zte.com.cn> > > bucket_in_use is updated in gc thread which triggered by invalidating or > writing sectors_to_gc dirty data, It's been too long, Therefore, when we > use it to compare with the threshold, it is often not timely, which leads > to inaccurate judgment and often results in bucket depletion. > > Signed-off-by: Tang Junhui <tang.junhui@zte.com.cn> > --- > drivers/md/bcache/btree.c | 29 +++++++++++++++++++++++++++-- > 1 file changed, 27 insertions(+), 2 deletions(-) > > diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c > index 866dcf7..77aa20b 100644 > --- a/drivers/md/bcache/btree.c > +++ b/drivers/md/bcache/btree.c > @@ -90,6 +90,8 @@ > #define MAX_NEED_GC 64 > #define MAX_SAVE_PRIO 72 > > +#define GC_THREAD_TIMEOUT_MS (30 * 1000) > + > #define PTR_DIRTY_BIT (((uint64_t) 1 << 36)) > > #define PTR_HASH(c, k) \ > @@ -1760,6 +1762,23 @@ static void bch_btree_gc(struct cache_set *c) > bch_moving_gc(c); > } > > +void bch_update_bucket_in_use(struct cache_set *c) > +{ > + struct cache *ca; > + struct bucket *b; > + unsigned i; > + size_t available = 0; > + > + for_each_cache(ca, c, i) { > + for_each_bucket(b, ca) { > + if (!GC_MARK(b) || GC_MARK(b) == GC_MARK_RECLAIMABLE) > + available++; > + } > + } > + bucket_lock of cache set should be held before accessing buckets. > + c->gc_stats.in_use = (c->nbuckets - available) * 100 / c->nbuckets; > +} > + > static bool gc_should_run(struct cache_set *c) > { > struct cache *ca; > @@ -1778,10 +1797,16 @@ static bool gc_should_run(struct cache_set *c) > static int bch_gc_thread(void *arg) > { > struct cache_set *c = arg; > + long ret; > + unsigned long timeout = msecs_to_jiffies(GC_THREAD_TIMEOUT_MS); > > while (1) { > - wait_event_interruptible(c->gc_wait, > - kthread_should_stop() || gc_should_run(c)); > + ret = wait_event_interruptible_timeout(c->gc_wait, > + kthread_should_stop() || gc_should_run(c), timeout); > + if (!ret) { > + bch_update_bucket_in_use(c); > + continue; A continue here will ignore status returned from kthread_should_stop(), which might not be expected behavior. > + } > > if (kthread_should_stop()) > break; > Iterating all buckets from the cache set requires bucket_lock to be held. Waiting for bucket_lock may take quite a long time for either bucket allocating code or bch_gc_thread(). What I concern is, this patch may introduce bucket allocation delay in period of GC_THREAD_TIMEOUT_MS. We need to find out a way to avoid such a performance regression.
On 2017/7/11 下午1:39, tang.junhui@zte.com.cn wrote: > Compared to bucket depletion, resulting in hanging dead, > It is worthy to consumes a little time to update the bucket_in_use. > If you have any better solution, please show to us, > We should solve it as soon as possible, not wait for it forever. > > Your response makes sense, but not all people have same opinion with yours. The major issue here is, we need to hold bucket_lock here, otherwise we may access wild pointer to freed kobj and panic kernel. And we need to measure how much time will be occupied to iterate dirty buckets number with bucket_lock held. If the latency is not that much, we can ignore the cost, otherwise we do need to find a better solution. Thanks. Coly > > > 发件人: Coly Li <i@coly.li> > 收件人: linux-block@vger.kernel.org, Tang Junhui > <tang.junhui@zte.com.cn>, > 抄送: bcache@lists.ewheeler.net, linux-bcache@vger.kernel.org, > hch@infradead.org, axboe@kernel.dk > 日期: 2017/07/11 13:06 > 主题: Re: [PATCH 12/19] bcache: update bucket_in_use periodically > 发件人: linux-bcache-owner@vger.kernel.org > ------------------------------------------------------------------------ > > > > On 2017/7/1 上午4:43, bcache@lists.ewheeler.net wrote: >> From: Tang Junhui <tang.junhui@zte.com.cn> >> >> bucket_in_use is updated in gc thread which triggered by invalidating or >> writing sectors_to_gc dirty data, It's been too long, Therefore, when we >> use it to compare with the threshold, it is often not timely, which leads >> to inaccurate judgment and often results in bucket depletion. >> >> Signed-off-by: Tang Junhui <tang.junhui@zte.com.cn> >> --- >> drivers/md/bcache/btree.c | 29 +++++++++++++++++++++++++++-- >> 1 file changed, 27 insertions(+), 2 deletions(-) >> >> diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c >> index 866dcf7..77aa20b 100644 >> --- a/drivers/md/bcache/btree.c >> +++ b/drivers/md/bcache/btree.c >> @@ -90,6 +90,8 @@ >> #define MAX_NEED_GC 64 >> #define MAX_SAVE_PRIO 72 >> >> +#define GC_THREAD_TIMEOUT_MS (30 * 1000) >> + >> #define PTR_DIRTY_BIT (((uint64_t) 1 > << 36)) >> >> #define PTR_HASH(c, k) > \ >> @@ -1760,6 +1762,23 @@ static void bch_btree_gc(struct cache_set *c) >> bch_moving_gc(c); >> } >> >> +void bch_update_bucket_in_use(struct cache_set *c) >> +{ >> + struct cache *ca; >> + struct bucket *b; >> + unsigned i; >> + size_t available = 0; >> + >> + for_each_cache(ca, c, i) { >> + for_each_bucket(b, ca) { >> + if (!GC_MARK(b) || > GC_MARK(b) == GC_MARK_RECLAIMABLE) >> + > available++; >> + } >> + } >> + > > bucket_lock of cache set should be held before accessing buckets. > > >> + c->gc_stats.in_use = (c->nbuckets - available) * 100 > / c->nbuckets; >> +} >> + >> static bool gc_should_run(struct cache_set *c) >> { >> struct cache *ca; >> @@ -1778,10 +1797,16 @@ static bool gc_should_run(struct cache_set *c) >> static int bch_gc_thread(void *arg) >> { >> struct cache_set *c = arg; >> + long ret; >> + unsigned long timeout = > msecs_to_jiffies(GC_THREAD_TIMEOUT_MS); >> >> while (1) { >> - wait_event_interruptible(c->gc_wait, >> - > kthread_should_stop() || gc_should_run(c)); >> + ret = > wait_event_interruptible_timeout(c->gc_wait, >> + > kthread_should_stop() || gc_should_run(c), timeout); >> + if (!ret) { >> + > bch_update_bucket_in_use(c); >> + continue; > > A continue here will ignore status returned from kthread_should_stop(), > which might not be expected behavior. > > >> + } >> >> if (kthread_should_stop()) >> break; >> > > Iterating all buckets from the cache set requires bucket_lock to be > held. Waiting for bucket_lock may take quite a long time for either > bucket allocating code or bch_gc_thread(). What I concern is, this patch > may introduce bucket allocation delay in period of GC_THREAD_TIMEOUT_MS. > > We need to find out a way to avoid such a performance regression. > > -- > Coly Li > -- > To unsubscribe from this list: send the line "unsubscribe linux-bcache" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > >
On 2017/7/11 下午1:39, tang.junhui@zte.com.cn wrote: > Compared to bucket depletion, resulting in hanging dead, > It is worthy to consumes a little time to update the bucket_in_use. > If you have any better solution, please show to us, > We should solve it as soon as possible, not wait for it forever. > I also test this patch on a cache device with 4x3.8TB size, all buckets iteration takes around 40-50ms. If the iteration needs to hold bucket_lock of cache set, it is very probably to introduce a huge I/O latency in period of every 30 seconds. For database people, this is not good news. Coly > > > > 发件人: Coly Li <i@coly.li> > 收件人: linux-block@vger.kernel.org, Tang Junhui > <tang.junhui@zte.com.cn>, > 抄送: bcache@lists.ewheeler.net, linux-bcache@vger.kernel.org, > hch@infradead.org, axboe@kernel.dk > 日期: 2017/07/11 13:06 > 主题: Re: [PATCH 12/19] bcache: update bucket_in_use periodically > 发件人: linux-bcache-owner@vger.kernel.org > ------------------------------------------------------------------------ > > > > On 2017/7/1 上午4:43, bcache@lists.ewheeler.net wrote: >> From: Tang Junhui <tang.junhui@zte.com.cn> >> >> bucket_in_use is updated in gc thread which triggered by invalidating or >> writing sectors_to_gc dirty data, It's been too long, Therefore, when we >> use it to compare with the threshold, it is often not timely, which leads >> to inaccurate judgment and often results in bucket depletion. >> >> Signed-off-by: Tang Junhui <tang.junhui@zte.com.cn> >> --- >> drivers/md/bcache/btree.c | 29 +++++++++++++++++++++++++++-- >> 1 file changed, 27 insertions(+), 2 deletions(-) >> >> diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c >> index 866dcf7..77aa20b 100644 >> --- a/drivers/md/bcache/btree.c >> +++ b/drivers/md/bcache/btree.c >> @@ -90,6 +90,8 @@ >> #define MAX_NEED_GC 64 >> #define MAX_SAVE_PRIO 72 >> >> +#define GC_THREAD_TIMEOUT_MS (30 * 1000) >> + >> #define PTR_DIRTY_BIT (((uint64_t) 1 > << 36)) >> >> #define PTR_HASH(c, k) > \ >> @@ -1760,6 +1762,23 @@ static void bch_btree_gc(struct cache_set *c) >> bch_moving_gc(c); >> } >> >> +void bch_update_bucket_in_use(struct cache_set *c) >> +{ >> + struct cache *ca; >> + struct bucket *b; >> + unsigned i; >> + size_t available = 0; >> + >> + for_each_cache(ca, c, i) { >> + for_each_bucket(b, ca) { >> + if (!GC_MARK(b) || > GC_MARK(b) == GC_MARK_RECLAIMABLE) >> + > available++; >> + } >> + } >> + > > bucket_lock of cache set should be held before accessing buckets. > > >> + c->gc_stats.in_use = (c->nbuckets - available) * 100 > / c->nbuckets; >> +} >> + >> static bool gc_should_run(struct cache_set *c) >> { >> struct cache *ca; >> @@ -1778,10 +1797,16 @@ static bool gc_should_run(struct cache_set *c) >> static int bch_gc_thread(void *arg) >> { >> struct cache_set *c = arg; >> + long ret; >> + unsigned long timeout = > msecs_to_jiffies(GC_THREAD_TIMEOUT_MS); >> >> while (1) { >> - wait_event_interruptible(c->gc_wait, >> - > kthread_should_stop() || gc_should_run(c)); >> + ret = > wait_event_interruptible_timeout(c->gc_wait, >> + > kthread_should_stop() || gc_should_run(c), timeout); >> + if (!ret) { >> + > bch_update_bucket_in_use(c); >> + continue; > > A continue here will ignore status returned from kthread_should_stop(), > which might not be expected behavior. > > >> + } >> >> if (kthread_should_stop()) >> break; >> > > Iterating all buckets from the cache set requires bucket_lock to be > held. Waiting for bucket_lock may take quite a long time for either > bucket allocating code or bch_gc_thread(). What I concern is, this patch > may introduce bucket allocation delay in period of GC_THREAD_TIMEOUT_MS. > > We need to find out a way to avoid such a performance regression. > > -- > Coly Li > -- > To unsubscribe from this list: send the line "unsubscribe linux-bcache" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > >
On Tue, 11 Jul 2017, Coly Li wrote: > On 2017/7/11 下午1:39, tang.junhui@zte.com.cn wrote: > > Compared to bucket depletion, resulting in hanging dead, > > It is worthy to consumes a little time to update the bucket_in_use. > > If you have any better solution, please show to us, > > We should solve it as soon as possible, not wait for it forever. > > > > I also test this patch on a cache device with 4x3.8TB size, all buckets > iteration takes around 40-50ms. If the iteration needs to hold > bucket_lock of cache set, it is very probably to introduce a huge I/O > latency in period of every 30 seconds. > > For database people, this is not good news. Hi Tang, I'm waiting to queue this patch pending your response to Coly. Please send a v2 when you're ready. Thanks! -- Eric Wheeler > > Coly > > > > > > > > > > 发件人: Coly Li <i@coly.li> > > 收件人: linux-block@vger.kernel.org, Tang Junhui > > <tang.junhui@zte.com.cn>, > > 抄送: bcache@lists.ewheeler.net, linux-bcache@vger.kernel.org, > > hch@infradead.org, axboe@kernel.dk > > 日期: 2017/07/11 13:06 > > 主题: Re: [PATCH 12/19] bcache: update bucket_in_use periodically > > 发件人: linux-bcache-owner@vger.kernel.org > > ------------------------------------------------------------------------ > > > > > > > > On 2017/7/1 上午4:43, bcache@lists.ewheeler.net wrote: > >> From: Tang Junhui <tang.junhui@zte.com.cn> > >> > >> bucket_in_use is updated in gc thread which triggered by invalidating or > >> writing sectors_to_gc dirty data, It's been too long, Therefore, when we > >> use it to compare with the threshold, it is often not timely, which leads > >> to inaccurate judgment and often results in bucket depletion. > >> > >> Signed-off-by: Tang Junhui <tang.junhui@zte.com.cn> > >> --- > >> drivers/md/bcache/btree.c | 29 +++++++++++++++++++++++++++-- > >> 1 file changed, 27 insertions(+), 2 deletions(-) > >> > >> diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c > >> index 866dcf7..77aa20b 100644 > >> --- a/drivers/md/bcache/btree.c > >> +++ b/drivers/md/bcache/btree.c > >> @@ -90,6 +90,8 @@ > >> #define MAX_NEED_GC 64 > >> #define MAX_SAVE_PRIO 72 > >> > >> +#define GC_THREAD_TIMEOUT_MS (30 * 1000) > >> + > >> #define PTR_DIRTY_BIT (((uint64_t) 1 > > << 36)) > >> > >> #define PTR_HASH(c, k) > > \ > >> @@ -1760,6 +1762,23 @@ static void bch_btree_gc(struct cache_set *c) > >> bch_moving_gc(c); > >> } > >> > >> +void bch_update_bucket_in_use(struct cache_set *c) > >> +{ > >> + struct cache *ca; > >> + struct bucket *b; > >> + unsigned i; > >> + size_t available = 0; > >> + > >> + for_each_cache(ca, c, i) { > >> + for_each_bucket(b, ca) { > >> + if (!GC_MARK(b) || > > GC_MARK(b) == GC_MARK_RECLAIMABLE) > >> + > > available++; > >> + } > >> + } > >> + > > > > bucket_lock of cache set should be held before accessing buckets. > > > > > >> + c->gc_stats.in_use = (c->nbuckets - available) * 100 > > / c->nbuckets; > >> +} > >> + > >> static bool gc_should_run(struct cache_set *c) > >> { > >> struct cache *ca; > >> @@ -1778,10 +1797,16 @@ static bool gc_should_run(struct cache_set *c) > >> static int bch_gc_thread(void *arg) > >> { > >> struct cache_set *c = arg; > >> + long ret; > >> + unsigned long timeout = > > msecs_to_jiffies(GC_THREAD_TIMEOUT_MS); > >> > >> while (1) { > >> - wait_event_interruptible(c->gc_wait, > >> - > > kthread_should_stop() || gc_should_run(c)); > >> + ret = > > wait_event_interruptible_timeout(c->gc_wait, > >> + > > kthread_should_stop() || gc_should_run(c), timeout); > >> + if (!ret) { > >> + > > bch_update_bucket_in_use(c); > >> + continue; > > > > A continue here will ignore status returned from kthread_should_stop(), > > which might not be expected behavior. > > > > > >> + } > >> > >> if (kthread_should_stop()) > >> break; > >> > > > > Iterating all buckets from the cache set requires bucket_lock to be > > held. Waiting for bucket_lock may take quite a long time for either > > bucket allocating code or bch_gc_thread(). What I concern is, this patch > > may introduce bucket allocation delay in period of GC_THREAD_TIMEOUT_MS. > > > > We need to find out a way to avoid such a performance regression. > > > > -- > > Coly Li > > -- > > To unsubscribe from this list: send the line "unsubscribe linux-bcache" in > > the body of a message to majordomo@vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > > > > > > -- > Coly Li >
On 2017/7/13 下午12:13, Eric Wheeler wrote: > On Tue, 11 Jul 2017, Coly Li wrote: > >> On 2017/7/11 下午1:39, tang.junhui@zte.com.cn wrote: >>> Compared to bucket depletion, resulting in hanging dead, >>> It is worthy to consumes a little time to update the bucket_in_use. >>> If you have any better solution, please show to us, >>> We should solve it as soon as possible, not wait for it forever. >>> >> >> I also test this patch on a cache device with 4x3.8TB size, all buckets >> iteration takes around 40-50ms. If the iteration needs to hold >> bucket_lock of cache set, it is very probably to introduce a huge I/O >> latency in period of every 30 seconds. >> >> For database people, this is not good news. > > > Hi Tang, > > I'm waiting to queue this patch pending your response to Coly. > > Please send a v2 when you're ready. Eric, I guess Tang is working on the I/O hang issue during back ground garbage collection running. From discussion from other email thread, it seems a regular I/O request gets hung for 10+ second in some cases. Maybe that issue is more urgent than this one. From my personal opinion, updating bucket_in_use is for acting garbage collection. If number of bucket in use is not updated in time, garbage collection won't start due to old bucket_in_use still beyond CUTOFF_WRITEBACK_SYNC. We may maintain an atomic counter per-cache set for dirty buckets, and update it at some locations when allocating or reclaiming bucket. This counter is unnecessary to be very accurate, just accurate enough for should_writeback() working correctly. I am also looking at it for a better solution as well. Coly >> >> Coly >> >> >>> >>> >>> >>> 发件人: Coly Li <i@coly.li> >>> 收件人: linux-block@vger.kernel.org, Tang Junhui >>> <tang.junhui@zte.com.cn>, >>> 抄送: bcache@lists.ewheeler.net, linux-bcache@vger.kernel.org, >>> hch@infradead.org, axboe@kernel.dk >>> 日期: 2017/07/11 13:06 >>> 主题: Re: [PATCH 12/19] bcache: update bucket_in_use periodically >>> 发件人: linux-bcache-owner@vger.kernel.org >>> ------------------------------------------------------------------------ >>> >>> >>> >>> On 2017/7/1 上午4:43, bcache@lists.ewheeler.net wrote: >>>> From: Tang Junhui <tang.junhui@zte.com.cn> >>>> >>>> bucket_in_use is updated in gc thread which triggered by invalidating or >>>> writing sectors_to_gc dirty data, It's been too long, Therefore, when we >>>> use it to compare with the threshold, it is often not timely, which leads >>>> to inaccurate judgment and often results in bucket depletion. >>>> >>>> Signed-off-by: Tang Junhui <tang.junhui@zte.com.cn> >>>> --- >>>> drivers/md/bcache/btree.c | 29 +++++++++++++++++++++++++++-- >>>> 1 file changed, 27 insertions(+), 2 deletions(-) >>>> >>>> diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c >>>> index 866dcf7..77aa20b 100644 >>>> --- a/drivers/md/bcache/btree.c >>>> +++ b/drivers/md/bcache/btree.c >>>> @@ -90,6 +90,8 @@ >>>> #define MAX_NEED_GC 64 >>>> #define MAX_SAVE_PRIO 72 >>>> >>>> +#define GC_THREAD_TIMEOUT_MS (30 * 1000) >>>> + >>>> #define PTR_DIRTY_BIT (((uint64_t) 1 >>> << 36)) >>>> >>>> #define PTR_HASH(c, k) >>> \ >>>> @@ -1760,6 +1762,23 @@ static void bch_btree_gc(struct cache_set *c) >>>> bch_moving_gc(c); >>>> } >>>> >>>> +void bch_update_bucket_in_use(struct cache_set *c) >>>> +{ >>>> + struct cache *ca; >>>> + struct bucket *b; >>>> + unsigned i; >>>> + size_t available = 0; >>>> + >>>> + for_each_cache(ca, c, i) { >>>> + for_each_bucket(b, ca) { >>>> + if (!GC_MARK(b) || >>> GC_MARK(b) == GC_MARK_RECLAIMABLE) >>>> + >>> available++; >>>> + } >>>> + } >>>> + >>> >>> bucket_lock of cache set should be held before accessing buckets. >>> >>> >>>> + c->gc_stats.in_use = (c->nbuckets - available) * 100 >>> / c->nbuckets; >>>> +} >>>> + >>>> static bool gc_should_run(struct cache_set *c) >>>> { >>>> struct cache *ca; >>>> @@ -1778,10 +1797,16 @@ static bool gc_should_run(struct cache_set *c) >>>> static int bch_gc_thread(void *arg) >>>> { >>>> struct cache_set *c = arg; >>>> + long ret; >>>> + unsigned long timeout = >>> msecs_to_jiffies(GC_THREAD_TIMEOUT_MS); >>>> >>>> while (1) { >>>> - wait_event_interruptible(c->gc_wait, >>>> - >>> kthread_should_stop() || gc_should_run(c)); >>>> + ret = >>> wait_event_interruptible_timeout(c->gc_wait, >>>> + >>> kthread_should_stop() || gc_should_run(c), timeout); >>>> + if (!ret) { >>>> + >>> bch_update_bucket_in_use(c); >>>> + continue; >>> >>> A continue here will ignore status returned from kthread_should_stop(), >>> which might not be expected behavior. >>> >>> >>>> + } >>>> >>>> if (kthread_should_stop()) >>>> break; >>>> >>> >>> Iterating all buckets from the cache set requires bucket_lock to be >>> held. Waiting for bucket_lock may take quite a long time for either >>> bucket allocating code or bch_gc_thread(). What I concern is, this patch >>> may introduce bucket allocation delay in period of GC_THREAD_TIMEOUT_MS. >>> >>> We need to find out a way to avoid such a performance regression. >>> >>> -- >>> Coly Li >>> -- >>> To unsubscribe from this list: send the line "unsubscribe linux-bcache" in >>> the body of a message to majordomo@vger.kernel.org >>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>> >>> >> >> >> -- >> Coly Li
On Thu, 13 Jul 2017, Coly Li wrote: > On 2017/7/13 下午12:13, Eric Wheeler wrote: > > On Tue, 11 Jul 2017, Coly Li wrote: > > > >> On 2017/7/11 下午1:39, tang.junhui@zte.com.cn wrote: > >>> Compared to bucket depletion, resulting in hanging dead, > >>> It is worthy to consumes a little time to update the bucket_in_use. > >>> If you have any better solution, please show to us, > >>> We should solve it as soon as possible, not wait for it forever. > >>> > >> > >> I also test this patch on a cache device with 4x3.8TB size, all buckets > >> iteration takes around 40-50ms. If the iteration needs to hold > >> bucket_lock of cache set, it is very probably to introduce a huge I/O > >> latency in period of every 30 seconds. > >> > >> For database people, this is not good news. > > > > > > Hi Tang, > > > > I'm waiting to queue this patch pending your response to Coly. > > > > Please send a v2 when you're ready. > > > Eric, > > I guess Tang is working on the I/O hang issue during back ground garbage > collection running. From discussion from other email thread, it seems a > regular I/O request gets hung for 10+ second in some cases. Maybe that > issue is more urgent than this one. > > From my personal opinion, updating bucket_in_use is for acting garbage > collection. If number of bucket in use is not updated in time, garbage > collection won't start due to old bucket_in_use still beyond > CUTOFF_WRITEBACK_SYNC. > > We may maintain an atomic counter per-cache set for dirty buckets, and > update it at some locations when allocating or reclaiming bucket. This > counter is unnecessary to be very accurate, just accurate enough for > should_writeback() working correctly. > > I am also looking at it for a better solution as well. Hi Coli & Tang, Have either of you had a chance to come up with a solution to this? -- Eric Wheeler > > Coly > > > >> > >> Coly > >> > >> > >>> > >>> > >>> > >>> 发件人: Coly Li <i@coly.li> > >>> 收件人: linux-block@vger.kernel.org, Tang Junhui > >>> <tang.junhui@zte.com.cn>, > >>> 抄送: bcache@lists.ewheeler.net, linux-bcache@vger.kernel.org, > >>> hch@infradead.org, axboe@kernel.dk > >>> 日期: 2017/07/11 13:06 > >>> 主题: Re: [PATCH 12/19] bcache: update bucket_in_use periodically > >>> 发件人: linux-bcache-owner@vger.kernel.org > >>> ------------------------------------------------------------------------ > >>> > >>> > >>> > >>> On 2017/7/1 上午4:43, bcache@lists.ewheeler.net wrote: > >>>> From: Tang Junhui <tang.junhui@zte.com.cn> > >>>> > >>>> bucket_in_use is updated in gc thread which triggered by invalidating or > >>>> writing sectors_to_gc dirty data, It's been too long, Therefore, when we > >>>> use it to compare with the threshold, it is often not timely, which leads > >>>> to inaccurate judgment and often results in bucket depletion. > >>>> > >>>> Signed-off-by: Tang Junhui <tang.junhui@zte.com.cn> > >>>> --- > >>>> drivers/md/bcache/btree.c | 29 +++++++++++++++++++++++++++-- > >>>> 1 file changed, 27 insertions(+), 2 deletions(-) > >>>> > >>>> diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c > >>>> index 866dcf7..77aa20b 100644 > >>>> --- a/drivers/md/bcache/btree.c > >>>> +++ b/drivers/md/bcache/btree.c > >>>> @@ -90,6 +90,8 @@ > >>>> #define MAX_NEED_GC 64 > >>>> #define MAX_SAVE_PRIO 72 > >>>> > >>>> +#define GC_THREAD_TIMEOUT_MS (30 * 1000) > >>>> + > >>>> #define PTR_DIRTY_BIT (((uint64_t) 1 > >>> << 36)) > >>>> > >>>> #define PTR_HASH(c, k) > >>> \ > >>>> @@ -1760,6 +1762,23 @@ static void bch_btree_gc(struct cache_set *c) > >>>> bch_moving_gc(c); > >>>> } > >>>> > >>>> +void bch_update_bucket_in_use(struct cache_set *c) > >>>> +{ > >>>> + struct cache *ca; > >>>> + struct bucket *b; > >>>> + unsigned i; > >>>> + size_t available = 0; > >>>> + > >>>> + for_each_cache(ca, c, i) { > >>>> + for_each_bucket(b, ca) { > >>>> + if (!GC_MARK(b) || > >>> GC_MARK(b) == GC_MARK_RECLAIMABLE) > >>>> + > >>> available++; > >>>> + } > >>>> + } > >>>> + > >>> > >>> bucket_lock of cache set should be held before accessing buckets. > >>> > >>> > >>>> + c->gc_stats.in_use = (c->nbuckets - available) * 100 > >>> / c->nbuckets; > >>>> +} > >>>> + > >>>> static bool gc_should_run(struct cache_set *c) > >>>> { > >>>> struct cache *ca; > >>>> @@ -1778,10 +1797,16 @@ static bool gc_should_run(struct cache_set *c) > >>>> static int bch_gc_thread(void *arg) > >>>> { > >>>> struct cache_set *c = arg; > >>>> + long ret; > >>>> + unsigned long timeout = > >>> msecs_to_jiffies(GC_THREAD_TIMEOUT_MS); > >>>> > >>>> while (1) { > >>>> - wait_event_interruptible(c->gc_wait, > >>>> - > >>> kthread_should_stop() || gc_should_run(c)); > >>>> + ret = > >>> wait_event_interruptible_timeout(c->gc_wait, > >>>> + > >>> kthread_should_stop() || gc_should_run(c), timeout); > >>>> + if (!ret) { > >>>> + > >>> bch_update_bucket_in_use(c); > >>>> + continue; > >>> > >>> A continue here will ignore status returned from kthread_should_stop(), > >>> which might not be expected behavior. > >>> > >>> > >>>> + } > >>>> > >>>> if (kthread_should_stop()) > >>>> break; > >>>> > >>> > >>> Iterating all buckets from the cache set requires bucket_lock to be > >>> held. Waiting for bucket_lock may take quite a long time for either > >>> bucket allocating code or bch_gc_thread(). What I concern is, this patch > >>> may introduce bucket allocation delay in period of GC_THREAD_TIMEOUT_MS. > >>> > >>> We need to find out a way to avoid such a performance regression. > >>> > >>> -- > >>> Coly Li > >>> -- > >>> To unsubscribe from this list: send the line "unsubscribe linux-bcache" in > >>> the body of a message to majordomo@vger.kernel.org > >>> More majordomo info at http://vger.kernel.org/majordomo-info.html > >>> > >>> > >> > >> > >> -- > >> Coly Li > > > -- > Coly Li > -- > To unsubscribe from this list: send the line "unsubscribe linux-bcache" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html >
On Fri, 27 Oct 2017, Eric Wheeler wrote: > On Thu, 13 Jul 2017, Coly Li wrote: > > > On 2017/7/13 下午12:13, Eric Wheeler wrote: > > > On Tue, 11 Jul 2017, Coly Li wrote: > > > > > >> On 2017/7/11 下午1:39, tang.junhui@zte.com.cn wrote: > > >>> Compared to bucket depletion, resulting in hanging dead, > > >>> It is worthy to consumes a little time to update the bucket_in_use. > > >>> If you have any better solution, please show to us, > > >>> We should solve it as soon as possible, not wait for it forever. > > >>> > > >> > > >> I also test this patch on a cache device with 4x3.8TB size, all buckets > > >> iteration takes around 40-50ms. If the iteration needs to hold > > >> bucket_lock of cache set, it is very probably to introduce a huge I/O > > >> latency in period of every 30 seconds. > > >> > > >> For database people, this is not good news. > > > > > > > > > Hi Tang, > > > > > > I'm waiting to queue this patch pending your response to Coly. > > > > > > Please send a v2 when you're ready. > > > > > > Eric, > > > > I guess Tang is working on the I/O hang issue during back ground garbage > > collection running. From discussion from other email thread, it seems a > > regular I/O request gets hung for 10+ second in some cases. Maybe that > > issue is more urgent than this one. > > > > From my personal opinion, updating bucket_in_use is for acting garbage > > collection. If number of bucket in use is not updated in time, garbage > > collection won't start due to old bucket_in_use still beyond > > CUTOFF_WRITEBACK_SYNC. > > > > We may maintain an atomic counter per-cache set for dirty buckets, and > > update it at some locations when allocating or reclaiming bucket. This > > counter is unnecessary to be very accurate, just accurate enough for > > should_writeback() working correctly. > > > > I am also looking at it for a better solution as well. > > Hi Coli & Tang, > > Have either of you had a chance to come up with a solution to this? Nevermind, I just saw the patch sent on 10/24. Thanks for your work on that! I'll see if we can try it out. -- Eric Wheeler > > -- > Eric Wheeler > > > > > Coly > > > > > > >> > > >> Coly > > >> > > >> > > >>> > > >>> > > >>> > > >>> 发件人: Coly Li <i@coly.li> > > >>> 收件人: linux-block@vger.kernel.org, Tang Junhui > > >>> <tang.junhui@zte.com.cn>, > > >>> 抄送: bcache@lists.ewheeler.net, linux-bcache@vger.kernel.org, > > >>> hch@infradead.org, axboe@kernel.dk > > >>> 日期: 2017/07/11 13:06 > > >>> 主题: Re: [PATCH 12/19] bcache: update bucket_in_use periodically > > >>> 发件人: linux-bcache-owner@vger.kernel.org > > >>> ------------------------------------------------------------------------ > > >>> > > >>> > > >>> > > >>> On 2017/7/1 上午4:43, bcache@lists.ewheeler.net wrote: > > >>>> From: Tang Junhui <tang.junhui@zte.com.cn> > > >>>> > > >>>> bucket_in_use is updated in gc thread which triggered by invalidating or > > >>>> writing sectors_to_gc dirty data, It's been too long, Therefore, when we > > >>>> use it to compare with the threshold, it is often not timely, which leads > > >>>> to inaccurate judgment and often results in bucket depletion. > > >>>> > > >>>> Signed-off-by: Tang Junhui <tang.junhui@zte.com.cn> > > >>>> --- > > >>>> drivers/md/bcache/btree.c | 29 +++++++++++++++++++++++++++-- > > >>>> 1 file changed, 27 insertions(+), 2 deletions(-) > > >>>> > > >>>> diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c > > >>>> index 866dcf7..77aa20b 100644 > > >>>> --- a/drivers/md/bcache/btree.c > > >>>> +++ b/drivers/md/bcache/btree.c > > >>>> @@ -90,6 +90,8 @@ > > >>>> #define MAX_NEED_GC 64 > > >>>> #define MAX_SAVE_PRIO 72 > > >>>> > > >>>> +#define GC_THREAD_TIMEOUT_MS (30 * 1000) > > >>>> + > > >>>> #define PTR_DIRTY_BIT (((uint64_t) 1 > > >>> << 36)) > > >>>> > > >>>> #define PTR_HASH(c, k) > > >>> \ > > >>>> @@ -1760,6 +1762,23 @@ static void bch_btree_gc(struct cache_set *c) > > >>>> bch_moving_gc(c); > > >>>> } > > >>>> > > >>>> +void bch_update_bucket_in_use(struct cache_set *c) > > >>>> +{ > > >>>> + struct cache *ca; > > >>>> + struct bucket *b; > > >>>> + unsigned i; > > >>>> + size_t available = 0; > > >>>> + > > >>>> + for_each_cache(ca, c, i) { > > >>>> + for_each_bucket(b, ca) { > > >>>> + if (!GC_MARK(b) || > > >>> GC_MARK(b) == GC_MARK_RECLAIMABLE) > > >>>> + > > >>> available++; > > >>>> + } > > >>>> + } > > >>>> + > > >>> > > >>> bucket_lock of cache set should be held before accessing buckets. > > >>> > > >>> > > >>>> + c->gc_stats.in_use = (c->nbuckets - available) * 100 > > >>> / c->nbuckets; > > >>>> +} > > >>>> + > > >>>> static bool gc_should_run(struct cache_set *c) > > >>>> { > > >>>> struct cache *ca; > > >>>> @@ -1778,10 +1797,16 @@ static bool gc_should_run(struct cache_set *c) > > >>>> static int bch_gc_thread(void *arg) > > >>>> { > > >>>> struct cache_set *c = arg; > > >>>> + long ret; > > >>>> + unsigned long timeout = > > >>> msecs_to_jiffies(GC_THREAD_TIMEOUT_MS); > > >>>> > > >>>> while (1) { > > >>>> - wait_event_interruptible(c->gc_wait, > > >>>> - > > >>> kthread_should_stop() || gc_should_run(c)); > > >>>> + ret = > > >>> wait_event_interruptible_timeout(c->gc_wait, > > >>>> + > > >>> kthread_should_stop() || gc_should_run(c), timeout); > > >>>> + if (!ret) { > > >>>> + > > >>> bch_update_bucket_in_use(c); > > >>>> + continue; > > >>> > > >>> A continue here will ignore status returned from kthread_should_stop(), > > >>> which might not be expected behavior. > > >>> > > >>> > > >>>> + } > > >>>> > > >>>> if (kthread_should_stop()) > > >>>> break; > > >>>> > > >>> > > >>> Iterating all buckets from the cache set requires bucket_lock to be > > >>> held. Waiting for bucket_lock may take quite a long time for either > > >>> bucket allocating code or bch_gc_thread(). What I concern is, this patch > > >>> may introduce bucket allocation delay in period of GC_THREAD_TIMEOUT_MS. > > >>> > > >>> We need to find out a way to avoid such a performance regression. > > >>> > > >>> -- > > >>> Coly Li > > >>> -- > > >>> To unsubscribe from this list: send the line "unsubscribe linux-bcache" in > > >>> the body of a message to majordomo@vger.kernel.org > > >>> More majordomo info at http://vger.kernel.org/majordomo-info.html > > >>> > > >>> > > >> > > >> > > >> -- > > >> Coly Li > > > > > > -- > > Coly Li > > -- > > To unsubscribe from this list: send the line "unsubscribe linux-bcache" in > > the body of a message to majordomo@vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html > >
diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c index 866dcf7..77aa20b 100644 --- a/drivers/md/bcache/btree.c +++ b/drivers/md/bcache/btree.c @@ -90,6 +90,8 @@ #define MAX_NEED_GC 64 #define MAX_SAVE_PRIO 72 +#define GC_THREAD_TIMEOUT_MS (30 * 1000) + #define PTR_DIRTY_BIT (((uint64_t) 1 << 36)) #define PTR_HASH(c, k) \ @@ -1760,6 +1762,23 @@ static void bch_btree_gc(struct cache_set *c) bch_moving_gc(c); } +void bch_update_bucket_in_use(struct cache_set *c) +{ + struct cache *ca; + struct bucket *b; + unsigned i; + size_t available = 0; + + for_each_cache(ca, c, i) { + for_each_bucket(b, ca) { + if (!GC_MARK(b) || GC_MARK(b) == GC_MARK_RECLAIMABLE) + available++; + } + } + + c->gc_stats.in_use = (c->nbuckets - available) * 100 / c->nbuckets; +} + static bool gc_should_run(struct cache_set *c) { struct cache *ca; @@ -1778,10 +1797,16 @@ static bool gc_should_run(struct cache_set *c) static int bch_gc_thread(void *arg) { struct cache_set *c = arg; + long ret; + unsigned long timeout = msecs_to_jiffies(GC_THREAD_TIMEOUT_MS); while (1) { - wait_event_interruptible(c->gc_wait, - kthread_should_stop() || gc_should_run(c)); + ret = wait_event_interruptible_timeout(c->gc_wait, + kthread_should_stop() || gc_should_run(c), timeout); + if (!ret) { + bch_update_bucket_in_use(c); + continue; + } if (kthread_should_stop()) break;