Message ID | 20241106192105.6731-9-kanchana.p.sridhar@intel.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | zswap IAA compress batching | expand |
On Wed, Nov 6, 2024 at 11:21 AM Kanchana P Sridhar <kanchana.p.sridhar@intel.com> wrote: > > This patch implements two changes with respect to the acomp_ctx mutex lock: The commit subject is misleading, one of these is definitely not an optimization. Also, if we are doing two unrelated things we should do them in two separate commits. > > 1) The mutex lock is not acquired/released in zswap_compress(). Instead, > zswap_store() acquires the mutex lock once before compressing each page > in a large folio, and releases the lock once all pages in the folio have > been compressed. This should reduce some compute cycles in case of large > folio stores. I understand how bouncing the mutex around can regress performance, but I expect this to be more due to things like cacheline bouncing and allowing reclaim to make meaningful progress before giving up the mutex, rather than the actual cycles spent acquiring the mutex. Do you have any numbers to support that this is a net improvement? We usually base optimizations on data. > 2) In zswap_decompress(), the mutex lock is released after the conditional > zpool_unmap_handle() based on "src != acomp_ctx->buffer" rather than > before. This ensures that the value of "src" obtained earlier does not > change. If the mutex lock is released before the comparison of "src" it > is possible that another call to reclaim by the same process could > obtain the mutex lock and over-write the value of "src". This seems like a bug fix for 9c500835f279 ("mm: zswap: fix kernel BUG in sg_init_one"). That commit changed checking acomp_ctx->is_sleepable outside the mutex, which seems to be safe, to checking acomp_ctx->buffer. If my understanding is correct, this needs to be sent separately as a hotfix, with a proper Fixes tag and CC stable. The side effect would be that we never unmap the zpool handle and essentially leak the memory, right? > > Signed-off-by: Kanchana P Sridhar <kanchana.p.sridhar@intel.com> > --- > mm/zswap.c | 19 +++++++++++++++---- > 1 file changed, 15 insertions(+), 4 deletions(-) > > diff --git a/mm/zswap.c b/mm/zswap.c > index f6316b66fb23..3e899fa61445 100644 > --- a/mm/zswap.c > +++ b/mm/zswap.c > @@ -880,6 +880,9 @@ static int zswap_cpu_comp_dead(unsigned int cpu, struct hlist_node *node) > return 0; > } > > +/* > + * The acomp_ctx->mutex must be locked/unlocked in the calling procedure. > + */ > static bool zswap_compress(struct page *page, struct zswap_entry *entry, > struct zswap_pool *pool) > { > @@ -895,8 +898,6 @@ static bool zswap_compress(struct page *page, struct zswap_entry *entry, > > acomp_ctx = raw_cpu_ptr(pool->acomp_ctx); > > - mutex_lock(&acomp_ctx->mutex); > - > dst = acomp_ctx->buffer; > sg_init_table(&input, 1); > sg_set_page(&input, page, PAGE_SIZE, 0); > @@ -949,7 +950,6 @@ static bool zswap_compress(struct page *page, struct zswap_entry *entry, > else if (alloc_ret) > zswap_reject_alloc_fail++; > > - mutex_unlock(&acomp_ctx->mutex); > return comp_ret == 0 && alloc_ret == 0; > } > > @@ -986,10 +986,16 @@ static void zswap_decompress(struct zswap_entry *entry, struct folio *folio) > acomp_request_set_params(acomp_ctx->req, &input, &output, entry->length, PAGE_SIZE); > BUG_ON(crypto_wait_req(crypto_acomp_decompress(acomp_ctx->req), &acomp_ctx->wait)); > BUG_ON(acomp_ctx->req->dlen != PAGE_SIZE); > - mutex_unlock(&acomp_ctx->mutex); > > if (src != acomp_ctx->buffer) > zpool_unmap_handle(zpool, entry->handle); > + > + /* > + * It is safer to unlock the mutex after the check for > + * "src != acomp_ctx->buffer" so that the value of "src" > + * does not change. > + */ This comment is unnecessary, we should only release the lock after we are done accessing protected fields. > + mutex_unlock(&acomp_ctx->mutex); > } > > /********************************* > @@ -1487,6 +1493,7 @@ bool zswap_store(struct folio *folio) > { > long nr_pages = folio_nr_pages(folio); > swp_entry_t swp = folio->swap; > + struct crypto_acomp_ctx *acomp_ctx; > struct obj_cgroup *objcg = NULL; > struct mem_cgroup *memcg = NULL; > struct zswap_pool *pool; > @@ -1526,6 +1533,9 @@ bool zswap_store(struct folio *folio) > mem_cgroup_put(memcg); > } > > + acomp_ctx = raw_cpu_ptr(pool->acomp_ctx); > + mutex_lock(&acomp_ctx->mutex); > + > for (index = 0; index < nr_pages; ++index) { > struct page *page = folio_page(folio, index); > ssize_t bytes; > @@ -1547,6 +1557,7 @@ bool zswap_store(struct folio *folio) > ret = true; > > put_pool: > + mutex_unlock(&acomp_ctx->mutex); > zswap_pool_put(pool); > put_objcg: > obj_cgroup_put(objcg); > -- > 2.27.0 >
Hi Yosry, > -----Original Message----- > From: Yosry Ahmed <yosryahmed@google.com> > Sent: Friday, November 8, 2024 12:14 PM > To: Sridhar, Kanchana P <kanchana.p.sridhar@intel.com> > Cc: linux-kernel@vger.kernel.org; linux-mm@kvack.org; > hannes@cmpxchg.org; nphamcs@gmail.com; chengming.zhou@linux.dev; > usamaarif642@gmail.com; ryan.roberts@arm.com; Huang, Ying > <ying.huang@intel.com>; 21cnbao@gmail.com; akpm@linux-foundation.org; > linux-crypto@vger.kernel.org; herbert@gondor.apana.org.au; > davem@davemloft.net; clabbe@baylibre.com; ardb@kernel.org; > ebiggers@google.com; surenb@google.com; Accardi, Kristen C > <kristen.c.accardi@intel.com>; zanussi@kernel.org; Feghali, Wajdi K > <wajdi.k.feghali@intel.com>; Gopal, Vinodh <vinodh.gopal@intel.com> > Subject: Re: [PATCH v3 08/13] mm: zswap: acomp_ctx mutex lock/unlock > optimizations. > > On Wed, Nov 6, 2024 at 11:21 AM Kanchana P Sridhar > <kanchana.p.sridhar@intel.com> wrote: > > > > This patch implements two changes with respect to the acomp_ctx mutex > lock: > > The commit subject is misleading, one of these is definitely not an > optimization. > > Also, if we are doing two unrelated things we should do them in two > separate commits. Thanks for the code review comments. I agree, these should be two separate commits. > > > > > 1) The mutex lock is not acquired/released in zswap_compress(). Instead, > > zswap_store() acquires the mutex lock once before compressing each > page > > in a large folio, and releases the lock once all pages in the folio have > > been compressed. This should reduce some compute cycles in case of > large > > folio stores. > > I understand how bouncing the mutex around can regress performance, > but I expect this to be more due to things like cacheline bouncing and > allowing reclaim to make meaningful progress before giving up the > mutex, rather than the actual cycles spent acquiring the mutex. > > Do you have any numbers to support that this is a net improvement? We > usually base optimizations on data. Makes sense. I will gather the data to motivate this. In my internal validation, I have been re-evaluating if this acquire/release once per large folio store still makes sense, because it runs the risk of introducing long latency paths within a sleeping mutex. I will quantify the benefits of this (if at all) and update. > > > 2) In zswap_decompress(), the mutex lock is released after the conditional > > zpool_unmap_handle() based on "src != acomp_ctx->buffer" rather than > > before. This ensures that the value of "src" obtained earlier does not > > change. If the mutex lock is released before the comparison of "src" it > > is possible that another call to reclaim by the same process could > > obtain the mutex lock and over-write the value of "src". > > This seems like a bug fix for 9c500835f279 ("mm: zswap: fix kernel BUG > in sg_init_one"). That commit changed checking acomp_ctx->is_sleepable > outside the mutex, which seems to be safe, to checking > acomp_ctx->buffer. > > If my understanding is correct, this needs to be sent separately as a > hotfix, with a proper Fixes tag and CC stable. The side effect would > be that we never unmap the zpool handle and essentially leak the > memory, right? Sure, I will send this separately as a hotfix. Yes, the side effect you describe is correct. Thanks, Kanchana > > > > > Signed-off-by: Kanchana P Sridhar <kanchana.p.sridhar@intel.com> > > --- > > mm/zswap.c | 19 +++++++++++++++---- > > 1 file changed, 15 insertions(+), 4 deletions(-) > > > > diff --git a/mm/zswap.c b/mm/zswap.c > > index f6316b66fb23..3e899fa61445 100644 > > --- a/mm/zswap.c > > +++ b/mm/zswap.c > > @@ -880,6 +880,9 @@ static int zswap_cpu_comp_dead(unsigned int cpu, > struct hlist_node *node) > > return 0; > > } > > > > +/* > > + * The acomp_ctx->mutex must be locked/unlocked in the calling > procedure. > > + */ > > static bool zswap_compress(struct page *page, struct zswap_entry *entry, > > struct zswap_pool *pool) > > { > > @@ -895,8 +898,6 @@ static bool zswap_compress(struct page *page, > struct zswap_entry *entry, > > > > acomp_ctx = raw_cpu_ptr(pool->acomp_ctx); > > > > - mutex_lock(&acomp_ctx->mutex); > > - > > dst = acomp_ctx->buffer; > > sg_init_table(&input, 1); > > sg_set_page(&input, page, PAGE_SIZE, 0); > > @@ -949,7 +950,6 @@ static bool zswap_compress(struct page *page, > struct zswap_entry *entry, > > else if (alloc_ret) > > zswap_reject_alloc_fail++; > > > > - mutex_unlock(&acomp_ctx->mutex); > > return comp_ret == 0 && alloc_ret == 0; > > } > > > > @@ -986,10 +986,16 @@ static void zswap_decompress(struct > zswap_entry *entry, struct folio *folio) > > acomp_request_set_params(acomp_ctx->req, &input, &output, entry- > >length, PAGE_SIZE); > > BUG_ON(crypto_wait_req(crypto_acomp_decompress(acomp_ctx- > >req), &acomp_ctx->wait)); > > BUG_ON(acomp_ctx->req->dlen != PAGE_SIZE); > > - mutex_unlock(&acomp_ctx->mutex); > > > > if (src != acomp_ctx->buffer) > > zpool_unmap_handle(zpool, entry->handle); > > + > > + /* > > + * It is safer to unlock the mutex after the check for > > + * "src != acomp_ctx->buffer" so that the value of "src" > > + * does not change. > > + */ > > This comment is unnecessary, we should only release the lock after we > are done accessing protected fields. > > > + mutex_unlock(&acomp_ctx->mutex); > > } > > > > /********************************* > > @@ -1487,6 +1493,7 @@ bool zswap_store(struct folio *folio) > > { > > long nr_pages = folio_nr_pages(folio); > > swp_entry_t swp = folio->swap; > > + struct crypto_acomp_ctx *acomp_ctx; > > struct obj_cgroup *objcg = NULL; > > struct mem_cgroup *memcg = NULL; > > struct zswap_pool *pool; > > @@ -1526,6 +1533,9 @@ bool zswap_store(struct folio *folio) > > mem_cgroup_put(memcg); > > } > > > > + acomp_ctx = raw_cpu_ptr(pool->acomp_ctx); > > + mutex_lock(&acomp_ctx->mutex); > > + > > for (index = 0; index < nr_pages; ++index) { > > struct page *page = folio_page(folio, index); > > ssize_t bytes; > > @@ -1547,6 +1557,7 @@ bool zswap_store(struct folio *folio) > > ret = true; > > > > put_pool: > > + mutex_unlock(&acomp_ctx->mutex); > > zswap_pool_put(pool); > > put_objcg: > > obj_cgroup_put(objcg); > > -- > > 2.27.0 > >
diff --git a/mm/zswap.c b/mm/zswap.c index f6316b66fb23..3e899fa61445 100644 --- a/mm/zswap.c +++ b/mm/zswap.c @@ -880,6 +880,9 @@ static int zswap_cpu_comp_dead(unsigned int cpu, struct hlist_node *node) return 0; } +/* + * The acomp_ctx->mutex must be locked/unlocked in the calling procedure. + */ static bool zswap_compress(struct page *page, struct zswap_entry *entry, struct zswap_pool *pool) { @@ -895,8 +898,6 @@ static bool zswap_compress(struct page *page, struct zswap_entry *entry, acomp_ctx = raw_cpu_ptr(pool->acomp_ctx); - mutex_lock(&acomp_ctx->mutex); - dst = acomp_ctx->buffer; sg_init_table(&input, 1); sg_set_page(&input, page, PAGE_SIZE, 0); @@ -949,7 +950,6 @@ static bool zswap_compress(struct page *page, struct zswap_entry *entry, else if (alloc_ret) zswap_reject_alloc_fail++; - mutex_unlock(&acomp_ctx->mutex); return comp_ret == 0 && alloc_ret == 0; } @@ -986,10 +986,16 @@ static void zswap_decompress(struct zswap_entry *entry, struct folio *folio) acomp_request_set_params(acomp_ctx->req, &input, &output, entry->length, PAGE_SIZE); BUG_ON(crypto_wait_req(crypto_acomp_decompress(acomp_ctx->req), &acomp_ctx->wait)); BUG_ON(acomp_ctx->req->dlen != PAGE_SIZE); - mutex_unlock(&acomp_ctx->mutex); if (src != acomp_ctx->buffer) zpool_unmap_handle(zpool, entry->handle); + + /* + * It is safer to unlock the mutex after the check for + * "src != acomp_ctx->buffer" so that the value of "src" + * does not change. + */ + mutex_unlock(&acomp_ctx->mutex); } /********************************* @@ -1487,6 +1493,7 @@ bool zswap_store(struct folio *folio) { long nr_pages = folio_nr_pages(folio); swp_entry_t swp = folio->swap; + struct crypto_acomp_ctx *acomp_ctx; struct obj_cgroup *objcg = NULL; struct mem_cgroup *memcg = NULL; struct zswap_pool *pool; @@ -1526,6 +1533,9 @@ bool zswap_store(struct folio *folio) mem_cgroup_put(memcg); } + acomp_ctx = raw_cpu_ptr(pool->acomp_ctx); + mutex_lock(&acomp_ctx->mutex); + for (index = 0; index < nr_pages; ++index) { struct page *page = folio_page(folio, index); ssize_t bytes; @@ -1547,6 +1557,7 @@ bool zswap_store(struct folio *folio) ret = true; put_pool: + mutex_unlock(&acomp_ctx->mutex); zswap_pool_put(pool); put_objcg: obj_cgroup_put(objcg);
This patch implements two changes with respect to the acomp_ctx mutex lock: 1) The mutex lock is not acquired/released in zswap_compress(). Instead, zswap_store() acquires the mutex lock once before compressing each page in a large folio, and releases the lock once all pages in the folio have been compressed. This should reduce some compute cycles in case of large folio stores. 2) In zswap_decompress(), the mutex lock is released after the conditional zpool_unmap_handle() based on "src != acomp_ctx->buffer" rather than before. This ensures that the value of "src" obtained earlier does not change. If the mutex lock is released before the comparison of "src" it is possible that another call to reclaim by the same process could obtain the mutex lock and over-write the value of "src". Signed-off-by: Kanchana P Sridhar <kanchana.p.sridhar@intel.com> --- mm/zswap.c | 19 +++++++++++++++---- 1 file changed, 15 insertions(+), 4 deletions(-)