From patchwork Tue Feb 18 22:26:58 2020
X-Patchwork-Submitter: Mina Almasry <almasrymina@google.com>
X-Patchwork-Id: 11389887
Date: Tue, 18 Feb 2020 14:26:58 -0800
Message-Id: <20200218222658.132101-1-almasrymina@google.com>
Subject: [PATCH -next] mm/hugetlb: Fix file_region entry allocations
From: Mina Almasry <almasrymina@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, Qian Cai <cai@lca.pw>,
    mike.kravetz@oracle.com, Mina Almasry <almasrymina@google.com>

Commit a9e443086489e ("hugetlb: disable region_add file_region coalescing")
introduced a bug in how file_region entries are added. Fix it as follows:

1. Refactor the file_region entry allocation logic into one function,
   called from both region_add() and region_chg(), since the code is now
   identical.

2. region_chg() now modifies resv->adds_in_progress only after the regions
   have been allocated. Previously it incremented adds_in_progress and then
   dropped the lock, which could confuse racing region_add() calls into
   thinking they needed to allocate entries when they were not allowed to.

3. In region_add(), only try to allocate entries when
   actual_regions_needed > in_regions_needed. This does not fix a bug, but
   it makes the code cleaner and easier to reason about.

Tested with the LTP hugemmap0* tests and the libhugetlbfs tests.

Reported-by: Qian Cai <cai@lca.pw>
Signed-off-by: Mina Almasry <almasrymina@google.com>
Fixes: a9e443086489e ("hugetlb: disable region_add file_region coalescing")
---
 mm/hugetlb.c | 149 +++++++++++++++++++++++++--------------------
 1 file changed, 74 insertions(+), 75 deletions(-)

--
2.25.0.265.gbab2e86ba0-goog

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 8171d2211be77..3d5b48ae8971f 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -439,6 +439,66 @@ static long add_reservation_in_range(struct resv_map *resv, long f, long t,
 	return add;
 }
 
+/* Must be called with resv->lock acquired. Will drop lock to allocate entries.
+ */
+static int allocate_file_region_entries(struct resv_map *resv,
+					int regions_needed)
+{
+	struct list_head allocated_regions;
+	int to_allocate = 0, i = 0;
+	struct file_region *trg = NULL, *rg = NULL;
+
+	VM_BUG_ON(regions_needed < 0);
+
+	INIT_LIST_HEAD(&allocated_regions);
+
+	/*
+	 * Check for sufficient descriptors in the cache to accommodate
+	 * the number of in progress add operations plus regions_needed.
+	 *
+	 * This is a while loop because when we drop the lock, some other call
+	 * to region_add or region_del may have consumed some region_entries,
+	 * so we keep looping here until we finally have enough entries for
+	 * (adds_in_progress + regions_needed).
+	 */
+	while (resv->region_cache_count <
+	       (resv->adds_in_progress + regions_needed)) {
+		to_allocate = resv->adds_in_progress + regions_needed -
+			      resv->region_cache_count;
+
+		/* At this point, we should have enough entries in the cache
+		 * for all the existings adds_in_progress. We should only be
+		 * needing to allocate for regions_needed.
+		 */
+		VM_BUG_ON(resv->region_cache_count < resv->adds_in_progress);
+
+		spin_unlock(&resv->lock);
+		for (i = 0; i < to_allocate; i++) {
+			trg = kmalloc(sizeof(*trg), GFP_KERNEL);
+			if (!trg)
+				goto out_of_memory;
+			list_add(&trg->link, &allocated_regions);
+		}
+
+		spin_lock(&resv->lock);
+
+		list_for_each_entry_safe (rg, trg, &allocated_regions, link) {
+			list_del(&rg->link);
+			list_add(&rg->link, &resv->region_cache);
+			resv->region_cache_count++;
+		}
+	}
+
+	return 0;
+
+out_of_memory:
+	list_for_each_entry_safe (rg, trg, &allocated_regions, link) {
+		list_del(&rg->link);
+		kfree(rg);
+	}
+	return -ENOMEM;
+}
+
 /*
  * Add the huge page range represented by [f, t) to the reserve
  * map. Regions will be taken from the cache to fill in this range.
@@ -460,11 +520,7 @@ static long region_add(struct resv_map *resv, long f, long t,
 		       long in_regions_needed, struct hstate *h,
 		       struct hugetlb_cgroup *h_cg)
 {
-	long add = 0, actual_regions_needed = 0, i = 0;
-	struct file_region *trg = NULL, *rg = NULL;
-	struct list_head allocated_regions;
-
-	INIT_LIST_HEAD(&allocated_regions);
+	long add = 0, actual_regions_needed = 0;
 
 	spin_lock(&resv->lock);
 retry:
@@ -476,34 +532,24 @@ static long region_add(struct resv_map *resv, long f, long t,
 	/*
 	 * Check for sufficient descriptors in the cache to accommodate
 	 * this add operation. Note that actual_regions_needed may be greater
-	 * than in_regions_needed. In this case, we need to make sure that we
+	 * than in_regions_needed, as the resv_map may have been modified since
+	 * the region_chg call. In this case, we need to make sure that we
 	 * allocate extra entries, such that we have enough for all the
 	 * existing adds_in_progress, plus the excess needed for this
 	 * operation.
 	 */
-	if (resv->region_cache_count <
-	    resv->adds_in_progress +
-	    (actual_regions_needed - in_regions_needed)) {
+	if (actual_regions_needed > in_regions_needed &&
+	    resv->region_cache_count <
+		    resv->adds_in_progress +
+			    (actual_regions_needed - in_regions_needed)) {
 		/* region_add operation of range 1 should never need to
 		 * allocate file_region entries.
 		 */
 		VM_BUG_ON(t - f <= 1);
 
-		/* Must drop lock to allocate a new descriptor. */
-		spin_unlock(&resv->lock);
-		for (i = 0; i < (actual_regions_needed - in_regions_needed);
-		     i++) {
-			trg = kmalloc(sizeof(*trg), GFP_KERNEL);
-			if (!trg)
-				goto out_of_memory;
-			list_add(&trg->link, &allocated_regions);
-		}
-		spin_lock(&resv->lock);
-
-		list_for_each_entry_safe(rg, trg, &allocated_regions, link) {
-			list_del(&rg->link);
-			list_add(&rg->link, &resv->region_cache);
-			resv->region_cache_count++;
+		if (allocate_file_region_entries(
+			    resv, actual_regions_needed - in_regions_needed)) {
+			return -ENOMEM;
 		}
 
 		goto retry;
@@ -516,13 +562,6 @@ static long region_add(struct resv_map *resv, long f, long t,
 	spin_unlock(&resv->lock);
 	VM_BUG_ON(add < 0);
 	return add;
-
-out_of_memory:
-	list_for_each_entry_safe(rg, trg, &allocated_regions, link) {
-		list_del(&rg->link);
-		kfree(rg);
-	}
-	return -ENOMEM;
 }
 
 /*
@@ -548,11 +587,7 @@ static long region_add(struct resv_map *resv, long f, long t,
 static long region_chg(struct resv_map *resv, long f, long t,
 		       long *out_regions_needed)
 {
-	struct file_region *trg = NULL, *rg = NULL;
-	long chg = 0, i = 0, to_allocate = 0;
-	struct list_head allocated_regions;
-
-	INIT_LIST_HEAD(&allocated_regions);
+	long chg = 0;
 
 	spin_lock(&resv->lock);
 
@@ -563,49 +598,13 @@ static long region_chg(struct resv_map *resv, long f, long t,
 	if (*out_regions_needed == 0)
 		*out_regions_needed = 1;
 
-	resv->adds_in_progress += *out_regions_needed;
-
-	/*
-	 * Check for sufficient descriptors in the cache to accommodate
-	 * the number of in progress add operations.
-	 */
-	while (resv->region_cache_count < resv->adds_in_progress) {
-		to_allocate = resv->adds_in_progress - resv->region_cache_count;
-
-		/* Must drop lock to allocate a new descriptor. Note that even
-		 * though we drop the lock here, we do not make another call to
-		 * add_reservation_in_range after re-acquiring the lock.
-		 * Essentially this branch makes sure that we have enough
-		 * descriptors in the cache as suggested by the first call to
-		 * add_reservation_in_range. If more regions turn out to be
-		 * required, region_add will deal with it.
-		 */
-		spin_unlock(&resv->lock);
-		for (i = 0; i < to_allocate; i++) {
-			trg = kmalloc(sizeof(*trg), GFP_KERNEL);
-			if (!trg)
-				goto out_of_memory;
-			list_add(&trg->link, &allocated_regions);
-		}
-
-		spin_lock(&resv->lock);
+	if (allocate_file_region_entries(resv, *out_regions_needed))
+		return -ENOMEM;
 
-		list_for_each_entry_safe(rg, trg, &allocated_regions, link) {
-			list_del(&rg->link);
-			list_add(&rg->link, &resv->region_cache);
-			resv->region_cache_count++;
-		}
-	}
+	resv->adds_in_progress += *out_regions_needed;
 
 	spin_unlock(&resv->lock);
 	return chg;
-
-out_of_memory:
-	list_for_each_entry_safe(rg, trg, &allocated_regions, link) {
-		list_del(&rg->link);
-		kfree(rg);
-	}
-	return -ENOMEM;
 }
 
 /*
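[Editor's note, not part of the patch: for readers outside mm/hugetlb.c, below is a minimal userspace sketch of the two ideas the commit message describes: the "drop the lock, allocate, retake the lock, splice into the cache" helper that allocate_file_region_entries() implements, and the region_chg() ordering fix of bumping the in-progress counter only after allocation succeeds. All names (allocate_entries, reserve, struct cache) are hypothetical, a pthread mutex stands in for resv->lock, and a singly linked list stands in for the kernel's list_head cache.]

```c
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

struct entry {
	struct entry *next;
};

struct cache {
	pthread_mutex_t lock;
	struct entry *free_list;
	int cache_count;      /* entries currently sitting in free_list */
	int adds_in_progress; /* operations that have reserved entries */
};

/* Must be called with c->lock held; drops and retakes it to allocate. */
static int allocate_entries(struct cache *c, int needed)
{
	while (c->cache_count < c->adds_in_progress + needed) {
		int to_allocate = c->adds_in_progress + needed - c->cache_count;
		struct entry *batch = NULL;

		/* Allocate outside the lock; other threads may consume cache
		 * entries meanwhile, hence the surrounding while loop. */
		pthread_mutex_unlock(&c->lock);
		for (int i = 0; i < to_allocate; i++) {
			struct entry *e = malloc(sizeof(*e));
			if (!e) {
				while (batch) {   /* free the partial batch */
					struct entry *n = batch->next;
					free(batch);
					batch = n;
				}
				pthread_mutex_lock(&c->lock);
				return -1;        /* return with lock held */
			}
			e->next = batch;
			batch = e;
		}
		pthread_mutex_lock(&c->lock);

		/* Splice the freshly allocated batch into the cache. */
		while (batch) {
			struct entry *n = batch->next;
			batch->next = c->free_list;
			c->free_list = batch;
			c->cache_count++;
			batch = n;
		}
	}
	return 0;
}

/* Caller-side ordering from fix (2): reserve only after allocation works. */
static int reserve(struct cache *c, int regions_needed)
{
	int ret;

	pthread_mutex_lock(&c->lock);
	ret = allocate_entries(c, regions_needed);
	if (ret == 0)
		c->adds_in_progress += regions_needed; /* only on success */
	pthread_mutex_unlock(&c->lock);
	return ret;
}

int main(void)
{
	struct cache c = { .lock = PTHREAD_MUTEX_INITIALIZER };

	if (reserve(&c, 3) == 0)
		printf("cached=%d in_progress=%d\n",
		       c.cache_count, c.adds_in_progress);
	return 0;
}
```

Because the counter is bumped only after the cache is guaranteed to hold enough entries, a concurrent caller that observes adds_in_progress never concludes it must allocate on another operation's behalf, which is the race the ordering change in region_chg() closes.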