From patchwork Thu May 31 10:51:22 2018
X-Patchwork-Submitter: Chunyu Hu
X-Patchwork-Id: 10440713
Date: Thu, 31 May 2018 06:51:22 -0400 (EDT)
From: Chunyu Hu
Reply-To: Chunyu Hu
To: Michal Hocko
Cc: Tetsuo Handa, malat@debian.org, dvyukov@google.com, linux-mm@kvack.org,
 Catalin Marinas, Akinobu Mita
Message-ID: <2074740225.5769475.1527763882580.JavaMail.zimbra@redhat.com>
In-Reply-To: <20180530123826.GF27180@dhcp22.suse.cz>
References: <201805290605.DGF87549.LOVFMFJQSOHtFO@I-love.SAKURA.ne.jp>
 <1126233373.5118805.1527600426174.JavaMail.zimbra@redhat.com>
 <1730157334.5467848.1527672937617.JavaMail.zimbra@redhat.com>
 <20180530104637.GC27180@dhcp22.suse.cz>
 <1684479370.5483281.1527680579781.JavaMail.zimbra@redhat.com>
 <20180530123826.GF27180@dhcp22.suse.cz>
Subject: Re: [PATCH] kmemleak: don't use __GFP_NOFAIL
MIME-Version: 1.0

----- Original Message -----
> From: "Michal Hocko"
> To: "Chunyu Hu"
> Cc: "Tetsuo Handa", malat@debian.org, dvyukov@google.com, linux-mm@kvack.org,
>     "catalin marinas", "Akinobu Mita"
> Sent: Wednesday, May 30, 2018 8:38:26 PM
> Subject: Re: [PATCH] kmemleak: don't use __GFP_NOFAIL
>
> On Wed 30-05-18 07:42:59, Chunyu Hu wrote:
> >
> > ----- Original Message -----
> > > From: "Michal Hocko"
> > > To: "Chunyu Hu"
> > > Cc: "Tetsuo Handa", malat@debian.org, dvyukov@google.com, linux-mm@kvack.org,
> > >     "catalin marinas"
> > > Sent: Wednesday, May 30, 2018 6:46:37 PM
> > > Subject: Re: [PATCH] kmemleak: don't use __GFP_NOFAIL
> > >
> > > On Wed 30-05-18 05:35:37, Chunyu Hu wrote:
> > > [...]
> > > > I'm trying to reuse the make_it_fail field in task for fault injection.
> > > > As adding an extra memory alloc flag is not thought so good, isn't
> > > > adding a task flag just as bad?
> > >
> > > Yeah, task flag will be reduced to KMEMLEAK enabled configurations
> > > without an additional maint. overhead. Anyway, you should really think
> > > about how to guarantee trackability for atomic allocation requests. You
> > > cannot simply assume that GFP_NOWAIT will succeed. I guess you really
> >
> > Sure. While I'm using task->make_it_fail, I'm still in the direction of
> > making kmemleak avoid fault injection with a task flag instead of a page
> > alloc flag.
> >
> > > want to have a pre-populated pool of objects for those requests. The
> > > obvious question is how to balance such a pool. It ain't easy to track
> > > memory by allocating more memory...
> >
> > This solution is going to make kmemleak tracing really nofail. We can
> > think about that later, while I think about whether fault injection can
> > be disabled via a flag in the task.
> >
> > Actually, I'm doing something like below; the disable_fault_inject() is
> > just setting a flag in task->make_it_fail. But this will depend on
> > whether fault injection accepts a change like this. CCing Akinobu.
>
> You still seem to be missing my point I am afraid (or I am ;). So say
> that you want to track a GFP_NOWAIT allocation request. So create_object
> will get called with that gfp mask and no matter what you try here your
> tracking object will be allocated in a weak allocation context as well
> and disable kmemleak. So it only takes a more heavy memory pressure and
> the tracing is gone...

Michal,

Thank you for the good suggestion. You mean that GFP_NOWAIT can still make
create_object() fail and, as a result, kmemleak disables itself, so it's not
very useful, just like the current __GFP_NOFAIL usage in create_object().
In the first thread we discussed this, and at that time you suggested having
fault injection disabled while kmemleak is working, via a per-task flag, so
my head has been stuck on that point. Now you've given a better suggestion:
why not pre-allocate an urgent pool for kmemleak objects?

After thinking about it for a while, I got your point. It's a good way to
improve kmemleak so that it can tolerate light allocation failure. And
Catalin mentioned that we have the option of using the early_log array as
the urgent pool, which follows a similar idea.

Based on your suggestions, I tried to draft this; how does it look to you?
It adds a stronger alloc mask and an extra thread that fills the pool, which
holds up to 1M objects and is refilled every 100 ms. If the first
kmem_cache_alloc() fails, an object is taken from the pool instead:

 	object = kmem_cache_alloc(object_cache, gfp_kmemleak_mask(gfp));
 	if (!object) {
+		object = kmemleak_get_pool_object();
+		pr_info("total=%u", total);
+	}
+	if (!object) {
 		pr_warn("Cannot allocate a kmemleak_object structure\n");
 		kmemleak_disable();
 		return NULL;
@@ -1872,8 +1957,10 @@ static ssize_t kmemleak_write(struct file *file, const char __user *user_buf,
 		kmemleak_stack_scan = 0;
 	else if (strncmp(buf, "scan=on", 7) == 0)
 		start_scan_thread();
-	else if (strncmp(buf, "scan=off", 8) == 0)
+	else if (strncmp(buf, "scan=off", 8) == 0) {
 		stop_scan_thread();
+		stop_pool_thread();
+	}
 	else if (strncmp(buf, "scan=", 5) == 0) {
 		unsigned long secs;
@@ -1929,6 +2016,7 @@ static void __kmemleak_do_cleanup(void)
 static void kmemleak_do_cleanup(struct work_struct *work)
 {
 	stop_scan_thread();
+	stop_pool_thread();
 	mutex_lock(&scan_mutex);

 	/*
@@ -2114,6 +2202,7 @@ static int __init kmemleak_late_init(void)
 		pr_warn("Failed to create the debugfs kmemleak file\n");
 	mutex_lock(&scan_mutex);
 	start_scan_thread();
+	start_pool_thread();
 	mutex_unlock(&scan_mutex);
 	pr_info("Kernel memory leak detector initialized\n");

> --
> Michal Hocko
> SUSE Labs

diff --git a/mm/kmemleak.c
b/mm/kmemleak.c
index 9a085d5..7163489 100644
--- a/mm/kmemleak.c
+++ b/mm/kmemleak.c
@@ -128,6 +128,10 @@
 					   __GFP_NORETRY | __GFP_NOMEMALLOC | \
 					   __GFP_NOWARN | __GFP_NOFAIL)

+#define gfp_kmemleak_mask_strong()	(__GFP_NOMEMALLOC | \
+					 __GFP_NOWARN | __GFP_RECLAIM | __GFP_NOFAIL)
+
 /* scanning area inside a memory block */
 struct kmemleak_scan_area {
 	struct hlist_node node;
@@ -299,6 +303,83 @@ struct early_log {
 	kmemleak_disable();		\
 } while (0)

+static DEFINE_SPINLOCK(kmemleak_object_lock);
+static LIST_HEAD(pool_object_list);
+static unsigned int total;
+static unsigned int pool_object_max = 1024 * 1024;
+static struct task_struct *pool_thread;
+
+static struct kmemleak_object *kmemleak_pool_fill(void)
+{
+	struct kmemleak_object *object;
+	unsigned long flags;
+
+	object = kmem_cache_alloc(object_cache, gfp_kmemleak_mask_strong());
+	if (!object)
+		return NULL;
+	spin_lock_irqsave(&kmemleak_object_lock, flags);
+	list_add(&object->object_list, &pool_object_list);
+	total++;
+	spin_unlock_irqrestore(&kmemleak_object_lock, flags);
+	return object;
+}
+
+static struct kmemleak_object *kmemleak_get_pool_object(void)
+{
+	struct kmemleak_object *object = NULL;
+	unsigned long flags;
+
+	spin_lock_irqsave(&kmemleak_object_lock, flags);
+	if (!list_empty(&pool_object_list)) {
+		object = list_first_entry(&pool_object_list,
+					  struct kmemleak_object, object_list);
+		list_del(&object->object_list);
+		total--;
+	}
+	spin_unlock_irqrestore(&kmemleak_object_lock, flags);
+	return object;
+}
+
+static int kmemleak_pool_thread(void *nothing)
+{
+	struct kmemleak_object *object;
+
+	while (!kthread_should_stop()) {
+		if (READ_ONCE(total) < pool_object_max) {
+			object = kmemleak_pool_fill();
+			WARN_ON(!object);
+		}
+		schedule_timeout_interruptible(msecs_to_jiffies(100));
+	}
+	return 0;
+}
+
+static void start_pool_thread(void)
+{
+	if (pool_thread)
+		return;
+	pool_thread = kthread_run(kmemleak_pool_thread, NULL, "kmemleak_pool");
+	if (IS_ERR(pool_thread)) {
+		pr_warn("Failed to create the pool thread\n");
+		pool_thread = NULL;
+	}
+}
+
+static void stop_pool_thread(void)
+{
+	struct kmemleak_object *object, *tmp;
+	unsigned long flags;
+
+	if (pool_thread) {
+		kthread_stop(pool_thread);
+		pool_thread = NULL;
+	}
+	spin_lock_irqsave(&kmemleak_object_lock, flags);
+	/* entries are deleted while walking, so use the _safe iterator */
+	list_for_each_entry_safe(object, tmp, &pool_object_list, object_list) {
+		list_del(&object->object_list);
+		kmem_cache_free(object_cache, object);
+	}
+	spin_unlock_irqrestore(&kmemleak_object_lock, flags);
+}
+
 /*
  * Printing of the objects hex dump to the seq file. The number of lines to be
  * printed is limited to HEX_MAX_LINES to prevent seq file spamming. The
@@ -553,6 +634,10 @@ static struct kmemleak_object *create_object(unsigned long ptr, size_t size,