From patchwork Fri Aug 3 08:32:26 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Wang, Wei W" X-Patchwork-Id: 10554791 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7D2A215A6 for ; Fri, 3 Aug 2018 09:01:02 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 6CBE42C24F for ; Fri, 3 Aug 2018 09:01:02 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 6AEE52C26F; Fri, 3 Aug 2018 09:01:02 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B0DA12C258 for ; Fri, 3 Aug 2018 09:01:01 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id BF3026B0266; Fri, 3 Aug 2018 05:01:00 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id BC8AA6B026A; Fri, 3 Aug 2018 05:01:00 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A43426B026B; Fri, 3 Aug 2018 05:01:00 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from mail-pf1-f199.google.com (mail-pf1-f199.google.com [209.85.210.199]) by kanga.kvack.org (Postfix) with ESMTP id 624636B0266 for ; Fri, 3 Aug 2018 05:01:00 -0400 (EDT) Received: by mail-pf1-f199.google.com with SMTP id d22-v6so3332946pfn.3 for ; Fri, 03 Aug 2018 02:01:00 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-original-authentication-results:x-gm-message-state:from:to:cc :subject:date:message-id:in-reply-to:references; bh=pinpfAx3EIAyiXtyVzexl8TFXgvsoy7W+zXZhNR7gvw=; b=cTevSkK1pU2k0XoWSACOsF1X1qDiTCw9mN78wsnbmgU+M2tjHl5v/lyt3MIoXMFlVO l2l2f6y7noa0t1vG+g9HTUyixBCOUSHr4vHjwB5YbOV5Z8tqTmPcZjQCZGefN0wM7r2m QDmR7xLqhUbiEWtJRoYD8cQyo0u2mDdLySGE8ddWHVOMvSO2G7d0EnPS25JJvSLlz0Kf DpK6TU6I5aqLtmpCQVlfxz2jQINvSO2d/HkkLj9aVdP5EgVgldYxuidUv0M/JWTz7fda T/21W7xL8TTy55VvhQMAPyJ5mW4V7eASvltKPGHFFvAc0xgKdw9+RcH1EGwnsKsYjOAR ysuw== X-Original-Authentication-Results: mx.google.com; spf=pass (google.com: domain of wei.w.wang@intel.com designates 134.134.136.31 as permitted sender) smtp.mailfrom=wei.w.wang@intel.com; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com X-Gm-Message-State: AOUpUlEgIShYDtgLlD0L0pBn+DLDIu2dwKu28SsT/2yyI4n/4EcaXLcy 0IQ//HU3Bj8QjYv2Vu6SEy2Fm4TVYNqbV9GAjiDUUiyLuUpk8/qNFCEr6ZMVlplYxAyBIpPGVKa jvCCcI4VdIQVlxU8YH4/gF5ve4xccPLLPWlNafHUyFjkPGUU7xA+Ct0RsPkuJ06MPvg== X-Received: by 2002:a17:902:c85:: with SMTP id 5-v6mr2724386plt.141.1533286859979; Fri, 03 Aug 2018 02:00:59 -0700 (PDT) X-Google-Smtp-Source: AAOMgpfFApfwgHLWBwNVbO5r1YVlxNhgHJ7tyB2V8LrFOo4bbOK7ChgOXcQgmUkdYwOx4qL6hJHm X-Received: by 2002:a17:902:c85:: with SMTP id 5-v6mr2724326plt.141.1533286859198; Fri, 03 Aug 2018 02:00:59 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1533286859; cv=none; d=google.com; s=arc-20160816; b=rtZqZl8eLoZQ1ZDUvrMww/dw19FBRkIHM4XO8agk5EDhW5QeT9VxblNbsuNioZP1BX 60nnxLSSjKcmblnbQJGoY5INo8TaJ8ZD2isLCu3/yqxh6tQxShuh/1ZLEGPMHEfwvJtN TMcoQsGTR9FP2e98zf38z5UHUywfK3h7G5D3KDCjpQKcY6RC6FakE3OQ72qvxwAfpb+n LwW40D0g3twPLkmAcABqrR2xYkfK/xoScidaM9UQzTgcMbRSHXTEqvinGXU5kn4B1RmX vhUNyCYyUOGAcXwXiRUy9d832Q3AvVBqAC81LFes7uXQmn5QzyIggE6T+E7lO9x4d/kt Wn+A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=references:in-reply-to:message-id:date:subject:cc:to:from :arc-authentication-results; bh=pinpfAx3EIAyiXtyVzexl8TFXgvsoy7W+zXZhNR7gvw=; b=YAHBEzXOGLlJybGJNjuH4fZe+rUwk1APPsHaRwT5TgeDjEJEL1Nuv9PIrunpV+ghSL ATvQ//dUTvZNnKWFFzUCl0zKcGrNjXUU7JIOZUqXw/WaFFHZZodJ0alq4hgGZFw+lIgd 9+E9zzDbllE6BYJSACwKS5WPXdrCMjH0mtvP/uY1x5d0EzO+iC08cfUUhz8nY8apylSI 7ltXgMQGHp/0m1cHKXkPJup/rZIXDy+A1Cjc8Wyh7M5NXrl8/9yQSCA4X4eAMPGS/1/p QFtrfuYjakXPbIVYDhWgJFqfY2EAASngZqKzbKMUUGItyiMosCK3Mv419E6OsufMwGpb P2lg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of wei.w.wang@intel.com designates 134.134.136.31 as permitted sender) smtp.mailfrom=wei.w.wang@intel.com; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from mga06.intel.com (mga06.intel.com. [134.134.136.31]) by mx.google.com with ESMTPS id b5-v6si4633098pfa.116.2018.08.03.02.00.58 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 03 Aug 2018 02:00:58 -0700 (PDT) Received-SPF: pass (google.com: domain of wei.w.wang@intel.com designates 134.134.136.31 as permitted sender) client-ip=134.134.136.31; Authentication-Results: mx.google.com; spf=pass (google.com: domain of wei.w.wang@intel.com designates 134.134.136.31 as permitted sender) smtp.mailfrom=wei.w.wang@intel.com; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by orsmga104.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 03 Aug 2018 02:00:58 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.51,438,1526367600"; d="scan'208";a="60151764" Received: from devel-ww.sh.intel.com ([10.239.48.110]) by fmsmga008.fm.intel.com with ESMTP; 03 Aug 2018 02:00:43 -0700 From: Wei Wang To: virtio-dev@lists.oasis-open.org, linux-kernel@vger.kernel.org, virtualization@lists.linux-foundation.org, linux-mm@kvack.org, mst@redhat.com, mhocko@kernel.org, akpm@linux-foundation.org, penguin-kernel@I-love.SAKURA.ne.jp Cc: wei.w.wang@intel.com Subject: [PATCH v3 2/2] virtio_balloon: replace oom notifier with shrinker Date: Fri, 3 Aug 2018 16:32:26 +0800 Message-Id: <1533285146-25212-3-git-send-email-wei.w.wang@intel.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1533285146-25212-1-git-send-email-wei.w.wang@intel.com> References: <1533285146-25212-1-git-send-email-wei.w.wang@intel.com> X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: X-Virus-Scanned: ClamAV using ClamSMTP The OOM notifier is getting deprecated to use for the reasons: - As a callout from the oom context, it is too subtle and easy to generate bugs and corner cases which are hard to track; - It is called too late (after the reclaiming has been performed). Drivers with large amuont of reclaimable memory is expected to release them at an early stage of memory pressure; - The notifier callback isn't aware of oom contrains; Link: https://lkml.org/lkml/2018/7/12/314 This patch replaces the virtio-balloon oom notifier with a shrinker to release balloon pages on memory pressure. The balloon pages are given back to mm adaptively by returning the number of pages that the reclaimer is asking for (i.e. sc->nr_to_scan). Currently the max possible value of sc->nr_to_scan passed to the balloon shrinker is SHRINK_BATCH, which is 128. This is smaller than the limitation that only VIRTIO_BALLOON_ARRAY_PFNS_MAX (256) pages can be returned via one invocation of leak_balloon. But this patch still considers the case that SHRINK_BATCH or shrinker->batch could be changed to a value larger than VIRTIO_BALLOON_ARRAY_PFNS_MAX, which will need to do multiple invocations of leak_balloon. Historically, the feature VIRTIO_BALLOON_F_DEFLATE_ON_OOM has been used to release balloon pages on OOM. We continue to use this feature bit for the shrinker, so the shrinker is only registered when this feature bit has been negotiated with host. Signed-off-by: Wei Wang Cc: Michael S. Tsirkin Cc: Michal Hocko Cc: Andrew Morton --- drivers/virtio/virtio_balloon.c | 111 ++++++++++++++++++++++------------------ 1 file changed, 60 insertions(+), 51 deletions(-) diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c index 8100e77..612a359 100644 --- a/drivers/virtio/virtio_balloon.c +++ b/drivers/virtio/virtio_balloon.c @@ -27,7 +27,6 @@ #include #include #include -#include #include #include #include @@ -40,13 +39,8 @@ */ #define VIRTIO_BALLOON_PAGES_PER_PAGE (unsigned)(PAGE_SIZE >> VIRTIO_BALLOON_PFN_SHIFT) #define VIRTIO_BALLOON_ARRAY_PFNS_MAX 256 -#define OOM_VBALLOON_DEFAULT_PAGES 256 #define VIRTBALLOON_OOM_NOTIFY_PRIORITY 80 -static int oom_pages = OOM_VBALLOON_DEFAULT_PAGES; -module_param(oom_pages, int, S_IRUSR | S_IWUSR); -MODULE_PARM_DESC(oom_pages, "pages to free on OOM"); - #ifdef CONFIG_BALLOON_COMPACTION static struct vfsmount *balloon_mnt; #endif @@ -86,8 +80,8 @@ struct virtio_balloon { /* Memory statistics */ struct virtio_balloon_stat stats[VIRTIO_BALLOON_S_NR]; - /* To register callback in oom notifier call chain */ - struct notifier_block nb; + /* To register a shrinker to shrink memory upon memory pressure */ + struct shrinker shrinker; }; static struct virtio_device_id id_table[] = { @@ -365,38 +359,6 @@ static void update_balloon_size(struct virtio_balloon *vb) &actual); } -/* - * virtballoon_oom_notify - release pages when system is under severe - * memory pressure (called from out_of_memory()) - * @self : notifier block struct - * @dummy: not used - * @parm : returned - number of freed pages - * - * The balancing of memory by use of the virtio balloon should not cause - * the termination of processes while there are pages in the balloon. - * If virtio balloon manages to release some memory, it will make the - * system return and retry the allocation that forced the OOM killer - * to run. - */ -static int virtballoon_oom_notify(struct notifier_block *self, - unsigned long dummy, void *parm) -{ - struct virtio_balloon *vb; - unsigned long *freed; - unsigned num_freed_pages; - - vb = container_of(self, struct virtio_balloon, nb); - if (!virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM)) - return NOTIFY_OK; - - freed = parm; - num_freed_pages = leak_balloon(vb, oom_pages); - update_balloon_size(vb); - *freed += num_freed_pages; - - return NOTIFY_OK; -} - static void update_balloon_stats_func(struct work_struct *work) { struct virtio_balloon *vb; @@ -550,6 +512,53 @@ static struct file_system_type balloon_fs = { #endif /* CONFIG_BALLOON_COMPACTION */ +static unsigned long virtio_balloon_shrinker_scan(struct shrinker *shrinker, + struct shrink_control *sc) +{ + unsigned long pages_to_free, pages_freed = 0; + struct virtio_balloon *vb = container_of(shrinker, + struct virtio_balloon, shrinker); + + pages_to_free = sc->nr_to_scan * VIRTIO_BALLOON_PAGES_PER_PAGE; + + /* + * One invocation of leak_balloon can deflate at most + * VIRTIO_BALLOON_ARRAY_PFNS_MAX balloon pages, so we call it + * multiple times to deflate pages till reaching pages_to_free. + */ + while (vb->num_pages && pages_to_free) { + pages_to_free -= pages_freed; + pages_freed += leak_balloon(vb, pages_to_free); + } + update_balloon_size(vb); + + return pages_freed / VIRTIO_BALLOON_PAGES_PER_PAGE; +} + +static unsigned long virtio_balloon_shrinker_count(struct shrinker *shrinker, + struct shrink_control *sc) +{ + struct virtio_balloon *vb = container_of(shrinker, + struct virtio_balloon, shrinker); + + return vb->num_pages / VIRTIO_BALLOON_PAGES_PER_PAGE; +} + +static void virtio_balloon_unregister_shrinker(struct virtio_balloon *vb) +{ + unregister_shrinker(&vb->shrinker); +} + +static int virtio_balloon_register_shrinker(struct virtio_balloon *vb) +{ + vb->shrinker.scan_objects = virtio_balloon_shrinker_scan; + vb->shrinker.count_objects = virtio_balloon_shrinker_count; + vb->shrinker.batch = 0; + vb->shrinker.seeks = DEFAULT_SEEKS; + + return register_shrinker(&vb->shrinker); +} + static int virtballoon_probe(struct virtio_device *vdev) { struct virtio_balloon *vb; @@ -582,17 +591,10 @@ static int virtballoon_probe(struct virtio_device *vdev) if (err) goto out_free_vb; - vb->nb.notifier_call = virtballoon_oom_notify; - vb->nb.priority = VIRTBALLOON_OOM_NOTIFY_PRIORITY; - err = register_oom_notifier(&vb->nb); - if (err < 0) - goto out_del_vqs; - #ifdef CONFIG_BALLOON_COMPACTION balloon_mnt = kern_mount(&balloon_fs); if (IS_ERR(balloon_mnt)) { err = PTR_ERR(balloon_mnt); - unregister_oom_notifier(&vb->nb); goto out_del_vqs; } @@ -601,13 +603,20 @@ static int virtballoon_probe(struct virtio_device *vdev) if (IS_ERR(vb->vb_dev_info.inode)) { err = PTR_ERR(vb->vb_dev_info.inode); kern_unmount(balloon_mnt); - unregister_oom_notifier(&vb->nb); vb->vb_dev_info.inode = NULL; goto out_del_vqs; } vb->vb_dev_info.inode->i_mapping->a_ops = &balloon_aops; #endif - + /* + * We continue to use VIRTIO_BALLOON_F_DEFLATE_ON_OOM to decide if a + * shrinker needs to be registered to relieve memory pressure. + */ + if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM)) { + err = virtio_balloon_register_shrinker(vb); + if (err) + goto out_del_vqs; + } virtio_device_ready(vdev); if (towards_target(vb)) @@ -639,8 +648,8 @@ static void virtballoon_remove(struct virtio_device *vdev) { struct virtio_balloon *vb = vdev->priv; - unregister_oom_notifier(&vb->nb); - + if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM)) + virtio_balloon_unregister_shrinker(vb); spin_lock_irq(&vb->stop_update_lock); vb->stop_update = true; spin_unlock_irq(&vb->stop_update_lock);