From patchwork Sun Nov 29 00:45:40 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nadav Amit
X-Patchwork-Id: 11938887
From: Nadav Amit
To: linux-fsdevel@vger.kernel.org
Cc: Nadav Amit, Jens Axboe, Andrea Arcangeli, Peter Xu, Alexander Viro,
 io-uring@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [RFC PATCH 05/13] fs/userfaultfd: introduce UFFD_FEATURE_POLL
Date: Sat, 28 Nov 2020 16:45:40 -0800
Message-Id: <20201129004548.1619714-6-namit@vmware.com>
In-Reply-To: <20201129004548.1619714-1-namit@vmware.com>
References: <20201129004548.1619714-1-namit@vmware.com>

From: Nadav Amit

Add a feature, UFFD_FEATURE_POLL, that makes the faulting thread spin
while waiting for the page fault to be handled, instead of descheduling.

Users of this feature should run the page-fault handling thread on a
different physical CPU and, ideally, ensure that cores are available to
run the handler; otherwise they will see performance degradation.

This can later be enhanced with one or two timeouts: one until the page
fault is handled and another until the handler is woken.

Cc: Jens Axboe
Cc: Andrea Arcangeli
Cc: Peter Xu
Cc: Alexander Viro
Cc: io-uring@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Signed-off-by: Nadav Amit
---
 fs/userfaultfd.c                 | 24 ++++++++++++++++++++----
 include/uapi/linux/userfaultfd.h |  9 ++++++++-
 2 files changed, 28 insertions(+), 5 deletions(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index fedf7c1615d5..b6a04e526025 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -122,7 +122,9 @@ static int userfaultfd_wake_function(wait_queue_entry_t *wq, unsigned mode,
 	if (len && (start > uwq->msg.arg.pagefault.address ||
 		    start + len <= uwq->msg.arg.pagefault.address))
 		goto out;
-	WRITE_ONCE(uwq->waken, true);
+
+	smp_store_mb(uwq->waken, true);
+
 	/*
 	 * The Program-Order guarantees provided by the scheduler
 	 * ensure uwq->waken is visible before the task is woken.
@@ -377,6 +379,7 @@ vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason)
 	vm_fault_t ret = VM_FAULT_SIGBUS;
 	bool must_wait;
 	long blocking_state;
+	bool poll;
 
 	/*
 	 * We don't do userfault handling for the final child pid update.
@@ -410,6 +413,8 @@ vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason)
 	if (ctx->features & UFFD_FEATURE_SIGBUS)
 		goto out;
 
+	poll = ctx->features & UFFD_FEATURE_POLL;
+
 	/*
 	 * If it's already released don't get it. This avoids to loop
 	 * in __get_user_pages if userfaultfd_release waits on the
@@ -495,7 +500,10 @@ vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason)
 	 * following the spin_unlock to happen before the list_add in
 	 * __add_wait_queue.
 	 */
-	set_current_state(blocking_state);
+
+	if (!poll)
+		set_current_state(blocking_state);
+
 	spin_unlock_irq(&ctx->fault_pending_wqh.lock);
 
 	if (!is_vm_hugetlb_page(vmf->vma))
@@ -509,10 +517,18 @@ vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason)
 
 	if (likely(must_wait && !READ_ONCE(ctx->released))) {
 		wake_up_poll(&ctx->fd_wqh, EPOLLIN);
-		schedule();
+		if (poll) {
+			while (!READ_ONCE(uwq.waken) && !READ_ONCE(ctx->released) &&
+			       !signal_pending(current)) {
+				cpu_relax();
+				cond_resched();
+			}
+		} else
+			schedule();
 	}
 
-	__set_current_state(TASK_RUNNING);
+	if (!poll)
+		__set_current_state(TASK_RUNNING);
 
 	/*
 	 * Here we race with the list_del; list_add in
diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h
index e7e98bde221f..4eeba4235afe 100644
--- a/include/uapi/linux/userfaultfd.h
+++ b/include/uapi/linux/userfaultfd.h
@@ -27,7 +27,9 @@
 			   UFFD_FEATURE_MISSING_HUGETLBFS |	\
 			   UFFD_FEATURE_MISSING_SHMEM |		\
 			   UFFD_FEATURE_SIGBUS |		\
-			   UFFD_FEATURE_THREAD_ID)
+			   UFFD_FEATURE_THREAD_ID |		\
+			   UFFD_FEATURE_POLL)
+
 #define UFFD_API_IOCTLS				\
 	((__u64)1 << _UFFDIO_REGISTER |		\
 	 (__u64)1 << _UFFDIO_UNREGISTER |	\
@@ -171,6 +173,10 @@ struct uffdio_api {
 	 *
 	 * UFFD_FEATURE_THREAD_ID pid of the page faulted task_struct will
 	 * be returned, if feature is not requested 0 will be returned.
+	 *
+	 * UFFD_FEATURE_POLL polls upon page-fault if the feature is requested
+	 * instead of descheduling. This feature should only be enabled for
+	 * low-latency handlers and when CPUs are not overcommitted.
 	 */
 #define UFFD_FEATURE_PAGEFAULT_FLAG_WP		(1<<0)
 #define UFFD_FEATURE_EVENT_FORK			(1<<1)
@@ -181,6 +187,7 @@ struct uffdio_api {
 #define UFFD_FEATURE_EVENT_UNMAP		(1<<6)
 #define UFFD_FEATURE_SIGBUS			(1<<7)
 #define UFFD_FEATURE_THREAD_ID			(1<<8)
+#define UFFD_FEATURE_POLL			(1<<9)
 	__u64 features;
 
 	__u64 ioctls;
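
For readers who want to experiment with the waiting pattern outside the kernel, the polling branch added to handle_userfault() can be approximated in userspace. The following is only a rough sketch under stated assumptions, not the kernel code: `struct waiter`, `poll_wait()`, `handler_thread()` and `run_poll_demo()` are hypothetical names, C11 sequentially consistent atomics stand in for smp_store_mb()/READ_ONCE(), sched_yield() stands in for cond_resched(), and the signal_pending() check and cpu_relax() hint are omitted because they have no portable userspace counterpart.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for uwq.waken and ctx->released. */
struct waiter {
	atomic_bool waken;
	atomic_bool released;
};

/*
 * Spin until the handler marks the fault resolved or the context is
 * released. Returns true only if the waiter was actually woken.
 */
static bool poll_wait(struct waiter *w)
{
	assert(w != NULL);
	while (!atomic_load(&w->waken) && !atomic_load(&w->released))
		sched_yield();	/* loose analogue of cond_resched() */
	return atomic_load(&w->waken);
}

/* Plays the role of the page-fault handling thread. */
static void *handler_thread(void *arg)
{
	struct waiter *w = arg;

	/* A real handler would resolve the fault before this store. */
	atomic_store(&w->waken, true);	/* seq_cst store, cf. smp_store_mb() */
	return NULL;
}

/* Returns 1 if the spinning thread observed the wake-up, -1 on error. */
int run_poll_demo(void)
{
	struct waiter w;
	pthread_t handler;
	bool woken;

	atomic_init(&w.waken, false);
	atomic_init(&w.released, false);
	if (pthread_create(&handler, NULL, handler_thread, &w) != 0)
		return -1;
	woken = poll_wait(&w);
	pthread_join(handler, NULL);
	return woken ? 1 : 0;
}
```

Calling run_poll_demo() from a main() (built with -pthread) should return 1 once the handler thread has run. As in the patch, the spinning thread burns CPU time until the handler gets to run, which is why the handler is best placed on a separate core.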