From patchwork Tue Sep 5 21:42:29 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 13375105
From: Peter Xu
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Anish Moorthy, Axel Rasmussen, Alexander Viro, Mike Kravetz,
 Peter Zijlstra, Andrew Morton, Mike Rapoport, Christian Brauner,
 peterx@redhat.com, linux-fsdevel@vger.kernel.org, Andrea Arcangeli,
 Ingo Molnar, James Houghton, Nadav Amit
Subject: [PATCH 1/7] mm/userfaultfd: Make uffd read() wait event exclusive
Date: Tue, 5 Sep 2023 17:42:29 -0400
Message-ID: <20230905214235.320571-2-peterx@redhat.com>
X-Mailer: git-send-email 2.41.0
In-Reply-To: <20230905214235.320571-1-peterx@redhat.com>
References: <20230905214235.320571-1-peterx@redhat.com>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org

From: Andrea Arcangeli

When a new message is generated for a userfaultfd, instead of waking up
all the readers, we can wake up only one exclusive reader to process the
event. Waking up more than one reader for a single message wastes
resources: the remaining readers will find nothing to read and re-queue.

This should make userfaultfd read() O(1) on wakeups.

Note that queuing at the head (rather than the tail) is intentional, so
that readers are woken up in LIFO order; fairness doesn't matter much
here, but caching does.
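As a rough userspace illustration of the wakeup behavior described above
(not kernel code; every toy_* name is hypothetical and the array-backed
queue is an assumption made for brevity), the sketch below models a wait
queue that is walked from the head and stops after the first exclusive
waiter, with new waiters inserted at the head so the most recent one is
woken first:

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_WAITERS 16

/* Hypothetical stand-in for a wait queue entry. */
struct toy_waiter {
	int exclusive;	/* non-zero: a wakeup pass stops after this waiter */
	int woken;	/* set once a wakeup pass reached this waiter */
};

/* Hypothetical stand-in for a wait queue head; entries[0] is the head. */
struct toy_wait_queue {
	struct toy_waiter *entries[TOY_MAX_WAITERS];
	size_t count;
};

/* Insert a waiter at the head, so the newest waiter is visited first (LIFO). */
void toy_add_head(struct toy_wait_queue *q, struct toy_waiter *w, int exclusive)
{
	assert(q->count < TOY_MAX_WAITERS);
	for (size_t i = q->count; i > 0; i--)
		q->entries[i] = q->entries[i - 1];
	w->exclusive = exclusive;
	w->woken = 0;
	q->entries[0] = w;
	q->count++;
}

/*
 * Walk the queue from the head and wake waiters in order, stopping after
 * the first exclusive one. Returns how many waiters were woken.
 */
size_t toy_wake_up(struct toy_wait_queue *q)
{
	size_t woken = 0;

	for (size_t i = 0; i < q->count; i++) {
		struct toy_waiter *w = q->entries[i];

		w->woken = 1;
		woken++;
		if (w->exclusive)
			break;
	}
	return woken;
}
```

With three exclusive readers queued this way, one event wakes only the
most recently queued reader; with non-exclusive readers, every reader is
woken for the same event.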
Signed-off-by: Andrea Arcangeli
[peterx: modified subjects / commit message]
Signed-off-by: Peter Xu
---
 fs/userfaultfd.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 56eaae9dac1a..f7fda7d0c994 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -1061,7 +1061,11 @@ static ssize_t userfaultfd_ctx_read(struct userfaultfd_ctx *ctx, int no_wait,
 
 	/* always take the fd_wqh lock before the fault_pending_wqh lock */
 	spin_lock_irq(&ctx->fd_wqh.lock);
-	__add_wait_queue(&ctx->fd_wqh, &wait);
+	/*
+	 * Only wake up one exclusive reader each time there's an event.
+	 * Paired with wake_up_poll() when e.g. a new page fault msg generated.
+	 */
+	__add_wait_queue_exclusive(&ctx->fd_wqh, &wait);
 	for (;;) {
 		set_current_state(TASK_INTERRUPTIBLE);
 		spin_lock(&ctx->fault_pending_wqh.lock);

From patchwork Tue Sep 5 21:42:30 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 13375102
From: Peter Xu
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Anish Moorthy, Axel Rasmussen, Alexander Viro, Mike Kravetz,
 Peter Zijlstra, Andrew Morton, Mike Rapoport, Christian Brauner,
 peterx@redhat.com, linux-fsdevel@vger.kernel.org, Andrea Arcangeli,
 Ingo Molnar, James Houghton, Nadav Amit
Subject: [PATCH 2/7] poll: Add a poll_flags for poll_queue_proc()
Date: Tue, 5 Sep 2023 17:42:30 -0400
Message-ID: <20230905214235.320571-3-peterx@redhat.com>
X-Mailer: git-send-email 2.41.0
In-Reply-To: <20230905214235.320571-1-peterx@redhat.com>
References: <20230905214235.320571-1-peterx@redhat.com>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org

Allow callers to pass a flag through to the poll enqueue function.
Signed-off-by: Peter Xu
---
 drivers/vfio/virqfd.c     | 4 ++--
 drivers/vhost/vhost.c     | 2 +-
 drivers/virt/acrn/irqfd.c | 2 +-
 fs/aio.c                  | 2 +-
 fs/eventpoll.c            | 2 +-
 fs/select.c               | 4 ++--
 include/linux/poll.h      | 7 +++++--
 io_uring/poll.c           | 4 ++--
 mm/memcontrol.c           | 4 +++-
 net/9p/trans_fd.c         | 3 ++-
 virt/kvm/eventfd.c        | 2 +-
 11 files changed, 21 insertions(+), 15 deletions(-)

diff --git a/drivers/vfio/virqfd.c b/drivers/vfio/virqfd.c
index 29c564b7a6e1..4b817a6f4f72 100644
--- a/drivers/vfio/virqfd.c
+++ b/drivers/vfio/virqfd.c
@@ -75,8 +75,8 @@ static int virqfd_wakeup(wait_queue_entry_t *wait, unsigned mode, int sync, void
 	return 0;
 }
 
-static void virqfd_ptable_queue_proc(struct file *file,
-				     wait_queue_head_t *wqh, poll_table *pt)
+static void virqfd_ptable_queue_proc(struct file *file, wait_queue_head_t *wqh,
+				     poll_table *pt, poll_flags flags)
 {
 	struct virqfd *virqfd = container_of(pt, struct virqfd, pt);
 	add_wait_queue(wqh, &virqfd->wait);
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index c71d573f1c94..02caad721843 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -162,7 +162,7 @@ static void vhost_poll_func(struct file *file, wait_queue_head_t *wqh,
 }
 
 static int vhost_poll_wakeup(wait_queue_entry_t *wait, unsigned mode, int sync,
-			     void *key)
+			     void *key, poll_flags flags)
 {
 	struct vhost_poll *poll = container_of(wait, struct vhost_poll, wait);
 	struct vhost_work *work = &poll->work;
diff --git a/drivers/virt/acrn/irqfd.c b/drivers/virt/acrn/irqfd.c
index d4ad211dce7a..9b79e4e76e49 100644
--- a/drivers/virt/acrn/irqfd.c
+++ b/drivers/virt/acrn/irqfd.c
@@ -94,7 +94,7 @@ static int hsm_irqfd_wakeup(wait_queue_entry_t *wait, unsigned int mode,
 }
 
 static void hsm_irqfd_poll_func(struct file *file, wait_queue_head_t *wqh,
-				poll_table *pt)
+				poll_table *pt, poll_flags flags)
 {
 	struct hsm_irqfd *irqfd;
 
diff --git a/fs/aio.c b/fs/aio.c
index a4c2a6bac72c..abb5b22f4fdf 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1823,7 +1823,7 @@ struct aio_poll_table {
 
 static void
 aio_poll_queue_proc(struct file *file, struct wait_queue_head *head,
-		struct poll_table_struct *p)
+		struct poll_table_struct *p, poll_flags flags)
 {
 	struct aio_poll_table *pt = container_of(p, struct aio_poll_table, pt);
 
diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 1d9a71a0c4c1..c74d6a083fd1 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -1270,7 +1270,7 @@ static int ep_poll_callback(wait_queue_entry_t *wait, unsigned mode, int sync, v
  * target file wakeup lists.
  */
 static void ep_ptable_queue_proc(struct file *file, wait_queue_head_t *whead,
-				 poll_table *pt)
+				 poll_table *pt, poll_flags flags)
 {
 	struct ep_pqueue *epq = container_of(pt, struct ep_pqueue, pt);
 	struct epitem *epi = epq->epi;
diff --git a/fs/select.c b/fs/select.c
index 0ee55af1a55c..0433448481e9 100644
--- a/fs/select.c
+++ b/fs/select.c
@@ -117,7 +117,7 @@ struct poll_table_page {
  * poll table.
  */
 static void __pollwait(struct file *filp, wait_queue_head_t *wait_address,
-		       poll_table *p);
+		       poll_table *p, poll_flags flags);
 
 void poll_initwait(struct poll_wqueues *pwq)
 {
@@ -220,7 +220,7 @@ static int pollwake(wait_queue_entry_t *wait, unsigned mode, int sync, void *key
 
 /* Add a new entry */
 static void __pollwait(struct file *filp, wait_queue_head_t *wait_address,
-				poll_table *p)
+				poll_table *p, poll_flags flags)
 {
 	struct poll_wqueues *pwq = container_of(p, struct poll_wqueues, pt);
 	struct poll_table_entry *entry = poll_get_entry(pwq);
diff --git a/include/linux/poll.h b/include/linux/poll.h
index a9e0e1c2d1f2..cbad520fc65c 100644
--- a/include/linux/poll.h
+++ b/include/linux/poll.h
@@ -27,12 +27,15 @@
 
 #define DEFAULT_POLLMASK (EPOLLIN | EPOLLOUT | EPOLLRDNORM | EPOLLWRNORM)
 
+typedef unsigned int poll_flags;
+
 struct poll_table_struct;
 
 /*
  * structures and helpers for f_op->poll implementations
  */
-typedef void (*poll_queue_proc)(struct file *, wait_queue_head_t *, struct poll_table_struct *);
+typedef void (*poll_queue_proc)(struct file *, wait_queue_head_t *,
+				struct poll_table_struct *, poll_flags);
 
 /*
  * Do not touch the structure directly, use the access functions
@@ -46,7 +49,7 @@ typedef struct poll_table_struct {
 static inline void poll_wait(struct file * filp, wait_queue_head_t * wait_address, poll_table *p)
 {
 	if (p && p->_qproc && wait_address)
-		p->_qproc(filp, wait_address, p);
+		p->_qproc(filp, wait_address, p, 0);
 }
 
 /*
diff --git a/io_uring/poll.c b/io_uring/poll.c
index 4c360ba8793a..c3b41e963a8d 100644
--- a/io_uring/poll.c
+++ b/io_uring/poll.c
@@ -533,7 +533,7 @@ static void __io_queue_proc(struct io_poll *poll, struct io_poll_table *pt,
 }
 
 static void io_poll_queue_proc(struct file *file, struct wait_queue_head *head,
-			       struct poll_table_struct *p)
+			       struct poll_table_struct *p, poll_flags flags)
 {
 	struct io_poll_table *pt = container_of(p, struct io_poll_table, pt);
 	struct io_poll *poll = io_kiocb_to_cmd(pt->req, struct io_poll);
@@ -644,7 +644,7 @@ static int __io_arm_poll_handler(struct io_kiocb *req,
 }
 
 static void io_async_queue_proc(struct file *file, struct wait_queue_head *head,
-			       struct poll_table_struct *p)
+			       struct poll_table_struct *p, poll_flags flags)
 {
 	struct io_poll_table *pt = container_of(p, struct io_poll_table, pt);
 	struct async_poll *apoll = pt->req->apoll;
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index ecc07b47e813..97b03ab30d5e 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4877,7 +4877,9 @@ static int memcg_event_wake(wait_queue_entry_t *wait, unsigned mode,
 }
 
 static void memcg_event_ptable_queue_proc(struct file *file,
-		wait_queue_head_t *wqh, poll_table *pt)
+					  wait_queue_head_t *wqh,
+					  poll_table *pt,
+					  poll_flags flags)
 {
 	struct mem_cgroup_event *event =
 		container_of(pt, struct mem_cgroup_event, pt);
diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c
index c4015f30f9fa..91f9f474ab01 100644
--- a/net/9p/trans_fd.c
+++ b/net/9p/trans_fd.c
@@ -550,7 +550,8 @@ static int p9_pollwake(wait_queue_entry_t *wait, unsigned int mode, int sync, vo
  */
 
 static void
-p9_pollwait(struct file *filp, wait_queue_head_t *wait_address, poll_table *p)
+p9_pollwait(struct file *filp, wait_queue_head_t *wait_address, poll_table *p,
+	    poll_flags flags)
 {
 	struct p9_conn *m = container_of(p, struct p9_conn, pt);
 	struct p9_poll_wait *pwait = NULL;
diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
index 89912a17f5d5..645b5d155386 100644
--- a/virt/kvm/eventfd.c
+++ b/virt/kvm/eventfd.c
@@ -246,7 +246,7 @@ irqfd_wakeup(wait_queue_entry_t *wait, unsigned mode, int sync, void *key)
 
 static void
 irqfd_ptable_queue_proc(struct file *file, wait_queue_head_t *wqh,
-			poll_table *pt)
+			poll_table *pt, poll_flags flags)
 {
 	struct kvm_kernel_irqfd *irqfd =
 		container_of(pt, struct kvm_kernel_irqfd, pt);

From patchwork Tue Sep 5 21:42:31 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 13375104
From: Peter Xu
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Anish Moorthy, Axel Rasmussen, Alexander Viro, Mike Kravetz,
 Peter Zijlstra, Andrew Morton, Mike Rapoport, Christian Brauner,
 peterx@redhat.com, linux-fsdevel@vger.kernel.org, Andrea Arcangeli,
 Ingo Molnar, James Houghton, Nadav Amit
Subject: [PATCH 3/7] poll: POLL_ENQUEUE_EXCLUSIVE
Date: Tue, 5 Sep 2023 17:42:31 -0400
Message-ID: <20230905214235.320571-4-peterx@redhat.com>
X-Mailer: git-send-email 2.41.0
In-Reply-To: <20230905214235.320571-1-peterx@redhat.com>
References: <20230905214235.320571-1-peterx@redhat.com>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org

Add a flag for poll_wait() indicating that the caller wants the enqueue
to be exclusive. It is similar to EPOLLEXCLUSIVE for epoll(), but lets
in-kernel poll users opt in to more efficient exclusive queuing where
applicable.
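A minimal userspace sketch of the flag plumbing this describes, under
stated assumptions: the toy_* types and functions below are hypothetical
stand-ins (they are not the kernel's poll_table or wait queue API), and
the enqueue step is reduced to counters so the dispatch on the flag bit
can be observed:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mirror of a flags type with a single "exclusive" bit. */
typedef unsigned int toy_poll_flags;
#define TOY_POLL_ENQUEUE_EXCLUSIVE (1u << 0)

struct toy_poll_table;
typedef void (*toy_queue_proc)(struct toy_poll_table *pt, toy_poll_flags flags);

/* Hypothetical poll table: a callback plus counters standing in for queues. */
struct toy_poll_table {
	toy_queue_proc qproc;
	unsigned int shared_enqueues;
	unsigned int exclusive_enqueues;
};

/* Enqueue callback: picks the exclusive or shared path based on the flag. */
void toy_pollwait(struct toy_poll_table *pt, toy_poll_flags flags)
{
	assert(pt != NULL);
	if (flags & TOY_POLL_ENQUEUE_EXCLUSIVE)
		pt->exclusive_enqueues++;
	else
		pt->shared_enqueues++;
}

/* Common helper: forwards the flags to the callback when one is set. */
void toy_poll_wait_flags(struct toy_poll_table *pt, toy_poll_flags flags)
{
	if (pt && pt->qproc)
		pt->qproc(pt, flags);
}

/* Existing callers keep the default (shared) behavior. */
void toy_poll_wait(struct toy_poll_table *pt)
{
	toy_poll_wait_flags(pt, 0);
}

/* Callers that want one-waiter wakeups opt in explicitly. */
void toy_poll_wait_exclusive(struct toy_poll_table *pt)
{
	toy_poll_wait_flags(pt, TOY_POLL_ENQUEUE_EXCLUSIVE);
}
```

Keeping the default wrapper at flags == 0 means existing callers are
unaffected; only callers that switch to the exclusive wrapper take the
new path.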
Signed-off-by: Peter Xu
---
 fs/select.c          |  5 ++++-
 include/linux/poll.h | 20 ++++++++++++++++++--
 2 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/fs/select.c b/fs/select.c
index 0433448481e9..a3c9088e8d76 100644
--- a/fs/select.c
+++ b/fs/select.c
@@ -231,7 +231,10 @@ static void __pollwait(struct file *filp, wait_queue_head_t *wait_address,
 	entry->key = p->_key;
 	init_waitqueue_func_entry(&entry->wait, pollwake);
 	entry->wait.private = pwq;
-	add_wait_queue(wait_address, &entry->wait);
+	if (flags & POLL_ENQUEUE_EXCLUSIVE)
+		add_wait_queue_exclusive(wait_address, &entry->wait);
+	else
+		add_wait_queue(wait_address, &entry->wait);
 }
 
 static int poll_schedule_timeout(struct poll_wqueues *pwq, int state,
diff --git a/include/linux/poll.h b/include/linux/poll.h
index cbad520fc65c..11af98ae579c 100644
--- a/include/linux/poll.h
+++ b/include/linux/poll.h
@@ -29,6 +29,8 @@
 
 typedef unsigned int poll_flags;
 
+#define POLL_ENQUEUE_EXCLUSIVE BIT(0)
+
 struct poll_table_struct;
 
 /*
@@ -46,10 +48,24 @@ typedef struct poll_table_struct {
 	__poll_t _key;
 } poll_table;
 
-static inline void poll_wait(struct file * filp, wait_queue_head_t * wait_address, poll_table *p)
+static inline void __poll_wait(struct file *filp, wait_queue_head_t *wait_address,
+			       poll_table *p, poll_flags flags)
 {
 	if (p && p->_qproc && wait_address)
-		p->_qproc(filp, wait_address, p, 0);
+		p->_qproc(filp, wait_address, p, flags);
+}
+
+static inline void poll_wait(struct file *filp, wait_queue_head_t *wait_address,
+			     poll_table *p)
+{
+	__poll_wait(filp, wait_address, p, 0);
+}
+
+static inline void poll_wait_exclusive(struct file *filp,
+				       wait_queue_head_t *wait_address,
+				       poll_table *p)
+{
+	__poll_wait(filp, wait_address, p, POLL_ENQUEUE_EXCLUSIVE);
 }
 
 /*

From patchwork Tue Sep 5 21:42:32 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 13375106
From: Peter Xu
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Anish Moorthy, Axel Rasmussen, Alexander Viro, Mike Kravetz,
 Peter Zijlstra, Andrew Morton, Mike Rapoport, Christian Brauner,
 peterx@redhat.com, linux-fsdevel@vger.kernel.org, Andrea Arcangeli,
 Ingo Molnar, James Houghton, Nadav Amit
Subject: [PATCH 4/7] fs/userfaultfd: Use exclusive waitqueue for poll()
Date: Tue, 5 Sep 2023 17:42:32 -0400
Message-ID: <20230905214235.320571-5-peterx@redhat.com>
X-Mailer: git-send-email 2.41.0
In-Reply-To: <20230905214235.320571-1-peterx@redhat.com>
References: <20230905214235.320571-1-peterx@redhat.com>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org

Userfaultfd is the kind of fd that does not need wake-all semantics on
wakeup. Enqueue using the new POLL_ENQUEUE_EXCLUSIVE flag.
Signed-off-by: Peter Xu
---
 fs/userfaultfd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index f7fda7d0c994..9c39adc398fc 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -994,7 +994,7 @@ static __poll_t userfaultfd_poll(struct file *file, poll_table *wait)
 	struct userfaultfd_ctx *ctx = file->private_data;
 	__poll_t ret;
 
-	poll_wait(file, &ctx->fd_wqh, wait);
+	poll_wait_exclusive(file, &ctx->fd_wqh, wait);
 
 	if (!userfaultfd_is_initialized(ctx))
 		return EPOLLERR;

From patchwork Tue Sep 5 21:42:33 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 13375107
From: Peter Xu
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Anish Moorthy, Axel Rasmussen, Alexander Viro, Mike Kravetz,
 Peter Zijlstra, Andrew Morton, Mike Rapoport, Christian Brauner,
 peterx@redhat.com, linux-fsdevel@vger.kernel.org, Andrea Arcangeli,
 Ingo Molnar, James Houghton, Nadav Amit
Subject: [PATCH 5/7] selftests/mm: Replace uffd_read_mutex with a semaphore
Date: Tue, 5 Sep 2023 17:42:33 -0400
Message-ID: <20230905214235.320571-6-peterx@redhat.com>
X-Mailer: git-send-email 2.41.0
In-Reply-To: <20230905214235.320571-1-peterx@redhat.com>
References: <20230905214235.320571-1-peterx@redhat.com>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org

Each uffd read thread unlocks the read mutex first, probably just to make
sure the thread has reached a stage where pthread_cancel() can always
work before the main thread moves on.

However, keeping the mutex locked at all times and unlocking it in the
thread is a bit hacky. Replace it with a semaphore, which should be much
clearer: the main thread will wait() and the thread will just post().

Move it to uffd-common.* to be reused later.
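The handshake described above can be sketched in standalone form as
follows. This is an illustration under assumptions, not the selftest
code: struct handshake, reader_thread() and run_handshake() are
hypothetical names, and the reader's work loop is replaced by sleep(),
which is a cancellation point:

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical shared state between the main thread and the reader. */
struct handshake {
	sem_t ready;	/* posted by the reader once it is safe to cancel */
	int started;	/* written by the reader before posting */
};

static void *reader_thread(void *arg)
{
	struct handshake *hs = arg;

	assert(hs != NULL);
	hs->started = 1;
	sem_post(&hs->ready);	/* from here cancellation is ok */

	for (;;)
		sleep(1);	/* stand-in for the blocking read loop */
	return NULL;
}

/* Returns 0 if the reader signalled readiness and was then cancelled. */
int run_handshake(void)
{
	struct handshake hs = { .started = 0 };
	pthread_t tid;
	void *ret = NULL;

	if (sem_init(&hs.ready, 0, 0))
		return -1;
	if (pthread_create(&tid, NULL, reader_thread, &hs)) {
		sem_destroy(&hs.ready);
		return -1;
	}

	/* Main thread waits until the reader has posted. */
	if (sem_wait(&hs.ready)) {
		pthread_cancel(tid);
		pthread_join(tid, NULL);
		sem_destroy(&hs.ready);
		return -1;
	}

	pthread_cancel(tid);
	pthread_join(tid, &ret);
	sem_destroy(&hs.ready);

	return (hs.started == 1 && ret == PTHREAD_CANCELED) ? 0 : -1;
}
```

Compared with a mutex that is locked by one thread and unlocked by
another, the semaphore makes the direction of the signal explicit: the
reader only posts, and the main thread only waits.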
Signed-off-by: Peter Xu --- tools/testing/selftests/mm/uffd-common.c | 1 + tools/testing/selftests/mm/uffd-common.h | 2 ++ tools/testing/selftests/mm/uffd-stress.c | 8 +++----- 3 files changed, 6 insertions(+), 5 deletions(-) diff --git a/tools/testing/selftests/mm/uffd-common.c b/tools/testing/selftests/mm/uffd-common.c index 02b89860e193..aded06cab285 100644 --- a/tools/testing/selftests/mm/uffd-common.c +++ b/tools/testing/selftests/mm/uffd-common.c @@ -17,6 +17,7 @@ bool map_shared; bool test_uffdio_wp = true; unsigned long long *count_verify; uffd_test_ops_t *uffd_test_ops; +sem_t uffd_read_sem; static int uffd_mem_fd_create(off_t mem_size, bool hugetlb) { diff --git a/tools/testing/selftests/mm/uffd-common.h b/tools/testing/selftests/mm/uffd-common.h index 7c4fa964c3b0..521523baded1 100644 --- a/tools/testing/selftests/mm/uffd-common.h +++ b/tools/testing/selftests/mm/uffd-common.h @@ -32,6 +32,7 @@ #include #include #include +#include #include "../kselftest.h" #include "vm_util.h" @@ -97,6 +98,7 @@ extern bool map_shared; extern bool test_uffdio_wp; extern unsigned long long *count_verify; extern volatile bool test_uffdio_copy_eexist; +extern sem_t uffd_read_sem; extern uffd_test_ops_t anon_uffd_test_ops; extern uffd_test_ops_t shmem_uffd_test_ops; diff --git a/tools/testing/selftests/mm/uffd-stress.c b/tools/testing/selftests/mm/uffd-stress.c index 469e0476af26..7219f55ae794 100644 --- a/tools/testing/selftests/mm/uffd-stress.c +++ b/tools/testing/selftests/mm/uffd-stress.c @@ -125,14 +125,12 @@ static int copy_page_retry(int ufd, unsigned long offset) return __copy_page(ufd, offset, true, test_uffdio_wp); } -pthread_mutex_t uffd_read_mutex = PTHREAD_MUTEX_INITIALIZER; - static void *uffd_read_thread(void *arg) { struct uffd_args *args = (struct uffd_args *)arg; struct uffd_msg msg; - pthread_mutex_unlock(&uffd_read_mutex); + sem_post(&uffd_read_sem); /* from here cancellation is ok */ for (;;) { @@ -196,7 +194,7 @@ static int stress(struct uffd_args *args) 
uffd_read_thread, (void *)&args[cpu])) return 1; - pthread_mutex_lock(&uffd_read_mutex); + sem_wait(&uffd_read_sem); } if (pthread_create(&background_threads[cpu], &attr, background_thread, (void *)cpu)) @@ -258,7 +256,7 @@ static int userfaultfd_stress(void) zeropage = area; bzero(zeropage, page_size); - pthread_mutex_lock(&uffd_read_mutex); + sem_init(&uffd_read_sem, 0, 0); pthread_attr_init(&attr); pthread_attr_setstacksize(&attr, 16*1024*1024); From patchwork Tue Sep 5 21:42:34 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 13375108 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4D75ECA1010 for ; Tue, 5 Sep 2023 21:45:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S244576AbjIEVpC (ORCPT ); Tue, 5 Sep 2023 17:45:02 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46776 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S244638AbjIEVol (ORCPT ); Tue, 5 Sep 2023 17:44:41 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 89093E5C for ; Tue, 5 Sep 2023 14:42:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1693950167; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=cw3DqXN0nAqgICZcLU+deZuKVK3JziD/s0/YHmFn328=; b=F8vv3UVlUcnGG3TwQmHglH33tfViEKeFb971EE/527Rj4rX4jbzU+spgRdlHDZgxVngJ0v Q/Np+UbNap1Fc13YgLFYYZ5kKTLarXqWuuLsCCiV4fkU+4mCF5SS1aoeytA+FkoNxdU9E0 
sbqNxGsQ6KGlxhGuxYMV+TKDoa5CD9Q= Received: from mail-qk1-f200.google.com (mail-qk1-f200.google.com [209.85.222.200]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-353-x7Yhf8inNwiBmUTOilUZUg-1; Tue, 05 Sep 2023 17:42:46 -0400 X-MC-Unique: x7Yhf8inNwiBmUTOilUZUg-1 Received: by mail-qk1-f200.google.com with SMTP id af79cd13be357-76f191e26f5so91401085a.0 for ; Tue, 05 Sep 2023 14:42:46 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1693950166; x=1694554966; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=cw3DqXN0nAqgICZcLU+deZuKVK3JziD/s0/YHmFn328=; b=eHyCL/qwHvzYKhIx7sX+gb10bTHSZBHGX07pJcv2vD+taCEvK/VGpGCOutJfrTUtsN n24e1/OZOXJSn5rpSd2zgW7KzjPyN9oxrpUT03iJ//pZIBbINYhQVoxF5l4TuHnnQHT0 xz0Hkef+jWStJwHB5twI+duoUneIaDrXAjzZE+sCtcnZtyNahv+LQwjjCjlUG45bF3yb tv3woVC5TYZe+LpBaNJp/++YKGKuiyuj1vKYB8uZb7NaC09oMUbNNJXqGCCRMJk+XBVh l+mDlUiiA0fvcftXbx0wnt6LUNRBwTX75piYHAzxuhQb8fc4XBFvkthHpKolfHIH2lHc GGnA== X-Gm-Message-State: AOJu0YxQZs57dZDRQxu3dPApz/muUeKiPD+xa06TJt+199zDZpE/+hP5 +rDf1uxBfUbzHQiaT3jF+XG3qDuzyubDFMDNy6qd02r+gYbA4xzVsxWUtMbveEGfPKei4i4PRDU 7US/ShC79ZGvGu6jGImETG74t5A== X-Received: by 2002:a05:620a:4712:b0:76f:1b38:e74a with SMTP id bs18-20020a05620a471200b0076f1b38e74amr15367604qkb.4.1693950165774; Tue, 05 Sep 2023 14:42:45 -0700 (PDT) X-Google-Smtp-Source: AGHT+IEjHmAQEBZtigx4G4B3sp1ASlOU6CNSzPZeJM4WgpbfHkl1dq3l/QojNcctzg5rIDRzuMLaUA== X-Received: by 2002:a05:620a:4712:b0:76f:1b38:e74a with SMTP id bs18-20020a05620a471200b0076f1b38e74amr15367597qkb.4.1693950165524; Tue, 05 Sep 2023 14:42:45 -0700 (PDT) Received: from x1n.redhat.com (cpe5c7695f3aee0-cm5c7695f3aede.cpe.net.cable.rogers.com. 
[99.254.144.39]) by smtp.gmail.com with ESMTPSA id i2-20020a37c202000000b007682af2c8aasm4396938qkm.126.2023.09.05.14.42.44 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 05 Sep 2023 14:42:45 -0700 (PDT) From: Peter Xu To: linux-kernel@vger.kernel.org, linux-mm@kvack.org Cc: Anish Moorthy , Axel Rasmussen , Alexander Viro , Mike Kravetz , Peter Zijlstra , Andrew Morton , Mike Rapoport , Christian Brauner , peterx@redhat.com, linux-fsdevel@vger.kernel.org, Andrea Arcangeli , Ingo Molnar , James Houghton , Nadav Amit Subject: [PATCH 6/7] selftests/mm: Create uffd_fault_thread_create|join() Date: Tue, 5 Sep 2023 17:42:34 -0400 Message-ID: <20230905214235.320571-7-peterx@redhat.com> X-Mailer: git-send-email 2.41.0 In-Reply-To: <20230905214235.320571-1-peterx@redhat.com> References: <20230905214235.320571-1-peterx@redhat.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Make them common functions to be reused. Signed-off-by: Peter Xu --- tools/testing/selftests/mm/uffd-common.c | 46 ++++++++++++++++++++++ tools/testing/selftests/mm/uffd-common.h | 4 ++ tools/testing/selftests/mm/uffd-stress.c | 49 ++++-------------------- 3 files changed, 57 insertions(+), 42 deletions(-) diff --git a/tools/testing/selftests/mm/uffd-common.c b/tools/testing/selftests/mm/uffd-common.c index aded06cab285..851284395b29 100644 --- a/tools/testing/selftests/mm/uffd-common.c +++ b/tools/testing/selftests/mm/uffd-common.c @@ -555,6 +555,52 @@ void *uffd_poll_thread(void *arg) return NULL; } +void *uffd_read_thread(void *arg) +{ + struct uffd_args *args = (struct uffd_args *)arg; + struct uffd_msg msg; + + sem_post(&uffd_read_sem); + /* from here cancellation is ok */ + + for (;;) { + if (uffd_read_msg(uffd, &msg)) + continue; + uffd_handle_page_fault(&msg, args); + } + + return NULL; +} + +void uffd_fault_thread_create(pthread_t *thread, pthread_attr_t *attr, + struct uffd_args *args, bool poll) +{ + if (poll) { + if 
(pthread_create(thread, attr, uffd_poll_thread, args)) + err("uffd_poll_thread create"); + } else { + if (pthread_create(thread, attr, uffd_read_thread, args)) + err("uffd_read_thread create"); + sem_wait(&uffd_read_sem); + } +} + +void uffd_fault_thread_join(pthread_t thread, int cpu, bool poll) +{ + char c = 1; + + if (poll) { + if (write(pipefd[cpu*2+1], &c, 1) != 1) + err("pipefd write error"); + } else { + if (pthread_cancel(thread)) + err("pthread_cancel()"); + } + + if (pthread_join(thread, NULL)) + err("pthread_join()"); +} + static void retry_copy_page(int ufd, struct uffdio_copy *uffdio_copy, unsigned long offset) { diff --git a/tools/testing/selftests/mm/uffd-common.h b/tools/testing/selftests/mm/uffd-common.h index 521523baded1..9d66ad5c52cb 100644 --- a/tools/testing/selftests/mm/uffd-common.h +++ b/tools/testing/selftests/mm/uffd-common.h @@ -114,6 +114,10 @@ void uffd_handle_page_fault(struct uffd_msg *msg, struct uffd_args *args); int __copy_page(int ufd, unsigned long offset, bool retry, bool wp); int copy_page(int ufd, unsigned long offset, bool wp); void *uffd_poll_thread(void *arg); +void *uffd_read_thread(void *arg); +void uffd_fault_thread_create(pthread_t *thread, pthread_attr_t *attr, + struct uffd_args *args, bool poll); +void uffd_fault_thread_join(pthread_t thread, int cpu, bool poll); int uffd_open_dev(unsigned int flags); int uffd_open_sys(unsigned int flags); diff --git a/tools/testing/selftests/mm/uffd-stress.c b/tools/testing/selftests/mm/uffd-stress.c index 7219f55ae794..915795e33432 100644 --- a/tools/testing/selftests/mm/uffd-stress.c +++ b/tools/testing/selftests/mm/uffd-stress.c @@ -125,23 +125,6 @@ static int copy_page_retry(int ufd, unsigned long offset) return __copy_page(ufd, offset, true, test_uffdio_wp); } -static void *uffd_read_thread(void *arg) -{ - struct uffd_args *args = (struct uffd_args *)arg; - struct uffd_msg msg; - - sem_post(&uffd_read_sem); - /* from here cancellation is ok */ - - for (;;) { - if 
(uffd_read_msg(uffd, &msg)) - continue; - uffd_handle_page_fault(&msg, args); - } - - return NULL; -} - static void *background_thread(void *arg) { unsigned long cpu = (unsigned long) arg; @@ -186,16 +169,10 @@ static int stress(struct uffd_args *args) if (pthread_create(&locking_threads[cpu], &attr, locking_thread, (void *)cpu)) return 1; - if (bounces & BOUNCE_POLL) { - if (pthread_create(&uffd_threads[cpu], &attr, uffd_poll_thread, &args[cpu])) - err("uffd_poll_thread create"); - } else { - if (pthread_create(&uffd_threads[cpu], &attr, - uffd_read_thread, - (void *)&args[cpu])) - return 1; - sem_wait(&uffd_read_sem); - } + + uffd_fault_thread_create(&uffd_threads[cpu], &attr, + &args[cpu], bounces & BOUNCE_POLL); + if (pthread_create(&background_threads[cpu], &attr, background_thread, (void *)cpu)) return 1; @@ -220,21 +197,9 @@ static int stress(struct uffd_args *args) if (pthread_join(locking_threads[cpu], NULL)) return 1; - for (cpu = 0; cpu < nr_cpus; cpu++) { - char c; - if (bounces & BOUNCE_POLL) { - if (write(pipefd[cpu*2+1], &c, 1) != 1) - err("pipefd write error"); - if (pthread_join(uffd_threads[cpu], - (void *)&args[cpu])) - return 1; - } else { - if (pthread_cancel(uffd_threads[cpu])) - return 1; - if (pthread_join(uffd_threads[cpu], NULL)) - return 1; - } - } + for (cpu = 0; cpu < nr_cpus; cpu++) + uffd_fault_thread_join(uffd_threads[cpu], cpu, + bounces & BOUNCE_POLL); return 0; } From patchwork Tue Sep 5 21:42:35 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 13375109 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1C95FCA1016 for ; Tue, 5 Sep 2023 21:45:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S244584AbjIEVpD (ORCPT ); Tue, 
5 Sep 2023 17:45:03 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46786 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236782AbjIEVoo (ORCPT ); Tue, 5 Sep 2023 17:44:44 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C6F3A199 for ; Tue, 5 Sep 2023 14:42:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1693950168; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=eXvYIDhk9Xunttv7gLZeZ6m5tePV5F/iEIjQTI0WMa4=; b=PwGwsk6P1xWHNl0u6nlKM/1aRicywEQ3mMHqic9Vs0nM5tP9pGIl7wNTQxE2Nou9oR++sF CaGKG6T49+S4/Mti4MkG0L7SnTKwUGSZWjINbt6gXeF7t+R5NYwdFTdQv/NcEa+ByGSlbi qz5tPNT/U4EXgBhLBf3FZOOmYPho9so= Received: from mail-qk1-f200.google.com (mail-qk1-f200.google.com [209.85.222.200]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-226-g1ZEE6SxOQOcRNlXN23eRA-1; Tue, 05 Sep 2023 17:42:47 -0400 X-MC-Unique: g1ZEE6SxOQOcRNlXN23eRA-1 Received: by mail-qk1-f200.google.com with SMTP id af79cd13be357-7708c1ae500so2640185a.0 for ; Tue, 05 Sep 2023 14:42:47 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1693950167; x=1694554967; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=eXvYIDhk9Xunttv7gLZeZ6m5tePV5F/iEIjQTI0WMa4=; b=B8/93SiVOipEivxT1g7WR53hLqeHQr5LJwvOR9NGCiQeG7LWdFfPJaFgGBNuj8hr6l LoKifNazX5w2oBm/6dlsyt0Phh+gcO000Qcz5NjmrZdvDPOg+uJ8ZixuG73LAMQeokyM /EGFaNcYQ9f1IhK9NIqwKDJXxzrAFs/eNFx7G3blV2ru0virDheWiRhG0tnk1S0FZ7Wh 
xXZlBT7DFaWmlLSuveD/R20Kkpk254/Ocwn12gV3hu19FzzAAT2M5LYpEF7JkA2V48ss UwodeL84ps5gs2S6Jtv/Eki1AnaV/Pih9nlBRCulRB8vV0CQDtNpOXgxIOIsj9l8/JRB 8xZw== X-Gm-Message-State: AOJu0YxG4JBnNaZmeNGgZXEZSdIJQw7I3Mu6hPLNjYaa8fPOX1uap+kU vhlFNTinhmT1f3QQ+kH1W6FCYDTmEAKzePMNbqk+TQ/ubuKyDzcGO+lw5TT9zXiE3NKPFQ89UEn stGwVRCOkbYh88NvpuZ10rIubpQ== X-Received: by 2002:a05:620a:1a26:b0:76c:ed4e:ac10 with SMTP id bk38-20020a05620a1a2600b0076ced4eac10mr16770229qkb.6.1693950167013; Tue, 05 Sep 2023 14:42:47 -0700 (PDT) X-Google-Smtp-Source: AGHT+IEGbQRi3lj2GCHZMrI9G7N8PmFSehzLMjlA7R138X4fKreXMS+kXyyfOvFWMrTvBreLVK2oMg== X-Received: by 2002:a05:620a:1a26:b0:76c:ed4e:ac10 with SMTP id bk38-20020a05620a1a2600b0076ced4eac10mr16770206qkb.6.1693950166761; Tue, 05 Sep 2023 14:42:46 -0700 (PDT) Received: from x1n.redhat.com (cpe5c7695f3aee0-cm5c7695f3aede.cpe.net.cable.rogers.com. [99.254.144.39]) by smtp.gmail.com with ESMTPSA id i2-20020a37c202000000b007682af2c8aasm4396938qkm.126.2023.09.05.14.42.45 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 05 Sep 2023 14:42:46 -0700 (PDT) From: Peter Xu To: linux-kernel@vger.kernel.org, linux-mm@kvack.org Cc: Anish Moorthy , Axel Rasmussen , Alexander Viro , Mike Kravetz , Peter Zijlstra , Andrew Morton , Mike Rapoport , Christian Brauner , peterx@redhat.com, linux-fsdevel@vger.kernel.org, Andrea Arcangeli , Ingo Molnar , James Houghton , Nadav Amit Subject: [PATCH 7/7] selftests/mm: uffd perf test Date: Tue, 5 Sep 2023 17:42:35 -0400 Message-ID: <20230905214235.320571-8-peterx@redhat.com> X-Mailer: git-send-email 2.41.0 In-Reply-To: <20230905214235.320571-1-peterx@redhat.com> References: <20230905214235.320571-1-peterx@redhat.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Add a simple perf test for userfaultfd missing mode, on private anon only. It mostly only tests the messaging, so memory type / fault type may not matter that much yet. 
Signed-off-by: Peter Xu --- tools/testing/selftests/mm/Makefile | 2 + tools/testing/selftests/mm/uffd-common.c | 18 ++ tools/testing/selftests/mm/uffd-common.h | 1 + tools/testing/selftests/mm/uffd-perf.c | 207 +++++++++++++++++++++++ 4 files changed, 228 insertions(+) create mode 100644 tools/testing/selftests/mm/uffd-perf.c diff --git a/tools/testing/selftests/mm/Makefile b/tools/testing/selftests/mm/Makefile index 6a9fc5693145..acb22517d37e 100644 --- a/tools/testing/selftests/mm/Makefile +++ b/tools/testing/selftests/mm/Makefile @@ -64,6 +64,7 @@ TEST_GEN_FILES += thuge-gen TEST_GEN_FILES += transhuge-stress TEST_GEN_FILES += uffd-stress TEST_GEN_FILES += uffd-unit-tests +TEST_GEN_FILES += uffd-perf TEST_GEN_FILES += split_huge_page_test TEST_GEN_FILES += ksm_tests TEST_GEN_FILES += ksm_functional_tests @@ -120,6 +121,7 @@ $(TEST_GEN_FILES): vm_util.c $(OUTPUT)/uffd-stress: uffd-common.c $(OUTPUT)/uffd-unit-tests: uffd-common.c +$(OUTPUT)/uffd-perf: uffd-common.c ifeq ($(ARCH),x86_64) BINARIES_32 := $(patsubst %,$(OUTPUT)/%,$(BINARIES_32)) diff --git a/tools/testing/selftests/mm/uffd-common.c b/tools/testing/selftests/mm/uffd-common.c index 851284395b29..afbf2f7add56 100644 --- a/tools/testing/selftests/mm/uffd-common.c +++ b/tools/testing/selftests/mm/uffd-common.c @@ -725,3 +725,21 @@ int uffd_get_features(uint64_t *features) return 0; } + +uint64_t get_usec(void) +{ + uint64_t val = 0; + struct timespec t; + int ret = clock_gettime(CLOCK_MONOTONIC, &t); + + if (ret == -1) { + perror("clock_gettime() failed"); + /* should never happen */ + exit(-1); + } + + val = t.tv_nsec / 1000; /* ns -> us */ + val += t.tv_sec * 1000000; /* s -> us */ + + return val; +} diff --git a/tools/testing/selftests/mm/uffd-common.h b/tools/testing/selftests/mm/uffd-common.h index 9d66ad5c52cb..4273201ae19f 100644 --- a/tools/testing/selftests/mm/uffd-common.h +++ b/tools/testing/selftests/mm/uffd-common.h @@ -123,6 +123,7 @@ int uffd_open_dev(unsigned int flags); int 
uffd_open_sys(unsigned int flags); int uffd_open(unsigned int flags); int uffd_get_features(uint64_t *features); +uint64_t get_usec(void); #define TEST_ANON 1 #define TEST_HUGETLB 2 diff --git a/tools/testing/selftests/mm/uffd-perf.c b/tools/testing/selftests/mm/uffd-perf.c new file mode 100644 index 000000000000..eda99718311a --- /dev/null +++ b/tools/testing/selftests/mm/uffd-perf.c @@ -0,0 +1,207 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Userfaultfd performance tests. + * + * Copyright (C) 2023 Red Hat, Inc. + */ + +#include "uffd-common.h" + +#ifdef __NR_userfaultfd + +#define DEF_MEM_SIZE_MB (512) +#define MB(x) ((x) * 1024 * 1024) +#define DEF_N_TESTS 5 + +static volatile bool perf_test_started; +static unsigned int n_uffd_threads, n_worker_threads; +static uint64_t nr_pages_per_worker; +static unsigned long n_tests = DEF_N_TESTS; + +static void setup_env(unsigned long mem_size_mb) +{ + /* Test private anon only for now */ + map_shared = false; + uffd_test_ops = &anon_uffd_test_ops; + page_size = psize(); + nr_cpus = n_uffd_threads; + nr_pages = MB(mem_size_mb) / page_size; + nr_pages_per_worker = nr_pages / n_worker_threads; + if (nr_pages_per_worker == 0) + err("each worker should at least own one page"); +} + +void *worker_fn(void *opaque) +{ + unsigned long i = (unsigned long) opaque; + unsigned long page_nr, start_nr, end_nr; + int v = 0; + + start_nr = i * nr_pages_per_worker; + end_nr = (i + 1) * nr_pages_per_worker; + + while (!perf_test_started); + + for (page_nr = start_nr; page_nr < end_nr; page_nr++) + v += *(volatile int *)(area_dst + page_nr * page_size); + + return NULL; +} + +static uint64_t run_perf(uint64_t mem_size_mb, bool poll) +{ + pthread_t worker_threads[n_worker_threads]; + pthread_t uffd_threads[n_uffd_threads]; + const char *errmsg = NULL; + struct uffd_args *args; + uint64_t start, end; + int i, ret; + + if (uffd_test_ctx_init(0, &errmsg)) + err("%s", errmsg); + + /* + * By default, uffd is opened with NONBLOCK mode; use 
block mode + * when test read() + */ + if (!poll) { + int flags = fcntl(uffd, F_GETFL); + + if (flags < 0) + err("fcntl(F_GETFL) failed"); + + if (flags & O_NONBLOCK) + flags &= ~O_NONBLOCK; + + if (fcntl(uffd, F_SETFL, flags)) + err("fcntl(F_SETFL) failed"); + } + + ret = uffd_register(uffd, area_dst, MB(mem_size_mb), + true, false, false); + if (ret) + err("uffd_register() failed"); + + args = calloc(nr_cpus, sizeof(struct uffd_args)); + if (!args) + err("calloc()"); + + for (i = 0; i < n_uffd_threads; i++) { + args[i].cpu = i; + uffd_fault_thread_create(&uffd_threads[i], NULL, + &args[i], poll); + } + + for (i = 0; i < n_worker_threads; i++) { + if (pthread_create(&worker_threads[i], NULL, + worker_fn, (void *)(uintptr_t)i)) + err("create uffd threads"); + } + + start = get_usec(); + perf_test_started = true; + for (i = 0; i < n_worker_threads; i++) + pthread_join(worker_threads[i], NULL); + end = get_usec(); + + for (i = 0; i < n_uffd_threads; i++) { + struct uffd_args *p = &args[i]; + + uffd_fault_thread_join(uffd_threads[i], i, poll); + + assert(p->wp_faults == 0 && p->minor_faults == 0); + } + + free(args); + + ret = uffd_unregister(uffd, area_dst, MB(mem_size_mb)); + if (ret) + err("uffd_unregister() failed"); + + return end - start; +} + +static void usage(const char *prog) +{ + printf("usage: %s \n", prog); + puts(""); + printf(" -m: size of memory to test (in MB, default: %u)\n", + DEF_MEM_SIZE_MB); + puts(" -p: use poll() (the default)"); + puts(" -r: use read()"); + printf(" -t: test rounds (default: %u)\n", DEF_N_TESTS); + puts(" -u: number of uffd threads (default: n_cpus)"); + puts(" -w: number of worker threads (default: n_cpus)"); + puts(""); + exit(KSFT_FAIL); +} + +int main(int argc, char *argv[]) +{ + unsigned long mem_size_mb = DEF_MEM_SIZE_MB; + uint64_t result, sum = 0; + bool use_poll = true; + int opt, count; + + n_uffd_threads = n_worker_threads = sysconf(_SC_NPROCESSORS_ONLN); + + while ((opt = getopt(argc, argv, "hm:prt:u:w:")) != -1) { 
+ switch (opt) { + case 'm': + mem_size_mb = strtoul(optarg, NULL, 10); + break; + case 'p': + use_poll = true; + break; + case 'r': + use_poll = false; + break; + case 't': + n_tests = strtoul(optarg, NULL, 10); + break; + case 'u': + n_uffd_threads = strtoul(optarg, NULL, 10); + break; + case 'w': + n_worker_threads = strtoul(optarg, NULL, 10); + break; + case 'h': + default: + /* Unknown */ + usage(argv[0]); + break; + } + } + + setup_env(mem_size_mb); + + printf("Message mode: \t\t%s\n", use_poll ? "poll" : "read"); + printf("Mem size: \t\t%lu (MB)\n", mem_size_mb); + printf("Uffd threads: \t\t%u\n", n_uffd_threads); + printf("Worker threads: \t%u\n", n_worker_threads); + printf("Test rounds: \t\t%lu\n", n_tests); + printf("Time used (us): \t"); + + for (count = 0; count < n_tests; count++) { + result = run_perf(mem_size_mb, use_poll); + sum += result; + printf("%" PRIu64 ", ", result); + fflush(stdout); + } + printf("\b\b \n"); + printf("Average (us): \t\t%"PRIu64"\n", sum / n_tests); + + return KSFT_PASS; +} + +#else /* __NR_userfaultfd */ + +#warning "missing __NR_userfaultfd definition" + +int main(void) +{ + printf("Skipping %s (missing __NR_userfaultfd)\n", __FILE__); + return KSFT_SKIP; +} + +#endif /* __NR_userfaultfd */
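
The way worker_fn() splits the test area between workers can be sketched as below. This is a minimal sketch under assumptions: worker_range() and untouched_pages() are hypothetical helpers that are not part of the patch. They mirror the start_nr/end_nr arithmetic and make visible that the truncating division leaves nr_pages % n_workers pages that no worker touches. The caller is assumed to pass a non-zero n_workers.

```c
#include <stdint.h>

/* Hypothetical helper type; mirrors start_nr/end_nr in worker_fn(). */
struct page_range {
	uint64_t start;	/* first page index, inclusive */
	uint64_t end;	/* last page index, exclusive */
};

/* Page range owned by one worker; n_workers must be non-zero. */
static struct page_range worker_range(uint64_t nr_pages, uint64_t n_workers,
				      uint64_t worker)
{
	uint64_t per_worker = nr_pages / n_workers;	/* truncating division */
	struct page_range r = {
		.start = worker * per_worker,
		.end = (worker + 1) * per_worker,
	};

	return r;
}

/* Pages at the tail of the area that fall outside every worker's range. */
static uint64_t untouched_pages(uint64_t nr_pages, uint64_t n_workers)
{
	return nr_pages - (nr_pages / n_workers) * n_workers;
}
```

With the default 512 MB and an assumed 4 KiB page size, nr_pages is 131072, so 8 workers own 16384 pages each and nothing is left over. With a worker count that does not divide nr_pages evenly, the remainder is simply never faulted in.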