From patchwork Fri Oct 13 18:48:53 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jason Baron X-Patchwork-Id: 10005707 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 62FCD602B3 for ; Fri, 13 Oct 2017 18:49:56 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 527A92876C for ; Fri, 13 Oct 2017 18:49:56 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 44299290AC; Fri, 13 Oct 2017 18:49:56 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id CFD36290A9 for ; Fri, 13 Oct 2017 18:49:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751341AbdJMStj (ORCPT ); Fri, 13 Oct 2017 14:49:39 -0400 Received: from mx0a-00190b01.pphosted.com ([67.231.149.131]:33786 "EHLO mx0a-00190b01.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751138AbdJMSti (ORCPT ); Fri, 13 Oct 2017 14:49:38 -0400 Received: from pps.filterd (m0050095.ppops.net [127.0.0.1]) by m0050095.ppops.net-00190b01. (8.16.0.21/8.16.0.21) with SMTP id v9DIksfe016186; Fri, 13 Oct 2017 19:49:33 +0100 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=akamai.com; h=from : to : cc : subject : date : message-id; s=jan2016.eng; bh=lekHF6hfcxTKMF+Dj/UtL0Znkb1qg6qbqStDy4uorLA=; b=SVtFJXhk2a7J4wAvOjKem3Hk/I0uklQfRV6MM7rGAaAsoImwz/ay7TdviQLyIAwAw6mY x9JXE4QyiXZ9yi1QEbkQECqBkhVFmPkEILTTgB5l9FQaU+s2FUgcybPIO1Mu/kw3SL08 mvSPyo2EKDeTVoo42EF9c4EFK86POXkOS32K45hpSWiAbIyosgHRZIHIqfUbivmatKNu 3kLapX51SAZTEZNmZ6TXziJy6xF/xSerNITkB8esQdMddR5luf2CoxiefVFY/Y+D3wjb D2g8AzQUmnbtNVYaUZzw5CvSJBGMYZxYyOdjZjQBy5w0G+bqEUOY53VRbrqFus8CKvmz ig== Received: from prod-mail-ppoint2 (prod-mail-ppoint2.akamai.com [184.51.33.19]) by m0050095.ppops.net-00190b01. with ESMTP id 2dk2qrg1ws-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 13 Oct 2017 19:49:32 +0100 Received: from pps.filterd (prod-mail-ppoint2.akamai.com [127.0.0.1]) by prod-mail-ppoint2.akamai.com (8.16.0.21/8.16.0.21) with SMTP id v9DIkWoB009535; Fri, 13 Oct 2017 14:49:31 -0400 Received: from prod-mail-relay10.akamai.com ([172.27.118.251]) by prod-mail-ppoint2.akamai.com with ESMTP id 2det8uhu9a-1; Fri, 13 Oct 2017 14:49:31 -0400 Received: from bos-lpjec.kendall.corp.akamai.com (bos-lpjec.kendall.corp.akamai.com [172.28.12.38]) by prod-mail-relay10.akamai.com (Postfix) with ESMTP id 929251FC71; Fri, 13 Oct 2017 18:49:31 +0000 (GMT) From: Jason Baron To: akpm@linux-foundation.org Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, Alexander Viro , Salman Qazi Subject: [PATCH] epoll: avoid calling ep_call_nested() from ep_poll_safewake() Date: Fri, 13 Oct 2017 14:48:53 -0400 Message-Id: <1507920533-8812-1-git-send-email-jbaron@akamai.com> X-Mailer: git-send-email 1.9.1 X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:, , definitions=2017-10-13_07:, , signatures=0 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 spamscore=0 suspectscore=1 malwarescore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1707230000 definitions=main-1710130260 X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:, , definitions=2017-10-13_07:, , signatures=0 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=1 phishscore=0 bulkscore=0 spamscore=0 clxscore=1011 lowpriorityscore=0 impostorscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1707230000 definitions=main-1710130261 Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP The ep_poll_safewake() function is used to wakeup potentially nested epoll file descriptors. The function uses ep_call_nested() to prevent entering the same wake up queue more than once, and to prevent excessively deep wakeup paths (deeper than EP_MAX_NESTS). However, this is not necessary since we are already preventing these conditions during EPOLL_CTL_ADD. This saves extra function calls, and avoids taking a global lock during the ep_call_nested() calls. I have, however, left ep_call_nested() for the CONFIG_DEBUG_LOCK_ALLOC case, since ep_call_nested() keeps track of the nesting level, and this is required by the call to spin_lock_irqsave_nested(). It would be nice to remove the ep_call_nested() calls for the CONFIG_DEBUG_LOCK_ALLOC case as well, however its not clear how to simply pass the nesting level through multiple wake_up() levels without more surgery. In any case, I don't think CONFIG_DEBUG_LOCK_ALLOC is generally used for production. This patch, also apparently fixes a workload at Google that Salman Qazi reported by completely removing the poll_safewake_ncalls->lock from wakeup paths. Signed-off-by: Jason Baron Cc: Alexander Viro Cc: Andrew Morton Cc: Salman Qazi Acked-by: Davidlohr Bueso --- fs/eventpoll.c | 47 ++++++++++++++++++----------------------------- 1 file changed, 18 insertions(+), 29 deletions(-) diff --git a/fs/eventpoll.c b/fs/eventpoll.c index 2fabd19..69acfab 100644 --- a/fs/eventpoll.c +++ b/fs/eventpoll.c @@ -276,9 +276,6 @@ static DEFINE_MUTEX(epmutex); /* Used to check for epoll file descriptor inclusion loops */ static struct nested_calls poll_loop_ncalls; -/* Used for safe wake up implementation */ -static struct nested_calls poll_safewake_ncalls; - /* Used to call file's f_op->poll() under the nested calls boundaries */ static struct nested_calls poll_readywalk_ncalls; @@ -551,40 +548,21 @@ static int ep_call_nested(struct nested_calls *ncalls, int max_nests, * this special case of epoll. */ #ifdef CONFIG_DEBUG_LOCK_ALLOC -static inline void ep_wake_up_nested(wait_queue_head_t *wqueue, - unsigned long events, int subclass) + +static struct nested_calls poll_safewake_ncalls; + +static int ep_poll_wakeup_proc(void *priv, void *cookie, int call_nests) { unsigned long flags; + wait_queue_head_t *wqueue = (wait_queue_head_t *)cookie; - spin_lock_irqsave_nested(&wqueue->lock, flags, subclass); - wake_up_locked_poll(wqueue, events); + spin_lock_irqsave_nested(&wqueue->lock, flags, call_nests + 1); + wake_up_locked_poll(wqueue, POLLIN); spin_unlock_irqrestore(&wqueue->lock, flags); -} -#else -static inline void ep_wake_up_nested(wait_queue_head_t *wqueue, - unsigned long events, int subclass) -{ - wake_up_poll(wqueue, events); -} -#endif -static int ep_poll_wakeup_proc(void *priv, void *cookie, int call_nests) -{ - ep_wake_up_nested((wait_queue_head_t *) cookie, POLLIN, - 1 + call_nests); return 0; } -/* - * Perform a safe wake up of the poll wait list. The problem is that - * with the new callback'd wake up system, it is possible that the - * poll callback is reentered from inside the call to wake_up() done - * on the poll wait queue head. The rule is that we cannot reenter the - * wake up code from the same task more than EP_MAX_NESTS times, - * and we cannot reenter the same wait queue head at all. This will - * enable to have a hierarchy of epoll file descriptor of no more than - * EP_MAX_NESTS deep. - */ static void ep_poll_safewake(wait_queue_head_t *wq) { int this_cpu = get_cpu(); @@ -595,6 +573,15 @@ static void ep_poll_safewake(wait_queue_head_t *wq) put_cpu(); } +#else + +static void ep_poll_safewake(wait_queue_head_t *wq) +{ + wake_up_poll(wq, POLLIN); +} + +#endif + static void ep_remove_wait_queue(struct eppoll_entry *pwq) { wait_queue_head_t *whead; @@ -2315,8 +2302,10 @@ static int __init eventpoll_init(void) */ ep_nested_calls_init(&poll_loop_ncalls); +#ifdef CONFIG_DEBUG_LOCK_ALLOC /* Initialize the structure used to perform safe poll wait head wake ups */ ep_nested_calls_init(&poll_safewake_ncalls); +#endif /* Initialize the structure used to perform file's f_op->poll() calls */ ep_nested_calls_init(&poll_readywalk_ncalls);