From patchwork Wed Jul 15 20:44:42 2020
From: James Simmons
To: Andreas Dilger, Oleg Drokin, NeilBrown
Cc: Lustre Development List
Date: Wed, 15 Jul 2020 16:44:42 -0400
Message-Id: <1594845918-29027-2-git-send-email-jsimmons@infradead.org>
Subject: [lustre-devel] [PATCH 01/37] lustre: osc: fix osc_extent_find()

From: Mr NeilBrown

- Fix a pre-existing bug: osc_extent_merge() should never try to merge
  two extents with different ->oe_mppr, as later alignment checks can
  get confused.
- Remove a redundant list_del_init() which is already included in
  __osc_extent_remove().
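The invariant behind the new check is easiest to see in isolation. A
minimal, self-contained sketch follows; this is not the Lustre source,
and the pared-down struct and function names are hypothetical:

#include <errno.h>

/* Hypothetical, pared-down extent: only the fields the check needs. */
struct extent {
	unsigned int oe_mppr;		/* max pages per RPC when created */
	unsigned long oe_start, oe_end;	/* page range covered */
};

/*
 * Refuse to merge two extents created under different max_pages_per_rpc
 * settings: later "extents are aligned to RPC size" checks assume a
 * single consistent mppr per extent, so a mixed merge would break them.
 */
static int extent_merge_allowed(const struct extent *cur,
				const struct extent *victim)
{
	if (cur->oe_mppr != victim->oe_mppr)
		return -ERANGE;	/* caller keeps the extents separate */
	return 0;
}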
Fixes: 85ebb57ddc ("lustre: osc: simplify osc_extent_find()") WC-bug-id: https://jira.whamcloud.com/browse/LU-9679 Lustre-commit: 80e21cce3dd67 ("LU-9679 osc: simplify osc_extent_find()") Signed-off-by: Mr NeilBrown Reviewed-on: https://review.whamcloud.com/37607 Reviewed-by: Andreas Dilger Reviewed-by: James Simmons Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/osc/osc_cache.c | 33 +++++++++++++++------------------ 1 file changed, 15 insertions(+), 18 deletions(-) diff --git a/fs/lustre/osc/osc_cache.c b/fs/lustre/osc/osc_cache.c index 5049aaa..474b711 100644 --- a/fs/lustre/osc/osc_cache.c +++ b/fs/lustre/osc/osc_cache.c @@ -574,6 +574,14 @@ static int osc_extent_merge(const struct lu_env *env, struct osc_extent *cur, if (cur->oe_max_end != victim->oe_max_end) return -ERANGE; + /* + * In the rare case max_pages_per_rpc (mppr) is changed, don't + * merge extents until after old ones have been sent, or the + * "extents are aligned to RPCs" checks are unhappy. + */ + if (cur->oe_mppr != victim->oe_mppr) + return -ERANGE; + LASSERT(cur->oe_dlmlock == victim->oe_dlmlock); ppc_bits = osc_cli(obj)->cl_chunkbits - PAGE_SHIFT; chunk_start = cur->oe_start >> ppc_bits; @@ -601,7 +609,6 @@ static int osc_extent_merge(const struct lu_env *env, struct osc_extent *cur, cur->oe_urgent |= victim->oe_urgent; cur->oe_memalloc |= victim->oe_memalloc; list_splice_init(&victim->oe_pages, &cur->oe_pages); - list_del_init(&victim->oe_link); victim->oe_nr_pages = 0; osc_extent_get(victim); @@ -727,8 +734,7 @@ static struct osc_extent *osc_extent_find(const struct lu_env *env, cur->oe_start = descr->cld_start; if (cur->oe_end > max_end) cur->oe_end = max_end; - LASSERT(*grants >= chunksize); - cur->oe_grants = chunksize; + cur->oe_grants = chunksize + cli->cl_grant_extent_tax; cur->oe_mppr = max_pages; if (olck->ols_dlmlock) { LASSERT(olck->ols_hold); @@ -800,17 +806,8 @@ static struct osc_extent *osc_extent_find(const struct lu_env *env, */ continue; - /* it's required that an extent must be contiguous at chunk - * level so that we know the whole extent is covered by grant - * (the pages in the extent are NOT required to be contiguous). - * Otherwise, it will be too much difficult to know which - * chunks have grants allocated. 
- */ - /* On success, osc_extent_merge() will put cur, - * so we take an extra reference - */ - osc_extent_get(cur); if (osc_extent_merge(env, ext, cur) == 0) { + LASSERT(*grants >= chunksize); *grants -= chunksize; found = osc_extent_hold(ext);
@@ -824,19 +821,19 @@ static struct osc_extent *osc_extent_find(const struct lu_env *env, break; } - osc_extent_put(env, cur); } osc_extent_tree_dump(D_CACHE, obj); if (found) { LASSERT(!conflict); - LASSERT(found->oe_dlmlock == cur->oe_dlmlock); - OSC_EXTENT_DUMP(D_CACHE, found, - "found caching ext for %lu.\n", index); + if (!IS_ERR(found)) { + LASSERT(found->oe_dlmlock == cur->oe_dlmlock); + OSC_EXTENT_DUMP(D_CACHE, found, + "found caching ext for %lu.\n", index); + } } else if (!conflict) { /* create a new extent */ EASSERT(osc_extent_is_overlapped(obj, cur) == 0, cur); - cur->oe_grants = chunksize + cli->cl_grant_extent_tax; LASSERT(*grants >= cur->oe_grants); *grants -= cur->oe_grants;

From patchwork Wed Jul 15 20:44:43 2020
From: James Simmons
To: Andreas Dilger, Oleg Drokin, NeilBrown
Cc: Wang Shilong, Lustre Development List
Date: Wed, 15 Jul 2020 16:44:43 -0400
Message-Id: <1594845918-29027-3-git-send-email-jsimmons@infradead.org>
Subject: [lustre-devel] [PATCH 02/37] lustre: ldlm: check slv and limit before updating
From: Wang Shilong

slv and limit do not change most of the time, but ldlm_cli_update_pool()
can be called for every RPC reply. Taking the read lock first to check
whether anything changed avoids the heavy write lock in this hot path.

WC-bug-id: https://jira.whamcloud.com/browse/LU-13365
Lustre-commit: 3116b9e19dc09 ("LU-13365 ldlm: check slv and limit before updating")
Signed-off-by: Wang Shilong
Reviewed-on: https://review.whamcloud.com/37969
Reviewed-by: Andreas Dilger
Reviewed-by: Mike Pershin
Signed-off-by: James Simmons
---
 fs/lustre/ldlm/ldlm_request.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/fs/lustre/ldlm/ldlm_request.c b/fs/lustre/ldlm/ldlm_request.c index e1ba596..6318137 100644 --- a/fs/lustre/ldlm/ldlm_request.c +++ b/fs/lustre/ldlm/ldlm_request.c
@@ -1163,6 +1163,14 @@ int ldlm_cli_update_pool(struct ptlrpc_request *req) new_slv = lustre_msg_get_slv(req->rq_repmsg); obd = req->rq_import->imp_obd; + read_lock(&obd->obd_pool_lock); + if (obd->obd_pool_slv == new_slv && + obd->obd_pool_limit == new_limit) { + read_unlock(&obd->obd_pool_lock); + return 0; + } + read_unlock(&obd->obd_pool_lock); + /* * Set new SLV and limit in OBD fields to make them accessible * to the pool thread. We do not access obd_namespace and pool
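The optimization above is ordinary double-checked locking around a
rwlock. A minimal userspace sketch of the same shape (pthread-based;
pool_state and its fields are made-up stand-ins for obd_pool_lock,
obd_pool_slv and obd_pool_limit, not the Lustre API):

#include <pthread.h>
#include <stdint.h>

struct pool_state {
	pthread_rwlock_t lock;
	uint64_t slv;
	uint32_t limit;
};

static void pool_update(struct pool_state *p,
			uint64_t new_slv, uint32_t new_limit)
{
	/* Fast path: most replies carry unchanged values, and many
	 * readers can verify that in parallel under the shared lock.
	 */
	pthread_rwlock_rdlock(&p->lock);
	if (p->slv == new_slv && p->limit == new_limit) {
		pthread_rwlock_unlock(&p->lock);
		return;
	}
	pthread_rwlock_unlock(&p->lock);

	/* Slow path: the values really changed, so take the exclusive
	 * lock for the update.
	 */
	pthread_rwlock_wrlock(&p->lock);
	p->slv = new_slv;
	p->limit = new_limit;
	pthread_rwlock_unlock(&p->lock);
}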
From patchwork Wed Jul 15 20:44:44 2020
From: James Simmons
To: Andreas Dilger, Oleg Drokin, NeilBrown
Cc: Lustre Development List
Date: Wed, 15 Jul 2020 16:44:44 -0400
Message-Id: <1594845918-29027-4-git-send-email-jsimmons@infradead.org>
Subject: [lustre-devel] [PATCH 03/37] lustre: sec: better struct sepol_downcall_data

From: Sebastien Buisson

struct sepol_downcall_data is badly formed for several reasons:
- it uses a __kernel_time_t field, which can be variably sized,
  depending on the size of __kernel_long_t. Replace it with a
  fixed-size __s64 type;
- it has a __u32 sdd_magic immediately before a potentially 64-bit
  field, whereas 64-bit fields in a structure should always be
  naturally aligned on 64-bit boundaries to avoid potential
  incompatibility in the structure definition;
- it has a __u16 sdd_sepol_len which may be followed by padding.

So create a better struct sepol_downcall_data, while maintaining
compatibility with 2.12 by keeping a struct sepol_downcall_data_old.

WC-bug-id: https://jira.whamcloud.com/browse/LU-13525
Lustre-commit: 82b8cb5528f48 ("LU-13525 sec: better struct sepol_downcall_data")
Signed-off-by: Sebastien Buisson
Reviewed-on: https://review.whamcloud.com/38580
Reviewed-by: Olaf Faaland-LLNL
Reviewed-by: Andreas Dilger
Reviewed-by: Oleg Drokin
Signed-off-by: James Simmons
---
 fs/lustre/ptlrpc/sec_lproc.c            | 134 ++++++++++++++++++++++++++++----
 include/uapi/linux/lustre/lustre_user.h |  16 +++-
 2 files changed, 135 insertions(+), 15 deletions(-)

diff --git a/fs/lustre/ptlrpc/sec_lproc.c b/fs/lustre/ptlrpc/sec_lproc.c index 7db7e81..b34ced4 100644 --- a/fs/lustre/ptlrpc/sec_lproc.c +++ b/fs/lustre/ptlrpc/sec_lproc.c
@@ -131,6 +131,86 @@ static int sptlrpc_ctxs_lprocfs_seq_show(struct seq_file *seq, void *v) LPROC_SEQ_FOPS_RO(sptlrpc_ctxs_lprocfs); +#if LUSTRE_VERSION_CODE < OBD_OCD_VERSION(2, 16, 53, 0) +static ssize_t sepol_seq_write_old(struct obd_device *obd, + const char __user *buffer, + size_t count) +{ + struct client_obd *cli = &obd->u.cli; + struct obd_import *imp = cli->cl_import; + struct sepol_downcall_data_old *param; + int size = sizeof(*param); + u16 len; + int rc = 0; + + if (count < size) { + rc = -EINVAL; + CERROR("%s: invalid data count = %lu, size = %d: rc = %d\n", + obd->obd_name, (unsigned long) count, size, rc); + return rc; + } + + param = kmalloc(size, GFP_KERNEL); + if (!param) + return -ENOMEM; + + if (copy_from_user(param, buffer, size)) { + rc = -EFAULT; + CERROR("%s: bad sepol data: rc = %d\n", obd->obd_name, rc); + goto out; + } + + if (param->sdd_magic != SEPOL_DOWNCALL_MAGIC_OLD) { + rc = -EINVAL; + CERROR("%s: sepol downcall bad params: rc = %d\n", + obd->obd_name, rc); + goto out; + } + + if (param->sdd_sepol_len == 0 || + param->sdd_sepol_len >= sizeof(imp->imp_sec->ps_sepol)) { + rc = -EINVAL; + CERROR("%s: invalid sepol data returned: rc = %d\n", + obd->obd_name, rc); + goto out; + } + len = param->sdd_sepol_len; /* save sdd_sepol_len */ + kfree(param); + size = offsetof(struct sepol_downcall_data_old, + sdd_sepol[len]); + + if (count < size) { + rc = -EINVAL; + CERROR("%s: invalid sepol count = %lu, size = %d: rc = %d\n", + obd->obd_name, (unsigned long) count, size, rc); + return rc; + } + + /* alloc again with real size */ + param = kmalloc(size, GFP_KERNEL); + if (!param) + return -ENOMEM; + + if (copy_from_user(param, buffer, size)) { + rc = -EFAULT; + CERROR("%s: cannot copy sepol data: rc = %d\n", + obd->obd_name, rc); + goto out; + } + +
spin_lock(&imp->imp_sec->ps_lock); + snprintf(imp->imp_sec->ps_sepol, param->sdd_sepol_len + 1, "%s", + param->sdd_sepol); + imp->imp_sec->ps_sepol_mtime = ktime_set(param->sdd_sepol_mtime, 0); + spin_unlock(&imp->imp_sec->ps_lock); + +out: + kfree(param); + + return rc ? rc : count; +} +#endif + static ssize_t lprocfs_wr_sptlrpc_sepol(struct file *file, const char __user *buffer, size_t count, void *data) @@ -140,13 +220,41 @@ static int sptlrpc_ctxs_lprocfs_seq_show(struct seq_file *seq, void *v) struct client_obd *cli = &obd->u.cli; struct obd_import *imp = cli->cl_import; struct sepol_downcall_data *param; - int size = sizeof(*param); + u32 magic; + int size = sizeof(magic); + u16 len; int rc = 0; if (count < size) { - CERROR("%s: invalid data count = %lu, size = %d\n", - obd->obd_name, (unsigned long) count, size); - return -EINVAL; + rc = -EINVAL; + CERROR("%s: invalid buffer count = %lu, size = %d: rc = %d\n", + obd->obd_name, (unsigned long) count, size, rc); + return rc; + } + + if (copy_from_user(&magic, buffer, size)) { + rc = -EFAULT; + CERROR("%s: bad sepol magic: rc = %d\n", obd->obd_name, rc); + return rc; + } + + if (magic != SEPOL_DOWNCALL_MAGIC) { +#if LUSTRE_VERSION_CODE < OBD_OCD_VERSION(2, 16, 53, 0) + if (magic == SEPOL_DOWNCALL_MAGIC_OLD) + return sepol_seq_write_old(obd, buffer, count); +#endif + rc = -EINVAL; + CERROR("%s: sepol downcall bad magic '%#08x': rc = %d\n", + obd->obd_name, magic, rc); + return rc; + } + + size = sizeof(*param); + if (count < size) { + rc = -EINVAL; + CERROR("%s: invalid data count = %lu, size = %d: rc = %d\n", + obd->obd_name, (unsigned long) count, size, rc); + return rc; } param = kzalloc(size, GFP_KERNEL); @@ -154,39 +262,39 @@ static int sptlrpc_ctxs_lprocfs_seq_show(struct seq_file *seq, void *v) return -ENOMEM; if (copy_from_user(param, buffer, size)) { - CERROR("%s: bad sepol data\n", obd->obd_name); rc = -EFAULT; + CERROR("%s: bad sepol data: rc = %d\n", obd->obd_name, rc); goto out; } if (param->sdd_magic != SEPOL_DOWNCALL_MAGIC) { - CERROR("%s: sepol downcall bad params\n", - obd->obd_name); rc = -EINVAL; + CERROR("%s: invalid sepol data returned: rc = %d\n", + obd->obd_name, rc); goto out; } if (param->sdd_sepol_len == 0 || param->sdd_sepol_len >= sizeof(imp->imp_sec->ps_sepol)) { - CERROR("%s: invalid sepol data returned\n", - obd->obd_name); rc = -EINVAL; + CERROR("%s: invalid sepol data returned: rc = %d\n", + obd->obd_name, rc); goto out; } - rc = param->sdd_sepol_len; /* save sdd_sepol_len */ + len = param->sdd_sepol_len; /* save sdd_sepol_len */ kfree(param); size = offsetof(struct sepol_downcall_data, - sdd_sepol[rc]); + sdd_sepol[len]); /* alloc again with real size */ - rc = 0; param = kzalloc(size, GFP_KERNEL); if (!param) return -ENOMEM; if (copy_from_user(param, buffer, size)) { - CERROR("%s: bad sepol data\n", obd->obd_name); rc = -EFAULT; + CERROR("%s: cannot copy sepol data: rc = %d\n", + obd->obd_name, rc); goto out; } diff --git a/include/uapi/linux/lustre/lustre_user.h b/include/uapi/linux/lustre/lustre_user.h index 6a2d5f9..b0301e1 100644 --- a/include/uapi/linux/lustre/lustre_user.h +++ b/include/uapi/linux/lustre/lustre_user.h @@ -51,6 +51,7 @@ #include #include #include +#include #ifndef __KERNEL__ # define __USE_ISOC99 1 @@ -980,7 +981,6 @@ static inline const char *qtype_name(int qtype) } #define IDENTITY_DOWNCALL_MAGIC 0x6d6dd629 -#define SEPOL_DOWNCALL_MAGIC 0x8b8bb842 /* permission */ #define N_PERMS_MAX 64 @@ -1002,13 +1002,25 @@ struct identity_downcall_data { __u32 idd_groups[0]; }; -struct 
sepol_downcall_data { +#if LUSTRE_VERSION_CODE < OBD_OCD_VERSION(2, 16, 53, 0) +/* old interface struct is deprecated in 2.14 */ +#define SEPOL_DOWNCALL_MAGIC_OLD 0x8b8bb842 +struct sepol_downcall_data_old { __u32 sdd_magic; __s64 sdd_sepol_mtime; __u16 sdd_sepol_len; char sdd_sepol[0]; }; +#endif +#define SEPOL_DOWNCALL_MAGIC 0x8b8bb843 +struct sepol_downcall_data { + __u32 sdd_magic; + __u16 sdd_sepol_len; + __u16 sdd_padding1; + __s64 sdd_sepol_mtime; + char sdd_sepol[0]; +}; /* lustre volatile file support * file name header: ".^L^S^T^R:volatile"

From patchwork Wed Jul 15 20:44:45 2020
From: James Simmons
To: Andreas Dilger, Oleg Drokin, NeilBrown
Cc: Lustre Development List
Date: Wed, 15 Jul 2020 16:44:45 -0400
Message-Id: <1594845918-29027-5-git-send-email-jsimmons@infradead.org>
Subject: [lustre-devel] [PATCH 04/37] lustre: obdclass: remove init to 0 from lustre_init_lsi()

From: Mr NeilBrown

After allocating a struct with kzalloc(), there is no value in setting
a few of its fields to zero. And since all fields were zero, it must be
safe to kfree() lmd_exclude whether lmd_exclude_count is zero or not.
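Both halves of that argument rest on allocator guarantees. A small
userspace analogue (calloc()/free() mirror kzalloc()/kfree() here, and
struct mount_data is a made-up stand-in for the Lustre mount data):

#include <stdlib.h>

struct mount_data {
	int exclude_count;
	int *exclude;	/* stays NULL until explicitly allocated */
};

static struct mount_data *mount_data_alloc(void)
{
	/* Every field is already zero/NULL after a zeroing allocation;
	 * assigning 0 to individual fields afterwards is dead code.
	 */
	return calloc(1, sizeof(struct mount_data));
}

static void mount_data_free(struct mount_data *md)
{
	if (!md)
		return;
	free(md->exclude);	/* free(NULL) is defined to be a no-op */
	free(md);
}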
WC-bug-id: https://jira.whamcloud.com/browse/LU-9679
Lustre-commit: 513dde601d2e9 ("LU-9679 obdclass: remove init to 0 from lustre_init_lsi()")
Signed-off-by: Mr NeilBrown
Reviewed-on: https://review.whamcloud.com/39135
Reviewed-by: James Simmons
Reviewed-by: Yang Sheng
Reviewed-by: Oleg Drokin
Signed-off-by: James Simmons
---
 fs/lustre/obdclass/obd_mount.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/fs/lustre/obdclass/obd_mount.c b/fs/lustre/obdclass/obd_mount.c index 13e6521..ea5b469 100644 --- a/fs/lustre/obdclass/obd_mount.c +++ b/fs/lustre/obdclass/obd_mount.c
@@ -515,9 +515,6 @@ struct lustre_sb_info *lustre_init_lsi(struct super_block *sb) return NULL; } - lsi->lsi_lmd->lmd_exclude_count = 0; - lsi->lsi_lmd->lmd_recovery_time_soft = 0; - lsi->lsi_lmd->lmd_recovery_time_hard = 0; s2lsi_nocast(sb) = lsi; /* we take 1 extra ref for our setup */ atomic_set(&lsi->lsi_mounts, 1);
@@ -544,8 +541,7 @@ static int lustre_free_lsi(struct super_block *sb) kfree(lsi->lsi_lmd->lmd_fileset); kfree(lsi->lsi_lmd->lmd_mgssec); kfree(lsi->lsi_lmd->lmd_opts); - if (lsi->lsi_lmd->lmd_exclude_count) - kfree(lsi->lsi_lmd->lmd_exclude); + kfree(lsi->lsi_lmd->lmd_exclude); kfree(lsi->lsi_lmd->lmd_mgs); kfree(lsi->lsi_lmd->lmd_osd_type); kfree(lsi->lsi_lmd->lmd_params);

From patchwork Wed Jul 15 20:44:46 2020
From: James Simmons
To: Andreas Dilger, Oleg Drokin, NeilBrown
Cc: Lustre Development List
Date: Wed, 15 Jul 2020 16:44:46 -0400
Message-Id: <1594845918-29027-6-git-send-email-jsimmons@infradead.org>
Subject: [lustre-devel] [PATCH 05/37] lustre: ptlrpc: handle conn_hash rhashtable resize
The -ENOMEM and -EBUSY errors returned by
rhashtable_lookup_get_insert_fast() are due to the hashtable being
resized. This is not fatal, so retry the lookup.

Fixes: ac2370ac2b ("staging: lustre: ptlrpc: convert conn_hash to rhashtable")
WC-bug-id: https://jira.whamcloud.com/browse/LU-8130
Lustre-commit: 37b29a8f709aa ("LU-8130 ptlrpc: convert conn_hash to rhashtable")
Signed-off-by: James Simmons
Reviewed-on: https://review.whamcloud.com/33616
Reviewed-by: Shaun Tancheff
Reviewed-by: Oleg Drokin
---
 fs/lustre/ptlrpc/connection.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/fs/lustre/ptlrpc/connection.c b/fs/lustre/ptlrpc/connection.c index 5466755..a548d99 100644 --- a/fs/lustre/ptlrpc/connection.c +++ b/fs/lustre/ptlrpc/connection.c
@@ -32,6 +32,8 @@ */ #define DEBUG_SUBSYSTEM S_RPC + +#include #include #include #include
@@ -103,13 +105,21 @@ struct ptlrpc_connection * * connection. The object which exists in the hash will be * returned, otherwise NULL is returned on success. */ +try_again: conn2 = rhashtable_lookup_get_insert_fast(&conn_hash, &conn->c_hash, conn_hash_params); if (conn2) { /* insertion failed */ kfree(conn); - if (IS_ERR(conn2)) + if (IS_ERR(conn2)) { + /* hash table could be resizing. */ + if (PTR_ERR(conn2) == -ENOMEM || + PTR_ERR(conn2) == -EBUSY) { + msleep(20); + goto try_again; + } return NULL; + } conn = conn2; ptlrpc_connection_addref(conn); } }
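The shape of the retry is generic: treat errors that can only come from
a concurrent resize as transient and try again after a short backoff. A
hedged sketch (try_insert() is a hypothetical stand-in for
rhashtable_lookup_get_insert_fast(), reduced to an int return):

#include <errno.h>
#include <time.h>

struct table;
struct entry;

/* Hypothetical insert-or-lookup primitive: returns 0 on success,
 * -EEXIST if an entry already exists, or -EBUSY/-ENOMEM while the
 * table is in the middle of a resize.
 */
int try_insert(struct table *t, struct entry *e);

static int insert_with_retry(struct table *t, struct entry *e)
{
	int rc;

	for (;;) {
		rc = try_insert(t, e);
		if (rc != -EBUSY && rc != -ENOMEM)
			return rc;	/* success, -EEXIST, or a real error */

		/* A concurrent resize is transient: back off briefly
		 * and retry, mirroring the msleep(20) in the patch.
		 */
		nanosleep(&(struct timespec){ .tv_nsec = 20 * 1000 * 1000 },
			  NULL);
	}
}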
From patchwork Wed Jul 15 20:44:47 2020
From: James Simmons
To: Andreas Dilger, Oleg Drokin, NeilBrown
Cc: Lustre Development List
Date: Wed, 15 Jul 2020 16:44:47 -0400
Message-Id: <1594845918-29027-7-git-send-email-jsimmons@infradead.org>
Subject: [lustre-devel] [PATCH 06/37] lustre: lu_object: convert lu_object cache to rhashtable

From: Mr NeilBrown

The lu_object cache is a little more complex than the other Lustre hash
tables for two reasons.
1/ There is a debugfs file which displays the contents of the cache, so
   we need to use rhashtable_walk in a way that works for seq_file.
2/ There is a (sharded) LRU list for objects which are no longer
   referenced, so finding an object needs to consider races with the
   LRU as well as with the hash table.

The debugfs file already manages walking the libcfs hash table, keeping
a current position in the private data. We can fairly easily convert
that to a struct rhashtable_iter. The debugfs file actually reports
pages, and there are multiple pages per hashtable object, so as well as
the rhashtable_iter we need the current page index.

For the double-locking, the current code uses direct access to the
bucket locks that libcfs_hash provides. rhashtable doesn't provide that
access - callers must provide their own locking or use RCU techniques.

The lsb_waitq.lock is still used to manage the LRU list, but with this
patch it is no longer nested *inside* the hashtable locks; instead it
is outside. It is used to protect an object with a refcount of zero.

When purging old objects from an LRU, we first set
LU_OBJECT_HEARD_BANSHEE while holding the lsb_waitq.lock, then remove
all the entries from the hashtable separately.

When removing the last reference from an object, we first take the
lsb_waitq.lock, then decrement the reference and either add the object
to the LRU list or discard it, setting LU_OBJECT_UNHASHED.

When we find an object in the hashtable with a refcount of zero, we
take the corresponding lsb_waitq.lock and check that neither
LU_OBJECT_HEARD_BANSHEE nor LU_OBJECT_UNHASHED is set. If neither is
set, we can safely increment the refcount. If either is set, the object
is gone. This way, we only ever manipulate an object with a refcount of
zero while holding the lsb_waitq.lock.

As there is nothing to stop us using the resizing capabilities of
rhashtable, the code that tried to guess the perfect hash size has been
removed.

Also: the "is_dying" variable in lu_object_put() is racy - the value
could change the moment it is sampled. It is also not needed, as it is
only used to avoid a wakeup, which is not particularly expensive. In
the same code a comment says that 'top' cannot be accessed, but the
code then immediately accesses 'top' to calculate 'bkt', so move the
initialization of 'bkt' to before 'top' becomes unsafe.

Also: change "wake_up_all()" to "wake_up()"; wake_up_all() is only
relevant when an exclusive wait is used.

Moving from the libcfs hashtable to rhashtable also gives the benefit
of a very large performance boost.
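The core rule - an object whose refcount has reached zero may only be
revived or destroyed under its bucket lock - can be sketched with C11
atomics. This is illustrative only: "obj" stands in for
lu_object_header, "dying" for the LU_OBJECT_HEARD_BANSHEE and
LU_OBJECT_UNHASHED bits, and the mutex for the per-bucket
lsb_waitq.lock.

#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

struct obj {
	atomic_int ref;
	bool dying;	/* only written while holding the bucket lock */
};

static bool obj_tryget(struct obj *o, pthread_mutex_t *bkt_lock)
{
	int r = atomic_load(&o->ref);

	/* Fast path: the refcount is positive, so just bump it. */
	while (r > 0)
		if (atomic_compare_exchange_weak(&o->ref, &r, r + 1))
			return true;

	/* Refcount reached zero: the object is only stable while we
	 * hold the bucket lock, so re-check its fate under that lock.
	 */
	pthread_mutex_lock(bkt_lock);
	if (!o->dying) {
		atomic_fetch_add(&o->ref, 1);	/* revive from the LRU */
		pthread_mutex_unlock(bkt_lock);
		return true;
	}
	pthread_mutex_unlock(bkt_lock);
	return false;	/* being freed; the caller must look up again */
}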
Before patch:

SUMMARY rate: (of 5 iterations)
   Operation             Max        Min        Mean    Std Dev
   ---------             ---        ---        ----    -------
   Directory creation: 12036.610  11091.880  11452.978   318.829
   Directory stat:     25871.734  24232.310  24935.661   574.996
   Directory removal:  12698.769  12239.685  12491.008   149.149
   File creation:      11722.036  11673.961  11692.157    15.966
   File stat:          62304.540  58237.124  60282.003  1479.103
   File read:          24204.811  23889.091  24048.577   110.245
   File removal:        9412.930   9111.828   9217.546   120.894
   Tree creation:       3515.536   3195.627   3442.609   123.792
   Tree removal:         433.917    418.935    428.038     5.545

After patch:

SUMMARY rate: (of 5 iterations)
   Operation              Max         Min         Mean    Std Dev
   ---------              ---         ---         ----    -------
   Directory creation:  11873.308     303.626    9371.860  4539.539
   Directory stat:      31116.512   30190.574   30568.091   335.545
   Directory removal:   13082.121   12645.228   12943.239   157.695
   File creation:       12607.135   12293.319   12466.647   138.307
   File stat:          124419.347  105240.996  116919.977  7847.165
   File read:           39707.270   36295.477   38266.011  1328.857
   File removal:         9614.333    9273.931    9477.299   140.201
   Tree creation:        3572.602    3017.580    3339.547   207.061
   Tree removal:          487.687       0.004     282.188   230.659

WC-bug-id: https://jira.whamcloud.com/browse/LU-8130
Lustre-commit: aff14dbc522e1 ("LU-8130 lu_object: convert lu_object cache to rhashtable")
Signed-off-by: Mr NeilBrown
Reviewed-on: https://review.whamcloud.com/36707
Reviewed-by: James Simmons
Reviewed-by: Shaun Tancheff
Reviewed-by: Oleg Drokin
Signed-off-by: James Simmons
---
 fs/lustre/include/lu_object.h  |  20 +-
 fs/lustre/llite/vvp_dev.c      | 105 +++------
 fs/lustre/lov/lovsub_dev.c     |   5 +-
 fs/lustre/obdclass/lu_object.c | 481 +++++++++++++++++++----------------------
 4 files changed, 272 insertions(+), 339 deletions(-)

diff --git a/fs/lustre/include/lu_object.h b/fs/lustre/include/lu_object.h index 2a2f38e..1a6b6e1 100644 --- a/fs/lustre/include/lu_object.h +++ b/fs/lustre/include/lu_object.h
@@ -36,6 +36,7 @@ #include #include +#include #include #include #include
@@ -469,11 +470,6 @@ enum lu_object_header_flags { * initialized yet, the object allocator will initialize it. */ LU_OBJECT_INITED = 2, - /** - * Object is being purged, so mustn't be returned by - * htable_lookup() - */ - LU_OBJECT_PURGING = 3, }; enum lu_object_header_attr {
@@ -496,6 +492,8 @@ enum lu_object_header_attr { * it is created for things like not-yet-existing child created by mkdir or * create calls. lu_object_operations::loo_exists() can be used to check * whether object is backed by persistent storage entity. + * Any object containing this structure which might be placed in an + * rhashtable via loh_hash MUST be freed using call_rcu() or rcu_kfree(). */ struct lu_object_header { /**
@@ -517,9 +515,9 @@ struct lu_object_header { */ u32 loh_attr; /** - * Linkage into per-site hash table. Protected by lu_site::ls_guard. + * Linkage into per-site hash table. */ - struct hlist_node loh_hash; + struct rhash_head loh_hash; /** * Linkage into per-site LRU list. Protected by lu_site::ls_guard. */
@@ -566,7 +564,7 @@ struct lu_site { /** * objects hash table */ - struct cfs_hash *ls_obj_hash; + struct rhashtable ls_obj_hash; /* * buckets for summary data */
@@ -643,6 +641,8 @@ int lu_object_init(struct lu_object *o, void lu_object_fini(struct lu_object *o); void lu_object_add_top(struct lu_object_header *h, struct lu_object *o); void lu_object_add(struct lu_object *before, struct lu_object *o); +struct lu_object *lu_object_get_first(struct lu_object_header *h, + struct lu_device *dev); /** * Helpers to initialize and finalize device types.
@@ -697,8 +697,8 @@ static inline int lu_site_purge(const struct lu_env *env, struct lu_site *s, return lu_site_purge_objects(env, s, nr, true); } -void lu_site_print(const struct lu_env *env, struct lu_site *s, void *cookie, - lu_printer_t printer); +void lu_site_print(const struct lu_env *env, struct lu_site *s, atomic_t *ref, + int msg_flags, lu_printer_t printer); struct lu_object *lu_object_find_at(const struct lu_env *env, struct lu_device *dev, const struct lu_fid *f, diff --git a/fs/lustre/llite/vvp_dev.c b/fs/lustre/llite/vvp_dev.c index e1d87f9..aa8b2c5 100644 --- a/fs/lustre/llite/vvp_dev.c +++ b/fs/lustre/llite/vvp_dev.c @@ -361,21 +361,13 @@ int cl_sb_fini(struct super_block *sb) * ****************************************************************************/ -struct vvp_pgcache_id { - unsigned int vpi_bucket; - unsigned int vpi_depth; - u32 vpi_index; - - unsigned int vpi_curdep; - struct lu_object_header *vpi_obj; -}; - struct vvp_seq_private { struct ll_sb_info *vsp_sbi; struct lu_env *vsp_env; u16 vsp_refcheck; struct cl_object *vsp_clob; - struct vvp_pgcache_id vsp_id; + struct rhashtable_iter vsp_iter; + u32 vsp_page_index; /* * prev_pos is the 'pos' of the last object returned * by ->start of ->next. @@ -383,81 +375,43 @@ struct vvp_seq_private { loff_t vsp_prev_pos; }; -static int vvp_pgcache_obj_get(struct cfs_hash *hs, struct cfs_hash_bd *bd, - struct hlist_node *hnode, void *data) -{ - struct vvp_pgcache_id *id = data; - struct lu_object_header *hdr = cfs_hash_object(hs, hnode); - - if (lu_object_is_dying(hdr)) - return 1; - - if (id->vpi_curdep-- > 0) - return 0; /* continue */ - - cfs_hash_get(hs, hnode); - id->vpi_obj = hdr; - return 1; -} - -static struct cl_object *vvp_pgcache_obj(const struct lu_env *env, - struct lu_device *dev, - struct vvp_pgcache_id *id) -{ - LASSERT(lu_device_is_cl(dev)); - - id->vpi_obj = NULL; - id->vpi_curdep = id->vpi_depth; - - cfs_hash_hlist_for_each(dev->ld_site->ls_obj_hash, id->vpi_bucket, - vvp_pgcache_obj_get, id); - if (id->vpi_obj) { - struct lu_object *lu_obj; - - lu_obj = lu_object_locate(id->vpi_obj, dev->ld_type); - if (lu_obj) { - lu_object_ref_add(lu_obj, "dump", current); - return lu2cl(lu_obj); - } - lu_object_put(env, lu_object_top(id->vpi_obj)); - } - return NULL; -} - static struct page *vvp_pgcache_current(struct vvp_seq_private *priv) { struct lu_device *dev = &priv->vsp_sbi->ll_cl->cd_lu_dev; + struct lu_object_header *h; + struct page *vmpage = NULL; - while (1) { + rhashtable_walk_start(&priv->vsp_iter); + while ((h = rhashtable_walk_next(&priv->vsp_iter)) != NULL) { struct inode *inode; - struct page *vmpage; int nr; if (!priv->vsp_clob) { - struct cl_object *clob; - - while ((clob = vvp_pgcache_obj(priv->vsp_env, dev, &priv->vsp_id)) == NULL && - ++(priv->vsp_id.vpi_bucket) < CFS_HASH_NHLIST(dev->ld_site->ls_obj_hash)) - priv->vsp_id.vpi_depth = 0; - if (!clob) - return NULL; - priv->vsp_clob = clob; - priv->vsp_id.vpi_index = 0; + struct lu_object *lu_obj; + + lu_obj = lu_object_get_first(h, dev); + if (!lu_obj) + continue; + + priv->vsp_clob = lu2cl(lu_obj); + lu_object_ref_add(lu_obj, "dump", current); + priv->vsp_page_index = 0; } inode = vvp_object_inode(priv->vsp_clob); nr = find_get_pages_contig(inode->i_mapping, - priv->vsp_id.vpi_index, 1, &vmpage); + priv->vsp_page_index, 1, &vmpage); if (nr > 0) { - priv->vsp_id.vpi_index = vmpage->index; - return vmpage; + priv->vsp_page_index = vmpage->index; + break; } lu_object_ref_del(&priv->vsp_clob->co_lu, "dump", current); cl_object_put(priv->vsp_env, 
priv->vsp_clob); priv->vsp_clob = NULL; - priv->vsp_id.vpi_index = 0; - priv->vsp_id.vpi_depth++; + priv->vsp_page_index = 0; } + rhashtable_walk_stop(&priv->vsp_iter); + return vmpage; } #define seq_page_flag(seq, page, flag, has_flags) do { \ @@ -521,7 +475,10 @@ static int vvp_pgcache_show(struct seq_file *f, void *v) static void vvp_pgcache_rewind(struct vvp_seq_private *priv) { if (priv->vsp_prev_pos) { - memset(&priv->vsp_id, 0, sizeof(priv->vsp_id)); + struct lu_site *s = priv->vsp_sbi->ll_cl->cd_lu_dev.ld_site; + + rhashtable_walk_exit(&priv->vsp_iter); + rhashtable_walk_enter(&s->ls_obj_hash, &priv->vsp_iter); priv->vsp_prev_pos = 0; if (priv->vsp_clob) { lu_object_ref_del(&priv->vsp_clob->co_lu, "dump", @@ -534,7 +491,7 @@ static void vvp_pgcache_rewind(struct vvp_seq_private *priv) static struct page *vvp_pgcache_next_page(struct vvp_seq_private *priv) { - priv->vsp_id.vpi_index += 1; + priv->vsp_page_index += 1; return vvp_pgcache_current(priv); } @@ -548,7 +505,7 @@ static void *vvp_pgcache_start(struct seq_file *f, loff_t *pos) /* Return the current item */; } else { WARN_ON(*pos != priv->vsp_prev_pos + 1); - priv->vsp_id.vpi_index += 1; + priv->vsp_page_index += 1; } priv->vsp_prev_pos = *pos; @@ -580,6 +537,7 @@ static void vvp_pgcache_stop(struct seq_file *f, void *v) static int vvp_dump_pgcache_seq_open(struct inode *inode, struct file *filp) { struct vvp_seq_private *priv; + struct lu_site *s; priv = __seq_open_private(filp, &vvp_pgcache_ops, sizeof(*priv)); if (!priv) @@ -588,13 +546,16 @@ static int vvp_dump_pgcache_seq_open(struct inode *inode, struct file *filp) priv->vsp_sbi = inode->i_private; priv->vsp_env = cl_env_get(&priv->vsp_refcheck); priv->vsp_clob = NULL; - memset(&priv->vsp_id, 0, sizeof(priv->vsp_id)); if (IS_ERR(priv->vsp_env)) { int err = PTR_ERR(priv->vsp_env); seq_release_private(inode, filp); return err; } + + s = priv->vsp_sbi->ll_cl->cd_lu_dev.ld_site; + rhashtable_walk_enter(&s->ls_obj_hash, &priv->vsp_iter); + return 0; } @@ -607,8 +568,8 @@ static int vvp_dump_pgcache_seq_release(struct inode *inode, struct file *file) lu_object_ref_del(&priv->vsp_clob->co_lu, "dump", current); cl_object_put(priv->vsp_env, priv->vsp_clob); } - cl_env_put(priv->vsp_env, &priv->vsp_refcheck); + rhashtable_walk_exit(&priv->vsp_iter); return seq_release_private(inode, file); } diff --git a/fs/lustre/lov/lovsub_dev.c b/fs/lustre/lov/lovsub_dev.c index 69380fc..0555737 100644 --- a/fs/lustre/lov/lovsub_dev.c +++ b/fs/lustre/lov/lovsub_dev.c @@ -88,10 +88,7 @@ static struct lu_device *lovsub_device_free(const struct lu_env *env, struct lovsub_device *lsd = lu2lovsub_dev(d); struct lu_device *next = cl2lu_dev(lsd->acid_next); - if (atomic_read(&d->ld_ref) && d->ld_site) { - LIBCFS_DEBUG_MSG_DATA_DECL(msgdata, D_ERROR, NULL); - lu_site_print(env, d->ld_site, &msgdata, lu_cdebug_printer); - } + lu_site_print(env, d->ld_site, &d->ld_ref, D_ERROR, lu_cdebug_printer); cl_device_fini(lu2cl_dev(d)); kfree(lsd); return next; diff --git a/fs/lustre/obdclass/lu_object.c b/fs/lustre/obdclass/lu_object.c index ec3f6a3..5cd8231 100644 --- a/fs/lustre/obdclass/lu_object.c +++ b/fs/lustre/obdclass/lu_object.c @@ -41,12 +41,11 @@ #define DEBUG_SUBSYSTEM S_CLASS +#include #include #include #include -/* hash_long() */ -#include #include #include #include @@ -85,12 +84,10 @@ enum { #define LU_CACHE_NR_MAX_ADJUST 512 #define LU_CACHE_NR_UNLIMITED -1 #define LU_CACHE_NR_DEFAULT LU_CACHE_NR_UNLIMITED -#define LU_CACHE_NR_LDISKFS_LIMIT LU_CACHE_NR_UNLIMITED -#define LU_CACHE_NR_ZFS_LIMIT 256 
-#define LU_SITE_BITS_MIN 12 -#define LU_SITE_BITS_MAX 24 -#define LU_SITE_BITS_MAX_CL 19 +#define LU_CACHE_NR_MIN 4096 +#define LU_CACHE_NR_MAX 0x80000000UL + /** * Max 256 buckets, we don't want too many buckets because: * - consume too much memory (currently max 16K) @@ -111,7 +108,7 @@ enum { static void lu_object_free(const struct lu_env *env, struct lu_object *o); static u32 ls_stats_read(struct lprocfs_stats *stats, int idx); -static u32 lu_fid_hash(const void *data, u32 seed) +static u32 lu_fid_hash(const void *data, u32 len, u32 seed) { const struct lu_fid *fid = data; @@ -120,9 +117,17 @@ static u32 lu_fid_hash(const void *data, u32 seed) return seed; } +static const struct rhashtable_params obj_hash_params = { + .key_len = sizeof(struct lu_fid), + .key_offset = offsetof(struct lu_object_header, loh_fid), + .head_offset = offsetof(struct lu_object_header, loh_hash), + .hashfn = lu_fid_hash, + .automatic_shrinking = true, +}; + static inline int lu_bkt_hash(struct lu_site *s, const struct lu_fid *fid) { - return lu_fid_hash(fid, s->ls_bkt_seed) & + return lu_fid_hash(fid, sizeof(*fid), s->ls_bkt_seed) & (s->ls_bkt_cnt - 1); } @@ -147,9 +152,7 @@ void lu_object_put(const struct lu_env *env, struct lu_object *o) struct lu_object_header *top = o->lo_header; struct lu_site *site = o->lo_dev->ld_site; struct lu_object *orig = o; - struct cfs_hash_bd bd; const struct lu_fid *fid = lu_object_fid(o); - bool is_dying; /* * till we have full fids-on-OST implemented anonymous objects @@ -157,7 +160,6 @@ void lu_object_put(const struct lu_env *env, struct lu_object *o) * so we should not remove it from the site. */ if (fid_is_zero(fid)) { - LASSERT(!top->loh_hash.next && !top->loh_hash.pprev); LASSERT(list_empty(&top->loh_lru)); if (!atomic_dec_and_test(&top->loh_ref)) return; @@ -169,40 +171,45 @@ void lu_object_put(const struct lu_env *env, struct lu_object *o) return; } - cfs_hash_bd_get(site->ls_obj_hash, &top->loh_fid, &bd); - - is_dying = lu_object_is_dying(top); - if (!cfs_hash_bd_dec_and_lock(site->ls_obj_hash, &bd, &top->loh_ref)) { - /* at this point the object reference is dropped and lock is + bkt = &site->ls_bkts[lu_bkt_hash(site, &top->loh_fid)]; + if (atomic_add_unless(&top->loh_ref, -1, 1)) { +still_active: + /* + * At this point the object reference is dropped and lock is * not taken, so lu_object should not be touched because it - * can be freed by concurrent thread. Use local variable for - * check. + * can be freed by concurrent thread. + * + * Somebody may be waiting for this, currently only used for + * cl_object, see cl_object_put_last(). */ - if (is_dying) { - /* - * somebody may be waiting for this, currently only - * used for cl_object, see cl_object_put_last(). - */ - bkt = &site->ls_bkts[lu_bkt_hash(site, &top->loh_fid)]; - wake_up_all(&bkt->lsb_waitq); - } + wake_up(&bkt->lsb_waitq); + return; } + spin_lock(&bkt->lsb_waitq.lock); + if (!atomic_dec_and_test(&top->loh_ref)) { + spin_unlock(&bkt->lsb_waitq.lock); + goto still_active; + } + /* - * When last reference is released, iterate over object - * layers, and notify them that object is no longer busy. + * Refcount is zero, and cannot be incremented without taking the bkt + * lock, so object is stable. + */ + + /* + * When last reference is released, iterate over object layers, and + * notify them that object is no longer busy. 
*/ list_for_each_entry_reverse(o, &top->loh_layers, lo_linkage) { if (o->lo_ops->loo_object_release) o->lo_ops->loo_object_release(env, o); } - bkt = &site->ls_bkts[lu_bkt_hash(site, &top->loh_fid)]; - spin_lock(&bkt->lsb_waitq.lock); - - /* don't use local 'is_dying' here because if was taken without lock - * but here we need the latest actual value of it so check lu_object + /* + * Don't use local 'is_dying' here because if was taken without lock but + * here we need the latest actual value of it so check lu_object * directly here. */ if (!lu_object_is_dying(top)) { @@ -210,26 +217,26 @@ void lu_object_put(const struct lu_env *env, struct lu_object *o) list_add_tail(&top->loh_lru, &bkt->lsb_lru); spin_unlock(&bkt->lsb_waitq.lock); percpu_counter_inc(&site->ls_lru_len_counter); - CDEBUG(D_INODE, "Add %p/%p to site lru. hash: %p, bkt: %p\n", - orig, top, site->ls_obj_hash, bkt); - cfs_hash_bd_unlock(site->ls_obj_hash, &bd, 1); + CDEBUG(D_INODE, "Add %p/%p to site lru. bkt: %p\n", + orig, top, bkt); return; } /* - * If object is dying (will not be cached), then removed it - * from hash table (it is already not on the LRU). + * If object is dying (will not be cached) then removed it from hash + * table (it is already not on the LRU). * - * This is done with hash table lists locked. As the only - * way to acquire first reference to previously unreferenced - * object is through hash-table lookup (lu_object_find()) - * which is done under hash-table, no race with concurrent - * object lookup is possible and we can safely destroy object below. + * This is done with bucket lock held. As the only way to acquire first + * reference to previously unreferenced object is through hash-table + * lookup (lu_object_find()) which takes the lock for first reference, + * no race with concurrent object lookup is possible and we can safely + * destroy object below. */ if (!test_and_set_bit(LU_OBJECT_UNHASHED, &top->loh_flags)) - cfs_hash_bd_del_locked(site->ls_obj_hash, &bd, &top->loh_hash); + rhashtable_remove_fast(&site->ls_obj_hash, &top->loh_hash, + obj_hash_params); + spin_unlock(&bkt->lsb_waitq.lock); - cfs_hash_bd_unlock(site->ls_obj_hash, &bd, 1); /* Object was already removed from hash above, can kill it. 
*/ lu_object_free(env, orig); } @@ -247,21 +254,19 @@ void lu_object_unhash(const struct lu_env *env, struct lu_object *o) set_bit(LU_OBJECT_HEARD_BANSHEE, &top->loh_flags); if (!test_and_set_bit(LU_OBJECT_UNHASHED, &top->loh_flags)) { struct lu_site *site = o->lo_dev->ld_site; - struct cfs_hash *obj_hash = site->ls_obj_hash; - struct cfs_hash_bd bd; + struct rhashtable *obj_hash = &site->ls_obj_hash; + struct lu_site_bkt_data *bkt; - cfs_hash_bd_get_and_lock(obj_hash, &top->loh_fid, &bd, 1); + bkt = &site->ls_bkts[lu_bkt_hash(site, &top->loh_fid)]; + spin_lock(&bkt->lsb_waitq.lock); if (!list_empty(&top->loh_lru)) { - struct lu_site_bkt_data *bkt; - - bkt = &site->ls_bkts[lu_bkt_hash(site, &top->loh_fid)]; - spin_lock(&bkt->lsb_waitq.lock); list_del_init(&top->loh_lru); - spin_unlock(&bkt->lsb_waitq.lock); percpu_counter_dec(&site->ls_lru_len_counter); } - cfs_hash_bd_del_locked(obj_hash, &bd, &top->loh_hash); - cfs_hash_bd_unlock(obj_hash, &bd, 1); + spin_unlock(&bkt->lsb_waitq.lock); + + rhashtable_remove_fast(obj_hash, &top->loh_hash, + obj_hash_params); } } EXPORT_SYMBOL(lu_object_unhash); @@ -445,11 +450,9 @@ int lu_site_purge_objects(const struct lu_env *env, struct lu_site *s, LINVRNT(lu_bkt_hash(s, &h->loh_fid) == i); - /* Cannot remove from hash under current spinlock, - * so set flag to stop object from being found - * by htable_lookup(). - */ - set_bit(LU_OBJECT_PURGING, &h->loh_flags); + set_bit(LU_OBJECT_UNHASHED, &h->loh_flags); + rhashtable_remove_fast(&s->ls_obj_hash, &h->loh_hash, + obj_hash_params); list_move(&h->loh_lru, &dispose); percpu_counter_dec(&s->ls_lru_len_counter); if (did_sth == 0) @@ -470,7 +473,6 @@ int lu_site_purge_objects(const struct lu_env *env, struct lu_site *s, while ((h = list_first_entry_or_null(&dispose, struct lu_object_header, loh_lru)) != NULL) { - cfs_hash_del(s->ls_obj_hash, &h->loh_fid, &h->loh_hash); list_del_init(&h->loh_lru); lu_object_free(env, lu_object_top(h)); lprocfs_counter_incr(s->ls_stats, LU_SS_LRU_PURGED); @@ -582,9 +584,9 @@ void lu_object_header_print(const struct lu_env *env, void *cookie, (*printer)(env, cookie, "header@%p[%#lx, %d, " DFID "%s%s%s]", hdr, hdr->loh_flags, atomic_read(&hdr->loh_ref), PFID(&hdr->loh_fid), - hlist_unhashed(&hdr->loh_hash) ? "" : " hash", - list_empty((struct list_head *)&hdr->loh_lru) ? \ - "" : " lru", + test_bit(LU_OBJECT_UNHASHED, + &hdr->loh_flags) ? "" : " hash", + list_empty(&hdr->loh_lru) ? "" : " lru", hdr->loh_attr & LOHA_EXISTS ? " exist":""); } EXPORT_SYMBOL(lu_object_header_print); @@ -621,54 +623,94 @@ void lu_object_print(const struct lu_env *env, void *cookie, EXPORT_SYMBOL(lu_object_print); /* - * NOTE: htable_lookup() is called with the relevant - * hash bucket locked, but might drop and re-acquire the lock. + * Limit the lu_object cache to a maximum of lu_cache_nr objects. Because the + * calculation for the number of objects to reclaim is not covered by a lock the + * maximum number of objects is capped by LU_CACHE_MAX_ADJUST. This ensures + * that many concurrent threads will not accidentally purge the entire cache. 
*/ -static struct lu_object *htable_lookup(struct lu_site *s, - struct cfs_hash_bd *bd, +static void lu_object_limit(const struct lu_env *env, + struct lu_device *dev) +{ + u64 size, nr; + + if (lu_cache_nr == LU_CACHE_NR_UNLIMITED) + return; + + size = atomic_read(&dev->ld_site->ls_obj_hash.nelems); + nr = (u64)lu_cache_nr; + if (size <= nr) + return; + + lu_site_purge_objects(env, dev->ld_site, + min_t(u64, size - nr, LU_CACHE_NR_MAX_ADJUST), + false); +} + +static struct lu_object *htable_lookup(const struct lu_env *env, + struct lu_device *dev, + struct lu_site_bkt_data *bkt, const struct lu_fid *f, - u64 *version) + struct lu_object_header *new) { + struct lu_site *s = dev->ld_site; struct lu_object_header *h; - struct hlist_node *hnode; - u64 ver = cfs_hash_bd_version_get(bd); - if (*version == ver) +try_again: + rcu_read_lock(); + if (new) + h = rhashtable_lookup_get_insert_fast(&s->ls_obj_hash, + &new->loh_hash, + obj_hash_params); + else + h = rhashtable_lookup(&s->ls_obj_hash, f, obj_hash_params); + if (IS_ERR_OR_NULL(h)) { + /* Not found */ + if (!new) + lprocfs_counter_incr(s->ls_stats, LU_SS_CACHE_MISS); + rcu_read_unlock(); + if (PTR_ERR(h) == -ENOMEM) { + msleep(20); + goto try_again; + } + lu_object_limit(env, dev); + if (PTR_ERR(h) == -E2BIG) + goto try_again; + return ERR_PTR(-ENOENT); + } - *version = ver; - /* cfs_hash_bd_peek_locked is a somehow "internal" function - * of cfs_hash, it doesn't add refcount on object. - */ - hnode = cfs_hash_bd_peek_locked(s->ls_obj_hash, bd, (void *)f); - if (!hnode) { + if (atomic_inc_not_zero(&h->loh_ref)) { + rcu_read_unlock(); + return lu_object_top(h); + } + + spin_lock(&bkt->lsb_waitq.lock); + if (lu_object_is_dying(h) || + test_bit(LU_OBJECT_UNHASHED, &h->loh_flags)) { + spin_unlock(&bkt->lsb_waitq.lock); + rcu_read_unlock(); + if (new) { + /* + * Old object might have already been removed, or will + * be soon. We need to insert our new object, so + * remove the old one just in case it is still there. + */ + rhashtable_remove_fast(&s->ls_obj_hash, &h->loh_hash, + obj_hash_params); + goto try_again; + } lprocfs_counter_incr(s->ls_stats, LU_SS_CACHE_MISS); return ERR_PTR(-ENOENT); } + /* Now protected by spinlock */ + rcu_read_unlock(); - h = container_of(hnode, struct lu_object_header, loh_hash); if (!list_empty(&h->loh_lru)) { - struct lu_site_bkt_data *bkt; - - bkt = &s->ls_bkts[lu_bkt_hash(s, &h->loh_fid)]; - spin_lock(&bkt->lsb_waitq.lock); - /* Might have just been moved to the dispose list, in which - * case LU_OBJECT_PURGING will be set. In that case, - * delete it from the hash table immediately. - * When lu_site_purge_objects() tried, it will find it - * isn't there, which is harmless. - */ - if (test_bit(LU_OBJECT_PURGING, &h->loh_flags)) { - spin_unlock(&bkt->lsb_waitq.lock); - cfs_hash_bd_del_locked(s->ls_obj_hash, bd, hnode); - lprocfs_counter_incr(s->ls_stats, LU_SS_CACHE_MISS); - return ERR_PTR(-ENOENT); - } list_del_init(&h->loh_lru); - spin_unlock(&bkt->lsb_waitq.lock); percpu_counter_dec(&s->ls_lru_len_counter); } - cfs_hash_get(s->ls_obj_hash, hnode); + atomic_inc(&h->loh_ref); + spin_unlock(&bkt->lsb_waitq.lock); lprocfs_counter_incr(s->ls_stats, LU_SS_CACHE_HIT); return lu_object_top(h); } @@ -687,28 +729,37 @@ static struct lu_object *lu_object_find(const struct lu_env *env, } /* - * Limit the lu_object cache to a maximum of lu_cache_nr objects. Because - * the calculation for the number of objects to reclaim is not covered by - * a lock the maximum number of objects is capped by LU_CACHE_MAX_ADJUST. 
- * This ensures that many concurrent threads will not accidentally purge - * the entire cache. + * Get a 'first' reference to an object that was found while looking through the + * hash table. */ -static void lu_object_limit(const struct lu_env *env, struct lu_device *dev) +struct lu_object *lu_object_get_first(struct lu_object_header *h, + struct lu_device *dev) { - u64 size, nr; + struct lu_site *s = dev->ld_site; + struct lu_object *ret; - if (lu_cache_nr == LU_CACHE_NR_UNLIMITED) - return; + if (IS_ERR_OR_NULL(h) || lu_object_is_dying(h)) + return NULL; - size = cfs_hash_size_get(dev->ld_site->ls_obj_hash); - nr = (u64)lu_cache_nr; - if (size <= nr) - return; + ret = lu_object_locate(h, dev->ld_type); + if (!ret) + return ret; - lu_site_purge_objects(env, dev->ld_site, - min_t(u64, size - nr, LU_CACHE_NR_MAX_ADJUST), - false); + if (!atomic_inc_not_zero(&h->loh_ref)) { + struct lu_site_bkt_data *bkt; + + bkt = &s->ls_bkts[lu_bkt_hash(s, &h->loh_fid)]; + spin_lock(&bkt->lsb_waitq.lock); + if (!lu_object_is_dying(h) && + !test_bit(LU_OBJECT_UNHASHED, &h->loh_flags)) + atomic_inc(&h->loh_ref); + else + ret = NULL; + spin_unlock(&bkt->lsb_waitq.lock); + } + return ret; } +EXPORT_SYMBOL(lu_object_get_first); /** * Core logic of lu_object_find*() functions. @@ -725,10 +776,8 @@ struct lu_object *lu_object_find_at(const struct lu_env *env, struct lu_object *o; struct lu_object *shadow; struct lu_site *s; - struct cfs_hash *hs; - struct cfs_hash_bd bd; struct lu_site_bkt_data *bkt; - u64 version = 0; + struct rhashtable *hs; int rc; /* @@ -750,16 +799,13 @@ struct lu_object *lu_object_find_at(const struct lu_env *env, * */ s = dev->ld_site; - hs = s->ls_obj_hash; + hs = &s->ls_obj_hash; if (unlikely(OBD_FAIL_PRECHECK(OBD_FAIL_OBD_ZERO_NLINK_RACE))) lu_site_purge(env, s, -1); bkt = &s->ls_bkts[lu_bkt_hash(s, f)]; - cfs_hash_bd_get(hs, f, &bd); if (!(conf && conf->loc_flags & LOC_F_NEW)) { - cfs_hash_bd_lock(hs, &bd, 1); - o = htable_lookup(s, &bd, f, &version); - cfs_hash_bd_unlock(hs, &bd, 1); + o = htable_lookup(env, dev, bkt, f, NULL); if (!IS_ERR(o)) { if (likely(lu_object_is_inited(o->lo_header))) @@ -795,29 +841,31 @@ struct lu_object *lu_object_find_at(const struct lu_env *env, CFS_RACE_WAIT(OBD_FAIL_OBD_ZERO_NLINK_RACE); - cfs_hash_bd_lock(hs, &bd, 1); - - if (conf && conf->loc_flags & LOC_F_NEW) - shadow = ERR_PTR(-ENOENT); - else - shadow = htable_lookup(s, &bd, f, &version); + if (conf && conf->loc_flags & LOC_F_NEW) { + int status = rhashtable_insert_fast(hs, &o->lo_header->loh_hash, + obj_hash_params); + if (status) + /* Strange error - go the slow way */ + shadow = htable_lookup(env, dev, bkt, f, o->lo_header); + else + shadow = ERR_PTR(-ENOENT); + } else { + shadow = htable_lookup(env, dev, bkt, f, o->lo_header); + } if (likely(PTR_ERR(shadow) == -ENOENT)) { - cfs_hash_bd_add_locked(hs, &bd, &o->lo_header->loh_hash); - cfs_hash_bd_unlock(hs, &bd, 1); - /* + * The new object has been successfully inserted. + * * This may result in rather complicated operations, including * fld queries, inode loading, etc. 
*/ rc = lu_object_start(env, dev, o, conf); if (rc) { - set_bit(LU_OBJECT_HEARD_BANSHEE, - &o->lo_header->loh_flags); lu_object_put(env, o); return ERR_PTR(rc); } - wake_up_all(&bkt->lsb_waitq); + wake_up(&bkt->lsb_waitq); lu_object_limit(env, dev); @@ -825,10 +873,10 @@ struct lu_object *lu_object_find_at(const struct lu_env *env, } lprocfs_counter_incr(s->ls_stats, LU_SS_CACHE_RACE); - cfs_hash_bd_unlock(hs, &bd, 1); lu_object_free(env, o); if (!(conf && conf->loc_flags & LOC_F_NEW) && + !IS_ERR(shadow) && !lu_object_is_inited(shadow->lo_header)) { wait_event_idle(bkt->lsb_waitq, lu_object_is_inited(shadow->lo_header) || @@ -906,14 +954,9 @@ struct lu_site_print_arg { lu_printer_t lsp_printer; }; -static int -lu_site_obj_print(struct cfs_hash *hs, struct cfs_hash_bd *bd, - struct hlist_node *hnode, void *data) +static void +lu_site_obj_print(struct lu_object_header *h, struct lu_site_print_arg *arg) { - struct lu_site_print_arg *arg = (struct lu_site_print_arg *)data; - struct lu_object_header *h; - - h = hlist_entry(hnode, struct lu_object_header, loh_hash); if (!list_empty(&h->loh_layers)) { const struct lu_object *o; @@ -924,36 +967,45 @@ struct lu_site_print_arg { lu_object_header_print(arg->lsp_env, arg->lsp_cookie, arg->lsp_printer, h); } - return 0; } /** * Print all objects in @s. */ -void lu_site_print(const struct lu_env *env, struct lu_site *s, void *cookie, - lu_printer_t printer) +void lu_site_print(const struct lu_env *env, struct lu_site *s, atomic_t *ref, + int msg_flag, lu_printer_t printer) { struct lu_site_print_arg arg = { .lsp_env = (struct lu_env *)env, - .lsp_cookie = cookie, .lsp_printer = printer, }; + struct rhashtable_iter iter; + struct lu_object_header *h; + LIBCFS_DEBUG_MSG_DATA_DECL(msgdata, msg_flag, NULL); + + if (!s || !atomic_read(ref)) + return; - cfs_hash_for_each(s->ls_obj_hash, lu_site_obj_print, &arg); + arg.lsp_cookie = (void *)&msgdata; + + rhashtable_walk_enter(&s->ls_obj_hash, &iter); + rhashtable_walk_start(&iter); + while ((h = rhashtable_walk_next(&iter)) != NULL) { + if (IS_ERR(h)) + continue; + lu_site_obj_print(h, &arg); + } + rhashtable_walk_stop(&iter); + rhashtable_walk_exit(&iter); } EXPORT_SYMBOL(lu_site_print); /** * Return desired hash table order. 
*/ -static unsigned long lu_htable_order(struct lu_device *top) +static void lu_htable_limits(struct lu_device *top) { - unsigned long bits_max = LU_SITE_BITS_MAX; unsigned long cache_size; - unsigned long bits; - - if (!strcmp(top->ld_type->ldt_name, LUSTRE_VVP_NAME)) - bits_max = LU_SITE_BITS_MAX_CL; /* * Calculate hash table size, assuming that we want reasonable @@ -979,75 +1031,12 @@ static unsigned long lu_htable_order(struct lu_device *top) lu_cache_percent = LU_CACHE_PERCENT_DEFAULT; } cache_size = cache_size / 100 * lu_cache_percent * - (PAGE_SIZE / 1024); - - for (bits = 1; (1 << bits) < cache_size; ++bits) - ; - return clamp_t(typeof(bits), bits, LU_SITE_BITS_MIN, bits_max); -} - -static unsigned int lu_obj_hop_hash(struct cfs_hash *hs, - const void *key, unsigned int mask) -{ - struct lu_fid *fid = (struct lu_fid *)key; - u32 hash; + (PAGE_SIZE / 1024); - hash = fid_flatten32(fid); - hash += (hash >> 4) + (hash << 12); /* mixing oid and seq */ - hash = hash_long(hash, hs->hs_bkt_bits); - - /* give me another random factor */ - hash -= hash_long((unsigned long)hs, fid_oid(fid) % 11 + 3); - - hash <<= hs->hs_cur_bits - hs->hs_bkt_bits; - hash |= (fid_seq(fid) + fid_oid(fid)) & (CFS_HASH_NBKT(hs) - 1); - - return hash & mask; -} - -static void *lu_obj_hop_object(struct hlist_node *hnode) -{ - return hlist_entry(hnode, struct lu_object_header, loh_hash); -} - -static void *lu_obj_hop_key(struct hlist_node *hnode) -{ - struct lu_object_header *h; - - h = hlist_entry(hnode, struct lu_object_header, loh_hash); - return &h->loh_fid; -} - -static int lu_obj_hop_keycmp(const void *key, struct hlist_node *hnode) -{ - struct lu_object_header *h; - - h = hlist_entry(hnode, struct lu_object_header, loh_hash); - return lu_fid_eq(&h->loh_fid, (struct lu_fid *)key); -} - -static void lu_obj_hop_get(struct cfs_hash *hs, struct hlist_node *hnode) -{ - struct lu_object_header *h; - - h = hlist_entry(hnode, struct lu_object_header, loh_hash); - atomic_inc(&h->loh_ref); + lu_cache_nr = clamp_t(typeof(cache_size), cache_size, + LU_CACHE_NR_MIN, LU_CACHE_NR_MAX); } -static void lu_obj_hop_put_locked(struct cfs_hash *hs, struct hlist_node *hnode) -{ - LBUG(); /* we should never called it */ -} - -static struct cfs_hash_ops lu_site_hash_ops = { - .hs_hash = lu_obj_hop_hash, - .hs_key = lu_obj_hop_key, - .hs_keycmp = lu_obj_hop_keycmp, - .hs_object = lu_obj_hop_object, - .hs_get = lu_obj_hop_get, - .hs_put_locked = lu_obj_hop_put_locked, -}; - static void lu_dev_add_linkage(struct lu_site *s, struct lu_device *d) { spin_lock(&s->ls_ld_lock); @@ -1062,35 +1051,19 @@ static void lu_dev_add_linkage(struct lu_site *s, struct lu_device *d) int lu_site_init(struct lu_site *s, struct lu_device *top) { struct lu_site_bkt_data *bkt; - unsigned long bits; - unsigned long i; - char name[16]; + unsigned int i; int rc; memset(s, 0, sizeof(*s)); mutex_init(&s->ls_purge_mutex); + lu_htable_limits(top); rc = percpu_counter_init(&s->ls_lru_len_counter, 0, GFP_NOFS); if (rc) return -ENOMEM; - snprintf(name, sizeof(name), "lu_site_%s", top->ld_type->ldt_name); - for (bits = lu_htable_order(top); bits >= LU_SITE_BITS_MIN; bits--) { - s->ls_obj_hash = cfs_hash_create(name, bits, bits, - bits - LU_SITE_BKT_BITS, - 0, 0, 0, - &lu_site_hash_ops, - CFS_HASH_SPIN_BKTLOCK | - CFS_HASH_NO_ITEMREF | - CFS_HASH_DEPTH | - CFS_HASH_ASSERT_EMPTY | - CFS_HASH_COUNTER); - if (s->ls_obj_hash) - break; - } - - if (!s->ls_obj_hash) { - CERROR("failed to create lu_site hash with bits: %lu\n", bits); + if (rhashtable_init(&s->ls_obj_hash, 
&obj_hash_params) != 0) { + CERROR("failed to create lu_site hash\n"); return -ENOMEM; } @@ -1101,8 +1074,7 @@ int lu_site_init(struct lu_site *s, struct lu_device *top) s->ls_bkts = kvmalloc_array(s->ls_bkt_cnt, sizeof(*bkt), GFP_KERNEL | __GFP_ZERO); if (!s->ls_bkts) { - cfs_hash_putref(s->ls_obj_hash); - s->ls_obj_hash = NULL; + rhashtable_destroy(&s->ls_obj_hash); s->ls_bkts = NULL; return -ENOMEM; } @@ -1116,9 +1088,8 @@ int lu_site_init(struct lu_site *s, struct lu_device *top) s->ls_stats = lprocfs_alloc_stats(LU_SS_LAST_STAT, 0); if (!s->ls_stats) { kvfree(s->ls_bkts); - cfs_hash_putref(s->ls_obj_hash); - s->ls_obj_hash = NULL; s->ls_bkts = NULL; + rhashtable_destroy(&s->ls_obj_hash); return -ENOMEM; } @@ -1161,13 +1132,12 @@ void lu_site_fini(struct lu_site *s) percpu_counter_destroy(&s->ls_lru_len_counter); - if (s->ls_obj_hash) { - cfs_hash_putref(s->ls_obj_hash); - s->ls_obj_hash = NULL; + if (s->ls_bkts) { + rhashtable_destroy(&s->ls_obj_hash); + kvfree(s->ls_bkts); + s->ls_bkts = NULL; } - kvfree(s->ls_bkts); - if (s->ls_top_dev) { s->ls_top_dev->ld_site = NULL; lu_ref_del(&s->ls_top_dev->ld_reference, "site-top", s); @@ -1323,7 +1293,6 @@ int lu_object_header_init(struct lu_object_header *h) { memset(h, 0, sizeof(*h)); atomic_set(&h->loh_ref, 1); - INIT_HLIST_NODE(&h->loh_hash); INIT_LIST_HEAD(&h->loh_lru); INIT_LIST_HEAD(&h->loh_layers); lu_ref_init(&h->loh_reference); @@ -1338,7 +1307,6 @@ void lu_object_header_fini(struct lu_object_header *h) { LASSERT(list_empty(&h->loh_layers)); LASSERT(list_empty(&h->loh_lru)); - LASSERT(hlist_unhashed(&h->loh_hash)); lu_ref_fini(&h->loh_reference); } EXPORT_SYMBOL(lu_object_header_fini); @@ -1933,7 +1901,7 @@ struct lu_site_stats { static void lu_site_stats_get(const struct lu_site *s, struct lu_site_stats *stats) { - int cnt = cfs_hash_size_get(s->ls_obj_hash); + int cnt = atomic_read(&s->ls_obj_hash.nelems); /* * percpu_counter_sum_positive() won't accept a const pointer * as it does modify the struct by taking a spinlock @@ -2235,16 +2203,23 @@ static u32 ls_stats_read(struct lprocfs_stats *stats, int idx) */ int lu_site_stats_print(const struct lu_site *s, struct seq_file *m) { + const struct bucket_table *tbl; struct lu_site_stats stats; + unsigned int chains; memset(&stats, 0, sizeof(stats)); lu_site_stats_get(s, &stats); - seq_printf(m, "%d/%d %d/%ld %d %d %d %d %d %d %d\n", + rcu_read_lock(); + tbl = rht_dereference_rcu(s->ls_obj_hash.tbl, + &((struct lu_site *)s)->ls_obj_hash); + chains = tbl->size; + rcu_read_unlock(); + seq_printf(m, "%d/%d %d/%u %d %d %d %d %d %d %d\n", stats.lss_busy, stats.lss_total, stats.lss_populated, - CFS_HASH_NHLIST(s->ls_obj_hash), + chains, stats.lss_max_search, ls_stats_read(s->ls_stats, LU_SS_CREATED), ls_stats_read(s->ls_stats, LU_SS_CACHE_HIT), From patchwork Wed Jul 15 20:44:48 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 11666219 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 277F51392 for ; Wed, 15 Jul 2020 20:45:48 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 113C420672 for ; Wed, 15 Jul 2020 20:45:48 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 
113C420672 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 3B42D21F6BA; Wed, 15 Jul 2020 13:45:39 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from smtp3.ccs.ornl.gov (smtp3.ccs.ornl.gov [160.91.203.39]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 19EBF21F6E3 for ; Wed, 15 Jul 2020 13:45:25 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp3.ccs.ornl.gov (Postfix) with ESMTP id 7D359477; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 7382F2BA; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Wed, 15 Jul 2020 16:44:48 -0400 Message-Id: <1594845918-29027-8-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> References: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 07/37] lustre: osc: disable ext merging for rdma only pages and non-rdma X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Wang Shilong , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Wang Shilong This patch adds logic to prevent CPU memory pages and RDMA memory pages from merging into one RPC; the code that sets OBD_BRW_RDMA_ONLY will be added later, when the RDMA-only support itself lands. WC-bug-id: https://jira.whamcloud.com/browse/LU-13180 Lustre-commit: 9f6c9fa44d6e6 ("LU-13180 osc: disable ext merging for rdma only pages and non-rdma") Signed-off-by: Wang Shilong Reviewed-on: https://review.whamcloud.com/37567 Reviewed-by: Andreas Dilger Reviewed-by: Gu Zheng Reviewed-by: Yingjin Qian Signed-off-by: James Simmons --- fs/lustre/include/lustre_osc.h | 4 +++- fs/lustre/osc/osc_cache.c | 4 ++++ 2 files changed, 7 insertions(+), 1 deletion(-) diff --git a/fs/lustre/include/lustre_osc.h b/fs/lustre/include/lustre_osc.h index 11b7e92..cd08f27 100644 --- a/fs/lustre/include/lustre_osc.h +++ b/fs/lustre/include/lustre_osc.h @@ -939,7 +939,9 @@ struct osc_extent { /* Non-delay RPC should be used for this extent. */ oe_ndelay:1, /* direct IO pages */ - oe_dio:1; + oe_dio:1, + /* this extent consists of RDMA only pages */ + oe_is_rdma_only:1; /* how many grants allocated for this extent. * Grant allocated for this extent. There is no grant allocated * for reading extents and sync write extents.
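Before the osc_cache.c half of the patch, a brief aside on the idiom at work here: the flag bit is folded into the new one-bit field with double negation, and extent merging is then gated on the bit matching. A minimal stand-alone sketch of the pattern follows; the demo_* names and flag value are hypothetical, not the Lustre API.

struct demo_extent {
	unsigned int rdma_only:1;	/* stands in for oe_is_rdma_only */
};

#define DEMO_BRW_RDMA_ONLY 0x8000	/* hypothetical flag bit */

static void demo_extent_init(struct demo_extent *ext, unsigned int brw_flags)
{
	/* "!!" collapses any non-zero masked value to exactly 1, so the
	 * result always fits in a one-bit field
	 */
	ext->rdma_only = !!(brw_flags & DEMO_BRW_RDMA_ONLY);
}

static int demo_can_merge(const struct demo_extent *a,
			  const struct demo_extent *b)
{
	/* mirrors the oe_is_rdma_only comparison added below: extents
	 * of RDMA-only pages and extents of CPU pages never share an RPC
	 */
	return a->rdma_only == b->rdma_only;
}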
diff --git a/fs/lustre/osc/osc_cache.c b/fs/lustre/osc/osc_cache.c index 474b711..f811dadb 100644 --- a/fs/lustre/osc/osc_cache.c +++ b/fs/lustre/osc/osc_cache.c @@ -1927,6 +1927,9 @@ static inline unsigned int osc_extent_chunks(const struct osc_extent *ext) if (in_rpc->oe_dio && overlapped(ext, in_rpc)) return false; + if (ext->oe_is_rdma_only != in_rpc->oe_is_rdma_only) + return false; + return true; } @@ -2688,6 +2691,7 @@ int osc_queue_sync_pages(const struct lu_env *env, const struct cl_io *io, ext->oe_srvlock = !!(brw_flags & OBD_BRW_SRVLOCK); ext->oe_ndelay = !!(brw_flags & OBD_BRW_NDELAY); ext->oe_dio = !!(brw_flags & OBD_BRW_NOCACHE); + ext->oe_is_rdma_only = !!(brw_flags & OBD_BRW_RDMA_ONLY); ext->oe_nr_pages = page_count; ext->oe_mppr = mppr; list_splice_init(list, &ext->oe_pages); From patchwork Wed Jul 15 20:44:49 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 11666213 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 69531618 for ; Wed, 15 Jul 2020 20:45:40 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 5296920672 for ; Wed, 15 Jul 2020 20:45:40 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 5296920672 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 1A0A221F868; Wed, 15 Jul 2020 13:45:35 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from smtp3.ccs.ornl.gov (smtp3.ccs.ornl.gov [160.91.203.39]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 5B1E121F6E3 for ; Wed, 15 Jul 2020 13:45:25 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp3.ccs.ornl.gov (Postfix) with ESMTP id 7E2F1478; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 779AA8D; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Wed, 15 Jul 2020 16:44:49 -0400 Message-Id: <1594845918-29027-9-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> References: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 08/37] lnet: socklnd: fix local interface binding X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Amir Shehata , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Amir Shehata When a node is configured with multiple interfaces in Multi-Rail config, socklnd was not utilizing the local interface requested by LNet. 
In essence LNet was using all the NIDs in round robin; however, the socklnd module was not binding to the correct interface. Traffic was thus sent on a subset of the interfaces. The reason is that the route interface number was not being set. In most cases lnet_connect() is called to create a socket. The socket is bound to the interface provided and then ksocknal_create_conn() is called to create the socklnd connection. ksocknal_create_conn() calls ksocknal_associate_route_conn_locked() at which point the route's local interface is assigned. However, this is already too late as the socket has already been created and bound to a local interface. Therefore, it's important to assign the route's interface before calling lnet_connect() to ensure the socket is bound to the correct local interface. To address this issue, the route's interface index is initialized to the NI's interface index when it's added to the peer_ni. Another bug fixed: the interface index was not being initialized in the startup routine. Note: We're strictly assuming that there is one interface for each NI. This is because tcp bonding will be removed from the socklnd as it has been deprecated by LNet multi-rail. WC-bug-id: https://jira.whamcloud.com/browse/LU-13566 Lustre-commit: a7c9aba5eb96d ("LU-13566 socklnd: fix local interface binding") Signed-off-by: Amir Shehata Reviewed-on: https://review.whamcloud.com/38743 Reviewed-by: Neil Brown Reviewed-by: Serguei Smirnov Reviewed-by: James Simmons Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- net/lnet/klnds/socklnd/socklnd.c | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/net/lnet/klnds/socklnd/socklnd.c b/net/lnet/klnds/socklnd/socklnd.c index 444b90b..2b8fd3d 100644 --- a/net/lnet/klnds/socklnd/socklnd.c +++ b/net/lnet/klnds/socklnd/socklnd.c @@ -409,12 +409,14 @@ struct ksock_peer_ni * { struct ksock_conn *conn; struct ksock_route *route2; + struct ksock_net *net = peer_ni->ksnp_ni->ni_data; LASSERT(!peer_ni->ksnp_closing); LASSERT(!route->ksnr_peer); LASSERT(!route->ksnr_scheduled); LASSERT(!route->ksnr_connecting); LASSERT(!route->ksnr_connected); + LASSERT(net->ksnn_ninterfaces > 0); /* LASSERT(unique) */ list_for_each_entry(route2, &peer_ni->ksnp_routes, ksnr_list) { @@ -428,6 +430,11 @@ struct ksock_peer_ni * route->ksnr_peer = peer_ni; ksocknal_peer_addref(peer_ni); + + /* set the route's interface to the current net's interface */ + route->ksnr_myiface = net->ksnn_interfaces[0].ksni_index; + net->ksnn_interfaces[0].ksni_nroutes++; + /* peer_ni's routelist takes over my ref on 'route' */ list_add_tail(&route->ksnr_list, &peer_ni->ksnp_routes); @@ -2667,6 +2674,7 @@ static int ksocknal_push(struct lnet_ni *ni, struct lnet_process_id id) net->ksnn_ninterfaces = 1; ni->ni_dev_cpt = ifaces[0].li_cpt; ksi->ksni_ipaddr = ifaces[0].li_ipaddr; + ksi->ksni_index = ksocknal_ip2index(ksi->ksni_ipaddr, ni); ksi->ksni_netmask = ifaces[0].li_netmask; strlcpy(ksi->ksni_name, ifaces[0].li_name, sizeof(ksi->ksni_name)); @@ -2706,6 +2714,8 @@ static int ksocknal_push(struct lnet_ni *ni, struct lnet_process_id id) ksi = &net->ksnn_interfaces[j]; ni->ni_dev_cpt = ifaces[j].li_cpt; ksi->ksni_ipaddr = ifaces[j].li_ipaddr; + ksi->ksni_index = + ksocknal_ip2index(ksi->ksni_ipaddr, ni); ksi->ksni_netmask = ifaces[j].li_netmask; strlcpy(ksi->ksni_name, ifaces[j].li_name, sizeof(ksi->ksni_name)); From patchwork Wed Jul 15 20:44:50 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id:
11666217 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9B9991392 for ; Wed, 15 Jul 2020 20:45:46 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 8519B20672 for ; Wed, 15 Jul 2020 20:45:46 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 8519B20672 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 72E6521F834; Wed, 15 Jul 2020 13:45:38 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from smtp3.ccs.ornl.gov (smtp3.ccs.ornl.gov [160.91.203.39]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id B2A8721F6E3 for ; Wed, 15 Jul 2020 13:45:25 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp3.ccs.ornl.gov (Postfix) with ESMTP id 80D80479; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 7A7C62BB; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Wed, 15 Jul 2020 16:44:50 -0400 Message-Id: <1594845918-29027-10-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> References: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 09/37] lnet: o2iblnd: allocate init_qp_attr on stack. X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Mr NeilBrown 'struct ib_qp_init_attr' is not so large that it cannot be allocated on the stack. It is about 100 bytes, various other functions in Linux allocate it on the stack, and the stack isn't as constrained as it once was. So allocate it on the stack instead of using kmalloc and handling allocation failures.
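As an illustration of the before/after shape of this change, here is a minimal sketch with hypothetical demo_* names, not the o2iblnd code itself:

struct demo_qp_attr {
	int max_send_wr;
	int max_recv_wr;
};

static int demo_use_attr(const struct demo_qp_attr *attr)
{
	return attr->max_send_wr + attr->max_recv_wr;
}

static int demo_setup(void)
{
	/* zero-initialized on the stack: no kzalloc, no failure path,
	 * and no kfree on every exit label
	 */
	struct demo_qp_attr attr = {};

	attr.max_send_wr = 16;
	attr.max_recv_wr = 16;
	return demo_use_attr(&attr);
}

The empty-brace initializer gives the same all-zero starting state that kzalloc provided, which is why the patch below can use "= {}" and drop the allocation entirely.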
WC-bug-id: https://jira.whamcloud.com/browse/LU-12678 Lustre-commit: 524a5a733ba1c ("LU-12678 o2iblnd: allocate init_qp_attr on stack.") Signed-off-by: Mr NeilBrown Reviewed-on: https://review.whamcloud.com/39122 Reviewed-by: Serguei Smirnov Reviewed-by: James Simmons Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- net/lnet/klnds/o2iblnd/o2iblnd.c | 45 +++++++++++++++------------------------- 1 file changed, 17 insertions(+), 28 deletions(-) diff --git a/net/lnet/klnds/o2iblnd/o2iblnd.c b/net/lnet/klnds/o2iblnd/o2iblnd.c index 16edfba..d8fca2a 100644 --- a/net/lnet/klnds/o2iblnd/o2iblnd.c +++ b/net/lnet/klnds/o2iblnd/o2iblnd.c @@ -699,7 +699,7 @@ struct kib_conn *kiblnd_create_conn(struct kib_peer_ni *peer_ni, rwlock_t *glock = &kiblnd_data.kib_global_lock; struct kib_net *net = peer_ni->ibp_ni->ni_data; struct kib_dev *dev; - struct ib_qp_init_attr *init_qp_attr; + struct ib_qp_init_attr init_qp_attr = {}; struct kib_sched_info *sched; struct ib_cq_init_attr cq_attr = {}; struct kib_conn *conn; @@ -727,18 +727,11 @@ struct kib_conn *kiblnd_create_conn(struct kib_peer_ni *peer_ni, */ cpt = sched->ibs_cpt; - init_qp_attr = kzalloc_cpt(sizeof(*init_qp_attr), GFP_NOFS, cpt); - if (!init_qp_attr) { - CERROR("Can't allocate qp_attr for %s\n", - libcfs_nid2str(peer_ni->ibp_nid)); - goto failed_0; - } - conn = kzalloc_cpt(sizeof(*conn), GFP_NOFS, cpt); if (!conn) { CERROR("Can't allocate connection for %s\n", libcfs_nid2str(peer_ni->ibp_nid)); - goto failed_1; + goto failed_0; } conn->ibc_state = IBLND_CONN_INIT; @@ -819,27 +812,27 @@ struct kib_conn *kiblnd_create_conn(struct kib_peer_ni *peer_ni, goto failed_2; } - init_qp_attr->event_handler = kiblnd_qp_event; - init_qp_attr->qp_context = conn; - init_qp_attr->cap.max_send_sge = *kiblnd_tunables.kib_wrq_sge; - init_qp_attr->cap.max_recv_sge = 1; - init_qp_attr->sq_sig_type = IB_SIGNAL_REQ_WR; - init_qp_attr->qp_type = IB_QPT_RC; - init_qp_attr->send_cq = cq; - init_qp_attr->recv_cq = cq; + init_qp_attr.event_handler = kiblnd_qp_event; + init_qp_attr.qp_context = conn; + init_qp_attr.cap.max_send_sge = *kiblnd_tunables.kib_wrq_sge; + init_qp_attr.cap.max_recv_sge = 1; + init_qp_attr.sq_sig_type = IB_SIGNAL_REQ_WR; + init_qp_attr.qp_type = IB_QPT_RC; + init_qp_attr.send_cq = cq; + init_qp_attr.recv_cq = cq; /* kiblnd_send_wrs() can change the connection's queue depth if * the maximum work requests for the device is maxed out */ - init_qp_attr->cap.max_send_wr = kiblnd_send_wrs(conn); - init_qp_attr->cap.max_recv_wr = IBLND_RECV_WRS(conn); + init_qp_attr.cap.max_send_wr = kiblnd_send_wrs(conn); + init_qp_attr.cap.max_recv_wr = IBLND_RECV_WRS(conn); - rc = rdma_create_qp(cmid, conn->ibc_hdev->ibh_pd, init_qp_attr); + rc = rdma_create_qp(cmid, conn->ibc_hdev->ibh_pd, &init_qp_attr); if (rc) { CERROR("Can't create QP: %d, send_wr: %d, recv_wr: %d, send_sge: %d, recv_sge: %d\n", - rc, init_qp_attr->cap.max_send_wr, - init_qp_attr->cap.max_recv_wr, - init_qp_attr->cap.max_send_sge, - init_qp_attr->cap.max_recv_sge); + rc, init_qp_attr.cap.max_send_wr, + init_qp_attr.cap.max_recv_wr, + init_qp_attr.cap.max_send_sge, + init_qp_attr.cap.max_recv_sge); goto failed_2; } @@ -851,8 +844,6 @@ struct kib_conn *kiblnd_create_conn(struct kib_peer_ni *peer_ni, peer_ni->ibp_queue_depth, conn->ibc_queue_depth); - kfree(init_qp_attr); - conn->ibc_rxs = kzalloc_cpt(IBLND_RX_MSGS(conn) * sizeof(*conn->ibc_rxs), GFP_NOFS, cpt); @@ -918,8 +909,6 @@ struct kib_conn *kiblnd_create_conn(struct kib_peer_ni *peer_ni, failed_2: kiblnd_destroy_conn(conn); 
kfree(conn); -failed_1: - kfree(init_qp_attr); failed_0: return NULL; } From patchwork Wed Jul 15 20:44:51 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 11666225 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7031C618 for ; Wed, 15 Jul 2020 20:45:58 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 59B812065F for ; Wed, 15 Jul 2020 20:45:58 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 59B812065F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 38EC721F77B; Wed, 15 Jul 2020 13:45:45 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from smtp3.ccs.ornl.gov (smtp3.ccs.ornl.gov [160.91.203.39]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 1503921F6E3 for ; Wed, 15 Jul 2020 13:45:26 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp3.ccs.ornl.gov (Postfix) with ESMTP id 8280F47A; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 8080C2BC; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Wed, 15 Jul 2020 16:44:51 -0400 Message-Id: <1594845918-29027-11-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> References: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 10/37] lnet: Fix some out-of-date comments. X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Mr NeilBrown The structures these comments describe have changed or been removed, but the comments weren't updated at the time. 
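For orientation, a rough stand-alone sketch of the transmit-descriptor shape the corrected comment (below) describes; the demo_* names are hypothetical, the real type is struct ksock_tx:

#include <sys/uio.h>

struct demo_frag {
	void *page;			/* bio_vec-like page fragment */
	unsigned int offset;
	unsigned int len;
};

struct demo_tx {
	struct iovec hdr;		/* exactly one iovec: the portals header */
	unsigned int nfrags;		/* zero or more payload fragments */
	struct demo_frag frags[16];
};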
WC-bug-id: https://jira.whamcloud.com/browse/LU-12678 Lustre-commit: 617ad3af720a3 ("LU-12678 lnet: Fix some out-of-date comments.") Signed-off-by: Mr NeilBrown Reviewed-on: https://review.whamcloud.com/39127 Reviewed-by: James Simmons Reviewed-by: Serguei Smirnov Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- net/lnet/klnds/socklnd/socklnd.h | 12 ++++-------- 1 file changed, 4 insertions(+), 8 deletions(-) diff --git a/net/lnet/klnds/socklnd/socklnd.h b/net/lnet/klnds/socklnd/socklnd.h index 7d49fff..0ac3637 100644 --- a/net/lnet/klnds/socklnd/socklnd.h +++ b/net/lnet/klnds/socklnd/socklnd.h @@ -255,15 +255,13 @@ struct ksock_nal_data { #define SOCKNAL_INIT_DATA 1 #define SOCKNAL_INIT_ALL 2 -/* - * A packet just assembled for transmission is represented by 1 or more - * struct iovec fragments (the first frag contains the portals header), - * followed by 0 or more struct bio_vec fragments. +/* A packet just assembled for transmission is represented by 1 + * struct iovec fragment - the portals header - followed by 0 + * or more struct bio_vec fragments. * * On the receive side, initially 1 struct iovec fragment is posted for * receive (the header). Once the header has been received, the payload is - * received into either struct iovec or struct bio_vec fragments, depending on - * what the header matched or whether the message needs forwarding. + * received into struct bio_vec fragments. */ struct ksock_conn; /* forward ref */ struct ksock_route; /* forward ref */ @@ -296,8 +294,6 @@ struct ksock_tx { /* transmit packet */ #define KSOCK_NOOP_TX_SIZE (offsetof(struct ksock_tx, tx_payload[0])) -/* network zero copy callback descriptor embedded in struct ksock_tx */ - #define SOCKNAL_RX_KSM_HEADER 1 /* reading ksock message header */ #define SOCKNAL_RX_LNET_HEADER 2 /* reading lnet message header */ #define SOCKNAL_RX_PARSE 3 /* Calling lnet_parse() */ From patchwork Wed Jul 15 20:44:52 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 11666221 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B8736618 for ; Wed, 15 Jul 2020 20:45:52 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id A1F5B20672 for ; Wed, 15 Jul 2020 20:45:52 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org A1F5B20672 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id B4C8821F844; Wed, 15 Jul 2020 13:45:41 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from smtp3.ccs.ornl.gov (smtp3.ccs.ornl.gov [160.91.203.39]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 587C421F783 for ; Wed, 15 Jul 2020 13:45:26 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp3.ccs.ornl.gov (Postfix) with ESMTP id 8518E484; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 
824192A0; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Wed, 15 Jul 2020 16:44:52 -0400 Message-Id: <1594845918-29027-12-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> References: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 11/37] lnet: socklnd: don't fall-back to tcp_sendpage. X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Mr NeilBrown sk_prot->sendpage is never NULL, so there is no need for a fallback to tcp_sendpage. WC-bug-id: https://jira.whamcloud.com/browse/LU-12678 Lustre-commit: 011d760069142 ("LU-12678 socklnd: don't fall-back to tcp_sendpage.") Signed-off-by: Mr NeilBrown Reviewed-on: https://review.whamcloud.com/39134 Reviewed-by: Shaun Tancheff Reviewed-by: James Simmons Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- net/lnet/klnds/socklnd/socklnd_lib.c | 8 ++------ 1 file changed, 2 insertions(+), 6 deletions(-) diff --git a/net/lnet/klnds/socklnd/socklnd_lib.c b/net/lnet/klnds/socklnd/socklnd_lib.c index 2adc99c..1d6cd0e 100644 --- a/net/lnet/klnds/socklnd/socklnd_lib.c +++ b/net/lnet/klnds/socklnd/socklnd_lib.c @@ -123,12 +123,8 @@ fragsize < tx->tx_resid) msgflg |= MSG_MORE; - if (sk->sk_prot->sendpage) { - rc = sk->sk_prot->sendpage(sk, page, - offset, fragsize, msgflg); - } else { - rc = tcp_sendpage(sk, page, offset, fragsize, msgflg); - } + rc = sk->sk_prot->sendpage(sk, page, + offset, fragsize, msgflg); } else { struct msghdr msg = { .msg_flags = MSG_DONTWAIT }; int i; From patchwork Wed Jul 15 20:44:53 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 11666231 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 45CB0618 for ; Wed, 15 Jul 2020 20:46:06 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 2F3C82065F for ; Wed, 15 Jul 2020 20:46:06 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 2F3C82065F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 7217521F910; Wed, 15 Jul 2020 13:45:49 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from smtp3.ccs.ornl.gov (smtp3.ccs.ornl.gov [160.91.203.39]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 9E17921F783 for ; Wed, 15 Jul 2020 13:45:26 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp3.ccs.ornl.gov (Postfix) with ESMTP id 88B55486; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) Received: by 
star.ccs.ornl.gov (Postfix, from userid 2004) id 852C72BD; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Wed, 15 Jul 2020 16:44:53 -0400 Message-Id: <1594845918-29027-13-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> References: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 12/37] lustre: ptlrpc: re-enterable signal_completed_replay() X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Mikhail Pershin , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Mikhail Pershin signal_completed_replay() can be entered concurrently while checking the imp_replay_inflight counter, so remove the assertion and handle the race instead. Fixes: 8cc7f22847 ("lustre: ptlrpc: limit rate of lock replays") WC-bug-id: https://jira.whamcloud.com/browse/LU-13600 Lustre-commit: 24451f3790503 ("LU-13600 ptlrpc: re-enterable signal_completed_replay()") Signed-off-by: Mikhail Pershin Reviewed-on: https://review.whamcloud.com/39140 Reviewed-by: Andreas Dilger Reviewed-by: Hongchao Zhang Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/ptlrpc/import.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/lustre/ptlrpc/import.c b/fs/lustre/ptlrpc/import.c index 7ec3638..1b62b81 100644 --- a/fs/lustre/ptlrpc/import.c +++ b/fs/lustre/ptlrpc/import.c @@ -1407,8 +1407,8 @@ static int signal_completed_replay(struct obd_import *imp) if (unlikely(OBD_FAIL_CHECK(OBD_FAIL_PTLRPC_FINISH_REPLAY))) return 0; - LASSERT(atomic_read(&imp->imp_replay_inflight) == 0); - atomic_inc(&imp->imp_replay_inflight); + if (!atomic_add_unless(&imp->imp_replay_inflight, 1, 1)) + return 0; req = ptlrpc_request_alloc_pack(imp, &RQF_OBD_PING, LUSTRE_OBD_VERSION, OBD_PING); From patchwork Wed Jul 15 20:44:54 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 11666235 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 116CC13A4 for ; Wed, 15 Jul 2020 20:46:13 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id EF2A52065F for ; Wed, 15 Jul 2020 20:46:12 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org EF2A52065F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id B71D121F895; Wed, 15 Jul 2020 13:45:52 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from smtp3.ccs.ornl.gov (smtp3.ccs.ornl.gov [160.91.203.39]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id E0B3F21F798 for ; Wed, 15 Jul 2020 13:45:26 -0700 (PDT)
Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp3.ccs.ornl.gov (Postfix) with ESMTP id 8A67248F; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 88AA72B5; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Wed, 15 Jul 2020 16:44:54 -0400 Message-Id: <1594845918-29027-14-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> References: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 13/37] lustre: obdclass: ensure LCT_QUIESCENT take sync X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Yang Sheng , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Yang Sheng Add locking in lu_device_init to ensure the LCT_QUIESCENT setting can be seen by other threads during parallel mounting. Also add extra checking before unsetting the flag to make sure we don't do it after the device has been started. (osd_handler.c:7730:osd_device_init0()) ASSERTION( info ) failed: (osd_handler.c:7730:osd_device_init0()) LBUG Pid: 28098, comm: mount.lustre 3.10.0-1062.9.1.el7_lustre.x86_64 Call Trace: libcfs_call_trace+0x8c/0xc0 [libcfs] lbug_with_loc+0x4c/0xa0 [libcfs] osd_device_alloc+0x778/0x8f0 [osd_ldiskfs] obd_setup+0x129/0x2f0 [obdclass] class_setup+0x48f/0x7f0 [obdclass] class_process_config+0x190f/0x2830 [obdclass] do_lcfg+0x258/0x500 [obdclass] lustre_start_simple+0x88/0x210 [obdclass] server_fill_super+0xf55/0x1890 [obdclass] lustre_fill_super+0x498/0x990 [obdclass] mount_nodev+0x4f/0xb0 lustre_mount+0x18/0x20 [obdclass] mount_fs+0x3e/0x1b0 vfs_kern_mount+0x67/0x110 do_mount+0x1ef/0xce0 SyS_mount+0x83/0xd0 system_call_fastpath+0x25/0x2a 0xffffffffffffffff Kernel panic - not syncing: LBUG WC-bug-id: https://jira.whamcloud.com/browse/LU-11814 Lustre-commit: 979f5e1db041d ("LU-11814 obdcalss: ensure LCT_QUIESCENT take sync") Signed-off-by: Yang Sheng Reviewed-on: https://review.whamcloud.com/38416 Reviewed-by: Andreas Dilger Reviewed-by: Wang Shilong Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/include/lu_object.h | 8 +++--- fs/lustre/obdclass/lu_object.c | 58 ++++++++++++++++++++++++------------------ 2 files changed, 38 insertions(+), 28 deletions(-) diff --git a/fs/lustre/include/lu_object.h b/fs/lustre/include/lu_object.h index 1a6b6e1..6c47f43 100644 --- a/fs/lustre/include/lu_object.h +++ b/fs/lustre/include/lu_object.h @@ -1151,7 +1151,8 @@ struct lu_context_key { void lu_context_key_degister(struct lu_context_key *key); void *lu_context_key_get(const struct lu_context *ctx, const struct lu_context_key *key); -void lu_context_key_quiesce(struct lu_context_key *key); +void lu_context_key_quiesce(struct lu_device_type *t, + struct lu_context_key *key); void lu_context_key_revive(struct lu_context_key *key); /* @@ -1199,7 +1200,7 @@ void *lu_context_key_get(const struct lu_context *ctx, #define LU_TYPE_STOP(mod, ...)
\ static void mod##_type_stop(struct lu_device_type *t) \ { \ - lu_context_key_quiesce_many(__VA_ARGS__, NULL); \ + lu_context_key_quiesce_many(t, __VA_ARGS__, NULL); \ } \ struct __##mod##_dummy_type_stop {; } @@ -1223,7 +1224,8 @@ void *lu_context_key_get(const struct lu_context *ctx, int lu_context_key_register_many(struct lu_context_key *k, ...); void lu_context_key_degister_many(struct lu_context_key *k, ...); void lu_context_key_revive_many(struct lu_context_key *k, ...); -void lu_context_key_quiesce_many(struct lu_context_key *k, ...); +void lu_context_key_quiesce_many(struct lu_device_type *t, + struct lu_context_key *k, ...); /* * update/clear ctx/ses tags. diff --git a/fs/lustre/obdclass/lu_object.c b/fs/lustre/obdclass/lu_object.c index 5cd8231..42bb7a6 100644 --- a/fs/lustre/obdclass/lu_object.c +++ b/fs/lustre/obdclass/lu_object.c @@ -1185,14 +1185,25 @@ void lu_device_put(struct lu_device *d) } EXPORT_SYMBOL(lu_device_put); +enum { /* Maximal number of tld slots. */ + LU_CONTEXT_KEY_NR = 40 +}; +static struct lu_context_key *lu_keys[LU_CONTEXT_KEY_NR] = { NULL, }; +static DECLARE_RWSEM(lu_key_initing); + /** * Initialize device @d of type @t. */ int lu_device_init(struct lu_device *d, struct lu_device_type *t) { - if (atomic_inc_return(&t->ldt_device_nr) == 1 && - t->ldt_ops->ldto_start) - t->ldt_ops->ldto_start(t); + if (atomic_add_unless(&t->ldt_device_nr, 1, 0) == 0) { + down_write(&lu_key_initing); + if (t->ldt_ops->ldto_start && + atomic_read(&t->ldt_device_nr) == 0) + t->ldt_ops->ldto_start(t); + atomic_inc(&t->ldt_device_nr); + up_write(&lu_key_initing); + } memset(d, 0, sizeof(*d)); atomic_set(&d->ld_ref, 0); @@ -1358,17 +1369,6 @@ void lu_stack_fini(const struct lu_env *env, struct lu_device *top) } } -enum { - /** - * Maximal number of tld slots. - */ - LU_CONTEXT_KEY_NR = 40 -}; - -static struct lu_context_key *lu_keys[LU_CONTEXT_KEY_NR] = { NULL, }; - -static DECLARE_RWSEM(lu_key_initing); - /** * Global counter incremented whenever key is registered, unregistered, * revived or quiesced. This is used to void unnecessary calls to @@ -1442,7 +1442,7 @@ void lu_context_key_degister(struct lu_context_key *key) LASSERT(atomic_read(&key->lct_used) >= 1); LINVRNT(0 <= key->lct_index && key->lct_index < ARRAY_SIZE(lu_keys)); - lu_context_key_quiesce(key); + lu_context_key_quiesce(NULL, key); key_fini(&lu_shrink_env.le_ctx, key->lct_index); @@ -1527,13 +1527,14 @@ void lu_context_key_revive_many(struct lu_context_key *k, ...) /** * Quiescent a number of keys. */ -void lu_context_key_quiesce_many(struct lu_context_key *k, ...) +void lu_context_key_quiesce_many(struct lu_device_type *t, + struct lu_context_key *k, ...) { va_list args; va_start(args, k); do { - lu_context_key_quiesce(k); + lu_context_key_quiesce(t, k); k = va_arg(args, struct lu_context_key*); } while (k); va_end(args); @@ -1564,18 +1565,22 @@ void *lu_context_key_get(const struct lu_context *ctx, * values in "shared" contexts (like service threads), when a module owning * the key is about to be unloaded. */ -void lu_context_key_quiesce(struct lu_context_key *key) +void lu_context_key_quiesce(struct lu_device_type *t, + struct lu_context_key *key) { struct lu_context *ctx; + if (key->lct_tags & LCT_QUIESCENT) + return; + /* + * The write-lock on lu_key_initing will ensure that any + * keys_fill() which didn't see LCT_QUIESCENT will have + * finished before we call key_fini(). 
+ */ + down_write(&lu_key_initing); if (!(key->lct_tags & LCT_QUIESCENT)) { - /* - * The write-lock on lu_key_initing will ensure that any - * keys_fill() which didn't see LCT_QUIESCENT will have - * finished before we call key_fini(). - */ - down_write(&lu_key_initing); - key->lct_tags |= LCT_QUIESCENT; + if (!t || atomic_read(&t->ldt_device_nr) == 0) + key->lct_tags |= LCT_QUIESCENT; up_write(&lu_key_initing); spin_lock(&lu_context_remembered_guard); @@ -1584,7 +1589,10 @@ void lu_context_key_quiesce(struct lu_context_key *key) key_fini(ctx, key->lct_index); } spin_unlock(&lu_context_remembered_guard); + + return; } + up_write(&lu_key_initing); } void lu_context_key_revive(struct lu_context_key *key) From patchwork Wed Jul 15 20:44:55 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 11666227 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A5D9E1392 for ; Wed, 15 Jul 2020 20:45:59 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 8F8422065F for ; Wed, 15 Jul 2020 20:45:59 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 8F8422065F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id C849121F885; Wed, 15 Jul 2020 13:45:45 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from smtp3.ccs.ornl.gov (smtp3.ccs.ornl.gov [160.91.203.39]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 4467E21F6BD for ; Wed, 15 Jul 2020 13:45:27 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp3.ccs.ornl.gov (Postfix) with ESMTP id 8D476490; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 8B8978D; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Wed, 15 Jul 2020 16:44:55 -0400 Message-Id: <1594845918-29027-15-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> References: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 14/37] lustre: remove some "#ifdef CONFIG*" from .c files. X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Mr NeilBrown It is Linux policy to avoid #ifdef in C files where convenient - .h files are OK. This patch defines a few inline functions which differ depending on CONFIG_LUSTRE_FS_POSIX_ACL, and removes some #ifdefs from .c files. 
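The pattern behind this cleanup deserves a quick illustration: the header supplies a real inline helper when the option is configured and an empty stub otherwise, so .c files can call it unconditionally. A minimal sketch with a made-up config symbol and demo_* names, not the actual Lustre helpers:

struct demo_ctx {
	void *cookie;
};

static inline void demo_release(void *cookie)
{
	/* hypothetical resource release */
}

#ifdef CONFIG_DEMO_FEATURE
static inline void demo_ctx_clear(struct demo_ctx *ctx)
{
	if (ctx->cookie) {
		demo_release(ctx->cookie);
		ctx->cookie = NULL;
	}
}
#else
static inline void demo_ctx_clear(struct demo_ctx *ctx)
{
	/* feature compiled out: callers still need no #ifdef */
}
#endif

This is the same shape as the lmd_clear_acl() and lli_clear_acl() helpers introduced below.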
WC-bug-id: https://jira.whamcloud.com/browse/LU-9679 Lustre-commit: f37e26964a34f ("LU-9679 lustre: remove some "#ifdef CONFIG*" from .c files.") Signed-off-by: Mr NeilBrown Reviewed-on: https://review.whamcloud.com/39131 Reviewed-by: James Simmons Reviewed-by: Jian Yu Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/include/obd.h | 21 ++++++++++++++++++++ fs/lustre/llite/llite_internal.h | 29 +++++++++++++++++++++++++++ fs/lustre/llite/llite_lib.c | 43 +++++++++------------------------------- fs/lustre/mdc/mdc_request.c | 8 +++----- 4 files changed, 62 insertions(+), 39 deletions(-) diff --git a/fs/lustre/include/obd.h b/fs/lustre/include/obd.h index 438f4ca..ad2b2f4 100644 --- a/fs/lustre/include/obd.h +++ b/fs/lustre/include/obd.h @@ -34,6 +34,8 @@ #ifndef __OBD_H #define __OBD_H +#include +#include #include #include #include @@ -930,6 +932,25 @@ struct lustre_md { struct mdt_remote_perm *remote_perm; }; +#ifdef CONFIG_LUSTRE_FS_POSIX_ACL +static inline void lmd_clear_acl(struct lustre_md *md) +{ + if (md->posix_acl) { + posix_acl_release(md->posix_acl); + md->posix_acl = NULL; + } +} + +#define OBD_CONNECT_ACL_FLAGS \ + (OBD_CONNECT_ACL | OBD_CONNECT_UMASK | OBD_CONNECT_LARGE_ACL) +#else +static inline void lmd_clear_acl(struct lustre_md *md) +{ +} + +#define OBD_CONNECT_ACL_FLAGS (0) +#endif + struct md_open_data { struct obd_client_handle *mod_och; struct ptlrpc_request *mod_open_req; diff --git a/fs/lustre/llite/llite_internal.h b/fs/lustre/llite/llite_internal.h index 2556dd8..31c528f 100644 --- a/fs/lustre/llite/llite_internal.h +++ b/fs/lustre/llite/llite_internal.h @@ -350,6 +350,35 @@ static inline void trunc_sem_up_write(struct ll_trunc_sem *sem) wake_up_var(&sem->ll_trunc_readers); } +#ifdef CONFIG_LUSTRE_FS_POSIX_ACL +static inline void lli_clear_acl(struct ll_inode_info *lli) +{ + if (lli->lli_posix_acl) { + posix_acl_release(lli->lli_posix_acl); + lli->lli_posix_acl = NULL; + } +} + +static inline void lli_replace_acl(struct ll_inode_info *lli, + struct lustre_md *md) +{ + spin_lock(&lli->lli_lock); + if (lli->lli_posix_acl) + posix_acl_release(lli->lli_posix_acl); + lli->lli_posix_acl = md->posix_acl; + spin_unlock(&lli->lli_lock); +} +#else +static inline void lli_clear_acl(struct ll_inode_info *lli) +{ +} + +static inline void lli_replace_acl(struct ll_inode_info *lli, + struct lustre_md *md) +{ +} +#endif + static inline u32 ll_layout_version_get(struct ll_inode_info *lli) { u32 gen; diff --git a/fs/lustre/llite/llite_lib.c b/fs/lustre/llite/llite_lib.c index 1a7d805..c62e182 100644 --- a/fs/lustre/llite/llite_lib.c +++ b/fs/lustre/llite/llite_lib.c @@ -265,10 +265,7 @@ static int client_common_fill_super(struct super_block *sb, char *md, char *dt) if (sbi->ll_flags & LL_SBI_LRU_RESIZE) data->ocd_connect_flags |= OBD_CONNECT_LRU_RESIZE; -#ifdef CONFIG_LUSTRE_FS_POSIX_ACL - data->ocd_connect_flags |= OBD_CONNECT_ACL | OBD_CONNECT_UMASK | - OBD_CONNECT_LARGE_ACL; -#endif + data->ocd_connect_flags |= OBD_CONNECT_ACL_FLAGS; data->ocd_cksum_types = obd_cksum_types_supported_client(); @@ -618,13 +615,8 @@ static int client_common_fill_super(struct super_block *sb, char *md, char *dt) ptlrpc_req_finished(request); if (IS_ERR(root)) { -#ifdef CONFIG_LUSTRE_FS_POSIX_ACL - if (lmd.posix_acl) { - posix_acl_release(lmd.posix_acl); - lmd.posix_acl = NULL; - } -#endif - err = -EBADF; + lmd_clear_acl(&lmd); + err = IS_ERR(root) ? 
PTR_ERR(root) : -EBADF; CERROR("lustre_lite: bad iget4 for root\n"); goto out_root; } @@ -1584,13 +1576,7 @@ void ll_clear_inode(struct inode *inode) ll_xattr_cache_destroy(inode); -#ifdef CONFIG_LUSTRE_FS_POSIX_ACL - forget_all_cached_acls(inode); - if (lli->lli_posix_acl) { - posix_acl_release(lli->lli_posix_acl); - lli->lli_posix_acl = NULL; - } -#endif + lli_clear_acl(lli); lli->lli_inode_magic = LLI_INODE_DEAD; if (S_ISDIR(inode->i_mode)) @@ -2233,15 +2219,9 @@ int ll_update_inode(struct inode *inode, struct lustre_md *md) return rc; } -#ifdef CONFIG_LUSTRE_FS_POSIX_ACL - if (body->mbo_valid & OBD_MD_FLACL) { - spin_lock(&lli->lli_lock); - if (lli->lli_posix_acl) - posix_acl_release(lli->lli_posix_acl); - lli->lli_posix_acl = md->posix_acl; - spin_unlock(&lli->lli_lock); - } -#endif + if (body->mbo_valid & OBD_MD_FLACL) + lli_replace_acl(lli, md); + inode->i_ino = cl_fid_build_ino(&body->mbo_fid1, sbi->ll_flags & LL_SBI_32BIT_API); inode->i_generation = cl_fid_build_gen(&body->mbo_fid1); @@ -2691,13 +2671,8 @@ int ll_prep_inode(struct inode **inode, struct ptlrpc_request *req, sbi->ll_flags & LL_SBI_32BIT_API), &md); if (IS_ERR(*inode)) { -#ifdef CONFIG_LUSTRE_FS_POSIX_ACL - if (md.posix_acl) { - posix_acl_release(md.posix_acl); - md.posix_acl = NULL; - } -#endif - rc = PTR_ERR(*inode); + lmd_clear_acl(&md); + rc = IS_ERR(*inode) ? PTR_ERR(*inode) : -ENOMEM; CERROR("new_inode -fatal: rc %d\n", rc); goto out; } diff --git a/fs/lustre/mdc/mdc_request.c b/fs/lustre/mdc/mdc_request.c index d6d9f43..cacc58b 100644 --- a/fs/lustre/mdc/mdc_request.c +++ b/fs/lustre/mdc/mdc_request.c @@ -675,11 +675,9 @@ static int mdc_get_lustre_md(struct obd_export *exp, } out: - if (rc) { -#ifdef CONFIG_LUSTRE_FS_POSIX_ACL - posix_acl_release(md->posix_acl); -#endif - } + if (rc) + lmd_clear_acl(md); + return rc; } From patchwork Wed Jul 15 20:44:56 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 11666229 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A277E1392 for ; Wed, 15 Jul 2020 20:46:05 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 8BCDA2065F for ; Wed, 15 Jul 2020 20:46:05 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 8BCDA2065F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 19BE821F90C; Wed, 15 Jul 2020 13:45:49 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from smtp3.ccs.ornl.gov (smtp3.ccs.ornl.gov [160.91.203.39]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 9B07721F6BD for ; Wed, 15 Jul 2020 13:45:27 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp3.ccs.ornl.gov (Postfix) with ESMTP id 91C67496; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 8F1112BA; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) From: James 
Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Wed, 15 Jul 2020 16:44:56 -0400 Message-Id: <1594845918-29027-16-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> References: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 15/37] lustre: obdclass: use offset instead of cp_linkage X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Wang Shilong , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Wang Shilong Since we have fixed-size cl_page allocations, we can use an array of offsets to locate each slice of a cl_page. With this patch, we reduce the cl_page size from 392 bytes to 336 bytes, which means 12 objects now fit in a 4KiB slab page where only 10 fit before. WC-bug-id: https://jira.whamcloud.com/browse/LU-13134 Lustre-commit: 55967f1e5c701 ("LU-13134 obdclass: use offset instead of cp_linkage") Signed-off-by: Wang Shilong Reviewed-on: https://review.whamcloud.com/37428 Reviewed-by: Andreas Dilger Reviewed-by: James Simmons Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/include/cl_object.h | 8 +- fs/lustre/obdclass/cl_page.c | 284 ++++++++++++++++++++++++------------------ 2 files changed, 168 insertions(+), 124 deletions(-) diff --git a/fs/lustre/include/cl_object.h b/fs/lustre/include/cl_object.h index a0b9e87..47997f8 100644 --- a/fs/lustre/include/cl_object.h +++ b/fs/lustre/include/cl_object.h @@ -737,8 +737,10 @@ struct cl_page { struct page *cp_vmpage; /** Linkage of pages within group. Pages must be owned */ struct list_head cp_batch; - /** List of slices. Immutable after creation. */ - struct list_head cp_layers; + /** array of slices offset. Immutable after creation. */ + unsigned char cp_layer_offset[3]; + /** current slice index */ + unsigned char cp_layer_count:2; /** * Page state. This field is const to avoid accidental update, it is * modified only internally within cl_page.c. Protected by a VM lock. @@ -781,8 +783,6 @@ struct cl_page { */ struct cl_object *cpl_obj; const struct cl_page_operations *cpl_ops; - /** Linkage into cl_page::cp_layers. Immutable after creation. */ - struct list_head cpl_linkage; }; /** diff --git a/fs/lustre/obdclass/cl_page.c b/fs/lustre/obdclass/cl_page.c index d5be0c5..cced026 100644 --- a/fs/lustre/obdclass/cl_page.c +++ b/fs/lustre/obdclass/cl_page.c @@ -72,22 +72,47 @@ static void cl_page_get_trust(struct cl_page *page) refcount_inc(&page->cp_ref); } +static struct cl_page_slice * +cl_page_slice_get(const struct cl_page *cl_page, int index) +{ + if (index < 0 || index >= cl_page->cp_layer_count) + return NULL; + + /* To get the cp_layer_offset values fit under 256 bytes, we + * use the offset beyond the end of struct cl_page.
+ */ + return (struct cl_page_slice *)((char *)cl_page + sizeof(*cl_page) + + cl_page->cp_layer_offset[index]); +} + +#define cl_page_slice_for_each(cl_page, slice, i) \ + for (i = 0, slice = cl_page_slice_get(cl_page, 0); \ + i < (cl_page)->cp_layer_count; \ + slice = cl_page_slice_get(cl_page, ++i)) + +#define cl_page_slice_for_each_reverse(cl_page, slice, i) \ + for (i = (cl_page)->cp_layer_count - 1, \ + slice = cl_page_slice_get(cl_page, i); i >= 0; \ + slice = cl_page_slice_get(cl_page, --i)) + /** - * Returns a slice within a page, corresponding to the given layer in the + * Returns a slice within a cl_page, corresponding to the given layer in the * device stack. * * \see cl_lock_at() */ static const struct cl_page_slice * -cl_page_at_trusted(const struct cl_page *page, +cl_page_at_trusted(const struct cl_page *cl_page, const struct lu_device_type *dtype) { const struct cl_page_slice *slice; + int i; - list_for_each_entry(slice, &page->cp_layers, cpl_linkage) { + cl_page_slice_for_each(cl_page, slice, i) { if (slice->cpl_obj->co_lu.lo_dev->ld_type == dtype) return slice; } + return NULL; } @@ -104,28 +129,28 @@ static void __cl_page_free(struct cl_page *cl_page, unsigned short bufsize) } } -static void cl_page_free(const struct lu_env *env, struct cl_page *page, +static void cl_page_free(const struct lu_env *env, struct cl_page *cl_page, struct pagevec *pvec) { - struct cl_object *obj = page->cp_obj; - struct cl_page_slice *slice; + struct cl_object *obj = cl_page->cp_obj; unsigned short bufsize = cl_object_header(obj)->coh_page_bufsize; + struct cl_page_slice *slice; + int i; - PASSERT(env, page, list_empty(&page->cp_batch)); - PASSERT(env, page, !page->cp_owner); - PASSERT(env, page, page->cp_state == CPS_FREEING); + PASSERT(env, cl_page, list_empty(&cl_page->cp_batch)); + PASSERT(env, cl_page, !cl_page->cp_owner); + PASSERT(env, cl_page, cl_page->cp_state == CPS_FREEING); - while ((slice = list_first_entry_or_null(&page->cp_layers, - struct cl_page_slice, - cpl_linkage)) != NULL) { - list_del_init(page->cp_layers.next); + cl_page_slice_for_each(cl_page, slice, i) { if (unlikely(slice->cpl_ops->cpo_fini)) slice->cpl_ops->cpo_fini(env, slice, pvec); } - lu_object_ref_del_at(&obj->co_lu, &page->cp_obj_ref, "cl_page", page); + cl_page->cp_layer_count = 0; + lu_object_ref_del_at(&obj->co_lu, &cl_page->cp_obj_ref, + "cl_page", cl_page); cl_object_put(env, obj); - lu_ref_fini(&page->cp_reference); - __cl_page_free(page, bufsize); + lu_ref_fini(&cl_page->cp_reference); + __cl_page_free(cl_page, bufsize); } /** @@ -212,7 +237,6 @@ struct cl_page *cl_page_alloc(const struct lu_env *env, page->cp_vmpage = vmpage; cl_page_state_set_trust(page, CPS_CACHED); page->cp_type = type; - INIT_LIST_HEAD(&page->cp_layers); INIT_LIST_HEAD(&page->cp_batch); lu_ref_init(&page->cp_reference); cl_object_for_each(o2, o) { @@ -455,22 +479,23 @@ static void cl_page_owner_set(struct cl_page *page) } void __cl_page_disown(const struct lu_env *env, - struct cl_io *io, struct cl_page *pg) + struct cl_io *io, struct cl_page *cl_page) { const struct cl_page_slice *slice; enum cl_page_state state; + int i; - state = pg->cp_state; - cl_page_owner_clear(pg); + state = cl_page->cp_state; + cl_page_owner_clear(cl_page); if (state == CPS_OWNED) - cl_page_state_set(env, pg, CPS_CACHED); + cl_page_state_set(env, cl_page, CPS_CACHED); /* * Completion call-backs are executed in the bottom-up order, so that * uppermost layer (llite), responsible for VFS/VM interaction runs * last and can release locks safely. 
*/ - list_for_each_entry_reverse(slice, &pg->cp_layers, cpl_linkage) { + cl_page_slice_for_each_reverse(cl_page, slice, i) { if (slice->cpl_ops->cpo_disown) (*slice->cpl_ops->cpo_disown)(env, slice, io); } @@ -494,12 +519,12 @@ int cl_page_is_owned(const struct cl_page *pg, const struct cl_io *io) * Waits until page is in cl_page_state::CPS_CACHED state, and then switch it * into cl_page_state::CPS_OWNED state. * - * \pre !cl_page_is_owned(pg, io) - * \post result == 0 iff cl_page_is_owned(pg, io) + * \pre !cl_page_is_owned(cl_page, io) + * \post result == 0 iff cl_page_is_owned(cl_page, io) * * Return: 0 success * - * -ve failure, e.g., page was destroyed (and landed in + * -ve failure, e.g., cl_page was destroyed (and landed in * cl_page_state::CPS_FREEING instead of * cl_page_state::CPS_CACHED). or, page was owned by * another thread, or in IO. @@ -510,19 +535,20 @@ int cl_page_is_owned(const struct cl_page *pg, const struct cl_io *io) * \see cl_page_own */ static int __cl_page_own(const struct lu_env *env, struct cl_io *io, - struct cl_page *pg, int nonblock) + struct cl_page *cl_page, int nonblock) { const struct cl_page_slice *slice; int result = 0; + int i; io = cl_io_top(io); - if (pg->cp_state == CPS_FREEING) { + if (cl_page->cp_state == CPS_FREEING) { result = -ENOENT; goto out; } - list_for_each_entry(slice, &pg->cp_layers, cpl_linkage) { + cl_page_slice_for_each(cl_page, slice, i) { if (slice->cpl_ops->cpo_own) result = (*slice->cpl_ops->cpo_own)(env, slice, io, nonblock); @@ -533,13 +559,13 @@ static int __cl_page_own(const struct lu_env *env, struct cl_io *io, result = 0; if (result == 0) { - PASSERT(env, pg, !pg->cp_owner); - pg->cp_owner = cl_io_top(io); - cl_page_owner_set(pg); - if (pg->cp_state != CPS_FREEING) { - cl_page_state_set(env, pg, CPS_OWNED); + PASSERT(env, cl_page, !cl_page->cp_owner); + cl_page->cp_owner = cl_io_top(io); + cl_page_owner_set(cl_page); + if (cl_page->cp_state != CPS_FREEING) { + cl_page_state_set(env, cl_page, CPS_OWNED); } else { - __cl_page_disown(env, io, pg); + __cl_page_disown(env, io, cl_page); result = -ENOENT; } } @@ -575,51 +601,53 @@ int cl_page_own_try(const struct lu_env *env, struct cl_io *io, * * Called when page is already locked by the hosting VM. * - * \pre !cl_page_is_owned(pg, io) - * \post cl_page_is_owned(pg, io) + * \pre !cl_page_is_owned(cl_page, io) + * \post cl_page_is_owned(cl_page, io) * * \see cl_page_operations::cpo_assume() */ void cl_page_assume(const struct lu_env *env, - struct cl_io *io, struct cl_page *pg) + struct cl_io *io, struct cl_page *cl_page) { const struct cl_page_slice *slice; + int i; io = cl_io_top(io); - list_for_each_entry(slice, &pg->cp_layers, cpl_linkage) { + cl_page_slice_for_each(cl_page, slice, i) { if (slice->cpl_ops->cpo_assume) (*slice->cpl_ops->cpo_assume)(env, slice, io); } - PASSERT(env, pg, !pg->cp_owner); - pg->cp_owner = cl_io_top(io); - cl_page_owner_set(pg); - cl_page_state_set(env, pg, CPS_OWNED); + PASSERT(env, cl_page, !cl_page->cp_owner); + cl_page->cp_owner = cl_io_top(io); + cl_page_owner_set(cl_page); + cl_page_state_set(env, cl_page, CPS_OWNED); } EXPORT_SYMBOL(cl_page_assume); /** * Releases page ownership without unlocking the page. * - * Moves page into cl_page_state::CPS_CACHED without releasing a lock on the - * underlying VM page (as VM is supposed to do this itself). + * Moves cl_page into cl_page_state::CPS_CACHED without releasing a lock + * on the underlying VM page (as VM is supposed to do this itself). 
* - * \pre cl_page_is_owned(pg, io) - * \post !cl_page_is_owned(pg, io) + * \pre cl_page_is_owned(cl_page, io) + * \post !cl_page_is_owned(cl_page, io) * * \see cl_page_assume() */ void cl_page_unassume(const struct lu_env *env, - struct cl_io *io, struct cl_page *pg) + struct cl_io *io, struct cl_page *cl_page) { const struct cl_page_slice *slice; + int i; io = cl_io_top(io); - cl_page_owner_clear(pg); - cl_page_state_set(env, pg, CPS_CACHED); + cl_page_owner_clear(cl_page); + cl_page_state_set(env, cl_page, CPS_CACHED); - list_for_each_entry_reverse(slice, &pg->cp_layers, cpl_linkage) { + cl_page_slice_for_each_reverse(cl_page, slice, i) { if (slice->cpl_ops->cpo_unassume) (*slice->cpl_ops->cpo_unassume)(env, slice, io); } @@ -646,21 +674,22 @@ void cl_page_disown(const struct lu_env *env, EXPORT_SYMBOL(cl_page_disown); /** - * Called when page is to be removed from the object, e.g., as a result of - * truncate. + * Called when cl_page is to be removed from the object, e.g., + * as a result of truncate. * * Calls cl_page_operations::cpo_discard() top-to-bottom. * - * \pre cl_page_is_owned(pg, io) + * \pre cl_page_is_owned(cl_page, io) * * \see cl_page_operations::cpo_discard() */ void cl_page_discard(const struct lu_env *env, - struct cl_io *io, struct cl_page *pg) + struct cl_io *io, struct cl_page *cl_page) { const struct cl_page_slice *slice; + int i; - list_for_each_entry(slice, &pg->cp_layers, cpl_linkage) { + cl_page_slice_for_each(cl_page, slice, i) { if (slice->cpl_ops->cpo_discard) (*slice->cpl_ops->cpo_discard)(env, slice, io); } @@ -669,22 +698,24 @@ void cl_page_discard(const struct lu_env *env, /** * Version of cl_page_delete() that can be called for not fully constructed - * pages, e.g,. in a error handling cl_page_find()->__cl_page_delete() + * cl_pages, e.g,. in a error handling cl_page_find()->__cl_page_delete() * path. Doesn't check page invariant. */ -static void __cl_page_delete(const struct lu_env *env, struct cl_page *pg) +static void __cl_page_delete(const struct lu_env *env, + struct cl_page *cl_page) { const struct cl_page_slice *slice; + int i; - PASSERT(env, pg, pg->cp_state != CPS_FREEING); + PASSERT(env, cl_page, cl_page->cp_state != CPS_FREEING); /* - * Sever all ways to obtain new pointers to @pg. + * Sever all ways to obtain new pointers to @cl_page. */ - cl_page_owner_clear(pg); - __cl_page_state_set(env, pg, CPS_FREEING); + cl_page_owner_clear(cl_page); + __cl_page_state_set(env, cl_page, CPS_FREEING); - list_for_each_entry_reverse(slice, &pg->cp_layers, cpl_linkage) { + cl_page_slice_for_each_reverse(cl_page, slice, i) { if (slice->cpl_ops->cpo_delete) (*slice->cpl_ops->cpo_delete)(env, slice); } @@ -729,11 +760,13 @@ void cl_page_delete(const struct lu_env *env, struct cl_page *pg) * * \see cl_page_operations::cpo_export() */ -void cl_page_export(const struct lu_env *env, struct cl_page *pg, int uptodate) +void cl_page_export(const struct lu_env *env, struct cl_page *cl_page, + int uptodate) { const struct cl_page_slice *slice; + int i; - list_for_each_entry(slice, &pg->cp_layers, cpl_linkage) { + cl_page_slice_for_each(cl_page, slice, i) { if (slice->cpl_ops->cpo_export) (*slice->cpl_ops->cpo_export)(env, slice, uptodate); } @@ -741,34 +774,36 @@ void cl_page_export(const struct lu_env *env, struct cl_page *pg, int uptodate) EXPORT_SYMBOL(cl_page_export); /** - * Returns true, if @pg is VM locked in a suitable sense by the calling + * Returns true, if @cl_page is VM locked in a suitable sense by the calling * thread. 
*/ -int cl_page_is_vmlocked(const struct lu_env *env, const struct cl_page *pg) +int cl_page_is_vmlocked(const struct lu_env *env, + const struct cl_page *cl_page) { const struct cl_page_slice *slice; int result; - slice = list_first_entry(&pg->cp_layers, - const struct cl_page_slice, cpl_linkage); - PASSERT(env, pg, slice->cpl_ops->cpo_is_vmlocked); + slice = cl_page_slice_get(cl_page, 0); + PASSERT(env, cl_page, slice->cpl_ops->cpo_is_vmlocked); /* * Call ->cpo_is_vmlocked() directly instead of going through * CL_PAGE_INVOKE(), because cl_page_is_vmlocked() is used by * cl_page_invariant(). */ result = slice->cpl_ops->cpo_is_vmlocked(env, slice); - PASSERT(env, pg, result == -EBUSY || result == -ENODATA); + PASSERT(env, cl_page, result == -EBUSY || result == -ENODATA); + return result == -EBUSY; } EXPORT_SYMBOL(cl_page_is_vmlocked); -void cl_page_touch(const struct lu_env *env, const struct cl_page *pg, - size_t to) +void cl_page_touch(const struct lu_env *env, + const struct cl_page *cl_page, size_t to) { const struct cl_page_slice *slice; + int i; - list_for_each_entry(slice, &pg->cp_layers, cpl_linkage) { + cl_page_slice_for_each(cl_page, slice, i) { if (slice->cpl_ops->cpo_page_touch) (*slice->cpl_ops->cpo_page_touch)(env, slice, to); } @@ -799,20 +834,21 @@ static void cl_page_io_start(const struct lu_env *env, * transfer now. */ int cl_page_prep(const struct lu_env *env, struct cl_io *io, - struct cl_page *pg, enum cl_req_type crt) + struct cl_page *cl_page, enum cl_req_type crt) { const struct cl_page_slice *slice; int result = 0; + int i; /* - * XXX this has to be called bottom-to-top, so that llite can set up + * this has to be called bottom-to-top, so that llite can set up * PG_writeback without risking other layers deciding to skip this * page. */ if (crt >= CRT_NR) return -EINVAL; - list_for_each_entry(slice, &pg->cp_layers, cpl_linkage) { + cl_page_slice_for_each(cl_page, slice, i) { if (slice->cpl_ops->cpo_own) result = (*slice->cpl_ops->io[crt].cpo_prep)(env, slice, io); @@ -822,10 +858,10 @@ int cl_page_prep(const struct lu_env *env, struct cl_io *io, if (result >= 0) { result = 0; - cl_page_io_start(env, pg, crt); + cl_page_io_start(env, cl_page, crt); } - CL_PAGE_HEADER(D_TRACE, env, pg, "%d %d\n", crt, result); + CL_PAGE_HEADER(D_TRACE, env, cl_page, "%d %d\n", crt, result); return result; } EXPORT_SYMBOL(cl_page_prep); @@ -840,35 +876,36 @@ int cl_page_prep(const struct lu_env *env, struct cl_io *io, * uppermost layer (llite), responsible for the VFS/VM interaction runs last * and can release locks safely. 
* - * \pre pg->cp_state == CPS_PAGEIN || pg->cp_state == CPS_PAGEOUT - * \post pg->cp_state == CPS_CACHED + * \pre cl_page->cp_state == CPS_PAGEIN || cl_page->cp_state == CPS_PAGEOUT + * \post cl_page->cp_state == CPS_CACHED * * \see cl_page_operations::cpo_completion() */ void cl_page_completion(const struct lu_env *env, - struct cl_page *pg, enum cl_req_type crt, int ioret) + struct cl_page *cl_page, enum cl_req_type crt, + int ioret) { - struct cl_sync_io *anchor = pg->cp_sync_io; + struct cl_sync_io *anchor = cl_page->cp_sync_io; const struct cl_page_slice *slice; + int i; - PASSERT(env, pg, crt < CRT_NR); - PASSERT(env, pg, pg->cp_state == cl_req_type_state(crt)); - - CL_PAGE_HEADER(D_TRACE, env, pg, "%d %d\n", crt, ioret); + PASSERT(env, cl_page, crt < CRT_NR); + PASSERT(env, cl_page, cl_page->cp_state == cl_req_type_state(crt)); - cl_page_state_set(env, pg, CPS_CACHED); + CL_PAGE_HEADER(D_TRACE, env, cl_page, "%d %d\n", crt, ioret); + cl_page_state_set(env, cl_page, CPS_CACHED); if (crt >= CRT_NR) return; - list_for_each_entry_reverse(slice, &pg->cp_layers, cpl_linkage) { + cl_page_slice_for_each_reverse(cl_page, slice, i) { if (slice->cpl_ops->io[crt].cpo_completion) (*slice->cpl_ops->io[crt].cpo_completion)(env, slice, ioret); } if (anchor) { - LASSERT(pg->cp_sync_io == anchor); - pg->cp_sync_io = NULL; + LASSERT(cl_page->cp_sync_io == anchor); + cl_page->cp_sync_io = NULL; cl_sync_io_note(env, anchor, ioret); } } @@ -878,53 +915,56 @@ void cl_page_completion(const struct lu_env *env, * Notify layers that transfer formation engine decided to yank this page from * the cache and to make it a part of a transfer. * - * \pre pg->cp_state == CPS_CACHED - * \post pg->cp_state == CPS_PAGEIN || pg->cp_state == CPS_PAGEOUT + * \pre cl_page->cp_state == CPS_CACHED + * \post cl_page->cp_state == CPS_PAGEIN || cl_page->cp_state == CPS_PAGEOUT * * \see cl_page_operations::cpo_make_ready() */ -int cl_page_make_ready(const struct lu_env *env, struct cl_page *pg, +int cl_page_make_ready(const struct lu_env *env, struct cl_page *cl_page, enum cl_req_type crt) { - const struct cl_page_slice *sli; + const struct cl_page_slice *slice; int result = 0; + int i; if (crt >= CRT_NR) return -EINVAL; - list_for_each_entry(sli, &pg->cp_layers, cpl_linkage) { - if (sli->cpl_ops->io[crt].cpo_make_ready) - result = (*sli->cpl_ops->io[crt].cpo_make_ready)(env, - sli); + cl_page_slice_for_each(cl_page, slice, i) { + if (slice->cpl_ops->io[crt].cpo_make_ready) + result = (*slice->cpl_ops->io[crt].cpo_make_ready)(env, + slice); if (result != 0) break; } if (result >= 0) { - PASSERT(env, pg, pg->cp_state == CPS_CACHED); - cl_page_io_start(env, pg, crt); + PASSERT(env, cl_page, cl_page->cp_state == CPS_CACHED); + cl_page_io_start(env, cl_page, crt); result = 0; } - CL_PAGE_HEADER(D_TRACE, env, pg, "%d %d\n", crt, result); + CL_PAGE_HEADER(D_TRACE, env, cl_page, "%d %d\n", crt, result); + return result; } EXPORT_SYMBOL(cl_page_make_ready); /** - * Called if a pge is being written back by kernel's intention. + * Called if a page is being written back by kernel's intention. 
* - * \pre cl_page_is_owned(pg, io) - * \post ergo(result == 0, pg->cp_state == CPS_PAGEOUT) + * \pre cl_page_is_owned(cl_page, io) + * \post ergo(result == 0, cl_page->cp_state == CPS_PAGEOUT) * * \see cl_page_operations::cpo_flush() */ int cl_page_flush(const struct lu_env *env, struct cl_io *io, - struct cl_page *pg) + struct cl_page *cl_page) { const struct cl_page_slice *slice; int result = 0; + int i; - list_for_each_entry(slice, &pg->cp_layers, cpl_linkage) { + cl_page_slice_for_each(cl_page, slice, i) { if (slice->cpl_ops->cpo_flush) result = (*slice->cpl_ops->cpo_flush)(env, slice, io); if (result != 0) @@ -933,7 +973,7 @@ int cl_page_flush(const struct lu_env *env, struct cl_io *io, if (result > 0) result = 0; - CL_PAGE_HEADER(D_TRACE, env, pg, "%d\n", result); + CL_PAGE_HEADER(D_TRACE, env, cl_page, "%d\n", result); return result; } EXPORT_SYMBOL(cl_page_flush); @@ -943,14 +983,14 @@ int cl_page_flush(const struct lu_env *env, struct cl_io *io, * * \see cl_page_operations::cpo_clip() */ -void cl_page_clip(const struct lu_env *env, struct cl_page *pg, +void cl_page_clip(const struct lu_env *env, struct cl_page *cl_page, int from, int to) { const struct cl_page_slice *slice; + int i; - CL_PAGE_HEADER(D_TRACE, env, pg, "%d %d\n", from, to); - - list_for_each_entry(slice, &pg->cp_layers, cpl_linkage) { + CL_PAGE_HEADER(D_TRACE, env, cl_page, "%d %d\n", from, to); + cl_page_slice_for_each(cl_page, slice, i) { if (slice->cpl_ops->cpo_clip) (*slice->cpl_ops->cpo_clip)(env, slice, from, to); } @@ -972,24 +1012,24 @@ void cl_page_header_print(const struct lu_env *env, void *cookie, EXPORT_SYMBOL(cl_page_header_print); /** - * Prints human readable representation of @pg to the @f. + * Prints human readable representation of @cl_page to the @f. */ void cl_page_print(const struct lu_env *env, void *cookie, - lu_printer_t printer, const struct cl_page *pg) + lu_printer_t printer, const struct cl_page *cl_page) { const struct cl_page_slice *slice; int result = 0; + int i; - cl_page_header_print(env, cookie, printer, pg); - - list_for_each_entry(slice, &pg->cp_layers, cpl_linkage) { + cl_page_header_print(env, cookie, printer, cl_page); + cl_page_slice_for_each(cl_page, slice, i) { if (slice->cpl_ops->cpo_print) result = (*slice->cpl_ops->cpo_print)(env, slice, cookie, printer); if (result != 0) break; } - (*printer)(env, cookie, "end page@%p\n", pg); + (*printer)(env, cookie, "end page@%p\n", cl_page); } EXPORT_SYMBOL(cl_page_print); @@ -1032,14 +1072,18 @@ size_t cl_page_size(const struct cl_object *obj) * * \see cl_lock_slice_add(), cl_req_slice_add(), cl_io_slice_add() */ -void cl_page_slice_add(struct cl_page *page, struct cl_page_slice *slice, +void cl_page_slice_add(struct cl_page *cl_page, struct cl_page_slice *slice, struct cl_object *obj, const struct cl_page_operations *ops) { - list_add_tail(&slice->cpl_linkage, &page->cp_layers); + unsigned int offset = (char *)slice - + ((char *)cl_page + sizeof(*cl_page)); + + LASSERT(offset < (1 << sizeof(cl_page->cp_layer_offset[0]) * 8)); + cl_page->cp_layer_offset[cl_page->cp_layer_count++] = offset; slice->cpl_obj = obj; slice->cpl_ops = ops; - slice->cpl_page = page; + slice->cpl_page = cl_page; } EXPORT_SYMBOL(cl_page_slice_add); From patchwork Wed Jul 15 20:44:57 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 11666271 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by 
pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6F37E618 for ; Wed, 15 Jul 2020 20:46:52 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 589642065F for ; Wed, 15 Jul 2020 20:46:52 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 589642065F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id D752721F897; Wed, 15 Jul 2020 13:46:13 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from smtp3.ccs.ornl.gov (smtp3.ccs.ornl.gov [160.91.203.39]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 00F3E21F6BD for ; Wed, 15 Jul 2020 13:45:27 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp3.ccs.ornl.gov (Postfix) with ESMTP id 936E4498; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 91F3F2A0; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Wed, 15 Jul 2020 16:44:57 -0400 Message-Id: <1594845918-29027-17-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> References: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 16/37] lustre: obdclass: re-declare cl_page variables to reduce its size X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Wang Shilong , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Wang Shilong This patch makes the following changes: 1) make CPS_CACHED start from 1, consistent with CPT_CACHED; 2) add CPT_NR to indicate the max allowed cl_page_type value; 3) reserve 4 bits for @cp_state, which allows up to 15 states; 4) reserve 2 bits for @cp_type, which allows 3 kinds of cl_page types; 5) use a short int for @cp_kmem_index, which still leaves another 16 bits reserved for future extension; 6) move @cp_lov_index after @cp_ref to fill a 4-byte hole. A standalone sketch of the resulting layout follows below.
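A minimal standalone sketch of the two layout tricks at work in this part of the series: the byte-offset slice addressing from the previous patch and the bit-packing from this one. The mp_* names and structs are invented for illustration and are far smaller than the real cl_page; only the mechanism is meant to match.

#include <stdio.h>
#include <stdlib.h>

#define MP_STATE_BITS 4
#define MP_TYPE_BITS 2
#define MP_MAX_LAYER 3

enum mp_state { MPS_CACHED = 1, MPS_OWNED, MPS_PAGEIN, MPS_PAGEOUT, MPS_FREEING, MPS_NR };
enum mp_type { MPT_CACHEABLE = 1, MPT_TRANSIENT, MPT_NR };

/* per-layer private data, co-allocated directly after struct mp_page */
struct mp_slice {
	int layer_id;
};

struct mp_page {
	unsigned char cp_layer_offset[MP_MAX_LAYER];	/* 24 bits */
	unsigned char cp_layer_count:2;			/* 26 bits */
	unsigned int cp_state:MP_STATE_BITS;		/* 30 bits */
	unsigned int cp_type:MP_TYPE_BITS;		/* 32 bits */
};

/* like cl_page_slice_get(): offsets are relative to the end of the
 * fixed struct, which keeps them small enough to fit in one byte */
static struct mp_slice *mp_slice_get(const struct mp_page *page, int index)
{
	if (index < 0 || index >= page->cp_layer_count)
		return NULL;
	return (struct mp_slice *)((char *)page + sizeof(*page) +
				   page->cp_layer_offset[index]);
}

/* like cl_page_slice_add(): record where the caller placed the slice */
static void mp_slice_add(struct mp_page *page, struct mp_slice *slice)
{
	unsigned int offset = (char *)slice - ((char *)page + sizeof(*page));

	page->cp_layer_offset[page->cp_layer_count++] = offset;
}

int main(void)
{
	/* the bitfields must be wide enough for every enum value,
	 * mirroring the BUILD_BUG_ON()s added in cl_page_alloc() */
	_Static_assert((1 << MP_STATE_BITS) >= MPS_NR, "cp_state too narrow");
	_Static_assert((1 << MP_TYPE_BITS) >= MPT_NR, "cp_type too narrow");

	/* one fixed-size allocation: the header plus two layer slices */
	struct mp_page *page = calloc(1, sizeof(*page) +
					 2 * sizeof(struct mp_slice));
	struct mp_slice *slices;
	int i;

	if (!page)
		return 1;
	slices = (struct mp_slice *)((char *)page + sizeof(*page));
	slices[0].layer_id = 10;	/* stands in for the vvp layer */
	slices[1].layer_id = 20;	/* stands in for the osc layer */
	mp_slice_add(page, &slices[0]);
	mp_slice_add(page, &slices[1]);
	page->cp_state = MPS_CACHED;
	page->cp_type = MPT_CACHEABLE;

	for (i = 0; i < page->cp_layer_count; i++)
		printf("slice %d -> layer %d\n", i,
		       mp_slice_get(page, i)->layer_id);

	free(page);
	return 0;
}

Replacing a list_head in the page and in every co-allocated slice with one byte per layer is where most of the space in these two patches comes from; the 2-bit cp_layer_count paired with CP_MAX_LAYER = 3 is the same trade-off the real structure makes.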
After this patch, cl_page size could reduce from 336 bytes to 320 bytes WC-bug-id: https://jira.whamcloud.com/browse/LU-13134 Lustre-commit: 5fb29cd1e77ca ("LU-13134 obdclass: re-declare cl_page variables to reduce its size") Signed-off-by: Wang Shilong Reviewed-on: https://review.whamcloud.com/37480 Reviewed-by: Andreas Dilger Reviewed-by: James Simmons Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/include/cl_object.h | 26 +++++++++------ fs/lustre/obdclass/cl_page.c | 76 +++++++++++++++++++++---------------------- 2 files changed, 53 insertions(+), 49 deletions(-) diff --git a/fs/lustre/include/cl_object.h b/fs/lustre/include/cl_object.h index 47997f8..8611285 100644 --- a/fs/lustre/include/cl_object.h +++ b/fs/lustre/include/cl_object.h @@ -621,7 +621,7 @@ enum cl_page_state { * * \invariant cl_page::cp_owner == NULL && cl_page::cp_req == NULL */ - CPS_CACHED, + CPS_CACHED = 1, /** * Page is exclusively owned by some cl_io. Page may end up in this * state as a result of @@ -715,8 +715,13 @@ enum cl_page_type { * it is used in DirectIO and lockless IO. */ CPT_TRANSIENT, + CPT_NR }; +#define CP_STATE_BITS 4 +#define CP_TYPE_BITS 2 +#define CP_MAX_LAYER 3 + /** * Fields are protected by the lock on struct page, except for atomics and * immutables. @@ -729,8 +734,9 @@ enum cl_page_type { struct cl_page { /** Reference counter. */ refcount_t cp_ref; - /* which slab kmem index this memory allocated from */ - int cp_kmem_index; + /** layout_entry + stripe index, composed using lov_comp_index() */ + unsigned int cp_lov_index; + pgoff_t cp_osc_index; /** An object this page is a part of. Immutable after creation. */ struct cl_object *cp_obj; /** vmpage */ @@ -738,19 +744,22 @@ struct cl_page { /** Linkage of pages within group. Pages must be owned */ struct list_head cp_batch; /** array of slices offset. Immutable after creation. */ - unsigned char cp_layer_offset[3]; + unsigned char cp_layer_offset[CP_MAX_LAYER]; /* 24 bits */ /** current slice index */ - unsigned char cp_layer_count:2; + unsigned char cp_layer_count:2; /* 26 bits */ /** * Page state. This field is const to avoid accidental update, it is * modified only internally within cl_page.c. Protected by a VM lock. */ - const enum cl_page_state cp_state; + enum cl_page_state cp_state:CP_STATE_BITS; /* 30 bits */ /** * Page type. Only CPT_TRANSIENT is used so far. Immutable after * creation. */ - enum cl_page_type cp_type; + enum cl_page_type cp_type:CP_TYPE_BITS; /* 32 bits */ + /* which slab kmem index this memory allocated from */ + short int cp_kmem_index; /* 48 bits */ + unsigned int cp_unused1:16; /* 64 bits */ /** * Owning IO in cl_page_state::CPS_OWNED state. Sub-page can be owned @@ -765,9 +774,6 @@ struct cl_page { struct lu_ref_link cp_queue_ref; /** Assigned if doing a sync_io */ struct cl_sync_io *cp_sync_io; - /** layout_entry + stripe index, composed using lov_comp_index() */ - unsigned int cp_lov_index; - pgoff_t cp_osc_index; }; /** diff --git a/fs/lustre/obdclass/cl_page.c b/fs/lustre/obdclass/cl_page.c index cced026..53f88a7 100644 --- a/fs/lustre/obdclass/cl_page.c +++ b/fs/lustre/obdclass/cl_page.c @@ -153,17 +153,6 @@ static void cl_page_free(const struct lu_env *env, struct cl_page *cl_page, __cl_page_free(cl_page, bufsize); } -/** - * Helper function updating page state. This is the only place in the code - * where cl_page::cp_state field is mutated. - */ -static inline void cl_page_state_set_trust(struct cl_page *page, - enum cl_page_state state) -{ - /* bypass const. 
*/ - *(enum cl_page_state *)&page->cp_state = state; -} - static struct cl_page *__cl_page_alloc(struct cl_object *o) { int i = 0; @@ -217,44 +206,50 @@ static struct cl_page *__cl_page_alloc(struct cl_object *o) return cl_page; } -struct cl_page *cl_page_alloc(const struct lu_env *env, - struct cl_object *o, pgoff_t ind, - struct page *vmpage, +struct cl_page *cl_page_alloc(const struct lu_env *env, struct cl_object *o, + pgoff_t ind, struct page *vmpage, enum cl_page_type type) { - struct cl_page *page; + struct cl_page *cl_page; struct cl_object *o2; - page = __cl_page_alloc(o); - if (page) { + cl_page = __cl_page_alloc(o); + if (cl_page) { int result = 0; - refcount_set(&page->cp_ref, 1); - page->cp_obj = o; + /* + * Please fix cl_page:cp_state/type declaration if + * these assertions fail in the future. + */ + BUILD_BUG_ON((1 << CP_STATE_BITS) < CPS_NR); /* cp_state */ + BUILD_BUG_ON((1 << CP_TYPE_BITS) < CPT_NR); /* cp_type */ + refcount_set(&cl_page->cp_ref, 1); + cl_page->cp_obj = o; cl_object_get(o); - lu_object_ref_add_at(&o->co_lu, &page->cp_obj_ref, "cl_page", - page); - page->cp_vmpage = vmpage; - cl_page_state_set_trust(page, CPS_CACHED); - page->cp_type = type; - INIT_LIST_HEAD(&page->cp_batch); - lu_ref_init(&page->cp_reference); + lu_object_ref_add_at(&o->co_lu, &cl_page->cp_obj_ref, + "cl_page", cl_page); + cl_page->cp_vmpage = vmpage; + cl_page->cp_state = CPS_CACHED; + cl_page->cp_type = type; + INIT_LIST_HEAD(&cl_page->cp_batch); + lu_ref_init(&cl_page->cp_reference); cl_object_for_each(o2, o) { if (o2->co_ops->coo_page_init) { result = o2->co_ops->coo_page_init(env, o2, - page, ind); + cl_page, + ind); if (result != 0) { - __cl_page_delete(env, page); - cl_page_free(env, page, NULL); - page = ERR_PTR(result); + __cl_page_delete(env, cl_page); + cl_page_free(env, cl_page, NULL); + cl_page = ERR_PTR(result); break; } } } } else { - page = ERR_PTR(-ENOMEM); + cl_page = ERR_PTR(-ENOMEM); } - return page; + return cl_page; } /** @@ -317,7 +312,8 @@ static inline int cl_page_invariant(const struct cl_page *pg) } static void __cl_page_state_set(const struct lu_env *env, - struct cl_page *page, enum cl_page_state state) + struct cl_page *cl_page, + enum cl_page_state state) { enum cl_page_state old; @@ -363,12 +359,13 @@ static void __cl_page_state_set(const struct lu_env *env, } }; - old = page->cp_state; - PASSERT(env, page, allowed_transitions[old][state]); - CL_PAGE_HEADER(D_TRACE, env, page, "%d -> %d\n", old, state); - PASSERT(env, page, page->cp_state == old); - PASSERT(env, page, equi(state == CPS_OWNED, page->cp_owner)); - cl_page_state_set_trust(page, state); + old = cl_page->cp_state; + PASSERT(env, cl_page, allowed_transitions[old][state]); + CL_PAGE_HEADER(D_TRACE, env, cl_page, "%d -> %d\n", old, state); + PASSERT(env, cl_page, cl_page->cp_state == old); + PASSERT(env, cl_page, equi(state == CPS_OWNED, + cl_page->cp_owner)); + cl_page->cp_state = state; } static void cl_page_state_set(const struct lu_env *env, @@ -1079,6 +1076,7 @@ void cl_page_slice_add(struct cl_page *cl_page, struct cl_page_slice *slice, unsigned int offset = (char *)slice - ((char *)cl_page + sizeof(*cl_page)); + LASSERT(cl_page->cp_layer_count < CP_MAX_LAYER); LASSERT(offset < (1 << sizeof(cl_page->cp_layer_offset[0]) * 8)); cl_page->cp_layer_offset[cl_page->cp_layer_count++] = offset; slice->cpl_obj = obj; From patchwork Wed Jul 15 20:44:58 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 
11666233 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 98373618 for ; Wed, 15 Jul 2020 20:46:12 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 81F832065F for ; Wed, 15 Jul 2020 20:46:12 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 81F832065F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 69C6B21F927; Wed, 15 Jul 2020 13:45:52 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from smtp3.ccs.ornl.gov (smtp3.ccs.ornl.gov [160.91.203.39]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 5806421F72E for ; Wed, 15 Jul 2020 13:45:28 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp3.ccs.ornl.gov (Postfix) with ESMTP id 96689499; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 951852B5; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Wed, 15 Jul 2020 16:44:58 -0400 Message-Id: <1594845918-29027-18-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> References: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 17/37] lustre: osc: re-declare ops_from/to to shrink osc_page X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Wang Shilong , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Wang Shilong @ops_from and @ops_to are always within PAGE_SIZE, so limiting each of them to PAGE_SHIFT bits is fine; on the x86_64 platform this patch saves another 8 bytes. Note that @ops_to was previously exclusive, so it could equal PAGE_SIZE; this patch changes it to inclusive, so its maximum value is PAGE_SIZE - 1 and length calculations must be done with care. After this patch, the cl_page size drops from 320 to 312 bytes, and we are able to allocate 13 objects from the slab pool per 4K page. A standalone model of the inclusive convention follows below.
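A small standalone model of the new inclusive convention; the MODEL_* constants and the struct are illustrative stand-ins rather than the real osc_page, but the field widths and the "+ 1" length arithmetic match what the patch does.

#include <assert.h>
#include <stdio.h>

#define MODEL_PAGE_SHIFT 12
#define MODEL_PAGE_SIZE (1u << MODEL_PAGE_SHIFT)

struct model_osc_page {
	/* PAGE_SHIFT bits hold 0 .. PAGE_SIZE - 1, so the old exclusive
	 * end offset (which could be PAGE_SIZE itself) no longer fits;
	 * store the last byte covered instead */
	unsigned int ops_from:MODEL_PAGE_SHIFT,
		     ops_to:MODEL_PAGE_SHIFT;
};

/* with an inclusive end, every length computation needs the "+ 1" */
static unsigned int transfer_bytes(const struct model_osc_page *opg)
{
	return opg->ops_to - opg->ops_from + 1;
}

int main(void)
{
	struct model_osc_page opg = {
		.ops_from = 0,
		.ops_to = MODEL_PAGE_SIZE - 1,	/* a full-page transfer */
	};

	assert(transfer_bytes(&opg) == MODEL_PAGE_SIZE);

	/* a cl_page_clip()-style caller passes an exclusive 'to'; the
	 * stored value becomes 'to - 1', as in osc_page_clip() */
	opg.ops_from = 256;
	opg.ops_to = 512 - 1;
	printf("clipped transfer: %u bytes\n", transfer_bytes(&opg));
	return 0;
}

The osc_page_clip() hunk storing "to - 1" and the two "+ 1" hunks in osc_cache.c and osc_page.c are exactly this pair of conversions.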
WC-bug-id: https://jira.whamcloud.com/browse/LU-13134 Lustre-commit: 9821754235e24 ("LU-13134 osc: re-declare ops_from/to to shrink osc_page") Signed-off-by: Wang Shilong Reviewed-on: https://review.whamcloud.com/37487 Reviewed-by: Andreas Dilger Reviewed-by: Neil Brown Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/include/lustre_osc.h | 8 ++++---- fs/lustre/osc/osc_cache.c | 5 +++-- fs/lustre/osc/osc_page.c | 21 +++++++++++---------- 3 files changed, 18 insertions(+), 16 deletions(-) diff --git a/fs/lustre/include/lustre_osc.h b/fs/lustre/include/lustre_osc.h index cd08f27..3956ab4 100644 --- a/fs/lustre/include/lustre_osc.h +++ b/fs/lustre/include/lustre_osc.h @@ -507,17 +507,17 @@ struct osc_page { * An offset within page from which next transfer starts. This is used * by cl_page_clip() to submit partial page transfers. */ - int ops_from; + unsigned int ops_from:PAGE_SHIFT, /* - * An offset within page at which next transfer ends. + * An offset within page at which next transfer ends(inclusive). * * \see osc_page::ops_from. */ - int ops_to; + ops_to:PAGE_SHIFT, /* * Boolean, true iff page is under transfer. Used for sanity checking. */ - unsigned ops_transfer_pinned:1, + ops_transfer_pinned:1, /* * in LRU? */ diff --git a/fs/lustre/osc/osc_cache.c b/fs/lustre/osc/osc_cache.c index f811dadb..fe03c0d 100644 --- a/fs/lustre/osc/osc_cache.c +++ b/fs/lustre/osc/osc_cache.c @@ -2395,7 +2395,7 @@ int osc_queue_async_io(const struct lu_env *env, struct cl_io *io, oap->oap_cmd = cmd; oap->oap_page_off = ops->ops_from; - oap->oap_count = ops->ops_to - ops->ops_from; + oap->oap_count = ops->ops_to - ops->ops_from + 1; /* * No need to hold a lock here, * since this page is not in any list yet. @@ -2664,7 +2664,8 @@ int osc_queue_sync_pages(const struct lu_env *env, const struct cl_io *io, ++page_count; mppr <<= (page_count > mppr); - if (unlikely(opg->ops_from > 0 || opg->ops_to < PAGE_SIZE)) + if (unlikely(opg->ops_from > 0 || + opg->ops_to < PAGE_SIZE - 1)) can_merge = false; } diff --git a/fs/lustre/osc/osc_page.c b/fs/lustre/osc/osc_page.c index 2856f30..bb605af 100644 --- a/fs/lustre/osc/osc_page.c +++ b/fs/lustre/osc/osc_page.c @@ -211,7 +211,8 @@ static void osc_page_clip(const struct lu_env *env, struct osc_async_page *oap = &opg->ops_oap; opg->ops_from = from; - opg->ops_to = to; + /* argument @to is exclusive, but @ops_to is inclusive */ + opg->ops_to = to - 1; spin_lock(&oap->oap_lock); oap->oap_async_flags |= ASYNC_COUNT_STABLE; spin_unlock(&oap->oap_lock); @@ -246,28 +247,28 @@ static void osc_page_touch(const struct lu_env *env, }; int osc_page_init(const struct lu_env *env, struct cl_object *obj, - struct cl_page *page, pgoff_t index) + struct cl_page *cl_page, pgoff_t index) { struct osc_object *osc = cl2osc(obj); - struct osc_page *opg = cl_object_page_slice(obj, page); + struct osc_page *opg = cl_object_page_slice(obj, cl_page); struct osc_io *oio = osc_env_io(env); int result; opg->ops_from = 0; - opg->ops_to = PAGE_SIZE; + opg->ops_to = PAGE_SIZE - 1; INIT_LIST_HEAD(&opg->ops_lru); - result = osc_prep_async_page(osc, opg, page->cp_vmpage, + result = osc_prep_async_page(osc, opg, cl_page->cp_vmpage, cl_offset(obj, index)); if (result != 0) return result; opg->ops_srvlock = osc_io_srvlock(oio); - cl_page_slice_add(page, &opg->ops_cl, obj, &osc_page_ops); - page->cp_osc_index = index; + cl_page_slice_add(cl_page, &opg->ops_cl, obj, &osc_page_ops); + cl_page->cp_osc_index = index; - /* reserve an LRU space for this page */ - if (page->cp_type == CPT_CACHEABLE) 
{ + /* reserve an LRU space for this cl_page */ + if (cl_page->cp_type == CPT_CACHEABLE) { result = osc_lru_alloc(env, osc_cli(osc), opg); if (result == 0) { result = radix_tree_preload(GFP_KERNEL); @@ -308,7 +309,7 @@ void osc_page_submit(const struct lu_env *env, struct osc_page *opg, oap->oap_cmd = crt == CRT_WRITE ? OBD_BRW_WRITE : OBD_BRW_READ; oap->oap_page_off = opg->ops_from; - oap->oap_count = opg->ops_to - opg->ops_from; + oap->oap_count = opg->ops_to - opg->ops_from + 1; oap->oap_brw_flags = OBD_BRW_SYNC | brw_flags; if (oio->oi_cap_sys_resource) { From patchwork Wed Jul 15 20:44:59 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 11666239 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 58609618 for ; Wed, 15 Jul 2020 20:46:21 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 4231920672 for ; Wed, 15 Jul 2020 20:46:21 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 4231920672 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 1C5ED21F80D; Wed, 15 Jul 2020 13:45:57 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from smtp3.ccs.ornl.gov (smtp3.ccs.ornl.gov [160.91.203.39]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id B02F421F7FB for ; Wed, 15 Jul 2020 13:45:28 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp3.ccs.ornl.gov (Postfix) with ESMTP id 9A15649B; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 981498D; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Wed, 15 Jul 2020 16:44:59 -0400 Message-Id: <1594845918-29027-19-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> References: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 18/37] lustre: llite: Fix lock ordering in pagevec_dirty X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Shaun Tancheff , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Shaun Tancheff In vvp_set_pagevec_dirty lock order between i_pages and lock_page_memcg was inverted with the expectation that no other users would conflict. However in vvp_page_completion_write the call to test_clear_page_writeback does expect to be able to lock_page_memcg then lock i_pages which appears to conflict with the original analysis. The reported case shows as RCU stalls with vvp_set_pagevec_dirty blocked attempting to lock i_pages. 
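The fix, in outline: take lock_page_memcg() for every page first, remember already-dirty pages in a bitmap, and only then take the i_pages lock, so the two locks are always acquired in the order test_clear_page_writeback() expects. Below is a standalone model of that two-pass shape, with plain booleans standing in for page flags and the kernel locks elided to comments.

#include <stdbool.h>
#include <stdio.h>

struct model_page {
	bool dirty;
};

int main(void)
{
	struct model_page pages[4] = { {false}, {true}, {false}, {true} };
	int count = 4;
	unsigned long skip_pages = 0;
	int dirtied = 0;
	int i;

	/* pass 1: per-page lock only (lock_page_memcg() in the patch);
	 * pages that were already dirty are unlocked and remembered */
	for (i = 0; i < count; i++) {
		/* lock_page_memcg(page); */
		if (pages[i].dirty) {		/* TestSetPageDirty() */
			/* unlock_page_memcg(page); */
			skip_pages |= 1UL << i;
			continue;
		}
		pages[i].dirty = true;
	}

	/* pass 2: only now take the mapping lock (xa_lock_irqsave() on
	 * i_pages), nested inside the per-page locks, never the other
	 * way around */
	for (i = 0; i < count; i++) {
		if ((skip_pages >> i) & 1)
			continue;
		dirtied++;	/* accounting and radix tagging go here */
		/* unlock_page_memcg(page); */
	}

	printf("newly dirtied %d of %d pages\n", dirtied, count);
	return 0;
}

The BUILD_BUG_ON(PAGEVEC_SIZE > BITS_PER_LONG) the patch adds is what guarantees the real skip bitmap is wide enough for a full pagevec.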
Fixes: f8a5fb036ae ("lustre: vvp: dirty pages with pagevec") HPE-bug-id: LUS-8798 WC-bug-id: https://jira.whamcloud.com/browse/LU-13746 Lustre-commit: c4ed9b0fb1013 ("LU-13476 llite: Fix lock ordering in pagevec_dirty") Signed-off-by: Shaun Tancheff Reviewed-on: https://review.whamcloud.com/38317 Reviewed-by: Wang Shilong Reviewed-by: Patrick Farrell Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/llite/vvp_io.c | 34 +++++++++++++++++++++------------- 1 file changed, 21 insertions(+), 13 deletions(-) diff --git a/fs/lustre/llite/vvp_io.c b/fs/lustre/llite/vvp_io.c index 8edd3c1..7627431 100644 --- a/fs/lustre/llite/vvp_io.c +++ b/fs/lustre/llite/vvp_io.c @@ -897,19 +897,31 @@ void vvp_set_pagevec_dirty(struct pagevec *pvec) struct page *page = pvec->pages[0]; struct address_space *mapping = page->mapping; unsigned long flags; + unsigned long skip_pages = 0; int count = pagevec_count(pvec); int dirtied = 0; - int i = 0; - - /* From set_page_dirty */ - for (i = 0; i < count; i++) - ClearPageReclaim(pvec->pages[i]); + int i; + BUILD_BUG_ON(PAGEVEC_SIZE > BITS_PER_LONG); LASSERTF(page->mapping, "mapping must be set. page %p, page->private (cl_page) %p\n", page, (void *) page->private); - /* Rest of code derived from __set_page_dirty_nobuffers */ + for (i = 0; i < count; i++) { + page = pvec->pages[i]; + + ClearPageReclaim(page); + + lock_page_memcg(page); + if (TestSetPageDirty(page)) { + /* page is already dirty .. no extra work needed + * set a flag for the i'th page to be skipped + */ + unlock_page_memcg(page); + skip_pages |= (1 << i); + } + } + xa_lock_irqsave(&mapping->i_pages, flags); /* Notes on differences with __set_page_dirty_nobuffers: @@ -920,17 +932,13 @@ void vvp_set_pagevec_dirty(struct pagevec *pvec) * 3. No mapping is impossible. (Race w/truncate mentioned in * dirty_nobuffers should be impossible because we hold the page lock.) * 4. All mappings are the same because i/o is only to one file. - * 5. We invert the lock order on lock_page_memcg(page) and the mapping - * xa_lock, but this is the only function that should use that pair of - * locks and it can't race because Lustre locks pages throughout i/o. */ for (i = 0; i < count; i++) { page = pvec->pages[i]; - lock_page_memcg(page); - if (TestSetPageDirty(page)) { - unlock_page_memcg(page); + /* if the i'th page was unlocked above, skip it here */ + if ((skip_pages >> i) & 1) continue; - } + LASSERTF(page->mapping == mapping, "all pages must have the same mapping. 
page %p, mapping %p, first mapping %p\n", page, page->mapping, mapping); From patchwork Wed Jul 15 20:45:00 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 11666237 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E5A8A1392 for ; Wed, 15 Jul 2020 20:46:18 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id CEB202065F for ; Wed, 15 Jul 2020 20:46:18 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org CEB202065F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id A566221F956; Wed, 15 Jul 2020 13:45:55 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from smtp3.ccs.ornl.gov (smtp3.ccs.ornl.gov [160.91.203.39]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 13E0621F750 for ; Wed, 15 Jul 2020 13:45:29 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp3.ccs.ornl.gov (Postfix) with ESMTP id 9D11749C; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 9B41A2BB; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Wed, 15 Jul 2020 16:45:00 -0400 Message-Id: <1594845918-29027-20-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> References: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 19/37] lustre: misc: quiet compiler warning on armv7l X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Andreas Dilger Avoid overflow in lu_prandom_u64_max(). Quiet printk() warning for mismatched type of size_t variables by using %z modifier for those variables. 
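Both warnings come down to 32-bit type widths on armv7l, where size_t is not unsigned long and shifting a 32-bit value by 32 is undefined. A self-contained illustration of the two idioms the patch applies (ordinary userspace C; printf shares printk's %z length modifier):

#include <stdio.h>
#include <sys/types.h>

int main(void)
{
	size_t cnt = 4096;
	ssize_t written = 4096;
	unsigned int r32 = 0x12345678;
	unsigned long long hi;

	/* %lu/%ld would warn on armv7l; the %z modifier matches
	 * size_t/ssize_t on every architecture */
	printf("read %zu bytes, wrote %zd bytes\n", cnt, written);

	/* widen before shifting, as the lu_prandom_u64_max() hunk
	 * does with its (u64) cast; r32 << 32 alone is undefined */
	hi = (unsigned long long)r32 << 32;
	printf("high word: %llx\n", hi);
	return 0;
}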
Fixes: bc2e21c54ba2 ("lustre: obdclass: generate random u64 max correctly") WC-bug-id: https://jira.whamcloud.com/browse/LU-13673 Lustre-commit: 57bb302461383 ("LU-13673 misc: quiet compiler warning on armv7l") Signed-off-by: Andreas Dilger Reviewed-on: https://review.whamcloud.com/38927 Reviewed-by: James Simmons Reviewed-by: Lai Siyao Reviewed-by: Yang Sheng Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/llite/vvp_io.c | 4 ++-- fs/lustre/obdclass/lu_tgt_descs.c | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/fs/lustre/llite/vvp_io.c b/fs/lustre/llite/vvp_io.c index 7627431..c3fb03a 100644 --- a/fs/lustre/llite/vvp_io.c +++ b/fs/lustre/llite/vvp_io.c @@ -791,7 +791,7 @@ static int vvp_io_read_start(const struct lu_env *env, goto out; LU_OBJECT_HEADER(D_INODE, env, &obj->co_lu, - "Read ino %lu, %lu bytes, offset %lld, size %llu\n", + "Read ino %lu, %zu bytes, offset %lld, size %llu\n", inode->i_ino, cnt, pos, i_size_read(inode)); /* turn off the kernel's read-ahead */ @@ -1197,7 +1197,7 @@ static int vvp_io_write_start(const struct lu_env *env, } if (vio->vui_iocb->ki_pos != (pos + io->ci_nob - nob)) { CDEBUG(D_VFSTRACE, - "%s: write position mismatch: ki_pos %lld vs. pos %lld, written %ld, commit %ld rc %ld\n", + "%s: write position mismatch: ki_pos %lld vs. pos %lld, written %zd, commit %zd rc %zd\n", file_dentry(file)->d_name.name, vio->vui_iocb->ki_pos, pos + io->ci_nob - nob, written, io->ci_nob - nob, result); diff --git a/fs/lustre/obdclass/lu_tgt_descs.c b/fs/lustre/obdclass/lu_tgt_descs.c index db5a93b..469c935 100644 --- a/fs/lustre/obdclass/lu_tgt_descs.c +++ b/fs/lustre/obdclass/lu_tgt_descs.c @@ -62,7 +62,7 @@ u64 lu_prandom_u64_max(u64 ep_ro) * 32 bits (truncated to the upper limit, if needed) */ if (ep_ro > 0xffffffffULL) - rand = prandom_u32_max((u32)(ep_ro >> 32)) << 32; + rand = (u64)prandom_u32_max((u32)(ep_ro >> 32)) << 32; if (rand == (ep_ro & 0xffffffff00000000ULL)) rand |= prandom_u32_max((u32)ep_ro); From patchwork Wed Jul 15 20:45:01 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 11666273 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 03D61618 for ; Wed, 15 Jul 2020 20:46:55 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id E189E2065F for ; Wed, 15 Jul 2020 20:46:54 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org E189E2065F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 6ED9321FA75; Wed, 15 Jul 2020 13:46:15 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from smtp3.ccs.ornl.gov (smtp3.ccs.ornl.gov [160.91.203.39]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 5736B21F815 for ; Wed, 15 Jul 2020 13:45:29 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp3.ccs.ornl.gov (Postfix) with ESMTP id 
9FBD349D; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 9E5122A0; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Wed, 15 Jul 2020 16:45:01 -0400 Message-Id: <1594845918-29027-21-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> References: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 20/37] lustre: llite: fix to free cl_dio_aio properly X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Wang Shilong , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Wang Shilong @cl_dio_aio is allocated by slab, we should use slab free helper to free its memory. Fixes: ebdbecbaf50b ("lustre: obdclass: use slab allocation for cl_dio_aio") WC-bug-id: https://jira.whamcloud.com/browse/LU-13134 Lustre-commit: f71a539c3e41b ("LU-13134 llite: fix to free cl_dio_aio properly") Signed-off-by: Wang Shilong Reviewed-on: https://review.whamcloud.com/39103 Reviewed-by: Patrick Farrell Reviewed-by: Andreas Dilger Reviewed-by: James Simmons Signed-off-by: James Simmons --- fs/lustre/include/cl_object.h | 2 ++ fs/lustre/llite/rw26.c | 2 +- fs/lustre/obdclass/cl_io.c | 10 ++++++++-- 3 files changed, 11 insertions(+), 3 deletions(-) diff --git a/fs/lustre/include/cl_object.h b/fs/lustre/include/cl_object.h index 8611285..e656c68 100644 --- a/fs/lustre/include/cl_object.h +++ b/fs/lustre/include/cl_object.h @@ -2538,6 +2538,8 @@ int cl_sync_io_wait(const struct lu_env *env, struct cl_sync_io *anchor, void cl_sync_io_note(const struct lu_env *env, struct cl_sync_io *anchor, int ioret); struct cl_dio_aio *cl_aio_alloc(struct kiocb *iocb); +void cl_aio_free(struct cl_dio_aio *aio); + static inline void cl_sync_io_init(struct cl_sync_io *anchor, int nr) { cl_sync_io_init_notify(anchor, nr, NULL, NULL); diff --git a/fs/lustre/llite/rw26.c b/fs/lustre/llite/rw26.c index 0971185..d0e3ff6 100644 --- a/fs/lustre/llite/rw26.c +++ b/fs/lustre/llite/rw26.c @@ -384,7 +384,7 @@ static ssize_t ll_direct_IO(struct kiocb *iocb, struct iov_iter *iter) vio->u.write.vui_written += tot_bytes; result = tot_bytes; } - kfree(aio); + cl_aio_free(aio); } else { result = -EIOCBQUEUED; } diff --git a/fs/lustre/obdclass/cl_io.c b/fs/lustre/obdclass/cl_io.c index 2f597d1..dcf940f 100644 --- a/fs/lustre/obdclass/cl_io.c +++ b/fs/lustre/obdclass/cl_io.c @@ -1106,6 +1106,13 @@ struct cl_dio_aio *cl_aio_alloc(struct kiocb *iocb) } EXPORT_SYMBOL(cl_aio_alloc); +void cl_aio_free(struct cl_dio_aio *aio) +{ + if (aio) + kmem_cache_free(cl_dio_aio_kmem, aio); +} +EXPORT_SYMBOL(cl_aio_free); + /** * Indicate that transfer of a single page completed. */ @@ -1143,8 +1150,7 @@ void cl_sync_io_note(const struct lu_env *env, struct cl_sync_io *anchor, * If anchor->csi_aio is set, we are responsible for freeing * memory here rather than when cl_sync_io_wait() completes. 
*/ - if (aio) - kmem_cache_free(cl_dio_aio_kmem, aio); + cl_aio_free(aio); } } EXPORT_SYMBOL(cl_sync_io_note); From patchwork Wed Jul 15 20:45:02 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 11666275 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id F137B1392 for ; Wed, 15 Jul 2020 20:46:57 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id DA8392065F for ; Wed, 15 Jul 2020 20:46:57 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org DA8392065F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 0CA8B21FA92; Wed, 15 Jul 2020 13:46:17 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from smtp3.ccs.ornl.gov (smtp3.ccs.ornl.gov [160.91.203.39]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 98C2F21F815 for ; Wed, 15 Jul 2020 13:45:29 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp3.ccs.ornl.gov (Postfix) with ESMTP id A2DE349E; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id A16112B5; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Wed, 15 Jul 2020 16:45:02 -0400 Message-Id: <1594845918-29027-22-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> References: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 21/37] lnet: o2iblnd: Use ib_mtu_int_to_enum() X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Mr NeilBrown Rather than bespoke code for converting an MTU into the enum, use ib_mtu_int_to_enum(). This has slightly different behaviour for invalid values, but those are caught when the parameter is set. 
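The validation in the modparams hunk relies on a round-trip property: an MTU is one of the legal values exactly when converting it to the enum and back reproduces it. Below is a standalone model with local stand-ins for the RDMA core helpers; the stand-in mimics ib_mtu_int_to_enum() rounding down to the nearest supported size, which should be treated as an assumption about the real helper rather than a definition of it.

#include <stdio.h>

enum model_mtu { M_MTU_256 = 1, M_MTU_512, M_MTU_1024, M_MTU_2048, M_MTU_4096 };

static enum model_mtu mtu_int_to_enum(int mtu)
{
	if (mtu >= 4096)
		return M_MTU_4096;
	if (mtu >= 2048)
		return M_MTU_2048;
	if (mtu >= 1024)
		return M_MTU_1024;
	if (mtu >= 512)
		return M_MTU_512;
	return M_MTU_256;
}

static int mtu_enum_to_int(enum model_mtu mtu)
{
	static const int values[] = { 0, 256, 512, 1024, 2048, 4096 };

	return values[mtu];
}

int main(void)
{
	int candidates[] = { 2048, 3000, 256 };
	int i;

	for (i = 0; i < 3; i++) {
		int mtu = candidates[i];
		int valid = mtu_enum_to_int(mtu_int_to_enum(mtu)) == mtu;

		printf("ib_mtu %d: %s\n", mtu, valid ? "ok" : "rejected");
	}
	return 0;
}

kiblnd_tunables_setup() uses this same round trip to reject values such as 3000 while accepting 256/512/1024/2048/4096.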
WC-bug-id: https://jira.whamcloud.com/browse/LU-12678 Lustre-commit: 1b622e2007483 ("LU-12678 o2iblnd: Use ib_mtu_int_to_enum()") Signed-off-by: Mr NeilBrown Reviewed-on: https://review.whamcloud.com/39123 Reviewed-by: James Simmons Reviewed-by: Shaun Tancheff Reviewed-by: Chris Horn Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- net/lnet/klnds/o2iblnd/o2iblnd.c | 29 +++-------------------------- net/lnet/klnds/o2iblnd/o2iblnd_modparams.c | 4 +++- 2 files changed, 6 insertions(+), 27 deletions(-) diff --git a/net/lnet/klnds/o2iblnd/o2iblnd.c b/net/lnet/klnds/o2iblnd/o2iblnd.c index d8fca2a..e2e94b7 100644 --- a/net/lnet/klnds/o2iblnd/o2iblnd.c +++ b/net/lnet/klnds/o2iblnd/o2iblnd.c @@ -560,38 +560,15 @@ static struct kib_conn *kiblnd_get_conn_by_idx(struct lnet_ni *ni, int index) return NULL; } -int kiblnd_translate_mtu(int value) -{ - switch (value) { - default: - return -1; - case 0: - return 0; - case 256: - return IB_MTU_256; - case 512: - return IB_MTU_512; - case 1024: - return IB_MTU_1024; - case 2048: - return IB_MTU_2048; - case 4096: - return IB_MTU_4096; - } -} - static void kiblnd_setup_mtu_locked(struct rdma_cm_id *cmid) { - int mtu; - /* XXX There is no path record for iWARP, set by netdev->change_mtu? */ if (!cmid->route.path_rec) return; - mtu = kiblnd_translate_mtu(*kiblnd_tunables.kib_ib_mtu); - LASSERT(mtu >= 0); - if (mtu) - cmid->route.path_rec->mtu = mtu; + if (*kiblnd_tunables.kib_ib_mtu) + cmid->route.path_rec->mtu = + ib_mtu_int_to_enum(*kiblnd_tunables.kib_ib_mtu); } static int kiblnd_get_completion_vector(struct kib_conn *conn, int cpt) diff --git a/net/lnet/klnds/o2iblnd/o2iblnd_modparams.c b/net/lnet/klnds/o2iblnd/o2iblnd_modparams.c index f341376..73ad22d 100644 --- a/net/lnet/klnds/o2iblnd/o2iblnd_modparams.c +++ b/net/lnet/klnds/o2iblnd/o2iblnd_modparams.c @@ -230,7 +230,9 @@ int kiblnd_tunables_setup(struct lnet_ni *ni) /* Current API version */ tunables->lnd_version = 0; - if (kiblnd_translate_mtu(*kiblnd_tunables.kib_ib_mtu) < 0) { + if (*kiblnd_tunables.kib_ib_mtu && + ib_mtu_enum_to_int(ib_mtu_int_to_enum(*kiblnd_tunables.kib_ib_mtu)) != + *kiblnd_tunables.kib_ib_mtu) { CERROR("Invalid ib_mtu %d, expected 256/512/1024/2048/4096\n", *kiblnd_tunables.kib_ib_mtu); return -EINVAL; From patchwork Wed Jul 15 20:45:03 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 11666245 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 61E8D618 for ; Wed, 15 Jul 2020 20:46:27 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 4B7C32065F for ; Wed, 15 Jul 2020 20:46:27 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 4B7C32065F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 5D77C21F98C; Wed, 15 Jul 2020 13:46:00 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from smtp3.ccs.ornl.gov 
(smtp3.ccs.ornl.gov [160.91.203.39]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id DB3E121F815 for ; Wed, 15 Jul 2020 13:45:29 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp3.ccs.ornl.gov (Postfix) with ESMTP id A647E49F; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id A45F52BA; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Wed, 15 Jul 2020 16:45:03 -0400 Message-Id: <1594845918-29027-23-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> References: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 22/37] lnet: o2iblnd: wait properly for fps->increasing. X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Mr NeilBrown If we need to allocate a new fmr_pool and another thread is currently allocating one, we call schedule() and then try again. This can spin, consuming a CPU and wasting power. Instead, use wait_var_event() and wake_up_var() to wait for fps_increasing to be cleared. WC-bug-id: https://jira.whamcloud.com/browse/LU-12678 Lustre-commit: 530eca31556f7 ("LU-12768 o2iblnd: wait properly for fps->increasing.") Signed-off-by: Mr NeilBrown Reviewed-on: https://review.whamcloud.com/39124 Reviewed-by: James Simmons Reviewed-by: Shaun Tancheff Reviewed-by: Chris Horn Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- net/lnet/klnds/o2iblnd/o2iblnd.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/net/lnet/klnds/o2iblnd/o2iblnd.c b/net/lnet/klnds/o2iblnd/o2iblnd.c index e2e94b7..6c7659c 100644 --- a/net/lnet/klnds/o2iblnd/o2iblnd.c +++ b/net/lnet/klnds/o2iblnd/o2iblnd.c @@ -1750,7 +1750,7 @@ int kiblnd_fmr_pool_map(struct kib_fmr_poolset *fps, struct kib_tx *tx, if (fps->fps_increasing) { spin_unlock(&fps->fps_lock); CDEBUG(D_NET, "Another thread is allocating new FMR pool, waiting for her to complete\n"); - schedule(); + wait_var_event(fps, !fps->fps_increasing); goto again; } @@ -1767,6 +1767,7 @@ int kiblnd_fmr_pool_map(struct kib_fmr_poolset *fps, struct kib_tx *tx, rc = kiblnd_create_fmr_pool(fps, &fpo); spin_lock(&fps->fps_lock); fps->fps_increasing = 0; + wake_up_var(fps); if (!rc) { fps->fps_version++; list_add_tail(&fpo->fpo_list, &fps->fps_pool_list); From patchwork Wed Jul 15 20:45:04 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 11666279 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 66F9C618 for ; Wed, 15 Jul 2020 20:47:03 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 50B242065F for ; Wed, 15 Jul 2020 20:47:03 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 50B242065F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) 
header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 0D21C21FAC8; Wed, 15 Jul 2020 13:46:20 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from smtp3.ccs.ornl.gov (smtp3.ccs.ornl.gov [160.91.203.39]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 2A53621F815 for ; Wed, 15 Jul 2020 13:45:30 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp3.ccs.ornl.gov (Postfix) with ESMTP id A8C3B5C2; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id A75F48D; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Wed, 15 Jul 2020 16:45:04 -0400 Message-Id: <1594845918-29027-24-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> References: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 23/37] lnet: o2iblnd: use need_resched() X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Mr NeilBrown Rather than using a counter to decide when to drop the lock and see if we need to reschedule, we can use need_resched(), which is a precise test instead of a guess.
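The idiom at work here, sketched below under generic assumptions (more_work() and do_one_item() are hypothetical stand-ins for the scheduler's queues and per-item work; this is not code from the patch), is to poll need_resched() inside the lock-holding loop and drop the lock to yield only when the scheduler actually wants the CPU back:

#include <linux/sched.h>
#include <linux/spinlock.h>

extern bool more_work(void);    /* stand-in: is there queued work? */
extern void do_one_item(void);  /* stand-in: process one item */

static void drain(spinlock_t *lock)
{
        spin_lock(lock);
        while (more_work()) {
                if (need_resched()) {
                        /* drop the lock so we can safely yield */
                        spin_unlock(lock);
                        cond_resched();
                        spin_lock(lock);
                }
                do_one_item();
        }
        spin_unlock(lock);
}

Compared with a fixed loop counter, this neither yields too early (wasting lock round-trips) nor too late (holding off a waiting task for up to 100 iterations).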
WC-bug-id: https://jira.whamcloud.com/browse/LU-12678 Lustre-commit: dcd799269f693 ("LU-12678 o2iblnd: use need_resched()") Signed-off-by: Mr NeilBrown Reviewed-on: https://review.whamcloud.com/39125 Reviewed-by: James Simmons Reviewed-by: Shaun Tancheff Reviewed-by: Chris Horn Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- net/lnet/klnds/o2iblnd/o2iblnd.h | 2 -- net/lnet/klnds/o2iblnd/o2iblnd_cb.c | 5 +---- 2 files changed, 1 insertion(+), 6 deletions(-) diff --git a/net/lnet/klnds/o2iblnd/o2iblnd.h b/net/lnet/klnds/o2iblnd/o2iblnd.h index f60a69d..9a2fb42 100644 --- a/net/lnet/klnds/o2iblnd/o2iblnd.h +++ b/net/lnet/klnds/o2iblnd/o2iblnd.h @@ -67,8 +67,6 @@ #include #define IBLND_PEER_HASH_SIZE 101 /* # peer_ni lists */ -/* # scheduler loops before reschedule */ -#define IBLND_RESCHED 100 #define IBLND_N_SCHED 2 #define IBLND_N_SCHED_HIGH 4 diff --git a/net/lnet/klnds/o2iblnd/o2iblnd_cb.c b/net/lnet/klnds/o2iblnd/o2iblnd_cb.c index 3b9d10d..2c670a33 100644 --- a/net/lnet/klnds/o2iblnd/o2iblnd_cb.c +++ b/net/lnet/klnds/o2iblnd/o2iblnd_cb.c @@ -3605,7 +3605,6 @@ static int kiblnd_resolve_addr(struct rdma_cm_id *cmid, unsigned long flags; struct ib_wc wc; int did_something; - int busy_loops = 0; int rc; init_waitqueue_entry(&wait, current); @@ -3621,11 +3620,10 @@ static int kiblnd_resolve_addr(struct rdma_cm_id *cmid, spin_lock_irqsave(&sched->ibs_lock, flags); while (!kiblnd_data.kib_shutdown) { - if (busy_loops++ >= IBLND_RESCHED) { + if (need_resched()) { spin_unlock_irqrestore(&sched->ibs_lock, flags); cond_resched(); - busy_loops = 0; spin_lock_irqsave(&sched->ibs_lock, flags); } @@ -3718,7 +3716,6 @@ static int kiblnd_resolve_addr(struct rdma_cm_id *cmid, spin_unlock_irqrestore(&sched->ibs_lock, flags); schedule(); - busy_loops = 0; remove_wait_queue(&sched->ibs_waitq, &wait); spin_lock_irqsave(&sched->ibs_lock, flags); From patchwork Wed Jul 15 20:45:05 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 11666277 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A7F871392 for ; Wed, 15 Jul 2020 20:47:00 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 9148C2065F for ; Wed, 15 Jul 2020 20:47:00 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 9148C2065F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 8C90621F72E; Wed, 15 Jul 2020 13:46:18 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from smtp3.ccs.ornl.gov (smtp3.ccs.ornl.gov [160.91.203.39]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 6F6BA21F832 for ; Wed, 15 Jul 2020 13:45:30 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp3.ccs.ornl.gov (Postfix) with ESMTP id AC30F5C4; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id AAC142A0; Wed, 15 Jul 
2020 16:45:20 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Wed, 15 Jul 2020 16:45:05 -0400 Message-Id: <1594845918-29027-25-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> References: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 24/37] lnet: o2iblnd: Use list_for_each_entry_safe X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Mr NeilBrown Several loops use list_for_each_safe(), then call list_entry() as first step. These can be merged using list_for_each_entry_safe(). WC-bug-id: https://jira.whamcloud.com/browse/LU-12678 Lustre-commit: e5574f72f2fd9 ("LU-12678 o2iblnd: Use list_for_each_entry_safe") Signed-off-by: Mr NeilBrown Reviewed-on: https://review.whamcloud.com/39126 Reviewed-by: James Simmons Reviewed-by: Shaun Tancheff Reviewed-by: Chris Horn Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- net/lnet/klnds/o2iblnd/o2iblnd.c | 26 ++++++++++---------------- net/lnet/klnds/o2iblnd/o2iblnd_cb.c | 7 ++----- 2 files changed, 12 insertions(+), 21 deletions(-) diff --git a/net/lnet/klnds/o2iblnd/o2iblnd.c b/net/lnet/klnds/o2iblnd/o2iblnd.c index 6c7659c..c6a077b 100644 --- a/net/lnet/klnds/o2iblnd/o2iblnd.c +++ b/net/lnet/klnds/o2iblnd/o2iblnd.c @@ -454,18 +454,16 @@ static int kiblnd_get_peer_info(struct lnet_ni *ni, int index, static void kiblnd_del_peer_locked(struct kib_peer_ni *peer_ni) { - struct list_head *ctmp; - struct list_head *cnxt; + struct kib_conn *cnxt; struct kib_conn *conn; if (list_empty(&peer_ni->ibp_conns)) { kiblnd_unlink_peer_locked(peer_ni); } else { - list_for_each_safe(ctmp, cnxt, &peer_ni->ibp_conns) { - conn = list_entry(ctmp, struct kib_conn, ibc_list); - + list_for_each_entry_safe(conn, cnxt, &peer_ni->ibp_conns, + ibc_list) kiblnd_close_conn_locked(conn, 0); - } + /* NB closing peer_ni's last conn unlinked it. 
*/ } /* @@ -952,13 +950,11 @@ void kiblnd_destroy_conn(struct kib_conn *conn) int kiblnd_close_peer_conns_locked(struct kib_peer_ni *peer_ni, int why) { struct kib_conn *conn; - struct list_head *ctmp; - struct list_head *cnxt; + struct kib_conn *cnxt; int count = 0; - list_for_each_safe(ctmp, cnxt, &peer_ni->ibp_conns) { - conn = list_entry(ctmp, struct kib_conn, ibc_list); - + list_for_each_entry_safe(conn, cnxt, &peer_ni->ibp_conns, + ibc_list) { CDEBUG(D_NET, "Closing conn -> %s, version: %x, reason: %d\n", libcfs_nid2str(peer_ni->ibp_nid), conn->ibc_version, why); @@ -974,13 +970,11 @@ int kiblnd_close_stale_conns_locked(struct kib_peer_ni *peer_ni, int version, u64 incarnation) { struct kib_conn *conn; - struct list_head *ctmp; - struct list_head *cnxt; + struct kib_conn *cnxt; int count = 0; - list_for_each_safe(ctmp, cnxt, &peer_ni->ibp_conns) { - conn = list_entry(ctmp, struct kib_conn, ibc_list); - + list_for_each_entry_safe(conn, cnxt, &peer_ni->ibp_conns, + ibc_list) { if (conn->ibc_version == version && conn->ibc_incarnation == incarnation) continue; diff --git a/net/lnet/klnds/o2iblnd/o2iblnd_cb.c b/net/lnet/klnds/o2iblnd/o2iblnd_cb.c index 2c670a33..ba2f46f 100644 --- a/net/lnet/klnds/o2iblnd/o2iblnd_cb.c +++ b/net/lnet/klnds/o2iblnd/o2iblnd_cb.c @@ -1982,15 +1982,12 @@ static int kiblnd_resolve_addr(struct rdma_cm_id *cmid, kiblnd_abort_txs(struct kib_conn *conn, struct list_head *txs) { LIST_HEAD(zombies); - struct list_head *tmp; - struct list_head *nxt; + struct kib_tx *nxt; struct kib_tx *tx; spin_lock(&conn->ibc_lock); - list_for_each_safe(tmp, nxt, txs) { - tx = list_entry(tmp, struct kib_tx, tx_list); - + list_for_each_entry_safe(tx, nxt, txs, tx_list) { if (txs == &conn->ibc_active_txs) { LASSERT(!tx->tx_queued); LASSERT(tx->tx_waiting || tx->tx_sending); From patchwork Wed Jul 15 20:45:06 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 11666249 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 664511392 for ; Wed, 15 Jul 2020 20:46:31 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 4FB642065F for ; Wed, 15 Jul 2020 20:46:31 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 4FB642065F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id B8DE521F9AE; Wed, 15 Jul 2020 13:46:02 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from smtp3.ccs.ornl.gov (smtp3.ccs.ornl.gov [160.91.203.39]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id C74D521F760 for ; Wed, 15 Jul 2020 13:45:30 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp3.ccs.ornl.gov (Postfix) with ESMTP id AF4A55C5; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id ADE3B2B5; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) From: James Simmons To: Andreas 
Dilger , Oleg Drokin , NeilBrown Date: Wed, 15 Jul 2020 16:45:06 -0400 Message-Id: <1594845918-29027-26-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> References: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 25/37] lnet: socklnd: use need_resched() X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Mr NeilBrown Rather than using a counter to decide when to drop the lock and see if we need to reschedule, we can use need_resched(), which is a precise test instead of a guess. WC-bug-id: https://jira.whamcloud.com/browse/LU-12678 Lustre-commit: 3f848f85ba3d3 ("LU-12678 socklnd: use need_resched()") Signed-off-by: Mr NeilBrown Reviewed-on: https://review.whamcloud.com/39128 Reviewed-by: James Simmons Reviewed-by: Shaun Tancheff Reviewed-by: Chris Horn Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- net/lnet/klnds/socklnd/socklnd.h | 1 - net/lnet/klnds/socklnd/socklnd_cb.c | 12 +++--------- 2 files changed, 3 insertions(+), 10 deletions(-) diff --git a/net/lnet/klnds/socklnd/socklnd.h b/net/lnet/klnds/socklnd/socklnd.h index 0ac3637..0a0f0a7 100644 --- a/net/lnet/klnds/socklnd/socklnd.h +++ b/net/lnet/klnds/socklnd/socklnd.h @@ -55,7 +55,6 @@ #define SOCKNAL_NSCHEDS_HIGH (SOCKNAL_NSCHEDS << 1) #define SOCKNAL_PEER_HASH_BITS 7 /* # log2 of # of peer_ni lists */ -#define SOCKNAL_RESCHED 100 /* # scheduler loops before reschedule */ #define SOCKNAL_INSANITY_RECONN 5000 /* connd is trying on reconn infinitely */ #define SOCKNAL_ENOMEM_RETRY 1 /* seconds between retries */ diff --git a/net/lnet/klnds/socklnd/socklnd_cb.c b/net/lnet/klnds/socklnd/socklnd_cb.c index 623478c..936054ee 100644 --- a/net/lnet/klnds/socklnd/socklnd_cb.c +++ b/net/lnet/klnds/socklnd/socklnd_cb.c @@ -1328,7 +1328,6 @@ int ksocknal_scheduler(void *arg) struct ksock_conn *conn; struct ksock_tx *tx; int rc; - int nloops = 0; long id = (long)arg; sched = ksocknal_data.ksnd_schedulers[KSOCK_THREAD_CPT(id)]; @@ -1470,12 +1469,10 @@ int ksocknal_scheduler(void *arg) did_something = 1; } - if (!did_something || /* nothing to do */ - ++nloops == SOCKNAL_RESCHED) { /* hogging CPU? */ + if (!did_something || /* nothing to do */ + need_resched()) { /* hogging CPU?
*/ spin_unlock_bh(&sched->kss_lock); - nloops = 0; - if (!did_something) { /* wait for something to do */ rc = wait_event_interruptible_exclusive( sched->kss_waitq, @@ -2080,7 +2077,6 @@ void ksocknal_write_callback(struct ksock_conn *conn) spinlock_t *connd_lock = &ksocknal_data.ksnd_connd_lock; struct ksock_connreq *cr; wait_queue_entry_t wait; - int nloops = 0; int cons_retry = 0; init_waitqueue_entry(&wait, current); @@ -2158,10 +2154,9 @@ void ksocknal_write_callback(struct ksock_conn *conn) } if (dropped_lock) { - if (++nloops < SOCKNAL_RESCHED) + if (!need_resched()) continue; spin_unlock_bh(connd_lock); - nloops = 0; cond_resched(); spin_lock_bh(connd_lock); continue; @@ -2173,7 +2168,6 @@ void ksocknal_write_callback(struct ksock_conn *conn) &wait); spin_unlock_bh(connd_lock); - nloops = 0; schedule_timeout(timeout); remove_wait_queue(&ksocknal_data.ksnd_connd_waitq, &wait); From patchwork Wed Jul 15 20:45:07 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 11666251 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id BBBB4618 for ; Wed, 15 Jul 2020 20:46:32 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id A5D762065F for ; Wed, 15 Jul 2020 20:46:32 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org A5D762065F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 9DCD921F840; Wed, 15 Jul 2020 13:46:03 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from smtp3.ccs.ornl.gov (smtp3.ccs.ornl.gov [160.91.203.39]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 2AF1C21F761 for ; Wed, 15 Jul 2020 13:45:31 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp3.ccs.ornl.gov (Postfix) with ESMTP id B2F495C7; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id B0DA92BB; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Wed, 15 Jul 2020 16:45:07 -0400 Message-Id: <1594845918-29027-27-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> References: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 26/37] lnet: socklnd: use list_for_each_entry_safe() X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Mr NeilBrown Several loops use list_for_each_safe(), then call list_entry() as first step. These can be merged using list_for_each_entry_safe(). 
In one case, the 'safe' version is clearly not needed, so just use list_for_each_entry(). WC-bug-id: https://jira.whamcloud.com/browse/LU-12678 Lustre-commit: 03f375e9f6390 ("LU-12678 socklnd: use list_for_each_entry_safe()") Signed-off-by: Mr NeilBrown Reviewed-on: https://review.whamcloud.com/39129 Reviewed-by: James Simmons Reviewed-by: Chris Horn Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- net/lnet/klnds/socklnd/socklnd.c | 55 ++++++++++++++-------------------------- 1 file changed, 19 insertions(+), 36 deletions(-) diff --git a/net/lnet/klnds/socklnd/socklnd.c b/net/lnet/klnds/socklnd/socklnd.c index 2b8fd3d..2e11737 100644 --- a/net/lnet/klnds/socklnd/socklnd.c +++ b/net/lnet/klnds/socklnd/socklnd.c @@ -453,15 +453,12 @@ struct ksock_peer_ni * struct ksock_peer_ni *peer_ni = route->ksnr_peer; struct ksock_interface *iface; struct ksock_conn *conn; - struct list_head *ctmp; - struct list_head *cnxt; + struct ksock_conn *cnxt; LASSERT(!route->ksnr_deleted); /* Close associated conns */ - list_for_each_safe(ctmp, cnxt, &peer_ni->ksnp_conns) { - conn = list_entry(ctmp, struct ksock_conn, ksnc_list); - + list_for_each_entry_safe(conn, cnxt, &peer_ni->ksnp_conns, ksnc_list) { if (conn->ksnc_route != route) continue; @@ -548,9 +545,9 @@ struct ksock_peer_ni * ksocknal_del_peer_locked(struct ksock_peer_ni *peer_ni, u32 ip) { struct ksock_conn *conn; + struct ksock_conn *cnxt; struct ksock_route *route; - struct list_head *tmp; - struct list_head *nxt; + struct ksock_route *rnxt; int nshared; LASSERT(!peer_ni->ksnp_closing); @@ -558,9 +555,8 @@ struct ksock_peer_ni * /* Extra ref prevents peer_ni disappearing until I'm done with it */ ksocknal_peer_addref(peer_ni); - list_for_each_safe(tmp, nxt, &peer_ni->ksnp_routes) { - route = list_entry(tmp, struct ksock_route, ksnr_list); - + list_for_each_entry_safe(route, rnxt, &peer_ni->ksnp_routes, + ksnr_list) { /* no match */ if (!(!ip || route->ksnr_ipaddr == ip)) continue; @@ -571,29 +567,23 @@ struct ksock_peer_ni * } nshared = 0; - list_for_each_safe(tmp, nxt, &peer_ni->ksnp_routes) { - route = list_entry(tmp, struct ksock_route, ksnr_list); + list_for_each_entry(route, &peer_ni->ksnp_routes, ksnr_list) nshared += route->ksnr_share_count; - } if (!nshared) { - /* - * remove everything else if there are no explicit entries + /* remove everything else if there are no explicit entries * left */ - list_for_each_safe(tmp, nxt, &peer_ni->ksnp_routes) { - route = list_entry(tmp, struct ksock_route, ksnr_list); - + list_for_each_entry_safe(route, rnxt, &peer_ni->ksnp_routes, + ksnr_list) { /* we should only be removing auto-entries */ LASSERT(!route->ksnr_share_count); ksocknal_del_route_locked(route); } - list_for_each_safe(tmp, nxt, &peer_ni->ksnp_conns) { - conn = list_entry(tmp, struct ksock_conn, ksnc_list); - + list_for_each_entry_safe(conn, cnxt, &peer_ni->ksnp_conns, + ksnc_list) ksocknal_close_conn_locked(conn, 0); - } } ksocknal_peer_decref(peer_ni); @@ -1752,13 +1742,10 @@ struct ksock_peer_ni * u32 ipaddr, int why) { struct ksock_conn *conn; - struct list_head *ctmp; - struct list_head *cnxt; + struct ksock_conn *cnxt; int count = 0; - list_for_each_safe(ctmp, cnxt, &peer_ni->ksnp_conns) { - conn = list_entry(ctmp, struct ksock_conn, ksnc_list); - + list_for_each_entry_safe(conn, cnxt, &peer_ni->ksnp_conns, ksnc_list) { if (!ipaddr || conn->ksnc_ipaddr == ipaddr) { count++; ksocknal_close_conn_locked(conn, why); @@ -1992,10 +1979,10 @@ static int ksocknal_push(struct lnet_ni *ni, struct lnet_process_id id) 
ksocknal_peer_del_interface_locked(struct ksock_peer_ni *peer_ni, u32 ipaddr, int index) { - struct list_head *tmp; - struct list_head *nxt; struct ksock_route *route; + struct ksock_route *rnxt; struct ksock_conn *conn; + struct ksock_conn *cnxt; int i; int j; @@ -2008,9 +1995,8 @@ static int ksocknal_push(struct lnet_ni *ni, struct lnet_process_id id) break; } - list_for_each_safe(tmp, nxt, &peer_ni->ksnp_routes) { - route = list_entry(tmp, struct ksock_route, ksnr_list); - + list_for_each_entry_safe(route, rnxt, &peer_ni->ksnp_routes, + ksnr_list) { if (route->ksnr_myiface != index) continue; @@ -2022,12 +2008,9 @@ static int ksocknal_push(struct lnet_ni *ni, struct lnet_process_id id) } } - list_for_each_safe(tmp, nxt, &peer_ni->ksnp_conns) { - conn = list_entry(tmp, struct ksock_conn, ksnc_list); - + list_for_each_entry_safe(conn, cnxt, &peer_ni->ksnp_conns, ksnc_list) if (conn->ksnc_myipaddr == ipaddr) ksocknal_close_conn_locked(conn, 0); - } } static int From patchwork Wed Jul 15 20:45:08 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 11666281 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B41B9618 for ; Wed, 15 Jul 2020 20:47:06 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 9D72B2065F for ; Wed, 15 Jul 2020 20:47:06 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 9D72B2065F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id BFA8221FAE7; Wed, 15 Jul 2020 13:46:21 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from smtp3.ccs.ornl.gov (smtp3.ccs.ornl.gov [160.91.203.39]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 8157D21F761 for ; Wed, 15 Jul 2020 13:45:31 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp3.ccs.ornl.gov (Postfix) with ESMTP id B67275C9; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id B44888D; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Wed, 15 Jul 2020 16:45:08 -0400 Message-Id: <1594845918-29027-28-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> References: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 27/37] lnet: socklnd: convert various refcounts to refcount_t X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Mr NeilBrown Each of these refcounts exactly follows the expectations of refcount_t, so change the atomic_t to refcount_t. We can remove the LASSERTs on incref/decref as equivalent checking can now be enabled at build time with CONFIG_REFCOUNT_FULL. WC-bug-id: https://jira.whamcloud.com/browse/LU-12678 Lustre-commit: db3e51f612069 ("LU-12678 socklnd: convert various refcounts to refcount_t") Signed-off-by: Mr NeilBrown Reviewed-on: https://review.whamcloud.com/39130 Reviewed-by: James Simmons Reviewed-by: Shaun Tancheff Reviewed-by: Chris Horn Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- net/lnet/klnds/socklnd/socklnd.c | 28 ++++++++++++------------- net/lnet/klnds/socklnd/socklnd.h | 41 +++++++++++++++---------------------- net/lnet/klnds/socklnd/socklnd_cb.c | 6 +++--- 3 files changed, 33 insertions(+), 42 deletions(-) diff --git a/net/lnet/klnds/socklnd/socklnd.c b/net/lnet/klnds/socklnd/socklnd.c index 2e11737..22a73c3 100644 --- a/net/lnet/klnds/socklnd/socklnd.c +++ b/net/lnet/klnds/socklnd/socklnd.c @@ -123,7 +123,7 @@ static int ksocknal_ip2index(__u32 ipaddress, struct lnet_ni *ni) if (!route) return NULL; - atomic_set(&route->ksnr_refcount, 1); + refcount_set(&route->ksnr_refcount, 1); route->ksnr_peer = NULL; route->ksnr_retry_interval = 0; /* OK to connect at any time */ route->ksnr_ipaddr = ipaddr; @@ -142,7 +142,7 @@ static int ksocknal_ip2index(__u32 ipaddress, struct lnet_ni *ni) void ksocknal_destroy_route(struct ksock_route *route) { - LASSERT(!atomic_read(&route->ksnr_refcount)); + LASSERT(!refcount_read(&route->ksnr_refcount)); if (route->ksnr_peer) ksocknal_peer_decref(route->ksnr_peer); @@ -174,7 +174,7 @@ static int ksocknal_ip2index(__u32 ipaddress, struct lnet_ni *ni) peer_ni->ksnp_ni = ni; peer_ni->ksnp_id = id; - atomic_set(&peer_ni->ksnp_refcount, 1); /* 1 ref for caller */ + refcount_set(&peer_ni->ksnp_refcount, 1); /* 1 ref for caller */ peer_ni->ksnp_closing = 0; peer_ni->ksnp_accepting = 0; peer_ni->ksnp_proto = NULL; @@ -198,7 +198,7 @@ static int ksocknal_ip2index(__u32 ipaddress, struct lnet_ni *ni) CDEBUG(D_NET, "peer_ni %s %p deleted\n", libcfs_id2str(peer_ni->ksnp_id), peer_ni); - LASSERT(!atomic_read(&peer_ni->ksnp_refcount)); + LASSERT(!refcount_read(&peer_ni->ksnp_refcount)); LASSERT(!peer_ni->ksnp_accepting); LASSERT(list_empty(&peer_ni->ksnp_conns)); LASSERT(list_empty(&peer_ni->ksnp_routes)); @@ -235,7 +235,7 @@ struct ksock_peer_ni * CDEBUG(D_NET, "got peer_ni [%p] -> %s (%d)\n", peer_ni, libcfs_id2str(id), - atomic_read(&peer_ni->ksnp_refcount)); + refcount_read(&peer_ni->ksnp_refcount)); return peer_ni; } return NULL; @@ -1069,10 +1069,10 @@ struct ksock_peer_ni * * 2 ref, 1 for conn, another extra ref prevents socket * being closed before establishment of connection */ - atomic_set(&conn->ksnc_sock_refcount, 2); + refcount_set(&conn->ksnc_sock_refcount, 2); conn->ksnc_type = type; ksocknal_lib_save_callback(sock, conn); - atomic_set(&conn->ksnc_conn_refcount, 1); /* 1 ref for me */ + refcount_set(&conn->ksnc_conn_refcount, 1); /* 1 ref for me */ conn->ksnc_rx_ready = 0; conn->ksnc_rx_scheduled = 0; @@ -1667,7 +1667,7 @@ struct ksock_peer_ni * { /* Queue the conn for the reaper to destroy */ - LASSERT(!atomic_read(&conn->ksnc_conn_refcount)); + LASSERT(!refcount_read(&conn->ksnc_conn_refcount)); spin_lock_bh(&ksocknal_data.ksnd_reaper_lock);
list_add_tail(&conn->ksnc_list, &ksocknal_data.ksnd_zombie_conns); @@ -1684,8 +1684,8 @@ struct ksock_peer_ni * /* Final coup-de-grace of the reaper */ CDEBUG(D_NET, "connection %p\n", conn); - LASSERT(!atomic_read(&conn->ksnc_conn_refcount)); - LASSERT(!atomic_read(&conn->ksnc_sock_refcount)); + LASSERT(!refcount_read(&conn->ksnc_conn_refcount)); + LASSERT(!refcount_read(&conn->ksnc_sock_refcount)); LASSERT(!conn->ksnc_sock); LASSERT(!conn->ksnc_route); LASSERT(!conn->ksnc_tx_scheduled); @@ -2412,7 +2412,7 @@ static int ksocknal_push(struct lnet_ni *ni, struct lnet_process_id id) CWARN("Active peer_ni on shutdown: %s, ref %d, closing %d, accepting %d, err %d, zcookie %llu, txq %d, zc_req %d\n", libcfs_id2str(peer_ni->ksnp_id), - atomic_read(&peer_ni->ksnp_refcount), + refcount_read(&peer_ni->ksnp_refcount), peer_ni->ksnp_closing, peer_ni->ksnp_accepting, peer_ni->ksnp_error, peer_ni->ksnp_zc_next_cookie, @@ -2421,7 +2421,7 @@ static int ksocknal_push(struct lnet_ni *ni, struct lnet_process_id id) list_for_each_entry(route, &peer_ni->ksnp_routes, ksnr_list) { CWARN("Route: ref %d, schd %d, conn %d, cnted %d, del %d\n", - atomic_read(&route->ksnr_refcount), + refcount_read(&route->ksnr_refcount), route->ksnr_scheduled, route->ksnr_connecting, route->ksnr_connected, @@ -2430,8 +2430,8 @@ static int ksocknal_push(struct lnet_ni *ni, struct lnet_process_id id) list_for_each_entry(conn, &peer_ni->ksnp_conns, ksnc_list) { CWARN("Conn: ref %d, sref %d, t %d, c %d\n", - atomic_read(&conn->ksnc_conn_refcount), - atomic_read(&conn->ksnc_sock_refcount), + refcount_read(&conn->ksnc_conn_refcount), + refcount_read(&conn->ksnc_sock_refcount), conn->ksnc_type, conn->ksnc_closing); } goto done; diff --git a/net/lnet/klnds/socklnd/socklnd.h b/net/lnet/klnds/socklnd/socklnd.h index 0a0f0a7..df863f2 100644 --- a/net/lnet/klnds/socklnd/socklnd.h +++ b/net/lnet/klnds/socklnd/socklnd.h @@ -37,6 +37,7 @@ #include #include #include +#include #include #include #include @@ -270,7 +271,7 @@ struct ksock_tx { /* transmit packet */ struct list_head tx_list; /* queue on conn for transmission etc */ struct list_head tx_zc_list; /* queue on peer_ni for ZC request */ - atomic_t tx_refcount; /* tx reference count */ + refcount_t tx_refcount; /* tx reference count */ int tx_nob; /* # packet bytes */ int tx_resid; /* residual bytes */ int tx_niov; /* # packet iovec frags */ @@ -311,8 +312,8 @@ struct ksock_conn { void *ksnc_saved_write_space; /* socket's original * write_space() callback */ - atomic_t ksnc_conn_refcount; /* conn refcount */ - atomic_t ksnc_sock_refcount; /* sock refcount */ + refcount_t ksnc_conn_refcount; /* conn refcount */ + refcount_t ksnc_sock_refcount; /* sock refcount */ struct ksock_sched *ksnc_scheduler; /* who schedules this connection */ u32 ksnc_myipaddr; /* my IP */ @@ -374,7 +375,7 @@ struct ksock_route { struct list_head ksnr_list; /* chain on peer_ni route list */ struct list_head ksnr_connd_list; /* chain on ksnr_connd_routes */ struct ksock_peer_ni *ksnr_peer; /* owning peer_ni */ - atomic_t ksnr_refcount; /* # users */ + refcount_t ksnr_refcount; /* # users */ time64_t ksnr_timeout; /* when (in secs) reconnection * can happen next */ @@ -404,7 +405,7 @@ struct ksock_peer_ni { * alive */ struct lnet_process_id ksnp_id; /* who's on the other end(s) */ - atomic_t ksnp_refcount; /* # users */ + refcount_t ksnp_refcount; /* # users */ int ksnp_closing; /* being closed */ int ksnp_accepting; /* # passive connections pending */ @@ -510,8 +511,7 @@ struct ksock_proto { static inline void 
ksocknal_conn_addref(struct ksock_conn *conn) { - LASSERT(atomic_read(&conn->ksnc_conn_refcount) > 0); - atomic_inc(&conn->ksnc_conn_refcount); + refcount_inc(&conn->ksnc_conn_refcount); } void ksocknal_queue_zombie_conn(struct ksock_conn *conn); @@ -520,8 +520,7 @@ struct ksock_proto { static inline void ksocknal_conn_decref(struct ksock_conn *conn) { - LASSERT(atomic_read(&conn->ksnc_conn_refcount) > 0); - if (atomic_dec_and_test(&conn->ksnc_conn_refcount)) + if (refcount_dec_and_test(&conn->ksnc_conn_refcount)) ksocknal_queue_zombie_conn(conn); } @@ -532,8 +531,7 @@ struct ksock_proto { read_lock(&ksocknal_data.ksnd_global_lock); if (!conn->ksnc_closing) { - LASSERT(atomic_read(&conn->ksnc_sock_refcount) > 0); - atomic_inc(&conn->ksnc_sock_refcount); + refcount_inc(&conn->ksnc_sock_refcount); rc = 0; } read_unlock(&ksocknal_data.ksnd_global_lock); @@ -544,8 +542,7 @@ struct ksock_proto { static inline void ksocknal_connsock_decref(struct ksock_conn *conn) { - LASSERT(atomic_read(&conn->ksnc_sock_refcount) > 0); - if (atomic_dec_and_test(&conn->ksnc_sock_refcount)) { + if (refcount_dec_and_test(&conn->ksnc_sock_refcount)) { LASSERT(conn->ksnc_closing); sock_release(conn->ksnc_sock); conn->ksnc_sock = NULL; @@ -556,8 +553,7 @@ struct ksock_proto { static inline void ksocknal_tx_addref(struct ksock_tx *tx) { - LASSERT(atomic_read(&tx->tx_refcount) > 0); - atomic_inc(&tx->tx_refcount); + refcount_inc(&tx->tx_refcount); } void ksocknal_tx_prep(struct ksock_conn *, struct ksock_tx *tx); @@ -566,16 +562,14 @@ struct ksock_proto { static inline void ksocknal_tx_decref(struct ksock_tx *tx) { - LASSERT(atomic_read(&tx->tx_refcount) > 0); - if (atomic_dec_and_test(&tx->tx_refcount)) + if (refcount_dec_and_test(&tx->tx_refcount)) ksocknal_tx_done(NULL, tx, 0); } static inline void ksocknal_route_addref(struct ksock_route *route) { - LASSERT(atomic_read(&route->ksnr_refcount) > 0); - atomic_inc(&route->ksnr_refcount); + refcount_inc(&route->ksnr_refcount); } void ksocknal_destroy_route(struct ksock_route *route); @@ -583,16 +577,14 @@ struct ksock_proto { static inline void ksocknal_route_decref(struct ksock_route *route) { - LASSERT(atomic_read(&route->ksnr_refcount) > 0); - if (atomic_dec_and_test(&route->ksnr_refcount)) + if (refcount_dec_and_test(&route->ksnr_refcount)) ksocknal_destroy_route(route); } static inline void ksocknal_peer_addref(struct ksock_peer_ni *peer_ni) { - LASSERT(atomic_read(&peer_ni->ksnp_refcount) > 0); - atomic_inc(&peer_ni->ksnp_refcount); + refcount_inc(&peer_ni->ksnp_refcount); } void ksocknal_destroy_peer(struct ksock_peer_ni *peer_ni); @@ -600,8 +592,7 @@ struct ksock_proto { static inline void ksocknal_peer_decref(struct ksock_peer_ni *peer_ni) { - LASSERT(atomic_read(&peer_ni->ksnp_refcount) > 0); - if (atomic_dec_and_test(&peer_ni->ksnp_refcount)) + if (refcount_dec_and_test(&peer_ni->ksnp_refcount)) ksocknal_destroy_peer(peer_ni); } diff --git a/net/lnet/klnds/socklnd/socklnd_cb.c b/net/lnet/klnds/socklnd/socklnd_cb.c index 936054ee..9b3b604 100644 --- a/net/lnet/klnds/socklnd/socklnd_cb.c +++ b/net/lnet/klnds/socklnd/socklnd_cb.c @@ -52,7 +52,7 @@ struct ksock_tx * if (!tx) return NULL; - atomic_set(&tx->tx_refcount, 1); + refcount_set(&tx->tx_refcount, 1); tx->tx_zc_aborted = 0; tx->tx_zc_capable = 0; tx->tx_zc_checked = 0; @@ -381,7 +381,7 @@ struct ksock_tx * tx->tx_hstatus = LNET_MSG_STATUS_LOCAL_ERROR; } - LASSERT(atomic_read(&tx->tx_refcount) == 1); + LASSERT(refcount_read(&tx->tx_refcount) == 1); ksocknal_tx_done(ni, tx, error); } } @@ -1072,7 +1072,7 @@ 
struct ksock_route * struct lnet_process_id *id; int rc; - LASSERT(atomic_read(&conn->ksnc_conn_refcount) > 0); + LASSERT(refcount_read(&conn->ksnc_conn_refcount) > 0); /* NB: sched lock NOT held */ /* SOCKNAL_RX_LNET_HEADER is here for backward compatibility */ From patchwork Wed Jul 15 20:45:09 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 11666285 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 36DE0618 for ; Wed, 15 Jul 2020 20:47:12 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 201C72065F for ; Wed, 15 Jul 2020 20:47:12 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 201C72065F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 2DEEE21FB13; Wed, 15 Jul 2020 13:46:25 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from smtp3.ccs.ornl.gov (smtp3.ccs.ornl.gov [160.91.203.39]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id D767C21F761 for ; Wed, 15 Jul 2020 13:45:31 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp3.ccs.ornl.gov (Postfix) with ESMTP id B88A95CC; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id B724F2A0; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Wed, 15 Jul 2020 16:45:09 -0400 Message-Id: <1594845918-29027-29-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> References: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 28/37] lnet: libcfs: don't call unshare_fs_struct() X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Mr NeilBrown A kthread runs with the same fs_struct as init. It is only helpful to unshare this if the thread will change one of the fields in the fs_struct: the root directory, the current working directory, or the umask. No lustre kthread changes any of these, so there is no need to call unshare_fs_struct().
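For reference, the fields in question live in struct fs_struct (include/linux/fs_struct.h; abridged here, layout as of kernels contemporary with this series):

struct fs_struct {
        int             users;
        spinlock_t      lock;
        seqcount_t      seq;
        int             umask;          /* file-creation mask */
        int             in_exec;
        struct path     root;           /* root directory */
        struct path     pwd;            /* current working directory */
};

Only a thread that writes umask, root, or pwd needs a private copy; a kthread that merely reads or never touches them can safely keep sharing init's fs_struct.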
WC-bug-id: https://jira.whamcloud.com/browse/LU-9859 Lustre-commit: 9013eb2bb5492 ("LU-9859 libcfs: don't call unshare_fs_struct()") Signed-off-by: Mr NeilBrown Reviewed-on: https://review.whamcloud.com/39132 Reviewed-by: James Simmons Reviewed-by: Yang Sheng Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/obdclass/llog.c | 2 -- fs/lustre/ptlrpc/import.c | 2 -- fs/lustre/ptlrpc/ptlrpcd.c | 1 - fs/lustre/ptlrpc/service.c | 3 --- 4 files changed, 8 deletions(-) diff --git a/fs/lustre/obdclass/llog.c b/fs/lustre/obdclass/llog.c index b2667d9..e172ebc 100644 --- a/fs/lustre/obdclass/llog.c +++ b/fs/lustre/obdclass/llog.c @@ -449,8 +449,6 @@ static int llog_process_thread_daemonize(void *arg) struct lu_env env; int rc; - unshare_fs_struct(); - /* client env has no keys, tags is just 0 */ rc = lu_env_init(&env, LCT_LOCAL | LCT_MG_THREAD); if (rc) diff --git a/fs/lustre/ptlrpc/import.c b/fs/lustre/ptlrpc/import.c index 1b62b81..1490dcf 100644 --- a/fs/lustre/ptlrpc/import.c +++ b/fs/lustre/ptlrpc/import.c @@ -1438,8 +1438,6 @@ static int ptlrpc_invalidate_import_thread(void *data) { struct obd_import *imp = data; - unshare_fs_struct(); - CDEBUG(D_HA, "thread invalidate import %s to %s@%s\n", imp->imp_obd->obd_name, obd2cli_tgt(imp->imp_obd), imp->imp_connection->c_remote_uuid.uuid); diff --git a/fs/lustre/ptlrpc/ptlrpcd.c b/fs/lustre/ptlrpc/ptlrpcd.c index 533f592..b0b81cc 100644 --- a/fs/lustre/ptlrpc/ptlrpcd.c +++ b/fs/lustre/ptlrpc/ptlrpcd.c @@ -393,7 +393,6 @@ static int ptlrpcd(void *arg) int rc = 0; int exit = 0; - unshare_fs_struct(); if (cfs_cpt_bind(cfs_cpt_tab, pc->pc_cpt) != 0) CWARN("Failed to bind %s on CPT %d\n", pc->pc_name, pc->pc_cpt); diff --git a/fs/lustre/ptlrpc/service.c b/fs/lustre/ptlrpc/service.c index 4d5e6b3..5881e0a 100644 --- a/fs/lustre/ptlrpc/service.c +++ b/fs/lustre/ptlrpc/service.c @@ -2175,7 +2175,6 @@ static int ptlrpc_main(void *arg) thread->t_task = current; thread->t_pid = current->pid; - unshare_fs_struct(); if (svc->srv_cpt_bind) { rc = cfs_cpt_bind(svc->srv_cptable, svcpt->scp_cpt); @@ -2391,8 +2390,6 @@ static int ptlrpc_hr_main(void *arg) if (!env) return -ENOMEM; - unshare_fs_struct(); - rc = cfs_cpt_bind(ptlrpc_hr.hr_cpt_table, hrp->hrp_cpt); if (rc != 0) { char threadname[20]; From patchwork Wed Jul 15 20:45:10 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 11666243 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 09ADB13A4 for ; Wed, 15 Jul 2020 20:46:25 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id E757D2065F for ; Wed, 15 Jul 2020 20:46:24 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org E757D2065F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 26F0C21F8B9; Wed, 15 Jul 2020 13:45:59 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from 
smtp3.ccs.ornl.gov (smtp3.ccs.ornl.gov [160.91.203.39]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 2B49821F761 for ; Wed, 15 Jul 2020 13:45:32 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp3.ccs.ornl.gov (Postfix) with ESMTP id BDCB85CF; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id BA7822B5; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Wed, 15 Jul 2020 16:45:10 -0400 Message-Id: <1594845918-29027-30-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> References: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 29/37] lnet: Allow router to forward to healthier NID X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Horn , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Chris Horn When a final-hop router (aka edge router) is forwarding a message, if both the originator and destination of the message are multi-rail capable, then allow the router to choose a new destination lpni if the one selected by the message originator is unhealthy or down. HPE-bug-id: LUS-8905 WC-bug-id: https://jira.whamcloud.com/browse/LU-13606 Lustre-commit: b0e8ab1a5f6f8 ("LU-13606 lnet: Allow router to forward to healthier NID") Signed-off-by: Chris Horn Reviewed-on: https://review.whamcloud.com/38798 Reviewed-by: Serguei Smirnov Reviewed-by: Amir Shehata Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- include/linux/lnet/lib-lnet.h | 4 ++-- net/lnet/lnet/lib-move.c | 37 +++++++++++++++++++++++++++++++++++-- 2 files changed, 37 insertions(+), 4 deletions(-) diff --git a/include/linux/lnet/lib-lnet.h b/include/linux/lnet/lib-lnet.h index 75c0da7..b069422 100644 --- a/include/linux/lnet/lib-lnet.h +++ b/include/linux/lnet/lib-lnet.h @@ -819,8 +819,8 @@ int lnet_get_peer_ni_info(u32 peer_index, u64 *nid, } /* - * A peer is alive if it satisfies the following two conditions: - * 1. peer health >= LNET_MAX_HEALTH_VALUE * router_sensitivity_percentage + * A peer NI is alive if it satisfies the following two conditions: + * 1. peer NI health >= LNET_MAX_HEALTH_VALUE * router_sensitivity_percentage * 2. the cached NI status received when we discover the peer is UP */ static inline bool diff --git a/net/lnet/lnet/lib-move.c b/net/lnet/lnet/lib-move.c index 2f3ef8c..234fbb5 100644 --- a/net/lnet/lnet/lib-move.c +++ b/net/lnet/lnet/lib-move.c @@ -2371,6 +2371,8 @@ struct lnet_ni * int cpt, rc; int md_cpt; u32 send_case = 0; + bool final_hop; + bool mr_forwarding_allowed; memset(&send_data, 0, sizeof(send_data)); @@ -2447,16 +2449,47 @@ struct lnet_ni * else send_case |= REMOTE_DST; + final_hop = false; + if (msg->msg_routing && (send_case & LOCAL_DST)) + final_hop = true; + + /* Determine whether to allow MR forwarding for this message. + * NB: MR forwarding is allowed if the message originator and the + * destination are both MR capable, and the destination lpni that was + * originally chosen by the originator is unhealthy or down.
+ * We check the MR capability of the destination further below + */ + mr_forwarding_allowed = false; + if (final_hop) { + struct lnet_peer *src_lp; + struct lnet_peer_ni *src_lpni; + + src_lpni = lnet_nid2peerni_locked(msg->msg_hdr.src_nid, + LNET_NID_ANY, cpt); + /* We don't fail the send if we hit any errors here. We'll just + * try to send it via non-multi-rail criteria + */ + if (!IS_ERR(src_lpni)) { + src_lp = lpni->lpni_peer_net->lpn_peer; + if (lnet_peer_is_multi_rail(src_lp) && + !lnet_is_peer_ni_alive(lpni)) + mr_forwarding_allowed = true; + } + CDEBUG(D_NET, "msg %p MR forwarding %s\n", msg, + mr_forwarding_allowed ? "allowed" : "not allowed"); + } + /* Deal with the peer as NMR in the following cases: * 1. the peer is NMR * 2. We're trying to recover a specific peer NI - * 3. I'm a router sending to the final destination + * 3. I'm a router sending to the final destination and MR forwarding is + * not allowed for this message (as determined above). * In this case the source of the message would've * already selected the final destination so my job * is to honor the selection. */ if (!lnet_peer_is_multi_rail(peer) || msg->msg_recovery || - (msg->msg_routing && (send_case & LOCAL_DST))) + (final_hop && !mr_forwarding_allowed)) send_case |= NMR_DST; else send_case |= MR_DST; From patchwork Wed Jul 15 20:45:11 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 11666261 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 574C3618 for ; Wed, 15 Jul 2020 20:46:40 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 40D472065F for ; Wed, 15 Jul 2020 20:46:40 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 40D472065F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 81B9F21F9ED; Wed, 15 Jul 2020 13:46:07 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from smtp3.ccs.ornl.gov (smtp3.ccs.ornl.gov [160.91.203.39]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 8273E21F807 for ; Wed, 15 Jul 2020 13:45:32 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp3.ccs.ornl.gov (Postfix) with ESMTP id C043B5D0; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id BDEEA2BA; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Wed, 15 Jul 2020 16:45:11 -0400 Message-Id: <1594845918-29027-31-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> References: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 30/37] lustre: llite: annotate non-owner locking X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For 
discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Mr NeilBrown The lli_lsm_sem locks taken by ll_prep_md_op_data() are sometimes released by a different thread. This confuses lockdep unless we explain the situation. So use down_read_non_owner() and up_read_non_owner(). WC-bug-id: https://jira.whamcloud.com/browse/LU-9679 Lustre-commit: f34392412fe22 ("LU-9679 llite: annotate non-owner locking") Signed-off-by: Mr NeilBrown Reviewed-on: https://review.whamcloud.com/39234 Reviewed-by: Andreas Dilger Reviewed-by: James Simmons Reviewed-by: Shaun Tancheff Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/llite/llite_lib.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/fs/lustre/llite/llite_lib.c b/fs/lustre/llite/llite_lib.c index c62e182..f52d2b5 100644 --- a/fs/lustre/llite/llite_lib.c +++ b/fs/lustre/llite/llite_lib.c @@ -2783,12 +2783,12 @@ int ll_obd_statfs(struct inode *inode, void __user *arg) void ll_unlock_md_op_lsm(struct md_op_data *op_data) { if (op_data->op_mea2_sem) { - up_read(op_data->op_mea2_sem); + up_read_non_owner(op_data->op_mea2_sem); op_data->op_mea2_sem = NULL; } if (op_data->op_mea1_sem) { - up_read(op_data->op_mea1_sem); + up_read_non_owner(op_data->op_mea1_sem); op_data->op_mea1_sem = NULL; } } @@ -2823,7 +2823,7 @@ struct md_op_data *ll_prep_md_op_data(struct md_op_data *op_data, op_data->op_code = opc; if (S_ISDIR(i1->i_mode)) { - down_read(&ll_i2info(i1)->lli_lsm_sem); + down_read_non_owner(&ll_i2info(i1)->lli_lsm_sem); op_data->op_mea1_sem = &ll_i2info(i1)->lli_lsm_sem; op_data->op_mea1 = ll_i2info(i1)->lli_lsm_md; op_data->op_default_mea1 = ll_i2info(i1)->lli_default_lsm_md; @@ -2833,7 +2833,10 @@ struct md_op_data *ll_prep_md_op_data(struct md_op_data *op_data, op_data->op_fid2 = *ll_inode2fid(i2); if (S_ISDIR(i2->i_mode)) { if (i2 != i1) { - down_read(&ll_i2info(i2)->lli_lsm_sem); + /* i2 is typically a child of i1, and MUST be + * further from the root to avoid deadlocks. 
+ */ + down_read_non_owner(&ll_i2info(i2)->lli_lsm_sem); op_data->op_mea2_sem = &ll_i2info(i2)->lli_lsm_sem; } From patchwork Wed Jul 15 20:45:12 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 11666283 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 17C801392 for ; Wed, 15 Jul 2020 20:47:09 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id F344A206F4 for ; Wed, 15 Jul 2020 20:47:08 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org F344A206F4 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 4137D21FAF8; Wed, 15 Jul 2020 13:46:23 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from smtp3.ccs.ornl.gov (smtp3.ccs.ornl.gov [160.91.203.39]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id C7FC021F84C for ; Wed, 15 Jul 2020 13:45:32 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp3.ccs.ornl.gov (Postfix) with ESMTP id C260C5D2; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id C0DCD2BB; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Wed, 15 Jul 2020 16:45:12 -0400 Message-Id: <1594845918-29027-32-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> References: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 31/37] lustre: osc: consume grants for direct I/O X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Vladimir Saveliev , Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Vladimir Saveliev The new IO engine implementation lost the code that consumes grants for direct I/O writes. That led to premature out-of-space conditions during direct I/O. The test below illustrates the problem: # OSTSIZE=100000 sh llmount.sh # dd if=/dev/zero of=/mnt/lustre/file bs=4k count=100 oflag=direct dd: error writing ‘/mnt/lustre/file’: No space left on device Consume grants for direct I/O: try to consume them in osc_queue_sync_pages() when it is called for pages that are being written via direct I/O. Tests are added to verify grant consumption in buffered and direct I/O, and to verify direct I/O overwrite when the OST is full. The overwrite test is for ldiskfs only, as zfs is unable to overwrite when it is full.
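The reservation made in the hunk that follows reduces to a small arithmetic rule: one grant_extent_tax plus one full chunk of grant for every (possibly partial) chunk the pages span. A sketch of that arithmetic (the helper name is hypothetical; chunkbits and grant_extent_tax mirror the cl_chunkbits and cl_grant_extent_tax fields used in the patch):

#include <linux/kernel.h>       /* DIV_ROUND_UP */
#include <linux/mm.h>           /* PAGE_SHIFT */

static long dio_write_grant_need(unsigned int chunkbits,
                                 long grant_extent_tax,
                                 unsigned int page_count)
{
        /* pages per chunk */
        unsigned int ppc = 1U << (chunkbits - PAGE_SHIFT);

        /* tax + one chunk of grant per (partial) chunk covered */
        return grant_extent_tax +
               (1L << chunkbits) * DIV_ROUND_UP(page_count, ppc);
}

For example, assuming 4 KiB pages and 64 KiB chunks (chunkbits = 16, so ppc = 16), the 100-page dd above would reserve DIV_ROUND_UP(100, 16) = 7 chunks of grant plus the extent tax.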
Cray-bug-id: LUS-7036 WC-bug-id: https://jira.whamcloud.com/browse/LU-12687 Lustre-commit: 05f326a7988a7a ("LU-12687 osc: consume grants for direct I/O") Signed-off-by: Vladimir Saveliev Reviewed-on: https://review.whamcloud.com/35896 Reviewed-by: Wang Shilong Reviewed-by: Andreas Dilger Reviewed-by: Mike Pershin Signed-off-by: James Simmons --- fs/lustre/osc/osc_cache.c | 22 ++++++++++++++++++++++ 1 file changed, 22 insertions(+) diff --git a/fs/lustre/osc/osc_cache.c b/fs/lustre/osc/osc_cache.c index fe03c0d..c7aaabb 100644 --- a/fs/lustre/osc/osc_cache.c +++ b/fs/lustre/osc/osc_cache.c @@ -2692,6 +2692,28 @@ int osc_queue_sync_pages(const struct lu_env *env, const struct cl_io *io, ext->oe_srvlock = !!(brw_flags & OBD_BRW_SRVLOCK); ext->oe_ndelay = !!(brw_flags & OBD_BRW_NDELAY); ext->oe_dio = !!(brw_flags & OBD_BRW_NOCACHE); + if (ext->oe_dio && !ext->oe_rw) { /* direct io write */ + int grants; + int ppc; + + ppc = 1 << (cli->cl_chunkbits - PAGE_SHIFT); + grants = cli->cl_grant_extent_tax; + grants += (1 << cli->cl_chunkbits) * + ((page_count + ppc - 1) / ppc); + + spin_lock(&cli->cl_loi_list_lock); + if (osc_reserve_grant(cli, grants) == 0) { + list_for_each_entry(oap, list, oap_pending_item) { + osc_consume_write_grant(cli, + &oap->oap_brw_page); + atomic_long_inc(&obd_dirty_pages); + } + osc_unreserve_grant_nolock(cli, grants, 0); + ext->oe_grants = grants; + } + spin_unlock(&cli->cl_loi_list_lock); + } + ext->oe_is_rdma_only = !!(brw_flags & OBD_BRW_RDMA_ONLY); ext->oe_nr_pages = page_count; ext->oe_mppr = mppr; From patchwork Wed Jul 15 20:45:13 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 11666257 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 938C41392 for ; Wed, 15 Jul 2020 20:46:37 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 7D97A2065F for ; Wed, 15 Jul 2020 20:46:37 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 7D97A2065F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 1F34121F9E0; Wed, 15 Jul 2020 13:46:06 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from smtp3.ccs.ornl.gov (smtp3.ccs.ornl.gov [160.91.203.39]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 16C9721F81A for ; Wed, 15 Jul 2020 13:45:33 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp3.ccs.ornl.gov (Postfix) with ESMTP id C70F55E0; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id C43398D; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Wed, 15 Jul 2020 16:45:13 -0400 Message-Id: <1594845918-29027-33-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> References: 
<1594845918-29027-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 32/37] lnet: remove LNetMEUnlink and clean up related code X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Mr NeilBrown LNetMEUnlink is not particularly useful, and exposing it as an LNet interface only provides the opportunity for it to be misused. Every successful call to LNetMEAttach() is followed by a call to LNetMDAttach(). If that call succeeds, the ME is owned by the MD and the caller mustn't touch it again. If the call fails, the caller is currently required to call LNetMEUnlink(), which all callers do, and these are the only places that LNetMEUnlink() is called. As LNetMDAttach() knows when it will fail, it can unlink the ME itself and save the caller the effort. This allows LNetMEUnlink() to be removed, which simplifies the LNet interface. LNetMEUnlink() is also used in ptl_send_rpc() in a situation where ptl_send_buf() fails. In this case both the ME and the MD need to be unlinked, and as they are interconnected, LNetMEUnlink() or LNetMDUnlink() can equally do the job. So change it to use LNetMDUnlink(). LNetMEUnlink() is primarily a call to lnet_me_unlink(). It also - has some handling for when ->me_md is not NULL, but that is never the case - takes the lnet_res_lock(), which LNetMDAttach() already takes. So none of this functionality is useful to LNetMDAttach(); on failure it can call lnet_me_unlink() directly while it still holds the lock. This patch: - moves the calls to lnet_md_validate() into lnet_md_build() - changes LNetMDAttach() to always take the lnet_res_lock(), and to call lnet_me_unlink() on failure. - removes all calls to LNetMEUnlink() and sometimes simplifies surrounding code.
- changes lnet_md_link() to 'void' as it only ever returns '0', and thus simplify error handling in LNetMDAttach() and LNetMDBind() WC-bug-id: https://jira.whamcloud.com/browse/LU-12678 Lustre-commit: e17ee2296c201 ("LU-12678 lnet: remove LNetMEUnlink and clean up related code") Signed-off-by: Mr NeilBrown Reviewed-on: https://review.whamcloud.com/38646 Reviewed-by: Yang Sheng Reviewed-by: James Simmons Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/ptlrpc/niobuf.c | 12 +++------ include/linux/lnet/api.h | 6 ++--- net/lnet/lnet/api-ni.c | 5 +--- net/lnet/lnet/lib-md.c | 62 +++++++++++++++-------------------------------- net/lnet/lnet/lib-me.c | 39 ----------------------------- net/lnet/selftest/rpc.c | 1 - 6 files changed, 26 insertions(+), 99 deletions(-) diff --git a/fs/lustre/ptlrpc/niobuf.c b/fs/lustre/ptlrpc/niobuf.c index 6fb79a2..924b9c4 100644 --- a/fs/lustre/ptlrpc/niobuf.c +++ b/fs/lustre/ptlrpc/niobuf.c @@ -203,7 +203,6 @@ static int ptlrpc_register_bulk(struct ptlrpc_request *req) CERROR("%s: LNetMDAttach failed x%llu/%d: rc = %d\n", desc->bd_import->imp_obd->obd_name, mbits, posted_md, rc); - LNetMEUnlink(me); break; } } @@ -676,7 +675,7 @@ int ptl_send_rpc(struct ptlrpc_request *request, int noreply) request->rq_receiving_reply = 0; spin_unlock(&request->rq_lock); rc = -ENOMEM; - goto cleanup_me; + goto cleanup_bulk; } percpu_ref_get(&ptlrpc_pending); @@ -720,12 +719,8 @@ int ptl_send_rpc(struct ptlrpc_request *request, int noreply) if (noreply) goto out; -cleanup_me: - /* MEUnlink is safe; the PUT didn't even get off the ground, and - * nobody apart from the PUT's target has the right nid+XID to - * access the reply buffer. - */ - LNetMEUnlink(reply_me); + LNetMDUnlink(request->rq_reply_md_h); + /* UNLINKED callback called synchronously */ LASSERT(!request->rq_receiving_reply); @@ -802,7 +797,6 @@ int ptlrpc_register_rqbd(struct ptlrpc_request_buffer_desc *rqbd) CERROR("ptlrpc: LNetMDAttach failed: rc = %d\n", rc); LASSERT(rc == -ENOMEM); - LNetMEUnlink(me); rqbd->rqbd_refcount = 0; return -ENOMEM; diff --git a/include/linux/lnet/api.h b/include/linux/lnet/api.h index 24115eb..95805de 100644 --- a/include/linux/lnet/api.h +++ b/include/linux/lnet/api.h @@ -90,8 +90,8 @@ * list is a chain of MEs. Each ME includes a pointer to a memory descriptor * and a set of match criteria. The match criteria can be used to reject * incoming requests based on process ID or the match bits provided in the - * request. MEs can be dynamically inserted into a match list by LNetMEAttach() - * and removed from its list by LNetMEUnlink(). + * request. MEs can be dynamically inserted into a match list by LNetMEAttach(), + * and must then be attached to an MD with LNetMDAttach(). 
* @{ */ struct lnet_me * @@ -101,8 +101,6 @@ struct lnet_me * u64 ignore_bits_in, enum lnet_unlink unlink_in, enum lnet_ins_pos pos_in); - -void LNetMEUnlink(struct lnet_me *current_in); /** @} lnet_me */ /** \defgroup lnet_md Memory descriptors diff --git a/net/lnet/lnet/api-ni.c b/net/lnet/lnet/api-ni.c index 3e69435..5f35468 100644 --- a/net/lnet/lnet/api-ni.c +++ b/net/lnet/lnet/api-ni.c @@ -1645,14 +1645,12 @@ struct lnet_ping_buffer * rc = LNetMDAttach(me, &md, LNET_RETAIN, ping_mdh); if (rc) { CERROR("Can't attach ping target MD: %d\n", rc); - goto fail_unlink_ping_me; + goto fail_decref_ping_buffer; } lnet_ping_buffer_addref(*ppbuf); return 0; -fail_unlink_ping_me: - LNetMEUnlink(me); fail_decref_ping_buffer: LASSERT(atomic_read(&(*ppbuf)->pb_refcnt) == 1); lnet_ping_buffer_decref(*ppbuf); @@ -1855,7 +1853,6 @@ int lnet_push_target_post(struct lnet_ping_buffer *pbuf, rc = LNetMDAttach(me, &md, LNET_UNLINK, mdhp); if (rc) { CERROR("Can't attach push MD: %d\n", rc); - LNetMEUnlink(me); lnet_ping_buffer_decref(pbuf); pbuf->pb_needs_post = true; return rc; diff --git a/net/lnet/lnet/lib-md.c b/net/lnet/lnet/lib-md.c index e80dc6f..48249f3 100644 --- a/net/lnet/lnet/lib-md.c +++ b/net/lnet/lnet/lib-md.c @@ -123,6 +123,8 @@ int lnet_cpt_of_md(struct lnet_libmd *md, unsigned int offset) return cpt; } +static int lnet_md_validate(const struct lnet_md *umd); + static struct lnet_libmd * lnet_md_build(const struct lnet_md *umd, int unlink) { @@ -132,6 +134,9 @@ int lnet_cpt_of_md(struct lnet_libmd *md, unsigned int offset) struct lnet_libmd *lmd; unsigned int size; + if (lnet_md_validate(umd) != 0) + return ERR_PTR(-EINVAL); + if (umd->options & LNET_MD_KIOV) niov = umd->length; else @@ -228,15 +233,14 @@ int lnet_cpt_of_md(struct lnet_libmd *md, unsigned int offset) } /* must be called with resource lock held */ -static int +static void lnet_md_link(struct lnet_libmd *md, lnet_handler_t handler, int cpt) { struct lnet_res_container *container = the_lnet.ln_md_containers[cpt]; /* * NB we are passed an allocated, but inactive md. - * if we return success, caller may lnet_md_unlink() it. - * otherwise caller may only kfree() it. + * Caller may lnet_md_unlink() it, or may lnet_md_free() it. */ /* * This implementation doesn't know how to create START events or @@ -255,8 +259,6 @@ int lnet_cpt_of_md(struct lnet_libmd *md, unsigned int offset) LASSERT(list_empty(&md->md_list)); list_add(&md->md_list, &container->rec_active); - - return 0; } /* must be called with lnet_res_lock held */ @@ -304,14 +306,11 @@ int lnet_cpt_of_md(struct lnet_libmd *md, unsigned int offset) * @handle On successful returns, a handle to the newly created MD is * saved here. This handle can be used later in LNetMDUnlink(). * + * The ME will either be linked to the new MD, or it will be freed. + * * Return: 0 on success. * -EINVAL If @umd is not valid. * -ENOMEM If new MD cannot be allocated. - * -ENOENT Either @me or @umd.handle does not point to a - * valid object. Note that it's OK to supply a NULL @umd.handle - * by calling LNetInvalidateHandle() on it. - * -EBUSY if the ME pointed to by @me is already associated with - * a MD. 
*/ int LNetMDAttach(struct lnet_me *me, const struct lnet_md *umd, @@ -321,33 +320,27 @@ int lnet_cpt_of_md(struct lnet_libmd *md, unsigned int offset) LIST_HEAD(drops); struct lnet_libmd *md; int cpt; - int rc; LASSERT(the_lnet.ln_refcount > 0); - if (lnet_md_validate(umd)) - return -EINVAL; + LASSERT(!me->me_md); if (!(umd->options & (LNET_MD_OP_GET | LNET_MD_OP_PUT))) { CERROR("Invalid option: no MD_OP set\n"); - return -EINVAL; - } - - md = lnet_md_build(umd, unlink); - if (IS_ERR(md)) - return PTR_ERR(md); + md = ERR_PTR(-EINVAL); + } else + md = lnet_md_build(umd, unlink); cpt = me->me_cpt; - lnet_res_lock(cpt); - if (me->me_md) - rc = -EBUSY; - else - rc = lnet_md_link(md, umd->handler, cpt); + if (IS_ERR(md)) { + lnet_me_unlink(me); + lnet_res_unlock(cpt); + return PTR_ERR(md); + } - if (rc) - goto out_unlock; + lnet_md_link(md, umd->handler, cpt); /* * attach this MD to portal of ME and check if it matches any @@ -363,11 +356,6 @@ int lnet_cpt_of_md(struct lnet_libmd *md, unsigned int offset) lnet_recv_delayed_msg_list(&matches); return 0; - -out_unlock: - lnet_res_unlock(cpt); - kfree(md); - return rc; } EXPORT_SYMBOL(LNetMDAttach); @@ -383,9 +371,6 @@ int lnet_cpt_of_md(struct lnet_libmd *md, unsigned int offset) * Return: 0 On success. * -EINVAL If @umd is not valid. * -ENOMEM If new MD cannot be allocated. - * -ENOENT @umd.handle does not point to a valid EQ. - * Note that it's OK to supply a NULL @umd.handle by - * calling LNetInvalidateHandle() on it. */ int LNetMDBind(const struct lnet_md *umd, enum lnet_unlink unlink, @@ -397,9 +382,6 @@ int lnet_cpt_of_md(struct lnet_libmd *md, unsigned int offset) LASSERT(the_lnet.ln_refcount > 0); - if (lnet_md_validate(umd)) - return -EINVAL; - if ((umd->options & (LNET_MD_OP_GET | LNET_MD_OP_PUT))) { CERROR("Invalid option: GET|PUT illegal on active MDs\n"); return -EINVAL; @@ -418,17 +400,13 @@ int lnet_cpt_of_md(struct lnet_libmd *md, unsigned int offset) cpt = lnet_res_lock_current(); - rc = lnet_md_link(md, umd->handler, cpt); - if (rc) - goto out_unlock; + lnet_md_link(md, umd->handler, cpt); lnet_md2handle(handle, md); lnet_res_unlock(cpt); return 0; -out_unlock: - lnet_res_unlock(cpt); out_free: kfree(md); diff --git a/net/lnet/lnet/lib-me.c b/net/lnet/lnet/lib-me.c index 14ab21f..f75f3cb 100644 --- a/net/lnet/lnet/lib-me.c +++ b/net/lnet/lnet/lib-me.c @@ -118,45 +118,6 @@ struct lnet_me * } EXPORT_SYMBOL(LNetMEAttach); -/** - * Unlink a match entry from its match list. - * - * This operation also releases any resources associated with the ME. If a - * memory descriptor is attached to the ME, then it will be unlinked as well - * and an unlink event will be generated. It is an error to use the ME handle - * after calling LNetMEUnlink(). - * - * @me The ME to be unlinked. - * - * \see LNetMDUnlink() for the discussion on delivering unlink event. 
- */ -void -LNetMEUnlink(struct lnet_me *me) -{ - struct lnet_libmd *md; - struct lnet_event ev; - int cpt; - - LASSERT(the_lnet.ln_refcount > 0); - - cpt = me->me_cpt; - lnet_res_lock(cpt); - - md = me->me_md; - if (md) { - md->md_flags |= LNET_MD_FLAG_ABORTED; - if (md->md_handler && !md->md_refcount) { - lnet_build_unlink_event(md, &ev); - md->md_handler(&ev); - } - } - - lnet_me_unlink(me); - - lnet_res_unlock(cpt); -} -EXPORT_SYMBOL(LNetMEUnlink); - /* call with lnet_res_lock please */ void lnet_me_unlink(struct lnet_me *me) diff --git a/net/lnet/selftest/rpc.c b/net/lnet/selftest/rpc.c index 799ad99..a72e485 100644 --- a/net/lnet/selftest/rpc.c +++ b/net/lnet/selftest/rpc.c @@ -383,7 +383,6 @@ struct srpc_bulk * CERROR("LNetMDAttach failed: %d\n", rc); LASSERT(rc == -ENOMEM); - LNetMEUnlink(me); return -ENOMEM; } From patchwork Wed Jul 15 20:45:14 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 11666265 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9C5AC1392 for ; Wed, 15 Jul 2020 20:46:43 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 85BD12065F for ; Wed, 15 Jul 2020 20:46:43 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 85BD12065F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 4AC5F21FA11; Wed, 15 Jul 2020 13:46:09 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from smtp3.ccs.ornl.gov (smtp3.ccs.ornl.gov [160.91.203.39]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 706E921F81A for ; Wed, 15 Jul 2020 13:45:33 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp3.ccs.ornl.gov (Postfix) with ESMTP id C91B75E1; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id C70D12A0; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Wed, 15 Jul 2020 16:45:14 -0400 Message-Id: <1594845918-29027-34-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> References: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 33/37] lnet: Set remote NI status in lnet_notify X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Horn , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Chris Horn The gnilnd receives node health information asynchronously from any tx failure, so the aliveness of an lpni as reported by lnet_is_peer_ni_alive() may not match what the LND is telling us.
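The resulting state changes are small enough to model directly. A rough userspace sketch follows; the struct, the names, and the health bump are simplifications of lnet_notify()'s locked updates, not the kernel API.

#include <stdbool.h>
#include <stdio.h>

enum ni_status { NI_STATUS_UP, NI_STATUS_DOWN };

struct peer_ni {
        enum ni_status ns_status; /* cached remote NI status */
        int healthv;
};

#define MAX_HEALTH 1000

/* Model of the patched lnet_notify() hunks: a "reset" notification pins
 * the cached status in both directions, while a non-reset notification
 * only nudges the health value by the configured sensitivity.
 */
static void notify(struct peer_ni *lpni, bool alive, bool reset, int sensitivity)
{
        if (alive) {
                if (reset) {
                        lpni->ns_status = NI_STATUS_UP;
                        lpni->healthv = MAX_HEALTH;
                } else {
                        lpni->healthv += sensitivity;
                        if (lpni->healthv > MAX_HEALTH)
                                lpni->healthv = MAX_HEALTH;
                }
        } else if (reset) {
                /* the new hunk: remote NIs are reliably marked down */
                lpni->ns_status = NI_STATUS_DOWN;
        }
}

int main(void)
{
        struct peer_ni ni = { NI_STATUS_UP, MAX_HEALTH };

        notify(&ni, false, true, 100);
        printf("after dead+reset: %s\n",
               ni.ns_status == NI_STATUS_DOWN ? "down" : "up");
        return 0;
}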
Use existing reset flag to set cached NI status down so we can be sure that remote NIs are correctly set down. HPE-bug-id: LUS-8897 WC-bug-id: https://jira.whamcloud.com/browse/LU-13648 Lustre-commit: 8010dbb660766 ("LU-13648 lnet: Set remote NI status in lnet_notify") Signed-off-by: Chris Horn Reviewed-on: https://review.whamcloud.com/38862 Reviewed-by: Amir Shehata Reviewed-by: Serguei Smirnov Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- net/lnet/lnet/router.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/net/lnet/lnet/router.c b/net/lnet/lnet/router.c index c0578d9..e3b3e71 100644 --- a/net/lnet/lnet/router.c +++ b/net/lnet/lnet/router.c @@ -1671,8 +1671,7 @@ bool lnet_router_checker_active(void) CDEBUG(D_NET, "%s notifying %s: %s\n", !ni ? "userspace" : libcfs_nid2str(ni->ni_nid), - libcfs_nid2str(nid), - alive ? "up" : "down"); + libcfs_nid2str(nid), alive ? "up" : "down"); if (ni && LNET_NIDNET(ni->ni_nid) != LNET_NIDNET(nid)) { @@ -1714,6 +1713,7 @@ bool lnet_router_checker_active(void) if (alive) { if (reset) { + lpni->lpni_ns_status = LNET_NI_STATUS_UP; lnet_set_lpni_healthv_locked(lpni, LNET_MAX_HEALTH_VALUE); } else { @@ -1726,6 +1726,8 @@ bool lnet_router_checker_active(void) (sensitivity) ? sensitivity : lnet_health_sensitivity); } + } else if (reset) { + lpni->lpni_ns_status = LNET_NI_STATUS_DOWN; } /* recalculate aliveness */ From patchwork Wed Jul 15 20:45:15 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 11666287 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7EFC3618 for ; Wed, 15 Jul 2020 20:47:14 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 689C42065F for ; Wed, 15 Jul 2020 20:47:14 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 689C42065F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 5E0E021FB25; Wed, 15 Jul 2020 13:46:26 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from smtp3.ccs.ornl.gov (smtp3.ccs.ornl.gov [160.91.203.39]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id B27E721F85B for ; Wed, 15 Jul 2020 13:45:33 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp3.ccs.ornl.gov (Postfix) with ESMTP id CB80D5E2; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id C9CCC2B5; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Wed, 15 Jul 2020 16:45:15 -0400 Message-Id: <1594845918-29027-35-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> References: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 34/37] lustre: ptlrpc: fix endless loop issue X-BeenThere: 
lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Hongchao Zhang , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Hongchao Zhang In ptlrpc_pinger_main(), if pinging the recoverable clients takes too long, the thread can get stuck in an endless loop because pinger_check_timeout() returns a negative value. WC-bug-id: https://jira.whamcloud.com/browse/LU-13667 Lustre-commit: 6be2dbb259512 ("LU-13667 ptlrpc: fix endless loop issue") Signed-off-by: Hongchao Zhang Reviewed-on: https://review.whamcloud.com/38915 Reviewed-by: Andreas Dilger Reviewed-by: Olaf Faaland-LLNL Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/ptlrpc/pinger.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/fs/lustre/ptlrpc/pinger.c b/fs/lustre/ptlrpc/pinger.c index ec4c51a..9f57c61 100644 --- a/fs/lustre/ptlrpc/pinger.c +++ b/fs/lustre/ptlrpc/pinger.c @@ -258,12 +258,13 @@ static void ptlrpc_pinger_process_import(struct obd_import *imp, static void ptlrpc_pinger_main(struct work_struct *ws) { - time64_t this_ping = ktime_get_seconds(); - time64_t time_to_next_wake; + time64_t this_ping, time_after_ping, time_to_next_wake; struct timeout_item *item; struct obd_import *imp; do { + this_ping = ktime_get_seconds(); + mutex_lock(&pinger_mutex); list_for_each_entry(item, &timeout_list, ti_chain) { item->ti_cb(item, item->ti_cb_data); @@ -277,6 +278,12 @@ static void ptlrpc_pinger_main(struct work_struct *ws) } mutex_unlock(&pinger_mutex); + time_after_ping = ktime_get_seconds(); + + if ((ktime_get_seconds() - this_ping - 3) > PING_INTERVAL) + CDEBUG(D_HA, "long time to ping: %lld, %lld, %lld\n", + this_ping, time_after_ping, ktime_get_seconds()); + /* Wait until the next ping time, or until we're stopped.
*/ time_to_next_wake = pinger_check_timeout(this_ping); /* The ping sent by ptlrpc_send_rpc may get sent out From patchwork Wed Jul 15 20:45:16 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 11666269 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C6A58618 for ; Wed, 15 Jul 2020 20:46:49 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id B02B72065F for ; Wed, 15 Jul 2020 20:46:49 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org B02B72065F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 320C721FA48; Wed, 15 Jul 2020 13:46:12 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from smtp3.ccs.ornl.gov (smtp3.ccs.ornl.gov [160.91.203.39]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 039A121F7F9 for ; Wed, 15 Jul 2020 13:45:34 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp3.ccs.ornl.gov (Postfix) with ESMTP id CEB1B5E6; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id CCB052BA; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Wed, 15 Jul 2020 16:45:16 -0400 Message-Id: <1594845918-29027-36-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> References: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 35/37] lustre: llite: fix short io for AIO X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Wang Shilong , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Wang Shilong The problem is that AIO currently cannot handle an I/O size larger than the stripe size: the cl_io loop is needed to handle I/O across stripes, but since -EIOCBQUEUED is returned for AIO, the loop stops early and a short I/O happens. This patch fixes the problem by making the IO engine aware of this special error so that it proceeds to finish all the IO requests.
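The heart of the fix is in cl_io_loop(): treat -EIOCBQUEUED as "keep iterating" rather than as a terminating status, while remembering it so the caller still learns that the request was queued. A stripped-down sketch of that control flow; S_EIOCBQUEUED stands in for the kernel's -EIOCBQUEUED and the per-slice work is faked.

#include <stdio.h>

#define S_EIOCBQUEUED (-529) /* stand-in for the kernel's -EIOCBQUEUED */

/* Each stripe-sized slice of an AIO request is merely submitted, not
 * completed, so it reports "queued" instead of 0.
 */
static int do_one_slice(void)
{
        return S_EIOCBQUEUED;
}

static int io_loop(int nr_slices)
{
        int result = 0;
        int rc = 0; /* remembers the first non-zero status seen */

        do {
                result = do_one_slice();
                nr_slices--;
                if (result)
                        rc = result;
                /* Before the fix the condition was effectively
                 * (result == 0 && ...), so the first queued slice ended
                 * the loop early: a short I/O.
                 */
        } while ((result == 0 || result == S_EIOCBQUEUED) && nr_slices > 0);

        if (rc && !result)
                result = rc;
        return result; /* the caller still sees "queued" */
}

int main(void)
{
        printf("3-slice AIO write -> %d\n", io_loop(3));
        return 0;
}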
Fixes: fde7ac1942f5 ("lustre: clio: AIO support for direct IO") WC-bug-id: https://jira.whamcloud.com/browse/LU-13697 Lustre-commit: 84c3e85ced2dd ("LU-13697 llite: fix short io for AIO") Signed-off-by: Wang Shilong Reviewed-on: https://review.whamcloud.com/39104 Reviewed-by: Andreas Dilger Reviewed-by: Bobi Jam Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/include/cl_object.h | 2 ++ fs/lustre/llite/file.c | 32 +++++++++++++++++- fs/lustre/llite/rw26.c | 43 +++++++++++++++++-------- fs/lustre/llite/vvp_internal.h | 3 +- fs/lustre/llite/vvp_io.c | 73 ++++++++++++++++++++++++++++-------------- fs/lustre/obdclass/cl_io.c | 9 +++++- 6 files changed, 122 insertions(+), 40 deletions(-) diff --git a/fs/lustre/include/cl_object.h b/fs/lustre/include/cl_object.h index e656c68..e849f23 100644 --- a/fs/lustre/include/cl_object.h +++ b/fs/lustre/include/cl_object.h @@ -1814,6 +1814,8 @@ struct cl_io { enum cl_io_state ci_state; /** main object this io is against. Immutable after creation. */ struct cl_object *ci_obj; + /** one AIO request might be split in cl_io_loop */ + struct cl_dio_aio *ci_aio; /** * Upper layer io, of which this io is a part of. Immutable after * creation. diff --git a/fs/lustre/llite/file.c b/fs/lustre/llite/file.c index 1849229..757950f 100644 --- a/fs/lustre/llite/file.c +++ b/fs/lustre/llite/file.c @@ -1514,6 +1514,7 @@ static void ll_heat_add(struct inode *inode, enum cl_io_type iot, int rc = 0; unsigned int retried = 0; unsigned int ignore_lockless = 0; + bool is_aio = false; CDEBUG(D_VFSTRACE, "file: %pD, type: %d ppos: %llu, count: %zu\n", file, iot, *ppos, count); @@ -1536,6 +1537,15 @@ static void ll_heat_add(struct inode *inode, enum cl_io_type iot, vio->vui_fd = file->private_data; vio->vui_iter = args->u.normal.via_iter; vio->vui_iocb = args->u.normal.via_iocb; + if (file->f_flags & O_DIRECT) { + if (!is_sync_kiocb(vio->vui_iocb)) + is_aio = true; + io->ci_aio = cl_aio_alloc(vio->vui_iocb); + if (!io->ci_aio) { + rc = -ENOMEM; + goto out; + } + } /* * Direct IO reads must also take range lock, * or multiple reads will try to work on the same pages @@ -1567,7 +1577,14 @@ static void ll_heat_add(struct inode *inode, enum cl_io_type iot, rc = io->ci_result; } - if (io->ci_nob > 0) { + /* + * In order to move forward AIO, ci_nob was increased, + * but that doesn't mean io have been finished, it just + * means io have been submited, we will always return + * EIOCBQUEUED to the caller, So we could only return + * number of bytes in non-AIO case. + */ + if (io->ci_nob > 0 && !is_aio) { result += io->ci_nob; count -= io->ci_nob; *ppos = io->u.ci_wr.wr.crw_pos; @@ -1577,6 +1594,19 @@ static void ll_heat_add(struct inode *inode, enum cl_io_type iot, args->u.normal.via_iter = vio->vui_iter; } out: + if (io->ci_aio) { + /** + * Drop one extra reference so that end_io() could be + * called for this IO context, we could call it after + * we make sure all AIO requests have been proceed. + */ + cl_sync_io_note(env, &io->ci_aio->cda_sync, + rc == -EIOCBQUEUED ? 
0 : rc); + if (!is_aio) { + cl_aio_free(io->ci_aio); + io->ci_aio = NULL; + } + } cl_io_fini(env, io); CDEBUG(D_VFSTRACE, diff --git a/fs/lustre/llite/rw26.c b/fs/lustre/llite/rw26.c index d0e3ff6..b3802cf 100644 --- a/fs/lustre/llite/rw26.c +++ b/fs/lustre/llite/rw26.c @@ -290,6 +290,7 @@ static ssize_t ll_direct_IO(struct kiocb *iocb, struct iov_iter *iter) ssize_t tot_bytes = 0, result = 0; loff_t file_offset = iocb->ki_pos; int rw = iov_iter_rw(iter); + struct vvp_io *vio; /* if file is encrypted, return 0 so that we fall back to buffered IO */ if (IS_ENCRYPTED(inode)) @@ -319,12 +320,13 @@ static ssize_t ll_direct_IO(struct kiocb *iocb, struct iov_iter *iter) env = lcc->lcc_env; LASSERT(!IS_ERR(env)); + vio = vvp_env_io(env); io = lcc->lcc_io; LASSERT(io); - aio = cl_aio_alloc(iocb); - if (!aio) - return -ENOMEM; + aio = io->ci_aio; + LASSERT(aio); + LASSERT(aio->cda_iocb == iocb); /* 0. Need locking between buffered and direct access. and race with * size changing by concurrent truncates and writes. @@ -368,24 +370,39 @@ static ssize_t ll_direct_IO(struct kiocb *iocb, struct iov_iter *iter) } out: - aio->cda_bytes = tot_bytes; - cl_sync_io_note(env, &aio->cda_sync, result); + aio->cda_bytes += tot_bytes; if (is_sync_kiocb(iocb)) { + struct cl_sync_io *anchor = &aio->cda_sync; ssize_t rc2; - rc2 = cl_sync_io_wait(env, &aio->cda_sync, 0); + /** + * @anchor was inited as 1 to prevent end_io to be + * called before we add all pages for IO, so drop + * one extra reference to make sure we could wait + * count to be zero. + */ + cl_sync_io_note(env, anchor, result); + + rc2 = cl_sync_io_wait(env, anchor, 0); if (result == 0 && rc2) result = rc2; + /** + * One extra reference again, as if @anchor is + * reused we assume it as 1 before using. + */ + atomic_add(1, &anchor->csi_sync_nr); if (result == 0) { - struct vvp_io *vio = vvp_env_io(env); /* no commit async for direct IO */ - vio->u.write.vui_written += tot_bytes; + vio->u.readwrite.vui_written += tot_bytes; result = tot_bytes; } - cl_aio_free(aio); } else { + if (rw == WRITE) + vio->u.readwrite.vui_written += tot_bytes; + else + vio->u.readwrite.vui_read += tot_bytes; result = -EIOCBQUEUED; } @@ -523,7 +540,7 @@ static int ll_write_begin(struct file *file, struct address_space *mapping, vmpage = grab_cache_page_nowait(mapping, index); if (unlikely(!vmpage || PageDirty(vmpage) || PageWriteback(vmpage))) { struct vvp_io *vio = vvp_env_io(env); - struct cl_page_list *plist = &vio->u.write.vui_queue; + struct cl_page_list *plist = &vio->u.readwrite.vui_queue; /* if the page is already in dirty cache, we have to commit * the pages right now; otherwise, it may cause deadlock @@ -685,17 +702,17 @@ static int ll_write_end(struct file *file, struct address_space *mapping, LASSERT(cl_page_is_owned(page, io)); if (copied > 0) { - struct cl_page_list *plist = &vio->u.write.vui_queue; + struct cl_page_list *plist = &vio->u.readwrite.vui_queue; lcc->lcc_page = NULL; /* page will be queued */ /* Add it into write queue */ cl_page_list_add(plist, page); if (plist->pl_nr == 1) /* first page */ - vio->u.write.vui_from = from; + vio->u.readwrite.vui_from = from; else LASSERT(from == 0); - vio->u.write.vui_to = from + copied; + vio->u.readwrite.vui_to = from + copied; /* * To address the deadlock in balance_dirty_pages() where diff --git a/fs/lustre/llite/vvp_internal.h b/fs/lustre/llite/vvp_internal.h index cff85ea..6956d6b 100644 --- a/fs/lustre/llite/vvp_internal.h +++ b/fs/lustre/llite/vvp_internal.h @@ -88,9 +88,10 @@ struct vvp_io { struct { struct 
cl_page_list vui_queue; unsigned long vui_written; + unsigned long vui_read; int vui_from; int vui_to; - } write; + } readwrite; /* normal io */ } u; /** diff --git a/fs/lustre/llite/vvp_io.c b/fs/lustre/llite/vvp_io.c index c3fb03a..59da56d 100644 --- a/fs/lustre/llite/vvp_io.c +++ b/fs/lustre/llite/vvp_io.c @@ -249,10 +249,20 @@ static int vvp_io_write_iter_init(const struct lu_env *env, { struct vvp_io *vio = cl2vvp_io(env, ios); - cl_page_list_init(&vio->u.write.vui_queue); - vio->u.write.vui_written = 0; - vio->u.write.vui_from = 0; - vio->u.write.vui_to = PAGE_SIZE; + cl_page_list_init(&vio->u.readwrite.vui_queue); + vio->u.readwrite.vui_written = 0; + vio->u.readwrite.vui_from = 0; + vio->u.readwrite.vui_to = PAGE_SIZE; + + return 0; +} + +static int vvp_io_read_iter_init(const struct lu_env *env, + const struct cl_io_slice *ios) +{ + struct vvp_io *vio = cl2vvp_io(env, ios); + + vio->u.readwrite.vui_read = 0; return 0; } @@ -262,7 +272,7 @@ static void vvp_io_write_iter_fini(const struct lu_env *env, { struct vvp_io *vio = cl2vvp_io(env, ios); - LASSERT(vio->u.write.vui_queue.pl_nr == 0); + LASSERT(vio->u.readwrite.vui_queue.pl_nr == 0); } static int vvp_io_fault_iter_init(const struct lu_env *env, @@ -824,7 +834,13 @@ static int vvp_io_read_start(const struct lu_env *env, io->ci_continue = 0; io->ci_nob += result; result = 0; + } else if (result == -EIOCBQUEUED) { + io->ci_nob += vio->u.readwrite.vui_read; + if (vio->vui_iocb) + vio->vui_iocb->ki_pos = pos + + vio->u.readwrite.vui_read; } + return result; } @@ -1017,23 +1033,24 @@ int vvp_io_write_commit(const struct lu_env *env, struct cl_io *io) struct cl_object *obj = io->ci_obj; struct inode *inode = vvp_object_inode(obj); struct vvp_io *vio = vvp_env_io(env); - struct cl_page_list *queue = &vio->u.write.vui_queue; + struct cl_page_list *queue = &vio->u.readwrite.vui_queue; struct cl_page *page; int rc = 0; int bytes = 0; - unsigned int npages = vio->u.write.vui_queue.pl_nr; + unsigned int npages = vio->u.readwrite.vui_queue.pl_nr; if (npages == 0) return 0; CDEBUG(D_VFSTRACE, "commit async pages: %d, from %d, to %d\n", - npages, vio->u.write.vui_from, vio->u.write.vui_to); + npages, vio->u.readwrite.vui_from, vio->u.readwrite.vui_to); LASSERT(page_list_sanity_check(obj, queue)); /* submit IO with async write */ rc = cl_io_commit_async(env, io, queue, - vio->u.write.vui_from, vio->u.write.vui_to, + vio->u.readwrite.vui_from, + vio->u.readwrite.vui_to, write_commit_callback); npages -= queue->pl_nr; /* already committed pages */ if (npages > 0) { @@ -1041,18 +1058,18 @@ int vvp_io_write_commit(const struct lu_env *env, struct cl_io *io) bytes = npages << PAGE_SHIFT; /* first page */ - bytes -= vio->u.write.vui_from; + bytes -= vio->u.readwrite.vui_from; if (queue->pl_nr == 0) /* last page */ - bytes -= PAGE_SIZE - vio->u.write.vui_to; + bytes -= PAGE_SIZE - vio->u.readwrite.vui_to; LASSERTF(bytes > 0, "bytes = %d, pages = %d\n", bytes, npages); - vio->u.write.vui_written += bytes; + vio->u.readwrite.vui_written += bytes; CDEBUG(D_VFSTRACE, "Committed %d pages %d bytes, tot: %ld\n", - npages, bytes, vio->u.write.vui_written); + npages, bytes, vio->u.readwrite.vui_written); /* the first page must have been written. 
*/ - vio->u.write.vui_from = 0; + vio->u.readwrite.vui_from = 0; } LASSERT(page_list_sanity_check(obj, queue)); LASSERT(ergo(rc == 0, queue->pl_nr == 0)); @@ -1060,10 +1077,10 @@ int vvp_io_write_commit(const struct lu_env *env, struct cl_io *io) /* out of quota, try sync write */ if (rc == -EDQUOT && !cl_io_is_mkwrite(io)) { rc = vvp_io_commit_sync(env, io, queue, - vio->u.write.vui_from, - vio->u.write.vui_to); + vio->u.readwrite.vui_from, + vio->u.readwrite.vui_to); if (rc > 0) { - vio->u.write.vui_written += rc; + vio->u.readwrite.vui_written += rc; rc = 0; } } @@ -1181,15 +1198,15 @@ static int vvp_io_write_start(const struct lu_env *env, result = vvp_io_write_commit(env, io); /* Simulate short commit */ if (CFS_FAULT_CHECK(OBD_FAIL_LLITE_SHORT_COMMIT)) { - vio->u.write.vui_written >>= 1; - if (vio->u.write.vui_written > 0) + vio->u.readwrite.vui_written >>= 1; + if (vio->u.readwrite.vui_written > 0) io->ci_need_restart = 1; } - if (vio->u.write.vui_written > 0) { - result = vio->u.write.vui_written; + if (vio->u.readwrite.vui_written > 0) { + result = vio->u.readwrite.vui_written; io->ci_nob += result; - - CDEBUG(D_VFSTRACE, "write: nob %zd, result: %zd\n", + CDEBUG(D_VFSTRACE, "%s: write: nob %zd, result: %zd\n", + file_dentry(file)->d_name.name, io->ci_nob, result); } else { io->ci_continue = 0; @@ -1215,11 +1232,18 @@ static int vvp_io_write_start(const struct lu_env *env, if (result > 0 || result == -EIOCBQUEUED) { set_bit(LLIF_DATA_MODIFIED, &(ll_i2info(inode))->lli_flags); - if (result < cnt) + if (result != -EIOCBQUEUED && result < cnt) io->ci_continue = 0; if (result > 0) result = 0; + /* move forward */ + if (result == -EIOCBQUEUED) { + io->ci_nob += vio->u.readwrite.vui_written; + vio->vui_iocb->ki_pos = pos + + vio->u.readwrite.vui_written; + } } + return result; } @@ -1509,6 +1533,7 @@ static int vvp_io_read_ahead(const struct lu_env *env, .op = { [CIT_READ] = { .cio_fini = vvp_io_fini, + .cio_iter_init = vvp_io_read_iter_init, .cio_lock = vvp_io_read_lock, .cio_start = vvp_io_read_start, .cio_end = vvp_io_rw_end, diff --git a/fs/lustre/obdclass/cl_io.c b/fs/lustre/obdclass/cl_io.c index dcf940f..1564d9f 100644 --- a/fs/lustre/obdclass/cl_io.c +++ b/fs/lustre/obdclass/cl_io.c @@ -695,6 +695,7 @@ int cl_io_submit_sync(const struct lu_env *env, struct cl_io *io, int cl_io_loop(const struct lu_env *env, struct cl_io *io) { int result = 0; + int rc = 0; LINVRNT(cl_io_is_loopable(io)); @@ -727,7 +728,13 @@ int cl_io_loop(const struct lu_env *env, struct cl_io *io) } } cl_io_iter_fini(env, io); - } while (result == 0 && io->ci_continue); + if (result) + rc = result; + } while ((result == 0 || result == -EIOCBQUEUED) && + io->ci_continue); + + if (rc && !result) + result = rc; if (result == -EWOULDBLOCK && io->ci_ndelay) { io->ci_need_restart = 1; From patchwork Wed Jul 15 20:45:17 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 11666267 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D4D271392 for ; Wed, 15 Jul 2020 20:46:46 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id BE6602065F for ; Wed, 15 Jul 2020 20:46:46 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 
mail.kernel.org BE6602065F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 9BCA321F6BA; Wed, 15 Jul 2020 13:46:10 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from smtp3.ccs.ornl.gov (smtp3.ccs.ornl.gov [160.91.203.39]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 5ADF321F7F9 for ; Wed, 15 Jul 2020 13:45:34 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp3.ccs.ornl.gov (Postfix) with ESMTP id D11AB5E7; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id CFD4A2BB; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Wed, 15 Jul 2020 16:45:17 -0400 Message-Id: <1594845918-29027-37-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> References: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 36/37] lnet: socklnd: change ksnd_nthreads to atomic_t X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Mr NeilBrown This variable is treated like an atomic_t, but a global spinlock is used to protect updates - and also unnecessarily to protect reads. Change to atomic_t and avoid using the spinlock. 
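The pattern being adopted is the usual atomic thread counter with a last-one-out wakeup. Here is a userspace analogue using C11 atomics; the kernel code uses atomic_inc() and atomic_dec_and_test() with wake_up_var(), and the printf stands in for the wakeup.

#include <stdatomic.h>
#include <stdio.h>

static atomic_int nthreads;

static void thread_started(void)
{
        atomic_fetch_add(&nthreads, 1); /* no global lock needed */
}

static void thread_finished(void)
{
        /* atomic_fetch_sub() returns the old value, so "old == 1" is the
         * atomic_dec_and_test() case: this was the last live thread.
         */
        if (atomic_fetch_sub(&nthreads, 1) == 1)
                printf("last thread gone; wake the shutdown waiter\n");
}

int main(void)
{
        thread_started();
        thread_started();
        thread_finished();
        thread_finished();
        return 0;
}

Reads such as the wait_var_event_warning() condition become plain atomic_read() calls, which is why the spinlock can be dropped entirely.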
WC-bug-id: https://jira.whamcloud.com/browse/LU-12678 Lustre-commit: 4b0d3c0e41201 ("LU-12678 socklnd: change ksnd_nthreads to atomic_t") Signed-off-by: Mr NeilBrown Reviewed-on: https://review.whamcloud.com/39121 Reviewed-by: James Simmons Reviewed-by: Chris Horn Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- net/lnet/klnds/socklnd/socklnd.c | 4 ++-- net/lnet/klnds/socklnd/socklnd.h | 2 +- net/lnet/klnds/socklnd/socklnd_cb.c | 8 ++------ 3 files changed, 5 insertions(+), 9 deletions(-) diff --git a/net/lnet/klnds/socklnd/socklnd.c b/net/lnet/klnds/socklnd/socklnd.c index 22a73c3..91925475 100644 --- a/net/lnet/klnds/socklnd/socklnd.c +++ b/net/lnet/klnds/socklnd/socklnd.c @@ -2260,9 +2260,9 @@ static int ksocknal_push(struct lnet_ni *ni, struct lnet_process_id id) } wait_var_event_warning(&ksocknal_data.ksnd_nthreads, - ksocknal_data.ksnd_nthreads == 0, + atomic_read(&ksocknal_data.ksnd_nthreads) == 0, "waiting for %d threads to terminate\n", - ksocknal_data.ksnd_nthreads); + atomic_read(&ksocknal_data.ksnd_nthreads)); ksocknal_free_buffers(); diff --git a/net/lnet/klnds/socklnd/socklnd.h b/net/lnet/klnds/socklnd/socklnd.h index df863f2..350f2c8 100644 --- a/net/lnet/klnds/socklnd/socklnd.h +++ b/net/lnet/klnds/socklnd/socklnd.h @@ -196,7 +196,7 @@ struct ksock_nal_data { * known peers */ - int ksnd_nthreads; /* # live threads */ + atomic_t ksnd_nthreads; /* # live threads */ int ksnd_shuttingdown; /* tell threads to exit */ struct ksock_sched **ksnd_schedulers; /* schedulers info */ diff --git a/net/lnet/klnds/socklnd/socklnd_cb.c b/net/lnet/klnds/socklnd/socklnd_cb.c index 9b3b604..a1c0c3d 100644 --- a/net/lnet/klnds/socklnd/socklnd_cb.c +++ b/net/lnet/klnds/socklnd/socklnd_cb.c @@ -976,19 +976,15 @@ struct ksock_route * if (IS_ERR(task)) return PTR_ERR(task); - write_lock_bh(&ksocknal_data.ksnd_global_lock); - ksocknal_data.ksnd_nthreads++; - write_unlock_bh(&ksocknal_data.ksnd_global_lock); + atomic_inc(&ksocknal_data.ksnd_nthreads); return 0; } void ksocknal_thread_fini(void) { - write_lock_bh(&ksocknal_data.ksnd_global_lock); - if (--ksocknal_data.ksnd_nthreads == 0) + if (atomic_dec_and_test(&ksocknal_data.ksnd_nthreads)) wake_up_var(&ksocknal_data.ksnd_nthreads); - write_unlock_bh(&ksocknal_data.ksnd_global_lock); } int From patchwork Wed Jul 15 20:45:18 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 11666289 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id EA3CF618 for ; Wed, 15 Jul 2020 20:47:19 +0000 (UTC) Received: from pdx1-mailman02.dreamhost.com (pdx1-mailman02.dreamhost.com [64.90.62.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id D27AE2065F for ; Wed, 15 Jul 2020 20:47:19 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org D27AE2065F Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=infradead.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lustre-devel-bounces@lists.lustre.org Received: from pdx1-mailman02.dreamhost.com (localhost [IPv6:::1]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 65CA721FB5D; Wed, 15 Jul 2020 13:46:29 -0700 (PDT) X-Original-To: lustre-devel@lists.lustre.org Delivered-To: lustre-devel-lustre.org@pdx1-mailman02.dreamhost.com Received: from 
smtp3.ccs.ornl.gov (smtp3.ccs.ornl.gov [160.91.203.39]) by pdx1-mailman02.dreamhost.com (Postfix) with ESMTP id 9E21B21F7F9 for ; Wed, 15 Jul 2020 13:45:34 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp3.ccs.ornl.gov (Postfix) with ESMTP id D3DE95E9; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id D2AC88D; Wed, 15 Jul 2020 16:45:20 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Wed, 15 Jul 2020 16:45:18 -0400 Message-Id: <1594845918-29027-38-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> References: <1594845918-29027-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 37/37] lnet: check rtr_nid is a gateway X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Amir Shehata , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Amir Shehata The rtr_nid is specified for all REPLY/ACK. However it is possible for the route through the gateway specified by rtr_nid to be removed. In this case we don't want to use it. We should lookup alternative paths. This patch checks if the peer looked up is indeed a gateway. If it's not a gateway then we attempt to find another path. There is no need to fail right away. It's not a hard requirement to fail if the default rtr_nid is not valid. WC-bug-id: https://jira.whamcloud.com/browse/LU-13713 Lustre-commit: 07397a2e7473c ("LU-13713 lnet: check rtr_nid is a gateway") Signed-off-by: Amir Shehata Reviewed-on: https://review.whamcloud.com/39175 Reviewed-by: Chris Horn Reviewed-by: Serguei Smirnov Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- net/lnet/lnet/lib-move.c | 20 +++++++++++++------- 1 file changed, 13 insertions(+), 7 deletions(-) diff --git a/net/lnet/lnet/lib-move.c b/net/lnet/lnet/lib-move.c index 234fbb5..c0dd30c 100644 --- a/net/lnet/lnet/lib-move.c +++ b/net/lnet/lnet/lib-move.c @@ -1777,6 +1777,7 @@ struct lnet_ni * struct lnet_route *last_route = NULL; struct lnet_peer_ni *lpni = NULL; struct lnet_peer_ni *gwni = NULL; + bool route_found = false; lnet_nid_t src_nid = (sd->sd_src_nid != LNET_NID_ANY) ? sd->sd_src_nid : sd->sd_best_ni ? sd->sd_best_ni->ni_nid : LNET_NID_ANY; @@ -1790,15 +1791,20 @@ struct lnet_ni * */ if (sd->sd_rtr_nid != LNET_NID_ANY) { gwni = lnet_find_peer_ni_locked(sd->sd_rtr_nid); - if (!gwni) { - CERROR("No peer NI for gateway %s\n", + if (gwni) { + gw = gwni->lpni_peer_net->lpn_peer; + lnet_peer_ni_decref_locked(gwni); + if (gw->lp_rtr_refcount) { + local_lnet = LNET_NIDNET(sd->sd_rtr_nid); + route_found = true; + } + } else { + CWARN("No peer NI for gateway %s. Attempting to find an alternative route.\n", libcfs_nid2str(sd->sd_rtr_nid)); - return -EHOSTUNREACH; } - gw = gwni->lpni_peer_net->lpn_peer; - lnet_peer_ni_decref_locked(gwni); - local_lnet = LNET_NIDNET(sd->sd_rtr_nid); - } else { + } + + if (!route_found) { /* we've already looked up the initial lpni using dst_nid */ lpni = sd->sd_best_lpni; /* the peer tree must be in existence */
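A schematic of the new routed-path control flow, with hypothetical stand-ins: find_peer() for lnet_find_peer_ni_locked(), rtr_refcount for lp_rtr_refcount, and pick_route_normally() for the ordinary route selection that the real code falls through to.

#include <stdbool.h>
#include <stdio.h>

#define NID_ANY (-1L)

struct peer { int rtr_refcount; }; /* > 0 iff the peer is acting as a gateway */

static struct peer stale_peer = { 0 }; /* known peer that stopped routing */

static struct peer *find_peer(long nid)
{
        (void)nid;
        return &stale_peer; /* stand-in lookup */
}

static int pick_route_normally(long dst_nid)
{
        (void)dst_nid;
        return 0; /* stand-in for the usual selection */
}

static int select_route(long rtr_nid, long dst_nid)
{
        bool route_found = false;

        if (rtr_nid != NID_ANY) {
                struct peer *gw = find_peer(rtr_nid);

                /* Honour rtr_nid only while it still names a gateway;
                 * otherwise warn and fall through to normal selection
                 * instead of returning -EHOSTUNREACH as the old code did.
                 */
                if (gw && gw->rtr_refcount > 0)
                        route_found = true;
                else
                        fprintf(stderr, "preferred router gone, rerouting\n");
        }
        if (!route_found)
                return pick_route_normally(dst_nid);
        return 0; /* use the gateway named by rtr_nid */
}

int main(void)
{
        return select_route(42L, 7L);
}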