From patchwork Thu Jan 30 14:10:51 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 13954639 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from pdx1-mailman-customer002.dreamhost.com (listserver-buz.dreamhost.com [69.163.136.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 4E6BAC0218F for ; Thu, 30 Jan 2025 14:18:22 +0000 (UTC) Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1]) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4YkLZw2f6rz1y1r; Thu, 30 Jan 2025 06:12:56 -0800 (PST) Received: from smtp3.ccs.ornl.gov (smtp3.ccs.ornl.gov [160.91.203.39]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4YkLYL0rXmz1whJ for ; Thu, 30 Jan 2025 06:11:34 -0800 (PST) Received: from star2.ccs.ornl.gov (ltm5-e204-208.ccs.ornl.gov [160.91.203.29]) by smtp3.ccs.ornl.gov (Postfix) with ESMTP id D7658891204; Thu, 30 Jan 2025 09:11:32 -0500 (EST) Received: by star2.ccs.ornl.gov (Postfix, from userid 2004) id CE737106BE16; Thu, 30 Jan 2025 09:11:32 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 30 Jan 2025 09:10:51 -0500 Message-ID: <20250130141115.950749-2-jsimmons@infradead.org> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20250130141115.950749-1-jsimmons@infradead.org> References: <20250130141115.950749-1-jsimmons@infradead.org> MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 01/25] lustre: remove additional cl_env_get() X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.39 Precedence: list List-Id: 
"For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" The patch to adjust the read count for truncated file was not properly applied. In ll_file_read_iter() cl_env moved to earlier in the function but the original call to cl_env_get() remained. This left an additional reference that later prevented the lustre module from unloading. Fixes: 07084899763 ("lustre: llite: adjust read count as file got truncated") Signed-off-by: James Simmons --- fs/lustre/llite/file.c | 4 ---- 1 file changed, 4 deletions(-) diff --git a/fs/lustre/llite/file.c b/fs/lustre/llite/file.c index 05a75aed7826..24904acb28e0 100644 --- a/fs/lustre/llite/file.c +++ b/fs/lustre/llite/file.c @@ -2108,10 +2108,6 @@ static ssize_t ll_file_read_iter(struct kiocb *iocb, struct iov_iter *to) if (result < 0 || iov_iter_count(to) == 0) goto out; - env = cl_env_get(&refcheck); - if (IS_ERR(env)) - return PTR_ERR(env); - args = ll_env_args(env); args->u.normal.via_iter = to; args->u.normal.via_iocb = iocb; From patchwork Thu Jan 30 14:10:52 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 13954654 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from pdx1-mailman-customer002.dreamhost.com (listserver-buz.dreamhost.com [69.163.136.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id AB810C0218F for ; Thu, 30 Jan 2025 14:28:22 +0000 (UTC) Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1]) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4YkLd93pT5z214h; Thu, 30 Jan 2025 06:14:53 -0800 (PST) Received: from smtp4.ccs.ornl.gov 
(smtp4.ccs.ornl.gov [160.91.203.40]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4YkLYP0QzGz1xP1 for ; Thu, 30 Jan 2025 06:11:37 -0800 (PST) Received: from star2.ccs.ornl.gov (ltm3-e204-208.ccs.ornl.gov [160.91.203.26]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id D982617D1ED; Thu, 30 Jan 2025 09:11:32 -0500 (EST) Received: by star2.ccs.ornl.gov (Postfix, from userid 2004) id D3C0B106BE17; Thu, 30 Jan 2025 09:11:32 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 30 Jan 2025 09:10:52 -0500 Message-ID: <20250130141115.950749-3-jsimmons@infradead.org> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20250130141115.950749-1-jsimmons@infradead.org> References: <20250130141115.950749-1-jsimmons@infradead.org> MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 02/25] lustre: obdclass: improve iocontrol error messages X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.39 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Arshad Hussain , Vitaliy Kuznetsov , Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Andreas Dilger Add consistent CDEBUG() messages for iocontrol handlers. Add helpers OBD_IOC_ERROR() and OBD_IOC_DEBUG() to print the iocontrol parameters consistently in case of an error. 
WC-bug-id: https://jira.whamcloud.com/browse/LU-16634 Lustre-commit: 1f4825eff026321a8 ("LU-16634 obdclass: improve iocontrol error messages") Signed-off-by: Andreas Dilger Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50334 Reviewed-by: Arshad Hussain Reviewed-by: Vitaliy Kuznetsov Reviewed-by: James Simmons Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/include/obd_class.h | 7 ++ fs/lustre/llite/dir.c | 7 +- fs/lustre/llite/file.c | 30 +++--- fs/lustre/lmv/lmv_obd.c | 9 +- fs/lustre/lov/lov_obd.c | 13 ++- fs/lustre/mdc/mdc_request.c | 8 +- fs/lustre/obdclass/class_obd.c | 162 ++++++++++++++++++-------------- fs/lustre/obdecho/echo_client.c | 9 +- fs/lustre/osc/osc_request.c | 8 +- 9 files changed, 146 insertions(+), 107 deletions(-) diff --git a/fs/lustre/include/obd_class.h b/fs/lustre/include/obd_class.h index 2b66bc46168c..eef9bfb91f4d 100644 --- a/fs/lustre/include/obd_class.h +++ b/fs/lustre/include/obd_class.h @@ -1836,4 +1836,11 @@ struct attribute *get_attr_starts_with(const struct kobj_type *typ, return _get_attr_matches(typ, name, len, _attr_name_starts_with); } +int obd_ioctl_msg(const char *file, const char *func, int line, int level, + const char *name, unsigned int cmd, const char *msg, int rc); +#define OBD_IOC_DEBUG(level, dev, cmd, msg, rc) \ + obd_ioctl_msg(__FILE__, __func__, __LINE__, level, dev, cmd, msg, rc) +#define OBD_IOC_ERROR(dev, cmd, msg, rc) \ + obd_ioctl_msg(__FILE__, __func__, __LINE__, D_ERROR, dev, cmd, msg, rc) + #endif /* __LINUX_OBD_CLASS_H */ diff --git a/fs/lustre/llite/dir.c b/fs/lustre/llite/dir.c index 9caff36c9bef..2e44f9bb3895 100644 --- a/fs/lustre/llite/dir.c +++ b/fs/lustre/llite/dir.c @@ -1467,8 +1467,8 @@ static long ll_dir_ioctl(struct file *file, unsigned int cmd, unsigned long arg) void __user *uarg = (void __user *)arg; int rc = 0; - CDEBUG(D_VFSTRACE, "VFS Op:inode=" DFID "(%p), cmd=%#x\n", - PFID(ll_inode2fid(inode)), inode, cmd); + CDEBUG(D_VFSTRACE|D_IOCTL, "VFS 
Op:inode="DFID"(%pK) cmd=%x arg=%lx\n", + PFID(ll_inode2fid(inode)), inode, cmd, arg); /* asm-ppc{,64} declares TCGETS, et. al. as type 't' not 'T' */ if (_IOC_TYPE(cmd) == 'T' || _IOC_TYPE(cmd) == 't') /* tty ioctls */ @@ -2063,7 +2063,8 @@ static long ll_dir_ioctl(struct file *file, unsigned int cmd, unsigned long arg) rc = obd_get_info(NULL, exp, sizeof(KEY_TGT_COUNT), KEY_TGT_COUNT, &vallen, &count); if (rc) { - CERROR("get target count failed: %d\n", rc); + CERROR("%s: get target count failed: rc = %d\n", + sbi->ll_fsname, rc); return rc; } diff --git a/fs/lustre/llite/file.c b/fs/lustre/llite/file.c index 24904acb28e0..9307007c3e18 100644 --- a/fs/lustre/llite/file.c +++ b/fs/lustre/llite/file.c @@ -4099,8 +4099,8 @@ ll_file_ioctl(struct file *file, unsigned int cmd, unsigned long arg) void __user *uarg = (void __user *)arg; int flags, rc = 0; - CDEBUG(D_VFSTRACE, "VFS Op:inode=" DFID "(%p),cmd=%x\n", - PFID(ll_inode2fid(inode)), inode, cmd); + CDEBUG(D_VFSTRACE|D_IOCTL, "VFS Op:inode="DFID"(%pK) cmd=%x arg=%lx\n", + PFID(ll_inode2fid(inode)), inode, cmd, arg); ll_stats_ops_tally(ll_i2sbi(inode), LPROC_LL_IOCTL, 1); /* asm-ppc{,64} declares TCGETS, et. al. 
as type 't' not 'T' */ @@ -4123,9 +4123,11 @@ ll_file_ioctl(struct file *file, unsigned int cmd, unsigned long arg) if (cmd == LL_IOC_SETFLAGS) { if ((flags & LL_FILE_IGNORE_LOCK) && !(file->f_flags & O_DIRECT)) { - CERROR("%s: unable to disable locking on non-O_DIRECT file\n", - current->comm); - return -EINVAL; + rc = -EINVAL; + CERROR("%s: unable to disable locking on non-O_DIRECT file "DFID": rc = %d\n", + current->comm, PFID(ll_inode2fid(inode)), + rc); + return rc; } fd->fd_flags |= flags; @@ -4865,32 +4867,30 @@ ll_file_flock(struct file *file, int cmd, struct file_lock *file_lock) einfo.ei_mode = LCK_PW; break; default: - CDEBUG(D_INFO, "Unknown fcntl lock type: %d\n", fl_type); - return -ENOTSUPP; + rc = -EINVAL; + CERROR("%s: fcntl from '%s' unknown lock type=%d: rc = %d\n", + sbi->ll_fsname, current->comm, fl_type, rc); + return rc; } switch (cmd) { case F_SETLKW: -#ifdef F_SETLKW64 case F_SETLKW64: -#endif flags = 0; break; case F_SETLK: -#ifdef F_SETLK64 case F_SETLK64: -#endif flags = LDLM_FL_BLOCK_NOWAIT; break; case F_GETLK: -#ifdef F_GETLK64 case F_GETLK64: -#endif flags = LDLM_FL_TEST_LOCK; break; default: - CERROR("unknown fcntl lock command: %d\n", cmd); - return -EINVAL; + rc = -EINVAL; + CERROR("%s: fcntl from '%s' unknown lock command=%d: rc = %d\n", + sbi->ll_fsname, current->comm, cmd, rc); + return rc; } /* diff --git a/fs/lustre/lmv/lmv_obd.c b/fs/lustre/lmv/lmv_obd.c index 54f86736e9bf..62385ac9b00f 100644 --- a/fs/lustre/lmv/lmv_obd.c +++ b/fs/lustre/lmv/lmv_obd.c @@ -831,6 +831,9 @@ static int lmv_iocontrol(unsigned int cmd, struct obd_export *exp, u32 count = lmv->lmv_mdt_count; int rc = 0; + CDEBUG(D_IOCTL, "%s: cmd=%x len=%u karg=%pK uarg=%pK\n", + exp->exp_obd->obd_name, cmd, len, karg, uarg); + if (count == 0) return -ENOTTY; @@ -1069,10 +1072,8 @@ static int lmv_iocontrol(unsigned int cmd, struct obd_export *exp, err = obd_iocontrol(cmd, tgt->ltd_exp, len, karg, uarg); if (err) { if (tgt->ltd_active) { - CERROR("%s: error: 
iocontrol MDC %s on MDTidx %d cmd %x: err = %d\n", - lmv2obd_dev(lmv)->obd_name, - tgt->ltd_uuid.uuid, - tgt->ltd_index, cmd, err); + OBD_IOC_ERROR(obd->obd_name, cmd, + tgt->ltd_uuid.uuid, err); if (!rc) rc = err; } diff --git a/fs/lustre/lov/lov_obd.c b/fs/lustre/lov/lov_obd.c index d2fe8c32097a..8bfce5001d58 100644 --- a/fs/lustre/lov/lov_obd.c +++ b/fs/lustre/lov/lov_obd.c @@ -970,6 +970,9 @@ static int lov_iocontrol(unsigned int cmd, struct obd_export *exp, int len, struct lov_obd *lov = &obd->u.lov; int i = 0, rc = 0, count = lov->desc.ld_tgt_count; + CDEBUG(D_IOCTL, "%s: cmd=%x len=%u karg=%pK uarg=%pK\n", + exp->exp_obd->obd_name, cmd, len, karg, uarg); + switch (cmd) { case IOC_OBD_STATFS: { struct obd_ioctl_data *data = karg; @@ -1097,11 +1100,11 @@ static int lov_iocontrol(unsigned int cmd, struct obd_export *exp, int len, len, karg, uarg); if (err) { if (lov->lov_tgts[i]->ltd_active) { - CDEBUG_LIMIT(err == -ENOTTY ? - D_IOCTL : D_WARNING, - "iocontrol OSC %s on OST idx %d cmd %x: err = %d\n", - lov_uuid2str(lov, i), - i, cmd, err); + OBD_IOC_DEBUG(err == -ENOTTY ? 
+ D_IOCTL : D_WARNING, + obd->obd_name, cmd, + lov_uuid2str(lov, i), + err); if (!rc) rc = err; } diff --git a/fs/lustre/mdc/mdc_request.c b/fs/lustre/mdc/mdc_request.c index 15e58e84ddbb..84c4d2888e7d 100644 --- a/fs/lustre/mdc/mdc_request.c +++ b/fs/lustre/mdc/mdc_request.c @@ -2208,6 +2208,9 @@ static int mdc_iocontrol(unsigned int cmd, struct obd_export *exp, int len, struct obd_import *imp = obd->u.cli.cl_import; int rc; + CDEBUG(D_IOCTL, "%s: cmd=%x len=%u karg=%pK uarg=%pK\n", + obd->obd_name, cmd, len, karg, uarg); + if (!try_module_get(THIS_MODULE)) { CERROR("%s: cannot get module '%s'\n", obd->obd_name, module_name(THIS_MODULE)); @@ -2321,9 +2324,8 @@ static int mdc_iocontrol(unsigned int cmd, struct obd_export *exp, int len, rc = mdc_ioc_swap_layouts(exp, karg); goto out; default: - CERROR("unrecognised ioctl: cmd = %#x\n", cmd); - rc = -ENOTTY; - goto out; + rc = OBD_IOC_ERROR(obd->obd_name, cmd, "unrecognized", -ENOTTY); + break; } out: module_put(THIS_MODULE); diff --git a/fs/lustre/obdclass/class_obd.c b/fs/lustre/obdclass/class_obd.c index a7a2a6c127c3..dd5fcc895f02 100644 --- a/fs/lustre/obdclass/class_obd.c +++ b/fs/lustre/obdclass/class_obd.c @@ -82,6 +82,29 @@ EXPORT_SYMBOL(at_early_margin); int at_extra = 30; EXPORT_SYMBOL(at_extra); +int obd_ioctl_msg(const char *file, const char *func, int line, int level, + const char *name, unsigned int cmd, const char *msg, int rc) +{ + static struct cfs_debug_limit_state cdls; + static char *dirs[] = { + [_IOC_NONE] = "_IO", + [_IOC_READ] = "_IOR", + [_IOC_WRITE] = "_IOW", + [_IOC_READ|_IOC_WRITE] = "_IOWR", + }; + char type; + + type = _IOC_TYPE(cmd); + __CDEBUG_WITH_LOC(file, func, line, level, &cdls, + "%s: iocontrol from '%s' cmd=%x %s('%c', %u, %u) %s: rc = %d\n", + name, current->comm, cmd, + dirs[_IOC_DIR(cmd)] ?: "_IO?", + isprint(type) ? 
type : '?', _IOC_NR(cmd), + _IOC_SIZE(cmd), msg, rc); + return rc; +} +EXPORT_SYMBOL(obd_ioctl_msg); + static int class_resolve_dev_name(u32 len, const char *name) { int rc; @@ -199,26 +222,27 @@ int obd_ioctl_getdata(struct obd_ioctl_data **datap, int *len, void __user *arg) struct obd_ioctl_data *data; struct obd_ioctl_hdr hdr; int offset = 0; - int err; + int rc = -EINVAL; if (copy_from_user(&hdr, arg, sizeof(hdr))) return -EFAULT; if (hdr.ioc_version != OBD_IOCTL_VERSION) { - CERROR("Version mismatch kernel (%x) vs application (%x)\n", - OBD_IOCTL_VERSION, hdr.ioc_version); - return -EINVAL; + CERROR("%s: kernel/user version mismatch (%x != %x): rc = %d\n", + current->comm, OBD_IOCTL_VERSION, hdr.ioc_version, rc); + return rc; } if (hdr.ioc_len > OBD_MAX_IOCTL_BUFFER) { - CERROR("User buffer len %d exceeds %d max buffer\n", - hdr.ioc_len, OBD_MAX_IOCTL_BUFFER); - return -EINVAL; + CERROR("%s: user buffer len %d exceeds %d max: rc = %d\n", + current->comm, hdr.ioc_len, OBD_MAX_IOCTL_BUFFER, rc); + return rc; } - if (hdr.ioc_len < sizeof(struct obd_ioctl_data)) { - CERROR("User buffer too small for ioctl (%d)\n", hdr.ioc_len); - return -EINVAL; + if (hdr.ioc_len < sizeof(*data)) { + CERROR("%s: user buffer %d too small for ioctl %zu: rc = %d\n", + current->comm, hdr.ioc_len, sizeof(*data), rc); + return rc; } /* When there are lots of processes calling vmalloc on multi-core @@ -228,26 +252,27 @@ int obd_ioctl_getdata(struct obd_ioctl_data **datap, int *len, void __user *arg) */ data = kvzalloc(hdr.ioc_len, GFP_KERNEL); if (!data) { - CERROR("Cannot allocate control buffer of len %d\n", - hdr.ioc_len); - return -EINVAL; + rc = -ENOMEM; + CERROR("%s: cannot allocate control buffer len %d: rc = %d\n", + current->comm, hdr.ioc_len, rc); + return rc; } *len = hdr.ioc_len; *datap = data; if (copy_from_user(data, arg, hdr.ioc_len)) { - err = -EFAULT; + rc = -EFAULT; goto free_buf; } if (hdr.ioc_len != data->ioc_len) { - err = -EINVAL; + rc = -EINVAL; goto free_buf; } if 
(obd_ioctl_is_invalid(data)) { CERROR("ioctl not correctly formatted\n"); - err = -EINVAL; + rc = -EINVAL; goto free_buf; } @@ -273,7 +298,7 @@ int obd_ioctl_getdata(struct obd_ioctl_data **datap, int *len, void __user *arg) free_buf: kvfree(data); - return err; + return rc; } EXPORT_SYMBOL(obd_ioctl_getdata); @@ -281,12 +306,13 @@ int class_handle_ioctl(unsigned int cmd, void __user *uarg) { struct obd_ioctl_data *data; struct obd_device *obd = NULL; - int err = 0, len = 0; + int rc = 0, len = 0; - CDEBUG(D_IOCTL, "cmd = %x\n", cmd); - if (obd_ioctl_getdata(&data, &len, uarg)) { - CERROR("OBD ioctl: data error\n"); - return -EINVAL; + CDEBUG(D_IOCTL, "obdclass: cmd=%x len=%u uarg=%pK\n", cmd, len, uarg); + rc = obd_ioctl_getdata(&data, &len, uarg); + if (rc) { + CERROR("%s: ioctl data error: rc = %d\n", current->comm, rc); + return rc; } switch (cmd) { @@ -294,28 +320,26 @@ int class_handle_ioctl(unsigned int cmd, void __user *uarg) struct lustre_cfg *lcfg; if (!data->ioc_plen1 || !data->ioc_pbuf1) { - CERROR("No config buffer passed!\n"); - err = -EINVAL; + rc = OBD_IOC_ERROR("obdclass", cmd, "no config buffer", + -EINVAL); goto out; } lcfg = kzalloc(data->ioc_plen1, GFP_NOFS); if (!lcfg) { - err = -ENOMEM; + rc = -ENOMEM; goto out; } - if (copy_from_user(lcfg, data->ioc_pbuf1, data->ioc_plen1)) - err = -EFAULT; - if (!err) - err = lustre_cfg_sanity_check(lcfg, data->ioc_plen1); - if (!err) - err = class_process_config(lcfg); + rc = copy_from_user(lcfg, data->ioc_pbuf1, data->ioc_plen1); + if (!rc) + rc = lustre_cfg_sanity_check(lcfg, data->ioc_plen1); + if (!rc) + rc = class_process_config(lcfg); kfree(lcfg); goto out; } case OBD_GET_VERSION: { - /* This was the method to pass to user land the lustre version. * Today that information is in the sysfs tree so we can in the * future remove this. 
@@ -324,14 +348,14 @@ int class_handle_ioctl(unsigned int cmd, void __user *uarg) LUSTRE_VERSION_CODE); if (!data->ioc_inlbuf1) { - CERROR("No buffer passed in ioctl\n"); - err = -EINVAL; + rc = OBD_IOC_ERROR("obdclass", cmd, "no buffer passed", + -EINVAL); goto out; } if (strlen(LUSTRE_VERSION_STRING) + 1 > data->ioc_inllen1) { - CERROR("ioctl buffer too small to hold version\n"); - err = -EINVAL; + rc = OBD_IOC_ERROR("obdclass", cmd, "buffer too small", + -EINVAL); goto out; } @@ -342,7 +366,7 @@ int class_handle_ioctl(unsigned int cmd, void __user *uarg) strlen(LUSTRE_VERSION_STRING) + 1); if (copy_to_user(uarg, data, len)) - err = -EFAULT; + rc = -EFAULT; goto out; } case OBD_IOC_NAME2DEV: { @@ -355,30 +379,28 @@ int class_handle_ioctl(unsigned int cmd, void __user *uarg) data->ioc_inlbuf1); data->ioc_dev = dev; if (dev < 0) { - err = -EINVAL; + rc = -EINVAL; goto out; } if (copy_to_user(uarg, data, sizeof(*data))) - err = -EFAULT; + rc = -EFAULT; goto out; } case OBD_IOC_UUID2DEV: { - /* Resolve a device uuid. This does not change the - * currently selected device. 
- */ - int dev; + /* Resolve device uuid, does not change current selected dev */ struct obd_uuid uuid; + int dev; if (!data->ioc_inllen1 || !data->ioc_inlbuf1) { - CERROR("No UUID passed!\n"); - err = -EINVAL; + rc = OBD_IOC_ERROR("obdclass", cmd, "no UUID passed", + -EINVAL); goto out; } if (data->ioc_inlbuf1[data->ioc_inllen1 - 1] != 0) { - CERROR("UUID not NUL terminated!\n"); - err = -EINVAL; + rc = OBD_IOC_ERROR("obdclass", cmd, "unterminated UUID", + -EINVAL); goto out; } @@ -389,7 +411,7 @@ int class_handle_ioctl(unsigned int cmd, void __user *uarg) if (dev == -1) { CDEBUG(D_IOCTL, "No device for UUID %s!\n", data->ioc_inlbuf1); - err = -EINVAL; + rc = -EINVAL; goto out; } @@ -397,28 +419,28 @@ int class_handle_ioctl(unsigned int cmd, void __user *uarg) dev); if (copy_to_user(uarg, data, sizeof(*data))) - err = -EFAULT; + rc = -EFAULT; goto out; } case OBD_IOC_GETDEVICE: { - int index = data->ioc_count; - char *status, *str; + int index = data->ioc_count; + char *status, *str; if (!data->ioc_inlbuf1) { - CERROR("No buffer passed in ioctl\n"); - err = -EINVAL; + rc = OBD_IOC_ERROR("obdclass", cmd, "no buffer passed", + -EINVAL); goto out; } - if (data->ioc_inllen1 < 128) { - CERROR("ioctl buffer too small to hold version\n"); - err = -EINVAL; + if (data->ioc_inllen1 < MAX_OBD_NAME) { + rc = OBD_IOC_ERROR("obdclass", cmd, "too small version", + -EINVAL); goto out; } obd = class_num2obd(index); if (!obd) { - err = -ENOENT; + rc = -ENOENT; goto out; } @@ -439,51 +461,51 @@ int class_handle_ioctl(unsigned int cmd, void __user *uarg) atomic_read(&obd->obd_refcount)); if (copy_to_user(uarg, data, len)) - err = -EFAULT; + rc = -EFAULT; goto out; } } if (data->ioc_dev == OBD_DEV_BY_DEVNAME) { if (data->ioc_inllen4 <= 0 || !data->ioc_inlbuf4) { - err = -EINVAL; + rc = -EINVAL; goto out; } if (strnlen(data->ioc_inlbuf4, MAX_OBD_NAME) >= MAX_OBD_NAME) { - err = -EINVAL; + rc = -EINVAL; goto out; } obd = class_name2obd(data->ioc_inlbuf4); } else if (data->ioc_dev < 
class_devno_max()) { obd = class_num2obd(data->ioc_dev); } else { - CERROR("OBD ioctl: No device\n"); - err = -EINVAL; + rc = OBD_IOC_ERROR("obdclass", cmd, "no device", -EINVAL); goto out; } if (!obd) { - CERROR("OBD ioctl : No Device %d\n", data->ioc_dev); - err = -EINVAL; + rc = OBD_IOC_ERROR(data->ioc_inlbuf4, cmd, "no device found", + -EINVAL); goto out; } LASSERT(obd->obd_magic == OBD_DEVICE_MAGIC); if (!obd->obd_set_up || obd->obd_stopping) { - CERROR("OBD ioctl: device not setup %d\n", data->ioc_dev); - err = -EINVAL; + rc = -EINVAL; + CERROR("obdclass: device %d not set up: rc = %d\n", + data->ioc_dev, rc); goto out; } - err = obd_iocontrol(cmd, obd->obd_self_export, len, data, NULL); - if (err) + rc = obd_iocontrol(cmd, obd->obd_self_export, len, data, NULL); + if (rc) goto out; if (copy_to_user(uarg, data, len)) - err = -EFAULT; + rc = -EFAULT; out: kvfree(data); - return err; + return rc; } /* class_handle_ioctl */ /* to control /dev/obd */ diff --git a/fs/lustre/obdecho/echo_client.c b/fs/lustre/obdecho/echo_client.c index d7de6e48c7cb..95af2af66918 100644 --- a/fs/lustre/obdecho/echo_client.c +++ b/fs/lustre/obdecho/echo_client.c @@ -999,6 +999,9 @@ echo_client_iocontrol(unsigned int cmd, struct obd_export *exp, int len, int rw = OBD_BRW_READ; int rc = 0; + CDEBUG(D_IOCTL, "%s: cmd=%x len=%u karg=%pK uarg=%pK\n", + exp->exp_obd->obd_name, cmd, len, karg, uarg); + oa = &data->ioc_obdo1; if (!(oa->o_valid & OBD_MD_FLGROUP)) { oa->o_valid |= OBD_MD_FLGROUP; @@ -1076,11 +1079,9 @@ echo_client_iocontrol(unsigned int cmd, struct obd_export *exp, int len, case OBD_IOC_BRW_READ: rc = echo_client_brw_ioctl(env, rw, exp, data); goto out; - default: - CERROR("echo_ioctl(): unrecognised ioctl %#x\n", cmd); - rc = -ENOTTY; - goto out; + rc = OBD_IOC_ERROR(obd->obd_name, cmd, "unrecognized", -ENOTTY); + break; } out: diff --git a/fs/lustre/osc/osc_request.c b/fs/lustre/osc/osc_request.c index 35dd009b3507..e0955c11191b 100644 --- a/fs/lustre/osc/osc_request.c +++ 
b/fs/lustre/osc/osc_request.c @@ -3359,6 +3359,9 @@ static int osc_iocontrol(unsigned int cmd, struct obd_export *exp, int len, struct obd_ioctl_data *data = karg; int rc = 0; + CDEBUG(D_IOCTL, "%s: cmd=%x len=%u karg=%pK uarg=%pK\n", + obd->obd_name, cmd, len, karg, uarg); + if (!try_module_get(THIS_MODULE)) { CERROR("%s: cannot get module '%s'\n", obd->obd_name, module_name(THIS_MODULE)); @@ -3379,9 +3382,8 @@ static int osc_iocontrol(unsigned int cmd, struct obd_export *exp, int len, data->ioc_offset); break; default: - CDEBUG(D_INODE, "%s: unrecognised ioctl %#x by %s\n", - obd->obd_name, cmd, current->comm); - rc = -ENOTTY; + rc = OBD_IOC_DEBUG(D_IOCTL, obd->obd_name, cmd, "unrecognized", + -ENOTTY); break; } From patchwork Thu Jan 30 14:10:53 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 13954657 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from pdx1-mailman-customer002.dreamhost.com (listserver-buz.dreamhost.com [69.163.136.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 03564C0218A for ; Thu, 30 Jan 2025 14:33:17 +0000 (UTC) Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1]) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4YkLf74vkGz21YC; Thu, 30 Jan 2025 06:15:43 -0800 (PST) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4YkLYW4B5Sz1xGG for ; Thu, 30 Jan 2025 06:11:43 -0800 (PST) Received: from star2.ccs.ornl.gov (ltm5-e204-208.ccs.ornl.gov [160.91.203.29]) 
by smtp4.ccs.ornl.gov (Postfix) with ESMTP id DED7D18232E; Thu, 30 Jan 2025 09:11:32 -0500 (EST) Received: by star2.ccs.ornl.gov (Postfix, from userid 2004) id D7DF7106BE18; Thu, 30 Jan 2025 09:11:32 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 30 Jan 2025 09:10:53 -0500 Message-ID: <20250130141115.950749-4-jsimmons@infradead.org> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20250130141115.950749-1-jsimmons@infradead.org> References: <20250130141115.950749-1-jsimmons@infradead.org> MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 03/25] lustre: introduce class_parse_nid() X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.39 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Frank Sehr , Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Mr NeilBrown class_parse_nid() and class_parse_nid_quiet() parse a string into a struct lnet_nid, including NIDs with IPv6 addresses. The quiet variant does not print a console error when parsing fails. 
WC-bug-id: https://jira.whamcloud.com/browse/LU-10391 Lustre-commit: 163331cb81c9f7d7a ("LU-10391 lustre: introduce class_parse_nid()") Signed-off-by: Mr NeilBrown Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50089 Reviewed-by: Andreas Dilger Reviewed-by: James Simmons Reviewed-by: Frank Sehr Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/include/obd_class.h | 2 ++ fs/lustre/obdclass/obd_config.c | 28 ++++++++++++++++++++++++++++ 2 files changed, 30 insertions(+) diff --git a/fs/lustre/include/obd_class.h b/fs/lustre/include/obd_class.h index eef9bfb91f4d..e91335e8cd70 100644 --- a/fs/lustre/include/obd_class.h +++ b/fs/lustre/include/obd_class.h @@ -146,6 +146,8 @@ struct cfg_interop_param *class_find_old_param(const char *param, int class_get_next_param(char **params, char *copy); int class_parse_nid4(char *buf, lnet_nid_t *nid4, char **endh); int class_parse_nid4_quiet(char *buf, lnet_nid_t *nid4, char **endh); +int class_parse_nid(char *buf, struct lnet_nid *nid, char **endh); +int class_parse_nid_quiet(char *buf, struct lnet_nid *nid, char **endh); int class_parse_net(char *buf, u32 *net, char **endh); int class_match_net(char *buf, char *key, u32 net); diff --git a/fs/lustre/obdclass/obd_config.c b/fs/lustre/obdclass/obd_config.c index eb14ca807f34..cc7810cae659 100644 --- a/fs/lustre/obdclass/obd_config.c +++ b/fs/lustre/obdclass/obd_config.c @@ -165,6 +165,18 @@ static int parse_nid4(char *buf, void *value, int quiet) return -EINVAL; } +static int parse_nid(char *buf, void *value, int quiet) +{ + struct lnet_nid *nid = value; + + if (libcfs_strnid(nid, buf) == 0) + return 0; + + if (!quiet) + LCONSOLE_ERROR_MSG(0x159, "Can't parse NID '%s'\n", buf); + return -EINVAL; +} + static int parse_net(char *buf, void *value) { u32 *net = value; @@ -176,6 +188,7 @@ static int parse_net(char *buf, void *value) enum { CLASS_PARSE_NID4 = 1, + CLASS_PARSE_NID, CLASS_PARSE_NET, }; @@ -211,6 +224,9 @@ static int 
class_parse_value(char *buf, int opc, void *value, char **endh, case CLASS_PARSE_NID4: rc = parse_nid4(buf, value, quiet); break; + case CLASS_PARSE_NID: + rc = parse_nid(buf, value, quiet); + break; case CLASS_PARSE_NET: rc = parse_net(buf, value); break; @@ -235,6 +251,18 @@ int class_parse_nid4_quiet(char *buf, lnet_nid_t *nid4, char **endh) } EXPORT_SYMBOL(class_parse_nid4_quiet); +int class_parse_nid(char *buf, struct lnet_nid *nid, char **endh) +{ + return class_parse_value(buf, CLASS_PARSE_NID, (void *)nid, endh, 0); +} +EXPORT_SYMBOL(class_parse_nid); + +int class_parse_nid_quiet(char *buf, struct lnet_nid *nid, char **endh) +{ + return class_parse_value(buf, CLASS_PARSE_NID, (void *)nid, endh, 1); +} +EXPORT_SYMBOL(class_parse_nid_quiet); + char *lustre_cfg_string(struct lustre_cfg *lcfg, u32 index) { char *s; From patchwork Thu Jan 30 14:10:54 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 13954651 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from pdx1-mailman-customer002.dreamhost.com (listserver-buz.dreamhost.com [69.163.136.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 69FDAC0218A for ; Thu, 30 Jan 2025 14:22:02 +0000 (UTC) Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1]) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4YkLbk0y6mz1xfG; Thu, 30 Jan 2025 06:13:38 -0800 (PST) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4YkLYY4Wl6z1xP1 for ; Thu, 30 
Jan 2025 06:11:45 -0800 (PST) Received: from star2.ccs.ornl.gov (ltm3-e204-208.ccs.ornl.gov [160.91.203.26]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id E369B182332; Thu, 30 Jan 2025 09:11:32 -0500 (EST) Received: by star2.ccs.ornl.gov (Postfix, from userid 2004) id DC4B1106BE19; Thu, 30 Jan 2025 09:11:32 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 30 Jan 2025 09:10:54 -0500 Message-ID: <20250130141115.950749-5-jsimmons@infradead.org> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20250130141115.950749-1-jsimmons@infradead.org> References: <20250130141115.950749-1-jsimmons@infradead.org> MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 04/25] lnet: change cfs_match_nid to take large nid. X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.39 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Frank Sehr , Cyril Bordage , Serguei Smirnov , Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Mr NeilBrown Large NIDs are now used in more places, so cfs_match_nid() takes a struct lnet_nid pointer instead of an lnet_nid_t. A NID for which nid_is_nid4() is false never matches: the function returns 0. 
WC-bug-id: https://jira.whamcloud.com/browse/LU-10391 Lustre-commit: 3d0b1b0200542f845 ("LU-10391 lustre: change cfs_match_nid to take large nid.") Signed-off-by: Mr NeilBrown Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50098 Reviewed-by: James Simmons Reviewed-by: Frank Sehr Reviewed-by: Serguei Smirnov Reviewed-by: Cyril Bordage Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/llite/llite_lib.c | 2 +- include/uapi/linux/lnet/nidstr.h | 2 +- net/lnet/lnet/nidstrings.c | 10 ++++++---- 3 files changed, 8 insertions(+), 6 deletions(-) diff --git a/fs/lustre/llite/llite_lib.c b/fs/lustre/llite/llite_lib.c index 2c286e858056..d4f17ce24465 100644 --- a/fs/lustre/llite/llite_lib.c +++ b/fs/lustre/llite/llite_lib.c @@ -3746,7 +3746,7 @@ void ll_compute_rootsquash_state(struct ll_sb_info *sbi) while (LNetGetId(i++, &id) != -ENOENT) { if (nid_is_lo0(&id.nid)) continue; - if (cfs_match_nid(lnet_nid_to_nid4(&id.nid), + if (cfs_match_nid(&id.nid, &squash->rsi_nosquash_nids)) { matched = true; break; diff --git a/include/uapi/linux/lnet/nidstr.h b/include/uapi/linux/lnet/nidstr.h index d5829fef0d9f..1ccb8fa826b4 100644 --- a/include/uapi/linux/lnet/nidstr.h +++ b/include/uapi/linux/lnet/nidstr.h @@ -98,7 +98,7 @@ char *libcfs_id2str(struct lnet_process_id id); void cfs_free_nidlist(struct list_head *list); int cfs_parse_nidlist(char *str, int len, struct list_head *list); int cfs_print_nidlist(char *buffer, int count, struct list_head *list); -int cfs_match_nid(lnet_nid_t nid, struct list_head *list); +int cfs_match_nid(struct lnet_nid *nid, struct list_head *list); int cfs_match_net(__u32 net_id, __u32 net_type, struct list_head *net_num_list); diff --git a/net/lnet/lnet/nidstrings.c b/net/lnet/lnet/nidstrings.c index b5a585507d6a..d235048a8ff0 100644 --- a/net/lnet/lnet/nidstrings.c +++ b/net/lnet/lnet/nidstrings.c @@ -358,20 +358,22 @@ EXPORT_SYMBOL(cfs_parse_nidlist); * * Return: 1 on match, 0 otherwises */ -int 
cfs_match_nid(lnet_nid_t nid, struct list_head *nidlist) +int cfs_match_nid(struct lnet_nid *nid, struct list_head *nidlist) { struct nidrange *nr; struct addrrange *ar; + if (!nid_is_nid4(nid)) + return 0; list_for_each_entry(nr, nidlist, nr_link) { - if (nr->nr_netstrfns->nf_type != LNET_NETTYP(LNET_NIDNET(nid))) + if (nr->nr_netstrfns->nf_type != nid->nid_type) continue; - if (nr->nr_netnum != LNET_NETNUM(LNET_NIDNET(nid))) + if (nr->nr_netnum != be16_to_cpu(nid->nid_num)) continue; if (nr->nr_all) return 1; list_for_each_entry(ar, &nr->nr_addrranges, ar_link) - if (nr->nr_netstrfns->nf_match_addr(LNET_NIDADDR(nid), + if (nr->nr_netstrfns->nf_match_addr(be32_to_cpu(nid->nid_addr[0]), &ar->ar_numaddr_ranges)) return 1; } From patchwork Thu Jan 30 14:10:55 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 13954638 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from pdx1-mailman-customer002.dreamhost.com (listserver-buz.dreamhost.com [69.163.136.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id D827DC0218A for ; Thu, 30 Jan 2025 14:17:20 +0000 (UTC) Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1]) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4YkLZj54MMz1xsp; Thu, 30 Jan 2025 06:12:45 -0800 (PST) Received: from smtp3.ccs.ornl.gov (smtp3.ccs.ornl.gov [160.91.203.39]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4YkLYM6b1Pz1xNk for ; Thu, 30 Jan 2025 06:11:35 -0800 (PST) Received: from star2.ccs.ornl.gov (ltm-e204-208.ccs.ornl.gov 
[160.91.203.12]) by smtp3.ccs.ornl.gov (Postfix) with ESMTP id E5256891224; Thu, 30 Jan 2025 09:11:32 -0500 (EST) Received: by star2.ccs.ornl.gov (Postfix, from userid 2004) id E06C5106BE14; Thu, 30 Jan 2025 09:11:32 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 30 Jan 2025 09:10:55 -0500 Message-ID: <20250130141115.950749-6-jsimmons@infradead.org> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20250130141115.950749-1-jsimmons@infradead.org> References: <20250130141115.950749-1-jsimmons@infradead.org> MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 05/25] lustre: mount: improve mount/unmount messages X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.39 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Arshad Hussain , Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Andreas Dilger In some cases, unmount errors are printed in multiple places, or are expected so printing them on the console log is not necessary. Conversely, some status messages such as mounting or unmounting the whole target should not be rate limited. 
WC-bug-id: https://jira.whamcloud.com/browse/LU-980 Lustre-commit: cba8c65b384f92d269 ("LU-980 mount: improve mount/unmount messages") Signed-off-by: Andreas Dilger Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50511 Reviewed-by: Arshad Hussain Reviewed-by: Olaf Faaland Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/obdclass/obd_config.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/lustre/obdclass/obd_config.c b/fs/lustre/obdclass/obd_config.c index cc7810cae659..689e8f54084d 100644 --- a/fs/lustre/obdclass/obd_config.c +++ b/fs/lustre/obdclass/obd_config.c @@ -507,8 +507,8 @@ static int class_cleanup(struct obd_device *obd, struct lustre_cfg *lcfg) obd->obd_force = 1; break; case 'A': - LCONSOLE_WARN("Failing over %s\n", - obd->obd_name); + LCONSOLE(D_WARNING, "Failing over %s\n", + obd->obd_name); spin_lock(&obd->obd_dev_lock); obd->obd_fail = 1; obd->obd_no_recov = 1; From patchwork Thu Jan 30 14:10:56 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 13954660 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from pdx1-mailman-customer002.dreamhost.com (listserver-buz.dreamhost.com [69.163.136.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id EF257C0218A for ; Thu, 30 Jan 2025 14:35:44 +0000 (UTC) Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1]) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4YkLgB3czmz2285; Thu, 30 Jan 2025 06:16:38 -0800 (PST) Received: from smtp3.ccs.ornl.gov (smtp3.ccs.ornl.gov [160.91.203.39]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client 
certificate requested) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4YkLYS2XCkz1whJ for ; Thu, 30 Jan 2025 06:11:40 -0800 (PST) Received: from star2.ccs.ornl.gov (ltm3-e204-208.ccs.ornl.gov [160.91.203.26]) by smtp3.ccs.ornl.gov (Postfix) with ESMTP id E8262893E88; Thu, 30 Jan 2025 09:11:32 -0500 (EST) Received: by star2.ccs.ornl.gov (Postfix, from userid 2004) id E581A106BE16; Thu, 30 Jan 2025 09:11:32 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 30 Jan 2025 09:10:56 -0500 Message-ID: <20250130141115.950749-7-jsimmons@infradead.org> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20250130141115.950749-1-jsimmons@infradead.org> References: <20250130141115.950749-1-jsimmons@infradead.org> MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 06/25] lustre: tbf: pb_uid/pb_gid ptlrpc_body fields for TBF rules X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.39 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Etienne AUJAMES , Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Etienne AUJAMES The file UID/GID are packed inside bulk IO requests because those requests are sent asynchronously (the current thread's UID/GID cannot be used). This is an issue for TBF rules if the file/inode UID/GID does not match those of the process (e.g. when reading common libraries): the RPCs of the user doing the IO cannot be limited in that case. This patch packs the UID/GID for TBF rules inside ptlrpc_body (pb_padding64_2 -> (pb_uid, pb_gid)) to be independent of quota interactions: it stores the client process UID/GID instead of the values of the file attributes. Moreover, it makes it possible to track requests that naturally carry no UID/GID, such as ldlm_flock_enqueue. This patch saves the process UID/GID inside the ll_inode_info struct.
Then it restores these values when sending a bulk IO from a ptlrpc thread (like for jobids). Fixes: fa7d8b2 ("lustre: ptlrpc: Add QoS for uid and gid in NRS-TBF") WC-bug-id: https://jira.whamcloud.com/browse/LU-16077 Lustre-commit: 0544c108c12c87a43 ("LU-16077 tbf: pb_uid/pb_gid ptlrpc_body fields for TBF rules") Signed-off-by: Etienne AUJAMES Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48235 Reviewed-by: Qian Yingjin Reviewed-by: Oleg Drokin Reviewed-by: Andreas Dilger Signed-off-by: James Simmons --- fs/lustre/include/cl_object.h | 3 + fs/lustre/include/lustre_net.h | 2 + fs/lustre/llite/llite_internal.h | 12 ++-- fs/lustre/llite/llite_lib.c | 4 +- fs/lustre/llite/vvp_io.c | 8 ++- fs/lustre/llite/vvp_object.c | 9 ++- fs/lustre/osc/osc_request.c | 2 + fs/lustre/ptlrpc/client.c | 4 +- fs/lustre/ptlrpc/pack_generic.c | 99 +++++++++++++++++++++++--- fs/lustre/ptlrpc/ptlrpcd.c | 4 +- fs/lustre/ptlrpc/wiretest.c | 22 +++--- include/uapi/linux/lustre/lustre_idl.h | 4 +- 12 files changed, 141 insertions(+), 32 deletions(-) diff --git a/fs/lustre/include/cl_object.h b/fs/lustre/include/cl_object.h index 8a98413fba31..77f00d7fc220 100644 --- a/fs/lustre/include/cl_object.h +++ b/fs/lustre/include/cl_object.h @@ -1974,6 +1974,9 @@ struct cl_req_attr { struct obdo *cra_oa; /** Jobid */ char cra_jobid[LUSTRE_JOBID_SIZE]; + /** uid/gid of the process doing an io */ + u32 cra_uid; + u32 cra_gid; }; enum cache_stats_item { diff --git a/fs/lustre/include/lustre_net.h b/fs/lustre/include/lustre_net.h index a305ba3d08db..de1ef881d9d0 100644 --- a/fs/lustre/include/lustre_net.h +++ b/fs/lustre/include/lustre_net.h @@ -2088,6 +2088,7 @@ u32 lustre_msg_get_conn_cnt(struct lustre_msg *msg); u32 lustre_msg_get_magic(struct lustre_msg *msg); timeout_t lustre_msg_get_timeout(struct lustre_msg *msg); timeout_t lustre_msg_get_service_timeout(struct lustre_msg *msg); +int lustre_msg_get_uid_gid(struct lustre_msg *msg, u32 *uid, u32 *gid); char *lustre_msg_get_jobid(struct 
lustre_msg *msg); u32 lustre_msg_get_cksum(struct lustre_msg *msg); u64 lustre_msg_get_mbits(struct lustre_msg *msg); @@ -2106,6 +2107,7 @@ void ptlrpc_request_set_replen(struct ptlrpc_request *req); void lustre_msg_set_timeout(struct lustre_msg *msg, timeout_t timeout); void lustre_msg_set_service_timeout(struct lustre_msg *msg, timeout_t service_timeout); +void lustre_msg_set_uid_gid(struct lustre_msg *msg, u32 *uid, u32 *gid); void lustre_msg_set_jobid(struct lustre_msg *msg, char *jobid); void lustre_msg_set_cksum(struct lustre_msg *msg, u32 cksum); void lustre_msg_set_mbits(struct lustre_msg *msg, u64 mbits); diff --git a/fs/lustre/llite/llite_internal.h b/fs/lustre/llite/llite_internal.h index 88dbd6c692db..e86d700a182b 100644 --- a/fs/lustre/llite/llite_internal.h +++ b/fs/lustre/llite/llite_internal.h @@ -230,13 +230,17 @@ struct ll_inode_info { /* * Whenever a process try to read/write the file, the - * jobid of the process will be saved here, and it'll - * be packed into the write PRC when flush later. + * jobid, uid and gid of the process will be saved here, + * and it'll be packed into write RPC when flush later. * - * So the read/write statistics for jobid will not be - * accurate if the file is shared by different jobs. + * So the read/write statistics or TBF rules for jobid, + * uid or gid will not be accurate if the file is shared + * by different jobs. 
*/ char lli_jobid[LUSTRE_JOBID_SIZE]; + u32 lli_uid; + u32 lli_gid; + struct mutex lli_pcc_lock; enum lu_pcc_state_flags lli_pcc_state; diff --git a/fs/lustre/llite/llite_lib.c b/fs/lustre/llite/llite_lib.c index d4f17ce24465..936a81c65870 100644 --- a/fs/lustre/llite/llite_lib.c +++ b/fs/lustre/llite/llite_lib.c @@ -1215,9 +1215,11 @@ void ll_lli_init(struct ll_inode_info *lli) mutex_init(&lli->lli_group_mutex); lli->lli_group_users = 0; lli->lli_group_gid = 0; + memset(lli->lli_jobid, 0, sizeof(lli->lli_jobid)); + lli->lli_uid = (u32) -1; + lli->lli_gid = (u32) -1; } mutex_init(&lli->lli_layout_mutex); - memset(lli->lli_jobid, 0, sizeof(lli->lli_jobid)); /* ll_cl_context initialize */ INIT_LIST_HEAD(&lli->lli_lccs); seqlock_init(&lli->lli_page_inv_lock); diff --git a/fs/lustre/llite/vvp_io.c b/fs/lustre/llite/vvp_io.c index 31a3992dad34..26dfaaa76bd9 100644 --- a/fs/lustre/llite/vvp_io.c +++ b/fs/lustre/llite/vvp_io.c @@ -1741,13 +1741,15 @@ int vvp_io_init(const struct lu_env *env, struct cl_object *obj, else vio->vui_tot_count = count; - /* for read/write, we store the jobid in the inode, and - * it'll be fetched by osc when building RPC. + /* for read/write, we store the process jobid/gid/uid in the + * inode, and it'll be fetched by osc when building RPC. * * it's not accurate if the file is shared by different - * jobs. + * jobs/user/group. 
*/ lustre_get_jobid(lli->lli_jobid, sizeof(lli->lli_jobid)); + lli->lli_uid = from_kuid(&init_user_ns, current_uid()); + lli->lli_gid = from_kgid(&init_user_ns, current_gid()); } else if (io->ci_type == CIT_SETATTR) { if (!cl_io_is_trunc(io)) io->ci_lockreq = CILR_MANDATORY; diff --git a/fs/lustre/llite/vvp_object.c b/fs/lustre/llite/vvp_object.c index 302f90018982..c79bc5c0e6c9 100644 --- a/fs/lustre/llite/vvp_object.c +++ b/fs/lustre/llite/vvp_object.c @@ -199,11 +199,13 @@ static void vvp_req_attr_set(const struct lu_env *env, struct cl_object *obj, { u64 valid_flags = OBD_MD_FLTYPE | OBD_MD_FLUID | OBD_MD_FLGID | OBD_MD_FLPROJID; + struct ll_inode_info *lli; struct inode *inode; struct obdo *oa; oa = attr->cra_oa; inode = vvp_object_inode(obj); + lli = ll_i2info(inode); if (attr->cra_type == CRT_WRITE) { valid_flags |= OBD_MD_FLMTIME | OBD_MD_FLCTIME; @@ -215,8 +217,11 @@ static void vvp_req_attr_set(const struct lu_env *env, struct cl_object *obj, obdo_set_parent_fid(oa, &ll_i2info(inode)->lli_fid); if (OBD_FAIL_CHECK(OBD_FAIL_LFSCK_INVALID_PFID)) oa->o_parent_oid++; - memcpy(attr->cra_jobid, ll_i2info(inode)->lli_jobid, - sizeof(attr->cra_jobid)); + + attr->cra_uid = lli->lli_uid; + attr->cra_gid = lli->lli_gid; + + memcpy(attr->cra_jobid, &lli->lli_jobid, sizeof(attr->cra_jobid)); } static const struct cl_object_operations vvp_ops = { diff --git a/fs/lustre/osc/osc_request.c b/fs/lustre/osc/osc_request.c index e0955c11191b..8efdd5a8cd8a 100644 --- a/fs/lustre/osc/osc_request.c +++ b/fs/lustre/osc/osc_request.c @@ -2797,6 +2797,8 @@ int osc_build_rpc(const struct lu_env *env, struct client_obd *cli, crattr->cra_oa = &body->oa; crattr->cra_flags = OBD_MD_FLMTIME | OBD_MD_FLCTIME | OBD_MD_FLATIME; cl_req_attr_set(env, osc2cl(obj), crattr); + lustre_msg_set_uid_gid(req->rq_reqmsg, &crattr->cra_uid, + &crattr->cra_gid); lustre_msg_set_jobid(req->rq_reqmsg, crattr->cra_jobid); aa = ptlrpc_req_async_args(aa, req); diff --git a/fs/lustre/ptlrpc/client.c 
b/fs/lustre/ptlrpc/client.c index c9a8c8f5841d..13c27977b14d 100644 --- a/fs/lustre/ptlrpc/client.c +++ b/fs/lustre/ptlrpc/client.c @@ -1159,8 +1159,10 @@ void ptlrpc_set_add_req(struct ptlrpc_request_set *set, atomic_inc(&set->set_remaining); req->rq_queued_time = ktime_get_seconds(); - if (req->rq_reqmsg) + if (req->rq_reqmsg) { lustre_msg_set_jobid(req->rq_reqmsg, NULL); + lustre_msg_set_uid_gid(req->rq_reqmsg, NULL, NULL); + } if (set->set_producer) /* diff --git a/fs/lustre/ptlrpc/pack_generic.c b/fs/lustre/ptlrpc/pack_generic.c index 8d58f9b9da2e..53e2912a28e7 100644 --- a/fs/lustre/ptlrpc/pack_generic.c +++ b/fs/lustre/ptlrpc/pack_generic.c @@ -1182,6 +1182,37 @@ timeout_t lustre_msg_get_service_timeout(struct lustre_msg *msg) } } +int lustre_msg_get_uid_gid(struct lustre_msg *msg, u32 *uid, u32 *gid) +{ + switch (msg->lm_magic) { + case LUSTRE_MSG_MAGIC_V2: { + struct ptlrpc_body *pb; + + /* the old pltrpc_body_v2 is smaller; doesn't include uid/gid */ + if (msg->lm_buflens[MSG_PTLRPC_BODY_OFF] < + sizeof(struct ptlrpc_body)) + return -EOPNOTSUPP; + + pb = lustre_msg_buf_v2(msg, MSG_PTLRPC_BODY_OFF, + sizeof(struct ptlrpc_body)); + + if (!pb || !(pb->pb_flags & MSG_PACK_UID_GID)) + return -EOPNOTSUPP; + + if (uid) + *uid = pb->pb_uid; + if (gid) + *gid = pb->pb_gid; + + return 0; + } + default: + CERROR("incorrect message magic: %08x\n", msg->lm_magic); + return -EOPNOTSUPP; + } +} +EXPORT_SYMBOL(lustre_msg_get_uid_gid); + char *lustre_msg_get_jobid(struct lustre_msg *msg) { switch (msg->lm_magic) { @@ -1443,6 +1474,40 @@ void lustre_msg_set_service_timeout(struct lustre_msg *msg, } } +void lustre_msg_set_uid_gid(struct lustre_msg *msg, u32 *uid, u32 *gid) +{ + switch (msg->lm_magic) { + case LUSTRE_MSG_MAGIC_V2: { + u32 opc = lustre_msg_get_opc(msg); + struct ptlrpc_body *pb; + + /* Don't set uid/gid for ldlm ast RPCs */ + if (!opc || opc == LDLM_BL_CALLBACK || + opc == LDLM_CP_CALLBACK || opc == LDLM_GL_CALLBACK) + return; + + pb = lustre_msg_buf_v2(msg, 
MSG_PTLRPC_BODY_OFF, + sizeof(struct ptlrpc_body)); + LASSERTF(pb, "invalid msg %p: no ptlrpc body!\n", msg); + + if (uid && gid) { + pb->pb_uid = *uid; + pb->pb_gid = *gid; + pb->pb_flags |= MSG_PACK_UID_GID; + } else if (!(pb->pb_flags & MSG_PACK_UID_GID)) { + pb->pb_uid = from_kuid(&init_user_ns, current_uid()); + pb->pb_gid = from_kgid(&init_user_ns, current_gid()); + pb->pb_flags |= MSG_PACK_UID_GID; + } + + return; + } + default: + LASSERTF(0, "incorrect message magic: %08x\n", msg->lm_magic); + } +} +EXPORT_SYMBOL(lustre_msg_set_uid_gid); + void lustre_msg_set_jobid(struct lustre_msg *msg, char *jobid) { switch (msg->lm_magic) { @@ -1592,7 +1657,8 @@ void lustre_swab_ptlrpc_body(struct ptlrpc_body *body) __swab64s(&body->pb_mbits); BUILD_BUG_ON(offsetof(typeof(*body), pb_padding64_0) == 0); BUILD_BUG_ON(offsetof(typeof(*body), pb_padding64_1) == 0); - BUILD_BUG_ON(offsetof(typeof(*body), pb_padding64_2) == 0); + __swab32s(&body->pb_uid); + __swab32s(&body->pb_gid); /* While we need to maintain compatibility between * clients and servers without ptlrpc_body_v2 (< 2.3) * do not swab any fields beyond pb_jobid, as we are @@ -2473,8 +2539,14 @@ void _debug_req(struct ptlrpc_request *req, struct lnet_nid *nid = NULL; int rep_flags = -1; int rep_status = -1; - va_list args; + u64 req_transno = 0; + int req_opc = -1; + u32 req_flags = (u32) -1; + u32 req_uid = (u32) -1; + u32 req_gid = (u32) -1; + char *req_jobid = NULL; struct va_format vaf; + va_list args; spin_lock(&req->rq_early_free_lock); if (req->rq_repmsg) @@ -2496,15 +2568,22 @@ void _debug_req(struct ptlrpc_request *req, else if (req->rq_export && req->rq_export->exp_connection) nid = &req->rq_export->exp_connection->c_peer.nid; + if (req_ok) { + req_transno = lustre_msg_get_transno(req->rq_reqmsg); + req_opc = lustre_msg_get_opc(req->rq_reqmsg); + req_jobid = lustre_msg_get_jobid(req->rq_reqmsg); + lustre_msg_get_uid_gid(req->rq_reqmsg, &req_uid, &req_gid); + req_flags = 
lustre_msg_get_flags(req->rq_reqmsg); + } + va_start(args, fmt); vaf.fmt = fmt; vaf.va = &args; libcfs_debug_msg(msgdata, - "%pV req@%p x%llu/t%lld(%lld) o%d->%s@%s:%d/%d lens %d/%d e %d to %lld dl %lld ref %d fl " REQ_FLAGS_FMT "/%x/%x rc %d/%d job:'%s'\n", + "%pV req@%p x%llu/t%lld(%llu) o%d->%s@%s:%d/%d lens %d/%d e %d to %lld dl %lld ref %d fl " REQ_FLAGS_FMT "/%x/%x rc %d/%d uid:%u gid:%u job:'%s'\n", &vaf, - req, req->rq_xid, req->rq_transno, - req_ok ? lustre_msg_get_transno(req->rq_reqmsg) : 0, - req_ok ? lustre_msg_get_opc(req->rq_reqmsg) : -1, + req, req->rq_xid, req->rq_transno, req_transno, + req_opc, req->rq_import ? req->rq_import->imp_obd->obd_name : req->rq_export ? @@ -2516,11 +2595,9 @@ void _debug_req(struct ptlrpc_request *req, req->rq_early_count, (s64)req->rq_timedout, (s64)req->rq_deadline, atomic_read(&req->rq_refcount), - DEBUG_REQ_FLAGS(req), - req_ok ? lustre_msg_get_flags(req->rq_reqmsg) : -1, - rep_flags, req->rq_status, rep_status, - req_ok ? lustre_msg_get_jobid(req->rq_reqmsg) ?: "" - : ""); + DEBUG_REQ_FLAGS(req), req_flags, rep_flags, + req->rq_status, rep_status, + req_uid, req_gid, req_jobid ?: ""); va_end(args); } EXPORT_SYMBOL(_debug_req); diff --git a/fs/lustre/ptlrpc/ptlrpcd.c b/fs/lustre/ptlrpc/ptlrpcd.c index 23fb52dc2b6b..7342db8e56a3 100644 --- a/fs/lustre/ptlrpc/ptlrpcd.c +++ b/fs/lustre/ptlrpc/ptlrpcd.c @@ -224,8 +224,10 @@ void ptlrpcd_add_req(struct ptlrpc_request *req) { struct ptlrpcd_ctl *pc; - if (req->rq_reqmsg) + if (req->rq_reqmsg) { lustre_msg_set_jobid(req->rq_reqmsg, NULL); + lustre_msg_set_uid_gid(req->rq_reqmsg, NULL, NULL); + } spin_lock(&req->rq_lock); if (req->rq_invalid_rqset) { diff --git a/fs/lustre/ptlrpc/wiretest.c b/fs/lustre/ptlrpc/wiretest.c index 6e893f0275bc..0da776dc6366 100644 --- a/fs/lustre/ptlrpc/wiretest.c +++ b/fs/lustre/ptlrpc/wiretest.c @@ -772,10 +772,14 @@ void lustre_assert_wire_constants(void) (long long)(int)offsetof(struct ptlrpc_body_v3, pb_padding64_1)); 
LASSERTF((int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_padding64_1) == 8, "found %lld\n", (long long)(int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_padding64_1)); - LASSERTF((int)offsetof(struct ptlrpc_body_v3, pb_padding64_2) == 144, "found %lld\n", - (long long)(int)offsetof(struct ptlrpc_body_v3, pb_padding64_2)); - LASSERTF((int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_padding64_2) == 8, "found %lld\n", - (long long)(int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_padding64_2)); + LASSERTF((int)offsetof(struct ptlrpc_body_v3, pb_uid) == 144, "found %lld\n", + (long long)(int)offsetof(struct ptlrpc_body_v3, pb_uid)); + LASSERTF((int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_uid) == 4, "found %lld\n", + (long long)(int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_uid)); + LASSERTF((int)offsetof(struct ptlrpc_body_v3, pb_gid) == 148, "found %lld\n", + (long long)(int)offsetof(struct ptlrpc_body_v3, pb_gid)); + LASSERTF((int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_gid) == 4, "found %lld\n", + (long long)(int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_gid)); BUILD_BUG_ON(LUSTRE_JOBID_SIZE != 32); LASSERTF((int)offsetof(struct ptlrpc_body_v3, pb_jobid) == 152, "found %lld\n", (long long)(int)offsetof(struct ptlrpc_body_v3, pb_jobid)); @@ -869,10 +873,12 @@ void lustre_assert_wire_constants(void) (int)offsetof(struct ptlrpc_body_v3, pb_padding64_1), (int)offsetof(struct ptlrpc_body_v2, pb_padding64_1)); LASSERTF((int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_padding64_1) == (int)sizeof(((struct ptlrpc_body_v2 *)0)->pb_padding64_1), "%d != %d\n", (int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_padding64_1), (int)sizeof(((struct ptlrpc_body_v2 *)0)->pb_padding64_1)); - LASSERTF((int)offsetof(struct ptlrpc_body_v3, pb_padding64_2) == (int)offsetof(struct ptlrpc_body_v2, pb_padding64_2), "%d != %d\n", - (int)offsetof(struct ptlrpc_body_v3, pb_padding64_2), (int)offsetof(struct ptlrpc_body_v2, pb_padding64_2)); - LASSERTF((int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_padding64_2) == 
(int)sizeof(((struct ptlrpc_body_v2 *)0)->pb_padding64_2), "%d != %d\n", - (int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_padding64_2), (int)sizeof(((struct ptlrpc_body_v2 *)0)->pb_padding64_2)); + LASSERTF((int)offsetof(struct ptlrpc_body_v3, pb_uid) == (int)offsetof(struct ptlrpc_body_v2, pb_padding64_2), "%d != %d\n", + (int)offsetof(struct ptlrpc_body_v3, pb_uid), (int)offsetof(struct ptlrpc_body_v2, pb_padding64_2)); + LASSERTF((int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_uid) + (int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_gid) == + (int)sizeof(((struct ptlrpc_body_v2 *)0)->pb_padding64_2), "%d != %d\n", + (int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_uid) + (int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_gid), + (int)sizeof(((struct ptlrpc_body_v2 *)0)->pb_padding64_2)); LASSERTF(MSG_PTLRPC_BODY_OFF == 0, "found %lld\n", (long long)MSG_PTLRPC_BODY_OFF); LASSERTF(REQ_REC_OFF == 1, "found %lld\n", diff --git a/include/uapi/linux/lustre/lustre_idl.h b/include/uapi/linux/lustre/lustre_idl.h index 83c8ea8f841a..a77a005f0ab1 100644 --- a/include/uapi/linux/lustre/lustre_idl.h +++ b/include/uapi/linux/lustre/lustre_idl.h @@ -593,6 +593,7 @@ enum lustre_msg_version { /* #define MSG_CONNECT_ASYNC 0x00000040 obsolete since 1.5 */ #define MSG_CONNECT_NEXT_VER 0x00000080 /* use next version of lustre_msg */ #define MSG_CONNECT_TRANSNO 0x00000100 /* client sent transno in replay */ +#define MSG_PACK_UID_GID 0x00000200 /* thread UID/GID in ptlrpc_body */ /* number of previous object versions in pb_pre_versions[] */ #define PTLRPC_NUM_VERSIONS 4 @@ -622,7 +623,8 @@ struct ptlrpc_body_v3 { /* padding for future needs - fix lustre_swab_ptlrpc_body() also */ __u64 pb_padding64_0; __u64 pb_padding64_1; - __u64 pb_padding64_2; + __u32 pb_uid; /* req: process uid, use by tbf rules */ + __u32 pb_gid; /* req: process gid, use by tbf rules */ char pb_jobid[LUSTRE_JOBID_SIZE]; /* req: ASCII jobid from env + NUL */ }; From patchwork Thu Jan 30 14:10:57 2025 Content-Type: text/plain; 
charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 13954637 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from pdx1-mailman-customer002.dreamhost.com (listserver-buz.dreamhost.com [69.163.136.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 54837C0218A for ; Thu, 30 Jan 2025 14:13:19 +0000 (UTC) Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1]) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4YkLYt6N1zz1xfS; Thu, 30 Jan 2025 06:12:02 -0800 (PST) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4YkLYd5lkCz1xPH for ; Thu, 30 Jan 2025 06:11:49 -0800 (PST) Received: from star2.ccs.ornl.gov (ltm3-e204-208.ccs.ornl.gov [160.91.203.26]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id ECB1C182336; Thu, 30 Jan 2025 09:11:32 -0500 (EST) Received: by star2.ccs.ornl.gov (Postfix, from userid 2004) id E9E79106BE17; Thu, 30 Jan 2025 09:11:32 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 30 Jan 2025 09:10:57 -0500 Message-ID: <20250130141115.950749-8-jsimmons@infradead.org> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20250130141115.950749-1-jsimmons@infradead.org> References: <20250130141115.950749-1-jsimmons@infradead.org> MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 07/25] lustre: ptlrpc: Track highest reply XID X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.39 Precedence: list List-Id: "For discussing Lustre software development." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Horn , Serguei Smirnov , Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Chris Horn Keep track of the highest XID that we've received a reply for. When an OBD_PING expires, do not disconnect the import if the failed XID is less than or equal to the last reply XID. This avoids a situation where a lost OBD_PING RPC causes a reconnect even though we've completed other RPCs in the meantime. HPE-bug-id: LUS-11474 WC-bug-id: https://jira.whamcloud.com/browse/LU-16483 Lustre-commit: eb1f4a5222039be9f7 ("LU-16483 ptlrpc: Track highest reply XID") Signed-off-by: Chris Horn Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49807 Reviewed-by: Andreas Dilger Reviewed-by: Serguei Smirnov Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/include/lustre_import.h | 4 +++- fs/lustre/include/obd_support.h | 1 + fs/lustre/ptlrpc/client.c | 29 +++++++++++++++++++++-------- fs/lustre/ptlrpc/events.c | 3 +++ fs/lustre/ptlrpc/niobuf.c | 5 +++++ 5 files changed, 33 insertions(+), 9 deletions(-) diff --git a/fs/lustre/include/lustre_import.h b/fs/lustre/include/lustre_import.h index ac46aaef09bf..4789bba8a0b9 100644 --- a/fs/lustre/include/lustre_import.h +++ b/fs/lustre/include/lustre_import.h @@ -198,8 +198,10 @@ struct obd_import { /** List of not replied requests */ struct list_head imp_unreplied_list; - /** Known maximal replied XID */ + /** XID below which we know all replies have been received */ u64 imp_known_replied_xid; + /** highest XID for which we have received a reply */ + u64 imp_highest_replied_xid; /** obd device for this import */ struct obd_device *imp_obd; diff --git a/fs/lustre/include/obd_support.h b/fs/lustre/include/obd_support.h index 55196ce8e3f4..ab7899cd1384 100644 --- a/fs/lustre/include/obd_support.h +++ b/fs/lustre/include/obd_support.h @@ -369,6 +369,7 @@ extern char obd_jobid_var[]; #define
OBD_FAIL_PTLRPC_CONNECT_RACE 0x531 #define OBD_FAIL_PTLRPC_IDLE_RACE 0x533 #define OBD_FAIL_PTLRPC_ENQ_RESEND 0x534 +#define OBD_FAIL_PTLRPC_DELAY_SEND_FAIL 0x535 #define OBD_FAIL_OBD_PING_NET 0x600 /* OBD_FAIL_OBD_LOG_CANCEL_NET 0x601 obsolete since 1.5 */ diff --git a/fs/lustre/ptlrpc/client.c b/fs/lustre/ptlrpc/client.c index 13c27977b14d..7a267e67e45c 100644 --- a/fs/lustre/ptlrpc/client.c +++ b/fs/lustre/ptlrpc/client.c @@ -1289,12 +1289,9 @@ static int ptlrpc_import_delay_req(struct obd_import *imp, * Return: false if no message should be printed * true if console message should be printed */ -static bool ptlrpc_console_allow(struct ptlrpc_request *req) +static bool ptlrpc_console_allow(struct ptlrpc_request *req, u32 opc, int err) { - u32 opc; - LASSERT(req->rq_reqmsg); - opc = lustre_msg_get_opc(req->rq_reqmsg); /* Suppress particular reconnect errors which are to be expected. */ if (opc == OST_CONNECT || opc == MDS_CONNECT || opc == MGS_CONNECT) { @@ -1316,6 +1313,15 @@ static bool ptlrpc_console_allow(struct ptlrpc_request *req) return false; } + if (opc == LDLM_ENQUEUE && err == -EAGAIN) + /* -EAGAIN is normal when using POSIX flocks */ + return false; + + if (opc == OBD_PING && (err == -ENODEV || err == -ENOTCONN) && + (req->rq_xid & 0xf) != 10) + /* Suppress most ping requests, they may fail occasionally */ + return false; + return true; } @@ -1334,8 +1340,7 @@ static int ptlrpc_check_status(struct ptlrpc_request *req) u32 opc = lustre_msg_get_opc(req->rq_reqmsg); /* -EAGAIN is normal when using POSIX flocks */ - if (ptlrpc_console_allow(req) && - !(opc == LDLM_ENQUEUE && rc == -EAGAIN)) + if (ptlrpc_console_allow(req, opc, rc)) LCONSOLE_ERROR_MSG(0x011, "%s: operation %s to node %s failed: rc = %d\n", imp->imp_obd->obd_name, @@ -2226,13 +2231,19 @@ EXPORT_SYMBOL(ptlrpc_check_set); int ptlrpc_expire_one_request(struct ptlrpc_request *req, int async_unlink) { struct obd_import *imp = req->rq_import; + unsigned int debug_mask = D_RPCTRACE; int rc = 0; + 
u32 opc; spin_lock(&req->rq_lock); req->rq_timedout = 1; spin_unlock(&req->rq_lock); - DEBUG_REQ(D_WARNING, req, "Request sent has %s: [sent %lld/real %lld]", + opc = lustre_msg_get_opc(req->rq_reqmsg); + if (ptlrpc_console_allow(req, opc, + lustre_msg_get_status(req->rq_reqmsg))) + debug_mask = D_WARNING; + DEBUG_REQ(debug_mask, req, "Request sent has %s: [sent %lld/real %lld]", req->rq_net_err ? "failed due to network error" : ((req->rq_real_sent == 0 || req->rq_real_sent < req->rq_sent || @@ -2286,7 +2297,9 @@ int ptlrpc_expire_one_request(struct ptlrpc_request *req, int async_unlink) rc = 1; } - ptlrpc_fail_import(imp, lustre_msg_get_conn_cnt(req->rq_reqmsg)); + if (opc != OBD_PING || req->rq_xid > imp->imp_highest_replied_xid) + ptlrpc_fail_import(imp, + lustre_msg_get_conn_cnt(req->rq_reqmsg)); return rc; } diff --git a/fs/lustre/ptlrpc/events.c b/fs/lustre/ptlrpc/events.c index 17ef775923db..93ff704ac4ec 100644 --- a/fs/lustre/ptlrpc/events.c +++ b/fs/lustre/ptlrpc/events.c @@ -171,6 +171,9 @@ void reply_in_callback(struct lnet_event *ev) if (lustre_msg_get_opc(req->rq_reqmsg) != OBD_PING) req->rq_import->imp_last_reply_time = ktime_get_real_seconds(); + if (req->rq_xid > req->rq_import->imp_highest_replied_xid) + req->rq_import->imp_highest_replied_xid = req->rq_xid; + out_wake: /* NB don't unlock till after wakeup; req can disappear under us * since we don't have our own ref diff --git a/fs/lustre/ptlrpc/niobuf.c b/fs/lustre/ptlrpc/niobuf.c index 09f68157b883..ccc2caab3876 100644 --- a/fs/lustre/ptlrpc/niobuf.c +++ b/fs/lustre/ptlrpc/niobuf.c @@ -725,6 +725,10 @@ int ptl_send_rpc(struct ptlrpc_request *request, int noreply) request->rq_deadline = request->rq_sent + request->rq_timeout + ptlrpc_at_get_net_latency(request); + if (unlikely(opc == OBD_PING && + OBD_FAIL_TIMEOUT(OBD_FAIL_PTLRPC_DELAY_SEND_FAIL, cfs_fail_val))) + goto skip_send; + DEBUG_REQ(D_INFO, request, "send flags=%x", lustre_msg_get_flags(request->rq_reqmsg)); rc = 
ptl_send_buf(&request->rq_req_md_h, @@ -737,6 +741,7 @@ int ptl_send_rpc(struct ptlrpc_request *request, int noreply) if (likely(rc == 0)) goto out; +skip_send: request->rq_req_unlinked = 1; ptlrpc_req_finished(request); if (noreply) From patchwork Thu Jan 30 14:10:58 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 13954656 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from pdx1-mailman-customer002.dreamhost.com (listserver-buz.dreamhost.com [69.163.136.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 5891EC0218A for ; Thu, 30 Jan 2025 14:31:24 +0000 (UTC) Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1]) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4YkLdq2ntTz217m; Thu, 30 Jan 2025 06:15:27 -0800 (PST) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4YkLYl34Phz1xPH for ; Thu, 30 Jan 2025 06:11:55 -0800 (PST) Received: from star2.ccs.ornl.gov (ltm3-e204-208.ccs.ornl.gov [160.91.203.26]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id F19AB18233D; Thu, 30 Jan 2025 09:11:32 -0500 (EST) Received: by star2.ccs.ornl.gov (Postfix, from userid 2004) id EF585106BE18; Thu, 30 Jan 2025 09:11:32 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 30 Jan 2025 09:10:58 -0500 Message-ID: <20250130141115.950749-9-jsimmons@infradead.org> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20250130141115.950749-1-jsimmons@infradead.org> References: 
<20250130141115.950749-1-jsimmons@infradead.org> MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 08/25] lustre: enc: make sure DoM files are correctly decrypted X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.39 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Mikhail Pershin , Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Sebastien Buisson Make sure DoM files are decrypted upon read by loading their associated encryption context, via fscrypt_prepare_readdir()/ fscrypt_get_encryption_info(). WC-bug-id: https://jira.whamcloud.com/browse/LU-16670 Lustre-commit: 1c424252d37c64e3c ("LU-16670 enc: make sure DoM files are correctly decrypted") Signed-off-by: Sebastien Buisson Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50429 Reviewed-by: Andreas Dilger Reviewed-by: Patrick Farrell Reviewed-by: Mikhail Pershin Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/llite/file.c | 7 +++++-- fs/lustre/llite/namei.c | 6 +----- 2 files changed, 6 insertions(+), 7 deletions(-) diff --git a/fs/lustre/llite/file.c b/fs/lustre/llite/file.c index 9307007c3e18..d196362a40ca 100644 --- a/fs/lustre/llite/file.c +++ b/fs/lustre/llite/file.c @@ -453,14 +453,13 @@ static inline int ll_dom_readpage(void *data, struct page *page) if (lnb->lnb_len < PAGE_SIZE) memset(kaddr + lnb->lnb_len, 0, PAGE_SIZE - lnb->lnb_len); - flush_dcache_page(page); - SetPageUptodate(page); kunmap_atomic(kaddr); if (inode && IS_ENCRYPTED(inode) && S_ISREG(inode->i_mode)) { if (!fscrypt_has_encryption_key(inode)) { CDEBUG(D_SEC, "no enc key for " DFID "\n", PFID(ll_inode2fid(inode))); + rc = -ENOKEY; } else { unsigned int offs = 0; @@ -481,6 +480,10 @@ static inline int ll_dom_readpage(void *data, struct page *page) } } } + if (!rc) { + flush_dcache_page(page); + SetPageUptodate(page); + } unlock_page(page); 
return rc; diff --git a/fs/lustre/llite/namei.c b/fs/lustre/llite/namei.c index a19e5f707027..920b592489ab 100644 --- a/fs/lustre/llite/namei.c +++ b/fs/lustre/llite/namei.c @@ -735,10 +735,6 @@ static int ll_lookup_it_finish(struct ptlrpc_request *request, rc = fscrypt_get_encryption_info(inode); if (rc) goto out; - if (!fscrypt_has_encryption_key(inode)) { - rc = -ENOKEY; - goto out; - } } } else if (!it_disposition(it, DISP_OPEN_CREATE)) { /* @@ -1204,6 +1200,7 @@ static int ll_atomic_open(struct inode *dir, struct dentry *dentry, rc = fscrypt_get_encryption_info(dir); if (rc) goto out_release; + encrypt = true; if (open_flags & O_CREAT) { /* For migration or mirroring without enc key, we still * need to be able to create a volatile file. @@ -1216,7 +1213,6 @@ static int ll_atomic_open(struct inode *dir, struct dentry *dentry, rc = -ENOKEY; goto out_release; } - encrypt = true; } } From patchwork Thu Jan 30 14:10:59 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 13954650 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from pdx1-mailman-customer002.dreamhost.com (listserver-buz.dreamhost.com [69.163.136.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 664FCC0218A for ; Thu, 30 Jan 2025 14:21:39 +0000 (UTC) Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1]) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4YkLbc5ZMJz1yY0; Thu, 30 Jan 2025 06:13:32 -0800 (PST) Received: from smtp3.ccs.ornl.gov (smtp3.ccs.ornl.gov [160.91.203.39]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by 
pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4YkLYX2cxyz1xNk for ; Thu, 30 Jan 2025 06:11:44 -0800 (PST) Received: from star2.ccs.ornl.gov (ltm-e204-208.ccs.ornl.gov [160.91.203.12]) by smtp3.ccs.ornl.gov (Postfix) with ESMTP id 019A1899AC0; Thu, 30 Jan 2025 09:11:33 -0500 (EST) Received: by star2.ccs.ornl.gov (Postfix, from userid 2004) id F39AC106BE14; Thu, 30 Jan 2025 09:11:32 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 30 Jan 2025 09:10:59 -0500 Message-ID: <20250130141115.950749-10-jsimmons@infradead.org> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20250130141115.950749-1-jsimmons@infradead.org> References: <20250130141115.950749-1-jsimmons@infradead.org> MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 09/25] lustre: pcc: reserve flags for PCC-RO X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.39 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Feng Lei , Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Qian Yingjin This patch reserves flags for PCC-RO. It also adds wire check / test for these new flags. 
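For illustration only (not part of the patch): a minimal user-space sketch of the kind of wire check described above. It assumes hypothetical demo_* stand-ins for the real enums; only the numeric values are taken from this patch. The real check is done at run time with LASSERTF() in lustre_assert_wire_constants(); the sketch pins the same values at compile time instead.

```c
#include <assert.h>

/* Hypothetical stand-in for enum layout_intent_opc; only the values
 * 6, 7 and 8 are taken from the patch.
 */
enum demo_layout_intent_opc {
	DEMO_LAYOUT_INTENT_RESTORE	= 6,
	DEMO_LAYOUT_INTENT_PCCRO_SET	= 7,
	DEMO_LAYOUT_INTENT_PCCRO_CLEAR	= 8,
};

/* Hypothetical stand-in for the hsm_states bits touched by the patch. */
enum demo_hsm_states {
	DEMO_HS_LOST	= 0x00000040,
	DEMO_HS_PCCRW	= 0x00000080,
	DEMO_HS_PCCRO	= 0x00000100,
};

/* Compile-time counterpart of the run-time LASSERTF() wire checks. */
static_assert(DEMO_LAYOUT_INTENT_PCCRO_SET == 7, "wire value changed");
static_assert(DEMO_LAYOUT_INTENT_PCCRO_CLEAR == 8, "wire value changed");

/* Returns 1 when the two new flags are single bits that do not overlap. */
static int demo_flags_are_disjoint(void)
{
	unsigned int rw = DEMO_HS_PCCRW, ro = DEMO_HS_PCCRO;

	return (rw & ro) == 0 &&
	       (rw & (rw - 1)) == 0 &&
	       (ro & (ro - 1)) == 0;
}
```

Changing either reserved value in the sketch makes the build fail, which is the property a wire check is meant to protect.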
WC-bug-id: https://jira.whamcloud.com/browse/LU-16700 Lustre-commit: d3874966e4938df175 ("LU-16700 pcc: reserve flags for PCC-RO") Signed-off-by: Qian Yingjin Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50504 Reviewed-by: Andreas Dilger Reviewed-by: Feng Lei Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/ptlrpc/wiretest.c | 4 ++++ include/uapi/linux/lustre/lustre_idl.h | 4 +++- include/uapi/linux/lustre/lustre_user.h | 17 ++++++++++++++++- 3 files changed, 23 insertions(+), 2 deletions(-) diff --git a/fs/lustre/ptlrpc/wiretest.c b/fs/lustre/ptlrpc/wiretest.c index 0da776dc6366..d4a2b82c961e 100644 --- a/fs/lustre/ptlrpc/wiretest.c +++ b/fs/lustre/ptlrpc/wiretest.c @@ -4373,6 +4373,10 @@ void lustre_assert_wire_constants(void) (long long)LAYOUT_INTENT_RELEASE); LASSERTF(LAYOUT_INTENT_RESTORE == 6, "found %lld\n", (long long)LAYOUT_INTENT_RESTORE); + LASSERTF(LAYOUT_INTENT_PCCRO_SET == 7, "found %lld\n", + (long long)LAYOUT_INTENT_PCCRO_SET); + LASSERTF(LAYOUT_INTENT_PCCRO_CLEAR == 8, "found %lld\n", + (long long)LAYOUT_INTENT_PCCRO_CLEAR); /* Checks for struct hsm_action_item */ LASSERTF((int)sizeof(struct hsm_action_item) == 72, "found %lld\n", diff --git a/include/uapi/linux/lustre/lustre_idl.h b/include/uapi/linux/lustre/lustre_idl.h index a77a005f0ab1..187a807d4809 100644 --- a/include/uapi/linux/lustre/lustre_idl.h +++ b/include/uapi/linux/lustre/lustre_idl.h @@ -2835,7 +2835,9 @@ enum layout_intent_opc { LAYOUT_INTENT_GLIMPSE = 3, /** not used */ LAYOUT_INTENT_TRUNC = 4, /** truncate file, for comp layout */ LAYOUT_INTENT_RELEASE = 5, /** reserved for HSM release */ - LAYOUT_INTENT_RESTORE = 6 /** reserved for HSM restore */ + LAYOUT_INTENT_RESTORE = 6, /** reserved for HSM restore */ + LAYOUT_INTENT_PCCRO_SET = 7, /** set read-only layout for PCC */ + LAYOUT_INTENT_PCCRO_CLEAR = 8, /** clear read-only layout */ }; /* enqueue layout lock with intent */ diff --git a/include/uapi/linux/lustre/lustre_user.h 
b/include/uapi/linux/lustre/lustre_user.h index 68fddcf4cb59..876d337a3b2b 100644 --- a/include/uapi/linux/lustre/lustre_user.h +++ b/include/uapi/linux/lustre/lustre_user.h @@ -373,6 +373,7 @@ struct ll_ioc_lease_id { #define LL_IOC_LADVISE _IOR('f', 250, struct llapi_lu_ladvise) #define LL_IOC_HEAT_GET _IOWR('f', 251, struct lu_heat) #define LL_IOC_HEAT_SET _IOW('f', 251, __u64) +#define LL_IOC_PCC_ATTACH _IOW('f', 252, struct lu_pcc_attach) #define LL_IOC_PCC_DETACH _IOW('f', 252, struct lu_pcc_detach) #define LL_IOC_PCC_DETACH_BY_FID _IOW('f', 252, struct lu_pcc_detach_fid) #define LL_IOC_PCC_STATE _IOR('f', 252, struct lu_pcc_state) @@ -830,9 +831,18 @@ struct lustre_foreign_type { **/ enum lustre_foreign_types { LU_FOREIGN_TYPE_NONE = 0, + /* HSM copytool lhsm_posix */ + LU_FOREIGN_TYPE_POSIX = 1, + /* Used for PCC-RW. PCCRW components are local to a single archive. */ + LU_FOREIGN_TYPE_PCCRW = 2, + /* Used for PCC-RO. PCCRO components may be shared between archives. */ + LU_FOREIGN_TYPE_PCCRO = 3, + /* Used for S3 */ + LU_FOREIGN_TYPE_S3 = 4, + /* Used for DAOS */ LU_FOREIGN_TYPE_SYMLINK = 0xda05, /* must be the max/last one */ - LU_FOREIGN_TYPE_UNKNOWN = 0xffffffff, + LU_FOREIGN_TYPE_UNKNOWN = 0xffffffff, }; extern struct lustre_foreign_type lu_foreign_types[]; @@ -1928,6 +1938,8 @@ enum hsm_states { HS_NORELEASE = 0x00000010, HS_NOARCHIVE = 0x00000020, HS_LOST = 0x00000040, + HS_PCCRW = 0x00000080, + HS_PCCRO = 0x00000100, }; /* HSM user-setable flags. 
*/ @@ -2356,6 +2368,7 @@ enum lu_pcc_type { LU_PCC_READWRITE = 0x01, LU_PCC_READONLY = 0x02, LU_PCC_TYPE_MASK = LU_PCC_READWRITE | LU_PCC_READONLY, + LU_PCC_FL_ASYNC = 0x10, LU_PCC_MAX }; @@ -2399,6 +2412,8 @@ enum lu_pcc_state_flags { PCC_STATE_FL_ATTR_VALID = 0x01, /* The file is being attached into PCC */ PCC_STATE_FL_ATTACHING = 0x02, + /* The PCC copy is unlinked */ + PCC_STATE_FL_UNLINKED = 0x04, }; struct lu_pcc_state { From patchwork Thu Jan 30 14:11:00 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 13954661 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from pdx1-mailman-customer002.dreamhost.com (listserver-buz.dreamhost.com [69.163.136.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 3C9B9C0218F for ; Thu, 30 Jan 2025 14:39:20 +0000 (UTC) Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1]) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4YkLh63T0cz2293; Thu, 30 Jan 2025 06:17:26 -0800 (PST) Received: from smtp3.ccs.ornl.gov (smtp3.ccs.ornl.gov [160.91.203.39]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4YkLYc6PyJz1whJ for ; Thu, 30 Jan 2025 06:11:48 -0800 (PST) Received: from star2.ccs.ornl.gov (ltm3-e204-208.ccs.ornl.gov [160.91.203.26]) by smtp3.ccs.ornl.gov (Postfix) with ESMTP id 070E1899AD6; Thu, 30 Jan 2025 09:11:33 -0500 (EST) Received: by star2.ccs.ornl.gov (Postfix, from userid 2004) id 03D07106BE16; Thu, 30 Jan 2025 09:11:33 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 30 
Jan 2025 09:11:00 -0500 Message-ID: <20250130141115.950749-11-jsimmons@infradead.org> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20250130141115.950749-1-jsimmons@infradead.org> References: <20250130141115.950749-1-jsimmons@infradead.org> MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 10/25] lustre: lov: refresh LOVEA with LL granted X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.39 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Zhenyu Xu , Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Alex Zhuravlev This change tries to fix the following: lov_layout_change() should not apply old layouts, which can get through when the MDS doesn't take the layout lock. This patch misses an optimization and can result in a number of useless calls to the OSD to fetch the LOVEA. To be fixed in a follow-up patch. WC-bug-id: https://jira.whamcloud.com/browse/LU-15300 Lustre-commit: 13557aa86904376e4 ("LU-15300 mdt: refresh LOVEA with LL granted") Signed-off-by: Alex Zhuravlev Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/46413 Reviewed-by: Andreas Dilger Reviewed-by: Zhenyu Xu Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/include/obd_support.h | 1 + fs/lustre/llite/vvp_page.c | 4 ++++ fs/lustre/lov/lov_object.c | 18 ++++++++++++++++++ 3 files changed, 23 insertions(+) diff --git a/fs/lustre/include/obd_support.h b/fs/lustre/include/obd_support.h index ab7899cd1384..43b4684f418a 100644 --- a/fs/lustre/include/obd_support.h +++ b/fs/lustre/include/obd_support.h @@ -492,6 +492,7 @@ extern char obd_jobid_var[]; #define OBD_FAIL_LLITE_XATTR_PAUSE 0x1420 #define OBD_FAIL_LLITE_PAGE_INVALIDATE_PAUSE 0x1421 #define OBD_FAIL_LLITE_READPAGE_PAUSE 0x1422 +#define OBD_FAIL_LLITE_PANIC_ON_ESTALE 0x1423 #define OBD_FAIL_FID_INDIR 0x1501 #define OBD_FAIL_FID_INLMA 0x1502 diff --git a/fs/lustre/llite/vvp_page.c 
b/fs/lustre/llite/vvp_page.c index 30524fda692e..9994c3d292a9 100644 --- a/fs/lustre/llite/vvp_page.c +++ b/fs/lustre/llite/vvp_page.c @@ -115,6 +115,10 @@ static void vvp_vmpage_error(struct inode *inode, struct page *vmpage, obj->vob_discard_page_warned = 0; } else { SetPageError(vmpage); + if (ioret != -ENOSPC && + OBD_FAIL_CHECK(OBD_FAIL_LLITE_PANIC_ON_ESTALE)) + LBUG(); + mapping_set_error(inode->i_mapping, ioret); if ((ioret == -ESHUTDOWN || ioret == -EINTR || diff --git a/fs/lustre/lov/lov_object.c b/fs/lustre/lov/lov_object.c index 5d65aabe7645..7c20f6eae03b 100644 --- a/fs/lustre/lov/lov_object.c +++ b/fs/lustre/lov/lov_object.c @@ -1370,6 +1370,24 @@ static int lov_conf_set(const struct lu_env *env, struct cl_object *obj, LASSERT(conf->coc_opc == OBJECT_CONF_SET); + /* + * don't apply old layouts which can be brought + * if returned w/o ldlm lock. + * XXX: can we rollback in case of recovery? + */ + if (lsm && lov->lo_lsm) { + u32 oldgen = lov->lo_lsm->lsm_layout_gen & ~LU_LAYOUT_RESYNC; + u32 newgen = lsm->lsm_layout_gen & ~LU_LAYOUT_RESYNC; + + if (newgen < oldgen) { + CDEBUG(D_HA, "skip old for "DFID": %d < %d\n", + PFID(lu_object_fid(lov2lu(lov))), + (int)newgen, (int)oldgen); + result = 0; + goto out; + } + } + if ((!lsm && !lov->lo_lsm) || ((lsm && lov->lo_lsm) && (lov->lo_lsm->lsm_layout_gen == lsm->lsm_layout_gen) && From patchwork Thu Jan 30 14:11:01 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 13954655 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from pdx1-mailman-customer002.dreamhost.com (listserver-buz.dreamhost.com [69.163.136.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 81D56C0218A for ; Thu, 30 Jan 2025 14:28:39 +0000 (UTC) Received: from 
pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1]) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4YkLdH5NnTz214p; Thu, 30 Jan 2025 06:14:59 -0800 (PST) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4YkLYv582Xz1xhB for ; Thu, 30 Jan 2025 06:12:03 -0800 (PST) Received: from star2.ccs.ornl.gov (ltm3-e204-208.ccs.ornl.gov [160.91.203.26]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 0A82318236A; Thu, 30 Jan 2025 09:11:33 -0500 (EST) Received: by star2.ccs.ornl.gov (Postfix, from userid 2004) id 083D5106BE17; Thu, 30 Jan 2025 09:11:33 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 30 Jan 2025 09:11:01 -0500 Message-ID: <20250130141115.950749-12-jsimmons@infradead.org> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20250130141115.950749-1-jsimmons@infradead.org> References: <20250130141115.950749-1-jsimmons@infradead.org> MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 11/25] lustre: nrs: change nrs policies at run time X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.39 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Feng Lei , Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Etienne AUJAMES This patch takes extra references on a policy to avoid stopping an NRS policy that still has pending/queued requests in it. It uses a new refcount_t "pol_start_ref" for this purpose, to keep track of policy usage in the started state. This makes it possible to safely stop a policy without holding "nrs_lock" and avoids sleeping under the spinlock.
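For illustration only (not part of the patch): a minimal user-space sketch of the "started reference" idea described above, using C11 atomics in place of the kernel's refcount_t. All demo_* names are hypothetical, and locking and the wait queue are deliberately left out. The policy holds one reference for being started, each queued request takes another, and whoever drops the last reference performs the actual stop.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

enum demo_state { DEMO_STOPPED, DEMO_STARTED, DEMO_STOPPING };

struct demo_policy {
	atomic_int	start_ref;	/* plays the role of pol_start_ref */
	enum demo_state	state;
};

/* Start: the policy itself holds the first "started" reference. */
static void demo_start(struct demo_policy *p)
{
	atomic_store(&p->start_ref, 1);
	p->state = DEMO_STARTED;
}

/* Each queued request pins the policy; the count must already be live. */
static void demo_started_get(struct demo_policy *p)
{
	assert(atomic_load(&p->start_ref) > 0);
	atomic_fetch_add(&p->start_ref, 1);
}

/* Dropping the last reference is what actually stops the policy. */
static void demo_started_put(struct demo_policy *p)
{
	if (atomic_fetch_sub(&p->start_ref, 1) == 1)
		p->state = DEMO_STOPPED;
}

/*
 * Stop request: mark the policy as stopping and drop its own reference.
 * Returns true only if no request was still holding the policy.
 */
static bool demo_stop(struct demo_policy *p)
{
	p->state = DEMO_STOPPING;
	demo_started_put(p);
	return p->state == DEMO_STOPPED;
}
```

In the sketch a stop request with one request still queued leaves the policy in the stopping state, and the final demo_started_put() from that request completes the stop.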
It adds a wait queue field "pol_wq" to "struct ptlrpc_nrs_policy" so that restarting a policy with a different argument can wait for all queued requests in the stopping policy to drain. WC-bug-id: https://jira.whamcloud.com/browse/LU-14976 Lustre-commit: c098c09564a125dd4 ("LU-14976 nrs: change nrs policies at run time") Signed-off-by: Etienne AUJAMES Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48523 Reviewed-by: Andreas Dilger Reviewed-by: Feng Lei Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/include/lustre_nrs.h | 12 ++- fs/lustre/ptlrpc/nrs.c | 166 ++++++++++++++++++++++----------- fs/lustre/ptlrpc/nrs_delay.c | 3 +- fs/lustre/ptlrpc/nrs_fifo.c | 2 +- 4 files changed, 128 insertions(+), 55 deletions(-) diff --git a/fs/lustre/include/lustre_nrs.h b/fs/lustre/include/lustre_nrs.h index 0e0dd73cadac..bca6b76aa699 100644 --- a/fs/lustre/include/lustre_nrs.h +++ b/fs/lustre/include/lustre_nrs.h @@ -100,10 +100,12 @@ struct ptlrpc_nrs_pol_ops { * initialize their resources here; this operation is optional. * * @policy: The policy being started + * @arg: A generic char buffer * * \see nrs_policy_start_locked() */ - int (*op_policy_start)(struct ptlrpc_nrs_policy *policy); + int (*op_policy_start)(struct ptlrpc_nrs_policy *policy, + char *arg); /** * Called when deactivating a policy via lprocfs; policies deallocate * their resources here; this operation is optional @@ -616,6 +618,10 @@ struct ptlrpc_nrs_policy { * Usage Reference count taken on the policy instance */ long pol_ref; + /** + * Usage Reference count taken for a started policy + */ + refcount_t pol_start_ref; /** * Human-readable policy argument */ @@ -632,6 +638,10 @@ struct ptlrpc_nrs_policy { * Policy descriptor for this policy instance. 
*/ struct ptlrpc_nrs_pol_desc *pol_desc; + /** + * Policy wait queue + */ + wait_queue_head_t pol_wq; }; /** diff --git a/fs/lustre/ptlrpc/nrs.c b/fs/lustre/ptlrpc/nrs.c index dd36d182d11e..661bba7a0f06 100644 --- a/fs/lustre/ptlrpc/nrs.c +++ b/fs/lustre/ptlrpc/nrs.c @@ -59,6 +59,7 @@ static int nrs_policy_init(struct ptlrpc_nrs_policy *policy) static void nrs_policy_fini(struct ptlrpc_nrs_policy *policy) { LASSERT(policy->pol_ref == 0); + LASSERT(refcount_read(&policy->pol_start_ref) == 0); LASSERT(policy->pol_req_queued == 0); if (policy->pol_desc->pd_ops->op_policy_fini) @@ -92,13 +93,35 @@ static void __nrs_policy_stop(struct ptlrpc_nrs_policy *policy) policy->pol_req_started == 0); policy->pol_private = NULL; + policy->pol_arg[0] = '\0'; policy->pol_state = NRS_POL_STATE_STOPPED; + wake_up(&policy->pol_wq); if (atomic_dec_and_test(&policy->pol_desc->pd_refs)) module_put(policy->pol_desc->pd_owner); } +/** + * Increases the policy's usage started reference count. + */ +static inline void nrs_policy_started_get(struct ptlrpc_nrs_policy *policy) +{ + refcount_inc(&policy->pol_start_ref); +} + +/** + * Decreases the policy's usage started reference count, and stops the policy + * in case it was already stopping and have no more outstanding usage + * references (which indicates it has no more queued or started requests, and + * can be safely stopped). 
+ */ +static void nrs_policy_started_put(struct ptlrpc_nrs_policy *policy) +{ + if (refcount_dec_and_test(&policy->pol_start_ref)) + __nrs_policy_stop(policy); +} + static int nrs_policy_stop_locked(struct ptlrpc_nrs_policy *policy) { struct ptlrpc_nrs *nrs = policy->pol_nrs; @@ -123,9 +146,18 @@ static int nrs_policy_stop_locked(struct ptlrpc_nrs_policy *policy) nrs->nrs_policy_fallback = NULL; } - /* I have the only refcount */ - if (policy->pol_ref == 1) - __nrs_policy_stop(policy); + /* Drop started ref and wait for requests to be drained */ + spin_unlock(&nrs->nrs_lock); + nrs_policy_started_put(policy); + + wait_event_timeout(policy->pol_wq, + policy->pol_state == NRS_POL_STATE_STOPPED, + 30 * HZ); + + spin_lock(&nrs->nrs_lock); + + if (policy->pol_state != NRS_POL_STATE_STOPPED) + return -EBUSY; return 0; } @@ -149,8 +181,10 @@ static void nrs_policy_stop_primary(struct ptlrpc_nrs *nrs) LASSERT(tmp->pol_state == NRS_POL_STATE_STARTED); tmp->pol_state = NRS_POL_STATE_STOPPING; - if (tmp->pol_ref == 0) - __nrs_policy_stop(tmp); + /* Drop started ref to free the policy */ + spin_unlock(&nrs->nrs_lock); + nrs_policy_started_put(tmp); + spin_lock(&nrs->nrs_lock); } /** @@ -172,7 +206,7 @@ static void nrs_policy_stop_primary(struct ptlrpc_nrs *nrs) * references on the policy to ptlrpc_nrs_pol_stae::NRS_POL_STATE_STOPPED. In * this case, the fallback policy is only left active in the NRS head. 
*/ -static int nrs_policy_start_locked(struct ptlrpc_nrs_policy *policy) +static int nrs_policy_start_locked(struct ptlrpc_nrs_policy *policy, char *arg) { struct ptlrpc_nrs *nrs = policy->pol_nrs; int rc = 0; @@ -189,6 +223,11 @@ static int nrs_policy_start_locked(struct ptlrpc_nrs_policy *policy) if (policy->pol_state == NRS_POL_STATE_STOPPING) return -EAGAIN; + if (arg && strlen(arg) >= sizeof(policy->pol_arg)) { + CWARN("NRS: arg '%s' is too long\n", arg); + return -EINVAL; + } + if (policy->pol_flags & PTLRPC_NRS_FL_FALLBACK) { /** * This is for cases in which the user sets the policy to the @@ -215,8 +254,20 @@ static int nrs_policy_start_locked(struct ptlrpc_nrs_policy *policy) if (!nrs->nrs_policy_fallback) return -EPERM; - if (policy->pol_state == NRS_POL_STATE_STARTED) - return 0; + if (policy->pol_state == NRS_POL_STATE_STARTED) { + /** + * If the policy argument now is different from the last time, + * stop the policy first and start it again with the new + * argument. + */ + if ((!arg && strlen(policy->pol_arg) == 0) || + (arg && strcmp(policy->pol_arg, arg) == 0)) + return 0; + + rc = nrs_policy_stop_locked(policy); + if (rc) + return rc; + } } /** @@ -241,7 +292,7 @@ static int nrs_policy_start_locked(struct ptlrpc_nrs_policy *policy) if (policy->pol_desc->pd_ops->op_policy_start) { spin_unlock(&nrs->nrs_lock); - rc = policy->pol_desc->pd_ops->op_policy_start(policy); + rc = policy->pol_desc->pd_ops->op_policy_start(policy, arg); spin_lock(&nrs->nrs_lock); if (rc != 0) { @@ -253,6 +304,11 @@ static int nrs_policy_start_locked(struct ptlrpc_nrs_policy *policy) } } + if (arg) + strlcpy(policy->pol_arg, arg, sizeof(policy->pol_arg)); + + /* take the started reference */ + refcount_set(&policy->pol_start_ref, 1); policy->pol_state = NRS_POL_STATE_STARTED; if (policy->pol_flags & PTLRPC_NRS_FL_FALLBACK) { @@ -279,34 +335,23 @@ static int nrs_policy_start_locked(struct ptlrpc_nrs_policy *policy) } /** - * Increases the policy's usage reference count. 
+ * Increases the policy's usage reference count (caller count). */ static inline void nrs_policy_get_locked(struct ptlrpc_nrs_policy *policy) +__must_hold(&policy->pol_nrs->nrs_lock) { policy->pol_ref++; } /** - * Decreases the policy's usage reference count, and stops the policy in case it - * was already stopping and have no more outstanding usage references (which - * indicates it has no more queued or started requests, and can be safely - * stopped). + * Decreases the policy's usage reference count. */ static void nrs_policy_put_locked(struct ptlrpc_nrs_policy *policy) +__must_hold(&policy->pol_nrs->nrs_lock) { LASSERT(policy->pol_ref > 0); policy->pol_ref--; - if (unlikely(policy->pol_ref == 0 && - policy->pol_state == NRS_POL_STATE_STOPPING)) - __nrs_policy_stop(policy); -} - -static void nrs_policy_put(struct ptlrpc_nrs_policy *policy) -{ - spin_lock(&policy->pol_nrs->nrs_lock); - nrs_policy_put_locked(policy); - spin_unlock(&policy->pol_nrs->nrs_lock); } /** @@ -428,11 +473,11 @@ static void nrs_resource_get_safe(struct ptlrpc_nrs *nrs, spin_lock(&nrs->nrs_lock); fallback = nrs->nrs_policy_fallback; - nrs_policy_get_locked(fallback); + nrs_policy_started_get(fallback); primary = nrs->nrs_policy_primary; if (primary) - nrs_policy_get_locked(primary); + nrs_policy_started_get(primary); spin_unlock(&nrs->nrs_lock); @@ -452,7 +497,7 @@ static void nrs_resource_get_safe(struct ptlrpc_nrs *nrs, * request. 
*/ if (!resp[NRS_RES_PRIMARY]) - nrs_policy_put(primary); + nrs_policy_started_put(primary); } } @@ -481,8 +526,10 @@ static void nrs_resource_put_safe(struct ptlrpc_nrs_resource **resp) } for (i = 0; i < NRS_RES_MAX; i++) { - if (pols[i]) - nrs_policy_put(pols[i]); + if (!pols[i]) + continue; + + nrs_policy_started_put(pols[i]); } } @@ -510,6 +557,10 @@ struct ptlrpc_nrs_request *nrs_request_get(struct ptlrpc_nrs_policy *policy, LASSERT(policy->pol_req_queued > 0); + /* for a non-started policy, use force mode to drain requests */ + if (unlikely(policy->pol_state != NRS_POL_STATE_STARTED)) + force = true; + nrq = policy->pol_desc->pd_ops->op_req_get(policy, peek, force); LASSERT(ergo(nrq, nrs_request_policy(nrq) == policy)); @@ -548,6 +599,11 @@ static inline void nrs_request_enqueue(struct ptlrpc_nrs_request *nrq) if (rc == 0) { policy->pol_nrs->nrs_req_queued++; policy->pol_req_queued++; + /** + * Take an extra ref to avoid stopping policy with + * pending request in it + */ + nrs_policy_started_get(policy); return; } } @@ -632,7 +688,7 @@ static int nrs_policy_ctl(struct ptlrpc_nrs *nrs, char *name, * Start \e policy */ case PTLRPC_NRS_CTL_START: - rc = nrs_policy_start_locked(policy); + rc = nrs_policy_start_locked(policy, arg); break; } out: @@ -657,47 +713,50 @@ static int nrs_policy_ctl(struct ptlrpc_nrs *nrs, char *name, static int nrs_policy_unregister(struct ptlrpc_nrs *nrs, char *name) { struct ptlrpc_nrs_policy *policy = NULL; + int rc = 0; spin_lock(&nrs->nrs_lock); policy = nrs_policy_find_locked(nrs, name); if (!policy) { - spin_unlock(&nrs->nrs_lock); - - CERROR("Can't find NRS policy %s\n", name); - return -ENOENT; + rc = -ENOENT; + CERROR("NRS: cannot find policy '%s': rc = %d\n", name, rc); + goto out_unlock; } if (policy->pol_ref > 1) { - CERROR("Policy %s is busy with %d references\n", name, - (int)policy->pol_ref); - nrs_policy_put_locked(policy); - - spin_unlock(&nrs->nrs_lock); - return -EBUSY; + rc = -EBUSY; + CERROR("NRS: policy '%s' is 
busy with %ld references: rc = %d\n", name, policy->pol_ref, rc); + goto out_put; } LASSERT(policy->pol_req_queued == 0); LASSERT(policy->pol_req_started == 0); if (policy->pol_state != NRS_POL_STATE_STOPPED) { - nrs_policy_stop_locked(policy); - LASSERT(policy->pol_state == NRS_POL_STATE_STOPPED); + rc = nrs_policy_stop_locked(policy); + if (rc) { + CERROR("NRS: failed to stop policy '%s' with refcount %d: rc = %d\n", + name, refcount_read(&policy->pol_start_ref), rc); + goto out_put; + } } + LASSERT(!policy->pol_private); list_del(&policy->pol_list); nrs->nrs_num_pols--; - +out_put: nrs_policy_put_locked(policy); - +out_unlock: spin_unlock(&nrs->nrs_lock); - nrs_policy_fini(policy); - - LASSERT(!policy->pol_private); - kfree(policy); + if (rc == 0) { + nrs_policy_fini(policy); + kfree(policy); + } - return 0; + return rc; } /** @@ -738,6 +797,8 @@ static int nrs_policy_register(struct ptlrpc_nrs *nrs, INIT_LIST_HEAD(&policy->pol_list); INIT_LIST_HEAD(&policy->pol_list_queued); + init_waitqueue_head(&policy->pol_wq); + rc = nrs_policy_init(policy); if (rc != 0) { kfree(policy); @@ -764,7 +825,7 @@ static int nrs_policy_register(struct ptlrpc_nrs *nrs, nrs->nrs_num_pols++; if (policy->pol_flags & PTLRPC_NRS_FL_REG_START) - rc = nrs_policy_start_locked(policy); + rc = nrs_policy_start_locked(policy, NULL); spin_unlock(&nrs->nrs_lock); @@ -1425,6 +1486,9 @@ static void nrs_request_removed(struct ptlrpc_nrs_policy *policy) list_move_tail(&policy->pol_list_queued, &policy->pol_nrs->nrs_policy_queued); } + + /* remove the extra ref for policy pending requests */ + nrs_policy_started_put(policy); } /** @@ -1613,5 +1677,3 @@ void ptlrpc_nrs_fini(void) kfree(desc); } } - -/** @} nrs */ diff --git a/fs/lustre/ptlrpc/nrs_delay.c b/fs/lustre/ptlrpc/nrs_delay.c index b249749d010a..0ca6ad1481b2 100644 --- a/fs/lustre/ptlrpc/nrs_delay.c +++ b/fs/lustre/ptlrpc/nrs_delay.c @@ -102,6 +102,7 @@ static struct binheap_ops nrs_delay_heap_ops = { * the delay-specific private data 
structure. * * @policy The policy to start + * @arg Generic char buffer; unused in this policy * * Return: -ENOMEM OOM error * 0 success @@ -109,7 +110,7 @@ static struct binheap_ops nrs_delay_heap_ops = { * \see nrs_policy_register() * \see nrs_policy_ctl() */ -static int nrs_delay_start(struct ptlrpc_nrs_policy *policy) +static int nrs_delay_start(struct ptlrpc_nrs_policy *policy, char *arg) { struct nrs_delay_data *delay_data; diff --git a/fs/lustre/ptlrpc/nrs_fifo.c b/fs/lustre/ptlrpc/nrs_fifo.c index 1689616e3949..227fe5ed11e5 100644 --- a/fs/lustre/ptlrpc/nrs_fifo.c +++ b/fs/lustre/ptlrpc/nrs_fifo.c @@ -74,7 +74,7 @@ * \see nrs_policy_register() * \see nrs_policy_ctl() */ -static int nrs_fifo_start(struct ptlrpc_nrs_policy *policy) +static int nrs_fifo_start(struct ptlrpc_nrs_policy *policy, char *arg) { struct nrs_fifo_head *head; From patchwork Thu Jan 30 14:11:02 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 13954659 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from pdx1-mailman-customer002.dreamhost.com (listserver-buz.dreamhost.com [69.163.136.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id B849FC0218A for ; Thu, 30 Jan 2025 14:34:54 +0000 (UTC) Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1]) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4YkLfZ1fPQz21cn; Thu, 30 Jan 2025 06:16:06 -0800 (PST) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 
4YkLZK5GSDz1xpQ for ; Thu, 30 Jan 2025 06:12:25 -0800 (PST) Received: from star2.ccs.ornl.gov (ltm3-e204-208.ccs.ornl.gov [160.91.203.26]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 0E99218236B; Thu, 30 Jan 2025 09:11:33 -0500 (EST) Received: by star2.ccs.ornl.gov (Postfix, from userid 2004) id 0C4C3106BE18; Thu, 30 Jan 2025 09:11:33 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 30 Jan 2025 09:11:02 -0500 Message-ID: <20250130141115.950749-13-jsimmons@infradead.org> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20250130141115.950749-1-jsimmons@infradead.org> References: <20250130141115.950749-1-jsimmons@infradead.org> MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 12/25] lnet: move libcfs_nidstr to UAPI headers X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.39 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Serguei Smirnov , Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Mr NeilBrown User-space now has functions to convert between strings and large nids that mirror the kernel. Move those function prototypes to UAPI since this is the case. 
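For illustration only (not part of the patch): a rough user-space sketch of the prototype shape being moved, namely a reentrant "_r" formatter that writes into a caller-supplied buffer, a convenience wrapper that borrows a buffer from a small rotating pool, and a parser for the reverse direction. Every demo_* name, the simplified struct and the "a.b.c.d@tcpN" format are assumptions made for the example; they are not the libcfs implementation and do not cover large NIDs.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define DEMO_NIDSTR_SIZE	64
#define DEMO_NIDSTR_COUNT	4

/* Hypothetical, much simplified NID: four address bytes and a net number. */
struct demo_nid {
	unsigned char	addr[4];
	unsigned int	net;
};

/* Rotating pool of static buffers for the convenience wrapper.
 * Not thread-safe; the sketch ignores concurrency.
 */
static char *demo_next_nidstring(void)
{
	static char bufs[DEMO_NIDSTR_COUNT][DEMO_NIDSTR_SIZE];
	static unsigned int idx;

	return bufs[idx++ % DEMO_NIDSTR_COUNT];
}

/* Reentrant form: the caller owns the buffer; output is truncated to fit. */
static char *demo_nidstr_r(const struct demo_nid *nid,
			   char *buf, size_t buf_size)
{
	assert(buf && buf_size > 0);
	snprintf(buf, buf_size, "%u.%u.%u.%u@tcp%u",
		 (unsigned int)nid->addr[0], (unsigned int)nid->addr[1],
		 (unsigned int)nid->addr[2], (unsigned int)nid->addr[3],
		 nid->net);
	return buf;
}

/* Convenience form: formats into the next buffer of the pool. */
static inline char *demo_nidstr(const struct demo_nid *nid)
{
	return demo_nidstr_r(nid, demo_next_nidstring(), DEMO_NIDSTR_SIZE);
}

/* Reverse direction: returns 0 on success, -1 on a malformed string. */
static int demo_strnid(struct demo_nid *nid, const char *str)
{
	unsigned int a, b, c, d, net;

	memset(nid, 0, sizeof(*nid));
	if (sscanf(str, "%u.%u.%u.%u@tcp%u", &a, &b, &c, &d, &net) != 5)
		return -1;
	if (a > 255 || b > 255 || c > 255 || d > 255)
		return -1;

	nid->addr[0] = (unsigned char)a;
	nid->addr[1] = (unsigned char)b;
	nid->addr[2] = (unsigned char)c;
	nid->addr[3] = (unsigned char)d;
	nid->net = net;
	return 0;
}
```

The split between the "_r" form and the wrapper is what lets the wrapper be used directly as a printf-style argument while code that needs to keep the string can pass its own buffer.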
Lustre-commit: b6c702df5d4de8e5a ("LU-10391 libcfs: add large-nid string conversion functions.") Signed-off-by: Mr NeilBrown Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50092 Reviewed-by: James Simmons Reviewed-by: Serguei Smirnov Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- include/linux/lnet/lib-types.h | 11 ----------- include/uapi/linux/lnet/nidstr.h | 11 +++++++++++ 2 files changed, 11 insertions(+), 11 deletions(-) diff --git a/include/linux/lnet/lib-types.h b/include/linux/lnet/lib-types.h index 1ae4530d1bc6..bc9f0020f93a 100644 --- a/include/linux/lnet/lib-types.h +++ b/include/linux/lnet/lib-types.h @@ -51,17 +51,6 @@ #include #include -char *libcfs_nidstr_r(const struct lnet_nid *nid, - char *buf, size_t buf_size); - -static inline char *libcfs_nidstr(const struct lnet_nid *nid) -{ - return libcfs_nidstr_r(nid, libcfs_next_nidstring(), - LNET_NIDSTR_SIZE); -} - -int libcfs_strnid(struct lnet_nid *nid, const char *str); -char *libcfs_idstr(struct lnet_processid *id); int libcfs_strid(struct lnet_processid *id, const char *str); int cfs_match_nid_net(struct lnet_nid *nid, u32 net, diff --git a/include/uapi/linux/lnet/nidstr.h b/include/uapi/linux/lnet/nidstr.h index 1ccb8fa826b4..87244ce7c698 100644 --- a/include/uapi/linux/lnet/nidstr.h +++ b/include/uapi/linux/lnet/nidstr.h @@ -91,6 +91,17 @@ static inline char *libcfs_nid2str(lnet_nid_t nid) LNET_NIDSTR_SIZE); } +char *libcfs_nidstr_r(const struct lnet_nid *nid, + char *buf, __kernel_size_t buf_size); + +static inline char *libcfs_nidstr(const struct lnet_nid *nid) +{ + return libcfs_nidstr_r(nid, libcfs_next_nidstring(), + LNET_NIDSTR_SIZE); +} + +int libcfs_strnid(struct lnet_nid *nid, const char *str); +char *libcfs_idstr(struct lnet_processid *id); __u32 libcfs_str2net(const char *str); lnet_nid_t libcfs_str2nid(const char *str); int libcfs_str2anynid(lnet_nid_t *nid, const char *str); From patchwork Thu Jan 30 14:11:03 2025 Content-Type: text/plain; charset="utf-8" 
MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 13954652 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from pdx1-mailman-customer002.dreamhost.com (listserver-buz.dreamhost.com [69.163.136.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 194F3C0218F for ; Thu, 30 Jan 2025 14:24:44 +0000 (UTC) Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1]) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4YkLcg1lxrz212R; Thu, 30 Jan 2025 06:14:27 -0800 (PST) Received: from smtp3.ccs.ornl.gov (smtp3.ccs.ornl.gov [160.91.203.39]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4YkLYl36Y4z1xf0 for ; Thu, 30 Jan 2025 06:11:55 -0800 (PST) Received: from star2.ccs.ornl.gov (ltm-e204-208.ccs.ornl.gov [160.91.203.12]) by smtp3.ccs.ornl.gov (Postfix) with ESMTP id 143BE899AD7; Thu, 30 Jan 2025 09:11:33 -0500 (EST) Received: by star2.ccs.ornl.gov (Postfix, from userid 2004) id 1090A106BE14; Thu, 30 Jan 2025 09:11:33 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 30 Jan 2025 09:11:03 -0500 Message-ID: <20250130141115.950749-14-jsimmons@infradead.org> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20250130141115.950749-1-jsimmons@infradead.org> References: <20250130141115.950749-1-jsimmons@infradead.org> MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 13/25] lustre: obdclass: convert class_parse_nid4 to class_parse_nid X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.39 Precedence: list List-Id: "For discussing Lustre software development." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Mr NeilBrown All callers of class_parse_nid4() now use class_parse_nid() and so must handle a large nid. do_lcfg_nid() is introduced to help with this. WC-bug-id: https://jira.whamcloud.com/browse/LU-13340 Lustre-commit: 6c3b50434b321cc16 ("LU-13340 mgs: convert class_parse_nid4 to class_parse_nid") Signed-off-by: Mr NeilBrown Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50094 Reviewed-by: James Simmons Reviewed-by: Alex Zhuravlev Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/obdclass/obd_mount.c | 34 +++++++++++++++++++++------------- 1 file changed, 21 insertions(+), 13 deletions(-) diff --git a/fs/lustre/obdclass/obd_mount.c b/fs/lustre/obdclass/obd_mount.c index 6eaa214c813f..c569592bd6e7 100644 --- a/fs/lustre/obdclass/obd_mount.c +++ b/fs/lustre/obdclass/obd_mount.c @@ -179,6 +179,15 @@ static int do_lcfg(char *cfgname, lnet_nid_t nid, int cmd, return rc; } +static int do_lcfg_nid(char *cfgname, struct lnet_nid *nid, int cmd, + char *s1) +{ + if (nid_is_nid4(nid)) + return do_lcfg(cfgname, lnet_nid_to_nid4(nid), cmd, s1, + NULL, NULL, NULL); + return -EINVAL; +} + /** Call class_attach and class_setup. These methods in turn call * obd type-specific methods.
*/ @@ -218,7 +227,7 @@ int lustre_start_mgc(struct super_block *sb) struct obd_export *exp; struct obd_uuid *uuid = NULL; uuid_t uuidc; - lnet_nid_t nid; + struct lnet_nid nid; char nidstr[LNET_NIDSTR_SIZE]; char *mgcname = NULL, *niduuid = NULL, *mgssec = NULL; char *ptr; @@ -228,7 +237,7 @@ int lustre_start_mgc(struct super_block *sb) /* Use nids from mount line: uml1,1@elan:uml2,2@elan:/lustre */ ptr = lsi->lsi_lmd->lmd_dev; - if (class_parse_nid4(ptr, &nid, &ptr) == 0) + if (class_parse_nid(ptr, &nid, &ptr) == 0) i++; if (i == 0) { CERROR("No valid MGS nids found.\n"); @@ -237,7 +246,7 @@ int lustre_start_mgc(struct super_block *sb) mutex_lock(&mgc_start_lock); - libcfs_nid2str_r(nid, nidstr, sizeof(nidstr)); + libcfs_nidstr_r(&nid, nidstr, sizeof(nidstr)); mgcname = kasprintf(GFP_NOFS, "%s%s", LUSTRE_MGC_OBDNAME, nidstr); niduuid = kasprintf(GFP_NOFS, "%s_%x", mgcname, 0); @@ -314,10 +323,9 @@ int lustre_start_mgc(struct super_block *sb) i = 0; /* Use nids from mount line: uml1,1@elan:uml2,2@elan:/lustre */ ptr = lsi->lsi_lmd->lmd_dev; - while (class_parse_nid4(ptr, &nid, &ptr) == 0) { - rc = do_lcfg(mgcname, nid, - LCFG_ADD_UUID, niduuid, NULL, NULL, NULL); - if (!rc) + while (class_parse_nid(ptr, &nid, &ptr) == 0) { + rc = do_lcfg_nid(mgcname, &nid, LCFG_ADD_UUID, niduuid); + if (rc == 0) i++; /* Stop at the first failover nid */ if (*ptr == ':') @@ -354,10 +362,10 @@ int lustre_start_mgc(struct super_block *sb) /* New failover node */ sprintf(niduuid, "%s_%x", mgcname, i); j = 0; - while (class_parse_nid4_quiet(ptr, &nid, &ptr) == 0) { - rc = do_lcfg(mgcname, nid, LCFG_ADD_UUID, niduuid, - NULL, NULL, NULL); - if (!rc) + while (class_parse_nid_quiet(ptr, &nid, &ptr) == 0) { + rc = do_lcfg_nid(mgcname, &nid, LCFG_ADD_UUID, + niduuid); + if (rc == 0) ++j; if (*ptr == ':') break; @@ -863,14 +871,14 @@ static int lmd_parse_string(char **handle, char *ptr) /* Collect multiple values for mgsnid specifiers */ static int lmd_parse_mgs(struct lustre_mount_data *lmd, 
char **ptr) { - lnet_nid_t nid; + struct lnet_nid nid; char *tail = *ptr; char *mgsnid; int length; int oldlen = 0; /* Find end of nidlist */ - while (class_parse_nid4_quiet(tail, &nid, &tail) == 0) + while (class_parse_nid_quiet(tail, &nid, &tail) == 0) ; length = tail - *ptr; if (length == 0) { From patchwork Thu Jan 30 14:11:04 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 13954653 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from pdx1-mailman-customer002.dreamhost.com (listserver-buz.dreamhost.com [69.163.136.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 3A6E6C0218A for ; Thu, 30 Jan 2025 14:25:46 +0000 (UTC) Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1]) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4YkLcm5qlZz213R; Thu, 30 Jan 2025 06:14:32 -0800 (PST) Received: from smtp3.ccs.ornl.gov (smtp3.ccs.ornl.gov [160.91.203.39]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4YkLYv4tbWz1xh7 for ; Thu, 30 Jan 2025 06:12:03 -0800 (PST) Received: from star2.ccs.ornl.gov (ltm3-e204-208.ccs.ornl.gov [160.91.203.26]) by smtp3.ccs.ornl.gov (Postfix) with ESMTP id 17AA8899AD8; Thu, 30 Jan 2025 09:11:33 -0500 (EST) Received: by star2.ccs.ornl.gov (Postfix, from userid 2004) id 14BA7106BE16; Thu, 30 Jan 2025 09:11:33 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 30 Jan 2025 09:11:04 -0500 Message-ID: <20250130141115.950749-15-jsimmons@infradead.org> X-Mailer: git-send-email 2.43.5 In-Reply-To: 
<20250130141115.950749-1-jsimmons@infradead.org> References: <20250130141115.950749-1-jsimmons@infradead.org> MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 14/25] lustre: obdclass: remove class_parse_nid4() X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.39 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Mr NeilBrown class_parse_nid4() not used any more. WC-bug-id: https://jira.whamcloud.com/browse/LU-10391 Lustre-commit: 89aac3d3b39f68982 ("LU-10391 obdclass: remove class_parse_nid4()") Signed-off-by: Mr NeilBrown Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50095 Reviewed-by: James Simmons Reviewed-by: Alex Zhuravlev Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/include/obd_class.h | 4 ---- fs/lustre/obdclass/obd_config.c | 31 +------------------------------ 2 files changed, 1 insertion(+), 34 deletions(-) diff --git a/fs/lustre/include/obd_class.h b/fs/lustre/include/obd_class.h index e91335e8cd70..06dfcf9bbd9e 100644 --- a/fs/lustre/include/obd_class.h +++ b/fs/lustre/include/obd_class.h @@ -144,12 +144,8 @@ int class_find_param(char *buf, char *key, char **valp); struct cfg_interop_param *class_find_old_param(const char *param, struct cfg_interop_param *ptr); int class_get_next_param(char **params, char *copy); -int class_parse_nid4(char *buf, lnet_nid_t *nid4, char **endh); -int class_parse_nid4_quiet(char *buf, lnet_nid_t *nid4, char **endh); int class_parse_nid(char *buf, struct lnet_nid *nid, char **endh); int class_parse_nid_quiet(char *buf, struct lnet_nid *nid, char **endh); -int class_parse_net(char *buf, u32 *net, char **endh); -int class_match_net(char *buf, char *key, u32 net); struct obd_device *class_incref(struct obd_device *obd, const char *scope, const void *source); diff --git 
a/fs/lustre/obdclass/obd_config.c b/fs/lustre/obdclass/obd_config.c index 689e8f54084d..3fe1cb6e5341 100644 --- a/fs/lustre/obdclass/obd_config.c +++ b/fs/lustre/obdclass/obd_config.c @@ -152,19 +152,6 @@ static int class_match_param(char *buf, const char *key, char **valp) return 0; } -static int parse_nid4(char *buf, void *value, int quiet) -{ - lnet_nid_t *nid4 = value; - - *nid4 = libcfs_str2nid(buf); - if (*nid4 != LNET_NID_ANY) - return 0; - - if (!quiet) - LCONSOLE_ERROR_MSG(0x159, "Can't parse NID '%s'\n", buf); - return -EINVAL; -} - static int parse_nid(char *buf, void *value, int quiet) { struct lnet_nid *nid = value; @@ -187,8 +174,7 @@ static int parse_net(char *buf, void *value) } enum { - CLASS_PARSE_NID4 = 1, - CLASS_PARSE_NID, + CLASS_PARSE_NID = 1, CLASS_PARSE_NET, }; @@ -221,9 +207,6 @@ static int class_parse_value(char *buf, int opc, void *value, char **endh, switch (opc) { default: LBUG(); - case CLASS_PARSE_NID4: - rc = parse_nid4(buf, value, quiet); - break; case CLASS_PARSE_NID: rc = parse_nid(buf, value, quiet); break; @@ -239,18 +222,6 @@ static int class_parse_value(char *buf, int opc, void *value, char **endh, return 0; } -int class_parse_nid4(char *buf, lnet_nid_t *nid4, char **endh) -{ - return class_parse_value(buf, CLASS_PARSE_NID4, (void *)nid4, endh, 0); -} -EXPORT_SYMBOL(class_parse_nid4); - -int class_parse_nid4_quiet(char *buf, lnet_nid_t *nid4, char **endh) -{ - return class_parse_value(buf, CLASS_PARSE_NID4, (void *)nid4, endh, 1); -} -EXPORT_SYMBOL(class_parse_nid4_quiet); - int class_parse_nid(char *buf, struct lnet_nid *nid, char **endh) { return class_parse_value(buf, CLASS_PARSE_NID, (void *)nid, endh, 0); From patchwork Thu Jan 30 14:11:05 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 13954683 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: 
from pdx1-mailman-customer002.dreamhost.com (listserver-buz.dreamhost.com [69.163.136.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id D848DC0218F for ; Thu, 30 Jan 2025 14:48:25 +0000 (UTC) Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1]) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4YkLkf2KSBz22Lk; Thu, 30 Jan 2025 06:19:38 -0800 (PST) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4YkLZy3YTlz1y22 for ; Thu, 30 Jan 2025 06:12:58 -0800 (PST) Received: from star2.ccs.ornl.gov (ltm3-e204-208.ccs.ornl.gov [160.91.203.26]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 1E08318236C; Thu, 30 Jan 2025 09:11:33 -0500 (EST) Received: by star2.ccs.ornl.gov (Postfix, from userid 2004) id 1A3AF106BE17; Thu, 30 Jan 2025 09:11:33 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 30 Jan 2025 09:11:05 -0500 Message-ID: <20250130141115.950749-16-jsimmons@infradead.org> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20250130141115.950749-1-jsimmons@infradead.org> References: <20250130141115.950749-1-jsimmons@infradead.org> MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 15/25] lustre: statahead: add stats for batch RPC requests X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.39 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Mikhail Pershin , Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Qian Yingjin This patch adds stats for batch PtlRPC request. 
It can show the statistical information such as how many subreqs in a batch RPC. WC-bug-id: https://jira.whamcloud.com/browse/LU-14139 Lustre-commit: a20f25d24b5f0ce7b ("LU-14139 statahead: add stats for batch RPC requests") Signed-off-by: Qian Yingjin Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/40943 Reviewed-by: Andreas Dilger Reviewed-by: Mikhail Pershin Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/include/obd.h | 3 +++ fs/lustre/ldlm/ldlm_lib.c | 1 + fs/lustre/mdc/lproc_mdc.c | 44 +++++++++++++++++++++++++++++++++++++++ fs/lustre/ptlrpc/batch.c | 7 ++++++- 4 files changed, 54 insertions(+), 1 deletion(-) diff --git a/fs/lustre/include/obd.h b/fs/lustre/include/obd.h index 4d65775ab4b1..174372001b23 100644 --- a/fs/lustre/include/obd.h +++ b/fs/lustre/include/obd.h @@ -277,6 +277,9 @@ struct client_obd { struct obd_histogram cl_write_page_hist; struct obd_histogram cl_read_offset_hist; struct obd_histogram cl_write_offset_hist; + ktime_t cl_batch_stats_init; + struct obd_histogram cl_batch_rpc_hist; + /* LRU for osc caching pages */ struct cl_client_cache *cl_cache; diff --git a/fs/lustre/ldlm/ldlm_lib.c b/fs/lustre/ldlm/ldlm_lib.c index 4f9cf5f5c388..6a03ca9e2c06 100644 --- a/fs/lustre/ldlm/ldlm_lib.c +++ b/fs/lustre/ldlm/ldlm_lib.c @@ -367,6 +367,7 @@ int client_obd_setup(struct obd_device *obd, struct lustre_cfg *lcfg) spin_lock_init(&cli->cl_write_page_hist.oh_lock); spin_lock_init(&cli->cl_read_offset_hist.oh_lock); spin_lock_init(&cli->cl_write_offset_hist.oh_lock); + spin_lock_init(&cli->cl_batch_rpc_hist.oh_lock); /* lru for osc. 
*/ INIT_LIST_HEAD(&cli->cl_lru_osc); diff --git a/fs/lustre/mdc/lproc_mdc.c b/fs/lustre/mdc/lproc_mdc.c index fa799c525f46..d8dded8ed8a6 100644 --- a/fs/lustre/mdc/lproc_mdc.c +++ b/fs/lustre/mdc/lproc_mdc.c @@ -509,6 +509,48 @@ static int mdc_rpc_stats_seq_show(struct seq_file *seq, void *v) } LDEBUGFS_SEQ_FOPS(mdc_rpc_stats); +static ssize_t mdc_batch_stats_seq_write(struct file *file, + const char __user *buf, + size_t len, loff_t *off) +{ + struct seq_file *seq = file->private_data; + struct obd_device *obd = seq->private; + struct client_obd *cli = &obd->u.cli; + + lprocfs_oh_clear(&cli->cl_batch_rpc_hist); + cli->cl_batch_stats_init = ktime_get_real(); + + return len; +} + +static int mdc_batch_stats_seq_show(struct seq_file *seq, void *v) +{ + struct obd_device *obd = seq->private; + struct client_obd *cli = &obd->u.cli; + unsigned long tot; + unsigned long cum; + int i; + + lprocfs_stats_header(seq, ktime_get_real(), cli->cl_batch_stats_init, + 25, ":", true, ""); + seq_puts(seq, "subreqs per batch batches %% cum %%\n"); + tot = lprocfs_oh_sum(&cli->cl_batch_rpc_hist); + cum = 0; + + for (i = 0; i < OBD_HIST_MAX; i++) { + unsigned long cnt = cli->cl_batch_rpc_hist.oh_buckets[i]; + + cum += cnt; + seq_printf(seq, "%d:\t\t%10lu %3u %3u\n", + 1 << i, cnt, pct(cnt, tot), pct(cum, tot)); + if (cum == tot) + break; + } + + return 0; +} +LDEBUGFS_SEQ_FOPS(mdc_batch_stats); + static int mdc_stats_seq_show(struct seq_file *seq, void *v) { struct obd_device *obd = seq->private; @@ -624,6 +666,8 @@ static struct ldebugfs_vars lprocfs_mdc_obd_vars[] = { .fops = &mdc_pinger_recov_fops }, { .name = "rpc_stats", .fops = &mdc_rpc_stats_fops }, + { .name = "batch_stats", + .fops = &mdc_batch_stats_fops }, { .name = "unstable_stats", .fops = &mdc_unstable_stats_fops }, { .name = "mdc_stats", diff --git a/fs/lustre/ptlrpc/batch.c b/fs/lustre/ptlrpc/batch.c index 76eb4cf0ac4f..75a6bc21b869 100644 --- a/fs/lustre/ptlrpc/batch.c +++ b/fs/lustre/ptlrpc/batch.c @@ -39,6 +39,7 @@ 
#include #include #include +#include #define OUT_UPDATE_REPLY_SIZE 4096 @@ -403,14 +404,16 @@ static int batch_update_interpret(const struct lu_env *env, static int batch_send_update_req(const struct lu_env *env, struct batch_update_head *head) { - struct lu_batch *bh; + struct obd_device *obd; struct ptlrpc_request *req = NULL; struct batch_update_args *aa; + struct lu_batch *bh; int rc; if (!head) return 0; + obd = class_exp2obd(head->buh_exp); bh = head->buh_batch; rc = batch_prep_update_req(head, &req); if (rc) { @@ -447,6 +450,8 @@ static int batch_send_update_req(const struct lu_env *env, if (req) ptlrpc_req_finished(req); + lprocfs_oh_tally_log2(&obd->u.cli.cl_batch_rpc_hist, + head->buh_update_count); return rc; } From patchwork Thu Jan 30 14:11:06 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 13954685 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from pdx1-mailman-customer002.dreamhost.com (listserver-buz.dreamhost.com [69.163.136.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id CB402C0218F for ; Thu, 30 Jan 2025 14:49:23 +0000 (UTC) Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1]) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4YkLlt0fPyz22Sn; Thu, 30 Jan 2025 06:20:42 -0800 (PST) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4YkLbM6ps7z1y8x for ; Thu, 30 Jan 2025 06:13:19 -0800 (PST) Received: from star2.ccs.ornl.gov (ltm3-e204-208.ccs.ornl.gov 
[160.91.203.26]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 20B9718236D; Thu, 30 Jan 2025 09:11:33 -0500 (EST) Received: by star2.ccs.ornl.gov (Postfix, from userid 2004) id 1E6E8106BE18; Thu, 30 Jan 2025 09:11:33 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 30 Jan 2025 09:11:06 -0500 Message-ID: <20250130141115.950749-17-jsimmons@infradead.org> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20250130141115.950749-1-jsimmons@infradead.org> References: <20250130141115.950749-1-jsimmons@infradead.org> MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 16/25] lnet: Check empty list in cfs_match_nid_net X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.39 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Horn , Frank Sehr , Serguei Smirnov , Cyril Bordage , Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Chris Horn cfs_match_nid_net() needs to check whether the list of range expressions describing the address is empty. Otherwise, for numeric based addresses, we may hit the assert in libcfs_num_match(). 
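For illustration only, the guard ordering described above can be sketched in user-space C. This is not the LNet code: struct list_head is re-declared locally as a stand-in for the kernel type, and every *_sketch name is hypothetical. The matcher below asserts on an empty list, which is assumed here to mirror the kind of assertion the commit message refers to.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-in for the kernel's circular doubly linked list head. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head_sketch(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static int list_empty_sketch(const struct list_head *h)
{
	return h->next == h;
}

static void list_add_tail_sketch(struct list_head *item, struct list_head *h)
{
	item->prev = h->prev;
	item->next = h;
	h->prev->next = item;
	h->prev = item;
}

/* Hypothetical matcher that must never be handed an empty range list. */
static int num_match_sketch(unsigned int addr, const struct list_head *ranges)
{
	assert(!list_empty_sketch(ranges));
	(void)addr;
	return 1;
}

/*
 * Guard ordering as in the patch: reject a NULL or empty address list
 * (and a NULL net list) before the matcher is reached.
 */
static int match_nid_net_sketch(unsigned int addr,
				const struct list_head *ranges,
				const struct list_head *net_nums)
{
	if (!ranges || list_empty_sketch(ranges) || !net_nums)
		return 0;

	return num_match_sketch(addr, ranges);
}
```

With the emptiness check in place, an empty range list yields "no match" instead of tripping the assertion in the matcher.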
HPE-bug-id: LUS-11480 WC-bug-id: https://jira.whamcloud.com/browse/LU-16573 Lustre-commit: f3ba286b05d557b0d ("LU-16573 lnet: Check empty list in cfs_match_nid_net") Signed-off-by: Chris Horn Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50576 Reviewed-by: Serguei Smirnov Reviewed-by: Frank Sehr Reviewed-by: Cyril Bordage Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- net/lnet/lnet/nidstrings.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/net/lnet/lnet/nidstrings.c b/net/lnet/lnet/nidstrings.c index d235048a8ff0..05a9b3275624 100644 --- a/net/lnet/lnet/nidstrings.c +++ b/net/lnet/lnet/nidstrings.c @@ -822,11 +822,11 @@ cfs_match_nid_net(struct lnet_nid *nid, u32 net_type, u32 address; struct netstrfns *nf; - if (!addr || !net_num_list) + if (!addr || list_empty(addr) || !net_num_list) return 0; nf = type2net_info(LNET_NETTYP(LNET_NID_NET(nid))); - if (!nf || !net_num_list || !addr) + if (!nf) return 0; /* FIXME handle long-addr nid */ From patchwork Thu Jan 30 14:11:07 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 13954663 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from pdx1-mailman-customer002.dreamhost.com (listserver-buz.dreamhost.com [69.163.136.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 9C0F9C0218F for ; Thu, 30 Jan 2025 14:42:32 +0000 (UTC) Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1]) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4YkLhq72Lfz229t; Thu, 30 Jan 2025 06:18:03 -0800 (PST) Received: from smtp3.ccs.ornl.gov (smtp3.ccs.ornl.gov [160.91.203.39]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature 
RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4YkLZK5vP7z1xpR for ; Thu, 30 Jan 2025 06:12:25 -0800 (PST) Received: from star2.ccs.ornl.gov (ltm-e204-208.ccs.ornl.gov [160.91.203.12]) by smtp3.ccs.ornl.gov (Postfix) with ESMTP id 268A9899AD9; Thu, 30 Jan 2025 09:11:33 -0500 (EST) Received: by star2.ccs.ornl.gov (Postfix, from userid 2004) id 22E9C106BE14; Thu, 30 Jan 2025 09:11:33 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 30 Jan 2025 09:11:07 -0500 Message-ID: <20250130141115.950749-18-jsimmons@infradead.org> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20250130141115.950749-1-jsimmons@infradead.org> References: <20250130141115.950749-1-jsimmons@infradead.org> MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 17/25] lustre: ptlrpc: grow PtlRPC properly when prepare sub request X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.39 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Mikhail Pershin , Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Qian Yingjin In this patch, it prepares and grows PtlRPC reply buffer properly for SUB batch request in @req_capsule_server_pack(). At the same time, it adds a limit of reply buffer size with BUT_MAXREPSIZE = (1000 * 1024). 
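As a rough user-space sketch of the length calculation, the function below follows the arithmetic in the req_capsule_server_pack() hunk of this patch. The function name, the *_SKETCH macro and the flat parameter list are assumptions made for illustration; the real code reads these values from the request and its capsule.

```c
#include <errno.h>
#include <stdint.h>

/* Mirrors BUT_MAXREPSIZE = (1000 * 1024) from the commit message. */
#define BUT_MAXREPSIZE_SKETCH (1000u * 1024u)

/*
 * used_len: bytes of the batch reply already consumed by earlier sub replies
 * msg_len:  size of the sub reply about to be packed
 * replen:   current total reply length of the batch request
 * len:      current size of the batch reply field
 *
 * Returns 0 and stores the (possibly unchanged) field size in *newlen,
 * or -EOVERFLOW when the reply can never fit under the limit.
 */
static int but_reply_grow_len_sketch(uint32_t used_len, uint32_t msg_len,
				     uint32_t replen, uint32_t len,
				     uint32_t *newlen)
{
	uint32_t max, add;

	/* Enough room already: nothing to grow. */
	if (used_len + msg_len <= replen) {
		*newlen = len;
		return 0;
	}

	if (used_len + msg_len > BUT_MAXREPSIZE_SKETCH)
		return -EOVERFLOW;

	/* Cannot underflow: replen < used_len + msg_len <= limit here. */
	max = BUT_MAXREPSIZE_SKETCH - replen;

	/* Roughly double the field, or add what is needed if that is more. */
	add = len;
	if (used_len + msg_len > len)
		add = used_len + msg_len;

	/* Never grow past the headroom left under the limit. */
	*newlen = len + (add > max ? max : add);
	return 0;
}
```

For example, with used_len = 4096, msg_len = 1024, replen = 4096 and len = 4000, the field grows by 5120 bytes to 9120.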
WC-bug-id: https://jira.whamcloud.com/browse/LU-14139 Lustre-commit: 5a2dfd36f9c2b6c10 ("LU-14139 ptlrpc: grow PtlRPC properly when prepare sub request") Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/43707 WC-bug-id: https://jira.whamcloud.com/browse/LU-16907 Lustre-commit: 8a7703eec9bb77a0d ("LU-16907 ptlrpc: correct the reply buffer size for batch RPC") Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/56645 Signed-off-by: Qian Yingjin Reviewed-by: Mikhail Pershin Reviewed-by: Alex Zhuravlev Reviewed-by: Andreas Dilger Reviewed-by: Timothy Day Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/include/lustre_net.h | 5 + fs/lustre/include/lustre_req_layout.h | 3 + fs/lustre/mdc/mdc_batch.c | 2 +- fs/lustre/ptlrpc/batch.c | 5 + fs/lustre/ptlrpc/layout.c | 191 +++++++++++++++++++++++++- fs/lustre/ptlrpc/pack_generic.c | 70 ++++++++++ 6 files changed, 273 insertions(+), 3 deletions(-) diff --git a/fs/lustre/include/lustre_net.h b/fs/lustre/include/lustre_net.h index de1ef881d9d0..b8b4afe96230 100644 --- a/fs/lustre/include/lustre_net.h +++ b/fs/lustre/include/lustre_net.h @@ -265,6 +265,9 @@ #define OUT_MAXREQSIZE (1000 * 1024) #define OUT_MAXREPSIZE MDS_MAXREPSIZE +#define BUT_MAXREQSIZE OUT_MAXREQSIZE +#define BUT_MAXREPSIZE BUT_MAXREQSIZE + /* * LDLM threads constants: * @@ -2051,6 +2054,7 @@ int lustre_pack_reply_flags(struct ptlrpc_request *, int count, u32 *lens, char **bufs, int flags); int lustre_shrink_msg(struct lustre_msg *msg, int segment, unsigned int newlen, int move_data); +int lustre_grow_msg(struct lustre_msg *msg, int segment, unsigned int newlen); void lustre_free_reply_state(struct ptlrpc_reply_state *rs); int __lustre_unpack_msg(struct lustre_msg *m, int len); u32 lustre_msg_hdr_size(u32 magic, u32 count); @@ -2061,6 +2065,7 @@ extern u32 lustre_msg_early_size; void *lustre_msg_buf_v2(struct lustre_msg_v2 *m, u32 n, u32 min_size); void *lustre_msg_buf(struct lustre_msg *m, u32 n, u32 minlen); u32 
lustre_msg_buflen(struct lustre_msg *m, u32 n); +void lustre_msg_set_buflen(struct lustre_msg *m, u32 n, u32 len); u32 lustre_msg_bufcount(struct lustre_msg *m); char *lustre_msg_string(struct lustre_msg *m, u32 n, u32 max_len); u32 lustre_msghdr_get_flags(struct lustre_msg *msg); diff --git a/fs/lustre/include/lustre_req_layout.h b/fs/lustre/include/lustre_req_layout.h index 505e9a10c486..1504a591d96e 100644 --- a/fs/lustre/include/lustre_req_layout.h +++ b/fs/lustre/include/lustre_req_layout.h @@ -132,6 +132,9 @@ int req_capsule_field_present(const struct req_capsule *pill, void req_capsule_shrink(struct req_capsule *pill, const struct req_msg_field *field, u32 newlen, enum req_location loc); +int req_capsule_server_grow(struct req_capsule *pill, + const struct req_msg_field *field, + u32 newlen); bool req_capsule_need_swab(struct req_capsule *pill, enum req_location loc, u32 index); void req_capsule_set_swabbed(struct req_capsule *pill, enum req_location loc, diff --git a/fs/lustre/mdc/mdc_batch.c b/fs/lustre/mdc/mdc_batch.c index 73f5a8c5f9ed..16805243165f 100644 --- a/fs/lustre/mdc/mdc_batch.c +++ b/fs/lustre/mdc/mdc_batch.c @@ -133,7 +133,7 @@ static int mdc_batch_getattr_pack(struct batch_update_head *head, req_capsule_set_size(&pill, &RMF_ACL, RCL_SERVER, LUSTRE_POSIX_ACL_MAX_SIZE_OLD); req_capsule_set_size(&pill, &RMF_DEFAULT_MDT_MD, RCL_SERVER, - sizeof(struct lmv_user_md)); + /*sizeof(struct lmv_user_md)*/MIN_MD_SIZE); if (have_secctx) { char *secctx_name; diff --git a/fs/lustre/ptlrpc/batch.c b/fs/lustre/ptlrpc/batch.c index 75a6bc21b869..83342c7b3605 100644 --- a/fs/lustre/ptlrpc/batch.c +++ b/fs/lustre/ptlrpc/batch.c @@ -360,11 +360,16 @@ static int batch_update_request_fini(struct batch_update_head *head, */ repmsg = NULL; rc1 = -ECANCELED; + /* + * TODO: resend the unfinished sub request when the + * return code is -EOVERFLOW. 
+ */ } if (ouc->ouc_interpret) ouc->ouc_interpret(req, repmsg, ouc, rc1); + index++; object_update_callback_fini(ouc); if (rc == 0 && rc1 < 0) rc = rc1; diff --git a/fs/lustre/ptlrpc/layout.c b/fs/lustre/ptlrpc/layout.c index 5beebb7776a3..3a9e83f5262b 100644 --- a/fs/lustre/ptlrpc/layout.c +++ b/fs/lustre/ptlrpc/layout.c @@ -1915,16 +1915,62 @@ int req_capsule_server_pack(struct req_capsule *pill) count, fmt->rf_name); } } else { /* SUB request */ + struct ptlrpc_request *req = pill->rc_req; + u32 used_len; u32 msg_len; msg_len = lustre_msg_size_v2(count, pill->rc_area[RCL_SERVER]); - if (msg_len > pill->rc_reqmsg->lm_repsize) { + used_len = (char *)pill->rc_repmsg - (char *)req->rq_repmsg; + /* Overflow the reply buffer */ + if (used_len + msg_len > req->rq_replen) { + u32 len; + u32 max; + u32 add; + + if (!req_capsule_has_field(&req->rq_pill, + &RMF_BUT_REPLY, RCL_SERVER)) + return -EINVAL; + + if (!req_capsule_field_present(&req->rq_pill, + &RMF_BUT_REPLY, + RCL_SERVER)) + return -EINVAL; + + if (used_len + msg_len > BUT_MAXREPSIZE) + return -EOVERFLOW; + + len = req_capsule_get_size(&req->rq_pill, + &RMF_BUT_REPLY, RCL_SERVER); + /* + * Currently just increase the batch RPC reply buffer + * (including @RMF_PTLRPC_BODY + @RMF_BUT_REPLY) by 2. + * We must set the new length carefully as it will be + * rounded up with 8. 
+ */ + max = BUT_MAXREPSIZE - req->rq_replen; + add = len; + if (used_len + msg_len > len) + add = used_len + msg_len; + + if (add > max) + len += max; + else + len += add; + rc = req_capsule_server_grow(&req->rq_pill, + &RMF_BUT_REPLY, len); + if (rc) + return rc; + + pill->rc_repmsg = + (struct lustre_msg *)((char *)req->rq_repmsg + + used_len); + } + if (msg_len > pill->rc_reqmsg->lm_repsize) /* TODO: Check whether there is enough buffer size */ CDEBUG(D_INFO, "Overflow pack %d fields in format '%s' for the SUB request with message len %u:%u\n", count, fmt->rf_name, msg_len, pill->rc_reqmsg->lm_repsize); - } rc = 0; lustre_init_msg_v2(pill->rc_repmsg, count, @@ -2498,6 +2544,147 @@ void req_capsule_shrink(struct req_capsule *pill, } EXPORT_SYMBOL(req_capsule_shrink); +int req_capsule_server_grow(struct req_capsule *pill, + const struct req_msg_field *field, + u32 newlen) +{ + struct ptlrpc_request *req = pill->rc_req; + struct ptlrpc_reply_state *rs = req->rq_reply_state, *nrs; + char *from, *to, *sptr = NULL; + u32 slen = 0, snewlen = 0; + u32 offset, len, max, diff; + int rc; + + LASSERT(pill->rc_fmt); + LASSERT(__req_format_is_sane(pill->rc_fmt)); + LASSERT(req_capsule_has_field(pill, field, RCL_SERVER)); + LASSERT(req_capsule_field_present(pill, field, RCL_SERVER)); + + if (req_capsule_subreq(pill)) { + if (!req_capsule_has_field(&req->rq_pill, &RMF_BUT_REPLY, + RCL_SERVER)) + return -EINVAL; + + if (!req_capsule_field_present(&req->rq_pill, &RMF_BUT_REPLY, + RCL_SERVER)) + return -EINVAL; + + len = req_capsule_get_size(&req->rq_pill, &RMF_BUT_REPLY, + RCL_SERVER); + sptr = req_capsule_server_get(&req->rq_pill, &RMF_BUT_REPLY); + slen = req_capsule_get_size(pill, field, RCL_SERVER); + + LASSERT(len >= (char *)pill->rc_repmsg - sptr + + lustre_packed_msg_size(pill->rc_repmsg)); + if (len >= (char *)pill->rc_repmsg - sptr + + lustre_packed_msg_size(pill->rc_repmsg) - slen + + newlen) { + req_capsule_set_size(pill, field, RCL_SERVER, newlen); + offset = 
__req_capsule_offset(pill, field, RCL_SERVER); + lustre_grow_msg(pill->rc_repmsg, offset, newlen); + return 0; + } + + /* + * Currently first try to increase the reply buffer by + * 2 * newlen with reply buffer limit of BUT_MAXREPSIZE. + * TODO: Enlarge the reply buffer properly according to the + * left SUB requests in the batch PTLRPC request. + */ + snewlen = newlen; + diff = snewlen - slen; + max = BUT_MAXREPSIZE - req->rq_replen; + if (diff > max) + return -EOVERFLOW; + + if (diff * 2 + len < max) + newlen = (len + diff) * 2; + else + newlen = len + max; + + req_capsule_set_size(pill, field, RCL_SERVER, snewlen); + req_capsule_set_size(&req->rq_pill, &RMF_BUT_REPLY, RCL_SERVER, + newlen); + offset = __req_capsule_offset(&req->rq_pill, &RMF_BUT_REPLY, + RCL_SERVER); + } else { + len = req_capsule_get_size(pill, field, RCL_SERVER); + offset = __req_capsule_offset(pill, field, RCL_SERVER); + req_capsule_set_size(pill, field, RCL_SERVER, newlen); + } + + CDEBUG(D_INFO, "Reply packed: %d, allocated: %d, field len %d -> %d\n", + lustre_packed_msg_size(rs->rs_msg), rs->rs_repbuf_len, + len, newlen); + + /** + * There can be enough space in current reply buffer, make sure + * that rs_repbuf is not a wrapper but real reply msg, otherwise + * re-packing is still needed. 
+ */ + if (rs->rs_msg == rs->rs_repbuf && + rs->rs_repbuf_len >= + lustre_packed_msg_size(rs->rs_msg) - len + newlen) { + req->rq_replen = lustre_grow_msg(rs->rs_msg, offset, newlen); + return 0; + } + + /* Re-allocate replay state */ + req->rq_reply_state = NULL; + rc = req_capsule_server_pack(&req->rq_pill); + if (rc) { + /* put old values back, the caller should decide what to do */ + if (req_capsule_subreq(pill)) { + req_capsule_set_size(&req->rq_pill, &RMF_BUT_REPLY, + RCL_SERVER, len); + req_capsule_set_size(pill, field, RCL_SERVER, slen); + } else { + req_capsule_set_size(pill, field, RCL_SERVER, len); + } + pill->rc_req->rq_reply_state = rs; + return rc; + } + nrs = req->rq_reply_state; + LASSERT(lustre_packed_msg_size(nrs->rs_msg) > + lustre_packed_msg_size(rs->rs_msg)); + + /* Now we need only buffers, copy them and grow the needed one */ + to = lustre_msg_buf(nrs->rs_msg, 0, 0); + from = lustre_msg_buf(rs->rs_msg, 0, 0); + memcpy(to, from, + (char *)rs->rs_msg + lustre_packed_msg_size(rs->rs_msg) - from); + lustre_msg_set_buflen(nrs->rs_msg, offset, len); + req->rq_replen = lustre_grow_msg(nrs->rs_msg, offset, newlen); + + if (req_capsule_subreq(pill)) { + char *ptr; + + ptr = req_capsule_server_get(&req->rq_pill, &RMF_BUT_REPLY); + pill->rc_repmsg = (struct lustre_msg *)(ptr + + ((char *)pill->rc_repmsg - sptr)); + offset = __req_capsule_offset(pill, field, RCL_SERVER); + lustre_grow_msg(pill->rc_repmsg, offset, snewlen); + } + + if (rs->rs_difficult) { + /* copy rs data */ + int i; + + nrs->rs_difficult = 1; + nrs->rs_no_ack = rs->rs_no_ack; + for (i = 0; i < rs->rs_nlocks; i++) { + nrs->rs_locks[i] = rs->rs_locks[i]; + nrs->rs_nlocks++; + } + rs->rs_nlocks = 0; + rs->rs_difficult = 0; + rs->rs_no_ack = 0; + } + ptlrpc_rs_decref(rs); + return 0; +} +EXPORT_SYMBOL(req_capsule_server_grow); + void req_capsule_subreq_init(struct req_capsule *pill, const struct req_format *fmt, struct ptlrpc_request *req, diff --git a/fs/lustre/ptlrpc/pack_generic.c 
b/fs/lustre/ptlrpc/pack_generic.c index 53e2912a28e7..16058b9cd9be 100644 --- a/fs/lustre/ptlrpc/pack_generic.c +++ b/fs/lustre/ptlrpc/pack_generic.c @@ -454,6 +454,58 @@ int lustre_shrink_msg(struct lustre_msg *msg, int segment, } EXPORT_SYMBOL(lustre_shrink_msg); +static int lustre_grow_msg_v2(struct lustre_msg_v2 *msg, __u32 segment, + unsigned int newlen) +{ + char *tail = NULL, *newpos; + int tail_len = 0, n; + + LASSERT(msg); + LASSERT(msg->lm_bufcount > segment); + LASSERT(msg->lm_buflens[segment] <= newlen); + + if (msg->lm_buflens[segment] == newlen) + goto out; + + if (msg->lm_bufcount > segment + 1) { + tail = lustre_msg_buf_v2(msg, segment + 1, 0); + for (n = segment + 1; n < msg->lm_bufcount; n++) + tail_len += round_up(msg->lm_buflens[n], 8); + } + + msg->lm_buflens[segment] = newlen; + + if (tail && tail_len) { + newpos = lustre_msg_buf_v2(msg, segment + 1, 0); + memmove(newpos, tail, tail_len); + } +out: + return lustre_msg_size_v2(msg->lm_bufcount, msg->lm_buflens); +} + +/* + * for @msg, grow @segment to size @newlen. + * Always move higher buffer forward. + * + * return new msg size after growing. + * + * CAUTION: + * - caller must make sure there is enough space in allocated message buffer + * - caller should NOT keep pointers to msg buffers which higher than @segment + * after call shrink. 
+ */ +int lustre_grow_msg(struct lustre_msg *msg, int segment, unsigned int newlen) +{ + switch (msg->lm_magic) { + case LUSTRE_MSG_MAGIC_V2: + return lustre_grow_msg_v2(msg, segment, newlen); + default: + LASSERTF(0, "incorrect message magic: %08x\n", msg->lm_magic); + return -EINVAL; + } +} +EXPORT_SYMBOL(lustre_grow_msg); + void lustre_free_reply_state(struct ptlrpc_reply_state *rs) { PTLRPC_RS_DEBUG_LRU_DEL(rs); @@ -660,6 +712,24 @@ u32 lustre_msg_buflen(struct lustre_msg *m, u32 n) } EXPORT_SYMBOL(lustre_msg_buflen); +static inline void +lustre_msg_set_buflen_v2(struct lustre_msg_v2 *m, u32 n, u32 len) +{ + LASSERT(n < m->lm_bufcount); + m->lm_buflens[n] = len; +} + +void lustre_msg_set_buflen(struct lustre_msg *m, u32 n, u32 len) +{ + switch (m->lm_magic) { + case LUSTRE_MSG_MAGIC_V2: + lustre_msg_set_buflen_v2(m, n, len); + return; + default: + LASSERTF(0, "incorrect message magic: %08x\n", m->lm_magic); + } +} + /* NB return the bufcount for lustre_msg_v2 format, so if message is packed * in V1 format, the result is one bigger. (add struct ptlrpc_body). 
*/ From patchwork Thu Jan 30 14:11:08 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 13954682 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from pdx1-mailman-customer002.dreamhost.com (listserver-buz.dreamhost.com [69.163.136.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id DBCC4C0218A for ; Thu, 30 Jan 2025 14:48:13 +0000 (UTC) Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1]) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4YkLjp239Lz22LJ; Thu, 30 Jan 2025 06:18:54 -0800 (PST) Received: from smtp3.ccs.ornl.gov (smtp3.ccs.ornl.gov [160.91.203.39]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4YkLZw5yVwz1y22 for ; Thu, 30 Jan 2025 06:12:56 -0800 (PST) Received: from star2.ccs.ornl.gov (ltm3-e204-208.ccs.ornl.gov [160.91.203.26]) by smtp3.ccs.ornl.gov (Postfix) with ESMTP id 2AFD5899ADB; Thu, 30 Jan 2025 09:11:33 -0500 (EST) Received: by star2.ccs.ornl.gov (Postfix, from userid 2004) id 27255106BE16; Thu, 30 Jan 2025 09:11:33 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 30 Jan 2025 09:11:08 -0500 Message-ID: <20250130141115.950749-19-jsimmons@infradead.org> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20250130141115.950749-1-jsimmons@infradead.org> References: <20250130141115.950749-1-jsimmons@infradead.org> MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 18/25] lustre: llite: Check for page deletion after fault X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.39 Precedence: 
list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Zhenyu Xu , Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Patrick Farrell Before completing a page fault and returning to the kernel, we lock the page and verify it has not been truncated. But we must also verify the page has not been deleted from Lustre, or we can return a disconnected (i.e., not tracked by Lustre) page to the kernel. We mark deleted pages !uptodate, but this doesn't matter for faulted pages, because the kernel assumes they are returned uptodate, and maps them into the process address space. Once mapped, the page state is not checked until the page is unmapped. But because the page is referenced by the mapping, it stays in the page cache even though it's been disconnected from Lustre. Because the page is disconnected from Lustre, it will not be found and cancelled on lock cancellation. This can result in stale data reads. This is particularly an issue with releasepage (called from drop_caches or under memory pressure), which can delete pages separately from cancelling covering locks. If releasepage is disabled, which is effectively what "LU-14541 llite: Check vmpage in releasepage" does, this is not an issue. But disabling releasepage causes other problems and is incorrect anyway.
Fixes: 9c78efe1a4 ("lustre: llite: Check vmpage in releasepage") WC-bug-id: https://jira.whamcloud.com/browse/LU-14541 Lustre-commit: b3d2114e538cf95a7 ("LU-14541 llite: Check for page deletion after fault") Signed-off-by: Patrick Farrell Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49653 Reviewed-by: Oleg Drokin Reviewed-by: Qian Yingjin Reviewed-by: Zhenyu Xu Signed-off-by: James Simmons --- fs/lustre/llite/llite_mmap.c | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-) diff --git a/fs/lustre/llite/llite_mmap.c b/fs/lustre/llite/llite_mmap.c index db069de1ef31..d6c1a6fd0794 100644 --- a/fs/lustre/llite/llite_mmap.c +++ b/fs/lustre/llite/llite_mmap.c @@ -418,15 +418,22 @@ static vm_fault_t ll_fault(struct vm_fault *vmf) !(result & (VM_FAULT_RETRY | VM_FAULT_ERROR | VM_FAULT_LOCKED))) { struct page *vmpage = vmf->page; - /* check if this page has been truncated */ + /* lock the page, then check if this page has been truncated + * or deleted from Lustre and retry if so + */ lock_page(vmpage); - if (unlikely(!vmpage->mapping)) { /* unlucky */ + if (unlikely(vmpage->mapping == NULL) || + vmpage->private == 0) { /* unlucky */ unlock_page(vmpage); put_page(vmpage); vmf->page = NULL; if (!printed && ++count > 16) { - CWARN("the page is under heavy contention, maybe your app(%s) needs revising :-)\n", + struct inode *inode = file_inode(vma->vm_file); + + CWARN("%s: FID "DFID" under heavy mmap contention by '%s', consider revising IO pattern\n", + ll_i2sbi(inode)->ll_fsname, + PFID(&ll_i2info(inode)->lli_fid), current->comm); printed = true; } From patchwork Thu Jan 30 14:11:09 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 13954687 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from pdx1-mailman-customer002.dreamhost.com (listserver-buz.dreamhost.com 
[69.163.136.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 8ACFDC0218A for ; Thu, 30 Jan 2025 14:50:06 +0000 (UTC) Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1]) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4YkLmf6BXTz1x69; Thu, 30 Jan 2025 06:21:22 -0800 (PST) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4YkLc5036rz210R for ; Thu, 30 Jan 2025 06:13:56 -0800 (PST) Received: from star2.ccs.ornl.gov (ltm3-e204-208.ccs.ornl.gov [160.91.203.26]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 2D6BA18236E; Thu, 30 Jan 2025 09:11:33 -0500 (EST) Received: by star2.ccs.ornl.gov (Postfix, from userid 2004) id 2B9F9106BE17; Thu, 30 Jan 2025 09:11:33 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 30 Jan 2025 09:11:09 -0500 Message-ID: <20250130141115.950749-20-jsimmons@infradead.org> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20250130141115.950749-1-jsimmons@infradead.org> References: <20250130141115.950749-1-jsimmons@infradead.org> MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 19/25] lustre: misc: standardize iocontrol param handling X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.39 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Arshad Hussain , Vitaliy Kuznetsov , Tao Lyu , Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Andreas Dilger Validate uarg and karg early in iocontrol processing where needed. 
This needs kernel interop for 4.20+ kernels for access_ok(). WC-bug-id: https://jira.whamcloud.com/browse/LU-16634 Lustre-commit: 5eae8514f5f1730fe ("LU-16634 misc: standardize iocontrol param handling") Signed-off-by: Andreas Dilger Reported-by: Tao Lyu Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50314 Reviewed-by: Arshad Hussain Reviewed-by: Vitaliy Kuznetsov Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/llite/file.c | 47 ++++++++++++++++--------- fs/lustre/llite/llite_lib.c | 20 +++++++---- fs/lustre/lmv/lmv_obd.c | 28 ++++++++++----- fs/lustre/lov/lov_obd.c | 26 ++++++++++++-- fs/lustre/mdc/mdc_request.c | 24 +++++++------ fs/lustre/obdclass/class_obd.c | 12 ++++--- fs/lustre/obdecho/echo_client.c | 8 ++++- fs/lustre/osc/osc_request.c | 22 ++++++++++-- include/uapi/linux/lustre/lustre_user.h | 1 + 9 files changed, 137 insertions(+), 51 deletions(-) diff --git a/fs/lustre/llite/file.c b/fs/lustre/llite/file.c index d196362a40ca..11006682be1a 100644 --- a/fs/lustre/llite/file.c +++ b/fs/lustre/llite/file.c @@ -2500,6 +2500,10 @@ static int ll_file_getstripe(struct inode *inode, void __user *lum, size_t size) u16 refcheck; int rc; + /* exit before doing any work if pointer is bad */ + if (unlikely(!access_ok(lum, sizeof(struct lov_user_md)))) + return -EFAULT; + env = cl_env_get(&refcheck); if (IS_ERR(env)) return PTR_ERR(env); @@ -3826,7 +3830,7 @@ static long ll_file_unlock_lease(struct file *file, struct ll_ioc_lease *ioc, bool lease_broken = false; fmode_t fmode = 0; enum mds_op_bias bias = 0; - int fdv; + u32 fdv; struct file *layout_file = NULL; void *data = NULL; size_t data_size = 0; @@ -3873,7 +3877,7 @@ static long ll_file_unlock_lease(struct file *file, struct ll_ioc_lease *ioc, } uarg += sizeof(*ioc); - if (copy_from_user(&fdv, uarg, sizeof(u32))) { + if (copy_from_user(&fdv, uarg, sizeof(fdv))) { rc = -EFAULT; goto out_lease_close; } @@ -3894,7 +3898,7 @@ static long ll_file_unlock_lease(struct file *file, 
struct ll_ioc_lease *ioc, bias = MDS_CLOSE_LAYOUT_MERGE; break; case LL_LEASE_LAYOUT_SPLIT: { - int mirror_id; + u32 mirror_id; if (ioc->lil_count != 2) { rc = -EINVAL; @@ -3902,17 +3906,20 @@ static long ll_file_unlock_lease(struct file *file, struct ll_ioc_lease *ioc, } uarg += sizeof(*ioc); - if (copy_from_user(&fdv, uarg, sizeof(u32))) { + if (copy_from_user(&fdv, uarg, sizeof(fdv))) { rc = -EFAULT; goto out_lease_close; } - uarg += sizeof(u32); - if (copy_from_user(&mirror_id, uarg, - sizeof(u32))) { + uarg += sizeof(fdv); + if (copy_from_user(&mirror_id, uarg, sizeof(mirror_id))) { rc = -EFAULT; goto out_lease_close; } + if (mirror_id >= MIRROR_ID_NEG) { + rc = -EINVAL; + goto out_lease_close; + } layout_file = fget(fdv); if (!layout_file) { @@ -4110,6 +4117,9 @@ ll_file_ioctl(struct file *file, unsigned int cmd, unsigned long arg) if (_IOC_TYPE(cmd) == 'T' || _IOC_TYPE(cmd) == 't') /* tty ioctls */ return -ENOTTY; + /* can't do a generic karg == NULL check here, since it is too noisy and + * we need to return -ENOTTY for unsupported ioctls instead of -EINVAL. 
+ */ switch (cmd) { case LL_IOC_GETFLAGS: /* Get the current value of the file flags */ @@ -4263,6 +4273,9 @@ ll_file_ioctl(struct file *file, unsigned int cmd, unsigned long arg) struct hsm_user_state *hus; int rc; + if (!access_ok(uarg, sizeof(*hus))) + return -EFAULT; + hus = kzalloc(sizeof(*hus), GFP_KERNEL); if (!hus) return -ENOMEM; @@ -4270,17 +4283,16 @@ ll_file_ioctl(struct file *file, unsigned int cmd, unsigned long arg) op_data = ll_prep_md_op_data(NULL, inode, NULL, NULL, 0, 0, LUSTRE_OPC_ANY, hus); if (IS_ERR(op_data)) { - kfree(hus); - return PTR_ERR(op_data); - } - - rc = obd_iocontrol(cmd, ll_i2mdexp(inode), sizeof(*op_data), - op_data, NULL); + rc = PTR_ERR(op_data); + } else { + rc = obd_iocontrol(cmd, ll_i2mdexp(inode), + sizeof(*op_data), op_data, NULL); - if (copy_to_user(uarg, hus, sizeof(*hus))) - rc = -EFAULT; + if (copy_to_user(uarg, hus, sizeof(*hus))) + rc = -EFAULT; - ll_finish_md_op_data(op_data); + ll_finish_md_op_data(op_data); + } kfree(hus); return rc; } @@ -4303,6 +4315,9 @@ ll_file_ioctl(struct file *file, unsigned int cmd, unsigned long arg) const char *action; int rc; + if (!access_ok(uarg, sizeof(*hca))) + return -EFAULT; + hca = kzalloc(sizeof(*hca), GFP_KERNEL); if (!hca) return -ENOMEM; diff --git a/fs/lustre/llite/llite_lib.c b/fs/lustre/llite/llite_lib.c index 936a81c65870..751886b47445 100644 --- a/fs/lustre/llite/llite_lib.c +++ b/fs/lustre/llite/llite_lib.c @@ -37,18 +37,18 @@ #define DEBUG_SUBSYSTEM S_LLITE #include + +#include +#include +#include +#include #include #include #include -#include #include -#include -#include +#include +#include #include -#include -#include -#include -#include #include #include @@ -2887,6 +2887,9 @@ int ll_iocontrol(struct inode *inode, struct file *file, struct mdt_body *body; struct md_op_data *op_data; + if (!access_ok(uarg, sizeof(int))) + return -EFAULT; + op_data = ll_prep_md_op_data(NULL, inode, NULL, NULL, 0, 0, LUSTRE_OPC_ANY, NULL); if (IS_ERR(op_data)) { @@ -2982,6 +2985,9 @@ 
int ll_iocontrol(struct inode *inode, struct file *file, rc = ll_obd_statfs(inode, uarg); break; case LL_IOC_GET_MDTIDX: { + if (!access_ok(uarg, sizeof(rc))) + return -EFAULT; + rc = ll_get_mdt_idx(inode); if (rc < 0) break; diff --git a/fs/lustre/lmv/lmv_obd.c b/fs/lustre/lmv/lmv_obd.c index 62385ac9b00f..98ee902d8cb5 100644 --- a/fs/lustre/lmv/lmv_obd.c +++ b/fs/lustre/lmv/lmv_obd.c @@ -837,6 +837,23 @@ static int lmv_iocontrol(unsigned int cmd, struct obd_export *exp, if (count == 0) return -ENOTTY; + /* exit early for unknown ioctl types */ + if (unlikely(_IOC_TYPE(cmd) != 'f' && cmd != IOC_OSC_SET_ACTIVE)) + return OBD_IOC_ERROR(obd->obd_name, cmd, "unknown", -ENOTTY); + + /* handle commands that don't use @karg first */ + switch (cmd) { + case LL_IOC_GET_CONNECT_FLAGS: + tgt = lmv_tgt(lmv, 0); + rc = -ENODATA; + if (tgt && tgt->ltd_exp) + rc = obd_iocontrol(cmd, tgt->ltd_exp, len, NULL, uarg); + return rc; + } + + if (unlikely(!karg)) + return OBD_IOC_ERROR(obd->obd_name, cmd, "karg=NULL", -EINVAL); + switch (cmd) { case IOC_OBD_STATFS: { struct obd_ioctl_data *data = karg; @@ -926,14 +943,6 @@ static int lmv_iocontrol(unsigned int cmd, struct obd_export *exp, kfree(oqctl); break; } - case LL_IOC_GET_CONNECT_FLAGS: { - tgt = lmv_tgt(lmv, 0); - rc = -ENODATA; - - if (tgt && tgt->ltd_exp) - rc = obd_iocontrol(cmd, tgt->ltd_exp, len, karg, uarg); - break; - } case LL_IOC_FID2MDTIDX: { struct lu_fid *fid = karg; int mdt_index; @@ -1076,6 +1085,8 @@ static int lmv_iocontrol(unsigned int cmd, struct obd_export *exp, tgt->ltd_uuid.uuid, err); if (!rc) rc = err; + if (unlikely(err == -ENOTTY)) + break; } } else { set = 1; @@ -1083,6 +1094,7 @@ static int lmv_iocontrol(unsigned int cmd, struct obd_export *exp, } if (!set && !rc) rc = -EIO; + break; } return rc; } diff --git a/fs/lustre/lov/lov_obd.c b/fs/lustre/lov/lov_obd.c index 8bfce5001d58..a152f0b2d2ec 100644 --- a/fs/lustre/lov/lov_obd.c +++ b/fs/lustre/lov/lov_obd.c @@ -973,15 +973,28 @@ static int 
lov_iocontrol(unsigned int cmd, struct obd_export *exp, int len, CDEBUG(D_IOCTL, "%s: cmd=%x len=%u karg=%pK uarg=%pK\n", exp->exp_obd->obd_name, cmd, len, karg, uarg); + /* exit early for unknown ioctl types */ + if (unlikely(_IOC_TYPE(cmd) != 'f' && cmd != IOC_OSC_SET_ACTIVE)) + return OBD_IOC_DEBUG(D_IOCTL, obd->obd_name, cmd, "unknown", + -ENOTTY); + + /* can't do a generic karg == NULL check here, since it is too noisy and + * we need to return -ENOTTY for unsupported ioctls instead of -EINVAL. + */ switch (cmd) { case IOC_OBD_STATFS: { - struct obd_ioctl_data *data = karg; + struct obd_ioctl_data *data; struct obd_device *osc_obd; struct obd_statfs stat_buf = { 0 }; struct obd_import *imp; u32 index; u32 flags; + if (unlikely(!karg)) + return OBD_IOC_ERROR(obd->obd_name, cmd, "karg=null", + -EINVAL); + data = karg; + memcpy(&index, data->ioc_inlbuf2, sizeof(index)); if (index >= count) return -ENODEV; @@ -1021,11 +1034,16 @@ static int lov_iocontrol(unsigned int cmd, struct obd_export *exp, int len, break; } case OBD_IOC_QUOTACTL: { - struct if_quotactl *qctl = karg; + struct if_quotactl *qctl; struct lov_tgt_desc *tgt = NULL; struct obd_quotactl *oqctl; struct obd_import *imp; + if (unlikely(!karg)) + return OBD_IOC_ERROR(obd->obd_name, cmd, "karg=null", + -EINVAL); + qctl = karg; + if (qctl->qc_valid == QC_OSTIDX) { if (count <= qctl->qc_idx) return -EINVAL; @@ -1107,6 +1125,9 @@ static int lov_iocontrol(unsigned int cmd, struct obd_export *exp, int len, err); if (!rc) rc = err; + + if (err == -ENOTTY) + break; } } else { set = 1; @@ -1114,6 +1135,7 @@ static int lov_iocontrol(unsigned int cmd, struct obd_export *exp, int len, } if (!set && !rc) rc = -EIO; + break; } } diff --git a/fs/lustre/mdc/mdc_request.c b/fs/lustre/mdc/mdc_request.c index 84c4d2888e7d..55a7b5cf1249 100644 --- a/fs/lustre/mdc/mdc_request.c +++ b/fs/lustre/mdc/mdc_request.c @@ -2204,13 +2204,26 @@ static int mdc_iocontrol(unsigned int cmd, struct obd_export *exp, int len, void *karg, 
void __user *uarg) { struct obd_device *obd = exp->exp_obd; - struct obd_ioctl_data *data = karg; + struct obd_ioctl_data *data; struct obd_import *imp = obd->u.cli.cl_import; int rc; CDEBUG(D_IOCTL, "%s: cmd=%x len=%u karg=%pK uarg=%pK\n", obd->obd_name, cmd, len, karg, uarg); + /* handle commands that do not need @karg first */ + switch (cmd) { + case LL_IOC_GET_CONNECT_FLAGS: + if (copy_to_user(uarg, exp_connect_flags_ptr(exp), + sizeof(*exp_connect_flags_ptr(exp)))) + return -EFAULT; + return 0; + } + + if (unlikely(!karg)) + return OBD_IOC_ERROR(obd->obd_name, cmd, "karg=NULL", -EINVAL); + data = karg; + if (!try_module_get(THIS_MODULE)) { CERROR("%s: cannot get module '%s'\n", obd->obd_name, module_name(THIS_MODULE)); @@ -2311,15 +2324,6 @@ static int mdc_iocontrol(unsigned int cmd, struct obd_export *exp, int len, kfree(oqctl); goto out; } - case LL_IOC_GET_CONNECT_FLAGS: - if (copy_to_user(uarg, exp_connect_flags_ptr(exp), - sizeof(*exp_connect_flags_ptr(exp)))) { - rc = -EFAULT; - goto out; - } - - rc = 0; - goto out; case LL_IOC_LOV_SWAP_LAYOUTS: rc = mdc_ioc_swap_layouts(exp, karg); goto out; diff --git a/fs/lustre/obdclass/class_obd.c b/fs/lustre/obdclass/class_obd.c index dd5fcc895f02..3692d57fbcef 100644 --- a/fs/lustre/obdclass/class_obd.c +++ b/fs/lustre/obdclass/class_obd.c @@ -262,18 +262,17 @@ int obd_ioctl_getdata(struct obd_ioctl_data **datap, int *len, void __user *arg) if (copy_from_user(data, arg, hdr.ioc_len)) { rc = -EFAULT; - goto free_buf; + goto out_free; } if (hdr.ioc_len != data->ioc_len) { rc = -EINVAL; - goto free_buf; + goto out_free; } if (obd_ioctl_is_invalid(data)) { - CERROR("ioctl not correctly formatted\n"); rc = -EINVAL; - goto free_buf; + goto out_free; } if (data->ioc_inllen1) { @@ -296,7 +295,7 @@ int obd_ioctl_getdata(struct obd_ioctl_data **datap, int *len, void __user *arg) return 0; -free_buf: +out_free: kvfree(data); return rc; } @@ -309,6 +308,9 @@ int class_handle_ioctl(unsigned int cmd, void __user *uarg) int rc = 
0, len = 0; CDEBUG(D_IOCTL, "obdclass: cmd=%x len=%u uarg=%pK\n", cmd, len, uarg); + if (unlikely(_IOC_TYPE(cmd) != 'f' && cmd != IOC_OSC_SET_ACTIVE)) + return OBD_IOC_ERROR(obd->obd_name, cmd, "unknown", -ENOTTY); + rc = obd_ioctl_getdata(&data, &len, uarg); if (rc) { CERROR("%s: ioctl data error: rc = %d\n", current->comm, rc); diff --git a/fs/lustre/obdecho/echo_client.c b/fs/lustre/obdecho/echo_client.c index 95af2af66918..220ceae89150 100644 --- a/fs/lustre/obdecho/echo_client.c +++ b/fs/lustre/obdecho/echo_client.c @@ -991,7 +991,7 @@ echo_client_iocontrol(unsigned int cmd, struct obd_export *exp, int len, struct echo_device *ed = obd2echo_dev(obd); struct echo_client_obd *ec = ed->ed_ec; struct echo_object *eco; - struct obd_ioctl_data *data = karg; + struct obd_ioctl_data *data; struct lu_env *env; u16 refcheck; struct obdo *oa; @@ -1002,6 +1002,12 @@ echo_client_iocontrol(unsigned int cmd, struct obd_export *exp, int len, CDEBUG(D_IOCTL, "%s: cmd=%x len=%u karg=%pK uarg=%pK\n", exp->exp_obd->obd_name, cmd, len, karg, uarg); + CDEBUG(D_IOCTL, "%s: cmd=%x len=%u karg=%pK uarg=%pK\n", + exp->exp_obd->obd_name, cmd, len, karg, uarg); + if (unlikely(!karg)) + return OBD_IOC_ERROR(obd->obd_name, cmd, "karg=NULL", rc); + data = karg; + oa = &data->ioc_obdo1; if (!(oa->o_valid & OBD_MD_FLGROUP)) { oa->o_valid |= OBD_MD_FLGROUP; diff --git a/fs/lustre/osc/osc_request.c b/fs/lustre/osc/osc_request.c index 8efdd5a8cd8a..061767503401 100644 --- a/fs/lustre/osc/osc_request.c +++ b/fs/lustre/osc/osc_request.c @@ -3358,8 +3358,8 @@ static int osc_iocontrol(unsigned int cmd, struct obd_export *exp, int len, void *karg, void __user *uarg) { struct obd_device *obd = exp->exp_obd; - struct obd_ioctl_data *data = karg; - int rc = 0; + struct obd_ioctl_data *data; + int rc; CDEBUG(D_IOCTL, "%s: cmd=%x len=%u karg=%pK uarg=%pK\n", obd->obd_name, cmd, len, karg, uarg); @@ -3371,15 +3371,33 @@ static int osc_iocontrol(unsigned int cmd, struct obd_export *exp, int len, } switch 
(cmd) { case OBD_IOC_CLIENT_RECOVER: + if (unlikely(!karg)) { + OBD_IOC_ERROR(obd->obd_name, cmd, "karg=NULL", + rc = -EINVAL); + break; + } + data = karg; rc = ptlrpc_recover_import(obd->u.cli.cl_import, data->ioc_inlbuf1, 0); if (rc > 0) rc = 0; break; case OBD_IOC_GETATTR: + if (unlikely(!karg)) { + OBD_IOC_ERROR(obd->obd_name, cmd, "karg=NULL", + rc = -EINVAL); + break; + } + data = karg; rc = obd_getattr(NULL, exp, &data->ioc_obdo1); break; case IOC_OSC_SET_ACTIVE: + if (unlikely(!karg)) { + OBD_IOC_ERROR(obd->obd_name, cmd, "karg=NULL", + rc = -EINVAL); + break; + } + data = karg; rc = ptlrpc_set_import_active(obd->u.cli.cl_import, data->ioc_offset); break; diff --git a/include/uapi/linux/lustre/lustre_user.h b/include/uapi/linux/lustre/lustre_user.h index 876d337a3b2b..9c0632856bc8 100644 --- a/include/uapi/linux/lustre/lustre_user.h +++ b/include/uapi/linux/lustre/lustre_user.h @@ -653,6 +653,7 @@ struct lov_comp_md_entry_v1 { #define SEQ_ID_MASK SEQ_ID_MAX /* bit 30:16 of lcme_id is used to store mirror id */ #define MIRROR_ID_MASK 0x7FFF0000 +#define MIRROR_ID_NEG 0x8000 #define MIRROR_ID_SHIFT 16 static inline __u32 pflr_id(__u16 mirror_id, __u16 seqid) From patchwork Thu Jan 30 14:11:10 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 13954689 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from pdx1-mailman-customer002.dreamhost.com (listserver-buz.dreamhost.com [69.163.136.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 6E7C7C0218A for ; Thu, 30 Jan 2025 14:51:42 +0000 (UTC) Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1]) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4YkLnX6kYTz22V4; Thu, 30 Jan 2025 06:22:08 -0800 
(PST) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4YkLdJ3wrvz214y for ; Thu, 30 Jan 2025 06:15:00 -0800 (PST) Received: from star2.ccs.ornl.gov (ltm3-e204-208.ccs.ornl.gov [160.91.203.26]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 32EAB18236F; Thu, 30 Jan 2025 09:11:33 -0500 (EST) Received: by star2.ccs.ornl.gov (Postfix, from userid 2004) id 2FE38106BE18; Thu, 30 Jan 2025 09:11:33 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 30 Jan 2025 09:11:10 -0500 Message-ID: <20250130141115.950749-21-jsimmons@infradead.org> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20250130141115.950749-1-jsimmons@infradead.org> References: <20250130141115.950749-1-jsimmons@infradead.org> MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 20/25] lustre: ptlrpc: retry mechanism for overflowed batched RPCs X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.39 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Mikhail Pershin , Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Qian Yingjin Before sending the batched RPC, the client has no idea about the actual reply buffer size. The reply buffer size prepared by a client may be smaller than the reply buffer size actually needed. We already have the patch to grow the reply buffer properly in most cases. However, when the reply buffer size grows larger than BUT_MAXREPSIZE (1000 * 1024), the server will return the -EOVERFLOW error code. At that point, the server has executed only some of the sub requests in the batched RPC. The overflowed sub requests are not handled.
This patch adds a retry mechanism for overflowed batched RPCs. When the client finds that the reply buffer overflowed, it rebuilds the batched RPC for the unhandled sub requests and uses a work queue to resend the new batched RPC to the server so they are executed again. WC-bug-id: https://jira.whamcloud.com/browse/LU-15550 Lustre-commit: 668f48f87bec39998 ("LU-15550 ptlrpc: retry mechanism for overflowed batched RPCs") Signed-off-by: Qian Yingjin Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/46540 Reviewed-by: Andreas Dilger Reviewed-by: Mikhail Pershin Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/ptlrpc/batch.c | 146 +++++++++++++++++++++++++++++++++++++-- 1 file changed, 140 insertions(+), 6 deletions(-) diff --git a/fs/lustre/ptlrpc/batch.c b/fs/lustre/ptlrpc/batch.c index 83342c7b3605..77e3261862e0 100644 --- a/fs/lustre/ptlrpc/batch.c +++ b/fs/lustre/ptlrpc/batch.c @@ -43,6 +43,17 @@ #define OUT_UPDATE_REPLY_SIZE 4096 +static inline struct lustre_msg * +batch_update_reqmsg_next(struct batch_update_request *bur, + struct lustre_msg *reqmsg) +{ + if (reqmsg) + return (struct lustre_msg *)((char *)reqmsg + + lustre_packed_msg_size(reqmsg)); + else + return &bur->burq_reqmsg[0]; +} + static inline struct lustre_msg * batch_update_repmsg_next(struct batch_update_reply *bur, struct lustre_msg *repmsg) @@ -65,6 +76,12 @@ struct batch_update_args { struct batch_update_head *ba_head; }; +struct batch_work_resend { + struct work_struct bwr_work; + struct batch_update_head *bwr_head; + int bwr_index; +}; + /** * Prepare inline update request * @@ -325,6 +342,8 @@ static void batch_update_request_destroy(struct batch_update_head *head) kfree(head); } +static void cli_batch_resend_work(struct work_struct *data); + static int batch_update_request_fini(struct batch_update_head *head, struct ptlrpc_request *req, struct batch_update_reply *reply, int rc) @@ -340,8 +359,6 @@ static int batch_update_request_fini(struct
batch_update_head *head, list_for_each_entry_safe(ouc, next, &head->buh_cb_list, ouc_item) { int rc1 = 0; - list_del_init(&ouc->ouc_item); - /* * The peer may only have handled some requests (indicated by * @count) in the packaged OUT PRC, we can only get results @@ -364,8 +381,24 @@ static int batch_update_request_fini(struct batch_update_head *head, * TODO: resend the unfinished sub request when the * return code is -EOVERFLOW. */ + if (rc == -EOVERFLOW) { + struct batch_work_resend *work; + + work = kmalloc(sizeof(*work), GFP_ATOMIC); + if (!work) { + rc1 = -ENOMEM; + } else { + INIT_WORK(&work->bwr_work, + cli_batch_resend_work); + work->bwr_head = head; + work->bwr_index = index; + schedule_work(&work->bwr_work); + return 0; + } + } } + list_del_init(&ouc->ouc_item); if (ouc->ouc_interpret) ouc->ouc_interpret(req, repmsg, ouc, rc1); @@ -413,6 +446,7 @@ static int batch_send_update_req(const struct lu_env *env, struct ptlrpc_request *req = NULL; struct batch_update_args *aa; struct lu_batch *bh; + u32 flags = 0; int rc; if (!head) @@ -420,6 +454,9 @@ static int batch_send_update_req(const struct lu_env *env, obd = class_exp2obd(head->buh_exp); bh = head->buh_batch; + if (bh) + flags = bh->lbt_flags; + rc = batch_prep_update_req(head, &req); if (rc) { rc = batch_update_request_fini(head, NULL, NULL, rc); @@ -434,16 +471,16 @@ static int batch_send_update_req(const struct lu_env *env, * Only acquire modification RPC slot for the batched RPC * which contains metadata updates. 
*/ - if (!(bh->lbt_flags & BATCH_FL_RDONLY)) + if (!(flags & BATCH_FL_RDONLY)) ptlrpc_get_mod_rpc_slot(req); - if (bh->lbt_flags & BATCH_FL_SYNC) { + if (flags & BATCH_FL_SYNC) { rc = ptlrpc_queue_wait(req); } else { - if ((bh->lbt_flags & (BATCH_FL_RDONLY | BATCH_FL_RQSET)) == + if ((flags & (BATCH_FL_RDONLY | BATCH_FL_RQSET)) == BATCH_FL_RDONLY) { ptlrpcd_add_req(req); - } else if (bh->lbt_flags & BATCH_FL_RQSET) { + } else if (flags & BATCH_FL_RQSET) { ptlrpc_set_add_req(bh->lbt_rqset, req); ptlrpc_check_set(env, bh->lbt_rqset); } else { @@ -522,6 +559,103 @@ static int batch_update_request_add(struct batch_update_head **headp, return rc; } +static void cli_batch_resend_work(struct work_struct *data) +{ + struct batch_work_resend *work = container_of(data, + struct batch_work_resend, bwr_work); + struct batch_update_head *obuh = work->bwr_head; + struct object_update_callback *ouc; + struct batch_update_head *head; + struct batch_update_buffer *buf; + struct batch_update_buffer *tmp; + int index = work->bwr_index; + int rc = 0; + int i = 0; + + head = batch_update_request_create(obuh->buh_exp, NULL); + if (!head) { + rc = -ENOMEM; + goto err_up; + } + + list_for_each_entry_safe(buf, tmp, &obuh->buh_buf_list, bub_item) { + struct batch_update_request *bur = buf->bub_req; + struct batch_update_buffer *newbuf; + struct lustre_msg *reqmsg = NULL; + size_t max_len; + int j; + + if (i + bur->burq_count < index) { + i += bur->burq_count; + continue; + } + + /* reused the allocated buffer */ + if (i >= index) { + list_move_tail(&buf->bub_item, &head->buh_buf_list); + head->buh_update_count += buf->bub_req->burq_count; + head->buh_buf_count++; + continue; + } + + for (j = 0; j < bur->burq_count; j++) { + struct lustre_msg *newmsg; + u32 msgsz; + + reqmsg = batch_update_reqmsg_next(bur, reqmsg); + if (i + j < index) + continue; +repeat: + newbuf = current_batch_update_buffer(head); + LASSERT(newbuf); + max_len = newbuf->bub_size - newbuf->bub_end; + newmsg = (struct 
lustre_msg *)((char *)newbuf->bub_req + + newbuf->bub_end); + msgsz = lustre_packed_msg_size(reqmsg); + if (msgsz >= max_len) { + int rc2; + + /* Create new batch update buffer */ + rc2 = batch_update_buffer_create(head, msgsz + + offsetof(struct batch_update_request, + burq_reqmsg[0]) + 1); + if (rc2 != 0) { + rc = rc2; + goto err_up; + } + goto repeat; + } + + memcpy(newmsg, reqmsg, msgsz); + newbuf->bub_end += msgsz; + newbuf->bub_req->burq_count++; + head->buh_update_count++; + } + + i = index; + } + + list_splice_init(&obuh->buh_cb_list, &head->buh_cb_list); + list_for_each_entry(ouc, &head->buh_cb_list, ouc_item) + ouc->ouc_head = head; + + head->buh_repsize = BUT_MAXREPSIZE - SPTLRPC_MAX_PAYLOAD; + rc = batch_send_update_req(NULL, head); + if (rc) + goto err_up; + + batch_update_request_destroy(obuh); + kfree(work); + return; + +err_up: + batch_update_request_fini(obuh, NULL, NULL, rc); + if (head) + batch_update_request_fini(head, NULL, NULL, rc); + + kfree(work); +} + struct lu_batch *cli_batch_create(struct obd_export *exp, enum lu_batch_flags flags, u32 max_count) { From patchwork Thu Jan 30 14:11:11 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 13954684 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from pdx1-mailman-customer002.dreamhost.com (listserver-buz.dreamhost.com [69.163.136.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id CFAFCC0218A for ; Thu, 30 Jan 2025 14:48:51 +0000 (UTC) Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1]) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4YkLl81C8fz22SD; Thu, 30 Jan 2025 06:20:04 -0800 (PST) Received: from smtp3.ccs.ornl.gov (smtp3.ccs.ornl.gov [160.91.203.39]) (using 
TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4YkLbF3D5sz1y6S for ; Thu, 30 Jan 2025 06:13:13 -0800 (PST) Received: from star2.ccs.ornl.gov (ltm-e204-208.ccs.ornl.gov [160.91.203.12]) by smtp3.ccs.ornl.gov (Postfix) with ESMTP id 35B55899ADC; Thu, 30 Jan 2025 09:11:33 -0500 (EST) Received: by star2.ccs.ornl.gov (Postfix, from userid 2004) id 340B5106BE14; Thu, 30 Jan 2025 09:11:33 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 30 Jan 2025 09:11:11 -0500 Message-ID: <20250130141115.950749-22-jsimmons@infradead.org> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20250130141115.950749-1-jsimmons@infradead.org> References: <20250130141115.950749-1-jsimmons@infradead.org> MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 21/25] lnet: restore IOC_LIBCFS_GET_NI X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.39 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Frank Sehr , Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" Restore IOC_LIBCFS_GET_NI for compatibility until there have been some releases with netlink support, so that older utilities just work. 
Fixes: fafd24988 ("lnet: use Netlink to support old and new NI APIs.") WC-bug-id: https://jira.whamcloud.com/browse/LU-16462 Lustre-commit: ae1ee11cea0a90631 ("LU-16462 utils: handle lack of newer nla_attrs") Signed-off-by: James Simmons Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49608 Reviewed-by: Andreas Dilger Reviewed-by: Frank Sehr Reviewed-by: Oleg Drokin --- include/uapi/linux/lnet/libcfs_ioctl.h | 2 +- net/lnet/lnet/api-ni.c | 7 +++++++ 2 files changed, 8 insertions(+), 1 deletion(-) diff --git a/include/uapi/linux/lnet/libcfs_ioctl.h b/include/uapi/linux/lnet/libcfs_ioctl.h index e012532fc88a..a77a736c1107 100644 --- a/include/uapi/linux/lnet/libcfs_ioctl.h +++ b/include/uapi/linux/lnet/libcfs_ioctl.h @@ -94,7 +94,7 @@ struct libcfs_ioctl_data { #define IOC_LIBCFS_MARK_DEBUG _IOWR('e', 32, IOCTL_LIBCFS_TYPE) /* IOC_LIBCFS_MEMHOG obsolete in 2.8.0, was _IOWR('e', 36, IOCTL_LIBCFS_TYPE) */ /* lnet ioctls */ -/* IOC_LIBCFS_GET_NI obsolete in 2.16, was _IOWR('e', 50, IOCTL_LIBCFS_TYPE) */ +#define IOC_LIBCFS_GET_NI _IOWR('e', 50, IOCTL_LIBCFS_TYPE) #define IOC_LIBCFS_FAIL_NID _IOWR('e', 51, IOCTL_LIBCFS_TYPE) #define IOC_LIBCFS_NOTIFY_ROUTER _IOWR('e', 55, IOCTL_LIBCFS_TYPE) #define IOC_LIBCFS_UNCONFIGURE _IOWR('e', 56, IOCTL_LIBCFS_TYPE) diff --git a/net/lnet/lnet/api-ni.c b/net/lnet/lnet/api-ni.c index f3f9aeef04dd..5ec6faa2da98 100644 --- a/net/lnet/lnet/api-ni.c +++ b/net/lnet/lnet/api-ni.c @@ -4030,6 +4030,13 @@ LNetCtl(unsigned int cmd, void *arg) sizeof(struct lnet_ioctl_config_data)); switch (cmd) { + case IOC_LIBCFS_GET_NI: { + struct lnet_processid id = {}; + + rc = LNetGetId(data->ioc_count, &id); + data->ioc_nid = lnet_nid_to_nid4(&id.nid); + return rc; + } case IOC_LIBCFS_FAIL_NID: return lnet_fail_nid(data->ioc_nid, data->ioc_count); From patchwork Thu Jan 30 14:11:12 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 13954662 
Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from pdx1-mailman-customer002.dreamhost.com (listserver-buz.dreamhost.com [69.163.136.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 38AA9C0218A for ; Thu, 30 Jan 2025 14:40:00 +0000 (UTC) Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1]) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4YkLgw0wnyz228v; Thu, 30 Jan 2025 06:17:16 -0800 (PST) Received: from smtp3.ccs.ornl.gov (smtp3.ccs.ornl.gov [160.91.203.39]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4YkLc85nFDz211D for ; Thu, 30 Jan 2025 06:14:00 -0800 (PST) Received: from star2.ccs.ornl.gov (ltm3-e204-208.ccs.ornl.gov [160.91.203.26]) by smtp3.ccs.ornl.gov (Postfix) with ESMTP id 3A16F899ADD; Thu, 30 Jan 2025 09:11:33 -0500 (EST) Received: by star2.ccs.ornl.gov (Postfix, from userid 2004) id 38480106BE16; Thu, 30 Jan 2025 09:11:33 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 30 Jan 2025 09:11:12 -0500 Message-ID: <20250130141115.950749-23-jsimmons@infradead.org> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20250130141115.950749-1-jsimmons@infradead.org> References: <20250130141115.950749-1-jsimmons@infradead.org> MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 22/25] lustre: obdclass: Free t10pi crypto state on error X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.39 Precedence: list List-Id: "For discussing Lustre software development." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Li Xi , Li Dongyang , Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Oleg Drokin It looks like when an error happens we forget to release the crypto state, which not only leaks memory directly but can potentially tie up in-memory pages too. WC-bug-id: https://jira.whamcloud.com/browse/LU-15615 Lustre-commit: 6a88222bd6a1c0f5b ("LU-15615 target: Free t10pi crypto state on error") Signed-off-by: Oleg Drokin Reviewed-by: Andreas Dilger Reviewed-by: Li Dongyang Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50539 Reviewed-by: Li Xi Signed-off-by: James Simmons --- fs/lustre/obdclass/integrity.c | 35 ++++++++++++++++++---------------- 1 file changed, 19 insertions(+), 16 deletions(-) diff --git a/fs/lustre/obdclass/integrity.c b/fs/lustre/obdclass/integrity.c index e6069cb30213..57f52ab829cd 100644 --- a/fs/lustre/obdclass/integrity.c +++ b/fs/lustre/obdclass/integrity.c @@ -45,37 +45,40 @@ __be16 obd_dif_ip_fn(void *data, unsigned int len) EXPORT_SYMBOL(obd_dif_ip_fn); int obd_page_dif_generate_buffer(const char *obd_name, struct page *page, - u32 offset, u32 length, + u32 start, u32 length, __be16 *guard_start, int guard_number, int *used_number, int sector_size, obd_dif_csum_fn *fn) { - unsigned int i = offset; - unsigned int end = offset + length; + unsigned int off = start; + unsigned int end = start + length; char *data_buf; __be16 *guard_buf = guard_start; unsigned int data_size; - int used = 0; + int guard_used = 0; + int rc = 0; - data_buf = kmap(page) + offset; - while (i < end) { - if (used >= guard_number) { - CERROR("%s: unexpected used guard number of DIF %u/%u, data length %u, sector size %u: rc = %d\n", - obd_name, used, guard_number, length, - sector_size, -E2BIG); - return -E2BIG; + data_buf = kmap(page) + start; + while (off < end) { + if (guard_used >= guard_number) { + rc = -E2BIG; + CERROR("%s: used %u >= guard %u,
data %u+%u, sector_size %u: rc = %d\n", + obd_name, guard_used, guard_number, start, + length, sector_size, rc); + goto out; } - data_size = min(round_up(i + 1, sector_size), end) - i; + data_size = min(round_up(off + 1, sector_size), end) - off; *guard_buf = fn(data_buf, data_size); guard_buf++; + guard_used++; data_buf += data_size; - i += data_size; - used++; + off += data_size; } + *used_number = guard_used; +out: kunmap(page); - *used_number = used; - return 0; + return rc; } EXPORT_SYMBOL(obd_page_dif_generate_buffer); From patchwork Thu Jan 30 14:11:13 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 13954664 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from pdx1-mailman-customer002.dreamhost.com (listserver-buz.dreamhost.com [69.163.136.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 4DBCEC0218A for ; Thu, 30 Jan 2025 14:43:30 +0000 (UTC) Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1]) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4YkLj84RFSz22B8; Thu, 30 Jan 2025 06:18:20 -0800 (PST) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4YkLdw0t7Xz1xpP for ; Thu, 30 Jan 2025 06:15:31 -0800 (PST) Received: from star2.ccs.ornl.gov (ltm3-e204-208.ccs.ornl.gov [160.91.203.26]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 3DB18182370; Thu, 30 Jan 2025 09:11:33 -0500 (EST) Received: by star2.ccs.ornl.gov (Postfix, from userid 2004) id 3C4EE106BE17; Thu, 30 Jan 2025 
09:11:33 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 30 Jan 2025 09:11:13 -0500 Message-ID: <20250130141115.950749-24-jsimmons@infradead.org> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20250130141115.950749-1-jsimmons@infradead.org> References: <20250130141115.950749-1-jsimmons@infradead.org> MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 23/25] lustre: remove obsolete OBD_FAIL_OSC_DIO_PAUSE fail_loc X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.39 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Patrick Farrell The fail_loc used for testing was removed in Lustre 2.0. The fail_loc tests for a bug which should be obvious (a serious delay when doing DIO writes) and which is definitely fixed in current versions (Bugzilla 15950). Without the fail_loc, the test isn't doing anything interesting, and its timer-based aspect fails occasionally due to hardware delays. So let's just remove the fail_loc.
WC-bug-id: https://jira.whamcloud.com/browse/LU-13706 lustre-commit: 59d5bb1558b281d75 ("LU-13706 tests: remove test 119d") Signed-off-by: Patrick Farrell Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50731 Reviewed-by: Andreas Dilger Reviewed-by: Alex Zhuravlev Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/include/obd_support.h | 1 - 1 file changed, 1 deletion(-) diff --git a/fs/lustre/include/obd_support.h b/fs/lustre/include/obd_support.h index 43b4684f418a..cee7e3164d66 100644 --- a/fs/lustre/include/obd_support.h +++ b/fs/lustre/include/obd_support.h @@ -327,7 +327,6 @@ extern char obd_jobid_var[]; #define OBD_FAIL_OSC_BRW_PREP_REQ2 0x40a /* #define OBD_FAIL_OSC_CONNECT_CKSUM 0x40b Obsolete since 2.9 */ #define OBD_FAIL_OSC_CKSUM_ADLER_ONLY 0x40c -#define OBD_FAIL_OSC_DIO_PAUSE 0x40d #define OBD_FAIL_OSC_OBJECT_CONTENTION 0x40e #define OBD_FAIL_OSC_CP_CANCEL_RACE 0x40f #define OBD_FAIL_OSC_CP_ENQ_RACE 0x410 From patchwork Thu Jan 30 14:11:14 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 13954690 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from pdx1-mailman-customer002.dreamhost.com (listserver-buz.dreamhost.com [69.163.136.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id CCCA2C0218F for ; Thu, 30 Jan 2025 14:52:48 +0000 (UTC) Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1]) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4YkLp22HDfz22VR; Thu, 30 Jan 2025 06:22:34 -0800 (PST) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest 
SHA256) (No client certificate requested) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4YkLfb4CV6z21l8 for ; Thu, 30 Jan 2025 06:16:07 -0800 (PST) Received: from star2.ccs.ornl.gov (ltm3-e204-208.ccs.ornl.gov [160.91.203.26]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 44034182371; Thu, 30 Jan 2025 09:11:33 -0500 (EST) Received: by star2.ccs.ornl.gov (Postfix, from userid 2004) id 408AD106BE18; Thu, 30 Jan 2025 09:11:33 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 30 Jan 2025 09:11:14 -0500 Message-ID: <20250130141115.950749-25-jsimmons@infradead.org> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20250130141115.950749-1-jsimmons@infradead.org> References: <20250130141115.950749-1-jsimmons@infradead.org> MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 24/25] lustre: obdclass: init osc.*.rpc_stats start_time X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.39 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Arshad Hussain , Feng Lei , Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Andreas Dilger Add missing start_time initialization for osc.*.rpc_stats. 
Fixes: 653198e691 ("lustre: obdclass: add start time to stats files") WC-bug-id: https://jira.whamcloud.com/browse/LU-11407 Lustre-commit: 0176531449899c30e ("LU-11407 obdclass: init osc.*.rpc_stats start_time") Signed-off-by: Andreas Dilger Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50734 Reviewed-by: Feng Lei Reviewed-by: Arshad Hussain Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/ldlm/ldlm_lib.c | 1 + 1 file changed, 1 insertion(+) diff --git a/fs/lustre/ldlm/ldlm_lib.c b/fs/lustre/ldlm/ldlm_lib.c index 6a03ca9e2c06..726f70c43e19 100644 --- a/fs/lustre/ldlm/ldlm_lib.c +++ b/fs/lustre/ldlm/ldlm_lib.c @@ -361,6 +361,7 @@ int client_obd_setup(struct obd_device *obd, struct lustre_cfg *lcfg) cli->cl_r_in_flight = 0; cli->cl_w_in_flight = 0; + cli->cl_stats_init = ktime_get_real(); spin_lock_init(&cli->cl_read_rpc_hist.oh_lock); spin_lock_init(&cli->cl_write_rpc_hist.oh_lock); spin_lock_init(&cli->cl_read_page_hist.oh_lock); From patchwork Thu Jan 30 14:11:15 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 13954688 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from pdx1-mailman-customer002.dreamhost.com (listserver-buz.dreamhost.com [69.163.136.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id F17D2C0218F for ; Thu, 30 Jan 2025 14:50:48 +0000 (UTC) Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1]) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4YkLn52lVXz22Td; Thu, 30 Jan 2025 06:21:45 -0800 (PST) Received: from smtp3.ccs.ornl.gov (smtp3.ccs.ornl.gov [160.91.203.39]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) 
server-digest SHA256) (No client certificate requested) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4YkLdJ0zSSz214t for ; Thu, 30 Jan 2025 06:15:00 -0800 (PST) Received: from star2.ccs.ornl.gov (ltm-e204-208.ccs.ornl.gov [160.91.203.12]) by smtp3.ccs.ornl.gov (Postfix) with ESMTP id 46B11899ADE; Thu, 30 Jan 2025 09:11:33 -0500 (EST) Received: by star2.ccs.ornl.gov (Postfix, from userid 2004) id 44B89106BE14; Thu, 30 Jan 2025 09:11:33 -0500 (EST) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Thu, 30 Jan 2025 09:11:15 -0500 Message-ID: <20250130141115.950749-26-jsimmons@infradead.org> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20250130141115.950749-1-jsimmons@infradead.org> References: <20250130141115.950749-1-jsimmons@infradead.org> MIME-Version: 1.0 Subject: [lustre-devel] [PATCH 25/25] Revert "lustre: llite: access lli_lsm_md with lock in all places" X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.39 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Vitaly Fertman , Lai Siyao , Lustre Development List Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Vitaly Fertman This reverts commit f0c0d38c855f186b37844c4d92e24e44f1879fab as a prerequisite for a larger fix in this ticket, which covers this problem as well.
WC-bug-id: https://jira.whamcloud.com/browse/LU-15535 Lustre-commit: be278f82efa736035 ("LU-15535 revert: "LU-15284 llite: access lli_lsm_md with lock in all places") Signed-off-by: Vitaly Fertman Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50488 Reviewed-by: Oleg Drokin Reviewed-by: Lai Siyao Signed-off-by: James Simmons --- fs/lustre/llite/dir.c | 6 +++--- fs/lustre/llite/file.c | 8 +++----- fs/lustre/llite/llite_internal.h | 17 ++--------------- fs/lustre/llite/llite_lib.c | 20 +++++++++----------- fs/lustre/llite/namei.c | 7 ++----- fs/lustre/llite/statahead.c | 2 -- fs/lustre/lmv/lmv_obd.c | 3 +-- 7 files changed, 20 insertions(+), 43 deletions(-) diff --git a/fs/lustre/llite/dir.c b/fs/lustre/llite/dir.c index 2e44f9bb3895..98da2c95ca37 100644 --- a/fs/lustre/llite/dir.c +++ b/fs/lustre/llite/dir.c @@ -162,11 +162,11 @@ void ll_release_page(struct inode *inode, struct page *page, bool remove) { kunmap(page); - /* Always remove the page for striped dir, because the page is + /* + * Always remove the page for striped dir, because the page is * built from temporarily in LMV layer */ - if (inode && S_ISDIR(inode->i_mode) && - lmv_dir_striped(ll_i2info(inode)->lli_lsm_md)) { + if (inode && ll_dir_striped(inode)) { __free_page(page); return; } diff --git a/fs/lustre/llite/file.c b/fs/lustre/llite/file.c index 11006682be1a..c99e9c01bc65 100644 --- a/fs/lustre/llite/file.c +++ b/fs/lustre/llite/file.c @@ -5330,14 +5330,12 @@ static int ll_merge_md_attr(struct inode *inode) struct cl_attr attr = { 0 }; int rc; - if (!lli->lli_lsm_md) + LASSERT(lli->lli_lsm_md); + + if (!lmv_dir_striped(lli->lli_lsm_md)) return 0; down_read(&lli->lli_lsm_sem); - if (!lmv_dir_striped(lli->lli_lsm_md)) { - up_read(&lli->lli_lsm_sem); - return 0; - } rc = md_merge_attr(ll_i2mdexp(inode), lli->lli_lsm_md, &attr, ll_md_blocking_ast); up_read(&lli->lli_lsm_sem); diff --git a/fs/lustre/llite/llite_internal.h b/fs/lustre/llite/llite_internal.h index 
e86d700a182b..42f991ea9a7e 100644 --- a/fs/lustre/llite/llite_internal.h +++ b/fs/lustre/llite/llite_internal.h @@ -1469,22 +1469,9 @@ static inline struct lu_fid *ll_inode2fid(struct inode *inode) static inline bool ll_dir_striped(struct inode *inode) { - struct ll_inode_info *lli; - bool rc; - LASSERT(inode); - if (!S_ISDIR(inode->i_mode)) - return false; - - lli = ll_i2info(inode); - if (!lli->lli_lsm_md) - return false; - - down_read(&lli->lli_lsm_sem); - rc = lmv_dir_striped(lli->lli_lsm_md); - up_read(&lli->lli_lsm_sem); - - return rc; + return S_ISDIR(inode->i_mode) && + lmv_dir_striped(ll_i2info(inode)->lli_lsm_md); } static inline loff_t ll_file_maxbytes(struct inode *inode) diff --git a/fs/lustre/llite/llite_lib.c b/fs/lustre/llite/llite_lib.c index 751886b47445..37327be5be66 100644 --- a/fs/lustre/llite/llite_lib.c +++ b/fs/lustre/llite/llite_lib.c @@ -1702,25 +1702,23 @@ static int ll_update_lsm_md(struct inode *inode, struct lustre_md *md) } rc = ll_init_lsm_md(inode, md); - if (rc) { - up_write(&lli->lli_lsm_sem); - return rc; - } + up_write(&lli->lli_lsm_sem); - /* md_merge_attr() may take long, since lsm is already set, switch to - * read lock. - */ - downgrade_write(&lli->lli_lsm_sem); + if (rc) + return rc; /* set md->lmv to NULL, so the following free lustre_md will not free * this lsm. */ md->lmv = NULL; - if (!lmv_dir_striped(lli->lli_lsm_md)) { - rc = 0; + /* md_merge_attr() may take long, since lsm is already set, switch to + * read lock. 
+ */ + down_read(&lli->lli_lsm_sem); + + if (!lmv_dir_striped(lli->lli_lsm_md)) goto unlock; - } attr = kzalloc(sizeof(*attr), GFP_NOFS); if (!attr) { diff --git a/fs/lustre/llite/namei.c b/fs/lustre/llite/namei.c index 920b592489ab..59a7bbb7a99f 100644 --- a/fs/lustre/llite/namei.c +++ b/fs/lustre/llite/namei.c @@ -746,17 +746,14 @@ static int ll_lookup_it_finish(struct ptlrpc_request *request, .it_op = IT_GETATTR, .it_lock_handle = 0 }; - struct ll_inode_info *lli = ll_i2info(parent); - struct lu_fid fid = lli->lli_fid; + struct lu_fid fid = ll_i2info(parent)->lli_fid; /* If it is striped directory, get the real stripe parent */ if (unlikely(ll_dir_striped(parent))) { - down_read(&lli->lli_lsm_sem); rc = md_get_fid_from_lsm(ll_i2mdexp(parent), - lli->lli_lsm_md, + ll_i2info(parent)->lli_lsm_md, (*de)->d_name.name, (*de)->d_name.len, &fid); - up_read(&lli->lli_lsm_sem); if (rc) return rc; } diff --git a/fs/lustre/llite/statahead.c b/fs/lustre/llite/statahead.c index 59688b4d7c7f..c820455cc3af 100644 --- a/fs/lustre/llite/statahead.c +++ b/fs/lustre/llite/statahead.c @@ -1211,10 +1211,8 @@ static int ll_statahead_thread(void *arg) } pos = le64_to_cpu(dp->ldp_hash_end); - down_read(&lli->lli_lsm_sem); ll_release_page(dir, page, le32_to_cpu(dp->ldp_flags) & LDF_COLLIDE); - up_read(&lli->lli_lsm_sem); if (sa_low_hit(sai)) { rc = -EFAULT; diff --git a/fs/lustre/lmv/lmv_obd.c b/fs/lustre/lmv/lmv_obd.c index 98ee902d8cb5..27345a2a65b0 100644 --- a/fs/lustre/lmv/lmv_obd.c +++ b/fs/lustre/lmv/lmv_obd.c @@ -3730,8 +3730,7 @@ lmv_get_fid_from_lsm(struct obd_export *exp, { const struct lmv_oinfo *oinfo; - if (!lmv_dir_striped(lsm)) - return -ESTALE; + LASSERT(lmv_dir_striped(lsm)); oinfo = lsm_name_to_stripe_info(lsm, name, namelen, false); if (IS_ERR(oinfo))