From patchwork Mon Apr 17 13:46:57 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: James Simmons
X-Patchwork-Id: 13214051
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from pdx1-mailman-customer002.dreamhost.com
	(listserver-buz.dreamhost.com [69.163.136.29])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.lore.kernel.org (Postfix) with ESMTPS id B1233C77B76
	for ; Mon, 17 Apr 2023 13:51:10 +0000 (UTC)
Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1])
	by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4Q0T0G2rnTz1yFv;
	Mon, 17 Apr 2023 06:48:14 -0700 (PDT)
Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40])
	(using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4Q0SzM11Qwz1y5y
	for ; Mon, 17 Apr 2023 06:47:27 -0700 (PDT)
Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134])
	by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 5B2AF1005F8A;
	Mon, 17 Apr 2023 09:47:24 -0400 (EDT)
Received: by star.ccs.ornl.gov (Postfix, from userid 2004)
	id 4DB8C372; Mon, 17 Apr 2023 09:47:24 -0400 (EDT)
From: James Simmons
To: Andreas Dilger , Oleg Drokin , NeilBrown
Date: Mon, 17 Apr 2023 09:46:57 -0400
Message-Id: <1681739243-29375-2-git-send-email-jsimmons@infradead.org>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1681739243-29375-1-git-send-email-jsimmons@infradead.org>
References: <1681739243-29375-1-git-send-email-jsimmons@infradead.org>
Subject: [lustre-devel] [PATCH 01/27] lustre: llite: fix the wrong beyond read end calculation
X-BeenThere: lustre-devel@lists.lustre.org
X-Mailman-Version: 2.1.39
Precedence: list
List-Id: "For discussing Lustre software development."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: Lustre Development List
MIME-Version: 1.0
Errors-To: lustre-devel-bounces@lists.lustre.org
Sender: "lustre-devel"

From: Qian Yingjin

During testing, we found an endless loop in the read path, which keeps
returning AOP_TRUNCATED_PAGE (0x80001).

The cause is that the calculation of the offset just beyond the end of
the read, (iter->count + iocb->ki_pos), is wrong. That offset is
supposed to stay unchanged throughout the per-page read loop in
buffered I/O mode. However, @iter->count is decreased by the number of
bytes read each time the read of a page finishes:
@iter->count -= read_bytes.

This patch fixes the bug by storing the page index just beyond the end
of the read in @lcc->lcc_end_index before calling
@generic_file_read_iter, which loops over each page to be read.

Fixes: c9f68ebdc6 ("lustre: llite: check read page past requested")
WC-bug-id: https://jira.whamcloud.com/browse/LU-16579
Lustre-commit: ae356dc325877bd13 ("LU-16579 llite: fix the wrong beyond read end calculation")
Signed-off-by: Qian Yingjin
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50065
Reviewed-by: Andreas Dilger
Reviewed-by: Patrick Farrell
Reviewed-by: Oleg Drokin
Signed-off-by: James Simmons
---
 fs/lustre/llite/llite_internal.h |  3 +--
 fs/lustre/llite/rw.c             | 11 +++--------
 fs/lustre/llite/vvp_io.c         |  6 ++----
 3 files changed, 6 insertions(+), 14 deletions(-)

diff --git a/fs/lustre/llite/llite_internal.h b/fs/lustre/llite/llite_internal.h
index 72de8f7..d8eee75 100644
--- a/fs/lustre/llite/llite_internal.h
+++ b/fs/lustre/llite/llite_internal.h
@@ -1375,8 +1375,7 @@ struct ll_cl_context {
 	struct cl_io *lcc_io;
 	struct cl_page *lcc_page;
 	enum lcc_type lcc_type;
-	struct kiocb *lcc_iocb;
-	struct iov_iter *lcc_iter;
+	pgoff_t lcc_end_index;
 };
 
 struct ll_thread_info {
diff --git a/fs/lustre/llite/rw.c b/fs/lustre/llite/rw.c
index d285ae1..0c73258 100644
--- a/fs/lustre/llite/rw.c
+++ b/fs/lustre/llite/rw.c
@@ -1863,9 +1863,7 @@ int ll_readpage(struct file *file, struct page *vmpage)
 	struct cl_read_ahead ra = { 0 };
 	struct ll_cl_context *lcc;
 	struct cl_io *io = NULL;
-	struct iov_iter *iter;
 	struct cl_page *page;
-	struct kiocb *iocb;
 	int result;
 
 	if (OBD_FAIL_PRECHECK(OBD_FAIL_LLITE_READPAGE_PAUSE)) {
@@ -1974,11 +1972,8 @@ int ll_readpage(struct file *file, struct page *vmpage)
 	}
 
 	if (lcc && lcc->lcc_type != LCC_MMAP) {
-		iocb = lcc->lcc_iocb;
-		iter = lcc->lcc_iter;
-
-		CDEBUG(D_VFSTRACE, "pgno:%ld, cnt:%ld, pos:%lld\n",
-		       vmpage->index, iter->count, iocb->ki_pos);
+		CDEBUG(D_VFSTRACE, "pgno:%ld, beyond read end_index:%ld\n",
+		       vmpage->index, lcc->lcc_end_index);
 
 		/*
 		 * This handles a kernel bug introduced in kernel 5.12:
@@ -2004,7 +1999,7 @@ int ll_readpage(struct file *file, struct page *vmpage)
 		 * This should never occur except in kernels with the bug
 		 * mentioned above.
 		 */
-		if (cl_offset(clob, vmpage->index) >= iter->count + iocb->ki_pos) {
+		if (vmpage->index >= lcc->lcc_end_index) {
 			result = cl_io_read_ahead(env, io, vmpage->index, &ra);
 			if (result < 0 || vmpage->index > ra.cra_end_idx) {
 				cl_read_ahead_release(env, &ra);
diff --git a/fs/lustre/llite/vvp_io.c b/fs/lustre/llite/vvp_io.c
index 561ce66..50c2872 100644
--- a/fs/lustre/llite/vvp_io.c
+++ b/fs/lustre/llite/vvp_io.c
@@ -871,10 +871,8 @@ static int vvp_io_read_start(const struct lu_env *env,
 	iter = *vio->vui_iter;
 
 	lcc = ll_cl_find(inode);
-	lcc->lcc_iter = &iter;
-	lcc->lcc_iocb = vio->vui_iocb;
-	CDEBUG(D_VFSTRACE, "cnt:%ld,iocb pos:%lld\n", lcc->lcc_iter->count,
-	       lcc->lcc_iocb->ki_pos);
+	lcc->lcc_end_index = DIV_ROUND_UP(pos + iter.count, PAGE_SIZE);
+	CDEBUG(D_VFSTRACE, "count:%ld iocb pos:%lld\n", iter.count, pos);
 
 	result = generic_file_read_iter(vio->vui_iocb, &iter);
 out:

From patchwork Mon Apr 17 13:46:58 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: James Simmons
X-Patchwork-Id: 13214037
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from pdx1-mailman-customer002.dreamhost.com
	(listserver-buz.dreamhost.com [69.163.136.29])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.lore.kernel.org (Postfix) with ESMTPS id DA6A1C77B76
	for ; Mon, 17 Apr 2023 13:47:56 +0000 (UTC)
Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1])
	by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4Q0SzY2p26z1yCS;
	Mon, 17 Apr 2023 06:47:37 -0700 (PDT)
Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40])
	(using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4Q0SzN1m7Cz1y6h
	for ; Mon, 17 Apr 2023 06:47:28 -0700 (PDT)
Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134])
	by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 5DB401005F9F;
	Mon, 17 Apr 2023 09:47:24 -0400 (EDT)
Received: by star.ccs.ornl.gov (Postfix, from userid 2004)
	id 5097B375; Mon, 17 Apr 2023 09:47:24 -0400 (EDT)
From: James Simmons
To: Andreas Dilger , Oleg Drokin , NeilBrown
Date: Mon, 17 Apr 2023 09:46:58 -0400
Message-Id: <1681739243-29375-3-git-send-email-jsimmons@infradead.org>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1681739243-29375-1-git-send-email-jsimmons@infradead.org>
References: <1681739243-29375-1-git-send-email-jsimmons@infradead.org>
Subject: [lustre-devel] [PATCH 02/27] lustre: lov: continue fsync on other OST objs even on -ENOENT
X-BeenThere: lustre-devel@lists.lustre.org
X-Mailman-Version: 2.1.39
Precedence: list
List-Id: "For discussing Lustre software development."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: Lustre Development List
MIME-Version: 1.0
Errors-To: lustre-devel-bounces@lists.lustre.org
Sender: "lustre-devel"

From: Bobi Jam

When fsync races with truncate, continue with the fsync of the other
OST objects even if the fsync of some stripe returns -ENOENT, so that
the client can still potentially discard cached pages by calling
osc_io_fsync_start()->osc_cache_writeback_range().

WC-bug-id: https://jira.whamcloud.com/browse/LU-16263
Lustre-commit: 927b5cd49c3369d53 ("LU-16263 lov: continue fsync on other OST objs even on -ENOENT")
Signed-off-by: Bobi Jam
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50005
Reviewed-by: Andreas Dilger
Reviewed-by: Oleg Drokin
Reviewed-by: Alex Zhuravlev
Signed-off-by: James Simmons
---
 fs/lustre/lov/lov_io.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/fs/lustre/lov/lov_io.c b/fs/lustre/lov/lov_io.c
index 32f028b..4c842cd 100644
--- a/fs/lustre/lov/lov_io.c
+++ b/fs/lustre/lov/lov_io.c
@@ -1013,8 +1013,16 @@ static int lov_io_call(const struct lu_env *env, struct lov_io *lio,
 
 	list_for_each_entry(sub, &lio->lis_active, sub_linkage) {
 		rc = iofunc(sub->sub_env, &sub->sub_io);
-		if (rc)
+		if (rc) {
+			/**
+			 * fsync race with truncate, we'd continue to other
+			 * OST object's fsync to potentially discard
+			 * caching pages (osc_cache_writeback_range).
+			 */
+			if (rc == -ENOENT && parent->ci_type == CIT_FSYNC)
+				continue;
 			break;
+		}
 
 		if (parent->ci_result == 0)
 			parent->ci_result = sub->sub_io.ci_result;

From patchwork Mon Apr 17 13:46:59 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: James Simmons
X-Patchwork-Id: 13214053
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from pdx1-mailman-customer002.dreamhost.com
	(listserver-buz.dreamhost.com [69.163.136.29])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.lore.kernel.org (Postfix) with ESMTPS id ACA70C77B76
	for ; Mon, 17 Apr 2023 13:53:28 +0000 (UTC)
Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1])
	by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4Q0T0w0Ldrz216B;
	Mon, 17 Apr 2023 06:48:48 -0700 (PDT)
Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40])
	(using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4Q0SzP28bRz1y5y
	for ; Mon, 17 Apr 2023 06:47:29 -0700 (PDT)
Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134])
	by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 5FD2D1005FA1;
	Mon, 17 Apr 2023 09:47:24 -0400 (EDT)
Received: by star.ccs.ornl.gov (Postfix, from userid 2004)
	id 56BB1379; Mon, 17 Apr 2023 09:47:24 -0400 (EDT)
From: James Simmons
To: Andreas Dilger , Oleg Drokin , NeilBrown
Date: Mon, 17 Apr 2023 09:46:59 -0400
Message-Id: <1681739243-29375-4-git-send-email-jsimmons@infradead.org>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1681739243-29375-1-git-send-email-jsimmons@infradead.org>
References: <1681739243-29375-1-git-send-email-jsimmons@infradead.org>
Subject: [lustre-devel] [PATCH 03/27] lustre: llite: protect cp_state with vmpage lock
X-BeenThere: lustre-devel@lists.lustre.org
X-Mailman-Version: 2.1.39
Precedence: list
List-Id: "For discussing Lustre software development."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: Lustre Development List
MIME-Version: 1.0
Errors-To: lustre-devel-bounces@lists.lustre.org
Sender: "lustre-devel"

From: Bobi Jam

cl_page_make_ready() calls cl_page_io_start() without vmpage lock
protection, and that could mess up cl_page's cp_state/cp_owner.

WC-bug-id: https://jira.whamcloud.com/browse/LU-16612
Lustre-commit: d03b038d0dd8360dc ("LU-16612 llite: protect cp_state with vmpage lock")
Signed-off-by: Bobi Jam
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50180
Reviewed-by: Patrick Farrell
Reviewed-by: Andreas Dilger
Reviewed-by: Oleg Drokin
Signed-off-by: James Simmons
---
 fs/lustre/obdclass/cl_page.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/lustre/obdclass/cl_page.c b/fs/lustre/obdclass/cl_page.c
index 8320293..80423b7 100644
--- a/fs/lustre/obdclass/cl_page.c
+++ b/fs/lustre/obdclass/cl_page.c
@@ -871,6 +871,7 @@ int cl_page_make_ready(const struct lu_env *env, struct cl_page *cp,
 		       enum cl_req_type crt)
 {
 	struct page *vmpage = cp->cp_vmpage;
+	bool unlock = false;
 	int rc = 0;
 
 	PASSERT(env, cp, crt == CRT_WRITE);
@@ -879,6 +880,7 @@ int cl_page_make_ready(const struct lu_env *env, struct cl_page *cp,
 		goto out;
 
 	lock_page(vmpage);
+	unlock = true;
 
 	if (clear_page_dirty_for_io(vmpage)) {
 		LASSERT(cp->cp_state == CPS_CACHED);
@@ -899,13 +901,15 @@ int cl_page_make_ready(const struct lu_env *env, struct cl_page *cp,
 		LBUG();
 	}
 
-	unlock_page(vmpage);
 out:
 	if (rc == 0) {
 		PASSERT(env, cp, cp->cp_state == CPS_CACHED);
 		cl_page_io_start(env, cp, crt);
 	}
 
+	if (unlock)
+		unlock_page(vmpage);
+
 	CL_PAGE_HEADER(D_TRACE, env, cp, "%d %d\n", crt, rc);
 
 	return rc;

From patchwork Mon Apr 17 13:47:00 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: James Simmons
X-Patchwork-Id: 13214050
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from pdx1-mailman-customer002.dreamhost.com
	(listserver-buz.dreamhost.com [69.163.136.29])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.lore.kernel.org (Postfix) with ESMTPS id 74D8CC77B76
	for ; Mon, 17 Apr 2023 13:50:47 +0000 (UTC)
Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1])
	by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4Q0T065jQYz1yFh;
	Mon, 17 Apr 2023 06:48:06 -0700 (PDT)
Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40])
	(using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4Q0SzQ1Tg1z1y6h
	for ; Mon, 17 Apr 2023 06:47:30 -0700 (PDT)
Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134])
	by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 649311005FA2;
	Mon, 17 Apr 2023 09:47:24 -0400 (EDT)
Received: by star.ccs.ornl.gov (Postfix, from userid 2004)
	id 5B1C337B; Mon, 17 Apr 2023 09:47:24 -0400 (EDT)
From: James Simmons
To: Andreas Dilger , Oleg Drokin , NeilBrown
Date: Mon, 17 Apr 2023 09:47:00 -0400
Message-Id: <1681739243-29375-5-git-send-email-jsimmons@infradead.org>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1681739243-29375-1-git-send-email-jsimmons@infradead.org>
References: <1681739243-29375-1-git-send-email-jsimmons@infradead.org>
Subject: [lustre-devel] [PATCH 04/27] lustre: llite: restart clio for AIO if necessary
X-BeenThere: lustre-devel@lists.lustre.org
X-Mailman-Version: 2.1.39
Precedence: list
List-Id: "For discussing Lustre software development."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: Li Dongyang , Lustre Development List
MIME-Version: 1.0
Errors-To: lustre-devel-bounces@lists.lustre.org
Sender: "lustre-devel"

From: Li Dongyang

If the clio needs to be restarted from where it left off, do it for
AIO as well, so we don't end up with short IO.
Limit the number of retries to 1000, to avoid potential issues if the
loop gets stuck forever.

WC-bug-id: https://jira.whamcloud.com/browse/LU-14760
Lustre-commit: 6b1e747ad5bf02915 ("LU-14760 llite: restart clio for AIO if necessary")
Signed-off-by: Li Dongyang
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/43995
Reviewed-by: Oleg Drokin
Reviewed-by: Andreas Dilger
Reviewed-by: Patrick Farrell
Signed-off-by: James Simmons
---
 fs/lustre/llite/file.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/fs/lustre/llite/file.c b/fs/lustre/llite/file.c
index 746c18f..b96efb1 100644
--- a/fs/lustre/llite/file.c
+++ b/fs/lustre/llite/file.c
@@ -1689,6 +1689,7 @@ static void ll_heat_add(struct inode *inode, enum cl_io_type iot,
 	ssize_t result = 0;
 	int rc = 0;
 	int rc2 = 0;
+	int retries = 1000;
 	unsigned int retried = 0;
 	unsigned int dio_lock = 0;
 	bool is_aio = false;
@@ -1851,13 +1852,13 @@ static void ll_heat_add(struct inode *inode, enum cl_io_type iot,
 	       file->f_path.dentry->d_name.name,
 	       iot, rc, result, io->ci_need_restart);
 
-	if ((!rc || rc == -ENODATA || rc == -ENOLCK) &&
-	    count > 0 && io->ci_need_restart) {
+	if ((!rc || rc == -ENODATA || rc == -ENOLCK || rc == -EIOCBQUEUED) &&
+	    count > 0 && io->ci_need_restart && retries-- > 0) {
 		CDEBUG(D_VFSTRACE,
-		       "%s: restart %s from %lld, count:%zu, result: %zd\n",
+		       "%s: restart %s from ppos=%lld count=%zu retries=%u ret=%zd: rc = %d\n",
 		       file_dentry(file)->d_name.name,
 		       iot == CIT_READ ? "read" : "write",
-		       *ppos, count, result);
+		       *ppos, count, retries, result, rc);
 		/* preserve the tried count for FLR */
 		retried = io->ci_ndelay_tried;
 		dio_lock = io->ci_dio_lock;

From patchwork Mon Apr 17 13:47:01 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: James Simmons
X-Patchwork-Id: 13214052
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from pdx1-mailman-customer002.dreamhost.com
	(listserver-buz.dreamhost.com [69.163.136.29])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.lore.kernel.org (Postfix) with ESMTPS id CF89DC77B70
	for ; Mon, 17 Apr 2023 13:53:16 +0000 (UTC)
Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1])
	by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4Q0T0r0sbLz215p;
	Mon, 17 Apr 2023 06:48:44 -0700 (PDT)
Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40])
	(using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4Q0SzR3fjYz1y7W
	for ; Mon, 17 Apr 2023 06:47:31 -0700 (PDT)
Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134])
	by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 66FA51005FA3;
	Mon, 17 Apr 2023 09:47:24 -0400 (EDT)
Received: by star.ccs.ornl.gov (Postfix, from userid 2004)
	id 60DF137C; Mon, 17 Apr 2023 09:47:24 -0400 (EDT)
From: James Simmons
To: Andreas Dilger , Oleg Drokin , NeilBrown
Date: Mon, 17 Apr 2023 09:47:01 -0400
Message-Id: <1681739243-29375-6-git-send-email-jsimmons@infradead.org>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1681739243-29375-1-git-send-email-jsimmons@infradead.org>
References: <1681739243-29375-1-git-send-email-jsimmons@infradead.org>
Subject: [lustre-devel] [PATCH 05/27] lustre: protocol: add OBD_BRW_COMPRESSED
X-BeenThere: lustre-devel@lists.lustre.org
X-Mailman-Version: 2.1.39
Precedence: list
List-Id: "For discussing Lustre software development."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: Lustre Development List
MIME-Version: 1.0
Errors-To: lustre-devel-bounces@lists.lustre.org
Sender: "lustre-devel"

From: Alex Zhuravlev

Add the OBD_BRW_COMPRESSED flag so that the client can hint to the OST
that the data is compressed.

WC-bug-id: https://jira.whamcloud.com/browse/LU-16603
Lustre-commit: 764e19186cfc99123 ("LU-16603 protocol: add OBD_BRW_COMPRESSED")
Signed-off-by: Alex Zhuravlev
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50154
Reviewed-by: Artem Blagodarenko
Reviewed-by: Oleg Drokin
Reviewed-by: Andreas Dilger
Signed-off-by: James Simmons
---
 fs/lustre/ptlrpc/wiretest.c            | 2 ++
 include/uapi/linux/lustre/lustre_idl.h | 1 +
 2 files changed, 3 insertions(+)

diff --git a/fs/lustre/ptlrpc/wiretest.c b/fs/lustre/ptlrpc/wiretest.c
index 472d155c..6e893f0 100644
--- a/fs/lustre/ptlrpc/wiretest.c
+++ b/fs/lustre/ptlrpc/wiretest.c
@@ -2085,6 +2085,8 @@ void lustre_assert_wire_constants(void)
 		 OBD_BRW_RDMA_ONLY);
 	LASSERTF(OBD_BRW_SYS_RESOURCE == 0x40000, "found 0x%.8x\n",
 		 OBD_BRW_SYS_RESOURCE);
+	LASSERTF(OBD_BRW_COMPRESSED == 0x80000, "found 0x%.8x\n",
+		 OBD_BRW_COMPRESSED);
 
 	/* Checks for struct ost_body */
 	LASSERTF((int)sizeof(struct ost_body) == 208, "found %lld\n",
diff --git a/include/uapi/linux/lustre/lustre_idl.h b/include/uapi/linux/lustre/lustre_idl.h
index c979e24..83c8ea8 100644
--- a/include/uapi/linux/lustre/lustre_idl.h
+++ b/include/uapi/linux/lustre/lustre_idl.h
@@ -1253,6 +1253,7 @@ struct hsm_state_set {
 #define OBD_BRW_ROOT_PRJQUOTA 0x10000 /* check project quota for root */
 #define OBD_BRW_RDMA_ONLY 0x20000 /* RPC contains RDMA-only pages*/
 #define OBD_BRW_SYS_RESOURCE 0x40000 /* page has CAP_SYS_RESOURCE */
+#define OBD_BRW_COMPRESSED 0x80000 /* data compressed on client */
 
 #define OBD_MAX_GRANT 0x7fffffffUL /* Max grant allowed to one client: 2 GiB */
 
From patchwork Mon Apr 17 13:47:02 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: James Simmons
X-Patchwork-Id: 13214064
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from pdx1-mailman-customer002.dreamhost.com
	(listserver-buz.dreamhost.com [69.163.136.29])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.lore.kernel.org (Postfix) with ESMTPS id D2483C77B76
	for ; Mon, 17 Apr 2023 13:54:49 +0000 (UTC)
Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1])
	by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4Q0T1L3crmz21BM;
	Mon, 17 Apr 2023 06:49:10 -0700 (PDT)
Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40])
	(using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4Q0SzS6Bbtz1y87
	for ; Mon, 17 Apr 2023 06:47:32 -0700 (PDT)
Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134])
	by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 686A11005FA6;
	Mon, 17 Apr 2023 09:47:24 -0400 (EDT)
Received: by star.ccs.ornl.gov (Postfix, from userid 2004)
	id 65E9B37D; Mon, 17 Apr 2023 09:47:24 -0400 (EDT)
From: James Simmons
To: Andreas Dilger , Oleg Drokin , NeilBrown
Date: Mon, 17 Apr 2023 09:47:02 -0400
Message-Id: <1681739243-29375-7-git-send-email-jsimmons@infradead.org>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1681739243-29375-1-git-send-email-jsimmons@infradead.org>
References: <1681739243-29375-1-git-send-email-jsimmons@infradead.org>
Subject: [lustre-devel] [PATCH 06/27] lustre: llite: call truncate_inode_pages() under inode lock
X-BeenThere: lustre-devel@lists.lustre.org
X-Mailman-Version: 2.1.39
Precedence: list
List-Id: "For discussing Lustre software development."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: Lustre Development List
MIME-Version: 1.0
Errors-To: lustre-devel-bounces@lists.lustre.org
Sender: "lustre-devel"

From: Bobi Jam

truncate_inode_pages() is required to be called under (and serialised
by) inode lock.

WC-bug-id: https://jira.whamcloud.com/browse/LU-16637
Lustre-commit: ef9be34478036db05 ("LU-16637 llite: call truncate_inode_pages() under inode lock")
Signed-off-by: Bobi Jam
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50284
Reviewed-by: Patrick Farrell
Reviewed-by: Andreas Dilger
Reviewed-by: Oleg Drokin
Signed-off-by: James Simmons
---
 fs/lustre/llite/llite_internal.h |  2 +-
 fs/lustre/llite/llite_lib.c      | 12 ++++++++----
 fs/lustre/llite/vvp_object.c     |  9 ++++++++-
 3 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/fs/lustre/llite/llite_internal.h b/fs/lustre/llite/llite_internal.h
index d8eee75..fdc0f89 100644
--- a/fs/lustre/llite/llite_internal.h
+++ b/fs/lustre/llite/llite_internal.h
@@ -1287,7 +1287,7 @@ int ll_statfs_internal(struct ll_sb_info *sbi, struct obd_statfs *osfs,
 void ll_update_inode_flags(struct inode *inode, unsigned int ext_flags);
 void ll_update_dir_depth(struct inode *dir, struct inode *inode);
 int ll_read_inode2(struct inode *inode, void *opaque);
-void ll_truncate_inode_pages_final(struct inode *inode);
+void ll_truncate_inode_pages_final(struct inode *inode, struct cl_io *io);
 void ll_delete_inode(struct inode *inode);
 int ll_iocontrol(struct inode *inode, struct file *file,
 		 unsigned int cmd, unsigned long arg);
diff --git a/fs/lustre/llite/llite_lib.c b/fs/lustre/llite/llite_lib.c
index 5a9bc61..049cd23 100644
--- a/fs/lustre/llite/llite_lib.c
+++ b/fs/lustre/llite/llite_lib.c
@@ -2755,12 +2755,15 @@ void ll_update_dir_depth(struct inode *dir, struct inode *inode)
 	       PFID(&lli->lli_fid), lli->lli_dir_depth, lli->lli_inherit_depth);
 }
 
-void ll_truncate_inode_pages_final(struct inode *inode)
+void ll_truncate_inode_pages_final(struct inode *inode, struct cl_io *io)
 {
 	struct address_space *mapping = &inode->i_data;
 	unsigned long nrpages;
 	unsigned long flags;
 
+	LASSERTF(io == NULL || inode_is_locked(inode), "io %p (type %d)\n",
+		 io, io ? io->ci_type : 0);
+
 	truncate_inode_pages_final(mapping);
 
 	/* Workaround for LU-118: Note nrpages may not be totally updated when
@@ -2777,9 +2780,10 @@ void ll_truncate_inode_pages_final(struct inode *inode)
 		xa_unlock_irqrestore(&mapping->i_pages, flags);
 	} /* Workaround end */
 
-	LASSERTF(nrpages == 0, "%s: inode="DFID"(%p) nrpages=%lu, see https://jira.whamcloud.com/browse/LU-118\n",
+	LASSERTF(nrpages == 0, "%s: inode="DFID"(%p) nrpages=%lu io %p (io_type %d), see https://jira.whamcloud.com/browse/LU-118\n",
 		 ll_i2sbi(inode)->ll_fsname,
-		 PFID(ll_inode2fid(inode)), inode, nrpages);
+		 PFID(ll_inode2fid(inode)), inode, nrpages,
+		 io, io ? io->ci_type : 0);
 }
 
 int ll_read_inode2(struct inode *inode, void *opaque)
@@ -2843,7 +2847,7 @@ void ll_delete_inode(struct inode *inode)
 				   CL_FSYNC_LOCAL : CL_FSYNC_DISCARD, 1);
 	}
 
-	ll_truncate_inode_pages_final(inode);
+	ll_truncate_inode_pages_final(inode, NULL);
 	ll_clear_inode(inode);
 	clear_inode(inode);
 }
diff --git a/fs/lustre/llite/vvp_object.c b/fs/lustre/llite/vvp_object.c
index 0ef055f..302f900 100644
--- a/fs/lustre/llite/vvp_object.c
+++ b/fs/lustre/llite/vvp_object.c
@@ -153,6 +153,7 @@ static int vvp_conf_set(const struct lu_env *env, struct cl_object *obj,
 
 static int vvp_prune(const struct lu_env *env, struct cl_object *obj)
 {
+	struct cl_io *io = vvp_env_io(env)->vui_cl.cis_io;
 	struct inode *inode = vvp_object_inode(obj);
 	int rc;
 
@@ -163,9 +164,15 @@ static int vvp_prune(const struct lu_env *env, struct cl_object *obj)
 		return rc;
 	}
 
-	ll_truncate_inode_pages_final(inode);
+	if (io != NULL)
+		inode_lock(inode);
+
+	ll_truncate_inode_pages_final(inode, io);
 	mapping_clear_exiting(inode->i_mapping);
 
+	if (io != NULL)
+		inode_unlock(inode);
+
 	return 0;
 }
 

From patchwork Mon Apr 17 13:47:03 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: James Simmons
X-Patchwork-Id: 13214063
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from pdx1-mailman-customer002.dreamhost.com
	(listserver-buz.dreamhost.com [69.163.136.29])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.lore.kernel.org (Postfix) with ESMTPS id 38ECEC77B76
	for ; Mon, 17 Apr 2023 13:54:42 +0000 (UTC)
Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1])
	by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4Q0T1J73wCz21B1;
	Mon, 17 Apr 2023 06:49:08 -0700 (PDT)
Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40])
	(using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4Q0SzW58BWz1yBk
	for ; Mon, 17 Apr 2023 06:47:35 -0700 (PDT)
Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134])
	by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 6D0011005FA7;
	Mon, 17 Apr 2023 09:47:24 -0400 (EDT)
Received: by star.ccs.ornl.gov (Postfix, from userid 2004)
	id 6B5A7372; Mon, 17 Apr 2023 09:47:24 -0400 (EDT)
From: James Simmons
To: Andreas Dilger , Oleg Drokin , NeilBrown
Date: Mon, 17 Apr 2023 09:47:03 -0400
Message-Id: <1681739243-29375-8-git-send-email-jsimmons@infradead.org>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1681739243-29375-1-git-send-email-jsimmons@infradead.org>
References: <1681739243-29375-1-git-send-email-jsimmons@infradead.org>
Subject: [lustre-devel] [PATCH 07/27] lustre: fid: reduce LUSTRE_DATA_SEQ_MAX_WIDTH
X-BeenThere: lustre-devel@lists.lustre.org
X-Mailman-Version: 2.1.39
Precedence: list
List-Id: "For discussing Lustre software development."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: Li Dongyang , Lustre Development List
MIME-Version: 1.0
Errors-To: lustre-devel-bounces@lists.lustre.org
Sender: "lustre-devel"

From: Li Dongyang

Reduce LUSTRE_DATA_SEQ_MAX_WIDTH from ~4B to ~32M to limit the number
of objects under the /O/[seq]/d[0..31] directories on OSTs. This keeps
the directories optimal for ldiskfs and avoids going into
largedir/3-level htree territory.

Check seq->lcs_width, which is a tunable set to
LUSTRE_DATA_SEQ_MAX_WIDTH by default, and allow values up to
IDIF_MAX_OID if a larger seq width is needed.

The seq will roll over when the seq width is exhausted; the default
width is LUSTRE_DATA_SEQ_MAX_WIDTH.
For seq >= FID_SEQ_NORMAL objects, the upper limit of the seq width is
OBIF_MAX_OID. For IDIF/MDT0 objects, the upper limit is IDIF_MAX_OID.
The seq FID_SEQ_OST_MDT0 will change to a normal seq after the
rollover.

WC-bug-id: https://jira.whamcloud.com/browse/LU-11912
Lustre-commit: 0ecb2a167c56ffff8 ("LU-11912 ofd: reduce LUSTRE_DATA_SEQ_MAX_WIDTH")
Signed-off-by: Li Dongyang
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/38424
Reviewed-by: Andreas Dilger
Reviewed-by: Sergey Cheremencev
Reviewed-by: Oleg Drokin
Signed-off-by: James Simmons
---
 fs/lustre/fid/lproc_fid.c      | 2 +-
 fs/lustre/include/lustre_fid.h | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/lustre/fid/lproc_fid.c b/fs/lustre/fid/lproc_fid.c
index 8f6a4a8..9ca2814 100644
--- a/fs/lustre/fid/lproc_fid.c
+++ b/fs/lustre/fid/lproc_fid.c
@@ -154,7 +154,7 @@ static ssize_t ldebugfs_fid_width_seq_write(struct file *file,
 
 	spin_lock(&seq->lcs_lock);
 	if (seq->lcs_type == LUSTRE_SEQ_DATA)
-		max = LUSTRE_DATA_SEQ_MAX_WIDTH;
+		max = IDIF_MAX_OID;
 	else
 		max = LUSTRE_METADATA_SEQ_MAX_WIDTH;
 
diff --git a/fs/lustre/include/lustre_fid.h b/fs/lustre/include/lustre_fid.h
index 88a6061..5ebe362 100644
--- a/fs/lustre/include/lustre_fid.h
+++ b/fs/lustre/include/lustre_fid.h
@@ -173,9 +173,9 @@ enum {
 	LUSTRE_METADATA_SEQ_MAX_WIDTH = 0x0000000000020000ULL,
 
 	/*
-	 * This is how many data FIDs could be allocated in one sequence(4B - 1)
+	 * This is how many data FIDs could be allocated in one sequence(32M - 1)
 	 */
-	LUSTRE_DATA_SEQ_MAX_WIDTH = 0x00000000FFFFFFFFULL,
+	LUSTRE_DATA_SEQ_MAX_WIDTH = 0x0000000001FFFFFFULL,
 
 	/*
 	 * How many sequences to allocate to a client at once.

From patchwork Mon Apr 17 13:47:04 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: James Simmons
X-Patchwork-Id: 13214066
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from pdx1-mailman-customer002.dreamhost.com
	(listserver-buz.dreamhost.com [69.163.136.29])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.lore.kernel.org (Postfix) with ESMTPS id C8135C77B76
	for ; Mon, 17 Apr 2023 13:56:13 +0000 (UTC)
Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1])
	by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4Q0T2634bYz21H2;
	Mon, 17 Apr 2023 06:49:50 -0700 (PDT)
Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40])
	(using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4Q0Szb2MjNz1yD0
	for ; Mon, 17 Apr 2023 06:47:39 -0700 (PDT)
Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134])
	by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 717FA1006C90;
	Mon, 17 Apr 2023 09:47:24 -0400 (EDT)
Received: by star.ccs.ornl.gov (Postfix, from userid 2004)
	id 702B9375; Mon, 17 Apr 2023 09:47:24 -0400 (EDT)
From: James Simmons
To: Andreas Dilger , Oleg Drokin , NeilBrown
Date: Mon, 17 Apr 2023 09:47:04 -0400
Message-Id: <1681739243-29375-9-git-send-email-jsimmons@infradead.org>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1681739243-29375-1-git-send-email-jsimmons@infradead.org>
References: <1681739243-29375-1-git-send-email-jsimmons@infradead.org>
Subject: [lustre-devel] [PATCH 08/27] lnet: handle multi-rail setups
X-BeenThere: lustre-devel@lists.lustre.org
X-Mailman-Version: 2.1.39
Precedence: list
List-Id: "For discussing Lustre software development."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: Lustre Development List
MIME-Version: 1.0
Errors-To: lustre-devel-bounces@lists.lustre.org
Sender: "lustre-devel"

For multi-rail setups, we can push more than one interface at a time
to set up the local NIs, but our netlink code ignored all but one
interface. Refactor both lnet_genl_parse_local_ni() and lnet_net_cmd()
to set up all the passed-in interfaces. Also stop setting ni to NULL
in the NI deletion case, which causes an oops when we have more than
one interface.

WC-bug-id: https://jira.whamcloud.com/browse/LU-9680
Lustre-commit: 6fab1fe4a5c5615d4 ("LU-9680 lnet: handle multi-rail setups")
Signed-off-by: James Simmons
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50026
Reviewed-by: Serguei Smirnov
Reviewed-by: Chris Horn
Reviewed-by: Cyril Bordage
Reviewed-by: Frank Sehr
Reviewed-by: Oleg Drokin
---
 fs/lustre/obdclass/kernelcomm.c |   7 +-
 net/lnet/lnet/api-ni.c          | 183 ++++++++++++++++++++++------------------
 net/lnet/selftest/conctl.c      |   7 +-
 3 files changed, 111 insertions(+), 86 deletions(-)

diff --git a/fs/lustre/obdclass/kernelcomm.c b/fs/lustre/obdclass/kernelcomm.c
index 5682d4e..0cf3c44 100644
--- a/fs/lustre/obdclass/kernelcomm.c
+++ b/fs/lustre/obdclass/kernelcomm.c
@@ -116,7 +116,12 @@ static int lustre_device_list_start(struct netlink_callback *cb)
 		struct nlattr *dev;
 		int rem;
 
-		nla_for_each_attr(dev, params, msg_len, rem) {
+		if (!(nla_type(params) & LN_SCALAR_ATTR_LIST)) {
+			NL_SET_ERR_MSG(extack, "no configuration");
+			goto report_err;
+		}
+
+		nla_for_each_nested(dev, params, rem) {
 			struct nlattr *prop;
 			int rem2;
 
diff --git
a/net/lnet/lnet/api-ni.c b/net/lnet/lnet/api-ni.c index 9095d4e..8b0ab53 100644 --- a/net/lnet/lnet/api-ni.c +++ b/net/lnet/lnet/api-ni.c @@ -4695,7 +4695,12 @@ static int lnet_net_show_start(struct netlink_callback *cb) return 0; params = genlmsg_data(gnlh); - nla_for_each_attr(top, params, msg_len, rem) { + if (!(nla_type(params) & LN_SCALAR_ATTR_LIST)) { + NL_SET_ERR_MSG(extack, "invalid configuration"); + return -EINVAL; + } + + nla_for_each_nested(top, params, rem) { struct nlattr *net; int rem2; @@ -4703,7 +4708,7 @@ static int lnet_net_show_start(struct netlink_callback *cb) char filter[LNET_NIDSTR_SIZE]; if (nla_type(net) != LN_SCALAR_ATTR_VALUE || - nla_strcmp(net, "name") != 0) + nla_strcmp(net, "net type") != 0) continue; net = nla_next(net, &rem2); @@ -4931,12 +4936,21 @@ static int lnet_genl_parse_lnd_tunables(struct nlattr *settings, static int lnet_genl_parse_local_ni(struct nlattr *entry, struct genl_info *info, int net_id, struct lnet_ioctl_config_ni *conf, - struct lnet_ioctl_config_lnd_tunables *tun, bool *ni_list) { + bool create = info->nlhdr->nlmsg_flags & NLM_F_CREATE; + struct lnet_ioctl_config_lnd_tunables tun; struct nlattr *settings; int rem3, rc = 0; + memset(&tun, 0, sizeof(tun)); + /* Use LND defaults */ + tun.lt_cmn.lct_peer_timeout = -1; + tun.lt_cmn.lct_peer_tx_credits = -1; + tun.lt_cmn.lct_peer_rtr_credits = -1; + tun.lt_cmn.lct_max_tx_credits = -1; + conf->lic_ncpts = 0; + nla_for_each_nested(settings, entry, rem3) { if (nla_type(settings) != LN_SCALAR_ATTR_VALUE) continue; @@ -4983,7 +4997,7 @@ static int lnet_genl_parse_lnd_tunables(struct nlattr *settings, goto out; } - rc = lnet_genl_parse_tunables(settings, tun); + rc = lnet_genl_parse_tunables(settings, &tun); if (rc < 0) { GENL_SET_ERR_MSG(info, "failed to parse tunables"); @@ -5010,7 +5024,7 @@ static int lnet_genl_parse_lnd_tunables(struct nlattr *settings, } rc = lnet_genl_parse_lnd_tunables(settings, - &tun->lt_tun, lnd); + &tun.lt_tun, lnd); if (rc < 0) { 
GENL_SET_ERR_MSG(info, "failed to parse lnd tunables"); @@ -5052,6 +5066,73 @@ static int lnet_genl_parse_lnd_tunables(struct nlattr *settings, } } } + + if (!create) { + struct lnet_net *net; + struct lnet_ni *ni; + + rc = -ENODEV; + if (!strlen(conf->lic_ni_intf)) { + GENL_SET_ERR_MSG(info, + "interface is missing"); + goto out; + } + + lnet_net_lock(LNET_LOCK_EX); + net = lnet_get_net_locked(net_id); + if (!net) { + GENL_SET_ERR_MSG(info, + "LNet net doesn't exist"); + lnet_net_unlock(LNET_LOCK_EX); + goto out; + } + + list_for_each_entry(ni, &net->net_ni_list, + ni_netlist) { + if (!ni->ni_interface || + strcmp(ni->ni_interface, + conf->lic_ni_intf) != 0) + continue; + + lnet_net_unlock(LNET_LOCK_EX); + rc = lnet_dyn_del_ni(&ni->ni_nid); + if (rc < 0) { + GENL_SET_ERR_MSG(info, + "cannot del LNet NI"); + goto out; + } + break; + } + + if (rc < 0) { /* will be -ENODEV */ + GENL_SET_ERR_MSG(info, + "interface invalid for deleting LNet NI"); + lnet_net_unlock(LNET_LOCK_EX); + } + } else { + if (!strlen(conf->lic_ni_intf)) { + GENL_SET_ERR_MSG(info, + "interface is missing"); + goto out; + } + + rc = lnet_dyn_add_ni(conf, net_id, &tun); + switch (rc) { + case -ENOENT: + GENL_SET_ERR_MSG(info, + "cannot parse net"); + break; + case -ERANGE: + GENL_SET_ERR_MSG(info, + "invalid CPT set"); + break; + default: + GENL_SET_ERR_MSG(info, + "cannot add LNet NI"); + case 0: + break; + } + } out: return rc; } @@ -5070,7 +5151,12 @@ static int lnet_net_cmd(struct sk_buff *skb, struct genl_info *info) return -ENOMSG; } - nla_for_each_attr(attr, params, msg_len, rem) { + if (!(nla_type(params) & LN_SCALAR_ATTR_LIST)) { + GENL_SET_ERR_MSG(info, "invalid configuration"); + return -EINVAL; + } + + nla_for_each_nested(attr, params, rem) { struct lnet_ioctl_config_ni conf; u32 net_id = LNET_NET_ANY; struct nlattr *entry; @@ -5149,85 +5235,13 @@ static int lnet_net_cmd(struct sk_buff *skb, struct genl_info *info) break; } case LN_SCALAR_ATTR_LIST: { - bool create = 
info->nlhdr->nlmsg_flags & - NLM_F_CREATE; - struct lnet_ioctl_config_lnd_tunables tun; - - memset(&tun, 0, sizeof(tun)); - /* Use LND defaults */ - tun.lt_cmn.lct_peer_timeout = -1; - tun.lt_cmn.lct_peer_tx_credits = -1; - tun.lt_cmn.lct_peer_rtr_credits = -1; - tun.lt_cmn.lct_max_tx_credits = -1; - conf.lic_ncpts = 0; - - rc = lnet_genl_parse_local_ni(entry, info, - net_id, &conf, - &tun, &ni_list); - if (rc < 0) - goto out; + struct nlattr *interface; + int rem3; - if (!create) { - struct lnet_net *net; - struct lnet_ni *ni; - - rc = -ENODEV; - if (!strlen(conf.lic_ni_intf)) { - GENL_SET_ERR_MSG(info, - "interface is missing"); - goto out; - } - - lnet_net_lock(LNET_LOCK_EX); - net = lnet_get_net_locked(net_id); - if (!net) { - GENL_SET_ERR_MSG(info, - "LNet net doesn't exist"); - lnet_net_unlock(LNET_LOCK_EX); - goto out; - } - list_for_each_entry(ni, &net->net_ni_list, - ni_netlist) { - if (!ni->ni_interface || - strncmp(ni->ni_interface, - conf.lic_ni_intf, - strlen(conf.lic_ni_intf)) != 0) { - ni = NULL; - continue; - } - - lnet_net_unlock(LNET_LOCK_EX); - rc = lnet_dyn_del_ni(&ni->ni_nid); - if (rc < 0) { - GENL_SET_ERR_MSG(info, - "cannot del LNet NI"); - goto out; - } - break; - } - - if (rc < 0) { /* will be -ENODEV */ - GENL_SET_ERR_MSG(info, - "interface invalid for deleting LNet NI"); - lnet_net_unlock(LNET_LOCK_EX); - } - } else { - rc = lnet_dyn_add_ni(&conf, net_id, &tun); - switch (rc) { - case -ENOENT: - GENL_SET_ERR_MSG(info, - "cannot parse net"); - break; - case -ERANGE: - GENL_SET_ERR_MSG(info, - "invalid CPT set"); - fallthrough; - default: - GENL_SET_ERR_MSG(info, - "cannot add LNet NI"); - case 0: - break; - } + nla_for_each_nested(interface, entry, rem3) { + rc = lnet_genl_parse_local_ni(interface, info, + net_id, &conf, + &ni_list); if (rc < 0) goto out; } @@ -5593,6 +5607,7 @@ static int lnet_ping_show_dump(struct sk_buff *msg, static const struct genl_ops lnet_genl_ops[] = { { .cmd = LNET_CMD_NETS, + .flags = GENL_ADMIN_PERM, .start = 
lnet_net_show_start, .dumpit = lnet_net_show_dump, .done = lnet_net_show_done, diff --git a/net/lnet/selftest/conctl.c b/net/lnet/selftest/conctl.c index ea590b2..a7ec0d5 100644 --- a/net/lnet/selftest/conctl.c +++ b/net/lnet/selftest/conctl.c @@ -1024,7 +1024,12 @@ static int lst_groups_show_start(struct netlink_callback *cb) } glist->lggl_verbose = true; - nla_for_each_attr(groups, params, msg_len, rem) { + if (!(nla_type(params) & LN_SCALAR_ATTR_LIST)) { + NL_SET_ERR_MSG(extack, "no configuration"); + goto report_err; + } + + nla_for_each_nested(groups, params, rem) { struct lst_genl_group_prop *prop = NULL; struct nlattr *group; int rem2; From patchwork Mon Apr 17 13:47:05 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 13214068 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from pdx1-mailman-customer002.dreamhost.com (listserver-buz.dreamhost.com [69.163.136.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 12CE8C77B76 for ; Mon, 17 Apr 2023 13:57:55 +0000 (UTC) Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1]) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4Q0T2q2m88z21JW; Mon, 17 Apr 2023 06:50:27 -0700 (PDT) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4Q0Szm2lzxz1y5y for ; Mon, 17 Apr 2023 06:47:48 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 7643B100526A; Mon, 17 Apr 2023 09:47:24 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 
74D1A379; Mon, 17 Apr 2023 09:47:24 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Mon, 17 Apr 2023 09:47:05 -0400 Message-Id: <1681739243-29375-10-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1681739243-29375-1-git-send-email-jsimmons@infradead.org> References: <1681739243-29375-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 09/27] lustre: readahead: clip readahead with kms X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.39 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Qian Yingjin During I/O testing, we found that the read-ahead pages reach 255 for small files of only several KiB. The amount of read data reaches more than 1MiB. The reason is that the granted DLM extent lock is [0, EOF], which is larger than the requested extent. During readahead, the OSC layer will also return a [0, EOF] extent, which is clipped to the stripe size (1MiB) regardless of the actual object size. In this patch, the readahead range is clipped to the known minimum size (kms) on the OSC layer during readahead. This way, the read-ahead data will not go beyond the last page of the file. This patch also fixes multiop to return successfully when reaching EOF instead of exiting with ENODATA during read.
WC-bug-id: https://jira.whamcloud.com/browse/LU-16338 Lustre-commit: b33808d3aebb06cf0 ("LU-16338 readahead: clip readahead with kms") Signed-off-by: Qian Yingjin Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49226 Reviewed-by: Andreas Dilger Reviewed-by: Patrick Farrell Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/osc/osc_io.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/fs/lustre/osc/osc_io.c b/fs/lustre/osc/osc_io.c index c9a3175..d0ee748 100644 --- a/fs/lustre/osc/osc_io.c +++ b/fs/lustre/osc/osc_io.c @@ -83,6 +83,8 @@ static int osc_io_read_ahead(const struct lu_env *env, oio->oi_is_readahead = true; dlmlock = osc_dlmlock_at_pgoff(env, osc, start, 0); if (dlmlock) { + struct lov_oinfo *oinfo = osc->oo_oinfo; + LASSERT(dlmlock->l_ast_data == osc); if (dlmlock->l_req_mode != LCK_PR) { struct lustre_handle lockh; @@ -100,6 +102,9 @@ static int osc_io_read_ahead(const struct lu_env *env, ra->cra_oio = oio; if (ra->cra_end_idx != CL_PAGE_EOF) ra->cra_contention = true; + ra->cra_end_idx = min_t(pgoff_t, ra->cra_end_idx, + cl_index(osc2cl(osc), + oinfo->loi_kms - 1)); result = 0; } From patchwork Mon Apr 17 13:47:06 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 13214107 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from pdx1-mailman-customer002.dreamhost.com (listserver-buz.dreamhost.com [69.163.136.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id DA9DDC77B76 for ; Mon, 17 Apr 2023 13:59:32 +0000 (UTC) Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1]) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4Q0T3P3dVbz21Nx; Mon, 17 Apr 2023 06:50:57 -0700 (PDT) Received: from 
smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4Q0Szv4h3Mz1yDy for ; Mon, 17 Apr 2023 06:47:55 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 7AE3B1008483; Mon, 17 Apr 2023 09:47:24 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 796B3372; Mon, 17 Apr 2023 09:47:24 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Mon, 17 Apr 2023 09:47:06 -0400 Message-Id: <1681739243-29375-11-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1681739243-29375-1-git-send-email-jsimmons@infradead.org> References: <1681739243-29375-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 10/27] lnet: use discovered ni status to set initial health X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.39 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Serguei Smirnov , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Serguei Smirnov If not routing, track local NI status in the ping buffer such that locally recognized "down" state, for example, due to a downed network interface/link, is available to any discovering peer. If NI 'fatal' status is changed, push update to peers. On the active side of discovery, check peer NI status so if NI is down, decrement its health score and queue for recovery. 
WC-bug-id: https://jira.whamcloud.com/browse/LU-16563 Lustre-commit: da230373bd14306cb ("LU-16563 lnet: use discovered ni status to set initial health") Signed-off-by: Serguei Smirnov Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50027 Reviewed-by: Chris Horn Reviewed-by: Cyril Bordage Reviewed-by: Frank Sehr Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- include/linux/lnet/lib-lnet.h | 3 ++- net/lnet/klnds/o2iblnd/o2iblnd.c | 51 ++++++++++++++++++++++++++++++---------- net/lnet/klnds/socklnd/socklnd.c | 38 +++++++++++++++++++++++------- net/lnet/lnet/api-ni.c | 20 ++++++++++++++++ net/lnet/lnet/peer.c | 14 +++++++++++ 5 files changed, 104 insertions(+), 22 deletions(-) diff --git a/include/linux/lnet/lib-lnet.h b/include/linux/lnet/lib-lnet.h index e26e150..f9f4815 100644 --- a/include/linux/lnet/lib-lnet.h +++ b/include/linux/lnet/lib-lnet.h @@ -127,7 +127,7 @@ return LNET_NI_STATUS_UP; else if (atomic_read(&ni->ni_fatal_error_on)) return LNET_NI_STATUS_DOWN; - else if (ni->ni_status) + else if (the_lnet.ln_routing && ni->ni_status) return *ni->ni_status; else return LNET_NI_STATUS_UP; @@ -1216,4 +1216,5 @@ void lnet_usr_translate_stats(struct lnet_ioctl_element_msg_stats *msg_stats, old ? "up" : "down", alive ? 
"up" : "down"); } +void lnet_update_ping_buffer(void); #endif diff --git a/net/lnet/klnds/o2iblnd/o2iblnd.c b/net/lnet/klnds/o2iblnd/o2iblnd.c index a7a3c79..fc59f88 100644 --- a/net/lnet/klnds/o2iblnd/o2iblnd.c +++ b/net/lnet/klnds/o2iblnd/o2iblnd.c @@ -2382,15 +2382,23 @@ static int kiblnd_port_get_attr(struct kib_hca_dev *hdev) static inline void kiblnd_set_ni_fatal_on(struct kib_hca_dev *hdev, int val) { - struct kib_net *net; + struct kib_net *net; + u32 ni_state_before; + bool update_ping_buf = false; /* for health check */ list_for_each_entry(net, &hdev->ibh_dev->ibd_nets, ibn_list) { if (val) CDEBUG(D_NETERROR, "Fatal device error for NI %s\n", libcfs_nidstr(&net->ibn_ni->ni_nid)); - atomic_set(&net->ibn_ni->ni_fatal_error_on, val); + ni_state_before = atomic_xchg(&net->ibn_ni->ni_fatal_error_on, + val); + if (!update_ping_buf && val != ni_state_before) + update_ping_buf = true; } + + if (update_ping_buf) + lnet_update_ping_buffer(); } void @@ -2748,6 +2756,8 @@ void kiblnd_destroy_dev(struct kib_dev *dev) bool link_down = !(operstate == IF_OPER_UP); struct in_device *in_dev; bool found_ip = false; + u32 ni_state_before; + bool update_ping_buf = false; const struct in_ifaddr *ifa; event_kibdev = kiblnd_dev_search(dev->name); @@ -2757,7 +2767,6 @@ void kiblnd_destroy_dev(struct kib_dev *dev) list_for_each_entry_safe(net, cnxt, &event_kibdev->ibd_nets, ibn_list) { found_ip = false; - ni = net->ibn_ni; in_dev = __in_dev_get_rtnl(dev); @@ -2766,8 +2775,9 @@ void kiblnd_destroy_dev(struct kib_dev *dev) dev->name); CDEBUG(D_NET, "%s: set link fatal state to 1\n", libcfs_nidstr(&net->ibn_ni->ni_nid)); - atomic_set(&ni->ni_fatal_error_on, 1); - continue; + ni_state_before = atomic_xchg(&ni->ni_fatal_error_on, + 1); + goto ni_done; } in_dev_for_each_ifa_rtnl(ifa, in_dev) { if (htonl(event_kibdev->ibd_ifip) == ifa->ifa_local) @@ -2779,22 +2789,31 @@ void kiblnd_destroy_dev(struct kib_dev *dev) dev->name); CDEBUG(D_NET, "%s: set link fatal state to 1\n", 
libcfs_nidstr(&net->ibn_ni->ni_nid)); - atomic_set(&ni->ni_fatal_error_on, 1); - continue; + ni_state_before = atomic_xchg(&ni->ni_fatal_error_on, + 1); + goto ni_done; } if (link_down) { CDEBUG(D_NET, "%s: set link fatal state to 1\n", libcfs_nidstr(&net->ibn_ni->ni_nid)); - atomic_set(&ni->ni_fatal_error_on, link_down); + ni_state_before = atomic_xchg(&ni->ni_fatal_error_on, + link_down); } else { CDEBUG(D_NET, "%s: set link fatal state to %u\n", libcfs_nidstr(&net->ibn_ni->ni_nid), (kiblnd_get_link_status(dev) == 0)); - atomic_set(&ni->ni_fatal_error_on, - (kiblnd_get_link_status(dev) == 0)); + ni_state_before = atomic_xchg(&ni->ni_fatal_error_on, + (kiblnd_get_link_status(dev) == 0)); } +ni_done: + if (!update_ping_buf && + (atomic_read(&ni->ni_fatal_error_on) != ni_state_before)) + update_ping_buf = true; } + + if (update_ping_buf) + lnet_update_ping_buffer(); out: return 0; } @@ -2806,6 +2825,8 @@ void kiblnd_destroy_dev(struct kib_dev *dev) struct kib_net *net; struct kib_net *cnxt; struct net_device *event_netdev = ifa->ifa_dev->dev; + u32 ni_state_before; + bool update_ping_buf = false; event_kibdev = kiblnd_dev_search(event_netdev->name); @@ -2820,9 +2841,15 @@ void kiblnd_destroy_dev(struct kib_dev *dev) CDEBUG(D_NET, "%s: set link fatal state to %u\n", libcfs_nidstr(&net->ibn_ni->ni_nid), (event == NETDEV_DOWN)); - atomic_set(&net->ibn_ni->ni_fatal_error_on, - (event == NETDEV_DOWN)); + ni_state_before = atomic_xchg(&net->ibn_ni->ni_fatal_error_on, + (event == NETDEV_DOWN)); + if (!update_ping_buf && + ((event == NETDEV_DOWN) != ni_state_before)) + update_ping_buf = true; } + + if (update_ping_buf) + lnet_update_ping_buffer(); out: return 0; } diff --git a/net/lnet/klnds/socklnd/socklnd.c b/net/lnet/klnds/socklnd/socklnd.c index b8d6e28..435762f 100644 --- a/net/lnet/klnds/socklnd/socklnd.c +++ b/net/lnet/klnds/socklnd/socklnd.c @@ -2000,6 +2000,8 @@ static int ksocknal_get_link_status(struct net_device *dev) bool found_ip = false; struct 
ksock_interface *ksi = NULL; struct sockaddr_in *sa; + u32 ni_state_before; + bool update_ping_buf = false; const struct in_ifaddr *ifa; ifindex = dev->ifindex; @@ -2045,8 +2047,9 @@ static int ksocknal_get_link_status(struct net_device *dev) CDEBUG(D_NET, "Interface %s has no IPv4 status.\n", dev->name); CDEBUG(D_NET, "set link fatal state to 1\n"); - atomic_set(&ni->ni_fatal_error_on, 1); - continue; + ni_state_before = atomic_xchg(&ni->ni_fatal_error_on, + 1); + goto ni_done; } in_dev_for_each_ifa_rtnl(ifa, in_dev) { if (sa->sin_addr.s_addr == ifa->ifa_local) @@ -2057,20 +2060,29 @@ static int ksocknal_get_link_status(struct net_device *dev) CDEBUG(D_NET, "Interface %s has no matching ip\n", dev->name); CDEBUG(D_NET, "set link fatal state to 1\n"); - atomic_set(&ni->ni_fatal_error_on, 1); - continue; + ni_state_before = atomic_xchg(&ni->ni_fatal_error_on, + 1); + goto ni_done; } if (link_down) { CDEBUG(D_NET, "set link fatal state to 1\n"); - atomic_set(&ni->ni_fatal_error_on, link_down); + ni_state_before = atomic_xchg(&ni->ni_fatal_error_on, + 1); } else { CDEBUG(D_NET, "set link fatal state to %u\n", (ksocknal_get_link_status(dev) == 0)); - atomic_set(&ni->ni_fatal_error_on, - (ksocknal_get_link_status(dev) == 0)); + ni_state_before = atomic_xchg(&ni->ni_fatal_error_on, + (ksocknal_get_link_status(dev) == 0)); } +ni_done: + if (!update_ping_buf && + (atomic_read(&ni->ni_fatal_error_on) != ni_state_before)) + update_ping_buf = true; } + + if (update_ping_buf) + lnet_update_ping_buffer(); out: return 0; } @@ -2086,6 +2098,8 @@ static int ksocknal_get_link_status(struct net_device *dev) int ifindex; struct ksock_interface *ksi = NULL; struct sockaddr_in *sa; + u32 ni_state_before; + bool update_ping_buf = false; if (!ksocknal_data.ksnd_nnets) goto out; @@ -2106,10 +2120,16 @@ static int ksocknal_get_link_status(struct net_device *dev) CDEBUG(D_NET, "set link fatal state to %u\n", (event == NETDEV_DOWN)); ni = net->ksnn_ni; - atomic_set(&ni->ni_fatal_error_on, - 
(event == NETDEV_DOWN)); + ni_state_before = atomic_xchg(&ni->ni_fatal_error_on, + (event == NETDEV_DOWN)); + if (!update_ping_buf && + ((event == NETDEV_DOWN) != ni_state_before)) + update_ping_buf = true; } } + + if (update_ping_buf) + lnet_update_ping_buffer(); out: return 0; } diff --git a/net/lnet/lnet/api-ni.c b/net/lnet/lnet/api-ni.c index 8b0ab53..9f01dbe 100644 --- a/net/lnet/lnet/api-ni.c +++ b/net/lnet/lnet/api-ni.c @@ -3841,6 +3841,26 @@ int lnet_dyn_del_ni(struct lnet_nid *nid) return rc; } +void lnet_update_ping_buffer(void) +{ + struct lnet_ping_buffer *pbuf; + struct lnet_handle_md ping_mdh; + + if (the_lnet.ln_routing) + return; + + mutex_lock(&the_lnet.ln_api_mutex); + + if (!lnet_ping_target_setup(&pbuf, &ping_mdh, + LNET_PING_INFO_HDR_SIZE + + lnet_get_ni_bytes(), + false)) + lnet_ping_target_update(pbuf, ping_mdh); + + mutex_unlock(&the_lnet.ln_api_mutex); +} +EXPORT_SYMBOL(lnet_update_ping_buffer); + void lnet_incr_dlc_seq(void) { atomic_inc(&lnet_dlc_seq_no); diff --git a/net/lnet/lnet/peer.c b/net/lnet/lnet/peer.c index 619973b..ef924ce 100644 --- a/net/lnet/lnet/peer.c +++ b/net/lnet/lnet/peer.c @@ -3079,6 +3079,15 @@ int ping_info_count_entries(struct lnet_ping_buffer *pbuf) return nnis; } +static inline void handle_disc_lpni_health(struct lnet_peer_ni *lpni) +{ + if (lpni->lpni_ns_status == LNET_NI_STATUS_DOWN) + lnet_handle_remote_failure_locked(lpni); + else if (lpni->lpni_ns_status == LNET_NI_STATUS_UP && + !lpni->lpni_last_alive) + atomic_set(&lpni->lpni_healthv, LNET_MAX_HEALTH_VALUE); +} + /* * Build a peer from incoming data. 
* @@ -3118,6 +3127,7 @@ static int lnet_peer_merge_data(struct lnet_peer *lp, int i; int j; int rc; + u32 old_st; flags = LNET_PEER_DISCOVERED; if (pbuf->pb_info.pi_features & LNET_PING_FEAT_MULTI_RAIL) @@ -3194,7 +3204,10 @@ static int lnet_peer_merge_data(struct lnet_peer *lp, */ lpni = lnet_peer_ni_find_locked(&curnis[i]); if (lpni) { + old_st = lpni->lpni_ns_status; lpni->lpni_ns_status = *stp; + if (old_st != lpni->lpni_ns_status) + handle_disc_lpni_health(lpni); lnet_peer_ni_decref_locked(lpni); } break; @@ -3224,6 +3237,7 @@ static int lnet_peer_merge_data(struct lnet_peer *lp, lpni = lnet_peer_ni_find_locked(&addnis[i].ns_nid); if (lpni) { lpni->lpni_ns_status = addnis[i].ns_status; + handle_disc_lpni_health(lpni); lnet_peer_ni_decref_locked(lpni); } } From patchwork Mon Apr 17 13:47:07 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 13214065 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from pdx1-mailman-customer002.dreamhost.com (listserver-buz.dreamhost.com [69.163.136.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id ADD69C77B70 for ; Mon, 17 Apr 2023 13:56:07 +0000 (UTC) Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1]) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4Q0T1p1B2Fz21Bq; Mon, 17 Apr 2023 06:49:34 -0700 (PDT) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4Q0T036P55z1yFb for ; Mon, 17 Apr 2023 06:48:03 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 
7F523100848D; Mon, 17 Apr 2023 09:47:24 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 7DF3B375; Mon, 17 Apr 2023 09:47:24 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Mon, 17 Apr 2023 09:47:07 -0400 Message-Id: <1681739243-29375-12-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1681739243-29375-1-git-send-email-jsimmons@infradead.org> References: <1681739243-29375-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 11/27] lnet: add 'lock_prim_nid" lnet module parameter X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.39 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Serguei Smirnov , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Serguei Smirnov Add 'lock_prim_nid' lnet module parameter to allow control of how Lustre peer primary NID is selected. If set to 1 (default), the NID specified by Lustre when calling LNet API is designated as primary for the peer, allowing for non-blocking discovery in the background. If set to 0, peer discovery is blocking until complete and the NID listed first in discovery response is designated as primary. 
WC-bug-id: https://jira.whamcloud.com/browse/LU-14668 Lustre-commit: fc7a0d6013b46ebc1 ("LU-14668 lnet: add 'lock_prim_nid" lnet module parameter") Signed-off-by: Serguei Smirnov Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50159 Reviewed-by: Chris Horn Reviewed-by: Frank Sehr Reviewed-by: Cyril Bordage Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- include/linux/lnet/lib-lnet.h | 1 + net/lnet/lnet/api-ni.c | 5 ++ net/lnet/lnet/peer.c | 105 +++++++++++++++++++++++++++--------------- 3 files changed, 73 insertions(+), 38 deletions(-) diff --git a/include/linux/lnet/lib-lnet.h b/include/linux/lnet/lib-lnet.h index f9f4815..4aa1e5c 100644 --- a/include/linux/lnet/lib-lnet.h +++ b/include/linux/lnet/lib-lnet.h @@ -565,6 +565,7 @@ unsigned int lnet_nid_cpt_hash(struct lnet_nid *nid, extern int live_router_check_interval; extern int dead_router_check_interval; extern int portal_rotor; +extern int lock_prim_nid; int lnet_lib_init(void); void lnet_lib_exit(void); diff --git a/net/lnet/lnet/api-ni.c b/net/lnet/lnet/api-ni.c index 9f01dbe..fb596ed 100644 --- a/net/lnet/lnet/api-ni.c +++ b/net/lnet/lnet/api-ni.c @@ -208,6 +208,11 @@ static int response_tracking_set(const char *val, MODULE_PARM_DESC(lnet_response_tracking, "(0|1|2|3) LNet Internal Only|GET Reply only|PUT ACK only|Full Tracking (default)"); +int lock_prim_nid = 1; +module_param(lock_prim_nid, int, 0444); +MODULE_PARM_DESC(lock_prim_nid, + "Whether nid passed down by Lustre is locked as primary"); + #define LNET_LND_TIMEOUT_DEFAULT ((LNET_TRANSACTION_TIMEOUT_DEFAULT - 1) / \ (LNET_RETRY_COUNT_DEFAULT + 1)) unsigned int lnet_lnd_timeout = LNET_LND_TIMEOUT_DEFAULT; diff --git a/net/lnet/lnet/peer.c b/net/lnet/lnet/peer.c index ef924ce..f1b0eb0d 100644 --- a/net/lnet/lnet/peer.c +++ b/net/lnet/lnet/peer.c @@ -1346,6 +1346,7 @@ struct lnet_peer_ni * struct lnet_nid pnid = LNET_ANY_NID; bool mr; int i, rc; + int flags = lock_prim_nid ? 
LNET_PEER_LOCK_PRIMARY : 0; if (!nids || num_nids < 1) return -EINVAL; @@ -1368,8 +1369,7 @@ struct lnet_peer_ni * lnet_nid4_to_nid(nids[i], &nid); if (LNET_NID_IS_ANY(&pnid)) { lnet_nid4_to_nid(nids[i], &pnid); - rc = lnet_add_peer_ni(&pnid, &LNET_ANY_NID, mr, - LNET_PEER_LOCK_PRIMARY); + rc = lnet_add_peer_ni(&pnid, &LNET_ANY_NID, mr, flags); if (rc == -EALREADY) { struct lnet_peer *lp; @@ -1385,12 +1385,10 @@ struct lnet_peer_ni * } } else if (lnet_peer_discovery_disabled) { lnet_nid4_to_nid(nids[i], &nid); - rc = lnet_add_peer_ni(&nid, &LNET_ANY_NID, mr, - LNET_PEER_LOCK_PRIMARY); + rc = lnet_add_peer_ni(&nid, &LNET_ANY_NID, mr, flags); } else { lnet_nid4_to_nid(nids[i], &nid); - rc = lnet_add_peer_ni(&pnid, &nid, mr, - LNET_PEER_LOCK_PRIMARY); + rc = lnet_add_peer_ni(&pnid, &nid, mr, flags); } if (rc && rc != -EEXIST) @@ -1432,36 +1430,53 @@ void LNetPrimaryNID(struct lnet_nid *nid) * down then this discovery can introduce long delays into the mount * process, so skip it if it isn't necessary. */ +again: spin_lock(&lp->lp_lock); - if (!lnet_peer_discovery_disabled && - (!(lp->lp_state & LNET_PEER_LOCK_PRIMARY) || - !lnet_peer_is_uptodate_locked(lp))) { - /* force a full discovery cycle */ - lp->lp_state |= LNET_PEER_FORCE_PING | LNET_PEER_FORCE_PUSH | - LNET_PEER_LOCK_PRIMARY; + if (!(lp->lp_state & LNET_PEER_LOCK_PRIMARY) && lock_prim_nid) + lp->lp_state |= LNET_PEER_LOCK_PRIMARY; + + /* DD disabled, nothing to do */ + if (lnet_peer_discovery_disabled) { + *nid = lp->lp_primary_nid; spin_unlock(&lp->lp_lock); + goto out_decref; + } - /* start discovery in the background. Messages to that - * peer will not go through until the discovery is - * complete - */ - rc = lnet_discover_peer_locked(lpni, cpt, false); - if (rc) - goto out_decref; - /* The lpni (or lp) for this NID may have changed and our ref is - * the only thing keeping the old one around. 
Release the ref - * and lookup the lpni again - */ - lnet_peer_ni_decref_locked(lpni); - lpni = lnet_peer_ni_find_locked(nid); - if (!lpni) { - rc = -ENOENT; - goto out_unlock; - } - lp = lpni->lpni_peer_net->lpn_peer; - } else { + /* Peer already up to date, nothing to do */ + if (lnet_peer_is_uptodate_locked(lp)) { + *nid = lp->lp_primary_nid; spin_unlock(&lp->lp_lock); + goto out_decref; } + spin_unlock(&lp->lp_lock); + + /* If primary nid locking is enabled, discovery is performed + * in the background. + * If primary nid locking is disabled, discovery blocks here. + * Messages to the peer will not go through until the discovery is + * complete. + */ + if (lock_prim_nid) + rc = lnet_discover_peer_locked(lpni, cpt, false); + else + rc = lnet_discover_peer_locked(lpni, cpt, true); + if (rc) + goto out_decref; + + /* The lpni (or lp) for this NID may have changed and our ref is + * the only thing keeping the old one around. Release the ref + * and lookup the lpni again + */ + lnet_peer_ni_decref_locked(lpni); + lpni = lnet_peer_ni_find_locked(nid); + if (!lpni) { + rc = -ENOENT; + goto out_unlock; + } + lp = lpni->lpni_peer_net->lpn_peer; + + if (!lock_prim_nid && !lnet_is_discovery_disabled(lp)) + goto again; *nid = lp->lp_primary_nid; out_decref: lnet_peer_ni_decref_locked(lpni); @@ -1553,7 +1568,6 @@ struct lnet_peer_net * ptable->pt_peers++; } - /* Update peer state */ spin_lock(&lp->lp_lock); if (flags & LNET_PEER_CONFIGURED) { @@ -1630,10 +1644,8 @@ struct lnet_peer_net * rc = -EPERM; goto out; } else if (lp->lp_state & LNET_PEER_LOCK_PRIMARY) { - if (nid_same(&lp->lp_primary_nid, nid)) { + if (nid_same(&lp->lp_primary_nid, nid)) rc = -EEXIST; - goto out; - } /* we're trying to recreate an existing peer which * has already been created and its primary * locked. 
This is likely due to two servers @@ -1641,8 +1653,18 @@ struct lnet_peer_net * * to that node with the primary NID which was * first added by Lustre */ - rc = -EALREADY; + else + rc = -EALREADY; goto out; + } else if (!(flags & (LNET_PEER_LOCK_PRIMARY | LNET_PEER_CONFIGURED))) { + /* if not recreating peer as configured and + * not locking primary nid, no need to + * do anything if primary nid is not being changed + */ + if (nid_same(&lp->lp_primary_nid, nid)) { + rc = -EEXIST; + goto out; + } } /* Delete and recreate the peer. * We can get here: @@ -1952,6 +1974,14 @@ struct lnet_peer_net * lnet_peer_ni_decref_locked(lpni); lp = lpni->lpni_peer_net->lpn_peer; + /* Peer must have been configured. */ + if ((flags & LNET_PEER_CONFIGURED) && + !(lp->lp_state & LNET_PEER_CONFIGURED)) { + CDEBUG(D_NET, "peer %s was not configured\n", + libcfs_nidstr(prim_nid)); + return -ENOENT; + } + /* Primary NID must match */ if (!nid_same(&lp->lp_primary_nid, prim_nid)) { CDEBUG(D_NET, "prim_nid %s is not primary for peer %s\n", @@ -1967,8 +1997,7 @@ struct lnet_peer_net * return -EPERM; } - if ((flags & LNET_PEER_LOCK_PRIMARY) && - (lnet_peer_is_uptodate(lp) && (lp->lp_state & LNET_PEER_LOCK_PRIMARY))) { + if (lnet_peer_is_uptodate(lp) && !(flags & LNET_PEER_CONFIGURED)) { CDEBUG(D_NET, "Don't add temporary peer NI for uptodate peer %s\n", libcfs_nidstr(&lp->lp_primary_nid)); From patchwork Mon Apr 17 13:47:08 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 13214067 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from pdx1-mailman-customer002.dreamhost.com (listserver-buz.dreamhost.com [69.163.136.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 3B336C77B76 for ; Mon, 17 Apr 2023 13:57:36 +0000 
(UTC) Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1]) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4Q0T2f4Gndz21Hy; Mon, 17 Apr 2023 06:50:18 -0700 (PDT) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4Q0T0S0N4Kz1yGh for ; Mon, 17 Apr 2023 06:48:23 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 84036100848E; Mon, 17 Apr 2023 09:47:24 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 82B45379; Mon, 17 Apr 2023 09:47:24 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Mon, 17 Apr 2023 09:47:08 -0400 Message-Id: <1681739243-29375-13-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1681739243-29375-1-git-send-email-jsimmons@infradead.org> References: <1681739243-29375-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 12/27] lustre: obdclass: fix rpc slot leakage X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.39 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Alex Zhuravlev obd_get_mod_rpc_slot() can race with obd_put_mod_rpc_slot(): finishing wait_woken() resets WQ_FLAG_WOKEN (which is set when the corresponding thread gets a slot, incrementing cl_mod_rpcs_in_flight). Then another thread executing __wake_up_locked_key() may find that wq_entry again and call claim_mod_rpc_function() one more time, again incrementing cl_mod_rpcs_in_flight. Thus it's incremented twice for a single obd_get_mod_rpc_slot().
flags &= ~WQ_FLAG_WOKEN list_add() wait_woken() schedule claim_mod_rpc_function() cl_mod_rpcs_in_flight++ wake_up() flags &= ~WQ_FLAG_WOKEN #3: obd_put_mod_rpc_slot() claim_mod_rpc_function() cl_mod_rpcs_in_flight++ wake_up() list_del() the patch introduces a replacement for WQ_FLAG_WOKEN which is never reset once set. Fixes: 6d398c0843 ("lustre: obdclass: improve precision of wakeups for mod_rpcs") WC-bug-id: https://jira.whamcloud.com/browse/LU-16633 Lustre-commit: 91a3726f313df33e09 ("LU-16633 obdclass: fix rpc slot leakage") Signed-off-by: Alex Zhuravlev Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50261 Reviewed-by: Andreas Dilger Reviewed-by: Lai Siyao Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/mdc/mdc_request.c | 3 +++ fs/lustre/obdclass/genops.c | 11 +++++++---- 2 files changed, 10 insertions(+), 4 deletions(-) diff --git a/fs/lustre/mdc/mdc_request.c b/fs/lustre/mdc/mdc_request.c index 58ea982..15e58e8 100644 --- a/fs/lustre/mdc/mdc_request.c +++ b/fs/lustre/mdc/mdc_request.c @@ -2964,6 +2964,9 @@ static int mdc_precleanup(struct obd_device *obd) static int mdc_cleanup(struct obd_device *obd) { + struct client_obd *cli = &obd->u.cli; + + LASSERT(cli->cl_mod_rpcs_in_flight == 0); return osc_cleanup_common(obd); } diff --git a/fs/lustre/obdclass/genops.c b/fs/lustre/obdclass/genops.c index b6bde00..43772aa 100644 --- a/fs/lustre/obdclass/genops.c +++ b/fs/lustre/obdclass/genops.c @@ -1487,6 +1487,7 @@ int obd_mod_rpc_stats_seq_show(struct client_obd *cli, struct seq_file *seq) struct mod_waiter { struct client_obd *cli; bool close_req; + bool woken; wait_queue_entry_t wqe; }; static int claim_mod_rpc_function(wait_queue_entry_t *wq_entry, @@ -1499,10 +1500,9 @@ static int claim_mod_rpc_function(wait_queue_entry_t *wq_entry, int ret; /* As woken_wake_function() doesn't remove us from the wait_queue, - * we could get called twice for the same thread - take care. + * we use own flag to ensure we're called just once. 
*/ - if (wq_entry->flags & WQ_FLAG_WOKEN) - /* Already woke this thread, don't try again */ + if (w->woken) return 0; /* A slot is available if @@ -1516,6 +1516,7 @@ static int claim_mod_rpc_function(wait_queue_entry_t *wq_entry, if (w->close_req) cli->cl_close_rpcs_in_flight++; ret = woken_wake_function(wq_entry, mode, flags, key); + w->woken = true; } else if (cli->cl_close_rpcs_in_flight) /* No other waiter could be woken */ ret = -1; @@ -1543,6 +1544,7 @@ u16 obd_get_mod_rpc_slot(struct client_obd *cli, u32 opc) struct mod_waiter wait = { .cli = cli, .close_req = (opc == MDS_CLOSE), + .woken = false, }; u16 i, max; @@ -1556,7 +1558,8 @@ u16 obd_get_mod_rpc_slot(struct client_obd *cli, u32 opc) * and there will be no need to wait. */ wake_up_locked(&cli->cl_mod_rpcs_waitq); - if (!(wait.wqe.flags & WQ_FLAG_WOKEN)) { + /* XXX: handle spurious wakeups (from unknown yet source */ + while (wait.woken == false) { spin_unlock_irq(&cli->cl_mod_rpcs_waitq.lock); wait_woken(&wait.wqe, TASK_UNINTERRUPTIBLE, MAX_SCHEDULE_TIMEOUT); From patchwork Mon Apr 17 13:47:09 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 13214069 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from pdx1-mailman-customer002.dreamhost.com (listserver-buz.dreamhost.com [69.163.136.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 64261C77B77 for ; Mon, 17 Apr 2023 13:58:44 +0000 (UTC) Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1]) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4Q0T3H4t6sz21KC; Mon, 17 Apr 2023 06:50:51 -0700 (PDT) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No 
client certificate requested) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4Q0T0j3rSYz215c for ; Mon, 17 Apr 2023 06:48:37 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 88AE2100848F; Mon, 17 Apr 2023 09:47:24 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 87177372; Mon, 17 Apr 2023 09:47:24 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Mon, 17 Apr 2023 09:47:09 -0400 Message-Id: <1681739243-29375-14-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1681739243-29375-1-git-send-email-jsimmons@infradead.org> References: <1681739243-29375-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 13/27] lnet: libcfs: cleanup console messages X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.39 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Andreas Dilger Change early libcfs cfs_cpu_init() messages from CERROR() to pr_err() to avoid circular dependencies on libcfs setup before printing an error message to the console during module init. 
WC-bug-id: https://jira.whamcloud.com/browse/LU-16639 Lustre-commit: 8f40a3d7110da1af8e ("LU-16639 misc: cleanup concole messages") Signed-off-by: Andreas Dilger Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50283 Reviewed-by: Alex Zhuravlev Reviewed-by: Feng Lei Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- net/lnet/libcfs/libcfs_cpu.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/net/lnet/libcfs/libcfs_cpu.c b/net/lnet/libcfs/libcfs_cpu.c index dca92cd..a5bf4f6 100644 --- a/net/lnet/libcfs/libcfs_cpu.c +++ b/net/lnet/libcfs/libcfs_cpu.c @@ -1177,27 +1177,27 @@ int cfs_cpu_init(void) if (*cpu_pattern) { cfs_cpt_tab = cfs_cpt_table_create_pattern(cpu_pattern); if (IS_ERR(cfs_cpt_tab)) { - CERROR("Failed to create cptab from pattern '%s'\n", - cpu_pattern); ret = PTR_ERR(cfs_cpt_tab); + pr_err("libcfs: failed to create cptab from pattern '%s': rc = %d\n", + cpu_pattern, ret); goto failed_alloc_table; } } else { cfs_cpt_tab = cfs_cpt_table_create(cpu_npartitions); if (IS_ERR(cfs_cpt_tab)) { - CERROR("Failed to create cptab with npartitions %d\n", - cpu_npartitions); ret = PTR_ERR(cfs_cpt_tab); + pr_err("libcfs: failed to create cptab with npartitions=%d: rc = %d\n", + cpu_npartitions, ret); goto failed_alloc_table; } } put_online_cpus(); - LCONSOLE(0, "HW NUMA nodes: %d, HW CPU cores: %d, npartitions: %d\n", - num_online_nodes(), num_online_cpus(), - cfs_cpt_number(cfs_cpt_tab)); + pr_notice("libcfs: HW NUMA nodes: %d, HW CPU cores: %d, npartitions: %d\n", + num_online_nodes(), num_online_cpus(), + cfs_cpt_number(cfs_cpt_tab)); return 0; failed_alloc_table: From patchwork Mon Apr 17 13:47:10 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 13214109 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from 
pdx1-mailman-customer002.dreamhost.com (listserver-buz.dreamhost.com [69.163.136.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id E6D30C77B70 for ; Mon, 17 Apr 2023 14:02:55 +0000 (UTC) Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1]) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4Q0T3v16j4z22NP; Mon, 17 Apr 2023 06:51:23 -0700 (PDT) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4Q0T0n72tWz215t for ; Mon, 17 Apr 2023 06:48:41 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 8D12E1008490; Mon, 17 Apr 2023 09:47:24 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 8BB81375; Mon, 17 Apr 2023 09:47:24 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Mon, 17 Apr 2023 09:47:10 -0400 Message-Id: <1681739243-29375-15-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1681739243-29375-1-git-send-email-jsimmons@infradead.org> References: <1681739243-29375-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 14/27] lustre: ldlm: clear lock converting flag on resource cleanup X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.39 Precedence: list List-Id: "For discussing Lustre software development." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Bobi Jam During resource cleanup, clear the lock's converting flag so that ldlm_cli_cancel() won't erroneously trip the assertion; that assertion is used for normal lock revoke callbacks. WC-bug-id: https://jira.whamcloud.com/browse/LU-16371 Lustre-commit: 4990f4ef5eb81d8017 ("LU-16371 ldlm: clear lock converting flag on resource cleanup") Signed-off-by: Bobi Jam Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49339 Reviewed-by: Andreas Dilger Reviewed-by: Patrick Farrell Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/ldlm/ldlm_resource.c | 1 + 1 file changed, 1 insertion(+) diff --git a/fs/lustre/ldlm/ldlm_resource.c b/fs/lustre/ldlm/ldlm_resource.c index 9a269cb..28f64b6 100644 --- a/fs/lustre/ldlm/ldlm_resource.c +++ b/fs/lustre/ldlm/ldlm_resource.c @@ -794,6 +794,7 @@ static void cleanup_resource(struct ldlm_resource *res, struct list_head *q, */ ldlm_set_cbpending(lock); ldlm_set_failed(lock); + ldlm_clear_converting(lock); lock->l_flags |= flags; /* ... without sending a CANCEL message for local_only.
*/ From patchwork Mon Apr 17 13:47:11 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 13214133 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from pdx1-mailman-customer002.dreamhost.com (listserver-buz.dreamhost.com [69.163.136.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 9EA2CC77B70 for ; Mon, 17 Apr 2023 14:05:37 +0000 (UTC) Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1]) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4Q0T4K30FYz22QX; Mon, 17 Apr 2023 06:51:45 -0700 (PDT) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4Q0T0x4Q0tz1yDG for ; Mon, 17 Apr 2023 06:48:49 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 919EC1008491; Mon, 17 Apr 2023 09:47:24 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 903F4379; Mon, 17 Apr 2023 09:47:24 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Mon, 17 Apr 2023 09:47:11 -0400 Message-Id: <1681739243-29375-16-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1681739243-29375-1-git-send-email-jsimmons@infradead.org> References: <1681739243-29375-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 15/27] lustre: statahead: statahead thread doesn't stop X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.39 Precedence: list List-Id: "For discussing Lustre software development." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Yang Sheng , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Yang Sheng Add a barrier to ensure that changes to sai_task are visible when it is accessed without locking. Otherwise the statahead thread could sleep forever because the wake_up was lost. WC-bug-id: https://jira.whamcloud.com/browse/LU-15660 Lustre-commit: b977caa2dc7dddcec ("LU-15660 statahead: statahead thread doesn't stop") Signed-off-by: Yang Sheng Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/47673 Reviewed-by: Neil Brown Reviewed-by: Andreas Dilger Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/llite/statahead.c | 23 +++++++++++++++-------- 1 file changed, 15 insertions(+), 8 deletions(-) diff --git a/fs/lustre/llite/statahead.c b/fs/lustre/llite/statahead.c index e6ea2ee..12d8266 100644 --- a/fs/lustre/llite/statahead.c +++ b/fs/lustre/llite/statahead.c @@ -1005,7 +1005,8 @@ static int ll_statahead_thread(void *arg) goto out; } - while (pos != MDS_DIR_END_OFF && sai->sai_task) { + /* matches smp_store_release() in ll_deauthorize_statahead() */ + while (pos != MDS_DIR_END_OFF && smp_load_acquire(&sai->sai_task)) { struct lu_dirpage *dp; struct lu_dirent *ent; @@ -1029,7 +1030,9 @@ static int ll_statahead_thread(void *arg) dp = page_address(page); for (ent = lu_dirent_start(dp); - ent && sai->sai_task && !sa_low_hit(sai); + /* matches smp_store_release() in ll_deauthorize_statahead() */ + ent && smp_load_acquire(&sai->sai_task) && + !sa_low_hit(sai); ent = lu_dirent_next(ent)) { struct lu_fid fid; u64 hash; @@ -1081,7 +1084,10 @@ static int ll_statahead_thread(void *arg) fid_le_to_cpu(&fid, &ent->lde_fid); while (({set_current_state(TASK_IDLE); - sai->sai_task; })) { + /* matches smp_store_release() in + * ll_deauthorize_statahead() + */ + smp_load_acquire(&sai->sai_task); })) { spin_lock(&lli->lli_agl_lock); while (sa_sent_full(sai)
&& !agl_list_empty(sai)) { @@ -1163,7 +1169,8 @@ static int ll_statahead_thread(void *arg) * for file release closedir() call to stop me. */ while (({set_current_state(TASK_IDLE); - sai->sai_task; })) { + /* matches smp_store_release() in ll_deauthorize_statahead() */ + smp_load_acquire(&sai->sai_task); })) { schedule(); } __set_current_state(TASK_RUNNING); @@ -1244,7 +1251,8 @@ void ll_deauthorize_statahead(struct inode *dir, void *key) */ struct task_struct *task = sai->sai_task; - sai->sai_task = NULL; + /* matches smp_load_acquire() in ll_statahead_thread() */ + smp_store_release(&sai->sai_task, NULL); wake_up_process(task); } spin_unlock(&lli->lli_sa_lock); @@ -1634,11 +1642,10 @@ static int start_statahead_thread(struct inode *dir, struct dentry *dentry, goto out; } - if (test_bit(LL_SBI_AGL_ENABLED, ll_i2sbi(parent->d_inode)->ll_flags) && - agl) + if (test_bit(LL_SBI_AGL_ENABLED, sbi->ll_flags) && agl) ll_start_agl(parent, sai); - atomic_inc(&ll_i2sbi(parent->d_inode)->ll_sa_total); + atomic_inc(&sbi->ll_sa_total); sai->sai_task = task; wake_up_process(task); From patchwork Mon Apr 17 13:47:12 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 13214108 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from pdx1-mailman-customer002.dreamhost.com (listserver-buz.dreamhost.com [69.163.136.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 7E6A9C77B70 for ; Mon, 17 Apr 2023 14:02:06 +0000 (UTC) Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1]) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4Q0T3k2Wxkz226d; Mon, 17 Apr 2023 06:51:14 -0700 (PDT) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) (using TLSv1.2 with 
cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4Q0T1K4389z21BC for ; Mon, 17 Apr 2023 06:49:09 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 9645E1008492; Mon, 17 Apr 2023 09:47:24 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 94DD2372; Mon, 17 Apr 2023 09:47:24 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Mon, 17 Apr 2023 09:47:12 -0400 Message-Id: <1681739243-29375-17-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1681739243-29375-1-git-send-email-jsimmons@infradead.org> References: <1681739243-29375-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 16/27] lustre: uapi: fix unused function errors X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.39 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Timothy Day Clang has default errors related to unused functions. The errors related to 'fid_flatten' and 'fid_flatten32' were resolved by moving the definitions of these functions to the uapi 'lustre_fid' header. This is a better place for them, since they are small 'static inline' functions, and the move has the added benefit of cutting down code duplication.
WC-bug-id: https://jira.whamcloud.com/browse/LU-16518 Lustre-commit: 0991267eab728e9a6 ("LU-16518 utils: fix unused function errors") Signed-off-by: Timothy Day Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49901 Reviewed-by: Andreas Dilger Reviewed-by: Shaun Tancheff Reviewed-by: James Simmons Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/include/lustre_fid.h | 61 ++----------------------------- fs/lustre/llite/lcommon_cl.c | 4 +-- include/uapi/linux/lustre/lustre_fid.h | 65 ++++++++++++++++++++++++++++++++++ 3 files changed, 70 insertions(+), 60 deletions(-) diff --git a/fs/lustre/include/lustre_fid.h b/fs/lustre/include/lustre_fid.h index 5ebe362..bc3f058 100644 --- a/fs/lustre/include/lustre_fid.h +++ b/fs/lustre/include/lustre_fid.h @@ -536,68 +536,13 @@ static inline void ost_fid_build_resid(const struct lu_fid *fid, } } -/** - * Flatten 128-bit FID values into a 64-bit value for use as an inode number. - * For non-IGIF FIDs this starts just over 2^32, and continues without - * conflict until 2^64, at which point we wrap the high 24 bits of the SEQ - * into the range where there may not be many OID values in use, to minimize - * the risk of conflict. - * - * Suppose LUSTRE_SEQ_MAX_WIDTH less than (1 << 24) which is currently true, - * the time between re-used inode numbers is very long - 2^40 SEQ numbers, - * or about 2^40 client mounts, if clients create less than 2^24 files/mount. - */ -static inline u64 fid_flatten(const struct lu_fid *fid) -{ - u64 ino; - u64 seq; - - if (fid_is_igif(fid)) { - ino = lu_igif_ino(fid); - return ino; - } - - seq = fid_seq(fid); - - ino = (seq << 24) + ((seq >> 24) & 0xffffff0000ULL) + fid_oid(fid); - - return ino ? ino : fid_oid(fid); -} - static inline u32 fid_hash(const struct lu_fid *f, int bits) { - /* all objects with same id and different versions will belong to same + /* + * All objects with same id and different versions will belong to same * collisions list. 
*/ - return hash_long(fid_flatten(f), bits); -} - -/** - * map fid to 32 bit value for ino on 32bit systems. - */ -static inline u32 fid_flatten32(const struct lu_fid *fid) -{ - u32 ino; - u64 seq; - - if (fid_is_igif(fid)) { - ino = lu_igif_ino(fid); - return ino; - } - - seq = fid_seq(fid) - FID_SEQ_START; - - /* Map the high bits of the OID into higher bits of the inode number so - * that inodes generated at about the same time have a reduced chance - * of collisions. This will give a period of 2^12 = 1024 unique clients - * (from SEQ) and up to min(LUSTRE_SEQ_MAX_WIDTH, 2^20) = 128k objects - * (from OID), or up to 128M inodes without collisions for new files. - */ - ino = ((seq & 0x000fffffULL) << 12) + ((seq >> 8) & 0xfffff000) + - (seq >> (64 - (40 - 8)) & 0xffffff00) + - (fid_oid(fid) & 0xff000fff) + ((fid_oid(fid) & 0x00fff000) << 8); - - return ino ? ino : fid_oid(fid); + return hash_long(fid_flatten64(f), bits); } static inline int lu_fid_diff(const struct lu_fid *fid1, diff --git a/fs/lustre/llite/lcommon_cl.c b/fs/lustre/llite/lcommon_cl.c index 2735d5c..9b0c6bc 100644 --- a/fs/lustre/llite/lcommon_cl.c +++ b/fs/lustre/llite/lcommon_cl.c @@ -280,7 +280,7 @@ u64 cl_fid_build_ino(const struct lu_fid *fid, bool api32) if (BITS_PER_LONG == 32 || api32) return fid_flatten32(fid); else - return fid_flatten(fid); + return fid_flatten64(fid); } /* @@ -292,5 +292,5 @@ u32 cl_fid_build_gen(const struct lu_fid *fid) if (fid_is_igif(fid)) return lu_igif_gen(fid); - return fid_flatten(fid) >> 32; + return fid_flatten64(fid) >> 32; } diff --git a/include/uapi/linux/lustre/lustre_fid.h b/include/uapi/linux/lustre/lustre_fid.h index d8561cd..ef47f45 100644 --- a/include/uapi/linux/lustre/lustre_fid.h +++ b/include/uapi/linux/lustre/lustre_fid.h @@ -302,4 +302,69 @@ static inline int lu_fid_cmp(const struct lu_fid *f0, return 0; } + +/** + * Flatten 128-bit FID values into a 64-bit value for use as an inode number. 
+ * For non-IGIF FIDs this starts just over 2^32, and continues without + * conflict until 2^64, at which point we wrap the high 24 bits of the SEQ + * into the range where there may not be many OID values in use, to minimize + * the risk of conflict. + * + * Suppose LUSTRE_SEQ_MAX_WIDTH less than (1 << 24) which is currently true, + * the time between re-used inode numbers is very long - 2^40 SEQ numbers, + * or about 2^40 client mounts, if clients create less than 2^24 files/mount. + */ +static inline __u64 fid_flatten64(const struct lu_fid *fid) +{ + __u64 ino; + __u64 seq; + + if (fid_is_igif(fid)) { + ino = lu_igif_ino(fid); + return ino; + } + + seq = fid_seq(fid); + + ino = (seq << 24) + ((seq >> 24) & 0xffffff0000ULL) + fid_oid(fid); + + return ino ?: fid_oid(fid); +} + +/** + * map fid to 32 bit value for ino on 32bit systems. + */ +static inline __u32 fid_flatten32(const struct lu_fid *fid) +{ + __u32 ino; + __u64 seq; + + if (fid_is_igif(fid)) { + ino = lu_igif_ino(fid); + return ino; + } + + seq = fid_seq(fid) - FID_SEQ_START; + + /* Map the high bits of the OID into higher bits of the inode number so + * that inodes generated at about the same time have a reduced chance + * of collisions. This will give a period of 2^12 = 1024 unique clients + * (from SEQ) and up to min(LUSTRE_SEQ_MAX_WIDTH, 2^20) = 128k objects + * (from OID), or up to 128M inodes without collisions for new files. 
+ */ + ino = ((seq & 0x000fffffULL) << 12) + ((seq >> 8) & 0xfffff000) + + (seq >> (64 - (40-8)) & 0xffffff00) + + (fid_oid(fid) & 0xff000fff) + ((fid_oid(fid) & 0x00fff000) << 8); + + return ino ?: fid_oid(fid); +} + +#if __BITS_PER_LONG == 32 +#define fid_flatten_long fid_flatten32 +#elif __BITS_PER_LONG == 64 +#define fid_flatten_long fid_flatten64 +#else +#error "Wordsize not 32 or 64" +#endif + #endif From patchwork Mon Apr 17 13:47:13 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 13214135 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from pdx1-mailman-customer002.dreamhost.com (listserver-buz.dreamhost.com [69.163.136.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 70097C77B76 for ; Mon, 17 Apr 2023 14:07:21 +0000 (UTC) Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1]) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4Q0T574889z22Rs; Mon, 17 Apr 2023 06:52:27 -0700 (PDT) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4Q0T1P5J66z21BZ for ; Mon, 17 Apr 2023 06:49:13 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 9B0D61008493; Mon, 17 Apr 2023 09:47:24 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 9971A375; Mon, 17 Apr 2023 09:47:24 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Mon, 17 Apr 2023 09:47:13 -0400 Message-Id: <1681739243-29375-18-git-send-email-jsimmons@infradead.org> X-Mailer: 
git-send-email 1.8.3.1 In-Reply-To: <1681739243-29375-1-git-send-email-jsimmons@infradead.org> References: <1681739243-29375-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 17/27] lnet: Health logging improvements X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.39 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Chris Horn , Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Chris Horn LNet health activity can generate noise in console logs. The NI/Peer NI recovery pings could be expected to fail and the related messages from lnet_handle_recovery_reply() are generally redundant. Improve this logging by having the lnet_monitor_thread() provide a summary of NIs in recovery. Another useful metric in spotting network trouble is if we have messages exceeding their deadline. We do not currently log this information. Keep a count of messages that have exceeded their deadline and track the total excess time. The lnet_monitor_thread() will then provide a summary of the number of messages and their average excess time at a regular interval. These stats are then reset when the monitor thread prints this information to the console. Because NIs can be in recovery for extended periods of time, the interval of console updates will increase from 1 to 5 minutes. The interval is reset when it is detected that there are no longer any NIs in recovery and there haven't been any messages past their deadline since the last console update. 
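The late-message accounting described above can be sketched in userspace C as follows. This is only an illustration under stated assumptions, not the code from this patch: struct late_msg_stats, late_msg_record() and late_msg_summarize() are hypothetical names, and C11 atomics stand in for the kernel's atomic_t and atomic64_t counters (ln_late_msg_count and ln_late_msg_nsecs).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins for ln_late_msg_count and
 * ln_late_msg_nsecs; the kernel patch uses atomic_t and atomic64_t.
 */
struct late_msg_stats {
	atomic_uint count;		/* messages past their deadline */
	atomic_uint_fast64_t nsecs;	/* summed time past the deadline */
};

/* Record one message that finished excess_ns after its deadline. */
static void late_msg_record(struct late_msg_stats *s, uint64_t excess_ns)
{
	assert(excess_ns > 0);
	atomic_fetch_add(&s->count, 1);
	atomic_fetch_add(&s->nsecs, excess_ns);
}

/* Read and reset both counters. Returns the number of late messages and
 * stores their average excess time in *avg_ns (0 if none were late).
 * The two exchanges are not a single atomic step, so a concurrent
 * record may be split across two summaries; this sketch assumes that
 * is acceptable for a periodic log message.
 */
static unsigned int late_msg_summarize(struct late_msg_stats *s,
				       uint64_t *avg_ns)
{
	unsigned int count = atomic_exchange(&s->count, 0);
	uint64_t nsecs = atomic_exchange(&s->nsecs, 0);

	*avg_ns = count ? nsecs / count : 0;
	return count;
}
```

A monitor loop would call late_msg_summarize() once per reporting interval and print a line only when the returned count is nonzero, which matches the reset-on-print behaviour described above.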
HPE-bug-id: LUS-11500 WC-bug-id: https://jira.whamcloud.com/browse/LU-16643 Lustre-commit: 0cb3d86c4004d7581 ("LU-16643 lnet: Health logging improvements") Signed-off-by: Chris Horn Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50305 Reviewed-by: Andreas Dilger Reviewed-by: Cyril Bordage Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- include/linux/lnet/lib-types.h | 5 ++ net/lnet/lnet/api-ni.c | 2 + net/lnet/lnet/lib-move.c | 165 +++++++++++++++++++++++++++++++++++++++-- net/lnet/lnet/lib-msg.c | 16 +++- 4 files changed, 176 insertions(+), 12 deletions(-) diff --git a/include/linux/lnet/lib-types.h b/include/linux/lnet/lib-types.h index eb54e75..1ae4530 100644 --- a/include/linux/lnet/lib-types.h +++ b/include/linux/lnet/lib-types.h @@ -1529,6 +1529,11 @@ struct lnet { struct completion ln_started; /* UDSP list */ struct list_head ln_udsp_list; + + /* Number of messages that have exceeded their message deadline */ + atomic_t ln_late_msg_count; + /* Total amount of time past their deadline for all late ^ messages */ + atomic64_t ln_late_msg_nsecs; }; struct genl_filter_list { diff --git a/net/lnet/lnet/api-ni.c b/net/lnet/lnet/api-ni.c index fb596ed..f3f9aee 100644 --- a/net/lnet/lnet/api-ni.c +++ b/net/lnet/lnet/api-ni.c @@ -1295,6 +1295,8 @@ struct list_head ** init_waitqueue_head(&the_lnet.ln_dc_waitq); the_lnet.ln_mt_handler = NULL; init_completion(&the_lnet.ln_started); + atomic_set(&the_lnet.ln_late_msg_count, 0); + atomic64_set(&the_lnet.ln_late_msg_nsecs, 0); rc = lnet_slab_setup(); if (rc != 0) diff --git a/net/lnet/lnet/lib-move.c b/net/lnet/lnet/lib-move.c index 95abe4f1..9d50260 100644 --- a/net/lnet/lnet/lib-move.c +++ b/net/lnet/lnet/lib-move.c @@ -3237,8 +3237,11 @@ struct lnet_mt_event_info { lnet_ni_lock(ni); } -static void -lnet_recover_local_nis(void) +/* Returns the total number of local NIs in recovery. 
+ * Records up to @arrsz of the associated NIDs in the @nidarr array + */ +static int +lnet_recover_local_nis(struct lnet_nid *nidarr, unsigned int arrsz) { struct lnet_mt_event_info *ev_info; LIST_HEAD(processed_list); @@ -3250,6 +3253,7 @@ struct lnet_mt_event_info { int healthv; int rc; time64_t now; + unsigned int nnis = 0; /* splice the recovery queue on a local queue. We will iterate * through the local queue and update it as needed. Once we're @@ -3286,6 +3290,10 @@ struct lnet_mt_event_info { continue; } + if (nnis < arrsz) + nidarr[nnis] = ni->ni_nid; + nnis++; + /* if the local NI failed recovery we must unlink the md. * But we want to keep the local_ni on the recovery queue * so we can continue the attempts to recover it. @@ -3391,6 +3399,8 @@ struct lnet_mt_event_info { lnet_net_lock(0); list_splice(&local_queue, &the_lnet.ln_mt_localNIRecovq); lnet_net_unlock(0); + + return nnis; } static int @@ -3490,12 +3500,16 @@ struct lnet_mt_event_info { cfs_percpt_free(the_lnet.ln_mt_resendqs); } -static void -lnet_recover_peer_nis(void) +/* Returns the total number of peer NIs in recovery. + * Records up to @arrsz of the associated NIDs in the @nidarr array + */ +static unsigned int +lnet_recover_peer_nis(struct lnet_nid *nidarr, unsigned int arrsz) { struct lnet_mt_event_info *ev_info; LIST_HEAD(processed_list); LIST_HEAD(local_queue); + unsigned int nlpnis = 0; struct lnet_handle_md mdh; struct lnet_peer_ni *lpni; struct lnet_peer_ni *tmp; @@ -3532,6 +3546,10 @@ struct lnet_mt_event_info { continue; } + if (nlpnis < arrsz) + nidarr[nlpnis] = lpni->lpni_nid; + nlpnis++; + /* If the peer NI has failed recovery we must unlink the * md. 
But we want to keep the peer ni on the recovery * queue so we can try to continue recovering it @@ -3621,6 +3639,131 @@ struct lnet_mt_event_info { lnet_net_lock(0); list_splice(&local_queue, &the_lnet.ln_mt_peerNIRecovq); lnet_net_unlock(0); + + return nlpnis; +} + +#define LNET_MAX_NNIDS 20 +/* @nids is array of nids that are in recovery. It has max size of + * LNET_MAX_NNIDS. + * @nnids is the total number of nids that are in recovery. It can be + * larger than LNET_MAX_NNIDS. + * @local tells us whether these are local or peer NIs in recovery. + */ +static void +lnet_print_recovery_list(struct lnet_nid *nids, unsigned int nnids, + bool local) +{ + static bool printed; + char *buf = NULL; + char *tmp; + int i; + unsigned int arrsz; + unsigned int bufsz; + + if (!nnids) + return; + + arrsz = nnids < LNET_MAX_NNIDS ? nnids : LNET_MAX_NNIDS; + + /* Printing arrsz NIDs, each has max size LNET_NIDSTR_SIZE, a comma + * and space for each nid after the first (2 * (arrsz - 1)), + * + 1 for terminating null byte + */ + bufsz = (arrsz * LNET_NIDSTR_SIZE) + (2 * (arrsz - 1)) + 1; + buf = kzalloc(bufsz, GFP_KERNEL); + if (!buf) { + LCONSOLE(D_INFO, "%u %s NIs in recovery\n", + nnids, local ? "local" : "peer"); + return; + } + + tmp = buf; + tmp += sprintf(tmp, "%s", libcfs_nidstr(&nids[0])); + for (i = 1; i < arrsz; i++) + tmp += sprintf(tmp, ", %s", libcfs_nidstr(&nids[i])); + + /* LCONSOLE() used to avoid rate limiting when we have both local + * and peer NIs in recovery + */ + LCONSOLE(D_INFO, "%u %s NIs in recovery (showing %u): %s\n", + nnids, local ? 
"local" : "peer", arrsz, buf); + + kfree(buf); + + if (!printed && nnids > LNET_MAX_NNIDS) { + LCONSOLE(D_INFO, "See full list with 'lnetctl debug recovery -(p|l)'\n"); + printed = true; + } +} + +static void +lnet_health_update_console(struct lnet_nid *lnids, unsigned int nnis, + struct lnet_nid *rnids, unsigned int nlpnis, + time64_t now) +{ + static time64_t next_ni_update; + static time64_t next_lpni_update; + static time64_t next_msg_update; + static unsigned int num_ni_updates; + static unsigned int num_lpni_updates; + static unsigned int num_msg_updates = 1; + int late_count; + + if (now >= next_ni_update) { + if (nnis) { + lnet_print_recovery_list(lnids, nnis, true); + if (num_ni_updates < 5) + num_ni_updates++; + next_ni_update = now + (60 * num_ni_updates); + } else { + next_ni_update = 0; + num_ni_updates = 0; + } + } + + if (now >= next_lpni_update) { + if (nlpnis) { + lnet_print_recovery_list(rnids, nlpnis, false); + if (num_lpni_updates < 5) + num_lpni_updates++; + next_lpni_update = now + (60 * num_lpni_updates); + } else { + next_lpni_update = 0; + num_lpni_updates = 0; + } + } + + /* Let late_count accumulate for 60 seconds */ + if (unlikely(!next_msg_update)) + next_msg_update = now + 60; + + if (now >= next_msg_update) { + late_count = atomic_read(&the_lnet.ln_late_msg_count); + + if (late_count) { + s64 avg = atomic64_xchg(&the_lnet.ln_late_msg_nsecs, 0) / + atomic_xchg(&the_lnet.ln_late_msg_count, 0); + + if (avg > NSEC_PER_SEC) { + unsigned int avg_msec; + + avg_msec = do_div(avg, NSEC_PER_SEC) / + NSEC_PER_MSEC; + LCONSOLE_INFO("%u messages in past %us over their deadline by avg %lld.%03us\n", + late_count, + (60 * num_msg_updates), avg, + avg_msec); + + if (num_msg_updates < 5) + num_msg_updates++; + next_msg_update = now + (60 * num_msg_updates); + } + } else { + next_msg_update = now + 60; + num_msg_updates = 1; + } + } } static int @@ -3628,6 +3771,10 @@ struct lnet_mt_event_info { { time64_t rsp_timeout = 0; time64_t now; + unsigned int 
nnis; + unsigned int nlpnis; + struct lnet_nid local_nids[LNET_MAX_NNIDS]; + struct lnet_nid peer_nids[LNET_MAX_NNIDS]; wait_for_completion(&the_lnet.ln_started); @@ -3653,8 +3800,10 @@ struct lnet_mt_event_info { rsp_timeout = now + (lnet_transaction_timeout / 2); } - lnet_recover_local_nis(); - lnet_recover_peer_nis(); + nnis = lnet_recover_local_nis(local_nids, LNET_MAX_NNIDS); + nlpnis = lnet_recover_peer_nis(peer_nids, LNET_MAX_NNIDS); + lnet_health_update_console(local_nids, nnis, peer_nids, nlpnis, + now); /* TODO do we need to check if we should sleep without * timeout? Technically, an active system will always @@ -3768,7 +3917,7 @@ struct lnet_mt_event_info { lnet_net_unlock(0); if (status != 0) { - CERROR("local NI (%s) recovery failed with %d\n", + CDEBUG(D_NET, "local NI (%s) recovery failed with %d\n", libcfs_nidstr(nid), status); return; } @@ -3800,7 +3949,7 @@ struct lnet_mt_event_info { lnet_net_unlock(cpt); if (status != 0) - CERROR("peer NI (%s) recovery failed with %d\n", + CDEBUG(D_NET, "peer NI (%s) recovery failed with %d\n", libcfs_nidstr(nid), status); } } diff --git a/net/lnet/lnet/lib-msg.c b/net/lnet/lnet/lib-msg.c index 82d117d..420236d 100644 --- a/net/lnet/lnet/lib-msg.c +++ b/net/lnet/lnet/lib-msg.c @@ -761,6 +761,7 @@ bool attempt_remote_resend; bool handle_local_health; bool handle_remote_health; + ktime_t now; /* if we're shutting down no point in handling health. */ if (the_lnet.ln_mt_state != LNET_MT_STATE_RUNNING) @@ -778,10 +779,6 @@ nid_is_lo0(&msg->msg_txni->ni_nid)) lo = true; - if (hstatus != LNET_MSG_STATUS_OK && - ktime_after(ktime_get(), msg->msg_deadline)) - return -1; - /* always prefer txni/txpeer if they message is committed for both * directions. 
*/ @@ -802,6 +799,17 @@ else LASSERT(ni); + now = ktime_get(); + if (ktime_after(now, msg->msg_deadline)) { + s64 time = ktime_to_ns(ktime_sub(now, msg->msg_deadline)); + + atomic64_add(time, &the_lnet.ln_late_msg_nsecs); + atomic_inc(&the_lnet.ln_late_msg_count); + + if (hstatus != LNET_MSG_STATUS_OK) + return -1; + } + CDEBUG(D_NET, "health check: %s->%s: %s: %s\n", libcfs_nidstr(&ni->ni_nid), (lo) ? "self" : libcfs_nidstr(&lpni->lpni_nid), From patchwork Mon Apr 17 13:47:14 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 13214139 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from pdx1-mailman-customer002.dreamhost.com (listserver-buz.dreamhost.com [69.163.136.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 303E0C77B70 for ; Mon, 17 Apr 2023 14:10:57 +0000 (UTC) Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1]) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4Q0T6752brz215v; Mon, 17 Apr 2023 06:53:19 -0700 (PDT) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4Q0T1R0Jmfz21Bv for ; Mon, 17 Apr 2023 06:49:15 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id 9F3FA1008494; Mon, 17 Apr 2023 09:47:24 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id 9DF5F379; Mon, 17 Apr 2023 09:47:24 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Mon, 17 Apr 2023 09:47:14 -0400 Message-Id: 
<1681739243-29375-19-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1681739243-29375-1-git-send-email-jsimmons@infradead.org> References: <1681739243-29375-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 18/27] lustre: update version to 2.15.54 X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.39 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Oleg Drokin New tag 2.15.54 Signed-off-by: Oleg Drokin Signed-off-by: James Simmons --- include/uapi/linux/lustre/lustre_ver.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/include/uapi/linux/lustre/lustre_ver.h b/include/uapi/linux/lustre/lustre_ver.h index bc7a49c..4a72d70 100644 --- a/include/uapi/linux/lustre/lustre_ver.h +++ b/include/uapi/linux/lustre/lustre_ver.h @@ -3,9 +3,9 @@ #define LUSTRE_MAJOR 2 #define LUSTRE_MINOR 15 -#define LUSTRE_PATCH 53 +#define LUSTRE_PATCH 54 #define LUSTRE_FIX 0 -#define LUSTRE_VERSION_STRING "2.15.53" +#define LUSTRE_VERSION_STRING "2.15.54" #define OBD_OCD_VERSION(major, minor, patch, fix) \ (((major) << 24) + ((minor) << 16) + ((patch) << 8) + (fix)) From patchwork Mon Apr 17 13:47:15 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 13214140 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from pdx1-mailman-customer002.dreamhost.com (listserver-buz.dreamhost.com [69.163.136.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id BEBDDC77B70 for ; Mon, 17 Apr 2023 14:11:12 +0000 (UTC)
Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1]) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4Q0T6g06lxz22Tf; Mon, 17 Apr 2023 06:53:47 -0700 (PDT) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4Q0T1S2s7mz21C6 for ; Mon, 17 Apr 2023 06:49:16 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id A45F61008495; Mon, 17 Apr 2023 09:47:24 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id A3052372; Mon, 17 Apr 2023 09:47:24 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Mon, 17 Apr 2023 09:47:15 -0400 Message-Id: <1681739243-29375-20-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1681739243-29375-1-git-send-email-jsimmons@infradead.org> References: <1681739243-29375-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 19/27] lustre: misc: remove unnecessary ioctl typecasts X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.39 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Andreas Dilger Declare "void __user *uarg" in the ioctl handling code so that it isn't typecast on every access in the ioctl handler. Unnecessary typecast risks hiding compiler warnings and bugs. Convert indentation to tabs for lines previously using spaces. Change local variable declarations to use only a single space. 
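For illustration only, and not part of this patch: a minimal userspace sketch of the pattern described above, where the handler converts the raw ioctl argument to a "void __user *" once and passes that pointer to its helpers. It assumes an empty __user macro (in the kernel it is a sparse annotation) and memcpy()-based stand-ins for copy_from_user()/copy_to_user(). All demo_* names and DEMO_IOC_* values are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Outside the kernel the annotation has no meaning, so define it away. */
#ifndef __user
#define __user
#endif

#define DEMO_IOC_GET 1U
#define DEMO_IOC_SET 2U

struct demo_args {
	int value;
};

static int demo_state;

/* Stand-ins for the kernel copy helpers: return the number of bytes
 * that could NOT be copied, so 0 means success.
 */
static unsigned long demo_copy_from_user(void *to, const void __user *from,
					 unsigned long n)
{
	if (!from)
		return n;
	memcpy(to, from, n);
	return 0;
}

static unsigned long demo_copy_to_user(void __user *to, const void *from,
				       unsigned long n)
{
	if (!to)
		return n;
	memcpy(to, from, n);
	return 0;
}

/* Helpers take the already-typed pointer, so they need no casts. */
static long demo_set(const void __user *uarg)
{
	struct demo_args args;

	if (demo_copy_from_user(&args, uarg, sizeof(args)))
		return -EFAULT;
	demo_state = args.value;
	return 0;
}

static long demo_get(void __user *uarg)
{
	struct demo_args args = { .value = demo_state };

	if (demo_copy_to_user(uarg, &args, sizeof(args)))
		return -EFAULT;
	return 0;
}

static long demo_ioctl(unsigned int cmd, unsigned long arg)
{
	/* The only cast in the handler. */
	void __user *uarg = (void __user *)arg;

	switch (cmd) {
	case DEMO_IOC_SET:
		return demo_set(uarg);
	case DEMO_IOC_GET:
		return demo_get(uarg);
	default:
		return -ENOTTY;
	}
}
```

With a single conversion point, a helper that is handed the wrong pointer type is more likely to draw a compiler or sparse warning than to be silenced by a local cast.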
WC-bug-id: https://jira.whamcloud.com/browse/LU-16634 Lustre-commit: 4a1465577e1310ce09 ("LU-16634 misc: remove unnecessary ioctl typecasts") Signed-off-by: Andreas Dilger Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50333 Reviewed-by: Arshad Hussain Reviewed-by: Vitaliy Kuznetsov Reviewed-by: jsimmons Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/include/obd_class.h | 2 +- fs/lustre/llite/dir.c | 62 +++++++++-------- fs/lustre/llite/file.c | 142 +++++++++++++++++---------------------- fs/lustre/llite/llite_internal.h | 14 ++-- fs/lustre/llite/llite_lib.c | 15 ++--- fs/lustre/obdclass/class_obd.c | 16 ++--- 6 files changed, 115 insertions(+), 136 deletions(-) diff --git a/fs/lustre/include/obd_class.h b/fs/lustre/include/obd_class.h index 123a214..0c95c3c 100644 --- a/fs/lustre/include/obd_class.h +++ b/fs/lustre/include/obd_class.h @@ -58,7 +58,7 @@ /* OBD Operations Declarations */ struct obd_device *class_exp2obd(struct obd_export *exp); -int class_handle_ioctl(unsigned int cmd, unsigned long arg); +int class_handle_ioctl(unsigned int cmd, void __user *uarg); int lustre_get_jobid(char *jobid, size_t len); void jobid_cache_fini(void); int jobid_cache_init(void); diff --git a/fs/lustre/llite/dir.c b/fs/lustre/llite/dir.c index 871dd93..6bb95ad 100644 --- a/fs/lustre/llite/dir.c +++ b/fs/lustre/llite/dir.c @@ -1464,6 +1464,7 @@ static long ll_dir_ioctl(struct file *file, unsigned int cmd, unsigned long arg) struct inode *inode = file_inode(file); struct ll_sb_info *sbi = ll_i2sbi(inode); struct obd_ioctl_data *data; + void __user *uarg = (void __user *)arg; int rc = 0; CDEBUG(D_VFSTRACE, "VFS Op:inode=" DFID "(%p), cmd=%#x\n", @@ -1477,7 +1478,7 @@ static long ll_dir_ioctl(struct file *file, unsigned int cmd, unsigned long arg) switch (cmd) { case FS_IOC_GETFLAGS: case FS_IOC_SETFLAGS: - return ll_iocontrol(inode, file, cmd, arg); + return ll_iocontrol(inode, file, cmd, uarg); case FSFILT_IOC_GETVERSION: case 
FS_IOC_GETVERSION: return put_user(inode->i_generation, (int __user *)arg); @@ -1504,7 +1505,7 @@ static long ll_dir_ioctl(struct file *file, unsigned int cmd, unsigned long arg) int namelen, len = 0; char *filename; - rc = obd_ioctl_getdata(&data, &len, (void __user *)arg); + rc = obd_ioctl_getdata(&data, &len, uarg); if (rc) return rc; @@ -1537,7 +1538,7 @@ static long ll_dir_ioctl(struct file *file, unsigned int cmd, unsigned long arg) int len; int rc; - rc = obd_ioctl_getdata(&data, &len, (void __user *)arg); + rc = obd_ioctl_getdata(&data, &len, uarg); if (rc) return rc; @@ -1591,11 +1592,10 @@ static long ll_dir_ioctl(struct file *file, unsigned int cmd, unsigned long arg) return rc; } case LL_IOC_LMV_SET_DEFAULT_STRIPE: { - struct lmv_user_md __user *ulump; + struct lmv_user_md __user *ulump = uarg; struct lmv_user_md lum; int rc; - ulump = (struct lmv_user_md __user *)arg; if (copy_from_user(&lum, ulump, sizeof(lum))) return -EFAULT; @@ -1611,8 +1611,8 @@ static long ll_dir_ioctl(struct file *file, unsigned int cmd, unsigned long arg) struct lov_user_md_v3 *lumv3 = NULL; struct lov_user_md_v1 lumv1; struct lov_user_md_v1 *lumv1_ptr = &lumv1; - struct lov_user_md_v1 __user *lumv1p = (void __user *)arg; - struct lov_user_md_v3 __user *lumv3p = (void __user *)arg; + struct lov_user_md_v1 __user *lumv1p = uarg; + struct lov_user_md_v3 __user *lumv3p = uarg; int lum_size = 0; int set_default = 0; @@ -1656,7 +1656,7 @@ static long ll_dir_ioctl(struct file *file, unsigned int cmd, unsigned long arg) return rc; } case LL_IOC_LMV_GETSTRIPE: { - struct lmv_user_md __user *ulmv; + struct lmv_user_md __user *ulmv = uarg; struct lmv_user_md lum; struct ptlrpc_request *request = NULL; struct ptlrpc_request *root_request = NULL; @@ -1881,7 +1881,7 @@ static long ll_dir_ioctl(struct file *file, unsigned int cmd, unsigned long arg) if (cmd == IOC_MDC_GETFILEINFO_V1 || cmd == IOC_MDC_GETFILEINFO_V2 || cmd == IOC_MDC_GETFILESTRIPE) { - filename = ll_getname((const char __user 
*)arg); + filename = ll_getname(uarg); if (IS_ERR(filename)) return PTR_ERR(filename); @@ -2064,7 +2064,7 @@ static long ll_dir_ioctl(struct file *file, unsigned int cmd, unsigned long arg) if (!qctl) return -ENOMEM; - if (copy_from_user(qctl, (void __user *)arg, sizeof(*qctl))) { + if (copy_from_user(qctl, uarg, sizeof(*qctl))) { rc = -EFAULT; goto out_quotactl; } @@ -2082,7 +2082,7 @@ static long ll_dir_ioctl(struct file *file, unsigned int cmd, unsigned long arg) rc = quotactl_ioctl(inode->i_sb, qctl); if ((rc == 0 || rc == -ENODATA) && - copy_to_user((void __user *)arg, qctl, sizeof(*qctl))) + copy_to_user(uarg, qctl, sizeof(*qctl))) rc = -EFAULT; out_quotactl: kfree(qctl); @@ -2093,7 +2093,7 @@ static long ll_dir_ioctl(struct file *file, unsigned int cmd, unsigned long arg) case OBD_IOC_GETDTNAME: fallthrough; case OBD_IOC_GETMDNAME: - return ll_get_obd_name(inode, cmd, arg); + return ll_get_obd_name(inode, cmd, uarg); case LL_IOC_FLUSHCTX: return ll_flush_ctx(inode); case LL_IOC_GETOBDCOUNT: { @@ -2119,18 +2119,18 @@ static long ll_dir_ioctl(struct file *file, unsigned int cmd, unsigned long arg) return 0; } case LL_IOC_PATH2FID: - if (copy_to_user((void __user *)arg, ll_inode2fid(inode), + if (copy_to_user(uarg, ll_inode2fid(inode), sizeof(struct lu_fid))) return -EFAULT; return 0; case LL_IOC_GET_CONNECT_FLAGS: { return obd_iocontrol(cmd, sbi->ll_md_exp, 0, NULL, - (void __user *)arg); + uarg); } case OBD_IOC_FID2PATH: - return ll_fid2path(inode, (void __user *)arg); + return ll_fid2path(inode, uarg); case LL_IOC_GETPARENT: - return ll_getparent(file, (void __user *)arg); + return ll_getparent(file, uarg); case LL_IOC_FID2MDTIDX: { struct obd_export *exp = ll_i2mdexp(inode); struct lu_fid fid; @@ -2152,7 +2152,7 @@ static long ll_dir_ioctl(struct file *file, unsigned int cmd, unsigned long arg) struct hsm_user_request *hur; ssize_t totalsize; - hur = memdup_user((void __user *)arg, sizeof(*hur)); + hur = memdup_user(uarg, sizeof(*hur)); if (IS_ERR(hur)) 
return PTR_ERR(hur); @@ -2171,7 +2171,7 @@ static long ll_dir_ioctl(struct file *file, unsigned int cmd, unsigned long arg) return -ENOMEM; /* Copy the whole struct */ - if (copy_from_user(hur, (void __user *)arg, totalsize)) { + if (copy_from_user(hur, uarg, totalsize)) { kvfree(hur); return -EFAULT; } @@ -2207,7 +2207,7 @@ static long ll_dir_ioctl(struct file *file, unsigned int cmd, unsigned long arg) struct hsm_progress_kernel hpk; struct hsm_progress hp; - if (copy_from_user(&hp, (void __user *)arg, sizeof(hp))) + if (copy_from_user(&hp, uarg, sizeof(hp))) return -EFAULT; hpk.hpk_fid = hp.hp_fid; @@ -2270,7 +2270,7 @@ static long ll_dir_ioctl(struct file *file, unsigned int cmd, unsigned long arg) int len; int rc; - rc = obd_ioctl_getdata(&data, &len, (void __user *)arg); + rc = obd_ioctl_getdata(&data, &len, uarg); if (rc) return rc; @@ -2306,11 +2306,11 @@ static long ll_dir_ioctl(struct file *file, unsigned int cmd, unsigned long arg) return rc; } case FS_IOC_FSGETXATTR: - return ll_ioctl_fsgetxattr(inode, cmd, arg); + return ll_ioctl_fsgetxattr(inode, cmd, uarg); case FS_IOC_FSSETXATTR: - return ll_ioctl_fssetxattr(inode, cmd, arg); + return ll_ioctl_fssetxattr(inode, cmd, uarg); case LL_IOC_PROJECT: - return ll_ioctl_project(file, cmd, arg); + return ll_ioctl_project(file, cmd, uarg); case LL_IOC_PCC_DETACH_BY_FID: { struct lu_pcc_detach_fid *detach; struct lu_fid *fid; @@ -2360,35 +2360,33 @@ static long ll_dir_ioctl(struct file *file, unsigned int cmd, unsigned long arg) case FS_IOC_SET_ENCRYPTION_POLICY: if (!ll_sbi_has_encrypt(ll_i2sbi(inode))) return -EOPNOTSUPP; - return fscrypt_ioctl_set_policy(file, (const void __user *)arg); + return fscrypt_ioctl_set_policy(file, uarg); case FS_IOC_GET_ENCRYPTION_POLICY_EX: if (!ll_sbi_has_encrypt(ll_i2sbi(inode))) return -EOPNOTSUPP; - return fscrypt_ioctl_get_policy_ex(file, (void __user *)arg); + return fscrypt_ioctl_get_policy_ex(file, uarg); case FS_IOC_ADD_ENCRYPTION_KEY: if 
(!ll_sbi_has_encrypt(ll_i2sbi(inode))) return -EOPNOTSUPP; - rc = fscrypt_ioctl_add_key(file, (void __user *)arg); + rc = fscrypt_ioctl_add_key(file, uarg); if (!rc) sptlrpc_enc_pool_add_user(); return rc; case FS_IOC_REMOVE_ENCRYPTION_KEY: if (!ll_sbi_has_encrypt(ll_i2sbi(inode))) return -EOPNOTSUPP; - return fscrypt_ioctl_remove_key(file, (void __user *)arg); + return fscrypt_ioctl_remove_key(file, uarg); case FS_IOC_REMOVE_ENCRYPTION_KEY_ALL_USERS: if (!ll_sbi_has_encrypt(ll_i2sbi(inode))) return -EOPNOTSUPP; - return fscrypt_ioctl_remove_key_all_users(file, - (void __user *)arg); + return fscrypt_ioctl_remove_key_all_users(file, uarg); case FS_IOC_GET_ENCRYPTION_KEY_STATUS: if (!ll_sbi_has_encrypt(ll_i2sbi(inode))) return -EOPNOTSUPP; - return fscrypt_ioctl_get_key_status(file, (void __user *)arg); + return fscrypt_ioctl_get_key_status(file, uarg); #endif default: - return obd_iocontrol(cmd, sbi->ll_dt_exp, 0, NULL, - (void __user *)arg); + return obd_iocontrol(cmd, sbi->ll_dt_exp, 0, NULL, uarg); } } diff --git a/fs/lustre/llite/file.c b/fs/lustre/llite/file.c index b96efb1..44197a8 100644 --- a/fs/lustre/llite/file.c +++ b/fs/lustre/llite/file.c @@ -1382,7 +1382,7 @@ static int ll_lease_close(struct obd_client_handle *och, struct inode *inode, * After lease is taken, send the RPC MDS_REINT_RESYNC to the MDT */ static int ll_lease_file_resync(struct obd_client_handle *och, - struct inode *inode, unsigned long arg) + struct inode *inode, void __user *uarg) { struct ll_sb_info *sbi = ll_i2sbi(inode); struct md_op_data *op_data; @@ -1395,8 +1395,7 @@ static int ll_lease_file_resync(struct obd_client_handle *och, if (IS_ERR(op_data)) return PTR_ERR(op_data); - if (copy_from_user(&ioc, (struct ll_ioc_lease_id __user *)arg, - sizeof(ioc))) + if (copy_from_user(&ioc, uarg, sizeof(ioc))) return -EFAULT; /* before starting file resync, it's necessary to clean up page cache @@ -2496,7 +2495,7 @@ static int ll_file_getstripe(struct inode *inode, void __user *lum, size_t 
size) static int ll_lov_setstripe(struct inode *inode, struct file *file, void __user *arg) { - struct lov_user_md __user *lum = (struct lov_user_md __user *)arg; + struct lov_user_md __user *lum = arg; struct lov_user_md *klum; int lum_size, rc; u64 flags = FMODE_WRITE; @@ -2556,8 +2555,9 @@ static int ll_lov_setstripe(struct inode *inode, struct file *file, if (file->f_flags & O_NONBLOCK) { if (!mutex_trylock(&lli->lli_group_mutex)) return -EAGAIN; - } else + } else { mutex_lock(&lli->lli_group_mutex); + } if (fd->fd_flags & LL_FILE_GROUP_LOCKED) { CWARN("group lock already existed with gid %lu\n", @@ -3622,22 +3622,19 @@ static int ll_lock_noexpand(struct file *file, int flags) } int ll_ioctl_fsgetxattr(struct inode *inode, unsigned int cmd, - unsigned long arg) + void __user *uarg) { struct ll_inode_info *lli = ll_i2info(inode); struct fsxattr fsxattr; - if (copy_from_user(&fsxattr, - (const struct fsxattr __user *)arg, - sizeof(fsxattr))) + if (copy_from_user(&fsxattr, uarg, sizeof(fsxattr))) return -EFAULT; fsxattr.fsx_xflags = ll_inode_flags_to_xflags(inode->i_flags); if (test_bit(LLIF_PROJECT_INHERIT, &lli->lli_flags)) fsxattr.fsx_xflags |= FS_XFLAG_PROJINHERIT; fsxattr.fsx_projid = ll_i2info(inode)->lli_projid; - if (copy_to_user((struct fsxattr __user *)arg, - &fsxattr, sizeof(fsxattr))) + if (copy_to_user(uarg, &fsxattr, sizeof(fsxattr))) return -EFAULT; return 0; @@ -3730,21 +3727,18 @@ static int ll_set_project(struct inode *inode, u32 xflags, u32 projid) } int ll_ioctl_fssetxattr(struct inode *inode, unsigned int cmd, - unsigned long arg) + void __user *uarg) { struct fsxattr fsxattr; - if (copy_from_user(&fsxattr, - (const struct fsxattr __user *)arg, - sizeof(fsxattr))) + if (copy_from_user(&fsxattr, uarg, sizeof(fsxattr))) return -EFAULT; return ll_set_project(inode, fsxattr.fsx_xflags, fsxattr.fsx_projid); } -int ll_ioctl_project(struct file *file, unsigned int cmd, - unsigned long arg) +int ll_ioctl_project(struct file *file, unsigned int cmd, 
void __user *uarg) { struct lu_project lu_project; struct dentry *dentry = file_dentry(file); @@ -3752,9 +3746,7 @@ int ll_ioctl_project(struct file *file, unsigned int cmd, struct dentry *child_dentry = NULL; int rc = 0, name_len; - if (copy_from_user(&lu_project, - (const struct lu_project __user *)arg, - sizeof(lu_project))) + if (copy_from_user(&lu_project, uarg, sizeof(lu_project))) return -EFAULT; /* apply child dentry if name is valid */ @@ -3790,8 +3782,7 @@ int ll_ioctl_project(struct file *file, unsigned int cmd, &ll_i2info(inode)->lli_flags)) lu_project.project_xflags |= FS_XFLAG_PROJINHERIT; lu_project.project_id = ll_i2info(inode)->lli_projid; - if (copy_to_user((struct lu_project __user *)arg, - &lu_project, sizeof(lu_project))) { + if (copy_to_user(uarg, &lu_project, sizeof(lu_project))) { rc = -EFAULT; goto out; } @@ -3807,7 +3798,7 @@ int ll_ioctl_project(struct file *file, unsigned int cmd, } static long ll_file_unlock_lease(struct file *file, struct ll_ioc_lease *ioc, - unsigned long arg) + void __user *uarg) { struct inode *inode = file_inode(file); struct ll_file_data *fd = file->private_data; @@ -3851,7 +3842,7 @@ static long ll_file_unlock_lease(struct file *file, struct ll_ioc_lease *ioc, goto out_lease_close; } - if (copy_from_user(data, (void __user *)arg, data_size)) { + if (copy_from_user(data, uarg, data_size)) { rc = -EFAULT; goto out_lease_close; } @@ -3864,8 +3855,8 @@ static long ll_file_unlock_lease(struct file *file, struct ll_ioc_lease *ioc, goto out_lease_close; } - arg += sizeof(*ioc); - if (copy_from_user(&fdv, (void __user *)arg, sizeof(u32))) { + uarg += sizeof(*ioc); + if (copy_from_user(&fdv, uarg, sizeof(u32))) { rc = -EFAULT; goto out_lease_close; } @@ -3893,14 +3884,14 @@ static long ll_file_unlock_lease(struct file *file, struct ll_ioc_lease *ioc, goto out_lease_close; } - arg += sizeof(*ioc); - if (copy_from_user(&fdv, (void __user *)arg, sizeof(u32))) { + uarg += sizeof(*ioc); + if (copy_from_user(&fdv, uarg, 
sizeof(u32))) { rc = -EFAULT; goto out_lease_close; } - arg += sizeof(u32); - if (copy_from_user(&mirror_id, (void __user *)arg, + uarg += sizeof(u32); + if (copy_from_user(&mirror_id, uarg, sizeof(u32))) { rc = -EFAULT; goto out_lease_close; @@ -3925,8 +3916,8 @@ static long ll_file_unlock_lease(struct file *file, struct ll_ioc_lease *ioc, if (IS_ENCRYPTED(inode)) return -EOPNOTSUPP; - arg += sizeof(*ioc); - if (copy_from_user(&param.pa_archive_id, (void __user *)arg, + uarg += sizeof(*ioc); + if (copy_from_user(&param.pa_archive_id, uarg, sizeof(u32))) { rc2 = -EFAULT; goto out_lease_close; @@ -3986,7 +3977,7 @@ static long ll_file_unlock_lease(struct file *file, struct ll_ioc_lease *ioc, } static long ll_file_set_lease(struct file *file, struct ll_ioc_lease *ioc, - unsigned long arg) + void __user *uarg) { struct inode *inode = file_inode(file); struct ll_inode_info *lli = ll_i2info(inode); @@ -4009,7 +4000,7 @@ static long ll_file_set_lease(struct file *file, struct ll_ioc_lease *ioc, fmode = FMODE_READ; break; case LL_LEASE_UNLCK: - return ll_file_unlock_lease(file, ioc, arg); + return ll_file_unlock_lease(file, ioc, uarg); default: return -EINVAL; } @@ -4024,7 +4015,7 @@ static long ll_file_set_lease(struct file *file, struct ll_ioc_lease *ioc, return PTR_ERR(och); if (ioc->lil_flags & LL_LEASE_RESYNC) { - rc = ll_lease_file_resync(och, inode, arg); + rc = ll_lease_file_resync(och, inode, uarg); if (rc) { ll_lease_close(och, inode, NULL); return rc; @@ -4091,6 +4082,7 @@ static int ll_heat_set(struct inode *inode, enum lu_heat_flag flags) { struct inode *inode = file_inode(file); struct ll_file_data *fd = file->private_data; + void __user *uarg = (void __user *)arg; int flags, rc; CDEBUG(D_VFSTRACE, "VFS Op:inode=" DFID "(%p),cmd=%x\n", @@ -4129,15 +4121,14 @@ static int ll_heat_set(struct inode *inode, enum lu_heat_flag flags) return 0; case LL_IOC_LOV_SETSTRIPE: case LL_IOC_LOV_SETSTRIPE_NEW: - return ll_lov_setstripe(inode, file, (void __user *)arg); + return
ll_lov_setstripe(inode, file, uarg); case LL_IOC_LOV_SETEA: - return ll_lov_setea(inode, file, (void __user *)arg); + return ll_lov_setea(inode, file, uarg); case LL_IOC_LOV_SWAP_LAYOUTS: { struct file *file2; struct lustre_swap_layouts lsl; - if (copy_from_user(&lsl, (char __user *)arg, - sizeof(struct lustre_swap_layouts))) + if (copy_from_user(&lsl, uarg, sizeof(lsl))) return -EFAULT; if ((file->f_flags & O_ACCMODE) == O_RDONLY) @@ -4180,10 +4171,10 @@ static int ll_heat_set(struct inode *inode, enum lu_heat_flag flags) } case LL_IOC_LOV_GETSTRIPE: case LL_IOC_LOV_GETSTRIPE_NEW: - return ll_file_getstripe(inode, (void __user *)arg, 0); + return ll_file_getstripe(inode, uarg, 0); case FS_IOC_GETFLAGS: case FS_IOC_SETFLAGS: - return ll_iocontrol(inode, file, cmd, arg); + return ll_iocontrol(inode, file, cmd, uarg); case FSFILT_IOC_GETVERSION: case FS_IOC_GETVERSION: return put_user(inode->i_generation, (int __user *)arg); @@ -4199,12 +4190,12 @@ static int ll_heat_set(struct inode *inode, enum lu_heat_flag flags) case LL_IOC_GROUP_UNLOCK: return ll_put_grouplock(inode, file, arg); case IOC_OBD_STATFS: - return ll_obd_statfs(inode, (void __user *)arg); + return ll_obd_statfs(inode, uarg); case LL_IOC_FLUSHCTX: return ll_flush_ctx(inode); case LL_IOC_PATH2FID: { - if (copy_to_user((void __user *)arg, ll_inode2fid(inode), + if (copy_to_user(uarg, ll_inode2fid(inode), sizeof(struct lu_fid))) return -EFAULT; @@ -4213,17 +4204,17 @@ static int ll_heat_set(struct inode *inode, enum lu_heat_flag flags) case LL_IOC_GETPARENT: return ll_getparent(file, (struct getparent __user *)arg); case OBD_IOC_FID2PATH: - return ll_fid2path(inode, (void __user *)arg); + return ll_fid2path(inode, uarg); case LL_IOC_DATA_VERSION: { struct ioc_data_version idv; int rc; - if (copy_from_user(&idv, (char __user *)arg, sizeof(idv))) + if (copy_from_user(&idv, uarg, sizeof(idv))) return -EFAULT; idv.idv_flags &= LL_DV_RD_FLUSH | LL_DV_WR_FLUSH; rc = ll_ioc_data_version(inode, &idv); - if (rc == 
0 && copy_to_user((char __user *)arg, &idv, + if (rc == 0 && copy_to_user(uarg, &idv, sizeof(idv))) return -EFAULT; @@ -4237,7 +4228,7 @@ static int ll_heat_set(struct inode *inode, enum lu_heat_flag flags) if (mdtidx < 0) return mdtidx; - if (put_user(mdtidx, (int __user *)arg)) + if (put_user(mdtidx, (int __user *)uarg)) return -EFAULT; return 0; @@ -4247,7 +4238,7 @@ static int ll_heat_set(struct inode *inode, enum lu_heat_flag flags) case OBD_IOC_GETDTNAME: fallthrough; case OBD_IOC_GETMDNAME: - return ll_get_obd_name(inode, cmd, arg); + return ll_get_obd_name(inode, cmd, uarg); case LL_IOC_HSM_STATE_GET: { struct md_op_data *op_data; struct hsm_user_state *hus; @@ -4267,7 +4258,7 @@ static int ll_heat_set(struct inode *inode, enum lu_heat_flag flags) rc = obd_iocontrol(cmd, ll_i2mdexp(inode), sizeof(*op_data), op_data, NULL); - if (copy_to_user((void __user *)arg, hus, sizeof(*hus))) + if (copy_to_user(uarg, hus, sizeof(*hus))) rc = -EFAULT; ll_finish_md_op_data(op_data); @@ -4278,7 +4269,7 @@ static int ll_heat_set(struct inode *inode, enum lu_heat_flag flags) struct hsm_state_set *hss; int rc; - hss = memdup_user((char __user *)arg, sizeof(*hss)); + hss = memdup_user(uarg, sizeof(*hss)); if (IS_ERR(hss)) return PTR_ERR(hss); @@ -4323,7 +4314,7 @@ static int ll_heat_set(struct inode *inode, enum lu_heat_flag flags) hca->hca_location.offset, hca->hca_location.length); } - if (copy_to_user((char __user *)arg, hca, sizeof(*hca))) + if (copy_to_user(uarg, hca, sizeof(*hca))) rc = -EFAULT; skip_copy: ll_finish_md_op_data(op_data); @@ -4338,10 +4329,10 @@ static int ll_heat_set(struct inode *inode, enum lu_heat_flag flags) case LL_IOC_SET_LEASE: { struct ll_ioc_lease ioc; - if (copy_from_user(&ioc, (void __user *)arg, sizeof(ioc))) + if (copy_from_user(&ioc, uarg, sizeof(ioc))) return -EFAULT; - return ll_file_set_lease(file, &ioc, arg); + return ll_file_set_lease(file, &ioc, uarg); } case LL_IOC_GET_LEASE: { struct ll_inode_info *lli = ll_i2info(inode); @@ -4367,7 
+4358,7 @@ static int ll_heat_set(struct inode *inode, enum lu_heat_flag flags) case LL_IOC_HSM_IMPORT: { struct hsm_user_import *hui; - hui = memdup_user((void __user *)arg, sizeof(*hui)); + hui = memdup_user(uarg, sizeof(*hui)); if (IS_ERR(hui)) return PTR_ERR(hui); @@ -4377,11 +4368,9 @@ static int ll_heat_set(struct inode *inode, enum lu_heat_flag flags) return rc; } case LL_IOC_FUTIMES_3: { - const struct ll_futimes_3 __user *lfu_user; struct ll_futimes_3 lfu; - lfu_user = (const struct ll_futimes_3 __user *)arg; - if (copy_from_user(&lfu, lfu_user, sizeof(lfu))) + if (copy_from_user(&lfu, uarg, sizeof(lfu))) return -EFAULT; return ll_file_futimes_3(file, &lfu); @@ -4394,7 +4383,7 @@ static int ll_heat_set(struct inode *inode, enum lu_heat_flag flags) int i; rc = 0; - u_ladvise_hdr = (void __user *)arg; + u_ladvise_hdr = uarg; k_ladvise_hdr = kzalloc(alloc_size, GFP_KERNEL); if (!k_ladvise_hdr) return -ENOMEM; @@ -4479,15 +4468,15 @@ static int ll_heat_set(struct inode *inode, enum lu_heat_flag flags) if (!(file->f_flags & O_DIRECT)) return -EINVAL; - fd->fd_designated_mirror = (u32)arg; + fd->fd_designated_mirror = arg; return 0; } case FS_IOC_FSGETXATTR: - return ll_ioctl_fsgetxattr(inode, cmd, arg); + return ll_ioctl_fsgetxattr(inode, cmd, uarg); case FS_IOC_FSSETXATTR: - return ll_ioctl_fssetxattr(inode, cmd, arg); + return ll_ioctl_fssetxattr(inode, cmd, uarg); case LL_IOC_PROJECT: - return ll_ioctl_project(file, cmd, arg); + return ll_ioctl_project(file, cmd, uarg); case BLKSSZGET: return put_user(PAGE_SIZE, (int __user *)arg); case LL_IOC_HEAT_GET: { @@ -4495,7 +4484,7 @@ static int ll_heat_set(struct inode *inode, enum lu_heat_flag flags) struct lu_heat *heat; int size; - if (copy_from_user(&uheat, (void __user *)arg, sizeof(uheat))) + if (copy_from_user(&uheat, uarg, sizeof(uheat))) return -EFAULT; if (uheat.lh_count > OBD_HEAT_COUNT) @@ -4508,14 +4497,14 @@ static int ll_heat_set(struct inode *inode, enum lu_heat_flag flags) heat->lh_count = 
uheat.lh_count; ll_heat_get(inode, heat); - rc = copy_to_user((char __user *)arg, heat, size); + rc = copy_to_user(uarg, heat, size); kfree(heat); return rc ? -EFAULT : 0; } case LL_IOC_HEAT_SET: { u64 flags; - if (copy_from_user(&flags, (void __user *)arg, sizeof(flags))) + if (copy_from_user(&flags, uarg, sizeof(flags))) return -EFAULT; rc = ll_heat_set(inode, flags); @@ -4528,9 +4517,7 @@ static int ll_heat_set(struct inode *inode, enum lu_heat_flag flags) if (!detach) return -ENOMEM; - if (copy_from_user(detach, - (const struct lu_pcc_detach __user *)arg, - sizeof(*detach))) { + if (copy_from_user(detach, uarg, sizeof(*detach))) { rc = -EFAULT; goto out_detach_free; } @@ -4551,8 +4538,7 @@ static int ll_heat_set(struct inode *inode, enum lu_heat_flag flags) return rc; } case LL_IOC_PCC_STATE: { - struct lu_pcc_state __user *ustate = - (struct lu_pcc_state __user *)arg; + struct lu_pcc_state __user *ustate = uarg; struct lu_pcc_state *state; state = kzalloc(sizeof(*state), GFP_KERNEL); @@ -4581,28 +4567,27 @@ static int ll_heat_set(struct inode *inode, enum lu_heat_flag flags) case FS_IOC_SET_ENCRYPTION_POLICY: if (!ll_sbi_has_encrypt(ll_i2sbi(inode))) return -EOPNOTSUPP; - return fscrypt_ioctl_set_policy(file, (const void __user *)arg); + return fscrypt_ioctl_set_policy(file, uarg); case FS_IOC_GET_ENCRYPTION_POLICY_EX: if (!ll_sbi_has_encrypt(ll_i2sbi(inode))) return -EOPNOTSUPP; - return fscrypt_ioctl_get_policy_ex(file, (void __user *)arg); + return fscrypt_ioctl_get_policy_ex(file, uarg); case FS_IOC_ADD_ENCRYPTION_KEY: if (!ll_sbi_has_encrypt(ll_i2sbi(inode))) return -EOPNOTSUPP; - return fscrypt_ioctl_add_key(file, (void __user *)arg); + return fscrypt_ioctl_add_key(file, uarg); case FS_IOC_REMOVE_ENCRYPTION_KEY: if (!ll_sbi_has_encrypt(ll_i2sbi(inode))) return -EOPNOTSUPP; - return fscrypt_ioctl_remove_key(file, (void __user *)arg); + return fscrypt_ioctl_remove_key(file, uarg); case FS_IOC_REMOVE_ENCRYPTION_KEY_ALL_USERS: if 
(!ll_sbi_has_encrypt(ll_i2sbi(inode))) return -EOPNOTSUPP; - return fscrypt_ioctl_remove_key_all_users(file, - (void __user *)arg); + return fscrypt_ioctl_remove_key_all_users(file, uarg); case FS_IOC_GET_ENCRYPTION_KEY_STATUS: if (!ll_sbi_has_encrypt(ll_i2sbi(inode))) return -EOPNOTSUPP; - return fscrypt_ioctl_get_key_status(file, (void __user *)arg); + return fscrypt_ioctl_get_key_status(file, uarg); #endif case LL_IOC_UNLOCK_FOREIGN: { @@ -4619,8 +4604,7 @@ static int ll_heat_set(struct inode *inode, enum lu_heat_flag flags) } default: - return obd_iocontrol(cmd, ll_i2dtexp(inode), 0, NULL, - (void __user *)arg); + return obd_iocontrol(cmd, ll_i2dtexp(inode), 0, NULL, uarg); } } diff --git a/fs/lustre/llite/llite_internal.h b/fs/lustre/llite/llite_internal.h index fdc0f89..6590399 100644 --- a/fs/lustre/llite/llite_internal.h +++ b/fs/lustre/llite/llite_internal.h @@ -1222,12 +1222,10 @@ int ll_get_fid_by_name(struct inode *parent, const char *name, int ll_inode_permission(struct inode *inode, int mask); int ll_ioctl_check_project(struct inode *inode, u32 xflags, u32 projid); int ll_ioctl_fsgetxattr(struct inode *inode, unsigned int cmd, - unsigned long arg); + void __user *uarg); int ll_ioctl_fssetxattr(struct inode *inode, unsigned int cmd, - unsigned long arg); -int ll_ioctl_project(struct file *file, unsigned int cmd, - unsigned long arg); - + void __user *uarg); +int ll_ioctl_project(struct file *file, unsigned int cmd, void __user *uarg); int ll_lov_setstripe_ea_info(struct inode *inode, struct dentry *dentry, u64 flags, struct lov_user_md *lum, int lum_size); @@ -1290,7 +1288,7 @@ int ll_statfs_internal(struct ll_sb_info *sbi, struct obd_statfs *osfs, void ll_truncate_inode_pages_final(struct inode *inode, struct cl_io *io); void ll_delete_inode(struct inode *inode); int ll_iocontrol(struct inode *inode, struct file *file, - unsigned int cmd, unsigned long arg); + unsigned int cmd, void __user *uarg); int ll_flush_ctx(struct inode *inode); void 
ll_umount_begin(struct super_block *sb); int ll_remount_fs(struct super_block *sb, int *flags, char *data); @@ -1298,7 +1296,7 @@ int ll_iocontrol(struct inode *inode, struct file *file, void ll_dirty_page_discard_warn(struct inode *inode, int ioret); int ll_prep_inode(struct inode **inode, struct req_capsule *pill, struct super_block *sb, struct lookup_intent *it); -int ll_obd_statfs(struct inode *inode, void __user *arg); +int ll_obd_statfs(struct inode *inode, void __user *uarg); int ll_get_max_mdsize(struct ll_sb_info *sbi, int *max_mdsize); int ll_get_default_mdsize(struct ll_sb_info *sbi, int *default_mdsize); int ll_set_default_mdsize(struct ll_sb_info *sbi, int default_mdsize); @@ -1310,7 +1308,7 @@ struct md_op_data *ll_prep_md_op_data(struct md_op_data *op_data, u32 mode, enum md_op_code opc, void *data); void ll_finish_md_op_data(struct md_op_data *op_data); -int ll_get_obd_name(struct inode *inode, unsigned int cmd, unsigned long arg); +int ll_get_obd_name(struct inode *inode, unsigned int cmd, void __user *uarg); void ll_compute_rootsquash_state(struct ll_sb_info *sbi); ssize_t ll_copy_user_md(const struct lov_user_md __user *md, struct lov_user_md **kbuf); diff --git a/fs/lustre/llite/llite_lib.c b/fs/lustre/llite/llite_lib.c index 049cd23..913e096 100644 --- a/fs/lustre/llite/llite_lib.c +++ b/fs/lustre/llite/llite_lib.c @@ -2853,7 +2853,7 @@ void ll_delete_inode(struct inode *inode) } int ll_iocontrol(struct inode *inode, struct file *file, - unsigned int cmd, unsigned long arg) + unsigned int cmd, void __user *uarg) { struct ll_sb_info *sbi = ll_i2sbi(inode); struct ptlrpc_request *req = NULL; @@ -2891,7 +2891,7 @@ int ll_iocontrol(struct inode *inode, struct file *file, ptlrpc_req_finished(req); - return put_user(flags, (int __user *)arg); + return put_user(flags, (int __user *)uarg); } case FS_IOC_SETFLAGS: { struct md_op_data *op_data; @@ -2899,7 +2899,7 @@ int ll_iocontrol(struct inode *inode, struct file *file, struct iattr *attr; struct 
fsxattr fa = { 0 }; - if (get_user(flags, (int __user *)arg)) + if (get_user(flags, (int __user *)uarg)) return -EFAULT; fa.fsx_projid = ll_i2info(inode)->lli_projid; @@ -3219,7 +3219,7 @@ int ll_prep_inode(struct inode **inode, struct req_capsule *pill, return rc; } -int ll_obd_statfs(struct inode *inode, void __user *arg) +int ll_obd_statfs(struct inode *inode, void __user *uarg) { struct ll_sb_info *sbi = NULL; struct obd_export *exp; @@ -3238,7 +3238,7 @@ int ll_obd_statfs(struct inode *inode, void __user *arg) goto out_statfs; } - rc = obd_ioctl_getdata(&data, &len, arg); + rc = obd_ioctl_getdata(&data, &len, uarg); if (rc) goto out_statfs; @@ -3491,7 +3491,7 @@ int ll_show_options(struct seq_file *seq, struct dentry *dentry) /** * Get obd name by cmd, and copy out to user space */ -int ll_get_obd_name(struct inode *inode, unsigned int cmd, unsigned long arg) +int ll_get_obd_name(struct inode *inode, unsigned int cmd, void __user *uarg) { struct ll_sb_info *sbi = ll_i2sbi(inode); struct obd_device *obd; @@ -3506,8 +3506,7 @@ int ll_get_obd_name(struct inode *inode, unsigned int cmd, unsigned long arg) if (!obd) return -ENOENT; - if (copy_to_user((void __user *)arg, obd->obd_name, - strlen(obd->obd_name) + 1)) + if (copy_to_user(uarg, obd->obd_name, strlen(obd->obd_name) + 1)) return -EFAULT; return 0; diff --git a/fs/lustre/obdclass/class_obd.c b/fs/lustre/obdclass/class_obd.c index 67a9422..a7a2a6c 100644 --- a/fs/lustre/obdclass/class_obd.c +++ b/fs/lustre/obdclass/class_obd.c @@ -277,14 +277,14 @@ int obd_ioctl_getdata(struct obd_ioctl_data **datap, int *len, void __user *arg) } EXPORT_SYMBOL(obd_ioctl_getdata); -int class_handle_ioctl(unsigned int cmd, unsigned long arg) +int class_handle_ioctl(unsigned int cmd, void __user *uarg) { struct obd_ioctl_data *data; struct obd_device *obd = NULL; int err = 0, len = 0; CDEBUG(D_IOCTL, "cmd = %x\n", cmd); - if (obd_ioctl_getdata(&data, &len, (void __user *)arg)) { + if (obd_ioctl_getdata(&data, &len, uarg)) { 
CERROR("OBD ioctl: data error\n"); return -EINVAL; } @@ -341,7 +341,7 @@ int class_handle_ioctl(unsigned int cmd, unsigned long arg) memcpy(data->ioc_bulk, LUSTRE_VERSION_STRING, strlen(LUSTRE_VERSION_STRING) + 1); - if (copy_to_user((void __user *)arg, data, len)) + if (copy_to_user(uarg, data, len)) err = -EFAULT; goto out; } @@ -359,7 +359,7 @@ int class_handle_ioctl(unsigned int cmd, unsigned long arg) goto out; } - if (copy_to_user((void __user *)arg, data, sizeof(*data))) + if (copy_to_user(uarg, data, sizeof(*data))) err = -EFAULT; goto out; } @@ -396,7 +396,7 @@ int class_handle_ioctl(unsigned int cmd, unsigned long arg) CDEBUG(D_IOCTL, "device name %s, dev %d\n", data->ioc_inlbuf1, dev); - if (copy_to_user((void __user *)arg, data, sizeof(*data))) + if (copy_to_user(uarg, data, sizeof(*data))) err = -EFAULT; goto out; } @@ -438,7 +438,7 @@ int class_handle_ioctl(unsigned int cmd, unsigned long arg) obd->obd_name, obd->obd_uuid.uuid, atomic_read(&obd->obd_refcount)); - if (copy_to_user((void __user *)arg, data, len)) + if (copy_to_user(uarg, data, len)) err = -EFAULT; goto out; } @@ -479,7 +479,7 @@ int class_handle_ioctl(unsigned int cmd, unsigned long arg) if (err) goto out; - if (copy_to_user((void __user *)arg, data, len)) + if (copy_to_user(uarg, data, len)) err = -EFAULT; out: kvfree(data); @@ -497,7 +497,7 @@ static long obd_class_ioctl(struct file *filp, unsigned int cmd, if ((cmd & 0xffffff00) == ((int)'T') << 8) /* ignore all tty ioctls */ return -ENOTTY; - return class_handle_ioctl(cmd, (unsigned long)arg); + return class_handle_ioctl(cmd, (void __user *)arg); } /* declare character device */ From patchwork Mon Apr 17 13:47:16 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 13214141 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from 
pdx1-mailman-customer002.dreamhost.com (listserver-buz.dreamhost.com [69.163.136.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id A11DDC77B76 for ; Mon, 17 Apr 2023 14:11:42 +0000 (UTC) Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1]) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4Q0T7C67yvz22V9; Mon, 17 Apr 2023 06:54:15 -0700 (PDT) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4Q0T1V41bDz21CV for ; Mon, 17 Apr 2023 06:49:18 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id A92BC1008496; Mon, 17 Apr 2023 09:47:24 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id A7D49375; Mon, 17 Apr 2023 09:47:24 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Mon, 17 Apr 2023 09:47:16 -0400 Message-Id: <1681739243-29375-21-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1681739243-29375-1-git-send-email-jsimmons@infradead.org> References: <1681739243-29375-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 20/27] lustre: llite: move common ioctl code to ll_iocontrol() X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.39 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Andreas Dilger Move common ioctl cases from ll_dir_ioctl() and ll_file_ioctl() into ll_iocontrol() to avoid duplicate code. 
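The resulting call chain can be sketched in isolation as below. This is only an illustrative model, not the llite code: common_ioctl() stands in for ll_iocontrol(), backend_ioctl() for obd_iocontrol(), and dir_ioctl() for ll_dir_ioctl(); the command numbers and the HANDLED_BY_* return tags are hypothetical and exist only so the handling layer is visible. The point it shows is that the shared handler has to report unknown commands as -ENOTTY (negative), because that is the value the callers test before falling back.

```c
#include <errno.h>

/* Hypothetical command numbers, for illustration only. */
enum { CMD_DIR_ONLY = 1, CMD_COMMON = 2, CMD_BACKEND = 3 };

/* Hypothetical tags that show which layer handled a command. */
enum { HANDLED_BY_DIR = 10, HANDLED_BY_COMMON = 20, HANDLED_BY_BACKEND = 30 };

/* Stand-in for ll_iocontrol(): commands shared by files and directories.
 * Anything it does not recognise is reported as -ENOTTY.
 */
static int common_ioctl(unsigned int cmd)
{
	int rc;

	switch (cmd) {
	case CMD_COMMON:
		rc = HANDLED_BY_COMMON;
		break;
	default:
		rc = -ENOTTY;
		break;
	}
	return rc;
}

/* Stand-in for obd_iocontrol(): the last-resort handler. */
static int backend_ioctl(unsigned int cmd)
{
	return cmd == CMD_BACKEND ? HANDLED_BY_BACKEND : -EINVAL;
}

/* Stand-in for ll_dir_ioctl(): directory-only cases first, then the
 * shared handler, then the backend if the shared handler declined.
 */
static int dir_ioctl(unsigned int cmd)
{
	int rc;

	switch (cmd) {
	case CMD_DIR_ONLY:
		rc = HANDLED_BY_DIR;
		break;
	default:
		rc = common_ioctl(cmd);
		if (rc == -ENOTTY)
			rc = backend_ioctl(cmd);
		break;
	}
	return rc;
}
```

With this shape, a command is only passed to the backend when neither the caller nor the shared handler knows it, so each shared case needs to exist in one place only.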
WC-bug-id: https://jira.whamcloud.com/browse/LU-16634 Lustre-commit: 3be425883918528ef9 ("LU-16634 llite: move common ioctl code to ll_iocontrol()") Signed-off-by: Andreas Dilger Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50335 Reviewed-by: Arshad Hussain Reviewed-by: Vitaliy Kuznetsov Reviewed-by: Timothy Day Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/llite/dir.c | 92 ++------------------------- fs/lustre/llite/file.c | 57 ++--------------- fs/lustre/llite/llite_lib.c | 152 ++++++++++++++++++++++++++++++++++++++------ 3 files changed, 143 insertions(+), 158 deletions(-) diff --git a/fs/lustre/llite/dir.c b/fs/lustre/llite/dir.c index 6bb95ad..9caff36 100644 --- a/fs/lustre/llite/dir.c +++ b/fs/lustre/llite/dir.c @@ -1476,31 +1476,6 @@ static long ll_dir_ioctl(struct file *file, unsigned int cmd, unsigned long arg) ll_stats_ops_tally(ll_i2sbi(inode), LPROC_LL_IOCTL, 1); switch (cmd) { - case FS_IOC_GETFLAGS: - case FS_IOC_SETFLAGS: - return ll_iocontrol(inode, file, cmd, uarg); - case FSFILT_IOC_GETVERSION: - case FS_IOC_GETVERSION: - return put_user(inode->i_generation, (int __user *)arg); - /* We need to special case any other ioctls we want to handle, - * to send them to the MDS/OST as appropriate and to properly - * network encode the arg field. 
- */ - case FS_IOC_SETVERSION: - return -ENOTSUPP; - - case LL_IOC_GET_MDTIDX: { - int mdtidx; - - mdtidx = ll_get_mdt_idx(inode); - if (mdtidx < 0) - return mdtidx; - - if (put_user((int)mdtidx, (int __user *)arg)) - return -EFAULT; - - return 0; - } case IOC_MDC_LOOKUP: { int namelen, len = 0; char *filename; @@ -1840,23 +1815,10 @@ static long ll_dir_ioctl(struct file *file, unsigned int cmd, unsigned long arg) ptlrpc_req_finished(root_request); return rc; } - - case LL_IOC_UNLOCK_FOREIGN: - /* if not a foreign symlink do nothing */ - if (ll_foreign_is_removable(dentry, true)) { - CDEBUG(D_INFO, - "prevent rmdir of non-foreign dir ("DFID")\n", - PFID(ll_inode2fid(inode))); - return -EOPNOTSUPP; - } - return 0; - case LL_IOC_RMFID: return ll_rmfid(file, (void __user *)arg); case LL_IOC_LOV_SWAP_LAYOUTS: return -EPERM; - case IOC_OBD_STATFS: - return ll_obd_statfs(inode, (void __user *)arg); case LL_IOC_LOV_GETSTRIPE: case LL_IOC_LOV_GETSTRIPE_NEW: case LL_IOC_MDC_GETINFO_V1: @@ -2088,14 +2050,6 @@ static long ll_dir_ioctl(struct file *file, unsigned int cmd, unsigned long arg) kfree(qctl); return rc; } - case OBD_IOC_GETNAME_OLD: - fallthrough; - case OBD_IOC_GETDTNAME: - fallthrough; - case OBD_IOC_GETMDNAME: - return ll_get_obd_name(inode, cmd, uarg); - case LL_IOC_FLUSHCTX: - return ll_flush_ctx(inode); case LL_IOC_GETOBDCOUNT: { int count, vallen; struct obd_export *exp; @@ -2118,11 +2072,6 @@ static long ll_dir_ioctl(struct file *file, unsigned int cmd, unsigned long arg) return 0; } - case LL_IOC_PATH2FID: - if (copy_to_user(uarg, ll_inode2fid(inode), - sizeof(struct lu_fid))) - return -EFAULT; - return 0; case LL_IOC_GET_CONNECT_FLAGS: { return obd_iocontrol(cmd, sbi->ll_md_exp, 0, NULL, uarg); @@ -2305,12 +2254,6 @@ static long ll_dir_ioctl(struct file *file, unsigned int cmd, unsigned long arg) return rc; } - case FS_IOC_FSGETXATTR: - return ll_ioctl_fsgetxattr(inode, cmd, uarg); - case FS_IOC_FSSETXATTR: - return ll_ioctl_fssetxattr(inode, cmd, uarg); - 
case LL_IOC_PROJECT: - return ll_ioctl_project(file, cmd, uarg); case LL_IOC_PCC_DETACH_BY_FID: { struct lu_pcc_detach_fid *detach; struct lu_fid *fid; @@ -2356,38 +2299,13 @@ static long ll_dir_ioctl(struct file *file, unsigned int cmd, unsigned long arg) kfree(detach); return rc; } -#ifdef CONFIG_FS_ENCRYPTION - case FS_IOC_SET_ENCRYPTION_POLICY: - if (!ll_sbi_has_encrypt(ll_i2sbi(inode))) - return -EOPNOTSUPP; - return fscrypt_ioctl_set_policy(file, uarg); - case FS_IOC_GET_ENCRYPTION_POLICY_EX: - if (!ll_sbi_has_encrypt(ll_i2sbi(inode))) - return -EOPNOTSUPP; - return fscrypt_ioctl_get_policy_ex(file, uarg); - case FS_IOC_ADD_ENCRYPTION_KEY: - if (!ll_sbi_has_encrypt(ll_i2sbi(inode))) - return -EOPNOTSUPP; - rc = fscrypt_ioctl_add_key(file, uarg); - if (!rc) - sptlrpc_enc_pool_add_user(); - return rc; - case FS_IOC_REMOVE_ENCRYPTION_KEY: - if (!ll_sbi_has_encrypt(ll_i2sbi(inode))) - return -EOPNOTSUPP; - return fscrypt_ioctl_remove_key(file, uarg); - case FS_IOC_REMOVE_ENCRYPTION_KEY_ALL_USERS: - if (!ll_sbi_has_encrypt(ll_i2sbi(inode))) - return -EOPNOTSUPP; - return fscrypt_ioctl_remove_key_all_users(file, uarg); - case FS_IOC_GET_ENCRYPTION_KEY_STATUS: - if (!ll_sbi_has_encrypt(ll_i2sbi(inode))) - return -EOPNOTSUPP; - return fscrypt_ioctl_get_key_status(file, uarg); -#endif default: - return obd_iocontrol(cmd, sbi->ll_dt_exp, 0, NULL, uarg); + rc = ll_iocontrol(inode, file, cmd, uarg); + if (rc == -ENOTTY) + rc = obd_iocontrol(cmd, sbi->ll_dt_exp, 0, NULL, uarg); + break; } + return rc; } static loff_t ll_dir_seek(struct file *file, loff_t offset, int origin) diff --git a/fs/lustre/llite/file.c b/fs/lustre/llite/file.c index 44197a8..ceac08c 100644 --- a/fs/lustre/llite/file.c +++ b/fs/lustre/llite/file.c @@ -4083,7 +4083,7 @@ static int ll_heat_set(struct inode *inode, enum lu_heat_flag flags) struct inode *inode = file_inode(file); struct ll_file_data *fd = file->private_data; void __user *uarg = (void __user *)arg; - int flags, rc; + int flags, rc = 0; 
CDEBUG(D_VFSTRACE, "VFS Op:inode=" DFID "(%p),cmd=%x\n", PFID(ll_inode2fid(inode)), inode, cmd); @@ -4471,14 +4471,6 @@ static int ll_heat_set(struct inode *inode, enum lu_heat_flag flags) fd->fd_designated_mirror = arg; return 0; } - case FS_IOC_FSGETXATTR: - return ll_ioctl_fsgetxattr(inode, cmd, uarg); - case FS_IOC_FSSETXATTR: - return ll_ioctl_fssetxattr(inode, cmd, uarg); - case LL_IOC_PROJECT: - return ll_ioctl_project(file, cmd, uarg); - case BLKSSZGET: - return put_user(PAGE_SIZE, (int __user *)arg); case LL_IOC_HEAT_GET: { struct lu_heat uheat; struct lu_heat *heat; @@ -4563,49 +4555,14 @@ static int ll_heat_set(struct inode *inode, enum lu_heat_flag flags) kfree(state); return rc; } -#ifdef CONFIG_FS_ENCRYPTION - case FS_IOC_SET_ENCRYPTION_POLICY: - if (!ll_sbi_has_encrypt(ll_i2sbi(inode))) - return -EOPNOTSUPP; - return fscrypt_ioctl_set_policy(file, uarg); - case FS_IOC_GET_ENCRYPTION_POLICY_EX: - if (!ll_sbi_has_encrypt(ll_i2sbi(inode))) - return -EOPNOTSUPP; - return fscrypt_ioctl_get_policy_ex(file, uarg); - case FS_IOC_ADD_ENCRYPTION_KEY: - if (!ll_sbi_has_encrypt(ll_i2sbi(inode))) - return -EOPNOTSUPP; - return fscrypt_ioctl_add_key(file, uarg); - case FS_IOC_REMOVE_ENCRYPTION_KEY: - if (!ll_sbi_has_encrypt(ll_i2sbi(inode))) - return -EOPNOTSUPP; - return fscrypt_ioctl_remove_key(file, uarg); - case FS_IOC_REMOVE_ENCRYPTION_KEY_ALL_USERS: - if (!ll_sbi_has_encrypt(ll_i2sbi(inode))) - return -EOPNOTSUPP; - return fscrypt_ioctl_remove_key_all_users(file, uarg); - case FS_IOC_GET_ENCRYPTION_KEY_STATUS: - if (!ll_sbi_has_encrypt(ll_i2sbi(inode))) - return -EOPNOTSUPP; - return fscrypt_ioctl_get_key_status(file, uarg); -#endif - - case LL_IOC_UNLOCK_FOREIGN: { - struct dentry *dentry = file_dentry(file); - - /* if not a foreign symlink do nothing */ - if (ll_foreign_is_removable(dentry, true)) { - CDEBUG(D_INFO, - "prevent unlink of non-foreign file ("DFID")\n", - PFID(ll_inode2fid(inode))); - return -EOPNOTSUPP; - } - return 0; - } - default: - return 
obd_iocontrol(cmd, ll_i2dtexp(inode), 0, NULL, uarg); + rc = ll_iocontrol(inode, file, cmd, uarg); + if (rc == -ENOTTY) + rc = obd_iocontrol(cmd, ll_i2dtexp(inode), 0, NULL, uarg); + break; } + + return rc; } loff_t ll_lseek(struct file *file, loff_t offset, int whence) diff --git a/fs/lustre/llite/llite_lib.c b/fs/lustre/llite/llite_lib.c index 913e096..c54ca1f 100644 --- a/fs/lustre/llite/llite_lib.c +++ b/fs/lustre/llite/llite_lib.c @@ -2852,24 +2852,32 @@ void ll_delete_inode(struct inode *inode) clear_inode(inode); } +/* ioctl commands shared between files and directories */ int ll_iocontrol(struct inode *inode, struct file *file, unsigned int cmd, void __user *uarg) { struct ll_sb_info *sbi = ll_i2sbi(inode); struct ptlrpc_request *req = NULL; - int rc, flags = 0; + int rc = 0, flags = 0; switch (cmd) { + case BLKSSZGET: + rc = put_user(PAGE_SIZE, (int __user *)uarg); + break; + case FSFILT_IOC_GETVERSION: + case FS_IOC_GETVERSION: + rc = put_user(inode->i_generation, (int __user *)uarg); + break; case FS_IOC_GETFLAGS: { struct mdt_body *body; struct md_op_data *op_data; - op_data = ll_prep_md_op_data(NULL, inode, NULL, NULL, - 0, 0, LUSTRE_OPC_ANY, - NULL); - if (IS_ERR(op_data)) - return PTR_ERR(op_data); - + op_data = ll_prep_md_op_data(NULL, inode, NULL, NULL, 0, 0, + LUSTRE_OPC_ANY, NULL); + if (IS_ERR(op_data)) { + rc = PTR_ERR(op_data); + break; + } op_data->op_valid = OBD_MD_FLFLAGS; rc = md_getattr(sbi->ll_md_exp, op_data, &req); ll_finish_md_op_data(op_data); @@ -2877,7 +2885,8 @@ int ll_iocontrol(struct inode *inode, struct file *file, CERROR("%s: failure inode " DFID ": rc = %d\n", sbi->ll_md_exp->exp_obd->obd_name, PFID(ll_inode2fid(inode)), rc); - return -abs(rc); + rc = -abs(rc); + break; } body = req_capsule_server_get(&req->rq_pill, &RMF_MDT_BODY); @@ -2891,7 +2900,8 @@ int ll_iocontrol(struct inode *inode, struct file *file, ptlrpc_req_finished(req); - return put_user(flags, (int __user *)uarg); + rc = put_user(flags, (int __user *)uarg); + 
break; } case FS_IOC_SETFLAGS: { struct md_op_data *op_data; @@ -2899,8 +2909,10 @@ int ll_iocontrol(struct inode *inode, struct file *file, struct iattr *attr; struct fsxattr fa = { 0 }; - if (get_user(flags, (int __user *)uarg)) - return -EFAULT; + if (get_user(flags, (int __user *)uarg)) { + rc = -EFAULT; + break; + } fa.fsx_projid = ll_i2info(inode)->lli_projid; if (flags & LUSTRE_PROJINHERIT_FL) @@ -2909,12 +2921,14 @@ int ll_iocontrol(struct inode *inode, struct file *file, rc = ll_ioctl_check_project(inode, fa.fsx_xflags, fa.fsx_projid); if (rc) - return rc; + break; op_data = ll_prep_md_op_data(NULL, inode, NULL, NULL, 0, 0, LUSTRE_OPC_ANY, NULL); - if (IS_ERR(op_data)) - return PTR_ERR(op_data); + if (IS_ERR(op_data)) { + rc = PTR_ERR(op_data); + break; + } op_data->op_attr_flags = flags; op_data->op_xvalid |= OP_XVALID_FLAGS; @@ -2922,27 +2936,123 @@ int ll_iocontrol(struct inode *inode, struct file *file, ll_finish_md_op_data(op_data); ptlrpc_req_finished(req); if (rc) - return rc; + break; ll_update_inode_flags(inode, flags); obj = ll_i2info(inode)->lli_clob; if (!obj) - return 0; + break; attr = kzalloc(sizeof(*attr), GFP_NOFS); - if (!attr) - return -ENOMEM; + if (!attr) { + rc = -ENOMEM; + break; + } rc = cl_setattr_ost(obj, attr, OP_XVALID_FLAGS, flags); kfree(attr); - return rc; + break; + } + case FS_IOC_FSGETXATTR: + rc = ll_ioctl_fsgetxattr(inode, cmd, uarg); + break; + case FS_IOC_FSSETXATTR: + rc = ll_ioctl_fssetxattr(inode, cmd, uarg); + break; + case LL_IOC_PROJECT: + rc = ll_ioctl_project(file, cmd, uarg); + break; + case IOC_OBD_STATFS: + rc = ll_obd_statfs(inode, uarg); + break; + case LL_IOC_GET_MDTIDX: { + rc = ll_get_mdt_idx(inode); + if (rc < 0) + break; + + if (put_user(rc, (int __user *)uarg)) + rc = -EFAULT; + + break; + } + case LL_IOC_FLUSHCTX: + rc = ll_flush_ctx(inode); + break; +#ifdef CONFIG_FS_ENCRYPTION + case FS_IOC_ADD_ENCRYPTION_KEY: + if (ll_sbi_has_encrypt(ll_i2sbi(inode))) + rc = fscrypt_ioctl_add_key(file, uarg); + 
else + rc = -EOPNOTSUPP; + break; + case FS_IOC_GET_ENCRYPTION_KEY_STATUS: + if (ll_sbi_has_encrypt(ll_i2sbi(inode))) + rc = fscrypt_ioctl_get_key_status(file, uarg); + else + rc = -EOPNOTSUPP; + break; + case FS_IOC_GET_ENCRYPTION_POLICY_EX: + if (ll_sbi_has_encrypt(ll_i2sbi(inode))) + rc = fscrypt_ioctl_get_policy_ex(file, uarg); + else + rc = -EOPNOTSUPP; + break; + case FS_IOC_SET_ENCRYPTION_POLICY: + if (ll_sbi_has_encrypt(ll_i2sbi(inode))) + rc = fscrypt_ioctl_set_policy(file, uarg); + else + rc = -EOPNOTSUPP; + break; + case FS_IOC_REMOVE_ENCRYPTION_KEY: + if (ll_sbi_has_encrypt(ll_i2sbi(inode))) + rc = fscrypt_ioctl_remove_key(file, uarg); + else + rc = -EOPNOTSUPP; + break; + case FS_IOC_REMOVE_ENCRYPTION_KEY_ALL_USERS: + if (ll_sbi_has_encrypt(ll_i2sbi(inode))) + rc = fscrypt_ioctl_remove_key_all_users(file, uarg); + else + rc = -EOPNOTSUPP; + break; +#endif + case LL_IOC_GETPARENT: + rc = ll_getparent(file, uarg); + break; + case LL_IOC_PATH2FID: + if (copy_to_user(uarg, ll_inode2fid(inode), + sizeof(struct lu_fid))) + rc = -EFAULT; + break; + case LL_IOC_UNLOCK_FOREIGN: { + struct dentry *dentry = file_dentry(file); + + /* if not a foreign symlink do nothing */ + if (ll_foreign_is_removable(dentry, true)) { + CDEBUG(D_INFO, + "prevent unlink of non-foreign file ("DFID")\n", + PFID(ll_inode2fid(inode))); + rc = -EOPNOTSUPP; + } + break; } + case OBD_IOC_FID2PATH: + rc = ll_fid2path(inode, uarg); + break; + case OBD_IOC_GETNAME_OLD: + fallthrough; + case OBD_IOC_GETDTNAME: + fallthrough; + case OBD_IOC_GETMDNAME: + rc = ll_get_obd_name(inode, cmd, uarg); + break; default: - return -EINVAL; + rc = -ENOTTY; + break; } - return 0; + return rc; } int ll_flush_ctx(struct inode *inode) From patchwork Mon Apr 17 13:47:17 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 13214110 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from pdx1-mailman-customer002.dreamhost.com (listserver-buz.dreamhost.com [69.163.136.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 3929BC77B70 for ; Mon, 17 Apr 2023 14:03:56 +0000 (UTC) Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1]) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4Q0T4B2fwrz1yHG; Mon, 17 Apr 2023 06:51:38 -0700 (PDT) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4Q0T231KkLz1wb0 for ; Mon, 17 Apr 2023 06:49:47 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id AE8B01008497; Mon, 17 Apr 2023 09:47:24 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id ACDD5379; Mon, 17 Apr 2023 09:47:24 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Mon, 17 Apr 2023 09:47:17 -0400 Message-Id: <1681739243-29375-22-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1681739243-29375-1-git-send-email-jsimmons@infradead.org> References: <1681739243-29375-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 21/27] lnet: change LNetAddPeer() to take struct lnet_nid X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.39 Precedence: list List-Id: "For discussing Lustre software development." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Mr NeilBrown Rather than an array of lnet_nid_t, LNetAddPeer now takes an array of struct lnet_nid. The array passed is *always* from struct uuid_nid_data, so that data structure is changed to store struct lnet_nid. WC-bug-id: https://jira.whamcloud.com/browse/LU-10391 Lustre-commit: 42b49afdc8a4d2ec65 ("LU-10391 lnet: change LNetAddPeer() to take struct lnet_nid") Signed-off-by: Mr NeilBrown Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50085 Reviewed-by: jsimmons Reviewed-by: Oleg Drokin Reviewed-by: Frank Sehr Reviewed-by: Chris Horn Signed-off-by: James Simmons --- fs/lustre/obdclass/lustre_peer.c | 34 +++++++++++++++++++--------------- include/linux/lnet/api.h | 2 +- net/lnet/lnet/peer.c | 17 +++++++---------- 3 files changed, 27 insertions(+), 26 deletions(-) diff --git a/fs/lustre/obdclass/lustre_peer.c b/fs/lustre/obdclass/lustre_peer.c index 5eae2eb..2049c41 100644 --- a/fs/lustre/obdclass/lustre_peer.c +++ b/fs/lustre/obdclass/lustre_peer.c @@ -44,7 +44,7 @@ struct uuid_nid_data { struct list_head un_list; struct obd_uuid un_uuid; int un_nid_count; - lnet_nid_t un_nids[MTI_NIDS_MAX]; + struct lnet_nid un_nids[MTI_NIDS_MAX]; }; /* FIXME: This should probably become more elegant than a global linked list */ @@ -65,7 +65,7 @@ int lustre_uuid_to_peer(const char *uuid, struct lnet_nid *peer_nid, int index) break; rc = 0; - lnet_nid4_to_nid(data->un_nids[index], peer_nid); + *peer_nid = data->un_nids[index]; break; } } @@ -77,13 +77,15 @@ int lustre_uuid_to_peer(const char *uuid, struct lnet_nid *peer_nid, int index) /* Add a nid to a niduuid. Multiple nids can be added to a single uuid; * LNET will choose the best one. 
*/ -int class_add_uuid(const char *uuid, u64 nid) +int class_add_uuid(const char *uuid, lnet_nid_t nid4) { struct uuid_nid_data *data, *entry; + struct lnet_nid nid; int found = 0; int rc; - LASSERT(nid != 0); /* valid newconfig NID is never zero */ + LASSERT(nid4 != 0); /* valid newconfig NID is never zero */ + lnet_nid4_to_nid(nid4, &nid); if (strlen(uuid) > UUID_MAX - 1) return -EOVERFLOW; @@ -103,7 +105,7 @@ int class_add_uuid(const char *uuid, u64 nid) found = 1; for (i = 0; i < entry->un_nid_count; i++) - if (nid == entry->un_nids[i]) + if (nid_same(&nid, &entry->un_nids[i])) break; if (i == entry->un_nid_count) { @@ -119,16 +121,16 @@ int class_add_uuid(const char *uuid, u64 nid) if (found) { CDEBUG(D_INFO, "found uuid %s %s cnt=%d\n", uuid, - libcfs_nid2str(nid), entry->un_nid_count); + libcfs_nidstr(&nid), entry->un_nid_count); rc = LNetAddPeer(entry->un_nids, entry->un_nid_count); CDEBUG(D_INFO, "Add peer %s rc = %d\n", - libcfs_nid2str(data->un_nids[0]), rc); + libcfs_nidstr(&data->un_nids[0]), rc); kfree(data); } else { - CDEBUG(D_INFO, "add uuid %s %s\n", uuid, libcfs_nid2str(nid)); + CDEBUG(D_INFO, "add uuid %s %s\n", uuid, libcfs_nidstr(&nid)); rc = LNetAddPeer(data->un_nids, data->un_nid_count); CDEBUG(D_INFO, "Add peer %s rc = %d\n", - libcfs_nid2str(data->un_nids[0]), rc); + libcfs_nidstr(&data->un_nids[0]), rc); } return 0; } @@ -167,7 +169,7 @@ int class_del_uuid(const char *uuid) CDEBUG(D_INFO, "del uuid %s %s/%d\n", obd_uuid2str(&data->un_uuid), - libcfs_nid2str(data->un_nids[0]), + libcfs_nidstr(&data->un_nids[0]), data->un_nid_count); kfree(data); @@ -200,7 +202,7 @@ int class_add_nids_to_uuid(struct obd_uuid *uuid, lnet_nid_t *nids, matched = true; CDEBUG(D_NET, "Updating UUID '%s'\n", obd_uuid2str(uuid)); for (i = 0; i < nid_count; i++) - entry->un_nids[i] = nids[i]; + lnet_nid4_to_nid(nids[i], &entry->un_nids[i]); entry->un_nid_count = nid_count; break; } @@ -208,7 +210,7 @@ int class_add_nids_to_uuid(struct obd_uuid *uuid, lnet_nid_t 
*nids, if (matched) { rc = LNetAddPeer(entry->un_nids, entry->un_nid_count); CDEBUG(D_INFO, "Add peer %s rc = %d\n", - libcfs_nid2str(entry->un_nids[0]), rc); + libcfs_nidstr(&entry->un_nids[0]), rc); } return 0; @@ -216,13 +218,15 @@ int class_add_nids_to_uuid(struct obd_uuid *uuid, lnet_nid_t *nids, EXPORT_SYMBOL(class_add_nids_to_uuid); /* check if @nid exists in nid list of @uuid */ -int class_check_uuid(struct obd_uuid *uuid, u64 nid) +int class_check_uuid(struct obd_uuid *uuid, lnet_nid_t nid4) { struct uuid_nid_data *entry; + struct lnet_nid nid; int found = 0; + lnet_nid4_to_nid(nid4, &nid); CDEBUG(D_INFO, "check if uuid %s has %s.\n", - obd_uuid2str(uuid), libcfs_nid2str(nid)); + obd_uuid2str(uuid), libcfs_nidstr(&nid)); spin_lock(&g_uuid_lock); list_for_each_entry(entry, &g_uuid_list, un_list) { @@ -233,7 +237,7 @@ int class_check_uuid(struct obd_uuid *uuid, u64 nid) /* found the uuid, check if it has @nid */ for (i = 0; i < entry->un_nid_count; i++) { - if (entry->un_nids[i] == nid) { + if (nid_same(&entry->un_nids[i], &nid)) { found = 1; break; } diff --git a/include/linux/lnet/api.h b/include/linux/lnet/api.h index 7ea61cb..f6d6c17 100644 --- a/include/linux/lnet/api.h +++ b/include/linux/lnet/api.h @@ -164,7 +164,7 @@ int LNetGet(struct lnet_nid *self, int LNetCtl(unsigned int cmd, void *arg); void LNetDebugPeer(struct lnet_processid *id); int LNetGetPeerDiscoveryStatus(void); -int LNetAddPeer(lnet_nid_t *nids, u32 num_nids); +int LNetAddPeer(struct lnet_nid *nids, u32 num_nids); /** @} lnet_misc */ diff --git a/net/lnet/lnet/peer.c b/net/lnet/lnet/peer.c index f1b0eb0d..9168641 100644 --- a/net/lnet/lnet/peer.c +++ b/net/lnet/lnet/peer.c @@ -1341,7 +1341,7 @@ struct lnet_peer_ni * } int -LNetAddPeer(lnet_nid_t *nids, u32 num_nids) +LNetAddPeer(struct lnet_nid *nids, u32 num_nids) { struct lnet_nid pnid = LNET_ANY_NID; bool mr; @@ -1361,14 +1361,11 @@ struct lnet_peer_ni * rc = 0; for (i = 0; i < num_nids; i++) { - struct lnet_nid nid; - - if (nids[i] 
== LNET_NID_LO_0) + if (nid_is_lo0(&nids[i])) continue; - lnet_nid4_to_nid(nids[i], &nid); if (LNET_NID_IS_ANY(&pnid)) { - lnet_nid4_to_nid(nids[i], &pnid); + pnid = nids[i]; rc = lnet_add_peer_ni(&pnid, &LNET_ANY_NID, mr, flags); if (rc == -EALREADY) { struct lnet_peer *lp; @@ -1384,11 +1381,11 @@ struct lnet_peer_ni * lnet_peer_decref_locked(lp); } } else if (lnet_peer_discovery_disabled) { - lnet_nid4_to_nid(nids[i], &nid); - rc = lnet_add_peer_ni(&nid, &LNET_ANY_NID, mr, flags); + rc = lnet_add_peer_ni(&nids[i], &LNET_ANY_NID, mr, + flags); } else { - lnet_nid4_to_nid(nids[i], &nid); - rc = lnet_add_peer_ni(&pnid, &nid, mr, flags); + rc = lnet_add_peer_ni(&pnid, &nids[i], mr, + flags); } if (rc && rc != -EEXIST) From patchwork Mon Apr 17 13:47:18 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 13214142 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from pdx1-mailman-customer002.dreamhost.com (listserver-buz.dreamhost.com [69.163.136.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 1A9BFC77B70 for ; Mon, 17 Apr 2023 14:12:21 +0000 (UTC) Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1]) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4Q0T7W6K51z22Vv; Mon, 17 Apr 2023 06:54:31 -0700 (PDT) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4Q0T2G44w1z21HF for ; Mon, 17 Apr 2023 06:49:58 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id B2E231008498; Mon, 17 Apr 2023 09:47:24 
-0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id B1BD9372; Mon, 17 Apr 2023 09:47:24 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Mon, 17 Apr 2023 09:47:18 -0400 Message-Id: <1681739243-29375-23-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1681739243-29375-1-git-send-email-jsimmons@infradead.org> References: <1681739243-29375-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 22/27] lustre: obdclass: change class_add/check_uuid to large nid X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.39 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Mr NeilBrown class_add_uuid() and class_check_uuid() are changed to take a struct lnet_nid* rather than a u64 (aka lnet_nid_t). 
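The new calling convention can be illustrated with a rough userspace sketch. This is not the kernel code: struct demo_nid and the demo_* helpers are hypothetical stand-ins, the field layout is simplified, byte-order handling is omitted, and the split of the 64-bit value (network in the high 32 bits, address in the low 32 bits) is stated as an assumption of the sketch. A caller that still holds a 64-bit NID converts it once and then passes a pointer to the struct form:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Simplified, hypothetical model of a "large" NID; not the kernel layout. */
struct demo_nid {
	uint8_t  type;		/* network type, 0 means "not set" */
	uint16_t num;		/* network number */
	uint32_t addr[4];	/* room for addresses wider than 32 bits */
};

/*
 * Expand a legacy 64-bit NID. Assumption of this sketch: the network
 * (type << 16 | num) sits in the high 32 bits, the address in the low 32.
 */
static void demo_nid4_to_nid(uint64_t nid4, struct demo_nid *nid)
{
	uint32_t net = (uint32_t)(nid4 >> 32);

	memset(nid, 0, sizeof(*nid));
	nid->type = (uint8_t)(net >> 16);
	nid->num = (uint16_t)(net & 0xffff);
	nid->addr[0] = (uint32_t)(nid4 & 0xffffffffu);
}

/* Field-wise comparison, playing the role that nid_same() has in the patch. */
static bool demo_nid_same(const struct demo_nid *a, const struct demo_nid *b)
{
	return a->type == b->type && a->num == b->num &&
	       memcmp(a->addr, b->addr, sizeof(a->addr)) == 0;
}

/* The callee takes a pointer to the struct form instead of a u64 value. */
static bool demo_check(const struct demo_nid *list, int count,
		       const struct demo_nid *nid)
{
	int i;

	assert(nid->type != 0);	/* a valid NID has a non-zero type here */
	for (i = 0; i < count; i++)
		if (demo_nid_same(&list[i], nid))
			return true;
	return false;
}
```

With this shape the conversion happens once at the boundary where a 64-bit NID still exists, and everything below that boundary works on the struct form only.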
WC-bug-id: https://jira.whamcloud.com/browse/LU-10391 Lustre-commit: 16d84b030520c431da ("LU-10391 obdclass: change class_add/check_uuid to large nid") Signed-off-by: Mr NeilBrown Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50086 Reviewed-by: jsimmons Reviewed-by: Frank Sehr Reviewed-by: Chris Horn Reviewed-by: Andreas Dilger Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/include/obd_class.h | 4 ++-- fs/lustre/ldlm/ldlm_lib.c | 13 +++++++++---- fs/lustre/obdclass/lustre_peer.c | 24 ++++++++++-------------- fs/lustre/obdclass/obd_config.c | 5 +++-- 4 files changed, 24 insertions(+), 22 deletions(-) diff --git a/fs/lustre/include/obd_class.h b/fs/lustre/include/obd_class.h index 0c95c3c..f77fd12 100644 --- a/fs/lustre/include/obd_class.h +++ b/fs/lustre/include/obd_class.h @@ -1742,11 +1742,11 @@ struct lwp_register_item { /* lustre_peer.c */ int lustre_uuid_to_peer(const char *uuid, struct lnet_nid *peer_nid, int index); -int class_add_uuid(const char *uuid, u64 nid); +int class_add_uuid(const char *uuid, struct lnet_nid *nid); int class_del_uuid(const char *uuid); int class_add_nids_to_uuid(struct obd_uuid *uuid, lnet_nid_t *nids, int nid_count); -int class_check_uuid(struct obd_uuid *uuid, u64 nid); +int class_check_uuid(struct obd_uuid *uuid, struct lnet_nid *nid); /* class_obd.c */ extern char obd_jobid_name[]; diff --git a/fs/lustre/ldlm/ldlm_lib.c b/fs/lustre/ldlm/ldlm_lib.c index b1ce0d4..4f9cf5f 100644 --- a/fs/lustre/ldlm/ldlm_lib.c +++ b/fs/lustre/ldlm/ldlm_lib.c @@ -141,16 +141,18 @@ int client_import_add_conn(struct obd_import *imp, struct obd_uuid *uuid, EXPORT_SYMBOL(client_import_add_conn); int client_import_dyn_add_conn(struct obd_import *imp, struct obd_uuid *uuid, - lnet_nid_t prim_nid, int priority) + lnet_nid_t prim_nid4, int priority) { struct ptlrpc_connection *ptlrpc_conn; + struct lnet_nid prim_nid; int rc; - ptlrpc_conn = ptlrpc_uuid_to_connection(uuid, LNET_NIDNET(prim_nid)); + 
lnet_nid4_to_nid(prim_nid4, &prim_nid); + ptlrpc_conn = ptlrpc_uuid_to_connection(uuid, LNET_NID_NET(&prim_nid)); if (!ptlrpc_conn) { const char *str_uuid = obd_uuid2str(uuid); - rc = class_add_uuid(str_uuid, prim_nid); + rc = class_add_uuid(str_uuid, &prim_nid); if (rc) { CERROR("%s: failed to add UUID '%s': rc = %d\n", imp->imp_obd->obd_name, str_uuid, rc); @@ -172,7 +174,10 @@ int client_import_add_nids_to_conn(struct obd_import *imp, lnet_nid_t *nids, spin_lock(&imp->imp_lock); list_for_each_entry(conn, &imp->imp_conn_list, oic_item) { - if (class_check_uuid(&conn->oic_uuid, nids[0])) { + struct lnet_nid nid; + + lnet_nid4_to_nid(nids[0], &nid); + if (class_check_uuid(&conn->oic_uuid, &nid)) { *uuid = conn->oic_uuid; spin_unlock(&imp->imp_lock); rc = class_add_nids_to_uuid(&conn->oic_uuid, nids, diff --git a/fs/lustre/obdclass/lustre_peer.c b/fs/lustre/obdclass/lustre_peer.c index 2049c41..aae69d3 100644 --- a/fs/lustre/obdclass/lustre_peer.c +++ b/fs/lustre/obdclass/lustre_peer.c @@ -77,15 +77,13 @@ int lustre_uuid_to_peer(const char *uuid, struct lnet_nid *peer_nid, int index) /* Add a nid to a niduuid. Multiple nids can be added to a single uuid; * LNET will choose the best one. 
*/ -int class_add_uuid(const char *uuid, lnet_nid_t nid4) +int class_add_uuid(const char *uuid, struct lnet_nid *nid) { struct uuid_nid_data *data, *entry; - struct lnet_nid nid; int found = 0; int rc; - LASSERT(nid4 != 0); /* valid newconfig NID is never zero */ - lnet_nid4_to_nid(nid4, &nid); + LASSERT(nid->nid_type != 0); /* valid newconfig NID is never zero */ if (strlen(uuid) > UUID_MAX - 1) return -EOVERFLOW; @@ -95,7 +93,7 @@ int class_add_uuid(const char *uuid, lnet_nid_t nid4) return -ENOMEM; obd_str2uuid(&data->un_uuid, uuid); - data->un_nids[0] = nid; + data->un_nids[0] = *nid; data->un_nid_count = 1; spin_lock(&g_uuid_lock); @@ -105,12 +103,12 @@ int class_add_uuid(const char *uuid, lnet_nid_t nid4) found = 1; for (i = 0; i < entry->un_nid_count; i++) - if (nid_same(&nid, &entry->un_nids[i])) + if (nid_same(nid, &entry->un_nids[i])) break; if (i == entry->un_nid_count) { LASSERT(entry->un_nid_count < MTI_NIDS_MAX); - entry->un_nids[entry->un_nid_count++] = nid; + entry->un_nids[entry->un_nid_count++] = *nid; } break; } @@ -121,13 +119,13 @@ int class_add_uuid(const char *uuid, lnet_nid_t nid4) if (found) { CDEBUG(D_INFO, "found uuid %s %s cnt=%d\n", uuid, - libcfs_nidstr(&nid), entry->un_nid_count); + libcfs_nidstr(nid), entry->un_nid_count); rc = LNetAddPeer(entry->un_nids, entry->un_nid_count); CDEBUG(D_INFO, "Add peer %s rc = %d\n", libcfs_nidstr(&data->un_nids[0]), rc); kfree(data); } else { - CDEBUG(D_INFO, "add uuid %s %s\n", uuid, libcfs_nidstr(&nid)); + CDEBUG(D_INFO, "add uuid %s %s\n", uuid, libcfs_nidstr(nid)); rc = LNetAddPeer(data->un_nids, data->un_nid_count); CDEBUG(D_INFO, "Add peer %s rc = %d\n", libcfs_nidstr(&data->un_nids[0]), rc); @@ -218,15 +216,13 @@ int class_add_nids_to_uuid(struct obd_uuid *uuid, lnet_nid_t *nids, EXPORT_SYMBOL(class_add_nids_to_uuid); /* check if @nid exists in nid list of @uuid */ -int class_check_uuid(struct obd_uuid *uuid, lnet_nid_t nid4) +int class_check_uuid(struct obd_uuid *uuid, struct lnet_nid *nid) { 
struct uuid_nid_data *entry; - struct lnet_nid nid; int found = 0; - lnet_nid4_to_nid(nid4, &nid); CDEBUG(D_INFO, "check if uuid %s has %s.\n", - obd_uuid2str(uuid), libcfs_nidstr(&nid)); + obd_uuid2str(uuid), libcfs_nidstr(nid)); spin_lock(&g_uuid_lock); list_for_each_entry(entry, &g_uuid_list, un_list) { @@ -237,7 +233,7 @@ int class_check_uuid(struct obd_uuid *uuid, lnet_nid_t nid4) /* found the uuid, check if it has @nid */ for (i = 0; i < entry->un_nid_count; i++) { - if (nid_same(&entry->un_nids[i], &nid)) { + if (nid_same(&entry->un_nids[i], nid)) { found = 1; break; } diff --git a/fs/lustre/obdclass/obd_config.c b/fs/lustre/obdclass/obd_config.c index f2173df..81021e1 100644 --- a/fs/lustre/obdclass/obd_config.c +++ b/fs/lustre/obdclass/obd_config.c @@ -822,6 +822,7 @@ static int process_param2_config(struct lustre_cfg *lcfg) int class_process_config(struct lustre_cfg *lcfg) { struct obd_device *obd; + struct lnet_nid nid; int err; LASSERT(lcfg && !IS_ERR(lcfg)); @@ -839,8 +840,8 @@ int class_process_config(struct lustre_cfg *lcfg) lustre_cfg_string(lcfg, 1), lcfg->lcfg_nid, libcfs_nid2str(lcfg->lcfg_nid)); - err = class_add_uuid(lustre_cfg_string(lcfg, 1), - lcfg->lcfg_nid); + lnet_nid4_to_nid(lcfg->lcfg_nid, &nid); + err = class_add_uuid(lustre_cfg_string(lcfg, 1), &nid); goto out; } case LCFG_DEL_UUID: { From patchwork Mon Apr 17 13:47:19 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 13214134 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from pdx1-mailman-customer002.dreamhost.com (listserver-buz.dreamhost.com [69.163.136.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 58207C77B76 for ; Mon, 17 Apr 2023 14:07:07 +0000 (UTC) Received: from 
pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1]) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4Q0T4l5Yqzz22RF; Mon, 17 Apr 2023 06:52:07 -0700 (PDT) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4Q0T2P6Gmhz21Hs for ; Mon, 17 Apr 2023 06:50:05 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id B7DE21008499; Mon, 17 Apr 2023 09:47:24 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id B68C1375; Mon, 17 Apr 2023 09:47:24 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Mon, 17 Apr 2023 09:47:19 -0400 Message-Id: <1681739243-29375-24-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1681739243-29375-1-git-send-email-jsimmons@infradead.org> References: <1681739243-29375-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 23/27] lustre: obdclass: rename class_parse_nid to class_parse_nid4 X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.39 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Mr NeilBrown Use the name "nid4" for class_parse_nid(), class_parse_nid_quiet(), parse_nid() and CLASS_PARSE_NID. This will allow a new class_parse_nid() which handles larger nids.
WC-bug-id: https://jira.whamcloud.com/browse/LU-10391 Lustre-commit: b4a28a3269fadb1059 ("LU-10391 lustre: rename class_parse_nid to class_parse_nid4") Signed-off-by: Mr NeilBrown Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50088 Reviewed-by: jsimmons Reviewed-by: Frank Sehr Reviewed-by: Chris Horn Reviewed-by: Andreas Dilger Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/include/obd_class.h | 5 ++--- fs/lustre/obdclass/obd_config.c | 26 +++++++++++++------------- fs/lustre/obdclass/obd_mount.c | 8 ++++---- 3 files changed, 19 insertions(+), 20 deletions(-) diff --git a/fs/lustre/include/obd_class.h b/fs/lustre/include/obd_class.h index f77fd12..2b66bc4 100644 --- a/fs/lustre/include/obd_class.h +++ b/fs/lustre/include/obd_class.h @@ -144,10 +144,9 @@ struct cfg_interop_param { struct cfg_interop_param *class_find_old_param(const char *param, struct cfg_interop_param *ptr); int class_get_next_param(char **params, char *copy); -int class_parse_nid(char *buf, lnet_nid_t *nid, char **endh); -int class_parse_nid_quiet(char *buf, lnet_nid_t *nid, char **endh); +int class_parse_nid4(char *buf, lnet_nid_t *nid4, char **endh); +int class_parse_nid4_quiet(char *buf, lnet_nid_t *nid4, char **endh); int class_parse_net(char *buf, u32 *net, char **endh); -int class_match_nid(char *buf, char *key, lnet_nid_t nid); int class_match_net(char *buf, char *key, u32 net); struct obd_device *class_incref(struct obd_device *obd, diff --git a/fs/lustre/obdclass/obd_config.c b/fs/lustre/obdclass/obd_config.c index 81021e1..eb14ca8 100644 --- a/fs/lustre/obdclass/obd_config.c +++ b/fs/lustre/obdclass/obd_config.c @@ -152,12 +152,12 @@ static int class_match_param(char *buf, const char *key, char **valp) return 0; } -static int parse_nid(char *buf, void *value, int quiet) +static int parse_nid4(char *buf, void *value, int quiet) { - lnet_nid_t *nid = value; + lnet_nid_t *nid4 = value; - *nid = libcfs_str2nid(buf); - if (*nid != LNET_NID_ANY) + *nid4 
= libcfs_str2nid(buf); + if (*nid4 != LNET_NID_ANY) return 0; if (!quiet) @@ -175,7 +175,7 @@ static int parse_net(char *buf, void *value) } enum { - CLASS_PARSE_NID = 1, + CLASS_PARSE_NID4 = 1, CLASS_PARSE_NET, }; @@ -208,8 +208,8 @@ static int class_parse_value(char *buf, int opc, void *value, char **endh, switch (opc) { default: LBUG(); - case CLASS_PARSE_NID: - rc = parse_nid(buf, value, quiet); + case CLASS_PARSE_NID4: + rc = parse_nid4(buf, value, quiet); break; case CLASS_PARSE_NET: rc = parse_net(buf, value); @@ -223,17 +223,17 @@ static int class_parse_value(char *buf, int opc, void *value, char **endh, return 0; } -int class_parse_nid(char *buf, lnet_nid_t *nid, char **endh) +int class_parse_nid4(char *buf, lnet_nid_t *nid4, char **endh) { - return class_parse_value(buf, CLASS_PARSE_NID, (void *)nid, endh, 0); + return class_parse_value(buf, CLASS_PARSE_NID4, (void *)nid4, endh, 0); } -EXPORT_SYMBOL(class_parse_nid); +EXPORT_SYMBOL(class_parse_nid4); -int class_parse_nid_quiet(char *buf, lnet_nid_t *nid, char **endh) +int class_parse_nid4_quiet(char *buf, lnet_nid_t *nid4, char **endh) { - return class_parse_value(buf, CLASS_PARSE_NID, (void *)nid, endh, 1); + return class_parse_value(buf, CLASS_PARSE_NID4, (void *)nid4, endh, 1); } -EXPORT_SYMBOL(class_parse_nid_quiet); +EXPORT_SYMBOL(class_parse_nid4_quiet); char *lustre_cfg_string(struct lustre_cfg *lcfg, u32 index) { diff --git a/fs/lustre/obdclass/obd_mount.c b/fs/lustre/obdclass/obd_mount.c index 58ca72d..6eaa214 100644 --- a/fs/lustre/obdclass/obd_mount.c +++ b/fs/lustre/obdclass/obd_mount.c @@ -228,7 +228,7 @@ int lustre_start_mgc(struct super_block *sb) /* Use nids from mount line: uml1,1@elan:uml2,2@elan:/lustre */ ptr = lsi->lsi_lmd->lmd_dev; - if (class_parse_nid(ptr, &nid, &ptr) == 0) + if (class_parse_nid4(ptr, &nid, &ptr) == 0) i++; if (i == 0) { CERROR("No valid MGS nids found.\n"); @@ -314,7 +314,7 @@ int lustre_start_mgc(struct super_block *sb) i = 0; /* Use nids from mount line: 
uml1,1@elan:uml2,2@elan:/lustre */ ptr = lsi->lsi_lmd->lmd_dev; - while (class_parse_nid(ptr, &nid, &ptr) == 0) { + while (class_parse_nid4(ptr, &nid, &ptr) == 0) { rc = do_lcfg(mgcname, nid, LCFG_ADD_UUID, niduuid, NULL, NULL, NULL); if (!rc) @@ -354,7 +354,7 @@ int lustre_start_mgc(struct super_block *sb) /* New failover node */ sprintf(niduuid, "%s_%x", mgcname, i); j = 0; - while (class_parse_nid_quiet(ptr, &nid, &ptr) == 0) { + while (class_parse_nid4_quiet(ptr, &nid, &ptr) == 0) { rc = do_lcfg(mgcname, nid, LCFG_ADD_UUID, niduuid, NULL, NULL, NULL); if (!rc) @@ -870,7 +870,7 @@ static int lmd_parse_mgs(struct lustre_mount_data *lmd, char **ptr) int oldlen = 0; /* Find end of nidlist */ - while (class_parse_nid_quiet(tail, &nid, &tail) == 0) + while (class_parse_nid4_quiet(tail, &nid, &tail) == 0) ; length = tail - *ptr; if (length == 0) { From patchwork Mon Apr 17 13:47:20 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 13214143 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from pdx1-mailman-customer002.dreamhost.com (listserver-buz.dreamhost.com [69.163.136.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id D6F84C77B76 for ; Mon, 17 Apr 2023 14:13:58 +0000 (UTC) Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1]) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4Q0T7j4HkCz1yG6; Mon, 17 Apr 2023 06:54:41 -0700 (PDT) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4Q0T2c18Hsz21Hy for ; Mon, 17 Apr 2023 06:50:16 -0700 (PDT) Received: from 
star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id BCAC1100849A; Mon, 17 Apr 2023 09:47:24 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id BB4C6379; Mon, 17 Apr 2023 09:47:24 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Mon, 17 Apr 2023 09:47:20 -0400 Message-Id: <1681739243-29375-25-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1681739243-29375-1-git-send-email-jsimmons@infradead.org> References: <1681739243-29375-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 24/27] lustre: llite: only first sync to MDS matter X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.39 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Alex Zhuravlev fsync() is supposed to sync metadata and data, but given the file's layout, only the first MDS_SYNC matters, to ensure that the file creation has been committed; everything else (data and attributes) goes to the OSTs. Changes to uid/gid/mode, EAs and ACLs must also be subject to sync.
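The flag handling described above can be sketched in plain userspace C. This is only an illustration under stated assumptions: struct demo_inode and the demo_* functions are hypothetical stand-ins for the per-inode state and the MDS sync RPC, the simulated RPC always succeeds, and the regular-file check done by the real patch is left out:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-inode state; only the flag matters here. */
struct demo_inode {
	bool synced_to_mds;	/* plays the role of lli_synced_to_mds */
	int mds_sync_rpcs;	/* counts simulated MDS sync RPCs */
};

/* Simulated MDS sync RPC; always succeeds in this sketch. */
static int demo_md_fsync(struct demo_inode *ino)
{
	assert(ino);
	ino->mds_sync_rpcs++;
	return 0;
}

/* fsync path: contact the MDS only if metadata changed since the last sync. */
static int demo_fsync(struct demo_inode *ino)
{
	int err = 0;

	assert(ino);
	if (!ino->synced_to_mds) {
		err = demo_md_fsync(ino);
		if (!err)
			ino->synced_to_mds = true;
	}
	/* data and OST-side attributes would be synced to the OSTs here */
	return err;
}

/* Any metadata update (setattr, xattr, ACL) invalidates the flag. */
static void demo_setattr(struct demo_inode *ino)
{
	assert(ino);
	ino->synced_to_mds = false;
}
```

In this model, repeated fsync() calls on an unchanged inode send a single MDS sync, and a later metadata update makes the next fsync() contact the MDS again.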
WC-bug-id: https://jira.whamcloud.com/browse/LU-11404 Lustre-commit: 1c8a49bedff2746775 ("LU-11404 llite: only first sync to MDS matter") Signed-off-by: Alex Zhuravlev Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/33175 Reviewed-by: Andreas Dilger Reviewed-by: Mikhail Pershin Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/llite/acl.c | 2 ++ fs/lustre/llite/file.c | 19 ++++++++++++++----- fs/lustre/llite/llite_internal.h | 1 + fs/lustre/llite/llite_lib.c | 1 + fs/lustre/llite/xattr.c | 1 + 5 files changed, 19 insertions(+), 5 deletions(-) diff --git a/fs/lustre/llite/acl.c b/fs/lustre/llite/acl.c index bd045cc..91a7421 100644 --- a/fs/lustre/llite/acl.c +++ b/fs/lustre/llite/acl.c @@ -94,6 +94,8 @@ int ll_set_acl(struct inode *inode, struct posix_acl *acl, int type) rc = md_setxattr(sbi->ll_md_exp, ll_inode2fid(inode), value ? OBD_MD_FLXATTR : OBD_MD_FLXATTRRM, name, value, value_size, 0, 0, &req); + if (!rc) + ll_i2info(inode)->lli_synced_to_mds = false; ptlrpc_req_finished(req); out_value: diff --git a/fs/lustre/llite/file.c b/fs/lustre/llite/file.c index ceac08c..0186db4 100644 --- a/fs/lustre/llite/file.c +++ b/fs/lustre/llite/file.c @@ -4758,11 +4758,20 @@ int ll_fsync(struct file *file, loff_t start, loff_t end, int datasync) } } - err = md_fsync(ll_i2sbi(inode)->ll_md_exp, ll_inode2fid(inode), &req); - if (!rc) - rc = err; - if (!err) - ptlrpc_req_finished(req); + if (S_ISREG(inode->i_mode) && !lli->lli_synced_to_mds) { + /* + * only the first sync on MDS makes sense, + * everything else is stored on OSTs + */ + err = md_fsync(ll_i2sbi(inode)->ll_md_exp, + ll_inode2fid(inode), &req); + if (!rc) + rc = err; + if (!err) { + lli->lli_synced_to_mds = true; + ptlrpc_req_finished(req); + } + } if (S_ISREG(inode->i_mode)) { struct ll_file_data *fd = file->private_data; diff --git a/fs/lustre/llite/llite_internal.h b/fs/lustre/llite/llite_internal.h index 6590399..129c817 100644 --- a/fs/lustre/llite/llite_internal.h +++ 
b/fs/lustre/llite/llite_internal.h @@ -155,6 +155,7 @@ struct ll_inode_info { s64 lli_ctime; s64 lli_btime; spinlock_t lli_agl_lock; + bool lli_synced_to_mds; /* inode specific open lock caching threshold */ u32 lli_open_thrsh_count; diff --git a/fs/lustre/llite/llite_lib.c b/fs/lustre/llite/llite_lib.c index c54ca1f..002e870 100644 --- a/fs/lustre/llite/llite_lib.c +++ b/fs/lustre/llite/llite_lib.c @@ -2181,6 +2181,7 @@ int ll_setattr_raw(struct dentry *dentry, struct iattr *attr, rc = ll_md_setattr(dentry, op_data); if (rc) goto out; + lli->lli_synced_to_mds = false; if (!S_ISREG(inode->i_mode) || hsm_import) { rc = 0; diff --git a/fs/lustre/llite/xattr.c b/fs/lustre/llite/xattr.c index c90f501..ceaad60 100644 --- a/fs/lustre/llite/xattr.c +++ b/fs/lustre/llite/xattr.c @@ -160,6 +160,7 @@ static int ll_xattr_set_common(const struct xattr_handler *handler, } return rc; } + ll_i2info(inode)->lli_synced_to_mds = false; ptlrpc_req_finished(req); From patchwork Mon Apr 17 13:47:21 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 13214136 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from pdx1-mailman-customer002.dreamhost.com (listserver-buz.dreamhost.com [69.163.136.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 44AE9C77B70 for ; Mon, 17 Apr 2023 14:07:50 +0000 (UTC) Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1]) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4Q0T5G5DZ8z215b; Mon, 17 Apr 2023 06:52:34 -0700 (PDT) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by pdx1-mailman-customer002.dreamhost.com (Postfix) 
with ESMTPS id 4Q0T2p2FRyz21JT for ; Mon, 17 Apr 2023 06:50:26 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id C158B100849B; Mon, 17 Apr 2023 09:47:24 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id C00BA372; Mon, 17 Apr 2023 09:47:24 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Mon, 17 Apr 2023 09:47:21 -0400 Message-Id: <1681739243-29375-26-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1681739243-29375-1-git-send-email-jsimmons@infradead.org> References: <1681739243-29375-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 25/27] lustre: statahead: batched statahead processing X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.39 Precedence: list List-Id: "For discussing Lustre software development." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Qian Yingjin Batched metadata processing can give a big performance boost. This patch implements a batched statahead mechanism, which can also improve performance for a directory traversal or listing such as the command 'ls'. With batched statahead, one batched getattr() RPC is equivalent to 'N' normal lookup/getattr RPCs. It can pack a number of dentry names obtained from the readdir() call, together with lock handles prepared in the client-side lock namespace, into one large batched RPC transferred via bulk I/O, to obtain ibits DLM locks and the associated attributes for many files in one go. When the MDS receives a batched getattr() RPC, it executes the sub-requests in it serially, one by one. A tunable parameter named "statahead_batch_max" is defined; it is the maximum number of items that can be batched and processed within one aggregate RPC.
Once the number of sub-requests exceeds this limit, the batched RPC is packed and triggered. The batched RPC is also triggered explicitly when the readdir() call reaches the end of the directory or the statahead thread exits abnormally. The mdtest performance results without and with this patch series, respectively, are as follows: mdtest-easy-stat 720.562369 kIOPS : time 118.695 seconds mdtest-easy-stat 1218.290192 kIOPS : time 70.656 seconds In this patch, statahead_batch_max is set to 0, so batched statahead is disabled by default. It will be enabled once some subsequent fixes for batched RPC have been merged. WC-bug-id: https://jira.whamcloud.com/browse/LU-14139 Lustre-commit: 4435d0121f72aac3ad ("LU-14139 statahead: batched statahead processing") Signed-off-by: Qian Yingjin Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/40720 Reviewed-by: Alex Zhuravlev Reviewed-by: Oleg Drokin Reviewed-by: Andreas Dilger Signed-off-by: James Simmons --- fs/lustre/include/lustre_dlm.h | 10 ++- fs/lustre/include/lustre_req_layout.h | 7 ++ fs/lustre/include/obd.h | 2 + fs/lustre/ldlm/ldlm_request.c | 80 ++++++++++++++--- fs/lustre/llite/llite_internal.h | 18 +++- fs/lustre/llite/llite_lib.c | 4 +- fs/lustre/llite/lproc_llite.c | 47 ++++++++-- fs/lustre/llite/statahead.c | 98 +++++++++++++++++--- fs/lustre/lmv/lmv_obd.c | 27 ++++++ fs/lustre/mdc/mdc_batch.c | 163 +++++++++++++++++++++++++++++++++- fs/lustre/mdc/mdc_dev.c | 4 +- fs/lustre/mdc/mdc_internal.h | 6 ++ fs/lustre/mdc/mdc_locks.c | 24 ++--- fs/lustre/osc/osc_request.c | 5 +- fs/lustre/ptlrpc/layout.c | 40 +++++++++ 15 files changed, 485 insertions(+), 50 deletions(-) diff --git a/fs/lustre/include/lustre_dlm.h b/fs/lustre/include/lustre_dlm.h index d08c48f..a3a339f 100644 --- a/fs/lustre/include/lustre_dlm.h +++ b/fs/lustre/include/lustre_dlm.h @@ -1342,11 +1342,19 @@ int ldlm_prep_elc_req(struct obd_export *exp, struct list_head
*cancels, int count); struct ptlrpc_request *ldlm_enqueue_pack(struct obd_export *exp, int lvb_len); -int ldlm_cli_enqueue_fini(struct obd_export *exp, struct ptlrpc_request *req, +int ldlm_cli_enqueue_fini(struct obd_export *exp, struct req_capsule *pill, struct ldlm_enqueue_info *einfo, u8 with_policy, u64 *flags, void *lvb, u32 lvb_len, const struct lustre_handle *lockh, int rc, bool request_slot); +int ldlm_cli_lock_create_pack(struct obd_export *exp, + struct ldlm_request *dlmreq, + struct ldlm_enqueue_info *einfo, + const struct ldlm_res_id *res_id, + union ldlm_policy_data const *policy, + u64 *flags, void *lvb, u32 lvb_len, + enum lvb_type lvb_type, + struct lustre_handle *lockh); int ldlm_cli_convert_req(struct ldlm_lock *lock, u32 *flags, u64 new_bits); int ldlm_cli_convert(struct ldlm_lock *lock, enum ldlm_cancel_flags cancel_flags); diff --git a/fs/lustre/include/lustre_req_layout.h b/fs/lustre/include/lustre_req_layout.h index a7ed89b..505e9a1 100644 --- a/fs/lustre/include/lustre_req_layout.h +++ b/fs/lustre/include/lustre_req_layout.h @@ -80,6 +80,12 @@ void req_capsule_init(struct req_capsule *pill, struct ptlrpc_request *req, void req_capsule_fini(struct req_capsule *pill); void req_capsule_set(struct req_capsule *pill, const struct req_format *fmt); +void req_capsule_subreq_init(struct req_capsule *pill, + const struct req_format *fmt, + struct ptlrpc_request *req, + struct lustre_msg *reqmsg, + struct lustre_msg *repmsg, + enum req_location loc); size_t req_capsule_filled_sizes(struct req_capsule *pill, enum req_location loc); int req_capsule_server_pack(struct req_capsule *pill); @@ -282,6 +288,7 @@ static inline void req_capsule_set_rep_swabbed(struct req_capsule *pill, extern struct req_format RQF_CONNECT; /* Batch UpdaTe req_format */ +extern struct req_format RQF_BUT_GETATTR; extern struct req_format RQF_MDS_BATCH; /* Batch UpdaTe format */ diff --git a/fs/lustre/include/obd.h b/fs/lustre/include/obd.h index bd167ac..4d65775 100644 --- 
a/fs/lustre/include/obd.h +++ b/fs/lustre/include/obd.h @@ -852,6 +852,8 @@ struct md_op_item { struct inode *mop_dir; struct req_capsule *mop_pill; struct work_struct mop_work; + u64 mop_lock_flags; + unsigned int mop_subpill_allocated:1; }; enum lu_batch_flags { diff --git a/fs/lustre/ldlm/ldlm_request.c b/fs/lustre/ldlm/ldlm_request.c index 11071d9..57cf1c0 100644 --- a/fs/lustre/ldlm/ldlm_request.c +++ b/fs/lustre/ldlm/ldlm_request.c @@ -369,7 +369,7 @@ static bool ldlm_request_slot_needed(struct ldlm_enqueue_info *einfo) * * Called after receiving reply from server. */ -int ldlm_cli_enqueue_fini(struct obd_export *exp, struct ptlrpc_request *req, +int ldlm_cli_enqueue_fini(struct obd_export *exp, struct req_capsule *pill, struct ldlm_enqueue_info *einfo, u8 with_policy, u64 *ldlm_flags, void *lvb, u32 lvb_len, const struct lustre_handle *lockh, @@ -382,10 +382,17 @@ int ldlm_cli_enqueue_fini(struct obd_export *exp, struct ptlrpc_request *req, struct ldlm_reply *reply; int cleanup_phase = 1; - if (request_slot) - obd_put_request_slot(&req->rq_import->imp_obd->u.cli); + if (req_capsule_ptlreq(pill)) { + struct ptlrpc_request *req = pill->rc_req; - ptlrpc_put_mod_rpc_slot(req); + if (request_slot) + obd_put_request_slot(&req->rq_import->imp_obd->u.cli); + + ptlrpc_put_mod_rpc_slot(req); + + if (req && req->rq_svc_thread) + env = req->rq_svc_thread->t_env; + } lock = ldlm_handle2lock(lockh); /* ldlm_cli_enqueue is holding a reference on this lock. 
*/ @@ -407,7 +414,7 @@ int ldlm_cli_enqueue_fini(struct obd_export *exp, struct ptlrpc_request *req, } /* Before we return, swab the reply */ - reply = req_capsule_server_get(&req->rq_pill, &RMF_DLM_REP); + reply = req_capsule_server_get(pill, &RMF_DLM_REP); if (!reply) { rc = -EPROTO; goto cleanup; @@ -416,8 +423,7 @@ int ldlm_cli_enqueue_fini(struct obd_export *exp, struct ptlrpc_request *req, if (lvb_len > 0) { int size = 0; - size = req_capsule_get_size(&req->rq_pill, &RMF_DLM_LVB, - RCL_SERVER); + size = req_capsule_get_size(pill, &RMF_DLM_LVB, RCL_SERVER); if (size < 0) { LDLM_ERROR(lock, "Fail to get lvb_len, rc = %d", size); rc = size; @@ -434,7 +440,7 @@ int ldlm_cli_enqueue_fini(struct obd_export *exp, struct ptlrpc_request *req, if (rc == ELDLM_LOCK_ABORTED) { if (lvb_len > 0 && lvb) - rc = ldlm_fill_lvb(lock, &req->rq_pill, RCL_SERVER, + rc = ldlm_fill_lvb(lock, pill, RCL_SERVER, lvb, lvb_len); if (rc == 0) rc = ELDLM_LOCK_ABORTED; @@ -520,7 +526,7 @@ int ldlm_cli_enqueue_fini(struct obd_export *exp, struct ptlrpc_request *req, */ lock_res_and_lock(lock); if (!ldlm_is_granted(lock)) - rc = ldlm_fill_lvb(lock, &req->rq_pill, RCL_SERVER, + rc = ldlm_fill_lvb(lock, pill, RCL_SERVER, lock->l_lvb_data, lvb_len); unlock_res_and_lock(lock); if (rc < 0) { @@ -857,8 +863,9 @@ int ldlm_cli_enqueue(struct obd_export *exp, struct ptlrpc_request **reqp, rc = ptlrpc_queue_wait(req); - err = ldlm_cli_enqueue_fini(exp, req, einfo, policy ? 1 : 0, flags, - lvb, lvb_len, lockh, rc, need_req_slot); + err = ldlm_cli_enqueue_fini(exp, &req->rq_pill, einfo, policy ? 1 : 0, + flags, lvb, lvb_len, lockh, rc, + need_req_slot); /* * If ldlm_cli_enqueue_fini did not find the lock, we need to free @@ -880,6 +887,57 @@ int ldlm_cli_enqueue(struct obd_export *exp, struct ptlrpc_request **reqp, EXPORT_SYMBOL(ldlm_cli_enqueue); /** + * Client-side IBITS lock create and pack for WBC EX lock request. 
+ */ +int ldlm_cli_lock_create_pack(struct obd_export *exp, + struct ldlm_request *dlmreq, + struct ldlm_enqueue_info *einfo, + const struct ldlm_res_id *res_id, + union ldlm_policy_data const *policy, + u64 *flags, void *lvb, u32 lvb_len, + enum lvb_type lvb_type, + struct lustre_handle *lockh) +{ + const struct ldlm_callback_suite cbs = { + .lcs_completion = einfo->ei_cb_cp, + .lcs_blocking = einfo->ei_cb_bl, + .lcs_glimpse = einfo->ei_cb_gl + }; + struct ldlm_namespace *ns; + struct ldlm_lock *lock; + + LASSERT(exp); + LASSERT(!(*flags & LDLM_FL_REPLAY)); + + ns = exp->exp_obd->obd_namespace; + lock = ldlm_lock_create(ns, res_id, einfo->ei_type, einfo->ei_mode, + &cbs, einfo->ei_cbdata, lvb_len, lvb_type); + if (IS_ERR(lock)) + return PTR_ERR(lock); + + /* For the local lock, add the reference */ + ldlm_lock_addref_internal(lock, einfo->ei_mode); + ldlm_lock2handle(lock, lockh); + if (policy) + lock->l_policy_data = *policy; + + LDLM_DEBUG(lock, "client-side enqueue START, flags %#llx", *flags); + lock->l_conn_export = exp; + lock->l_export = NULL; + lock->l_blocking_ast = einfo->ei_cb_bl; + lock->l_flags |= (*flags & (LDLM_FL_NO_LRU | LDLM_FL_EXCL | + LDLM_FL_ATOMIC_CB)); + lock->l_activity = ktime_get_real_seconds(); + + ldlm_lock2desc(lock, &dlmreq->lock_desc); + dlmreq->lock_flags = ldlm_flags_to_wire(*flags); + dlmreq->lock_handle[0] = *lockh; + + return 0; +} +EXPORT_SYMBOL(ldlm_cli_lock_create_pack); + +/** * Client-side IBITS lock convert. * * Inform server that lock has been converted instead of canceling. 
diff --git a/fs/lustre/llite/llite_internal.h b/fs/lustre/llite/llite_internal.h index 129c817..6088da08 100644 --- a/fs/lustre/llite/llite_internal.h +++ b/fs/lustre/llite/llite_internal.h @@ -792,6 +792,9 @@ struct ll_sb_info { unsigned int ll_sa_running_max; /* max concurrent * statahead instances */ + unsigned int ll_sa_batch_max;/* max SUB request count in + * a batch PTLRPC request + */ unsigned int ll_sa_max; /* max statahead RPCs */ atomic_t ll_sa_total; /* statahead thread started * count @@ -1520,9 +1523,10 @@ enum ras_update_flags { void ll_ra_stats_inc(struct inode *inode, enum ra_stat which); /* statahead.c */ -#define LL_SA_RPC_MIN 2 -#define LL_SA_RPC_DEF 32 -#define LL_SA_RPC_MAX 512 + +#define LL_SA_RPC_MIN 8 +#define LL_SA_RPC_DEF 32 +#define LL_SA_RPC_MAX 2048 /* XXX: If want to support more concurrent statahead instances, * please consider to decentralize the RPC lists attached @@ -1532,7 +1536,10 @@ enum ras_update_flags { #define LL_SA_RUNNING_MAX 256 #define LL_SA_RUNNING_DEF 16 -#define LL_SA_CACHE_BIT 5 +#define LL_SA_BATCH_MAX 1024 +#define LL_SA_BATCH_DEF 0 + +#define LL_SA_CACHE_BIT 5 #define LL_SA_CACHE_SIZE BIT(LL_SA_CACHE_BIT) #define LL_SA_CACHE_MASK (LL_SA_CACHE_SIZE - 1) @@ -1576,6 +1583,9 @@ struct ll_statahead_info { struct list_head sai_cache[LL_SA_CACHE_SIZE]; spinlock_t sai_cache_lock[LL_SA_CACHE_SIZE]; atomic_t sai_cache_count; /* entry count in cache */ + struct lu_batch *sai_bh; + u32 sai_max_batch_count; + u64 sai_index_end; }; int ll_revalidate_statahead(struct inode *dir, struct dentry **dentry, diff --git a/fs/lustre/llite/llite_lib.c b/fs/lustre/llite/llite_lib.c index 002e870..b1bbeb3 100644 --- a/fs/lustre/llite/llite_lib.c +++ b/fs/lustre/llite/llite_lib.c @@ -167,6 +167,7 @@ static struct ll_sb_info *ll_init_sbi(struct lustre_sb_info *lsi) /* metadata statahead is enabled by default */ sbi->ll_sa_running_max = LL_SA_RUNNING_DEF; + sbi->ll_sa_batch_max = LL_SA_BATCH_DEF; sbi->ll_sa_max = LL_SA_RPC_DEF; 
atomic_set(&sbi->ll_sa_total, 0); atomic_set(&sbi->ll_sa_wrong, 0); @@ -324,7 +325,8 @@ static int client_common_fill_super(struct super_block *sb, char *md, char *dt) OBD_CONNECT2_GETATTR_PFID | OBD_CONNECT2_DOM_LVB | OBD_CONNECT2_REP_MBITS | - OBD_CONNECT2_ATOMIC_OPEN_LOCK; + OBD_CONNECT2_ATOMIC_OPEN_LOCK | + OBD_CONNECT2_BATCH_RPC; if (test_bit(LL_SBI_LRU_RESIZE, sbi->ll_flags)) data->ocd_connect_flags |= OBD_CONNECT_LRU_RESIZE; diff --git a/fs/lustre/llite/lproc_llite.c b/fs/lustre/llite/lproc_llite.c index 8b6c86f..4ea0bb2 100644 --- a/fs/lustre/llite/lproc_llite.c +++ b/fs/lustre/llite/lproc_llite.c @@ -768,6 +768,41 @@ static ssize_t statahead_running_max_store(struct kobject *kobj, } LUSTRE_RW_ATTR(statahead_running_max); +static ssize_t statahead_batch_max_show(struct kobject *kobj, + struct attribute *attr, + char *buf) +{ + struct ll_sb_info *sbi = container_of(kobj, struct ll_sb_info, + ll_kset.kobj); + + return snprintf(buf, 16, "%u\n", sbi->ll_sa_batch_max); +} + +static ssize_t statahead_batch_max_store(struct kobject *kobj, + struct attribute *attr, + const char *buffer, + size_t count) +{ + struct ll_sb_info *sbi = container_of(kobj, struct ll_sb_info, + ll_kset.kobj); + unsigned long val; + int rc; + + rc = kstrtoul(buffer, 0, &val); + if (rc) + return rc; + + if (val > LL_SA_BATCH_MAX) { + CWARN("%s: statahead_batch_max value %lu limited to maximum %d\n", + sbi->ll_fsname, val, LL_SA_BATCH_MAX); + val = LL_SA_BATCH_MAX; + } + + sbi->ll_sa_batch_max = val; + return count; +} +LUSTRE_RW_ATTR(statahead_batch_max); + static ssize_t statahead_max_show(struct kobject *kobj, struct attribute *attr, char *buf) @@ -792,12 +827,13 @@ static ssize_t statahead_max_store(struct kobject *kobj, if (rc) return rc; - if (val <= LL_SA_RPC_MAX) - sbi->ll_sa_max = val; - else - CERROR("Bad statahead_max value %lu. 
Valid values are in the range [0, %d]\n", - val, LL_SA_RPC_MAX); + if (val > LL_SA_RPC_MAX) { + CWARN("%s: statahead_max value %lu limited to maximum %d\n", + sbi->ll_fsname, val, LL_SA_RPC_MAX); + val = LL_SA_RPC_MAX; + } + sbi->ll_sa_max = val; return count; } LUSTRE_RW_ATTR(statahead_max); @@ -1788,6 +1824,7 @@ struct ldebugfs_vars lprocfs_llite_obd_vars[] = { &lustre_attr_stats_track_ppid.attr, &lustre_attr_stats_track_gid.attr, &lustre_attr_statahead_running_max.attr, + &lustre_attr_statahead_batch_max.attr, &lustre_attr_statahead_max.attr, &lustre_attr_statahead_agl.attr, &lustre_attr_lazystatfs.attr, diff --git a/fs/lustre/llite/statahead.c b/fs/lustre/llite/statahead.c index 12d8266..59688b4 100644 --- a/fs/lustre/llite/statahead.c +++ b/fs/lustre/llite/statahead.c @@ -132,6 +132,21 @@ static inline int sa_sent_full(struct ll_statahead_info *sai) return atomic_read(&sai->sai_cache_count) >= sai->sai_max; } +/* Batch metadata handle */ +static inline bool sa_has_batch_handle(struct ll_statahead_info *sai) +{ + return sai->sai_bh != NULL; +} + +static inline void ll_statahead_flush_nowait(struct ll_statahead_info *sai) +{ + if (sa_has_batch_handle(sai)) { + sai->sai_index_end = sai->sai_index - 1; + (void) md_batch_flush(ll_i2mdexp(sai->sai_dentry->d_inode), + sai->sai_bh, false); + } +} + static inline int agl_list_empty(struct ll_statahead_info *sai) { return list_empty(&sai->sai_agls); @@ -256,19 +271,35 @@ static void sa_free(struct ll_statahead_info *sai, struct sa_entry *entry) /* called by scanner after use, sa_entry will be killed */ static void -sa_put(struct ll_statahead_info *sai, struct sa_entry *entry) +sa_put(struct inode *dir, struct ll_statahead_info *sai, struct sa_entry *entry) { + struct ll_inode_info *lli = ll_i2info(dir); struct sa_entry *tmp, *next; + bool wakeup = false; if (entry && entry->se_state == SA_ENTRY_SUCC) { struct ll_sb_info *sbi = ll_i2sbi(sai->sai_dentry->d_inode); sai->sai_hit++; sai->sai_consecutive_miss = 0; - 
sai->sai_max = min(2 * sai->sai_max, sbi->ll_sa_max); + if (sai->sai_max < sbi->ll_sa_max) { + sai->sai_max = min(2 * sai->sai_max, sbi->ll_sa_max); + wakeup = true; + } else if (sai->sai_max_batch_count > 0) { + if (sai->sai_max >= sai->sai_max_batch_count && + (sai->sai_index_end - entry->se_index) % + sai->sai_max_batch_count == 0) { + wakeup = true; + } else if (entry->se_index == sai->sai_index_end) { + wakeup = true; + } + } else { + wakeup = true; + } } else { sai->sai_miss++; sai->sai_consecutive_miss++; + wakeup = true; } if (entry) @@ -283,6 +314,11 @@ static void sa_free(struct ll_statahead_info *sai, struct sa_entry *entry) break; sa_kill(sai, tmp); } + + spin_lock(&lli->lli_sa_lock); + if (wakeup && sai->sai_task) + wake_up_process(sai->sai_task); + spin_unlock(&lli->lli_sa_lock); } /* @@ -326,6 +362,9 @@ static void sa_fini_data(struct md_op_item *item) kfree(op_data->op_name); ll_unlock_md_op_lsm(op_data); iput(item->mop_dir); + /* make sure it wasn't allocated with kmem_cache_alloc */ + if (item->mop_subpill_allocated) + kfree(item->mop_pill); kfree(item); } @@ -356,6 +395,7 @@ static void sa_fini_data(struct md_op_item *item) if (!child) op_data->op_fid2 = entry->se_fid; + item->mop_opc = MD_OP_GETATTR; item->mop_it.it_op = IT_GETATTR; item->mop_dir = igrab(dir); item->mop_cb = ll_statahead_interpret; @@ -657,8 +697,12 @@ static void ll_statahead_interpret_work(struct work_struct *work) } rc = ll_prep_inode(&child, pill, dir->i_sb, it); - if (rc) + if (rc) { + CERROR("%s: getattr callback for %.*s "DFID": rc = %d\n", + ll_i2sbi(dir)->ll_fsname, entry->se_qstr.len, + entry->se_qstr.name, PFID(&entry->se_fid), rc); goto out; + } /* If encryption context was returned by MDT, put it in * inode now to save an extra getxattr. 
@@ -782,6 +826,19 @@ static int ll_statahead_interpret(struct md_op_item *item, int rc) return rc; } +static inline int sa_getattr(struct inode *dir, struct md_op_item *item) +{ + struct ll_statahead_info *sai = ll_i2info(dir)->lli_sai; + int rc; + + if (sa_has_batch_handle(sai)) + rc = md_batch_add(ll_i2mdexp(dir), sai->sai_bh, item); + else + rc = md_intent_getattr_async(ll_i2mdexp(dir), item); + + return rc; +} + /* async stat for file not found in dcache */ static int sa_lookup(struct inode *dir, struct sa_entry *entry) { @@ -792,8 +849,8 @@ static int sa_lookup(struct inode *dir, struct sa_entry *entry) if (IS_ERR(item)) return PTR_ERR(item); - rc = md_intent_getattr_async(ll_i2mdexp(dir), item); - if (rc) + rc = sa_getattr(dir, item); + if (rc < 0) sa_fini_data(item); return rc; @@ -837,7 +894,7 @@ static int sa_revalidate(struct inode *dir, struct sa_entry *entry, return 1; } - rc = md_intent_getattr_async(ll_i2mdexp(dir), item); + rc = sa_getattr(dir, item); if (rc) { entry->se_inode = NULL; iput(inode); @@ -880,6 +937,9 @@ static void sa_statahead(struct dentry *parent, const char *name, int len, sai->sai_sent++; sai->sai_index++; + + if (sa_sent_full(sai)) + ll_statahead_flush_nowait(sai); } /* async glimpse (agl) thread main function */ @@ -991,6 +1051,7 @@ static int ll_statahead_thread(void *arg) struct ll_sb_info *sbi = ll_i2sbi(dir); struct ll_statahead_info *sai = lli->lli_sai; struct page *page = NULL; + struct lu_batch *bh = NULL; u64 pos = 0; int first = 0; int rc = 0; @@ -999,6 +1060,17 @@ static int ll_statahead_thread(void *arg) CDEBUG(D_READA, "statahead thread starting: sai %p, parent %pd\n", sai, parent); + sai->sai_max_batch_count = sbi->ll_sa_batch_max; + if (sai->sai_max_batch_count) { + bh = md_batch_create(ll_i2mdexp(dir), BATCH_FL_RDONLY, + sai->sai_max_batch_count); + if (IS_ERR(bh)) { + rc = PTR_ERR(bh); + goto out_stop_agl; + } + } + + sai->sai_bh = bh; op_data = kzalloc(sizeof(*op_data), GFP_NOFS); if (!op_data) { rc = -ENOMEM; @@ 
-1164,6 +1236,8 @@ static int ll_statahead_thread(void *arg) spin_unlock(&lli->lli_sa_lock); } + ll_statahead_flush_nowait(sai); + /* * statahead is finished, but statahead entries need to be cached, wait * for file release closedir() call to stop me. @@ -1175,6 +1249,12 @@ static int ll_statahead_thread(void *arg) } __set_current_state(TASK_RUNNING); out: + if (bh) { + rc = md_batch_stop(ll_i2mdexp(dir), sai->sai_bh); + sai->sai_bh = NULL; + } + +out_stop_agl: ll_stop_agl(sai); /* @@ -1553,11 +1633,7 @@ static int revalidate_statahead_dentry(struct inode *dir, */ ldd = ll_d2d(*dentryp); ldd->lld_sa_generation = lli->lli_sa_generation; - sa_put(sai, entry); - spin_lock(&lli->lli_sa_lock); - if (sai->sai_task) - wake_up_process(sai->sai_task); - spin_unlock(&lli->lli_sa_lock); + sa_put(dir, sai, entry); return rc; } diff --git a/fs/lustre/lmv/lmv_obd.c b/fs/lustre/lmv/lmv_obd.c index 157498c..54f8673 100644 --- a/fs/lustre/lmv/lmv_obd.c +++ b/fs/lustre/lmv/lmv_obd.c @@ -3913,11 +3913,38 @@ static int lmv_batch_flush(struct obd_export *exp, struct lu_batch *bh, static inline struct lmv_tgt_desc * lmv_batch_locate_tgt(struct lmv_obd *lmv, struct md_op_item *item) { + struct md_op_data *op_data = &item->mop_data; struct lmv_tgt_desc *tgt; switch (item->mop_opc) { + case MD_OP_GETATTR: + if (fid_is_sane(&op_data->op_fid2)) { + struct lmv_tgt_desc *ptgt; + + ptgt = lmv_locate_tgt(lmv, op_data); + if (IS_ERR(ptgt)) { + tgt = ptgt; + } else { + tgt = lmv_fid2tgt(lmv, &op_data->op_fid2); + if (!IS_ERR(tgt)) { + /* + * Remote object needs two RPCs to + * lookup and getattr, considering + * the complexity don't support + * statahead for now. 
+ */ + if (tgt != ptgt) + tgt = ERR_PTR(-EREMOTE); + } + } + } else { + tgt = ERR_PTR(-EINVAL); + } + break; + default: tgt = ERR_PTR(-EOPNOTSUPP); + break; } return tgt; diff --git a/fs/lustre/mdc/mdc_batch.c b/fs/lustre/mdc/mdc_batch.c index 496d61e3..73f5a8c 100644 --- a/fs/lustre/mdc/mdc_batch.c +++ b/fs/lustre/mdc/mdc_batch.c @@ -41,9 +41,163 @@ #include "mdc_internal.h" -static md_update_pack_t mdc_update_packers[MD_OP_MAX]; +static int mdc_ldlm_lock_pack(struct obd_export *exp, + struct req_capsule *pill, + union ldlm_policy_data *policy, + struct lu_fid *fid, struct md_op_item *item) +{ + struct ldlm_request *dlmreq; + struct ldlm_res_id res_id; + struct ldlm_enqueue_info *einfo = &item->mop_einfo; + + dlmreq = req_capsule_client_get(pill, &RMF_DLM_REQ); + if (IS_ERR(dlmreq)) + return PTR_ERR(dlmreq); + + /* With Data-on-MDT the glimpse callback is needed too. + * It is set here in advance but not in mdc_finish_enqueue() + * to avoid possible races. It is safe to have glimpse handler + * for non-DOM locks and costs nothing. 
+ */ + if (!einfo->ei_cb_gl) + einfo->ei_cb_gl = mdc_ldlm_glimpse_ast; + + fid_build_reg_res_name(fid, &res_id); + + return ldlm_cli_lock_create_pack(exp, dlmreq, einfo, &res_id, + policy, &item->mop_lock_flags, + NULL, 0, LVB_T_NONE, &item->mop_lockh); +} + +static int mdc_batch_getattr_pack(struct batch_update_head *head, + struct lustre_msg *reqmsg, + size_t *max_pack_size, + struct md_op_item *item) +{ + struct obd_export *exp = head->buh_exp; + struct lookup_intent *it = &item->mop_it; + struct md_op_data *op_data = &item->mop_data; + u64 valid = OBD_MD_FLGETATTR | OBD_MD_FLEASIZE | OBD_MD_FLMODEASIZE | + OBD_MD_FLDIREA | OBD_MD_MEA | OBD_MD_FLACL | + OBD_MD_DEFAULT_MEA; + union ldlm_policy_data policy = { + .l_inodebits = { MDS_INODELOCK_LOOKUP | MDS_INODELOCK_UPDATE } + }; + struct ldlm_intent *lit; + bool have_secctx = false; + struct req_capsule pill; + u32 easize; + u32 size; + int rc; + + req_capsule_subreq_init(&pill, &RQF_BUT_GETATTR, NULL, + reqmsg, NULL, RCL_CLIENT); + + /* send name of security xattr to get upon intent */ + if (it->it_op & (IT_LOOKUP | IT_GETATTR) && + req_capsule_has_field(&pill, &RMF_FILE_SECCTX_NAME, + RCL_CLIENT) && + op_data->op_file_secctx_name_size > 0 && + op_data->op_file_secctx_name) { + have_secctx = true; + req_capsule_set_size(&pill, &RMF_FILE_SECCTX_NAME, RCL_CLIENT, + op_data->op_file_secctx_name_size); + } + + req_capsule_set_size(&pill, &RMF_NAME, RCL_CLIENT, + op_data->op_namelen + 1); + + size = req_capsule_msg_size(&pill, RCL_CLIENT); + if (unlikely(size >= *max_pack_size)) { + *max_pack_size = size; + return -E2BIG; + } + + req_capsule_client_pack(&pill); + /* pack the intent */ + lit = req_capsule_client_get(&pill, &RMF_LDLM_INTENT); + lit->opc = (u64)it->it_op; + + easize = MAX_MD_SIZE_OLD; /* obd->u.cli.cl_default_mds_easize; */ + + /* pack the intended request */ + mdc_getattr_pack(&pill, valid, it->it_flags, op_data, easize); + + item->mop_lock_flags |= LDLM_FL_HAS_INTENT; + rc = 
mdc_ldlm_lock_pack(head->buh_exp, &pill, &policy, + &item->mop_data.op_fid1, item); + if (rc) + return rc; -static object_update_interpret_t mdc_update_interpreters[MD_OP_MAX]; + req_capsule_set_size(&pill, &RMF_MDT_MD, RCL_SERVER, easize); + req_capsule_set_size(&pill, &RMF_ACL, RCL_SERVER, + LUSTRE_POSIX_ACL_MAX_SIZE_OLD); + req_capsule_set_size(&pill, &RMF_DEFAULT_MDT_MD, RCL_SERVER, + sizeof(struct lmv_user_md)); + + if (have_secctx) { + char *secctx_name; + + secctx_name = req_capsule_client_get(&pill, + &RMF_FILE_SECCTX_NAME); + memcpy(secctx_name, op_data->op_file_secctx_name, + op_data->op_file_secctx_name_size); + + req_capsule_set_size(&pill, &RMF_FILE_SECCTX, + RCL_SERVER, easize); + + CDEBUG(D_SEC, "packed '%.*s' as security xattr name\n", + op_data->op_file_secctx_name_size, + op_data->op_file_secctx_name); + } else { + req_capsule_set_size(&pill, &RMF_FILE_SECCTX, RCL_SERVER, 0); + } + + if (exp_connect_encrypt(exp) && it->it_op & (IT_LOOKUP | IT_GETATTR)) + req_capsule_set_size(&pill, &RMF_FILE_ENCCTX, + RCL_SERVER, easize); + else + req_capsule_set_size(&pill, &RMF_FILE_ENCCTX, + RCL_SERVER, 0); + + req_capsule_set_replen(&pill); + reqmsg->lm_opc = BUT_GETATTR; + *max_pack_size = size; + return rc; +} + +static md_update_pack_t mdc_update_packers[MD_OP_MAX] = { + [MD_OP_GETATTR] = mdc_batch_getattr_pack, +}; + +static int mdc_batch_getattr_interpret(struct ptlrpc_request *req, + struct lustre_msg *repmsg, + struct object_update_callback *ouc, + int rc) +{ + struct md_op_item *item = (struct md_op_item *)ouc->ouc_data; + struct ldlm_enqueue_info *einfo = &item->mop_einfo; + struct batch_update_head *head = ouc->ouc_head; + struct obd_export *exp = head->buh_exp; + struct req_capsule *pill = item->mop_pill; + + req_capsule_subreq_init(pill, &RQF_BUT_GETATTR, req, + NULL, repmsg, RCL_CLIENT); + + rc = ldlm_cli_enqueue_fini(exp, pill, einfo, 1, &item->mop_lock_flags, + NULL, 0, &item->mop_lockh, rc, false); + if (rc) + goto out; + + rc = 
mdc_finish_enqueue(exp, pill, einfo, &item->mop_it, + &item->mop_lockh, rc); +out: + return item->mop_cb(item, rc); +} + +object_update_interpret_t mdc_update_interpreters[MD_OP_MAX] = { + [MD_OP_GETATTR] = mdc_batch_getattr_interpret, +}; int mdc_batch_add(struct obd_export *exp, struct lu_batch *bh, struct md_op_item *item) @@ -57,6 +211,11 @@ int mdc_batch_add(struct obd_export *exp, struct lu_batch *bh, return -EFAULT; } + item->mop_pill = kzalloc(sizeof(*item->mop_pill), GFP_NOFS); + if (!item->mop_pill) + return -ENOMEM; + + item->mop_subpill_allocated = 1; return cli_batch_add(exp, bh, item, mdc_update_packers[opc], mdc_update_interpreters[opc]); } diff --git a/fs/lustre/mdc/mdc_dev.c b/fs/lustre/mdc/mdc_dev.c index 984d1a8..74911da 100644 --- a/fs/lustre/mdc/mdc_dev.c +++ b/fs/lustre/mdc/mdc_dev.c @@ -663,8 +663,8 @@ int mdc_enqueue_interpret(const struct lu_env *env, struct ptlrpc_request *req, OBD_FAIL_TIMEOUT(OBD_FAIL_OSC_CP_ENQ_RACE, 1); /* Complete obtaining the lock procedure. */ - rc = ldlm_cli_enqueue_fini(aa->oa_exp, req, &einfo, 1, aa->oa_flags, - aa->oa_lvb, aa->oa_lvb ? + rc = ldlm_cli_enqueue_fini(aa->oa_exp, &req->rq_pill, &einfo, 1, + aa->oa_flags, aa->oa_lvb, aa->oa_lvb ? sizeof(*aa->oa_lvb) : 0, lockh, rc, true); /* Complete mdc stuff. 
*/ rc = mdc_enqueue_fini(aa->oa_exp, req, aa->oa_upcall, aa->oa_cookie, diff --git a/fs/lustre/mdc/mdc_internal.h b/fs/lustre/mdc/mdc_internal.h index ae12a37..e752414 100644 --- a/fs/lustre/mdc/mdc_internal.h +++ b/fs/lustre/mdc/mdc_internal.h @@ -194,6 +194,12 @@ int mdc_ldlm_blocking_ast(struct ldlm_lock *dlmlock, int mdc_ldlm_glimpse_ast(struct ldlm_lock *dlmlock, void *data); int mdc_fill_lvb(struct req_capsule *pill, struct ost_lvb *lvb); +int mdc_finish_enqueue(struct obd_export *exp, + struct req_capsule *pill, + struct ldlm_enqueue_info *einfo, + struct lookup_intent *it, + struct lustre_handle *lockh, int rc); + /* the minimum inline repsize should be PAGE_SIZE at least */ #define MDC_DOM_DEF_INLINE_REPSIZE max(8192UL, PAGE_SIZE) #define MDC_DOM_MAX_INLINE_REPSIZE XATTR_SIZE_MAX diff --git a/fs/lustre/mdc/mdc_locks.c b/fs/lustre/mdc/mdc_locks.c index f36e0ec..7695c78 100644 --- a/fs/lustre/mdc/mdc_locks.c +++ b/fs/lustre/mdc/mdc_locks.c @@ -665,13 +665,13 @@ static struct ptlrpc_request *mdc_enqueue_pack(struct obd_export *exp, return req; } -static int mdc_finish_enqueue(struct obd_export *exp, - struct ptlrpc_request *req, - struct ldlm_enqueue_info *einfo, - struct lookup_intent *it, - struct lustre_handle *lockh, int rc) +int mdc_finish_enqueue(struct obd_export *exp, + struct req_capsule *pill, + struct ldlm_enqueue_info *einfo, + struct lookup_intent *it, + struct lustre_handle *lockh, int rc) { - struct req_capsule *pill = &req->rq_pill; + struct ptlrpc_request *req = pill->rc_req; struct ldlm_request *lockreq; struct ldlm_reply *lockrep; struct ldlm_lock *lock; @@ -1067,7 +1067,7 @@ int mdc_enqueue_base(struct obd_export *exp, struct ldlm_enqueue_info *einfo, goto resend; } - rc = mdc_finish_enqueue(exp, req, einfo, it, lockh, rc); + rc = mdc_finish_enqueue(exp, &req->rq_pill, einfo, it, lockh, rc); if (rc < 0) { if (lustre_handle_is_used(lockh)) { ldlm_lock_decref(lockh, einfo->ei_mode); @@ -1369,13 +1369,14 @@ static int 
mdc_intent_getattr_async_interpret(const struct lu_env *env, struct ldlm_enqueue_info *einfo = &item->mop_einfo; struct lookup_intent *it = &item->mop_it; struct lustre_handle *lockh = &item->mop_lockh; + struct req_capsule *pill = &req->rq_pill; struct ldlm_reply *lockrep; u64 flags = LDLM_FL_HAS_INTENT; if (OBD_FAIL_CHECK(OBD_FAIL_MDC_GETATTR_ENQUEUE)) rc = -ETIMEDOUT; - rc = ldlm_cli_enqueue_fini(exp, req, einfo, 1, &flags, NULL, 0, + rc = ldlm_cli_enqueue_fini(exp, pill, einfo, 1, &flags, NULL, 0, lockh, rc, true); if (rc < 0) { CERROR("%s: ldlm_cli_enqueue_fini() failed: rc = %d\n", @@ -1384,19 +1385,20 @@ static int mdc_intent_getattr_async_interpret(const struct lu_env *env, goto out; } - lockrep = req_capsule_server_get(&req->rq_pill, &RMF_DLM_REP); + lockrep = req_capsule_server_get(pill, &RMF_DLM_REP); + LASSERT(lockrep); lockrep->lock_policy_res2 = ptlrpc_status_ntoh(lockrep->lock_policy_res2); - rc = mdc_finish_enqueue(exp, req, einfo, it, lockh, rc); + rc = mdc_finish_enqueue(exp, pill, einfo, it, lockh, rc); if (rc) goto out; rc = mdc_finish_intent_lock(exp, req, &item->mop_data, it, lockh); out: - item->mop_pill = &req->rq_pill; + item->mop_pill = pill; item->mop_cb(item, rc); return 0; } diff --git a/fs/lustre/osc/osc_request.c b/fs/lustre/osc/osc_request.c index 6ea1db6..35dd009 100644 --- a/fs/lustre/osc/osc_request.c +++ b/fs/lustre/osc/osc_request.c @@ -2990,8 +2990,9 @@ int osc_enqueue_interpret(const struct lu_env *env, struct ptlrpc_request *req, } /* Complete obtaining the lock procedure. */ - rc = ldlm_cli_enqueue_fini(aa->oa_exp, req, &einfo, 1, aa->oa_flags, - lvb, lvb_len, lockh, rc, false); + rc = ldlm_cli_enqueue_fini(aa->oa_exp, &req->rq_pill, &einfo, 1, + aa->oa_flags, lvb, lvb_len, lockh, rc, + false); /* Complete osc stuff. 
*/ rc = osc_enqueue_fini(req, aa->oa_upcall, aa->oa_cookie, lockh, mode, aa->oa_flags, aa->oa_speculative, rc); diff --git a/fs/lustre/ptlrpc/layout.c b/fs/lustre/ptlrpc/layout.c index 0fe74ff..5beebb7 100644 --- a/fs/lustre/ptlrpc/layout.c +++ b/fs/lustre/ptlrpc/layout.c @@ -722,6 +722,26 @@ &RMF_GENERIC_DATA, }; +static const struct req_msg_field *mds_batch_getattr_client[] = { + &RMF_DLM_REQ, + &RMF_LDLM_INTENT, + &RMF_MDT_BODY, /* coincides with mds_getattr_name_client[] */ + &RMF_CAPA1, + &RMF_NAME, + &RMF_FILE_SECCTX_NAME +}; + +static const struct req_msg_field *mds_batch_getattr_server[] = { + &RMF_DLM_REP, + &RMF_MDT_BODY, + &RMF_MDT_MD, + &RMF_ACL, + &RMF_CAPA1, + &RMF_FILE_SECCTX, + &RMF_DEFAULT_MDT_MD, + &RMF_FILE_ENCCTX, +}; + static struct req_format *req_formats[] = { &RQF_OBD_PING, &RQF_OBD_SET_INFO, @@ -811,6 +831,7 @@ &RQF_LLOG_ORIGIN_HANDLE_PREV_BLOCK, &RQF_LLOG_ORIGIN_HANDLE_READ_HEADER, &RQF_CONNECT, + &RQF_BUT_GETATTR, &RQF_MDS_BATCH, }; @@ -1701,6 +1722,11 @@ struct req_format RQF_OST_LADVISE = DEFINE_REQ_FMT0("OST_LADVISE", ost_ladvise, ost_body_only); EXPORT_SYMBOL(RQF_OST_LADVISE); +struct req_format RQF_BUT_GETATTR = + DEFINE_REQ_FMT0("MDS_BATCH_GETATTR", mds_batch_getattr_client, + mds_batch_getattr_server); +EXPORT_SYMBOL(RQF_BUT_GETATTR); + /* Convenience macro */ #define FMT_FIELD(fmt, i, j) ((fmt)->rf_fields[(i)].d[(j)]) @@ -2472,6 +2498,20 @@ void req_capsule_shrink(struct req_capsule *pill, } EXPORT_SYMBOL(req_capsule_shrink); +void req_capsule_subreq_init(struct req_capsule *pill, + const struct req_format *fmt, + struct ptlrpc_request *req, + struct lustre_msg *reqmsg, + struct lustre_msg *repmsg, + enum req_location loc) +{ + req_capsule_init(pill, req, loc); + req_capsule_set(pill, fmt); + pill->rc_reqmsg = reqmsg; + pill->rc_repmsg = repmsg; +} +EXPORT_SYMBOL(req_capsule_subreq_init); + void req_capsule_set_replen(struct req_capsule *pill) { if (req_capsule_ptlreq(pill)) { From patchwork Mon Apr 17 13:47:22 2023 Content-Type: 
text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 13214144 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from pdx1-mailman-customer002.dreamhost.com (listserver-buz.dreamhost.com [69.163.136.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id DDA83C77B70 for ; Mon, 17 Apr 2023 14:14:03 +0000 (UTC) Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1]) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4Q0T81250Rz22Wp; Mon, 17 Apr 2023 06:54:57 -0700 (PDT) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4Q0T3J2nd2z21KN for ; Mon, 17 Apr 2023 06:50:52 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id C69F1100849C; Mon, 17 Apr 2023 09:47:24 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id C5203375; Mon, 17 Apr 2023 09:47:24 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Mon, 17 Apr 2023 09:47:22 -0400 Message-Id: <1681739243-29375-27-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1681739243-29375-1-git-send-email-jsimmons@infradead.org> References: <1681739243-29375-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 26/27] lustre: llite: fix LSOM blocks for ftruncate and close X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.39 Precedence: list List-Id: "For discussing Lustre software development." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Etienne AUJAMES LSOM is updated on close and setattr requests. For a setattr, clients do not yet know the number of blocks (the OST setattr requests have to finish first), so the server sets the block count to 1. A close request sent after an ftruncate() then wrongly updates LSOM back to its old block count, because clients do not update inode.i_blocks after an OST setattr. After that, the MDS denies any client close request that would update LSOM to the correct block count: on the server side, only truncates are allowed to decrease the block count. This patch forces a client inode update at the end of an OST setattr. It also tries (if there is no contention on the inode size lock) to update the inode at the end of an OST fsync or a sync I/O. WC-bug-id: https://jira.whamcloud.com/browse/LU-16465 Lustre-commit: dfb08bbf77a1362f79 ("LU-16465 llite: fix LSOM blocks for ftruncate and close") Signed-off-by: Etienne AUJAMES Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49675 Reviewed-by: Andreas Dilger Reviewed-by: Qian Yingjin Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- fs/lustre/llite/file.c | 36 +++++++++++++++++++++++++++--------- fs/lustre/llite/llite_internal.h | 2 ++ fs/lustre/llite/llite_lib.c | 10 ++++++++++ fs/lustre/llite/vvp_io.c | 34 ++++++++++++++++++++++++++++++++-- 4 files changed, 71 insertions(+), 11 deletions(-) diff --git a/fs/lustre/llite/file.c b/fs/lustre/llite/file.c index 0186db4..05a75ae 100644 --- a/fs/lustre/llite/file.c +++ b/fs/lustre/llite/file.c @@ -1417,7 +1417,7 @@ static int ll_lease_file_resync(struct obd_client_handle *och, return rc; } -int ll_merge_attr(const struct lu_env *env, struct inode *inode) +static int ll_merge_attr_nolock(const struct lu_env *env, struct inode *inode) { struct ll_inode_info *lli = ll_i2info(inode);
struct cl_object *obj = lli->lli_clob; @@ -1427,10 +1427,7 @@ int ll_merge_attr(const struct lu_env *env, struct inode *inode) s64 ctime; int rc = 0; - ll_inode_size_lock(inode); - - /* - * merge timestamps the most recently obtained from MDS with + /* merge timestamps the most recently obtained from MDS with * timestamps obtained from OSTSs. * * Do not overwrite atime of inode because it may be refreshed @@ -1463,7 +1460,7 @@ int ll_merge_attr(const struct lu_env *env, struct inode *inode) if (rc != 0) { if (rc == -ENODATA) rc = 0; - goto out_size_unlock; + goto out; } if (atime < attr->cat_atime) @@ -1475,8 +1472,8 @@ int ll_merge_attr(const struct lu_env *env, struct inode *inode) if (mtime < attr->cat_mtime) mtime = attr->cat_mtime; - CDEBUG(D_VFSTRACE, DFID " updating i_size %llu\n", - PFID(&lli->lli_fid), attr->cat_size); + CDEBUG(D_VFSTRACE, DFID" updating i_size %llu i_block %llu\n", + PFID(&lli->lli_fid), attr->cat_size, attr->cat_blocks); if (fscrypt_require_key(inode) == -ENOKEY) { /* Without the key, round up encrypted file size to next @@ -1495,13 +1492,34 @@ int ll_merge_attr(const struct lu_env *env, struct inode *inode) inode->i_mtime.tv_sec = mtime; inode->i_atime.tv_sec = atime; inode->i_ctime.tv_sec = ctime; +out: + return rc; +} -out_size_unlock: +int ll_merge_attr(const struct lu_env *env, struct inode *inode) +{ + int rc; + + ll_inode_size_lock(inode); + rc = ll_merge_attr_nolock(env, inode); ll_inode_size_unlock(inode); return rc; } +/* Use to update size and blocks on inode for LSOM if there is no contention */ +int ll_merge_attr_try(const struct lu_env *env, struct inode *inode) +{ + int rc = 0; + + if (ll_inode_size_trylock(inode)) { + rc = ll_merge_attr_nolock(env, inode); + ll_inode_size_unlock(inode); + } + + return rc; +} + /** * Set designated mirror for I/O. 
* diff --git a/fs/lustre/llite/llite_internal.h b/fs/lustre/llite/llite_internal.h index 6088da08..88dbd6c 100644 --- a/fs/lustre/llite/llite_internal.h +++ b/fs/lustre/llite/llite_internal.h @@ -551,6 +551,7 @@ static inline void obd_connect_set_enc_fid2path(struct obd_connect_data *data) void ll_inode_size_lock(struct inode *inode); void ll_inode_size_unlock(struct inode *inode); +int ll_inode_size_trylock(struct inode *inode); static inline struct ll_inode_info *ll_i2info(struct inode *inode) { @@ -1248,6 +1249,7 @@ int ll_dir_getstripe(struct inode *inode, void **plmm, int *plmm_size, struct ptlrpc_request **request, u64 valid); int ll_fsync(struct file *file, loff_t start, loff_t end, int data); int ll_merge_attr(const struct lu_env *env, struct inode *inode); +int ll_merge_attr_try(const struct lu_env *env, struct inode *inode); int ll_fid2path(struct inode *inode, void __user *arg); int __ll_fid2path(struct inode *inode, struct getinfo_fid2path *gfout, size_t outsize, u32 pathlen_orig); diff --git a/fs/lustre/llite/llite_lib.c b/fs/lustre/llite/llite_lib.c index b1bbeb3..2c286e8 100644 --- a/fs/lustre/llite/llite_lib.c +++ b/fs/lustre/llite/llite_lib.c @@ -2558,6 +2558,16 @@ void ll_inode_size_unlock(struct inode *inode) mutex_unlock(&lli->lli_size_mutex); } +int ll_inode_size_trylock(struct inode *inode) +{ + struct ll_inode_info *lli; + + LASSERT(!S_ISDIR(inode->i_mode)); + + lli = ll_i2info(inode); + return mutex_trylock(&lli->lli_size_mutex); +} + void ll_update_inode_flags(struct inode *inode, unsigned int ext_flags) { struct ll_inode_info *lli = ll_i2info(inode); diff --git a/fs/lustre/llite/vvp_io.c b/fs/lustre/llite/vvp_io.c index 50c2872..31a3992 100644 --- a/fs/lustre/llite/vvp_io.c +++ b/fs/lustre/llite/vvp_io.c @@ -763,6 +763,10 @@ static void vvp_io_setattr_end(const struct lu_env *env, vvp_do_vmtruncate(inode, size); mutex_unlock(&lli->lli_setattr_mutex); trunc_sem_up_write(&lli->lli_trunc_sem); + + /* Update size and blocks for LSOM */ + if 
(!io->ci_ignore_layout) + ll_merge_attr(env, inode); } else if (cl_io_is_fallocate(io)) { int mode = io->u.ci_setattr.sa_falloc_mode; @@ -1306,6 +1310,20 @@ static void vvp_io_rw_end(const struct lu_env *env, trunc_sem_up_read(&lli->lli_trunc_sem); } +static void vvp_io_write_end(const struct lu_env *env, + const struct cl_io_slice *ios) +{ + struct inode *inode = vvp_object_inode(ios->cis_obj); + struct cl_io *io = ios->cis_io; + + vvp_io_rw_end(env, ios); + + /* Update size and blocks for LSOM (best effort) */ + if (!io->ci_ignore_layout && cl_io_is_sync_write(io)) + ll_merge_attr_try(env, inode); +} + + static int vvp_io_kernel_fault(struct vvp_fault_io *cfio) { struct vm_fault *vmf = cfio->ft_vmf; @@ -1559,6 +1577,17 @@ static int vvp_io_fsync_start(const struct lu_env *env, return 0; } +static void vvp_io_fsync_end(const struct lu_env *env, + const struct cl_io_slice *ios) +{ + struct inode *inode = vvp_object_inode(ios->cis_obj); + struct cl_io *io = ios->cis_io; + + /* Update size and blocks for LSOM (best effort) */ + if (!io->ci_ignore_layout) + ll_merge_attr_try(env, inode); +} + static int vvp_io_read_ahead(const struct lu_env *env, const struct cl_io_slice *ios, pgoff_t start, struct cl_read_ahead *ra) @@ -1639,7 +1668,7 @@ static void vvp_io_lseek_end(const struct lu_env *env, .cio_iter_fini = vvp_io_write_iter_fini, .cio_lock = vvp_io_write_lock, .cio_start = vvp_io_write_start, - .cio_end = vvp_io_rw_end, + .cio_end = vvp_io_write_end, .cio_advance = vvp_io_advance, }, [CIT_SETATTR] = { @@ -1658,7 +1687,8 @@ static void vvp_io_lseek_end(const struct lu_env *env, }, [CIT_FSYNC] = { .cio_start = vvp_io_fsync_start, - .cio_fini = vvp_io_fini + .cio_fini = vvp_io_fini, + .cio_end = vvp_io_fsync_end, }, [CIT_GLIMPSE] = { .cio_fini = vvp_io_fini From patchwork Mon Apr 17 13:47:23 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Simmons X-Patchwork-Id: 13214146 Return-Path: 
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from pdx1-mailman-customer002.dreamhost.com (listserver-buz.dreamhost.com [69.163.136.29]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 43F8DC77B70 for ; Mon, 17 Apr 2023 14:15:09 +0000 (UTC) Received: from pdx1-mailman-customer002.dreamhost.com (localhost [127.0.0.1]) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTP id 4Q0T8G4Xwmz22XD; Mon, 17 Apr 2023 06:55:10 -0700 (PDT) Received: from smtp4.ccs.ornl.gov (smtp4.ccs.ornl.gov [160.91.203.40]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by pdx1-mailman-customer002.dreamhost.com (Postfix) with ESMTPS id 4Q0T3Z3mH9z226J for ; Mon, 17 Apr 2023 06:51:06 -0700 (PDT) Received: from star.ccs.ornl.gov (star.ccs.ornl.gov [160.91.202.134]) by smtp4.ccs.ornl.gov (Postfix) with ESMTP id CB1CE10084A5; Mon, 17 Apr 2023 09:47:24 -0400 (EDT) Received: by star.ccs.ornl.gov (Postfix, from userid 2004) id C9A17379; Mon, 17 Apr 2023 09:47:24 -0400 (EDT) From: James Simmons To: Andreas Dilger , Oleg Drokin , NeilBrown Date: Mon, 17 Apr 2023 09:47:23 -0400 Message-Id: <1681739243-29375-28-git-send-email-jsimmons@infradead.org> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1681739243-29375-1-git-send-email-jsimmons@infradead.org> References: <1681739243-29375-1-git-send-email-jsimmons@infradead.org> Subject: [lustre-devel] [PATCH 27/27] lnet: fix clang build errors X-BeenThere: lustre-devel@lists.lustre.org X-Mailman-Version: 2.1.39 Precedence: list List-Id: "For discussing Lustre software development." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Lustre Development List MIME-Version: 1.0 Errors-To: lustre-devel-bounces@lists.lustre.org Sender: "lustre-devel" From: Timothy Day LNET_PID_ANY and LNET_NID_ANY were defined outside the range of a u64. They were changed from -1 to the maximum value of a u64. WC-bug-id: https://jira.whamcloud.com/browse/LU-16518 Lustre-commit: b0297a1056a4b0ee65 ("LU-16518 lnet: fix clang build errors") Signed-off-by: Timothy Day Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50318 Reviewed-by: Neil Brown Reviewed-by: Chris Horn Reviewed-by: Frank Sehr Reviewed-by: jsimmons Reviewed-by: Oleg Drokin Signed-off-by: James Simmons --- include/uapi/linux/lnet/lnet-types.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/include/uapi/linux/lnet/lnet-types.h b/include/uapi/linux/lnet/lnet-types.h index 6c6a66eb..959d9af 100644 --- a/include/uapi/linux/lnet/lnet-types.h +++ b/include/uapi/linux/lnet/lnet-types.h @@ -60,9 +60,9 @@ #define LNET_RESERVED_PORTAL 0 /** wildcard NID that matches any end-point address */ -#define LNET_NID_ANY ((lnet_nid_t)(-1)) +#define LNET_NID_ANY (~(lnet_nid_t) 0) /** wildcard PID that matches any lnet_pid_t */ -#define LNET_PID_ANY ((lnet_pid_t)(-1)) +#define LNET_PID_ANY (~(lnet_pid_t) 0) static inline int LNET_NID_IS_ANY(const struct lnet_nid *nid) {