From patchwork Mon Jun 17 12:55:22 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yan, Zheng" X-Patchwork-Id: 10999201 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6AA026C5 for ; Mon, 17 Jun 2019 12:55:39 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5C2C528683 for ; Mon, 17 Jun 2019 12:55:39 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 5706D2866C; Mon, 17 Jun 2019 12:55:39 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 9ADFC2866C for ; Mon, 17 Jun 2019 12:55:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726731AbfFQMzh (ORCPT ); Mon, 17 Jun 2019 08:55:37 -0400 Received: from mx1.redhat.com ([209.132.183.28]:51446 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726028AbfFQMzh (ORCPT ); Mon, 17 Jun 2019 08:55:37 -0400 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id D5317307D934 for ; Mon, 17 Jun 2019 12:55:36 +0000 (UTC) Received: from zhyan-laptop.redhat.com (ovpn-12-92.pek2.redhat.com [10.72.12.92]) by smtp.corp.redhat.com (Postfix) with ESMTP id B5B3D7E575; Mon, 17 Jun 2019 12:55:34 +0000 (UTC) From: "Yan, Zheng" To: ceph-devel@vger.kernel.org Cc: idryomov@redhat.com, jlayton@redhat.com, "Yan, Zheng" Subject: [PATCH 1/8] libceph: add function that reset client's entity addr Date: Mon, 17 Jun 2019 20:55:22 +0800 Message-Id: <20190617125529.6230-2-zyan@redhat.com> In-Reply-To: <20190617125529.6230-1-zyan@redhat.com> References: <20190617125529.6230-1-zyan@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.48]); Mon, 17 Jun 2019 12:55:36 +0000 (UTC) Sender: ceph-devel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Signed-off-by: "Yan, Zheng" --- include/linux/ceph/libceph.h | 1 + include/linux/ceph/messenger.h | 1 + include/linux/ceph/mon_client.h | 1 + include/linux/ceph/osd_client.h | 1 + net/ceph/ceph_common.c | 8 ++++++++ net/ceph/messenger.c | 5 +++++ net/ceph/mon_client.c | 7 +++++++ net/ceph/osd_client.c | 16 ++++++++++++++++ 8 files changed, 40 insertions(+) diff --git a/include/linux/ceph/libceph.h b/include/linux/ceph/libceph.h index a3cddf5f0e60..f29959eed025 100644 --- a/include/linux/ceph/libceph.h +++ b/include/linux/ceph/libceph.h @@ -291,6 +291,7 @@ struct ceph_client *ceph_create_client(struct ceph_options *opt, void *private); struct ceph_entity_addr *ceph_client_addr(struct ceph_client *client); u64 ceph_client_gid(struct ceph_client *client); extern void ceph_destroy_client(struct ceph_client *client); +extern void ceph_reset_client_addr(struct ceph_client *client); extern int __ceph_open_session(struct ceph_client *client, unsigned long started); extern int ceph_open_session(struct ceph_client *client); diff --git a/include/linux/ceph/messenger.h b/include/linux/ceph/messenger.h index 23895d178149..c4458dc6a757 100644 --- a/include/linux/ceph/messenger.h +++ b/include/linux/ceph/messenger.h @@ -337,6 +337,7 @@ extern void ceph_msgr_flush(void); extern void ceph_messenger_init(struct ceph_messenger *msgr, struct ceph_entity_addr *myaddr); extern void ceph_messenger_fini(struct ceph_messenger *msgr); +extern void ceph_messenger_reset_nonce(struct ceph_messenger *msgr); extern void ceph_con_init(struct ceph_connection *con, void *private, const struct ceph_connection_operations *ops, diff --git a/include/linux/ceph/mon_client.h b/include/linux/ceph/mon_client.h index 3a4688af7455..0d8d890c6759 100644 --- a/include/linux/ceph/mon_client.h +++ b/include/linux/ceph/mon_client.h @@ -110,6 +110,7 @@ extern int ceph_monmap_contains(struct ceph_monmap *m, extern int ceph_monc_init(struct ceph_mon_client *monc, struct ceph_client *cl); extern void ceph_monc_stop(struct ceph_mon_client *monc); +extern void ceph_monc_reopen_session(struct ceph_mon_client *monc); enum { CEPH_SUB_MONMAP = 0, diff --git a/include/linux/ceph/osd_client.h b/include/linux/ceph/osd_client.h index 2294f963dab7..a12b7fc9cfd6 100644 --- a/include/linux/ceph/osd_client.h +++ b/include/linux/ceph/osd_client.h @@ -381,6 +381,7 @@ extern void ceph_osdc_cleanup(void); extern int ceph_osdc_init(struct ceph_osd_client *osdc, struct ceph_client *client); extern void ceph_osdc_stop(struct ceph_osd_client *osdc); +extern void ceph_osdc_reopen_osds(struct ceph_osd_client *osdc); extern void ceph_osdc_handle_reply(struct ceph_osd_client *osdc, struct ceph_msg *msg); diff --git a/net/ceph/ceph_common.c b/net/ceph/ceph_common.c index 79eac465ec65..55210823d1cc 100644 --- a/net/ceph/ceph_common.c +++ b/net/ceph/ceph_common.c @@ -693,6 +693,14 @@ void ceph_destroy_client(struct ceph_client *client) } EXPORT_SYMBOL(ceph_destroy_client); +void ceph_reset_client_addr(struct ceph_client *client) +{ + ceph_messenger_reset_nonce(&client->msgr); + ceph_monc_reopen_session(&client->monc); + ceph_osdc_reopen_osds(&client->osdc); +} +EXPORT_SYMBOL(ceph_reset_client_addr); + /* * true if we have the mon map (and have thus joined the cluster) */ diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c index 3ee380758ddd..cd03a1cba849 100644 --- a/net/ceph/messenger.c +++ b/net/ceph/messenger.c @@ -3028,6 +3028,11 @@ static void con_fault(struct ceph_connection *con) } +void ceph_messenger_reset_nonce(struct ceph_messenger *msgr) +{ + msgr->inst.addr.nonce += 1000000; + encode_my_addr(msgr); +} /* * initialize a new messenger instance diff --git a/net/ceph/mon_client.c b/net/ceph/mon_client.c index 895679d3529b..6dab6a94e9cc 100644 --- a/net/ceph/mon_client.c +++ b/net/ceph/mon_client.c @@ -209,6 +209,13 @@ static void reopen_session(struct ceph_mon_client *monc) __open_session(monc); } +void ceph_monc_reopen_session(struct ceph_mon_client *monc) +{ + mutex_lock(&monc->mutex); + reopen_session(monc); + mutex_unlock(&monc->mutex); +} + static void un_backoff(struct ceph_mon_client *monc) { monc->hunt_mult /= 2; /* reduce by 50% */ diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c index e6d31e0f0289..67e9466f27fd 100644 --- a/net/ceph/osd_client.c +++ b/net/ceph/osd_client.c @@ -5089,6 +5089,22 @@ int ceph_osdc_call(struct ceph_osd_client *osdc, } EXPORT_SYMBOL(ceph_osdc_call); +/* + * reset all osd connections + */ +void ceph_osdc_reopen_osds(struct ceph_osd_client *osdc) +{ + struct rb_node *n; + down_write(&osdc->lock); + for (n = rb_first(&osdc->osds); n; ) { + struct ceph_osd *osd = rb_entry(n, struct ceph_osd, o_node); + n = rb_next(n); + if (!reopen_osd(osd)) + kick_osd_requests(osd); + } + up_write(&osdc->lock); +} + /* * init, shutdown */ From patchwork Mon Jun 17 12:55:23 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yan, Zheng" X-Patchwork-Id: 10999203 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 996BA6C5 for ; Mon, 17 Jun 2019 12:55:41 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 8BFDC286AC for ; Mon, 17 Jun 2019 12:55:41 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 800BB286F1; Mon, 17 Jun 2019 12:55:41 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1C093286AC for ; Mon, 17 Jun 2019 12:55:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727136AbfFQMzk (ORCPT ); Mon, 17 Jun 2019 08:55:40 -0400 Received: from mx1.redhat.com ([209.132.183.28]:16560 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726028AbfFQMzj (ORCPT ); Mon, 17 Jun 2019 08:55:39 -0400 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id C0DF4307D868 for ; Mon, 17 Jun 2019 12:55:39 +0000 (UTC) Received: from zhyan-laptop.redhat.com (ovpn-12-92.pek2.redhat.com [10.72.12.92]) by smtp.corp.redhat.com (Postfix) with ESMTP id 711D3989C; Mon, 17 Jun 2019 12:55:37 +0000 (UTC) From: "Yan, Zheng" To: ceph-devel@vger.kernel.org Cc: idryomov@redhat.com, jlayton@redhat.com, "Yan, Zheng" Subject: [PATCH 2/8] libceph: add function that clears osd client's abort_err Date: Mon, 17 Jun 2019 20:55:23 +0800 Message-Id: <20190617125529.6230-3-zyan@redhat.com> In-Reply-To: <20190617125529.6230-1-zyan@redhat.com> References: <20190617125529.6230-1-zyan@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.48]); Mon, 17 Jun 2019 12:55:39 +0000 (UTC) Sender: ceph-devel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Signed-off-by: "Yan, Zheng" --- include/linux/ceph/osd_client.h | 1 + net/ceph/osd_client.c | 8 ++++++++ 2 files changed, 9 insertions(+) diff --git a/include/linux/ceph/osd_client.h b/include/linux/ceph/osd_client.h index a12b7fc9cfd6..71ddb0a89f4d 100644 --- a/include/linux/ceph/osd_client.h +++ b/include/linux/ceph/osd_client.h @@ -389,6 +389,7 @@ extern void ceph_osdc_handle_map(struct ceph_osd_client *osdc, struct ceph_msg *msg); void ceph_osdc_update_epoch_barrier(struct ceph_osd_client *osdc, u32 eb); void ceph_osdc_abort_requests(struct ceph_osd_client *osdc, int err); +void ceph_osdc_clear_abort_err(struct ceph_osd_client *osdc); extern void osd_req_op_init(struct ceph_osd_request *osd_req, unsigned int which, u16 opcode, u32 flags); diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c index 67e9466f27fd..9a46ae296424 100644 --- a/net/ceph/osd_client.c +++ b/net/ceph/osd_client.c @@ -2485,6 +2485,14 @@ void ceph_osdc_abort_requests(struct ceph_osd_client *osdc, int err) } EXPORT_SYMBOL(ceph_osdc_abort_requests); +void ceph_osdc_clear_abort_err(struct ceph_osd_client *osdc) +{ + down_write(&osdc->lock); + osdc->abort_err = 0; + up_write(&osdc->lock); +} +EXPORT_SYMBOL(ceph_osdc_clear_abort_err); + static void update_epoch_barrier(struct ceph_osd_client *osdc, u32 eb) { if (likely(eb > osdc->epoch_barrier)) { From patchwork Mon Jun 17 12:55:24 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yan, Zheng" X-Patchwork-Id: 10999209 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 501996C5 for ; Mon, 17 Jun 2019 12:55:52 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 40489285EB for ; Mon, 17 Jun 2019 12:55:52 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 34CF628904; Mon, 17 Jun 2019 12:55:52 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C429D285EB for ; Mon, 17 Jun 2019 12:55:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728031AbfFQMzu (ORCPT ); Mon, 17 Jun 2019 08:55:50 -0400 Received: from mx1.redhat.com ([209.132.183.28]:56868 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727987AbfFQMzq (ORCPT ); Mon, 17 Jun 2019 08:55:46 -0400 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id A28C481E07 for ; Mon, 17 Jun 2019 12:55:46 +0000 (UTC) Received: from zhyan-laptop.redhat.com (ovpn-12-92.pek2.redhat.com [10.72.12.92]) by smtp.corp.redhat.com (Postfix) with ESMTP id 74E76989C; Mon, 17 Jun 2019 12:55:40 +0000 (UTC) From: "Yan, Zheng" To: ceph-devel@vger.kernel.org Cc: idryomov@redhat.com, jlayton@redhat.com, "Yan, Zheng" Subject: [PATCH 3/8] ceph: allow closing session in restarting/reconnect state Date: Mon, 17 Jun 2019 20:55:24 +0800 Message-Id: <20190617125529.6230-4-zyan@redhat.com> In-Reply-To: <20190617125529.6230-1-zyan@redhat.com> References: <20190617125529.6230-1-zyan@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.25]); Mon, 17 Jun 2019 12:55:46 +0000 (UTC) Sender: ceph-devel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP CEPH_MDS_SESSION_{RESTARTING,RECONNECTING} are for for mds failover, they are sub-states of CEPH_MDS_SESSION_OPEN. So __close_session() should send close request for session in these two state. Signed-off-by: "Yan, Zheng" --- fs/ceph/mds_client.h | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/fs/ceph/mds_client.h b/fs/ceph/mds_client.h index 330769ecb601..eb527919011b 100644 --- a/fs/ceph/mds_client.h +++ b/fs/ceph/mds_client.h @@ -146,9 +146,9 @@ enum { CEPH_MDS_SESSION_OPENING = 2, CEPH_MDS_SESSION_OPEN = 3, CEPH_MDS_SESSION_HUNG = 4, - CEPH_MDS_SESSION_CLOSING = 5, - CEPH_MDS_SESSION_RESTARTING = 6, - CEPH_MDS_SESSION_RECONNECTING = 7, + CEPH_MDS_SESSION_RESTARTING = 5, + CEPH_MDS_SESSION_RECONNECTING = 6, + CEPH_MDS_SESSION_CLOSING = 7, CEPH_MDS_SESSION_REJECTED = 8, }; From patchwork Mon Jun 17 12:55:25 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yan, Zheng" X-Patchwork-Id: 10999211 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B1B1C1580 for ; Mon, 17 Jun 2019 12:55:52 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A496C28610 for ; Mon, 17 Jun 2019 12:55:52 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id A2EC828904; Mon, 17 Jun 2019 12:55:52 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5C39A28610 for ; Mon, 17 Jun 2019 12:55:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728044AbfFQMzu (ORCPT ); Mon, 17 Jun 2019 08:55:50 -0400 Received: from mx1.redhat.com ([209.132.183.28]:33216 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728030AbfFQMzu (ORCPT ); Mon, 17 Jun 2019 08:55:50 -0400 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id E40D330860A2 for ; Mon, 17 Jun 2019 12:55:49 +0000 (UTC) Received: from zhyan-laptop.redhat.com (ovpn-12-92.pek2.redhat.com [10.72.12.92]) by smtp.corp.redhat.com (Postfix) with ESMTP id 6C57239BE; Mon, 17 Jun 2019 12:55:47 +0000 (UTC) From: "Yan, Zheng" To: ceph-devel@vger.kernel.org Cc: idryomov@redhat.com, jlayton@redhat.com, "Yan, Zheng" Subject: [PATCH 4/8] ceph: allow remounting aborted mount Date: Mon, 17 Jun 2019 20:55:25 +0800 Message-Id: <20190617125529.6230-5-zyan@redhat.com> In-Reply-To: <20190617125529.6230-1-zyan@redhat.com> References: <20190617125529.6230-1-zyan@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.44]); Mon, 17 Jun 2019 12:55:49 +0000 (UTC) Sender: ceph-devel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP When remounting aborted mount, also reset client's entity addr. 'umount -f /ceph; mount -o remount /ceph' can be used for recovering from blacklist. Signed-off-by: "Yan, Zheng" --- fs/ceph/mds_client.c | 16 +++++++++++++--- fs/ceph/super.c | 23 +++++++++++++++++++++-- 2 files changed, 34 insertions(+), 5 deletions(-) diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index 19c62cf7d5b8..188c33709d9a 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -1378,9 +1378,12 @@ static int remove_session_caps_cb(struct inode *inode, struct ceph_cap *cap, struct ceph_cap_flush *cf; struct ceph_mds_client *mdsc = fsc->mdsc; - if (ci->i_wrbuffer_ref > 0 && - READ_ONCE(fsc->mount_state) == CEPH_MOUNT_SHUTDOWN) - invalidate = true; + if (READ_ONCE(fsc->mount_state) == CEPH_MOUNT_SHUTDOWN) { + if (inode->i_data.nrpages > 0) + invalidate = true; + if (ci->i_wrbuffer_ref > 0) + mapping_set_error(&inode->i_data, -EIO); + } while (!list_empty(&ci->i_cap_flush_list)) { cf = list_first_entry(&ci->i_cap_flush_list, @@ -4350,7 +4353,12 @@ void ceph_mdsc_force_umount(struct ceph_mds_client *mdsc) session = __ceph_lookup_mds_session(mdsc, mds); if (!session) continue; + + if (session->s_state == CEPH_MDS_SESSION_REJECTED) + __unregister_session(mdsc, session); + __wake_requests(mdsc, &session->s_waiting); mutex_unlock(&mdsc->mutex); + mutex_lock(&session->s_mutex); __close_session(mdsc, session); if (session->s_state == CEPH_MDS_SESSION_CLOSING) { @@ -4359,9 +4367,11 @@ void ceph_mdsc_force_umount(struct ceph_mds_client *mdsc) } mutex_unlock(&session->s_mutex); ceph_put_mds_session(session); + mutex_lock(&mdsc->mutex); kick_requests(mdsc, mds); } + __wake_requests(mdsc, &mdsc->waiting_for_map); mutex_unlock(&mdsc->mutex); } diff --git a/fs/ceph/super.c b/fs/ceph/super.c index 67eb9d592ab7..a6a3c065f697 100644 --- a/fs/ceph/super.c +++ b/fs/ceph/super.c @@ -833,8 +833,27 @@ static void ceph_umount_begin(struct super_block *sb) static int ceph_remount(struct super_block *sb, int *flags, char *data) { - sync_filesystem(sb); - return 0; + struct ceph_fs_client *fsc = ceph_sb_to_client(sb); + + if (fsc->mount_state != CEPH_MOUNT_SHUTDOWN) { + sync_filesystem(sb); + return 0; + } + + /* Make sure all page caches get invalidated. + * see remove_session_caps_cb() */ + flush_workqueue(fsc->inode_wq); + /* In case that we were blacklisted. This also reset + * all mon/osd connections */ + ceph_reset_client_addr(fsc->client); + + ceph_osdc_clear_abort_err(&fsc->client->osdc); + fsc->mount_state = 0; + + if (!sb->s_root) + return 0; + return __ceph_do_getattr(d_inode(sb->s_root), NULL, + CEPH_STAT_CAP_INODE, true); } static const struct super_operations ceph_super_ops = { From patchwork Mon Jun 17 12:55:26 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yan, Zheng" X-Patchwork-Id: 10999213 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7B02314B6 for ; Mon, 17 Jun 2019 12:55:55 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 6C6C928678 for ; Mon, 17 Jun 2019 12:55:55 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 60BB328904; Mon, 17 Jun 2019 12:55:55 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id AE20428678 for ; Mon, 17 Jun 2019 12:55:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728057AbfFQMzy (ORCPT ); Mon, 17 Jun 2019 08:55:54 -0400 Received: from mx1.redhat.com ([209.132.183.28]:46614 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728030AbfFQMzx (ORCPT ); Mon, 17 Jun 2019 08:55:53 -0400 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 010FA3001749 for ; Mon, 17 Jun 2019 12:55:53 +0000 (UTC) Received: from zhyan-laptop.redhat.com (ovpn-12-92.pek2.redhat.com [10.72.12.92]) by smtp.corp.redhat.com (Postfix) with ESMTP id AE85839BE; Mon, 17 Jun 2019 12:55:50 +0000 (UTC) From: "Yan, Zheng" To: ceph-devel@vger.kernel.org Cc: idryomov@redhat.com, jlayton@redhat.com, "Yan, Zheng" Subject: [PATCH 5/8] ceph: track and report error of async metadata operation Date: Mon, 17 Jun 2019 20:55:26 +0800 Message-Id: <20190617125529.6230-6-zyan@redhat.com> In-Reply-To: <20190617125529.6230-1-zyan@redhat.com> References: <20190617125529.6230-1-zyan@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.46]); Mon, 17 Jun 2019 12:55:53 +0000 (UTC) Sender: ceph-devel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Use errseq_t to track and report errors of async metadata operations, similar to how kernel handles errors during writeback. If any dirty caps or any unsafe request gets dropped during session eviction, record -EIO in corresponding inode's i_meta_err. The error will be reported by subsequent fsync, Signed-off-by: "Yan, Zheng" --- fs/ceph/caps.c | 24 +++++++++++++++++------- fs/ceph/file.c | 6 ++++-- fs/ceph/inode.c | 2 ++ fs/ceph/mds_client.c | 40 +++++++++++++++++++++++++++------------- fs/ceph/super.h | 4 ++++ 5 files changed, 54 insertions(+), 22 deletions(-) diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c index 50409d9fdc90..e83a481c3e6f 100644 --- a/fs/ceph/caps.c +++ b/fs/ceph/caps.c @@ -2244,35 +2244,45 @@ static int unsafe_request_wait(struct inode *inode) int ceph_fsync(struct file *file, loff_t start, loff_t end, int datasync) { + struct ceph_file_info *fi = file->private_data; struct inode *inode = file->f_mapping->host; struct ceph_inode_info *ci = ceph_inode(inode); u64 flush_tid; - int ret; + int ret, err; int dirty; dout("fsync %p%s\n", inode, datasync ? " datasync" : ""); ret = file_write_and_wait_range(file, start, end); - if (ret < 0) - goto out; - if (datasync) goto out; dirty = try_flush_caps(inode, &flush_tid); dout("fsync dirty caps are %s\n", ceph_cap_string(dirty)); - ret = unsafe_request_wait(inode); + err = unsafe_request_wait(inode); /* * only wait on non-file metadata writeback (the mds * can recover size and mtime, so we don't need to * wait for that) */ - if (!ret && (dirty & ~CEPH_CAP_ANY_FILE_WR)) { - ret = wait_event_interruptible(ci->i_cap_wq, + if (!err && (dirty & ~CEPH_CAP_ANY_FILE_WR)) { + err = wait_event_interruptible(ci->i_cap_wq, caps_are_flushed(inode, flush_tid)); } + + if (err < 0) + ret = err; + + if (errseq_check(&ci->i_meta_err, READ_ONCE(fi->meta_err))) { + spin_lock(&file->f_lock); + err = errseq_check_and_advance(&ci->i_meta_err, + &fi->meta_err); + spin_unlock(&file->f_lock); + if (err < 0) + ret = err; + } out: dout("fsync %p%s result=%d\n", inode, datasync ? " datasync" : "", ret); return ret; diff --git a/fs/ceph/file.c b/fs/ceph/file.c index a7080783fe20..2fe8ca7805f4 100644 --- a/fs/ceph/file.c +++ b/fs/ceph/file.c @@ -200,6 +200,7 @@ prepare_open_request(struct super_block *sb, int flags, int create_mode) static int ceph_init_file_info(struct inode *inode, struct file *file, int fmode, bool isdir) { + struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_file_info *fi; dout("%s %p %p 0%o (%s)\n", __func__, inode, file, @@ -210,7 +211,7 @@ static int ceph_init_file_info(struct inode *inode, struct file *file, struct ceph_dir_file_info *dfi = kmem_cache_zalloc(ceph_dir_file_cachep, GFP_KERNEL); if (!dfi) { - ceph_put_fmode(ceph_inode(inode), fmode); /* clean up */ + ceph_put_fmode(ci, fmode); /* clean up */ return -ENOMEM; } @@ -221,7 +222,7 @@ static int ceph_init_file_info(struct inode *inode, struct file *file, } else { fi = kmem_cache_zalloc(ceph_file_cachep, GFP_KERNEL); if (!fi) { - ceph_put_fmode(ceph_inode(inode), fmode); /* clean up */ + ceph_put_fmode(ci, fmode); /* clean up */ return -ENOMEM; } @@ -231,6 +232,7 @@ static int ceph_init_file_info(struct inode *inode, struct file *file, fi->fmode = fmode; spin_lock_init(&fi->rw_contexts_lock); INIT_LIST_HEAD(&fi->rw_contexts); + fi->meta_err = errseq_sample(&ci->i_meta_err); return 0; } diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c index 6003187dd39e..8c555734f8d5 100644 --- a/fs/ceph/inode.c +++ b/fs/ceph/inode.c @@ -512,6 +512,8 @@ struct inode *ceph_alloc_inode(struct super_block *sb) ceph_fscache_inode_init(ci); + ci->i_meta_err = 0; + return &ci->vfs_inode; } diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index 188c33709d9a..52a3bd7a49f4 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -1265,6 +1265,7 @@ static void cleanup_session_requests(struct ceph_mds_client *mdsc, { struct ceph_mds_request *req; struct rb_node *p; + struct ceph_inode_info *ci; dout("cleanup_session_requests mds%d\n", session->s_mds); mutex_lock(&mdsc->mutex); @@ -1273,6 +1274,16 @@ static void cleanup_session_requests(struct ceph_mds_client *mdsc, struct ceph_mds_request, r_unsafe_item); pr_warn_ratelimited(" dropping unsafe request %llu\n", req->r_tid); + if (req->r_target_inode) { + /* dropping unsafe change of inode's attributes */ + ci = ceph_inode(req->r_target_inode); + errseq_set(&ci->i_meta_err, -EIO); + } + if (req->r_unsafe_dir) { + /* dropping unsafe directory operation */ + ci = ceph_inode(req->r_unsafe_dir); + errseq_set(&ci->i_meta_err, -EIO); + } __unregister_request(mdsc, req); } /* zero r_attempts, so kick_requests() will re-send requests */ @@ -1365,7 +1376,7 @@ static int remove_session_caps_cb(struct inode *inode, struct ceph_cap *cap, struct ceph_fs_client *fsc = (struct ceph_fs_client *)arg; struct ceph_inode_info *ci = ceph_inode(inode); LIST_HEAD(to_remove); - bool drop = false; + bool dirty_dropped = false; bool invalidate = false; dout("removing cap %p, ci is %p, inode is %p\n", @@ -1403,7 +1414,7 @@ static int remove_session_caps_cb(struct inode *inode, struct ceph_cap *cap, inode, ceph_ino(inode)); ci->i_dirty_caps = 0; list_del_init(&ci->i_dirty_item); - drop = true; + dirty_dropped = true; } if (!list_empty(&ci->i_flushing_item)) { pr_warn_ratelimited( @@ -1413,10 +1424,22 @@ static int remove_session_caps_cb(struct inode *inode, struct ceph_cap *cap, ci->i_flushing_caps = 0; list_del_init(&ci->i_flushing_item); mdsc->num_cap_flushing--; - drop = true; + dirty_dropped = true; } spin_unlock(&mdsc->cap_dirty_lock); + if (dirty_dropped) { + errseq_set(&ci->i_meta_err, -EIO); + + if (ci->i_wrbuffer_ref_head == 0 && + ci->i_wr_ref == 0 && + ci->i_dirty_caps == 0 && + ci->i_flushing_caps == 0) { + ceph_put_snap_context(ci->i_head_snapc); + ci->i_head_snapc = NULL; + } + } + if (atomic_read(&ci->i_filelock_ref) > 0) { /* make further file lock syscall return -EIO */ ci->i_ceph_flags |= CEPH_I_ERROR_FILELOCK; @@ -1428,15 +1451,6 @@ static int remove_session_caps_cb(struct inode *inode, struct ceph_cap *cap, list_add(&ci->i_prealloc_cap_flush->i_list, &to_remove); ci->i_prealloc_cap_flush = NULL; } - - if (drop && - ci->i_wrbuffer_ref_head == 0 && - ci->i_wr_ref == 0 && - ci->i_dirty_caps == 0 && - ci->i_flushing_caps == 0) { - ceph_put_snap_context(ci->i_head_snapc); - ci->i_head_snapc = NULL; - } } spin_unlock(&ci->i_ceph_lock); while (!list_empty(&to_remove)) { @@ -1450,7 +1464,7 @@ static int remove_session_caps_cb(struct inode *inode, struct ceph_cap *cap, wake_up_all(&ci->i_cap_wq); if (invalidate) ceph_queue_invalidate(inode); - if (drop) + if (dirty_dropped) iput(inode); return 0; } diff --git a/fs/ceph/super.h b/fs/ceph/super.h index 98d2bafc2ee2..2e516d47052f 100644 --- a/fs/ceph/super.h +++ b/fs/ceph/super.h @@ -393,6 +393,8 @@ struct ceph_inode_info { struct fscache_cookie *fscache; u32 i_fscache_gen; #endif + errseq_t i_meta_err; + struct inode vfs_inode; /* at end */ }; @@ -701,6 +703,8 @@ struct ceph_file_info { spinlock_t rw_contexts_lock; struct list_head rw_contexts; + + errseq_t meta_err; }; struct ceph_dir_file_info { From patchwork Mon Jun 17 12:55:27 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yan, Zheng" X-Patchwork-Id: 10999215 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2F40914B6 for ; Mon, 17 Jun 2019 12:56:00 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1FEC128905 for ; Mon, 17 Jun 2019 12:56:00 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 148A128610; Mon, 17 Jun 2019 12:56:00 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2AD68286AC for ; Mon, 17 Jun 2019 12:55:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728059AbfFQMz6 (ORCPT ); Mon, 17 Jun 2019 08:55:58 -0400 Received: from mx1.redhat.com ([209.132.183.28]:30434 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727463AbfFQMz5 (ORCPT ); Mon, 17 Jun 2019 08:55:57 -0400 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 184AD81DF1 for ; Mon, 17 Jun 2019 12:55:57 +0000 (UTC) Received: from zhyan-laptop.redhat.com (ovpn-12-92.pek2.redhat.com [10.72.12.92]) by smtp.corp.redhat.com (Postfix) with ESMTP id C2E7A7D934; Mon, 17 Jun 2019 12:55:53 +0000 (UTC) From: "Yan, Zheng" To: ceph-devel@vger.kernel.org Cc: idryomov@redhat.com, jlayton@redhat.com, "Yan, Zheng" Subject: [PATCH 6/8] ceph: pass filp to ceph_get_caps() Date: Mon, 17 Jun 2019 20:55:27 +0800 Message-Id: <20190617125529.6230-7-zyan@redhat.com> In-Reply-To: <20190617125529.6230-1-zyan@redhat.com> References: <20190617125529.6230-1-zyan@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.25]); Mon, 17 Jun 2019 12:55:57 +0000 (UTC) Sender: ceph-devel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Also change several other functions' arguments, no logical changes. This is preparetion for later patch that checks filp error. Signed-off-by: "Yan, Zheng" --- fs/ceph/addr.c | 15 +++++++++------ fs/ceph/caps.c | 32 +++++++++++++++++--------------- fs/ceph/file.c | 35 +++++++++++++++++------------------ fs/ceph/super.h | 6 +++--- 4 files changed, 46 insertions(+), 42 deletions(-) diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c index a47c541f8006..4d39fd4c1dd6 100644 --- a/fs/ceph/addr.c +++ b/fs/ceph/addr.c @@ -322,7 +322,8 @@ static int start_read(struct inode *inode, struct ceph_rw_context *rw_ctx, /* caller of readpages does not hold buffer and read caps * (fadvise, madvise and readahead cases) */ int want = CEPH_CAP_FILE_CACHE; - ret = ceph_try_get_caps(ci, CEPH_CAP_FILE_RD, want, true, &got); + ret = ceph_try_get_caps(inode, CEPH_CAP_FILE_RD, want, + true, &got); if (ret < 0) { dout("start_read %p, error getting cap\n", inode); } else if (!(got & want)) { @@ -1450,7 +1451,8 @@ static vm_fault_t ceph_filemap_fault(struct vm_fault *vmf) want = CEPH_CAP_FILE_CACHE; got = 0; - err = ceph_get_caps(ci, CEPH_CAP_FILE_RD, want, -1, &got, &pinned_page); + err = ceph_get_caps(vma->vm_file, CEPH_CAP_FILE_RD, want, -1, + &got, &pinned_page); if (err < 0) goto out_restore; @@ -1566,7 +1568,7 @@ static vm_fault_t ceph_page_mkwrite(struct vm_fault *vmf) want = CEPH_CAP_FILE_BUFFER; got = 0; - err = ceph_get_caps(ci, CEPH_CAP_FILE_WR, want, off + len, + err = ceph_get_caps(vma->vm_file, CEPH_CAP_FILE_WR, want, off + len, &got, NULL); if (err < 0) goto out_free; @@ -1986,10 +1988,11 @@ static int __ceph_pool_perm_get(struct ceph_inode_info *ci, return err; } -int ceph_pool_perm_check(struct ceph_inode_info *ci, int need) +int ceph_pool_perm_check(struct inode *inode, int need) { - s64 pool; + struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_string *pool_ns; + s64 pool; int ret, flags; if (ci->i_vino.snap != CEPH_NOSNAP) { @@ -2001,7 +2004,7 @@ int ceph_pool_perm_check(struct ceph_inode_info *ci, int need) return 0; } - if (ceph_test_mount_opt(ceph_inode_to_client(&ci->vfs_inode), + if (ceph_test_mount_opt(ceph_inode_to_client(inode), NOPOOLPERM)) return 0; diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c index e83a481c3e6f..57e1447a9d4b 100644 --- a/fs/ceph/caps.c +++ b/fs/ceph/caps.c @@ -2543,10 +2543,10 @@ static void __take_cap_refs(struct ceph_inode_info *ci, int got, * * FIXME: how does a 0 return differ from -EAGAIN? */ -static int try_get_cap_refs(struct ceph_inode_info *ci, int need, int want, +static int try_get_cap_refs(struct inode *inode, int need, int want, loff_t endoff, bool nonblock, int *got) { - struct inode *inode = &ci->vfs_inode; + struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_mds_client *mdsc = ceph_inode_to_client(inode)->mdsc; int ret = 0; int have, implemented; @@ -2714,18 +2714,18 @@ static void check_max_size(struct inode *inode, loff_t endoff) ceph_check_caps(ci, CHECK_CAPS_AUTHONLY, NULL); } -int ceph_try_get_caps(struct ceph_inode_info *ci, int need, int want, +int ceph_try_get_caps(struct inode *inode, int need, int want, bool nonblock, int *got) { int ret; BUG_ON(need & ~CEPH_CAP_FILE_RD); BUG_ON(want & ~(CEPH_CAP_FILE_CACHE|CEPH_CAP_FILE_LAZYIO|CEPH_CAP_FILE_SHARED)); - ret = ceph_pool_perm_check(ci, need); + ret = ceph_pool_perm_check(inode, need); if (ret < 0) return ret; - ret = try_get_cap_refs(ci, need, want, 0, nonblock, got); + ret = try_get_cap_refs(inode, need, want, 0, nonblock, got); return ret == -EAGAIN ? 0 : ret; } @@ -2734,21 +2734,23 @@ int ceph_try_get_caps(struct ceph_inode_info *ci, int need, int want, * due to a small max_size, make sure we check_max_size (and possibly * ask the mds) so we don't get hung up indefinitely. */ -int ceph_get_caps(struct ceph_inode_info *ci, int need, int want, +int ceph_get_caps(struct file *filp, int need, int want, loff_t endoff, int *got, struct page **pinned_page) { + struct inode *inode = file_inode(filp); + struct ceph_inode_info *ci = ceph_inode(inode); int _got, ret; - ret = ceph_pool_perm_check(ci, need); + ret = ceph_pool_perm_check(inode, need); if (ret < 0) return ret; while (true) { if (endoff > 0) - check_max_size(&ci->vfs_inode, endoff); + check_max_size(inode, endoff); _got = 0; - ret = try_get_cap_refs(ci, need, want, endoff, + ret = try_get_cap_refs(inode, need, want, endoff, false, &_got); if (ret == -EAGAIN) continue; @@ -2756,8 +2758,8 @@ int ceph_get_caps(struct ceph_inode_info *ci, int need, int want, DEFINE_WAIT_FUNC(wait, woken_wake_function); add_wait_queue(&ci->i_cap_wq, &wait); - while (!(ret = try_get_cap_refs(ci, need, want, endoff, - true, &_got))) { + while (!(ret = try_get_cap_refs(inode, need, want, + endoff, true, &_got))) { if (signal_pending(current)) { ret = -ERESTARTSYS; break; @@ -2772,7 +2774,7 @@ int ceph_get_caps(struct ceph_inode_info *ci, int need, int want, if (ret < 0) { if (ret == -ESTALE) { /* session was killed, try renew caps */ - ret = ceph_renew_caps(&ci->vfs_inode); + ret = ceph_renew_caps(inode); if (ret == 0) continue; } @@ -2781,9 +2783,9 @@ int ceph_get_caps(struct ceph_inode_info *ci, int need, int want, if (ci->i_inline_version != CEPH_INLINE_NONE && (_got & (CEPH_CAP_FILE_CACHE|CEPH_CAP_FILE_LAZYIO)) && - i_size_read(&ci->vfs_inode) > 0) { + i_size_read(inode) > 0) { struct page *page = - find_get_page(ci->vfs_inode.i_mapping, 0); + find_get_page(inode->i_mapping, 0); if (page) { if (PageUptodate(page)) { *pinned_page = page; @@ -2802,7 +2804,7 @@ int ceph_get_caps(struct ceph_inode_info *ci, int need, int want, * getattr request will bring inline data into * page cache */ - ret = __ceph_do_getattr(&ci->vfs_inode, NULL, + ret = __ceph_do_getattr(inode, NULL, CEPH_STAT_CAP_INLINE_DATA, true); if (ret < 0) diff --git a/fs/ceph/file.c b/fs/ceph/file.c index 2fe8ca7805f4..300cbaf09aa1 100644 --- a/fs/ceph/file.c +++ b/fs/ceph/file.c @@ -1260,7 +1260,8 @@ static ssize_t ceph_read_iter(struct kiocb *iocb, struct iov_iter *to) want = CEPH_CAP_FILE_CACHE | CEPH_CAP_FILE_LAZYIO; else want = CEPH_CAP_FILE_CACHE; - ret = ceph_get_caps(ci, CEPH_CAP_FILE_RD, want, -1, &got, &pinned_page); + ret = ceph_get_caps(filp, CEPH_CAP_FILE_RD, want, -1, + &got, &pinned_page); if (ret < 0) return ret; @@ -1455,7 +1456,7 @@ static ssize_t ceph_write_iter(struct kiocb *iocb, struct iov_iter *from) else want = CEPH_CAP_FILE_BUFFER; got = 0; - err = ceph_get_caps(ci, CEPH_CAP_FILE_WR, want, pos + count, + err = ceph_get_caps(file, CEPH_CAP_FILE_WR, want, pos + count, &got, NULL); if (err < 0) goto out; @@ -1779,7 +1780,7 @@ static long ceph_fallocate(struct file *file, int mode, else want = CEPH_CAP_FILE_BUFFER; - ret = ceph_get_caps(ci, CEPH_CAP_FILE_WR, want, endoff, &got, NULL); + ret = ceph_get_caps(file, CEPH_CAP_FILE_WR, want, endoff, &got, NULL); if (ret < 0) goto unlock; @@ -1808,16 +1809,15 @@ static long ceph_fallocate(struct file *file, int mode, * src_ci. Two attempts are made to obtain both caps, and an error is return if * this fails; zero is returned on success. */ -static int get_rd_wr_caps(struct ceph_inode_info *src_ci, - loff_t src_endoff, int *src_got, - struct ceph_inode_info *dst_ci, +static int get_rd_wr_caps(struct file *src_filp, int *src_got, + struct file *dst_filp, loff_t dst_endoff, int *dst_got) { int ret = 0; bool retrying = false; retry_caps: - ret = ceph_get_caps(dst_ci, CEPH_CAP_FILE_WR, CEPH_CAP_FILE_BUFFER, + ret = ceph_get_caps(dst_filp, CEPH_CAP_FILE_WR, CEPH_CAP_FILE_BUFFER, dst_endoff, dst_got, NULL); if (ret < 0) return ret; @@ -1827,24 +1827,24 @@ static int get_rd_wr_caps(struct ceph_inode_info *src_ci, * we would risk a deadlock by using ceph_get_caps. Thus, we'll do some * retry dance instead to try to get both capabilities. */ - ret = ceph_try_get_caps(src_ci, CEPH_CAP_FILE_RD, CEPH_CAP_FILE_SHARED, + ret = ceph_try_get_caps(file_inode(src_filp), + CEPH_CAP_FILE_RD, CEPH_CAP_FILE_SHARED, false, src_got); if (ret <= 0) { /* Start by dropping dst_ci caps and getting src_ci caps */ - ceph_put_cap_refs(dst_ci, *dst_got); + ceph_put_cap_refs(ceph_inode(file_inode(dst_filp)), *dst_got); if (retrying) { if (!ret) /* ceph_try_get_caps masks EAGAIN */ ret = -EAGAIN; return ret; } - ret = ceph_get_caps(src_ci, CEPH_CAP_FILE_RD, - CEPH_CAP_FILE_SHARED, src_endoff, - src_got, NULL); + ret = ceph_get_caps(src_filp, CEPH_CAP_FILE_RD, + CEPH_CAP_FILE_SHARED, -1, src_got, NULL); if (ret < 0) return ret; /*... drop src_ci caps too, and retry */ - ceph_put_cap_refs(src_ci, *src_got); + ceph_put_cap_refs(ceph_inode(file_inode(src_filp)), *src_got); retrying = true; goto retry_caps; } @@ -1958,8 +1958,8 @@ static ssize_t ceph_copy_file_range(struct file *src_file, loff_t src_off, * clients may have dirty data in their caches. And OSDs know nothing * about caps, so they can't safely do the remote object copies. */ - err = get_rd_wr_caps(src_ci, (src_off + len), &src_got, - dst_ci, (dst_off + len), &dst_got); + err = get_rd_wr_caps(src_file, &src_got, + dst_file, (dst_off + len), &dst_got); if (err < 0) { dout("get_rd_wr_caps returned %d\n", err); ret = -EOPNOTSUPP; @@ -2016,9 +2016,8 @@ static ssize_t ceph_copy_file_range(struct file *src_file, loff_t src_off, goto out; } len -= ret; - err = get_rd_wr_caps(src_ci, (src_off + len), - &src_got, dst_ci, - (dst_off + len), &dst_got); + err = get_rd_wr_caps(src_file, &src_got, + dst_file, (dst_off + len), &dst_got); if (err < 0) goto out; err = is_file_size_ok(src_inode, dst_inode, diff --git a/fs/ceph/super.h b/fs/ceph/super.h index 2e516d47052f..f45a06475f4f 100644 --- a/fs/ceph/super.h +++ b/fs/ceph/super.h @@ -1061,9 +1061,9 @@ extern int ceph_encode_dentry_release(void **p, struct dentry *dn, struct inode *dir, int mds, int drop, int unless); -extern int ceph_get_caps(struct ceph_inode_info *ci, int need, int want, +extern int ceph_get_caps(struct file *filp, int need, int want, loff_t endoff, int *got, struct page **pinned_page); -extern int ceph_try_get_caps(struct ceph_inode_info *ci, +extern int ceph_try_get_caps(struct inode *inode, int need, int want, bool nonblock, int *got); /* for counting open files by mode */ @@ -1074,7 +1074,7 @@ extern void ceph_put_fmode(struct ceph_inode_info *ci, int mode); extern const struct address_space_operations ceph_aops; extern int ceph_mmap(struct file *file, struct vm_area_struct *vma); extern int ceph_uninline_data(struct file *filp, struct page *locked_page); -extern int ceph_pool_perm_check(struct ceph_inode_info *ci, int need); +extern int ceph_pool_perm_check(struct inode *inode, int need); extern void ceph_pool_perm_destroy(struct ceph_mds_client* mdsc); /* file.c */ From patchwork Mon Jun 17 12:55:28 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yan, Zheng" X-Patchwork-Id: 10999217 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4AA9D6C5 for ; Mon, 17 Jun 2019 12:56:05 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 3BFB8285E3 for ; Mon, 17 Jun 2019 12:56:05 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 3076A2890E; Mon, 17 Jun 2019 12:56:05 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D0D3C285E3 for ; Mon, 17 Jun 2019 12:56:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728063AbfFQM4D (ORCPT ); Mon, 17 Jun 2019 08:56:03 -0400 Received: from mx1.redhat.com ([209.132.183.28]:51944 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726599AbfFQM4D (ORCPT ); Mon, 17 Jun 2019 08:56:03 -0400 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id A23B8356E5 for ; Mon, 17 Jun 2019 12:56:03 +0000 (UTC) Received: from zhyan-laptop.redhat.com (ovpn-12-92.pek2.redhat.com [10.72.12.92]) by smtp.corp.redhat.com (Postfix) with ESMTP id D843739BE; Mon, 17 Jun 2019 12:55:57 +0000 (UTC) From: "Yan, Zheng" To: ceph-devel@vger.kernel.org Cc: idryomov@redhat.com, jlayton@redhat.com, "Yan, Zheng" Subject: [PATCH 7/8] ceph: check page writeback error during write Date: Mon, 17 Jun 2019 20:55:28 +0800 Message-Id: <20190617125529.6230-8-zyan@redhat.com> In-Reply-To: <20190617125529.6230-1-zyan@redhat.com> References: <20190617125529.6230-1-zyan@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.30]); Mon, 17 Jun 2019 12:56:03 +0000 (UTC) Sender: ceph-devel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Make write(2) return error prematurely if there is writeback error. User can use fsync() or fdatasync() to clear the error. This change is mainly for reporting errors after blacklist + reconnect. Signed-off-by: "Yan, Zheng" --- fs/ceph/caps.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c index 57e1447a9d4b..f07767d3864c 100644 --- a/fs/ceph/caps.c +++ b/fs/ceph/caps.c @@ -2814,6 +2814,14 @@ int ceph_get_caps(struct file *filp, int need, int want, break; } + if (_got & CEPH_CAP_FILE_WR) { + ret = filemap_check_wb_err(inode->i_mapping, filp->f_wb_err); + if (ret < 0) { + ceph_put_cap_refs(ci, _got); + return ret; + } + } + if ((_got & CEPH_CAP_FILE_RD) && (_got & CEPH_CAP_FILE_CACHE)) ceph_fscache_revalidate_cookie(ci); From patchwork Mon Jun 17 12:55:29 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Yan, Zheng" X-Patchwork-Id: 10999219 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9265814B6 for ; Mon, 17 Jun 2019 12:56:11 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 82F282623D for ; Mon, 17 Jun 2019 12:56:11 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 8173828904; Mon, 17 Jun 2019 12:56:11 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 0B4772623D for ; Mon, 17 Jun 2019 12:56:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728073AbfFQM4K (ORCPT ); Mon, 17 Jun 2019 08:56:10 -0400 Received: from mx1.redhat.com ([209.132.183.28]:51129 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726405AbfFQM4K (ORCPT ); Mon, 17 Jun 2019 08:56:10 -0400 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 82A982ED2C1 for ; Mon, 17 Jun 2019 12:56:09 +0000 (UTC) Received: from zhyan-laptop.redhat.com (ovpn-12-92.pek2.redhat.com [10.72.12.92]) by smtp.corp.redhat.com (Postfix) with ESMTP id 6C2517D66C; Mon, 17 Jun 2019 12:56:04 +0000 (UTC) From: "Yan, Zheng" To: ceph-devel@vger.kernel.org Cc: idryomov@redhat.com, jlayton@redhat.com, "Yan, Zheng" Subject: [PATCH 8/8] ceph: return -EIO if read/write against filp that lost file locks Date: Mon, 17 Jun 2019 20:55:29 +0800 Message-Id: <20190617125529.6230-9-zyan@redhat.com> In-Reply-To: <20190617125529.6230-1-zyan@redhat.com> References: <20190617125529.6230-1-zyan@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.29]); Mon, 17 Jun 2019 12:56:09 +0000 (UTC) Sender: ceph-devel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP After mds evicts session, file locks get lost sliently. It's not safe to let programs continue to do read/write. Signed-off-by: "Yan, Zheng" --- fs/ceph/caps.c | 28 ++++++++++++++++++++++------ fs/ceph/locks.c | 8 ++++++-- fs/ceph/super.h | 1 + 3 files changed, 29 insertions(+), 8 deletions(-) diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c index f07767d3864c..d8efacd874b7 100644 --- a/fs/ceph/caps.c +++ b/fs/ceph/caps.c @@ -2543,8 +2543,13 @@ static void __take_cap_refs(struct ceph_inode_info *ci, int got, * * FIXME: how does a 0 return differ from -EAGAIN? */ +enum { + NON_BLOCKING = 1, + CHECK_FILELOCK = 2, +}; + static int try_get_cap_refs(struct inode *inode, int need, int want, - loff_t endoff, bool nonblock, int *got) + loff_t endoff, int flags, int *got) { struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_mds_client *mdsc = ceph_inode_to_client(inode)->mdsc; @@ -2559,6 +2564,13 @@ static int try_get_cap_refs(struct inode *inode, int need, int want, again: spin_lock(&ci->i_ceph_lock); + if ((flags & CHECK_FILELOCK) && + (ci->i_ceph_flags & CEPH_I_ERROR_FILELOCK)) { + dout("try_get_cap_refs %p error filelock\n", inode); + ret = -EIO; + goto out_unlock; + } + /* make sure file is actually open */ file_wanted = __ceph_caps_file_wanted(ci); if ((file_wanted & need) != need) { @@ -2620,7 +2632,7 @@ static int try_get_cap_refs(struct inode *inode, int need, int want, * we can not call down_read() when * task isn't in TASK_RUNNING state */ - if (nonblock) { + if (flags & NON_BLOCKING) { ret = -EAGAIN; goto out_unlock; } @@ -2725,7 +2737,8 @@ int ceph_try_get_caps(struct inode *inode, int need, int want, if (ret < 0) return ret; - ret = try_get_cap_refs(inode, need, want, 0, nonblock, got); + ret = try_get_cap_refs(inode, need, want, 0, + (nonblock ? NON_BLOCKING : 0), got); return ret == -EAGAIN ? 0 : ret; } @@ -2737,9 +2750,10 @@ int ceph_try_get_caps(struct inode *inode, int need, int want, int ceph_get_caps(struct file *filp, int need, int want, loff_t endoff, int *got, struct page **pinned_page) { + struct ceph_file_info *fi = filp->private_data; struct inode *inode = file_inode(filp); struct ceph_inode_info *ci = ceph_inode(inode); - int _got, ret; + int ret, _got, flags; ret = ceph_pool_perm_check(inode, need); if (ret < 0) @@ -2749,17 +2763,19 @@ int ceph_get_caps(struct file *filp, int need, int want, if (endoff > 0) check_max_size(inode, endoff); + flags = atomic_read(&fi->num_locks) ? CHECK_FILELOCK : 0; _got = 0; ret = try_get_cap_refs(inode, need, want, endoff, - false, &_got); + flags, &_got); if (ret == -EAGAIN) continue; if (!ret) { DEFINE_WAIT_FUNC(wait, woken_wake_function); add_wait_queue(&ci->i_cap_wq, &wait); + flags |= NON_BLOCKING; while (!(ret = try_get_cap_refs(inode, need, want, - endoff, true, &_got))) { + endoff, flags, &_got))) { if (signal_pending(current)) { ret = -ERESTARTSYS; break; diff --git a/fs/ceph/locks.c b/fs/ceph/locks.c index ac9b53b89365..cb216501c959 100644 --- a/fs/ceph/locks.c +++ b/fs/ceph/locks.c @@ -32,14 +32,18 @@ void __init ceph_flock_init(void) static void ceph_fl_copy_lock(struct file_lock *dst, struct file_lock *src) { - struct inode *inode = file_inode(src->fl_file); + struct ceph_file_info *fi = dst->fl_file->private_data; + struct inode *inode = file_inode(dst->fl_file); atomic_inc(&ceph_inode(inode)->i_filelock_ref); + atomic_inc(&fi->num_locks); } static void ceph_fl_release_lock(struct file_lock *fl) { + struct ceph_file_info *fi = fl->fl_file->private_data; struct inode *inode = file_inode(fl->fl_file); struct ceph_inode_info *ci = ceph_inode(inode); + atomic_dec(&fi->num_locks); if (atomic_dec_and_test(&ci->i_filelock_ref)) { /* clear error when all locks are released */ spin_lock(&ci->i_ceph_lock); @@ -73,7 +77,7 @@ static int ceph_lock_message(u8 lock_type, u16 operation, struct inode *inode, * window. Caller function will decrease the counter. */ fl->fl_ops = &ceph_fl_lock_ops; - atomic_inc(&ceph_inode(inode)->i_filelock_ref); + fl->fl_ops->fl_copy_lock(fl, NULL); } if (operation != CEPH_MDS_OP_SETFILELOCK || cmd == CEPH_LOCK_UNLOCK) diff --git a/fs/ceph/super.h b/fs/ceph/super.h index f45a06475f4f..999fd3244907 100644 --- a/fs/ceph/super.h +++ b/fs/ceph/super.h @@ -705,6 +705,7 @@ struct ceph_file_info { struct list_head rw_contexts; errseq_t meta_err; + atomic_t num_locks; }; struct ceph_dir_file_info {