From patchwork Wed May 30 17:43:37 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ilya Dryomov
X-Patchwork-Id: 10439487
From: Ilya Dryomov
To: ceph-devel@vger.kernel.org
Cc: Jeff Layton
Subject: [PATCH 2/7] libceph: defer __complete_request() to a workqueue
Date: Wed, 30 May 2018 19:43:37 +0200
Message-Id: <1527702222-8232-3-git-send-email-idryomov@gmail.com>
X-Mailer: git-send-email 2.4.3
In-Reply-To: <1527702222-8232-1-git-send-email-idryomov@gmail.com>
References: <1527702222-8232-1-git-send-email-idryomov@gmail.com>
Sender: ceph-devel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: ceph-devel@vger.kernel.org

In the common case, req->r_callback is called by handle_reply() on the
ceph-msgr worker thread without any locks. If handle_reply() fails, it
is called with both osd->lock and osdc->lock. In the map check case,
it is called with just osdc->lock but held for write. Finally, if the
request is aborted because of -ENOSPC or by ceph_osdc_abort_requests(),
it is called directly on the submitter's thread, again with both locks.

req->r_callback on the submitter's thread is relatively new (introduced
in 4.12) and ripe for deadlocks -- e.g.
writeback worker thread waiting on itself:

  inode_wait_for_writeback+0x26/0x40
  evict+0xb5/0x1a0
  iput+0x1d2/0x220
  ceph_put_wrbuffer_cap_refs+0xe0/0x2c0 [ceph]
  writepages_finish+0x2d3/0x410 [ceph]
  __complete_request+0x26/0x60 [libceph]
  complete_request+0x2e/0x70 [libceph]
  __submit_request+0x256/0x330 [libceph]
  submit_request+0x2b/0x30 [libceph]
  ceph_osdc_start_request+0x25/0x40 [libceph]
  ceph_writepages_start+0xdfe/0x1320 [ceph]
  do_writepages+0x1f/0x70
  __writeback_single_inode+0x45/0x330
  writeback_sb_inodes+0x26a/0x600
  __writeback_inodes_wb+0x92/0xc0
  wb_writeback+0x274/0x330
  wb_workfn+0x2d5/0x3b0

Defer __complete_request() to a workqueue in all failure cases so it's
never on the same thread as ceph_osdc_start_request() and always called
with no locks held.

Link: http://tracker.ceph.com/issues/23978
Signed-off-by: Ilya Dryomov
---
 include/linux/ceph/osd_client.h |  2 ++
 net/ceph/osd_client.c           | 19 ++++++++++++++++++-
 2 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/include/linux/ceph/osd_client.h b/include/linux/ceph/osd_client.h
index 874c31c01f80..d4191bde95a4 100644
--- a/include/linux/ceph/osd_client.h
+++ b/include/linux/ceph/osd_client.h
@@ -170,6 +170,7 @@ struct ceph_osd_request {
 	u64 r_tid; /* unique for this client */
 	struct rb_node r_node;
 	struct rb_node r_mc_node; /* map check */
+	struct work_struct r_complete_work;
 	struct ceph_osd *r_osd;
 
 	struct ceph_osd_request_target r_t;
@@ -360,6 +361,7 @@ struct ceph_osd_client {
 	struct ceph_msgpool msgpool_op_reply;
 
 	struct workqueue_struct *notify_wq;
+	struct workqueue_struct *completion_wq;
 };
 
 static inline bool ceph_osdmap_flag(struct ceph_osd_client *osdc, int flag)
diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
index a78f578a2da7..a4c12c37aa90 100644
--- a/net/ceph/osd_client.c
+++ b/net/ceph/osd_client.c
@@ -2329,6 +2329,14 @@ static void __complete_request(struct ceph_osd_request *req)
 	ceph_osdc_put_request(req);
 }
 
+static void complete_request_workfn(struct work_struct *work)
+{
+	struct ceph_osd_request *req =
+	    container_of(work, struct ceph_osd_request, r_complete_work);
+
+	__complete_request(req);
+}
+
 /*
  * This is open-coded in handle_reply().
  */
@@ -2338,7 +2346,9 @@ static void complete_request(struct ceph_osd_request *req, int err)
 
 	req->r_result = err;
 	finish_request(req);
-	__complete_request(req);
+
+	INIT_WORK(&req->r_complete_work, complete_request_workfn);
+	queue_work(req->r_osdc->completion_wq, &req->r_complete_work);
 }
 
 static void cancel_map_check(struct ceph_osd_request *req)
@@ -5058,6 +5068,10 @@ int ceph_osdc_init(struct ceph_osd_client *osdc, struct ceph_client *client)
 	if (!osdc->notify_wq)
 		goto out_msgpool_reply;
 
+	osdc->completion_wq = create_singlethread_workqueue("ceph-completion");
+	if (!osdc->completion_wq)
+		goto out_notify_wq;
+
 	schedule_delayed_work(&osdc->timeout_work,
 			      osdc->client->options->osd_keepalive_timeout);
 	schedule_delayed_work(&osdc->osds_timeout_work,
@@ -5065,6 +5079,8 @@ int ceph_osdc_init(struct ceph_osd_client *osdc, struct ceph_client *client)
 
 	return 0;
 
+out_notify_wq:
+	destroy_workqueue(osdc->notify_wq);
 out_msgpool_reply:
 	ceph_msgpool_destroy(&osdc->msgpool_op_reply);
 out_msgpool:
@@ -5079,6 +5095,7 @@ int ceph_osdc_init(struct ceph_osd_client *osdc, struct ceph_client *client)
 
 void ceph_osdc_stop(struct ceph_osd_client *osdc)
 {
+	destroy_workqueue(osdc->completion_wq);
 	destroy_workqueue(osdc->notify_wq);
 	cancel_delayed_work_sync(&osdc->timeout_work);
 	cancel_delayed_work_sync(&osdc->osds_timeout_work);