From patchwork Thu Feb 9 13:04:11 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Artur Molchanov
X-Patchwork-Id: 9564523
To: Ilya Dryomov
Cc: ceph-devel@vger.kernel.org
From: Artur Molchanov
Subject: [PATCH] libceph: Complete stuck requests to OSD with EIO
Message-ID: <4e080919-2df5-1b75-3c8a-3ae95eb8f08d@synesis.ru>
Date: Thu, 9 Feb 2017 16:04:11 +0300
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.7.0
X-Mailing-List: ceph-devel@vger.kernel.org

From: Artur Molchanov

Complete stuck requests to OSD with error EIO once osd_request_timeout
has expired. If osd_request_timeout is 0 (the default value), do nothing
with hung requests (keep the default behavior).

Add an RBD map option, osd_request_timeout, to set the timeout in
seconds. osd_request_timeout defaults to 0.

Signed-off-by: Artur Molchanov
---
 include/linux/ceph/libceph.h    |  8 +++--
 include/linux/ceph/osd_client.h |  1 +
 net/ceph/ceph_common.c          | 12 +++++++
 net/ceph/osd_client.c           | 72 +++++++++++++++++++++++++++++++++++++++++
 4 files changed, 90 insertions(+), 3 deletions(-)

diff --git a/include/linux/ceph/libceph.h b/include/linux/ceph/libceph.h
index 1816c5e..10d8acb 100644
--- a/include/linux/ceph/libceph.h
+++ b/include/linux/ceph/libceph.h
@@ -48,6 +48,7 @@ struct ceph_options {
 	unsigned long mount_timeout;		/* jiffies */
 	unsigned long osd_idle_ttl;		/* jiffies */
 	unsigned long osd_keepalive_timeout;	/* jiffies */
+	unsigned long osd_request_timeout;	/* jiffies */
 
 	/*
 	 * any type that can't be simply compared or doesn't need need
@@ -65,9 +66,10 @@ struct ceph_options {
 /*
  * defaults
  */
-#define CEPH_MOUNT_TIMEOUT_DEFAULT	msecs_to_jiffies(60 * 1000)
-#define CEPH_OSD_KEEPALIVE_DEFAULT	msecs_to_jiffies(5 * 1000)
-#define CEPH_OSD_IDLE_TTL_DEFAULT	msecs_to_jiffies(60 * 1000)
+#define CEPH_MOUNT_TIMEOUT_DEFAULT		msecs_to_jiffies(60 * 1000)
+#define CEPH_OSD_KEEPALIVE_DEFAULT		msecs_to_jiffies(5 * 1000)
+#define CEPH_OSD_IDLE_TTL_DEFAULT		msecs_to_jiffies(60 * 1000)
+#define CEPH_OSD_REQUEST_TIMEOUT_DEFAULT	msecs_to_jiffies(0 * 1000)
 
 #define CEPH_MONC_HUNT_INTERVAL		msecs_to_jiffies(3 * 1000)
 #define CEPH_MONC_PING_INTERVAL		msecs_to_jiffies(10 * 1000)
diff --git a/include/linux/ceph/osd_client.h b/include/linux/ceph/osd_client.h
index 03a6653..e45cba0 100644
--- a/include/linux/ceph/osd_client.h
+++ b/include/linux/ceph/osd_client.h
@@ -279,6 +279,7 @@ struct ceph_osd_client {
 	atomic_t		num_homeless;
 	struct delayed_work	timeout_work;
 	struct delayed_work	osds_timeout_work;
+	struct delayed_work	osd_request_timeout_work;
 #ifdef CONFIG_DEBUG_FS
 	struct dentry		*debugfs_file;
 #endif
diff --git a/net/ceph/ceph_common.c b/net/ceph/ceph_common.c
index 464e885..81876c8 100644
--- a/net/ceph/ceph_common.c
+++ b/net/ceph/ceph_common.c
@@ -230,6 +230,7 @@ enum {
 	Opt_osdkeepalivetimeout,
 	Opt_mount_timeout,
 	Opt_osd_idle_ttl,
+	Opt_osd_request_timeout,
 	Opt_last_int,
 	/* int args above */
 	Opt_fsid,
@@ -256,6 +257,7 @@ static match_table_t opt_tokens = {
 	{Opt_osdkeepalivetimeout, "osdkeepalive=%d"},
 	{Opt_mount_timeout, "mount_timeout=%d"},
 	{Opt_osd_idle_ttl, "osd_idle_ttl=%d"},
+	{Opt_osd_request_timeout, "osd_request_timeout=%d"},
 	/* int args above */
 	{Opt_fsid, "fsid=%s"},
 	{Opt_name, "name=%s"},
@@ -361,6 +363,7 @@ ceph_parse_options(char *options, const char *dev_name,
 	opt->osd_keepalive_timeout = CEPH_OSD_KEEPALIVE_DEFAULT;
 	opt->mount_timeout = CEPH_MOUNT_TIMEOUT_DEFAULT;
 	opt->osd_idle_ttl = CEPH_OSD_IDLE_TTL_DEFAULT;
+	opt->osd_request_timeout = CEPH_OSD_REQUEST_TIMEOUT_DEFAULT;
 
 	/* get mon ip(s) */
 	/* ip1[:port1][,ip2[:port2]...] */
@@ -473,6 +476,15 @@ ceph_parse_options(char *options, const char *dev_name,
 			}
 			opt->mount_timeout = msecs_to_jiffies(intval * 1000);
 			break;
+		case Opt_osd_request_timeout:
+			/* 0 is "wait forever" (i.e. infinite timeout) */
+			if (intval < 0 || intval > INT_MAX / 1000) {
+				pr_err("osd_request_timeout out of range\n");
+				err = -EINVAL;
+				goto out;
+			}
+			opt->osd_request_timeout = msecs_to_jiffies(intval * 1000);
+			break;
 
 		case Opt_share:
 			opt->flags &= ~CEPH_OPT_NOSHARE;
diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
index 842f049..2f5a024 100644
--- a/net/ceph/osd_client.c
+++ b/net/ceph/osd_client.c
@@ -2554,6 +2554,75 @@ static void handle_timeout(struct work_struct *work)
 			      osdc->client->options->osd_keepalive_timeout);
 }
 
+/*
+ * Complete a request to an OSD with error err if it started before cutoff.
+ */
+static void complete_osd_stuck_request(struct ceph_osd_request *req,
+				       unsigned long cutoff,
+				       int err)
+{
+	if (!time_before(req->r_stamp, cutoff))
+		return;
+
+	pr_warn_ratelimited("osd req on osd%d timeout expired\n",
+			    req->r_osd->o_osd);
+
+	complete_request(req, err);
+}
+
+/*
+ * Complete all requests to an OSD that have been active for more than timeout.
+ */
+static void complete_osd_stuck_requests(struct ceph_osd *osd,
+					unsigned long timeout,
+					int err)
+{
+	struct ceph_osd_request *req, *n;
+	const unsigned long cutoff = jiffies - timeout;
+
+	rbtree_postorder_for_each_entry_safe(req, n, &osd->o_requests, r_node) {
+		complete_osd_stuck_request(req, cutoff, err);
+	}
+}
+
+/*
+ * Timeout callback, called every N seconds. When one or more OSD
+ * requests have been active for more than N seconds, complete
+ * them with error EIO.
+ */
+static void handle_osd_request_timeout(struct work_struct *work)
+{
+	struct ceph_osd_client *osdc =
+		container_of(work, struct ceph_osd_client,
+			     osd_request_timeout_work.work);
+	struct ceph_osd *osd, *n;
+	struct ceph_options *opts = osdc->client->options;
+	const unsigned long timeout = opts->osd_request_timeout;
+
+	dout("%s osdc %p\n", __func__, osdc);
+
+	if (!timeout)
+		return;
+
+	down_write(&osdc->lock);
+
+	rbtree_postorder_for_each_entry_safe(osd, n, &osdc->osds, o_node) {
+		complete_osd_stuck_requests(osd, timeout, -EIO);
+	}
+
+	up_write(&osdc->lock);
+
+	if (!atomic_read(&osdc->num_homeless))
+		goto out;
+
+	down_write(&osdc->lock);
+	complete_osd_stuck_requests(&osdc->homeless_osd, timeout, -EIO);
+	up_write(&osdc->lock);
+
+out:
+	schedule_delayed_work(&osdc->osd_request_timeout_work, timeout);
+}
+
 static void handle_osds_timeout(struct work_struct *work)
 {
 	struct ceph_osd_client *osdc =
@@ -4091,6 +4160,7 @@ int ceph_osdc_init(struct ceph_osd_client *osdc, struct ceph_client *client)
 	osdc->linger_map_checks = RB_ROOT;
 	INIT_DELAYED_WORK(&osdc->timeout_work, handle_timeout);
 	INIT_DELAYED_WORK(&osdc->osds_timeout_work, handle_osds_timeout);
+	INIT_DELAYED_WORK(&osdc->osd_request_timeout_work, handle_osd_request_timeout);
 
 	err = -ENOMEM;
 	osdc->osdmap = ceph_osdmap_alloc();
@@ -4120,6 +4190,8 @@ int ceph_osdc_init(struct ceph_osd_client *osdc, struct ceph_client *client)
 			      osdc->client->options->osd_keepalive_timeout);
 	schedule_delayed_work(&osdc->osds_timeout_work,
 	    round_jiffies_relative(osdc->client->options->osd_idle_ttl));
+	schedule_delayed_work(&osdc->osd_request_timeout_work,
+	    round_jiffies_relative(osdc->client->options->osd_request_timeout));
 
 	return 0;
 