From patchwork Tue Jan 21 11:11:07 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ilya Dryomov
X-Patchwork-Id: 3517351
From: Ilya Dryomov
To: ceph-devel@vger.kernel.org
Cc: ilya.dryomov@inktank.com
Subject: [PATCH v2 3/3] libceph: handle dead tcp connections during connection negotiation
Date: Tue, 21 Jan 2014 13:11:07 +0200
Message-Id: <1390302667-8729-4-git-send-email-ilya.dryomov@inktank.com>
X-Mailer: git-send-email 1.7.10.4
In-Reply-To: <1390302667-8729-1-git-send-email-ilya.dryomov@inktank.com>
References: <1390302667-8729-1-git-send-email-ilya.dryomov@inktank.com>

The keepalive mechanism we currently use doesn't handle dead (e.g.
half-open in the RFC 793 sense) TCP connections: a) it's based on
pending ceph_osd_requests, which are not necessarily present, and b) the
keepalive byte is only sent if the connection is in CON_STATE_OPEN,
because of protocol restrictions.  This leaves the connection handshake
and negotiation stages uncovered and can lead to the kernel client
hanging forever.  Fix it by forcibly resetting the connection if we
still haven't transitioned to CON_STATE_OPEN after two mount_timeouts.

Fixes: http://tracker.ceph.com/issues/7139

Signed-off-by: Ilya Dryomov
---
 include/linux/ceph/messenger.h |  5 +++++
 net/ceph/ceph_common.c         |  1 +
 net/ceph/messenger.c           | 42 ++++++++++++++++++++++++++++++++++++++--
 3 files changed, 46 insertions(+), 2 deletions(-)

diff --git a/include/linux/ceph/messenger.h b/include/linux/ceph/messenger.h
index 20ee8b63a968..25d3ec8bcdde 100644
--- a/include/linux/ceph/messenger.h
+++ b/include/linux/ceph/messenger.h
@@ -62,6 +62,8 @@ struct ceph_messenger {
 
 	u64 supported_features;
 	u64 required_features;
+
+	unsigned long connect_timeout;	/* jiffies */
 };
 
 enum ceph_msg_data_type {
@@ -240,6 +242,8 @@ struct ceph_connection {
 
 	struct delayed_work work;	/* send|recv work */
 	unsigned long delay;		/* current delay interval */
+
+	struct delayed_work connect_timeout_work;
 };
 
 
@@ -257,6 +261,7 @@ extern void ceph_messenger_init(struct ceph_messenger *msgr,
 			struct ceph_entity_addr *myaddr,
 			u64 supported_features,
 			u64 required_features,
+			unsigned long connect_timeout,
 			bool nocrc);
 
 extern void ceph_con_init(struct ceph_connection *con, void *private,
diff --git a/net/ceph/ceph_common.c b/net/ceph/ceph_common.c
index 67d7721d237e..bc41e5a392a4 100644
--- a/net/ceph/ceph_common.c
+++ b/net/ceph/ceph_common.c
@@ -511,6 +511,7 @@ struct ceph_client *ceph_create_client(struct ceph_options *opt, void *private,
 	ceph_messenger_init(&client->msgr, myaddr,
 		client->supported_features,
 		client->required_features,
+		2 * client->options->mount_timeout * HZ,
 		ceph_test_opt(client, NOCRC));
 
 	/* subsystems */
diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index c7e0143d24f1..65dea6c8c6f8 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -385,6 +385,16 @@ static void con_fault_raise(struct ceph_connection *con)
 	queue_con(con);
 }
 
+static void handle_connect_timeout(struct work_struct *work)
+{
+	struct ceph_connection *con = container_of(work,
+						   struct ceph_connection,
+						   connect_timeout_work.work);
+
+	dout("%s connect timeout\n", __func__);
+	con_fault_raise(con);
+}
+
 
 /*
  * socket callback functions
@@ -447,6 +457,18 @@ static void ceph_sock_state_change(struct sock *sk)
 	case TCP_ESTABLISHED:
 		dout("%s TCP_ESTABLISHED\n", __func__);
 		con_sock_state_connected(con);
+		/*
+		 * If not told otherwise, arm connect_timeout hammer to
+		 * be able to deal with half-open (in RFC 793 sense)
+		 * sockets in the interim before we transition to
+		 * CON_STATE_OPEN and regular request keepalive
+		 * mechanism (ceph_con_keepalive()) kicks in.
+		 */
+		if (con->msgr->connect_timeout) {
+			dout("arming connect_timeout hammer\n");
+			schedule_delayed_work(&con->connect_timeout_work,
+					      con->msgr->connect_timeout);
+		}
 		queue_con(con);
 		break;
 	default:	/* Everything else is uninteresting */
@@ -586,6 +608,13 @@ static int con_close_socket(struct ceph_connection *con)
 	int rc = 0;
 
 	dout("con_close_socket on %p sock %p\n", con, con->sock);
+
+	/*
+	 * Sync with con_fault_raise() in handle_connect_timeout() to
+	 * make sure that SOCK_CLOSED is going to be cleared for good.
+	 */
+	cancel_delayed_work_sync(&con->connect_timeout_work);
+
 	if (con->sock) {
 		rc = con->sock->ops->shutdown(con->sock, SHUT_RDWR);
 		sock_release(con->sock);
@@ -595,8 +624,8 @@ static int con_close_socket(struct ceph_connection *con)
 	/*
 	 * Forcibly clear the SOCK_CLOSED flag.  It gets set
 	 * independent of the connection mutex, and we could have
-	 * received a socket close event before we had the chance to
-	 * shut the socket down.
+	 * received a socket close and/or connect_timeout event
+	 * before we had the chance to shut the socket down.
 	 */
 	con_flag_clear(con, CON_FLAG_SOCK_CLOSED);
 
@@ -725,6 +754,7 @@ void ceph_con_init(struct ceph_connection *con, void *private,
 	INIT_LIST_HEAD(&con->out_queue);
 	INIT_LIST_HEAD(&con->out_sent);
 	INIT_DELAYED_WORK(&con->work, con_work);
+	INIT_DELAYED_WORK(&con->connect_timeout_work, handle_connect_timeout);
 
 	con->state = CON_STATE_CLOSED;
 }
@@ -2076,6 +2106,7 @@ static int process_connect(struct ceph_connection *con)
 
 		WARN_ON(con->state != CON_STATE_NEGOTIATING);
 		con->state = CON_STATE_OPEN;
+
 		con->auth_retry = 0;    /* we authenticated; clear flag */
 		con->peer_global_seq = le32_to_cpu(con->in_reply.global_seq);
 		con->connect_seq++;
@@ -2092,6 +2123,11 @@ static int process_connect(struct ceph_connection *con)
 
 		con->delay = 0;      /* reset backoff memory */
 
+		if (con->msgr->connect_timeout) {
+			dout("disarming connect_timeout hammer\n");
+			cancel_delayed_work(&con->connect_timeout_work);
+		}
+
 		if (con->in_reply.tag == CEPH_MSGR_TAG_SEQ) {
 			prepare_write_seq(con);
 			prepare_read_seq(con);
@@ -2866,10 +2902,12 @@ void ceph_messenger_init(struct ceph_messenger *msgr,
 			struct ceph_entity_addr *myaddr,
 			u64 supported_features,
 			u64 required_features,
+			unsigned long connect_timeout,
 			bool nocrc)
 {
 	msgr->supported_features = supported_features;
 	msgr->required_features = required_features;
+	msgr->connect_timeout = connect_timeout;
 
 	spin_lock_init(&msgr->global_seq_lock);
 
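
Note for reviewers: the arm/disarm life cycle of the connect_timeout work can be modelled in user space roughly as below. This is only a sketch under simplifying assumptions, not the kernel code: every model_* name is hypothetical, time is a plain tick counter instead of jiffies, and setting MODEL_FAULTED stands in for con_fault_raise().

```c
#include <assert.h>
#include <stdbool.h>

/*
 * User-space model of the connect_timeout life cycle. Every name below is
 * hypothetical; none of it is kernel API. Time is a plain tick counter.
 */
enum model_state { MODEL_HANDSHAKE, MODEL_OPEN, MODEL_FAULTED };

struct model_con {
	enum model_state state;
	unsigned long connect_timeout;	/* ticks; 0 means "never arm" */
	unsigned long deadline;		/* tick at which the timer fires */
	bool armed;
};

/* Mirrors ceph_create_client(): the timeout is two mount_timeouts. */
void model_init(struct model_con *con, unsigned long mount_timeout)
{
	con->state = MODEL_HANDSHAKE;
	con->connect_timeout = 2 * mount_timeout;
	con->deadline = 0;
	con->armed = false;
}

/* TCP_ESTABLISHED: arm the timer unless it is disabled. */
void model_tcp_established(struct model_con *con, unsigned long now)
{
	if (con->connect_timeout) {
		con->deadline = now + con->connect_timeout;
		con->armed = true;
	}
}

/* Negotiation finished: move to OPEN and disarm the timer. */
void model_negotiated(struct model_con *con)
{
	assert(con->state == MODEL_HANDSHAKE);
	con->state = MODEL_OPEN;
	con->armed = false;
}

/* Clock tick: an armed, expired timer faults the connection. */
void model_tick(struct model_con *con, unsigned long now)
{
	if (con->armed && now >= con->deadline) {
		con->armed = false;
		con->state = MODEL_FAULTED;	/* stands in for con_fault_raise() */
	}
}
```

With a mount_timeout of 60 ticks, a connection that is established at tick 0 and never finishes negotiating is faulted at tick 120. One that reaches the open state first is left alone, and a mount_timeout of 0 never arms the timer.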