From patchwork Sat Jan 5 13:56:02 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dongsu Park
X-Patchwork-Id: 1936431
From: Dongsu Park
To: linux-rdma@vger.kernel.org
Cc: Dongsu Park, Sebastian Riemer, Bart Van Assche, David Dillow,
 Roland Dreier, Sean Hefty, Hal Rosenstock
Subject: [PATCH] IB/srp: disconnect to SRP target before removing SCSI host
Date: Sat, 5 Jan 2013 14:56:02 +0100
Message-Id: <1357394162-26316-1-git-send-email-dongsu.park@profitbricks.com>
X-Mailer: git-send-email 1.7.10.4
Sender: linux-rdma-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-rdma@vger.kernel.org

Removing an SRP target is problematic when the SRP target machine has
crashed without delivering any IB events. In that case, the admin cannot
tear down the SRP target and its SCSI host by deleting the remote port.
There were two causes: one was waiting for a completion on target->done,
the other was the invocation order of srp_disconnect_target() and
scsi_remove_host(). As a consequence of the latter, device_del() hangs
until the target machine comes back up after rebooting.

The symptom is easy to reproduce via sysfs. First, trigger an immediate
reboot via /proc/sysrq-trigger on the SRP target:

  target# echo b > /proc/sysrq-trigger

After that, the InfiniBand connection will be completely gone for at
least several minutes. Then, on the initiator side, delete a remote port
by writing 1 to /sys/class/srp_remote_ports/port-*\:1/delete, e.g.:

  initiator# echo 1 > /sys/class/srp_remote_ports/port-6\:1/delete

where 6 is the number of the SRP remote port to be deleted. Stale SCSI
targets then remain despite the rport deletion, which is not the expected
behavior. The cause is device_del() hanging forever while destroying the
SCSI LLD.

The solution consists of two modifications. The first one has already
been committed to jejb/for-next; see commit 55d93898 "IB/srp: send
disconnect request without waiting for CM timewait exit" by Vu Pham.
It avoids waiting for the completion.

The second one, which this commit implements, changes the invocation
order in srp_remove_target(): call srp_disconnect_target() before
scsi_remove_host(). This change prevents device_del() from hanging
indefinitely.

This patch is based on the srp-ha-v3.7 tree by Bart Van Assche.
See also . If necessary, I could rebase it on the stable tree.

Signed-off-by: Dongsu Park
Cc: Sebastian Riemer
Cc: Bart Van Assche
Cc: David Dillow
Cc: Roland Dreier
Cc: Sean Hefty
Cc: Hal Rosenstock
---
 drivers/infiniband/ulp/srp/ib_srp.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 307430e..ca4bf40 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -553,10 +553,11 @@ static void srp_remove_target(struct srp_target_port *target)
 	if (scsi_host_added) {
 		srp_del_scsi_host_attr(shost);
 		srp_remove_host(shost);
+		srp_disconnect_target(target);
 		scsi_remove_host(shost);
-	}
+	} else
+		srp_disconnect_target(target);
 
-	srp_disconnect_target(target);
 	ib_destroy_cm_id(target->cm_id);
 	cancel_work_sync(&target->tl_err_work);
 	srp_free_target_ib(target);
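
For readers who want to see the ordering argument in isolation, here is a
toy userspace model in C. It is only a sketch and not kernel code: struct
toy_target and all toy_* functions are hypothetical names invented for this
illustration. The model simply assumes what the description above reports,
namely that host removal cannot finish while the connection is still marked
as up and the peer is unreachable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the state of one SRP target port. */
struct toy_target {
	bool connected;    /* initiator still considers the connection up */
	bool peer_alive;   /* remote target machine still responds */
	bool host_removed; /* SCSI host teardown has finished */
};

/* Models srp_disconnect_target(): mark the connection down locally. */
static void toy_disconnect(struct toy_target *t)
{
	assert(t != NULL);
	t->connected = false;
}

/*
 * Models scsi_remove_host(). Assumption of this sketch: removal would
 * block while the connection is marked up but the peer is gone. Instead
 * of blocking, the model returns false to signal "would hang".
 */
static bool toy_remove_host(struct toy_target *t)
{
	assert(t != NULL);
	if (t->connected && !t->peer_alive)
		return false;
	t->host_removed = true;
	return true;
}

/* Order before the patch: remove the host first, then disconnect. */
static bool toy_remove_old_order(struct toy_target *t)
{
	bool ok = toy_remove_host(t);

	toy_disconnect(t);
	return ok;
}

/* Order after the patch: disconnect first, then remove the host. */
static bool toy_remove_new_order(struct toy_target *t)
{
	toy_disconnect(t);
	return toy_remove_host(t);
}
```

With a target initialized as connected and with a dead peer,
toy_remove_old_order() reports the would-hang case, while
toy_remove_new_order() completes the removal. The real driver additionally
handles the case where no SCSI host was added, as the else branch in the
diff shows.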