From patchwork Sun Oct 11 18:05:47 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ilya Dryomov X-Patchwork-Id: 7369201 Return-Path: X-Original-To: patchwork-ceph-devel@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.136]) by patchwork2.web.kernel.org (Postfix) with ESMTP id 62394BEEA4 for ; Sun, 11 Oct 2015 16:26:01 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 60E202047D for ; Sun, 11 Oct 2015 16:26:00 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 3F74D2047C for ; Sun, 11 Oct 2015 16:25:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751558AbbJKQZ5 (ORCPT ); Sun, 11 Oct 2015 12:25:57 -0400 Received: from mail-wi0-f178.google.com ([209.85.212.178]:35557 "EHLO mail-wi0-f178.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751261AbbJKQZ4 (ORCPT ); Sun, 11 Oct 2015 12:25:56 -0400 Received: by wicge5 with SMTP id ge5so123716344wic.0 for ; Sun, 11 Oct 2015 09:25:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=from:to:cc:subject:date:message-id; bh=2kju8hZSi6hLadoHdF0Za9NSA2TVUVREK53tHCRZSbM=; b=LLmjS0VZouUF9/pLzfAf9zD6Zk66wNixTNkuFRc5VjLX5RxyveKiC0VS9rjcedCGsC F2AfTIBLqeCG1UcVQocI1k8L7YbRoAov+xLpi4ZvD3lVAq6u613enpMUO6av+ugK7aRM 3eBYRx/bIyOWcP+/D7R7c0U/p3tRpYLTJAYz0Le6kTInwwwEF+NjdcvWpr4B7GXTcv0u s/yZgcxLPuQhFM4Zu41WfV59kTAjtvK0wjpm6J2GHA5tAUw0CV8fPI5voZ1JAToj1MhW Tpy0gZ3jVRCZwUrVDBNaDZify5KIjN6gXCGGz48x7LSzCeFUAjmHitT/8ZZhNW6bYh// /a9Q== X-Received: by 10.194.112.104 with SMTP id ip8mr30120996wjb.85.1444580755263; Sun, 11 Oct 2015 09:25:55 -0700 (PDT) Received: from orange.localdomain (nat-pool-brq-t.redhat.com. [213.175.37.10]) by smtp.gmail.com with ESMTPSA id it4sm14634886wjb.0.2015.10.11.09.25.54 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sun, 11 Oct 2015 09:25:54 -0700 (PDT) From: Ilya Dryomov To: ceph-devel@vger.kernel.org Cc: Xudong Cao Subject: [PATCH] rbd: prevent kernel stack blow up on rbd map Date: Sun, 11 Oct 2015 20:05:47 +0200 Message-Id: <1444586747-24268-1-git-send-email-idryomov@gmail.com> X-Mailer: git-send-email 2.4.3 Sender: ceph-devel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org X-Spam-Status: No, score=-6.8 required=5.0 tests=BAYES_00, DKIM_ADSP_CUSTOM_MED, DKIM_SIGNED, FREEMAIL_FROM, RCVD_IN_DNSWL_HI, T_DKIM_INVALID, T_RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=ham version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Mapping an image with a long parent chain (e.g. image foo, whose parent is bar, whose parent is baz, etc) currently leads to a kernel stack overflow, due to the following recursion in the reply path: rbd_osd_req_callback() rbd_obj_request_complete() rbd_img_obj_callback() rbd_img_parent_read_callback() rbd_obj_request_complete() ... Limit the parent chain to 8 images, which is ~3K worth of stack. It's probably a good thing to do regardless - performance with more than a few images long parent chain is likely to be pretty bad. Signed-off-by: Ilya Dryomov Reviewed-by: Josh Durgin --- drivers/block/rbd.c | 33 +++++++++++++++++++++++---------- 1 file changed, 23 insertions(+), 10 deletions(-) diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c index ccbc3cbbf24e..07f666f4ca18 100644 --- a/drivers/block/rbd.c +++ b/drivers/block/rbd.c @@ -96,6 +96,8 @@ static int atomic_dec_return_safe(atomic_t *v) #define RBD_MINORS_PER_MAJOR 256 #define RBD_SINGLE_MAJOR_PART_SHIFT 4 +#define RBD_MAX_PARENT_CHAIN_LEN 8 + #define RBD_SNAP_DEV_NAME_PREFIX "snap_" #define RBD_MAX_SNAP_NAME_LEN \ (NAME_MAX - (sizeof (RBD_SNAP_DEV_NAME_PREFIX) - 1)) @@ -426,7 +428,7 @@ static ssize_t rbd_add_single_major(struct bus_type *bus, const char *buf, size_t count); static ssize_t rbd_remove_single_major(struct bus_type *bus, const char *buf, size_t count); -static int rbd_dev_image_probe(struct rbd_device *rbd_dev, bool mapping); +static int rbd_dev_image_probe(struct rbd_device *rbd_dev, int depth); static void rbd_spec_put(struct rbd_spec *spec); static int rbd_dev_id_to_minor(int dev_id) @@ -5131,7 +5133,12 @@ out_err: return ret; } -static int rbd_dev_probe_parent(struct rbd_device *rbd_dev) +/* + * @depth is rbd_dev_image_probe() -> rbd_dev_probe_parent() -> + * rbd_dev_image_probe() recursion depth, which means it's also the + * length of the already discovered part of the parent chain. + */ +static int rbd_dev_probe_parent(struct rbd_device *rbd_dev, int depth) { struct rbd_device *parent = NULL; int ret; @@ -5139,6 +5146,12 @@ static int rbd_dev_probe_parent(struct rbd_device *rbd_dev) if (!rbd_dev->parent_spec) return 0; + if (++depth > RBD_MAX_PARENT_CHAIN_LEN) { + pr_info("parent chain is too long (%d)\n", depth); + ret = -EINVAL; + goto out_err; + } + parent = rbd_dev_create(rbd_dev->rbd_client, rbd_dev->parent_spec, NULL); if (!parent) { @@ -5153,7 +5166,7 @@ static int rbd_dev_probe_parent(struct rbd_device *rbd_dev) __rbd_get_client(rbd_dev->rbd_client); rbd_spec_get(rbd_dev->parent_spec); - ret = rbd_dev_image_probe(parent, false); + ret = rbd_dev_image_probe(parent, depth); if (ret < 0) goto out_err; @@ -5282,7 +5295,7 @@ static void rbd_dev_image_release(struct rbd_device *rbd_dev) * parent), initiate a watch on its header object before using that * object to get detailed information about the rbd image. */ -static int rbd_dev_image_probe(struct rbd_device *rbd_dev, bool mapping) +static int rbd_dev_image_probe(struct rbd_device *rbd_dev, int depth) { int ret; @@ -5300,7 +5313,7 @@ static int rbd_dev_image_probe(struct rbd_device *rbd_dev, bool mapping) if (ret) goto err_out_format; - if (mapping) { + if (!depth) { ret = rbd_dev_header_watch_sync(rbd_dev); if (ret) { if (ret == -ENOENT) @@ -5321,7 +5334,7 @@ static int rbd_dev_image_probe(struct rbd_device *rbd_dev, bool mapping) * Otherwise this is a parent image, identified by pool, image * and snap ids - need to fill in names for those ids. */ - if (mapping) + if (!depth) ret = rbd_spec_fill_snap_id(rbd_dev); else ret = rbd_spec_fill_names(rbd_dev); @@ -5343,12 +5356,12 @@ static int rbd_dev_image_probe(struct rbd_device *rbd_dev, bool mapping) * Need to warn users if this image is the one being * mapped and has a parent. */ - if (mapping && rbd_dev->parent_spec) + if (!depth && rbd_dev->parent_spec) rbd_warn(rbd_dev, "WARNING: kernel layering is EXPERIMENTAL!"); } - ret = rbd_dev_probe_parent(rbd_dev); + ret = rbd_dev_probe_parent(rbd_dev, depth); if (ret) goto err_out_probe; @@ -5359,7 +5372,7 @@ static int rbd_dev_image_probe(struct rbd_device *rbd_dev, bool mapping) err_out_probe: rbd_dev_unprobe(rbd_dev); err_out_watch: - if (mapping) + if (!depth) rbd_dev_header_unwatch_sync(rbd_dev); out_header_name: kfree(rbd_dev->header_name); @@ -5422,7 +5435,7 @@ static ssize_t do_rbd_add(struct bus_type *bus, spec = NULL; /* rbd_dev now owns this */ rbd_opts = NULL; /* rbd_dev now owns this */ - rc = rbd_dev_image_probe(rbd_dev, true); + rc = rbd_dev_image_probe(rbd_dev, 0); if (rc < 0) goto err_out_rbd_dev;