From patchwork Wed Feb 3 21:37:17 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Konstantin Khlebnikov X-Patchwork-Id: 8209261 Return-Path: X-Original-To: patchwork-linux-fsdevel@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.136]) by patchwork2.web.kernel.org (Postfix) with ESMTP id 10207BEEE5 for ; Wed, 3 Feb 2016 21:37:38 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 1E56A20270 for ; Wed, 3 Feb 2016 21:37:37 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 0A9FD20260 for ; Wed, 3 Feb 2016 21:37:36 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756165AbcBCVhX (ORCPT ); Wed, 3 Feb 2016 16:37:23 -0500 Received: from mail-lf0-f53.google.com ([209.85.215.53]:36609 "EHLO mail-lf0-f53.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1756160AbcBCVhT (ORCPT ); Wed, 3 Feb 2016 16:37:19 -0500 Received: by mail-lf0-f53.google.com with SMTP id 78so23464345lfy.3; Wed, 03 Feb 2016 13:37:18 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=1VD9qJnAIqvIncSnBjU/+dZEblG8jGOHxZaTfCMClN8=; b=uy3WAv65eD7WDqoEg9/gYjip8qyzqjaciLYFwZYoaiE47WuW5MCX5oAe0kjlWQJZWE C8wD9i9NOpAmFeSoNGommf83zLIHuQpVkUqAm9KmTaDkEGl06K785Idj4XVCGXjSoHLg ox5Y/f36xehzEa6EHYEMd3eKCwZXSqbrWt4e/XprI0DUAsdIjIsd2gniqEKUJpAnwKkO OL3fTgZXWulqtp7+G6KfYWG2ePWJ0rigtPgav1ttIxX//ODiD7Xg1ZWzFdzL53vpLUwg ig0jkFt1pdUDxvA6c9RfQFCzb0Ip0RompXqd+f+VZX9BXvOhVLRTkX41v6LG0gRs9a0S 70nw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type; bh=1VD9qJnAIqvIncSnBjU/+dZEblG8jGOHxZaTfCMClN8=; b=IK8WBFtDvzYHP0O3xs5+ryN6C/oTV+Yk1Cu9ifs/0qaj4H6c06ssWJXO957kflfa50 6TW/YX0vpzOWX/VEaLUamzXrS9UO0mzXrlJymeAk8ioLMXEJuvD2Ho8A1x8OO5pgYg7L Q1s58V964KoHfLPlewBb9wGUCwGVfh/tcYdbjPXgKMHKB0UPPjn8QPxCRq+GRLRP6Uwv BLcmKJJ4R9vPjzk1L0+ZutzWxTByYJYm/dWerBqqZkDjMFhl6MZsBi4FGg+fbs/1Cg23 lcZQtjql9oDLr9tJTfTX8OBImcQmUe18yvQ0nD3YShyECegM37YFDng+ZFn9TjKF2Q98 b0/w== X-Gm-Message-State: AG10YOSuSPyx/z7qePMyTTb6WhvGuMDL0HW/23ETsq/LKCB+H1vELARGysSvBLxZvYpinMyWpYevQDd9CLfwVg== MIME-Version: 1.0 X-Received: by 10.25.166.130 with SMTP id p124mr1951958lfe.120.1454535437906; Wed, 03 Feb 2016 13:37:17 -0800 (PST) Received: by 10.112.181.234 with HTTP; Wed, 3 Feb 2016 13:37:17 -0800 (PST) In-Reply-To: <1453929472-25566-2-git-send-email-matthew.r.wilcox@intel.com> References: <1453929472-25566-1-git-send-email-matthew.r.wilcox@intel.com> <1453929472-25566-2-git-send-email-matthew.r.wilcox@intel.com> Date: Thu, 4 Feb 2016 00:37:17 +0300 Message-ID: Subject: Re: [PATCH 1/5] radix-tree: Fix race in gang lookup From: Konstantin Khlebnikov To: Matthew Wilcox Cc: Andrew Morton , Hugh Dickins , Ohad Ben-Cohen , Matthew Wilcox , Linux Kernel Mailing List , linux-fsdevel , "linux-mm@kvack.org" , Stable Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org X-Spam-Status: No, score=-7.2 required=5.0 tests=BAYES_00, DKIM_ADSP_CUSTOM_MED, DKIM_SIGNED, FREEMAIL_FROM, RCVD_IN_DNSWL_HI, RP_MATCHES_RCVD, T_DKIM_INVALID, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP On Thu, Jan 28, 2016 at 12:17 AM, Matthew Wilcox wrote: > From: Matthew Wilcox > > If the indirect_ptr bit is set on a slot, that indicates we need to > redo the lookup. Introduce a new function radix_tree_iter_retry() > which forces the loop to retry the lookup by setting 'slot' to NULL and > turning the iterator back to point at the problematic entry. > > This is a pretty rare problem to hit at the moment; the lookup has to > race with a grow of the radix tree from a height of 0. The consequences > of hitting this race are that gang lookup could return a pointer to a > radix_tree_node instead of a pointer to whatever the user had inserted > in the tree. > > Fixes: cebbd29e1c2f ("radix-tree: rewrite gang lookup using iterator") > Signed-off-by: Matthew Wilcox > Cc: stable@vger.kernel.org > --- > include/linux/radix-tree.h | 16 ++++++++++++++++ > lib/radix-tree.c | 12 ++++++++++-- > 2 files changed, 26 insertions(+), 2 deletions(-) > > diff --git a/include/linux/radix-tree.h b/include/linux/radix-tree.h > index f9a3da5bf892..db0ed595749b 100644 > --- a/include/linux/radix-tree.h > +++ b/include/linux/radix-tree.h > @@ -387,6 +387,22 @@ void **radix_tree_next_chunk(struct radix_tree_root *root, > struct radix_tree_iter *iter, unsigned flags); > > /** > + * radix_tree_iter_retry - retry this chunk of the iteration > + * @iter: iterator state > + * > + * If we iterate over a tree protected only by the RCU lock, a race > + * against deletion or creation may result in seeing a slot for which > + * radix_tree_deref_retry() returns true. If so, call this function > + * and continue the iteration. > + */ > +static inline __must_check > +void **radix_tree_iter_retry(struct radix_tree_iter *iter) > +{ > + iter->next_index = iter->index; > + return NULL; > +} > + > +/** > * radix_tree_chunk_size - get current chunk size > * > * @iter: pointer to radix tree iterator > diff --git a/lib/radix-tree.c b/lib/radix-tree.c > index a25f635dcc56..65422ac17114 100644 > --- a/lib/radix-tree.c > +++ b/lib/radix-tree.c > @@ -1105,9 +1105,13 @@ radix_tree_gang_lookup(struct radix_tree_root *root, void **results, > return 0; > > radix_tree_for_each_slot(slot, root, &iter, first_index) { > - results[ret] = indirect_to_ptr(rcu_dereference_raw(*slot)); > + results[ret] = rcu_dereference_raw(*slot); > if (!results[ret]) > continue; > + if (radix_tree_is_indirect_ptr(results[ret])) { > + slot = radix_tree_iter_retry(&iter); > + continue; > + } > if (++ret == max_items) > break; > } Looks like your fix doesn't work. After radix_tree_iter_retry: radix_tree_for_each_slot will call radix_tree_next_slot which isn't safe to call for NULL slot. #define radix_tree_for_each_slot(slot, root, iter, start) \ for (slot = radix_tree_iter_init(iter, start) ; \ slot || (slot = radix_tree_next_chunk(root, iter, 0)) ; \ slot = radix_tree_next_slot(slot, iter, 0)) tagged iterator works becase restart happens only at root - tags filled with single bit. quick (untested) fix for that if (likely(*slot)) > @@ -1184,9 +1188,13 @@ radix_tree_gang_lookup_tag(struct radix_tree_root *root, void **results, > return 0; > > radix_tree_for_each_tagged(slot, root, &iter, first_index, tag) { > - results[ret] = indirect_to_ptr(rcu_dereference_raw(*slot)); > + results[ret] = rcu_dereference_raw(*slot); > if (!results[ret]) > continue; > + if (radix_tree_is_indirect_ptr(results[ret])) { > + slot = radix_tree_iter_retry(&iter); > + continue; > + } > if (++ret == max_items) > break; > } > -- > 2.7.0.rc3 > > -- > To unsubscribe, send a message with 'unsubscribe linux-mm' in > the body to majordomo@kvack.org. For more info on Linux MM, > see: http://www.linux-mm.org/ . > Don't email: email@kvack.org --- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html --- a/include/linux/radix-tree.h +++ b/include/linux/radix-tree.h @@ -457,9 +457,9 @@ radix_tree_next_slot(void **slot, struct radix_tree_iter *iter, unsigned flags) return slot + offset + 1; } } else { - unsigned size = radix_tree_chunk_size(iter) - 1; + int size = radix_tree_chunk_size(iter) - 1; - while (size--) { + while (size-- > 0) { slot++; iter->index++;