From patchwork Mon Mar 23 06:00:36 2009
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: NeilBrown
X-Patchwork-Id: 13643
X-Patchwork-Delegate: agk@redhat.com
From: Neil Brown
To: dm-devel@redhat.com
Date: Mon, 23 Mar 2009 17:00:36 +1100
Message-ID: <18887.9605.64321.547734@notabene.brown>
List-Id: device-mapper development
Sender: dm-devel-bounces@redhat.com

Hi,

A customer recently reported an Oops in dm_table_any_congested (in a
2.6.16 based kernel) that was due to dd->bdev being NULL, so
bdev_get_queue dereferenced that NULL and caused the oops.

The only credible explanation for this that we can find is that
upgrade_mode sets bdev to NULL temporarily, and does not have any
locking to exclude anything from seeing that NULL.

The code in current mainline is exactly the same, so if we are correct
in our assessment, then the bug is still present.

The Oops has only occurred once and cannot be reproduced, so we cannot
be certain that this is the cause. However, if it really is a bug, and
there is not something else which causes mutual exclusion of these two
routines, then it should probably be fixed.

Our current patch is below. It is a bit ugly, and a better fix might
be a more thorough rewrite of the code. However, I offer it in case it
is useful.

Thanks,
NeilBrown

Signed-off-by: NeilBrown

---
 drivers/md/dm-table.c |   14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

Index: linux-2.6.16-SLES10_SP2_BRANCH/drivers/md/dm-table.c
===================================================================
--- linux-2.6.16-SLES10_SP2_BRANCH.orig/drivers/md/dm-table.c	2009-03-20 11:03:14.000000000 +0530
+++ linux-2.6.16-SLES10_SP2_BRANCH/drivers/md/dm-table.c	2009-03-20 11:22:07.000000000 +0530
@@ -414,14 +414,14 @@ static int upgrade_mode(struct dm_dev *d
 
 	dd_copy = *dd;
 
-	dd->mode |= new_mode;
-	dd->bdev = NULL;
-	r = open_dev(dd, dev);
-	if (!r)
-		close_dev(&dd_copy);
-	else
+	dd_copy.mode |= new_mode;
+	dd_copy.bdev = NULL;
+	r = open_dev(&dd_copy, dev);
+	if (!r) {
+		struct dm_dev dd_copy2 = *dd;
 		*dd = dd_copy;
-
+		close_dev(&dd_copy2);
+	}
 	return r;
 }