From patchwork Tue Nov 4 15:58:48 2014
X-Patchwork-Submitter: Chris Mason
X-Patchwork-Id: 5228941
Date: Tue, 4 Nov 2014 10:58:48 -0500
From: Chris Mason
Subject: Re: Kernel crash during "btrfs device delete" on raid6 volume
To: Erik Berg
CC: , Mark Fasheh
Message-ID: <1415116728.25930.1@mail.thefacebook.com>
In-Reply-To: <1415112914.25930.0@mail.thefacebook.com>
References: <1415112914.25930.0@mail.thefacebook.com>
X-Mailing-List: linux-btrfs@vger.kernel.org

On Tue, Nov 4, 2014 at 9:55 AM, Chris Mason wrote:
> On Tue, Nov 4, 2014 at 9:36 AM, Erik Berg
> wrote:
>> Pulled the latest btrfs-progs from kdave (v3.17-12-gcafacda) and
>> using the latest linux release candidate (3.18.0-031800rc3-generic)
>> from canonical/ubuntu
>>
>> Trying to remove device sdb1, the kernel crashes
>> after a minute or so.
>>
>> [ 597.576827] ------------[ cut here ]------------
>> [ 597.617519] kernel BUG at /home/apw/COD/linux/mm/slub.c:3334!
>> [ 597.668145] invalid opcode: 0000 [#1] SMP
>> [ 597.704410] Modules linked in: arc4 md4 ipt_MASQUERADE
>> nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat
>> nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack
>> ipt_REJECT nf_reject_ipv4 xt_CHECKSUM iptable_mangle xt_tcpudp
>> bridge stp llc ip6table_filter ip6_tables iptable_filter ip_tables
>> ebtable_nat ebtables x_tables gpio_ich intel_rapl
>> x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm
>> crct10dif_pclmul crc32_pclmul ghash_clmulni_intel cryptd serio_raw
>> hpilo hpwdt 8250_fintek acpi_power_meter ie31200_edac lpc_ich
>> edac_core ipmi_si ipmi_msghandler mac_hid lp parport nls_utf8 cifs
>> fscache hid_generic usbhid hid btrfs xor raid6_pq uas usb_storage
>> tg3 ptp ahci psmouse libahci pps_core hpsa
>> [ 598.268179] CPU: 1 PID: 129 Comm: kworker/u128:3 Not tainted
>> 3.18.0-031800rc3-generic #201411022335
>> [ 598.349925] Hardware name: HP ProLiant MicroServer Gen8, BIOS J06
>> 11/09/2013
>> [ 598.413231] Workqueue: writeback bdi_writeback_workfn
>> (flush-btrfs-2)
>> [ 598.471103] task: ffff8803f16a3c00 ti: ffff880036b70000 task.ti:
>> ffff880036b70000
>> [ 598.538393] RIP: 0010:[] []
>> kfree+0x16d/0x170
>> [ 598.606217] RSP: 0018:ffff880036b73528 EFLAGS: 00010246
>> [ 598.653844] RAX: 01ffff0000000000 RBX: ffff880036b735c8 RCX:
>> 0000000000000000
>> [ 598.717899] RDX: ffff8803743a6010 RSI: dead000000100100 RDI:
>> ffff880036b735c8
>> [ 598.781662] RBP: ffff880036b73558 R08: 0000000000000000 R09:
>> ffffea0000dadcc0
>> [ 598.846028] R10: 0000000000000001 R11: 0000000000000010 R12:
>> ffff8803f1e09800
>> [ 598.910713] R13: ffff8803ac757d40 R14: ffffffffc04fed0c R15:
>> ffff880036b735d8
>> [ 598.975333] FS: 0000000000000000(0000) GS:ffff88040b420000(0000)
>> knlGS:0000000000000000
>> [ 599.048512] CS: 0010 DS: 0000 ES: 0000 CR0:
>> 0000000080050033
>> [ 599.100167] CR2: 00007fa9a3854024 CR3: 0000000001c16000 CR4:
>> 00000000001407e0
>> [ 599.165150] Stack:
>> [ 599.183305] ffff8803f1e09800 00000dad07c20000 ffff8803f1e09800
>> ffff8803ac757d40
>> [ 599.249603] ffff8803ac757d40 ffff880036b735d8 ffff880036b73618
>> ffffffffc04fed0c
>> [ 599.316306] ffff8803f1b86b00 ffff880374338000 00000dad07dc0000
>> ffff880036b73638
>> [ 599.383404] Call Trace:
>> [ 599.405429] []
>> btrfs_lookup_csums_range+0x2ac/0x4a0 [btrfs]
>
> Not a new bug unfortunately, but since it is in the error handling
> people must not be hitting it often. It's also not related to device
> replace.
>
>
> while (ret < 0 && !list_empty(&tmplist)) {
>         sums = list_entry(&tmplist, struct btrfs_ordered_sum,
> list);
>         list_del(&sums->list);
>         kfree(sums);
> }
>
> We're trying to call kfree on the on-stack list head. I'm fixing it
> up here, thanks for posting the oops!

Fix attached, or you can wait for the next rc. Thanks.

-chris

Reviewed-by: Mark Fasheh

From 6e5aafb27419f32575b27ef9d6a31e5d54661aca Mon Sep 17 00:00:00 2001
From: Chris Mason
Date: Tue, 4 Nov 2014 06:59:04 -0800
Subject: [PATCH] Btrfs: fix kfree on list_head in btrfs_lookup_csums_range
 error cleanup

If we hit any errors in btrfs_lookup_csums_range, we'll loop through all
the csums we allocate and free them. But the code was using list_entry
incorrectly, and ended up trying to free the on-stack list_head instead.

This bug came from commit 0678b6185
btrfs: Don't BUG_ON kzalloc error in btrfs_lookup_csums_range()

Signed-off-by: Chris Mason
Reported-by: Erik Berg
cc: stable@vger.kernel.org # 3.3 or newer
---
 fs/btrfs/file-item.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/file-item.c b/fs/btrfs/file-item.c
index 783a943..84a2d18 100644
--- a/fs/btrfs/file-item.c
+++ b/fs/btrfs/file-item.c
@@ -413,7 +413,7 @@ int btrfs_lookup_csums_range(struct btrfs_root *root, u64 start, u64 end,
 	ret = 0;
 fail:
 	while (ret < 0 && !list_empty(&tmplist)) {
-		sums = list_entry(&tmplist, struct btrfs_ordered_sum, list);
+		sums = list_entry(tmplist.next, struct btrfs_ordered_sum, list);
 		list_del(&sums->list);
 		kfree(sums);
 	}
-- 
1.8.1
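
For anyone following along, here is a rough userspace sketch of why the one-line change matters. It is not the kernel code: the list helpers are simplified stand-ins for <linux/list.h>, and struct demo_sum, demo_cleanup() and first_entry_matches() are hypothetical names made up for illustration. list_entry() only subtracts the member offset from the pointer it is given, so passing &tmplist yields an address computed from the on-stack head, while passing tmplist.next recovers the first element that was actually allocated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified userspace stand-ins for the kernel's list helpers. */
struct list_head {
	struct list_head *next, *prev;
};

/* Subtract the member offset from ptr to get the containing object. */
#define list_entry(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static void init_list_head(struct list_head *h) { h->next = h; h->prev = h; }
static int list_empty(const struct list_head *h) { return h->next == h; }

static void list_add_tail(struct list_head *item, struct list_head *head)
{
	item->prev = head->prev;
	item->next = head;
	head->prev->next = item;
	head->prev = item;
}

static void list_del(struct list_head *item)
{
	item->prev->next = item->next;
	item->next->prev = item->prev;
}

/* Hypothetical element type; the real struct btrfs_ordered_sum differs. */
struct demo_sum {
	unsigned long bytenr;
	struct list_head list;
};

/* Cleanup loop in the shape of the fixed code: always take head->next. */
int free_all(struct list_head *head)
{
	int freed = 0;

	while (!list_empty(head)) {
		struct demo_sum *sums =
			list_entry(head->next, struct demo_sum, list);

		list_del(&sums->list);
		free(sums);	/* heap object, never the list head itself */
		freed++;
	}
	return freed;
}

/* Queue n elements on an on-stack head, then free them; returns count freed. */
int demo_cleanup(int n)
{
	struct list_head tmplist;
	int i;

	init_list_head(&tmplist);
	for (i = 0; i < n; i++) {
		struct demo_sum *s = malloc(sizeof(*s));

		if (!s) {
			free_all(&tmplist);
			return -1;
		}
		s->bytenr = (unsigned long)i;
		list_add_tail(&s->list, &tmplist);
	}
	return free_all(&tmplist);
}

/* Returns 1 if list_entry(tmplist.next, ...) is the pointer we allocated. */
int first_entry_matches(void)
{
	struct list_head tmplist;
	struct demo_sum *s = malloc(sizeof(*s));
	int ok;

	if (!s)
		return -1;
	init_list_head(&tmplist);
	list_add_tail(&s->list, &tmplist);
	ok = (list_entry(tmplist.next, struct demo_sum, list) == s);
	list_del(&s->list);
	free(s);
	return ok;
}
```

Calling demo_cleanup(3) from a small main() should free all three elements and return 3. With the old form, list_entry(&tmplist, ...), the computed pointer would sit at a fixed offset before the stack variable, which is what kfree() tripped over in the oops.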