From patchwork Sun Nov 26 03:04:36 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Linus Torvalds X-Patchwork-Id: 13468768 X-Patchwork-Delegate: kuba@kernel.org Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux-foundation.org header.i=@linux-foundation.org header.b="fKHDb2P3" Received: from mail-ej1-x62b.google.com (mail-ej1-x62b.google.com [IPv6:2a00:1450:4864:20::62b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3C5F9D3 for ; Sat, 25 Nov 2023 19:04:57 -0800 (PST) Received: by mail-ej1-x62b.google.com with SMTP id a640c23a62f3a-9fa2714e828so425994666b.1 for ; Sat, 25 Nov 2023 19:04:57 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux-foundation.org; s=google; t=1700967895; x=1701572695; darn=vger.kernel.org; h=cc:to:subject:message-id:date:from:mime-version:from:to:cc:subject :date:message-id:reply-to; bh=rkaMoHrFlZfVMg+m4s+x5AD3GeehE8yA2Qqsd3VE908=; b=fKHDb2P3A5H/08NFkefR28i53gh/+NjqO03sqYQBmUmCxLiYG6Ge7S+KR87Rx7UAAH 6xkhQHfGyBHxPFbJp9fieJS87JemWHboaIQU9NKggYUPRmgoU2ipSFHcp53aUPQpjifa gKH4tiavdI07m6kIHn0mtkB7KI5kyCWBiG9ro= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1700967895; x=1701572695; h=cc:to:subject:message-id:date:from:mime-version:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=rkaMoHrFlZfVMg+m4s+x5AD3GeehE8yA2Qqsd3VE908=; b=h9Pff/B94kfYHJkx3I2fSlHB9L5Eqfxn62L8UyhSA2twCRejYguYkQemthb6htfTFu QgCQ+VV9Dof0KHWmpGX0EjXFWCrSOTt6jOlj0f+Mmz59mh5wGFeKLnL4261zlwr4nUAd vWYtw/rKDPwFYNQeazuqtXHPYzbVT5r0TaRWXLXuvdUfnTM3mL65Ze137+7W0sE7xxAx y+fyXOq+q8GPoB5Zi7YxkVoABhGTJeAQgiXd2Ra8zg77H0A5dRcXNpYsS2EemN7fe224 Xo/Dg+OoiOSJhQr2+L5x6WZpDXSH6E+RfDd2WtvLsl4Ow6XJ9VyDQDZRMfKhDVfDwExa Y14A== X-Gm-Message-State: AOJu0Yz9WCU4m6ettaX9NIzTZCjmWfKWVfzdvQrGj8Zkjcn4E9GQGYUv icScsKeDsD0uQud7lT4xI4k3O6KwdjeDWt4JE0oyzg== X-Google-Smtp-Source: AGHT+IHGtaQ1jj5k3o0H79xso+28UxEwwMMKj1xcdDYxW8LOLWrFxePkxbv1LwmHGgQLGLb9QIsUJw== X-Received: by 2002:a17:906:5187:b0:a0d:3a70:11e3 with SMTP id y7-20020a170906518700b00a0d3a7011e3mr851494ejk.63.1700967895153; Sat, 25 Nov 2023 19:04:55 -0800 (PST) Received: from mail-ed1-f43.google.com (mail-ed1-f43.google.com. [209.85.208.43]) by smtp.gmail.com with ESMTPSA id k13-20020a17090632cd00b009ca522853ecsm4107600ejk.58.2023.11.25.19.04.54 for (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Sat, 25 Nov 2023 19:04:54 -0800 (PST) Received: by mail-ed1-f43.google.com with SMTP id 4fb4d7f45d1cf-54a95657df3so4468730a12.0 for ; Sat, 25 Nov 2023 19:04:54 -0800 (PST) X-Received: by 2002:a50:99dc:0:b0:53e:7d60:58bb with SMTP id n28-20020a5099dc000000b0053e7d6058bbmr5738755edb.27.1700967893714; Sat, 25 Nov 2023 19:04:53 -0800 (PST) Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Linus Torvalds Date: Sat, 25 Nov 2023 19:04:36 -0800 X-Gmail-Original-Message-ID: Message-ID: Subject: Aquantia ethernet driver suspend/resume issues To: Igor Russkikh Cc: Eric Dumazet , Jakub Kicinski , Paolo Abeni , Netdev Ok, so this is pretty random, but I ended up replacing my main SSD today, and decided that I'll just do a clean re-install and copy my user data over from my old SSD. As a result of all that, my ethernet cable ended up in a random ethernet port when I reconnected everything, and because of the system reinstall I ended up with suspend-at-idle on by default (which I very much don't want, but I only noticed after it happened). And it turns out that suspend/resume *really* doesn't work on the Aquantia ethernet driver, which is where the cable happened to be. First you get an allocation failure at resume: kworker/u256:41: page allocation failure: order:6, mode:0x40d00(GFP_NOIO|__GFP_COMP|__GFP_ZERO), nodemask=(null),cpuset=/,mems_allowed=0 CPU: 58 PID: 11654 Comm: kworker/u256:41 Not tainted Workqueue: events_unbound async_run_entry_fn Call Trace: dump_stack_lvl+0x47/0x60 warn_alloc+0x165/0x1e0 __alloc_pages_slowpath.constprop.0+0xcd4/0xd90 __alloc_pages+0x32d/0x350 __kmalloc_large_node+0x73/0x130 __kmalloc+0xc3/0x150 aq_ring_alloc+0x22/0xb0 [atlantic] aq_vec_ring_alloc+0xee/0x1a0 [atlantic] aq_nic_init+0x118/0x1d0 [atlantic] atl_resume_common+0x40/0xd0 [atlantic] ... and immediately after that we get trying to free invalid coherent area: 000000006fb35228 WARNING: CPU: 58 PID: 11654 at kernel/dma/remap.c:65 dma_common_free_remap+0x2d/0x40 CPU: 58 PID: 11654 Comm: kworker/u256:41 Not tainted 6.5.6-300.fc39.x86_64 #1 Workqueue: events_unbound async_run_entry_fn Call Trace: __iommu_dma_free+0xe8/0x100 aq_ring_alloc+0xa4/0xb0 [atlantic] aq_vec_ring_alloc+0xee/0x1a0 [atlantic] aq_nic_init+0x118/0x1d0 [atlantic] atl_resume_common+0x40/0xd0 [atlantic] ... atlantic 0000:44:00.0: PM: dpm_run_callback(): pci_pm_resume+0x0/0xf0 returns -12 atlantic 0000:44:00.0: PM: failed to resume async: error -12 and now the slab cache is corrupt and the system is dead. My *guess* is that what is going on is that when the kcalloc() failued (because it tries to allocate a large area, and it has only been tested at boot-time when it succeeds), we end up doing that err_exit: if (err < 0) { aq_ring_free(self); self = NULL; } but aq_ring_free() does kfree(self->buff_ring); if (self->dx_ring) dma_free_coherent(aq_nic_get_dev(self->aq_nic), self->size * self->dx_size, self->dx_ring, self->dx_ring_pa); and notice how it will free the dx_ring even though it was never allocated! I suspect dc_ring is non-zero because it was allocated earlier, but the suspend free'd it - but never cleared the pointer. That "never cleared the pointer on free" is true for buff_ring too, but the aq_ring_alloc() did self->buff_ring = kcalloc(self->size, sizeof(struct aq_ring_buff_s), GFP_KERNEL); so when that failed, at least it re-initialized that part to NULL, so we just had a kfree(NULL) which is fine. Anyway, I suspect a fix for the fatal error might be something like the attached, but I think the *root* of the problem is how the aquantia driver tried to allocate a humongous buff_ring with kmalloc, which really doesn't work. You can see that "order:6", ie we're talking an allocation > 100kB, and in low-memory situations that kind of kmalloc space simply isn't available. It *will* fail. Again, during boot you'll probably never see any issues. During suspend/resume it very much does not work. In general, suspend/resume should *not* do big memory management things. It should probably have never free'd the old data structure, and it most definitely cannot try to allocate a big new data structure in resume. To make matters worse, it looks like there's not just *one* of those big allocations, there's multiple ones, both for RX and TX. But I didn't look much more closely. I don't know what the right fix is, but *one* fix would certainly be to not tear everything down at suspend time, only to build it up again at resume. And please please please don't double-free things randomly (if that is what was going on, but it does look like it was). Linus drivers/net/ethernet/aquantia/atlantic/aq_ring.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_ring.c b/drivers/net/ethernet/aquantia/atlantic/aq_ring.c index 4de22eed099a..472c7c08bfed 100644 --- a/drivers/net/ethernet/aquantia/atlantic/aq_ring.c +++ b/drivers/net/ethernet/aquantia/atlantic/aq_ring.c @@ -932,11 +932,14 @@ void aq_ring_free(struct aq_ring_s *self) return; kfree(self->buff_ring); + self->buff_ring = NULL; - if (self->dx_ring) + if (self->dx_ring) { dma_free_coherent(aq_nic_get_dev(self->aq_nic), self->size * self->dx_size, self->dx_ring, self->dx_ring_pa); + self->dx_ring = NULL; + } } unsigned int aq_ring_fill_stats_data(struct aq_ring_s *self, u64 *data)