Message ID | cc4b61809e2520d835cf3d4f62e7d5ed00a9d031.1674468099.git.lukas@wunner.de |
---|---|
State | Superseded |
Series | Collection of DOE material |
Lukas Wunner wrote:
> Gregory Price reports a WARN splat with CONFIG_DEBUG_OBJECTS=y upon CXL
> probing because pci_doe_submit_task() invokes INIT_WORK() instead of
> INIT_WORK_ONSTACK() for a work_struct that was allocated on the stack.
>
> All callers of pci_doe_submit_task() allocate the work_struct on the
> stack, so replace INIT_WORK() with INIT_WORK_ONSTACK() as a backportable
> short-term fix.
>
> Stacktrace for posterity:
>
> WARNING: CPU: 0 PID: 23 at lib/debugobjects.c:545 __debug_object_init.cold+0x18/0x183
> CPU: 0 PID: 23 Comm: kworker/u2:1 Not tainted 6.1.0-0.rc1.20221019gitaae703b02f92.17.fc38.x86_64 #1
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> Call Trace:
> pci_doe_submit_task+0x5d/0xd0
> pci_doe_discovery+0xb4/0x100
> pcim_doe_create_mb+0x219/0x290
> cxl_pci_probe+0x192/0x430
> local_pci_probe+0x41/0x80
> pci_device_probe+0xb3/0x220
> really_probe+0xde/0x380
> __driver_probe_device+0x78/0x170
> driver_probe_device+0x1f/0x90
> __driver_attach_async_helper+0x5c/0xe0
> async_run_entry_fn+0x30/0x130
> process_one_work+0x294/0x5b0
>
> Fixes: 9d24322e887b ("PCI/DOE: Add DOE mailbox support functions")
> Link: https://lore.kernel.org/linux-cxl/Y1bOniJliOFszvIK@memverge.com/
> Reported-by: Gregory Price <gregory.price@memverge.com>
> Tested-by: Ira Weiny <ira.weiny@intel.com>

Reviewed-by: Ira Weiny <ira.weiny@intel.com>

> Signed-off-by: Lukas Wunner <lukas@wunner.de>
> Cc: stable@vger.kernel.org # v6.0+
> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> ---
> Changes v1 -> v2:
> * Add note in kernel-doc of pci_doe_submit_task() that pci_doe_task must
>   be allocated on the stack (Jonathan)
>
> drivers/pci/doe.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/pci/doe.c b/drivers/pci/doe.c
> index 66d9ab288646..12a6752351bf 100644
> --- a/drivers/pci/doe.c
> +++ b/drivers/pci/doe.c
> @@ -520,6 +520,8 @@ EXPORT_SYMBOL_GPL(pci_doe_supports_prot);
>   * task->complete will be called when the state machine is done processing this
>   * task.
>   *
> + * @task must be allocated on the stack.
> + *
>   * Excess data will be discarded.
>   *
>   * RETURNS: 0 when task has been successfully queued, -ERRNO on error
> @@ -541,7 +543,7 @@ int pci_doe_submit_task(struct pci_doe_mb *doe_mb, struct pci_doe_task *task)
>  		return -EIO;
>
>  	task->doe_mb = doe_mb;
> -	INIT_WORK(&task->work, doe_statemachine_work);
> +	INIT_WORK_ONSTACK(&task->work, doe_statemachine_work);
>  	queue_work(doe_mb->work_queue, &task->work);
>  	return 0;
>  }
> --
> 2.39.1
>
On Mon, 23 Jan 2023 16:33:36 -0800 Ira Weiny <ira.weiny@intel.com> wrote:
> Lukas Wunner wrote:
> > Gregory Price reports a WARN splat with CONFIG_DEBUG_OBJECTS=y upon CXL
> > probing because pci_doe_submit_task() invokes INIT_WORK() instead of
> > INIT_WORK_ONSTACK() for a work_struct that was allocated on the stack.
> >
> > All callers of pci_doe_submit_task() allocate the work_struct on the
> > stack, so replace INIT_WORK() with INIT_WORK_ONSTACK() as a backportable
> > short-term fix.
> >
> > Stacktrace for posterity:
> >
> > WARNING: CPU: 0 PID: 23 at lib/debugobjects.c:545 __debug_object_init.cold+0x18/0x183
> > CPU: 0 PID: 23 Comm: kworker/u2:1 Not tainted 6.1.0-0.rc1.20221019gitaae703b02f92.17.fc38.x86_64 #1
> > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> > Call Trace:
> > pci_doe_submit_task+0x5d/0xd0
> > pci_doe_discovery+0xb4/0x100
> > pcim_doe_create_mb+0x219/0x290
> > cxl_pci_probe+0x192/0x430
> > local_pci_probe+0x41/0x80
> > pci_device_probe+0xb3/0x220
> > really_probe+0xde/0x380
> > __driver_probe_device+0x78/0x170
> > driver_probe_device+0x1f/0x90
> > __driver_attach_async_helper+0x5c/0xe0
> > async_run_entry_fn+0x30/0x130
> > process_one_work+0x294/0x5b0
> >
> > Fixes: 9d24322e887b ("PCI/DOE: Add DOE mailbox support functions")
> > Link: https://lore.kernel.org/linux-cxl/Y1bOniJliOFszvIK@memverge.com/
> > Reported-by: Gregory Price <gregory.price@memverge.com>
> > Tested-by: Ira Weiny <ira.weiny@intel.com>
>
> Reviewed-by: Ira Weiny <ira.weiny@intel.com>

It's an unusual requirement, but this is indeed the minimal fix
given current users. Obviously becomes more sensible later in the
series once you make the API synchronous only.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

>
> > Signed-off-by: Lukas Wunner <lukas@wunner.de>
> > Cc: stable@vger.kernel.org # v6.0+
> > Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > ---
> > Changes v1 -> v2:
> > * Add note in kernel-doc of pci_doe_submit_task() that pci_doe_task must
> >   be allocated on the stack (Jonathan)
> >
> > drivers/pci/doe.c | 4 +++-
> > 1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/pci/doe.c b/drivers/pci/doe.c
> > index 66d9ab288646..12a6752351bf 100644
> > --- a/drivers/pci/doe.c
> > +++ b/drivers/pci/doe.c
> > @@ -520,6 +520,8 @@ EXPORT_SYMBOL_GPL(pci_doe_supports_prot);
> >   * task->complete will be called when the state machine is done processing this
> >   * task.
> >   *
> > + * @task must be allocated on the stack.
> > + *
> >   * Excess data will be discarded.
> >   *
> >   * RETURNS: 0 when task has been successfully queued, -ERRNO on error
> > @@ -541,7 +543,7 @@ int pci_doe_submit_task(struct pci_doe_mb *doe_mb, struct pci_doe_task *task)
> >  		return -EIO;
> >
> >  	task->doe_mb = doe_mb;
> > -	INIT_WORK(&task->work, doe_statemachine_work);
> > +	INIT_WORK_ONSTACK(&task->work, doe_statemachine_work);
> >  	queue_work(doe_mb->work_queue, &task->work);
> >  	return 0;
> >  }
> > --
> > 2.39.1
> >
> >
On Mon, Jan 23, 2023 at 11:11:00AM +0100, Lukas Wunner wrote:
> Gregory Price reports a WARN splat with CONFIG_DEBUG_OBJECTS=y upon CXL
> probing because pci_doe_submit_task() invokes INIT_WORK() instead of
> INIT_WORK_ONSTACK() for a work_struct that was allocated on the stack.
>
> All callers of pci_doe_submit_task() allocate the work_struct on the
> stack, so replace INIT_WORK() with INIT_WORK_ONSTACK() as a backportable
> short-term fix.
>
> ... snip ...
> Reported-by: Gregory Price <gregory.price@memverge.com>

Tested-by: Gregory Price <gregory.price@memverge.com>
Reviewed-by: Gregory Price <gregory.price@memverge.com>
On Tue, Jan 24, 2023 at 10:32:08AM +0000, Jonathan Cameron wrote:
> On Mon, 23 Jan 2023 16:33:36 -0800 Ira Weiny <ira.weiny@intel.com> wrote:
> > Lukas Wunner wrote:
> > > Gregory Price reports a WARN splat with CONFIG_DEBUG_OBJECTS=y upon CXL
> > > probing because pci_doe_submit_task() invokes INIT_WORK() instead of
> > > INIT_WORK_ONSTACK() for a work_struct that was allocated on the stack.
> > >
> > > All callers of pci_doe_submit_task() allocate the work_struct on the
> > > stack, so replace INIT_WORK() with INIT_WORK_ONSTACK() as a backportable
> > > short-term fix.
[...]
> It's an unusual requirement, but this is indeed the minimal fix
> given current users. Obviously becomes more sensible later in the
> series once you make the API synchronous only.

Okay, I'll amend the commit message as follows when respinning
to make more obvious what's being done here:

    The long-term fix implemented by a subsequent commit is to move to
    a synchronous API which allocates the work_struct internally in the
    DOE library.

Thanks,

Lukas
Lukas Wunner wrote:
> Gregory Price reports a WARN splat with CONFIG_DEBUG_OBJECTS=y upon CXL
> probing because pci_doe_submit_task() invokes INIT_WORK() instead of
> INIT_WORK_ONSTACK() for a work_struct that was allocated on the stack.
>
> All callers of pci_doe_submit_task() allocate the work_struct on the
> stack, so replace INIT_WORK() with INIT_WORK_ONSTACK() as a backportable
> short-term fix.
>
> Stacktrace for posterity:
>
> WARNING: CPU: 0 PID: 23 at lib/debugobjects.c:545 __debug_object_init.cold+0x18/0x183
> CPU: 0 PID: 23 Comm: kworker/u2:1 Not tainted 6.1.0-0.rc1.20221019gitaae703b02f92.17.fc38.x86_64 #1
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> Call Trace:
> pci_doe_submit_task+0x5d/0xd0
> pci_doe_discovery+0xb4/0x100
> pcim_doe_create_mb+0x219/0x290
> cxl_pci_probe+0x192/0x430
> local_pci_probe+0x41/0x80
> pci_device_probe+0xb3/0x220
> really_probe+0xde/0x380
> __driver_probe_device+0x78/0x170
> driver_probe_device+0x1f/0x90
> __driver_attach_async_helper+0x5c/0xe0
> async_run_entry_fn+0x30/0x130
> process_one_work+0x294/0x5b0
>
> Fixes: 9d24322e887b ("PCI/DOE: Add DOE mailbox support functions")
> Link: https://lore.kernel.org/linux-cxl/Y1bOniJliOFszvIK@memverge.com/
> Reported-by: Gregory Price <gregory.price@memverge.com>
> Tested-by: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Lukas Wunner <lukas@wunner.de>
> Cc: stable@vger.kernel.org # v6.0+
> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
diff --git a/drivers/pci/doe.c b/drivers/pci/doe.c
index 66d9ab288646..12a6752351bf 100644
--- a/drivers/pci/doe.c
+++ b/drivers/pci/doe.c
@@ -520,6 +520,8 @@ EXPORT_SYMBOL_GPL(pci_doe_supports_prot);
  * task->complete will be called when the state machine is done processing this
  * task.
  *
+ * @task must be allocated on the stack.
+ *
  * Excess data will be discarded.
  *
  * RETURNS: 0 when task has been successfully queued, -ERRNO on error
@@ -541,7 +543,7 @@ int pci_doe_submit_task(struct pci_doe_mb *doe_mb, struct pci_doe_task *task)
 		return -EIO;
 
 	task->doe_mb = doe_mb;
-	INIT_WORK(&task->work, doe_statemachine_work);
+	INIT_WORK_ONSTACK(&task->work, doe_statemachine_work);
 	queue_work(doe_mb->work_queue, &task->work);
 	return 0;
 }
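
Editorial note on why the macro choice triggers the splat: as far as I understand it, with CONFIG_DEBUG_OBJECTS=y the object-debugging code compares where an object actually lives (on the current stack or not) with what the initializer claimed, and warns when the two disagree. The following is only a userspace sketch of that comparison, not kernel code. Every name in it (`stack_range`, `stack_range_of`, `addr_on_stack`, `annotation_mismatch`) is hypothetical, and a plain buffer stands in for the task's stack.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct stack_range {
	uintptr_t lo;	/* lowest address belonging to the modelled stack */
	uintptr_t hi;	/* one past the highest address */
};

/* Build a range from a buffer that stands in for the current task's stack. */
static struct stack_range stack_range_of(const void *base, size_t len)
{
	struct stack_range r = { (uintptr_t)base, (uintptr_t)base + len };

	assert(r.lo <= r.hi);
	return r;
}

/* Does the object's address fall inside the modelled stack? */
static bool addr_on_stack(const struct stack_range *s, const void *obj)
{
	uintptr_t a = (uintptr_t)obj;

	return a >= s->lo && a < s->hi;
}

/*
 * Returns true when the object's location disagrees with the annotation,
 * i.e. the case in which a warning would be emitted in this model.
 * annotated_onstack == true  models INIT_WORK_ONSTACK(),
 * annotated_onstack == false models INIT_WORK().
 */
static bool annotation_mismatch(const struct stack_range *s, const void *obj,
				bool annotated_onstack)
{
	return addr_on_stack(s, obj) != annotated_onstack;
}
```

In this model, the pre-patch situation is an object inside the stack range initialized with `annotated_onstack == false`, which reports a mismatch. Switching the annotation to `true`, as the patch does with INIT_WORK_ONSTACK(), removes it, but would in turn flag any caller that passed a heap-allocated task, which is presumably why the kernel-doc now states the stack requirement.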