diff mbox

[v2-UPDATE2,3/4] resource: Add device-managed insert/remove_resource()

Message ID CAPcyv4gT1RvnEquYXVo7sZscdUrkiWpG-Yvr_EX4=L25397X0g@mail.gmail.com (mailing list archive)
State Not Applicable
Headers show

Commit Message

Dan Williams March 8, 2016, 8:59 p.m. UTC
On Tue, Mar 8, 2016 at 12:05 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Tue, Mar 8, 2016 at 10:08 AM, Toshi Kani <toshi.kani@hpe.com> wrote:
>>
>> Add device-managed implementaions of insert_resource() and
>> remove_resouce() functions.
>
> Can we get an example of the users and show how much this would actually help?
>
> insert_resource() is fairly unusual to begin with, and it should
> basically *never* be used by a driver at all (it's more of a bus
> thing).
>
> The patch doesn't look _wrong_, but it does look potentially largely
> pointless. Adding new interfaces to do things that aren't common
> enough to worry about is counter-productive, imho.

Here's the usage patch from Toshi [1] (copied below).  It is indeed a
resource injected by nfit / nvdimm bus implementation.  We just happen
to support nfit and libnvdimm as modules.

The goal of these patches is to use the ACPI NFIT data to create a
"Persistent Memory" rather than "reserved" resource.  This is for
platform-firmware implementations that use E820-Type2 rather than
E820-Type7 to describe pmem.

[1]: https://patchwork.kernel.org/patch/8485691/

8<--
From: Toshi Kani <toshi.kani@hpe.com>

ACPI 6.0 defines persistent memory (PMEM) ranges in multiple
firmware interfaces, e820, EFI, and ACPI NFIT table.  This EFI
change, however, leads to hit a bug in the grub bootloader, which
treats EFI_PERSISTENT_MEMORY type as regular memory and corrupts
stored user data [1].

Therefore, BIOS may set generic reserved type in e820 and EFI
to cover PMEM ranges.  The kernel can initialize PMEM ranges
from ACPI NFIT table alone.

This scheme causes a problem in the iomem table, though.  On x86,
for instance, e820_reserve_resources() initializes top-level entries
(iomem_resource.child) from the e820 table at early boot-time.
This creates "reserved" entry for a PMEM range, which does not allow
region_intersects() to check with PMEM type.

Change acpi_nfit_register_region() to call acpi_nfit_insert_resource(),
which calls devm_insert_resource() to insert a PMEM entry from NFIT
when the iomem table does not have a PMEM entry already.  That is,
when a PMEM range is marked as reserved type in e820, it inserts
"Persistent Memory" entry, which results as follows.

 + "Persistent Memory"
    + "reserved"

This allows the EINJ driver, which calls region_intersects() to
check PMEM ranges, to work continuously even if BIOS sets reserved
type (or sets nothing) to PMEM ranges in e820 and EFI.

[1]: https://lists.gnu.org/archive/html/grub-devel/2015-11/msg00209.html
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
 drivers/acpi/nfit.c |   30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

 {
@@ -1781,6 +1805,12 @@ static int acpi_nfit_register_region(struct
acpi_nfit_desc *acpi_desc,

        nvdimm_bus = acpi_desc->nvdimm_bus;
        if (nfit_spa_type(spa) == NFIT_SPA_PM) {
+               rc = acpi_nfit_insert_resource(acpi_desc, ndr_desc);
+               if (rc)
+                       dev_warn(acpi_desc->dev,
+                               "failed to insert pmem resource to iomem: %d\n",
+                               rc);
+
                rc = acpi_nfit_find_poison(acpi_desc, ndr_desc);
                if (rc) {
                        dev_err(acpi_desc->dev,

Comments

Linus Torvalds March 8, 2016, 10:23 p.m. UTC | #1
On Tue, Mar 8, 2016 at 12:59 PM, Dan Williams <dan.j.williams@intel.com> wrote:
>
> Here's the usage patch from Toshi [1] (copied below).  It is indeed a
> resource injected by nfit / nvdimm bus implementation.  We just happen
> to support nfit and libnvdimm as modules.
>
> The goal of these patches is to use the ACPI NFIT data to create a
> "Persistent Memory" rather than "reserved" resource.  This is for
> platform-firmware implementations that use E820-Type2 rather than
> E820-Type7 to describe pmem.

So my worry is that there is likely exactly one or two of these kinds of sites.

Why couldn't they just use insert_resource() and then remove it manually?

                 Linus
Dan Williams March 8, 2016, 10:44 p.m. UTC | #2
On Tue, Mar 8, 2016 at 2:23 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Tue, Mar 8, 2016 at 12:59 PM, Dan Williams <dan.j.williams@intel.com> wrote:
>>
>> Here's the usage patch from Toshi [1] (copied below).  It is indeed a
>> resource injected by nfit / nvdimm bus implementation.  We just happen
>> to support nfit and libnvdimm as modules.
>>
>> The goal of these patches is to use the ACPI NFIT data to create a
>> "Persistent Memory" rather than "reserved" resource.  This is for
>> platform-firmware implementations that use E820-Type2 rather than
>> E820-Type7 to describe pmem.
>
> So my worry is that there is likely exactly one or two of these kinds of sites.
>
> Why couldn't they just use insert_resource() and then remove it manually?

You mean instead of introducing a devm_insert_resource() as a helpful
first-class-citizen api, just arrange for the resource to be inserted
locally?  Sure.

I assume Toshi was looking to keep the devm semantics like the rest of
the nfit driver, but we can do that locally with devm_add_action() and
skip the new general purpose api.
Linus Torvalds March 8, 2016, 11:31 p.m. UTC | #3
On Tue, Mar 8, 2016 at 4:04 PM, Toshi Kani <toshi.kani@hpe.com> wrote:
>
> Yes, I prefer the devm semantics.  insert_resource() and remove_resource()
> are not exported interfaces.  So, with devm_add_action(), we still need to
> introduce built-in exported wrappers for insert/remove_resource(), unless
> we change to export them directly.  Since we need to export "something", I
> think it is better to export their devm interfaces.

So I'm coming from the background that

 (a) less code is better

 (b) the "devm_" interface may be convenient, but it has also
traditionally also been a cause of problems and limitations.

Now, the main problems with the devm interface has been either
ordering (which just isn't an issue with resource allocation - it's
been an issue with irqs) or the fact that it can't always be used if
you're not in the right context. So it's "convenient but potentially
inflexible".

And the thing is, I think convenience functions mainly make sense for
places where there are multiple users. If there really is just one or
two (number completely pulled out of my ass), I don't see the point of
a "convenience" function, when we've had the main actual _code_
functionality for over a decade.

So unless there are more users, I'd suggest just exporting the
insert_resource function.

We already export allocate_resource and adjust_resource.

Now, the _one_ argument for devm_insert_resource() is that we do have
"devm_request_resource()".

But quite frankly, just counting the number of devm_request_resource()
calls weakens that argument. There's 7 callers in the whole kernel.
The regular "request_resource()" has 200+ callers.

That may be due to historical reasons, but it may also be at least
partially due to (b) above - there are a number of cases where the
"devm_xyz()" model doesn't work well.

So I think we should see the "devm_xyz()" forms as being a "let's make
things easy for driver writers". I do _not_ think it makes sense for
one-off users.

Now, if it turns out that there are lots of other potential users of
devm_insert_resource(), that would maks all of my arguments go away.

             Linus
Kani, Toshi March 9, 2016, 12:04 a.m. UTC | #4
On Tue, 2016-03-08 at 14:44 -0800, Dan Williams wrote:
> On Tue, Mar 8, 2016 at 2:23 PM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> > On Tue, Mar 8, 2016 at 12:59 PM, Dan Williams <dan.j.williams@intel.com
> > > wrote:
> > > 
> > > Here's the usage patch from Toshi [1] (copied below).  It is indeed a
> > > resource injected by nfit / nvdimm bus implementation.  We just
> > > happen
> > > to support nfit and libnvdimm as modules.
> > > 
> > > The goal of these patches is to use the ACPI NFIT data to create a
> > > "Persistent Memory" rather than "reserved" resource.  This is for
> > > platform-firmware implementations that use E820-Type2 rather than
> > > E820-Type7 to describe pmem.
> > 
> > So my worry is that there is likely exactly one or two of these kinds
> > of sites.
> > 
> > Why couldn't they just use insert_resource() and then remove it
> > manually?
> 
> You mean instead of introducing a devm_insert_resource() as a helpful
> first-class-citizen api, just arrange for the resource to be inserted
> locally?  Sure.
> 
> I assume Toshi was looking to keep the devm semantics like the rest of
> the nfit driver, but we can do that locally with devm_add_action() and
> skip the new general purpose api.

Yes, I prefer the devm semantics.  insert_resource() and remove_resource()
are not exported interfaces.  So, with devm_add_action(), we still need to
introduce built-in exported wrappers for insert/remove_resource(), unless
we change to export them directly.  Since we need to export "something", I
think it is better to export their devm interfaces.

Thanks,
-Toshi
Kani, Toshi March 9, 2016, 1 a.m. UTC | #5
On Tue, 2016-03-08 at 15:31 -0800, Linus Torvalds wrote:
> On Tue, Mar 8, 2016 at 4:04 PM, Toshi Kani <toshi.kani@hpe.com> wrote:
> > 
> > Yes, I prefer the devm semantics.  insert_resource() and
> > remove_resource() are not exported interfaces.  So, with
> > devm_add_action(), we still need to introduce built-in exported
> > wrappers for insert/remove_resource(), unless we change to export them
> > directly.  Since we need to export "something", I think it is better to
> > export their devm interfaces.
> 
> So I'm coming from the background that
> 
>  (a) less code is better
> 
>  (b) the "devm_" interface may be convenient, but it has also
> traditionally also been a cause of problems and limitations.
> 
> Now, the main problems with the devm interface has been either
> ordering (which just isn't an issue with resource allocation - it's
> been an issue with irqs) or the fact that it can't always be used if
> you're not in the right context. So it's "convenient but potentially
> inflexible".
> 
> And the thing is, I think convenience functions mainly make sense for
> places where there are multiple users. If there really is just one or
> two (number completely pulled out of my ass), I don't see the point of
> a "convenience" function, when we've had the main actual _code_
> functionality for over a decade.
> 
> So unless there are more users, I'd suggest just exporting the
> insert_resource function.
> 
> We already export allocate_resource and adjust_resource.
> 
> Now, the _one_ argument for devm_insert_resource() is that we do have
> "devm_request_resource()".
> 
> But quite frankly, just counting the number of devm_request_resource()
> calls weakens that argument. There's 7 callers in the whole kernel.
> The regular "request_resource()" has 200+ callers.
> 
> That may be due to historical reasons, but it may also be at least
> partially due to (b) above - there are a number of cases where the
> "devm_xyz()" model doesn't work well.
> 
> So I think we should see the "devm_xyz()" forms as being a "let's make
> things easy for driver writers". I do _not_ think it makes sense for
> one-off users.
> 
> Now, if it turns out that there are lots of other potential users of
> devm_insert_resource(), that would maks all of my arguments go away.

I agree that there won't be many users of devm_insert_resource().  So, I am
going to export insert_resource() and remove_resource() as you suggested,
and let the NFIT driver to call them using devm_add_action() as a one-off
solution.

Thanks!
-Toshi
diff mbox

Patch

diff --git a/drivers/acpi/nfit.c b/drivers/acpi/nfit.c
index fb53db1..d97b53f 100644
--- a/drivers/acpi/nfit.c
+++ b/drivers/acpi/nfit.c
@@ -1571,6 +1571,30 @@  static int ars_status_process_records(struct
nvdimm_bus *nvdimm_bus,
        return 0;
 }

+static int acpi_nfit_insert_resource(struct acpi_nfit_desc *acpi_desc,
+               struct nd_region_desc *ndr_desc)
+{
+       struct resource *res, *nd_res = ndr_desc->res;
+       size_t size = nd_res->end - nd_res->start + 1;
+
+       /* No operation if the region is already registered as PMEM */
+       if (region_intersects(nd_res->start, size, IORESOURCE_MEM,
+                       IORES_DESC_PERSISTENT_MEMORY) == REGION_INTERSECTS)
+               return 0;
+
+       res = devm_kzalloc(acpi_desc->dev, sizeof(*res), GFP_KERNEL);
+       if (!res)
+               return -ENOMEM;
+
+       res->name = "Persistent Memory";
+       res->start = nd_res->start;
+       res->end = nd_res->end;
+       res->flags = IORESOURCE_MEM;
+       res->desc = IORES_DESC_PERSISTENT_MEMORY;
+
+       return devm_insert_resource(acpi_desc->dev, &iomem_resource, res);
+}
+
 static int acpi_nfit_find_poison(struct acpi_nfit_desc *acpi_desc,
                struct nd_region_desc *ndr_desc)