cxl: Add generic MSI/MSI-X interrupt support

Message ID	20221012180432.473373-1-dave@stgolabs.net
State	New, archived
Headers	show Return-Path: <linux-cxl-owner@kernel.org> sender: dreamhost) by relay.mailchannels.net (Postfix) with ESMTPA id CA372541420; Wed, 12 Oct 2022 18:04:56 +0000 (UTC) sender: dave@stgolabs.net) by pdx1-sub0-mail-a274.dreamhost.com (Postfix) with ESMTPSA id 4MngXm06Zrz2l; Wed, 12 Oct 2022 11:04:55 -0700 (PDT) From: Davidlohr Bueso <dave@stgolabs.net> To: dan.j.williams@intel.com Cc: ira.weiny@intel.com, Jonathan.Cameron@huawei.com, alison.schofield@intel.com, vishal.l.verma@intel.com, bwidawsk@kernel.org, a.manzanares@samsung.com, linux-kernel@vger.kernel.org, linux-cxl@vger.kernel.org, dave@stgolabs.net Subject: [PATCH] cxl: Add generic MSI/MSI-X interrupt support Date: Wed, 12 Oct 2022 11:04:32 -0700 Message-Id: <20221012180432.473373-1-dave@stgolabs.net> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk
Series	cxl: Add generic MSI/MSI-X interrupt support \| expand cxl: Add generic MSI/MSI-X interrupt support

Davidlohr Bueso Oct. 12, 2022, 6:04 p.m. UTC

Introduce a generic irq table for CXL components/features that can have
standard irq support - DOE requires dynamic vector sizing and is not
considered here.

Create an infrastructure to query the max vectors required for the CXL
device.

Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
---
 drivers/cxl/pci.c | 63 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 63 insertions(+)

Dave Jiang Oct. 12, 2022, 10:57 p.m. UTC | #1

On 10/12/2022 11:04 AM, Davidlohr Bueso wrote:
> Introduce a generic irq table for CXL components/features that can have
> standard irq support - DOE requires dynamic vector sizing and is not
> considered here.
>
> Create an infrastructure to query the max vectors required for the CXL
> device.
>
> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>

LGTM. Although it would be nice to see the other half of the picture. 
i.e. having the mailbox consuming the vector via request_irq() as an 
example.

Reviewed-by: Dave Jiang <dave.jiang@intel.com>


> ---
>   drivers/cxl/pci.c | 63 +++++++++++++++++++++++++++++++++++++++++++++++
>   1 file changed, 63 insertions(+)
>
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index faeb5d9d7a7a..467f2d568e3e 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -428,6 +428,66 @@ static void devm_cxl_pci_create_doe(struct cxl_dev_state *cxlds)
>   	}
>   }
>   
> +/**
> + * struct cxl_irq_cap - CXL feature that is capable of receiving MSI/MSI-X irqs.
> + *
> + * @name: Name of the device generating this interrupt.
> + * @get_max_msgnum: Get the feature's largest interrupt message number.  If the
> + *		    feature does not have the Interrupt Supported bit set, then
> + *		    return -1.
> + */
> +struct cxl_irq_cap {
> +	const char *name;
> +	int (*get_max_msgnum)(struct cxl_dev_state *cxlds);
> +};
> +
> +static const struct cxl_irq_cap cxl_irq_cap_table[] = {
> +	{ "isolation", NULL },
> +	{ "pmu_overflow", NULL },
> +	{ "mailbox", NULL },
> +	{ "event", NULL },
> +};
> +
> +static void cxl_pci_free_irq_vectors(void *data)
> +{
> +	pci_free_irq_vectors(data);
> +}
> +
> +static int cxl_pci_alloc_irq_vectors(struct cxl_dev_state *cxlds)
> +{
> +	struct device *dev = cxlds->dev;
> +	struct pci_dev *pdev = to_pci_dev(dev);
> +	int rc, i, vectors = -1;
> +
> +	for (i = 0; i < ARRAY_SIZE(cxl_irq_cap_table); i++) {
> +		int irq;
> +
> +		if (!cxl_irq_cap_table[i].get_max_msgnum)
> +			continue;
> +
> +		irq = cxl_irq_cap_table[i].get_max_msgnum(cxlds);
> +		vectors = max_t(int, irq, vectors);
> +	}
> +
> +	if (vectors == -1)
> +		return -EINVAL; /* no irq support whatsoever */
> +
> +	vectors++;
> +	rc = pci_alloc_irq_vectors(pdev, vectors, vectors,
> +				   PCI_IRQ_MSIX | PCI_IRQ_MSI);
> +	if (rc < 0)
> +		return rc;
> +
> +	if (rc != vectors) {
> +		dev_err(dev, "Not enough interrupts; use polling where supported\n");
> +		/* Some got allocated; clean them up */
> +		cxl_pci_free_irq_vectors(pdev);
> +		return -ENOSPC;
> +	}
> +
> +	return devm_add_action_or_reset(dev, cxl_pci_free_irq_vectors, pdev);
> +}
> +
>   static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>   {
>   	struct cxl_register_map map;
> @@ -498,6 +558,9 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>   	if (IS_ERR(cxlmd))
>   		return PTR_ERR(cxlmd);
>   
> +	/* TODO: When there are users, this return value must be checked */
> +	cxl_pci_alloc_irq_vectors(cxlds);
> +
>   	if (resource_size(&cxlds->pmem_res) && IS_ENABLED(CONFIG_CXL_PMEM))
>   		rc = devm_cxl_add_nvdimm(&pdev->dev, cxlmd);
>

Davidlohr Bueso Oct. 12, 2022, 11:15 p.m. UTC | #2

On Wed, 12 Oct 2022, Dave Jiang wrote:

>
>On 10/12/2022 11:04 AM, Davidlohr Bueso wrote:
>>Introduce a generic irq table for CXL components/features that can have
>>standard irq support - DOE requires dynamic vector sizing and is not
>>considered here.
>>
>>Create an infrastructure to query the max vectors required for the CXL
>>device.
>>
>>Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
>
>LGTM. Although it would be nice to see the other half of the picture.
>i.e. having the mailbox consuming the vector via request_irq() as an
>example.

Well Ira has already implemented the irq support for the events[0], and my
plan is to send my mbox irq patches along with the async handling stuff
once this patch lands.

>
>Reviewed-by: Dave Jiang <dave.jiang@intel.com>

Thanks!

[0] https://lore.kernel.org/linux-cxl/20221012180136.tc3ryjumquvnaqk2@offworld/T/#me42f5ff69bb78ac47ce94b647e6628fa3e841696

Jonathan Cameron Oct. 13, 2022, 12:19 p.m. UTC | #3

On Wed, 12 Oct 2022 11:04:32 -0700
Davidlohr Bueso <dave@stgolabs.net> wrote:

> Introduce a generic irq table for CXL components/features that can have
> standard irq support - DOE requires dynamic vector sizing and is not
> considered here.
> 
> Create an infrastructure to query the max vectors required for the CXL
> device.
> 
> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>

Hi Davidlohr,

Basically good, but a few comments inline.

I'll role this onto front of the v2 of the CPMU set as well.

> ---
>  drivers/cxl/pci.c | 63 +++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 63 insertions(+)
> 
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index faeb5d9d7a7a..467f2d568e3e 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -428,6 +428,66 @@ static void devm_cxl_pci_create_doe(struct cxl_dev_state *cxlds)
>  	}
>  }
>  
> +/**
> + * struct cxl_irq_cap - CXL feature that is capable of receiving MSI/MSI-X irqs.
> + *
> + * @name: Name of the device generating this interrupt.
> + * @get_max_msgnum: Get the feature's largest interrupt message number.  If the
> + *		    feature does not have the Interrupt Supported bit set, then
> + *		    return -1.
> + */
> +struct cxl_irq_cap {
> +	const char *name;
> +	int (*get_max_msgnum)(struct cxl_dev_state *cxlds);

For the CPMU case I need to walk the register locator dvsec block so need
the callback to take the pci_dev not the cxl_dev_state.

Also need it later to map the resulting register blocks to go find the irq before
then dropping them mappings so that the resulting CPMU device can grab them
later.

> +};
> +
> +static const struct cxl_irq_cap cxl_irq_cap_table[] = {
> +	{ "isolation", NULL },
> +	{ "pmu_overflow", NULL },
> +	{ "mailbox", NULL },
> +	{ "event", NULL },

Fill these in as we provide them, not upfront. I'd rather see this
attached to one (or possibly several) of the series that are coming along
than stand alone.  so start off with an empty table.



> +};
> +
> +static void cxl_pci_free_irq_vectors(void *data)
> +{
> +	pci_free_irq_vectors(data);
> +}
> +
> +static int cxl_pci_alloc_irq_vectors(struct cxl_dev_state *cxlds)
> +{
> +	struct device *dev = cxlds->dev;
> +	struct pci_dev *pdev = to_pci_dev(dev);
> +	int rc, i, vectors = -1;
> +
> +	for (i = 0; i < ARRAY_SIZE(cxl_irq_cap_table); i++) {
> +		int irq;
> +
> +		if (!cxl_irq_cap_table[i].get_max_msgnum)
> +			continue;
> +
> +		irq = cxl_irq_cap_table[i].get_max_msgnum(cxlds);
> +		vectors = max_t(int, irq, vectors);
> +	}
> +
> +	if (vectors == -1)
> +		return -EINVAL; /* no irq support whatsoever */

return 0 in this case.  No irqs present is a 'good' result if there
aren't any.  Will be up to the consumers of the interrupts to get
their own interrupt vector numbers and they should get the same
answers!

> +
> +	vectors++;
> +	rc = pci_alloc_irq_vectors(pdev, vectors, vectors,
> +				   PCI_IRQ_MSIX | PCI_IRQ_MSI);
> +	if (rc < 0)
> +		return rc;
> +
> +	if (rc != vectors) {
> +		dev_err(dev, "Not enough interrupts; use polling where supported\n");
> +		/* Some got allocated; clean them up */
> +		cxl_pci_free_irq_vectors(pdev);
> +		return -ENOSPC;
> +	}
> +
> +	return devm_add_action_or_reset(dev, cxl_pci_free_irq_vectors, pdev);
> +}
> +
>  static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  {
>  	struct cxl_register_map map;
> @@ -498,6 +558,9 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  	if (IS_ERR(cxlmd))
>  		return PTR_ERR(cxlmd);
>  
> +	/* TODO: When there are users, this return value must be checked */
> +	cxl_pci_alloc_irq_vectors(cxlds);
> +

Gut feeling is this will end up moving ahead of any of the sub device creation
because many of them end up needing interrupts.

Also check response from the start - can't see a reason to not do so as we
won't be registering any at all if no callbacks provided.

So I'd move it above the devm_cxl_add_memdev() call.



>  	if (resource_size(&cxlds->pmem_res) && IS_ENABLED(CONFIG_CXL_PMEM))
>  		rc = devm_cxl_add_nvdimm(&pdev->dev, cxlmd);
>

Davidlohr Bueso Oct. 13, 2022, 5:37 p.m. UTC | #4

Thanks for having a look.

On Thu, 13 Oct 2022, Jonathan Cameron wrote:

>> +struct cxl_irq_cap {
>> +	const char *name;
>> +	int (*get_max_msgnum)(struct cxl_dev_state *cxlds);
>
>For the CPMU case I need to walk the register locator dvsec block so need
>the callback to take the pci_dev not the cxl_dev_state.

Hmm ok, however maybe I'm missing something, but given a pdev, do we have a
way to get back to the cxlds?

...

>>  static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>>  {
>>  	struct cxl_register_map map;
>> @@ -498,6 +558,9 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>>  	if (IS_ERR(cxlmd))
>>  		return PTR_ERR(cxlmd);
>>
>> +	/* TODO: When there are users, this return value must be checked */
>> +	cxl_pci_alloc_irq_vectors(cxlds);
>> +
>
>Gut feeling is this will end up moving ahead of any of the sub device creation
>because many of them end up needing interrupts.
>
>Also check response from the start - can't see a reason to not do so as we
>won't be registering any at all if no callbacks provided.
>
>So I'd move it above the devm_cxl_add_memdev() call.

Will do. In addition, are you ok with grouping the irq setup for each cxl
feature/component, ie:

if (cxl_pci_alloc_irq_vectors(cxlds) > 0) {
    cxl_setup_mbox_irq();
    cxl_setup_events_irq();
    cxl_setup_pmu_irq();
}

I ask mostly from the mailbox perspective, in that we already have
a mbox setup call and can certainly understand if people would prefer
it there, but I tend to prefer the above (logically wrt irqs).

Thanks,
Davidlohr

Davidlohr Bueso Oct. 13, 2022, 6:56 p.m. UTC | #5

On Thu, 13 Oct 2022, Davidlohr Bueso wrote:

>if (cxl_pci_alloc_irq_vectors(cxlds) > 0) {

bleh this should be == 0 of course.

>   cxl_setup_mbox_irq();
>   cxl_setup_events_irq();
>   cxl_setup_pmu_irq();
>}

Jonathan Cameron Oct. 14, 2022, 3:56 p.m. UTC | #6

On Thu, 13 Oct 2022 10:37:03 -0700
Davidlohr Bueso <dave@stgolabs.net> wrote:

> Thanks for having a look.
> 
> On Thu, 13 Oct 2022, Jonathan Cameron wrote:
> 
> >> +struct cxl_irq_cap {
> >> +	const char *name;
> >> +	int (*get_max_msgnum)(struct cxl_dev_state *cxlds);  
> >
> >For the CPMU case I need to walk the register locator dvsec block so need
> >the callback to take the pci_dev not the cxl_dev_state.  
> 
> Hmm ok, however maybe I'm missing something, but given a pdev, do we have a
> way to get back to the cxlds?

I'd failed to register we can easily get from cxlds to pci dev.
Had wrong mental model of what embedded what.

> 
> ...
> 
> >>  static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> >>  {
> >>  	struct cxl_register_map map;
> >> @@ -498,6 +558,9 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> >>  	if (IS_ERR(cxlmd))
> >>  		return PTR_ERR(cxlmd);
> >>
> >> +	/* TODO: When there are users, this return value must be checked */
> >> +	cxl_pci_alloc_irq_vectors(cxlds);
> >> +  
> >
> >Gut feeling is this will end up moving ahead of any of the sub device creation
> >because many of them end up needing interrupts.
> >
> >Also check response from the start - can't see a reason to not do so as we
> >won't be registering any at all if no callbacks provided.
> >
> >So I'd move it above the devm_cxl_add_memdev() call.  
> 
> Will do. In addition, are you ok with grouping the irq setup for each cxl
> feature/component, ie:
> 
> if (cxl_pci_alloc_irq_vectors(cxlds) > 0) {
>     cxl_setup_mbox_irq();
>     cxl_setup_events_irq();
>     cxl_setup_pmu_irq();
> }
> 
> I ask mostly from the mailbox perspective, in that we already have
> a mbox setup call and can certainly understand if people would prefer
> it there, but I tend to prefer the above (logically wrt irqs).

I'd rather see it in each of the setup calls.  Out in pci.c we
should just be doing minimum to find that max irq number.
e.g. the CPMU driver will rewalk and find it's vector number directly
from it's own register space.

Jonathan
> 
> Thanks,
> Davidlohr

Ira Weiny Oct. 14, 2022, 4:07 p.m. UTC | #7

On Thu, Oct 13, 2022 at 01:19:13PM +0100, Jonathan Cameron wrote:
> On Wed, 12 Oct 2022 11:04:32 -0700
> Davidlohr Bueso <dave@stgolabs.net> wrote:
> 
> > Introduce a generic irq table for CXL components/features that can have
> > standard irq support - DOE requires dynamic vector sizing and is not
> > considered here.
> > 
> > Create an infrastructure to query the max vectors required for the CXL
> > device.
> > 
> > Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
> 
> Hi Davidlohr,
> 
> Basically good, but a few comments inline.
> 
> I'll role this onto front of the v2 of the CPMU set as well.

And I don't mind this landing ahead of the event stuff.  I'll take this in my
series too but expect it to drop out when applied.

Ira

> 
> > ---
> >  drivers/cxl/pci.c | 63 +++++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 63 insertions(+)
> > 
> > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > index faeb5d9d7a7a..467f2d568e3e 100644
> > --- a/drivers/cxl/pci.c
> > +++ b/drivers/cxl/pci.c
> > @@ -428,6 +428,66 @@ static void devm_cxl_pci_create_doe(struct cxl_dev_state *cxlds)
> >  	}
> >  }
> >  
> > +/**
> > + * struct cxl_irq_cap - CXL feature that is capable of receiving MSI/MSI-X irqs.
> > + *
> > + * @name: Name of the device generating this interrupt.
> > + * @get_max_msgnum: Get the feature's largest interrupt message number.  If the
> > + *		    feature does not have the Interrupt Supported bit set, then
> > + *		    return -1.
> > + */
> > +struct cxl_irq_cap {
> > +	const char *name;
> > +	int (*get_max_msgnum)(struct cxl_dev_state *cxlds);
> 
> For the CPMU case I need to walk the register locator dvsec block so need
> the callback to take the pci_dev not the cxl_dev_state.
> 
> Also need it later to map the resulting register blocks to go find the irq before
> then dropping them mappings so that the resulting CPMU device can grab them
> later.
> 
> > +};
> > +
> > +static const struct cxl_irq_cap cxl_irq_cap_table[] = {
> > +	{ "isolation", NULL },
> > +	{ "pmu_overflow", NULL },
> > +	{ "mailbox", NULL },
> > +	{ "event", NULL },
> 
> Fill these in as we provide them, not upfront. I'd rather see this
> attached to one (or possibly several) of the series that are coming along
> than stand alone.  so start off with an empty table.
> 
> 
> 
> > +};
> > +
> > +static void cxl_pci_free_irq_vectors(void *data)
> > +{
> > +	pci_free_irq_vectors(data);
> > +}
> > +
> > +static int cxl_pci_alloc_irq_vectors(struct cxl_dev_state *cxlds)
> > +{
> > +	struct device *dev = cxlds->dev;
> > +	struct pci_dev *pdev = to_pci_dev(dev);
> > +	int rc, i, vectors = -1;
> > +
> > +	for (i = 0; i < ARRAY_SIZE(cxl_irq_cap_table); i++) {
> > +		int irq;
> > +
> > +		if (!cxl_irq_cap_table[i].get_max_msgnum)
> > +			continue;
> > +
> > +		irq = cxl_irq_cap_table[i].get_max_msgnum(cxlds);
> > +		vectors = max_t(int, irq, vectors);
> > +	}
> > +
> > +	if (vectors == -1)
> > +		return -EINVAL; /* no irq support whatsoever */
> 
> return 0 in this case.  No irqs present is a 'good' result if there
> aren't any.  Will be up to the consumers of the interrupts to get
> their own interrupt vector numbers and they should get the same
> answers!
> 
> > +
> > +	vectors++;
> > +	rc = pci_alloc_irq_vectors(pdev, vectors, vectors,
> > +				   PCI_IRQ_MSIX | PCI_IRQ_MSI);
> > +	if (rc < 0)
> > +		return rc;
> > +
> > +	if (rc != vectors) {
> > +		dev_err(dev, "Not enough interrupts; use polling where supported\n");
> > +		/* Some got allocated; clean them up */
> > +		cxl_pci_free_irq_vectors(pdev);
> > +		return -ENOSPC;
> > +	}
> > +
> > +	return devm_add_action_or_reset(dev, cxl_pci_free_irq_vectors, pdev);
> > +}
> > +
> >  static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> >  {
> >  	struct cxl_register_map map;
> > @@ -498,6 +558,9 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> >  	if (IS_ERR(cxlmd))
> >  		return PTR_ERR(cxlmd);
> >  
> > +	/* TODO: When there are users, this return value must be checked */
> > +	cxl_pci_alloc_irq_vectors(cxlds);
> > +
> 
> Gut feeling is this will end up moving ahead of any of the sub device creation
> because many of them end up needing interrupts.
> 
> Also check response from the start - can't see a reason to not do so as we
> won't be registering any at all if no callbacks provided.
> 
> So I'd move it above the devm_cxl_add_memdev() call.
> 
> 
> 
> >  	if (resource_size(&cxlds->pmem_res) && IS_ENABLED(CONFIG_CXL_PMEM))
> >  		rc = devm_cxl_add_nvdimm(&pdev->dev, cxlmd);
> >  
>

cxl: Add generic MSI/MSI-X interrupt support

Commit Message

Comments

Patch