diff mbox series

[v2,9/9] PCI: rockchip: Add parameter check for RK3399 PCIe endpoint core set_msi()

Message ID 20230214140858.1133292-10-rick.wertenbroek@gmail.com (mailing list archive)
State Superseded
Delegated to: Lorenzo Pieralisi
Headers show
Series PCI: rockchip: Fix RK3399 PCIe endpoint controller driver | expand

Commit Message

Rick Wertenbroek Feb. 14, 2023, 2:08 p.m. UTC
The RK3399 PCIe endpoint core supports only a single PCIe physcial
function (function number 0), therefore return -EINVAL if set_msi() is
called with a function number greater than 0.
The PCIe standard only allows the multi message capability (MMC) value
to be up to 0x5 (32 messages), therefore return -EINVAL if set_msi() is
called with a MMC value of over 0x5.

Signed-off-by: Rick Wertenbroek <rick.wertenbroek@gmail.com>
---
 drivers/pci/controller/pcie-rockchip-ep.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

Comments

Damien Le Moal Feb. 15, 2023, 1:39 a.m. UTC | #1
On 2/14/23 23:08, Rick Wertenbroek wrote:
> The RK3399 PCIe endpoint core supports only a single PCIe physcial
> function (function number 0), therefore return -EINVAL if set_msi() is
> called with a function number greater than 0.
> The PCIe standard only allows the multi message capability (MMC) value
> to be up to 0x5 (32 messages), therefore return -EINVAL if set_msi() is
> called with a MMC value of over 0x5.
> 
> Signed-off-by: Rick Wertenbroek <rick.wertenbroek@gmail.com>
> ---
>  drivers/pci/controller/pcie-rockchip-ep.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/drivers/pci/controller/pcie-rockchip-ep.c b/drivers/pci/controller/pcie-rockchip-ep.c
> index b7865a94e..80634b690 100644
> --- a/drivers/pci/controller/pcie-rockchip-ep.c
> +++ b/drivers/pci/controller/pcie-rockchip-ep.c
> @@ -294,6 +294,16 @@ static int rockchip_pcie_ep_set_msi(struct pci_epc *epc, u8 fn, u8 vfn,
>  	struct rockchip_pcie *rockchip = &ep->rockchip;
>  	u32 flags;
>  
> +	if (fn) {
> +		dev_err(&epc->dev, "This endpoint controller only supports a single physical function\n");
> +		return -EINVAL;
> +	}

Checking this here is late... Given that at most only one physical
function is supported, the check should be in rockchip_pcie_parse_ep_dt().
Something like:

	err = of_property_read_u8(dev->of_node, "max-functions",
                                  &ep->epc->max_functions);

        if (err < 0 || ep->epc->max_functions > 1)

                ep->epc->max_functions = 1;

And all the macros with the (fn) argument could also be simplified
(argument fn removed) since fn will always be 0.

> +
> +	if (mmc > 0x5) {
> +		dev_err(&epc->dev, "Number of MSI IRQs cannot be more than 32\n");

Long line. Please split it after the comma.

> +		return -EINVAL;
> +	}
> +
>  	flags = rockchip_pcie_read(rockchip,
>  				   ROCKCHIP_PCIE_EP_FUNC_BASE(fn) +
>  				   ROCKCHIP_PCIE_EP_MSI_CTRL_REG);

Another nice cleanup: define ROCKCHIP_PCIE_EP_MSI_CTRL_REG to include the
ROCKCHIP_PCIE_EP_FUNC_BASE(fn) addition so that we do not have to do it
here all the time.
Rick Wertenbroek Feb. 21, 2023, 10:47 a.m. UTC | #2
On Wed, Feb 15, 2023 at 2:39 AM Damien Le Moal
<damien.lemoal@opensource.wdc.com> wrote:
>
> On 2/14/23 23:08, Rick Wertenbroek wrote:
> > The RK3399 PCIe endpoint core supports only a single PCIe physcial
> > function (function number 0), therefore return -EINVAL if set_msi() is
> > called with a function number greater than 0.
> > The PCIe standard only allows the multi message capability (MMC) value
> > to be up to 0x5 (32 messages), therefore return -EINVAL if set_msi() is
> > called with a MMC value of over 0x5.
> >
> > Signed-off-by: Rick Wertenbroek <rick.wertenbroek@gmail.com>
> > ---
> >  drivers/pci/controller/pcie-rockchip-ep.c | 10 ++++++++++
> >  1 file changed, 10 insertions(+)
> >
> > diff --git a/drivers/pci/controller/pcie-rockchip-ep.c b/drivers/pci/controller/pcie-rockchip-ep.c
> > index b7865a94e..80634b690 100644
> > --- a/drivers/pci/controller/pcie-rockchip-ep.c
> > +++ b/drivers/pci/controller/pcie-rockchip-ep.c
> > @@ -294,6 +294,16 @@ static int rockchip_pcie_ep_set_msi(struct pci_epc *epc, u8 fn, u8 vfn,
> >       struct rockchip_pcie *rockchip = &ep->rockchip;
> >       u32 flags;
> >
> > +     if (fn) {
> > +             dev_err(&epc->dev, "This endpoint controller only supports a single physical function\n");
> > +             return -EINVAL;
> > +     }
>
> Checking this here is late... Given that at most only one physical
> function is supported, the check should be in rockchip_pcie_parse_ep_dt().
> Something like:
>
>         err = of_property_read_u8(dev->of_node, "max-functions",
>                                   &ep->epc->max_functions);
>
>         if (err < 0 || ep->epc->max_functions > 1)
>
>                 ep->epc->max_functions = 1;
>

Yes, this could be moved to the probe, thanks.

> And all the macros with the (fn) argument could also be simplified
> (argument fn removed) since fn will always be 0.

These functions cannot be simplified because they have to follow the signature
given by "pci_epc_ops" (include/linux/pci-epc.h). And this signature has the
function number as a parameter. If we change the function signature we won't
be able to assign these functions to the pc_epc_ops structure

static const struct pci_epc_ops rockchip_pcie_epc_ops = {
       .write_header = rockchip_pcie_ep_write_header,
       .set_bar = rockchip_pcie_ep_set_bar,
       .clear_bar = rockchip_pcie_ep_clear_bar,
       .map_addr = rockchip_pcie_ep_map_addr,
       .unmap_addr = rockchip_pcie_ep_unmap_addr,
       .set_msi = rockchip_pcie_ep_set_msi,
       .get_msi = rockchip_pcie_ep_get_msi,
       .raise_irq = rockchip_pcie_ep_raise_irq,
       .start = rockchip_pcie_ep_start,
       .get_features = rockchip_pcie_ep_get_features,
};

>
> > +
> > +     if (mmc > 0x5) {
> > +             dev_err(&epc->dev, "Number of MSI IRQs cannot be more than 32\n");
>
> Long line. Please split it after the comma.
>
> > +             return -EINVAL;
> > +     }
> > +
> >       flags = rockchip_pcie_read(rockchip,
> >                                  ROCKCHIP_PCIE_EP_FUNC_BASE(fn) +
> >                                  ROCKCHIP_PCIE_EP_MSI_CTRL_REG);
>
> Another nice cleanup: define ROCKCHIP_PCIE_EP_MSI_CTRL_REG to include the
> ROCKCHIP_PCIE_EP_FUNC_BASE(fn) addition so that we do not have to do it
> here all the time.

Yes, this could be an improvement but this is the way it is written
everywhere in this
driver, I chose to keep it so as to remain coherent with the rest of the driver.
Cleaning this is not so important since this code will not be
rewritten / changed so
often. But I agree that it might be nicer. But, on the other side if
at some point
support for virtual functions would be added, the offsets would need
to be computed
based on the virtual function number and the code would be written
like it is now,
so I suggest keeping this the way it is for now.

>
> --
> Damien Le Moal
> Western Digital Research
>

Thank you for your comments.

Regards
Rick
Damien Le Moal Feb. 21, 2023, 10:55 a.m. UTC | #3
On 2/21/23 19:47, Rick Wertenbroek wrote:
> On Wed, Feb 15, 2023 at 2:39 AM Damien Le Moal
> <damien.lemoal@opensource.wdc.com> wrote:
>>
>> On 2/14/23 23:08, Rick Wertenbroek wrote:
>>> The RK3399 PCIe endpoint core supports only a single PCIe physcial
>>> function (function number 0), therefore return -EINVAL if set_msi() is
>>> called with a function number greater than 0.
>>> The PCIe standard only allows the multi message capability (MMC) value
>>> to be up to 0x5 (32 messages), therefore return -EINVAL if set_msi() is
>>> called with a MMC value of over 0x5.
>>>
>>> Signed-off-by: Rick Wertenbroek <rick.wertenbroek@gmail.com>
>>> ---
>>>  drivers/pci/controller/pcie-rockchip-ep.c | 10 ++++++++++
>>>  1 file changed, 10 insertions(+)
>>>
>>> diff --git a/drivers/pci/controller/pcie-rockchip-ep.c b/drivers/pci/controller/pcie-rockchip-ep.c
>>> index b7865a94e..80634b690 100644
>>> --- a/drivers/pci/controller/pcie-rockchip-ep.c
>>> +++ b/drivers/pci/controller/pcie-rockchip-ep.c
>>> @@ -294,6 +294,16 @@ static int rockchip_pcie_ep_set_msi(struct pci_epc *epc, u8 fn, u8 vfn,
>>>       struct rockchip_pcie *rockchip = &ep->rockchip;
>>>       u32 flags;
>>>
>>> +     if (fn) {
>>> +             dev_err(&epc->dev, "This endpoint controller only supports a single physical function\n");
>>> +             return -EINVAL;
>>> +     }
>>
>> Checking this here is late... Given that at most only one physical
>> function is supported, the check should be in rockchip_pcie_parse_ep_dt().
>> Something like:
>>
>>         err = of_property_read_u8(dev->of_node, "max-functions",
>>                                   &ep->epc->max_functions);
>>
>>         if (err < 0 || ep->epc->max_functions > 1)
>>
>>                 ep->epc->max_functions = 1;
>>
> 
> Yes, this could be moved to the probe, thanks.
> 
>> And all the macros with the (fn) argument could also be simplified
>> (argument fn removed) since fn will always be 0.
> 
> These functions cannot be simplified because they have to follow the signature
> given by "pci_epc_ops" (include/linux/pci-epc.h). And this signature has the
> function number as a parameter. If we change the function signature we won't
> be able to assign these functions to the pc_epc_ops structure

I was not suggesting to change the functions signature. I was suggesting
dropping the fn argument for the *macros*, e.g.

ROCKCHIP_PCIE_EP_FUNC_BASE(fn) -> ROCKCHIP_PCIE_EP_FUNC_BASE

since fn is always 0.

That said, I am not entirely sure if the limit really is 1 function at most. The
TRM seems to be suggesting that up to 4 functions can be supported...

[...]

>> Another nice cleanup: define ROCKCHIP_PCIE_EP_MSI_CTRL_REG to include the
>> ROCKCHIP_PCIE_EP_FUNC_BASE(fn) addition so that we do not have to do it
>> here all the time.
> 
> Yes, this could be an improvement but this is the way it is written
> everywhere in this
> driver, I chose to keep it so as to remain coherent with the rest of the driver.
> Cleaning this is not so important since this code will not be
> rewritten / changed so
> often. But I agree that it might be nicer. But, on the other side if
> at some point
> support for virtual functions would be added, the offsets would need
> to be computed
> based on the virtual function number and the code would be written
> like it is now,
> so I suggest keeping this the way it is for now.

Yes, sure, this can be cleaned later.

A more pressing problem is the lack of support for MSIX despite the fact that
the controller supports that *and* advertize it as a capability. That is what
was causing my problem with the Linux nvme driver and my prototype nvme epf
function driver: the host driver was seeing MSIX support (1 vector supported by
default), and so was allocating one MSIX for the device probe. But on the EP
end, it is MSI or INTX only... Working on adding that to solve this issue.
Rick Wertenbroek Feb. 21, 2023, 1:19 p.m. UTC | #4
On Tue, Feb 21, 2023 at 11:55 AM Damien Le Moal
<damien.lemoal@opensource.wdc.com> wrote:
>
> On 2/21/23 19:47, Rick Wertenbroek wrote:
> > On Wed, Feb 15, 2023 at 2:39 AM Damien Le Moal
> > <damien.lemoal@opensource.wdc.com> wrote:
> >>
> >> On 2/14/23 23:08, Rick Wertenbroek wrote:
> >>> The RK3399 PCIe endpoint core supports only a single PCIe physcial
> >>> function (function number 0), therefore return -EINVAL if set_msi() is
> >>> called with a function number greater than 0.
> >>> The PCIe standard only allows the multi message capability (MMC) value
> >>> to be up to 0x5 (32 messages), therefore return -EINVAL if set_msi() is
> >>> called with a MMC value of over 0x5.
> >>>
> >>> Signed-off-by: Rick Wertenbroek <rick.wertenbroek@gmail.com>
> >>> ---
> >>>  drivers/pci/controller/pcie-rockchip-ep.c | 10 ++++++++++
> >>>  1 file changed, 10 insertions(+)
> >>>
> >>> diff --git a/drivers/pci/controller/pcie-rockchip-ep.c b/drivers/pci/controller/pcie-rockchip-ep.c
> >>> index b7865a94e..80634b690 100644
> >>> --- a/drivers/pci/controller/pcie-rockchip-ep.c
> >>> +++ b/drivers/pci/controller/pcie-rockchip-ep.c
> >>> @@ -294,6 +294,16 @@ static int rockchip_pcie_ep_set_msi(struct pci_epc *epc, u8 fn, u8 vfn,
> >>>       struct rockchip_pcie *rockchip = &ep->rockchip;
> >>>       u32 flags;
> >>>
> >>> +     if (fn) {
> >>> +             dev_err(&epc->dev, "This endpoint controller only supports a single physical function\n");
> >>> +             return -EINVAL;
> >>> +     }
> >>
> >> Checking this here is late... Given that at most only one physical
> >> function is supported, the check should be in rockchip_pcie_parse_ep_dt().
> >> Something like:
> >>
> >>         err = of_property_read_u8(dev->of_node, "max-functions",
> >>                                   &ep->epc->max_functions);
> >>
> >>         if (err < 0 || ep->epc->max_functions > 1)
> >>
> >>                 ep->epc->max_functions = 1;
> >>
> >
> > Yes, this could be moved to the probe, thanks.
> >
> >> And all the macros with the (fn) argument could also be simplified
> >> (argument fn removed) since fn will always be 0.
> >
> > These functions cannot be simplified because they have to follow the signature
> > given by "pci_epc_ops" (include/linux/pci-epc.h). And this signature has the
> > function number as a parameter. If we change the function signature we won't
> > be able to assign these functions to the pc_epc_ops structure
>
> I was not suggesting to change the functions signature. I was suggesting
> dropping the fn argument for the *macros*, e.g.
>
> ROCKCHIP_PCIE_EP_FUNC_BASE(fn) -> ROCKCHIP_PCIE_EP_FUNC_BASE
>
> since fn is always 0.
>
> That said, I am not entirely sure if the limit really is 1 function at most. The
> TRM seems to be suggesting that up to 4 functions can be supported...
>
> [...]
>
> >> Another nice cleanup: define ROCKCHIP_PCIE_EP_MSI_CTRL_REG to include the
> >> ROCKCHIP_PCIE_EP_FUNC_BASE(fn) addition so that we do not have to do it
> >> here all the time.
> >
> > Yes, this could be an improvement but this is the way it is written
> > everywhere in this
> > driver, I chose to keep it so as to remain coherent with the rest of the driver.
> > Cleaning this is not so important since this code will not be
> > rewritten / changed so
> > often. But I agree that it might be nicer. But, on the other side if
> > at some point
> > support for virtual functions would be added, the offsets would need
> > to be computed
> > based on the virtual function number and the code would be written
> > like it is now,
> > so I suggest keeping this the way it is for now.
>
> Yes, sure, this can be cleaned later.
>
> A more pressing problem is the lack of support for MSIX despite the fact that
> the controller supports that *and* advertize it as a capability. That is what
> was causing my problem with the Linux nvme driver and my prototype nvme epf
> function driver: the host driver was seeing MSIX support (1 vector supported by
> default), and so was allocating one MSIX for the device probe. But on the EP
> end, it is MSI or INTX only... Working on adding that to solve this issue.
>

I have seen this too, the controller advertises the capability. However, the TRM
(section 17.5.9) says that MSI-X is not supported (MSI / INTx only as you said).
So the solution should be to modify the probe function of the endpoint
controller
so that the MSI-X capability would not be advertised, this should fix
your problem.

I wonder if one could still implement MSI-X because from the endpoint we would
just need to implement it as a message (TLP) over PCIe (because the space for
the vectors is allocated and written, so that part should be ok). I am
not an expert
on MSI-X, but the reason the endpoint cannot send them could be because MSI-X
requires some fields in the PCIe header descriptor to be filled with values that
cannot be set through the "desc0-3" registers of the RK3399 PCIe endpoint core.

Anyways, the endpoint should not advertise the MSI-X capabilities when it cannot
send such interrupts. Once this is fixed you should be able to have your NVMe
function running.

Regards.
Rick


> --
> Damien Le Moal
> Western Digital Research
>
Rick Wertenbroek Feb. 21, 2023, 4:37 p.m. UTC | #5
On Tue, Feb 21, 2023 at 2:19 PM Rick Wertenbroek
<rick.wertenbroek@gmail.com> wrote:
>
> On Tue, Feb 21, 2023 at 11:55 AM Damien Le Moal
> <damien.lemoal@opensource.wdc.com> wrote:
> >
> > On 2/21/23 19:47, Rick Wertenbroek wrote:
> > > On Wed, Feb 15, 2023 at 2:39 AM Damien Le Moal
> > > <damien.lemoal@opensource.wdc.com> wrote:
> > >>
> > >> On 2/14/23 23:08, Rick Wertenbroek wrote:
> > >>> The RK3399 PCIe endpoint core supports only a single PCIe physcial
> > >>> function (function number 0), therefore return -EINVAL if set_msi() is
> > >>> called with a function number greater than 0.
> > >>> The PCIe standard only allows the multi message capability (MMC) value
> > >>> to be up to 0x5 (32 messages), therefore return -EINVAL if set_msi() is
> > >>> called with a MMC value of over 0x5.
> > >>>
> > >>> Signed-off-by: Rick Wertenbroek <rick.wertenbroek@gmail.com>
> > >>> ---
> > >>>  drivers/pci/controller/pcie-rockchip-ep.c | 10 ++++++++++
> > >>>  1 file changed, 10 insertions(+)
> > >>>
> > >>> diff --git a/drivers/pci/controller/pcie-rockchip-ep.c b/drivers/pci/controller/pcie-rockchip-ep.c
> > >>> index b7865a94e..80634b690 100644
> > >>> --- a/drivers/pci/controller/pcie-rockchip-ep.c
> > >>> +++ b/drivers/pci/controller/pcie-rockchip-ep.c
> > >>> @@ -294,6 +294,16 @@ static int rockchip_pcie_ep_set_msi(struct pci_epc *epc, u8 fn, u8 vfn,
> > >>>       struct rockchip_pcie *rockchip = &ep->rockchip;
> > >>>       u32 flags;
> > >>>
> > >>> +     if (fn) {
> > >>> +             dev_err(&epc->dev, "This endpoint controller only supports a single physical function\n");
> > >>> +             return -EINVAL;
> > >>> +     }
> > >>
> > >> Checking this here is late... Given that at most only one physical
> > >> function is supported, the check should be in rockchip_pcie_parse_ep_dt().
> > >> Something like:
> > >>
> > >>         err = of_property_read_u8(dev->of_node, "max-functions",
> > >>                                   &ep->epc->max_functions);
> > >>
> > >>         if (err < 0 || ep->epc->max_functions > 1)
> > >>
> > >>                 ep->epc->max_functions = 1;
> > >>
> > >
> > > Yes, this could be moved to the probe, thanks.
> > >
> > >> And all the macros with the (fn) argument could also be simplified
> > >> (argument fn removed) since fn will always be 0.
> > >
> > > These functions cannot be simplified because they have to follow the signature
> > > given by "pci_epc_ops" (include/linux/pci-epc.h). And this signature has the
> > > function number as a parameter. If we change the function signature we won't
> > > be able to assign these functions to the pc_epc_ops structure
> >
> > I was not suggesting to change the functions signature. I was suggesting
> > dropping the fn argument for the *macros*, e.g.
> >
> > ROCKCHIP_PCIE_EP_FUNC_BASE(fn) -> ROCKCHIP_PCIE_EP_FUNC_BASE
> >
> > since fn is always 0.
> >
> > That said, I am not entirely sure if the limit really is 1 function at most. The
> > TRM seems to be suggesting that up to 4 functions can be supported...
> >
> > [...]
> >
> > >> Another nice cleanup: define ROCKCHIP_PCIE_EP_MSI_CTRL_REG to include the
> > >> ROCKCHIP_PCIE_EP_FUNC_BASE(fn) addition so that we do not have to do it
> > >> here all the time.
> > >
> > > Yes, this could be an improvement but this is the way it is written
> > > everywhere in this
> > > driver, I chose to keep it so as to remain coherent with the rest of the driver.
> > > Cleaning this is not so important since this code will not be
> > > rewritten / changed so
> > > often. But I agree that it might be nicer. But, on the other side if
> > > at some point
> > > support for virtual functions would be added, the offsets would need
> > > to be computed
> > > based on the virtual function number and the code would be written
> > > like it is now,
> > > so I suggest keeping this the way it is for now.
> >
> > Yes, sure, this can be cleaned later.
> >
> > A more pressing problem is the lack of support for MSIX despite the fact that
> > the controller supports that *and* advertize it as a capability. That is what
> > was causing my problem with the Linux nvme driver and my prototype nvme epf
> > function driver: the host driver was seeing MSIX support (1 vector supported by
> > default), and so was allocating one MSIX for the device probe. But on the EP
> > end, it is MSI or INTX only... Working on adding that to solve this issue.
> >
>
> I have seen this too, the controller advertises the capability. However, the TRM
> (section 17.5.9) says that MSI-X is not supported (MSI / INTx only as you said).
> So the solution should be to modify the probe function of the endpoint
> controller
> so that the MSI-X capability would not be advertised, this should fix
> your problem.
>
> I wonder if one could still implement MSI-X because from the endpoint we would
> just need to implement it as a message (TLP) over PCIe (because the space for
> the vectors is allocated and written, so that part should be ok). I am
> not an expert
> on MSI-X, but the reason the endpoint cannot send them could be because MSI-X
> requires some fields in the PCIe header descriptor to be filled with values that
> cannot be set through the "desc0-3" registers of the RK3399 PCIe endpoint core.
>
> Anyways, the endpoint should not advertise the MSI-X capabilities when it cannot
> send such interrupts. Once this is fixed you should be able to have your NVMe
> function running.
>
> Regards.
> Rick
>

It is possible to disable MSI-X by skipping the MSI-X capability
structure in the PCIe
capabilities structures linked-list.
The current linked list is MSI cap (0x90) -> MSI-X cap (0xb0) -> PCIe
Device cap (0xc0)
So we want to set MSI (0x90) -> PCIe Device cap (0xc0)

This can be done by writing 0xc0 to bits 15-8 of 0xFDA0'0090 (MSI cap).
I tested this quickly through devmem2 before loading the endpoint
function driver
and it fixes the issue of MSI-X capabilities being advertised to the host.

In the driver the changes would look like this;
In the probe function you can disable MSI-X as follows:

@@ -631,6 +618,28 @@ static int rockchip_pcie_ep_probe(struct
platform_device *pdev)

        ep->irq_pci_addr = ROCKCHIP_PCIE_EP_DUMMY_IRQ_ADDR;

+       /*
+        * Disable MSI-X because the controller is not capable of MSI-X
+        * This requires to skip the MSI-X capabilities entry in the
+        * chain of PCIe capabilities, we get the next pointer from the
+        * MSI-X entry and set that in the MSI capability entry, this way
+        * the MSI-X entry is skipped (left out of the linked-list)
+        */
+       cfg_msi = rockchip_pcie_read(rockchip, PCIE_EP_CONFIG_BASE +
+               ROCKCHIP_PCIE_EP_MSI_CTRL_REG);
+
+       cfg_msi &= ~ROCKCHIP_PCIE_EP_MSI_CP1_MASK;
+
+       cfg_msix_cp = rockchip_pcie_read(rockchip, PCIE_EP_CONFIG_BASE +
+               ROCKCHIP_PCIE_EP_MSIX_CAP_REG) &
ROCKCHIP_PCIE_EP_MSIX_CAP_CP_MASK;
+
+       cfg_msi |= cfg_msix_cp;
+
+       rockchip_pcie_write(rockchip, cfg_msi,
+               PCIE_EP_CONFIG_BASE + ROCKCHIP_PCIE_EP_MSI_CTRL_REG);
+
        rockchip_pcie_write(rockchip, PCIE_CLIENT_CONF_ENABLE,
PCIE_CLIENT_CONFIG);

        return 0;
 err_epc_mem_exit:
        pci_epc_mem_exit(epc);

In the pcie-rockchip.h add the following #defines:

@@ -216,21 +227,28 @@
 #define ROCKCHIP_PCIE_EP_CMD_STATUS                    0x4
 #define   ROCKCHIP_PCIE_EP_CMD_STATUS_IS               BIT(19)
 #define ROCKCHIP_PCIE_EP_MSI_CTRL_REG                  0x90
+#define   ROCKCHIP_PCIE_EP_MSI_CP1_OFFSET                      8
+#define   ROCKCHIP_PCIE_EP_MSI_CP1_MASK                        GENMASK(15, 8)
+#define   ROCKCHIP_PCIE_EP_MSI_FLAGS_OFFSET                    16
 #define   ROCKCHIP_PCIE_EP_MSI_CTRL_MMC_OFFSET         17
 #define   ROCKCHIP_PCIE_EP_MSI_CTRL_MMC_MASK           GENMASK(19, 17)
 #define   ROCKCHIP_PCIE_EP_MSI_CTRL_MME_OFFSET         20
 #define   ROCKCHIP_PCIE_EP_MSI_CTRL_MME_MASK           GENMASK(22, 20)
 #define   ROCKCHIP_PCIE_EP_MSI_CTRL_ME                         BIT(16)
 #define   ROCKCHIP_PCIE_EP_MSI_CTRL_MASK_MSI_CAP       BIT(24)
+#define ROCKCHIP_PCIE_EP_MSIX_CAP_REG                  0xb0
+#define   ROCKCHIP_PCIE_EP_MSIX_CAP_CP_OFFSET          8
+#define   ROCKCHIP_PCIE_EP_MSIX_CAP_CP_MASK            GENMASK(15, 8)
 #define ROCKCHIP_PCIE_EP_DUMMY_IRQ_ADDR                                0x1
 #define ROCKCHIP_PCIE_EP_PCI_LEGACY_IRQ_ADDR           0x3

I will add this to the next version of the patch set.
Thank you Damien for pointing this out ! This should solve the issues
you have with
your NVMe endpoint function regarding MSI-X interrupts.

Regards
Rick

>
> > --
> > Damien Le Moal
> > Western Digital Research
> >
Damien Le Moal Feb. 21, 2023, 9:57 p.m. UTC | #6
On 2/21/23 22:19, Rick Wertenbroek wrote:
> On Tue, Feb 21, 2023 at 11:55 AM Damien Le Moal
> <damien.lemoal@opensource.wdc.com> wrote:
>>
>> On 2/21/23 19:47, Rick Wertenbroek wrote:
>>> On Wed, Feb 15, 2023 at 2:39 AM Damien Le Moal
>>> <damien.lemoal@opensource.wdc.com> wrote:
>>>>
>>>> On 2/14/23 23:08, Rick Wertenbroek wrote:
>>>>> The RK3399 PCIe endpoint core supports only a single PCIe physcial
>>>>> function (function number 0), therefore return -EINVAL if set_msi() is
>>>>> called with a function number greater than 0.
>>>>> The PCIe standard only allows the multi message capability (MMC) value
>>>>> to be up to 0x5 (32 messages), therefore return -EINVAL if set_msi() is
>>>>> called with a MMC value of over 0x5.
>>>>>
>>>>> Signed-off-by: Rick Wertenbroek <rick.wertenbroek@gmail.com>
>>>>> ---
>>>>>  drivers/pci/controller/pcie-rockchip-ep.c | 10 ++++++++++
>>>>>  1 file changed, 10 insertions(+)
>>>>>
>>>>> diff --git a/drivers/pci/controller/pcie-rockchip-ep.c b/drivers/pci/controller/pcie-rockchip-ep.c
>>>>> index b7865a94e..80634b690 100644
>>>>> --- a/drivers/pci/controller/pcie-rockchip-ep.c
>>>>> +++ b/drivers/pci/controller/pcie-rockchip-ep.c
>>>>> @@ -294,6 +294,16 @@ static int rockchip_pcie_ep_set_msi(struct pci_epc *epc, u8 fn, u8 vfn,
>>>>>       struct rockchip_pcie *rockchip = &ep->rockchip;
>>>>>       u32 flags;
>>>>>
>>>>> +     if (fn) {
>>>>> +             dev_err(&epc->dev, "This endpoint controller only supports a single physical function\n");
>>>>> +             return -EINVAL;
>>>>> +     }
>>>>
>>>> Checking this here is late... Given that at most only one physical
>>>> function is supported, the check should be in rockchip_pcie_parse_ep_dt().
>>>> Something like:
>>>>
>>>>         err = of_property_read_u8(dev->of_node, "max-functions",
>>>>                                   &ep->epc->max_functions);
>>>>
>>>>         if (err < 0 || ep->epc->max_functions > 1)
>>>>
>>>>                 ep->epc->max_functions = 1;
>>>>
>>>
>>> Yes, this could be moved to the probe, thanks.
>>>
>>>> And all the macros with the (fn) argument could also be simplified
>>>> (argument fn removed) since fn will always be 0.
>>>
>>> These functions cannot be simplified because they have to follow the signature
>>> given by "pci_epc_ops" (include/linux/pci-epc.h). And this signature has the
>>> function number as a parameter. If we change the function signature we won't
>>> be able to assign these functions to the pc_epc_ops structure
>>
>> I was not suggesting to change the functions signature. I was suggesting
>> dropping the fn argument for the *macros*, e.g.
>>
>> ROCKCHIP_PCIE_EP_FUNC_BASE(fn) -> ROCKCHIP_PCIE_EP_FUNC_BASE
>>
>> since fn is always 0.
>>
>> That said, I am not entirely sure if the limit really is 1 function at most. The
>> TRM seems to be suggesting that up to 4 functions can be supported...
>>
>> [...]
>>
>>>> Another nice cleanup: define ROCKCHIP_PCIE_EP_MSI_CTRL_REG to include the
>>>> ROCKCHIP_PCIE_EP_FUNC_BASE(fn) addition so that we do not have to do it
>>>> here all the time.
>>>
>>> Yes, this could be an improvement but this is the way it is written
>>> everywhere in this
>>> driver, I chose to keep it so as to remain coherent with the rest of the driver.
>>> Cleaning this is not so important since this code will not be
>>> rewritten / changed so
>>> often. But I agree that it might be nicer. But, on the other side if
>>> at some point
>>> support for virtual functions would be added, the offsets would need
>>> to be computed
>>> based on the virtual function number and the code would be written
>>> like it is now,
>>> so I suggest keeping this the way it is for now.
>>
>> Yes, sure, this can be cleaned later.
>>
>> A more pressing problem is the lack of support for MSIX despite the fact that
>> the controller supports that *and* advertize it as a capability. That is what
>> was causing my problem with the Linux nvme driver and my prototype nvme epf
>> function driver: the host driver was seeing MSIX support (1 vector supported by
>> default), and so was allocating one MSIX for the device probe. But on the EP
>> end, it is MSI or INTX only... Working on adding that to solve this issue.
>>
> 
> I have seen this too, the controller advertises the capability. However, the TRM
> (section 17.5.9) says that MSI-X is not supported (MSI / INTx only as you said).
> So the solution should be to modify the probe function of the endpoint
> controller
> so that the MSI-X capability would not be advertised, this should fix
> your problem.

Yep, that is what I did for now: write 0 in the capability ID of the MSIX
capability list entry. A cleaner solution would be to change the next entry
pointer of the entry preceding the MSIX entry. Will send a patch for that.

> 
> I wonder if one could still implement MSI-X because from the endpoint we would
> just need to implement it as a message (TLP) over PCIe (because the space for
> the vectors is allocated and written, so that part should be ok). I am
> not an expert
> on MSI-X, but the reason the endpoint cannot send them could be because MSI-X
> requires some fields in the PCIe header descriptor to be filled with values that
> cannot be set through the "desc0-3" registers of the RK3399 PCIe endpoint core.
> 
> Anyways, the endpoint should not advertise the MSI-X capabilities when it cannot
> send such interrupts. Once this is fixed you should be able to have your NVMe
> function running.
> 
> Regards.
> Rick
> 
> 
>> --
>> Damien Le Moal
>> Western Digital Research
>>
Damien Le Moal Feb. 21, 2023, 10:01 p.m. UTC | #7
On 2/22/23 01:37, Rick Wertenbroek wrote:
> On Tue, Feb 21, 2023 at 2:19 PM Rick Wertenbroek
> <rick.wertenbroek@gmail.com> wrote:
>>
>> On Tue, Feb 21, 2023 at 11:55 AM Damien Le Moal
>> <damien.lemoal@opensource.wdc.com> wrote:
>>>
>>> On 2/21/23 19:47, Rick Wertenbroek wrote:
>>>> On Wed, Feb 15, 2023 at 2:39 AM Damien Le Moal
>>>> <damien.lemoal@opensource.wdc.com> wrote:
>>>>>
>>>>> On 2/14/23 23:08, Rick Wertenbroek wrote:
>>>>>> The RK3399 PCIe endpoint core supports only a single PCIe physcial
>>>>>> function (function number 0), therefore return -EINVAL if set_msi() is
>>>>>> called with a function number greater than 0.
>>>>>> The PCIe standard only allows the multi message capability (MMC) value
>>>>>> to be up to 0x5 (32 messages), therefore return -EINVAL if set_msi() is
>>>>>> called with a MMC value of over 0x5.
>>>>>>
>>>>>> Signed-off-by: Rick Wertenbroek <rick.wertenbroek@gmail.com>
>>>>>> ---
>>>>>>  drivers/pci/controller/pcie-rockchip-ep.c | 10 ++++++++++
>>>>>>  1 file changed, 10 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/pci/controller/pcie-rockchip-ep.c b/drivers/pci/controller/pcie-rockchip-ep.c
>>>>>> index b7865a94e..80634b690 100644
>>>>>> --- a/drivers/pci/controller/pcie-rockchip-ep.c
>>>>>> +++ b/drivers/pci/controller/pcie-rockchip-ep.c
>>>>>> @@ -294,6 +294,16 @@ static int rockchip_pcie_ep_set_msi(struct pci_epc *epc, u8 fn, u8 vfn,
>>>>>>       struct rockchip_pcie *rockchip = &ep->rockchip;
>>>>>>       u32 flags;
>>>>>>
>>>>>> +     if (fn) {
>>>>>> +             dev_err(&epc->dev, "This endpoint controller only supports a single physical function\n");
>>>>>> +             return -EINVAL;
>>>>>> +     }
>>>>>
>>>>> Checking this here is late... Given that at most only one physical
>>>>> function is supported, the check should be in rockchip_pcie_parse_ep_dt().
>>>>> Something like:
>>>>>
>>>>>         err = of_property_read_u8(dev->of_node, "max-functions",
>>>>>                                   &ep->epc->max_functions);
>>>>>
>>>>>         if (err < 0 || ep->epc->max_functions > 1)
>>>>>
>>>>>                 ep->epc->max_functions = 1;
>>>>>
>>>>
>>>> Yes, this could be moved to the probe, thanks.
>>>>
>>>>> And all the macros with the (fn) argument could also be simplified
>>>>> (argument fn removed) since fn will always be 0.
>>>>
>>>> These functions cannot be simplified because they have to follow the signature
>>>> given by "pci_epc_ops" (include/linux/pci-epc.h). And this signature has the
>>>> function number as a parameter. If we change the function signature we won't
>>>> be able to assign these functions to the pc_epc_ops structure
>>>
>>> I was not suggesting to change the functions signature. I was suggesting
>>> dropping the fn argument for the *macros*, e.g.
>>>
>>> ROCKCHIP_PCIE_EP_FUNC_BASE(fn) -> ROCKCHIP_PCIE_EP_FUNC_BASE
>>>
>>> since fn is always 0.
>>>
>>> That said, I am not entirely sure if the limit really is 1 function at most. The
>>> TRM seems to be suggesting that up to 4 functions can be supported...
>>>
>>> [...]
>>>
>>>>> Another nice cleanup: define ROCKCHIP_PCIE_EP_MSI_CTRL_REG to include the
>>>>> ROCKCHIP_PCIE_EP_FUNC_BASE(fn) addition so that we do not have to do it
>>>>> here all the time.
>>>>
>>>> Yes, this could be an improvement but this is the way it is written
>>>> everywhere in this
>>>> driver, I chose to keep it so as to remain coherent with the rest of the driver.
>>>> Cleaning this is not so important since this code will not be
>>>> rewritten / changed so
>>>> often. But I agree that it might be nicer. But, on the other side if
>>>> at some point
>>>> support for virtual functions would be added, the offsets would need
>>>> to be computed
>>>> based on the virtual function number and the code would be written
>>>> like it is now,
>>>> so I suggest keeping this the way it is for now.
>>>
>>> Yes, sure, this can be cleaned later.
>>>
>>> A more pressing problem is the lack of support for MSIX despite the fact that
>>> the controller supports that *and* advertize it as a capability. That is what
>>> was causing my problem with the Linux nvme driver and my prototype nvme epf
>>> function driver: the host driver was seeing MSIX support (1 vector supported by
>>> default), and so was allocating one MSIX for the device probe. But on the EP
>>> end, it is MSI or INTX only... Working on adding that to solve this issue.
>>>
>>
>> I have seen this too, the controller advertises the capability. However, the TRM
>> (section 17.5.9) says that MSI-X is not supported (MSI / INTx only as you said).
>> So the solution should be to modify the probe function of the endpoint
>> controller
>> so that the MSI-X capability would not be advertised, this should fix
>> your problem.
>>
>> I wonder if one could still implement MSI-X because from the endpoint we would
>> just need to implement it as a message (TLP) over PCIe (because the space for
>> the vectors is allocated and written, so that part should be ok). I am
>> not an expert
>> on MSI-X, but the reason the endpoint cannot send them could be because MSI-X
>> requires some fields in the PCIe header descriptor to be filled with values that
>> cannot be set through the "desc0-3" registers of the RK3399 PCIe endpoint core.
>>
>> Anyways, the endpoint should not advertise the MSI-X capabilities when it cannot
>> send such interrupts. Once this is fixed you should be able to have your NVMe
>> function running.
>>
>> Regards.
>> Rick
>>
> 
> It is possible to disable MSI-X by skipping the MSI-X capability
> structure in the PCIe
> capabilities structures linked-list.
> The current linked list is MSI cap (0x90) -> MSI-X cap (0xb0) -> PCIe
> Device cap (0xc0)
> So we want to set MSI (0x90) -> PCIe Device cap (0xc0)
> 
> This can be done by writing 0xc0 to bits 15-8 of 0xFDA0'0090 (MSI cap).
> I tested this quickly through devmem2 before loading the endpoint
> function driver
> and it fixes the issue of MSI-X capabilities being advertised to the host.
> 
> In the driver the changes would look like this;
> In the probe function you can disable MSI-X as follows:
> 
> @@ -631,6 +618,28 @@ static int rockchip_pcie_ep_probe(struct
> platform_device *pdev)
> 
>         ep->irq_pci_addr = ROCKCHIP_PCIE_EP_DUMMY_IRQ_ADDR;
> 
> +       /*
> +        * Disable MSI-X because the controller is not capable of MSI-X
> +        * This requires to skip the MSI-X capabilities entry in the

s/capabilities/capability

> +        * chain of PCIe capabilities, we get the next pointer from the
> +        * MSI-X entry and set that in the MSI capability entry, this way
> +        * the MSI-X entry is skipped (left out of the linked-list)
> +        */
> +       cfg_msi = rockchip_pcie_read(rockchip, PCIE_EP_CONFIG_BASE +
> +               ROCKCHIP_PCIE_EP_MSI_CTRL_REG);
> +
> +       cfg_msi &= ~ROCKCHIP_PCIE_EP_MSI_CP1_MASK;
> +
> +       cfg_msix_cp = rockchip_pcie_read(rockchip, PCIE_EP_CONFIG_BASE +
> +               ROCKCHIP_PCIE_EP_MSIX_CAP_REG) &
> ROCKCHIP_PCIE_EP_MSIX_CAP_CP_MASK;
> +
> +       cfg_msi |= cfg_msix_cp;
> +
> +       rockchip_pcie_write(rockchip, cfg_msi,
> +               PCIE_EP_CONFIG_BASE + ROCKCHIP_PCIE_EP_MSI_CTRL_REG);
> +
>         rockchip_pcie_write(rockchip, PCIE_CLIENT_CONF_ENABLE,
> PCIE_CLIENT_CONFIG);
> 
>         return 0;
>  err_epc_mem_exit:
>         pci_epc_mem_exit(epc);
> 
> In the pcie-rockchip.h add the following #defines:
> 
> @@ -216,21 +227,28 @@
>  #define ROCKCHIP_PCIE_EP_CMD_STATUS                    0x4
>  #define   ROCKCHIP_PCIE_EP_CMD_STATUS_IS               BIT(19)
>  #define ROCKCHIP_PCIE_EP_MSI_CTRL_REG                  0x90
> +#define   ROCKCHIP_PCIE_EP_MSI_CP1_OFFSET                      8
> +#define   ROCKCHIP_PCIE_EP_MSI_CP1_MASK                        GENMASK(15, 8)
> +#define   ROCKCHIP_PCIE_EP_MSI_FLAGS_OFFSET                    16
>  #define   ROCKCHIP_PCIE_EP_MSI_CTRL_MMC_OFFSET         17
>  #define   ROCKCHIP_PCIE_EP_MSI_CTRL_MMC_MASK           GENMASK(19, 17)
>  #define   ROCKCHIP_PCIE_EP_MSI_CTRL_MME_OFFSET         20
>  #define   ROCKCHIP_PCIE_EP_MSI_CTRL_MME_MASK           GENMASK(22, 20)
>  #define   ROCKCHIP_PCIE_EP_MSI_CTRL_ME                         BIT(16)
>  #define   ROCKCHIP_PCIE_EP_MSI_CTRL_MASK_MSI_CAP       BIT(24)
> +#define ROCKCHIP_PCIE_EP_MSIX_CAP_REG                  0xb0
> +#define   ROCKCHIP_PCIE_EP_MSIX_CAP_CP_OFFSET          8
> +#define   ROCKCHIP_PCIE_EP_MSIX_CAP_CP_MASK            GENMASK(15, 8)
>  #define ROCKCHIP_PCIE_EP_DUMMY_IRQ_ADDR                                0x1
>  #define ROCKCHIP_PCIE_EP_PCI_LEGACY_IRQ_ADDR           0x3
> 
> I will add this to the next version of the patch set.
> Thank you Damien for pointing this out ! This should solve the issues
> you have with
> your NVMe endpoint function regarding MSI-X interrupts.

OK. I replied to your previous mail with the same idea. Looks good :)
Will test that instead of my dirty hack that puts 0 in the MSIX capability ID.

> 
> Regards
> Rick
> 
>>
>>> --
>>> Damien Le Moal
>>> Western Digital Research
>>>
diff mbox series

Patch

diff --git a/drivers/pci/controller/pcie-rockchip-ep.c b/drivers/pci/controller/pcie-rockchip-ep.c
index b7865a94e..80634b690 100644
--- a/drivers/pci/controller/pcie-rockchip-ep.c
+++ b/drivers/pci/controller/pcie-rockchip-ep.c
@@ -294,6 +294,16 @@  static int rockchip_pcie_ep_set_msi(struct pci_epc *epc, u8 fn, u8 vfn,
 	struct rockchip_pcie *rockchip = &ep->rockchip;
 	u32 flags;
 
+	if (fn) {
+		dev_err(&epc->dev, "This endpoint controller only supports a single physical function\n");
+		return -EINVAL;
+	}
+
+	if (mmc > 0x5) {
+		dev_err(&epc->dev, "Number of MSI IRQs cannot be more than 32\n");
+		return -EINVAL;
+	}
+
 	flags = rockchip_pcie_read(rockchip,
 				   ROCKCHIP_PCIE_EP_FUNC_BASE(fn) +
 				   ROCKCHIP_PCIE_EP_MSI_CTRL_REG);