diff mbox series

[v3] PCI: of: Avoid pci_remap_iospace() when PCI_IOBASE not defined

Message ID 20210922042041.16326-1-sergio.paracuellos@gmail.com (mailing list archive)
State Superseded
Headers show
Series [v3] PCI: of: Avoid pci_remap_iospace() when PCI_IOBASE not defined | expand

Commit Message

Sergio Paracuellos Sept. 22, 2021, 4:20 a.m. UTC
Request for I/O resources from device tree call 'pci_remap_iospace()' from
'devm_pci_remap_iospace()' which is also called from device tree function
'pci_parse_request_of_pci_ranges()'. If PCI_IOBASE is not defined and I/O
resources are requested the following warning appears:

WARNING: CPU: 2 PID: 1 at ../drivers/pci/pci.c:4066 pci_remap_iospace+0x3c/0x54
This architecture does not support memory mapped I/O

Since there are architectures (like MIPS ralink) that can request I/O
resources from device tree but don't have mappable I/O space and therefore
don't define PCI_IOBASE avoid calling devm_pci_remap_iospace() in that case.

Signed-off-by: Sergio Paracuellos <sergio.paracuellos@gmail.com>
---
Hi all,

This pach was part of a three patch series [0] and [1] but the other two
patches have been already added through staging git tree. I would like also
this patch going through the staging-tree to have all the three patches merged
together in the same tree, but if it is not possible I can also live with
that :).

Changes in v3:
 - Went back to the same approach that was done in v1 of the series [0].
 - Adjust commit message and description as I was suggested to do by Bjorn.

v1 series:
- [0]: https://lore.kernel.org/linux-pci/20210807072409.9018-2-sergio.paracuellos@gmail.com/T/#meb2e6c283f6d391407cb0c1158feea71de59b87d

v2 series:
- [1]: https://lore.kernel.org/all/20210822161005.22467-1-sergio.paracuellos@gmail.com/ 

Thanks in advance for your time.

Best regards,
    Sergio Paracuellos

 drivers/pci/of.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Arnd Bergmann Sept. 22, 2021, 3:47 p.m. UTC | #1
On Wed, Sep 22, 2021 at 6:23 AM Sergio Paracuellos
<sergio.paracuellos@gmail.com> wrote:

> diff --git a/drivers/pci/of.c b/drivers/pci/of.c
> index d84381ce82b5..7d7aab1d1d64 100644
> --- a/drivers/pci/of.c
> +++ b/drivers/pci/of.c
> @@ -564,12 +564,14 @@ static int pci_parse_request_of_pci_ranges(struct device *dev,
>
>                 switch (resource_type(res)) {
>                 case IORESOURCE_IO:
> +#ifdef PCI_IOBASE
>                         err = devm_pci_remap_iospace(dev, res, iobase);
>                         if (err) {
>                                 dev_warn(dev, "error %d: failed to map resource %pR\n",
>                                          err, res);
>                                 resource_list_destroy_entry(win);
>                         }
> +#endif
>                         break;

I wonder if we should have a different symbol controlling this than PCI_IOBASE,
because we are somewhat overloading the semantics here. There are a couple
of ways that I/O space can be handled

a) inb()/outb() are custom instructions, as on x86, PCI_IOBASE is not defined
b) there is no I/O space, as on s390, PCI_IOBASE is not defined
c) PCI_IOBASE points to a virtual address used for dynamic mapping of I/O
    space, as on ARM
d) PCI_IOBASE is NULL, and the port number corresponds to the virtual
   address (some older architectures)

I'm not completely sure where your platform fits in here, it sounds like you
address them using a machine specific physical address as the base in
inb() plus the port number as an offset, is that correct?

       Arnd
Sergio Paracuellos Sept. 22, 2021, 5:42 p.m. UTC | #2
Hi Arnd,

Thanks for reviewing this.

On Wed, Sep 22, 2021 at 5:47 PM Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Wed, Sep 22, 2021 at 6:23 AM Sergio Paracuellos
> <sergio.paracuellos@gmail.com> wrote:
>
> > diff --git a/drivers/pci/of.c b/drivers/pci/of.c
> > index d84381ce82b5..7d7aab1d1d64 100644
> > --- a/drivers/pci/of.c
> > +++ b/drivers/pci/of.c
> > @@ -564,12 +564,14 @@ static int pci_parse_request_of_pci_ranges(struct device *dev,
> >
> >                 switch (resource_type(res)) {
> >                 case IORESOURCE_IO:
> > +#ifdef PCI_IOBASE
> >                         err = devm_pci_remap_iospace(dev, res, iobase);
> >                         if (err) {
> >                                 dev_warn(dev, "error %d: failed to map resource %pR\n",
> >                                          err, res);
> >                                 resource_list_destroy_entry(win);
> >                         }
> > +#endif
> >                         break;
>
> I wonder if we should have a different symbol controlling this than PCI_IOBASE,
> because we are somewhat overloading the semantics here. There are a couple
> of ways that I/O space can be handled
>
> a) inb()/outb() are custom instructions, as on x86, PCI_IOBASE is not defined
> b) there is no I/O space, as on s390, PCI_IOBASE is not defined
> c) PCI_IOBASE points to a virtual address used for dynamic mapping of I/O
>     space, as on ARM
> d) PCI_IOBASE is NULL, and the port number corresponds to the virtual
>    address (some older architectures)
>
> I'm not completely sure where your platform fits in here, it sounds like you
> address them using a machine specific physical address as the base in
> inb() plus the port number as an offset, is that correct?

I guess none of the above options? I will try to explain this as per
my understanding.

[+cc Thomas Bogendoerfer as mips maintainer and with better knowledge
of mips platforms than me]

On MIPS I/O ports are memory mapped, so we access them using normal
load/store instructions.
Mips 'plat_mem_setup()' function does a 'set_io_port_base(KSEG1)'.
There, variable 'mips_io_port_base'
is set then using this address which is a virtual address to which all
ports are being mapped.
KSEG1 addresses are uncached and are not translated by the MMU. This
KSEG1 range is directly mapped in physical space starting with address
0x0.
Because of this reason, defining PCI_IOBASE as KSEG1 won't work since,
at the end 'pci_parse_request_of_pci_ranges' tries to remap to a fixed
virtual address (PCI_IOBASE). This can't work for KSEG1 addresses.
What happens if I try to do that is that I get bad addresses at pci
enumeration for IO resources. Mips ralink mt7621 SoC (which is the one
I am using and trying to mainline the driver from staging) have I/O at
address 0x1e160000. So instead of getting this address for pcie IO
BARS I get a range from 0x0000 to 0xffff since 'pci_adress_to_pio' in
that case got that range 0x0-0xffff which is wrong. To have this
working this way we would need to put PCI_IOBASE somewhere into KSEG2
which will result in creating TLB entries for IO addresses, which most
of the time isn't needed on MIPS because of access via KSEG1. Instead
of that, what happens when I avoid defining PCI_IOBASE and set
IO_SPACE_LIMIT  (See [0] and [1] commits already added to staging tree
which was part of this patch series for context of what works with
this patch together) all works properly. There have also been some
patches accepted in the past which avoid this
'pci_parse_request_of_pci_ranges' call since it is not working for
most pci legacy drivers of arch/mips for ralinks platform [2].

So I am not sure what should be the correct approach to properly make
this work (this one works for me and I cannot see others better) but I
will be happy to try whatever you propose for me to do.

Thanks in advance for your time.

Best regards,
    Sergio Paracuellos

[0]: https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git/commit/?h=staging-testing&id=159697474db41732ef3b6c2e8d9395f09d1f659e
[1]: https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git/commit/?h=staging-testing&id=50fb34eca2944fd67493717c9fbda125336f1655
[2]: https://www.spinics.net/lists/stable-commits/msg197972.html


>
>        Arnd
Arnd Bergmann Sept. 22, 2021, 6:07 p.m. UTC | #3
On Wed, Sep 22, 2021 at 7:42 PM Sergio Paracuellos
<sergio.paracuellos@gmail.com> wrote:
> On Wed, Sep 22, 2021 at 5:47 PM Arnd Bergmann <arnd@arndb.de> wrote:
> >
> > I'm not completely sure where your platform fits in here, it sounds like you
> > address them using a machine specific physical address as the base in
> > inb() plus the port number as an offset, is that correct?
>
> I guess none of the above options? I will try to explain this as per
> my understanding.
>
> [+cc Thomas Bogendoerfer as mips maintainer and with better knowledge
> of mips platforms than me]
>
> On MIPS I/O ports are memory mapped, so we access them using normal
> load/store instructions.
> Mips 'plat_mem_setup()' function does a 'set_io_port_base(KSEG1)'.
> There, variable 'mips_io_port_base'
> is set then using this address which is a virtual address to which all
> ports are being mapped.
> KSEG1 addresses are uncached and are not translated by the MMU. This
> KSEG1 range is directly mapped in physical space starting with address
> 0x0.
> Because of this reason, defining PCI_IOBASE as KSEG1 won't work since,
> at the end 'pci_parse_request_of_pci_ranges' tries to remap to a fixed
> virtual address (PCI_IOBASE). This can't work for KSEG1 addresses.
> What happens if I try to do that is that I get bad addresses at pci
> enumeration for IO resources. Mips ralink mt7621 SoC (which is the one
> I am using and trying to mainline the driver from staging) have I/O at
> address 0x1e160000. So instead of getting this address for pcie IO
> BARS I get a range from 0x0000 to 0xffff since 'pci_adress_to_pio' in
> that case got that range 0x0-0xffff which is wrong. To have this
> working this way we would need to put PCI_IOBASE somewhere into KSEG2
> which will result in creating TLB entries for IO addresses, which most
> of the time isn't needed on MIPS because of access via KSEG1. Instead
> of that, what happens when I avoid defining PCI_IOBASE and set
> IO_SPACE_LIMIT  (See [0] and [1] commits already added to staging tree
> which was part of this patch series for context of what works with
> this patch together) all works properly. There have also been some
> patches accepted in the past which avoid this
> 'pci_parse_request_of_pci_ranges' call since it is not working for
> most pci legacy drivers of arch/mips for ralinks platform [2].
>
> So I am not sure what should be the correct approach to properly make
> this work (this one works for me and I cannot see others better) but I
> will be happy to try whatever you propose for me to do.

Ok, thank you for the detailed explanation.

I suppose you can use the generic infrastructure in asm-generic/io.h
if you "#define PCI_IOBASE mips_io_port_base". In this case, you
should have an architecture specific implementation of
pci_remap_iospace() that assigns mips_io_port_base.
pci_remap_iospace() was originally meant as an architecture
specific helper, but it moved into generic code after all architectures
had the same requirements. If MIPS has different requirements,
then it should not be shared.

I don't yet understand how you deal with having multiple PCIe
host bridge devices if they have distinct I/O port ranges.
Without remapping to dynamic virtual addresses, does
that mean that every MMIO register between the first and
last PCIe bridge also shows up in /dev/ioport? Or do you
only support port I/O on the first PCIe host bridge?

Note that you could also decide to completely sidestep the
problem by just defining port I/O to be unavailable on MIPS
when probing a generic host bridge driver. Most likely this
is never going to be used anyway, and it will be rather hard
to test if you don't already have the need ;-)

         Arnd
Sergio Paracuellos Sept. 22, 2021, 6:40 p.m. UTC | #4
Hi Arnd,

On Wed, Sep 22, 2021 at 8:07 PM Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Wed, Sep 22, 2021 at 7:42 PM Sergio Paracuellos
> <sergio.paracuellos@gmail.com> wrote:
> > On Wed, Sep 22, 2021 at 5:47 PM Arnd Bergmann <arnd@arndb.de> wrote:
> > >
> > > I'm not completely sure where your platform fits in here, it sounds like you
> > > address them using a machine specific physical address as the base in
> > > inb() plus the port number as an offset, is that correct?
> >
> > I guess none of the above options? I will try to explain this as per
> > my understanding.
> >
> > [+cc Thomas Bogendoerfer as mips maintainer and with better knowledge
> > of mips platforms than me]
> >
> > On MIPS I/O ports are memory mapped, so we access them using normal
> > load/store instructions.
> > Mips 'plat_mem_setup()' function does a 'set_io_port_base(KSEG1)'.
> > There, variable 'mips_io_port_base'
> > is set then using this address which is a virtual address to which all
> > ports are being mapped.
> > KSEG1 addresses are uncached and are not translated by the MMU. This
> > KSEG1 range is directly mapped in physical space starting with address
> > 0x0.
> > Because of this reason, defining PCI_IOBASE as KSEG1 won't work since,
> > at the end 'pci_parse_request_of_pci_ranges' tries to remap to a fixed
> > virtual address (PCI_IOBASE). This can't work for KSEG1 addresses.
> > What happens if I try to do that is that I get bad addresses at pci
> > enumeration for IO resources. Mips ralink mt7621 SoC (which is the one
> > I am using and trying to mainline the driver from staging) have I/O at
> > address 0x1e160000. So instead of getting this address for pcie IO
> > BARS I get a range from 0x0000 to 0xffff since 'pci_adress_to_pio' in
> > that case got that range 0x0-0xffff which is wrong. To have this
> > working this way we would need to put PCI_IOBASE somewhere into KSEG2
> > which will result in creating TLB entries for IO addresses, which most
> > of the time isn't needed on MIPS because of access via KSEG1. Instead
> > of that, what happens when I avoid defining PCI_IOBASE and set
> > IO_SPACE_LIMIT  (See [0] and [1] commits already added to staging tree
> > which was part of this patch series for context of what works with
> > this patch together) all works properly. There have also been some
> > patches accepted in the past which avoid this
> > 'pci_parse_request_of_pci_ranges' call since it is not working for
> > most pci legacy drivers of arch/mips for ralinks platform [2].
> >
> > So I am not sure what should be the correct approach to properly make
> > this work (this one works for me and I cannot see others better) but I
> > will be happy to try whatever you propose for me to do.
>
> Ok, thank you for the detailed explanation.
>
> I suppose you can use the generic infrastructure in asm-generic/io.h
> if you "#define PCI_IOBASE mips_io_port_base". In this case, you
> should have an architecture specific implementation of
> pci_remap_iospace() that assigns mips_io_port_base.

No, that is what I tried originally defining PCI_IOBASE as
_AC(0xa0000000, UL) [0] which is the same as KSEG1 [1] that ends in
'mips_io_port_base'.

> pci_remap_iospace() was originally meant as an architecture
> specific helper, but it moved into generic code after all architectures
> had the same requirements. If MIPS has different requirements,
> then it should not be shared.

I see. So, if it can not be shared, would defining 'pci_remap_iospace'
as 'weak' acceptable? Doing in this way I guess I can redefine the
symbol for mips to have the same I currently have but without the
ifdef in the core APIs...

>
> I don't yet understand how you deal with having multiple PCIe
> host bridge devices if they have distinct I/O port ranges.
> Without remapping to dynamic virtual addresses, does
> that mean that every MMIO register between the first and
> last PCIe bridge also shows up in /dev/ioport? Or do you
> only support port I/O on the first PCIe host bridge?

For example, this board is using all available three pci ports [2] and I get:

root@gnubee:~# cat /proc/ioports
1e160000-1e16ffff : pcie@1e140000
  1e160000-1e160fff : PCI Bus 0000:01
    1e160000-1e16000f : 0000:01:00.0
      1e160000-1e16000f : ahci
    1e160010-1e160017 : 0000:01:00.0
      1e160010-1e160017 : ahci
    1e160018-1e16001f : 0000:01:00.0
      1e160018-1e16001f : ahci
    1e160020-1e160023 : 0000:01:00.0
      1e160020-1e160023 : ahci
    1e160024-1e160027 : 0000:01:00.0
      1e160024-1e160027 : ahci
  1e161000-1e161fff : PCI Bus 0000:02
    1e161000-1e16100f : 0000:02:00.0
      1e161000-1e16100f : ahci
    1e161010-1e161017 : 0000:02:00.0
      1e161010-1e161017 : ahci
    1e161018-1e16101f : 0000:02:00.0
      1e161018-1e16101f : ahci
    1e161020-1e161023 : 0000:02:00.0
      1e161020-1e161023 : ahci
    1e161024-1e161027 : 0000:02:00.0
      1e161024-1e161027 : ahci
  1e162000-1e162fff : PCI Bus 0000:03
    1e162000-1e16200f : 0000:03:00.0
      1e162000-1e16200f : ahci
    1e162010-1e162017 : 0000:03:00.0
      1e162010-1e162017 : ahci
    1e162018-1e16201f : 0000:03:00.0
      1e162018-1e16201f : ahci
    1e162020-1e162023 : 0000:03:00.0
      1e162020-1e162023 : ahci
    1e162024-1e162027 : 0000:03:00.0
      1e162024-1e162027 : ahci
root@gnubee:~#

>
> Note that you could also decide to completely sidestep the
> problem by just defining port I/O to be unavailable on MIPS
> when probing a generic host bridge driver. Most likely this
> is never going to be used anyway, and it will be rather hard
> to test if you don't already have the need ;-)

I don't really have any pci card with IO resources to test, but this
SoC is extensively used in openWRT and I remember people who had such
cards with I/O because they were asking me for I/O since it did not
work properly, so I guess I can decide that but if it can work I
prefer to try to make it work :). I searched a bit for the link with
that conversation but I was not able to find it :(.

Thanks for your feedback and time :).

Best regards,
    Sergio Paracuellos

>
>          Arnd

[0]: https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git/commit/?h=staging-testing&id=159697474db41732ef3b6c2e8d9395f09d1f659e
[1]: https://elixir.bootlin.com/linux/v5.15-rc2/source/arch/mips/include/asm/addrspace.h#L99
[2]: https://patchwork.kernel.org/project/linux-pci/patch/20210922050035.18162-2-sergio.paracuellos@gmail.com/
Arnd Bergmann Sept. 22, 2021, 7:57 p.m. UTC | #5
On Wed, Sep 22, 2021 at 8:40 PM Sergio Paracuellos
<sergio.paracuellos@gmail.com> wrote:
> On Wed, Sep 22, 2021 at 8:07 PM Arnd Bergmann <arnd@arndb.de> wrote:
> > On Wed, Sep 22, 2021 at 7:42 PM Sergio Paracuellos
> > Ok, thank you for the detailed explanation.
> >
> > I suppose you can use the generic infrastructure in asm-generic/io.h
> > if you "#define PCI_IOBASE mips_io_port_base". In this case, you
> > should have an architecture specific implementation of
> > pci_remap_iospace() that assigns mips_io_port_base.
>
> No, that is what I tried originally defining PCI_IOBASE as
> _AC(0xa0000000, UL) [0] which is the same as KSEG1 [1] that ends in
> 'mips_io_port_base'.

Defining it as KSEG1 would be problematic because that means that
the Linux-visible port numbers are offset from the bus-visible ones.

You really want PCI_IOBASE to point to the address of port 0.

> > pci_remap_iospace() was originally meant as an architecture
> > specific helper, but it moved into generic code after all architectures
> > had the same requirements. If MIPS has different requirements,
> > then it should not be shared.
>
> I see. So, if it can not be shared, would defining 'pci_remap_iospace'
> as 'weak' acceptable? Doing in this way I guess I can redefine the
> symbol for mips to have the same I currently have but without the
> ifdef in the core APIs...

I would hope to kill off the __weak functions, and prefer using an #ifdef
around the generic implementation. One way to do it is to define
a macro with the same name, such as

#define pci_remap_iospace pci_remap_iospace

and then use #ifdef around the C function to see if that's arleady defined.

> >
> > I don't yet understand how you deal with having multiple PCIe
> > host bridge devices if they have distinct I/O port ranges.
> > Without remapping to dynamic virtual addresses, does
> > that mean that every MMIO register between the first and
> > last PCIe bridge also shows up in /dev/ioport? Or do you
> > only support port I/O on the first PCIe host bridge?
>
> For example, this board is using all available three pci ports [2] and I get:
>
> root@gnubee:~# cat /proc/ioports
> 1e160000-1e16ffff : pcie@1e140000
>   1e160000-1e160fff : PCI Bus 0000:01
>     1e160000-1e16000f : 0000:01:00.0
>       1e160000-1e16000f : ahci
>     1e160010-1e160017 : 0000:01:00.0
>       1e160010-1e160017 : ahci
>     1e160018-1e16001f : 0000:01:00.0
>       1e160018-1e16001f : ahci
>     1e160020-1e160023 : 0000:01:00.0
>       1e160020-1e160023 : ahci
>     1e160024-1e160027 : 0000:01:00.0
>       1e160024-1e160027 : ahci
>   1e161000-1e161fff : PCI Bus 0000:02
>     1e161000-1e16100f : 0000:02:00.0
>       1e161000-1e16100f : ahci
>     1e161010-1e161017 : 0000:02:00.0
>       1e161010-1e161017 : ahci
>     1e161018-1e16101f : 0000:02:00.0
>       1e161018-1e16101f : ahci
>     1e161020-1e161023 : 0000:02:00.0
>       1e161020-1e161023 : ahci
>     1e161024-1e161027 : 0000:02:00.0
>       1e161024-1e161027 : ahci
>   1e162000-1e162fff : PCI Bus 0000:03
>     1e162000-1e16200f : 0000:03:00.0
>       1e162000-1e16200f : ahci
>     1e162010-1e162017 : 0000:03:00.0
>       1e162010-1e162017 : ahci
>     1e162018-1e16201f : 0000:03:00.0
>       1e162018-1e16201f : ahci
>     1e162020-1e162023 : 0000:03:00.0
>       1e162020-1e162023 : ahci
>     1e162024-1e162027 : 0000:03:00.0
>       1e162024-1e162027 : ahci

Ah ok, so there are I/O ports that are at least
visible (may or may not be accessed by the driver).

I only see one host bridge here though, and it has a single
I/O port range, so maybe all three ports are inside of
a single PCI domain?

Having high numbers for the I/O ports is definitely a
problem as I mentioned. Anything that tries to access
PC-style legacy devices on the low port numbers
will now directly go on the bus accessing MMIO
registers that it shouldn't, either causing a CPU exception
or (worse) undefined behavior from random register
accesses.

       Arnd
Sergio Paracuellos Sept. 22, 2021, 8:55 p.m. UTC | #6
Hi Arnd,

On Wed, Sep 22, 2021 at 9:57 PM Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Wed, Sep 22, 2021 at 8:40 PM Sergio Paracuellos
> <sergio.paracuellos@gmail.com> wrote:
> > On Wed, Sep 22, 2021 at 8:07 PM Arnd Bergmann <arnd@arndb.de> wrote:
> > > On Wed, Sep 22, 2021 at 7:42 PM Sergio Paracuellos
> > > Ok, thank you for the detailed explanation.
> > >
> > > I suppose you can use the generic infrastructure in asm-generic/io.h
> > > if you "#define PCI_IOBASE mips_io_port_base". In this case, you
> > > should have an architecture specific implementation of
> > > pci_remap_iospace() that assigns mips_io_port_base.
> >
> > No, that is what I tried originally defining PCI_IOBASE as
> > _AC(0xa0000000, UL) [0] which is the same as KSEG1 [1] that ends in
> > 'mips_io_port_base'.
>
> Defining it as KSEG1 would be problematic because that means that
> the Linux-visible port numbers are offset from the bus-visible ones.
>
> You really want PCI_IOBASE to point to the address of port 0.

Do you mean that doing

#define PCI_IOBASE mips_io_port_base

would have different result that doing what I did

#define PCI_IOBASE _AC(0xa0000000, UL)

?

I am not really understanding this yet (I think I need a bit of sleep
time :)), but I will test this tomorrow and come back to you again
with results.

>
> > > pci_remap_iospace() was originally meant as an architecture
> > > specific helper, but it moved into generic code after all architectures
> > > had the same requirements. If MIPS has different requirements,
> > > then it should not be shared.
> >
> > I see. So, if it can not be shared, would defining 'pci_remap_iospace'
> > as 'weak' acceptable? Doing in this way I guess I can redefine the
> > symbol for mips to have the same I currently have but without the
> > ifdef in the core APIs...
>
> I would hope to kill off the __weak functions, and prefer using an #ifdef
> around the generic implementation. One way to do it is to define
> a macro with the same name, such as
>
> #define pci_remap_iospace pci_remap_iospace

I guess this should be defined in arch/mips/include/asm/pci.h?

>
> and then use #ifdef around the C function to see if that's already defined.

I see. That would work, I guess. But I don't really understand why
this approach would be better than this patch changes itself. Looks
more hacky to me. As Bjorn pointed out in a previous version of this
patch [0], this case is the same as the one in
'acpi_pci_root_remap_iospace' and the same approach is used there...

>
> > >
> > > I don't yet understand how you deal with having multiple PCIe
> > > host bridge devices if they have distinct I/O port ranges.
> > > Without remapping to dynamic virtual addresses, does
> > > that mean that every MMIO register between the first and
> > > last PCIe bridge also shows up in /dev/ioport? Or do you
> > > only support port I/O on the first PCIe host bridge?
> >
> > For example, this board is using all available three pci ports [2] and I get:
> >
> > root@gnubee:~# cat /proc/ioports
> > 1e160000-1e16ffff : pcie@1e140000
> >   1e160000-1e160fff : PCI Bus 0000:01
> >     1e160000-1e16000f : 0000:01:00.0
> >       1e160000-1e16000f : ahci
> >     1e160010-1e160017 : 0000:01:00.0
> >       1e160010-1e160017 : ahci
> >     1e160018-1e16001f : 0000:01:00.0
> >       1e160018-1e16001f : ahci
> >     1e160020-1e160023 : 0000:01:00.0
> >       1e160020-1e160023 : ahci
> >     1e160024-1e160027 : 0000:01:00.0
> >       1e160024-1e160027 : ahci
> >   1e161000-1e161fff : PCI Bus 0000:02
> >     1e161000-1e16100f : 0000:02:00.0
> >       1e161000-1e16100f : ahci
> >     1e161010-1e161017 : 0000:02:00.0
> >       1e161010-1e161017 : ahci
> >     1e161018-1e16101f : 0000:02:00.0
> >       1e161018-1e16101f : ahci
> >     1e161020-1e161023 : 0000:02:00.0
> >       1e161020-1e161023 : ahci
> >     1e161024-1e161027 : 0000:02:00.0
> >       1e161024-1e161027 : ahci
> >   1e162000-1e162fff : PCI Bus 0000:03
> >     1e162000-1e16200f : 0000:03:00.0
> >       1e162000-1e16200f : ahci
> >     1e162010-1e162017 : 0000:03:00.0
> >       1e162010-1e162017 : ahci
> >     1e162018-1e16201f : 0000:03:00.0
> >       1e162018-1e16201f : ahci
> >     1e162020-1e162023 : 0000:03:00.0
> >       1e162020-1e162023 : ahci
> >     1e162024-1e162027 : 0000:03:00.0
> >       1e162024-1e162027 : ahci
>
> Ah ok, so there are I/O ports that are at least
> visible (may or may not be accessed by the driver).

Yes, all of them are enumerated on boot and exposed in /proc/ioports.

>
> I only see one host bridge here though, and it has a single
> I/O port range, so maybe all three ports are inside of
> a single PCI domain?

See this cover letter [1] with a fantastic ascii art :) to a better
understanding of this pci topology. Yes, there is one host bridge and
from here three virtual bridges where at most three endpoints can be
connected.

>
> Having high numbers for the I/O ports is definitely a
> problem as I mentioned. Anything that tries to access
> PC-style legacy devices on the low port numbers
> will now directly go on the bus accessing MMIO
> registers that it shouldn't, either causing a CPU exception
> or (worse) undefined behavior from random register
> accesses.

All I/O port addresses for ralink SoCs are in higher addresses than
default IO_SPACE_LIMIT 0xFFFF, that's why we have to also change this
limit together with this patch changes. Nothing to do with this, is an
architectural thing of these SoCs.

Best regards,
     Sergio Paracuellos
>
>        Arnd

[0]: https://lore.kernel.org/linux-staging/CAMhs-H-OJyXs+QViJa5_O3wUGhoupmaW4qvVG8WGjxm1haRj9Q@mail.gmail.com/T/#m9e8e8d3afca908d1e1c57539d5e03dd9f5efd850
[1]: https://www.spinics.net/lists/kernel/msg4085596.html
Arnd Bergmann Sept. 23, 2021, 5:51 a.m. UTC | #7
isOn Wed, Sep 22, 2021 at 10:55 PM Sergio Paracuellos
<sergio.paracuellos@gmail.com> wrote:
> On Wed, Sep 22, 2021 at 9:57 PM Arnd Bergmann <arnd@arndb.de> wrote:
> >
> > On Wed, Sep 22, 2021 at 8:40 PM Sergio Paracuellos
> > <sergio.paracuellos@gmail.com> wrote:
> > > On Wed, Sep 22, 2021 at 8:07 PM Arnd Bergmann <arnd@arndb.de> wrote:
> > > > On Wed, Sep 22, 2021 at 7:42 PM Sergio Paracuellos
> > > > Ok, thank you for the detailed explanation.
> > > >
> > > > I suppose you can use the generic infrastructure in asm-generic/io.h
> > > > if you "#define PCI_IOBASE mips_io_port_base". In this case, you
> > > > should have an architecture specific implementation of
> > > > pci_remap_iospace() that assigns mips_io_port_base.
> > >
> > > No, that is what I tried originally defining PCI_IOBASE as
> > > _AC(0xa0000000, UL) [0] which is the same as KSEG1 [1] that ends in
> > > 'mips_io_port_base'.
> >
> > Defining it as KSEG1 would be problematic because that means that
> > the Linux-visible port numbers are offset from the bus-visible ones.
> >
> > You really want PCI_IOBASE to point to the address of port 0.
>
> Do you mean that doing
>
> #define PCI_IOBASE mips_io_port_base
>
> would have different result that doing what I did
>
> #define PCI_IOBASE _AC(0xa0000000, UL)
>
> ?
>
> I am not really understanding this yet (I think I need a bit of sleep
> time :)), but I will test this tomorrow and come back to you again
> with results.

Both would let devices access the registers, but they are different
regarding the bus translations you have to program into the
host bridge, and how to access the hardcoded port numbers.

> > > > pci_remap_iospace() was originally meant as an architecture
> > > > specific helper, but it moved into generic code after all architectures
> > > > had the same requirements. If MIPS has different requirements,
> > > > then it should not be shared.
> > >
> > > I see. So, if it can not be shared, would defining 'pci_remap_iospace'
> > > as 'weak' acceptable? Doing in this way I guess I can redefine the
> > > symbol for mips to have the same I currently have but without the
> > > ifdef in the core APIs...
> >
> > I would hope to kill off the __weak functions, and prefer using an #ifdef
> > around the generic implementation. One way to do it is to define
> > a macro with the same name, such as
> >
> > #define pci_remap_iospace pci_remap_iospace
>
> I guess this should be defined in arch/mips/include/asm/pci.h?

Yes, that would be a good place for that, possibly next to
the (static inline) definition.

> > and then use #ifdef around the C function to see if that's already defined.
>
> I see. That would work, I guess. But I don't really understand why
> this approach would be better than this patch changes itself. Looks
> more hacky to me. As Bjorn pointed out in a previous version of this
> patch [0], this case is the same as the one in
> 'acpi_pci_root_remap_iospace' and the same approach is used there...

The acpi_pci_root_remap_iospace() does this because on that code is
shared with x86 and ia64, where the port numbers are accessed using
separate instructions that do not translate into MMIO addresses at all.

On MIPS, the port access eventually does translate into MMIO, and
you need a way to communicate the mapping between the host
bridge and the architecture specific code.

This is particularly important since we want the host bridge driver
to be portable. If you set up the mapping differently between e.g.
mt7621 and mt7623, they are not able to use the same driver
code for setting pci_host_bridge->io_offset and for programming
the inbound translation registers.

> > I only see one host bridge here though, and it has a single
> > I/O port range, so maybe all three ports are inside of
> > a single PCI domain?
>
> See this cover letter [1] with a fantastic ascii art :) to a better
> understanding of this pci topology. Yes, there is one host bridge and
> from here three virtual bridges where at most three endpoints can be
> connected.

Ok, so you don't have the problem I was referring to. A lot of
SoCs actually have multiple host bridges, but only one root
port per host bridge, because they are based on licensed IP
blocks that don't support a normal topology like the one you have.

> > Having high numbers for the I/O ports is definitely a
> > problem as I mentioned. Anything that tries to access
> > PC-style legacy devices on the low port numbers
> > will now directly go on the bus accessing MMIO
> > registers that it shouldn't, either causing a CPU exception
> > or (worse) undefined behavior from random register
> > accesses.
>
> All I/O port addresses for ralink SoCs are in higher addresses than
> default IO_SPACE_LIMIT 0xFFFF, that's why we have to also change this
> limit together with this patch changes. Nothing to do with this, is an
> architectural thing of these SoCs.

I don't understand. What I see above is that the host bridge
has the region 1e160000-1e16ffff registered, so presumably
1e160000 is actually the start of the window into the host bridge.
If you set PCI_IOBASE to that location, the highest port number
would become 0x2027, which is under 0xffff.

       Arnd
Sergio Paracuellos Sept. 23, 2021, 6:36 a.m. UTC | #8
Hi Arnd,

On Thu, Sep 23, 2021 at 7:51 AM Arnd Bergmann <arnd@arndb.de> wrote:
>
>  isOn Wed, Sep 22, 2021 at 10:55 PM Sergio Paracuellos
> <sergio.paracuellos@gmail.com> wrote:
> > On Wed, Sep 22, 2021 at 9:57 PM Arnd Bergmann <arnd@arndb.de> wrote:
> > >
> > > On Wed, Sep 22, 2021 at 8:40 PM Sergio Paracuellos
> > > <sergio.paracuellos@gmail.com> wrote:
> > > > On Wed, Sep 22, 2021 at 8:07 PM Arnd Bergmann <arnd@arndb.de> wrote:
> > > > > On Wed, Sep 22, 2021 at 7:42 PM Sergio Paracuellos
> > > > > Ok, thank you for the detailed explanation.
> > > > >
> > > > > I suppose you can use the generic infrastructure in asm-generic/io.h
> > > > > if you "#define PCI_IOBASE mips_io_port_base". In this case, you
> > > > > should have an architecture specific implementation of
> > > > > pci_remap_iospace() that assigns mips_io_port_base.
> > > >
> > > > No, that is what I tried originally defining PCI_IOBASE as
> > > > _AC(0xa0000000, UL) [0] which is the same as KSEG1 [1] that ends in
> > > > 'mips_io_port_base'.
> > >
> > > Defining it as KSEG1 would be problematic because that means that
> > > the Linux-visible port numbers are offset from the bus-visible ones.
> > >
> > > You really want PCI_IOBASE to point to the address of port 0.
> >
> > Do you mean that doing
> >
> > #define PCI_IOBASE mips_io_port_base
> >
> > would have different result that doing what I did
> >
> > #define PCI_IOBASE _AC(0xa0000000, UL)
> >
> > ?
> >
> > I am not really understanding this yet (I think I need a bit of sleep
> > time :)), but I will test this tomorrow and come back to you again
> > with results.
>
> Both would let devices access the registers, but they are different
> regarding the bus translations you have to program into the
> host bridge, and how to access the hardcoded port numbers.

I have tested this and I get initial invalidid BAR value errors on pci
bus I/O enumeration an also bad addresses in /proc/ioports in the same
way I got defining PCI_IOBASE as _AC(0xa0000000, UL):

root@gnubee:~# cat /proc/ioports
00000000-0000ffff : pcie@1e140000
  00000000-00000fff : PCI Bus 0000:01
    00000000-0000000f : 0000:01:00.0
      00000000-0000000f : ahci
    00000010-00000017 : 0000:01:00.0
      00000010-00000017 : ahci
    00000018-0000001f : 0000:01:00.0
      00000018-0000001f : ahci
    00000020-00000023 : 0000:01:00.0
      00000020-00000023 : ahci
    00000024-00000027 : 0000:01:00.0
      00000024-00000027 : ahci
  00001000-00001fff : PCI Bus 0000:02
    00001000-0000100f : 0000:02:00.0
      00001000-0000100f : ahci
    00001010-00001017 : 0000:02:00.0
      00001010-00001017 : ahci
    00001018-0000101f : 0000:02:00.0
      00001018-0000101f : ahci
    00001020-00001023 : 0000:02:00.0
      00001020-00001023 : ahci
    00001024-00001027 : 0000:02:00.0
      00001024-00001027 : ahci
  00002000-00002fff : PCI Bus 0000:03
    00002000-0000200f : 0000:03:00.0
      00002000-0000200f : ahci
    00002010-00002017 : 0000:03:00.0
      00002010-00002017 : ahci
    00002018-0000201f : 0000:03:00.0
      00002018-0000201f : ahci
    00002020-00002023 : 0000:03:00.0
      00002020-00002023 : ahci
    00002024-00002027 : 0000:03:00.0
      00002024-00002027 : ahci
root@gnubee:~#

This is the pci boot trace I get doing what you proposed (I have also
added some traces for 'logic_pio_trans_cpuaddr' and
'pci_address_to_pio'):

diff --git a/lib/logic_pio.c b/lib/logic_pio.c
index f32fe481b492..529e28aba74b 100644
--- a/lib/logic_pio.c
+++ b/lib/logic_pio.c
@@ -216,6 +218,7 @@ unsigned long logic_pio_trans_cpuaddr(resource_size_t addr)
                        unsigned long cpuaddr;

                        cpuaddr = addr - range->hw_start + range->io_start;
+                       pr_info("PIO TO CPUADDR: ADDR: 0x%x -  addr
HW_START: 0x%08x + RANGE IO: 0x%08x\n", addr, range->hw_start,
range->io_start);

                        rcu_read_unlock();
                        return cpuaddr;

diff --git a/drivers/of/address.c b/drivers/of/address.c
index 1c3257a2d4e3..52b77b5defac 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -298,6 +298,8 @@ int of_pci_range_to_resource(struct of_pci_range *range,
                res->start = range->cpu_addr;
        }
        res->end = res->start + range->size - 1;
+       pr_info("IO START returned by pci_address_to_pio: 0x%x-0x%x\n",
+               res->start, res->end);
        return 0;

 invalid_range:


mt7621-pci 1e140000.pcie:       IO 0x001e160000..0x001e16ffff -> 0x001e160000
LOGIC PIO: PIO TO CPUADDR: ADDR: 0x1e160000 -  addr HW_START:
0x1e160000 + RANGE IO: 0x00000000
OF: IO START returned by pci_address_to_pio: 0x0-0xffff
mt7621-pci 1e140000.pcie: PCIE0 enabled
mt7621-pci 1e140000.pcie: PCIE1 enabled
mt7621-pci 1e140000.pcie: PCIE2 enabled
mt7621-pci 1e140000.pcie: PCI coherence region base: 0x60000000,
mask/settings: 0xf0000002
mt7621-pci 1e140000.pcie: PCI host bridge to bus 0000:00
pci_bus 0000:00: root bus resource [bus 00-ff]
pci_bus 0000:00: root bus resource [mem 0x60000000-0x6fffffff]
pci_bus 0000:00: root bus resource [io  0x0000-0xffff] (bus address
[0x1e160000-0x1e16ffff])
pci 0000:00:00.0: [0e8d:0801] type 01 class 0x060400
pci 0000:00:00.0: reg 0x10: [mem 0x00000000-0x7fffffff]
pci 0000:00:00.0: reg 0x14: [mem 0x00000000-0x0000ffff]
pci 0000:00:00.0: supports D1
pci 0000:00:00.0: PME# supported from D0 D1 D3hot
pci 0000:00:01.0: [0e8d:0801] type 01 class 0x060400
pci 0000:00:01.0: reg 0x10: [mem 0x00000000-0x7fffffff]
pci 0000:00:01.0: reg 0x14: [mem 0x00000000-0x0000ffff]
pci 0000:00:01.0: supports D1
pci 0000:00:01.0: PME# supported from D0 D1 D3hot
pci 0000:00:02.0: [0e8d:0801] type 01 class 0x060400
pci 0000:00:02.0: reg 0x10: [mem 0x00000000-0x7fffffff]
pci 0000:00:02.0: reg 0x14: [mem 0x00000000-0x0000ffff]
pci 0000:00:02.0: supports D1
pci 0000:00:02.0: PME# supported from D0 D1 D3hot
pci 0000:00:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring
pci 0000:00:01.0: bridge configuration invalid ([bus 00-00]), reconfiguring
pci 0000:00:02.0: bridge configuration invalid ([bus 00-00]), reconfiguring
pci 0000:01:00.0: [1b21:0611] type 00 class 0x010185
pci 0000:01:00.0: reg 0x10: initial BAR value 0x00000000 invalid
pci 0000:01:00.0: reg 0x10: [io  size 0x0008]
pci 0000:01:00.0: reg 0x14: initial BAR value 0x00000000 invalid
pci 0000:01:00.0: reg 0x14: [io  size 0x0004]
pci 0000:01:00.0: reg 0x18: initial BAR value 0x00000000 invalid
pci 0000:01:00.0: reg 0x18: [io  size 0x0008]
pci 0000:01:00.0: reg 0x1c: initial BAR value 0x00000000 invalid
pci 0000:01:00.0: reg 0x1c: [io  size 0x0004]
pci 0000:01:00.0: reg 0x20: initial BAR value 0x00000000 invalid
pci 0000:01:00.0: reg 0x20: [io  size 0x0010]
pci 0000:01:00.0: reg 0x24: [mem 0x00000000-0x000001ff]
pci 0000:01:00.0: 2.000 Gb/s available PCIe bandwidth, limited by 2.5
GT/s PCIe x1 link at 0000:00:00.0 (capable of 4.000 Gb/s with 5.0 GT/s
PCIe x1 link)
pci 0000:00:00.0: PCI bridge to [bus 01-ff]
pci 0000:00:00.0:   bridge window [io  0x0000-0x0fff]
pci 0000:00:00.0:   bridge window [mem 0x00000000-0x000fffff]
pci 0000:00:00.0:   bridge window [mem 0x00000000-0x000fffff pref]
pci_bus 0000:01: busn_res: [bus 01-ff] end is updated to 01
pci 0000:02:00.0: [1b21:0611] type 00 class 0x010185
pci 0000:02:00.0: reg 0x10: initial BAR value 0x00000000 invalid
pci 0000:02:00.0: reg 0x10: [io  size 0x0008]
pci 0000:02:00.0: reg 0x14: initial BAR value 0x00000000 invalid
pci 0000:02:00.0: reg 0x14: [io  size 0x0004]
pci 0000:02:00.0: reg 0x18: initial BAR value 0x00000000 invalid
pci 0000:02:00.0: reg 0x18: [io  size 0x0008]
pci 0000:02:00.0: reg 0x1c: initial BAR value 0x00000000 invalid
pci 0000:02:00.0: reg 0x1c: [io  size 0x0004]
pci 0000:02:00.0: reg 0x20: initial BAR value 0x00000000 invalid
pci 0000:02:00.0: reg 0x20: [io  size 0x0010]
pci 0000:02:00.0: reg 0x24: [mem 0x00000000-0x000001ff]
pci 0000:02:00.0: 2.000 Gb/s available PCIe bandwidth, limited by 2.5
GT/s PCIe x1 link at 0000:00:01.0 (capable of 4.000 Gb/s with 5.0 GT/s
PCIe x1 link)
pci 0000:00:01.0: PCI bridge to [bus 02-ff]
pci 0000:00:01.0:   bridge window [io  0x0000-0x0fff]
pci 0000:00:01.0:   bridge window [mem 0x00000000-0x000fffff]
pci 0000:00:01.0:   bridge window [mem 0x00000000-0x000fffff pref]
pci_bus 0000:02: busn_res: [bus 02-ff] end is updated to 02
pci 0000:03:00.0: [1b21:0611] type 00 class 0x010185
pci 0000:03:00.0: reg 0x10: initial BAR value 0x00000000 invalid
pci 0000:03:00.0: reg 0x10: [io  size 0x0008]
pci 0000:03:00.0: reg 0x14: initial BAR value 0x00000000 invalid
pci 0000:03:00.0: reg 0x14: [io  size 0x0004]
pci 0000:03:00.0: reg 0x18: initial BAR value 0x00000000 invalid
pci 0000:03:00.0: reg 0x18: [io  size 0x0008]
pci 0000:03:00.0: reg 0x1c: initial BAR value 0x00000000 invalid
pci 0000:03:00.0: reg 0x1c: [io  size 0x0004]
pci 0000:03:00.0: reg 0x20: initial BAR value 0x00000000 invalid
pci 0000:03:00.0: reg 0x20: [io  size 0x0010]
pci 0000:03:00.0: reg 0x24: [mem 0x00000000-0x000001ff]
pci 0000:03:00.0: 2.000 Gb/s available PCIe bandwidth, limited by 2.5
GT/s PCIe x1 link at 0000:00:02.0 (capable of 4.000 Gb/s with 5.0 GT/s
PCIe x1 link)
pci 0000:00:02.0: PCI bridge to [bus 03-ff]
pci 0000:00:02.0:   bridge window [io  0x0000-0x0fff]
pci 0000:00:02.0:   bridge window [mem 0x00000000-0x000fffff]
pci 0000:00:02.0:   bridge window [mem 0x00000000-0x000fffff pref]
pci_bus 0000:03: busn_res: [bus 03-ff] end is updated to 03
pci 0000:00:00.0: BAR 0: no space for [mem size 0x80000000]
pci 0000:00:00.0: BAR 0: failed to assign [mem size 0x80000000]
pci 0000:00:01.0: BAR 0: no space for [mem size 0x80000000]
pci 0000:00:01.0: BAR 0: failed to assign [mem size 0x80000000]
pci 0000:00:02.0: BAR 0: no space for [mem size 0x80000000]
pci 0000:00:02.0: BAR 0: failed to assign [mem size 0x80000000]
pci 0000:00:00.0: BAR 8: assigned [mem 0x60000000-0x600fffff]
pci 0000:00:00.0: BAR 9: assigned [mem 0x60100000-0x601fffff pref]
pci 0000:00:01.0: BAR 8: assigned [mem 0x60200000-0x602fffff]
pci 0000:00:01.0: BAR 9: assigned [mem 0x60300000-0x603fffff pref]
pci 0000:00:02.0: BAR 8: assigned [mem 0x60400000-0x604fffff]
pci 0000:00:02.0: BAR 9: assigned [mem 0x60500000-0x605fffff pref]
pci 0000:00:00.0: BAR 1: assigned [mem 0x60600000-0x6060ffff]
pci 0000:00:01.0: BAR 1: assigned [mem 0x60610000-0x6061ffff]
pci 0000:00:02.0: BAR 1: assigned [mem 0x60620000-0x6062ffff]
pci 0000:00:00.0: BAR 7: assigned [io  0x0000-0x0fff]
pci 0000:00:01.0: BAR 7: assigned [io  0x1000-0x1fff]
pci 0000:00:02.0: BAR 7: assigned [io  0x2000-0x2fff]
pci 0000:01:00.0: BAR 5: assigned [mem 0x60000000-0x600001ff]
pci 0000:01:00.0: BAR 4: assigned [io  0x0000-0x000f]
pci 0000:01:00.0: BAR 0: assigned [io  0x0010-0x0017]
pci 0000:01:00.0: BAR 2: assigned [io  0x0018-0x001f]
pci 0000:01:00.0: BAR 1: assigned [io  0x0020-0x0023]
pci 0000:01:00.0: BAR 3: assigned [io  0x0024-0x0027]
pci 0000:00:00.0: PCI bridge to [bus 01]
pci 0000:00:00.0:   bridge window [io  0x0000-0x0fff]
pci 0000:00:00.0:   bridge window [mem 0x60000000-0x600fffff]
pci 0000:00:00.0:   bridge window [mem 0x60100000-0x601fffff pref]
pci 0000:02:00.0: BAR 5: assigned [mem 0x60200000-0x602001ff]
pci 0000:02:00.0: BAR 4: assigned [io  0x1000-0x100f]
pci 0000:02:00.0: BAR 0: assigned [io  0x1010-0x1017]
pci 0000:02:00.0: BAR 2: assigned [io  0x1018-0x101f]
pci 0000:02:00.0: BAR 1: assigned [io  0x1020-0x1023]
pci 0000:02:00.0: BAR 3: assigned [io  0x1024-0x1027]
pci 0000:00:01.0: PCI bridge to [bus 02]
pci 0000:00:01.0:   bridge window [io  0x1000-0x1fff]
pci 0000:00:01.0:   bridge window [mem 0x60200000-0x602fffff]
pci 0000:00:01.0:   bridge window [mem 0x60300000-0x603fffff pref]
pci 0000:03:00.0: BAR 5: assigned [mem 0x60400000-0x604001ff]
pci 0000:03:00.0: BAR 4: assigned [io  0x2000-0x200f]
pci 0000:03:00.0: BAR 0: assigned [io  0x2010-0x2017]
pci 0000:03:00.0: BAR 2: assigned [io  0x2018-0x201f]
pci 0000:03:00.0: BAR 1: assigned [io  0x2020-0x2023]
pci 0000:03:00.0: BAR 3: assigned [io  0x2024-0x2027]
pci 0000:00:02.0: PCI bridge to [bus 03]
pci 0000:00:02.0:   bridge window [io  0x2000-0x2fff]
pci 0000:00:02.0:   bridge window [mem 0x60400000-0x604fffff]
pci 0000:00:02.0:   bridge window [mem 0x60500000-0x605fffff pref]

This other one (correct behaviour AFAICS) is what I get with this
patch series setting IO_SPACE_LIMIT and ifdef to avoid the remapping:

mt7621-pci 1e140000.pcie:       IO 0x001e160000..0x001e16ffff -> 0x001e160000
OF: IO START returned by pci_address_to_pio: 0x1e160000-0x1e16ffff
mt7621-pci 1e140000.pcie: PCIE0 enabled
mt7621-pci 1e140000.pcie: PCIE1 enabled
mt7621-pci 1e140000.pcie: PCIE2 enabled
mt7621-pci 1e140000.pcie: PCI coherence region base: 0x60000000,
mask/settings: 0xf0000002
mt7621-pci 1e140000.pcie: PCI host bridge to bus 0000:00
pci_bus 0000:00: root bus resource [bus 00-ff]
pci_bus 0000:00: root bus resource [mem 0x60000000-0x6fffffff]
pci_bus 0000:00: root bus resource [io  0x1e160000-0x1e16ffff]
pci 0000:00:00.0: [0e8d:0801] type 01 class 0x060400
pci 0000:00:00.0: reg 0x10: [mem 0x00000000-0x7fffffff]
pci 0000:00:00.0: reg 0x14: [mem 0x00000000-0x0000ffff]
pci 0000:00:00.0: supports D1
pci 0000:00:00.0: PME# supported from D0 D1 D3hot
pci 0000:00:01.0: [0e8d:0801] type 01 class 0x060400
pci 0000:00:01.0: reg 0x10: [mem 0x00000000-0x7fffffff]
pci 0000:00:01.0: reg 0x14: [mem 0x00000000-0x0000ffff]
pci 0000:00:01.0: supports D1
pci 0000:00:01.0: PME# supported from D0 D1 D3hot
pci 0000:00:02.0: [0e8d:0801] type 01 class 0x060400
pci 0000:00:02.0: reg 0x10: [mem 0x00000000-0x7fffffff]
pci 0000:00:02.0: reg 0x14: [mem 0x00000000-0x0000ffff]
pci 0000:00:02.0: supports D1
pci 0000:00:02.0: PME# supported from D0 D1 D3hot
pci 0000:00:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring
pci 0000:00:01.0: bridge configuration invalid ([bus 00-00]), reconfiguring
pci 0000:00:02.0: bridge configuration invalid ([bus 00-00]), reconfiguring
pci 0000:01:00.0: [1b21:0611] type 00 class 0x010185
pci 0000:01:00.0: reg 0x10: [io  0x0000-0x0007]
pci 0000:01:00.0: reg 0x14: [io  0x0000-0x0003]
pci 0000:01:00.0: reg 0x18: [io  0x0000-0x0007]
pci 0000:01:00.0: reg 0x1c: [io  0x0000-0x0003]
pci 0000:01:00.0: reg 0x20: [io  0x0000-0x000f]
pci 0000:01:00.0: reg 0x24: [mem 0x00000000-0x000001ff]
pci 0000:01:00.0: 2.000 Gb/s available PCIe bandwidth, limited by 2.5
GT/s PCIe x1 link at 0000:00:00.0 (capable of 4.000 Gb/s with 5.0 GT/s
PCIe x1 link)
pci 0000:00:00.0: PCI bridge to [bus 01-ff]
pci 0000:00:00.0:   bridge window [io  0x0000-0x0fff]
pci 0000:00:00.0:   bridge window [mem 0x00000000-0x000fffff]
pci 0000:00:00.0:   bridge window [mem 0x00000000-0x000fffff pref]
pci_bus 0000:01: busn_res: [bus 01-ff] end is updated to 01
pci 0000:02:00.0: [1b21:0611] type 00 class 0x010185
pci 0000:02:00.0: reg 0x10: [io  0x0000-0x0007]
pci 0000:02:00.0: reg 0x14: [io  0x0000-0x0003]
pci 0000:02:00.0: reg 0x18: [io  0x0000-0x0007]
pci 0000:02:00.0: reg 0x1c: [io  0x0000-0x0003]
pci 0000:02:00.0: reg 0x20: [io  0x0000-0x000f]
pci 0000:02:00.0: reg 0x24: [mem 0x00000000-0x000001ff]
pci 0000:02:00.0: 2.000 Gb/s available PCIe bandwidth, limited by 2.5
GT/s PCIe x1 link at 0000:00:01.0 (capable of 4.000 Gb/s with 5.0 GT/s
PCIe x1 link)
pci 0000:00:01.0: PCI bridge to [bus 02-ff]
pci 0000:00:01.0:   bridge window [io  0x0000-0x0fff]
pci 0000:00:01.0:   bridge window [mem 0x00000000-0x000fffff]
pci 0000:00:01.0:   bridge window [mem 0x00000000-0x000fffff pref]
pci_bus 0000:02: busn_res: [bus 02-ff] end is updated to 02
pci 0000:03:00.0: [1b21:0611] type 00 class 0x010185
pci 0000:03:00.0: reg 0x10: [io  0x0000-0x0007]
pci 0000:03:00.0: reg 0x14: [io  0x0000-0x0003]
pci 0000:03:00.0: reg 0x18: [io  0x0000-0x0007]
pci 0000:03:00.0: reg 0x1c: [io  0x0000-0x0003]
pci 0000:03:00.0: reg 0x20: [io  0x0000-0x000f]
pci 0000:03:00.0: reg 0x24: [mem 0x00000000-0x000001ff]
pci 0000:03:00.0: 2.000 Gb/s available PCIe bandwidth, limited by 2.5
GT/s PCIe x1 link at 0000:00:02.0 (capable of 4.000 Gb/s with 5.0 GT/s
PCIe x1 link)
pci 0000:00:02.0: PCI bridge to [bus 03-ff]
pci 0000:00:02.0:   bridge window [io  0x0000-0x0fff]
pci 0000:00:02.0:   bridge window [mem 0x00000000-0x000fffff]
pci 0000:00:02.0:   bridge window [mem 0x00000000-0x000fffff pref]
pci_bus 0000:03: busn_res: [bus 03-ff] end is updated to 03
pci 0000:00:00.0: BAR 0: no space for [mem size 0x80000000]
pci 0000:00:00.0: BAR 0: failed to assign [mem size 0x80000000]
pci 0000:00:01.0: BAR 0: no space for [mem size 0x80000000]
pci 0000:00:01.0: BAR 0: failed to assign [mem size 0x80000000]
pci 0000:00:02.0: BAR 0: no space for [mem size 0x80000000]
pci 0000:00:02.0: BAR 0: failed to assign [mem size 0x80000000]
pci 0000:00:00.0: BAR 8: assigned [mem 0x60000000-0x600fffff]
pci 0000:00:00.0: BAR 9: assigned [mem 0x60100000-0x601fffff pref]
pci 0000:00:01.0: BAR 8: assigned [mem 0x60200000-0x602fffff]
pci 0000:00:01.0: BAR 9: assigned [mem 0x60300000-0x603fffff pref]
pci 0000:00:02.0: BAR 8: assigned [mem 0x60400000-0x604fffff]
pci 0000:00:02.0: BAR 9: assigned [mem 0x60500000-0x605fffff pref]
pci 0000:00:00.0: BAR 1: assigned [mem 0x60600000-0x6060ffff]
pci 0000:00:01.0: BAR 1: assigned [mem 0x60610000-0x6061ffff]
pci 0000:00:02.0: BAR 1: assigned [mem 0x60620000-0x6062ffff]
pci 0000:00:00.0: BAR 7: assigned [io  0x1e160000-0x1e160fff]
pci 0000:00:01.0: BAR 7: assigned [io  0x1e161000-0x1e161fff]
pci 0000:00:02.0: BAR 7: assigned [io  0x1e162000-0x1e162fff]
pci 0000:01:00.0: BAR 5: assigned [mem 0x60000000-0x600001ff]
pci 0000:01:00.0: BAR 4: assigned [io  0x1e160000-0x1e16000f]
pci 0000:01:00.0: BAR 0: assigned [io  0x1e160010-0x1e160017]
pci 0000:01:00.0: BAR 2: assigned [io  0x1e160018-0x1e16001f]
pci 0000:01:00.0: BAR 1: assigned [io  0x1e160020-0x1e160023]
pci 0000:01:00.0: BAR 3: assigned [io  0x1e160024-0x1e160027]
pci 0000:00:00.0: PCI bridge to [bus 01]
pci 0000:00:00.0:   bridge window [io  0x1e160000-0x1e160fff]
pci 0000:00:00.0:   bridge window [mem 0x60000000-0x600fffff]
pci 0000:00:00.0:   bridge window [mem 0x60100000-0x601fffff pref]
pci 0000:02:00.0: BAR 5: assigned [mem 0x60200000-0x602001ff]
pci 0000:02:00.0: BAR 4: assigned [io  0x1e161000-0x1e16100f]
pci 0000:02:00.0: BAR 0: assigned [io  0x1e161010-0x1e161017]
pci 0000:02:00.0: BAR 2: assigned [io  0x1e161018-0x1e16101f]
pci 0000:02:00.0: BAR 1: assigned [io  0x1e161020-0x1e161023]
pci 0000:02:00.0: BAR 3: assigned [io  0x1e161024-0x1e161027]
pci 0000:00:01.0: PCI bridge to [bus 02]
pci 0000:00:01.0:   bridge window [io  0x1e161000-0x1e161fff]
pci 0000:00:01.0:   bridge window [mem 0x60200000-0x602fffff]
pci 0000:00:01.0:   bridge window [mem 0x60300000-0x603fffff pref]
pci 0000:03:00.0: BAR 5: assigned [mem 0x60400000-0x604001ff]
pci 0000:03:00.0: BAR 4: assigned [io  0x1e162000-0x1e16200f]
pci 0000:03:00.0: BAR 0: assigned [io  0x1e162010-0x1e162017]
pci 0000:03:00.0: BAR 2: assigned [io  0x1e162018-0x1e16201f]
pci 0000:03:00.0: BAR 1: assigned [io  0x1e162020-0x1e162023]
pci 0000:03:00.0: BAR 3: assigned [io  0x1e162024-0x1e162027]
pci 0000:00:02.0: PCI bridge to [bus 03]
pci 0000:00:02.0:   bridge window [io  0x1e162000-0x1e162fff]
pci 0000:00:02.0:   bridge window [mem 0x60400000-0x604fffff]
pci 0000:00:02.0:   bridge window [mem 0x60500000-0x605fffff pref]

>
> > > > > pci_remap_iospace() was originally meant as an architecture
> > > > > specific helper, but it moved into generic code after all architectures
> > > > > had the same requirements. If MIPS has different requirements,
> > > > > then it should not be shared.
> > > >
> > > > I see. So, if it can not be shared, would defining 'pci_remap_iospace'
> > > > as 'weak' acceptable? Doing in this way I guess I can redefine the
> > > > symbol for mips to have the same I currently have but without the
> > > > ifdef in the core APIs...
> > >
> > > I would hope to kill off the __weak functions, and prefer using an #ifdef
> > > around the generic implementation. One way to do it is to define
> > > a macro with the same name, such as
> > >
> > > #define pci_remap_iospace pci_remap_iospace
> >
> > I guess this should be defined in arch/mips/include/asm/pci.h?
>
> Yes, that would be a good place for that, possibly next to
> the (static inline) definition.
>
> > > and then use #ifdef around the C function to see if that's already defined.

I don't know if I am following you properly... you mean to define:

diff --git a/arch/mips/include/asm/pci.h b/arch/mips/include/asm/pci.h
index 6f48649201c5..9a8ca258c68b 100644
--- a/arch/mips/include/asm/pci.h
+++ b/arch/mips/include/asm/pci.h
@@ -20,6 +20,12 @@
 #include <linux/list.h>
 #include <linux/of.h>

+#define pci_remap_iospace pci_remap_iospace
+static inline int pci_remap_iospace(const struct resource *res,
phys_addr_t phys_addr)
+{
+       return 0;
+}

And then in the PCI core code do something like this?

--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4044,6 +4044,8 @@ unsigned long __weak
pci_address_to_pio(phys_addr_t address)
  * architectures that have memory mapped IO functions defined (and the
  * PCI_IOBASE value defined) should call this function.
  */
+#ifndef pci_remap_iospace
+#define pci_remap_iospace pci_remap_iospace
 int pci_remap_iospace(const struct resource *res, phys_addr_t phys_addr)
 {
 #if defined(PCI_IOBASE) && defined(CONFIG_MMU)
@@ -4067,6 +4069,7 @@ int pci_remap_iospace(const struct resource
*res, phys_addr_t phys_addr)
 #endif
 }
 EXPORT_SYMBOL(pci_remap_iospace);
+#endif

But since this 'pci_remap_iospace' is already defined in
'include/linux/pci.h' and is not static at all the compiler complains
about doing such a thing. What am I missing here?

> >
> > I see. That would work, I guess. But I don't really understand why
> > this approach would be better than this patch changes itself. Looks
> > more hacky to me. As Bjorn pointed out in a previous version of this
> > patch [0], this case is the same as the one in
> > 'acpi_pci_root_remap_iospace' and the same approach is used there...
>
> The acpi_pci_root_remap_iospace() does this because on that code is
> shared with x86 and ia64, where the port numbers are accessed using
> separate instructions that do not translate into MMIO addresses at all.
>
> On MIPS, the port access eventually does translate into MMIO, and
> you need a way to communicate the mapping between the host
> bridge and the architecture specific code.

Thanks for the explanation.

>
> This is particularly important since we want the host bridge driver
> to be portable. If you set up the mapping differently between e.g.
> mt7621 and mt7623, they are not able to use the same driver
> code for setting pci_host_bridge->io_offset and for programming
> the inbound translation registers.

mt7621 is only mips, mt7623 is arm based SoC. They cannot use the same
driver at all. mt7620 or mt7628 which have drivers which are in
'arch/mips/pci' using legacy pci code would use and would be tried to
be ported to share the driver with mt7621 but those are also only mips
and the I/O addresses for all of them are similar and have I/O higher
than 0xffff as mt7621 has.

>
> > > I only see one host bridge here though, and it has a single
> > > I/O port range, so maybe all three ports are inside of
> > > a single PCI domain?
> >
> > See this cover letter [1] with a fantastic ascii art :) to a better
> > understanding of this pci topology. Yes, there is one host bridge and
> > from here three virtual bridges where at most three endpoints can be
> > connected.
>
> Ok, so you don't have the problem I was referring to. A lot of
> SoCs actually have multiple host bridges, but only one root
> port per host bridge, because they are based on licensed IP
> blocks that don't support a normal topology like the one you have.
>
> > > Having high numbers for the I/O ports is definitely a
> > > problem as I mentioned. Anything that tries to access
> > > PC-style legacy devices on the low port numbers
> > > will now directly go on the bus accessing MMIO
> > > registers that it shouldn't, either causing a CPU exception
> > > or (worse) undefined behavior from random register
> > > accesses.
> >
> > All I/O port addresses for ralink SoCs are in higher addresses than
> > default IO_SPACE_LIMIT 0xFFFF, that's why we have to also change this
> > limit together with this patch changes. Nothing to do with this, is an
> > architectural thing of these SoCs.
>
> I don't understand. What I see above is that the host bridge
> has the region 1e160000-1e16ffff registered, so presumably
> 1e160000 is actually the start of the window into the host bridge.
> If you set PCI_IOBASE to that location, the highest port number
> would become 0x2027, which is under 0xffff.

But 0x1e160000 is defined in the device tree as the I/O start address
of the range and should not be hardcoded anywhere else since other
ralink platforms don't use this address as PCI_IOBASE. And yes, 0x2027
is the highest port number I get if I initially define PCI_IOBASE also
as KSEG1 since at the end is the entry point for I/O in mips (see
trace above).

Thanks very much for your time and feedback.

Best regards,
     Sergio Paracuellos
>
>        Arnd
Arnd Bergmann Sept. 23, 2021, 9:07 a.m. UTC | #9
()On Thu, Sep 23, 2021 at 8:36 AM Sergio Paracuellos
<sergio.paracuellos@gmail.com> wrote:
> On Thu, Sep 23, 2021 at 7:51 AM Arnd Bergmann <arnd@arndb.de> wrote:
> > > I am not really understanding this yet (I think I need a bit of sleep
> > > time :)), but I will test this tomorrow and come back to you again
> > > with results.
> >
> > Both would let devices access the registers, but they are different
> > regarding the bus translations you have to program into the
> > host bridge, and how to access the hardcoded port numbers.
>
> I have tested this and I get initial invalidid BAR value errors on pci
> bus I/O enumeration an also bad addresses in /proc/ioports in the same
> way I got defining PCI_IOBASE as _AC(0xa0000000, UL):
>
> root@gnubee:~# cat /proc/ioports
> 00000000-0000ffff : pcie@1e140000
>   00000000-00000fff : PCI Bus 0000:01
>     00000000-0000000f : 0000:01:00.0
>       00000000-0000000f : ahci
>     00000010-00000017 : 0000:01:00.0
>       00000010-00000017 : ahci
>     00000018-0000001f : 0000:01:00.0
>       00000018-0000001f : ahci

Ok, These look good to me now.

> mt7621-pci 1e140000.pcie:       IO 0x001e160000..0x001e16ffff -> 0x001e160000
> LOGIC PIO: PIO TO CPUADDR: ADDR: 0x1e160000 -  addr HW_START:
> 0x1e160000 + RANGE IO: 0x00000000
> OF: IO START returned by pci_address_to_pio: 0x0-0xffff
> mt7621-pci 1e140000.pcie: PCIE0 enabled
> mt7621-pci 1e140000.pcie: PCIE1 enabled
> mt7621-pci 1e140000.pcie: PCIE2 enabled
> mt7621-pci 1e140000.pcie: PCI coherence region base: 0x60000000,
> mask/settings: 0xf0000002
> mt7621-pci 1e140000.pcie: PCI host bridge to bus 0000:00
> pci_bus 0000:00: root bus resource [bus 00-ff]
> pci_bus 0000:00: root bus resource [mem 0x60000000-0x6fffffff]
> pci_bus 0000:00: root bus resource [io  0x0000-0xffff] (bus address
> [0x1e160000-0x1e16ffff])
>
> This other one (correct behaviour AFAICS) is what I get with this
> patch series setting IO_SPACE_LIMIT and ifdef to avoid the remapping:
>
> mt7621-pci 1e140000.pcie:       IO 0x001e160000..0x001e16ffff -> 0x001e160000
> OF: IO START returned by pci_address_to_pio: 0x1e160000-0x1e16ffff

While these look wrong, the output should be in the 0-0ffff range.
I suppose you have set an incorrect io_offset now.

> diff --git a/arch/mips/include/asm/pci.h b/arch/mips/include/asm/pci.h
> index 6f48649201c5..9a8ca258c68b 100644
> --- a/arch/mips/include/asm/pci.h
> +++ b/arch/mips/include/asm/pci.h
> @@ -20,6 +20,12 @@
>  #include <linux/list.h>
>  #include <linux/of.h>
>
> +#define pci_remap_iospace pci_remap_iospace
> +static inline int pci_remap_iospace(const struct resource *res,
> phys_addr_t phys_addr)
> +{
> +       return 0;
> +}
>
> And then in the PCI core code do something like this?

This is not sufficient: pci_remap_iospace() has to tell the architecture
code where the start of the I/O space is, which normally means
ioremapping it, and in your case would mean setting the
mips_io_port_base variable to phys_addr.

> But since this 'pci_remap_iospace' is already defined in
> 'include/linux/pci.h' and is not static at all the compiler complains
> about doing such a thing. What am I missing here?

I think you have to have another #ifdef around the declaration in
this case, or alternatively move the mips definition back to a .c
file and leave only the #define

> > This is particularly important since we want the host bridge driver
> > to be portable. If you set up the mapping differently between e.g.
> > mt7621 and mt7623, they are not able to use the same driver
> > code for setting pci_host_bridge->io_offset and for programming
> > the inbound translation registers.
>
> mt7621 is only mips, mt7623 is arm based SoC. They cannot use the same
> driver at all. mt7620 or mt7628 which have drivers which are in
> 'arch/mips/pci' using legacy pci code would use and would be tried to
> be ported to share the driver with mt7621 but those are also only mips
> and the I/O addresses for all of them are similar and have I/O higher
> than 0xffff as mt7621 has.

That was my point: the driver should never care what the I/O addresses
are, so as long as it gets the addresses from DT and passes them into
generic kernel interfaces, it must do the right thing on both MIPS and ARM.

The mt7620/28/80/88 driver is obviously not portable because it does
not attempt to be a generic PCI host bridge driver.

> > > All I/O port addresses for ralink SoCs are in higher addresses than
> > > default IO_SPACE_LIMIT 0xFFFF, that's why we have to also change this
> > > limit together with this patch changes. Nothing to do with this, is an
> > > architectural thing of these SoCs.
> >
> > I don't understand. What I see above is that the host bridge
> > has the region 1e160000-1e16ffff registered, so presumably
> > 1e160000 is actually the start of the window into the host bridge.
> > If you set PCI_IOBASE to that location, the highest port number
> > would become 0x2027, which is under 0xffff.
>
> But 0x1e160000 is defined in the device tree as the I/O start address
> of the range and should not be hardcoded anywhere else since other
> ralink platforms don't use this address as PCI_IOBASE. And yes, 0x2027
> is the highest port number I get if I initially define PCI_IOBASE also
> as KSEG1 since at the end is the entry point for I/O in mips (see
> trace above).
>
> Thanks very much for your time and feedback.

0x1e160000 should not be listed as the I/O start address, it should
be listed in the 'ranges' property as the MMIO address that the I/O
window translates into, with the actual port numbers (on the bus)
being in the low range.

I realize this is very confusing, but there are indeed at least three
address spaces that you must not confuse here:

a) I/O port numbers as programmed into BAR registers and
    used in PCIe transactions, normally 0 through 0xffff on each
    bus.
b) Linux I/O port numbers as seen from user space, in the range
     from 0 to IO_SPACE_LIMIT, these correspond to the
     bus addresses from a) if io_offset is zero, but could be
     different with a non-zero value passed into
     pci_add_resource_offset() when the region is probed.
     The offset may be different on each pci host bridge.
c) MMIO address used to access ports, offset by PCI_IOBASE
    from the Linux port numbers in b).
    No other registers should be visible between PCI_IOBASE
    and PCI_IOBASE+IO_SPACE_LIMIT

          Arnd
Sergio Paracuellos Sept. 23, 2021, 11:08 a.m. UTC | #10
Hi Arnd,

On Thu, Sep 23, 2021 at 11:07 AM Arnd Bergmann <arnd@arndb.de> wrote:
>
> ()On Thu, Sep 23, 2021 at 8:36 AM Sergio Paracuellos
> <sergio.paracuellos@gmail.com> wrote:
> > On Thu, Sep 23, 2021 at 7:51 AM Arnd Bergmann <arnd@arndb.de> wrote:
> > > > I am not really understanding this yet (I think I need a bit of sleep
> > > > time :)), but I will test this tomorrow and come back to you again
> > > > with results.
> > >
> > > Both would let devices access the registers, but they are different
> > > regarding the bus translations you have to program into the
> > > host bridge, and how to access the hardcoded port numbers.
> >
> > I have tested this and I get initial invalidid BAR value errors on pci
> > bus I/O enumeration an also bad addresses in /proc/ioports in the same
> > way I got defining PCI_IOBASE as _AC(0xa0000000, UL):
> >
> > root@gnubee:~# cat /proc/ioports
> > 00000000-0000ffff : pcie@1e140000
> >   00000000-00000fff : PCI Bus 0000:01
> >     00000000-0000000f : 0000:01:00.0
> >       00000000-0000000f : ahci
> >     00000010-00000017 : 0000:01:00.0
> >       00000010-00000017 : ahci
> >     00000018-0000001f : 0000:01:00.0
> >       00000018-0000001f : ahci
>
> Ok, These look good to me now.

This is the behaviour we already had with spaces.h [0] without any
other change. See also comments of Thomas [1] about this being wrong
which at the end are the motivation for this patch series.

> > mt7621-pci 1e140000.pcie:       IO 0x001e160000..0x001e16ffff -> 0x001e160000
> > LOGIC PIO: PIO TO CPUADDR: ADDR: 0x1e160000 -  addr HW_START:
> > 0x1e160000 + RANGE IO: 0x00000000

Why is my RANGE IO start transformed here to 0x0? Should not be the
one defined in dts 0x001e160000?

> > OF: IO START returned by pci_address_to_pio: 0x0-0xffff
> > mt7621-pci 1e140000.pcie: PCIE0 enabled
> > mt7621-pci 1e140000.pcie: PCIE1 enabled
> > mt7621-pci 1e140000.pcie: PCIE2 enabled
> > mt7621-pci 1e140000.pcie: PCI coherence region base: 0x60000000,
> > mask/settings: 0xf0000002
> > mt7621-pci 1e140000.pcie: PCI host bridge to bus 0000:00
> > pci_bus 0000:00: root bus resource [bus 00-ff]
> > pci_bus 0000:00: root bus resource [mem 0x60000000-0x6fffffff]
> > pci_bus 0000:00: root bus resource [io  0x0000-0xffff] (bus address
> > [0x1e160000-0x1e16ffff])
> >
> > This other one (correct behaviour AFAICS) is what I get with this
> > patch series setting IO_SPACE_LIMIT and ifdef to avoid the remapping:
> >
> > mt7621-pci 1e140000.pcie:       IO 0x001e160000..0x001e16ffff -> 0x001e160000
> > OF: IO START returned by pci_address_to_pio: 0x1e160000-0x1e16ffff
>
> While these look wrong, the output should be in the 0-0ffff range.
> I suppose you have set an incorrect io_offset now.

Well, here I am only using IO_SPACE_LIMIT set to 0x1fffffff and no
PCI_IOBASE defined and skipping the remap with this patch, so the
address listed in the dts range is returned as it is. Thus the range
'0x1e160000-0x1e16ffff'.

>
> > diff --git a/arch/mips/include/asm/pci.h b/arch/mips/include/asm/pci.h
> > index 6f48649201c5..9a8ca258c68b 100644
> > --- a/arch/mips/include/asm/pci.h
> > +++ b/arch/mips/include/asm/pci.h
> > @@ -20,6 +20,12 @@
> >  #include <linux/list.h>
> >  #include <linux/of.h>
> >
> > +#define pci_remap_iospace pci_remap_iospace
> > +static inline int pci_remap_iospace(const struct resource *res,
> > phys_addr_t phys_addr)
> > +{
> > +       return 0;
> > +}
> >
> > And then in the PCI core code do something like this?
>
> This is not sufficient: pci_remap_iospace() has to tell the architecture
> code where the start of the I/O space is, which normally means
> ioremapping it, and in your case would mean setting the
> mips_io_port_base variable to phys_addr.

Understood.

>
> > But since this 'pci_remap_iospace' is already defined in
> > 'include/linux/pci.h' and is not static at all the compiler complains
> > about doing such a thing. What am I missing here?
>
> I think you have to have another #ifdef around the declaration in
> this case, or alternatively move the mips definition back to a .c
> file and leave only the #define

Ok, so the following changes:

diff --git a/arch/mips/pci/pci-generic.c b/arch/mips/pci/pci-generic.c
index 95b00017886c..ee0e0951b800 100644
--- a/arch/mips/pci/pci-generic.c
+++ b/arch/mips/pci/pci-generic.c
@@ -46,3 +46,9 @@ void pcibios_fixup_bus(struct pci_bus *bus)
 {
        pci_read_bridge_bases(bus);
 }
+
+int pci_remap_iospace(const struct resource *res, phys_addr_t phys_addr)
+{
+       mips_io_port_base = phys_addr;
+       return 0;
+}

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index e578d34095e9..8db76c3a4f67 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4044,6 +4044,7 @@ unsigned long __weak
pci_address_to_pio(phys_addr_t address)
  * architectures that have memory mapped IO functions defined (and the
  * PCI_IOBASE value defined) should call this function.
  */
+#ifndef pci_remap_iospace
 int pci_remap_iospace(const struct resource *res, phys_addr_t phys_addr)
 {
 #if defined(PCI_IOBASE) && defined(CONFIG_MMU)
@@ -4067,6 +4068,7 @@ int pci_remap_iospace(const struct resource
*res, phys_addr_t phys_addr)
 #endif
 }
 EXPORT_SYMBOL(pci_remap_iospace);
+#endif

diff --git a/arch/mips/include/asm/pci.h b/arch/mips/include/asm/pci.h
index 6f48649201c5..ac89354af651 100644
--- a/arch/mips/include/asm/pci.h
+++ b/arch/mips/include/asm/pci.h
@@ -20,6 +20,8 @@
 #include <linux/list.h>
 #include <linux/of.h>

+#define pci_remap_iospace pci_remap_iospace
+
 #ifdef CONFIG_PCI_DRIVERS_LEGACY

These changes got me to the same behaviour that this patch pretends
without the ifdef on this patch. But, this behaviour is wrong
according to your explanations since I got

OF: IO START returned by pci_address_to_pio: 0x1e160000-0x1e16ffff

and ioports using this range address and not lower 0x0-0xffff.

So all of these changes seem to be invalid: this patch and the already
added to staging-tree two ones: [2] and [3], right?

>
> > > This is particularly important since we want the host bridge driver
> > > to be portable. If you set up the mapping differently between e.g.
> > > mt7621 and mt7623, they are not able to use the same driver
> > > code for setting pci_host_bridge->io_offset and for programming
> > > the inbound translation registers.
> >
> > mt7621 is only mips, mt7623 is arm based SoC. They cannot use the same
> > driver at all. mt7620 or mt7628 which have drivers which are in
> > 'arch/mips/pci' using legacy pci code would use and would be tried to
> > be ported to share the driver with mt7621 but those are also only mips
> > and the I/O addresses for all of them are similar and have I/O higher
> > than 0xffff as mt7621 has.
>
> That was my point: the driver should never care what the I/O addresses
> are, so as long as it gets the addresses from DT and passes them into
> generic kernel interfaces, it must do the right thing on both MIPS and ARM.
>
> The mt7620/28/80/88 driver is obviously not portable because it does
> not attempt to be a generic PCI host bridge driver.

Currently, no. But if they were ideally moved to work in the same way
mt7621 would be the same case. Mt7621 is device tree based PCI host
bridge driver that uses pci core apis but is still mips since it has
to properly set IO coherency units which is a mips thing...

>
> > > > All I/O port addresses for ralink SoCs are in higher addresses than
> > > > default IO_SPACE_LIMIT 0xFFFF, that's why we have to also change this
> > > > limit together with this patch changes. Nothing to do with this, is an
> > > > architectural thing of these SoCs.
> > >
> > > I don't understand. What I see above is that the host bridge
> > > has the region 1e160000-1e16ffff registered, so presumably
> > > 1e160000 is actually the start of the window into the host bridge.
> > > If you set PCI_IOBASE to that location, the highest port number
> > > would become 0x2027, which is under 0xffff.
> >
> > But 0x1e160000 is defined in the device tree as the I/O start address
> > of the range and should not be hardcoded anywhere else since other
> > ralink platforms don't use this address as PCI_IOBASE. And yes, 0x2027
> > is the highest port number I get if I initially define PCI_IOBASE also
> > as KSEG1 since at the end is the entry point for I/O in mips (see
> > trace above).
> >
> > Thanks very much for your time and feedback.
>
> 0x1e160000 should not be listed as the I/O start address, it should
> be listed in the 'ranges' property as the MMIO address that the I/O
> window translates into, with the actual port numbers (on the bus)
> being in the low range.
>
> I realize this is very confusing, but there are indeed at least three
> address spaces that you must not confuse here:
>
> a) I/O port numbers as programmed into BAR registers and
>     used in PCIe transactions, normally 0 through 0xffff on each
>     bus.
> b) Linux I/O port numbers as seen from user space, in the range
>      from 0 to IO_SPACE_LIMIT, these correspond to the
>      bus addresses from a) if io_offset is zero, but could be
>      different with a non-zero value passed into
>      pci_add_resource_offset() when the region is probed.
>      The offset may be different on each pci host bridge.

This "offset" is the pci address configured in device tree range,
right? This seems the part is not doing properly in my case since all
of these changes are needed to at the end got BAR's as

pci 0000:02:00.0: BAR 4: assigned [io  0x1e161000-0x1e16100f]
pci 0000:02:00.0: BAR 0: assigned [io  0x1e161010-0x1e161017]
pci 0000:02:00.0: BAR 2: assigned [io  0x1e161018-0x1e16101f]
pci 0000:02:00.0: BAR 1: assigned [io  0x1e161020-0x1e161023]
pci 0000:02:00.0: BAR 3: assigned [io  0x1e161024-0x1e161027]

which I understand is correct.

> c) MMIO address used to access ports, offset by PCI_IOBASE
>     from the Linux port numbers in b).
>     No other registers should be visible between PCI_IOBASE
>     and PCI_IOBASE+IO_SPACE_LIMIT

mips_io_port_base + offset, right? KSEG1 addresses for mips by default.

Thanks for your patience, BTW :)).

Best regards,
    Sergio Paracuellos

>
>           Arnd

[0]: https://elixir.bootlin.com/linux/v5.15-rc2/source/arch/mips/include/asm/mach-ralink/spaces.h
[1]: https://lore.kernel.org/linux-mips/20210729100146.GA8648@alpha.franken.de/
[2]: https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git/commit/?h=staging-testing&id=159697474db41732ef3b6c2e8d9395f09d1f659e
[3]: https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git/commit/?h=staging-testing&id=50fb34eca2944fd67493717c9fbda125336f1655
Arnd Bergmann Sept. 23, 2021, 1:26 p.m. UTC | #11
==On Thu, Sep 23, 2021 at 1:09 PM Sergio Paracuellos
<sergio.paracuellos@gmail.com> wrote:
> On Thu, Sep 23, 2021 at 11:07 AM Arnd Bergmann <arnd@arndb.de> wrote:
> > ()On Thu, Sep 23, 2021 at 8:36 AM Sergio Paracuellos
> > <sergio.paracuellos@gmail.com> wrote:
> > > On Thu, Sep 23, 2021 at 7:51 AM Arnd Bergmann <arnd@arndb.de> wrote:
> > > > > I am not really understanding this yet (I think I need a bit of sleep
> > > > > time :)), but I will test this tomorrow and come back to you again
> > > > > with results.
> > > >
> > > > Both would let devices access the registers, but they are different
> > > > regarding the bus translations you have to program into the
> > > > host bridge, and how to access the hardcoded port numbers.
> > >
> > > I have tested this and I get initial invalidid BAR value errors on pci
> > > bus I/O enumeration an also bad addresses in /proc/ioports in the same
> > > way I got defining PCI_IOBASE as _AC(0xa0000000, UL):
> > >
> > > root@gnubee:~# cat /proc/ioports
> > > 00000000-0000ffff : pcie@1e140000
> > >   00000000-00000fff : PCI Bus 0000:01
> > >     00000000-0000000f : 0000:01:00.0
> > >       00000000-0000000f : ahci
> > >     00000010-00000017 : 0000:01:00.0
> > >       00000010-00000017 : ahci
> > >     00000018-0000001f : 0000:01:00.0
> > >       00000018-0000001f : ahci
> >
> > Ok, These look good to me now.
>
> This is the behaviour we already had with spaces.h [0] without any
> other change. See also comments of Thomas [1] about this being wrong
> which at the end are the motivation for this patch series.
>
> > > mt7621-pci 1e140000.pcie:       IO 0x001e160000..0x001e16ffff -> 0x001e160000
> > > LOGIC PIO: PIO TO CPUADDR: ADDR: 0x1e160000 -  addr HW_START:
> > > 0x1e160000 + RANGE IO: 0x00000000
>
> Why is my RANGE IO start transformed here to 0x0? Should not be the
> one defined in dts 0x001e160000?

Can you show the exact property in your device tree? It sounds like the
problem is an incorrect entry in the ranges, unless the chip is hardwired
to the bus address in an unusual way.

> > I think you have to have another #ifdef around the declaration in
> > this case, or alternatively move the mips definition back to a .c
> > file and leave only the #define
>
> Ok, so the following changes:
>
> diff --git a/arch/mips/pci/pci-generic.c b/arch/mips/pci/pci-generic.c
> index 95b00017886c..ee0e0951b800 100644
> --- a/arch/mips/pci/pci-generic.c
> +++ b/arch/mips/pci/pci-generic.c
> @@ -46,3 +46,9 @@ void pcibios_fixup_bus(struct pci_bus *bus)
>  {
>         pci_read_bridge_bases(bus);
>  }
> +
> +int pci_remap_iospace(const struct resource *res, phys_addr_t phys_addr)
> +{
> +       mips_io_port_base = phys_addr;
> +       return 0;
> +}
...
> These changes got me to the same behaviour that this patch pretends
> without the ifdef on this patch. But, this behaviour is wrong
> according to your explanations since I got
>
> OF: IO START returned by pci_address_to_pio: 0x1e160000-0x1e16ffff
>
> and ioports using this range address and not lower 0x0-0xffff.
>
> So all of these changes seem to be invalid: this patch and the already
> added to staging-tree two ones: [2] and [3], right?

Right, there is probably yet another problem. Patch [2] should
be harmless here, but patch [3] is wrong as you should not override
the length of the I/O port window that is in the DT.

> Currently, no. But if they were ideally moved to work in the same way
> mt7621 would be the same case. Mt7621 is device tree based PCI host
> bridge driver that uses pci core apis but is still mips since it has
> to properly set IO coherency units which is a mips thing...

I don't know what those IO coherency units are, but I would think that
if you have to do some extra things on MIPS but not ARM, then those
should be done from the common PCI host bridge code and stubbed out
on architectures that don't need them.

> > I realize this is very confusing, but there are indeed at least three
> > address spaces that you must not confuse here:
> >
> > a) I/O port numbers as programmed into BAR registers and
> >     used in PCIe transactions, normally 0 through 0xffff on each
> >     bus.
> > b) Linux I/O port numbers as seen from user space, in the range
> >      from 0 to IO_SPACE_LIMIT, these correspond to the
> >      bus addresses from a) if io_offset is zero, but could be
> >      different with a non-zero value passed into
> >      pci_add_resource_offset() when the region is probed.
> >      The offset may be different on each pci host bridge.
>
> This "offset" is the pci address configured in device tree range,
> right? This seems the part is not doing properly in my case since all
> of these changes are needed to at the end got BAR's as
>
> pci 0000:02:00.0: BAR 4: assigned [io  0x1e161000-0x1e16100f]
> pci 0000:02:00.0: BAR 0: assigned [io  0x1e161010-0x1e161017]
> pci 0000:02:00.0: BAR 2: assigned [io  0x1e161018-0x1e16101f]
> pci 0000:02:00.0: BAR 1: assigned [io  0x1e161020-0x1e161023]
> pci 0000:02:00.0: BAR 3: assigned [io  0x1e161024-0x1e161027]
>
> which I understand is correct.

The "offset" is between two numbers that can normally both be
picked freely, so it could literally be anything, but in the most common
and ideal case, it is zero:

The Linux port number gets assigned when probing the host bridge,
this is purely a software construct and the first bridge should normally
get range 0-0xffff, the second bridge gets range 0x10000-0x1ffff
etc. The code assigning these numbers is rather confusing, and I
can't even find where it is now...

The port number on the bus is platform specific. In some cases
you can set it through a register in the pci host bridge, in other
cases it is fixed to starting at zero. If the address is programmable,
it can be either set by the firmware or bootloader and passed down
to the kernel through the DT ranges property, or the ranges can
contain a suggested value that then has to be programmed by
the host bridge driver.

If the value is not zero, you should try setting it to zero to get
an identity mapping against the Linux port numbers, to minimize
the confusion.

It is possible that the hardware (or bootloader) designers
misunderstood what the window is about, and hardcoded it so
that the port number on the bus is the same as the physical
address as seen from the CPU. If this is the case and you
can't change it to a sane value, you have to put the 1:1
translation into the DT and would actually get the strange
port numbers 0x1e161000-0x1e16100f from that nonzero offset.

This means you can only use PCI devices that can be
programmed with high port numbers, but not devices with
hardcoded legacy ports.

> > c) MMIO address used to access ports, offset by PCI_IOBASE
> >     from the Linux port numbers in b).
> >     No other registers should be visible between PCI_IOBASE
> >     and PCI_IOBASE+IO_SPACE_LIMIT
>
> mips_io_port_base + offset, right? KSEG1 addresses for mips by default.

no, not the offset. As long as mips_io_port_base==PCI_IOBASE,
the accessible ports will be between mips_io_port_base and
mips_io_port_base+0xffff.

         Arnd
Sergio Paracuellos Sept. 23, 2021, 2:55 p.m. UTC | #12
Hi Arnd,

On Thu, Sep 23, 2021 at 3:26 PM Arnd Bergmann <arnd@arndb.de> wrote:
>
>  ==On Thu, Sep 23, 2021 at 1:09 PM Sergio Paracuellos
> <sergio.paracuellos@gmail.com> wrote:
> > On Thu, Sep 23, 2021 at 11:07 AM Arnd Bergmann <arnd@arndb.de> wrote:
> > > ()On Thu, Sep 23, 2021 at 8:36 AM Sergio Paracuellos
> > > <sergio.paracuellos@gmail.com> wrote:
> > > > On Thu, Sep 23, 2021 at 7:51 AM Arnd Bergmann <arnd@arndb.de> wrote:
> > > > > > I am not really understanding this yet (I think I need a bit of sleep
> > > > > > time :)), but I will test this tomorrow and come back to you again
> > > > > > with results.
> > > > >
> > > > > Both would let devices access the registers, but they are different
> > > > > regarding the bus translations you have to program into the
> > > > > host bridge, and how to access the hardcoded port numbers.
> > > >
> > > > I have tested this and I get initial invalidid BAR value errors on pci
> > > > bus I/O enumeration an also bad addresses in /proc/ioports in the same
> > > > way I got defining PCI_IOBASE as _AC(0xa0000000, UL):
> > > >
> > > > root@gnubee:~# cat /proc/ioports
> > > > 00000000-0000ffff : pcie@1e140000
> > > >   00000000-00000fff : PCI Bus 0000:01
> > > >     00000000-0000000f : 0000:01:00.0
> > > >       00000000-0000000f : ahci
> > > >     00000010-00000017 : 0000:01:00.0
> > > >       00000010-00000017 : ahci
> > > >     00000018-0000001f : 0000:01:00.0
> > > >       00000018-0000001f : ahci
> > >
> > > Ok, These look good to me now.
> >
> > This is the behaviour we already had with spaces.h [0] without any
> > other change. See also comments of Thomas [1] about this being wrong
> > which at the end are the motivation for this patch series.
> >
> > > > mt7621-pci 1e140000.pcie:       IO 0x001e160000..0x001e16ffff -> 0x001e160000
> > > > LOGIC PIO: PIO TO CPUADDR: ADDR: 0x1e160000 -  addr HW_START:
> > > > 0x1e160000 + RANGE IO: 0x00000000
> >
> > Why is my RANGE IO start transformed here to 0x0? Should not be the
> > one defined in dts 0x001e160000?

>
> Can you show the exact property in your device tree? It sounds like the
> problem is an incorrect entry in the ranges, unless the chip is hardwired
> to the bus address in an unusual way.

Here it is:

ranges = <0x02000000 0 0x60000000 0x60000000 0 0x10000000>,  /* pci memory */
               <0x01000000 0 0x1e160000 0x1e160000 0 0x00010000>;  /*
io space */

>
> > > I think you have to have another #ifdef around the declaration in
> > > this case, or alternatively move the mips definition back to a .c
> > > file and leave only the #define
> >
> > Ok, so the following changes:
> >
> > diff --git a/arch/mips/pci/pci-generic.c b/arch/mips/pci/pci-generic.c
> > index 95b00017886c..ee0e0951b800 100644
> > --- a/arch/mips/pci/pci-generic.c
> > +++ b/arch/mips/pci/pci-generic.c
> > @@ -46,3 +46,9 @@ void pcibios_fixup_bus(struct pci_bus *bus)
> >  {
> >         pci_read_bridge_bases(bus);
> >  }
> > +
> > +int pci_remap_iospace(const struct resource *res, phys_addr_t phys_addr)
> > +{
> > +       mips_io_port_base = phys_addr;
> > +       return 0;
> > +}
> ...
> > These changes got me to the same behaviour that this patch pretends
> > without the ifdef on this patch. But, this behaviour is wrong
> > according to your explanations since I got
> >
> > OF: IO START returned by pci_address_to_pio: 0x1e160000-0x1e16ffff
> >
> > and ioports using this range address and not lower 0x0-0xffff.
> >
> > So all of these changes seem to be invalid: this patch and the already
> > added to staging-tree two ones: [2] and [3], right?
>
> Right, there is probably yet another problem. Patch [2] should
> be harmless here, but patch [3] is wrong as you should not override
> the length of the I/O port window that is in the DT.

Understood and makes sense. If I also set the start address for the
port the length won't be overwritten but I am only setting the end
limit there since I don't have the io range start address in such
early probe part of the driver and makes no sense to hardcode it. That
is the only reason.

>
> > Currently, no. But if they were ideally moved to work in the same way
> > mt7621 would be the same case. Mt7621 is device tree based PCI host
> > bridge driver that uses pci core apis but is still mips since it has
> > to properly set IO coherency units which is a mips thing...
>
> I don't know what those IO coherency units are, but I would think that
> if you have to do some extra things on MIPS but not ARM, then those
> should be done from the common PCI host bridge code and stubbed out
> on architectures that don't need them.

Simple definition here [0]. And we must adjust them in the driver here
[1]. Mips code, yes, but cannot be done in another way. Is related
with address of the memory resource and must be set up. In any other
case the pci subsystem won't work. I initially submitted this driver
from staging to arch/mips but I was told that even though it is mips
code, as it is device tree based and use common pci core apis, it
would be better to move it to 'drivers/pci/controller'. But I still
need the mips specific code for the iocu thing.

>
> > > I realize this is very confusing, but there are indeed at least three
> > > address spaces that you must not confuse here:
> > >
> > > a) I/O port numbers as programmed into BAR registers and
> > >     used in PCIe transactions, normally 0 through 0xffff on each
> > >     bus.
> > > b) Linux I/O port numbers as seen from user space, in the range
> > >      from 0 to IO_SPACE_LIMIT, these correspond to the
> > >      bus addresses from a) if io_offset is zero, but could be
> > >      different with a non-zero value passed into
> > >      pci_add_resource_offset() when the region is probed.
> > >      The offset may be different on each pci host bridge.
> >
> > This "offset" is the pci address configured in device tree range,
> > right? This seems the part is not doing properly in my case since all
> > of these changes are needed to at the end got BAR's as
> >
> > pci 0000:02:00.0: BAR 4: assigned [io  0x1e161000-0x1e16100f]
> > pci 0000:02:00.0: BAR 0: assigned [io  0x1e161010-0x1e161017]
> > pci 0000:02:00.0: BAR 2: assigned [io  0x1e161018-0x1e16101f]
> > pci 0000:02:00.0: BAR 1: assigned [io  0x1e161020-0x1e161023]
> > pci 0000:02:00.0: BAR 3: assigned [io  0x1e161024-0x1e161027]
> >
> > which I understand is correct.
>
> The "offset" is between two numbers that can normally both be
> picked freely, so it could literally be anything, but in the most common
> and ideal case, it is zero:
>
> The Linux port number gets assigned when probing the host bridge,
> this is purely a software construct and the first bridge should normally
> get range 0-0xffff, the second bridge gets range 0x10000-0x1ffff
> etc. The code assigning these numbers is rather confusing, and I
> can't even find where it is now...
>
> The port number on the bus is platform specific. In some cases
> you can set it through a register in the pci host bridge, in other
> cases it is fixed to starting at zero. If the address is programmable,
> it can be either set by the firmware or bootloader and passed down
> to the kernel through the DT ranges property, or the ranges can
> contain a suggested value that then has to be programmed by
> the host bridge driver.
>
> If the value is not zero, you should try setting it to zero to get
> an identity mapping against the Linux port numbers, to minimize
> the confusion.
>
> It is possible that the hardware (or bootloader) designers
> misunderstood what the window is about, and hardcoded it so
> that the port number on the bus is the same as the physical
> address as seen from the CPU. If this is the case and you
> can't change it to a sane value, you have to put the 1:1
> translation into the DT and would actually get the strange
> port numbers 0x1e161000-0x1e16100f from that nonzero offset.

Yes, and that pci_add_resource_offset() is called inside
devm_of_pci_get_host_bridge_resources() after parsing the ranges and
storing them as resources.  To calculate that offset passed around,
subtracts: res->start - range.pci_addr [2], so looking into my ranges
my offset should be zero. And I have added a trace just to confirm and
there are zero:

mt7621-pci 1e140000.pcie:      MEM 0x0060000000..0x006fffffff -> 0x0060000000
OF: IO START returned by pci_address_to_pio: 0x60000000-0x6fffffff
PCI: OF: OFFSET -> RES START 0x60000000 - PCI ADDRESS 0x60000000 -> 0x0
mt7621-pci 1e140000.pcie:       IO 0x001e160000..0x001e16ffff -> 0x001e160000
OF: IO START returned by pci_address_to_pio: 0x1e160000-0x1e16ffff
PCI: OF: OFFSET -> RES START 0x1e160000 - PCI ADDRESS 0x1e160000 -> 0x0

But if I define PCI_IOBASE I get my I/O range start set to zero but
also the offset?? Why this substract is not getting '0x1e160000' as
offset here?

LOGIC PIO: PIO TO CPUADDR: ADDR: 0x1e160000 -  addr HW_START:
0x1e160000 + RANGE IO: 0x00000000
OF: IO START returned by pci_address_to_pio: 0x0-0xffff
PCI: OF: OFFSET -> RES START 0x0 - PCI ADDRESS 0x1e160000 -> 0x0


>
> This means you can only use PCI devices that can be
> programmed with high port numbers, but not devices with
> hardcoded legacy ports.

So there is no real sense of having PCI IO working then since this
such devices is unlikely to exist.

>
> > > c) MMIO address used to access ports, offset by PCI_IOBASE
> > >     from the Linux port numbers in b).
> > >     No other registers should be visible between PCI_IOBASE
> > >     and PCI_IOBASE+IO_SPACE_LIMIT
> >
> > mips_io_port_base + offset, right? KSEG1 addresses for mips by default.
>
> no, not the offset. As long as mips_io_port_base==PCI_IOBASE,
> the accessible ports will be between mips_io_port_base and
> mips_io_port_base+0xffff.

Understood, thanks.

Best regards,
    Sergio Paracuellos

>
>          Arnd

[0]; https://www.linux-mips.org/wiki/IO_Coherence_Unit
[1]; https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git/tree/drivers/staging/mt7621-pci/pci-mt7621.c?h=staging-testing#n211
[2]: https://elixir.bootlin.com/linux/latest/source/drivers/pci/of.c#L401
Arnd Bergmann Sept. 23, 2021, 5:55 p.m. UTC | #13
On Thu, Sep 23, 2021 at 4:55 PM Sergio Paracuellos
<sergio.paracuellos@gmail.com> wrote:
> On Thu, Sep 23, 2021 at 3:26 PM Arnd Bergmann <arnd@arndb.de> wrote:
> > >
> > > > > mt7621-pci 1e140000.pcie:       IO 0x001e160000..0x001e16ffff -> 0x001e160000
> > > > > LOGIC PIO: PIO TO CPUADDR: ADDR: 0x1e160000 -  addr HW_START:
> > > > > 0x1e160000 + RANGE IO: 0x00000000
> > >
> > > Why is my RANGE IO start transformed here to 0x0? Should not be the
> > > one defined in dts 0x001e160000?
>
> >
> > Can you show the exact property in your device tree? It sounds like the
> > problem is an incorrect entry in the ranges, unless the chip is hardwired
> > to the bus address in an unusual way.
>
> Here it is:
>
> ranges = <0x02000000 0 0x60000000 0x60000000 0 0x10000000>,  /* pci memory */
>                <0x01000000 0 0x1e160000 0x1e160000 0 0x00010000>;  /*
> io space */

Ok, got it. So the memory space uses a normal zero offset with cpu address
equal to bus address, but the I/O space has a rather usual mapping of
bus address equal to the window into the mmio space, with an offset of
0x1e160000.

The normal way to do this would be map the PCI port range 0-0x10000
into CPU address 0x1e160000, like:

ranges = <0x02000000 0 0x60000000 0x60000000 0 0x10000000>,  /* pci memory */
               <0x01000000 0 0x1e160000 0 0 0x00010000>;

Do you know where that mapping is set up? Is this a hardware setting,
or is there a way to change the inbound I/O port ranges to the
normal mapping?

> > > Currently, no. But if they were ideally moved to work in the same way
> > > mt7621 would be the same case. Mt7621 is device tree based PCI host
> > > bridge driver that uses pci core apis but is still mips since it has
> > > to properly set IO coherency units which is a mips thing...
> >
> > I don't know what those IO coherency units are, but I would think that
> > if you have to do some extra things on MIPS but not ARM, then those
> > should be done from the common PCI host bridge code and stubbed out
> > on architectures that don't need them.
>
> Simple definition here [0]. And we must adjust them in the driver here
> [1]. Mips code, yes, but cannot be done in another way. Is related
> with address of the memory resource and must be set up. In any other
> case the pci subsystem won't work. I initially submitted this driver
> from staging to arch/mips but I was told that even though it is mips
> code, as it is device tree based and use common pci core apis, it
> would be better to move it to 'drivers/pci/controller'. But I still
> need the mips specific code for the iocu thing.

It does sound like this could be reworked into an architecture
specific helper that is defined for MIPS but just an empty stub
for everything else. Or you can just have an #ifdef in your
driver to at least enable compile-testing it on other architecture
if not completely sharing it with others. This is certainly a less
important point compared to the I/O port mapping.

> > It is possible that the hardware (or bootloader) designers
> > misunderstood what the window is about, and hardcoded it so
> > that the port number on the bus is the same as the physical
> > address as seen from the CPU. If this is the case and you
> > can't change it to a sane value, you have to put the 1:1
> > translation into the DT and would actually get the strange
> > port numbers 0x1e161000-0x1e16100f from that nonzero offset.
>
> Yes, and that pci_add_resource_offset() is called inside
> devm_of_pci_get_host_bridge_resources() after parsing the ranges and
> storing them as resources.  To calculate that offset passed around,
> subtracts: res->start - range.pci_addr [2], so looking into my ranges
> my offset should be zero. And I have added a trace just to confirm and
> there are zero:
>
> mt7621-pci 1e140000.pcie:      MEM 0x0060000000..0x006fffffff -> 0x0060000000
> OF: IO START returned by pci_address_to_pio: 0x60000000-0x6fffffff
> PCI: OF: OFFSET -> RES START 0x60000000 - PCI ADDRESS 0x60000000 -> 0x0
> mt7621-pci 1e140000.pcie:       IO 0x001e160000..0x001e16ffff -> 0x001e160000
> OF: IO START returned by pci_address_to_pio: 0x1e160000-0x1e16ffff
> PCI: OF: OFFSET -> RES START 0x1e160000 - PCI ADDRESS 0x1e160000 -> 0x0
>
> But if I define PCI_IOBASE I get my I/O range start set to zero but
> also the offset?? Why this substract is not getting '0x1e160000' as
> offset here?
>
> LOGIC PIO: PIO TO CPUADDR: ADDR: 0x1e160000 -  addr HW_START:
> 0x1e160000 + RANGE IO: 0x00000000
> OF: IO START returned by pci_address_to_pio: 0x0-0xffff
> PCI: OF: OFFSET -> RES START 0x0 - PCI ADDRESS 0x1e160000 -> 0x0

I'm not completely following what each of the numbers in your log refer
to.  The main thing you still need is to have the hardware set up
so it matches the ranges property, and the io_offset matching that.

With PCI_IOBASE=0x1e160000, there are two possible ways this
can work:

a) according to the ranges property you listed above:
    linux port numbers 0-0xffff, pci port numbers 0x1e160000-0x1e16ffff,
    io_offset=0x1e160000 (possibly negative that number, I can never
    keep track)

b) the normal way with the ranges according to my reply above, works only
    if the hardware mapping window can be reprogrammed that way:
    linux port numbers 0-0xffff, pci port numbers 0-0xffff,
    io_offset=0

         Arnd
Sergio Paracuellos Sept. 23, 2021, 8:32 p.m. UTC | #14
Hi Arnd,

On Thu, Sep 23, 2021 at 7:55 PM Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Thu, Sep 23, 2021 at 4:55 PM Sergio Paracuellos
> <sergio.paracuellos@gmail.com> wrote:
> > On Thu, Sep 23, 2021 at 3:26 PM Arnd Bergmann <arnd@arndb.de> wrote:
> > > >
> > > > > > mt7621-pci 1e140000.pcie:       IO 0x001e160000..0x001e16ffff -> 0x001e160000
> > > > > > LOGIC PIO: PIO TO CPUADDR: ADDR: 0x1e160000 -  addr HW_START:
> > > > > > 0x1e160000 + RANGE IO: 0x00000000
> > > >
> > > > Why is my RANGE IO start transformed here to 0x0? Should not be the
> > > > one defined in dts 0x001e160000?
> >
> > >
> > > Can you show the exact property in your device tree? It sounds like the
> > > problem is an incorrect entry in the ranges, unless the chip is hardwired
> > > to the bus address in an unusual way.
> >
> > Here it is:
> >
> > ranges = <0x02000000 0 0x60000000 0x60000000 0 0x10000000>,  /* pci memory */
> >                <0x01000000 0 0x1e160000 0x1e160000 0 0x00010000>;  /*
> > io space */
>
> Ok, got it. So the memory space uses a normal zero offset with cpu address
> equal to bus address, but the I/O space has a rather usual mapping of
> bus address equal to the window into the mmio space, with an offset of
> 0x1e160000.
>
> The normal way to do this would be map the PCI port range 0-0x10000
> into CPU address 0x1e160000, like:
>
> ranges = <0x02000000 0 0x60000000 0x60000000 0 0x10000000>,  /* pci memory */
>                <0x01000000 0 0x1e160000 0 0 0x00010000>;

This change as it is got into the same behaviour:

mt7621-pci 1e140000.pcie: host bridge /pcie@1e140000 ranges:
mt7621-pci 1e140000.pcie:   No bus range found for /pcie@1e140000,
using [bus 00-ff]
mt7621-pci 1e140000.pcie:      MEM 0x0060000000..0x006fffffff -> 0x0060000000
mt7621-pci 1e140000.pcie:       IO 0x0000000000..0x000000ffff -> 0x001e160000
                                                   ^^^^^^
                                                    This is the only
change I appreciate in all the trace with the range change.
mt7621-pci 1e140000.pcie: PCIE0 enabled
mt7621-pci 1e140000.pcie: PCIE1 enabled
mt7621-pci 1e140000.pcie: PCIE2 enabled
mt7621-pci 1e140000.pcie: PCI coherence region base: 0x60000000,
mask/settings: 0xf0000002
mt7621-pci 1e140000.pcie: PCI host bridge to bus 0000:00
pci_bus 0000:00: root bus resource [bus 00-ff]
pci_bus 0000:00: root bus resource [mem 0x60000000-0x6fffffff]
pci_bus 0000:00: root bus resource [io  0x0000-0xffff] (bus address
[0x1e160000-0x1e16ffff])
pci 0000:00:00.0: [0e8d:0801] type 01 class 0x060400
pci 0000:00:00.0: reg 0x10: [mem 0x00000000-0x7fffffff]
pci 0000:00:00.0: reg 0x14: [mem 0x00000000-0x0000ffff]
pci 0000:00:00.0: supports D1
pci 0000:00:00.0: PME# supported from D0 D1 D3hot
pci 0000:00:01.0: [0e8d:0801] type 01 class 0x060400
pci 0000:00:01.0: reg 0x10: [mem 0x00000000-0x7fffffff]
pci 0000:00:01.0: reg 0x14: [mem 0x00000000-0x0000ffff]
pci 0000:00:01.0: supports D1
pci 0000:00:01.0: PME# supported from D0 D1 D3hot
pci 0000:00:02.0: [0e8d:0801] type 01 class 0x060400
pci 0000:00:02.0: reg 0x10: [mem 0x00000000-0x7fffffff]
pci 0000:00:02.0: reg 0x14: [mem 0x00000000-0x0000ffff]
pci 0000:00:02.0: supports D1
pci 0000:00:02.0: PME# supported from D0 D1 D3hot
pci 0000:00:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring
pci 0000:00:01.0: bridge configuration invalid ([bus 00-00]), reconfiguring
pci 0000:00:02.0: bridge configuration invalid ([bus 00-00]), reconfiguring
pci 0000:01:00.0: [1b21:0611] type 00 class 0x010185
pci 0000:01:00.0: reg 0x10: initial BAR value 0x00000000 invalid
pci 0000:01:00.0: reg 0x10: [io  size 0x0008]
pci 0000:01:00.0: reg 0x14: initial BAR value 0x00000000 invalid
pci 0000:01:00.0: reg 0x14: [io  size 0x0004]
pci 0000:01:00.0: reg 0x18: initial BAR value 0x00000000 invalid
pci 0000:01:00.0: reg 0x18: [io  size 0x0008]
pci 0000:01:00.0: reg 0x1c: initial BAR value 0x00000000 invalid
pci 0000:01:00.0: reg 0x1c: [io  size 0x0004]
pci 0000:01:00.0: reg 0x20: initial BAR value 0x00000000 invalid
pci 0000:01:00.0: reg 0x20: [io  size 0x0010]
pci 0000:01:00.0: reg 0x24: [mem 0x00000000-0x000001ff]
pci 0000:01:00.0: 2.000 Gb/s available PCIe bandwidth, limited by 2.5
GT/s PCIe x1 link at 0000:00:00.0 (capable of 4.000 Gb/s with 5.0 GT/s
PCIe x1 link)
pci 0000:00:00.0: PCI bridge to [bus 01-ff]
pci 0000:00:00.0:   bridge window [io  0x0000-0x0fff]
pci 0000:00:00.0:   bridge window [mem 0x00000000-0x000fffff]
pci 0000:00:00.0:   bridge window [mem 0x00000000-0x000fffff pref]
pci_bus 0000:01: busn_res: [bus 01-ff] end is updated to 01
pci 0000:02:00.0: [1b21:0611] type 00 class 0x010185
pci 0000:02:00.0: reg 0x10: initial BAR value 0x00000000 invalid
pci 0000:02:00.0: reg 0x10: [io  size 0x0008]
pci 0000:02:00.0: reg 0x14: initial BAR value 0x00000000 invalid
pci 0000:02:00.0: reg 0x14: [io  size 0x0004]
pci 0000:02:00.0: reg 0x18: initial BAR value 0x00000000 invalid
pci 0000:02:00.0: reg 0x18: [io  size 0x0008]
pci 0000:02:00.0: reg 0x1c: initial BAR value 0x00000000 invalid
pci 0000:02:00.0: reg 0x1c: [io  size 0x0004]
pci 0000:02:00.0: reg 0x20: initial BAR value 0x00000000 invalid
pci 0000:02:00.0: reg 0x20: [io  size 0x0010]
pci 0000:02:00.0: reg 0x24: [mem 0x00000000-0x000001ff]
pci 0000:02:00.0: 2.000 Gb/s available PCIe bandwidth, limited by 2.5
GT/s PCIe x1 link at 0000:00:01.0 (capable of 4.000 Gb/s with 5.0 GT/s
PCIe x1 link)
pci 0000:00:01.0: PCI bridge to [bus 02-ff]
pci 0000:00:01.0:   bridge window [io  0x0000-0x0fff]
pci 0000:00:01.0:   bridge window [mem 0x00000000-0x000fffff]
pci 0000:00:01.0:   bridge window [mem 0x00000000-0x000fffff pref]
pci_bus 0000:02: busn_res: [bus 02-ff] end is updated to 02
pci 0000:03:00.0: [1b21:0611] type 00 class 0x010185
pci 0000:03:00.0: reg 0x10: initial BAR value 0x00000000 invalid
pci 0000:03:00.0: reg 0x10: [io  size 0x0008]
pci 0000:03:00.0: reg 0x14: initial BAR value 0x00000000 invalid
pci 0000:03:00.0: reg 0x14: [io  size 0x0004]
pci 0000:03:00.0: reg 0x18: initial BAR value 0x00000000 invalid
pci 0000:03:00.0: reg 0x18: [io  size 0x0008]
pci 0000:03:00.0: reg 0x1c: initial BAR value 0x00000000 invalid
pci 0000:03:00.0: reg 0x1c: [io  size 0x0004]
pci 0000:03:00.0: reg 0x20: initial BAR value 0x00000000 invalid
pci 0000:03:00.0: reg 0x20: [io  size 0x0010]
pci 0000:03:00.0: reg 0x24: [mem 0x00000000-0x000001ff]
pci 0000:03:00.0: 2.000 Gb/s available PCIe bandwidth, limited by 2.5
GT/s PCIe x1 link at 0000:00:02.0 (capable of 4.000 Gb/s with 5.0 GT/s
PCIe x1 link)
pci 0000:00:02.0: PCI bridge to [bus 03-ff]
pci 0000:00:02.0:   bridge window [io  0x0000-0x0fff]
pci 0000:00:02.0:   bridge window [mem 0x00000000-0x000fffff]
pci 0000:00:02.0:   bridge window [mem 0x00000000-0x000fffff pref]
pci_bus 0000:03: busn_res: [bus 03-ff] end is updated to 03
pci 0000:00:00.0: BAR 0: no space for [mem size 0x80000000]
pci 0000:00:00.0: BAR 0: failed to assign [mem size 0x80000000]
pci 0000:00:01.0: BAR 0: no space for [mem size 0x80000000]
pci 0000:00:01.0: BAR 0: failed to assign [mem size 0x80000000]
pci 0000:00:02.0: BAR 0: no space for [mem size 0x80000000]
pci 0000:00:02.0: BAR 0: failed to assign [mem size 0x80000000]
pci 0000:00:00.0: BAR 8: assigned [mem 0x60000000-0x600fffff]
pci 0000:00:00.0: BAR 9: assigned [mem 0x60100000-0x601fffff pref]
pci 0000:00:01.0: BAR 8: assigned [mem 0x60200000-0x602fffff]
pci 0000:00:01.0: BAR 9: assigned [mem 0x60300000-0x603fffff pref]
pci 0000:00:02.0: BAR 8: assigned [mem 0x60400000-0x604fffff]
pci 0000:00:02.0: BAR 9: assigned [mem 0x60500000-0x605fffff pref]
pci 0000:00:00.0: BAR 1: assigned [mem 0x60600000-0x6060ffff]
pci 0000:00:01.0: BAR 1: assigned [mem 0x60610000-0x6061ffff]
pci 0000:00:02.0: BAR 1: assigned [mem 0x60620000-0x6062ffff]
pci 0000:00:00.0: BAR 7: assigned [io  0x0000-0x0fff]
pci 0000:00:01.0: BAR 7: assigned [io  0x1000-0x1fff]
pci 0000:00:02.0: BAR 7: assigned [io  0x2000-0x2fff]
pci 0000:01:00.0: BAR 5: assigned [mem 0x60000000-0x600001ff]
pci 0000:01:00.0: BAR 4: assigned [io  0x0000-0x000f]
pci 0000:01:00.0: BAR 0: assigned [io  0x0010-0x0017]
pci 0000:01:00.0: BAR 2: assigned [io  0x0018-0x001f]
pci 0000:01:00.0: BAR 1: assigned [io  0x0020-0x0023]
pci 0000:01:00.0: BAR 3: assigned [io  0x0024-0x0027]
pci 0000:00:00.0: PCI bridge to [bus 01]
pci 0000:00:00.0:   bridge window [io  0x0000-0x0fff]
pci 0000:00:00.0:   bridge window [mem 0x60000000-0x600fffff]
pci 0000:00:00.0:   bridge window [mem 0x60100000-0x601fffff pref]
pci 0000:02:00.0: BAR 5: assigned [mem 0x60200000-0x602001ff]
pci 0000:02:00.0: BAR 4: assigned [io  0x1000-0x100f]
pci 0000:02:00.0: BAR 0: assigned [io  0x1010-0x1017]
pci 0000:02:00.0: BAR 2: assigned [io  0x1018-0x101f]
pci 0000:02:00.0: BAR 1: assigned [io  0x1020-0x1023]
pci 0000:02:00.0: BAR 3: assigned [io  0x1024-0x1027]
pci 0000:00:01.0: PCI bridge to [bus 02]
pci 0000:00:01.0:   bridge window [io  0x1000-0x1fff]
pci 0000:00:01.0:   bridge window [mem 0x60200000-0x602fffff]
pci 0000:00:01.0:   bridge window [mem 0x60300000-0x603fffff pref]
pci 0000:03:00.0: BAR 5: assigned [mem 0x60400000-0x604001ff]
pci 0000:03:00.0: BAR 4: assigned [io  0x2000-0x200f]
pci 0000:03:00.0: BAR 0: assigned [io  0x2010-0x2017]
pci 0000:03:00.0: BAR 2: assigned [io  0x2018-0x201f]
pci 0000:03:00.0: BAR 1: assigned [io  0x2020-0x2023]
pci 0000:03:00.0: BAR 3: assigned [io  0x2024-0x2027]
pci 0000:00:02.0: PCI bridge to [bus 03]
pci 0000:00:02.0:   bridge window [io  0x2000-0x2fff]
pci 0000:00:02.0:   bridge window [mem 0x60400000-0x604fffff]
pci 0000:00:02.0:   bridge window [mem 0x60500000-0x605fffff pref

# cat /proc/ioports
00000000-0000ffff : pcie@1e140000
  00000000-00000fff : PCI Bus 0000:01
    00000000-0000000f : 0000:01:00.0
      00000000-0000000f : ahci
    00000010-00000017 : 0000:01:00.0
      00000010-00000017 : ahci
    00000018-0000001f : 0000:01:00.0
      00000018-0000001f : ahci
    00000020-00000023 : 0000:01:00.0
      00000020-00000023 : ahci
    00000024-00000027 : 0000:01:00.0
      00000024-00000027 : ahci
  00001000-00001fff : PCI Bus 0000:02
    00001000-0000100f : 0000:02:00.0
      00001000-0000100f : ahci
    00001010-00001017 : 0000:02:00.0
      00001010-00001017 : ahci
    00001018-0000101f : 0000:02:00.0
      00001018-0000101f : ahci
    00001020-00001023 : 0000:02:00.0
      00001020-00001023 : ahci
    00001024-00001027 : 0000:02:00.0
      00001024-00001027 : ahci
  00002000-00002fff : PCI Bus 0000:03
    00002000-0000200f : 0000:03:00.0
      00002000-0000200f : ahci
    00002010-00002017 : 0000:03:00.0
      00002010-00002017 : ahci
    00002018-0000201f : 0000:03:00.0
      00002018-0000201f : ahci
    00002020-00002023 : 0000:03:00.0
      00002020-00002023 : ahci
    00002024-00002027 : 0000:03:00.0
      00002024-00002027 : ahci

>
> Do you know where that mapping is set up? Is this a hardware setting,
> or is there a way to change the inbound I/O port ranges to the
> normal mapping?

The only thing related is the IOBASE register which is the base
address for the IO space window. Driver code is setting this up to pci
IO resource start address [0].

>
> > > > Currently, no. But if they were ideally moved to work in the same way
> > > > mt7621 would be the same case. Mt7621 is device tree based PCI host
> > > > bridge driver that uses pci core apis but is still mips since it has
> > > > to properly set IO coherency units which is a mips thing...
> > >
> > > I don't know what those IO coherency units are, but I would think that
> > > if you have to do some extra things on MIPS but not ARM, then those
> > > should be done from the common PCI host bridge code and stubbed out
> > > on architectures that don't need them.
> >
> > Simple definition here [0]. And we must adjust them in the driver here
> > [1]. Mips code, yes, but cannot be done in another way. Is related
> > with address of the memory resource and must be set up. In any other
> > case the pci subsystem won't work. I initially submitted this driver
> > from staging to arch/mips but I was told that even though it is mips
> > code, as it is device tree based and use common pci core apis, it
> > would be better to move it to 'drivers/pci/controller'. But I still
> > need the mips specific code for the iocu thing.
>
> It does sound like this could be reworked into an architecture
> specific helper that is defined for MIPS but just an empty stub
> for everything else. Or you can just have an #ifdef in your
> driver to at least enable compile-testing it on other architecture
> if not completely sharing it with others. This is certainly a less
> important point compared to the I/O port mapping.

Currently, Kconfig is enabling COMPILE_TEST only for MIPS
architecture, but agreed, this is not important yet :).

>
> > > It is possible that the hardware (or bootloader) designers
> > > misunderstood what the window is about, and hardcoded it so
> > > that the port number on the bus is the same as the physical
> > > address as seen from the CPU. If this is the case and you
> > > can't change it to a sane value, you have to put the 1:1
> > > translation into the DT and would actually get the strange
> > > port numbers 0x1e161000-0x1e16100f from that nonzero offset.
> >
> > Yes, and that pci_add_resource_offset() is called inside
> > devm_of_pci_get_host_bridge_resources() after parsing the ranges and
> > storing them as resources.  To calculate that offset passed around,
> > subtracts: res->start - range.pci_addr [2], so looking into my ranges
> > my offset should be zero. And I have added a trace just to confirm and
> > there are zero:
> >
> > mt7621-pci 1e140000.pcie:      MEM 0x0060000000..0x006fffffff -> 0x0060000000
> > OF: IO START returned by pci_address_to_pio: 0x60000000-0x6fffffff
> > PCI: OF: OFFSET -> RES START 0x60000000 - PCI ADDRESS 0x60000000 -> 0x0
> > mt7621-pci 1e140000.pcie:       IO 0x001e160000..0x001e16ffff -> 0x001e160000
> > OF: IO START returned by pci_address_to_pio: 0x1e160000-0x1e16ffff
> > PCI: OF: OFFSET -> RES START 0x1e160000 - PCI ADDRESS 0x1e160000 -> 0x0
> >
> > But if I define PCI_IOBASE I get my I/O range start set to zero but
> > also the offset?? Why this substract is not getting '0x1e160000' as
> > offset here?
> >
> > LOGIC PIO: PIO TO CPUADDR: ADDR: 0x1e160000 -  addr HW_START:
> > 0x1e160000 + RANGE IO: 0x00000000
> > OF: IO START returned by pci_address_to_pio: 0x0-0xffff
> > PCI: OF: OFFSET -> RES START 0x0 - PCI ADDRESS 0x1e160000 -> 0x0
>
> I'm not completely following what each of the numbers in your log refer
> to.  The main thing you still need is to have the hardware set up
> so it matches the ranges property, and the io_offset matching that.
>
> With PCI_IOBASE=0x1e160000, there are two possible ways this
> can work:
>
> a) according to the ranges property you listed above:
>     linux port numbers 0-0xffff, pci port numbers 0x1e160000-0x1e16ffff,
>     io_offset=0x1e160000 (possibly negative that number, I can never
>     keep track)
>
> b) the normal way with the ranges according to my reply above, works only
>     if the hardware mapping window can be reprogrammed that way:
>     linux port numbers 0-0xffff, pci port numbers 0-0xffff,
>     io_offset=0

This is what I currently have in the trace above, right? I also have
the same enumeration trace without any change at all defining
PCI_IOBASE as KSEG1 and without any modification in ranges property:

ranges = <0x02000000 0 0x60000000 0x60000000 0 0x10000000>,  /* pci memory */
               <0x01000000 0 0x1e160000 0x1e160000 0 0x00010000>;  /*
io space */

So it seems apparently changes do not have effect more than ranges
listed in logs or since I don't have a real pci card which uses IO
bars I cannot be sure about what is or not working at the end :(.
Thomas, do you have such a board with IO bars to test this and see
what works at the end??

>
>          Arnd

Thanks for your time.

Best regards,
    Sergio Paracuellos

[0]: https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git/tree/drivers/staging/mt7621-pci/pci-mt7621.c?h=staging-testing#n485
Arnd Bergmann Sept. 24, 2021, 8:53 a.m. UTC | #15
On Thu, Sep 23, 2021 at 10:33 PM Sergio Paracuellos
<sergio.paracuellos@gmail.com> wrote:
> On Thu, Sep 23, 2021 at 7:55 PM Arnd Bergmann <arnd@arndb.de> wrote:
> > On Thu, Sep 23, 2021 at 4:55 PM Sergio Paracuellos
> >
> > Ok, got it. So the memory space uses a normal zero offset with cpu address
> > equal to bus address, but the I/O space has a rather usual mapping of
> > bus address equal to the window into the mmio space, with an offset of
> > 0x1e160000.
> >
> > The normal way to do this would be map the PCI port range 0-0x10000
> > into CPU address 0x1e160000, like:
> >
> > ranges = <0x02000000 0 0x60000000 0x60000000 0 0x10000000>,  /* pci memory */
> >                <0x01000000 0 0x1e160000 0 0 0x00010000>;
>
> This change as it is got into the same behaviour:
>
> mt7621-pci 1e140000.pcie: host bridge /pcie@1e140000 ranges:
> mt7621-pci 1e140000.pcie:   No bus range found for /pcie@1e140000,
> using [bus 00-ff]
> mt7621-pci 1e140000.pcie:      MEM 0x0060000000..0x006fffffff -> 0x0060000000
> mt7621-pci 1e140000.pcie:       IO 0x0000000000..0x000000ffff -> 0x001e160000
>                                                    ^^^^^^
>                                                     This is the only
> change I appreciate in all the trace with the range change.

Oops, my mistake, I mixed up the CPU address and the PCI address.

The correct notation should be

<0x01000000 0 0  0 0x1e160000 0x00010000>;

i.e. bus address 0 to cpu address 0x1e160000, rather than the other
way around as I wrote it.

> pci 0000:00:02.0: PCI bridge to [bus 03-ff]
> pci 0000:00:02.0:   bridge window [io  0x0000-0x0fff]
> pci 0000:00:02.0:   bridge window [mem 0x00000000-0x000fffff]
> pci 0000:00:02.0:   bridge window [mem 0x00000000-0x000fffff pref]
...
> pci 0000:00:02.0: PCI bridge to [bus 03]
> pci 0000:00:02.0:   bridge window [io  0x2000-0x2fff]
> pci 0000:00:02.0:   bridge window [mem 0x60400000-0x604fffff]
> pci 0000:00:02.0:   bridge window [mem 0x60500000-0x605fffff pref
>
> # cat /proc/ioports
> 00000000-0000ffff : pcie@1e140000
>   00000000-00000fff : PCI Bus 0000:01
>     00000000-0000000f : 0000:01:00.0
>       00000000-0000000f : ahci
>     00000010-00000017 : 0000:01:00.0
...

These are all what I would expect, but that should just be
based on the PCI_IOBASE value, not the mapping behind that.

> > Do you know where that mapping is set up? Is this a hardware setting,
> > or is there a way to change the inbound I/O port ranges to the
> > normal mapping?
>
> The only thing related is the IOBASE register which is the base
> address for the IO space window. Driver code is setting this up to pci
> IO resource start address [0].

So this line:
      pcie_write(pcie, entry->res->start, RALINK_PCI_IOBASE);

That does sound like it is the bus address you are writing, not
the CPU address. Some host bridges allow you to configure both,
some only one of them, but the sensible assumption here would
be that this is the bus address if only one is configurable.

If my assumption is correct here, then you must write the value
that you have read from the DT property, which would be
0x001e160000 or 0 in the two versions of the DT property we
have listed, but in theory any (properly aligned) number ought
to work here, as long as the BAR values, the RALINK_PCI_IOBASE
and the io_offset all agree what it is.

The line just before this is

pcie_write(pcie, 0xffffffff, RALINK_PCI_MEMBASE)

Can you clarify what this does? I would have expected an
identity-map for the memory space, which would mean writing
the third cell from the ranges property (0x60000000)
into this. Is -1 a special value for identity mapping, or does
this register do something else?

          Arnd
Sergio Paracuellos Sept. 24, 2021, 10:15 a.m. UTC | #16
Hi Arnd,

On Fri, Sep 24, 2021 at 10:53 AM Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Thu, Sep 23, 2021 at 10:33 PM Sergio Paracuellos
> <sergio.paracuellos@gmail.com> wrote:
> > On Thu, Sep 23, 2021 at 7:55 PM Arnd Bergmann <arnd@arndb.de> wrote:
> > > On Thu, Sep 23, 2021 at 4:55 PM Sergio Paracuellos
> > >
> > > Ok, got it. So the memory space uses a normal zero offset with cpu address
> > > equal to bus address, but the I/O space has a rather usual mapping of
> > > bus address equal to the window into the mmio space, with an offset of
> > > 0x1e160000.
> > >
> > > The normal way to do this would be map the PCI port range 0-0x10000
> > > into CPU address 0x1e160000, like:
> > >
> > > ranges = <0x02000000 0 0x60000000 0x60000000 0 0x10000000>,  /* pci memory */
> > >                <0x01000000 0 0x1e160000 0 0 0x00010000>;
> >
> > This change as it is got into the same behaviour:
> >
> > mt7621-pci 1e140000.pcie: host bridge /pcie@1e140000 ranges:
> > mt7621-pci 1e140000.pcie:   No bus range found for /pcie@1e140000,
> > using [bus 00-ff]
> > mt7621-pci 1e140000.pcie:      MEM 0x0060000000..0x006fffffff -> 0x0060000000
> > mt7621-pci 1e140000.pcie:       IO 0x0000000000..0x000000ffff -> 0x001e160000
> >                                                    ^^^^^^
> >                                                     This is the only
> > change I appreciate in all the trace with the range change.
>
> Oops, my mistake, I mixed up the CPU address and the PCI address.
>
> The correct notation should be
>
> <0x01000000 0 0  0 0x1e160000 0x00010000>;
>
> i.e. bus address 0 to cpu address 0x1e160000, rather than the other
> way around as I wrote it.

Mmmm... Do you mean <0x01000000 0 0  0x1e160000 0 0x00010000>; instead?

In any case both of them ends up in a similar trace (the one listed
here is using your last range as it is):

mt7621-pci 1e140000.pcie: host bridge /pcie@1e140000 ranges:
mt7621-pci 1e140000.pcie:   No bus range found for /pcie@1e140000,
using [bus 00-ff]
mt7621-pci 1e140000.pcie:      MEM 0x0060000000..0x006fffffff -> 0x0060000000
mt7621-pci 1e140000.pcie:       IO 0x0000000000..0x1e1600000000ffff ->
0x0000000000
                                                   ^^^^
                                                    This is the part
that change between both traces (t7621-pci 1e140000.pcie:       IO
0x001e160000..0x001e16ffff -> 0x0000000000 in the one I have just put
above)
mt7621-pci 1e140000.pcie: Writing 0x0 in RALINK_PCI_IOBASE
                                                  ^^^
                                                     This is the
current value written in bridge register as IOBASE for the window
which is IORESOURCE_IO -> res.start address (in both traces).

mt7621-pci 1e140000.pcie: PCIE0 enabled
mt7621-pci 1e140000.pcie: PCIE1 enabled
mt7621-pci 1e140000.pcie: PCIE2 enabled
mt7621-pci 1e140000.pcie: PCI coherence region base: 0x60000000,
mask/settings: 0xf0000002
mt7621-pci 1e140000.pcie: PCI host bridge to bus 0000:00
pci_bus 0000:00: root bus resource [bus 00-ff]
pci_bus 0000:00: root bus resource [mem 0x60000000-0x6fffffff]
pci_bus 0000:00: root bus resource [io  0x0000-0xffff]
pci 0000:00:00.0: [0e8d:0801] type 01 class 0x060400
pci 0000:00:00.0: reg 0x10: [mem 0x00000000-0x7fffffff]
pci 0000:00:00.0: reg 0x14: [mem 0x00000000-0x0000ffff]
pci 0000:00:00.0: supports D1
pci 0000:00:00.0: PME# supported from D0 D1 D3hot
pci 0000:00:01.0: [0e8d:0801] type 01 class 0x060400
pci 0000:00:01.0: reg 0x10: [mem 0x00000000-0x7fffffff]
pci 0000:00:01.0: reg 0x14: [mem 0x00000000-0x0000ffff]
pci 0000:00:01.0: supports D1
pci 0000:00:01.0: PME# supported from D0 D1 D3hot
pci 0000:00:02.0: [0e8d:0801] type 01 class 0x060400
pci 0000:00:02.0: reg 0x10: [mem 0x00000000-0x7fffffff]
pci 0000:00:02.0: reg 0x14: [mem 0x00000000-0x0000ffff]
pci 0000:00:02.0: supports D1
pci 0000:00:02.0: PME# supported from D0 D1 D3hot
pci 0000:00:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring
pci 0000:00:01.0: bridge configuration invalid ([bus 00-00]), reconfiguring
pci 0000:00:02.0: bridge configuration invalid ([bus 00-00]), reconfiguring
pci 0000:01:00.0: [1b21:0611] type 00 class 0x010185
pci 0000:01:00.0: reg 0x10: [io  0x0000-0x0007]
pci 0000:01:00.0: reg 0x14: [io  0x0000-0x0003]
pci 0000:01:00.0: reg 0x18: [io  0x0000-0x0007]
pci 0000:01:00.0: reg 0x1c: [io  0x0000-0x0003]
pci 0000:01:00.0: reg 0x20: [io  0x0000-0x000f]
pci 0000:01:00.0: reg 0x24: [mem 0x00000000-0x000001ff]
pci 0000:01:00.0: 2.000 Gb/s available PCIe bandwidth, limited by 2.5
GT/s PCIe x1 link at 0000:00:00.0 (capable of 4.000 Gb/s with 5.0 GT/s
PCIe x1 link)
pci 0000:00:00.0: PCI bridge to [bus 01-ff]
pci 0000:00:00.0:   bridge window [io  0x0000-0x0fff]
pci 0000:00:00.0:   bridge window [mem 0x00000000-0x000fffff]
pci 0000:00:00.0:   bridge window [mem 0x00000000-0x000fffff pref]
pci_bus 0000:01: busn_res: [bus 01-ff] end is updated to 01
pci 0000:02:00.0: [1b21:0611] type 00 class 0x010185
pci 0000:02:00.0: reg 0x10: [io  0x0000-0x0007]
pci 0000:02:00.0: reg 0x14: [io  0x0000-0x0003]
pci 0000:02:00.0: reg 0x18: [io  0x0000-0x0007]
pci 0000:02:00.0: reg 0x1c: [io  0x0000-0x0003]
pci 0000:02:00.0: reg 0x20: [io  0x0000-0x000f]
pci 0000:02:00.0: reg 0x24: [mem 0x00000000-0x000001ff]
pci 0000:02:00.0: 2.000 Gb/s available PCIe bandwidth, limited by 2.5
GT/s PCIe x1 link at 0000:00:01.0 (capable of 4.000 Gb/s with 5.0 GT/s
PCIe x1 link)
pci 0000:00:01.0: PCI bridge to [bus 02-ff]
pci 0000:00:01.0:   bridge window [io  0x0000-0x0fff]
pci 0000:00:01.0:   bridge window [mem 0x00000000-0x000fffff]
pci 0000:00:01.0:   bridge window [mem 0x00000000-0x000fffff pref]
pci_bus 0000:02: busn_res: [bus 02-ff] end is updated to 02
pci 0000:03:00.0: [1b21:0611] type 00 class 0x010185
pci 0000:03:00.0: reg 0x10: [io  0x0000-0x0007]
pci 0000:03:00.0: reg 0x14: [io  0x0000-0x0003]
pci 0000:03:00.0: reg 0x18: [io  0x0000-0x0007]
pci 0000:03:00.0: reg 0x1c: [io  0x0000-0x0003]
pci 0000:03:00.0: reg 0x20: [io  0x0000-0x000f]
pci 0000:03:00.0: reg 0x24: [mem 0x00000000-0x000001ff]
pci 0000:03:00.0: 2.000 Gb/s available PCIe bandwidth, limited by 2.5
GT/s PCIe x1 link at 0000:00:02.0 (capable of 4.000 Gb/s with 5.0 GT/s
PCIe x1 link)
pci 0000:00:02.0: PCI bridge to [bus 03-ff]
pci 0000:00:02.0:   bridge window [io  0x0000-0x0fff]
pci 0000:00:02.0:   bridge window [mem 0x00000000-0x000fffff]
pci 0000:00:02.0:   bridge window [mem 0x00000000-0x000fffff pref]
pci_bus 0000:03: busn_res: [bus 03-ff] end is updated to 03
pci 0000:00:00.0: BAR 0: no space for [mem size 0x80000000]
pci 0000:00:00.0: BAR 0: failed to assign [mem size 0x80000000]
pci 0000:00:01.0: BAR 0: no space for [mem size 0x80000000]
pci 0000:00:01.0: BAR 0: failed to assign [mem size 0x80000000]
pci 0000:00:02.0: BAR 0: no space for [mem size 0x80000000]
pci 0000:00:02.0: BAR 0: failed to assign [mem size 0x80000000]
pci 0000:00:00.0: BAR 8: assigned [mem 0x60000000-0x600fffff]
pci 0000:00:00.0: BAR 9: assigned [mem 0x60100000-0x601fffff pref]
pci 0000:00:01.0: BAR 8: assigned [mem 0x60200000-0x602fffff]
pci 0000:00:01.0: BAR 9: assigned [mem 0x60300000-0x603fffff pref]
pci 0000:00:02.0: BAR 8: assigned [mem 0x60400000-0x604fffff]
pci 0000:00:02.0: BAR 9: assigned [mem 0x60500000-0x605fffff pref]
pci 0000:00:00.0: BAR 1: assigned [mem 0x60600000-0x6060ffff]
pci 0000:00:01.0: BAR 1: assigned [mem 0x60610000-0x6061ffff]
pci 0000:00:02.0: BAR 1: assigned [mem 0x60620000-0x6062ffff]
pci 0000:00:00.0: BAR 7: assigned [io  0x0000-0x0fff]
pci 0000:00:01.0: BAR 7: assigned [io  0x1000-0x1fff]
pci 0000:00:02.0: BAR 7: assigned [io  0x2000-0x2fff]
pci 0000:01:00.0: BAR 5: assigned [mem 0x60000000-0x600001ff]
pci 0000:01:00.0: BAR 4: assigned [io  0x0000-0x000f]
pci 0000:01:00.0: BAR 0: assigned [io  0x0010-0x0017]
pci 0000:01:00.0: BAR 2: assigned [io  0x0018-0x001f]
pci 0000:01:00.0: BAR 1: assigned [io  0x0020-0x0023]
pci 0000:01:00.0: BAR 3: assigned [io  0x0024-0x0027]
pci 0000:00:00.0: PCI bridge to [bus 01]
pci 0000:00:00.0:   bridge window [io  0x0000-0x0fff]
pci 0000:00:00.0:   bridge window [mem 0x60000000-0x600fffff]
pci 0000:00:00.0:   bridge window [mem 0x60100000-0x601fffff pref]
pci 0000:02:00.0: BAR 5: assigned [mem 0x60200000-0x602001ff]
pci 0000:02:00.0: BAR 4: assigned [io  0x1000-0x100f]
pci 0000:02:00.0: BAR 0: assigned [io  0x1010-0x1017]
pci 0000:02:00.0: BAR 2: assigned [io  0x1018-0x101f]
pci 0000:02:00.0: BAR 1: assigned [io  0x1020-0x1023]
pci 0000:02:00.0: BAR 3: assigned [io  0x1024-0x1027]
pci 0000:00:01.0: PCI bridge to [bus 02]
pci 0000:00:01.0:   bridge window [io  0x1000-0x1fff]
pci 0000:00:01.0:   bridge window [mem 0x60200000-0x602fffff]
pci 0000:00:01.0:   bridge window [mem 0x60300000-0x603fffff pref]
pci 0000:03:00.0: BAR 5: assigned [mem 0x60400000-0x604001ff]
pci 0000:03:00.0: BAR 4: assigned [io  0x2000-0x200f]
pci 0000:03:00.0: BAR 0: assigned [io  0x2010-0x2017]
pci 0000:03:00.0: BAR 2: assigned [io  0x2018-0x201f]
pci 0000:03:00.0: BAR 1: assigned [io  0x2020-0x2023]
pci 0000:03:00.0: BAR 3: assigned [io  0x2024-0x2027]
pci 0000:00:02.0: PCI bridge to [bus 03]
pci 0000:00:02.0:   bridge window [io  0x2000-0x2fff]
pci 0000:00:02.0:   bridge window [mem 0x60400000-0x604fffff]
pci 0000:00:02.0:   bridge window [mem 0x60500000-0x605fffff pref]

Also, I noticed that all of these messages have disappeared from the trace:

pci 0000:03:00.0: reg 0x20: initial BAR value 0x00000000 invalid

and now the first tested value seems to be valid....

No changes in ioports also:

root@gnubee:~# cat /proc/ioports
00000000-0000ffff : pcie@1e140000
  00000000-00000fff : PCI Bus 0000:01
    00000000-0000000f : 0000:01:00.0
      00000000-0000000f : ahci
    00000010-00000017 : 0000:01:00.0
      00000010-00000017 : ahci
    00000018-0000001f : 0000:01:00.0
      00000018-0000001f : ahci
    00000020-00000023 : 0000:01:00.0
      00000020-00000023 : ahci
    00000024-00000027 : 0000:01:00.0
      00000024-00000027 : ahci
  00001000-00001fff : PCI Bus 0000:02
    00001000-0000100f : 0000:02:00.0
      00001000-0000100f : ahci
    00001010-00001017 : 0000:02:00.0
      00001010-00001017 : ahci
    00001018-0000101f : 0000:02:00.0
      00001018-0000101f : ahci
    00001020-00001023 : 0000:02:00.0
      00001020-00001023 : ahci
    00001024-00001027 : 0000:02:00.0
      00001024-00001027 : ahci
  00002000-00002fff : PCI Bus 0000:03
    00002000-0000200f : 0000:03:00.0
      00002000-0000200f : ahci
    00002010-00002017 : 0000:03:00.0
      00002010-00002017 : ahci
    00002018-0000201f : 0000:03:00.0
      00002018-0000201f : ahci
    00002020-00002023 : 0000:03:00.0
      00002020-00002023 : ahci
    00002024-00002027 : 0000:03:00.0
      00002024-00002027 : ahci
root@gnubee:~#


>
> > pci 0000:00:02.0: PCI bridge to [bus 03-ff]
> > pci 0000:00:02.0:   bridge window [io  0x0000-0x0fff]
> > pci 0000:00:02.0:   bridge window [mem 0x00000000-0x000fffff]
> > pci 0000:00:02.0:   bridge window [mem 0x00000000-0x000fffff pref]
> ...
> > pci 0000:00:02.0: PCI bridge to [bus 03]
> > pci 0000:00:02.0:   bridge window [io  0x2000-0x2fff]
> > pci 0000:00:02.0:   bridge window [mem 0x60400000-0x604fffff]
> > pci 0000:00:02.0:   bridge window [mem 0x60500000-0x605fffff pref
> >
> > # cat /proc/ioports
> > 00000000-0000ffff : pcie@1e140000
> >   00000000-00000fff : PCI Bus 0000:01
> >     00000000-0000000f : 0000:01:00.0
> >       00000000-0000000f : ahci
> >     00000010-00000017 : 0000:01:00.0
> ...
>
> These are all what I would expect, but that should just be
> based on the PCI_IOBASE value, not the mapping behind that.
>
> > > Do you know where that mapping is set up? Is this a hardware setting,
> > > or is there a way to change the inbound I/O port ranges to the
> > > normal mapping?
> >
> > The only thing related is the IOBASE register which is the base
> > address for the IO space window. Driver code is setting this up to pci
> > IO resource start address [0].
>
> So this line:
>       pcie_write(pcie, entry->res->start, RALINK_PCI_IOBASE);
>
> That does sound like it is the bus address you are writing, not
> the CPU address. Some host bridges allow you to configure both,
> some only one of them, but the sensible assumption here would
> be that this is the bus address if only one is configurable.
>
> If my assumption is correct here, then you must write the value
> that you have read from the DT property, which would be
> 0x001e160000 or 0 in the two versions of the DT property we
> have listed, but in theory any (properly aligned) number ought
> to work here, as long as the BAR values, the RALINK_PCI_IOBASE
> and the io_offset all agree what it is.

RALINK_PCI_IOBASE in that line, is the bridge register you write with
value from the IO resource created from DT property, yes. So I guess
you meant PCI_IOBASE for ralink in this sentence, right?

(Prototype of the function -> static inline void pcie_write(struct
mt7621_pcie *pcie, u32 val, u32 reg))

>
> The line just before this is
>
> pcie_write(pcie, 0xffffffff, RALINK_PCI_MEMBASE)
>
> Can you clarify what this does? I would have expected an
> identity-map for the memory space, which would mean writing
> the third cell from the ranges property (0x60000000)
> into this. Is -1 a special value for identity mapping, or does
> this register do something else?

That was also my understanding at first when I took this code and
started playing with it since the register is the base address for the
memory space window. Setting the value you pointed out worked for me
(and makes sense), but some people seem to have problems with some
cards when accessing PCI memory resources. That's the only reason I
have maintained the original value but I don't really understand what
it means. The same value is used in mt7620 which has a pretty similar
topology but with one only virtual bridge behind the host-bridge [0].
However there is no documentation for mt7621 PCI subsystem and the one
that exists for mt7620 PCI does not clarify much more [1].

Thanks,
     Sergio Paracuellos
>
>           Arnd

[0]: https://elixir.bootlin.com/linux/latest/source/arch/mips/pci/pci-mt7620.c#L342
[1]: https://usermanual.wiki/Pdf/MT7620ProgrammingGuideE220120815.111503371
Arnd Bergmann Sept. 24, 2021, 11:39 a.m. UTC | #17
On Fri, Sep 24, 2021 at 12:15 PM Sergio Paracuellos
<sergio.paracuellos@gmail.com> wrote:
> On Fri, Sep 24, 2021 at 10:53 AM Arnd Bergmann <arnd@arndb.de> wrote:
> > On Thu, Sep 23, 2021 at 10:33 PM Sergio Paracuellos
> >
> > Oops, my mistake, I mixed up the CPU address and the PCI address.
> >
> > The correct notation should be
> >
> > <0x01000000 0 0  0 0x1e160000 0x00010000>;
> >
> > i.e. bus address 0 to cpu address 0x1e160000, rather than the other
> > way around as I wrote it.
>
> Mmmm... Do you mean <0x01000000 0 0  0x1e160000 0 0x00010000>; instead?

Yes, sorry for getting it wrong again.

> In any case both of them ends up in a similar trace (the one listed
> here is using your last range as it is):
>
> mt7621-pci 1e140000.pcie: host bridge /pcie@1e140000 ranges:
> mt7621-pci 1e140000.pcie:   No bus range found for /pcie@1e140000,
> using [bus 00-ff]
> mt7621-pci 1e140000.pcie:      MEM 0x0060000000..0x006fffffff -> 0x0060000000
> mt7621-pci 1e140000.pcie:       IO 0x0000000000..0x1e1600000000ffff ->
> 0x0000000000
>                                                    ^^^^
>                                                     This is the part
> that change between both traces (t7621-pci 1e140000.pcie:       IO
>
> 0x001e160000..0x001e16ffff -> 0x0000000000 in the one I have just put
> above)

Yes, the latter is the expected output.

> mt7621-pci 1e140000.pcie: Writing 0x0 in RALINK_PCI_IOBASE
>                                                   ^^^
>                                                      This is the
> current value written in bridge register as IOBASE for the window
> which is IORESOURCE_IO -> res.start address (in both traces).

The value is correct, but strictly speaking this must be the
raw value from DT, not the translated Linux port number.

As long as io_offset is zero, the two are the same, but if you were
to use multiple host bridge in the system, or pick a different bus
address in DT, you can have a nonzero io_offset. I think this means

   pcie_write(pcie, entry->res->start, RALINK_PCI_IOBASE);

should become

   pcie_write(pcie, entry->res->start + entry->offset, RALINK_PCI_IOBASE);

Try setting some other value as the bus address in DT and see if that
is what gets written to the register.
(I may have the polarity of offset wrong, so this may need to be '-' instead
of '+').

> Also, I noticed that all of these messages have disappeared from the trace:
>
> pci 0000:03:00.0: reg 0x20: initial BAR value 0x00000000 invalid
>
> and now the first tested value seems to be valid....

This just means that the power-on default value of '0' is now within
the configured range of 0...fffff. It's still slightly odd that it warns about
those in the first place if it is going to reassign everything, but that
should be harmless either way.

> > If my assumption is correct here, then you must write the value
> > that you have read from the DT property, which would be
> > 0x001e160000 or 0 in the two versions of the DT property we
> > have listed, but in theory any (properly aligned) number ought
> > to work here, as long as the BAR values, the RALINK_PCI_IOBASE
> > and the io_offset all agree what it is.
>
> RALINK_PCI_IOBASE in that line, is the bridge register you write with
> value from the IO resource created from DT property, yes. So I guess
> you meant PCI_IOBASE for ralink in this sentence, right?
>
> (Prototype of the function -> static inline void pcie_write(struct
> mt7621_pcie *pcie, u32 val, u32 reg))

I meant RALINK_PCI_IOBASE. We do need to write both, to clarify:

RALINK_PCI_IOBASE must be set to match the *bus* address in DT,
so ideally '0', but any value should work as long as these two match.

PCI_IOBASE/mips_io_port_base must be set to the *CPU* address
in DT, so that must be 0x1e160000, possibly converted from
physical to a virtual __iomem address (this is where my MIPS
knowledge ends).

> > pcie_write(pcie, 0xffffffff, RALINK_PCI_MEMBASE)
> >
> > Can you clarify what this does? I would have expected an
> > identity-map for the memory space, which would mean writing
> > the third cell from the ranges property (0x60000000)
> > into this. Is -1 a special value for identity mapping, or does
> > this register do something else?
>
> That was also my understanding at first when I took this code and
> started playing with it since the register is the base address for the
> memory space window. Setting the value you pointed out worked for me
> (and makes sense), but some people seem to have problems with some
> cards when accessing PCI memory resources. That's the only reason I
> have maintained the original value but I don't really understand what
> it means. The same value is used in mt7620 which has a pretty similar
> topology but with one only virtual bridge behind the host-bridge [0].
> However there is no documentation for mt7621 PCI subsystem and the one
> that exists for mt7620 PCI does not clarify much more [1].

Ok, fair enough.

          Arnd
Sergio Paracuellos Sept. 24, 2021, 12:46 p.m. UTC | #18
On Fri, Sep 24, 2021 at 1:39 PM Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Fri, Sep 24, 2021 at 12:15 PM Sergio Paracuellos
> <sergio.paracuellos@gmail.com> wrote:
> > On Fri, Sep 24, 2021 at 10:53 AM Arnd Bergmann <arnd@arndb.de> wrote:
> > > On Thu, Sep 23, 2021 at 10:33 PM Sergio Paracuellos
> > >
> > > Oops, my mistake, I mixed up the CPU address and the PCI address.
> > >
> > > The correct notation should be
> > >
> > > <0x01000000 0 0  0 0x1e160000 0x00010000>;
> > >
> > > i.e. bus address 0 to cpu address 0x1e160000, rather than the other
> > > way around as I wrote it.
> >
> > Mmmm... Do you mean <0x01000000 0 0  0x1e160000 0 0x00010000>; instead?
>
> Yes, sorry for getting it wrong again.
>
> > In any case both of them ends up in a similar trace (the one listed
> > here is using your last range as it is):
> >
> > mt7621-pci 1e140000.pcie: host bridge /pcie@1e140000 ranges:
> > mt7621-pci 1e140000.pcie:   No bus range found for /pcie@1e140000,
> > using [bus 00-ff]
> > mt7621-pci 1e140000.pcie:      MEM 0x0060000000..0x006fffffff -> 0x0060000000
> > mt7621-pci 1e140000.pcie:       IO 0x0000000000..0x1e1600000000ffff ->
> > 0x0000000000
> >                                                    ^^^^
> >                                                     This is the part
> > that change between both traces (t7621-pci 1e140000.pcie:       IO
> >
> > 0x001e160000..0x001e16ffff -> 0x0000000000 in the one I have just put
> > above)
>
> Yes, the latter is the expected output.

Perfect, thanks.

>
> > mt7621-pci 1e140000.pcie: Writing 0x0 in RALINK_PCI_IOBASE
> >                                                   ^^^
> >                                                      This is the
> > current value written in bridge register as IOBASE for the window
> > which is IORESOURCE_IO -> res.start address (in both traces).
>
> The value is correct, but strictly speaking this must be the
> raw value from DT, not the translated Linux port number.
>
> As long as io_offset is zero, the two are the same, but if you were
> to use multiple host bridge in the system, or pick a different bus
> address in DT, you can have a nonzero io_offset. I think this means
>
>    pcie_write(pcie, entry->res->start, RALINK_PCI_IOBASE);
>
> should become
>
>    pcie_write(pcie, entry->res->start + entry->offset, RALINK_PCI_IOBASE);
>
> Try setting some other value as the bus address in DT and see if that
> is what gets written to the register.
> (I may have the polarity of offset wrong, so this may need to be '-' instead
> of '+').

Ok, polarity is inverted. Correct one is:

pcie_write(pcie, entry->res->start - entry->offset, RALINK_PCI_IOBASE);

>
> > Also, I noticed that all of these messages have disappeared from the trace:
> >
> > pci 0000:03:00.0: reg 0x20: initial BAR value 0x00000000 invalid
> >
> > and now the first tested value seems to be valid....
>
> This just means that the power-on default value of '0' is now within
> the configured range of 0...fffff. It's still slightly odd that it warns about
> those in the first place if it is going to reassign everything, but that
> should be harmless either way.

I see. Thanks for clarification.

>
> > > If my assumption is correct here, then you must write the value
> > > that you have read from the DT property, which would be
> > > 0x001e160000 or 0 in the two versions of the DT property we
> > > have listed, but in theory any (properly aligned) number ought
> > > to work here, as long as the BAR values, the RALINK_PCI_IOBASE
> > > and the io_offset all agree what it is.
> >
> > RALINK_PCI_IOBASE in that line, is the bridge register you write with
> > value from the IO resource created from DT property, yes. So I guess
> > you meant PCI_IOBASE for ralink in this sentence, right?
> >
> > (Prototype of the function -> static inline void pcie_write(struct
> > mt7621_pcie *pcie, u32 val, u32 reg))
>
> I meant RALINK_PCI_IOBASE. We do need to write both, to clarify:
>
> RALINK_PCI_IOBASE must be set to match the *bus* address in DT,
> so ideally '0', but any value should work as long as these two match.
>
> PCI_IOBASE/mips_io_port_base must be set to the *CPU* address
> in DT, so that must be 0x1e160000, possibly converted from
> physical to a virtual __iomem address (this is where my MIPS
> knowledge ends).

Understood. I have tried the following:

I have added the following at the beggining of the pci host driver to
match what you are describing above:

unsigned long vaddr = (unsigned long)ioremap(PCI_IOBASE, 0x10000);
set_io_port_base(vaddr);

dev_info(dev, "Setting base to PCI_IOBASE: 0x%x -> mips_io_port_base
0x%lx", PCI_IOBASE, mips_io_port_base);

PCI_IOBASE is the physical cpu address. Hence, 0x1e160000
set_io_port_base sets 'mips_io_port_base' to the virtual address where
'PCI_IOBASE' has been mapped (vaddr).

However, nothing seems to change:

mt7621-pci 1e140000.pcie: Setting base to PCI_IOBASE: 0x1e160000 ->
mips_io_port_base 0xbe160000
                                                ^^^
                                                 This seems aligned
with what you are saying. mips_io_port_base have now a proper virtual
addr for 0x1e160000
mt7621-pci 1e140000.pcie: host bridge /pcie@1e140000 ranges:
mt7621-pci 1e140000.pcie:   No bus range found for /pcie@1e140000,
using [bus 00-ff]
mt7621-pci 1e140000.pcie:      MEM 0x0060000000..0x006fffffff -> 0x0060000000
mt7621-pci 1e140000.pcie:       IO 0x001e160000..0x001e16ffff -> 0x0000000000
mt7621-pci 1e140000.pcie: Writting 0x0 in RALINK_PCI_IOBASE
                                               ^^^
                                               PCI Bus address is zero
and linux port number too, so this also looks good
mt7621-pci 1e140000.pcie: PCIE0 enabled
mt7621-pci 1e140000.pcie: PCIE1 enabled
mt7621-pci 1e140000.pcie: PCIE2 enabled
mt7621-pci 1e140000.pcie: PCI coherence region base: 0x60000000,
mask/settings: 0xf0000002
mt7621-pci 1e140000.pcie: PCI host bridge to bus 0000:00
pci_bus 0000:00: root bus resource [bus 00-ff]
pci_bus 0000:00: root bus resource [mem 0x60000000-0x6fffffff]
pci_bus 0000:00: root bus resource [io  0x0000-0xffff]
pci 0000:00:00.0: [0e8d:0801] type 01 class 0x060400
pci 0000:00:00.0: reg 0x10: [mem 0x00000000-0x7fffffff]
pci 0000:00:00.0: reg 0x14: [mem 0x00000000-0x0000ffff]
pci 0000:00:00.0: supports D1
pci 0000:00:00.0: PME# supported from D0 D1 D3hot
pci 0000:00:01.0: [0e8d:0801] type 01 class 0x060400
pci 0000:00:01.0: reg 0x10: [mem 0x00000000-0x7fffffff]
pci 0000:00:01.0: reg 0x14: [mem 0x00000000-0x0000ffff]
pci 0000:00:01.0: supports D1
pci 0000:00:01.0: PME# supported from D0 D1 D3hot
pci 0000:00:02.0: [0e8d:0801] type 01 class 0x060400
pci 0000:00:02.0: reg 0x10: [mem 0x00000000-0x7fffffff]
pci 0000:00:02.0: reg 0x14: [mem 0x00000000-0x0000ffff]
pci 0000:00:02.0: supports D1
pci 0000:00:02.0: PME# supported from D0 D1 D3hot
pci 0000:00:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring
pci 0000:00:01.0: bridge configuration invalid ([bus 00-00]), reconfiguring
pci 0000:00:02.0: bridge configuration invalid ([bus 00-00]), reconfiguring
pci 0000:01:00.0: [1b21:0611] type 00 class 0x010185
pci 0000:01:00.0: reg 0x10: [io  0x0000-0x0007]
pci 0000:01:00.0: reg 0x14: [io  0x0000-0x0003]
pci 0000:01:00.0: reg 0x18: [io  0x0000-0x0007]
pci 0000:01:00.0: reg 0x1c: [io  0x0000-0x0003]
pci 0000:01:00.0: reg 0x20: [io  0x0000-0x000f]
pci 0000:01:00.0: reg 0x24: [mem 0x00000000-0x000001ff]
pci 0000:01:00.0: 2.000 Gb/s available PCIe bandwidth, limited by 2.5
GT/s PCIe x1 link at 0000:00:00.0 (capable of 4.000 Gb/s with 5.0 GT/s
PCIe x1 link)
pci 0000:00:00.0: PCI bridge to [bus 01-ff]
pci 0000:00:00.0:   bridge window [io  0x0000-0x0fff]
pci 0000:00:00.0:   bridge window [mem 0x00000000-0x000fffff]
pci 0000:00:00.0:   bridge window [mem 0x00000000-0x000fffff pref]
pci_bus 0000:01: busn_res: [bus 01-ff] end is updated to 01
pci 0000:02:00.0: [1b21:0611] type 00 class 0x010185
pci 0000:02:00.0: reg 0x10: [io  0x0000-0x0007]
pci 0000:02:00.0: reg 0x14: [io  0x0000-0x0003]
pci 0000:02:00.0: reg 0x18: [io  0x0000-0x0007]
pci 0000:02:00.0: reg 0x1c: [io  0x0000-0x0003]
pci 0000:02:00.0: reg 0x20: [io  0x0000-0x000f]
pci 0000:02:00.0: reg 0x24: [mem 0x00000000-0x000001ff]
pci 0000:02:00.0: 2.000 Gb/s available PCIe bandwidth, limited by 2.5
GT/s PCIe x1 link at 0000:00:01.0 (capable of 4.000 Gb/s with 5.0 GT/s
PCIe x1 link)
pci 0000:00:01.0: PCI bridge to [bus 02-ff]
pci 0000:00:01.0:   bridge window [io  0x0000-0x0fff]
pci 0000:00:01.0:   bridge window [mem 0x00000000-0x000fffff]
pci 0000:00:01.0:   bridge window [mem 0x00000000-0x000fffff pref]
pci_bus 0000:02: busn_res: [bus 02-ff] end is updated to 02
pci 0000:03:00.0: [1b21:0611] type 00 class 0x010185
pci 0000:03:00.0: reg 0x10: [io  0x0000-0x0007]
pci 0000:03:00.0: reg 0x14: [io  0x0000-0x0003]
pci 0000:03:00.0: reg 0x18: [io  0x0000-0x0007]
pci 0000:03:00.0: reg 0x1c: [io  0x0000-0x0003]
pci 0000:03:00.0: reg 0x20: [io  0x0000-0x000f]
pci 0000:03:00.0: reg 0x24: [mem 0x00000000-0x000001ff]
pci 0000:03:00.0: 2.000 Gb/s available PCIe bandwidth, limited by 2.5
GT/s PCIe x1 link at 0000:00:02.0 (capable of 4.000 Gb/s with 5.0 GT/s
PCIe x1 link)
pci 0000:00:02.0: PCI bridge to [bus 03-ff]
pci 0000:00:02.0:   bridge window [io  0x0000-0x0fff]
pci 0000:00:02.0:   bridge window [mem 0x00000000-0x000fffff]
pci 0000:00:02.0:   bridge window [mem 0x00000000-0x000fffff pref]
pci_bus 0000:03: busn_res: [bus 03-ff] end is updated to 03
pci 0000:00:00.0: BAR 0: no space for [mem size 0x80000000]
pci 0000:00:00.0: BAR 0: failed to assign [mem size 0x80000000]
pci 0000:00:01.0: BAR 0: no space for [mem size 0x80000000]
pci 0000:00:01.0: BAR 0: failed to assign [mem size 0x80000000]
pci 0000:00:02.0: BAR 0: no space for [mem size 0x80000000]
pci 0000:00:02.0: BAR 0: failed to assign [mem size 0x80000000]
pci 0000:00:00.0: BAR 8: assigned [mem 0x60000000-0x600fffff]
pci 0000:00:00.0: BAR 9: assigned [mem 0x60100000-0x601fffff pref]
pci 0000:00:01.0: BAR 8: assigned [mem 0x60200000-0x602fffff]
pci 0000:00:01.0: BAR 9: assigned [mem 0x60300000-0x603fffff pref]
pci 0000:00:02.0: BAR 8: assigned [mem 0x60400000-0x604fffff]
pci 0000:00:02.0: BAR 9: assigned [mem 0x60500000-0x605fffff pref]
pci 0000:00:00.0: BAR 1: assigned [mem 0x60600000-0x6060ffff]
pci 0000:00:01.0: BAR 1: assigned [mem 0x60610000-0x6061ffff]
pci 0000:00:02.0: BAR 1: assigned [mem 0x60620000-0x6062ffff]
pci 0000:00:00.0: BAR 7: assigned [io  0x0000-0x0fff]
pci 0000:00:01.0: BAR 7: assigned [io  0x1000-0x1fff]
pci 0000:00:02.0: BAR 7: assigned [io  0x2000-0x2fff]
pci 0000:01:00.0: BAR 5: assigned [mem 0x60000000-0x600001ff]
pci 0000:01:00.0: BAR 4: assigned [io  0x0000-0x000f]
pci 0000:01:00.0: BAR 0: assigned [io  0x0010-0x0017]
pci 0000:01:00.0: BAR 2: assigned [io  0x0018-0x001f]
pci 0000:01:00.0: BAR 1: assigned [io  0x0020-0x0023]
pci 0000:01:00.0: BAR 3: assigned [io  0x0024-0x0027]
pci 0000:00:00.0: PCI bridge to [bus 01]
pci 0000:00:00.0:   bridge window [io  0x0000-0x0fff]
pci 0000:00:00.0:   bridge window [mem 0x60000000-0x600fffff]
pci 0000:00:00.0:   bridge window [mem 0x60100000-0x601fffff pref]
pci 0000:02:00.0: BAR 5: assigned [mem 0x60200000-0x602001ff]
pci 0000:02:00.0: BAR 4: assigned [io  0x1000-0x100f]
pci 0000:02:00.0: BAR 0: assigned [io  0x1010-0x1017]
pci 0000:02:00.0: BAR 2: assigned [io  0x1018-0x101f]
pci 0000:02:00.0: BAR 1: assigned [io  0x1020-0x1023]
pci 0000:02:00.0: BAR 3: assigned [io  0x1024-0x1027]
pci 0000:00:01.0: PCI bridge to [bus 02]
pci 0000:00:01.0:   bridge window [io  0x1000-0x1fff]
pci 0000:00:01.0:   bridge window [mem 0x60200000-0x602fffff]
pci 0000:00:01.0:   bridge window [mem 0x60300000-0x603fffff pref]
pci 0000:03:00.0: BAR 5: assigned [mem 0x60400000-0x604001ff]
pci 0000:03:00.0: BAR 4: assigned [io  0x2000-0x200f]
pci 0000:03:00.0: BAR 0: assigned [io  0x2010-0x2017]
pci 0000:03:00.0: BAR 2: assigned [io  0x2018-0x201f]
pci 0000:03:00.0: BAR 1: assigned [io  0x2020-0x2023]
pci 0000:03:00.0: BAR 3: assigned [io  0x2024-0x2027]
pci 0000:00:02.0: PCI bridge to [bus 03]
pci 0000:00:02.0:   bridge window [io  0x2000-0x2fff]
pci 0000:00:02.0:   bridge window [mem 0x60400000-0x604fffff]
pci 0000:00:02.0:   bridge window [mem 0x60500000-0x605fffff pref]

No changes also here:

root@gnubee:~# cat /proc/ioports
00000000-0000ffff : pcie@1e140000
  00000000-00000fff : PCI Bus 0000:01
    00000000-0000000f : 0000:01:00.0
      00000000-0000000f : ahci
    00000010-00000017 : 0000:01:00.0
      00000010-00000017 : ahci
    00000018-0000001f : 0000:01:00.0
      00000018-0000001f : ahci
    00000020-00000023 : 0000:01:00.0
      00000020-00000023 : ahci
    00000024-00000027 : 0000:01:00.0
      00000024-00000027 : ahci
  00001000-00001fff : PCI Bus 0000:02
    00001000-0000100f : 0000:02:00.0
      00001000-0000100f : ahci
    00001010-00001017 : 0000:02:00.0
      00001010-00001017 : ahci
    00001018-0000101f : 0000:02:00.0
      00001018-0000101f : ahci
    00001020-00001023 : 0000:02:00.0
      00001020-00001023 : ahci
    00001024-00001027 : 0000:02:00.0
      00001024-00001027 : ahci
  00002000-00002fff : PCI Bus 0000:03
    00002000-0000200f : 0000:03:00.0
      00002000-0000200f : ahci
    00002010-00002017 : 0000:03:00.0
      00002010-00002017 : ahci
    00002018-0000201f : 0000:03:00.0
      00002018-0000201f : ahci
    00002020-00002023 : 0000:03:00.0
      00002020-00002023 : ahci
    00002024-00002027 : 0000:03:00.0
      00002024-00002027 : ahci

>
> > > pcie_write(pcie, 0xffffffff, RALINK_PCI_MEMBASE)
> > >
> > > Can you clarify what this does? I would have expected an
> > > identity-map for the memory space, which would mean writing
> > > the third cell from the ranges property (0x60000000)
> > > into this. Is -1 a special value for identity mapping, or does
> > > this register do something else?
> >
> > That was also my understanding at first when I took this code and
> > started playing with it since the register is the base address for the
> > memory space window. Setting the value you pointed out worked for me
> > (and makes sense), but some people seem to have problems with some
> > cards when accessing PCI memory resources. That's the only reason I
> > have maintained the original value but I don't really understand what
> > it means. The same value is used in mt7620 which has a pretty similar
> > topology but with one only virtual bridge behind the host-bridge [0].
> > However there is no documentation for mt7621 PCI subsystem and the one
> > that exists for mt7620 PCI does not clarify much more [1].
>
> Ok, fair enough.
>
>           Arnd

Thanks,
     Sergio Paracuellos
Arnd Bergmann Sept. 24, 2021, 1:27 p.m. UTC | #19
On Fri, Sep 24, 2021 at 2:46 PM Sergio Paracuellos
<sergio.paracuellos@gmail.com> wrote:
> On Fri, Sep 24, 2021 at 1:39 PM Arnd Bergmann <arnd@arndb.de> wrote:
> > On Fri, Sep 24, 2021 at 12:15 PM Sergio Paracuellos
>
> > I meant RALINK_PCI_IOBASE. We do need to write both, to clarify:
> >
> > RALINK_PCI_IOBASE must be set to match the *bus* address in DT,
> > so ideally '0', but any value should work as long as these two match.
> >
> > PCI_IOBASE/mips_io_port_base must be set to the *CPU* address
> > in DT, so that must be 0x1e160000, possibly converted from
> > physical to a virtual __iomem address (this is where my MIPS
> > knowledge ends).
>
> Understood. I have tried the following:
>
> I have added the following at the beggining of the pci host driver to
> match what you are describing above:
>
> unsigned long vaddr = (unsigned long)ioremap(PCI_IOBASE, 0x10000);
> set_io_port_base(vaddr);
>
> dev_info(dev, "Setting base to PCI_IOBASE: 0x%x -> mips_io_port_base
> 0x%lx", PCI_IOBASE, mips_io_port_base);
>
> PCI_IOBASE is the physical cpu address. Hence, 0x1e160000
> set_io_port_base sets 'mips_io_port_base' to the virtual address where
> 'PCI_IOBASE' has been mapped (vaddr).

Ok, sounds good. I would still suggest using
"#define PCI_IOBASE mips_io_port_base", so it has the same meaning
as on other architectures (the virtual address of port 0), and replace
the hardcoded base with the CPU address you read from DT to
make that code more portable. As a general rule, DT-enabled drivers
should contain no hardcoded addresses.

> However, nothing seems to change:
>
> mt7621-pci 1e140000.pcie: Setting base to PCI_IOBASE: 0x1e160000 ->
> mips_io_port_base 0xbe160000
>                                                 ^^^
>                                                  This seems aligned
> with what you are saying. mips_io_port_base have now a proper virtual
> addr for 0x1e160000

Ok.

            Arnd
Sergio Paracuellos Sept. 24, 2021, 4:45 p.m. UTC | #20
Hi Arnd,

On Fri, Sep 24, 2021 at 3:28 PM Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Fri, Sep 24, 2021 at 2:46 PM Sergio Paracuellos
> <sergio.paracuellos@gmail.com> wrote:
> > On Fri, Sep 24, 2021 at 1:39 PM Arnd Bergmann <arnd@arndb.de> wrote:
> > > On Fri, Sep 24, 2021 at 12:15 PM Sergio Paracuellos
> >
> > > I meant RALINK_PCI_IOBASE. We do need to write both, to clarify:
> > >
> > > RALINK_PCI_IOBASE must be set to match the *bus* address in DT,
> > > so ideally '0', but any value should work as long as these two match.
> > >
> > > PCI_IOBASE/mips_io_port_base must be set to the *CPU* address
> > > in DT, so that must be 0x1e160000, possibly converted from
> > > physical to a virtual __iomem address (this is where my MIPS
> > > knowledge ends).
> >
> > Understood. I have tried the following:
> >
> > I have added the following at the beggining of the pci host driver to
> > match what you are describing above:
> >
> > unsigned long vaddr = (unsigned long)ioremap(PCI_IOBASE, 0x10000);
> > set_io_port_base(vaddr);
> >
> > dev_info(dev, "Setting base to PCI_IOBASE: 0x%x -> mips_io_port_base
> > 0x%lx", PCI_IOBASE, mips_io_port_base);
> >
> > PCI_IOBASE is the physical cpu address. Hence, 0x1e160000
> > set_io_port_base sets 'mips_io_port_base' to the virtual address where
> > 'PCI_IOBASE' has been mapped (vaddr).
>
> Ok, sounds good. I would still suggest using
> "#define PCI_IOBASE mips_io_port_base", so it has the same meaning
> as on other architectures (the virtual address of port 0), and replace
> the hardcoded base with the CPU address you read from DT to
> make that code more portable. As a general rule, DT-enabled drivers
> should contain no hardcoded addresses.

Yes, it must be cleaned. I was only explaining a possible way to proceed.

So, the changes would be:
1) Reverting already added two commits in staging-tree [0] and [1].
(two revert patches)
2) Setting PCI_IOBASE to 'mips_io_port_base' so the spaces.h become: (one patch)

$ cat arch/mips/include/asm/mach-ralink/spaces.h
/* SPDX-License-Identifier: GPL-2.0 */
#ifndef __ASM_MACH_RALINK_SPACES_H_
#define __ASM_MACH_RALINK_SPACES_H_

#define PCI_IOBASE    mips_io_port_base
#define PCI_IOSIZE    SZ_16M
#define IO_SPACE_LIMIT  (PCI_IOSIZE - 1)

#include <asm/mach-generic/spaces.h>
#endif

3) Change the value written in RALINK_PCI_IOBASE to be sure the value
written takes into account address before linux port translation (one
patch):

pcie_write(pcie, entry->res->start - entry->offset, RALINK_PCI_IOBASE);

4) Virtually Map cpu physical address 0x1e160000 and set
'mips_io_port_base' to that virtual address. Something like the
following (one patch):

static int mt7621_set_io(struct device *dev)
{
    struct device_node *node = dev->of_node;
    struct of_pci_range_parser parser;
    struct of_pci_range range;
    unsigned long vaddr;
    int ret = -EINVAL;

    ret = of_pci_range_parser_init(&parser, node);
    if (ret)
        return ret;

    for_each_of_pci_range(&parser, &range) {
        switch (range.flags & IORESOURCE_TYPE_BITS) {
        case IORESOURCE_IO:
            vaddr = (unsigned long)ioremap(range.cpu_addr, range.size);
            set_io_port_base(vaddr);
            ret = 0;
            break;
        }
    }

    return ret;
}

static int mt7621_pci_probe(struct platform_device *pdev)
{
  ...
    err = mt7621_set_io(dev);
    if (err) {
        dev_err(dev, "error setting io\n");
        return err;
    }
...
    return 0;
}

And now my concerns:
1) We have to read DT range IO values in the driver and those values
will be also parsed by core apis but converting them to linux io
ports. Should this be done by the driver or is there a better way to
abstract this to don't do things twice?
2) 'set_io_port_base()' function does what we want but it is only
mips. We already have the iocu stuff there and the driver is mips
anyway, but it is worth to comment this just in case.

Thoughts?

Thanks in advance for your time.

Best regards,
    Sergio Paracuellos

>
> > However, nothing seems to change:
> >
> > mt7621-pci 1e140000.pcie: Setting base to PCI_IOBASE: 0x1e160000 ->
> > mips_io_port_base 0xbe160000
> >                                                 ^^^
> >                                                  This seems aligned
> > with what you are saying. mips_io_port_base have now a proper virtual
> > addr for 0x1e160000
>
> Ok.
>
>             Arnd
Sergio Paracuellos Sept. 24, 2021, 4:47 p.m. UTC | #21
On Fri, Sep 24, 2021 at 6:45 PM Sergio Paracuellos
<sergio.paracuellos@gmail.com> wrote:
>
> Hi Arnd,
>
> On Fri, Sep 24, 2021 at 3:28 PM Arnd Bergmann <arnd@arndb.de> wrote:
> >
> > On Fri, Sep 24, 2021 at 2:46 PM Sergio Paracuellos
> > <sergio.paracuellos@gmail.com> wrote:
> > > On Fri, Sep 24, 2021 at 1:39 PM Arnd Bergmann <arnd@arndb.de> wrote:
> > > > On Fri, Sep 24, 2021 at 12:15 PM Sergio Paracuellos
> > >
> > > > I meant RALINK_PCI_IOBASE. We do need to write both, to clarify:
> > > >
> > > > RALINK_PCI_IOBASE must be set to match the *bus* address in DT,
> > > > so ideally '0', but any value should work as long as these two match.
> > > >
> > > > PCI_IOBASE/mips_io_port_base must be set to the *CPU* address
> > > > in DT, so that must be 0x1e160000, possibly converted from
> > > > physical to a virtual __iomem address (this is where my MIPS
> > > > knowledge ends).
> > >
> > > Understood. I have tried the following:
> > >
> > > I have added the following at the beggining of the pci host driver to
> > > match what you are describing above:
> > >
> > > unsigned long vaddr = (unsigned long)ioremap(PCI_IOBASE, 0x10000);
> > > set_io_port_base(vaddr);
> > >
> > > dev_info(dev, "Setting base to PCI_IOBASE: 0x%x -> mips_io_port_base
> > > 0x%lx", PCI_IOBASE, mips_io_port_base);
> > >
> > > PCI_IOBASE is the physical cpu address. Hence, 0x1e160000
> > > set_io_port_base sets 'mips_io_port_base' to the virtual address where
> > > 'PCI_IOBASE' has been mapped (vaddr).
> >
> > Ok, sounds good. I would still suggest using
> > "#define PCI_IOBASE mips_io_port_base", so it has the same meaning
> > as on other architectures (the virtual address of port 0), and replace
> > the hardcoded base with the CPU address you read from DT to
> > make that code more portable. As a general rule, DT-enabled drivers
> > should contain no hardcoded addresses.
>
> Yes, it must be cleaned. I was only explaining a possible way to proceed.
>
> So, the changes would be:
> 1) Reverting already added two commits in staging-tree [0] and [1].

Sorry, forgot to add links:

[0]: https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git/commit/?h=staging-testing&id=159697474db41732ef3b6c2e8d9395f09d1f659e
[1]: https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git/commit/?h=staging-testing&id=50fb34eca2944fd67493717c9fbda125336f1655

> (two revert patches)
> 2) Setting PCI_IOBASE to 'mips_io_port_base' so the spaces.h become: (one patch)
>
> $ cat arch/mips/include/asm/mach-ralink/spaces.h
> /* SPDX-License-Identifier: GPL-2.0 */
> #ifndef __ASM_MACH_RALINK_SPACES_H_
> #define __ASM_MACH_RALINK_SPACES_H_
>
> #define PCI_IOBASE    mips_io_port_base
> #define PCI_IOSIZE    SZ_16M
> #define IO_SPACE_LIMIT  (PCI_IOSIZE - 1)
>
> #include <asm/mach-generic/spaces.h>
> #endif
>
> 3) Change the value written in RALINK_PCI_IOBASE to be sure the value
> written takes into account address before linux port translation (one
> patch):
>
> pcie_write(pcie, entry->res->start - entry->offset, RALINK_PCI_IOBASE);
>
> 4) Virtually Map cpu physical address 0x1e160000 and set
> 'mips_io_port_base' to that virtual address. Something like the
> following (one patch):
>
> static int mt7621_set_io(struct device *dev)
> {
>     struct device_node *node = dev->of_node;
>     struct of_pci_range_parser parser;
>     struct of_pci_range range;
>     unsigned long vaddr;
>     int ret = -EINVAL;
>
>     ret = of_pci_range_parser_init(&parser, node);
>     if (ret)
>         return ret;
>
>     for_each_of_pci_range(&parser, &range) {
>         switch (range.flags & IORESOURCE_TYPE_BITS) {
>         case IORESOURCE_IO:
>             vaddr = (unsigned long)ioremap(range.cpu_addr, range.size);
>             set_io_port_base(vaddr);
>             ret = 0;
>             break;
>         }
>     }
>
>     return ret;
> }
>
> static int mt7621_pci_probe(struct platform_device *pdev)
> {
>   ...
>     err = mt7621_set_io(dev);
>     if (err) {
>         dev_err(dev, "error setting io\n");
>         return err;
>     }
> ...
>     return 0;
> }
>
> And now my concerns:
> 1) We have to read DT range IO values in the driver and those values
> will be also parsed by core apis but converting them to linux io
> ports. Should this be done by the driver or is there a better way to
> abstract this to don't do things twice?
> 2) 'set_io_port_base()' function does what we want but it is only
> mips. We already have the iocu stuff there and the driver is mips
> anyway, but it is worth to comment this just in case.
>
> Thoughts?
>
> Thanks in advance for your time.
>
> Best regards,
>     Sergio Paracuellos
>
> >
> > > However, nothing seems to change:
> > >
> > > mt7621-pci 1e140000.pcie: Setting base to PCI_IOBASE: 0x1e160000 ->
> > > mips_io_port_base 0xbe160000
> > >                                                 ^^^
> > >                                                  This seems aligned
> > > with what you are saying. mips_io_port_base have now a proper virtual
> > > addr for 0x1e160000
> >
> > Ok.
> >
> >             Arnd
Arnd Bergmann Sept. 24, 2021, 5:04 p.m. UTC | #22
On Fri, Sep 24, 2021 at 6:45 PM Sergio Paracuellos
<sergio.paracuellos@gmail.com> wrote:
> On Fri, Sep 24, 2021 at 3:28 PM Arnd Bergmann <arnd@arndb.de> wrote:
> > On Fri, Sep 24, 2021 at 2:46 PM Sergio Paracuellos
> >
> > Ok, sounds good. I would still suggest using
> > "#define PCI_IOBASE mips_io_port_base", so it has the same meaning
> > as on other architectures (the virtual address of port 0), and replace
> > the hardcoded base with the CPU address you read from DT to
> > make that code more portable. As a general rule, DT-enabled drivers
> > should contain no hardcoded addresses.
>
> Yes, it must be cleaned. I was only explaining a possible way to proceed.

Ok

> So, the changes would be:
> 1) Reverting already added two commits in staging-tree [0] and [1].
> (two revert patches)
> 2) Setting PCI_IOBASE to 'mips_io_port_base' so the spaces.h become: (one patch)
>
> $ cat arch/mips/include/asm/mach-ralink/spaces.h
> /* SPDX-License-Identifier: GPL-2.0 */
> #ifndef __ASM_MACH_RALINK_SPACES_H_
> #define __ASM_MACH_RALINK_SPACES_H_
>
> #define PCI_IOBASE    mips_io_port_base
> #define PCI_IOSIZE    SZ_16M
> #define IO_SPACE_LIMIT  (PCI_IOSIZE - 1)

As a minor comment, I would make the PCI_IOSIZE only 64KB in this
case, unless plan to support ralink/mediatek SoCs that have a multiple
PCIe domains with distinct 64KB windows.

> 3) Change the value written in RALINK_PCI_IOBASE to be sure the value
> written takes into account address before linux port translation (one
> patch):
>
> pcie_write(pcie, entry->res->start - entry->offset, RALINK_PCI_IOBASE);

ok

> 4) Virtually Map cpu physical address 0x1e160000 and set
> 'mips_io_port_base' to that virtual address. Something like the
> following (one patch):
>
> static int mt7621_set_io(struct device *dev)
> {
>     struct device_node *node = dev->of_node;
>     struct of_pci_range_parser parser;
>     struct of_pci_range range;
>     unsigned long vaddr;
>     int ret = -EINVAL;
>
>     ret = of_pci_range_parser_init(&parser, node);
>     if (ret)
>         return ret;
>
>     for_each_of_pci_range(&parser, &range) {
>         switch (range.flags & IORESOURCE_TYPE_BITS) {
>         case IORESOURCE_IO:
>             vaddr = (unsigned long)ioremap(range.cpu_addr, range.size);
>             set_io_port_base(vaddr);
>             ret = 0;
>             break;
>         }
>     }
>
>     return ret;
> }

This looks like it does the right thing, but conceptually this would belong into
the mips specific pci_remap_iospace() as we discussed a few emails back,
not inside the driver. pci_remap_iospace() does get the CPU address
as an argument, so it just needs to ioremap()/set_io_port_base() it.

> And now my concerns:
> 1) We have to read DT range IO values in the driver and those values
> will be also parsed by core apis but converting them to linux io
> ports. Should this be done by the driver or is there a better way to
> abstract this to don't do things twice?
> 2) 'set_io_port_base()' function does what we want but it is only
> mips. We already have the iocu stuff there and the driver is mips
> anyway, but it is worth to comment this just in case.

I think in both cases the core APIs should do what we need, with the
change to the mips pci_remap_iospace() I mention. If there is anything
missing in the core API that you need, we can discuss extending those,
e.g. to store additional data in the pci_host_bridge structure.

         Arnd
Sergio Paracuellos Sept. 24, 2021, 5:15 p.m. UTC | #23
Hi Arnd,

On Fri, Sep 24, 2021 at 7:04 PM Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Fri, Sep 24, 2021 at 6:45 PM Sergio Paracuellos
> <sergio.paracuellos@gmail.com> wrote:
> > On Fri, Sep 24, 2021 at 3:28 PM Arnd Bergmann <arnd@arndb.de> wrote:
> > > On Fri, Sep 24, 2021 at 2:46 PM Sergio Paracuellos
> > >
> > > Ok, sounds good. I would still suggest using
> > > "#define PCI_IOBASE mips_io_port_base", so it has the same meaning
> > > as on other architectures (the virtual address of port 0), and replace
> > > the hardcoded base with the CPU address you read from DT to
> > > make that code more portable. As a general rule, DT-enabled drivers
> > > should contain no hardcoded addresses.
> >
> > Yes, it must be cleaned. I was only explaining a possible way to proceed.
>
> Ok
>
> > So, the changes would be:
> > 1) Reverting already added two commits in staging-tree [0] and [1].
> > (two revert patches)
> > 2) Setting PCI_IOBASE to 'mips_io_port_base' so the spaces.h become: (one patch)
> >
> > $ cat arch/mips/include/asm/mach-ralink/spaces.h
> > /* SPDX-License-Identifier: GPL-2.0 */
> > #ifndef __ASM_MACH_RALINK_SPACES_H_
> > #define __ASM_MACH_RALINK_SPACES_H_
> >
> > #define PCI_IOBASE    mips_io_port_base
> > #define PCI_IOSIZE    SZ_16M
> > #define IO_SPACE_LIMIT  (PCI_IOSIZE - 1)
>
> As a minor comment, I would make the PCI_IOSIZE only 64KB in this
> case, unless plan to support ralink/mediatek SoCs that have a multiple
> PCIe domains with distinct 64KB windows.

Ok, I will adjust to SZ_64K , then.

>
> > 3) Change the value written in RALINK_PCI_IOBASE to be sure the value
> > written takes into account address before linux port translation (one
> > patch):
> >
> > pcie_write(pcie, entry->res->start - entry->offset, RALINK_PCI_IOBASE);
>
> ok
>
> > 4) Virtually Map cpu physical address 0x1e160000 and set
> > 'mips_io_port_base' to that virtual address. Something like the
> > following (one patch):
> >
> > static int mt7621_set_io(struct device *dev)
> > {
> >     struct device_node *node = dev->of_node;
> >     struct of_pci_range_parser parser;
> >     struct of_pci_range range;
> >     unsigned long vaddr;
> >     int ret = -EINVAL;
> >
> >     ret = of_pci_range_parser_init(&parser, node);
> >     if (ret)
> >         return ret;
> >
> >     for_each_of_pci_range(&parser, &range) {
> >         switch (range.flags & IORESOURCE_TYPE_BITS) {
> >         case IORESOURCE_IO:
> >             vaddr = (unsigned long)ioremap(range.cpu_addr, range.size);
> >             set_io_port_base(vaddr);
> >             ret = 0;
> >             break;
> >         }
> >     }
> >
> >     return ret;
> > }
>
> This looks like it does the right thing, but conceptually this would belong into
> the mips specific pci_remap_iospace() as we discussed a few emails back,
> not inside the driver. pci_remap_iospace() does get the CPU address
> as an argument, so it just needs to ioremap()/set_io_port_base() it.

Ok, this is what I had in mind, but since this change also needs to
add the #ifdef to the whole 'pci_remap_iospace' function of core apis,
just to be sure that would be a valid approach. Thanks for
clarification.

>
> > And now my concerns:
> > 1) We have to read DT range IO values in the driver and those values
> > will be also parsed by core apis but converting them to linux io
> > ports. Should this be done by the driver or is there a better way to
> > abstract this to don't do things twice?
> > 2) 'set_io_port_base()' function does what we want but it is only
> > mips. We already have the iocu stuff there and the driver is mips
> > anyway, but it is worth to comment this just in case.
>
> I think in both cases the core APIs should do what we need, with the
> change to the mips pci_remap_iospace() I mention. If there is anything
> missing in the core API that you need, we can discuss extending those,
> e.g. to store additional data in the pci_host_bridge structure.

Understood. I'll try to test all of these changes during this weekend
and hopefully send these patches during next week.

Thank you very much for your time and support.

Best regards,
    Sergio Paracuellos

>
>          Arnd
diff mbox series

Patch

diff --git a/drivers/pci/of.c b/drivers/pci/of.c
index d84381ce82b5..7d7aab1d1d64 100644
--- a/drivers/pci/of.c
+++ b/drivers/pci/of.c
@@ -564,12 +564,14 @@  static int pci_parse_request_of_pci_ranges(struct device *dev,
 
 		switch (resource_type(res)) {
 		case IORESOURCE_IO:
+#ifdef PCI_IOBASE
 			err = devm_pci_remap_iospace(dev, res, iobase);
 			if (err) {
 				dev_warn(dev, "error %d: failed to map resource %pR\n",
 					 err, res);
 				resource_list_destroy_entry(win);
 			}
+#endif
 			break;
 		case IORESOURCE_MEM:
 			res_valid |= !(res->flags & IORESOURCE_PREFETCH);