Message ID | 1520444223-8389-1-git-send-email-linux@roeck-us.net (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 7 March 2018 at 17:37, Guenter Roeck <linux@roeck-us.net> wrote: > The sabrelite machine model used by qemu-system-arm is based on the > Freescale/NXP i.MX6Q processor. This SoC has an on-board ethernet > controller which is supported in QEMU using the imx_fec.c module > (actually called imx.enet for this model.) > > The include/hw/arm/fsm-imx6.h file defines the interrupt vectors for the > imx.enet device like this: > > #define FSL_IMX6_ENET_MAC_1588_IRQ 118 > #define FSL_IMX6_ENET_MAC_IRQ 119 > > According to https://www.nxp.com/docs/en/reference-manual/IMX6DQRM.pdf, > page 225, in Table 3-1. ARM Cortex A9 domain interrupt summary, > interrupts are as follows. > > 150 ENET MAC 0 IRQ > 151 ENET MAC 0 1588 Timer interrupt > > where > > 150 - 32 == 118 > 151 - 32 == 119 > > In other words, the vector definitions in the fsl-imx6.h file are reversed. > > This results in lost interrupt warnings when running recent (v4.15+) Linux > kernels, and the Ethernet interface will fail to probe. > > Note that applying this patch will cause problems with older Linux kernels: > The Ethernet interface will fail to probe with Linux v4.9 and earlier. > Linux v4.1 and earlier will crash. This is a Linux kernel problem, not a > qemu problem: the Linux kernel only worked by accident since it requested > both interrupts. So do the works-by-accident kernels fail on QEMU because we don't emulate some bit of the ethernet device ? Ideally we could fix that so we could boot newer kernels without breaking the old ones... thanks -- PMM
On 03/08/2018 02:50 AM, Peter Maydell wrote: > On 7 March 2018 at 17:37, Guenter Roeck <linux@roeck-us.net> wrote: >> The sabrelite machine model used by qemu-system-arm is based on the >> Freescale/NXP i.MX6Q processor. This SoC has an on-board ethernet >> controller which is supported in QEMU using the imx_fec.c module >> (actually called imx.enet for this model.) >> >> The include/hw/arm/fsm-imx6.h file defines the interrupt vectors for the >> imx.enet device like this: >> >> #define FSL_IMX6_ENET_MAC_1588_IRQ 118 >> #define FSL_IMX6_ENET_MAC_IRQ 119 >> >> According to https://www.nxp.com/docs/en/reference-manual/IMX6DQRM.pdf, >> page 225, in Table 3-1. ARM Cortex A9 domain interrupt summary, >> interrupts are as follows. >> >> 150 ENET MAC 0 IRQ >> 151 ENET MAC 0 1588 Timer interrupt >> >> where >> >> 150 - 32 == 118 >> 151 - 32 == 119 >> >> In other words, the vector definitions in the fsl-imx6.h file are reversed. >> >> This results in lost interrupt warnings when running recent (v4.15+) Linux >> kernels, and the Ethernet interface will fail to probe. >> >> Note that applying this patch will cause problems with older Linux kernels: >> The Ethernet interface will fail to probe with Linux v4.9 and earlier. >> Linux v4.1 and earlier will crash. This is a Linux kernel problem, not a >> qemu problem: the Linux kernel only worked by accident since it requested >> both interrupts. > > So do the works-by-accident kernels fail on QEMU because > we don't emulate some bit of the ethernet device ? > Ideally we could fix that so we could boot newer kernels > without breaking the old ones... > As mentioned, older versions of Linux assign both interrupt lines instead of one to the Ethernet interrupt handler. After applying this patch, this causes the probe function to fail. This in turn results in 4.1 and earlier to crash, This is fixed by applying upstream commit 32cba57ba7 to those kernels. I asked upstream maintainers to apply this patch to 3.18.y and 4.1.y. The wrong interrupt problem is solved in Linux with upstream commit 4c8777892e80b, as I pointed out in bugzilla. With this patch applied, older versions of Linux (4.9 and older) work with qemu, both with and without this patch. What I don't know is if older versions of Linux still work on real hardware with 4c8777892e80b applied. I don't know if a fix working for all versions of Linux is even possible. Creating both interrupts might be an option, but would likely cause other problems since some versions of Linux would handle the same interrupt twice, while others expect the second interrupt to be associated with the timer. Guenter
On 8 March 2018 at 14:47, Guenter Roeck <linux@roeck-us.net> wrote: > On 03/08/2018 02:50 AM, Peter Maydell wrote: >> So do the works-by-accident kernels fail on QEMU because >> we don't emulate some bit of the ethernet device ? >> Ideally we could fix that so we could boot newer kernels >> without breaking the old ones... > > I don't know if a fix working for all versions of Linux is even possible. > Creating both interrupts might be an option, but would likely cause > other problems since some versions of Linux would handle the same > interrupt twice, while others expect the second interrupt > to be associated with the timer. Did the older Linux kernels work on the real hardware? (I would guess so, but sometimes these things only get tested on emulation...) If so, then in theory "make QEMU work like the hardware" should allow all the kernels to work on QEMU. thanks -- PMM
On 03/08/2018 06:51 AM, Peter Maydell wrote: > On 8 March 2018 at 14:47, Guenter Roeck <linux@roeck-us.net> wrote: >> On 03/08/2018 02:50 AM, Peter Maydell wrote: >>> So do the works-by-accident kernels fail on QEMU because >>> we don't emulate some bit of the ethernet device ? >>> Ideally we could fix that so we could boot newer kernels >>> without breaking the old ones... > >> >> I don't know if a fix working for all versions of Linux is even possible. >> Creating both interrupts might be an option, but would likely cause >> other problems since some versions of Linux would handle the same >> interrupt twice, while others expect the second interrupt >> to be associated with the timer. > > Did the older Linux kernels work on the real hardware? (I > would guess so, but sometimes these things only get tested > on emulation...) If so, then in theory "make QEMU work like > the hardware" should allow all the kernels to work on QEMU. > Good point. I'll assume that this is the case, and have another look. Unfortunately I don't have real hardware, so it is difficult to observe the differences for real. Guenter
On Thu, Mar 08, 2018 at 02:51:04PM +0000, Peter Maydell wrote: > On 8 March 2018 at 14:47, Guenter Roeck <linux@roeck-us.net> wrote: > > On 03/08/2018 02:50 AM, Peter Maydell wrote: > >> So do the works-by-accident kernels fail on QEMU because > >> we don't emulate some bit of the ethernet device ? > >> Ideally we could fix that so we could boot newer kernels > >> without breaking the old ones... > > > > > I don't know if a fix working for all versions of Linux is even possible. > > Creating both interrupts might be an option, but would likely cause > > other problems since some versions of Linux would handle the same > > interrupt twice, while others expect the second interrupt > > to be associated with the timer. > > Did the older Linux kernels work on the real hardware? (I > would guess so, but sometimes these things only get tested > on emulation...) If so, then in theory "make QEMU work like > the hardware" should allow all the kernels to work on QEMU. > Older kernels (4.9 and earlier) register the second interrupt on a gpio pin. Linux 4.1, ToT qemu with this patch applied: 38: 0 gpio-mxc 6 Level 2188000.ethernet 286: 0 GIC 151 Level 2188000.ethernet Linux 4.1, ToT qemu without this patch: 38: 0 gpio-mxc 6 Level 2188000.ethernet 286: 64 GIC 151 Level 2188000.ethernet linux 4.14, ToT qemu with this patch: 64: 64 GIC-0 150 Level 2188000.ethernet 65: 0 GIC-0 151 Level 2188000.ethernet linux 4.14, ToT qemu without this patch: 64: 0 GIC-0 150 Level 2188000.ethernet 65: 64 GIC-0 151 Level 2188000.ethernet Presumably real hardware generates interrupts on the gpio pin. The qemu code specifically states that it won't generate MDIO related interrupts on gpio pins. From hw/net/imx_fec.c: /* * The MII phy could raise a GPIO to the processor which in turn * could be handled as an interrpt by the OS. * For now we don't handle any GPIO/interrupt line, so the OS will * have to poll for the PHY status. */ Linux doesn't poll the phy but expects interrupts to work. I have no idea what is supposed to happen with regular ethernet interrupts, and if those are also mapped to a gpio pin on real hardware. Any idea how to fix this ? Guenter
Of all the gin joints in all the towns in all the world, Guenter Roeck had to walk into mine at 09:12 on Thursday 08 March 2018 and say: > On Thu, Mar 08, 2018 at 02:51:04PM +0000, Peter Maydell wrote: > > On 8 March 2018 at 14:47, Guenter Roeck <linux@roeck-us.net> wrote: > > > On 03/08/2018 02:50 AM, Peter Maydell wrote: > > >> So do the works-by-accident kernels fail on QEMU because > > >> we don't emulate some bit of the ethernet device ? > > >> Ideally we could fix that so we could boot newer kernels > > >> without breaking the old ones... > > > > > > I don't know if a fix working for all versions of Linux is even > > > possible. Creating both interrupts might be an option, but would > > > likely cause other problems since some versions of Linux would handle > > > the same interrupt twice, while others expect the second interrupt > > > to be associated with the timer. > > > > Did the older Linux kernels work on the real hardware? (I > > would guess so, but sometimes these things only get tested > > on emulation...) If so, then in theory "make QEMU work like > > the hardware" should allow all the kernels to work on QEMU. > > Older kernels (4.9 and earlier) register the second interrupt > on a gpio pin. > > Linux 4.1, ToT qemu with this patch applied: > > 38: 0 gpio-mxc 6 Level 2188000.ethernet > 286: 0 GIC 151 Level 2188000.ethernet > > Linux 4.1, ToT qemu without this patch: > > 38: 0 gpio-mxc 6 Level 2188000.ethernet > 286: 64 GIC 151 Level 2188000.ethernet > > linux 4.14, ToT qemu with this patch: > > 64: 64 GIC-0 150 Level 2188000.ethernet > 65: 0 GIC-0 151 Level 2188000.ethernet > > linux 4.14, ToT qemu without this patch: > > 64: 0 GIC-0 150 Level 2188000.ethernet > 65: 64 GIC-0 151 Level 2188000.ethernet > > Presumably real hardware generates interrupts on the gpio pin. > > The qemu code specifically states that it won't generate > MDIO related interrupts on gpio pins. From hw/net/imx_fec.c: > > /* > * The MII phy could raise a GPIO to the processor which in turn > * could be handled as an interrpt by the OS. > * For now we don't handle any GPIO/interrupt line, so the OS will > * have to poll for the PHY status. > */ > > Linux doesn't poll the phy but expects interrupts to work. I have > no idea what is supposed to happen with regular ethernet interrupts, > and if those are also mapped to a gpio pin on real hardware. The i.MX6, like a lot of ARM chips, has an internal pin/signal muxing block. The implementation can vary from one SoC to another. The i.MX6 calls it the IOMUX block. The reason for using the GPIO pin here has to do with erratum 6687 in the i.MX6. The problem is this: the ENET block has several interrupt sources such as RX done, TX done, various errors and wakeup request. The i.MX6 also has two blocks to which interrupts can be routed: the GIC (Generic Interrupt Controller) and GPC (Generic Power Controller). The GPC is only used for power management: you can use it to configure what device interrupts (from internal SoC peripherals) can wake the system from hibernation. The GIC receives all ENET events ORed together via vector 150. But the GPC is only connected to the wakeup request signal. According to the erratum 6687 documentation, this means that a normal RX or TX event would not wake up the system, which is suboptimal. The workaround for this is: - Configure the internal mux block to route the ENET interrupt line to a GPIO pin (as with the GIC, this also gives you all ENET events ORed together) - Unmask the GPIO interrupt in the GPC block instead of the ENET interrupt The erratum description recommends using GPIO6. It also says that this workaround was implemented in Freescale/NXP's Linux BSP for this chip. I don't know if it was also upstreamed to the official Linux kernel. Note that this workaround is applied by software, and is really only needed if you want the ethernet controller to serve as a wakeup event source. As far as I can tell QEMU does not support the IOMUX block, so it would never be possible for a kernel with this workaround in place to get interrupts via the GPIO6 pin. That said, I don't think the i.MX6 ENET driver itself would need to use the GPIO6 interrupt. When we implemented power management support for VxWorks we also used this workaround, but only for the benefit of the GPC module. As far as I recall, the ENET driver still used GIC vector 150. This may be why this behavior was removed in later kernels. Anyway, this means that the only reason older Linux kernels worked in QEMU with the broken interrupt configuration is that they also registered a handler on vector 151 (119). Even though QEMU could not send events via GPIO6, it was mistakenly sending them via vector 151, so it looked like things worked. On real hardware, the older kernels would have gotten their interrupts via GPIO6 and also worked. The older kernels would _not_ work if you fix QEMU because now they would never get interrupts on either vector, unless you fudge things so that the ENET module triggers both vector 150 and the vector for GPIO6 in the GIC or patch them to back out the erratum 6678 workaround as later kernels do. Later kernels that register vectors 150 and 151 would work with both broken and fixed QEMU and real hardware. But you should fix the incorrect interrupt configuration regardless. :) -Bill > Any idea how to fix this ? > > Guenter
On Thu, Mar 08, 2018 at 10:28:39AM -0800, Bill Paul wrote: > Of all the gin joints in all the towns in all the world, Guenter Roeck had to > walk into mine at 09:12 on Thursday 08 March 2018 and say: > > > On Thu, Mar 08, 2018 at 02:51:04PM +0000, Peter Maydell wrote: > > > On 8 March 2018 at 14:47, Guenter Roeck <linux@roeck-us.net> wrote: > > > > On 03/08/2018 02:50 AM, Peter Maydell wrote: > > > >> So do the works-by-accident kernels fail on QEMU because > > > >> we don't emulate some bit of the ethernet device ? > > > >> Ideally we could fix that so we could boot newer kernels > > > >> without breaking the old ones... > > > > > > > > I don't know if a fix working for all versions of Linux is even > > > > possible. Creating both interrupts might be an option, but would > > > > likely cause other problems since some versions of Linux would handle > > > > the same interrupt twice, while others expect the second interrupt > > > > to be associated with the timer. > > > > > > Did the older Linux kernels work on the real hardware? (I > > > would guess so, but sometimes these things only get tested > > > on emulation...) If so, then in theory "make QEMU work like > > > the hardware" should allow all the kernels to work on QEMU. > > > > Older kernels (4.9 and earlier) register the second interrupt > > on a gpio pin. > > > > Linux 4.1, ToT qemu with this patch applied: > > > > 38: 0 gpio-mxc 6 Level 2188000.ethernet > > 286: 0 GIC 151 Level 2188000.ethernet > > > > Linux 4.1, ToT qemu without this patch: > > > > 38: 0 gpio-mxc 6 Level 2188000.ethernet > > 286: 64 GIC 151 Level 2188000.ethernet > > > > linux 4.14, ToT qemu with this patch: > > > > 64: 64 GIC-0 150 Level 2188000.ethernet > > 65: 0 GIC-0 151 Level 2188000.ethernet > > > > linux 4.14, ToT qemu without this patch: > > > > 64: 0 GIC-0 150 Level 2188000.ethernet > > 65: 64 GIC-0 151 Level 2188000.ethernet > > > > Presumably real hardware generates interrupts on the gpio pin. > > > > The qemu code specifically states that it won't generate > > MDIO related interrupts on gpio pins. From hw/net/imx_fec.c: > > > > /* > > * The MII phy could raise a GPIO to the processor which in turn > > * could be handled as an interrpt by the OS. > > * For now we don't handle any GPIO/interrupt line, so the OS will > > * have to poll for the PHY status. > > */ > > > > Linux doesn't poll the phy but expects interrupts to work. I have > > no idea what is supposed to happen with regular ethernet interrupts, > > and if those are also mapped to a gpio pin on real hardware. > > The i.MX6, like a lot of ARM chips, has an internal pin/signal muxing block. > The implementation can vary from one SoC to another. The i.MX6 calls it the > IOMUX block. > > The reason for using the GPIO pin here has to do with erratum 6687 in the > i.MX6. The problem is this: the ENET block has several interrupt sources such > as RX done, TX done, various errors and wakeup request. The i.MX6 also has two > blocks to which interrupts can be routed: the GIC (Generic Interrupt > Controller) and GPC (Generic Power Controller). The GPC is only used for power > management: you can use it to configure what device interrupts (from internal > SoC peripherals) can wake the system from hibernation. > > The GIC receives all ENET events ORed together via vector 150. > > But the GPC is only connected to the wakeup request signal. > > According to the erratum 6687 documentation, this means that a normal RX or TX > event would not wake up the system, which is suboptimal. > > The workaround for this is: > > - Configure the internal mux block to route the ENET interrupt line to a GPIO > pin (as with the GIC, this also gives you all ENET events ORed together) > - Unmask the GPIO interrupt in the GPC block instead of the ENET interrupt > > The erratum description recommends using GPIO6. It also says that this > workaround was implemented in Freescale/NXP's Linux BSP for this chip. I don't > know if it was also upstreamed to the official Linux kernel. > My understanding is that it was upstreamed but later (after 4.9) at least partially removed. > Note that this workaround is applied by software, and is really only needed if > you want the ethernet controller to serve as a wakeup event source. > > As far as I can tell QEMU does not support the IOMUX block, so it would never > be possible for a kernel with this workaround in place to get interrupts via > the GPIO6 pin. > > That said, I don't think the i.MX6 ENET driver itself would need to use the > GPIO6 interrupt. When we implemented power management support for VxWorks we > also used this workaround, but only for the benefit of the GPC module. As far > as I recall, the ENET driver still used GIC vector 150. This may be why this > behavior was removed in later kernels. > > Anyway, this means that the only reason older Linux kernels worked in QEMU > with the broken interrupt configuration is that they also registered a handler > on vector 151 (119). Even though QEMU could not send events via GPIO6, it was > mistakenly sending them via vector 151, so it looked like things worked. On > real hardware, the older kernels would have gotten their interrupts via GPIO6 > and also worked. The older kernels would _not_ work if you fix QEMU because > now they would never get interrupts on either vector, unless you fudge things > so that the ENET module triggers both vector 150 and the vector for GPIO6 in > the GIC or patch them to back out the erratum 6678 workaround as later kernels > do. > Excellent analysis. > Later kernels that register vectors 150 and 151 would work with both broken > and fixed QEMU and real hardware. > ... except in Linux v4.15 and later since those kernels register the second interrupt as a timer interrupt. Because of this, the shipping version of qemu does not work with Linux v4.15+. Thanks, Guenter
On 8 March 2018 at 18:28, Bill Paul <wpaul@windriver.com> wrote: > Anyway, this means that the only reason older Linux kernels worked in QEMU > with the broken interrupt configuration is that they also registered a handler > on vector 151 (119). Even though QEMU could not send events via GPIO6, it was > mistakenly sending them via vector 151, so it looked like things worked. On > real hardware, the older kernels would have gotten their interrupts via GPIO6 > and also worked. The older kernels would _not_ work if you fix QEMU because > now they would never get interrupts on either vector, unless you fudge things > so that the ENET module triggers both vector 150 and the vector for GPIO6 in > the GIC or patch them to back out the erratum 6678 workaround as later kernels > do. Thanks for that really useful writeup. So if I understand correctly we have several choices here: (1) we could implement a model of the IOMUX block that is at least sufficient to support guests that configure it to route the ENET interrupt line to a GPIO pin. Then we could apply this patch that fixes the ENET line definitions. Old kernels would continue to work (for the same reason they worked on hardware), and new ones would work now too. This is in some ways the preferred option, but it's possibly a lot of code and we're nearly in freeze for 2.12. (2) we could leave everything as it is for 2.12. This would mean that at least we don't regress setups that used to work on older QEMU versions. Downside is that we wouldn't be able to run Linux v4.15+, or other guest OSes that don't have the bug that older Linux kernels do. (Presumably we'd only do this on the understanding that we were going to go down route (1) for 2.13.) (3) we could apply this patch for 2.12. Linux v4.15+ now works, as do other guest OSes that use the ENET interrupt. v4.1 and older Linux guests that used to boot in QEMU stop doing so, and 4.2-4.9 boot but lose the ethernet device support. Perhaps for 2.13 we might take route (1) to make those older guests start working again. Do I have that right? None of these options seems especially palatable to me, so we're choosing the lesser evil, I think... (unless somebody wants to say that option (1) would be 20 lines of code and here's the patch :-)) I guess in the absence of (1) that (3) is better than (2) ? [NB: I have just sent a target-arm pull request but that doesn't mean this patch has missed the boat for 2.12, since it's in any case a bug fix.] thanks -- PMM
On Fri, Mar 09, 2018 at 05:47:16PM +0000, Peter Maydell wrote: > On 8 March 2018 at 18:28, Bill Paul <wpaul@windriver.com> wrote: > > Anyway, this means that the only reason older Linux kernels worked in QEMU > > with the broken interrupt configuration is that they also registered a handler > > on vector 151 (119). Even though QEMU could not send events via GPIO6, it was > > mistakenly sending them via vector 151, so it looked like things worked. On > > real hardware, the older kernels would have gotten their interrupts via GPIO6 > > and also worked. The older kernels would _not_ work if you fix QEMU because > > now they would never get interrupts on either vector, unless you fudge things > > so that the ENET module triggers both vector 150 and the vector for GPIO6 in > > the GIC or patch them to back out the erratum 6678 workaround as later kernels > > do. > > Thanks for that really useful writeup. So if I understand correctly > we have several choices here: > > (1) we could implement a model of the IOMUX block that is at least > sufficient to support guests that configure it to route the ENET interrupt > line to a GPIO pin. Then we could apply this patch that fixes the ENET > line definitions. Old kernels would continue to work (for the same > reason they worked on hardware), and new ones would work now too. > This is in some ways the preferred option, but it's possibly a lot > of code and we're nearly in freeze for 2.12. > > (2) we could leave everything as it is for 2.12. This would mean that > at least we don't regress setups that used to work on older QEMU versions. > Downside is that we wouldn't be able to run Linux v4.15+, or other > guest OSes that don't have the bug that older Linux kernels do. > (Presumably we'd only do this on the understanding that we were going > to go down route (1) for 2.13.) > > (3) we could apply this patch for 2.12. Linux v4.15+ now works, as > do other guest OSes that use the ENET interrupt. v4.1 and older Linux > guests that used to boot in QEMU stop doing so, and 4.2-4.9 boot but > lose the ethernet device support. Perhaps for 2.13 we might > take route (1) to make those older guests start working again. > > Do I have that right? > Pretty much. > None of these options seems especially palatable to me, so we're > choosing the lesser evil, I think... (unless somebody wants to say > that option (1) would be 20 lines of code and here's the patch :-)) > I guess in the absence of (1) that (3) is better than (2) ? > I would prefer (2). This is what I decided to use in my "local" version of qemu. Older versions of Linux can be fixed by applying one (4.2..4.9) or two (4.1 and older) upstream patches; anyone interested running those kernels in qemu with Ethernet working should apply those patches (or, alternatively, provide a patch adding IOMUX support to qemu). Thanks, Guenter
On 9 March 2018 at 18:20, Guenter Roeck <linux@roeck-us.net> wrote: > On Fri, Mar 09, 2018 at 05:47:16PM +0000, Peter Maydell wrote: >> Thanks for that really useful writeup. So if I understand correctly >> we have several choices here: >> >> (1) we could implement a model of the IOMUX block that is at least >> sufficient to support guests that configure it to route the ENET interrupt >> line to a GPIO pin. Then we could apply this patch that fixes the ENET >> line definitions. Old kernels would continue to work (for the same >> reason they worked on hardware), and new ones would work now too. >> This is in some ways the preferred option, but it's possibly a lot >> of code and we're nearly in freeze for 2.12. >> >> (2) we could leave everything as it is for 2.12. This would mean that >> at least we don't regress setups that used to work on older QEMU versions. >> Downside is that we wouldn't be able to run Linux v4.15+, or other >> guest OSes that don't have the bug that older Linux kernels do. >> (Presumably we'd only do this on the understanding that we were going >> to go down route (1) for 2.13.) >> >> (3) we could apply this patch for 2.12. Linux v4.15+ now works, as >> do other guest OSes that use the ENET interrupt. v4.1 and older Linux >> guests that used to boot in QEMU stop doing so, and 4.2-4.9 boot but >> lose the ethernet device support. Perhaps for 2.13 we might >> take route (1) to make those older guests start working again. >> >> Do I have that right? >> > Pretty much. > >> None of these options seems especially palatable to me, so we're >> choosing the lesser evil, I think... (unless somebody wants to say >> that option (1) would be 20 lines of code and here's the patch :-)) >> I guess in the absence of (1) that (3) is better than (2) ? >> > > I would prefer (2). This is what I decided to use in my "local" > version of qemu. Older versions of Linux can be fixed by applying one > (4.2..4.9) or two (4.1 and older) upstream patches; anyone interested > running those kernels in qemu with Ethernet working should apply those > patches (or, alternatively, provide a patch adding IOMUX support to > qemu). Did you mean "prefer (3) [apply this patch]" ? The rest of the paragraph makes more sense if you did. thanks -- PMM
Of all the gin joints in all the towns in all the world, Guenter Roeck had to walk into mine at 10:20 on Friday 09 March 2018 and say: > On Fri, Mar 09, 2018 at 05:47:16PM +0000, Peter Maydell wrote: > > On 8 March 2018 at 18:28, Bill Paul <wpaul@windriver.com> wrote: > > > Anyway, this means that the only reason older Linux kernels worked in > > > QEMU with the broken interrupt configuration is that they also > > > registered a handler on vector 151 (119). Even though QEMU could not > > > send events via GPIO6, it was mistakenly sending them via vector 151, > > > so it looked like things worked. On real hardware, the older kernels > > > would have gotten their interrupts via GPIO6 and also worked. The > > > older kernels would _not_ work if you fix QEMU because now they would > > > never get interrupts on either vector, unless you fudge things so that > > > the ENET module triggers both vector 150 and the vector for GPIO6 in > > > the GIC or patch them to back out the erratum 6678 workaround as later > > > kernels do. > > > > Thanks for that really useful writeup. So if I understand correctly > > > > we have several choices here: > > (1) we could implement a model of the IOMUX block that is at least > > sufficient to support guests that configure it to route the ENET > > interrupt line to a GPIO pin. Then we could apply this patch that fixes > > the ENET line definitions. Old kernels would continue to work (for the > > same reason they worked on hardware), and new ones would work now too. > > This is in some ways the preferred option, but it's possibly a lot of > > code and we're nearly in freeze for 2.12. > > > > (2) we could leave everything as it is for 2.12. This would mean that > > at least we don't regress setups that used to work on older QEMU > > versions. Downside is that we wouldn't be able to run Linux v4.15+, or > > other guest OSes that don't have the bug that older Linux kernels do. > > (Presumably we'd only do this on the understanding that we were going > > to go down route (1) for 2.13.) > > > > (3) we could apply this patch for 2.12. Linux v4.15+ now works, as > > do other guest OSes that use the ENET interrupt. v4.1 and older Linux > > guests that used to boot in QEMU stop doing so, and 4.2-4.9 boot but > > lose the ethernet device support. Perhaps for 2.13 we might > > take route (1) to make those older guests start working again. > > > > Do I have that right? > > Pretty much. There may be a 4th option. Since older kernels work because they were looking at vector 151, you could patch the imx_fec.c module so that when it signals an event for vector 150, it also signals vector 151 too. This would keep newer versions of QEMU "bug- compatible" with older versions while also fixing support for newer Linux kernels and other guests (e.g. VxWorks) that follow the hardware spec correctly. I think this would require only a small modification to this function: static void imx_eth_update(IMXFECState *s) { if (s->regs[ENET_EIR] & s->regs[ENET_EIMR] & ENET_INT_TS_TIMER) { qemu_set_irq(s->irq[1], 1); } else { qemu_set_irq(s->irq[1], 0); } if (s->regs[ENET_EIR] & s->regs[ENET_EIMR] & ENET_INT_MAC) { qemu_set_irq(s->irq[0], 1); } else { qemu_set_irq(s->irq[0], 0); } } (For ENET_INT_MAC events, just signal both s->irq[0] and s->irq[1]). This of course means signaling spurious events on vector 151, but you're doing that now anyway. :) This is much less work than implementing an IOMUX module. Creating an IOMUX module for QEMU just for this one fixup would probably be the most elegant solution, but adding IOMUX support to QEMU also doesn't make much sense since QEMU has no actual pins. -Bill > > None of these options seems especially palatable to me, so we're > > choosing the lesser evil, I think... (unless somebody wants to say > > that option (1) would be 20 lines of code and here's the patch :-)) > > I guess in the absence of (1) that (3) is better than (2) ? > > I would prefer (2). This is what I decided to use in my "local" > version of qemu. Older versions of Linux can be fixed by applying one > (4.2..4.9) or two (4.1 and older) upstream patches; anyone interested > running those kernels in qemu with Ethernet working should apply those > patches (or, alternatively, provide a patch adding IOMUX support to > qemu). > > Thanks, > Guenter
Of all the gin joints in all the towns in all the world, Bill Paul had to walk into mine at 10:53 on Friday 09 March 2018 and say: > Of all the gin joints in all the towns in all the world, Guenter Roeck had > to > > walk into mine at 10:20 on Friday 09 March 2018 and say: > > On Fri, Mar 09, 2018 at 05:47:16PM +0000, Peter Maydell wrote: > > > On 8 March 2018 at 18:28, Bill Paul <wpaul@windriver.com> wrote: > > > > Anyway, this means that the only reason older Linux kernels worked in > > > > QEMU with the broken interrupt configuration is that they also > > > > registered a handler on vector 151 (119). Even though QEMU could not > > > > send events via GPIO6, it was mistakenly sending them via vector 151, > > > > so it looked like things worked. On real hardware, the older kernels > > > > would have gotten their interrupts via GPIO6 and also worked. The > > > > older kernels would _not_ work if you fix QEMU because now they would > > > > never get interrupts on either vector, unless you fudge things so > > > > that the ENET module triggers both vector 150 and the vector for > > > > GPIO6 in the GIC or patch them to back out the erratum 6678 > > > > workaround as later kernels do. > > > > > > Thanks for that really useful writeup. So if I understand correctly > > > > > > we have several choices here: > > > (1) we could implement a model of the IOMUX block that is at least > > > sufficient to support guests that configure it to route the ENET > > > interrupt line to a GPIO pin. Then we could apply this patch that > > > fixes the ENET line definitions. Old kernels would continue to work > > > (for the same reason they worked on hardware), and new ones would > > > work now too. This is in some ways the preferred option, but it's > > > possibly a lot of code and we're nearly in freeze for 2.12. > > > > > > (2) we could leave everything as it is for 2.12. This would mean that > > > at least we don't regress setups that used to work on older QEMU > > > versions. Downside is that we wouldn't be able to run Linux v4.15+, or > > > other guest OSes that don't have the bug that older Linux kernels do. > > > (Presumably we'd only do this on the understanding that we were going > > > to go down route (1) for 2.13.) > > > > > > (3) we could apply this patch for 2.12. Linux v4.15+ now works, as > > > do other guest OSes that use the ENET interrupt. v4.1 and older Linux > > > guests that used to boot in QEMU stop doing so, and 4.2-4.9 boot but > > > lose the ethernet device support. Perhaps for 2.13 we might > > > take route (1) to make those older guests start working again. > > > > > > Do I have that right? > > > > Pretty much. > > There may be a 4th option. > > Since older kernels work because they were looking at vector 151, you could > patch the imx_fec.c module so that when it signals an event for vector 150, > it also signals vector 151 too. This would keep newer versions of QEMU > "bug- compatible" with older versions while also fixing support for newer > Linux kernels and other guests (e.g. VxWorks) that follow the hardware > spec correctly. Oops, just to be clear: I mean that the vector definitions should be fixed and this backward-compatibility change should be applied at the same time. -Bill > I think this would require only a small modification to this function: > > static void imx_eth_update(IMXFECState *s) > { > if (s->regs[ENET_EIR] & s->regs[ENET_EIMR] & ENET_INT_TS_TIMER) { > qemu_set_irq(s->irq[1], 1); > } else { > qemu_set_irq(s->irq[1], 0); > } > > if (s->regs[ENET_EIR] & s->regs[ENET_EIMR] & ENET_INT_MAC) { > qemu_set_irq(s->irq[0], 1); > } else { > qemu_set_irq(s->irq[0], 0); > } > } > > (For ENET_INT_MAC events, just signal both s->irq[0] and s->irq[1]). > > This of course means signaling spurious events on vector 151, but you're > doing that now anyway. :) > > This is much less work than implementing an IOMUX module. Creating an IOMUX > module for QEMU just for this one fixup would probably be the most elegant > solution, but adding IOMUX support to QEMU also doesn't make much sense > since QEMU has no actual pins. > > -Bill > > > > None of these options seems especially palatable to me, so we're > > > choosing the lesser evil, I think... (unless somebody wants to say > > > that option (1) would be 20 lines of code and here's the patch :-)) > > > I guess in the absence of (1) that (3) is better than (2) ? > > > > I would prefer (2). This is what I decided to use in my "local" > > version of qemu. Older versions of Linux can be fixed by applying one > > (4.2..4.9) or two (4.1 and older) upstream patches; anyone interested > > running those kernels in qemu with Ethernet working should apply those > > patches (or, alternatively, provide a patch adding IOMUX support to > > qemu). > > > > Thanks, > > Guenter
On Fri, Mar 09, 2018 at 06:48:43PM +0000, Peter Maydell wrote: > On 9 March 2018 at 18:20, Guenter Roeck <linux@roeck-us.net> wrote: > > On Fri, Mar 09, 2018 at 05:47:16PM +0000, Peter Maydell wrote: > >> Thanks for that really useful writeup. So if I understand correctly > >> we have several choices here: > >> > >> (1) we could implement a model of the IOMUX block that is at least > >> sufficient to support guests that configure it to route the ENET interrupt > >> line to a GPIO pin. Then we could apply this patch that fixes the ENET > >> line definitions. Old kernels would continue to work (for the same > >> reason they worked on hardware), and new ones would work now too. > >> This is in some ways the preferred option, but it's possibly a lot > >> of code and we're nearly in freeze for 2.12. > >> > >> (2) we could leave everything as it is for 2.12. This would mean that > >> at least we don't regress setups that used to work on older QEMU versions. > >> Downside is that we wouldn't be able to run Linux v4.15+, or other > >> guest OSes that don't have the bug that older Linux kernels do. > >> (Presumably we'd only do this on the understanding that we were going > >> to go down route (1) for 2.13.) > >> > >> (3) we could apply this patch for 2.12. Linux v4.15+ now works, as > >> do other guest OSes that use the ENET interrupt. v4.1 and older Linux > >> guests that used to boot in QEMU stop doing so, and 4.2-4.9 boot but > >> lose the ethernet device support. Perhaps for 2.13 we might > >> take route (1) to make those older guests start working again. > >> > >> Do I have that right? > >> > > Pretty much. > > > >> None of these options seems especially palatable to me, so we're > >> choosing the lesser evil, I think... (unless somebody wants to say > >> that option (1) would be 20 lines of code and here's the patch :-)) > >> I guess in the absence of (1) that (3) is better than (2) ? > >> > > > > I would prefer (2). This is what I decided to use in my "local" > > version of qemu. Older versions of Linux can be fixed by applying one > > (4.2..4.9) or two (4.1 and older) upstream patches; anyone interested > > running those kernels in qemu with Ethernet working should apply those > > patches (or, alternatively, provide a patch adding IOMUX support to > > qemu). > > Did you mean "prefer (3) [apply this patch]" ? The rest of the paragraph > makes more sense if you did. > Yes, sorry. Guenter
On Fri, Mar 09, 2018 at 10:53:43AM -0800, Bill Paul wrote: > Of all the gin joints in all the towns in all the world, Guenter Roeck had to > walk into mine at 10:20 on Friday 09 March 2018 and say: > > > On Fri, Mar 09, 2018 at 05:47:16PM +0000, Peter Maydell wrote: > > > On 8 March 2018 at 18:28, Bill Paul <wpaul@windriver.com> wrote: > > > > Anyway, this means that the only reason older Linux kernels worked in > > > > QEMU with the broken interrupt configuration is that they also > > > > registered a handler on vector 151 (119). Even though QEMU could not > > > > send events via GPIO6, it was mistakenly sending them via vector 151, > > > > so it looked like things worked. On real hardware, the older kernels > > > > would have gotten their interrupts via GPIO6 and also worked. The > > > > older kernels would _not_ work if you fix QEMU because now they would > > > > never get interrupts on either vector, unless you fudge things so that > > > > the ENET module triggers both vector 150 and the vector for GPIO6 in > > > > the GIC or patch them to back out the erratum 6678 workaround as later > > > > kernels do. > > > > > > Thanks for that really useful writeup. So if I understand correctly > > > > > > we have several choices here: > > > (1) we could implement a model of the IOMUX block that is at least > > > sufficient to support guests that configure it to route the ENET > > > interrupt line to a GPIO pin. Then we could apply this patch that fixes > > > the ENET line definitions. Old kernels would continue to work (for the > > > same reason they worked on hardware), and new ones would work now too. > > > This is in some ways the preferred option, but it's possibly a lot of > > > code and we're nearly in freeze for 2.12. > > > > > > (2) we could leave everything as it is for 2.12. This would mean that > > > at least we don't regress setups that used to work on older QEMU > > > versions. Downside is that we wouldn't be able to run Linux v4.15+, or > > > other guest OSes that don't have the bug that older Linux kernels do. > > > (Presumably we'd only do this on the understanding that we were going > > > to go down route (1) for 2.13.) > > > > > > (3) we could apply this patch for 2.12. Linux v4.15+ now works, as > > > do other guest OSes that use the ENET interrupt. v4.1 and older Linux > > > guests that used to boot in QEMU stop doing so, and 4.2-4.9 boot but > > > lose the ethernet device support. Perhaps for 2.13 we might > > > take route (1) to make those older guests start working again. > > > > > > Do I have that right? > > > > Pretty much. > > There may be a 4th option. > > Since older kernels work because they were looking at vector 151, you could > patch the imx_fec.c module so that when it signals an event for vector 150, it > also signals vector 151 too. This would keep newer versions of QEMU "bug- > compatible" with older versions while also fixing support for newer Linux > kernels and other guests (e.g. VxWorks) that follow the hardware spec > correctly. > > I think this would require only a small modification to this function: > > static void imx_eth_update(IMXFECState *s) > { > if (s->regs[ENET_EIR] & s->regs[ENET_EIMR] & ENET_INT_TS_TIMER) { > qemu_set_irq(s->irq[1], 1); > } else { > qemu_set_irq(s->irq[1], 0); > } > > if (s->regs[ENET_EIR] & s->regs[ENET_EIMR] & ENET_INT_MAC) { > qemu_set_irq(s->irq[0], 1); > } else { > qemu_set_irq(s->irq[0], 0); > } > } > > (For ENET_INT_MAC events, just signal both s->irq[0] and s->irq[1]). > Now this is an excellent idea. > This of course means signaling spurious events on vector 151, but you're doing > that now anyway. :) > Actually, it doesn't. It looks like the first interrupt is handled, resetting the interrupt status, and the second interrupt is never even executed. I tested this with all kernel releases back to v3.16. I'll send out patch v2 with this change and let Peter decide which version to apply. Thanks, Guenter
Of all the gin joints in all the towns in all the world, Guenter Roeck had to walk into mine at 13:38 on Friday 09 March 2018 and say: [...] > > > > Do I have that right? > > > > > > Pretty much. > > > > There may be a 4th option. > > > > Since older kernels work because they were looking at vector 151, you > > could patch the imx_fec.c module so that when it signals an event for > > vector 150, it also signals vector 151 too. This would keep newer > > versions of QEMU "bug- compatible" with older versions while also fixing > > support for newer Linux kernels and other guests (e.g. VxWorks) that > > follow the hardware spec correctly. > > > > I think this would require only a small modification to this function: > > > > static void imx_eth_update(IMXFECState *s) > > { > > > > if (s->regs[ENET_EIR] & s->regs[ENET_EIMR] & ENET_INT_TS_TIMER) { > > > > qemu_set_irq(s->irq[1], 1); > > > > } else { > > > > qemu_set_irq(s->irq[1], 0); > > > > } > > > > if (s->regs[ENET_EIR] & s->regs[ENET_EIMR] & ENET_INT_MAC) { > > > > qemu_set_irq(s->irq[0], 1); > > > > } else { > > > > qemu_set_irq(s->irq[0], 0); > > > > } > > > > } > > > > (For ENET_INT_MAC events, just signal both s->irq[0] and s->irq[1]). > > Now this is an excellent idea. > > > This of course means signaling spurious events on vector 151, but you're > > doing that now anyway. :) > > Actually, it doesn't. It looks like the first interrupt is handled, > resetting the interrupt status, and the second interrupt is never even > executed. I tested this with all kernel releases back to v3.16. I just did a quick test with your patch and I can confirm that VxWorks 7 also works correctly. -Bill > I'll send out patch v2 with this change and let Peter decide which version > to apply. > > Thanks, > Guenter
diff --git a/include/hw/arm/fsl-imx6.h b/include/hw/arm/fsl-imx6.h index ec6c509..06f8aae 100644 --- a/include/hw/arm/fsl-imx6.h +++ b/include/hw/arm/fsl-imx6.h @@ -438,8 +438,8 @@ typedef struct FslIMX6State { #define FSL_IMX6_HDMI_MASTER_IRQ 115 #define FSL_IMX6_HDMI_CEC_IRQ 116 #define FSL_IMX6_MLB150_LOW_IRQ 117 -#define FSL_IMX6_ENET_MAC_1588_IRQ 118 -#define FSL_IMX6_ENET_MAC_IRQ 119 +#define FSL_IMX6_ENET_MAC_IRQ 118 +#define FSL_IMX6_ENET_MAC_1588_IRQ 119 #define FSL_IMX6_PCIE1_IRQ 120 #define FSL_IMX6_PCIE2_IRQ 121 #define FSL_IMX6_PCIE3_IRQ 122
The sabrelite machine model used by qemu-system-arm is based on the Freescale/NXP i.MX6Q processor. This SoC has an on-board ethernet controller which is supported in QEMU using the imx_fec.c module (actually called imx.enet for this model.) The include/hw/arm/fsm-imx6.h file defines the interrupt vectors for the imx.enet device like this: #define FSL_IMX6_ENET_MAC_1588_IRQ 118 #define FSL_IMX6_ENET_MAC_IRQ 119 According to https://www.nxp.com/docs/en/reference-manual/IMX6DQRM.pdf, page 225, in Table 3-1. ARM Cortex A9 domain interrupt summary, interrupts are as follows. 150 ENET MAC 0 IRQ 151 ENET MAC 0 1588 Timer interrupt where 150 - 32 == 118 151 - 32 == 119 In other words, the vector definitions in the fsl-imx6.h file are reversed. This results in lost interrupt warnings when running recent (v4.15+) Linux kernels, and the Ethernet interface will fail to probe. Note that applying this patch will cause problems with older Linux kernels: The Ethernet interface will fail to probe with Linux v4.9 and earlier. Linux v4.1 and earlier will crash. This is a Linux kernel problem, not a qemu problem: the Linux kernel only worked by accident since it requested both interrupts. Link: https://bugs.launchpad.net/qemu/+bug/1753309 Signed-off-by: Guenter Roeck <linux@roeck-us.net> --- include/hw/arm/fsl-imx6.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)