Message ID | 20171106154813.19936-4-andrew.smirnov@gmail.com (mailing list archive) |
---|---|
State | New, archived |
On 6 November 2017 at 15:47, Andrey Smirnov <andrew.smirnov@gmail.com> wrote:
> In the current implementation, packet queue flushing logic seems to
> suffer from a deadlock-like scenario if a packet is received by the
> interface before the Rx ring is initialized by the Guest's driver.
> Consider the following sequence of events:
>
>     1. A QEMU instance is started against a TAP device on a Linux
>        host, running a Linux guest, e.g., something to the effect
>        of:
>
>        qemu-system-arm \
>           -net nic,model=imx.fec,netdev=lan0 \
>           -netdev tap,id=lan0,ifname=tap0,script=no,downscript=no \
>           ... rest of the arguments ...
>
>     2. Once QEMU starts, but before the guest reaches the point where
>        the FEC driver is done initializing the HW, the Guest, via the
>        TAP interface, receives a number of multicast mDNS packets from
>        the Host (not necessarily true for every OS, but it happens at
>        least on Fedora 25)
>
>     3. Receiving a packet in such a state results in
>        imx_eth_can_receive() returning '0', which in turn causes
>        tap_send() to disable the corresponding event (tap.c:203)
>
>     4. Once the Guest's driver reaches the point where it is ready to
>        receive packets, it prepares Rx ring descriptors and writes
>        ENET_RDAR_RDAR to the ENET_RDAR register to indicate to the HW
>        that more descriptors are ready. At this point the emulation
>        layer does this:
>
>                  s->regs[index] = ENET_RDAR_RDAR;
>                  imx_eth_enable_rx(s);
>
>        which, combined with:
>
>                   if (!s->regs[ENET_RDAR]) {
>                      qemu_flush_queued_packets(qemu_get_queue(s->nic));
>                   }
>
>        results in the Rx queue never being flushed and the
>        corresponding I/O event being disabled.
>
> To prevent the problem, change the code to always flush the packet
> queue when ENET_RDAR transitions 0 -> ENET_RDAR_RDAR.
>
> Cc: Peter Maydell <peter.maydell@linaro.org>
> Cc: Jason Wang <jasowang@redhat.com>
> Cc: Philippe Mathieu-Daudé <f4bug@amsat.org>
> Cc: qemu-devel@nongnu.org
> Cc: qemu-arm@nongnu.org
> Cc: yurovsky@gmail.com
> Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>

> diff --git a/include/hw/net/imx_fec.h b/include/hw/net/imx_fec.h
> index 62ad473b05..4bc8f03ec2 100644
> --- a/include/hw/net/imx_fec.h
> +++ b/include/hw/net/imx_fec.h
> @@ -252,6 +252,7 @@ typedef struct IMXFECState {
>      uint32_t phy_int_mask;
>
>      bool is_fec;
> +    bool needs_flush;
>  } IMXFECState;

This field isn't needed any more in this version of the patch, I think?

otherwise
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM
On Tue, Nov 21, 2017 at 9:27 AM, Peter Maydell <peter.maydell@linaro.org> wrote:
> On 6 November 2017 at 15:47, Andrey Smirnov <andrew.smirnov@gmail.com> wrote:
>> In the current implementation, packet queue flushing logic seems to
>> suffer from a deadlock-like scenario if a packet is received by the
>> interface before the Rx ring is initialized by the Guest's driver.
>> Consider the following sequence of events:
>>
>>     1. A QEMU instance is started against a TAP device on a Linux
>>        host, running a Linux guest, e.g., something to the effect
>>        of:
>>
>>        qemu-system-arm \
>>           -net nic,model=imx.fec,netdev=lan0 \
>>           -netdev tap,id=lan0,ifname=tap0,script=no,downscript=no \
>>           ... rest of the arguments ...
>>
>>     2. Once QEMU starts, but before the guest reaches the point where
>>        the FEC driver is done initializing the HW, the Guest, via the
>>        TAP interface, receives a number of multicast mDNS packets from
>>        the Host (not necessarily true for every OS, but it happens at
>>        least on Fedora 25)
>>
>>     3. Receiving a packet in such a state results in
>>        imx_eth_can_receive() returning '0', which in turn causes
>>        tap_send() to disable the corresponding event (tap.c:203)
>>
>>     4. Once the Guest's driver reaches the point where it is ready to
>>        receive packets, it prepares Rx ring descriptors and writes
>>        ENET_RDAR_RDAR to the ENET_RDAR register to indicate to the HW
>>        that more descriptors are ready. At this point the emulation
>>        layer does this:
>>
>>                  s->regs[index] = ENET_RDAR_RDAR;
>>                  imx_eth_enable_rx(s);
>>
>>        which, combined with:
>>
>>                   if (!s->regs[ENET_RDAR]) {
>>                      qemu_flush_queued_packets(qemu_get_queue(s->nic));
>>                   }
>>
>>        results in the Rx queue never being flushed and the
>>        corresponding I/O event being disabled.
>>
>> To prevent the problem, change the code to always flush the packet
>> queue when ENET_RDAR transitions 0 -> ENET_RDAR_RDAR.
>>
>> Cc: Peter Maydell <peter.maydell@linaro.org>
>> Cc: Jason Wang <jasowang@redhat.com>
>> Cc: Philippe Mathieu-Daudé <f4bug@amsat.org>
>> Cc: qemu-devel@nongnu.org
>> Cc: qemu-arm@nongnu.org
>> Cc: yurovsky@gmail.com
>> Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
>
>> diff --git a/include/hw/net/imx_fec.h b/include/hw/net/imx_fec.h
>> index 62ad473b05..4bc8f03ec2 100644
>> --- a/include/hw/net/imx_fec.h
>> +++ b/include/hw/net/imx_fec.h
>> @@ -252,6 +252,7 @@ typedef struct IMXFECState {
>>      uint32_t phy_int_mask;
>>
>>      bool is_fec;
>> +    bool needs_flush;
>>  } IMXFECState;
>
> This field isn't needed any more in this version of the patch, I think?
>

Yeah, my bad, forgot to remove this part. Will do in v4.

Thanks,
Andrey Smirnov
diff --git a/hw/net/imx_fec.c b/hw/net/imx_fec.c
index 8b2e4b8ffe..eb034ffd0c 100644
--- a/hw/net/imx_fec.c
+++ b/hw/net/imx_fec.c
@@ -533,7 +533,7 @@ static void imx_eth_do_tx(IMXFECState *s)
     }
 }
 
-static void imx_eth_enable_rx(IMXFECState *s)
+static void imx_eth_enable_rx(IMXFECState *s, bool flush)
 {
     IMXFECBufDesc bd;
     bool rx_ring_full;
@@ -544,7 +544,7 @@ static void imx_eth_enable_rx(IMXFECState *s)
 
     if (rx_ring_full) {
         FEC_PRINTF("RX buffer full\n");
-    } else if (!s->regs[ENET_RDAR]) {
+    } else if (flush) {
         qemu_flush_queued_packets(qemu_get_queue(s->nic));
     }
 
@@ -807,7 +807,7 @@ static void imx_eth_write(void *opaque, hwaddr offset, uint64_t value,
         if (s->regs[ENET_ECR] & ENET_ECR_ETHEREN) {
             if (!s->regs[index]) {
                 s->regs[index] = ENET_RDAR_RDAR;
-                imx_eth_enable_rx(s);
+                imx_eth_enable_rx(s, true);
             }
         } else {
             s->regs[index] = 0;
@@ -930,7 +930,7 @@ static int imx_eth_can_receive(NetClientState *nc)
 
     FEC_PRINTF("\n");
 
-    return s->regs[ENET_RDAR] ? 1 : 0;
+    return !!s->regs[ENET_RDAR];
 }
 
 static ssize_t imx_fec_receive(NetClientState *nc, const uint8_t *buf,
@@ -1020,7 +1020,7 @@ static ssize_t imx_fec_receive(NetClientState *nc, const uint8_t *buf,
         }
     }
     s->rx_descriptor = addr;
-    imx_eth_enable_rx(s);
+    imx_eth_enable_rx(s, false);
     imx_eth_update(s);
     return len;
 }
@@ -1116,7 +1116,7 @@ static ssize_t imx_enet_receive(NetClientState *nc, const uint8_t *buf,
         }
     }
     s->rx_descriptor = addr;
-    imx_eth_enable_rx(s);
+    imx_eth_enable_rx(s, false);
     imx_eth_update(s);
     return len;
 }
diff --git a/include/hw/net/imx_fec.h b/include/hw/net/imx_fec.h
index 62ad473b05..4bc8f03ec2 100644
--- a/include/hw/net/imx_fec.h
+++ b/include/hw/net/imx_fec.h
@@ -252,6 +252,7 @@ typedef struct IMXFECState {
     uint32_t phy_int_mask;
 
     bool is_fec;
+    bool needs_flush;
 } IMXFECState;
 
 #endif
In the current implementation, packet queue flushing logic seems to
suffer from a deadlock-like scenario if a packet is received by the
interface before the Rx ring is initialized by the Guest's driver.
Consider the following sequence of events:

    1. A QEMU instance is started against a TAP device on a Linux
       host, running a Linux guest, e.g., something to the effect
       of:

       qemu-system-arm \
          -net nic,model=imx.fec,netdev=lan0 \
          -netdev tap,id=lan0,ifname=tap0,script=no,downscript=no \
          ... rest of the arguments ...

    2. Once QEMU starts, but before the guest reaches the point where
       the FEC driver is done initializing the HW, the Guest, via the
       TAP interface, receives a number of multicast mDNS packets from
       the Host (not necessarily true for every OS, but it happens at
       least on Fedora 25)

    3. Receiving a packet in such a state results in
       imx_eth_can_receive() returning '0', which in turn causes
       tap_send() to disable the corresponding event (tap.c:203)

    4. Once the Guest's driver reaches the point where it is ready to
       receive packets, it prepares Rx ring descriptors and writes
       ENET_RDAR_RDAR to the ENET_RDAR register to indicate to the HW
       that more descriptors are ready. At this point the emulation
       layer does this:

                 s->regs[index] = ENET_RDAR_RDAR;
                 imx_eth_enable_rx(s);

       which, combined with:

                  if (!s->regs[ENET_RDAR]) {
                     qemu_flush_queued_packets(qemu_get_queue(s->nic));
                  }

       results in the Rx queue never being flushed and the
       corresponding I/O event being disabled.

To prevent the problem, change the code to always flush the packet
queue when ENET_RDAR transitions 0 -> ENET_RDAR_RDAR.

Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Philippe Mathieu-Daudé <f4bug@amsat.org>
Cc: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Cc: yurovsky@gmail.com
Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
---
 hw/net/imx_fec.c         | 12 ++++++------
 include/hw/net/imx_fec.h |  1 +
 2 files changed, 7 insertions(+), 6 deletions(-)
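
The sequence described in the commit message can be reproduced in a toy
model, which may make the ordering problem easier to see. The sketch
below is not QEMU code: ToyNic and every toy_* function are hypothetical
stand-ins, the Rx ring is assumed to be non-full and ETHEREN is assumed
to be set, and the "backend paused" flag only approximates tap_send()
disabling its read event.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model; none of these names exist in QEMU. */
#define TOY_RDAR_ACTIVE 1u
#define TOY_QUEUE_CAP   8

typedef struct ToyNic {
    unsigned rdar;       /* stands in for s->regs[ENET_RDAR] */
    int queued;          /* packets parked because the NIC refused them */
    int delivered;       /* packets handed to the guest */
    bool backend_paused; /* approximates the disabled TAP read event */
} ToyNic;

/* Like imx_eth_can_receive(): accept only while RDAR is set. */
static bool toy_can_receive(const ToyNic *n)
{
    return n->rdar != 0;
}

/* Backend offers one packet; a refusal parks it and pauses the backend. */
static void toy_backend_send(ToyNic *n)
{
    if (n->backend_paused || !toy_can_receive(n)) {
        assert(n->queued < TOY_QUEUE_CAP);
        n->queued++;
        n->backend_paused = true;
        return;
    }
    n->delivered++;
}

/* Flushing delivers everything parked and resumes the backend. */
static void toy_flush(ToyNic *n)
{
    n->delivered += n->queued;
    n->queued = 0;
    n->backend_paused = false;
}

/* Pre-patch shape: flush only if RDAR is clear, but the caller has
 * already set RDAR, so this branch is never taken on the write path. */
static void toy_enable_rx_old(ToyNic *n)
{
    if (!n->rdar) {
        toy_flush(n);
    }
}

/* Post-patch shape: the caller says explicitly whether to flush. */
static void toy_enable_rx_new(ToyNic *n, bool flush)
{
    if (flush) {
        toy_flush(n);
    }
}

/* Guest driver writes RDAR once its Rx ring is ready (0 -> active). */
static void toy_write_rdar(ToyNic *n, bool patched)
{
    if (!n->rdar) {
        n->rdar = TOY_RDAR_ACTIVE;
        if (patched) {
            toy_enable_rx_new(n, true);
        } else {
            toy_enable_rx_old(n);
        }
    }
}

/* Returns how many packets are still stuck after the driver arms Rx. */
static int toy_run(bool patched)
{
    ToyNic n = {0};
    toy_backend_send(&n);        /* early packet, before driver init */
    toy_write_rdar(&n, patched); /* driver signals descriptors ready */
    return n.queued;
}
```

Under these assumptions, toy_run(false) leaves the early packet parked
with the backend still paused, while toy_run(true) drains the queue on
the 0 -> active transition, which is the behaviour the patch aims for.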