Message ID | 20211026180909.1953355-1-mailhol.vincent@wanadoo.fr (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [RFC,v1] can: m_can: m_can_read_fifo: fix memory leak in error branch | expand |
Context | Check | Description |
---|---|---|
netdev/tree_selection | success | Series ignored based on subject |
On 27.10.2021 03:09:09, Vincent Mailhol wrote: > In m_can_read_fifo(), if the second call to m_can_fifo_read() fails, > the function jump to the out_fail label and returns without calling > m_can_receive_skb(). This means that the skb previously allocated by > alloc_can_skb() is not freed. In other terms, this is a memory leak. > > This patch adds a new goto statement: out_receive_skb and do some > small code refactoring to fix the issue. This means we pass a skb to the user space, which contains wrong data. Probably 0x0, but if the CAN frame doesn't contain 0x0, it's wrong. That doesn't look like a good idea. If the CAN frame broke due to a CRC issue on the wire it is not received. IMHO it's best to discard the skb and return the error. Marc
On Fri. 29 Oct 2021 at 20:34, Marc Kleine-Budde <mkl@pengutronix.de> wrote: > On 27.10.2021 03:09:09, Vincent Mailhol wrote: > > In m_can_read_fifo(), if the second call to m_can_fifo_read() fails, > > the function jump to the out_fail label and returns without calling > > m_can_receive_skb(). This means that the skb previously allocated by > > alloc_can_skb() is not freed. In other terms, this is a memory leak. > > > > This patch adds a new goto statement: out_receive_skb and do some > > small code refactoring to fix the issue. > > This means we pass a skb to the user space, which contains wrong data. > Probably 0x0, but if the CAN frame doesn't contain 0x0, it's wrong. That > doesn't look like a good idea. If the CAN frame broke due to a CRC issue > on the wire it is not received. IMHO it's best to discard the skb and > return the error. Arg... Guess I made the right choice to tag the patch as RFC... Just one question, what is the correct function to discard the skb? The driver uses the napi polling system (which I am not entirely familiar with). Does it mean that the rx is not done in IRQ context and that we can simply use kfree_skb() instead of dev_kfree_skb_irq()? Yours sincerely, Vincent Mailhol
On 30.10.2021 01:35:01, Vincent MAILHOL wrote: > On Fri. 29 Oct 2021 at 20:34, Marc Kleine-Budde <mkl@pengutronix.de> wrote: > > On 27.10.2021 03:09:09, Vincent Mailhol wrote: > > > In m_can_read_fifo(), if the second call to m_can_fifo_read() fails, > > > the function jump to the out_fail label and returns without calling > > > m_can_receive_skb(). This means that the skb previously allocated by > > > alloc_can_skb() is not freed. In other terms, this is a memory leak. > > > > > > This patch adds a new goto statement: out_receive_skb and do some > > > small code refactoring to fix the issue. > > > > This means we pass a skb to the user space, which contains wrong data. > > Probably 0x0, but if the CAN frame doesn't contain 0x0, it's wrong. That > > doesn't look like a good idea. If the CAN frame broke due to a CRC issue > > on the wire it is not received. IMHO it's best to discard the skb and > > return the error. > > Arg... Guess I made the right choice to tag the patch as RFC... > > Just one question, what is the correct function to discard the > skb? The driver uses the napi polling system (which I am not > entirely familiar with). Does it mean that the rx is not done in > IRQ context and that we can simply use kfree_skb() instead of > dev_kfree_skb_irq()? The m_can driver is a bit more complicated. It uses NAPI for mmio devices, but threaded IRQs for SPI devices. Looking at dev_kfree_skb_any(), it checks for hard IRQs or IRQs disabled, I think this is not the case for both threaded IRQs and NAPI. https://elixir.bootlin.com/linux/v5.15/source/net/core/dev.c#L3108 So I think kfree_skb() should be OK. regards, Marc
diff --git a/drivers/net/can/m_can/m_can.c b/drivers/net/can/m_can/m_can.c index 2470c47b2e31..4e81ff9dd5c6 100644 --- a/drivers/net/can/m_can/m_can.c +++ b/drivers/net/can/m_can/m_can.c @@ -476,7 +476,7 @@ static int m_can_read_fifo(struct net_device *dev, u32 rxfs) struct id_and_dlc fifo_header; u32 fgi; u32 timestamp = 0; - int err; + int err = 0; /* calculate the fifo get index for where to read data */ fgi = FIELD_GET(RXFS_FGI_MASK, rxfs); @@ -517,7 +517,7 @@ static int m_can_read_fifo(struct net_device *dev, u32 rxfs) err = m_can_fifo_read(cdev, fgi, M_CAN_FIFO_DATA, cf->data, DIV_ROUND_UP(cf->len, 4)); if (err) - goto out_fail; + goto out_receive_skb; } /* acknowledge rx fifo 0 */ @@ -528,12 +528,12 @@ static int m_can_read_fifo(struct net_device *dev, u32 rxfs) timestamp = FIELD_GET(RX_BUF_RXTS_MASK, fifo_header.dlc); +out_receive_skb: m_can_receive_skb(cdev, skb, timestamp); - return 0; - out_fail: - netdev_err(dev, "FIFO read returned %d\n", err); + if (err) + netdev_err(dev, "FIFO read returned %d\n", err); return err; }
In m_can_read_fifo(), if the second call to m_can_fifo_read() fails, the function jump to the out_fail label and returns without calling m_can_receive_skb(). This means that the skb previously allocated by alloc_can_skb() is not freed. In other terms, this is a memory leak. This patch adds a new goto statement: out_receive_skb and do some small code refactoring to fix the issue. * Appendix: how the issue was found * This issue was found using GCC's static analysis tool: -fanalyzer: https://gcc.gnu.org/onlinedocs/gcc/Static-Analyzer-Options.html The step to reproduce are: 1. Install GCC 11. 2. Hack the kernel's Makefile to add the -fanalyzer flag (we leave it as an exercise for the reader to figure out the details of how to do so). 3. Decorate the function alloc_can_skb() with __attribute__((__malloc__ (dealloc, netif_rx))). This step helps the static analyzer to figure out the constructor/destructor pairs (not something it can deduce by himself). 4. Compile. The compiler then throws below warning: | drivers/net/can/m_can/m_can.c: In function 'm_can_read_fifo': | drivers/net/can/m_can/m_can.c:537:9: warning: leak of 'skb' [CWE-401] [-Wanalyzer-malloc-leak] | 537 | return err; | | ^~~~~~ | 'm_can_rx_handler': events 1-6 | | | | 899 | static int m_can_rx_handler(struct net_device *dev, int quota) | | | ^~~~~~~~~~~~~~~~ | | | | | | | (1) entry to 'm_can_rx_handler' | |...... | | 907 | if (!irqstatus) | | | ~ | | | | | | | (2) following 'false' branch (when 'irqstatus != 0')... | |...... | | 920 | if (cdev->version <= 31 && irqstatus & IR_MRAF && | | | ~~ | | | | | | | (3) ...to here | |...... | | 939 | if (irqstatus & IR_RF0N) { | | | ~ | | | | | | | (4) following 'true' branch... | | 940 | rx_work_or_err = m_can_do_rx_poll(dev, (quota - work_done)); | | | ~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | | | | | | | | | (6) calling 'm_can_do_rx_poll' from 'm_can_rx_handler' | | | (5) ...to here | | | +--> 'm_can_do_rx_poll': events 7-8 | | | | 540 | static int m_can_do_rx_poll(struct net_device *dev, int quota) | | | ^~~~~~~~~~~~~~~~ | | | | | | | (7) entry to 'm_can_do_rx_poll' | |...... | | 548 | if (!(rxfs & RXFS_FFL_MASK)) { | | | ~ | | | | | | | (8) following 'false' branch... | | | 'm_can_do_rx_poll': event 9 | | | |cc1: | | (9): ...to here | | | 'm_can_do_rx_poll': events 10-12 | | | | 553 | while ((rxfs & RXFS_FFL_MASK) && (quota > 0)) { | | | ~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~ | | | | | | | (10) following 'true' branch... | | 554 | err = m_can_read_fifo(dev, rxfs); | | | ~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~ | | | | | | | | | (12) calling 'm_can_read_fifo' from 'm_can_do_rx_poll' | | | (11) ...to here | | | +--> 'm_can_read_fifo': events 13-24 | | | | 470 | static int m_can_read_fifo(struct net_device *dev, u32 rxfs) | | | ^~~~~~~~~~~~~~~ | | | | | | | (13) entry to 'm_can_read_fifo' | |...... | | 484 | if (err) | | | ~ | | | | | | | (14) following 'false' branch... | |...... | | 487 | if (fifo_header.dlc & RX_BUF_FDF) | | | ~~ ~ | | | | | | | | | (16) following 'true' branch... | | | (15) ...to here | | 488 | skb = alloc_canfd_skb(dev, &cf); | | | ~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~ | | | | | | | | | (18) allocated here | | | (17) ...to here | |...... | | 491 | if (!skb) { | | | ~ | | | | | | | (19) assuming 'skb' is non-NULL | | | (20) following 'false' branch (when 'skb' is non-NULL)... | |...... | | 496 | if (fifo_header.dlc & RX_BUF_FDF) | | | ~~ | | | | | | | (21) ...to here | |...... | | 519 | if (err) | | | ~ | | | | | | | (22) following 'true' branch... | | 520 | goto out_fail; | | | ~~~~ | | | | | | | (23) ...to here | |...... | | 537 | return err; | | | ~~~~~~ | | | | | | | (24) 'skb' leaks here; was allocated at (18) | | Fixes: e39381770ec9 ("can: m_can: Disable IRQs on FIFO bus errors") CC: Matt Kline <matt@bitbashing.io> Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> --- drivers/net/can/m_can/m_can.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)