Message ID | 20210108150545.2018-2-rojay@codeaurora.org (mailing list archive) |
---|---|
State | Accepted |
Commit | e0371298ddc51761be257698554ea507ac8bf831 |
Series | Implement Shutdown callback for geni-i2c |
On 1/8/2021 8:35 PM, Roja Rani Yarubandi wrote:
> If the hardware is still accessing memory after SMMU translation
> is disabled (as part of smmu shutdown callback), then the
> IOVAs (I/O virtual address) which it was using will go on the bus
> as the physical addresses which will result in unknown crashes
> like NoC/interconnect errors.
>
> So, implement shutdown callback to i2c driver to stop on-going transfer
> and unmap DMA mappings during system "reboot" or "shutdown".
>
> Fixes: 37692de5d523 ("i2c: i2c-qcom-geni: Add bus driver for the Qualcomm GENI I2C controller")
> Signed-off-by: Roja Rani Yarubandi <rojay@codeaurora.org>

Reviewed-by: Akash Asthana <akashast@codeaurora.org>
Quoting Roja Rani Yarubandi (2021-01-08 07:05:45)
> diff --git a/drivers/i2c/busses/i2c-qcom-geni.c b/drivers/i2c/busses/i2c-qcom-geni.c
> index 214b4c913a13..c3f584795911 100644
> --- a/drivers/i2c/busses/i2c-qcom-geni.c
> +++ b/drivers/i2c/busses/i2c-qcom-geni.c
> @@ -375,6 +375,32 @@ static void geni_i2c_tx_msg_cleanup(struct geni_i2c_dev *gi2c,
>  	}
>  }
>
> +static void geni_i2c_stop_xfer(struct geni_i2c_dev *gi2c)
> +{
> +	int ret;
> +	u32 geni_status;
> +	struct i2c_msg *cur;
> +
> +	/* Resume device, as runtime suspend can happen anytime during transfer */
> +	ret = pm_runtime_get_sync(gi2c->se.dev);
> +	if (ret < 0) {
> +		dev_err(gi2c->se.dev, "Failed to resume device: %d\n", ret);
> +		return;
> +	}
> +
> +	geni_status = readl_relaxed(gi2c->se.base + SE_GENI_STATUS);
> +	if (geni_status & M_GENI_CMD_ACTIVE) {
> +		cur = gi2c->cur;

Why don't we need to hold the spinlock gi2c::lock here?

> +		geni_i2c_abort_xfer(gi2c);
> +		if (cur->flags & I2C_M_RD)
> +			geni_i2c_rx_msg_cleanup(gi2c, cur);
> +		else
> +			geni_i2c_tx_msg_cleanup(gi2c, cur);
> +	}
> +
> +	pm_runtime_put_sync_suspend(gi2c->se.dev);
> +}
> +
>  static int geni_i2c_rx_one_msg(struct geni_i2c_dev *gi2c, struct i2c_msg *msg,
>  				u32 m_param)
>  {
Hi Stephen,

On 2021-01-13 12:24, Stephen Boyd wrote:
> Quoting Roja Rani Yarubandi (2021-01-08 07:05:45)
>> diff --git a/drivers/i2c/busses/i2c-qcom-geni.c
>> b/drivers/i2c/busses/i2c-qcom-geni.c
>> index 214b4c913a13..c3f584795911 100644
>> --- a/drivers/i2c/busses/i2c-qcom-geni.c
>> +++ b/drivers/i2c/busses/i2c-qcom-geni.c
>> @@ -375,6 +375,32 @@ static void geni_i2c_tx_msg_cleanup(struct
>> geni_i2c_dev *gi2c,
>>  	}
>>  }
>>
>> +static void geni_i2c_stop_xfer(struct geni_i2c_dev *gi2c)
>> +{
>> +	int ret;
>> +	u32 geni_status;
>> +	struct i2c_msg *cur;
>> +
>> +	/* Resume device, as runtime suspend can happen anytime during
>> transfer */
>> +	ret = pm_runtime_get_sync(gi2c->se.dev);
>> +	if (ret < 0) {
>> +		dev_err(gi2c->se.dev, "Failed to resume device: %d\n",
>> ret);
>> +		return;
>> +	}
>> +
>> +	geni_status = readl_relaxed(gi2c->se.base + SE_GENI_STATUS);
>> +	if (geni_status & M_GENI_CMD_ACTIVE) {
>> +		cur = gi2c->cur;
>
> Why don't we need to hold the spinlock gi2c::lock here?
>

I am not seeing any race here. May I know which race are you suspecting
here?

>> +		geni_i2c_abort_xfer(gi2c);
>> +		if (cur->flags & I2C_M_RD)
>> +			geni_i2c_rx_msg_cleanup(gi2c, cur);
>> +		else
>> +			geni_i2c_tx_msg_cleanup(gi2c, cur);
>> +	}
>> +
>> +	pm_runtime_put_sync_suspend(gi2c->se.dev);
>> +}
>> +
>>  static int geni_i2c_rx_one_msg(struct geni_i2c_dev *gi2c, struct
>> i2c_msg *msg,
>>  				u32 m_param)
>>  {
Quoting rojay@codeaurora.org (2021-02-18 06:15:17)
> Hi Stephen,
>
> On 2021-01-13 12:24, Stephen Boyd wrote:
> > Quoting Roja Rani Yarubandi (2021-01-08 07:05:45)
> >> diff --git a/drivers/i2c/busses/i2c-qcom-geni.c
> >> b/drivers/i2c/busses/i2c-qcom-geni.c
> >> index 214b4c913a13..c3f584795911 100644
> >> --- a/drivers/i2c/busses/i2c-qcom-geni.c
> >> +++ b/drivers/i2c/busses/i2c-qcom-geni.c
> >> @@ -375,6 +375,32 @@ static void geni_i2c_tx_msg_cleanup(struct
> >> geni_i2c_dev *gi2c,
> >>  	}
> >>  }
> >>
> >> +static void geni_i2c_stop_xfer(struct geni_i2c_dev *gi2c)
> >> +{
> >> +	int ret;
> >> +	u32 geni_status;
> >> +	struct i2c_msg *cur;
> >> +
> >> +	/* Resume device, as runtime suspend can happen anytime during
> >> transfer */
> >> +	ret = pm_runtime_get_sync(gi2c->se.dev);
> >> +	if (ret < 0) {
> >> +		dev_err(gi2c->se.dev, "Failed to resume device: %d\n",
> >> ret);
> >> +		return;
> >> +	}
> >> +
> >> +	geni_status = readl_relaxed(gi2c->se.base + SE_GENI_STATUS);
> >> +	if (geni_status & M_GENI_CMD_ACTIVE) {
> >> +		cur = gi2c->cur;
> >
> > Why don't we need to hold the spinlock gi2c::lock here?
> >
>
> I am not seeing any race here. May I know which race are you suspecting
> here?

Sorry there are long delays between posting and replies to my review
comments. It takes me some time to remember what we're talking about
because this patch has dragged on for many months.

So my understanding is that gi2c::lock protects the 'cur' pointer. I
imagine this scenario might go bad

  CPU0                          CPU1
  ----                          ----
  geni_i2c_stop_xfer()
   ...                          geni_i2c_rx_one_msg()
                                 gi2c->cur = cur1;
   cur = gi2c->cur;
   ...                          geni_i2c_tx_one_msg()
                                 gi2c->cur = cur2;
   geni_i2c_abort_xfer()
    <uses cur2>
   if (cur->flags & I2C_M_RD)
    <uses cur1 for the condition and call; oops that's bad>

It's almost like we should combine the geni_i2c_abort_xfer() logic with
the rx/tx message cleanup functions so that it's all done under one
lock. Unfortunately it's complicated by the fact that there are various
completion waiting timeouts involved. Fun!

But even after all that, I don't see how the geni_i2c_stop_xfer() puts a
stop to future calls to geni_i2c_rx_one_msg() or geni_i2c_tx_one_msg().
The hardware isn't disabled from what I can tell. The irq isn't
disabled, the clks aren't turned off, etc. What is to stop an i2c device
from trying to use the bus after this shutdown function is called? If
anything, this function looks like a "flush", where we flush out any
pending transfer. Where's the "plug" operation that prevents any future
operations from following this call?

BTW, I see this is merged upstream. That's great, but it seems broken.
Please fix it or revert it out.

>
> >> +		geni_i2c_abort_xfer(gi2c);
> >> +		if (cur->flags & I2C_M_RD)
> >> +			geni_i2c_rx_msg_cleanup(gi2c, cur);
> >> +		else
> >> +			geni_i2c_tx_msg_cleanup(gi2c, cur);
> >> +	}
> >> +
> >> +	pm_runtime_put_sync_suspend(gi2c->se.dev);
> >> +}
> >> +
> >>  static int geni_i2c_rx_one_msg(struct geni_i2c_dev *gi2c, struct
> >> i2c_msg *msg,
> >>  				u32 m_param)
> >>  {
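The interleaving in the CPU0/CPU1 scenario can be modeled in user space. What follows is a minimal sketch, not driver code: struct xfer_state, xfer_start_msg() and xfer_stop_current() are hypothetical names, an atomic_flag spin loop stands in for gi2c->lock, and two counters stand in for the rx/tx cleanup helpers. It only illustrates the locking idea of reading 'cur', clearing it, and choosing the cleanup inside one critical section.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define MSG_RD 0x1u	/* stands in for I2C_M_RD */

struct msg {
	unsigned int flags;
};

/* Hypothetical user-space stand-in for the relevant driver state. */
struct xfer_state {
	atomic_flag lock;	/* models gi2c->lock */
	struct msg *cur;	/* models gi2c->cur */
	int rx_cleanups;	/* models calls to the rx cleanup helper */
	int tx_cleanups;	/* models calls to the tx cleanup helper */
};

static void xfer_lock(struct xfer_state *s)
{
	while (atomic_flag_test_and_set_explicit(&s->lock, memory_order_acquire))
		;	/* spin until the holder clears the flag */
}

static void xfer_unlock(struct xfer_state *s)
{
	atomic_flag_clear_explicit(&s->lock, memory_order_release);
}

/* Transfer path: publish the message that is about to go on the wire. */
static void xfer_start_msg(struct xfer_state *s, struct msg *m)
{
	assert(m != NULL);
	xfer_lock(s);
	s->cur = m;
	xfer_unlock(s);
}

/*
 * Shutdown path: take the current message and pick its cleanup while the
 * lock is held, so a concurrent xfer_start_msg() cannot replace the
 * pointer between those two steps. Returns true if a message was in flight.
 */
static bool xfer_stop_current(struct xfer_state *s)
{
	struct msg *cur;

	xfer_lock(s);
	cur = s->cur;
	s->cur = NULL;
	if (cur) {
		if (cur->flags & MSG_RD)
			s->rx_cleanups++;
		else
			s->tx_cleanups++;
	}
	xfer_unlock(s);

	return cur != NULL;
}
```

The real abort path waits on a completion with a timeout, which cannot be done while holding a spinlock, so this sketch leaves out exactly the part the review calls complicated.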
> BTW, I see this is merged upstream. That's great, but it seems broken.
> Please fix it or revert it out.

Sorry, that was my mistake! I was aware of this discussion. It seems I
accidentally applied it. I'll send a revert and we can reapply it once
everyone is happy.
Hi Stephen,

On 2021-02-24 12:36, Stephen Boyd wrote:
> Quoting rojay@codeaurora.org (2021-02-18 06:15:17)
>> Hi Stephen,
>>
>> On 2021-01-13 12:24, Stephen Boyd wrote:
>> > Quoting Roja Rani Yarubandi (2021-01-08 07:05:45)
>> >> diff --git a/drivers/i2c/busses/i2c-qcom-geni.c
>> >> b/drivers/i2c/busses/i2c-qcom-geni.c
>> >> index 214b4c913a13..c3f584795911 100644
>> >> --- a/drivers/i2c/busses/i2c-qcom-geni.c
>> >> +++ b/drivers/i2c/busses/i2c-qcom-geni.c
>> >> @@ -375,6 +375,32 @@ static void geni_i2c_tx_msg_cleanup(struct
>> >> geni_i2c_dev *gi2c,
>> >>  	}
>> >>  }
>> >>
>> >> +static void geni_i2c_stop_xfer(struct geni_i2c_dev *gi2c)
>> >> +{
>> >> +	int ret;
>> >> +	u32 geni_status;
>> >> +	struct i2c_msg *cur;
>> >> +
>> >> +	/* Resume device, as runtime suspend can happen anytime during
>> >> transfer */
>> >> +	ret = pm_runtime_get_sync(gi2c->se.dev);
>> >> +	if (ret < 0) {
>> >> +		dev_err(gi2c->se.dev, "Failed to resume device: %d\n",
>> >> ret);
>> >> +		return;
>> >> +	}
>> >> +
>> >> +	geni_status = readl_relaxed(gi2c->se.base + SE_GENI_STATUS);
>> >> +	if (geni_status & M_GENI_CMD_ACTIVE) {
>> >> +		cur = gi2c->cur;
>> >
>> > Why don't we need to hold the spinlock gi2c::lock here?
>> >
>>
>> I am not seeing any race here. May I know which race are you
>> suspecting here?
>
> Sorry there are long delays between posting and replies to my review
> comments. It takes me some time to remember what we're talking about
> because this patch has dragged on for many months.
>

Sorry for the delayed responses.

> So my understanding is that gi2c::lock protects the 'cur' pointer. I
> imagine this scenario might go bad
>
>   CPU0                          CPU1
>   ----                          ----
>   geni_i2c_stop_xfer()
>    ...                          geni_i2c_rx_one_msg()
>                                  gi2c->cur = cur1;
>    cur = gi2c->cur;
>    ...                          geni_i2c_tx_one_msg()
>                                  gi2c->cur = cur2;
>    geni_i2c_abort_xfer()
>     <uses cur2>
>    if (cur->flags & I2C_M_RD)
>     <uses cur1 for the condition and call; oops that's bad>
>
> It's almost like we should combine the geni_i2c_abort_xfer() logic with
> the rx/tx message cleanup functions so that it's all done under one
> lock. Unfortunately it's complicated by the fact that there are various
> completion waiting timeouts involved. Fun!
>

Thanks for the explanation.
I have fixed this possible race by protecting gi2c->cur and the call to
geni_i2c_abort_xfer() with the lock. geni_i2c_abort_xfer() now takes an
additional parameter that tells it which sequence it was called from, so
that it can handle spin_lock/spin_unlock accordingly.

> But even after all that, I don't see how the geni_i2c_stop_xfer() puts a
> stop to future calls to geni_i2c_rx_one_msg() or geni_i2c_tx_one_msg().

This is now handled by adding a bool variable "stop_xfer" to struct
geni_i2c_dev, which puts a stop to upcoming geni_i2c_rx_one_msg() and
geni_i2c_tx_one_msg() calls once we receive the shutdown call.

> The hardware isn't disabled from what I can tell. The irq isn't
> disabled, the clks aren't turned off, etc. What is to stop an i2c device
> from trying to use the bus after this shutdown function is called? If
> anything, this function looks like a "flush", where we flush out any
> pending transfer. Where's the "plug" operation that prevents any future
> operations from following this call?
>

We are turning off clocks and disabling the irq in
geni_i2c_runtime_suspend().

IIUC, in the shutdown sequence: during "remove" we unplug the device with
calls that undo probe's plug operations, for example i2c_del_adapter().
For "shutdown", as the system is going down anyway, there is no need to
do the unplug operations.

> BTW, I see this is merged upstream. That's great, but it seems broken.
> Please fix it or revert it out.
>
>> >>
>> >> +		geni_i2c_abort_xfer(gi2c);
>> >> +		if (cur->flags & I2C_M_RD)
>> >> +			geni_i2c_rx_msg_cleanup(gi2c, cur);
>> >> +		else
>> >> +			geni_i2c_tx_msg_cleanup(gi2c, cur);
>> >> +	}
>> >> +
>> >> +	pm_runtime_put_sync_suspend(gi2c->se.dev);
>> >> +}
>> >> +
>> >>  static int geni_i2c_rx_one_msg(struct geni_i2c_dev *gi2c, struct
>> >> i2c_msg *msg,
>> >>  				u32 m_param)
>> >>  {

Thanks,
Roja
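The "flush" versus "plug" distinction raised in the review, and the "stop_xfer" flag proposed in response, can be illustrated with a small user-space model. This is a sketch under stated assumptions, not the follow-up patch: struct plug_state and the plug_*() functions are hypothetical, an atomic_flag spin loop stands in for the driver's spinlock, and returning -ESHUTDOWN to refused callers is an assumption (the thread does not say which error code is used). The point it shows is that the flag is set and tested under the same lock that guards the start of a transfer, so no transfer can begin after shutdown has flushed the bus.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for the relevant driver state. */
struct plug_state {
	atomic_flag lock;	/* models the driver's spinlock */
	bool stop_xfer;		/* named after the flag proposed in the thread */
	bool xfer_active;	/* a transfer is currently in flight */
	int aborted;		/* number of transfers flushed by shutdown */
};

static void plug_lock(struct plug_state *s)
{
	while (atomic_flag_test_and_set_explicit(&s->lock, memory_order_acquire))
		;	/* spin until the holder clears the flag */
}

static void plug_unlock(struct plug_state *s)
{
	atomic_flag_clear_explicit(&s->lock, memory_order_release);
}

/* Transfer path: refuse to start once shutdown has been requested. */
static int plug_xfer_begin(struct plug_state *s)
{
	int ret = 0;

	assert(s != NULL);
	plug_lock(s);
	if (s->stop_xfer)
		ret = -ESHUTDOWN;	/* assumed error code */
	else
		s->xfer_active = true;
	plug_unlock(s);

	return ret;
}

/* Transfer path: normal completion of a transfer. */
static void plug_xfer_end(struct plug_state *s)
{
	plug_lock(s);
	s->xfer_active = false;
	plug_unlock(s);
}

/* Shutdown path: plug first, then flush whatever is still in flight. */
static void plug_shutdown(struct plug_state *s)
{
	plug_lock(s);
	s->stop_xfer = true;
	if (s->xfer_active) {
		s->xfer_active = false;
		s->aborted++;
	}
	plug_unlock(s);
}
```

After plug_shutdown() returns, every later plug_xfer_begin() fails, which is the "plug" half; the abort of an in-flight transfer inside plug_shutdown() is the "flush" half.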
diff --git a/drivers/i2c/busses/i2c-qcom-geni.c b/drivers/i2c/busses/i2c-qcom-geni.c
index 214b4c913a13..c3f584795911 100644
--- a/drivers/i2c/busses/i2c-qcom-geni.c
+++ b/drivers/i2c/busses/i2c-qcom-geni.c
@@ -375,6 +375,32 @@ static void geni_i2c_tx_msg_cleanup(struct geni_i2c_dev *gi2c,
 	}
 }
 
+static void geni_i2c_stop_xfer(struct geni_i2c_dev *gi2c)
+{
+	int ret;
+	u32 geni_status;
+	struct i2c_msg *cur;
+
+	/* Resume device, as runtime suspend can happen anytime during transfer */
+	ret = pm_runtime_get_sync(gi2c->se.dev);
+	if (ret < 0) {
+		dev_err(gi2c->se.dev, "Failed to resume device: %d\n", ret);
+		return;
+	}
+
+	geni_status = readl_relaxed(gi2c->se.base + SE_GENI_STATUS);
+	if (geni_status & M_GENI_CMD_ACTIVE) {
+		cur = gi2c->cur;
+		geni_i2c_abort_xfer(gi2c);
+		if (cur->flags & I2C_M_RD)
+			geni_i2c_rx_msg_cleanup(gi2c, cur);
+		else
+			geni_i2c_tx_msg_cleanup(gi2c, cur);
+	}
+
+	pm_runtime_put_sync_suspend(gi2c->se.dev);
+}
+
 static int geni_i2c_rx_one_msg(struct geni_i2c_dev *gi2c, struct i2c_msg *msg,
 				u32 m_param)
 {
@@ -650,6 +676,13 @@ static int geni_i2c_remove(struct platform_device *pdev)
 	return 0;
 }
 
+static void geni_i2c_shutdown(struct platform_device *pdev)
+{
+	struct geni_i2c_dev *gi2c = platform_get_drvdata(pdev);
+
+	geni_i2c_stop_xfer(gi2c);
+}
+
 static int __maybe_unused geni_i2c_runtime_suspend(struct device *dev)
 {
 	int ret;
@@ -714,6 +747,7 @@ MODULE_DEVICE_TABLE(of, geni_i2c_dt_match);
 static struct platform_driver geni_i2c_driver = {
 	.probe  = geni_i2c_probe,
 	.remove = geni_i2c_remove,
+	.shutdown = geni_i2c_shutdown,
 	.driver = {
 		.name = "geni_i2c",
 		.pm = &geni_i2c_pm_ops,
If the hardware is still accessing memory after SMMU translation
is disabled (as part of smmu shutdown callback), then the
IOVAs (I/O virtual address) which it was using will go on the bus
as the physical addresses which will result in unknown crashes
like NoC/interconnect errors.

So, implement shutdown callback to i2c driver to stop on-going transfer
and unmap DMA mappings during system "reboot" or "shutdown".

Fixes: 37692de5d523 ("i2c: i2c-qcom-geni: Add bus driver for the Qualcomm GENI I2C controller")
Signed-off-by: Roja Rani Yarubandi <rojay@codeaurora.org>
---
Changes in V2:
 - As per Stephen's comments added separate function for stop transfer,
   fixed minor nitpicks.
 - As per Stephen's comments, changed commit text.

Changes in V3:
 - As per Stephen's comments, squashed patch 1 into patch 2, added Fixes tag.
 - As per Akash's comments, included FIFO case in stop_xfer, fixed minor nitpicks.

Changes in V4:
 - As per Stephen's comments cleaned up geni_i2c_stop_xfer function,
   added dma_buf in geni_i2c_dev struct to call i2c_put_dma_safe_msg_buf()
   from other functions, removed "iova" check in geni_se_rx_dma_unprep()
   and geni_se_tx_dma_unprep() functions.
 - Added two helper functions geni_i2c_rx_one_msg_done() and
   geni_i2c_tx_one_msg_done() to unwrap the things after rx/tx FIFO/DMA
   transfers, so that the same can be used in geni_i2c_stop_xfer() function
   during shutdown callback. Updated commit text accordingly.
 - Checking whether it is tx/rx transfer using I2C_M_RD which is valid for both
   FIFO and DMA cases, so dropped DMA_RX_ACTIVE and DMA_TX_ACTIVE bit checking

Changes in V5:
 - As per Stephen's comments, added spin_lock_irqsave & spin_unlock_irqrestore in
   geni_i2c_stop_xfer() function.

Changes in V6:
 - As per Stephen's comments, taken care of unsafe lock order in
   geni_i2c_stop_xfer().
 - Moved spin_lock/unlock to geni_i2c_rx_msg_cleanup() and
   geni_i2c_tx_msg_cleanup() functions.

Changes in V7:
 - No changes

Changes in V8:
 - As per Wolfram Sang comment, removed goto and modified geni_i2c_stop_xfer()
   accordingly.

 drivers/i2c/busses/i2c-qcom-geni.c | 34 ++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)
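
The failure mode named in the commit message, IOVAs going out on the bus as physical addresses once translation is off, can be pictured with a toy model. This is only an illustrative sketch: struct toy_smmu and toy_bus_addr() are hypothetical, the exact-match lookup ignores pages and page tables, and the pass-through behavior when translation is disabled is taken from the commit message rather than from any SMMU documentation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAPPINGS 4
#define TOY_FAULT UINT64_MAX	/* marks "no mapping for this IOVA" */

struct toy_mapping {
	uint64_t iova;	/* address the device was given by the DMA API */
	uint64_t pa;	/* physical address that backs it */
	bool valid;
};

/* Hypothetical, greatly simplified stand-in for an SMMU. */
struct toy_smmu {
	bool translate;	/* false once the SMMU shutdown callback has run */
	struct toy_mapping map[TOY_MAPPINGS];
};

/* Address that actually reaches the bus when the device accesses 'iova'. */
static uint64_t toy_bus_addr(const struct toy_smmu *s, uint64_t iova)
{
	assert(s != NULL);

	/* Translation off: the IOVA is used as if it were a physical address. */
	if (!s->translate)
		return iova;

	for (size_t i = 0; i < TOY_MAPPINGS; i++) {
		if (s->map[i].valid && s->map[i].iova == iova)
			return s->map[i].pa;
	}

	return TOY_FAULT;
}
```

With a mapping of IOVA 0x1000 to physical 0x80001000, the same device access reaches 0x80001000 while translation is on and 0x1000 once it is off, which is why a transfer still in flight has to be stopped and unmapped before that point.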