Message ID | 1313434172-18319-1-git-send-email-dianders@chromium.org (mailing list archive) |
---|---|
State | New, archived |
Hi,

On Mon, Aug 15, 2011 at 11:49:32AM -0700, Doug Anderson wrote:
> This change doesn't fix any known problems but turns
> on the overflow detection feature of the i2c controller
> in the hopes of flushing out any current (or future)
> bugs in the i2c driver.
>
> Inspired by a change on nvidia's git server:
> http://nv-tegra.nvidia.com/gitweb/?p=linux-2.6.git;a=commit;h=266d1b7397284505e55d06254b497cb32be07b69
>
> Signed-off-by: Doug Anderson <dianders@chromium.org>
> ---
>  drivers/i2c/busses/i2c-tegra.c | 11 ++++++++---
>  1 files changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/i2c/busses/i2c-tegra.c b/drivers/i2c/busses/i2c-tegra.c
> index 2440b74..4dbba23 100644
> --- a/drivers/i2c/busses/i2c-tegra.c
> +++ b/drivers/i2c/busses/i2c-tegra.c
> @@ -367,7 +367,8 @@ static int tegra_i2c_init(struct tegra_i2c_dev *i2c_dev)
>  static irqreturn_t tegra_i2c_isr(int irq, void *dev_id)
>  {
>  	u32 status;
> -	const u32 status_err = I2C_INT_NO_ACK | I2C_INT_ARBITRATION_LOST;
> +	const u32 status_err = I2C_INT_NO_ACK | I2C_INT_ARBITRATION_LOST |
> +		I2C_INT_TX_FIFO_OVERFLOW;
>  	struct tegra_i2c_dev *i2c_dev = dev_id;
>
>  	status = i2c_readl(i2c_dev, I2C_INT_STATUS);
> @@ -389,6 +390,9 @@ static irqreturn_t tegra_i2c_isr(int irq, void *dev_id)
>  	}
>
>  	if (unlikely(status & status_err)) {
> +		/* Don't pass this back--it can only happen due to a bug. */
> +		BUG_ON(status & I2C_INT_TX_FIFO_OVERFLOW);

so due to a FIFO overflow you lock up the whole system ? Can't you e.g.
reset the controller and reconfigure it rather than locking up the
system ?
Felipe,

On Mon, Aug 15, 2011 at 12:17 PM, Felipe Balbi <balbi@ti.com> wrote:
> so due to a FIFO overflow you lock up the whole system ? Can't you e.g.
> reset the controller and reconfigure it rather than locking up the
> system ?

Certainly we could try to be more proactive and reset / retry / return
the error to the client. However, since the only expected situation
where this BUG_ON should hit is due to a bug in this driver itself
(AKA: i2c clients shouldn't be able to do anything to cause the BUG_ON
to hit), that seems like a lot of added complexity.

Also: if there is an arbitrary software bug that is causing an overflow
condition to occur, I'm not sure how stable the system will be.
Specifically, the i2c controller is used (among other things) to talk
to the PMU and adjust voltages in the system. If we just sent it a
random command, I'd rather report the bug right away so we don't get
hard to find/reproduce failures in other parts of the system.

What do others think?

-Doug
Hi,

On Mon, Aug 15, 2011 at 12:52:36PM -0700, Doug Anderson wrote:
> Felipe,
>
> On Mon, Aug 15, 2011 at 12:17 PM, Felipe Balbi <balbi@ti.com> wrote:
> > so due to a FIFO overflow you lock up the whole system ? Can't you e.g.
> > reset the controller and reconfigure it rather than locking up the
> > system ?
>
> Certainly we could try to be more proactive and reset / retry / return
> the error to the client. However, since the only expected situation
> where this BUG_ON should hit is due to a bug in this driver itself
> (AKA: i2c clients shouldn't be able to do anything to cause the BUG_ON
> to hit), that seems like a lot of added complexity.

so at least just pass an error to the client, but hanging the entire
system seems a bit too much, don't you think ?

> Also: if there is an arbitrary software bug that is causing an overflow
> condition to occur, I'm not sure how stable the system will be.
> Specifically, the i2c controller is used (among other things) to talk
> to the PMU and adjust voltages in the system. If we just sent it a
> random command, I'd rather report the bug right away so we don't get
> hard to find/reproduce failures in other parts of the system.

that's a good point, I still think that e.g. making a cellphone
unresponsive until a watchdog reset triggers just because you got a FIFO
overflow on the I2C controller is too much.
On Mon, Aug 15, 2011 at 11:03:50PM +0300, Felipe Balbi wrote:
> Hi,
>
> On Mon, Aug 15, 2011 at 12:52:36PM -0700, Doug Anderson wrote:
> > Felipe,
> >
> > On Mon, Aug 15, 2011 at 12:17 PM, Felipe Balbi <balbi@ti.com> wrote:
> > > so due to a FIFO overflow you lock up the whole system ? Can't you e.g.
> > > reset the controller and reconfigure it rather than locking up the
> > > system ?
> >
> > Certainly we could try to be more proactive and reset / retry / return
> > the error to the client. However, since the only expected situation
> > where this BUG_ON should hit is due to a bug in this driver itself
> > (AKA: i2c clients shouldn't be able to do anything to cause the BUG_ON
> > to hit), that seems like a lot of added complexity.
>
> so at least just pass an error to the client, but hanging the entire
> system seems a bit too much, don't you think ?
>
> > Also: if there is an arbitrary software bug that is causing an overflow
> > condition to occur, I'm not sure how stable the system will be.
> > Specifically, the i2c controller is used (among other things) to talk
> > to the PMU and adjust voltages in the system. If we just sent it a
> > random command, I'd rather report the bug right away so we don't get
> > hard to find/reproduce failures in other parts of the system.
>
> that's a good point, I still think that e.g. making a cellphone
> unresponsive until a watchdog reset triggers just because you got a FIFO
> overflow on the I2C controller is too much.

Yes, I would agree on that. BUG() really should only be used for
occasions where there's little possibility the entire system can
continue to work.

In this case, it seems far more sensible to report this as an error
and see what can be done to recover the bus and controller for the
next transaction.
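
To make the alternative discussed above concrete, here is a minimal user-space sketch of recording the overflow as a transfer error and flagging the controller for re-initialisation, instead of calling BUG_ON(). It is not the driver's code: `struct fake_i2c_dev`, `record_errors()`, `finish_transfer()`, the `needs_reset` field and `I2C_ERR_TX_OVERFLOW` are hypothetical names invented for illustration, and the bit positions are placeholders that would need checking against the real register definitions.

```c
#include <errno.h>
#include <stdint.h>

/* Interrupt status bits. The names appear in the patch; the bit
 * positions used here are placeholders for this sketch only. */
#define I2C_INT_TX_FIFO_OVERFLOW  (1u << 5)
#define I2C_INT_NO_ACK            (1u << 3)
#define I2C_INT_ARBITRATION_LOST  (1u << 2)

/* Per-message error flags. I2C_ERR_TX_OVERFLOW is hypothetical. */
#define I2C_ERR_NONE              0x00u
#define I2C_ERR_NO_ACK            0x01u
#define I2C_ERR_ARBITRATION_LOST  0x02u
#define I2C_ERR_TX_OVERFLOW       0x04u

/* Hypothetical stand-in for the driver's device state. */
struct fake_i2c_dev {
	uint32_t msg_err;   /* errors seen during the current message */
	int needs_reset;    /* set when the controller should be re-initialised */
};

/* What the interrupt handler would do: note the error and carry on,
 * rather than halting the system. */
static void record_errors(struct fake_i2c_dev *dev, uint32_t status)
{
	if (status & I2C_INT_NO_ACK)
		dev->msg_err |= I2C_ERR_NO_ACK;
	if (status & I2C_INT_ARBITRATION_LOST)
		dev->msg_err |= I2C_ERR_ARBITRATION_LOST;
	if (status & I2C_INT_TX_FIFO_OVERFLOW) {
		dev->msg_err |= I2C_ERR_TX_OVERFLOW;
		dev->needs_reset = 1;
	}
}

/* What the transfer path would do once the interrupt has fired:
 * recover the controller if needed and hand an error to the client. */
static int finish_transfer(struct fake_i2c_dev *dev)
{
	if (dev->msg_err == I2C_ERR_NONE)
		return 0;

	if (dev->needs_reset) {
		/* A real driver would reset and reprogram the controller here. */
		dev->needs_reset = 0;
	}

	dev->msg_err = I2C_ERR_NONE;
	return -EIO;
}
```

With this shape, a client that triggers the overflow path sees a failed transfer (-EIO in this sketch) and the next transaction starts from a freshly reset controller. Whether that is safe when the bus also carries PMU traffic is exactly the concern raised earlier in the thread, and the sketch does not settle it.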
diff --git a/drivers/i2c/busses/i2c-tegra.c b/drivers/i2c/busses/i2c-tegra.c
index 2440b74..4dbba23 100644
--- a/drivers/i2c/busses/i2c-tegra.c
+++ b/drivers/i2c/busses/i2c-tegra.c
@@ -367,7 +367,8 @@ static int tegra_i2c_init(struct tegra_i2c_dev *i2c_dev)
 static irqreturn_t tegra_i2c_isr(int irq, void *dev_id)
 {
 	u32 status;
-	const u32 status_err = I2C_INT_NO_ACK | I2C_INT_ARBITRATION_LOST;
+	const u32 status_err = I2C_INT_NO_ACK | I2C_INT_ARBITRATION_LOST |
+		I2C_INT_TX_FIFO_OVERFLOW;
 	struct tegra_i2c_dev *i2c_dev = dev_id;
 
 	status = i2c_readl(i2c_dev, I2C_INT_STATUS);
@@ -389,6 +390,9 @@ static irqreturn_t tegra_i2c_isr(int irq, void *dev_id)
 	}
 
 	if (unlikely(status & status_err)) {
+		/* Don't pass this back--it can only happen due to a bug. */
+		BUG_ON(status & I2C_INT_TX_FIFO_OVERFLOW);
+
 		if (status & I2C_INT_NO_ACK)
 			i2c_dev->msg_err |= I2C_ERR_NO_ACK;
 		if (status & I2C_INT_ARBITRATION_LOST)
@@ -423,7 +427,7 @@ err:
 	/* An error occurred, mask all interrupts */
 	tegra_i2c_mask_irq(i2c_dev, I2C_INT_NO_ACK | I2C_INT_ARBITRATION_LOST |
 		I2C_INT_PACKET_XFER_COMPLETE | I2C_INT_TX_FIFO_DATA_REQ |
-		I2C_INT_RX_FIFO_DATA_REQ);
+		I2C_INT_RX_FIFO_DATA_REQ | I2C_INT_TX_FIFO_OVERFLOW);
 	i2c_writel(i2c_dev, status, I2C_INT_STATUS);
 	if (i2c_dev->is_dvc)
 		dvc_writel(i2c_dev, DVC_STATUS_I2C_DONE_INTR, DVC_STATUS);
@@ -473,7 +477,8 @@ static int tegra_i2c_xfer_msg(struct tegra_i2c_dev *i2c_dev,
 	if (!(msg->flags & I2C_M_RD))
 		tegra_i2c_fill_tx_fifo(i2c_dev);
 
-	int_mask = I2C_INT_NO_ACK | I2C_INT_ARBITRATION_LOST;
+	int_mask = I2C_INT_NO_ACK | I2C_INT_ARBITRATION_LOST |
+		I2C_INT_TX_FIFO_OVERFLOW;
 	if (msg->flags & I2C_M_RD)
 		int_mask |= I2C_INT_RX_FIFO_DATA_REQ;
 	else if (i2c_dev->msg_buf_remaining)
This change doesn't fix any known problems but turns
on the overflow detection feature of the i2c controller
in the hopes of flushing out any current (or future)
bugs in the i2c driver.

Inspired by a change on nvidia's git server:
http://nv-tegra.nvidia.com/gitweb/?p=linux-2.6.git;a=commit;h=266d1b7397284505e55d06254b497cb32be07b69

Signed-off-by: Doug Anderson <dianders@chromium.org>
---
 drivers/i2c/busses/i2c-tegra.c | 11 ++++++++---
 1 files changed, 8 insertions(+), 3 deletions(-)