Message ID | 20220818203308.439043-4-nicolas.dufresne@collabora.com (mailing list archive) |
---|---|
State | New, archived |
Series | [v1,1/3] media: cedrus: Fix watchdog race condition |
On 8/18/22 23:33, Nicolas Dufresne wrote:
> From: Dmitry Osipenko <dmitry.osipenko@collabora.com>
>
> The busy status bit may never de-assert if number of programmed skip
> bits is incorrect, resulting in a kernel hang because the bit is polled
> endlessly in the code. Fix it by adding timeout for the bit-polling.
> This problem is reproducible by setting the data_bit_offset field of
> the HEVC slice params to a wrong value by userspace.
>
> Cc: stable@vger.kernel.org
> Reported-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
> Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
> Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
> ---
>  drivers/staging/media/sunxi/cedrus/cedrus_h265.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_h265.c b/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
> index f703c585d91c5..f0bc118021b0a 100644
> --- a/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
> +++ b/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
> @@ -227,6 +227,7 @@ static void cedrus_h265_pred_weight_write(struct cedrus_dev *dev,
>  static void cedrus_h265_skip_bits(struct cedrus_dev *dev, int num)
>  {
>  	int count = 0;
> +	u32 reg;

This "reg" variable isn't needed anymore after switching to
cedrus_wait_for(). Sorry, I missed it :)

>
>  	while (count < num) {
>  		int tmp = min(num - count, 32);
> @@ -234,8 +235,9 @@ static void cedrus_h265_skip_bits(struct cedrus_dev *dev, int num)
>  		cedrus_write(dev, VE_DEC_H265_TRIGGER,
>  			     VE_DEC_H265_TRIGGER_FLUSH_BITS |
>  			     VE_DEC_H265_TRIGGER_TYPE_N_BITS(tmp));
> -		while (cedrus_read(dev, VE_DEC_H265_STATUS) & VE_DEC_H265_STATUS_VLD_BUSY)
> -			udelay(1);
> +
> +		if (cedrus_wait_for(dev, VE_DEC_H265_STATUS, VE_DEC_H265_STATUS_VLD_BUSY))
> +			dev_err_ratelimited(dev->dev, "timed out waiting to skip bits\n");
>
>  		count += tmp;
>  	}
On Thursday, 18 August 2022 at 23:39 +0300, Dmitry Osipenko wrote:
> On 8/18/22 23:33, Nicolas Dufresne wrote:
> > From: Dmitry Osipenko <dmitry.osipenko@collabora.com>
> >
> > The busy status bit may never de-assert if number of programmed skip
> > bits is incorrect, resulting in a kernel hang because the bit is polled
> > endlessly in the code. Fix it by adding timeout for the bit-polling.
> > This problem is reproducible by setting the data_bit_offset field of
> > the HEVC slice params to a wrong value by userspace.
> >
> > Cc: stable@vger.kernel.org
> > Reported-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
> > Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
> > Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
> > ---
> >  drivers/staging/media/sunxi/cedrus/cedrus_h265.c | 6 ++++--
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_h265.c b/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
> > index f703c585d91c5..f0bc118021b0a 100644
> > --- a/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
> > +++ b/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
> > @@ -227,6 +227,7 @@ static void cedrus_h265_pred_weight_write(struct cedrus_dev *dev,
> >  static void cedrus_h265_skip_bits(struct cedrus_dev *dev, int num)
> >  {
> >  	int count = 0;
> > +	u32 reg;
>
> This "reg" variable isn't needed anymore after switching to
> cedrus_wait_for(). Sorry, I missed it :)

Good catch, thanks, will fix.

> >
> >  	while (count < num) {
> >  		int tmp = min(num - count, 32);
> > @@ -234,8 +235,9 @@ static void cedrus_h265_skip_bits(struct cedrus_dev *dev, int num)
> >  		cedrus_write(dev, VE_DEC_H265_TRIGGER,
> >  			     VE_DEC_H265_TRIGGER_FLUSH_BITS |
> >  			     VE_DEC_H265_TRIGGER_TYPE_N_BITS(tmp));
> > -		while (cedrus_read(dev, VE_DEC_H265_STATUS) & VE_DEC_H265_STATUS_VLD_BUSY)
> > -			udelay(1);
> > +
> > +		if (cedrus_wait_for(dev, VE_DEC_H265_STATUS, VE_DEC_H265_STATUS_VLD_BUSY))
> > +			dev_err_ratelimited(dev->dev, "timed out waiting to skip bits\n");
> >
> >  		count += tmp;
> >  	}
On Thursday, 18 August 2022 at 22:33:08 CEST, Nicolas Dufresne wrote:
> From: Dmitry Osipenko <dmitry.osipenko@collabora.com>
>
> The busy status bit may never de-assert if number of programmed skip
> bits is incorrect, resulting in a kernel hang because the bit is polled
> endlessly in the code. Fix it by adding timeout for the bit-polling.
> This problem is reproducible by setting the data_bit_offset field of
> the HEVC slice params to a wrong value by userspace.
>
> Cc: stable@vger.kernel.org
> Reported-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
> Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
> Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>

A Fixes tag would be nice.

> ---
>  drivers/staging/media/sunxi/cedrus/cedrus_h265.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_h265.c b/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
> index f703c585d91c5..f0bc118021b0a 100644
> --- a/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
> +++ b/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
> @@ -227,6 +227,7 @@ static void cedrus_h265_pred_weight_write(struct cedrus_dev *dev,
>  static void cedrus_h265_skip_bits(struct cedrus_dev *dev, int num)
>  {
>  	int count = 0;
> +	u32 reg;
>
>  	while (count < num) {
>  		int tmp = min(num - count, 32);
> @@ -234,8 +235,9 @@ static void cedrus_h265_skip_bits(struct cedrus_dev *dev, int num)
>  		cedrus_write(dev, VE_DEC_H265_TRIGGER,
>  			     VE_DEC_H265_TRIGGER_FLUSH_BITS |
>  			     VE_DEC_H265_TRIGGER_TYPE_N_BITS(tmp));
> -		while (cedrus_read(dev, VE_DEC_H265_STATUS) & VE_DEC_H265_STATUS_VLD_BUSY)
> -			udelay(1);
> +
> +		if (cedrus_wait_for(dev, VE_DEC_H265_STATUS, VE_DEC_H265_STATUS_VLD_BUSY))
> +			dev_err_ratelimited(dev->dev, "timed out waiting to skip bits\n");

Reporting the issue is nice, but it would be better to propagate the error,
since there is no way to properly decode this slice if the above code block
fails.

Best regards,
Jernej

>
>  		count += tmp;
>  	}
On Friday, 19 August 2022 at 06:16 +0200, Jernej Škrabec wrote:
> On Thursday, 18 August 2022 at 22:33:08 CEST, Nicolas Dufresne wrote:
> > From: Dmitry Osipenko <dmitry.osipenko@collabora.com>
> >
> > The busy status bit may never de-assert if number of programmed skip
> > bits is incorrect, resulting in a kernel hang because the bit is polled
> > endlessly in the code. Fix it by adding timeout for the bit-polling.
> > This problem is reproducible by setting the data_bit_offset field of
> > the HEVC slice params to a wrong value by userspace.
> >
> > Cc: stable@vger.kernel.org
> > Reported-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
> > Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
> > Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
>
> A Fixes tag would be nice.
>
> > ---
> >  drivers/staging/media/sunxi/cedrus/cedrus_h265.c | 6 ++++--
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_h265.c b/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
> > index f703c585d91c5..f0bc118021b0a 100644
> > --- a/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
> > +++ b/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
> > @@ -227,6 +227,7 @@ static void cedrus_h265_pred_weight_write(struct cedrus_dev *dev,
> >  static void cedrus_h265_skip_bits(struct cedrus_dev *dev, int num)
> >  {
> >  	int count = 0;
> > +	u32 reg;
> >
> >  	while (count < num) {
> >  		int tmp = min(num - count, 32);
> > @@ -234,8 +235,9 @@ static void cedrus_h265_skip_bits(struct cedrus_dev *dev, int num)
> >  		cedrus_write(dev, VE_DEC_H265_TRIGGER,
> >  			     VE_DEC_H265_TRIGGER_FLUSH_BITS |
> >  			     VE_DEC_H265_TRIGGER_TYPE_N_BITS(tmp));
> > -		while (cedrus_read(dev, VE_DEC_H265_STATUS) & VE_DEC_H265_STATUS_VLD_BUSY)
> > -			udelay(1);
> > +
> > +		if (cedrus_wait_for(dev, VE_DEC_H265_STATUS, VE_DEC_H265_STATUS_VLD_BUSY))
> > +			dev_err_ratelimited(dev->dev, "timed out waiting to skip bits\n");
>
> Reporting the issue is nice, but it would be better to propagate the error,
> since there is no way to properly decode this slice if the above code block
> fails.

This mimics what was already there; mind if we do that later? The
propagation is going to be a lot more intrusive.

> Best regards,
> Jernej
>
> >
> >  		count += tmp;
> >  	}
On Friday, 19 August 2022 at 17:39:25 CEST, Nicolas Dufresne wrote:
> On Friday, 19 August 2022 at 06:16 +0200, Jernej Škrabec wrote:
> > On Thursday, 18 August 2022 at 22:33:08 CEST, Nicolas Dufresne wrote:
> > > From: Dmitry Osipenko <dmitry.osipenko@collabora.com>
> > >
> > > The busy status bit may never de-assert if number of programmed skip
> > > bits is incorrect, resulting in a kernel hang because the bit is polled
> > > endlessly in the code. Fix it by adding timeout for the bit-polling.
> > > This problem is reproducible by setting the data_bit_offset field of
> > > the HEVC slice params to a wrong value by userspace.
> > >
> > > Cc: stable@vger.kernel.org
> > > Reported-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
> > > Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
> > > Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
> >
> > A Fixes tag would be nice.
> >
> > > ---
> > >  drivers/staging/media/sunxi/cedrus/cedrus_h265.c | 6 ++++--
> > >  1 file changed, 4 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_h265.c b/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
> > > index f703c585d91c5..f0bc118021b0a 100644
> > > --- a/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
> > > +++ b/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
> > > @@ -227,6 +227,7 @@ static void cedrus_h265_pred_weight_write(struct cedrus_dev *dev,
> > >  static void cedrus_h265_skip_bits(struct cedrus_dev *dev, int num)
> > >  {
> > >  	int count = 0;
> > > +	u32 reg;
> > >
> > >  	while (count < num) {
> > >  		int tmp = min(num - count, 32);
> > > @@ -234,8 +235,9 @@ static void cedrus_h265_skip_bits(struct cedrus_dev *dev, int num)
> > >  		cedrus_write(dev, VE_DEC_H265_TRIGGER,
> > >  			     VE_DEC_H265_TRIGGER_FLUSH_BITS |
> > >  			     VE_DEC_H265_TRIGGER_TYPE_N_BITS(tmp));
> > > -		while (cedrus_read(dev, VE_DEC_H265_STATUS) & VE_DEC_H265_STATUS_VLD_BUSY)
> > > -			udelay(1);
> > > +
> > > +		if (cedrus_wait_for(dev, VE_DEC_H265_STATUS, VE_DEC_H265_STATUS_VLD_BUSY))
> > > +			dev_err_ratelimited(dev->dev, "timed out waiting to skip bits\n");
> >
> > Reporting the issue is nice, but it would be better to propagate the error,
> > since there is no way to properly decode this slice if the above code block
> > fails.
>
> This mimics what was already there; mind if we do that later? The
> propagation is going to be a lot more intrusive.

Since backporting fixes to kernels before 6.0 isn't a priority, viability
for backporting isn't that important. You would only need to return 0 or
-ETIMEDOUT and check for the error in only one location. That doesn't sound
very intrusive to me.

Best regards,
Jernej

> > Best regards,
> > Jernej
> >
> > >
> > >  		count += tmp;
> > >  	}
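The error propagation Jernej describes (return 0 or -ETIMEDOUT from the skip-bits helper and check it at a single call site) can be sketched as below. This is a minimal user-space illustration under stated assumptions, not the driver code: struct fake_dev, fake_trigger_flush(), fake_read_status(), fake_wait_for(), setup_slice(), the busy-bit position and the poll limit of 1000 are all hypothetical stand-ins for the real cedrus register accessors and for cedrus_wait_for().

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical values; the real driver defines its own register bits. */
#define FAKE_STATUS_VLD_BUSY	(1u << 8)
#define FAKE_POLL_LIMIT		1000

/* Simulated decoder: stands in for the memory-mapped hardware. */
struct fake_dev {
	int polls_until_idle;	/* reads before busy clears; < 0 means never */
	int pending;		/* countdown for the current flush trigger */
};

/* Start a flush of nbits; the simulation only re-arms the busy countdown. */
static void fake_trigger_flush(struct fake_dev *dev, int nbits)
{
	(void)nbits;
	dev->pending = dev->polls_until_idle;
}

/* Return the simulated status register; busy until the countdown hits 0. */
static uint32_t fake_read_status(struct fake_dev *dev)
{
	if (dev->pending < 0)
		return FAKE_STATUS_VLD_BUSY;	/* stuck busy forever */
	if (dev->pending > 0) {
		dev->pending--;
		return FAKE_STATUS_VLD_BUSY;
	}
	return 0;
}

/* Bounded poll: 0 once flag clears, -ETIMEDOUT after FAKE_POLL_LIMIT reads. */
static int fake_wait_for(struct fake_dev *dev, uint32_t flag)
{
	for (int i = 0; i < FAKE_POLL_LIMIT; i++) {
		if (!(fake_read_status(dev) & flag))
			return 0;
	}
	return -ETIMEDOUT;
}

/* Skip num bits in chunks of at most 32, stopping at the first timeout. */
static int skip_bits(struct fake_dev *dev, int num)
{
	int count = 0;

	while (count < num) {
		int tmp = (num - count < 32) ? num - count : 32;
		int ret;

		fake_trigger_flush(dev, tmp);
		ret = fake_wait_for(dev, FAKE_STATUS_VLD_BUSY);
		if (ret)
			return ret;	/* propagate instead of only logging */

		count += tmp;
	}
	return 0;
}

/* The single call site: give up on the slice when skipping bits fails. */
static int setup_slice(struct fake_dev *dev, int data_bit_offset)
{
	int ret = skip_bits(dev, data_bit_offset);

	if (ret) {
		fprintf(stderr, "timed out waiting to skip bits\n");
		return ret;
	}
	/* ... the remaining slice registers would be programmed here ... */
	return 0;
}
```

With a device whose busy bit clears after a few reads, setup_slice() returns 0; with one that stays busy, it returns -ETIMEDOUT after the bounded number of polls rather than spinning forever. How the real driver would report the failure to the rest of the decode path is not covered by this sketch.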
diff --git a/drivers/staging/media/sunxi/cedrus/cedrus_h265.c b/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
index f703c585d91c5..f0bc118021b0a 100644
--- a/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
+++ b/drivers/staging/media/sunxi/cedrus/cedrus_h265.c
@@ -227,6 +227,7 @@ static void cedrus_h265_pred_weight_write(struct cedrus_dev *dev,
 static void cedrus_h265_skip_bits(struct cedrus_dev *dev, int num)
 {
 	int count = 0;
+	u32 reg;
 
 	while (count < num) {
 		int tmp = min(num - count, 32);
@@ -234,8 +235,9 @@ static void cedrus_h265_skip_bits(struct cedrus_dev *dev, int num)
 		cedrus_write(dev, VE_DEC_H265_TRIGGER,
 			     VE_DEC_H265_TRIGGER_FLUSH_BITS |
 			     VE_DEC_H265_TRIGGER_TYPE_N_BITS(tmp));
-		while (cedrus_read(dev, VE_DEC_H265_STATUS) & VE_DEC_H265_STATUS_VLD_BUSY)
-			udelay(1);
+
+		if (cedrus_wait_for(dev, VE_DEC_H265_STATUS, VE_DEC_H265_STATUS_VLD_BUSY))
+			dev_err_ratelimited(dev->dev, "timed out waiting to skip bits\n");
 
 		count += tmp;
 	}