[RFC,0/8] media: hantro: Add 10-bit support

Message ID 20220227144926.3006585-1-jernej.skrabec@gmail.com (mailing list archive)

Message

Jernej Škrabec Feb. 27, 2022, 2:49 p.m. UTC
The first two patches add 10-bit formats to the UAPI, the third extends the
format filtering mechanism, the fourth fixes an incorrect assumption in the
postproc buffer size calculation, the fifth moves register configuration code
to its proper place, the sixth and seventh enable 10-bit VP9 decoding on the
Allwinner H6, and the last increases the core frequency on the Allwinner H6.
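
For readers unfamiliar with P010, here is a minimal sketch of the linear
layout as it is commonly described: each 10-bit sample sits in the upper bits
of a 16-bit little-endian word, with a full-resolution Y plane followed by an
interleaved CbCr plane subsampled 2x2. The names FOURCC, PIX_FMT_P010 and the
p010_* helpers are made up for illustration and are not taken from the
patches; the fourcc value and the absence of line padding are assumptions.

```c
#include <stddef.h>
#include <stdint.h>

/* Pack four characters into a little-endian fourcc code (illustrative macro). */
#define FOURCC(a, b, c, d) \
	((uint32_t)(a) | ((uint32_t)(b) << 8) | \
	 ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

/* Assumed code for linear P010; check videodev2.h for the real definition. */
#define PIX_FMT_P010 FOURCC('P', '0', '1', '0')

/* Store a 10-bit sample in the upper 10 bits of a 16-bit word. */
static inline uint16_t p010_pack_sample(uint16_t value10)
{
	return (uint16_t)((value10 & 0x3ff) << 6);
}

/* Recover the 10-bit sample from a 16-bit P010 word. */
static inline uint16_t p010_unpack_sample(uint16_t word)
{
	return (uint16_t)(word >> 6);
}

/*
 * Bytes needed for one linear P010 frame, assuming even dimensions and
 * no extra line padding (a real driver may align strides).
 */
static inline size_t p010_frame_size(size_t width, size_t height)
{
	size_t luma = width * height * 2;                   /* one 16-bit word per pixel */
	size_t chroma = (width / 2) * (height / 2) * 2 * 2; /* Cb and Cr words per 2x2 block */

	return luma + chroma;
}
```

Under these assumptions a frame takes 3 bytes per pixel, twice the 1.5 bytes
per pixel of 8-bit NV12, which is why buffer size calculations need to know
the bit depth.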

I'm sending this as an RFC to get some comments:
1. Format definitions: are the fourccs OK? Are the comments/descriptions OK?
2. Is the extended filtering mechanism OK?

I would also appreciate it if these patches were tested on more hardware.
Additionally, can someone test tiled P010?

Please take a look.

Best regards,
Jernej

Ezequiel Garcia (1):
  media: Add P010 tiled format

Jernej Skrabec (7):
  media: Add P010 format
  media: hantro: Support format filtering by depth
  media: hantro: postproc: Fix buffer size calculation
  media: hantro: postproc: Fix legacy regs configuration
  media: hantro: Store VP9 bit depth in context
  media: hantro: sunxi: Enable 10-bit decoding
  media: hantro: sunxi: Increase frequency

 drivers/media/v4l2-core/v4l2-common.c         |  3 ++
 drivers/media/v4l2-core/v4l2-ioctl.c          |  2 +
 drivers/staging/media/hantro/hantro.h         |  4 ++
 drivers/staging/media/hantro/hantro_drv.c     | 23 +++++++++
 .../staging/media/hantro/hantro_g2_vp9_dec.c  |  8 ---
 .../staging/media/hantro/hantro_postproc.c    | 34 ++++++++++---
 drivers/staging/media/hantro/hantro_v4l2.c    | 50 +++++++++++++++++--
 drivers/staging/media/hantro/hantro_v4l2.h    |  3 ++
 drivers/staging/media/hantro/sunxi_vpu_hw.c   | 13 ++++-
 include/uapi/linux/videodev2.h                |  2 +
 10 files changed, 122 insertions(+), 20 deletions(-)

Comments

Jernej Škrabec Feb. 27, 2022, 5:03 p.m. UTC | #1
On Sunday, 27 February 2022 at 15:49:18 CET, Jernej Skrabec wrote:
> First two patches add 10-bit formats to UAPI, third extends filtering
> mechanism, fourth fixes incorrect assumption, fifth moves register
> configuration code to proper place, sixth and seventh enable 10-bit
> VP9 decoding on Allwinner H6 and last increases core frequency on
> Allwinner H6.

FYI, an additional patch is needed for linear P010 output:
https://github.com/jernejsk/linux-1/commit/28338c00749b821819690c9fd548fd5c311682b5

With that, only the native format remains untested.

Regards,
Jernej

Benjamin Gaignard April 5, 2022, 4:07 p.m. UTC | #2
On 27/02/2022 at 15:49, Jernej Skrabec wrote:
> First two patches add 10-bit formats to UAPI, third extends filtering
> mechanism, fourth fixes incorrect assumption, fifth moves register
> configuration code to proper place, sixth and seventh enable 10-bit
> VP9 decoding on Allwinner H6 and last increases core frequency on
> Allwinner H6.
>
> I'm sending this as RFC to get some comments:
> 1. format definitions - are fourcc's ok? are comments/descriptions ok?
> 2. is extended filtering mechanism ok?
>
> I would also like if these patches are tested on some more HW.
> Additionally, can someone test tiled P010?
>
> Please take a look.

Hi Jernej,

I have created a branch to test this series with VP9 and HEVC:
https://gitlab.collabora.com/benjamin.gaignard/for-upstream/-/tree/10bit_imx8m
Feel free to pick what you may need from it.

That doesn't improve fluster scores. I think more development is still needed
in GST before getting something fully functional.
Anyway, I am able to select the P010 pixel format if the input is a 10-bit
bitstream.

Regards,
Benjamin

Jernej Škrabec April 5, 2022, 6:40 p.m. UTC | #3
Hi Benjamin!

On Tuesday, 5 April 2022 at 18:07:41 CEST, Benjamin Gaignard wrote:
> On 27/02/2022 at 15:49, Jernej Skrabec wrote:
> > First two patches add 10-bit formats to UAPI, third extends filtering
> > mechanism, fourth fixes incorrect assumption, fifth moves register
> > configuration code to proper place, sixth and seventh enable 10-bit
> > VP9 decoding on Allwinner H6 and last increases core frequency on
> > Allwinner H6.
> > 
> > I'm sending this as RFC to get some comments:
> > 1. format definitions - are fourcc's ok? are comments/descriptions ok?
> > 2. is extended filtering mechanism ok?
> > 
> > I would also like if these patches are tested on some more HW.
> > Additionally, can someone test tiled P010?
> > 
> > Please take a look.
> 
> Hi Jernej,
> 
> I have create a branch to test this series with VP9 and HEVC:
> https://gitlab.collabora.com/benjamin.gaignard/for-upstream/-/tree/10bit_imx8m
> Feel free to pick what I may need in it.
> 
> That doesn't improve fluster scores, I think more dev are still needed in
> GST before getting something fully functional.
> Anyway I able to select P010 pixel format if the input is a 10bit bitstream.

What kind of improvements do you expect? This series is designed to change
nothing for platforms where a 10-bit format is not added to the list of
supported formats. I think the reasons are quite obvious. First, not every
device may support 10-bit output. Second, as you might have already figured
out, the registers in this series are set only for legacy cores. I have no
idea what needs to be done for newer ones, since I don't have them. Anyway, I
tested this with fluster and only one additional test passes, because it is
the only one for 10-bit YUV420.
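
The "changes nothing unless a 10-bit format is listed" behaviour can be
illustrated with a rough sketch of depth-based filtering. This is not the code
from the series: struct example_fmt, the two tables and both helper functions
are hypothetical names invented for this example, and the real driver
structures differ.

```c
#include <stddef.h>

/* Hypothetical format descriptor; the real driver structure differs. */
struct example_fmt {
	const char *name;       /* human-readable name, e.g. "NV12" */
	unsigned int bit_depth; /* bits per sample the format carries */
};

/* Hypothetical table for a platform that never opted in to 10-bit. */
static const struct example_fmt example_8bit_only[] = {
	{ "NV12", 8 },
};

/* Hypothetical table for a platform that lists a 10-bit format. */
static const struct example_fmt example_with_10bit[] = {
	{ "NV12", 8 },
	{ "P010", 10 },
};

/* Count the formats whose depth matches the bitstream depth. */
static size_t count_formats_for_depth(const struct example_fmt *fmts,
				      size_t n, unsigned int depth)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++)
		if (fmts[i].bit_depth == depth)
			count++;
	return count;
}

/* Return the first format matching the depth, or NULL if none is listed. */
static const struct example_fmt *
first_format_for_depth(const struct example_fmt *fmts, size_t n,
		       unsigned int depth)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (fmts[i].bit_depth == depth)
			return &fmts[i];
	return NULL;
}
```

With the 8-bit-only table, a 10-bit bitstream finds no matching format and
8-bit streams see exactly the same list as before; only a table that
explicitly lists a 10-bit entry exposes one.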

Best regards,
Jernej
Benjamin Gaignard April 6, 2022, 6:54 a.m. UTC | #4
On 05/04/2022 at 20:40, Jernej Škrabec wrote:
> Hi Benjamin!
>
> On Tuesday, 5 April 2022 at 18:07:41 CEST, Benjamin Gaignard wrote:
>> On 27/02/2022 at 15:49, Jernej Skrabec wrote:
>>> First two patches add 10-bit formats to UAPI, third extends filtering
>>> mechanism, fourth fixes incorrect assumption, fifth moves register
>>> configuration code to proper place, sixth and seventh enable 10-bit
>>> VP9 decoding on Allwinner H6 and last increases core frequency on
>>> Allwinner H6.
>>>
>>> I'm sending this as RFC to get some comments:
>>> 1. format definitions - are fourcc's ok? are comments/descriptions ok?
>>> 2. is extended filtering mechanism ok?
>>>
>>> I would also like if these patches are tested on some more HW.
>>> Additionally, can someone test tiled P010?
>>>
>>> Please take a look.
>> Hi Jernej,
>>
>> I have create a branch to test this series with VP9 and HEVC:
>> https://gitlab.collabora.com/benjamin.gaignard/for-upstream/-/tree/10bit_imx8m
>> Feel free to pick what I may need in it.
>>
>> That doesn't improve fluster scores, I think more dev are still needed in
>> GST before getting something fully functional.
>> Anyway I able to select P010 pixel format if the input is a 10bit bitstream.
> What kind of improvements do you expect? Actually, this series is designed to
> change nothing for platforms, where 10-bit format is not added into the list
> of supported formats. I think reasons are quite obvious. First, not every
> device may support 10-bit output. Second, as you might already figured it out,
> registers in this series are set only for legacy cores. I have no idea, what
> needs to be done for newer ones, since I don't have them. Anyway, I tested
> this with fluster and only one additional test passes, because it is the only
> one for 10-bit YUV420.

In this series you will find that I have added the registers for the new cores,
fixed HEVC so that it can use 10-bit, and enabled that on IMX8M.

Regards,
Benjamin

Jernej Škrabec April 6, 2022, 5:21 p.m. UTC | #5
On Wednesday, 6 April 2022 at 08:54:07 CEST, Benjamin Gaignard wrote:
> On 05/04/2022 at 20:40, Jernej Škrabec wrote:
> > Hi Benjamin!
> > 
> > On Tuesday, 5 April 2022 at 18:07:41 CEST, Benjamin Gaignard wrote:
> >> On 27/02/2022 at 15:49, Jernej Skrabec wrote:
> >>> First two patches add 10-bit formats to UAPI, third extends filtering
> >>> mechanism, fourth fixes incorrect assumption, fifth moves register
> >>> configuration code to proper place, sixth and seventh enable 10-bit
> >>> VP9 decoding on Allwinner H6 and last increases core frequency on
> >>> Allwinner H6.
> >>> 
> >>> I'm sending this as RFC to get some comments:
> >>> 1. format definitions - are fourcc's ok? are comments/descriptions ok?
> >>> 2. is extended filtering mechanism ok?
> >>> 
> >>> I would also like if these patches are tested on some more HW.
> >>> Additionally, can someone test tiled P010?
> >>> 
> >>> Please take a look.
> >> 
> >> Hi Jernej,
> >> 
> >> I have create a branch to test this series with VP9 and HEVC:
> >> https://gitlab.collabora.com/benjamin.gaignard/for-upstream/-/tree/10bit_imx8m
> >> Feel free to pick what I may need in it.
> >> 
> >> That doesn't improve fluster scores, I think more dev are still needed in
> >> GST before getting something fully functional.
> >> Anyway I able to select P010 pixel format if the input is a 10bit
> >> bitstream.
> > 
> > What kind of improvements do you expect? Actually, this series is designed
> > to change nothing for platforms, where 10-bit format is not added into
> > the list of supported formats. I think reasons are quite obvious. First,
> > not every device may support 10-bit output. Second, as you might already
> > figured it out, registers in this series are set only for legacy cores. I
> > have no idea, what needs to be done for newer ones, since I don't have
> > them. Anyway, I tested this with fluster and only one additional test
> > passes, because it is the only one for 10-bit YUV420.
> 
> In this series you will find that I have added the registers for the new
> cores, fix hevc to be able to use 10-bit, and enable that in IMX8M.

Your changes seem reasonable, but at this point I wouldn't bother with
fluster. Instead, try to test with one specific bitstream or even a sample
video file. I just tested with one random 10-bit VP9 video that I found when
working on this series. That way you avoid the corner cases which sometimes
plague fluster testing (reference bitstreams smaller than the minimum
supported size). Anyway, re-check the vendor library to see if there is any
other place that needs adjusting for 10-bit.

Best regards,
Jernej

Nicolas Dufresne April 6, 2022, 5:50 p.m. UTC | #6
On Wednesday, 6 April 2022 at 19:21 +0200, Jernej Škrabec wrote:
> On Wednesday, 6 April 2022 at 08:54:07 CEST, Benjamin Gaignard wrote:
> > On 05/04/2022 at 20:40, Jernej Škrabec wrote:
> > > Hi Benjamin!
> > > 
> > > On Tuesday, 5 April 2022 at 18:07:41 CEST, Benjamin Gaignard wrote:
> > > > On 27/02/2022 at 15:49, Jernej Skrabec wrote:
> > > > > First two patches add 10-bit formats to UAPI, third extends filtering
> > > > > mechanism, fourth fixes incorrect assumption, fifth moves register
> > > > > configuration code to proper place, sixth and seventh enable 10-bit
> > > > > VP9 decoding on Allwinner H6 and last increases core frequency on
> > > > > Allwinner H6.
> > > > > 
> > > > > I'm sending this as RFC to get some comments:
> > > > > 1. format definitions - are fourcc's ok? are comments/descriptions ok?
> > > > > 2. is extended filtering mechanism ok?
> > > > > 
> > > > > I would also like if these patches are tested on some more HW.
> > > > > Additionally, can someone test tiled P010?
> > > > > 
> > > > > Please take a look.
> > > > 
> > > > Hi Jernej,
> > > > 
> > > > I have create a branch to test this series with VP9 and HEVC:
> > > > https://gitlab.collabora.com/benjamin.gaignard/for-upstream/-/tree/10bit_imx8m
> > > > Feel free to pick what I may need in it.
> > > > 
> > > > That doesn't improve fluster scores, I think more dev are still needed in
> > > > GST before getting something fully functional.
> > > > Anyway I able to select P010 pixel format if the input is a 10bit
> > > > bitstream.
> > > 
> > > What kind of improvements do you expect? Actually, this series is designed
> > > to change nothing for platforms, where 10-bit format is not added into
> > > the list of supported formats. I think reasons are quite obvious. First,
> > > not every device may support 10-bit output. Second, as you might already
> > > figured it out, registers in this series are set only for legacy cores. I
> > > have no idea, what needs to be done for newer ones, since I don't have
> > > them. Anyway, I tested this with fluster and only one additional test
> > > passes, because it is the only one for 10-bit YUV420.
> > 
> > In this series you will find that I have added the registers for the new
> > cores, fix hevc to be able to use 10-bit, and enable that in IMX8M.
> 
> Your changes seems reasonable, but at this point I wouldn't bother with 
> fluster. Instead, try to test with one specific bitstream or even a sample video 
> file. I just tested with one random 10-bit VP9 video that I found when working 
> on this series. That way you avoid any corner cases which sometimes plaque 
> fluster testing (reference bitstreams smaller than min. supported size). 
> Anyway, re-check vendor lib if there is any other place to adjust something 
> for 10-bit.

Just so we don't forget, there are a handful of 10-bit tests that Daniel
Almeida omitted when he added tests to fluster (though only 1 is 4:2:0). I
will try to fix that later on. There is otherwise 5G worth of 10-bit tests
available. In fluster we decided to go for the same subset libvpx uses,
otherwise no one would ever want to download these tests.

https://storage.googleapis.com/downloads.webmproject.org/vp9/decoder-test-streams/Profile_2_10bit.

About the "min supported" size: the G2 VP9 score is 157/303 here (in
comparison, rkvdec is 225 and MTK VCODEC is 275). At this failure level, this
no longer has anything to do with the size of the render. There are likely a
couple of bugs hidden in the driver for the corner cases tested by the suite.
Also, to illustrate that the size isn't the only variable in the failures, we
have vp90-2-02-size-64x34.webm, which passes (the driver pretends that 64x64
is the minimum). I didn't look at the G2 output very closely, but on RKVDEC,
in similar failures, we have a perfect keyframe and a single corrupted tile on
the following decode. My belief is that there are bugs in the drivers to be
found and fixed. In the absence of vendor support or a working reference, it
will be difficult, or nearly impossible, to fix, but I'm documenting this so
we stop thinking this is just "not supported".
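
To put those scores in perspective, here is a small sketch that turns them
into pass rates. It assumes all three decoders were run against the same 303
test vectors, which the numbers above suggest but do not state outright;
pass_rate_percent is a name made up for this example.

```c
/* Percentage of passing test vectors; returns 0 for an empty suite. */
static double pass_rate_percent(unsigned int passed, unsigned int total)
{
	if (total == 0)
		return 0.0;
	return 100.0 * (double)passed / (double)total;
}
```

Under that assumption, pass_rate_percent(157, 303) is roughly 51.8 for G2,
against roughly 74.3 for rkvdec (225) and 90.8 for MTK VCODEC (275).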

cheers,
Nicolas