mbox series

[v2,0/8] Add new formats support to vkms

Message ID 20211026113409.7242-1-igormtorrente@gmail.com (mailing list archive)
Headers show
Series Add new formats support to vkms | expand

Message

Igor Matheus Andrade Torrente Oct. 26, 2021, 11:34 a.m. UTC
Summary
=======
This series of patches refactor some vkms components in order to introduce
new formats to the planes and writeback connector.

Now in the blend function, the plane's pixels are converted to ARGB16161616
and then blended together.

The CRC is calculated based on the ARGB1616161616 buffer. And if required,
this buffer is copied/converted to the writeback buffer format.

And to handle the pixel conversion, new functions were added to convert
from a specific format to ARGB16161616 (the reciprocal is also true).

Tests
=====
This patch series was tested using the following igt tests:
-t ".*kms_plane.*"
-t ".*kms_writeback.*"
-t ".*kms_cursor_crc*"
-t ".*kms_flip.*"

New tests passing
-------------------
- pipe-A-cursor-size-change
- pipe-A-cursor-alpha-transparent

Performance
-----------
Following some optimization proposed by Pekka Paalanen, now the code
runs way faster than V1 and slightly faster than the current implementation.

|                          Frametime                          |
|:---------------:|:---------:|:--------------:|:------------:|
|  implmentation  |  Current  |  Per-pixel(V1) | Per-line(V2) |
| frametime range |  8~22 ms  |    32~56 ms    |    6~19 ms   |
|     Average     |  10.0 ms  |     35.8 ms    |    8.6 ms    |

Writeback test
--------------
During the development of this patch series, I discovered that the
writeback-check-output test wasn't filling the plane correctly.

So, currently, this patch series is failing in this test. But I sent a
patch to igt to fix it[1].

XRGB to ARGB behavior
=====================
During the development, I decided to always fill the alpha channel of
the output pixel whenever the conversion from a format without an alpha
channel to ARGB16161616 is necessary. Therefore, I ignore the value
received from the XRGB and overwrite the value with 0xFFFF.

---
Igor Torrente (8):
  drm: vkms: Replace the deprecated drm_mode_config_init
  drm: vkms: Alloc the compose frame using vzalloc
  drm: vkms: Replace hardcoded value of `vkms_composer.map` to
    DRM_FORMAT_MAX_PLANES
  drm: vkms: Add fb information to `vkms_writeback_job`
  drm: drm_atomic_helper: Add a new helper to deal with the writeback
    connector validation
  drm: vkms: Refactor the plane composer to accept new formats
  drm: vkms: Exposes ARGB_1616161616 and adds XRGB_16161616 formats
  drm: vkms: Add support to the RGB565 format

 drivers/gpu/drm/drm_atomic_helper.c   |  47 ++++
 drivers/gpu/drm/vkms/vkms_composer.c  | 329 +++++++++++++++-----------
 drivers/gpu/drm/vkms/vkms_drv.c       |   6 +-
 drivers/gpu/drm/vkms/vkms_drv.h       |  14 +-
 drivers/gpu/drm/vkms/vkms_formats.h   | 252 ++++++++++++++++++++
 drivers/gpu/drm/vkms/vkms_plane.c     |  17 +-
 drivers/gpu/drm/vkms/vkms_writeback.c |  33 ++-
 include/drm/drm_atomic_helper.h       |   3 +
 8 files changed, 545 insertions(+), 156 deletions(-)
 create mode 100644 drivers/gpu/drm/vkms/vkms_formats.h

Comments

Pekka Paalanen Nov. 9, 2021, 9:32 a.m. UTC | #1
On Tue, 26 Oct 2021 08:34:00 -0300
Igor Torrente <igormtorrente@gmail.com> wrote:

> Summary
> =======
> This series of patches refactor some vkms components in order to introduce
> new formats to the planes and writeback connector.
> 
> Now in the blend function, the plane's pixels are converted to ARGB16161616
> and then blended together.
> 
> The CRC is calculated based on the ARGB1616161616 buffer. And if required,
> this buffer is copied/converted to the writeback buffer format.
> 
> And to handle the pixel conversion, new functions were added to convert
> from a specific format to ARGB16161616 (the reciprocal is also true).
> 
> Tests
> =====
> This patch series was tested using the following igt tests:
> -t ".*kms_plane.*"
> -t ".*kms_writeback.*"
> -t ".*kms_cursor_crc*"
> -t ".*kms_flip.*"
> 
> New tests passing
> -------------------
> - pipe-A-cursor-size-change
> - pipe-A-cursor-alpha-transparent
> 
> Performance
> -----------
> Following some optimization proposed by Pekka Paalanen, now the code
> runs way faster than V1 and slightly faster than the current implementation.
> 
> |                          Frametime                          |
> |:---------------:|:---------:|:--------------:|:------------:|
> |  implmentation  |  Current  |  Per-pixel(V1) | Per-line(V2) |
> | frametime range |  8~22 ms  |    32~56 ms    |    6~19 ms   |
> |     Average     |  10.0 ms  |     35.8 ms    |    8.6 ms    |

Wow, that's much better than I expected.

What is your benchmark? That is, what program do you use and what
operations does it trigger to produce these measurements? What are the
sizes of all the planes/buffers involved? What kind of CPU was this ran
on?


Thanks,
pq

> 
> Writeback test
> --------------
> During the development of this patch series, I discovered that the
> writeback-check-output test wasn't filling the plane correctly.
> 
> So, currently, this patch series is failing in this test. But I sent a
> patch to igt to fix it[1].
> 
> XRGB to ARGB behavior
> =====================
> During the development, I decided to always fill the alpha channel of
> the output pixel whenever the conversion from a format without an alpha
> channel to ARGB16161616 is necessary. Therefore, I ignore the value
> received from the XRGB and overwrite the value with 0xFFFF.
> 
> ---
> Igor Torrente (8):
>   drm: vkms: Replace the deprecated drm_mode_config_init
>   drm: vkms: Alloc the compose frame using vzalloc
>   drm: vkms: Replace hardcoded value of `vkms_composer.map` to
>     DRM_FORMAT_MAX_PLANES
>   drm: vkms: Add fb information to `vkms_writeback_job`
>   drm: drm_atomic_helper: Add a new helper to deal with the writeback
>     connector validation
>   drm: vkms: Refactor the plane composer to accept new formats
>   drm: vkms: Exposes ARGB_1616161616 and adds XRGB_16161616 formats
>   drm: vkms: Add support to the RGB565 format
> 
>  drivers/gpu/drm/drm_atomic_helper.c   |  47 ++++
>  drivers/gpu/drm/vkms/vkms_composer.c  | 329 +++++++++++++++-----------
>  drivers/gpu/drm/vkms/vkms_drv.c       |   6 +-
>  drivers/gpu/drm/vkms/vkms_drv.h       |  14 +-
>  drivers/gpu/drm/vkms/vkms_formats.h   | 252 ++++++++++++++++++++
>  drivers/gpu/drm/vkms/vkms_plane.c     |  17 +-
>  drivers/gpu/drm/vkms/vkms_writeback.c |  33 ++-
>  include/drm/drm_atomic_helper.h       |   3 +
>  8 files changed, 545 insertions(+), 156 deletions(-)
>  create mode 100644 drivers/gpu/drm/vkms/vkms_formats.h
>
Igor Matheus Andrade Torrente Nov. 10, 2021, 5:32 p.m. UTC | #2
Hi Pekka,

On Tue, Nov 9, 2021 at 6:32 AM Pekka Paalanen <ppaalanen@gmail.com> wrote:
>
> On Tue, 26 Oct 2021 08:34:00 -0300
> Igor Torrente <igormtorrente@gmail.com> wrote:
>
> > Summary
> > =======
> > This series of patches refactor some vkms components in order to introduce
> > new formats to the planes and writeback connector.
> >
> > Now in the blend function, the plane's pixels are converted to ARGB16161616
> > and then blended together.
> >
> > The CRC is calculated based on the ARGB1616161616 buffer. And if required,
> > this buffer is copied/converted to the writeback buffer format.
> >
> > And to handle the pixel conversion, new functions were added to convert
> > from a specific format to ARGB16161616 (the reciprocal is also true).
> >
> > Tests
> > =====
> > This patch series was tested using the following igt tests:
> > -t ".*kms_plane.*"
> > -t ".*kms_writeback.*"
> > -t ".*kms_cursor_crc*"
> > -t ".*kms_flip.*"
> >
> > New tests passing
> > -------------------
> > - pipe-A-cursor-size-change
> > - pipe-A-cursor-alpha-transparent
> >
> > Performance
> > -----------
> > Following some optimization proposed by Pekka Paalanen, now the code
> > runs way faster than V1 and slightly faster than the current implementation.
> >
> > |                          Frametime                          |
> > |:---------------:|:---------:|:--------------:|:------------:|
> > |  implmentation  |  Current  |  Per-pixel(V1) | Per-line(V2) |
> > | frametime range |  8~22 ms  |    32~56 ms    |    6~19 ms   |
> > |     Average     |  10.0 ms  |     35.8 ms    |    8.6 ms    |
>
> Wow, that's much better than I expected.
>
> What is your benchmark? That is, what program do you use and what
> operations does it trigger to produce these measurements? What are the
> sizes of all the planes/buffers involved? What kind of CPU was this ran
> on?

1 and 2) I just measured the frametime of the IGT test ".*kms_cursor_crc*"
using jiffies. I Collected all the frametimes, put all of them into a
spreadsheet, calculated some values and drew some histograms.

I mean, it is not the best benchmark, but at least give an idea of what
is happening.

3) The primary plane was 1024x768, but the cursor plane
varies between the tests. All XRGB_8888, if I'm not mistaken.

4) I tested it on a Qemu VM running on the Intel core i5 4440. ~3.3GHz

>
>
> Thanks,
> pq
>
> >
> > Writeback test
> > --------------
> > During the development of this patch series, I discovered that the
> > writeback-check-output test wasn't filling the plane correctly.
> >
> > So, currently, this patch series is failing in this test. But I sent a
> > patch to igt to fix it[1].
> >
> > XRGB to ARGB behavior
> > =====================
> > During the development, I decided to always fill the alpha channel of
> > the output pixel whenever the conversion from a format without an alpha
> > channel to ARGB16161616 is necessary. Therefore, I ignore the value
> > received from the XRGB and overwrite the value with 0xFFFF.
> >
> > ---
> > Igor Torrente (8):
> >   drm: vkms: Replace the deprecated drm_mode_config_init
> >   drm: vkms: Alloc the compose frame using vzalloc
> >   drm: vkms: Replace hardcoded value of `vkms_composer.map` to
> >     DRM_FORMAT_MAX_PLANES
> >   drm: vkms: Add fb information to `vkms_writeback_job`
> >   drm: drm_atomic_helper: Add a new helper to deal with the writeback
> >     connector validation
> >   drm: vkms: Refactor the plane composer to accept new formats
> >   drm: vkms: Exposes ARGB_1616161616 and adds XRGB_16161616 formats
> >   drm: vkms: Add support to the RGB565 format
> >
> >  drivers/gpu/drm/drm_atomic_helper.c   |  47 ++++
> >  drivers/gpu/drm/vkms/vkms_composer.c  | 329 +++++++++++++++-----------
> >  drivers/gpu/drm/vkms/vkms_drv.c       |   6 +-
> >  drivers/gpu/drm/vkms/vkms_drv.h       |  14 +-
> >  drivers/gpu/drm/vkms/vkms_formats.h   | 252 ++++++++++++++++++++
> >  drivers/gpu/drm/vkms/vkms_plane.c     |  17 +-
> >  drivers/gpu/drm/vkms/vkms_writeback.c |  33 ++-
> >  include/drm/drm_atomic_helper.h       |   3 +
> >  8 files changed, 545 insertions(+), 156 deletions(-)
> >  create mode 100644 drivers/gpu/drm/vkms/vkms_formats.h
> >
>
Pekka Paalanen Nov. 11, 2021, 8:32 a.m. UTC | #3
On Wed, 10 Nov 2021 14:32:26 -0300
Igor Torrente <igormtorrente@gmail.com> wrote:

> Hi Pekka,
> 
> On Tue, Nov 9, 2021 at 6:32 AM Pekka Paalanen <ppaalanen@gmail.com> wrote:
> >
> > On Tue, 26 Oct 2021 08:34:00 -0300
> > Igor Torrente <igormtorrente@gmail.com> wrote:
> >  
> > > Summary
> > > =======
> > > This series of patches refactor some vkms components in order to introduce
> > > new formats to the planes and writeback connector.
> > >
> > > Now in the blend function, the plane's pixels are converted to ARGB16161616
> > > and then blended together.
> > >
> > > The CRC is calculated based on the ARGB1616161616 buffer. And if required,
> > > this buffer is copied/converted to the writeback buffer format.
> > >
> > > And to handle the pixel conversion, new functions were added to convert
> > > from a specific format to ARGB16161616 (the reciprocal is also true).
> > >
> > > Tests
> > > =====
> > > This patch series was tested using the following igt tests:
> > > -t ".*kms_plane.*"
> > > -t ".*kms_writeback.*"
> > > -t ".*kms_cursor_crc*"
> > > -t ".*kms_flip.*"
> > >
> > > New tests passing
> > > -------------------
> > > - pipe-A-cursor-size-change
> > > - pipe-A-cursor-alpha-transparent
> > >
> > > Performance
> > > -----------
> > > Following some optimization proposed by Pekka Paalanen, now the code
> > > runs way faster than V1 and slightly faster than the current implementation.
> > >
> > > |                          Frametime                          |
> > > |:---------------:|:---------:|:--------------:|:------------:|
> > > |  implmentation  |  Current  |  Per-pixel(V1) | Per-line(V2) |
> > > | frametime range |  8~22 ms  |    32~56 ms    |    6~19 ms   |
> > > |     Average     |  10.0 ms  |     35.8 ms    |    8.6 ms    |  
> >
> > Wow, that's much better than I expected.
> >
> > What is your benchmark? That is, what program do you use and what
> > operations does it trigger to produce these measurements? What are the
> > sizes of all the planes/buffers involved? What kind of CPU was this ran
> > on?  
> 
> 1 and 2) I just measured the frametime of the IGT test ".*kms_cursor_crc*"
> using jiffies. I Collected all the frametimes, put all of them into a
> spreadsheet, calculated some values and drew some histograms.
> 
> I mean, it is not the best benchmark, but at least give an idea of what
> is happening.
> 
> 3) The primary plane was 1024x768, but the cursor plane
> varies between the tests. All XRGB_8888, if I'm not mistaken.
> 
> 4) I tested it on a Qemu VM running on the Intel core i5 4440. ~3.3GHz

Hi Igor,

alright, that analysis sounds fine, even though varying cursor plane
size is casting some ambiguity on the results.

If you want to dig deeper into measuring this, I would suggest some
scenarios if at all possible:

- large primary plane and large cursor plane with 100% overlap, to
  measure the raw pixel throughput

- large primary plane and small cursor plane with 100% overlap, to
  measure the efficiency of skipping pixels that do not need blending

- large primary plane and large cursor plane with only a little
  overlap (cursor largely off-screen), to measure the efficiency of
  skipping pixels that do not contribute to the end result at all

But that's only curiosity, I think your existing benchmarks sound
perfectly fine as the difference is so big.


Thanks,
pq