Message ID | 20211026113409.7242-1-igormtorrente@gmail.com (mailing list archive) |
---|---|
Headers | show |
Series | Add new formats support to vkms | expand |
On Tue, 26 Oct 2021 08:34:00 -0300 Igor Torrente <igormtorrente@gmail.com> wrote: > Summary > ======= > This series of patches refactor some vkms components in order to introduce > new formats to the planes and writeback connector. > > Now in the blend function, the plane's pixels are converted to ARGB16161616 > and then blended together. > > The CRC is calculated based on the ARGB1616161616 buffer. And if required, > this buffer is copied/converted to the writeback buffer format. > > And to handle the pixel conversion, new functions were added to convert > from a specific format to ARGB16161616 (the reciprocal is also true). > > Tests > ===== > This patch series was tested using the following igt tests: > -t ".*kms_plane.*" > -t ".*kms_writeback.*" > -t ".*kms_cursor_crc*" > -t ".*kms_flip.*" > > New tests passing > ------------------- > - pipe-A-cursor-size-change > - pipe-A-cursor-alpha-transparent > > Performance > ----------- > Following some optimization proposed by Pekka Paalanen, now the code > runs way faster than V1 and slightly faster than the current implementation. > > | Frametime | > |:---------------:|:---------:|:--------------:|:------------:| > | implmentation | Current | Per-pixel(V1) | Per-line(V2) | > | frametime range | 8~22 ms | 32~56 ms | 6~19 ms | > | Average | 10.0 ms | 35.8 ms | 8.6 ms | Wow, that's much better than I expected. What is your benchmark? That is, what program do you use and what operations does it trigger to produce these measurements? What are the sizes of all the planes/buffers involved? What kind of CPU was this ran on? Thanks, pq > > Writeback test > -------------- > During the development of this patch series, I discovered that the > writeback-check-output test wasn't filling the plane correctly. > > So, currently, this patch series is failing in this test. But I sent a > patch to igt to fix it[1]. > > XRGB to ARGB behavior > ===================== > During the development, I decided to always fill the alpha channel of > the output pixel whenever the conversion from a format without an alpha > channel to ARGB16161616 is necessary. Therefore, I ignore the value > received from the XRGB and overwrite the value with 0xFFFF. > > --- > Igor Torrente (8): > drm: vkms: Replace the deprecated drm_mode_config_init > drm: vkms: Alloc the compose frame using vzalloc > drm: vkms: Replace hardcoded value of `vkms_composer.map` to > DRM_FORMAT_MAX_PLANES > drm: vkms: Add fb information to `vkms_writeback_job` > drm: drm_atomic_helper: Add a new helper to deal with the writeback > connector validation > drm: vkms: Refactor the plane composer to accept new formats > drm: vkms: Exposes ARGB_1616161616 and adds XRGB_16161616 formats > drm: vkms: Add support to the RGB565 format > > drivers/gpu/drm/drm_atomic_helper.c | 47 ++++ > drivers/gpu/drm/vkms/vkms_composer.c | 329 +++++++++++++++----------- > drivers/gpu/drm/vkms/vkms_drv.c | 6 +- > drivers/gpu/drm/vkms/vkms_drv.h | 14 +- > drivers/gpu/drm/vkms/vkms_formats.h | 252 ++++++++++++++++++++ > drivers/gpu/drm/vkms/vkms_plane.c | 17 +- > drivers/gpu/drm/vkms/vkms_writeback.c | 33 ++- > include/drm/drm_atomic_helper.h | 3 + > 8 files changed, 545 insertions(+), 156 deletions(-) > create mode 100644 drivers/gpu/drm/vkms/vkms_formats.h >
Hi Pekka, On Tue, Nov 9, 2021 at 6:32 AM Pekka Paalanen <ppaalanen@gmail.com> wrote: > > On Tue, 26 Oct 2021 08:34:00 -0300 > Igor Torrente <igormtorrente@gmail.com> wrote: > > > Summary > > ======= > > This series of patches refactor some vkms components in order to introduce > > new formats to the planes and writeback connector. > > > > Now in the blend function, the plane's pixels are converted to ARGB16161616 > > and then blended together. > > > > The CRC is calculated based on the ARGB1616161616 buffer. And if required, > > this buffer is copied/converted to the writeback buffer format. > > > > And to handle the pixel conversion, new functions were added to convert > > from a specific format to ARGB16161616 (the reciprocal is also true). > > > > Tests > > ===== > > This patch series was tested using the following igt tests: > > -t ".*kms_plane.*" > > -t ".*kms_writeback.*" > > -t ".*kms_cursor_crc*" > > -t ".*kms_flip.*" > > > > New tests passing > > ------------------- > > - pipe-A-cursor-size-change > > - pipe-A-cursor-alpha-transparent > > > > Performance > > ----------- > > Following some optimization proposed by Pekka Paalanen, now the code > > runs way faster than V1 and slightly faster than the current implementation. > > > > | Frametime | > > |:---------------:|:---------:|:--------------:|:------------:| > > | implmentation | Current | Per-pixel(V1) | Per-line(V2) | > > | frametime range | 8~22 ms | 32~56 ms | 6~19 ms | > > | Average | 10.0 ms | 35.8 ms | 8.6 ms | > > Wow, that's much better than I expected. > > What is your benchmark? That is, what program do you use and what > operations does it trigger to produce these measurements? What are the > sizes of all the planes/buffers involved? What kind of CPU was this ran > on? 1 and 2) I just measured the frametime of the IGT test ".*kms_cursor_crc*" using jiffies. I Collected all the frametimes, put all of them into a spreadsheet, calculated some values and drew some histograms. I mean, it is not the best benchmark, but at least give an idea of what is happening. 3) The primary plane was 1024x768, but the cursor plane varies between the tests. All XRGB_8888, if I'm not mistaken. 4) I tested it on a Qemu VM running on the Intel core i5 4440. ~3.3GHz > > > Thanks, > pq > > > > > Writeback test > > -------------- > > During the development of this patch series, I discovered that the > > writeback-check-output test wasn't filling the plane correctly. > > > > So, currently, this patch series is failing in this test. But I sent a > > patch to igt to fix it[1]. > > > > XRGB to ARGB behavior > > ===================== > > During the development, I decided to always fill the alpha channel of > > the output pixel whenever the conversion from a format without an alpha > > channel to ARGB16161616 is necessary. Therefore, I ignore the value > > received from the XRGB and overwrite the value with 0xFFFF. > > > > --- > > Igor Torrente (8): > > drm: vkms: Replace the deprecated drm_mode_config_init > > drm: vkms: Alloc the compose frame using vzalloc > > drm: vkms: Replace hardcoded value of `vkms_composer.map` to > > DRM_FORMAT_MAX_PLANES > > drm: vkms: Add fb information to `vkms_writeback_job` > > drm: drm_atomic_helper: Add a new helper to deal with the writeback > > connector validation > > drm: vkms: Refactor the plane composer to accept new formats > > drm: vkms: Exposes ARGB_1616161616 and adds XRGB_16161616 formats > > drm: vkms: Add support to the RGB565 format > > > > drivers/gpu/drm/drm_atomic_helper.c | 47 ++++ > > drivers/gpu/drm/vkms/vkms_composer.c | 329 +++++++++++++++----------- > > drivers/gpu/drm/vkms/vkms_drv.c | 6 +- > > drivers/gpu/drm/vkms/vkms_drv.h | 14 +- > > drivers/gpu/drm/vkms/vkms_formats.h | 252 ++++++++++++++++++++ > > drivers/gpu/drm/vkms/vkms_plane.c | 17 +- > > drivers/gpu/drm/vkms/vkms_writeback.c | 33 ++- > > include/drm/drm_atomic_helper.h | 3 + > > 8 files changed, 545 insertions(+), 156 deletions(-) > > create mode 100644 drivers/gpu/drm/vkms/vkms_formats.h > > >
On Wed, 10 Nov 2021 14:32:26 -0300 Igor Torrente <igormtorrente@gmail.com> wrote: > Hi Pekka, > > On Tue, Nov 9, 2021 at 6:32 AM Pekka Paalanen <ppaalanen@gmail.com> wrote: > > > > On Tue, 26 Oct 2021 08:34:00 -0300 > > Igor Torrente <igormtorrente@gmail.com> wrote: > > > > > Summary > > > ======= > > > This series of patches refactor some vkms components in order to introduce > > > new formats to the planes and writeback connector. > > > > > > Now in the blend function, the plane's pixels are converted to ARGB16161616 > > > and then blended together. > > > > > > The CRC is calculated based on the ARGB1616161616 buffer. And if required, > > > this buffer is copied/converted to the writeback buffer format. > > > > > > And to handle the pixel conversion, new functions were added to convert > > > from a specific format to ARGB16161616 (the reciprocal is also true). > > > > > > Tests > > > ===== > > > This patch series was tested using the following igt tests: > > > -t ".*kms_plane.*" > > > -t ".*kms_writeback.*" > > > -t ".*kms_cursor_crc*" > > > -t ".*kms_flip.*" > > > > > > New tests passing > > > ------------------- > > > - pipe-A-cursor-size-change > > > - pipe-A-cursor-alpha-transparent > > > > > > Performance > > > ----------- > > > Following some optimization proposed by Pekka Paalanen, now the code > > > runs way faster than V1 and slightly faster than the current implementation. > > > > > > | Frametime | > > > |:---------------:|:---------:|:--------------:|:------------:| > > > | implmentation | Current | Per-pixel(V1) | Per-line(V2) | > > > | frametime range | 8~22 ms | 32~56 ms | 6~19 ms | > > > | Average | 10.0 ms | 35.8 ms | 8.6 ms | > > > > Wow, that's much better than I expected. > > > > What is your benchmark? That is, what program do you use and what > > operations does it trigger to produce these measurements? What are the > > sizes of all the planes/buffers involved? What kind of CPU was this ran > > on? > > 1 and 2) I just measured the frametime of the IGT test ".*kms_cursor_crc*" > using jiffies. I Collected all the frametimes, put all of them into a > spreadsheet, calculated some values and drew some histograms. > > I mean, it is not the best benchmark, but at least give an idea of what > is happening. > > 3) The primary plane was 1024x768, but the cursor plane > varies between the tests. All XRGB_8888, if I'm not mistaken. > > 4) I tested it on a Qemu VM running on the Intel core i5 4440. ~3.3GHz Hi Igor, alright, that analysis sounds fine, even though varying cursor plane size is casting some ambiguity on the results. If you want to dig deeper into measuring this, I would suggest some scenarios if at all possible: - large primary plane and large cursor plane with 100% overlap, to measure the raw pixel throughput - large primary plane and small cursor plane with 100% overlap, to measure the efficiency of skipping pixels that do not need blending - large primary plane and large cursor plane with only a little overlap (cursor largely off-screen), to measure the efficiency of skipping pixels that do not contribute to the end result at all But that's only curiosity, I think your existing benchmarks sound perfectly fine as the difference is so big. Thanks, pq