[0/8] drm/format-helper: Add helpers for line conversion

Message ID 20250325110407.81107-1-tzimmermann@suse.de (mailing list archive)

Thomas Zimmermann March 25, 2025, 10:31 a.m. UTC
Add helpers for converting the pixel format of a scanline. This used
to be implemented for each format individually, but only the per-pixel
code is really different among formats.

Patch 1 moves the per-pixel format helpers from drm_draw.c to an
internal header. These functions have equivalents in the format-helper
code. Just update the interface to make them exportable and add some
comments.

Patches 2 to 5 add generic line-conversion helpers. Each takes a
scanline whose pixels have a specific number of bits per pixel and
converts it to a scanline with a specific number of bits per pixel.
Conversion of each individual pixel is done by the provided per-pixel
helper that was extracted from drm_draw.c.
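
To illustrate the split between line helper and per-pixel helper, here
is a minimal userspace sketch of a 32-to-16-bpp conversion. The names
sketch_pixel_xrgb8888_to_rgb565() and sketch_line_32to16() are
hypothetical and not taken from the patches, and the sketch ignores
the endianness handling that kernel code would need.

```c
#include <stddef.h>
#include <stdint.h>

/* Per-pixel helper (hypothetical name): XRGB8888 to RGB565. */
static inline uint32_t sketch_pixel_xrgb8888_to_rgb565(uint32_t pix)
{
	return ((pix & 0x00f80000) >> 8) |	/* top 5 bits of red */
	       ((pix & 0x0000fc00) >> 5) |	/* top 6 bits of green */
	       ((pix & 0x000000f8) >> 3);	/* top 5 bits of blue */
}

/*
 * Generic line helper (hypothetical name): walks a scanline of 32-bpp
 * pixels and stores 16-bpp pixels. Only the per-pixel callback knows
 * about the actual color formats.
 */
static inline void sketch_line_32to16(uint16_t *dst, const uint32_t *src,
				      size_t pixels,
				      uint32_t (*xfrm_pixel)(uint32_t))
{
	for (size_t i = 0; i < pixels; i++)
		dst[i] = (uint16_t)xfrm_pixel(src[i]);
}
```

A different 32-to-16-bpp conversion would reuse sketch_line_32to16()
and only pass another per-pixel function.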

Patches 6 to 8 slightly optimize the line-conversion helpers by
storing multiple pixels at once. With the current code, there are
too many functions to make optimization feasible. But with the new
helpers, there are only a few places to optimize to benefit all use
cases.
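
As a rough illustration of storing multiple pixels at once, the
following userspace sketch packs four RGB565 pixels into one 64-bit
value and writes them with a single store. The function names are
hypothetical, and the sketch assumes a little-endian host. It uses
memcpy() to sidestep alignment concerns, whereas the actual patches may
handle alignment and byte order differently.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Per-pixel helper (hypothetical name): XRGB8888 to RGB565. */
static inline uint16_t sketch_xrgb8888_to_rgb565(uint32_t pix)
{
	return (uint16_t)(((pix & 0x00f80000) >> 8) |
			  ((pix & 0x0000fc00) >> 5) |
			  ((pix & 0x000000f8) >> 3));
}

/* Line helper (hypothetical name) that stores four pixels per write. */
static void sketch_line_32to16_wide(uint16_t *dst, const uint32_t *src,
				    size_t pixels)
{
	size_t i = 0;

	/* Main loop: convert four pixels and emit one 64-bit store. */
	for (; i + 4 <= pixels; i += 4) {
		uint64_t val =
			((uint64_t)sketch_xrgb8888_to_rgb565(src[i + 0])) |
			((uint64_t)sketch_xrgb8888_to_rgb565(src[i + 1]) << 16) |
			((uint64_t)sketch_xrgb8888_to_rgb565(src[i + 2]) << 32) |
			((uint64_t)sketch_xrgb8888_to_rgb565(src[i + 3]) << 48);

		/* Assumes little-endian: pixel 0 lands at the lowest address. */
		memcpy(&dst[i], &val, sizeof(val));
	}

	/* Tail: store the remaining zero to three pixels one at a time. */
	for (; i < pixels; i++)
		dst[i] = sketch_xrgb8888_to_rgb565(src[i]);
}
```

Because the wide store lives in the line helper, every format that
goes through that helper would benefit without touching the per-pixel
code.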

Here's a little benchmark for the common use case of watching a
full-screen video under X. The test measures the average time
to display a single frame. It uses an XRGB8888 framebuffer for each
frame, but displays it to an RGB565 scanout buffer, thus doing format
conversion with the _32to16 helper. The test system is an Intel Core
i5-3470 with an Intel HD2500 graphics card. The graphics driver is
simpledrm at 1024x768 pixels.

- 32-bit memcpy:	 510 µs/frame
- Current code:		1640 µs/frame
- New (unoptimized):	1600 µs/frame
- New (no 64-bit):	1580 µs/frame
- New (optimized):	1470 µs/frame

The first line shows the time to memcpy a single frame to video
memory without format conversion. With conversion, the old and new
code take roughly 3 times as long. Both variants are comparable
in performance. With 64-bit stores added, the new code is even 10%
faster than the current one. (This is not to be considered a full
performance test. The main objective is to see whether the new code
is on par with the old code, which is the case.)
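
The "roughly 3 times" and "10%" figures follow from the numbers in the
list above. A small sketch of the arithmetic, using integer math and
hypothetical helper names:

```c
#include <stdint.h>

/* Ratio a/b scaled by 100 and truncated, e.g. 321 means about 3.2x. */
static uint32_t sketch_ratio_x100(uint32_t a_us, uint32_t b_us)
{
	return a_us * 100u / b_us;
}

/* Time saved relative to the old code, in per mille, truncated. */
static uint32_t sketch_saving_permille(uint32_t old_us, uint32_t new_us)
{
	return (old_us - new_us) * 1000u / old_us;
}
```

With the measured values, sketch_ratio_x100(1640, 510) gives 321 and
sketch_ratio_x100(1470, 510) gives 288, so both variants take about 3
times as long as the plain memcpy. sketch_saving_permille(1640, 1470)
gives 103, or about 10%.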

Thomas Zimmermann (8):
  drm/format-helper: Move helpers for pixel conversion to header file
  drm/format-helper: Add generic conversion to 32-bit formats
  drm/format-helper: Add generic conversion to 24-bit formats
  drm/format-helper: Add generic conversion to 16-bit formats
  drm/format-helper: Add generic conversion to 8-bit formats
  drm/format-helper: Optimize 32-to-24-bpp conversion
  drm/format-helper: Optimize 32-to-16-bpp conversion
  drm/format-helper: Optimize 32-to-8-bpp conversion

 drivers/gpu/drm/drm_draw.c            | 100 +------
 drivers/gpu/drm/drm_format_helper.c   | 378 ++++++++++++--------------
 drivers/gpu/drm/drm_format_internal.h | 160 +++++++++++
 3 files changed, 339 insertions(+), 299 deletions(-)
 create mode 100644 drivers/gpu/drm/drm_format_internal.h