Message ID | 20240315170959.165505-1-sunpeng.li@amd.com (mailing list archive) |
---|---|
Series | drm/amdgpu/display: Make multi-plane configurations more flexible |
On Fri, 15 Mar 2024 13:09:56 -0400
<sunpeng.li@amd.com> wrote:

> From: Leo Li <sunpeng.li@amd.com>
>
> These patches aim to make the amdgpu KMS driver play nicer with compositors
> when building multi-plane scanout configurations. They do so by:
>
> 1. Making cursor behavior more sensible.
> 2. Allowing placement of DRM OVERLAY planes underneath the PRIMARY plane for
>    'underlay' configurations (perhaps more of an RFC, see below).
>
> Please see the commit messages for details.
>
> For #2, the simplest way to accomplish this was to increase the value of the
> immutable zpos property for the PRIMARY plane. This allowed OVERLAY planes with
> a mutable zpos range of (0-254) to be positioned underneath the PRIMARY for an
> underlay scanout configuration.
>
> Technically speaking, DCN hardware does not have a concept of primary or overlay
> planes - there are simply 4 general purpose hardware pipes that can be mapped in
> any configuration. So the immutable zpos restriction on the PRIMARY plane is
> kind of arbitrary; it can have a mutable range of (0-254) just like the
> OVERLAYs. The distinction between PRIMARY and OVERLAY planes is also somewhat
> arbitrary. We can interpret PRIMARY as the first plane that should be enabled on
> a CRTC, but beyond that, it doesn't mean much for amdgpu.
>
> Therefore, I'm curious about how compositor devs understand KMS planes and
> their zpos properties, and how we would like to use them. It isn't clear to me
> how compositors wish to interpret and use the DRM zpos property, or
> differentiate between OVERLAY and PRIMARY planes, when it comes to setting up
> multi-plane scanout.

You already quoted me on the Weston link, so I don't think I have
anything to add. Sounds fine to me, and we don't have a standard plane
arrangement algorithm that the kernel could optimize zpos ranges
against, yet.

> Ultimately, what I'd like to answer is "What can we do on the KMS driver and DRM
> plane API side, that can make building multi-plane scanout configurations easier
> for compositors?" I'm hoping we can converge on something, whether that be
> updating the existing documentation to better define the usage, or updating the
> API to provide support for something that is lacking.

I think there probably should be a standardised plane arrangement
algorithm in userspace, because the search space suffers from
permutational explosion. Either there needs to be very few planes (max
4 or 5 at-all-possible per CRTC, including shareable ones) for an
exhaustive search to be feasible, or all planes should be more or less
equal in capabilities and userspace employs some simplified or
heuristic search.

If the search algorithm is fixed, then drivers could optimize zpos
ranges to have the algorithm find a solution faster.

My worry is that userspace already has heuristic search algorithms that
may start failing if drivers later change their zpos ranges to be more
optimal for another algorithm.

OTOH, as long as exhaustive search is feasible, then it does not matter
how DRM drivers set up the zpos ranges.

In any case, the zpos ranges should try to allow all possible plane
arrangements while minimizing the number of arrangements that won't
work. The absolute values of zpos are pretty much irrelevant, so I
think setting one plane to have an immutable zpos is a good idea, even
if the driver does not require it. That is one less moving part, and
only the relative ordering between the planes matters.

Thanks,
pq

> Some links to provide context and details:
> * What is underlay?: https://gitlab.freedesktop.org/emersion/libliftoff/-/issues/76
> * Discussion on how to implement underlay on Weston: https://gitlab.freedesktop.org/wayland/weston/-/merge_requests/1258#note_2325164
>
> Cc: Joshua Ashton <joshua@froggi.es>
> Cc: Michel Dänzer <mdaenzer@redhat.com>
> Cc: Chao Guo <chao.guo@nxp.com>
> Cc: Xaver Hugl <xaver.hugl@gmail.com>
> Cc: Vikas Korjani <Vikas.Korjani@amd.com>
> Cc: Robert Mader <robert.mader@posteo.de>
> Cc: Pekka Paalanen <pekka.paalanen@collabora.com>
> Cc: Sean Paul <sean@poorly.run>
> Cc: Simon Ser <contact@emersion.fr>
> Cc: Shashank Sharma <shashank.sharma@amd.com>
> Cc: Harry Wentland <harry.wentland@amd.com>
> Cc: Sebastian Wick <sebastian.wick@redhat.com>
>
> Leo Li (2):
>   drm/amd/display: Introduce overlay cursor mode
>   drm/amd/display: Move PRIMARY plane zpos higher
>
>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 405 ++++++++++++++++--
>  .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h |   7 +
>  .../amd/display/amdgpu_dm/amdgpu_dm_crtc.c    |   1 +
>  .../amd/display/amdgpu_dm/amdgpu_dm_plane.c   |  28 +-
>  4 files changed, 391 insertions(+), 50 deletions(-)
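To put a rough number on the permutational explosion mentioned above, here is a minimal sketch. It assumes a simplified model that is not part of the thread: every layer is either offloaded to its own plane or left to GPU composition, and plane capabilities and zpos limits are ignored. The function name is made up for illustration.

```python
from math import comb, perm


def count_candidate_arrangements(num_planes: int, num_layers: int) -> int:
    """Count ways to put some subset of layers onto distinct planes.

    Assumption: layers that are not offloaded are composited elsewhere,
    and every plane can take every layer. Real drivers reject many of
    these candidates, so this is only a rough upper bound on what an
    exhaustive search would have to test.
    """
    total = 0
    for offloaded in range(min(num_planes, num_layers) + 1):
        # Pick which layers are offloaded, then an ordered choice of planes.
        total += comb(num_layers, offloaded) * perm(num_planes, offloaded)
    return total


if __name__ == "__main__":
    for n in (2, 4, 6, 8):
        print(n, count_candidate_arrangements(n, n))
```

Under this model, 4 planes and 4 layers already give 209 candidates, and in practice each candidate would have to be checked with an atomic test-only commit. That is why the "4 or 5 planes" limit for exhaustive search seems plausible.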
On 2024-03-28 10:33, Pekka Paalanen wrote:
> I think there probably should be a standardised plane arrangement
> algorithm in userspace, because the search space suffers from
> permutational explosion.

Right, thanks for your thoughts! I agree that there should be a common plane
arrangement algorithm. I think libliftoff is the most obvious candidate here. It
only handles overlay arrangements currently, but mixed-mode arrangements are
something I've been trying to look at.

Taking the driver's reported zpos into account could narrow down the search
space for mixed arrangements. We could tell whether underlay, or overlay, or
both, is supported by looking at the allowed zpos ranges.

I also wonder if it'll make underlay assignments easier. libliftoff has an
assumption that the PRIMARY plane has the lowest zpos (which, I now realize, is
not always true). Therefore, the underlay buffer has to be placed on the
PRIMARY, with the render buffer on a higher OVERLAY. Swapping buffers between
planes when testing mixed arrangements is kind of awkward, and simply setting
the OVERLAY's zpos to be lower or higher than the PRIMARY's sounds simpler.

Currently only gamescope makes use of libliftoff, but I'm curious if patches
hooking it up to Weston would be welcomed? If there are other ways to have a
common arrangement algorithm, I'd be happy to hear that as well.

Note that libliftoff's algorithm is more complex than Weston's, since it searches
harder, and suffers from that permutational explosion. But it solves that by
trying high benefit arrangements first (offloading surfaces that update
frequently), and bailing out once the search reaches a hard-coded deadline.
Since it's currently overlay-only, the goal could be to "simply" have no
regressions.

Thanks,
Leo
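The idea of reading underlay/overlay support off the allowed zpos ranges can be sketched as follows. This is only an illustration: `ZposRange` and `supported_modes` are hypothetical names, not libliftoff or kernel API. The PRIMARY value of 2 and the `1..254` range are made-up examples, and only the `0..254` OVERLAY range comes from the cover letter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ZposRange:
    """Allowed zpos values of one plane; lo == hi models an immutable zpos."""
    lo: int
    hi: int


def supported_modes(primary: ZposRange, overlay: ZposRange) -> set:
    """Return which stacking modes the two ranges allow."""
    modes = set()
    # Overlay: some allowed OVERLAY zpos is above some allowed PRIMARY zpos.
    if overlay.hi > primary.lo:
        modes.add("overlay")
    # Underlay: some allowed OVERLAY zpos is below some allowed PRIMARY zpos.
    if overlay.lo < primary.hi:
        modes.add("underlay")
    return modes


if __name__ == "__main__":
    # Hypothetical immutable PRIMARY at zpos 2, OVERLAY mutable in 0..254.
    print(supported_modes(ZposRange(2, 2), ZposRange(0, 254)))
    # Hypothetical immutable PRIMARY at zpos 0, OVERLAY mutable in 1..254.
    print(supported_modes(ZposRange(0, 0), ZposRange(1, 254)))
```

Only the relative ordering is compared, so any immutable PRIMARY value strictly inside the OVERLAY range would report both modes.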
On Wed, 3 Apr 2024 17:32:46 -0400
Leo Li <sunpeng.li@amd.com> wrote:

> Currently only gamescope makes use of libliftoff, but I'm curious if patches
> hooking it up to Weston would be welcomed? If there are other ways to have a
> common arrangement algorithm, I'd be happy to hear that as well.

A natural thing would be to document such an algorithm with the KMS
UAPI.

I don't know libliftoff well enough to say how welcome it would be in
Weston. I have no fundamental or policy reason to keep an independent
implementation in Weston though, so it's plausible at least.

It would need investigation, and perhaps also extending the Weston test
suite a lot more towards VKMS to verify plane assignments. Currently
all plane assignment testing is manual on real hardware.

> Note that libliftoff's algorithm is more complex than weston, since it searches
> harder, and suffers from that permutational explosion. But it solves that by
> trying high benefit arrangements first (offloading surfaces that update
> frequently), and bailing out once the search reaches a hard-coded deadline.
> Since it's currently overlay-only, the goal could be to "simply" have no
> regressions.

Ensuring no regressions would indeed need to be taken care of by
extending the VKMS-based automated testing.

Thanks,
pq
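The "high benefit first, bail out at a deadline" strategy quoted above can be illustrated with a small sketch. This is not libliftoff's implementation. The update-rate metric, the `is_accepted` callback (standing in for an atomic test-only commit) and all names are assumptions made for the example.

```python
import time
from itertools import combinations


def pick_layers_to_offload(update_rates, num_planes, is_accepted,
                           deadline_s, clock=time.monotonic):
    """Return the first accepted layer set, trying high-benefit sets first.

    update_rates: dict of layer name -> updates per second (assumed metric).
    is_accepted:  callback that says whether the driver would take the set.
    Returns an empty tuple if nothing was accepted before the deadline.
    """
    start = clock()
    candidates = []
    # Enumerating everything up front is only reasonable for a handful of
    # layers; it keeps the sketch short.
    for k in range(1, min(num_planes, len(update_rates)) + 1):
        candidates.extend(combinations(sorted(update_rates), k))
    # Highest total update rate first.
    candidates.sort(key=lambda c: sum(update_rates[name] for name in c),
                    reverse=True)
    for candidate in candidates:
        if clock() - start > deadline_s:
            break  # out of time: fall back to compositing everything
        if is_accepted(candidate):
            return candidate
    return ()


if __name__ == "__main__":
    rates = {"video": 60, "cursor": 30, "ui": 5}
    # Pretend the driver rejects any arrangement that includes "ui".
    print(pick_layers_to_offload(rates, 2, lambda c: "ui" not in c, 1.0))
```

A regression test in the spirit of the VKMS discussion could pin the result for a fixed set of layers and a fake `clock`, so that a change in search order shows up as a test failure.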
On 2024-04-04 06:24, Pekka Paalanen wrote:
> It would need investigation, and perhaps also extending Weston test
> suite a lot more towards VKMS to verify plane assignments. Currently
> all plane assignment testing is manual on real hardware.

It looks like VKMS doesn't have explicit zpos yet, so someone would
probably need to add that.

https://drmdb.emersion.fr/properties/4008636142/zpos

Harry
On Thu, Apr 04, 2024 at 09:59:03AM -0400, Harry Wentland wrote:
> Hi all,
>
> It looks like VKMS doesn't have explicit zpos yet, so someone would
> probably need to add that.
>
> https://drmdb.emersion.fr/properties/4008636142/zpos

Yes. If we look into adding that, maybe it should be done with
ConfigFS: https://patchwork.freedesktop.org/series/122618/

With that in and with zpos support, we could then run a batch of tests
that dynamically exercise all possible combinations on the fly.
On 2024-04-04 10:22, Marius Vlad wrote: > On Thu, Apr 04, 2024 at 09:59:03AM -0400, Harry Wentland wrote: >> > Hi all, >> >> On 2024-04-04 06:24, Pekka Paalanen wrote: >>> On Wed, 3 Apr 2024 17:32:46 -0400 >>> Leo Li <sunpeng.li@amd.com> wrote: >>> >>>> On 2024-03-28 10:33, Pekka Paalanen wrote: >>>>> On Fri, 15 Mar 2024 13:09:56 -0400 >>>>> <sunpeng.li@amd.com> wrote: >>>>> >>>>>> From: Leo Li <sunpeng.li@amd.com> >>>>>> >>>>>> These patches aim to make the amdgpgu KMS driver play nicer with compositors >>>>>> when building multi-plane scanout configurations. They do so by: >>>>>> >>>>>> 1. Making cursor behavior more sensible. >>>>>> 2. Allowing placement of DRM OVERLAY planes underneath the PRIMARY plane for >>>>>> 'underlay' configurations (perhaps more of a RFC, see below). >>>>>> >>>>>> Please see the commit messages for details. >>>>>> >>>>>> >>>>>> For #2, the simplest way to accomplish this was to increase the value of the >>>>>> immutable zpos property for the PRIMARY plane. This allowed OVERLAY planes with >>>>>> a mutable zpos range of (0-254) to be positioned underneath the PRIMARY for an >>>>>> underlay scanout configuration. >>>>>> >>>>>> Technically speaking, DCN hardware does not have a concept of primary or overlay >>>>>> planes - there are simply 4 general purpose hardware pipes that can be maped in >>>>>> any configuration. So the immutable zpos restriction on the PRIMARY plane is >>>>>> kind of arbitrary; it can have a mutable range of (0-254) just like the >>>>>> OVERLAYs. The distinction between PRIMARY and OVERLAY planes is also somewhat >>>>>> arbitrary. We can interpret PRIMARY as the first plane that should be enabled on >>>>>> a CRTC, but beyond that, it doesn't mean much for amdgpu. >>>>>> >>>>>> Therefore, I'm curious about how compositors devs understand KMS planes and >>>>>> their zpos properties, and how we would like to use them. 
It isn't clear to me >>>>>> how compositors wish to interpret and use the DRM zpos property, or >>>>>> differentiate between OVERLAY and PRIMARY planes, when it comes to setting up >>>>>> multi-plane scanout. >>>>> >>>>> You already quoted me on the Weston link, so I don't think I have >>>>> anything to add. Sounds fine to me, and we don't have a standard plane >>>>> arrangement algorithm that the kernel could optimize zpos ranges >>>>> against, yet. >>>>> >>>>>> Ultimately, what I'd like to answer is "What can we do on the KMS driver and DRM >>>>>> plane API side, that can make building multi-plane scanout configurations easier >>>>>> for compositors?" I'm hoping we can converge on something, whether that be >>>>>> updating the existing documentation to better define the usage, or update the >>>>>> API to provide support for something that is lacking. >>>>> >>>>> I think there probably should be a standardised plane arrangement >>>>> algorithm in userspace, because the search space suffers from >>>>> permutational explosion. Either there needs to be very few planes (max >>>>> 4 or 5 at-all-possible per CRTC, including shareable ones) for an >>>>> exhaustive search to be feasible, or all planes should be more or less >>>>> equal in capabilities and userspace employs some simplified or >>>>> heuristic search. >>>>> >>>>> If the search algorithm is fixed, then drivers could optimize zpos >>>>> ranges to have the algorithm find a solution faster. >>>>> >>>>> My worry is that userspace already has heuristic search algorithms that >>>>> may start failing if drivers later change their zpos ranges to be more >>>>> optimal for another algorithm. >>>>> >>>>> OTOH, as long as exhaustive search is feasible, then it does not matter >>>>> how DRM drivers set up the zpos ranges. >>>>> >>>>> In any case, the zpos ranges should try to allow all possible plane >>>>> arrangements while minimizing the number of arrangements that won't >>>>> work. 
The absolute values of zpos are pretty much irrelevant, so I >>>>> think setting one plane to have an immutable zpos is a good idea, even >>>>> if it's not necessary by the driver. That is one less moving part, and >>>>> only the relative ordering between the planes matters. >>>>> >>>>> >>>>> Thanks, >>>>> pq >>>> >>>> Right, thanks for your thoughts! I agree that there should be a common plane >>>> arrangement algorithm. I think libliftoff is the most obvious candidate here. It >>>> only handles overlay arrangements currently, but mixed-mode arrangements is >>>> something I've been trying to look at. >>>> >>>> Taking the driver's reported zpos into account could narrow down the search >>>> space for mixed arrangements. We could tell whether underlay, or overlay, or >>>> both, is supported by looking at the allowed zpos ranges. >>>> >>>> I also wonder if it'll make underlay assignments easier. libliftoff has an >>>> assumption that the PRIMARY plane has the lowest zpos (which now I realize, is >>>> not always true). Therefore, the underlay buffer has to be placed on the >>>> PRIMARY, with the render buffer on a higher OVERLAY. Swapping buffers between >>>> planes when testing mixed-arrangements is kind of awkward, and simply setting >>>> the OVERLAY's zpos to be lower or higher than the PRIMARY's sounds simpler. >>>> >>>> Currently only gamescope makes use of libliftoff, but I'm curious if patches >>>> hooking it up to Weston would be welcomed? If there are other ways to have a >>>> common arrangement algorithm, I'd be happy to hear that as well. >>> >>> A natural thing would be to document such an algorithm with the KMS >>> UAPI. >>> >>> I don't know libliftoff well enough to say how welcome it would be in >>> Weston. I have no fundamental or policy reason to keep an independent >>> implementation in Weston though, so it's plausible at least. Is it the case that different compositors may want different plane arrangement behaviors? 
Like selecting which surfaces to offload, for example? It occurred to me that prescribing an allocation algorithm via something like libliftoff might be too restrictive. In which case, documenting the parts that can be nailed down would be better. That raises the question of what can be nailed down and what can be left to independent implementation. I guess things like which plane should be enabled first (PRIMARY), and how zpos should be interpreted (overlay, underlay, mixed) can be defined. How to handle atomic test failures could be as well. I can start working on a draft for this. If nothing else, it can serve as a spark for discussions at the display hackfest. >>> >>> It would need investigation, and perhaps also extending Weston test >>> suite a lot more towards VKMS to verify plane assignments. Currently >>> all plane assignment testing is manual on real hardware. >>> >> >> It looks like VKMS doesn't have explicit zpos yet, so someone would >> probably need to add that. >> >> https://drmdb.emersion.fr/properties/4008636142/zpos > Yes. If we look into adding that, maybe it should be done using with > ConfigFS: https://patchwork.freedesktop.org/series/122618/ > > With that in and with zpos support, we could then run a batch of tests that > can dynamically exercise on-the-fly all possible combinations. Using VKMS to come up with a bunch of different hw plane configurations is a good idea. It may come in handy for testing other compositors too. Thanks for the suggestions. - Leo >> >> Harry >> >>>> Note that libliftoff's algorithm is more complex than weston, since it searches >>>> harder, and suffers from that permutational explosion. But it solves that by >>>> trying high benefit arrangements first (offloading surfaces that update >>>> frequently), and bailing out once the search reaches a hard-coded deadline. >>>> Since it's currently overlay-only, the goal could be to "simply" have no >>>> regressions.
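To put a rough number on the "permutational explosion" mentioned in the quoted discussion, here is a toy count, not the search space of libliftoff or Weston. It assumes a simplified model in which a candidate arrangement picks k distinct planes, in order, for the k surfaces being offloaded, and it ignores format, scaling and zpos constraints entirely. The function name is made up for illustration.

```python
from math import perm


def candidate_arrangements(num_planes: int) -> int:
    """Count ordered selections of 0..num_planes distinct planes.

    Toy model (an assumption, not any compositor's real algorithm):
    offloading k surfaces means choosing k distinct planes in order,
    which gives perm(num_planes, k) candidates for each k.
    """
    return sum(perm(num_planes, k) for k in range(num_planes + 1))


for n in (4, 5, 8):
    print(n, candidate_arrangements(n))
```

Under this model 4 planes give 65 candidates and 5 planes give 326, while 8 planes already give 109601. If each candidate costs one atomic test commit, that is in line with the remark that exhaustive search is only feasible with very few planes.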
On Thu, 11 Apr 2024 16:33:57 -0400 Leo Li <sunpeng.li@amd.com> wrote: > On 2024-04-04 10:22, Marius Vlad wrote: > > On Thu, Apr 04, 2024 at 09:59:03AM -0400, Harry Wentland wrote: > >> > > Hi all, > >> > >> On 2024-04-04 06:24, Pekka Paalanen wrote: > >>> On Wed, 3 Apr 2024 17:32:46 -0400 > >>> Leo Li <sunpeng.li@amd.com> wrote: > >>> > >>>> On 2024-03-28 10:33, Pekka Paalanen wrote: > >>>>> On Fri, 15 Mar 2024 13:09:56 -0400 > >>>>> <sunpeng.li@amd.com> wrote: > >>>>> > >>>>>> From: Leo Li <sunpeng.li@amd.com> > >>>>>> > >>>>>> These patches aim to make the amdgpgu KMS driver play nicer with compositors > >>>>>> when building multi-plane scanout configurations. They do so by: > >>>>>> > >>>>>> 1. Making cursor behavior more sensible. > >>>>>> 2. Allowing placement of DRM OVERLAY planes underneath the PRIMARY plane for > >>>>>> 'underlay' configurations (perhaps more of a RFC, see below). > >>>>>> > >>>>>> Please see the commit messages for details. > >>>>>> > >>>>>> > >>>>>> For #2, the simplest way to accomplish this was to increase the value of the > >>>>>> immutable zpos property for the PRIMARY plane. This allowed OVERLAY planes with > >>>>>> a mutable zpos range of (0-254) to be positioned underneath the PRIMARY for an > >>>>>> underlay scanout configuration. > >>>>>> > >>>>>> Technically speaking, DCN hardware does not have a concept of primary or overlay > >>>>>> planes - there are simply 4 general purpose hardware pipes that can be maped in > >>>>>> any configuration. So the immutable zpos restriction on the PRIMARY plane is > >>>>>> kind of arbitrary; it can have a mutable range of (0-254) just like the > >>>>>> OVERLAYs. The distinction between PRIMARY and OVERLAY planes is also somewhat > >>>>>> arbitrary. We can interpret PRIMARY as the first plane that should be enabled on > >>>>>> a CRTC, but beyond that, it doesn't mean much for amdgpu. 
> >>>>>> > >>>>>> Therefore, I'm curious about how compositors devs understand KMS planes and > >>>>>> their zpos properties, and how we would like to use them. It isn't clear to me > >>>>>> how compositors wish to interpret and use the DRM zpos property, or > >>>>>> differentiate between OVERLAY and PRIMARY planes, when it comes to setting up > >>>>>> multi-plane scanout. > >>>>> > >>>>> You already quoted me on the Weston link, so I don't think I have > >>>>> anything to add. Sounds fine to me, and we don't have a standard plane > >>>>> arrangement algorithm that the kernel could optimize zpos ranges > >>>>> against, yet. > >>>>> > >>>>>> Ultimately, what I'd like to answer is "What can we do on the KMS driver and DRM > >>>>>> plane API side, that can make building multi-plane scanout configurations easier > >>>>>> for compositors?" I'm hoping we can converge on something, whether that be > >>>>>> updating the existing documentation to better define the usage, or update the > >>>>>> API to provide support for something that is lacking. > >>>>> > >>>>> I think there probably should be a standardised plane arrangement > >>>>> algorithm in userspace, because the search space suffers from > >>>>> permutational explosion. Either there needs to be very few planes (max > >>>>> 4 or 5 at-all-possible per CRTC, including shareable ones) for an > >>>>> exhaustive search to be feasible, or all planes should be more or less > >>>>> equal in capabilities and userspace employs some simplified or > >>>>> heuristic search. > >>>>> > >>>>> If the search algorithm is fixed, then drivers could optimize zpos > >>>>> ranges to have the algorithm find a solution faster. > >>>>> > >>>>> My worry is that userspace already has heuristic search algorithms that > >>>>> may start failing if drivers later change their zpos ranges to be more > >>>>> optimal for another algorithm. 
> >>>>> > >>>>> OTOH, as long as exhaustive search is feasible, then it does not matter > >>>>> how DRM drivers set up the zpos ranges. > >>>>> > >>>>> In any case, the zpos ranges should try to allow all possible plane > >>>>> arrangements while minimizing the number of arrangements that won't > >>>>> work. The absolute values of zpos are pretty much irrelevant, so I > >>>>> think setting one plane to have an immutable zpos is a good idea, even > >>>>> if it's not necessary by the driver. That is one less moving part, and > >>>>> only the relative ordering between the planes matters. > >>>>> > >>>>> > >>>>> Thanks, > >>>>> pq > >>>> > >>>> Right, thanks for your thoughts! I agree that there should be a common plane > >>>> arrangement algorithm. I think libliftoff is the most obvious candidate here. It > >>>> only handles overlay arrangements currently, but mixed-mode arrangements is > >>>> something I've been trying to look at. > >>>> > >>>> Taking the driver's reported zpos into account could narrow down the search > >>>> space for mixed arrangements. We could tell whether underlay, or overlay, or > >>>> both, is supported by looking at the allowed zpos ranges. > >>>> > >>>> I also wonder if it'll make underlay assignments easier. libliftoff has an > >>>> assumption that the PRIMARY plane has the lowest zpos (which now I realize, is > >>>> not always true). Therefore, the underlay buffer has to be placed on the > >>>> PRIMARY, with the render buffer on a higher OVERLAY. Swapping buffers between > >>>> planes when testing mixed-arrangements is kind of awkward, and simply setting > >>>> the OVERLAY's zpos to be lower or higher than the PRIMARY's sounds simpler. > >>>> > >>>> Currently only gamescope makes use of libliftoff, but I'm curious if patches > >>>> hooking it up to Weston would be welcomed? If there are other ways to have a > >>>> common arrangement algorithm, I'd be happy to hear that as well. 
> >>> > >>> A natural thing would be to document such an algorithm with the KMS > >>> UAPI. > >>> > >>> I don't know libliftoff well enough to say how welcome it would be in > >>> Weston. I have no fundamental or policy reason to keep an independent > >>> implementation in Weston though, so it's plausible at least. > > Is it the case that different compositors may want different plane arrangement > behaviors? Like selecting which surfaces to offload, for example? It occurred to > me that prescribing an allocation algorithm via something like libliftoff might > be too restrictive. In which case, documenting the parts that can be nailed down > would be better. I don't know. Probably there shouldn't be, eventually, because it's hard to imagine how DE or end user style/taste/preferences would affect things. Usually the global goal would be optimising power consumption. There could be trade-offs though, when reduction in power consumption results in reduced image quality or increased latency. What to favour for which surfaces is definitely policy and preference. Such preferences could perhaps be designed into libliftoff API. > That begs the question of what can be nailed down and what can left to > independent implementation. I guess things like which plane should be enabled > first (PRIMARY), and how zpos should be interpreted (overlay, underlay, mixed) > can be defined. How to handle atomic test failures could be as well. What room is there for the interpretation of zpos values? I thought they are unambiguous already: only the relative numerical order matters, and that uniquely defines the KMS plane ordering. Thanks, pq > I can start working on a draft for this. If anything, as a spark for discussions > for the display hackfest. > > >>> > >>> It would need investigation, and perhaps also extending Weston test > >>> suite a lot more towards VKMS to verify plane assignments. Currently > >>> all plane assignment testing is manual on real hardware. 
> >>> > >> > >> It looks like VKMS doesn't have explicit zpos yet, so someone would > >> probably need to add that. > >> > >> https://drmdb.emersion.fr/properties/4008636142/zpos > > Yes. If we look into adding that, maybe it should be done using with > > ConfigFS: https://patchwork.freedesktop.org/series/122618/ > > > > With that in and with zpos support, we could then run a batch of tests that > > can dynamically exercise on-the-fly all possible combinations. > > Using vkms to come up with a bunch of different hw plane configurations is a > good idea. It may come in handy for testing other compositors too. Thanks for > the suggestions. > > - Leo > > > >> > >> Harry > >> > >>>> Note that libliftoff's algorithm is more complex than weston, since it searches > >>>> harder, and suffers from that permutational explosion. But it solves that by > >>>> trying high benefit arrangements first (offloading surfaces that update > >>>> frequently), and bailing out once the search reaches a hard-coded deadline. > >>>> Since it's currently overlay-only, the goal could be to "simply" have no > >>>> regressions. > >>> > >>> Ensuring no regressions would indeed need to be taken care of by > >>> extending the VKMS-based automated testing. 
On 2024-04-12 04:03, Pekka Paalanen wrote: > On Thu, 11 Apr 2024 16:33:57 -0400 > Leo Li <sunpeng.li@amd.com> wrote: > >> On 2024-04-04 10:22, Marius Vlad wrote: >>> On Thu, Apr 04, 2024 at 09:59:03AM -0400, Harry Wentland wrote: >>>> >>> Hi all, >>>> >>>> On 2024-04-04 06:24, Pekka Paalanen wrote: >>>>> On Wed, 3 Apr 2024 17:32:46 -0400 >>>>> Leo Li <sunpeng.li@amd.com> wrote: >>>>> >>>>>> On 2024-03-28 10:33, Pekka Paalanen wrote: >>>>>>> On Fri, 15 Mar 2024 13:09:56 -0400 >>>>>>> <sunpeng.li@amd.com> wrote: >>>>>>> >>>>>>>> From: Leo Li <sunpeng.li@amd.com> >>>>>>>> >>>>>>>> These patches aim to make the amdgpgu KMS driver play nicer with compositors >>>>>>>> when building multi-plane scanout configurations. They do so by: >>>>>>>> >>>>>>>> 1. Making cursor behavior more sensible. >>>>>>>> 2. Allowing placement of DRM OVERLAY planes underneath the PRIMARY plane for >>>>>>>> 'underlay' configurations (perhaps more of a RFC, see below). >>>>>>>> >>>>>>>> Please see the commit messages for details. >>>>>>>> >>>>>>>> >>>>>>>> For #2, the simplest way to accomplish this was to increase the value of the >>>>>>>> immutable zpos property for the PRIMARY plane. This allowed OVERLAY planes with >>>>>>>> a mutable zpos range of (0-254) to be positioned underneath the PRIMARY for an >>>>>>>> underlay scanout configuration. >>>>>>>> >>>>>>>> Technically speaking, DCN hardware does not have a concept of primary or overlay >>>>>>>> planes - there are simply 4 general purpose hardware pipes that can be maped in >>>>>>>> any configuration. So the immutable zpos restriction on the PRIMARY plane is >>>>>>>> kind of arbitrary; it can have a mutable range of (0-254) just like the >>>>>>>> OVERLAYs. The distinction between PRIMARY and OVERLAY planes is also somewhat >>>>>>>> arbitrary. We can interpret PRIMARY as the first plane that should be enabled on >>>>>>>> a CRTC, but beyond that, it doesn't mean much for amdgpu. 
>>>>>>>> >>>>>>>> Therefore, I'm curious about how compositors devs understand KMS planes and >>>>>>>> their zpos properties, and how we would like to use them. It isn't clear to me >>>>>>>> how compositors wish to interpret and use the DRM zpos property, or >>>>>>>> differentiate between OVERLAY and PRIMARY planes, when it comes to setting up >>>>>>>> multi-plane scanout. >>>>>>> >>>>>>> You already quoted me on the Weston link, so I don't think I have >>>>>>> anything to add. Sounds fine to me, and we don't have a standard plane >>>>>>> arrangement algorithm that the kernel could optimize zpos ranges >>>>>>> against, yet. >>>>>>> >>>>>>>> Ultimately, what I'd like to answer is "What can we do on the KMS driver and DRM >>>>>>>> plane API side, that can make building multi-plane scanout configurations easier >>>>>>>> for compositors?" I'm hoping we can converge on something, whether that be >>>>>>>> updating the existing documentation to better define the usage, or update the >>>>>>>> API to provide support for something that is lacking. >>>>>>> >>>>>>> I think there probably should be a standardised plane arrangement >>>>>>> algorithm in userspace, because the search space suffers from >>>>>>> permutational explosion. Either there needs to be very few planes (max >>>>>>> 4 or 5 at-all-possible per CRTC, including shareable ones) for an >>>>>>> exhaustive search to be feasible, or all planes should be more or less >>>>>>> equal in capabilities and userspace employs some simplified or >>>>>>> heuristic search. >>>>>>> >>>>>>> If the search algorithm is fixed, then drivers could optimize zpos >>>>>>> ranges to have the algorithm find a solution faster. >>>>>>> >>>>>>> My worry is that userspace already has heuristic search algorithms that >>>>>>> may start failing if drivers later change their zpos ranges to be more >>>>>>> optimal for another algorithm. 
>>>>>>> >>>>>>> OTOH, as long as exhaustive search is feasible, then it does not matter >>>>>>> how DRM drivers set up the zpos ranges. >>>>>>> >>>>>>> In any case, the zpos ranges should try to allow all possible plane >>>>>>> arrangements while minimizing the number of arrangements that won't >>>>>>> work. The absolute values of zpos are pretty much irrelevant, so I >>>>>>> think setting one plane to have an immutable zpos is a good idea, even >>>>>>> if it's not necessary by the driver. That is one less moving part, and >>>>>>> only the relative ordering between the planes matters. >>>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> pq >>>>>> >>>>>> Right, thanks for your thoughts! I agree that there should be a common plane >>>>>> arrangement algorithm. I think libliftoff is the most obvious candidate here. It >>>>>> only handles overlay arrangements currently, but mixed-mode arrangements is >>>>>> something I've been trying to look at. >>>>>> >>>>>> Taking the driver's reported zpos into account could narrow down the search >>>>>> space for mixed arrangements. We could tell whether underlay, or overlay, or >>>>>> both, is supported by looking at the allowed zpos ranges. >>>>>> >>>>>> I also wonder if it'll make underlay assignments easier. libliftoff has an >>>>>> assumption that the PRIMARY plane has the lowest zpos (which now I realize, is >>>>>> not always true). Therefore, the underlay buffer has to be placed on the >>>>>> PRIMARY, with the render buffer on a higher OVERLAY. Swapping buffers between >>>>>> planes when testing mixed-arrangements is kind of awkward, and simply setting >>>>>> the OVERLAY's zpos to be lower or higher than the PRIMARY's sounds simpler. >>>>>> >>>>>> Currently only gamescope makes use of libliftoff, but I'm curious if patches >>>>>> hooking it up to Weston would be welcomed? If there are other ways to have a >>>>>> common arrangement algorithm, I'd be happy to hear that as well. 
>>>>> >>>>> A natural thing would be to document such an algorithm with the KMS >>>>> UAPI. >>>>> >>>>> I don't know libliftoff well enough to say how welcome it would be in >>>>> Weston. I have no fundamental or policy reason to keep an independent >>>>> implementation in Weston though, so it's plausible at least. >> >> Is it the case that different compositors may want different plane arrangement >> behaviors? Like selecting which surfaces to offload, for example? It occurred to >> me that prescribing an allocation algorithm via something like libliftoff might >> be too restrictive. In which case, documenting the parts that can be nailed down >> would be better. > > I don't know. Probably there shouldn't be, eventually, because it's > hard to imagine how DE or end user style/taste/preferences would affect > things. Usually the global goal would be optimising power consumption. > > There could be trade-offs though, when reduction in power consumption > results in reduced image quality or increased latency. What to favour > for which surfaces is definitely policy and preference. Such > preferences could perhaps be designed into libliftoff API. > >> That begs the question of what can be nailed down and what can left to >> independent implementation. I guess things like which plane should be enabled >> first (PRIMARY), and how zpos should be interpreted (overlay, underlay, mixed) >> can be defined. How to handle atomic test failures could be as well. > > What room is there for the interpretation of zpos values? > > I thought they are unambiguous already: only the relative numerical > order matters, and that uniquely defines the KMS plane ordering. The zpos value of the PRIMARY plane relative to OVERLAYS, for example, as a way for vendors to communicate overlay, underlay, or mixed-arrangement support. I don't think allowing OVERLAYs to be placed under the PRIMARY is currently documented as a way to support underlay. 
libliftoff for example, assumes that the PRIMARY has the lowest zpos. So underlay arrangements will use an OVERLAY for the scanout plane, and the PRIMARY for the underlay view. Thanks, Leo > > > Thanks, > pq > >> I can start working on a draft for this. If anything, as a spark for discussions >> for the display hackfest. >> >>>>> >>>>> It would need investigation, and perhaps also extending Weston test >>>>> suite a lot more towards VKMS to verify plane assignments. Currently >>>>> all plane assignment testing is manual on real hardware. >>>>> >>>> >>>> It looks like VKMS doesn't have explicit zpos yet, so someone would >>>> probably need to add that. >>>> >>>> https://drmdb.emersion.fr/properties/4008636142/zpos >>> Yes. If we look into adding that, maybe it should be done using with >>> ConfigFS: https://patchwork.freedesktop.org/series/122618/ >>> >>> With that in and with zpos support, we could then run a batch of tests that >>> can dynamically exercise on-the-fly all possible combinations. >> >> Using vkms to come up with a bunch of different hw plane configurations is a >> good idea. It may come in handy for testing other compositors too. Thanks for >> the suggestions. >> >> - Leo >> >> >>>> >>>> Harry >>>> >>>>>> Note that libliftoff's algorithm is more complex than weston, since it searches >>>>>> harder, and suffers from that permutational explosion. But it solves that by >>>>>> trying high benefit arrangements first (offloading surfaces that update >>>>>> frequently), and bailing out once the search reaches a hard-coded deadline. >>>>>> Since it's currently overlay-only, the goal could be to "simply" have no >>>>>> regressions. >>>>> >>>>> Ensuring no regressions would indeed need to be taken care of by >>>>> extending the VKMS-based automated testing. 
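The idea of reading overlay/underlay support off the advertised zpos ranges can be sketched as below. This is only an illustration: the `ZposRange` type and the function name are hypothetical, the example values (PRIMARY immutable at 2) are made up rather than taken from the patches, and the case of equal zpos values is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ZposRange:
    """Inclusive zpos range advertised for a plane; lo == hi means immutable."""
    lo: int
    hi: int


def overlay_placement_modes(primary: ZposRange, overlay: ZposRange) -> set:
    """Return which placements of the OVERLAY relative to the PRIMARY are possible."""
    modes = set()
    # The OVERLAY can sit strictly below at least one permitted PRIMARY zpos.
    if overlay.lo < primary.hi:
        modes.add("underlay")
    # The OVERLAY can sit strictly above at least one permitted PRIMARY zpos.
    if overlay.hi > primary.lo:
        modes.add("overlay")
    return modes


# Hypothetical driver A: PRIMARY immutable at 2, OVERLAY mutable in 0-254.
print(overlay_placement_modes(ZposRange(2, 2), ZposRange(0, 254)))
# Hypothetical driver B: PRIMARY immutable at 0, OVERLAY mutable in 1-254.
print(overlay_placement_modes(ZposRange(0, 0), ZposRange(1, 254)))
```

Driver A would allow both placements (a mixed arrangement), while driver B would only allow overlays above the PRIMARY. Whether a given configuration is actually accepted still has to be checked with an atomic test commit.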
On Fri, 12 Apr 2024 10:28:52 -0400 Leo Li <sunpeng.li@amd.com> wrote: > On 2024-04-12 04:03, Pekka Paalanen wrote: > > On Thu, 11 Apr 2024 16:33:57 -0400 > > Leo Li <sunpeng.li@amd.com> wrote: > > ... > >> That begs the question of what can be nailed down and what can left to > >> independent implementation. I guess things like which plane should be enabled > >> first (PRIMARY), and how zpos should be interpreted (overlay, underlay, mixed) > >> can be defined. How to handle atomic test failures could be as well. > > > > What room is there for the interpretation of zpos values? > > > > I thought they are unambiguous already: only the relative numerical > > order matters, and that uniquely defines the KMS plane ordering. > > The zpos value of the PRIMARY plane relative to OVERLAYS, for example, as a way > for vendors to communicate overlay, underlay, or mixed-arrangement support. I > don't think allowing OVERLAYs to be placed under the PRIMARY is currently > documented as a way to support underlay. I always thought it's obvious that the zpos numbers dictate the plane order without any other rules. After all, we have the universal planes concept, where the plane type is only informational to aid heuristics rather than defining anything. Only if the zpos property does not exist, the plane types would come into play. Of course, if there actually exists userspace that fails if zpos allows an overlay type plane to be placed below primary, or fails if primary zpos is not zero, then DRM needs a new client cap. > libliftoff for example, assumes that the PRIMARY has the lowest zpos. So > underlay arrangements will use an OVERLAY for the scanout plane, and the PRIMARY > for the underlay view. That's totally ok. It works, right? Plane type does not matter if the KMS driver accepts the configuration. What is a "scanout plane"? Aren't all KMS planes by definition scanout planes? 
IOW, if the KMS client understands zpos and can do a proper KMS configuration search, and all planes have zpos property, then there is no need to look at the plane type at all. That is the goal of the universal planes feature. Thanks, pq
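As an illustration of the point above, that only the relative numerical order of zpos defines the stacking, here is a minimal sketch. The `Plane` record and the `stacking_order` and `classify` helpers are hypothetical names invented for this note, not libdrm or kernel API, and the zpos values are made up.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Plane:
    name: str
    type: str   # "PRIMARY", "OVERLAY" or "CURSOR"; informational only
    zpos: int

def stacking_order(planes):
    """Return the planes bottom-to-top, looking at nothing but zpos."""
    return sorted(planes, key=lambda p: p.zpos)

def classify(planes):
    """Describe where the OVERLAY planes sit relative to the PRIMARY plane."""
    primary = next(p for p in planes if p.type == "PRIMARY")
    overlays = [p for p in planes if p.type == "OVERLAY"]
    below = any(p.zpos < primary.zpos for p in overlays)
    above = any(p.zpos > primary.zpos for p in overlays)
    if below and above:
        return "mixed"
    if below:
        return "underlay"
    if above:
        return "overlay"
    return "primary-only"
```

The plane type is used here only to label the result; the ordering itself never consults it, which is the behaviour argued for above.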
On Fri, Apr 12, 2024 at 11:08 AM Pekka Paalanen <pekka.paalanen@collabora.com> wrote: > > On Fri, 12 Apr 2024 10:28:52 -0400 > Leo Li <sunpeng.li@amd.com> wrote: > > > On 2024-04-12 04:03, Pekka Paalanen wrote: > > > On Thu, 11 Apr 2024 16:33:57 -0400 > > > Leo Li <sunpeng.li@amd.com> wrote: > > > > > ... > > > >> That begs the question of what can be nailed down and what can left to > > >> independent implementation. I guess things like which plane should be enabled > > >> first (PRIMARY), and how zpos should be interpreted (overlay, underlay, mixed) > > >> can be defined. How to handle atomic test failures could be as well. > > > > > > What room is there for the interpretation of zpos values? > > > > > > I thought they are unambiguous already: only the relative numerical > > > order matters, and that uniquely defines the KMS plane ordering. > > > > The zpos value of the PRIMARY plane relative to OVERLAYS, for example, as a way > > for vendors to communicate overlay, underlay, or mixed-arrangement support. I > > don't think allowing OVERLAYs to be placed under the PRIMARY is currently > > documented as a way to support underlay. > > I always thought it's obvious that the zpos numbers dictate the plane > order without any other rules. After all, we have the universal planes > concept, where the plane type is only informational to aid heuristics > rather than defining anything. > > Only if the zpos property does not exist, the plane types would come > into play. > > Of course, if there actually exists userspace that fails if zpos allows > an overlay type plane to be placed below primary, or fails if primary > zpos is not zero, then DRM needs a new client cap. > > > libliftoff for example, assumes that the PRIMARY has the lowest zpos. So > > underlay arrangements will use an OVERLAY for the scanout plane, and the PRIMARY > > for the underlay view. > > That's totally ok. It works, right? Plane type does not matter if the > KMS driver accepts the configuration. 
> > What is a "scanout plane"? Aren't all KMS planes by definition scanout > planes? > > IOW, if the KMS client understands zpos and can do a proper KMS > configuration search, and all planes have zpos property, then there is > no need to look at the plane type at all. That is the goal of the > universal planes feature. The optimal configuration with DCN hardware is using underlays. E.g., the desktop plane would be at the top and would have holes cut out of it for videos or windows that want their own plane. If you do it the other way around, there are lots of limitations. Alex > > > Thanks, > pq
On 2024-04-12 11:31, Alex Deucher wrote: > On Fri, Apr 12, 2024 at 11:08 AM Pekka Paalanen > <pekka.paalanen@collabora.com> wrote: >> >> On Fri, 12 Apr 2024 10:28:52 -0400 >> Leo Li <sunpeng.li@amd.com> wrote: >> >>> On 2024-04-12 04:03, Pekka Paalanen wrote: >>>> On Thu, 11 Apr 2024 16:33:57 -0400 >>>> Leo Li <sunpeng.li@amd.com> wrote: >>>> >> >> ... >> >>>>> That begs the question of what can be nailed down and what can left to >>>>> independent implementation. I guess things like which plane should be enabled >>>>> first (PRIMARY), and how zpos should be interpreted (overlay, underlay, mixed) >>>>> can be defined. How to handle atomic test failures could be as well. >>>> >>>> What room is there for the interpretation of zpos values? >>>> >>>> I thought they are unambiguous already: only the relative numerical >>>> order matters, and that uniquely defines the KMS plane ordering. >>> >>> The zpos value of the PRIMARY plane relative to OVERLAYS, for example, as a way >>> for vendors to communicate overlay, underlay, or mixed-arrangement support. I >>> don't think allowing OVERLAYs to be placed under the PRIMARY is currently >>> documented as a way to support underlay. >> >> I always thought it's obvious that the zpos numbers dictate the plane >> order without any other rules. After all, we have the universal planes >> concept, where the plane type is only informational to aid heuristics >> rather than defining anything. >> >> Only if the zpos property does not exist, the plane types would come >> into play. >> >> Of course, if there actually exists userspace that fails if zpos allows >> an overlay type plane to be placed below primary, or fails if primary >> zpos is not zero, then DRM needs a new client cap. Right, it wasn't immediately clear to me that the API allowed placement of things beneath the PRIMARY. But reading the docs for drm_plane_create_zpos*, there's nothing that forbids it. 
>> >>> libliftoff for example, assumes that the PRIMARY has the lowest zpos. So >>> underlay arrangements will use an OVERLAY for the scanout plane, and the PRIMARY >>> for the underlay view. >> >> That's totally ok. It works, right? Plane type does not matter if the >> KMS driver accepts the configuration. >> >> What is a "scanout plane"? Aren't all KMS planes by definition scanout >> planes? Pardon my terminology, I thought the scanout plane was where weston rendered non-offloadable surfaces to. I guess it's more correct to call it the "render plane". On weston, it seems to be always assigned to the PRIMARY. For libliftoff, using OVERLAYs as the render plane and PRIMARY as the underlay plane would work. But I think keeping the render plane on PRIMARY (a la weston) makes underlay arrangements easier to allocate, and would be nice to incorporate into a shared algorithm. In an underlay arrangement, pushing down an OVERLAY's zpos below the PRIMARY's zpos is simpler than swapping their surfaces. If such an arrangement fails atomic_test, we won't have to worry about swapping the surfaces back. Of course, it's not that we can't keep track of that in the algorithm, but I think it does make things easier. It may help with reducing the number of atomic tests. Assuming that the same DRM plane provides the same format/color management/transformation support regardless of its zpos, we should be able to reasonably expect that changing its z-ordering will not cause atomic_test failures (or at least, expect fewer causes for failure). In other words, swapping the render plane from the PRIMARY to an OVERLAY might have more causes for an atomic_test failure than changing their z-ordering. The driver might have to do more things under the hood to provide this consistent behavior, but I think that's the right place for it. After all, drivers should know more about their hardware's behavior. The assumption that the PRIMARY has the lowest zpos isn't always true. 
I was made aware that the imx8mq platform places all of their OVERLAYS beneath the PRIMARY. Granted, the KMS code for enabling OVERLAYS is not upstream yet, but it is available from this thread: https://gitlab.freedesktop.org/wayland/weston/-/merge_requests/1258#note_2319898 . I guess this is more of a bad assumption that should be fixed in libliftoff. >> >> IOW, if the KMS client understands zpos and can do a proper KMS >> configuration search, and all planes have zpos property, then there is >> no need to look at the plane type at all. That is the goal of the >> universal planes feature. > > The optimal configuration with DCN hardware is using underlays. E.g., > the desktop plane would be at the top and would have holes cut out of > it for videos or windows that want their own plane. If you do it the > other way around, there are lots of limitations. > > Alex Right, patch 1/2 tries to work around one of these limitations (cursor-on-yuv). Others have mentioned we can do the same for scaling. Thanks, Leo > >> >> >> Thanks, >> pq
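To make the zpos side of this concrete: whether an OVERLAY can be pushed below the PRIMARY depends only on how the PRIMARY's immutable zpos compares with the OVERLAY's mutable range. The sketch below is illustrative; `underlay_slots` is a hypothetical helper, and the PRIMARY value of 2 in the usage note is an assumed example rather than the value the patch uses. The (0-254) OVERLAY range is the one quoted in the cover letter.

```python
def underlay_slots(primary_zpos, overlay_min, overlay_max):
    """Count the zpos values an OVERLAY may take that sit strictly below PRIMARY."""
    top = min(overlay_max, primary_zpos - 1)
    return max(0, top - overlay_min + 1)
```

With the PRIMARY fixed at zpos 0, `underlay_slots(0, 0, 254)` is 0, so no underlay is expressible. Raising the PRIMARY to an assumed zpos of 2 gives `underlay_slots(2, 0, 254) == 2`, which leaves room for two planes beneath it.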
On Fri, 12 Apr 2024 16:14:28 -0400 Leo Li <sunpeng.li@amd.com> wrote: > On 2024-04-12 11:31, Alex Deucher wrote: > > On Fri, Apr 12, 2024 at 11:08 AM Pekka Paalanen > > <pekka.paalanen@collabora.com> wrote: > >> > >> On Fri, 12 Apr 2024 10:28:52 -0400 > >> Leo Li <sunpeng.li@amd.com> wrote: > >> > >>> On 2024-04-12 04:03, Pekka Paalanen wrote: > >>>> On Thu, 11 Apr 2024 16:33:57 -0400 > >>>> Leo Li <sunpeng.li@amd.com> wrote: > >>>> > >> > >> ... > >> > >>>>> That begs the question of what can be nailed down and what can left to > >>>>> independent implementation. I guess things like which plane should be enabled > >>>>> first (PRIMARY), and how zpos should be interpreted (overlay, underlay, mixed) > >>>>> can be defined. How to handle atomic test failures could be as well. > >>>> > >>>> What room is there for the interpretation of zpos values? > >>>> > >>>> I thought they are unambiguous already: only the relative numerical > >>>> order matters, and that uniquely defines the KMS plane ordering. > >>> > >>> The zpos value of the PRIMARY plane relative to OVERLAYS, for example, as a way > >>> for vendors to communicate overlay, underlay, or mixed-arrangement support. I > >>> don't think allowing OVERLAYs to be placed under the PRIMARY is currently > >>> documented as a way to support underlay. > >> > >> I always thought it's obvious that the zpos numbers dictate the plane > >> order without any other rules. After all, we have the universal planes > >> concept, where the plane type is only informational to aid heuristics > >> rather than defining anything. > >> > >> Only if the zpos property does not exist, the plane types would come > >> into play. > >> > >> Of course, if there actually exists userspace that fails if zpos allows > >> an overlay type plane to be placed below primary, or fails if primary > >> zpos is not zero, then DRM needs a new client cap. > > Right, it wasn't immediately clear to me that the API allowed placement of > things beneath the PRIMARY. 
But reading the docs for drm_plane_create_zpos*, > there's nothing that forbids it. > > >> > >>> libliftoff for example, assumes that the PRIMARY has the lowest zpos. So > >>> underlay arrangements will use an OVERLAY for the scanout plane, and the PRIMARY > >>> for the underlay view. > >> > >> That's totally ok. It works, right? Plane type does not matter if the > >> KMS driver accepts the configuration. > >> > >> What is a "scanout plane"? Aren't all KMS planes by definition scanout > >> planes? > > Pardon my terminology, I thought the scanout plane was where weston rendered > non-offloadable surfaces to. I guess it's more correct to call it the "render > plane". On weston, it seems to be always assigned to the PRIMARY. > The assignment restriction is just technical design debt. It is limiting. There is no other good reason for it, than when lighting up a CRTC for the first time, Weston should do it with the renderer FB only, on the plane that is most likely to succeed i.e. PRIMARY. After the CRTC is lit, there should be no built-in limitations in what can go where. The reason for this is that if a CRTC can be activated, it must always be able to show the renderer FB without incurring a modeset. This is important for ensuring that the fallback compositing (renderer) is always possible. So we start with that configuration, and everything else is optional bonus. > > For libliftoff, using OVERLAYs as the render plane and PRIMARY as the underlay > plane would work. But I think keeping the render plane on PRIMARY (a la weston) > makes underlay arrangements easier to allocate, and would be nice to incorporate > into a shared algorithm. If zpos exists, I don't think such limitation is a good idea. It will just limit the possible configurations for no reason. With zpos, the KMS plane type should be irrelevant for their z-ordering. Underlay vs. overlay completely loses its meaning at the KMS level. 
> In an underlay arrangement, pushing down an OVERLAY's zpos below the PRIMARY's > zpos is simpler than swapping their surfaces. If such an arrangement fails > atomic_test, we won't have to worry about swapping the surfaces back. Of course, > it's not that we can't keep track of that in the algorithm, but I think it does > make things easier. There is no "swapping" or "swapping back". The tentative configuration is created as a new object that contains the complete CRTC+connector state, and if it doesn't work, it's simply destroyed. In Weston at least, I don't know of libliftoff. One surface could also be assigned to multiple KMS planes for different CRTCs, so there should be no 1:1 association in the first place. > It may help with reducing the amount of atomic tests. Assuming that the same DRM > plane provides the same format/color management/transformation support > regardless of it's zpos, I would definitely expect so. > we should be able to reasonably expect that changing > it's z-ordering will not cause atomic_test failures (or at least, expect less > causes for failure). In other words, swapping the render plane from the PRIMARY > to an OVERLAY might have more causes for an atomic_test fail, versus changing > their z-ordering. The driver might have to do more things under-the-hood to > provide this consistent behavior, but I think that's the right place for it. > After all, drivers should know more about their hardware's behavior. Indeed. > The assumption that the PRIMARY has the lowest zpos isn't always true. I > was made aware that the imx8mq platform places all of their OVERLAYS beneath the > PRIMARY. Granted, the KMS code for enabling OVERLAYS is not upstream yet, but it > is available from this thread: > https://gitlab.freedesktop.org/wayland/weston/-/merge_requests/1258#note_2319898 > . I guess this is more of a bad assumption that should be fixed in libliftoff. Weston needs fixing too, at least in case a renderer FB is used on the CRTC. 
Weston has two problems: renderer FB is always on PRIMARY plane, and renderer FB is always completely opaque. Thanks, pq > >> > >> IOW, if the KMS client understands zpos and can do a proper KMS > >> configuration search, and all planes have zpos property, then there is > >> no need to look at the plane type at all. That is the goal of the > >> universal planes feature. > > > > The optimal configuration with DCN hardware is using underlays. E.g., > > the desktop plane would be at the top and would have holes cut out of > > it for videos or windows that want their own plane. If you do it the > > other way around, there are lots of limitations. > > > > Alex > > Right, patch 1/2 tries to work around one of these limitations (cursor-on-yuv). > Others have mentioned we can do the same for scaling. > > Thanks, > Leo
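The "no swapping back" point can be sketched as follows: each tentative configuration is a fresh object, and a rejected one is simply dropped while the current state stays untouched. This is a simplified illustration and not Weston's actual data structures; `OutputState`, `with_plane` and `propose` are hypothetical names, and `atomic_test` stands in for a test-only atomic commit.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class OutputState:
    """Immutable snapshot: a tuple of (plane, fb, zpos) entries for one CRTC."""
    entries: tuple = ()

    def with_plane(self, plane, fb, zpos):
        # Build a new snapshot; the current one is never modified.
        kept = tuple(e for e in self.entries if e[0] != plane)
        return OutputState(kept + ((plane, fb, zpos),))

def propose(current, candidate, atomic_test):
    """Adopt the candidate if it passes the test, otherwise keep `current`."""
    return candidate if atomic_test(candidate) else current
```

Because a failed candidate never touches `current`, there is nothing to undo after a rejected test.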
On 2024-04-15 04:19, Pekka Paalanen wrote: > On Fri, 12 Apr 2024 16:14:28 -0400 > Leo Li <sunpeng.li@amd.com> wrote: > >> On 2024-04-12 11:31, Alex Deucher wrote: >>> On Fri, Apr 12, 2024 at 11:08 AM Pekka Paalanen >>> <pekka.paalanen@collabora.com> wrote: >>>> >>>> On Fri, 12 Apr 2024 10:28:52 -0400 >>>> Leo Li <sunpeng.li@amd.com> wrote: >>>> >>>>> On 2024-04-12 04:03, Pekka Paalanen wrote: >>>>>> On Thu, 11 Apr 2024 16:33:57 -0400 >>>>>> Leo Li <sunpeng.li@amd.com> wrote: >>>>>> >>>> >>>> ... >>>> >>>>>>> That begs the question of what can be nailed down and what can left to >>>>>>> independent implementation. I guess things like which plane should be enabled >>>>>>> first (PRIMARY), and how zpos should be interpreted (overlay, underlay, mixed) >>>>>>> can be defined. How to handle atomic test failures could be as well. >>>>>> >>>>>> What room is there for the interpretation of zpos values? >>>>>> >>>>>> I thought they are unambiguous already: only the relative numerical >>>>>> order matters, and that uniquely defines the KMS plane ordering. >>>>> >>>>> The zpos value of the PRIMARY plane relative to OVERLAYS, for example, as a way >>>>> for vendors to communicate overlay, underlay, or mixed-arrangement support. I >>>>> don't think allowing OVERLAYs to be placed under the PRIMARY is currently >>>>> documented as a way to support underlay. >>>> >>>> I always thought it's obvious that the zpos numbers dictate the plane >>>> order without any other rules. After all, we have the universal planes >>>> concept, where the plane type is only informational to aid heuristics >>>> rather than defining anything. >>>> >>>> Only if the zpos property does not exist, the plane types would come >>>> into play. >>>> >>>> Of course, if there actually exists userspace that fails if zpos allows >>>> an overlay type plane to be placed below primary, or fails if primary >>>> zpos is not zero, then DRM needs a new client cap. 
>> >> Right, it wasn't immediately clear to me that the API allowed placement of >> things beneath the PRIMARY. But reading the docs for drm_plane_create_zpos*, >> there's nothing that forbids it. >> >>>> >>>>> libliftoff for example, assumes that the PRIMARY has the lowest zpos. So >>>>> underlay arrangements will use an OVERLAY for the scanout plane, and the PRIMARY >>>>> for the underlay view. >>>> >>>> That's totally ok. It works, right? Plane type does not matter if the >>>> KMS driver accepts the configuration. >>>> >>>> What is a "scanout plane"? Aren't all KMS planes by definition scanout >>>> planes? >> >> Pardon my terminology, I thought the scanout plane was where weston rendered >> non-offloadable surfaces to. I guess it's more correct to call it the "render >> plane". On weston, it seems to be always assigned to the PRIMARY. >> > > The assignment restriction is just technical design debt. It is > limiting. There is no other good reason for it, than when lighting > up a CRTC for the first time, Weston should do it with the renderer FB > only, on the plane that is most likely to succeed i.e. PRIMARY. After > the CRTC is lit, there should be no built-in limitations in what can go > where. > > The reason for this is that if a CRTC can be activated, it must always > be able to show the renderer FB without incurring a modeset. This is > important for ensuring that the fallback compositing (renderer) is > always possible. So we start with that configuration, and everything > else is optional bonus. Genuinely curious - What exactly is limiting with keeping the renderer FB on PRIMARY? IOW, what is the additional benefit of placing the renderer FB on something other than PRIMARY? > >> >> For libliftoff, using OVERLAYs as the render plane and PRIMARY as the underlay >> plane would work. But I think keeping the render plane on PRIMARY (a la weston) >> makes underlay arrangements easier to allocate, and would be nice to incorporate >> into a shared algorithm. 
> > If zpos exists, I don't think such limitation is a good idea. It will > just limit the possible configurations for no reason. > > With zpos, the KMS plane type should be irrelevant for their > z-ordering. Underlay vs. overlay completely loses its meaning at the > KMS level. Right, the plane types lose their meaning. But at least with the way libliftoff builds the plane arrangement, where we first allocate the renderer FB matters. libliftoff incrementally builds the atomic state by adding a single plane to the atomic state, then testing it. It essentially does a depth-first search of all possible arrangements, pruning a branch whenever the atomic test fails. The state that offloads the largest number of FBs will be the arrangement used. Of course, it's unlikely that the entire DFS tree will be traversed in time for a frame. So the key is to search the most probable and high-benefit branches first, while minimizing the number of atomic tests needed, before a hard-coded deadline is hit. Following this algorithm, the PRIMARY needs to be enabled first, followed by all the secondary planes. After a plane is enabled, it's not preferred to change its assigned FB, since that can cause the state to be rejected (in actuality, not just the FB, but also any color and transformation state associated with the surface). It is preferable to build on the state by enabling another FB-to-plane assignment. This is where changing a plane's zpos to be above/below the PRIMARY is advantageous, rather than changing the FBs assigned, to accommodate overlay/underlay arrangements. I imagine that any algorithm which incrementally builds up the plane arrangement will have a similar preference. Of course, it's entirely possible that such an algorithm isn't the best; I admittedly have not thought much about other possibilities, yet... Thanks, Leo > >> In an underlay arrangement, pushing down an OVERLAY's zpos below the PRIMARY's >> zpos is simpler than swapping their surfaces. 
If such an arrangement fails >> atomic_test, we won't have to worry about swapping the surfaces back. Of course, >> it's not that we can't keep track of that in the algorithm, but I think it does >> make things easier. > > There is no "swapping" or "swapping back". The tentative configuration > is created as a new object that contains the complete CRTC+connector > state, and if it doesn't work, it's simply destroyed. In Weston at > least, I don't know of libliftoff. > > One surface could also be assigned to multiple KMS planes for different > CRTCs, so there should be no 1:1 association in the first place. > >> It may help with reducing the amount of atomic tests. Assuming that the same DRM >> plane provides the same format/color management/transformation support >> regardless of it's zpos, > > I would definitely expect so. > >> we should be able to reasonably expect that changing >> it's z-ordering will not cause atomic_test failures (or at least, expect less >> causes for failure). In other words, swapping the render plane from the PRIMARY >> to an OVERLAY might have more causes for an atomic_test fail, versus changing >> their z-ordering. The driver might have to do more things under-the-hood to >> provide this consistent behavior, but I think that's the right place for it. >> After all, drivers should know more about their hardware's behavior. > > Indeed. > >> The assumption that the PRIMARY has the lowest zpos isn't always true. I >> was made aware that the imx8mq platform places all of their OVERLAYS beneath the >> PRIMARY. Granted, the KMS code for enabling OVERLAYS is not upstream yet, but it >> is available from this thread: >> https://gitlab.freedesktop.org/wayland/weston/-/merge_requests/1258#note_2319898 >> . I guess this is more of a bad assumption that should be fixed in libliftoff. > > Weston needs fixing too, at least in case a renderer FB is used on the > CRTC. 
Weston has two problems: renderer FB is always on PRIMARY plane, > and renderer FB is always completely opaque. > > > Thanks, > pq > >>>> >>>> IOW, if the KMS client understands zpos and can do a proper KMS >>>> configuration search, and all planes have zpos property, then there is >>>> no need to look at the plane type at all. That is the goal of the >>>> universal planes feature. >>> >>> The optimal configuration with DCN hardware is using underlays. E.g., >>> the desktop plane would be at the top and would have holes cut out of >>> it for videos or windows that want their own plane. If you do it the >>> other way around, there are lots of limitations. >>> >>> Alex >> >> Right, patch 1/2 tries to work around one of these limitations (cursor-on-yuv). >> Others have mentioned we can do the same for scaling. >> >> Thanks, >> Leo
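The incremental search described in this message (add one plane, test, prune on failure, stop at a deadline, keep the state that offloads the most FBs) can be sketched roughly as below. This is not libliftoff's code; `search` and the `atomic_test` callback are hypothetical, states are plain dicts mapping a plane name to an FB name, and zpos handling is left out for brevity.

```python
import time

def search(planes, fbs, atomic_test, deadline):
    """Depth-first search for the accepted state that offloads the most FBs."""
    best = {}

    def visit(index, state):
        nonlocal best
        if len(state) > len(best):
            best = dict(state)
        if index == len(planes) or time.monotonic() >= deadline:
            return
        plane = planes[index]
        for fb in fbs:
            if fb in state.values():
                continue                      # each FB is placed at most once
            candidate = {**state, plane: fb}
            if atomic_test(candidate):        # prune the branch on failure
                visit(index + 1, candidate)
        visit(index + 1, state)               # also try leaving this plane unused

    visit(0, {})
    return best
```

Listing the PRIMARY first in `planes` reproduces the "enable PRIMARY first" ordering. An accepted assignment is only ever extended and never rewritten, which is the property the message argues makes zpos changes cheaper than FB swaps.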
On Mon, 15 Apr 2024 18:33:39 -0400 Leo Li <sunpeng.li@amd.com> wrote: > On 2024-04-15 04:19, Pekka Paalanen wrote: > > On Fri, 12 Apr 2024 16:14:28 -0400 > > Leo Li <sunpeng.li@amd.com> wrote: > > > >> On 2024-04-12 11:31, Alex Deucher wrote: > >>> On Fri, Apr 12, 2024 at 11:08 AM Pekka Paalanen > >>> <pekka.paalanen@collabora.com> wrote: > >>>> > >>>> On Fri, 12 Apr 2024 10:28:52 -0400 > >>>> Leo Li <sunpeng.li@amd.com> wrote: > >>>> > >>>>> On 2024-04-12 04:03, Pekka Paalanen wrote: > >>>>>> On Thu, 11 Apr 2024 16:33:57 -0400 > >>>>>> Leo Li <sunpeng.li@amd.com> wrote: > >>>>>> > >>>> > >>>> ... > >>>> > >>>>>>> That begs the question of what can be nailed down and what can left to > >>>>>>> independent implementation. I guess things like which plane should be enabled > >>>>>>> first (PRIMARY), and how zpos should be interpreted (overlay, underlay, mixed) > >>>>>>> can be defined. How to handle atomic test failures could be as well. > >>>>>> > >>>>>> What room is there for the interpretation of zpos values? > >>>>>> > >>>>>> I thought they are unambiguous already: only the relative numerical > >>>>>> order matters, and that uniquely defines the KMS plane ordering. > >>>>> > >>>>> The zpos value of the PRIMARY plane relative to OVERLAYS, for example, as a way > >>>>> for vendors to communicate overlay, underlay, or mixed-arrangement support. I > >>>>> don't think allowing OVERLAYs to be placed under the PRIMARY is currently > >>>>> documented as a way to support underlay. > >>>> > >>>> I always thought it's obvious that the zpos numbers dictate the plane > >>>> order without any other rules. After all, we have the universal planes > >>>> concept, where the plane type is only informational to aid heuristics > >>>> rather than defining anything. > >>>> > >>>> Only if the zpos property does not exist, the plane types would come > >>>> into play. 
> >>>> > >>>> Of course, if there actually exists userspace that fails if zpos allows > >>>> an overlay type plane to be placed below primary, or fails if primary > >>>> zpos is not zero, then DRM needs a new client cap. > >> > >> Right, it wasn't immediately clear to me that the API allowed placement of > >> things beneath the PRIMARY. But reading the docs for drm_plane_create_zpos*, > >> there's nothing that forbids it. > >> > >>>> > >>>>> libliftoff for example, assumes that the PRIMARY has the lowest zpos. So > >>>>> underlay arrangements will use an OVERLAY for the scanout plane, and the PRIMARY > >>>>> for the underlay view. > >>>> > >>>> That's totally ok. It works, right? Plane type does not matter if the > >>>> KMS driver accepts the configuration. > >>>> > >>>> What is a "scanout plane"? Aren't all KMS planes by definition scanout > >>>> planes? > >> > >> Pardon my terminology, I thought the scanout plane was where weston rendered > >> non-offloadable surfaces to. I guess it's more correct to call it the "render > >> plane". On weston, it seems to be always assigned to the PRIMARY. > >> > > > > The assignment restriction is just technical design debt. It is > > limiting. There is no other good reason for it, than when lighting > > up a CRTC for the first time, Weston should do it with the renderer FB > > only, on the plane that is most likely to succeed i.e. PRIMARY. After > > the CRTC is lit, there should be no built-in limitations in what can go > > where. > > > > The reason for this is that if a CRTC can be activated, it must always > > be able to show the renderer FB without incurring a modeset. This is > > important for ensuring that the fallback compositing (renderer) is > > always possible. So we start with that configuration, and everything > > else is optional bonus. > > Genuinely curious - What exactly is limiting with keeping the renderer FB on > PRIMARY? 
IOW, what is the additional benefit of placing the renderer FB on > something other than PRIMARY? The limitations come from a combination of hardware limitations. Perhaps zpos is not mutable, or maybe other planes cannot arbitrarily move between above and below the primary. This reduces the number of possible configurations, which might cause off-loading to fail. I think older hardware has more of these arbitrary restrictions. > >> > >> For libliftoff, using OVERLAYs as the render plane and PRIMARY as the underlay > >> plane would work. But I think keeping the render plane on PRIMARY (a la weston) > >> makes underlay arrangements easier to allocate, and would be nice to incorporate > >> into a shared algorithm. > > > > If zpos exists, I don't think such limitation is a good idea. It will > > just limit the possible configurations for no reason. > > > > With zpos, the KMS plane type should be irrelevant for their > > z-ordering. Underlay vs. overlay completely loses its meaning at the > > KMS level. > > Right, the plane types loose their meanings. But at least with the way > libliftoff builds the plane arrangement, where we first allocate the renderer fb > matters. > > libliftoff incrementally builds the atomic state by adding a single plane to the > atomic state, then testing it. It essentially does a depth-first-search of all > possible arrangements, pruning the search on atomic test fail. The state that > offloads the most number of FBs will be the arrangement used. > > Of course, it's unlikely that the entire DFS tree will traversed in time for a > frame. So the key is to search the most probable and high-benefit branches > first, while minimizing the # of atomic tests needed, before a hard-coded > deadline is hit. > > Following this algorithm, the PRIMARY needs to be enabled first, followed by all > the secondary planes. 
After a plane is enabled, it's not preferred to change > it's assigned FB, since that can cause the state to be rejected (in actuality, > not just the FB, but also any color and transformation stuffs associated with > the surface). It is preferable to build on the state by enabling another > fb->plane. This is where changing a plane's zpos to be above/below the PRIMARY > is advantageous, rather than changing the FBs assigned, to accommodate > overlay/underlay arrangements. This all sounds reasonable, but why limit this to only the renderer FB on primary plane? The same idea should apply equally to any FB on any plane. Then one needs more heuristics on when to stop the search short, and when to reconsider each FB-plane assignment in case new candidates have appeared but the old ones have not disappeared. > I imagine that any algorithm which incrementally builds up the plane arrangement > will have a similar preference. Of course, it's entirely possible that such an > algorithm isn't the best, I admittedly have not thought much about other > possibilities, yet... It's a complicated problem, indeed. Maybe there needs to be a background task that is not limited by the page flip deadline and can do an exhaustive search over many refresh periods. Thanks, pq
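One way to picture the background-task idea is a resumable search that spends a small budget of atomic tests per refresh period and remembers the best state found so far. This is only a sketch under assumptions: `candidates` and `BackgroundSearch` are hypothetical names, `atomic_test` stands in for a test-only commit, and a real compositor would also have to restart the search whenever the set of FBs changes.

```python
import itertools

def candidates(planes, fbs):
    """Lazily enumerate assignments of distinct FBs to planes, largest first."""
    for count in range(len(planes), 0, -1):
        for chosen in itertools.combinations(planes, count):
            for perm in itertools.permutations(fbs, count):
                yield dict(zip(chosen, perm))

class BackgroundSearch:
    def __init__(self, planes, fbs, atomic_test):
        self._pending = candidates(planes, fbs)
        self._atomic_test = atomic_test
        self.best = {}
        self.done = False

    def step(self, budget):
        """Run at most `budget` atomic tests, then hand control back."""
        tests = 0
        while tests < budget:
            candidate = next(self._pending, None)
            if candidate is None:
                self.done = True              # search space exhausted
                break
            if len(candidate) <= len(self.best):
                continue                      # cannot beat the best known state
            tests += 1
            if self._atomic_test(candidate):
                self.best = candidate
        return self.best
```

Calling `step()` once per refresh period keeps the page-flip path bounded while the enumeration continues across frames.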
On 2024-04-16 04:01, Pekka Paalanen wrote: > On Mon, 15 Apr 2024 18:33:39 -0400 > Leo Li <sunpeng.li@amd.com> wrote: > >> On 2024-04-15 04:19, Pekka Paalanen wrote: >>> On Fri, 12 Apr 2024 16:14:28 -0400 >>> Leo Li <sunpeng.li@amd.com> wrote: >>> >>>> On 2024-04-12 11:31, Alex Deucher wrote: >>>>> On Fri, Apr 12, 2024 at 11:08 AM Pekka Paalanen >>>>> <pekka.paalanen@collabora.com> wrote: >>>>>> >>>>>> On Fri, 12 Apr 2024 10:28:52 -0400 >>>>>> Leo Li <sunpeng.li@amd.com> wrote: >>>>>> >>>>>>> On 2024-04-12 04:03, Pekka Paalanen wrote: >>>>>>>> On Thu, 11 Apr 2024 16:33:57 -0400 >>>>>>>> Leo Li <sunpeng.li@amd.com> wrote: >>>>>>>> >>>>>> >>>>>> ... >>>>>> >>>>>>>>> That begs the question of what can be nailed down and what can left to >>>>>>>>> independent implementation. I guess things like which plane should be enabled >>>>>>>>> first (PRIMARY), and how zpos should be interpreted (overlay, underlay, mixed) >>>>>>>>> can be defined. How to handle atomic test failures could be as well. >>>>>>>> >>>>>>>> What room is there for the interpretation of zpos values? >>>>>>>> >>>>>>>> I thought they are unambiguous already: only the relative numerical >>>>>>>> order matters, and that uniquely defines the KMS plane ordering. >>>>>>> >>>>>>> The zpos value of the PRIMARY plane relative to OVERLAYS, for example, as a way >>>>>>> for vendors to communicate overlay, underlay, or mixed-arrangement support. I >>>>>>> don't think allowing OVERLAYs to be placed under the PRIMARY is currently >>>>>>> documented as a way to support underlay. >>>>>> >>>>>> I always thought it's obvious that the zpos numbers dictate the plane >>>>>> order without any other rules. After all, we have the universal planes >>>>>> concept, where the plane type is only informational to aid heuristics >>>>>> rather than defining anything. >>>>>> >>>>>> Only if the zpos property does not exist, the plane types would come >>>>>> into play. 
>>>>>>
>>>>>> Of course, if there actually exists userspace that fails if zpos allows
>>>>>> an overlay type plane to be placed below primary, or fails if primary
>>>>>> zpos is not zero, then DRM needs a new client cap.
>>>>
>>>> Right, it wasn't immediately clear to me that the API allowed placement of
>>>> things beneath the PRIMARY. But reading the docs for drm_plane_create_zpos*,
>>>> there's nothing that forbids it.
>>>>
>>>>>>
>>>>>>> libliftoff for example, assumes that the PRIMARY has the lowest zpos. So
>>>>>>> underlay arrangements will use an OVERLAY for the scanout plane, and the PRIMARY
>>>>>>> for the underlay view.
>>>>>>
>>>>>> That's totally ok. It works, right? Plane type does not matter if the
>>>>>> KMS driver accepts the configuration.
>>>>>>
>>>>>> What is a "scanout plane"? Aren't all KMS planes by definition scanout
>>>>>> planes?
>>>>
>>>> Pardon my terminology, I thought the scanout plane was where weston rendered
>>>> non-offloadable surfaces to. I guess it's more correct to call it the "render
>>>> plane". On weston, it seems to be always assigned to the PRIMARY.
>>>>
>>>
>>> The assignment restriction is just technical design debt. It is
>>> limiting. There is no other good reason for it, than when lighting
>>> up a CRTC for the first time, Weston should do it with the renderer FB
>>> only, on the plane that is most likely to succeed i.e. PRIMARY. After
>>> the CRTC is lit, there should be no built-in limitations in what can go
>>> where.
>>>
>>> The reason for this is that if a CRTC can be activated, it must always
>>> be able to show the renderer FB without incurring a modeset. This is
>>> important for ensuring that the fallback compositing (renderer) is
>>> always possible. So we start with that configuration, and everything
>>> else is optional bonus.
>>
>> Genuinely curious - What exactly is limiting with keeping the renderer FB on
>> PRIMARY?
>> IOW, what is the additional benefit of placing the renderer FB on
>> something other than PRIMARY?
>
> The limitations come from a combination of hardware limitations.
> Perhaps zpos is not mutable, or maybe other planes cannot arbitrarily
> move between above and below the primary. This reduces the number of
> possible configurations, which might cause off-loading to fail.
>
> I think older hardware has more of these arbitrary restrictions.
>
>>>>
>>>> For libliftoff, using OVERLAYs as the render plane and PRIMARY as the underlay
>>>> plane would work. But I think keeping the render plane on PRIMARY (a la weston)
>>>> makes underlay arrangements easier to allocate, and would be nice to incorporate
>>>> into a shared algorithm.
>>>
>>> If zpos exists, I don't think such limitation is a good idea. It will
>>> just limit the possible configurations for no reason.
>>>
>>> With zpos, the KMS plane type should be irrelevant for their
>>> z-ordering. Underlay vs. overlay completely loses its meaning at the
>>> KMS level.
>>
>> Right, the plane types loose their meanings. But at least with the way
>> libliftoff builds the plane arrangement, where we first allocate the renderer fb
>> matters.
>>
>> libliftoff incrementally builds the atomic state by adding a single plane to the
>> atomic state, then testing it. It essentially does a depth-first-search of all
>> possible arrangements, pruning the search on atomic test fail. The state that
>> offloads the most number of FBs will be the arrangement used.
>>
>> Of course, it's unlikely that the entire DFS tree will traversed in time for a
>> frame. So the key is to search the most probable and high-benefit branches
>> first, while minimizing the # of atomic tests needed, before a hard-coded
>> deadline is hit.
>>
>> Following this algorithm, the PRIMARY needs to be enabled first, followed by all
>> the secondary planes.
>> After a plane is enabled, it's not preferred to change
>> it's assigned FB, since that can cause the state to be rejected (in actuality,
>> not just the FB, but also any color and transformation stuffs associated with
>> the surface). It is preferable to build on the state by enabling another
>> fb->plane. This is where changing a plane's zpos to be above/below the PRIMARY
>> is advantageous, rather than changing the FBs assigned, to accommodate
>> overlay/underlay arrangements.
>
> This all sounds reasonable, but why limit this to only the renderer FB
> on primary plane? The same idea should apply equally to any FB on any
> plane. Then one needs more heuristics on when to stop the search short,
> and when to reconsider each FB-plane assignment in case new candidates
> have appeared but the old ones have not disappeared.
>
>> I imagine that any algorithm which incrementally builds up the plane arrangement
>> will have a similar preference. Of course, it's entirely possible that such an
>> algorithm isn't the best, I admittedly have not thought much about other
>> possibilities, yet...
>
> It's a complicated problem, indeed. Maybe there needs to be a background
> task that is not limited by the page flip deadline and can do an
> exhaustive search over many refresh periods.
>

That would be nice. Kick this off when there is a configuration change,
e.g., user starts video playback, opens a new video, etc.

One would need to avoid doing too much of that, though, as one could
envision scenarios where this happens frequently and could have its
own impact on power by keeping the CPU busy.

Harry

>
> Thanks,
> pq
On 2024-04-16 10:10, Harry Wentland wrote:
>
>
> On 2024-04-16 04:01, Pekka Paalanen wrote:
>> On Mon, 15 Apr 2024 18:33:39 -0400
>> Leo Li <sunpeng.li@amd.com> wrote:
>>
>>> On 2024-04-15 04:19, Pekka Paalanen wrote:
>>>> On Fri, 12 Apr 2024 16:14:28 -0400
>>>> Leo Li <sunpeng.li@amd.com> wrote:
>>>>
>>>>> On 2024-04-12 11:31, Alex Deucher wrote:
>>>>>> On Fri, Apr 12, 2024 at 11:08 AM Pekka Paalanen
>>>>>> <pekka.paalanen@collabora.com> wrote:
>>>>>>>
>>>>>>> On Fri, 12 Apr 2024 10:28:52 -0400
>>>>>>> Leo Li <sunpeng.li@amd.com> wrote:
>>>>>>>
>>>>>>>> On 2024-04-12 04:03, Pekka Paalanen wrote:
>>>>>>>>> On Thu, 11 Apr 2024 16:33:57 -0400
>>>>>>>>> Leo Li <sunpeng.li@amd.com> wrote:
>>>>>>>>>
>>>>>>>
>>>>>>> ...
>>>>>>>
>>>>>>>>>> That begs the question of what can be nailed down and what can left to
>>>>>>>>>> independent implementation. I guess things like which plane should be enabled
>>>>>>>>>> first (PRIMARY), and how zpos should be interpreted (overlay, underlay, mixed)
>>>>>>>>>> can be defined. How to handle atomic test failures could be as well.
>>>>>>>>>
>>>>>>>>> What room is there for the interpretation of zpos values?
>>>>>>>>>
>>>>>>>>> I thought they are unambiguous already: only the relative numerical
>>>>>>>>> order matters, and that uniquely defines the KMS plane ordering.
>>>>>>>>
>>>>>>>> The zpos value of the PRIMARY plane relative to OVERLAYS, for example, as a way
>>>>>>>> for vendors to communicate overlay, underlay, or mixed-arrangement support. I
>>>>>>>> don't think allowing OVERLAYs to be placed under the PRIMARY is currently
>>>>>>>> documented as a way to support underlay.
>>>>>>>
>>>>>>> I always thought it's obvious that the zpos numbers dictate the plane
>>>>>>> order without any other rules. After all, we have the universal planes
>>>>>>> concept, where the plane type is only informational to aid heuristics
>>>>>>> rather than defining anything.
>>>>>>>
>>>>>>> Only if the zpos property does not exist, the plane types would come
>>>>>>> into play.
>>>>>>>
>>>>>>> Of course, if there actually exists userspace that fails if zpos allows
>>>>>>> an overlay type plane to be placed below primary, or fails if primary
>>>>>>> zpos is not zero, then DRM needs a new client cap.
>>>>>
>>>>> Right, it wasn't immediately clear to me that the API allowed placement of
>>>>> things beneath the PRIMARY. But reading the docs for drm_plane_create_zpos*,
>>>>> there's nothing that forbids it.
>>>>>
>>>>>>>
>>>>>>>> libliftoff for example, assumes that the PRIMARY has the lowest zpos. So
>>>>>>>> underlay arrangements will use an OVERLAY for the scanout plane, and the PRIMARY
>>>>>>>> for the underlay view.
>>>>>>>
>>>>>>> That's totally ok. It works, right? Plane type does not matter if the
>>>>>>> KMS driver accepts the configuration.
>>>>>>>
>>>>>>> What is a "scanout plane"? Aren't all KMS planes by definition scanout
>>>>>>> planes?
>>>>>
>>>>> Pardon my terminology, I thought the scanout plane was where weston rendered
>>>>> non-offloadable surfaces to. I guess it's more correct to call it the "render
>>>>> plane". On weston, it seems to be always assigned to the PRIMARY.
>>>>>
>>>>
>>>> The assignment restriction is just technical design debt. It is
>>>> limiting. There is no other good reason for it, than when lighting
>>>> up a CRTC for the first time, Weston should do it with the renderer FB
>>>> only, on the plane that is most likely to succeed i.e. PRIMARY. After
>>>> the CRTC is lit, there should be no built-in limitations in what can go
>>>> where.
>>>>
>>>> The reason for this is that if a CRTC can be activated, it must always
>>>> be able to show the renderer FB without incurring a modeset. This is
>>>> important for ensuring that the fallback compositing (renderer) is
>>>> always possible. So we start with that configuration, and everything
>>>> else is optional bonus.
>>>
>>> Genuinely curious - What exactly is limiting with keeping the renderer FB on
>>> PRIMARY? IOW, what is the additional benefit of placing the renderer FB on
>>> something other than PRIMARY?
>>
>> The limitations come from a combination of hardware limitations.
>> Perhaps zpos is not mutable, or maybe other planes cannot arbitrarily
>> move between above and below the primary. This reduces the number of
>> possible configurations, which might cause off-loading to fail.
>>
>> I think older hardware has more of these arbitrary restrictions.

I see. I was thinking that drivers can do under-the-hood stuff to present a
mutable zpos to clients, even if their hardware planes cannot be arbitrarily
rearranged, by mapping the PRIMARY to a different hardware plane. But not all
planes have the same function, so this sounds more complicated than helpful.

>>
>>>>>
>>>>> For libliftoff, using OVERLAYs as the render plane and PRIMARY as the underlay
>>>>> plane would work. But I think keeping the render plane on PRIMARY (a la weston)
>>>>> makes underlay arrangements easier to allocate, and would be nice to incorporate
>>>>> into a shared algorithm.
>>>>
>>>> If zpos exists, I don't think such limitation is a good idea. It will
>>>> just limit the possible configurations for no reason.
>>>>
>>>> With zpos, the KMS plane type should be irrelevant for their
>>>> z-ordering. Underlay vs. overlay completely loses its meaning at the
>>>> KMS level.
>>>
>>> Right, the plane types loose their meanings. But at least with the way
>>> libliftoff builds the plane arrangement, where we first allocate the renderer fb
>>> matters.
>>>
>>> libliftoff incrementally builds the atomic state by adding a single plane to the
>>> atomic state, then testing it. It essentially does a depth-first-search of all
>>> possible arrangements, pruning the search on atomic test fail. The state that
>>> offloads the most number of FBs will be the arrangement used.
>>>
>>> Of course, it's unlikely that the entire DFS tree will traversed in time for a
>>> frame. So the key is to search the most probable and high-benefit branches
>>> first, while minimizing the # of atomic tests needed, before a hard-coded
>>> deadline is hit.
>>>
>>> Following this algorithm, the PRIMARY needs to be enabled first, followed by all
>>> the secondary planes. After a plane is enabled, it's not preferred to change
>>> it's assigned FB, since that can cause the state to be rejected (in actuality,
>>> not just the FB, but also any color and transformation stuffs associated with
>>> the surface). It is preferable to build on the state by enabling another
>>> fb->plane. This is where changing a plane's zpos to be above/below the PRIMARY
>>> is advantageous, rather than changing the FBs assigned, to accommodate
>>> overlay/underlay arrangements.
>>
>> This all sounds reasonable, but why limit this to only the renderer FB
>> on primary plane? The same idea should apply equally to any FB on any
>> plane. Then one needs more heuristics on when to stop the search short,
>> and when to reconsider each FB-plane assignment in case new candidates
>> have appeared but the old ones have not disappeared.

libliftoff starts the search by assigning the renderer FB, if one is provided
by the compositor, to PRIMARY. I think the reason is to always have the
renderer option available for FBs that need it.

Eventually, if the search tree is traversed enough, an arrangement that does
not need the renderer fb may be found, if all the FBs can be assigned, and
there are enough planes for them. But we may not get there before the
deadline. Perhaps having more time to search is the solution here.

(p.s. if a candidate FB is added or removed, libliftoff starts the search anew)

>>
>>> I imagine that any algorithm which incrementally builds up the plane arrangement
>>> will have a similar preference.
>>> Of course, it's entirely possible that such an
>>> algorithm isn't the best, I admittedly have not thought much about other
>>> possibilities, yet...
>>
>> It's a complicated problem, indeed. Maybe there needs to be a background
>> task that is not limited by the page flip deadline and can do an
>> exhaustive search over many refresh periods.
>>
>
> That would be nice. Kick this off when there is a configuration change,
> e.g., user starts video playback, opens a new video, etc.
>
> One would need to avoid doing too much of that, though, as one could
> envision scenarios where this happens frequently and could have its
> own impact on power by keeping the CPU busy.
>
> Harry

I recall emersion had a similar suggestion for libliftoff: cache the
incomplete plane arrangement once the deadline is reached, and continue
processing it on future frames. It avoids the need for a separate task.

Having more time to do a more exhaustive search would make zpos meaningless
outside of determining the correct z-ordering, as pq previously mentioned. It
would support hardware that has zpos limitations. It is more complex, but
maybe that's fine, as long as the complexity doesn't bleed into other parts of
the compositor.

There are still ways to limit the # of atomic tests needed for the search,
which will help speed things up (already considered by libliftoff today):

* IN_FORMATS property for what FB formats a plane supports
* zpos property for correct z-ordering
* Occlusion rules. A FB occluded by a rendered FB or underlay-ed FB cannot be
  overlay-ed, for example
* And potentially more

Thanks,
Leo

>
>>
>> Thanks,
>> pq
>
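For readers following the thread, the incremental search described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not libliftoff's actual code: `search_arrangement`, `fake_atomic_test`, the plane and layer names, and the `max_tests` budget (standing in for the hard-coded deadline) are all hypothetical, and the callback takes the place of a real test-only atomic commit.

```python
from typing import Callable, Dict, List, Tuple

Assignment = Dict[str, str]  # plane name -> layer (FB) name


def search_arrangement(
    planes: List[str],
    layers: List[str],
    atomic_test: Callable[[Assignment], bool],
    max_tests: int,
) -> Tuple[Assignment, int]:
    """Depth-first search for the assignment that offloads the most layers.

    Planes are visited in list order, one plane is added per step, and a
    branch is pruned as soon as atomic_test() rejects it. The search stops
    once max_tests tests have been spent (a stand-in for a frame deadline).
    """
    best: Assignment = {}
    tests_used = 0

    def dfs(plane_idx: int, current: Assignment, used: frozenset) -> None:
        nonlocal best, tests_used
        # 'current' only ever holds states that already passed the test.
        if len(current) > len(best):
            best = dict(current)
        if plane_idx == len(planes):
            return
        plane = planes[plane_idx]
        for layer in layers:
            if layer in used:
                continue
            if tests_used >= max_tests:
                return  # budget exhausted, keep the best state found so far
            candidate = {**current, plane: layer}
            tests_used += 1
            if not atomic_test(candidate):
                continue  # prune this branch
            dfs(plane_idx + 1, candidate, used | {layer})
        # Also consider leaving this plane disabled.
        dfs(plane_idx + 1, current, used)

    dfs(0, {}, frozenset())
    return best, tests_used


# Hypothetical driver behaviour: PRIMARY must carry the renderer FB, and
# the "video" layer is rejected on OVERLAY1.
def fake_atomic_test(state: Assignment) -> bool:
    if state.get("PRIMARY") != "renderer":
        return False
    return state.get("OVERLAY1") != "video"


planes = ["PRIMARY", "OVERLAY0", "OVERLAY1"]
layers = ["renderer", "video", "ui"]
best, used = search_arrangement(planes, layers, fake_atomic_test, max_tests=100)
```

Listing PRIMARY and the renderer FB first mirrors the "enable PRIMARY first" preference from the discussion. With a small budget the function simply returns the best partial arrangement reached so far, which is also the state one would cache if the search were to be resumed on a later frame.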
From: Leo Li <sunpeng.li@amd.com>

These patches aim to make the amdgpu KMS driver play nicer with compositors
when building multi-plane scanout configurations. They do so by:

1. Making cursor behavior more sensible.
2. Allowing placement of DRM OVERLAY planes underneath the PRIMARY plane for
   'underlay' configurations (perhaps more of an RFC, see below).

Please see the commit messages for details.


For #2, the simplest way to accomplish this was to increase the value of the
immutable zpos property for the PRIMARY plane. This allowed OVERLAY planes with
a mutable zpos range of (0-254) to be positioned underneath the PRIMARY for an
underlay scanout configuration.

Technically speaking, DCN hardware does not have a concept of primary or overlay
planes - there are simply 4 general purpose hardware pipes that can be mapped in
any configuration. So the immutable zpos restriction on the PRIMARY plane is
kind of arbitrary; it can have a mutable range of (0-254) just like the
OVERLAYs. The distinction between PRIMARY and OVERLAY planes is also somewhat
arbitrary. We can interpret PRIMARY as the first plane that should be enabled on
a CRTC, but beyond that, it doesn't mean much for amdgpu.

Therefore, I'm curious about how compositor devs understand KMS planes and
their zpos properties, and how we would like to use them. It isn't clear to me
how compositors wish to interpret and use the DRM zpos property, or
differentiate between OVERLAY and PRIMARY planes, when it comes to setting up
multi-plane scanout.

Ultimately, what I'd like to answer is "What can we do on the KMS driver and DRM
plane API side, that can make building multi-plane scanout configurations easier
for compositors?" I'm hoping we can converge on something, whether that be
updating the existing documentation to better define the usage, or updating the
API to provide support for something that is lacking.
Thanks,
Leo


Some links to provide context and details:
* What is underlay?: https://gitlab.freedesktop.org/emersion/libliftoff/-/issues/76
* Discussion on how to implement underlay on Weston:
  https://gitlab.freedesktop.org/wayland/weston/-/merge_requests/1258#note_2325164

Cc: Joshua Ashton <joshua@froggi.es>
Cc: Michel Dänzer <mdaenzer@redhat.com>
Cc: Chao Guo <chao.guo@nxp.com>
Cc: Xaver Hugl <xaver.hugl@gmail.com>
Cc: Vikas Korjani <Vikas.Korjani@amd.com>
Cc: Robert Mader <robert.mader@posteo.de>
Cc: Pekka Paalanen <pekka.paalanen@collabora.com>
Cc: Sean Paul <sean@poorly.run>
Cc: Simon Ser <contact@emersion.fr>
Cc: Shashank Sharma <shashank.sharma@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Sebastian Wick <sebastian.wick@redhat.com>

Leo Li (2):
  drm/amd/display: Introduce overlay cursor mode
  drm/amd/display: Move PRIMARY plane zpos higher

 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 405 ++++++++++++++++--
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h |   7 +
 .../amd/display/amdgpu_dm/amdgpu_dm_crtc.c    |   1 +
 .../amd/display/amdgpu_dm/amdgpu_dm_plane.c   |  28 +-
 4 files changed, 391 insertions(+), 50 deletions(-)
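To make the zpos point from the cover letter concrete, here is a small sketch of how a compositor might derive the stacking order purely from relative zpos values and label each non-PRIMARY plane as an underlay or an overlay. The `Plane` class, the helper names, and the zpos values (including the PRIMARY value of 2) are assumptions made for illustration; they are not taken from the patches or from any real API.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Plane:
    name: str
    plane_type: str  # "PRIMARY" or "OVERLAY"; informational only
    zpos: int


def stacking_order(planes: List[Plane]) -> List[str]:
    """Return plane names bottom-to-top; only relative zpos values matter.

    Planes sharing a zpos value are not handled specially in this sketch.
    """
    return [p.name for p in sorted(planes, key=lambda p: p.zpos)]


def classify_against_primary(planes: List[Plane]) -> Dict[str, str]:
    """Label every non-PRIMARY plane as an underlay or overlay of PRIMARY."""
    primary = next(p for p in planes if p.plane_type == "PRIMARY")
    return {
        p.name: "underlay" if p.zpos < primary.zpos else "overlay"
        for p in planes
        if p is not primary
    }


# Hypothetical configuration: PRIMARY sits at a fixed zpos above 0, so an
# OVERLAY plane with a mutable zpos can be placed beneath it.
planes = [
    Plane("primary", "PRIMARY", zpos=2),
    Plane("video", "OVERLAY", zpos=0),
    Plane("ui", "OVERLAY", zpos=5),
]
order = stacking_order(planes)
roles = classify_against_primary(planes)
```

Note that the plane type is only consulted to pick the reference plane; the ordering itself comes from zpos alone, which matches the reading of zpos given in the replies.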