Message ID | 20240112203803.6421-1-ian.forbes@broadcom.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | drm/vmwgfx: Filter modes which exceed 3/4 of graphics memory. | expand |
Hi Am 12.01.24 um 21:38 schrieb Ian Forbes: > SVGA requires surfaces to fit within graphics memory (max_mob_pages) which > means that modes with a final buffer size that would exceed graphics memory > must be pruned otherwise creation will fail. > > Additionally, device commands which use multiple graphics resources must > have all their resources fit within graphics memory for the duration of the > command. Thus we need a small carve out of 1/4 of graphics memory to ensure > commands likes surface copies to the primary framebuffer for cursor > composition or damage clips can fit within graphics memory. > > This fixes an issue where VMs with low graphics memory (< 64MiB) configured > with high resolution mode boot to a black screen because surface creation > fails. That is a long-standing problem, which we have observed with other drivers as well. On low-memory devices, TTM doesn't play well. The real fix would be to export all modes that possibly fit and sort out the invalid configurations in atomic_check. It's just a lot more work. Did you consider simply ignoring vmwgfx devices with less than 64 MiB of VRAM? > > Signed-off-by: Ian Forbes <ian.forbes@broadcom.com> > --- > drivers/gpu/drm/vmwgfx/vmwgfx_kms.c | 20 ++++++++++++-------- > 1 file changed, 12 insertions(+), 8 deletions(-) > > diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c > index 28ff30e32fab..39d6d17fc488 100644 > --- a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c > +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c > @@ -2854,12 +2854,12 @@ enum drm_mode_status vmw_connector_mode_valid(struct drm_connector *connector, > struct vmw_private *dev_priv = vmw_priv(dev); > u32 max_width = dev_priv->texture_max_width; > u32 max_height = dev_priv->texture_max_height; > - u32 assumed_cpp = 4; > + u32 assumed_cpp = dev_priv->assume_16bpp ? 2 : 4; > + u32 pitch = mode->hdisplay * assumed_cpp; > + u64 total = mode->vdisplay * pitch; > + bool using_stdu = dev_priv->active_display_unit == vmw_du_screen_target; > > - if (dev_priv->assume_16bpp) > - assumed_cpp = 2; > - > - if (dev_priv->active_display_unit == vmw_du_screen_target) { > + if (using_stdu) { > max_width = min(dev_priv->stdu_max_width, max_width); > max_height = min(dev_priv->stdu_max_height, max_height); > } > @@ -2870,9 +2870,13 @@ enum drm_mode_status vmw_connector_mode_valid(struct drm_connector *connector, > if (max_height < mode->vdisplay) > return MODE_BAD_VVALUE; > > - if (!vmw_kms_validate_mode_vram(dev_priv, > - mode->hdisplay * assumed_cpp, > - mode->vdisplay)) > + if (using_stdu && > + (total > (dev_priv->max_mob_pages * PAGE_SIZE * 3 / 4) || IDK the details of vmwgfx's memory management, but instead of this '3/4 test', wouldn't it be better to partition the VRAM via TTM and test against the partition sizes? Best regards Thomas > + total > dev_priv->max_mob_size)) { > + return MODE_MEM; > + } > + > + if (!vmw_kms_validate_mode_vram(dev_priv, pitch, mode->vdisplay)) > return MODE_MEM; > > return MODE_OK;
On Mon, Jan 15, 2024 at 3:21 AM Thomas Zimmermann <tzimmermann@suse.de> wrote: > > Hi > > Am 12.01.24 um 21:38 schrieb Ian Forbes: > > SVGA requires surfaces to fit within graphics memory (max_mob_pages) which > > means that modes with a final buffer size that would exceed graphics memory > > must be pruned otherwise creation will fail. > > > > Additionally, device commands which use multiple graphics resources must > > have all their resources fit within graphics memory for the duration of the > > command. Thus we need a small carve out of 1/4 of graphics memory to ensure > > commands likes surface copies to the primary framebuffer for cursor > > composition or damage clips can fit within graphics memory. > > > > This fixes an issue where VMs with low graphics memory (< 64MiB) configured > > with high resolution mode boot to a black screen because surface creation > > fails. > > That is a long-standing problem, which we have observed with other > drivers as well. On low-memory devices, TTM doesn't play well. The real > fix would be to export all modes that possibly fit and sort out the > invalid configurations in atomic_check. It's just a lot more work. > > Did you consider simply ignoring vmwgfx devices with less than 64 MiB of > VRAM? Unfortunately we can't do that because on new esx servers without gpu's the default is 16MB. A lot of people are still running their esx boxes with 4MB, which is in general the most common problem because with 4MB people still tend to like to set 1280x800 which with 32bpp fb takes 4096000 bytes and with 4MB available that leaves only 96KB available and we need more to also allocate things like the cursor. Even if ttm did everything right technically 1280x800 @ 32bpp resolution will fit in a 4MB graphics memory, but then the system will not be able to have a hardware (well, virtualized) cursor. It's extremely unlikely people would even be aware of this tradeoff when making the decision to increase resolution. So the driver either needs to preallocate all the memory it possibly might need for all the basic functionality before modesetting or the available modes need to be validated with some constraints. z
Hi Am 18.01.24 um 19:25 schrieb Zack Rusin: > On Mon, Jan 15, 2024 at 3:21 AM Thomas Zimmermann <tzimmermann@suse.de> wrote: >> >> Hi >> >> Am 12.01.24 um 21:38 schrieb Ian Forbes: >>> SVGA requires surfaces to fit within graphics memory (max_mob_pages) which >>> means that modes with a final buffer size that would exceed graphics memory >>> must be pruned otherwise creation will fail. >>> >>> Additionally, device commands which use multiple graphics resources must >>> have all their resources fit within graphics memory for the duration of the >>> command. Thus we need a small carve out of 1/4 of graphics memory to ensure >>> commands likes surface copies to the primary framebuffer for cursor >>> composition or damage clips can fit within graphics memory. >>> >>> This fixes an issue where VMs with low graphics memory (< 64MiB) configured >>> with high resolution mode boot to a black screen because surface creation >>> fails. >> >> That is a long-standing problem, which we have observed with other >> drivers as well. On low-memory devices, TTM doesn't play well. The real >> fix would be to export all modes that possibly fit and sort out the >> invalid configurations in atomic_check. It's just a lot more work. >> >> Did you consider simply ignoring vmwgfx devices with less than 64 MiB of >> VRAM? > > Unfortunately we can't do that because on new esx servers without > gpu's the default is 16MB. A lot of people are still running their esx > boxes with 4MB, which is in general the most common problem because > with 4MB people still tend to like to set 1280x800 which with 32bpp fb > takes 4096000 bytes and with 4MB available that leaves only 96KB > available and we need more to also allocate things like the cursor. > Even if ttm did everything right technically 1280x800 @ 32bpp > resolution will fit in a 4MB graphics memory, but then the system will > not be able to have a hardware (well, virtualized) cursor. It's > extremely unlikely people would even be aware of this tradeoff when > making the decision to increase resolution. Do you allocate buffer storage directly in the provided VRAM? If so how do you do page flips then? You'd need for the example of 1280x800-32, you'd need around 8 MiB to keep front and back buffer in VRAM. I guess, you only support the framebuffer console (which doesn't do pageflips)? In mgag200 and ast, I had the luxury for replacing TTM with SHMEM helpers, which worked around the problem easily. Maybe that's an option for low-memory systems? Best regards Thomas > > So the driver either needs to preallocate all the memory it possibly > might need for all the basic functionality before modesetting or the > available modes need to be validated with some constraints. > > z
On Fri, Jan 19, 2024 at 4:22 AM Thomas Zimmermann <tzimmermann@suse.de> wrote: > > Hi > > Am 18.01.24 um 19:25 schrieb Zack Rusin: > > On Mon, Jan 15, 2024 at 3:21 AM Thomas Zimmermann <tzimmermann@suse.de> wrote: > >> > >> Hi > >> > >> Am 12.01.24 um 21:38 schrieb Ian Forbes: > >>> SVGA requires surfaces to fit within graphics memory (max_mob_pages) which > >>> means that modes with a final buffer size that would exceed graphics memory > >>> must be pruned otherwise creation will fail. > >>> > >>> Additionally, device commands which use multiple graphics resources must > >>> have all their resources fit within graphics memory for the duration of the > >>> command. Thus we need a small carve out of 1/4 of graphics memory to ensure > >>> commands likes surface copies to the primary framebuffer for cursor > >>> composition or damage clips can fit within graphics memory. > >>> > >>> This fixes an issue where VMs with low graphics memory (< 64MiB) configured > >>> with high resolution mode boot to a black screen because surface creation > >>> fails. > >> > >> That is a long-standing problem, which we have observed with other > >> drivers as well. On low-memory devices, TTM doesn't play well. The real > >> fix would be to export all modes that possibly fit and sort out the > >> invalid configurations in atomic_check. It's just a lot more work. > >> > >> Did you consider simply ignoring vmwgfx devices with less than 64 MiB of > >> VRAM? > > > > Unfortunately we can't do that because on new esx servers without > > gpu's the default is 16MB. A lot of people are still running their esx > > boxes with 4MB, which is in general the most common problem because > > with 4MB people still tend to like to set 1280x800 which with 32bpp fb > > takes 4096000 bytes and with 4MB available that leaves only 96KB > > available and we need more to also allocate things like the cursor. > > Even if ttm did everything right technically 1280x800 @ 32bpp > > resolution will fit in a 4MB graphics memory, but then the system will > > not be able to have a hardware (well, virtualized) cursor. It's > > extremely unlikely people would even be aware of this tradeoff when > > making the decision to increase resolution. > > Do you allocate buffer storage directly in the provided VRAM? If so how > do you do page flips then? You'd need for the example of 1280x800-32, > you'd need around 8 MiB to keep front and back buffer in VRAM. I guess, > you only support the framebuffer console (which doesn't do pageflips)? In general, yes. Of course it's a little more convoluted because we'll act like OpenGL runtime here (i.e. glXSwapBuffers), i.e. our driver will fake page-flips because the only memory we'll have is a single buffer as the actual page-flipping happens in the presentation code on the host. So the guest is not aware of the actual presentation (it's also why we don't have any sort of vblank signaling in vmwgfx, the concept just doesn't exist for us). i.e. on para-virtualized drivers the actual page-flips will be property of the presentation code that's outside of the guest. It's definitely one those things that I wanted to have a good solution for in a while, in particular to have a better story behind vblank handling, but it's difficult because "presentation" on vm's is in general difficult to define - it might be some vnc connected host on the other continent. Having said that that's basically a wonky VRR display so we should be able to handle our presentation as VRR and give more control of updates to the guest, but we haven't done it yet. > In mgag200 and ast, I had the luxury for replacing TTM with SHMEM > helpers, which worked around the problem easily. Maybe that's an option > for low-memory systems? Our current device doesn't have the ability to present out of unspecified memory in the guest, i.e. the host, which is doing the presentation, is not aware of how guest kernel lays out the memory so we need to basically create a page-table for every graphics object (VMW_PL_MOB placement in vmwgfx) so that the host can actually find the memory it needs to read. So the shmem helpers would need something extra for us to be able to generate those page tables for the drm_gem_object's it deals with. z
On Fri, Jan 12, 2024 at 4:20 PM Ian Forbes <ian.forbes@broadcom.com> wrote: > > SVGA requires surfaces to fit within graphics memory (max_mob_pages) which > means that modes with a final buffer size that would exceed graphics memory > must be pruned otherwise creation will fail. > > Additionally, device commands which use multiple graphics resources must > have all their resources fit within graphics memory for the duration of the > command. Thus we need a small carve out of 1/4 of graphics memory to ensure > commands likes surface copies to the primary framebuffer for cursor > composition or damage clips can fit within graphics memory. > > This fixes an issue where VMs with low graphics memory (< 64MiB) configured > with high resolution mode boot to a black screen because surface creation > fails. > > Signed-off-by: Ian Forbes <ian.forbes@broadcom.com> > --- > drivers/gpu/drm/vmwgfx/vmwgfx_kms.c | 20 ++++++++++++-------- > 1 file changed, 12 insertions(+), 8 deletions(-) > > diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c > index 28ff30e32fab..39d6d17fc488 100644 > --- a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c > +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c > @@ -2854,12 +2854,12 @@ enum drm_mode_status vmw_connector_mode_valid(struct drm_connector *connector, > struct vmw_private *dev_priv = vmw_priv(dev); > u32 max_width = dev_priv->texture_max_width; > u32 max_height = dev_priv->texture_max_height; > - u32 assumed_cpp = 4; > + u32 assumed_cpp = dev_priv->assume_16bpp ? 2 : 4; > + u32 pitch = mode->hdisplay * assumed_cpp; > + u64 total = mode->vdisplay * pitch; > + bool using_stdu = dev_priv->active_display_unit == vmw_du_screen_target; > > - if (dev_priv->assume_16bpp) > - assumed_cpp = 2; > - > - if (dev_priv->active_display_unit == vmw_du_screen_target) { > + if (using_stdu) { > max_width = min(dev_priv->stdu_max_width, max_width); > max_height = min(dev_priv->stdu_max_height, max_height); > } > @@ -2870,9 +2870,13 @@ enum drm_mode_status vmw_connector_mode_valid(struct drm_connector *connector, > if (max_height < mode->vdisplay) > return MODE_BAD_VVALUE; > > - if (!vmw_kms_validate_mode_vram(dev_priv, > - mode->hdisplay * assumed_cpp, > - mode->vdisplay)) > + if (using_stdu && > + (total > (dev_priv->max_mob_pages * PAGE_SIZE * 3 / 4) || Could you export that computation somewhere where we could document why we're doing this? Just to not leave the awkward "* 3 /4" that everyone reading this code will wonder about? And also make sure you indent this correctly, "dim checkpatch" should warn about this. z
Hi, On Tue, 30 Jan 2024 at 18:39, Zack Rusin <zack.rusin@broadcom.com> wrote: > In general, yes. Of course it's a little more convoluted because we'll > act like OpenGL runtime here (i.e. glXSwapBuffers), i.e. our driver > will fake page-flips because the only memory we'll have is a single > buffer as the actual page-flipping happens in the presentation code on > the host. So the guest is not aware of the actual presentation (it's > also why we don't have any sort of vblank signaling in vmwgfx, the > concept just doesn't exist for us). i.e. on para-virtualized drivers > the actual page-flips will be property of the presentation code that's > outside of the guest. It's definitely one those things that I wanted > to have a good solution for in a while, in particular to have a better > story behind vblank handling, but it's difficult because > "presentation" on vm's is in general difficult to define - it might be > some vnc connected host on the other continent. Having said that > that's basically a wonky VRR display so we should be able to handle > our presentation as VRR and give more control of updates to the guest, > but we haven't done it yet. Please don't. Photon time is _a_ useful metric, but only backwards-informational. It's nice to give userspace a good forward estimate of when pixels will hit retinas, but as it's not fully reliable, the main part is being able to let it know when it did happen so it can adjust. Given that it's not reliable, we can't use it as a basis for preparing submissions though, so we don't, even on bare-metal drivers. As you've noted though, it really falls apart on non-bare-metal cases, especially where latency vastly exceeds throughput, or when either is hugely variable. So we don't ever use it as a basis. VRR is worse though. The FRR model is 'you can display new content every $period, and here's your basis so you can calibrate phase'. The VRR model is 'you can display new content so rapidly it's not worth trying to quantise, just fire it as rapidly as possible'. That's a world away from 'errrr ... might be 16ms, might be 500? dunno really'. The entire model we have is that basis timing flows backwards. The 'hardware' gives us a deadline, KMS angles to meet that with a small margin, the compositor angles to meet that with a margin again, and it lines up client repaints to hit that window too. Everything works on that model, so it's not super surprising that using svga is - to quote one of Weston's DRM-backend people who uses ESXi - 'a juddery mess'. Given that the entire ecosystem is based on this model, I don't think there's an easy way out where svga just does something wildly different. The best way to fix it is to probably work on predictable quantisation with updates: pick 5/12/47/60Hz to quantise to based on your current throughput, with something similar to hotplug/LINK_STATUS and faked EDID to let userspace know when the period changes. If you have variability within the cycle, e.g. dropped frames, then just suck it up and keep the illusion alive to userspace that it's presenting to a fixed period, and if/when you calculate there's a better quantisation then let userspace know what it is so it can adjust. But there's really no future in just doing random presentation rates, because that's not the API anyone has written for. Cheers, Daniel
On Tue, Jan 30, 2024 at 6:50 PM Daniel Stone <daniel@fooishbar.org> wrote: > > Hi, > > On Tue, 30 Jan 2024 at 18:39, Zack Rusin <zack.rusin@broadcom.com> wrote: > > In general, yes. Of course it's a little more convoluted because we'll > > act like OpenGL runtime here (i.e. glXSwapBuffers), i.e. our driver > > will fake page-flips because the only memory we'll have is a single > > buffer as the actual page-flipping happens in the presentation code on > > the host. So the guest is not aware of the actual presentation (it's > > also why we don't have any sort of vblank signaling in vmwgfx, the > > concept just doesn't exist for us). i.e. on para-virtualized drivers > > the actual page-flips will be property of the presentation code that's > > outside of the guest. It's definitely one those things that I wanted > > to have a good solution for in a while, in particular to have a better > > story behind vblank handling, but it's difficult because > > "presentation" on vm's is in general difficult to define - it might be > > some vnc connected host on the other continent. Having said that > > that's basically a wonky VRR display so we should be able to handle > > our presentation as VRR and give more control of updates to the guest, > > but we haven't done it yet. > > Please don't. > > Photon time is _a_ useful metric, but only backwards-informational. > It's nice to give userspace a good forward estimate of when pixels > will hit retinas, but as it's not fully reliable, the main part is > being able to let it know when it did happen so it can adjust. Given > that it's not reliable, we can't use it as a basis for preparing > submissions though, so we don't, even on bare-metal drivers. > > As you've noted though, it really falls apart on non-bare-metal cases, > especially where latency vastly exceeds throughput, or when either is > hugely variable. So we don't ever use it as a basis. > > VRR is worse though. The FRR model is 'you can display new content > every $period, and here's your basis so you can calibrate phase'. The > VRR model is 'you can display new content so rapidly it's not worth > trying to quantise, just fire it as rapidly as possible'. That's a > world away from 'errrr ... might be 16ms, might be 500? dunno really'. > > The entire model we have is that basis timing flows backwards. The > 'hardware' gives us a deadline, KMS angles to meet that with a small > margin, the compositor angles to meet that with a margin again, and it > lines up client repaints to hit that window too. Everything works on > that model, so it's not super surprising that using svga is - to quote > one of Weston's DRM-backend people who uses ESXi - 'a juddery mess'. That's very hurtful. Or it would be but of course you didn't believe them because they're working on Weston so clearly don't make good choices in general, right? The presentation on esxi is just as smooth as it is by default on Ubuntu on new hardware... > Given that the entire ecosystem is based on this model, I don't think > there's an easy way out where svga just does something wildly > different. The best way to fix it is to probably work on predictable > quantisation with updates: pick 5/12/47/60Hz to quantise to based on > your current throughput, with something similar to hotplug/LINK_STATUS > and faked EDID to let userspace know when the period changes. If you > have variability within the cycle, e.g. dropped frames, then just suck > it up and keep the illusion alive to userspace that it's presenting to > a fixed period, and if/when you calculate there's a better > quantisation then let userspace know what it is so it can adjust. > > But there's really no future in just doing random presentation rates, > because that's not the API anyone has written for. See, my hope was that with vrr we could layer the weird remote presentation semantics of virtualized guest on top of the same infrastructure that would be used on real hardware. If you're saying that it's not the way userspace will work, then yea, that doesn't help. My issue, that's general for para-virtualized drivers, is that any behavior that differs from hw drivers means that it's going to break at some point, we see that even for basic things like the update-layout hotplug events that have been largely standardized for many years. I'm assuming that refresh-rate-changed will result in the same regressions, but fwiw if I can implement FRR correctly and punt any issues that arise due to changes in the FRR as issues in userspace then that does make my life a lot easier, so I'm not going to object to that. z
Hi Zack Am 30.01.24 um 20:38 schrieb Zack Rusin: > On Fri, Jan 12, 2024 at 4:20 PM Ian Forbes <ian.forbes@broadcom.com> wrote: >> >> SVGA requires surfaces to fit within graphics memory (max_mob_pages) which >> means that modes with a final buffer size that would exceed graphics memory >> must be pruned otherwise creation will fail. >> >> Additionally, device commands which use multiple graphics resources must >> have all their resources fit within graphics memory for the duration of the >> command. Thus we need a small carve out of 1/4 of graphics memory to ensure >> commands likes surface copies to the primary framebuffer for cursor >> composition or damage clips can fit within graphics memory. >> >> This fixes an issue where VMs with low graphics memory (< 64MiB) configured >> with high resolution mode boot to a black screen because surface creation >> fails. >> >> Signed-off-by: Ian Forbes <ian.forbes@broadcom.com> >> --- >> drivers/gpu/drm/vmwgfx/vmwgfx_kms.c | 20 ++++++++++++-------- >> 1 file changed, 12 insertions(+), 8 deletions(-) >> >> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c >> index 28ff30e32fab..39d6d17fc488 100644 >> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c >> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c >> @@ -2854,12 +2854,12 @@ enum drm_mode_status vmw_connector_mode_valid(struct drm_connector *connector, >> struct vmw_private *dev_priv = vmw_priv(dev); >> u32 max_width = dev_priv->texture_max_width; >> u32 max_height = dev_priv->texture_max_height; >> - u32 assumed_cpp = 4; >> + u32 assumed_cpp = dev_priv->assume_16bpp ? 2 : 4; >> + u32 pitch = mode->hdisplay * assumed_cpp; >> + u64 total = mode->vdisplay * pitch; >> + bool using_stdu = dev_priv->active_display_unit == vmw_du_screen_target; >> >> - if (dev_priv->assume_16bpp) >> - assumed_cpp = 2; >> - >> - if (dev_priv->active_display_unit == vmw_du_screen_target) { >> + if (using_stdu) { >> max_width = min(dev_priv->stdu_max_width, max_width); >> max_height = min(dev_priv->stdu_max_height, max_height); >> } >> @@ -2870,9 +2870,13 @@ enum drm_mode_status vmw_connector_mode_valid(struct drm_connector *connector, >> if (max_height < mode->vdisplay) >> return MODE_BAD_VVALUE; >> >> - if (!vmw_kms_validate_mode_vram(dev_priv, >> - mode->hdisplay * assumed_cpp, >> - mode->vdisplay)) >> + if (using_stdu && >> + (total > (dev_priv->max_mob_pages * PAGE_SIZE * 3 / 4) || > > Could you export that computation somewhere where we could document > why we're doing this? Just to not leave the awkward "* 3 /4" that > everyone reading this code will wonder about? There's code in VRAM helpers that does a similar test. [1] But the VRAM helpers are obsolete. I guess that TTM could contain such a test somewhere by testing against a max_bo_size for each TTM resource. Whether that is feasible in practice is a different question. Maybe ask the TTM maintainers. Best regards Thomas [1] https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/drm_gem_vram_helper.c#L1156 > And also make sure you indent this correctly, "dim checkpatch" should > warn about this. > > z
On Wed, 31 Jan 2024 at 02:31, Zack Rusin <zack.rusin@broadcom.com> wrote: > On Tue, Jan 30, 2024 at 6:50 PM Daniel Stone <daniel@fooishbar.org> wrote: > > The entire model we have is that basis timing flows backwards. The > > 'hardware' gives us a deadline, KMS angles to meet that with a small > > margin, the compositor angles to meet that with a margin again, and it > > lines up client repaints to hit that window too. Everything works on > > that model, so it's not super surprising that using svga is - to quote > > one of Weston's DRM-backend people who uses ESXi - 'a juddery mess'. > > That's very hurtful. Or it would be but of course you didn't believe > them because they're working on Weston so clearly don't make good > choices in general, right? The presentation on esxi is just as smooth > as it is by default on Ubuntu on new hardware... Yeah sorry, that wasn't a 'VMware is bad' dig, that was a 'oh that explains so much if you're deliberately doing the other thing' realisation. I'm not suggesting anyone else use my life choices as a template :) > > Given that the entire ecosystem is based on this model, I don't think > > there's an easy way out where svga just does something wildly > > different. The best way to fix it is to probably work on predictable > > quantisation with updates: pick 5/12/47/60Hz to quantise to based on > > your current throughput, with something similar to hotplug/LINK_STATUS > > and faked EDID to let userspace know when the period changes. If you > > have variability within the cycle, e.g. dropped frames, then just suck > > it up and keep the illusion alive to userspace that it's presenting to > > a fixed period, and if/when you calculate there's a better > > quantisation then let userspace know what it is so it can adjust. > > > > But there's really no future in just doing random presentation rates, > > because that's not the API anyone has written for. > > See, my hope was that with vrr we could layer the weird remote > presentation semantics of virtualized guest on top of the same > infrastructure that would be used on real hardware. If you're saying > that it's not the way userspace will work, then yea, that doesn't > help. My issue, that's general for para-virtualized drivers, is that > any behavior that differs from hw drivers means that it's going to > break at some point, we see that even for basic things like the > update-layout hotplug events that have been largely standardized for > many years. I'm assuming that refresh-rate-changed will result in the > same regressions, but fwiw if I can implement FRR correctly and punt > any issues that arise due to changes in the FRR as issues in userspace > then that does make my life a lot easier, so I'm not going to object > to that. Yeah, I think that's the best way forward ... modelling the full pipeline in all its glory starts to look way less like KMS and way more like something like GStreamer. Trying to encapsulate all that reasonably in the kernel would've required - at the very least - a KMS-side queue with target display times in order to be at all useful, and that seemed like way too much complexity when the majority of hardware could be handled with 'I'll fire an ioctl at you and you update at the next 16ms boundary'. I'd be super happy to review any uAPI extensions which added feedback to userspace to let it know that the optimal presentation cadence had changed. Cheers, Daniel
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c index 28ff30e32fab..39d6d17fc488 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c @@ -2854,12 +2854,12 @@ enum drm_mode_status vmw_connector_mode_valid(struct drm_connector *connector, struct vmw_private *dev_priv = vmw_priv(dev); u32 max_width = dev_priv->texture_max_width; u32 max_height = dev_priv->texture_max_height; - u32 assumed_cpp = 4; + u32 assumed_cpp = dev_priv->assume_16bpp ? 2 : 4; + u32 pitch = mode->hdisplay * assumed_cpp; + u64 total = mode->vdisplay * pitch; + bool using_stdu = dev_priv->active_display_unit == vmw_du_screen_target; - if (dev_priv->assume_16bpp) - assumed_cpp = 2; - - if (dev_priv->active_display_unit == vmw_du_screen_target) { + if (using_stdu) { max_width = min(dev_priv->stdu_max_width, max_width); max_height = min(dev_priv->stdu_max_height, max_height); } @@ -2870,9 +2870,13 @@ enum drm_mode_status vmw_connector_mode_valid(struct drm_connector *connector, if (max_height < mode->vdisplay) return MODE_BAD_VVALUE; - if (!vmw_kms_validate_mode_vram(dev_priv, - mode->hdisplay * assumed_cpp, - mode->vdisplay)) + if (using_stdu && + (total > (dev_priv->max_mob_pages * PAGE_SIZE * 3 / 4) || + total > dev_priv->max_mob_size)) { + return MODE_MEM; + } + + if (!vmw_kms_validate_mode_vram(dev_priv, pitch, mode->vdisplay)) return MODE_MEM; return MODE_OK;
SVGA requires surfaces to fit within graphics memory (max_mob_pages) which means that modes with a final buffer size that would exceed graphics memory must be pruned otherwise creation will fail. Additionally, device commands which use multiple graphics resources must have all their resources fit within graphics memory for the duration of the command. Thus we need a small carve out of 1/4 of graphics memory to ensure commands likes surface copies to the primary framebuffer for cursor composition or damage clips can fit within graphics memory. This fixes an issue where VMs with low graphics memory (< 64MiB) configured with high resolution mode boot to a black screen because surface creation fails. Signed-off-by: Ian Forbes <ian.forbes@broadcom.com> --- drivers/gpu/drm/vmwgfx/vmwgfx_kms.c | 20 ++++++++++++-------- 1 file changed, 12 insertions(+), 8 deletions(-)