Message ID | 1493116501-29327-1-git-send-email-xiong.y.zhang@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
+ David and Jon On ti, 2017-04-25 at 18:34 +0800, Xiong Zhang wrote: The blocking issue I see is that bisecting is still not pointing at relevant commits. Both bisected commits from Bugzilla are not related to changes in stolen memory usage behavior. I'd assume a successful bisect to land at the patches where we start creating kernel internal objects from stolen memory. Otherwise we could be ignoring a bug elsewhere. If it consistently lands on those patches, then there might be something wrong with them, in addition to stolen memory problems. Disabling power saving makes many bugs go away, but we still don't disable power saving as a resolution to such bugs, but instead root cause and fix the individual bugs. > Stolen memory isn't a standard pci resource and exists in RMRR which has > identity mapping in iommu table when host boot up, so IGD could access > stolen memory in host OS. While according to 'commit c875d2c1b808 > ("iommu/vt-d: Exclude devices using RMRRs from IOMMU API domains")',RMRR > isn't supported by kvm, then both EPT and guest iommu domain table lack > of maaping for stolen memory in kvm IGD passthrough environment. Commit message text still fails to address that an exclusion was added by commit: commit 18436afdc11a00ac881990b454cfb2eae81d6003 Author: David Woodhouse <David.Woodhouse@intel.com> Date: Wed Mar 25 15:05:47 2015 +0000 iommu/vt-d: Allow RMRR on graphics devices too Commit c875d2c1 ("iommu/vt-d: Exclude devices using RMRRs from IOMMU API domains") prevents certain options for devices with RMRRs. This even prevents those devices from getting a 1:1 mapping with 'iommu=pt', because we don't have the code to handle *preserving* the RMRR regions when moving the device between domains. <SNIP> The quoted part of David's commit message leads me to believe it's simply lack of some code in kernel for juggling the RMRRs when moving a device between domains that is missing. Why is not that considered instead? With that implemented, we would have more transparent pass- through, which should be good. Also, was fixing the IGD driver loading with zero stolen memory considered instead? All this information should exist in the commit message. After the bisecting is properly done, there is an agreement that suggested RMRR preservation is absolutely a no-go, other options are not viable, the commit message should be updated to reflect all that. Then we should look in more detail on how to detect the scenarios when we're running in a virtual machine that doesn't set up the 1:1 mapping for RMRRs. Regards, Joonas
> + David and Jon > > On ti, 2017-04-25 at 18:34 +0800, Xiong Zhang wrote: > > The blocking issue I see is that bisecting is still not pointing at > relevant commits. Both bisected commits from Bugzilla are not related > to changes in stolen memory usage behavior. I'd assume a successful > bisect to land at the patches where we start creating kernel internal > objects from stolen memory. Otherwise we could be ignoring a bug > elsewhere. If it consistently lands on those patches, then there might > be something wrong with them, in addition to stolen memory problems. [Zhang, Xiong Y] I only try kernel 4.8 and 4.9 above, as the bugzilla descripted, guest 4.8 kernel doesn't see gpu hang in guest dmesg, 4.9 kernel has gpu hang in guest dmesg. From this point, we could do git bisect. But tons of IOMMU DMA R/W exception to stolen memory exist in host dmesg when guest kernel is 4.8 and 4.9. This means guest domain iommu table doesn't have mapping for stolen memory and IGD fail in accessing stolen memory from guest kernel 4.8 and 4.9. From this point, this issue isn't a regression and shouldn't go git bisect. You could check this host error message from the bugzilla attachment. And this should be fixed first. Anyway, I will try my best to get the ideal commit through git bisect, but I'm afraid the result is the same as past because we don't have a stable good point to start git bisect. > Disabling power saving makes many bugs go away, but we still don't > disable power saving as a resolution to such bugs, but instead root > cause and fix the individual bugs. [Zhang, Xiong Y] I add i915.enable_rc6=0, i915.enable_dc=0, i915.enable_fbc=0, I915.enable_psr=0, i915.disable_power_well=0,i915.enable_ips=0 to grub. But gpu hang exist in guest and DMA R/W error exist in host. > > > Stolen memory isn't a standard pci resource and exists in RMRR which has > > identity mapping in iommu table when host boot up, so IGD could access > > stolen memory in host OS. While according to 'commit c875d2c1b808 > > ("iommu/vt-d: Exclude devices using RMRRs from IOMMU API > domains")',RMRR > > isn't supported by kvm, then both EPT and guest iommu domain table lack > > of maaping for stolen memory in kvm IGD passthrough environment. > > Commit message text still fails to address that an exclusion was added > by commit: > > commit 18436afdc11a00ac881990b454cfb2eae81d6003 > Author: David Woodhouse <David.Woodhouse@intel.com> > Date: Wed Mar 25 15:05:47 2015 +0000 > > iommu/vt-d: Allow RMRR on graphics devices too > > Commit c875d2c1 ("iommu/vt-d: Exclude devices using RMRRs from > IOMMU API > domains") prevents certain options for devices with RMRRs. This even > prevents those devices from getting a 1:1 mapping with 'iommu=pt', > because we don't have the code to handle *preserving* the RMRR > regions > when moving the device between domains. > > <SNIP> > > The quoted part of David's commit message leads me to believe it's > simply lack of some code in kernel for juggling the RMRRs when moving a > device between domains that is missing. Why is not that considered > instead? With that implemented, we would have more transparent pass- > through, which should be good. [Zhang, Xiong Y] c875d2c1 ("iommu/vt-d: Exclude devices using RMRRs from IOMMU API domains). This patch prevent devices associated with RMRRs from assigning to a guest, the one of reason is it knows RMRR isn't supported in guest domain IOMMU table, If these device's driver still access RMRR from guest, serious error will happen. 18436afdc ("iommu/vt-d: Allow RMRR on graphics devices too "), add an exception to above commit. So IGD could be assigned to a guest. But this doesn't mean IGD 1:1 mapping for RMRR will be support in guest domain iommu table 'iommu=pt' is to set 1:1 mapping for all pci device in host domain iommu table. When one device is assigned to a guest and this guest boot up, this guest domain Iommu table will take place of host domain iommu table on hardware. Our issue is guest domain iommu table doesn't have 1:1 mapping for RMRR. In order to set up 1:1 mapping for RMRR in guest domain iommu table, we have to modify kvm and qemu and kvm community have declined this. > > Also, was fixing the IGD driver loading with zero stolen memory > considered instead? All this information should exist in the commit > message. [Zhang, Xiong Y] IGD and i915 driver read pci config register 0x50 to get the size of stolen memory. When guest read this register, qemu could trap it and return one value to guest. So in order to " fixing the IGD driver loading with zero stolen memory ", We have to modify both Qemu and IGD driver: 1) QEMU: trap read from pci cfg 0x50 register, then return zero to guest 2) IGD driver: when IGD driver see zero size of stolen memory, don't exit loading and continue. This doesn't give any benefit to i915, i915 will still disable stolen memory as i915 see zero size stolen memory . So I prefer to disable stolen memory in i915 directly and keep Qemu and IGD driver unchanged. > > After the bisecting is properly done, there is an agreement that > suggested RMRR preservation is absolutely a no-go, other options are > not viable, the commit message should be updated to reflect all that. > Then we should look in more detail on how to detect the scenarios when > we're running in a virtual machine that doesn't set up the 1:1 mapping > for RMRRs. [Zhang, Xiong Y] Sure, I will do this once we have an agreement. I really need the help from others who could correct me if I am wrong. > > Regards, Joonas > -- > Joonas Lahtinen > Open Source Technology Center > Intel Corporation
> > + David and Jon > > > > On ti, 2017-04-25 at 18:34 +0800, Xiong Zhang wrote: > > > > The blocking issue I see is that bisecting is still not pointing at > > relevant commits. Both bisected commits from Bugzilla are not related > > to changes in stolen memory usage behavior. I'd assume a successful > > bisect to land at the patches where we start creating kernel internal > > objects from stolen memory. Otherwise we could be ignoring a bug > > elsewhere. If it consistently lands on those patches, then there might > > be something wrong with them, in addition to stolen memory problems. > [Zhang, Xiong Y] I only try kernel 4.8 and 4.9 above, as the bugzilla descripted, > guest 4.8 kernel doesn't see gpu hang in guest dmesg, 4.9 kernel has gpu hang > in guest dmesg. From this point, we could do git bisect. > But tons of IOMMU DMA R/W exception to stolen memory exist in host dmesg > when guest kernel is 4.8 and 4.9. This means guest domain iommu table > doesn't > have mapping for stolen memory and IGD fail in accessing stolen memory > from guest kernel 4.8 and 4.9. From this point, this issue isn't a regression and > shouldn't go git bisect. You could check this host error message from the > bugzilla > attachment. And this should be fixed first. > Anyway, I will try my best to get the ideal commit through git bisect, but I'm > afraid > the result is the same as past because we don't have a stable good point to > start git > bisect. [Zhang, Xiong Y] hi, Joonas: As you said, the gpu hang exist because i915 create ring buffer from stolen memory. I did git bisect again, and the following commit is the first bad commit: commit c58b735fc762e891481e92af7124b85cb0a51fce Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Thu Aug 18 17:16:57 2016 +0100 drm/i915: Allocate rings from stolen If we have stolen available, make use of it for ringbuffer allocation. Previously this was restricted to !llc platforms, as writing to stolen requires a GGTT mapping - but now that we have partial mappable support, the mappable aperture isn't quite so precious so we can use it more freely and ringbuffers are a good user for the otherwise wasted stolen. After reverting this patch from drm-intel-nightly, I didn't see gpu hang during guest boot process. So what's our next step ? thanks > > > Disabling power saving makes many bugs go away, but we still don't > > disable power saving as a resolution to such bugs, but instead root > > cause and fix the individual bugs. > [Zhang, Xiong Y] I add i915.enable_rc6=0, i915.enable_dc=0, > i915.enable_fbc=0, > I915.enable_psr=0, i915.disable_power_well=0,i915.enable_ips=0 to grub. > But gpu hang exist in guest and DMA R/W error exist in host. > > > > > Stolen memory isn't a standard pci resource and exists in RMRR which has > > > identity mapping in iommu table when host boot up, so IGD could access > > > stolen memory in host OS. While according to 'commit c875d2c1b808 > > > ("iommu/vt-d: Exclude devices using RMRRs from IOMMU API > > domains")',RMRR > > > isn't supported by kvm, then both EPT and guest iommu domain table lack > > > of maaping for stolen memory in kvm IGD passthrough environment. > > > > Commit message text still fails to address that an exclusion was added > > by commit: > > > > commit 18436afdc11a00ac881990b454cfb2eae81d6003 > > Author: David Woodhouse <David.Woodhouse@intel.com> > > Date: Wed Mar 25 15:05:47 2015 +0000 > > > > iommu/vt-d: Allow RMRR on graphics devices too > > > > Commit c875d2c1 ("iommu/vt-d: Exclude devices using RMRRs from > > IOMMU API > > domains") prevents certain options for devices with RMRRs. This even > > prevents those devices from getting a 1:1 mapping with 'iommu=pt', > > because we don't have the code to handle *preserving* the RMRR > > regions > > when moving the device between domains. > > > > <SNIP> > > > > The quoted part of David's commit message leads me to believe it's > > simply lack of some code in kernel for juggling the RMRRs when moving a > > device between domains that is missing. Why is not that considered > > instead? With that implemented, we would have more transparent pass- > > through, which should be good. > [Zhang, Xiong Y] c875d2c1 ("iommu/vt-d: Exclude devices using RMRRs from > IOMMU API domains). This patch prevent devices associated with RMRRs from > assigning to a guest, the one of reason is it knows RMRR isn't supported in > guest > domain IOMMU table, If these device's driver still access RMRR from guest, > serious error will happen. > 18436afdc ("iommu/vt-d: Allow RMRR on graphics devices too "), add an > exception > to above commit. So IGD could be assigned to a guest. But this doesn't mean > IGD > 1:1 mapping for RMRR will be support in guest domain iommu table > 'iommu=pt' is to set 1:1 mapping for all pci device in host domain iommu > table. > > When one device is assigned to a guest and this guest boot up, this guest > domain > Iommu table will take place of host domain iommu table on hardware. Our > issue > is guest domain iommu table doesn't have 1:1 mapping for RMRR. > In order to set up 1:1 mapping for RMRR in guest domain iommu table, we > have > to modify kvm and qemu and kvm community have declined this. > > > > Also, was fixing the IGD driver loading with zero stolen memory > > considered instead? All this information should exist in the commit > > message. > [Zhang, Xiong Y] IGD and i915 driver read pci config register 0x50 to get > the size of stolen memory. When guest read this register, qemu could trap > it and return one value to guest. > So in order to " fixing the IGD driver loading with zero stolen memory ", > We have to modify both Qemu and IGD driver: > 1) QEMU: trap read from pci cfg 0x50 register, then return zero to guest > 2) IGD driver: when IGD driver see zero size of stolen memory, don't exit > loading > and continue. > This doesn't give any benefit to i915, i915 will still disable stolen memory as > i915 > see zero size stolen memory . So I prefer to disable stolen memory in i915 > directly > and keep Qemu and IGD driver unchanged. > > > > After the bisecting is properly done, there is an agreement that > > suggested RMRR preservation is absolutely a no-go, other options are > > not viable, the commit message should be updated to reflect all that. > > Then we should look in more detail on how to detect the scenarios when > > we're running in a virtual machine that doesn't set up the 1:1 mapping > > for RMRRs. > [Zhang, Xiong Y] Sure, I will do this once we have an agreement. > I really need the help from others who could correct me if I am wrong. > > > > Regards, Joonas > > -- > > Joonas Lahtinen > > Open Source Technology Center > > Intel Corporation
On to, 2017-04-27 at 05:54 +0000, Zhang, Xiong Y wrote: > > > > Also, was fixing the IGD driver loading with zero stolen memory > > considered instead? All this information should exist in the commit > > message. > [Zhang, Xiong Y] IGD and i915 driver read pci config register 0x50 to get > the size of stolen memory. When guest read this register, qemu could trap > it and return one value to guest. > So in order to " fixing the IGD driver loading with zero stolen memory ", > We have to modify both Qemu and IGD driver: > 1) QEMU: trap read from pci cfg 0x50 register, then return zero to guest > 2) IGD driver: when IGD driver see zero size of stolen memory, don't exit loading > and continue. > This doesn't give any benefit to i915, i915 will still disable stolen memory as i915 > see zero size stolen memory . So I prefer to disable stolen memory in i915 directly > and keep Qemu and IGD driver unchanged. If due to lack of code in QEMU, RMRR is not available. It'd sound to me they should carry the change to report stolen as zero, because they're in best position to track if that changes in future. Also, if the IGD driver requires a special treatment where stolen is reported to exist but is not functional, that logic should be fixed where the flaw is. i915 driver is not the go-to for fixing any quirks and lack of code related to virtualization. The driver performs perfectly logically, if stolen is reported to exist, it is expected to work, if not, it's not touched. Adding special cases to satisfy conditions outside of i915 will make driver maintenance a burden. And, if all other options fail and we end up having to fix the issue in i915, the fix should be robust, which this currently is not (see my previous messages). Regards, Joonas
On ke, 2017-05-03 at 09:22 +0000, Zhang, Xiong Y wrote: > > > > > > > > + David and Jon > > > > > > On ti, 2017-04-25 at 18:34 +0800, Xiong Zhang wrote: > > > > > > The blocking issue I see is that bisecting is still not pointing at > > > relevant commits. Both bisected commits from Bugzilla are not related > > > to changes in stolen memory usage behavior. I'd assume a successful > > > bisect to land at the patches where we start creating kernel internal > > > objects from stolen memory. Otherwise we could be ignoring a bug > > > elsewhere. If it consistently lands on those patches, then there might > > > be something wrong with them, in addition to stolen memory problems. > > [Zhang, Xiong Y] I only try kernel 4.8 and 4.9 above, as the bugzilla descripted, > > guest 4.8 kernel doesn't see gpu hang in guest dmesg, 4.9 kernel has gpu hang > > in guest dmesg. From this point, we could do git bisect. > > But tons of IOMMU DMA R/W exception to stolen memory exist in host dmesg > > when guest kernel is 4.8 and 4.9. This means guest domain iommu table > > doesn't > > have mapping for stolen memory and IGD fail in accessing stolen memory > > from guest kernel 4.8 and 4.9. From this point, this issue isn't a regression and > > shouldn't go git bisect. You could check this host error message from the > > bugzilla > > attachment. And this should be fixed first. > > Anyway, I will try my best to get the ideal commit through git bisect, but I'm > > afraid > > the result is the same as past because we don't have a stable good point to > > start git > > bisect. > [Zhang, Xiong Y] hi, Joonas: > As you said, the gpu hang exist because i915 create ring buffer from stolen memory. > I did git bisect again, and the following commit is the first bad commit: > commit c58b735fc762e891481e92af7124b85cb0a51fce > Author: Chris Wilson <chris@chris-wilson.co.uk> > Date: Thu Aug 18 17:16:57 2016 +0100 > > drm/i915: Allocate rings from stolen > > If we have stolen available, make use of it for ringbuffer allocation. > Previously this was restricted to !llc platforms, as writing to stolen > requires a GGTT mapping - but now that we have partial mappable support, > the mappable aperture isn't quite so precious so we can use it more > freely and ringbuffers are a good user for the otherwise wasted stolen. > > After reverting this patch from drm-intel-nightly, I didn't see gpu hang during guest boot process. > So what's our next step ? An appropriate next step would be to evaluate how much work it is to support the RMRR passthrough David mentioned about in his commit. I'd also go talk with the IGD team, why they refuse to load the driver when stolen memory is correctly reported as zero, and insist on being lied to. While doing that, please update the freedesktop.org bugs. Regards, Joonas
> On ke, 2017-05-03 at 09:22 +0000, Zhang, Xiong Y wrote: > > > > > > > > > > > + David and Jon > > > > > > > > On ti, 2017-04-25 at 18:34 +0800, Xiong Zhang wrote: > > > > > > > > The blocking issue I see is that bisecting is still not pointing at > > > > relevant commits. Both bisected commits from Bugzilla are not related > > > > to changes in stolen memory usage behavior. I'd assume a successful > > > > bisect to land at the patches where we start creating kernel internal > > > > objects from stolen memory. Otherwise we could be ignoring a bug > > > > elsewhere. If it consistently lands on those patches, then there might > > > > be something wrong with them, in addition to stolen memory problems. > > > [Zhang, Xiong Y] I only try kernel 4.8 and 4.9 above, as the bugzilla > descripted, > > > guest 4.8 kernel doesn't see gpu hang in guest dmesg, 4.9 kernel has gpu > hang > > > in guest dmesg. From this point, we could do git bisect. > > > But tons of IOMMU DMA R/W exception to stolen memory exist in host > dmesg > > > when guest kernel is 4.8 and 4.9. This means guest domain iommu table > > > doesn't > > > have mapping for stolen memory and IGD fail in accessing stolen memory > > > from guest kernel 4.8 and 4.9. From this point, this issue isn't a regression > and > > > shouldn't go git bisect. You could check this host error message from the > > > bugzilla > > > attachment. And this should be fixed first. > > > Anyway, I will try my best to get the ideal commit through git bisect, but > I'm > > > afraid > > > the result is the same as past because we don't have a stable good point to > > > start git > > > bisect. > > [Zhang, Xiong Y] hi, Joonas: > > As you said, the gpu hang exist because i915 create ring buffer from stolen > memory. > > I did git bisect again, and the following commit is the first bad commit: > > commit c58b735fc762e891481e92af7124b85cb0a51fce > > Author: Chris Wilson <chris@chris-wilson.co.uk> > > Date: Thu Aug 18 17:16:57 2016 +0100 > > > > drm/i915: Allocate rings from stolen > > > > If we have stolen available, make use of it for ringbuffer allocation. > > Previously this was restricted to !llc platforms, as writing to stolen > > requires a GGTT mapping - but now that we have partial mappable > support, > > the mappable aperture isn't quite so precious so we can use it more > > freely and ringbuffers are a good user for the otherwise wasted stolen. > > > > After reverting this patch from drm-intel-nightly, I didn't see gpu hang during > guest boot process. > > So what's our next step ? > > An appropriate next step would be to evaluate how much work it is to > support the RMRR passthrough David mentioned about in his commit. [Zhang, Xiong Y] As Kevin explained, KVM community found the disadvantage Of RMRR and have decided to not support RMRR passthrough, so it is really hard for us to push such solution and isn't related to the workload. Except usb and graphic card, all other devices with RMRR couldn't passthrough to guest. But the driver of usb and graphic card couldn't access RMRR in such environment. https://access.redhat.com/sites/default/files/attachments/rmrr-wp1.pdf > I'd also go talk with the IGD team, why they refuse to load the driver > when stolen memory is correctly reported as zero, and insist on being > lied to. [Zhang, Xiong Y] thanks a lot for doing so. > > While doing that, please update the freedesktop.org bugs. [Zhang, Xiong Y] sure, I will update bugzilla once we have further finding and make a decision. > > Regards, Joonas > -- > Joonas Lahtinen > Open Source Technology Center > Intel Corporation
On la, 2017-05-06 at 02:58 +0000, Zhang, Xiong Y wrote: > > > > On ke, 2017-05-03 at 09:22 +0000, Zhang, Xiong Y wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > + David and Jon > > > > > > > > > > On ti, 2017-04-25 at 18:34 +0800, Xiong Zhang wrote: > > > > > > > > > > The blocking issue I see is that bisecting is still not pointing at > > > > > relevant commits. Both bisected commits from Bugzilla are not related > > > > > to changes in stolen memory usage behavior. I'd assume a successful > > > > > bisect to land at the patches where we start creating kernel internal > > > > > objects from stolen memory. Otherwise we could be ignoring a bug > > > > > elsewhere. If it consistently lands on those patches, then there might > > > > > be something wrong with them, in addition to stolen memory problems. > > > > [Zhang, Xiong Y] I only try kernel 4.8 and 4.9 above, as the bugzilla > > descripted, > > > > > > > > > > > guest 4.8 kernel doesn't see gpu hang in guest dmesg, 4.9 kernel has gpu > > hang > > > > > > > > > > > in guest dmesg. From this point, we could do git bisect. > > > > But tons of IOMMU DMA R/W exception to stolen memory exist in host > > dmesg > > > > > > > > > > > when guest kernel is 4.8 and 4.9. This means guest domain iommu table > > > > doesn't > > > > have mapping for stolen memory and IGD fail in accessing stolen memory > > > > from guest kernel 4.8 and 4.9. From this point, this issue isn't a regression > > and > > > > > > > > > > > shouldn't go git bisect. You could check this host error message from the > > > > bugzilla > > > > attachment. And this should be fixed first. > > > > Anyway, I will try my best to get the ideal commit through git bisect, but > > I'm > > > > > > > > > > > afraid > > > > the result is the same as past because we don't have a stable good point to > > > > start git > > > > bisect. > > > [Zhang, Xiong Y] hi, Joonas: > > > As you said, the gpu hang exist because i915 create ring buffer from stolen > > memory. > > > > > > I did git bisect again, and the following commit is the first bad commit: > > > commit c58b735fc762e891481e92af7124b85cb0a51fce > > > > > > Author: Chris Wilson <chris@chris-wilson.co.uk> > > > Date: Thu Aug 18 17:16:57 2016 +0100 > > > > > > drm/i915: Allocate rings from stolen > > > > > > If we have stolen available, make use of it for ringbuffer allocation. > > > Previously this was restricted to !llc platforms, as writing to stolen > > > requires a GGTT mapping - but now that we have partial mappable > > support, > > > > > > the mappable aperture isn't quite so precious so we can use it more > > > freely and ringbuffers are a good user for the otherwise wasted stolen. > > > > > > After reverting this patch from drm-intel-nightly, I didn't see gpu hang during > > guest boot process. > > > > > > So what's our next step ? > > > > An appropriate next step would be to evaluate how much work it is to > > support the RMRR passthrough David mentioned about in his commit. > [Zhang, Xiong Y] As Kevin explained, KVM community found the disadvantage > Of RMRR and have decided to not support RMRR passthrough, so it is really hard > for us to push such solution and isn't related to the workload. > Except usb and graphic card, all other devices with RMRR couldn't passthrough > to guest. But the driver of usb and graphic card couldn't access RMRR in such > environment. > https://access.redhat.com/sites/default/files/attachments/rmrr-wp1.pdf Does this patch have the right Cc's from KVM team? I'd like to hear directly from them that even the usage of RMRRs that follow the intention of VT-d spec are not going to be supported. That document predates the patches to add the exclusion for graphics. > > I'd also go talk with the IGD team, why they refuse to load the driver > > when stolen memory is correctly reported as zero, and insist on being > > lied to. > [Zhang, Xiong Y] thanks a lot for doing so. I don't have the contacts, so I assume you to pursue that. Regards, Joonas
On Mon, 08 May 2017 13:07:10 +0300 Joonas Lahtinen <joonas.lahtinen@linux.intel.com> wrote: > On la, 2017-05-06 at 02:58 +0000, Zhang, Xiong Y wrote: > > > > > > On ke, 2017-05-03 at 09:22 +0000, Zhang, Xiong Y wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > + David and Jon > > > > > > > > > > > > On ti, 2017-04-25 at 18:34 +0800, Xiong Zhang wrote: > > > > > > > > > > > > The blocking issue I see is that bisecting is still not pointing at > > > > > > relevant commits. Both bisected commits from Bugzilla are not related > > > > > > to changes in stolen memory usage behavior. I'd assume a successful > > > > > > bisect to land at the patches where we start creating kernel internal > > > > > > objects from stolen memory. Otherwise we could be ignoring a bug > > > > > > elsewhere. If it consistently lands on those patches, then there might > > > > > > be something wrong with them, in addition to stolen memory problems. > > > > > [Zhang, Xiong Y] I only try kernel 4.8 and 4.9 above, as the bugzilla > > > descripted, > > > > > > > > > > > > > > guest 4.8 kernel doesn't see gpu hang in guest dmesg, 4.9 kernel has gpu > > > hang > > > > > > > > > > > > > > in guest dmesg. From this point, we could do git bisect. > > > > > But tons of IOMMU DMA R/W exception to stolen memory exist in host > > > dmesg > > > > > > > > > > > > > > when guest kernel is 4.8 and 4.9. This means guest domain iommu table > > > > > doesn't > > > > > have mapping for stolen memory and IGD fail in accessing stolen memory > > > > > from guest kernel 4.8 and 4.9. From this point, this issue isn't a regression > > > and > > > > > > > > > > > > > > shouldn't go git bisect. You could check this host error message from the > > > > > bugzilla > > > > > attachment. And this should be fixed first. > > > > > Anyway, I will try my best to get the ideal commit through git bisect, but > > > I'm > > > > > > > > > > > > > > afraid > > > > > the result is the same as past because we don't have a stable good point to > > > > > start git > > > > > bisect. > > > > [Zhang, Xiong Y] hi, Joonas: > > > > As you said, the gpu hang exist because i915 create ring buffer from stolen > > > memory. > > > > > > > > I did git bisect again, and the following commit is the first bad commit: > > > > commit c58b735fc762e891481e92af7124b85cb0a51fce > > > > > > > Author: Chris Wilson <chris@chris-wilson.co.uk> > > > > Date: Thu Aug 18 17:16:57 2016 +0100 > > > > > > > > drm/i915: Allocate rings from stolen > > > > > > > > If we have stolen available, make use of it for ringbuffer allocation. > > > > Previously this was restricted to !llc platforms, as writing to stolen > > > > requires a GGTT mapping - but now that we have partial mappable > > > support, > > > > > > > > the mappable aperture isn't quite so precious so we can use it more > > > > freely and ringbuffers are a good user for the otherwise wasted stolen. > > > > > > > > After reverting this patch from drm-intel-nightly, I didn't see gpu hang during > > > guest boot process. > > > > > > > > So what's our next step ? > > > > > > An appropriate next step would be to evaluate how much work it is to > > > support the RMRR passthrough David mentioned about in his commit. > > [Zhang, Xiong Y] As Kevin explained, KVM community found the disadvantage > > Of RMRR and have decided to not support RMRR passthrough, so it is really hard > > for us to push such solution and isn't related to the workload. > > Except usb and graphic card, all other devices with RMRR couldn't passthrough > > to guest. But the driver of usb and graphic card couldn't access RMRR in such > > environment. > > https://access.redhat.com/sites/default/files/attachments/rmrr-wp1.pdf > > Does this patch have the right Cc's from KVM team? I'd like to hear > directly from them that even the usage of RMRRs that follow the > intention of VT-d spec are not going to be supported. That document > predates the patches to add the exclusion for graphics. I'm the QEMU and kernel vfio maintainer and co-author of the above whitepaper. Even the VT-d spec suggests that usage of RMRRs should be limited (rev 2.3, 8.4): Platform designers should avoid or limit use of reserved memory regions since these require system software to create holes in the DMA virtual address range available to system software and its drivers. Also, if you read the entire thread which added the graphics exception for RMRR, you'll see that it went in under some degree of protest and ultimately under the conclusion that we should just ignore the RMRR anyway: https://lists.linuxfoundation.org/pipermail/iommu/2015-April/012790.html At least for the case of IGD RMRRs, we don't expect that they're used for health monitoring of the devices via back channels as the interface has been abused to do elsewhere, so ignoring the RMRR and not supporting it in the VM means that the device is only impairing itself, which is fine. I would balk at trying to add RMRR support into vfio and the virt stack for the reasons outlined in the above whitepaper. Thanks, Alex
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index cc7393e..c0e8968 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -164,31 +164,42 @@ static void intel_detect_pch(struct drm_i915_private *dev_priv) * * In some virtualized environments (e.g. XEN), there is irrelevant * ISA bridge in the system. To work reliably, we should scan trhough - * all the ISA bridge devices and check for the first match, instead + * all the ISA bridge devices and check for all the match, instead * of only checking the first one. + * + * If both the real ISA bridge and IGD are assigned to one guest, this + * guest will have two matched ISA bridges: real and emulatated. We + * couldn't guarantee which one is detected first, so we scan all the + * match, instead of only checking the first match. Real one take + * precedence over emulated to set pch_type and pch_id.Emulated one is + * used to disable stolen memory.Finally pci_get_class() will return + * NULL to exit loop and deference the last matched pch. */ while ((pch = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, pch))) { if (pch->vendor == PCI_VENDOR_ID_INTEL) { unsigned short id = pch->device & INTEL_PCH_DEVICE_ID_MASK; - dev_priv->pch_id = id; if (id == INTEL_PCH_IBX_DEVICE_ID_TYPE) { dev_priv->pch_type = PCH_IBX; + dev_priv->pch_id = id; DRM_DEBUG_KMS("Found Ibex Peak PCH\n"); WARN_ON(!IS_GEN5(dev_priv)); } else if (id == INTEL_PCH_CPT_DEVICE_ID_TYPE) { dev_priv->pch_type = PCH_CPT; + dev_priv->pch_id = id; DRM_DEBUG_KMS("Found CougarPoint PCH\n"); WARN_ON(!(IS_GEN6(dev_priv) || IS_IVYBRIDGE(dev_priv))); } else if (id == INTEL_PCH_PPT_DEVICE_ID_TYPE) { /* PantherPoint is CPT compatible */ dev_priv->pch_type = PCH_CPT; + dev_priv->pch_id = id; DRM_DEBUG_KMS("Found PantherPoint PCH\n"); WARN_ON(!(IS_GEN6(dev_priv) || IS_IVYBRIDGE(dev_priv))); } else if (id == INTEL_PCH_LPT_DEVICE_ID_TYPE) { dev_priv->pch_type = PCH_LPT; + dev_priv->pch_id = id; DRM_DEBUG_KMS("Found LynxPoint PCH\n"); WARN_ON(!IS_HASWELL(dev_priv) && !IS_BROADWELL(dev_priv)); @@ -196,6 +207,7 @@ static void intel_detect_pch(struct drm_i915_private *dev_priv) IS_BDW_ULT(dev_priv)); } else if (id == INTEL_PCH_LPT_LP_DEVICE_ID_TYPE) { dev_priv->pch_type = PCH_LPT; + dev_priv->pch_id = id; DRM_DEBUG_KMS("Found LynxPoint LP PCH\n"); WARN_ON(!IS_HASWELL(dev_priv) && !IS_BROADWELL(dev_priv)); @@ -203,16 +215,19 @@ static void intel_detect_pch(struct drm_i915_private *dev_priv) !IS_BDW_ULT(dev_priv)); } else if (id == INTEL_PCH_SPT_DEVICE_ID_TYPE) { dev_priv->pch_type = PCH_SPT; + dev_priv->pch_id = id; DRM_DEBUG_KMS("Found SunrisePoint PCH\n"); WARN_ON(!IS_SKYLAKE(dev_priv) && !IS_KABYLAKE(dev_priv)); } else if (id == INTEL_PCH_SPT_LP_DEVICE_ID_TYPE) { dev_priv->pch_type = PCH_SPT; + dev_priv->pch_id = id; DRM_DEBUG_KMS("Found SunrisePoint LP PCH\n"); WARN_ON(!IS_SKYLAKE(dev_priv) && !IS_KABYLAKE(dev_priv)); } else if (id == INTEL_PCH_KBP_DEVICE_ID_TYPE) { dev_priv->pch_type = PCH_KBP; + dev_priv->pch_id = id; DRM_DEBUG_KMS("Found KabyPoint PCH\n"); WARN_ON(!IS_SKYLAKE(dev_priv) && !IS_KABYLAKE(dev_priv)); @@ -223,18 +238,26 @@ static void intel_detect_pch(struct drm_i915_private *dev_priv) PCI_SUBVENDOR_ID_REDHAT_QUMRANET && pch->subsystem_device == PCI_SUBDEVICE_ID_QEMU)) { - dev_priv->pch_type = - intel_virt_detect_pch(dev_priv); + /* + * P2X is used for VMware, exclude it + */ + if (id != INTEL_PCH_P2X_DEVICE_ID_TYPE) + dev_priv->disable_stolen = true; + /* + * Real PCH still hasn't been detected + */ + if (!HAS_PCH_SPLIT(dev_priv)) { + dev_priv->pch_type = + intel_virt_detect_pch(dev_priv); + dev_priv->pch_id = id; + } } else continue; - break; } } - if (!pch) + if (!HAS_PCH_SPLIT(dev_priv)) DRM_DEBUG_KMS("No PCH found.\n"); - - pci_dev_put(pch); } static int i915_getparam(struct drm_device *dev, void *data, diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 357b6c6..3b8e067 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -2132,6 +2132,7 @@ struct drm_i915_private { struct intel_uncore uncore; struct i915_virtual_gpu vgpu; + bool disable_stolen; struct intel_gvt *gvt; diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c index f3abdc2..bffeae7 100644 --- a/drivers/gpu/drm/i915/i915_gem_stolen.c +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c @@ -409,8 +409,8 @@ int i915_gem_init_stolen(struct drm_i915_private *dev_priv) mutex_init(&dev_priv->mm.stolen_lock); - if (intel_vgpu_active(dev_priv)) { - DRM_INFO("iGVT-g active, disabling use of stolen memory\n"); + if (dev_priv->disable_stolen || intel_vgpu_active(dev_priv)) { + DRM_INFO("Running in guest, disabling use of stolen memory\n"); return 0; }
Stolen memory isn't a standard pci resource and exists in RMRR which has identity mapping in iommu table when host boot up, so IGD could access stolen memory in host OS. While according to 'commit c875d2c1b808 ("iommu/vt-d: Exclude devices using RMRRs from IOMMU API domains")',RMRR isn't supported by kvm, then both EPT and guest iommu domain table lack of maaping for stolen memory in kvm IGD passthrough environment. If IGD access stolen memory in such environment, many iommu exceptions exist in host dmesg and gpu hang exists also. DMAR: [DMA Read] Request device [00:02.0] fault addr da012000 [fault reason 05] PTE Write access is not set DMAR: [DMA Read] Request device [00:02.0] fault addr da2df000 [fault reason 06] PTE Read access is not set So stolen memory should be disabled in KVM IGD passthrough environment, this patch detects such environment through the existence of qemu emulated isa bridge. Stolen memory exists in kernel for a long time, and this patch depends on INTEL_PCH_QEMU_DEVICE_ID_TYPE which was introduced in v4.5 kernel, so this patch should be backported into v4.5 kernel and later. v2:GVT-g may run in non qemu (Zhenyu) v3:Make commit message clear (Daniel) v4:Fix typo v5:Exclude P2X as it is used for VMware (Joonas) v6:Handle real ISA bridge assigned to guest also (Joonas) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99025 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99028 Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com> Cc: stable@vger.kernel.org # 4.5+ --- drivers/gpu/drm/i915/i915_drv.c | 39 +++++++++++++++++++++++++++------- drivers/gpu/drm/i915/i915_drv.h | 1 + drivers/gpu/drm/i915/i915_gem_stolen.c | 4 ++-- 3 files changed, 34 insertions(+), 10 deletions(-)