Message ID | 20190131135211.6732-1-thunder.leizhen@huawei.com (mailing list archive) |
---|---|
State | RFC |
Series | [RFC,1/1] iommu: set the default iommu-dma mode as non-strict |
Hi,

On 31/01/2019 13:52, Zhen Lei wrote:
> Currently, many peripherals are faster than before. For example, the top
> speed of the older netcard is 10Gb/s, and now it's more than 25Gb/s. But
> when iommu page-table mapping enabled, it's hard to reach the top speed
> in strict mode, because of frequently map and unmap operations. In order
> to keep abreast of the times, I think it's better to set non-strict as
> default.

Most users won't be aware of this relaxation and will have their system
vulnerable to e.g. Thunderbolt hotplug. See for example 4.3 Deferred
Invalidation in
http://www.cs.technion.ac.il/users/wwwb/cgi-bin/tr-get.cgi/2018/MSC/MSC-2018-21.pdf

Why not keep the policy to secure by default, as we do for
iommu.passthrough? And maybe add something similar to
CONFIG_IOMMU_DEFAULT_PASSTHROUGH? It's easy enough for experts to pass a
command-line argument or change the default config.

Thanks,
Jean

>
> Below it's our iperf performance data of 25Gb netcard:
> strict mode: 18-20 Gb/s
> non-strict mode: 23.5 Gb/s
>
> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
> ---
>  Documentation/admin-guide/kernel-parameters.txt | 4 ++--
>  drivers/iommu/iommu.c | 2 +-
>  2 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index b799bcf..667221f 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -1779,13 +1779,13 @@
>
>  	iommu.strict=	[ARM64] Configure TLB invalidation behaviour
>  			Format: { "0" | "1" }
> -			0 - Lazy mode.
> +			0 - Lazy mode (default).
>  			  Request that DMA unmap operations use deferred
>  			  invalidation of hardware TLBs, for increased
>  			  throughput at the cost of reduced device isolation.
>  			  Will fall back to strict mode if not supported by
>  			  the relevant IOMMU driver.
> -			1 - Strict mode (default).
> +			1 - Strict mode.
>  			  DMA unmap operations invalidate IOMMU hardware TLBs
>  			  synchronously.
>
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 3ed4db3..10e0b49 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -43,7 +43,7 @@
>  #else
>  static unsigned int iommu_def_domain_type = IOMMU_DOMAIN_DMA;
>  #endif
> -static bool iommu_dma_strict __read_mostly = true;
> +static bool iommu_dma_strict __read_mostly;
>
>  struct iommu_callback_data {
>  	const struct iommu_ops *ops;
> --
> 1.8.3

Hi Jean,

On 2019/1/31 22:55, Jean-Philippe Brucker wrote:
> Hi,
>
> On 31/01/2019 13:52, Zhen Lei wrote:
>> Currently, many peripherals are faster than before. For example, the top
>> speed of the older netcard is 10Gb/s, and now it's more than 25Gb/s. But
>> when iommu page-table mapping enabled, it's hard to reach the top speed
>> in strict mode, because of frequently map and unmap operations. In order
>> to keep abreast of the times, I think it's better to set non-strict as
>> default.
>
> Most users won't be aware of this relaxation and will have their system
> vulnerable to e.g. thunderbolt hotplug. See for example 4.3 Deferred
> Invalidation in
> http://www.cs.technion.ac.il/users/wwwb/cgi-bin/tr-get.cgi/2018/MSC/MSC-2018-21.pdf
>
> Why not keep the policy to secure by default, as we do for
> iommu.passthrough? And maybe add something similar to
> CONFIG_IOMMU_DEFAULT_PASSTRHOUGH? It's easy enough for experts to pass a
> command-line argument or change the default config.

Sorry for the late reply, it was Chinese New Year, and we had a long discussion
internally. We are fine with adding a Kconfig option, but we are not sure OS
vendors will set it to default y.

OS vendors seem unhappy to pass a command-line argument; to be honest,
this is our motivation for making non-strict the default. We hope OS vendors
can see this email thread and give some input here.

Thanks
Hanjun

On 2019/2/26 20:36, Hanjun Guo wrote:
> Hi Jean,
>
> On 2019/1/31 22:55, Jean-Philippe Brucker wrote:
>> Hi,
>>
>> On 31/01/2019 13:52, Zhen Lei wrote:
>>> Currently, many peripherals are faster than before. For example, the top
>>> speed of the older netcard is 10Gb/s, and now it's more than 25Gb/s. But
>>> when iommu page-table mapping enabled, it's hard to reach the top speed
>>> in strict mode, because of frequently map and unmap operations. In order
>>> to keep abreast of the times, I think it's better to set non-strict as
>>> default.
>>
>> Most users won't be aware of this relaxation and will have their system
>> vulnerable to e.g. thunderbolt hotplug. See for example 4.3 Deferred
>> Invalidation in
>> http://www.cs.technion.ac.il/users/wwwb/cgi-bin/tr-get.cgi/2018/MSC/MSC-2018-21.pdf

Hi Jean,

In fact, we discussed the vulnerability of deferred invalidation before upstreaming
the non-strict patches. The attacks may be possible because of an untrusted device or
a mistake in the device driver. And we limited VFIO to still use strict mode.
As mentioned in the pdf, limiting the memory freed with deferred invalidation so that
it can only be reused by the device can mitigate the vulnerability. But it's too hard
to implement now.
A compromise may be that we only apply non-strict to (1) dma_free_coherent, because the
memory is controlled by the common DMA module, so we can have the memory freed after
the global invalidation in the timer handler. (2) Provide some new APIs related to
iommu_unmap_page/sg that defer invalidation; candidate device drivers switch to these
APIs if they want to improve performance. (3) Make sure that only trusted devices and
trusted drivers can use (1) and (2). For example, the driver must be built into the
kernel Image.
That way some high-end trusted devices use non-strict mode, while the others keep using
strict mode. Drivers that want to use non-strict mode should switch to the new APIs
themselves.

>>
>> Why not keep the policy to secure by default, as we do for
>> iommu.passthrough? And maybe add something similar to
>> CONFIG_IOMMU_DEFAULT_PASSTRHOUGH? It's easy enough for experts to pass a
>> command-line argument or change the default config.
>
> Sorry for the late reply, it was Chinese new year, and we had a long discussion
> internally, we are fine to add a Kconfig but not sure OS vendors will set it
> to default y.
>
> OS vendors seems not happy to pass a command-line argument, to be honest,
> this is our motivation to enable non-strict as default. Hope OS vendors
> can see this email thread, and give some input here.
>
> Thanks
> Hanjun

Hi Leizhen,

On 01/03/2019 04:44, Leizhen (ThunderTown) wrote:
>
>
> On 2019/2/26 20:36, Hanjun Guo wrote:
>> Hi Jean,
>>
>> On 2019/1/31 22:55, Jean-Philippe Brucker wrote:
>>> Hi,
>>>
>>> On 31/01/2019 13:52, Zhen Lei wrote:
>>>> Currently, many peripherals are faster than before. For example, the top
>>>> speed of the older netcard is 10Gb/s, and now it's more than 25Gb/s. But
>>>> when iommu page-table mapping enabled, it's hard to reach the top speed
>>>> in strict mode, because of frequently map and unmap operations. In order
>>>> to keep abreast of the times, I think it's better to set non-strict as
>>>> default.
>>>
>>> Most users won't be aware of this relaxation and will have their system
>>> vulnerable to e.g. thunderbolt hotplug. See for example 4.3 Deferred
>>> Invalidation in
>>> http://www.cs.technion.ac.il/users/wwwb/cgi-bin/tr-get.cgi/2018/MSC/MSC-2018-21.pdf
> Hi Jean,
>
> In fact, we have discussed the vulnerable of deferred invalidation before upstream
> the non-strict patches. The attacks maybe possible because of an untrusted device or
> the mistake of the device driver. And we limited the VFIO to still use strict mode.
> As mentioned in the pdf, limit the freed memory with deferred invalidation only to
> be reused by the device, can mitigate the vulnerability. But it's too hard to implement
> it now.
> A compromise maybe we only apply non-strict to (1) dma_free_coherent, because the
> memory is controlled by DMA common module, so we can make the memory to be freed after
> the global invalidation in the timer handler. (2) And provide some new APIs related to
> iommu_unmap_page/sg, these new APIs deferred invalidation. And the candiate device
> drivers update the APIs if they want to improve performance. (3) Make sure that only
> the trusted devices and trusted drivers can apply (1) and (2). For example, the driver
> must be built into kernel Image.

Do we have a notion of untrusted kernel drivers? A userspace driver
(VFIO) is untrusted, ok. But a malicious driver loaded into the kernel
address space would have much easier ways to corrupt the system than to
exploit lazy mode...

For (3), I agree that we should at least disallow lazy mode if
pci_dev->untrusted is set. At the moment it means that we require the
strictest IOMMU configuration for external-facing PCI ports, but it can
be extended to blacklist other vulnerable devices or locations.

If you do (3) then maybe we don't need (1) and (2), which require a
tonne of work in the DMA and IOMMU layers (but would certainly be nice
to see, since it would also help handle ATS invalidation timeouts)

Thanks,
Jean

> So that some high-end trusted devices use non-strict mode, and keep others still using
> strict mode. The drivers who want to use non-strict mode, should change to use new APIs
> by themselves.
>
>
>>>
>>> Why not keep the policy to secure by default, as we do for
>>> iommu.passthrough? And maybe add something similar to
>>> CONFIG_IOMMU_DEFAULT_PASSTRHOUGH? It's easy enough for experts to pass a
>>> command-line argument or change the default config.
>>
>> Sorry for the late reply, it was Chinese new year, and we had a long discussion
>> internally, we are fine to add a Kconfig but not sure OS vendors will set it
>> to default y.
>>
>> OS vendors seems not happy to pass a command-line argument, to be honest,
>> this is our motivation to enable non-strict as default. Hope OS vendors
>> can see this email thread, and give some input here.
>>
>> Thanks
>> Hanjun

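The check described for (3) can be pictured with a small user-space model. This is only a sketch under stated assumptions: `struct pci_dev_model` and `may_use_lazy_invalidation()` are hypothetical names made up for illustration, and only the `untrusted` flag is meant to mirror `pci_dev->untrusted` from the message above. It models the decision per device for simplicity and is not kernel code.

```c
#include <stdbool.h>

/*
 * Hypothetical, simplified stand-in for the kernel's struct pci_dev.
 * Only the one field relevant to this discussion is modelled.
 */
struct pci_dev_model {
	bool untrusted;	/* assumed to mirror pci_dev->untrusted */
};

/*
 * Decide whether DMA unmaps for a device may use deferred (lazy) TLB
 * invalidation. Strict mode requested globally always wins, and an
 * untrusted device is never allowed to use lazy mode.
 */
static bool may_use_lazy_invalidation(const struct pci_dev_model *pdev,
				      bool global_strict)
{
	if (global_strict)
		return false;	/* iommu.strict=1: invalidate synchronously */
	if (pdev->untrusted)
		return false;	/* e.g. behind an external-facing port */
	return true;
}
```

With this model, an internal device gets lazy mode only when strict mode is off globally, while a device flagged as untrusted stays strict in every configuration.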
On 2019/3/1 19:07, Jean-Philippe Brucker wrote:
> Hi Leizhen,
>
> On 01/03/2019 04:44, Leizhen (ThunderTown) wrote:
>>
>>
>> On 2019/2/26 20:36, Hanjun Guo wrote:
>>> Hi Jean,
>>>
>>> On 2019/1/31 22:55, Jean-Philippe Brucker wrote:
>>>> Hi,
>>>>
>>>> On 31/01/2019 13:52, Zhen Lei wrote:
>>>>> Currently, many peripherals are faster than before. For example, the top
>>>>> speed of the older netcard is 10Gb/s, and now it's more than 25Gb/s. But
>>>>> when iommu page-table mapping enabled, it's hard to reach the top speed
>>>>> in strict mode, because of frequently map and unmap operations. In order
>>>>> to keep abreast of the times, I think it's better to set non-strict as
>>>>> default.
>>>>
>>>> Most users won't be aware of this relaxation and will have their system
>>>> vulnerable to e.g. thunderbolt hotplug. See for example 4.3 Deferred
>>>> Invalidation in
>>>> http://www.cs.technion.ac.il/users/wwwb/cgi-bin/tr-get.cgi/2018/MSC/MSC-2018-21.pdf
>> Hi Jean,
>>
>> In fact, we have discussed the vulnerable of deferred invalidation before upstream
>> the non-strict patches. The attacks maybe possible because of an untrusted device or
>> the mistake of the device driver. And we limited the VFIO to still use strict mode.
>> As mentioned in the pdf, limit the freed memory with deferred invalidation only to
>> be reused by the device, can mitigate the vulnerability. But it's too hard to implement
>> it now.
>> A compromise maybe we only apply non-strict to (1) dma_free_coherent, because the
>> memory is controlled by DMA common module, so we can make the memory to be freed after
>> the global invalidation in the timer handler. (2) And provide some new APIs related to
>> iommu_unmap_page/sg, these new APIs deferred invalidation. And the candiate device
>> drivers update the APIs if they want to improve performance. (3) Make sure that only
>> the trusted devices and trusted drivers can apply (1) and (2). For example, the driver
>> must be built into kernel Image.
>
> Do we have a notion of untrusted kernel drivers? A userspace driver

It seems impossible to have such a driver. Modules loaded with insmod by root users
should be vouched for by those users.

> (VFIO) is untrusted, ok. But a malicious driver loaded into the kernel
> address space would have much easier ways to corrupt the system than to
> exploit lazy mode...

Yes, so we have no need to consider untrusted drivers.

>
> For (3), I agree that we should at least disallow lazy mode if
> pci_dev->untrusted is set. At the moment it means that we require the
> strictest IOMMU configuration for external-facing PCI ports, but it can
> be extended to blacklist other vulnerable devices or locations.

I plan to add an attribute file for each device, especially for hotplug devices, and
let root users decide which mode should be used, strict or non-strict, because
they should know whether the hot-plug device is trusted or not.

>
> If you do (3) then maybe we don't need (1) and (2), which require a
> tonne of work in the DMA and IOMMU layers (but would certainly be nice
> to see, since it would also help handle ATS invalidation timeouts)
>
> Thanks,
> Jean
>
>> So that some high-end trusted devices use non-strict mode, and keep others still using
>> strict mode. The drivers who want to use non-strict mode, should change to use new APIs
>> by themselves.
>>
>>
>>>>
>>>> Why not keep the policy to secure by default, as we do for
>>>> iommu.passthrough? And maybe add something similar to
>>>> CONFIG_IOMMU_DEFAULT_PASSTRHOUGH? It's easy enough for experts to pass a
>>>> command-line argument or change the default config.
>>>
>>> Sorry for the late reply, it was Chinese new year, and we had a long discussion
>>> internally, we are fine to add a Kconfig but not sure OS vendors will set it
>>> to default y.
>>>
>>> OS vendors seems not happy to pass a command-line argument, to be honest,
>>> this is our motivation to enable non-strict as default. Hope OS vendors
>>> can see this email thread, and give some input here.
>>>
>>> Thanks
>>> Hanjun

On 02/03/2019 06:12, Leizhen (ThunderTown) wrote:
>
>
> On 2019/3/1 19:07, Jean-Philippe Brucker wrote:
>> Hi Leizhen,
>>
>> On 01/03/2019 04:44, Leizhen (ThunderTown) wrote:
>>>
>>>
>>> On 2019/2/26 20:36, Hanjun Guo wrote:
>>>> Hi Jean,
>>>>
>>>> On 2019/1/31 22:55, Jean-Philippe Brucker wrote:
>>>>> Hi,
>>>>>
>>>>> On 31/01/2019 13:52, Zhen Lei wrote:
>>>>>> Currently, many peripherals are faster than before. For example, the top
>>>>>> speed of the older netcard is 10Gb/s, and now it's more than 25Gb/s. But
>>>>>> when iommu page-table mapping enabled, it's hard to reach the top speed
>>>>>> in strict mode, because of frequently map and unmap operations. In order
>>>>>> to keep abreast of the times, I think it's better to set non-strict as
>>>>>> default.
>>>>>
>>>>> Most users won't be aware of this relaxation and will have their system
>>>>> vulnerable to e.g. thunderbolt hotplug. See for example 4.3 Deferred
>>>>> Invalidation in
>>>>> http://www.cs.technion.ac.il/users/wwwb/cgi-bin/tr-get.cgi/2018/MSC/MSC-2018-21.pdf
>>> Hi Jean,
>>>
>>> In fact, we have discussed the vulnerable of deferred invalidation before upstream
>>> the non-strict patches. The attacks maybe possible because of an untrusted device or
>>> the mistake of the device driver. And we limited the VFIO to still use strict mode.
>>> As mentioned in the pdf, limit the freed memory with deferred invalidation only to
>>> be reused by the device, can mitigate the vulnerability. But it's too hard to implement
>>> it now.
>>> A compromise maybe we only apply non-strict to (1) dma_free_coherent, because the
>>> memory is controlled by DMA common module, so we can make the memory to be freed after
>>> the global invalidation in the timer handler. (2) And provide some new APIs related to
>>> iommu_unmap_page/sg, these new APIs deferred invalidation. And the candiate device
>>> drivers update the APIs if they want to improve performance. (3) Make sure that only
>>> the trusted devices and trusted drivers can apply (1) and (2). For example, the driver
>>> must be built into kernel Image.
>>
>> Do we have a notion of untrusted kernel drivers? A userspace driver
> It seems impossible to have such driver. The modules insmod by root users should be
> guaranteed by themselves.
>
>> (VFIO) is untrusted, ok. But a malicious driver loaded into the kernel
>> address space would have much easier ways to corrupt the system than to
>> exploit lazy mode...
> Yes, so that we have no need to consider untrusted drivers.
>
>>
>> For (3), I agree that we should at least disallow lazy mode if
>> pci_dev->untrusted is set. At the moment it means that we require the
>> strictest IOMMU configuration for external-facing PCI ports, but it can
>> be extended to blacklist other vulnerable devices or locations.
> I plan to add an attribute file for each device, espcially for hotplug devices. And
> let the root users to decide which mode should be used, strict or non-strict. Becasue
> they should known whether the hot-plug divice is trusted or not.

Aside from the problem that without massive implementation changes
strict/non-strict is at best a per-domain property, not a per-device
one, I can't see this being particularly practical - surely the whole
point of a malicious endpoint is that it's going to pretend to be some
common device for which a 'trusted' kernel driver already exists?

If you've chosen to trust *any* external device, I think you may as well
have just set non-strict globally anyway. The effort involved in trying
to implement super-fine-grained control seems hard to justify.

Robin.

>>
>> If you do (3) then maybe we don't need (1) and (2), which require a
>> tonne of work in the DMA and IOMMU layers (but would certainly be nice
>> to see, since it would also help handle ATS invalidation timeouts)
>>
>> Thanks,
>> Jean
>>
>>> So that some high-end trusted devices use non-strict mode, and keep others still using
>>> strict mode. The drivers who want to use non-strict mode, should change to use new APIs
>>> by themselves.
>>>
>>>
>>>>>
>>>>> Why not keep the policy to secure by default, as we do for
>>>>> iommu.passthrough? And maybe add something similar to
>>>>> CONFIG_IOMMU_DEFAULT_PASSTRHOUGH? It's easy enough for experts to pass a
>>>>> command-line argument or change the default config.
>>>>
>>>> Sorry for the late reply, it was Chinese new year, and we had a long discussion
>>>> internally, we are fine to add a Kconfig but not sure OS vendors will set it
>>>> to default y.
>>>>
>>>> OS vendors seems not happy to pass a command-line argument, to be honest,
>>>> this is our motivation to enable non-strict as default. Hope OS vendors
>>>> can see this email thread, and give some input here.
>>>>
>>>> Thanks
>>>> Hanjun

On 2019/3/4 23:52, Robin Murphy wrote:
> On 02/03/2019 06:12, Leizhen (ThunderTown) wrote:
>>
>>
>> On 2019/3/1 19:07, Jean-Philippe Brucker wrote:
>>> Hi Leizhen,
>>>
>>> On 01/03/2019 04:44, Leizhen (ThunderTown) wrote:
>>>>
>>>>
>>>> On 2019/2/26 20:36, Hanjun Guo wrote:
>>>>> Hi Jean,
>>>>>
>>>>> On 2019/1/31 22:55, Jean-Philippe Brucker wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On 31/01/2019 13:52, Zhen Lei wrote:
>>>>>>> Currently, many peripherals are faster than before. For example, the top
>>>>>>> speed of the older netcard is 10Gb/s, and now it's more than 25Gb/s. But
>>>>>>> when iommu page-table mapping enabled, it's hard to reach the top speed
>>>>>>> in strict mode, because of frequently map and unmap operations. In order
>>>>>>> to keep abreast of the times, I think it's better to set non-strict as
>>>>>>> default.
>>>>>>
>>>>>> Most users won't be aware of this relaxation and will have their system
>>>>>> vulnerable to e.g. thunderbolt hotplug. See for example 4.3 Deferred
>>>>>> Invalidation in
>>>>>> http://www.cs.technion.ac.il/users/wwwb/cgi-bin/tr-get.cgi/2018/MSC/MSC-2018-21.pdf
>>>> Hi Jean,
>>>>
>>>> In fact, we have discussed the vulnerable of deferred invalidation before upstream
>>>> the non-strict patches. The attacks maybe possible because of an untrusted device or
>>>> the mistake of the device driver. And we limited the VFIO to still use strict mode.
>>>> As mentioned in the pdf, limit the freed memory with deferred invalidation only to
>>>> be reused by the device, can mitigate the vulnerability. But it's too hard to implement
>>>> it now.
>>>> A compromise maybe we only apply non-strict to (1) dma_free_coherent, because the
>>>> memory is controlled by DMA common module, so we can make the memory to be freed after
>>>> the global invalidation in the timer handler. (2) And provide some new APIs related to
>>>> iommu_unmap_page/sg, these new APIs deferred invalidation. And the candiate device
>>>> drivers update the APIs if they want to improve performance. (3) Make sure that only
>>>> the trusted devices and trusted drivers can apply (1) and (2). For example, the driver
>>>> must be built into kernel Image.
>>>
>>> Do we have a notion of untrusted kernel drivers? A userspace driver
>> It seems impossible to have such driver. The modules insmod by root users should be
>> guaranteed by themselves.
>>
>>> (VFIO) is untrusted, ok. But a malicious driver loaded into the kernel
>>> address space would have much easier ways to corrupt the system than to
>>> exploit lazy mode...
>> Yes, so that we have no need to consider untrusted drivers.
>>
>>>
>>> For (3), I agree that we should at least disallow lazy mode if
>>> pci_dev->untrusted is set. At the moment it means that we require the
>>> strictest IOMMU configuration for external-facing PCI ports, but it can
>>> be extended to blacklist other vulnerable devices or locations.
>> I plan to add an attribute file for each device, espcially for hotplug devices. And
>> let the root users to decide which mode should be used, strict or non-strict. Becasue
>> they should known whether the hot-plug divice is trusted or not.
>
> Aside from the problem that without massive implementation changes strict/non-strict is at
> best a per-domain property, not a per-device one, I can't see this being particularly practical
> - surely the whole point of a malicious endpoint is that it's going to pretend to be some common
> device for which a 'trusted' kernel driver already exists?

Yes. It should be assumed that all kernel drivers and all hard-wired devices are trusted. There is
no reason to suspect that open source drivers, or the drivers and devices provided by legitimate
suppliers, are malicious.

> If you've chosen to trust *any* external device, I think you may as well have just set non-strict globally anyway.
> The effort involved in trying to implement super-fine-grained control seems hard to justify.

The default mode for external devices is strict, and it can obviously be changed to non-strict mode. But as
you said, it may be hard to implement. In addition, bringing a malicious device into a computer room,
attaching it and exporting data is not easy either. Maybe I should follow Jean's suggestion first and
add a config item.

>
> Robin.
>
>>>
>>> If you do (3) then maybe we don't need (1) and (2), which require a
>>> tonne of work in the DMA and IOMMU layers (but would certainly be nice
>>> to see, since it would also help handle ATS invalidation timeouts)
>>>
>>> Thanks,
>>> Jean
>>>
>>>> So that some high-end trusted devices use non-strict mode, and keep others still using
>>>> strict mode. The drivers who want to use non-strict mode, should change to use new APIs
>>>> by themselves.
>>>>
>>>>
>>>>>>
>>>>>> Why not keep the policy to secure by default, as we do for
>>>>>> iommu.passthrough? And maybe add something similar to
>>>>>> CONFIG_IOMMU_DEFAULT_PASSTRHOUGH? It's easy enough for experts to pass a
>>>>>> command-line argument or change the default config.
>>>>>
>>>>> Sorry for the late reply, it was Chinese new year, and we had a long discussion
>>>>> internally, we are fine to add a Kconfig but not sure OS vendors will set it
>>>>> to default y.
>>>>>
>>>>> OS vendors seems not happy to pass a command-line argument, to be honest,
>>>>> this is our motivation to enable non-strict as default. Hope OS vendors
>>>>> can see this email thread, and give some input here.
>>>>>
>>>>> Thanks
>>>>> Hanjun

>>>
>>>> (VFIO) is untrusted, ok. But a malicious driver loaded into the kernel
>>>> address space would have much easier ways to corrupt the system than to
>>>> exploit lazy mode...
>>> Yes, so that we have no need to consider untrusted drivers.
>>>
>>>>
>>>> For (3), I agree that we should at least disallow lazy mode if
>>>> pci_dev->untrusted is set. At the moment it means that we require the
>>>> strictest IOMMU configuration for external-facing PCI ports, but it can
>>>> be extended to blacklist other vulnerable devices or locations.
>>> I plan to add an attribute file for each device, espcially for hotplug devices. And
>>> let the root users to decide which mode should be used, strict or non-strict. Becasue
>>> they should known whether the hot-plug divice is trusted or not.
>>
>> Aside from the problem that without massive implementation changes strict/non-strict is at
>> best a per-domain property, not a per-device one, I can't see this being particularly practical
>> - surely the whole point of a malicious endpoint is that it's going to pretend to be some common
>> device for which a 'trusted' kernel driver already exists?
> Yes, It should be assumed that all kernel drivers and all hard-wired devices are trusted. There is
> no reason to doubt that the open source drivers or the drivers and devices provided by legitimate
> suppliers are malicious.
>
>
>> If you've chosen to trust *any* external device, I think you may as well have just set non-strict globally anyway.
>> The effort involved in trying to implement super-fine-grained control seems hard to justify.
> The default mode of external devices is strict, it can be obviously changed to non-strict mode. But as
> you said, it maybe hard to be implemented. In addition, bring a malicious device into computer room,
> attach and export data it's not easy also. Maybe I should follow Jean'suggestion first,
> add a config item.
>

+1

On another topic, we did also see a use case for selectively
passthrough'ing devices.

Typically, having the kernel use the identity mapping when driving a
device is fine. In fact, having the IOMMU translate puts a big
performance burden on the system. However, sometimes we may require the
IOMMU to be involved for certain devices, such as when the kernel device
driver has big contiguous DMA requirements, which is the case for some
RDMA NICs.

John

>>
>> Robin.
>>
>>>>
>>>> If you do (3) then maybe we don't need (1) and (2), which require a
>>>> tonne of work in the DMA and IOMMU layers (but would certainly be nice
>>>> to see, since it would also help handle ATS invalidation timeouts)
>>>>
>>>> Thanks,
>>>> Jean
>>>>
>>>>> So that some high-end trusted devices use non-strict mode, and keep others still using
>>>>> strict mode. The drivers who want to use non-strict mode, should change to use new APIs
>>>>> by themselves.
>>>>>
>>>>>
>>>>>>>
>>>>>>> Why not keep the policy to secure by default, as we do for
>>>>>>> iommu.passthrough? And maybe add something similar to
>>>>>>> CONFIG_IOMMU_DEFAULT_PASSTRHOUGH? It's easy enough for experts to pass a
>>>>>>> command-line argument or change the default config.
>>>>>>
>>>>>> Sorry for the late reply, it was Chinese new year, and we had a long discussion
>>>>>> internally, we are fine to add a Kconfig but not sure OS vendors will set it
>>>>>> to default y.
>>>>>>
>>>>>> OS vendors seems not happy to pass a command-line argument, to be honest,
>>>>>> this is our motivation to enable non-strict as default. Hope OS vendors
>>>>>> can see this email thread, and give some input here.
>>>>>>
>>>>>> Thanks
>>>>>> Hanjun

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index b799bcf..667221f 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1779,13 +1779,13 @@
 
 	iommu.strict=	[ARM64] Configure TLB invalidation behaviour
 			Format: { "0" | "1" }
-			0 - Lazy mode.
+			0 - Lazy mode (default).
 			  Request that DMA unmap operations use deferred
 			  invalidation of hardware TLBs, for increased
 			  throughput at the cost of reduced device isolation.
 			  Will fall back to strict mode if not supported by
 			  the relevant IOMMU driver.
-			1 - Strict mode (default).
+			1 - Strict mode.
 			  DMA unmap operations invalidate IOMMU hardware TLBs
 			  synchronously.
 
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 3ed4db3..10e0b49 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -43,7 +43,7 @@
 #else
 static unsigned int iommu_def_domain_type = IOMMU_DOMAIN_DMA;
 #endif
-static bool iommu_dma_strict __read_mostly = true;
+static bool iommu_dma_strict __read_mostly;
 
 struct iommu_callback_data {
 	const struct iommu_ops *ops;

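For comparison, the alternative raised in the thread (keep strict as the built-in default and let a config option or `iommu.strict=` relax it) could look roughly like the user-space model below. It is a sketch under assumptions, not the patch and not kernel code: the macro `IOMMU_DEFAULT_NON_STRICT` stands in for a hypothetical Kconfig symbol that does not appear in the thread, and `parse_iommu_strict()` is a made-up stand-in for the kernel's command-line handling.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical build-time switch standing in for a Kconfig symbol
 * (name assumed). Build with -DIOMMU_DEFAULT_NON_STRICT=1 to make
 * lazy mode the default; without it the default stays strict.
 */
#ifndef IOMMU_DEFAULT_NON_STRICT
#define IOMMU_DEFAULT_NON_STRICT 0
#endif

/* Secure by default: strict unless the build asked for non-strict. */
static bool iommu_dma_strict = !IOMMU_DEFAULT_NON_STRICT;

/*
 * Hypothetical stand-in for parsing the value of "iommu.strict=".
 * Accepts exactly "0" (lazy) or "1" (strict), matching the documented
 * format. Returns 0 on success and -1 on malformed input, in which
 * case the current setting is left untouched.
 */
static int parse_iommu_strict(const char *arg)
{
	if (!arg || strlen(arg) != 1 || (arg[0] != '0' && arg[0] != '1'))
		return -1;
	iommu_dma_strict = (arg[0] == '1');
	return 0;
}
```

In this arrangement a distribution that wants the throughput gain flips the build-time default once, and anyone can still override it per boot with `iommu.strict=0` or `iommu.strict=1`.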
Currently, many peripherals are faster than before. For example, the top
speed of older netcards is 10Gb/s, and now it's more than 25Gb/s. But
when IOMMU page-table mapping is enabled, it's hard to reach the top speed
in strict mode, because of frequent map and unmap operations. In order
to keep abreast of the times, I think it's better to set non-strict as
the default.

Below is our iperf performance data for a 25Gb netcard:
strict mode: 18-20 Gb/s
non-strict mode: 23.5 Gb/s

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 Documentation/admin-guide/kernel-parameters.txt | 4 ++--
 drivers/iommu/iommu.c                           | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

--
1.8.3