Message ID | 20170626153650.23017-1-ross.lagerwall@citrix.com (mailing list archive) |
---|---|
State | New, archived |
On 26/06/17 16:36, Ross Lagerwall wrote:
> Xen Live Patching has been available as tech preview feature since Xen
> 4.7 and has now had a couple of releases to stabilize. Xen Live patching
> has been used by multiple vendors to fix several real-world security
> issues without any severe bugs encountered. Additionally, there are now
> tests in OSSTest that test live patching to ensure that no regressions
> are introduced.
>
> Based on the amount of testing and usage it has had, we are ready to
> declare live patching as a 'Supported' feature.
>
> Live patching is slightly peculiar when it comes to support because it
> allows the host administrator to break their system rather easily
> depending on the content of the live patch.
> Because of this, it is worth detailing out the scope of security
> support:
>
> * Unprivileged access to live patching operations:
> Live patching operations should only be accessible to privileged
> guests and it shall be treated as a security issue if this is not
> the case.
>
> * Bugs in the patch-application code such that vulnerabilities exist
> after application:
> If a correct live patch is loaded but it is not applied correctly
> such that it might result in an insecure system (e.g. not all
> functions are patched), it shall be treated as a security issue.
>
> * Bugs in livepatch-build-tools creating incorrect live patch that
> results in an insecure host:
> If livepatch-build-tools creates an incorrect live patch that
> results in an insecure host, this shall not be considered a security
> issue. There are too many OSes and toolchains to consider supporting
> this. A live patch should be checked to verify that it is valid
> before loading.
>
> * Loading an incorrect live patch that results in an insecure host or
> host crash:
> If a live patch (whether created using livepatch-build-tools or some
> alternative) is loaded and it results in an insecure host or host
> crash due to the content of the live patch being incorrect or the
> issue being inappropriate to live patch, this is not considered as a
> security issue.
>
> * Bugs in the live patch parsing code (the ELF loader):
> Bugs in the live patch parsing code such as out-of-bounds reads
> caused by invalid ELF files are not considered to be security issues
> because the it can only be triggered by a privileged domain.

For these last points, I think it is worth stating that people using livepatching are expected to test their patches in a test environment first.

>
> * Bugs which allow a guest to prevent the application of a livepatch:
> A guest should not be able to prevent the application of a live
> patch. If an unprivileged guest can prevent the application of a
> live patch, it shall be treated as a security issue.

This one is harder to say. We know that enough concurrent live migrations can, which extends to "lots of activity in the guest". It's perhaps worth noting the potential workaround of `xl pause $DOM; xen-livepatch ...; xl unpause`.

I'd prefer that we excluded situations like this from being within security support. "guest having heavy workloads" is normal for end users, so shouldn't constitute a security vulnerability, as there is nothing we can do about it.

>
> There are also some generic security questions which it is worth asking:
>
> 1) Is guest->host privilege escalation possible?
>
> The new live patching sysctl subops are only accessible to privileged
> domains and this is tested by OSSTest with an XTF test.
> There is a caveat -- an incorrect live patch can introduce a guest->host
> privilege escalation.
>
> 2) Is guest user->guest kernel escalation possible?
>
> No, although an incorrect live patch can introduce a guest user->guest
> kernel privilege escalation.
>
> 3) Is there any information leakage?
>
> The new live patching sysctl subops are only accessible to privileged
> domains so it is not possible for an unprivileged guest to access the
> list of loaded live patches. This is tested by OSSTest with an XTF test.
> There is a caveat -- an incorrect live patch can introduce an
> information leakage.
>
> 4) Can a Denial-of-Service be triggered?
>
> There are no known ways that an unprivileged guest can prevent a live
> patch from being loaded.
> Once again, there is a caveat that an incorrect live patch can introduce
> an arbitrary denial of service.
>
> Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>

This is all good, but this information needs to be in a file in docs/features/, most probably livepatching.pandoc

> ---
> xen/common/Kconfig | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/xen/common/Kconfig b/xen/common/Kconfig
> index dc8e876..876086c 100644
> --- a/xen/common/Kconfig
> +++ b/xen/common/Kconfig
> @@ -226,7 +226,7 @@ config CRYPTO
> bool
>
> config LIVEPATCH
> - bool "Live patching support (TECH PREVIEW)"
> + bool "Live patching support"
> default n

This default should flip as well.

~Andrew

> depends on HAS_BUILD_ID = "y"
> ---help---
On 26/06/17 17:39, Andrew Cooper wrote:
>> * Bugs which allow a guest to prevent the application of a livepatch:
>> A guest should not be able to prevent the application of a live
>> patch. If an unprivileged guest can prevent the application of a
>> live patch, it shall be treated as a security issue.
>
> This one is harder to say. We know that enough concurrent live
> migrations can, which extends to "lots of activity in the guest". Its
> perhaps worth noting the potential workaround of `xl pause $DOM;
> xen-livepatch ...; xl unpause`.

And what if the guest can prevent itself from being paused?

Or, what if the guest can trigger some other persistent state change such that livepatching will fail even if the domain is paused (or destroyed)?

I agree that as long as the patch can be applied after "xl pause", then the domain cannot be said to be preventing the application of the livepatch. But if either 'xl pause' doesn't work, or if livepatching fails due to a malicious domain's actions after 'xl pause' (or 'xl destroy'), then it should be treated as a security issue.

> This is all good, but this information needs to be in a file in
> docs/features/, most probably livepatching.pandoc

+1

-George
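The pause / apply / unpause workaround mentioned above can be sketched as a small shell wrapper. This is only an illustration under stated assumptions: the function name `run_while_paused` and the `RUN` dry-run variable are hypothetical, and the command to run while the domain is paused is left to the caller, because the thread elides the `xen-livepatch` arguments. The only tools assumed are `xl pause` and `xl unpause`.

```shell
#!/bin/sh
# Sketch: run a command (for example a xen-livepatch invocation) while a
# domain is paused, and always unpause the domain afterwards.
# RUN is a hypothetical dry-run hook: with RUN=echo the commands are printed
# instead of executed.
RUN=${RUN:-}

run_while_paused() {
    dom=$1      # domain name or ID, as accepted by `xl pause`
    shift       # the remaining arguments are the command to run

    # If the domain cannot be paused, do not attempt the command at all.
    $RUN xl pause "$dom" || return 1

    # Run the caller's command and remember its exit status.
    $RUN "$@"
    rc=$?

    # Unpause regardless of whether the command succeeded.
    $RUN xl unpause "$dom"
    return "$rc"
}
```

A dry run such as `RUN=echo; run_while_paused guest1 xen-livepatch apply hotfix` only prints the three commands; `guest1` and `hotfix` are placeholder names. The wrapper returns the exit status of the wrapped command, so a failed application is still visible to the caller after the domain has been unpaused.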
On 06/26/2017 05:39 PM, Andrew Cooper wrote:
> On 26/06/17 16:36, Ross Lagerwall wrote:

snip

>> * Unprivileged access to live patching operations:
>> Live patching operations should only be accessible to privileged
>> guests and it shall be treated as a security issue if this is not
>> the case.
>>
>> * Bugs in the patch-application code such that vulnerabilities exist
>> after application:
>> If a correct live patch is loaded but it is not applied correctly
>> such that it might result in an insecure system (e.g. not all
>> functions are patched), it shall be treated as a security issue.
>>
>> * Bugs in livepatch-build-tools creating incorrect live patch that
>> results in an insecure host:
>> If livepatch-build-tools creates an incorrect live patch that
>> results in an insecure host, this shall not be considered a security
>> issue. There are too many OSes and toolchains to consider supporting
>> this. A live patch should be checked to verify that it is valid
>> before loading.
>>
>> * Loading an incorrect live patch that results in an insecure host or
>> host crash:
>> If a live patch (whether created using livepatch-build-tools or some
>> alternative) is loaded and it results in an insecure host or host
>> crash due to the content of the live patch being incorrect or the
>> issue being inappropriate to live patch, this is not considered as a
>> security issue.
>>
>> * Bugs in the live patch parsing code (the ELF loader):
>> Bugs in the live patch parsing code such as out-of-bounds reads
>> caused by invalid ELF files are not considered to be security issues
>> because the it can only be triggered by a privileged domain.
>
> For these last points, I think it is worth stating that people using
> livepatching are expected to test their patches in a test environment first.

OK.

>
>>
>> * Bugs which allow a guest to prevent the application of a livepatch:
>> A guest should not be able to prevent the application of a live
>> patch. If an unprivileged guest can prevent the application of a
>> live patch, it shall be treated as a security issue.
>
> This one is harder to say. We know that enough concurrent live
> migrations can, which extends to "lots of activity in the guest". Its
> perhaps worth noting the potential workaround of `xl pause $DOM;
> xen-livepatch ...; xl unpause`.
>
> I'd prefer that we excluded situations like this from being within
> security support. "guest having heavy workloads" is normal for end
> users, so shouldn't constitute a security vulnerability, as there is
> nothing we can do about it.

But surely live migrations cannot be triggered by the guest, only the host administrator? I don't know of any way of triggering the timeout from within an unprivileged guest.

>
>>
>> There are also some generic security questions which it is worth asking:
>>
>> 1) Is guest->host privilege escalation possible?
>>
>> The new live patching sysctl subops are only accessible to privileged
>> domains and this is tested by OSSTest with an XTF test.
>> There is a caveat -- an incorrect live patch can introduce a guest->host
>> privilege escalation.
>>
>> 2) Is guest user->guest kernel escalation possible?
>>
>> No, although an incorrect live patch can introduce a guest user->guest
>> kernel privilege escalation.
>>
>> 3) Is there any information leakage?
>>
>> The new live patching sysctl subops are only accessible to privileged
>> domains so it is not possible for an unprivileged guest to access the
>> list of loaded live patches. This is tested by OSSTest with an XTF test.
>> There is a caveat -- an incorrect live patch can introduce an
>> information leakage.
>>
>> 4) Can a Denial-of-Service be triggered?
>>
>> There are no known ways that an unprivileged guest can prevent a live
>> patch from being loaded.
>> Once again, there is a caveat that an incorrect live patch can introduce
>> an arbitrary denial of service.
>>
>> Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
>
> This is all good, but this information needs to be in a file in
> docs/features/, most probably livepatching.pandoc

OK.

>
>> ---
>> xen/common/Kconfig | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/xen/common/Kconfig b/xen/common/Kconfig
>> index dc8e876..876086c 100644
>> --- a/xen/common/Kconfig
>> +++ b/xen/common/Kconfig
>> @@ -226,7 +226,7 @@ config CRYPTO
>> bool
>>
>> config LIVEPATCH
>> - bool "Live patching support (TECH PREVIEW)"
>> + bool "Live patching support"
>> default n
>
> This default should flip as well.
>

OK.
George Dunlap writes ("Re: [PATCH for-4.9] livepatch: Declare live patching as a supported feature"):
> I agree that as long as the patch can be applied after "xl pause", then
> the domain cannot be said to be preventing the application of the
> livepatch. But if either 'xl pause' doesn't work, or if livepatching
> fails due to a malicious domain's actions after 'xl pause' (or 'xl
> destroy'), then it should be treated as a security issue.

+1

Ian.
On 26/06/17 16:36, Ross Lagerwall wrote:
> Xen Live Patching has been available as tech preview feature since Xen
> 4.7 and has now had a couple of releases to stabilize. Xen Live patching
> has been used by multiple vendors to fix several real-world security
> issues without any severe bugs encountered. Additionally, there are now
> tests in OSSTest that test live patching to ensure that no regressions
> are introduced.
>
> Based on the amount of testing and usage it has had, we are ready to
> declare live patching as a 'Supported' feature.

Great write-up, Ross, thanks. I more or less agree with everything except...

> * Bugs in livepatch-build-tools creating incorrect live patch that
> results in an insecure host:
> If livepatch-build-tools creates an incorrect live patch that
> results in an insecure host, this shall not be considered a security
> issue. There are too many OSes and toolchains to consider supporting
> this. A live patch should be checked to verify that it is valid
> before loading.

I'm not sure I follow the argument here. Suppose in one month's time it is discovered that livepatch-build-tools, under some circumstances, creates patches that open up a side vulnerability. Do you really think we should just post a fix to the mailing list, without alerting anybody who may be affected by it?

Remember, "security support" doesn't mean, "We promise there are no bugs". It means, "If bugs are discovered, we will notify people according to the XenProject Security Response Process"; and this is not only for people on the pre-disclosure list, but for everyone *not* on the list as well, to have one place to find all security-related issues relevant to Xen.

-George
On 26/06/17 17:50, Ross Lagerwall wrote:
> On 06/26/2017 05:39 PM, Andrew Cooper wrote:
>> On 26/06/17 16:36, Ross Lagerwall wrote:
>>
>>>
>>> * Bugs which allow a guest to prevent the application of a livepatch:
>>> A guest should not be able to prevent the application of a live
>>> patch. If an unprivileged guest can prevent the application of a
>>> live patch, it shall be treated as a security issue.
>>
>> This one is harder to say. We know that enough concurrent live
>> migrations can, which extends to "lots of activity in the guest". Its
>> perhaps worth noting the potential workaround of `xl pause $DOM;
>> xen-livepatch ...; xl unpause`.
>>
>> I'd prefer that we excluded situations like this from being within
>> security support. "guest having heavy workloads" is normal for end
>> users, so shouldn't constitute a security vulnerability, as there is
>> nothing we can do about it.
>
> But surely live migrations cannot be triggered by the guest, only the
> host administrator? I don't know of any way of triggering the timeout
> from within an unprivileged guest.

Every VCPU issuing a loop of decrease/increase reservation on a single gfn will cause a similar quantity of p2m lock contention.

On older AMD hardware, we have to hold the p2m read lock to service hypercalls, which is why XSA-114 was issued.

~Andrew
On 26/06/17 17:50, George Dunlap wrote:
> On 26/06/17 17:39, Andrew Cooper wrote:
>>> * Bugs which allow a guest to prevent the application of a livepatch:
>>> A guest should not be able to prevent the application of a live
>>> patch. If an unprivileged guest can prevent the application of a
>>> live patch, it shall be treated as a security issue.
>> This one is harder to say. We know that enough concurrent live
>> migrations can, which extends to "lots of activity in the guest". Its
>> perhaps worth noting the potential workaround of `xl pause $DOM;
>> xen-livepatch ...; xl unpause`.
> And what if the guest can prevent itself from being paused?

In which case, that is an XSA in its own right.

The underlying implementation uses XEN_DOMCTL_{,un}pausedomain which call straight into domain_{un,}pause(). We have very big problems if the guest has any influence in this...

>
> Or, what if the guest can trigger some other persistent state change
> such that livepatching will fail even if the domain is paused (or
> destroyed)?

Such as?

The guest being able to cause damaging mutative state change in Xen is clearly a security issue, irrespective of any livepatch involvement.

However, livepatch content (hook function for example) which trips over state as found in the hypervisor at the point of application is a bad livepatch, not a vulnerability in livepatching.

> I agree that as long as the patch can be applied after "xl pause", then
> the domain cannot be said to be preventing the application of the
> livepatch. But if either 'xl pause' doesn't work, or if livepatching
> fails due to a malicious domain's actions after 'xl pause' (or 'xl
> destroy'), then it should be treated as a security issue.

I broadly agree, but these bugs feel like they would be self-standing, perhaps with an impact to applying a livepatch, rather than XSAs in livepatching itself.

~Andrew
On 26/06/17 18:00, George Dunlap wrote:
> On 26/06/17 16:36, Ross Lagerwall wrote:
>> Xen Live Patching has been available as tech preview feature since Xen
>> 4.7 and has now had a couple of releases to stabilize. Xen Live patching
>> has been used by multiple vendors to fix several real-world security
>> issues without any severe bugs encountered. Additionally, there are now
>> tests in OSSTest that test live patching to ensure that no regressions
>> are introduced.
>>
>> Based on the amount of testing and usage it has had, we are ready to
>> declare live patching as a 'Supported' feature.
> Great write-up, Ross, thanks. I more or less agree with everything
> except...
>
>> * Bugs in livepatch-build-tools creating incorrect live patch that
>> results in an insecure host:
>> If livepatch-build-tools creates an incorrect live patch that
>> results in an insecure host, this shall not be considered a security
>> issue. There are too many OSes and toolchains to consider supporting
>> this. A live patch should be checked to verify that it is valid
>> before loading.
> I'm not sure I follow the argument here. Suppose in one months' time it
> is discovered that livepatch-build-tools, under some circumstances,
> creates patches that open up a side vulnerability. Do you really think
> we should just post a fix to the mailing list, without alerting anybody
> who may be affected by it?

There are a million ways this could happen, starting from the simple cases of accidentally building the livepatch from a non-clean working tree, or accidentally using a compiler other than the one used to build the running hypervisor.

We absolutely cannot be in the position of issuing XSAs for situations like this, because there are too many ways where it definitely will go wrong, and we'd end up issuing XSAs saying "remember to clean your working tree before building a livepatch". This is of course absurd.

IMO, the only viable option is to exclude livepatch-build-tools entirely from security scope. It is already the case that people producing livepatches need to check the resulting livepatch binary for sanity, and test it suitably in a development environment before use in production.

~Andrew
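One of the user mistakes named above, building from a non-clean working tree, can be guarded against before the build starts. The following is a minimal sketch, assuming the source checkout is a git working tree; `tree_is_clean` is a hypothetical helper name and is not part of livepatch-build-tools. It does not address the other mistake mentioned, a compiler that differs from the one used for the running hypervisor.

```shell
#!/bin/sh
# Sketch: refuse to build a live patch from a dirty git working tree.
# tree_is_clean is a hypothetical helper, not part of livepatch-build-tools.

tree_is_clean() {
    src=$1   # path to the source checkout used for the build

    # Return 2 if the path is not inside a git working tree at all.
    git -C "$src" rev-parse --is-inside-work-tree >/dev/null 2>&1 || return 2

    # `git status --porcelain` prints one line per modified or untracked
    # file, so empty output means nothing differs from the checked-out commit.
    # Files matched by .gitignore are not reported.
    [ -z "$(git -C "$src" status --porcelain)" ]
}
```

A build wrapper could call `tree_is_clean "$src" || exit 1` before invoking the build tool. Because ignored files are not reported, stale build products covered by `.gitignore` would still need a separate clean step.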
Hi,

On 06/26/2017 04:36 PM, Ross Lagerwall wrote:
> Xen Live Patching has been available as tech preview feature since Xen
> 4.7 and has now had a couple of releases to stabilize. Xen Live patching
> has been used by multiple vendors to fix several real-world security
> issues without any severe bugs encountered. Additionally, there are now
> tests in OSSTest that test live patching to ensure that no regressions
> are introduced.
>
> Based on the amount of testing and usage it has had, we are ready to
> declare live patching as a 'Supported' feature.

There are only tests for x86 and amd64. We likely want to have those tests enabled for all architectures by default.

Also, I am not aware of anyone using livepatch in production on ARM64 and ARM32. So did anyone give a good kick at the ARM implementation?

If not, then we should do it before even considering it as a supported feature for ARM.

Cheers,
On Mon, Jun 26, 2017 at 07:29:22PM +0100, Julien Grall wrote:
> Hi,
>
> On 06/26/2017 04:36 PM, Ross Lagerwall wrote:
> > Xen Live Patching has been available as tech preview feature since Xen
> > 4.7 and has now had a couple of releases to stabilize. Xen Live patching
> > has been used by multiple vendors to fix several real-world security
> > issues without any severe bugs encountered. Additionally, there are now
> > tests in OSSTest that test live patching to ensure that no regressions
> > are introduced.
> >
> > Based on the amount of testing and usage it has had, we are ready to
> > declare live patching as a 'Supported' feature.
>
> There are only test for x86 and amd64. We likely want to have those test

The test-cases are also for ARM32.

> enabled for all architectures by default.

And the OSSTest can test all of those.

>
> Also, I am not aware of anyone using in production livepatch on ARM64 and
> ARM32. So did anyone give a good kick at the ARM implementation?

I am not aware of anybody using it on production on ARM32 or ARM64.

The test-cases are there, the code is there, but yes nobody has kicked the tires on ARM32/ARM64 extensively with it. I would be excited to see vendors that use it and their reports but I am not aware of any.

>
> If not, then we should do it before even considering as a supported feature
> for ARM.

OK. Perhaps then only for x86 until ARM operational users pipe up?

>
> Cheers,
>
> --
> Julien Grall
>>> Andrew Cooper <andrew.cooper3@citrix.com> 06/26/17 6:40 PM >>>
>> --- a/xen/common/Kconfig
>> +++ b/xen/common/Kconfig
>> @@ -226,7 +226,7 @@ config CRYPTO
>> bool
>>
>> config LIVEPATCH
>> - bool "Live patching support (TECH PREVIEW)"
>> + bool "Live patching support"
>> default n
>
>This default should flip as well.

For unstable maybe. But please not for 4.9 - people shouldn't be caught by surprise after 9 RCs that a config option's default changes. Furthermore, if we default it to y, will there be much of a difference to simply removing the config option?

Jan
Hi Jan,

On 06/27/2017 07:04 AM, Jan Beulich wrote:
>>>> Andrew Cooper <andrew.cooper3@citrix.com> 06/26/17 6:40 PM >>>
>>> --- a/xen/common/Kconfig
>>> +++ b/xen/common/Kconfig
>>> @@ -226,7 +226,7 @@ config CRYPTO
>>> bool
>>>
>>> config LIVEPATCH
>>> - bool "Live patching support (TECH PREVIEW)"
>>> + bool "Live patching support"
>>> default n
>>
>> This default should flip as well.
>
> For unstable maybe. But please not for 4.9 - people shouldn't be caught by
> surprise after 9 RCs that a config option's default changes. Furthermore, if
> we default it to y, will there be much of a difference to simply removing the
> config option?

I think we should allow users to disable livepatch at build time. I have in mind people looking at using Xen in constrained environments that require some certifications. They will likely want to disable some parts of Xen.

Cheers,
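Since the default of the option is under discussion, it can be useful to check what a given build actually selected rather than rely on the default. The sketch below assumes the usual Kconfig output conventions, where an enabled bool option appears as `CONFIG_<NAME>=y` in the generated `.config` and a disabled one as a `# CONFIG_<NAME> is not set` comment; `livepatch_enabled` is a hypothetical helper name, and the path of the `.config` file is left to the caller.

```shell
#!/bin/sh
# Sketch: report whether a Kconfig-generated .config enables live patching.
# livepatch_enabled is a hypothetical helper name.

livepatch_enabled() {
    config=$1   # path to the .config file written by Kconfig

    # Match only the exact "enabled" form, so the "# ... is not set"
    # comment written for a disabled option does not count.
    grep -q '^CONFIG_LIVEPATCH=y$' "$config"
}
```

The function succeeds (exit status 0) only when the option is explicitly enabled, so it behaves the same whichever way the default ends up being set.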
On 06/26/2017 10:07 PM, Konrad Rzeszutek Wilk wrote:
> On Mon, Jun 26, 2017 at 07:29:22PM +0100, Julien Grall wrote:
>> Hi,
>>
>> On 06/26/2017 04:36 PM, Ross Lagerwall wrote:
>>> Xen Live Patching has been available as tech preview feature since Xen
>>> 4.7 and has now had a couple of releases to stabilize. Xen Live patching
>>> has been used by multiple vendors to fix several real-world security
>>> issues without any severe bugs encountered. Additionally, there are now
>>> tests in OSSTest that test live patching to ensure that no regressions
>>> are introduced.
>>>
>>> Based on the amount of testing and usage it has had, we are ready to
>>> declare live patching as a 'Supported' feature.
>>
>> There are only test for x86 and amd64. We likely want to have those test
>
> The test-cases are also for ARM32.
>
>> enabled for all architectures by default.
>
> And the OSSTest can test all of those.

Can we enable them by default? I know that we limited the number of tests for ARM64 due to limited bandwidth. But I don't think we have anything preventing it on ARM32.

>>
>> Also, I am not aware of anyone using in production livepatch on ARM64 and
>> ARM32. So did anyone give a good kick at the ARM implementation?
>
> I am not aware of anybody using it on production on ARM32 or ARM64.
>
> The test-cases are there, the code is there, but yes nobody has kicked
> the tires on ARM32/ARM64 extensively with it. I would be excited to
> see vendors that use it and their reports but I am not aware of any.
>
>>
>> If not, then we should do it before even considering as a supported feature
>> for ARM.
>
> OK. Perhaps then only for x86 until ARM operational users pipe up?

That would be my preference. My main concern is to handle security issue afterwards because we didn't give any kick at the code.

Cheers,
Hi all,

there was also a discussion on IRC, which Ian said we should formally summarise in e-mail, just so there is no doubt. So here is my go at it.

As far as I can tell - besides the technical discussion in this thread, there are several issues which need to be clarified:

* For Xen 4.9 we can declare live patching supported, without spinning another RC to update the in-tree documentation: in other words, we would apply the documentation/policy changes + to the 4.9 tree sometime after this discussion has been concluded. In effect this means that docs/features/livepatching.pandoc (or similar) and associated changes to KCONFIG options would not show up until Xen 4.9.1 is spun, but the security team would treat live patching as supported for 4.9. In other words for now, we can update the table in the wiki (https://wiki.xenproject.org/wiki/Xen_Project_Release_Features) and live with in-tree artefacts being out-of-sync with the support status for a few months. We need to fix this anyway in-tree and there is a concrete proposal which should be discussed at the summit.

* There was a proposal to declare live patching supported for older releases (aka "back port" docs/features/livepatching.pandoc), but Royger pointed out that the toolstack in question needs to support buildid. If so, we should include back-porting requests and d

* Julien pointed out that maybe we shouldn't declare live patching as supported for ARM32/64. I don't see an issue to declare it supported for x86/amd64 only for now. But it is obviously up to committers to make that call.
I think that covers the gist of the IRC discussion

Regards
Lars

On 27/06/2017, 08:24, "Julien Grall" <julien.grall@arm.com> wrote:
>
>
>On 06/26/2017 10:07 PM, Konrad Rzeszutek Wilk wrote:
>> On Mon, Jun 26, 2017 at 07:29:22PM +0100, Julien Grall wrote:
>>> Hi,
>>>
>>> On 06/26/2017 04:36 PM, Ross Lagerwall wrote:
>>>> Xen Live Patching has been available as tech preview feature since Xen
>>>> 4.7 and has now had a couple of releases to stabilize. Xen Live patching
>>>> has been used by multiple vendors to fix several real-world security
>>>> issues without any severe bugs encountered. Additionally, there are now
>>>> tests in OSSTest that test live patching to ensure that no regressions
>>>> are introduced.
>>>>
>>>> Based on the amount of testing and usage it has had, we are ready to
>>>> declare live patching as a 'Supported' feature.
>>>
>>> There are only test for x86 and amd64. We likely want to have those test
>>
>> The test-cases are also for ARM32.
>>
>>> enabled for all architectures by default.
>>
>> And the OSSTest can test all of those.
>
>Can we enable them by default? I know that we limited the number of
>tests for ARM64 due to limited bandwidth. But I don't think we have
>anything preventing it on ARM32.
>
>>>
>>> Also, I am not aware of anyone using in production livepatch on ARM64 and
>>> ARM32. So did anyone give a good kick at the ARM implementation?
>>
>> I am not aware of anybody using it on production on ARM32 or ARM64.
>>
>> The test-cases are there, the code is there, but yes nobody has kicked
>> the tires on ARM32/ARM64 extensively with it. I would be excited to
>> see vendors that use it and their reports but I am not aware of any.
>>
>>>
>>> If not, then we should do it before even considering as a supported feature
>>> for ARM.
>>
>> OK. Perhaps then only for x86 until ARM operational users pipe up?
>
>That would be my preference. My main concern is to handle security issue
>afterwards because we didn't give any kick at the code.
>
>Cheers,
>
>--
>Julien Grall
On 26/06/17 18:18, Andrew Cooper wrote:
> On 26/06/17 17:50, George Dunlap wrote:
>> On 26/06/17 17:39, Andrew Cooper wrote:
>>>> * Bugs which allow a guest to prevent the application of a livepatch:
>>>> A guest should not be able to prevent the application of a live
>>>> patch. If an unprivileged guest can prevent the application of a
>>>> live patch, it shall be treated as a security issue.
>>> This one is harder to say. We know that enough concurrent live
>>> migrations can, which extends to "lots of activity in the guest". Its
>>> perhaps worth noting the potential workaround of `xl pause $DOM;
>>> xen-livepatch ...; xl unpause`.
>> And what if the guest can prevent itself from being paused?
>
> In which case, that is an XSA in its own right.
>
> The underlying implementation uses XEN_DOMCTL_{,un}pausedomain which
> call straight into domain_{un,}pause(). We have very big problems if
> the guest has any influence in this...
>
>>
>> Or, what if the guest can trigger some other persistent state change
>> such that livepatching will fail even if the domain is paused (or
>> destroyed)?
>
> Such as?
>
> The guest being able to cause damaging mutative state change in Xen is
> clearly a security issue, irrespective of any livepatch involvement.
>
> However, livepatch content (hook function for example) which trips over
> state as found in the hypervisor at the point of application is a bad
> livepatch, not a vulnerability in livepatching.
>
>> I agree that as long as the patch can be applied after "xl pause", then
>> the domain cannot be said to be preventing the application of the
>> livepatch. But if either 'xl pause' doesn't work, or if livepatching
>> fails due to a malicious domain's actions after 'xl pause' (or 'xl
>> destroy'), then it should be treated as a security issue.
>
> I broadly agree, but these bugs feel like they would be self-standing,
> perhaps with an impact to applying a livepatch, rather than XSAs in
> livepatching itself.

So let me get this right.

You think that all possible cases in which a guest can persistently prevent a livepatch from being applied would already be a security issue for other reasons.

Therefore, you think we should include a paragraph in our security support statement specifically stating that we do not provide security support if the guest can prevent a livepatch.

Is that correct?

-George
On 26/06/17 18:30, Andrew Cooper wrote: > On 26/06/17 18:00, George Dunlap wrote: >> On 26/06/17 16:36, Ross Lagerwall wrote: >>> Xen Live Patching has been available as tech preview feature since Xen >>> 4.7 and has now had a couple of releases to stabilize. Xen Live patching >>> has been used by multiple vendors to fix several real-world security >>> issues without any severe bugs encountered. Additionally, there are now >>> tests in OSSTest that test live patching to ensure that no regressions >>> are introduced. >>> >>> Based on the amount of testing and usage it has had, we are ready to >>> declare live patching as a 'Supported' feature. >> Great write-up, Ross, thanks. I more or less agree with everything >> except... >> >>> * Bugs in livepatch-build-tools creating incorrect live patch that >>> results in an insecure host: >>> If livepatch-build-tools creates an incorrect live patch that >>> results in an insecure host, this shall not be considered a security >>> issue. There are too many OSes and toolchains to consider supporting >>> this. A live patch should be checked to verify that it is valid >>> before loading. >> I'm not sure I follow the argument here. Suppose in one months' time it >> is discovered that livepatch-build-tools, under some circumstances, >> creates patches that open up a side vulnerability. Do you really think >> we should just post a fix to the mailing list, without alerting anybody >> who may be affected by it? > > There are a million ways this could happen, starting from the simple > cases of accidentally building the livepatch from a non-clean working > tree, or accidentally using a compiler other than the one used to build > the running hypervisor. > > We absolutely cannot be in the position of issuing XSAs for situations > like this, because there are too many ways where it definitely will go > wrong, and we'd end up issuing XSAs saying "remember to clean your > working tree before building a livepatch". This is of course absurd. 
Your argument is that because we do not issue XSAs for *user mistakes*,
that therefore we should not issue XSAs for *bugs in the tool*.

That is of course absurd. We do not issue XSAs for user mistakes in
building the hypervisor either (for instance, switching gcc versions
without cleaning the hypervisor tree), and yet we still issue XSAs for
bugs in the hypervisor itself.

> IMO, The only viable option is to exclude livepatch-build-tools entirely
> from security scope. It is already the case that people producing
> livepatches need to check the resulting livepatch binary for sanity, and
> test it suitably in a development environment before use in production.

Look, it sounds like right now you are going through all the livepatches
with a fine-tooth comb *because* the tools are (or recently have been)
unreliable. But at some point in the future, the patch generation
mechanism will become more reliable. After 20 XSAs over six months in
which the livepatch tool created the correct patch, you will become more
complacent. You won't look as closely; it's human nature.

You seem to be simply refusing to use your imagination. Step back.
Imagine yourself in one year. You come to the office and find an e-mail
on security@ which says, "Livepatch tools open a security hole when
compiling with gcc x.yy". You realize that Xen version ${LATEST-2} uses
gcc x.yy, so you take a closer look at that livepatch, only to discover
that the livepatches generated actually do contain the bug, but you
missed it because ${LATEST-[0,1]} were perfectly fine (since they used
newer versions of gcc), the difference was subtle, and it passed all the
functional tests.

Now all of the customers that have applied those patches are vulnerable.

Do you:

1. Tell the reporter to post it publicly to xen-devel immediately, since
livepatch tools are not security supported -- thus "zero-day"-ing all
your customers (as well as anyone else who happens to have used x.yy to
build a hypervisor)?

2. Secretly take advantage of Citrix' privileged position on the
security list, and try to get an update out to your customers before it
gets announced (but allowing everyone *else* using gcc x.yy to
experience a zero-day)?

3. Issue an XSA so that everyone has the opportunity to fix things up
before making a public announcement, and so that anyone not on the
embargo list gets an alert, so they know to either update their own
livepatches, or look for updates from their software provider?

I think #3 is the only possible choice.

-George
Julien Grall writes ("Re: [PATCH for-4.9] livepatch: Declare live patching as a supported feature"): > On 06/26/2017 10:07 PM, Konrad Rzeszutek Wilk wrote: > > On Mon, Jun 26, 2017 at 07:29:22PM +0100, Julien Grall wrote: > >> enabled for all architectures by default. > > > > And the OSSTest can test all of those. > > Can we enable them by default? I know that we limited the number of > tests for ARM64 due to limited bandwidth. But I don't think we have > anything preventing it on ARM32. We're slightly short of armhf bandwidth too and are about to become more so, with the expansion of the facility (and our difficulty getting more armhf hardware). > > OK. Perhaps then only for x86 until ARM operational users pipe up? > > That would be my preference. My main concern is to handle security issue > afterwards because we didn't give any kick at the code. SGTM Ian.
On 27/06/17 09:37, George Dunlap wrote: > On 26/06/17 18:18, Andrew Cooper wrote: >> On 26/06/17 17:50, George Dunlap wrote: >>> On 26/06/17 17:39, Andrew Cooper wrote: >>>>> * Bugs which allow a guest to prevent the application of a livepatch: >>>>> A guest should not be able to prevent the application of a live >>>>> patch. If an unprivileged guest can prevent the application of a >>>>> live patch, it shall be treated as a security issue. >>>> This one is harder to say. We know that enough concurrent live >>>> migrations can, which extends to "lots of activity in the guest". Its >>>> perhaps worth noting the potential workaround of `xl pause $DOM; >>>> xen-livepatch ...; xl unpause`. >>> And what if the guest can prevent itself from being paused? >> In which case, that is an XSA in its own right. >> >> The underlying implementation uses XEN_DOMCTL_{,un}pausedomain which >> call straight into domain_{un,}pause(). We have very big problems if >> the guest has any influence in this... >> >>> Or, what if the guest can trigger some other persistent state change >>> such that livepatching will fail even if the domain is paused (or >>> destroyed)? >> Such as? >> >> The guest being able to cause damaging mutative state change in Xen is >> clearly a security issue, irrespective of any livepatch involvement. >> >> However, livepatch content (hook function for example) which trips over >> state as found in the hypervisor at the point of application is a bad >> livepatch, not a vulnerability in livepatching. >> >>> I agree that as long as the patch can be applied after "xl pause", then >>> the domain cannot be said to be preventing the application of the >>> livepatch. But if either 'xl pause' doesn't work, or if livepatching >>> fails due to a malicious domain's actions after 'xl pause' (or 'xl >>> destroy'), then it should be treated as a security issue. 
>> I broadly agree, but these bugs feel like they would be self-standing, >> perhaps with an impact to applying a livepatch, rather than XSAs in >> livepatching itself. > So let me get this right. > > You think that all possible cases in which a guest can persistently > prevent a livepatch from being applied would already be a security issue > for other reasons. Yes. The only possible ways a guest (potentially) has of preventing livepatching from functioning (in the case that it is paused) is by mechanisms such as causing memory corruption, mis-refcounting or by having already achieved code injection. > Therefore, you think we should include a paragraph in our security > support statement specifically stating that we do not provide security > support if the guest can prevent a livepatch. > > Is that correct? Incorrect. I don't think it is worth mentioning at all. By calling it out, you are adding confusion to the area, as it is redundant with the rest of our policy. ~Andrew
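For readers who want to try the pause/unpause workaround quoted above, here is a minimal Python sketch. It is an illustration under stated assumptions, not something posted in this thread: the function name `run_with_domain_paused` and the injectable `run` callback are hypothetical, and the `xen-livepatch` invocation is supplied by the caller because the thread elides it (`xen-livepatch ...`). Only `xl pause` and `xl unpause` are taken from the quoted text.

```python
import subprocess
from typing import Callable, Sequence

# A runner takes a command (list of strings) and raises on failure.
Runner = Callable[[Sequence[str]], None]


def _run_checked(cmd: Sequence[str]) -> None:
    """Default runner: execute the command and raise if it exits non-zero."""
    subprocess.run(list(cmd), check=True)


def run_with_domain_paused(
    domain: str,
    livepatch_cmd: Sequence[str],
    run: Runner = _run_checked,
) -> None:
    """Pause `domain`, run the caller-supplied livepatch command, then unpause.

    `livepatch_cmd` is whatever xen-livepatch invocation the administrator
    intends to run; it is deliberately not hard-coded here (hypothetical helper).
    """
    run(["xl", "pause", domain])
    try:
        run(list(livepatch_cmd))
    finally:
        # Always unpause, even if the livepatch command failed.
        run(["xl", "unpause", domain])
```

The `try`/`finally` is the point of the sketch: if the livepatch step fails, the guest is still unpaused rather than left stopped.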
Lars Kurth writes ("Re: [PATCH for-4.9] livepatch: Declare live patching as a supported feature"): > * For Xen 4.9 we can declare live patching supported, without spinning > another RC to update the in-tree documentation: in other words, we would > apply the documentation/policy changes + to the 4.9 tree sometimes after > this discussion has been concluded. In effect this means that > docs/features/livepatching.pandoc (or similar) and associated changes to > KCONFIG options would not show up until Xen 4.9.1 is spun, I think that the effective version of these documents is not the one in the most recent release, but the one in (one of the) git branches. In this case, I think xenbits:xen.git#stable-4.9. (This is an essential way to look at it because we have releases which are still in security support, and might need updates to their support status, but which are out of functional support and do not get releases of updates.) > * There was a proposal to declare live patching supported for older > releases (aka "back port" docs/features/livepatching.pandoc), but Royger > pointed out that the toolstack in question needs to support buildid. If > so, we should include back-porting requests and d ("and do it then" or something?) Yes. Ian.
On 27/06/17 11:47, Andrew Cooper wrote: > On 27/06/17 09:37, George Dunlap wrote: >> On 26/06/17 18:18, Andrew Cooper wrote: >>> On 26/06/17 17:50, George Dunlap wrote: >>>> On 26/06/17 17:39, Andrew Cooper wrote: >>>>>> * Bugs which allow a guest to prevent the application of a livepatch: >>>>>> A guest should not be able to prevent the application of a live >>>>>> patch. If an unprivileged guest can prevent the application of a >>>>>> live patch, it shall be treated as a security issue. >>>>> This one is harder to say. We know that enough concurrent live >>>>> migrations can, which extends to "lots of activity in the guest". Its >>>>> perhaps worth noting the potential workaround of `xl pause $DOM; >>>>> xen-livepatch ...; xl unpause`. >>>> And what if the guest can prevent itself from being paused? >>> In which case, that is an XSA in its own right. >>> >>> The underlying implementation uses XEN_DOMCTL_{,un}pausedomain which >>> call straight into domain_{un,}pause(). We have very big problems if >>> the guest has any influence in this... >>> >>>> Or, what if the guest can trigger some other persistent state change >>>> such that livepatching will fail even if the domain is paused (or >>>> destroyed)? >>> Such as? >>> >>> The guest being able to cause damaging mutative state change in Xen is >>> clearly a security issue, irrespective of any livepatch involvement. >>> >>> However, livepatch content (hook function for example) which trips over >>> state as found in the hypervisor at the point of application is a bad >>> livepatch, not a vulnerability in livepatching. >>> >>>> I agree that as long as the patch can be applied after "xl pause", then >>>> the domain cannot be said to be preventing the application of the >>>> livepatch. But if either 'xl pause' doesn't work, or if livepatching >>>> fails due to a malicious domain's actions after 'xl pause' (or 'xl >>>> destroy'), then it should be treated as a security issue. 
>>> I broadly agree, but these bugs feel like they would be self-standing, >>> perhaps with an impact to applying a livepatch, rather than XSAs in >>> livepatching itself. >> So let me get this right. >> >> You think that all possible cases in which a guest can persistently >> prevent a livepatch from being applied would already be a security issue >> for other reasons. > > Yes. The only possible ways a guest (potentially) has of preventing > livepatching from functioning (in the case that it is paused) is by > mechanisms such as causing memory corruption, mis-refcounting or by > having already achieved code injection. > >> Therefore, you think we should include a paragraph in our security >> support statement specifically stating that we do not provide security >> support if the guest can prevent a livepatch. >> >> Is that correct? > > Incorrect. I don't think it is worth mentioning at all. > > By calling it out, you are adding confusion to the area, as it is > redundant with the rest of our policy. So we shouldn't tell people we'll issue an XSA, because they should already be able to infer from other things in our policy that an XSA will be issued? And if we clearly state that we'll issue an XSA in those circumstances, they'll be confused? -George
On 27/06/2017, 11:49, "Ian Jackson" <ian.jackson@eu.citrix.com> wrote: >> * There was a proposal to declare live patching supported for older >> releases (aka "back port" docs/features/livepatching.pandoc), but Royger >> pointed out that the toolstack in question needs to support buildid. If >> so, we should include back-porting requests and d > >("and do it then" or something?) Yes. Correct: "and do it then" ... Must have deleted the text when hovering over it. Lars
>>> Julien Grall <julien.grall@arm.com> 06/27/17 9:20 AM >>>
>On 06/27/2017 07:04 AM, Jan Beulich wrote:
>>>>> Andrew Cooper <andrew.cooper3@citrix.com> 06/26/17 6:40 PM >>>
>>>> --- a/xen/common/Kconfig
>>>> +++ b/xen/common/Kconfig
>>>> @@ -226,7 +226,7 @@ config CRYPTO
>>>>      bool
>>>>
>>>>  config LIVEPATCH
>>>> -    bool "Live patching support (TECH PREVIEW)"
>>>> +    bool "Live patching support"
>>>>      default n
>>>
>>> This default should flip as well.
>>
>> For unstable maybe. But please not for 4.9 - people shouldn't be caught by
>> surprise after 9 RCs that a config option's default changes. Furthermore, if
>> we default it to y, will there be much of a difference to simply removing the
>> config option?
>
>I think we should allow user to disable livepatch at build time. I have
>mind any people looking at using Xen in constraint environment and
>require some certifications. They will likely want to disable some part
>of the Xen.

Then at least I'd like to see it gain an EXPERT dependency. For now our
overall goal still is to limit the number of possible different
configurations.

Jan
On Tue, Jun 27, 2017 at 7:04 AM, Jan Beulich <jbeulich@suse.com> wrote: >>>> Andrew Cooper <andrew.cooper3@citrix.com> 06/26/17 6:40 PM >>> >>> --- a/xen/common/Kconfig >>> +++ b/xen/common/Kconfig >>> @@ -226,7 +226,7 @@ config CRYPTO >>> bool >>> >>> config LIVEPATCH >>> - bool "Live patching support (TECH PREVIEW)" >>> + bool "Live patching support" >>> default n >> >>This default should flip as well. > > For unstable maybe. But please not for 4.9 - people shouldn't be caught by > surprise after 9 RCs that a config option's default changes. Leaving it the same as the RCs makes sense to me. -George
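Since the default stays `n`, a build only has live patching if someone turned it on. The following is a small Python sketch, not part of the thread, for checking that in a Kconfig-generated `.config`. The helper name `livepatch_enabled` is hypothetical; the only assumptions about the file are the standard Kconfig output conventions (`CONFIG_FOO=y` when enabled, `# CONFIG_FOO is not set` when disabled).

```python
def livepatch_enabled(config_text: str, symbol: str = "CONFIG_LIVEPATCH") -> bool:
    """Return True if `symbol` is set to 'y' in Kconfig-style .config text.

    Disabled options appear as '# CONFIG_FOO is not set' (a comment) or are
    absent entirely; both cases return False.
    """
    wanted = f"{symbol}=y"
    for raw_line in config_text.splitlines():
        # Exact match so that e.g. CONFIG_LIVEPATCH_FOO=y is not mistaken
        # for CONFIG_LIVEPATCH=y.
        if raw_line.strip() == wanted:
            return True
    return False


if __name__ == "__main__":
    sample = "CONFIG_CRYPTO=y\n# CONFIG_LIVEPATCH is not set\n"
    print(livepatch_enabled(sample))
```

In practice one would pass in the contents of the hypervisor build's `.config`; the path is left to the caller because it depends on the build tree.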
On 06/27/2017 10:17 AM, George Dunlap wrote: > On 26/06/17 18:30, Andrew Cooper wrote: >> On 26/06/17 18:00, George Dunlap wrote: >>> On 26/06/17 16:36, Ross Lagerwall wrote: ... >> >> We absolutely cannot be in the position of issuing XSAs for situations >> like this, because there are too many ways where it definitely will go >> wrong, and we'd end up issuing XSAs saying "remember to clean your >> working tree before building a livepatch". This is of course absurd. > > Your argument is that because we do not issue XSAs for *user mistakes*, > that therefore we should not issue XSAs for *bugs in the tool*. > > That is of course absurd. We do not issue XSAs for user mistakes in > building the hypervisor either (for instance, switching gcc versions > without cleaning the hypervisor tree), and yet we still issue XSAs for > bugs in the hypervisor itself. > >> IMO, The only viable option is to exclude livepatch-build-tools entirely >> from security scope. It is already the case that people producing >> livepatches need to check the resulting livepatch binary for sanity, and >> test it suitably in a development environment before use in production. > > Look, it sounds like right now you are going through all the livepatches > with a fine-tooth comb *because* the tools are (or recently have been) > unreliable. But at some point in the future, the patch generation > mechanism will become more reliable. After 20 XSAs over six months in > which the livepatch tool created the correct patch, you will become more > complacent. You won't look as closely; it's human nature. > > You seem to be simply refusing to use your imagination. Step back. > Imagine yourself in one year. You come to the office and find an e-mail > on security@ which says, "Livepatch tools open a security hole when > compiling with gcc x.yy". 
You realize that XenVerson ${LATEST-2} uses > gcc x.yy, so you take a closer look at that livepatch, only to discover > that the livepatches generated actually do contain the bug, but you > missed it because ${LATEST-[0,1]} were perfectly fine (since they used > newer versions of gcc), the difference was subtle, and it passed all the > functional tests. > > Now all of the customers that have applied those patches are vulnerable. > > Do you: > > 1. Tell the reporter to post it publicly to xen-devel immediately, since > livepatch tools are not security supported -- thus "zero-day"-ing all > your customers (as well as anyone else who happens to have used x.yy to > build a hypervisor)? > > 2. Secretly take advantage of Citrix' privileged position on the > security list, and try to get an update out to your customers before it > gets announced (but allowing everyone *else* using gcc x.yy to > experience a zero-day)? > > 3. Issue an XSA so that everyone has the opportunity to fix things up > before making a public announcement, and so that anyone not on the > embargo list gets an alert, so they know to either update their own > livepatches, or look for updates from their software provider? > > I think #3 is the only possible choice. > > -George > The issue here is that any bug in livepatch-build-tools which still results in output being generated would be a security issue, because someone might have used it to patch a security issue. livepatch-build-tools is certainly not stable enough yet (ever?) to be treated in this fashion.
. snip.. > > Now all of the customers that have applied those patches are vulnerable. > > > > Do you: > > > > 1. Tell the reporter to post it publicly to xen-devel immediately, since > > livepatch tools are not security supported -- thus "zero-day"-ing all > > your customers (as well as anyone else who happens to have used x.yy to > > build a hypervisor)? > > > > 2. Secretly take advantage of Citrix' privileged position on the > > security list, and try to get an update out to your customers before it > > gets announced (but allowing everyone *else* using gcc x.yy to > > experience a zero-day)? > > > > 3. Issue an XSA so that everyone has the opportunity to fix things up > > before making a public announcement, and so that anyone not on the > > embargo list gets an alert, so they know to either update their own > > livepatches, or look for updates from their software provider? > > > > I think #3 is the only possible choice. > > > > -George > > > > The issue here is that any bug in livepatch-build-tools which still results > in output being generated would be a security issue, because someone might > have used it to patch a security issue. livepatch-build-tools is certainly > not stable enough yet (ever?) to be treated in this fashion. > To add a bit. The livepatch-build-tools does not have to be used to create the livepatches. One can use other tools to create the livepatches (like for example the test-cases). And there is a nice design http://xenbits.xen.org/docs/unstable/misc/livepatch.html (see "Design of payload format") which describes what the format of this livepatch MUST be. > -- > Ross Lagerwall
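As a very rough illustration of "check the payload before loading it", here is a Python sketch of a pre-upload sanity check. It assumes the payload is a relocatable ELF object, which is my reading of the design document linked above rather than something stated in this thread, and the function name `looks_like_relocatable_elf` is hypothetical. It inspects only the ELF identification bytes and the `e_type` field; it says nothing about whether the patch content is correct or safe.

```python
import struct

ELF_MAGIC = b"\x7fELF"
EI_DATA = 5          # index of the byte-order marker within e_ident
ELFDATA2LSB = 1      # little-endian
ELFDATA2MSB = 2      # big-endian
E_TYPE_OFFSET = 16   # e_type follows the 16-byte e_ident array
ET_REL = 1           # relocatable object file


def looks_like_relocatable_elf(data: bytes) -> bool:
    """Cheap header check: ELF magic present and e_type == ET_REL."""
    if len(data) < E_TYPE_OFFSET + 2 or data[:4] != ELF_MAGIC:
        return False
    if data[EI_DATA] == ELFDATA2LSB:
        fmt = "<H"
    elif data[EI_DATA] == ELFDATA2MSB:
        fmt = ">H"
    else:
        return False  # unknown byte-order marker
    (e_type,) = struct.unpack_from(fmt, data, E_TYPE_OFFSET)
    return e_type == ET_REL
```

A check like this only catches gross mistakes (wrong file, truncated file). It is no substitute for inspecting the payload and testing it in a development environment, as discussed earlier in the thread.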
On 06/28/2017 05:18 PM, Ross Lagerwall wrote: > On 06/27/2017 10:17 AM, George Dunlap wrote: >> On 26/06/17 18:30, Andrew Cooper wrote: >>> On 26/06/17 18:00, George Dunlap wrote: >>>> On 26/06/17 16:36, Ross Lagerwall wrote: > ... >>> >>> We absolutely cannot be in the position of issuing XSAs for situations >>> like this, because there are too many ways where it definitely will go >>> wrong, and we'd end up issuing XSAs saying "remember to clean your >>> working tree before building a livepatch". This is of course absurd. >> >> Your argument is that because we do not issue XSAs for *user mistakes*, >> that therefore we should not issue XSAs for *bugs in the tool*. >> >> That is of course absurd. We do not issue XSAs for user mistakes in >> building the hypervisor either (for instance, switching gcc versions >> without cleaning the hypervisor tree), and yet we still issue XSAs for >> bugs in the hypervisor itself. >> >>> IMO, The only viable option is to exclude livepatch-build-tools entirely >>> from security scope. It is already the case that people producing >>> livepatches need to check the resulting livepatch binary for sanity, and >>> test it suitably in a development environment before use in production. >> >> Look, it sounds like right now you are going through all the livepatches >> with a fine-tooth comb *because* the tools are (or recently have been) >> unreliable. But at some point in the future, the patch generation >> mechanism will become more reliable. After 20 XSAs over six months in >> which the livepatch tool created the correct patch, you will become more >> complacent. You won't look as closely; it's human nature. >> >> You seem to be simply refusing to use your imagination. Step back. >> Imagine yourself in one year. You come to the office and find an e-mail >> on security@ which says, "Livepatch tools open a security hole when >> compiling with gcc x.yy". 
You realize that XenVerson ${LATEST-2} uses >> gcc x.yy, so you take a closer look at that livepatch, only to discover >> that the livepatches generated actually do contain the bug, but you >> missed it because ${LATEST-[0,1]} were perfectly fine (since they used >> newer versions of gcc), the difference was subtle, and it passed all the >> functional tests. >> >> Now all of the customers that have applied those patches are vulnerable. >> >> Do you: >> >> 1. Tell the reporter to post it publicly to xen-devel immediately, since >> livepatch tools are not security supported -- thus "zero-day"-ing all >> your customers (as well as anyone else who happens to have used x.yy to >> build a hypervisor)? >> >> 2. Secretly take advantage of Citrix' privileged position on the >> security list, and try to get an update out to your customers before it >> gets announced (but allowing everyone *else* using gcc x.yy to >> experience a zero-day)? >> >> 3. Issue an XSA so that everyone has the opportunity to fix things up >> before making a public announcement, and so that anyone not on the >> embargo list gets an alert, so they know to either update their own >> livepatches, or look for updates from their software provider? >> >> I think #3 is the only possible choice. >> >> -George >> > > The issue here is that any bug in livepatch-build-tools which still > results in output being generated would be a security issue, because > someone might have used it to patch a security issue. > livepatch-build-tools is certainly not stable enough yet (ever?) to be > treated in this fashion. You didn't answer my question. If the situation described happens, what position do you want Andrew to be put in? (If I missed a potential action, let me know.) -George
On 06/30/2017 02:42 PM, George Dunlap wrote: > On 06/28/2017 05:18 PM, Ross Lagerwall wrote: >> On 06/27/2017 10:17 AM, George Dunlap wrote: >>> On 26/06/17 18:30, Andrew Cooper wrote: >>>> On 26/06/17 18:00, George Dunlap wrote: >>>>> On 26/06/17 16:36, Ross Lagerwall wrote: >> ... >>>> >>>> We absolutely cannot be in the position of issuing XSAs for situations >>>> like this, because there are too many ways where it definitely will go >>>> wrong, and we'd end up issuing XSAs saying "remember to clean your >>>> working tree before building a livepatch". This is of course absurd. >>> >>> Your argument is that because we do not issue XSAs for *user mistakes*, >>> that therefore we should not issue XSAs for *bugs in the tool*. >>> >>> That is of course absurd. We do not issue XSAs for user mistakes in >>> building the hypervisor either (for instance, switching gcc versions >>> without cleaning the hypervisor tree), and yet we still issue XSAs for >>> bugs in the hypervisor itself. >>> >>>> IMO, The only viable option is to exclude livepatch-build-tools entirely >>>> from security scope. It is already the case that people producing >>>> livepatches need to check the resulting livepatch binary for sanity, and >>>> test it suitably in a development environment before use in production. >>> >>> Look, it sounds like right now you are going through all the livepatches >>> with a fine-tooth comb *because* the tools are (or recently have been) >>> unreliable. But at some point in the future, the patch generation >>> mechanism will become more reliable. After 20 XSAs over six months in >>> which the livepatch tool created the correct patch, you will become more >>> complacent. You won't look as closely; it's human nature. >>> >>> You seem to be simply refusing to use your imagination. Step back. >>> Imagine yourself in one year. You come to the office and find an e-mail >>> on security@ which says, "Livepatch tools open a security hole when >>> compiling with gcc x.yy". 
You realize that XenVerson ${LATEST-2} uses >>> gcc x.yy, so you take a closer look at that livepatch, only to discover >>> that the livepatches generated actually do contain the bug, but you >>> missed it because ${LATEST-[0,1]} were perfectly fine (since they used >>> newer versions of gcc), the difference was subtle, and it passed all the >>> functional tests. >>> >>> Now all of the customers that have applied those patches are vulnerable. >>> >>> Do you: >>> >>> 1. Tell the reporter to post it publicly to xen-devel immediately, since >>> livepatch tools are not security supported -- thus "zero-day"-ing all >>> your customers (as well as anyone else who happens to have used x.yy to >>> build a hypervisor)? >>> >>> 2. Secretly take advantage of Citrix' privileged position on the >>> security list, and try to get an update out to your customers before it >>> gets announced (but allowing everyone *else* using gcc x.yy to >>> experience a zero-day)? >>> >>> 3. Issue an XSA so that everyone has the opportunity to fix things up >>> before making a public announcement, and so that anyone not on the >>> embargo list gets an alert, so they know to either update their own >>> livepatches, or look for updates from their software provider? >>> >>> I think #3 is the only possible choice. >>> >>> -George >>> >> >> The issue here is that any bug in livepatch-build-tools which still >> results in output being generated would be a security issue, because >> someone might have used it to patch a security issue. >> livepatch-build-tools is certainly not stable enough yet (ever?) to be >> treated in this fashion. > > You didn't answer my question. If the situation described happens, what > position do you want Andrew to be put in? (If I missed a potential > action, let me know.) > I would choose #3 as it is the obvious choice. But I still don't think it is a sensible idea to have security support for the build tools, at least at this point. 
The same scenario could be posed for a nasty bug that affects Xen 4.4 only, but it is now just out of security support. IMO something being not supported doesn't preclude it from having an XSA released if there is a particularly nasty vulnerability found.
On Mon, Jul 03, 2017 at 03:53:42PM +0100, Ross Lagerwall wrote: > On 06/30/2017 02:42 PM, George Dunlap wrote: > > On 06/28/2017 05:18 PM, Ross Lagerwall wrote: > > > On 06/27/2017 10:17 AM, George Dunlap wrote: > > > > On 26/06/17 18:30, Andrew Cooper wrote: > > > > > On 26/06/17 18:00, George Dunlap wrote: > > > > > > On 26/06/17 16:36, Ross Lagerwall wrote: > > > ... > > > > > > > > > > We absolutely cannot be in the position of issuing XSAs for situations > > > > > like this, because there are too many ways where it definitely will go > > > > > wrong, and we'd end up issuing XSAs saying "remember to clean your > > > > > working tree before building a livepatch". This is of course absurd. > > > > > > > > Your argument is that because we do not issue XSAs for *user mistakes*, > > > > that therefore we should not issue XSAs for *bugs in the tool*. > > > > > > > > That is of course absurd. We do not issue XSAs for user mistakes in > > > > building the hypervisor either (for instance, switching gcc versions > > > > without cleaning the hypervisor tree), and yet we still issue XSAs for > > > > bugs in the hypervisor itself. > > > > > > > > > IMO, The only viable option is to exclude livepatch-build-tools entirely > > > > > from security scope. It is already the case that people producing > > > > > livepatches need to check the resulting livepatch binary for sanity, and > > > > > test it suitably in a development environment before use in production. > > > > > > > > Look, it sounds like right now you are going through all the livepatches > > > > with a fine-tooth comb *because* the tools are (or recently have been) > > > > unreliable. But at some point in the future, the patch generation > > > > mechanism will become more reliable. After 20 XSAs over six months in > > > > which the livepatch tool created the correct patch, you will become more > > > > complacent. You won't look as closely; it's human nature. 
> > > > > > > > You seem to be simply refusing to use your imagination. Step back. > > > > Imagine yourself in one year. You come to the office and find an e-mail > > > > on security@ which says, "Livepatch tools open a security hole when > > > > compiling with gcc x.yy". You realize that XenVerson ${LATEST-2} uses > > > > gcc x.yy, so you take a closer look at that livepatch, only to discover > > > > that the livepatches generated actually do contain the bug, but you > > > > missed it because ${LATEST-[0,1]} were perfectly fine (since they used > > > > newer versions of gcc), the difference was subtle, and it passed all the > > > > functional tests. > > > > > > > > Now all of the customers that have applied those patches are vulnerable. > > > > > > > > Do you: > > > > > > > > 1. Tell the reporter to post it publicly to xen-devel immediately, since > > > > livepatch tools are not security supported -- thus "zero-day"-ing all > > > > your customers (as well as anyone else who happens to have used x.yy to > > > > build a hypervisor)? > > > > > > > > 2. Secretly take advantage of Citrix' privileged position on the > > > > security list, and try to get an update out to your customers before it > > > > gets announced (but allowing everyone *else* using gcc x.yy to > > > > experience a zero-day)? > > > > > > > > 3. Issue an XSA so that everyone has the opportunity to fix things up > > > > before making a public announcement, and so that anyone not on the > > > > embargo list gets an alert, so they know to either update their own > > > > livepatches, or look for updates from their software provider? > > > > > > > > I think #3 is the only possible choice. > > > > > > > > -George > > > > > > > > > > The issue here is that any bug in livepatch-build-tools which still > > > results in output being generated would be a security issue, because > > > someone might have used it to patch a security issue. > > > livepatch-build-tools is certainly not stable enough yet (ever?) 
to be > > > treated in this fashion.
> >
> > You didn't answer my question. If the situation described happens, what
> > position do you want Andrew to be put in? (If I missed a potential
> > action, let me know.)
> >
>
> I would choose #3 as it is the obvious choice. But I still don't think it is
> a sensible idea to have security support for the build tools, at least at
> this point. The same scenario could be posed for a nasty bug that affects
> Xen 4.4 only, but it is now just out of security support. IMO something
> being not supported doesn't preclude it from having an XSA released if there
> is a particularly nasty vulnerability found.

I think this is a grey area that we should try to avoid as much as
possible. What constitutes an XSA should be clearly defined, so that
there's no room for speculation or subjective decisions.

Following from the example above (and I'm really not doubting Andrew's
objective criteria here), what might not seem a relevant issue to
Andrew (or the security team) in general might be severe to other
parties IMHO, and since not everyone has a stake in what constitutes
an XSA, the process would become unfair to them.

Roger.
On 07/03/2017 03:53 PM, Ross Lagerwall wrote: > On 06/30/2017 02:42 PM, George Dunlap wrote: >> On 06/28/2017 05:18 PM, Ross Lagerwall wrote: >>> On 06/27/2017 10:17 AM, George Dunlap wrote: >>>> On 26/06/17 18:30, Andrew Cooper wrote: >>>>> On 26/06/17 18:00, George Dunlap wrote: >>>>>> On 26/06/17 16:36, Ross Lagerwall wrote: >>> ... >>>> You seem to be simply refusing to use your imagination. Step back. >>>> Imagine yourself in one year. You come to the office and find an >>>> e-mail >>>> on security@ which says, "Livepatch tools open a security hole when >>>> compiling with gcc x.yy". You realize that XenVerson ${LATEST-2} uses >>>> gcc x.yy, so you take a closer look at that livepatch, only to discover >>>> that the livepatches generated actually do contain the bug, but you >>>> missed it because ${LATEST-[0,1]} were perfectly fine (since they used >>>> newer versions of gcc), the difference was subtle, and it passed all >>>> the >>>> functional tests. >>>> >>>> Now all of the customers that have applied those patches are >>>> vulnerable. >>>> >>>> Do you: >>>> >>>> 1. Tell the reporter to post it publicly to xen-devel immediately, >>>> since >>>> livepatch tools are not security supported -- thus "zero-day"-ing all >>>> your customers (as well as anyone else who happens to have used x.yy to >>>> build a hypervisor)? >>>> >>>> 2. Secretly take advantage of Citrix' privileged position on the >>>> security list, and try to get an update out to your customers before it >>>> gets announced (but allowing everyone *else* using gcc x.yy to >>>> experience a zero-day)? >>>> >>>> 3. Issue an XSA so that everyone has the opportunity to fix things up >>>> before making a public announcement, and so that anyone not on the >>>> embargo list gets an alert, so they know to either update their own >>>> livepatches, or look for updates from their software provider? >>>> >>>> I think #3 is the only possible choice. 
>>>> >>>> -George >>>> >>> >>> The issue here is that any bug in livepatch-build-tools which still >>> results in output being generated would be a security issue, because >>> someone might have used it to patch a security issue. >>> livepatch-build-tools is certainly not stable enough yet (ever?) to be >>> treated in this fashion. >> >> You didn't answer my question. If the situation described happens, what >> position do you want Andrew to be put in? (If I missed a potential >> action, let me know.) >> > > I would choose #3 as it is the obvious choice. But I still don't think > it is a sensible idea to have security support for the build tools, at > least at this point. The same scenario could be posed for a nasty bug > that affects Xen 4.4 only, but it is now just out of security support. > IMO something being not supported doesn't preclude it from having an XSA > released if there is a particularly nasty vulnerability found. Well basically I think we agree, but we're using different terms. You want to say, "This isn't security supported, but if an important bug is actually found then we'll issue an XSA". I want to say, "This is security supported, because if an important bug is actually found we'll issue an XSA." So it seems to me there are likely two things that make you resistant to calling it "security supported": 1. The fear that we'll be issuing XSAs over trivial things that don't matter 2. The fear that people will not do due diligence when creating patches with the tools. I think #1 is just a misconception. For *every* bug reported to us about any part of the code, we go through the process of trying to determine its impact and whether we need to issue an XSA or not. All of the examples put forward of things we don't want to issue an XSA for are things that I'm sure we would not issue an XSA for. For #2, that is a reasonable fear, but we can deal with that in a different way than calling the tools "unsupported".
We can, for instance, mention that in the documents. We can add a warning message that the build tools output saying that the result should be manually inspected for correctness. -George
On 08/03/2017 06:20 PM, George Dunlap wrote: > On 07/03/2017 03:53 PM, Ross Lagerwall wrote: >> On 06/30/2017 02:42 PM, George Dunlap wrote: >>> On 06/28/2017 05:18 PM, Ross Lagerwall wrote: >>>> On 06/27/2017 10:17 AM, George Dunlap wrote: >>>>> On 26/06/17 18:30, Andrew Cooper wrote: >>>>>> On 26/06/17 18:00, George Dunlap wrote: >>>>>>> On 26/06/17 16:36, Ross Lagerwall wrote: >>>> ... >>>>> You seem to be simply refusing to use your imagination. Step back. >>>>> Imagine yourself in one year. You come to the office and find an >>>>> e-mail >>>>> on security@ which says, "Livepatch tools open a security hole when >>>>> compiling with gcc x.yy". You realize that XenVerson ${LATEST-2} uses >>>>> gcc x.yy, so you take a closer look at that livepatch, only to discover >>>>> that the livepatches generated actually do contain the bug, but you >>>>> missed it because ${LATEST-[0,1]} were perfectly fine (since they used >>>>> newer versions of gcc), the difference was subtle, and it passed all >>>>> the >>>>> functional tests. >>>>> >>>>> Now all of the customers that have applied those patches are >>>>> vulnerable. >>>>> >>>>> Do you: >>>>> >>>>> 1. Tell the reporter to post it publicly to xen-devel immediately, >>>>> since >>>>> livepatch tools are not security supported -- thus "zero-day"-ing all >>>>> your customers (as well as anyone else who happens to have used x.yy to >>>>> build a hypervisor)? >>>>> >>>>> 2. Secretly take advantage of Citrix' privileged position on the >>>>> security list, and try to get an update out to your customers before it >>>>> gets announced (but allowing everyone *else* using gcc x.yy to >>>>> experience a zero-day)? >>>>> >>>>> 3. Issue an XSA so that everyone has the opportunity to fix things up >>>>> before making a public announcement, and so that anyone not on the >>>>> embargo list gets an alert, so they know to either update their own >>>>> livepatches, or look for updates from their software provider? 
>>>>> >>>>> I think #3 is the only possible choice. >>>>> >>>>> -George >>>>> >>>> >>>> The issue here is that any bug in livepatch-build-tools which still >>>> results in output being generated would be a security issue, because >>>> someone might have used it to patch a security issue. >>>> livepatch-build-tools is certainly not stable enough yet (ever?) to be >>>> treated in this fashion. >>> >>> You didn't answer my question. If the situation described happens, what >>> position do you want Andrew to be put in? (If I missed a potential >>> action, let me know.) >>> >> >> I would choose #3 as it is the obvious choice. But I still don't think >> it is a sensible idea to have security support for the build tools, at >> least at this point. The same scenario could be posed for a nasty bug >> that affects Xen 4.4 only, but it is now just out of security support. >> IMO something being not supported doesn't preclude it from having an XSA >> released if there is a particularly nasty vulnerability found. > > Well basically I think we agree, but we're using different terms. You > want to say, "This isn't security supported, but if important bug is > actually found then we'll issue an XSA". I want to say, "This is > security supported, because if an important bug is actually found we'll > issue an XSA." > > So it seems to me there are likely two things that make you resistant to > calling it "security supported": > > 1. The fear that we'll be issuing XSAs over trivial things that don't matter > > 2. The fear that people will not do due diligence when creating patches > with the tools. > > I think #1 is just a misconception. *Every* bug reported to us about > any part of the code we go through the process of trying to determine > its impact and whether we need to issue an XSA or not. All of the > examples put forward of things we don't want to issue an XSA for are > things that I'm sure we would not issue an XSA for. 
> > For #2, that is a reasonable fear, but we can deal with that in a > different way than calling the tools "unsupported". We can, for > instance, mention that in the documents. We can add a warning message > that the build tools output saying that the result should be manually > inspected for correctness. We need to get a resolution on this. Anyone else (particularly committers) want to give their opinion? -George
On Thu, Aug 03, 2017 at 06:21:30PM +0100, George Dunlap wrote: > On 08/03/2017 06:20 PM, George Dunlap wrote: > > On 07/03/2017 03:53 PM, Ross Lagerwall wrote: > >> On 06/30/2017 02:42 PM, George Dunlap wrote: > >>> On 06/28/2017 05:18 PM, Ross Lagerwall wrote: > >>>> On 06/27/2017 10:17 AM, George Dunlap wrote: > >>>>> On 26/06/17 18:30, Andrew Cooper wrote: > >>>>>> On 26/06/17 18:00, George Dunlap wrote: > >>>>>>> On 26/06/17 16:36, Ross Lagerwall wrote: > >>>> ... > >>>>> You seem to be simply refusing to use your imagination. Step back. > >>>>> Imagine yourself in one year. You come to the office and find an > >>>>> e-mail > >>>>> on security@ which says, "Livepatch tools open a security hole when > >>>>> compiling with gcc x.yy". You realize that XenVerson ${LATEST-2} uses > >>>>> gcc x.yy, so you take a closer look at that livepatch, only to discover > >>>>> that the livepatches generated actually do contain the bug, but you > >>>>> missed it because ${LATEST-[0,1]} were perfectly fine (since they used > >>>>> newer versions of gcc), the difference was subtle, and it passed all > >>>>> the > >>>>> functional tests. > >>>>> > >>>>> Now all of the customers that have applied those patches are > >>>>> vulnerable. > >>>>> > >>>>> Do you: > >>>>> > >>>>> 1. Tell the reporter to post it publicly to xen-devel immediately, > >>>>> since > >>>>> livepatch tools are not security supported -- thus "zero-day"-ing all > >>>>> your customers (as well as anyone else who happens to have used x.yy to > >>>>> build a hypervisor)? > >>>>> > >>>>> 2. Secretly take advantage of Citrix' privileged position on the > >>>>> security list, and try to get an update out to your customers before it > >>>>> gets announced (but allowing everyone *else* using gcc x.yy to > >>>>> experience a zero-day)? > >>>>> > >>>>> 3. 
Issue an XSA so that everyone has the opportunity to fix things up > >>>>> before making a public announcement, and so that anyone not on the > >>>>> embargo list gets an alert, so they know to either update their own > >>>>> livepatches, or look for updates from their software provider? > >>>>> > >>>>> I think #3 is the only possible choice. > >>>>> > >>>>> -George > >>>>> > >>>> > >>>> The issue here is that any bug in livepatch-build-tools which still > >>>> results in output being generated would be a security issue, because > >>>> someone might have used it to patch a security issue. > >>>> livepatch-build-tools is certainly not stable enough yet (ever?) to be > >>>> treated in this fashion. > >>> > >>> You didn't answer my question. If the situation described happens, what > >>> position do you want Andrew to be put in? (If I missed a potential > >>> action, let me know.) > >>> > >> > >> I would choose #3 as it is the obvious choice. But I still don't think > >> it is a sensible idea to have security support for the build tools, at > >> least at this point. The same scenario could be posed for a nasty bug > >> that affects Xen 4.4 only, but it is now just out of security support. > >> IMO something being not supported doesn't preclude it from having an XSA > >> released if there is a particularly nasty vulnerability found. > > > > Well basically I think we agree, but we're using different terms. You > > want to say, "This isn't security supported, but if important bug is > > actually found then we'll issue an XSA". I want to say, "This is > > security supported, because if an important bug is actually found we'll > > issue an XSA." > > > > So it seems to me there are likely two things that make you resistant to > > calling it "security supported": > > > > 1. The fear that we'll be issuing XSAs over trivial things that don't matter > > > > 2. The fear that people will not do due diligence when creating patches > > with the tools. 
> > > > I think #1 is just a misconception. *Every* bug reported to us about > > any part of the code we go through the process of trying to determine > > its impact and whether we need to issue an XSA or not. All of the > > examples put forward of things we don't want to issue an XSA for are > > things that I'm sure we would not issue an XSA for. > > > > For #2, that is a reasonable fear, but we can deal with that in a > > different way than calling the tools "unsupported". We can, for > > instance, mention that in the documents. We can add a warning message > > that the build tools output saying that the result should be manually > > inspected for correctness. > > We need to get a resolution on this. Anyone else (particularly > committers) want to give their opinion? Changing title as this is now all about livepatch-build-tools. The livepatch-build-tools get a lot of usage around XSA times. And that is when the corner cases are being found. These three:
0c10457 Remove section alignment requirement
b30d34c Ignore .discard sections
6327ab9 create-diff-object: Update fixup offsets in .rela.ex_table
were thanks to generating XSAs. Now the folks who use these tools are also the ones that do pre-disclosures. And the folks who work on these tools are also the ones who have to get the livepatches out. It is a stressful time and in the past the issues were of the form: 'oh, livepatch-build-tools won't generate the livepatch' which I don't even know how to classify - is it an XSA that it could not create a livepatch? And if the livepatch-build-tools does generate something mighty wrong then the folks on the XSA pre-disclosure list should be told (and that has been happening). But I am not really a fan of 'Oh, and one more XSA'. The second argument is that livepatch-build-tools is like the GCC compiler. In fact it takes the binary blob of what the compiler has produced and checks it against the other one.
If the compiler adds extra instructions or changes the instructions slightly we will classify that as code needing to be patched (and yes that has come up). This is very similar to what XSA-155 was - the GCC compiler optimizations added a nice jump table that was accessed twice. And the offset was retrieved from the shared ring. But we didn't do an XSA-155 for the GCC compiler. That is, we didn't file a ticket with GCC saying 'Hey, your compiler can create a race on shared memory. Could you make your compiler be smarter in these cases?' We instead wrote code with this optimization in mind, with more barriers. I think livepatch-build-tools is in the same category as GCC or linkers. > > -George
On 08/06/2017 01:07 AM, Konrad Rzeszutek Wilk wrote: > On Thu, Aug 03, 2017 at 06:21:30PM +0100, George Dunlap wrote: >> On 08/03/2017 06:20 PM, George Dunlap wrote: >>> On 07/03/2017 03:53 PM, Ross Lagerwall wrote: >>>> On 06/30/2017 02:42 PM, George Dunlap wrote: >>>>> On 06/28/2017 05:18 PM, Ross Lagerwall wrote: >>>>>> On 06/27/2017 10:17 AM, George Dunlap wrote: >>>>>>> On 26/06/17 18:30, Andrew Cooper wrote: >>>>>>>> On 26/06/17 18:00, George Dunlap wrote: >>>>>>>>> On 26/06/17 16:36, Ross Lagerwall wrote: >>>>>> ... >>>>>>> You seem to be simply refusing to use your imagination. Step back. >>>>>>> Imagine yourself in one year. You come to the office and find an >>>>>>> e-mail >>>>>>> on security@ which says, "Livepatch tools open a security hole when >>>>>>> compiling with gcc x.yy". You realize that XenVerson ${LATEST-2} uses >>>>>>> gcc x.yy, so you take a closer look at that livepatch, only to discover >>>>>>> that the livepatches generated actually do contain the bug, but you >>>>>>> missed it because ${LATEST-[0,1]} were perfectly fine (since they used >>>>>>> newer versions of gcc), the difference was subtle, and it passed all >>>>>>> the >>>>>>> functional tests. >>>>>>> >>>>>>> Now all of the customers that have applied those patches are >>>>>>> vulnerable. >>>>>>> >>>>>>> Do you: >>>>>>> >>>>>>> 1. Tell the reporter to post it publicly to xen-devel immediately, >>>>>>> since >>>>>>> livepatch tools are not security supported -- thus "zero-day"-ing all >>>>>>> your customers (as well as anyone else who happens to have used x.yy to >>>>>>> build a hypervisor)? >>>>>>> >>>>>>> 2. Secretly take advantage of Citrix' privileged position on the >>>>>>> security list, and try to get an update out to your customers before it >>>>>>> gets announced (but allowing everyone *else* using gcc x.yy to >>>>>>> experience a zero-day)? >>>>>>> >>>>>>> 3. 
Issue an XSA so that everyone has the opportunity to fix things up >>>>>>> before making a public announcement, and so that anyone not on the >>>>>>> embargo list gets an alert, so they know to either update their own >>>>>>> livepatches, or look for updates from their software provider? >>>>>>> >>>>>>> I think #3 is the only possible choice. >>>>>>> >>>>>>> -George >>>>>>> >>>>>> >>>>>> The issue here is that any bug in livepatch-build-tools which still >>>>>> results in output being generated would be a security issue, because >>>>>> someone might have used it to patch a security issue. >>>>>> livepatch-build-tools is certainly not stable enough yet (ever?) to be >>>>>> treated in this fashion. >>>>> >>>>> You didn't answer my question. If the situation described happens, what >>>>> position do you want Andrew to be put in? (If I missed a potential >>>>> action, let me know.) >>>>> >>>> >>>> I would choose #3 as it is the obvious choice. But I still don't think >>>> it is a sensible idea to have security support for the build tools, at >>>> least at this point. The same scenario could be posed for a nasty bug >>>> that affects Xen 4.4 only, but it is now just out of security support. >>>> IMO something being not supported doesn't preclude it from having an XSA >>>> released if there is a particularly nasty vulnerability found. >>> >>> Well basically I think we agree, but we're using different terms. You >>> want to say, "This isn't security supported, but if important bug is >>> actually found then we'll issue an XSA". I want to say, "This is >>> security supported, because if an important bug is actually found we'll >>> issue an XSA." >>> >>> So it seems to me there are likely two things that make you resistant to >>> calling it "security supported": >>> >>> 1. The fear that we'll be issuing XSAs over trivial things that don't matter >>> >>> 2. The fear that people will not do due diligence when creating patches >>> with the tools. 
>>> >>> I think #1 is just a misconception. *Every* bug reported to us about >>> any part of the code we go through the process of trying to determine >>> its impact and whether we need to issue an XSA or not. All of the >>> examples put forward of things we don't want to issue an XSA for are >>> things that I'm sure we would not issue an XSA for. >>> >>> For #2, that is a reasonable fear, but we can deal with that in a >>> different way than calling the tools "unsupported". We can, for >>> instance, mention that in the documents. We can add a warning message >>> that the build tools output saying that the result should be manually >>> inspected for correctness. >> >> We need to get a resolution on this. Anyone else (particarly >> committers) want to give their opinion? > > Changing title as this is all about now livepatch-build-tools. > > The livepatch-build-tools get a lot of usage around XSA times. > And that is when the corner cases are being found. The three of them: > 0c10457 Remove section alignment requirement > b30d34c Ignore .discard sections > 6327ab9 create-diff-object: Update fixup offsets in .rela.ex_table > > where thanks to generating XSAs. Now the folks who use these tools > are also the ones that do pre-disclosures. And the folks who > work on these tools also are the ones who have to get the livepatches out. > > It is a stressful time and in the past the issues were off: > 'oh, livepatch-build-tools won't generate the livepatch' > > which I don't even know how to classify - is it an XSA that it could not > create an livepatch? > > And if the livepatch-build-tools does generate something mighty wrong > then the folks on the XSA pre-disclosure list should be let know > (and that has been happening). > > But I am not really a fan of 'Oh, and one more XSA' Thanks for weighing in, Konrad. So it seems that people are still not quite clear about what I'm proposing. 
In general, a bug is a security issue if a human told the computer to do safe thing X, and the computer appeared to do safe thing X, but in fact did unsafe thing Y. Consider the question: "Is it an XSA that domain creation will fail for a kernel with feature A enabled?" No, that's not a security issue: The human told the computer to do safe thing X ("boot with kernel A"), and it did safe thing Z ("not boot"). Consider this question: "Is it an XSA that if you detach a block device from one domain, and then attach it to another domain, that the second user can read potentially sensitive data from the first domain?" No, that's not a security issue: The human told the computer to do unsafe thing Y ("expose block device from domain B to domain C"), and the computer did unsafe thing Y. So. Suppose a developer / package maintainer / sysadmin tries to build a livepatch, and the livepatch tool throws an error. Is this a security issue? No -- the human told it to do safe thing X ("build me a livepatch"), and it did safe thing Z ("print an error"). Suppose someone builds a livepatch that, when loaded, immediately crashes the hypervisor in all cases. Is this a security issue? No -- the human told it to do safe thing X ("patch this vulnerability") and it did safe thing Z ("crash the hypervisor immediately"). (This is safe because we assume you do a reasonable amount of testing before deploying a livepatch.) According to my understanding, the above three fixes that happened as part of developing XSAs were like one of the above two examples: The tool either just didn't make a livepatch at all, or it made one that clearly didn't work. As such, they would not be considered XSAs. Suppose someone builds the livepatch with two different versions of the compiler; or against stale versions of the binaries; and the livepatch tool doesn't notice, and generates a livepatch which opens up a security hole. Is this a security issue?
No -- the human told it to do unsafe thing Y, and it did unsafe thing Y. Suppose someone builds a livepatch with buggy fix-up code that inadvertently gives PV guests access to hypervisor memory. Is this a security issue? No -- the human told it to do unsafe thing Y ("give PV guest access to hypervisor memory") and it did unsafe thing Y. Suppose someone builds a livepatch with the correct compiler, with a correct patch (that would fix the bug if rebooted into a new hypervisor), with correct fix-up code. Suppose that the livepatch passes all reasonable testing; but that, *due to a bug in the tools*, the patch also gives PV guests access to hypervisor memory. Is this a security issue? Yes -- the human told it to do safe thing X ("build a livepatch based on correct inputs to fix this bug") and it did unsafe thing Y ("build a livepatch that opens up a new security hole"). We could even place more restrictions on the scope if we wanted to. We could say that we only support the livepatch tools generating patches for XSAs. > This is very similar to what XSA-155 was - the GCC compiler optimizations > added a nice jump table that was accessed twice. And the offset was > retrieved from the shared ring. > > But we didn't do an XSA-155 for the GCC compiler. That is we didn't > file a ticket with GCC saying 'Hey, your compiler can create a race > on shared memory. Could you make your compiler be smarter in these cases' > We instead wrote code with this optimization in mind with more > barriers. Right -- so the gcc compiler guys are using a specification that allows that behavior. So from their perspective, we told the compiler to do unsafe thing Y (or at least, said that we were OK with it doing unsafe thing Y), and it did unsafe thing Y -- a security issue for Xen, but not for gcc. If gcc had *violated* the spec when causing the security issue, then we certainly would have called that a security issue in gcc.
> I think livepatch-build-tools is in the same category as GCC or linkers. Indeed; gcc is "security-supported" in the way I am proposing livepatching be security supported. I don't think this is a hill worth dying on; if everyone agrees we should call them "security un-supported", we can go with that. I'm only still talking because people seem to be confused about what I'm proposing. -George
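The rule of thumb used in the message above (a bug is a security issue only when a safe request appears to succeed but actually leaves the system unsafe) can be sketched as a small decision function. This is only an illustrative sketch of the reasoning, not project tooling: the names `Report` and `is_security_issue`, and the way each scenario is encoded as three booleans, are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    """One bug report, reduced to the three questions asked in the thread."""
    request_was_safe: bool   # did the human ask for safe thing X?
    appeared_to_work: bool   # did the computer appear to do X?
    result_is_unsafe: bool   # did it in fact do unsafe thing Y?


def is_security_issue(report: Report) -> bool:
    """Safe request, apparent success, unsafe result: only then an issue."""
    return (report.request_was_safe
            and report.appeared_to_work
            and report.result_is_unsafe)


# Hypothetical encodings of the scenarios discussed above.
scenarios = {
    "tool prints an error": Report(True, False, False),
    "patch crashes host in testing": Report(True, False, False),
    "mixed compilers / stale binaries": Report(False, True, True),
    "buggy fix-up code from the user": Report(False, True, True),
    "correct inputs, tool emits unsafe patch": Report(True, True, True),
}

for name, report in scenarios.items():
    verdict = "security issue" if is_security_issue(report) else "not a security issue"
    print(f"{name}: {verdict}")
```

Only the last scenario, where every input was correct and the tool itself introduced the hole, comes out as a security issue, which matches the conclusion drawn in the message.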
>>> George Dunlap <george.dunlap@citrix.com> 08/07/17 12:27 PM >>> >So it seems that people are still not quite clear about what I'm proposing. And indeed your examples helped me understand better what you mean (or at least I hope they did). >Suppose someone builds a livepatch with the correct compiler, with a >correct patch (that would fix the bug if rebooted into a new >hypervisor), with correct fix-up code. Suppose that the bug passes all >reasonable testing; but that, *due to a bug in the tools*, the patch >also gives PV guests access to hypervisor memory. Is this a security >issue? Yes -- the human told it to do safe thing X ("build a livepatch >based on correct inputs to fix this bug") and it did unsafe thing Y >("build a livepatch that opens up a new security hole"). There's one more factor here: The livepatch tools may behave properly with one version of the compiler, and improperly with another. Or they may behave properly with the Oracle patched hypervisor sources, but improperly with the Citrix ones (assuming every distro carries a different set of patches on top of an upstream stable release or branch). >We could even place more restrictions on the scope if we wanted to. We >could say that we only support the livepatch tools generating patches >for XSAs. For me, much depends on how tight such restrictions would be. I.e. with the examples given above, how would we determine a canonical livepatch-tools / hypervisor pair (or set of pairs)? After all tools mis-behavior may be a result of some custom patch in someone's derived tree. >> This is very similar to what XSA-155 was - the GCC compiler optimizations >> added a nice jump table that was accessed twice. And the offset was >> retrieved from the shared ring. >> >> But we didn't do an XSA-155 for the GCC compiler. That is we didn't >> file a ticket with GCC saying 'Hey, your compiler can create an race >> on shared memory. 
Could you make your compiler be smarter in these cases' >> We instead wrote code with this optimization in mind with more >> barriers. > >Right -- so the gcc compiler guys are using a specification that allows >that behavior. So from their perspective, we told the compiler to do >unsafe thing Y (or at least, said that we were OK with it doing unsafe >thing Y), and it did unsafe thing Y -- a security issue for Xen, but not >for gcc. If gcc had *violated* the spec when causing the security >issue, then we certainly would have called that a security issue in gcc. But would we have issued an XSA? Wouldn't that rather be a CVE against gcc then? Jan
On 08/07/2017 04:59 PM, Jan Beulich wrote: >>>> George Dunlap <george.dunlap@citrix.com> 08/07/17 12:27 PM >>> >> So it seems that people are still not quite clear about what I'm proposing. > > And indeed your examples helped me understand better what you mean > (or at least I hope they did). > >> Suppose someone builds a livepatch with the correct compiler, with a >> correct patch (that would fix the bug if rebooted into a new >> hypervisor), with correct fix-up code. Suppose that the bug passes all >> reasonable testing; but that, *due to a bug in the tools*, the patch >> also gives PV guests access to hypervisor memory. Is this a security >> issue? Yes -- the human told it to do safe thing X ("build a livepatch >> based on correct inputs to fix this bug") and it did unsafe thing Y >> ("build a livepatch that opens up a new security hole"). > > There's one more factor here: The livepatch tools may behave properly > with one version of the compiler, and improperly with another. I don't really understand the reasoning here. Is this your argument: "One can imagine a security-critical livepatch bug that only affects say, gcc 6.x and not gcc 5.x or 7.x. Therefore, we should never issue XSAs for any security-critical livepatch bugs." If we found that livepatching tools make an incorrect patch only when using gcc 5.x, and we have reason to believe that some people may be using gcc 5.x, then I think we should issue an XSA and say that it only affects people compiling xen with gcc 5.x. It probably would make sense to specify some range of compiler versions for which we will issue XSAs for the livepatch tools. A good baseline would be what versions of gcc Xen uses, and then we can restrict it further if we need to (for instance, if some versions of gcc are missing requisite features, or if they are just known to be buggy). And remember, this is not "We have tested all compiler versions and promise you there are no bugs." 
It's, "If someone finds a bug for this set of compilers, we will tell you about it so you can do something about it." >> We could even place more restrictions on the scope if we wanted to. We >> could say that we only support the livepatch tools generating patches >> for XSAs. > > For me, much depends on how tight such restrictions would be. I.e. > with the examples given above, how would we determine a canonical > livepatch-tools / hypervisor pair (or set of pairs)? After all tools > mis-behavior may be a result of some custom patch in someone's > derived tree. Well, suppose that we issued an XSA with a patch, and suppose it was later discovered that the patch opened up a different security hole when applied on the upstream tree. Would we issue another XSA and/or an update to the existing XSA? I think obviously yes we would. Suppose instead we issued an XSA with a patch, and that it was later discovered that the patch opened up a different security hole when applied on top of XenServer's patchqueue, but not on the baseline XenProject. Would we issue another XSA and/or an update to an existing XSA? The obvious *default* answer to that is "No; it's not practical for us to deal with software that is not inside the XenProject's control." One could imagine circumstances in which we issue statements or an XSA anyway, but that would be the exception and not the rule. I think the same kind of thing would apply to the livepatch tools: *by default*, we only issue XSAs for the livepatch tools if they create security issues when generating blobs based on security patches issued by the XenProject, and on top of XenProject-released software. As always, if there's some unforeseen circumstance then someone could argue for an exception. >>> This is very similar to what XSA-155 was - the GCC compiler optimizations >>> added a nice jump table that was accessed twice. And the offset was >>> retrieved from the shared ring. >>> >>> But we didn't do an XSA-155 for the GCC compiler.
That is we didn't >>> file a ticket with GCC saying 'Hey, your compiler can create an race >>> on shared memory. Could you make your compiler be smarter in these cases' >>> We instead wrote code with this optimization in mind with more >>> barriers. >> >> Right -- so the gcc compiler guys are using a specification that allows >> that behavior. So from their perspective, we told the compiler to do >> unsafe thing Y (or at least, said that we were OK with it doing unsafe >> thing Y), and it did unsafe thing Y -- a security issue for Xen, but not >> for gcc. If gcc had *violated* the spec when causing the security >> issue, then we certainly would have called that a security issue in gcc. > > But would we have issued an XSA? Wouldn't that rather be a CVE > against gcc then? This is changing the question slightly, from "Should X have security support", to "If X is to be security supported, what organization and process should be used to support them?" Obviously in the case of gcc, we would primarily handle the security issue the way the gcc project handles security issues (which may be nothing at all for all I know). (Although depending on the bug and the circumstances, we might still issue an advisory to raise awareness for downstreams who might have compiled Xen with a particular version of gcc.) If livepatch-tools were an external project run by somebody else, with their own security process, then we would report the issue to them and let them handle it. Since livepatch-tools is developed by Xen developers and for Xen downstreams and users, if it is to be security supported, then it seems to me that the obvious thing to do is to support it within the XenProject security response process. I mean, if someone *wants* to set up an independent organization with an independent security team and security process to handle the livepatch project, then I guess that would be OK with me -- I don't care so much *who* does the security support, as long as it gets done. -George
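The default scope proposed in the message above (XenProject-issued patch, XenProject-released base, compiler within a stated range) could be written down as a simple check. This is a hedged sketch only: `LivepatchBuild`, `in_default_xsa_scope`, and the compiler range are hypothetical, and the thread does not name any concrete supported versions. The function encodes just the default; the message explicitly leaves room for arguing exceptions.

```python
from dataclasses import dataclass

# Assumption: an illustrative range only, borrowed from the gcc 5.x/6.x/7.x
# example in the thread; no supported range was actually agreed.
SUPPORTED_GCC_MAJORS = frozenset({5, 6, 7})


@dataclass(frozen=True)
class LivepatchBuild:
    patch_issued_by_xenproject: bool   # e.g. a patch shipped with an XSA
    base_is_xenproject_release: bool   # no vendor patch queue on top
    gcc_major: int                     # major version of the compiler used


def in_default_xsa_scope(build: LivepatchBuild) -> bool:
    """True when a tools bug in this build would, by default, merit an XSA."""
    return (build.patch_issued_by_xenproject
            and build.base_is_xenproject_release
            and build.gcc_major in SUPPORTED_GCC_MAJORS)


upstream = LivepatchBuild(True, True, 6)
vendor_queue = LivepatchBuild(True, False, 6)

print(in_default_xsa_scope(upstream))      # in scope by default
print(in_default_xsa_scope(vendor_queue))  # out of scope by default
```

Keeping the compiler range as explicit data makes the support matrix visible, which is the part of the proposal that later replies worry could grow unmanageably large.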
>>> On 08.08.17 at 13:16, <george.dunlap@citrix.com> wrote: > On 08/07/2017 04:59 PM, Jan Beulich wrote: >>>>> George Dunlap <george.dunlap@citrix.com> 08/07/17 12:27 PM >>> >>> So it seems that people are still not quite clear about what I'm proposing. >> >> And indeed your examples helped me understand better what you mean >> (or at least I hope they did). >> >>> Suppose someone builds a livepatch with the correct compiler, with a >>> correct patch (that would fix the bug if rebooted into a new >>> hypervisor), with correct fix-up code. Suppose that the bug passes all >>> reasonable testing; but that, *due to a bug in the tools*, the patch >>> also gives PV guests access to hypervisor memory. Is this a security >>> issue? Yes -- the human told it to do safe thing X ("build a livepatch >>> based on correct inputs to fix this bug") and it did unsafe thing Y >>> ("build a livepatch that opens up a new security hole"). >> >> There's one more factor here: The livepatch tools may behave properly >> with one version of the compiler, and improperly with another. > > I don't really understand the reasoning here. Is this your argument: > "One can imagine a security-critical livepatch bug that only affects > say, gcc 6.x and not gcc 5.x or 7.x. Therefore, we should never issue > XSAs for any security-critical livepatch bugs." > > If we found that livepatching tools make an incorrect patch only when > using gcc 5.x, and we have reason to believe that some people may be > using gcc 5.x, then I think we should issue an XSA and say that it only > affects people compiling xen with gcc 5.x. > > It probably would make sense to specify some range of compiler versions > for which we will issue XSAs for the livepatch tools. A good baseline > would be what versions of gcc Xen uses, and then we can restrict it > further if we need to (for instance, if some versions of gcc are missing > requisite features, or if they are just known to be buggy). 
> > And remember, this is not "We have tested all compiler versions and > promise you there are no bugs." It's, "If someone finds a bug for this > set of compilers, we will tell you about it so you can do something > about it." I can see and understand all of what you say; my argument, however, was more towards the matrix of what needs supporting possibly becoming unreasonably large (no matter whether we specify a range of compilers, as once again distros tend to not ship plain unpatched upstream compiler versions). >>> We could even place more restrictions on the scope if we wanted to. We >>> could say that we only support the livepatch tools generating patches >>> for XSAs. >> >> For me, much depends on how tight such restrictions would be. I.e. >> with the examples given above, how would we determine a canonical >> livepatch-tools / hypervisor pair (or set of pairs)? After all tools >> mis-behavior may be a result of some custom patch in someone's >> derived tree. > > Well, suppose that we issued an XSA with a patch, and suppose it was > later discovered that the patch opened up a different security hole when > applied on the upstream tree. Would we issue another XSA and/or an > update to the existing XSA? I think obviously yes we would. Yes (this has happened in the past already). > Suppose instead we issued an XSA with a patch, and that it was later > discovered that the patch opened up a different security hole when > applied on top of XenServer's patchqueue, but not on the baseline > XenProject. Would we issue another XSA and/or an update to an existing XSA? > > The obvious *default* answer to that is "No; it's not practical for us > to deal with software that is not inside the XenProject's control." One > could imagine circumstances in which we issue statements or an XSA > anyway, but that would be the exception and not the rule.
> > I think the same kind of thing would apply to the livepatch tools: *by > default*, we only issue XSAs for the livepatch tools if they create > security issues when generating blobs based on security patches issued > by the XenProject, and on top of XenProject-released software. As > always, if there's some unforeseen circumstance then someone could argue > for an exception. Not sure here - if analysis showed that the same issue could happen elsewhere, and others were just lucky so far, I think we'd have to alter the default (and I'm hesitant to call this an exception). Plus analysis may, the more different components are involved (specifically the compiler, which perhaps none of us has deep enough knowledge about), become more and more difficult. Bottom line - while technically I agree it would be good for the tools to be security supported, from a practical perspective I see too much complexity for this to be reasonably manageable. Jan
On Wed, Aug 9, 2017 at 8:36 AM, Jan Beulich <JBeulich@suse.com> wrote: >>>> On 08.08.17 at 13:16, <george.dunlap@citrix.com> wrote: >> On 08/07/2017 04:59 PM, Jan Beulich wrote: >>>>>> George Dunlap <george.dunlap@citrix.com> 08/07/17 12:27 PM >>> >>>> So it seems that people are still not quite clear about what I'm proposing. >>> >>> And indeed your examples helped me understand better what you mean >>> (or at least I hope they did). >>> >>>> Suppose someone builds a livepatch with the correct compiler, with a >>>> correct patch (that would fix the bug if rebooted into a new >>>> hypervisor), with correct fix-up code. Suppose that the bug passes all >>>> reasonable testing; but that, *due to a bug in the tools*, the patch >>>> also gives PV guests access to hypervisor memory. Is this a security >>>> issue? Yes -- the human told it to do safe thing X ("build a livepatch >>>> based on correct inputs to fix this bug") and it did unsafe thing Y >>>> ("build a livepatch that opens up a new security hole"). >>> >>> There's one more factor here: The livepatch tools may behave properly >>> with one version of the compiler, and improperly with another. >> >> I don't really understand the reasoning here. Is this your argument: >> "One can imagine a security-critical livepatch bug that only affects >> say, gcc 6.x and not gcc 5.x or 7.x. Therefore, we should never issue >> XSAs for any security-critical livepatch bugs." >> >> If we found that livepatching tools make an incorrect patch only when >> using gcc 5.x, and we have reason to believe that some people may be >> using gcc 5.x, then I think we should issue an XSA and say that it only >> affects people compiling xen with gcc 5.x. >> >> It probably would make sense to specify some range of compiler versions >> for which we will issue XSAs for the livepatch tools. 
A good baseline >> would be what versions of gcc Xen uses, and then we can restrict it >> further if we need to (for instance, if some versions of gcc are missing >> requisite features, or if they are just known to be buggy). >> >> And remember, this is not "We have tested all compiler versions and >> promise you there are no bugs." It's, "If someone finds a bug for this >> set of compilers, we will tell you about it so you can do something >> about it." > > I can see and understand all of what you say; my argument, > however, was more towards the matrix of what needs supporting > possibly becoming unreasonably large (no matter whether we > specify a range of compilers, as once again distros tend to not > ship plain unpatched upstream compiler versions). What do you mean, "The matrix of what needs supporting [may possibly become] increasingly large"? What is the problem with having a large (implicit) "supported" matrix? How is supporting a "large matrix" for livepatch tools different than the current "large matrix" we support for just building Xen at all? >> Suppose instead we issued an XSA with a patch, and that it was later >> discovered that the patch opened up a different security hole when >> applied on top of XenServer's patchqueue, but not on the baseline >> XenProject. Would we issue another XSA and/or an update to an existing XSA? >> >> The obvious *default* answer to that is "No; it's not practical for us >> to deal with software that is not inside the XenProject's control." One >> could imagine circumstances in which we issue statements or an XSA >> anyway, but that would be the exception and not the rule. >> >> I think the same kind of thing would apply to the livepatch tools: *by >> default*, we only issue XSAs for the livepatch tools if they create >> security issues when generating blobs based on security patches issued >> by the XenProject, and on top of XenProject-released software.
As >> always, if there's some unforeseen circumstance then someone could argue >> for an exception. > > Not sure here - if analysis showed that the same issue could happen > elsewhere, and others were just lucky so far, I think we'd have to > alter the default (and I'm hesitant to call this an exception). Plus > analysis may, the more different components are involved > (specifically the compiler, which perhaps none of us has deep enough > knowledge about), become more and more difficult. > > Bottom line - while technically I agree it would be good for the tools > to be security supported, from a practical perspective I see too > much complexity for this to be reasonably manageable. But I still don't understand why you think so. Every single objection or question about what would or would not be supported that has been raised so far has analogs in what we already support. It is no more complex to support livepatch-tools than to support HVM USB passthrough, or credit2. I have elsewhere described a hypothetical scenario where I think we should issue an XSA for livepatch-tools. Are you really seriously suggesting that in that scenario we should simply publish the vulnerability onto xen-devel with no predisclosure? -George
>>> On 21.08.17 at 12:59, <george.dunlap@citrix.com> wrote: > On Wed, Aug 9, 2017 at 8:36 AM, Jan Beulich <JBeulich@suse.com> wrote: >>>>> On 08.08.17 at 13:16, <george.dunlap@citrix.com> wrote: >>> On 08/07/2017 04:59 PM, Jan Beulich wrote: >>>>>>> George Dunlap <george.dunlap@citrix.com> 08/07/17 12:27 PM >>> >>>>> So it seems that people are still not quite clear about what I'm proposing. >>>> >>>> And indeed your examples helped me understand better what you mean >>>> (or at least I hope they did). >>>> >>>>> Suppose someone builds a livepatch with the correct compiler, with a >>>>> correct patch (that would fix the bug if rebooted into a new >>>>> hypervisor), with correct fix-up code. Suppose that the bug passes all >>>>> reasonable testing; but that, *due to a bug in the tools*, the patch >>>>> also gives PV guests access to hypervisor memory. Is this a security >>>>> issue? Yes -- the human told it to do safe thing X ("build a livepatch >>>>> based on correct inputs to fix this bug") and it did unsafe thing Y >>>>> ("build a livepatch that opens up a new security hole"). >>>> >>>> There's one more factor here: The livepatch tools may behave properly >>>> with one version of the compiler, and improperly with another. >>> >>> I don't really understand the reasoning here. Is this your argument: >>> "One can imagine a security-critical livepatch bug that only affects >>> say, gcc 6.x and not gcc 5.x or 7.x. Therefore, we should never issue >>> XSAs for any security-critical livepatch bugs." >>> >>> If we found that livepatching tools make an incorrect patch only when >>> using gcc 5.x, and we have reason to believe that some people may be >>> using gcc 5.x, then I think we should issue an XSA and say that it only >>> affects people compiling xen with gcc 5.x. >>> >>> It probably would make sense to specify some range of compiler versions >>> for which we will issue XSAs for the livepatch tools. 
A good baseline >>> would be what versions of gcc Xen uses, and then we can restrict it >>> further if we need to (for instance, if some versions of gcc are missing >>> requisite features, or if they are just known to be buggy). >>> >>> And remember, this is not "We have tested all compiler versions and >>> promise you there are no bugs." It's, "If someone finds a bug for this >>> set of compilers, we will tell you about it so you can do something >>> about it." >> >> I can see and understand all of what you say; my argument, >> however, was more towards the matrix of what needs supporting >> possibly becoming unreasonably large (no matter whether we >> specify a range of compilers, as once again distros tend to not >> ship plain unpatched upstream compiler versions). > > What do you mean, "The matrix of what needs supporting [may possibly > become] increasingly large"? What is the problem with having a large > (implicit) "supported" matrix? How is supporting a "large matrix" for > livepatch tools different than the current "large matrix" we support > for just building Xen at all? The matrix of Xen only has just a single dimension. Since livepatch tools and Xen are independent, any pair of them would need building/testing in order to be sure things work in all supported combinations. >>> Suppose instead we issued an XSA with a patch, and that it was later >>> discovered that the patch opened up a different security hole when >>> applied on top of XenServer's patchqueue, but not on the baseline >>> XenProject. Would we issue another XSA and/or an update to an existing XSA? >>> >>> The obvious *default* answer to that is "No; it's not practical for us >>> to deal with software that is not inside the XenProject's control." One >>> could imagine circumstances in which we issue statements or an XSA >>> anyway, but that would be the exception and not the rule.
>>> >>> I think the same kind of thing would apply to the livepatch tools: *by >>> default*, we only issue XSAs for the livepatch tools if they create >>> security issues when generating blobs based on security patches issued >>> by the XenProject, and on top of XenProject-released software. As >>> always, if there's some unforeseen circumstance then someone could argue >>> for an exception. >> >> Not sure here - if analysis showed that the same issue could happen >> elsewhere, and others were just lucky so far, I think we'd have to >> alter the default (and I'm hesitant to call this an exception). Plus >> analysis may, the more different components are involved >> (specifically the compiler, which perhaps none of us has deep enough >> knowledge about), become more and more difficult. >> >> Bottom line - while technically I agree it would be good for the tools >> to be security supported, from a practical perspective I see too >> much complexity for this to be reasonably manageable. > > But I still don't understand why you think so. Every single objection > or question about what would or would not be supported that has been > raised so far has analogs in what we already support. It is no more > complex to support livepatch-tools than to support HVM USB > passthrough, or credit2. I don't think we currently have any scenario where it might be required (rather than just being optional) to do in-depth analysis of compiler behavior. If we ran into a compiler-induced vulnerability in Xen, I'm sure we'd forward this to the compiler folks. If, however, there's an issue with a livepatch, and it's unclear whether the root cause is the toolchain or the livepatch tools, both would need to be analyzed to find the culprit. I don't think forwarding this to the compiler folks would be appropriate until we're sure it's in their code rather than ours. > I have elsewhere described a hypothetical scenario where I think we > should issue an XSA for livepatch-tools.
Are you really seriously > suggesting that in that scenario we should simply publish the > vulnerability onto xen-devel with no predisclosure? Well, at least I'm not 100% convinced issuing an XSA in this case would be appropriate. Anyway - since it feels like we're moving in circles (which in part may be because I can't express well enough the reasons for my hesitation to go to the full XSA extent with the livepatch tools) I'd like to just conclude my part here with saying that I'm not going to stand in the way whichever decision is taken. I've voiced my reservations, and that will have to do. I'd therefore prefer to leave the discussion to those more familiar with those tools (and their possible limitations and issues). Jan
On 08/21/2017 01:07 PM, Jan Beulich wrote: >>>> And remember, this is not "We have tested all compiler versions and >>>> promise you there are no bugs." It's, "If someone finds a bug for this >>>> set of compilers, we will tell you about it so you can do something >>>> about it." >>> >>> I can see and understand all of what you say; my argument, >>> however was more towards the matrix of what needs supporting >>> possibly becoming unreasonably large (no matter whether we >>> specify a range of compilers, as once again distros tend to not >>> ship plain unpatched upstream compiler versions). >> >> What do you mean, "The matrix of what needs supporting [may possibly >> become] increasingly large"? What is the problem with having a large >> (implicit) "supported" matrix? How is supporting a "large matrix" for >> livepatch tools different than the current "large matrix" we support >> for just building Xen at all? > > The matrix of Xen only has just a single dimension. Since livepatch > tools and Xen are independent, any pair of them would need > building/testing in order to be sure things work in all supported > combinations. So your argument seems to be: 1. We can only provide security support in situations where we can test all possible combinations in the support matrix. 2. We cannot test the entire matrix of combinations for Xen x livepatch tools x compilers 3. Therefore, we cannot provide security support for livepatching tools. Put this way, I hope you can see what the flaw in the argument is: #1 is false. Xen has {Xen version} x {Linux version} x {Compiler} x {Hardware}. Hardware of course includes not only the chip itself, but the BIOS / firmware, and the particular devices (and device firmware). If we wanted we could add in {Python version} for people using pygrub, and {Ocaml compiler version} for people running Ocaml, versions of systemd -- I'm sure with effort I could find more dimensions to add to the matrix. 
We do not, and never have, *tested* the entire matrix of possible combinations considered "security supported" to make sure they work. Such a matrix is completely impossible to even consider, and even if we did some sort of testing, that could not guarantee that they are bug free. What we do for security support is: 1. Test a *representative sample* of combinations (via osstest, product testing, user testing, &c) 2. Promise to issue XSAs if anyone *happens to discover* a combination in the rest of the support matrix that has a security issue That is the requirement for normal Xen, and it would be the same requirement for livepatch-tools: That between osstest, product, and the community, we get regular testing of *a representative sample* of {Xen, livepatch-tools, compiler}, and (what primarily concerns me) issue an XSA if anyone discovers a security issue somewhere in that matrix. I'm not frustrated, but I am baffled by the fact that this "support matrix" objection is so persistent. Nearly everyone has brought it up, as though "test every combination" was a necessary requirement, in spite of the fact that 1) there is *no* piece of software for which we test the entire matrix of possible combinations 2) I have said over and over again (in fact, I specifically said a few replies ago -- it's there at the top of this email) that we do not test all possible combinations. >> I have elsewhere described a hypothetical scenario where I think we >> should issue an XSA for livepatch-tools. Are you really seriously >> suggesting that in that scenario we should simply publish the >> vulnerability onto xen-devel with no predisclosure? > > Well, at least I'm not 100% convinced issuing an XSA in this case > would be appropriate.
> > Anyway - since it feels like we're moving in circles (which in part > may be because I can't express well enough the reasons for my > hesitation to go to the full XSA extent with the livepatch tools) > I'd like to just conclude my part here with saying that I'm not > going to stand in the way whichever decision is taken. I've > voiced my reservations, and that will have to do. I'd therefore > prefer to leave the discussion to those more familiar with those > tools (and their possible limitations and issues). Indeed; and as I think I said before, I think we need to move forward with getting a statement on livepatching in, and since most of the voices involved in this conversation seem to be in favor of saying livepatch-tools are *not* supported, I won't object. I'm only still continuing this thread because people seem to be confused about what I am asking people to do. I think the likelihood of an XSA-worthy bug being found in the livepatch tools is very low. I'm happy to defer the argument about whether we should issue an XSA for such a bug until such time as one becomes known. -George
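As a rough illustration of the sampling argument above (editorial, not part of the original thread): the sketch below enumerates a small, entirely hypothetical support matrix and draws a representative sample from it. The dimension names and values are assumptions chosen only to show how quickly the full product grows compared with the subset that can realistically be tested.

```python
import itertools
import random

# Hypothetical dimensions; the values are illustrative only, not a
# statement of what the Xen Project actually supports or tests.
MATRIX = {
    "xen": ["4.7", "4.8", "4.9"],
    "livepatch_tools": ["rev-a", "rev-b"],
    "compiler": ["gcc-5", "gcc-6", "gcc-7"],
    "hardware": ["intel", "amd"],
}


def all_combinations(matrix):
    """Return every combination in the matrix as a list of dicts."""
    names = list(matrix)
    return [dict(zip(names, values))
            for values in itertools.product(*matrix.values())]


def representative_sample(matrix, size, seed=0):
    """Pick `size` combinations at random (fixed seed for repeatability)."""
    combos = all_combinations(matrix)
    return random.Random(seed).sample(combos, min(size, len(combos)))


if __name__ == "__main__":
    combos = all_combinations(MATRIX)
    tested = representative_sample(MATRIX, size=5)
    print(f"{len(combos)} combinations in the matrix, {len(tested)} tested")
```

Under the policy described in the thread, the untested combinations are still covered in the sense that an advisory would be issued if someone happened to find a security issue in one of them; random sampling here merely stands in for whatever mix of osstest, product testing and user testing is actually used.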
>>> On 21.08.17 at 17:28, <george.dunlap@citrix.com> wrote: > So your argument seems to be: > > 1. We can only provide security support in situations where we can test > all possible combinations in the support matrix. > > 2. We cannot test the entire matrix of combinations for Xen x livepatch > tools x compilers > > 3. Therefore, we cannot provide security support for livepatching tools. > > Put this way, I hope you can see what the flaw in the argument is: #1 is > false. Xen has {Xen version} x {Linux version} x {Compiler} x > {Hardware}. Hardware of course includes not only the chip itself, but > the BIOS / firmware, and the particular devices (and device firmware). > If we wanted we could add in {Python version} for people using pygrub, > and {Ocaml compiler version} for people running Ocaml, versions of > systemd -- I'm sure with effort I could find more dimensions to add to > the matrix. > > We do not, and never have, *tested* the entire matrix of possible > combinations considered "security supported" to make sure they work. > Such a matrix is completely impossible to even consider, and even if we > did some sort of testing, that could not guarantee that they are bug free. > > What we do for security support is: > > 1. Test a *representative sample* of combinations (via osstest, product > testing, user testing, &c) > > 2. Promise to issue XSAs if anyone *happens to discover* a combination > in the rest of the support matrix that has a security issue > > That is the requirement for normal Xen, and it would be the same > requirement for livepatch-tools: That between osstest, product, and the > community, we get regular testing of *a representative sample* of {Xen, > livepatch-tools, compiler}, and (what primarily concerns me) issue an > XSA if anyone discovers a security issue somewhere in that matrix. > > I'm not frustrated, but I am baffled by the fact that this "support > matrix" objection is so persistent.
Nearly everyone has brought it up, > as though "test every combination" was a necessary requirement, in spite > of the fact that 1) there is *no* piece of software for which we test > the entire matrix of possible combinations 2) I have said over and over > again (in fact, I specifically said a few replies ago -- it's there at > the top of this email) that we do not test all possible combinations. Well, part of it may be that the other components involved in the test matrix you suggest are external, i.e. we're just their consumers. If we consider just our own portions, the matrix is - as said - one dimensional. With the livepatching tools, a dimension is being added. Even us issuing Linux XSAs is, with the current upstream status of the Xen pieces in there, questionable imo. This is also considering the fact that iirc we've never issued an XSA for another Xen guest OS, yet it is hard to believe that only Linux would ever have had any vulnerability. Jan
On Tue, Aug 22, 2017 at 7:37 AM, Jan Beulich <JBeulich@suse.com> wrote: >>>> On 21.08.17 at 17:28, <george.dunlap@citrix.com> wrote: >> So your argument seems to be: >> >> 1. We can only provide security support in situations where we can test >> all possible combinations in the support matrix. >> >> 2. We cannot test the entire matrix of combinations for Xen x livepatch >> tools x compilers >> >> 3. Therefore, we cannot provide security support for livepatching tools. >> >> Put this way, I hope you can see what the flaw in the argument is: #1 is >> false. Xen has {Xen version} x {Linux version} x {Compiler} x >> {Hardware}. Hardware of course includes not only the chip itself, but >> the BIOS / firmware, and the particular devices (and device firmware). >> If we wanted we could add in {Python version} for people using pygrub, >> and {Ocaml compiler version} for people running Ocaml, versions of >> systemd -- I'm sure with effort I could find more dimensions to add to >> the matrix. >> >> We do not, and never have, *tested* the entire matrix of possible >> combinations considered "security supported" to make sure they work. >> Such a matrix is completely impossible to even consider, and even if we >> did some sort of testing, that could not guarantee that they are bug free. >> >> What we do for security support is: >> >> 1. Test a *representative sample* of combinations (via osstest, product >> testing, user testing, &c) >> >> 2. Promise to issue XSAs if anyone *happens to discover* a combination >> in the rest of the support matrix that has a security issue >> >> That is the requirement for normal Xen, and it would be the same >> requirement for livepatch-tools: That between osstest, product, and the >> community, we get regular testing of *a representative sample* of {Xen, >> livepatch-tools, compiler}, and (what primarily concerns me) issue an >> XSA if anyone discovers a security issue somewhere in that matrix.
>> >> I'm not frustrated, but I am baffled by the fact that this "support >> matrix" objection is so persistent. Nearly everyone has brought it up, >> as though "test every combination" was a necessary requirement, in spite >> of the fact that 1) there is *no* piece of software for which we test >> the entire matrix of possible combinations 2) I have said over and over >> again (in fact, I specifically said a few replies ago -- it's there at >> the top of this email) that we do not test all possible combinations. > > Well, part of it may be that the other components involved in the > test matrix you suggest are external, i.e. we're just their consumers. But in your response to me, you mentioned distros having patched compilers as a reason that the matrix is untenably large. > If we consider just our own portions, the matrix is - as said - one > dimensional. With the livepatching tools, a dimension is being added. > Even us issuing Linux XSAs is, with the current upstream status of > the Xen pieces in there, questionable imo. This is also considering > the fact that iirc we've never issued an XSA for another Xen guest > OS, yet it is hard to believe that only Linux would ever have had any > vulnerability. Well, no -- we have at the very least {Linux} x {Xen}, for which we end up testing *many* different configurations, but certainly not all. I think guest OS support is actually a pretty good analog. I can't imagine not issuing XSAs for bugs in Linux, just as I can't imagine not issuing XSAs for actual security issues that get found in the livepatch tools. If you think we shouldn't give security support for Linux, it makes sense that you would feel the same way for livepatch-tools (although I don't really understand why you think that way about either). We issue more XSAs for Linux than for other guests, in part because of the complexity of the code inside Linux compared to other OSes; but also in part due to the fact that that is the most tested and looked-at.
There probably *are* more bugs in Linux than in NetBSD or FreeBSD; but also more of them are found because more people are testing and looking. But in any case, in my mind, the promise never was "we test all versions of Linux with all versions of Xen", much less "We test all versions of all operating systems with all versions of Xen". The promise was, "We test a representative sample of Linux and Xen combinations, and we promise to report issues if they are found." That's what I think is the right thing to do for livepatch-tools as well. -George
On Tue, Aug 22, 2017 at 11:58:57AM +0100, George Dunlap wrote: > I think guest OS support is actually a pretty good analog. I can't > imagine not issuing XSAs for bugs in Linux, just as I can't imagine > not issuing XSAs for actual security issues that get found in the > livepatch tools. If you think we shouldn't give security support for > Linux, it makes sense that you would feel the same way for > livepatch-tools (although I don't really understand why you think that > way about either). > > We issue more XSAs for Linux than for other guests, in part because of > the complexity of the code inside Linux compared to other OSes; but > also in part due to the fact that that is the most tested and > looked-at. There probably *are* more bugs in Linux than in NetBSD or > FreeBSD; but also more of them are found because more people are > testing and looking. IMHO, we issue XSAs for Linux because Linux lacks a security process. If a bug was found in the BSDs, it should be handled using the normal security process that each BSD has, and an SA would be issued by the security officer: https://www.freebsd.org/security/advisories.html For example, NetBSD has recently released an SA for a Xen-specific PV vulnerability in their implementation: ftp://ftp.nl.netbsd.org/pub/NetBSD/security/advisories/NetBSD-SA2017-003.txt.asc Roger.
diff --git a/xen/common/Kconfig b/xen/common/Kconfig
index dc8e876..876086c 100644
--- a/xen/common/Kconfig
+++ b/xen/common/Kconfig
@@ -226,7 +226,7 @@ config CRYPTO
 	bool
 
 config LIVEPATCH
-	bool "Live patching support (TECH PREVIEW)"
+	bool "Live patching support"
 	default n
 	depends on HAS_BUILD_ID = "y"
 	---help---
Xen Live Patching has been available as a tech preview feature since Xen
4.7 and has now had a couple of releases to stabilize. Xen Live Patching
has been used by multiple vendors to fix several real-world security
issues without any severe bugs encountered. Additionally, there are now
tests in OSSTest that test live patching to ensure that no regressions
are introduced.

Based on the amount of testing and usage it has had, we are ready to
declare live patching as a 'Supported' feature.

Live patching is slightly peculiar when it comes to support because it
allows the host administrator to break their system rather easily
depending on the content of the live patch.
Because of this, it is worth detailing the scope of security
support:

* Unprivileged access to live patching operations:
  Live patching operations should only be accessible to privileged
  guests and it shall be treated as a security issue if this is not
  the case.

* Bugs in the patch-application code such that vulnerabilities exist
  after application:
  If a correct live patch is loaded but it is not applied correctly
  such that it might result in an insecure system (e.g. not all
  functions are patched), it shall be treated as a security issue.

* Bugs in livepatch-build-tools creating an incorrect live patch that
  results in an insecure host:
  If livepatch-build-tools creates an incorrect live patch that
  results in an insecure host, this shall not be considered a security
  issue. There are too many OSes and toolchains to consider supporting
  this. A live patch should be checked to verify that it is valid
  before loading.

* Loading an incorrect live patch that results in an insecure host or
  host crash:
  If a live patch (whether created using livepatch-build-tools or some
  alternative) is loaded and it results in an insecure host or host
  crash due to the content of the live patch being incorrect or the
  issue being inappropriate to live patch, this is not considered a
  security issue.

* Bugs in the live patch parsing code (the ELF loader):
  Bugs in the live patch parsing code such as out-of-bounds reads
  caused by invalid ELF files are not considered to be security issues
  because they can only be triggered by a privileged domain.

* Bugs which allow a guest to prevent the application of a livepatch:
  A guest should not be able to prevent the application of a live
  patch. If an unprivileged guest can prevent the application of a
  live patch, it shall be treated as a security issue.

There are also some generic security questions which it is worth asking:

1) Is guest->host privilege escalation possible?

The new live patching sysctl subops are only accessible to privileged
domains and this is tested by OSSTest with an XTF test.
There is a caveat -- an incorrect live patch can introduce a guest->host
privilege escalation.

2) Is guest user->guest kernel escalation possible?

No, although an incorrect live patch can introduce a guest user->guest
kernel privilege escalation.

3) Is there any information leakage?

The new live patching sysctl subops are only accessible to privileged
domains so it is not possible for an unprivileged guest to access the
list of loaded live patches. This is tested by OSSTest with an XTF test.
There is a caveat -- an incorrect live patch can introduce an
information leakage.

4) Can a Denial-of-Service be triggered?

There are no known ways that an unprivileged guest can prevent a live
patch from being loaded.
Once again, there is a caveat that an incorrect live patch can introduce
an arbitrary denial of service.

Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
---
 xen/common/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
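For quick reference (editorial, not part of the patch), the security scope listed in the commit message can be written down as a small lookup table. This is only a sketch: the key names are hypothetical labels invented here, and the values reflect the commit message as posted, not any later outcome of the discussion in this thread.

```python
# Security scope as stated in the commit message above. The key names are
# hypothetical labels invented for this sketch; True means the commit
# message says the case shall be treated as a security issue.
SECURITY_SCOPE = {
    "unprivileged_access_to_livepatch_ops": True,
    "patch_application_bug_leaves_vulnerability": True,
    "build_tools_create_incorrect_patch": False,
    "loading_incorrect_patch": False,
    "elf_parser_bug_from_privileged_domain": False,
    "guest_prevents_patch_application": True,
}


def is_security_issue(case):
    """Look up a case; unknown cases raise KeyError rather than guess."""
    return SECURITY_SCOPE[case]


if __name__ == "__main__":
    for case, in_scope in SECURITY_SCOPE.items():
        verdict = "security issue" if in_scope else "not a security issue"
        print(f"{case}: {verdict}")
```

Raising `KeyError` for an unlisted case is deliberate: anything the commit message does not address would need a human decision rather than a default.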