Message ID | 20201015192956.1797021-1-dburgener@linux.microsoft.com (mailing list archive) |
---|---|
Headers | show |
Series | Update SELinuxfs out of tree and then swapover | expand |
On Thu, Oct 15, 2020 at 03:29:53PM -0400, Daniel Burgener wrote: > This is a backport for stable of my series to fix a race condition in > selinuxfs during policy load: > > selinux: Create function for selinuxfs directory cleanup > https://lore.kernel.org/selinux/20200819195935.1720168-2-dburgener@linux.microsoft.com/ > > selinux: Refactor selinuxfs directory populating functions > https://lore.kernel.org/selinux/20200819195935.1720168-3-dburgener@linux.microsoft.com/ > > selinux: Standardize string literal usage for selinuxfs directory names > https://lore.kernel.org/selinux/20200819195935.1720168-4-dburgener@linux.microsoft.com/ > > selinux: Create new booleans and class dirs out of tree > https://lore.kernel.org/selinux/20200819195935.1720168-5-dburgener@linux.microsoft.com/ > > Several changes were necessary to backport. They are detailed in the > commit message for the third commit in the series. I also dropped the > original third commit from this because it was only a style change. As Sasha said, we want to take the original commits as much as possible to reduce the delta. It is ok to take style changes if other patches depend on them, because every time we do something that is not upstream, we create bugs. So can you redo this series, and backport the original commits, and provide the ids for them as well? thanks, greg k-h
On 10/16/20 1:00 AM, Greg KH wrote: > On Thu, Oct 15, 2020 at 03:29:53PM -0400, Daniel Burgener wrote: >> This is a backport for stable of my series to fix a race condition in >> selinuxfs during policy load: >> >> selinux: Create function for selinuxfs directory cleanup >> https://lore.kernel.org/selinux/20200819195935.1720168-2-dburgener@linux.microsoft.com/ >> >> selinux: Refactor selinuxfs directory populating functions >> https://lore.kernel.org/selinux/20200819195935.1720168-3-dburgener@linux.microsoft.com/ >> >> selinux: Standardize string literal usage for selinuxfs directory names >> https://lore.kernel.org/selinux/20200819195935.1720168-4-dburgener@linux.microsoft.com/ >> >> selinux: Create new booleans and class dirs out of tree >> https://lore.kernel.org/selinux/20200819195935.1720168-5-dburgener@linux.microsoft.com/ >> >> Several changes were necessary to backport. They are detailed in the >> commit message for the third commit in the series. I also dropped the >> original third commit from this because it was only a style change. > As Sasha said, we want to take the original commits as much as possible > to reduce the delta. It is ok to take style changes if other patches > depend on them, because every time we do something that is not upstream, > we create bugs. > > So can you redo this series, and backport the original commits, and > provide the ids for them as well? > > thanks, > > greg k-h Yes, thank you. I will fix up the series with the third commit included, and add commit ids. Thanks. -Daniel
On Fri, Oct 16, 2020 at 9:05 AM Daniel Burgener <dburgener@linux.microsoft.com> wrote: > Yes, thank you. I will fix up the series with the third commit > included, and add commit ids. Thanks. Greg and I have different opinions on what is classified as a good candidate for the -stable trees, but in my opinion this patch series doesn't qualify. There are a lot of dependencies, it is intertwined with a lot of code, and the issue that this patchset fixes has been around for a *long* time. I personally feel the risk of backporting this to -stable does not outweigh the potential wins. My current opinion is that backporting this patchset is not a good idea; it gets a NACK from me. Daniel, in the future if this is something you want to see backported please bring this issue up on the SELinux mailing list when the patchset is originally posted so we can have a discussion about it and plan accordingly.
On 10/16/20 9:55 AM, Paul Moore wrote: > On Fri, Oct 16, 2020 at 9:05 AM Daniel Burgener > <dburgener@linux.microsoft.com> wrote: >> Yes, thank you. I will fix up the series with the third commit >> included, and add commit ids. Thanks. > Greg and I have different opinions on what is classified as a good > candidate for the -stable trees, but in my opinion this patch series > doesn't qualify. There are a lot of dependencies, it is intertwined > with a lot of code, and the issue that this patchset fixes has been > around for a *long* time. I personally feel the risk of backporting > this to -stable does not outweigh the potential wins. > > My current opinion is that backporting this patchset is not a good > idea; it gets a NACK from me. > > Daniel, in the future if this is something you want to see backported > please bring this issue up on the SELinux mailing list when the > patchset is originally posted so we can have a discussion about it and > plan accordingly. > Noted. Thanks for the feedback. I will make sure to bring such things up with the selinux list in the future. -Daniel
On Fri, Oct 16, 2020 at 09:55:25AM -0400, Paul Moore wrote: >On Fri, Oct 16, 2020 at 9:05 AM Daniel Burgener ><dburgener@linux.microsoft.com> wrote: >> Yes, thank you. I will fix up the series with the third commit >> included, and add commit ids. Thanks. > >Greg and I have different opinions on what is classified as a good >candidate for the -stable trees, but in my opinion this patch series >doesn't qualify. There are a lot of dependencies, it is intertwined >with a lot of code, and the issue that this patchset fixes has been >around for a *long* time. I personally feel the risk of backporting >this to -stable does not outweigh the potential wins. My understanding is that while the issue Daniel is fixing here has been around for a while, it's also very real - the reports suggest a failure rate of 1-2% on boot. I do understand your concerns around this series, but given it was just fixed upstream we don't have a better story than "sit tight for the next LTS" to tell to users affected by this issue. Is there a scenario where you'd feel safer with the series? I suspect that if it doesn't go into upstream stable Daniel will end up carrying it out of tree anyway, so maybe we can ask Daniel to do targetted testing for the next week or two and report back?
On 10/16/20 10:22 AM, Sasha Levin wrote: > On Fri, Oct 16, 2020 at 09:55:25AM -0400, Paul Moore wrote: >> On Fri, Oct 16, 2020 at 9:05 AM Daniel Burgener >> <dburgener@linux.microsoft.com> wrote: >>> Yes, thank you. I will fix up the series with the third commit >>> included, and add commit ids. Thanks. >> >> Greg and I have different opinions on what is classified as a good >> candidate for the -stable trees, but in my opinion this patch series >> doesn't qualify. There are a lot of dependencies, it is intertwined >> with a lot of code, and the issue that this patchset fixes has been >> around for a *long* time. I personally feel the risk of backporting >> this to -stable does not outweigh the potential wins. > > My understanding is that while the issue Daniel is fixing here has been > around for a while, it's also very real - the reports suggest a failure > rate of 1-2% on boot. As a point of clarity, I think that the issue occurs much less frequently on boot than it does with a policy load during ordinary operation, since there are a much higher volume of userspace policy manager lookups on a policy_load once the system is up. I think 1-2% is roughly accurate for what we're seeing in the environment I'm working on for a policy load during normal steady state operation. I don't have hard numbers on policy load during boot, but I would expect it to be quite a bit lower. We have seen it, but it's not the common case we're seeing. > > I do understand your concerns around this series, but given it was just > fixed upstream we don't have a better story than "sit tight for the > next LTS" to tell to users affected by this issue. > > Is there a scenario where you'd feel safer with the series? I suspect > that if it doesn't go into upstream stable Daniel will end up carrying > it out of tree anyway, so maybe we can ask Daniel to do targetted > testing for the next week or two and report back? > I believe my team will intend to carry this out of tree, yes. If additional data from that would be helpful, I'd be happy to provide it. -Daniel
This is a backport for stable of my series to fix a race condition in selinuxfs during policy load: selinux: Create function for selinuxfs directory cleanup https://lore.kernel.org/selinux/20200819195935.1720168-2-dburgener@linux.microsoft.com/ selinux: Refactor selinuxfs directory populating functions https://lore.kernel.org/selinux/20200819195935.1720168-3-dburgener@linux.microsoft.com/ selinux: Standardize string literal usage for selinuxfs directory names https://lore.kernel.org/selinux/20200819195935.1720168-4-dburgener@linux.microsoft.com/ selinux: Create new booleans and class dirs out of tree https://lore.kernel.org/selinux/20200819195935.1720168-5-dburgener@linux.microsoft.com/ Several changes were necessary to backport. They are detailed in the commit message for the third commit in the series. I also dropped the original third commit from this because it was only a style change. The bulk of the original cover letter is reproduced below. In the current implementation, on policy load /sys/fs/selinux is updated by deleting the previous contents of /sys/fs/selinux/{class,booleans} and then recreating them. This means that there is a period of time when the contents of these directories do not exist which can cause race conditions as userspace relies on them for information about the policy. In addition, it means that error recovery in the event of failure is challenging. This patch series follows the design outlined by Al Viro in a previous e-mail to the list[1]. This approach is to first create the new directory structures out of tree, then to perform the swapover, and finally to delete the old directories. Not handled in this series is error recovery in the event of failure. Error recovery in the selinuxfs recreation is unhandled in the current code, so this series will not cause any regression in this regard. Handling directory recreation in this manner is a prerequisite to make proper error handling possible. In order to demonstrate the race condition that this series fixes, you can use the following commands: while true; do cat /sys/fs/selinux/class/service/perms/status >/dev/null; done & while true; do load_policy; done; In the existing code, this will display errors fairly often as the class lookup fails. (In normal operation from systemd, this would result in a permission check which would be allowed or denied based on policy settings around unknown object classes.) After applying this patch series you should expect to no longer see such error messages. Daniel Burgener (3): selinux: Create function for selinuxfs directory cleanup selinux: Refactor selinuxfs directory populating functions selinux: Create new booleans and class dirs out of tree security/selinux/selinuxfs.c | 160 +++++++++++++++++++++++++++-------- 1 file changed, 123 insertions(+), 37 deletions(-)