Message ID | 20220727123048.46389-1-jlayton@kernel.org |
---|---|
State | New, archived |
Series | vfs: bypass may_create_in_sticky check if task has CAP_FOWNER |

On Wed, Jul 27, 2022 at 08:30:48AM -0400, Jeff Layton wrote:
> NFS server is exporting a sticky directory (mode 01777) with root
> squashing enabled. Client has protected_regular enabled and then tries to
> open a file as root in that directory. File is created (with ownership
> set to nobody:nobody) but the open syscall returns an error.
>
> The problem is may_create_in_sticky, which rejects the open even though
> the file has already been created/opened. Bypass the checks in
> may_create_in_sticky if the task has CAP_FOWNER in the given namespace.
>
> Link: https://bugzilla.redhat.com/show_bug.cgi?id=1976829
> Reported-by: Yongchen Yang <yoyang@redhat.com>
> Suggested-by: Christian Brauner <brauner@kernel.org>
> Signed-off-by: Jeff Layton <jlayton@kernel.org>
> ---
>  fs/namei.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/fs/namei.c b/fs/namei.c
> index 1f28d3f463c3..170c2396ba29 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -1230,7 +1230,8 @@ static int may_create_in_sticky(struct user_namespace *mnt_userns,
>              (!sysctl_protected_regular && S_ISREG(inode->i_mode)) ||
>              likely(!(dir_mode & S_ISVTX)) ||
>              uid_eq(i_uid_into_mnt(mnt_userns, inode), dir_uid) ||
> -            uid_eq(current_fsuid(), i_uid_into_mnt(mnt_userns, inode)))
> +            uid_eq(current_fsuid(), i_uid_into_mnt(mnt_userns, inode)) ||
> +            ns_capable(mnt_userns, CAP_FOWNER))
>                  return 0;

Hm, no. You really want inode_owner_or_capable() here..
You need to verify that you have a mapping for the inode->i_{g,u}id in
question and that you're having CAP_FOWNER in the caller's userns.

I'm pretty sure we should also restrict this to the case where the caller
actually created the file otherwise we introduce a potential issue where
the caller is susceptible to data spoofing. For example, the file was
created by another user racing the caller's O_CREAT.

On Wed, 2022-07-27 at 14:37 +0200, Christian Brauner wrote:
> On Wed, Jul 27, 2022 at 08:30:48AM -0400, Jeff Layton wrote:
> > NFS server is exporting a sticky directory (mode 01777) with root
> > squashing enabled. Client has protected_regular enabled and then tries to
> > open a file as root in that directory. File is created (with ownership
> > set to nobody:nobody) but the open syscall returns an error.
> >
> > The problem is may_create_in_sticky, which rejects the open even though
> > the file has already been created/opened. Bypass the checks in
> > may_create_in_sticky if the task has CAP_FOWNER in the given namespace.
> >
> > Link: https://bugzilla.redhat.com/show_bug.cgi?id=1976829
> > Reported-by: Yongchen Yang <yoyang@redhat.com>
> > Suggested-by: Christian Brauner <brauner@kernel.org>
> > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > ---
> >  fs/namei.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/fs/namei.c b/fs/namei.c
> > index 1f28d3f463c3..170c2396ba29 100644
> > --- a/fs/namei.c
> > +++ b/fs/namei.c
> > @@ -1230,7 +1230,8 @@ static int may_create_in_sticky(struct user_namespace *mnt_userns,
> >              (!sysctl_protected_regular && S_ISREG(inode->i_mode)) ||
> >              likely(!(dir_mode & S_ISVTX)) ||
> >              uid_eq(i_uid_into_mnt(mnt_userns, inode), dir_uid) ||
> > -            uid_eq(current_fsuid(), i_uid_into_mnt(mnt_userns, inode)))
> > +            uid_eq(current_fsuid(), i_uid_into_mnt(mnt_userns, inode)) ||
> > +            ns_capable(mnt_userns, CAP_FOWNER))
> >                  return 0;
>
> Hm, no. You really want inode_owner_or_capable() here..
> You need to verify that you have a mapping for the inode->i_{g,u}id in
> question and that you're having CAP_FOWNER in the caller's userns.
>

Ok, I should be able to make that change and test it out.

> I'm pretty sure we should also restrict this to the case where the caller
> actually created the file otherwise we introduce a potential issue where
> the caller is susceptible to data spoofing. For example, the file was
> created by another user racing the caller's O_CREAT.

That won't be sufficient to fix the testcase, I think. If a file already
exists in the sticky dir and is owned by nobody:nobody, do we really
want to prevent root from opening it? I wouldn't think so.

On Wed, Jul 27, 2022 at 08:55:35AM -0400, Jeff Layton wrote:
> On Wed, 2022-07-27 at 14:37 +0200, Christian Brauner wrote:
> > On Wed, Jul 27, 2022 at 08:30:48AM -0400, Jeff Layton wrote:
> > > NFS server is exporting a sticky directory (mode 01777) with root
> > > squashing enabled. Client has protected_regular enabled and then tries to
> > > open a file as root in that directory. File is created (with ownership
> > > set to nobody:nobody) but the open syscall returns an error.
> > >
> > > The problem is may_create_in_sticky, which rejects the open even though
> > > the file has already been created/opened. Bypass the checks in
> > > may_create_in_sticky if the task has CAP_FOWNER in the given namespace.
> > >
> > > Link: https://bugzilla.redhat.com/show_bug.cgi?id=1976829
> > > Reported-by: Yongchen Yang <yoyang@redhat.com>
> > > Suggested-by: Christian Brauner <brauner@kernel.org>
> > > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > > ---
> > >  fs/namei.c | 3 ++-
> > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/fs/namei.c b/fs/namei.c
> > > index 1f28d3f463c3..170c2396ba29 100644
> > > --- a/fs/namei.c
> > > +++ b/fs/namei.c
> > > @@ -1230,7 +1230,8 @@ static int may_create_in_sticky(struct user_namespace *mnt_userns,
> > >              (!sysctl_protected_regular && S_ISREG(inode->i_mode)) ||
> > >              likely(!(dir_mode & S_ISVTX)) ||
> > >              uid_eq(i_uid_into_mnt(mnt_userns, inode), dir_uid) ||
> > > -            uid_eq(current_fsuid(), i_uid_into_mnt(mnt_userns, inode)))
> > > +            uid_eq(current_fsuid(), i_uid_into_mnt(mnt_userns, inode)) ||
> > > +            ns_capable(mnt_userns, CAP_FOWNER))
> > >                  return 0;
> >
> > Hm, no. You really want inode_owner_or_capable() here..
> > You need to verify that you have a mapping for the inode->i_{g,u}id in
> > question and that you're having CAP_FOWNER in the caller's userns.
> >
>
> Ok, I should be able to make that change and test it out.
>
> > I'm pretty sure we should also restrict this to the case where the caller
> > actually created the file otherwise we introduce a potential issue where
> > the caller is susceptible to data spoofing. For example, the file was
> > created by another user racing the caller's O_CREAT.
>
> That won't be sufficient to fix the testcase, I think. If a file already
> exists in the sticky dir and is owned by nobody:nobody, do we really
> want to prevent root from opening it? I wouldn't think so.

Afaict, the whole shtick behind the protected_regular thing in
may_create_in_sticky() is that you prevent scenarios where you can
be tricked into opening a file that you didn't intend to with O_CREAT.

That's specifically also a protection for root. So say root specifies
O_CREAT but someone beats root to it and creates the file dumping
malicious data in there. The uid_eq() requirement is supposed to prevent
such attacks and it's a sysctl that userspace opted into.

We'd be relaxing that restriction quite a bit if we allowed not just newly
created but also pre-existing files to be opened, even with the CAP_FOWNER
requirement.

So the dd call should really fail if O_CREAT is passed but the file is
pre-existing, imho. It's a different story if dd created that file and
has CAP_FOWNER imho.

On Wed, 2022-07-27 at 15:16 +0200, Christian Brauner wrote:
> On Wed, Jul 27, 2022 at 08:55:35AM -0400, Jeff Layton wrote:
> > On Wed, 2022-07-27 at 14:37 +0200, Christian Brauner wrote:
> > > On Wed, Jul 27, 2022 at 08:30:48AM -0400, Jeff Layton wrote:
> > > > NFS server is exporting a sticky directory (mode 01777) with root
> > > > squashing enabled. Client has protected_regular enabled and then tries to
> > > > open a file as root in that directory. File is created (with ownership
> > > > set to nobody:nobody) but the open syscall returns an error.
> > > >
> > > > The problem is may_create_in_sticky, which rejects the open even though
> > > > the file has already been created/opened. Bypass the checks in
> > > > may_create_in_sticky if the task has CAP_FOWNER in the given namespace.
> > > >
> > > > Link: https://bugzilla.redhat.com/show_bug.cgi?id=1976829
> > > > Reported-by: Yongchen Yang <yoyang@redhat.com>
> > > > Suggested-by: Christian Brauner <brauner@kernel.org>
> > > > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > > > ---
> > > >  fs/namei.c | 3 ++-
> > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/fs/namei.c b/fs/namei.c
> > > > index 1f28d3f463c3..170c2396ba29 100644
> > > > --- a/fs/namei.c
> > > > +++ b/fs/namei.c
> > > > @@ -1230,7 +1230,8 @@ static int may_create_in_sticky(struct user_namespace *mnt_userns,
> > > >              (!sysctl_protected_regular && S_ISREG(inode->i_mode)) ||
> > > >              likely(!(dir_mode & S_ISVTX)) ||
> > > >              uid_eq(i_uid_into_mnt(mnt_userns, inode), dir_uid) ||
> > > > -            uid_eq(current_fsuid(), i_uid_into_mnt(mnt_userns, inode)))
> > > > +            uid_eq(current_fsuid(), i_uid_into_mnt(mnt_userns, inode)) ||
> > > > +            ns_capable(mnt_userns, CAP_FOWNER))
> > > >                  return 0;
> > >
> > > Hm, no. You really want inode_owner_or_capable() here..
> > > You need to verify that you have a mapping for the inode->i_{g,u}id in
> > > question and that you're having CAP_FOWNER in the caller's userns.
> > >
> >
> > Ok, I should be able to make that change and test it out.
> >
> > > I'm pretty sure we should also restrict this to the case where the caller
> > > actually created the file otherwise we introduce a potential issue where
> > > the caller is susceptible to data spoofing. For example, the file was
> > > created by another user racing the caller's O_CREAT.
> >
> > That won't be sufficient to fix the testcase, I think. If a file already
> > exists in the sticky dir and is owned by nobody:nobody, do we really
> > want to prevent root from opening it? I wouldn't think so.
>
> Afaict, the whole shtick behind the protected_regular thing in
> may_create_in_sticky() is that you prevent scenarios where you can
> be tricked into opening a file that you didn't intend to with O_CREAT.
>

Yuck. The proper way to get that protection is to use O_EXCL...

> That's specifically also a protection for root. So say root specifies
> O_CREAT but someone beats root to it and creates the file dumping
> malicious data in there. The uid_eq() requirement is supposed to prevent
> such attacks and it's a sysctl that userspace opted into.
>
> We'd be relaxing that restriction quite a bit if we allowed not just newly
> created but also pre-existing files to be opened, even with the CAP_FOWNER
> requirement.
>
> So the dd call should really fail if O_CREAT is passed but the file is
> pre-existing, imho. It's a different story if dd created that file and
> has CAP_FOWNER imho.

That's pretty nasty. So if I create a file as root in a sticky dir that
doesn't exist, and then close it and try to open it again it'll fail
with -EACCES? That's terribly confusing.
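
The O_EXCL remark above can be illustrated from userspace. What follows is a minimal sketch and not something posted in the thread: `create_exclusive()` and `second_create_errno()` are hypothetical helper names, and the use of a private directory under /tmp is an assumption made only so the example is self-contained. It shows that a second `O_CREAT | O_EXCL` open of the same path fails with EEXIST, so a caller using O_EXCL cannot be handed a file that somebody else created first.

```c
#define _POSIX_C_SOURCE 200809L   /* for mkdtemp() */

#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Create `path` only if it does not exist yet. Returns an open fd on
 * success, or -1 with errno set (EEXIST if the file already exists).
 */
int create_exclusive(const char *path)
{
	assert(path != NULL);
	return open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
}

/*
 * Hypothetical demo helper: create a file twice inside a private
 * temporary directory and return the errno of the second attempt
 * (0 if the second attempt unexpectedly succeeded).
 */
int second_create_errno(void)
{
	char dir[] = "/tmp/oexcl-demo-XXXXXX";   /* assumed scratch location */
	char path[sizeof(dir) + 8];
	int fd, err = 0;

	if (!mkdtemp(dir))
		return errno;
	snprintf(path, sizeof(path), "%s/file", dir);

	fd = create_exclusive(path);   /* first create: file is new */
	if (fd < 0) {
		err = errno;
		goto out;
	}
	close(fd);

	fd = create_exclusive(path);   /* second create: file pre-exists */
	if (fd < 0)
		err = errno;
	else
		close(fd);
out:
	unlink(path);
	rmdir(dir);
	return err;
}
```

Calling `second_create_errno()` on a typical Linux system returns EEXIST. The sketch says nothing about how protected_regular itself behaves; it only shows the application-side alternative mentioned in the reply.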

On Wed, Jul 27, 2022 at 09:29:37AM -0400, Jeff Layton wrote:
> On Wed, 2022-07-27 at 15:16 +0200, Christian Brauner wrote:
> > On Wed, Jul 27, 2022 at 08:55:35AM -0400, Jeff Layton wrote:
> > > On Wed, 2022-07-27 at 14:37 +0200, Christian Brauner wrote:
> > > > On Wed, Jul 27, 2022 at 08:30:48AM -0400, Jeff Layton wrote:
> > > > > NFS server is exporting a sticky directory (mode 01777) with root
> > > > > squashing enabled. Client has protected_regular enabled and then tries to
> > > > > open a file as root in that directory. File is created (with ownership
> > > > > set to nobody:nobody) but the open syscall returns an error.
> > > > >
> > > > > The problem is may_create_in_sticky, which rejects the open even though
> > > > > the file has already been created/opened. Bypass the checks in
> > > > > may_create_in_sticky if the task has CAP_FOWNER in the given namespace.
> > > > >
> > > > > Link: https://bugzilla.redhat.com/show_bug.cgi?id=1976829
> > > > > Reported-by: Yongchen Yang <yoyang@redhat.com>
> > > > > Suggested-by: Christian Brauner <brauner@kernel.org>
> > > > > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > > > > ---
> > > > >  fs/namei.c | 3 ++-
> > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/fs/namei.c b/fs/namei.c
> > > > > index 1f28d3f463c3..170c2396ba29 100644
> > > > > --- a/fs/namei.c
> > > > > +++ b/fs/namei.c
> > > > > @@ -1230,7 +1230,8 @@ static int may_create_in_sticky(struct user_namespace *mnt_userns,
> > > > >              (!sysctl_protected_regular && S_ISREG(inode->i_mode)) ||
> > > > >              likely(!(dir_mode & S_ISVTX)) ||
> > > > >              uid_eq(i_uid_into_mnt(mnt_userns, inode), dir_uid) ||
> > > > > -            uid_eq(current_fsuid(), i_uid_into_mnt(mnt_userns, inode)))
> > > > > +            uid_eq(current_fsuid(), i_uid_into_mnt(mnt_userns, inode)) ||
> > > > > +            ns_capable(mnt_userns, CAP_FOWNER))
> > > > >                  return 0;
> > > >
> > > > Hm, no. You really want inode_owner_or_capable() here..
> > > > You need to verify that you have a mapping for the inode->i_{g,u}id in
> > > > question and that you're having CAP_FOWNER in the caller's userns.
> > > >
> > >
> > > Ok, I should be able to make that change and test it out.
> > >
> > > > I'm pretty sure we should also restrict this to the case where the caller
> > > > actually created the file otherwise we introduce a potential issue where
> > > > the caller is susceptible to data spoofing. For example, the file was
> > > > created by another user racing the caller's O_CREAT.
> > >
> > > That won't be sufficient to fix the testcase, I think. If a file already
> > > exists in the sticky dir and is owned by nobody:nobody, do we really
> > > want to prevent root from opening it? I wouldn't think so.
> >
> > Afaict, the whole shtick behind the protected_regular thing in
> > may_create_in_sticky() is that you prevent scenarios where you can
> > be tricked into opening a file that you didn't intend to with O_CREAT.
> >
>
> Yuck. The proper way to get that protection is to use O_EXCL...

I'm not saying the interface was a particularly great idea. But it's at
least a sysctl...

>
> > That's specifically also a protection for root. So say root specifies
> > O_CREAT but someone beats root to it and creates the file dumping
> > malicious data in there. The uid_eq() requirement is supposed to prevent
> > such attacks and it's a sysctl that userspace opted into.
> >
> > We'd be relaxing that restriction quite a bit if we allowed not just newly
> > created but also pre-existing files to be opened, even with the CAP_FOWNER
> > requirement.
> >
> > So the dd call should really fail if O_CREAT is passed but the file is
> > pre-existing, imho. It's a different story if dd created that file and
> > has CAP_FOWNER imho.
>
> That's pretty nasty. So if I create a file as root in a sticky dir that
> doesn't exist, and then close it and try to open it again it'll fail
> with -EACCES? That's terribly confusing.

At least only if you try to re-open with O_CREAT and have this
protected_regular sysctl thingy turned on...

diff --git a/fs/namei.c b/fs/namei.c
index 1f28d3f463c3..170c2396ba29 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1230,7 +1230,8 @@ static int may_create_in_sticky(struct user_namespace *mnt_userns,
             (!sysctl_protected_regular && S_ISREG(inode->i_mode)) ||
             likely(!(dir_mode & S_ISVTX)) ||
             uid_eq(i_uid_into_mnt(mnt_userns, inode), dir_uid) ||
-            uid_eq(current_fsuid(), i_uid_into_mnt(mnt_userns, inode)))
+            uid_eq(current_fsuid(), i_uid_into_mnt(mnt_userns, inode)) ||
+            ns_capable(mnt_userns, CAP_FOWNER))
                 return 0;
 
         if (likely(dir_mode & 0002) ||

NFS server is exporting a sticky directory (mode 01777) with root
squashing enabled. Client has protected_regular enabled and then tries to
open a file as root in that directory. File is created (with ownership
set to nobody:nobody) but the open syscall returns an error.

The problem is may_create_in_sticky, which rejects the open even though
the file has already been created/opened. Bypass the checks in
may_create_in_sticky if the task has CAP_FOWNER in the given namespace.

Link: https://bugzilla.redhat.com/show_bug.cgi?id=1976829
Reported-by: Yongchen Yang <yoyang@redhat.com>
Suggested-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/namei.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
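
To make the scenario in the commit message easier to follow, here is a rough userspace model of the decision for regular files only. It is a sketch under stated assumptions, not the kernel code: `struct sticky_ctx`, `model_may_create_in_sticky()` and `nfs_root_squash` are hypothetical names, plain integers stand in for kuid_t and the idmapping helpers, only the world-writable branch of the later check is modelled, and the directory owner (uid 0) and the numeric id for nobody (65534) are assumed values that the report does not state.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the state the kernel check consults. */
struct sticky_ctx {
	bool protected_regular;    /* fs.protected_regular sysctl enabled */
	bool dir_sticky;           /* parent directory has S_ISVTX set */
	bool dir_world_writable;   /* parent directory mode & 0002 */
	unsigned int dir_uid;      /* owner of the parent directory */
	unsigned int file_uid;     /* owner of the file being opened */
	unsigned int fsuid;        /* caller's filesystem uid */
	bool cap_fowner;           /* caller holds CAP_FOWNER */
};

/*
 * Model of the regular-file path. `patched` selects whether the
 * CAP_FOWNER bypass proposed in the patch is applied.
 * Returns 0 if the open may proceed, -EACCES otherwise.
 */
int model_may_create_in_sticky(const struct sticky_ctx *c, bool patched)
{
	assert(c != NULL);

	if (!c->protected_regular ||
	    !c->dir_sticky ||
	    c->file_uid == c->dir_uid ||
	    c->fsuid == c->file_uid ||
	    (patched && c->cap_fowner))
		return 0;

	/* Simplification: only the world-writable case is modelled. */
	return c->dir_world_writable ? -EACCES : 0;
}

/* Root on the client, file squashed to nobody on the server. */
const struct sticky_ctx nfs_root_squash = {
	.protected_regular = true,
	.dir_sticky = true,
	.dir_world_writable = true,   /* mode 01777 */
	.dir_uid = 0,                 /* assumption: directory owned by root */
	.file_uid = 65534,            /* assumption: numeric id of "nobody" */
	.fsuid = 0,
	.cap_fowner = true,
};
```

With these assumed values, `model_may_create_in_sticky(&nfs_root_squash, false)` returns -EACCES and the same call with `patched` set to true returns 0, which mirrors the failure described in the report and the effect of the proposed bypass. The model deliberately ignores the objection raised in the replies about pre-existing files versus files created by the caller.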