Message ID | cover.1709675979.git.mattbobrowski@google.com (mailing list archive) |
---|---|
Headers | show |
Series | add new acquire/release BPF kfuncs | expand |
On Wed, Mar 06, 2024 at 07:39:14AM +0000, Matt Bobrowski wrote: > G'day All, > > The original cover letter providing background context and motivating > factors around the needs for the BPF kfuncs introduced within this > patch series can be found here [0], so please do reference that if > need be. > > Notably, one of the main contention points within v1 of this patch > series was that we were effectively leaning on some preexisting > in-kernel APIs such as get_task_exe_file() and get_mm_exe_file() > within some of the newly introduced BPF kfuncs. As noted in my > response here [1] though, I struggle to understand the technical > reasoning behind why exposing such in-kernel helpers, specifically > only to BPF LSM program types in the form of BPF kfuncs, is inherently > a terrible idea. So, until someone provides me with a sound technical > explanation as to why this cannot or should not be done, I'll continue > to lean on them. The alternative is to reimplement the necessary > in-kernel APIs within the BPF kfuncs, but that's just nonsensical IMO. You may lean as much as you like. What I've reacted to is that you've (not you specifically, I'm sure) messed up. You've exposed d_path() to users without understanding that it wasn't safe apparently. And now we get patches that use the self-inflicted brokeness as an argument to expose a bunch of other low-level helpers to fix that. The fact that it's "just bpf LSM" programs doesn't alleviate any concerns whatsoever. Not just because that is just an entry vector but also because we have LSMs induced API abuse that we only ever get to see the fallout from when we refactor apis and then it causes pain for the vfs. I'll take another look at the proposed helpers you need as bpf kfuncs and I'll give my best not to be overly annoyed by all of this. I have no intention of not helping you quite the opposite but I'm annoyed that we're here in the first place. What I want is to stop this madness of exposing stuff to users without fully understanding it's semantics and required guarantees.
On Wed, Mar 06, 2024 at 12:21:28PM +0100, Christian Brauner wrote: > On Wed, Mar 06, 2024 at 07:39:14AM +0000, Matt Bobrowski wrote: > > G'day All, > > > > The original cover letter providing background context and motivating > > factors around the needs for the BPF kfuncs introduced within this > > patch series can be found here [0], so please do reference that if > > need be. > > > > Notably, one of the main contention points within v1 of this patch > > series was that we were effectively leaning on some preexisting > > in-kernel APIs such as get_task_exe_file() and get_mm_exe_file() > > within some of the newly introduced BPF kfuncs. As noted in my > > response here [1] though, I struggle to understand the technical > > reasoning behind why exposing such in-kernel helpers, specifically > > only to BPF LSM program types in the form of BPF kfuncs, is inherently > > a terrible idea. So, until someone provides me with a sound technical > > explanation as to why this cannot or should not be done, I'll continue > > to lean on them. The alternative is to reimplement the necessary > > in-kernel APIs within the BPF kfuncs, but that's just nonsensical IMO. > > You may lean as much as you like. What I've reacted to is that you've > (not you specifically, I'm sure) messed up. You've exposed d_path() to > users without understanding that it wasn't safe apparently. > > And now we get patches that use the self-inflicted brokeness as an > argument to expose a bunch of other low-level helpers to fix that. > > The fact that it's "just bpf LSM" programs doesn't alleviate any > concerns whatsoever. Not just because that is just an entry vector but > also because we have LSMs induced API abuse that we only ever get to see > the fallout from when we refactor apis and then it causes pain for the vfs. > > I'll take another look at the proposed helpers you need as bpf kfuncs > and I'll give my best not to be overly annoyed by all of this. I have no > intention of not helping you quite the opposite but I'm annoyed that > we're here in the first place. > > What I want is to stop this madness of exposing stuff to users without > fully understanding it's semantics and required guarantees. So, looking at this series you're now asking us to expose: (1) mmgrab() (2) mmput() (3) fput() (5) get_mm_exe_file() (4) get_task_exe_file() (7) get_task_fs_pwd() (6) get_task_fs_root() (8) path_get() (9) path_put() in one go and the justification in all patches amounts to "This is common in some BPF LSM programs". So, broken stuff got exposed to users or at least a broken BPF LSM program was written somewhere out there that is susceptible to UAFs becauase you didn't restrict bpf_d_path() to trusted pointer arguments. So you're now scrambling to fix this by asking for a bunch of low-level exports. What is the guarantee that you don't end up writing another BPF LSM that abuses these exports in a way that causes even more issues and then someone else comes back asking for the next round of bpf funcs to be exposed to fix it. The difference between a regular LSM asking about this and a BPF LSM program is that we can see in the hook implementation what the LSM intends to do with this and we can judge whether that's safe or not. Here you're asking us to do this blindfolded. So I feel we can't really ACK a series such as this without actually seeing what is intended to be done with all these helpers that you want as kfuncs. Not after the previous brokenness. In any case, you need separate ACKs from mm for the mmgrab()/mmput() kfuncs as well. Because really, all I see immediately supportable is the addition of a safe variant of bpf making use of the trusted pointer argument constraint: [PATCH v2 bpf-next 8/9] bpf: add trusted d_path() based BPF kfunc bpf_path_d_path()
On Wed, Mar 6, 2024 at 7:14 AM Christian Brauner <brauner@kernel.org> wrote: > On Wed, Mar 06, 2024 at 12:21:28PM +0100, Christian Brauner wrote: > > On Wed, Mar 06, 2024 at 07:39:14AM +0000, Matt Bobrowski wrote: > > > G'day All, > > > > > > The original cover letter providing background context and motivating > > > factors around the needs for the BPF kfuncs introduced within this > > > patch series can be found here [0], so please do reference that if > > > need be. > > > > > > Notably, one of the main contention points within v1 of this patch > > > series was that we were effectively leaning on some preexisting > > > in-kernel APIs such as get_task_exe_file() and get_mm_exe_file() > > > within some of the newly introduced BPF kfuncs. As noted in my > > > response here [1] though, I struggle to understand the technical > > > reasoning behind why exposing such in-kernel helpers, specifically > > > only to BPF LSM program types in the form of BPF kfuncs, is inherently > > > a terrible idea. So, until someone provides me with a sound technical > > > explanation as to why this cannot or should not be done, I'll continue > > > to lean on them. The alternative is to reimplement the necessary > > > in-kernel APIs within the BPF kfuncs, but that's just nonsensical IMO. > > > > You may lean as much as you like. What I've reacted to is that you've > > (not you specifically, I'm sure) messed up. You've exposed d_path() to > > users without understanding that it wasn't safe apparently. > > > > And now we get patches that use the self-inflicted brokeness as an > > argument to expose a bunch of other low-level helpers to fix that. > > > > The fact that it's "just bpf LSM" programs doesn't alleviate any > > concerns whatsoever. Not just because that is just an entry vector but > > also because we have LSMs induced API abuse that we only ever get to see > > the fallout from when we refactor apis and then it causes pain for the vfs. Let's not start throwing stones at the in-tree, native LSM folks; instead let's just all admit that it's often a challenge working across subsystem lines, but we usually find a way because we all want the kernel to move forward. We've all got our grudges and scars, let's not start poking each other with sticks. > > I'll take another look at the proposed helpers you need as bpf kfuncs > > and I'll give my best not to be overly annoyed by all of this. I have no > > intention of not helping you quite the opposite but I'm annoyed that > > we're here in the first place. > > > > What I want is to stop this madness of exposing stuff to users without > > fully understanding it's semantics and required guarantees. > > So, looking at this series you're now asking us to expose: > > (1) mmgrab() > (2) mmput() > (3) fput() > (5) get_mm_exe_file() > (4) get_task_exe_file() > (7) get_task_fs_pwd() > (6) get_task_fs_root() > (8) path_get() > (9) path_put() > > in one go and the justification in all patches amounts to "This is > common in some BPF LSM programs". > > So, broken stuff got exposed to users or at least a broken BPF LSM > program was written somewhere out there that is susceptible to UAFs > becauase you didn't restrict bpf_d_path() to trusted pointer arguments. > So you're now scrambling to fix this by asking for a bunch of low-level > exports. > > What is the guarantee that you don't end up writing another BPF LSM that > abuses these exports in a way that causes even more issues and then > someone else comes back asking for the next round of bpf funcs to be > exposed to fix it. > > The difference between a regular LSM asking about this and a BPF LSM > program is that we can see in the hook implementation what the LSM > intends to do with this and we can judge whether that's safe or not. I wish things were different, but the current reality is that from a practical perspective BPF LSMs are not much different than out-of-tree, native LSMs. There is no requirement, or even convention, for BPF LSMs to post their designs, docs, or implementations on the LSM list for review like we do with other LSMs; maybe it has happened once or twice, but I can't remember the last time I saw a serious BPF LSM posted to the LSM list. Even if someone did do The Right Thing with their BPF LSM and proposed it for review just like a native LSM, there is no guarantee someone else couldn't fork that LSM and do something wrong (intentional or not). Unless something changes, I'm left to treat BPF LSMs the same as any other out-of-tree kernel code: I'm not going to make life difficult for a /out-of-tree/BPF/ LSM simply because it is /out-of-tree/BPF/, but I'm not going to do anything to negatively impact the current, or future, state of the in-tree LSMs simply to appease the /out-of-tree/BPF/ LSMs.
On Wed, Mar 6, 2024 at 4:13 AM Christian Brauner <brauner@kernel.org> wrote: > > On Wed, Mar 06, 2024 at 12:21:28PM +0100, Christian Brauner wrote: > > On Wed, Mar 06, 2024 at 07:39:14AM +0000, Matt Bobrowski wrote: > > > G'day All, > > > > > > The original cover letter providing background context and motivating > > > factors around the needs for the BPF kfuncs introduced within this > > > patch series can be found here [0], so please do reference that if > > > need be. > > > > > > Notably, one of the main contention points within v1 of this patch > > > series was that we were effectively leaning on some preexisting > > > in-kernel APIs such as get_task_exe_file() and get_mm_exe_file() > > > within some of the newly introduced BPF kfuncs. As noted in my > > > response here [1] though, I struggle to understand the technical > > > reasoning behind why exposing such in-kernel helpers, specifically > > > only to BPF LSM program types in the form of BPF kfuncs, is inherently > > > a terrible idea. So, until someone provides me with a sound technical > > > explanation as to why this cannot or should not be done, I'll continue > > > to lean on them. The alternative is to reimplement the necessary > > > in-kernel APIs within the BPF kfuncs, but that's just nonsensical IMO. > > > > You may lean as much as you like. What I've reacted to is that you've > > (not you specifically, I'm sure) messed up. You've exposed d_path() to > > users without understanding that it wasn't safe apparently. > > > > And now we get patches that use the self-inflicted brokeness as an > > argument to expose a bunch of other low-level helpers to fix that. > > > > The fact that it's "just bpf LSM" programs doesn't alleviate any > > concerns whatsoever. Not just because that is just an entry vector but > > also because we have LSMs induced API abuse that we only ever get to see > > the fallout from when we refactor apis and then it causes pain for the vfs. > > > > I'll take another look at the proposed helpers you need as bpf kfuncs > > and I'll give my best not to be overly annoyed by all of this. I have no > > intention of not helping you quite the opposite but I'm annoyed that > > we're here in the first place. > > > > What I want is to stop this madness of exposing stuff to users without > > fully understanding it's semantics and required guarantees. > > So, looking at this series you're now asking us to expose: > > (1) mmgrab() > (2) mmput() > (3) fput() > (5) get_mm_exe_file() > (4) get_task_exe_file() > (7) get_task_fs_pwd() > (6) get_task_fs_root() > (8) path_get() > (9) path_put() > > in one go and the justification in all patches amounts to "This is > common in some BPF LSM programs". > > So, broken stuff got exposed to users or at least a broken BPF LSM > program was written somewhere out there that is susceptible to UAFs > becauase you didn't restrict bpf_d_path() to trusted pointer arguments. > So you're now scrambling to fix this by asking for a bunch of low-level > exports. > > What is the guarantee that you don't end up writing another BPF LSM that > abuses these exports in a way that causes even more issues and then > someone else comes back asking for the next round of bpf funcs to be > exposed to fix it. There is no guarantee. We made a safety mistake with bpf_d_path() though we restricted it very tight. And that UAF is tricky. I'm still amazed how Jann managed to find it. We all make mistakes. It's not the first one and not going to be the last. What Matt is doing is an honest effort to fix it in the upstream kernel for all bpf users to benefit. He could have done it with a kernel module. The above "low level" helpers are all either static inline in .h or they call EXPORT_SYMBOL[_GPL] or simply inc/dec refcnt. One can implement such kfuncs in an out of tree kernel module and be done with it, but in the bpf community we encourage everyone to upstream their work. So kudos to Matt for working on these patches. His bpf-lsm use case is not special. It just needs a safe way to call d_path. Let's look at one of the test in patch 9: +SEC("lsm.s/file_open") +__failure __msg("R1 must be referenced or trusted") +int BPF_PROG(path_d_path_kfunc_untrusted_from_current) +{ + struct path *pwd; + struct task_struct *current; + + current = bpf_get_current_task_btf(); + /* Walking a trusted pointer returned from bpf_get_current_task_btf() + * yields and untrusted pointer. */ + pwd = ¤t->fs->pwd; + bpf_path_d_path(pwd, buf, sizeof(buf)); + return 0; +} This test checks that such an access pattern is unsafe and the verifier will catch it. To make it safe one needs to do: current = bpf_get_current_task_btf(); pwd = bpf_get_task_fs_pwd(current); if (!pwd) // error path bpf_path_d_path(pwd, ...); bpf_put_path(pwd); these are the kfuncs from patch 6. And notice that they have KF_ACQUIRE and KF_RELEASE flags. They tell the verifier to recognize that bpf_get_task_fs_pwd() kfunc acquires 'struct path *'. Meaning that bpf prog cannot just return without releasing it. The bpf prog cannot use-after-free that 'pwd' either after it was released by bpf_put_path(pwd). The verifier static analysis catches such UAF-s. It didn't catch Jann's UAF earlier, because we didn't have these kfuncs! Hence the fix is to add such kfuncs with acquire/release semantics. > The difference between a regular LSM asking about this and a BPF LSM > program is that we can see in the hook implementation what the LSM > intends to do with this and we can judge whether that's safe or not. See above example. The verifier is doing a much better job than humans when it comes to safety. > Here you're asking us to do this blindfolded. If you don't trust the verifier to enforce safety, you shouldn't trust Rust compiler to generate safe code either. In another reply you've compared kfuncs to EXPORT_SYMBOL_GPL. Such analogy is correct to some extent, but unlike exported symbols kfuncs are restricted to particular program types. They don't accept arbitrary pointers, and reference count is enforced as well. That's a pretty big difference vs EXPORT_SYMBOL. > Because really, all I see immediately supportable is the addition of a > safe variant of bpf making use of the trusted pointer argument > constraint: > > [PATCH v2 bpf-next 8/9] bpf: add trusted d_path() based BPF kfunc bpf_path_d_path() This kfunc alone is useful in limited cases. See above example. For simple LSM file_open hook the bpf prog needs bpf_get_task_fs_pwd() to use this d_path kfunc.
On Wed, Mar 06, 2024 at 08:05:05PM -0800, Alexei Starovoitov wrote: > On Wed, Mar 6, 2024 at 4:13 AM Christian Brauner <brauner@kernel.org> wrote: > > > > On Wed, Mar 06, 2024 at 12:21:28PM +0100, Christian Brauner wrote: > > > On Wed, Mar 06, 2024 at 07:39:14AM +0000, Matt Bobrowski wrote: > > > > G'day All, > > > > > > > > The original cover letter providing background context and motivating > > > > factors around the needs for the BPF kfuncs introduced within this > > > > patch series can be found here [0], so please do reference that if > > > > need be. > > > > > > > > Notably, one of the main contention points within v1 of this patch > > > > series was that we were effectively leaning on some preexisting > > > > in-kernel APIs such as get_task_exe_file() and get_mm_exe_file() > > > > within some of the newly introduced BPF kfuncs. As noted in my > > > > response here [1] though, I struggle to understand the technical > > > > reasoning behind why exposing such in-kernel helpers, specifically > > > > only to BPF LSM program types in the form of BPF kfuncs, is inherently > > > > a terrible idea. So, until someone provides me with a sound technical > > > > explanation as to why this cannot or should not be done, I'll continue > > > > to lean on them. The alternative is to reimplement the necessary > > > > in-kernel APIs within the BPF kfuncs, but that's just nonsensical IMO. > > > > > > You may lean as much as you like. What I've reacted to is that you've > > > (not you specifically, I'm sure) messed up. You've exposed d_path() to > > > users without understanding that it wasn't safe apparently. > > > > > > And now we get patches that use the self-inflicted brokeness as an > > > argument to expose a bunch of other low-level helpers to fix that. > > > > > > The fact that it's "just bpf LSM" programs doesn't alleviate any > > > concerns whatsoever. Not just because that is just an entry vector but > > > also because we have LSMs induced API abuse that we only ever get to see > > > the fallout from when we refactor apis and then it causes pain for the vfs. > > > > > > I'll take another look at the proposed helpers you need as bpf kfuncs > > > and I'll give my best not to be overly annoyed by all of this. I have no > > > intention of not helping you quite the opposite but I'm annoyed that > > > we're here in the first place. > > > > > > What I want is to stop this madness of exposing stuff to users without > > > fully understanding it's semantics and required guarantees. > > > > So, looking at this series you're now asking us to expose: > > > > (1) mmgrab() > > (2) mmput() > > (3) fput() > > (5) get_mm_exe_file() > > (4) get_task_exe_file() > > (7) get_task_fs_pwd() > > (6) get_task_fs_root() > > (8) path_get() > > (9) path_put() > > > > in one go and the justification in all patches amounts to "This is > > common in some BPF LSM programs". > > > > So, broken stuff got exposed to users or at least a broken BPF LSM > > program was written somewhere out there that is susceptible to UAFs > > becauase you didn't restrict bpf_d_path() to trusted pointer arguments. > > So you're now scrambling to fix this by asking for a bunch of low-level > > exports. > > > > What is the guarantee that you don't end up writing another BPF LSM that > > abuses these exports in a way that causes even more issues and then > > someone else comes back asking for the next round of bpf funcs to be > > exposed to fix it. > > There is no guarantee. > We made a safety mistake with bpf_d_path() though > we restricted it very tight. And that UAF is tricky. > I'm still amazed how Jann managed to find it. > We all make mistakes. > It's not the first one and not going to be the last. > > What Matt is doing is an honest effort to fix it > in the upstream kernel for all bpf users to benefit. > He could have done it with a kernel module. > The above "low level" helpers are all either static inline > in .h or they call EXPORT_SYMBOL[_GPL] or simply inc/dec refcnt. > > One can implement such kfuncs in an out of tree kernel module > and be done with it, but in the bpf community we encourage > everyone to upstream their work. > > So kudos to Matt for working on these patches. > > His bpf-lsm use case is not special. > It just needs a safe way to call d_path. > > +SEC("lsm.s/file_open") > +__failure __msg("R1 must be referenced or trusted") > +int BPF_PROG(path_d_path_kfunc_untrusted_from_current) > +{ > + struct path *pwd; > + struct task_struct *current; > + > + current = bpf_get_current_task_btf(); > + /* Walking a trusted pointer returned from bpf_get_current_task_btf() > + * yields and untrusted pointer. */ > + pwd = ¤t->fs->pwd; > + bpf_path_d_path(pwd, buf, sizeof(buf)); > + return 0; > +} > > This test checks that such an access pattern is unsafe and > the verifier will catch it. > > To make it safe one needs to do: > > current = bpf_get_current_task_btf(); > pwd = bpf_get_task_fs_pwd(current); > if (!pwd) // error path > bpf_path_d_path(pwd, ...); > bpf_put_path(pwd); > > these are the kfuncs from patch 6. > > And notice that they have KF_ACQUIRE and KF_RELEASE flags. > > They tell the verifier to recognize that bpf_get_task_fs_pwd() > kfunc acquires 'struct path *'. > Meaning that bpf prog cannot just return without releasing it. > > The bpf prog cannot use-after-free that 'pwd' either > after it was released by bpf_put_path(pwd). > > The verifier static analysis catches such UAF-s. > It didn't catch Jann's UAF earlier, because we didn't have > these kfuncs! Hence the fix is to add such kfuncs with > acquire/release semantics. > > > The difference between a regular LSM asking about this and a BPF LSM > > program is that we can see in the hook implementation what the LSM > > intends to do with this and we can judge whether that's safe or not. > > See above example. > The verifier is doing a much better job than humans when it comes > to safety. > > > Here you're asking us to do this blindfolded. > > If you don't trust the verifier to enforce safety, > you shouldn't trust Rust compiler to generate safe code either. > > In another reply you've compared kfuncs to EXPORT_SYMBOL_GPL. > Such analogy is correct to some extent, > but unlike exported symbols kfuncs are restricted to particular > program types. They don't accept arbitrary pointers, > and reference count is enforced as well. > That's a pretty big difference vs EXPORT_SYMBOL. There's one fundamental question here that we'll need an official answer to: Is it ok for an out-of-tree BPF LSM program, that nobody has ever seen to request access to various helpers in the kernel? Because fundamentally this is what this patchset is asking to be done. If the ZFS out-of-tree kernel module were to send us a similar patch series asking us for a list of 9 functions that they'd like us to export what would the answer to that be? It would be "no" - on principle alone. So what is different between an out-of-tree BPF LSM program that no one even has ever seen and an out-of-tree kernel module that one can at least look at in Github? Why should we reject requests from the latter but are supposed to accept requests from the former? If we say yes to the BPF LSM program requests we would have to say yes to ZFS as well.
On Thu, Mar 7, 2024 at 4:55 AM Christian Brauner <brauner@kernel.org> wrote: > > There's one fundamental question here that we'll need an official answer to: > > Is it ok for an out-of-tree BPF LSM program, that nobody has ever seen > to request access to various helpers in the kernel? Phrased in a slightly different way, and a bit more generalized: do we treat out-of-tree BPF programs the same as we do with out-of-tree kernel modules? I believe that's the real question, and if we answer that, we should also have our answer for the internal helper function question. > Because fundamentally this is what this patchset is asking to be done. > > If the ZFS out-of-tree kernel module were to send us a similar patch > series asking us for a list of 9 functions that they'd like us to export > what would the answer to that be? It would be "no" - on principle alone. > > So what is different between an out-of-tree BPF LSM program that no one > even has ever seen and an out-of-tree kernel module that one can at > least look at in Github? Why should we reject requests from the latter > but are supposed to accept requests from the former? > > If we say yes to the BPF LSM program requests we would have to say yes > to ZFS as well.
On Thu, Mar 7, 2024 at 1:55 AM Christian Brauner <brauner@kernel.org> wrote: > > > > > > > So, looking at this series you're now asking us to expose: > > > > > > (1) mmgrab() > > > (2) mmput() > > > (3) fput() > > > (5) get_mm_exe_file() > > > (4) get_task_exe_file() > > > (7) get_task_fs_pwd() > > > (6) get_task_fs_root() > > > (8) path_get() > > > (9) path_put() > > > > > > in one go and the justification in all patches amounts to "This is > > > common in some BPF LSM programs". > > > > > > So, broken stuff got exposed to users or at least a broken BPF LSM > > > program was written somewhere out there that is susceptible to UAFs > > > becauase you didn't restrict bpf_d_path() to trusted pointer arguments. > > > So you're now scrambling to fix this by asking for a bunch of low-level > > > exports. > > > > > > What is the guarantee that you don't end up writing another BPF LSM that > > > abuses these exports in a way that causes even more issues and then > > > someone else comes back asking for the next round of bpf funcs to be > > > exposed to fix it. > > > > There is no guarantee. > > We made a safety mistake with bpf_d_path() though > > we restricted it very tight. And that UAF is tricky. > > I'm still amazed how Jann managed to find it. > > We all make mistakes. > > It's not the first one and not going to be the last. > > > > What Matt is doing is an honest effort to fix it > > in the upstream kernel for all bpf users to benefit. > > He could have done it with a kernel module. > > The above "low level" helpers are all either static inline > > in .h or they call EXPORT_SYMBOL[_GPL] or simply inc/dec refcnt. > > > > One can implement such kfuncs in an out of tree kernel module > > and be done with it, but in the bpf community we encourage > > everyone to upstream their work. > > > > So kudos to Matt for working on these patches. > > > > His bpf-lsm use case is not special. > > It just needs a safe way to call d_path. > > > > +SEC("lsm.s/file_open") > > +__failure __msg("R1 must be referenced or trusted") > > +int BPF_PROG(path_d_path_kfunc_untrusted_from_current) > > +{ > > + struct path *pwd; > > + struct task_struct *current; > > + > > + current = bpf_get_current_task_btf(); > > + /* Walking a trusted pointer returned from bpf_get_current_task_btf() > > + * yields and untrusted pointer. */ > > + pwd = ¤t->fs->pwd; > > + bpf_path_d_path(pwd, buf, sizeof(buf)); > > + return 0; > > +} > > > > This test checks that such an access pattern is unsafe and > > the verifier will catch it. > > > > To make it safe one needs to do: > > > > current = bpf_get_current_task_btf(); > > pwd = bpf_get_task_fs_pwd(current); > > if (!pwd) // error path > > bpf_path_d_path(pwd, ...); > > bpf_put_path(pwd); > > > > these are the kfuncs from patch 6. > > > > And notice that they have KF_ACQUIRE and KF_RELEASE flags. > > > > They tell the verifier to recognize that bpf_get_task_fs_pwd() > > kfunc acquires 'struct path *'. > > Meaning that bpf prog cannot just return without releasing it. > > > > The bpf prog cannot use-after-free that 'pwd' either > > after it was released by bpf_put_path(pwd). > > > > The verifier static analysis catches such UAF-s. > > It didn't catch Jann's UAF earlier, because we didn't have > > these kfuncs! Hence the fix is to add such kfuncs with > > acquire/release semantics. > > > > > The difference between a regular LSM asking about this and a BPF LSM > > > program is that we can see in the hook implementation what the LSM > > > intends to do with this and we can judge whether that's safe or not. > > > > See above example. > > The verifier is doing a much better job than humans when it comes > > to safety. > > > > > Here you're asking us to do this blindfolded. > > > > If you don't trust the verifier to enforce safety, > > you shouldn't trust Rust compiler to generate safe code either. > > > > In another reply you've compared kfuncs to EXPORT_SYMBOL_GPL. > > Such analogy is correct to some extent, > > but unlike exported symbols kfuncs are restricted to particular > > program types. They don't accept arbitrary pointers, > > and reference count is enforced as well. > > That's a pretty big difference vs EXPORT_SYMBOL. > > There's one fundamental question here that we'll need an official answer to: > > Is it ok for an out-of-tree BPF LSM program, that nobody has ever seen > to request access to various helpers in the kernel? obviously not. > Because fundamentally this is what this patchset is asking to be done. Pardon ? > If the ZFS out-of-tree kernel module were to send us a similar patch > series asking us for a list of 9 functions that they'd like us to export > what would the answer to that be? This patch set doesn't request any new EXPORT_SYMBOL. Quoting previous reply: " The above "low level" helpers are all either static inline in .h or they call EXPORT_SYMBOL[_GPL] or simply inc/dec refcnt. " Let's look at your list: > > > (1) mmgrab() > > > (2) mmput() > > > (3) fput() > > > (5) get_mm_exe_file() > > > (4) get_task_exe_file() > > > (7) get_task_fs_pwd() > > > (6) get_task_fs_root() > > > (8) path_get() > > > (9) path_put() First of all, there is no such thing as get_task_fs_pwd/root in the kernel. The hypothetical out-of-tree ZFS can use the rest just fine. One can argue that get_mm_exe_file() is not exported, but it's nothing but rcu_lock-wrap plus get_file_rcu() which is EXPORT_SYMBOL. kfunc-s in patch 6 wrap get_fs_root/pwd from linux/fs_struct.h that calls EXPORT_SYMBOL(path_get). So upon close examination no new EXPORT_SYMBOL-s were requested. Christian, I understand your irritation with anything bpf related due to our mistake with fd=0, but as I said earlier it was an honest mistake. There was no malicious intent. Time to move on.
On Thu, Mar 7, 2024 at 12:51 PM Paul Moore <paul@paul-moore.com> wrote: > > On Thu, Mar 7, 2024 at 4:55 AM Christian Brauner <brauner@kernel.org> wrote: > > > > There's one fundamental question here that we'll need an official answer to: > > > > Is it ok for an out-of-tree BPF LSM program, that nobody has ever seen > > to request access to various helpers in the kernel? > > Phrased in a slightly different way, and a bit more generalized: do we > treat out-of-tree BPF programs the same as we do with out-of-tree > kernel modules? I believe that's the real question, and if we answer > that, we should also have our answer for the internal helper function > question. From 10k ft view bpf programs may look like kernel modules, but looking closely they are very different. Modules can read/write any data structure and can call any exported function. All modules fall into two categories GPL or not. While bpf progs are divided by program type. Tracing progs can read any kernel memory safely via probe_read_kernel. Networking prog can read/write packets, but cannot read kernel memory. bpf_lsm programs can be called from lsm hooks and call only kfuncs that were explicitly allowlisted to bpf_lsm prog type. Furthermore kfuncs have acquire/release semantics enforced by the verifier. For example, bpf progs can do bpf_rcu_read_lock() which is a wrapper around rcu_read_lock() and the verifier will make sure that bpf_rcu_read_unlock() is called. Under bpf_rcu_read_lock() bpf programs can dereference __rcu tagged fields and the verifier will track them as rcu protected objects until bpf_rcu_read_unlock(). In other words the verifier is doing sparse-on-steroids analysis and enforcing it. Kernel modules are not subject to such enforcement. One more distinction: 99.9% of bpf features require a GPL-ed bpf program. All kfuncs are GPL only.
On Thu, Mar 07, 2024 at 07:11:22PM -0800, Alexei Starovoitov wrote: > On Thu, Mar 7, 2024 at 1:55 AM Christian Brauner <brauner@kernel.org> wrote: > > > > > > > > > > So, looking at this series you're now asking us to expose: > > > > > > > > (1) mmgrab() > > > > (2) mmput() > > > > (3) fput() > > > > (5) get_mm_exe_file() > > > > (4) get_task_exe_file() > > > > (7) get_task_fs_pwd() > > > > (6) get_task_fs_root() > > > > (8) path_get() > > > > (9) path_put() > > > > > > > > in one go and the justification in all patches amounts to "This is > > > > common in some BPF LSM programs". > > > > > > > > So, broken stuff got exposed to users or at least a broken BPF LSM > > > > program was written somewhere out there that is susceptible to UAFs > > > > becauase you didn't restrict bpf_d_path() to trusted pointer arguments. > > > > So you're now scrambling to fix this by asking for a bunch of low-level > > > > exports. > > > > > > > > What is the guarantee that you don't end up writing another BPF LSM that > > > > abuses these exports in a way that causes even more issues and then > > > > someone else comes back asking for the next round of bpf funcs to be > > > > exposed to fix it. > > > > > > There is no guarantee. > > > We made a safety mistake with bpf_d_path() though > > > we restricted it very tight. And that UAF is tricky. > > > I'm still amazed how Jann managed to find it. > > > We all make mistakes. > > > It's not the first one and not going to be the last. > > > > > > What Matt is doing is an honest effort to fix it > > > in the upstream kernel for all bpf users to benefit. > > > He could have done it with a kernel module. > > > The above "low level" helpers are all either static inline > > > in .h or they call EXPORT_SYMBOL[_GPL] or simply inc/dec refcnt. > > > > > > One can implement such kfuncs in an out of tree kernel module > > > and be done with it, but in the bpf community we encourage > > > everyone to upstream their work. > > > > > > So kudos to Matt for working on these patches. > > > > > > His bpf-lsm use case is not special. > > > It just needs a safe way to call d_path. > > > > > > +SEC("lsm.s/file_open") > > > +__failure __msg("R1 must be referenced or trusted") > > > +int BPF_PROG(path_d_path_kfunc_untrusted_from_current) > > > +{ > > > + struct path *pwd; > > > + struct task_struct *current; > > > + > > > + current = bpf_get_current_task_btf(); > > > + /* Walking a trusted pointer returned from bpf_get_current_task_btf() > > > + * yields and untrusted pointer. */ > > > + pwd = ¤t->fs->pwd; > > > + bpf_path_d_path(pwd, buf, sizeof(buf)); > > > + return 0; > > > +} > > > > > > This test checks that such an access pattern is unsafe and > > > the verifier will catch it. > > > > > > To make it safe one needs to do: > > > > > > current = bpf_get_current_task_btf(); > > > pwd = bpf_get_task_fs_pwd(current); > > > if (!pwd) // error path > > > bpf_path_d_path(pwd, ...); > > > bpf_put_path(pwd); > > > > > > these are the kfuncs from patch 6. > > > > > > And notice that they have KF_ACQUIRE and KF_RELEASE flags. > > > > > > They tell the verifier to recognize that bpf_get_task_fs_pwd() > > > kfunc acquires 'struct path *'. > > > Meaning that bpf prog cannot just return without releasing it. > > > > > > The bpf prog cannot use-after-free that 'pwd' either > > > after it was released by bpf_put_path(pwd). > > > > > > The verifier static analysis catches such UAF-s. > > > It didn't catch Jann's UAF earlier, because we didn't have > > > these kfuncs! Hence the fix is to add such kfuncs with > > > acquire/release semantics. > > > > > > > The difference between a regular LSM asking about this and a BPF LSM > > > > program is that we can see in the hook implementation what the LSM > > > > intends to do with this and we can judge whether that's safe or not. > > > > > > See above example. > > > The verifier is doing a much better job than humans when it comes > > > to safety. > > > > > > > Here you're asking us to do this blindfolded. > > > > > > If you don't trust the verifier to enforce safety, > > > you shouldn't trust Rust compiler to generate safe code either. > > > > > > In another reply you've compared kfuncs to EXPORT_SYMBOL_GPL. > > > Such analogy is correct to some extent, > > > but unlike exported symbols kfuncs are restricted to particular > > > program types. They don't accept arbitrary pointers, > > > and reference count is enforced as well. > > > That's a pretty big difference vs EXPORT_SYMBOL. > > > > There's one fundamental question here that we'll need an official answer to: > > > > Is it ok for an out-of-tree BPF LSM program, that nobody has ever seen > > to request access to various helpers in the kernel? > > obviously not. > > > Because fundamentally this is what this patchset is asking to be done. > > Pardon ? > > > If the ZFS out-of-tree kernel module were to send us a similar patch > > series asking us for a list of 9 functions that they'd like us to export > > what would the answer to that be? > > This patch set doesn't request any new EXPORT_SYMBOL. In order for that out-of-tree BPF LSM program to get similar guarantees than an equivalent kernel module we need to export all nine functions for BPF. Included in that set are functions that aren't currently even exported to modules. These exports are specifically for an out-of-tree BPF LSM program that is not accessible to the public. The question in the other mail stands. > First of all, there is no such thing as get_task_fs_pwd/root > in the kernel. Yeah, we'd need specific helpers for a never seen before out-of-tree BPF LSM. I don't see how that's different from an out-of-tree kernel module. > One can argue that get_mm_exe_file() is not exported, > but it's nothing but rcu_lock-wrap plus get_file_rcu() > which is EXPORT_SYMBOL. Oh, good spot. That's an accident. get_file_rcu() definitely shouldn't be exported. So that'll be removed asap. > Christian, > > I understand your irritation with anything bpf related > due to our mistake with fd=0, but as I said earlier it was > an honest mistake. There was no malicious intent. > Time to move on. I understand your concern but I'm neither irritated by bpf nor do I hold grudges. As I'm writing this bpf's bpffs and token changes are on their way into mainline for v6.9 which I reviewed and acked immediately after that discussion. Rest assures that you can move on from that concern. There's no need to keep bringing it up in unrelated discussions.
On Thu, Mar 07, 2024 at 07:25:05PM -0800, Alexei Starovoitov wrote: > On Thu, Mar 7, 2024 at 12:51 PM Paul Moore <paul@paul-moore.com> wrote: > > > > On Thu, Mar 7, 2024 at 4:55 AM Christian Brauner <brauner@kernel.org> wrote: > > > > > > There's one fundamental question here that we'll need an official answer to: > > > > > > Is it ok for an out-of-tree BPF LSM program, that nobody has ever seen > > > to request access to various helpers in the kernel? > > > > Phrased in a slightly different way, and a bit more generalized: do we > > treat out-of-tree BPF programs the same as we do with out-of-tree > > kernel modules? I believe that's the real question, and if we answer > > that, we should also have our answer for the internal helper function > > question. > > From 10k ft view bpf programs may look like kernel modules, > but looking closely they are very different. > > Modules can read/write any data structure and can call any exported function. > All modules fall into two categories GPL or not. > While bpf progs are divided by program type. > Tracing progs can read any kernel memory safely via probe_read_kernel. > Networking prog can read/write packets, but cannot read kernel memory. > bpf_lsm programs can be called from lsm hooks and > call only kfuncs that were explicitly allowlisted to bpf_lsm prog type. > Furthermore kfuncs have acquire/release semantics enforced by > the verifier. > For example, bpf progs can do bpf_rcu_read_lock() which is > a wrapper around rcu_read_lock() and the verifier will make sure > that bpf_rcu_read_unlock() is called. > Under bpf_rcu_read_lock() bpf programs can dereference __rcu tagged > fields and the verifier will track them as rcu protected objects > until bpf_rcu_read_unlock(). > In other words the verifier is doing sparse-on-steroids analysis > and enforcing it. > Kernel modules are not subject to such enforcement. > > One more distinction: 99.9% of bpf features require a GPL-ed bpf program. > All kfuncs are GPL only. While these are certainly all very nice properties it doesn't change the core question. An out of-tree BPF LSM program is requesting us to add nine vfs helpers to the BPF kfunc api. That out-of-tree BPF LSM program is not even accessible to the public. There is no meaningful difference to an out-of-tree kernel module asking us to add nine functions to the export api. The safety properties don't matter for this. Clearly, they didn't prevent the previous bpf_d_path() stuff from happening. We need rules for how this is supposed to be handled. Unless those rules are clearly specified and make sense we shouldn't be adding BPF kfuncs based on requests from out-of-tree BPF LSMs that amount to "we need this".
On Fri, Mar 8, 2024 at 2:36 AM Christian Brauner <brauner@kernel.org> wrote: > > > These exports are specifically for an out-of-tree BPF LSM program that > is not accessible to the public. The question in the other mail stands. The question was already answered. You just don't like the answer. bpf progs are not equivalent to kernel modules. They have completely different safety and visibility properties. The safety part I already talked about. Sounds like the visibility has to be explained. Kernel modules are opaque binary blobs. bpf programs are fully transparent. The intent is known to the verifier and to anyone with understanding of bpf assembly. Those that cannot read bpf asm can read C source code that is embedded in the bpf program in kernel memory. It's not the same as "llvm-dwarfdump module.ko" on disk. The bpf prog source code is loaded into the kernel at program verification time for debugging and visibility reasons. If there is a verifier bug and bpf manages to crash the kernel vmcore will have relevant lines of program C source code right there. Hence out-of-tree or in-tree bpf makes no practical difference. The program cannot hide its meaning and doesn't hamper debugging. Hence adding EXPORT_SYMBOL == Brace for impact! Expect crashes, api misuse and what not. While adding bpf_kfunc is a nop for kernel development. If kfunc is in the way of code refactoring it can be removed (as we demonstrated several times). A kfunc won't cause headaches for the kernel code it is calling (assuming no verifier bugs). If there is a bug it's on us to fix it as we demonstrated in the past. For example: bpf_probe_read_kernel(). It's a wrapper of copy_from_kernel_nofault() and over the years bpf users hit various bugs in copy_from_kernel_nofault(), reported them, and _bpf developers_ fixed them. Though copy_from_kernel_nofault() is as generic as it can get and the same bugs could have been reproduced without bpf we took care of fixing these parts of the kernel. Look at path_put(). It's EXPORT_SYMBOL and any kernel module can easily screw up reference counting, so that sooner or later distro folks will experience debug pains due to out-of-tree drivers. kfunc that calls path_put() won't have such consequences. The verifier will prevent path_put() on a pointer that wasn't acquired by the same bpf program. No support pains. It's a nop for vfs folks. > > First of all, there is no such thing as get_task_fs_pwd/root > > in the kernel. > > Yeah, we'd need specific helpers for a never seen before out-of-tree BPF > LSM. I don't see how that's different from an out-of-tree kernel module. Sorry, but you don't seem to understand what bpf can and cannot do, hence they look similar. > > One can argue that get_mm_exe_file() is not exported, > > but it's nothing but rcu_lock-wrap plus get_file_rcu() > > which is EXPORT_SYMBOL. > > Oh, good spot. That's an accident. get_file_rcu() definitely shouldn't > be exported. So that'll be removed asap. So, just to make a point that "Included in that set are functions that aren't currently even exported to modules" you want to un-export get_file_rcu() ? Because as the patch stands today everything that these kfuncs are calling is EXPORT_SYMBOL.
On Fri, Mar 08, 2024 at 05:23:30PM -0800, Alexei Starovoitov wrote: > On Fri, Mar 8, 2024 at 2:36 AM Christian Brauner <brauner@kernel.org> wrote: > > > > > > These exports are specifically for an out-of-tree BPF LSM program that > > is not accessible to the public. The question in the other mail stands. > > The question was already answered. You just don't like the answer. > bpf progs are not equivalent to kernel modules. > They have completely different safety and visibility properties. > The safety part I already talked about. > Sounds like the visibility has to be explained. > Kernel modules are opaque binary blobs. > bpf programs are fully transparent. The intent is known > to the verifier and to anyone with understanding > of bpf assembly. > Those that cannot read bpf asm can read C source code that is > embedded in the bpf program in kernel memory. > It's not the same as "llvm-dwarfdump module.ko" on disk. > The bpf prog source code is loaded into the kernel > at program verification time for debugging and visibility reasons. > If there is a verifier bug and bpf manages to crash the kernel > vmcore will have relevant lines of program C source code right there. > > Hence out-of-tree or in-tree bpf makes no practical difference. > The program cannot hide its meaning and doesn't hamper debugging. > > Hence adding EXPORT_SYMBOL == Brace for impact! > Expect crashes, api misuse and what not. > > While adding bpf_kfunc is a nop for kernel development. > If kfunc is in the way of code refactoring it can be removed > (as we demonstrated several times). > A kfunc won't cause headaches for the kernel code it is > calling (assuming no verifier bugs). > If there is a bug it's on us to fix it as we demonstrated in the past. > For example: bpf_probe_read_kernel(). > It's a wrapper of copy_from_kernel_nofault() and over the years > bpf users hit various bugs in copy_from_kernel_nofault(), > reported them, and _bpf developers_ fixed them. > Though copy_from_kernel_nofault() is as generic as it can get > and the same bugs could have been reproduced without bpf > we took care of fixing these parts of the kernel. > > Look at path_put(). > It's EXPORT_SYMBOL and any kernel module can easily screw up > reference counting, so that sooner or later distro folks > will experience debug pains due to out-of-tree drivers. > > kfunc that calls path_put() won't have such consequences. > The verifier will prevent path_put() on a pointer that wasn't > acquired by the same bpf program. No support pains. > It's a nop for vfs folks. > > > > First of all, there is no such thing as get_task_fs_pwd/root > > > in the kernel. > > > > Yeah, we'd need specific helpers for a never seen before out-of-tree BPF > > LSM. I don't see how that's different from an out-of-tree kernel module. > > Sorry, but you don't seem to understand what bpf can and cannot do, > hence they look similar. Maybe. On the other hand you seem to ignore what I'm saying. You currently don't have a clear set of rules for when it's ok for someone to send patches and request access to bpf kfuncs to implement a new BPF program. This patchset very much illustrates this point. The safety properties of bpf don't matter for this. And again, your safety properties very much didn't protect you from your bpf_d_path() mess. We're not even clearly told where and how these helper are supposed to be used. That's not ok and will never be ok. As long as there are no clear criteria to operate under this is highly problematic. This may be fine from a bpf perspective and one can even understand why because that's apparently your model or promise to your users. But there's no reason to expect the same level of laxness from any of the subsystems you're requesting kfuncs from. > > > One can argue that get_mm_exe_file() is not exported, > > > but it's nothing but rcu_lock-wrap plus get_file_rcu() > > > which is EXPORT_SYMBOL. > > > > Oh, good spot. That's an accident. get_file_rcu() definitely shouldn't > > be exported. So that'll be removed asap. > > So, just to make a point that > "Included in that set are functions that aren't currently even > exported to modules" > you want to un-export get_file_rcu() ? No. The reason it was exported was because of the drm subsystem and we already quite disliked that. But it turned out that's not needed so in commit 61d4fb0b349e ("file, i915: fix file reference for mmap_singleton()") they were moved away from this helper. And then we simply forgot to unexport it. A helper such as get_file_rcu() is called on a file object that is subject to SLAB_TYPESAFE_BY_RCU semantics where the caller doesn't hold a reference. The semantics of that are maybe understood by a couple of people in the kernel. There is absolutely no way that any userspace will get access to such low-level helpers. They have zero business to be involved in the lifetimes of objects on this level just as no module has. So really, this is an orthogonal cleanup.
Hey Christian, On Mon, Mar 11, 2024 at 01:00:56PM +0100, Christian Brauner wrote: > On Fri, Mar 08, 2024 at 05:23:30PM -0800, Alexei Starovoitov wrote: > > On Fri, Mar 8, 2024 at 2:36 AM Christian Brauner <brauner@kernel.org> wrote: > > > > > > > > > These exports are specifically for an out-of-tree BPF LSM program that > > > is not accessible to the public. The question in the other mail stands. > > > > The question was already answered. You just don't like the answer. > > bpf progs are not equivalent to kernel modules. > > They have completely different safety and visibility properties. > > The safety part I already talked about. > > Sounds like the visibility has to be explained. > > Kernel modules are opaque binary blobs. > > bpf programs are fully transparent. The intent is known > > to the verifier and to anyone with understanding > > of bpf assembly. > > Those that cannot read bpf asm can read C source code that is > > embedded in the bpf program in kernel memory. > > It's not the same as "llvm-dwarfdump module.ko" on disk. > > The bpf prog source code is loaded into the kernel > > at program verification time for debugging and visibility reasons. > > If there is a verifier bug and bpf manages to crash the kernel > > vmcore will have relevant lines of program C source code right there. > > > > Hence out-of-tree or in-tree bpf makes no practical difference. > > The program cannot hide its meaning and doesn't hamper debugging. > > > > Hence adding EXPORT_SYMBOL == Brace for impact! > > Expect crashes, api misuse and what not. > > > > While adding bpf_kfunc is a nop for kernel development. > > If kfunc is in the way of code refactoring it can be removed > > (as we demonstrated several times). > > A kfunc won't cause headaches for the kernel code it is > > calling (assuming no verifier bugs). > > If there is a bug it's on us to fix it as we demonstrated in the past. > > For example: bpf_probe_read_kernel(). > > It's a wrapper of copy_from_kernel_nofault() and over the years > > bpf users hit various bugs in copy_from_kernel_nofault(), > > reported them, and _bpf developers_ fixed them. > > Though copy_from_kernel_nofault() is as generic as it can get > > and the same bugs could have been reproduced without bpf > > we took care of fixing these parts of the kernel. > > > > Look at path_put(). > > It's EXPORT_SYMBOL and any kernel module can easily screw up > > reference counting, so that sooner or later distro folks > > will experience debug pains due to out-of-tree drivers. > > > > kfunc that calls path_put() won't have such consequences. > > The verifier will prevent path_put() on a pointer that wasn't > > acquired by the same bpf program. No support pains. > > It's a nop for vfs folks. > > > > > > First of all, there is no such thing as get_task_fs_pwd/root > > > > in the kernel. > > > > > > Yeah, we'd need specific helpers for a never seen before out-of-tree BPF > > > LSM. I don't see how that's different from an out-of-tree kernel module. > > > > Sorry, but you don't seem to understand what bpf can and cannot do, > > hence they look similar. > > Maybe. On the other hand you seem to ignore what I'm saying. You > currently don't have a clear set of rules for when it's ok for someone > to send patches and request access to bpf kfuncs to implement a new BPF > program. This patchset very much illustrates this point. The safety > properties of bpf don't matter for this. And again, your safety > properties very much didn't protect you from your bpf_d_path() mess. > > We're not even clearly told where and how these helper are supposed to be > used. That's not ok and will never be ok. As long as there are no clear > criteria to operate under this is highly problematic. This may be fine > from a bpf perspective and one can even understand why because that's > apparently your model or promise to your users. But there's no reason to > expect the same level of laxness from any of the subsystems you're > requesting kfuncs from. You raise a completely fair point, and I truly do apologies for the lack of context and in depth explanations around the specific situations that the proposed BPF kfuncs are intended to be used from. Admittedly, that's a failure on my part, and I can completely understand why from a maintainers point of view there would be reservations around acknowledging requests for adding such invisible dependencies. Now, I'm in a little bit of a tough situation as I'm unable to point you to an open-source BPF LSM implementation that intends to make use of such newly proposed BPF kfuncs. That's just an unfortunate constraint and circumstance that I'm having to deal with, so I'm just going to have to provide heavily redacted and incomplete example to illustrate how these BPF kfuncs intend to be used from BPF LSM programs that I personally work on here at Google. Notably though, the contexts that I do share here may obviously be a nonholistic view on how these newly introduced BPF kfuncs end up getting used in practice by some other completely arbitrary open-source BPF LSM programs. Anyway, as Alexei had pointed out in one of the prior responses, the core motivating factor behind introducing these newly proposed BPF kfuncs purely stems from the requirement of needing to call bpf_d_path() safely on a struct path from the context of a BPF LSM program, specifically within the security_file_open() and security_mmap_file() LSM hooks. Now, as noted within the original bug report [0], it's currently not considered safe to pluck a struct path out from an arbitrary in-kernel data structure, which in our case was current->mm->exe_file->f_path, and have it passed to bpf_d_path() from the aforementioned LSM hook points, or any other LSM hook point for that matter. So, without using these newly introduced BPF kfuncs, our BPF LSM program hanging off security_file_open() looks as follows: ``` int BPF_PROG(file_open, struct file *file) { // Perform a whole bunch of operations on the supplied file argument. This // includes some form of policy evaluation, and if there's a violation against // policy and auditing is enabled, then we eventually call bpf_d_path() on // file->f_path. Calling bpf_d_path() on the file argument isn't problematic // as we have a stable path here as the file argument is reference counted. struct path *target = &file->f_path; // ... struct task_struct *current = bpf_get_current_task_btf(); // ... bpf_rcu_read_lock(); // Reserve a slot on the BPF ring buffer such that the actor's path can be // passed back to userspace. void *buf = bpf_ringbuf_reserve(&ringbuf, PATH_MAX, 0); if (!buf) { goto unlock; } // For contextual purposes when performing an audit we also call bpf_d_path() // on the actor, being current->mm->exe_file->f_path. struct path *actor = ¤t->mm->exe_file->f_path; // Now perform the path resolution on the actor via bpf_d_path(). u64 ret = bpf_d_path(actor, buf, PATH_MAX); if (ret > 0) { bpf_ringbuf_submit(buf, BPF_RB_NO_WAKEUP); } else { bpf_ringbuf_discard(buf, 0); } unlock: bpf_rcu_read_unlock(); return 0; } ``` Post landing these BPF kfuncs, the BPF LSM program hanging off security_file_open() would be updated to make use of the acquire/release BPF kfuncs as below. Here I'm only making use of bpf_get_task_exe_file(), but similar usage also extends to bpf_get_task_fs_root() and bpf_get_task_fs_pwd(). ``` int BPF_PROG(file_open, struct file *file) { // Perform a whole bunch of operations on the supplied file argument. This // includes some form of policy evaluation, and if there's a violation against // policy and auditing is enabled, then we eventually call bpf_path_d_path() // on file->f_path. Calling bpf_path_d_path() on the file argument isn't // problematic as we have a stable path here as the file argument is trusted // and reference counted. struct path *target = &file->f_path; // ... struct task_struct *current = bpf_get_current_task_btf(); // ... // Reserve a slot on the BPF ring buffer such that the actor's path can be // passed back to userspace. void *buf = bpf_ringbuf_reserve(&ringbuf, PATH_MAX, 0); if (!buf) { return 0; } // For contextual purposes when performing an audit we also call // bpf_path_d_path() on the actor, being current->mm->exe_file->f_path. // Here we're operating on a stable trused and reference counted file, // thanks to bpf_get_task_exe_file(). struct file *exe_file = bpf_get_task_exe_file(current); if (!exe_file) { bpf_ringbuf_discard(buf, 0); return 0; } // Now perform the path resolution on the actor via bpf_path_d_path(), which // only accepts a trusted struct path. u64 ret = bpf_path_d_path(&exe_file->f_path, buf, PATH_MAX); if (ret > 0) { bpf_ringbuf_submit(buf, BPF_RB_NO_WAKEUP); } else { bpf_ringbuf_discard(buf, 0); } // Drop the reference on exe_file. bpf_put_file(exe_file); return 0; } ``` This is rather incredibly straightforward, but the fundamental difference between the two implementations is that one allows us to work on stable file, whereas the other does not. That's really it. Similarly, we do more or less the same for our BPF LSM program that hangs off security_mmap_file(). Do you need anything else which illustrates the proposed BPF kfunc usage? [0] https://lore.kernel.org/bpf/CAG48ez0ppjcT=QxU-jtCUfb5xQb3mLr=5FcwddF_VKfEBPs_Dg@mail.gmail.com/ /M
On Tue, Mar 12, 2024 at 05:06:36PM +0000, Matt Bobrowski wrote: > Hey Christian, > > On Mon, Mar 11, 2024 at 01:00:56PM +0100, Christian Brauner wrote: > > On Fri, Mar 08, 2024 at 05:23:30PM -0800, Alexei Starovoitov wrote: > > > On Fri, Mar 8, 2024 at 2:36 AM Christian Brauner <brauner@kernel.org> wrote: > > > > > > > > > > > > These exports are specifically for an out-of-tree BPF LSM program that > > > > is not accessible to the public. The question in the other mail stands. > > > > > > The question was already answered. You just don't like the answer. > > > bpf progs are not equivalent to kernel modules. > > > They have completely different safety and visibility properties. > > > The safety part I already talked about. > > > Sounds like the visibility has to be explained. > > > Kernel modules are opaque binary blobs. > > > bpf programs are fully transparent. The intent is known > > > to the verifier and to anyone with understanding > > > of bpf assembly. > > > Those that cannot read bpf asm can read C source code that is > > > embedded in the bpf program in kernel memory. > > > It's not the same as "llvm-dwarfdump module.ko" on disk. > > > The bpf prog source code is loaded into the kernel > > > at program verification time for debugging and visibility reasons. > > > If there is a verifier bug and bpf manages to crash the kernel > > > vmcore will have relevant lines of program C source code right there. > > > > > > Hence out-of-tree or in-tree bpf makes no practical difference. > > > The program cannot hide its meaning and doesn't hamper debugging. > > > > > > Hence adding EXPORT_SYMBOL == Brace for impact! > > > Expect crashes, api misuse and what not. > > > > > > While adding bpf_kfunc is a nop for kernel development. > > > If kfunc is in the way of code refactoring it can be removed > > > (as we demonstrated several times). > > > A kfunc won't cause headaches for the kernel code it is > > > calling (assuming no verifier bugs). > > > If there is a bug it's on us to fix it as we demonstrated in the past. > > > For example: bpf_probe_read_kernel(). > > > It's a wrapper of copy_from_kernel_nofault() and over the years > > > bpf users hit various bugs in copy_from_kernel_nofault(), > > > reported them, and _bpf developers_ fixed them. > > > Though copy_from_kernel_nofault() is as generic as it can get > > > and the same bugs could have been reproduced without bpf > > > we took care of fixing these parts of the kernel. > > > > > > Look at path_put(). > > > It's EXPORT_SYMBOL and any kernel module can easily screw up > > > reference counting, so that sooner or later distro folks > > > will experience debug pains due to out-of-tree drivers. > > > > > > kfunc that calls path_put() won't have such consequences. > > > The verifier will prevent path_put() on a pointer that wasn't > > > acquired by the same bpf program. No support pains. > > > It's a nop for vfs folks. > > > > > > > > First of all, there is no such thing as get_task_fs_pwd/root > > > > > in the kernel. > > > > > > > > Yeah, we'd need specific helpers for a never seen before out-of-tree BPF > > > > LSM. I don't see how that's different from an out-of-tree kernel module. > > > > > > Sorry, but you don't seem to understand what bpf can and cannot do, > > > hence they look similar. > > > > Maybe. On the other hand you seem to ignore what I'm saying. You > > currently don't have a clear set of rules for when it's ok for someone > > to send patches and request access to bpf kfuncs to implement a new BPF > > program. This patchset very much illustrates this point. The safety > > properties of bpf don't matter for this. And again, your safety > > properties very much didn't protect you from your bpf_d_path() mess. > > > > We're not even clearly told where and how these helper are supposed to be > > used. That's not ok and will never be ok. As long as there are no clear > > criteria to operate under this is highly problematic. This may be fine > > from a bpf perspective and one can even understand why because that's > > apparently your model or promise to your users. But there's no reason to > > expect the same level of laxness from any of the subsystems you're > > requesting kfuncs from. > > You raise a completely fair point, and I truly do apologies for the > lack of context and in depth explanations around the specific > situations that the proposed BPF kfuncs are intended to be used > from. Admittedly, that's a failure on my part, and I can completely > understand why from a maintainers point of view there would be > reservations around acknowledging requests for adding such invisible > dependencies. > > Now, I'm in a little bit of a tough situation as I'm unable to point > you to an open-source BPF LSM implementation that intends to make use > of such newly proposed BPF kfuncs. That's just an unfortunate > constraint and circumstance that I'm having to deal with, so I'm just > going to have to provide heavily redacted and incomplete example to > illustrate how these BPF kfuncs intend to be used from BPF LSM > programs that I personally work on here at Google. Notably though, the > contexts that I do share here may obviously be a nonholistic view on > how these newly introduced BPF kfuncs end up getting used in practice > by some other completely arbitrary open-source BPF LSM programs. > > Anyway, as Alexei had pointed out in one of the prior responses, the > core motivating factor behind introducing these newly proposed BPF > kfuncs purely stems from the requirement of needing to call > bpf_d_path() safely on a struct path from the context of a BPF LSM > program, specifically within the security_file_open() and > security_mmap_file() LSM hooks. Now, as noted within the original bug > report [0], it's currently not considered safe to pluck a struct path > out from an arbitrary in-kernel data structure, which in our case was > current->mm->exe_file->f_path, and have it passed to bpf_d_path() from > the aforementioned LSM hook points, or any other LSM hook point for > that matter. > > So, without using these newly introduced BPF kfuncs, our BPF LSM > program hanging off security_file_open() looks as follows: > > ``` > int BPF_PROG(file_open, struct file *file) > { > // Perform a whole bunch of operations on the supplied file argument. This > // includes some form of policy evaluation, and if there's a violation against > // policy and auditing is enabled, then we eventually call bpf_d_path() on > // file->f_path. Calling bpf_d_path() on the file argument isn't problematic > // as we have a stable path here as the file argument is reference counted. > struct path *target = &file->f_path; > > // ... > > struct task_struct *current = bpf_get_current_task_btf(); > > // ... > > bpf_rcu_read_lock(); > // Reserve a slot on the BPF ring buffer such that the actor's path can be > // passed back to userspace. > void *buf = bpf_ringbuf_reserve(&ringbuf, PATH_MAX, 0); > if (!buf) { > goto unlock; > } > > // For contextual purposes when performing an audit we also call bpf_d_path() > // on the actor, being current->mm->exe_file->f_path. > struct path *actor = ¤t->mm->exe_file->f_path; > > // Now perform the path resolution on the actor via bpf_d_path(). > u64 ret = bpf_d_path(actor, buf, PATH_MAX); > if (ret > 0) { > bpf_ringbuf_submit(buf, BPF_RB_NO_WAKEUP); > } else { > bpf_ringbuf_discard(buf, 0); > } > > unlock: > bpf_rcu_read_unlock(); > return 0; > } > ``` Note that we're also aware of the fact that calling bpf_d_path() within an RCU read-side critical shouldn't be permitted. I have a patch teed up which addresses this. bpf_path_d_path() OTOH isn't susceptible to this problem as the BPF verifier ensure that BPF kfuncs annotated KF_SLEEPABLE can't be called whilst in an RCU read-side critical section. /M
On Mon, Mar 11, 2024 at 5:01 AM Christian Brauner <brauner@kernel.org> wrote: > > > > > One can argue that get_mm_exe_file() is not exported, > > > > but it's nothing but rcu_lock-wrap plus get_file_rcu() > > > > which is EXPORT_SYMBOL. > > > > > > Oh, good spot. That's an accident. get_file_rcu() definitely shouldn't > > > be exported. So that'll be removed asap. > > > > So, just to make a point that > > "Included in that set are functions that aren't currently even > > exported to modules" > > you want to un-export get_file_rcu() ? > > No. The reason it was exported was because of the drm subsystem and we > already quite disliked that. But it turned out that's not needed so in > commit 61d4fb0b349e ("file, i915: fix file reference for > mmap_singleton()") they were moved away from this helper. Arguably that commit 61d4fb0b349e should have had Fixes: 0ede61d8589c ("file: convert to SLAB_TYPESAFE_BY_RCU") i915 was buggy before you touched it and safe_by_rcu exposed the bug. I can see why you guys looked at it, saw issues, and decided to look away. Though your guess in commit 61d4fb0b349e " Now, there might be delays until file->f_op->release::singleton_release() is called and i915->gem.mmap_singleton is set to NULL. " feels unlikely. I suspect release() delay cannot be that long to cause rcu stall. In the log prior to the splat there are just two mmap related calls from selftests in i915_gem_mman_live_selftests(): i915: Running i915_gem_mman_live_selftests/igt_mmap_offset_exhaustion i915: Running i915_gem_mman_live_selftests/igt_mmap 1st mmap test passed, but 2nd failed. So it looks like it's not a race, but an issue with cleanup in that driver. And instead of getting to the bottom of the issue you've decided to paper over with get_file_active(). I agree with that trade-off. But the bug in i915 is still there and it's probably an UAF. get_file_active() is probably operating on a broken 'struct file' that got to zero, but somehow it still around or it's just a garbage memory and file->f_count just happened to be zero. My point is that it's not ok to have such double standards. On one side you're arguing that we shouldn't introduce kfunc: +__bpf_kfunc struct file *bpf_get_task_exe_file(struct task_struct *task) +{ + return get_task_exe_file(task); +} that cleanly takes ref cnt on task->mm->exe_file and _not_ using lower level get_file/get_file_rcu/get_file_active api-s directly which are certainly problematic to expose anywhere, since safe_by_rcu protocol is delicate. But on the other side there is buggy i915 that does questionable dance with get_file_active(). It's EXPORT_SYMBOL_GPL as well and out of tree driver can ruin safe_by_rcu file properties with hard to debug consequences. > There is absolutely no way that any userspace will > get access to such low-level helpers. They have zero business to be > involved in the lifetimes of objects on this level just as no module has. correct, and kfuncs do not give bpf prog to do direct get_file*() access because we saw how tricky safe_by_rcu is. Hence kfuncs acquire file via get_task_exe_file or get_mm_exe_file and release via fput. That's the same pattern that security/tomoyo/util.c is doing: exe_file = get_mm_exe_file(mm); if (!exe_file) return NULL; cp = tomoyo_realpath_from_path(&exe_file->f_path); fput(exe_file); in bpf_lsm case it will be: exe_file = bpf_get_mm_exe_file(mm); if (!exe_file) // the verifier will enforce that bpf prog has this NULL check here // because we annotate kfunc as: BTF_ID_FLAGS(func, bpf_get_mm_exe_file, KF_ACQUIRE | KF_TRUSTED_ARGS | KF_RET_NULL) bpf_path_d_path(&exe_file->f_path, ...); bpf_put_file(exe_file); // and the verifier will enforce that bpf_put_file() is called too. // and there is no path out of this bpf program that can take file refcnt // without releasing. So really these kfuncs are a nop from vfs pov. If there is a bug in the verifier we will debug it and we will fix it. You keep saying that bpf_d_path() is a mess. Right. It is a mess now and we're fixing it. When it was introduced 4 years ago it was safe at that time. The unrelated verifier "smartness" made it possible to use it in UAF. We found the issue now and we're fixing it. Over these years we didn't ask vfs folks to help fix such bugs, and not asking for help now. You're being cc-ed on the patches to be aware on how we plan to fix this bpf_d_path() mess. If you have a viable alternative please suggest. As it stands the new kfuncs are clean and safe way to solve this mess.
On Wed, Mar 13, 2024 at 02:05:13PM -0700, Alexei Starovoitov wrote: > On Mon, Mar 11, 2024 at 5:01 AM Christian Brauner <brauner@kernel.org> wrote: > > > > > > > One can argue that get_mm_exe_file() is not exported, > > > > > but it's nothing but rcu_lock-wrap plus get_file_rcu() > > > > > which is EXPORT_SYMBOL. > > > > > > > > Oh, good spot. That's an accident. get_file_rcu() definitely shouldn't > > > > be exported. So that'll be removed asap. > > > > > > So, just to make a point that > > > "Included in that set are functions that aren't currently even > > > exported to modules" > > > you want to un-export get_file_rcu() ? > > > > No. The reason it was exported was because of the drm subsystem and we > > already quite disliked that. But it turned out that's not needed so in > > commit 61d4fb0b349e ("file, i915: fix file reference for > > mmap_singleton()") they were moved away from this helper. > > Arguably that commit 61d4fb0b349e should have had > Fixes: 0ede61d8589c ("file: convert to SLAB_TYPESAFE_BY_RCU") > i915 was buggy before you touched it > and safe_by_rcu exposed the bug. > I can see why you guys looked at it, saw issues, > and decided to look away. > Though your guess in commit 61d4fb0b349e > " > Now, there might be delays until > file->f_op->release::singleton_release() is called and > i915->gem.mmap_singleton is set to NULL. > " > feels unlikely. > I suspect release() delay cannot be that long to cause rcu stall. > In the log prior to the splat there are just two mmap related calls > from selftests in i915_gem_mman_live_selftests(): > i915: Running i915_gem_mman_live_selftests/igt_mmap_offset_exhaustion > i915: Running i915_gem_mman_live_selftests/igt_mmap > 1st mmap test passed, but 2nd failed. > So it looks like it's not a race, but an issue with cleanup in that driver. > And instead of getting to the bottom of the issue > you've decided to paper over with get_file_active(). > I agree with that trade-off. > But the bug in i915 is still there and it's probably an UAF. > get_file_active() is probably operating on a broken 'struct file' > that got to zero, but somehow it still around > or it's just a garbage memory and file->f_count > just happened to be zero. > > My point is that it's not ok to have such double standards. > On one side you're arguing that we shouldn't introduce kfunc: > +__bpf_kfunc struct file *bpf_get_task_exe_file(struct task_struct *task) > +{ > + return get_task_exe_file(task); > +} > that cleanly takes ref cnt on task->mm->exe_file and _not_ using lower > level get_file/get_file_rcu/get_file_active api-s directly which > are certainly problematic to expose anywhere, since safe_by_rcu > protocol is delicate. > > But on the other side there is buggy i915 that does > questionable dance with get_file_active(). > It's EXPORT_SYMBOL_GPL as well and out of tree driver can > ruin safe_by_rcu file properties with hard to debug consequences. You're lending strong support for my earlier point. Because this is a clear an example where a subsystem got access to a helper that it shouldn't have had access to. So we fixed the issue. But this whole polemic just illustrates that you simply didn't bother to understand how the code works. The way you talk about UAF together with SLAB_TYPESAFE_BY_RCU is telling. Please read the code instead of guessing. So the same way we don't have to take responsibility for bpf misunderstanding the expectations of d_path() we don't have to take responsibility for misusing an internal helper by another subsystem. So your argument here is moot at best and polemical and opportunistic at worst. It certainly doesn't illustrate what you think it does. And the above is fundamentally a suspiciously long sideshow. So let's get back to the core topic: Unless you document your rules when it's ok for a bpf program to come along and request access to internal apis patchsets such as this are not acceptable. > > > There is absolutely no way that any userspace will > > get access to such low-level helpers. They have zero business to be > > involved in the lifetimes of objects on this level just as no module has. > > correct, and kfuncs do not give bpf prog to do direct get_file*() access > because we saw how tricky safe_by_rcu is. > Hence kfuncs acquire file via get_task_exe_file or get_mm_exe_file > and release via fput. > That's the same pattern that security/tomoyo/util.c is doing: > exe_file = get_mm_exe_file(mm); > if (!exe_file) > return NULL; > > cp = tomoyo_realpath_from_path(&exe_file->f_path); > fput(exe_file); > > in bpf_lsm case it will be: > > exe_file = bpf_get_mm_exe_file(mm); > if (!exe_file) > // the verifier will enforce that bpf prog has this NULL check here > // because we annotate kfunc as: > BTF_ID_FLAGS(func, bpf_get_mm_exe_file, KF_ACQUIRE | KF_TRUSTED_ARGS | > KF_RET_NULL) > > bpf_path_d_path(&exe_file->f_path, ...); > bpf_put_file(exe_file); > // and the verifier will enforce that bpf_put_file() is called too. > // and there is no path out of this bpf program that can take file refcnt > // without releasing. > > So really these kfuncs are a nop from vfs pov. > If there is a bug in the verifier we will debug it and we will fix it. > > You keep saying that bpf_d_path() is a mess. > Right. It is a mess now and we're fixing it. > When it was introduced 4 years ago it was safe at that time. Uhm, no it was always sketchy. > The unrelated verifier "smartness" made it possible to use it in UAF. > We found the issue now and we're fixing it. > Over these years we didn't ask vfs folks to help fix such bugs, > and not asking for help now. > You're being cc-ed on the patches to be aware on how we plan to fix > this bpf_d_path() mess. If you have a viable alternative please suggest. The fix is to export a variant with trusted args. > As it stands the new kfuncs are clean and safe way to solve this mess. I will remind you of what you have been told in [1]: "No. It's not up to maintainers to suggest alternatives. Sometimes it's simply enough to explain *why* something isn't acceptable. A plain "no" without explanation isn't sufficient. NAKs need a good reason. But they don't need more than that. The onus of coming up with an acceptable solution is on the person who needs something new." You've been provided: a) good reasons why the patchset in it's current form isn't acceptable repeated multiple times b) support for exporting a variant of bpf_d_path() that is safe to use c) a request that all kfunc exports for the vfs will have to be located under fs/, not in kernel/bpf/ d) a path on how to move forward with additional kfunc requests: Clear and documented rules when it's ok for someone to come along and request access to bpf kfuncs when it's to be rejected and when it's ok to be supported. You repeatedly threatening to go over the heads of people will not make them more amenable to happily integrate with your subsystem. [1]: https://lore.kernel.org/all/CAHk-=whD2HMe4ja5nR6WWofUh3nLmhjoSPDvZm2-XMGjeie5Tg@mail.gmail.com
On Tue, Mar 12, 2024 at 05:06:36PM +0000, Matt Bobrowski wrote: > Hey Christian, > > On Mon, Mar 11, 2024 at 01:00:56PM +0100, Christian Brauner wrote: > > On Fri, Mar 08, 2024 at 05:23:30PM -0800, Alexei Starovoitov wrote: > > > On Fri, Mar 8, 2024 at 2:36 AM Christian Brauner <brauner@kernel.org> wrote: > > > > > > > > > > > > These exports are specifically for an out-of-tree BPF LSM program that > > > > is not accessible to the public. The question in the other mail stands. > > > > > > The question was already answered. You just don't like the answer. > > > bpf progs are not equivalent to kernel modules. > > > They have completely different safety and visibility properties. > > > The safety part I already talked about. > > > Sounds like the visibility has to be explained. > > > Kernel modules are opaque binary blobs. > > > bpf programs are fully transparent. The intent is known > > > to the verifier and to anyone with understanding > > > of bpf assembly. > > > Those that cannot read bpf asm can read C source code that is > > > embedded in the bpf program in kernel memory. > > > It's not the same as "llvm-dwarfdump module.ko" on disk. > > > The bpf prog source code is loaded into the kernel > > > at program verification time for debugging and visibility reasons. > > > If there is a verifier bug and bpf manages to crash the kernel > > > vmcore will have relevant lines of program C source code right there. > > > > > > Hence out-of-tree or in-tree bpf makes no practical difference. > > > The program cannot hide its meaning and doesn't hamper debugging. > > > > > > Hence adding EXPORT_SYMBOL == Brace for impact! > > > Expect crashes, api misuse and what not. > > > > > > While adding bpf_kfunc is a nop for kernel development. > > > If kfunc is in the way of code refactoring it can be removed > > > (as we demonstrated several times). > > > A kfunc won't cause headaches for the kernel code it is > > > calling (assuming no verifier bugs). > > > If there is a bug it's on us to fix it as we demonstrated in the past. > > > For example: bpf_probe_read_kernel(). > > > It's a wrapper of copy_from_kernel_nofault() and over the years > > > bpf users hit various bugs in copy_from_kernel_nofault(), > > > reported them, and _bpf developers_ fixed them. > > > Though copy_from_kernel_nofault() is as generic as it can get > > > and the same bugs could have been reproduced without bpf > > > we took care of fixing these parts of the kernel. > > > > > > Look at path_put(). > > > It's EXPORT_SYMBOL and any kernel module can easily screw up > > > reference counting, so that sooner or later distro folks > > > will experience debug pains due to out-of-tree drivers. > > > > > > kfunc that calls path_put() won't have such consequences. > > > The verifier will prevent path_put() on a pointer that wasn't > > > acquired by the same bpf program. No support pains. > > > It's a nop for vfs folks. > > > > > > > > First of all, there is no such thing as get_task_fs_pwd/root > > > > > in the kernel. > > > > > > > > Yeah, we'd need specific helpers for a never seen before out-of-tree BPF > > > > LSM. I don't see how that's different from an out-of-tree kernel module. > > > > > > Sorry, but you don't seem to understand what bpf can and cannot do, > > > hence they look similar. > > > > Maybe. On the other hand you seem to ignore what I'm saying. You > > currently don't have a clear set of rules for when it's ok for someone > > to send patches and request access to bpf kfuncs to implement a new BPF > > program. This patchset very much illustrates this point. The safety > > properties of bpf don't matter for this. And again, your safety > > properties very much didn't protect you from your bpf_d_path() mess. > > > > We're not even clearly told where and how these helper are supposed to be > > used. That's not ok and will never be ok. As long as there are no clear > > criteria to operate under this is highly problematic. This may be fine > > from a bpf perspective and one can even understand why because that's > > apparently your model or promise to your users. But there's no reason to > > expect the same level of laxness from any of the subsystems you're > > requesting kfuncs from. > > You raise a completely fair point, and I truly do apologies for the > lack of context and in depth explanations around the specific > situations that the proposed BPF kfuncs are intended to be used > from. Admittedly, that's a failure on my part, and I can completely > understand why from a maintainers point of view there would be > reservations around acknowledging requests for adding such invisible > dependencies. Thanks for providing more background. > Now, I'm in a little bit of a tough situation as I'm unable to point > you to an open-source BPF LSM implementation that intends to make use > of such newly proposed BPF kfuncs. That's just an unfortunate > constraint and circumstance that I'm having to deal with, so I'm just > going to have to provide heavily redacted and incomplete example to > illustrate how these BPF kfuncs intend to be used from BPF LSM > programs that I personally work on here at Google. Notably though, the > contexts that I do share here may obviously be a nonholistic view on > how these newly introduced BPF kfuncs end up getting used in practice > by some other completely arbitrary open-source BPF LSM programs. I have to say that this to me is wild. Essentially we're supposed to allow access to our internal APIs based on internal use-cases that aren't public and likely never will be. If that's acceptable then I want bpf to document this in their kernel Documentation and submit this for review to the wider community.
On Mon, Mar 18, 2024 at 6:14 AM Christian Brauner <brauner@kernel.org> wrote: > > On Wed, Mar 13, 2024 at 02:05:13PM -0700, Alexei Starovoitov wrote: > > But this whole polemic just illustrates that you simply didn't bother to > understand how the code works. The way you talk about UAF together with > SLAB_TYPESAFE_BY_RCU is telling. Please read the code instead of > guessing. Ok. Fair enough. I've read the code and old threads from Sep-Nov. I think the concerns about typesafe_by_rcu made folks believe in races that don't exist. if (unlikely(!atomic_long_inc_not_zero(&file->f_count))) .. * (a) the file ref already went down to zero and the * file hasn't been reused yet or the file count * isn't zero but the file has already been reused. .. if (unlikely(file != rcu_dereference_raw(*fdentry)) || The first part of the comment is partially incorrect. (it's in the wrong place). The file ref could have been down to zero and not reused yet, but it's before atomic_long_inc_not_zero. Once the code reaches 2nd check the file guaranteed to be reused and it went through init_file(), because that's the only code that brings it back from zero. This race is ok: cpu0 cpu1 file = rcu_dereference_raw(*fdentry); // file->f_count == 1 rcu_assign_pointer(fdt->fd[fd], NULL); fput() // reaches zero atomic_long_inc_not_zero() // will not succeed. This race is ok too: cpu0 cpu1 file = rcu_dereference_raw(*fdentry); // file->f_count == 1 rcu_assign_pointer(fdt->fd[fd], NULL); atomic_long_inc_not_zero() // succeeds. f_count == 2 fput() // f_count == 1 file != rcu_dereference_raw(*fdentry) // will fail but it doesn't have to. This is a safe race. It's no different if cpu0 went through these steps including successful last check and then cpu1 did close(fd) The file held by cpu0 is not on the path to zero. Similarly, back then, there was a concern about two parallel __fget_files_rcu() where one cpu incremented refcnt, failed some check and didn't do fput yet. In this case the file is not on the path to zero either. Both cpu-s saw non-zero f_count when they went through atomic_long_inc_not_zero. The file is not on the path to be reused. Now the second part of the comment "the file count isn't zero but the file has already been reused" is misleading. The (file != rcu_dereference_raw(*fdentry)) check is racy. Another cpu could have replaced that slot right after that check. Example: cpu0 doing lockless __fget_files_rcu() while cpu1 doing sys_close. __fget_files_rcu() will be returning a file that doesn't exist in fdt. And it's safe. This race is possible: cpu0 cpu1 file = rcu_dereference_raw(*fdentry); fput() reaches zero file_free alloc_empty_file // got same file init_file // f_count = 1 atomic_long_inc_not_zero() // succeeds. f_count == 2 file != rcu_dereference_raw(*fdentry)) // preventing a reuse of a file that was never in this fdt. The only value of this check is to make sure that this file either _is_ or _was_ at some point in this fdt. It's not preventing reuse per-se. This race is possible: cpu0 cpu1 file = rcu_dereference_raw(*fdentry); fput() reaches zero file_free alloc_empty_file // got same file init_file // f_count = 1 fd_install atomic_long_inc_not_zero() // succeeds. f_count == 2 file == rcu_dereference_raw(*fdentry)) // success, though the file _was reused_. I suggest to revise the comment. > > You've been provided: > > a) good reasons why the patchset in it's current form isn't acceptable > repeated multiple times We will improve commit logs in the next revision. > b) support for exporting a variant of bpf_d_path() that is safe to use good, but bpf_d_path kfunc alone is not useful. As Jann noted back in September: https://lore.kernel.org/all/CAG48ez2d5CW=CDi+fBOU1YqtwHfubN3q6w=1LfD+ss+Q1PWHgQ@mail.gmail.com/ The conversion of files to typesafe_by_rcu broke that verifier assumption about mm->exe_file and we need kfuncs to safely acquire/release file reference to fix that breakage. > c) a request that all kfunc exports for the vfs will have to be located > under fs/, not in kernel/bpf/ we've added kfuncs to net/netfilter/, net/xfrm/, net/ipv4/, kernel/cgroup/, drivers/hid/ because maintainers of those subsystems demonstrated understanding of what bpf is doing and what these kfuncs are for. We can put them in fs/, but you need to demonstrate willingness to understand the problem we're solving instead of arguing about how hard file typesafe_by_rcu is to understand. > d) a path on how to move forward with additional kfunc requests: > Clear and documented rules when it's ok for someone to come along and > request access to bpf kfuncs when it's to be rejected and when it's > ok to be supported. There are ~36500 EXPORT_SYMBOL in the kernel. Are there "clear documented rules when it's ok for someone to" add or remove them? There is a gentleman's agreement that maintainers of subsystems need to be cc-ed when new EXPORT_SYMBOL-s are added. In this case no new EXPORT_SYMBOLs are requested. Compare that to 221 bpf helpers (which are uapi, and for the last 2 years we didn't add a single one) and 151 bpf kfuncs which are not uapi as clearly documented in Documentation/bpf/kfuncs.rst When developers want to add them they cc bpf@vger and relevant subsystems just like we did with netfilter, xfrm, cgroup, hid. kfunc deprecation rules are also documented in kfunc.rst > You repeatedly threatening to go over the heads of people will not make > them more amenable to happily integrate with your subsystem. This is not it. We made our own mistakes with bpf_d_path safety, and now file typesafe_by_rcu broke bpf safety assumptions. We have to fix it one way or the other. It's bpf that got affected by your changes. But we don't demand that you fix bpf bits. We're fixing them. But you have to provide technical reasons why file acquire/release kfuncs are not suitable. "Only 3 people understand typesafe_by_rcu" is not it.