Message ID | 20230117212731.442859-1-toke@redhat.com (mailing list archive) |
---|---|
State | RFC |
Delegated to: | BPF |
Headers | show |
Series | [RFC,v2] Documentation/bpf: Add a description of "stable kfuncs" | expand |
Context | Check | Description |
---|---|---|
netdev/tree_selection | success | Not a local patch |
On Tue, Jan 17, 2023 at 2:20 PM Stanislav Fomichev <sdf@google.com> wrote: > > On Tue, Jan 17, 2023 at 2:04 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > > > > Stanislav Fomichev <sdf@google.com> writes: > > > > > On Tue, Jan 17, 2023 at 1:27 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > > >> > > >> Following up on the discussion at the BPF office hours, this patch adds a > > >> description of the (new) concept of "stable kfuncs", which are kfuncs that > > >> offer a "more stable" interface than what we have now, but is still not > > >> part of UAPI. > > >> > > >> This is mostly meant as a straw man proposal to focus discussions around > > >> stability guarantees. From the discussion, it seemed clear that there were > > >> at least some people (myself included) who felt that there needs to be some > > >> way to export functionality that we consider "stable" (in the sense of > > >> "applications can rely on its continuing existence"). > > >> > > >> One option is to keep BPF helpers as the stable interface and implement > > >> some technical solution for moving functionality from kfuncs to helpers > > >> once it has stood the test of time and we're comfortable committing to it > > >> as a stable API. Another is to freeze the helper definitions, and instead > > >> use kfuncs for this purpose as well, by marking a subset of them as > > >> "stable" in some way. Or we can do both and have multiple levels of > > >> "stable", I suppose. > > >> > > >> This patch is an attempt to describe what the "stable kfuncs" idea might > > >> look like, as well as to formulate some criteria for what we mean by > > >> "stable", and describe an explicit deprecation procedure. Feel free to > > >> critique any part of this (including rejecting the notion entirely). > > >> > > >> Some people mentioned (in the office hours) that should we decide to go in > > >> this direction, there's some work that needs to be done in libbpf (and > > >> probably the kernel too?) to bring the kfunc developer experience up to par > > >> with helpers. Things like exporting kfunc definitions to vmlinux.h (to make > > >> them discoverable), and having CO-RE support for using them, etc. I kinda > > >> consider that orthogonal to what's described here, but I do think we should > > >> fix those issues before implementing the procedures described here. > > >> > > >> v2: > > >> - Incorporate Daniel's changes > > >> > > >> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> > > >> --- > > >> Documentation/bpf/kfuncs.rst | 87 +++++++++++++++++++++++++++++++++--- > > >> 1 file changed, 81 insertions(+), 6 deletions(-) > > >> > > >> diff --git a/Documentation/bpf/kfuncs.rst b/Documentation/bpf/kfuncs.rst > > >> index 9fd7fb539f85..dd40a4ee35f2 100644 > > >> --- a/Documentation/bpf/kfuncs.rst > > >> +++ b/Documentation/bpf/kfuncs.rst > > >> @@ -7,9 +7,9 @@ BPF Kernel Functions (kfuncs) > > >> > > >> BPF Kernel Functions or more commonly known as kfuncs are functions in the Linux > > >> kernel which are exposed for use by BPF programs. Unlike normal BPF helpers, > > >> -kfuncs do not have a stable interface and can change from one kernel release to > > >> -another. Hence, BPF programs need to be updated in response to changes in the > > >> -kernel. > > >> +kfuncs by default do not have a stable interface and can change from one kernel > > >> +release to another. Hence, BPF programs may need to be updated in response to > > >> +changes in the kernel. See :ref:`BPF_kfunc_stability`. > > >> > > >> 2. Defining a kfunc > > >> =================== > > >> @@ -223,14 +223,89 @@ type. An example is shown below:: > > >> } > > >> late_initcall(init_subsystem); > > >> > > >> -3. Core kfuncs > > >> + > > >> +.. _BPF_kfunc_stability: > > >> + > > >> +3. API (in)stability of kfuncs > > >> +============================== > > >> + > > >> +By default, kfuncs exported to BPF programs are considered a kernel-internal > > >> +interface that can change between kernel versions. This means that BPF programs > > >> +using kfuncs may need to adapt to changes between kernel versions. In the > > >> +extreme case that could also include removal of a kfunc. In other words, kfuncs > > >> +are _not_ part of the kernel UAPI! Rather, these kfuncs can be thought of as > > >> +being similar to internal kernel API functions exported using the > > > > > > [..] > > > > > >> +``EXPORT_SYMBOL_GPL`` macro. All new BPF kernel helper-like functionality must > > >> +initially start out as kfuncs. > > > > > > To clarify, as part of this proposal, are we making a decision here > > > that we ban new helpers going forward? > > > > Good question! That is one of the things I'm hoping we can clear up by > > this discussing. I don't have a strong opinion on the matter myself, as > > long as there is *some* way to mark a subset of helpers/kfuncs as > > "stable"... > > Might be worth it to capitalize in this case to indicate that it's a > MUST from the RFC world? (or go with SHOULD otherwise). > I'm fine either way. The only thing that stops me from fully embracing > MUST is the kfunc requirement on the explicit jit support; I'm not > sure why it exists and at this point I'm too afraid to ask. But having > MUST here might give us motivation to address the shortcomings... Did you do: git grep bpf_jit_supports_kfunc_call and didn't find your favorite architecture there and didn't find it in the upcoming patches for riscv and arm32? If you care about kfuncs on arm32 please help reviewing posted patches.
On Tue, Jan 17, 2023 at 3:19 PM Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote: > > On Tue, Jan 17, 2023 at 2:20 PM Stanislav Fomichev <sdf@google.com> wrote: > > > > On Tue, Jan 17, 2023 at 2:04 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > > > > > > Stanislav Fomichev <sdf@google.com> writes: > > > > > > > On Tue, Jan 17, 2023 at 1:27 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > > > >> > > > >> Following up on the discussion at the BPF office hours, this patch adds a > > > >> description of the (new) concept of "stable kfuncs", which are kfuncs that > > > >> offer a "more stable" interface than what we have now, but is still not > > > >> part of UAPI. > > > >> > > > >> This is mostly meant as a straw man proposal to focus discussions around > > > >> stability guarantees. From the discussion, it seemed clear that there were > > > >> at least some people (myself included) who felt that there needs to be some > > > >> way to export functionality that we consider "stable" (in the sense of > > > >> "applications can rely on its continuing existence"). > > > >> > > > >> One option is to keep BPF helpers as the stable interface and implement > > > >> some technical solution for moving functionality from kfuncs to helpers > > > >> once it has stood the test of time and we're comfortable committing to it > > > >> as a stable API. Another is to freeze the helper definitions, and instead > > > >> use kfuncs for this purpose as well, by marking a subset of them as > > > >> "stable" in some way. Or we can do both and have multiple levels of > > > >> "stable", I suppose. > > > >> > > > >> This patch is an attempt to describe what the "stable kfuncs" idea might > > > >> look like, as well as to formulate some criteria for what we mean by > > > >> "stable", and describe an explicit deprecation procedure. Feel free to > > > >> critique any part of this (including rejecting the notion entirely). > > > >> > > > >> Some people mentioned (in the office hours) that should we decide to go in > > > >> this direction, there's some work that needs to be done in libbpf (and > > > >> probably the kernel too?) to bring the kfunc developer experience up to par > > > >> with helpers. Things like exporting kfunc definitions to vmlinux.h (to make > > > >> them discoverable), and having CO-RE support for using them, etc. I kinda > > > >> consider that orthogonal to what's described here, but I do think we should > > > >> fix those issues before implementing the procedures described here. > > > >> > > > >> v2: > > > >> - Incorporate Daniel's changes > > > >> > > > >> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> > > > >> --- > > > >> Documentation/bpf/kfuncs.rst | 87 +++++++++++++++++++++++++++++++++--- > > > >> 1 file changed, 81 insertions(+), 6 deletions(-) > > > >> > > > >> diff --git a/Documentation/bpf/kfuncs.rst b/Documentation/bpf/kfuncs.rst > > > >> index 9fd7fb539f85..dd40a4ee35f2 100644 > > > >> --- a/Documentation/bpf/kfuncs.rst > > > >> +++ b/Documentation/bpf/kfuncs.rst > > > >> @@ -7,9 +7,9 @@ BPF Kernel Functions (kfuncs) > > > >> > > > >> BPF Kernel Functions or more commonly known as kfuncs are functions in the Linux > > > >> kernel which are exposed for use by BPF programs. Unlike normal BPF helpers, > > > >> -kfuncs do not have a stable interface and can change from one kernel release to > > > >> -another. Hence, BPF programs need to be updated in response to changes in the > > > >> -kernel. > > > >> +kfuncs by default do not have a stable interface and can change from one kernel > > > >> +release to another. Hence, BPF programs may need to be updated in response to > > > >> +changes in the kernel. See :ref:`BPF_kfunc_stability`. > > > >> > > > >> 2. Defining a kfunc > > > >> =================== > > > >> @@ -223,14 +223,89 @@ type. An example is shown below:: > > > >> } > > > >> late_initcall(init_subsystem); > > > >> > > > >> -3. Core kfuncs > > > >> + > > > >> +.. _BPF_kfunc_stability: > > > >> + > > > >> +3. API (in)stability of kfuncs > > > >> +============================== > > > >> + > > > >> +By default, kfuncs exported to BPF programs are considered a kernel-internal > > > >> +interface that can change between kernel versions. This means that BPF programs > > > >> +using kfuncs may need to adapt to changes between kernel versions. In the > > > >> +extreme case that could also include removal of a kfunc. In other words, kfuncs > > > >> +are _not_ part of the kernel UAPI! Rather, these kfuncs can be thought of as > > > >> +being similar to internal kernel API functions exported using the > > > > > > > > [..] > > > > > > > >> +``EXPORT_SYMBOL_GPL`` macro. All new BPF kernel helper-like functionality must > > > >> +initially start out as kfuncs. > > > > > > > > To clarify, as part of this proposal, are we making a decision here > > > > that we ban new helpers going forward? > > > > > > Good question! That is one of the things I'm hoping we can clear up by > > > this discussing. I don't have a strong opinion on the matter myself, as > > > long as there is *some* way to mark a subset of helpers/kfuncs as > > > "stable"... > > > > Might be worth it to capitalize in this case to indicate that it's a > > MUST from the RFC world? (or go with SHOULD otherwise). > > I'm fine either way. The only thing that stops me from fully embracing > > MUST is the kfunc requirement on the explicit jit support; I'm not > > sure why it exists and at this point I'm too afraid to ask. But having > > MUST here might give us motivation to address the shortcomings... > > Did you do: > git grep bpf_jit_supports_kfunc_call > and didn't find your favorite architecture there and > didn't find it in the upcoming patches for riscv and arm32? > If you care about kfuncs on arm32 please help reviewing posted patches. Exactly why I'm going to support whatever decision is being made here. Just trying to clarify what that decision is.
On 1/18/23 3:00 AM, Stanislav Fomichev wrote: > On Tue, Jan 17, 2023 at 3:19 PM Alexei Starovoitov > <alexei.starovoitov@gmail.com> wrote: >> On Tue, Jan 17, 2023 at 2:20 PM Stanislav Fomichev <sdf@google.com> wrote: >>> On Tue, Jan 17, 2023 at 2:04 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote: >>>> Stanislav Fomichev <sdf@google.com> writes: >>>>> On Tue, Jan 17, 2023 at 1:27 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote: >>>>>> >>>>>> Following up on the discussion at the BPF office hours, this patch adds a >>>>>> description of the (new) concept of "stable kfuncs", which are kfuncs that >>>>>> offer a "more stable" interface than what we have now, but is still not >>>>>> part of UAPI. >>>>>> >>>>>> This is mostly meant as a straw man proposal to focus discussions around >>>>>> stability guarantees. From the discussion, it seemed clear that there were >>>>>> at least some people (myself included) who felt that there needs to be some >>>>>> way to export functionality that we consider "stable" (in the sense of >>>>>> "applications can rely on its continuing existence"). >>>>>> >>>>>> One option is to keep BPF helpers as the stable interface and implement >>>>>> some technical solution for moving functionality from kfuncs to helpers >>>>>> once it has stood the test of time and we're comfortable committing to it >>>>>> as a stable API. Another is to freeze the helper definitions, and instead >>>>>> use kfuncs for this purpose as well, by marking a subset of them as >>>>>> "stable" in some way. Or we can do both and have multiple levels of >>>>>> "stable", I suppose. >>>>>> >>>>>> This patch is an attempt to describe what the "stable kfuncs" idea might >>>>>> look like, as well as to formulate some criteria for what we mean by >>>>>> "stable", and describe an explicit deprecation procedure. Feel free to >>>>>> critique any part of this (including rejecting the notion entirely). >>>>>> >>>>>> Some people mentioned (in the office hours) that should we decide to go in >>>>>> this direction, there's some work that needs to be done in libbpf (and >>>>>> probably the kernel too?) to bring the kfunc developer experience up to par >>>>>> with helpers. Things like exporting kfunc definitions to vmlinux.h (to make >>>>>> them discoverable), and having CO-RE support for using them, etc. I kinda >>>>>> consider that orthogonal to what's described here, but I do think we should >>>>>> fix those issues before implementing the procedures described here. >>>>>> >>>>>> v2: >>>>>> - Incorporate Daniel's changes >>>>>> >>>>>> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> >>>>>> --- >>>>>> Documentation/bpf/kfuncs.rst | 87 +++++++++++++++++++++++++++++++++--- >>>>>> 1 file changed, 81 insertions(+), 6 deletions(-) >>>>>> >>>>>> diff --git a/Documentation/bpf/kfuncs.rst b/Documentation/bpf/kfuncs.rst >>>>>> index 9fd7fb539f85..dd40a4ee35f2 100644 >>>>>> --- a/Documentation/bpf/kfuncs.rst >>>>>> +++ b/Documentation/bpf/kfuncs.rst >>>>>> @@ -7,9 +7,9 @@ BPF Kernel Functions (kfuncs) >>>>>> >>>>>> BPF Kernel Functions or more commonly known as kfuncs are functions in the Linux >>>>>> kernel which are exposed for use by BPF programs. Unlike normal BPF helpers, >>>>>> -kfuncs do not have a stable interface and can change from one kernel release to >>>>>> -another. Hence, BPF programs need to be updated in response to changes in the >>>>>> -kernel. >>>>>> +kfuncs by default do not have a stable interface and can change from one kernel >>>>>> +release to another. Hence, BPF programs may need to be updated in response to >>>>>> +changes in the kernel. See :ref:`BPF_kfunc_stability`. >>>>>> >>>>>> 2. Defining a kfunc >>>>>> =================== >>>>>> @@ -223,14 +223,89 @@ type. An example is shown below:: >>>>>> } >>>>>> late_initcall(init_subsystem); >>>>>> >>>>>> -3. Core kfuncs >>>>>> + >>>>>> +.. _BPF_kfunc_stability: small nit: please also link from Documentation/bpf/bpf_design_QA.rst, so these sections here are easier to find. >>>>>> +3. API (in)stability of kfuncs >>>>>> +============================== >>>>>> + >>>>>> +By default, kfuncs exported to BPF programs are considered a kernel-internal >>>>>> +interface that can change between kernel versions. This means that BPF programs >>>>>> +using kfuncs may need to adapt to changes between kernel versions. In the >>>>>> +extreme case that could also include removal of a kfunc. In other words, kfuncs >>>>>> +are _not_ part of the kernel UAPI! Rather, these kfuncs can be thought of as >>>>>> +being similar to internal kernel API functions exported using the >>>>> >>>>> [..] >>>>> >>>>>> +``EXPORT_SYMBOL_GPL`` macro. All new BPF kernel helper-like functionality must >>>>>> +initially start out as kfuncs. >>>>> >>>>> To clarify, as part of this proposal, are we making a decision here >>>>> that we ban new helpers going forward? >>>> >>>> Good question! That is one of the things I'm hoping we can clear up by >>>> this discussing. I don't have a strong opinion on the matter myself, as >>>> long as there is *some* way to mark a subset of helpers/kfuncs as >>>> "stable"... >>> >>> Might be worth it to capitalize in this case to indicate that it's a >>> MUST from the RFC world? (or go with SHOULD otherwise). >>> I'm fine either way. The only thing that stops me from fully embracing >>> MUST is the kfunc requirement on the explicit jit support; I'm not >>> sure why it exists and at this point I'm too afraid to ask. But having >>> MUST here might give us motivation to address the shortcomings... >> >> Did you do: >> git grep bpf_jit_supports_kfunc_call >> and didn't find your favorite architecture there and >> didn't find it in the upcoming patches for riscv and arm32? >> If you care about kfuncs on arm32 please help reviewing posted patches. > > Exactly why I'm going to support whatever decision is being made here. > Just trying to clarify what that decision is. My $0.02 is that I don't think we need to make a hard-cut ban as part of this. The 'All new BPF kernel helper-like functionality must initially start out as kfuncs.' is pretty clear where things would need to start out with, and we could leave the option on the table if really needed to go BPF helper route when promoting kfunc to stable at the same time. I had that in the text suggestion earlier, it's more corner case and maybe we'll never need it but we also don't drive ourselves into a corner where we close the door on it. Lets let the infra around kfuncs evolve further first. Thanks, Daniel
On Tue, Jan 17, 2023 at 10:27:31PM +0100, Toke Høiland-Jørgensen wrote: > Following up on the discussion at the BPF office hours, this patch adds a > description of the (new) concept of "stable kfuncs", which are kfuncs that > offer a "more stable" interface than what we have now, but is still not > part of UAPI. > > This is mostly meant as a straw man proposal to focus discussions around > stability guarantees. From the discussion, it seemed clear that there were > at least some people (myself included) who felt that there needs to be some > way to export functionality that we consider "stable" (in the sense of > "applications can rely on its continuing existence"). > > One option is to keep BPF helpers as the stable interface and implement > some technical solution for moving functionality from kfuncs to helpers > once it has stood the test of time and we're comfortable committing to it > as a stable API. Another is to freeze the helper definitions, and instead > use kfuncs for this purpose as well, by marking a subset of them as > "stable" in some way. Or we can do both and have multiple levels of > "stable", I suppose. > > This patch is an attempt to describe what the "stable kfuncs" idea might > look like, as well as to formulate some criteria for what we mean by > "stable", and describe an explicit deprecation procedure. Feel free to > critique any part of this (including rejecting the notion entirely). > > Some people mentioned (in the office hours) that should we decide to go in > this direction, there's some work that needs to be done in libbpf (and > probably the kernel too?) to bring the kfunc developer experience up to par > with helpers. Things like exporting kfunc definitions to vmlinux.h (to make > them discoverable), and having CO-RE support for using them, etc. I kinda > consider that orthogonal to what's described here, but I do think we should > fix those issues before implementing the procedures described here. > > v2: > - Incorporate Daniel's changes > > Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Hi Toke, Thanks a lot for writing this up. Left some thoughts and comments below. > --- > Documentation/bpf/kfuncs.rst | 87 +++++++++++++++++++++++++++++++++--- > 1 file changed, 81 insertions(+), 6 deletions(-) > > diff --git a/Documentation/bpf/kfuncs.rst b/Documentation/bpf/kfuncs.rst > index 9fd7fb539f85..dd40a4ee35f2 100644 > --- a/Documentation/bpf/kfuncs.rst > +++ b/Documentation/bpf/kfuncs.rst > @@ -7,9 +7,9 @@ BPF Kernel Functions (kfuncs) > > BPF Kernel Functions or more commonly known as kfuncs are functions in the Linux > kernel which are exposed for use by BPF programs. Unlike normal BPF helpers, > -kfuncs do not have a stable interface and can change from one kernel release to > -another. Hence, BPF programs need to be updated in response to changes in the > -kernel. > +kfuncs by default do not have a stable interface and can change from one kernel > +release to another. Hence, BPF programs may need to be updated in response to > +changes in the kernel. See :ref:`BPF_kfunc_stability`. > > 2. Defining a kfunc > =================== > @@ -223,14 +223,89 @@ type. An example is shown below:: > } > late_initcall(init_subsystem); > > -3. Core kfuncs > + > +.. _BPF_kfunc_stability: > + > +3. API (in)stability of kfuncs > +============================== > + > +By default, kfuncs exported to BPF programs are considered a kernel-internal > +interface that can change between kernel versions. This means that BPF programs > +using kfuncs may need to adapt to changes between kernel versions. In the > +extreme case that could also include removal of a kfunc. In other words, kfuncs > +are _not_ part of the kernel UAPI! Rather, these kfuncs can be thought of as > +being similar to internal kernel API functions exported using the > +``EXPORT_SYMBOL_GPL`` macro. All new BPF kernel helper-like functionality must > +initially start out as kfuncs. > + > +3.1 Promotion to "stable" kfuncs > +-------------------------------- > + > +While kfuncs are by default considered unstable as described above, some kfuncs > +may warrant a stronger stability guarantee and can be marked as *stable*. The > +decision to move a kfunc to *stable* is taken on a case-by-case basis and must > +clear a high bar, taking into account the functions' usefulness under s/functions'/function's > +longer-term production deployment without any unforeseen API issues or > +limitations. In general, it is not expected that every kfunc will turn into a > +stable one - think of it as an exception rather than the norm. > + > +Those kfuncs which have been promoted to stable are then marked using the > +``KF_STABLE`` tag. The process for requesting a kfunc be marked as stable I'm not 100% convinced that KF_STABLE and KF_DEPRECATED flags are the correct API to go with here for signaling stability and deprecation, for a couple of reasons: 1. We're already requiring that we document in the kernel docs when a stable, deprecated kfunc is slated to be removed, so I think an argument could be made to make this a documentation convention rather than a programmatic + documentation convention. A user that sees that a kfunc is deprecated will end up having to go to the documentation anyways to see when it will be removed. 2. Long term, I don't think KF_STABLE or KF_DEPRECATED will be expressive enough for us. For example, we could do something similar to what Maxim suggested in [0] wherein we have API version numbers for kfuncs which are interpreted by libbpf, allowing us to warn for deprecated kfuncs, enble symbol versioning by linking against different versions of the kfunc in the kernel, etc. [0]: https://lore.kernel.org/all/Y78+iIqqpyufjWOv@mail.gmail.com/ So my take would be, let's just stick with a documentation convention for now. Maybe we could also add a "deprecated" BTF tag that would at least allow the verifier to warn the user on clang builds (until gcc supports it) when they link against a deprecated kfunc. What do you think? > +consists of submitting a patch to the bpf@vger.kernel.org mailing list adding > +the ``KF_STABLE`` tag to that kfunc's definition. The patch description must > +include the rationale for why the kfunc should be promoted to stable, including > +references to existing production uses, etc. The patch will be considered the > +same was as any other patch, and ultimately the decision on whether a kfunc > +should be promoted to stable is taken by the BPF maintainers. > + > +Stable kfuncs provide the following stability guarantees: > + > +1. Stable kfuncs will not change their function signature or functionality in a > + way that may cause incompatibilities for BPF programs calling the function. > + > +2. The BPF community will make every reasonable effort to keep stable kfuncs > + around as long as they continue to be useful to real-world BPF applications. I think we should try to avoid setting subjective expectations such as "every reasonable effort". I think it's useful to paint a picture of the typical workflow / culture for stability, but in terms of _guarantees_, I think we should only be speaking in terms of deprecation timelines. For example, points (2) and (3) could possibly be rephrased as: 2. The BPF community generally makes an effort to keep stable kfuncs around as long as they continue to be useful to real-world applications. Stable kfuncs will maintain their signature and behavior for as long as they're defined, and up through and possibly exceeding a deprecation window that is outlined in more detail below. > + > +3. Should a stable kfunc turn out to be no longer useful, the BPF community may > + decide to eventually remove it. In this case, before being removed that kfunc > + will go through a deprecation procedure as outlined below. Echoing the point above, I also think we should be a bit cautious in providing specific reasons as to why a stable kfunc would be removed. There may be cases where something is useful, but it turns out that the complexity is causing bugs, etc, and the value just isn't worth the cost. Hopefully the kfunc wouldn't get to the stable point by the time we realized there's a better alternative, but it's possible (perhaps bpf_loop() is a good example?) Also, "useful" is subjective, so it may surprise / confuse users if they we deprecate something that's still useful to them, such as bpf_get_current_task(). Note that we may choose to keep a kfunc around if a user tells us they need it, but the point is that I don't think we want to give readers the idea that a kfunc being useful automatically means it won't be removed. > + > +3.2 Deprecation of kfuncs > +------------------------- > + > +As described above, the community will make every reasonable effort to keep > +kfuncs available through future kernel versions once they are marked as stable. > +However, it may happen case that BPF development moves in an unforeseen 'it may happen in some cases'? > +direction so that even a stable kfunc ceases to be useful for program > +development. > + > +In this case, stable kfuncs can be marked as *deprecated* using the > +``KF_DEPRECATED`` tag. Such a deprecation request cannot be arbitrary and must > +explain why a given stable kfunc should be deprecated. Once a kfunc is marked as > +deprecated, the following procedure will be followed for removal: See above -- not sure I'm convinced yet that KF_STABLE | KF_DEPRECATED are the right APIs for the job. As another observation to that point, I don't think it ever make sense to specify a kfunc as KF_DEPRECATED without also KF_STABLE, which IMO is a sign that we need something different than KF flags to express what we're going for here. > + > +1. A kfunc marked as deprecated will be kept in the kernel for a conservatively > + chosen period of time after it was first marked as deprecated (usually > + corresponding to a span of multiple years). I have two suggestions for this paragraph: 1. I think we should consider specifying a concrete minimum time period here. Unfortunately it's difficult to choose one given that we haven't yet gone through a deprecation story and so we have little experience to draw from in choosing it. On the other hand, I think it would be good for us to have a more fully-defined policy to point to as that seems to have been a big point of confusion in prior discussions, and we can always update the time period later. 2. We should probably specify that the clock for a kfunc being considered deprecated starts at the release date of the first kernel version in which it was marked as deprecated. 'First marked as deprecated' is a bit too open for interpretation IMO, as it could correspond to when the commit was first merged to Linus' tree, etc. Assuming others agree that we're ready to specify a minimum deprecation time, I think 1 year makes sense as a starting point, though I expect it will increase as we stabilize and finish some remaining kfunc items like symbol versioning. 1 year isn't super aggressive in that it gives us the flexibility to change things if we decide that we want a different deprecation story (e.g. if and when we support symboled versions for kfuncs), but it's still a reasonably long period of time in that we're agreeing to a cost for the sake of stability. Wdyt? > + > +2. Deprecated functions will be documented in the kernel docs along with their > + remaining lifespan and including a recommendation for new functionality that > + can replace the usage of the deprecated function (or an explanation for why > + no such replacement exists). > + > +3. After the deprecation period, the kfunc will be removed and the function name > + will be marked as invalid inside the kernel (to ensure that no new kfunc is > + accidentally introduced with the same name in the future). After this > + happens, BPF programs calling the kfunc will be rejected by the verifier. Do we need to mark the function name as invalid? Wouldn't removing it be sufficient for the verifier to reject the program? If we wanted to have some better UX for this case, ideally we could rely on libbpf to help us. In general, I think what you have in this paragraph is good and makes sense given the current state of kfuncs not having symbol versioning. That being said, once we do have something like symbol versioning support, etc, I don't think we'd need to ban the name from ever being used again. My thinking here is that there may be times where we need to e.g. add a or remove a field to or from a kfunc, change some semantics, etc, and adding an entirely new name like kfunc_name_v2() would be kind of unfortunate. It's certainly warranted in the current state of things, but if and when we have symbol version support I think it would be nice to relax this restriction. > + > +4. Core kfuncs > ============== > > The BPF subsystem provides a number of "core" kfuncs that are potentially > applicable to a wide variety of different possible use cases and programs. > Those kfuncs are documented here. > > -3.1 struct task_struct * kfuncs > +4.1 struct task_struct * kfuncs > ------------------------------- > > There are a number of kfuncs that allow ``struct task_struct *`` objects to be > @@ -306,7 +381,7 @@ Here is an example of it being used: > return 0; > } > > -3.2 struct cgroup * kfuncs > +4.2 struct cgroup * kfuncs > -------------------------- > > ``struct cgroup *`` objects also have acquire and release functions: > -- > 2.39.0 >
On Wed, Jan 18, 2023 at 11:48:59AM +0100, Daniel Borkmann wrote: > On 1/18/23 3:00 AM, Stanislav Fomichev wrote: > > On Tue, Jan 17, 2023 at 3:19 PM Alexei Starovoitov > > <alexei.starovoitov@gmail.com> wrote: > > > On Tue, Jan 17, 2023 at 2:20 PM Stanislav Fomichev <sdf@google.com> wrote: > > > > On Tue, Jan 17, 2023 at 2:04 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > > > > > Stanislav Fomichev <sdf@google.com> writes: > > > > > > On Tue, Jan 17, 2023 at 1:27 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > > > > > > > > > > > > > > Following up on the discussion at the BPF office hours, this patch adds a > > > > > > > description of the (new) concept of "stable kfuncs", which are kfuncs that > > > > > > > offer a "more stable" interface than what we have now, but is still not > > > > > > > part of UAPI. > > > > > > > > > > > > > > This is mostly meant as a straw man proposal to focus discussions around > > > > > > > stability guarantees. From the discussion, it seemed clear that there were > > > > > > > at least some people (myself included) who felt that there needs to be some > > > > > > > way to export functionality that we consider "stable" (in the sense of > > > > > > > "applications can rely on its continuing existence"). > > > > > > > > > > > > > > One option is to keep BPF helpers as the stable interface and implement > > > > > > > some technical solution for moving functionality from kfuncs to helpers > > > > > > > once it has stood the test of time and we're comfortable committing to it > > > > > > > as a stable API. Another is to freeze the helper definitions, and instead > > > > > > > use kfuncs for this purpose as well, by marking a subset of them as > > > > > > > "stable" in some way. Or we can do both and have multiple levels of > > > > > > > "stable", I suppose. > > > > > > > > > > > > > > This patch is an attempt to describe what the "stable kfuncs" idea might > > > > > > > look like, as well as to formulate some criteria for what we mean by > > > > > > > "stable", and describe an explicit deprecation procedure. Feel free to > > > > > > > critique any part of this (including rejecting the notion entirely). > > > > > > > > > > > > > > Some people mentioned (in the office hours) that should we decide to go in > > > > > > > this direction, there's some work that needs to be done in libbpf (and > > > > > > > probably the kernel too?) to bring the kfunc developer experience up to par > > > > > > > with helpers. Things like exporting kfunc definitions to vmlinux.h (to make > > > > > > > them discoverable), and having CO-RE support for using them, etc. I kinda > > > > > > > consider that orthogonal to what's described here, but I do think we should > > > > > > > fix those issues before implementing the procedures described here. > > > > > > > > > > > > > > v2: > > > > > > > - Incorporate Daniel's changes > > > > > > > > > > > > > > Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> > > > > > > > --- > > > > > > > Documentation/bpf/kfuncs.rst | 87 +++++++++++++++++++++++++++++++++--- > > > > > > > 1 file changed, 81 insertions(+), 6 deletions(-) > > > > > > > > > > > > > > diff --git a/Documentation/bpf/kfuncs.rst b/Documentation/bpf/kfuncs.rst > > > > > > > index 9fd7fb539f85..dd40a4ee35f2 100644 > > > > > > > --- a/Documentation/bpf/kfuncs.rst > > > > > > > +++ b/Documentation/bpf/kfuncs.rst > > > > > > > @@ -7,9 +7,9 @@ BPF Kernel Functions (kfuncs) > > > > > > > > > > > > > > BPF Kernel Functions or more commonly known as kfuncs are functions in the Linux > > > > > > > kernel which are exposed for use by BPF programs. Unlike normal BPF helpers, > > > > > > > -kfuncs do not have a stable interface and can change from one kernel release to > > > > > > > -another. Hence, BPF programs need to be updated in response to changes in the > > > > > > > -kernel. > > > > > > > +kfuncs by default do not have a stable interface and can change from one kernel > > > > > > > +release to another. Hence, BPF programs may need to be updated in response to > > > > > > > +changes in the kernel. See :ref:`BPF_kfunc_stability`. > > > > > > > > > > > > > > 2. Defining a kfunc > > > > > > > =================== > > > > > > > @@ -223,14 +223,89 @@ type. An example is shown below:: > > > > > > > } > > > > > > > late_initcall(init_subsystem); > > > > > > > > > > > > > > -3. Core kfuncs > > > > > > > + > > > > > > > +.. _BPF_kfunc_stability: > > small nit: please also link from Documentation/bpf/bpf_design_QA.rst, so these sections > here are easier to find. > > > > > > > > +3. API (in)stability of kfuncs > > > > > > > +============================== > > > > > > > + > > > > > > > +By default, kfuncs exported to BPF programs are considered a kernel-internal > > > > > > > +interface that can change between kernel versions. This means that BPF programs > > > > > > > +using kfuncs may need to adapt to changes between kernel versions. In the > > > > > > > +extreme case that could also include removal of a kfunc. In other words, kfuncs > > > > > > > +are _not_ part of the kernel UAPI! Rather, these kfuncs can be thought of as > > > > > > > +being similar to internal kernel API functions exported using the > > > > > > > > > > > > [..] > > > > > > > > > > > > > +``EXPORT_SYMBOL_GPL`` macro. All new BPF kernel helper-like functionality must > > > > > > > +initially start out as kfuncs. > > > > > > > > > > > > To clarify, as part of this proposal, are we making a decision here > > > > > > that we ban new helpers going forward? > > > > > > > > > > Good question! That is one of the things I'm hoping we can clear up by > > > > > this discussing. I don't have a strong opinion on the matter myself, as > > > > > long as there is *some* way to mark a subset of helpers/kfuncs as > > > > > "stable"... > > > > > > > > Might be worth it to capitalize in this case to indicate that it's a > > > > MUST from the RFC world? (or go with SHOULD otherwise). > > > > I'm fine either way. The only thing that stops me from fully embracing > > > > MUST is the kfunc requirement on the explicit jit support; I'm not > > > > sure why it exists and at this point I'm too afraid to ask. But having > > > > MUST here might give us motivation to address the shortcomings... > > > > > > Did you do: > > > git grep bpf_jit_supports_kfunc_call > > > and didn't find your favorite architecture there and > > > didn't find it in the upcoming patches for riscv and arm32? > > > If you care about kfuncs on arm32 please help reviewing posted patches. > > > > Exactly why I'm going to support whatever decision is being made here. > > Just trying to clarify what that decision is. > > My $0.02 is that I don't think we need to make a hard-cut ban as part of this. > The 'All new BPF kernel helper-like functionality must initially start out as > kfuncs.' is pretty clear where things would need to start out with, and we could > leave the option on the table if really needed to go BPF helper route when > promoting kfunc to stable at the same time. I had that in the text suggestion > earlier, it's more corner case and maybe we'll never need it but we also don't > drive ourselves into a corner where we close the door on it. Lets let the infra > around kfuncs evolve further first. I think that's reasonable, though I also think it would be good for us to be concrete about what we mean by "if really needed to go BPF helper route". One of Andrii's main points (hopefully I'm not misrepresenting anything) was that having things as kfuncs requires JIT support, which means that architectures which don't yet have JIT support wouldn't be able to reap the benefits of whatever functionality is added with kfuncs. On the other hand, Alexei pointed out in [0] that riscv and arm32 support is coming for JIT. [0]: https://lore.kernel.org/all/CAADnVQKy1QzM+wg1BxfYA30QsTaM4M5RRCi+VHN6A7ah2BeZZw@mail.gmail.com/ I propose that we also specify that wanting the feature to be present on non-JIT / non-BTF kernels is not a sufficient reason for making them a helper. Not because there are no tradeoffs in doing so, but rather because: 1. I think we just need to make a decision and be consistent here to avoid more lengthy debates. 2. I think that if something is really useful on an architecture, people will add JIT support for it. An argument could always be made that we should be able to rely only on the interpreter for new architectures that are added, etc. As more time passes, BPF sans JIT (i.e. interpreter BPF) will be less and less useful, and will diverge more and more from JIT-BPF. It's really inevitable anyways given the direction that things are going, and IMO we should just embrace that and focus on enabling JIT / modern BPF on useful architectures rather than adding things to helpers for the sake of those platforms. Thanks, David
On Wed, Jan 18, 2023 at 11:48:59AM +0100, Daniel Borkmann wrote: > > My $0.02 is that I don't think we need to make a hard-cut ban as part of this. The hard-cut is easier to enforce otherwise every developer will be arguing that their new feature is special and it requires a new discussion. This thread has been going for too long. We need to finish it now and don't come back to it again every now and then. imo this is the summary of the thread: bpf folks fall into two categories: kernel maintainers and bpf developers/users. - developers add new bpf features. They obviously want to use them and want bpf users to know that the feature they added is not going to disappear in the next kernel. They want stability. - maintainers want to make sure that the kernel development doesn't suffer because developers keep adding new apis. They want freedom to innovate and change apis. Maintainers also know that developers make mistakes and might leave the community. The kernel is huge and core infra changes all the time. bpf apis must never be a reason not to change something in the kernel. Freedom to change and stability just don't overlap. These two camps can never agree on what is more important. But we can make them co-exist. The bpf developers adding new kfunc should assume that it's stable and proceed to use it in bpf progs and production applications. The bpf maintainers will keep this stability promise. They obviously will not reap it out of the kernel on the whim, but they will nuke it if this kfunc will be in the way of the kernel innovation. The longer the kfunc is present the harder it will be for maintainers to justify removing it. The developers have to stick around and demonstrate that their kfunc is actually being used. The better developers do it the bigger the effort maintainers will put into keeping the kfunc perfectly intact. Some kfunc might be perfect on the first try and it will be stable from the first kernel release it appeared in. Other kfuncs might be questionable. Like what happened with conntrack kfunc. It looked good first, but then the same developers who added it came back to change it. The approach of 'assume stable, but fix it if you like' worked in this case. Take bpf_obj_new kfunc. I think it's great and I hope the interface will stick. But we have an option to change it. Take Andrii's upcoming 'open coded iterators'. The concept and api look great. I hope we will do it right the first time and bpf progs will start to use it immediately. In such case why would anyone think of changing it? If api works well and progs are using we will keep it this way. But imagine we decide to replace the verifier with something better. It will give us much better flexibility, but sadly bounded loops and iterators will be in the way. What we most likely going to do in such case we'll keep two verifiers for several years and deprecate the old one along with kfuncs that we couldn't keep. That would be a scheme for deprecation of kfunc. Maybe we will use KF_DEPRECATE mechansim in such case or something else. I think we need to cross that bridge when we get there. Introducing KF_STABLE and KF_DEPRECATED right now looks premature. We can discuss it, but adding it to a doc and committing to it is too early. We don't have any kfuncs to mark as KF_STABLE or as KF_DEPRECATED. No one presented any data on usage of existing kfuncs. So we're not going to change or remove any one of them. bpf developers and users should assume that all kfuncs are stable and use them. When somebody comes to argue that a particular kfunc needs to change the developer who added that kfunc better to be around to argue that the kfunc is perfect the way it is. If developer is gone the maintainers will make a call. It's a self regulating system. kfuncs will be stable if developers/users are around. Yet the maintainers will have a freedom to change if absolutely necessary. Back to deprecation... I think KF_DEPRECATED is a good idea. When kfunc will be auto emitted into vmlinux.h (or whatever other file) or shipped in libbpf header we can emit __attribute__((deprecated("reason", "replacement"))); to that header file (so it's seen during bpf prog build) and start dmesg warn on them in the verifier. Kernel splats do get noticed. The users would have to act quickly. As far as KF_STABLE... I think it hurts the system in the long run. The developer can argue at one point in time that kfunc has to be KF_STABLE. The patch will be applied, but the developer is off the hook and can disappear. The maintainers would have to argue on behalf of the developer and keep maintaining it? The maintainers won't have a signal whether kfunc is still useful after initial KF_STABLE patch. I think it's more important to decide how we document kfuncs and how equivalent of bpf_helper_defs.h can be done. > The 'All new BPF kernel helper-like functionality must initially start out as > kfuncs.' is pretty clear where things would need to start out with, and we could > leave the option on the table if really needed to go BPF helper route when > promoting kfunc to stable at the same time. I had that in the text suggestion > earlier, it's more corner case and maybe we'll never need it but we also don't > drive ourselves into a corner where we close the door on it. Lets let the infra > around kfuncs evolve further first. Going kfunc->helper for stability was discussed already. It probably got lost in the noise. The summary was that it's not an option for the following reason: kfuncs and helpers are done through different mechanisms on prog and kernel side. The prog either sees = (void *)1 hack or normal call to extern func. The generated code is different. Say, we convert a kfunc to helper. Immediately the existing bpf prog that uses that kfunc will fail to load. That's the opposite of stability. We're going to require the developer to demonstrate the real world use of kfunc before promoting to stable, but with such 'promotion' we will break bpf progs. Say, we keep kfunc and introduce a new helper that does exactly the same. But it won't help bpf prog. The prog was using kfunc in production, why would it do some kind of CO-RE thing to compile 'call foo' differently depending whether 'foo' is kfunc or helper in the given kernel. And so on.
On Wed, Jan 18, 2023 at 8:32 PM Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote: > > On Wed, Jan 18, 2023 at 11:48:59AM +0100, Daniel Borkmann wrote: > > > > My $0.02 is that I don't think we need to make a hard-cut ban as part of this. > > The hard-cut is easier to enforce otherwise every developer will be arguing that > their new feature is special and it requires a new discussion. > This thread has been going for too long. We need to finish it now and > don't come back to it again every now and then. I wish that we could grant exception at least to complete dynptr basics (bpf_dynptr_is_null, bpf_dynptr_get_size, bpf_dynptr_{clone,trim,advance}) so that it is consistently provided as a unified set of helpers. Similarly, for open coded loop iterator (3 helpers), I believe it would be better for BPF ecosystem overall to work on any BPF-enabled architecture and configuration (no matter JIT or not, BTF of not, etc), just due to generality and unassuming nature of this functionality. But it is what it is, let's move on. > > imo this is the summary of the thread: > > bpf folks fall into two categories: kernel maintainers and bpf developers/users. > - developers add new bpf features. They obviously want to use them and want bpf users > to know that the feature they added is not going to disappear in the next kernel. > They want stability. > - maintainers want to make sure that the kernel development doesn't suffer because > developers keep adding new apis. They want freedom to innovate and change apis. > Maintainers also know that developers make mistakes and might leave the community. > The kernel is huge and core infra changes all the time. > bpf apis must never be a reason not to change something in the kernel. > > Freedom to change and stability just don't overlap. > These two camps can never agree on what is more important. > But we can make them co-exist. > > The bpf developers adding new kfunc should assume that it's stable and proceed > to use it in bpf progs and production applications. It's unclear what this means for library developers (libbpf, libxdp, and others), but I guess we'll find out with time. > The bpf maintainers will keep this stability promise. They obviously will not > reap it out of the kernel on the whim, but they will nuke it if this kfunc > will be in the way of the kernel innovation. > The longer the kfunc is present the harder it will be for maintainers to justify > removing it. The developers have to stick around and demonstrate that their > kfunc is actually being used. The better developers do it the bigger the effort > maintainers will put into keeping the kfunc perfectly intact. > [...] > > > The 'All new BPF kernel helper-like functionality must initially start out as > > kfuncs.' is pretty clear where things would need to start out with, and we could > > leave the option on the table if really needed to go BPF helper route when > > promoting kfunc to stable at the same time. I had that in the text suggestion > > earlier, it's more corner case and maybe we'll never need it but we also don't > > drive ourselves into a corner where we close the door on it. Lets let the infra > > around kfuncs evolve further first. > > Going kfunc->helper for stability was discussed already. It probably got lost > in the noise. The summary was that it's not an option for the following reason: > kfuncs and helpers are done through different mechanisms on prog and kernel side. > The prog either sees = (void *)1 hack or normal call to extern func. > The generated code is different. > Say, we convert a kfunc to helper. Immediately the existing bpf prog that uses > that kfunc will fail to load. That's the opposite of stability. > We're going to require the developer to demonstrate the real world use of kfunc > before promoting to stable, but with such 'promotion' we will break bpf progs. > Say, we keep kfunc and introduce a new helper that does exactly the same. > But it won't help bpf prog. The prog was using kfunc in production, > why would it do some kind of CO-RE thing to compile 'call foo' differently > depending whether 'foo' is kfunc or helper in the given kernel. And so on. Correct. If we'd anticipated promotion of kfunc to helper (but it doesn't seem like we do), we'd need to have kfunc with a different name from its corresponding helper's name to avoid massive pain for users. So if we were to add bpf_dynptr_is_null() as kfunc with an eye for it to be stabilized as helper, I'd vote to add it as something like bpf_kf_dynptr_is_null() or something, and then eventually add bpf_dynptr_is_null(). That will at least less painful abstraction in user's BPF code (and libbpf's helpers, if any).
On Tue, Jan 24, 2023 at 9:17 AM Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote: > > On Wed, Jan 18, 2023 at 8:32 PM Alexei Starovoitov > <alexei.starovoitov@gmail.com> wrote: > > > > On Wed, Jan 18, 2023 at 11:48:59AM +0100, Daniel Borkmann wrote: > > > > > > My $0.02 is that I don't think we need to make a hard-cut ban as part of this. > > > > The hard-cut is easier to enforce otherwise every developer will be arguing that > > their new feature is special and it requires a new discussion. > > This thread has been going for too long. We need to finish it now and > > don't come back to it again every now and then. > > I wish that we could grant exception at least to complete dynptr > basics (bpf_dynptr_is_null, bpf_dynptr_get_size, > bpf_dynptr_{clone,trim,advance}) so that it is consistently provided > as a unified set of helpers. Similarly, for open coded loop iterator > (3 helpers), I believe it would be better for BPF ecosystem overall to > work on any BPF-enabled architecture and configuration (no matter JIT > or not, BTF of not, etc), just due to generality and unassuming nature > of this functionality. > > But it is what it is, let's move on. Just to expand a bit on the above and make it clearer. I don't like a hard-cut ban on helpers, but I'll disagree and commit and will move open-coded iterators to kfuncs. And whoever is waiting on the helpers vs kfuncs decision should stop waiting and use kfuncs. [...]
Alexei Starovoitov <alexei.starovoitov@gmail.com> writes: > The bpf developers adding new kfunc should assume that it's stable and proceed > to use it in bpf progs and production applications. "Assume all kfuncs are stable" is fine by me, but that is emphatically not what we have been saying thus far, quite the opposite... > The bpf maintainers will keep this stability promise. They obviously will not > reap it out of the kernel on the whim, but they will nuke it if this kfunc > will be in the way of the kernel innovation. ...and it is contradicted by this last bit. I mean "it's stable, but we'll remove it if it's in the way" is not, well, stable. [...] > bpf developers and users should assume that all kfuncs are stable and use them. > When somebody comes to argue that a particular kfunc needs to change > the developer who added that kfunc better to be around to argue that the kfunc is > perfect the way it is. If developer is gone the maintainers will make a call. > It's a self regulating system. > kfuncs will be stable if developers/users are around. > Yet the maintainers will have a freedom to change if absolutely necessary. This assumes users (i.e., BPF program authors) are around during the development phase, which they are generally not. Except for the users who are also BPF devs, but that's a minority (if not now, hopefully in the future). So I really think we need to document some expectations here. For instance, what happens if we change a kfunc, and a user shows up during the -rc phase saying it broke their application? Are we going to revert that change? > Back to deprecation... > I think KF_DEPRECATED is a good idea. > When kfunc will be auto emitted into vmlinux.h (or whatever other file) > or shipped in libbpf header we can emit > __attribute__((deprecated("reason", "replacement"))); > to that header file (so it's seen during bpf prog build) and > start dmesg warn on them in the verifier. > Kernel splats do get noticed. The users would have to act quickly. So how about documenting that bit? Something like: "We promise that kfuncs will not be removed without going through a deprecation phase. The length of the deprecation will be proportional to how long that kfunc has existed in the kernel, but will be no shorter than XX kernel releases." ? > As far as KF_STABLE... I think it hurts the system in the long run. > The developer can argue at one point in time that kfunc has to be KF_STABLE. > The patch will be applied, but the developer is off the hook and can disappear. > The maintainers would have to argue on behalf of the developer > and keep maintaining it? The maintainers won't have a signal whether > kfunc is still useful after initial KF_STABLE patch. Doing the above wrt deprecation without having an explicit stable tag would be OK with me. > I think it's more important to decide how we document kfuncs and > how equivalent of bpf_helper_defs.h can be done. I agree we (also) need to do this. As well as have some support for querying for them from userspace on a running kernel (for CO-RE purposes). -Toke
On Tue, Jan 24, 2023 at 5:18 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > > Alexei Starovoitov <alexei.starovoitov@gmail.com> writes: > > > The bpf developers adding new kfunc should assume that it's stable and proceed > > to use it in bpf progs and production applications. > > "Assume all kfuncs are stable" is fine by me, but that is emphatically > not what we have been saying thus far, quite the opposite... > > > The bpf maintainers will keep this stability promise. They obviously will not > > reap it out of the kernel on the whim, but they will nuke it if this kfunc > > will be in the way of the kernel innovation. > > ...and it is contradicted by this last bit. I mean "it's stable, but > we'll remove it if it's in the way" is not, well, stable. Schrodinger's kfuncs :) > [...] > > > bpf developers and users should assume that all kfuncs are stable and use them. > > When somebody comes to argue that a particular kfunc needs to change > > the developer who added that kfunc better to be around to argue that the kfunc is > > perfect the way it is. If developer is gone the maintainers will make a call. > > It's a self regulating system. > > kfuncs will be stable if developers/users are around. > > Yet the maintainers will have a freedom to change if absolutely necessary. > > This assumes users (i.e., BPF program authors) are around during the > development phase, which they are generally not. Except for the users > who are also BPF devs, but that's a minority (if not now, hopefully in > the future). So I really think we need to document some expectations > here. > > For instance, what happens if we change a kfunc, and a user shows up > during the -rc phase saying it broke their application? Are we going to > revert that change? It depends. Obviously we're not going to change/remove/deprecate kfuncs unless there is a need to do so. And when we do we will consider all users. > > Back to deprecation... > > I think KF_DEPRECATED is a good idea. > > When kfunc will be auto emitted into vmlinux.h (or whatever other file) > > or shipped in libbpf header we can emit > > __attribute__((deprecated("reason", "replacement"))); > > to that header file (so it's seen during bpf prog build) and > > start dmesg warn on them in the verifier. > > Kernel splats do get noticed. The users would have to act quickly. > > So how about documenting that bit? Something like: > > "We promise that kfuncs will not be removed without going through a > deprecation phase. The length of the deprecation will be proportional to > how long that kfunc has existed in the kernel, but will be no shorter > than XX kernel releases." ? That's not something we can promise. Take conntrack kfuncs. If netfilter folks decide to sunset conntrack tomorrow we won't be standing in their way. On the other side the dynptr kfuncs are going to stay as-is for foreseeable future because they don't rely on other kernel subsystems to do the job. Both cases may still change if users themselves (after using it in prod) come back with reasons to change it. In the past the kernel devs were dictating the helper uapi to users and users had no option, but to shut up and use what's available. Now they will be able to use new apis and request changes. At that point it will be a set of users X vs a set of users Y. If ten users say that this kfunc sucks while one user wants to keep it as-is we will introduce another kfunc and will start deprecating the one that lost the vote. The deprecation time window will depend on case by case considering maintenance cost, etc. > > As far as KF_STABLE... I think it hurts the system in the long run. > > The developer can argue at one point in time that kfunc has to be KF_STABLE. > > The patch will be applied, but the developer is off the hook and can disappear. > > The maintainers would have to argue on behalf of the developer > > and keep maintaining it? The maintainers won't have a signal whether > > kfunc is still useful after initial KF_STABLE patch. > > Doing the above wrt deprecation without having an explicit stable tag > would be OK with me. > > > I think it's more important to decide how we document kfuncs and > > how equivalent of bpf_helper_defs.h can be done. > > I agree we (also) need to do this. As well as have some support for > querying for them from userspace on a running kernel (for CO-RE > purposes). Of course. That has been brought up many times. Just do it. Whoever has time to do it.
Alexei Starovoitov <alexei.starovoitov@gmail.com> writes: > On Tue, Jan 24, 2023 at 5:18 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote: >> >> Alexei Starovoitov <alexei.starovoitov@gmail.com> writes: >> >> > The bpf developers adding new kfunc should assume that it's stable and proceed >> > to use it in bpf progs and production applications. >> >> "Assume all kfuncs are stable" is fine by me, but that is emphatically >> not what we have been saying thus far, quite the opposite... >> >> > The bpf maintainers will keep this stability promise. They obviously will not >> > reap it out of the kernel on the whim, but they will nuke it if this kfunc >> > will be in the way of the kernel innovation. >> >> ...and it is contradicted by this last bit. I mean "it's stable, but >> we'll remove it if it's in the way" is not, well, stable. > > Schrodinger's kfuncs :) Heh, yeah. There's a reason it's called Schrodinger's *uncertainty* principle, though. Documenting it is about removing some of that uncertainty so users of can actually know what to expect. Otherwise, there is a subset of potential users who will shy away from using kfuncs because it's perceived as "totally unstable, may change at any time". Which is a shame, because there are many such users who could benefit from using BPF. So in other words, even if we don't commit to a stability promise I think it's worth documenting expectations as precisely as we can. [...] >> > Back to deprecation... >> > I think KF_DEPRECATED is a good idea. >> > When kfunc will be auto emitted into vmlinux.h (or whatever other file) >> > or shipped in libbpf header we can emit >> > __attribute__((deprecated("reason", "replacement"))); >> > to that header file (so it's seen during bpf prog build) and >> > start dmesg warn on them in the verifier. >> > Kernel splats do get noticed. The users would have to act quickly. >> >> So how about documenting that bit? Something like: >> >> "We promise that kfuncs will not be removed without going through a >> deprecation phase. The length of the deprecation will be proportional to >> how long that kfunc has existed in the kernel, but will be no shorter >> than XX kernel releases." ? > > That's not something we can promise. > Take conntrack kfuncs. If netfilter folks decide to sunset > conntrack tomorrow we won't be standing in their way. Well, we could do one of two things: - Make a promise to commit to the deprecation procedure and tell subsystems not to add kfuncs unless they are OK with that (getting suitable ACKs for the existing users first, of course) or - Document that core-BPF kfuncs won't go away without a deprecation procedure, and have each subsystem using them document their own policies. I believe the latter is more in line what you and others have set as an expectation when discussing this previously? > On the other side the dynptr kfuncs are going to stay as-is for > foreseeable future because they don't rely on other kernel > subsystems to do the job. > Both cases may still change if users themselves > (after using it in prod) come back with reasons to change it. > > In the past the kernel devs were dictating the helper uapi to > users and users had no option, but to shut up and use what's available. > Now they will be able to use new apis and request changes. > At that point it will be a set of users X vs a set of users Y. > If ten users say that this kfunc sucks while one user > wants to keep it as-is we will introduce another kfunc and > will start deprecating the one that lost the vote. "Lost the vote"? This seems like a can of worms (who can vote? how are we counting them? etc). I think what you really mean here is something like "maintainers will take into consideration the opinions of users of the API and make a call as to whether the benefits of changing a kfunc outweighs the costs"? > The deprecation time window will depend on case by case > considering maintenance cost, etc. Right, that's not too far from what I proposed above. I still think it would be useful to commit to a minimum number of releases, though. Again, to set expectations. -Toke
diff --git a/Documentation/bpf/kfuncs.rst b/Documentation/bpf/kfuncs.rst index 9fd7fb539f85..dd40a4ee35f2 100644 --- a/Documentation/bpf/kfuncs.rst +++ b/Documentation/bpf/kfuncs.rst @@ -7,9 +7,9 @@ BPF Kernel Functions (kfuncs) BPF Kernel Functions or more commonly known as kfuncs are functions in the Linux kernel which are exposed for use by BPF programs. Unlike normal BPF helpers, -kfuncs do not have a stable interface and can change from one kernel release to -another. Hence, BPF programs need to be updated in response to changes in the -kernel. +kfuncs by default do not have a stable interface and can change from one kernel +release to another. Hence, BPF programs may need to be updated in response to +changes in the kernel. See :ref:`BPF_kfunc_stability`. 2. Defining a kfunc =================== @@ -223,14 +223,89 @@ type. An example is shown below:: } late_initcall(init_subsystem); -3. Core kfuncs + +.. _BPF_kfunc_stability: + +3. API (in)stability of kfuncs +============================== + +By default, kfuncs exported to BPF programs are considered a kernel-internal +interface that can change between kernel versions. This means that BPF programs +using kfuncs may need to adapt to changes between kernel versions. In the +extreme case that could also include removal of a kfunc. In other words, kfuncs +are _not_ part of the kernel UAPI! Rather, these kfuncs can be thought of as +being similar to internal kernel API functions exported using the +``EXPORT_SYMBOL_GPL`` macro. All new BPF kernel helper-like functionality must +initially start out as kfuncs. + +3.1 Promotion to "stable" kfuncs +-------------------------------- + +While kfuncs are by default considered unstable as described above, some kfuncs +may warrant a stronger stability guarantee and can be marked as *stable*. The +decision to move a kfunc to *stable* is taken on a case-by-case basis and must +clear a high bar, taking into account the functions' usefulness under +longer-term production deployment without any unforeseen API issues or +limitations. In general, it is not expected that every kfunc will turn into a +stable one - think of it as an exception rather than the norm. + +Those kfuncs which have been promoted to stable are then marked using the +``KF_STABLE`` tag. The process for requesting a kfunc be marked as stable +consists of submitting a patch to the bpf@vger.kernel.org mailing list adding +the ``KF_STABLE`` tag to that kfunc's definition. The patch description must +include the rationale for why the kfunc should be promoted to stable, including +references to existing production uses, etc. The patch will be considered the +same was as any other patch, and ultimately the decision on whether a kfunc +should be promoted to stable is taken by the BPF maintainers. + +Stable kfuncs provide the following stability guarantees: + +1. Stable kfuncs will not change their function signature or functionality in a + way that may cause incompatibilities for BPF programs calling the function. + +2. The BPF community will make every reasonable effort to keep stable kfuncs + around as long as they continue to be useful to real-world BPF applications. + +3. Should a stable kfunc turn out to be no longer useful, the BPF community may + decide to eventually remove it. In this case, before being removed that kfunc + will go through a deprecation procedure as outlined below. + +3.2 Deprecation of kfuncs +------------------------- + +As described above, the community will make every reasonable effort to keep +kfuncs available through future kernel versions once they are marked as stable. +However, it may happen case that BPF development moves in an unforeseen +direction so that even a stable kfunc ceases to be useful for program +development. + +In this case, stable kfuncs can be marked as *deprecated* using the +``KF_DEPRECATED`` tag. Such a deprecation request cannot be arbitrary and must +explain why a given stable kfunc should be deprecated. Once a kfunc is marked as +deprecated, the following procedure will be followed for removal: + +1. A kfunc marked as deprecated will be kept in the kernel for a conservatively + chosen period of time after it was first marked as deprecated (usually + corresponding to a span of multiple years). + +2. Deprecated functions will be documented in the kernel docs along with their + remaining lifespan and including a recommendation for new functionality that + can replace the usage of the deprecated function (or an explanation for why + no such replacement exists). + +3. After the deprecation period, the kfunc will be removed and the function name + will be marked as invalid inside the kernel (to ensure that no new kfunc is + accidentally introduced with the same name in the future). After this + happens, BPF programs calling the kfunc will be rejected by the verifier. + +4. Core kfuncs ============== The BPF subsystem provides a number of "core" kfuncs that are potentially applicable to a wide variety of different possible use cases and programs. Those kfuncs are documented here. -3.1 struct task_struct * kfuncs +4.1 struct task_struct * kfuncs ------------------------------- There are a number of kfuncs that allow ``struct task_struct *`` objects to be @@ -306,7 +381,7 @@ Here is an example of it being used: return 0; } -3.2 struct cgroup * kfuncs +4.2 struct cgroup * kfuncs -------------------------- ``struct cgroup *`` objects also have acquire and release functions:
Following up on the discussion at the BPF office hours, this patch adds a description of the (new) concept of "stable kfuncs", which are kfuncs that offer a "more stable" interface than what we have now, but is still not part of UAPI. This is mostly meant as a straw man proposal to focus discussions around stability guarantees. From the discussion, it seemed clear that there were at least some people (myself included) who felt that there needs to be some way to export functionality that we consider "stable" (in the sense of "applications can rely on its continuing existence"). One option is to keep BPF helpers as the stable interface and implement some technical solution for moving functionality from kfuncs to helpers once it has stood the test of time and we're comfortable committing to it as a stable API. Another is to freeze the helper definitions, and instead use kfuncs for this purpose as well, by marking a subset of them as "stable" in some way. Or we can do both and have multiple levels of "stable", I suppose. This patch is an attempt to describe what the "stable kfuncs" idea might look like, as well as to formulate some criteria for what we mean by "stable", and describe an explicit deprecation procedure. Feel free to critique any part of this (including rejecting the notion entirely). Some people mentioned (in the office hours) that should we decide to go in this direction, there's some work that needs to be done in libbpf (and probably the kernel too?) to bring the kfunc developer experience up to par with helpers. Things like exporting kfunc definitions to vmlinux.h (to make them discoverable), and having CO-RE support for using them, etc. I kinda consider that orthogonal to what's described here, but I do think we should fix those issues before implementing the procedures described here. v2: - Incorporate Daniel's changes Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> --- Documentation/bpf/kfuncs.rst | 87 +++++++++++++++++++++++++++++++++--- 1 file changed, 81 insertions(+), 6 deletions(-)