Message ID | 20220606103734.92423-1-kurt@linutronix.de (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | BPF |
Headers | show |
Series | [bpf-next] bpf: Add BPF-helper for accessing CLOCK_TAI | expand |
On Mon, Jun 6, 2022 at 3:38 AM Kurt Kanzenbach <kurt@linutronix.de> wrote: > > From: Jesper Dangaard Brouer <brouer@redhat.com> > > Commit 3dc6ffae2da2 ("timekeeping: Introduce fast accessor to clock tai") > introduced a fast and NMI-safe accessor for CLOCK_TAI. Especially in time > sensitive networks (TSN), where all nodes are synchronized by Precision Time > Protocol (PTP), it's helpful to have the possibility to generate timestamps > based on CLOCK_TAI instead of CLOCK_MONOTONIC. With a BPF helper for TAI in > place, it becomes very convenient to correlate activity across different > machines in the network. That's a fresh feature. It feels risky to bake it into uapi already. imo it would be better to annotate tk_core variable in vmlinux BTF. Then progs will be able to read all possible timekeeper offsets.
Alexei, On Mon, Jun 06 2022 at 08:57, Alexei Starovoitov wrote: > On Mon, Jun 6, 2022 at 3:38 AM Kurt Kanzenbach <kurt@linutronix.de> wrote: >> >> From: Jesper Dangaard Brouer <brouer@redhat.com> >> >> Commit 3dc6ffae2da2 ("timekeeping: Introduce fast accessor to clock tai") >> introduced a fast and NMI-safe accessor for CLOCK_TAI. Especially in time >> sensitive networks (TSN), where all nodes are synchronized by Precision Time >> Protocol (PTP), it's helpful to have the possibility to generate timestamps >> based on CLOCK_TAI instead of CLOCK_MONOTONIC. With a BPF helper for TAI in >> place, it becomes very convenient to correlate activity across different >> machines in the network. > > That's a fresh feature. It feels risky to bake it into uapi already. What? That's just support for a different CLOCK. What's so risky about it? > imo it would be better to annotate tk_core variable in vmlinux BTF. > Then progs will be able to read all possible timekeeper offsets. We are exposing APIs. APIs can be supported, but exposing implementation details creates ABIs of the worst sort because that prevents the kernel from changing the implementation. We've seen the fallout with the recent tracepoint changes already. Thanks, tglx
On 07/06/2022 11.14, Thomas Gleixner wrote: > Alexei, > > On Mon, Jun 06 2022 at 08:57, Alexei Starovoitov wrote: >> On Mon, Jun 6, 2022 at 3:38 AM Kurt Kanzenbach <kurt@linutronix.de> wrote: >>> >>> From: Jesper Dangaard Brouer <brouer@redhat.com> >>> >>> Commit 3dc6ffae2da2 ("timekeeping: Introduce fast accessor to clock tai") >>> introduced a fast and NMI-safe accessor for CLOCK_TAI. Especially in time >>> sensitive networks (TSN), where all nodes are synchronized by Precision Time >>> Protocol (PTP), it's helpful to have the possibility to generate timestamps >>> based on CLOCK_TAI instead of CLOCK_MONOTONIC. With a BPF helper for TAI in >>> place, it becomes very convenient to correlate activity across different >>> machines in the network. >> >> That's a fresh feature. It feels risky to bake it into uapi already. > > What? That's just support for a different CLOCK. What's so risky about > it? I didn't think it was "risky" as this is already exported as: EXPORT_SYMBOL_GPL(ktime_get_tai_fast_ns); Correct me if I'm wrong, but this simple gives BPF access to CLOCK_TAI (see man clock_gettime(2)), right? And CLOCK_TAI is not really a new/fresh type of CLOCK. Especially for networking we need this CLOCK_TAI time as HW LaunchTime need this (e.g. see qdisc's sch_etf.c and sch_taprio.c). > >> imo it would be better to annotate tk_core variable in vmlinux BTF. >> Then progs will be able to read all possible timekeeper offsets. > > We are exposing APIs. APIs can be supported, but exposing implementation > details creates ABIs of the worst sort because that prevents the kernel > from changing the implementation. We've seen the fallout with the recent > tracepoint changes already. Hmm... annotate tk_core variable in vmlinux BTF and letting BPF progs access this seems like an unsafe approach and we tempt BPF-developers to think other parts are okay to access. Accessing timekeeper->offs_tai might be okay as it is already "marked" with data_race(tk->offs_tai), but I'm not sure about other members, as I'm not expert in this area. I assume that the include filename <linux/timekeeper_internal.h> indicate that the maintainers don't want to open up access to struct timekeeper... --Jesper
On Tue, Jun 7, 2022 at 12:35 PM Jesper Dangaard Brouer <jbrouer@redhat.com> wrote: > > > On 07/06/2022 11.14, Thomas Gleixner wrote: > > Alexei, > > > > On Mon, Jun 06 2022 at 08:57, Alexei Starovoitov wrote: > >> On Mon, Jun 6, 2022 at 3:38 AM Kurt Kanzenbach <kurt@linutronix.de> wrote: > >>> > >>> From: Jesper Dangaard Brouer <brouer@redhat.com> > >>> > >>> Commit 3dc6ffae2da2 ("timekeeping: Introduce fast accessor to clock tai") > >>> introduced a fast and NMI-safe accessor for CLOCK_TAI. Especially in time > >>> sensitive networks (TSN), where all nodes are synchronized by Precision Time > >>> Protocol (PTP), it's helpful to have the possibility to generate timestamps > >>> based on CLOCK_TAI instead of CLOCK_MONOTONIC. With a BPF helper for TAI in > >>> place, it becomes very convenient to correlate activity across different > >>> machines in the network. > >> > >> That's a fresh feature. It feels risky to bake it into uapi already. > > > > What? That's just support for a different CLOCK. What's so risky about > > it? > > I didn't think it was "risky" as this is already exported as: > EXPORT_SYMBOL_GPL(ktime_get_tai_fast_ns); > > Correct me if I'm wrong, but this simple gives BPF access to CLOCK_TAI > (see man clock_gettime(2)), right? > And CLOCK_TAI is not really a new/fresh type of CLOCK. > > Especially for networking we need this CLOCK_TAI time as HW LaunchTime > need this (e.g. see qdisc's sch_etf.c and sch_taprio.c). I see. I interpreted the commit log that commit 3dc6ffae2da2 introduced TAI into the kernel for the first time. But it introduced the NMI safe version of it, right? > > > >> imo it would be better to annotate tk_core variable in vmlinux BTF. > >> Then progs will be able to read all possible timekeeper offsets. > > > > We are exposing APIs. APIs can be supported, but exposing implementation > > details creates ABIs of the worst sort because that prevents the kernel > > from changing the implementation. We've seen the fallout with the recent > > tracepoint changes already. > > Hmm... annotate tk_core variable in vmlinux BTF and letting BPF progs > access this seems like an unsafe approach and we tempt BPF-developers to > think other parts are okay to access. It is safe to access. Whether garbage will be read it's a different story. The following works (with lose definition of 'works'): extern const void tk_core __ksym; struct timekeeper { long long offs_tai; } __attribute__((preserve_access_index)); struct seqcount_raw_spinlock { } __attribute__((preserve_access_index)); long get_clock_tai(void) { long long off = 0; void *addr = (void *)&tk_core + ((bpf_core_type_size(struct seqcount_raw_spinlock) + 7) / 8) * 8 + bpf_core_field_offset(struct timekeeper, offs_tai); bpf_probe_read_kernel(&off, sizeof(off), addr); return bpf_ktime_get_ns() + off; } It's ugly, but no kernel changes are necessary. If you need to access clock_tai you can do so on older kernels too. Even on android 4.19 kernels. It's not foolproof. Future kernel changes will break it, but CO-RE will detect the breakage. The prog author would have to adjust the prog every time. People do these kinds of tricks all the time. Note that above was possible even before CO-RE came to existence. The first day bpf_probe_read_kernel() became available 8 years ago the tracing progs could read any kernel memory. Even earlier kprobes could read it for 10+ years. And in all those years bpf progs accessing kernel internals didn't freeze kernel development. bpf progs don't extend uapi surface unless the changes are in uapi/bpf.h. Anyway I guess new helper bpf_ktime_get_tai_ns() is ok, since it's so trivial, but selftest is necessary.
On Tue Jun 07 2022, Alexei Starovoitov wrote: > On Tue, Jun 7, 2022 at 12:35 PM Jesper Dangaard Brouer > <jbrouer@redhat.com> wrote: >> >> >> On 07/06/2022 11.14, Thomas Gleixner wrote: >> > Alexei, >> > >> > On Mon, Jun 06 2022 at 08:57, Alexei Starovoitov wrote: >> >> On Mon, Jun 6, 2022 at 3:38 AM Kurt Kanzenbach <kurt@linutronix.de> wrote: >> >>> >> >>> From: Jesper Dangaard Brouer <brouer@redhat.com> >> >>> >> >>> Commit 3dc6ffae2da2 ("timekeeping: Introduce fast accessor to clock tai") >> >>> introduced a fast and NMI-safe accessor for CLOCK_TAI. Especially in time >> >>> sensitive networks (TSN), where all nodes are synchronized by Precision Time >> >>> Protocol (PTP), it's helpful to have the possibility to generate timestamps >> >>> based on CLOCK_TAI instead of CLOCK_MONOTONIC. With a BPF helper for TAI in >> >>> place, it becomes very convenient to correlate activity across different >> >>> machines in the network. >> >> >> >> That's a fresh feature. It feels risky to bake it into uapi already. >> > >> > What? That's just support for a different CLOCK. What's so risky about >> > it? >> >> I didn't think it was "risky" as this is already exported as: >> EXPORT_SYMBOL_GPL(ktime_get_tai_fast_ns); >> >> Correct me if I'm wrong, but this simple gives BPF access to CLOCK_TAI >> (see man clock_gettime(2)), right? >> And CLOCK_TAI is not really a new/fresh type of CLOCK. >> >> Especially for networking we need this CLOCK_TAI time as HW LaunchTime >> need this (e.g. see qdisc's sch_etf.c and sch_taprio.c). In addition to Tx launchtime I have two other uses cases in mind: Timestamping and policing. > > I see. I interpreted the commit log that commit 3dc6ffae2da2 > introduced TAI into the kernel for the first time. > But it introduced the NMI safe version of it, right? Yes, exactly. The clock itself is nothing new. Only the NMI safe version of it is. It is designated to be used e.g., in tracing or bpf. I'll update the changelog. > >> > >> >> imo it would be better to annotate tk_core variable in vmlinux BTF. >> >> Then progs will be able to read all possible timekeeper offsets. >> > >> > We are exposing APIs. APIs can be supported, but exposing implementation >> > details creates ABIs of the worst sort because that prevents the kernel >> > from changing the implementation. We've seen the fallout with the recent >> > tracepoint changes already. >> >> Hmm... annotate tk_core variable in vmlinux BTF and letting BPF progs >> access this seems like an unsafe approach and we tempt BPF-developers to >> think other parts are okay to access. > > It is safe to access. > Whether garbage will be read it's a different story. > > The following works (with lose definition of 'works'): > > extern const void tk_core __ksym; > > struct timekeeper { > long long offs_tai; > } __attribute__((preserve_access_index)); > > struct seqcount_raw_spinlock { > } __attribute__((preserve_access_index)); > > long get_clock_tai(void) > { > long long off = 0; > void *addr = (void *)&tk_core + > ((bpf_core_type_size(struct seqcount_raw_spinlock) + 7) / 8) * 8 + > bpf_core_field_offset(struct timekeeper, offs_tai); > > bpf_probe_read_kernel(&off, sizeof(off), addr); > return bpf_ktime_get_ns() + off; > } Thanks for the example. > > It's ugly, but no kernel changes are necessary. > If you need to access clock_tai you can do so on older kernels too. > Even on android 4.19 kernels. > > It's not foolproof. Future kernel changes will break it, > but CO-RE will detect the breakage. > The prog author would have to adjust the prog every time. > > People do these kinds of tricks all the time. > > Note that above was possible even before CO-RE came to existence. > The first day bpf_probe_read_kernel() became available 8 years ago > the tracing progs could read any kernel memory. > Even earlier kprobes could read it for 10+ years. > > And in all those years bpf progs accessing kernel internals > didn't freeze kernel development. bpf progs don't extend > uapi surface unless the changes are in uapi/bpf.h. > > Anyway I guess new helper bpf_ktime_get_tai_ns() is ok, since > it's so trivial, but selftest is necessary. OK. I'll add a selftest and resolve the warnings detected by netdev ci. Thanks, Kurt
Hi Alexei, On Tue Jun 07 2022, Alexei Starovoitov wrote: > Anyway I guess new helper bpf_ktime_get_tai_ns() is ok, since > it's so trivial, but selftest is necessary. So, I did write a selftest [1] for testing bpf_ktime_get_tai_ns() and verifying that the access to the clock works. It uses AF_XDP sockets and timestamps the incoming packets. The timestamps are then validated in user space. Since AF_XDP related code is migrating from libbpf to libxdp, I'm wondering if that sample fits into the kernel's selftests or not. What kind of selftest are you looking for? Thanks, Kurt [1] - https://github.com/shifty91/xdp-timestamping/tree/master/src
On Tue, Aug 2, 2022 at 12:06 AM Kurt Kanzenbach <kurt@linutronix.de> wrote: > > Hi Alexei, > > On Tue Jun 07 2022, Alexei Starovoitov wrote: > > Anyway I guess new helper bpf_ktime_get_tai_ns() is ok, since > > it's so trivial, but selftest is necessary. > > So, I did write a selftest [1] for testing bpf_ktime_get_tai_ns() and > verifying that the access to the clock works. It uses AF_XDP sockets and > timestamps the incoming packets. The timestamps are then validated in > user space. > > Since AF_XDP related code is migrating from libbpf to libxdp, I'm > wondering if that sample fits into the kernel's selftests or not. What > kind of selftest are you looking for? Please use selftests/bpf framework. There are plenty of networking tests in there. bpf_ktime_get_tai_ns() doesn't have to rely on af_xdp. It can be skb based.
On Tue Aug 02 2022, Alexei Starovoitov wrote: > On Tue, Aug 2, 2022 at 12:06 AM Kurt Kanzenbach <kurt@linutronix.de> wrote: >> >> Hi Alexei, >> >> On Tue Jun 07 2022, Alexei Starovoitov wrote: >> > Anyway I guess new helper bpf_ktime_get_tai_ns() is ok, since >> > it's so trivial, but selftest is necessary. >> >> So, I did write a selftest [1] for testing bpf_ktime_get_tai_ns() and >> verifying that the access to the clock works. It uses AF_XDP sockets and >> timestamps the incoming packets. The timestamps are then validated in >> user space. >> >> Since AF_XDP related code is migrating from libbpf to libxdp, I'm >> wondering if that sample fits into the kernel's selftests or not. What >> kind of selftest are you looking for? > > Please use selftests/bpf framework. > There are plenty of networking tests in there. > bpf_ktime_get_tai_ns() doesn't have to rely on af_xdp. OK. > It can be skb based. Something like this? +++ b/tools/testing/selftests/bpf/prog_tests/check_tai.c @@ -0,0 +1,57 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (C) 2022 Linutronix GmbH */ + +#include <test_progs.h> +#include <network_helpers.h> + +#include <time.h> +#include <stdint.h> + +#define TAI_THRESHOLD 1000000000ULL /* 1s */ +#define NSEC_PER_SEC 1000000000ULL + +static __u64 ts_to_ns(const struct timespec *ts) +{ + return ts->tv_sec * NSEC_PER_SEC + ts->tv_nsec; +} + +void test_tai(void) +{ + struct __sk_buff skb = { + .tstamp = 0, + .hwtstamp = 0, + }; + LIBBPF_OPTS(bpf_test_run_opts, topts, + .data_in = &pkt_v4, + .data_size_in = sizeof(pkt_v4), + .ctx_in = &skb, + .ctx_size_in = sizeof(skb), + .ctx_out = &skb, + .ctx_size_out = sizeof(skb), + ); + struct timespec now_tai; + struct bpf_object *obj; + int ret, prog_fd; + + ret = bpf_prog_test_load("./test_tai.o", + BPF_PROG_TYPE_SCHED_CLS, &obj, &prog_fd); + if (!ASSERT_OK(ret, "load")) + return; + ret = bpf_prog_test_run_opts(prog_fd, &topts); + ASSERT_OK(ret, "test_run"); + + /* TAI != 0 */ + ASSERT_NEQ(skb.tstamp, 0, "tai_ts0_0"); + ASSERT_NEQ(skb.hwtstamp, 0, "tai_ts0_1"); + + /* TAI is moving forward only */ + ASSERT_GT(skb.hwtstamp, skb.tstamp, "tai_forward"); + + /* Check for reasoneable range */ + ret = clock_gettime(CLOCK_TAI, &now_tai); + ASSERT_EQ(ret, 0, "tai_gettime"); + ASSERT_TRUE((ts_to_ns(&now_tai) - skb.hwtstamp) < TAI_THRESHOLD, + "tai_range"); + + bpf_object__close(obj); +} diff --git a/tools/testing/selftests/bpf/progs/test_tai.c b/tools/testing/selftests/bpf/progs/test_tai.c new file mode 100644 index 000000000000..34ac4175e29d --- /dev/null +++ b/tools/testing/selftests/bpf/progs/test_tai.c @@ -0,0 +1,17 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (C) 2022 Linutronix GmbH */ + +#include <linux/bpf.h> +#include <bpf/bpf_helpers.h> + +char _license[] SEC("license") = "GPL"; + +SEC("tc") +int save_tai(struct __sk_buff *skb) +{ + /* Save TAI timestamps */ + skb->tstamp = bpf_ktime_get_tai_ns(); + skb->hwtstamp = bpf_ktime_get_tai_ns(); + + return 0; +}
On Wed, Aug 03, 2022 at 08:29:29AM +0200, Kurt Kanzenbach wrote: > On Tue Aug 02 2022, Alexei Starovoitov wrote: > > On Tue, Aug 2, 2022 at 12:06 AM Kurt Kanzenbach <kurt@linutronix.de> wrote: > >> > >> Hi Alexei, > >> > >> On Tue Jun 07 2022, Alexei Starovoitov wrote: > >> > Anyway I guess new helper bpf_ktime_get_tai_ns() is ok, since > >> > it's so trivial, but selftest is necessary. > >> > >> So, I did write a selftest [1] for testing bpf_ktime_get_tai_ns() and > >> verifying that the access to the clock works. It uses AF_XDP sockets and > >> timestamps the incoming packets. The timestamps are then validated in > >> user space. > >> > >> Since AF_XDP related code is migrating from libbpf to libxdp, I'm > >> wondering if that sample fits into the kernel's selftests or not. What > >> kind of selftest are you looking for? > > > > Please use selftests/bpf framework. > > There are plenty of networking tests in there. > > bpf_ktime_get_tai_ns() doesn't have to rely on af_xdp. > > OK. > > > It can be skb based. FWIW there is xskxceiver and libbpf's xsk part in selftests/bpf framework, so your initial work should be fine in there. Personally I found both (AF_XDP and SKB one, below) tests valuable. Later on, if we add a support to xskxceiver for loading external BPF progs then your sample would just become another test case in there. > > Something like this? > > +++ b/tools/testing/selftests/bpf/prog_tests/check_tai.c > @@ -0,0 +1,57 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* Copyright (C) 2022 Linutronix GmbH */ > + > +#include <test_progs.h> > +#include <network_helpers.h> > + > +#include <time.h> > +#include <stdint.h> > + > +#define TAI_THRESHOLD 1000000000ULL /* 1s */ > +#define NSEC_PER_SEC 1000000000ULL > + > +static __u64 ts_to_ns(const struct timespec *ts) > +{ > + return ts->tv_sec * NSEC_PER_SEC + ts->tv_nsec; > +} > + > +void test_tai(void) > +{ > + struct __sk_buff skb = { > + .tstamp = 0, > + .hwtstamp = 0, > + }; > + LIBBPF_OPTS(bpf_test_run_opts, topts, > + .data_in = &pkt_v4, > + .data_size_in = sizeof(pkt_v4), > + .ctx_in = &skb, > + .ctx_size_in = sizeof(skb), > + .ctx_out = &skb, > + .ctx_size_out = sizeof(skb), > + ); > + struct timespec now_tai; > + struct bpf_object *obj; > + int ret, prog_fd; > + > + ret = bpf_prog_test_load("./test_tai.o", > + BPF_PROG_TYPE_SCHED_CLS, &obj, &prog_fd); > + if (!ASSERT_OK(ret, "load")) > + return; > + ret = bpf_prog_test_run_opts(prog_fd, &topts); > + ASSERT_OK(ret, "test_run"); > + > + /* TAI != 0 */ > + ASSERT_NEQ(skb.tstamp, 0, "tai_ts0_0"); > + ASSERT_NEQ(skb.hwtstamp, 0, "tai_ts0_1"); > + > + /* TAI is moving forward only */ > + ASSERT_GT(skb.hwtstamp, skb.tstamp, "tai_forward"); > + > + /* Check for reasoneable range */ > + ret = clock_gettime(CLOCK_TAI, &now_tai); > + ASSERT_EQ(ret, 0, "tai_gettime"); > + ASSERT_TRUE((ts_to_ns(&now_tai) - skb.hwtstamp) < TAI_THRESHOLD, > + "tai_range"); > + > + bpf_object__close(obj); > +} > diff --git a/tools/testing/selftests/bpf/progs/test_tai.c b/tools/testing/selftests/bpf/progs/test_tai.c > new file mode 100644 > index 000000000000..34ac4175e29d > --- /dev/null > +++ b/tools/testing/selftests/bpf/progs/test_tai.c > @@ -0,0 +1,17 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* Copyright (C) 2022 Linutronix GmbH */ > + > +#include <linux/bpf.h> > +#include <bpf/bpf_helpers.h> > + > +char _license[] SEC("license") = "GPL"; > + > +SEC("tc") > +int save_tai(struct __sk_buff *skb) > +{ > + /* Save TAI timestamps */ > + skb->tstamp = bpf_ktime_get_tai_ns(); > + skb->hwtstamp = bpf_ktime_get_tai_ns(); > + > + return 0; > +}
On Tue, Aug 2, 2022 at 11:29 PM Kurt Kanzenbach <kurt@linutronix.de> wrote: > > On Tue Aug 02 2022, Alexei Starovoitov wrote: > > On Tue, Aug 2, 2022 at 12:06 AM Kurt Kanzenbach <kurt@linutronix.de> wrote: > >> > >> Hi Alexei, > >> > >> On Tue Jun 07 2022, Alexei Starovoitov wrote: > >> > Anyway I guess new helper bpf_ktime_get_tai_ns() is ok, since > >> > it's so trivial, but selftest is necessary. > >> > >> So, I did write a selftest [1] for testing bpf_ktime_get_tai_ns() and > >> verifying that the access to the clock works. It uses AF_XDP sockets and > >> timestamps the incoming packets. The timestamps are then validated in > >> user space. > >> > >> Since AF_XDP related code is migrating from libbpf to libxdp, I'm > >> wondering if that sample fits into the kernel's selftests or not. What > >> kind of selftest are you looking for? > > > > Please use selftests/bpf framework. > > There are plenty of networking tests in there. > > bpf_ktime_get_tai_ns() doesn't have to rely on af_xdp. > > OK. > > > It can be skb based. > > Something like this? > > +++ b/tools/testing/selftests/bpf/prog_tests/check_tai.c > @@ -0,0 +1,57 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* Copyright (C) 2022 Linutronix GmbH */ > + > +#include <test_progs.h> > +#include <network_helpers.h> > + > +#include <time.h> > +#include <stdint.h> > + > +#define TAI_THRESHOLD 1000000000ULL /* 1s */ > +#define NSEC_PER_SEC 1000000000ULL > + > +static __u64 ts_to_ns(const struct timespec *ts) > +{ > + return ts->tv_sec * NSEC_PER_SEC + ts->tv_nsec; > +} > + > +void test_tai(void) > +{ > + struct __sk_buff skb = { > + .tstamp = 0, > + .hwtstamp = 0, > + }; > + LIBBPF_OPTS(bpf_test_run_opts, topts, > + .data_in = &pkt_v4, > + .data_size_in = sizeof(pkt_v4), > + .ctx_in = &skb, > + .ctx_size_in = sizeof(skb), > + .ctx_out = &skb, > + .ctx_size_out = sizeof(skb), > + ); > + struct timespec now_tai; > + struct bpf_object *obj; > + int ret, prog_fd; > + > + ret = bpf_prog_test_load("./test_tai.o", > + BPF_PROG_TYPE_SCHED_CLS, &obj, &prog_fd); it would be best to rely on BPF skeleton, please see other tests including *.skel.h, thanks > + if (!ASSERT_OK(ret, "load")) > + return; > + ret = bpf_prog_test_run_opts(prog_fd, &topts); > + ASSERT_OK(ret, "test_run"); > + > + /* TAI != 0 */ > + ASSERT_NEQ(skb.tstamp, 0, "tai_ts0_0"); > + ASSERT_NEQ(skb.hwtstamp, 0, "tai_ts0_1"); > + > + /* TAI is moving forward only */ > + ASSERT_GT(skb.hwtstamp, skb.tstamp, "tai_forward"); > + > + /* Check for reasoneable range */ > + ret = clock_gettime(CLOCK_TAI, &now_tai); > + ASSERT_EQ(ret, 0, "tai_gettime"); > + ASSERT_TRUE((ts_to_ns(&now_tai) - skb.hwtstamp) < TAI_THRESHOLD, > + "tai_range"); > + > + bpf_object__close(obj); > +} > diff --git a/tools/testing/selftests/bpf/progs/test_tai.c b/tools/testing/selftests/bpf/progs/test_tai.c > new file mode 100644 > index 000000000000..34ac4175e29d > --- /dev/null > +++ b/tools/testing/selftests/bpf/progs/test_tai.c > @@ -0,0 +1,17 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* Copyright (C) 2022 Linutronix GmbH */ > + > +#include <linux/bpf.h> > +#include <bpf/bpf_helpers.h> > + > +char _license[] SEC("license") = "GPL"; > + > +SEC("tc") > +int save_tai(struct __sk_buff *skb) > +{ > + /* Save TAI timestamps */ > + skb->tstamp = bpf_ktime_get_tai_ns(); > + skb->hwtstamp = bpf_ktime_get_tai_ns(); > + > + return 0; > +}
On Wed, Aug 3, 2022 at 2:29 AM Maciej Fijalkowski <maciej.fijalkowski@intel.com> wrote: > > On Wed, Aug 03, 2022 at 08:29:29AM +0200, Kurt Kanzenbach wrote: > > On Tue Aug 02 2022, Alexei Starovoitov wrote: > > > On Tue, Aug 2, 2022 at 12:06 AM Kurt Kanzenbach <kurt@linutronix.de> wrote: > > >> > > >> Hi Alexei, > > >> > > >> On Tue Jun 07 2022, Alexei Starovoitov wrote: > > >> > Anyway I guess new helper bpf_ktime_get_tai_ns() is ok, since > > >> > it's so trivial, but selftest is necessary. > > >> > > >> So, I did write a selftest [1] for testing bpf_ktime_get_tai_ns() and > > >> verifying that the access to the clock works. It uses AF_XDP sockets and > > >> timestamps the incoming packets. The timestamps are then validated in > > >> user space. > > >> > > >> Since AF_XDP related code is migrating from libbpf to libxdp, I'm > > >> wondering if that sample fits into the kernel's selftests or not. What > > >> kind of selftest are you looking for? > > > > > > Please use selftests/bpf framework. > > > There are plenty of networking tests in there. > > > bpf_ktime_get_tai_ns() doesn't have to rely on af_xdp. > > > > OK. > > > > > It can be skb based. > > FWIW there is xskxceiver and libbpf's xsk part in selftests/bpf framework, > so your initial work should be fine in there. Personally I found both > (AF_XDP and SKB one, below) tests valuable. test_progs is always tested on each patch/patch set. xskxceiver can potentially break without anyone noticing for a while. So having something like this in test_progs is much better. > > Later on, if we add a support to xskxceiver for loading external BPF progs > then your sample would just become another test case in there. > > > > > Something like this? > > > > +++ b/tools/testing/selftests/bpf/prog_tests/check_tai.c > > @@ -0,0 +1,57 @@ > > +// SPDX-License-Identifier: GPL-2.0 > > +/* Copyright (C) 2022 Linutronix GmbH */ > > + > > +#include <test_progs.h> > > +#include <network_helpers.h> > > + > > +#include <time.h> > > +#include <stdint.h> > > + > > +#define TAI_THRESHOLD 1000000000ULL /* 1s */ > > +#define NSEC_PER_SEC 1000000000ULL > > + > > +static __u64 ts_to_ns(const struct timespec *ts) > > +{ > > + return ts->tv_sec * NSEC_PER_SEC + ts->tv_nsec; > > +} > > + > > +void test_tai(void) > > +{ > > + struct __sk_buff skb = { > > + .tstamp = 0, > > + .hwtstamp = 0, > > + }; > > + LIBBPF_OPTS(bpf_test_run_opts, topts, > > + .data_in = &pkt_v4, > > + .data_size_in = sizeof(pkt_v4), > > + .ctx_in = &skb, > > + .ctx_size_in = sizeof(skb), > > + .ctx_out = &skb, > > + .ctx_size_out = sizeof(skb), > > + ); > > + struct timespec now_tai; > > + struct bpf_object *obj; > > + int ret, prog_fd; > > + > > + ret = bpf_prog_test_load("./test_tai.o", > > + BPF_PROG_TYPE_SCHED_CLS, &obj, &prog_fd); > > + if (!ASSERT_OK(ret, "load")) > > + return; > > + ret = bpf_prog_test_run_opts(prog_fd, &topts); > > + ASSERT_OK(ret, "test_run"); > > + > > + /* TAI != 0 */ > > + ASSERT_NEQ(skb.tstamp, 0, "tai_ts0_0"); > > + ASSERT_NEQ(skb.hwtstamp, 0, "tai_ts0_1"); > > + > > + /* TAI is moving forward only */ > > + ASSERT_GT(skb.hwtstamp, skb.tstamp, "tai_forward"); > > + > > + /* Check for reasoneable range */ > > + ret = clock_gettime(CLOCK_TAI, &now_tai); > > + ASSERT_EQ(ret, 0, "tai_gettime"); > > + ASSERT_TRUE((ts_to_ns(&now_tai) - skb.hwtstamp) < TAI_THRESHOLD, > > + "tai_range"); > > + > > + bpf_object__close(obj); > > +} > > diff --git a/tools/testing/selftests/bpf/progs/test_tai.c b/tools/testing/selftests/bpf/progs/test_tai.c > > new file mode 100644 > > index 000000000000..34ac4175e29d > > --- /dev/null > > +++ b/tools/testing/selftests/bpf/progs/test_tai.c > > @@ -0,0 +1,17 @@ > > +// SPDX-License-Identifier: GPL-2.0 > > +/* Copyright (C) 2022 Linutronix GmbH */ > > + > > +#include <linux/bpf.h> > > +#include <bpf/bpf_helpers.h> > > + > > +char _license[] SEC("license") = "GPL"; > > + > > +SEC("tc") > > +int save_tai(struct __sk_buff *skb) > > +{ > > + /* Save TAI timestamps */ > > + skb->tstamp = bpf_ktime_get_tai_ns(); > > + skb->hwtstamp = bpf_ktime_get_tai_ns(); > > + > > + return 0; > > +} > >
On Wed Aug 03 2022, Andrii Nakryiko wrote: >> +void test_tai(void) >> +{ >> + struct __sk_buff skb = { >> + .tstamp = 0, >> + .hwtstamp = 0, >> + }; >> + LIBBPF_OPTS(bpf_test_run_opts, topts, >> + .data_in = &pkt_v4, >> + .data_size_in = sizeof(pkt_v4), >> + .ctx_in = &skb, >> + .ctx_size_in = sizeof(skb), >> + .ctx_out = &skb, >> + .ctx_size_out = sizeof(skb), >> + ); >> + struct timespec now_tai; >> + struct bpf_object *obj; >> + int ret, prog_fd; >> + >> + ret = bpf_prog_test_load("./test_tai.o", >> + BPF_PROG_TYPE_SCHED_CLS, &obj, &prog_fd); > > it would be best to rely on BPF skeleton, please see other tests > including *.skel.h, thanks > Ah, nice. Adjusted the test case accordingly. Will post v2 after the merge window. Thanks!
On Wed, Aug 3, 2022 at 11:40 PM Kurt Kanzenbach <kurt@linutronix.de> wrote: > > On Wed Aug 03 2022, Andrii Nakryiko wrote: > >> +void test_tai(void) > >> +{ > >> + struct __sk_buff skb = { > >> + .tstamp = 0, > >> + .hwtstamp = 0, > >> + }; > >> + LIBBPF_OPTS(bpf_test_run_opts, topts, > >> + .data_in = &pkt_v4, > >> + .data_size_in = sizeof(pkt_v4), > >> + .ctx_in = &skb, > >> + .ctx_size_in = sizeof(skb), > >> + .ctx_out = &skb, > >> + .ctx_size_out = sizeof(skb), > >> + ); > >> + struct timespec now_tai; > >> + struct bpf_object *obj; > >> + int ret, prog_fd; > >> + > >> + ret = bpf_prog_test_load("./test_tai.o", > >> + BPF_PROG_TYPE_SCHED_CLS, &obj, &prog_fd); > > > > it would be best to rely on BPF skeleton, please see other tests > > including *.skel.h, thanks > > > > Ah, nice. Adjusted the test case accordingly. Will post v2 after the > merge window. Thanks! We don't close bpf-next during the merge window, so feel free to send v2 early.
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index f4009dbdf62d..5f240d5d30f6 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -5249,6 +5249,18 @@ union bpf_attr { * Pointer to the underlying dynptr data, NULL if the dynptr is * read-only, if the dynptr is invalid, or if the offset and length * is out of bounds. + * + * u64 bpf_ktime_get_tai_ns(void) + * Description + * A nonsettable system-wide clock derived from wall-clock time but + * ignoring leap seconds. This clock does not experience + * discontinuities and backwards jumps caused by NTP inserting leap + * seconds as CLOCK_REALTIME does. + * + * See: **clock_gettime**\ (**CLOCK_TAI**) + * Return + * Current *ktime*. + * */ #define __BPF_FUNC_MAPPER(FN) \ FN(unspec), \ @@ -5455,6 +5467,7 @@ union bpf_attr { FN(dynptr_read), \ FN(dynptr_write), \ FN(dynptr_data), \ + FN(ktime_get_tai_ns), \ /* */ /* integer value in 'imm' field of BPF_CALL instruction selects which helper diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index e78cc5eea4a5..edfa716c3528 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -2641,6 +2641,7 @@ const struct bpf_func_proto bpf_get_numa_node_id_proto __weak; const struct bpf_func_proto bpf_ktime_get_ns_proto __weak; const struct bpf_func_proto bpf_ktime_get_boot_ns_proto __weak; const struct bpf_func_proto bpf_ktime_get_coarse_ns_proto __weak; +const struct bpf_func_proto bpf_ktime_get_tai_ns_proto __weak; const struct bpf_func_proto bpf_get_current_pid_tgid_proto __weak; const struct bpf_func_proto bpf_get_current_uid_gid_proto __weak; diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c index 225806a02efb..981b34d1e551 100644 --- a/kernel/bpf/helpers.c +++ b/kernel/bpf/helpers.c @@ -198,6 +198,18 @@ const struct bpf_func_proto bpf_ktime_get_coarse_ns_proto = { .ret_type = RET_INTEGER, }; +BPF_CALL_0(bpf_ktime_get_tai_ns) +{ + /* NMI safe access to clock tai */ + return ktime_get_tai_fast_ns(); +} + +const struct bpf_func_proto bpf_ktime_get_tai_ns_proto = { + .func = bpf_ktime_get_tai_ns, + .gpl_only = false, + .ret_type = RET_INTEGER, +}; + BPF_CALL_0(bpf_get_current_pid_tgid) { struct task_struct *task = current; @@ -1613,6 +1625,8 @@ bpf_base_func_proto(enum bpf_func_id func_id) return &bpf_ktime_get_ns_proto; case BPF_FUNC_ktime_get_boot_ns: return &bpf_ktime_get_boot_ns_proto; + case BPF_FUNC_ktime_get_tai_ns: + return &bpf_ktime_get_tai_ns_proto; case BPF_FUNC_ringbuf_output: return &bpf_ringbuf_output_proto; case BPF_FUNC_ringbuf_reserve: diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index f4009dbdf62d..5f240d5d30f6 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -5249,6 +5249,18 @@ union bpf_attr { * Pointer to the underlying dynptr data, NULL if the dynptr is * read-only, if the dynptr is invalid, or if the offset and length * is out of bounds. + * + * u64 bpf_ktime_get_tai_ns(void) + * Description + * A nonsettable system-wide clock derived from wall-clock time but + * ignoring leap seconds. This clock does not experience + * discontinuities and backwards jumps caused by NTP inserting leap + * seconds as CLOCK_REALTIME does. + * + * See: **clock_gettime**\ (**CLOCK_TAI**) + * Return + * Current *ktime*. + * */ #define __BPF_FUNC_MAPPER(FN) \ FN(unspec), \ @@ -5455,6 +5467,7 @@ union bpf_attr { FN(dynptr_read), \ FN(dynptr_write), \ FN(dynptr_data), \ + FN(ktime_get_tai_ns), \ /* */ /* integer value in 'imm' field of BPF_CALL instruction selects which helper