Message ID | 1605151497-29986-4-git-send-email-wenxu@ucloud.cn (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Series | net/sched: fix over mtu packet of defrag in | expand |
Context | Check | Description |
---|---|---|
netdev/cover_letter | success | Link |
netdev/fixes_present | success | Link |
netdev/patch_count | success | Link |
netdev/tree_selection | success | Clearly marked for net-next |
netdev/subject_prefix | success | Link |
netdev/source_inline | success | Was 0 now: 0 |
netdev/verify_signedoff | success | Link |
netdev/module_param | success | Was 0 now: 0 |
netdev/build_32bit | fail | Errors and warnings before: 257 this patch: 258 |
netdev/kdoc | success | Errors and warnings before: 0 this patch: 0 |
netdev/verify_fixes | success | Link |
netdev/checkpatch | warning | WARNING: added, moved or deleted file(s), does MAINTAINERS need updating? |
netdev/build_allmodconfig_warn | fail | Errors and warnings before: 257 this patch: 258 |
netdev/header_inline | success | Link |
netdev/stable | success | Stable not CCed |
On Thu, 12 Nov 2020 11:24:57 +0800 wenxu@ucloud.cn wrote:
> v7-v10: fix __rcu warning
Are you reposting stuff just to get it build tested?
This is absolutely unacceptable.
On Thu, Nov 12, 2020 at 02:20:58PM -0800, Jakub Kicinski wrote:
> On Thu, 12 Nov 2020 11:24:57 +0800 wenxu@ucloud.cn wrote:
> > v7-v10: fix __rcu warning
>
> Are you reposting stuff just to get it build tested?
>
> This is absolutely unacceptable.

I don't know if that's the case, but maybe we could have a shadow
mailing list just for that? So that bots would monitor it and would run
(almost) the same tests as they do here. Then when patches are posted
here, on a list that people actually subscribe to, they are already more
ready. The bots would have to email an "ok" as well, but that's
an implementation detail already.

Not that developers shouldn't test before posting, but the bots are
already doing some tests that may be beyond what one can think of
testing before posting.
On Thu, 12 Nov 2020 23:25:22 -0300 Marcelo Ricardo Leitner wrote:
> On Thu, Nov 12, 2020 at 02:20:58PM -0800, Jakub Kicinski wrote:
> > On Thu, 12 Nov 2020 11:24:57 +0800 wenxu@ucloud.cn wrote:
> > > v7-v10: fix __rcu warning
> >
> > Are you reposting stuff just to get it build tested?
> >
> > This is absolutely unacceptable.
>
> I don't know if that's the case, but maybe we could have a shadow
> mailing list just for that? So that bots would monitor it and would run
> (almost) the same tests as they do here. Then when patches are posted
> here, on a list that people actually subscribe to, they are already more
> ready. The bots would have to email an "ok" as well, but that's
> an implementation detail already.
>
> Not that developers shouldn't test before posting, but the bots are
> already doing some tests that may be beyond what one can think of
> testing before posting.

The code for the entire system is right here:

https://github.com/kuba-moo/nipa

It depends on a patchwork instance to report results to. I have a
script there to feed patches in locally from a maildir but haven't
tested that in a while so it's probably broken.

You can also just run the build bash script without running the whole bot:

https://github.com/kuba-moo/nipa/blob/master/tests/patch/build_allmodconfig_warn/build_allmodconfig.sh

Hardly rocket science.

I have no preference on what people do to test their code, and I'm
happy to take patches for the bot, too. But we can't have people
posting 11 versions of patches to netdev, which is already too high
traffic for people to follow. Not to mention that someone needs to pay
for the CPU cycles of the bot, and we don't want to block getting
results for legitimate patches.
On Wed, Nov 11, 2020 at 9:44 PM <wenxu@ucloud.cn> wrote:
> diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c
> index 9c79fb9..dff3c40 100644
> --- a/net/sched/act_ct.c
> +++ b/net/sched/act_ct.c
> @@ -1541,8 +1541,14 @@ static int __init ct_init_module(void)
>  	if (err)
>  		goto err_register;
>
> +	err = tcf_set_xmit_hook(tcf_frag_xmit_hook);

Yeah, this approach is certainly much better than extending act_mirred.
Just one comment below.

> diff --git a/net/sched/act_frag.c b/net/sched/act_frag.c
> new file mode 100644
> index 0000000..3a7ab92
> --- /dev/null
> +++ b/net/sched/act_frag.c

It is kinda confusing to see this is a module. It provides some
wrappers and hooks dev_queue_xmit(), so it belongs more to
the core tc code than to any modularized code. How about putting
this into net/sched/sch_generic.c?

Thanks.
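For readers following the "wrappers and hooks" remark, here is a minimal user-space sketch of the dispatch pattern that tcf_dev_queue_xmit() uses in the patch: call the registered hook if there is one, otherwise call the transmit function directly. The names `struct pkt`, `fake_xmit`, `counting_hook`, and `wrapped_xmit` are hypothetical stand-ins, and the RCU protection used by the kernel code is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sk_buff; only the length is modelled. */
struct pkt {
	unsigned int len;
};

typedef int (*xmit_fn)(struct pkt *p);
typedef int (*xmit_hook_fn)(struct pkt *p, xmit_fn xmit);

/* Optional hook; NULL means "call the transmit function directly". */
static xmit_hook_fn xmit_hook;

static int hook_calls;
static int xmit_calls;

/* Plays the role of dev_queue_xmit() in this sketch. */
static int fake_xmit(struct pkt *p)
{
	(void)p;
	xmit_calls++;
	return 0;
}

/* A pass-through hook that only counts how often it is invoked. */
static int counting_hook(struct pkt *p, xmit_fn xmit)
{
	hook_calls++;
	return xmit(p);
}

/* Same shape as tcf_dev_queue_xmit() in the patch, without RCU. */
static int wrapped_xmit(struct pkt *p, xmit_fn xmit)
{
	xmit_hook_fn hook = xmit_hook;

	return hook ? hook(p, xmit) : xmit(p);
}
```

With no hook set, `wrapped_xmit(&p, fake_xmit)` goes straight to `fake_xmit`; after `xmit_hook = counting_hook`, the same call is routed through the hook first.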
On Sat, Nov 14, 2020 at 10:05:39AM -0800, Cong Wang wrote:
> On Wed, Nov 11, 2020 at 9:44 PM <wenxu@ucloud.cn> wrote:
> > diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c
> > index 9c79fb9..dff3c40 100644
> > --- a/net/sched/act_ct.c
> > +++ b/net/sched/act_ct.c
> > @@ -1541,8 +1541,14 @@ static int __init ct_init_module(void)
> >  	if (err)
> >  		goto err_register;
> >
> > +	err = tcf_set_xmit_hook(tcf_frag_xmit_hook);
>
> Yeah, this approach is certainly much better than extending act_mirred.
> Just one comment below.

Nice. :-)

> >
> > diff --git a/net/sched/act_frag.c b/net/sched/act_frag.c
> > new file mode 100644
> > index 0000000..3a7ab92
> > --- /dev/null
> > +++ b/net/sched/act_frag.c
>
> It is kinda confusing to see this is a module. It provides some
> wrappers and hooks dev_queue_xmit(), so it belongs more to
> the core tc code than to any modularized code. How about putting
> this into net/sched/sch_generic.c?

Davide had shared similar concerns with regard to the new module too.
The main idea behind the new module was to keep it as
isolated/contained as possible, and only so. So thumbs up from my
side.

To be clear, you're only talking about the module itself, right? It
would still need to have the Kconfig to enable this feature, or not?

Thanks,
Marcelo
On 2020/11/15 2:05, Cong Wang wrote:
> On Wed, Nov 11, 2020 at 9:44 PM <wenxu@ucloud.cn> wrote:
>> diff --git a/net/sched/act_frag.c b/net/sched/act_frag.c
>> new file mode 100644
>> index 0000000..3a7ab92
>> --- /dev/null
>> +++ b/net/sched/act_frag.c
> It is kinda confusing to see this is a module. It provides some
> wrappers and hooks dev_queue_xmit(), so it belongs more to
> the core tc code than to any modularized code. How about putting
> this into net/sched/sch_generic.c?
>
> Thanks.

All the operations in act_frag belong to a single L3 action, so we put
them in a single module to keep it as isolated/contained as possible.

Maybe putting this in a single file is better than a module? Built into
the tc core code or not?

Enable this feature in Kconfig with NET_ACT_FRAG?

+config NET_ACT_FRAG
+	bool "Packet fragmentation"
+	depends on NET_CLS_ACT
+	help
+	  Say Y here to allow fragmenting big packets when outputting
+	  with the mirred action.
+
+	  If unsure, say N.
This nagged me:
What happens if all the frags don't make it out?
Should you at least return an error code (from tcf_fragment?)
and get the action err counters incremented?

cheers,
jamal

On 2020-11-15 8:05 a.m., wenxu wrote:
>
> On 2020/11/15 2:05, Cong Wang wrote:
>> On Wed, Nov 11, 2020 at 9:44 PM <wenxu@ucloud.cn> wrote:
>>> diff --git a/net/sched/act_frag.c b/net/sched/act_frag.c
>>> new file mode 100644
>>> index 0000000..3a7ab92
>>> --- /dev/null
>>> +++ b/net/sched/act_frag.c
>> It is kinda confusing to see this is a module. It provides some
>> wrappers and hooks dev_queue_xmit(), so it belongs more to
>> the core tc code than to any modularized code. How about putting
>> this into net/sched/sch_generic.c?
>>
>> Thanks.
>
> All the operations in act_frag belong to a single L3 action, so we put
> them in a single module to keep it as isolated/contained as possible.
>
> Maybe putting this in a single file is better than a module? Built into
> the tc core code or not?
>
> Enable this feature in Kconfig with NET_ACT_FRAG?
>
> +config NET_ACT_FRAG
> +	bool "Packet fragmentation"
> +	depends on NET_CLS_ACT
> +	help
> +	  Say Y here to allow fragmenting big packets when outputting
> +	  with the mirred action.
> +
> +	  If unsure, say N.
On 11/16/2020 12:26 AM, Jamal Hadi Salim wrote:
> This nagged me:
> What happens if all the frags don't make it out?
> Should you at least return an error code (from tcf_fragment?)
> and get the action err counters incremented?

Thanks, will do.

>
> cheers,
> jamal
>
> On 2020-11-15 8:05 a.m., wenxu wrote:
>>
>> On 2020/11/15 2:05, Cong Wang wrote:
>>> On Wed, Nov 11, 2020 at 9:44 PM <wenxu@ucloud.cn> wrote:
>>>> diff --git a/net/sched/act_frag.c b/net/sched/act_frag.c
>>>> new file mode 100644
>>>> index 0000000..3a7ab92
>>>> --- /dev/null
>>>> +++ b/net/sched/act_frag.c
>>> It is kinda confusing to see this is a module. It provides some
>>> wrappers and hooks dev_queue_xmit(), so it belongs more to
>>> the core tc code than to any modularized code. How about putting
>>> this into net/sched/sch_generic.c?
>>>
>>> Thanks.
>>
>> All the operations in act_frag belong to a single L3 action, so we put
>> them in a single module to keep it as isolated/contained as possible.
>>
>> Maybe putting this in a single file is better than a module? Built into
>> the tc core code or not?
>>
>> Enable this feature in Kconfig with NET_ACT_FRAG?
>>
>> +config NET_ACT_FRAG
>> +	bool "Packet fragmentation"
>> +	depends on NET_CLS_ACT
>> +	help
>> +	  Say Y here to allow fragmenting big packets when outputting
>> +	  with the mirred action.
>> +
>> +	  If unsure, say N.
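Jamal's point can be illustrated with a small user-space sketch. In the posted patch, tcf_frag_xmit() discards the return value of data->xmit(skb) and always returns 0, so a failed fragment is invisible to the caller. The sketch below shows one possible shape of the fix: the per-fragment callback returns its status, the fragmenter stops at the first failure, and the caller bumps an error counter. All names here (`frag_output`, `do_fragment`, `struct action_stats`, `xmit_fragmented`) are hypothetical, and the stop-at-first-error policy is an assumption of this sketch, not something taken from the thread or from ip_do_fragment().

```c
#include <assert.h>
#include <errno.h>

typedef int (*frag_out_fn)(unsigned int frag_len, void *ctx);

/* Hypothetical transmit context used to simulate a failing device. */
struct tx_ctx {
	int sent;     /* fragments transmitted so far */
	int fail_at;  /* 1-based index of the fragment that fails, 0 = never */
};

/* Per-fragment output callback: reports failure instead of hiding it. */
static int frag_output(unsigned int frag_len, void *ctx)
{
	struct tx_ctx *tx = ctx;

	(void)frag_len;
	if (tx->fail_at && tx->sent + 1 == tx->fail_at)
		return -ENOMEM;
	tx->sent++;
	return 0;
}

/* Split len bytes into chunks of at most mtu bytes; stop at first error. */
static int do_fragment(unsigned int len, unsigned int mtu,
		       frag_out_fn out, void *ctx)
{
	if (mtu == 0)
		return -EINVAL;

	while (len > 0) {
		unsigned int chunk = len < mtu ? len : mtu;
		int err = out(chunk, ctx);

		if (err)
			return err;
		len -= chunk;
	}
	return 0;
}

/* Hypothetical stand-in for an action's error counter. */
struct action_stats {
	unsigned long errors;
};

/* Caller: propagate the error and account for it. */
static int xmit_fragmented(unsigned int len, unsigned int mtu,
			   struct tx_ctx *tx, struct action_stats *stats)
{
	int err = do_fragment(len, mtu, frag_output, tx);

	if (err)
		stats->errors++;
	return err;
}
```

The chunking here ignores IP headers and the 8-byte fragment alignment rule; it only models where the error code travels.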
On Sat, Nov 14, 2020 at 2:46 PM Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> wrote:
> Davide had shared similar concerns with regard to the new module too.
> The main idea behind the new module was to keep it as
> isolated/contained as possible, and only so. So thumbs up from my
> side.
>
> To be clear, you're only talking about the module itself, right? It
> would still need to have the Kconfig to enable this feature, or not?

Both. The code itself doesn't look like a module, and it doesn't look
like an optional feature for act_ct either, does it? If not, there is
no need to have a user-visible Kconfig; we just select it, or have no
Kconfig at all.

Thanks.
On Sun, Nov 15, 2020 at 5:06 AM wenxu <wenxu@ucloud.cn> wrote:
>
>
> On 2020/11/15 2:05, Cong Wang wrote:
> > On Wed, Nov 11, 2020 at 9:44 PM <wenxu@ucloud.cn> wrote:
> >> diff --git a/net/sched/act_frag.c b/net/sched/act_frag.c
> >> new file mode 100644
> >> index 0000000..3a7ab92
> >> --- /dev/null
> >> +++ b/net/sched/act_frag.c
> > It is kinda confusing to see this is a module. It provides some
> > wrappers and hooks dev_queue_xmit(), so it belongs more to
> > the core tc code than to any modularized code. How about putting
> > this into net/sched/sch_generic.c?
> >
> > Thanks.
>
> All the operations in act_frag belong to a single L3 action, so we put
> them in a single module to keep it as isolated/contained as possible.

Yeah, but you hook dev_queue_xmit() which is L2.

>
> Maybe putting this in a single file is better than a module? Built into
> the tc core code or not?
>
> Enable this feature in Kconfig with NET_ACT_FRAG?

Sort of... If this is not an optional feature, that is, if it is a
must-have feature for act_ct, we should just get rid of this Kconfig.

Also, you need to depend on CONFIG_INET somewhere to use the IP
fragment, no?

Thanks.
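The L2 versus L3 tension in this exchange comes from how the patch handles the link-layer header: tcf_frag_prepare_frag() saves up to VLAN_ETH_HLEN bytes of L2 header and pulls them off, the IP code fragments the L3 packet, and tcf_frag_xmit() pushes the saved header back onto every fragment. A minimal user-space sketch of that save and restore step, using hypothetical names (`struct saved_l2`, `l2_save`, `l2_restore`) and plain byte buffers instead of sk_buffs:

```c
#include <assert.h>
#include <string.h>

#define L2_MAX 18     /* same value as VLAN_ETH_HLEN: 14-byte Ethernet + 4-byte VLAN */
#define FRAME_MAX 256 /* arbitrary buffer size for this sketch */

struct saved_l2 {
	unsigned int len;
	unsigned char data[L2_MAX];
};

/*
 * Save the first l2_len bytes of the frame and return a pointer to the
 * L3 payload, or NULL if the header is too long to store.
 */
static const unsigned char *l2_save(struct saved_l2 *s,
				    const unsigned char *frame,
				    unsigned int l2_len)
{
	if (l2_len > L2_MAX)
		return NULL;
	s->len = l2_len;
	memcpy(s->data, frame, l2_len);
	return frame + l2_len;
}

/*
 * Build "saved L2 header + chunk" in out. Returns the total length, or 0
 * if the result does not fit in out_cap bytes.
 */
static unsigned int l2_restore(const struct saved_l2 *s,
			       const unsigned char *chunk,
			       unsigned int chunk_len,
			       unsigned char *out, unsigned int out_cap)
{
	if (s->len + chunk_len > out_cap)
		return 0;
	memcpy(out, s->data, s->len);
	memcpy(out + s->len, chunk, chunk_len);
	return s->len + chunk_len;
}
```

The length check in `l2_save` mirrors the "L2 header too long to fragment" guard in tcf_fragment(); everything else about sk_buff handling (checksums, VLAN acceleration, dst references) is omitted.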
On 11/17/2020 3:01 AM, Cong Wang wrote:
> On Sun, Nov 15, 2020 at 5:06 AM wenxu <wenxu@ucloud.cn> wrote:
>>
>> On 2020/11/15 2:05, Cong Wang wrote:
>>> On Wed, Nov 11, 2020 at 9:44 PM <wenxu@ucloud.cn> wrote:
>>>> diff --git a/net/sched/act_frag.c b/net/sched/act_frag.c
>>>> new file mode 100644
>>>> index 0000000..3a7ab92
>>>> --- /dev/null
>>>> +++ b/net/sched/act_frag.c
>>> It is kinda confusing to see this is a module. It provides some
>>> wrappers and hooks dev_queue_xmit(), so it belongs more to
>>> the core tc code than to any modularized code. How about putting
>>> this into net/sched/sch_generic.c?
>>>
>>> Thanks.
>> All the operations in act_frag belong to a single L3 action, so we put
>> them in a single module to keep it as isolated/contained as possible.
> Yeah, but you hook dev_queue_xmit() which is L2.
>
>> Maybe putting this in a single file is better than a module? Built into
>> the tc core code or not?
>>
>> Enable this feature in Kconfig with NET_ACT_FRAG?
> Sort of... If this is not an optional feature, that is, if it is a
> must-have feature for act_ct, we should just get rid of this Kconfig.
>
> Also, you need to depend on CONFIG_INET somewhere to use the IP
> fragment, no?
>
> Thanks.

Maybe act_frag should be renamed to sch_frag and built into the kernel.

This function can be used by the whole tc subsystem. There is no need
for CONFIG_INET. The sched system depends on NET.
On Mon, Nov 16, 2020 at 8:06 PM wenxu <wenxu@ucloud.cn> wrote:
>
>
> On 11/17/2020 3:01 AM, Cong Wang wrote:
> > On Sun, Nov 15, 2020 at 5:06 AM wenxu <wenxu@ucloud.cn> wrote:
> >>
> >> On 2020/11/15 2:05, Cong Wang wrote:
> >>> On Wed, Nov 11, 2020 at 9:44 PM <wenxu@ucloud.cn> wrote:
> >>>> diff --git a/net/sched/act_frag.c b/net/sched/act_frag.c
> >>>> new file mode 100644
> >>>> index 0000000..3a7ab92
> >>>> --- /dev/null
> >>>> +++ b/net/sched/act_frag.c
> >>> It is kinda confusing to see this is a module. It provides some
> >>> wrappers and hooks dev_queue_xmit(), so it belongs more to
> >>> the core tc code than to any modularized code. How about putting
> >>> this into net/sched/sch_generic.c?
> >>>
> >>> Thanks.
> >> All the operations in act_frag belong to a single L3 action, so we put
> >> them in a single module to keep it as isolated/contained as possible.
> > Yeah, but you hook dev_queue_xmit() which is L2.
> >
> >> Maybe putting this in a single file is better than a module? Built into
> >> the tc core code or not?
> >>
> >> Enable this feature in Kconfig with NET_ACT_FRAG?
> > Sort of... If this is not an optional feature, that is, if it is a
> > must-have feature for act_ct, we should just get rid of this Kconfig.
> >
> > Also, you need to depend on CONFIG_INET somewhere to use the IP
> > fragment, no?
> >
> > Thanks.
>
> Maybe act_frag should be renamed to sch_frag and built into the kernel.

sch_frag still sounds like a module. ;) This is why I proposed putting
it into sch_generic.c.

>
> This function can be used by the whole tc subsystem. There is no need
> for CONFIG_INET. The sched system depends on NET.

CONFIG_INET is different from CONFIG_NET, right?

Thanks.
On 2020/11/18 6:43, Cong Wang wrote:
> On Mon, Nov 16, 2020 at 8:06 PM wenxu <wenxu@ucloud.cn> wrote:
>>
>> On 11/17/2020 3:01 AM, Cong Wang wrote:
>>> On Sun, Nov 15, 2020 at 5:06 AM wenxu <wenxu@ucloud.cn> wrote:
>>>> On 2020/11/15 2:05, Cong Wang wrote:
>>>>> On Wed, Nov 11, 2020 at 9:44 PM <wenxu@ucloud.cn> wrote:
>>>>>> diff --git a/net/sched/act_frag.c b/net/sched/act_frag.c
>>>>>> new file mode 100644
>>>>>> index 0000000..3a7ab92
>>>>>> --- /dev/null
>>>>>> +++ b/net/sched/act_frag.c
>>>>> It is kinda confusing to see this is a module. It provides some
>>>>> wrappers and hooks dev_queue_xmit(), so it belongs more to
>>>>> the core tc code than to any modularized code. How about putting
>>>>> this into net/sched/sch_generic.c?
>>>>>
>>>>> Thanks.
>>>> All the operations in act_frag belong to a single L3 action, so we put
>>>> them in a single module to keep it as isolated/contained as possible.
>>> Yeah, but you hook dev_queue_xmit() which is L2.
>>>
>>>> Maybe putting this in a single file is better than a module? Built into
>>>> the tc core code or not?
>>>>
>>>> Enable this feature in Kconfig with NET_ACT_FRAG?
>>> Sort of... If this is not an optional feature, that is, if it is a
>>> must-have feature for act_ct, we should just get rid of this Kconfig.
>>>
>>> Also, you need to depend on CONFIG_INET somewhere to use the IP
>>> fragment, no?
>>>
>>> Thanks.
>> Maybe act_frag should be renamed to sch_frag and built into the kernel.
> sch_frag still sounds like a module. ;) This is why I proposed putting
> it into sch_generic.c.
>
>> This function can be used by the whole tc subsystem. There is no need
>> for CONFIG_INET. The sched system depends on NET.
> CONFIG_INET is different from CONFIG_NET, right?

You are right. ip_do_fragment depends on this!

>
> Thanks.
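The CONFIG_INET point is about what happens when the code that does the fragmenting is not built in. The patch already uses the usual kernel idiom for an optional symbol in act_api.h: a real prototype under #if IS_ENABLED(...) and a static inline stub otherwise. The user-space sketch below imitates that idiom with a hypothetical macro, `SKETCH_HAVE_INET`, standing in for the config option. The chunk-counting function is a simplification that ignores IP headers and fragment alignment, and returning -EOPNOTSUPP from the stub is an assumption of this sketch, not what the patch does.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical feature switch standing in for CONFIG_INET.
 * Build with -DSKETCH_HAVE_INET=0 to select the stub instead.
 */
#ifndef SKETCH_HAVE_INET
#define SKETCH_HAVE_INET 1
#endif

#if SKETCH_HAVE_INET
/* "Real" implementation: number of chunks needed for payload bytes. */
static int ip_fragment_count(unsigned int payload, unsigned int mtu)
{
	if (mtu == 0)
		return -EINVAL;
	return (int)((payload + mtu - 1) / mtu);
}
#else
/* Stub used when the feature is compiled out: fail loudly, not silently. */
static inline int ip_fragment_count(unsigned int payload, unsigned int mtu)
{
	(void)payload;
	(void)mtu;
	return -EOPNOTSUPP;
}
#endif

/* Caller that works with either variant: small packets never fragment. */
static int frag_or_pass(unsigned int payload, unsigned int mtu)
{
	if (payload <= mtu)
		return 1;
	return ip_fragment_count(payload, mtu);
}
```

A stub that reports an error keeps the caller honest about oversized packets when fragmentation support is missing, which is one reason a Kconfig dependency on the providing option is usually preferred over relying on the stub.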
diff --git a/include/net/act_api.h b/include/net/act_api.h
index 8721492..87ea1df 100644
--- a/include/net/act_api.h
+++ b/include/net/act_api.h
@@ -239,6 +239,24 @@ int tcf_action_check_ctrlact(int action, struct tcf_proto *tp,
 			     struct netlink_ext_ack *newchain);
 struct tcf_chain *tcf_action_set_ctrlact(struct tc_action *a, int action,
 					 struct tcf_chain *newchain);
+
+typedef int xmit_hook_func(struct sk_buff *skb,
+			   int (*xmit)(struct sk_buff *skb));
+
+int tcf_dev_queue_xmit(struct sk_buff *skb, int (*xmit)(struct sk_buff *skb));
+int tcf_set_xmit_hook(xmit_hook_func *xmit_hook);
+void tcf_clear_xmit_hook(void);
+
+#if IS_ENABLED(CONFIG_NET_ACT_FRAG)
+int tcf_frag_xmit_hook(struct sk_buff *skb, int (*xmit)(struct sk_buff *skb));
+#else
+static inline int tcf_frag_xmit_hook(struct sk_buff *skb,
+				     int (*xmit)(struct sk_buff *skb))
+{
+	return 0;
+}
+#endif
+
 #endif /* CONFIG_NET_CLS_ACT */

 static inline void tcf_action_stats_update(struct tc_action *a, u64 bytes,
diff --git a/net/sched/Kconfig b/net/sched/Kconfig
index a3b37d8..9a240c7 100644
--- a/net/sched/Kconfig
+++ b/net/sched/Kconfig
@@ -974,9 +974,22 @@ config NET_ACT_TUNNEL_KEY
 	  To compile this code as a module, choose M here: the
 	  module will be called act_tunnel_key.

+config NET_ACT_FRAG
+	tristate "Packet fragmentation"
+	depends on NET_CLS_ACT
+	help
+	  Say Y here to allow fragmenting big packets when outputting
+	  with the mirred action.
+
+	  If unsure, say N.
+
+	  To compile this code as a module, choose M here: the
+	  module will be called act_frag.
+
 config NET_ACT_CT
 	tristate "connection tracking tc action"
 	depends on NET_CLS_ACT && NF_CONNTRACK && NF_NAT && NF_FLOW_TABLE
+	depends on NET_ACT_FRAG
 	help
 	  Say Y here to allow sending the packets to conntrack module.

diff --git a/net/sched/Makefile b/net/sched/Makefile
index 66bbf9a..c146186 100644
--- a/net/sched/Makefile
+++ b/net/sched/Makefile
@@ -29,6 +29,7 @@ obj-$(CONFIG_NET_IFE_SKBMARK)	+= act_meta_mark.o
 obj-$(CONFIG_NET_IFE_SKBPRIO)	+= act_meta_skbprio.o
 obj-$(CONFIG_NET_IFE_SKBTCINDEX)	+= act_meta_skbtcindex.o
 obj-$(CONFIG_NET_ACT_TUNNEL_KEY)+= act_tunnel_key.o
+obj-$(CONFIG_NET_ACT_FRAG)	+= act_frag.o
 obj-$(CONFIG_NET_ACT_CT)	+= act_ct.o
 obj-$(CONFIG_NET_ACT_GATE)	+= act_gate.o
 obj-$(CONFIG_NET_SCH_FIFO)	+= sch_fifo.o
diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index f66417d..93868b7 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -22,6 +22,50 @@
 #include <net/act_api.h>
 #include <net/netlink.h>

+static xmit_hook_func __rcu *tcf_xmit_hook;
+static DEFINE_SPINLOCK(tcf_xmit_hook_lock);
+static u16 tcf_xmit_hook_count;
+
+int tcf_set_xmit_hook(xmit_hook_func *xmit_hook)
+{
+	spin_lock(&tcf_xmit_hook_lock);
+	if (!tcf_xmit_hook_count) {
+		rcu_assign_pointer(tcf_xmit_hook, xmit_hook);
+	} else if (xmit_hook != tcf_xmit_hook) {
+		spin_unlock(&tcf_xmit_hook_lock);
+		return -EBUSY;
+	}
+
+	tcf_xmit_hook_count++;
+	spin_unlock(&tcf_xmit_hook_lock);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(tcf_set_xmit_hook);
+
+void tcf_clear_xmit_hook(void)
+{
+	spin_lock(&tcf_xmit_hook_lock);
+	if (--tcf_xmit_hook_count == 0)
+		rcu_assign_pointer(tcf_xmit_hook, NULL);
+	spin_unlock(&tcf_xmit_hook_lock);
+
+	synchronize_rcu();
+}
+EXPORT_SYMBOL_GPL(tcf_clear_xmit_hook);
+
+int tcf_dev_queue_xmit(struct sk_buff *skb, int (*xmit)(struct sk_buff *skb))
+{
+	xmit_hook_func *xmit_hook;
+
+	xmit_hook = rcu_dereference(tcf_xmit_hook);
+	if (xmit_hook)
+		return xmit_hook(skb, xmit);
+	else
+		return xmit(skb);
+}
+EXPORT_SYMBOL_GPL(tcf_dev_queue_xmit);
+
 static void tcf_action_goto_chain_exec(const struct tc_action *a,
 				       struct tcf_result *res)
 {
diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c
index 9c79fb9..dff3c40 100644
--- a/net/sched/act_ct.c
+++ b/net/sched/act_ct.c
@@ -1541,8 +1541,14 @@ static int __init ct_init_module(void)
 	if (err)
 		goto err_register;

+	err = tcf_set_xmit_hook(tcf_frag_xmit_hook);
+	if (err)
+		goto err_action;
+
 	return 0;

+err_action:
+	tcf_unregister_action(&act_ct_ops, &ct_net_ops);
 err_register:
 	tcf_ct_flow_tables_uninit();
 err_tbl_init:
@@ -1552,6 +1558,7 @@ static int __init ct_init_module(void)

 static void __exit ct_cleanup_module(void)
 {
+	tcf_clear_xmit_hook();
 	tcf_unregister_action(&act_ct_ops, &ct_net_ops);
 	tcf_ct_flow_tables_uninit();
 	destroy_workqueue(act_ct_wq);
diff --git a/net/sched/act_frag.c b/net/sched/act_frag.c
new file mode 100644
index 0000000..3a7ab92
--- /dev/null
+++ b/net/sched/act_frag.c
@@ -0,0 +1,164 @@
+// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
+#include <net/netlink.h>
+#include <net/act_api.h>
+#include <net/dst.h>
+#include <net/ip.h>
+#include <net/ip6_fib.h>
+
+struct tcf_frag_data {
+	unsigned long dst;
+	struct qdisc_skb_cb cb;
+	__be16 inner_protocol;
+	u16 vlan_tci;
+	__be16 vlan_proto;
+	unsigned int l2_len;
+	u8 l2_data[VLAN_ETH_HLEN];
+	int (*xmit)(struct sk_buff *skb);
+};
+
+static DEFINE_PER_CPU(struct tcf_frag_data, tcf_frag_data_storage);
+
+static int tcf_frag_xmit(struct net *net, struct sock *sk, struct sk_buff *skb)
+{
+	struct tcf_frag_data *data = this_cpu_ptr(&tcf_frag_data_storage);
+
+	if (skb_cow_head(skb, data->l2_len) < 0) {
+		kfree_skb(skb);
+		return -ENOMEM;
+	}
+
+	__skb_dst_copy(skb, data->dst);
+	*qdisc_skb_cb(skb) = data->cb;
+	skb->inner_protocol = data->inner_protocol;
+	if (data->vlan_tci & VLAN_CFI_MASK)
+		__vlan_hwaccel_put_tag(skb, data->vlan_proto,
+				       data->vlan_tci & ~VLAN_CFI_MASK);
+	else
+		__vlan_hwaccel_clear_tag(skb);
+
+	/* Reconstruct the MAC header. */
+	skb_push(skb, data->l2_len);
+	memcpy(skb->data, &data->l2_data, data->l2_len);
+	skb_postpush_rcsum(skb, skb->data, data->l2_len);
+	skb_reset_mac_header(skb);
+
+	data->xmit(skb);
+
+	return 0;
+}
+
+static void tcf_frag_prepare_frag(struct sk_buff *skb,
+				  int (*xmit)(struct sk_buff *skb))
+{
+	unsigned int hlen = skb_network_offset(skb);
+	struct tcf_frag_data *data;
+
+	data = this_cpu_ptr(&tcf_frag_data_storage);
+	data->dst = skb->_skb_refdst;
+	data->cb = *qdisc_skb_cb(skb);
+	data->xmit = xmit;
+	data->inner_protocol = skb->inner_protocol;
+	if (skb_vlan_tag_present(skb))
+		data->vlan_tci = skb_vlan_tag_get(skb) | VLAN_CFI_MASK;
+	else
+		data->vlan_tci = 0;
+	data->vlan_proto = skb->vlan_proto;
+	data->l2_len = hlen;
+	memcpy(&data->l2_data, skb->data, hlen);
+
+	memset(IPCB(skb), 0, sizeof(struct inet_skb_parm));
+	skb_pull(skb, hlen);
+}
+
+static unsigned int
+tcf_frag_dst_get_mtu(const struct dst_entry *dst)
+{
+	return dst->dev->mtu;
+}
+
+static struct dst_ops tcf_frag_dst_ops = {
+	.family = AF_UNSPEC,
+	.mtu = tcf_frag_dst_get_mtu,
+};
+
+static int tcf_fragment(struct net *net, struct sk_buff *skb,
+			u16 mru, int (*xmit)(struct sk_buff *skb))
+{
+	if (skb_network_offset(skb) > VLAN_ETH_HLEN) {
+		net_warn_ratelimited("L2 header too long to fragment\n");
+		goto err;
+	}
+
+	if (skb_protocol(skb, true) == htons(ETH_P_IP)) {
+		struct dst_entry tcf_frag_dst;
+		unsigned long orig_dst;
+
+		tcf_frag_prepare_frag(skb, xmit);
+		dst_init(&tcf_frag_dst, &tcf_frag_dst_ops, NULL, 1,
+			 DST_OBSOLETE_NONE, DST_NOCOUNT);
+		tcf_frag_dst.dev = skb->dev;
+
+		orig_dst = skb->_skb_refdst;
+		skb_dst_set_noref(skb, &tcf_frag_dst);
+		IPCB(skb)->frag_max_size = mru;
+
+		ip_do_fragment(net, skb->sk, skb, tcf_frag_xmit);
+		refdst_drop(orig_dst);
+	} else if (skb_protocol(skb, true) == htons(ETH_P_IPV6)) {
+		unsigned long orig_dst;
+		struct rt6_info tcf_frag_rt;
+
+		tcf_frag_prepare_frag(skb, xmit);
+		memset(&tcf_frag_rt, 0, sizeof(tcf_frag_rt));
+		dst_init(&tcf_frag_rt.dst, &tcf_frag_dst_ops, NULL, 1,
+			 DST_OBSOLETE_NONE, DST_NOCOUNT);
+		tcf_frag_rt.dst.dev = skb->dev;
+
+		orig_dst = skb->_skb_refdst;
+		skb_dst_set_noref(skb, &tcf_frag_rt.dst);
+		IP6CB(skb)->frag_max_size = mru;
+
+		ipv6_stub->ipv6_fragment(net, skb->sk, skb, tcf_frag_xmit);
+		refdst_drop(orig_dst);
+	} else {
+		net_warn_ratelimited("Fail frag %s: eth=%x, MRU=%d, MTU=%d\n",
+				     netdev_name(skb->dev),
+				     ntohs(skb_protocol(skb, true)), mru,
+				     skb->dev->mtu);
+		goto err;
+	}
+
+	qdisc_skb_cb(skb)->mru = 0;
+	return 0;
+err:
+	kfree_skb(skb);
+	return -1;
+}
+
+int tcf_frag_xmit_hook(struct sk_buff *skb, int (*xmit)(struct sk_buff *skb))
+{
+	u16 mru = qdisc_skb_cb(skb)->mru;
+	int err;
+
+	if (mru && skb->len > mru + skb->dev->hard_header_len)
+		err = tcf_fragment(dev_net(skb->dev), skb, mru, xmit);
+	else
+		err = xmit(skb);
+
+	return err;
+}
+EXPORT_SYMBOL_GPL(tcf_frag_xmit_hook);
+
+static int __init frag_init_module(void)
+{
+	return 0;
+}
+
+static void __exit frag_cleanup_module(void)
+{
+}
+
+module_init(frag_init_module);
+module_exit(frag_cleanup_module);
+MODULE_AUTHOR("wenxu <wenxu@ucloud.cn>");
+MODULE_LICENSE("GPL v2");
diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c
index 17d0095..7153c67 100644
--- a/net/sched/act_mirred.c
+++ b/net/sched/act_mirred.c
@@ -210,7 +210,7 @@ static int tcf_mirred_forward(bool want_ingress, struct sk_buff *skb)
 	int err;

 	if (!want_ingress)
-		err = dev_queue_xmit(skb);
+		err = tcf_dev_queue_xmit(skb, dev_queue_xmit);
 	else
 		err = netif_receive_skb(skb);
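The registration logic in tcf_set_xmit_hook() and tcf_clear_xmit_hook() is a reference-counted single slot: the first user installs the hook, later users may share it only if they pass the same function, and the slot is emptied when the last user leaves. The sketch below reproduces that logic in single-threaded user-space C. The names `set_hook`, `clear_hook`, `hook_a`, and `hook_b` are hypothetical, the spinlock and RCU calls from the patch are omitted, and the underflow guard in `clear_hook` is an addition of this sketch that the patch does not have.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

typedef int (*hook_fn)(int);

static hook_fn cur_hook;        /* the single shared hook slot */
static unsigned int hook_users; /* how many callers registered it */

/* Install fn, or share it if the same fn is already installed. */
static int set_hook(hook_fn fn)
{
	if (hook_users == 0)
		cur_hook = fn;
	else if (fn != cur_hook)
		return -EBUSY; /* a different hook already owns the slot */

	hook_users++;
	return 0;
}

/* Drop one reference; clear the slot when the last user is gone. */
static void clear_hook(void)
{
	if (hook_users == 0)
		return; /* guard against underflow (not present in the patch) */
	if (--hook_users == 0)
		cur_hook = NULL;
}

/* Two distinguishable example hooks. */
static int hook_a(int x)
{
	return x + 1;
}

static int hook_b(int x)
{
	return x + 2;
}
```

Registering `hook_a` twice succeeds, registering `hook_b` while `hook_a` is installed returns -EBUSY, and the slot becomes free for `hook_b` only after both `hook_a` references are dropped.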