Message ID | 20250315-airoha-flowtable-null-ptr-fix-v2-1-94b923d30234@kernel.org (mailing list archive) |
---|---|
State | New |
Delegated to: | Netdev Maintainers |
Series | [net-next,v2] net: airoha: Validate egress gdm port in airoha_ppe_foe_entry_prepare() |
> Fix the issue validating egress gdm port in airoha_ppe_foe_entry_prepare
> routine.

A more interesting question is, why do you see an invalid port? Is the
hardware broken? Something not correctly configured? Are you just
papering over the crack?

> -static int airoha_ppe_foe_entry_prepare(struct airoha_foe_entry *hwe,
> +static int airoha_ppe_foe_entry_prepare(struct airoha_eth *eth,
> +					struct airoha_foe_entry *hwe,
>  					struct net_device *dev, int type,
>  					struct airoha_flow_data *data,
>  					int l4proto)
> @@ -224,6 +225,11 @@ static int airoha_ppe_foe_entry_prepare(struct airoha_foe_entry *hwe,
>  	if (dev) {
>  		struct airoha_gdm_port *port = netdev_priv(dev);

If port is invalid, is dev also invalid? And if dev is invalid, could
dereferencing it to get priv cause an oops?

	Andrew
> > Fix the issue validating egress gdm port in airoha_ppe_foe_entry_prepare
> > routine.
>
> A more interesting question is, why do you see an invalid port? Is the
> hardware broken? Something not correctly configured? Are you just
> papering over the crack?
>
> > -static int airoha_ppe_foe_entry_prepare(struct airoha_foe_entry *hwe,
> > +static int airoha_ppe_foe_entry_prepare(struct airoha_eth *eth,
> > +					struct airoha_foe_entry *hwe,
> >  					struct net_device *dev, int type,
> >  					struct airoha_flow_data *data,
> >  					int l4proto)
> > @@ -224,6 +225,11 @@ static int airoha_ppe_foe_entry_prepare(struct airoha_foe_entry *hwe,
> >  	if (dev) {
> >  		struct airoha_gdm_port *port = netdev_priv(dev);
>
> If port is invalid, is dev also invalid? And if dev is invalid, could
> dereferencing it to get priv cause an oops?

I do not think this is a hw problem. Running bidirectional high load traffic,
I got the sporadic crash reported above. In particular, netfilter runs
airoha_ppe_flow_offload_replace() providing the egress net_device pointer used
in airoha_ppe_foe_entry_prepare(). Debugging with gdb, I discovered the system
crashes dereferencing the port pointer in airoha_ppe_foe_entry_prepare() (even
if the dev pointer is not NULL). Adding this sanity check makes the system
stable. Please note a similar check is available even in the mtk driver [0].

Regards,
Lorenzo

[0] https://github.com/torvalds/linux/blob/master/drivers/net/ethernet/mediatek/mtk_ppe_offload.c#L220

>
> Andrew
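Andrew's question turns on what netdev_priv() does with dev. In the 6.6 kernel shown in the trace it is, to the best of my knowledge, plain address arithmetic: it returns the address just past the aligned struct net_device and loads nothing through dev, which matches the v2 note that the result is never NULL. The following is a minimal userspace sketch of that arithmetic only. The fake_* names, the struct layout and the 32-byte alignment constant are illustrative assumptions, not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct net_device (not the kernel layout). */
struct fake_net_device {
	char name[16];
	unsigned long flags;
};

/* Illustrative alignment; the kernel uses NETDEV_ALIGN for this purpose. */
#define FAKE_NETDEV_ALIGN 32u
#define FAKE_ALIGN_UP(x, a) (((x) + (a) - 1) / (a) * (a))
#define FAKE_PRIV_OFFSET \
	FAKE_ALIGN_UP(sizeof(struct fake_net_device), FAKE_NETDEV_ALIGN)

/* One allocation holds the device followed by the driver's private area. */
static struct fake_net_device *fake_alloc_netdev(size_t priv_size)
{
	struct fake_net_device *dev = calloc(1, FAKE_PRIV_OFFSET + priv_size);

	assert(dev != NULL);
	return dev;
}

/* Models netdev_priv(): address arithmetic only, nothing is read from dev. */
static void *fake_netdev_priv(struct fake_net_device *dev)
{
	return (char *)dev + FAKE_PRIV_OFFSET;
}
```

A non-NULL result therefore says nothing about whether the private area really holds a struct airoha_gdm_port; that is the property the added check tests.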
diff --git a/drivers/net/ethernet/airoha/airoha_eth.c b/drivers/net/ethernet/airoha/airoha_eth.c
index c0a642568ac115ea9df6fbaf7133627a4405a36c..bf9c882e9c8b087dbf5e907636547a0117d1b96a 100644
--- a/drivers/net/ethernet/airoha/airoha_eth.c
+++ b/drivers/net/ethernet/airoha/airoha_eth.c
@@ -2454,6 +2454,19 @@ static void airoha_metadata_dst_free(struct airoha_gdm_port *port)
 	}
 }
 
+int airoha_is_valid_gdm_port(struct airoha_eth *eth,
+			     struct airoha_gdm_port *port)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(eth->ports); i++) {
+		if (eth->ports[i] == port)
+			return 0;
+	}
+
+	return -EINVAL;
+}
+
 static int airoha_alloc_gdm_port(struct airoha_eth *eth,
 				 struct device_node *np, int index)
 {
diff --git a/drivers/net/ethernet/airoha/airoha_eth.h b/drivers/net/ethernet/airoha/airoha_eth.h
index f66b9b736b9447b31afc036eb906d0a1c617e132..c7d4f124d11481cd31c1566936cd47e3446877c0 100644
--- a/drivers/net/ethernet/airoha/airoha_eth.h
+++ b/drivers/net/ethernet/airoha/airoha_eth.h
@@ -532,6 +532,9 @@ u32 airoha_rmw(void __iomem *base, u32 offset, u32 mask, u32 val);
 #define airoha_qdma_clear(qdma, offset, val)			\
 	airoha_rmw((qdma)->regs, (offset), (val), 0)
 
+int airoha_is_valid_gdm_port(struct airoha_eth *eth,
+			     struct airoha_gdm_port *port);
+
 void airoha_ppe_check_skb(struct airoha_ppe *ppe, u16 hash);
 int airoha_ppe_setup_tc_block_cb(enum tc_setup_type type, void *type_data,
 				 void *cb_priv);
diff --git a/drivers/net/ethernet/airoha/airoha_ppe.c b/drivers/net/ethernet/airoha/airoha_ppe.c
index 8b55e871352d359fa692c253d3f3315c619472b3..65833e2058194a64569eafec08b80df8190bba6c 100644
--- a/drivers/net/ethernet/airoha/airoha_ppe.c
+++ b/drivers/net/ethernet/airoha/airoha_ppe.c
@@ -197,7 +197,8 @@ static int airoha_get_dsa_port(struct net_device **dev)
 #endif
 }
 
-static int airoha_ppe_foe_entry_prepare(struct airoha_foe_entry *hwe,
+static int airoha_ppe_foe_entry_prepare(struct airoha_eth *eth,
+					struct airoha_foe_entry *hwe,
 					struct net_device *dev, int type,
 					struct airoha_flow_data *data,
 					int l4proto)
@@ -224,6 +225,11 @@ static int airoha_ppe_foe_entry_prepare(struct airoha_foe_entry *hwe,
 	if (dev) {
 		struct airoha_gdm_port *port = netdev_priv(dev);
 		u8 pse_port;
+		int err;
+
+		err = airoha_is_valid_gdm_port(eth, port);
+		if (err)
+			return err;
 
 		if (dsa_port >= 0)
 			pse_port = port->id == 4 ? FE_PSE_PORT_GDM4 : port->id;
@@ -633,7 +639,7 @@ static int airoha_ppe_flow_offload_replace(struct airoha_gdm_port *port,
 	    !is_valid_ether_addr(data.eth.h_dest))
 		return -EINVAL;
 
-	err = airoha_ppe_foe_entry_prepare(&hwe, odev, offload_type,
+	err = airoha_ppe_foe_entry_prepare(eth, &hwe, odev, offload_type,
 					   &data, l4proto);
 	if (err)
 		return err;
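The helper added in airoha_eth.c only compares pointer values against eth->ports[], so it can reject a stray port without dereferencing it. Below is a standalone sketch of the same idea for experimenting outside the kernel. It assumes a hypothetical fake_eth with four port slots (the real array size is defined in airoha_eth.h and is not shown in this patch). It also adds a NULL guard that the patch does not have, since in this sketch empty slots are NULL and would otherwise match a NULL port; the patch does not need it because netdev_priv() never returns NULL.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Assumption for illustration; not the driver's real port count. */
#define FAKE_MAX_GDM_PORTS 4

/* Hypothetical stand-ins for struct airoha_gdm_port / struct airoha_eth. */
struct fake_gdm_port {
	int id;
};

struct fake_eth {
	struct fake_gdm_port *ports[FAKE_MAX_GDM_PORTS];
};

/*
 * Returns 0 when port is one of the ports registered in eth, -EINVAL
 * otherwise. Only pointer values are compared; port is never dereferenced,
 * so a pointer into some unrelated private area is rejected safely.
 */
static int fake_is_valid_gdm_port(const struct fake_eth *eth,
				  const struct fake_gdm_port *port)
{
	size_t i;

	assert(eth != NULL);

	/* Sketch-only guard: an empty (NULL) slot must not validate NULL. */
	if (!port)
		return -EINVAL;

	for (i = 0; i < sizeof(eth->ports) / sizeof(eth->ports[0]); i++) {
		if (eth->ports[i] == port)
			return 0;
	}

	return -EINVAL;
}
```

A port with identical contents but a different address is still rejected, because membership is decided by address, not by value.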
The system occasionally crashes dereferencing a NULL pointer when it is
forwarding constant, high load bidirectional traffic.

[ 2149.913414] Unable to handle kernel read from unreadable memory at virtual address 0000000000000000
[ 2149.925812] Mem abort info:
[ 2149.928713]   ESR = 0x0000000096000005
[ 2149.932762]   EC = 0x25: DABT (current EL), IL = 32 bits
[ 2149.938429]   SET = 0, FnV = 0
[ 2149.941814]   EA = 0, S1PTW = 0
[ 2149.945187]   FSC = 0x05: level 1 translation fault
[ 2149.950348] Data abort info:
[ 2149.953472]   ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
[ 2149.959243]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 2149.964593]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 2149.970243] user pgtable: 4k pages, 39-bit VAs, pgdp=000000008b507000
[ 2149.977068] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000
[ 2149.986062] Internal error: Oops: 0000000096000005 [#1] SMP
[ 2150.082282]  arht_wrapper(O) i2c_core arht_hook(O) crc32_generic
[ 2150.177623] CPU: 0 PID: 38 Comm: kworker/u9:1 Tainted: G O 6.6.73 #0
[ 2150.185362] Hardware name: Airoha AN7581 Evaluation Board (DT)
[ 2150.191189] Workqueue: nf_ft_offload_add nf_flow_rule_route_ipv6 [nf_flow_table]
[ 2150.198653] pstate: 00400005 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 2150.205615] pc : airoha_ppe_flow_offload_replace.isra.0+0x6dc/0xc54
[ 2150.211882] lr : airoha_ppe_flow_offload_replace.isra.0+0x6cc/0xc54
[ 2150.218149] sp : ffffffc080e8ba10
[ 2150.221456] x29: ffffffc080e8bae0 x28: ffffff80080b0000 x27: 0000000000000000
[ 2150.228591] x26: ffffff8001c70020 x25: 0000000000000002 x24: 0000000000000000
[ 2150.235727] x23: 0000000061000000 x22: 00000000ffffffed x21: ffffffc080e8bbb0
[ 2150.242862] x20: ffffff8001c70000 x19: 0000000000000008 x18: 0000000000000000
[ 2150.249998] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
[ 2150.257133] x14: 0000000000000001 x13: 0000000000000008 x12: 0101010101010101
[ 2150.264268] x11: 7f7f7f7f7f7f7f7f x10: 0000000000000041 x9 : 0000000000000000
[ 2150.271404] x8 : ffffffc080e8bad8 x7 : 0000000000000000 x6 : 0000000000000015
[ 2150.278540] x5 : ffffffc080e8ba4e x4 : 0000000000000004 x3 : 0000000000000000
[ 2150.285675] x2 : 0000000000000008 x1 : 00000000080b0000 x0 : 0000000000000000
[ 2150.292811] Call trace:
[ 2150.295250]  airoha_ppe_flow_offload_replace.isra.0+0x6dc/0xc54
[ 2150.301171]  airoha_ppe_setup_tc_block_cb+0x7c/0x8b4
[ 2150.306135]  nf_flow_offload_ip_hook+0x710/0x874 [nf_flow_table]
[ 2150.312168]  nf_flow_rule_route_ipv6+0x53c/0x580 [nf_flow_table]
[ 2150.318200]  process_one_work+0x178/0x2f0
[ 2150.322211]  worker_thread+0x2e4/0x4cc
[ 2150.325953]  kthread+0xd8/0xdc
[ 2150.329008]  ret_from_fork+0x10/0x20
[ 2150.332589] Code: b9007bf7 b4001e9c f9448380 b9491381 (f9400000)
[ 2150.338681] ---[ end trace 0000000000000000 ]---
[ 2150.343298] Kernel panic - not syncing: Oops: Fatal exception
[ 2150.349035] SMP: stopping secondary CPUs
[ 2150.352954] Kernel Offset: disabled
[ 2150.356438] CPU features: 0x0,00000000,00000000,1000400b
[ 2150.361743] Memory Limit: none

Fix the issue validating egress gdm port in airoha_ppe_foe_entry_prepare
routine.

Fixes: 00a7678310fe ("net: airoha: Introduce flowtable offload support")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
Changes in v2:
- Avoid checking netdev_priv pointer since it is always not NULL
- Link to v1: https://lore.kernel.org/r/20250312-airoha-flowtable-null-ptr-fix-v1-1-6363fab884d0@kernel.org
---
 drivers/net/ethernet/airoha/airoha_eth.c | 13 +++++++++++++
 drivers/net/ethernet/airoha/airoha_eth.h |  3 +++
 drivers/net/ethernet/airoha/airoha_ppe.c | 10 ++++++++--
 3 files changed, 24 insertions(+), 2 deletions(-)

---
base-commit: bfc6c67ec2d64d0ca4e5cc3e1ac84298a10b8d62
change-id: 20250312-airoha-flowtable-null-ptr-fix-a4656d12546a

Best regards,