diff mbox series

[net,v2,1/2] ethtool: Add link_mode parameter capability bit to ethtool_ops

Message ID 20210404081433.1260889-2-danieller@nvidia.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series Fix link_mode derived params functionality | expand

Checks

Context Check Description
netdev/cover_letter success Link
netdev/fixes_present success Link
netdev/patch_count success Link
netdev/tree_selection success Clearly marked for net
netdev/subject_prefix success Link
netdev/cc_maintainers warning 2 maintainers not CCed: amitc@mellanox.com leon@kernel.org
netdev/source_inline success Was 0 now: 0
netdev/verify_signedoff success Link
netdev/module_param success Was 0 now: 0
netdev/build_32bit success Errors and warnings before: 3073 this patch: 3073
netdev/kdoc success Errors and warnings before: 14 this patch: 14
netdev/verify_fixes success Link
netdev/checkpatch warning WARNING: Possible repeated word: 'Google'
netdev/build_allmodconfig_warn success Errors and warnings before: 3191 this patch: 3191
netdev/header_inline success Link

Commit Message

Danielle Ratson April 4, 2021, 8:14 a.m. UTC
Some drivers clear the 'ethtool_link_ksettings' struct in their
get_link_ksettings() callback, before populating it with actual values.
Such drivers will set the new 'link_mode' field to zero, resulting in
user space receiving wrong link mode information given that zero is a
valid value for the field.

Fix this by introducing a new capability bit ('cap_link_mode_supported')
to ethtool_ops, which indicates whether the driver supports 'link_mode'
parameter or not. Set it to true in mlxsw which is currently the only
driver supporting 'link_mode'.

Another problem is that some drivers (notably tun) can report random
values in the 'link_mode' field. This can result in a general protection
fault when the field is used as an index to the 'link_mode_params' array
[1].

This happens because such drivers implement their set_link_ksettings()
callback by simply overwriting their private copy of
'ethtool_link_ksettings' struct with the one they get from the stack,
which is not always properly initialized.

Fix this by making sure that the implied parameters will be derived from
the 'link_mode' parameter only when the 'cap_link_mode_supported' is set.

v2:
	* Introduce 'cap_link_mode_supported' instead of adding a
	  validity field to 'ethtool_link_ksettings' struct.

[1]
general protection fault, probably for non-canonical address 0xdffffc00f14cc32c: 0000 [#1] PREEMPT SMP KASAN
KASAN: probably user-memory-access in range [0x000000078a661960-0x000000078a661967]
CPU: 0 PID: 8452 Comm: syz-executor360 Not tainted 5.11.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:__ethtool_get_link_ksettings+0x1a3/0x3a0 net/ethtool/ioctl.c:446
Code: b7 3e fa 83 fd ff 0f 84 30 01 00 00 e8 16 b0 3e fa 48 8d 3c ed 60 d5 69 8a 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 14 02 48 89 f8 83 e0 07 83 c0 03
+38 d0 7c 08 84 d2 0f 85 b9
RSP: 0018:ffffc900019df7a0 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: ffff888026136008 RCX: 0000000000000000
RDX: 00000000f14cc32c RSI: ffffffff873439ca RDI: 000000078a661960
RBP: 00000000ffff8880 R08: 00000000ffffffff R09: ffff88802613606f
R10: ffffffff873439bc R11: 0000000000000000 R12: 0000000000000000
R13: ffff88802613606c R14: ffff888011d0c210 R15: ffff888011d0c210
FS:  0000000000749300(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000004b60f0 CR3: 00000000185c2000 CR4: 00000000001506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 linkinfo_prepare_data+0xfd/0x280 net/ethtool/linkinfo.c:37
 ethnl_default_notify+0x1dc/0x630 net/ethtool/netlink.c:586
 ethtool_notify+0xbd/0x1f0 net/ethtool/netlink.c:656
 ethtool_set_link_ksettings+0x277/0x330 net/ethtool/ioctl.c:620
 dev_ethtool+0x2b35/0x45d0 net/ethtool/ioctl.c:2842
 dev_ioctl+0x463/0xb70 net/core/dev_ioctl.c:440
 sock_do_ioctl+0x148/0x2d0 net/socket.c:1060
 sock_ioctl+0x477/0x6a0 net/socket.c:1177
 vfs_ioctl fs/ioctl.c:48 [inline]
 __do_sys_ioctl fs/ioctl.c:753 [inline]
 __se_sys_ioctl fs/ioctl.c:739 [inline]
 __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:739
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: c8907043c6ac9 ("ethtool: Get link mode in use instead of speed and duplex parameters")
Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum_ethtool.c | 1 +
 include/linux/ethtool.h                                | 5 ++++-
 net/ethtool/ioctl.c                                    | 8 ++++++--
 3 files changed, 11 insertions(+), 3 deletions(-)

Comments

Andrew Lunn April 4, 2021, 2:18 p.m. UTC | #1
On Sun, Apr 04, 2021 at 11:14:32AM +0300, Danielle Ratson wrote:
> Some drivers clear the 'ethtool_link_ksettings' struct in their
> get_link_ksettings() callback, before populating it with actual values.
> Such drivers will set the new 'link_mode' field to zero, resulting in
> user space receiving wrong link mode information given that zero is a
> valid value for the field.

That sounds like the bug, that 0 means something, not that the
structure is zeroed. It is good practice to zero structures about to
be returned to user space otherwise you could leak stack information
etc.

Do we still have time to fix the KAPI, so zero means unset? Where in
the workflow is the patch adding this feature? Is it in a released
kernel? or -rc kernel?

> Fix this by introducing a new capability bit ('cap_link_mode_supported')
> to ethtool_ops, which indicates whether the driver supports 'link_mode'
> parameter or not. Set it to true in mlxsw which is currently the only
> driver supporting 'link_mode'.
> 
> Another problem is that some drivers (notably tun) can report random
> values in the 'link_mode' field. This can result in a general protection
> fault when the field is used as an index to the 'link_mode_params' array
> [1].
> 
> This happens because such drivers implement their set_link_ksettings()
> callback by simply overwriting their private copy of
> 'ethtool_link_ksettings' struct with the one they get from the stack,
> which is not always properly initialized.
> 
> Fix this by making sure that the implied parameters will be derived from
> the 'link_mode' parameter only when the 'cap_link_mode_supported' is set.

So you left in place the kernel memory leak to user space? Or is there
an additional patch to fix tun as well?

   Andrew
Danielle Ratson April 4, 2021, 3:04 p.m. UTC | #2
> -----Original Message-----
> From: Andrew Lunn <andrew@lunn.ch>
> Sent: Sunday, April 4, 2021 5:18 PM
> To: Danielle Ratson <danieller@nvidia.com>
> Cc: netdev@vger.kernel.org; davem@davemloft.net; kuba@kernel.org; eric.dumazet@gmail.com; mkubecek@suse.cz;
> f.fainelli@gmail.com; acardace@redhat.com; irusskikh@marvell.com; gustavo@embeddedor.com; magnus.karlsson@intel.com;
> ecree@solarflare.com; Ido Schimmel <idosch@nvidia.com>; Jiri Pirko <jiri@nvidia.com>; mlxsw <mlxsw@nvidia.com>
> Subject: Re: [PATCH net v2 1/2] ethtool: Add link_mode parameter capability bit to ethtool_ops
> 
> On Sun, Apr 04, 2021 at 11:14:32AM +0300, Danielle Ratson wrote:
> > Some drivers clear the 'ethtool_link_ksettings' struct in their
> > get_link_ksettings() callback, before populating it with actual values.
> > Such drivers will set the new 'link_mode' field to zero, resulting in
> > user space receiving wrong link mode information given that zero is a
> > valid value for the field.
> 
> That sounds like the bug, that 0 means something, not that the structure is zeroed. It is good practice to zero structures about to be
> returned to user space otherwise you could leak stack information etc.
> 
> Do we still have time to fix the KAPI, so zero means unset? Where in the workflow is the patch adding this feature? Is it in a released
> kernel? or -rc kernel?

First, it is not the API structure that is passed to user space. Please pay attention that the link_mode field is added to "ethtool_link_ksettings" and not "ethtool_link_settings", which is the API.
So I am not sure what leak could happen in that situation, could you explain?

Second, the link_mode indices are uAPI, so 0 is not forbidden.

> 
> > Fix this by introducing a new capability bit
> > ('cap_link_mode_supported') to ethtool_ops, which indicates whether the driver supports 'link_mode'
> > parameter or not. Set it to true in mlxsw which is currently the only
> > driver supporting 'link_mode'.
> >
> > Another problem is that some drivers (notably tun) can report random
> > values in the 'link_mode' field. This can result in a general
> > protection fault when the field is used as an index to the
> > 'link_mode_params' array [1].
> >
> > This happens because such drivers implement their set_link_ksettings()
> > callback by simply overwriting their private copy of
> > 'ethtool_link_ksettings' struct with the one they get from the stack,
> > which is not always properly initialized.
> >
> > Fix this by making sure that the implied parameters will be derived
> > from the 'link_mode' parameter only when the 'cap_link_mode_supported' is set.
> 
> So you left in place the kernel memory leak to user space? Or is there an additional patch to fix tun as well?
> 
>    Andrew

Again, not sure what is the memory leak you are talking about. 
This patch solves the protection fault in tun driver, by the fact that now we are paying attention to the link_mode value only if the driver confirmed it supports the link_mode and set it properly.
Andrew Lunn April 4, 2021, 4:07 p.m. UTC | #3
> First, it is not the API structure that is passed to user space. Please pay attention

Yes, sorry. Jumped to assumptions without checking.

Let me try again.

    Andrew
Andrew Lunn April 4, 2021, 4:32 p.m. UTC | #4
> @@ -436,12 +436,16 @@ int __ethtool_get_link_ksettings(struct net_device *dev,
>  
>  	memset(link_ksettings, 0, sizeof(*link_ksettings));
>  
> -	link_ksettings->link_mode = -1;
>  	err = dev->ethtool_ops->get_link_ksettings(dev, link_ksettings);
>  	if (err)
>  		return err;
>  
> -	if (link_ksettings->link_mode != -1) {
> +	if (dev->ethtool_ops->cap_link_mode_supported &&
> +	    link_ksettings->link_mode != -1) {

Is this -1 behaviour documented anywhere? It seems like you just
changed its meaning. It used to mean, this field has not been set,
ignore it. Adding the cap_link_mode_supported it now means, we have
field has been set, but we have no idea what link mode is being used.
So you should probably add something to the documentation of struct
ethtool_link_ksettings.

I wonder if we should actually add ETHTOOL_LINK_MODE_UNKNOWN to enum
ethtool_link_mode_bit_indices?

> +		if (WARN_ON_ONCE(link_ksettings->link_mode >=
> +				 __ETHTOOL_LINK_MODE_MASK_NBITS))
> +			return -EINVAL;
> +
>  		link_info = &link_mode_params[link_ksettings->link_mode];
>  		link_ksettings->base.speed = link_info->speed;
>  		link_ksettings->lanes = link_info->lanes;

If dev->ethtool_ops->cap_link_mode_supported &&
link_ksettings->link_mode == -1 should you be setting speed to
SPEED_UNKNOWN, and lanes to LANE_UNKNOWN? Or is that already the
default?

But over all, this API between the core and the driver seems
messy. Why not just add a helper in common.c which translates link
mode to speed/duplex/lanes and call it in the driver. Then you don't
need this capability flags, which i doubt any other driver will ever
use. And you don't need to worry about drivers returning random
values. As far as i can see, the link_mode returned by the driver is
not used for anything other than for this translation. So i don't see
a need for it outside of the driver. Or maybe i'm missing something?

	Andrew
Danielle Ratson April 5, 2021, 5:01 p.m. UTC | #5
> -----Original Message-----
> From: Andrew Lunn <andrew@lunn.ch>
> Sent: Sunday, April 4, 2021 7:33 PM
> To: Danielle Ratson <danieller@nvidia.com>
> Cc: netdev@vger.kernel.org; davem@davemloft.net; kuba@kernel.org; eric.dumazet@gmail.com; mkubecek@suse.cz;
> f.fainelli@gmail.com; acardace@redhat.com; irusskikh@marvell.com; gustavo@embeddedor.com; magnus.karlsson@intel.com;
> ecree@solarflare.com; Ido Schimmel <idosch@nvidia.com>; Jiri Pirko <jiri@nvidia.com>; mlxsw <mlxsw@nvidia.com>
> Subject: Re: [PATCH net v2 1/2] ethtool: Add link_mode parameter capability bit to ethtool_ops
> 
> > @@ -436,12 +436,16 @@ int __ethtool_get_link_ksettings(struct
> > net_device *dev,
> >
> >  	memset(link_ksettings, 0, sizeof(*link_ksettings));
> >
> > -	link_ksettings->link_mode = -1;
> >  	err = dev->ethtool_ops->get_link_ksettings(dev, link_ksettings);
> >  	if (err)
> >  		return err;
> >
> > -	if (link_ksettings->link_mode != -1) {
> > +	if (dev->ethtool_ops->cap_link_mode_supported &&
> > +	    link_ksettings->link_mode != -1) {
> 
> Is this -1 behaviour documented anywhere? It seems like you just changed its meaning. It used to mean, this field has not been set,
> ignore it. Adding the cap_link_mode_supported it now means, we have field has been set, but we have no idea what link mode is
> being used.
> So you should probably add something to the documentation of struct ethtool_link_ksettings.
> 
> I wonder if we should actually add ETHTOOL_LINK_MODE_UNKNOWN to enum ethtool_link_mode_bit_indices?
> 
> > +		if (WARN_ON_ONCE(link_ksettings->link_mode >=
> > +				 __ETHTOOL_LINK_MODE_MASK_NBITS))
> > +			return -EINVAL;
> > +
> >  		link_info = &link_mode_params[link_ksettings->link_mode];
> >  		link_ksettings->base.speed = link_info->speed;
> >  		link_ksettings->lanes = link_info->lanes;
> 
> If dev->ethtool_ops->cap_link_mode_supported && link_ksettings->link_mode == -1 should you be setting speed to
> SPEED_UNKNOWN, and lanes to LANE_UNKNOWN? Or is that already the default?
> 
> But over all, this API between the core and the driver seems messy. Why not just add a helper in common.c which translates link
> mode to speed/duplex/lanes and call it in the driver. Then you don't need this capability flags, which i doubt any other driver will ever
> use. And you don't need to worry about drivers returning random values. As far as i can see, the link_mode returned by the driver is
> not used for anything other than for this translation. So i don't see a need for it outside of the driver. Or maybe i'm missing
> something?
> 
> 	Andrew

Hi Andrew,

Please see my patch here: https://github.com/daniellerts/linux_mlxsw/commit/72ca614951418843aa87323630c354691d9e50d4
Since a lack of time until the window closes, please see if that is what you meant and you are ok with it, before I am sending another version.

Thanks,
Danielle
diff mbox series

Patch

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_ethtool.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_ethtool.c
index 0bd64169bf81..54f04c94210c 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_ethtool.c
@@ -1061,6 +1061,7 @@  mlxsw_sp_get_ts_info(struct net_device *netdev, struct ethtool_ts_info *info)
 
 const struct ethtool_ops mlxsw_sp_port_ethtool_ops = {
 	.cap_link_lanes_supported	= true,
+	.cap_link_mode_supported	= true,
 	.get_drvinfo			= mlxsw_sp_port_get_drvinfo,
 	.get_link			= ethtool_op_get_link,
 	.get_link_ext_state		= mlxsw_sp_port_get_link_ext_state,
diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
index ec4cd3921c67..9e6eb6df375d 100644
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -269,6 +269,8 @@  struct ethtool_pause_stats {
  * struct ethtool_ops - optional netdev operations
  * @cap_link_lanes_supported: indicates if the driver supports lanes
  *	parameter.
+ * @cap_link_mode_supported: indicates if the driver supports link_mode
+ *	parameter.
  * @supported_coalesce_params: supported types of interrupt coalescing.
  * @get_drvinfo: Report driver/device information.  Should only set the
  *	@driver, @version, @fw_version and @bus_info fields.  If not
@@ -424,7 +426,8 @@  struct ethtool_pause_stats {
  * of the generic netdev features interface.
  */
 struct ethtool_ops {
-	u32     cap_link_lanes_supported:1;
+	u32     cap_link_lanes_supported:1,
+		cap_link_mode_supported:1;
 	u32	supported_coalesce_params;
 	void	(*get_drvinfo)(struct net_device *, struct ethtool_drvinfo *);
 	int	(*get_regs_len)(struct net_device *);
diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c
index 24783b71c584..cebbf93b27a7 100644
--- a/net/ethtool/ioctl.c
+++ b/net/ethtool/ioctl.c
@@ -436,12 +436,16 @@  int __ethtool_get_link_ksettings(struct net_device *dev,
 
 	memset(link_ksettings, 0, sizeof(*link_ksettings));
 
-	link_ksettings->link_mode = -1;
 	err = dev->ethtool_ops->get_link_ksettings(dev, link_ksettings);
 	if (err)
 		return err;
 
-	if (link_ksettings->link_mode != -1) {
+	if (dev->ethtool_ops->cap_link_mode_supported &&
+	    link_ksettings->link_mode != -1) {
+		if (WARN_ON_ONCE(link_ksettings->link_mode >=
+				 __ETHTOOL_LINK_MODE_MASK_NBITS))
+			return -EINVAL;
+
 		link_info = &link_mode_params[link_ksettings->link_mode];
 		link_ksettings->base.speed = link_info->speed;
 		link_ksettings->lanes = link_info->lanes;