mbox series

[net-next,00/15] eth: fbnic: Add network driver for Meta Platforms Host Network Interface

Message ID	171217454226.1598374.8971335637623132496.stgit@ahduyck-xeon-server.home.arpa (mailing list archive)
Headers	show Received: from mail-pl1-f171.google.com (mail-pl1-f171.google.com [209.85.214.171]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EC72D56B70; Wed, 3 Apr 2024 20:08:26 +0000 (UTC) Subject: [net-next PATCH 00/15] eth: fbnic: Add network driver for Meta Platforms Host Network Interface From: Alexander Duyck <alexander.duyck@gmail.com> To: netdev@vger.kernel.org Cc: bhelgaas@google.com, linux-pci@vger.kernel.org, Alexander Duyck <alexanderduyck@fb.com>, kuba@kernel.org, davem@davemloft.net, pabeni@redhat.com Date: Wed, 03 Apr 2024 13:08:24 -0700 Message-ID: <171217454226.1598374.8971335637623132496.stgit@ahduyck-xeon-server.home.arpa> User-Agent: StGit/1.5 Precedence: bulk MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit
Series	eth: fbnic: Add network driver for Meta Platforms Host Network Interface \| expand [net-next,00/15] eth: fbnic: Add network driver for Meta Platforms Host Network Interface [net-next,01/15] PCI: Add Meta Platforms vendor ID [net-next,02/15] eth: fbnic: add scaffolding for Meta's NIC driver [net-next,03/15] eth: fbnic: Allocate core device specific structures and devlink interface [net-next,04/15] eth: fbnic: Add register init to set PCIe/Ethernet device config [net-next,05/15] eth: fbnic: add message parsing for FW messages [net-next,06/15] eth: fbnic: add FW communication mechanism [net-next,07/15] eth: fbnic: allocate a netdevice and napi vectors with queues [net-next,08/15] eth: fbnic: implement Tx queue alloc/start/stop/free [net-next,09/15] eth: fbnic: implement Rx queue alloc/start/stop/free [net-next,10/15] eth: fbnic: Add initial messaging to notify FW of our presence [net-next,11/15] eth: fbnic: Enable Ethernet link setup [net-next,12/15] eth: fbnic: add basic Tx handling [net-next,13/15] eth: fbnic: add basic Rx handling [net-next,14/15] eth: fbnic: add L2 address programming [net-next,15/15] eth: fbnic: write the TCAM tables used for RSS control and Rx to host

Message ID

171217454226.1598374.8971335637623132496.stgit@ahduyck-xeon-server.home.arpa (mailing list archive)

Headers

Subject: [net-next PATCH 00/15] eth: fbnic: Add network driver for Meta
 Platforms Host Network Interface
From: Alexander Duyck <alexander.duyck@gmail.com>
To: netdev@vger.kernel.org
Cc: bhelgaas@google.com, linux-pci@vger.kernel.org,
 Alexander Duyck <alexanderduyck@fb.com>, kuba@kernel.org,
 davem@davemloft.net, pabeni@redhat.com
Date: Wed, 03 Apr 2024 13:08:24 -0700
Message-ID: 
 <171217454226.1598374.8971335637623132496.stgit@ahduyck-xeon-server.home.arpa>
User-Agent: StGit/1.5
Precedence: bulk
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit

Series

eth: fbnic: Add network driver for Meta Platforms Host Network Interface | expand

Message

Alexander Duyck April 3, 2024, 8:08 p.m. UTC

This patch set includes the necessary patches to enable basic Tx and Rx
over the Meta Platforms Host Network Interface. To do this we introduce a
new driver and driver and directories in the form of
"drivers/net/ethernet/meta/fbnic".

Due to submission limits the general plan to submit a minimal driver for
now almost equivalent to a UEFI driver in functionality, and then follow up
over the coming weeks enabling additional offloads and more features for
the device.

The general plan is to look at adding support for ethtool, statistics, and
start work on offloads in the next set of patches.

---

Alexander Duyck (15):
      PCI: Add Meta Platforms vendor ID
      eth: fbnic: add scaffolding for Meta's NIC driver
      eth: fbnic: Allocate core device specific structures and devlink interface
      eth: fbnic: Add register init to set PCIe/Ethernet device config
      eth: fbnic: add message parsing for FW messages
      eth: fbnic: add FW communication mechanism
      eth: fbnic: allocate a netdevice and napi vectors with queues
      eth: fbnic: implement Tx queue alloc/start/stop/free
      eth: fbnic: implement Rx queue alloc/start/stop/free
      eth: fbnic: Add initial messaging to notify FW of our presence
      eth: fbnic: Enable Ethernet link setup
      eth: fbnic: add basic Tx handling
      eth: fbnic: add basic Rx handling
      eth: fbnic: add L2 address programming
      eth: fbnic: write the TCAM tables used for RSS control and Rx to host


 MAINTAINERS                                   |    7 +
 drivers/net/ethernet/Kconfig                  |    1 +
 drivers/net/ethernet/Makefile                 |    1 +
 drivers/net/ethernet/meta/Kconfig             |   29 +
 drivers/net/ethernet/meta/Makefile            |    6 +
 drivers/net/ethernet/meta/fbnic/Makefile      |   18 +
 drivers/net/ethernet/meta/fbnic/fbnic.h       |  148 ++
 drivers/net/ethernet/meta/fbnic/fbnic_csr.h   |  912 ++++++++
 .../net/ethernet/meta/fbnic/fbnic_devlink.c   |   86 +
 .../net/ethernet/meta/fbnic/fbnic_drvinfo.h   |    5 +
 drivers/net/ethernet/meta/fbnic/fbnic_fw.c    |  823 ++++++++
 drivers/net/ethernet/meta/fbnic/fbnic_fw.h    |  133 ++
 drivers/net/ethernet/meta/fbnic/fbnic_irq.c   |  251 +++
 drivers/net/ethernet/meta/fbnic/fbnic_mac.c   | 1025 +++++++++
 drivers/net/ethernet/meta/fbnic/fbnic_mac.h   |   83 +
 .../net/ethernet/meta/fbnic/fbnic_netdev.c    |  470 +++++
 .../net/ethernet/meta/fbnic/fbnic_netdev.h    |   59 +
 drivers/net/ethernet/meta/fbnic/fbnic_pci.c   |  633 ++++++
 drivers/net/ethernet/meta/fbnic/fbnic_rpc.c   |  709 +++++++
 drivers/net/ethernet/meta/fbnic/fbnic_rpc.h   |  189 ++
 drivers/net/ethernet/meta/fbnic/fbnic_tlv.c   |  529 +++++
 drivers/net/ethernet/meta/fbnic/fbnic_tlv.h   |  175 ++
 drivers/net/ethernet/meta/fbnic/fbnic_txrx.c  | 1873 +++++++++++++++++
 drivers/net/ethernet/meta/fbnic/fbnic_txrx.h  |  125 ++
 include/linux/pci_ids.h                       |    2 +
 25 files changed, 8292 insertions(+)
 create mode 100644 drivers/net/ethernet/meta/Kconfig
 create mode 100644 drivers/net/ethernet/meta/Makefile
 create mode 100644 drivers/net/ethernet/meta/fbnic/Makefile
 create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic.h
 create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_csr.h
 create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_devlink.c
 create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_drvinfo.h
 create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_fw.c
 create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_fw.h
 create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_irq.c
 create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_mac.c
 create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_mac.h
 create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_netdev.c
 create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_netdev.h
 create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_pci.c
 create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_rpc.c
 create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_rpc.h
 create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_tlv.c
 create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_tlv.h
 create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_txrx.c
 create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_txrx.h

--

Comments

Bjorn Helgaas April 3, 2024, 8:42 p.m. UTC | #1

On Wed, Apr 03, 2024 at 01:08:24PM -0700, Alexander Duyck wrote:
> This patch set includes the necessary patches to enable basic Tx and Rx
> over the Meta Platforms Host Network Interface. To do this we introduce a
> new driver and driver and directories in the form of
> "drivers/net/ethernet/meta/fbnic".

>       PCI: Add Meta Platforms vendor ID
>       eth: fbnic: add scaffolding for Meta's NIC driver
>       eth: fbnic: Allocate core device specific structures and devlink interface
>       eth: fbnic: Add register init to set PCIe/Ethernet device config
>       eth: fbnic: add message parsing for FW messages
>       eth: fbnic: add FW communication mechanism
>       eth: fbnic: allocate a netdevice and napi vectors with queues
>       eth: fbnic: implement Tx queue alloc/start/stop/free
>       eth: fbnic: implement Rx queue alloc/start/stop/free
>       eth: fbnic: Add initial messaging to notify FW of our presence
>       eth: fbnic: Enable Ethernet link setup
>       eth: fbnic: add basic Tx handling
>       eth: fbnic: add basic Rx handling
>       eth: fbnic: add L2 address programming
>       eth: fbnic: write the TCAM tables used for RSS control and Rx to host

Random mix of initial caps in subjects.  Also kind of a mix of initial
caps in comments, e.g.,

  $ grep -Er "^\s+/\*" drivers/net/ethernet/meta/fbnic/

I didn't bother to figure out which patch these typos were in:

  $ codespell drivers/net/ethernet/meta/fbnic/
  drivers/net/ethernet/meta/fbnic/fbnic_pci.c:452: ot ==> to, of, or
  drivers/net/ethernet/meta/fbnic/fbnic_pci.c:479: Reenable ==> Re-enable
  drivers/net/ethernet/meta/fbnic/fbnic_txrx.c:569: caclulation ==> calculation
  drivers/net/ethernet/meta/fbnic/fbnic_fw.c:740: conents ==> contents
  drivers/net/ethernet/meta/fbnic/fbnic_txrx.h:19: cachline ==> cacheline

Jiri Pirko April 4, 2024, 11:37 a.m. UTC | #2

Wed, Apr 03, 2024 at 10:08:24PM CEST, alexander.duyck@gmail.com wrote:
>This patch set includes the necessary patches to enable basic Tx and Rx
>over the Meta Platforms Host Network Interface. To do this we introduce a
>new driver and driver and directories in the form of
>"drivers/net/ethernet/meta/fbnic".
>
>Due to submission limits the general plan to submit a minimal driver for
>now almost equivalent to a UEFI driver in functionality, and then follow up
>over the coming weeks enabling additional offloads and more features for
>the device.
>
>The general plan is to look at adding support for ethtool, statistics, and
>start work on offloads in the next set of patches.

Could you please shed some light for the motivation to introduce this
driver in the community kernel? Is this device something people can
obtain in a shop, or is it rather something to be seen in Meta
datacenter only? If the second is the case, why exactly would we need
this driver?



>
>---
>
>Alexander Duyck (15):
>      PCI: Add Meta Platforms vendor ID
>      eth: fbnic: add scaffolding for Meta's NIC driver
>      eth: fbnic: Allocate core device specific structures and devlink interface
>      eth: fbnic: Add register init to set PCIe/Ethernet device config
>      eth: fbnic: add message parsing for FW messages
>      eth: fbnic: add FW communication mechanism
>      eth: fbnic: allocate a netdevice and napi vectors with queues
>      eth: fbnic: implement Tx queue alloc/start/stop/free
>      eth: fbnic: implement Rx queue alloc/start/stop/free
>      eth: fbnic: Add initial messaging to notify FW of our presence
>      eth: fbnic: Enable Ethernet link setup
>      eth: fbnic: add basic Tx handling
>      eth: fbnic: add basic Rx handling
>      eth: fbnic: add L2 address programming
>      eth: fbnic: write the TCAM tables used for RSS control and Rx to host
>
>
> MAINTAINERS                                   |    7 +
> drivers/net/ethernet/Kconfig                  |    1 +
> drivers/net/ethernet/Makefile                 |    1 +
> drivers/net/ethernet/meta/Kconfig             |   29 +
> drivers/net/ethernet/meta/Makefile            |    6 +
> drivers/net/ethernet/meta/fbnic/Makefile      |   18 +
> drivers/net/ethernet/meta/fbnic/fbnic.h       |  148 ++
> drivers/net/ethernet/meta/fbnic/fbnic_csr.h   |  912 ++++++++
> .../net/ethernet/meta/fbnic/fbnic_devlink.c   |   86 +
> .../net/ethernet/meta/fbnic/fbnic_drvinfo.h   |    5 +
> drivers/net/ethernet/meta/fbnic/fbnic_fw.c    |  823 ++++++++
> drivers/net/ethernet/meta/fbnic/fbnic_fw.h    |  133 ++
> drivers/net/ethernet/meta/fbnic/fbnic_irq.c   |  251 +++
> drivers/net/ethernet/meta/fbnic/fbnic_mac.c   | 1025 +++++++++
> drivers/net/ethernet/meta/fbnic/fbnic_mac.h   |   83 +
> .../net/ethernet/meta/fbnic/fbnic_netdev.c    |  470 +++++
> .../net/ethernet/meta/fbnic/fbnic_netdev.h    |   59 +
> drivers/net/ethernet/meta/fbnic/fbnic_pci.c   |  633 ++++++
> drivers/net/ethernet/meta/fbnic/fbnic_rpc.c   |  709 +++++++
> drivers/net/ethernet/meta/fbnic/fbnic_rpc.h   |  189 ++
> drivers/net/ethernet/meta/fbnic/fbnic_tlv.c   |  529 +++++
> drivers/net/ethernet/meta/fbnic/fbnic_tlv.h   |  175 ++
> drivers/net/ethernet/meta/fbnic/fbnic_txrx.c  | 1873 +++++++++++++++++
> drivers/net/ethernet/meta/fbnic/fbnic_txrx.h  |  125 ++
> include/linux/pci_ids.h                       |    2 +
> 25 files changed, 8292 insertions(+)
> create mode 100644 drivers/net/ethernet/meta/Kconfig
> create mode 100644 drivers/net/ethernet/meta/Makefile
> create mode 100644 drivers/net/ethernet/meta/fbnic/Makefile
> create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic.h
> create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_csr.h
> create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_devlink.c
> create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_drvinfo.h
> create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_fw.c
> create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_fw.h
> create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_irq.c
> create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_mac.c
> create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_mac.h
> create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_netdev.c
> create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_netdev.h
> create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_pci.c
> create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_rpc.c
> create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_rpc.h
> create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_tlv.c
> create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_tlv.h
> create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_txrx.c
> create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_txrx.h
>
>--
>
>

Alexander Duyck April 4, 2024, 2:45 p.m. UTC | #3

On Thu, Apr 4, 2024 at 4:37 AM Jiri Pirko <jiri@resnulli.us> wrote:
>
> Wed, Apr 03, 2024 at 10:08:24PM CEST, alexander.duyck@gmail.com wrote:
> >This patch set includes the necessary patches to enable basic Tx and Rx
> >over the Meta Platforms Host Network Interface. To do this we introduce a
> >new driver and driver and directories in the form of
> >"drivers/net/ethernet/meta/fbnic".
> >
> >Due to submission limits the general plan to submit a minimal driver for
> >now almost equivalent to a UEFI driver in functionality, and then follow up
> >over the coming weeks enabling additional offloads and more features for
> >the device.
> >
> >The general plan is to look at adding support for ethtool, statistics, and
> >start work on offloads in the next set of patches.
>
> Could you please shed some light for the motivation to introduce this
> driver in the community kernel? Is this device something people can
> obtain in a shop, or is it rather something to be seen in Meta
> datacenter only? If the second is the case, why exactly would we need
> this driver?

For now this is Meta only. However there are several reasons for
wanting to include this in the upstream kernel.

First is the fact that from a maintenance standpoint it is easier to
avoid drifting from the upstream APIs and such if we are in the kernel
it makes things much easier to maintain as we can just pull in patches
without having to add onto that work by having to craft backports
around code that isn't already in upstream.

Second is the fact that as we introduce new features with our driver
it is much easier to show a proof of concept with the driver being in
the kernel than not. It makes it much harder to work with the
community on offloads and such if we don't have a good vehicle to use
for that. What this driver will provide is an opportunity to push
changes that would be beneficial to us, and likely the rest of the
community without being constrained by what vendors decide they want
to enable or not. The general idea is that if we can show benefit with
our NIC then other vendors would be more likely to follow in our path.

Lastly, there is a good chance that we may end up opening up more than
just the driver code for this project assuming we can get past these
initial hurdles. I don't know if you have noticed but Meta is pretty
involved in the Open Compute Project. So if we want to work with third
parties on things like firmware and hardware it makes it much easier
to do so if the driver is already open and publicly available in the
Linux kernel.

Thanks,

- Alex

Andrew Lunn April 4, 2024, 3:24 p.m. UTC | #4

> For now this is Meta only. However there are several reasons for
> wanting to include this in the upstream kernel.
> 
> First is the fact that from a maintenance standpoint it is easier to
> avoid drifting from the upstream APIs and such if we are in the kernel
> it makes things much easier to maintain as we can just pull in patches
> without having to add onto that work by having to craft backports
> around code that isn't already in upstream.
> 
> Second is the fact that as we introduce new features with our driver
> it is much easier to show a proof of concept with the driver being in
> the kernel than not. It makes it much harder to work with the
> community on offloads and such if we don't have a good vehicle to use
> for that. What this driver will provide is an opportunity to push
> changes that would be beneficial to us, and likely the rest of the
> community without being constrained by what vendors decide they want
> to enable or not. The general idea is that if we can show benefit with
> our NIC then other vendors would be more likely to follow in our path.
 
Given the discussion going on in the thread "mlx5 ConnectX control
misc driver", you also plan to show you don't need such a misc driver?

	Andrew

Jiri Pirko April 4, 2024, 3:36 p.m. UTC | #5

Thu, Apr 04, 2024 at 04:45:14PM CEST, alexander.duyck@gmail.com wrote:
>On Thu, Apr 4, 2024 at 4:37 AM Jiri Pirko <jiri@resnulli.us> wrote:
>>
>> Wed, Apr 03, 2024 at 10:08:24PM CEST, alexander.duyck@gmail.com wrote:
>> >This patch set includes the necessary patches to enable basic Tx and Rx
>> >over the Meta Platforms Host Network Interface. To do this we introduce a
>> >new driver and driver and directories in the form of
>> >"drivers/net/ethernet/meta/fbnic".
>> >
>> >Due to submission limits the general plan to submit a minimal driver for
>> >now almost equivalent to a UEFI driver in functionality, and then follow up
>> >over the coming weeks enabling additional offloads and more features for
>> >the device.
>> >
>> >The general plan is to look at adding support for ethtool, statistics, and
>> >start work on offloads in the next set of patches.
>>
>> Could you please shed some light for the motivation to introduce this
>> driver in the community kernel? Is this device something people can
>> obtain in a shop, or is it rather something to be seen in Meta
>> datacenter only? If the second is the case, why exactly would we need
>> this driver?
>
>For now this is Meta only. However there are several reasons for
>wanting to include this in the upstream kernel.
>
>First is the fact that from a maintenance standpoint it is easier to
>avoid drifting from the upstream APIs and such if we are in the kernel
>it makes things much easier to maintain as we can just pull in patches
>without having to add onto that work by having to craft backports
>around code that isn't already in upstream.

That is making life easier for you, making it harder for the community.
O relevance.


>
>Second is the fact that as we introduce new features with our driver
>it is much easier to show a proof of concept with the driver being in
>the kernel than not. It makes it much harder to work with the
>community on offloads and such if we don't have a good vehicle to use
>for that. What this driver will provide is an opportunity to push
>changes that would be beneficial to us, and likely the rest of the
>community without being constrained by what vendors decide they want
>to enable or not. The general idea is that if we can show benefit with
>our NIC then other vendors would be more likely to follow in our path.

Yeah, so not even we would have to maintain driver nobody (outside Meta)
uses or cares about, you say that we will likely maintain more of a dead
code related to that. I think that in Linux kernel, there any many
examples of similarly dead code that causes a lot of headaches to
maintain.

You just want to make your life easier here again. Don't drag community
into this please.


>
>Lastly, there is a good chance that we may end up opening up more than
>just the driver code for this project assuming we can get past these
>initial hurdles. I don't know if you have noticed but Meta is pretty
>involved in the Open Compute Project. So if we want to work with third
>parties on things like firmware and hardware it makes it much easier
>to do so if the driver is already open and publicly available in the
>Linux kernel.

OCP, ehm, lets say I have my reservations...
Okay, what motivation would anyne have to touch the fw of a hardware
running inside Meta datacenter only? Does not make any sense.

I'd say come again when your HW is not limited to Meta datacenter.
For the record and FWIW, I NACK this.


>
>Thanks,
>
>- Alex

Jakub Kicinski April 4, 2024, 3:37 p.m. UTC | #6

On Thu, 4 Apr 2024 17:24:03 +0200 Andrew Lunn wrote:
> Given the discussion going on in the thread "mlx5 ConnectX control
> misc driver", you also plan to show you don't need such a misc driver?

To the extent to which it is possible to prove an in-existence
of something :) Since my argument that Meta can use _vendor_ devices
without resorting to "misc drivers" doesn't convince them, I doubt
this data point would help.

Andrew Lunn April 4, 2024, 6:35 p.m. UTC | #7

> OCP, ehm, lets say I have my reservations...
> Okay, what motivation would anyne have to touch the fw of a hardware
> running inside Meta datacenter only? Does not make any sense.
> 
> I'd say come again when your HW is not limited to Meta datacenter.
> For the record and FWIW, I NACK this.

Is ENA used outside of Amazon data centres?

Is FUNGIBLE used outside of Microsoft data centres?

Is gVNIC used outside of Google data centres?

I don't actually know, i'm more of an Embedded developer.

  Andrew

Leon Romanovsky April 4, 2024, 7:05 p.m. UTC | #8

On Thu, Apr 04, 2024 at 08:35:18PM +0200, Andrew Lunn wrote:
> > OCP, ehm, lets say I have my reservations...
> > Okay, what motivation would anyne have to touch the fw of a hardware
> > running inside Meta datacenter only? Does not make any sense.
> > 
> > I'd say come again when your HW is not limited to Meta datacenter.
> > For the record and FWIW, I NACK this.
> 
> Is ENA used outside of Amazon data centres?

At least this driver is needed when you use cloud instance in Amazon
and want install your own kernel/OS.

> 
> Is FUNGIBLE used outside of Microsoft data centres?

Have no idea

> 
> Is gVNIC used outside of Google data centres?

You need this driver too.
https://cloud.google.com/compute/docs/networking/using-gvnic

> 
> I don't actually know, i'm more of an Embedded developer.
> 
>   Andrew
> 
>

Alexander Duyck April 4, 2024, 7:22 p.m. UTC | #9

On Thu, Apr 4, 2024 at 8:36 AM Jiri Pirko <jiri@resnulli.us> wrote:
>
> Thu, Apr 04, 2024 at 04:45:14PM CEST, alexander.duyck@gmail.com wrote:
> >On Thu, Apr 4, 2024 at 4:37 AM Jiri Pirko <jiri@resnulli.us> wrote:
> >>
> >> Wed, Apr 03, 2024 at 10:08:24PM CEST, alexander.duyck@gmail.com wrote:

<...>

> >> Could you please shed some light for the motivation to introduce this
> >> driver in the community kernel? Is this device something people can
> >> obtain in a shop, or is it rather something to be seen in Meta
> >> datacenter only? If the second is the case, why exactly would we need
> >> this driver?
> >
> >For now this is Meta only. However there are several reasons for
> >wanting to include this in the upstream kernel.
> >
> >First is the fact that from a maintenance standpoint it is easier to
> >avoid drifting from the upstream APIs and such if we are in the kernel
> >it makes things much easier to maintain as we can just pull in patches
> >without having to add onto that work by having to craft backports
> >around code that isn't already in upstream.
>
> That is making life easier for you, making it harder for the community.
> O relevance.
>
>
> >
> >Second is the fact that as we introduce new features with our driver
> >it is much easier to show a proof of concept with the driver being in
> >the kernel than not. It makes it much harder to work with the
> >community on offloads and such if we don't have a good vehicle to use
> >for that. What this driver will provide is an opportunity to push
> >changes that would be beneficial to us, and likely the rest of the
> >community without being constrained by what vendors decide they want
> >to enable or not. The general idea is that if we can show benefit with
> >our NIC then other vendors would be more likely to follow in our path.
>
> Yeah, so not even we would have to maintain driver nobody (outside Meta)
> uses or cares about, you say that we will likely maintain more of a dead
> code related to that. I think that in Linux kernel, there any many
> examples of similarly dead code that causes a lot of headaches to
> maintain.
>
> You just want to make your life easier here again. Don't drag community
> into this please.

The argument itself doesn't really hold water. The fact is the Meta
data centers are not an insignificant consumer of Linux, so it isn't
as if the driver isn't going to be used. This implies some lack of
good faith from Meta. I don't understand that as we are contributing
across multiple areas in the kernel including networking and ebpf. Is
Meta expected to start pulling time from our upstream maintainers to
have them update out-of-tree kernel modules since the community isn't
willing to let us maintain it in the kernel? Is the message that the
kernel is expected to get value from Meta, but that value is not meant
to be reciprocated? Would you really rather have us start maintaining
our own internal kernel with our own "proprietary goodness", and ask
other NIC vendors to have to maintain their drivers against yet
another kernel if they want to be used in our data centers?

As pointed out by Andew we aren't the first data center to push a
driver for our own proprietary device. The fact is there have been
drivers added for devices that were for purely emulated devices with
no actual customers such as rocker. Should the switch vendors at the
time have pushed back on it stating it wasn't a real "for sale"
device? The whole argument seems counter to what is expected. When a
vendor creates a new device and will likely be enabling new kernel
features my understanding is that it is better to be in the kernel
than not.

Putting a criteria on it that it must be "for sale" seems rather
arbitrary and capricious, especially given that most drivers have to
be pushed out long before they are available in the market in order to
meet deadlines to get the driver into OSV releases such as Redhat when
it hits the market. By that logic should we block all future drivers
until we can find them for sale somewhere? That way we don't run the
risk of adding a vendor driver for a product that might be scrapped
due to a last minute bug that will cause it to never be released.

Jakub Kicinski April 4, 2024, 8:25 p.m. UTC | #10

On Thu, 4 Apr 2024 12:22:02 -0700 Alexander Duyck wrote:
> The argument itself doesn't really hold water. The fact is the Meta
> data centers are not an insignificant consumer of Linux, 

customer or beneficiary ?

> so it isn't as if the driver isn't going to be used. This implies
> some lack of good faith from Meta.

"Good faith" is not a sufficient foundation for a community consisting
of volunteers, and commercial entities (with the xz debacle maybe even
less today than it was a month ago). As a maintainer I really don't want
to be in position of judging the "good faith" of corporate actors.

> I don't understand that as we are
> contributing across multiple areas in the kernel including networking
> and ebpf. Is Meta expected to start pulling time from our upstream
> maintainers to have them update out-of-tree kernel modules since the
> community isn't willing to let us maintain it in the kernel? Is the
> message that the kernel is expected to get value from Meta, but that
> value is not meant to be reciprocated? Would you really rather have
> us start maintaining our own internal kernel with our own
> "proprietary goodness", and ask other NIC vendors to have to maintain
> their drivers against yet another kernel if they want to be used in
> our data centers?

Please allow the community to make rational choices in the interest of
the project and more importantly the interest of its broader user base.

Google would also claim "good faith" -- undoubtedly is supports 
the kernel, and lets some of its best engineers contribute.
Did that make them stop trying to build Fuchsia? The "good faith" of
companies operates with the limits of margin of error of they consider
rational and beneficial.

I don't want to put my thumb on the scale (yet?), but (with my
maintainer hat on) please don't use the "Meta is good" argument, because
someone will send a similar driver from a less involved company later on
and we'll be accused of playing favorites :( Plus companies can change
their approach to open source from "inclusive" to "extractive" 
(to borrow the economic terminology) rather quickly.

John Fastabend April 4, 2024, 9:59 p.m. UTC | #11

Jakub Kicinski wrote:
> On Thu, 4 Apr 2024 12:22:02 -0700 Alexander Duyck wrote:
> > The argument itself doesn't really hold water. The fact is the Meta
> > data centers are not an insignificant consumer of Linux, 
> 
> customer or beneficiary ?
> 
> > so it isn't as if the driver isn't going to be used. This implies
> > some lack of good faith from Meta.
> 
> "Good faith" is not a sufficient foundation for a community consisting
> of volunteers, and commercial entities (with the xz debacle maybe even
> less today than it was a month ago). As a maintainer I really don't want
> to be in position of judging the "good faith" of corporate actors.
> 
> > I don't understand that as we are
> > contributing across multiple areas in the kernel including networking
> > and ebpf. Is Meta expected to start pulling time from our upstream
> > maintainers to have them update out-of-tree kernel modules since the
> > community isn't willing to let us maintain it in the kernel? Is the
> > message that the kernel is expected to get value from Meta, but that
> > value is not meant to be reciprocated? Would you really rather have
> > us start maintaining our own internal kernel with our own
> > "proprietary goodness", and ask other NIC vendors to have to maintain
> > their drivers against yet another kernel if they want to be used in
> > our data centers?
> 
> Please allow the community to make rational choices in the interest of
> the project and more importantly the interest of its broader user base.
> 
> Google would also claim "good faith" -- undoubtedly is supports 
> the kernel, and lets some of its best engineers contribute.
> Did that make them stop trying to build Fuchsia? The "good faith" of
> companies operates with the limits of margin of error of they consider
> rational and beneficial.
> 
> I don't want to put my thumb on the scale (yet?), but (with my
> maintainer hat on) please don't use the "Meta is good" argument, because
> someone will send a similar driver from a less involved company later on
> and we'll be accused of playing favorites :( Plus companies can change
> their approach to open source from "inclusive" to "extractive" 
> (to borrow the economic terminology) rather quickly.
> 

I'll throw my $.02 in. In this case you have a driver that I only scanned
so far, but looks well done. Alex has written lots of drivers I trust he
will not just abondon it. And if it does end up abondoned and no one
supports it at some future point we can deprecate it same as any other
driver in the networking tree. All the feedback is being answered and
debate is happening so I expect will get a v2, v3 or so. All good signs
in my point.

Back to your point about faith in a company. I don't think we even need
to care about whatever companies business plans. The author could have
submitted with their personal address for what its worth and called it
drivers/alexware/duyck.o Bit extreme and I would have called him on it,
but hopefully the point is clear.

We have lots of drivers in the tree that are hard to physically get ahold
of. Or otherwise gated by paying some vendor for compute time, etc. to
use. We even have some drivers where the hardware itself never made
it out into the wild or only a single customer used it before sellers 
burned it for commercial reasons or hw wasn't workable, team was cut, etc.

I can't see how if I have a physical NIC for it on my desk here makes
much difference one way or the other.

The alternative is much worse someone builds a team of engineers locks
them up they build some interesting pieces and we never get to see it
because we tried to block someone from opensourcing their driver?
Eventually they need some kernel changes and than we block those too
because we didn't allow the driver that was the use case? This seems
wrong to me.

Anyways we have zero ways to enforce such a policy. Have vendors
ship a NIC to somebody with the v0 of the patch set? Attach a picture? 
Even if vendor X claims they will have a product in N months and
than only sells it to qualified customers what to do we do then.
Driver author could even believe the hardware will be available
when they post the driver, but business may change out of hands
of the developer.

I'm 100% on letting this through assuming Alex is on top of feedback
and the code is good. I think any other policy would be very ugly
to enforce, prove, and even understand. Obviously code and architecture
debates I'm all for. Ensuring we have a trusted, experienced person
signed up to review code, address feedback, fix whatever syzbot finds
and so on is also a must I think. I'm sure Alex will take care of
it.

Thanks,
John

Jakub Kicinski April 4, 2024, 11:50 p.m. UTC | #12

On Thu, 04 Apr 2024 14:59:33 -0700 John Fastabend wrote:
> The alternative is much worse someone builds a team of engineers locks
> them up they build some interesting pieces and we never get to see it
> because we tried to block someone from opensourcing their driver?

Opensourcing is just one push to github.
There are guarantees we give to upstream drivers.

> Eventually they need some kernel changes and than we block those too
> because we didn't allow the driver that was the use case? This seems
> wrong to me.

The flip side of the argument is, what if we allow some device we don't
have access to to make changes to the core for its benefit. Owner
reports that some changes broke the kernel for them. Kernel rules,
regression, we have to revert. This is not a hypothetical, "less than
cooperative users" demanding reverts, and "reporting us to Linus"
is a reality :(

Technical solution? Maybe if it's not a public device regression rules
don't apply? Seems fairly reasonable.

> Anyways we have zero ways to enforce such a policy. Have vendors
> ship a NIC to somebody with the v0 of the patch set? Attach a picture? 

GenAI world, pictures mean nothing :) We do have a CI in netdev, which
is all ready to ingest external results, and a (currently tiny amount?)
of test for NICs. Prove that you care about the device by running the
upstream tests and reporting results? Seems fairly reasonable.

> Even if vendor X claims they will have a product in N months and
> than only sells it to qualified customers what to do we do then.
> Driver author could even believe the hardware will be available
> when they post the driver, but business may change out of hands
> of the developer.
> 
> I'm 100% on letting this through assuming Alex is on top of feedback
> and the code is good.

I'd strongly prefer if we detach our trust and respect for Alex
from whatever precedent we make here. I can't stress this enough.
IDK if I'm exaggerating or it's hard to appreciate the challenges 
of maintainership without living it, but I really don't like being
accused of playing favorites or big companies buying their way in :(

> I think any other policy would be very ugly to enforce, prove, and
> even understand. Obviously code and architecture debates I'm all for.
> Ensuring we have a trusted, experienced person signed up to review
> code, address feedback, fix whatever syzbot finds and so on is also a
> must I think. I'm sure Alex will take care of it.

"Whatever syzbot finds" may be slightly moot for a private device ;)
but otherwise 100%! These are exactly the kind of points I think we
should enumerate. I started writing a list of expectations a while back:

Documentation/maintainer/feature-and-driver-maintainers.rst

I think we just need something like this, maybe just a step up, for
non-public devices..

Alexander Duyck April 4, 2024, 11:50 p.m. UTC | #13

On Thu, Apr 4, 2024 at 2:59 PM John Fastabend <john.fastabend@gmail.com> wrote:
>
> Jakub Kicinski wrote:
> > On Thu, 4 Apr 2024 12:22:02 -0700 Alexander Duyck wrote:
> > > I don't understand that as we are
> > > contributing across multiple areas in the kernel including networking
> > > and ebpf. Is Meta expected to start pulling time from our upstream
> > > maintainers to have them update out-of-tree kernel modules since the
> > > community isn't willing to let us maintain it in the kernel? Is the
> > > message that the kernel is expected to get value from Meta, but that
> > > value is not meant to be reciprocated? Would you really rather have
> > > us start maintaining our own internal kernel with our own
> > > "proprietary goodness", and ask other NIC vendors to have to maintain
> > > their drivers against yet another kernel if they want to be used in
> > > our data centers?
> >
> > Please allow the community to make rational choices in the interest of
> > the project and more importantly the interest of its broader user base.
> >
> > Google would also claim "good faith" -- undoubtedly is supports
> > the kernel, and lets some of its best engineers contribute.
> > Did that make them stop trying to build Fuchsia? The "good faith" of
> > companies operates with the limits of margin of error of they consider
> > rational and beneficial.
> >
> > I don't want to put my thumb on the scale (yet?), but (with my
> > maintainer hat on) please don't use the "Meta is good" argument, because
> > someone will send a similar driver from a less involved company later on
> > and we'll be accused of playing favorites :( Plus companies can change
> > their approach to open source from "inclusive" to "extractive"
> > (to borrow the economic terminology) rather quickly.
> >
>
> I'll throw my $.02 in. In this case you have a driver that I only scanned
> so far, but looks well done. Alex has written lots of drivers I trust he
> will not just abondon it. And if it does end up abondoned and no one
> supports it at some future point we can deprecate it same as any other
> driver in the networking tree. All the feedback is being answered and
> debate is happening so I expect will get a v2, v3 or so. All good signs
> in my point.
>
> Back to your point about faith in a company. I don't think we even need
> to care about whatever companies business plans. The author could have
> submitted with their personal address for what its worth and called it
> drivers/alexware/duyck.o Bit extreme and I would have called him on it,
> but hopefully the point is clear.
>
> We have lots of drivers in the tree that are hard to physically get ahold
> of. Or otherwise gated by paying some vendor for compute time, etc. to
> use. We even have some drivers where the hardware itself never made
> it out into the wild or only a single customer used it before sellers
> burned it for commercial reasons or hw wasn't workable, team was cut, etc.
>
> I can't see how if I have a physical NIC for it on my desk here makes
> much difference one way or the other.

The advantage of Meta not selling it publicly at this time is that
Meta is both the consumer and the maintainer so if a kernel API change
broke something Meta would be responsible for fixing it. It would be
Meta's own interest to maintain the part, and if it breaks Meta would
be the only one impacted assuming the breaking change at least builds.
So rather than "good faith", maybe I should have said "motivated self
interest".

It seems like the worst case scenario would actually be some
commercial product being sold. Worse yet, one with proprietary bits
such as firmware on the device. Should commercial vendors be required
to provide some sort of documentation proving they are dedicated to
their product and financially stable enough to maintain it for the
entire product life cycle? What if the vendor sells some significant
number of units to Linux users out there, and then either goes under,
gets acquired, and/or just decides to go in a new direction. In that
scenario we have the driver unmaintained and consumers impacted, but
the company responsible for it has no motivation to address it other
than maybe negative PR.

> The alternative is much worse someone builds a team of engineers locks
> them up they build some interesting pieces and we never get to see it
> because we tried to block someone from opensourcing their driver?
> Eventually they need some kernel changes and than we block those too
> because we didn't allow the driver that was the use case? This seems
> wrong to me.
>
> Anyways we have zero ways to enforce such a policy. Have vendors
> ship a NIC to somebody with the v0 of the patch set? Attach a picture?
> Even if vendor X claims they will have a product in N months and
> than only sells it to qualified customers what to do we do then.
> Driver author could even believe the hardware will be available
> when they post the driver, but business may change out of hands
> of the developer.

This is what I was referring to as being "arbitrary and capricious".
The issue is what would we define as a NIC being for sale
commercially. Do we have to support a certain form factor, sell it for
a certain price, or sell a certain quantity?

If anything maybe we should look more at something like a blast radius
in terms of inclusion. If this were to go wrong, how could it go wrong
and who would be impacted.

> I'm 100% on letting this through assuming Alex is on top of feedback
> and the code is good. I think any other policy would be very ugly
> to enforce, prove, and even understand. Obviously code and architecture
> debates I'm all for. Ensuring we have a trusted, experienced person
> signed up to review code, address feedback, fix whatever syzbot finds
> and so on is also a must I think. I'm sure Alex will take care of
> it.
>
> Thanks,
> John

Thanks for your reply. I had started a reply, but you probably worded
this better than I could have.

Thanks

- Alex

Alexander Duyck April 5, 2024, 12:11 a.m. UTC | #14

On Thu, Apr 4, 2024 at 4:50 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Thu, 04 Apr 2024 14:59:33 -0700 John Fastabend wrote:
> > The alternative is much worse someone builds a team of engineers locks
> > them up they build some interesting pieces and we never get to see it
> > because we tried to block someone from opensourcing their driver?
>
> Opensourcing is just one push to github.
> There are guarantees we give to upstream drivers.

Are there? Do we have them documented somewhere?

> > Eventually they need some kernel changes and than we block those too
> > because we didn't allow the driver that was the use case? This seems
> > wrong to me.
>
> The flip side of the argument is, what if we allow some device we don't
> have access to to make changes to the core for its benefit. Owner
> reports that some changes broke the kernel for them. Kernel rules,
> regression, we have to revert. This is not a hypothetical, "less than
> cooperative users" demanding reverts, and "reporting us to Linus"
> is a reality :(
>
> Technical solution? Maybe if it's not a public device regression rules
> don't apply? Seems fairly reasonable.

This is a hypothetical. This driver currently isn't changing anything
outside of itself. At this point the driver would only be build tested
by everyone else. They could just not include it in their Kconfig and
then out-of-sight, out-of-mind.

> > Anyways we have zero ways to enforce such a policy. Have vendors
> > ship a NIC to somebody with the v0 of the patch set? Attach a picture?
>
> GenAI world, pictures mean nothing :) We do have a CI in netdev, which
> is all ready to ingest external results, and a (currently tiny amount?)
> of test for NICs. Prove that you care about the device by running the
> upstream tests and reporting results? Seems fairly reasonable.

That seems like an opportunity to be exploited through. Are the
results going to be verified in any way? Maybe cryptographically
signed? Seems like it would be easy enough to fake the results.

> > Even if vendor X claims they will have a product in N months and
> > than only sells it to qualified customers what to do we do then.
> > Driver author could even believe the hardware will be available
> > when they post the driver, but business may change out of hands
> > of the developer.
> >
> > I'm 100% on letting this through assuming Alex is on top of feedback
> > and the code is good.
>
> I'd strongly prefer if we detach our trust and respect for Alex
> from whatever precedent we make here. I can't stress this enough.
> IDK if I'm exaggerating or it's hard to appreciate the challenges
> of maintainership without living it, but I really don't like being
> accused of playing favorites or big companies buying their way in :(

Again, I would say we look at the blast radius. That is how we should
be measuring any change. At this point the driver is self contained
into /drivers/net/ethernet/meta/fbnic/. It isn't exporting anything
outside that directory, and it can be switched off via Kconfig.

When the time comes to start adding new features we can probably start
by looking at how to add either generic offloads like was done for
GSO, CSO, ect or how it can also be implemented on another vendor's
NIC.

At this point the only risk the driver presents is that it is yet
another driver, done in the same style I did the other Intel drivers,
and so any kernel API changes will end up needing to be applied to it
just like the other drivers.

> > I think any other policy would be very ugly to enforce, prove, and
> > even understand. Obviously code and architecture debates I'm all for.
> > Ensuring we have a trusted, experienced person signed up to review
> > code, address feedback, fix whatever syzbot finds and so on is also a
> > must I think. I'm sure Alex will take care of it.
>
> "Whatever syzbot finds" may be slightly moot for a private device ;)
> but otherwise 100%! These are exactly the kind of points I think we
> should enumerate. I started writing a list of expectations a while back:
>
> Documentation/maintainer/feature-and-driver-maintainers.rst
>
> I think we just need something like this, maybe just a step up, for
> non-public devices..

I honestly think we are getting the cart ahead of the horse. When we
start talking about kernel API changes then we can probably get into
the whole "private" versus "publicly available" argument. A good
example of the kind of thing I am thinking of is GSO partial where I
ended up with Mellanox and Intel sending me 40G and 100G NICs and
cables to implement it on their devices as all I had was essentially
igb and ixgbe based NICs.

Odds are when we start getting to those kind of things maybe we need
to look at having a few systems available for developer use, but until
then I am not sure it makes sense to focus on if the device is
publicly available or not.

Jakub Kicinski April 5, 2024, 2:38 a.m. UTC | #15

On Thu, 4 Apr 2024 17:11:47 -0700 Alexander Duyck wrote:
> > Opensourcing is just one push to github.
> > There are guarantees we give to upstream drivers.  
> 
> Are there? Do we have them documented somewhere?

I think they are somewhere in Documentation/
To some extent this question in itself supports my point that written
down rules, as out of date as they may be, seem to carry more respect
than what a maintainer says :S

> > > Eventually they need some kernel changes and than we block those too
> > > because we didn't allow the driver that was the use case? This seems
> > > wrong to me.  
> >
> > The flip side of the argument is, what if we allow some device we don't
> > have access to to make changes to the core for its benefit. Owner
> > reports that some changes broke the kernel for them. Kernel rules,
> > regression, we have to revert. This is not a hypothetical, "less than
> > cooperative users" demanding reverts, and "reporting us to Linus"
> > is a reality :(
> >
> > Technical solution? Maybe if it's not a public device regression rules
> > don't apply? Seems fairly reasonable.  
> 
> This is a hypothetical. This driver currently isn't changing anything
> outside of itself. At this point the driver would only be build tested
> by everyone else. They could just not include it in their Kconfig and
> then out-of-sight, out-of-mind.

Not changing does not mean not depending on existing behavior.
Investigating and fixing properly even the hardest regressions in 
the stack is a bar that Meta can so easily clear. I don't understand
why you are arguing.

> > > Anyways we have zero ways to enforce such a policy. Have vendors
> > > ship a NIC to somebody with the v0 of the patch set? Attach a picture?  
> >
> > GenAI world, pictures mean nothing :) We do have a CI in netdev, which
> > is all ready to ingest external results, and a (currently tiny amount?)
> > of test for NICs. Prove that you care about the device by running the
> > upstream tests and reporting results? Seems fairly reasonable.  
> 
> That seems like an opportunity to be exploited through. Are the
> results going to be verified in any way? Maybe cryptographically
> signed? Seems like it would be easy enough to fake the results.

I think it's much easier to just run the tests than write a system
which will competently lie. But even if we completely suspend trust,
someone lying is of no cost to the community in this case.

> > > Even if vendor X claims they will have a product in N months and
> > > than only sells it to qualified customers what to do we do then.
> > > Driver author could even believe the hardware will be available
> > > when they post the driver, but business may change out of hands
> > > of the developer.
> > >
> > > I'm 100% on letting this through assuming Alex is on top of feedback
> > > and the code is good.  
> >
> > I'd strongly prefer if we detach our trust and respect for Alex
> > from whatever precedent we make here. I can't stress this enough.
> > IDK if I'm exaggerating or it's hard to appreciate the challenges
> > of maintainership without living it, but I really don't like being
> > accused of playing favorites or big companies buying their way in :(  
> 
> Again, I would say we look at the blast radius. That is how we should
> be measuring any change. At this point the driver is self contained
> into /drivers/net/ethernet/meta/fbnic/. It isn't exporting anything
> outside that directory, and it can be switched off via Kconfig.

It is not practical to ponder every change case by case. Maintainers
are overworked. How long until we send the uAPI patch for RSS on the
flow label? I'd rather not re-litigate this every time someone posts
a slightly different feature. Let's cover the obvious points from 
the beginning while everyone is paying attention. We can amend later
as need be.

> When the time comes to start adding new features we can probably start
> by looking at how to add either generic offloads like was done for
> GSO, CSO, ect or how it can also be implemented on another vendor's
> NIC.
> 
> At this point the only risk the driver presents is that it is yet
> another driver, done in the same style I did the other Intel drivers,
> and so any kernel API changes will end up needing to be applied to it
> just like the other drivers.

The risk is we'll have a fight every time there is a disagreement about
the expectations.

> > > I think any other policy would be very ugly to enforce, prove, and
> > > even understand. Obviously code and architecture debates I'm all for.
> > > Ensuring we have a trusted, experienced person signed up to review
> > > code, address feedback, fix whatever syzbot finds and so on is also a
> > > must I think. I'm sure Alex will take care of it.  
> >
> > "Whatever syzbot finds" may be slightly moot for a private device ;)
> > but otherwise 100%! These are exactly the kind of points I think we
> > should enumerate. I started writing a list of expectations a while back:
> >
> > Documentation/maintainer/feature-and-driver-maintainers.rst
> >
> > I think we just need something like this, maybe just a step up, for
> > non-public devices..  
> 
> I honestly think we are getting the cart ahead of the horse. When we
> start talking about kernel API changes then we can probably get into
> the whole "private" versus "publicly available" argument. A good
> example of the kind of thing I am thinking of is GSO partial where I
> ended up with Mellanox and Intel sending me 40G and 100G NICs and
> cables to implement it on their devices as all I had was essentially
> igb and ixgbe based NICs.

That'd be great. Maybe even more than I'd expect. So why not write
it down? In case the person doing the coding is not Alex Duyck, and
just wants to get it done for their narrow use case, get a promo,
go work on something else?

> Odds are when we start getting to those kind of things maybe we need
> to look at having a few systems available for developer use, but until
> then I am not sure it makes sense to focus on if the device is
> publicly available or not.

Developer access would be huge.
A mirage of developer access? immaterial :)

David Ahern April 5, 2024, 3:08 a.m. UTC | #16

On 4/4/24 9:37 AM, Jakub Kicinski wrote:
> On Thu, 4 Apr 2024 17:24:03 +0200 Andrew Lunn wrote:
>> Given the discussion going on in the thread "mlx5 ConnectX control
>> misc driver", you also plan to show you don't need such a misc driver?
> 
> To the extent to which it is possible to prove an in-existence
> of something :) Since my argument that Meta can use _vendor_ devices
> without resorting to "misc drivers" doesn't convince them, I doubt
> this data point would help.
> 

When this device gets widely deployed and you have the inevitable
production problems (inevitable in this sense this is designed,
implemented and deployed by humans who make mistakes and then S/W has to
compensate for the quirks), you can see whether it is easy to
completely, sanely and efficiently debug those problems solely with
extensions to open source tools.

But, I am guessing that is still 1+ years away before you will know.

Paolo Abeni April 5, 2024, 7:11 a.m. UTC | #17

On Thu, 2024-04-04 at 17:11 -0700, Alexander Duyck wrote:
> Again, I would say we look at the blast radius. That is how we should
> be measuring any change. At this point the driver is self contained
> into /drivers/net/ethernet/meta/fbnic/. It isn't exporting anything
> outside that directory, and it can be switched off via Kconfig.

I personally think this is the most relevant point. This is just a new
NIC driver, completely self-encapsulated. I quickly glanced over the
code and it looks like it's not doing anything obviously bad. It really
looks like an usual, legit, NIC driver.

I don't think the fact that the NIC itself is hard to grasp for anyone
outside <organization> makes a difference. Long time ago Greg noted
that drivers has been merged for H/W known to have a _single_ existing
instance (IIRC, I can't find the reference on top of my head, but back
then was quite popular, I hope some other old guy could remember).

To me, the maintainership burden is on Meta: Alex/Meta will have to
handle bug report, breakages, user-complains (I guess this last would
be the easier part ;). If he/they will not cope with the process we can
simply revert the driver. I would be quite surprised if such situation
should happen, but the impact from my PoV looks minimal.

TL;DR: I don't see any good reason to not accept this - unless my quick
glance was too quick and very wrong, but it looks like other has
similar view.

Cheers,

Paolo

Jason Gunthorpe April 5, 2024, 12:26 p.m. UTC | #18

On Fri, Apr 05, 2024 at 09:11:19AM +0200, Paolo Abeni wrote:
> On Thu, 2024-04-04 at 17:11 -0700, Alexander Duyck wrote:
> > Again, I would say we look at the blast radius. That is how we should
> > be measuring any change. At this point the driver is self contained
> > into /drivers/net/ethernet/meta/fbnic/. It isn't exporting anything
> > outside that directory, and it can be switched off via Kconfig.
> 
> I personally think this is the most relevant point. This is just a new
> NIC driver, completely self-encapsulated. I quickly glanced over the
> code and it looks like it's not doing anything obviously bad. It really
> looks like an usual, legit, NIC driver.

This is completely true, and as I've said many times the kernel as a
project is substantially about supporting the HW that people actually
build. There is no reason not to merge yet another basic netdev
driver.

However, there is also a pretty strong red line in Linux where people
belive, with strong conviction, that kernel code should not be merged
only to support a propriety userspace. This submission is clearly
bluring that line. This driver will only run in Meta's proprietary
kernel fork on servers running Meta's propriety userspace.

At this point perhaps it is OK, a basic NIC driver is not really an
issue, but Jiri is also very correct to point out that this is heading
in a very concerning direction.

Alex already indicated new features are coming, changes to the core
code will be proposed. How should those be evaluated? Hypothetically
should fbnic be allowed to be the first implementation of something
invasive like Mina's DMABUF work? Google published an open userspace
for NCCL that people can (in theory at least) actually run. Meta would
not be able to do that. I would say that clearly crosses the line and
should not be accepted.

So I think there should be an expectation that technicaly sound things
Meta may propose must not be accepted because they cross the
ideological red line into enabling only proprietary software.

To me it sets up a fairly common anti-pattern where a vendor starts
out with good intentions, reaches community pushback and falls back to
their downstream fork. Once forking occurs it becomes self-reinforcing
as built up infrastructure like tests and CI will only run correctly
on the fork and the fork grows. Then eventually the upstream code is
abandoned. This has happened many times before in Linux..

IMHO from a community perspective I feel like we should expect Meta to
fail and end up with a fork. The community should warn them. However
if they really want to try anyhow then I'm not sure it would be
appropriate to stop them at this point. Meta will just end up being a
"bad vendor".

I think the best thing the netdev community could do is come up with
some more clear guidelines what Meta could use fbnic to justify and
what would be rejected (ideologically) and Meta can decide on their
own if they want to continue.

Jason

Daniel Borkmann April 5, 2024, 1:06 p.m. UTC | #19

On 4/5/24 2:26 PM, Jason Gunthorpe wrote:
> On Fri, Apr 05, 2024 at 09:11:19AM +0200, Paolo Abeni wrote:
>> On Thu, 2024-04-04 at 17:11 -0700, Alexander Duyck wrote:
>>> Again, I would say we look at the blast radius. That is how we should
>>> be measuring any change. At this point the driver is self contained
>>> into /drivers/net/ethernet/meta/fbnic/. It isn't exporting anything
>>> outside that directory, and it can be switched off via Kconfig.
>>
>> I personally think this is the most relevant point. This is just a new
>> NIC driver, completely self-encapsulated. I quickly glanced over the
>> code and it looks like it's not doing anything obviously bad. It really
>> looks like an usual, legit, NIC driver.
> 
> This is completely true, and as I've said many times the kernel as a
> project is substantially about supporting the HW that people actually
> build. There is no reason not to merge yet another basic netdev
> driver.
> 
> However, there is also a pretty strong red line in Linux where people
> belive, with strong conviction, that kernel code should not be merged
> only to support a propriety userspace. This submission is clearly
> bluring that line. This driver will only run in Meta's proprietary
> kernel fork on servers running Meta's propriety userspace.
> 
> At this point perhaps it is OK, a basic NIC driver is not really an
> issue, but Jiri is also very correct to point out that this is heading
> in a very concerning direction.
> 
> Alex already indicated new features are coming, changes to the core
> code will be proposed. How should those be evaluated? Hypothetically
> should fbnic be allowed to be the first implementation of something
> invasive like Mina's DMABUF work?
My $0.02 from only reading this thread on the side.. when it comes to
extending and integrating with core networking code (e.g. larger features
like offloads, xdp/af_xdp, etc) the networking community always requested
at least two driver implementations to show-case that the code extensions
touching core code are not unique to just a single driver/NIC/vendor. I'd
expect this holds true also here..

Przemek Kitszel April 5, 2024, 2:01 p.m. UTC | #20

On 4/3/24 22:08, Alexander Duyck wrote:
> This patch set includes the necessary patches to enable basic Tx and Rx
> over the Meta Platforms Host Network Interface. To do this we introduce a
> new driver and driver and directories in the form of
> "drivers/net/ethernet/meta/fbnic".
> 
> Due to submission limits the general plan to submit a minimal driver for
> now almost equivalent to a UEFI driver in functionality, and then follow up
> over the coming weeks enabling additional offloads and more features for
> the device.
> 
> The general plan is to look at adding support for ethtool, statistics, and
> start work on offloads in the next set of patches.
> 
> ---
> 
> Alexander Duyck (15):
>        PCI: Add Meta Platforms vendor ID
>        eth: fbnic: add scaffolding for Meta's NIC driver
>        eth: fbnic: Allocate core device specific structures and devlink interface
>        eth: fbnic: Add register init to set PCIe/Ethernet device config
>        eth: fbnic: add message parsing for FW messages
>        eth: fbnic: add FW communication mechanism
>        eth: fbnic: allocate a netdevice and napi vectors with queues
>        eth: fbnic: implement Tx queue alloc/start/stop/free
>        eth: fbnic: implement Rx queue alloc/start/stop/free
>        eth: fbnic: Add initial messaging to notify FW of our presence
>        eth: fbnic: Enable Ethernet link setup
>        eth: fbnic: add basic Tx handling
>        eth: fbnic: add basic Rx handling
>        eth: fbnic: add L2 address programming
>        eth: fbnic: write the TCAM tables used for RSS control and Rx to host
> 
> 
>   MAINTAINERS                                   |    7 +
>   drivers/net/ethernet/Kconfig                  |    1 +
>   drivers/net/ethernet/Makefile                 |    1 +
>   drivers/net/ethernet/meta/Kconfig             |   29 +
>   drivers/net/ethernet/meta/Makefile            |    6 +
>   drivers/net/ethernet/meta/fbnic/Makefile      |   18 +
>   drivers/net/ethernet/meta/fbnic/fbnic.h       |  148 ++
>   drivers/net/ethernet/meta/fbnic/fbnic_csr.h   |  912 ++++++++
>   .../net/ethernet/meta/fbnic/fbnic_devlink.c   |   86 +
>   .../net/ethernet/meta/fbnic/fbnic_drvinfo.h   |    5 +
>   drivers/net/ethernet/meta/fbnic/fbnic_fw.c    |  823 ++++++++
>   drivers/net/ethernet/meta/fbnic/fbnic_fw.h    |  133 ++
>   drivers/net/ethernet/meta/fbnic/fbnic_irq.c   |  251 +++
>   drivers/net/ethernet/meta/fbnic/fbnic_mac.c   | 1025 +++++++++
>   drivers/net/ethernet/meta/fbnic/fbnic_mac.h   |   83 +
>   .../net/ethernet/meta/fbnic/fbnic_netdev.c    |  470 +++++
>   .../net/ethernet/meta/fbnic/fbnic_netdev.h    |   59 +
>   drivers/net/ethernet/meta/fbnic/fbnic_pci.c   |  633 ++++++
>   drivers/net/ethernet/meta/fbnic/fbnic_rpc.c   |  709 +++++++
>   drivers/net/ethernet/meta/fbnic/fbnic_rpc.h   |  189 ++
>   drivers/net/ethernet/meta/fbnic/fbnic_tlv.c   |  529 +++++
>   drivers/net/ethernet/meta/fbnic/fbnic_tlv.h   |  175 ++
>   drivers/net/ethernet/meta/fbnic/fbnic_txrx.c  | 1873 +++++++++++++++++
>   drivers/net/ethernet/meta/fbnic/fbnic_txrx.h  |  125 ++
>   include/linux/pci_ids.h                       |    2 +
>   25 files changed, 8292 insertions(+)

Even if this is just a basic scaffolding for what will come, it's hard
to believe that no patch was co-developed, or should be marked as
authored-by some other developer.

[...]

Alexander Duyck April 5, 2024, 2:24 p.m. UTC | #21

On Fri, Apr 5, 2024 at 5:26 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Fri, Apr 05, 2024 at 09:11:19AM +0200, Paolo Abeni wrote:
> > On Thu, 2024-04-04 at 17:11 -0700, Alexander Duyck wrote:
> > > Again, I would say we look at the blast radius. That is how we should
> > > be measuring any change. At this point the driver is self contained
> > > into /drivers/net/ethernet/meta/fbnic/. It isn't exporting anything
> > > outside that directory, and it can be switched off via Kconfig.
> >
> > I personally think this is the most relevant point. This is just a new
> > NIC driver, completely self-encapsulated. I quickly glanced over the
> > code and it looks like it's not doing anything obviously bad. It really
> > looks like an usual, legit, NIC driver.
>
> This is completely true, and as I've said many times the kernel as a
> project is substantially about supporting the HW that people actually
> build. There is no reason not to merge yet another basic netdev
> driver.
>
> However, there is also a pretty strong red line in Linux where people
> belive, with strong conviction, that kernel code should not be merged
> only to support a propriety userspace. This submission is clearly
> bluring that line. This driver will only run in Meta's proprietary
> kernel fork on servers running Meta's propriety userspace.
>
> At this point perhaps it is OK, a basic NIC driver is not really an
> issue, but Jiri is also very correct to point out that this is heading
> in a very concerning direction.
>
> Alex already indicated new features are coming, changes to the core
> code will be proposed. How should those be evaluated? Hypothetically
> should fbnic be allowed to be the first implementation of something
> invasive like Mina's DMABUF work? Google published an open userspace
> for NCCL that people can (in theory at least) actually run. Meta would
> not be able to do that. I would say that clearly crosses the line and
> should not be accepted.

Why not? Just because we are not commercially selling it doesn't mean
we couldn't look at other solutions such as QEMU. If we were to
provide a github repo with an emulation of the NIC would that be
enough to satisfy the "commercial" requirement?

The fact is I already have an implementation, but I would probably
need to clean up a few things as the current setup requires 3 QEMU
instances to emulate the full setup with host, firmware, and BMC. It
wouldn't be as performant as the actual hardware but it is more than
enough for us to test code with. If we need to look at publishing
something like that to github in order to address the lack of user
availability I could start looking at getting the approvals for that.

> So I think there should be an expectation that technically sound things
> Meta may propose must not be accepted because they cross the
> ideological red line into enabling only proprietary software.

That is a faulty argument. That is like saying we should kick out the
nouveu driver out of Linux just because it supports Nvidia graphics
cards that happen to also have a proprietary out-of-tree driver out
there, or maybe we need to kick all the Intel NIC drivers out for
DPDK? I can't think of many NIC vendors that don't have their own
out-of-tree drivers floating around with their own kernel bypass
solutions to support proprietary software.

> To me it sets up a fairly common anti-pattern where a vendor starts
> out with good intentions, reaches community pushback and falls back to
> their downstream fork. Once forking occurs it becomes self-reinforcing
> as built up infrastructure like tests and CI will only run correctly
> on the fork and the fork grows. Then eventually the upstream code is
> abandoned. This has happened many times before in Linux..
>
> IMHO from a community perspective I feel like we should expect Meta to
> fail and end up with a fork. The community should warn them. However
> if they really want to try anyhow then I'm not sure it would be
> appropriate to stop them at this point. Meta will just end up being a
> "bad vendor".
>
> I think the best thing the netdev community could do is come up with
> some more clear guidelines what Meta could use fbnic to justify and
> what would be rejected (ideologically) and Meta can decide on their
> own if they want to continue.

I agree. We need a consistent set of standards. I just strongly
believe commercial availability shouldn't be one of them.

Jason Gunthorpe April 5, 2024, 3:17 p.m. UTC | #22

On Fri, Apr 05, 2024 at 07:24:32AM -0700, Alexander Duyck wrote:
> > Alex already indicated new features are coming, changes to the core
> > code will be proposed. How should those be evaluated? Hypothetically
> > should fbnic be allowed to be the first implementation of something
> > invasive like Mina's DMABUF work? Google published an open userspace
> > for NCCL that people can (in theory at least) actually run. Meta would
> > not be able to do that. I would say that clearly crosses the line and
> > should not be accepted.
> 
> Why not? Just because we are not commercially selling it doesn't mean
> we couldn't look at other solutions such as QEMU. If we were to
> provide a github repo with an emulation of the NIC would that be
> enough to satisfy the "commercial" requirement?

My test is not "commercial", it is enabling open source ecosystem vs
benefiting only proprietary software.

In my hypothetical you'd need to do something like open source Meta's
implementation of the AI networking that the DMABUF patches enable,
and even then since nobody could run it at performance the thing is
pretty questionable.

IMHO publishing a qemu chip emulator would not advance the open source
ecosystem around building a DMABUF AI networking scheme.

> > So I think there should be an expectation that technically sound things
> > Meta may propose must not be accepted because they cross the
> > ideological red line into enabling only proprietary software.
> 
> That is a faulty argument. That is like saying we should kick out the
> nouveu driver out of Linux just because it supports Nvidia graphics
> cards that happen to also have a proprietary out-of-tree driver out
> there,

Huh? nouveau supports a fully open source mesa graphics stack in
Linux. How is that remotely similar to what I said? No issue.

> or maybe we need to kick all the Intel NIC drivers out for
> DPDK? 

DPDK is fully open source, again no issue.

You pointed at two things that I would consider to be exemplar open
source projects and said their existance somehow means we should be
purging drivers from the kernel???

I really don't understand what you are trying to say at all.

The kernel standard is that good quality open source *does* exist, we
tend to not care what proprietary things people create beyond that.

> I can't think of many NIC vendors that don't have their own
> out-of-tree drivers floating around with their own kernel bypass
> solutions to support proprietary software.

Most of those are also open source, and we can't say much about what
people do out of tree, obviously.

> I agree. We need a consistent set of standards. I just strongly
> believe commercial availability shouldn't be one of them.

I never said commercial availability. I talked about open source vs
proprietary userspace. This is very standard kernel stuff.

You have an unavailable NIC, so we know it is only ever operated with
Meta's proprietary kernel fork, supporting Meta's proprietary
userspace software. Where exactly is the open source?

Why should someone working to improve only their proprietary
environment be welcomed in the same way as someone working to improve
the open source ecosystem? That has never been the kernel communities
position.

If you want to propose things to the kernel that can only be
meaningfully used by your proprietary software then you should not
expect to succeed. No one should be surprised to hear this.

Jason

Alexander Duyck April 5, 2024, 3:41 p.m. UTC | #23

On Thu, Apr 4, 2024 at 7:38 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Thu, 4 Apr 2024 17:11:47 -0700 Alexander Duyck wrote:
> > > Opensourcing is just one push to github.
> > > There are guarantees we give to upstream drivers.
> >
> > Are there? Do we have them documented somewhere?
>
> I think they are somewhere in Documentation/
> To some extent this question in itself supports my point that written
> down rules, as out of date as they may be, seem to carry more respect
> than what a maintainer says :S

I think the problem is there are multiple maintainers and they all
have different ways of doing things. As a submitter over the years I
have had to deal with probably over a half dozen different maintainers
and each experience has been different. I have always argued that the
netdev tree is one of the better maintained setups.

> > > > Eventually they need some kernel changes and than we block those too
> > > > because we didn't allow the driver that was the use case? This seems
> > > > wrong to me.
> > >
> > > The flip side of the argument is, what if we allow some device we don't
> > > have access to to make changes to the core for its benefit. Owner
> > > reports that some changes broke the kernel for them. Kernel rules,
> > > regression, we have to revert. This is not a hypothetical, "less than
> > > cooperative users" demanding reverts, and "reporting us to Linus"
> > > is a reality :(
> > >
> > > Technical solution? Maybe if it's not a public device regression rules
> > > don't apply? Seems fairly reasonable.
> >
> > This is a hypothetical. This driver currently isn't changing anything
> > outside of itself. At this point the driver would only be build tested
> > by everyone else. They could just not include it in their Kconfig and
> > then out-of-sight, out-of-mind.
>
> Not changing does not mean not depending on existing behavior.
> Investigating and fixing properly even the hardest regressions in
> the stack is a bar that Meta can so easily clear. I don't understand
> why you are arguing.

I wasn't saying the driver wouldn't be dependent on existing behavior.
I was saying that it was a hypothetical that Meta would be a "less
than cooperative user" and demand a revert.  It is also a hypothetical
that Linus wouldn't just propose a revert of the fbnic driver instead
of the API for the crime of being a "less than cooperative maintainer"
and  then give Meta the Nvidia treatment.

> > > > Anyways we have zero ways to enforce such a policy. Have vendors
> > > > ship a NIC to somebody with the v0 of the patch set? Attach a picture?
> > >
> > > GenAI world, pictures mean nothing :) We do have a CI in netdev, which
> > > is all ready to ingest external results, and a (currently tiny amount?)
> > > of test for NICs. Prove that you care about the device by running the
> > > upstream tests and reporting results? Seems fairly reasonable.
> >
> > That seems like an opportunity to be exploited through. Are the
> > results going to be verified in any way? Maybe cryptographically
> > signed? Seems like it would be easy enough to fake the results.
>
> I think it's much easier to just run the tests than write a system
> which will competently lie. But even if we completely suspend trust,
> someone lying is of no cost to the community in this case.

I don't get this part. You are paranoid about bad actors until it
comes to accepting the test results? So write a broken API, "prove" it
works by running it on your broken test setup, and then get it
upstream after establishing a false base for trust. Seems like a
perfect setup for an exploit like what happened with the xz setup.

> > > > Even if vendor X claims they will have a product in N months and
> > > > than only sells it to qualified customers what to do we do then.
> > > > Driver author could even believe the hardware will be available
> > > > when they post the driver, but business may change out of hands
> > > > of the developer.
> > > >
> > > > I'm 100% on letting this through assuming Alex is on top of feedback
> > > > and the code is good.
> > >
> > > I'd strongly prefer if we detach our trust and respect for Alex
> > > from whatever precedent we make here. I can't stress this enough.
> > > IDK if I'm exaggerating or it's hard to appreciate the challenges
> > > of maintainership without living it, but I really don't like being
> > > accused of playing favorites or big companies buying their way in :(
> >
> > Again, I would say we look at the blast radius. That is how we should
> > be measuring any change. At this point the driver is self contained
> > into /drivers/net/ethernet/meta/fbnic/. It isn't exporting anything
> > outside that directory, and it can be switched off via Kconfig.
>
> It is not practical to ponder every change case by case. Maintainers
> are overworked. How long until we send the uAPI patch for RSS on the
> flow label? I'd rather not re-litigate this every time someone posts
> a slightly different feature. Let's cover the obvious points from
> the beginning while everyone is paying attention. We can amend later
> as need be.

Isn't that what we are doing right now? We are basically refusing this
patch set not based on its own merits but on a "what-if" scenario for
a patch set that might come at some point in the future and conjecture
that somehow it is going to be able to add features for just itself
when we haven't allowed that for in the past, for example with things
like GSO partial.

> > When the time comes to start adding new features we can probably start
> > by looking at how to add either generic offloads like was done for
> > GSO, CSO, ect or how it can also be implemented on another vendor's
> > NIC.
> >
> > At this point the only risk the driver presents is that it is yet
> > another driver, done in the same style I did the other Intel drivers,
> > and so any kernel API changes will end up needing to be applied to it
> > just like the other drivers.
>
> The risk is we'll have a fight every time there is a disagreement about
> the expectations.

We always do. I am not sure why you would expect that would change by
blocking this patch set. If anything it sounds like maybe we need to
document some requirements for availability for testing, and
expectations for what it would mean to be a one-off device where only
one entity has access to it. However that is a process problem, and
not so much a patch or driver issue.

> > > > I think any other policy would be very ugly to enforce, prove, and
> > > > even understand. Obviously code and architecture debates I'm all for.
> > > > Ensuring we have a trusted, experienced person signed up to review
> > > > code, address feedback, fix whatever syzbot finds and so on is also a
> > > > must I think. I'm sure Alex will take care of it.
> > >
> > > "Whatever syzbot finds" may be slightly moot for a private device ;)
> > > but otherwise 100%! These are exactly the kind of points I think we
> > > should enumerate. I started writing a list of expectations a while back:
> > >
> > > Documentation/maintainer/feature-and-driver-maintainers.rst
> > >
> > > I think we just need something like this, maybe just a step up, for
> > > non-public devices..
> >
> > I honestly think we are getting the cart ahead of the horse. When we
> > start talking about kernel API changes then we can probably get into
> > the whole "private" versus "publicly available" argument. A good
> > example of the kind of thing I am thinking of is GSO partial where I
> > ended up with Mellanox and Intel sending me 40G and 100G NICs and
> > cables to implement it on their devices as all I had was essentially
> > igb and ixgbe based NICs.
>
> That'd be great. Maybe even more than I'd expect. So why not write
> it down? In case the person doing the coding is not Alex Duyck, and
> just wants to get it done for their narrow use case, get a promo,
> go work on something else?

Write what down? That the vendors didn't like me harassing them to
test my code so they shipped me the NICs and asked me to just test it
myself?

That worked in my scenario as I had a server level system in my home
lab to make that work. We cannot expect everyone to have that kind of
setup for their own development. That is why I am considering the QEMU
approach as that might make this a bit more accessible. I could then
look at enabling the QEMU at the same time I enable the driver.

> > Odds are when we start getting to those kind of things maybe we need
> > to look at having a few systems available for developer use, but until
> > then I am not sure it makes sense to focus on if the device is
> > publicly available or not.
>
> Developer access would be huge.
> A mirage of developer access? immaterial :)

If nothing else, maybe we need a writeup somewhere of the level of
support a driver should expect from the Linux community if the device
is not "easily available". We could probably define that in terms of
what a reasonable expectation would be for a developer to have access
to it.

I would say that "commercial sales" is not a good metric and shouldn't
come into it. If anything it would be about device availability for
development and testing.

In addition I would be good with a definition of "support" for the
case of a device that isn't publicly available being quite limited, as
those with access would have to have active involvement in the
community to enable support. Without that it might as well be
considered orphaned and the driver dropped.

Alexander Duyck April 5, 2024, 6:38 p.m. UTC | #24

On Fri, Apr 5, 2024 at 8:17 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Fri, Apr 05, 2024 at 07:24:32AM -0700, Alexander Duyck wrote:
> > > Alex already indicated new features are coming, changes to the core
> > > code will be proposed. How should those be evaluated? Hypothetically
> > > should fbnic be allowed to be the first implementation of something
> > > invasive like Mina's DMABUF work? Google published an open userspace
> > > for NCCL that people can (in theory at least) actually run. Meta would
> > > not be able to do that. I would say that clearly crosses the line and
> > > should not be accepted.
> >
> > Why not? Just because we are not commercially selling it doesn't mean
> > we couldn't look at other solutions such as QEMU. If we were to
> > provide a github repo with an emulation of the NIC would that be
> > enough to satisfy the "commercial" requirement?
>
> My test is not "commercial", it is enabling open source ecosystem vs
> benefiting only proprietary software.

Sorry, that was where this started where Jiri was stating that we had
to be selling this.

> In my hypothetical you'd need to do something like open source Meta's
> implementation of the AI networking that the DMABUF patches enable,
> and even then since nobody could run it at performance the thing is
> pretty questionable.
>
> IMHO publishing a qemu chip emulator would not advance the open source
> ecosystem around building a DMABUF AI networking scheme.

Well not too many will be able to afford getting the types of systems
and hardware needed for this in the first place. Primarily just your
large data center companies can afford it.

I never said this hardware is about enabling DMABUF. You implied that.
The fact is that this driver is meant to be a pretty basic speeds and
feeds device. We support header split and network flow classification
so I suppose it could be used for DMABUF but by that logic so could a
number of other drivers.

> > > So I think there should be an expectation that technically sound things
> > > Meta may propose must not be accepted because they cross the
> > > ideological red line into enabling only proprietary software.
> >
> > That is a faulty argument. That is like saying we should kick out the
> > nouveu driver out of Linux just because it supports Nvidia graphics
> > cards that happen to also have a proprietary out-of-tree driver out
> > there,
>
> Huh? nouveau supports a fully open source mesa graphics stack in
> Linux. How is that remotely similar to what I said? No issue.

Right, nouveau is fully open source. That is what I am trying to do
with fbnic. That is what I am getting at. This isn't connecting to
some proprietary stack or engaging in any sort of bypass. It is going
through the standard networking stack. If there were some other
out-of-tree driver for this to support some other use case how would
that impact the upstream patch submission?

This driver is being NAKed for enabling stuff that hasn't even been
presented. It is barely enough driver to handle PXE booting which is
needed to be able to even load an OS on the system. Yet somehow
because you are expecting a fork to come in at some point to support
DMABUF you are wanting to block it outright. How about rather than
doing that we wait until there is something there that is
objectionable before we start speculating on what may be coming.

> You pointed at two things that I would consider to be exemplar open
> source projects and said their existance somehow means we should be
> purging drivers from the kernel???
>
> I really don't understand what you are trying to say at all.

I'm trying to say that both those projects are essentially doing the
same thing you are accusing fbnic of doing, even though I am exposing
no non-standard API(s) and everything is open source. You are
projecting future changes onto this driver that don't currently and
may never exist.

> The kernel standard is that good quality open source *does* exist, we
> tend to not care what proprietary things people create beyond that.

Now I am confused. You say you don't care what happens later, but you
seem to be insisting you care about what proprietary things will be
done with it after it is upstreamed.

> > I can't think of many NIC vendors that don't have their own
> > out-of-tree drivers floating around with their own kernel bypass
> > solutions to support proprietary software.
>
> Most of those are also open source, and we can't say much about what
> people do out of tree, obviously.

Isn't that exactly what you are doing though with all your
"proprietary" comments?

> > I agree. We need a consistent set of standards. I just strongly
> > believe commercial availability shouldn't be one of them.
>
> I never said commercial availability. I talked about open source vs
> proprietary userspace. This is very standard kernel stuff.
>
> You have an unavailable NIC, so we know it is only ever operated with
> Meta's proprietary kernel fork, supporting Meta's proprietary
> userspace software. Where exactly is the open source?

It depends on your definition of "unavailable". I could argue that for
many most of the Mellanox NICs are also have limited availability as
they aren't exactly easy to get a hold of without paying a hefty
ransom.

The NIC is currently available to developers within Meta. As such I
know there are not a small number of kernel developers who could get
access to it if they asked for a login to one of our test and
development systems. Also I offered to provide the QEMU repo, but you
said you had no interest in that option.

> Why should someone working to improve only their proprietary
> environment be welcomed in the same way as someone working to improve
> the open source ecosystem? That has never been the kernel communities
> position.

To quote Linus `I do not see open source as some big goody-goody
"let's all sing kumbaya around the campfire and make the world a
better place". No, open source only really works if everybody is
contributing for their own selfish reasons.`[1]

How is us using our own NIC any different than if one of the vendors
were to make a NIC exclusively for us or any other large data center?
The only reason why this is coming up is because Meta is not a typical
NIC vendor but normally a consumer. The fact that we will be
dogfooding our own NIC seems to be at the heart of the issue here.

Haven't there been a number of maintainers who end up maintaining code
bases in the kernel for platforms and/or devices where they own one of
the few devices available in the world? How would this be any
different. Given enough time it is likely this will end up in the
hands of those outside Meta anyway, at that point the argument would
be moot.

> If you want to propose things to the kernel that can only be
> meaningfully used by your proprietary software then you should not
> expect to succeed. No one should be surprised to hear this.

If the whole idea is to get us to run a non-proprietary stack nothing
sends the exact opposite message like telling us we cannot upstream a
simple network driver because of a "what if" about some DMABUF patch
set from Google. All I am asking for is the ability to net install a
system with this device. That requires the driver to be available in
the provisioning kernel image, so thus why I am working to upstream it
as I would rather not have to maintain an out-of-tree kernel driver.

The argument here isn't about proprietary software. It is proprietary
hardware that seems to be the issue, or at least that is where it
started. The driver itself anyone could load, build, or even run on
QEMU as I mentioned. It is open source and not exposing any new APIs.
The issue seems to be with the fact that the NIC can't be bought from
a vendor and instead Meta is building the NIC for it's own
consumption.

As far as the software stack the concern about DMABUF seems like an
orthogonal argument that should be had at the userspace/API level and
doesn't directly relate to any specific driver. As has been pointed
out enabling anything like that wouldn't be a single NIC solution and
to be accepted upstream it should be implemented on at least 2
different vendor drivers.  Additionally, there isn't anything unique
about this hardware that would make it more capable of enabling that
than any other device.

Thanks,

- Alex

[1]: https://www.bbc.com/news/technology-18419231

Jason Gunthorpe April 5, 2024, 7:02 p.m. UTC | #25

On Fri, Apr 05, 2024 at 11:38:25AM -0700, Alexander Duyck wrote:

> > In my hypothetical you'd need to do something like open source Meta's
> > implementation of the AI networking that the DMABUF patches enable,
> > and even then since nobody could run it at performance the thing is
> > pretty questionable.
> >
> > IMHO publishing a qemu chip emulator would not advance the open source
> > ecosystem around building a DMABUF AI networking scheme.
> 
> Well not too many will be able to afford getting the types of systems
> and hardware needed for this in the first place. Primarily just your
> large data center companies can afford it.
> 
> I never said this hardware is about enabling DMABUF.

I presented a hypothetical to be able to illustrate a scenario where
this driver should not be used to justify invasive core kernel
changes.

I have no idea what future things you have in mind, or if any will
reach a threshold where I would expect they should not be
included. You where the one saying a key reason you wanted this driver
was to push core changes and you said you imagine changes that are
unique to fbnic that "others might like to follow".

I'm being very clear to say that there are some core changes should
not be accepted due to the kernel's open source ideology.

> Right, nouveau is fully open source. That is what I am trying to do
> with fbnic. That is what I am getting at. This isn't connecting to
> some proprietary stack or engaging in any sort of bypass.

The basic driver presented here is not, a future driver that justifies
unknown changes to the core may be.

This is why my message was pretty clear. IMHO there is nothing wrong
with this series, but I do not expect you will get everything you want
in future due to this issue.

I said decide if you want to continue. I'm not NAKing anything on this
series.

> I'm trying to say that both those projects are essentially doing the
> same thing you are accusing fbnic of doing, 

Not even close. Both those projects support open source ecosystems,
have wide cross vendor participating. fbnic isn't even going to be
enabled in any distribution.

> > You have an unavailable NIC, so we know it is only ever operated with
> > Meta's proprietary kernel fork, supporting Meta's proprietary
> > userspace software. Where exactly is the open source?
> 
> It depends on your definition of "unavailable". I could argue that for
> many most of the Mellanox NICs are also have limited availability as
> they aren't exactly easy to get a hold of without paying a hefty
> ransom.

And GNIC's that run Mina's series are completely unavailable right
now. That is still a big different from a temporary issue to a
permanent structural intention of the manufacturer.

> > Why should someone working to improve only their proprietary
> > environment be welcomed in the same way as someone working to improve
> > the open source ecosystem? That has never been the kernel communities
> > position.
> 
> To quote Linus `I do not see open source as some big goody-goody
> "let's all sing kumbaya around the campfire and make the world a
> better place". No, open source only really works if everybody is
> contributing for their own selfish reasons.`[1]

I think that stance has evolved and the consensus position toward uapi
is stronger.

> different. Given enough time it is likely this will end up in the
> hands of those outside Meta anyway, at that point the argument would
> be moot.

Oh, I'm skeptical about that.

You seem to have taken my original email in a strange direction. I
said this series was fine but cautioned that if you proceed you should
be expecting an eventual feature rejection for idological reasons, and
gave a hypothetical example what that would look like.

If you want to continue or not is up to Meta, in my view.

Jason

Alexander Duyck April 6, 2024, 4:05 p.m. UTC | #26

On Fri, Apr 5, 2024 at 12:02 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Fri, Apr 05, 2024 at 11:38:25AM -0700, Alexander Duyck wrote:
>
> > > In my hypothetical you'd need to do something like open source Meta's
> > > implementation of the AI networking that the DMABUF patches enable,
> > > and even then since nobody could run it at performance the thing is
> > > pretty questionable.
> > >
> > > IMHO publishing a qemu chip emulator would not advance the open source
> > > ecosystem around building a DMABUF AI networking scheme.
> >
> > Well not too many will be able to afford getting the types of systems
> > and hardware needed for this in the first place. Primarily just your
> > large data center companies can afford it.
> >
> > I never said this hardware is about enabling DMABUF.
>
> I presented a hypothetical to be able to illustrate a scenario where
> this driver should not be used to justify invasive core kernel
> changes.
>
> I have no idea what future things you have in mind, or if any will
> reach a threshold where I would expect they should not be
> included. You where the one saying a key reason you wanted this driver
> was to push core changes and you said you imagine changes that are
> unique to fbnic that "others might like to follow".
>
> I'm being very clear to say that there are some core changes should
> not be accepted due to the kernel's open source ideology.

Okay, on core changes I 100% agree. That is one of the reasons why we
have the whole thing about any feature really needing to be enabled on
at least 2 different vendor devices.

> > Right, nouveau is fully open source. That is what I am trying to do
> > with fbnic. That is what I am getting at. This isn't connecting to
> > some proprietary stack or engaging in any sort of bypass.
>
> The basic driver presented here is not, a future driver that justifies
> unknown changes to the core may be.
>
> This is why my message was pretty clear. IMHO there is nothing wrong
> with this series, but I do not expect you will get everything you want
> in future due to this issue.
>
> I said decide if you want to continue. I'm not NAKing anything on this
> series.

My apologies. I had interpreted your statement as essentially agreeing
with Jiri and NAKing the patches simply for not being commercially
available.

> > I'm trying to say that both those projects are essentially doing the
> > same thing you are accusing fbnic of doing,
>
> Not even close. Both those projects support open source ecosystems,
> have wide cross vendor participating. fbnic isn't even going to be
> enabled in any distribution.

I can't say for certain if that will be the case going forward or not.
I know we haven't reached out to any distros and currently don't need
to. With that said though, having the driver available as a module
shouldn't cause any harm either so I don't really have a strong
opinion about it either way.

Seeing as how we arne't in the NIC business I am not sure how
restrictive we would be about licensing out the IP. I could see Meta
potentially trying to open up the specs just to take the manufacturing
burden off of us and enable more competition, though I am on the
engineering side and not so much sourcing so I am just speculating.

> > > You have an unavailable NIC, so we know it is only ever operated with
> > > Meta's proprietary kernel fork, supporting Meta's proprietary
> > > userspace software. Where exactly is the open source?
> >
> > It depends on your definition of "unavailable". I could argue that for
> > many most of the Mellanox NICs are also have limited availability as
> > they aren't exactly easy to get a hold of without paying a hefty
> > ransom.
>
> And GNIC's that run Mina's series are completely unavailable right
> now. That is still a big different from a temporary issue to a
> permanent structural intention of the manufacturer.

I'm assuming it is some sort of firmware functionality that is needed
to enable it? One thing with our design is that the firmware actually
has minimal functionality. Basically it is the liaison between the
BMC, Host, and the MAC. Otherwise it has no role to play in the
control path so when the driver is loaded it is running the show.

> > > Why should someone working to improve only their proprietary
> > > environment be welcomed in the same way as someone working to improve
> > > the open source ecosystem? That has never been the kernel communities
> > > position.
> >
> > To quote Linus `I do not see open source as some big goody-goody
> > "let's all sing kumbaya around the campfire and make the world a
> > better place". No, open source only really works if everybody is
> > contributing for their own selfish reasons.`[1]
>
> I think that stance has evolved and the consensus position toward uapi
> is stronger.

I assume you are talking about the need to have an open source
consumer for any exposed uapi? That makes sense from the test and
development standpoint as you need to have some way to exercise and
test any interface that is available in the kernel.

> > different. Given enough time it is likely this will end up in the
> > hands of those outside Meta anyway, at that point the argument would
> > be moot.
>
> Oh, I'm skeptical about that.

I didn't say how widely or when. I got my introduction to Linux by
buying used server systems and trying to get something maintainable in
terms of OS on them. From my experience you end up with all sorts of
odd-ball proprietary parts that eventually end up leaking out to the
public. Though back then it was more likely to be a proprietary spin
of some known silicon with a few tweaks that had to be accounted for.

> You seem to have taken my original email in a strange direction. I
> said this series was fine but cautioned that if you proceed you should
> be expecting an eventual feature rejection for idological reasons, and
> gave a hypothetical example what that would look like.
>
> If you want to continue or not is up to Meta, in my view.
>
> Jason

Yeah, I think I had misunderstood your original comment as being in
support of Jiri's position. I hadn't fully grokked that you were doing
more of a "yes, but" versus an "yes, and".

For this driver I don't see there being too much in the way of
complications as there shouldn't be much in the way of any sort of
kernel-bypass that would be applicable to this driver uniquely.

Thanks,

- Alex

Andrew Lunn April 6, 2024, 4:49 p.m. UTC | #27

> I'm assuming it is some sort of firmware functionality that is needed
> to enable it? One thing with our design is that the firmware actually
> has minimal functionality. Basically it is the liaison between the
> BMC, Host, and the MAC. Otherwise it has no role to play in the
> control path so when the driver is loaded it is running the show.

Which i personally feel is great. In an odd way, this to me indicates
this is a commodity product, or at least, leading the way towards
commodity 100G products. Looking at the embedded SoC NIC market, which
pretty is much about commodity, few 1G Ethernet NICs have firmware.
Most 10G NICs also have no firmware. Linux is driving the hardware.

Much of the current Linux infrastructure is limited to 10G, because
currently everything faster than that hides away in firmware, Linux
does not get to driver it. This driver could help push Linux
controlling the hardware forward, to be benefit of us all. It would be
great if this driver used phylink to manage the PCS and the SFP cage,
that the PCS code is moved into drivers/net/pcs, etc. Assuming this
PCS follows the standards, it would be great to add helpers like we
have for clause 37, clause 73, to help support other future PCS
drivers which will appear. 100G in SoCs is probably not going to
appear too soon, but single channel 25G is probably the next step
after 10G. And what is added for this device will probably also work
for 25G. 40G via 4 channels is probably not too far away either.

Our Linux SFP driver is also currently limited to 10G. It would be
great if this driver could push that forwards to support faster SFP
cages and devices, support splitting and merging, etc.

None of this requires new kAPIs, they all already exist. There is
nothing controversial here. Everything follows standards. So if Meta
were to abandon the MAC driver, it would not matter, its not dead
infrastructure code, future drivers would make use of it, as this
technology becomes more and more commodity.

	Andrew

Alexander Duyck April 6, 2024, 4:53 p.m. UTC | #28

On Fri, Apr 5, 2024 at 7:01 AM Przemek Kitszel
<przemyslaw.kitszel@intel.com> wrote:
>
> On 4/3/24 22:08, Alexander Duyck wrote:
> > This patch set includes the necessary patches to enable basic Tx and Rx
> > over the Meta Platforms Host Network Interface. To do this we introduce a
> > new driver and driver and directories in the form of
> > "drivers/net/ethernet/meta/fbnic".
> >
> > Due to submission limits the general plan to submit a minimal driver for
> > now almost equivalent to a UEFI driver in functionality, and then follow up
> > over the coming weeks enabling additional offloads and more features for
> > the device.
> >
> > The general plan is to look at adding support for ethtool, statistics, and
> > start work on offloads in the next set of patches.
> >
> > ---
> >
> > Alexander Duyck (15):
> >        PCI: Add Meta Platforms vendor ID
> >        eth: fbnic: add scaffolding for Meta's NIC driver
> >        eth: fbnic: Allocate core device specific structures and devlink interface
> >        eth: fbnic: Add register init to set PCIe/Ethernet device config
> >        eth: fbnic: add message parsing for FW messages
> >        eth: fbnic: add FW communication mechanism
> >        eth: fbnic: allocate a netdevice and napi vectors with queues
> >        eth: fbnic: implement Tx queue alloc/start/stop/free
> >        eth: fbnic: implement Rx queue alloc/start/stop/free
> >        eth: fbnic: Add initial messaging to notify FW of our presence
> >        eth: fbnic: Enable Ethernet link setup
> >        eth: fbnic: add basic Tx handling
> >        eth: fbnic: add basic Rx handling
> >        eth: fbnic: add L2 address programming
> >        eth: fbnic: write the TCAM tables used for RSS control and Rx to host
> >
> >
> >   MAINTAINERS                                   |    7 +
> >   drivers/net/ethernet/Kconfig                  |    1 +
> >   drivers/net/ethernet/Makefile                 |    1 +
> >   drivers/net/ethernet/meta/Kconfig             |   29 +
> >   drivers/net/ethernet/meta/Makefile            |    6 +
> >   drivers/net/ethernet/meta/fbnic/Makefile      |   18 +
> >   drivers/net/ethernet/meta/fbnic/fbnic.h       |  148 ++
> >   drivers/net/ethernet/meta/fbnic/fbnic_csr.h   |  912 ++++++++
> >   .../net/ethernet/meta/fbnic/fbnic_devlink.c   |   86 +
> >   .../net/ethernet/meta/fbnic/fbnic_drvinfo.h   |    5 +
> >   drivers/net/ethernet/meta/fbnic/fbnic_fw.c    |  823 ++++++++
> >   drivers/net/ethernet/meta/fbnic/fbnic_fw.h    |  133 ++
> >   drivers/net/ethernet/meta/fbnic/fbnic_irq.c   |  251 +++
> >   drivers/net/ethernet/meta/fbnic/fbnic_mac.c   | 1025 +++++++++
> >   drivers/net/ethernet/meta/fbnic/fbnic_mac.h   |   83 +
> >   .../net/ethernet/meta/fbnic/fbnic_netdev.c    |  470 +++++
> >   .../net/ethernet/meta/fbnic/fbnic_netdev.h    |   59 +
> >   drivers/net/ethernet/meta/fbnic/fbnic_pci.c   |  633 ++++++
> >   drivers/net/ethernet/meta/fbnic/fbnic_rpc.c   |  709 +++++++
> >   drivers/net/ethernet/meta/fbnic/fbnic_rpc.h   |  189 ++
> >   drivers/net/ethernet/meta/fbnic/fbnic_tlv.c   |  529 +++++
> >   drivers/net/ethernet/meta/fbnic/fbnic_tlv.h   |  175 ++
> >   drivers/net/ethernet/meta/fbnic/fbnic_txrx.c  | 1873 +++++++++++++++++
> >   drivers/net/ethernet/meta/fbnic/fbnic_txrx.h  |  125 ++
> >   include/linux/pci_ids.h                       |    2 +
> >   25 files changed, 8292 insertions(+)
>
> Even if this is just a basic scaffolding for what will come, it's hard
> to believe that no patch was co-developed, or should be marked as
> authored-by some other developer.
>
> [...]

I don't want to come across as snarky, but you must be new to Intel?
If nothing else you might ask a few people there  about the history of
the fm10k drivers. I think I did most of the Linux and FreeBSD fm10k
drivers in about 2 to 3 years. Typically getting basic Tx and Rx up
and running on a driver only takes a few weeks, and it is pretty
straight forward when you are implementing the QEMU at the same time
to test it on. From my experience driver development really goes by
the pareto principle where getting basic Tx/Rx up and running is
usually a quick task. What takes forever is enabling all the other
offloads and functions.

As far as this driver goes I would say this is something similar, only
this time I have worked on a Linux and UEFI driver, both of which I am
hoping to get upstreamed. With that said I can go through the bits for
the yet to be upstreamed parts that weren't done by me.

We had tried to bring a few people onto the team early on, none of
which are with the team anymore but a couple are still with the
company so I can reach out to them and see if they are okay with the
Co-autthor attribution before I submit those patches. I have a few
people who worked on the ptp and debugfs, one that enabled Tx offloads
and the ethtool ring configuration, and another had just started work
on the UEFI driver before he left. In addition there was an Intern who
did most of the work on the ethtool loopback test.

When the layoffs hit in late 2022 the team was basically reduced to
just myself and a firmware developer. Neither of us really strayed too
much into the other's code. Basically I defined the mailbox interface
and messages and went on our separate ways from there. In the last 6
months our team started hiring again. The new hires are currently
working in areas that aren't in this set such as devlink firmware
update, ethtool register dump, and various other interactions with the
firmware.

Alexander Duyck April 6, 2024, 5:16 p.m. UTC | #29

On Sat, Apr 6, 2024 at 9:49 AM Andrew Lunn <andrew@lunn.ch> wrote:
>
> > I'm assuming it is some sort of firmware functionality that is needed
> > to enable it? One thing with our design is that the firmware actually
> > has minimal functionality. Basically it is the liaison between the
> > BMC, Host, and the MAC. Otherwise it has no role to play in the
> > control path so when the driver is loaded it is running the show.
>
> Which i personally feel is great. In an odd way, this to me indicates
> this is a commodity product, or at least, leading the way towards
> commodity 100G products. Looking at the embedded SoC NIC market, which
> pretty is much about commodity, few 1G Ethernet NICs have firmware.
> Most 10G NICs also have no firmware. Linux is driving the hardware.
>
> Much of the current Linux infrastructure is limited to 10G, because
> currently everything faster than that hides away in firmware, Linux
> does not get to driver it. This driver could help push Linux
> controlling the hardware forward, to be benefit of us all. It would be
> great if this driver used phylink to manage the PCS and the SFP cage,
> that the PCS code is moved into drivers/net/pcs, etc. Assuming this
> PCS follows the standards, it would be great to add helpers like we
> have for clause 37, clause 73, to help support other future PCS
> drivers which will appear. 100G in SoCs is probably not going to
> appear too soon, but single channel 25G is probably the next step
> after 10G. And what is added for this device will probably also work
> for 25G. 40G via 4 channels is probably not too far away either.
>
> Our Linux SFP driver is also currently limited to 10G. It would be
> great if this driver could push that forwards to support faster SFP
> cages and devices, support splitting and merging, etc.
>
> None of this requires new kAPIs, they all already exist. There is
> nothing controversial here. Everything follows standards. So if Meta
> were to abandon the MAC driver, it would not matter, its not dead
> infrastructure code, future drivers would make use of it, as this
> technology becomes more and more commodity.
>
>         Andrew

As far as the MAC/PCS code goes I will have to see what I can do. I
think I have to check with our sourcing team to figure out what
contracts are in place for whatever IP we are currently using before I
can share any additional info beyond the code here.

One other complication I can think of in terms of switching things
over as you have requested is that we will probably need to look at
splitting up the fbnic_mac.c file as it is currently used for both the
UEFI driver and the Linux driver so I will need to have a solution for
the UEFI driver which wouldn't have the advantage of phylink.

Thanks,

- Alex

Leon Romanovsky April 8, 2024, 6:18 a.m. UTC | #30

On Fri, Apr 05, 2024 at 08:41:11AM -0700, Alexander Duyck wrote:
> On Thu, Apr 4, 2024 at 7:38 PM Jakub Kicinski <kuba@kernel.org> wrote:

<...>

> > > > Technical solution? Maybe if it's not a public device regression rules
> > > > don't apply? Seems fairly reasonable.
> > >
> > > This is a hypothetical. This driver currently isn't changing anything
> > > outside of itself. At this point the driver would only be build tested
> > > by everyone else. They could just not include it in their Kconfig and
> > > then out-of-sight, out-of-mind.
> >
> > Not changing does not mean not depending on existing behavior.
> > Investigating and fixing properly even the hardest regressions in
> > the stack is a bar that Meta can so easily clear. I don't understand
> > why you are arguing.
> 
> I wasn't saying the driver wouldn't be dependent on existing behavior.
> I was saying that it was a hypothetical that Meta would be a "less
> than cooperative user" and demand a revert.  It is also a hypothetical
> that Linus wouldn't just propose a revert of the fbnic driver instead
> of the API for the crime of being a "less than cooperative maintainer"
> and  then give Meta the Nvidia treatment.

It is very easy to be "less than cooperative maintainer" in netdev world.
1. Be vendor.
2. Propose ideas which are different.
3. Report for user-visible regression.
4. Ask for a fix from the patch author or demand a revert according to netdev rules/practice.

And voilà, you are "less than cooperative maintainer".

So in reality, the "hypothetical" is very close to the reality, unless
Meta contribution will be treated as a special case.

Thanks

Jiri Pirko April 8, 2024, 10:54 a.m. UTC | #31

Thu, Apr 04, 2024 at 09:22:02PM CEST, alexander.duyck@gmail.com wrote:
>On Thu, Apr 4, 2024 at 8:36 AM Jiri Pirko <jiri@resnulli.us> wrote:
>>
>> Thu, Apr 04, 2024 at 04:45:14PM CEST, alexander.duyck@gmail.com wrote:
>> >On Thu, Apr 4, 2024 at 4:37 AM Jiri Pirko <jiri@resnulli.us> wrote:
>> >>
>> >> Wed, Apr 03, 2024 at 10:08:24PM CEST, alexander.duyck@gmail.com wrote:
>
><...>
>
>> >> Could you please shed some light for the motivation to introduce this
>> >> driver in the community kernel? Is this device something people can
>> >> obtain in a shop, or is it rather something to be seen in Meta
>> >> datacenter only? If the second is the case, why exactly would we need
>> >> this driver?
>> >
>> >For now this is Meta only. However there are several reasons for
>> >wanting to include this in the upstream kernel.
>> >
>> >First is the fact that from a maintenance standpoint it is easier to
>> >avoid drifting from the upstream APIs and such if we are in the kernel
>> >it makes things much easier to maintain as we can just pull in patches
>> >without having to add onto that work by having to craft backports
>> >around code that isn't already in upstream.
>>
>> That is making life easier for you, making it harder for the community.
>> O relevance.
>>
>>
>> >
>> >Second is the fact that as we introduce new features with our driver
>> >it is much easier to show a proof of concept with the driver being in
>> >the kernel than not. It makes it much harder to work with the
>> >community on offloads and such if we don't have a good vehicle to use
>> >for that. What this driver will provide is an opportunity to push
>> >changes that would be beneficial to us, and likely the rest of the
>> >community without being constrained by what vendors decide they want
>> >to enable or not. The general idea is that if we can show benefit with
>> >our NIC then other vendors would be more likely to follow in our path.
>>
>> Yeah, so not even we would have to maintain driver nobody (outside Meta)
>> uses or cares about, you say that we will likely maintain more of a dead
>> code related to that. I think that in Linux kernel, there any many
>> examples of similarly dead code that causes a lot of headaches to
>> maintain.
>>
>> You just want to make your life easier here again. Don't drag community
>> into this please.
>
>The argument itself doesn't really hold water. The fact is the Meta
>data centers are not an insignificant consumer of Linux, so it isn't
>as if the driver isn't going to be used. This implies some lack of

Used by one user. Consider a person creating some custom proprietary
FPGA based pet project for himself, trying to add driver for it to the
mainline kernel. Why? Nobody else will ever see the device, why the
community should be involved at all? Does not make sense. Have the
driver for your internal cook-ups internal.


>good faith from Meta. I don't understand that as we are contributing
>across multiple areas in the kernel including networking and ebpf. Is
>Meta expected to start pulling time from our upstream maintainers to
>have them update out-of-tree kernel modules since the community isn't
>willing to let us maintain it in the kernel? Is the message that the

If Meta contributes whatever may be useful for somebody else, it is
completely fine. This driver is not useful for anyone, except Meta.


>kernel is expected to get value from Meta, but that value is not meant
>to be reciprocated? Would you really rather have us start maintaining
>our own internal kernel with our own "proprietary goodness", and ask

I don't care, maintain whatever you want internally. Totally up to you.
Just try to understand my POV. I may believe you have good faith and
everything. But still, I think that community has to be selfish.


>other NIC vendors to have to maintain their drivers against yet
>another kernel if they want to be used in our data centers?
>
>As pointed out by Andew we aren't the first data center to push a
>driver for our own proprietary device. The fact is there have been

If you proprietary device is used by other people running virtual
machines on your systems, that is completely fine. But that is incorrect
analogy to your nic, no outside-Meta person will ever see it!


>drivers added for devices that were for purely emulated devices with
>no actual customers such as rocker. Should the switch vendors at the

This is completely fault analogy. Rocker was introduced to solve
chicken-egg problem to ass switch device support into kernel. It served
the purpose quite well. Let it rot now.



>time have pushed back on it stating it wasn't a real "for sale"
>device? The whole argument seems counter to what is expected. When a
>vendor creates a new device and will likely be enabling new kernel
>features my understanding is that it is better to be in the kernel
>than not.
>
>Putting a criteria on it that it must be "for sale" seems rather

Not "for sale", but "available to the outside person".


>arbitrary and capricious, especially given that most drivers have to

Not capricious at all, I sorry you feel that way. You proceed your
company goals, my position here is to defend the community and
the unnecessary and pointless burden you are putting on it.


>be pushed out long before they are available in the market in order to
>meet deadlines to get the driver into OSV releases such as Redhat when
>it hits the market. By that logic should we block all future drivers
>until we can find them for sale somewhere? That way we don't run the

That is or course obviously complete fault analogy again. You never plan
to introduce your device to public. Big difference. Don't you see it?


>risk of adding a vendor driver for a product that might be scrapped
>due to a last minute bug that will cause it to never be released.

Jiri Pirko April 8, 2024, 11:05 a.m. UTC | #32

Thu, Apr 04, 2024 at 11:59:33PM CEST, john.fastabend@gmail.com wrote:
>Jakub Kicinski wrote:
>> On Thu, 4 Apr 2024 12:22:02 -0700 Alexander Duyck wrote:
>> > The argument itself doesn't really hold water. The fact is the Meta
>> > data centers are not an insignificant consumer of Linux, 
>> 
>> customer or beneficiary ?
>> 
>> > so it isn't as if the driver isn't going to be used. This implies
>> > some lack of good faith from Meta.
>> 
>> "Good faith" is not a sufficient foundation for a community consisting
>> of volunteers, and commercial entities (with the xz debacle maybe even
>> less today than it was a month ago). As a maintainer I really don't want
>> to be in position of judging the "good faith" of corporate actors.
>> 
>> > I don't understand that as we are
>> > contributing across multiple areas in the kernel including networking
>> > and ebpf. Is Meta expected to start pulling time from our upstream
>> > maintainers to have them update out-of-tree kernel modules since the
>> > community isn't willing to let us maintain it in the kernel? Is the
>> > message that the kernel is expected to get value from Meta, but that
>> > value is not meant to be reciprocated? Would you really rather have
>> > us start maintaining our own internal kernel with our own
>> > "proprietary goodness", and ask other NIC vendors to have to maintain
>> > their drivers against yet another kernel if they want to be used in
>> > our data centers?
>> 
>> Please allow the community to make rational choices in the interest of
>> the project and more importantly the interest of its broader user base.
>> 
>> Google would also claim "good faith" -- undoubtedly is supports 
>> the kernel, and lets some of its best engineers contribute.
>> Did that make them stop trying to build Fuchsia? The "good faith" of
>> companies operates with the limits of margin of error of they consider
>> rational and beneficial.
>> 
>> I don't want to put my thumb on the scale (yet?), but (with my
>> maintainer hat on) please don't use the "Meta is good" argument, because
>> someone will send a similar driver from a less involved company later on
>> and we'll be accused of playing favorites :( Plus companies can change
>> their approach to open source from "inclusive" to "extractive" 
>> (to borrow the economic terminology) rather quickly.
>> 
>
>I'll throw my $.02 in. In this case you have a driver that I only scanned
>so far, but looks well done. Alex has written lots of drivers I trust he
>will not just abondon it. And if it does end up abondoned and no one
>supports it at some future point we can deprecate it same as any other
>driver in the networking tree. All the feedback is being answered and
>debate is happening so I expect will get a v2, v3 or so. All good signs
>in my point.
>
>Back to your point about faith in a company. I don't think we even need
>to care about whatever companies business plans. The author could have
>submitted with their personal address for what its worth and called it
>drivers/alexware/duyck.o Bit extreme and I would have called him on it,
>but hopefully the point is clear.
>
>We have lots of drivers in the tree that are hard to physically get ahold
>of. Or otherwise gated by paying some vendor for compute time, etc. to
>use. We even have some drivers where the hardware itself never made
>it out into the wild or only a single customer used it before sellers 
>burned it for commercial reasons or hw wasn't workable, team was cut, etc.
>
>I can't see how if I have a physical NIC for it on my desk here makes
>much difference one way or the other.
>
>The alternative is much worse someone builds a team of engineers locks
>them up they build some interesting pieces and we never get to see it
>because we tried to block someone from opensourcing their driver?
>Eventually they need some kernel changes and than we block those too
>because we didn't allow the driver that was the use case? This seems
>wrong to me.
>
>Anyways we have zero ways to enforce such a policy. Have vendors
>ship a NIC to somebody with the v0 of the patch set? Attach a picture? 

Come on. Are you kidding? Isn't this case crystal clear?


>Even if vendor X claims they will have a product in N months and
>than only sells it to qualified customers what to do we do then.
>Driver author could even believe the hardware will be available
>when they post the driver, but business may change out of hands
>of the developer.
>
>I'm 100% on letting this through assuming Alex is on top of feedback
>and the code is good. I think any other policy would be very ugly
>to enforce, prove, and even understand. Obviously code and architecture
>debates I'm all for. Ensuring we have a trusted, experienced person
>signed up to review code, address feedback, fix whatever syzbot finds
>and so on is also a must I think. I'm sure Alex will take care of
>it.

You are for some reason making this submission very personal on Alex.
Just to be clear, this has nothing to do with Alex.


>
>Thanks,
>John

Jiri Pirko April 8, 2024, 11:37 a.m. UTC | #33

Fri, Apr 05, 2024 at 09:11:19AM CEST, pabeni@redhat.com wrote:
>On Thu, 2024-04-04 at 17:11 -0700, Alexander Duyck wrote:
>> Again, I would say we look at the blast radius. That is how we should
>> be measuring any change. At this point the driver is self contained
>> into /drivers/net/ethernet/meta/fbnic/. It isn't exporting anything
>> outside that directory, and it can be switched off via Kconfig.
>
>I personally think this is the most relevant point. This is just a new
>NIC driver, completely self-encapsulated. I quickly glanced over the

What do you mean by "self contained/encapsulated"? You are not using
any API outside the driver? Every driver API change that this NIC
is going to use is a burden. I did my share of changes like that in
the past so I have pretty good notion how painful it often is.


>code and it looks like it's not doing anything obviously bad. It really
>looks like an usual, legit, NIC driver.
>
>I don't think the fact that the NIC itself is hard to grasp for anyone

Distinguish "hard"/"impossible".


>outside <organization> makes a difference. Long time ago Greg noted
>that drivers has been merged for H/W known to have a _single_ existing
>instance (IIRC, I can't find the reference on top of my head, but back
>then was quite popular, I hope some other old guy could remember).
>
>To me, the maintainership burden is on Meta: Alex/Meta will have to
>handle bug report, breakages, user-complains (I guess this last would
>be the easier part ;). If he/they will not cope with the process we can
>simply revert the driver. I would be quite surprised if such situation
>should happen, but the impact from my PoV looks minimal.
>
>TL;DR: I don't see any good reason to not accept this - unless my quick
>glance was too quick and very wrong, but it looks like other has
>similar view.

Do you actually see any good reason to accept this? I mean, really,
could you spell out at least one benefit it brings for non-Meta user?
I see only gains for Meta and losses for the community.

Jiri Pirko April 8, 2024, 11:50 a.m. UTC | #34

Fri, Apr 05, 2024 at 08:38:25PM CEST, alexander.duyck@gmail.com wrote:
>On Fri, Apr 5, 2024 at 8:17 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
>>
>> On Fri, Apr 05, 2024 at 07:24:32AM -0700, Alexander Duyck wrote:
>> > > Alex already indicated new features are coming, changes to the core
>> > > code will be proposed. How should those be evaluated? Hypothetically
>> > > should fbnic be allowed to be the first implementation of something
>> > > invasive like Mina's DMABUF work? Google published an open userspace
>> > > for NCCL that people can (in theory at least) actually run. Meta would
>> > > not be able to do that. I would say that clearly crosses the line and
>> > > should not be accepted.
>> >
>> > Why not? Just because we are not commercially selling it doesn't mean
>> > we couldn't look at other solutions such as QEMU. If we were to
>> > provide a github repo with an emulation of the NIC would that be
>> > enough to satisfy the "commercial" requirement?
>>
>> My test is not "commercial", it is enabling open source ecosystem vs
>> benefiting only proprietary software.
>
>Sorry, that was where this started where Jiri was stating that we had
>to be selling this.

For the record, I never wrote that. Not sure why you repeat this over
this thread.

And for the record, I don't share Jason's concern about proprietary
userspace. From what I see, whoever is consuming the KAPI is free to do
that however he pleases.

But, this is completely distant from my concerns about this driver.


[...]


>> > I agree. We need a consistent set of standards. I just strongly
>> > believe commercial availability shouldn't be one of them.
>>
>> I never said commercial availability. I talked about open source vs
>> proprietary userspace. This is very standard kernel stuff.
>>
>> You have an unavailable NIC, so we know it is only ever operated with
>> Meta's proprietary kernel fork, supporting Meta's proprietary
>> userspace software. Where exactly is the open source?
>
>It depends on your definition of "unavailable". I could argue that for
>many most of the Mellanox NICs are also have limited availability as
>they aren't exactly easy to get a hold of without paying a hefty
>ransom.

Sorry, but I have to say this is ridiculous argument, really Alex.
Apples and oranges.

[...]

Jakub Kicinski April 8, 2024, 3:04 p.m. UTC | #35

On Sat, 6 Apr 2024 09:05:01 -0700 Alexander Duyck wrote:
> > I'm being very clear to say that there are some core changes should
> > not be accepted due to the kernel's open source ideology.  
> 
> Okay, on core changes I 100% agree. That is one of the reasons why we
> have the whole thing about any feature really needing to be enabled on
> at least 2 different vendor devices.

The "2 different vendor"/implementation thing came up before so
I wanted to provide more context for the less initiated readers.
We try  to judge features in terms of how reasonable the APIs are,
overall system design and how easy they will be to modify later
(e.g. uAPI, depth of core changes).

Risks are usually more pronounced for stack features like GSO partial,
XDP or AF_XDP. Although my (faulty) memory is that we started with
just mlx4 for XDP and other drivers quickly followed. But we did not
wait for more than an ACK from other vendors.

We almost never have a second implementation for HW-heavy features.
TLS immediately comes to mind, and merging it was probably the right
call given how many implementations were added since. For "full" IPsec
offload we still only have one vendor. Existing TCP ZC Rx (page
flipping) was presented as possible with two NICs but mlx5 was hacked
up and still doesn't support real HDS.

Most (if not all) of recent uAPI we added in netdev netlink were
accepted with a single implementation (be it Intel's work + drivers
or my work, and I often provide just a bnxt implementation).

Summary -> the number of implementations we require is decided on case
by case basis, depending on our level of confidence..

I don't know if this "2 implementations" rule is just a "mental
ellipsis" everyone understands to be a more nuanced rule in practice. 
But to an outsider it would seem very selectively enforced. In worst
case a fake rule to give us an excuse to nack stuff.

Alexander Duyck April 8, 2024, 3:26 p.m. UTC | #36

On Sun, Apr 7, 2024 at 11:18 PM Leon Romanovsky <leon@kernel.org> wrote:
>
> On Fri, Apr 05, 2024 at 08:41:11AM -0700, Alexander Duyck wrote:
> > On Thu, Apr 4, 2024 at 7:38 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> <...>
>
> > > > > Technical solution? Maybe if it's not a public device regression rules
> > > > > don't apply? Seems fairly reasonable.
> > > >
> > > > This is a hypothetical. This driver currently isn't changing anything
> > > > outside of itself. At this point the driver would only be build tested
> > > > by everyone else. They could just not include it in their Kconfig and
> > > > then out-of-sight, out-of-mind.
> > >
> > > Not changing does not mean not depending on existing behavior.
> > > Investigating and fixing properly even the hardest regressions in
> > > the stack is a bar that Meta can so easily clear. I don't understand
> > > why you are arguing.
> >
> > I wasn't saying the driver wouldn't be dependent on existing behavior.
> > I was saying that it was a hypothetical that Meta would be a "less
> > than cooperative user" and demand a revert.  It is also a hypothetical
> > that Linus wouldn't just propose a revert of the fbnic driver instead
> > of the API for the crime of being a "less than cooperative maintainer"
> > and  then give Meta the Nvidia treatment.
>
> It is very easy to be "less than cooperative maintainer" in netdev world.
> 1. Be vendor.
> 2. Propose ideas which are different.
> 3. Report for user-visible regression.
> 4. Ask for a fix from the patch author or demand a revert according to netdev rules/practice.
>
> And voilà, you are "less than cooperative maintainer".
>
> So in reality, the "hypothetical" is very close to the reality, unless
> Meta contribution will be treated as a special case.
>
> Thanks

How many cases of that have we had in the past? I'm honestly curious
as I don't actually have any reference.

Also as far as item 3 isn't hard for it to be a "user-visible"
regression if there are no users outside of the vendor that is
maintaining the driver to report it? Again I am assuming that the same
rules wouldn't necessarily apply in the vendor/consumer being one
entity case.

Also from my past experience the community doesn't give a damn about
1. It is only if 3 is being reported by actual users that somebody
would care. The fact is if vendors held that much power they would
have run roughshod over the community long ago as I know there are
vendors who love to provide one-off projects outside of the kernel and
usually have to work to get things into the upstream later and no
amount of complaining about "the users" will get their code accepted.
The users may complain but it is the vendors fault for that so the
community doesn't have to take action.

Alexander Duyck April 8, 2024, 3:46 p.m. UTC | #37

On Mon, Apr 8, 2024 at 4:51 AM Jiri Pirko <jiri@resnulli.us> wrote:
>
> Fri, Apr 05, 2024 at 08:38:25PM CEST, alexander.duyck@gmail.com wrote:
> >On Fri, Apr 5, 2024 at 8:17 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
> >>
> >> On Fri, Apr 05, 2024 at 07:24:32AM -0700, Alexander Duyck wrote:
> >> > > Alex already indicated new features are coming, changes to the core
> >> > > code will be proposed. How should those be evaluated? Hypothetically
> >> > > should fbnic be allowed to be the first implementation of something
> >> > > invasive like Mina's DMABUF work? Google published an open userspace
> >> > > for NCCL that people can (in theory at least) actually run. Meta would
> >> > > not be able to do that. I would say that clearly crosses the line and
> >> > > should not be accepted.
> >> >
> >> > Why not? Just because we are not commercially selling it doesn't mean
> >> > we couldn't look at other solutions such as QEMU. If we were to
> >> > provide a github repo with an emulation of the NIC would that be
> >> > enough to satisfy the "commercial" requirement?
> >>
> >> My test is not "commercial", it is enabling open source ecosystem vs
> >> benefiting only proprietary software.
> >
> >Sorry, that was where this started where Jiri was stating that we had
> >to be selling this.
>
> For the record, I never wrote that. Not sure why you repeat this over
> this thread.

Because you seem to be implying that the Meta NIC driver shouldn't be
included simply since it isn't going to be available outside of Meta.
The fact is Meta employs a number of kernel developers and as a result
of that there will be a number of kernel developers that will have
access to this NIC and likely do development on systems containing it.
In addition simply due to the size of the datacenters that we will be
populating there is actually a strong likelihood that there will be
more instances of this NIC running on Linux than there are of some
other vendor devices that have been allowed to have drivers in the
kernel.

So from what I can tell the only difference is if we are manufacturing
this for sale, or for personal use. Thus why I mention "commercial"
since the only difference from my perspective is the fact that we are
making it for our own use instead of selling it.

[...]

> >> > I agree. We need a consistent set of standards. I just strongly
> >> > believe commercial availability shouldn't be one of them.
> >>
> >> I never said commercial availability. I talked about open source vs
> >> proprietary userspace. This is very standard kernel stuff.
> >>
> >> You have an unavailable NIC, so we know it is only ever operated with
> >> Meta's proprietary kernel fork, supporting Meta's proprietary
> >> userspace software. Where exactly is the open source?
> >
> >It depends on your definition of "unavailable". I could argue that for
> >many most of the Mellanox NICs are also have limited availability as
> >they aren't exactly easy to get a hold of without paying a hefty
> >ransom.
>
> Sorry, but I have to say this is ridiculous argument, really Alex.
> Apples and oranges.

Really? So would you be making the same argument if it was
Nvidia/Mellanox pushing the driver and they were exclusively making it
just for Meta, Google, or some other big cloud provider? I suspect
not. If nothing else they likely wouldn't disclose the plan for
exclusive sales to get around this sort of thing. The fact is I know
many of the vendors make proprietary spins of their firmware and
hardware for specific customers. The way I see it this patchset is
being rejected as I was too honest about the general plan and use case
for it.

This is what I am getting at. It just seems like we are playing games
with semantics where if it is a vendor making the arrangement then it
is okay for them to make hardware that is inaccessible to most, but if
it is Meta then somehow it isn't.

Jiri Pirko April 8, 2024, 4:51 p.m. UTC | #38

Mon, Apr 08, 2024 at 05:46:35PM CEST, alexander.duyck@gmail.com wrote:
>On Mon, Apr 8, 2024 at 4:51 AM Jiri Pirko <jiri@resnulli.us> wrote:
>>
>> Fri, Apr 05, 2024 at 08:38:25PM CEST, alexander.duyck@gmail.com wrote:
>> >On Fri, Apr 5, 2024 at 8:17 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
>> >>
>> >> On Fri, Apr 05, 2024 at 07:24:32AM -0700, Alexander Duyck wrote:
>> >> > > Alex already indicated new features are coming, changes to the core
>> >> > > code will be proposed. How should those be evaluated? Hypothetically
>> >> > > should fbnic be allowed to be the first implementation of something
>> >> > > invasive like Mina's DMABUF work? Google published an open userspace
>> >> > > for NCCL that people can (in theory at least) actually run. Meta would
>> >> > > not be able to do that. I would say that clearly crosses the line and
>> >> > > should not be accepted.
>> >> >
>> >> > Why not? Just because we are not commercially selling it doesn't mean
>> >> > we couldn't look at other solutions such as QEMU. If we were to
>> >> > provide a github repo with an emulation of the NIC would that be
>> >> > enough to satisfy the "commercial" requirement?
>> >>
>> >> My test is not "commercial", it is enabling open source ecosystem vs
>> >> benefiting only proprietary software.
>> >
>> >Sorry, that was where this started where Jiri was stating that we had
>> >to be selling this.
>>
>> For the record, I never wrote that. Not sure why you repeat this over
>> this thread.
>
>Because you seem to be implying that the Meta NIC driver shouldn't be
>included simply since it isn't going to be available outside of Meta.
>The fact is Meta employs a number of kernel developers and as a result
>of that there will be a number of kernel developers that will have
>access to this NIC and likely do development on systems containing it.
>In addition simply due to the size of the datacenters that we will be
>populating there is actually a strong likelihood that there will be
>more instances of this NIC running on Linux than there are of some
>other vendor devices that have been allowed to have drivers in the
>kernel.

So? The gain for community is still 0. No matter how many instances is
private hw you privately have. Just have a private driver.


>
>So from what I can tell the only difference is if we are manufacturing
>this for sale, or for personal use. Thus why I mention "commercial"
>since the only difference from my perspective is the fact that we are
>making it for our own use instead of selling it.

Give it for free.


>
>[...]
>
>> >> > I agree. We need a consistent set of standards. I just strongly
>> >> > believe commercial availability shouldn't be one of them.
>> >>
>> >> I never said commercial availability. I talked about open source vs
>> >> proprietary userspace. This is very standard kernel stuff.
>> >>
>> >> You have an unavailable NIC, so we know it is only ever operated with
>> >> Meta's proprietary kernel fork, supporting Meta's proprietary
>> >> userspace software. Where exactly is the open source?
>> >
>> >It depends on your definition of "unavailable". I could argue that for
>> >many most of the Mellanox NICs are also have limited availability as
>> >they aren't exactly easy to get a hold of without paying a hefty
>> >ransom.
>>
>> Sorry, but I have to say this is ridiculous argument, really Alex.
>> Apples and oranges.
>
>Really? So would you be making the same argument if it was
>Nvidia/Mellanox pushing the driver and they were exclusively making it
>just for Meta, Google, or some other big cloud provider? I suspect

Heh, what ifs :) Anyway, chance that happens is very close to 0.


>not. If nothing else they likely wouldn't disclose the plan for
>exclusive sales to get around this sort of thing. The fact is I know
>many of the vendors make proprietary spins of their firmware and
>hardware for specific customers. The way I see it this patchset is
>being rejected as I was too honest about the general plan and use case
>for it.
>
>This is what I am getting at. It just seems like we are playing games
>with semantics where if it is a vendor making the arrangement then it
>is okay for them to make hardware that is inaccessible to most, but if
>it is Meta then somehow it isn't.

John Fastabend April 8, 2024, 5:32 p.m. UTC | #39

Jiri Pirko wrote:
> Mon, Apr 08, 2024 at 05:46:35PM CEST, alexander.duyck@gmail.com wrote:
> >On Mon, Apr 8, 2024 at 4:51 AM Jiri Pirko <jiri@resnulli.us> wrote:
> >>
> >> Fri, Apr 05, 2024 at 08:38:25PM CEST, alexander.duyck@gmail.com wrote:
> >> >On Fri, Apr 5, 2024 at 8:17 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
> >> >>
> >> >> On Fri, Apr 05, 2024 at 07:24:32AM -0700, Alexander Duyck wrote:
> >> >> > > Alex already indicated new features are coming, changes to the core
> >> >> > > code will be proposed. How should those be evaluated? Hypothetically
> >> >> > > should fbnic be allowed to be the first implementation of something
> >> >> > > invasive like Mina's DMABUF work? Google published an open userspace
> >> >> > > for NCCL that people can (in theory at least) actually run. Meta would
> >> >> > > not be able to do that. I would say that clearly crosses the line and
> >> >> > > should not be accepted.
> >> >> >
> >> >> > Why not? Just because we are not commercially selling it doesn't mean
> >> >> > we couldn't look at other solutions such as QEMU. If we were to
> >> >> > provide a github repo with an emulation of the NIC would that be
> >> >> > enough to satisfy the "commercial" requirement?
> >> >>
> >> >> My test is not "commercial", it is enabling open source ecosystem vs
> >> >> benefiting only proprietary software.
> >> >
> >> >Sorry, that was where this started where Jiri was stating that we had
> >> >to be selling this.
> >>
> >> For the record, I never wrote that. Not sure why you repeat this over
> >> this thread.
> >
> >Because you seem to be implying that the Meta NIC driver shouldn't be
> >included simply since it isn't going to be available outside of Meta.
> >The fact is Meta employs a number of kernel developers and as a result
> >of that there will be a number of kernel developers that will have
> >access to this NIC and likely do development on systems containing it.
> >In addition simply due to the size of the datacenters that we will be
> >populating there is actually a strong likelihood that there will be
> >more instances of this NIC running on Linux than there are of some
> >other vendor devices that have been allowed to have drivers in the
> >kernel.
> 
> So? The gain for community is still 0. No matter how many instances is
> private hw you privately have. Just have a private driver.

The gain is the same as if company X makes a card and sells it
exclusively to datacenter provider Y. We know this happens.
Vendors would happily spin up a NIC if a DC with scale like this
would pay for it. They just don't advertise it in patch 0/X,
"adding device for cloud provider foo".

There is no difference here. We gain developers, we gain insights,
learnings and Linux and OSS drivers are running on another big
DC. They improve things and find bugs they upstream them its a win.

The opposite is also true if we exclude a driver/NIC HW that is
running on major DCs we lose a lot of insight, experience, value.
DCs are all starting to build their own hardware if we lose this
section of HW we lose those developers too. We are less likely
to get any advances they come up with. I think you have it backwards.
Eventually Linux networking becomes either commodity and irrelevant
for DC deployments.

So I strongly disagree we lose by excluding drivers and win by
bringing it in.

> 
> 
> >
> >So from what I can tell the only difference is if we are manufacturing
> >this for sale, or for personal use. Thus why I mention "commercial"
> >since the only difference from my perspective is the fact that we are
> >making it for our own use instead of selling it.
> 
> Give it for free.

Huh?

> 
> 
> >
> >[...]
> >
> >> >> > I agree. We need a consistent set of standards. I just strongly
> >> >> > believe commercial availability shouldn't be one of them.
> >> >>
> >> >> I never said commercial availability. I talked about open source vs
> >> >> proprietary userspace. This is very standard kernel stuff.
> >> >>
> >> >> You have an unavailable NIC, so we know it is only ever operated with
> >> >> Meta's proprietary kernel fork, supporting Meta's proprietary
> >> >> userspace software. Where exactly is the open source?
> >> >
> >> >It depends on your definition of "unavailable". I could argue that for
> >> >many most of the Mellanox NICs are also have limited availability as
> >> >they aren't exactly easy to get a hold of without paying a hefty
> >> >ransom.
> >>
> >> Sorry, but I have to say this is ridiculous argument, really Alex.
> >> Apples and oranges.
> >
> >Really? So would you be making the same argument if it was
> >Nvidia/Mellanox pushing the driver and they were exclusively making it
> >just for Meta, Google, or some other big cloud provider? I suspect
> 
> Heh, what ifs :) Anyway, chance that happens is very close to 0.
> 
> 
> >not. If nothing else they likely wouldn't disclose the plan for
> >exclusive sales to get around this sort of thing. The fact is I know
> >many of the vendors make proprietary spins of their firmware and
> >hardware for specific customers. The way I see it this patchset is
> >being rejected as I was too honest about the general plan and use case
> >for it.
> >
> >This is what I am getting at. It just seems like we are playing games
> >with semantics where if it is a vendor making the arrangement then it
> >is okay for them to make hardware that is inaccessible to most, but if
> >it is Meta then somehow it isn't.

Jason Gunthorpe April 8, 2024, 6:16 p.m. UTC | #40

On Mon, Apr 08, 2024 at 08:46:35AM -0700, Alexander Duyck wrote:

> Really? So would you be making the same argument if it was
> Nvidia/Mellanox pushing the driver and they were exclusively making it
> just for Meta, Google, or some other big cloud provider? 

At least I would, yes.

> I suspect not. If nothing else they likely wouldn't disclose the
> plan for exclusive sales to get around this sort of thing. The fact
> is I know many of the vendors make proprietary spins of their
> firmware and hardware for specific customers. The way I see it this
> patchset is being rejected as I was too honest about the general
> plan and use case for it.

Regrettably this does happen quietly in the kernel. If you know the
right behind the scenes stuff you can start to be aware. That doesn't
mean it is aligned with community values or should be done/encouraged.

> This is what I am getting at. It just seems like we are playing games
> with semantics where if it is a vendor making the arrangement then it
> is okay for them to make hardware that is inaccessible to most, but if
> it is Meta then somehow it isn't.

With Meta it is obvious what is happening, and what is benefiting. If
a COTS vendor does it then we have to take a leap of faith a unique
feature will have wider applications - and many would require to see
an open source userspace to boot strap that. I don't think we always
get it right. Value judgements are often a bit murky like that.

Jason

Leon Romanovsky April 8, 2024, 6:41 p.m. UTC | #41

On Mon, Apr 08, 2024 at 08:26:33AM -0700, Alexander Duyck wrote:
> On Sun, Apr 7, 2024 at 11:18 PM Leon Romanovsky <leon@kernel.org> wrote:
> >
> > On Fri, Apr 05, 2024 at 08:41:11AM -0700, Alexander Duyck wrote:
> > > On Thu, Apr 4, 2024 at 7:38 PM Jakub Kicinski <kuba@kernel.org> wrote:
> >
> > <...>
> >
> > > > > > Technical solution? Maybe if it's not a public device regression rules
> > > > > > don't apply? Seems fairly reasonable.
> > > > >
> > > > > This is a hypothetical. This driver currently isn't changing anything
> > > > > outside of itself. At this point the driver would only be build tested
> > > > > by everyone else. They could just not include it in their Kconfig and
> > > > > then out-of-sight, out-of-mind.
> > > >
> > > > Not changing does not mean not depending on existing behavior.
> > > > Investigating and fixing properly even the hardest regressions in
> > > > the stack is a bar that Meta can so easily clear. I don't understand
> > > > why you are arguing.
> > >
> > > I wasn't saying the driver wouldn't be dependent on existing behavior.
> > > I was saying that it was a hypothetical that Meta would be a "less
> > > than cooperative user" and demand a revert.  It is also a hypothetical
> > > that Linus wouldn't just propose a revert of the fbnic driver instead
> > > of the API for the crime of being a "less than cooperative maintainer"
> > > and  then give Meta the Nvidia treatment.
> >
> > It is very easy to be "less than cooperative maintainer" in netdev world.
> > 1. Be vendor.
> > 2. Propose ideas which are different.
> > 3. Report for user-visible regression.
> > 4. Ask for a fix from the patch author or demand a revert according to netdev rules/practice.
> >
> > And voilà, you are "less than cooperative maintainer".
> >
> > So in reality, the "hypothetical" is very close to the reality, unless
> > Meta contribution will be treated as a special case.
> >
> > Thanks
> 
> How many cases of that have we had in the past? I'm honestly curious
> as I don't actually have any reference.

And this is the problem, you don't have "any reference" and accurate
knowledge what happened, but you are saying "less than cooperative
maintainer".

> 
> Also as far as item 3 isn't hard for it to be a "user-visible"
> regression if there are no users outside of the vendor that is
> maintaining the driver to report it? 

This wasn't the case. It was change in core code, which broke specific
version of vagrant. Vendor caught it simply by luck.

> Again I am assuming that the same rules wouldn't necessarily apply
> in the vendor/consumer being one entity case.
> 
> Also from my past experience the community doesn't give a damn about
> 1. It is only if 3 is being reported by actual users that somebody
> would care. The fact is if vendors held that much power they would
> have run roughshod over the community long ago as I know there are
> vendors who love to provide one-off projects outside of the kernel and
> usually have to work to get things into the upstream later and no
> amount of complaining about "the users" will get their code accepted.
> The users may complain but it is the vendors fault for that so the
> community doesn't have to take action.

You are taking it to completely wrong direction with your assumptions.
The reality is that regression was reported by real user without any
vendor code involved. This is why the end result was so bad for all parties.

So no, you can get "less than cooperative maintainer" label really easy in
current environment.

Thanks

Mina Almasry April 8, 2024, 7:50 p.m. UTC | #42

On Sat, Apr 6, 2024 at 9:05 AM Alexander Duyck
<alexander.duyck@gmail.com> wrote:
> > > > You have an unavailable NIC, so we know it is only ever operated with
> > > > Meta's proprietary kernel fork, supporting Meta's proprietary
> > > > userspace software. Where exactly is the open source?
> > >
> > > It depends on your definition of "unavailable". I could argue that for
> > > many most of the Mellanox NICs are also have limited availability as
> > > they aren't exactly easy to get a hold of without paying a hefty
> > > ransom.
> >
> > And GNIC's that run Mina's series are completely unavailable right
> > now. That is still a big different from a temporary issue to a
> > permanent structural intention of the manufacturer.
>
> I'm assuming it is some sort of firmware functionality that is needed
> to enable it? One thing with our design is that the firmware actually
> has minimal functionality. Basically it is the liaison between the
> BMC, Host, and the MAC. Otherwise it has no role to play in the
> control path so when the driver is loaded it is running the show.
>

Sorry, I didn't realize our devmem TCP work was mentioned in this
context. Just jumping in to say, no, this is not the case, devmem TCP
does not require firmware functionality AFAICT. The selftest provided
with the devmem TCP series should work on any driver that:

1. supports header split/flow steering/rss/page pool (I guess this
support may need firmware changes...).

2. supports the new queue configuration ndos:
https://patchwork.kernel.org/project/netdevbpf/patch/20240403002053.2376017-2-almasrymina@google.com/

3. supports the new netmem page_pool APIs:
https://patchwork.kernel.org/project/netdevbpf/patch/20240403002053.2376017-8-almasrymina@google.com/

No firmware changes specific to devmem TCP are needed, AFAICT. All
these are driver changes. I also always publish a full branch with all
the GVE changes so reviewers can check if there is anything too
specific to GVE that we're doing, so far there are been no issues, and
to be honest I can't see anything specific that we do with GVE for
devmem TCP:

https://github.com/mina/linux/commits/tcpdevmem-v7/

In fact, GVE is IMO a relatively feature light driver, and the fact
that GVE can do devmem TCP IMO makes it easier for fancier NICs to
also do devmem TCP.

I'm working with folks interested in extending devmem TCP to their
drivers, and they may follow up with patches after the series is
merged (or before). The only reason I haven't implemented devmem TCP
for multiple different drivers is a logistical one. I don't have
access to hardware that supports all these prerequisite features other
than GVE.

Alexander Duyck April 8, 2024, 8:43 p.m. UTC | #43

On Mon, Apr 8, 2024 at 11:41 AM Leon Romanovsky <leon@kernel.org> wrote:
>
> On Mon, Apr 08, 2024 at 08:26:33AM -0700, Alexander Duyck wrote:
> > On Sun, Apr 7, 2024 at 11:18 PM Leon Romanovsky <leon@kernel.org> wrote:
> > >
> > > On Fri, Apr 05, 2024 at 08:41:11AM -0700, Alexander Duyck wrote:
> > > > On Thu, Apr 4, 2024 at 7:38 PM Jakub Kicinski <kuba@kernel.org> wrote:
> > >
> > > <...>
> > >
> > > > > > > Technical solution? Maybe if it's not a public device regression rules
> > > > > > > don't apply? Seems fairly reasonable.
> > > > > >
> > > > > > This is a hypothetical. This driver currently isn't changing anything
> > > > > > outside of itself. At this point the driver would only be build tested
> > > > > > by everyone else. They could just not include it in their Kconfig and
> > > > > > then out-of-sight, out-of-mind.
> > > > >
> > > > > Not changing does not mean not depending on existing behavior.
> > > > > Investigating and fixing properly even the hardest regressions in
> > > > > the stack is a bar that Meta can so easily clear. I don't understand
> > > > > why you are arguing.
> > > >
> > > > I wasn't saying the driver wouldn't be dependent on existing behavior.
> > > > I was saying that it was a hypothetical that Meta would be a "less
> > > > than cooperative user" and demand a revert.  It is also a hypothetical
> > > > that Linus wouldn't just propose a revert of the fbnic driver instead
> > > > of the API for the crime of being a "less than cooperative maintainer"
> > > > and  then give Meta the Nvidia treatment.
> > >
> > > It is very easy to be "less than cooperative maintainer" in netdev world.
> > > 1. Be vendor.
> > > 2. Propose ideas which are different.
> > > 3. Report for user-visible regression.
> > > 4. Ask for a fix from the patch author or demand a revert according to netdev rules/practice.
> > >
> > > And voilà, you are "less than cooperative maintainer".
> > >
> > > So in reality, the "hypothetical" is very close to the reality, unless
> > > Meta contribution will be treated as a special case.
> > >
> > > Thanks
> >
> > How many cases of that have we had in the past? I'm honestly curious
> > as I don't actually have any reference.
>
> And this is the problem, you don't have "any reference" and accurate
> knowledge what happened, but you are saying "less than cooperative
> maintainer".

By "less than cooperative maintainer" I was referring to the scenario
where somebody is maintaining something unique to them, such as the
Meta Host NIC, and not willing to work with the community to fix it
and instead just demanding a revert of a change. It doesn't seem like
it would be too much to ask to work with the author on a fix for the
problem as long as the maintainer is willing to work with the author
on putting together and testing the fix.

With that said if the upstream version of things aren't broken then it
doesn't matter. It shouldn't be expected of the community to maintain
any proprietary code that wasn't accepted upstream.

> >
> > Also as far as item 3 isn't hard for it to be a "user-visible"
> > regression if there are no users outside of the vendor that is
> > maintaining the driver to report it?
>
> This wasn't the case. It was change in core code, which broke specific
> version of vagrant. Vendor caught it simply by luck.

Any more info on this? Without context it is hard to say one way or the other.

I know I have seen my fair share of hot issues such as when the
introduction of the tracing framework was corrupting the NVRAM on
e1000e NICs.[1] It got everyone's attention when it essentially
bricked one of Linus's systems. I don't recall us doing a full revert
on function tracing as a result, but I believe it was flagged as
broken until it could be resolved. So depending on the situation there
are cases where asking for a fix or revert might be appropriate.

> > Again I am assuming that the same rules wouldn't necessarily apply
> > in the vendor/consumer being one entity case.
> >
> > Also from my past experience the community doesn't give a damn about
> > 1. It is only if 3 is being reported by actual users that somebody
> > would care. The fact is if vendors held that much power they would
> > have run roughshod over the community long ago as I know there are
> > vendors who love to provide one-off projects outside of the kernel and
> > usually have to work to get things into the upstream later and no
> > amount of complaining about "the users" will get their code accepted.
> > The users may complain but it is the vendors fault for that so the
> > community doesn't have to take action.
>
> You are taking it to completely wrong direction with your assumptions.
> The reality is that regression was reported by real user without any
> vendor code involved. This is why the end result was so bad for all parties.

Okay, but that doesn't tie into what is going on here. In this case
"vendor" == "user". Like I was saying the community generally cares
about the user so 3 would be the important case assuming they are
using a stock kernel and driver and not hiding behind the vendor
expecting some sort of proprietary fix. If they are using some
proprietary stuff behind the scenes, then tough luck.

> So no, you can get "less than cooperative maintainer" label really easy in
> current environment.

I didn't say you couldn't. Without context I cannot say if it was
deserved or not. I know in the example I cited above Intel had to add
changes to the e1000e driver to make the NVRAM non-writable until the
problem patch was found. So Intel was having to patch to fix an issue
it didn't introduce and deal with the negative press and blow-back
from a function tracing patch that was damaging NICs.

The point I was trying to make is that if you are the only owner of
something, and not willing to work with the community as a maintainer
it becomes much easier for the community to just revert the driver
rather than try to change the code if you aren't willing to work with
them. Thus the "less than cooperative" part. The argument being made
seems to be that once something is in the kernel it is forever and if
we get it in and then refuse to work with the community it couldn't be
reverted. I am arguing that isn't the case, especially if Meta were to
become a "less than cooperative maintainer" for a device that is
primarily only going to be available in Meta data centers.

Thanks,

- Alex

[1]: https://lwn.net/Articles/304105/

Florian Fainelli April 8, 2024, 9:36 p.m. UTC | #44

On 4/8/24 09:51, Jiri Pirko wrote:
> Mon, Apr 08, 2024 at 05:46:35PM CEST, alexander.duyck@gmail.com wrote:
>> On Mon, Apr 8, 2024 at 4:51 AM Jiri Pirko <jiri@resnulli.us> wrote:
>>>
>>> Fri, Apr 05, 2024 at 08:38:25PM CEST, alexander.duyck@gmail.com wrote:
>>>> On Fri, Apr 5, 2024 at 8:17 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
>>>>>
>>>>> On Fri, Apr 05, 2024 at 07:24:32AM -0700, Alexander Duyck wrote:
>>>>>>> Alex already indicated new features are coming, changes to the core
>>>>>>> code will be proposed. How should those be evaluated? Hypothetically
>>>>>>> should fbnic be allowed to be the first implementation of something
>>>>>>> invasive like Mina's DMABUF work? Google published an open userspace
>>>>>>> for NCCL that people can (in theory at least) actually run. Meta would
>>>>>>> not be able to do that. I would say that clearly crosses the line and
>>>>>>> should not be accepted.
>>>>>>
>>>>>> Why not? Just because we are not commercially selling it doesn't mean
>>>>>> we couldn't look at other solutions such as QEMU. If we were to
>>>>>> provide a github repo with an emulation of the NIC would that be
>>>>>> enough to satisfy the "commercial" requirement?
>>>>>
>>>>> My test is not "commercial", it is enabling open source ecosystem vs
>>>>> benefiting only proprietary software.
>>>>
>>>> Sorry, that was where this started where Jiri was stating that we had
>>>> to be selling this.
>>>
>>> For the record, I never wrote that. Not sure why you repeat this over
>>> this thread.
>>
>> Because you seem to be implying that the Meta NIC driver shouldn't be
>> included simply since it isn't going to be available outside of Meta.
>> The fact is Meta employs a number of kernel developers and as a result
>> of that there will be a number of kernel developers that will have
>> access to this NIC and likely do development on systems containing it.
>> In addition simply due to the size of the datacenters that we will be
>> populating there is actually a strong likelihood that there will be
>> more instances of this NIC running on Linux than there are of some
>> other vendor devices that have been allowed to have drivers in the
>> kernel.
> 
> So? The gain for community is still 0. No matter how many instances is
> private hw you privately have. Just have a private driver.

I am amazed and not in a good way at how far this has gone, truly.

This really is akin to saying that any non-zero driver count to maintain 
is a burden on the community. Which is true, by definition, but if the 
goal was to build something for no users, then clearly this is the wrong 
place to be in, or too late. The systems with no users are the best to 
maintain, that is for sure.

If the practical concern is wen you make tree wide API change that fbnic 
happens to use, and you have yet another driver (fbnic) to convert, so 
what? Work with Alex ahead of time, get his driver to be modified, post 
the patch series. Even if Alex happens to move on and stop being 
responsible and there is no maintainer, so what? Give the driver a 
depreciation window for someone to step in, rip it, end of story. 
Nothing new, so what has specifically changed as of April 4th 2024 to 
oppose such strong rejection?

Like it was said, there are tons of drivers in the Linux kernel that 
have a single user, this one might have a few more than a single one, 
that should be good enough.

What the heck is going on?

Florian Fainelli April 8, 2024, 9:49 p.m. UTC | #45

On 4/8/24 13:43, Alexander Duyck wrote:
>>>
>>> Also as far as item 3 isn't hard for it to be a "user-visible"
>>> regression if there are no users outside of the vendor that is
>>> maintaining the driver to report it?
>>
>> This wasn't the case. It was change in core code, which broke specific
>> version of vagrant. Vendor caught it simply by luck.
> 
> Any more info on this? Without context it is hard to say one way or the other.

Believe this is the thread in question:

https://lore.kernel.org/netdev/MN2PR12MB44863139E562A59329E89DBEB982A@MN2PR12MB4486.namprd12.prod.outlook.com/

Florian Fainelli April 8, 2024, 9:52 p.m. UTC | #46

On 4/8/24 14:49, Florian Fainelli wrote:
> On 4/8/24 13:43, Alexander Duyck wrote:
>>>>
>>>> Also as far as item 3 isn't hard for it to be a "user-visible"
>>>> regression if there are no users outside of the vendor that is
>>>> maintaining the driver to report it?
>>>
>>> This wasn't the case. It was change in core code, which broke specific
>>> version of vagrant. Vendor caught it simply by luck.
>>
>> Any more info on this? Without context it is hard to say one way or 
>> the other.
> 
> Believe this is the thread in question:
> 
> https://lore.kernel.org/netdev/MN2PR12MB44863139E562A59329E89DBEB982A@MN2PR12MB4486.namprd12.prod.outlook.com/

And the follow up:

https://lore.kernel.org/netdev/14459261ea9f9c7d7dfb28eb004ce8734fa83ade.1704185904.git.leonro@nvidia.com/

Leon Romanovsky April 9, 2024, 8:18 a.m. UTC | #47

On Mon, Apr 08, 2024 at 01:43:28PM -0700, Alexander Duyck wrote:
> On Mon, Apr 8, 2024 at 11:41 AM Leon Romanovsky <leon@kernel.org> wrote:
> >
> > On Mon, Apr 08, 2024 at 08:26:33AM -0700, Alexander Duyck wrote:
> > > On Sun, Apr 7, 2024 at 11:18 PM Leon Romanovsky <leon@kernel.org> wrote:
> > > >
> > > > On Fri, Apr 05, 2024 at 08:41:11AM -0700, Alexander Duyck wrote:
> > > > > On Thu, Apr 4, 2024 at 7:38 PM Jakub Kicinski <kuba@kernel.org> wrote:
> > > >
> > > > <...>
> > > >
> > > > > > > > Technical solution? Maybe if it's not a public device regression rules
> > > > > > > > don't apply? Seems fairly reasonable.
> > > > > > >
> > > > > > > This is a hypothetical. This driver currently isn't changing anything
> > > > > > > outside of itself. At this point the driver would only be build tested
> > > > > > > by everyone else. They could just not include it in their Kconfig and
> > > > > > > then out-of-sight, out-of-mind.
> > > > > >
> > > > > > Not changing does not mean not depending on existing behavior.
> > > > > > Investigating and fixing properly even the hardest regressions in
> > > > > > the stack is a bar that Meta can so easily clear. I don't understand
> > > > > > why you are arguing.
> > > > >
> > > > > I wasn't saying the driver wouldn't be dependent on existing behavior.
> > > > > I was saying that it was a hypothetical that Meta would be a "less
> > > > > than cooperative user" and demand a revert.  It is also a hypothetical
> > > > > that Linus wouldn't just propose a revert of the fbnic driver instead
> > > > > of the API for the crime of being a "less than cooperative maintainer"
> > > > > and  then give Meta the Nvidia treatment.
> > > >
> > > > It is very easy to be "less than cooperative maintainer" in netdev world.
> > > > 1. Be vendor.
> > > > 2. Propose ideas which are different.
> > > > 3. Report for user-visible regression.
> > > > 4. Ask for a fix from the patch author or demand a revert according to netdev rules/practice.
> > > >
> > > > And voilà, you are "less than cooperative maintainer".
> > > >
> > > > So in reality, the "hypothetical" is very close to the reality, unless
> > > > Meta contribution will be treated as a special case.
> > > >
> > > > Thanks
> > >
> > > How many cases of that have we had in the past? I'm honestly curious
> > > as I don't actually have any reference.
> >
> > And this is the problem, you don't have "any reference" and accurate
> > knowledge what happened, but you are saying "less than cooperative
> > maintainer".

<...>

> Any more info on this? Without context it is hard to say one way or the other.

<...>

> I didn't say you couldn't. Without context I cannot say if it was
> deserved or not. 

Florian gave links to the context, so I'll skip this part.

In this thread, Jakub tried to revive the discussion about it.
https://lore.kernel.org/netdev/20240326133412.47cf6d99@kernel.org/

<...>

> The point I was trying to make is that if you are the only owner of
> something, and not willing to work with the community as a maintainer

Like Jakub, I don't understand why you are talking about regressions in
the driver, while you brought the label of "less than cooperative maintainer"
and asked for "then give Meta the Nvidia treatment".

I don't want to get into the discussion about if this driver should be
accepted or not.

I'm just asking to stop label people and companies based on descriptions
from other people, but rely on facts.

Thanks

Jiri Pirko April 9, 2024, 10:56 a.m. UTC | #48

Mon, Apr 08, 2024 at 11:36:42PM CEST, f.fainelli@gmail.com wrote:
>On 4/8/24 09:51, Jiri Pirko wrote:
>> Mon, Apr 08, 2024 at 05:46:35PM CEST, alexander.duyck@gmail.com wrote:
>> > On Mon, Apr 8, 2024 at 4:51 AM Jiri Pirko <jiri@resnulli.us> wrote:
>> > > 
>> > > Fri, Apr 05, 2024 at 08:38:25PM CEST, alexander.duyck@gmail.com wrote:
>> > > > On Fri, Apr 5, 2024 at 8:17 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
>> > > > > 
>> > > > > On Fri, Apr 05, 2024 at 07:24:32AM -0700, Alexander Duyck wrote:
>> > > > > > > Alex already indicated new features are coming, changes to the core
>> > > > > > > code will be proposed. How should those be evaluated? Hypothetically
>> > > > > > > should fbnic be allowed to be the first implementation of something
>> > > > > > > invasive like Mina's DMABUF work? Google published an open userspace
>> > > > > > > for NCCL that people can (in theory at least) actually run. Meta would
>> > > > > > > not be able to do that. I would say that clearly crosses the line and
>> > > > > > > should not be accepted.
>> > > > > > 
>> > > > > > Why not? Just because we are not commercially selling it doesn't mean
>> > > > > > we couldn't look at other solutions such as QEMU. If we were to
>> > > > > > provide a github repo with an emulation of the NIC would that be
>> > > > > > enough to satisfy the "commercial" requirement?
>> > > > > 
>> > > > > My test is not "commercial", it is enabling open source ecosystem vs
>> > > > > benefiting only proprietary software.
>> > > > 
>> > > > Sorry, that was where this started where Jiri was stating that we had
>> > > > to be selling this.
>> > > 
>> > > For the record, I never wrote that. Not sure why you repeat this over
>> > > this thread.
>> > 
>> > Because you seem to be implying that the Meta NIC driver shouldn't be
>> > included simply since it isn't going to be available outside of Meta.
>> > The fact is Meta employs a number of kernel developers and as a result
>> > of that there will be a number of kernel developers that will have
>> > access to this NIC and likely do development on systems containing it.
>> > In addition simply due to the size of the datacenters that we will be
>> > populating there is actually a strong likelihood that there will be
>> > more instances of this NIC running on Linux than there are of some
>> > other vendor devices that have been allowed to have drivers in the
>> > kernel.
>> 
>> So? The gain for community is still 0. No matter how many instances is
>> private hw you privately have. Just have a private driver.
>
>I am amazed and not in a good way at how far this has gone, truly.
>
>This really is akin to saying that any non-zero driver count to maintain is a
>burden on the community. Which is true, by definition, but if the goal was to
>build something for no users, then clearly this is the wrong place to be in,
>or too late. The systems with no users are the best to maintain, that is for
>sure.
>
>If the practical concern is wen you make tree wide API change that fbnic
>happens to use, and you have yet another driver (fbnic) to convert, so what?
>Work with Alex ahead of time, get his driver to be modified, post the patch
>series. Even if Alex happens to move on and stop being responsible and there
>is no maintainer, so what? Give the driver a depreciation window for someone
>to step in, rip it, end of story. Nothing new, so what has specifically
>changed as of April 4th 2024 to oppose such strong rejection?

How you describe the flow of internal API change is totally distant from
reality. Really, like no part is correct:
1) API change is responsibility of the person doing it. Imagine working
   with 40 driver maintainers for every API change. I did my share of
   API changes in the past, maintainer were only involved to be cced.
2) To deprecate driver because the maintainer is not responsible. Can
   you please show me one example when that happened in the past?


>
>Like it was said, there are tons of drivers in the Linux kernel that have a
>single user, this one might have a few more than a single one, that should be
>good enough.

This will have exactly 0. That is my point. Why to merge something
nobody will ever use?


>
>What the heck is going on?
>-- 
>Florian
>

Jiri Pirko April 9, 2024, 11:01 a.m. UTC | #49

Mon, Apr 08, 2024 at 07:32:59PM CEST, john.fastabend@gmail.com wrote:
>Jiri Pirko wrote:
>> Mon, Apr 08, 2024 at 05:46:35PM CEST, alexander.duyck@gmail.com wrote:
>> >On Mon, Apr 8, 2024 at 4:51 AM Jiri Pirko <jiri@resnulli.us> wrote:
>> >>
>> >> Fri, Apr 05, 2024 at 08:38:25PM CEST, alexander.duyck@gmail.com wrote:
>> >> >On Fri, Apr 5, 2024 at 8:17 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
>> >> >>
>> >> >> On Fri, Apr 05, 2024 at 07:24:32AM -0700, Alexander Duyck wrote:
>> >> >> > > Alex already indicated new features are coming, changes to the core
>> >> >> > > code will be proposed. How should those be evaluated? Hypothetically
>> >> >> > > should fbnic be allowed to be the first implementation of something
>> >> >> > > invasive like Mina's DMABUF work? Google published an open userspace
>> >> >> > > for NCCL that people can (in theory at least) actually run. Meta would
>> >> >> > > not be able to do that. I would say that clearly crosses the line and
>> >> >> > > should not be accepted.
>> >> >> >
>> >> >> > Why not? Just because we are not commercially selling it doesn't mean
>> >> >> > we couldn't look at other solutions such as QEMU. If we were to
>> >> >> > provide a github repo with an emulation of the NIC would that be
>> >> >> > enough to satisfy the "commercial" requirement?
>> >> >>
>> >> >> My test is not "commercial", it is enabling open source ecosystem vs
>> >> >> benefiting only proprietary software.
>> >> >
>> >> >Sorry, that was where this started where Jiri was stating that we had
>> >> >to be selling this.
>> >>
>> >> For the record, I never wrote that. Not sure why you repeat this over
>> >> this thread.
>> >
>> >Because you seem to be implying that the Meta NIC driver shouldn't be
>> >included simply since it isn't going to be available outside of Meta.
>> >The fact is Meta employs a number of kernel developers and as a result
>> >of that there will be a number of kernel developers that will have
>> >access to this NIC and likely do development on systems containing it.
>> >In addition simply due to the size of the datacenters that we will be
>> >populating there is actually a strong likelihood that there will be
>> >more instances of this NIC running on Linux than there are of some
>> >other vendor devices that have been allowed to have drivers in the
>> >kernel.
>> 
>> So? The gain for community is still 0. No matter how many instances is
>> private hw you privately have. Just have a private driver.
>
>The gain is the same as if company X makes a card and sells it
>exclusively to datacenter provider Y. We know this happens.

Different story. The driver is still the same. Perhaps only some parts
of it are tailored to fit one person's need, maybe. But here, the whole
thing is obviously is targeted to one person. Can't you see the scale on
which these are different?


>Vendors would happily spin up a NIC if a DC with scale like this
>would pay for it. They just don't advertise it in patch 0/X,
>"adding device for cloud provider foo".
>
>There is no difference here. We gain developers, we gain insights,
>learnings and Linux and OSS drivers are running on another big
>DC. They improve things and find bugs they upstream them its a win.
>
>The opposite is also true if we exclude a driver/NIC HW that is
>running on major DCs we lose a lot of insight, experience, value.

Could you please describe in details and examples what exactly is we
are about to loose? I don't see it.


>DCs are all starting to build their own hardware if we lose this
>section of HW we lose those developers too. We are less likely
>to get any advances they come up with. I think you have it backwards.
>Eventually Linux networking becomes either commodity and irrelevant
>for DC deployments.
>
>So I strongly disagree we lose by excluding drivers and win by
>bringing it in.
>
>> 
>> 
>> >
>> >So from what I can tell the only difference is if we are manufacturing
>> >this for sale, or for personal use. Thus why I mention "commercial"
>> >since the only difference from my perspective is the fact that we are
>> >making it for our own use instead of selling it.
>> 
>> Give it for free.
>
>Huh?
>
>> 
>> 
>> >
>> >[...]
>> >
>> >> >> > I agree. We need a consistent set of standards. I just strongly
>> >> >> > believe commercial availability shouldn't be one of them.
>> >> >>
>> >> >> I never said commercial availability. I talked about open source vs
>> >> >> proprietary userspace. This is very standard kernel stuff.
>> >> >>
>> >> >> You have an unavailable NIC, so we know it is only ever operated with
>> >> >> Meta's proprietary kernel fork, supporting Meta's proprietary
>> >> >> userspace software. Where exactly is the open source?
>> >> >
>> >> >It depends on your definition of "unavailable". I could argue that for
>> >> >many most of the Mellanox NICs are also have limited availability as
>> >> >they aren't exactly easy to get a hold of without paying a hefty
>> >> >ransom.
>> >>
>> >> Sorry, but I have to say this is ridiculous argument, really Alex.
>> >> Apples and oranges.
>> >
>> >Really? So would you be making the same argument if it was
>> >Nvidia/Mellanox pushing the driver and they were exclusively making it
>> >just for Meta, Google, or some other big cloud provider? I suspect
>> 
>> Heh, what ifs :) Anyway, chance that happens is very close to 0.
>> 
>> 
>> >not. If nothing else they likely wouldn't disclose the plan for
>> >exclusive sales to get around this sort of thing. The fact is I know
>> >many of the vendors make proprietary spins of their firmware and
>> >hardware for specific customers. The way I see it this patchset is
>> >being rejected as I was too honest about the general plan and use case
>> >for it.
>> >
>> >This is what I am getting at. It just seems like we are playing games
>> >with semantics where if it is a vendor making the arrangement then it
>> >is okay for them to make hardware that is inaccessible to most, but if
>> >it is Meta then somehow it isn't.
>
>

Florian Fainelli April 9, 2024, 1:05 p.m. UTC | #50

On 4/9/2024 3:56 AM, Jiri Pirko wrote:
> Mon, Apr 08, 2024 at 11:36:42PM CEST, f.fainelli@gmail.com wrote:
>> On 4/8/24 09:51, Jiri Pirko wrote:
>>> Mon, Apr 08, 2024 at 05:46:35PM CEST, alexander.duyck@gmail.com wrote:
>>>> On Mon, Apr 8, 2024 at 4:51 AM Jiri Pirko <jiri@resnulli.us> wrote:
>>>>>
>>>>> Fri, Apr 05, 2024 at 08:38:25PM CEST, alexander.duyck@gmail.com wrote:
>>>>>> On Fri, Apr 5, 2024 at 8:17 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
>>>>>>>
>>>>>>> On Fri, Apr 05, 2024 at 07:24:32AM -0700, Alexander Duyck wrote:
>>>>>>>>> Alex already indicated new features are coming, changes to the core
>>>>>>>>> code will be proposed. How should those be evaluated? Hypothetically
>>>>>>>>> should fbnic be allowed to be the first implementation of something
>>>>>>>>> invasive like Mina's DMABUF work? Google published an open userspace
>>>>>>>>> for NCCL that people can (in theory at least) actually run. Meta would
>>>>>>>>> not be able to do that. I would say that clearly crosses the line and
>>>>>>>>> should not be accepted.
>>>>>>>>
>>>>>>>> Why not? Just because we are not commercially selling it doesn't mean
>>>>>>>> we couldn't look at other solutions such as QEMU. If we were to
>>>>>>>> provide a github repo with an emulation of the NIC would that be
>>>>>>>> enough to satisfy the "commercial" requirement?
>>>>>>>
>>>>>>> My test is not "commercial", it is enabling open source ecosystem vs
>>>>>>> benefiting only proprietary software.
>>>>>>
>>>>>> Sorry, that was where this started where Jiri was stating that we had
>>>>>> to be selling this.
>>>>>
>>>>> For the record, I never wrote that. Not sure why you repeat this over
>>>>> this thread.
>>>>
>>>> Because you seem to be implying that the Meta NIC driver shouldn't be
>>>> included simply since it isn't going to be available outside of Meta.
>>>> The fact is Meta employs a number of kernel developers and as a result
>>>> of that there will be a number of kernel developers that will have
>>>> access to this NIC and likely do development on systems containing it.
>>>> In addition simply due to the size of the datacenters that we will be
>>>> populating there is actually a strong likelihood that there will be
>>>> more instances of this NIC running on Linux than there are of some
>>>> other vendor devices that have been allowed to have drivers in the
>>>> kernel.
>>>
>>> So? The gain for community is still 0. No matter how many instances is
>>> private hw you privately have. Just have a private driver.
>>
>> I am amazed and not in a good way at how far this has gone, truly.
>>
>> This really is akin to saying that any non-zero driver count to maintain is a
>> burden on the community. Which is true, by definition, but if the goal was to
>> build something for no users, then clearly this is the wrong place to be in,
>> or too late. The systems with no users are the best to maintain, that is for
>> sure.
>>
>> If the practical concern is wen you make tree wide API change that fbnic
>> happens to use, and you have yet another driver (fbnic) to convert, so what?
>> Work with Alex ahead of time, get his driver to be modified, post the patch
>> series. Even if Alex happens to move on and stop being responsible and there
>> is no maintainer, so what? Give the driver a depreciation window for someone
>> to step in, rip it, end of story. Nothing new, so what has specifically
>> changed as of April 4th 2024 to oppose such strong rejection?
> 
> How you describe the flow of internal API change is totally distant from
> reality. Really, like no part is correct:
> 1) API change is responsibility of the person doing it. Imagine working
>     with 40 driver maintainers for every API change. I did my share of
>     API changes in the past, maintainer were only involved to be cced.

As a submitter you propose changes and silence is acknowledgement. If 
one of your API changes broke someone's driver and they did not notify 
you of the breakage during the review cycle, it falls on their shoulder 
to fix it for themselves and they should not be holding back your work, 
that would not be fair. If you know about the breakage, and there is 
still no fix, that is an indication the driver is not actively used and 
maintained.

This also does not mean you have to do the entire API changes to a 
driver you do not know about on your own. Nothing ever prevents you from 
posting the patches as RFC and say: "here is how I would go about 
changing your driver, please review and help me make corrections". If 
the driver maintainers do not respond there is no reason their lack of 
involvement should refrain your work, and so your proposed changes will 
be merged eventually.

Is not this the whole point of being a community and be able to delegate 
and mitigate the risk of large scale changes?

> 2) To deprecate driver because the maintainer is not responsible. Can
>     you please show me one example when that happened in the past?

I cannot show you an example because we never had to go that far and I 
did not say that this is an established practice, but that we *could* do 
that if we ever reached that point.

> 
> 
>>
>> Like it was said, there are tons of drivers in the Linux kernel that have a
>> single user, this one might have a few more than a single one, that should be
>> good enough.
> 
> This will have exactly 0. That is my point. Why to merge something
> nobody will ever use?

Even if Alex and his firmware colleague end up being the only two people 
using this driver if the decision is to make it upstream because this is 
the desired distribution and development model of the driver we should 
respect that.

And just to be clear, we should not be respecting that because Meta, or 
Alex or anyone decided that they were doing the world a favor by working 
in the open rather than being closed door, but simply because we cannot 
*presume* about their intentions and the future.

For drivers specifically, yes, there is a question of to which degree 
can we scale horizontally, and I do not think there is ever going to be 
an answer to that, as we will continue to see new drivers emerge, 
possibly with few users, for some definition of few.

Alexander Lobakin April 9, 2024, 1:11 p.m. UTC | #51

From: Jiri Pirko <jiri@resnulli.us>
Date: Tue, 9 Apr 2024 13:01:51 +0200

> Mon, Apr 08, 2024 at 07:32:59PM CEST, john.fastabend@gmail.com wrote:
>> Jiri Pirko wrote:
>>> Mon, Apr 08, 2024 at 05:46:35PM CEST, alexander.duyck@gmail.com wrote:
>>>> On Mon, Apr 8, 2024 at 4:51 AM Jiri Pirko <jiri@resnulli.us> wrote:
>>>>>
>>>>> Fri, Apr 05, 2024 at 08:38:25PM CEST, alexander.duyck@gmail.com wrote:
>>>>>> On Fri, Apr 5, 2024 at 8:17 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
>>>>>>>
>>>>>>> On Fri, Apr 05, 2024 at 07:24:32AM -0700, Alexander Duyck wrote:
>>>>>>>>> Alex already indicated new features are coming, changes to the core
>>>>>>>>> code will be proposed. How should those be evaluated? Hypothetically
>>>>>>>>> should fbnic be allowed to be the first implementation of something
>>>>>>>>> invasive like Mina's DMABUF work? Google published an open userspace
>>>>>>>>> for NCCL that people can (in theory at least) actually run. Meta would
>>>>>>>>> not be able to do that. I would say that clearly crosses the line and
>>>>>>>>> should not be accepted.
>>>>>>>>
>>>>>>>> Why not? Just because we are not commercially selling it doesn't mean
>>>>>>>> we couldn't look at other solutions such as QEMU. If we were to
>>>>>>>> provide a github repo with an emulation of the NIC would that be
>>>>>>>> enough to satisfy the "commercial" requirement?
>>>>>>>
>>>>>>> My test is not "commercial", it is enabling open source ecosystem vs
>>>>>>> benefiting only proprietary software.
>>>>>>
>>>>>> Sorry, that was where this started where Jiri was stating that we had
>>>>>> to be selling this.
>>>>>
>>>>> For the record, I never wrote that. Not sure why you repeat this over
>>>>> this thread.
>>>>
>>>> Because you seem to be implying that the Meta NIC driver shouldn't be
>>>> included simply since it isn't going to be available outside of Meta.

BTW idpf is also not something you can go and buy in a store, but it's
here in the kernel. Anyway, see below.

>>>> The fact is Meta employs a number of kernel developers and as a result
>>>> of that there will be a number of kernel developers that will have
>>>> access to this NIC and likely do development on systems containing it.

[...]

>> Vendors would happily spin up a NIC if a DC with scale like this
>> would pay for it. They just don't advertise it in patch 0/X,
>> "adding device for cloud provider foo".
>>
>> There is no difference here. We gain developers, we gain insights,
>> learnings and Linux and OSS drivers are running on another big
>> DC. They improve things and find bugs they upstream them its a win.
>>
>> The opposite is also true if we exclude a driver/NIC HW that is
>> running on major DCs we lose a lot of insight, experience, value.
> 
> Could you please describe in details and examples what exactly is we
> are about to loose? I don't see it.

As long as driver A introduces new features / improvements / API /
whatever to the core kernel, we benefit from this no matter whether I'm
actually able to run this driver on my system.

Some drivers even give us benefit by that they are of good quality (I
don't speak for this driver, just some hypothetical) and/or have
interesting design / code / API / etc. choices. The drivers I work on
did gain a lot just from that I was reading new commits / lore threads
and look at changes in other drivers.

I saw enough situations when driver A started using/doing something the
way it wasn't ever done anywhere before, and then more and more drivers
stated doing the same thing and at the end it became sorta standard.

I didn't read this patchset and thus can't say if it will bring us good
immediately or some time later, but I believe there's no reason to
reject the driver only because you can't buy a board for it in your
gadget store next door.

[...]

Thanks,
Olek

Jason Gunthorpe April 9, 2024, 1:18 p.m. UTC | #52

On Tue, Apr 09, 2024 at 03:11:21PM +0200, Alexander Lobakin wrote:

> BTW idpf is also not something you can go and buy in a store, but it's
> here in the kernel. Anyway, see below.

That is really disappointing to hear :(

Jason

Jakub Kicinski April 9, 2024, 2:08 p.m. UTC | #53

On Tue, 9 Apr 2024 15:11:21 +0200 Alexander Lobakin wrote:
> BTW idpf is also not something you can go and buy in a store, but it's
> here in the kernel. Anyway, see below.

For some definition of "a store" :)

> > Could you please describe in details and examples what exactly is we
> > are about to loose? I don't see it.  
> 
> As long as driver A introduces new features / improvements / API /
> whatever to the core kernel, we benefit from this no matter whether I'm
> actually able to run this driver on my system.
> 
> Some drivers even give us benefit by that they are of good quality (I
> don't speak for this driver, just some hypothetical) and/or have
> interesting design / code / API / etc. choices. The drivers I work on
> did gain a lot just from that I was reading new commits / lore threads
> and look at changes in other drivers.

Another point along these lines is worth bringing up. Companies which
build their own kernels probably have little reason to distribute
drivers out of tree. Vendors unfortunately are forced by some of their
customers and/or sales department to provide out of tree drivers. Which
in turn distinctiveness them from implementing shared core
infrastructure. The queue API is a good example of that. Number of
vendors implement pre-allocate and swap for reconfiguration but it's
not controlled by the core. So after 5+ years (look at netconf 2019
slides) of violently agreeing that we need queue alloc we made little
progress :( I don't think that it's a coincidence that it's Mina
(Google) and David (Meta) who picked up this work. And it's really hard
to implement that in an "off the shelf device", where queues are fully
controlled by FW (no documentation available), and without breaking
something (no access to vendor's CI/tests). IOW while modifying core
for a single private driver is a concern there's also a ton of work
we all agree needs to be done in the core, that we need help with.

Jakub Kicinski April 9, 2024, 2:27 p.m. UTC | #54

On Tue, 9 Apr 2024 07:08:58 -0700 Jakub Kicinski wrote:
> distinctiveness

Too trusting of the spellcheck, I meant disincentivizes

Jiri Pirko April 9, 2024, 2:28 p.m. UTC | #55

Tue, Apr 09, 2024 at 03:05:47PM CEST, f.fainelli@gmail.com wrote:
>
>
>On 4/9/2024 3:56 AM, Jiri Pirko wrote:
>> Mon, Apr 08, 2024 at 11:36:42PM CEST, f.fainelli@gmail.com wrote:
>> > On 4/8/24 09:51, Jiri Pirko wrote:
>> > > Mon, Apr 08, 2024 at 05:46:35PM CEST, alexander.duyck@gmail.com wrote:
>> > > > On Mon, Apr 8, 2024 at 4:51 AM Jiri Pirko <jiri@resnulli.us> wrote:
>> > > > > 
>> > > > > Fri, Apr 05, 2024 at 08:38:25PM CEST, alexander.duyck@gmail.com wrote:
>> > > > > > On Fri, Apr 5, 2024 at 8:17 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
>> > > > > > > 
>> > > > > > > On Fri, Apr 05, 2024 at 07:24:32AM -0700, Alexander Duyck wrote:
>> > > > > > > > > Alex already indicated new features are coming, changes to the core
>> > > > > > > > > code will be proposed. How should those be evaluated? Hypothetically
>> > > > > > > > > should fbnic be allowed to be the first implementation of something
>> > > > > > > > > invasive like Mina's DMABUF work? Google published an open userspace
>> > > > > > > > > for NCCL that people can (in theory at least) actually run. Meta would
>> > > > > > > > > not be able to do that. I would say that clearly crosses the line and
>> > > > > > > > > should not be accepted.
>> > > > > > > > 
>> > > > > > > > Why not? Just because we are not commercially selling it doesn't mean
>> > > > > > > > we couldn't look at other solutions such as QEMU. If we were to
>> > > > > > > > provide a github repo with an emulation of the NIC would that be
>> > > > > > > > enough to satisfy the "commercial" requirement?
>> > > > > > > 
>> > > > > > > My test is not "commercial", it is enabling open source ecosystem vs
>> > > > > > > benefiting only proprietary software.
>> > > > > > 
>> > > > > > Sorry, that was where this started where Jiri was stating that we had
>> > > > > > to be selling this.
>> > > > > 
>> > > > > For the record, I never wrote that. Not sure why you repeat this over
>> > > > > this thread.
>> > > > 
>> > > > Because you seem to be implying that the Meta NIC driver shouldn't be
>> > > > included simply since it isn't going to be available outside of Meta.
>> > > > The fact is Meta employs a number of kernel developers and as a result
>> > > > of that there will be a number of kernel developers that will have
>> > > > access to this NIC and likely do development on systems containing it.
>> > > > In addition simply due to the size of the datacenters that we will be
>> > > > populating there is actually a strong likelihood that there will be
>> > > > more instances of this NIC running on Linux than there are of some
>> > > > other vendor devices that have been allowed to have drivers in the
>> > > > kernel.
>> > > 
>> > > So? The gain for community is still 0. No matter how many instances is
>> > > private hw you privately have. Just have a private driver.
>> > 
>> > I am amazed and not in a good way at how far this has gone, truly.
>> > 
>> > This really is akin to saying that any non-zero driver count to maintain is a
>> > burden on the community. Which is true, by definition, but if the goal was to
>> > build something for no users, then clearly this is the wrong place to be in,
>> > or too late. The systems with no users are the best to maintain, that is for
>> > sure.
>> > 
>> > If the practical concern is wen you make tree wide API change that fbnic
>> > happens to use, and you have yet another driver (fbnic) to convert, so what?
>> > Work with Alex ahead of time, get his driver to be modified, post the patch
>> > series. Even if Alex happens to move on and stop being responsible and there
>> > is no maintainer, so what? Give the driver a depreciation window for someone
>> > to step in, rip it, end of story. Nothing new, so what has specifically
>> > changed as of April 4th 2024 to oppose such strong rejection?
>> 
>> How you describe the flow of internal API change is totally distant from
>> reality. Really, like no part is correct:
>> 1) API change is responsibility of the person doing it. Imagine working
>>     with 40 driver maintainers for every API change. I did my share of
>>     API changes in the past, maintainer were only involved to be cced.
>
>As a submitter you propose changes and silence is acknowledgement. If one of
>your API changes broke someone's driver and they did not notify you of the
>breakage during the review cycle, it falls on their shoulder to fix it for
>themselves and they should not be holding back your work, that would not be

Does it? I don't think so. If you break something, better try to fix it
before somebody else has to.


>fair. If you know about the breakage, and there is still no fix, that is an
>indication the driver is not actively used and maintained.

So? That is not my point. If I break something in fbnic, why does anyone
care? Nobody is ever to hit that bug, only Meta DC.


>
>This also does not mean you have to do the entire API changes to a driver you
>do not know about on your own. Nothing ever prevents you from posting the
>patches as RFC and say: "here is how I would go about changing your driver,
>please review and help me make corrections". If the driver maintainers do not
>respond there is no reason their lack of involvement should refrain your
>work, and so your proposed changes will be merged eventually.

Realistically, did you see that ever happen. I can't recall.


>
>Is not this the whole point of being a community and be able to delegate and
>mitigate the risk of large scale changes?
>
>> 2) To deprecate driver because the maintainer is not responsible. Can
>>     you please show me one example when that happened in the past?
>
>I cannot show you an example because we never had to go that far and I did
>not say that this is an established practice, but that we *could* do that if
>we ever reached that point.

You are talking about a flow that does not exist. I don't understand how
is that related to this discussion then.


>
>> 
>> 
>> > 
>> > Like it was said, there are tons of drivers in the Linux kernel that have a
>> > single user, this one might have a few more than a single one, that should be
>> > good enough.
>> 
>> This will have exactly 0. That is my point. Why to merge something
>> nobody will ever use?
>
>Even if Alex and his firmware colleague end up being the only two people
>using this driver if the decision is to make it upstream because this is the
>desired distribution and development model of the driver we should respect
>that.
>
>And just to be clear, we should not be respecting that because Meta, or Alex
>or anyone decided that they were doing the world a favor by working in the
>open rather than being closed door, but simply because we cannot *presume*

I don't see any favor for the community. What's the favor exactly?
The only favor I see is the in the opposite direction, community giving
Meta free cycles saving their backporting costs. Why?


>about their intentions and the future.

Heh, the intention is pretty clear from this discussion, isn't it? If
they ever by any chance decide to go public with their device, driver
for that could be submitted at a time. But this is totally hypothetical.


>
>For drivers specifically, yes, there is a question of to which degree can we
>scale horizontally, and I do not think there is ever going to be an answer to
>that, as we will continue to see new drivers emerge, possibly with few users,
>for some definition of few.
>-- 
>Florian

Jiri Pirko April 9, 2024, 2:41 p.m. UTC | #56

Tue, Apr 09, 2024 at 03:11:21PM CEST, aleksander.lobakin@intel.com wrote:
>From: Jiri Pirko <jiri@resnulli.us>
>Date: Tue, 9 Apr 2024 13:01:51 +0200
>
>> Mon, Apr 08, 2024 at 07:32:59PM CEST, john.fastabend@gmail.com wrote:
>>> Jiri Pirko wrote:
>>>> Mon, Apr 08, 2024 at 05:46:35PM CEST, alexander.duyck@gmail.com wrote:
>>>>> On Mon, Apr 8, 2024 at 4:51 AM Jiri Pirko <jiri@resnulli.us> wrote:
>>>>>>
>>>>>> Fri, Apr 05, 2024 at 08:38:25PM CEST, alexander.duyck@gmail.com wrote:
>>>>>>> On Fri, Apr 5, 2024 at 8:17 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
>>>>>>>>
>>>>>>>> On Fri, Apr 05, 2024 at 07:24:32AM -0700, Alexander Duyck wrote:
>>>>>>>>>> Alex already indicated new features are coming, changes to the core
>>>>>>>>>> code will be proposed. How should those be evaluated? Hypothetically
>>>>>>>>>> should fbnic be allowed to be the first implementation of something
>>>>>>>>>> invasive like Mina's DMABUF work? Google published an open userspace
>>>>>>>>>> for NCCL that people can (in theory at least) actually run. Meta would
>>>>>>>>>> not be able to do that. I would say that clearly crosses the line and
>>>>>>>>>> should not be accepted.
>>>>>>>>>
>>>>>>>>> Why not? Just because we are not commercially selling it doesn't mean
>>>>>>>>> we couldn't look at other solutions such as QEMU. If we were to
>>>>>>>>> provide a github repo with an emulation of the NIC would that be
>>>>>>>>> enough to satisfy the "commercial" requirement?
>>>>>>>>
>>>>>>>> My test is not "commercial", it is enabling open source ecosystem vs
>>>>>>>> benefiting only proprietary software.
>>>>>>>
>>>>>>> Sorry, that was where this started where Jiri was stating that we had
>>>>>>> to be selling this.
>>>>>>
>>>>>> For the record, I never wrote that. Not sure why you repeat this over
>>>>>> this thread.
>>>>>
>>>>> Because you seem to be implying that the Meta NIC driver shouldn't be
>>>>> included simply since it isn't going to be available outside of Meta.
>
>BTW idpf is also not something you can go and buy in a store, but it's
>here in the kernel. Anyway, see below.

IDK, why so many people in this thread are so focused on "buying" nic.
IDPF device is something I assume one may see on a VM hosted in some
cloud, isn't it? If yes, it is completely legit to have it in. Do I miss
something?


>
>>>>> The fact is Meta employs a number of kernel developers and as a result
>>>>> of that there will be a number of kernel developers that will have
>>>>> access to this NIC and likely do development on systems containing it.
>
>[...]
>
>>> Vendors would happily spin up a NIC if a DC with scale like this
>>> would pay for it. They just don't advertise it in patch 0/X,
>>> "adding device for cloud provider foo".
>>>
>>> There is no difference here. We gain developers, we gain insights,
>>> learnings and Linux and OSS drivers are running on another big
>>> DC. They improve things and find bugs they upstream them its a win.
>>>
>>> The opposite is also true if we exclude a driver/NIC HW that is
>>> running on major DCs we lose a lot of insight, experience, value.
>> 
>> Could you please describe in details and examples what exactly is we
>> are about to loose? I don't see it.
>
>As long as driver A introduces new features / improvements / API /
>whatever to the core kernel, we benefit from this no matter whether I'm
>actually able to run this driver on my system.
>
>Some drivers even give us benefit by that they are of good quality (I
>don't speak for this driver, just some hypothetical) and/or have
>interesting design / code / API / etc. choices. The drivers I work on
>did gain a lot just from that I was reading new commits / lore threads
>and look at changes in other drivers.
>
>I saw enough situations when driver A started using/doing something the
>way it wasn't ever done anywhere before, and then more and more drivers
>stated doing the same thing and at the end it became sorta standard.

So bottom line is, the unused driver *may* introduce some features and
*may* provide as an example of how to do things for other people.
Is this really that beneficial for the community that it overweights
the obvious cons (not going to repeat them)?

Like with any other patch/set we merge in, we always look at the cons
and pros. I'm honestly surprised that so many people here
want to make exception for Meta's internal toy project.


>
>I didn't read this patchset and thus can't say if it will bring us good
>immediately or some time later, but I believe there's no reason to
>reject the driver only because you can't buy a board for it in your
>gadget store next door.

Again with "buying", uff.


>
>[...]
>
>Thanks,
>Olek

Alexander Duyck April 9, 2024, 2:43 p.m. UTC | #57

On Tue, Apr 9, 2024 at 1:19 AM Leon Romanovsky <leon@kernel.org> wrote:
>
> On Mon, Apr 08, 2024 at 01:43:28PM -0700, Alexander Duyck wrote:
> > On Mon, Apr 8, 2024 at 11:41 AM Leon Romanovsky <leon@kernel.org> wrote:
> > >
> > > On Mon, Apr 08, 2024 at 08:26:33AM -0700, Alexander Duyck wrote:
> > > > On Sun, Apr 7, 2024 at 11:18 PM Leon Romanovsky <leon@kernel.org> wrote:
> > > > >
> > > > > On Fri, Apr 05, 2024 at 08:41:11AM -0700, Alexander Duyck wrote:
> > > > > > On Thu, Apr 4, 2024 at 7:38 PM Jakub Kicinski <kuba@kernel.org> wrote:
> > > > >
> > > > > <...>
> > > > >
> > > > > > > > > Technical solution? Maybe if it's not a public device regression rules
> > > > > > > > > don't apply? Seems fairly reasonable.
> > > > > > > >
> > > > > > > > This is a hypothetical. This driver currently isn't changing anything
> > > > > > > > outside of itself. At this point the driver would only be build tested
> > > > > > > > by everyone else. They could just not include it in their Kconfig and
> > > > > > > > then out-of-sight, out-of-mind.
> > > > > > >
> > > > > > > Not changing does not mean not depending on existing behavior.
> > > > > > > Investigating and fixing properly even the hardest regressions in
> > > > > > > the stack is a bar that Meta can so easily clear. I don't understand
> > > > > > > why you are arguing.
> > > > > >
> > > > > > I wasn't saying the driver wouldn't be dependent on existing behavior.
> > > > > > I was saying that it was a hypothetical that Meta would be a "less
> > > > > > than cooperative user" and demand a revert.  It is also a hypothetical
> > > > > > that Linus wouldn't just propose a revert of the fbnic driver instead
> > > > > > of the API for the crime of being a "less than cooperative maintainer"
> > > > > > and  then give Meta the Nvidia treatment.
> > > > >
> > > > > It is very easy to be "less than cooperative maintainer" in netdev world.
> > > > > 1. Be vendor.
> > > > > 2. Propose ideas which are different.
> > > > > 3. Report for user-visible regression.
> > > > > 4. Ask for a fix from the patch author or demand a revert according to netdev rules/practice.
> > > > >
> > > > > And voilà, you are "less than cooperative maintainer".
> > > > >
> > > > > So in reality, the "hypothetical" is very close to the reality, unless
> > > > > Meta contribution will be treated as a special case.
> > > > >
> > > > > Thanks
> > > >
> > > > How many cases of that have we had in the past? I'm honestly curious
> > > > as I don't actually have any reference.
> > >
> > > And this is the problem, you don't have "any reference" and accurate
> > > knowledge what happened, but you are saying "less than cooperative
> > > maintainer".
>
> <...>
>
> > Any more info on this? Without context it is hard to say one way or the other.
>
> <...>
>
> > I didn't say you couldn't. Without context I cannot say if it was
> > deserved or not.
>
> Florian gave links to the context, so I'll skip this part.
>
> In this thread, Jakub tried to revive the discussion about it.
> https://lore.kernel.org/netdev/20240326133412.47cf6d99@kernel.org/
>
> <...>

I see. So this is what you were referencing. Arguably I can see both
sides of the issue. Ideally what should have been presented would have
been the root cause of why the diff was breaking things and then it
could have been fixed. However instead what was presented was
essentially a bisect with a request to revert.

Ultimately Eric accepted the revert since there was an issue that
needed to be fixed. However I can't tell what went on in terms of
trying to get to the root cause as that was taken offline for
discussion so I can't say what role Mellanox played in either good or
bad other than at least performing the bisect.

Ultimately I think this kind of comes down to the hobbyist versus
commercial interests issue that I brought up earlier. The hobbyist
side should at least be curious about what about the Vagrant
implementation was not RFC compliant which the changes supposedly
were, thus the interest in getting a root cause. However that said it
is broken and needs to be fixed so curiosity be damned, we cannot
break userspace or not interop with other TCP implementations.

> > The point I was trying to make is that if you are the only owner of
> > something, and not willing to work with the community as a maintainer
>
> Like Jakub, I don't understand why you are talking about regressions in
> the driver, while you brought the label of "less than cooperative maintainer"
> and asked for "then give Meta the Nvidia treatment".

Because I have been trying to keep the  whole discussion about the
fbnic driver that is presented in this patch set. When I was referring
to a "less than cooperative maintainer" it was in response to the
hypothetical about what if Meta started refusing to work with the
community after this was accepted, and the "Nvidia treatment" I was
referring was the graphics side about 10 years ago[1] as the question
was about somebody running to Linus to complain that their proprietary
hardware got broken by a kernel change. The general idea being if we
are a proprietary NIC with ourselves as the only customer Linus would
be more likely to give Meta a similar message.

> I don't want to get into the discussion about if this driver should be
> accepted or not.
>
> I'm just asking to stop label people and companies based on descriptions
> from other people, but rely on facts.

Sorry, it wasn't meant to be any sort of attack on Nvidia/Mellanox.
The Nvidia I was referencing was the graphics side which had a bad
reputation with the community long before Mellanox got involved.

> Thanks

Thank you. I understand now that you and Jason were just trying to
warn me about what the community will and won't accept. Like I
mentioned before I had just misconstrued Jason's comments as backing
Jiri initially in this. In my mind I was prepared for the
Nvidia/Mellanox folks dog piling me so I was just prepared for
attacks.

Just for the record this will be the third NIC driver I have added to
the kernel following igbvf and fm10k, and years maintaining some of
the other Intel network drivers. So I am well aware of the
expectations of a maintainer. I might be a bit rusty due to a couple
years of being focused on this project and not being able to post as
much upstream, but as the expression goes "This isn't my first rodeo".

- Alex

[1]: https://arstechnica.com/information-technology/2012/06/linus-torvalds-says-f-k-you-to-nvidia/

Jason Gunthorpe April 9, 2024, 3:39 p.m. UTC | #58

On Tue, Apr 09, 2024 at 07:43:07AM -0700, Alexander Duyck wrote:

> I see. So this is what you were referencing. Arguably I can see both
> sides of the issue. Ideally what should have been presented would have
> been the root cause of why the diff

Uh, that almost never happens in the kernel world. Someone does a
great favour to us all to test rc kernels and finds bugs. The
expectation is generally things like:

 - The bug is fixed immediately because the issue is obvious to the
   author
 - Iteration and rapid progress is seen toward enlightening the author
 - The patch is reverted, often rapidly, try again later with a good
   patch

Unsophisticated reporters should not experience regressions,
period. Unsophisticated reporters shuld not be expected to debug
things on their own (though it sure is nice if they can!). We really
like it and appreciate it if reporters can run experiments!

In this particular instance there was some resistance getting to a fix
quickly. I think a revert for something like this that could not be
immediately fixed is the correct thing, especially when it effects
significant work within the community. It gives the submitter time to
find out how to solve the regression.

That there is now so much ongoing bad blood over such an ordinary
matter is what is really distressing here.

I think Leon's point is broadly that those on the "vendor" side seem
to often be accused of being a "bad vendor". I couldn't help but
notice the language from Meta on this thread seemed to place Meta
outside of being a vendor, despite having always very much been doing
typical vendor activities like downstream forks, proprietary userspace
and now drivers for their own devices.

In my view the vendor/!vendor distinction is really toxic and should
stop.

Jason

Alexander Duyck April 9, 2024, 4:31 p.m. UTC | #59

On Tue, Apr 9, 2024 at 8:39 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Tue, Apr 09, 2024 at 07:43:07AM -0700, Alexander Duyck wrote:
>
> > I see. So this is what you were referencing. Arguably I can see both
> > sides of the issue. Ideally what should have been presented would have
> > been the root cause of why the diff
>
> Uh, that almost never happens in the kernel world. Someone does a
> great favour to us all to test rc kernels and finds bugs. The

Thus why I mentioned "Ideally". Most often that cannot be the case due
to various reasons. However, that said that would have been the Ideal
solution, not the practical one.

> expectation is generally things like:
>
>  - The bug is fixed immediately because the issue is obvious to the
>    author
>  - Iteration and rapid progress is seen toward enlightening the author
>  - The patch is reverted, often rapidly, try again later with a good
>    patch

When working on a development branch that shouldn't be the
expectation. I suspect that is why the revert was pushed back on
initially. The developer wanted a chance to try to debug and resolve
the issue with root cause.

Honestly what I probably would have proposed was a build flag that
would have allowed the code to stay but be disabled with a "Broken"
label to allow both developers to work on their own thing. Then if
people complained about the RFC non-compliance issue, but didn't care
about the Vagrant setup they could have just turned it on to test and
verify it fixed their issue and get additional testing. However I
assume that would have introduced additional maintenance overhead.

> Unsophisticated reporters should not experience regressions,
> period. Unsophisticated reporters shuld not be expected to debug
> things on their own (though it sure is nice if they can!). We really
> like it and appreciate it if reporters can run experiments!

Unsophisticated reporters/users shouldn't be running net-next. If this
has made it to or is about to go into Linus's tree then I would agree
the regression needs to be resolved ASAP as that stuff shouldn't exist
past rc1 at the latest.

> In this particular instance there was some resistance getting to a fix
> quickly. I think a revert for something like this that could not be
> immediately fixed is the correct thing, especially when it effects
> significant work within the community. It gives the submitter time to
> find out how to solve the regression.
>
> That there is now so much ongoing bad blood over such an ordinary
> matter is what is really distressing here.

Well much of it has to do with the fact that this is supposed to be a
community. Generally I help you, you help me and together we both make
progress. So within the community people tend to build up what we
could call karma. Generally I think some of the messages sent seemed
to make it come across that the Mellanox/Nvidia folks felt it "wasn't
their problem" so they elicited a bit of frustration from the other
maintainers and built up some negative karma.

As I had mentioned in the case of the e1000e NVRAM corruption. It
wasn't an Intel issue that caused the problem but Intel had to jump in
to address it until they found the root cause that was function
tracing. Unfortunately one thing that tends to happen with upstream is
that we get asked to do things that aren't directly related to the
project we are working on. We saw that at Intel quite often. I
referred to it at one point as the "you stepped in it, you own it"
phenomenon where if we even brushed against block of upstream code
that wasn't being well maintained we would be asked to fix it up and
address existing issues before we could upstream any patches.

> I think Leon's point is broadly that those on the "vendor" side seem
> to often be accused of being a "bad vendor". I couldn't help but
> notice the language from Meta on this thread seemed to place Meta
> outside of being a vendor, despite having always very much been doing
> typical vendor activities like downstream forks, proprietary userspace
> and now drivers for their own devices.

I wouldn't disagree that we are doing "vendor" things. Up until about
4 years ago I was on the "vendor" side at Intel. One difference is
that Meta is also the "consumer". So if I report an issue it is me
complaining about something as a sophisticated user instead of a
unsophisticated one. So hopefully we have gone though and done some
triage to at least bisect it down to a patch and are willing to work
with the community as you guys did. If we can work with the other
maintainers to enable them to debug and root cause the issue then even
better. The revert is normally the weapon of last resort to be broken
out before the merge window opens, or if an issue is caught in Linus's
tree.

> In my view the vendor/!vendor distinction is really toxic and should
> stop.

I agree. However that was essentially what started all this when Jiri
pointed out that we weren't selling the NIC to anyone else. That made
this all about vendor vs !vendor, and his suggestion of just giving
the NICs away isn't exactly practical. At least not an any sort of
large scale. Maybe we should start coming up with a new term for the
!vendor case. How about "prosumer", as in "producer" and "consumer"?

Edward Cree April 9, 2024, 4:53 p.m. UTC | #60

On 05/04/2024 15:24, Alexander Duyck wrote:
> Why not? Just because we are not commercially selling it doesn't mean
> we couldn't look at other solutions such as QEMU. If we were to
> provide a github repo with an emulation of the NIC would that be
> enough to satisfy the "commercial" requirement?
> 
> The fact is I already have an implementation, but I would probably
> need to clean up a few things as the current setup requires 3 QEMU
> instances to emulate the full setup with host, firmware, and BMC. It
> wouldn't be as performant as the actual hardware but it is more than
> enough for us to test code with. If we need to look at publishing
> something like that to github in order to address the lack of user
> availability I could start looking at getting the approvals for that.
Personally I think that this would vitiate any legitimate objections
 anyone could have to this driver.  The emulation would be a functional
 spec for the device, and (assuming it's open source, including the
 firmware) would provide a basis for anyone attempting to build their
 own hardware to the same interface.  As long as clones aren't
 prevented by some kind of patent encumbrance or whatever, this would
 be more 'open' than many of the devices users _can_ get their hands on
 today.
The way this suggestion/offer/proposal got dismissed and ignored in
 favour of spurious arguments about DMABUF speaks volumes.

-e

Jason Gunthorpe April 9, 2024, 5:12 p.m. UTC | #61

On Tue, Apr 09, 2024 at 09:31:06AM -0700, Alexander Duyck wrote:

> > expectation is generally things like:
> >
> >  - The bug is fixed immediately because the issue is obvious to the
> >    author
> >  - Iteration and rapid progress is seen toward enlightening the author
> >  - The patch is reverted, often rapidly, try again later with a good
> >    patch
> 
> When working on a development branch that shouldn't be the
> expectation. I suspect that is why the revert was pushed back on
> initially. The developer wanted a chance to try to debug and resolve
> the issue with root cause.

Even mm-unstable drops patches on a hair trigger, as an example.

You can't have an orderly development process if your development tree
is broken in your CI.. Personally I'm grateful for the people who test
linux-next (or the various constituent sub trees), it really helps.

> Well much of it has to do with the fact that this is supposed to be a
> community. Generally I help you, you help me and together we both make
> progress. So within the community people tend to build up what we
> could call karma. Generally I think some of the messages sent seemed
> to make it come across that the Mellanox/Nvidia folks felt it "wasn't
> their problem" so they elicited a bit of frustration from the other
> maintainers and built up some negative karma.

How could it be NVIDIA folks problem? They are not experts in TCP and
can't debug it. The engineer running the CI systems did what he was
asked by Eric from what I can tell.

> phenomenon where if we even brushed against block of upstream code
> that wasn't being well maintained we would be asked to fix it up and
> address existing issues before we could upstream any patches.

Well, Intel has it's own karma problems in the kernel community. :(

> > In my view the vendor/!vendor distinction is really toxic and should
> > stop.
> 
> I agree. However that was essentially what started all this when Jiri
> pointed out that we weren't selling the NIC to anyone else. That made
> this all about vendor vs !vendor, 

That is not how I would sum up Jiri's position.

By my read he is saying that contributing code to the kernel that only
Meta can actually use is purely extractive. It is not about vendor or
!vendor, it is taking-free-forwardporting or not. You have argued,
and I would agree, that there is a grey scale between
extractive/collaborative - but I also agree with Jiri that fbnic is
undeniably far toward the extractive side.

If being extractive is a problem in this case or not is another
question, but I would say Jiri's objection is definitely not about
selling or vendor vs !vendor.

Jason

Florian Fainelli April 9, 2024, 5:42 p.m. UTC | #62

On 4/9/24 07:28, Jiri Pirko wrote:
> Tue, Apr 09, 2024 at 03:05:47PM CEST, f.fainelli@gmail.com wrote:
>>
>>
>> On 4/9/2024 3:56 AM, Jiri Pirko wrote:
>>> Mon, Apr 08, 2024 at 11:36:42PM CEST, f.fainelli@gmail.com wrote:
>>>> On 4/8/24 09:51, Jiri Pirko wrote:
>>>>> Mon, Apr 08, 2024 at 05:46:35PM CEST, alexander.duyck@gmail.com wrote:
>>>>>> On Mon, Apr 8, 2024 at 4:51 AM Jiri Pirko <jiri@resnulli.us> wrote:
>>>>>>>
>>>>>>> Fri, Apr 05, 2024 at 08:38:25PM CEST, alexander.duyck@gmail.com wrote:
>>>>>>>> On Fri, Apr 5, 2024 at 8:17 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
>>>>>>>>>
>>>>>>>>> On Fri, Apr 05, 2024 at 07:24:32AM -0700, Alexander Duyck wrote:
>>>>>>>>>>> Alex already indicated new features are coming, changes to the core
>>>>>>>>>>> code will be proposed. How should those be evaluated? Hypothetically
>>>>>>>>>>> should fbnic be allowed to be the first implementation of something
>>>>>>>>>>> invasive like Mina's DMABUF work? Google published an open userspace
>>>>>>>>>>> for NCCL that people can (in theory at least) actually run. Meta would
>>>>>>>>>>> not be able to do that. I would say that clearly crosses the line and
>>>>>>>>>>> should not be accepted.
>>>>>>>>>>
>>>>>>>>>> Why not? Just because we are not commercially selling it doesn't mean
>>>>>>>>>> we couldn't look at other solutions such as QEMU. If we were to
>>>>>>>>>> provide a github repo with an emulation of the NIC would that be
>>>>>>>>>> enough to satisfy the "commercial" requirement?
>>>>>>>>>
>>>>>>>>> My test is not "commercial", it is enabling open source ecosystem vs
>>>>>>>>> benefiting only proprietary software.
>>>>>>>>
>>>>>>>> Sorry, that was where this started where Jiri was stating that we had
>>>>>>>> to be selling this.
>>>>>>>
>>>>>>> For the record, I never wrote that. Not sure why you repeat this over
>>>>>>> this thread.
>>>>>>
>>>>>> Because you seem to be implying that the Meta NIC driver shouldn't be
>>>>>> included simply since it isn't going to be available outside of Meta.
>>>>>> The fact is Meta employs a number of kernel developers and as a result
>>>>>> of that there will be a number of kernel developers that will have
>>>>>> access to this NIC and likely do development on systems containing it.
>>>>>> In addition simply due to the size of the datacenters that we will be
>>>>>> populating there is actually a strong likelihood that there will be
>>>>>> more instances of this NIC running on Linux than there are of some
>>>>>> other vendor devices that have been allowed to have drivers in the
>>>>>> kernel.
>>>>>
>>>>> So? The gain for community is still 0. No matter how many instances is
>>>>> private hw you privately have. Just have a private driver.
>>>>
>>>> I am amazed and not in a good way at how far this has gone, truly.
>>>>
>>>> This really is akin to saying that any non-zero driver count to maintain is a
>>>> burden on the community. Which is true, by definition, but if the goal was to
>>>> build something for no users, then clearly this is the wrong place to be in,
>>>> or too late. The systems with no users are the best to maintain, that is for
>>>> sure.
>>>>
>>>> If the practical concern is wen you make tree wide API change that fbnic
>>>> happens to use, and you have yet another driver (fbnic) to convert, so what?
>>>> Work with Alex ahead of time, get his driver to be modified, post the patch
>>>> series. Even if Alex happens to move on and stop being responsible and there
>>>> is no maintainer, so what? Give the driver a depreciation window for someone
>>>> to step in, rip it, end of story. Nothing new, so what has specifically
>>>> changed as of April 4th 2024 to oppose such strong rejection?
>>>
>>> How you describe the flow of internal API change is totally distant from
>>> reality. Really, like no part is correct:
>>> 1) API change is responsibility of the person doing it. Imagine working
>>>      with 40 driver maintainers for every API change. I did my share of
>>>      API changes in the past, maintainer were only involved to be cced.
>>
>> As a submitter you propose changes and silence is acknowledgement. If one of
>> your API changes broke someone's driver and they did not notify you of the
>> breakage during the review cycle, it falls on their shoulder to fix it for
>> themselves and they should not be holding back your work, that would not be
> 
> Does it? I don't think so. If you break something, better try to fix it
> before somebody else has to.



> 
> 
>> fair. If you know about the breakage, and there is still no fix, that is an
>> indication the driver is not actively used and maintained.
> 
> So? That is not my point. If I break something in fbnic, why does anyone
> care? Nobody is ever to hit that bug, only Meta DC.

They care, and they will jump in to fix it. There is no expectation that 
as a community member you should be able to make 100% correct patches, 
this is absolutely not humanly possible, even less so with scarce access 
to the hardware. All you can hope for is that your changes work, and 
that someone catches it, sooner rather than later.

> 
> 
>>
>> This also does not mean you have to do the entire API changes to a driver you
>> do not know about on your own. Nothing ever prevents you from posting the
>> patches as RFC and say: "here is how I would go about changing your driver,
>> please review and help me make corrections". If the driver maintainers do not
>> respond there is no reason their lack of involvement should refrain your
>> work, and so your proposed changes will be merged eventually.
> 
> Realistically, did you see that ever happen. I can't recall.

This happens all of the time, if you make a netdev tree wide change, how 
many maintainer's Acked-by do we collect before merging those changes: 
none typically because some netdev maintainers are just quicker than 
reviewers could be. In other subsystems we might actually wait for 
people to give a change to give their A-b or R-b tags, not always though.

> 
> 
>>
>> Is not this the whole point of being a community and be able to delegate and
>> mitigate the risk of large scale changes?
>>
>>> 2) To deprecate driver because the maintainer is not responsible. Can
>>>      you please show me one example when that happened in the past?
>>
>> I cannot show you an example because we never had to go that far and I did
>> not say that this is an established practice, but that we *could* do that if
>> we ever reached that point.
> 
> You are talking about a flow that does not exist. I don't understand how
> is that related to this discussion then.

I was trying to appease your concerns about additional maintenance 
burden. If the burden becomes real, we ditch it. We can dismiss this 
point as being not relevant if you want.

> 
> 
>>
>>>
>>>
>>>>
>>>> Like it was said, there are tons of drivers in the Linux kernel that have a
>>>> single user, this one might have a few more than a single one, that should be
>>>> good enough.
>>>
>>> This will have exactly 0. That is my point. Why to merge something
>>> nobody will ever use?
>>
>> Even if Alex and his firmware colleague end up being the only two people
>> using this driver if the decision is to make it upstream because this is the
>> desired distribution and development model of the driver we should respect
>> that.
>>
>> And just to be clear, we should not be respecting that because Meta, or Alex
>> or anyone decided that they were doing the world a favor by working in the
>> open rather than being closed door, but simply because we cannot *presume*
> 
> I don't see any favor for the community. What's the favor exactly?

There is no exchange of favors or "this" for "that", this is not how a 
community works. You bring your code to the table, solicit review 
feedback, then go on to maintain it within your bounds, and, time 
permitting, beyond your driver. What we gain as a community is 
additional visibility, more API users (eventually real world users, 
too), and therefore a somewhat more objective way of coming up with new 
APIs and features, and just a broader understanding of what is out 
there. This is better than speculation since that creates a less skewed 
mental model.

Let us say that someone at Meta wanted to get this core netdev feature 
that could be super cool for others included in the upstream kernel, we 
would shut it down on the basis that no user exists and we would be 
right about doing it that. Turns out there is a user, but the driver 
lives out of tree, but now we also reject that driver? Who benefits from 
doing that: nobody.

You need a membership card to join the club that you can only enter if 
you have a membership card already? No thank you.

> The only favor I see is the in the opposite direction, community giving
> Meta free cycles saving their backporting costs. Why?

Technically it would be both forward porting cycles, since they would no 
longer need to rebase the driver against their most recent kernel used, 
and backporting cycles for the first kernel including fbnic onwards.

That comes almost for free these days anyways thanks to static analysis 
tools. The overwhelming cost of the maintenance remains on Meta 
engineers, being the only ones with access to the hardware. If they end 
up with customers in the future, they can offload some of that to their 
customers, too.

Let's just look at a few high profile drivers by lines changed:

Summary for: drivers/net/ethernet/mellanox/mlxsw
Total: 133422 (+), 44952 (-)
Own: 131180 (+), 42725 (-)
Community: 2242 (+) (1.680 %), 2227 (-) (4.954 %)

Summary for: drivers/net/ethernet/mellanox/mlx5
Total: 265368 (+), 107690 (-)
Own: 259213 (+), 100328 (-)
Community: 6155 (+) (2.319 %), 7362 (-) (6.836 %)%

Summary for: drivers/net/ethernet/broadcom/bnxt
Total: 70355 (+), 25402 (-)
Own: 68458 (+), 23358 (-)
Community: 1897 (+) (2.696 %), 2044 (-) (8.047 %)

Summary for: drivers/net/ethernet/intel/e1000e/
Total: 39760 (+), 9924 (-)
Own: 38514 (+), 8905 (-)
Community: 1246 (+) (3.134 %), 1019 (-) (10.268 %)

I admit this is simplistic because both mlxsw and mlx5 drivers helped 
greatly improve the networking stack in parts that I benefited directly 
from within DSA for instance.

The point is, you paid the maintenance price though, the community did not.

> 
> 
>> about their intentions and the future.
> 
> Heh, the intention is pretty clear from this discussion, isn't it? If
> they ever by any chance decide to go public with their device, driver
> for that could be submitted at a time. But this is totally hypothetical.

I think your opposition is unreasonable and is unfair. Using your 
argument to the extreme, I may go as far as saying that it encourages 
working out of tree, rather than in tree. This is the exact opposite of 
what made Linux successful as an OS.

Can I buy a Spectrum switch off Amazon?

Leon Romanovsky April 9, 2024, 6:38 p.m. UTC | #63

On Tue, Apr 09, 2024 at 10:42:44AM -0700, Florian Fainelli wrote:
> On 4/9/24 07:28, Jiri Pirko wrote:
> > Tue, Apr 09, 2024 at 03:05:47PM CEST, f.fainelli@gmail.com wrote:
> > > 

<...>

> Can I buy a Spectrum switch off Amazon?

You can buy latest Spectrum generation from eBay.
https://www.ebay.com/itm/145138400557?itmmeta=01HV2247YN9ENJHP6T4YSN2HP7&hash=item21caec3d2d:g:qWoAAOSwzEZjiq2z&itmprp=enc%3AAQAJAAAA4CjGKSBxVTaO07qnpLBQGBwBJdGVCYhu730MrI5AC6E%2BERJLxS0EdlgE2gKtElk%2FZUj6A9DQR69IXIpTk%2FJbqEGKgCNR4d6TMz6i%2BogO02ZZBCkpLkPOYpOvDJV1jRv2KVOBt7i5k5pUMRJpwPKzAH%2Fwf6tglPOId2d9fSy%2BBM3MbDcbZfkv4V%2FNbItTgspvDnnMKAzUmR3Rs9%2FoHVDVbU4ZsnfRKFOMaKbGH5j%2Fani2jAqtbPEIA3H8nUcdpkxo4I61N5w9peLlN6Hkj8E8irdQY4TTzSTdYZ7EC9JG09cQ%7Ctkp%3ABk9SR8T_kMLYYw

Thanks

> -- 
> Florian
> 
>

Alexander Duyck April 9, 2024, 6:38 p.m. UTC | #64

On Tue, Apr 9, 2024 at 10:12 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Tue, Apr 09, 2024 at 09:31:06AM -0700, Alexander Duyck wrote:
>
> > > expectation is generally things like:
> > >
> > >  - The bug is fixed immediately because the issue is obvious to the
> > >    author
> > >  - Iteration and rapid progress is seen toward enlightening the author
> > >  - The patch is reverted, often rapidly, try again later with a good
> > >    patch
> >
> > When working on a development branch that shouldn't be the
> > expectation. I suspect that is why the revert was pushed back on
> > initially. The developer wanted a chance to try to debug and resolve
> > the issue with root cause.
>
> Even mm-unstable drops patches on a hair trigger, as an example.
>
> You can't have an orderly development process if your development tree
> is broken in your CI.. Personally I'm grateful for the people who test
> linux-next (or the various constituent sub trees), it really helps.
>
> > Well much of it has to do with the fact that this is supposed to be a
> > community. Generally I help you, you help me and together we both make
> > progress. So within the community people tend to build up what we
> > could call karma. Generally I think some of the messages sent seemed
> > to make it come across that the Mellanox/Nvidia folks felt it "wasn't
> > their problem" so they elicited a bit of frustration from the other
> > maintainers and built up some negative karma.
>
> How could it be NVIDIA folks problem? They are not experts in TCP and
> can't debug it. The engineer running the CI systems did what he was
> asked by Eric from what I can tell.

No, I get your message. I wasn't saying it was your problem. All that
can be asked for is such cooperation. Like I said I think some of the
problem was the messaging more than the process.

> > phenomenon where if we even brushed against block of upstream code
> > that wasn't being well maintained we would be asked to fix it up and
> > address existing issues before we could upstream any patches.
>
> Well, Intel has it's own karma problems in the kernel community. :(

Oh, I know. I resisted the urge to push out the driver as "idgaf:
Internal Device Generated at Facebook" on April 1st instead of "fbnic"
to poke fun at the presentation they did at Netdev 0x16 where they
were trying to say all the vendors should be implementing "idpf" since
they made it a standard.

> > > In my view the vendor/!vendor distinction is really toxic and should
> > > stop.
> >
> > I agree. However that was essentially what started all this when Jiri
> > pointed out that we weren't selling the NIC to anyone else. That made
> > this all about vendor vs !vendor,
>
> That is not how I would sum up Jiri's position.
>
> By my read he is saying that contributing code to the kernel that only
> Meta can actually use is purely extractive. It is not about vendor or
> !vendor, it is taking-free-forwardporting or not. You have argued,
> and I would agree, that there is a grey scale between
> extractive/collaborative - but I also agree with Jiri that fbnic is
> undeniably far toward the extractive side.
>
> If being extractive is a problem in this case or not is another
> question, but I would say Jiri's objection is definitely not about
> selling or vendor vs !vendor.
>
> Jason

It all depends on your definition of being extractive. I would assume
a "consumer" that is running a large number of systems and is capable
of providing sophisticated feedback on issues found within the kernel,
in many cases providing fixes for said issues, or working with
maintainers on resolution of said issues, is not extractive.

The fact that said "consumer" decides to then produce their own device
becoming a "prosumer" means they are now able to more accurately and
quickly diagnose issues when they see them. They can design things
such that there isn't some black box of firmware, or a third party
driver, in the datapath that prevents them from quickly diagnosing the
issue. So if anything I would think that is a net positive for the
community as it allows the "prosumer" to provide much more quick and
useful feedback on issues found in the kernel rather than having to
wait on a third party vendor to provide additional input.

Note I am not going after any particular vendor with my comments. This
applies to all vendors. The problem as a customer is that you are
limited on what you can do once you find an issue. Quite often you are
at the mercy of the vendor in such cases, especially when there seems
to be either firmware or "security" issues involved.

Thanks,

- Alex

Jason Gunthorpe April 9, 2024, 6:54 p.m. UTC | #65

On Tue, Apr 09, 2024 at 11:38:59AM -0700, Alexander Duyck wrote:
> > > phenomenon where if we even brushed against block of upstream code
> > > that wasn't being well maintained we would be asked to fix it up and
> > > address existing issues before we could upstream any patches.
> >
> > Well, Intel has it's own karma problems in the kernel community. :(
> 
> Oh, I know. I resisted the urge to push out the driver as "idgaf:
> Internal Device Generated at Facebook" on April 1st instead of
> "fbnic"

That would have been hilarious!

> to poke fun at the presentation they did at Netdev 0x16 where they
> were trying to say all the vendors should be implementing "idpf" since
> they made it a standard.

Yes, I noticed this also. For all the worries I've heard lately about
lack of commonality/etc it seems like a major missed ecosystem
opportunity to have not invested in an industry standard. From what I
can see fbnic has no hope of being anything more than a one-off
generation for Meta. Too many silicon design micro-details are exposed
to the OS.

> It all depends on your definition of being extractive. I would assume
> a "consumer" that is running a large number of systems and is capable
> of providing sophisticated feedback on issues found within the kernel,
> in many cases providing fixes for said issues, or working with
> maintainers on resolution of said issues, is not extractive.

I don't know, as I said there is some grey scale.

IMHO it is not appropriate to make such decisions based on some
company wide metric. fbnic team alone should be judged and shouldn't
get a free ride based on the other good work Meta is doing. Otherwise
it turns into a thing where bigger/richer companies just get to do
whatever they want because they do the most "good" in aggregate.

Jason

Leon Romanovsky April 9, 2024, 7:15 p.m. UTC | #66

On Tue, Apr 09, 2024 at 11:38:59AM -0700, Alexander Duyck wrote:
> On Tue, Apr 9, 2024 at 10:12 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
> >
> > On Tue, Apr 09, 2024 at 09:31:06AM -0700, Alexander Duyck wrote:
> >
> > > > expectation is generally things like:
> > > >
> > > >  - The bug is fixed immediately because the issue is obvious to the
> > > >    author
> > > >  - Iteration and rapid progress is seen toward enlightening the author
> > > >  - The patch is reverted, often rapidly, try again later with a good
> > > >    patch
> > >
> > > When working on a development branch that shouldn't be the
> > > expectation. I suspect that is why the revert was pushed back on
> > > initially. The developer wanted a chance to try to debug and resolve
> > > the issue with root cause.
> >
> > Even mm-unstable drops patches on a hair trigger, as an example.
> >
> > You can't have an orderly development process if your development tree
> > is broken in your CI.. Personally I'm grateful for the people who test
> > linux-next (or the various constituent sub trees), it really helps.
> >
> > > Well much of it has to do with the fact that this is supposed to be a
> > > community. Generally I help you, you help me and together we both make
> > > progress. So within the community people tend to build up what we
> > > could call karma. Generally I think some of the messages sent seemed
> > > to make it come across that the Mellanox/Nvidia folks felt it "wasn't
> > > their problem" so they elicited a bit of frustration from the other
> > > maintainers and built up some negative karma.
> >
> > How could it be NVIDIA folks problem? They are not experts in TCP and
> > can't debug it. The engineer running the CI systems did what he was
> > asked by Eric from what I can tell.
> 
> No, I get your message. I wasn't saying it was your problem. All that
> can be asked for is such cooperation. Like I said I think some of the
> problem was the messaging more than the process.

Patch with revert came month+ after we reported the issue and were ready
to do anything to find the root cause, so it is not the messaging issue,
it was the exclusion from process issue.

I tried to avoid to write the below, but because Jason brought it
already, I'll write my feelings.

Current netdev has very toxic environment, with binary separation to
vendors and not-vendors.

Vendors are bad guys who day and night try to cheat and sneak their
dirty hacks into the kernel. Their contributions are negligible and
can't be trusted by definition.

Luckily enough, there are some "not-vendors" and they are the good
guys who know what is the best for the community and all other world.

Thanks

Alexander Duyck April 9, 2024, 8:03 p.m. UTC | #67

On Tue, Apr 9, 2024 at 11:55 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Tue, Apr 09, 2024 at 11:38:59AM -0700, Alexander Duyck wrote:
> > > > phenomenon where if we even brushed against block of upstream code
> > > > that wasn't being well maintained we would be asked to fix it up and
> > > > address existing issues before we could upstream any patches.
> > >
> > > Well, Intel has it's own karma problems in the kernel community. :(
> >
> > Oh, I know. I resisted the urge to push out the driver as "idgaf:
> > Internal Device Generated at Facebook" on April 1st instead of
> > "fbnic"
>
> That would have been hilarious!
>
> > to poke fun at the presentation they did at Netdev 0x16 where they
> > were trying to say all the vendors should be implementing "idpf" since
> > they made it a standard.
>
> Yes, I noticed this also. For all the worries I've heard lately about
> lack of commonality/etc it seems like a major missed ecosystem
> opportunity to have not invested in an industry standard. From what I
> can see fbnic has no hope of being anything more than a one-off
> generation for Meta. Too many silicon design micro-details are exposed
> to the OS.

I know. The fact is we aren't trying to abstract away anything as that
would mean a larger firmware blob. That is the problem with an
abstraction like idpf is that it just adds more overhead as you have
to have the firmware manage more of the control plane.

> > It all depends on your definition of being extractive. I would assume
> > a "consumer" that is running a large number of systems and is capable
> > of providing sophisticated feedback on issues found within the kernel,
> > in many cases providing fixes for said issues, or working with
> > maintainers on resolution of said issues, is not extractive.
>
> I don't know, as I said there is some grey scale.
>
> IMHO it is not appropriate to make such decisions based on some
> company wide metric. fbnic team alone should be judged and shouldn't
> get a free ride based on the other good work Meta is doing. Otherwise
> it turns into a thing where bigger/richer companies just get to do
> whatever they want because they do the most "good" in aggregate.

The problem here in this case is that I am pretty much the heart of
the software driver team with a few new hires onboarding the next
couple months. People were asking why others were jumping to my
defense, well if we are going to judge the team they are mostly
judging me. I'm just hoping my reputation has spoken for itself
considering I was making significant contributions to the drivers even
after I have gone through several changes of employer.

Jakub Kicinski April 9, 2024, 8:51 p.m. UTC | #68

On Wed, 03 Apr 2024 13:08:24 -0700 Alexander Duyck wrote:
> This patch set includes the necessary patches to enable basic Tx and Rx
> over the Meta Platforms Host Network Interface. To do this we introduce a
> new driver and driver and directories in the form of
> "drivers/net/ethernet/meta/fbnic".

Let me try to restate some takeaways and ask for further clarification
on the main question...

First, I think there's broad support for merging the driver itself.

IIUC there is also broad support to raise the expectations from
maintainers of drivers for private devices, specifically that they will:
 - receive weaker "no regression" guarantees
 - help with refactoring / adapting their drivers more actively
 - not get upset when we delete those drivers if they stop participating

If you think that the drivers should be merged *without* setting these
expectations, please speak up.

Nobody picked me up on the suggestion to use the CI as a proactive
check whether the maintainer / owner is still paying attention, 
but okay :(

What is less clear to me is what do we do about uAPI / core changes.
Of those who touched on the subject - few people seem to be curious /
welcoming to any reasonable features coming out for private devices
(John, Olek, Florian)? Others are more cautious focusing on blast
radius and referring to the "two driver rule" (Daniel, Paolo)?
Whether that means outright ban on touching common code or uAPI
in ways which aren't exercised by commercial NICs, is unclear. 
Andrew and Ed did not address the question directly AFAICT.

Is my reading correct? Does anyone have an opinion on whether we should
try to dig more into this question prior to merging the driver, and
set some ground rules? Or proceed and learn by doing?

Willem de Bruijn April 9, 2024, 9:06 p.m. UTC | #69

Jakub Kicinski wrote:
> On Wed, 03 Apr 2024 13:08:24 -0700 Alexander Duyck wrote:
> > This patch set includes the necessary patches to enable basic Tx and Rx
> > over the Meta Platforms Host Network Interface. To do this we introduce a
> > new driver and driver and directories in the form of
> > "drivers/net/ethernet/meta/fbnic".
> 
> Let me try to restate some takeaways and ask for further clarification
> on the main question...
> 
> First, I think there's broad support for merging the driver itself.
> 
> IIUC there is also broad support to raise the expectations from
> maintainers of drivers for private devices, specifically that they will:
>  - receive weaker "no regression" guarantees
>  - help with refactoring / adapting their drivers more actively
>  - not get upset when we delete those drivers if they stop participating
> 
> If you think that the drivers should be merged *without* setting these
> expectations, please speak up.
> 
> Nobody picked me up on the suggestion to use the CI as a proactive
> check whether the maintainer / owner is still paying attention, 
> but okay :(
> 
> 
> What is less clear to me is what do we do about uAPI / core changes.
> Of those who touched on the subject - few people seem to be curious /
> welcoming to any reasonable features coming out for private devices
> (John, Olek, Florian)? Others are more cautious focusing on blast
> radius and referring to the "two driver rule" (Daniel, Paolo)?
> Whether that means outright ban on touching common code or uAPI
> in ways which aren't exercised by commercial NICs, is unclear. 
> Andrew and Ed did not address the question directly AFAICT.
> 
> Is my reading correct? Does anyone have an opinion on whether we should
> try to dig more into this question prior to merging the driver, and
> set some ground rules? Or proceed and learn by doing?

Thanks for summarizing. That was my reading too

Two distict questions

1. whether a standard driver is as admissible if the device is not
   available on the open market.

2. whether new device features can be supported without at least
   two available devices supporting it.

FWIW, +1 for 1 from me. Any serious device that exists in quantity
and is properly maintained should be in-tree.

In terms of trusting Meta, it is less about karma, but an indication
of these two requirements when the driver first appears. We would not
want to merge vaporware drivers from unknown sources or university
research projects.

2 is out of scope for this series. But I would always want to hear
about potential new features that an organization finds valuable
enough to implement. Rather than a blanket rule against them.

Jason Gunthorpe April 9, 2024, 11:11 p.m. UTC | #70

On Tue, Apr 09, 2024 at 01:03:06PM -0700, Alexander Duyck wrote:
> People were asking why others were jumping to my defense, well if we
> are going to judge the team they are mostly judging me. I'm just
> hoping my reputation has spoken for itself considering I was making
> significant contributions to the drivers even after I have gone
> through several changes of employer.

I don't think anything in this thread is about you personally. You
were just given a "bad vendor" task by your employer and you carried
it out. As is usual with these things there is legitimate disagreement
on what it means to be a "bad vendor".

Jason

Andrew Lunn April 9, 2024, 11:42 p.m. UTC | #71

> What is less clear to me is what do we do about uAPI / core changes.

I would differentiate between core change and core additions. If there
is very limited firmware on this device, i assume Linux is managing
the SFP cage, and to some extend the PCS. Extending the core to handle
these at higher speeds than currently supported would be one such core
addition. I've no problem with this. And i doubt it will be a single
NIC using such additions for too long. It looks like ClearFog CX LX2
could make use of such extensions as well, and there are probably
other boards and devices, maybe the Zynq 7000?

> Is my reading correct? Does anyone have an opinion on whether we should
> try to dig more into this question prior to merging the driver, and
> set some ground rules? Or proceed and learn by doing?

I'm not too keen on keeping potentially shareable code in the driver
just because of UEFI. It has long been the norm that you should not
have wrappers so you can reuse code in different OSes. And UEFI is
just another OS. So i really would like to see a Linux I2C bus master
driver, a linux GPIO driver if appropriate, and using phylink, just as
i've pushed wangxun to do that, and to some extend nvidia with their
GPIO controller embedded in their NIC. The nice thing is, the
developers for wangxun has mostly solved all this for a PCIe device,
so their code can be copied.

Do we need to set some ground rules? No. I can give similar feedback
as i gave the wangxun developers, if Linux subsystems are not used
appropriately.

       Andrew

Jiri Pirko April 10, 2024, 7:26 a.m. UTC | #72

Tue, Apr 09, 2024 at 11:06:05PM CEST, willemdebruijn.kernel@gmail.com wrote:
>Jakub Kicinski wrote:
>> On Wed, 03 Apr 2024 13:08:24 -0700 Alexander Duyck wrote:

[...]

>
>2. whether new device features can be supported without at least
>   two available devices supporting it.
>

[...]

>
>2 is out of scope for this series. But I would always want to hear
>about potential new features that an organization finds valuable
>enough to implement. Rather than a blanket rule against them.

This appears out of the nowhere. In the past, I would say wast majority
of the features was merged with single device implementation. Often, it
is the only device out there at a time that supports the feature.
This limitation would put break for feature additions. I can put a long
list of features that would not be here ever (like 50% of mlxsw driver).

>
>

Jiri Pirko April 10, 2024, 7:42 a.m. UTC | #73

Tue, Apr 09, 2024 at 10:51:42PM CEST, kuba@kernel.org wrote:
>On Wed, 03 Apr 2024 13:08:24 -0700 Alexander Duyck wrote:
>> This patch set includes the necessary patches to enable basic Tx and Rx
>> over the Meta Platforms Host Network Interface. To do this we introduce a
>> new driver and driver and directories in the form of
>> "drivers/net/ethernet/meta/fbnic".
>
>Let me try to restate some takeaways and ask for further clarification
>on the main question...
>
>First, I think there's broad support for merging the driver itself.
>
>IIUC there is also broad support to raise the expectations from
>maintainers of drivers for private devices, specifically that they will:
> - receive weaker "no regression" guarantees
> - help with refactoring / adapting their drivers more actively

:)


> - not get upset when we delete those drivers if they stop participating

Sorry for being pain, but I would still like to see some sumarization of
what is actually the gain for the community to merge this unused driver.
So far, I don't recall to read anything solid.

btw:
Kconfig description should contain:
 Say N here, you can't ever see this device in real world.


>
>If you think that the drivers should be merged *without* setting these
>expectations, please speak up.
>
>Nobody picked me up on the suggestion to use the CI as a proactive
>check whether the maintainer / owner is still paying attention, 
>but okay :(
>
>
>What is less clear to me is what do we do about uAPI / core changes.
>Of those who touched on the subject - few people seem to be curious /
>welcoming to any reasonable features coming out for private devices
>(John, Olek, Florian)? Others are more cautious focusing on blast
>radius and referring to the "two driver rule" (Daniel, Paolo)?
>Whether that means outright ban on touching common code or uAPI
>in ways which aren't exercised by commercial NICs, is unclear. 

For these kind of unused drivers, I think it would be legit to
disallow any internal/external api changes. Just do that for some
normal driver, then benefit from the changes in the unused driver.

Now the question is, how to distinguish these 2 driver kinds? Maybe to
put them under some directory so it is clear?
drivers/net/unused/ethernet/meta/fbnic/


>Andrew and Ed did not address the question directly AFAICT.
>
>Is my reading correct? Does anyone have an opinion on whether we should
>try to dig more into this question prior to merging the driver, and
>set some ground rules? Or proceed and learn by doing?
>

Jiri Pirko April 10, 2024, 9:37 a.m. UTC | #74

Tue, Apr 09, 2024 at 10:03:06PM CEST, alexander.duyck@gmail.com wrote:
>On Tue, Apr 9, 2024 at 11:55 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
>>
>> On Tue, Apr 09, 2024 at 11:38:59AM -0700, Alexander Duyck wrote:
>> > > > phenomenon where if we even brushed against block of upstream code
>> > > > that wasn't being well maintained we would be asked to fix it up and
>> > > > address existing issues before we could upstream any patches.
>> > >
>> > > Well, Intel has it's own karma problems in the kernel community. :(
>> >
>> > Oh, I know. I resisted the urge to push out the driver as "idgaf:
>> > Internal Device Generated at Facebook" on April 1st instead of
>> > "fbnic"
>>
>> That would have been hilarious!
>>
>> > to poke fun at the presentation they did at Netdev 0x16 where they
>> > were trying to say all the vendors should be implementing "idpf" since
>> > they made it a standard.
>>
>> Yes, I noticed this also. For all the worries I've heard lately about
>> lack of commonality/etc it seems like a major missed ecosystem
>> opportunity to have not invested in an industry standard. From what I
>> can see fbnic has no hope of being anything more than a one-off
>> generation for Meta. Too many silicon design micro-details are exposed
>> to the OS.
>
>I know. The fact is we aren't trying to abstract away anything as that
>would mean a larger firmware blob. That is the problem with an
>abstraction like idpf is that it just adds more overhead as you have
>to have the firmware manage more of the control plane.
>
>> > It all depends on your definition of being extractive. I would assume
>> > a "consumer" that is running a large number of systems and is capable
>> > of providing sophisticated feedback on issues found within the kernel,
>> > in many cases providing fixes for said issues, or working with
>> > maintainers on resolution of said issues, is not extractive.
>>
>> I don't know, as I said there is some grey scale.
>>
>> IMHO it is not appropriate to make such decisions based on some
>> company wide metric. fbnic team alone should be judged and shouldn't
>> get a free ride based on the other good work Meta is doing. Otherwise
>> it turns into a thing where bigger/richer companies just get to do
>> whatever they want because they do the most "good" in aggregate.
>
>The problem here in this case is that I am pretty much the heart of
>the software driver team with a few new hires onboarding the next
>couple months. People were asking why others were jumping to my
>defense, well if we are going to judge the team they are mostly
>judging me. I'm just hoping my reputation has spoken for itself
>considering I was making significant contributions to the drivers even
>after I have gone through several changes of employer.

Let me clearly state two things this thread it running around all the
time:
1) This has nothing to do with you Alex, at all. If John Doe was the one
   pushing this, from my perspective, everything would be exactly the same.
2) This is not about selling devices. I already stresses that out
   multiple times, yet people are still talking about selling. This is
   about possibility of outside-Meta person to meet the device in real
   world with real usecase (not emulated, that does not make any sense).

Alexander Lobakin April 10, 2024, 11:45 a.m. UTC | #75

From: Jiri Pirko <jiri@resnulli.us>
Date: Tue, 9 Apr 2024 16:41:10 +0200

> Tue, Apr 09, 2024 at 03:11:21PM CEST, aleksander.lobakin@intel.com wrote:
>> From: Jiri Pirko <jiri@resnulli.us>
>> Date: Tue, 9 Apr 2024 13:01:51 +0200
>>
>>> Mon, Apr 08, 2024 at 07:32:59PM CEST, john.fastabend@gmail.com wrote:
>>>> Jiri Pirko wrote:
>>>>> Mon, Apr 08, 2024 at 05:46:35PM CEST, alexander.duyck@gmail.com wrote:
>>>>>> On Mon, Apr 8, 2024 at 4:51 AM Jiri Pirko <jiri@resnulli.us> wrote:
>>>>>>>
>>>>>>> Fri, Apr 05, 2024 at 08:38:25PM CEST, alexander.duyck@gmail.com wrote:
>>>>>>>> On Fri, Apr 5, 2024 at 8:17 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
>>>>>>>>>
>>>>>>>>> On Fri, Apr 05, 2024 at 07:24:32AM -0700, Alexander Duyck wrote:
>>>>>>>>>>> Alex already indicated new features are coming, changes to the core
>>>>>>>>>>> code will be proposed. How should those be evaluated? Hypothetically
>>>>>>>>>>> should fbnic be allowed to be the first implementation of something
>>>>>>>>>>> invasive like Mina's DMABUF work? Google published an open userspace
>>>>>>>>>>> for NCCL that people can (in theory at least) actually run. Meta would
>>>>>>>>>>> not be able to do that. I would say that clearly crosses the line and
>>>>>>>>>>> should not be accepted.
>>>>>>>>>>
>>>>>>>>>> Why not? Just because we are not commercially selling it doesn't mean
>>>>>>>>>> we couldn't look at other solutions such as QEMU. If we were to
>>>>>>>>>> provide a github repo with an emulation of the NIC would that be
>>>>>>>>>> enough to satisfy the "commercial" requirement?
>>>>>>>>>
>>>>>>>>> My test is not "commercial", it is enabling open source ecosystem vs
>>>>>>>>> benefiting only proprietary software.
>>>>>>>>
>>>>>>>> Sorry, that was where this started where Jiri was stating that we had
>>>>>>>> to be selling this.
>>>>>>>
>>>>>>> For the record, I never wrote that. Not sure why you repeat this over
>>>>>>> this thread.
>>>>>>
>>>>>> Because you seem to be implying that the Meta NIC driver shouldn't be
>>>>>> included simply since it isn't going to be available outside of Meta.
>>
>> BTW idpf is also not something you can go and buy in a store, but it's
>> here in the kernel. Anyway, see below.
> 
> IDK, why so many people in this thread are so focused on "buying" nic.
> IDPF device is something I assume one may see on a VM hosted in some
> cloud, isn't it? If yes, it is completely legit to have it in. Do I miss
> something?

Anyhow, we want the upstream Linux kernel to work out of box on most
systems. Rejecting this driver basically encourages to still prefer
OOT/proprietary crap.

> 
> 
>>
>>>>>> The fact is Meta employs a number of kernel developers and as a result
>>>>>> of that there will be a number of kernel developers that will have
>>>>>> access to this NIC and likely do development on systems containing it.
>>
>> [...]
>>
>>>> Vendors would happily spin up a NIC if a DC with scale like this
>>>> would pay for it. They just don't advertise it in patch 0/X,
>>>> "adding device for cloud provider foo".
>>>>
>>>> There is no difference here. We gain developers, we gain insights,
>>>> learnings and Linux and OSS drivers are running on another big
>>>> DC. They improve things and find bugs they upstream them its a win.
>>>>
>>>> The opposite is also true if we exclude a driver/NIC HW that is
>>>> running on major DCs we lose a lot of insight, experience, value.
>>>
>>> Could you please describe in details and examples what exactly is we
>>> are about to loose? I don't see it.
>>
>> As long as driver A introduces new features / improvements / API /
>> whatever to the core kernel, we benefit from this no matter whether I'm
>> actually able to run this driver on my system.
>>
>> Some drivers even give us benefit by that they are of good quality (I
>> don't speak for this driver, just some hypothetical) and/or have
>> interesting design / code / API / etc. choices. The drivers I work on
>> did gain a lot just from that I was reading new commits / lore threads
>> and look at changes in other drivers.
>>
>> I saw enough situations when driver A started using/doing something the
>> way it wasn't ever done anywhere before, and then more and more drivers
>> stated doing the same thing and at the end it became sorta standard.
> 
> So bottom line is, the unused driver *may* introduce some features and
> *may* provide as an example of how to do things for other people.
> Is this really that beneficial for the community that it overweights
> the obvious cons (not going to repeat them)?
> 
> Like with any other patch/set we merge in, we always look at the cons
> and pros. I'm honestly surprised that so many people here
> want to make exception for Meta's internal toy project.

It's not me who wants to make an exception. I haven't ever seen a driver
rejected due to "it can be used only somewhere where I can't go", so
looks like it's you who wants to make an exception :>

Thanks,
Olek

Jiri Pirko April 10, 2024, 12:12 p.m. UTC | #76

Wed, Apr 10, 2024 at 01:45:54PM CEST, aleksander.lobakin@intel.com wrote:
>From: Jiri Pirko <jiri@resnulli.us>
>Date: Tue, 9 Apr 2024 16:41:10 +0200
>
>> Tue, Apr 09, 2024 at 03:11:21PM CEST, aleksander.lobakin@intel.com wrote:
>>> From: Jiri Pirko <jiri@resnulli.us>
>>> Date: Tue, 9 Apr 2024 13:01:51 +0200
>>>
>>>> Mon, Apr 08, 2024 at 07:32:59PM CEST, john.fastabend@gmail.com wrote:
>>>>> Jiri Pirko wrote:
>>>>>> Mon, Apr 08, 2024 at 05:46:35PM CEST, alexander.duyck@gmail.com wrote:
>>>>>>> On Mon, Apr 8, 2024 at 4:51 AM Jiri Pirko <jiri@resnulli.us> wrote:
>>>>>>>>
>>>>>>>> Fri, Apr 05, 2024 at 08:38:25PM CEST, alexander.duyck@gmail.com wrote:
>>>>>>>>> On Fri, Apr 5, 2024 at 8:17 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
>>>>>>>>>>
>>>>>>>>>> On Fri, Apr 05, 2024 at 07:24:32AM -0700, Alexander Duyck wrote:
>>>>>>>>>>>> Alex already indicated new features are coming, changes to the core
>>>>>>>>>>>> code will be proposed. How should those be evaluated? Hypothetically
>>>>>>>>>>>> should fbnic be allowed to be the first implementation of something
>>>>>>>>>>>> invasive like Mina's DMABUF work? Google published an open userspace
>>>>>>>>>>>> for NCCL that people can (in theory at least) actually run. Meta would
>>>>>>>>>>>> not be able to do that. I would say that clearly crosses the line and
>>>>>>>>>>>> should not be accepted.
>>>>>>>>>>>
>>>>>>>>>>> Why not? Just because we are not commercially selling it doesn't mean
>>>>>>>>>>> we couldn't look at other solutions such as QEMU. If we were to
>>>>>>>>>>> provide a github repo with an emulation of the NIC would that be
>>>>>>>>>>> enough to satisfy the "commercial" requirement?
>>>>>>>>>>
>>>>>>>>>> My test is not "commercial", it is enabling open source ecosystem vs
>>>>>>>>>> benefiting only proprietary software.
>>>>>>>>>
>>>>>>>>> Sorry, that was where this started where Jiri was stating that we had
>>>>>>>>> to be selling this.
>>>>>>>>
>>>>>>>> For the record, I never wrote that. Not sure why you repeat this over
>>>>>>>> this thread.
>>>>>>>
>>>>>>> Because you seem to be implying that the Meta NIC driver shouldn't be
>>>>>>> included simply since it isn't going to be available outside of Meta.
>>>
>>> BTW idpf is also not something you can go and buy in a store, but it's
>>> here in the kernel. Anyway, see below.
>> 
>> IDK, why so many people in this thread are so focused on "buying" nic.
>> IDPF device is something I assume one may see on a VM hosted in some
>> cloud, isn't it? If yes, it is completely legit to have it in. Do I miss
>> something?
>
>Anyhow, we want the upstream Linux kernel to work out of box on most
>systems. Rejecting this driver basically encourages to still prefer
>OOT/proprietary crap.

Totally true. Out of the box on as many systems as possible. This is not
the case. Can you show me an example of a person outside-of-Meta can
benefit from using this driver out-of-box?


>
>> 
>> 
>>>
>>>>>>> The fact is Meta employs a number of kernel developers and as a result
>>>>>>> of that there will be a number of kernel developers that will have
>>>>>>> access to this NIC and likely do development on systems containing it.
>>>
>>> [...]
>>>
>>>>> Vendors would happily spin up a NIC if a DC with scale like this
>>>>> would pay for it. They just don't advertise it in patch 0/X,
>>>>> "adding device for cloud provider foo".
>>>>>
>>>>> There is no difference here. We gain developers, we gain insights,
>>>>> learnings and Linux and OSS drivers are running on another big
>>>>> DC. They improve things and find bugs they upstream them its a win.
>>>>>
>>>>> The opposite is also true if we exclude a driver/NIC HW that is
>>>>> running on major DCs we lose a lot of insight, experience, value.
>>>>
>>>> Could you please describe in details and examples what exactly is we
>>>> are about to loose? I don't see it.
>>>
>>> As long as driver A introduces new features / improvements / API /
>>> whatever to the core kernel, we benefit from this no matter whether I'm
>>> actually able to run this driver on my system.
>>>
>>> Some drivers even give us benefit by that they are of good quality (I
>>> don't speak for this driver, just some hypothetical) and/or have
>>> interesting design / code / API / etc. choices. The drivers I work on
>>> did gain a lot just from that I was reading new commits / lore threads
>>> and look at changes in other drivers.
>>>
>>> I saw enough situations when driver A started using/doing something the
>>> way it wasn't ever done anywhere before, and then more and more drivers
>>> stated doing the same thing and at the end it became sorta standard.
>> 
>> So bottom line is, the unused driver *may* introduce some features and
>> *may* provide as an example of how to do things for other people.
>> Is this really that beneficial for the community that it overweights
>> the obvious cons (not going to repeat them)?
>> 
>> Like with any other patch/set we merge in, we always look at the cons
>> and pros. I'm honestly surprised that so many people here
>> want to make exception for Meta's internal toy project.
>
>It's not me who wants to make an exception. I haven't ever seen a driver
>rejected due to "it can be used only somewhere where I can't go", so
>looks like it's you who wants to make an exception :>

Could you please point me to some other existing driver for device
that does not exist (/exist only at some person's backyard)?


>
>Thanks,
>Olek

Przemek Kitszel April 10, 2024, 12:50 p.m. UTC | #77

On 4/10/24 09:42, Jiri Pirko wrote:
> Tue, Apr 09, 2024 at 10:51:42PM CEST, kuba@kernel.org wrote:
>> On Wed, 03 Apr 2024 13:08:24 -0700 Alexander Duyck wrote:
>>> This patch set includes the necessary patches to enable basic Tx and Rx
>>> over the Meta Platforms Host Network Interface. To do this we introduce a
>>> new driver and driver and directories in the form of
>>> "drivers/net/ethernet/meta/fbnic".
>>
>> Let me try to restate some takeaways and ask for further clarification
>> on the main question...
>>
>> First, I think there's broad support for merging the driver itself.
>>
>> IIUC there is also broad support to raise the expectations from
>> maintainers of drivers for private devices, specifically that they will:
>> - receive weaker "no regression" guarantees
>> - help with refactoring / adapting their drivers more actively
> 
> :)
> 
> 
>> - not get upset when we delete those drivers if they stop participating
> 
> Sorry for being pain, but I would still like to see some sumarization of
> what is actually the gain for the community to merge this unused driver.
> So far, I don't recall to read anything solid.

For me personally, both as a developer and as an user, any movement into
lean-FW direction is a breeze of fresh air.

And nobody is stopping Nvidia or other vendor from manufacturing
Advanced FBNIC Accelerator TM, that uses the driver as-is, but makes it
faster, better and cheaper that anything you could buy right now.

> 
> btw:
> Kconfig description should contain:
>   Say N here, you can't ever see this device in real world.
> 

Thank you for keeping this entertaining :)

> 
>>
>> If you think that the drivers should be merged *without* setting these
>> expectations, please speak up.
>>
>> Nobody picked me up on the suggestion to use the CI as a proactive
>> check whether the maintainer / owner is still paying attention,
>> but okay :(
>>
>>
>> What is less clear to me is what do we do about uAPI / core changes.
>> Of those who touched on the subject - few people seem to be curious /
>> welcoming to any reasonable features coming out for private devices
>> (John, Olek, Florian)? Others are more cautious focusing on blast
>> radius and referring to the "two driver rule" (Daniel, Paolo)?
>> Whether that means outright ban on touching common code or uAPI
>> in ways which aren't exercised by commercial NICs, is unclear.
> 
> For these kind of unused drivers, I think it would be legit to
> disallow any internal/external api changes. Just do that for some
> normal driver, then benefit from the changes in the unused driver.
> 
> Now the question is, how to distinguish these 2 driver kinds? Maybe to
> put them under some directory so it is clear?
> drivers/net/unused/ethernet/meta/fbnic/
> 
> 
>> Andrew and Ed did not address the question directly AFAICT.
>>
>> Is my reading correct? Does anyone have an opinion on whether we should
>> try to dig more into this question prior to merging the driver, and
>> set some ground rules? Or proceed and learn by doing?
>>
>

Jakub Kicinski April 10, 2024, 1:46 p.m. UTC | #78

On Wed, 10 Apr 2024 09:42:14 +0200 Jiri Pirko wrote:
> > - not get upset when we delete those drivers if they stop participating  
> 
> Sorry for being pain, but I would still like to see some sumarization of
> what is actually the gain for the community to merge this unused driver.
> So far, I don't recall to read anything solid.

From the discussion I think some folks made the point that it's
educational to see what big companies do, and seeing the work
may lead to reuse and other people adopting features / ideas.

> btw:
> Kconfig description should contain:
>  Say N here, you can't ever see this device in real world.

We do use standard distro kernels in some corners of the DC, AFAIU.

> >If you think that the drivers should be merged *without* setting these
> >expectations, please speak up.
> >
> >Nobody picked me up on the suggestion to use the CI as a proactive
> >check whether the maintainer / owner is still paying attention, 
> >but okay :(
> >
> >
> >What is less clear to me is what do we do about uAPI / core changes.
> >Of those who touched on the subject - few people seem to be curious /
> >welcoming to any reasonable features coming out for private devices
> >(John, Olek, Florian)? Others are more cautious focusing on blast
> >radius and referring to the "two driver rule" (Daniel, Paolo)?
> >Whether that means outright ban on touching common code or uAPI
> >in ways which aren't exercised by commercial NICs, is unclear.   
> 
> For these kind of unused drivers, I think it would be legit to
> disallow any internal/external api changes. Just do that for some
> normal driver, then benefit from the changes in the unused driver.

Unused is a bit strong, and we didn't put netdevsim in a special
directory. Let's see if more such drivers appear and if there
are practical uses for the separation for scripts etc?

Jiri Pirko April 10, 2024, 3:12 p.m. UTC | #79

Wed, Apr 10, 2024 at 03:46:11PM CEST, kuba@kernel.org wrote:
>On Wed, 10 Apr 2024 09:42:14 +0200 Jiri Pirko wrote:
>> > - not get upset when we delete those drivers if they stop participating  
>> 
>> Sorry for being pain, but I would still like to see some sumarization of
>> what is actually the gain for the community to merge this unused driver.
>> So far, I don't recall to read anything solid.
>
>From the discussion I think some folks made the point that it's
>educational to see what big companies do, and seeing the work
>may lead to reuse and other people adopting features / ideas.

Okay, if that's all, does it justify the cons? Will someone put this on
weights?


>
>> btw:
>> Kconfig description should contain:
>>  Say N here, you can't ever see this device in real world.
>
>We do use standard distro kernels in some corners of the DC, AFAIU.

I find it amusing to think about a distro vendor, for example RedHat,
to support driver for a proprietary private device.


>
>> >If you think that the drivers should be merged *without* setting these
>> >expectations, please speak up.
>> >
>> >Nobody picked me up on the suggestion to use the CI as a proactive
>> >check whether the maintainer / owner is still paying attention, 
>> >but okay :(
>> >
>> >
>> >What is less clear to me is what do we do about uAPI / core changes.
>> >Of those who touched on the subject - few people seem to be curious /
>> >welcoming to any reasonable features coming out for private devices
>> >(John, Olek, Florian)? Others are more cautious focusing on blast
>> >radius and referring to the "two driver rule" (Daniel, Paolo)?
>> >Whether that means outright ban on touching common code or uAPI
>> >in ways which aren't exercised by commercial NICs, is unclear.   
>> 
>> For these kind of unused drivers, I think it would be legit to
>> disallow any internal/external api changes. Just do that for some
>> normal driver, then benefit from the changes in the unused driver.
>
>Unused is a bit strong, and we didn't put netdevsim in a special
>directory. Let's see if more such drivers appear and if there
>are practical uses for the separation for scripts etc?

The practical use I see that the reviewer would spot right away is
someone pushes a feature implemented in this unused driver only.
Say it would be a clear mark for a driver of lower category.
For the person doing API change it would be an indication that he
does not have that cautious to not to break anything in this driver.
The driver maintainer should be the one to deal with potential issues.

With this clear marking and Documentation to describe it, I think I
would be ok to let this in, FWIW.

Alexander Duyck April 10, 2024, 3:56 p.m. UTC | #80

On Tue, Apr 9, 2024 at 4:42 PM Andrew Lunn <andrew@lunn.ch> wrote:
>
> > What is less clear to me is what do we do about uAPI / core changes.
>
> I would differentiate between core change and core additions. If there
> is very limited firmware on this device, i assume Linux is managing
> the SFP cage, and to some extend the PCS. Extending the core to handle
> these at higher speeds than currently supported would be one such core
> addition. I've no problem with this. And i doubt it will be a single
> NIC using such additions for too long. It looks like ClearFog CX LX2
> could make use of such extensions as well, and there are probably
> other boards and devices, maybe the Zynq 7000?

The driver on this device doesn't have full access over the PHY.
Basically we control everything from the PCS north, and the firmware
controls everything from the PMA south as the physical connection is
MUXed between 4 slices. So this means the firmware also controls all
the I2C and the QSFP and EEPROM. The main reason for this is that
those blocks are shared resources between the slices, as such the
firmware acts as the arbitrator for 4 slices and the BMC.

> > Is my reading correct? Does anyone have an opinion on whether we should
> > try to dig more into this question prior to merging the driver, and
> > set some ground rules? Or proceed and learn by doing?
>
> I'm not too keen on keeping potentially shareable code in the driver
> just because of UEFI. It has long been the norm that you should not
> have wrappers so you can reuse code in different OSes. And UEFI is
> just another OS. So i really would like to see a Linux I2C bus master
> driver, a linux GPIO driver if appropriate, and using phylink, just as
> i've pushed wangxun to do that, and to some extend nvidia with their
> GPIO controller embedded in their NIC. The nice thing is, the
> developers for wangxun has mostly solved all this for a PCIe device,
> so their code can be copied.

Well when you are a one man driver development team sharing code
definitely makes one's life much easier when you have to maintain
multiple drivers. That said, I will see what I can do to comply with
your requests while hopefully not increasing my maintenance burden too
much. If nothing else I might just write myself a kernel compatibility
shim in UEFI so I can reuse the code that way.

The driver has no access to I2C or GPIO. The QSFP and EEPROM are
shared so I don't have access directly from the driver and I will have
to send any messages over the Host/FW mailbox to change any of that.

> Do we need to set some ground rules? No. I can give similar feedback
> as i gave the wangxun developers, if Linux subsystems are not used
> appropriately.
>
>        Andrew

Yeah, I kind of assume that will always be the case. Rules only mean
something if they are enforced anway. Although having the rules
documented does help in terms of making them known to all those
involved.

Anyway, I have already incorporated most of the feedback from the
other maintainers. The two requests you had were a bit more difficult
as they are in areas I am not entirely familiar with so I thought I
would check in with you before I dive into them.

One request I recall you having was to look at using the LED framework
for the LEDs. I think I have enough info to chase that down and get it
resolved for v2. However if you have some examples you would prefer I
follow I can look into that.

As far as the PCS block does it matter if I expose what the actual
underlying IP is or isn't? My request for info on what I can disclose
seems to be moving at the speed of bureaucracy so I don't know how
long it would take if I were to try to write something truly generic.
That said if I just put together something that referred to this as
pcs-fbnic for now, would that work for making this acceptable for
upstream?

Jakub Kicinski April 10, 2024, 5:35 p.m. UTC | #81

On Wed, 10 Apr 2024 17:12:18 +0200 Jiri Pirko wrote:
> >> For these kind of unused drivers, I think it would be legit to
> >> disallow any internal/external api changes. Just do that for some
> >> normal driver, then benefit from the changes in the unused driver.  
> >
> >Unused is a bit strong, and we didn't put netdevsim in a special
> >directory. Let's see if more such drivers appear and if there
> >are practical uses for the separation for scripts etc?  
> 
> The practical use I see that the reviewer would spot right away is
> someone pushes a feature implemented in this unused driver only.
> Say it would be a clear mark for a driver of lower category.
> For the person doing API change it would be an indication that he
> does not have that cautious to not to break anything in this driver.
> The driver maintainer should be the one to deal with potential issues.

Hm, we currently group by vendor but the fact it's a private device
is probably more important indeed. For example if Google submits
a driver for a private device it may be confusing what's public
cloud (which I think/hope GVE is) and what's fully private.

So we could categorize by the characteristic rather than vendor:

drivers/net/ethernet/${term}/fbnic/

I'm afraid it may be hard for us to agree on an accurate term, tho.
"Unused" sounds.. odd, we don't keep unused code, "private"
sounds like we granted someone special right not took some away,
maybe "exclusive"? Or "besteffort"? Or "staging" :D  IDK.

> With this clear marking and Documentation to describe it, I think I
> would be ok to let this in, FWIW.

Florian Fainelli April 10, 2024, 5:39 p.m. UTC | #82

On 4/10/2024 10:35 AM, Jakub Kicinski wrote:
> On Wed, 10 Apr 2024 17:12:18 +0200 Jiri Pirko wrote:
>>>> For these kind of unused drivers, I think it would be legit to
>>>> disallow any internal/external api changes. Just do that for some
>>>> normal driver, then benefit from the changes in the unused driver.
>>>
>>> Unused is a bit strong, and we didn't put netdevsim in a special
>>> directory. Let's see if more such drivers appear and if there
>>> are practical uses for the separation for scripts etc?
>>
>> The practical use I see that the reviewer would spot right away is
>> someone pushes a feature implemented in this unused driver only.
>> Say it would be a clear mark for a driver of lower category.
>> For the person doing API change it would be an indication that he
>> does not have that cautious to not to break anything in this driver.
>> The driver maintainer should be the one to deal with potential issues.
> 
> Hm, we currently group by vendor but the fact it's a private device
> is probably more important indeed. For example if Google submits
> a driver for a private device it may be confusing what's public
> cloud (which I think/hope GVE is) and what's fully private.
> 
> So we could categorize by the characteristic rather than vendor:
> 
> drivers/net/ethernet/${term}/fbnic/
> 
> I'm afraid it may be hard for us to agree on an accurate term, tho.
> "Unused" sounds.. odd, we don't keep unused code, "private"
> sounds like we granted someone special right not took some away,
> maybe "exclusive"? Or "besteffort"? Or "staging" :D  IDK.

Do we really need that categorization at the directory/filesystem level? 
cannot we just document it clearly in the Kconfig help text and under 
Documentation/networking/?

Jakub Kicinski April 10, 2024, 5:56 p.m. UTC | #83

On Wed, 10 Apr 2024 10:39:11 -0700 Florian Fainelli wrote:
> > Hm, we currently group by vendor but the fact it's a private device
> > is probably more important indeed. For example if Google submits
> > a driver for a private device it may be confusing what's public
> > cloud (which I think/hope GVE is) and what's fully private.
> > 
> > So we could categorize by the characteristic rather than vendor:
> > 
> > drivers/net/ethernet/${term}/fbnic/
> > 
> > I'm afraid it may be hard for us to agree on an accurate term, tho.
> > "Unused" sounds.. odd, we don't keep unused code, "private"
> > sounds like we granted someone special right not took some away,
> > maybe "exclusive"? Or "besteffort"? Or "staging" :D  IDK.  
> 
> Do we really need that categorization at the directory/filesystem level? 
> cannot we just document it clearly in the Kconfig help text and under 
> Documentation/networking/?

From the reviewer perspective I think we will just remember.
If some newcomer tries to do refactoring they may benefit from seeing
this is a special device and more help is offered. Dunno if a newcomer
would look at the right docs.

Whether it's more "paperwork" than we'll actually gain, I have no idea.
I may not be the best person to comment.

Florian Fainelli April 10, 2024, 6 p.m. UTC | #84

On 4/10/2024 10:56 AM, Jakub Kicinski wrote:
> On Wed, 10 Apr 2024 10:39:11 -0700 Florian Fainelli wrote:
>>> Hm, we currently group by vendor but the fact it's a private device
>>> is probably more important indeed. For example if Google submits
>>> a driver for a private device it may be confusing what's public
>>> cloud (which I think/hope GVE is) and what's fully private.
>>>
>>> So we could categorize by the characteristic rather than vendor:
>>>
>>> drivers/net/ethernet/${term}/fbnic/
>>>
>>> I'm afraid it may be hard for us to agree on an accurate term, tho.
>>> "Unused" sounds.. odd, we don't keep unused code, "private"
>>> sounds like we granted someone special right not took some away,
>>> maybe "exclusive"? Or "besteffort"? Or "staging" :D  IDK.
>>
>> Do we really need that categorization at the directory/filesystem level?
>> cannot we just document it clearly in the Kconfig help text and under
>> Documentation/networking/?
> 
>  From the reviewer perspective I think we will just remember.
> If some newcomer tries to do refactoring they may benefit from seeing
> this is a special device and more help is offered. Dunno if a newcomer
> would look at the right docs.
> 
> Whether it's more "paperwork" than we'll actually gain, I have no idea.
> I may not be the best person to comment.

To me it is starting to feel like more paperwork than warranted, 
although I cannot really think about an "implied" metric that we could 
track, short of monitoring patches/bug reports coming from outside of 
the original driver authors/owners as an indication of how widely 
utilized a given driver is.

The number of changes to the driver between release cycles is not a good 
indication between a driver with few users presents the most agile 
configuration, but similarly, a very actively used driver with real 
world users may see a large number of changes between releases based 
upon its use.

What we need is some sort of popularity contest tracking :)

Alexander Duyck April 10, 2024, 6:01 p.m. UTC | #85

On Wed, Apr 10, 2024 at 10:56 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Wed, 10 Apr 2024 10:39:11 -0700 Florian Fainelli wrote:
> > > Hm, we currently group by vendor but the fact it's a private device
> > > is probably more important indeed. For example if Google submits
> > > a driver for a private device it may be confusing what's public
> > > cloud (which I think/hope GVE is) and what's fully private.
> > >
> > > So we could categorize by the characteristic rather than vendor:
> > >
> > > drivers/net/ethernet/${term}/fbnic/
> > >
> > > I'm afraid it may be hard for us to agree on an accurate term, tho.
> > > "Unused" sounds.. odd, we don't keep unused code, "private"
> > > sounds like we granted someone special right not took some away,
> > > maybe "exclusive"? Or "besteffort"? Or "staging" :D  IDK.
> >
> > Do we really need that categorization at the directory/filesystem level?
> > cannot we just document it clearly in the Kconfig help text and under
> > Documentation/networking/?
>
> From the reviewer perspective I think we will just remember.
> If some newcomer tries to do refactoring they may benefit from seeing
> this is a special device and more help is offered. Dunno if a newcomer
> would look at the right docs.
>
> Whether it's more "paperwork" than we'll actually gain, I have no idea.
> I may not be the best person to comment.

Are we going to go through and retro-actively move some of the drivers
that are already there that are exclusive to specific companies? That
is the bigger issue as I see it. It has already been brought up that
idpf is exclusive. In addition several other people have reached out
to me about other devices that are exclusive to other organizations.

I don't see any value in it as it would just encourage people to lie
in order to avoid being put in what would essentially become a
blacklisted directory.

If we are going to be trying to come up with some special status maybe
it makes sense to have some status in the MAINTAINERS file that would
indicate that this driver is exclusive to some organization and not
publicly available so any maintenance would have to be proprietary.

Florian Fainelli April 10, 2024, 6:29 p.m. UTC | #86

On 4/10/2024 11:01 AM, Alexander Duyck wrote:
> On Wed, Apr 10, 2024 at 10:56 AM Jakub Kicinski <kuba@kernel.org> wrote:
>>
>> On Wed, 10 Apr 2024 10:39:11 -0700 Florian Fainelli wrote:
>>>> Hm, we currently group by vendor but the fact it's a private device
>>>> is probably more important indeed. For example if Google submits
>>>> a driver for a private device it may be confusing what's public
>>>> cloud (which I think/hope GVE is) and what's fully private.
>>>>
>>>> So we could categorize by the characteristic rather than vendor:
>>>>
>>>> drivers/net/ethernet/${term}/fbnic/
>>>>
>>>> I'm afraid it may be hard for us to agree on an accurate term, tho.
>>>> "Unused" sounds.. odd, we don't keep unused code, "private"
>>>> sounds like we granted someone special right not took some away,
>>>> maybe "exclusive"? Or "besteffort"? Or "staging" :D  IDK.
>>>
>>> Do we really need that categorization at the directory/filesystem level?
>>> cannot we just document it clearly in the Kconfig help text and under
>>> Documentation/networking/?
>>
>>  From the reviewer perspective I think we will just remember.
>> If some newcomer tries to do refactoring they may benefit from seeing
>> this is a special device and more help is offered. Dunno if a newcomer
>> would look at the right docs.
>>
>> Whether it's more "paperwork" than we'll actually gain, I have no idea.
>> I may not be the best person to comment.
> 
> Are we going to go through and retro-actively move some of the drivers
> that are already there that are exclusive to specific companies? That
> is the bigger issue as I see it. It has already been brought up that
> idpf is exclusive. In addition several other people have reached out
> to me about other devices that are exclusive to other organizations.
> 
> I don't see any value in it as it would just encourage people to lie
> in order to avoid being put in what would essentially become a
> blacklisted directory.

Agreed.

> 
> If we are going to be trying to come up with some special status maybe
> it makes sense to have some status in the MAINTAINERS file that would
> indicate that this driver is exclusive to some organization and not
> publicly available so any maintenance would have to be proprietary.

I like that idea.

Jakub Kicinski April 10, 2024, 7:58 p.m. UTC | #87

On Wed, 10 Apr 2024 11:29:57 -0700 Florian Fainelli wrote:
> > If we are going to be trying to come up with some special status maybe
> > it makes sense to have some status in the MAINTAINERS file that would
> > indicate that this driver is exclusive to some organization and not
> > publicly available so any maintenance would have to be proprietary.  
> 
> I like that idea.

+1, also first idea that came to mind but I was too afraid 
of bike shedding to mention it :) Fingers crossed? :)

Andrew Lunn April 10, 2024, 8:01 p.m. UTC | #88

On Wed, Apr 10, 2024 at 08:56:31AM -0700, Alexander Duyck wrote:
> On Tue, Apr 9, 2024 at 4:42 PM Andrew Lunn <andrew@lunn.ch> wrote:
> >
> > > What is less clear to me is what do we do about uAPI / core changes.
> >
> > I would differentiate between core change and core additions. If there
> > is very limited firmware on this device, i assume Linux is managing
> > the SFP cage, and to some extend the PCS. Extending the core to handle
> > these at higher speeds than currently supported would be one such core
> > addition. I've no problem with this. And i doubt it will be a single
> > NIC using such additions for too long. It looks like ClearFog CX LX2
> > could make use of such extensions as well, and there are probably
> > other boards and devices, maybe the Zynq 7000?
> 
> The driver on this device doesn't have full access over the PHY.
> Basically we control everything from the PCS north, and the firmware
> controls everything from the PMA south as the physical connection is
> MUXed between 4 slices. So this means the firmware also controls all
> the I2C and the QSFP and EEPROM. The main reason for this is that
> those blocks are shared resources between the slices, as such the
> firmware acts as the arbitrator for 4 slices and the BMC.

Ah, shame. You took what is probably the least valuable intellectual
property, and most shareable with the community and locked it up in
firmware where nobody can use it.

You should probably stop saying there is not much firmware with this
device, and that Linux controls it. It clearly does not...

	Andrew

Jakub Kicinski April 10, 2024, 8:03 p.m. UTC | #89

On Wed, 10 Apr 2024 11:00:35 -0700 Florian Fainelli wrote:
> although I cannot really think about an "implied" metric that we could 
> track, short of monitoring patches/bug reports coming from outside of 
> the original driver authors/owners as an indication of how widely 
> utilized a given driver is.

Not metric, just to clarify. I think the discussion started from 
my email saying:

 - help with refactoring / adapting their drivers more actively

and that may be an empty promise if the person doing the refactoring
does not know they could ask. It's not uncommon for a relative
newcomer to redo some internal API. Not that it's usually a hard
transformation.. Dunno, a bit hypothetical.

Alexander Duyck April 10, 2024, 9:07 p.m. UTC | #90

On Wed, Apr 10, 2024 at 1:01 PM Andrew Lunn <andrew@lunn.ch> wrote:
>
> On Wed, Apr 10, 2024 at 08:56:31AM -0700, Alexander Duyck wrote:
> > On Tue, Apr 9, 2024 at 4:42 PM Andrew Lunn <andrew@lunn.ch> wrote:
> > >
> > > > What is less clear to me is what do we do about uAPI / core changes.
> > >
> > > I would differentiate between core change and core additions. If there
> > > is very limited firmware on this device, i assume Linux is managing
> > > the SFP cage, and to some extend the PCS. Extending the core to handle
> > > these at higher speeds than currently supported would be one such core
> > > addition. I've no problem with this. And i doubt it will be a single
> > > NIC using such additions for too long. It looks like ClearFog CX LX2
> > > could make use of such extensions as well, and there are probably
> > > other boards and devices, maybe the Zynq 7000?
> >
> > The driver on this device doesn't have full access over the PHY.
> > Basically we control everything from the PCS north, and the firmware
> > controls everything from the PMA south as the physical connection is
> > MUXed between 4 slices. So this means the firmware also controls all
> > the I2C and the QSFP and EEPROM. The main reason for this is that
> > those blocks are shared resources between the slices, as such the
> > firmware acts as the arbitrator for 4 slices and the BMC.
>
> Ah, shame. You took what is probably the least valuable intellectual
> property, and most shareable with the community and locked it up in
> firmware where nobody can use it.
>
> You should probably stop saying there is not much firmware with this
> device, and that Linux controls it. It clearly does not...
>
>         Andrew

Well I was referring more to the data path level more than the phy
configuration. I suspect different people have different levels of
expectations on what minimal firmware is. With this hardware we at
least don't need to use firmware commands to enable or disable queues,
get the device stats, or update a MAC address.

When it comes to multi-host NICs I am not sure there are going to be
any solutions that don't have some level of firmware due to the fact
that the cable is physically shared with multiple slots.

I am assuming we still want to do the PCS driver. So I will still see
what I can do to get that setup.

Thanks,

- Alex

Jacob Keller April 10, 2024, 9:30 p.m. UTC | #91

On 4/10/2024 12:26 AM, Jiri Pirko wrote:
> Tue, Apr 09, 2024 at 11:06:05PM CEST, willemdebruijn.kernel@gmail.com wrote:
>> Jakub Kicinski wrote:
>>> On Wed, 03 Apr 2024 13:08:24 -0700 Alexander Duyck wrote:
> 
> [...]
> 
>>
>> 2. whether new device features can be supported without at least
>>   two available devices supporting it.
>>
> 
> [...]
> 
>>
>> 2 is out of scope for this series. But I would always want to hear
>> about potential new features that an organization finds valuable
>> enough to implement. Rather than a blanket rule against them.
> 
> This appears out of the nowhere. In the past, I would say wast majority
> of the features was merged with single device implementation. Often, it
> is the only device out there at a time that supports the feature.
> This limitation would put break for feature additions. I can put a long
> list of features that would not be here ever (like 50% of mlxsw driver).
> 
>>
>>
> 

Jakub already mentioned this being nuanced in a previous part of the
thread. In reality, lots of features do get implemented by only one
driver first.

I think its good practice to ensure multiple vendors/drivers can use
whatever common uAPI or kernel API exists. It can be frustrating when
some new API gets introduced but then can't be used by another device..
In most cases thats on the vendors for being slow to respond or work
with each other when developing the new API.

Jacob Keller April 10, 2024, 10:03 p.m. UTC | #92

On 4/10/2024 12:58 PM, Jakub Kicinski wrote:
> On Wed, 10 Apr 2024 11:29:57 -0700 Florian Fainelli wrote:
>>> If we are going to be trying to come up with some special status maybe
>>> it makes sense to have some status in the MAINTAINERS file that would
>>> indicate that this driver is exclusive to some organization and not
>>> publicly available so any maintenance would have to be proprietary.  
>>
>> I like that idea.
> 
> +1, also first idea that came to mind but I was too afraid 
> of bike shedding to mention it :) Fingers crossed? :)
> 

+1, I think putting it in MAINTAINERS makes a lot of sense.

Andrew Lunn April 10, 2024, 10:19 p.m. UTC | #93

> I think its good practice to ensure multiple vendors/drivers can use
> whatever common uAPI or kernel API exists. It can be frustrating when
> some new API gets introduced but then can't be used by another device..
> In most cases thats on the vendors for being slow to respond or work
> with each other when developing the new API.

I tend to agree with the last part. Vendors tend not to reviewer other
vendors patches, and so often don't notice a new API being added which
they could use, if it was a little bit more generic. Also vendors
often seem to focus on their devices/firmware requirements, not an
abstract device, and so end up with something not generic.

As a reviewer, i try to take more notice of new APIs than most other
things, and ideally it is something we should all do.

	 Andrew

Andrew Lunn April 10, 2024, 10:37 p.m. UTC | #94

> Well I was referring more to the data path level more than the phy
> configuration. I suspect different people have different levels of
> expectations on what minimal firmware is. With this hardware we at
> least don't need to use firmware commands to enable or disable queues,
> get the device stats, or update a MAC address.
> 
> When it comes to multi-host NICs I am not sure there are going to be
> any solutions that don't have some level of firmware due to the fact
> that the cable is physically shared with multiple slots.

This is something Russell King at least considered. I don't really
know enough to know why its impossible for Linux to deal with multiple
slots.

> I am assuming we still want to do the PCS driver. So I will still see
> what I can do to get that setup.

You should look at the API offered by drivers in drivers/net/pcs. It
is designed to be used with drivers which actually drive the hardware,
and use phylink. Who is responsible for configuring and looking at the
results of auto negotiation? Who is responsible for putting the PCS
into the correct mode depending on the SFP modules capabilities?
Because you seemed to of split the PCS into two, and hidden some of it
away, i don't know if it makes sense to try to shoehorn what is left
into a Linux driver.

     Andrew

Jacob Keller April 11, 2024, 12:31 a.m. UTC | #95

On 4/10/2024 3:19 PM, Andrew Lunn wrote:
>> I think its good practice to ensure multiple vendors/drivers can use
>> whatever common uAPI or kernel API exists. It can be frustrating when
>> some new API gets introduced but then can't be used by another device..
>> In most cases thats on the vendors for being slow to respond or work
>> with each other when developing the new API.
> 
> I tend to agree with the last part. Vendors tend not to reviewer other
> vendors patches, and so often don't notice a new API being added which
> they could use, if it was a little bit more generic. Also vendors
> often seem to focus on their devices/firmware requirements, not an
> abstract device, and so end up with something not generic.
> 
> As a reviewer, i try to take more notice of new APIs than most other
> things, and ideally it is something we should all do.
> 
> 	 Andrew
> 
> 
> 

Agreed. It can be challenging when you're in the vendor space though, as
you get handed priorities.

Jiri Pirko April 11, 2024, 6:31 a.m. UTC | #96

Thu, Apr 11, 2024 at 12:03:54AM CEST, jacob.e.keller@intel.com wrote:
>
>
>On 4/10/2024 12:58 PM, Jakub Kicinski wrote:
>> On Wed, 10 Apr 2024 11:29:57 -0700 Florian Fainelli wrote:
>>>> If we are going to be trying to come up with some special status maybe
>>>> it makes sense to have some status in the MAINTAINERS file that would
>>>> indicate that this driver is exclusive to some organization and not
>>>> publicly available so any maintenance would have to be proprietary.  
>>>
>>> I like that idea.
>> 
>> +1, also first idea that came to mind but I was too afraid 
>> of bike shedding to mention it :) Fingers crossed? :)
>> 
>
>+1, I think putting it in MAINTAINERS makes a lot of sense.

Well, how exactly you imagine to do this? I have no problem using
MAINTAINERS for this, I was thinking about that too, but I could not
figure out the way it would work. Having driver directory is much more
obvious, person cooking up a patch sees that immediatelly. Do you look
at MAINTAINTERS file when you do some driver API changing patch/ any
patch? I certainly don't (not counting get_maintainers sctipt).

Jiri Pirko April 11, 2024, 6:34 a.m. UTC | #97

Wed, Apr 10, 2024 at 08:01:44PM CEST, alexander.duyck@gmail.com wrote:
>On Wed, Apr 10, 2024 at 10:56 AM Jakub Kicinski <kuba@kernel.org> wrote:
>>
>> On Wed, 10 Apr 2024 10:39:11 -0700 Florian Fainelli wrote:
>> > > Hm, we currently group by vendor but the fact it's a private device
>> > > is probably more important indeed. For example if Google submits
>> > > a driver for a private device it may be confusing what's public
>> > > cloud (which I think/hope GVE is) and what's fully private.
>> > >
>> > > So we could categorize by the characteristic rather than vendor:
>> > >
>> > > drivers/net/ethernet/${term}/fbnic/
>> > >
>> > > I'm afraid it may be hard for us to agree on an accurate term, tho.
>> > > "Unused" sounds.. odd, we don't keep unused code, "private"
>> > > sounds like we granted someone special right not took some away,
>> > > maybe "exclusive"? Or "besteffort"? Or "staging" :D  IDK.
>> >
>> > Do we really need that categorization at the directory/filesystem level?
>> > cannot we just document it clearly in the Kconfig help text and under
>> > Documentation/networking/?
>>
>> From the reviewer perspective I think we will just remember.
>> If some newcomer tries to do refactoring they may benefit from seeing
>> this is a special device and more help is offered. Dunno if a newcomer
>> would look at the right docs.
>>
>> Whether it's more "paperwork" than we'll actually gain, I have no idea.
>> I may not be the best person to comment.
>
>Are we going to go through and retro-actively move some of the drivers
>that are already there that are exclusive to specific companies? That
>is the bigger issue as I see it. It has already been brought up that

Why is it an issue? Very easy to move drivers to this new directory.


>idpf is exclusive. In addition several other people have reached out
>to me about other devices that are exclusive to other organizations.
>
>I don't see any value in it as it would just encourage people to lie
>in order to avoid being put in what would essentially become a
>blacklisted directory.

You are thinking all or nothing. I'd say that if we have 80% of such
drivers in the correct place/directory, it's a win. The rest will lie.
Shame for them when it is discovered.


>
>If we are going to be trying to come up with some special status maybe
>it makes sense to have some status in the MAINTAINERS file that would
>indicate that this driver is exclusive to some organization and not
>publicly available so any maintenance would have to be proprietary.

Jiri Pirko April 11, 2024, 6:39 a.m. UTC | #98

Wed, Apr 10, 2024 at 11:07:02PM CEST, alexander.duyck@gmail.com wrote:
>On Wed, Apr 10, 2024 at 1:01 PM Andrew Lunn <andrew@lunn.ch> wrote:
>>
>> On Wed, Apr 10, 2024 at 08:56:31AM -0700, Alexander Duyck wrote:
>> > On Tue, Apr 9, 2024 at 4:42 PM Andrew Lunn <andrew@lunn.ch> wrote:
>> > >
>> > > > What is less clear to me is what do we do about uAPI / core changes.
>> > >
>> > > I would differentiate between core change and core additions. If there
>> > > is very limited firmware on this device, i assume Linux is managing
>> > > the SFP cage, and to some extend the PCS. Extending the core to handle
>> > > these at higher speeds than currently supported would be one such core
>> > > addition. I've no problem with this. And i doubt it will be a single
>> > > NIC using such additions for too long. It looks like ClearFog CX LX2
>> > > could make use of such extensions as well, and there are probably
>> > > other boards and devices, maybe the Zynq 7000?
>> >
>> > The driver on this device doesn't have full access over the PHY.
>> > Basically we control everything from the PCS north, and the firmware
>> > controls everything from the PMA south as the physical connection is
>> > MUXed between 4 slices. So this means the firmware also controls all
>> > the I2C and the QSFP and EEPROM. The main reason for this is that
>> > those blocks are shared resources between the slices, as such the
>> > firmware acts as the arbitrator for 4 slices and the BMC.
>>
>> Ah, shame. You took what is probably the least valuable intellectual
>> property, and most shareable with the community and locked it up in
>> firmware where nobody can use it.
>>
>> You should probably stop saying there is not much firmware with this
>> device, and that Linux controls it. It clearly does not...
>>
>>         Andrew
>
>Well I was referring more to the data path level more than the phy
>configuration. I suspect different people have different levels of
>expectations on what minimal firmware is. With this hardware we at
>least don't need to use firmware commands to enable or disable queues,
>get the device stats, or update a MAC address.
>
>When it comes to multi-host NICs I am not sure there are going to be
>any solutions that don't have some level of firmware due to the fact

A small linux host on the nic that controls the eswitch perhaps? I mean,
the multi pf nic without a host in charge of the physical port and
swithing between it and pf is simply broken design. And yeah you would
probably now want to argue others are doing it already in the same way :)
True that.


>that the cable is physically shared with multiple slots.
>
>I am assuming we still want to do the PCS driver. So I will still see
>what I can do to get that setup.
>
>Thanks,
>
>- Alex
>

Alexander Duyck April 11, 2024, 4 p.m. UTC | #99

On Wed, Apr 10, 2024 at 3:37 PM Andrew Lunn <andrew@lunn.ch> wrote:
>
> > Well I was referring more to the data path level more than the phy
> > configuration. I suspect different people have different levels of
> > expectations on what minimal firmware is. With this hardware we at
> > least don't need to use firmware commands to enable or disable queues,
> > get the device stats, or update a MAC address.
> >
> > When it comes to multi-host NICs I am not sure there are going to be
> > any solutions that don't have some level of firmware due to the fact
> > that the cable is physically shared with multiple slots.
>
> This is something Russell King at least considered. I don't really
> know enough to know why its impossible for Linux to deal with multiple
> slots.

It mostly has to do with the arbitration between them. It is a matter
of having to pass a TON of info to the individual slice and then the
problem is it would have to do things correctly and not manage to take
out it's neighbor or the BMC.

> > I am assuming we still want to do the PCS driver. So I will still see
> > what I can do to get that setup.
>
> You should look at the API offered by drivers in drivers/net/pcs. It
> is designed to be used with drivers which actually drive the hardware,
> and use phylink. Who is responsible for configuring and looking at the
> results of auto negotiation? Who is responsible for putting the PCS
> into the correct mode depending on the SFP modules capabilities?
> Because you seemed to of split the PCS into two, and hidden some of it
> away, i don't know if it makes sense to try to shoehorn what is left
> into a Linux driver.

We have control of the auto negotiation as that is north of the PMA
and is configured per host. We should support clause 73 autoneg.
Although we haven't done much with it as most of our use cases are
just fixed speed setups to the switch over either 25G-CR1, 50G-CR2,
50G-CR1, or 100G-CR2. So odds are we aren't going to be doing anything
too terribly exciting.

As far as the QSFP setup the FW is responsible for any communication
with it. I suspect that the expectation is that we aren't going to
need much in the way of config since we are just using direct attach
cables. So we are the only one driving the PCS assuming we aren't
talking about power-on init where the FW is setting up for the BMC to
have access.

We will have to see. The PCS drivers in that directory mostly make
sense to me and don't look like too much of a departure from my
current code. It will just be a matter of splitting up the fbnic_mac.c
file and adding the PCS logic as a separate block, or at least I hope
that is all that is mostly involved. Probably take me a week or two to
get it coded up and push out the v2.

Jacob Keller April 11, 2024, 4:22 p.m. UTC | #100

On 4/10/2024 11:31 PM, Jiri Pirko wrote:
> Thu, Apr 11, 2024 at 12:03:54AM CEST, jacob.e.keller@intel.com wrote:
>>
>>
>> On 4/10/2024 12:58 PM, Jakub Kicinski wrote:
>>> On Wed, 10 Apr 2024 11:29:57 -0700 Florian Fainelli wrote:
>>>>> If we are going to be trying to come up with some special status maybe
>>>>> it makes sense to have some status in the MAINTAINERS file that would
>>>>> indicate that this driver is exclusive to some organization and not
>>>>> publicly available so any maintenance would have to be proprietary.  
>>>>
>>>> I like that idea.
>>>
>>> +1, also first idea that came to mind but I was too afraid 
>>> of bike shedding to mention it :) Fingers crossed? :)
>>>
>>
>> +1, I think putting it in MAINTAINERS makes a lot of sense.
> 
> Well, how exactly you imagine to do this? I have no problem using
> MAINTAINERS for this, I was thinking about that too, but I could not
> figure out the way it would work. Having driver directory is much more
> obvious, person cooking up a patch sees that immediatelly. Do you look
> at MAINTAINTERS file when you do some driver API changing patch/ any
> patch? I certainly don't (not counting get_maintainers sctipt).

I use MAINTAINERS (along with get_maintainers) to figure out who to CC
when dealing with a driver. I guess I probably don't do so before making
a change.

I guess it depends on what the intent of documenting it is for.
Presumably it would be a hint and reminder to future maintainers that if
the folks listed as MAINTAINERS are no longer responsive then this
driver can be reverted.

Or if you push more strongly towards "its up to them to do all
maintenance" i.e. we don't make API changes for them, etc? I didn't get
the sense that was the consensus.

The consensus I got from reading this thread is that most people are ok
with merging the driver. Some reservations about the future and any API
changes specifically for the one driver... Some reservations about the
extra maintenance burden.

Several others pointed out example cases which are similar in
availability. Perhaps not quite as obvious as the case of this where the
device is produced and consumed by the same group only.

The practical reality is that many of the devices are practically
exclusive if not definitionally so.

Alexander Duyck April 11, 2024, 4:46 p.m. UTC | #101

On Wed, Apr 10, 2024 at 11:39 PM Jiri Pirko <jiri@resnulli.us> wrote:
>
> Wed, Apr 10, 2024 at 11:07:02PM CEST, alexander.duyck@gmail.com wrote:
> >On Wed, Apr 10, 2024 at 1:01 PM Andrew Lunn <andrew@lunn.ch> wrote:
> >>
> >> On Wed, Apr 10, 2024 at 08:56:31AM -0700, Alexander Duyck wrote:
> >> > On Tue, Apr 9, 2024 at 4:42 PM Andrew Lunn <andrew@lunn.ch> wrote:
> >> > >
> >> > > > What is less clear to me is what do we do about uAPI / core changes.
> >> > >
> >> > > I would differentiate between core change and core additions. If there
> >> > > is very limited firmware on this device, i assume Linux is managing
> >> > > the SFP cage, and to some extend the PCS. Extending the core to handle
> >> > > these at higher speeds than currently supported would be one such core
> >> > > addition. I've no problem with this. And i doubt it will be a single
> >> > > NIC using such additions for too long. It looks like ClearFog CX LX2
> >> > > could make use of such extensions as well, and there are probably
> >> > > other boards and devices, maybe the Zynq 7000?
> >> >
> >> > The driver on this device doesn't have full access over the PHY.
> >> > Basically we control everything from the PCS north, and the firmware
> >> > controls everything from the PMA south as the physical connection is
> >> > MUXed between 4 slices. So this means the firmware also controls all
> >> > the I2C and the QSFP and EEPROM. The main reason for this is that
> >> > those blocks are shared resources between the slices, as such the
> >> > firmware acts as the arbitrator for 4 slices and the BMC.
> >>
> >> Ah, shame. You took what is probably the least valuable intellectual
> >> property, and most shareable with the community and locked it up in
> >> firmware where nobody can use it.
> >>
> >> You should probably stop saying there is not much firmware with this
> >> device, and that Linux controls it. It clearly does not...
> >>
> >>         Andrew
> >
> >Well I was referring more to the data path level more than the phy
> >configuration. I suspect different people have different levels of
> >expectations on what minimal firmware is. With this hardware we at
> >least don't need to use firmware commands to enable or disable queues,
> >get the device stats, or update a MAC address.
> >
> >When it comes to multi-host NICs I am not sure there are going to be
> >any solutions that don't have some level of firmware due to the fact
>
> A small linux host on the nic that controls the eswitch perhaps? I mean,
> the multi pf nic without a host in charge of the physical port and
> swithing between it and pf is simply broken design. And yeah you would
> probably now want to argue others are doing it already in the same way :)
> True that.

Well in our case there isn't an eswitch. The issue is more the logic
for the Ethernet PHY isn't setup to only run one port. Instead the PHY
is MUXed over 2 ports per interface, and then the QSFP interface
itself is spread over 4 ports.

What you end up with is something like the second to last image in
this article[1] where you have a MAC/PCS pair per host sitting on top
of one PMA with some blocks that are shared between the hosts and some
that are not. The issue becomes management of access to the QSFP and
PHY and how to prevent one host from being able to monopolize the
PHY/QSFP or crash the others if something goes sideways. Then you have
to also add in the BMC management on top of that.

[1]: https://semiengineering.com/integrated-ethernet-pcs-and-phy-ip-for-400g-800g-hyperscale-data-centers/

Andrew Lunn April 11, 2024, 5:32 p.m. UTC | #102

On Thu, Apr 11, 2024 at 09:00:17AM -0700, Alexander Duyck wrote:
> On Wed, Apr 10, 2024 at 3:37 PM Andrew Lunn <andrew@lunn.ch> wrote:
> >
> > > Well I was referring more to the data path level more than the phy
> > > configuration. I suspect different people have different levels of
> > > expectations on what minimal firmware is. With this hardware we at
> > > least don't need to use firmware commands to enable or disable queues,
> > > get the device stats, or update a MAC address.
> > >
> > > When it comes to multi-host NICs I am not sure there are going to be
> > > any solutions that don't have some level of firmware due to the fact
> > > that the cable is physically shared with multiple slots.
> >
> > This is something Russell King at least considered. I don't really
> > know enough to know why its impossible for Linux to deal with multiple
> > slots.
> 
> It mostly has to do with the arbitration between them. It is a matter
> of having to pass a TON of info to the individual slice and then the
> problem is it would have to do things correctly and not manage to take
> out it's neighbor or the BMC.

How much is specific to your device? How much is just following 802.3
and the CMIS standards? I assume anything which is just following
802.3 and CMIS could actually be re-used? And you have some glue to
combine them in a way that is specific to your device?

> > > I am assuming we still want to do the PCS driver. So I will still see
> > > what I can do to get that setup.
> >
> > You should look at the API offered by drivers in drivers/net/pcs. It
> > is designed to be used with drivers which actually drive the hardware,
> > and use phylink. Who is responsible for configuring and looking at the
> > results of auto negotiation? Who is responsible for putting the PCS
> > into the correct mode depending on the SFP modules capabilities?
> > Because you seemed to of split the PCS into two, and hidden some of it
> > away, i don't know if it makes sense to try to shoehorn what is left
> > into a Linux driver.
> 
> We have control of the auto negotiation as that is north of the PMA
> and is configured per host. We should support clause 73 autoneg.
> Although we haven't done much with it as most of our use cases are
> just fixed speed setups to the switch over either 25G-CR1, 50G-CR2,
> 50G-CR1, or 100G-CR2. So odds are we aren't going to be doing anything
> too terribly exciting.

Maybe not, but you might of gained from the community here, if others
could of adopted this code for their devices. You might not need
clause 73, but phylink provides helpers to implement it, so it is
pretty easy to add. Maybe your initial PCS driver does not support it,
but later adopters who also licence this PCS might add it, and you get
the feature for free. The corrected/uncorrected counters i asked
about, are something you might not export in your current code via
ethtool. But again, this is something which somebody else could add a
helper for, and you would get it nearly for free.

> As far as the QSFP setup the FW is responsible for any communication
> with it. I suspect that the expectation is that we aren't going to
> need much in the way of config since we are just using direct attach
> cables.

Another place you might of got features for free. The Linux SFP driver
exports HWMON values for temperature, power, received power, etc, but
for 1G. The QSFP+ standard Versatile Diagnostics Monitoring is
different, but i could see somebody adding a generic implementation in
the Linux SFP driver, so that the HWMON support is just free. Same
goes for the error performance statics. Parts of power management
could easily be generic. It might be possible to use Linux regulators
to describe what your board is capable if, and the SFP core could then
implement the ethtool ops, checking with the regulator to see if the
power is actually available, and then talking to the SFP to tell it to
change its power class?

Florian posted some interesting statistics, that vendors tend to
maintain their own drivers, and don't get much support from the
community. However i suspect it is a different story for shared
infrastructure like PCS drivers, PHY drivers, SFP drivers. That is
where you get the most community support and the most stuff for free.
But you actually have to use it to benefit from it.

	Andrew

Alexander Duyck April 11, 2024, 11:12 p.m. UTC | #103

On Thu, Apr 11, 2024 at 10:32 AM Andrew Lunn <andrew@lunn.ch> wrote:
>
> On Thu, Apr 11, 2024 at 09:00:17AM -0700, Alexander Duyck wrote:
> > On Wed, Apr 10, 2024 at 3:37 PM Andrew Lunn <andrew@lunn.ch> wrote:
> > >
> > > > Well I was referring more to the data path level more than the phy
> > > > configuration. I suspect different people have different levels of
> > > > expectations on what minimal firmware is. With this hardware we at
> > > > least don't need to use firmware commands to enable or disable queues,
> > > > get the device stats, or update a MAC address.
> > > >
> > > > When it comes to multi-host NICs I am not sure there are going to be
> > > > any solutions that don't have some level of firmware due to the fact
> > > > that the cable is physically shared with multiple slots.
> > >
> > > This is something Russell King at least considered. I don't really
> > > know enough to know why its impossible for Linux to deal with multiple
> > > slots.
> >
> > It mostly has to do with the arbitration between them. It is a matter
> > of having to pass a TON of info to the individual slice and then the
> > problem is it would have to do things correctly and not manage to take
> > out it's neighbor or the BMC.
>
> How much is specific to your device? How much is just following 802.3
> and the CMIS standards? I assume anything which is just following
> 802.3 and CMIS could actually be re-used? And you have some glue to
> combine them in a way that is specific to your device?
>
> > > > I am assuming we still want to do the PCS driver. So I will still see
> > > > what I can do to get that setup.
> > >
> > > You should look at the API offered by drivers in drivers/net/pcs. It
> > > is designed to be used with drivers which actually drive the hardware,
> > > and use phylink. Who is responsible for configuring and looking at the
> > > results of auto negotiation? Who is responsible for putting the PCS
> > > into the correct mode depending on the SFP modules capabilities?
> > > Because you seemed to of split the PCS into two, and hidden some of it
> > > away, i don't know if it makes sense to try to shoehorn what is left
> > > into a Linux driver.
> >
> > We have control of the auto negotiation as that is north of the PMA
> > and is configured per host. We should support clause 73 autoneg.
> > Although we haven't done much with it as most of our use cases are
> > just fixed speed setups to the switch over either 25G-CR1, 50G-CR2,
> > 50G-CR1, or 100G-CR2. So odds are we aren't going to be doing anything
> > too terribly exciting.
>
> Maybe not, but you might of gained from the community here, if others
> could of adopted this code for their devices. You might not need
> clause 73, but phylink provides helpers to implement it, so it is
> pretty easy to add. Maybe your initial PCS driver does not support it,
> but later adopters who also licence this PCS might add it, and you get
> the feature for free. The corrected/uncorrected counters i asked
> about, are something you might not export in your current code via
> ethtool. But again, this is something which somebody else could add a
> helper for, and you would get it nearly for free.

You don't have to sell me on the reuse advantages of open source. I
will probably look at adding autoneg at some point in the future, but
for our main use case it wasn't needed. If nothing else I will
probably hand it off to one of the new hires on the team when I get
some time.

The counters are exported. Just haven't gotten far enough to show the
ethtool patches yet.. :-)

> > As far as the QSFP setup the FW is responsible for any communication
> > with it. I suspect that the expectation is that we aren't going to
> > need much in the way of config since we are just using direct attach
> > cables.
>
> Another place you might of got features for free. The Linux SFP driver
> exports HWMON values for temperature, power, received power, etc, but
> for 1G. The QSFP+ standard Versatile Diagnostics Monitoring is
> different, but i could see somebody adding a generic implementation in
> the Linux SFP driver, so that the HWMON support is just free. Same
> goes for the error performance statics. Parts of power management
> could easily be generic. It might be possible to use Linux regulators
> to describe what your board is capable if, and the SFP core could then
> implement the ethtool ops, checking with the regulator to see if the
> power is actually available, and then talking to the SFP to tell it to
> change its power class?

Again, for us it ends up not having much value adding additional QSFP
logic because we aren't using anything fancy. It is all just direct
attach cables.

> Florian posted some interesting statistics, that vendors tend to
> maintain their own drivers, and don't get much support from the
> community. However I suspect it is a different story for shared
> infrastructure like PCS drivers, PHY drivers, SFP drivers. That is
> where you get the most community support and the most stuff for free.
> But you actually have to use it to benefit from it.

I'll probably get started on the PCS drivers for this next week. I
will follow up with questions if I run into any issues.

Thanks,

- Alex