Message ID | 171217454226.1598374.8971335637623132496.stgit@ahduyck-xeon-server.home.arpa
---|---
Series | eth: fbnic: Add network driver for Meta Platforms Host Network Interface
On Wed, Apr 03, 2024 at 01:08:24PM -0700, Alexander Duyck wrote:
> This patch set includes the necessary patches to enable basic Tx and Rx
> over the Meta Platforms Host Network Interface. To do this we introduce a
> new driver and directories in the form of
> "drivers/net/ethernet/meta/fbnic".

>       PCI: Add Meta Platforms vendor ID
>       eth: fbnic: add scaffolding for Meta's NIC driver
>       eth: fbnic: Allocate core device specific structures and devlink interface
>       eth: fbnic: Add register init to set PCIe/Ethernet device config
>       eth: fbnic: add message parsing for FW messages
>       eth: fbnic: add FW communication mechanism
>       eth: fbnic: allocate a netdevice and napi vectors with queues
>       eth: fbnic: implement Tx queue alloc/start/stop/free
>       eth: fbnic: implement Rx queue alloc/start/stop/free
>       eth: fbnic: Add initial messaging to notify FW of our presence
>       eth: fbnic: Enable Ethernet link setup
>       eth: fbnic: add basic Tx handling
>       eth: fbnic: add basic Rx handling
>       eth: fbnic: add L2 address programming
>       eth: fbnic: write the TCAM tables used for RSS control and Rx to host

Random mix of initial caps in subjects. Also kind of a mix of initial
caps in comments, e.g.,

$ grep -Er "^\s+/\*" drivers/net/ethernet/meta/fbnic/

I didn't bother to figure out which patches these typos are in:

$ codespell drivers/net/ethernet/meta/fbnic/
drivers/net/ethernet/meta/fbnic/fbnic_pci.c:452: ot ==> to, of, or
drivers/net/ethernet/meta/fbnic/fbnic_pci.c:479: Reenable ==> Re-enable
drivers/net/ethernet/meta/fbnic/fbnic_txrx.c:569: caclulation ==> calculation
drivers/net/ethernet/meta/fbnic/fbnic_fw.c:740: conents ==> contents
drivers/net/ethernet/meta/fbnic/fbnic_txrx.h:19: cachline ==> cacheline
Wed, Apr 03, 2024 at 10:08:24PM CEST, alexander.duyck@gmail.com wrote:
>This patch set includes the necessary patches to enable basic Tx and Rx
>over the Meta Platforms Host Network Interface. To do this we introduce a
>new driver and directories in the form of
>"drivers/net/ethernet/meta/fbnic".
>
>Due to submission limits, the general plan is to submit a minimal driver
>for now, almost equivalent to a UEFI driver in functionality, and then
>follow up over the coming weeks enabling additional offloads and more
>features for the device.
>
>The general plan is to look at adding support for ethtool, statistics, and
>start work on offloads in the next set of patches.

Could you please shed some light on the motivation to introduce this
driver in the community kernel? Is this device something people can
obtain in a shop, or is it rather something to be seen in Meta
datacenters only? If the second is the case, why exactly would we need
this driver?

>
>---
>
>Alexander Duyck (15):
>      PCI: Add Meta Platforms vendor ID
>      eth: fbnic: add scaffolding for Meta's NIC driver
>      eth: fbnic: Allocate core device specific structures and devlink interface
>      eth: fbnic: Add register init to set PCIe/Ethernet device config
>      eth: fbnic: add message parsing for FW messages
>      eth: fbnic: add FW communication mechanism
>      eth: fbnic: allocate a netdevice and napi vectors with queues
>      eth: fbnic: implement Tx queue alloc/start/stop/free
>      eth: fbnic: implement Rx queue alloc/start/stop/free
>      eth: fbnic: Add initial messaging to notify FW of our presence
>      eth: fbnic: Enable Ethernet link setup
>      eth: fbnic: add basic Tx handling
>      eth: fbnic: add basic Rx handling
>      eth: fbnic: add L2 address programming
>      eth: fbnic: write the TCAM tables used for RSS control and Rx to host
>
>
> MAINTAINERS                                   |    7 +
> drivers/net/ethernet/Kconfig                  |    1 +
> drivers/net/ethernet/Makefile                 |    1 +
> drivers/net/ethernet/meta/Kconfig             |   29 +
> drivers/net/ethernet/meta/Makefile            |    6 +
> drivers/net/ethernet/meta/fbnic/Makefile      |   18 +
> drivers/net/ethernet/meta/fbnic/fbnic.h       |  148 ++
> drivers/net/ethernet/meta/fbnic/fbnic_csr.h   |  912 ++++++++
> .../net/ethernet/meta/fbnic/fbnic_devlink.c   |   86 +
> .../net/ethernet/meta/fbnic/fbnic_drvinfo.h   |    5 +
> drivers/net/ethernet/meta/fbnic/fbnic_fw.c    |  823 ++++++++
> drivers/net/ethernet/meta/fbnic/fbnic_fw.h    |  133 ++
> drivers/net/ethernet/meta/fbnic/fbnic_irq.c   |  251 +++
> drivers/net/ethernet/meta/fbnic/fbnic_mac.c   | 1025 +++++++++
> drivers/net/ethernet/meta/fbnic/fbnic_mac.h   |   83 +
> .../net/ethernet/meta/fbnic/fbnic_netdev.c    |  470 +++++
> .../net/ethernet/meta/fbnic/fbnic_netdev.h    |   59 +
> drivers/net/ethernet/meta/fbnic/fbnic_pci.c   |  633 ++++++
> drivers/net/ethernet/meta/fbnic/fbnic_rpc.c   |  709 +++++++
> drivers/net/ethernet/meta/fbnic/fbnic_rpc.h   |  189 ++
> drivers/net/ethernet/meta/fbnic/fbnic_tlv.c   |  529 +++++
> drivers/net/ethernet/meta/fbnic/fbnic_tlv.h   |  175 ++
> drivers/net/ethernet/meta/fbnic/fbnic_txrx.c  | 1873 +++++++++++++++++
> drivers/net/ethernet/meta/fbnic/fbnic_txrx.h  |  125 ++
> include/linux/pci_ids.h                       |    2 +
> 25 files changed, 8292 insertions(+)
> create mode 100644 drivers/net/ethernet/meta/Kconfig
> create mode 100644 drivers/net/ethernet/meta/Makefile
> create mode 100644 drivers/net/ethernet/meta/fbnic/Makefile
> create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic.h
> create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_csr.h
> create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_devlink.c
> create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_drvinfo.h
> create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_fw.c
> create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_fw.h
> create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_irq.c
> create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_mac.c
> create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_mac.h
> create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_netdev.c
> create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_netdev.h
> create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_pci.c
> create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_rpc.c
> create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_rpc.h
> create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_tlv.c
> create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_tlv.h
> create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_txrx.c
> create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_txrx.h
>
>--
>
On Thu, Apr 4, 2024 at 4:37 AM Jiri Pirko <jiri@resnulli.us> wrote:
>
> Wed, Apr 03, 2024 at 10:08:24PM CEST, alexander.duyck@gmail.com wrote:
> >This patch set includes the necessary patches to enable basic Tx and Rx
> >over the Meta Platforms Host Network Interface. To do this we introduce a
> >new driver and directories in the form of
> >"drivers/net/ethernet/meta/fbnic".
> >
> >Due to submission limits, the general plan is to submit a minimal driver
> >for now, almost equivalent to a UEFI driver in functionality, and then
> >follow up over the coming weeks enabling additional offloads and more
> >features for the device.
> >
> >The general plan is to look at adding support for ethtool, statistics, and
> >start work on offloads in the next set of patches.
>
> Could you please shed some light on the motivation to introduce this
> driver in the community kernel? Is this device something people can
> obtain in a shop, or is it rather something to be seen in Meta
> datacenters only? If the second is the case, why exactly would we need
> this driver?

For now this is Meta only. However, there are several reasons for
wanting to include this in the upstream kernel.

First is the fact that from a maintenance standpoint it is easier to
avoid drifting from the upstream APIs if we are in the kernel. It makes
things much easier to maintain, as we can just pull in patches without
having to add onto that work by crafting backports around code that
isn't already upstream.

Second is the fact that as we introduce new features with our driver
it is much easier to show a proof of concept with the driver being in
the kernel than not. It makes it much harder to work with the
community on offloads and such if we don't have a good vehicle to use
for that. What this driver will provide is an opportunity to push
changes that would be beneficial to us, and likely the rest of the
community, without being constrained by what vendors decide they want
to enable or not. The general idea is that if we can show benefit with
our NIC then other vendors would be more likely to follow in our path.

Lastly, there is a good chance that we may end up opening up more than
just the driver code for this project, assuming we can get past these
initial hurdles. I don't know if you have noticed, but Meta is pretty
involved in the Open Compute Project. So if we want to work with third
parties on things like firmware and hardware it makes it much easier
to do so if the driver is already open and publicly available in the
Linux kernel.

Thanks,

- Alex
> For now this is Meta only. However, there are several reasons for
> wanting to include this in the upstream kernel.
>
> First is the fact that from a maintenance standpoint it is easier to
> avoid drifting from the upstream APIs if we are in the kernel. It makes
> things much easier to maintain, as we can just pull in patches without
> having to add onto that work by crafting backports around code that
> isn't already upstream.
>
> Second is the fact that as we introduce new features with our driver
> it is much easier to show a proof of concept with the driver being in
> the kernel than not. It makes it much harder to work with the
> community on offloads and such if we don't have a good vehicle to use
> for that. What this driver will provide is an opportunity to push
> changes that would be beneficial to us, and likely the rest of the
> community, without being constrained by what vendors decide they want
> to enable or not. The general idea is that if we can show benefit with
> our NIC then other vendors would be more likely to follow in our path.

Given the discussion going on in the thread "mlx5 ConnectX control misc
driver", do you also plan to show you don't need such a misc driver?

	Andrew
Thu, Apr 04, 2024 at 04:45:14PM CEST, alexander.duyck@gmail.com wrote:
>On Thu, Apr 4, 2024 at 4:37 AM Jiri Pirko <jiri@resnulli.us> wrote:
>>
>> Wed, Apr 03, 2024 at 10:08:24PM CEST, alexander.duyck@gmail.com wrote:
>> >This patch set includes the necessary patches to enable basic Tx and Rx
>> >over the Meta Platforms Host Network Interface. To do this we introduce a
>> >new driver and directories in the form of
>> >"drivers/net/ethernet/meta/fbnic".
>> >
>> >Due to submission limits, the general plan is to submit a minimal driver
>> >for now, almost equivalent to a UEFI driver in functionality, and then
>> >follow up over the coming weeks enabling additional offloads and more
>> >features for the device.
>> >
>> >The general plan is to look at adding support for ethtool, statistics, and
>> >start work on offloads in the next set of patches.
>>
>> Could you please shed some light on the motivation to introduce this
>> driver in the community kernel? Is this device something people can
>> obtain in a shop, or is it rather something to be seen in Meta
>> datacenters only? If the second is the case, why exactly would we need
>> this driver?
>
>For now this is Meta only. However, there are several reasons for
>wanting to include this in the upstream kernel.
>
>First is the fact that from a maintenance standpoint it is easier to
>avoid drifting from the upstream APIs if we are in the kernel. It makes
>things much easier to maintain, as we can just pull in patches without
>having to add onto that work by crafting backports around code that
>isn't already upstream.

That is making life easier for you, making it harder for the community.
Zero relevance.

>
>Second is the fact that as we introduce new features with our driver
>it is much easier to show a proof of concept with the driver being in
>the kernel than not. It makes it much harder to work with the
>community on offloads and such if we don't have a good vehicle to use
>for that. What this driver will provide is an opportunity to push
>changes that would be beneficial to us, and likely the rest of the
>community, without being constrained by what vendors decide they want
>to enable or not. The general idea is that if we can show benefit with
>our NIC then other vendors would be more likely to follow in our path.

Yeah, so not only would we have to maintain a driver nobody (outside
Meta) uses or cares about, you say that we will likely maintain more
dead code related to it. I think that in the Linux kernel there are
many examples of similarly dead code that causes a lot of headaches to
maintain.

You just want to make your life easier here again. Don't drag the
community into this please.

>
>Lastly, there is a good chance that we may end up opening up more than
>just the driver code for this project, assuming we can get past these
>initial hurdles. I don't know if you have noticed, but Meta is pretty
>involved in the Open Compute Project. So if we want to work with third
>parties on things like firmware and hardware it makes it much easier
>to do so if the driver is already open and publicly available in the
>Linux kernel.

OCP, ehm, let's say I have my reservations...

Okay, what motivation would anyone have to touch the fw of hardware
running inside a Meta datacenter only? Does not make any sense.

I'd say come again when your HW is not limited to Meta datacenters.
For the record and FWIW, I NACK this.

>
>Thanks,
>
>- Alex
On Thu, 4 Apr 2024 17:24:03 +0200 Andrew Lunn wrote:
> Given the discussion going on in the thread "mlx5 ConnectX control
> misc driver", do you also plan to show you don't need such a misc driver?

To the extent to which it is possible to prove the non-existence
of something :) Since my argument that Meta can use _vendor_ devices
without resorting to "misc drivers" doesn't convince them, I doubt
this data point would help.
> OCP, ehm, let's say I have my reservations...
> Okay, what motivation would anyone have to touch the fw of hardware
> running inside a Meta datacenter only? Does not make any sense.
>
> I'd say come again when your HW is not limited to Meta datacenters.
> For the record and FWIW, I NACK this.

Is ENA used outside of Amazon data centres?

Is FUNGIBLE used outside of Microsoft data centres?

Is gVNIC used outside of Google data centres?

I don't actually know, I'm more of an embedded developer.

	Andrew
On Thu, Apr 04, 2024 at 08:35:18PM +0200, Andrew Lunn wrote:
> > OCP, ehm, let's say I have my reservations...
> > Okay, what motivation would anyone have to touch the fw of hardware
> > running inside a Meta datacenter only? Does not make any sense.
> >
> > I'd say come again when your HW is not limited to Meta datacenters.
> > For the record and FWIW, I NACK this.
>
> Is ENA used outside of Amazon data centres?

At least this driver is needed when you use a cloud instance in Amazon
and want to install your own kernel/OS.

> Is FUNGIBLE used outside of Microsoft data centres?

I have no idea.

> Is gVNIC used outside of Google data centres?

You need this driver too.
https://cloud.google.com/compute/docs/networking/using-gvnic

> I don't actually know, I'm more of an embedded developer.
>
> 	Andrew
On Thu, Apr 4, 2024 at 8:36 AM Jiri Pirko <jiri@resnulli.us> wrote:
>
> Thu, Apr 04, 2024 at 04:45:14PM CEST, alexander.duyck@gmail.com wrote:
> >On Thu, Apr 4, 2024 at 4:37 AM Jiri Pirko <jiri@resnulli.us> wrote:
> >>
> >> Wed, Apr 03, 2024 at 10:08:24PM CEST, alexander.duyck@gmail.com wrote:

<...>

> >> Could you please shed some light on the motivation to introduce this
> >> driver in the community kernel? Is this device something people can
> >> obtain in a shop, or is it rather something to be seen in Meta
> >> datacenters only? If the second is the case, why exactly would we need
> >> this driver?
> >
> >For now this is Meta only. However, there are several reasons for
> >wanting to include this in the upstream kernel.
> >
> >First is the fact that from a maintenance standpoint it is easier to
> >avoid drifting from the upstream APIs if we are in the kernel. It makes
> >things much easier to maintain, as we can just pull in patches without
> >having to add onto that work by crafting backports around code that
> >isn't already upstream.
>
> That is making life easier for you, making it harder for the community.
> Zero relevance.
>
> >
> >Second is the fact that as we introduce new features with our driver
> >it is much easier to show a proof of concept with the driver being in
> >the kernel than not. It makes it much harder to work with the
> >community on offloads and such if we don't have a good vehicle to use
> >for that. What this driver will provide is an opportunity to push
> >changes that would be beneficial to us, and likely the rest of the
> >community, without being constrained by what vendors decide they want
> >to enable or not. The general idea is that if we can show benefit with
> >our NIC then other vendors would be more likely to follow in our path.
>
> Yeah, so not only would we have to maintain a driver nobody (outside
> Meta) uses or cares about, you say that we will likely maintain more
> dead code related to it. I think that in the Linux kernel there are
> many examples of similarly dead code that causes a lot of headaches to
> maintain.
>
> You just want to make your life easier here again. Don't drag the
> community into this please.

The argument itself doesn't really hold water. The fact is the Meta
data centers are not an insignificant consumer of Linux, so it isn't
as if the driver isn't going to be used.

This implies some lack of good faith from Meta. I don't understand
that, as we are contributing across multiple areas in the kernel
including networking and ebpf. Is Meta expected to start pulling time
from our upstream maintainers to have them update out-of-tree kernel
modules since the community isn't willing to let us maintain it in the
kernel? Is the message that the kernel is expected to get value from
Meta, but that value is not meant to be reciprocated? Would you really
rather have us start maintaining our own internal kernel with our own
"proprietary goodness", and ask other NIC vendors to maintain their
drivers against yet another kernel if they want to be used in our data
centers?

As pointed out by Andrew, we aren't the first data center to push a
driver for our own proprietary device. The fact is there have been
drivers added for purely emulated devices with no actual customers,
such as rocker. Should the switch vendors at the time have pushed back
on it, stating it wasn't a real "for sale" device?

The whole argument seems counter to what is expected. When a vendor
creates a new device and will likely be enabling new kernel features,
my understanding is that it is better to be in the kernel than not.
Putting a criterion on it that it must be "for sale" seems rather
arbitrary and capricious, especially given that most drivers have to
be pushed out long before they are available in the market in order to
meet deadlines to get the driver into OSV releases such as Red Hat when
it hits the market. By that logic, should we block all future drivers
until we can find them for sale somewhere? That way we don't run the
risk of adding a vendor driver for a product that might be scrapped due
to a last minute bug that will cause it to never be released.
On Thu, 4 Apr 2024 12:22:02 -0700 Alexander Duyck wrote:
> The argument itself doesn't really hold water. The fact is the Meta
> data centers are not an insignificant consumer of Linux,

customer or beneficiary?

> so it isn't as if the driver isn't going to be used. This implies
> some lack of good faith from Meta.

"Good faith" is not a sufficient foundation for a community consisting
of volunteers and commercial entities (with the xz debacle maybe even
less today than it was a month ago). As a maintainer I really don't want
to be in the position of judging the "good faith" of corporate actors.

> I don't understand that, as we are contributing across multiple areas
> in the kernel including networking and ebpf. Is Meta expected to start
> pulling time from our upstream maintainers to have them update
> out-of-tree kernel modules since the community isn't willing to let us
> maintain it in the kernel? Is the message that the kernel is expected
> to get value from Meta, but that value is not meant to be reciprocated?
> Would you really rather have us start maintaining our own internal
> kernel with our own "proprietary goodness", and ask other NIC vendors
> to maintain their drivers against yet another kernel if they want to be
> used in our data centers?

Please allow the community to make rational choices in the interest of
the project and, more importantly, the interest of its broader user base.

Google would also claim "good faith" -- undoubtedly it supports
the kernel, and lets some of its best engineers contribute.
Did that make them stop trying to build Fuchsia? The "good faith" of
companies operates within the margin of error of what they consider
rational and beneficial.

I don't want to put my thumb on the scale (yet?), but (with my
maintainer hat on) please don't use the "Meta is good" argument, because
someone will send a similar driver from a less involved company later on
and we'll be accused of playing favorites :( Plus companies can change
their approach to open source from "inclusive" to "extractive"
(to borrow the economic terminology) rather quickly.
Jakub Kicinski wrote:
> On Thu, 4 Apr 2024 12:22:02 -0700 Alexander Duyck wrote:
> > The argument itself doesn't really hold water. The fact is the Meta
> > data centers are not an insignificant consumer of Linux,
>
> customer or beneficiary?
>
> > so it isn't as if the driver isn't going to be used. This implies
> > some lack of good faith from Meta.
>
> "Good faith" is not a sufficient foundation for a community consisting
> of volunteers and commercial entities (with the xz debacle maybe even
> less today than it was a month ago). As a maintainer I really don't want
> to be in the position of judging the "good faith" of corporate actors.
>
> > I don't understand that, as we are contributing across multiple areas
> > in the kernel including networking and ebpf. Is Meta expected to start
> > pulling time from our upstream maintainers to have them update
> > out-of-tree kernel modules since the community isn't willing to let us
> > maintain it in the kernel? Is the message that the kernel is expected
> > to get value from Meta, but that value is not meant to be reciprocated?
> > Would you really rather have us start maintaining our own internal
> > kernel with our own "proprietary goodness", and ask other NIC vendors
> > to maintain their drivers against yet another kernel if they want to be
> > used in our data centers?
>
> Please allow the community to make rational choices in the interest of
> the project and, more importantly, the interest of its broader user base.
>
> Google would also claim "good faith" -- undoubtedly it supports
> the kernel, and lets some of its best engineers contribute.
> Did that make them stop trying to build Fuchsia? The "good faith" of
> companies operates within the margin of error of what they consider
> rational and beneficial.
>
> I don't want to put my thumb on the scale (yet?), but (with my
> maintainer hat on) please don't use the "Meta is good" argument, because
> someone will send a similar driver from a less involved company later on
> and we'll be accused of playing favorites :( Plus companies can change
> their approach to open source from "inclusive" to "extractive"
> (to borrow the economic terminology) rather quickly.

I'll throw my $.02 in. In this case you have a driver that I've only
scanned so far, but it looks well done. Alex has written lots of
drivers; I trust he will not just abandon it. And if it does end up
abandoned and no one supports it at some future point, we can deprecate
it same as any other driver in the networking tree. All the feedback is
being answered and debate is happening, so I expect we'll get a v2, v3
or so. All good signs from my point of view.

Back to your point about faith in a company. I don't think we even need
to care about a company's business plans. The author could have
submitted with their personal address for what it's worth and called it
drivers/alexware/duyck.o. A bit extreme, and I would have called him on
it, but hopefully the point is clear.

We have lots of drivers in the tree that are hard to physically get
ahold of, or otherwise gated by paying some vendor for compute time,
etc. to use. We even have some drivers where the hardware itself never
made it out into the wild, or only a single customer used it before
sellers burned it for commercial reasons, or the hw wasn't workable,
the team was cut, etc.

I can't see how having a physical NIC for it on my desk here makes much
difference one way or the other.

The alternative is much worse: someone builds a team of engineers,
locks them up, they build some interesting pieces, and we never get to
see it because we tried to block someone from open sourcing their
driver? Eventually they need some kernel changes and then we block
those too because we didn't allow the driver that was the use case?
This seems wrong to me.

Anyways, we have zero ways to enforce such a policy. Have vendors ship
a NIC to somebody with the v0 of the patch set? Attach a picture? Even
if vendor X claims they will have a product in N months and then only
sells it to qualified customers, what do we do then? A driver author
could even believe the hardware will be available when they post the
driver, but business may change out of the hands of the developer.

I'm 100% on letting this through assuming Alex is on top of feedback
and the code is good. I think any other policy would be very ugly to
enforce, prove, and even understand. Obviously code and architecture
debates I'm all for. Ensuring we have a trusted, experienced person
signed up to review code, address feedback, fix whatever syzbot finds
and so on is also a must I think. I'm sure Alex will take care of it.

Thanks,
John
On Thu, 04 Apr 2024 14:59:33 -0700 John Fastabend wrote:
> The alternative is much worse: someone builds a team of engineers,
> locks them up, they build some interesting pieces, and we never get to
> see it because we tried to block someone from open sourcing their
> driver?

Open sourcing is just one push to github.
There are guarantees we give to upstream drivers.

> Eventually they need some kernel changes and then we block those too
> because we didn't allow the driver that was the use case? This seems
> wrong to me.

The flip side of the argument is: what if we allow some device we don't
have access to to make changes to the core for its benefit? The owner
reports that some changes broke the kernel for them. Kernel rules,
regression, we have to revert. This is not a hypothetical; "less than
cooperative users" demanding reverts and "reporting us to Linus" are
a reality :(

Technical solution? Maybe if it's not a public device regression rules
don't apply? Seems fairly reasonable.

> Anyways, we have zero ways to enforce such a policy. Have vendors ship
> a NIC to somebody with the v0 of the patch set? Attach a picture?

GenAI world, pictures mean nothing :) We do have a CI in netdev, which
is all ready to ingest external results, and a (currently tiny amount?)
of tests for NICs. Prove that you care about the device by running the
upstream tests and reporting results? Seems fairly reasonable.

> Even if vendor X claims they will have a product in N months and then
> only sells it to qualified customers, what do we do then? A driver
> author could even believe the hardware will be available when they
> post the driver, but business may change out of the hands of the
> developer.
>
> I'm 100% on letting this through assuming Alex is on top of feedback
> and the code is good.

I'd strongly prefer if we detach our trust and respect for Alex
from whatever precedent we make here. I can't stress this enough.
IDK if I'm exaggerating or it's hard to appreciate the challenges
of maintainership without living it, but I really don't like being
accused of playing favorites or big companies buying their way in :(

> I think any other policy would be very ugly to enforce, prove, and
> even understand. Obviously code and architecture debates I'm all for.
> Ensuring we have a trusted, experienced person signed up to review
> code, address feedback, fix whatever syzbot finds and so on is also a
> must I think. I'm sure Alex will take care of it.

"Whatever syzbot finds" may be slightly moot for a private device ;)
but otherwise 100%! These are exactly the kind of points I think we
should enumerate. I started writing a list of expectations a while back:

Documentation/maintainer/feature-and-driver-maintainers.rst

I think we just need something like this, maybe just a step up, for
non-public devices..
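[A minimal sketch of the lowest-friction form of "running the upstream
tests and reporting results" using the in-tree kselftest harness. The
thread does not spell out how the netdev CI ingests external results,
so the log file name below is purely illustrative:]

# Build and run the networking selftests from a kernel source tree,
# keeping the TAP output that a reporting pipeline could ingest later.
$ make -C tools/testing/selftests TARGETS=net run_tests 2>&1 | tee net-selftests.log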
On Thu, Apr 4, 2024 at 2:59 PM John Fastabend <john.fastabend@gmail.com> wrote:
>
> Jakub Kicinski wrote:
> > On Thu, 4 Apr 2024 12:22:02 -0700 Alexander Duyck wrote:
> > > I don't understand that, as we are contributing across multiple areas
> > > in the kernel including networking and ebpf. Is Meta expected to start
> > > pulling time from our upstream maintainers to have them update
> > > out-of-tree kernel modules since the community isn't willing to let us
> > > maintain it in the kernel? Is the message that the kernel is expected
> > > to get value from Meta, but that value is not meant to be reciprocated?
> > > Would you really rather have us start maintaining our own internal
> > > kernel with our own "proprietary goodness", and ask other NIC vendors
> > > to maintain their drivers against yet another kernel if they want to be
> > > used in our data centers?
> >
> > Please allow the community to make rational choices in the interest of
> > the project and, more importantly, the interest of its broader user base.
> >
> > Google would also claim "good faith" -- undoubtedly it supports
> > the kernel, and lets some of its best engineers contribute.
> > Did that make them stop trying to build Fuchsia? The "good faith" of
> > companies operates within the margin of error of what they consider
> > rational and beneficial.
> >
> > I don't want to put my thumb on the scale (yet?), but (with my
> > maintainer hat on) please don't use the "Meta is good" argument, because
> > someone will send a similar driver from a less involved company later on
> > and we'll be accused of playing favorites :( Plus companies can change
> > their approach to open source from "inclusive" to "extractive"
> > (to borrow the economic terminology) rather quickly.
>
> I'll throw my $.02 in. In this case you have a driver that I've only
> scanned so far, but it looks well done. Alex has written lots of
> drivers; I trust he will not just abandon it. And if it does end up
> abandoned and no one supports it at some future point, we can deprecate
> it same as any other driver in the networking tree. All the feedback is
> being answered and debate is happening, so I expect we'll get a v2, v3
> or so. All good signs from my point of view.
>
> Back to your point about faith in a company. I don't think we even need
> to care about a company's business plans. The author could have
> submitted with their personal address for what it's worth and called it
> drivers/alexware/duyck.o. A bit extreme, and I would have called him on
> it, but hopefully the point is clear.
>
> We have lots of drivers in the tree that are hard to physically get
> ahold of, or otherwise gated by paying some vendor for compute time,
> etc. to use. We even have some drivers where the hardware itself never
> made it out into the wild, or only a single customer used it before
> sellers burned it for commercial reasons, or the hw wasn't workable,
> the team was cut, etc.
>
> I can't see how having a physical NIC for it on my desk here makes much
> difference one way or the other.

The advantage of Meta not selling it publicly at this time is that Meta
is both the consumer and the maintainer, so if a kernel API change broke
something, Meta would be responsible for fixing it. It would be in
Meta's own interest to maintain the part, and if it breaks, Meta would
be the only one impacted, assuming the breaking change at least builds.
So rather than "good faith", maybe I should have said "motivated self
interest".

It seems like the worst case scenario would actually be some commercial
product being sold. Worse yet, one with proprietary bits such as
firmware on the device. Should commercial vendors be required to provide
some sort of documentation proving they are dedicated to their product
and financially stable enough to maintain it for the entire product life
cycle? What if the vendor sells some significant number of units to
Linux users out there, and then either goes under, gets acquired, and/or
just decides to go in a new direction? In that scenario we have the
driver unmaintained and consumers impacted, but the company responsible
for it has no motivation to address it other than maybe negative PR.

> The alternative is much worse: someone builds a team of engineers,
> locks them up, they build some interesting pieces, and we never get to
> see it because we tried to block someone from open sourcing their
> driver? Eventually they need some kernel changes and then we block
> those too because we didn't allow the driver that was the use case?
> This seems wrong to me.
>
> Anyways, we have zero ways to enforce such a policy. Have vendors ship
> a NIC to somebody with the v0 of the patch set? Attach a picture? Even
> if vendor X claims they will have a product in N months and then only
> sells it to qualified customers, what do we do then? A driver author
> could even believe the hardware will be available when they post the
> driver, but business may change out of the hands of the developer.

This is what I was referring to as being "arbitrary and capricious".
The issue is what we would define as a NIC being for sale commercially.
Do we have to support a certain form factor, sell it for a certain
price, or sell a certain quantity?

If anything, maybe we should look more at something like a blast radius
in terms of inclusion: if this were to go wrong, how could it go wrong,
and who would be impacted?

> I'm 100% on letting this through assuming Alex is on top of feedback
> and the code is good. I think any other policy would be very ugly
> to enforce, prove, and even understand. Obviously code and architecture
> debates I'm all for. Ensuring we have a trusted, experienced person
> signed up to review code, address feedback, fix whatever syzbot finds
> and so on is also a must I think. I'm sure Alex will take care of
> it.
>
> Thanks,
> John

Thanks for your reply. I had started a reply, but you probably worded
this better than I could have.

Thanks

- Alex
On Thu, Apr 4, 2024 at 4:50 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Thu, 04 Apr 2024 14:59:33 -0700 John Fastabend wrote:
> > The alternative is much worse: someone builds a team of engineers,
> > locks them up, they build some interesting pieces, and we never get to
> > see it because we tried to block someone from open sourcing their
> > driver?
>
> Open sourcing is just one push to github.
> There are guarantees we give to upstream drivers.

Are there? Do we have them documented somewhere?

> > Eventually they need some kernel changes and then we block those too
> > because we didn't allow the driver that was the use case? This seems
> > wrong to me.
>
> The flip side of the argument is: what if we allow some device we don't
> have access to to make changes to the core for its benefit? The owner
> reports that some changes broke the kernel for them. Kernel rules,
> regression, we have to revert. This is not a hypothetical; "less than
> cooperative users" demanding reverts and "reporting us to Linus" are
> a reality :(
>
> Technical solution? Maybe if it's not a public device regression rules
> don't apply? Seems fairly reasonable.

This is a hypothetical. This driver currently isn't changing anything
outside of itself. At this point the driver would only be build tested
by everyone else. They could just not include it in their Kconfig and
then it is out of sight, out of mind.

> > Anyways, we have zero ways to enforce such a policy. Have vendors ship
> > a NIC to somebody with the v0 of the patch set? Attach a picture?
>
> GenAI world, pictures mean nothing :) We do have a CI in netdev, which
> is all ready to ingest external results, and a (currently tiny amount?)
> of tests for NICs. Prove that you care about the device by running the
> upstream tests and reporting results? Seems fairly reasonable.

That seems like an opportunity to be exploited, though. Are the
results going to be verified in any way? Maybe cryptographically
signed? Seems like it would be easy enough to fake the results.

> > Even if vendor X claims they will have a product in N months and then
> > only sells it to qualified customers, what do we do then? A driver
> > author could even believe the hardware will be available when they
> > post the driver, but business may change out of the hands of the
> > developer.
> >
> > I'm 100% on letting this through assuming Alex is on top of feedback
> > and the code is good.
>
> I'd strongly prefer if we detach our trust and respect for Alex
> from whatever precedent we make here. I can't stress this enough.
> IDK if I'm exaggerating or it's hard to appreciate the challenges
> of maintainership without living it, but I really don't like being
> accused of playing favorites or big companies buying their way in :(

Again, I would say we look at the blast radius. That is how we should
be measuring any change. At this point the driver is self contained
into /drivers/net/ethernet/meta/fbnic/. It isn't exporting anything
outside that directory, and it can be switched off via Kconfig.

When the time comes to start adding new features we can probably start
by looking at how to add either generic offloads like was done for
GSO, CSO, etc., or how it can also be implemented on another vendor's
NIC.

At this point the only risk the driver presents is that it is yet
another driver, done in the same style I did the other Intel drivers,
and so any kernel API changes will end up needing to be applied to it
just like the other drivers.

> > I think any other policy would be very ugly to enforce, prove, and
> > even understand. Obviously code and architecture debates I'm all for.
> > Ensuring we have a trusted, experienced person signed up to review
> > code, address feedback, fix whatever syzbot finds and so on is also a
> > must I think. I'm sure Alex will take care of it.
>
> "Whatever syzbot finds" may be slightly moot for a private device ;)
> but otherwise 100%! These are exactly the kind of points I think we
> should enumerate. I started writing a list of expectations a while back:
>
> Documentation/maintainer/feature-and-driver-maintainers.rst
>
> I think we just need something like this, maybe just a step up, for
> non-public devices..

I honestly think we are getting the cart ahead of the horse. When we
start talking about kernel API changes then we can probably get into
the whole "private" versus "publicly available" argument. A good
example of the kind of thing I am thinking of is GSO partial, where I
ended up with Mellanox and Intel sending me 40G and 100G NICs and
cables to implement it on their devices, as all I had were essentially
igb and ixgbe based NICs.

Odds are when we start getting to those kinds of things maybe we need
to look at having a few systems available for developer use, but until
then I am not sure it makes sense to focus on whether the device is
publicly available or not.
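[To make the "switched off via Kconfig" point above concrete: assuming
the usual vendor-directory convention for the symbols added by the
drivers/net/ethernet/meta/Kconfig file in the diffstat (the exact
symbol names are an assumption, not confirmed in this thread), the
driver can be compiled out entirely:]

# Hypothetical symbol names following the NET_VENDOR_*/driver
# convention; scripts/config is the in-tree helper for editing an
# existing .config, and olddefconfig re-resolves the dependencies.
$ scripts/config --disable NET_VENDOR_META --disable FBNIC
$ make olddefconfig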
On Thu, 4 Apr 2024 17:11:47 -0700 Alexander Duyck wrote:
> > Open sourcing is just one push to github.
> > There are guarantees we give to upstream drivers.
>
> Are there? Do we have them documented somewhere?

I think they are somewhere in Documentation/

To some extent this question in itself supports my point that written
down rules, as out of date as they may be, seem to carry more respect
than what a maintainer says :S

> > > Eventually they need some kernel changes and then we block those too
> > > because we didn't allow the driver that was the use case? This seems
> > > wrong to me.
> >
> > The flip side of the argument is: what if we allow some device we don't
> > have access to to make changes to the core for its benefit? The owner
> > reports that some changes broke the kernel for them. Kernel rules,
> > regression, we have to revert. This is not a hypothetical; "less than
> > cooperative users" demanding reverts and "reporting us to Linus" are
> > a reality :(
> >
> > Technical solution? Maybe if it's not a public device regression rules
> > don't apply? Seems fairly reasonable.
>
> This is a hypothetical. This driver currently isn't changing anything
> outside of itself. At this point the driver would only be build tested
> by everyone else. They could just not include it in their Kconfig and
> then it is out of sight, out of mind.

Not changing does not mean not depending on existing behavior.
Investigating and fixing properly even the hardest regressions in the
stack is a bar that Meta can so easily clear. I don't understand why
you are arguing.

> > > Anyways, we have zero ways to enforce such a policy. Have vendors ship
> > > a NIC to somebody with the v0 of the patch set? Attach a picture?
> >
> > GenAI world, pictures mean nothing :) We do have a CI in netdev, which
> > is all ready to ingest external results, and a (currently tiny amount?)
> > of tests for NICs. Prove that you care about the device by running the
> > upstream tests and reporting results? Seems fairly reasonable.
>
> That seems like an opportunity to be exploited, though. Are the
> results going to be verified in any way? Maybe cryptographically
> signed? Seems like it would be easy enough to fake the results.

I think it's much easier to just run the tests than write a system
which will competently lie. But even if we completely suspend trust,
someone lying is of no cost to the community in this case.

> > > Even if vendor X claims they will have a product in N months and then
> > > only sells it to qualified customers, what do we do then? A driver
> > > author could even believe the hardware will be available when they
> > > post the driver, but business may change out of the hands of the
> > > developer.
> > >
> > > I'm 100% on letting this through assuming Alex is on top of feedback
> > > and the code is good.
> >
> > I'd strongly prefer if we detach our trust and respect for Alex
> > from whatever precedent we make here. I can't stress this enough.
> > IDK if I'm exaggerating or it's hard to appreciate the challenges
> > of maintainership without living it, but I really don't like being
> > accused of playing favorites or big companies buying their way in :(
>
> Again, I would say we look at the blast radius. That is how we should
> be measuring any change. At this point the driver is self contained
> into /drivers/net/ethernet/meta/fbnic/. It isn't exporting anything
> outside that directory, and it can be switched off via Kconfig.

It is not practical to ponder every change case by case. Maintainers
are overworked. How long until we send the uAPI patch for RSS on the
flow label? I'd rather not re-litigate this every time someone posts a
slightly different feature. Let's cover the obvious points from the
beginning while everyone is paying attention. We can amend later as
need be.

> When the time comes to start adding new features we can probably start
> by looking at how to add either generic offloads like was done for
> GSO, CSO, etc., or how it can also be implemented on another vendor's
> NIC.
>
> At this point the only risk the driver presents is that it is yet
> another driver, done in the same style I did the other Intel drivers,
> and so any kernel API changes will end up needing to be applied to it
> just like the other drivers.

The risk is we'll have a fight every time there is a disagreement about
the expectations.

> > > I think any other policy would be very ugly to enforce, prove, and
> > > even understand. Obviously code and architecture debates I'm all for.
> > > Ensuring we have a trusted, experienced person signed up to review
> > > code, address feedback, fix whatever syzbot finds and so on is also a
> > > must I think. I'm sure Alex will take care of it.
> >
> > "Whatever syzbot finds" may be slightly moot for a private device ;)
> > but otherwise 100%! These are exactly the kind of points I think we
> > should enumerate. I started writing a list of expectations a while back:
> >
> > Documentation/maintainer/feature-and-driver-maintainers.rst
> >
> > I think we just need something like this, maybe just a step up, for
> > non-public devices..
>
> I honestly think we are getting the cart ahead of the horse. When we
> start talking about kernel API changes then we can probably get into
> the whole "private" versus "publicly available" argument. A good
> example of the kind of thing I am thinking of is GSO partial, where I
> ended up with Mellanox and Intel sending me 40G and 100G NICs and
> cables to implement it on their devices, as all I had were essentially
> igb and ixgbe based NICs.

That'd be great. Maybe even more than I'd expect. So why not write it
down? In case the person doing the coding is not Alex Duyck and just
wants to get it done for their narrow use case, get a promo, go work on
something else?

> Odds are when we start getting to those kinds of things maybe we need
> to look at having a few systems available for developer use, but until
> then I am not sure it makes sense to focus on whether the device is
> publicly available or not.

Developer access would be huge. A mirage of developer access?
Immaterial :)
On 4/4/24 9:37 AM, Jakub Kicinski wrote:
> On Thu, 4 Apr 2024 17:24:03 +0200 Andrew Lunn wrote:
>> Given the discussion going on in the thread "mlx5 ConnectX control
>> misc driver", do you also plan to show you don't need such a misc driver?
>
> To the extent to which it is possible to prove the non-existence
> of something :) Since my argument that Meta can use _vendor_ devices
> without resorting to "misc drivers" doesn't convince them, I doubt
> this data point would help.

When this device gets widely deployed and you have the inevitable
production problems (inevitable in the sense that this is designed,
implemented and deployed by humans who make mistakes, and then S/W has
to compensate for the quirks), you can see whether it is easy to
completely, sanely and efficiently debug those problems solely with
extensions to open source tools. But I am guessing that is still 1+
years away before you will know.
On Thu, 2024-04-04 at 17:11 -0700, Alexander Duyck wrote:
> Again, I would say we look at the blast radius. That is how we should
> be measuring any change. At this point the driver is self contained
> into /drivers/net/ethernet/meta/fbnic/. It isn't exporting anything
> outside that directory, and it can be switched off via Kconfig.

I personally think this is the most relevant point. This is just a new
NIC driver, completely self-encapsulated. I quickly glanced over the
code and it looks like it's not doing anything obviously bad. It really
looks like a usual, legit NIC driver.

I don't think the fact that the NIC itself is hard to grasp for anyone
outside <organization> makes a difference. A long time ago Greg noted
that drivers have been merged for H/W known to have a _single_ existing
instance (IIRC; I can't find the reference off the top of my head, but
back then it was quite popular, I hope some other old guy could
remember).

To me, the maintainership burden is on Meta: Alex/Meta will have to
handle bug reports, breakages, user complaints (I guess this last would
be the easier part ;). If he/they do not cope with the process we can
simply revert the driver. I would be quite surprised if such a
situation should happen, but the impact from my PoV looks minimal.

TL;DR: I don't see any good reason not to accept this - unless my quick
glance was too quick and very wrong, but it looks like others have a
similar view.

Cheers,

Paolo
On Fri, Apr 05, 2024 at 09:11:19AM +0200, Paolo Abeni wrote:
> On Thu, 2024-04-04 at 17:11 -0700, Alexander Duyck wrote:
> > Again, I would say we look at the blast radius. That is how we should
> > be measuring any change. At this point the driver is self contained
> > into /drivers/net/ethernet/meta/fbnic/. It isn't exporting anything
> > outside that directory, and it can be switched off via Kconfig.
>
> I personally think this is the most relevant point. This is just a new
> NIC driver, completely self-encapsulated. I quickly glanced over the
> code and it looks like it's not doing anything obviously bad. It really
> looks like a usual, legit NIC driver.

This is completely true, and as I've said many times the kernel as a
project is substantially about supporting the HW that people actually
build. There is no reason not to merge yet another basic netdev
driver.

However, there is also a pretty strong red line in Linux where people
believe, with strong conviction, that kernel code should not be merged
only to support a proprietary userspace. This submission is clearly
blurring that line. This driver will only run in Meta's proprietary
kernel fork on servers running Meta's proprietary userspace.

At this point perhaps it is OK, a basic NIC driver is not really an
issue, but Jiri is also very correct to point out that this is heading
in a very concerning direction.

Alex already indicated new features are coming, and changes to the core
code will be proposed. How should those be evaluated? Hypothetically,
should fbnic be allowed to be the first implementation of something
invasive like Mina's DMABUF work? Google published an open userspace
for NCCL that people can (in theory at least) actually run. Meta would
not be able to do that. I would say that clearly crosses the line and
should not be accepted.

So I think there should be an expectation that technically sound things
Meta may propose must not be accepted because they cross the
ideological red line into enabling only proprietary software.

To me it sets up a fairly common anti-pattern where a vendor starts
out with good intentions, reaches community pushback and falls back to
their downstream fork. Once forking occurs it becomes self-reinforcing,
as built up infrastructure like tests and CI will only run correctly
on the fork and the fork grows. Then eventually the upstream code is
abandoned. This has happened many times before in Linux..

IMHO from a community perspective I feel like we should expect Meta to
fail and end up with a fork. The community should warn them. However,
if they really want to try anyhow, then I'm not sure it would be
appropriate to stop them at this point. Meta will just end up being a
"bad vendor".

I think the best thing the netdev community could do is come up with
some clearer guidelines on what Meta could use fbnic to justify and
what would be rejected (ideologically), and Meta can decide on their
own if they want to continue.

Jason
On 4/5/24 2:26 PM, Jason Gunthorpe wrote:
> On Fri, Apr 05, 2024 at 09:11:19AM +0200, Paolo Abeni wrote:
>> On Thu, 2024-04-04 at 17:11 -0700, Alexander Duyck wrote:
>>> Again, I would say we look at the blast radius. That is how we should
>>> be measuring any change. At this point the driver is self contained
>>> into /drivers/net/ethernet/meta/fbnic/. It isn't exporting anything
>>> outside that directory, and it can be switched off via Kconfig.
>>
>> I personally think this is the most relevant point. This is just a new
>> NIC driver, completely self-encapsulated. I quickly glanced over the
>> code and it looks like it's not doing anything obviously bad. It really
>> looks like a usual, legit NIC driver.
>
> This is completely true, and as I've said many times the kernel as a
> project is substantially about supporting the HW that people actually
> build. There is no reason not to merge yet another basic netdev
> driver.
>
> However, there is also a pretty strong red line in Linux where people
> believe, with strong conviction, that kernel code should not be merged
> only to support a proprietary userspace. This submission is clearly
> blurring that line. This driver will only run in Meta's proprietary
> kernel fork on servers running Meta's proprietary userspace.
>
> At this point perhaps it is OK, a basic NIC driver is not really an
> issue, but Jiri is also very correct to point out that this is heading
> in a very concerning direction.
>
> Alex already indicated new features are coming, and changes to the core
> code will be proposed. How should those be evaluated? Hypothetically,
> should fbnic be allowed to be the first implementation of something
> invasive like Mina's DMABUF work?

My $0.02 from only reading this thread on the side: when it comes to
extending and integrating with core networking code (e.g. larger
features like offloads, xdp/af_xdp, etc.) the networking community has
always requested at least two driver implementations to showcase that
the code extensions touching core code are not unique to just a single
driver/NIC/vendor. I'd expect this holds true here as well.
On 4/3/24 22:08, Alexander Duyck wrote:
> This patch set includes the necessary patches to enable basic Tx and Rx
> over the Meta Platforms Host Network Interface. To do this we introduce a
> new driver and directories in the form of
> "drivers/net/ethernet/meta/fbnic".
>
> Due to submission limits, the general plan is to submit a minimal driver
> for now, almost equivalent to a UEFI driver in functionality, and then
> follow up over the coming weeks enabling additional offloads and more
> features for the device.
>
> The general plan is to look at adding support for ethtool, statistics, and
> start work on offloads in the next set of patches.
>
> ---
>
> Alexander Duyck (15):
>       PCI: Add Meta Platforms vendor ID
>       eth: fbnic: add scaffolding for Meta's NIC driver
>       eth: fbnic: Allocate core device specific structures and devlink interface
>       eth: fbnic: Add register init to set PCIe/Ethernet device config
>       eth: fbnic: add message parsing for FW messages
>       eth: fbnic: add FW communication mechanism
>       eth: fbnic: allocate a netdevice and napi vectors with queues
>       eth: fbnic: implement Tx queue alloc/start/stop/free
>       eth: fbnic: implement Rx queue alloc/start/stop/free
>       eth: fbnic: Add initial messaging to notify FW of our presence
>       eth: fbnic: Enable Ethernet link setup
>       eth: fbnic: add basic Tx handling
>       eth: fbnic: add basic Rx handling
>       eth: fbnic: add L2 address programming
>       eth: fbnic: write the TCAM tables used for RSS control and Rx to host
>
> [...]
>
>  25 files changed, 8292 insertions(+)

Even if this is just basic scaffolding for what will come, it's hard to
believe that no patch was co-developed, or should be marked as
authored-by some other developer.

[...]
On Fri, Apr 5, 2024 at 5:26 AM Jason Gunthorpe <jgg@nvidia.com> wrote: > > On Fri, Apr 05, 2024 at 09:11:19AM +0200, Paolo Abeni wrote: > > On Thu, 2024-04-04 at 17:11 -0700, Alexander Duyck wrote: > > > Again, I would say we look at the blast radius. That is how we should > > > be measuring any change. At this point the driver is self contained > > > into /drivers/net/ethernet/meta/fbnic/. It isn't exporting anything > > > outside that directory, and it can be switched off via Kconfig. > > > > I personally think this is the most relevant point. This is just a new > > NIC driver, completely self-encapsulated. I quickly glanced over the > > code and it looks like it's not doing anything obviously bad. It really > > looks like an usual, legit, NIC driver. > > This is completely true, and as I've said many times the kernel as a > project is substantially about supporting the HW that people actually > build. There is no reason not to merge yet another basic netdev > driver. > > However, there is also a pretty strong red line in Linux where people > belive, with strong conviction, that kernel code should not be merged > only to support a propriety userspace. This submission is clearly > bluring that line. This driver will only run in Meta's proprietary > kernel fork on servers running Meta's propriety userspace. > > At this point perhaps it is OK, a basic NIC driver is not really an > issue, but Jiri is also very correct to point out that this is heading > in a very concerning direction. > > Alex already indicated new features are coming, changes to the core > code will be proposed. How should those be evaluated? Hypothetically > should fbnic be allowed to be the first implementation of something > invasive like Mina's DMABUF work? Google published an open userspace > for NCCL that people can (in theory at least) actually run. Meta would > not be able to do that. I would say that clearly crosses the line and > should not be accepted. Why not? Just because we are not commercially selling it doesn't mean we couldn't look at other solutions such as QEMU. If we were to provide a github repo with an emulation of the NIC would that be enough to satisfy the "commercial" requirement? The fact is I already have an implementation, but I would probably need to clean up a few things as the current setup requires 3 QEMU instances to emulate the full setup with host, firmware, and BMC. It wouldn't be as performant as the actual hardware but it is more than enough for us to test code with. If we need to look at publishing something like that to github in order to address the lack of user availability I could start looking at getting the approvals for that. > So I think there should be an expectation that technically sound things > Meta may propose must not be accepted because they cross the > ideological red line into enabling only proprietary software. That is a faulty argument. That is like saying we should kick out the nouveu driver out of Linux just because it supports Nvidia graphics cards that happen to also have a proprietary out-of-tree driver out there, or maybe we need to kick all the Intel NIC drivers out for DPDK? I can't think of many NIC vendors that don't have their own out-of-tree drivers floating around with their own kernel bypass solutions to support proprietary software. > To me it sets up a fairly common anti-pattern where a vendor starts > out with good intentions, reaches community pushback and falls back to > their downstream fork. 
Once forking occurs it becomes self-reinforcing > as built-up infrastructure like tests and CI will only run correctly > on the fork, and the fork grows. Then eventually the upstream code is > abandoned. This has happened many times before in Linux. > > IMHO from a community perspective I feel like we should expect Meta to > fail and end up with a fork. The community should warn them. However > if they really want to try anyhow then I'm not sure it would be > appropriate to stop them at this point. Meta will just end up being a > "bad vendor". > > I think the best thing the netdev community could do is come up with > some clearer guidelines on what Meta could use fbnic to justify and > what would be rejected (ideologically) and Meta can decide on their > own if they want to continue. I agree. We need a consistent set of standards. I just strongly believe commercial availability shouldn't be one of them.
On Fri, Apr 05, 2024 at 07:24:32AM -0700, Alexander Duyck wrote: > > Alex already indicated new features are coming, changes to the core > > code will be proposed. How should those be evaluated? Hypothetically > > should fbnic be allowed to be the first implementation of something > > invasive like Mina's DMABUF work? Google published an open userspace > > for NCCL that people can (in theory at least) actually run. Meta would > > not be able to do that. I would say that clearly crosses the line and > > should not be accepted. > > Why not? Just because we are not commercially selling it doesn't mean > we couldn't look at other solutions such as QEMU. If we were to > provide a github repo with an emulation of the NIC would that be > enough to satisfy the "commercial" requirement? My test is not "commercial", it is enabling open source ecosystem vs benefiting only proprietary software. In my hypothetical you'd need to do something like open source Meta's implementation of the AI networking that the DMABUF patches enable, and even then since nobody could run it at performance the thing is pretty questionable. IMHO publishing a qemu chip emulator would not advance the open source ecosystem around building a DMABUF AI networking scheme. > > So I think there should be an expectation that technically sound things > > Meta may propose must not be accepted because they cross the > > ideological red line into enabling only proprietary software. > > That is a faulty argument. That is like saying we should kick the > nouveau driver out of Linux just because it supports Nvidia graphics > cards that happen to also have a proprietary out-of-tree driver out > there, Huh? nouveau supports a fully open source mesa graphics stack in Linux. How is that remotely similar to what I said? No issue. > or maybe we need to kick all the Intel NIC drivers out for > DPDK? DPDK is fully open source, again no issue. You pointed at two things that I would consider to be exemplar open source projects and said their existence somehow means we should be purging drivers from the kernel??? I really don't understand what you are trying to say at all. The kernel standard is that good quality open source *does* exist, we tend to not care what proprietary things people create beyond that. > I can't think of many NIC vendors that don't have their own > out-of-tree drivers floating around with their own kernel bypass > solutions to support proprietary software. Most of those are also open source, and we can't say much about what people do out of tree, obviously. > I agree. We need a consistent set of standards. I just strongly > believe commercial availability shouldn't be one of them. I never said commercial availability. I talked about open source vs proprietary userspace. This is very standard kernel stuff. You have an unavailable NIC, so we know it is only ever operated with Meta's proprietary kernel fork, supporting Meta's proprietary userspace software. Where exactly is the open source? Why should someone working to improve only their proprietary environment be welcomed in the same way as someone working to improve the open source ecosystem? That has never been the kernel community's position. If you want to propose things to the kernel that can only be meaningfully used by your proprietary software then you should not expect to succeed. No one should be surprised to hear this. Jason
On Thu, Apr 4, 2024 at 7:38 PM Jakub Kicinski <kuba@kernel.org> wrote: > > On Thu, 4 Apr 2024 17:11:47 -0700 Alexander Duyck wrote: > > > Opensourcing is just one push to github. > > > There are guarantees we give to upstream drivers. > > > > Are there? Do we have them documented somewhere? > > I think they are somewhere in Documentation/ > To some extent this question in itself supports my point that written > down rules, as out of date as they may be, seem to carry more respect > than what a maintainer says :S I think the problem is there are multiple maintainers and they all have different ways of doing things. As a submitter over the years I have had to deal with probably over a half dozen different maintainers and each experience has been different. I have always argued that the netdev tree is one of the better maintained setups. > > > > Eventually they need some kernel changes and then we block those too > > > > because we didn't allow the driver that was the use case? This seems > > > > wrong to me. > > > > > > The flip side of the argument is, what if we allow some device we don't > > > have access to to make changes to the core for its benefit. Owner > > > reports that some changes broke the kernel for them. Kernel rules, > > > regression, we have to revert. This is not a hypothetical, "less than > > > cooperative users" demanding reverts, and "reporting us to Linus" > > > is a reality :( > > > > > > Technical solution? Maybe if it's not a public device regression rules > > > don't apply? Seems fairly reasonable. > > > > This is a hypothetical. This driver currently isn't changing anything > > outside of itself. At this point the driver would only be build tested > > by everyone else. They could just not include it in their Kconfig and > > then out-of-sight, out-of-mind. > > Not changing does not mean not depending on existing behavior. > Investigating and fixing properly even the hardest regressions in > the stack is a bar that Meta can so easily clear. I don't understand > why you are arguing. I wasn't saying the driver wouldn't be dependent on existing behavior. I was saying that it was a hypothetical that Meta would be a "less than cooperative user" and demand a revert. It is also a hypothetical that Linus wouldn't just propose a revert of the fbnic driver instead of the API for the crime of being a "less than cooperative maintainer" and then give Meta the Nvidia treatment. > > > > Anyways we have zero ways to enforce such a policy. Have vendors > > > > ship a NIC to somebody with the v0 of the patch set? Attach a picture? > > > > > > GenAI world, pictures mean nothing :) We do have a CI in netdev, which > > > is all ready to ingest external results, and a (currently tiny amount?) > > > of tests for NICs. Prove that you care about the device by running the > > > upstream tests and reporting results? Seems fairly reasonable. > > > > That seems like an opportunity to be exploited, though. Are the > > results going to be verified in any way? Maybe cryptographically > > signed? Seems like it would be easy enough to fake the results. > > I think it's much easier to just run the tests than write a system > which will competently lie. But even if we completely suspend trust, > someone lying is of no cost to the community in this case. I don't get this part. You are paranoid about bad actors until it comes to accepting the test results? So write a broken API, "prove" it works by running it on your broken test setup, and then get it upstream after establishing a false base for trust.
Seems like a perfect setup for an exploit like what happened with the xz setup. > > > > Even if vendor X claims they will have a product in N months and > > > > then only sells it to qualified customers, what do we do then? > > > > Driver author could even believe the hardware will be available > > > > when they post the driver, but business may change out of the hands > > > > of the developer. > > > > > > > > I'm 100% on letting this through assuming Alex is on top of feedback > > > > and the code is good. > > > > > > I'd strongly prefer if we detach our trust and respect for Alex > > > from whatever precedent we make here. I can't stress this enough. > > > IDK if I'm exaggerating or it's hard to appreciate the challenges > > > of maintainership without living it, but I really don't like being > > > accused of playing favorites or big companies buying their way in :( > > > > Again, I would say we look at the blast radius. That is how we should > > be measuring any change. At this point the driver is self contained > > into /drivers/net/ethernet/meta/fbnic/. It isn't exporting anything > > outside that directory, and it can be switched off via Kconfig. > > It is not practical to ponder every change case by case. Maintainers > are overworked. How long until we send the uAPI patch for RSS on the > flow label? I'd rather not re-litigate this every time someone posts > a slightly different feature. Let's cover the obvious points from > the beginning while everyone is paying attention. We can amend later > as need be. Isn't that what we are doing right now? We are basically refusing this patch set not based on its own merits but on a "what-if" scenario for a patch set that might come at some point in the future and conjecture that somehow it is going to be able to add features for just itself when we haven't allowed that in the past, for example with things like GSO partial. > > When the time comes to start adding new features we can probably start > > by looking at how to add either generic offloads like was done for > > GSO, CSO, etc. or how it can also be implemented on another vendor's > > NIC. > > > > At this point the only risk the driver presents is that it is yet > > another driver, done in the same style I did the other Intel drivers, > > and so any kernel API changes will end up needing to be applied to it > > just like the other drivers. > > The risk is we'll have a fight every time there is a disagreement about > the expectations. We always do. I am not sure why you would expect that would change by blocking this patch set. If anything it sounds like maybe we need to document some requirements for availability for testing, and expectations for what it would mean to be a one-off device where only one entity has access to it. However that is a process problem, and not so much a patch or driver issue. > > > > I think any other policy would be very ugly to enforce, prove, and > > > > even understand. Obviously code and architecture debates I'm all for. > > > > Ensuring we have a trusted, experienced person signed up to review > > > > code, address feedback, fix whatever syzbot finds and so on is also a > > > > must I think. I'm sure Alex will take care of it. > > > > > > "Whatever syzbot finds" may be slightly moot for a private device ;) > > > but otherwise 100%! These are exactly the kind of points I think we > > > should enumerate.
I started writing a list of expectations a while back: > > > > > > Documentation/maintainer/feature-and-driver-maintainers.rst > > > > > > I think we just need something like this, maybe just a step up, for > > > non-public devices. > > I honestly think we are putting the cart before the horse. When we > > start talking about kernel API changes then we can probably get into > > the whole "private" versus "publicly available" argument. A good > > example of the kind of thing I am thinking of is GSO partial where I > > ended up with Mellanox and Intel sending me 40G and 100G NICs and > > cables to implement it on their devices as all I had was essentially > > igb and ixgbe based NICs. > > That'd be great. Maybe even more than I'd expect. So why not write > it down? In case the person doing the coding is not Alex Duyck, and > just wants to get it done for their narrow use case, get a promo, > go work on something else? Write what down? That the vendors didn't like me harassing them to test my code so they shipped me the NICs and asked me to just test it myself? That worked in my scenario as I had a server-level system in my home lab to make that work. We cannot expect everyone to have that kind of setup for their own development. That is why I am considering the QEMU approach as that might make this a bit more accessible. I could then look at enabling the QEMU at the same time I enable the driver. > > Odds are when we start getting to those kinds of things maybe we need > > to look at having a few systems available for developer use, but until > > then I am not sure it makes sense to focus on if the device is > > publicly available or not. > > Developer access would be huge. > A mirage of developer access? Immaterial :) If nothing else, maybe we need a writeup somewhere of the level of support a driver should expect from the Linux community if the device is not "easily available". We could probably define that in terms of what a reasonable expectation would be for a developer to have access to it. I would say that "commercial sales" is not a good metric and shouldn't come into it. If anything it would be about device availability for development and testing. In addition I would be good with a definition of "support" for the case of a device that isn't publicly available being quite limited, as those with access would have to have active involvement in the community to enable support. Without that it might as well be considered orphaned and the driver dropped.
On Fri, Apr 5, 2024 at 8:17 AM Jason Gunthorpe <jgg@nvidia.com> wrote: > > On Fri, Apr 05, 2024 at 07:24:32AM -0700, Alexander Duyck wrote: > > > Alex already indicated new features are coming, changes to the core > > > code will be proposed. How should those be evaluated? Hypothetically > > > should fbnic be allowed to be the first implementation of something > > > invasive like Mina's DMABUF work? Google published an open userspace > > > for NCCL that people can (in theory at least) actually run. Meta would > > > not be able to do that. I would say that clearly crosses the line and > > > should not be accepted. > > > > Why not? Just because we are not commercially selling it doesn't mean > > we couldn't look at other solutions such as QEMU. If we were to > > provide a github repo with an emulation of the NIC would that be > > enough to satisfy the "commercial" requirement? > > My test is not "commercial", it is enabling open source ecosystem vs > benefiting only proprietary software. Sorry, that was where this started, with Jiri stating that we had to be selling this. > In my hypothetical you'd need to do something like open source Meta's > implementation of the AI networking that the DMABUF patches enable, > and even then since nobody could run it at performance the thing is > pretty questionable. > > IMHO publishing a qemu chip emulator would not advance the open source > ecosystem around building a DMABUF AI networking scheme. Well not too many will be able to afford getting the types of systems and hardware needed for this in the first place. Primarily just your large data center companies can afford it. I never said this hardware is about enabling DMABUF. You implied that. The fact is that this driver is meant to be a pretty basic speeds and feeds device. We support header split and network flow classification so I suppose it could be used for DMABUF but by that logic so could a number of other drivers. > > > So I think there should be an expectation that technically sound things > > > Meta may propose must not be accepted because they cross the > > > ideological red line into enabling only proprietary software. > > > > That is a faulty argument. That is like saying we should kick the > > nouveau driver out of Linux just because it supports Nvidia graphics > > cards that happen to also have a proprietary out-of-tree driver out > > there, > > Huh? nouveau supports a fully open source mesa graphics stack in > Linux. How is that remotely similar to what I said? No issue. Right, nouveau is fully open source. That is what I am trying to do with fbnic. That is what I am getting at. This isn't connecting to some proprietary stack or engaging in any sort of bypass. It is going through the standard networking stack. If there were some other out-of-tree driver for this to support some other use case how would that impact the upstream patch submission? This driver is being NAKed for enabling stuff that hasn't even been presented. It is barely enough driver to handle PXE booting which is needed to be able to even load an OS on the system. Yet somehow because you are expecting a fork to come in at some point to support DMABUF you are wanting to block it outright. How about rather than doing that we wait until there is something there that is objectionable before we start speculating on what may be coming. > You pointed at two things that I would consider to be exemplar open > source projects and said their existence somehow means we should be > purging drivers from the kernel???
> > I really don't understand what you are trying to say at all. I'm trying to say that both those projects are essentially doing the same thing you are accusing fbnic of doing, even though I am exposing no non-standard API(s) and everything is open source. You are projecting future changes onto this driver that don't currently and may never exist. > The kernel standard is that good quality open source *does* exist, we > tend to not care what proprietary things people create beyond that. Now I am confused. You say you don't care what happens later, but you seem to be insisting you care about what proprietary things will be done with it after it is upstreamed. > > I can't think of many NIC vendors that don't have their own > > out-of-tree drivers floating around with their own kernel bypass > > solutions to support proprietary software. > > Most of those are also open source, and we can't say much about what > people do out of tree, obviously. Isn't that exactly what you are doing though with all your "proprietary" comments? > > I agree. We need a consistent set of standards. I just strongly > > believe commercial availability shouldn't be one of them. > > I never said commercial availability. I talked about open source vs > proprietary userspace. This is very standard kernel stuff. > > You have an unavailable NIC, so we know it is only ever operated with > Meta's proprietary kernel fork, supporting Meta's proprietary > userspace software. Where exactly is the open source? It depends on your definition of "unavailable". I could argue that many of the Mellanox NICs also have limited availability as they aren't exactly easy to get a hold of without paying a hefty ransom. The NIC is currently available to developers within Meta. As such I know there is no small number of kernel developers who could get access to it if they asked for a login to one of our test and development systems. Also I offered to provide the QEMU repo, but you said you had no interest in that option. > Why should someone working to improve only their proprietary > environment be welcomed in the same way as someone working to improve > the open source ecosystem? That has never been the kernel community's > position. To quote Linus `I do not see open source as some big goody-goody "let's all sing kumbaya around the campfire and make the world a better place". No, open source only really works if everybody is contributing for their own selfish reasons.`[1] How is us using our own NIC any different than if one of the vendors were to make a NIC exclusively for us or any other large data center? The only reason why this is coming up is because Meta is not a typical NIC vendor but normally a consumer. The fact that we will be dogfooding our own NIC seems to be at the heart of the issue here. Haven't there been a number of maintainers who end up maintaining code bases in the kernel for platforms and/or devices where they own one of the few devices available in the world? How would this be any different? Given enough time it is likely this will end up in the hands of those outside Meta anyway; at that point the argument would be moot. > If you want to propose things to the kernel that can only be > meaningfully used by your proprietary software then you should not > expect to succeed. No one should be surprised to hear this.
If the whole idea is to get us to run a non-proprietary stack, nothing sends the exact opposite message like telling us we cannot upstream a simple network driver because of a "what if" about some DMABUF patch set from Google. All I am asking for is the ability to net install a system with this device. That requires the driver to be available in the provisioning kernel image, which is why I am working to upstream it as I would rather not have to maintain an out-of-tree kernel driver. The argument here isn't about proprietary software. It is proprietary hardware that seems to be the issue, or at least that is where it started. The driver itself anyone could load, build, or even run on QEMU as I mentioned. It is open source and not exposing any new APIs. The issue seems to be with the fact that the NIC can't be bought from a vendor and instead Meta is building the NIC for its own consumption. As far as the software stack the concern about DMABUF seems like an orthogonal argument that should be had at the userspace/API level and doesn't directly relate to any specific driver. As has been pointed out enabling anything like that wouldn't be a single NIC solution and to be accepted upstream it should be implemented on at least 2 different vendor drivers. Additionally, there isn't anything unique about this hardware that would make it more capable of enabling that than any other device. Thanks, - Alex [1]: https://www.bbc.com/news/technology-18419231
On Fri, Apr 05, 2024 at 11:38:25AM -0700, Alexander Duyck wrote: > > In my hypothetical you'd need to do something like open source Meta's > > implementation of the AI networking that the DMABUF patches enable, > > and even then since nobody could run it at performance the thing is > > pretty questionable. > > > > IMHO publishing a qemu chip emulator would not advance the open source > > ecosystem around building a DMABUF AI networking scheme. > > Well not too many will be able to afford getting the types of systems > and hardware needed for this in the first place. Primarily just your > large data center companies can afford it. > > I never said this hardware is about enabling DMABUF. I presented a hypothetical to be able to illustrate a scenario where this driver should not be used to justify invasive core kernel changes. I have no idea what future things you have in mind, or if any will reach a threshold where I would expect they should not be included. You were the one saying a key reason you wanted this driver was to push core changes and you said you imagine changes that are unique to fbnic that "others might like to follow". I'm being very clear to say that there are some core changes that should not be accepted due to the kernel's open source ideology. > Right, nouveau is fully open source. That is what I am trying to do > with fbnic. That is what I am getting at. This isn't connecting to > some proprietary stack or engaging in any sort of bypass. The basic driver presented here is not; a future driver that justifies unknown changes to the core may be. This is why my message was pretty clear. IMHO there is nothing wrong with this series, but I do not expect you will get everything you want in future due to this issue. I said decide if you want to continue. I'm not NAKing anything on this series. > I'm trying to say that both those projects are essentially doing the > same thing you are accusing fbnic of doing, Not even close. Both those projects support open source ecosystems, have wide cross-vendor participation. fbnic isn't even going to be enabled in any distribution. > > You have an unavailable NIC, so we know it is only ever operated with > > Meta's proprietary kernel fork, supporting Meta's proprietary > > userspace software. Where exactly is the open source? > > It depends on your definition of "unavailable". I could argue that > many of the Mellanox NICs also have limited availability as > they aren't exactly easy to get a hold of without paying a hefty > ransom. And GNICs that run Mina's series are completely unavailable right now. That is still a big difference: a temporary issue versus a permanent structural intention of the manufacturer. > > Why should someone working to improve only their proprietary > > environment be welcomed in the same way as someone working to improve > > the open source ecosystem? That has never been the kernel community's > > position. > > To quote Linus `I do not see open source as some big goody-goody > "let's all sing kumbaya around the campfire and make the world a > better place". No, open source only really works if everybody is > contributing for their own selfish reasons.`[1] I think that stance has evolved and the consensus position toward uapi is stronger. > different. Given enough time it is likely this will end up in the > hands of those outside Meta anyway; at that point the argument would > be moot. Oh, I'm skeptical about that. You seem to have taken my original email in a strange direction.
I said this series was fine but cautioned that if you proceed you should be expecting an eventual feature rejection for ideological reasons, and gave a hypothetical example of what that would look like. If you want to continue or not is up to Meta, in my view. Jason
On Fri, Apr 5, 2024 at 12:02 PM Jason Gunthorpe <jgg@nvidia.com> wrote: > > On Fri, Apr 05, 2024 at 11:38:25AM -0700, Alexander Duyck wrote: > > > > In my hypothetical you'd need to do something like open source Meta's > > > implementation of the AI networking that the DMABUF patches enable, > > > and even then since nobody could run it at performance the thing is > > > pretty questionable. > > > > > > IMHO publishing a qemu chip emulator would not advance the open source > > > ecosystem around building a DMABUF AI networking scheme. > > > > Well not too many will be able to afford getting the types of systems > > and hardware needed for this in the first place. Primarily just your > > large data center companies can afford it. > > > > I never said this hardware is about enabling DMABUF. > > I presented a hypothetical to be able to illustrate a scenario where > this driver should not be used to justify invasive core kernel > changes. > > I have no idea what future things you have in mind, or if any will > reach a threshold where I would expect they should not be > included. You were the one saying a key reason you wanted this driver > was to push core changes and you said you imagine changes that are > unique to fbnic that "others might like to follow". > > I'm being very clear to say that there are some core changes that should > not be accepted due to the kernel's open source ideology. Okay, on core changes I 100% agree. That is one of the reasons why we have the whole thing about any feature really needing to be enabled on at least 2 different vendor devices. > > Right, nouveau is fully open source. That is what I am trying to do > > with fbnic. That is what I am getting at. This isn't connecting to > > some proprietary stack or engaging in any sort of bypass. > > The basic driver presented here is not; a future driver that justifies > unknown changes to the core may be. > > This is why my message was pretty clear. IMHO there is nothing wrong > with this series, but I do not expect you will get everything you want > in future due to this issue. > > I said decide if you want to continue. I'm not NAKing anything on this > series. My apologies. I had interpreted your statement as essentially agreeing with Jiri and NAKing the patches simply for not being commercially available. > > I'm trying to say that both those projects are essentially doing the > > same thing you are accusing fbnic of doing, > > Not even close. Both those projects support open source ecosystems, > have wide cross-vendor participation. fbnic isn't even going to be > enabled in any distribution. I can't say for certain if that will be the case going forward or not. I know we haven't reached out to any distros and currently don't need to. With that said though, having the driver available as a module shouldn't cause any harm either so I don't really have a strong opinion about it either way. Seeing as how we aren't in the NIC business I am not sure how restrictive we would be about licensing out the IP. I could see Meta potentially trying to open up the specs just to take the manufacturing burden off of us and enable more competition, though I am on the engineering side and not so much sourcing so I am just speculating. > > > You have an unavailable NIC, so we know it is only ever operated with > > > Meta's proprietary kernel fork, supporting Meta's proprietary > > > userspace software. Where exactly is the open source? > > > > It depends on your definition of "unavailable".
I could argue that > > many of the Mellanox NICs also have limited availability as > > they aren't exactly easy to get a hold of without paying a hefty > > ransom. > > And GNICs that run Mina's series are completely unavailable right > now. That is still a big difference: a temporary issue versus a > permanent structural intention of the manufacturer. I'm assuming it is some sort of firmware functionality that is needed to enable it? One thing with our design is that the firmware actually has minimal functionality. Basically it is the liaison between the BMC, Host, and the MAC. Otherwise it has no role to play in the control path so when the driver is loaded it is running the show. > > > Why should someone working to improve only their proprietary > > > environment be welcomed in the same way as someone working to improve > > > the open source ecosystem? That has never been the kernel community's > > > position. > > > > To quote Linus `I do not see open source as some big goody-goody > > "let's all sing kumbaya around the campfire and make the world a > > better place". No, open source only really works if everybody is > > contributing for their own selfish reasons.`[1] > > I think that stance has evolved and the consensus position toward uapi > is stronger. I assume you are talking about the need to have an open source consumer for any exposed uapi? That makes sense from the test and development standpoint as you need to have some way to exercise and test any interface that is available in the kernel. > > different. Given enough time it is likely this will end up in the > > hands of those outside Meta anyway; at that point the argument would > > be moot. > > Oh, I'm skeptical about that. I didn't say how widely or when. I got my introduction to Linux by buying used server systems and trying to get something maintainable in terms of OS on them. From my experience you end up with all sorts of odd-ball proprietary parts that eventually end up leaking out to the public. Though back then it was more likely to be a proprietary spin of some known silicon with a few tweaks that had to be accounted for. > You seem to have taken my original email in a strange direction. I > said this series was fine but cautioned that if you proceed you should > be expecting an eventual feature rejection for ideological reasons, and > gave a hypothetical example of what that would look like. > > If you want to continue or not is up to Meta, in my view. > > Jason Yeah, I think I had misunderstood your original comment as being in support of Jiri's position. I hadn't fully grokked that you were doing more of a "yes, but" versus a "yes, and". For this driver I don't see there being too many complications as there shouldn't be much in the way of any sort of kernel bypass that would be applicable to this driver uniquely. Thanks, - Alex
> I'm assuming it is some sort of firmware functionality that is needed > to enable it? One thing with our design is that the firmware actually > has minimal functionality. Basically it is the liaison between the > BMC, Host, and the MAC. Otherwise it has no role to play in the > control path so when the driver is loaded it is running the show. Which I personally feel is great. In an odd way, this to me indicates this is a commodity product, or at least, leading the way towards commodity 100G products. Looking at the embedded SoC NIC market, which is pretty much about commodity, few 1G Ethernet NICs have firmware. Most 10G NICs also have no firmware. Linux is driving the hardware. Much of the current Linux infrastructure is limited to 10G, because currently everything faster than that hides away in firmware; Linux does not get to drive it. This driver could help push Linux controlling the hardware forward, to the benefit of us all. It would be great if this driver used phylink to manage the PCS and the SFP cage, and if the PCS code were moved into drivers/net/pcs, etc. Assuming this PCS follows the standards, it would be great to add helpers like we have for clause 37, clause 73, to help support other future PCS drivers which will appear. 100G in SoCs is probably not going to appear too soon, but single channel 25G is probably the next step after 10G. And what is added for this device will probably also work for 25G. 40G via 4 channels is probably not too far away either. Our Linux SFP driver is also currently limited to 10G. It would be great if this driver could push that forwards to support faster SFP cages and devices, support splitting and merging, etc. None of this requires new kAPIs; they all already exist. There is nothing controversial here. Everything follows standards. So if Meta were to abandon the MAC driver, it would not matter; it's not dead infrastructure code, future drivers would make use of it, as this technology becomes more and more commodity. Andrew
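For readers less familiar with phylink, the integration Andrew is suggesting looks roughly like the sketch below. This is a generic illustration rather than code from the fbnic submission: the fbnic_* names, the priv layout, and the 25GBASE-R interface mode are all hypothetical, and the exact phylink callback signatures have shifted across kernel versions.

#include <linux/bitops.h>
#include <linux/err.h>
#include <linux/netdevice.h>
#include <linux/phylink.h>

struct fbnic_priv {
	struct net_device *netdev;
	struct phylink *phylink;
	struct phylink_config phylink_config;
};

/* Reprogram the MAC/PCS for the requested interface mode. */
static void fbnic_mac_config(struct phylink_config *config, unsigned int mode,
			     const struct phylink_link_state *state)
{
}

/* Quiesce the MAC when the link goes down. */
static void fbnic_mac_link_down(struct phylink_config *config,
				unsigned int mode, phy_interface_t interface)
{
}

/* Enable the MAC with the negotiated speed/duplex/pause settings. */
static void fbnic_mac_link_up(struct phylink_config *config,
			      struct phy_device *phy, unsigned int mode,
			      phy_interface_t interface, int speed, int duplex,
			      bool tx_pause, bool rx_pause)
{
}

static const struct phylink_mac_ops fbnic_phylink_mac_ops = {
	.mac_config	= fbnic_mac_config,
	.mac_link_down	= fbnic_mac_link_down,
	.mac_link_up	= fbnic_mac_link_up,
};

static int fbnic_phylink_init(struct fbnic_priv *priv,
			      struct fwnode_handle *fwnode)
{
	struct phylink_config *cfg = &priv->phylink_config;

	cfg->dev = &priv->netdev->dev;
	cfg->type = PHYLINK_NETDEV;
	cfg->mac_capabilities = MAC_25000FD;	/* example capability only */
	__set_bit(PHY_INTERFACE_MODE_25GBASER, cfg->supported_interfaces);

	priv->phylink = phylink_create(cfg, fwnode,
				       PHY_INTERFACE_MODE_25GBASER,
				       &fbnic_phylink_mac_ops);
	return PTR_ERR_OR_ZERO(priv->phylink);
}

Once a driver is wired up this way, phylink owns link-state tracking and SFP handling, which is what makes MAC/PCS code moved onto it shared infrastructure rather than vendor-private plumbing.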
On Fri, Apr 5, 2024 at 7:01 AM Przemek Kitszel <przemyslaw.kitszel@intel.com> wrote: > > On 4/3/24 22:08, Alexander Duyck wrote: > > This patch set includes the necessary patches to enable basic Tx and Rx > > over the Meta Platforms Host Network Interface. To do this we introduce a > > new driver and driver and directories in the form of > > "drivers/net/ethernet/meta/fbnic". > > > > Due to submission limits the general plan to submit a minimal driver for > > now almost equivalent to a UEFI driver in functionality, and then follow up > > over the coming weeks enabling additional offloads and more features for > > the device. > > > > The general plan is to look at adding support for ethtool, statistics, and > > start work on offloads in the next set of patches. > > > > --- > > > > Alexander Duyck (15): > > PCI: Add Meta Platforms vendor ID > > eth: fbnic: add scaffolding for Meta's NIC driver > > eth: fbnic: Allocate core device specific structures and devlink interface > > eth: fbnic: Add register init to set PCIe/Ethernet device config > > eth: fbnic: add message parsing for FW messages > > eth: fbnic: add FW communication mechanism > > eth: fbnic: allocate a netdevice and napi vectors with queues > > eth: fbnic: implement Tx queue alloc/start/stop/free > > eth: fbnic: implement Rx queue alloc/start/stop/free > > eth: fbnic: Add initial messaging to notify FW of our presence > > eth: fbnic: Enable Ethernet link setup > > eth: fbnic: add basic Tx handling > > eth: fbnic: add basic Rx handling > > eth: fbnic: add L2 address programming > > eth: fbnic: write the TCAM tables used for RSS control and Rx to host > > > > > > MAINTAINERS | 7 + > > drivers/net/ethernet/Kconfig | 1 + > > drivers/net/ethernet/Makefile | 1 + > > drivers/net/ethernet/meta/Kconfig | 29 + > > drivers/net/ethernet/meta/Makefile | 6 + > > drivers/net/ethernet/meta/fbnic/Makefile | 18 + > > drivers/net/ethernet/meta/fbnic/fbnic.h | 148 ++ > > drivers/net/ethernet/meta/fbnic/fbnic_csr.h | 912 ++++++++ > > .../net/ethernet/meta/fbnic/fbnic_devlink.c | 86 + > > .../net/ethernet/meta/fbnic/fbnic_drvinfo.h | 5 + > > drivers/net/ethernet/meta/fbnic/fbnic_fw.c | 823 ++++++++ > > drivers/net/ethernet/meta/fbnic/fbnic_fw.h | 133 ++ > > drivers/net/ethernet/meta/fbnic/fbnic_irq.c | 251 +++ > > drivers/net/ethernet/meta/fbnic/fbnic_mac.c | 1025 +++++++++ > > drivers/net/ethernet/meta/fbnic/fbnic_mac.h | 83 + > > .../net/ethernet/meta/fbnic/fbnic_netdev.c | 470 +++++ > > .../net/ethernet/meta/fbnic/fbnic_netdev.h | 59 + > > drivers/net/ethernet/meta/fbnic/fbnic_pci.c | 633 ++++++ > > drivers/net/ethernet/meta/fbnic/fbnic_rpc.c | 709 +++++++ > > drivers/net/ethernet/meta/fbnic/fbnic_rpc.h | 189 ++ > > drivers/net/ethernet/meta/fbnic/fbnic_tlv.c | 529 +++++ > > drivers/net/ethernet/meta/fbnic/fbnic_tlv.h | 175 ++ > > drivers/net/ethernet/meta/fbnic/fbnic_txrx.c | 1873 +++++++++++++++++ > > drivers/net/ethernet/meta/fbnic/fbnic_txrx.h | 125 ++ > > include/linux/pci_ids.h | 2 + > > 25 files changed, 8292 insertions(+) > > Even if this is just a basic scaffolding for what will come, it's hard > to believe that no patch was co-developed, or should be marked as > authored-by some other developer. > > [...] I don't want to come across as snarky, but you must be new to Intel? If nothing else you might ask a few people there about the history of the fm10k drivers. I think I did most of the Linux and FreeBSD fm10k drivers in about 2 to 3 years. 
Typically getting basic Tx and Rx up and running on a driver only takes a few weeks, and it is pretty straightforward when you are implementing the QEMU at the same time to test it on. From my experience driver development really goes by the Pareto principle where getting basic Tx/Rx up and running is usually a quick task. What takes forever is enabling all the other offloads and functions. As far as this driver goes I would say this is something similar, only this time I have worked on a Linux and UEFI driver, both of which I am hoping to get upstreamed. With that said I can go through the bits for the yet-to-be-upstreamed parts that weren't done by me. We had tried to bring a few people onto the team early on, none of whom are with the team anymore but a couple are still with the company so I can reach out to them and see if they are okay with the Co-author attribution before I submit those patches. I have a few people who worked on the ptp and debugfs, one who enabled Tx offloads and the ethtool ring configuration, and another who had just started work on the UEFI driver before he left. In addition there was an intern who did most of the work on the ethtool loopback test. When the layoffs hit in late 2022 the team was basically reduced to just myself and a firmware developer. Neither of us really strayed too much into the other's code. Basically I defined the mailbox interface and messages and we went our separate ways from there. In the last 6 months our team started hiring again. The new hires are currently working in areas that aren't in this set such as devlink firmware update, ethtool register dump, and various other interactions with the firmware.
On Sat, Apr 6, 2024 at 9:49 AM Andrew Lunn <andrew@lunn.ch> wrote: > > > I'm assuming it is some sort of firmware functionality that is needed > > to enable it? One thing with our design is that the firmware actually > > has minimal functionality. Basically it is the liaison between the > > BMC, Host, and the MAC. Otherwise it has no role to play in the > > control path so when the driver is loaded it is running the show. > > Which I personally feel is great. In an odd way, this to me indicates > this is a commodity product, or at least, leading the way towards > commodity 100G products. Looking at the embedded SoC NIC market, which > is pretty much about commodity, few 1G Ethernet NICs have firmware. > Most 10G NICs also have no firmware. Linux is driving the hardware. > > Much of the current Linux infrastructure is limited to 10G, because > currently everything faster than that hides away in firmware; Linux > does not get to drive it. This driver could help push Linux > controlling the hardware forward, to the benefit of us all. It would be > great if this driver used phylink to manage the PCS and the SFP cage, > and if the PCS code were moved into drivers/net/pcs, etc. Assuming this > PCS follows the standards, it would be great to add helpers like we > have for clause 37, clause 73, to help support other future PCS > drivers which will appear. 100G in SoCs is probably not going to > appear too soon, but single channel 25G is probably the next step > after 10G. And what is added for this device will probably also work > for 25G. 40G via 4 channels is probably not too far away either. > > Our Linux SFP driver is also currently limited to 10G. It would be > great if this driver could push that forwards to support faster SFP > cages and devices, support splitting and merging, etc. > > None of this requires new kAPIs; they all already exist. There is > nothing controversial here. Everything follows standards. So if Meta > were to abandon the MAC driver, it would not matter; it's not dead > infrastructure code, future drivers would make use of it, as this > technology becomes more and more commodity. > > Andrew As far as the MAC/PCS code goes I will have to see what I can do. I think I have to check with our sourcing team to figure out what contracts are in place for whatever IP we are currently using before I can share any additional info beyond the code here. One other complication I can think of in terms of switching things over as you have requested is that we will probably need to look at splitting up the fbnic_mac.c file, as it is currently used for both the UEFI driver and the Linux driver, so I will need a solution for the UEFI driver, which wouldn't have the advantage of phylink. Thanks, - Alex
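One common pattern for the fbnic_mac.c split Alex mentions is to keep the register-level MAC logic free of OS calls and route all I/O through a small callback table that each environment supplies. A minimal sketch of that idea, with entirely hypothetical names (nothing below comes from the actual fbnic code):

/* Environment-neutral MAC core: only fixed-width types, no OS headers,
 * so the same file can be built into both a Linux and a UEFI driver.
 */
#include <stdint.h>

struct mac_io_ops {
	uint32_t (*rd32)(void *ctx, uint32_t reg);
	void (*wr32)(void *ctx, uint32_t reg, uint32_t val);
	void *ctx;	/* environment-private handle, e.g. a BAR mapping */
};

/* Hypothetical register layout, for illustration only. */
#define MAC_CTRL	0x0000u
#define MAC_CTRL_EN	0x00000001u

/* Shared helper: enable the MAC regardless of host environment. */
static void mac_core_enable(const struct mac_io_ops *io)
{
	uint32_t ctrl = io->rd32(io->ctx, MAC_CTRL);

	io->wr32(io->ctx, MAC_CTRL, ctrl | MAC_CTRL_EN);
}

Under this scheme the Linux driver would back rd32/wr32 with readl/writel on the mapped BAR, while the UEFI driver would route them through the EFI PCI I/O protocol; the shared file never needs to know which environment it is running in.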
On Fri, Apr 05, 2024 at 08:41:11AM -0700, Alexander Duyck wrote: > On Thu, Apr 4, 2024 at 7:38 PM Jakub Kicinski <kuba@kernel.org> wrote: <...> > > > > Technical solution? Maybe if it's not a public device regression rules > > > > don't apply? Seems fairly reasonable. > > > > > > This is a hypothetical. This driver currently isn't changing anything > > > outside of itself. At this point the driver would only be build tested > > > by everyone else. They could just not include it in their Kconfig and > > > then out-of-sight, out-of-mind. > > > > Not changing does not mean not depending on existing behavior. > > Investigating and fixing properly even the hardest regressions in > > the stack is a bar that Meta can so easily clear. I don't understand > > why you are arguing. > > I wasn't saying the driver wouldn't be dependent on existing behavior. > I was saying that it was a hypothetical that Meta would be a "less > than cooperative user" and demand a revert. It is also a hypothetical > that Linus wouldn't just propose a revert of the fbnic driver instead > of the API for the crime of being a "less than cooperative maintainer" > and then give Meta the Nvidia treatment. It is very easy to be a "less than cooperative maintainer" in the netdev world. 1. Be a vendor. 2. Propose ideas which are different. 3. Report a user-visible regression. 4. Ask for a fix from the patch author or demand a revert according to netdev rules/practice. And voilà, you are a "less than cooperative maintainer". So in reality, the "hypothetical" is very close to the reality, unless Meta's contribution is treated as a special case. Thanks
Thu, Apr 04, 2024 at 09:22:02PM CEST, alexander.duyck@gmail.com wrote: >On Thu, Apr 4, 2024 at 8:36 AM Jiri Pirko <jiri@resnulli.us> wrote: >> >> Thu, Apr 04, 2024 at 04:45:14PM CEST, alexander.duyck@gmail.com wrote: >> >On Thu, Apr 4, 2024 at 4:37 AM Jiri Pirko <jiri@resnulli.us> wrote: >> >> >> >> Wed, Apr 03, 2024 at 10:08:24PM CEST, alexander.duyck@gmail.com wrote: > ><...> > >> >> Could you please shed some light for the motivation to introduce this >> >> driver in the community kernel? Is this device something people can >> >> obtain in a shop, or is it rather something to be seen in Meta >> >> datacenter only? If the second is the case, why exactly would we need >> >> this driver? >> > >> >For now this is Meta only. However there are several reasons for >> >wanting to include this in the upstream kernel. >> > >> >First is the fact that from a maintenance standpoint it is easier to >> >avoid drifting from the upstream APIs and such. If we are in the kernel >> >it makes things much easier to maintain as we can just pull in patches >> >without having to add onto that work by having to craft backports >> >around code that isn't already in upstream. >> >> That is making life easier for you, making it harder for the community. >> Zero relevance. >> >> >> > >> >Second is the fact that as we introduce new features with our driver >> >it is much easier to show a proof of concept with the driver being in >> >the kernel than not. It makes it much harder to work with the >> >community on offloads and such if we don't have a good vehicle to use >> >for that. What this driver will provide is an opportunity to push >> >changes that would be beneficial to us, and likely the rest of the >> >community without being constrained by what vendors decide they want >> >to enable or not. The general idea is that if we can show benefit with >> >our NIC then other vendors would be more likely to follow in our path. >> >> Yeah, so not only would we have to maintain a driver nobody (outside Meta) >> uses or cares about, you say that we will likely maintain more dead >> code related to that. I think that in the Linux kernel, there are many >> examples of similarly dead code that causes a lot of headaches to >> maintain. >> >> You just want to make your life easier here again. Don't drag the community >> into this please. > >The argument itself doesn't really hold water. The fact is the Meta >data centers are not an insignificant consumer of Linux, so it isn't >as if the driver isn't going to be used. This implies some lack of Used by one user. Consider a person creating some custom proprietary FPGA-based pet project for himself, trying to add a driver for it to the mainline kernel. Why? Nobody else will ever see the device, so why should the community be involved at all? It does not make sense. Keep the driver for your internal cook-ups internal. >good faith from Meta. I don't understand that as we are contributing >across multiple areas in the kernel including networking and ebpf. Is >Meta expected to start pulling time from our upstream maintainers to >have them update out-of-tree kernel modules since the community isn't >willing to let us maintain it in the kernel? Is the message that the If Meta contributes whatever may be useful for somebody else, it is completely fine. This driver is not useful for anyone except Meta. >kernel is expected to get value from Meta, but that value is not meant >to be reciprocated?
Would you really rather have us start maintaining >our own internal kernel with our own "proprietary goodness", and ask I don't care, maintain whatever you want internally. Totally up to you. Just try to understand my POV. I may believe you have good faith and everything. But still, I think that the community has to be selfish. >other NIC vendors to have to maintain their drivers against yet >another kernel if they want to be used in our data centers? > >As pointed out by Andrew we aren't the first data center to push a >driver for our own proprietary device. The fact is there have been If your proprietary device is used by other people running virtual machines on your systems, that is completely fine. But that is an incorrect analogy to your NIC; no outside-Meta person will ever see it! >drivers added for devices that were for purely emulated devices with >no actual customers such as rocker. Should the switch vendors at the This is a completely false analogy. Rocker was introduced to solve the chicken-and-egg problem of adding switch device support into the kernel. It served the purpose quite well. Let it rot now. >time have pushed back on it stating it wasn't a real "for sale" >device? The whole argument seems counter to what is expected. When a >vendor creates a new device and will likely be enabling new kernel >features my understanding is that it is better to be in the kernel >than not. > >Putting a criterion on it that it must be "for sale" seems rather Not "for sale", but "available to the outside person". >arbitrary and capricious, especially given that most drivers have to Not capricious at all; I'm sorry you feel that way. You pursue your company's goals; my position here is to defend the community against the unnecessary and pointless burden you are putting on it. >be pushed out long before they are available in the market in order to >meet deadlines to get the driver into OSV releases such as Redhat when >it hits the market. By that logic should we block all future drivers >until we can find them for sale somewhere? That way we don't run the That is of course obviously a completely false analogy again. You never plan to introduce your device to the public. Big difference. Don't you see it? >risk of adding a vendor driver for a product that might be scrapped >due to a last minute bug that will cause it to never be released.
Thu, Apr 04, 2024 at 11:59:33PM CEST, john.fastabend@gmail.com wrote: >Jakub Kicinski wrote: >> On Thu, 4 Apr 2024 12:22:02 -0700 Alexander Duyck wrote: >> > The argument itself doesn't really hold water. The fact is the Meta >> > data centers are not an insignificant consumer of Linux, >> >> customer or beneficiary ? >> >> > so it isn't as if the driver isn't going to be used. This implies >> > some lack of good faith from Meta. >> >> "Good faith" is not a sufficient foundation for a community consisting >> of volunteers, and commercial entities (with the xz debacle maybe even >> less today than it was a month ago). As a maintainer I really don't want >> to be in position of judging the "good faith" of corporate actors. >> >> > I don't understand that as we are >> > contributing across multiple areas in the kernel including networking >> > and ebpf. Is Meta expected to start pulling time from our upstream >> > maintainers to have them update out-of-tree kernel modules since the >> > community isn't willing to let us maintain it in the kernel? Is the >> > message that the kernel is expected to get value from Meta, but that >> > value is not meant to be reciprocated? Would you really rather have >> > us start maintaining our own internal kernel with our own >> > "proprietary goodness", and ask other NIC vendors to have to maintain >> > their drivers against yet another kernel if they want to be used in >> > our data centers? >> >> Please allow the community to make rational choices in the interest of >> the project and more importantly the interest of its broader user base. >> >> Google would also claim "good faith" -- undoubtedly it supports >> the kernel, and lets some of its best engineers contribute. >> Did that make them stop trying to build Fuchsia? The "good faith" of >> companies operates within the limits of the margin of error of what they >> consider rational and beneficial. >> >> I don't want to put my thumb on the scale (yet?), but (with my >> maintainer hat on) please don't use the "Meta is good" argument, because >> someone will send a similar driver from a less involved company later on >> and we'll be accused of playing favorites :( Plus companies can change >> their approach to open source from "inclusive" to "extractive" >> (to borrow the economic terminology) rather quickly. >> > >I'll throw my $.02 in. In this case you have a driver that I only scanned >so far, but it looks well done. Alex has written lots of drivers; I trust he >will not just abandon it. And if it does end up abandoned and no one >supports it at some future point we can deprecate it the same as any other >driver in the networking tree. All the feedback is being answered and >debate is happening so I expect we will get a v2, v3 or so. All good signs >in my view. > >Back to your point about faith in a company. I don't think we even need >to care about any company's business plans. The author could have >submitted with their personal address for what it's worth and called it >drivers/alexware/duyck.o. A bit extreme and I would have called him on it, >but hopefully the point is clear. > >We have lots of drivers in the tree that are hard to physically get ahold >of. Or otherwise gated by paying some vendor for compute time, etc. to >use. We even have some drivers where the hardware itself never made >it out into the wild or only a single customer used it before sellers >burned it for commercial reasons or hw wasn't workable, team was cut, etc.
> >I can't see how having a physical NIC for it on my desk here makes >much difference one way or the other. > >The alternative is much worse: someone builds a team of engineers, locks >them up, they build some interesting pieces, and we never get to see it >because we tried to block someone from open-sourcing their driver? >Eventually they need some kernel changes and then we block those too >because we didn't allow the driver that was the use case? This seems >wrong to me. > >Anyways we have zero ways to enforce such a policy. Have vendors >ship a NIC to somebody with the v0 of the patch set? Attach a picture? Come on. Are you kidding? Isn't this case crystal clear? >Even if vendor X claims they will have a product in N months and >then only sells it to qualified customers, what do we do then? >Driver author could even believe the hardware will be available >when they post the driver, but business may change out of the hands >of the developer. > >I'm 100% on letting this through assuming Alex is on top of feedback >and the code is good. I think any other policy would be very ugly >to enforce, prove, and even understand. Obviously code and architecture >debates I'm all for. Ensuring we have a trusted, experienced person >signed up to review code, address feedback, fix whatever syzbot finds >and so on is also a must I think. I'm sure Alex will take care of >it. You are for some reason making this submission very personal on Alex. Just to be clear, this has nothing to do with Alex. > >Thanks, >John
Fri, Apr 05, 2024 at 09:11:19AM CEST, pabeni@redhat.com wrote: >On Thu, 2024-04-04 at 17:11 -0700, Alexander Duyck wrote: >> Again, I would say we look at the blast radius. That is how we should >> be measuring any change. At this point the driver is self contained >> into /drivers/net/ethernet/meta/fbnic/. It isn't exporting anything >> outside that directory, and it can be switched off via Kconfig. > >I personally think this is the most relevant point. This is just a new >NIC driver, completely self-encapsulated. I quickly glanced over the What do you mean by "self contained/encapsulated"? You are not using any API outside the driver? Every driver API change that this NIC is going to use is a burden. I did my share of changes like that in the past so I have a pretty good notion of how painful it often is. >code and it looks like it's not doing anything obviously bad. It really >looks like a usual, legit NIC driver. > >I don't think the fact that the NIC itself is hard to grasp for anyone Distinguish "hard"/"impossible". >outside <organization> makes a difference. A long time ago Greg noted >that drivers have been merged for H/W known to have a _single_ existing >instance (IIRC, I can't find the reference off the top of my head, but back >then it was quite popular; I hope some other old guy could remember). > >To me, the maintainership burden is on Meta: Alex/Meta will have to >handle bug reports, breakages, user complaints (I guess this last would >be the easier part ;). If he/they do not cope with the process we can >simply revert the driver. I would be quite surprised if such a situation >should happen, but the impact from my PoV looks minimal. > >TL;DR: I don't see any good reason to not accept this - unless my quick >glance was too quick and very wrong, but it looks like others have a >similar view. Do you actually see any good reason to accept this? I mean, really, could you spell out at least one benefit it brings for a non-Meta user? I see only gains for Meta and losses for the community.
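For context on the "switched off via Kconfig" point that recurs through the thread: a self-contained driver like this sits behind its own Kconfig symbols, so anyone building a kernel can leave it out entirely. A rough illustration of what such gating typically looks like; the symbols and dependencies below follow the pattern of other vendor directories and are assumptions, not the text of the patch:

config NET_VENDOR_META
	bool "Meta Platforms devices"
	default y
	help
	  If you have a network (Ethernet) device belonging to this class,
	  say Y.

config FBNIC
	tristate "Meta Platforms Host Network Interface"
	depends on NET_VENDOR_META && PCI_MSI
	help
	  This driver supports the Meta Platforms Host Network Interface.
	  To compile this driver as a module, choose M here.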
Fri, Apr 05, 2024 at 08:38:25PM CEST, alexander.duyck@gmail.com wrote: >On Fri, Apr 5, 2024 at 8:17 AM Jason Gunthorpe <jgg@nvidia.com> wrote: >> >> On Fri, Apr 05, 2024 at 07:24:32AM -0700, Alexander Duyck wrote: >> > > Alex already indicated new features are coming, changes to the core >> > > code will be proposed. How should those be evaluated? Hypothetically >> > > should fbnic be allowed to be the first implementation of something >> > > invasive like Mina's DMABUF work? Google published an open userspace >> > > for NCCL that people can (in theory at least) actually run. Meta would >> > > not be able to do that. I would say that clearly crosses the line and >> > > should not be accepted. >> > >> > Why not? Just because we are not commercially selling it doesn't mean >> > we couldn't look at other solutions such as QEMU. If we were to >> > provide a github repo with an emulation of the NIC would that be >> > enough to satisfy the "commercial" requirement? >> >> My test is not "commercial", it is enabling open source ecosystem vs >> benefiting only proprietary software. > >Sorry, that was where this started, with Jiri stating that we had >to be selling this. For the record, I never wrote that. Not sure why you keep repeating this throughout the thread. And for the record, I don't share Jason's concern about proprietary userspace. From what I see, whoever is consuming the KAPI is free to do that however he pleases. But this is completely separate from my concerns about this driver. [...] >> > I agree. We need a consistent set of standards. I just strongly >> > believe commercial availability shouldn't be one of them. >> >> I never said commercial availability. I talked about open source vs >> proprietary userspace. This is very standard kernel stuff. >> >> You have an unavailable NIC, so we know it is only ever operated with >> Meta's proprietary kernel fork, supporting Meta's proprietary >> userspace software. Where exactly is the open source? > >It depends on your definition of "unavailable". I could argue that >many of the Mellanox NICs also have limited availability as >they aren't exactly easy to get a hold of without paying a hefty >ransom. Sorry, but I have to say this is a ridiculous argument, really, Alex. Apples and oranges. [...]
On Sat, 6 Apr 2024 09:05:01 -0700 Alexander Duyck wrote: > > I'm being very clear to say that there are some core changes should > > not be accepted due to the kernel's open source ideology. > > Okay, on core changes I 100% agree. That is one of the reasons why we > have the whole thing about any feature really needing to be enabled on > at least 2 different vendor devices.

The "2 different vendor"/implementation thing came up before, so I wanted to provide more context for the less initiated readers.

We try to judge features in terms of how reasonable the APIs are, the overall system design, and how easy they will be to modify later (e.g. uAPI, depth of core changes). Risks are usually more pronounced for stack features like GSO partial, XDP, or AF_XDP. Although my (faulty) memory is that we started with just mlx4 for XDP, and other drivers quickly followed. But we did not wait for more than an ACK from other vendors.

We almost never have a second implementation for HW-heavy features. TLS immediately comes to mind, and merging it was probably the right call given how many implementations have been added since. For "full" IPsec offload we still only have one vendor. Existing TCP ZC Rx (page flipping) was presented as possible with two NICs, but mlx5 was hacked up and still doesn't support real HDS.

Most (if not all) of the recent uAPI we added in netdev netlink was accepted with a single implementation (be it Intel's work + drivers, or my work, where I often provide just a bnxt implementation).

Summary -> the number of implementations we require is decided on a case-by-case basis, depending on our level of confidence. I don't know if this "2 implementations" rule is just a "mental ellipsis" everyone understands to be a more nuanced rule in practice. But to an outsider it would seem very selectively enforced. In the worst case, it's a fake rule that gives us an excuse to nack stuff.
On Sun, Apr 7, 2024 at 11:18 PM Leon Romanovsky <leon@kernel.org> wrote: > > On Fri, Apr 05, 2024 at 08:41:11AM -0700, Alexander Duyck wrote: > > On Thu, Apr 4, 2024 at 7:38 PM Jakub Kicinski <kuba@kernel.org> wrote: > > <...> > > > > > > Technical solution? Maybe if it's not a public device regression rules > > > > > don't apply? Seems fairly reasonable. > > > > > > > > This is a hypothetical. This driver currently isn't changing anything > > > > outside of itself. At this point the driver would only be build tested > > > > by everyone else. They could just not include it in their Kconfig and > > > > then out-of-sight, out-of-mind. > > > > > > Not changing does not mean not depending on existing behavior. > > > Investigating and fixing properly even the hardest regressions in > > > the stack is a bar that Meta can so easily clear. I don't understand > > > why you are arguing. > > > > I wasn't saying the driver wouldn't be dependent on existing behavior. > > I was saying that it was a hypothetical that Meta would be a "less > > than cooperative user" and demand a revert. It is also a hypothetical > > that Linus wouldn't just propose a revert of the fbnic driver instead > > of the API for the crime of being a "less than cooperative maintainer" > > and then give Meta the Nvidia treatment. > > It is very easy to be "less than cooperative maintainer" in netdev world. > 1. Be vendor. > 2. Propose ideas which are different. > 3. Report for user-visible regression. > 4. Ask for a fix from the patch author or demand a revert according to netdev rules/practice. > > And voilà, you are "less than cooperative maintainer". > > So in reality, the "hypothetical" is very close to the reality, unless > Meta contribution will be treated as a special case. > > Thanks

How many cases of that have we had in the past? I'm honestly curious, as I don't actually have any reference.

Also, as far as item 3 goes, isn't it hard for something to be a "user-visible" regression if there are no users outside of the vendor maintaining the driver to report it? Again, I am assuming that the same rules wouldn't necessarily apply in the case where vendor and consumer are one entity.

Also, from my past experience the community doesn't give a damn about 1. It is only if 3 is being reported by actual users that somebody would care. The fact is, if vendors held that much power they would have run roughshod over the community long ago. I know there are vendors who love to provide one-off projects outside of the kernel and usually have to work to get things upstream later, and no amount of complaining about "the users" will get their code accepted. The users may complain, but it is the vendor's fault, so the community doesn't have to take action.
On Mon, Apr 8, 2024 at 4:51 AM Jiri Pirko <jiri@resnulli.us> wrote: > > Fri, Apr 05, 2024 at 08:38:25PM CEST, alexander.duyck@gmail.com wrote: > >On Fri, Apr 5, 2024 at 8:17 AM Jason Gunthorpe <jgg@nvidia.com> wrote: > >> > >> On Fri, Apr 05, 2024 at 07:24:32AM -0700, Alexander Duyck wrote: > >> > > Alex already indicated new features are coming, changes to the core > >> > > code will be proposed. How should those be evaluated? Hypothetically > >> > > should fbnic be allowed to be the first implementation of something > >> > > invasive like Mina's DMABUF work? Google published an open userspace > >> > > for NCCL that people can (in theory at least) actually run. Meta would > >> > > not be able to do that. I would say that clearly crosses the line and > >> > > should not be accepted. > >> > > >> > Why not? Just because we are not commercially selling it doesn't mean > >> > we couldn't look at other solutions such as QEMU. If we were to > >> > provide a github repo with an emulation of the NIC would that be > >> > enough to satisfy the "commercial" requirement? > >> > >> My test is not "commercial", it is enabling open source ecosystem vs > >> benefiting only proprietary software. > > > >Sorry, that was where this started where Jiri was stating that we had > >to be selling this. > > For the record, I never wrote that. Not sure why you repeat this over > this thread.

Because you seem to be implying that the Meta NIC driver shouldn't be included simply because it isn't going to be available outside of Meta. The fact is Meta employs a number of kernel developers, and as a result there will be a number of kernel developers who have access to this NIC and will likely do development on systems containing it. In addition, simply due to the size of the datacenters that we will be populating, there is actually a strong likelihood that there will be more instances of this NIC running on Linux than there are of some other vendor devices that have been allowed to have drivers in the kernel.

So from what I can tell the only difference is whether we are manufacturing this for sale or for personal use. That is why I mention "commercial", since the only difference from my perspective is the fact that we are making it for our own use instead of selling it.

[...]

> >> > I agree. We need a consistent set of standards. I just strongly > >> > believe commercial availability shouldn't be one of them. > >> > >> I never said commercial availability. I talked about open source vs > >> proprietary userspace. This is very standard kernel stuff. > >> > >> You have an unavailable NIC, so we know it is only ever operated with > >> Meta's proprietary kernel fork, supporting Meta's proprietary > >> userspace software. Where exactly is the open source? > > > >It depends on your definition of "unavailable". I could argue that for > >many most of the Mellanox NICs are also have limited availability as > >they aren't exactly easy to get a hold of without paying a hefty > >ransom. > > Sorry, but I have to say this is ridiculous argument, really Alex. > Apples and oranges.

Really? So would you be making the same argument if it were Nvidia/Mellanox pushing the driver and they were exclusively making it just for Meta, Google, or some other big cloud provider? I suspect not. If nothing else, they likely wouldn't disclose the plan for exclusive sales to get around this sort of thing. The fact is I know many of the vendors make proprietary spins of their firmware and hardware for specific customers.
The way I see it, this patchset is being rejected because I was too honest about the general plan and use case for it. This is what I am getting at. It just seems like we are playing games with semantics where, if it is a vendor making the arrangement, then it is okay for them to make hardware that is inaccessible to most, but if it is Meta then somehow it isn't.
Mon, Apr 08, 2024 at 05:46:35PM CEST, alexander.duyck@gmail.com wrote: >On Mon, Apr 8, 2024 at 4:51 AM Jiri Pirko <jiri@resnulli.us> wrote: >> >> Fri, Apr 05, 2024 at 08:38:25PM CEST, alexander.duyck@gmail.com wrote: >> >On Fri, Apr 5, 2024 at 8:17 AM Jason Gunthorpe <jgg@nvidia.com> wrote: >> >> >> >> On Fri, Apr 05, 2024 at 07:24:32AM -0700, Alexander Duyck wrote: >> >> > > Alex already indicated new features are coming, changes to the core >> >> > > code will be proposed. How should those be evaluated? Hypothetically >> >> > > should fbnic be allowed to be the first implementation of something >> >> > > invasive like Mina's DMABUF work? Google published an open userspace >> >> > > for NCCL that people can (in theory at least) actually run. Meta would >> >> > > not be able to do that. I would say that clearly crosses the line and >> >> > > should not be accepted. >> >> > >> >> > Why not? Just because we are not commercially selling it doesn't mean >> >> > we couldn't look at other solutions such as QEMU. If we were to >> >> > provide a github repo with an emulation of the NIC would that be >> >> > enough to satisfy the "commercial" requirement? >> >> >> >> My test is not "commercial", it is enabling open source ecosystem vs >> >> benefiting only proprietary software. >> > >> >Sorry, that was where this started where Jiri was stating that we had >> >to be selling this. >> >> For the record, I never wrote that. Not sure why you repeat this over >> this thread. > >Because you seem to be implying that the Meta NIC driver shouldn't be >included simply since it isn't going to be available outside of Meta. >The fact is Meta employs a number of kernel developers and as a result >of that there will be a number of kernel developers that will have >access to this NIC and likely do development on systems containing it. >In addition simply due to the size of the datacenters that we will be >populating there is actually a strong likelihood that there will be >more instances of this NIC running on Linux than there are of some >other vendor devices that have been allowed to have drivers in the >kernel.

So? The gain for the community is still 0, no matter how many instances of this private hw you privately have. Just have a private driver.

> >So from what I can tell the only difference is if we are manufacturing >this for sale, or for personal use. Thus why I mention "commercial" >since the only difference from my perspective is the fact that we are >making it for our own use instead of selling it.

Give it away for free.

> > >[...] > >> >> > I agree. We need a consistent set of standards. I just strongly >> >> > believe commercial availability shouldn't be one of them. >> >> >> >> I never said commercial availability. I talked about open source vs >> >> proprietary userspace. This is very standard kernel stuff. >> >> >> >> You have an unavailable NIC, so we know it is only ever operated with >> >> Meta's proprietary kernel fork, supporting Meta's proprietary >> >> userspace software. Where exactly is the open source? >> > >> >It depends on your definition of "unavailable". I could argue that for >> >many most of the Mellanox NICs are also have limited availability as >> >they aren't exactly easy to get a hold of without paying a hefty >> >ransom. >> >> Sorry, but I have to say this is ridiculous argument, really Alex. >> Apples and oranges. > >Really?
So would you be making the same argument if it was >Nvidia/Mellanox pushing the driver and they were exclusively making it >just for Meta, Google, or some other big cloud provider? I suspect

Heh, what ifs :) Anyway, the chance that happens is very close to 0.

>not. If nothing else they likely wouldn't disclose the plan for >exclusive sales to get around this sort of thing. The fact is I know >many of the vendors make proprietary spins of their firmware and >hardware for specific customers. The way I see it this patchset is >being rejected as I was too honest about the general plan and use case >for it. > >This is what I am getting at. It just seems like we are playing games >with semantics where if it is a vendor making the arrangement then it >is okay for them to make hardware that is inaccessible to most, but if >it is Meta then somehow it isn't.
Jiri Pirko wrote: > Mon, Apr 08, 2024 at 05:46:35PM CEST, alexander.duyck@gmail.com wrote: > >On Mon, Apr 8, 2024 at 4:51 AM Jiri Pirko <jiri@resnulli.us> wrote: > >> > >> Fri, Apr 05, 2024 at 08:38:25PM CEST, alexander.duyck@gmail.com wrote: > >> >On Fri, Apr 5, 2024 at 8:17 AM Jason Gunthorpe <jgg@nvidia.com> wrote: > >> >> > >> >> On Fri, Apr 05, 2024 at 07:24:32AM -0700, Alexander Duyck wrote: > >> >> > > Alex already indicated new features are coming, changes to the core > >> >> > > code will be proposed. How should those be evaluated? Hypothetically > >> >> > > should fbnic be allowed to be the first implementation of something > >> >> > > invasive like Mina's DMABUF work? Google published an open userspace > >> >> > > for NCCL that people can (in theory at least) actually run. Meta would > >> >> > > not be able to do that. I would say that clearly crosses the line and > >> >> > > should not be accepted. > >> >> > > >> >> > Why not? Just because we are not commercially selling it doesn't mean > >> >> > we couldn't look at other solutions such as QEMU. If we were to > >> >> > provide a github repo with an emulation of the NIC would that be > >> >> > enough to satisfy the "commercial" requirement? > >> >> > >> >> My test is not "commercial", it is enabling open source ecosystem vs > >> >> benefiting only proprietary software. > >> > > >> >Sorry, that was where this started where Jiri was stating that we had > >> >to be selling this. > >> > >> For the record, I never wrote that. Not sure why you repeat this over > >> this thread. > > > >Because you seem to be implying that the Meta NIC driver shouldn't be > >included simply since it isn't going to be available outside of Meta. > >The fact is Meta employs a number of kernel developers and as a result > >of that there will be a number of kernel developers that will have > >access to this NIC and likely do development on systems containing it. > >In addition simply due to the size of the datacenters that we will be > >populating there is actually a strong likelihood that there will be > >more instances of this NIC running on Linux than there are of some > >other vendor devices that have been allowed to have drivers in the > >kernel. > > So? The gain for community is still 0. No matter how many instances is > private hw you privately have. Just have a private driver.

The gain is the same as if company X makes a card and sells it exclusively to datacenter provider Y. We know this happens. Vendors would happily spin up a NIC if a DC at this scale would pay for it. They just don't advertise it in patch 0/X as "adding device for cloud provider foo". There is no difference here. We gain developers, we gain insights and learnings, and Linux and OSS drivers are running on another big DC. They improve things, they find bugs, they upstream the fixes; it's a win.

The opposite is also true: if we exclude a driver/NIC HW that is running on major DCs, we lose a lot of insight, experience, and value. DCs are all starting to build their own hardware; if we lose this section of HW, we lose those developers too. We are less likely to get any advances they come up with. I think you have it backwards. Eventually Linux networking becomes commodity and irrelevant for DC deployments. So I strongly disagree: we lose by excluding drivers and win by bringing them in.

> > > > >So from what I can tell the only difference is if we are manufacturing > >this for sale, or for personal use.
Thus why I mention "commercial" > >since the only difference from my perspective is the fact that we are > >making it for our own use instead of selling it. > > Give it for free. Huh? > > > > > >[...] > > > >> >> > I agree. We need a consistent set of standards. I just strongly > >> >> > believe commercial availability shouldn't be one of them. > >> >> > >> >> I never said commercial availability. I talked about open source vs > >> >> proprietary userspace. This is very standard kernel stuff. > >> >> > >> >> You have an unavailable NIC, so we know it is only ever operated with > >> >> Meta's proprietary kernel fork, supporting Meta's proprietary > >> >> userspace software. Where exactly is the open source? > >> > > >> >It depends on your definition of "unavailable". I could argue that for > >> >many most of the Mellanox NICs are also have limited availability as > >> >they aren't exactly easy to get a hold of without paying a hefty > >> >ransom. > >> > >> Sorry, but I have to say this is ridiculous argument, really Alex. > >> Apples and oranges. > > > >Really? So would you be making the same argument if it was > >Nvidia/Mellanox pushing the driver and they were exclusively making it > >just for Meta, Google, or some other big cloud provider? I suspect > > Heh, what ifs :) Anyway, chance that happens is very close to 0. > > > >not. If nothing else they likely wouldn't disclose the plan for > >exclusive sales to get around this sort of thing. The fact is I know > >many of the vendors make proprietary spins of their firmware and > >hardware for specific customers. The way I see it this patchset is > >being rejected as I was too honest about the general plan and use case > >for it. > > > >This is what I am getting at. It just seems like we are playing games > >with semantics where if it is a vendor making the arrangement then it > >is okay for them to make hardware that is inaccessible to most, but if > >it is Meta then somehow it isn't.
On Mon, Apr 08, 2024 at 08:46:35AM -0700, Alexander Duyck wrote: > Really? So would you be making the same argument if it was > Nvidia/Mellanox pushing the driver and they were exclusively making it > just for Meta, Google, or some other big cloud provider?

At least I would, yes.

> I suspect not. If nothing else they likely wouldn't disclose the > plan for exclusive sales to get around this sort of thing. The fact > is I know many of the vendors make proprietary spins of their > firmware and hardware for specific customers. The way I see it this > patchset is being rejected as I was too honest about the general > plan and use case for it.

Regrettably this does happen quietly in the kernel. If you know the right behind-the-scenes details you can start to be aware of it. That doesn't mean it is aligned with community values or should be done/encouraged.

> This is what I am getting at. It just seems like we are playing games > with semantics where if it is a vendor making the arrangement then it > is okay for them to make hardware that is inaccessible to most, but if > it is Meta then somehow it isn't.

With Meta it is obvious what is happening and who is benefiting. If a COTS vendor does it, then we have to take a leap of faith that a unique feature will have wider applications - and many of us would require an open source userspace to bootstrap that. I don't think we always get it right. Value judgements are often a bit murky like that.

Jason
On Mon, Apr 08, 2024 at 08:26:33AM -0700, Alexander Duyck wrote: > On Sun, Apr 7, 2024 at 11:18 PM Leon Romanovsky <leon@kernel.org> wrote: > > > > On Fri, Apr 05, 2024 at 08:41:11AM -0700, Alexander Duyck wrote: > > > On Thu, Apr 4, 2024 at 7:38 PM Jakub Kicinski <kuba@kernel.org> wrote: > > > > <...> > > > > > > > > Technical solution? Maybe if it's not a public device regression rules > > > > > > don't apply? Seems fairly reasonable. > > > > > > > > > > This is a hypothetical. This driver currently isn't changing anything > > > > > outside of itself. At this point the driver would only be build tested > > > > > by everyone else. They could just not include it in their Kconfig and > > > > > then out-of-sight, out-of-mind. > > > > > > > > Not changing does not mean not depending on existing behavior. > > > > Investigating and fixing properly even the hardest regressions in > > > > the stack is a bar that Meta can so easily clear. I don't understand > > > > why you are arguing. > > > > > > I wasn't saying the driver wouldn't be dependent on existing behavior. > > > I was saying that it was a hypothetical that Meta would be a "less > > > than cooperative user" and demand a revert. It is also a hypothetical > > > that Linus wouldn't just propose a revert of the fbnic driver instead > > > of the API for the crime of being a "less than cooperative maintainer" > > > and then give Meta the Nvidia treatment. > > > > It is very easy to be "less than cooperative maintainer" in netdev world. > > 1. Be vendor. > > 2. Propose ideas which are different. > > 3. Report for user-visible regression. > > 4. Ask for a fix from the patch author or demand a revert according to netdev rules/practice. > > > > And voilà, you are "less than cooperative maintainer". > > > > So in reality, the "hypothetical" is very close to the reality, unless > > Meta contribution will be treated as a special case. > > > > Thanks > > How many cases of that have we had in the past? I'm honestly curious > as I don't actually have any reference.

And this is the problem: you don't have "any reference" or accurate knowledge of what happened, but you are saying "less than cooperative maintainer".

> > Also as far as item 3 isn't hard for it to be a "user-visible" > regression if there are no users outside of the vendor that is > maintaining the driver to report it?

This wasn't the case. It was a change in core code, which broke a specific version of Vagrant. The vendor caught it simply by luck.

> Again I am assuming that the same rules wouldn't necessarily apply > in the vendor/consumer being one entity case. > > Also from my past experience the community doesn't give a damn about > 1. It is only if 3 is being reported by actual users that somebody > would care. The fact is if vendors held that much power they would > have run roughshod over the community long ago as I know there are > vendors who love to provide one-off projects outside of the kernel and > usually have to work to get things into the upstream later and no > amount of complaining about "the users" will get their code accepted. > The users may complain but it is the vendors fault for that so the > community doesn't have to take action.

You are taking this in a completely wrong direction with your assumptions. The reality is that the regression was reported by a real user without any vendor code involved. This is why the end result was so bad for all parties.

So no, you can get the "less than cooperative maintainer" label really easily in the current environment.

Thanks
On Sat, Apr 6, 2024 at 9:05 AM Alexander Duyck <alexander.duyck@gmail.com> wrote: > > > > You have an unavailable NIC, so we know it is only ever operated with > > > > Meta's proprietary kernel fork, supporting Meta's proprietary > > > > userspace software. Where exactly is the open source? > > > > > > It depends on your definition of "unavailable". I could argue that for > > > many most of the Mellanox NICs are also have limited availability as > > > they aren't exactly easy to get a hold of without paying a hefty > > > ransom. > > > > And GNIC's that run Mina's series are completely unavailable right > > now. That is still a big different from a temporary issue to a > > permanent structural intention of the manufacturer. > > I'm assuming it is some sort of firmware functionality that is needed > to enable it? One thing with our design is that the firmware actually > has minimal functionality. Basically it is the liaison between the > BMC, Host, and the MAC. Otherwise it has no role to play in the > control path so when the driver is loaded it is running the show. >

Sorry, I didn't realize our devmem TCP work was mentioned in this context. Just jumping in to say: no, this is not the case; devmem TCP does not require firmware functionality AFAICT.

The selftest provided with the devmem TCP series should work on any driver that:

1. supports header split/flow steering/rss/page pool (I guess this support may need firmware changes...).

2. supports the new queue configuration ndos: https://patchwork.kernel.org/project/netdevbpf/patch/20240403002053.2376017-2-almasrymina@google.com/

3. supports the new netmem page_pool APIs: https://patchwork.kernel.org/project/netdevbpf/patch/20240403002053.2376017-8-almasrymina@google.com/

No firmware changes specific to devmem TCP are needed, AFAICT. All these are driver changes.

I also always publish a full branch with all the GVE changes so reviewers can check if there is anything too specific to GVE that we're doing; so far there have been no issues, and to be honest I can't see anything specific that we do with GVE for devmem TCP: https://github.com/mina/linux/commits/tcpdevmem-v7/

In fact, GVE is IMO a relatively feature-light driver, and the fact that GVE can do devmem TCP IMO makes it easier for fancier NICs to also do devmem TCP. I'm working with folks interested in extending devmem TCP to their drivers, and they may follow up with patches after the series is merged (or before). The only reason I haven't implemented devmem TCP for multiple different drivers is a logistical one: I don't have access to hardware that supports all these prerequisite features other than GVE.
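(For readers who want a concrete picture of item 2 above: based on the patch linked there, the proposed queue management ops have roughly the shape sketched below. This is only a sketch of an in-flight series; the exact names, signatures, and error handling may well change before merge.)

	/* Per-queue ops a driver would register; the core drives
	 * alloc/stop/start of a single queue through these instead of
	 * a full ifdown/ifup cycle.
	 */
	struct netdev_queue_mgmt_ops {
		size_t	ndo_queue_mem_size;	/* driver's per-queue state blob */
		int	(*ndo_queue_mem_alloc)(struct net_device *dev,
					       void *per_queue_mem, int idx);
		void	(*ndo_queue_mem_free)(struct net_device *dev,
					      void *per_queue_mem);
		int	(*ndo_queue_start)(struct net_device *dev,
					   void *per_queue_mem, int idx);
		int	(*ndo_queue_stop)(struct net_device *dev,
					  void *per_queue_mem, int idx);
	};

	/* Core-side restart of queue idx, roughly (unwind paths omitted): */
	static int queue_restart_sketch(struct net_device *dev,
					const struct netdev_queue_mgmt_ops *ops,
					int idx)
	{
		void *new_mem, *old_mem;
		int err = -ENOMEM;

		/* Allocate the replacement while the old queue is still
		 * live, so an allocation failure never takes traffic down.
		 */
		new_mem = kzalloc(ops->ndo_queue_mem_size, GFP_KERNEL);
		old_mem = kzalloc(ops->ndo_queue_mem_size, GFP_KERNEL);
		if (!new_mem || !old_mem)
			goto out;

		err = ops->ndo_queue_mem_alloc(dev, new_mem, idx);
		if (err)
			goto out;

		/* Swap: stop the old queue (its resources land in old_mem),
		 * start with the new resources, then free the old ones.
		 */
		ops->ndo_queue_stop(dev, old_mem, idx);
		ops->ndo_queue_start(dev, new_mem, idx);
		ops->ndo_queue_mem_free(dev, old_mem);
	out:
		kfree(new_mem);
		kfree(old_mem);
		return err;
	}

(The relevance to devmem TCP, as I read the series, is that the memory behind one RX queue - e.g. a dmabuf-backed page pool - can be swapped across such a restart without bouncing the whole device.)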
On Mon, Apr 8, 2024 at 11:41 AM Leon Romanovsky <leon@kernel.org> wrote: > > On Mon, Apr 08, 2024 at 08:26:33AM -0700, Alexander Duyck wrote: > > On Sun, Apr 7, 2024 at 11:18 PM Leon Romanovsky <leon@kernel.org> wrote: > > > > > > On Fri, Apr 05, 2024 at 08:41:11AM -0700, Alexander Duyck wrote: > > > > On Thu, Apr 4, 2024 at 7:38 PM Jakub Kicinski <kuba@kernel.org> wrote: > > > > > > <...> > > > > > > > > > > Technical solution? Maybe if it's not a public device regression rules > > > > > > > don't apply? Seems fairly reasonable. > > > > > > > > > > > > This is a hypothetical. This driver currently isn't changing anything > > > > > > outside of itself. At this point the driver would only be build tested > > > > > > by everyone else. They could just not include it in their Kconfig and > > > > > > then out-of-sight, out-of-mind. > > > > > > > > > > Not changing does not mean not depending on existing behavior. > > > > > Investigating and fixing properly even the hardest regressions in > > > > > the stack is a bar that Meta can so easily clear. I don't understand > > > > > why you are arguing. > > > > > > > > I wasn't saying the driver wouldn't be dependent on existing behavior. > > > > I was saying that it was a hypothetical that Meta would be a "less > > > > than cooperative user" and demand a revert. It is also a hypothetical > > > > that Linus wouldn't just propose a revert of the fbnic driver instead > > > > of the API for the crime of being a "less than cooperative maintainer" > > > > and then give Meta the Nvidia treatment. > > > > > > It is very easy to be "less than cooperative maintainer" in netdev world. > > > 1. Be vendor. > > > 2. Propose ideas which are different. > > > 3. Report for user-visible regression. > > > 4. Ask for a fix from the patch author or demand a revert according to netdev rules/practice. > > > > > > And voilà, you are "less than cooperative maintainer". > > > > > > So in reality, the "hypothetical" is very close to the reality, unless > > > Meta contribution will be treated as a special case. > > > > > > Thanks > > > > How many cases of that have we had in the past? I'm honestly curious > > as I don't actually have any reference. > > And this is the problem, you don't have "any reference" and accurate > knowledge what happened, but you are saying "less than cooperative > maintainer".

By "less than cooperative maintainer" I was referring to the scenario where somebody is maintaining something unique to them, such as the Meta Host NIC, and is not willing to work with the community to fix it, instead just demanding a revert of a change. It doesn't seem like it would be too much to ask that the maintainer work with the patch author on putting together and testing a fix for the problem.

With that said, if the upstream version of things isn't broken then it doesn't matter. It shouldn't be expected of the community to maintain any proprietary code that wasn't accepted upstream.

> > > > Also as far as item 3 isn't hard for it to be a "user-visible" > > regression if there are no users outside of the vendor that is > > maintaining the driver to report it? > > This wasn't the case. It was change in core code, which broke specific > version of vagrant. Vendor caught it simply by luck.

Any more info on this? Without context it is hard to say one way or the other.
I know I have seen my fair share of hot issues, such as when the introduction of the tracing framework was corrupting the NVRAM on e1000e NICs.[1] It got everyone's attention when it essentially bricked one of Linus's systems. I don't recall us doing a full revert of function tracing as a result, but I believe it was flagged as broken until it could be resolved. So depending on the situation there are cases where asking for a fix or a revert might be appropriate.

> > Again I am assuming that the same rules wouldn't necessarily apply > > in the vendor/consumer being one entity case. > > > > Also from my past experience the community doesn't give a damn about > > 1. It is only if 3 is being reported by actual users that somebody > > would care. The fact is if vendors held that much power they would > > have run roughshod over the community long ago as I know there are > > vendors who love to provide one-off projects outside of the kernel and > > usually have to work to get things into the upstream later and no > > amount of complaining about "the users" will get their code accepted. > > The users may complain but it is the vendors fault for that so the > > community doesn't have to take action. > > You are taking it to completely wrong direction with your assumptions. > The reality is that regression was reported by real user without any > vendor code involved. This is why the end result was so bad for all parties.

Okay, but that doesn't tie into what is going on here. In this case "vendor" == "user". Like I was saying, the community generally cares about the user, so 3 would be the important case, assuming they are using a stock kernel and driver and not hiding behind the vendor expecting some sort of proprietary fix. If they are using some proprietary stuff behind the scenes, then tough luck.

> So no, you can get "less than cooperative maintainer" label really easy in > current environment.

I didn't say you couldn't. Without context I cannot say if it was deserved or not. I know in the example I cited above, Intel had to add changes to the e1000e driver to make the NVRAM non-writable until the problem patch was found. So Intel had to ship patches to fix an issue it didn't introduce and deal with the negative press and blowback from a function tracing patch that was damaging NICs.

The point I was trying to make is that if you are the only owner of something and are not willing to work with the community as a maintainer, it becomes much easier for the community to just revert the driver rather than try to change the code. Thus the "less than cooperative" part. The argument being made seems to be that once something is in the kernel it is there forever, and that if we got it in and then refused to work with the community it couldn't be reverted. I am arguing that isn't the case, especially if Meta were to become a "less than cooperative maintainer" for a device that is primarily going to be available only in Meta data centers.

Thanks, - Alex

[1]: https://lwn.net/Articles/304105/
On 4/8/24 09:51, Jiri Pirko wrote: > Mon, Apr 08, 2024 at 05:46:35PM CEST, alexander.duyck@gmail.com wrote: >> On Mon, Apr 8, 2024 at 4:51 AM Jiri Pirko <jiri@resnulli.us> wrote: >>> >>> Fri, Apr 05, 2024 at 08:38:25PM CEST, alexander.duyck@gmail.com wrote: >>>> On Fri, Apr 5, 2024 at 8:17 AM Jason Gunthorpe <jgg@nvidia.com> wrote: >>>>> >>>>> On Fri, Apr 05, 2024 at 07:24:32AM -0700, Alexander Duyck wrote: >>>>>>> Alex already indicated new features are coming, changes to the core >>>>>>> code will be proposed. How should those be evaluated? Hypothetically >>>>>>> should fbnic be allowed to be the first implementation of something >>>>>>> invasive like Mina's DMABUF work? Google published an open userspace >>>>>>> for NCCL that people can (in theory at least) actually run. Meta would >>>>>>> not be able to do that. I would say that clearly crosses the line and >>>>>>> should not be accepted. >>>>>> >>>>>> Why not? Just because we are not commercially selling it doesn't mean >>>>>> we couldn't look at other solutions such as QEMU. If we were to >>>>>> provide a github repo with an emulation of the NIC would that be >>>>>> enough to satisfy the "commercial" requirement? >>>>> >>>>> My test is not "commercial", it is enabling open source ecosystem vs >>>>> benefiting only proprietary software. >>>> >>>> Sorry, that was where this started where Jiri was stating that we had >>>> to be selling this. >>> >>> For the record, I never wrote that. Not sure why you repeat this over >>> this thread. >> >> Because you seem to be implying that the Meta NIC driver shouldn't be >> included simply since it isn't going to be available outside of Meta. >> The fact is Meta employs a number of kernel developers and as a result >> of that there will be a number of kernel developers that will have >> access to this NIC and likely do development on systems containing it. >> In addition simply due to the size of the datacenters that we will be >> populating there is actually a strong likelihood that there will be >> more instances of this NIC running on Linux than there are of some >> other vendor devices that have been allowed to have drivers in the >> kernel. > > So? The gain for community is still 0. No matter how many instances is > private hw you privately have. Just have a private driver.

I am amazed, and not in a good way, at how far this has gone, truly.

This really is akin to saying that any non-zero driver count to maintain is a burden on the community. Which is true, by definition, but if the goal was to build something for no users, then clearly this is the wrong place to be in, or too late. The systems with no users are the best to maintain, that is for sure.

If the practical concern is when you make a tree-wide API change that fbnic happens to use, and you have yet another driver (fbnic) to convert, so what? Work with Alex ahead of time, get his driver modified, post the patch series. Even if Alex happens to move on and stops being responsible and there is no maintainer, so what? Give the driver a deprecation window for someone to step in, rip it out, end of story. Nothing new, so what specifically has changed as of April 4th 2024 to warrant such strong rejection?

Like it was said, there are tons of drivers in the Linux kernel that have a single user; this one might have a few more than a single one, and that should be good enough.

What the heck is going on?
On 4/8/24 13:43, Alexander Duyck wrote: >>> >>> Also as far as item 3 isn't hard for it to be a "user-visible" >>> regression if there are no users outside of the vendor that is >>> maintaining the driver to report it? >> >> This wasn't the case. It was change in core code, which broke specific >> version of vagrant. Vendor caught it simply by luck. > > Any more info on this? Without context it is hard to say one way or the other. Believe this is the thread in question: https://lore.kernel.org/netdev/MN2PR12MB44863139E562A59329E89DBEB982A@MN2PR12MB4486.namprd12.prod.outlook.com/
On 4/8/24 14:49, Florian Fainelli wrote: > On 4/8/24 13:43, Alexander Duyck wrote: >>>> >>>> Also as far as item 3 isn't hard for it to be a "user-visible" >>>> regression if there are no users outside of the vendor that is >>>> maintaining the driver to report it? >>> >>> This wasn't the case. It was change in core code, which broke specific >>> version of vagrant. Vendor caught it simply by luck. >> >> Any more info on this? Without context it is hard to say one way or >> the other. > > Believe this is the thread in question: > > https://lore.kernel.org/netdev/MN2PR12MB44863139E562A59329E89DBEB982A@MN2PR12MB4486.namprd12.prod.outlook.com/ And the follow up: https://lore.kernel.org/netdev/14459261ea9f9c7d7dfb28eb004ce8734fa83ade.1704185904.git.leonro@nvidia.com/
On Mon, Apr 08, 2024 at 01:43:28PM -0700, Alexander Duyck wrote: > On Mon, Apr 8, 2024 at 11:41 AM Leon Romanovsky <leon@kernel.org> wrote: > > > > On Mon, Apr 08, 2024 at 08:26:33AM -0700, Alexander Duyck wrote: > > > On Sun, Apr 7, 2024 at 11:18 PM Leon Romanovsky <leon@kernel.org> wrote: > > > > > > > > On Fri, Apr 05, 2024 at 08:41:11AM -0700, Alexander Duyck wrote: > > > > > On Thu, Apr 4, 2024 at 7:38 PM Jakub Kicinski <kuba@kernel.org> wrote: > > > > > > > > <...> > > > > > > > > > > > > Technical solution? Maybe if it's not a public device regression rules > > > > > > > > don't apply? Seems fairly reasonable. > > > > > > > > > > > > > > This is a hypothetical. This driver currently isn't changing anything > > > > > > > outside of itself. At this point the driver would only be build tested > > > > > > > by everyone else. They could just not include it in their Kconfig and > > > > > > > then out-of-sight, out-of-mind. > > > > > > > > > > > > Not changing does not mean not depending on existing behavior. > > > > > > Investigating and fixing properly even the hardest regressions in > > > > > > the stack is a bar that Meta can so easily clear. I don't understand > > > > > > why you are arguing. > > > > > > > > > > I wasn't saying the driver wouldn't be dependent on existing behavior. > > > > > I was saying that it was a hypothetical that Meta would be a "less > > > > > than cooperative user" and demand a revert. It is also a hypothetical > > > > > that Linus wouldn't just propose a revert of the fbnic driver instead > > > > > of the API for the crime of being a "less than cooperative maintainer" > > > > > and then give Meta the Nvidia treatment. > > > > > > > > It is very easy to be "less than cooperative maintainer" in netdev world. > > > > 1. Be vendor. > > > > 2. Propose ideas which are different. > > > > 3. Report for user-visible regression. > > > > 4. Ask for a fix from the patch author or demand a revert according to netdev rules/practice. > > > > > > > > And voilà, you are "less than cooperative maintainer". > > > > > > > > So in reality, the "hypothetical" is very close to the reality, unless > > > > Meta contribution will be treated as a special case. > > > > > > > > Thanks > > > > > > How many cases of that have we had in the past? I'm honestly curious > > > as I don't actually have any reference. > > > > And this is the problem, you don't have "any reference" and accurate > > knowledge what happened, but you are saying "less than cooperative > > maintainer". <...> > Any more info on this? Without context it is hard to say one way or the other. <...> > I didn't say you couldn't. Without context I cannot say if it was > deserved or not.

Florian gave links to the context, so I'll skip this part. In this thread, Jakub tried to revive the discussion about it. https://lore.kernel.org/netdev/20240326133412.47cf6d99@kernel.org/

<...> > The point I was trying to make is that if you are the only owner of > something, and not willing to work with the community as a maintainer

Like Jakub, I don't understand why you are talking about regressions in the driver, when it was you who brought up the "less than cooperative maintainer" label and the suggestion to "give Meta the Nvidia treatment".

I don't want to get into the discussion about whether this driver should be accepted or not. I'm just asking that we stop labeling people and companies based on descriptions from other people, and rely on facts instead.

Thanks
Mon, Apr 08, 2024 at 11:36:42PM CEST, f.fainelli@gmail.com wrote: >On 4/8/24 09:51, Jiri Pirko wrote: >> Mon, Apr 08, 2024 at 05:46:35PM CEST, alexander.duyck@gmail.com wrote: >> > On Mon, Apr 8, 2024 at 4:51 AM Jiri Pirko <jiri@resnulli.us> wrote: >> > > >> > > Fri, Apr 05, 2024 at 08:38:25PM CEST, alexander.duyck@gmail.com wrote: >> > > > On Fri, Apr 5, 2024 at 8:17 AM Jason Gunthorpe <jgg@nvidia.com> wrote: >> > > > > >> > > > > On Fri, Apr 05, 2024 at 07:24:32AM -0700, Alexander Duyck wrote: >> > > > > > > Alex already indicated new features are coming, changes to the core >> > > > > > > code will be proposed. How should those be evaluated? Hypothetically >> > > > > > > should fbnic be allowed to be the first implementation of something >> > > > > > > invasive like Mina's DMABUF work? Google published an open userspace >> > > > > > > for NCCL that people can (in theory at least) actually run. Meta would >> > > > > > > not be able to do that. I would say that clearly crosses the line and >> > > > > > > should not be accepted. >> > > > > > >> > > > > > Why not? Just because we are not commercially selling it doesn't mean >> > > > > > we couldn't look at other solutions such as QEMU. If we were to >> > > > > > provide a github repo with an emulation of the NIC would that be >> > > > > > enough to satisfy the "commercial" requirement? >> > > > > >> > > > > My test is not "commercial", it is enabling open source ecosystem vs >> > > > > benefiting only proprietary software. >> > > > >> > > > Sorry, that was where this started where Jiri was stating that we had >> > > > to be selling this. >> > > >> > > For the record, I never wrote that. Not sure why you repeat this over >> > > this thread. >> > >> > Because you seem to be implying that the Meta NIC driver shouldn't be >> > included simply since it isn't going to be available outside of Meta. >> > The fact is Meta employs a number of kernel developers and as a result >> > of that there will be a number of kernel developers that will have >> > access to this NIC and likely do development on systems containing it. >> > In addition simply due to the size of the datacenters that we will be >> > populating there is actually a strong likelihood that there will be >> > more instances of this NIC running on Linux than there are of some >> > other vendor devices that have been allowed to have drivers in the >> > kernel. >> >> So? The gain for community is still 0. No matter how many instances is >> private hw you privately have. Just have a private driver. > >I am amazed and not in a good way at how far this has gone, truly. > >This really is akin to saying that any non-zero driver count to maintain is a >burden on the community. Which is true, by definition, but if the goal was to >build something for no users, then clearly this is the wrong place to be in, >or too late. The systems with no users are the best to maintain, that is for >sure. > >If the practical concern is wen you make tree wide API change that fbnic >happens to use, and you have yet another driver (fbnic) to convert, so what? >Work with Alex ahead of time, get his driver to be modified, post the patch >series. Even if Alex happens to move on and stop being responsible and there >is no maintainer, so what? Give the driver a depreciation window for someone >to step in, rip it, end of story. Nothing new, so what has specifically >changed as of April 4th 2024 to oppose such strong rejection? How you describe the flow of internal API change is totally distant from reality. 
Really, like no part is correct:

1) An API change is the responsibility of the person doing it. Imagine working with 40 driver maintainers for every API change. I did my share of API changes in the past; maintainers were only involved to be CCed.

2) Deprecating a driver because the maintainer is not responsible - can you please show me one example of that happening in the past?

> >Like it was said, there are tons of drivers in the Linux kernel that have a >single user, this one might have a few more than a single one, that should be >good enough.

This one will have exactly 0. That is my point. Why merge something nobody will ever use?

> >What the heck is going on? >-- >Florian >
Mon, Apr 08, 2024 at 07:32:59PM CEST, john.fastabend@gmail.com wrote: >Jiri Pirko wrote: >> Mon, Apr 08, 2024 at 05:46:35PM CEST, alexander.duyck@gmail.com wrote: >> >On Mon, Apr 8, 2024 at 4:51 AM Jiri Pirko <jiri@resnulli.us> wrote: >> >> >> >> Fri, Apr 05, 2024 at 08:38:25PM CEST, alexander.duyck@gmail.com wrote: >> >> >On Fri, Apr 5, 2024 at 8:17 AM Jason Gunthorpe <jgg@nvidia.com> wrote: >> >> >> >> >> >> On Fri, Apr 05, 2024 at 07:24:32AM -0700, Alexander Duyck wrote: >> >> >> > > Alex already indicated new features are coming, changes to the core >> >> >> > > code will be proposed. How should those be evaluated? Hypothetically >> >> >> > > should fbnic be allowed to be the first implementation of something >> >> >> > > invasive like Mina's DMABUF work? Google published an open userspace >> >> >> > > for NCCL that people can (in theory at least) actually run. Meta would >> >> >> > > not be able to do that. I would say that clearly crosses the line and >> >> >> > > should not be accepted. >> >> >> > >> >> >> > Why not? Just because we are not commercially selling it doesn't mean >> >> >> > we couldn't look at other solutions such as QEMU. If we were to >> >> >> > provide a github repo with an emulation of the NIC would that be >> >> >> > enough to satisfy the "commercial" requirement? >> >> >> >> >> >> My test is not "commercial", it is enabling open source ecosystem vs >> >> >> benefiting only proprietary software. >> >> > >> >> >Sorry, that was where this started where Jiri was stating that we had >> >> >to be selling this. >> >> >> >> For the record, I never wrote that. Not sure why you repeat this over >> >> this thread. >> > >> >Because you seem to be implying that the Meta NIC driver shouldn't be >> >included simply since it isn't going to be available outside of Meta. >> >The fact is Meta employs a number of kernel developers and as a result >> >of that there will be a number of kernel developers that will have >> >access to this NIC and likely do development on systems containing it. >> >In addition simply due to the size of the datacenters that we will be >> >populating there is actually a strong likelihood that there will be >> >more instances of this NIC running on Linux than there are of some >> >other vendor devices that have been allowed to have drivers in the >> >kernel. >> >> So? The gain for community is still 0. No matter how many instances is >> private hw you privately have. Just have a private driver. > >The gain is the same as if company X makes a card and sells it >exclusively to datacenter provider Y. We know this happens.

Different story. The driver is still the same; perhaps only some parts of it are tailored to fit one person's needs, maybe. But here, the whole thing is obviously targeted at one person. Can't you see the scale on which these are different?

>Vendors would happily spin up a NIC if a DC with scale like this >would pay for it. They just don't advertise it in patch 0/X, >"adding device for cloud provider foo". > >There is no difference here. We gain developers, we gain insights, >learnings and Linux and OSS drivers are running on another big >DC. They improve things and find bugs they upstream them its a win. > >The opposite is also true if we exclude a driver/NIC HW that is >running on major DCs we lose a lot of insight, experience, value.

Could you please describe, in detail and with examples, what exactly we are about to lose? I don't see it.
>DCs are all starting to build their own hardware if we lose this >section of HW we lose those developers too. We are less likely >to get any advances they come up with. I think you have it backwards. >Eventually Linux networking becomes either commodity and irrelevant >for DC deployments. > >So I strongly disagree we lose by excluding drivers and win by >bringing it in. > >> >> >> > >> >So from what I can tell the only difference is if we are manufacturing >> >this for sale, or for personal use. Thus why I mention "commercial" >> >since the only difference from my perspective is the fact that we are >> >making it for our own use instead of selling it. >> >> Give it for free. > >Huh? > >> >> >> > >> >[...] >> > >> >> >> > I agree. We need a consistent set of standards. I just strongly >> >> >> > believe commercial availability shouldn't be one of them. >> >> >> >> >> >> I never said commercial availability. I talked about open source vs >> >> >> proprietary userspace. This is very standard kernel stuff. >> >> >> >> >> >> You have an unavailable NIC, so we know it is only ever operated with >> >> >> Meta's proprietary kernel fork, supporting Meta's proprietary >> >> >> userspace software. Where exactly is the open source? >> >> > >> >> >It depends on your definition of "unavailable". I could argue that for >> >> >many most of the Mellanox NICs are also have limited availability as >> >> >they aren't exactly easy to get a hold of without paying a hefty >> >> >ransom. >> >> >> >> Sorry, but I have to say this is ridiculous argument, really Alex. >> >> Apples and oranges. >> > >> >Really? So would you be making the same argument if it was >> >Nvidia/Mellanox pushing the driver and they were exclusively making it >> >just for Meta, Google, or some other big cloud provider? I suspect >> >> Heh, what ifs :) Anyway, chance that happens is very close to 0. >> >> >> >not. If nothing else they likely wouldn't disclose the plan for >> >exclusive sales to get around this sort of thing. The fact is I know >> >many of the vendors make proprietary spins of their firmware and >> >hardware for specific customers. The way I see it this patchset is >> >being rejected as I was too honest about the general plan and use case >> >for it. >> > >> >This is what I am getting at. It just seems like we are playing games >> >with semantics where if it is a vendor making the arrangement then it >> >is okay for them to make hardware that is inaccessible to most, but if >> >it is Meta then somehow it isn't. > >
On 4/9/2024 3:56 AM, Jiri Pirko wrote: > Mon, Apr 08, 2024 at 11:36:42PM CEST, f.fainelli@gmail.com wrote: >> On 4/8/24 09:51, Jiri Pirko wrote: >>> Mon, Apr 08, 2024 at 05:46:35PM CEST, alexander.duyck@gmail.com wrote: >>>> On Mon, Apr 8, 2024 at 4:51 AM Jiri Pirko <jiri@resnulli.us> wrote: >>>>> >>>>> Fri, Apr 05, 2024 at 08:38:25PM CEST, alexander.duyck@gmail.com wrote: >>>>>> On Fri, Apr 5, 2024 at 8:17 AM Jason Gunthorpe <jgg@nvidia.com> wrote: >>>>>>> >>>>>>> On Fri, Apr 05, 2024 at 07:24:32AM -0700, Alexander Duyck wrote: >>>>>>>>> Alex already indicated new features are coming, changes to the core >>>>>>>>> code will be proposed. How should those be evaluated? Hypothetically >>>>>>>>> should fbnic be allowed to be the first implementation of something >>>>>>>>> invasive like Mina's DMABUF work? Google published an open userspace >>>>>>>>> for NCCL that people can (in theory at least) actually run. Meta would >>>>>>>>> not be able to do that. I would say that clearly crosses the line and >>>>>>>>> should not be accepted. >>>>>>>> >>>>>>>> Why not? Just because we are not commercially selling it doesn't mean >>>>>>>> we couldn't look at other solutions such as QEMU. If we were to >>>>>>>> provide a github repo with an emulation of the NIC would that be >>>>>>>> enough to satisfy the "commercial" requirement? >>>>>>> >>>>>>> My test is not "commercial", it is enabling open source ecosystem vs >>>>>>> benefiting only proprietary software. >>>>>> >>>>>> Sorry, that was where this started where Jiri was stating that we had >>>>>> to be selling this. >>>>> >>>>> For the record, I never wrote that. Not sure why you repeat this over >>>>> this thread. >>>> >>>> Because you seem to be implying that the Meta NIC driver shouldn't be >>>> included simply since it isn't going to be available outside of Meta. >>>> The fact is Meta employs a number of kernel developers and as a result >>>> of that there will be a number of kernel developers that will have >>>> access to this NIC and likely do development on systems containing it. >>>> In addition simply due to the size of the datacenters that we will be >>>> populating there is actually a strong likelihood that there will be >>>> more instances of this NIC running on Linux than there are of some >>>> other vendor devices that have been allowed to have drivers in the >>>> kernel. >>> >>> So? The gain for community is still 0. No matter how many instances is >>> private hw you privately have. Just have a private driver. >> >> I am amazed and not in a good way at how far this has gone, truly. >> >> This really is akin to saying that any non-zero driver count to maintain is a >> burden on the community. Which is true, by definition, but if the goal was to >> build something for no users, then clearly this is the wrong place to be in, >> or too late. The systems with no users are the best to maintain, that is for >> sure. >> >> If the practical concern is wen you make tree wide API change that fbnic >> happens to use, and you have yet another driver (fbnic) to convert, so what? >> Work with Alex ahead of time, get his driver to be modified, post the patch >> series. Even if Alex happens to move on and stop being responsible and there >> is no maintainer, so what? Give the driver a depreciation window for someone >> to step in, rip it, end of story. Nothing new, so what has specifically >> changed as of April 4th 2024 to oppose such strong rejection? > > How you describe the flow of internal API change is totally distant from > reality. 
Really, like no part is correct: > 1) API change is responsibility of the person doing it. Imagine working > with 40 driver maintainers for every API change. I did my share of > API changes in the past, maintainer were only involved to be cced. As a submitter you propose changes and silence is acknowledgement. If one of your API changes broke someone's driver and they did not notify you of the breakage during the review cycle, it falls on their shoulder to fix it for themselves and they should not be holding back your work, that would not be fair. If you know about the breakage, and there is still no fix, that is an indication the driver is not actively used and maintained. This also does not mean you have to do the entire API changes to a driver you do not know about on your own. Nothing ever prevents you from posting the patches as RFC and say: "here is how I would go about changing your driver, please review and help me make corrections". If the driver maintainers do not respond there is no reason their lack of involvement should refrain your work, and so your proposed changes will be merged eventually. Is not this the whole point of being a community and be able to delegate and mitigate the risk of large scale changes? > 2) To deprecate driver because the maintainer is not responsible. Can > you please show me one example when that happened in the past? I cannot show you an example because we never had to go that far and I did not say that this is an established practice, but that we *could* do that if we ever reached that point. > > >> >> Like it was said, there are tons of drivers in the Linux kernel that have a >> single user, this one might have a few more than a single one, that should be >> good enough. > > This will have exactly 0. That is my point. Why to merge something > nobody will ever use? Even if Alex and his firmware colleague end up being the only two people using this driver if the decision is to make it upstream because this is the desired distribution and development model of the driver we should respect that. And just to be clear, we should not be respecting that because Meta, or Alex or anyone decided that they were doing the world a favor by working in the open rather than being closed door, but simply because we cannot *presume* about their intentions and the future. For drivers specifically, yes, there is a question of to which degree can we scale horizontally, and I do not think there is ever going to be an answer to that, as we will continue to see new drivers emerge, possibly with few users, for some definition of few.
From: Jiri Pirko <jiri@resnulli.us> Date: Tue, 9 Apr 2024 13:01:51 +0200 > Mon, Apr 08, 2024 at 07:32:59PM CEST, john.fastabend@gmail.com wrote: >> Jiri Pirko wrote: >>> Mon, Apr 08, 2024 at 05:46:35PM CEST, alexander.duyck@gmail.com wrote: >>>> On Mon, Apr 8, 2024 at 4:51 AM Jiri Pirko <jiri@resnulli.us> wrote: >>>>> >>>>> Fri, Apr 05, 2024 at 08:38:25PM CEST, alexander.duyck@gmail.com wrote: >>>>>> On Fri, Apr 5, 2024 at 8:17 AM Jason Gunthorpe <jgg@nvidia.com> wrote: >>>>>>> >>>>>>> On Fri, Apr 05, 2024 at 07:24:32AM -0700, Alexander Duyck wrote: >>>>>>>>> Alex already indicated new features are coming, changes to the core >>>>>>>>> code will be proposed. How should those be evaluated? Hypothetically >>>>>>>>> should fbnic be allowed to be the first implementation of something >>>>>>>>> invasive like Mina's DMABUF work? Google published an open userspace >>>>>>>>> for NCCL that people can (in theory at least) actually run. Meta would >>>>>>>>> not be able to do that. I would say that clearly crosses the line and >>>>>>>>> should not be accepted. >>>>>>>> >>>>>>>> Why not? Just because we are not commercially selling it doesn't mean >>>>>>>> we couldn't look at other solutions such as QEMU. If we were to >>>>>>>> provide a github repo with an emulation of the NIC would that be >>>>>>>> enough to satisfy the "commercial" requirement? >>>>>>> >>>>>>> My test is not "commercial", it is enabling open source ecosystem vs >>>>>>> benefiting only proprietary software. >>>>>> >>>>>> Sorry, that was where this started where Jiri was stating that we had >>>>>> to be selling this. >>>>> >>>>> For the record, I never wrote that. Not sure why you repeat this over >>>>> this thread. >>>> >>>> Because you seem to be implying that the Meta NIC driver shouldn't be >>>> included simply since it isn't going to be available outside of Meta.

BTW, idpf is also not something you can go and buy in a store, but it's here in the kernel. Anyway, see below.

>>>> The fact is Meta employs a number of kernel developers and as a result >>>> of that there will be a number of kernel developers that will have >>>> access to this NIC and likely do development on systems containing it. [...] >> Vendors would happily spin up a NIC if a DC with scale like this >> would pay for it. They just don't advertise it in patch 0/X, >> "adding device for cloud provider foo". >> >> There is no difference here. We gain developers, we gain insights, >> learnings and Linux and OSS drivers are running on another big >> DC. They improve things and find bugs they upstream them its a win. >> >> The opposite is also true if we exclude a driver/NIC HW that is >> running on major DCs we lose a lot of insight, experience, value. > > Could you please describe in details and examples what exactly is we > are about to loose? I don't see it.

As long as driver A introduces new features / improvements / API / whatever to the core kernel, we benefit from this no matter whether I'm actually able to run this driver on my system.

Some drivers even benefit us just by being of good quality (I don't speak for this driver, just some hypothetical one) and/or by making interesting design / code / API / etc. choices. The drivers I work on gained a lot just from me reading new commits / lore threads and looking at changes in other drivers. I've seen enough situations where driver A started doing something in a way it hadn't been done anywhere before, and then more and more drivers started doing the same thing, and in the end it became sorta standard.
I didn't read this patchset and thus can't say whether it will bring us
benefits immediately or only some time later, but I believe there's no
reason to reject the driver only because you can't buy a board for it in
your gadget store next door.

[...]

Thanks,
Olek
On Tue, Apr 09, 2024 at 03:11:21PM +0200, Alexander Lobakin wrote: > BTW idpf is also not something you can go and buy in a store, but it's > here in the kernel. Anyway, see below. That is really disappointing to hear :( Jason
On Tue, 9 Apr 2024 15:11:21 +0200 Alexander Lobakin wrote: > BTW idpf is also not something you can go and buy in a store, but it's > here in the kernel. Anyway, see below. For some definition of "a store" :) > > Could you please describe in details and examples what exactly is we > > are about to loose? I don't see it. > > As long as driver A introduces new features / improvements / API / > whatever to the core kernel, we benefit from this no matter whether I'm > actually able to run this driver on my system. > > Some drivers even give us benefit by that they are of good quality (I > don't speak for this driver, just some hypothetical) and/or have > interesting design / code / API / etc. choices. The drivers I work on > did gain a lot just from that I was reading new commits / lore threads > and look at changes in other drivers. Another point along these lines is worth bringing up. Companies which build their own kernels probably have little reason to distribute drivers out of tree. Vendors unfortunately are forced by some of their customers and/or sales department to provide out of tree drivers. Which in turn distinctiveness them from implementing shared core infrastructure. The queue API is a good example of that. Number of vendors implement pre-allocate and swap for reconfiguration but it's not controlled by the core. So after 5+ years (look at netconf 2019 slides) of violently agreeing that we need queue alloc we made little progress :( I don't think that it's a coincidence that it's Mina (Google) and David (Meta) who picked up this work. And it's really hard to implement that in an "off the shelf device", where queues are fully controlled by FW (no documentation available), and without breaking something (no access to vendor's CI/tests). IOW while modifying core for a single private driver is a concern there's also a ton of work we all agree needs to be done in the core, that we need help with.
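To make the "pre-allocate and swap" point concrete, here is a minimal
user-space sketch of the pattern; all names in it (struct ring, ring_resize)
are hypothetical illustrations, not taken from fbnic or any vendor driver.
The invariant is that every resource for the new configuration is acquired
before the live queue is touched, so a failed resize leaves the old
configuration fully intact:

#include <stdlib.h>

/* Hypothetical descriptor ring; real drivers also hold DMA mappings etc. */
struct ring {
        void **bufs;
        size_t len;
};

static struct ring *ring_alloc(size_t len)
{
        struct ring *r = calloc(1, sizeof(*r));

        if (!r)
                return NULL;
        r->bufs = calloc(len, sizeof(*r->bufs));
        if (!r->bufs) {
                free(r);
                return NULL;
        }
        r->len = len;
        return r;
}

static void ring_free(struct ring *r)
{
        if (!r)
                return;
        free(r->bufs);
        free(r);
}

/*
 * Pre-allocate-and-swap: acquire all resources for the new config first,
 * so a failure leaves the live ring untouched; only then quiesce and swap.
 */
static int ring_resize(struct ring **live, size_t new_len)
{
        struct ring *new = ring_alloc(new_len);
        struct ring *old = *live;

        if (!new)
                return -1;      /* old ring still valid, nothing changed */
        /* ...stop the queue and migrate any state here... */
        *live = new;            /* swap in the new ring */
        ring_free(old);         /* release the old one */
        return 0;
}

int main(void)
{
        struct ring *live = ring_alloc(256);

        if (!live || ring_resize(&live, 512))
                return 1;
        ring_free(live);
        return 0;
}

In a real driver the same ordering applies to DMA rings and page pools; the
core-controlled version of this ordering is, as far as one can tell from the
thread, what the queue API work referenced above is trying to standardize.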
On Tue, 9 Apr 2024 07:08:58 -0700 Jakub Kicinski wrote:
> distinctiveness
Too trusting of the spellcheck, I meant disincentivizes
Tue, Apr 09, 2024 at 03:05:47PM CEST, f.fainelli@gmail.com wrote: > > >On 4/9/2024 3:56 AM, Jiri Pirko wrote: >> Mon, Apr 08, 2024 at 11:36:42PM CEST, f.fainelli@gmail.com wrote: >> > On 4/8/24 09:51, Jiri Pirko wrote: >> > > Mon, Apr 08, 2024 at 05:46:35PM CEST, alexander.duyck@gmail.com wrote: >> > > > On Mon, Apr 8, 2024 at 4:51 AM Jiri Pirko <jiri@resnulli.us> wrote: >> > > > > >> > > > > Fri, Apr 05, 2024 at 08:38:25PM CEST, alexander.duyck@gmail.com wrote: >> > > > > > On Fri, Apr 5, 2024 at 8:17 AM Jason Gunthorpe <jgg@nvidia.com> wrote: >> > > > > > > >> > > > > > > On Fri, Apr 05, 2024 at 07:24:32AM -0700, Alexander Duyck wrote: >> > > > > > > > > Alex already indicated new features are coming, changes to the core >> > > > > > > > > code will be proposed. How should those be evaluated? Hypothetically >> > > > > > > > > should fbnic be allowed to be the first implementation of something >> > > > > > > > > invasive like Mina's DMABUF work? Google published an open userspace >> > > > > > > > > for NCCL that people can (in theory at least) actually run. Meta would >> > > > > > > > > not be able to do that. I would say that clearly crosses the line and >> > > > > > > > > should not be accepted. >> > > > > > > > >> > > > > > > > Why not? Just because we are not commercially selling it doesn't mean >> > > > > > > > we couldn't look at other solutions such as QEMU. If we were to >> > > > > > > > provide a github repo with an emulation of the NIC would that be >> > > > > > > > enough to satisfy the "commercial" requirement? >> > > > > > > >> > > > > > > My test is not "commercial", it is enabling open source ecosystem vs >> > > > > > > benefiting only proprietary software. >> > > > > > >> > > > > > Sorry, that was where this started where Jiri was stating that we had >> > > > > > to be selling this. >> > > > > >> > > > > For the record, I never wrote that. Not sure why you repeat this over >> > > > > this thread. >> > > > >> > > > Because you seem to be implying that the Meta NIC driver shouldn't be >> > > > included simply since it isn't going to be available outside of Meta. >> > > > The fact is Meta employs a number of kernel developers and as a result >> > > > of that there will be a number of kernel developers that will have >> > > > access to this NIC and likely do development on systems containing it. >> > > > In addition simply due to the size of the datacenters that we will be >> > > > populating there is actually a strong likelihood that there will be >> > > > more instances of this NIC running on Linux than there are of some >> > > > other vendor devices that have been allowed to have drivers in the >> > > > kernel. >> > > >> > > So? The gain for community is still 0. No matter how many instances is >> > > private hw you privately have. Just have a private driver. >> > >> > I am amazed and not in a good way at how far this has gone, truly. >> > >> > This really is akin to saying that any non-zero driver count to maintain is a >> > burden on the community. Which is true, by definition, but if the goal was to >> > build something for no users, then clearly this is the wrong place to be in, >> > or too late. The systems with no users are the best to maintain, that is for >> > sure. >> > >> > If the practical concern is wen you make tree wide API change that fbnic >> > happens to use, and you have yet another driver (fbnic) to convert, so what? >> > Work with Alex ahead of time, get his driver to be modified, post the patch >> > series. 
>> > Even if Alex happens to move on and stop being responsible and there
>> > is no maintainer, so what? Give the driver a depreciation window for someone
>> > to step in, rip it, end of story. Nothing new, so what has specifically
>> > changed as of April 4th 2024 to oppose such strong rejection?
>>
>> How you describe the flow of internal API change is totally distant from
>> reality. Really, like no part is correct:
>> 1) API change is responsibility of the person doing it. Imagine working
>> with 40 driver maintainers for every API change. I did my share of
>> API changes in the past, maintainer were only involved to be cced.
>
>As a submitter you propose changes and silence is acknowledgement. If one of
>your API changes broke someone's driver and they did not notify you of the
>breakage during the review cycle, it falls on their shoulder to fix it for
>themselves and they should not be holding back your work, that would not be

Does it? I don't think so. If you break something, better try to fix it
before somebody else has to.

>fair. If you know about the breakage, and there is still no fix, that is an
>indication the driver is not actively used and maintained.

So? That is not my point. If I break something in fbnic, why does anyone
care? Nobody will ever hit that bug, only Meta's DC.

>
>This also does not mean you have to do the entire API changes to a driver you
>do not know about on your own. Nothing ever prevents you from posting the
>patches as RFC and say: "here is how I would go about changing your driver,
>please review and help me make corrections". If the driver maintainers do not
>respond there is no reason their lack of involvement should refrain your
>work, and so your proposed changes will be merged eventually.

Realistically, did you ever see that happen? I can't recall.

>
>Is not this the whole point of being a community and be able to delegate and
>mitigate the risk of large scale changes?
>
>> 2) To deprecate driver because the maintainer is not responsible. Can
>> you please show me one example when that happened in the past?
>
>I cannot show you an example because we never had to go that far and I did
>not say that this is an established practice, but that we *could* do that if
>we ever reached that point.

You are talking about a flow that does not exist. I don't understand how
that is related to this discussion then.

>
>> >
>> > Like it was said, there are tons of drivers in the Linux kernel that have a
>> > single user, this one might have a few more than a single one, that should be
>> > good enough.
>>
>> This will have exactly 0. That is my point. Why to merge something
>> nobody will ever use?
>
>Even if Alex and his firmware colleague end up being the only two people
>using this driver if the decision is to make it upstream because this is the
>desired distribution and development model of the driver we should respect
>that.
>
>And just to be clear, we should not be respecting that because Meta, or Alex
>or anyone decided that they were doing the world a favor by working in the
>open rather than being closed door, but simply because we cannot *presume*

I don't see any favor for the community. What's the favor exactly?
The only favor I see is in the opposite direction, the community giving
Meta free cycles, saving them backporting costs. Why?

>about their intentions and the future.

Heh, the intention is pretty clear from this discussion, isn't it? If
they ever by any chance decide to go public with their device, a driver
for that could be submitted at that time.
But this is totally hypothetical. > >For drivers specifically, yes, there is a question of to which degree can we >scale horizontally, and I do not think there is ever going to be an answer to >that, as we will continue to see new drivers emerge, possibly with few users, >for some definition of few. >-- >Florian
Tue, Apr 09, 2024 at 03:11:21PM CEST, aleksander.lobakin@intel.com wrote: >From: Jiri Pirko <jiri@resnulli.us> >Date: Tue, 9 Apr 2024 13:01:51 +0200 > >> Mon, Apr 08, 2024 at 07:32:59PM CEST, john.fastabend@gmail.com wrote: >>> Jiri Pirko wrote: >>>> Mon, Apr 08, 2024 at 05:46:35PM CEST, alexander.duyck@gmail.com wrote: >>>>> On Mon, Apr 8, 2024 at 4:51 AM Jiri Pirko <jiri@resnulli.us> wrote: >>>>>> >>>>>> Fri, Apr 05, 2024 at 08:38:25PM CEST, alexander.duyck@gmail.com wrote: >>>>>>> On Fri, Apr 5, 2024 at 8:17 AM Jason Gunthorpe <jgg@nvidia.com> wrote: >>>>>>>> >>>>>>>> On Fri, Apr 05, 2024 at 07:24:32AM -0700, Alexander Duyck wrote: >>>>>>>>>> Alex already indicated new features are coming, changes to the core >>>>>>>>>> code will be proposed. How should those be evaluated? Hypothetically >>>>>>>>>> should fbnic be allowed to be the first implementation of something >>>>>>>>>> invasive like Mina's DMABUF work? Google published an open userspace >>>>>>>>>> for NCCL that people can (in theory at least) actually run. Meta would >>>>>>>>>> not be able to do that. I would say that clearly crosses the line and >>>>>>>>>> should not be accepted. >>>>>>>>> >>>>>>>>> Why not? Just because we are not commercially selling it doesn't mean >>>>>>>>> we couldn't look at other solutions such as QEMU. If we were to >>>>>>>>> provide a github repo with an emulation of the NIC would that be >>>>>>>>> enough to satisfy the "commercial" requirement? >>>>>>>> >>>>>>>> My test is not "commercial", it is enabling open source ecosystem vs >>>>>>>> benefiting only proprietary software. >>>>>>> >>>>>>> Sorry, that was where this started where Jiri was stating that we had >>>>>>> to be selling this. >>>>>> >>>>>> For the record, I never wrote that. Not sure why you repeat this over >>>>>> this thread. >>>>> >>>>> Because you seem to be implying that the Meta NIC driver shouldn't be >>>>> included simply since it isn't going to be available outside of Meta. > >BTW idpf is also not something you can go and buy in a store, but it's >here in the kernel. Anyway, see below. IDK, why so many people in this thread are so focused on "buying" nic. IDPF device is something I assume one may see on a VM hosted in some cloud, isn't it? If yes, it is completely legit to have it in. Do I miss something? > >>>>> The fact is Meta employs a number of kernel developers and as a result >>>>> of that there will be a number of kernel developers that will have >>>>> access to this NIC and likely do development on systems containing it. > >[...] > >>> Vendors would happily spin up a NIC if a DC with scale like this >>> would pay for it. They just don't advertise it in patch 0/X, >>> "adding device for cloud provider foo". >>> >>> There is no difference here. We gain developers, we gain insights, >>> learnings and Linux and OSS drivers are running on another big >>> DC. They improve things and find bugs they upstream them its a win. >>> >>> The opposite is also true if we exclude a driver/NIC HW that is >>> running on major DCs we lose a lot of insight, experience, value. >> >> Could you please describe in details and examples what exactly is we >> are about to loose? I don't see it. > >As long as driver A introduces new features / improvements / API / >whatever to the core kernel, we benefit from this no matter whether I'm >actually able to run this driver on my system. 
>
>Some drivers even give us benefit by that they are of good quality (I
>don't speak for this driver, just some hypothetical) and/or have
>interesting design / code / API / etc. choices. The drivers I work on
>did gain a lot just from that I was reading new commits / lore threads
>and look at changes in other drivers.
>
>I saw enough situations when driver A started using/doing something the
>way it wasn't ever done anywhere before, and then more and more drivers
>stated doing the same thing and at the end it became sorta standard.

So the bottom line is, the unused driver *may* introduce some features
and *may* serve as an example of how to do things for other people. Is
this really so beneficial for the community that it outweighs the
obvious cons (not going to repeat them)? Like with any other patch/set
we merge in, we always look at the pros and cons. I'm honestly surprised
that so many people here want to make an exception for Meta's internal
toy project.

>
>I didn't read this patchset and thus can't say if it will bring us good
>immediately or some time later, but I believe there's no reason to
>reject the driver only because you can't buy a board for it in your
>gadget store next door.

Again with "buying", uff.

>
>[...]
>
>Thanks,
>Olek
On Tue, Apr 9, 2024 at 1:19 AM Leon Romanovsky <leon@kernel.org> wrote: > > On Mon, Apr 08, 2024 at 01:43:28PM -0700, Alexander Duyck wrote: > > On Mon, Apr 8, 2024 at 11:41 AM Leon Romanovsky <leon@kernel.org> wrote: > > > > > > On Mon, Apr 08, 2024 at 08:26:33AM -0700, Alexander Duyck wrote: > > > > On Sun, Apr 7, 2024 at 11:18 PM Leon Romanovsky <leon@kernel.org> wrote: > > > > > > > > > > On Fri, Apr 05, 2024 at 08:41:11AM -0700, Alexander Duyck wrote: > > > > > > On Thu, Apr 4, 2024 at 7:38 PM Jakub Kicinski <kuba@kernel.org> wrote: > > > > > > > > > > <...> > > > > > > > > > > > > > > Technical solution? Maybe if it's not a public device regression rules > > > > > > > > > don't apply? Seems fairly reasonable. > > > > > > > > > > > > > > > > This is a hypothetical. This driver currently isn't changing anything > > > > > > > > outside of itself. At this point the driver would only be build tested > > > > > > > > by everyone else. They could just not include it in their Kconfig and > > > > > > > > then out-of-sight, out-of-mind. > > > > > > > > > > > > > > Not changing does not mean not depending on existing behavior. > > > > > > > Investigating and fixing properly even the hardest regressions in > > > > > > > the stack is a bar that Meta can so easily clear. I don't understand > > > > > > > why you are arguing. > > > > > > > > > > > > I wasn't saying the driver wouldn't be dependent on existing behavior. > > > > > > I was saying that it was a hypothetical that Meta would be a "less > > > > > > than cooperative user" and demand a revert. It is also a hypothetical > > > > > > that Linus wouldn't just propose a revert of the fbnic driver instead > > > > > > of the API for the crime of being a "less than cooperative maintainer" > > > > > > and then give Meta the Nvidia treatment. > > > > > > > > > > It is very easy to be "less than cooperative maintainer" in netdev world. > > > > > 1. Be vendor. > > > > > 2. Propose ideas which are different. > > > > > 3. Report for user-visible regression. > > > > > 4. Ask for a fix from the patch author or demand a revert according to netdev rules/practice. > > > > > > > > > > And voilà, you are "less than cooperative maintainer". > > > > > > > > > > So in reality, the "hypothetical" is very close to the reality, unless > > > > > Meta contribution will be treated as a special case. > > > > > > > > > > Thanks > > > > > > > > How many cases of that have we had in the past? I'm honestly curious > > > > as I don't actually have any reference. > > > > > > And this is the problem, you don't have "any reference" and accurate > > > knowledge what happened, but you are saying "less than cooperative > > > maintainer". > > <...> > > > Any more info on this? Without context it is hard to say one way or the other. > > <...> > > > I didn't say you couldn't. Without context I cannot say if it was > > deserved or not. > > Florian gave links to the context, so I'll skip this part. > > In this thread, Jakub tried to revive the discussion about it. > https://lore.kernel.org/netdev/20240326133412.47cf6d99@kernel.org/ > > <...> I see. So this is what you were referencing. Arguably I can see both sides of the issue. Ideally what should have been presented would have been the root cause of why the diff was breaking things and then it could have been fixed. However instead what was presented was essentially a bisect with a request to revert. Ultimately Eric accepted the revert since there was an issue that needed to be fixed. 
However, I can't tell what went on in terms of trying to get to the root
cause, as that was taken offline for discussion, so I can't say what role
Mellanox played, good or bad, other than at least performing the bisect.

Ultimately I think this kind of comes down to the hobbyist versus
commercial interests issue that I brought up earlier. The hobbyist side
should at least be curious about what in the Vagrant implementation was
not RFC compliant, which the changes supposedly were, hence the interest
in getting a root cause. That said, it is broken and needs to be fixed,
so curiosity be damned; we cannot break userspace or fail to interop with
other TCP implementations.

> > The point I was trying to make is that if you are the only owner of
> > something, and not willing to work with the community as a maintainer
>
> Like Jakub, I don't understand why you are talking about regressions in
> the driver, while you brought the label of "less than cooperative maintainer"
> and asked for "then give Meta the Nvidia treatment".

Because I have been trying to keep the whole discussion about the fbnic
driver that is presented in this patch set. When I was referring to a
"less than cooperative maintainer" it was in response to the hypothetical
about what if Meta started refusing to work with the community after this
was accepted, and the "Nvidia treatment" I was referring to was the
graphics side about 10 years ago[1], as the question was about somebody
running to Linus to complain that their proprietary hardware got broken
by a kernel change. The general idea being that if we ship a proprietary
NIC with ourselves as the only customer, Linus would be more likely to
give Meta a similar message.

> I don't want to get into the discussion about if this driver should be
> accepted or not.
>
> I'm just asking to stop label people and companies based on descriptions
> from other people, but rely on facts.

Sorry, it wasn't meant to be any sort of attack on Nvidia/Mellanox. The
Nvidia I was referencing was the graphics side, which had a bad
reputation with the community long before Mellanox got involved.

> Thanks

Thank you. I understand now that you and Jason were just trying to warn
me about what the community will and won't accept. Like I mentioned
before, I had just misconstrued Jason's comments as initially backing
Jiri in this. In my mind I was prepared for the Nvidia/Mellanox folks
dog-piling me, so I was just prepared for attacks.

Just for the record, this will be the third NIC driver I have added to
the kernel, following igbvf and fm10k, plus years of maintaining some of
the other Intel network drivers. So I am well aware of the expectations
of a maintainer. I might be a bit rusty due to a couple of years of being
focused on this project and not being able to post as much upstream, but
as the expression goes, "This isn't my first rodeo".

- Alex

[1]: https://arstechnica.com/information-technology/2012/06/linus-torvalds-says-f-k-you-to-nvidia/
On Tue, Apr 09, 2024 at 07:43:07AM -0700, Alexander Duyck wrote:

> I see. So this is what you were referencing. Arguably I can see both
> sides of the issue. Ideally what should have been presented would have
> been the root cause of why the diff

Uh, that almost never happens in the kernel world. Someone does a
great favour to us all by testing rc kernels and finding bugs. The
expectation is generally things like:

 - The bug is fixed immediately because the issue is obvious to the
   author
 - Iteration and rapid progress is seen toward enlightening the author
 - The patch is reverted, often rapidly, try again later with a good
   patch

Unsophisticated reporters should not experience regressions, period.
Unsophisticated reporters should not be expected to debug things on
their own (though it sure is nice if they can!). We really like it and
appreciate it if reporters can run experiments!

In this particular instance there was some resistance to getting to a
fix quickly. I think a revert for something like this that could not be
immediately fixed is the correct thing, especially when it affects
significant work within the community. It gives the submitter time to
find out how to solve the regression.

That there is now so much ongoing bad blood over such an ordinary
matter is what is really distressing here.

I think Leon's point is broadly that those on the "vendor" side seem
to often be accused of being a "bad vendor". I couldn't help but
notice the language from Meta on this thread seemed to place Meta
outside of being a vendor, despite having always very much been doing
typical vendor activities like downstream forks, proprietary userspace
and now drivers for their own devices.

In my view the vendor/!vendor distinction is really toxic and should
stop.

Jason
On Tue, Apr 9, 2024 at 8:39 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Tue, Apr 09, 2024 at 07:43:07AM -0700, Alexander Duyck wrote:
>
> > I see. So this is what you were referencing. Arguably I can see both
> > sides of the issue. Ideally what should have been presented would have
> > been the root cause of why the diff
>
> Uh, that almost never happens in the kernel world. Someone does a
> great favour to us all to test rc kernels and finds bugs. The

Thus why I mentioned "Ideally". Most often that cannot be the case due
to various reasons. However, that said, that would have been the ideal
solution, not the practical one.

> expectation is generally things like:
>
> - The bug is fixed immediately because the issue is obvious to the
> author
> - Iteration and rapid progress is seen toward enlightening the author
> - The patch is reverted, often rapidly, try again later with a good
> patch

When working on a development branch that shouldn't be the expectation.
I suspect that is why the revert was pushed back on initially. The
developer wanted a chance to try to debug and resolve the issue with
root cause.

Honestly, what I probably would have proposed was a build flag that
would have allowed the code to stay but be disabled with a "Broken"
label, to allow both developers to work on their own thing. Then if
people complained about the RFC non-compliance issue but didn't care
about the Vagrant setup, they could have just turned it on to test and
verify it fixed their issue, and it would get additional testing.
However I assume that would have introduced additional maintenance
overhead.

> Unsophisticated reporters should not experience regressions,
> period. Unsophisticated reporters shuld not be expected to debug
> things on their own (though it sure is nice if they can!). We really
> like it and appreciate it if reporters can run experiments!

Unsophisticated reporters/users shouldn't be running net-next. If this
has made it to, or is about to go into, Linus's tree then I would agree
the regression needs to be resolved ASAP, as that stuff shouldn't exist
past rc1 at the latest.

> In this particular instance there was some resistance getting to a fix
> quickly. I think a revert for something like this that could not be
> immediately fixed is the correct thing, especially when it effects
> significant work within the community. It gives the submitter time to
> find out how to solve the regression.
>
> That there is now so much ongoing bad blood over such an ordinary
> matter is what is really distressing here.

Well, much of it has to do with the fact that this is supposed to be a
community. Generally I help you, you help me, and together we both make
progress. So within the community people tend to build up what we could
call karma. Generally I think some of the messages sent seemed to make
it come across that the Mellanox/Nvidia folks felt it "wasn't their
problem", so they elicited a bit of frustration from the other
maintainers and built up some negative karma.

As I had mentioned, in the case of the e1000e NVRAM corruption it wasn't
an Intel issue that caused the problem, but Intel had to jump in and
address it until they found the root cause, which was function tracing.

Unfortunately one thing that tends to happen with upstream is that we
get asked to do things that aren't directly related to the project we
are working on. We saw that at Intel quite often.
I referred to it at one point as the "you stepped in it, you own it"
phenomenon, where if we even brushed against a block of upstream code
that wasn't being well maintained we would be asked to fix it up and
address existing issues before we could upstream any patches.

> I think Leon's point is broadly that those on the "vendor" side seem
> to often be accused of being a "bad vendor". I couldn't help but
> notice the language from Meta on this thread seemed to place Meta
> outside of being a vendor, despite having always very much been doing
> typical vendor activities like downstream forks, proprietary userspace
> and now drivers for their own devices.

I wouldn't disagree that we are doing "vendor" things. Up until about 4
years ago I was on the "vendor" side at Intel. One difference is that
Meta is also the "consumer". So if I report an issue it is me
complaining about something as a sophisticated user instead of an
unsophisticated one. So hopefully we have gone through and done some
triage to at least bisect it down to a patch and are willing to work
with the community as you guys did. If we can work with the other
maintainers to enable them to debug and root cause the issue, then even
better. The revert is normally the weapon of last resort to be broken
out before the merge window opens, or if an issue is caught in Linus's
tree.

> In my view the vendor/!vendor distinction is really toxic and should
> stop.

I agree. However, that was essentially what started all this when Jiri
pointed out that we weren't selling the NIC to anyone else. That made
this all about vendor vs !vendor, and his suggestion of just giving the
NICs away isn't exactly practical. At least not at any sort of large
scale.

Maybe we should start coming up with a new term for the !vendor case.
How about "prosumer", as in "producer" and "consumer"?
On 05/04/2024 15:24, Alexander Duyck wrote: > Why not? Just because we are not commercially selling it doesn't mean > we couldn't look at other solutions such as QEMU. If we were to > provide a github repo with an emulation of the NIC would that be > enough to satisfy the "commercial" requirement? > > The fact is I already have an implementation, but I would probably > need to clean up a few things as the current setup requires 3 QEMU > instances to emulate the full setup with host, firmware, and BMC. It > wouldn't be as performant as the actual hardware but it is more than > enough for us to test code with. If we need to look at publishing > something like that to github in order to address the lack of user > availability I could start looking at getting the approvals for that. Personally I think that this would vitiate any legitimate objections anyone could have to this driver. The emulation would be a functional spec for the device, and (assuming it's open source, including the firmware) would provide a basis for anyone attempting to build their own hardware to the same interface. As long as clones aren't prevented by some kind of patent encumbrance or whatever, this would be more 'open' than many of the devices users _can_ get their hands on today. The way this suggestion/offer/proposal got dismissed and ignored in favour of spurious arguments about DMABUF speaks volumes. -e
On Tue, Apr 09, 2024 at 09:31:06AM -0700, Alexander Duyck wrote:

> > expectation is generally things like:
> >
> > - The bug is fixed immediately because the issue is obvious to the
> > author
> > - Iteration and rapid progress is seen toward enlightening the author
> > - The patch is reverted, often rapidly, try again later with a good
> > patch
>
> When working on a development branch that shouldn't be the
> expectation. I suspect that is why the revert was pushed back on
> initially. The developer wanted a chance to try to debug and resolve
> the issue with root cause.

Even mm-unstable drops patches on a hair trigger, as an example.

You can't have an orderly development process if your development tree
is broken in your CI. Personally I'm grateful for the people who test
linux-next (or the various constituent sub trees), it really helps.

> Well much of it has to do with the fact that this is supposed to be a
> community. Generally I help you, you help me and together we both make
> progress. So within the community people tend to build up what we
> could call karma. Generally I think some of the messages sent seemed
> to make it come across that the Mellanox/Nvidia folks felt it "wasn't
> their problem" so they elicited a bit of frustration from the other
> maintainers and built up some negative karma.

How could it be NVIDIA folks' problem? They are not experts in TCP and
can't debug it. The engineer running the CI systems did what he was
asked by Eric from what I can tell.

> phenomenon where if we even brushed against block of upstream code
> that wasn't being well maintained we would be asked to fix it up and
> address existing issues before we could upstream any patches.

Well, Intel has its own karma problems in the kernel community. :(

> > In my view the vendor/!vendor distinction is really toxic and should
> > stop.
>
> I agree. However that was essentially what started all this when Jiri
> pointed out that we weren't selling the NIC to anyone else. That made
> this all about vendor vs !vendor,

That is not how I would sum up Jiri's position.

By my read he is saying that contributing code to the kernel that only
Meta can actually use is purely extractive. It is not about vendor or
!vendor, it is taking free forward-porting or not. You have argued, and
I would agree, that there is a grey scale between
extractive/collaborative - but I also agree with Jiri that fbnic is
undeniably far toward the extractive side.

If being extractive is a problem in this case or not is another
question, but I would say Jiri's objection is definitely not about
selling or vendor vs !vendor.

Jason
On 4/9/24 07:28, Jiri Pirko wrote: > Tue, Apr 09, 2024 at 03:05:47PM CEST, f.fainelli@gmail.com wrote: >> >> >> On 4/9/2024 3:56 AM, Jiri Pirko wrote: >>> Mon, Apr 08, 2024 at 11:36:42PM CEST, f.fainelli@gmail.com wrote: >>>> On 4/8/24 09:51, Jiri Pirko wrote: >>>>> Mon, Apr 08, 2024 at 05:46:35PM CEST, alexander.duyck@gmail.com wrote: >>>>>> On Mon, Apr 8, 2024 at 4:51 AM Jiri Pirko <jiri@resnulli.us> wrote: >>>>>>> >>>>>>> Fri, Apr 05, 2024 at 08:38:25PM CEST, alexander.duyck@gmail.com wrote: >>>>>>>> On Fri, Apr 5, 2024 at 8:17 AM Jason Gunthorpe <jgg@nvidia.com> wrote: >>>>>>>>> >>>>>>>>> On Fri, Apr 05, 2024 at 07:24:32AM -0700, Alexander Duyck wrote: >>>>>>>>>>> Alex already indicated new features are coming, changes to the core >>>>>>>>>>> code will be proposed. How should those be evaluated? Hypothetically >>>>>>>>>>> should fbnic be allowed to be the first implementation of something >>>>>>>>>>> invasive like Mina's DMABUF work? Google published an open userspace >>>>>>>>>>> for NCCL that people can (in theory at least) actually run. Meta would >>>>>>>>>>> not be able to do that. I would say that clearly crosses the line and >>>>>>>>>>> should not be accepted. >>>>>>>>>> >>>>>>>>>> Why not? Just because we are not commercially selling it doesn't mean >>>>>>>>>> we couldn't look at other solutions such as QEMU. If we were to >>>>>>>>>> provide a github repo with an emulation of the NIC would that be >>>>>>>>>> enough to satisfy the "commercial" requirement? >>>>>>>>> >>>>>>>>> My test is not "commercial", it is enabling open source ecosystem vs >>>>>>>>> benefiting only proprietary software. >>>>>>>> >>>>>>>> Sorry, that was where this started where Jiri was stating that we had >>>>>>>> to be selling this. >>>>>>> >>>>>>> For the record, I never wrote that. Not sure why you repeat this over >>>>>>> this thread. >>>>>> >>>>>> Because you seem to be implying that the Meta NIC driver shouldn't be >>>>>> included simply since it isn't going to be available outside of Meta. >>>>>> The fact is Meta employs a number of kernel developers and as a result >>>>>> of that there will be a number of kernel developers that will have >>>>>> access to this NIC and likely do development on systems containing it. >>>>>> In addition simply due to the size of the datacenters that we will be >>>>>> populating there is actually a strong likelihood that there will be >>>>>> more instances of this NIC running on Linux than there are of some >>>>>> other vendor devices that have been allowed to have drivers in the >>>>>> kernel. >>>>> >>>>> So? The gain for community is still 0. No matter how many instances is >>>>> private hw you privately have. Just have a private driver. >>>> >>>> I am amazed and not in a good way at how far this has gone, truly. >>>> >>>> This really is akin to saying that any non-zero driver count to maintain is a >>>> burden on the community. Which is true, by definition, but if the goal was to >>>> build something for no users, then clearly this is the wrong place to be in, >>>> or too late. The systems with no users are the best to maintain, that is for >>>> sure. >>>> >>>> If the practical concern is wen you make tree wide API change that fbnic >>>> happens to use, and you have yet another driver (fbnic) to convert, so what? >>>> Work with Alex ahead of time, get his driver to be modified, post the patch >>>> series. Even if Alex happens to move on and stop being responsible and there >>>> is no maintainer, so what? 
>>>> Give the driver a depreciation window for someone
>>>> to step in, rip it, end of story. Nothing new, so what has specifically
>>>> changed as of April 4th 2024 to oppose such strong rejection?
>>>
>>> How you describe the flow of internal API change is totally distant from
>>> reality. Really, like no part is correct:
>>> 1) API change is responsibility of the person doing it. Imagine working
>>> with 40 driver maintainers for every API change. I did my share of
>>> API changes in the past, maintainer were only involved to be cced.
>>
>> As a submitter you propose changes and silence is acknowledgement. If one of
>> your API changes broke someone's driver and they did not notify you of the
>> breakage during the review cycle, it falls on their shoulder to fix it for
>> themselves and they should not be holding back your work, that would not be
>
> Does it? I don't think so. If you break something, better try to fix it
> before somebody else has to.
>
>> fair. If you know about the breakage, and there is still no fix, that is an
>> indication the driver is not actively used and maintained.
>
> So? That is not my point. If I break something in fbnic, why does anyone
> care? Nobody is ever to hit that bug, only Meta DC.

They care, and they will jump in to fix it. There is no expectation that as a
community member you should be able to make 100% correct patches; this is
absolutely not humanly possible, even less so with scarce access to the
hardware. All you can hope for is that your changes work, and that someone
catches it, sooner rather than later.

>> This also does not mean you have to do the entire API changes to a driver you
>> do not know about on your own. Nothing ever prevents you from posting the
>> patches as RFC and say: "here is how I would go about changing your driver,
>> please review and help me make corrections". If the driver maintainers do not
>> respond there is no reason their lack of involvement should refrain your
>> work, and so your proposed changes will be merged eventually.
>
> Realistically, did you see that ever happen. I can't recall.

This happens all of the time. If you make a netdev tree-wide change, how many
maintainers' Acked-by tags do we collect before merging those changes?
Typically none, because some netdev maintainers are just quicker than
reviewers could be. In other subsystems we might actually wait to give people
a chance to give their A-b or R-b tags, though not always.

>> Is not this the whole point of being a community and be able to delegate and
>> mitigate the risk of large scale changes?
>>
>>> 2) To deprecate driver because the maintainer is not responsible. Can
>>> you please show me one example when that happened in the past?
>>
>> I cannot show you an example because we never had to go that far and I did
>> not say that this is an established practice, but that we *could* do that if
>> we ever reached that point.
>
> You are talking about a flow that does not exist. I don't understand how
> is that related to this discussion then.

I was trying to assuage your concerns about additional maintenance burden. If
the burden becomes real, we ditch it. We can dismiss this point as not being
relevant if you want.

>>>> Like it was said, there are tons of drivers in the Linux kernel that have a
>>>> single user, this one might have a few more than a single one, that should be
>>>> good enough.
>>>
>>> This will have exactly 0. That is my point. Why to merge something
>>> nobody will ever use?
>>
>> Even if Alex and his firmware colleague end up being the only two people
>> using this driver if the decision is to make it upstream because this is the
>> desired distribution and development model of the driver we should respect
>> that.
>>
>> And just to be clear, we should not be respecting that because Meta, or Alex
>> or anyone decided that they were doing the world a favor by working in the
>> open rather than being closed door, but simply because we cannot *presume*
>
> I don't see any favor for the community. What's the favor exactly?

There is no exchange of favors or "this" for "that"; this is not how a
community works. You bring your code to the table, solicit review feedback,
then go on to maintain it within your bounds, and, time permitting, beyond
your driver.

What we gain as a community is additional visibility, more API users
(eventually real world users, too), and therefore a somewhat more objective
way of coming up with new APIs and features, and just a broader understanding
of what is out there. This is better than speculation, since it creates a
less skewed mental model.

Let us say that someone at Meta wanted to get this core netdev feature that
could be super cool for others included in the upstream kernel: we would shut
it down on the basis that no user exists, and we would be right about doing
that. Turns out there is a user, but the driver lives out of tree, and now we
also reject that driver? Who benefits from doing that? Nobody. You need a
membership card to join the club that you can only enter if you have a
membership card already? No thank you.

> The only favor I see is the in the opposite direction, community giving
> Meta free cycles saving their backporting costs. Why?

Technically it would be both forward-porting cycles, since they would no
longer need to rebase the driver against their most recent kernel used, and
backporting cycles for the first kernel including fbnic onwards. That comes
almost for free these days anyways thanks to static analysis tools. The
overwhelming cost of the maintenance remains on Meta engineers, being the
only ones with access to the hardware. If they end up with customers in the
future, they can offload some of that to their customers, too.

Let's just look at a few high profile drivers by lines changed:

Summary for: drivers/net/ethernet/mellanox/mlxsw
  Total:     133422 (+),  44952 (-)
  Own:       131180 (+),  42725 (-)
  Community:   2242 (+) (1.680 %),  2227 (-) (4.954 %)

Summary for: drivers/net/ethernet/mellanox/mlx5
  Total:     265368 (+), 107690 (-)
  Own:       259213 (+), 100328 (-)
  Community:   6155 (+) (2.319 %),  7362 (-) (6.836 %)

Summary for: drivers/net/ethernet/broadcom/bnxt
  Total:      70355 (+),  25402 (-)
  Own:        68458 (+),  23358 (-)
  Community:   1897 (+) (2.696 %),  2044 (-) (8.047 %)

Summary for: drivers/net/ethernet/intel/e1000e/
  Total:      39760 (+),   9924 (-)
  Own:        38514 (+),   8905 (-)
  Community:   1246 (+) (3.134 %),  1019 (-) (10.268 %)

I admit this is simplistic, because both the mlxsw and mlx5 drivers helped
greatly improve the networking stack in parts that I benefited directly from
within DSA, for instance. The point is, you paid the maintenance price,
though; the community did not.

>
>> about their intentions and the future.
>
> Heh, the intention is pretty clear from this discussion, isn't it? If
> they ever by any chance decide to go public with their device, driver
> for that could be submitted at a time. But this is totally hypothetical.

I think your opposition is unreasonable and unfair.
Using your argument to the extreme, I may go as far as saying that it encourages working out of tree, rather than in tree. This is the exact opposite of what made Linux successful as an OS. Can I buy a Spectrum switch off Amazon?
On Tue, Apr 09, 2024 at 10:42:44AM -0700, Florian Fainelli wrote: > On 4/9/24 07:28, Jiri Pirko wrote: > > Tue, Apr 09, 2024 at 03:05:47PM CEST, f.fainelli@gmail.com wrote: > > > <...> > Can I buy a Spectrum switch off Amazon? You can buy latest Spectrum generation from eBay. https://www.ebay.com/itm/145138400557?itmmeta=01HV2247YN9ENJHP6T4YSN2HP7&hash=item21caec3d2d:g:qWoAAOSwzEZjiq2z&itmprp=enc%3AAQAJAAAA4CjGKSBxVTaO07qnpLBQGBwBJdGVCYhu730MrI5AC6E%2BERJLxS0EdlgE2gKtElk%2FZUj6A9DQR69IXIpTk%2FJbqEGKgCNR4d6TMz6i%2BogO02ZZBCkpLkPOYpOvDJV1jRv2KVOBt7i5k5pUMRJpwPKzAH%2Fwf6tglPOId2d9fSy%2BBM3MbDcbZfkv4V%2FNbItTgspvDnnMKAzUmR3Rs9%2FoHVDVbU4ZsnfRKFOMaKbGH5j%2Fani2jAqtbPEIA3H8nUcdpkxo4I61N5w9peLlN6Hkj8E8irdQY4TTzSTdYZ7EC9JG09cQ%7Ctkp%3ABk9SR8T_kMLYYw Thanks > -- > Florian > >
On Tue, Apr 9, 2024 at 10:12 AM Jason Gunthorpe <jgg@nvidia.com> wrote: > > On Tue, Apr 09, 2024 at 09:31:06AM -0700, Alexander Duyck wrote: > > > > expectation is generally things like: > > > > > > - The bug is fixed immediately because the issue is obvious to the > > > author > > > - Iteration and rapid progress is seen toward enlightening the author > > > - The patch is reverted, often rapidly, try again later with a good > > > patch > > > > When working on a development branch that shouldn't be the > > expectation. I suspect that is why the revert was pushed back on > > initially. The developer wanted a chance to try to debug and resolve > > the issue with root cause. > > Even mm-unstable drops patches on a hair trigger, as an example. > > You can't have an orderly development process if your development tree > is broken in your CI.. Personally I'm grateful for the people who test > linux-next (or the various constituent sub trees), it really helps. > > > Well much of it has to do with the fact that this is supposed to be a > > community. Generally I help you, you help me and together we both make > > progress. So within the community people tend to build up what we > > could call karma. Generally I think some of the messages sent seemed > > to make it come across that the Mellanox/Nvidia folks felt it "wasn't > > their problem" so they elicited a bit of frustration from the other > > maintainers and built up some negative karma. > > How could it be NVIDIA folks problem? They are not experts in TCP and > can't debug it. The engineer running the CI systems did what he was > asked by Eric from what I can tell. No, I get your message. I wasn't saying it was your problem. All that can be asked for is such cooperation. Like I said I think some of the problem was the messaging more than the process. > > phenomenon where if we even brushed against block of upstream code > > that wasn't being well maintained we would be asked to fix it up and > > address existing issues before we could upstream any patches. > > Well, Intel has it's own karma problems in the kernel community. :( Oh, I know. I resisted the urge to push out the driver as "idgaf: Internal Device Generated at Facebook" on April 1st instead of "fbnic" to poke fun at the presentation they did at Netdev 0x16 where they were trying to say all the vendors should be implementing "idpf" since they made it a standard. > > > In my view the vendor/!vendor distinction is really toxic and should > > > stop. > > > > I agree. However that was essentially what started all this when Jiri > > pointed out that we weren't selling the NIC to anyone else. That made > > this all about vendor vs !vendor, > > That is not how I would sum up Jiri's position. > > By my read he is saying that contributing code to the kernel that only > Meta can actually use is purely extractive. It is not about vendor or > !vendor, it is taking-free-forwardporting or not. You have argued, > and I would agree, that there is a grey scale between > extractive/collaborative - but I also agree with Jiri that fbnic is > undeniably far toward the extractive side. > > If being extractive is a problem in this case or not is another > question, but I would say Jiri's objection is definitely not about > selling or vendor vs !vendor. > > Jason It all depends on your definition of being extractive. 
I would assume a "consumer" that is running a large number of systems and is
capable of providing sophisticated feedback on issues found within the
kernel, in many cases providing fixes for said issues, or working with
maintainers on resolution of said issues, is not extractive.

The fact that said "consumer" decides to then produce their own device,
becoming a "prosumer", means they are now able to more accurately and quickly
diagnose issues when they see them. They can design things such that there
isn't some black box of firmware, or a third party driver, in the datapath
that prevents them from quickly diagnosing the issue. So if anything I would
think that is a net positive for the community, as it allows the "prosumer"
to provide much quicker and more useful feedback on issues found in the
kernel rather than having to wait on a third party vendor to provide
additional input.

Note I am not going after any particular vendor with my comments. This
applies to all vendors. The problem as a customer is that you are limited in
what you can do once you find an issue. Quite often you are at the mercy of
the vendor in such cases, especially when there seem to be either firmware or
"security" issues involved.

Thanks,

- Alex
On Tue, Apr 09, 2024 at 11:38:59AM -0700, Alexander Duyck wrote: > > > phenomenon where if we even brushed against block of upstream code > > > that wasn't being well maintained we would be asked to fix it up and > > > address existing issues before we could upstream any patches. > > > > Well, Intel has it's own karma problems in the kernel community. :( > > Oh, I know. I resisted the urge to push out the driver as "idgaf: > Internal Device Generated at Facebook" on April 1st instead of > "fbnic" That would have been hilarious! > to poke fun at the presentation they did at Netdev 0x16 where they > were trying to say all the vendors should be implementing "idpf" since > they made it a standard. Yes, I noticed this also. For all the worries I've heard lately about lack of commonality/etc it seems like a major missed ecosystem opportunity to have not invested in an industry standard. From what I can see fbnic has no hope of being anything more than a one-off generation for Meta. Too many silicon design micro-details are exposed to the OS. > It all depends on your definition of being extractive. I would assume > a "consumer" that is running a large number of systems and is capable > of providing sophisticated feedback on issues found within the kernel, > in many cases providing fixes for said issues, or working with > maintainers on resolution of said issues, is not extractive. I don't know, as I said there is some grey scale. IMHO it is not appropriate to make such decisions based on some company wide metric. fbnic team alone should be judged and shouldn't get a free ride based on the other good work Meta is doing. Otherwise it turns into a thing where bigger/richer companies just get to do whatever they want because they do the most "good" in aggregate. Jason
On Tue, Apr 09, 2024 at 11:38:59AM -0700, Alexander Duyck wrote:
> On Tue, Apr 9, 2024 at 10:12 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
> >
> > On Tue, Apr 09, 2024 at 09:31:06AM -0700, Alexander Duyck wrote:
> >
> > > > expectation is generally things like:
> > > >
> > > > - The bug is fixed immediately because the issue is obvious to the
> > > > author
> > > > - Iteration and rapid progress is seen toward enlightening the author
> > > > - The patch is reverted, often rapidly, try again later with a good
> > > > patch
> > >
> > > When working on a development branch that shouldn't be the
> > > expectation. I suspect that is why the revert was pushed back on
> > > initially. The developer wanted a chance to try to debug and resolve
> > > the issue with root cause.
> >
> > Even mm-unstable drops patches on a hair trigger, as an example.
> >
> > You can't have an orderly development process if your development tree
> > is broken in your CI.. Personally I'm grateful for the people who test
> > linux-next (or the various constituent sub trees), it really helps.
> >
> > > Well much of it has to do with the fact that this is supposed to be a
> > > community. Generally I help you, you help me and together we both make
> > > progress. So within the community people tend to build up what we
> > > could call karma. Generally I think some of the messages sent seemed
> > > to make it come across that the Mellanox/Nvidia folks felt it "wasn't
> > > their problem" so they elicited a bit of frustration from the other
> > > maintainers and built up some negative karma.
> >
> > How could it be NVIDIA folks problem? They are not experts in TCP and
> > can't debug it. The engineer running the CI systems did what he was
> > asked by Eric from what I can tell.
>
> No, I get your message. I wasn't saying it was your problem. All that
> can be asked for is such cooperation. Like I said I think some of the
> problem was the messaging more than the process.

The patch with the revert came a month+ after we reported the issue, and we
were ready to do anything to find the root cause, so it is not a messaging
issue, it was an exclusion-from-the-process issue.

I tried to avoid writing the below, but because Jason brought it up already,
I'll write my feelings.

Current netdev has a very toxic environment, with a binary separation into
vendors and not-vendors. Vendors are the bad guys who day and night try to
cheat and sneak their dirty hacks into the kernel. Their contributions are
negligible and can't be trusted by definition. Luckily enough, there are some
"not-vendors", and they are the good guys who know what is best for the
community and all the rest of the world.

Thanks
On Tue, Apr 9, 2024 at 11:55 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Tue, Apr 09, 2024 at 11:38:59AM -0700, Alexander Duyck wrote:
> > > > phenomenon where if we even brushed against block of upstream code
> > > > that wasn't being well maintained we would be asked to fix it up and
> > > > address existing issues before we could upstream any patches.
> > >
> > > Well, Intel has it's own karma problems in the kernel community. :(
> >
> > Oh, I know. I resisted the urge to push out the driver as "idgaf:
> > Internal Device Generated at Facebook" on April 1st instead of
> > "fbnic"
>
> That would have been hilarious!
>
> > to poke fun at the presentation they did at Netdev 0x16 where they
> > were trying to say all the vendors should be implementing "idpf" since
> > they made it a standard.
>
> Yes, I noticed this also. For all the worries I've heard lately about
> lack of commonality/etc it seems like a major missed ecosystem
> opportunity to have not invested in an industry standard. From what I
> can see fbnic has no hope of being anything more than a one-off
> generation for Meta. Too many silicon design micro-details are exposed
> to the OS.

I know. The fact is we aren't trying to abstract away anything, as that
would mean a larger firmware blob. That is the problem with an
abstraction like idpf: it just adds more overhead, as you have to have
the firmware manage more of the control plane.

> > It all depends on your definition of being extractive. I would assume
> > a "consumer" that is running a large number of systems and is capable
> > of providing sophisticated feedback on issues found within the kernel,
> > in many cases providing fixes for said issues, or working with
> > maintainers on resolution of said issues, is not extractive.
>
> I don't know, as I said there is some grey scale.
>
> IMHO it is not appropriate to make such decisions based on some
> company wide metric. fbnic team alone should be judged and shouldn't
> get a free ride based on the other good work Meta is doing. Otherwise
> it turns into a thing where bigger/richer companies just get to do
> whatever they want because they do the most "good" in aggregate.

The problem here in this case is that I am pretty much the heart of the
software driver team, with a few new hires onboarding over the next
couple of months. People were asking why others were jumping to my
defense; well, if we are going to judge the team, they are mostly
judging me. I'm just hoping my reputation has spoken for itself,
considering I was making significant contributions to the drivers even
as I have gone through several changes of employer.
On Wed, 03 Apr 2024 13:08:24 -0700 Alexander Duyck wrote: > This patch set includes the necessary patches to enable basic Tx and Rx > over the Meta Platforms Host Network Interface. To do this we introduce a > new driver and driver and directories in the form of > "drivers/net/ethernet/meta/fbnic". Let me try to restate some takeaways and ask for further clarification on the main question... First, I think there's broad support for merging the driver itself. IIUC there is also broad support to raise the expectations from maintainers of drivers for private devices, specifically that they will: - receive weaker "no regression" guarantees - help with refactoring / adapting their drivers more actively - not get upset when we delete those drivers if they stop participating If you think that the drivers should be merged *without* setting these expectations, please speak up. Nobody picked me up on the suggestion to use the CI as a proactive check whether the maintainer / owner is still paying attention, but okay :( What is less clear to me is what do we do about uAPI / core changes. Of those who touched on the subject - few people seem to be curious / welcoming to any reasonable features coming out for private devices (John, Olek, Florian)? Others are more cautious focusing on blast radius and referring to the "two driver rule" (Daniel, Paolo)? Whether that means outright ban on touching common code or uAPI in ways which aren't exercised by commercial NICs, is unclear. Andrew and Ed did not address the question directly AFAICT. Is my reading correct? Does anyone have an opinion on whether we should try to dig more into this question prior to merging the driver, and set some ground rules? Or proceed and learn by doing?
Jakub Kicinski wrote: > On Wed, 03 Apr 2024 13:08:24 -0700 Alexander Duyck wrote: > > This patch set includes the necessary patches to enable basic Tx and Rx > > over the Meta Platforms Host Network Interface. To do this we introduce a > > new driver and driver and directories in the form of > > "drivers/net/ethernet/meta/fbnic". > > Let me try to restate some takeaways and ask for further clarification > on the main question... > > First, I think there's broad support for merging the driver itself. > > IIUC there is also broad support to raise the expectations from > maintainers of drivers for private devices, specifically that they will: > - receive weaker "no regression" guarantees > - help with refactoring / adapting their drivers more actively > - not get upset when we delete those drivers if they stop participating > > If you think that the drivers should be merged *without* setting these > expectations, please speak up. > > Nobody picked me up on the suggestion to use the CI as a proactive > check whether the maintainer / owner is still paying attention, > but okay :( > > > What is less clear to me is what do we do about uAPI / core changes. > Of those who touched on the subject - few people seem to be curious / > welcoming to any reasonable features coming out for private devices > (John, Olek, Florian)? Others are more cautious focusing on blast > radius and referring to the "two driver rule" (Daniel, Paolo)? > Whether that means outright ban on touching common code or uAPI > in ways which aren't exercised by commercial NICs, is unclear. > Andrew and Ed did not address the question directly AFAICT. > > Is my reading correct? Does anyone have an opinion on whether we should > try to dig more into this question prior to merging the driver, and > set some ground rules? Or proceed and learn by doing? Thanks for summarizing. That was my reading too. Two distinct questions: 1. whether a standard driver is admissible if the device is not available on the open market. 2. whether new device features can be supported without at least two available devices supporting it. FWIW, +1 for 1 from me. Any serious device that exists in quantity and is properly maintained should be in-tree. In terms of trusting Meta, it is less about karma and more about these two requirements being met when the driver first appears. We would not want to merge vaporware drivers from unknown sources or university research projects. 2 is out of scope for this series. But I would always want to hear about potential new features that an organization finds valuable enough to implement, rather than have a blanket rule against them.
On Tue, Apr 09, 2024 at 01:03:06PM -0700, Alexander Duyck wrote: > People were asking why others were jumping to my defense, well if we > are going to judge the team they are mostly judging me. I'm just > hoping my reputation has spoken for itself considering I was making > significant contributions to the drivers even after I have gone > through several changes of employer. I don't think anything in this thread is about you personally. You were just given a "bad vendor" task by your employer and you carried it out. As is usual with these things there is legitimate disagreement on what it means to be a "bad vendor". Jason
> What is less clear to me is what do we do about uAPI / core changes. I would differentiate between core change and core additions. If there is very limited firmware on this device, I assume Linux is managing the SFP cage, and to some extent the PCS. Extending the core to handle these at higher speeds than currently supported would be one such core addition. I've no problem with this. And I doubt it will be a single NIC using such additions for too long. It looks like ClearFog CX LX2 could make use of such extensions as well, and there are probably other boards and devices, maybe the Zynq 7000? > Is my reading correct? Does anyone have an opinion on whether we should > try to dig more into this question prior to merging the driver, and > set some ground rules? Or proceed and learn by doing? I'm not too keen on keeping potentially shareable code in the driver just because of UEFI. It has long been the norm that you should not have wrappers so you can reuse code in different OSes. And UEFI is just another OS. So I really would like to see a Linux I2C bus master driver, a Linux GPIO driver if appropriate, and phylink being used, just as I've pushed wangxun to do, and to some extent Nvidia with their GPIO controller embedded in their NIC. The nice thing is, the developers for wangxun have mostly solved all this for a PCIe device, so their code can be copied. Do we need to set some ground rules? No. I can give similar feedback as I gave the wangxun developers, if Linux subsystems are not used appropriately. Andrew
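For reference, the I2C bus master driver Andrew is asking for does not amount to much code. A minimal sketch of registering a NIC-embedded I2C controller with the I2C core might look like this; every "mynic" name is a hypothetical placeholder and the actual register programming is elided, so this is only the shape of the thing, not fbnic's implementation:

#include <linux/i2c.h>
#include <linux/module.h>
#include <linux/pci.h>
#include <linux/string.h>

struct mynic {
	void __iomem *regs;
	struct i2c_adapter adap;
};

static int mynic_i2c_xfer(struct i2c_adapter *adap,
			  struct i2c_msg *msgs, int num)
{
	struct mynic *priv = i2c_get_adapdata(adap);

	/* Program priv->regs to clock each msg out on the wire,
	 * polling or sleeping until the controller is idle again.
	 */
	return num;	/* messages successfully transferred */
}

static u32 mynic_i2c_func(struct i2c_adapter *adap)
{
	return I2C_FUNC_I2C | I2C_FUNC_SMBUS_EMUL;
}

static const struct i2c_algorithm mynic_i2c_algo = {
	.master_xfer	= mynic_i2c_xfer,
	.functionality	= mynic_i2c_func,
};

static int mynic_i2c_register(struct mynic *priv, struct pci_dev *pdev)
{
	priv->adap.owner = THIS_MODULE;
	priv->adap.algo = &mynic_i2c_algo;
	priv->adap.dev.parent = &pdev->dev;
	strscpy(priv->adap.name, "mynic i2c", sizeof(priv->adap.name));
	i2c_set_adapdata(&priv->adap, priv);
	return i2c_add_adapter(&priv->adap);
}

Once the adapter is registered, the kernel's SFP/QSFP infrastructure can sit on top of it like on any other I2C bus, which is the reuse Andrew is after.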
Tue, Apr 09, 2024 at 11:06:05PM CEST, willemdebruijn.kernel@gmail.com wrote: >Jakub Kicinski wrote: >> On Wed, 03 Apr 2024 13:08:24 -0700 Alexander Duyck wrote: [...] > >2. whether new device features can be supported without at least > two available devices supporting it. > [...] > >2 is out of scope for this series. But I would always want to hear >about potential new features that an organization finds valuable >enough to implement. Rather than a blanket rule against them. This appears out of nowhere. In the past, I would say the vast majority of features were merged with a single device implementation. Often, it is the only device out there at the time that supports the feature. This limitation would put a brake on feature additions. I could list many features that would never have made it in (like 50% of the mlxsw driver). > >
Tue, Apr 09, 2024 at 10:51:42PM CEST, kuba@kernel.org wrote: >On Wed, 03 Apr 2024 13:08:24 -0700 Alexander Duyck wrote: >> This patch set includes the necessary patches to enable basic Tx and Rx >> over the Meta Platforms Host Network Interface. To do this we introduce a >> new driver and driver and directories in the form of >> "drivers/net/ethernet/meta/fbnic". > >Let me try to restate some takeaways and ask for further clarification >on the main question... > >First, I think there's broad support for merging the driver itself. > >IIUC there is also broad support to raise the expectations from >maintainers of drivers for private devices, specifically that they will: > - receive weaker "no regression" guarantees > - help with refactoring / adapting their drivers more actively :) > - not get upset when we delete those drivers if they stop participating Sorry for being a pain, but I would still like to see some summarization of what the gain actually is for the community in merging this unused driver. So far, I don't recall reading anything solid. btw: the Kconfig description should contain: Say N here, you can't ever see this device in the real world. > >If you think that the drivers should be merged *without* setting these >expectations, please speak up. > >Nobody picked me up on the suggestion to use the CI as a proactive >check whether the maintainer / owner is still paying attention, >but okay :( > > >What is less clear to me is what do we do about uAPI / core changes. >Of those who touched on the subject - few people seem to be curious / >welcoming to any reasonable features coming out for private devices >(John, Olek, Florian)? Others are more cautious focusing on blast >radius and referring to the "two driver rule" (Daniel, Paolo)? >Whether that means outright ban on touching common code or uAPI >in ways which aren't exercised by commercial NICs, is unclear. For these kinds of unused drivers, I think it would be legit to disallow any internal/external API changes. Just do that for some normal driver first, then let the unused driver benefit from the changes. Now the question is, how do we distinguish these 2 driver kinds? Maybe put them under some directory so it is clear? drivers/net/unused/ethernet/meta/fbnic/ >Andrew and Ed did not address the question directly AFAICT. > >Is my reading correct? Does anyone have an opinion on whether we should >try to dig more into this question prior to merging the driver, and >set some ground rules? Or proceed and learn by doing? >
Tue, Apr 09, 2024 at 10:03:06PM CEST, alexander.duyck@gmail.com wrote: >On Tue, Apr 9, 2024 at 11:55 AM Jason Gunthorpe <jgg@nvidia.com> wrote: >> >> On Tue, Apr 09, 2024 at 11:38:59AM -0700, Alexander Duyck wrote: >> > > > phenomenon where if we even brushed against block of upstream code >> > > > that wasn't being well maintained we would be asked to fix it up and >> > > > address existing issues before we could upstream any patches. >> > > >> > > Well, Intel has it's own karma problems in the kernel community. :( >> > >> > Oh, I know. I resisted the urge to push out the driver as "idgaf: >> > Internal Device Generated at Facebook" on April 1st instead of >> > "fbnic" >> >> That would have been hilarious! >> >> > to poke fun at the presentation they did at Netdev 0x16 where they >> > were trying to say all the vendors should be implementing "idpf" since >> > they made it a standard. >> >> Yes, I noticed this also. For all the worries I've heard lately about >> lack of commonality/etc it seems like a major missed ecosystem >> opportunity to have not invested in an industry standard. From what I >> can see fbnic has no hope of being anything more than a one-off >> generation for Meta. Too many silicon design micro-details are exposed >> to the OS. > >I know. The fact is we aren't trying to abstract away anything as that >would mean a larger firmware blob. That is the problem with an >abstraction like idpf is that it just adds more overhead as you have >to have the firmware manage more of the control plane. > >> > It all depends on your definition of being extractive. I would assume >> > a "consumer" that is running a large number of systems and is capable >> > of providing sophisticated feedback on issues found within the kernel, >> > in many cases providing fixes for said issues, or working with >> > maintainers on resolution of said issues, is not extractive. >> >> I don't know, as I said there is some grey scale. >> >> IMHO it is not appropriate to make such decisions based on some >> company wide metric. fbnic team alone should be judged and shouldn't >> get a free ride based on the other good work Meta is doing. Otherwise >> it turns into a thing where bigger/richer companies just get to do >> whatever they want because they do the most "good" in aggregate. > >The problem here in this case is that I am pretty much the heart of >the software driver team with a few new hires onboarding the next >couple months. People were asking why others were jumping to my >defense, well if we are going to judge the team they are mostly >judging me. I'm just hoping my reputation has spoken for itself >considering I was making significant contributions to the drivers even >after I have gone through several changes of employer. Let me clearly state two things this thread keeps running around all the time: 1) This has nothing to do with you, Alex, at all. If John Doe were the one pushing this, from my perspective everything would be exactly the same. 2) This is not about selling devices. I already stressed that multiple times, yet people are still talking about selling. This is about the possibility of a person outside Meta meeting the device in the real world with a real use case (not emulated, that does not make any sense).
From: Jiri Pirko <jiri@resnulli.us> Date: Tue, 9 Apr 2024 16:41:10 +0200 > Tue, Apr 09, 2024 at 03:11:21PM CEST, aleksander.lobakin@intel.com wrote: >> From: Jiri Pirko <jiri@resnulli.us> >> Date: Tue, 9 Apr 2024 13:01:51 +0200 >> >>> Mon, Apr 08, 2024 at 07:32:59PM CEST, john.fastabend@gmail.com wrote: >>>> Jiri Pirko wrote: >>>>> Mon, Apr 08, 2024 at 05:46:35PM CEST, alexander.duyck@gmail.com wrote: >>>>>> On Mon, Apr 8, 2024 at 4:51 AM Jiri Pirko <jiri@resnulli.us> wrote: >>>>>>> >>>>>>> Fri, Apr 05, 2024 at 08:38:25PM CEST, alexander.duyck@gmail.com wrote: >>>>>>>> On Fri, Apr 5, 2024 at 8:17 AM Jason Gunthorpe <jgg@nvidia.com> wrote: >>>>>>>>> >>>>>>>>> On Fri, Apr 05, 2024 at 07:24:32AM -0700, Alexander Duyck wrote: >>>>>>>>>>> Alex already indicated new features are coming, changes to the core >>>>>>>>>>> code will be proposed. How should those be evaluated? Hypothetically >>>>>>>>>>> should fbnic be allowed to be the first implementation of something >>>>>>>>>>> invasive like Mina's DMABUF work? Google published an open userspace >>>>>>>>>>> for NCCL that people can (in theory at least) actually run. Meta would >>>>>>>>>>> not be able to do that. I would say that clearly crosses the line and >>>>>>>>>>> should not be accepted. >>>>>>>>>> >>>>>>>>>> Why not? Just because we are not commercially selling it doesn't mean >>>>>>>>>> we couldn't look at other solutions such as QEMU. If we were to >>>>>>>>>> provide a github repo with an emulation of the NIC would that be >>>>>>>>>> enough to satisfy the "commercial" requirement? >>>>>>>>> >>>>>>>>> My test is not "commercial", it is enabling open source ecosystem vs >>>>>>>>> benefiting only proprietary software. >>>>>>>> >>>>>>>> Sorry, that was where this started where Jiri was stating that we had >>>>>>>> to be selling this. >>>>>>> >>>>>>> For the record, I never wrote that. Not sure why you repeat this over >>>>>>> this thread. >>>>>> >>>>>> Because you seem to be implying that the Meta NIC driver shouldn't be >>>>>> included simply since it isn't going to be available outside of Meta. >> >> BTW idpf is also not something you can go and buy in a store, but it's >> here in the kernel. Anyway, see below. > > IDK, why so many people in this thread are so focused on "buying" nic. > IDPF device is something I assume one may see on a VM hosted in some > cloud, isn't it? If yes, it is completely legit to have it in. Do I miss > something? Anyhow, we want the upstream Linux kernel to work out of the box on most systems. Rejecting this driver basically encourages people to still prefer OOT/proprietary crap. > > >> >>>>>> The fact is Meta employs a number of kernel developers and as a result >>>>>> of that there will be a number of kernel developers that will have >>>>>> access to this NIC and likely do development on systems containing it. >> >> [...] >> >>>> Vendors would happily spin up a NIC if a DC with scale like this >>>> would pay for it. They just don't advertise it in patch 0/X, >>>> "adding device for cloud provider foo". >>>> >>>> There is no difference here. We gain developers, we gain insights, >>>> learnings and Linux and OSS drivers are running on another big >>>> DC. They improve things and find bugs they upstream them its a win. >>>> >>>> The opposite is also true if we exclude a driver/NIC HW that is >>>> running on major DCs we lose a lot of insight, experience, value. >>> >>> Could you please describe in details and examples what exactly is we >>> are about to loose? I don't see it.
>> >> As long as driver A introduces new features / improvements / API / >> whatever to the core kernel, we benefit from this no matter whether I'm >> actually able to run this driver on my system. >> >> Some drivers even give us benefit by that they are of good quality (I >> don't speak for this driver, just some hypothetical) and/or have >> interesting design / code / API / etc. choices. The drivers I work on >> did gain a lot just from that I was reading new commits / lore threads >> and look at changes in other drivers. >> >> I saw enough situations when driver A started using/doing something the >> way it wasn't ever done anywhere before, and then more and more drivers >> stated doing the same thing and at the end it became sorta standard. > > So bottom line is, the unused driver *may* introduce some features and > *may* provide as an example of how to do things for other people. > Is this really that beneficial for the community that it overweights > the obvious cons (not going to repeat them)? > > Like with any other patch/set we merge in, we always look at the cons > and pros. I'm honestly surprised that so many people here > want to make exception for Meta's internal toy project. It's not me who wants to make an exception. I haven't ever seen a driver rejected due to "it can be used only somewhere where I can't go", so looks like it's you who wants to make an exception :> Thanks, Olek
Wed, Apr 10, 2024 at 01:45:54PM CEST, aleksander.lobakin@intel.com wrote: >From: Jiri Pirko <jiri@resnulli.us> >Date: Tue, 9 Apr 2024 16:41:10 +0200 > >> Tue, Apr 09, 2024 at 03:11:21PM CEST, aleksander.lobakin@intel.com wrote: >>> From: Jiri Pirko <jiri@resnulli.us> >>> Date: Tue, 9 Apr 2024 13:01:51 +0200 >>> >>>> Mon, Apr 08, 2024 at 07:32:59PM CEST, john.fastabend@gmail.com wrote: >>>>> Jiri Pirko wrote: >>>>>> Mon, Apr 08, 2024 at 05:46:35PM CEST, alexander.duyck@gmail.com wrote: >>>>>>> On Mon, Apr 8, 2024 at 4:51 AM Jiri Pirko <jiri@resnulli.us> wrote: >>>>>>>> >>>>>>>> Fri, Apr 05, 2024 at 08:38:25PM CEST, alexander.duyck@gmail.com wrote: >>>>>>>>> On Fri, Apr 5, 2024 at 8:17 AM Jason Gunthorpe <jgg@nvidia.com> wrote: >>>>>>>>>> >>>>>>>>>> On Fri, Apr 05, 2024 at 07:24:32AM -0700, Alexander Duyck wrote: >>>>>>>>>>>> Alex already indicated new features are coming, changes to the core >>>>>>>>>>>> code will be proposed. How should those be evaluated? Hypothetically >>>>>>>>>>>> should fbnic be allowed to be the first implementation of something >>>>>>>>>>>> invasive like Mina's DMABUF work? Google published an open userspace >>>>>>>>>>>> for NCCL that people can (in theory at least) actually run. Meta would >>>>>>>>>>>> not be able to do that. I would say that clearly crosses the line and >>>>>>>>>>>> should not be accepted. >>>>>>>>>>> >>>>>>>>>>> Why not? Just because we are not commercially selling it doesn't mean >>>>>>>>>>> we couldn't look at other solutions such as QEMU. If we were to >>>>>>>>>>> provide a github repo with an emulation of the NIC would that be >>>>>>>>>>> enough to satisfy the "commercial" requirement? >>>>>>>>>> >>>>>>>>>> My test is not "commercial", it is enabling open source ecosystem vs >>>>>>>>>> benefiting only proprietary software. >>>>>>>>> >>>>>>>>> Sorry, that was where this started where Jiri was stating that we had >>>>>>>>> to be selling this. >>>>>>>> >>>>>>>> For the record, I never wrote that. Not sure why you repeat this over >>>>>>>> this thread. >>>>>>> >>>>>>> Because you seem to be implying that the Meta NIC driver shouldn't be >>>>>>> included simply since it isn't going to be available outside of Meta. >>> >>> BTW idpf is also not something you can go and buy in a store, but it's >>> here in the kernel. Anyway, see below. >> >> IDK, why so many people in this thread are so focused on "buying" nic. >> IDPF device is something I assume one may see on a VM hosted in some >> cloud, isn't it? If yes, it is completely legit to have it in. Do I miss >> something? > >Anyhow, we want the upstream Linux kernel to work out of box on most >systems. Rejecting this driver basically encourages to still prefer >OOT/proprietary crap. Totally true. Out of the box on as many systems as possible. This is not the case here. Can you show me an example of how a person outside of Meta can benefit from using this driver out of the box? > >> >> >>> >>>>>>> The fact is Meta employs a number of kernel developers and as a result >>>>>>> of that there will be a number of kernel developers that will have >>>>>>> access to this NIC and likely do development on systems containing it. >>> >>> [...] >>> >>>>> Vendors would happily spin up a NIC if a DC with scale like this >>>>> would pay for it. They just don't advertise it in patch 0/X, >>>>> "adding device for cloud provider foo". >>>>> >>>>> There is no difference here. We gain developers, we gain insights, >>>>> learnings and Linux and OSS drivers are running on another big >>>>> DC.
They improve things and find bugs they upstream them its a win. >>>>> >>>>> The opposite is also true if we exclude a driver/NIC HW that is >>>>> running on major DCs we lose a lot of insight, experience, value. >>>> >>>> Could you please describe in details and examples what exactly is we >>>> are about to loose? I don't see it. >>> >>> As long as driver A introduces new features / improvements / API / >>> whatever to the core kernel, we benefit from this no matter whether I'm >>> actually able to run this driver on my system. >>> >>> Some drivers even give us benefit by that they are of good quality (I >>> don't speak for this driver, just some hypothetical) and/or have >>> interesting design / code / API / etc. choices. The drivers I work on >>> did gain a lot just from that I was reading new commits / lore threads >>> and look at changes in other drivers. >>> >>> I saw enough situations when driver A started using/doing something the >>> way it wasn't ever done anywhere before, and then more and more drivers >>> stated doing the same thing and at the end it became sorta standard. >> >> So bottom line is, the unused driver *may* introduce some features and >> *may* provide as an example of how to do things for other people. >> Is this really that beneficial for the community that it overweights >> the obvious cons (not going to repeat them)? >> >> Like with any other patch/set we merge in, we always look at the cons >> and pros. I'm honestly surprised that so many people here >> want to make exception for Meta's internal toy project. > >It's not me who wants to make an exception. I haven't ever seen a driver >rejected due to "it can be used only somewhere where I can't go", so >looks like it's you who wants to make an exception :> Could you please point me to some other existing driver for a device that does not exist (or exists only in some person's backyard)? > >Thanks, >Olek
On 4/10/24 09:42, Jiri Pirko wrote: > Tue, Apr 09, 2024 at 10:51:42PM CEST, kuba@kernel.org wrote: >> On Wed, 03 Apr 2024 13:08:24 -0700 Alexander Duyck wrote: >>> This patch set includes the necessary patches to enable basic Tx and Rx >>> over the Meta Platforms Host Network Interface. To do this we introduce a >>> new driver and driver and directories in the form of >>> "drivers/net/ethernet/meta/fbnic". >> >> Let me try to restate some takeaways and ask for further clarification >> on the main question... >> >> First, I think there's broad support for merging the driver itself. >> >> IIUC there is also broad support to raise the expectations from >> maintainers of drivers for private devices, specifically that they will: >> - receive weaker "no regression" guarantees >> - help with refactoring / adapting their drivers more actively > > :) > > >> - not get upset when we delete those drivers if they stop participating > > Sorry for being pain, but I would still like to see some sumarization of > what is actually the gain for the community to merge this unused driver. > So far, I don't recall to read anything solid. For me personally, both as a developer and as a user, any movement in the lean-FW direction is a breath of fresh air. And nobody is stopping Nvidia or any other vendor from manufacturing an Advanced FBNIC Accelerator TM that uses the driver as-is, but makes it faster, better and cheaper than anything you could buy right now. > > btw: > Kconfig description should contain: > Say N here, you can't ever see this device in real world. > Thank you for keeping this entertaining :) > >> >> If you think that the drivers should be merged *without* setting these >> expectations, please speak up. >> >> Nobody picked me up on the suggestion to use the CI as a proactive >> check whether the maintainer / owner is still paying attention, >> but okay :( >> >> >> What is less clear to me is what do we do about uAPI / core changes. >> Of those who touched on the subject - few people seem to be curious / >> welcoming to any reasonable features coming out for private devices >> (John, Olek, Florian)? Others are more cautious focusing on blast >> radius and referring to the "two driver rule" (Daniel, Paolo)? >> Whether that means outright ban on touching common code or uAPI >> in ways which aren't exercised by commercial NICs, is unclear. > > For these kind of unused drivers, I think it would be legit to > disallow any internal/external api changes. Just do that for some > normal driver, then benefit from the changes in the unused driver. > > Now the question is, how to distinguish these 2 driver kinds? Maybe to > put them under some directory so it is clear? > drivers/net/unused/ethernet/meta/fbnic/ > > >> Andrew and Ed did not address the question directly AFAICT. >> >> Is my reading correct? Does anyone have an opinion on whether we should >> try to dig more into this question prior to merging the driver, and >> set some ground rules? Or proceed and learn by doing? >> >
On Wed, 10 Apr 2024 09:42:14 +0200 Jiri Pirko wrote: > > - not get upset when we delete those drivers if they stop participating > > Sorry for being pain, but I would still like to see some sumarization of > what is actually the gain for the community to merge this unused driver. > So far, I don't recall to read anything solid. From the discussion I think some folks made the point that it's educational to see what big companies do, and seeing the work may lead to reuse and other people adopting features / ideas. > btw: > Kconfig description should contain: > Say N here, you can't ever see this device in real world. We do use standard distro kernels in some corners of the DC, AFAIU. > >If you think that the drivers should be merged *without* setting these > >expectations, please speak up. > > > >Nobody picked me up on the suggestion to use the CI as a proactive > >check whether the maintainer / owner is still paying attention, > >but okay :( > > > > > >What is less clear to me is what do we do about uAPI / core changes. > >Of those who touched on the subject - few people seem to be curious / > >welcoming to any reasonable features coming out for private devices > >(John, Olek, Florian)? Others are more cautious focusing on blast > >radius and referring to the "two driver rule" (Daniel, Paolo)? > >Whether that means outright ban on touching common code or uAPI > >in ways which aren't exercised by commercial NICs, is unclear. > > For these kind of unused drivers, I think it would be legit to > disallow any internal/external api changes. Just do that for some > normal driver, then benefit from the changes in the unused driver. Unused is a bit strong, and we didn't put netdevsim in a special directory. Let's see if more such drivers appear and if there are practical uses for the separation for scripts etc?
Wed, Apr 10, 2024 at 03:46:11PM CEST, kuba@kernel.org wrote: >On Wed, 10 Apr 2024 09:42:14 +0200 Jiri Pirko wrote: >> > - not get upset when we delete those drivers if they stop participating >> >> Sorry for being pain, but I would still like to see some sumarization of >> what is actually the gain for the community to merge this unused driver. >> So far, I don't recall to read anything solid. > >From the discussion I think some folks made the point that it's >educational to see what big companies do, and seeing the work >may lead to reuse and other people adopting features / ideas. Okay, if that's all, does it justify the cons? Will someone put this on the scales? > >> btw: >> Kconfig description should contain: >> Say N here, you can't ever see this device in real world. > >We do use standard distro kernels in some corners of the DC, AFAIU. I find it amusing to think about a distro vendor, for example Red Hat, supporting a driver for a proprietary private device. > >> >If you think that the drivers should be merged *without* setting these >> >expectations, please speak up. >> > >> >Nobody picked me up on the suggestion to use the CI as a proactive >> >check whether the maintainer / owner is still paying attention, >> >but okay :( >> > >> > >> >What is less clear to me is what do we do about uAPI / core changes. >> >Of those who touched on the subject - few people seem to be curious / >> >welcoming to any reasonable features coming out for private devices >> >(John, Olek, Florian)? Others are more cautious focusing on blast >> >radius and referring to the "two driver rule" (Daniel, Paolo)? >> >Whether that means outright ban on touching common code or uAPI >> >in ways which aren't exercised by commercial NICs, is unclear. >> >> For these kind of unused drivers, I think it would be legit to >> disallow any internal/external api changes. Just do that for some >> normal driver, then benefit from the changes in the unused driver. > >Unused is a bit strong, and we didn't put netdevsim in a special >directory. Let's see if more such drivers appear and if there >are practical uses for the separation for scripts etc? The practical use I see is that the reviewer would spot right away when someone pushes a feature implemented in this unused driver only. Say it would be a clear mark for a driver of a lower category. For the person doing an API change it would be an indication that they do not have to be that cautious about not breaking anything in this driver. The driver maintainer should be the one to deal with potential issues. With this clear marking and Documentation describing it, I think I would be OK with letting this in, FWIW.
On Tue, Apr 9, 2024 at 4:42 PM Andrew Lunn <andrew@lunn.ch> wrote: > > > What is less clear to me is what do we do about uAPI / core changes. > > I would differentiate between core change and core additions. If there > is very limited firmware on this device, i assume Linux is managing > the SFP cage, and to some extend the PCS. Extending the core to handle > these at higher speeds than currently supported would be one such core > addition. I've no problem with this. And i doubt it will be a single > NIC using such additions for too long. It looks like ClearFog CX LX2 > could make use of such extensions as well, and there are probably > other boards and devices, maybe the Zynq 7000? The driver on this device doesn't have full access to the PHY. Basically we control everything from the PCS north, and the firmware controls everything from the PMA south, as the physical connection is MUXed between 4 slices. So this means the firmware also controls all the I2C and the QSFP and EEPROM. The main reason for this is that those blocks are shared resources between the slices; as such, the firmware acts as the arbitrator for the 4 slices and the BMC. > > Is my reading correct? Does anyone have an opinion on whether we should > > try to dig more into this question prior to merging the driver, and > > set some ground rules? Or proceed and learn by doing? > > I'm not too keen on keeping potentially shareable code in the driver > just because of UEFI. It has long been the norm that you should not > have wrappers so you can reuse code in different OSes. And UEFI is > just another OS. So i really would like to see a Linux I2C bus master > driver, a linux GPIO driver if appropriate, and using phylink, just as > i've pushed wangxun to do that, and to some extend nvidia with their > GPIO controller embedded in their NIC. The nice thing is, the > developers for wangxun has mostly solved all this for a PCIe device, > so their code can be copied. Well, when you are a one-man driver development team, sharing code definitely makes one's life much easier when you have to maintain multiple drivers. That said, I will see what I can do to comply with your requests while hopefully not increasing my maintenance burden too much. If nothing else, I might just write myself a kernel compatibility shim in UEFI so I can reuse the code that way. The driver has no access to I2C or GPIO. The QSFP and EEPROM are shared, so I don't have direct access from the driver, and I will have to send messages over the Host/FW mailbox to change any of that. > Do we need to set some ground rules? No. I can give similar feedback > as i gave the wangxun developers, if Linux subsystems are not used > appropriately. > > Andrew Yeah, I kind of assume that will always be the case. Rules only mean something if they are enforced anyway. Although having the rules documented does help in terms of making them known to all those involved. Anyway, I have already incorporated most of the feedback from the other maintainers. The two requests you had were a bit more difficult as they are in areas I am not entirely familiar with, so I thought I would check in with you before I dive into them. One request I recall you having was to look at using the LED framework for the LEDs. I think I have enough info to chase that down and get it resolved for v2. However, if you have some examples you would prefer I follow, I can look into those. As far as the PCS block goes, does it matter whether I expose what the actual underlying IP is?
My request for info on what I can disclose seems to be moving at the speed of bureaucracy, so I don't know how long it would take if I were to try to write something truly generic. That said, if I just put together something that refers to this as pcs-fbnic for now, would that work to make this acceptable for upstream?
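With the firmware owning the module I2C bus as described, a driver in this position typically forwards module EEPROM reads (ethtool -m) to the firmware instead of exposing an I2C adapter. A rough sketch against the kernel's real get_module_eeprom_by_page() ethtool hook follows; mynic_fw_qsfp_read() is a purely hypothetical mailbox helper, since fbnic's actual host/FW message format is not shown in this thread:

#include <linux/ethtool.h>
#include <linux/netdevice.h>
#include <linux/netlink.h>

struct mynic;	/* hypothetical driver private struct */

/* Hypothetical: one mailbox round trip to the FW, which owns the bus */
int mynic_fw_qsfp_read(struct mynic *priv, u8 page, u8 bank, u8 i2c_address,
		       u32 offset, u32 length, u8 *data);

static int mynic_get_module_eeprom_by_page(struct net_device *dev,
					   const struct ethtool_module_eeprom *page,
					   struct netlink_ext_ack *extack)
{
	struct mynic *priv = netdev_priv(dev);
	int err;

	/* The FW arbitrates the shared QSFP between the four slices and
	 * the BMC, so this is a mailbox transaction, not an I2C transfer.
	 */
	err = mynic_fw_qsfp_read(priv, page->page, page->bank,
				 page->i2c_address, page->offset,
				 page->length, page->data);
	if (err) {
		NL_SET_ERR_MSG_MOD(extack, "FW rejected module read");
		return err;
	}

	return page->length;	/* bytes read on success */
}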
On Wed, 10 Apr 2024 17:12:18 +0200 Jiri Pirko wrote: > >> For these kind of unused drivers, I think it would be legit to > >> disallow any internal/external api changes. Just do that for some > >> normal driver, then benefit from the changes in the unused driver. > > > >Unused is a bit strong, and we didn't put netdevsim in a special > >directory. Let's see if more such drivers appear and if there > >are practical uses for the separation for scripts etc? > > The practical use I see that the reviewer would spot right away is > someone pushes a feature implemented in this unused driver only. > Say it would be a clear mark for a driver of lower category. > For the person doing API change it would be an indication that he > does not have that cautious to not to break anything in this driver. > The driver maintainer should be the one to deal with potential issues. Hm, we currently group by vendor but the fact it's a private device is probably more important indeed. For example if Google submits a driver for a private device it may be confusing what's public cloud (which I think/hope GVE is) and what's fully private. So we could categorize by the characteristic rather than vendor: drivers/net/ethernet/${term}/fbnic/ I'm afraid it may be hard for us to agree on an accurate term, tho. "Unused" sounds.. odd, we don't keep unused code, "private" sounds like we granted someone special right not took some away, maybe "exclusive"? Or "besteffort"? Or "staging" :D IDK. > With this clear marking and Documentation to describe it, I think I > would be ok to let this in, FWIW.
On 4/10/2024 10:35 AM, Jakub Kicinski wrote: > On Wed, 10 Apr 2024 17:12:18 +0200 Jiri Pirko wrote: >>>> For these kind of unused drivers, I think it would be legit to >>>> disallow any internal/external api changes. Just do that for some >>>> normal driver, then benefit from the changes in the unused driver. >>> >>> Unused is a bit strong, and we didn't put netdevsim in a special >>> directory. Let's see if more such drivers appear and if there >>> are practical uses for the separation for scripts etc? >> >> The practical use I see that the reviewer would spot right away is >> someone pushes a feature implemented in this unused driver only. >> Say it would be a clear mark for a driver of lower category. >> For the person doing API change it would be an indication that he >> does not have that cautious to not to break anything in this driver. >> The driver maintainer should be the one to deal with potential issues. > > Hm, we currently group by vendor but the fact it's a private device > is probably more important indeed. For example if Google submits > a driver for a private device it may be confusing what's public > cloud (which I think/hope GVE is) and what's fully private. > > So we could categorize by the characteristic rather than vendor: > > drivers/net/ethernet/${term}/fbnic/ > > I'm afraid it may be hard for us to agree on an accurate term, tho. > "Unused" sounds.. odd, we don't keep unused code, "private" > sounds like we granted someone special right not took some away, > maybe "exclusive"? Or "besteffort"? Or "staging" :D IDK. Do we really need that categorization at the directory/filesystem level? cannot we just document it clearly in the Kconfig help text and under Documentation/networking/?
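If the Kconfig route were taken, the help text could state the availability caveat directly. A hypothetical entry along these lines (the dependencies and wording here are illustrative, not fbnic's actual Kconfig):

config FBNIC
	tristate "Meta Platforms Host Network Interface"
	depends on PCI_MSI
	help
	  Support for the Meta Platforms Host Network Interface, a NIC
	  deployed only inside Meta data centers. The hardware is not
	  commercially available; unless you are building kernels to run
	  on Meta fleet machines, it is safe to say N.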
On Wed, 10 Apr 2024 10:39:11 -0700 Florian Fainelli wrote: > > Hm, we currently group by vendor but the fact it's a private device > > is probably more important indeed. For example if Google submits > > a driver for a private device it may be confusing what's public > > cloud (which I think/hope GVE is) and what's fully private. > > > > So we could categorize by the characteristic rather than vendor: > > > > drivers/net/ethernet/${term}/fbnic/ > > > > I'm afraid it may be hard for us to agree on an accurate term, tho. > > "Unused" sounds.. odd, we don't keep unused code, "private" > > sounds like we granted someone special right not took some away, > > maybe "exclusive"? Or "besteffort"? Or "staging" :D IDK. > > Do we really need that categorization at the directory/filesystem level? > cannot we just document it clearly in the Kconfig help text and under > Documentation/networking/? From the reviewer perspective I think we will just remember. If some newcomer tries to do refactoring they may benefit from seeing this is a special device and more help is offered. Dunno if a newcomer would look at the right docs. Whether it's more "paperwork" than we'll actually gain, I have no idea. I may not be the best person to comment.
On 4/10/2024 10:56 AM, Jakub Kicinski wrote: > On Wed, 10 Apr 2024 10:39:11 -0700 Florian Fainelli wrote: >>> Hm, we currently group by vendor but the fact it's a private device >>> is probably more important indeed. For example if Google submits >>> a driver for a private device it may be confusing what's public >>> cloud (which I think/hope GVE is) and what's fully private. >>> >>> So we could categorize by the characteristic rather than vendor: >>> >>> drivers/net/ethernet/${term}/fbnic/ >>> >>> I'm afraid it may be hard for us to agree on an accurate term, tho. >>> "Unused" sounds.. odd, we don't keep unused code, "private" >>> sounds like we granted someone special right not took some away, >>> maybe "exclusive"? Or "besteffort"? Or "staging" :D IDK. >> >> Do we really need that categorization at the directory/filesystem level? >> cannot we just document it clearly in the Kconfig help text and under >> Documentation/networking/? > > From the reviewer perspective I think we will just remember. > If some newcomer tries to do refactoring they may benefit from seeing > this is a special device and more help is offered. Dunno if a newcomer > would look at the right docs. > > Whether it's more "paperwork" than we'll actually gain, I have no idea. > I may not be the best person to comment. To me it is starting to feel like more paperwork than warranted, although I cannot really think about an "implied" metric that we could track, short of monitoring patches/bug reports coming from outside of the original driver authors/owners as an indication of how widely utilized a given driver is. The number of changes to the driver between release cycles is not a good indication either: a driver with few users presents the most agile configuration, but similarly, a very actively used driver with real-world users may see a large number of changes between releases based upon its use. What we need is some sort of popularity contest tracking :)
On Wed, Apr 10, 2024 at 10:56 AM Jakub Kicinski <kuba@kernel.org> wrote: > > On Wed, 10 Apr 2024 10:39:11 -0700 Florian Fainelli wrote: > > > Hm, we currently group by vendor but the fact it's a private device > > > is probably more important indeed. For example if Google submits > > > a driver for a private device it may be confusing what's public > > > cloud (which I think/hope GVE is) and what's fully private. > > > > > > So we could categorize by the characteristic rather than vendor: > > > > > > drivers/net/ethernet/${term}/fbnic/ > > > > > > I'm afraid it may be hard for us to agree on an accurate term, tho. > > > "Unused" sounds.. odd, we don't keep unused code, "private" > > > sounds like we granted someone special right not took some away, > > > maybe "exclusive"? Or "besteffort"? Or "staging" :D IDK. > > > > Do we really need that categorization at the directory/filesystem level? > > cannot we just document it clearly in the Kconfig help text and under > > Documentation/networking/? > > From the reviewer perspective I think we will just remember. > If some newcomer tries to do refactoring they may benefit from seeing > this is a special device and more help is offered. Dunno if a newcomer > would look at the right docs. > > Whether it's more "paperwork" than we'll actually gain, I have no idea. > I may not be the best person to comment. Are we going to go through and retroactively move some of the drivers that are already there that are exclusive to specific companies? That is the bigger issue as I see it. It has already been brought up that idpf is exclusive. In addition several other people have reached out to me about other devices that are exclusive to other organizations. I don't see any value in it as it would just encourage people to lie in order to avoid being put in what would essentially become a blacklisted directory. If we are going to try to come up with some special status, maybe it makes sense to have a status in the MAINTAINERS file that would indicate that a driver is exclusive to some organization and not publicly available, so any maintenance would have to be proprietary.
On 4/10/2024 11:01 AM, Alexander Duyck wrote: > On Wed, Apr 10, 2024 at 10:56 AM Jakub Kicinski <kuba@kernel.org> wrote: >> >> On Wed, 10 Apr 2024 10:39:11 -0700 Florian Fainelli wrote: >>>> Hm, we currently group by vendor but the fact it's a private device >>>> is probably more important indeed. For example if Google submits >>>> a driver for a private device it may be confusing what's public >>>> cloud (which I think/hope GVE is) and what's fully private. >>>> >>>> So we could categorize by the characteristic rather than vendor: >>>> >>>> drivers/net/ethernet/${term}/fbnic/ >>>> >>>> I'm afraid it may be hard for us to agree on an accurate term, tho. >>>> "Unused" sounds.. odd, we don't keep unused code, "private" >>>> sounds like we granted someone special right not took some away, >>>> maybe "exclusive"? Or "besteffort"? Or "staging" :D IDK. >>> >>> Do we really need that categorization at the directory/filesystem level? >>> cannot we just document it clearly in the Kconfig help text and under >>> Documentation/networking/? >> >> From the reviewer perspective I think we will just remember. >> If some newcomer tries to do refactoring they may benefit from seeing >> this is a special device and more help is offered. Dunno if a newcomer >> would look at the right docs. >> >> Whether it's more "paperwork" than we'll actually gain, I have no idea. >> I may not be the best person to comment. > > Are we going to go through and retro-actively move some of the drivers > that are already there that are exclusive to specific companies? That > is the bigger issue as I see it. It has already been brought up that > idpf is exclusive. In addition several other people have reached out > to me about other devices that are exclusive to other organizations. > > I don't see any value in it as it would just encourage people to lie > in order to avoid being put in what would essentially become a > blacklisted directory. Agreed. > > If we are going to be trying to come up with some special status maybe > it makes sense to have some status in the MAINTAINERS file that would > indicate that this driver is exclusive to some organization and not > publicly available so any maintenance would have to be proprietary. I like that idea.
On Wed, 10 Apr 2024 11:29:57 -0700 Florian Fainelli wrote: > > If we are going to be trying to come up with some special status maybe > > it makes sense to have some status in the MAINTAINERS file that would > > indicate that this driver is exclusive to some organization and not > > publicly available so any maintenance would have to be proprietary. > > I like that idea. +1, also first idea that came to mind but I was too afraid of bike shedding to mention it :) Fingers crossed? :)
On Wed, Apr 10, 2024 at 08:56:31AM -0700, Alexander Duyck wrote: > On Tue, Apr 9, 2024 at 4:42 PM Andrew Lunn <andrew@lunn.ch> wrote: > > > > > What is less clear to me is what do we do about uAPI / core changes. > > > > I would differentiate between core change and core additions. If there > > is very limited firmware on this device, i assume Linux is managing > > the SFP cage, and to some extend the PCS. Extending the core to handle > > these at higher speeds than currently supported would be one such core > > addition. I've no problem with this. And i doubt it will be a single > > NIC using such additions for too long. It looks like ClearFog CX LX2 > > could make use of such extensions as well, and there are probably > > other boards and devices, maybe the Zynq 7000? > > The driver on this device doesn't have full access over the PHY. > Basically we control everything from the PCS north, and the firmware > controls everything from the PMA south as the physical connection is > MUXed between 4 slices. So this means the firmware also controls all > the I2C and the QSFP and EEPROM. The main reason for this is that > those blocks are shared resources between the slices, as such the > firmware acts as the arbitrator for 4 slices and the BMC. Ah, shame. You took what is probably the least valuable intellectual property, and the most shareable with the community, and locked it up in firmware where nobody can use it. You should probably stop saying there is not much firmware with this device, and that Linux controls it. It clearly does not... Andrew
On Wed, 10 Apr 2024 11:00:35 -0700 Florian Fainelli wrote: > although I cannot really think about an "implied" metric that we could > track, short of monitoring patches/bug reports coming from outside of > the original driver authors/owners as an indication of how widely > utilized a given driver is. Not a metric, just to clarify. I think the discussion started from my email saying: - help with refactoring / adapting their drivers more actively and that may be an empty promise if the person doing the refactoring does not know they could ask. It's not uncommon for a relative newcomer to redo some internal API. Not that it's usually a hard transformation. Dunno, a bit hypothetical.
On Wed, Apr 10, 2024 at 1:01 PM Andrew Lunn <andrew@lunn.ch> wrote: > > On Wed, Apr 10, 2024 at 08:56:31AM -0700, Alexander Duyck wrote: > > On Tue, Apr 9, 2024 at 4:42 PM Andrew Lunn <andrew@lunn.ch> wrote: > > > > > > > What is less clear to me is what do we do about uAPI / core changes. > > > > > > I would differentiate between core change and core additions. If there > > > is very limited firmware on this device, i assume Linux is managing > > > the SFP cage, and to some extend the PCS. Extending the core to handle > > > these at higher speeds than currently supported would be one such core > > > addition. I've no problem with this. And i doubt it will be a single > > > NIC using such additions for too long. It looks like ClearFog CX LX2 > > > could make use of such extensions as well, and there are probably > > > other boards and devices, maybe the Zynq 7000? > > > > The driver on this device doesn't have full access over the PHY. > > Basically we control everything from the PCS north, and the firmware > > controls everything from the PMA south as the physical connection is > > MUXed between 4 slices. So this means the firmware also controls all > > the I2C and the QSFP and EEPROM. The main reason for this is that > > those blocks are shared resources between the slices, as such the > > firmware acts as the arbitrator for 4 slices and the BMC. > > Ah, shame. You took what is probably the least valuable intellectual > property, and most shareable with the community and locked it up in > firmware where nobody can use it. > > You should probably stop saying there is not much firmware with this > device, and that Linux controls it. It clearly does not... > > Andrew Well, I was referring more to the data path level than the PHY configuration. I suspect different people have different expectations of what minimal firmware is. With this hardware we at least don't need to use firmware commands to enable or disable queues, get the device stats, or update a MAC address. When it comes to multi-host NICs I am not sure there are going to be any solutions that don't have some level of firmware due to the fact that the cable is physically shared with multiple slots. I am assuming we still want to do the PCS driver. So I will still see what I can do to get that set up. Thanks, - Alex
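To make the "no firmware commands" point concrete: with a register-controlled data path, enabling a Tx queue reduces to a few MMIO writes instead of a mailbox round trip. A sketch with entirely hypothetical register names and layout (nothing here reflects fbnic's actual CSR map):

#include <linux/bits.h>
#include <linux/io.h>
#include <linux/kernel.h>
#include <linux/types.h>

#define MYNIC_TXQ_BASE		0x1000	/* hypothetical queue CSR block */
#define MYNIC_TXQ_STRIDE	0x40
#define MYNIC_TXQ_BAL		0x00	/* ring base address, low 32 bits */
#define MYNIC_TXQ_BAH		0x04	/* ring base address, high 32 bits */
#define MYNIC_TXQ_CTL		0x08
#define MYNIC_TXQ_CTL_ENABLE	BIT(0)

struct mynic {
	void __iomem *regs;
	struct { dma_addr_t dma; } txq[128];
};

static void mynic_enable_txq(struct mynic *priv, unsigned int idx)
{
	void __iomem *q = priv->regs + MYNIC_TXQ_BASE + idx * MYNIC_TXQ_STRIDE;

	/* Point the hardware at the descriptor ring and turn the queue
	 * on; no firmware involvement anywhere on this path.
	 */
	writel(lower_32_bits(priv->txq[idx].dma), q + MYNIC_TXQ_BAL);
	writel(upper_32_bits(priv->txq[idx].dma), q + MYNIC_TXQ_BAH);
	writel(MYNIC_TXQ_CTL_ENABLE, q + MYNIC_TXQ_CTL);
}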
On 4/10/2024 12:26 AM, Jiri Pirko wrote: > Tue, Apr 09, 2024 at 11:06:05PM CEST, willemdebruijn.kernel@gmail.com wrote: >> Jakub Kicinski wrote: >>> On Wed, 03 Apr 2024 13:08:24 -0700 Alexander Duyck wrote: > > [...] > >> >> 2. whether new device features can be supported without at least >> two available devices supporting it. >> > > [...] > >> >> 2 is out of scope for this series. But I would always want to hear >> about potential new features that an organization finds valuable >> enough to implement. Rather than a blanket rule against them. > > This appears out of the nowhere. In the past, I would say wast majority > of the features was merged with single device implementation. Often, it > is the only device out there at a time that supports the feature. > This limitation would put break for feature additions. I can put a long > list of features that would not be here ever (like 50% of mlxsw driver). > >> >> > Jakub already mentioned this being nuanced in a previous part of the thread. In reality, lots of features do get implemented by only one driver first. I think it's good practice to ensure multiple vendors/drivers can use whatever common uAPI or kernel API exists. It can be frustrating when some new API gets introduced but then can't be used by another device. In most cases that's on the vendors for being slow to respond or to work with each other when developing the new API.
On 4/10/2024 12:58 PM, Jakub Kicinski wrote: > On Wed, 10 Apr 2024 11:29:57 -0700 Florian Fainelli wrote: >>> If we are going to be trying to come up with some special status maybe >>> it makes sense to have some status in the MAINTAINERS file that would >>> indicate that this driver is exclusive to some organization and not >>> publicly available so any maintenance would have to be proprietary. >> >> I like that idea. > > +1, also first idea that came to mind but I was too afraid > of bike shedding to mention it :) Fingers crossed? :) > +1, I think putting it in MAINTAINERS makes a lot of sense.
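Assuming a qualifier of this sort were agreed on, the entry might look something like the sketch below; the status wording is hypothetical, since no such MAINTAINERS convention exists yet:

META ETHERNET DRIVERS
M:	Alexander Duyck <alexanderduyck@fb.com>
L:	netdev@vger.kernel.org
S:	Supported (private device, not generally available)
F:	drivers/net/ethernet/meta/

Anyone running get_maintainer.pl on a patch touching the driver would then see the qualifier without having to know anything about the directory layout.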
> I think its good practice to ensure multiple vendors/drivers can use > whatever common uAPI or kernel API exists. It can be frustrating when > some new API gets introduced but then can't be used by another device.. > In most cases thats on the vendors for being slow to respond or work > with each other when developing the new API. I tend to agree with the last part. Vendors tend not to review other vendors' patches, and so often don't notice a new API being added which they could use if it were a little bit more generic. Also vendors often seem to focus on their devices/firmware requirements, not an abstract device, and so end up with something not generic. As a reviewer, I try to take more notice of new APIs than of most other things, and ideally it is something we should all do. Andrew
> Well I was referring more to the data path level more than the phy > configuration. I suspect different people have different levels of > expectations on what minimal firmware is. With this hardware we at > least don't need to use firmware commands to enable or disable queues, > get the device stats, or update a MAC address. > > When it comes to multi-host NICs I am not sure there are going to be > any solutions that don't have some level of firmware due to the fact > that the cable is physically shared with multiple slots. This is something Russell King at least considered. I don't really know enough to know why it's impossible for Linux to deal with multiple slots. > I am assuming we still want to do the PCS driver. So I will still see > what I can do to get that setup. You should look at the API offered by drivers in drivers/net/pcs. It is designed to be used with drivers which actually drive the hardware, and use phylink. Who is responsible for configuring and looking at the results of auto-negotiation? Who is responsible for putting the PCS into the correct mode depending on the SFP module's capabilities? Because you seem to have split the PCS into two, and hidden some of it away, I don't know if it makes sense to try to shoehorn what is left into a Linux driver. Andrew
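For a sense of scale, the contract for a drivers/net/pcs driver is small. A minimal sketch of the hooks phylink expects, with hypothetical fbnic_pcs names (the op signatures match include/linux/phylink.h as of the time of this thread):

#include <linux/phylink.h>

struct fbnic_pcs {
	struct phylink_pcs pcs;
	void __iomem *regs;
};

static void fbnic_pcs_get_state(struct phylink_pcs *pcs,
				struct phylink_link_state *state)
{
	/* Would read link/speed/duplex back from the PCS registers;
	 * hardcoded here for brevity.
	 */
	state->link = 1;
	state->speed = SPEED_50000;
	state->duplex = DUPLEX_FULL;
}

static int fbnic_pcs_config(struct phylink_pcs *pcs, unsigned int neg_mode,
			    phy_interface_t interface,
			    const unsigned long *advertising,
			    bool permit_pause_to_mac)
{
	/* Program the PCS for the requested interface mode; clause 73
	 * autoneg would be configured here when it is used.
	 */
	return 0;
}

static void fbnic_pcs_an_restart(struct phylink_pcs *pcs)
{
	/* Kick autoneg; effectively a no-op on fixed-speed -CR links */
}

static const struct phylink_pcs_ops fbnic_pcs_ops = {
	.pcs_get_state	= fbnic_pcs_get_state,
	.pcs_config	= fbnic_pcs_config,
	.pcs_an_restart	= fbnic_pcs_an_restart,
};

static void fbnic_pcs_init(struct fbnic_pcs *fp)
{
	fp->pcs.ops = &fbnic_pcs_ops;
	fp->pcs.neg_mode = true;	/* pcs_config() takes neg_mode */
}

This also answers the question of who owns auto-negotiation in the phylink model: phylink requests it through pcs_config() and reads the outcome back through pcs_get_state().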
On 4/10/2024 3:19 PM, Andrew Lunn wrote: >> I think its good practice to ensure multiple vendors/drivers can use >> whatever common uAPI or kernel API exists. It can be frustrating when >> some new API gets introduced but then can't be used by another device.. >> In most cases thats on the vendors for being slow to respond or work >> with each other when developing the new API. > > I tend to agree with the last part. Vendors tend not to reviewer other > vendors patches, and so often don't notice a new API being added which > they could use, if it was a little bit more generic. Also vendors > often seem to focus on their devices/firmware requirements, not an > abstract device, and so end up with something not generic. > > As a reviewer, i try to take more notice of new APIs than most other > things, and ideally it is something we should all do. > > Andrew > > > Agreed. It can be challenging when you're in the vendor space though, as you get handed priorities.
Thu, Apr 11, 2024 at 12:03:54AM CEST, jacob.e.keller@intel.com wrote: > > >On 4/10/2024 12:58 PM, Jakub Kicinski wrote: >> On Wed, 10 Apr 2024 11:29:57 -0700 Florian Fainelli wrote: >>>> If we are going to be trying to come up with some special status maybe >>>> it makes sense to have some status in the MAINTAINERS file that would >>>> indicate that this driver is exclusive to some organization and not >>>> publicly available so any maintenance would have to be proprietary. >>> >>> I like that idea. >> >> +1, also first idea that came to mind but I was too afraid >> of bike shedding to mention it :) Fingers crossed? :) >> > >+1, I think putting it in MAINTAINERS makes a lot of sense. Well, how exactly do you imagine doing this? I have no problem using MAINTAINERS for this, I was thinking about that too, but I could not figure out the way it would work. Having a driver directory is much more obvious; a person cooking up a patch sees it immediately. Do you look at the MAINTAINERS file when you do some driver-API-changing patch, or any patch? I certainly don't (not counting the get_maintainers script).
Wed, Apr 10, 2024 at 08:01:44PM CEST, alexander.duyck@gmail.com wrote: >On Wed, Apr 10, 2024 at 10:56 AM Jakub Kicinski <kuba@kernel.org> wrote: >> >> On Wed, 10 Apr 2024 10:39:11 -0700 Florian Fainelli wrote: >> > > Hm, we currently group by vendor but the fact it's a private device >> > > is probably more important indeed. For example if Google submits >> > > a driver for a private device it may be confusing what's public >> > > cloud (which I think/hope GVE is) and what's fully private. >> > > >> > > So we could categorize by the characteristic rather than vendor: >> > > >> > > drivers/net/ethernet/${term}/fbnic/ >> > > >> > > I'm afraid it may be hard for us to agree on an accurate term, tho. >> > > "Unused" sounds.. odd, we don't keep unused code, "private" >> > > sounds like we granted someone special right not took some away, >> > > maybe "exclusive"? Or "besteffort"? Or "staging" :D IDK. >> > >> > Do we really need that categorization at the directory/filesystem level? >> > cannot we just document it clearly in the Kconfig help text and under >> > Documentation/networking/? >> >> From the reviewer perspective I think we will just remember. >> If some newcomer tries to do refactoring they may benefit from seeing >> this is a special device and more help is offered. Dunno if a newcomer >> would look at the right docs. >> >> Whether it's more "paperwork" than we'll actually gain, I have no idea. >> I may not be the best person to comment. > >Are we going to go through and retro-actively move some of the drivers >that are already there that are exclusive to specific companies? That >is the bigger issue as I see it. It has already been brought up that Why is it an issue? It is very easy to move drivers to this new directory. >idpf is exclusive. In addition several other people have reached out >to me about other devices that are exclusive to other organizations. > >I don't see any value in it as it would just encourage people to lie >in order to avoid being put in what would essentially become a >blacklisted directory. You are thinking all-or-nothing. I'd say that if we have 80% of such drivers in the correct place/directory, it's a win. The rest will lie. Shame on them when it is discovered. > >If we are going to be trying to come up with some special status maybe >it makes sense to have some status in the MAINTAINERS file that would >indicate that this driver is exclusive to some organization and not >publicly available so any maintenance would have to be proprietary.
Wed, Apr 10, 2024 at 11:07:02PM CEST, alexander.duyck@gmail.com wrote:
>On Wed, Apr 10, 2024 at 1:01 PM Andrew Lunn <andrew@lunn.ch> wrote:
>>
>> On Wed, Apr 10, 2024 at 08:56:31AM -0700, Alexander Duyck wrote:
>> > On Tue, Apr 9, 2024 at 4:42 PM Andrew Lunn <andrew@lunn.ch> wrote:
>> > >
>> > > > What is less clear to me is what do we do about uAPI / core changes.
>> > >
>> > > I would differentiate between core changes and core additions. If there
>> > > is very limited firmware on this device, I assume Linux is managing
>> > > the SFP cage, and to some extent the PCS. Extending the core to handle
>> > > these at higher speeds than currently supported would be one such core
>> > > addition. I've no problem with this. And I doubt it will be a single
>> > > NIC using such additions for too long. It looks like ClearFog CX LX2
>> > > could make use of such extensions as well, and there are probably
>> > > other boards and devices, maybe the Zynq 7000?
>> >
>> > The driver on this device doesn't have full access over the PHY.
>> > Basically we control everything from the PCS north, and the firmware
>> > controls everything from the PMA south, as the physical connection is
>> > MUXed between 4 slices. So this means the firmware also controls all
>> > the I2C and the QSFP and EEPROM. The main reason for this is that
>> > those blocks are shared resources between the slices; as such the
>> > firmware acts as the arbitrator for the 4 slices and the BMC.
>>
>> Ah, shame. You took what is probably the least valuable intellectual
>> property, and most shareable with the community, and locked it up in
>> firmware where nobody can use it.
>>
>> You should probably stop saying there is not much firmware with this
>> device, and that Linux controls it. It clearly does not...
>>
>> Andrew
>
>Well I was referring more to the data path level than to the PHY
>configuration. I suspect different people have different levels of
>expectations on what minimal firmware is. With this hardware we at
>least don't need to use firmware commands to enable or disable queues,
>get the device stats, or update a MAC address.
>
>When it comes to multi-host NICs I am not sure there are going to be
>any solutions that don't have some level of firmware due to the fact

A small Linux host on the NIC that controls the eswitch, perhaps? I
mean, a multi-PF NIC without a host in charge of the physical port and
the switching between it and the PFs is simply a broken design. And
yeah, you would probably now want to argue others are doing it already
in the same way :) True that.

>that the cable is physically shared with multiple slots.
>
>I am assuming we still want to do the PCS driver. So I will still see
>what I can do to get that set up.
>
>Thanks,
>
>- Alex
>
On Wed, Apr 10, 2024 at 3:37 PM Andrew Lunn <andrew@lunn.ch> wrote:
>
> > Well I was referring more to the data path level than to the PHY
> > configuration. I suspect different people have different levels of
> > expectations on what minimal firmware is. With this hardware we at
> > least don't need to use firmware commands to enable or disable queues,
> > get the device stats, or update a MAC address.
> >
> > When it comes to multi-host NICs I am not sure there are going to be
> > any solutions that don't have some level of firmware due to the fact
> > that the cable is physically shared with multiple slots.
>
> This is something Russell King at least considered. I don't really
> know enough to know why it's impossible for Linux to deal with multiple
> slots.

It mostly has to do with the arbitration between them. It is a matter
of having to pass a TON of info to the individual slice, and then the
problem is that it would have to do things correctly and not manage to
take out its neighbor or the BMC.

> > I am assuming we still want to do the PCS driver. So I will still see
> > what I can do to get that set up.
>
> You should look at the API offered by drivers in drivers/net/pcs. It
> is designed to be used with drivers which actually drive the hardware,
> and use phylink. Who is responsible for configuring and looking at the
> results of auto negotiation? Who is responsible for putting the PCS
> into the correct mode depending on the SFP module's capabilities?
> Because you seemed to have split the PCS into two, and hidden some of
> it away, I don't know if it makes sense to try to shoehorn what is
> left into a Linux driver.

We have control of the auto negotiation as that is north of the PMA
and is configured per host. We should support clause 73 autoneg,
although we haven't done much with it as most of our use cases are
just fixed-speed setups to the switch over either 25G-CR1, 50G-CR2,
50G-CR1, or 100G-CR2. So odds are we aren't going to be doing anything
too terribly exciting.

As far as the QSFP setup goes, the FW is responsible for any
communication with it. I suspect that the expectation is that we
aren't going to need much in the way of config since we are just using
direct attach cables. So we are the only ones driving the PCS,
assuming we aren't talking about power-on init where the FW is setting
up for the BMC to have access.

We will have to see. The PCS drivers in that directory mostly make
sense to me and don't look like too much of a departure from my
current code. It will just be a matter of splitting up the
fbnic_mac.c file and adding the PCS logic as a separate block, or at
least I hope that is mostly all that is involved. It will probably
take me a week or two to get it coded up and push out the v2.
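Since the subthread keeps returning to what a drivers/net/pcs-style
driver would involve, a minimal sketch of the interface follows. The
struct phylink_pcs_ops callbacks are the real phylink API (callback
signatures have shifted somewhat across kernel versions); every
fbnic_* name and register offset below is hypothetical, invented for
illustration rather than taken from the posted driver:

#include <linux/bits.h>
#include <linux/container_of.h>
#include <linux/io.h>
#include <linux/phylink.h>

#define FBNIC_PCS_STS		0x00	/* hypothetical status CSR */
#define FBNIC_PCS_STS_LINK	BIT(0)	/* hypothetical link-up bit */

struct fbnic_pcs {
	struct phylink_pcs pcs;		/* handle phylink hands back to us */
	void __iomem *base;		/* PCS register window */
};

static void fbnic_pcs_get_state(struct phylink_pcs *pcs,
				struct phylink_link_state *state)
{
	struct fbnic_pcs *fp = container_of(pcs, struct fbnic_pcs, pcs);

	/* Report link state straight from the PCS status register */
	state->link = !!(readl(fp->base + FBNIC_PCS_STS) &
			 FBNIC_PCS_STS_LINK);
}

static int fbnic_pcs_config(struct phylink_pcs *pcs, unsigned int neg_mode,
			    phy_interface_t interface,
			    const unsigned long *advertising,
			    bool permit_pause_to_mac)
{
	/* Fixed-speed direct-attach links need no negotiation; clause 73
	 * autoneg would be programmed here if it were added later.
	 */
	return 0;
}

static const struct phylink_pcs_ops fbnic_pcs_ops = {
	.pcs_get_state	= fbnic_pcs_get_state,
	.pcs_config	= fbnic_pcs_config,
};

A MAC driver would then set the ops pointer in the embedded struct
phylink_pcs and hand it back to phylink from its mac_select_pcs()
callback.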
On 4/10/2024 11:31 PM, Jiri Pirko wrote:
> Thu, Apr 11, 2024 at 12:03:54AM CEST, jacob.e.keller@intel.com wrote:
>>
>>
>> On 4/10/2024 12:58 PM, Jakub Kicinski wrote:
>>> On Wed, 10 Apr 2024 11:29:57 -0700 Florian Fainelli wrote:
>>>>> If we are going to be trying to come up with some special status maybe
>>>>> it makes sense to have some status in the MAINTAINERS file that would
>>>>> indicate that this driver is exclusive to some organization and not
>>>>> publicly available so any maintenance would have to be proprietary.
>>>>
>>>> I like that idea.
>>>
>>> +1, also first idea that came to mind but I was too afraid
>>> of bike shedding to mention it :) Fingers crossed? :)
>>>
>>
>> +1, I think putting it in MAINTAINERS makes a lot of sense.
>
> Well, how exactly do you imagine doing this? I have no problem using
> MAINTAINERS for this, I was thinking about that too, but I could not
> figure out the way it would work. Having a driver directory is much
> more obvious; a person cooking up a patch sees that immediately. Do
> you look at the MAINTAINERS file when you do some driver-API-changing
> patch, or any patch? I certainly don't (not counting the
> get_maintainers script).

I use MAINTAINERS (along with get_maintainers) to figure out who to CC
when dealing with a driver. I guess I probably don't do so before
making a change, though.

I guess it depends on what the intent of documenting it is. Presumably
it would be a hint and reminder to future maintainers that if the
folks listed in MAINTAINERS are no longer responsive then this driver
can be reverted. Or do you push more strongly towards "it's up to them
to do all maintenance", i.e., we don't make API changes for them,
etc.? I didn't get the sense that was the consensus.

The consensus I got from reading this thread is that most people are
ok with merging the driver, with some reservations about the future
and any API changes made specifically for this one driver, and some
reservations about the extra maintenance burden.

Several others pointed out example cases which are similar in
availability, perhaps not quite as obvious as this case, where the
device is produced and consumed by the same group only. The practical
reality is that many of these devices are effectively exclusive even
if not definitionally so.
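To make the MAINTAINERS idea concrete, here is one hedged sketch of
what such an entry might look like. The M:/L:/S:/F: tag layout is the
file's real format, but the entry itself is illustrative and the
"Exclusive" status is purely hypothetical; the statuses the file
defines today are Supported, Maintained, Odd Fixes, Orphan, and
Obsolete:

META ETHERNET DRIVERS
M:	Alexander Duyck <alexander.duyck@gmail.com>
L:	netdev@vger.kernel.org
S:	Exclusive
F:	drivers/net/ethernet/meta/

Whether such a status would gate anything mechanically
(get_maintainer.pl output, checkpatch warnings) or merely document
expectations for refactoring help is exactly the open question in this
subthread.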
On Wed, Apr 10, 2024 at 11:39 PM Jiri Pirko <jiri@resnulli.us> wrote:
>
> Wed, Apr 10, 2024 at 11:07:02PM CEST, alexander.duyck@gmail.com wrote:
> >On Wed, Apr 10, 2024 at 1:01 PM Andrew Lunn <andrew@lunn.ch> wrote:
> >>
> >> On Wed, Apr 10, 2024 at 08:56:31AM -0700, Alexander Duyck wrote:
> >> > On Tue, Apr 9, 2024 at 4:42 PM Andrew Lunn <andrew@lunn.ch> wrote:
> >> > >
> >> > > > What is less clear to me is what do we do about uAPI / core changes.
> >> > >
> >> > > I would differentiate between core changes and core additions. If there
> >> > > is very limited firmware on this device, I assume Linux is managing
> >> > > the SFP cage, and to some extent the PCS. Extending the core to handle
> >> > > these at higher speeds than currently supported would be one such core
> >> > > addition. I've no problem with this. And I doubt it will be a single
> >> > > NIC using such additions for too long. It looks like ClearFog CX LX2
> >> > > could make use of such extensions as well, and there are probably
> >> > > other boards and devices, maybe the Zynq 7000?
> >> >
> >> > The driver on this device doesn't have full access over the PHY.
> >> > Basically we control everything from the PCS north, and the firmware
> >> > controls everything from the PMA south, as the physical connection is
> >> > MUXed between 4 slices. So this means the firmware also controls all
> >> > the I2C and the QSFP and EEPROM. The main reason for this is that
> >> > those blocks are shared resources between the slices; as such the
> >> > firmware acts as the arbitrator for the 4 slices and the BMC.
> >>
> >> Ah, shame. You took what is probably the least valuable intellectual
> >> property, and most shareable with the community, and locked it up in
> >> firmware where nobody can use it.
> >>
> >> You should probably stop saying there is not much firmware with this
> >> device, and that Linux controls it. It clearly does not...
> >>
> >> Andrew
> >
> >Well I was referring more to the data path level than to the PHY
> >configuration. I suspect different people have different levels of
> >expectations on what minimal firmware is. With this hardware we at
> >least don't need to use firmware commands to enable or disable queues,
> >get the device stats, or update a MAC address.
> >
> >When it comes to multi-host NICs I am not sure there are going to be
> >any solutions that don't have some level of firmware due to the fact
>
> A small Linux host on the NIC that controls the eswitch, perhaps? I
> mean, a multi-PF NIC without a host in charge of the physical port and
> the switching between it and the PFs is simply a broken design. And
> yeah, you would probably now want to argue others are doing it already
> in the same way :) True that.

Well in our case there isn't an eswitch. The issue is more that the
logic for the Ethernet PHY isn't set up to run only one port. Instead
the PHY is MUXed over 2 ports per interface, and then the QSFP
interface itself is spread over 4 ports. What you end up with is
something like the second-to-last image in this article[1], where you
have a MAC/PCS pair per host sitting on top of one PMA, with some
blocks that are shared between the hosts and some that are not. The
issue becomes management of access to the QSFP and PHY, and how to
prevent one host from being able to monopolize the PHY/QSFP or crash
the others if something goes sideways. Then you have to also add in
the BMC management on top of that.

[1]: https://semiengineering.com/integrated-ethernet-pcs-and-phy-ip-for-400g-800g-hyperscale-data-centers/
On Thu, Apr 11, 2024 at 09:00:17AM -0700, Alexander Duyck wrote:
> On Wed, Apr 10, 2024 at 3:37 PM Andrew Lunn <andrew@lunn.ch> wrote:
> >
> > > Well I was referring more to the data path level than to the PHY
> > > configuration. I suspect different people have different levels of
> > > expectations on what minimal firmware is. With this hardware we at
> > > least don't need to use firmware commands to enable or disable queues,
> > > get the device stats, or update a MAC address.
> > >
> > > When it comes to multi-host NICs I am not sure there are going to be
> > > any solutions that don't have some level of firmware due to the fact
> > > that the cable is physically shared with multiple slots.
> >
> > This is something Russell King at least considered. I don't really
> > know enough to know why it's impossible for Linux to deal with multiple
> > slots.
>
> It mostly has to do with the arbitration between them. It is a matter
> of having to pass a TON of info to the individual slice, and then the
> problem is that it would have to do things correctly and not manage to
> take out its neighbor or the BMC.

How much is specific to your device? How much is just following 802.3
and the CMIS standards? I assume anything which is just following
802.3 and CMIS could actually be reused? And you have some glue to
combine them in a way that is specific to your device?

> > > I am assuming we still want to do the PCS driver. So I will still see
> > > what I can do to get that set up.
> >
> > You should look at the API offered by drivers in drivers/net/pcs. It
> > is designed to be used with drivers which actually drive the hardware,
> > and use phylink. Who is responsible for configuring and looking at the
> > results of auto negotiation? Who is responsible for putting the PCS
> > into the correct mode depending on the SFP module's capabilities?
> > Because you seemed to have split the PCS into two, and hidden some of
> > it away, I don't know if it makes sense to try to shoehorn what is
> > left into a Linux driver.
>
> We have control of the auto negotiation as that is north of the PMA
> and is configured per host. We should support clause 73 autoneg,
> although we haven't done much with it as most of our use cases are
> just fixed-speed setups to the switch over either 25G-CR1, 50G-CR2,
> 50G-CR1, or 100G-CR2. So odds are we aren't going to be doing anything
> too terribly exciting.

Maybe not, but you might have gained from the community here, if
others could have adopted this code for their devices. You might not
need clause 73, but phylink provides helpers to implement it, so it is
pretty easy to add. Maybe your initial PCS driver does not support it,
but later adopters who also license this PCS might add it, and you get
the feature for free. The corrected/uncorrected counters I asked about
are something you might not export in your current code via ethtool.
But again, this is something which somebody else could add a helper
for, and you would get it nearly for free.

> As far as the QSFP setup goes, the FW is responsible for any
> communication with it. I suspect that the expectation is that we
> aren't going to need much in the way of config since we are just using
> direct attach cables.

Another place you might have gotten features for free. The Linux SFP
driver exports HWMON values for temperature, power, received power,
etc., but for 1G.
The QSFP+ standard Versatile Diagnostics Monitoring is different, but
I could see somebody adding a generic implementation in the Linux SFP
driver, so that the HWMON support is just free. Same goes for the
error performance statistics. Parts of power management could easily
be generic. It might be possible to use Linux regulators to describe
what your board is capable of, and the SFP core could then implement
the ethtool ops, checking with the regulator to see if the power is
actually available, and then talking to the SFP to tell it to change
its power class?

Florian posted some interesting statistics showing that vendors tend
to maintain their own drivers, and don't get much support from the
community. However I suspect it is a different story for shared
infrastructure like PCS drivers, PHY drivers, SFP drivers. That is
where you get the most community support and the most stuff for free.
But you actually have to use it to benefit from it.

	Andrew
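On the corrected/uncorrected counters specifically: ethtool already
defines a standard hook for FEC block counters, so once the PCS
exposes them the driver-side glue is small. A minimal sketch, assuming
a hypothetical fbnic_net private struct, csr_base window, and
FBNIC_PCS_FEC_* register offsets; the get_fec_stats op and struct
ethtool_fec_stats are the real in-tree API:

#include <linux/ethtool.h>
#include <linux/io.h>
#include <linux/netdevice.h>

#define FBNIC_PCS_FEC_CCW	0x100	/* hypothetical: corrected codewords */
#define FBNIC_PCS_FEC_NCCW	0x108	/* hypothetical: uncorrectable codewords */

struct fbnic_net {			/* hypothetical private struct */
	void __iomem *csr_base;
};

static u64 fbnic_rd64(struct fbnic_net *fbn, u32 off)
{
	/* Hypothetical 64-bit CSR read helper */
	return readq(fbn->csr_base + off);
}

static void fbnic_get_fec_stats(struct net_device *netdev,
				struct ethtool_fec_stats *fec_stats)
{
	struct fbnic_net *fbn = netdev_priv(netdev);

	/* Fields left untouched stay at ETHTOOL_STAT_NOT_SET, which the
	 * ethtool core pre-fills before invoking this op.
	 */
	fec_stats->corrected_blocks.total =
		fbnic_rd64(fbn, FBNIC_PCS_FEC_CCW);
	fec_stats->uncorrectable_blocks.total =
		fbnic_rd64(fbn, FBNIC_PCS_FEC_NCCW);
}

static const struct ethtool_ops fbnic_ethtool_ops = {
	.get_fec_stats	= fbnic_get_fec_stats,
};

Recent ethtool userspace can then report the counters with
"ethtool -I --show-fec <dev>".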
On Thu, Apr 11, 2024 at 10:32 AM Andrew Lunn <andrew@lunn.ch> wrote:
>
> On Thu, Apr 11, 2024 at 09:00:17AM -0700, Alexander Duyck wrote:
> > On Wed, Apr 10, 2024 at 3:37 PM Andrew Lunn <andrew@lunn.ch> wrote:
> > >
> > > > Well I was referring more to the data path level than to the PHY
> > > > configuration. I suspect different people have different levels of
> > > > expectations on what minimal firmware is. With this hardware we at
> > > > least don't need to use firmware commands to enable or disable queues,
> > > > get the device stats, or update a MAC address.
> > > >
> > > > When it comes to multi-host NICs I am not sure there are going to be
> > > > any solutions that don't have some level of firmware due to the fact
> > > > that the cable is physically shared with multiple slots.
> > >
> > > This is something Russell King at least considered. I don't really
> > > know enough to know why it's impossible for Linux to deal with
> > > multiple slots.
> >
> > It mostly has to do with the arbitration between them. It is a matter
> > of having to pass a TON of info to the individual slice, and then the
> > problem is that it would have to do things correctly and not manage to
> > take out its neighbor or the BMC.
>
> How much is specific to your device? How much is just following 802.3
> and the CMIS standards? I assume anything which is just following
> 802.3 and CMIS could actually be reused? And you have some glue to
> combine them in a way that is specific to your device?
>
> > > > I am assuming we still want to do the PCS driver. So I will still
> > > > see what I can do to get that set up.
> > >
> > > You should look at the API offered by drivers in drivers/net/pcs. It
> > > is designed to be used with drivers which actually drive the hardware,
> > > and use phylink. Who is responsible for configuring and looking at the
> > > results of auto negotiation? Who is responsible for putting the PCS
> > > into the correct mode depending on the SFP module's capabilities?
> > > Because you seemed to have split the PCS into two, and hidden some of
> > > it away, I don't know if it makes sense to try to shoehorn what is
> > > left into a Linux driver.
> >
> > We have control of the auto negotiation as that is north of the PMA
> > and is configured per host. We should support clause 73 autoneg,
> > although we haven't done much with it as most of our use cases are
> > just fixed-speed setups to the switch over either 25G-CR1, 50G-CR2,
> > 50G-CR1, or 100G-CR2. So odds are we aren't going to be doing anything
> > too terribly exciting.
>
> Maybe not, but you might have gained from the community here, if
> others could have adopted this code for their devices. You might not
> need clause 73, but phylink provides helpers to implement it, so it is
> pretty easy to add. Maybe your initial PCS driver does not support it,
> but later adopters who also license this PCS might add it, and you get
> the feature for free. The corrected/uncorrected counters I asked about
> are something you might not export in your current code via ethtool.
> But again, this is something which somebody else could add a helper
> for, and you would get it nearly for free.

You don't have to sell me on the reuse advantages of open source. I
will probably look at adding autoneg at some point in the future, but
for our main use case it wasn't needed. If nothing else I will
probably hand it off to one of the new hires on the team when I get
some time. The counters are exported.
Just haven't gotten far enough to show the ethtool patches yet... :-)

> > As far as the QSFP setup goes, the FW is responsible for any
> > communication with it. I suspect that the expectation is that we
> > aren't going to need much in the way of config since we are just
> > using direct attach cables.
>
> Another place you might have gotten features for free. The Linux SFP
> driver exports HWMON values for temperature, power, received power,
> etc., but for 1G. The QSFP+ standard Versatile Diagnostics Monitoring
> is different, but I could see somebody adding a generic implementation
> in the Linux SFP driver, so that the HWMON support is just free. Same
> goes for the error performance statistics. Parts of power management
> could easily be generic. It might be possible to use Linux regulators
> to describe what your board is capable of, and the SFP core could then
> implement the ethtool ops, checking with the regulator to see if the
> power is actually available, and then talking to the SFP to tell it to
> change its power class?

Again, for us there is not much value in adding additional QSFP logic
because we aren't using anything fancy. It is all just direct attach
cables.

> Florian posted some interesting statistics showing that vendors tend
> to maintain their own drivers, and don't get much support from the
> community. However I suspect it is a different story for shared
> infrastructure like PCS drivers, PHY drivers, SFP drivers. That is
> where you get the most community support and the most stuff for free.
> But you actually have to use it to benefit from it.

I'll probably get started on the PCS drivers for this next week. I
will follow up with questions if I run into any issues.

Thanks,

- Alex