diff mbox series

[net-next,02/15] eth: fbnic: add scaffolding for Meta's NIC driver

Message ID 171217491384.1598374.15535514527169847181.stgit@ahduyck-xeon-server.home.arpa (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series eth: fbnic: Add network driver for Meta Platforms Host Network Interface | expand

Checks

Context Check Description
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for net-next, async
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 949 this patch: 949
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers warning 4 maintainers not CCed: edumazet@google.com horms@kernel.org masahiroy@kernel.org kernel-team@meta.com
netdev/build_clang success Errors and warnings before: 955 this patch: 955
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 961 this patch: 961
netdev/checkpatch warning WARNING: please write a help paragraph that fully describes the config symbol
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc fail Errors and warnings before: 0 this patch: 2
netdev/source_inline success Was 0 now: 0

Commit Message

Alexander Duyck April 3, 2024, 8:08 p.m. UTC
From: Alexander Duyck <alexanderduyck@fb.com>

Create a bare-bones PCI driver for Meta's NIC.
Subsequent changes will flesh it out.

Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
---
 MAINTAINERS                                     |    7 +
 drivers/net/ethernet/Kconfig                    |    1 
 drivers/net/ethernet/Makefile                   |    1 
 drivers/net/ethernet/meta/Kconfig               |   29 +++
 drivers/net/ethernet/meta/Makefile              |    6 +
 drivers/net/ethernet/meta/fbnic/Makefile        |   10 +
 drivers/net/ethernet/meta/fbnic/fbnic.h         |   19 ++
 drivers/net/ethernet/meta/fbnic/fbnic_csr.h     |    9 +
 drivers/net/ethernet/meta/fbnic/fbnic_drvinfo.h |    5 +
 drivers/net/ethernet/meta/fbnic/fbnic_pci.c     |  196 +++++++++++++++++++++++
 10 files changed, 283 insertions(+)
 create mode 100644 drivers/net/ethernet/meta/Kconfig
 create mode 100644 drivers/net/ethernet/meta/Makefile
 create mode 100644 drivers/net/ethernet/meta/fbnic/Makefile
 create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic.h
 create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_csr.h
 create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_drvinfo.h
 create mode 100644 drivers/net/ethernet/meta/fbnic/fbnic_pci.c

Comments

Andrew Lunn April 3, 2024, 8:33 p.m. UTC | #1
> + * fbnic_init_module - Driver Registration Routine
> + *
> + * The first routine called when the driver is loaded.  All it does is
> + * register with the PCI subsystem.
> + **/
> +static int __init fbnic_init_module(void)
> +{
> +	int err;
> +
> +	pr_info(DRV_SUMMARY " (%s)", fbnic_driver.name);

Please don't spam the kernel log like this. Drivers should only report
when something goes wrong.

     Andrew
Alexander Duyck April 3, 2024, 8:47 p.m. UTC | #2
On Wed, Apr 3, 2024 at 1:33 PM Andrew Lunn <andrew@lunn.ch> wrote:
>
> > + * fbnic_init_module - Driver Registration Routine
> > + *
> > + * The first routine called when the driver is loaded.  All it does is
> > + * register with the PCI subsystem.
> > + **/
> > +static int __init fbnic_init_module(void)
> > +{
> > +     int err;
> > +
> > +     pr_info(DRV_SUMMARY " (%s)", fbnic_driver.name);
>
> Please don't spam the kernel log like this. Drivers should only report
> when something goes wrong.
>
>      Andrew

Really? I have always used something like this to determine that the
driver isn't there when a user complains that the driver didn't load
on a given device. It isn't as though it would be super spammy as this
is something that is normally only run once when the module is loaded
during early boot, and there isn't a good way to say the module isn't
loaded if the driver itself isn't there.

For example if somebody adds the driver, but forgets to update the
initramfs I can easily call out that it isn't there when I ask for the
logs from the system.

Although I suppose I could meet you halfway. It seems like I am always
posting the message here. If you would prefer I can only display it if
the driver is successfully registered.
Andrew Lunn April 3, 2024, 9:17 p.m. UTC | #3
On Wed, Apr 03, 2024 at 01:47:18PM -0700, Alexander Duyck wrote:
> On Wed, Apr 3, 2024 at 1:33 PM Andrew Lunn <andrew@lunn.ch> wrote:
> >
> > > + * fbnic_init_module - Driver Registration Routine
> > > + *
> > > + * The first routine called when the driver is loaded.  All it does is
> > > + * register with the PCI subsystem.
> > > + **/
> > > +static int __init fbnic_init_module(void)
> > > +{
> > > +     int err;
> > > +
> > > +     pr_info(DRV_SUMMARY " (%s)", fbnic_driver.name);
> >
> > Please don't spam the kernel log like this. Drivers should only report
> > when something goes wrong.
> >
> >      Andrew
> 
> Really?

I think if you look around, GregKH has said this.

lsmod | wc
    167     585    6814

Do i really want my kernel log spammed with 167 'Hello world'
messages?

> I have always used something like this to determine that the
> driver isn't there when a user complains that the driver didn't load
> on a given device. It isn't as though it would be super spammy as this
> is something that is normally only run once when the module is loaded
> during early boot, and there isn't a good way to say the module isn't
> loaded if the driver itself isn't there.

lsmod

	Andrew
Alexander Duyck April 3, 2024, 9:51 p.m. UTC | #4
On Wed, Apr 3, 2024 at 2:17 PM Andrew Lunn <andrew@lunn.ch> wrote:
>
> On Wed, Apr 03, 2024 at 01:47:18PM -0700, Alexander Duyck wrote:
> > On Wed, Apr 3, 2024 at 1:33 PM Andrew Lunn <andrew@lunn.ch> wrote:
> > >
> > > > + * fbnic_init_module - Driver Registration Routine
> > > > + *
> > > > + * The first routine called when the driver is loaded.  All it does is
> > > > + * register with the PCI subsystem.
> > > > + **/
> > > > +static int __init fbnic_init_module(void)
> > > > +{
> > > > +     int err;
> > > > +
> > > > +     pr_info(DRV_SUMMARY " (%s)", fbnic_driver.name);
> > >
> > > Please don't spam the kernel log like this. Drivers should only report
> > > when something goes wrong.
> > >
> > >      Andrew
> >
> > Really?
>
> I think if you look around, GregKH has said this.
>
> lsmod | wc
>     167     585    6814
>
> Do i really want my kernel log spammed with 167 'Hello world'
> messages?

I would say it depends. Are you trying to boot off of all 167 devices?
The issue I run into is that I have to support boot scenarios where
the driver has to load as early as possible in order to mount a boot
image copied over the network. In many cases if something fails we
won't have access to something like lsmod since this is being used in
fairly small monolithic kernel images used for provisioning systems.

> > I have always used something like this to determine that the
> > driver isn't there when a user complains that the driver didn't load
> > on a given device. It isn't as though it would be super spammy as this
> > is something that is normally only run once when the module is loaded
> > during early boot, and there isn't a good way to say the module isn't
> > loaded if the driver itself isn't there.
>
> lsmod
>
>         Andrew

That assumes you have access to the system and aren't looking at logs
after the fact. In addition that assumes the module isn't built into
the kernel as well. Having the one line in the log provides a single
point of truth that is easily searchable without having to resort to
one of several different ways of trying to figure out if it is there:
[root@localhost ~]# dmesg | grep "Meta(R) Host Network Interface Driver"
[   11.890979] Meta(R) Host Network Interface Driver (fbnic)

Otherwise we are having to go searching in sysfs if it is there, or
lsmod, or whatever is your preferred way and that only works if we
have login access to the system and it isn't just doing something like
writing the log to a file and rebooting.

Thanks,

- Alex
Andrew Lunn April 3, 2024, 10:20 p.m. UTC | #5
> I would say it depends. Are you trying to boot off of all 167 devices?

If i'm doing some sort of odd boot setup, i generally TFTP boot the
kernel, and then use NFS root. And i have everything built in. It not
finding the root fs because networking is FUBAR is pretty obvious. Bin
there, done that.

Please keep in mind the users here. This is a data centre NIC, not a
'grandma and grandpa' device which as the designated IT expert of the
family i need to help make work. Can the operators of Meta data
centres really not understand lsmod? Cannot look in /sys?

       Andrew
Alexander Duyck April 3, 2024, 11:27 p.m. UTC | #6
On Wed, Apr 3, 2024 at 3:20 PM Andrew Lunn <andrew@lunn.ch> wrote:
>
> > I would say it depends. Are you trying to boot off of all 167 devices?
>
> If i'm doing some sort of odd boot setup, i generally TFTP boot the
> kernel, and then use NFS root. And i have everything built in. It not
> finding the root fs because networking is FUBAR is pretty obvious. Bin
> there, done that.
>
> Please keep in mind the users here. This is a data centre NIC, not a
> 'grandma and grandpa' device which as the designated IT expert of the
> family i need to help make work. Can the operators of Meta data
> centres really not understand lsmod? Cannot look in /sys?
>
>        Andrew

At datacenter scales stopping to triage individual issues becomes
quite expensive. It implies that you are leaving the device in the
failed state while you take the time to come back around and figure
out what is going on. That is why in many cases we are not able to
stop and run an lsmod. Usually the error is recorded, the system
reset, and nothing comes of it unless it is a repeated issue.

Also it seems like this messaging is still being added for new
drivers. A quick search through the code for an example comes up with
cb7dd712189f ("octeon_ep_vf: Add driver framework and device
initialization") which is doing the same exact thing and is even a bit
noisier.

Anyway I partially agree that we do need to reduce the noise scope of
it so I will update so we only print the message if we actually
register the driver. We can probably discuss this further for v2 when
I get it submitted.

Thanks,

- Alex
diff mbox series

Patch

diff --git a/MAINTAINERS b/MAINTAINERS
index 6a233e1a3cf2..77efffbd23f9 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -14307,6 +14307,13 @@  T:	git git://linuxtv.org/media_tree.git
 F:	Documentation/devicetree/bindings/media/amlogic,gx-vdec.yaml
 F:	drivers/staging/media/meson/vdec/
 
+META ETHERNET DRIVERS
+M:	Alexander Duyck <alexanderduyck@fb.com>
+M:	Jakub Kicinski <kuba@kernel.org>
+R:	kernel-team@meta.com
+S:	Maintained
+F:	drivers/net/ethernet/meta/
+
 METHODE UDPU SUPPORT
 M:	Robert Marko <robert.marko@sartura.hr>
 S:	Maintained
diff --git a/drivers/net/ethernet/Kconfig b/drivers/net/ethernet/Kconfig
index 6a19b5393ed1..0baac25db4f8 100644
--- a/drivers/net/ethernet/Kconfig
+++ b/drivers/net/ethernet/Kconfig
@@ -122,6 +122,7 @@  source "drivers/net/ethernet/litex/Kconfig"
 source "drivers/net/ethernet/marvell/Kconfig"
 source "drivers/net/ethernet/mediatek/Kconfig"
 source "drivers/net/ethernet/mellanox/Kconfig"
+source "drivers/net/ethernet/meta/Kconfig"
 source "drivers/net/ethernet/micrel/Kconfig"
 source "drivers/net/ethernet/microchip/Kconfig"
 source "drivers/net/ethernet/mscc/Kconfig"
diff --git a/drivers/net/ethernet/Makefile b/drivers/net/ethernet/Makefile
index 0d872d4efcd1..c03203439c0e 100644
--- a/drivers/net/ethernet/Makefile
+++ b/drivers/net/ethernet/Makefile
@@ -59,6 +59,7 @@  obj-$(CONFIG_NET_VENDOR_LITEX) += litex/
 obj-$(CONFIG_NET_VENDOR_MARVELL) += marvell/
 obj-$(CONFIG_NET_VENDOR_MEDIATEK) += mediatek/
 obj-$(CONFIG_NET_VENDOR_MELLANOX) += mellanox/
+obj-$(CONFIG_NET_VENDOR_META) += meta/
 obj-$(CONFIG_NET_VENDOR_MICREL) += micrel/
 obj-$(CONFIG_NET_VENDOR_MICROCHIP) += microchip/
 obj-$(CONFIG_NET_VENDOR_MICROSEMI) += mscc/
diff --git a/drivers/net/ethernet/meta/Kconfig b/drivers/net/ethernet/meta/Kconfig
new file mode 100644
index 000000000000..8949ab15a02e
--- /dev/null
+++ b/drivers/net/ethernet/meta/Kconfig
@@ -0,0 +1,29 @@ 
+# SPDX-License-Identifier: GPL-2.0-only
+#
+# Meta Platforms network device configuration
+#
+
+config NET_VENDOR_META
+	bool "Meta Platforms devices"
+	default y
+	help
+	  If you have a network (Ethernet) card designed by Meta, say Y.
+	  That's Meta as in the parent company of Facebook.
+
+	  Note that the answer to this question doesn't directly affect the
+	  kernel: saying N will just cause the configurator to skip all
+	  the questions about Meta cards. If you say Y, you will be asked for
+	  your specific card in the following questions.
+
+if NET_VENDOR_META
+
+config FBNIC
+	tristate "Meta Platforms Host Network Interface"
+	depends on PCI_MSI
+	help
+	  This driver supports Meta Platforms Host Network Interface.
+
+	  To compile this driver as a module, choose M here. The module
+	  will be called fbnic.  MSI-X interrupt support is required.
+
+endif # NET_VENDOR_META
diff --git a/drivers/net/ethernet/meta/Makefile b/drivers/net/ethernet/meta/Makefile
new file mode 100644
index 000000000000..88804f3de963
--- /dev/null
+++ b/drivers/net/ethernet/meta/Makefile
@@ -0,0 +1,6 @@ 
+# SPDX-License-Identifier: GPL-2.0
+#
+# Makefile for the Meta Platforms network device drivers.
+#
+
+obj-$(CONFIG_FBNIC) += fbnic/
diff --git a/drivers/net/ethernet/meta/fbnic/Makefile b/drivers/net/ethernet/meta/fbnic/Makefile
new file mode 100644
index 000000000000..ce277fec875f
--- /dev/null
+++ b/drivers/net/ethernet/meta/fbnic/Makefile
@@ -0,0 +1,10 @@ 
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) Meta Platforms, Inc. and affiliates.
+
+#
+# Makefile for the Meta(R) Host Network Interface
+#
+
+obj-$(CONFIG_FBNIC) += fbnic.o
+
+fbnic-y := fbnic_pci.o
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic.h b/drivers/net/ethernet/meta/fbnic/fbnic.h
new file mode 100644
index 000000000000..25702dab8d66
--- /dev/null
+++ b/drivers/net/ethernet/meta/fbnic/fbnic.h
@@ -0,0 +1,19 @@ 
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (c) Meta Platforms, Inc. and affiliates. */
+
+#ifndef _FBNIC_H_
+#define _FBNIC_H_
+
+#include "fbnic_csr.h"
+
+extern char fbnic_driver_name[];
+
+enum fbnic_boards {
+	fbnic_board_asic
+};
+
+struct fbnic_info {
+	unsigned int bar_mask;
+};
+
+#endif /* _FBNIC_H_ */
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_csr.h b/drivers/net/ethernet/meta/fbnic/fbnic_csr.h
new file mode 100644
index 000000000000..72e89c07bf54
--- /dev/null
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_csr.h
@@ -0,0 +1,9 @@ 
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (c) Meta Platforms, Inc. and affiliates. */
+
+#ifndef _FBNIC_CSR_H_
+#define _FBNIC_CSR_H_
+
+#define PCI_DEVICE_ID_META_FBNIC_ASIC		0x0013
+
+#endif /* _FBNIC_CSR_H_ */
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_drvinfo.h b/drivers/net/ethernet/meta/fbnic/fbnic_drvinfo.h
new file mode 100644
index 000000000000..809ba6729442
--- /dev/null
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_drvinfo.h
@@ -0,0 +1,5 @@ 
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (c) Meta Platforms, Inc. and affiliates. */
+
+#define DRV_NAME "fbnic"
+#define DRV_SUMMARY "Meta(R) Host Network Interface Driver"
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_pci.c b/drivers/net/ethernet/meta/fbnic/fbnic_pci.c
new file mode 100644
index 000000000000..1cb71cb1de14
--- /dev/null
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_pci.c
@@ -0,0 +1,196 @@ 
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) Meta Platforms, Inc. and affiliates. */
+
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/types.h>
+
+#include "fbnic.h"
+#include "fbnic_drvinfo.h"
+
+char fbnic_driver_name[] = DRV_NAME;
+
+MODULE_DESCRIPTION(DRV_SUMMARY);
+MODULE_LICENSE("GPL");
+
+static const struct fbnic_info fbnic_asic_info = {
+	.bar_mask = BIT(0) | BIT(4)
+};
+
+static const struct fbnic_info *fbnic_info_tbl[] = {
+	[fbnic_board_asic] = &fbnic_asic_info,
+};
+
+static const struct pci_device_id fbnic_pci_tbl[] = {
+	{ PCI_DEVICE_DATA(META, FBNIC_ASIC, fbnic_board_asic) },
+	/* required last entry */
+	{0, }
+};
+MODULE_DEVICE_TABLE(pci, fbnic_pci_tbl);
+
+/**
+ *  fbnic_probe - Device Initialization Routine
+ *  @pdev: PCI device information struct
+ *  @ent: entry in fbnic_pci_tbl
+ *
+ *  Returns 0 on success, negative on failure
+ *
+ *  Initializes a PCI device identified by a pci_dev structure.
+ *  The OS initialization, configuring of the adapter private structure,
+ *  and a hardware reset occur.
+ **/
+static int fbnic_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
+{
+	const struct fbnic_info *info = fbnic_info_tbl[ent->driver_data];
+	int err;
+
+	if (pdev->error_state != pci_channel_io_normal) {
+		dev_err(&pdev->dev,
+			"PCI device still in an error state. Unable to load...\n");
+		return -EIO;
+	}
+
+	err = pcim_enable_device(pdev);
+	if (err) {
+		dev_err(&pdev->dev, "PCI enable device failed: %d\n", err);
+		return err;
+	}
+
+	err = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(46));
+	if (err)
+		err = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32));
+	if (err) {
+		dev_err(&pdev->dev, "DMA configuration failed: %d\n", err);
+		return err;
+	}
+
+	err = pcim_iomap_regions(pdev, info->bar_mask, fbnic_driver_name);
+	if (err) {
+		dev_err(&pdev->dev,
+			"pci_request_selected_regions failed: %d\n", err);
+		return err;
+	}
+
+	pci_set_master(pdev);
+	pci_save_state(pdev);
+
+	return 0;
+}
+
+/**
+ * fbnic_remove - Device Removal Routine
+ * @pdev: PCI device information struct
+ *
+ * Called by the PCI subsystem to alert the driver that it should release
+ * a PCI device.  The could be caused by a Hot-Plug event, or because the
+ * driver is going to be removed from memory.
+ **/
+static void fbnic_remove(struct pci_dev *pdev)
+{
+}
+
+static int fbnic_pm_suspend(struct device *dev)
+{
+	return 0;
+}
+
+static int __fbnic_pm_resume(struct device *dev)
+{
+	return 0;
+}
+
+static int __maybe_unused fbnic_pm_resume(struct device *dev)
+{
+	int err;
+
+	err = __fbnic_pm_resume(dev);
+
+	return err;
+}
+
+static const struct dev_pm_ops fbnic_pm_ops = {
+	SET_SYSTEM_SLEEP_PM_OPS(fbnic_pm_suspend, fbnic_pm_resume)
+};
+
+static void fbnic_shutdown(struct pci_dev *pdev)
+{
+	fbnic_pm_suspend(&pdev->dev);
+}
+
+static pci_ers_result_t fbnic_err_error_detected(struct pci_dev *pdev,
+						 pci_channel_state_t state)
+{
+	/* disconnect device if failure is not recoverable via reset */
+	if (state == pci_channel_io_perm_failure)
+		return PCI_ERS_RESULT_DISCONNECT;
+
+	fbnic_pm_suspend(&pdev->dev);
+
+	/* Request a slot reset */
+	return PCI_ERS_RESULT_NEED_RESET;
+}
+
+static pci_ers_result_t fbnic_err_slot_reset(struct pci_dev *pdev)
+{
+	pci_set_power_state(pdev, PCI_D0);
+	pci_restore_state(pdev);
+	pci_save_state(pdev);
+
+	if (pci_enable_device_mem(pdev)) {
+		dev_err(&pdev->dev,
+			"Cannot re-enable PCI device after reset.\n");
+		return PCI_ERS_RESULT_DISCONNECT;
+	}
+
+	return PCI_ERS_RESULT_RECOVERED;
+}
+
+static void fbnic_err_resume(struct pci_dev *pdev)
+{
+}
+
+static const struct pci_error_handlers fbnic_err_handler = {
+	.error_detected	= fbnic_err_error_detected,
+	.slot_reset	= fbnic_err_slot_reset,
+	.resume		= fbnic_err_resume,
+};
+
+static struct pci_driver fbnic_driver = {
+	.name		= fbnic_driver_name,
+	.id_table	= fbnic_pci_tbl,
+	.probe		= fbnic_probe,
+	.remove		= fbnic_remove,
+	.driver.pm	= &fbnic_pm_ops,
+	.shutdown	= fbnic_shutdown,
+	.err_handler	= &fbnic_err_handler,
+};
+
+/**
+ * fbnic_init_module - Driver Registration Routine
+ *
+ * The first routine called when the driver is loaded.  All it does is
+ * register with the PCI subsystem.
+ **/
+static int __init fbnic_init_module(void)
+{
+	int err;
+
+	pr_info(DRV_SUMMARY " (%s)", fbnic_driver.name);
+
+	err = pci_register_driver(&fbnic_driver);
+
+	return err;
+}
+module_init(fbnic_init_module);
+
+/**
+ * fbnic_exit_module - Driver Exit Cleanup Routine
+ *
+ * Called just before the driver is removed from memory.
+ **/
+static void __exit fbnic_exit_module(void)
+{
+	pci_unregister_driver(&fbnic_driver);
+}
+module_exit(fbnic_exit_module);