diff mbox

[v3,3/3] staging/rdma/hfi1: Method to toggle "fast ECN" detection

Message ID 1447112084-12532-1-git-send-email-ira.weiny@intel.com (mailing list archive)
State Not Applicable
Headers show

Commit Message

Ira Weiny Nov. 9, 2015, 11:34 p.m. UTC
From: Vennila Megavannan <vennila.megavannan@intel.com>

Add a module paramter to toggle prescan/Fast ECN Detection and remove the
Kconfig option which used to control this.

Reviewed-by: Arthur Kepner<arthur.kepner@intel.com>
Reviewed-by: Mike Marciniszyn<mike.marciniszyn@intel.com>
Signed-off-by: Vennila Megavannan<vennila.megavannan@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>

---
Changes from V1:
	Redo commit message as well as Kconfig help to make it clear what the
	compile and module options do.

Changes from V2:
	Remove Kconfig option completely

 drivers/staging/rdma/hfi1/Kconfig  | 10 ----------
 drivers/staging/rdma/hfi1/driver.c | 24 ++++++++++++++----------
 2 files changed, 14 insertions(+), 20 deletions(-)

Comments

Greg KH Nov. 20, 2015, 12:54 a.m. UTC | #1
On Mon, Nov 09, 2015 at 06:34:44PM -0500, ira.weiny@intel.com wrote:
> From: Vennila Megavannan <vennila.megavannan@intel.com>
> 
> Add a module paramter to toggle prescan/Fast ECN Detection and remove the
> Kconfig option which used to control this.

Ick, no, not a module parameter, that's horrid (hint, it isn't a
per-device option...)

Why can't you do this dynamically?
Why would anyone ever want to make this "slow"?

thanks,

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ira Weiny Nov. 23, 2015, 2:15 a.m. UTC | #2
On Thu, Nov 19, 2015 at 04:54:44PM -0800, Greg KH wrote:
> On Mon, Nov 09, 2015 at 06:34:44PM -0500, ira.weiny@intel.com wrote:
> > From: Vennila Megavannan <vennila.megavannan@intel.com>
> > 
> > Add a module paramter to toggle prescan/Fast ECN Detection and remove the
> > Kconfig option which used to control this.
> 
> Ick, no, not a module parameter, that's horrid (hint, it isn't a
> per-device option...)

This is a good point.  Previous to this patch we had a compile time option
which would have affected all devices and I think we just continued that.  I do
like the idea of making this per port.  I will respin the patch.

However, I want to be clear on your hint.  Are you saying that sysfs would be a
better place to put such a flag?

> 
> Why can't you do this dynamically?

ECN is always on.  The key is the reaction time of the individual port.
Attempting to turn this on and off would affect both the reaction time and the
processing time in a negative way.

> Why would anyone ever want to make this "slow"?

This is a tuning nob for over all fabric performance not individual node
performance.  ECN controls congestion spreading through the network as is
explained in this paper.

http://infocom2003.ieee-infocom.org/papers/28_01.PDF

As is shown in figure 1 and 2 of that paper congestion at node Bc is affecting
traffic at node Bv.  The default reaction time of ECN is likely to be
sufficient for most users based on our experience so far.

However, should a particular network see system wide degradation, this option
can be turned on to increase the reaction time at node Bc.  However, node Bc is
already overloaded so the trade off is likely acceptable.

Unfortunately, it is hard to predict when a user may need this option as we
don't have the resources to build extreme scale fabrics for testing.  Nor do we
know all users workloads or the fabric topologies those workloads may be
running on.

We developed the code which this option enables based on lab experiments.  So
we need this to be an option available to users.

Ira

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Greg KH Nov. 23, 2015, 2:41 a.m. UTC | #3
On Sun, Nov 22, 2015 at 09:15:02PM -0500, ira.weiny wrote:
> On Thu, Nov 19, 2015 at 04:54:44PM -0800, Greg KH wrote:
> > On Mon, Nov 09, 2015 at 06:34:44PM -0500, ira.weiny@intel.com wrote:
> > > From: Vennila Megavannan <vennila.megavannan@intel.com>
> > > 
> > > Add a module paramter to toggle prescan/Fast ECN Detection and remove the
> > > Kconfig option which used to control this.
> > 
> > Ick, no, not a module parameter, that's horrid (hint, it isn't a
> > per-device option...)
> 
> This is a good point.  Previous to this patch we had a compile time option
> which would have affected all devices and I think we just continued that.  I do
> like the idea of making this per port.  I will respin the patch.
> 
> However, I want to be clear on your hint.  Are you saying that sysfs would be a
> better place to put such a flag?

Maybe, if you want it per-device, but really, you don't want to have any
"knobs" a user has to tune.

> > Why can't you do this dynamically?
> 
> ECN is always on.  The key is the reaction time of the individual port.
> Attempting to turn this on and off would affect both the reaction time and the
> processing time in a negative way.
> 
> > Why would anyone ever want to make this "slow"?
> 
> This is a tuning nob for over all fabric performance not individual node
> performance.  ECN controls congestion spreading through the network as is
> explained in this paper.
> 
> http://infocom2003.ieee-infocom.org/papers/28_01.PDF
> 
> As is shown in figure 1 and 2 of that paper congestion at node Bc is affecting
> traffic at node Bv.  The default reaction time of ECN is likely to be
> sufficient for most users based on our experience so far.
> 
> However, should a particular network see system wide degradation, this option
> can be turned on to increase the reaction time at node Bc.  However, node Bc is
> already overloaded so the trade off is likely acceptable.
> 
> Unfortunately, it is hard to predict when a user may need this option as we
> don't have the resources to build extreme scale fabrics for testing.  Nor do we
> know all users workloads or the fabric topologies those workloads may be
> running on.
> 
> We developed the code which this option enables based on lab experiments.  So
> we need this to be an option available to users.

Ok, then make it able to be a per-device option, through sysfs, and
choose the "best" option to set it to be as the default.

thanks,

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/staging/rdma/hfi1/Kconfig b/drivers/staging/rdma/hfi1/Kconfig
index fd25078ee923..0ad92fdb3fe6 100644
--- a/drivers/staging/rdma/hfi1/Kconfig
+++ b/drivers/staging/rdma/hfi1/Kconfig
@@ -25,13 +25,3 @@  config SDMA_VERBOSITY
 	---help---
 	This is a configuration flag to enable verbose
 	SDMA debug
-config PRESCAN_RXQ
-	bool "Enable prescanning of the RX queue for ECNs"
-	depends on INFINIBAND_HFI1
-	default n
-	---help---
-	This option toggles the prescanning of the receive queue for
-	Explicit Congestion Notifications. If an ECN is detected, it
-	is processed as quickly as possible, the ECN is toggled off.
-	After the prescanning step, the receive queue is processed as
-	usual.
diff --git a/drivers/staging/rdma/hfi1/driver.c b/drivers/staging/rdma/hfi1/driver.c
index 9a4ec09af020..145ac3061f5d 100644
--- a/drivers/staging/rdma/hfi1/driver.c
+++ b/drivers/staging/rdma/hfi1/driver.c
@@ -83,6 +83,12 @@  unsigned int hfi1_cu = 1;
 module_param_named(cu, hfi1_cu, uint, S_IRUGO);
 MODULE_PARM_DESC(cu, "Credit return units");
 
+static unsigned int prescan_rx_queue;
+module_param_named(prescan_rxq, prescan_rx_queue, uint,
+		   S_IRUGO | S_IWUSR);
+MODULE_PARM_DESC(prescan_rxq,
+		 "Used to toggle rx prescan. Set to 1 to enable prescan");
+
 unsigned long hfi1_cap_mask = HFI1_CAP_MASK_DEFAULT;
 static int hfi1_caps_set(const char *, const struct kernel_param *);
 static int hfi1_caps_get(char *, const struct kernel_param *);
@@ -434,11 +440,6 @@  static inline void init_packet(struct hfi1_ctxtdata *rcd,
 	packet->rcv_flags = 0;
 }
 
-#ifndef CONFIG_PRESCAN_RXQ
-static void prescan_rxq(struct hfi1_packet *packet) {}
-#else /* !CONFIG_PRESCAN_RXQ */
-static int prescan_receive_queue;
-
 static void process_ecn(struct hfi1_qp *qp, struct hfi1_ib_header *hdr,
 			struct hfi1_other_headers *ohdr,
 			u64 rhf, u32 bth1, struct ib_grh *grh)
@@ -541,15 +542,19 @@  static inline void update_ps_mdata(struct ps_mdata *mdata)
  * containing Excplicit Congestion Notifications (FECNs, or BECNs).
  * When an ECN is found, process the Congestion Notification, and toggle
  * it off.
+ * This is declared as a macro to allow quick checking of the module param and
+ * avoid the overhead of a function call if not enabled.
  */
-static void prescan_rxq(struct hfi1_packet *packet)
+#define prescan_rxq(packet) \
+	do { \
+		if (prescan_rx_queue) \
+			__prescan_rxq(packet); \
+	} while (0)
+static void __prescan_rxq(struct hfi1_packet *packet)
 {
 	struct hfi1_ctxtdata *rcd = packet->rcd;
 	struct ps_mdata mdata;
 
-	if (!prescan_receive_queue)
-		return;
-
 	init_ps_mdata(&mdata, packet);
 
 	while (1) {
@@ -609,7 +614,6 @@  next:
 		update_ps_mdata(&mdata);
 	}
 }
-#endif /* CONFIG_PRESCAN_RXQ */
 
 static inline int process_rcv_packet(struct hfi1_packet *packet, int thread)
 {