Message ID | 1516831913-26013-1-git-send-email-mikelley@ntdev.microsoft.com (mailing list archive) |
---|---|
State | Accepted |
Headers | show |
Updated/corrected two email addresses ... > -----Original Message----- > From: Michael Kelley (EOSG) > Sent: Wednesday, January 24, 2018 2:14 PM > To: KY Srinivasan <kys@microsoft.com>; Stephen Hemminger <sthemmin@microsoft.com>; > martin.petersen@oracle.com; longi@microsoft.com; JBottomley@odin.com; > devel@linuxdriverproject.org; linux-kernel@vger.kernel.org; linux-scsi@vger.kernel.org > Cc: Michael Kelley (EOSG) <Michael.H.Kelley@microsoft.com> > Subject: [PATCH 1/1] scsi: storvsc: Spread interrupts when picking a channel for I/O requests > > Update the algorithm in storvsc_do_io to look for a channel > starting with the current CPU + 1 and wrap around (within the > current NUMA node). This spreads VMbus interrupts more evenly > across CPUs. Previous code always started with first CPU in > the current NUMA node, skewing the interrupt load to that CPU. > > Signed-off-by: Michael Kelley <mikelley@microsoft.com> > --- > drivers/scsi/storvsc_drv.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c > index e07907d..f3264c4 100644 > --- a/drivers/scsi/storvsc_drv.c > +++ b/drivers/scsi/storvsc_drv.c > @@ -1310,7 +1310,8 @@ static int storvsc_do_io(struct hv_device *device, > */ > cpumask_and(&alloced_mask, &stor_device->alloced_cpus, > cpumask_of_node(cpu_to_node(q_num))); > - for_each_cpu(tgt_cpu, &alloced_mask) { > + for_each_cpu_wrap(tgt_cpu, &alloced_mask, > + outgoing_channel->target_cpu + 1) { > if (tgt_cpu != outgoing_channel->target_cpu) { > outgoing_channel = > stor_device->stor_chns[tgt_cpu]; > -- > 1.8.3.1
> Subject: RE: [PATCH 1/1] scsi: storvsc: Spread interrupts when picking a > channel for I/O requests > > Updated/corrected two email addresses ... > > > -----Original Message----- > > From: Michael Kelley (EOSG) > > Sent: Wednesday, January 24, 2018 2:14 PM > > To: KY Srinivasan <kys@microsoft.com>; Stephen Hemminger > > <sthemmin@microsoft.com>; martin.petersen@oracle.com; > > longi@microsoft.com; JBottomley@odin.com; > > devel@linuxdriverproject.org; linux-kernel@vger.kernel.org; > > linux-scsi@vger.kernel.org > > Cc: Michael Kelley (EOSG) <Michael.H.Kelley@microsoft.com> > > Subject: [PATCH 1/1] scsi: storvsc: Spread interrupts when picking a > > channel for I/O requests > > > > Update the algorithm in storvsc_do_io to look for a channel starting > > with the current CPU + 1 and wrap around (within the current NUMA > > node). This spreads VMbus interrupts more evenly across CPUs. Previous > > code always started with first CPU in the current NUMA node, skewing > > the interrupt load to that CPU. > > > > Signed-off-by: Michael Kelley <mikelley@microsoft.com> > > --- > > drivers/scsi/storvsc_drv.c | 3 ++- > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c > > index e07907d..f3264c4 100644 > > --- a/drivers/scsi/storvsc_drv.c > > +++ b/drivers/scsi/storvsc_drv.c > > @@ -1310,7 +1310,8 @@ static int storvsc_do_io(struct hv_device *device, > > */ > > cpumask_and(&alloced_mask, &stor_device- > >alloced_cpus, > > > cpumask_of_node(cpu_to_node(q_num))); > > - for_each_cpu(tgt_cpu, &alloced_mask) { > > + for_each_cpu_wrap(tgt_cpu, &alloced_mask, > > + outgoing_channel->target_cpu + 1) { Does it work when target_cpu is the last CPU on the system? Otherwise, looking good. > > if (tgt_cpu != outgoing_channel->target_cpu) > { > > outgoing_channel = > > stor_device->stor_chns[tgt_cpu]; > > -- > > 1.8.3.1
> From: Long Li > Sent: Wednesday, January 31, 2018 12:23 PM > To: Michael Kelley (EOSG) <Michael.H.Kelley@microsoft.com>; KY Srinivasan > <kys@microsoft.com>; Stephen Hemminger <sthemmin@microsoft.com>; > martin.petersen@oracle.com; devel@linuxdriverproject.org; linux-kernel@vger.kernel.org; > linux-scsi@vger.kernel.org; James E . J . Bottomley <jejb@linux.vnet.ibm.com> > Subject: RE: [PATCH 1/1] scsi: storvsc: Spread interrupts when picking a channel for I/O > requests > > > Subject: RE: [PATCH 1/1] scsi: storvsc: Spread interrupts when picking a > > channel for I/O requests > > > > Updated/corrected two email addresses ... > > > > > -----Original Message----- > > > From: Michael Kelley (EOSG) > > > Sent: Wednesday, January 24, 2018 2:14 PM > > > To: KY Srinivasan <kys@microsoft.com>; Stephen Hemminger > > > <sthemmin@microsoft.com>; martin.petersen@oracle.com; > > > longi@microsoft.com; JBottomley@odin.com; > > > devel@linuxdriverproject.org; linux-kernel@vger.kernel.org; > > > linux-scsi@vger.kernel.org > > > Cc: Michael Kelley (EOSG) <Michael.H.Kelley@microsoft.com> > > > Subject: [PATCH 1/1] scsi: storvsc: Spread interrupts when picking a > > > channel for I/O requests > > > > > > Update the algorithm in storvsc_do_io to look for a channel starting > > > with the current CPU + 1 and wrap around (within the current NUMA > > > node). This spreads VMbus interrupts more evenly across CPUs. Previous > > > code always started with first CPU in the current NUMA node, skewing > > > the interrupt load to that CPU. > > > > > > Signed-off-by: Michael Kelley <mikelley@microsoft.com> > > > --- > > > drivers/scsi/storvsc_drv.c | 3 ++- > > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > > > diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c > > > index e07907d..f3264c4 100644 > > > --- a/drivers/scsi/storvsc_drv.c > > > +++ b/drivers/scsi/storvsc_drv.c > > > @@ -1310,7 +1310,8 @@ static int storvsc_do_io(struct hv_device *device, > > > */ > > > cpumask_and(&alloced_mask, &stor_device- > > >alloced_cpus, > > > > > cpumask_of_node(cpu_to_node(q_num))); > > > - for_each_cpu(tgt_cpu, &alloced_mask) { > > > + for_each_cpu_wrap(tgt_cpu, &alloced_mask, > > > + outgoing_channel->target_cpu + 1) { > > Does it work when target_cpu is the last CPU on the system? > > Otherwise, looking good. Yes, it works. for_each_cpu_wrap() correctly wraps in the case where the 3rd parameter ('start') is one past the end of the mask. Arguably, we shouldn't rely on that, and should do the wrap to 0 before calling for_each_cpu_wrap(). > > > > if (tgt_cpu != outgoing_channel->target_cpu) > > { > > > outgoing_channel = > > > stor_device->stor_chns[tgt_cpu]; > > > -- > > > 1.8.3.1
> Subject: RE: [PATCH 1/1] scsi: storvsc: Spread interrupts when picking a > channel for I/O requests > > > From: Long Li > > Sent: Wednesday, January 31, 2018 12:23 PM > > To: Michael Kelley (EOSG) <Michael.H.Kelley@microsoft.com>; KY > > Srinivasan <kys@microsoft.com>; Stephen Hemminger > > <sthemmin@microsoft.com>; martin.petersen@oracle.com; > > devel@linuxdriverproject.org; linux-kernel@vger.kernel.org; > > linux-scsi@vger.kernel.org; James E . J . Bottomley > > <jejb@linux.vnet.ibm.com> > > Subject: RE: [PATCH 1/1] scsi: storvsc: Spread interrupts when picking > > a channel for I/O requests > > > > > Subject: RE: [PATCH 1/1] scsi: storvsc: Spread interrupts when > > > picking a channel for I/O requests > > > > > > Updated/corrected two email addresses ... > > > > > > > -----Original Message----- > > > > From: Michael Kelley (EOSG) > > > > Sent: Wednesday, January 24, 2018 2:14 PM > > > > To: KY Srinivasan <kys@microsoft.com>; Stephen Hemminger > > > > <sthemmin@microsoft.com>; martin.petersen@oracle.com; > > > > longi@microsoft.com; JBottomley@odin.com; > > > > devel@linuxdriverproject.org; linux-kernel@vger.kernel.org; > > > > linux-scsi@vger.kernel.org > > > > Cc: Michael Kelley (EOSG) <Michael.H.Kelley@microsoft.com> > > > > Subject: [PATCH 1/1] scsi: storvsc: Spread interrupts when picking > > > > a channel for I/O requests > > > > > > > > Update the algorithm in storvsc_do_io to look for a channel > > > > starting with the current CPU + 1 and wrap around (within the > > > > current NUMA node). This spreads VMbus interrupts more evenly > > > > across CPUs. Previous code always started with first CPU in the > > > > current NUMA node, skewing the interrupt load to that CPU. > > > > > > > > Signed-off-by: Michael Kelley <mikelley@microsoft.com> Reviewed-by: Long Li <longli@microsoft.com> > > > > --- > > > > drivers/scsi/storvsc_drv.c | 3 ++- > > > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > > > > > diff --git a/drivers/scsi/storvsc_drv.c > > > > b/drivers/scsi/storvsc_drv.c index e07907d..f3264c4 100644 > > > > --- a/drivers/scsi/storvsc_drv.c > > > > +++ b/drivers/scsi/storvsc_drv.c > > > > @@ -1310,7 +1310,8 @@ static int storvsc_do_io(struct hv_device > *device, > > > > */ > > > > cpumask_and(&alloced_mask, &stor_device- > alloced_cpus, > > > > > > > cpumask_of_node(cpu_to_node(q_num))); > > > > - for_each_cpu(tgt_cpu, &alloced_mask) { > > > > + for_each_cpu_wrap(tgt_cpu, &alloced_mask, > > > > + outgoing_channel->target_cpu + 1) { > > > > Does it work when target_cpu is the last CPU on the system? > > > > Otherwise, looking good. > > Yes, it works. for_each_cpu_wrap() correctly wraps in the case where the > 3rd parameter ('start') is one past the end of the mask. Arguably, we > shouldn't rely on that, and should do the wrap to 0 before calling > for_each_cpu_wrap(). > > > > > > > if (tgt_cpu != outgoing_channel->target_cpu) > > > { > > > > outgoing_channel = > > > > stor_device->stor_chns[tgt_cpu]; > > > > -- > > > > 1.8.3.1
Michael, > Update the algorithm in storvsc_do_io to look for a channel starting > with the current CPU + 1 and wrap around (within the current NUMA > node). This spreads VMbus interrupts more evenly across CPUs. Previous > code always started with first CPU in the current NUMA node, skewing > the interrupt load to that CPU. Applied to 4.16/scsi-fixes. Thanks!
diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c index e07907d..f3264c4 100644 --- a/drivers/scsi/storvsc_drv.c +++ b/drivers/scsi/storvsc_drv.c @@ -1310,7 +1310,8 @@ static int storvsc_do_io(struct hv_device *device, */ cpumask_and(&alloced_mask, &stor_device->alloced_cpus, cpumask_of_node(cpu_to_node(q_num))); - for_each_cpu(tgt_cpu, &alloced_mask) { + for_each_cpu_wrap(tgt_cpu, &alloced_mask, + outgoing_channel->target_cpu + 1) { if (tgt_cpu != outgoing_channel->target_cpu) { outgoing_channel = stor_device->stor_chns[tgt_cpu];
Update the algorithm in storvsc_do_io to look for a channel starting with the current CPU + 1 and wrap around (within the current NUMA node). This spreads VMbus interrupts more evenly across CPUs. Previous code always started with first CPU in the current NUMA node, skewing the interrupt load to that CPU. Signed-off-by: Michael Kelley <mikelley@microsoft.com> --- drivers/scsi/storvsc_drv.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)