Message ID | 20210111231225.105347-2-tyreld@linux.ibm.com (mailing list archive) |
---|---|
State | Changes Requested |
Headers | show |
Series | ibmvfc: initial MQ development | expand |
On 11/01/2021 23:12, Tyrel Datwyler wrote: > Introduce several new vhost fields for managing MQ state of the adapter > as well as initial defaults for MQ enablement. > > Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com> > --- > drivers/scsi/ibmvscsi/ibmvfc.c | 8 ++++++++ > drivers/scsi/ibmvscsi/ibmvfc.h | 9 +++++++++ > 2 files changed, 17 insertions(+) > > diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c > index ba95438a8912..9200fe49c57e 100644 > --- a/drivers/scsi/ibmvscsi/ibmvfc.c > +++ b/drivers/scsi/ibmvscsi/ibmvfc.c > @@ -3302,6 +3302,7 @@ static struct scsi_host_template driver_template = { > .max_sectors = IBMVFC_MAX_SECTORS, > .shost_attrs = ibmvfc_attrs, > .track_queue_depth = 1, > + .host_tagset = 1, Good to see another user :) I didn't check the whole series very thoroughly, but I guess that you only need to set this when shost->nr_hw_queues > 1. Having said that, it should be fine when shost->nr_hw_queues = 0 or 1. Thanks, John > }; > > /** > @@ -5290,6 +5291,7 @@ static int ibmvfc_probe(struct vio_dev *vdev, const struct vio_device_id *id) > shost->max_sectors = IBMVFC_MAX_SECTORS; > shost->max_cmd_len = IBMVFC_MAX_CDB_LEN; > shost->unique_id = shost->host_no; > + shost->nr_hw_queues = IBMVFC_MQ ? IBMVFC_SCSI_HW_QUEUES : 1; > > vhost = shost_priv(shost); > INIT_LIST_HEAD(&vhost->targets); > @@ -5300,6 +5302,12 @@ static int ibmvfc_probe(struct vio_dev *vdev, const struct vio_device_id *id) > vhost->partition_number = -1; > vhost->log_level = log_level; > vhost->task_set = 1; > + > + vhost->mq_enabled = IBMVFC_MQ; > + vhost->client_scsi_channels = IBMVFC_SCSI_CHANNELS; > + vhost->using_channels = 0; > + vhost->do_enquiry = 1; > + > strcpy(vhost->partition_name, "UNKNOWN"); > init_waitqueue_head(&vhost->work_wait_q); > init_waitqueue_head(&vhost->init_wait_q); > diff --git a/drivers/scsi/ibmvscsi/ibmvfc.h b/drivers/scsi/ibmvscsi/ibmvfc.h > index 632e977449c5..dd6d89292867 100644 > --- a/drivers/scsi/ibmvscsi/ibmvfc.h > +++ b/drivers/scsi/ibmvscsi/ibmvfc.h > @@ -41,6 +41,11 @@ > #define IBMVFC_DEFAULT_LOG_LEVEL 2 > #define IBMVFC_MAX_CDB_LEN 16 > #define IBMVFC_CLS3_ERROR 0 > +#define IBMVFC_MQ 0 > +#define IBMVFC_SCSI_CHANNELS 0 > +#define IBMVFC_SCSI_HW_QUEUES 1 > +#define IBMVFC_MIG_NO_SUB_TO_CRQ 0 > +#define IBMVFC_MIG_NO_N_TO_M 0 > > /* > * Ensure we have resources for ERP and initialization: > @@ -840,6 +845,10 @@ struct ibmvfc_host { > int delay_init; > int scan_complete; > int logged_in; > + int mq_enabled; > + int using_channels; > + int do_enquiry; > + int client_scsi_channels; > int aborting_passthru; > int events_to_log; > #define IBMVFC_AE_LINKUP 0x0001 >
On 1/11/21 5:12 PM, Tyrel Datwyler wrote: > Introduce several new vhost fields for managing MQ state of the adapter > as well as initial defaults for MQ enablement. > > Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com> > --- > drivers/scsi/ibmvscsi/ibmvfc.c | 8 ++++++++ > drivers/scsi/ibmvscsi/ibmvfc.h | 9 +++++++++ > 2 files changed, 17 insertions(+) > > diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c > index ba95438a8912..9200fe49c57e 100644 > --- a/drivers/scsi/ibmvscsi/ibmvfc.c > +++ b/drivers/scsi/ibmvscsi/ibmvfc.c > @@ -3302,6 +3302,7 @@ static struct scsi_host_template driver_template = { > .max_sectors = IBMVFC_MAX_SECTORS, > .shost_attrs = ibmvfc_attrs, > .track_queue_depth = 1, > + .host_tagset = 1, This doesn't seem right. You are setting host_tagset, which means you want a shared, host wide, tag set for commands. It also means that the total queue depth for the host is can_queue. However, it looks like you are allocating max_requests events for each sub crq, which means you are over allocating memory. Looking at this closer, we might have bigger problems. There is a host wide max number of commands that the VFC host supports, which gets returned on NPIV Login. This value can change across a live migration event. The ibmvfc driver, which does the same thing the lpfc driver does, modifies can_queue on the scsi_host *after* the tag set has been allocated. This looks to be a concern with ibmvfc, not sure about lpfc, as it doesn't look like we look at can_queue once the tag set is setup, and I'm not seeing a good way to dynamically change the host queue depth once the tag set is setup. Unless I'm missing something, our best options appear to either be to implement our own host wide busy reference counting, which doesn't sound very good, or we need to add some API to block / scsi that allows us to dynamically change can_queue. Thanks, Brian
On 1/12/21 2:54 PM, Brian King wrote: > On 1/11/21 5:12 PM, Tyrel Datwyler wrote: >> Introduce several new vhost fields for managing MQ state of the adapter >> as well as initial defaults for MQ enablement. >> >> Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com> >> --- >> drivers/scsi/ibmvscsi/ibmvfc.c | 8 ++++++++ >> drivers/scsi/ibmvscsi/ibmvfc.h | 9 +++++++++ >> 2 files changed, 17 insertions(+) >> >> diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c >> index ba95438a8912..9200fe49c57e 100644 >> --- a/drivers/scsi/ibmvscsi/ibmvfc.c >> +++ b/drivers/scsi/ibmvscsi/ibmvfc.c >> @@ -3302,6 +3302,7 @@ static struct scsi_host_template driver_template = { >> .max_sectors = IBMVFC_MAX_SECTORS, >> .shost_attrs = ibmvfc_attrs, >> .track_queue_depth = 1, >> + .host_tagset = 1, > > This doesn't seem right. You are setting host_tagset, which means you want a > shared, host wide, tag set for commands. It also means that the total > queue depth for the host is can_queue. However, it looks like you are allocating > max_requests events for each sub crq, which means you are over allocating memory. With the shared tagset yes the queue depth for the host is can_queue, but this also implies that the max queue depth for each hw queue is also can_queue. So, in the worst case that all commands are queued down the same hw queue we need an event pool with can_queue commands. > > Looking at this closer, we might have bigger problems. There is a host wide > max number of commands that the VFC host supports, which gets returned on > NPIV Login. This value can change across a live migration event. From what I understand the max commands can only become less. > > The ibmvfc driver, which does the same thing the lpfc driver does, modifies > can_queue on the scsi_host *after* the tag set has been allocated. This looks > to be a concern with ibmvfc, not sure about lpfc, as it doesn't look like > we look at can_queue once the tag set is setup, and I'm not seeing a good way > to dynamically change the host queue depth once the tag set is setup. > > Unless I'm missing something, our best options appear to either be to implement > our own host wide busy reference counting, which doesn't sound very good, or > we need to add some API to block / scsi that allows us to dynamically change > can_queue. Changing can_queue won't do use any good with the shared tagset becasue each queue still needs to be able to queue can_queue number of commands in the worst case. The complaint before was not using shared tags increases the tag memory allocation because they are statically allocated for each queue. The question is what claims more memory our event pool allocation or the tagset per queue allocation. If we chose to not use the shared tagset then the queue depth for each hw queue becomes (can_queue / nr_hw_queues). -Tyrel > > Thanks, > > Brian > >
On 1/12/21 6:33 PM, Tyrel Datwyler wrote: > On 1/12/21 2:54 PM, Brian King wrote: >> On 1/11/21 5:12 PM, Tyrel Datwyler wrote: >>> Introduce several new vhost fields for managing MQ state of the adapter >>> as well as initial defaults for MQ enablement. >>> >>> Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com> >>> --- >>> drivers/scsi/ibmvscsi/ibmvfc.c | 8 ++++++++ >>> drivers/scsi/ibmvscsi/ibmvfc.h | 9 +++++++++ >>> 2 files changed, 17 insertions(+) >>> >>> diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c >>> index ba95438a8912..9200fe49c57e 100644 >>> --- a/drivers/scsi/ibmvscsi/ibmvfc.c >>> +++ b/drivers/scsi/ibmvscsi/ibmvfc.c >>> @@ -3302,6 +3302,7 @@ static struct scsi_host_template driver_template = { >>> .max_sectors = IBMVFC_MAX_SECTORS, >>> .shost_attrs = ibmvfc_attrs, >>> .track_queue_depth = 1, >>> + .host_tagset = 1, >> >> This doesn't seem right. You are setting host_tagset, which means you want a >> shared, host wide, tag set for commands. It also means that the total >> queue depth for the host is can_queue. However, it looks like you are allocating >> max_requests events for each sub crq, which means you are over allocating memory. > > With the shared tagset yes the queue depth for the host is can_queue, but this > also implies that the max queue depth for each hw queue is also can_queue. So, > in the worst case that all commands are queued down the same hw queue we need an > event pool with can_queue commands. > >> >> Looking at this closer, we might have bigger problems. There is a host wide >> max number of commands that the VFC host supports, which gets returned on >> NPIV Login. This value can change across a live migration event. > > From what I understand the max commands can only become less. > >> >> The ibmvfc driver, which does the same thing the lpfc driver does, modifies >> can_queue on the scsi_host *after* the tag set has been allocated. This looks >> to be a concern with ibmvfc, not sure about lpfc, as it doesn't look like >> we look at can_queue once the tag set is setup, and I'm not seeing a good way >> to dynamically change the host queue depth once the tag set is setup. >> >> Unless I'm missing something, our best options appear to either be to implement >> our own host wide busy reference counting, which doesn't sound very good, or >> we need to add some API to block / scsi that allows us to dynamically change >> can_queue. > > Changing can_queue won't do use any good with the shared tagset becasue each > queue still needs to be able to queue can_queue number of commands in the worst > case. The issue I'm trying to highlight here is the following scenario: 1. We set shost->can_queue, then call scsi_add_host, which allocates the tag set. 2. On our NPIV login response from the VIOS, we might get a lower value than we initially set in shost->can_queue, so we update it, but nobody ever looks at it again, and we don't have any protection against sending too many commands to the host. Basically, we no longer have any code that ensures we don't send more commands to the VIOS than we are told it supports. According to the architecture, if we actually do this, the VIOS will do an h_free_crq, which would be a bit of a bug on our part. I don't think it was ever clearly defined in the API that a driver can change shost->can_queue after calling scsi_add_host, but up until commit 6eb045e092efefafc6687409a6fa6d1dabf0fb69, this worked and now it doesn't. I started looking through drivers that do this, and so far, it looks like the following drivers do: ibmvfc, lpfc, aix94xx, libfc, BusLogic, and likely others... We probably need an API that lets us change shost->can_queue dynamically. Thanks, Brian
On Wed, Jan 13, 2021 at 11:13:07AM -0600, Brian King wrote: > On 1/12/21 6:33 PM, Tyrel Datwyler wrote: > > On 1/12/21 2:54 PM, Brian King wrote: > >> On 1/11/21 5:12 PM, Tyrel Datwyler wrote: > >>> Introduce several new vhost fields for managing MQ state of the adapter > >>> as well as initial defaults for MQ enablement. > >>> > >>> Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com> > >>> --- > >>> drivers/scsi/ibmvscsi/ibmvfc.c | 8 ++++++++ > >>> drivers/scsi/ibmvscsi/ibmvfc.h | 9 +++++++++ > >>> 2 files changed, 17 insertions(+) > >>> > >>> diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c > >>> index ba95438a8912..9200fe49c57e 100644 > >>> --- a/drivers/scsi/ibmvscsi/ibmvfc.c > >>> +++ b/drivers/scsi/ibmvscsi/ibmvfc.c > >>> @@ -3302,6 +3302,7 @@ static struct scsi_host_template driver_template = { > >>> .max_sectors = IBMVFC_MAX_SECTORS, > >>> .shost_attrs = ibmvfc_attrs, > >>> .track_queue_depth = 1, > >>> + .host_tagset = 1, > >> > >> This doesn't seem right. You are setting host_tagset, which means you want a > >> shared, host wide, tag set for commands. It also means that the total > >> queue depth for the host is can_queue. However, it looks like you are allocating > >> max_requests events for each sub crq, which means you are over allocating memory. > > > > With the shared tagset yes the queue depth for the host is can_queue, but this > > also implies that the max queue depth for each hw queue is also can_queue. So, > > in the worst case that all commands are queued down the same hw queue we need an > > event pool with can_queue commands. > > > >> > >> Looking at this closer, we might have bigger problems. There is a host wide > >> max number of commands that the VFC host supports, which gets returned on > >> NPIV Login. This value can change across a live migration event. > > > > From what I understand the max commands can only become less. > > > >> > >> The ibmvfc driver, which does the same thing the lpfc driver does, modifies > >> can_queue on the scsi_host *after* the tag set has been allocated. This looks > >> to be a concern with ibmvfc, not sure about lpfc, as it doesn't look like > >> we look at can_queue once the tag set is setup, and I'm not seeing a good way > >> to dynamically change the host queue depth once the tag set is setup. > >> > >> Unless I'm missing something, our best options appear to either be to implement > >> our own host wide busy reference counting, which doesn't sound very good, or > >> we need to add some API to block / scsi that allows us to dynamically change > >> can_queue. > > > > Changing can_queue won't do use any good with the shared tagset becasue each > > queue still needs to be able to queue can_queue number of commands in the worst > > case. > > The issue I'm trying to highlight here is the following scenario: > > 1. We set shost->can_queue, then call scsi_add_host, which allocates the tag set. > > 2. On our NPIV login response from the VIOS, we might get a lower value than we > initially set in shost->can_queue, so we update it, but nobody ever looks at it > again, and we don't have any protection against sending too many commands to the host. > > > Basically, we no longer have any code that ensures we don't send more > commands to the VIOS than we are told it supports. According to the architecture, > if we actually do this, the VIOS will do an h_free_crq, which would be a bit > of a bug on our part. > > I don't think it was ever clearly defined in the API that a driver can > change shost->can_queue after calling scsi_add_host, but up until > commit 6eb045e092efefafc6687409a6fa6d1dabf0fb69, this worked and now > it doesn't. Actually it isn't related with commit 6eb045e092ef, because blk_mq_alloc_tag_set() uses .can_queue to create driver tag sbitmap and request pool. So even thought without 6eb045e092ef, the updated .can_queue can't work as expected because the max driver tag depth has been fixed by blk-mq already. What 6eb045e092ef does is just to remove the double check on max host-wide allowed commands because that has been respected by blk-mq driver tag allocation already. > > I started looking through drivers that do this, and so far, it looks like the > following drivers do: ibmvfc, lpfc, aix94xx, libfc, BusLogic, and likely others... > > We probably need an API that lets us change shost->can_queue dynamically. I'd suggest to confirm changing .can_queue is one real usecase. Thanks, Ming
On 1/13/21 7:27 PM, Ming Lei wrote: > On Wed, Jan 13, 2021 at 11:13:07AM -0600, Brian King wrote: >> On 1/12/21 6:33 PM, Tyrel Datwyler wrote: >>> On 1/12/21 2:54 PM, Brian King wrote: >>>> On 1/11/21 5:12 PM, Tyrel Datwyler wrote: >>>>> Introduce several new vhost fields for managing MQ state of the adapter >>>>> as well as initial defaults for MQ enablement. >>>>> >>>>> Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com> >>>>> --- >>>>> drivers/scsi/ibmvscsi/ibmvfc.c | 8 ++++++++ >>>>> drivers/scsi/ibmvscsi/ibmvfc.h | 9 +++++++++ >>>>> 2 files changed, 17 insertions(+) >>>>> >>>>> diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c >>>>> index ba95438a8912..9200fe49c57e 100644 >>>>> --- a/drivers/scsi/ibmvscsi/ibmvfc.c >>>>> +++ b/drivers/scsi/ibmvscsi/ibmvfc.c >>>>> @@ -3302,6 +3302,7 @@ static struct scsi_host_template driver_template = { >>>>> .max_sectors = IBMVFC_MAX_SECTORS, >>>>> .shost_attrs = ibmvfc_attrs, >>>>> .track_queue_depth = 1, >>>>> + .host_tagset = 1, >>>> >>>> This doesn't seem right. You are setting host_tagset, which means you want a >>>> shared, host wide, tag set for commands. It also means that the total >>>> queue depth for the host is can_queue. However, it looks like you are allocating >>>> max_requests events for each sub crq, which means you are over allocating memory. >>> >>> With the shared tagset yes the queue depth for the host is can_queue, but this >>> also implies that the max queue depth for each hw queue is also can_queue. So, >>> in the worst case that all commands are queued down the same hw queue we need an >>> event pool with can_queue commands. >>> >>>> >>>> Looking at this closer, we might have bigger problems. There is a host wide >>>> max number of commands that the VFC host supports, which gets returned on >>>> NPIV Login. This value can change across a live migration event. >>> >>> From what I understand the max commands can only become less. >>> >>>> >>>> The ibmvfc driver, which does the same thing the lpfc driver does, modifies >>>> can_queue on the scsi_host *after* the tag set has been allocated. This looks >>>> to be a concern with ibmvfc, not sure about lpfc, as it doesn't look like >>>> we look at can_queue once the tag set is setup, and I'm not seeing a good way >>>> to dynamically change the host queue depth once the tag set is setup. >>>> >>>> Unless I'm missing something, our best options appear to either be to implement >>>> our own host wide busy reference counting, which doesn't sound very good, or >>>> we need to add some API to block / scsi that allows us to dynamically change >>>> can_queue. >>> >>> Changing can_queue won't do use any good with the shared tagset becasue each >>> queue still needs to be able to queue can_queue number of commands in the worst >>> case. >> >> The issue I'm trying to highlight here is the following scenario: >> >> 1. We set shost->can_queue, then call scsi_add_host, which allocates the tag set. >> >> 2. On our NPIV login response from the VIOS, we might get a lower value than we >> initially set in shost->can_queue, so we update it, but nobody ever looks at it >> again, and we don't have any protection against sending too many commands to the host. >> >> >> Basically, we no longer have any code that ensures we don't send more >> commands to the VIOS than we are told it supports. According to the architecture, >> if we actually do this, the VIOS will do an h_free_crq, which would be a bit >> of a bug on our part. >> >> I don't think it was ever clearly defined in the API that a driver can >> change shost->can_queue after calling scsi_add_host, but up until >> commit 6eb045e092efefafc6687409a6fa6d1dabf0fb69, this worked and now >> it doesn't. > > Actually it isn't related with commit 6eb045e092ef, because blk_mq_alloc_tag_set() > uses .can_queue to create driver tag sbitmap and request pool. > > So even thought without 6eb045e092ef, the updated .can_queue can't work > as expected because the max driver tag depth has been fixed by blk-mq already. There are two scenarios here. In the scenario of someone increasing can_queue after the tag set is allocated, I agree, blk-mq will never take advantage of this. However, in the scenario of someone *decreasing* can_queue after the tag set is allocated, prior to 6eb045e092ef, the shost->host_busy code provided this protection. > > What 6eb045e092ef does is just to remove the double check on max > host-wide allowed commands because that has been respected by blk-mq > driver tag allocation already. > >> >> I started looking through drivers that do this, and so far, it looks like the >> following drivers do: ibmvfc, lpfc, aix94xx, libfc, BusLogic, and likely others... >> >> We probably need an API that lets us change shost->can_queue dynamically. > > I'd suggest to confirm changing .can_queue is one real usecase. For ibmvfc, the total number of commands that the scsi host supports is very much a dynamic value. It can increase and it can decrease. Live migrating a logical partition from one system to another is the usual cause of such a capability change. For ibmvfc, at least, this only ever happens when we've self blocked the host and have sent back all outstanding I/O. However, looking at other drivers that modify can_queue dynamically, this doesn't always hold true. Looking at libfc, it looks to dynamically ramp up and ramp down can_queue based on its ability to handle requests. There are certainly a number of other drivers that change can_queue after the tag set has been allocated. Some of these drivers could likely be changed to avoid doing this, but changing them all will likely be difficult. Thanks, Brian
On Thu, Jan 14, 2021 at 11:24:35AM -0600, Brian King wrote: > On 1/13/21 7:27 PM, Ming Lei wrote: > > On Wed, Jan 13, 2021 at 11:13:07AM -0600, Brian King wrote: > >> On 1/12/21 6:33 PM, Tyrel Datwyler wrote: > >>> On 1/12/21 2:54 PM, Brian King wrote: > >>>> On 1/11/21 5:12 PM, Tyrel Datwyler wrote: > >>>>> Introduce several new vhost fields for managing MQ state of the adapter > >>>>> as well as initial defaults for MQ enablement. > >>>>> > >>>>> Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com> > >>>>> --- > >>>>> drivers/scsi/ibmvscsi/ibmvfc.c | 8 ++++++++ > >>>>> drivers/scsi/ibmvscsi/ibmvfc.h | 9 +++++++++ > >>>>> 2 files changed, 17 insertions(+) > >>>>> > >>>>> diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c > >>>>> index ba95438a8912..9200fe49c57e 100644 > >>>>> --- a/drivers/scsi/ibmvscsi/ibmvfc.c > >>>>> +++ b/drivers/scsi/ibmvscsi/ibmvfc.c > >>>>> @@ -3302,6 +3302,7 @@ static struct scsi_host_template driver_template = { > >>>>> .max_sectors = IBMVFC_MAX_SECTORS, > >>>>> .shost_attrs = ibmvfc_attrs, > >>>>> .track_queue_depth = 1, > >>>>> + .host_tagset = 1, > >>>> > >>>> This doesn't seem right. You are setting host_tagset, which means you want a > >>>> shared, host wide, tag set for commands. It also means that the total > >>>> queue depth for the host is can_queue. However, it looks like you are allocating > >>>> max_requests events for each sub crq, which means you are over allocating memory. > >>> > >>> With the shared tagset yes the queue depth for the host is can_queue, but this > >>> also implies that the max queue depth for each hw queue is also can_queue. So, > >>> in the worst case that all commands are queued down the same hw queue we need an > >>> event pool with can_queue commands. > >>> > >>>> > >>>> Looking at this closer, we might have bigger problems. There is a host wide > >>>> max number of commands that the VFC host supports, which gets returned on > >>>> NPIV Login. This value can change across a live migration event. > >>> > >>> From what I understand the max commands can only become less. > >>> > >>>> > >>>> The ibmvfc driver, which does the same thing the lpfc driver does, modifies > >>>> can_queue on the scsi_host *after* the tag set has been allocated. This looks > >>>> to be a concern with ibmvfc, not sure about lpfc, as it doesn't look like > >>>> we look at can_queue once the tag set is setup, and I'm not seeing a good way > >>>> to dynamically change the host queue depth once the tag set is setup. > >>>> > >>>> Unless I'm missing something, our best options appear to either be to implement > >>>> our own host wide busy reference counting, which doesn't sound very good, or > >>>> we need to add some API to block / scsi that allows us to dynamically change > >>>> can_queue. > >>> > >>> Changing can_queue won't do use any good with the shared tagset becasue each > >>> queue still needs to be able to queue can_queue number of commands in the worst > >>> case. > >> > >> The issue I'm trying to highlight here is the following scenario: > >> > >> 1. We set shost->can_queue, then call scsi_add_host, which allocates the tag set. > >> > >> 2. On our NPIV login response from the VIOS, we might get a lower value than we > >> initially set in shost->can_queue, so we update it, but nobody ever looks at it > >> again, and we don't have any protection against sending too many commands to the host. > >> > >> > >> Basically, we no longer have any code that ensures we don't send more > >> commands to the VIOS than we are told it supports. According to the architecture, > >> if we actually do this, the VIOS will do an h_free_crq, which would be a bit > >> of a bug on our part. > >> > >> I don't think it was ever clearly defined in the API that a driver can > >> change shost->can_queue after calling scsi_add_host, but up until > >> commit 6eb045e092efefafc6687409a6fa6d1dabf0fb69, this worked and now > >> it doesn't. > > > > Actually it isn't related with commit 6eb045e092ef, because blk_mq_alloc_tag_set() > > uses .can_queue to create driver tag sbitmap and request pool. > > > > So even thought without 6eb045e092ef, the updated .can_queue can't work > > as expected because the max driver tag depth has been fixed by blk-mq already. > > There are two scenarios here. In the scenario of someone increasing can_queue > after the tag set is allocated, I agree, blk-mq will never take advantage > of this. However, in the scenario of someone *decreasing* can_queue after the > tag set is allocated, prior to 6eb045e092ef, the shost->host_busy code provided > this protection. When .can_queue is decreased, blk-mq still may allocate driver tag which is > .can_queue, this way might break driver/device too, but it depends on how driver uses req->tag. > > > > > What 6eb045e092ef does is just to remove the double check on max > > host-wide allowed commands because that has been respected by blk-mq > > driver tag allocation already. > > > >> > >> I started looking through drivers that do this, and so far, it looks like the > >> following drivers do: ibmvfc, lpfc, aix94xx, libfc, BusLogic, and likely others... > >> > >> We probably need an API that lets us change shost->can_queue dynamically. > > > > I'd suggest to confirm changing .can_queue is one real usecase. > > For ibmvfc, the total number of commands that the scsi host supports is very > much a dynamic value. It can increase and it can decrease. Live migrating > a logical partition from one system to another is the usual cause of > such a capability change. For ibmvfc, at least, this only ever happens > when we've self blocked the host and have sent back all outstanding I/O. This one looks a good use case, and the new API may have to freeze request queues of all LUNs, and the operation is very expensive and slow. > > However, looking at other drivers that modify can_queue dynamically, this > doesn't always hold true. Looking at libfc, it looks to dynamically ramp > up and ramp down can_queue based on its ability to handle requests. This one looks hard to use the new API which isn't supposed to be called in fast path. And changing host wide resource is really not good in fast path, IMO. > > There are certainly a number of other drivers that change can_queue > after the tag set has been allocated. Some of these drivers could > likely be changed to avoid doing this, but changing them all will likely > be difficult. It is still better to understand why these drivers have to update .can_queue dynamically. Thanks, Ming
diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c index ba95438a8912..9200fe49c57e 100644 --- a/drivers/scsi/ibmvscsi/ibmvfc.c +++ b/drivers/scsi/ibmvscsi/ibmvfc.c @@ -3302,6 +3302,7 @@ static struct scsi_host_template driver_template = { .max_sectors = IBMVFC_MAX_SECTORS, .shost_attrs = ibmvfc_attrs, .track_queue_depth = 1, + .host_tagset = 1, }; /** @@ -5290,6 +5291,7 @@ static int ibmvfc_probe(struct vio_dev *vdev, const struct vio_device_id *id) shost->max_sectors = IBMVFC_MAX_SECTORS; shost->max_cmd_len = IBMVFC_MAX_CDB_LEN; shost->unique_id = shost->host_no; + shost->nr_hw_queues = IBMVFC_MQ ? IBMVFC_SCSI_HW_QUEUES : 1; vhost = shost_priv(shost); INIT_LIST_HEAD(&vhost->targets); @@ -5300,6 +5302,12 @@ static int ibmvfc_probe(struct vio_dev *vdev, const struct vio_device_id *id) vhost->partition_number = -1; vhost->log_level = log_level; vhost->task_set = 1; + + vhost->mq_enabled = IBMVFC_MQ; + vhost->client_scsi_channels = IBMVFC_SCSI_CHANNELS; + vhost->using_channels = 0; + vhost->do_enquiry = 1; + strcpy(vhost->partition_name, "UNKNOWN"); init_waitqueue_head(&vhost->work_wait_q); init_waitqueue_head(&vhost->init_wait_q); diff --git a/drivers/scsi/ibmvscsi/ibmvfc.h b/drivers/scsi/ibmvscsi/ibmvfc.h index 632e977449c5..dd6d89292867 100644 --- a/drivers/scsi/ibmvscsi/ibmvfc.h +++ b/drivers/scsi/ibmvscsi/ibmvfc.h @@ -41,6 +41,11 @@ #define IBMVFC_DEFAULT_LOG_LEVEL 2 #define IBMVFC_MAX_CDB_LEN 16 #define IBMVFC_CLS3_ERROR 0 +#define IBMVFC_MQ 0 +#define IBMVFC_SCSI_CHANNELS 0 +#define IBMVFC_SCSI_HW_QUEUES 1 +#define IBMVFC_MIG_NO_SUB_TO_CRQ 0 +#define IBMVFC_MIG_NO_N_TO_M 0 /* * Ensure we have resources for ERP and initialization: @@ -840,6 +845,10 @@ struct ibmvfc_host { int delay_init; int scan_complete; int logged_in; + int mq_enabled; + int using_channels; + int do_enquiry; + int client_scsi_channels; int aborting_passthru; int events_to_log; #define IBMVFC_AE_LINKUP 0x0001
Introduce several new vhost fields for managing MQ state of the adapter as well as initial defaults for MQ enablement. Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com> --- drivers/scsi/ibmvscsi/ibmvfc.c | 8 ++++++++ drivers/scsi/ibmvscsi/ibmvfc.h | 9 +++++++++ 2 files changed, 17 insertions(+)