block/nvme: call blk_drain in NVMe reset code to avoid lockups

Message ID 1541506615-30202-1-git-send-email-igor.druzhinin@citrix.com (mailing list archive)
State New, archived

Commit Message

Igor Druzhinin Nov. 6, 2018, 12:16 p.m. UTC
When blk_flush() is called in the NVMe reset path, the submission and
completion (S/C) queues have already been freed. Re-entering the AIO
handling loop while some I/O requests are still unfinished will therefore
lock up or crash, because their SG structures may already have been
reused. Call blk_drain() before freeing the queues to avoid this nasty
scenario.

Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
---
 hw/block/nvme.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Igor Druzhinin Nov. 14, 2018, 5:42 p.m. UTC | #1
On 06/11/2018 12:16, Igor Druzhinin wrote:
> When blk_flush called in NVMe reset path S/C queues are already freed
> which means that re-entering AIO handling loop having some IO requests
> unfinished will lockup or crash as their SG structures being potentially
> reused. Call blk_drain before freeing the queues to avoid this nasty
> scenario.
> 
> Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
> ---
>  hw/block/nvme.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index fc7dacb..cdf836e 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -797,6 +797,8 @@ static void nvme_clear_ctrl(NvmeCtrl *n)
>  {
>      int i;
>  
> +    blk_drain(n->conf.blk);
> +
>      for (i = 0; i < n->num_queues; i++) {
>          if (n->sq[i] != NULL) {
>              nvme_free_sq(n->sq[i], n);
> 

ping?
Igor Druzhinin Nov. 20, 2018, 5:31 p.m. UTC | #2
On 14/11/2018 17:42, Igor Druzhinin wrote:
> On 06/11/2018 12:16, Igor Druzhinin wrote:
>> When blk_flush called in NVMe reset path S/C queues are already freed
>> which means that re-entering AIO handling loop having some IO requests
>> unfinished will lockup or crash as their SG structures being potentially
>> reused. Call blk_drain before freeing the queues to avoid this nasty
>> scenario.
>>
>> Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
>> ---
>>  hw/block/nvme.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
>> index fc7dacb..cdf836e 100644
>> --- a/hw/block/nvme.c
>> +++ b/hw/block/nvme.c
>> @@ -797,6 +797,8 @@ static void nvme_clear_ctrl(NvmeCtrl *n)
>>  {
>>      int i;
>>  
>> +    blk_drain(n->conf.blk);
>> +
>>      for (i = 0; i < n->num_queues; i++) {
>>          if (n->sq[i] != NULL) {
>>              nvme_free_sq(n->sq[i], n);
>>
> 
> ping?
> 

CC: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini Nov. 20, 2018, 5:52 p.m. UTC | #3
On 20/11/18 18:31, Igor Druzhinin wrote:
> On 14/11/2018 17:42, Igor Druzhinin wrote:
>> On 06/11/2018 12:16, Igor Druzhinin wrote:
>>> When blk_flush called in NVMe reset path S/C queues are already freed
>>> which means that re-entering AIO handling loop having some IO requests
>>> unfinished will lockup or crash as their SG structures being potentially
>>> reused. Call blk_drain before freeing the queues to avoid this nasty
>>> scenario.
>>>
>>> Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
>>> ---
>>>  hw/block/nvme.c | 2 ++
>>>  1 file changed, 2 insertions(+)
>>>
>>> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
>>> index fc7dacb..cdf836e 100644
>>> --- a/hw/block/nvme.c
>>> +++ b/hw/block/nvme.c
>>> @@ -797,6 +797,8 @@ static void nvme_clear_ctrl(NvmeCtrl *n)
>>>  {
>>>      int i;
>>>  
>>> +    blk_drain(n->conf.blk);
>>> +
>>>      for (i = 0; i < n->num_queues; i++) {
>>>          if (n->sq[i] != NULL) {
>>>              nvme_free_sq(n->sq[i], n);
>>>
>>
>> ping?
>>
> 
> CC: Paolo Bonzini <pbonzini@redhat.com>
> 

Looks good to me.  Kevin, Max?

Paolo
Kevin Wolf Nov. 22, 2018, 2:32 p.m. UTC | #4
Am 06.11.2018 um 13:16 hat Igor Druzhinin geschrieben:
> When blk_flush called in NVMe reset path S/C queues are already freed
> which means that re-entering AIO handling loop having some IO requests
> unfinished will lockup or crash as their SG structures being potentially
> reused. Call blk_drain before freeing the queues to avoid this nasty
> scenario.
> 
> Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>

Thanks, applied to the block branch.

Kevin

Patch

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index fc7dacb..cdf836e 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -797,6 +797,8 @@  static void nvme_clear_ctrl(NvmeCtrl *n)
 {
     int i;
 
+    blk_drain(n->conf.blk);
+
     for (i = 0; i < n->num_queues; i++) {
         if (n->sq[i] != NULL) {
             nvme_free_sq(n->sq[i], n);
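
The ordering the patch enforces can be shown with a small standalone model.
This is only a sketch and not QEMU code: ToyCtrl, ToyQueue, ToyRequest and
all toy_* functions are hypothetical names invented for illustration, and
toy_drain() merely plays the role that blk_drain() plays in the patch, which
is to complete every in-flight request while the queue it points at is still
valid.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; these are not QEMU types. */
#define TOY_MAX_REQS 4

typedef struct ToyQueue {
    int completed;              /* completions delivered to this queue */
} ToyQueue;

typedef struct ToyRequest {
    ToyQueue *queue;            /* queue the completion will write to */
    bool in_flight;
} ToyRequest;

typedef struct ToyCtrl {
    ToyQueue *sq;
    ToyRequest reqs[TOY_MAX_REQS];
} ToyCtrl;

/* Allocate the queue and mark every request slot idle. */
static bool toy_ctrl_init(ToyCtrl *c)
{
    c->sq = calloc(1, sizeof(*c->sq));
    if (c->sq == NULL) {
        return false;
    }
    for (size_t i = 0; i < TOY_MAX_REQS; i++) {
        c->reqs[i].queue = c->sq;
        c->reqs[i].in_flight = false;
    }
    return true;
}

/* Start a request that will later complete into the current queue. */
static void toy_submit(ToyCtrl *c, size_t slot)
{
    assert(slot < TOY_MAX_REQS);
    c->reqs[slot].queue = c->sq;
    c->reqs[slot].in_flight = true;
}

/* Count the requests that have not completed yet. */
static int toy_in_flight(const ToyCtrl *c)
{
    int n = 0;
    for (size_t i = 0; i < TOY_MAX_REQS; i++) {
        n += c->reqs[i].in_flight ? 1 : 0;
    }
    return n;
}

/* Complete every pending request while its queue is still allocated. */
static int toy_drain(ToyCtrl *c)
{
    int drained = 0;
    for (size_t i = 0; i < TOY_MAX_REQS; i++) {
        if (c->reqs[i].in_flight) {
            c->reqs[i].queue->completed++;
            c->reqs[i].in_flight = false;
            drained++;
        }
    }
    return drained;
}

/* Same ordering as the patched reset path: drain first, then free. */
static int toy_clear_ctrl(ToyCtrl *c)
{
    int drained = toy_drain(c);
    free(c->sq);
    c->sq = NULL;
    return drained;
}
```

If toy_clear_ctrl() freed the queue without draining first, any request still
in flight would keep a dangling queue pointer, which is the toy analogue of
the lockup or crash described in the commit message.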