diff mbox

[1/2] rbd: don't queue watch delayed work when we are removing device

Message ID CAOi1vP8-6KUh5drZ5Sc5g08VdTtE6kovDeUUD0f8d2L1FX+mLg@mail.gmail.com (mailing list archive)
State New, archived
Headers show

Commit Message

Ilya Dryomov May 31, 2018, 8:13 a.m. UTC
On Thu, May 31, 2018 at 9:18 AM, Dongsheng Yang
<dongsheng.yang@easystack.cn> wrote:
>
> On 05/31/2018 03:11 AM, Ilya Dryomov wrote:
>>
>> On Tue, May 29, 2018 at 5:22 AM, Dongsheng Yang
>> <dongsheng.yang@easystack.cn> wrote:
>>>
>>> We will cancel all watch delayed work in
>>> cancel_delayed_work_sync(&rbd_dev->watch_dwork);
>>> If we queue delayed work after this, there will be a use-after-free
>>> problem:
>>>
>>> [  549.932085] BUG: unable to handle kernel NULL pointer dereference at
>>> 0000000000000000
>>> [  549.934134] PGD 0 P4D 0
>>> [  549.935145] Oops: 0000 [#1] SMP PTI
>>> [  549.936283] Modules linked in: rbd(OE) libceph(OE) tcp_diag udp_diag
>>> inet_diag unix_diag af_packet_diag netlink_diag dns_resolver ebtable_filter
>>> ebtables ip6table_filter ip6_tables iptable_filter sg cfg80211 rfkill
>>> snd_hda_codec_generic ext4 snd_hda_intel snd_hda_codec crct10dif_pclmul
>>> crc32_pclmul ghash_clmulni_intel snd_hda_core pcbc snd_hwdep snd_seq mbcache
>>> aesni_intel snd_seq_device jbd2 crypto_simd nfsd cryptd glue_helper snd_pcm
>>> snd_timer auth_rpcgss pcspkr snd virtio_balloon nfs_acl soundcore i2c_piix4
>>> lockd grace sunrpc ip_tables xfs libcrc32c virtio_console virtio_blk
>>> ata_generic pata_acpi 8139too qxl drm_kms_helper syscopyarea sysfillrect
>>> sysimgblt fb_sys_fops ttm drm ata_piix libata crc32c_intel virtio_pci 8139cp
>>> virtio_ring i2c_core mii virtio floppy serio_raw dm_mirror dm_region_hash
>>> [  549.951835]  dm_log dm_mod dax [last unloaded: libceph]
>>> [  549.953490] CPU: 7 PID: 0 Comm: swapper/7 Tainted: G           OE
>>> 4.17.0-rc6+ #13
>>> [  549.955502] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
>>> [  549.957246] RIP: 0010:__queue_work+0x6a/0x3b0
>>> [  549.958744] RSP: 0018:ffff9427df1c3e90 EFLAGS: 00010086
>>> [  549.960374] RAX: ffff9427deca8400 RBX: 0000000000000000 RCX:
>>> 0000000000000000
>>> [  549.962297] RDX: ffff9427deca8400 RSI: ffff9427df1c3e50 RDI:
>>> 0000000000000000
>>> [  549.964216] RBP: ffff942783e39e00 R08: ffff9427deca8400 R09:
>>> ffff9427df1c3f00
>>> [  549.966136] R10: 0000000000000004 R11: 0000000000000005 R12:
>>> ffff9427cfb85970
>>> [  549.968070] R13: 0000000000002000 R14: 000000000001eca0 R15:
>>> 0000000000000007
>>> [  549.969999] FS:  0000000000000000(0000) GS:ffff9427df1c0000(0000)
>>> knlGS:0000000000000000
>>> [  549.972069] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [  549.973775] CR2: 0000000000000000 CR3: 00000004c900a005 CR4:
>>> 00000000000206e0
>>> [  549.975695] Call Trace:
>>> [  549.976900]  <IRQ>
>>> [  549.978033]  ? __queue_work+0x3b0/0x3b0
>>> [  549.979442]  call_timer_fn+0x2d/0x130
>>> [  549.980824]  run_timer_softirq+0x16e/0x430
>>> [  549.982263]  ? tick_sched_timer+0x37/0x70
>>> [  549.983691]  __do_softirq+0xd2/0x280
>>> [  549.985035]  irq_exit+0xd5/0xe0
>>> [  549.986316]  smp_apic_timer_interrupt+0x6c/0x130
>>> [  549.987835]  apic_timer_interrupt+0xf/0x20
>>>
>>> This patch forbid to queue watch_dwork when we are removing device.
>>>
>>> Signed-off-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
>>> ---
>>>   drivers/block/rbd.c | 10 +++++++---
>>>   1 file changed, 7 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
>>> index 2b4e90d..d1d8f46 100644
>>> --- a/drivers/block/rbd.c
>>> +++ b/drivers/block/rbd.c
>>> @@ -3475,9 +3475,13 @@ static void rbd_reregister_watch(struct
>>> work_struct *work)
>>>                          set_bit(RBD_DEV_FLAG_BLACKLISTED,
>>> &rbd_dev->flags);
>>>                          wake_requests(rbd_dev, true);
>>>                  } else {
>>> -                       queue_delayed_work(rbd_dev->task_wq,
>>> -                                          &rbd_dev->watch_dwork,
>>> -                                          RBD_RETRY_DELAY);
>>> +                       spin_lock_irq(&rbd_dev->lock);
>>> +                       if (!test_bit(RBD_DEV_FLAG_REMOVING,
>>> &rbd_dev->flags)) {
>>> +                               queue_delayed_work(rbd_dev->task_wq,
>>> +                                                  &rbd_dev->watch_dwork,
>>> +                                                  RBD_RETRY_DELAY);
>>> +                       }
>>> +                       spin_unlock_irq(&rbd_dev->lock);
>>>                  }
>>>                  mutex_unlock(&rbd_dev->watch_mutex);
>>>                  return;
>>
>> Hi Dongsheng,
>>
>> What made you think it is rbd (or ceph) related?  Do you know what gets
>> dereferenced?  Is it reproducible?
>
>
> Hi Ilya,
>     There is a simple reproduce script:
>
> ./vstart.sh  -k -l --bluestore
> rbd map -o osd_request_timeout=10 test1
> time dd if=/dev/zero of=/dev/rbd0 bs=64K count=1000 oflag=direct &
> sleep 1
> ps -ef|grep -E "ceph-mon|ceph-osd"|gawk '{print "kill -9 "$2}'|bash
> rbd unmap -o force /dev/rbd0
>
> But maybe you need to run this script in a loop, because that's not 100%
> happen.

Ah, I see.  I think a more correct fix would be to move ->watch_dwork
cancellation:


This way rbd_reregister_watch() should either bail out early because
the watch is UNREGISTERED at that point or just get cancelled.

Thanks,

                Ilya
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Dongsheng Yang May 31, 2018, 8:34 a.m. UTC | #1
On 05/31/2018 04:13 PM, Ilya Dryomov wrote:
> On Thu, May 31, 2018 at 9:18 AM, Dongsheng Yang
> <dongsheng.yang@easystack.cn> wrote:
>> On 05/31/2018 03:11 AM, Ilya Dryomov wrote:
>>> On Tue, May 29, 2018 at 5:22 AM, Dongsheng Yang
>>> <dongsheng.yang@easystack.cn> wrote:
>>>> We will cancel all watch delayed work in
>>>> cancel_delayed_work_sync(&rbd_dev->watch_dwork);
>>>> If we queue delayed work after this, there will be a use-after-free
>>>> problem:
>>>>
>>>> [  549.932085] BUG: unable to handle kernel NULL pointer dereference at
>>>> 0000000000000000
>>>> [  549.934134] PGD 0 P4D 0
>>>> [  549.935145] Oops: 0000 [#1] SMP PTI
>>>> [  549.936283] Modules linked in: rbd(OE) libceph(OE) tcp_diag udp_diag
>>>> inet_diag unix_diag af_packet_diag netlink_diag dns_resolver ebtable_filter
>>>> ebtables ip6table_filter ip6_tables iptable_filter sg cfg80211 rfkill
>>>> snd_hda_codec_generic ext4 snd_hda_intel snd_hda_codec crct10dif_pclmul
>>>> crc32_pclmul ghash_clmulni_intel snd_hda_core pcbc snd_hwdep snd_seq mbcache
>>>> aesni_intel snd_seq_device jbd2 crypto_simd nfsd cryptd glue_helper snd_pcm
>>>> snd_timer auth_rpcgss pcspkr snd virtio_balloon nfs_acl soundcore i2c_piix4
>>>> lockd grace sunrpc ip_tables xfs libcrc32c virtio_console virtio_blk
>>>> ata_generic pata_acpi 8139too qxl drm_kms_helper syscopyarea sysfillrect
>>>> sysimgblt fb_sys_fops ttm drm ata_piix libata crc32c_intel virtio_pci 8139cp
>>>> virtio_ring i2c_core mii virtio floppy serio_raw dm_mirror dm_region_hash
>>>> [  549.951835]  dm_log dm_mod dax [last unloaded: libceph]
>>>> [  549.953490] CPU: 7 PID: 0 Comm: swapper/7 Tainted: G           OE
>>>> 4.17.0-rc6+ #13
>>>> [  549.955502] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
>>>> [  549.957246] RIP: 0010:__queue_work+0x6a/0x3b0
>>>> [  549.958744] RSP: 0018:ffff9427df1c3e90 EFLAGS: 00010086
>>>> [  549.960374] RAX: ffff9427deca8400 RBX: 0000000000000000 RCX:
>>>> 0000000000000000
>>>> [  549.962297] RDX: ffff9427deca8400 RSI: ffff9427df1c3e50 RDI:
>>>> 0000000000000000
>>>> [  549.964216] RBP: ffff942783e39e00 R08: ffff9427deca8400 R09:
>>>> ffff9427df1c3f00
>>>> [  549.966136] R10: 0000000000000004 R11: 0000000000000005 R12:
>>>> ffff9427cfb85970
>>>> [  549.968070] R13: 0000000000002000 R14: 000000000001eca0 R15:
>>>> 0000000000000007
>>>> [  549.969999] FS:  0000000000000000(0000) GS:ffff9427df1c0000(0000)
>>>> knlGS:0000000000000000
>>>> [  549.972069] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>> [  549.973775] CR2: 0000000000000000 CR3: 00000004c900a005 CR4:
>>>> 00000000000206e0
>>>> [  549.975695] Call Trace:
>>>> [  549.976900]  <IRQ>
>>>> [  549.978033]  ? __queue_work+0x3b0/0x3b0
>>>> [  549.979442]  call_timer_fn+0x2d/0x130
>>>> [  549.980824]  run_timer_softirq+0x16e/0x430
>>>> [  549.982263]  ? tick_sched_timer+0x37/0x70
>>>> [  549.983691]  __do_softirq+0xd2/0x280
>>>> [  549.985035]  irq_exit+0xd5/0xe0
>>>> [  549.986316]  smp_apic_timer_interrupt+0x6c/0x130
>>>> [  549.987835]  apic_timer_interrupt+0xf/0x20
>>>>
>>>> This patch forbid to queue watch_dwork when we are removing device.
>>>>
>>>> Signed-off-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
>>>> ---
>>>>    drivers/block/rbd.c | 10 +++++++---
>>>>    1 file changed, 7 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
>>>> index 2b4e90d..d1d8f46 100644
>>>> --- a/drivers/block/rbd.c
>>>> +++ b/drivers/block/rbd.c
>>>> @@ -3475,9 +3475,13 @@ static void rbd_reregister_watch(struct
>>>> work_struct *work)
>>>>                           set_bit(RBD_DEV_FLAG_BLACKLISTED,
>>>> &rbd_dev->flags);
>>>>                           wake_requests(rbd_dev, true);
>>>>                   } else {
>>>> -                       queue_delayed_work(rbd_dev->task_wq,
>>>> -                                          &rbd_dev->watch_dwork,
>>>> -                                          RBD_RETRY_DELAY);
>>>> +                       spin_lock_irq(&rbd_dev->lock);
>>>> +                       if (!test_bit(RBD_DEV_FLAG_REMOVING,
>>>> &rbd_dev->flags)) {
>>>> +                               queue_delayed_work(rbd_dev->task_wq,
>>>> +                                                  &rbd_dev->watch_dwork,
>>>> +                                                  RBD_RETRY_DELAY);
>>>> +                       }
>>>> +                       spin_unlock_irq(&rbd_dev->lock);
>>>>                   }
>>>>                   mutex_unlock(&rbd_dev->watch_mutex);
>>>>                   return;
>>> Hi Dongsheng,
>>>
>>> What made you think it is rbd (or ceph) related?  Do you know what gets
>>> dereferenced?  Is it reproducible?
>>
>> Hi Ilya,
>>      There is a simple reproduce script:
>>
>> ./vstart.sh  -k -l --bluestore
>> rbd map -o osd_request_timeout=10 test1
>> time dd if=/dev/zero of=/dev/rbd0 bs=64K count=1000 oflag=direct &
>> sleep 1
>> ps -ef|grep -E "ceph-mon|ceph-osd"|gawk '{print "kill -9 "$2}'|bash
>> rbd unmap -o force /dev/rbd0
>>
>> But maybe you need to run this script in a loop, because that's not 100%
>> happen.
> Ah, I see.  I think a more correct fix would be to move ->watch_dwork
> cancellation:
>
> diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
> index 2b4e90d06822..23ae0df7a978 100644
> --- a/drivers/block/rbd.c
> +++ b/drivers/block/rbd.c
> @@ -3400,7 +3400,6 @@ static void cancel_tasks_sync(struct rbd_device *rbd_dev)
>   {
>          dout("%s rbd_dev %p\n", __func__, rbd_dev);
>
> -       cancel_delayed_work_sync(&rbd_dev->watch_dwork);
>          cancel_work_sync(&rbd_dev->acquired_lock_work);
>          cancel_work_sync(&rbd_dev->released_lock_work);
>          cancel_delayed_work_sync(&rbd_dev->lock_dwork);
> @@ -3418,6 +3417,7 @@ static void rbd_unregister_watch(struct
> rbd_device *rbd_dev)
>          rbd_dev->watch_state = RBD_WATCH_STATE_UNREGISTERED;
>          mutex_unlock(&rbd_dev->watch_mutex);
>
> +       cancel_delayed_work_sync(&rbd_dev->watch_dwork);
>          ceph_osdc_flush_notifies(&rbd_dev->rbd_client->client->osdc);
>   }
>
> This way rbd_reregister_watch() should either bail out early because
> the watch is UNREGISTERED at that point or just get cancelled.

Sounds good. what's your suggestion about the [2/2] about lock_dwork?

thanx
Dongsheng
>
> Thanks,
>
>                  Ilya
>


--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ilya Dryomov May 31, 2018, 2:42 p.m. UTC | #2
On Thu, May 31, 2018 at 10:34 AM, Dongsheng Yang
<dongsheng.yang@easystack.cn> wrote:
>
>
> On 05/31/2018 04:13 PM, Ilya Dryomov wrote:
>>
>> On Thu, May 31, 2018 at 9:18 AM, Dongsheng Yang
>> <dongsheng.yang@easystack.cn> wrote:
>>>
>>> On 05/31/2018 03:11 AM, Ilya Dryomov wrote:
>>>>
>>>> On Tue, May 29, 2018 at 5:22 AM, Dongsheng Yang
>>>> <dongsheng.yang@easystack.cn> wrote:
>>>>>
>>>>> We will cancel all watch delayed work in
>>>>> cancel_delayed_work_sync(&rbd_dev->watch_dwork);
>>>>> If we queue delayed work after this, there will be a use-after-free
>>>>> problem:
>>>>>
>>>>> [  549.932085] BUG: unable to handle kernel NULL pointer dereference at
>>>>> 0000000000000000
>>>>> [  549.934134] PGD 0 P4D 0
>>>>> [  549.935145] Oops: 0000 [#1] SMP PTI
>>>>> [  549.936283] Modules linked in: rbd(OE) libceph(OE) tcp_diag udp_diag
>>>>> inet_diag unix_diag af_packet_diag netlink_diag dns_resolver
>>>>> ebtable_filter
>>>>> ebtables ip6table_filter ip6_tables iptable_filter sg cfg80211 rfkill
>>>>> snd_hda_codec_generic ext4 snd_hda_intel snd_hda_codec crct10dif_pclmul
>>>>> crc32_pclmul ghash_clmulni_intel snd_hda_core pcbc snd_hwdep snd_seq
>>>>> mbcache
>>>>> aesni_intel snd_seq_device jbd2 crypto_simd nfsd cryptd glue_helper
>>>>> snd_pcm
>>>>> snd_timer auth_rpcgss pcspkr snd virtio_balloon nfs_acl soundcore
>>>>> i2c_piix4
>>>>> lockd grace sunrpc ip_tables xfs libcrc32c virtio_console virtio_blk
>>>>> ata_generic pata_acpi 8139too qxl drm_kms_helper syscopyarea
>>>>> sysfillrect
>>>>> sysimgblt fb_sys_fops ttm drm ata_piix libata crc32c_intel virtio_pci
>>>>> 8139cp
>>>>> virtio_ring i2c_core mii virtio floppy serio_raw dm_mirror
>>>>> dm_region_hash
>>>>> [  549.951835]  dm_log dm_mod dax [last unloaded: libceph]
>>>>> [  549.953490] CPU: 7 PID: 0 Comm: swapper/7 Tainted: G           OE
>>>>> 4.17.0-rc6+ #13
>>>>> [  549.955502] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
>>>>> [  549.957246] RIP: 0010:__queue_work+0x6a/0x3b0
>>>>> [  549.958744] RSP: 0018:ffff9427df1c3e90 EFLAGS: 00010086
>>>>> [  549.960374] RAX: ffff9427deca8400 RBX: 0000000000000000 RCX:
>>>>> 0000000000000000
>>>>> [  549.962297] RDX: ffff9427deca8400 RSI: ffff9427df1c3e50 RDI:
>>>>> 0000000000000000
>>>>> [  549.964216] RBP: ffff942783e39e00 R08: ffff9427deca8400 R09:
>>>>> ffff9427df1c3f00
>>>>> [  549.966136] R10: 0000000000000004 R11: 0000000000000005 R12:
>>>>> ffff9427cfb85970
>>>>> [  549.968070] R13: 0000000000002000 R14: 000000000001eca0 R15:
>>>>> 0000000000000007
>>>>> [  549.969999] FS:  0000000000000000(0000) GS:ffff9427df1c0000(0000)
>>>>> knlGS:0000000000000000
>>>>> [  549.972069] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>> [  549.973775] CR2: 0000000000000000 CR3: 00000004c900a005 CR4:
>>>>> 00000000000206e0
>>>>> [  549.975695] Call Trace:
>>>>> [  549.976900]  <IRQ>
>>>>> [  549.978033]  ? __queue_work+0x3b0/0x3b0
>>>>> [  549.979442]  call_timer_fn+0x2d/0x130
>>>>> [  549.980824]  run_timer_softirq+0x16e/0x430
>>>>> [  549.982263]  ? tick_sched_timer+0x37/0x70
>>>>> [  549.983691]  __do_softirq+0xd2/0x280
>>>>> [  549.985035]  irq_exit+0xd5/0xe0
>>>>> [  549.986316]  smp_apic_timer_interrupt+0x6c/0x130
>>>>> [  549.987835]  apic_timer_interrupt+0xf/0x20
>>>>>
>>>>> This patch forbid to queue watch_dwork when we are removing device.
>>>>>
>>>>> Signed-off-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
>>>>> ---
>>>>>    drivers/block/rbd.c | 10 +++++++---
>>>>>    1 file changed, 7 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
>>>>> index 2b4e90d..d1d8f46 100644
>>>>> --- a/drivers/block/rbd.c
>>>>> +++ b/drivers/block/rbd.c
>>>>> @@ -3475,9 +3475,13 @@ static void rbd_reregister_watch(struct
>>>>> work_struct *work)
>>>>>                           set_bit(RBD_DEV_FLAG_BLACKLISTED,
>>>>> &rbd_dev->flags);
>>>>>                           wake_requests(rbd_dev, true);
>>>>>                   } else {
>>>>> -                       queue_delayed_work(rbd_dev->task_wq,
>>>>> -                                          &rbd_dev->watch_dwork,
>>>>> -                                          RBD_RETRY_DELAY);
>>>>> +                       spin_lock_irq(&rbd_dev->lock);
>>>>> +                       if (!test_bit(RBD_DEV_FLAG_REMOVING,
>>>>> &rbd_dev->flags)) {
>>>>> +                               queue_delayed_work(rbd_dev->task_wq,
>>>>> +
>>>>> &rbd_dev->watch_dwork,
>>>>> +                                                  RBD_RETRY_DELAY);
>>>>> +                       }
>>>>> +                       spin_unlock_irq(&rbd_dev->lock);
>>>>>                   }
>>>>>                   mutex_unlock(&rbd_dev->watch_mutex);
>>>>>                   return;
>>>>
>>>> Hi Dongsheng,
>>>>
>>>> What made you think it is rbd (or ceph) related?  Do you know what gets
>>>> dereferenced?  Is it reproducible?
>>>
>>>
>>> Hi Ilya,
>>>      There is a simple reproduce script:
>>>
>>> ./vstart.sh  -k -l --bluestore
>>> rbd map -o osd_request_timeout=10 test1
>>> time dd if=/dev/zero of=/dev/rbd0 bs=64K count=1000 oflag=direct &
>>> sleep 1
>>> ps -ef|grep -E "ceph-mon|ceph-osd"|gawk '{print "kill -9 "$2}'|bash
>>> rbd unmap -o force /dev/rbd0
>>>
>>> But maybe you need to run this script in a loop, because that's not 100%
>>> happen.
>>
>> Ah, I see.  I think a more correct fix would be to move ->watch_dwork
>> cancellation:
>>
>> diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
>> index 2b4e90d06822..23ae0df7a978 100644
>> --- a/drivers/block/rbd.c
>> +++ b/drivers/block/rbd.c
>> @@ -3400,7 +3400,6 @@ static void cancel_tasks_sync(struct rbd_device
>> *rbd_dev)
>>   {
>>          dout("%s rbd_dev %p\n", __func__, rbd_dev);
>>
>> -       cancel_delayed_work_sync(&rbd_dev->watch_dwork);
>>          cancel_work_sync(&rbd_dev->acquired_lock_work);
>>          cancel_work_sync(&rbd_dev->released_lock_work);
>>          cancel_delayed_work_sync(&rbd_dev->lock_dwork);
>> @@ -3418,6 +3417,7 @@ static void rbd_unregister_watch(struct
>> rbd_device *rbd_dev)
>>          rbd_dev->watch_state = RBD_WATCH_STATE_UNREGISTERED;
>>          mutex_unlock(&rbd_dev->watch_mutex);
>>
>> +       cancel_delayed_work_sync(&rbd_dev->watch_dwork);
>>          ceph_osdc_flush_notifies(&rbd_dev->rbd_client->client->osdc);
>>   }
>>
>> This way rbd_reregister_watch() should either bail out early because
>> the watch is UNREGISTERED at that point or just get cancelled.
>
>
> Sounds good. what's your suggestion about the [2/2] about lock_dwork?

I don't think it's needed.  In order for rbd_acquire_lock() to possibly
requeue itself, it must be queued after it is cancelled.

Looking at the places where ->lock_dwork is queued:

- rbd_wait_state_locked(), i.e. new I/O request -- shoudn't happen at
  that point

- rbd_reacquire_lock(), from rbd_reregister_watch() if LOCKED -- again
  shouldn't happen at that point because it is preceded by rbd_unlock()
  in rbd_dev_image_unlock()

If you are seeing a similar crash on ->lock_dwork, the bug is somewhere
else.

Thanks,

                Ilya
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index 2b4e90d06822..23ae0df7a978 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -3400,7 +3400,6 @@  static void cancel_tasks_sync(struct rbd_device *rbd_dev)
 {
        dout("%s rbd_dev %p\n", __func__, rbd_dev);

-       cancel_delayed_work_sync(&rbd_dev->watch_dwork);
        cancel_work_sync(&rbd_dev->acquired_lock_work);
        cancel_work_sync(&rbd_dev->released_lock_work);
        cancel_delayed_work_sync(&rbd_dev->lock_dwork);
@@ -3418,6 +3417,7 @@  static void rbd_unregister_watch(struct
rbd_device *rbd_dev)
        rbd_dev->watch_state = RBD_WATCH_STATE_UNREGISTERED;
        mutex_unlock(&rbd_dev->watch_mutex);

+       cancel_delayed_work_sync(&rbd_dev->watch_dwork);
        ceph_osdc_flush_notifies(&rbd_dev->rbd_client->client->osdc);
 }