
[1/2] virtio_pmem: initialize provider_data through nd_region_desc

Message ID 20220620081519.1494-1-jasowang@redhat.com (mailing list archive)
State Superseded
Series [1/2] virtio_pmem: initialize provider_data through nd_region_desc

Commit Message

Jason Wang June 20, 2022, 8:15 a.m. UTC
We currently initialize the provider_data manually after
nvdimm_pmem_region_create(). This is racy if flush is issued before
the initialization of provider_data. Fix this by initializing the
provider_data through nd_region_desc to make sure the provider_data
is ready after the pmem region is created.

Fixes: 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 drivers/nvdimm/virtio_pmem.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Michael S. Tsirkin June 20, 2022, 8:36 a.m. UTC | #1
On Mon, Jun 20, 2022 at 04:15:18PM +0800, Jason Wang wrote:
> We used to initialize the provider_data manually after

we used to -> we currently

> nvdimm_pemm_region_create(). This seems to be racy if the flush is

the flush -> flush

> issued before the initialization of provider_data. Fixing this by

Fixing -> Fix

> initialize

initialize -> initializing

> the provider_data through nd_region_desc to make sure the
> provider_data is ready after the pmem is created.
> 
> Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  drivers/nvdimm/virtio_pmem.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> index 995b6cdc67ed..48f8327d0431 100644
> --- a/drivers/nvdimm/virtio_pmem.c
> +++ b/drivers/nvdimm/virtio_pmem.c
> @@ -81,6 +81,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
>  	ndr_desc.res = &res;
>  	ndr_desc.numa_node = nid;
>  	ndr_desc.flush = async_pmem_flush;
> +	ndr_desc.provider_data = vdev;
>  	set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
>  	set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
>  	nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
> @@ -89,7 +90,6 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
>  		err = -ENXIO;
>  		goto out_nd;
>  	}
> -	nd_region->provider_data = dev_to_virtio(nd_region->dev.parent->parent);
>  	return 0;
>  out_nd:
>  	nvdimm_bus_unregister(vpmem->nvdimm_bus);
> -- 
> 2.25.1
Jason Wang June 20, 2022, 8:36 a.m. UTC | #2
Adding Pankaj.

On Mon, Jun 20, 2022 at 4:15 PM Jason Wang <jasowang@redhat.com> wrote:
>
> We used to initialize the provider_data manually after
> nvdimm_pemm_region_create(). This seems to be racy if the flush is
> issued before the initialization of provider_data. Fixing this by
> initialize the provider_data through nd_region_desc to make sure the
> provider_data is ready after the pmem is created.
>
> Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  drivers/nvdimm/virtio_pmem.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> index 995b6cdc67ed..48f8327d0431 100644
> --- a/drivers/nvdimm/virtio_pmem.c
> +++ b/drivers/nvdimm/virtio_pmem.c
> @@ -81,6 +81,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
>         ndr_desc.res = &res;
>         ndr_desc.numa_node = nid;
>         ndr_desc.flush = async_pmem_flush;
> +       ndr_desc.provider_data = vdev;
>         set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
>         set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
>         nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
> @@ -89,7 +90,6 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
>                 err = -ENXIO;
>                 goto out_nd;
>         }
> -       nd_region->provider_data = dev_to_virtio(nd_region->dev.parent->parent);
>         return 0;
>  out_nd:
>         nvdimm_bus_unregister(vpmem->nvdimm_bus);
> --
> 2.25.1
>
Pankaj Gupta June 21, 2022, 12:44 p.m. UTC | #3
> We used to initialize the provider_data manually after
> nvdimm_pemm_region_create(). This seems to be racy if the flush is
> issued before the initialization of provider_data. Fixing this by
> initialize the provider_data through nd_region_desc to make sure the
> provider_data is ready after the pmem is created.
>
> Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  drivers/nvdimm/virtio_pmem.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> index 995b6cdc67ed..48f8327d0431 100644
> --- a/drivers/nvdimm/virtio_pmem.c
> +++ b/drivers/nvdimm/virtio_pmem.c
> @@ -81,6 +81,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
>         ndr_desc.res = &res;
>         ndr_desc.numa_node = nid;
>         ndr_desc.flush = async_pmem_flush;
> +       ndr_desc.provider_data = vdev;
>         set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
>         set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
>         nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
> @@ -89,7 +90,6 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
>                 err = -ENXIO;
>                 goto out_nd;
>         }
> -       nd_region->provider_data = dev_to_virtio(nd_region->dev.parent->parent);
>         return 0;
>  out_nd:
>         nvdimm_bus_unregister(vpmem->nvdimm_bus);

Thank you for adding me.

The patch seems correct to me. Will test this as well.

Acked-by: Pankaj Gupta <pankaj.gupta@amd.com>



Thanks,
Pankaj
Dan Williams June 21, 2022, 10:34 p.m. UTC | #4
Jason Wang wrote:
> We used to initialize the provider_data manually after
> nvdimm_pemm_region_create(). This seems to be racy if the flush is

It would be nice to include the actual backtrace / bug signature that
this fixes if it is available.

> issued before the initialization of provider_data. Fixing this by
> initialize the provider_data through nd_region_desc to make sure the
> provider_data is ready after the pmem is created.
> 
> Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  drivers/nvdimm/virtio_pmem.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> index 995b6cdc67ed..48f8327d0431 100644
> --- a/drivers/nvdimm/virtio_pmem.c
> +++ b/drivers/nvdimm/virtio_pmem.c
> @@ -81,6 +81,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
>  	ndr_desc.res = &res;
>  	ndr_desc.numa_node = nid;
>  	ndr_desc.flush = async_pmem_flush;
> +	ndr_desc.provider_data = vdev;

For my untrained eye, why not
"dev_to_virtio(nd_region->dev.parent->parent)"? If that is indeed
equivalent "vdev" then you can do a follow-on cleanup patch to reduce
that syntax. Otherwise, if by chance they are not equivalent, then this
conversion is introducing a new problem.

Outside of that you can add:

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Jason Wang June 22, 2022, 3:22 a.m. UTC | #5
On Wed, Jun 22, 2022 at 6:34 AM Dan Williams <dan.j.williams@intel.com> wrote:
>
> Jason Wang wrote:
> > We used to initialize the provider_data manually after
> > nvdimm_pemm_region_create(). This seems to be racy if the flush is
>
> It would be nice to include the actual backtrace / bug signature that
> this fixes if it is available.

The bug was spotted during code review. But it can be reproduced by
adding a msleep() between nvdimm_pmem_region_create() and
nd_region->provider_data =
dev_to_virtio(nd_region->dev.parent->parent);

diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
index 995b6cdc67ed..153d9dbfbe70 100644
--- a/drivers/nvdimm/virtio_pmem.c
+++ b/drivers/nvdimm/virtio_pmem.c
@@ -8,6 +8,7 @@
  */
 #include "virtio_pmem.h"
 #include "nd.h"
+#include <linux/delay.h>

 static struct virtio_device_id id_table[] = {
        { VIRTIO_ID_PMEM, VIRTIO_DEV_ANY_ID },
@@ -89,6 +90,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
                err = -ENXIO;
                goto out_nd;
        }
+       msleep(100 * 1000);
        nd_region->provider_data = dev_to_virtio(nd_region->dev.parent->parent);
        return 0;
 out_nd:

Then if we hotplug and try to do mkfs we get:

[   80.152281] nd_pmem namespace0.0: unable to guarantee persistence of writes
[   92.393956] BUG: kernel NULL pointer dereference, address: 0000000000000318
[   92.394551] #PF: supervisor read access in kernel mode
[   92.394955] #PF: error_code(0x0000) - not-present page
[   92.395365] PGD 0 P4D 0
[   92.395566] Oops: 0000 [#1] PREEMPT SMP PTI
[   92.395867] CPU: 2 PID: 506 Comm: mkfs.ext4 Not tainted 5.19.0-rc1+ #453
[   92.396365] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009),
BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[   92.397178] RIP: 0010:virtio_pmem_flush+0x2f/0x1f0
[   92.397521] Code: 55 41 54 55 53 48 81 ec a0 00 00 00 65 48 8b 04
25 28 00 00 00 48 89 84 24 98 00 00 00 31 c0 48 8b 87 78 03 00 00 48
89 04 24 <48> 8b 98 18 03 00 00 e8 85 bf 6b 00 ba 58 00 00 00 be c0 0c
00 00
[   92.398982] RSP: 0018:ffff9a7380aefc88 EFLAGS: 00010246
[   92.399349] RAX: 0000000000000000 RBX: ffff8e77c3f86f00 RCX: 0000000000000000
[   92.399833] RDX: ffffffffad4ea720 RSI: ffff8e77c41e39c0 RDI: ffff8e77c41c5c00
[   92.400388] RBP: ffff8e77c41e39c0 R08: ffff8e77c19f0600 R09: 0000000000000000
[   92.400874] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8e77c0814e28
[   92.401364] R13: 0000000000000000 R14: 0000000000000000 R15: ffff8e77c41e39c0
[   92.401849] FS:  00007f3cd75b2780(0000) GS:ffff8e7937d00000(0000)
knlGS:0000000000000000
[   92.402423] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   92.402821] CR2: 0000000000000318 CR3: 0000000103c80002 CR4: 0000000000370ee0
[   92.403307] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   92.403793] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   92.404278] Call Trace:
[   92.404481]  <TASK>
[   92.404654]  ? mempool_alloc+0x5d/0x160
[   92.404939]  ? terminate_walk+0x5f/0xf0
[   92.405226]  ? bio_alloc_bioset+0xbb/0x3f0
[   92.405525]  async_pmem_flush+0x17/0x80
[   92.405806]  nvdimm_flush+0x11/0x30
[   92.406067]  pmem_submit_bio+0x1e9/0x200
[   92.406354]  __submit_bio+0x80/0x120
[   92.406621]  submit_bio_noacct_nocheck+0xdc/0x2a0
[   92.406958]  submit_bio_wait+0x4e/0x80
[   92.407234]  blkdev_issue_flush+0x31/0x50
[   92.407526]  ? punt_bios_to_rescuer+0x230/0x230
[   92.407852]  blkdev_fsync+0x1e/0x30
[   92.408112]  do_fsync+0x33/0x70
[   92.408354]  __x64_sys_fsync+0xb/0x10
[   92.408625]  do_syscall_64+0x43/0x90
[   92.408895]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
[   92.409257] RIP: 0033:0x7f3cd76c6c44

>
> > issued before the initialization of provider_data. Fixing this by
> > initialize the provider_data through nd_region_desc to make sure the
> > provider_data is ready after the pmem is created.
> >
> > Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > ---
> >  drivers/nvdimm/virtio_pmem.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> > index 995b6cdc67ed..48f8327d0431 100644
> > --- a/drivers/nvdimm/virtio_pmem.c
> > +++ b/drivers/nvdimm/virtio_pmem.c
> > @@ -81,6 +81,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> >       ndr_desc.res = &res;
> >       ndr_desc.numa_node = nid;
> >       ndr_desc.flush = async_pmem_flush;
> > +     ndr_desc.provider_data = vdev;
>
> For my untrained eye, why not
> "dev_to_virtio(nd_region->dev.parent->parent)"? If that is indeed
> equivalent "vdev" then you can do a follow-on cleanup patch to reduce
> that syntax. Otherwise, if by chance they are not equivalent, then this
> conversion is introducing a new problem.

It is because nd_region hasn't been allocated at this time (which is
allocated by nd_region_create() afterwards).

Thanks

>
> Outside of that you can add:
>
> Reviewed-by: Dan Williams <dan.j.williams@intel.com>
>
Jason Wang June 22, 2022, 3:35 a.m. UTC | #6
On Tue, Jun 21, 2022 at 8:44 PM Pankaj Gupta
<pankaj.gupta.linux@gmail.com> wrote:
>
> > We used to initialize the provider_data manually after
> > nvdimm_pemm_region_create(). This seems to be racy if the flush is
> > issued before the initialization of provider_data. Fixing this by
> > initialize the provider_data through nd_region_desc to make sure the
> > provider_data is ready after the pmem is created.
> >
> > Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > ---
> >  drivers/nvdimm/virtio_pmem.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> > index 995b6cdc67ed..48f8327d0431 100644
> > --- a/drivers/nvdimm/virtio_pmem.c
> > +++ b/drivers/nvdimm/virtio_pmem.c
> > @@ -81,6 +81,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> >         ndr_desc.res = &res;
> >         ndr_desc.numa_node = nid;
> >         ndr_desc.flush = async_pmem_flush;
> > +       ndr_desc.provider_data = vdev;
> >         set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
> >         set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
> >         nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
> > @@ -89,7 +90,6 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> >                 err = -ENXIO;
> >                 goto out_nd;
> >         }
> > -       nd_region->provider_data = dev_to_virtio(nd_region->dev.parent->parent);
> >         return 0;
> >  out_nd:
> >         nvdimm_bus_unregister(vpmem->nvdimm_bus);
>
> Thank you for adding me.
>
> The patch seems correct to me. Will test this as well.
>
> Acked-by: Pankaj Gupta <pankaj.gupta@amd.com>

Thanks a lot.

I've done a round of tests and everything works well.

>
>
>
> Thanks,
> Pankaj
>
Michael S. Tsirkin June 24, 2022, 6:46 a.m. UTC | #7
On Wed, Jun 22, 2022 at 11:22:00AM +0800, Jason Wang wrote:
> On Wed, Jun 22, 2022 at 6:34 AM Dan Williams <dan.j.williams@intel.com> wrote:
> >
> > Jason Wang wrote:
> > > We used to initialize the provider_data manually after
> > > nvdimm_pemm_region_create(). This seems to be racy if the flush is
> >
> > It would be nice to include the actual backtrace / bug signature that
> > this fixes if it is available.
> 
> The bug was spotted during code review. But it can be reproduced by
> adding a msleep() between nvdimm_pmem_region_create() and
> nd_region->provider_data =
> dev_to_virtio(nd_region->dev.parent->parent);
> 
> diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> index 995b6cdc67ed..153d9dbfbe70 100644
> --- a/drivers/nvdimm/virtio_pmem.c
> +++ b/drivers/nvdimm/virtio_pmem.c
> @@ -8,6 +8,7 @@
>   */
>  #include "virtio_pmem.h"
>  #include "nd.h"
> +#include <linux/delay.h>
> 
>  static struct virtio_device_id id_table[] = {
>         { VIRTIO_ID_PMEM, VIRTIO_DEV_ANY_ID },
> @@ -89,6 +90,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
>                 err = -ENXIO;
>                 goto out_nd;
>         }
> +       msleep(100 * 1000);
>         nd_region->provider_data = dev_to_virtio(nd_region->dev.parent->parent);
>         return 0;
>  out_nd:
> 
> Then if we hotplug and try to do mkfs we get:
> 
> [   80.152281] nd_pmem namespace0.0: unable to guarantee persistence of writes
> [   92.393956] BUG: kernel NULL pointer dereference, address: 0000000000000318
> [   92.394551] #PF: supervisor read access in kernel mode
> [   92.394955] #PF: error_code(0x0000) - not-present page
> [   92.395365] PGD 0 P4D 0
> [   92.395566] Oops: 0000 [#1] PREEMPT SMP PTI
> [   92.395867] CPU: 2 PID: 506 Comm: mkfs.ext4 Not tainted 5.19.0-rc1+ #453
> [   92.396365] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009),
> BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> [   92.397178] RIP: 0010:virtio_pmem_flush+0x2f/0x1f0
> [   92.397521] Code: 55 41 54 55 53 48 81 ec a0 00 00 00 65 48 8b 04
> 25 28 00 00 00 48 89 84 24 98 00 00 00 31 c0 48 8b 87 78 03 00 00 48
> 89 04 24 <48> 8b 98 18 03 00 00 e8 85 bf 6b 00 ba 58 00 00 00 be c0 0c
> 00 00
> [   92.398982] RSP: 0018:ffff9a7380aefc88 EFLAGS: 00010246
> [   92.399349] RAX: 0000000000000000 RBX: ffff8e77c3f86f00 RCX: 0000000000000000
> [   92.399833] RDX: ffffffffad4ea720 RSI: ffff8e77c41e39c0 RDI: ffff8e77c41c5c00
> [   92.400388] RBP: ffff8e77c41e39c0 R08: ffff8e77c19f0600 R09: 0000000000000000
> [   92.400874] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8e77c0814e28
> [   92.401364] R13: 0000000000000000 R14: 0000000000000000 R15: ffff8e77c41e39c0
> [   92.401849] FS:  00007f3cd75b2780(0000) GS:ffff8e7937d00000(0000)
> knlGS:0000000000000000
> [   92.402423] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   92.402821] CR2: 0000000000000318 CR3: 0000000103c80002 CR4: 0000000000370ee0
> [   92.403307] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [   92.403793] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [   92.404278] Call Trace:
> [   92.404481]  <TASK>
> [   92.404654]  ? mempool_alloc+0x5d/0x160
> [   92.404939]  ? terminate_walk+0x5f/0xf0
> [   92.405226]  ? bio_alloc_bioset+0xbb/0x3f0
> [   92.405525]  async_pmem_flush+0x17/0x80
> [   92.405806]  nvdimm_flush+0x11/0x30
> [   92.406067]  pmem_submit_bio+0x1e9/0x200
> [   92.406354]  __submit_bio+0x80/0x120
> [   92.406621]  submit_bio_noacct_nocheck+0xdc/0x2a0
> [   92.406958]  submit_bio_wait+0x4e/0x80
> [   92.407234]  blkdev_issue_flush+0x31/0x50
> [   92.407526]  ? punt_bios_to_rescuer+0x230/0x230
> [   92.407852]  blkdev_fsync+0x1e/0x30
> [   92.408112]  do_fsync+0x33/0x70
> [   92.408354]  __x64_sys_fsync+0xb/0x10
> [   92.408625]  do_syscall_64+0x43/0x90
> [   92.408895]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
> [   92.409257] RIP: 0033:0x7f3cd76c6c44



Jason pls repost everything with this info included, and maybe really
do make the patch minimal as Dan suggested.

> >
> > > issued before the initialization of provider_data. Fixing this by
> > > initialize the provider_data through nd_region_desc to make sure the
> > > provider_data is ready after the pmem is created.
> > >
> > > Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > ---
> > >  drivers/nvdimm/virtio_pmem.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> > > index 995b6cdc67ed..48f8327d0431 100644
> > > --- a/drivers/nvdimm/virtio_pmem.c
> > > +++ b/drivers/nvdimm/virtio_pmem.c
> > > @@ -81,6 +81,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> > >       ndr_desc.res = &res;
> > >       ndr_desc.numa_node = nid;
> > >       ndr_desc.flush = async_pmem_flush;
> > > +     ndr_desc.provider_data = vdev;
> >
> > For my untrained eye, why not
> > "dev_to_virtio(nd_region->dev.parent->parent)"? If that is indeed
> > equivalent "vdev" then you can do a follow-on cleanup patch to reduce
> > that syntax. Otherwise, if by chance they are not equivalent, then this
> > conversion is introducing a new problem.
> 
> It is because nd_region hasn't been allocated at this time (which is
> allocated by nd_region_create() afterwards).
> 
> Thanks
> 
> >
> > Outside of that you can add:
> >
> > Reviewed-by: Dan Williams <dan.j.williams@intel.com>
> >
Jason Wang June 27, 2022, 2:31 a.m. UTC | #8
On Fri, Jun 24, 2022 at 2:46 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Wed, Jun 22, 2022 at 11:22:00AM +0800, Jason Wang wrote:
> > On Wed, Jun 22, 2022 at 6:34 AM Dan Williams <dan.j.williams@intel.com> wrote:
> > >
> > > Jason Wang wrote:
> > > > We used to initialize the provider_data manually after
> > > > nvdimm_pemm_region_create(). This seems to be racy if the flush is
> > >
> > > It would be nice to include the actual backtrace / bug signature that
> > > this fixes if it is available.
> >
> > The bug was spotted during code review. But it can be reproduced by
> > adding a msleep() between nvdimm_pmem_region_create() and
> > nd_region->provider_data =
> > dev_to_virtio(nd_region->dev.parent->parent);
> >
> > diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> > index 995b6cdc67ed..153d9dbfbe70 100644
> > --- a/drivers/nvdimm/virtio_pmem.c
> > +++ b/drivers/nvdimm/virtio_pmem.c
> > @@ -8,6 +8,7 @@
> >   */
> >  #include "virtio_pmem.h"
> >  #include "nd.h"
> > +#include <linux/delay.h>
> >
> >  static struct virtio_device_id id_table[] = {
> >         { VIRTIO_ID_PMEM, VIRTIO_DEV_ANY_ID },
> > @@ -89,6 +90,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> >                 err = -ENXIO;
> >                 goto out_nd;
> >         }
> > +       msleep(100 * 1000);
> >         nd_region->provider_data = dev_to_virtio(nd_region->dev.parent->parent);
> >         return 0;
> >  out_nd:
> >
> > Then if we hotplug and try to do mkfs we get:
> >
> > [   80.152281] nd_pmem namespace0.0: unable to guarantee persistence of writes
> > [   92.393956] BUG: kernel NULL pointer dereference, address: 0000000000000318
> > [   92.394551] #PF: supervisor read access in kernel mode
> > [   92.394955] #PF: error_code(0x0000) - not-present page
> > [   92.395365] PGD 0 P4D 0
> > [   92.395566] Oops: 0000 [#1] PREEMPT SMP PTI
> > [   92.395867] CPU: 2 PID: 506 Comm: mkfs.ext4 Not tainted 5.19.0-rc1+ #453
> > [   92.396365] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009),
> > BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> > [   92.397178] RIP: 0010:virtio_pmem_flush+0x2f/0x1f0
> > [   92.397521] Code: 55 41 54 55 53 48 81 ec a0 00 00 00 65 48 8b 04
> > 25 28 00 00 00 48 89 84 24 98 00 00 00 31 c0 48 8b 87 78 03 00 00 48
> > 89 04 24 <48> 8b 98 18 03 00 00 e8 85 bf 6b 00 ba 58 00 00 00 be c0 0c
> > 00 00
> > [   92.398982] RSP: 0018:ffff9a7380aefc88 EFLAGS: 00010246
> > [   92.399349] RAX: 0000000000000000 RBX: ffff8e77c3f86f00 RCX: 0000000000000000
> > [   92.399833] RDX: ffffffffad4ea720 RSI: ffff8e77c41e39c0 RDI: ffff8e77c41c5c00
> > [   92.400388] RBP: ffff8e77c41e39c0 R08: ffff8e77c19f0600 R09: 0000000000000000
> > [   92.400874] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8e77c0814e28
> > [   92.401364] R13: 0000000000000000 R14: 0000000000000000 R15: ffff8e77c41e39c0
> > [   92.401849] FS:  00007f3cd75b2780(0000) GS:ffff8e7937d00000(0000)
> > knlGS:0000000000000000
> > [   92.402423] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [   92.402821] CR2: 0000000000000318 CR3: 0000000103c80002 CR4: 0000000000370ee0
> > [   92.403307] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [   92.403793] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > [   92.404278] Call Trace:
> > [   92.404481]  <TASK>
> > [   92.404654]  ? mempool_alloc+0x5d/0x160
> > [   92.404939]  ? terminate_walk+0x5f/0xf0
> > [   92.405226]  ? bio_alloc_bioset+0xbb/0x3f0
> > [   92.405525]  async_pmem_flush+0x17/0x80
> > [   92.405806]  nvdimm_flush+0x11/0x30
> > [   92.406067]  pmem_submit_bio+0x1e9/0x200
> > [   92.406354]  __submit_bio+0x80/0x120
> > [   92.406621]  submit_bio_noacct_nocheck+0xdc/0x2a0
> > [   92.406958]  submit_bio_wait+0x4e/0x80
> > [   92.407234]  blkdev_issue_flush+0x31/0x50
> > [   92.407526]  ? punt_bios_to_rescuer+0x230/0x230
> > [   92.407852]  blkdev_fsync+0x1e/0x30
> > [   92.408112]  do_fsync+0x33/0x70
> > [   92.408354]  __x64_sys_fsync+0xb/0x10
> > [   92.408625]  do_syscall_64+0x43/0x90
> > [   92.408895]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
> > [   92.409257] RIP: 0033:0x7f3cd76c6c44
>
>
>
> Jason pls repost everything with this info included, and maybe really
> do make the patch minimal as Dan suggested.

Working on this, will post soon.

Thanks

>
> > >
> > > > issued before the initialization of provider_data. Fixing this by
> > > > initialize the provider_data through nd_region_desc to make sure the
> > > > provider_data is ready after the pmem is created.
> > > >
> > > > Fixes 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> > > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > > ---
> > > >  drivers/nvdimm/virtio_pmem.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
> > > > index 995b6cdc67ed..48f8327d0431 100644
> > > > --- a/drivers/nvdimm/virtio_pmem.c
> > > > +++ b/drivers/nvdimm/virtio_pmem.c
> > > > @@ -81,6 +81,7 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
> > > >       ndr_desc.res = &res;
> > > >       ndr_desc.numa_node = nid;
> > > >       ndr_desc.flush = async_pmem_flush;
> > > > +     ndr_desc.provider_data = vdev;
> > >
> > > For my untrained eye, why not
> > > "dev_to_virtio(nd_region->dev.parent->parent)"? If that is indeed
> > > equivalent "vdev" then you can do a follow-on cleanup patch to reduce
> > > that syntax. Otherwise, if by chance they are not equivalent, then this
> > > conversion is introducing a new problem.
> >
> > It is because nd_region hasn't been allocated at this time (which is
> > allocated by nd_region_create() afterwards).
> >
> > Thanks
> >
> > >
> > > Outside of that you can add:
> > >
> > > Reviewed-by: Dan Williams <dan.j.williams@intel.com>
> > >
>

Patch

diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
index 995b6cdc67ed..48f8327d0431 100644
--- a/drivers/nvdimm/virtio_pmem.c
+++ b/drivers/nvdimm/virtio_pmem.c
@@ -81,6 +81,7 @@  static int virtio_pmem_probe(struct virtio_device *vdev)
 	ndr_desc.res = &res;
 	ndr_desc.numa_node = nid;
 	ndr_desc.flush = async_pmem_flush;
+	ndr_desc.provider_data = vdev;
 	set_bit(ND_REGION_PAGEMAP, &ndr_desc.flags);
 	set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
 	nd_region = nvdimm_pmem_region_create(vpmem->nvdimm_bus, &ndr_desc);
@@ -89,7 +90,6 @@  static int virtio_pmem_probe(struct virtio_device *vdev)
 		err = -ENXIO;
 		goto out_nd;
 	}
-	nd_region->provider_data = dev_to_virtio(nd_region->dev.parent->parent);
 	return 0;
 out_nd:
 	nvdimm_bus_unregister(vpmem->nvdimm_bus);