[RFC,v1,07/12] staging: kpc2000: Prepare transfer_complete_cb() for PG_reserved changes

Message ID 20191022171239.21487-8-david@redhat.com (mailing list archive)
State New, archived
Series mm: Don't mark hotplugged pages PG_reserved (including ZONE_DEVICE)

Commit Message

David Hildenbrand Oct. 22, 2019, 5:12 p.m. UTC
Right now, ZONE_DEVICE memory is always set PG_reserved. We want to
change that.

The pages are obtained via get_user_pages_fast(). I assume these could
be ZONE_DEVICE pages, so let's explicitly exclude them as well.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Vandana BN <bnvandana@gmail.com>
Cc: "Simon Sandström" <simon@nikanor.nu>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Nishka Dasgupta <nishkadg.linux@gmail.com>
Cc: Madhumitha Prabakaran <madhumithabiw@gmail.com>
Cc: Fabio Estevam <festevam@gmail.com>
Cc: Matt Sickler <Matt.Sickler@daktronics.com>
Cc: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 drivers/staging/kpc2000/kpc_dma/fileops.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Matt Sickler Oct. 22, 2019, 5:55 p.m. UTC | #1
>Right now, ZONE_DEVICE memory is always set PG_reserved. We want to change that.
>
>The pages are obtained via get_user_pages_fast(). I assume, these could be ZONE_DEVICE pages. Let's just exclude them as well explicitly.

I'm not sure what ZONE_DEVICE pages are, but these pages are normal system RAM, typically HugePages (but not always).

>
>Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>Cc: Vandana BN <bnvandana@gmail.com>
>Cc: "Simon Sandström" <simon@nikanor.nu>
>Cc: Dan Carpenter <dan.carpenter@oracle.com>
>Cc: Nishka Dasgupta <nishkadg.linux@gmail.com>
>Cc: Madhumitha Prabakaran <madhumithabiw@gmail.com>
>Cc: Fabio Estevam <festevam@gmail.com>
>Cc: Matt Sickler <Matt.Sickler@daktronics.com>
>Cc: Jeremy Sowden <jeremy@azazel.net>
>Signed-off-by: David Hildenbrand <david@redhat.com>
>---
> drivers/staging/kpc2000/kpc_dma/fileops.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
>diff --git a/drivers/staging/kpc2000/kpc_dma/fileops.c b/drivers/staging/kpc2000/kpc_dma/fileops.c
>index cb52bd9a6d2f..457adcc81fe6 100644
>--- a/drivers/staging/kpc2000/kpc_dma/fileops.c
>+++ b/drivers/staging/kpc2000/kpc_dma/fileops.c
>@@ -212,7 +212,8 @@ void  transfer_complete_cb(struct aio_cb_data *acd, size_t xfr_count, u32 flags)
>        BUG_ON(acd->ldev->pldev == NULL);
>
>        for (i = 0 ; i < acd->page_count ; i++) {
>-               if (!PageReserved(acd->user_pages[i])) {
>+               if (!PageReserved(acd->user_pages[i]) &&
>+                   !is_zone_device_page(acd->user_pages[i])) {
>                        set_page_dirty(acd->user_pages[i]);
>                }
>        }
>--
>2.21.0

David Hildenbrand Oct. 22, 2019, 9:01 p.m. UTC | #2
On 22.10.19 19:55, Matt Sickler wrote:
>> Right now, ZONE_DEVICE memory is always set PG_reserved. We want to change that.
>>
>> The pages are obtained via get_user_pages_fast(). I assume, these could be ZONE_DEVICE pages. Let's just exclude them as well explicitly.
> 
> I'm not sure what ZONE_DEVICE pages are, but these pages are normal system RAM, typically HugePages (but not always).

ZONE_DEVICE pages, a.k.a. devmem, are pages that bypass the pagecache
(e.g., DAX) completely and will therefore never get swapped. These pages
are not managed by any page allocator (especially not the buddy
allocator); they are rather "directly mapped device memory".

Take an NVDIMM, for example. It is mapped into the physical address
space similarly to ordinary RAM (a DIMM). Any write to such a PFN will
directly end up on the target device. In contrast to a DIMM, the memory
is persistent across reboots.

Now, if you mmap such an NVDIMM into a user space process, you will end 
up with ZONE_DEVICE pages as part of the user space mapping (VMA). 
get_user_pages_fast() on this memory will result in "struct pages" that 
belong to ZONE_DEVICE. This is where this patch comes into play.

This patch makes sure that there is absolutely no change in behavior
once we stop setting PG_reserved on these ZONE_DEVICE pages. E.g.,
AFAIK, setting a ZONE_DEVICE page dirty does not make much sense (such
pages are never swapped).
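
To illustrate the "no change" argument, here is a rough user-space sketch, not kernel code. Everything named below (struct mock_page and its fields, old_should_dirty(), new_should_dirty(), mark_pages_dirty(), check_model()) is hypothetical and only models the two conditions from the diff; the real code uses PageReserved(), is_zone_device_page() and set_page_dirty() on struct page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for the kernel's struct page. */
struct mock_page {
	bool reserved;     /* models PG_reserved */
	bool zone_device;  /* models "page belongs to ZONE_DEVICE" */
	bool dirty;        /* models the effect of set_page_dirty() */
};

/* Old condition in transfer_complete_cb(): only PG_reserved is checked. */
static bool old_should_dirty(const struct mock_page *p)
{
	return !p->reserved;
}

/* New condition from the patch: ZONE_DEVICE pages are excluded as well. */
static bool new_should_dirty(const struct mock_page *p)
{
	return !p->reserved && !p->zone_device;
}

/* Mirrors the loop in the patch; returns how many pages were dirtied. */
static size_t mark_pages_dirty(struct mock_page *pages, size_t count,
			       bool (*should_dirty)(const struct mock_page *))
{
	size_t i, dirtied = 0;

	for (i = 0; i < count; i++) {
		if (should_dirty(&pages[i])) {
			pages[i].dirty = true;
			dirtied++;
		}
	}
	return dirtied;
}

/* Walks through the three cases discussed in this thread. */
static void check_model(void)
{
	/* Today: a ZONE_DEVICE page is also PG_reserved. */
	struct mock_page today = { .reserved = true, .zone_device = true };
	/* After the series: same page, PG_reserved no longer set. */
	struct mock_page later = { .reserved = false, .zone_device = true };
	/* Ordinary system RAM. */
	struct mock_page ram = { .reserved = false, .zone_device = false };

	/* Both conditions skip the page today. */
	assert(!old_should_dirty(&today) && !new_should_dirty(&today));
	/* The old condition alone would start dirtying it ... */
	assert(old_should_dirty(&later));
	/* ... while the patched condition keeps skipping it. */
	assert(!new_should_dirty(&later));
	/* Ordinary RAM is dirtied either way. */
	assert(old_should_dirty(&ram) && new_should_dirty(&ram));
}
```

In this model the two conditions only differ for a page that is ZONE_DEVICE but not reserved, which is exactly the case that would appear once the series lands.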

Yes, it might not be a likely setup; however, it is possible. In this
series I collect all places that *could* be affected. Whether that
change is really needed has to be decided. I can see that the two
staging drivers I have patches for might be able to just live with the
change, but then at least we will have talked about it and be aware of
the change.

Thanks!

Patch

diff --git a/drivers/staging/kpc2000/kpc_dma/fileops.c b/drivers/staging/kpc2000/kpc_dma/fileops.c
index cb52bd9a6d2f..457adcc81fe6 100644
--- a/drivers/staging/kpc2000/kpc_dma/fileops.c
+++ b/drivers/staging/kpc2000/kpc_dma/fileops.c
@@ -212,7 +212,8 @@  void  transfer_complete_cb(struct aio_cb_data *acd, size_t xfr_count, u32 flags)
 	BUG_ON(acd->ldev->pldev == NULL);
 
 	for (i = 0 ; i < acd->page_count ; i++) {
-		if (!PageReserved(acd->user_pages[i])) {
+		if (!PageReserved(acd->user_pages[i]) &&
+		    !is_zone_device_page(acd->user_pages[i])) {
 			set_page_dirty(acd->user_pages[i]);
 		}
 	}