diff mbox series

[v2] x86/hibernate: Use fixmap for saving unmapped pages

Message ID 20230110211839.19572-1-rick.p.edgecombe@intel.com (mailing list archive)
State Changes Requested, archived
Headers show
Series [v2] x86/hibernate: Use fixmap for saving unmapped pages | expand

Commit Message

Edgecombe, Rick P Jan. 10, 2023, 9:18 p.m. UTC
Hibernate uses the direct map to read memory it saves to disk. Since
sometimes pages are not accessible on the direct map ("not present" on
x86), it has special case logic to temporarily make a page present. On x86
these direct map addresses can be mapped at various page sizes, but the
logic works ok as long as the not present pages are always mapped as
PAGE_SIZE such that they don't require a split to map the region as
present. If the address was mapped not present by a larger page size, the
split may fail and hibernate would then try to read an address mapped not
present.

Today on x86 there are no known cases of this (huge not present pages on
the direct map), but it has come up from time to time when developing
things that operate on the direct map. It blocked making
VM_FLUSH_RESET_PERMS support huge vmalloc when that came up, and also
has been a complication for various direct map protection efforts.

This dependency is also pretty hidden and easily missed by people poking at
the direct map. For this reason, there are warnings in place to complain
but not handle this scenario.

One way to make this more robust would be to create some new CPA
functionality that can know to map and reset the whole huge page in the
case of trying to map a subpage. But for simplicity and smaller code, just
make x86 hibernate have its own fixmap PTE that it can use to point
to 4k pages when it encounters an unmapped direct map page.

Move do_copy_page() to a header such that it can be used in an arch
breakout. Rename it hib_copy_page() to be more hibernate specific since
it could appear in other files.

Use __weak for the arch breakout because there is not a suitable arch
specific header to use the #define method.

Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
---

Rebase to v6.2-rc3 (original still applied) and resending per:
https://lore.kernel.org/lkml/CAJZ5v0i6cxGD+V6G+q-Y_Lp-ov51_zmkZr8ZGpCtqWV-e=BsLg@mail.gmail.com/

 arch/x86/include/asm/fixmap.h |  3 +++
 arch/x86/power/hibernate.c    | 10 ++++++++++
 include/linux/suspend.h       | 13 +++++++++++++
 kernel/power/snapshot.c       | 21 +++++++--------------
 4 files changed, 33 insertions(+), 14 deletions(-)

Comments

Rafael J. Wysocki Jan. 11, 2023, 8:35 p.m. UTC | #1
On Tue, Jan 10, 2023 at 10:19 PM Rick Edgecombe
<rick.p.edgecombe@intel.com> wrote:
>
> Hibernate uses the direct map to read memory it saves to disk. Since
> sometimes pages are not accessible on the direct map ("not present" on
> x86), it has special case logic to temporarily make a page present. On x86
> these direct map addresses can be mapped at various page sizes, but the
> logic works ok as long as the not present pages are always mapped as
> PAGE_SIZE such that they don't require a split to map the region as
> present. If the address was mapped not present by a larger page size, the
> split may fail and hibernate would then try to read an address mapped not
> present.
>
> Today on x86 there are no known cases of this (huge not present pages on
> the direct map), but it has come up from time to time when developing
> things that operate on the direct map. It blocked making
> VM_FLUSH_RESET_PERMS support huge vmalloc when that came up, and also
> has been a complication for various direct map protection efforts.
>
> This dependency is also pretty hidden and easily missed by people poking at
> the direct map. For this reason, there are warnings in place to complain
> but not handle this scenario.
>
> One way to make this more robust would be to create some new CPA
> functionality that can know to map and reset the whole huge page in the
> case of trying to map a subpage. But for simplicity and smaller code, just
> make x86 hibernate have its own fixmap PTE that it can use to point
> to 4k pages when it encounters an unmapped direct map page.
>
> Move do_copy_page() to a header such that it can be used in an arch
> breakout. Rename it hib_copy_page() to be more hibernate specific since
> it could appear in other files.
>
> Use __weak for the arch breakout because there is not a suitable arch
> specific header to use the #define method.
>
> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
> ---
>
> Rebase to v6.2-rc3 (original still applied) and resending per:
> https://lore.kernel.org/lkml/CAJZ5v0i6cxGD+V6G+q-Y_Lp-ov51_zmkZr8ZGpCtqWV-e=BsLg@mail.gmail.com/
>
>  arch/x86/include/asm/fixmap.h |  3 +++
>  arch/x86/power/hibernate.c    | 10 ++++++++++
>  include/linux/suspend.h       | 13 +++++++++++++
>  kernel/power/snapshot.c       | 21 +++++++--------------
>  4 files changed, 33 insertions(+), 14 deletions(-)
>
> diff --git a/arch/x86/include/asm/fixmap.h b/arch/x86/include/asm/fixmap.h
> index d0dcefb5cc59..0fceed9a4152 100644
> --- a/arch/x86/include/asm/fixmap.h
> +++ b/arch/x86/include/asm/fixmap.h
> @@ -108,6 +108,9 @@ enum fixed_addresses {
>  #ifdef CONFIG_PARAVIRT_XXL
>         FIX_PARAVIRT_BOOTMAP,
>  #endif
> +#ifdef CONFIG_HIBERNATION
> +       FIX_HIBERNATE,
> +#endif
>
>  #ifdef CONFIG_ACPI_APEI_GHES
>         /* Used for GHES mapping from assorted contexts */
> diff --git a/arch/x86/power/hibernate.c b/arch/x86/power/hibernate.c
> index 6f955eb1e163..473b6b5f6b7e 100644
> --- a/arch/x86/power/hibernate.c
> +++ b/arch/x86/power/hibernate.c
> @@ -147,6 +147,16 @@ int arch_hibernation_header_restore(void *addr)
>         return 0;
>  }
>
> +void copy_unmapped_page(void *dst, struct page *page)
> +{
> +       WARN_ON(!preempt_count());

I don't think the above is needed.  The code using this function
cannot be preempted anyway AFAICS.

> +
> +       set_fixmap(FIX_HIBERNATE, page_to_phys(page));
> +       __flush_tlb_all();

So do TLBs need to be flushed before copying every single page?
Basically, they are all copied in one loop.

> +       hib_copy_page(dst, (void *)fix_to_virt(FIX_HIBERNATE));
> +       clear_fixmap(FIX_HIBERNATE);
> +}
> +
>  int relocate_restore_code(void)
>  {
>         pgd_t *pgd;
> diff --git a/include/linux/suspend.h b/include/linux/suspend.h
> index cfe19a028918..0b19b910526e 100644
> --- a/include/linux/suspend.h
> +++ b/include/linux/suspend.h
> @@ -447,6 +447,19 @@ extern bool hibernation_available(void);
>  asmlinkage int swsusp_save(void);
>  extern struct pbe *restore_pblist;
>  int pfn_is_nosave(unsigned long pfn);
> +void copy_unmapped_page(void *dst, struct page *page);
> +
> +/*
> + * This is needed, because copy_page and memcpy are not usable for copying
> + * task structs.
> + */
> +static inline void hib_copy_page(long *dst, long *src)
> +{
> +       int n;
> +
> +       for (n = PAGE_SIZE / sizeof(long); n; n--)
> +               *dst++ = *src++;
> +}
>
>  int hibernate_quiet_exec(int (*func)(void *data), void *data);
>  #else /* CONFIG_HIBERNATION */
> diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
> index cd8b7b35f1e8..344c071f29d3 100644
> --- a/kernel/power/snapshot.c
> +++ b/kernel/power/snapshot.c
> @@ -1369,16 +1369,11 @@ static unsigned int count_data_pages(void)
>         return n;
>  }
>
> -/*
> - * This is needed, because copy_page and memcpy are not usable for copying
> - * task structs.
> - */
> -static inline void do_copy_page(long *dst, long *src)
> +void __weak copy_unmapped_page(void *dst, struct page *page)
>  {
> -       int n;
> -
> -       for (n = PAGE_SIZE / sizeof(long); n; n--)
> -               *dst++ = *src++;
> +       hibernate_map_page(page);
> +       hib_copy_page(dst, page_address(page));
> +       hibernate_unmap_page(page);
>  }
>
>  /**
> @@ -1392,11 +1387,9 @@ static inline void do_copy_page(long *dst, long *src)
>  static void safe_copy_page(void *dst, struct page *s_page)
>  {
>         if (kernel_page_present(s_page)) {
> -               do_copy_page(dst, page_address(s_page));
> +               hib_copy_page(dst, page_address(s_page));
>         } else {
> -               hibernate_map_page(s_page);
> -               do_copy_page(dst, page_address(s_page));
> -               hibernate_unmap_page(s_page);
> +               copy_unmapped_page(dst, s_page);
>         }
>  }
>
> @@ -1417,7 +1410,7 @@ static void copy_data_page(unsigned long dst_pfn, unsigned long src_pfn)
>         if (PageHighMem(s_page)) {
>                 src = kmap_atomic(s_page);
>                 dst = kmap_atomic(d_page);
> -               do_copy_page(dst, src);
> +               hib_copy_page(dst, src);
>                 kunmap_atomic(dst);
>                 kunmap_atomic(src);
>         } else {
> --
> 2.17.1
>
Edgecombe, Rick P Jan. 11, 2023, 8:46 p.m. UTC | #2
On Wed, 2023-01-11 at 21:35 +0100, Rafael J. Wysocki wrote:
> On Tue, Jan 10, 2023 at 10:19 PM Rick Edgecombe
> <rick.p.edgecombe@intel.com> wrote:
> > 
> > Hibernate uses the direct map to read memory it saves to disk.
> > Since
> > sometimes pages are not accessible on the direct map ("not present"
> > on
> > x86), it has special case logic to temporarily make a page present.
> > On x86
> > these direct map addresses can be mapped at various page sizes, but
> > the
> > logic works ok as long as the not present pages are always mapped
> > as
> > PAGE_SIZE such that they don't require a split to map the region as
> > present. If the address was mapped not present by a larger page
> > size, the
> > split may fail and hibernate would then try to read an address
> > mapped not
> > present.
> > 
> > Today on x86 there are no known cases of this (huge not present
> > pages on
> > the direct map), but it has come up from time to time when
> > developing
> > things that operate on the direct map. It blocked making
> > VM_FLUSH_RESET_PERMS support huge vmalloc when that came up, and
> > also
> > has been a complication for various direct map protection efforts.
> > 
> > This dependency is also pretty hidden and easily missed by people
> > poking at
> > the direct map. For this reason, there are warnings in place to
> > complain
> > but not handle this scenario.
> > 
> > One way to make this more robust would be to create some new CPA
> > functionality that can know to map and reset the whole huge page in
> > the
> > case of trying to map a subpage. But for simplicity and smaller
> > code, just
> > make x86 hibernate have its own fixmap PTE that it can use to point
> > to 4k pages when it encounters an unmapped direct map page.
> > 
> > Move do_copy_page() to a header such that it can be used in an arch
> > breakout. Rename it hib_copy_page() to be more hibernate specific
> > since
> > it could appear in other files.
> > 
> > Use __weak for the arch breakout because there is not a suitable
> > arch
> > specific header to use the #define method.
> > 
> > Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
> > ---
> > 
> > Rebase to v6.2-rc3 (original still applied) and resending per:
> > 
https://lore.kernel.org/lkml/CAJZ5v0i6cxGD+V6G+q-Y_Lp-ov51_zmkZr8ZGpCtqWV-e=BsLg@mail.gmail.com/
> > 
> >  arch/x86/include/asm/fixmap.h |  3 +++
> >  arch/x86/power/hibernate.c    | 10 ++++++++++
> >  include/linux/suspend.h       | 13 +++++++++++++
> >  kernel/power/snapshot.c       | 21 +++++++--------------
> >  4 files changed, 33 insertions(+), 14 deletions(-)
> > 
> > diff --git a/arch/x86/include/asm/fixmap.h
> > b/arch/x86/include/asm/fixmap.h
> > index d0dcefb5cc59..0fceed9a4152 100644
> > --- a/arch/x86/include/asm/fixmap.h
> > +++ b/arch/x86/include/asm/fixmap.h
> > @@ -108,6 +108,9 @@ enum fixed_addresses {
> >  #ifdef CONFIG_PARAVIRT_XXL
> >         FIX_PARAVIRT_BOOTMAP,
> >  #endif
> > +#ifdef CONFIG_HIBERNATION
> > +       FIX_HIBERNATE,
> > +#endif
> > 
> >  #ifdef CONFIG_ACPI_APEI_GHES
> >         /* Used for GHES mapping from assorted contexts */
> > diff --git a/arch/x86/power/hibernate.c
> > b/arch/x86/power/hibernate.c
> > index 6f955eb1e163..473b6b5f6b7e 100644
> > --- a/arch/x86/power/hibernate.c
> > +++ b/arch/x86/power/hibernate.c
> > @@ -147,6 +147,16 @@ int arch_hibernation_header_restore(void
> > *addr)
> >         return 0;
> >  }
> > 
> > +void copy_unmapped_page(void *dst, struct page *page)
> > +{
> > +       WARN_ON(!preempt_count());
> 
> I don't think the above is needed.  The code using this function
> cannot be preempted anyway AFAICS.

The reason I thought it was useful was because this function is now
defined in a header. Someone else might decide to use it. Does it seem
more useful?

> 
> > +
> > +       set_fixmap(FIX_HIBERNATE, page_to_phys(page));
> > +       __flush_tlb_all();
> 
> So do TLBs need to be flushed before copying every single page?
> Basically, they are all copied in one loop.

It is only one fixmap entry so it needs to be flushed after changing
the PTE to point to a different page. But this is only for the case of
unmapped pages, the more common mapped pages are copied from the direct
map like usual.

> 
> > +       hib_copy_page(dst, (void *)fix_to_virt(FIX_HIBERNATE));
> > +       clear_fixmap(FIX_HIBERNATE);
> > +}
> > +
> >  int relocate_restore_code(void)
> >  {
> >         pgd_t *pgd;
> > diff --git a/include/linux/suspend.h b/include/linux/suspend.h
> > index cfe19a028918..0b19b910526e 100644
> > --- a/include/linux/suspend.h
> > +++ b/include/linux/suspend.h
> > @@ -447,6 +447,19 @@ extern bool hibernation_available(void);
> >  asmlinkage int swsusp_save(void);
> >  extern struct pbe *restore_pblist;
> >  int pfn_is_nosave(unsigned long pfn);
> > +void copy_unmapped_page(void *dst, struct page *page);
> > +
> > +/*
> > + * This is needed, because copy_page and memcpy are not usable for
> > copying
> > + * task structs.
> > + */
> > +static inline void hib_copy_page(long *dst, long *src)
> > +{
> > +       int n;
> > +
> > +       for (n = PAGE_SIZE / sizeof(long); n; n--)
> > +               *dst++ = *src++;
> > +}
> > 
> >  int hibernate_quiet_exec(int (*func)(void *data), void *data);
> >  #else /* CONFIG_HIBERNATION */
> > diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
> > index cd8b7b35f1e8..344c071f29d3 100644
> > --- a/kernel/power/snapshot.c
> > +++ b/kernel/power/snapshot.c
> > @@ -1369,16 +1369,11 @@ static unsigned int count_data_pages(void)
> >         return n;
> >  }
> > 
> > -/*
> > - * This is needed, because copy_page and memcpy are not usable for
> > copying
> > - * task structs.
> > - */
> > -static inline void do_copy_page(long *dst, long *src)
> > +void __weak copy_unmapped_page(void *dst, struct page *page)
> >  {
> > -       int n;
> > -
> > -       for (n = PAGE_SIZE / sizeof(long); n; n--)
> > -               *dst++ = *src++;
> > +       hibernate_map_page(page);
> > +       hib_copy_page(dst, page_address(page));
> > +       hibernate_unmap_page(page);
> >  }
> > 
> >  /**
> > @@ -1392,11 +1387,9 @@ static inline void do_copy_page(long *dst,
> > long *src)
> >  static void safe_copy_page(void *dst, struct page *s_page)
> >  {
> >         if (kernel_page_present(s_page)) {
> > -               do_copy_page(dst, page_address(s_page));
> > +               hib_copy_page(dst, page_address(s_page));
> >         } else {
> > -               hibernate_map_page(s_page);
> > -               do_copy_page(dst, page_address(s_page));
> > -               hibernate_unmap_page(s_page);
> > +               copy_unmapped_page(dst, s_page);
> >         }
> >  }
> > 
> > @@ -1417,7 +1410,7 @@ static void copy_data_page(unsigned long
> > dst_pfn, unsigned long src_pfn)
> >         if (PageHighMem(s_page)) {
> >                 src = kmap_atomic(s_page);
> >                 dst = kmap_atomic(d_page);
> > -               do_copy_page(dst, src);
> > +               hib_copy_page(dst, src);
> >                 kunmap_atomic(dst);
> >                 kunmap_atomic(src);
> >         } else {
> > --
> > 2.17.1
> >
Rafael J. Wysocki Jan. 12, 2023, 2:15 p.m. UTC | #3
On Wed, Jan 11, 2023 at 9:46 PM Edgecombe, Rick P
<rick.p.edgecombe@intel.com> wrote:
>
> On Wed, 2023-01-11 at 21:35 +0100, Rafael J. Wysocki wrote:
> > On Tue, Jan 10, 2023 at 10:19 PM Rick Edgecombe
> > <rick.p.edgecombe@intel.com> wrote:
> > >
> > > Hibernate uses the direct map to read memory it saves to disk.
> > > Since
> > > sometimes pages are not accessible on the direct map ("not present"
> > > on
> > > x86), it has special case logic to temporarily make a page present.
> > > On x86
> > > these direct map addresses can be mapped at various page sizes, but
> > > the
> > > logic works ok as long as the not present pages are always mapped
> > > as
> > > PAGE_SIZE such that they don't require a split to map the region as
> > > present. If the address was mapped not present by a larger page
> > > size, the
> > > split may fail and hibernate would then try to read an address
> > > mapped not
> > > present.
> > >
> > > Today on x86 there are no known cases of this (huge not present
> > > pages on
> > > the direct map), but it has come up from time to time when
> > > developing
> > > things that operate on the direct map. It blocked making
> > > VM_FLUSH_RESET_PERMS support huge vmalloc when that came up, and
> > > also
> > > has been a complication for various direct map protection efforts.
> > >
> > > This dependency is also pretty hidden and easily missed by people
> > > poking at
> > > the direct map. For this reason, there are warnings in place to
> > > complain
> > > but not handle this scenario.
> > >
> > > One way to make this more robust would be to create some new CPA
> > > functionality that can know to map and reset the whole huge page in
> > > the
> > > case of trying to map a subpage. But for simplicity and smaller
> > > code, just
> > > make x86 hibernate have its own fixmap PTE that it can use to point
> > > to 4k pages when it encounters an unmapped direct map page.
> > >
> > > Move do_copy_page() to a header such that it can be used in an arch
> > > breakout. Rename it hib_copy_page() to be more hibernate specific
> > > since
> > > it could appear in other files.
> > >
> > > Use __weak for the arch breakout because there is not a suitable
> > > arch
> > > specific header to use the #define method.
> > >
> > > Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
> > > ---
> > >
> > > Rebase to v6.2-rc3 (original still applied) and resending per:
> > >
> https://lore.kernel.org/lkml/CAJZ5v0i6cxGD+V6G+q-Y_Lp-ov51_zmkZr8ZGpCtqWV-e=BsLg@mail.gmail.com/
> > >
> > >  arch/x86/include/asm/fixmap.h |  3 +++
> > >  arch/x86/power/hibernate.c    | 10 ++++++++++
> > >  include/linux/suspend.h       | 13 +++++++++++++
> > >  kernel/power/snapshot.c       | 21 +++++++--------------
> > >  4 files changed, 33 insertions(+), 14 deletions(-)
> > >
> > > diff --git a/arch/x86/include/asm/fixmap.h
> > > b/arch/x86/include/asm/fixmap.h
> > > index d0dcefb5cc59..0fceed9a4152 100644
> > > --- a/arch/x86/include/asm/fixmap.h
> > > +++ b/arch/x86/include/asm/fixmap.h
> > > @@ -108,6 +108,9 @@ enum fixed_addresses {
> > >  #ifdef CONFIG_PARAVIRT_XXL
> > >         FIX_PARAVIRT_BOOTMAP,
> > >  #endif
> > > +#ifdef CONFIG_HIBERNATION
> > > +       FIX_HIBERNATE,
> > > +#endif
> > >
> > >  #ifdef CONFIG_ACPI_APEI_GHES
> > >         /* Used for GHES mapping from assorted contexts */
> > > diff --git a/arch/x86/power/hibernate.c
> > > b/arch/x86/power/hibernate.c
> > > index 6f955eb1e163..473b6b5f6b7e 100644
> > > --- a/arch/x86/power/hibernate.c
> > > +++ b/arch/x86/power/hibernate.c
> > > @@ -147,6 +147,16 @@ int arch_hibernation_header_restore(void
> > > *addr)
> > >         return 0;
> > >  }
> > >
> > > +void copy_unmapped_page(void *dst, struct page *page)
> > > +{
> > > +       WARN_ON(!preempt_count());
> >
> > I don't think the above is needed.  The code using this function
> > cannot be preempted anyway AFAICS.
>
> The reason I thought it was useful was because this function is now
> defined in a header. Someone else might decide to use it. Does it seem
> more useful?

Well, it is exposed now, but only in order to allow the __weak
function to be overridden.  I don't think it is logically valid to use
it anywhere beyond its original call site.

To make that clear, I would call it something hibernation-specific,
like hibernate_copy_unmapped_page() and I would add a kerneldoc
comment to it to describe its intended use.

Furthermore, I'm not sure about the new code layout.

Personally, I would prefer hibernate_map_page() and
hibernate_unmap_page() to be turned into __weak functions and possibly
overridden by the arch code, which would allow the amount of changes
to be reduced and do_copy_page() wouldn't need to be moved into the
header any more.

> >
> > > +
> > > +       set_fixmap(FIX_HIBERNATE, page_to_phys(page));
> > > +       __flush_tlb_all();
> >
> > So do TLBs need to be flushed before copying every single page?
> > Basically, they are all copied in one loop.
>
> It is only one fixmap entry so it needs to be flushed after changing
> the PTE to point to a different page. But this is only for the case of
> unmapped pages, the more common mapped pages are copied from the direct
> map like usual.

OK

> >
> > > +       hib_copy_page(dst, (void *)fix_to_virt(FIX_HIBERNATE));
> > > +       clear_fixmap(FIX_HIBERNATE);
> > > +}
> > > +
> > >  int relocate_restore_code(void)
> > >  {
> > >         pgd_t *pgd;
> > > diff --git a/include/linux/suspend.h b/include/linux/suspend.h
> > > index cfe19a028918..0b19b910526e 100644
> > > --- a/include/linux/suspend.h
> > > +++ b/include/linux/suspend.h
> > > @@ -447,6 +447,19 @@ extern bool hibernation_available(void);
> > >  asmlinkage int swsusp_save(void);
> > >  extern struct pbe *restore_pblist;
> > >  int pfn_is_nosave(unsigned long pfn);
> > > +void copy_unmapped_page(void *dst, struct page *page);
> > > +
> > > +/*
> > > + * This is needed, because copy_page and memcpy are not usable for
> > > copying
> > > + * task structs.
> > > + */
> > > +static inline void hib_copy_page(long *dst, long *src)
> > > +{
> > > +       int n;
> > > +
> > > +       for (n = PAGE_SIZE / sizeof(long); n; n--)
> > > +               *dst++ = *src++;
> > > +}
> > >
> > >  int hibernate_quiet_exec(int (*func)(void *data), void *data);
> > >  #else /* CONFIG_HIBERNATION */
> > > diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
> > > index cd8b7b35f1e8..344c071f29d3 100644
> > > --- a/kernel/power/snapshot.c
> > > +++ b/kernel/power/snapshot.c
> > > @@ -1369,16 +1369,11 @@ static unsigned int count_data_pages(void)
> > >         return n;
> > >  }
> > >
> > > -/*
> > > - * This is needed, because copy_page and memcpy are not usable for
> > > copying
> > > - * task structs.
> > > - */
> > > -static inline void do_copy_page(long *dst, long *src)
> > > +void __weak copy_unmapped_page(void *dst, struct page *page)
> > >  {
> > > -       int n;
> > > -
> > > -       for (n = PAGE_SIZE / sizeof(long); n; n--)
> > > -               *dst++ = *src++;
> > > +       hibernate_map_page(page);
> > > +       hib_copy_page(dst, page_address(page));
> > > +       hibernate_unmap_page(page);
> > >  }
> > >
> > >  /**
> > > @@ -1392,11 +1387,9 @@ static inline void do_copy_page(long *dst,
> > > long *src)
> > >  static void safe_copy_page(void *dst, struct page *s_page)
> > >  {
> > >         if (kernel_page_present(s_page)) {
> > > -               do_copy_page(dst, page_address(s_page));
> > > +               hib_copy_page(dst, page_address(s_page));
> > >         } else {
> > > -               hibernate_map_page(s_page);
> > > -               do_copy_page(dst, page_address(s_page));
> > > -               hibernate_unmap_page(s_page);
> > > +               copy_unmapped_page(dst, s_page);
> > >         }
> > >  }
> > >
> > > @@ -1417,7 +1410,7 @@ static void copy_data_page(unsigned long
> > > dst_pfn, unsigned long src_pfn)
> > >         if (PageHighMem(s_page)) {
> > >                 src = kmap_atomic(s_page);
> > >                 dst = kmap_atomic(d_page);
> > > -               do_copy_page(dst, src);
> > > +               hib_copy_page(dst, src);
> > >                 kunmap_atomic(dst);
> > >                 kunmap_atomic(src);
> > >         } else {
> > > --
> > > 2.17.1
> > >
Edgecombe, Rick P Jan. 12, 2023, 7:15 p.m. UTC | #4
On Thu, 2023-01-12 at 15:15 +0100, Rafael J. Wysocki wrote:
> > > 
> > > I don't think the above is needed.  The code using this function
> > > cannot be preempted anyway AFAICS.
> > 
> > The reason I thought it was useful was because this function is now
> > defined in a header. Someone else might decide to use it. Does it
> > seem
> > more useful?
> 
> Well, it is exposed now, but only in order to allow the __weak
> function to be overridden.  I don't think it is logically valid to
> use
> it anywhere beyond its original call site.
> 
> To make that clear, I would call it something hibernation-specific,
> like hibernate_copy_unmapped_page() and I would add a kerneldoc
> comment to it to describe its intended use.

Ok, I'll change the name, that makes sense.

On the warning, ok, I'll drop it. But to me the code stand out as
questionable with the PTE change and only the local TLB flush. It's a
bit of a comment as code on a rare path.

> 
> Furthermore, I'm not sure about the new code layout.
> 
> Personally, I would prefer hibernate_map_page() and
> hibernate_unmap_page() to be turned into __weak functions and
> possibly
> overridden by the arch code, which would allow the amount of changes
> to be reduced and do_copy_page() wouldn't need to be moved into the
> header any more.

Currently hibernate_map_page() maps the page on the direct map and
doesn't return anything. This new code effectively creates a readable
alias in the fixmap. So it would have to return an address to use so
the core hibernate code would know where to copy from. Then it would
have to pass it back into hibernate_unmap_page() for the arch to decide
what to do to clean it up. I think it would be more complicated.

There is also already multiple paths in hibernate_map_page() that would
have to be duplicated in the arch versions.

So I see the idea, but I'm not sure it ends up better. Can we leave
this one?

Thanks,
Rick
Rafael J. Wysocki Jan. 12, 2023, 7:37 p.m. UTC | #5
On Thu, Jan 12, 2023 at 8:15 PM Edgecombe, Rick P
<rick.p.edgecombe@intel.com> wrote:
>
> On Thu, 2023-01-12 at 15:15 +0100, Rafael J. Wysocki wrote:
> > > >
> > > > I don't think the above is needed.  The code using this function
> > > > cannot be preempted anyway AFAICS.
> > >
> > > The reason I thought it was useful was because this function is now
> > > defined in a header. Someone else might decide to use it. Does it
> > > seem
> > > more useful?
> >
> > Well, it is exposed now, but only in order to allow the __weak
> > function to be overridden.  I don't think it is logically valid to
> > use
> > it anywhere beyond its original call site.
> >
> > To make that clear, I would call it something hibernation-specific,
> > like hibernate_copy_unmapped_page() and I would add a kerneldoc
> > comment to it to describe its intended use.
>
> Ok, I'll change the name, that makes sense.
>
> On the warning, ok, I'll drop it. But to me the code stand out as
> questionable with the PTE change and only the local TLB flush. It's a
> bit of a comment as code on a rare path.
>
> >
> > Furthermore, I'm not sure about the new code layout.
> >
> > Personally, I would prefer hibernate_map_page() and
> > hibernate_unmap_page() to be turned into __weak functions and
> > possibly
> > overridden by the arch code, which would allow the amount of changes
> > to be reduced and do_copy_page() wouldn't need to be moved into the
> > header any more.
>
> Currently hibernate_map_page() maps the page on the direct map and
> doesn't return anything. This new code effectively creates a readable
> alias in the fixmap. So it would have to return an address to use so
> the core hibernate code would know where to copy from. Then it would
> have to pass it back into hibernate_unmap_page() for the arch to decide
> what to do to clean it up. I think it would be more complicated.

AFAICS, you only need hibernate_map_page() to return an address that
will be passed to do_copy_page() (that doesn't need to be touched
otherwise) and hibernate_unmap_page() would just do

clear_fixmap(FIX_HIBERNATE);

so it may as well take the page as the argument.

> There is also already multiple paths in hibernate_map_page() that would
> have to be duplicated in the arch versions.

But only if the arch decides to override the originals.

Now, the only caller of both hibernate_map_page() and
hibernate_unmap_page() is safe_copy_page() in the "else" branch.  Your
current patch replaces that branch completely with an arch version, so
they are not going to be invoked on x86 anyway.

> So I see the idea, but I'm not sure it ends up better. Can we leave this one?

I first need to be convinced that it is indeed better.
Edgecombe, Rick P Jan. 12, 2023, 9:25 p.m. UTC | #6
On Thu, 2023-01-12 at 20:37 +0100, Rafael J. Wysocki wrote:
> > So I see the idea, but I'm not sure it ends up better. Can we leave
> > this one?
> 
> I first need to be convinced that it is indeed better.

I'll give it a try. And good point about not needing to reproduce the
functionality in the default hibernate_map_page(). I'm see that x86
doesn't need the other options.
diff mbox series

Patch

diff --git a/arch/x86/include/asm/fixmap.h b/arch/x86/include/asm/fixmap.h
index d0dcefb5cc59..0fceed9a4152 100644
--- a/arch/x86/include/asm/fixmap.h
+++ b/arch/x86/include/asm/fixmap.h
@@ -108,6 +108,9 @@  enum fixed_addresses {
 #ifdef CONFIG_PARAVIRT_XXL
 	FIX_PARAVIRT_BOOTMAP,
 #endif
+#ifdef CONFIG_HIBERNATION
+	FIX_HIBERNATE,
+#endif
 
 #ifdef CONFIG_ACPI_APEI_GHES
 	/* Used for GHES mapping from assorted contexts */
diff --git a/arch/x86/power/hibernate.c b/arch/x86/power/hibernate.c
index 6f955eb1e163..473b6b5f6b7e 100644
--- a/arch/x86/power/hibernate.c
+++ b/arch/x86/power/hibernate.c
@@ -147,6 +147,16 @@  int arch_hibernation_header_restore(void *addr)
 	return 0;
 }
 
+void copy_unmapped_page(void *dst, struct page *page)
+{
+	WARN_ON(!preempt_count());
+
+	set_fixmap(FIX_HIBERNATE, page_to_phys(page));
+	__flush_tlb_all();
+	hib_copy_page(dst, (void *)fix_to_virt(FIX_HIBERNATE));
+	clear_fixmap(FIX_HIBERNATE);
+}
+
 int relocate_restore_code(void)
 {
 	pgd_t *pgd;
diff --git a/include/linux/suspend.h b/include/linux/suspend.h
index cfe19a028918..0b19b910526e 100644
--- a/include/linux/suspend.h
+++ b/include/linux/suspend.h
@@ -447,6 +447,19 @@  extern bool hibernation_available(void);
 asmlinkage int swsusp_save(void);
 extern struct pbe *restore_pblist;
 int pfn_is_nosave(unsigned long pfn);
+void copy_unmapped_page(void *dst, struct page *page);
+
+/*
+ * This is needed, because copy_page and memcpy are not usable for copying
+ * task structs.
+ */
+static inline void hib_copy_page(long *dst, long *src)
+{
+	int n;
+
+	for (n = PAGE_SIZE / sizeof(long); n; n--)
+		*dst++ = *src++;
+}
 
 int hibernate_quiet_exec(int (*func)(void *data), void *data);
 #else /* CONFIG_HIBERNATION */
diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
index cd8b7b35f1e8..344c071f29d3 100644
--- a/kernel/power/snapshot.c
+++ b/kernel/power/snapshot.c
@@ -1369,16 +1369,11 @@  static unsigned int count_data_pages(void)
 	return n;
 }
 
-/*
- * This is needed, because copy_page and memcpy are not usable for copying
- * task structs.
- */
-static inline void do_copy_page(long *dst, long *src)
+void __weak copy_unmapped_page(void *dst, struct page *page)
 {
-	int n;
-
-	for (n = PAGE_SIZE / sizeof(long); n; n--)
-		*dst++ = *src++;
+	hibernate_map_page(page);
+	hib_copy_page(dst, page_address(page));
+	hibernate_unmap_page(page);
 }
 
 /**
@@ -1392,11 +1387,9 @@  static inline void do_copy_page(long *dst, long *src)
 static void safe_copy_page(void *dst, struct page *s_page)
 {
 	if (kernel_page_present(s_page)) {
-		do_copy_page(dst, page_address(s_page));
+		hib_copy_page(dst, page_address(s_page));
 	} else {
-		hibernate_map_page(s_page);
-		do_copy_page(dst, page_address(s_page));
-		hibernate_unmap_page(s_page);
+		copy_unmapped_page(dst, s_page);
 	}
 }
 
@@ -1417,7 +1410,7 @@  static void copy_data_page(unsigned long dst_pfn, unsigned long src_pfn)
 	if (PageHighMem(s_page)) {
 		src = kmap_atomic(s_page);
 		dst = kmap_atomic(d_page);
-		do_copy_page(dst, src);
+		hib_copy_page(dst, src);
 		kunmap_atomic(dst);
 		kunmap_atomic(src);
 	} else {