[0/3] elide generated code for folded p4d/pud

Message ID 20191009222658.961-1-vgupta@synopsys.com (mailing list archive)

Message

Vineet Gupta Oct. 9, 2019, 10:26 p.m. UTC
Hi,

This series elides extraneous generated code for folded p4d/pud.
This came up when trying to remove __ARCH_USE_5LEVEL_HACK from the ARC port.
The code savings are not a whole lot, but still worthwhile IMHO.

bloat-o-meter2 vmlinux-A-baseline vmlinux-E-elide-p?d_clear_bad
add/remove: 0/2 grow/shrink: 0/1 up/down: 0/-146 (-146)
function                                     old     new   delta
p4d_clear_bad                                  2       -      -2
pud_clear_bad                                 20       -     -20
free_pgd_range                               546     422    -124
Total: Before=4137148, After=4137002, chg -1.000000%
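
A minimal sketch of the stubbing pattern behind these savings, written as a toy
model rather than the actual kernel code. Every toy_* name and
TOY_PAGETABLE_PUD_FOLDED are hypothetical stand-ins for pud_t, pud_clear_bad()
and __PAGETABLE_PUD_FOLDED. When the level is folded, the helper becomes an
empty macro, so the caller carries neither the call nor the out-of-line function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Toy stand-in for a page-table entry type; not the kernel's pud_t. */
typedef struct { unsigned long val; } toy_pud_t;

/* Hypothetical switch: define it to model an arch whose PUD level is folded. */
#define TOY_PAGETABLE_PUD_FOLDED

static int toy_clear_bad_calls;	/* counts out-of-line calls actually made */

#ifndef TOY_PAGETABLE_PUD_FOLDED
/* Real level: report and clear the bad entry out of line. */
static void toy_pud_clear_bad(toy_pud_t *pud)
{
	toy_clear_bad_calls++;
	fprintf(stderr, "bad pud %lx\n", pud->val);
	pud->val = 0;
}
#else
/* Folded level: the call site expands to an empty statement. */
#define toy_pud_clear_bad(pud)	do { } while (0)
#endif

/* The caller is written once and is identical in both configurations. */
static int toy_free_range(toy_pud_t *pud)
{
	assert(pud != NULL);
	if (pud->val & 1UL)	/* pretend bit 0 marks a bad entry */
		toy_pud_clear_bad(pud);
	return toy_clear_bad_calls;
}
```

With the folded switch defined, toy_free_range() never touches the entry and the
counter stays at zero; removing the #define restores the out-of-line path.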

Thx,
-Vineet

Vineet Gupta (3):
  asm-generic/tlb: stub out pud_free_tlb() if __PAGETABLE_PUD_FOLDED ...
  asm-generic/tlb: stub out p4d_free_tlb() if __PAGETABLE_P4D_FOLDED ...
  asm-generic/mm: stub out p{4,u}d_clear_bad() if
    __PAGETABLE_P{4,U}D_FOLDED

 include/asm-generic/4level-fixup.h |  2 --
 include/asm-generic/5level-fixup.h |  2 --
 include/asm-generic/pgtable.h      | 11 +++++++++++
 include/asm-generic/tlb.h          |  8 ++++++--
 mm/pgtable-generic.c               |  4 ++++
 5 files changed, 21 insertions(+), 6 deletions(-)

Comments

Peter Zijlstra Oct. 10, 2019, 7:29 a.m. UTC | #1
On Wed, Oct 09, 2019 at 03:26:55PM -0700, Vineet Gupta wrote:
> Hi,
> 
> This series elides extraneous generated code for folded p4d/pud.
> This came up when trying to remove __ARCH_USE_5LEVEL_HACK from the ARC port.
> The code savings are not a whole lot, but still worthwhile IMHO.
> 
> bloat-o-meter2 vmlinux-A-baseline vmlinux-E-elide-p?d_clear_bad
> add/remove: 0/2 grow/shrink: 0/1 up/down: 0/-146 (-146)
> function                                     old     new   delta
> p4d_clear_bad                                  2       -      -2
> pud_clear_bad                                 20       -     -20
> free_pgd_range                               546     422    -124
> Total: Before=4137148, After=4137002, chg -1.000000%
> 

Works for me, thanks!

Kirill A. Shutemov Oct. 10, 2019, 8:56 a.m. UTC | #2
On Wed, Oct 09, 2019 at 10:26:55PM +0000, Vineet Gupta wrote:
> Hi,
> 
> This series elides extraneous generated code for folded p4d/pud.
> This came up when trying to remove __ARCH_USE_5LEVEL_HACK from the ARC port.
> The code savings are not a whole lot, but still worthwhile IMHO.

Agreed.

Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

Vineet Gupta Oct. 10, 2019, 8:05 p.m. UTC | #3
Hi Kirill,

On 10/10/19 1:56 AM, Kirill A. Shutemov wrote:
> On Wed, Oct 09, 2019 at 10:26:55PM +0000, Vineet Gupta wrote:
>>
>> This series elides extraneous generated code for folded p4d/pud.
>> This came up when trying to remove __ARCH_USE_5LEVEL_HACK from the ARC port.
>> The code savings are not a whole lot, but still worthwhile IMHO.
> 
> Agreed.

Thx.

So given we are folding pmd too, it seemed we could do the following as well.

+#ifndef __PAGETABLE_PMD_FOLDED
 void pmd_clear_bad(pmd_t *);
+#else
+#define pmd_clear_bad(pmd)        do { } while (0)
+#endif

+#ifndef __PAGETABLE_PMD_FOLDED
 void pmd_clear_bad(pmd_t *pmd)
 {
        pmd_ERROR(*pmd);
        pmd_clear(pmd);
 }
+#endif

I stared at the generated code and it seems a bit wrong:
free_pgd_range() -> pgd_none_or_clear_bad() no longer checks for unmapped pgd
entries, as pgd_none()/pgd_bad() are both stubs returning 0.
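
A toy sketch of why the pgd-level check disappears from the generated code. It
assumes the usual folding convention, in which the folded upper level is never
"none" or "bad" and the real check is deferred to the lowest folded level; that
convention is an assumption here, not something confirmed in this thread. All
toy_* names are hypothetical.

```c
#include <assert.h>

typedef struct { unsigned long val; } toy_pgd_t;
/* Folded pmd: just a wrapper around the pgd slot it is folded into. */
typedef struct { toy_pgd_t pgd; } toy_pmd_t;

/* Stubs for the folded upper level: never none, never bad. */
static inline int toy_pgd_none(toy_pgd_t pgd) { (void)pgd; return 0; }
static inline int toy_pgd_bad(toy_pgd_t pgd) { (void)pgd; return 0; }
static inline void toy_pgd_clear_bad(toy_pgd_t *pgd) { (void)pgd; }

/* With the stubs above, this always returns 0 and optimizes away. */
static inline int toy_pgd_none_or_clear_bad(toy_pgd_t *pgd)
{
	if (toy_pgd_none(*pgd))
		return 1;
	if (toy_pgd_bad(*pgd)) {
		toy_pgd_clear_bad(pgd);
		return 1;
	}
	return 0;
}

/* Folded offset: the pmd "table" is the pgd slot itself. */
static inline toy_pmd_t *toy_pmd_offset(toy_pgd_t *pgd)
{
	return (toy_pmd_t *)pgd;
}

/* The real emptiness check happens here, at the lowest folded level. */
static inline int toy_pmd_none(toy_pmd_t pmd)
{
	return pmd.pgd.val == 0;
}

/* Returns 1 if the slot maps something, walking pgd -> (folded) pmd. */
static int toy_slot_is_mapped(toy_pgd_t *pgd)
{
	if (toy_pgd_none_or_clear_bad(pgd))
		return 0;
	return !toy_pmd_none(*toy_pmd_offset(pgd));
}
```

In this model an empty slot passes the pgd-level test untouched and is only
rejected by toy_pmd_none(), which is consistent with the pgd check no longer
appearing in the disassembly.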

This whole pmd folding is a bit confusing, considering I only revisit it every few
years :-) Abstraction-wise, __PAGETABLE_PMD_FOLDED means there are only pgd and pte
levels, but even in this regime a bunch of pmd macros are still valid:

    pmd_set(pmdp, ptep) {
        pmdp->pud.p4d.pgd = (unsigned long)ptep;
    }

Is there a better way to build a mental model of this code folding?

In an ideal world, pmd folded would have meant the pmd_* routines just vanish - poof.
So in that sense I like your implementation under #[45]LEVEL_HACK, where the level
simply vanishes via code like #define p4d_t pgd_t. Perhaps there is a lot of historic
baggage that has proliferated into arch code, so it is hard to untangle.

Thx,
-Vineet

Kirill A. Shutemov Oct. 11, 2019, 12:19 p.m. UTC | #4
On Thu, Oct 10, 2019 at 01:05:56PM -0700, Vineet Gupta wrote:
> 
> Hi Kirill,
> 
> On 10/10/19 1:56 AM, Kirill A. Shutemov wrote:
> > On Wed, Oct 09, 2019 at 10:26:55PM +0000, Vineet Gupta wrote:
> >>
> >> This series elides extraneous generated code for folded p4d/pud.
> >> This came up when trying to remove __ARCH_USE_5LEVEL_HACK from the ARC port.
> >> The code savings are not a whole lot, but still worthwhile IMHO.
> > 
> > Agreed.
> 
> Thx.
> 
> So given we are folding pmd too, it seemed we could do the following as well.
> 
> +#ifndef __PAGETABLE_PMD_FOLDED
>  void pmd_clear_bad(pmd_t *);
> +#else
> +#define pmd_clear_bad(pmd)        do { } while (0)
> +#endif
> 
> +#ifndef __PAGETABLE_PMD_FOLDED
>  void pmd_clear_bad(pmd_t *pmd)
>  {
>         pmd_ERROR(*pmd);
>         pmd_clear(pmd);
>  }
> +#endif
> 
> I stared at the generated code and it seems a bit wrong:
> free_pgd_range() -> pgd_none_or_clear_bad() no longer checks for unmapped pgd
> entries, as pgd_none()/pgd_bad() are both stubs returning 0.
> 
> This whole pmd folding is a bit confusing, considering I only revisit it every few
> years :-) Abstraction-wise, __PAGETABLE_PMD_FOLDED means there are only pgd and pte
> levels, but even in this regime a bunch of pmd macros are still valid:
> 
>     pmd_set(pmdp, ptep) {
>         pmdp->pud.p4d.pgd = (unsigned long)ptep;
>     }
> 
> Is there a better way to build a mental model of this code folding?

I don't have one. PMD folding predates me and I have never looked at it
closely. A quick look brings more confusion than clarity. :P

> In an ideal world, pmd folded would have meant the pmd_* routines just vanish - poof.
> So in that sense I like your implementation under #[45]LEVEL_HACK, where the level
> simply vanishes via code like #define p4d_t pgd_t. Perhaps there is a lot of historic
> baggage that has proliferated into arch code, so it is hard to untangle.

In an ideal world all these pgd/p4d/pud/pmd/pte levels should die and we would have
something more flexible to begin with.

I played with this before:

https://lore.kernel.org/lkml/20180424154355.mfjgkf47kdp2by4e@black.fi.intel.com/