riscv: enable THP_SWAP for RV64

Message ID 20220821170559.840-1-jszhang@kernel.org (mailing list archive)
State Superseded

Commit Message

Jisheng Zhang Aug. 21, 2022, 5:05 p.m. UTC
I have a Sipeed Lichee RV dock board which has only 512MB of DDR, so
memory optimizations such as swap on zram are helpful. As seen in
commit d0637c505f8a ("arm64: enable THP_SWAP for arm64") and
commit bd4c82c22c367e ("mm, THP, swap: delay splitting THP after
swapped out"), THP_SWAP can improve the swap throughput significantly.

Enable THP_SWAP for RV64. Running the micro-benchmark introduced by
commit d0637c505f8a ("arm64: enable THP_SWAP for arm64") gives the
following numbers on the Lichee RV dock board:

thp swp throughput w/o patch: 66908 bytes/ms (mean of 10 tests)
thp swp throughput w/ patch: 322638 bytes/ms (mean of 10 tests)

Improved by 382%!

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
---
 arch/riscv/Kconfig | 1 +
 1 file changed, 1 insertion(+)
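
For readers who want to sanity-check the numbers above, here is a minimal
sketch in C. It is not the micro-benchmark from commit d0637c505f8a; it only
assumes that such a benchmark times madvise(MADV_PAGEOUT) over a THP-backed
anonymous mapping. The helper names (throughput_bytes_per_ms,
improvement_percent, measure_pageout_throughput) are hypothetical and chosen
for illustration only.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <time.h>

#ifndef MADV_PAGEOUT
#define MADV_PAGEOUT 21	/* Linux >= 5.4; fallback for older libc headers */
#endif

/* Throughput in bytes per millisecond for `bytes` moved in `elapsed_ns`. */
double throughput_bytes_per_ms(uint64_t bytes, uint64_t elapsed_ns)
{
	if (elapsed_ns == 0)
		return 0.0;
	return (double)bytes * 1e6 / (double)elapsed_ns;
}

/* Relative improvement of `after` over `before`, in percent. */
double improvement_percent(double before, double after)
{
	return (after - before) / before * 100.0;
}

/*
 * Hypothetical measurement helper (an assumption, not the original test):
 * fault in `size` bytes of anonymous memory with THP requested, then time
 * how long MADV_PAGEOUT takes. Returns bytes/ms, or -1.0 on failure.
 */
double measure_pageout_throughput(size_t size)
{
	struct timespec t0, t1;
	int64_t ns;
	int ret;
	void *p;

	p = mmap(NULL, size, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1.0;

	/* Only a hint: the kernel may still back the range with base pages. */
	madvise(p, size, MADV_HUGEPAGE);
	/* Touch every byte so there is something resident to page out. */
	memset(p, 0x5a, size);

	clock_gettime(CLOCK_MONOTONIC, &t0);
	ret = madvise(p, size, MADV_PAGEOUT);
	clock_gettime(CLOCK_MONOTONIC, &t1);

	munmap(p, size);
	if (ret != 0)
		return -1.0;

	ns = (int64_t)(t1.tv_sec - t0.tv_sec) * 1000000000LL +
	     (int64_t)(t1.tv_nsec - t0.tv_nsec);
	return throughput_bytes_per_ms((uint64_t)size, (uint64_t)ns);
}
```

Calling improvement_percent(66908.0, 322638.0) with the two means reported
above gives roughly 382.2, which matches the stated 382% (a speedup of about
4.8x). The measurement helper needs swap (for example zram) to be configured
before it reports anything meaningful.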

Comments

Andrew Jones Aug. 22, 2022, 8:02 a.m. UTC | #1
On Mon, Aug 22, 2022 at 01:05:59AM +0800, Jisheng Zhang wrote:
> I have a Sipeed Lichee RV dock board which only has 512MB DDR, so
> memory optimizations such as swap on zram are helpful. As is seen
> in commit d0637c505f8a ("arm64: enable THP_SWAP for arm64") and
> commit bd4c82c22c367e ("mm, THP, swap: delay splitting THP after
> swapped out"), THP_SWAP can improve the swap throughput significantly.
> 
> Enable THP_SWAP for RV64, testing the micro-benchmark which is
> introduced by commit d0637c505f8a ("arm64: enable THP_SWAP for arm64")
> shows below numbers on the Lichee RV dock board:
> 
> thp swp throughput w/o patch: 66908 bytes/ms (mean of 10 tests)
> thp swp throughput w/ patch: 322638 bytes/ms (mean of 10 tests)
> 
> Improved by 382%!
> 
> Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
> ---
>  arch/riscv/Kconfig | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> index ed66c31e4655..19088c750c7f 100644
> --- a/arch/riscv/Kconfig
> +++ b/arch/riscv/Kconfig
> @@ -45,6 +45,7 @@ config RISCV
>  	select ARCH_WANT_FRAME_POINTERS
>  	select ARCH_WANT_GENERAL_HUGETLB
>  	select ARCH_WANT_HUGE_PMD_SHARE if 64BIT
> +	select ARCH_WANTS_THP_SWAP if TRANSPARENT_HUGEPAGE
>  	select BINFMT_FLAT_NO_DATA_START_OFFSET if !MMU
>  	select BUILDTIME_TABLE_SORT if MMU
>  	select CLONE_BACKWARDS
> -- 
> 2.34.1
>

That looks like a good idea to me.

Reviewed-by: Andrew Jones <ajones@ventanamicro.com>

Palmer Dabbelt Oct. 6, 2022, 2:35 a.m. UTC | #2
On Sun, 21 Aug 2022 10:05:59 PDT (-0700), jszhang@kernel.org wrote:
> I have a Sipeed Lichee RV dock board which only has 512MB DDR, so
> memory optimizations such as swap on zram are helpful. As is seen
> in commit d0637c505f8a ("arm64: enable THP_SWAP for arm64") and
> commit bd4c82c22c367e ("mm, THP, swap: delay splitting THP after
> swapped out"), THP_SWAP can improve the swap throughput significantly.
>
> Enable THP_SWAP for RV64, testing the micro-benchmark which is
> introduced by commit d0637c505f8a ("arm64: enable THP_SWAP for arm64")
> shows below numbers on the Lichee RV dock board:
>
> thp swp throughput w/o patch: 66908 bytes/ms (mean of 10 tests)
> thp swp throughput w/ patch: 322638 bytes/ms (mean of 10 tests)
>
> Improved by 382%!
>
> Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
> ---
>  arch/riscv/Kconfig | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> index ed66c31e4655..19088c750c7f 100644
> --- a/arch/riscv/Kconfig
> +++ b/arch/riscv/Kconfig
> @@ -45,6 +45,7 @@ config RISCV
>  	select ARCH_WANT_FRAME_POINTERS
>  	select ARCH_WANT_GENERAL_HUGETLB
>  	select ARCH_WANT_HUGE_PMD_SHARE if 64BIT
> +	select ARCH_WANTS_THP_SWAP if TRANSPARENT_HUGEPAGE
>  	select BINFMT_FLAT_NO_DATA_START_OFFSET if !MMU
>  	select BUILDTIME_TABLE_SORT if MMU
>  	select CLONE_BACKWARDS

Thanks, this is on for-next.

Conor Dooley Oct. 6, 2022, 6:53 a.m. UTC | #3
On Wed, Oct 05, 2022 at 07:35:53PM -0700, Palmer Dabbelt wrote:
> On Sun, 21 Aug 2022 10:05:59 PDT (-0700), jszhang@kernel.org wrote:
> > I have a Sipeed Lichee RV dock board which only has 512MB DDR, so
> > memory optimizations such as swap on zram are helpful. As is seen
> > in commit d0637c505f8a ("arm64: enable THP_SWAP for arm64") and
> > commit bd4c82c22c367e ("mm, THP, swap: delay splitting THP after
> > swapped out"), THP_SWAP can improve the swap throughput significantly.
> > 
> > Enable THP_SWAP for RV64, testing the micro-benchmark which is
> > introduced by commit d0637c505f8a ("arm64: enable THP_SWAP for arm64")
> > shows below numbers on the Lichee RV dock board:
> > 
> > thp swp throughput w/o patch: 66908 bytes/ms (mean of 10 tests)
> > thp swp throughput w/ patch: 322638 bytes/ms (mean of 10 tests)
> > 
> > Improved by 382%!
> > 
> > Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
> > ---
> >  arch/riscv/Kconfig | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> > index ed66c31e4655..19088c750c7f 100644
> > --- a/arch/riscv/Kconfig
> > +++ b/arch/riscv/Kconfig
> > @@ -45,6 +45,7 @@ config RISCV
> >  	select ARCH_WANT_FRAME_POINTERS
> >  	select ARCH_WANT_GENERAL_HUGETLB
> >  	select ARCH_WANT_HUGE_PMD_SHARE if 64BIT
> > +	select ARCH_WANTS_THP_SWAP if TRANSPARENT_HUGEPAGE
> >  	select BINFMT_FLAT_NO_DATA_START_OFFSET if !MMU
> >  	select BUILDTIME_TABLE_SORT if MMU
> >  	select CLONE_BACKWARDS
> 
> Thanks, this is on for-next.

FYI, this is v1 of a patchset that went to v3.
v3 only changed the commit message, but v2 had a functional change.

v3 is here:
https://lore.kernel.org/all/20220829145742.3139-1-jszhang@kernel.org/

Thanks,
Conor.

Palmer Dabbelt Oct. 7, 2022, 3:05 a.m. UTC | #4
On Wed, 05 Oct 2022 23:53:03 PDT (-0700), conor.dooley@microchip.com wrote:
> On Wed, Oct 05, 2022 at 07:35:53PM -0700, Palmer Dabbelt wrote:
>> On Sun, 21 Aug 2022 10:05:59 PDT (-0700), jszhang@kernel.org wrote:
>> > I have a Sipeed Lichee RV dock board which only has 512MB DDR, so
>> > memory optimizations such as swap on zram are helpful. As is seen
>> > in commit d0637c505f8a ("arm64: enable THP_SWAP for arm64") and
>> > commit bd4c82c22c367e ("mm, THP, swap: delay splitting THP after
>> > swapped out"), THP_SWAP can improve the swap throughput significantly.
>> >
>> > Enable THP_SWAP for RV64, testing the micro-benchmark which is
>> > introduced by commit d0637c505f8a ("arm64: enable THP_SWAP for arm64")
>> > shows below numbers on the Lichee RV dock board:
>> >
>> > thp swp throughput w/o patch: 66908 bytes/ms (mean of 10 tests)
>> > thp swp throughput w/ patch: 322638 bytes/ms (mean of 10 tests)
>> >
>> > Improved by 382%!
>> >
>> > Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
>> > ---
>> >  arch/riscv/Kconfig | 1 +
>> >  1 file changed, 1 insertion(+)
>> >
>> > diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
>> > index ed66c31e4655..19088c750c7f 100644
>> > --- a/arch/riscv/Kconfig
>> > +++ b/arch/riscv/Kconfig
>> > @@ -45,6 +45,7 @@ config RISCV
>> >  	select ARCH_WANT_FRAME_POINTERS
>> >  	select ARCH_WANT_GENERAL_HUGETLB
>> >  	select ARCH_WANT_HUGE_PMD_SHARE if 64BIT
>> > +	select ARCH_WANTS_THP_SWAP if TRANSPARENT_HUGEPAGE
>> >  	select BINFMT_FLAT_NO_DATA_START_OFFSET if !MMU
>> >  	select BUILDTIME_TABLE_SORT if MMU
>> >  	select CLONE_BACKWARDS
>>
>> Thanks, this is on for-next.
>
> FYI, this is v1 of a patchset that went to v3.
> v3 only changed the commit message, but v2 had a functional change.
>
> v3 is here:
> https://lore.kernel.org/all/20220829145742.3139-1-jszhang@kernel.org/

Thanks, not sure why I missed those.  I've put the v3 on for-next.

Patch

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index ed66c31e4655..19088c750c7f 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -45,6 +45,7 @@  config RISCV
 	select ARCH_WANT_FRAME_POINTERS
 	select ARCH_WANT_GENERAL_HUGETLB
 	select ARCH_WANT_HUGE_PMD_SHARE if 64BIT
+	select ARCH_WANTS_THP_SWAP if TRANSPARENT_HUGEPAGE
 	select BINFMT_FLAT_NO_DATA_START_OFFSET if !MMU
 	select BUILDTIME_TABLE_SORT if MMU
 	select CLONE_BACKWARDS