
[v16,12/16] x86/sgx: Revise global reclamation for EPC cgroups

Message ID 20240821015404.6038-13-haitao.huang@linux.intel.com (mailing list archive)
State New, archived
Series Add Cgroup support for SGX EPC memory

Commit Message

Haitao Huang Aug. 21, 2024, 1:54 a.m. UTC
With EPC cgroups, the global reclamation function,
sgx_reclaim_pages_global(), can no longer apply to the global LRU as
pages are now in per-cgroup LRUs.

Create a wrapper, sgx_cgroup_reclaim_pages_global(), to invoke
sgx_cgroup_reclaim_pages(), passing in the root cgroup. Call this
wrapper from sgx_reclaim_pages_global() when cgroups are enabled. The
wrapper will scan and attempt to reclaim SGX_NR_TO_SCAN pages just like
the current global reclaim.
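
For illustration, a minimal sketch (not part of this patch) of how a later
patch could wire the wrapper into the existing global reclaim path. The
IS_ENABLED(CONFIG_CGROUP_MISC) gate is an assumption made for this sketch,
not something taken from this series:

static void sgx_reclaim_pages_global(struct mm_struct *charge_mm)
{
	/* Pages tracked in per-cgroup LRUs: reclaim from the root cgroup's subtree. */
	if (IS_ENABLED(CONFIG_CGROUP_MISC))
		sgx_cgroup_reclaim_pages_global(charge_mm);
	else
		/* Otherwise keep reclaiming from the single global LRU, as today. */
		sgx_reclaim_pages(&sgx_global_lru, charge_mm);
}

With the empty stub declared for !CONFIG_CGROUP_MISC builds, either branch
compiles; the runtime check only decides which LRU structure is scanned.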

Note this simple implementation doesn't _exactly_ mimic the current
global EPC reclaim (which always tries to do the actual reclaim in
batches of SGX_NR_TO_SCAN pages): in rare cases when the LRUs have fewer
than SGX_NR_TO_SCAN reclaimable pages, the actual reclaim of EPC pages
will be split into smaller batches _across_ multiple LRUs, each smaller
than SGX_NR_TO_SCAN pages.

A more precise way to mimic the current global EPC reclaim would be to
have a new function to only "scan" (or "isolate") SGX_NR_TO_SCAN pages
_across_ the given EPC cgroup _AND_ its descendants, and then do the
actual reclaim in one batch.  But this is unnecessarily complicated at
this stage to address such rare cases.
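
A rough sketch of that alternative, for illustration only. The helpers
sgx_cgroup_for_each_descendant(), sgx_isolate_epc_pages() and
sgx_reclaim_epc_batch() are hypothetical names standing in for "walk the
subtree", "isolate from one LRU" and "reclaim one batch"; none of them is
defined in this series:

static void sgx_cgroup_reclaim_pages_batched(struct sgx_cgroup *root,
					     struct mm_struct *charge_mm)
{
	LIST_HEAD(batch);
	unsigned int cnt = 0;
	struct sgx_cgroup *cg;

	/* Phase 1: isolate up to SGX_NR_TO_SCAN pages across @root and its descendants. */
	sgx_cgroup_for_each_descendant(cg, root) {
		cnt += sgx_isolate_epc_pages(&cg->lru, SGX_NR_TO_SCAN - cnt, &batch);
		if (cnt >= SGX_NR_TO_SCAN)
			break;
	}

	/* Phase 2: reclaim the whole batch at once, as the current global reclaim does. */
	sgx_reclaim_epc_batch(&batch, charge_mm);
}

The extra isolation pass only matters in the rare case where individual LRUs
run short of SGX_NR_TO_SCAN pages, which is why the simpler per-cgroup loop is
kept for now.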

Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com>
---
 arch/x86/kernel/cpu/sgx/epc_cgroup.c | 12 ++++++++++++
 arch/x86/kernel/cpu/sgx/epc_cgroup.h |  3 +++
 arch/x86/kernel/cpu/sgx/main.c       |  7 +++++++
 3 files changed, 22 insertions(+)

Comments

Huang, Kai Aug. 27, 2024, 11:32 a.m. UTC | #1
On Tue, 2024-08-20 at 18:54 -0700, Haitao Huang wrote:
> With EPC cgroups, the global reclamation function,
> sgx_reclaim_pages_global(), can no longer apply to the global LRU as
> pages are now in per-cgroup LRUs.
> 
> Create a wrapper, sgx_cgroup_reclaim_global() to invoke
> sgx_cgroup_reclaim_pages() passing in the root cgroup. 
> 

[...]

> Call this wrapper
> from sgx_reclaim_pages_global() when cgroup is enabled. 
> 

This is not done in this patch.

> The wrapper will
> scan and attempt to reclaim SGX_NR_TO_SCAN pages just like the current
> global reclaim.
> 
> Note this simple implementation doesn't _exactly_ mimic the current
> global EPC reclaim (which always tries to do the actual reclaim in batch
> of SGX_NR_TO_SCAN pages): in rare cases when LRUs have less than
> SGX_NR_TO_SCAN reclaimable pages, the actual reclaim of EPC pages will
> be split into smaller batches _across_ multiple LRUs with each being
> smaller than SGX_NR_TO_SCAN pages.
> 
> A more precise way to mimic the current global EPC reclaim would be to
> have a new function to only "scan" (or "isolate") SGX_NR_TO_SCAN pages
> _across_ the given EPC cgroup _AND_ its descendants, and then do the
> actual reclaim in one batch.  But this is unnecessarily complicated at
> this stage to address such rare cases.
> 
> Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com>
> ---
>  arch/x86/kernel/cpu/sgx/epc_cgroup.c | 12 ++++++++++++
>  arch/x86/kernel/cpu/sgx/epc_cgroup.h |  3 +++
>  arch/x86/kernel/cpu/sgx/main.c       |  7 +++++++
>  3 files changed, 22 insertions(+)
> 
> diff --git a/arch/x86/kernel/cpu/sgx/epc_cgroup.c b/arch/x86/kernel/cpu/sgx/epc_cgroup.c
> index ae65cac858f8..23a61689e0d9 100644
> --- a/arch/x86/kernel/cpu/sgx/epc_cgroup.c
> +++ b/arch/x86/kernel/cpu/sgx/epc_cgroup.c
> @@ -240,6 +240,18 @@ static bool sgx_cgroup_should_reclaim(struct sgx_cgroup *sgx_cg)
>  	return (cur >= max);
>  }
>  
> +/**
> + * sgx_cgroup_reclaim_pages_global() - Perform one round of global reclamation.
> + *
> + * @charge_mm:	The mm to be charged for the backing store of reclaimed pages.
> + *
> + * Try to scan and attempt reclamation from root cgroup for %SGX_NR_TO_SCAN pages.
> + */
> +void sgx_cgroup_reclaim_pages_global(struct mm_struct *charge_mm)
> +{
> +	sgx_cgroup_reclaim_pages(&sgx_cg_root, charge_mm, SGX_NR_TO_SCAN);
> +}
> +
> 

[...]

>  
>  static void sgx_reclaim_pages_global(struct mm_struct *charge_mm)
>  {
> +	/*
> +	 * Now all EPC pages are still tracked in the @sgx_global_lru.
> +	 * Still reclaim from it.
> +	 *
> +	 * When EPC pages are tracked in the actual per-cgroup LRUs,
> +	 * sgx_cgroup_reclaim_pages_global() will be called.
> +	 */
>  	sgx_reclaim_pages(&sgx_global_lru, charge_mm);
>  }
>  

I didn't realize the only functional change of this patch is to add a new
helper sgx_cgroup_reclaim_pages_global() (hmm... it's not even a functional
change because the helper isn't called).

It might make sense to have this as a separate patch with the comment of
sgx_can_reclaim_global() being moved here, from the perspective that this
patch only adds the building block to do global reclaim from per-cgroup LRUs
but doesn't actually turn that on.

But given this patch only adds a helper it might also make sense to just merge
this with the patch "x86/sgx: Turn on per-cgroup EPC reclamation".

I have no hard opinion on this.  Maybe we can keep the current way unless
other people comment in.

Btw, I don't quite follow why you placed this patch here.  It should be placed
right before the patch "x86/sgx: Turn on per-cgroup EPC reclamation" (if we
want to make this as a separate patch).

Or, the patch "x86/sgx: implement direct reclamation for cgroups" should be
moved to an earlier place after patch "x86/sgx: Implement async reclamation
for cgroup".
Jarkko Sakkinen Aug. 27, 2024, 6:16 p.m. UTC | #2
On Wed Aug 21, 2024 at 4:54 AM EEST, Haitao Huang wrote:
> With EPC cgroups, the global reclamation function,
> sgx_reclaim_pages_global(), can no longer apply to the global LRU as
> pages are now in per-cgroup LRUs.
>
> Create a wrapper, sgx_cgroup_reclaim_global() to invoke
> sgx_cgroup_reclaim_pages() passing in the root cgroup. Call this wrapper
> from sgx_reclaim_pages_global() when cgroup is enabled. The wrapper will
> scan and attempt to reclaim SGX_NR_TO_SCAN pages just like the current
> global reclaim.
>
> Note this simple implementation doesn't _exactly_ mimic the current
> global EPC reclaim (which always tries to do the actual reclaim in batch
> of SGX_NR_TO_SCAN pages): in rare cases when LRUs have less than
> SGX_NR_TO_SCAN reclaimable pages, the actual reclaim of EPC pages will
> be split into smaller batches _across_ multiple LRUs with each being
> smaller than SGX_NR_TO_SCAN pages.
>
> A more precise way to mimic the current global EPC reclaim would be to
> have a new function to only "scan" (or "isolate") SGX_NR_TO_SCAN pages
> _across_ the given EPC cgroup _AND_ its descendants, and then do the
> actual reclaim in one batch.  But this is unnecessarily complicated at
> this stage to address such rare cases.
>
> Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com>

Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>

BR, Jarkko
Huang, Kai Aug. 27, 2024, 11:29 p.m. UTC | #3
On 27/08/2024 11:32 pm, Huang, Kai wrote:
> On Tue, 2024-08-20 at 18:54 -0700, Haitao Huang wrote:
>> With EPC cgroups, the global reclamation function,
>> sgx_reclaim_pages_global(), can no longer apply to the global LRU as
>> pages are now in per-cgroup LRUs.
>>
>> Create a wrapper, sgx_cgroup_reclaim_global() to invoke
>> sgx_cgroup_reclaim_pages() passing in the root cgroup.
>>
> 
> [...]
> 
>> Call this wrapper
>> from sgx_reclaim_pages_global() when cgroup is enabled.
>>
> 
> This is not done in this patch.
> 

With this removed, and ...

> 
> It might make sense to have this as a separate patch with the comment of
> sgx_can_reclaim_global() being moved here, from the perspective that this
> patch only adds the building block to do global reclaim from per-cgroup LRUs
> but doesn't actually turn that on.
> 

... with the comment of sgx_can_reclaim_global() moved to this patch:

Reviewed-by: Kai Huang <kai.huang@intel.com>

Patch

diff --git a/arch/x86/kernel/cpu/sgx/epc_cgroup.c b/arch/x86/kernel/cpu/sgx/epc_cgroup.c
index ae65cac858f8..23a61689e0d9 100644
--- a/arch/x86/kernel/cpu/sgx/epc_cgroup.c
+++ b/arch/x86/kernel/cpu/sgx/epc_cgroup.c
@@ -240,6 +240,18 @@ static bool sgx_cgroup_should_reclaim(struct sgx_cgroup *sgx_cg)
 	return (cur >= max);
 }
 
+/**
+ * sgx_cgroup_reclaim_pages_global() - Perform one round of global reclamation.
+ *
+ * @charge_mm:	The mm to be charged for the backing store of reclaimed pages.
+ *
+ * Try to scan and attempt reclamation from root cgroup for %SGX_NR_TO_SCAN pages.
+ */
+void sgx_cgroup_reclaim_pages_global(struct mm_struct *charge_mm)
+{
+	sgx_cgroup_reclaim_pages(&sgx_cg_root, charge_mm, SGX_NR_TO_SCAN);
+}
+
 /*
  * Asynchronous work flow to reclaim pages from the cgroup when the cgroup is
  * at/near its maximum capacity.
diff --git a/arch/x86/kernel/cpu/sgx/epc_cgroup.h b/arch/x86/kernel/cpu/sgx/epc_cgroup.h
index 37364bdb797f..c0390111e28c 100644
--- a/arch/x86/kernel/cpu/sgx/epc_cgroup.h
+++ b/arch/x86/kernel/cpu/sgx/epc_cgroup.h
@@ -36,6 +36,8 @@ static inline void __init sgx_cgroup_deinit(void) { }
 
 static inline void __init sgx_cgroup_register(void) { }
 
+static inline void sgx_cgroup_reclaim_pages_global(struct mm_struct *charge_mm) { }
+
 #else /* CONFIG_CGROUP_MISC */
 
 struct sgx_cgroup {
@@ -87,6 +89,7 @@ static inline void sgx_put_cg(struct sgx_cgroup *sgx_cg)
 
 int sgx_cgroup_try_charge(struct sgx_cgroup *sgx_cg, enum sgx_reclaim reclaim);
 void sgx_cgroup_uncharge(struct sgx_cgroup *sgx_cg);
+void sgx_cgroup_reclaim_pages_global(struct mm_struct *charge_mm);
 int __init sgx_cgroup_init(void);
 void __init sgx_cgroup_register(void);
 void __init sgx_cgroup_deinit(void);
diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
index fa00ed5e884d..d00cb012838b 100644
--- a/arch/x86/kernel/cpu/sgx/main.c
+++ b/arch/x86/kernel/cpu/sgx/main.c
@@ -411,6 +411,13 @@ static bool sgx_should_reclaim_global(unsigned long watermark)
 
 static void sgx_reclaim_pages_global(struct mm_struct *charge_mm)
 {
+	/*
+	 * Now all EPC pages are still tracked in the @sgx_global_lru.
+	 * Still reclaim from it.
+	 *
+	 * When EPC pages are tracked in the actual per-cgroup LRUs,
+	 * sgx_cgroup_reclaim_pages_global() will be called.
+	 */
 	sgx_reclaim_pages(&sgx_global_lru, charge_mm);
 }