Message ID | 20240122172048.11953-10-haitao.huang@linux.intel.com (mailing list archive)
---|---
State | New, archived
Series | Add Cgroup support for SGX EPC memory
On Mon Jan 22, 2024 at 7:20 PM EET, Haitao Huang wrote:
> Enclave Page Cache (EPC) memory can be swapped out to regular system
> memory, and the consumed memory should be charged to a proper
> mem_cgroup. Currently the selection of mem_cgroup to charge is done in
> sgx_encl_get_mem_cgroup(). But it only considers two contexts in which
> the swapping can be done: normal tasks and the ksgxd kthread.
> With the new EPC cgroup implementation, the swapping can also happen in
> EPC cgroup work-queue threads. In those cases, it improperly selects the
> root mem_cgroup to charge for the RAM usage.
>
> Change sgx_encl_get_mem_cgroup() to handle non-task contexts only and
> return the mem_cgroup of an mm_struct associated with the enclave. The
> return value is used to charge for EPC backing pages in all kthread cases.
>
> Pass a flag into the top level reclamation function,
> sgx_reclaim_pages(), to explicitly indicate whether it is called from a
> background kthread. Internally, if the flag is true, switch the active
> mem_cgroup to the one returned from sgx_encl_get_mem_cgroup(), prior to
> any backing page allocation, in order to ensure that shmem page
> allocations are charged to the enclave's cgroup.
>
> Remove current_is_ksgxd() as it is no longer needed.
>
> Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com>
> Reported-by: Mikko Ylinen <mikko.ylinen@linux.intel.com>
> ---
>  arch/x86/kernel/cpu/sgx/encl.c       | 43 ++++++++++++++--------------
>  arch/x86/kernel/cpu/sgx/encl.h       |  3 +-
>  arch/x86/kernel/cpu/sgx/epc_cgroup.c |  7 +++--
>  arch/x86/kernel/cpu/sgx/main.c       | 27 ++++++++---------
>  arch/x86/kernel/cpu/sgx/sgx.h        |  3 +-
>  5 files changed, 40 insertions(+), 43 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
> index 279148e72459..75178cc7a6d2 100644
> --- a/arch/x86/kernel/cpu/sgx/encl.c
> +++ b/arch/x86/kernel/cpu/sgx/encl.c
> @@ -993,9 +993,7 @@ static int __sgx_encl_get_backing(struct sgx_encl *encl, unsigned long page_inde
>  }
>
>  /*
> - * When called from ksgxd, returns the mem_cgroup of a struct mm stored
> - * in the enclave's mm_list. When not called from ksgxd, just returns
> - * the mem_cgroup of the current task.
> + * Returns the mem_cgroup of a struct mm stored in the enclave's mm_list.
>   */
>  static struct mem_cgroup *sgx_encl_get_mem_cgroup(struct sgx_encl *encl)
>  {
> @@ -1003,14 +1001,6 @@ static struct mem_cgroup *sgx_encl_get_mem_cgroup(struct sgx_encl *encl)
>  	struct sgx_encl_mm *encl_mm;
>  	int idx;
>
> -	/*
> -	 * If called from normal task context, return the mem_cgroup
> -	 * of the current task's mm. The remainder of the handling is for
> -	 * ksgxd.
> -	 */
> -	if (!current_is_ksgxd())
> -		return get_mem_cgroup_from_mm(current->mm);
> -
>  	/*
>  	 * Search the enclave's mm_list to find an mm associated with
>  	 * this enclave to charge the allocation to.
> @@ -1047,29 +1037,38 @@ static struct mem_cgroup *sgx_encl_get_mem_cgroup(struct sgx_encl *encl)
>   * @encl:	an enclave pointer
>   * @page_index:	enclave page index
>   * @backing:	data for accessing backing storage for the page
> + * @indirect:	in ksgxd or EPC cgroup work queue context
> + *
> + * Create a backing page for loading data back into an EPC page with ELDU. This function takes
> + * a reference on a new backing page which must be dropped with a corresponding call to
> + * sgx_encl_put_backing().
>   *
> - * When called from ksgxd, sets the active memcg from one of the
> - * mms in the enclave's mm_list prior to any backing page allocation,
> - * in order to ensure that shmem page allocations are charged to the
> - * enclave. Create a backing page for loading data back into an EPC page with
> - * ELDU. This function takes a reference on a new backing page which
> - * must be dropped with a corresponding call to sgx_encl_put_backing().
> + * When @indirect is true, sets the active memcg from one of the mms in the enclave's mm_list
> + * prior to any backing page allocation, in order to ensure that shmem page allocations are
> + * charged to the enclave.

Same complaint as above: unnecessarily long text paragraphs. These are just
impractical.

>   *
>   * Return:
>   *   0 on success,
>   *   -errno otherwise.
>   */
>  int sgx_encl_alloc_backing(struct sgx_encl *encl, unsigned long page_index,
> -			   struct sgx_backing *backing)
> +			   struct sgx_backing *backing, bool indirect)
>  {
> -	struct mem_cgroup *encl_memcg = sgx_encl_get_mem_cgroup(encl);
> -	struct mem_cgroup *memcg = set_active_memcg(encl_memcg);
> +	struct mem_cgroup *encl_memcg;
> +	struct mem_cgroup *memcg;
>  	int ret;
>
> +	if (indirect) {
> +		encl_memcg = sgx_encl_get_mem_cgroup(encl);
> +		memcg = set_active_memcg(encl_memcg);
> +	}
> +
>  	ret = __sgx_encl_get_backing(encl, page_index, backing);
>
> -	set_active_memcg(memcg);
> -	mem_cgroup_put(encl_memcg);
> +	if (indirect) {
> +		set_active_memcg(memcg);
> +		mem_cgroup_put(encl_memcg);
> +	}
>
>  	return ret;
>  }
> diff --git a/arch/x86/kernel/cpu/sgx/encl.h b/arch/x86/kernel/cpu/sgx/encl.h
> index f94ff14c9486..549cd2e8d98b 100644
> --- a/arch/x86/kernel/cpu/sgx/encl.h
> +++ b/arch/x86/kernel/cpu/sgx/encl.h
> @@ -103,12 +103,11 @@ static inline int sgx_encl_find(struct mm_struct *mm, unsigned long addr,
>  int sgx_encl_may_map(struct sgx_encl *encl, unsigned long start,
>  		     unsigned long end, unsigned long vm_flags);
>
> -bool current_is_ksgxd(void);
>  void sgx_encl_release(struct kref *ref);
>  int sgx_encl_mm_add(struct sgx_encl *encl, struct mm_struct *mm);
>  const cpumask_t *sgx_encl_cpumask(struct sgx_encl *encl);
>  int sgx_encl_alloc_backing(struct sgx_encl *encl, unsigned long page_index,
> -			   struct sgx_backing *backing);
> +			   struct sgx_backing *backing, bool indirect);
>  void sgx_encl_put_backing(struct sgx_backing *backing);
>  int sgx_encl_test_and_clear_young(struct mm_struct *mm,
>  				  struct sgx_encl_page *page);
> diff --git a/arch/x86/kernel/cpu/sgx/epc_cgroup.c b/arch/x86/kernel/cpu/sgx/epc_cgroup.c
> index 71570c346d95..44265f62b2a4 100644
> --- a/arch/x86/kernel/cpu/sgx/epc_cgroup.c
> +++ b/arch/x86/kernel/cpu/sgx/epc_cgroup.c
> @@ -85,9 +85,10 @@ bool sgx_epc_cgroup_lru_empty(struct misc_cg *root)
>  /**
>   * sgx_epc_cgroup_reclaim_pages() - walk a cgroup tree and scan LRUs to reclaim pages
>   * @root:	Root of the tree to start walking
> + * @indirect:	In ksgxd or EPC cgroup work queue context.
>   * Return:	Number of pages reclaimed.
>   */
> -unsigned int sgx_epc_cgroup_reclaim_pages(struct misc_cg *root)
> +static unsigned int sgx_epc_cgroup_reclaim_pages(struct misc_cg *root, bool indirect)
>  {
>  	/*
>  	 * Attempting to reclaim only a few pages will often fail and is inefficient, while
> @@ -111,7 +112,7 @@ unsigned int sgx_epc_cgroup_reclaim_pages(struct misc_cg *root)
>  		rcu_read_unlock();
>
>  		epc_cg = sgx_epc_cgroup_from_misc_cg(css_misc(pos));
> -		cnt += sgx_reclaim_pages(&epc_cg->lru, &nr_to_scan);
> +		cnt += sgx_reclaim_pages(&epc_cg->lru, &nr_to_scan, indirect);
>
>  		rcu_read_lock();
>  		css_put(pos);
> @@ -168,7 +169,7 @@ static void sgx_epc_cgroup_reclaim_work_func(struct work_struct *work)
>  			break;
>
>  		/* Keep reclaiming until above condition is met. */
> -		sgx_epc_cgroup_reclaim_pages(epc_cg->cg);
> +		sgx_epc_cgroup_reclaim_pages(epc_cg->cg, true);
>  	}
>  }
>
> diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c
> index 60cb3a7b3001..14314f25880d 100644
> --- a/arch/x86/kernel/cpu/sgx/main.c
> +++ b/arch/x86/kernel/cpu/sgx/main.c
> @@ -254,7 +254,7 @@ static void sgx_encl_ewb(struct sgx_epc_page *epc_page,
>  }
>
>  static void sgx_reclaimer_write(struct sgx_epc_page *epc_page,
> -				struct sgx_backing *backing)
> +				struct sgx_backing *backing, bool indirect)
>  {
>  	struct sgx_encl_page *encl_page = epc_page->owner;
>  	struct sgx_encl *encl = encl_page->encl;
> @@ -270,7 +270,7 @@ static void sgx_reclaimer_write(struct sgx_epc_page *epc_page,
>
>  	if (!encl->secs_child_cnt && test_bit(SGX_ENCL_INITIALIZED, &encl->flags)) {
>  		ret = sgx_encl_alloc_backing(encl, PFN_DOWN(encl->size),
> -					     &secs_backing);
> +					     &secs_backing, indirect);
>  		if (ret)
>  			goto out;
>
> @@ -301,9 +301,11 @@ static void sgx_reclaimer_write(struct sgx_epc_page *epc_page,
>   *
>   * @lru:	The LRU from which pages are reclaimed.
>   * @nr_to_scan: Pointer to the target number of pages to scan, must be less than SGX_NR_TO_SCAN.
> + * @indirect:	In ksgxd or EPC cgroup work queue contexts.
>   * Return:	Number of pages reclaimed.
>   */
> -unsigned int sgx_reclaim_pages(struct sgx_epc_lru_list *lru, unsigned int *nr_to_scan)
> +unsigned int sgx_reclaim_pages(struct sgx_epc_lru_list *lru, unsigned int *nr_to_scan,
> +			       bool indirect)
>  {
>  	struct sgx_epc_page *chunk[SGX_NR_TO_SCAN];
>  	struct sgx_backing backing[SGX_NR_TO_SCAN];
> @@ -345,7 +347,7 @@ unsigned int sgx_reclaim_pages(struct sgx_epc_lru_list *lru, unsigned int *nr_to
>  		page_index = PFN_DOWN(encl_page->desc - encl_page->encl->base);
>
>  		mutex_lock(&encl_page->encl->lock);
> -		ret = sgx_encl_alloc_backing(encl_page->encl, page_index, &backing[i]);
> +		ret = sgx_encl_alloc_backing(encl_page->encl, page_index, &backing[i], indirect);
>  		if (ret) {
>  			mutex_unlock(&encl_page->encl->lock);
>  			goto skip;
> @@ -378,7 +380,7 @@ unsigned int sgx_reclaim_pages(struct sgx_epc_lru_list *lru, unsigned int *nr_to
>  			continue;
>
>  		encl_page = epc_page->owner;
> -		sgx_reclaimer_write(epc_page, &backing[i]);
> +		sgx_reclaimer_write(epc_page, &backing[i], indirect);
>
>  		kref_put(&encl_page->encl->refcount, sgx_encl_release);
>  		epc_page->flags &= ~SGX_EPC_PAGE_RECLAIMER_TRACKED;
> @@ -396,11 +398,11 @@ static bool sgx_should_reclaim(unsigned long watermark)
>  		!list_empty(&sgx_global_lru.reclaimable);
>  }
>
> -static void sgx_reclaim_pages_global(void)
> +static void sgx_reclaim_pages_global(bool indirect)
>  {
>  	unsigned int nr_to_scan = SGX_NR_TO_SCAN;
>
> -	sgx_reclaim_pages(&sgx_global_lru, &nr_to_scan);
> +	sgx_reclaim_pages(&sgx_global_lru, &nr_to_scan, indirect);
>  }
>
>  /*
> @@ -411,7 +413,7 @@ static void sgx_reclaim_pages_global(void)
>  void sgx_reclaim_direct(void)
>  {
>  	if (sgx_should_reclaim(SGX_NR_LOW_PAGES))
> -		sgx_reclaim_pages_global();
> +		sgx_reclaim_pages_global(false);
>  }
>
>  static int ksgxd(void *p)
> @@ -434,7 +436,7 @@ static int ksgxd(void *p)
>  				     sgx_should_reclaim(SGX_NR_HIGH_PAGES));
>
>  		if (sgx_should_reclaim(SGX_NR_HIGH_PAGES))
> -			sgx_reclaim_pages_global();
> +			sgx_reclaim_pages_global(true);
>
>  		cond_resched();
>  	}
> @@ -457,11 +459,6 @@ static bool __init sgx_page_reclaimer_init(void)
>  	return true;
>  }
>
> -bool current_is_ksgxd(void)
> -{
> -	return current == ksgxd_tsk;
> -}
> -
>  static struct sgx_epc_page *__sgx_alloc_epc_page_from_node(int nid)
>  {
>  	struct sgx_numa_node *node = &sgx_numa_nodes[nid];
> @@ -621,7 +618,7 @@ struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim)
>  	 * Need to do a global reclamation if cgroup was not full but free
>  	 * physical pages run out, causing __sgx_alloc_epc_page() to fail.
>  	 */
> -	sgx_reclaim_pages_global();
> +	sgx_reclaim_pages_global(false);
>  	cond_resched();
>  }
>
> diff --git a/arch/x86/kernel/cpu/sgx/sgx.h b/arch/x86/kernel/cpu/sgx/sgx.h
> index 2593c013d091..cfe906054d85 100644
> --- a/arch/x86/kernel/cpu/sgx/sgx.h
> +++ b/arch/x86/kernel/cpu/sgx/sgx.h
> @@ -110,7 +110,8 @@ void sgx_reclaim_direct(void);
>  void sgx_mark_page_reclaimable(struct sgx_epc_page *page);
>  int sgx_unmark_page_reclaimable(struct sgx_epc_page *page);
>  struct sgx_epc_page *sgx_alloc_epc_page(void *owner, bool reclaim);
> -unsigned int sgx_reclaim_pages(struct sgx_epc_lru_list *lru, unsigned int *nr_to_scan);
> +unsigned int sgx_reclaim_pages(struct sgx_epc_lru_list *lru, unsigned int *nr_to_scan,
> +			       bool indirect);
>
>  void sgx_ipi_cb(void *info);

BR, Jarkko
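For background, the remote-charging mechanism the patch builds on is the kernel's
set_active_memcg() API. A minimal sketch of the pattern, assuming a task-independent
context such as a kthread; do_accounted_alloc() is a hypothetical stand-in for the
shmem allocation that __sgx_encl_get_backing() performs:

	#include <linux/memcontrol.h>
	#include <linux/sched/mm.h>

	/* Charge an allocation made from a kthread to the memcg of @mm. */
	static int alloc_charged_to_mm(struct mm_struct *mm)
	{
		struct mem_cgroup *memcg, *old;
		int ret;

		memcg = get_mem_cgroup_from_mm(mm);	/* takes a reference */
		old = set_active_memcg(memcg);		/* begin remote charging */

		/*
		 * Hypothetical allocation: any __GFP_ACCOUNT allocation made
		 * here is charged to @memcg rather than to current's memcg.
		 */
		ret = do_accounted_alloc();

		set_active_memcg(old);			/* restore previous active memcg */
		mem_cgroup_put(memcg);			/* drop the reference */

		return ret;
	}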
> 
> Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com>
> Reported-by: Mikko Ylinen <mikko.ylinen@linux.intel.com>
> ---

Non-technical stuff:

I believe checkpatch requires you to have a "Closes" tag after "Reported-by",
otherwise it complains something like this:

WARNING: Reported-by: should be immediately followed by Closes: with a URL
to the report

Not sure how strict this rule is, but it seems you forgot to run checkpatch,
so just a reminder.
On Fri, 26 Jan 2024 08:37:25 -0600, Huang, Kai <kai.huang@intel.com> wrote:

>> 
>> Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com>
>> Reported-by: Mikko Ylinen <mikko.ylinen@linux.intel.com>
>> ---
>
> Non-technical stuff:
>
> I believe checkpatch requires you to have a "Closes" tag after
> "Reported-by", otherwise it complains something like this:
>
> WARNING: Reported-by: should be immediately followed by Closes: with a URL
> to the report
>
> Not sure how strict this rule is, but it seems you forgot to run checkpatch,
> so just a reminder.

Yeah, I did remember to run checkpatch this time and last time :-)

Here is what I see in the documentation [1]: "The tag should be followed by a
Closes: tag pointing to the report, unless the report is not available on the
web". So it seems to allow exceptions. Mikko mentioned this issue to me in
private IM, so the report is not available on the web.

Thanks
Haitao

[1] https://docs.kernel.org/process/submitting-patches.html#using-reported-by-tested-by-reviewed-by-suggested-by-and-fixes
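For reference, the tag pairing checkpatch expects looks like the following
(the Closes: URL here is only an illustrative placeholder, since the actual
report was never posted to a list):

	Reported-by: Mikko Ylinen <mikko.ylinen@linux.intel.com>
	Closes: https://lore.kernel.org/linux-sgx/<report-message-id>/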
On Mon, 2024-01-22 at 09:20 -0800, Haitao Huang wrote:
> 
> @@ -1047,29 +1037,38 @@ static struct mem_cgroup *sgx_encl_get_mem_cgroup(struct sgx_encl *encl)
>   * @encl:	an enclave pointer
>   * @page_index:	enclave page index
>   * @backing:	data for accessing backing storage for the page
> + * @indirect:	in ksgxd or EPC cgroup work queue context
> + *
> + * Create a backing page for loading data back into an EPC page with ELDU. This function takes
> + * a reference on a new backing page which must be dropped with a corresponding call to
> + * sgx_encl_put_backing().
>   *
> - * When called from ksgxd, sets the active memcg from one of the
> - * mms in the enclave's mm_list prior to any backing page allocation,
> - * in order to ensure that shmem page allocations are charged to the
> - * enclave. Create a backing page for loading data back into an EPC page with
> - * ELDU. This function takes a reference on a new backing page which
> - * must be dropped with a corresponding call to sgx_encl_put_backing().
> + * When @indirect is true, sets the active memcg from one of the mms in the enclave's mm_list
> + * prior to any backing page allocation, in order to ensure that shmem page allocations are
> + * charged to the enclave.
>   *
>   * Return:
>   *   0 on success,
>   *   -errno otherwise.
>   */
>  int sgx_encl_alloc_backing(struct sgx_encl *encl, unsigned long page_index,
> -			   struct sgx_backing *backing)
> +			   struct sgx_backing *backing, bool indirect)
>  {
> -	struct mem_cgroup *encl_memcg = sgx_encl_get_mem_cgroup(encl);
> -	struct mem_cgroup *memcg = set_active_memcg(encl_memcg);
> +	struct mem_cgroup *encl_memcg;
> +	struct mem_cgroup *memcg;
>  	int ret;
>
> +	if (indirect) {
> +		encl_memcg = sgx_encl_get_mem_cgroup(encl);
> +		memcg = set_active_memcg(encl_memcg);
> +	}
> +
>  	ret = __sgx_encl_get_backing(encl, page_index, backing);
>
> -	set_active_memcg(memcg);
> -	mem_cgroup_put(encl_memcg);
> +	if (indirect) {
> +		set_active_memcg(memcg);
> +		mem_cgroup_put(encl_memcg);
> +	}
>

You can reduce the number of if statements to make the logic
simpler. Something like

	if (!indirect)
		return __sgx_encl_get_backing(encl, page_index, backing);

	encl_memcg = sgx_encl_get_mem_cgroup(encl);
	memcg = set_active_memcg(encl_memcg);
	ret = __sgx_encl_get_backing(encl, page_index, backing);
	set_active_memcg(memcg);
	mem_cgroup_put(encl_memcg);

>  	return ret;

Tim
Hi Tim,

On Fri, 02 Feb 2024 17:45:05 -0600, Tim Chen <tim.c.chen@linux.intel.com> wrote:

> On Mon, 2024-01-22 at 09:20 -0800, Haitao Huang wrote:
>> 
>> @@ -1047,29 +1037,38 @@ static struct mem_cgroup *sgx_encl_get_mem_cgroup(struct sgx_encl *encl)
>>   * @encl:	an enclave pointer
>>   * @page_index:	enclave page index
>>   * @backing:	data for accessing backing storage for the page
>> + * @indirect:	in ksgxd or EPC cgroup work queue context
>> + *
>> + * Create a backing page for loading data back into an EPC page with ELDU. This function takes
>> + * a reference on a new backing page which must be dropped with a corresponding call to
>> + * sgx_encl_put_backing().
>>   *
>> - * When called from ksgxd, sets the active memcg from one of the
>> - * mms in the enclave's mm_list prior to any backing page allocation,
>> - * in order to ensure that shmem page allocations are charged to the
>> - * enclave. Create a backing page for loading data back into an EPC page with
>> - * ELDU. This function takes a reference on a new backing page which
>> - * must be dropped with a corresponding call to sgx_encl_put_backing().
>> + * When @indirect is true, sets the active memcg from one of the mms in the enclave's mm_list
>> + * prior to any backing page allocation, in order to ensure that shmem page allocations are
>> + * charged to the enclave.
>>   *
>>   * Return:
>>   *   0 on success,
>>   *   -errno otherwise.
>>   */
>>  int sgx_encl_alloc_backing(struct sgx_encl *encl, unsigned long page_index,
>> -			   struct sgx_backing *backing)
>> +			   struct sgx_backing *backing, bool indirect)
>>  {
>> -	struct mem_cgroup *encl_memcg = sgx_encl_get_mem_cgroup(encl);
>> -	struct mem_cgroup *memcg = set_active_memcg(encl_memcg);
>> +	struct mem_cgroup *encl_memcg;
>> +	struct mem_cgroup *memcg;
>>  	int ret;
>>
>> +	if (indirect) {
>> +		encl_memcg = sgx_encl_get_mem_cgroup(encl);
>> +		memcg = set_active_memcg(encl_memcg);
>> +	}
>> +
>>  	ret = __sgx_encl_get_backing(encl, page_index, backing);
>>
>> -	set_active_memcg(memcg);
>> -	mem_cgroup_put(encl_memcg);
>> +	if (indirect) {
>> +		set_active_memcg(memcg);
>> +		mem_cgroup_put(encl_memcg);
>> +	}
>>
>
> You can reduce the number of if statements to make the logic
> simpler. Something like
>
> 	if (!indirect)
> 		return __sgx_encl_get_backing(encl, page_index, backing);
>
> 	encl_memcg = sgx_encl_get_mem_cgroup(encl);
> 	memcg = set_active_memcg(encl_memcg);
> 	ret = __sgx_encl_get_backing(encl, page_index, backing);
> 	set_active_memcg(memcg);
> 	mem_cgroup_put(encl_memcg);
>
>>  	return ret;
>
> Tim

Yes, will do.

Thanks
Haitao
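With Tim's restructuring folded in, sgx_encl_alloc_backing() would read roughly
as below. This is a sketch assembled from the hunks quoted in this thread, not
the final committed version:

	int sgx_encl_alloc_backing(struct sgx_encl *encl, unsigned long page_index,
				   struct sgx_backing *backing, bool indirect)
	{
		struct mem_cgroup *encl_memcg;
		struct mem_cgroup *memcg;
		int ret;

		/* Task context: the current task's memcg is already active. */
		if (!indirect)
			return __sgx_encl_get_backing(encl, page_index, backing);

		/*
		 * ksgxd or EPC cgroup work queue: charge shmem allocations to a
		 * memcg taken from the enclave's mm_list instead of the kthread's.
		 */
		encl_memcg = sgx_encl_get_mem_cgroup(encl);
		memcg = set_active_memcg(encl_memcg);

		ret = __sgx_encl_get_backing(encl, page_index, backing);

		set_active_memcg(memcg);
		mem_cgroup_put(encl_memcg);

		return ret;
	}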