Message ID | 20241121122409.277927-1-zhen.ni@easystack.cn (mailing list archive)
---|---
State | New
Series | mm/mempolicy: Fix redundant check and refine lock protection scope in init_nodemask_of_mempolicy
On 21.11.24 13:24, Zhen Ni wrote:
> 1. Removing redundant checks for current->mempolicy, with a more concise
> check order.
>
> 2. Using READ_ONCE(current->mempolicy) for safe, single access to
> current->mempolicy to prevent potential race conditions.
>
> 3. Optimizing the scope of task_lock(current). The lock now only protects
> the critical section where mempolicy is accessed, reducing the duration
> the lock is held. This enhances performance by limiting the scope of the
> lock to only what is necessary.
>
> Signed-off-by: Zhen Ni <zhen.ni@easystack.cn>
> ---
>  mm/mempolicy.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index b646fab3e45e..8bff8830b7e6 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -2132,11 +2132,14 @@ bool init_nodemask_of_mempolicy(nodemask_t *mask)
>  {
>  	struct mempolicy *mempolicy;
>
> -	if (!(mask && current->mempolicy))
> +	if (!mask)
> +		return false;
> +
> +	mempolicy = READ_ONCE(current->mempolicy);
> +	if (!mempolicy)
>  		return false;
>
>  	task_lock(current);
> -	mempolicy = current->mempolicy;
>  	switch (mempolicy->mode) {
>  	case MPOL_PREFERRED:
>  	case MPOL_PREFERRED_MANY:

(a) current->mempolicy can only change under the task_lock(), see
do_set_mempolicy().

(b) current->mempolicy cannot usually become NULL once it was non-NULL.
There are two exceptions:

copy_process() might set it to NULL when creating the process, before the
task gets scheduled.

do_exit() might set it to NULL via mpol_put_task_policy().

I don't think init_nodemask_of_mempolicy() can be called while our task is
exiting ...

Now, if you read current->mempolicy outside of task_lock() to then
*dereference* it when it might already have changed+was freed, that's a BUG
you're introducing.

So, regarding your (1): which redundant "check"? There is a single
"not NULL" check.
Regarding your (2), I think you are introducing a BUG, and regarding your
(3), please share performance numbers, or realize that you are making
something up ;)

The only improvement we could make here is converting:

	if (!(mask && current->mempolicy))

into the more readable

	if (!mask || !current->mempolicy)

And maybe adding a comment that while we can race with someone changing
current->mempolicy, we cannot race with someone changing it to NULL.
On Thu, Nov 21, 2024 at 02:40:44PM +0100, David Hildenbrand wrote:
> On 21.11.24 13:24, Zhen Ni wrote:
> > 1. Removing redundant checks for current->mempolicy, with a more concise
> > check order.
> >
> > 2. Using READ_ONCE(current->mempolicy) for safe, single access to
> > current->mempolicy to prevent potential race conditions.
> >
> > 3. Optimizing the scope of task_lock(current). The lock now only protects
> > the critical section where mempolicy is accessed, reducing the duration
> > the lock is held. This enhances performance by limiting the scope of the
> > lock to only what is necessary.

While optimizing task lock scopes is admirable, we're talking about a scale
of maybe a microsecond length access on a lock that's typically held quite a
bit longer - if it's ever really held given the context of this function.

What you'd actually want to do is the opposite:

	task_lock(current);
	mempolicy = get_mempolicy();
	task_unlock(current);
	/* do work */
	put_mempolicy();

but even this is only optimizing a small amount of work. And if you look at
what init_nodemask_of_mempolicy is actually used for, there is no need for
this optimization - the task is almost certainly short lived and intended
to just adjust a system control value.
> > Signed-off-by: Zhen Ni <zhen.ni@easystack.cn>
> > ---
> >  mm/mempolicy.c | 7 +++++--
> >  1 file changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> > index b646fab3e45e..8bff8830b7e6 100644
> > --- a/mm/mempolicy.c
> > +++ b/mm/mempolicy.c
> > @@ -2132,11 +2132,14 @@ bool init_nodemask_of_mempolicy(nodemask_t *mask)
> >  {
> >  	struct mempolicy *mempolicy;
> >
> > -	if (!(mask && current->mempolicy))
> > +	if (!mask)
> > +		return false;
> > +
> > +	mempolicy = READ_ONCE(current->mempolicy);
> > +	if (!mempolicy)
> >  		return false;
> >
> >  	task_lock(current);
> > -	mempolicy = current->mempolicy;
> >  	switch (mempolicy->mode) {
> >  	case MPOL_PREFERRED:
> >  	case MPOL_PREFERRED_MANY:
>
> (a) current->mempolicy can only change under the task_lock(), see
> do_set_mempolicy().
>
> (b) current->mempolicy cannot usually become NULL once it was non-NULL.
> There are two exceptions:
>
> copy_process() might set it to NULL when creating the process, before the
> task gets scheduled.
>
> do_exit() might set it to NULL via mpol_put_task_policy().
>
> I don't think init_nodemask_of_mempolicy() can be called while our task is
> exiting ...

Unless someone put an access to nr_hugepages_mempolicy in the exit path, I
doubt it (also that would probably deadlock :])

> Now, if you read current->mempolicy outside of task_lock() to then
> *dereference* it when it might already have changed+was freed, that's a
> BUG you're introducing.

100% a bug, although since the task mempolicy can't currently change outside
the context of the task - I can't think of a way to poke this particular
bug.

~Gregory
Hi David and Gregory,

Thank you for your detailed feedback and insights. I deeply appreciate the
time you've taken to point out potential issues and to clarify the behavior
of current->mempolicy under task_lock().

Regarding the "redundant check" in my patch, my intention was to simplify
the readability of the check for mask and current->mempolicy. Your suggested
alternative:

	if (!mask || !current->mempolicy)

is indeed more concise and clear. I fully agree with this change and will
incorporate it.

On the issue of reading current->mempolicy outside the task_lock() context,
I acknowledge the potential for introducing a race condition where
current->mempolicy could be dereferenced after being freed. This was an
oversight on my part, as I was primarily focused on reducing the lock scope.

Regarding the potential performance improvement, I recognize that the
optimization is minor, and in this specific context the lock is held for
such a short duration that it is unlikely to provide any meaningful benefit.

I've revised the patch to simplify the conditional check and added a comment
to clarify the behavior of current->mempolicy based on your input.
Additionally, I've removed the lock scope optimization attempt to avoid
introducing any potential race conditions.

From 073e4ac5ee6a3f2b45804492f3865cf9157155e2 Mon Sep 17 00:00:00 2001
From: Zhen Ni <zhen.ni@easystack.cn>
Date: Fri, 22 Nov 2024 11:48:05 +0800
Subject: [PATCH] mm/mempolicy: Improve readability of NULL check in
 init_nodemask_of_mempolicy

Refines the readability of the NULL check in init_nodemask_of_mempolicy.
Additionally, a comment is added to clarify current->mempolicy.
Signed-off-by: Zhen Ni <zhen.ni@easystack.cn>
---
 mm/mempolicy.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index b646fab3e45e..0f0dd33d20d4 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -2132,7 +2132,13 @@ bool init_nodemask_of_mempolicy(nodemask_t *mask)
 {
 	struct mempolicy *mempolicy;
 
-	if (!(mask && current->mempolicy))
+	/*
+	 * While current->mempolicy can race with someone changing
+	 * it, it cannot race with changes that set it to NULL. Such
+	 * changes are restricted to specific contexts (e.g., process
+	 * initialization or exit).
+	 */
+	if (!mask || !current->mempolicy)
 		return false;
 
 	task_lock(current);
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index b646fab3e45e..8bff8830b7e6 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -2132,11 +2132,14 @@ bool init_nodemask_of_mempolicy(nodemask_t *mask)
 {
 	struct mempolicy *mempolicy;
 
-	if (!(mask && current->mempolicy))
+	if (!mask)
+		return false;
+
+	mempolicy = READ_ONCE(current->mempolicy);
+	if (!mempolicy)
 		return false;
 
 	task_lock(current);
-	mempolicy = current->mempolicy;
 	switch (mempolicy->mode) {
 	case MPOL_PREFERRED:
 	case MPOL_PREFERRED_MANY:
1. Removing redundant checks for current->mempolicy, with a more concise
check order.

2. Using READ_ONCE(current->mempolicy) for safe, single access to
current->mempolicy to prevent potential race conditions.

3. Optimizing the scope of task_lock(current). The lock now only protects
the critical section where mempolicy is accessed, reducing the duration
the lock is held. This enhances performance by limiting the scope of the
lock to only what is necessary.

Signed-off-by: Zhen Ni <zhen.ni@easystack.cn>
---
 mm/mempolicy.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)