diff mbox series

[v21,07/19] mm: page_idle_get_page() does not need lru_lock

Message ID 1604566549-62481-8-git-send-email-alex.shi@linux.alibaba.com (mailing list archive)
State New, archived
Series per memcg lru lock

Commit Message

Alex Shi Nov. 5, 2020, 8:55 a.m. UTC
From: Hugh Dickins <hughd@google.com>

It is necessary for page_idle_get_page() to recheck PageLRU() after
get_page_unless_zero(), but holding lru_lock around that serves no
useful purpose, and adds to lru_lock contention: delete it.

See https://lore.kernel.org/lkml/20150504031722.GA2768@blaptop for the
discussion that led to lru_lock there; but __page_set_anon_rmap() now
uses WRITE_ONCE(), and I see no other risk in page_idle_clear_pte_refs()
using rmap_walk() (beyond the risk of racing PageAnon->PageKsm, mostly
but not entirely prevented by page_count() check in ksm.c's
write_protect_page(): that risk being shared with page_referenced() and
not helped by lru_lock).

Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Alex Shi <alex.shi@linux.alibaba.com>
Cc: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org
---
 mm/page_idle.c | 4 ----
 1 file changed, 4 deletions(-)

Comments

Johannes Weiner Nov. 10, 2020, 7:01 p.m. UTC | #1
On Thu, Nov 05, 2020 at 04:55:37PM +0800, Alex Shi wrote:
> From: Hugh Dickins <hughd@google.com>
> [...]

Acked-by: Johannes Weiner <hannes@cmpxchg.org>
huang ying Nov. 11, 2020, 8:17 a.m. UTC | #2
On Thu, Nov 5, 2020 at 4:56 PM Alex Shi <alex.shi@linux.alibaba.com> wrote:
>
> From: Hugh Dickins <hughd@google.com>
> [...]
>
> diff --git a/mm/page_idle.c b/mm/page_idle.c
> index 057c61df12db..64e5344a992c 100644
> --- a/mm/page_idle.c
> +++ b/mm/page_idle.c
> @@ -32,19 +32,15 @@
>  static struct page *page_idle_get_page(unsigned long pfn)
>  {
>         struct page *page = pfn_to_online_page(pfn);
> -       pg_data_t *pgdat;
>
>         if (!page || !PageLRU(page) ||
>             !get_page_unless_zero(page))
>                 return NULL;
>
> -       pgdat = page_pgdat(page);
> -       spin_lock_irq(&pgdat->lru_lock);

get_page_unless_zero() is a full memory barrier.  But do we need a
compiler barrier here to prevent the compiler from caching the
PageLRU() result?  Otherwise this looks OK to me,

Acked-by: "Huang, Ying" <ying.huang@intel.com>

Best Regards,
Huang, Ying

Vlastimil Babka Nov. 11, 2020, 12:52 p.m. UTC | #3
On 11/11/20 9:17 AM, huang ying wrote:
> On Thu, Nov 5, 2020 at 4:56 PM Alex Shi <alex.shi@linux.alibaba.com> wrote:
>>
>> From: Hugh Dickins <hughd@google.com>
>> [...]
>> -       pgdat = page_pgdat(page);
>> -       spin_lock_irq(&pgdat->lru_lock);
> 
> get_page_unless_zero() is a full memory barrier.  But do we need a
> compiler barrier here to prevent the compiler from caching the
> PageLRU() result?  Otherwise this looks OK to me,

I think the compiler barrier is also implied by the full memory barrier,
so the compiler cannot cache the PageLRU() result across it.

Acked-by: Vlastimil Babka <vbabka@suse.cz>


Patch

diff --git a/mm/page_idle.c b/mm/page_idle.c
index 057c61df12db..64e5344a992c 100644
--- a/mm/page_idle.c
+++ b/mm/page_idle.c
@@ -32,19 +32,15 @@ 
 static struct page *page_idle_get_page(unsigned long pfn)
 {
 	struct page *page = pfn_to_online_page(pfn);
-	pg_data_t *pgdat;
 
 	if (!page || !PageLRU(page) ||
 	    !get_page_unless_zero(page))
 		return NULL;
 
-	pgdat = page_pgdat(page);
-	spin_lock_irq(&pgdat->lru_lock);
 	if (unlikely(!PageLRU(page))) {
 		put_page(page);
 		page = NULL;
 	}
-	spin_unlock_irq(&pgdat->lru_lock);
 	return page;
 }
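The lockless pattern in the patch — a speculative PageLRU() check, a reference grab via get_page_unless_zero(), then a recheck once the reference is held — can be sketched outside the kernel with C11 atomics. This is only an illustrative model, not kernel code: the struct and function names below are stand-ins, and the compare-exchange loop models atomic_add_unless(), whose implied full barrier is what makes the recheck safe without the lru_lock.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative stand-in for struct page: a refcount plus an LRU flag. */
struct page_model {
	atomic_int refcount;
	atomic_bool on_lru;
};

/* Models get_page_unless_zero(): take a reference only if the count is
 * already non-zero, so a page whose last reference is being dropped
 * concurrently cannot be resurrected.  The successful seq_cst CAS is a
 * full barrier, ordering the later on_lru recheck after the grab. */
static bool get_unless_zero(struct page_model *p)
{
	int old = atomic_load(&p->refcount);

	while (old != 0) {
		if (atomic_compare_exchange_weak(&p->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* Models the patched page_idle_get_page(): no lru_lock, just a recheck
 * once the reference is held. */
static struct page_model *idle_get_page(struct page_model *p)
{
	if (!p || !atomic_load(&p->on_lru) || !get_unless_zero(p))
		return NULL;

	if (!atomic_load(&p->on_lru)) {
		/* The page was isolated from the LRU in the window between
		 * the first check and the reference grab: drop it again. */
		atomic_fetch_sub(&p->refcount, 1);	/* put_page() */
		return NULL;
	}
	return p;
}
```

With the reference held, a concurrent isolation can still clear the LRU flag, but the recheck then drops the reference instead of returning a stale page; nothing in the sequence relies on the lru_lock.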