[v2,0/4] migrate: Avoid unbounded blocks in MIGRATE_SYNC_LIGHT

Message ID	20230421221249.1616168-1-dianders@chromium.org (mailing list archive)
Headers	show Return-Path: <linux-fsdevel-owner@vger.kernel.org> From: Douglas Anderson <dianders@chromium.org> To: Andrew Morton <akpm@linux-foundation.org>, Mel Gorman <mgorman@techsingularity.net>, Vlastimil Babka <vbabka@suse.cz>, Ying <ying.huang@intel.com>, Alexander Viro <viro@zeniv.linux.org.uk>, Christian Brauner <brauner@kernel.org> Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, Yu Zhao <yuzhao@google.com>, linux-fsdevel@vger.kernel.org, Matthew Wilcox <willy@infradead.org>, Douglas Anderson <dianders@chromium.org>, Bart Van Assche <bvanassche@acm.org>, Ben Segall <bsegall@google.com>, Daniel Bristot de Oliveira <bristot@redhat.com>, Dietmar Eggemann <dietmar.eggemann@arm.com>, Ingo Molnar <mingo@redhat.com>, Jan Kara <jack@suse.cz>, Juri Lelli <juri.lelli@redhat.com>, Mel Gorman <mgorman@suse.de>, Mikulas Patocka <mpatocka@redhat.com>, Peter Zijlstra <peterz@infradead.org>, "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>, Steven Rostedt <rostedt@goodmis.org>, Valentin Schneider <vschneid@redhat.com>, Vincent Guittot <vincent.guittot@linaro.org>, Will Deacon <will@kernel.org>, Zhang Yi <yi.zhang@huawei.com> Subject: [PATCH v2 0/4] migrate: Avoid unbounded blocks in MIGRATE_SYNC_LIGHT Date: Fri, 21 Apr 2023 15:12:44 -0700 Message-ID: <20230421221249.1616168-1-dianders@chromium.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk
Series	migrate: Avoid unbounded blocks in MIGRATE_SYNC_LIGHT \| expand [v2,0/4] migrate: Avoid unbounded blocks in MIGRATE_SYNC_LIGHT [v2,1/4] mm/filemap: Add folio_lock_timeout() [v2,2/4] buffer: Add lock_buffer_timeout() [v2,3/4] migrate_pages: Don't wait forever locking pages in MIGRATE_SYNC_LIGHT [v2,4/4] migrate_pages: Don't wait forever locking buffers in MIGRATE_SYNC_LIGHT

Message ID

20230421221249.1616168-1-dianders@chromium.org (mailing list archive)

Headers

From: Douglas Anderson <dianders@chromium.org>
To: Andrew Morton <akpm@linux-foundation.org>,
        Mel Gorman <mgorman@techsingularity.net>,
        Vlastimil Babka <vbabka@suse.cz>, Ying <ying.huang@intel.com>,
        Alexander Viro <viro@zeniv.linux.org.uk>,
        Christian Brauner <brauner@kernel.org>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
        Yu Zhao <yuzhao@google.com>, linux-fsdevel@vger.kernel.org,
        Matthew Wilcox <willy@infradead.org>,
        Douglas Anderson <dianders@chromium.org>,
        Bart Van Assche <bvanassche@acm.org>,
        Ben Segall <bsegall@google.com>,
        Daniel Bristot de Oliveira <bristot@redhat.com>,
        Dietmar Eggemann <dietmar.eggemann@arm.com>,
        Ingo Molnar <mingo@redhat.com>, Jan Kara <jack@suse.cz>,
        Juri Lelli <juri.lelli@redhat.com>,
        Mel Gorman <mgorman@suse.de>,
        Mikulas Patocka <mpatocka@redhat.com>,
        Peter Zijlstra <peterz@infradead.org>,
        "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>,
        Steven Rostedt <rostedt@goodmis.org>,
        Valentin Schneider <vschneid@redhat.com>,
        Vincent Guittot <vincent.guittot@linaro.org>,
        Will Deacon <will@kernel.org>, Zhang Yi <yi.zhang@huawei.com>
Subject: [PATCH v2 0/4] migrate: Avoid unbounded blocks in MIGRATE_SYNC_LIGHT
Date: Fri, 21 Apr 2023 15:12:44 -0700
Message-ID: <20230421221249.1616168-1-dianders@chromium.org>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Precedence: bulk

Series

migrate: Avoid unbounded blocks in MIGRATE_SYNC_LIGHT | expand

Message

Doug Anderson April 21, 2023, 10:12 p.m. UTC

This series is the result of discussion around my RFC patch [1] where
I talked about completely removing the waits for the folio_lock in
migrate_folio_unmap().

This new series should, I think, be more palatable to folks. Please
let me know what you think!

Most of the description of why I think we want this patch series can
be seen in patch #3 ("migrate_pages: Don't wait forever locking pages
in MIGRATE_SYNC_LIGHT"). I won't repeat all that information here and
would humbly request that anyone wishing to comment on the overall
patch series respond there. ;-)

[1] https://lore.kernel.org/r/20230413182313.RFC.1.Ia86ccac02a303154a0b8bc60567e7a95d34c96d3@changeid

Changes in v2:
- "Add folio_lock_timeout()" new for v2.
- "Add lock_buffer_timeout()" new for v2.
- Keep unbounded delay in "SYNC", delay with a timeout in "SYNC_LIGHT"
- "Don't wait forever locking buffers in MIGRATE_SYNC_LIGHT" new for v2.

Douglas Anderson (4):
  mm/filemap: Add folio_lock_timeout()
  buffer: Add lock_buffer_timeout()
  migrate_pages: Don't wait forever locking pages in MIGRATE_SYNC_LIGHT
  migrate_pages: Don't wait forever locking buffers in
    MIGRATE_SYNC_LIGHT

 fs/buffer.c                 |  7 ++++++
 include/linux/buffer_head.h | 10 ++++++++
 include/linux/pagemap.h     | 16 +++++++++++++
 include/linux/wait_bit.h    | 24 +++++++++++++++++++
 kernel/sched/wait_bit.c     | 14 +++++++++++
 mm/filemap.c                | 47 +++++++++++++++++++++++++++----------
 mm/migrate.c                | 45 +++++++++++++++++++++--------------
 7 files changed, 132 insertions(+), 31 deletions(-)

Comments

Gao Xiang April 23, 2023, 8:35 a.m. UTC | #1

Hi Hillf,

On 2023/4/23 16:12, Hillf Danton wrote:
> On 23 Apr 2023 14:08:49 +0800 Gao Xiang <hsiangkao@linux.alibaba.com>
>> On 2023/4/22 13:18, Hillf Danton wrote:
>>> On 21 Apr 2023 15:12:45 -0700 Douglas Anderson <dianders@chromium.org>
>>>> Add a variant of folio_lock() that can timeout. This is useful to
>>>> avoid unbounded waits for the page lock in kcompactd.
>>>
>>> Given no mutex_lock_timeout() (perhaps because timeout makes no sense for
>>> spinlock), I suspect your fix lies in the right layer. If waiting for
>>> page under IO causes trouble for you, another simpler option is make
>>> IO faster (perhaps all you can do) for instance. If kcompactd is waken
>>> up by kswapd, waiting for slow IO is the right thing to do.
>>
>> A bit out of topic.  That is almost our original inital use scenarios for
> 
> Thanks for taking a look.
> 
>> EROFS [1] although we didn't actually test Chrome OS, there lies four
>> points:
>>
>>    1) 128kb compressed size unit is not suitable for memory constraint
>>       workload, especially memory pressure scenarios, that amplify both I/Os
>>       and memory footprints (EROFS was initially optimized with 4KiB
>>       pclusters);
> 
> Feel free to take another one at 2M THP [1].
> 
> [1] https://lore.kernel.org/lkml/20230418191313.268131-1-hannes@cmpxchg.org/

Honestly I don't catch your point here, does THP has some relationship with
this?  Almost all smartphones (but I don't know Chromebook honestly) didn't
use THP at that time.

>>
>>    2) If you turn into a small compressed size (e.g. 4 KiB), some fs behaves
>>       ineffective since its on-disk compressed index isn't designed to be
>>       random accessed (another in-memory cache for random access) so you have
>>       to count one by one to calculate physical data offset if cache miss;
>>
>>    3) compressed data needs to take extra memory during I/O (especially
>>       low-ended devices) that makes the cases worse and our camera app
>>       workloads once cannot be properly launched under heavy memory pressure,
>>       but in order to keep user best experience we have to keep as many as
>>       apps active so that it's hard to kill apps directly.  So inplace I/O +
>>       decompression is needed in addition to small compressed sizes for
>>       overall performance.
> 
> Frankly nowadays I have no interest in running linux with <16M RAM for example.

Our cases are tested on 2016-2018 devices under 3 to 6 GB memory if you
take a glance at the original ATC paper, the page 9 (section 5.1) wrote:
"However, it costed too much CPU and memory resources, and when trying to
  run a camera application, the phone froze for tens of seconds before it
  finally failed."

I have no idea how 16M RAM here comes from but smartphones doesn't have
such limited memory.  In brief, if you runs few app, you have few problem.
but as long as you keeps more apps in background (and running), then the
memory will eventually suffer pressure.

>>
>>    4) If considering real-time performance, some algorithms are not quite
>>       suitable for extreme pressure cases;
> 
> Neither in considering any perf under extreme memory pressure (16M or 64G RAM)
> because of crystally pure waste of time.

Personally I don't think so, if you'd like to land an effective compression
approach for end users and avoid user complaints (app lagging, app frozen,
etc).  I think these all need to be considered in practice.

Thanks,
Gao Xiang

>>
>>    etc.
>>
>> I could give more details on this year LSF/MM about this, although it's not
>> a new topic and I'm not a Android guy now.
> 
> Did you book the air ticket? How many bucks?
>>
>> [1] https://www.usenix.org/conference/atc19/presentation/gao
>>
>> Thanks,
>> Gao Xiang

Gao Xiang April 23, 2023, 10:45 a.m. UTC | #2

On 2023/4/23 17:49, Hillf Danton wrote:
> On 23 Apr 2023 16:35:26 +0800 Gao Xiang <hsiangkao@linux.alibaba.com>
>> On 2023/4/23 16:12, Hillf Danton wrote:
>>> On 23 Apr 2023 14:08:49 +0800 Gao Xiang <hsiangkao@linux.alibaba.com>
>>>> On 2023/4/22 13:18, Hillf Danton wrote:
>>>>> On 21 Apr 2023 15:12:45 -0700 Douglas Anderson <dianders@chromium.org>
>>>>>> Add a variant of folio_lock() that can timeout. This is useful to
>>>>>> avoid unbounded waits for the page lock in kcompactd.
>>>>>
>>>>> Given no mutex_lock_timeout() (perhaps because timeout makes no sense for
>>>>> spinlock), I suspect your fix lies in the right layer. If waiting for
>>>>> page under IO causes trouble for you, another simpler option is make
>>>>> IO faster (perhaps all you can do) for instance. If kcompactd is waken
>>>>> up by kswapd, waiting for slow IO is the right thing to do.
>>>>
>>>> A bit out of topic.  That is almost our original inital use scenarios for
>>>
>>> Thanks for taking a look.
>>>
>>>> EROFS [1] although we didn't actually test Chrome OS, there lies four
>>>> points:
>>>>
>>>>     1) 128kb compressed size unit is not suitable for memory constraint
>>>>        workload, especially memory pressure scenarios, that amplify both I/Os
>>>>        and memory footprints (EROFS was initially optimized with 4KiB
>>>>        pclusters);
>>>
>>> Feel free to take another one at 2M THP [1].
>>>
>>> [1] https://lore.kernel.org/lkml/20230418191313.268131-1-hannes@cmpxchg.org/
>>
>> Honestly I don't catch your point here, does THP has some relationship with
> 
> THP tests ended without the help of timeout helpers.
> 
>> this?  Almost all smartphones (but I don't know Chromebook honestly) didn't
>> use THP at that time.
>>>>
>>>>     2) If you turn into a small compressed size (e.g. 4 KiB), some fs behaves
>>>>        ineffective since its on-disk compressed index isn't designed to be
>>>>        random accessed (another in-memory cache for random access) so you have
>>>>        to count one by one to calculate physical data offset if cache miss;
>>>>
>>>>     3) compressed data needs to take extra memory during I/O (especially
>>>>        low-ended devices) that makes the cases worse and our camera app
>>>>        workloads once cannot be properly launched under heavy memory pressure,
>>>>        but in order to keep user best experience we have to keep as many as
>>>>        apps active so that it's hard to kill apps directly.  So inplace I/O +
>>>>        decompression is needed in addition to small compressed sizes for
>>>>        overall performance.
>>>
>>> Frankly nowadays I have no interest in running linux with <16M RAM for example.
>>
>> Our cases are tested on 2016-2018 devices under 3 to 6 GB memory if you
>> take a glance at the original ATC paper, the page 9 (section 5.1) wrote:
>> "However, it costed too much CPU and memory resources, and when trying to
>>   run a camera application, the phone froze for tens of seconds before it
>>   finally failed."
>>
>> I have no idea how 16M RAM here comes from but smartphones doesn't have
>> such limited memory.  In brief, if you runs few app, you have few problem.
>> but as long as you keeps more apps in background (and running), then the
>> memory will eventually suffer pressure.
> 
> Given no complaints in case of running 16 apps with 1G RAM for instance,
> what is the point of running 256 apps with the same RAM? And adding changes

I don't think the `ill-designed` word is helpful to the overall topic.

I'm not sure if my description is confusing:

  1) First, I never said running 256 apps with the same RAM.  In fact, in 2018
     there are indeed some phones still with 1G RAM, if my memory is correct,
     such 1G phones couldn't run 16 latest mainstream super apps at the same time
     smoothly, and, previously compression will lead this worse.  Even such
     phones cannot use a full Android but a minimized Android Go [1] instead.
     The worst case I've heard on phones with 1G RAM would be "after you checked
     a new message from friends on a superapp by switch out, another previous
     one with some incomplete registeration form could be killed and you have
     to restart and refill the form."

  2) apps and baseos can be upgraded over time, especially apps, since Android
     ecosystem is open.  It's hard to get over it.

Thanks,
Gao Xiang

> because of ill designed phone products?

[1] https://developer.android.com/guide/topics/androidgo