Message ID | 20200302173516.iysuejilava37psk@linutronix.de (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v2] mm/compaction: Disable compact_unevictable_allowed on RT | expand |
On Mon, 2 Mar 2020 18:35:16 +0100 Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote: > Since commit > 5bbe3547aa3ba ("mm: allow compaction of unevictable pages") > > it is allowed to examine mlocked pages and compact them by default. > On -RT even minor pagefaults are problematic because it may take a few > 100us to resolve them and until then the task is blocked. > > Make compact_unevictable_allowed = 0 default and RO on RT. hm, that's a bit sad but I guess it's tolerable. > ... > > index 64aeee1009cab..bbfa59d25eec3 100644 > --- a/Documentation/admin-guide/sysctl/vm.rst > +++ b/Documentation/admin-guide/sysctl/vm.rst > @@ -128,6 +128,7 @@ allowed to examine the unevictable lru (mlocked pages) for pages to compact. > This should be used on systems where stalls for minor page faults are an > acceptable trade for large contiguous free memory. Set to 0 to prevent > compaction from moving pages that are unevictable. Default value is 1. > +On CONFIG_PREEMPT_RT the default value is 0. This doesn't mention that the file is unwritable on -rt, and it doesn't explain *why* -rt has different behaviour. > --- a/kernel/sysctl.c > +++ b/kernel/sysctl.c > @@ -1483,7 +1483,11 @@ static struct ctl_table vm_table[] = { > .procname = "compact_unevictable_allowed", > .data = &sysctl_compact_unevictable_allowed, > .maxlen = sizeof(int), > +#ifdef CONFIG_PREEMPT_RT > + .mode = 0444, > +#else > .mode = 0644, > +#endif This is non-backward-compatible and introduces a possibility that tested-on-non-rt userspace will fail on -rt kernels. It might be better to accept the writes, but to ignore them. Probably with a pr_warn_once() to let people know what we did. But do we really need to take the option away from -rt users? Perhaps someone wants this feature and can accept the latency hit. How about switching the default and otherwise leaving the kernel behaviour as-is and simply emitting a warning letting -rt users know that they might not want to enable this?
On 2020-03-02 13:25:31 [-0800], Andrew Morton wrote: > > index 64aeee1009cab..bbfa59d25eec3 100644 > > --- a/Documentation/admin-guide/sysctl/vm.rst > > +++ b/Documentation/admin-guide/sysctl/vm.rst > > @@ -128,6 +128,7 @@ allowed to examine the unevictable lru (mlocked pages) for pages to compact. > > This should be used on systems where stalls for minor page faults are an > > acceptable trade for large contiguous free memory. Set to 0 to prevent > > compaction from moving pages that are unevictable. Default value is 1. > > +On CONFIG_PREEMPT_RT the default value is 0. > > This doesn't mention that the file is unwritable on -rt, and it doesn't > explain *why* -rt has different behaviour. I updated this bit. > > --- a/kernel/sysctl.c > > +++ b/kernel/sysctl.c > > @@ -1483,7 +1483,11 @@ static struct ctl_table vm_table[] = { > > .procname = "compact_unevictable_allowed", > > .data = &sysctl_compact_unevictable_allowed, > > .maxlen = sizeof(int), > > +#ifdef CONFIG_PREEMPT_RT > > + .mode = 0444, > > +#else > > .mode = 0644, > > +#endif > > This is non-backward-compatible and introduces a possibility that > tested-on-non-rt userspace will fail on -rt kernels. It might be > better to accept the writes, but to ignore them. Probably with a > pr_warn_once() to let people know what we did. Hmm. > But do we really need to take the option away from -rt users? Perhaps > someone wants this feature and can accept the latency hit. How about > switching the default and otherwise leaving the kernel behaviour as-is > and simply emitting a warning letting -rt users know that they might > not want to enable this? I don't think that RT people can live with the latency spike. The problem is that it is not deterministic in terms *when* it happens and *how*long* does it need to complete. Also it is not visible so you end up with additional 100us and you have no idea why. compaction is "okay" in the setup / configuration phase when the mlock() pages aren't around / the RT task is disabled. So it does not disturb the RT load. Allowing the user to change the knob and spitting a warning is probably good. So we have a preferred default and the user is aware if it is changed with or without his knowledge. Let me send a patch in a bit… Sebastian
diff --git a/Documentation/admin-guide/sysctl/vm.rst b/Documentation/admin-guide/sysctl/vm.rst index 64aeee1009cab..bbfa59d25eec3 100644 --- a/Documentation/admin-guide/sysctl/vm.rst +++ b/Documentation/admin-guide/sysctl/vm.rst @@ -128,6 +128,7 @@ allowed to examine the unevictable lru (mlocked pages) for pages to compact. This should be used on systems where stalls for minor page faults are an acceptable trade for large contiguous free memory. Set to 0 to prevent compaction from moving pages that are unevictable. Default value is 1. +On CONFIG_PREEMPT_RT the default value is 0. dirty_background_bytes diff --git a/kernel/sysctl.c b/kernel/sysctl.c index ad5b88a53c5a8..f113e31d0b0b6 100644 --- a/kernel/sysctl.c +++ b/kernel/sysctl.c @@ -1483,7 +1483,11 @@ static struct ctl_table vm_table[] = { .procname = "compact_unevictable_allowed", .data = &sysctl_compact_unevictable_allowed, .maxlen = sizeof(int), +#ifdef CONFIG_PREEMPT_RT + .mode = 0444, +#else .mode = 0644, +#endif .proc_handler = proc_dointvec, .extra1 = SYSCTL_ZERO, .extra2 = SYSCTL_ONE, diff --git a/mm/compaction.c b/mm/compaction.c index 672d3c78c6abf..ba77809a1666e 100644 --- a/mm/compaction.c +++ b/mm/compaction.c @@ -1590,7 +1590,11 @@ typedef enum { * Allow userspace to control policy on scanning the unevictable LRU for * compactable pages. */ +#ifdef CONFIG_PREEMPT_RT +int sysctl_compact_unevictable_allowed __read_mostly = 0; +#else int sysctl_compact_unevictable_allowed __read_mostly = 1; +#endif static inline void update_fast_start_pfn(struct compact_control *cc, unsigned long pfn)
Since commit 5bbe3547aa3ba ("mm: allow compaction of unevictable pages") it is allowed to examine mlocked pages and compact them by default. On -RT even minor pagefaults are problematic because it may take a few 100us to resolve them and until then the task is blocked. Make compact_unevictable_allowed = 0 default and RO on RT. Link: https://lore.kernel.org/linux-mm/20190710144138.qyn4tuttdq6h7kqx@linutronix.de/ Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> --- v1…v2: - Make the proc file RO instead removing it. - Mention this change in Documentation/…/vm.rst. Documentation/admin-guide/sysctl/vm.rst | 1 + kernel/sysctl.c | 4 ++++ mm/compaction.c | 4 ++++ 3 files changed, 9 insertions(+)