Message ID | 20230506085903.96133-1-wuyun.abel@bytedance.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | sock: Fix misuse of sk_under_memory_pressure() | expand |
On Sat, 2023-05-06 at 16:59 +0800, Abel Wu wrote: > The commit 180d8cd942ce ("foundations of per-cgroup memory pressure > controlling") wrapped proto::memory_pressure status into an accessor > named sk_under_memory_pressure(), and in the next commit e1aab161e013 > ("socket: initial cgroup code") added the consideration of net-memcg > pressure into this accessor. > > But with the former patch applied, not all of the call sites of > sk_under_memory_pressure() are interested in net-memcg's pressure. > The __sk_mem_{raise,reduce}_allocated() only focus on proto/netns > pressure rather than net-memcg's. Why do you state the above? The current behavior is established since ~12y, arguably we can state quite the opposite. I think this patch should at least target net-next, and I think we need a more detailed reasoning to introduce such behavior change. > IOW this accessor are generally > used for deciding whether should reclaim or not. > > Fixes: e1aab161e013 ("socket: initial cgroup code") > Signed-off-by: Abel Wu <wuyun.abel@bytedance.com> > --- > include/net/sock.h | 5 ----- > net/core/sock.c | 17 +++++++++-------- > 2 files changed, 9 insertions(+), 13 deletions(-) > > diff --git a/include/net/sock.h b/include/net/sock.h > index 8b7ed7167243..752d51030c5a 100644 > --- a/include/net/sock.h > +++ b/include/net/sock.h > @@ -1404,11 +1404,6 @@ static inline int sk_under_cgroup_hierarchy(struct sock *sk, > #endif > } > > -static inline bool sk_has_memory_pressure(const struct sock *sk) > -{ > - return sk->sk_prot->memory_pressure != NULL; > -} > - > static inline bool sk_under_memory_pressure(const struct sock *sk) > { > if (!sk->sk_prot->memory_pressure) > diff --git a/net/core/sock.c b/net/core/sock.c > index 5440e67bcfe3..8d215f821ea6 100644 > --- a/net/core/sock.c > +++ b/net/core/sock.c > @@ -3017,13 +3017,14 @@ int __sk_mem_raise_allocated(struct sock *sk, int size, int amt, int kind) > } > } > > - if (sk_has_memory_pressure(sk)) { > - u64 alloc; > - > - if (!sk_under_memory_pressure(sk)) > - return 1; > - alloc = sk_sockets_allocated_read_positive(sk); > - if (sk_prot_mem_limits(sk, 2) > alloc * > + if (prot->memory_pressure) { > + /* > + * If under global pressure, allow the sockets that are below > + * average memory usage to raise, trying to be fair between all > + * the sockets under global constrains. > + */ > + if (!*prot->memory_pressure || > + sk_prot_mem_limits(sk, 2) > sk_sockets_allocated_read_positive(sk) * The above introduces unrelated changes that makes the code IMHO less readable - I don't see a good reason to drop the 'alloc' variable. Cheers, Paolo
On Sat, May 6, 2023 at 10:59 AM Abel Wu <wuyun.abel@bytedance.com> wrote: > > The commit 180d8cd942ce ("foundations of per-cgroup memory pressure > controlling") wrapped proto::memory_pressure status into an accessor > named sk_under_memory_pressure(), and in the next commit e1aab161e013 > ("socket: initial cgroup code") added the consideration of net-memcg > pressure into this accessor. > > But with the former patch applied, not all of the call sites of > sk_under_memory_pressure() are interested in net-memcg's pressure. > The __sk_mem_{raise,reduce}_allocated() only focus on proto/netns > pressure rather than net-memcg's. IOW this accessor are generally > used for deciding whether should reclaim or not. > > Fixes: e1aab161e013 ("socket: initial cgroup code") > Signed-off-by: Abel Wu <wuyun.abel@bytedance.com> > --- > include/net/sock.h | 5 ----- > net/core/sock.c | 17 +++++++++-------- > 2 files changed, 9 insertions(+), 13 deletions(-) > > diff --git a/include/net/sock.h b/include/net/sock.h > index 8b7ed7167243..752d51030c5a 100644 > --- a/include/net/sock.h > +++ b/include/net/sock.h > @@ -1404,11 +1404,6 @@ static inline int sk_under_cgroup_hierarchy(struct sock *sk, > #endif > } > > -static inline bool sk_has_memory_pressure(const struct sock *sk) > -{ > - return sk->sk_prot->memory_pressure != NULL; > -} > - > static inline bool sk_under_memory_pressure(const struct sock *sk) > { > if (!sk->sk_prot->memory_pressure) > diff --git a/net/core/sock.c b/net/core/sock.c > index 5440e67bcfe3..8d215f821ea6 100644 > --- a/net/core/sock.c > +++ b/net/core/sock.c > @@ -3017,13 +3017,14 @@ int __sk_mem_raise_allocated(struct sock *sk, int size, int amt, int kind) > } > } > > - if (sk_has_memory_pressure(sk)) { > - u64 alloc; > - > - if (!sk_under_memory_pressure(sk)) > - return 1; > - alloc = sk_sockets_allocated_read_positive(sk); > - if (sk_prot_mem_limits(sk, 2) > alloc * > + if (prot->memory_pressure) { I do not understand this patch. Changelog is evasive, I do not see what practical problem you want to solve. sk_has_memory_pressure() is not about memcg, simply the fact that a proto has a non NULL memory_pressure pointer.
Hi Paolo, thanks very much for comment! On 5/9/23 3:52 PM, Paolo Abeni wrote: > On Sat, 2023-05-06 at 16:59 +0800, Abel Wu wrote: >> The commit 180d8cd942ce ("foundations of per-cgroup memory pressure >> controlling") wrapped proto::memory_pressure status into an accessor >> named sk_under_memory_pressure(), and in the next commit e1aab161e013 >> ("socket: initial cgroup code") added the consideration of net-memcg >> pressure into this accessor. >> >> But with the former patch applied, not all of the call sites of >> sk_under_memory_pressure() are interested in net-memcg's pressure. >> The __sk_mem_{raise,reduce}_allocated() only focus on proto/netns >> pressure rather than net-memcg's. > > Why do you state the above? The current behavior is established since > ~12y, arguably we can state quite the opposite. > > I think this patch should at least target net-next, and I think we need > a more detailed reasoning to introduce such behavior change. Sorry for failed to provide a reasonable explanation... When @allocated is no more than tcp_mem[0], the global tcp_mem pressure is gone even if the socket's memcg is under pressure. This reveals that prot::memory_pressure only considers the global tcp memory pressure, and is irrelevant to the memcg's. IOW if we're updating prot::memory_pressure or making desicions upon prot::memory_pressure, the memcg stat should not be considered and sk_under_memory_pressure() should not be called since it considers both. > >> IOW this accessor are generally >> used for deciding whether should reclaim or not. >> >> Fixes: e1aab161e013 ("socket: initial cgroup code") >> Signed-off-by: Abel Wu <wuyun.abel@bytedance.com> >> --- >> include/net/sock.h | 5 ----- >> net/core/sock.c | 17 +++++++++-------- >> 2 files changed, 9 insertions(+), 13 deletions(-) >> >> diff --git a/include/net/sock.h b/include/net/sock.h >> index 8b7ed7167243..752d51030c5a 100644 >> --- a/include/net/sock.h >> +++ b/include/net/sock.h >> @@ -1404,11 +1404,6 @@ static inline int sk_under_cgroup_hierarchy(struct sock *sk, >> #endif >> } >> >> -static inline bool sk_has_memory_pressure(const struct sock *sk) >> -{ >> - return sk->sk_prot->memory_pressure != NULL; >> -} >> - >> static inline bool sk_under_memory_pressure(const struct sock *sk) >> { >> if (!sk->sk_prot->memory_pressure) >> diff --git a/net/core/sock.c b/net/core/sock.c >> index 5440e67bcfe3..8d215f821ea6 100644 >> --- a/net/core/sock.c >> +++ b/net/core/sock.c >> @@ -3017,13 +3017,14 @@ int __sk_mem_raise_allocated(struct sock *sk, int size, int amt, int kind) >> } >> } >> >> - if (sk_has_memory_pressure(sk)) { >> - u64 alloc; >> - >> - if (!sk_under_memory_pressure(sk)) >> - return 1; >> - alloc = sk_sockets_allocated_read_positive(sk); >> - if (sk_prot_mem_limits(sk, 2) > alloc * >> + if (prot->memory_pressure) { >> + /* >> + * If under global pressure, allow the sockets that are below >> + * average memory usage to raise, trying to be fair between all >> + * the sockets under global constrains. >> + */ >> + if (!*prot->memory_pressure || >> + sk_prot_mem_limits(sk, 2) > sk_sockets_allocated_read_positive(sk) * > > The above introduces unrelated changes that makes the code IMHO less > readable - I don't see a good reason to drop the 'alloc' variable. Besides drop the @alloc variable, this change also removes the condition of memcg's pressure from sk_under_memory_pressure() due to the reason aforementioned. I can re-introduce @alloc in the next version if you think it makes code more readable. Thanks & Best, Abel
Hi Eric, thanks very much for comments! On 5/9/23 4:55 PM, Eric Dumazet wrote: > On Sat, May 6, 2023 at 10:59 AM Abel Wu <wuyun.abel@bytedance.com> wrote: >> >> The commit 180d8cd942ce ("foundations of per-cgroup memory pressure >> controlling") wrapped proto::memory_pressure status into an accessor >> named sk_under_memory_pressure(), and in the next commit e1aab161e013 >> ("socket: initial cgroup code") added the consideration of net-memcg >> pressure into this accessor. >> >> But with the former patch applied, not all of the call sites of >> sk_under_memory_pressure() are interested in net-memcg's pressure. >> The __sk_mem_{raise,reduce}_allocated() only focus on proto/netns >> pressure rather than net-memcg's. IOW this accessor are generally >> used for deciding whether should reclaim or not. >> >> Fixes: e1aab161e013 ("socket: initial cgroup code") >> Signed-off-by: Abel Wu <wuyun.abel@bytedance.com> >> --- >> include/net/sock.h | 5 ----- >> net/core/sock.c | 17 +++++++++-------- >> 2 files changed, 9 insertions(+), 13 deletions(-) >> >> diff --git a/include/net/sock.h b/include/net/sock.h >> index 8b7ed7167243..752d51030c5a 100644 >> --- a/include/net/sock.h >> +++ b/include/net/sock.h >> @@ -1404,11 +1404,6 @@ static inline int sk_under_cgroup_hierarchy(struct sock *sk, >> #endif >> } >> >> -static inline bool sk_has_memory_pressure(const struct sock *sk) >> -{ >> - return sk->sk_prot->memory_pressure != NULL; >> -} >> - >> static inline bool sk_under_memory_pressure(const struct sock *sk) >> { >> if (!sk->sk_prot->memory_pressure) >> diff --git a/net/core/sock.c b/net/core/sock.c >> index 5440e67bcfe3..8d215f821ea6 100644 >> --- a/net/core/sock.c >> +++ b/net/core/sock.c >> @@ -3017,13 +3017,14 @@ int __sk_mem_raise_allocated(struct sock *sk, int size, int amt, int kind) >> } >> } >> >> - if (sk_has_memory_pressure(sk)) { >> - u64 alloc; >> - >> - if (!sk_under_memory_pressure(sk)) >> - return 1; >> - alloc = sk_sockets_allocated_read_positive(sk); >> - if (sk_prot_mem_limits(sk, 2) > alloc * >> + if (prot->memory_pressure) { > > I do not understand this patch. > > Changelog is evasive, I do not see what practical problem you want to solve. > > sk_has_memory_pressure() is not about memcg, simply the fact that a > proto has a non NULL memory_pressure pointer. Sorry for failed to provide a reasonable explanation... Would you please check my reply to Paolo? Thanks, Abel
Gentle ping :) On 5/10/23 10:35 PM, Abel Wu wrote: > Hi Paolo, thanks very much for comment! > > On 5/9/23 3:52 PM, Paolo Abeni wrote: >> On Sat, 2023-05-06 at 16:59 +0800, Abel Wu wrote: >>> The commit 180d8cd942ce ("foundations of per-cgroup memory pressure >>> controlling") wrapped proto::memory_pressure status into an accessor >>> named sk_under_memory_pressure(), and in the next commit e1aab161e013 >>> ("socket: initial cgroup code") added the consideration of net-memcg >>> pressure into this accessor. >>> >>> But with the former patch applied, not all of the call sites of >>> sk_under_memory_pressure() are interested in net-memcg's pressure. >>> The __sk_mem_{raise,reduce}_allocated() only focus on proto/netns >>> pressure rather than net-memcg's. >> >> Why do you state the above? The current behavior is established since >> ~12y, arguably we can state quite the opposite. >> >> I think this patch should at least target net-next, and I think we need >> a more detailed reasoning to introduce such behavior change. > > Sorry for failed to provide a reasonable explanation... When @allocated > is no more than tcp_mem[0], the global tcp_mem pressure is gone even if > the socket's memcg is under pressure. > > This reveals that prot::memory_pressure only considers the global tcp > memory pressure, and is irrelevant to the memcg's. IOW if we're updating > prot::memory_pressure or making desicions upon prot::memory_pressure, > the memcg stat should not be considered and sk_under_memory_pressure() > should not be called since it considers both. > >> >>> IOW this accessor are generally >>> used for deciding whether should reclaim or not. >>> >>> Fixes: e1aab161e013 ("socket: initial cgroup code") >>> Signed-off-by: Abel Wu <wuyun.abel@bytedance.com> >>> --- >>> include/net/sock.h | 5 ----- >>> net/core/sock.c | 17 +++++++++-------- >>> 2 files changed, 9 insertions(+), 13 deletions(-) >>> >>> diff --git a/include/net/sock.h b/include/net/sock.h >>> index 8b7ed7167243..752d51030c5a 100644 >>> --- a/include/net/sock.h >>> +++ b/include/net/sock.h >>> @@ -1404,11 +1404,6 @@ static inline int >>> sk_under_cgroup_hierarchy(struct sock *sk, >>> #endif >>> } >>> -static inline bool sk_has_memory_pressure(const struct sock *sk) >>> -{ >>> - return sk->sk_prot->memory_pressure != NULL; >>> -} >>> - >>> static inline bool sk_under_memory_pressure(const struct sock *sk) >>> { >>> if (!sk->sk_prot->memory_pressure) >>> diff --git a/net/core/sock.c b/net/core/sock.c >>> index 5440e67bcfe3..8d215f821ea6 100644 >>> --- a/net/core/sock.c >>> +++ b/net/core/sock.c >>> @@ -3017,13 +3017,14 @@ int __sk_mem_raise_allocated(struct sock *sk, >>> int size, int amt, int kind) >>> } >>> } >>> - if (sk_has_memory_pressure(sk)) { >>> - u64 alloc; >>> - >>> - if (!sk_under_memory_pressure(sk)) >>> - return 1; >>> - alloc = sk_sockets_allocated_read_positive(sk); >>> - if (sk_prot_mem_limits(sk, 2) > alloc * >>> + if (prot->memory_pressure) { >>> + /* >>> + * If under global pressure, allow the sockets that are below >>> + * average memory usage to raise, trying to be fair between all >>> + * the sockets under global constrains. >>> + */ >>> + if (!*prot->memory_pressure || >>> + sk_prot_mem_limits(sk, 2) > >>> sk_sockets_allocated_read_positive(sk) * >> >> The above introduces unrelated changes that makes the code IMHO less >> readable - I don't see a good reason to drop the 'alloc' variable. > Besides drop the @alloc variable, this change also removes the condition > of memcg's pressure from sk_under_memory_pressure() due to the reason > aforementioned. I can re-introduce @alloc in the next version if you > think it makes code more readable. > > Thanks & Best, > Abel >
On Mon, May 15, 2023 at 9:04 AM Abel Wu <wuyun.abel@bytedance.com> wrote: > > Gentle ping :) > I still do not understand the patch. If I do not understand the patch and its changelog now in May 2023, how will anyone understand it later when/if a regression is investigated ? I repeat : Changelog is evasive, I do not see what practical problem you want to solve. sk_has_memory_pressure() is not about memcg, simply the fact that a proto has a non NULL memory_pressure pointer. I suggest that you answer these questions, and send a V2 with an updated changelog. Again, what is the practical problem you want to solve ? What is the behavior of the current stack that you think is a problem ? Thanks.
On 5/15/23 3:21 PM, Eric Dumazet wrote:> > I still do not understand the patch. > > If I do not understand the patch and its changelog now in May 2023, > how will anyone understand it later > when/if a regression is investigated ? > > I repeat : > > Changelog is evasive, I do not see what practical problem you want to solve. > > > sk_has_memory_pressure() is not about memcg, simply the fact that a > proto has a non NULL memory_pressure pointer. Yes, it has nothing to do with sk_has_memory_pressure(), this accessor is removed only due to it is not used anymore after this fix. I really should have put this into a separate patch. > > I suggest that you answer these questions, and send a V2 with an > updated changelog. OK, I will. > > Again, what is the practical problem you want to solve ? > What is the behavior of the current stack that you think is a problem ? The status of global tcp_mem pressure is updated when: a) __sk_mem_raise_allocated(): enter: sk_memory_allocated(sk) > tcp_mem[1] leave: sk_memory_allocated(sk) <= tcp_mem[0] b) __sk_mem_reduce_allocated(): leave: sk_under_memory_pressure(sk) && sk_memory_allocated(sk) < tcp_mem[0] So the conditions of leaving global pressure are inconstant, which may lead to the situation that one pressured memcg prevents the global pressure from being cleared when there is indeed no global pressure, thus the global constrains are still in effect unexpectedly on the other sockets. The patch fixes this by removing the condition of net-memcg's pressure in __sk_mem_reduce_allocated(). As for the changes in __sk_mem_raise_allocated(), I don't think it is the right place to check the pressure of the @sk's memcg. That piece of code was originally only trying to be fair between all the sockets if there is global pressure. And if we really want to forbid the socket memory from being raised when the socket's memcg is in pressure, the condition should be in the first place inside this function. So I plan to split this patch into three in v2: [1/3] fix inconstant condition in __sk_mem_reduce_allocated() [2/3] remove unrelated check in __sk_mem_raise_allocated() [3/3] remove sk_has_memory_pressure() since it is no longer used Does this make sense to you? Thanks & Best, Abel
diff --git a/include/net/sock.h b/include/net/sock.h index 8b7ed7167243..752d51030c5a 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -1404,11 +1404,6 @@ static inline int sk_under_cgroup_hierarchy(struct sock *sk, #endif } -static inline bool sk_has_memory_pressure(const struct sock *sk) -{ - return sk->sk_prot->memory_pressure != NULL; -} - static inline bool sk_under_memory_pressure(const struct sock *sk) { if (!sk->sk_prot->memory_pressure) diff --git a/net/core/sock.c b/net/core/sock.c index 5440e67bcfe3..8d215f821ea6 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -3017,13 +3017,14 @@ int __sk_mem_raise_allocated(struct sock *sk, int size, int amt, int kind) } } - if (sk_has_memory_pressure(sk)) { - u64 alloc; - - if (!sk_under_memory_pressure(sk)) - return 1; - alloc = sk_sockets_allocated_read_positive(sk); - if (sk_prot_mem_limits(sk, 2) > alloc * + if (prot->memory_pressure) { + /* + * If under global pressure, allow the sockets that are below + * average memory usage to raise, trying to be fair between all + * the sockets under global constrains. + */ + if (!*prot->memory_pressure || + sk_prot_mem_limits(sk, 2) > sk_sockets_allocated_read_positive(sk) * sk_mem_pages(sk->sk_wmem_queued + atomic_read(&sk->sk_rmem_alloc) + sk->sk_forward_alloc)) @@ -3095,7 +3096,7 @@ void __sk_mem_reduce_allocated(struct sock *sk, int amount) if (mem_cgroup_sockets_enabled && sk->sk_memcg) mem_cgroup_uncharge_skmem(sk->sk_memcg, amount); - if (sk_under_memory_pressure(sk) && + if (sk->sk_prot->memory_pressure && *sk->sk_prot->memory_pressure && (sk_memory_allocated(sk) < sk_prot_mem_limits(sk, 0))) sk_leave_memory_pressure(sk); }
The commit 180d8cd942ce ("foundations of per-cgroup memory pressure controlling") wrapped proto::memory_pressure status into an accessor named sk_under_memory_pressure(), and in the next commit e1aab161e013 ("socket: initial cgroup code") added the consideration of net-memcg pressure into this accessor. But with the former patch applied, not all of the call sites of sk_under_memory_pressure() are interested in net-memcg's pressure. The __sk_mem_{raise,reduce}_allocated() only focus on proto/netns pressure rather than net-memcg's. IOW this accessor are generally used for deciding whether should reclaim or not. Fixes: e1aab161e013 ("socket: initial cgroup code") Signed-off-by: Abel Wu <wuyun.abel@bytedance.com> --- include/net/sock.h | 5 ----- net/core/sock.c | 17 +++++++++-------- 2 files changed, 9 insertions(+), 13 deletions(-)