diff mbox series

[net-next,v2,3/3] sock: Fix improper heuristic on raising memory

Message ID 20231016132812.63703-3-wuyun.abel@bytedance.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Series [net-next,v2,1/3] sock: Code cleanup on __sk_mem_raise_allocated()

Checks

Context                         Check    Description
netdev/series_format            warning  Series does not have a cover letter
netdev/tree_selection           success  Clearly marked for net-next
netdev/fixes_present            success  Fixes tag not required for -next series
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              success  Errors and warnings before: 1366 this patch: 1366
netdev/cc_maintainers           fail     2 blamed authors not CCed: kamezawa.hiroyu@jp.fujtsu.com glommer@parallels.com; 2 maintainers not CCed: kamezawa.hiroyu@jp.fujtsu.com glommer@parallels.com
netdev/build_clang              success  Errors and warnings before: 1386 this patch: 1386
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/deprecated_api           success  None detected
netdev/check_selftest           success  No net selftest shell script
netdev/verify_fixes             success  Fixes tag looks correct
netdev/build_allmodconfig_warn  success  Errors and warnings before: 1391 this patch: 1391
netdev/checkpatch               warning  WARNING: line length of 81 exceeds 80 columns
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/source_inline            success  Was 0 now: 0

Commit Message

Abel Wu Oct. 16, 2023, 1:28 p.m. UTC
Before sockets became aware of net-memcg's memory pressure since
commit e1aab161e013 ("socket: initial cgroup code."), the memory
usage would be granted to raise if below average even when under
protocol's pressure. This provides fairness among the sockets of
same protocol.

That commit changes this because the heuristic will also be
effective when only memcg is under pressure which makes no sense.
Fix this by reverting to the behavior before that commit.

After this fix, __sk_mem_raise_allocated() no longer considers
memcg's pressure. As memcgs are isolated from each other w.r.t.
memory accounting, consuming one's budget won't affect others.
So except the places where buffer sizes are needed to be tuned,
allow workloads to use the memory they are provisioned.

Fixes: e1aab161e013 ("socket: initial cgroup code.")
Signed-off-by: Abel Wu <wuyun.abel@bytedance.com>
---
v2:
  - Ignore memcg pressure when raising memory allocated.
---
 net/core/sock.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)
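
Editorial sketch of the behavior change described in the commit message
above. This is a minimal user-space model, not kernel code: struct
model_sock, below_average(), step_grants_old() and step_grants_new() are
hypothetical names introduced only for illustration, and the model assumes
that the pre-patch check fired when either the protocol or the socket's
memcg was under pressure. It covers only the single "average" step; in the
real function, failing this step falls through to further logic that is not
modeled here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the state this step consults.
 * None of these names are kernel APIs.
 */
struct model_sock {
	bool global_pressure;      /* protocol-wide memory pressure */
	bool memcg_pressure;       /* pressure inside the socket's memcg */
	uint64_t hard_limit_pages; /* protocol-wide hard limit, in pages */
	uint64_t nr_sockets;       /* sockets allocated for the protocol */
	uint64_t sock_pages;       /* pages currently held by this socket */
};

/* 'Average' heuristic: true if every socket of the protocol could hold
 * as many pages as this one without exceeding the hard limit.
 */
static bool below_average(const struct model_sock *s)
{
	return s->hard_limit_pages > s->nr_sockets * s->sock_pages;
}

/* Assumed pre-patch behavior: memcg pressure alone is enough to make
 * the global fairness heuristic decide the outcome of this step.
 */
static bool step_grants_old(const struct model_sock *s)
{
	if (!(s->global_pressure || s->memcg_pressure))
		return true;
	return below_average(s);
}

/* Post-patch behavior as described in the commit message: only global
 * pressure triggers the heuristic; memcg pressure is ignored here.
 */
static bool step_grants_new(const struct model_sock *s)
{
	if (!s->global_pressure)
		return true;
	return below_average(s);
}
```

In this model, a socket that is above average while only its memcg is under
pressure is not granted by step_grants_old() but is granted by
step_grants_new(). Under global pressure the two functions agree.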

Comments

Shakeel Butt Oct. 16, 2023, 3:52 p.m. UTC | #1
On Mon, Oct 16, 2023 at 6:28 AM Abel Wu <wuyun.abel@bytedance.com> wrote:
>
> Before sockets became aware of net-memcg's memory pressure since
> commit e1aab161e013 ("socket: initial cgroup code."), the memory
> usage would be granted to raise if below average even when under
> protocol's pressure. This provides fairness among the sockets of
> same protocol.
>
> That commit changes this because the heuristic will also be
> effective when only memcg is under pressure which makes no sense.
> Fix this by reverting to the behavior before that commit.
>
> After this fix, __sk_mem_raise_allocated() no longer considers
> memcg's pressure. As memcgs are isolated from each other w.r.t.
> memory accounting, consuming one's budget won't affect others.
> So except the places where buffer sizes are needed to be tuned,
> allow workloads to use the memory they are provisioned.
>
> Fixes: e1aab161e013 ("socket: initial cgroup code.")
> Signed-off-by: Abel Wu <wuyun.abel@bytedance.com>

Acked-by: Shakeel Butt <shakeelb@google.com>

Paolo Abeni Oct. 19, 2023, 8:02 a.m. UTC | #2
On Mon, 2023-10-16 at 21:28 +0800, Abel Wu wrote:
> Before sockets became aware of net-memcg's memory pressure since
> commit e1aab161e013 ("socket: initial cgroup code."), the memory
> usage would be granted to raise if below average even when under
> protocol's pressure. This provides fairness among the sockets of
> same protocol.
> 
> That commit changes this because the heuristic will also be
> effective when only memcg is under pressure which makes no sense.
> Fix this by reverting to the behavior before that commit.
> 
> After this fix, __sk_mem_raise_allocated() no longer considers
> memcg's pressure. As memcgs are isolated from each other w.r.t.
> memory accounting, consuming one's budget won't affect others.
> So except the places where buffer sizes are needed to be tuned,
> allow workloads to use the memory they are provisioned.
> 
> Fixes: e1aab161e013 ("socket: initial cgroup code.")
> Signed-off-by: Abel Wu <wuyun.abel@bytedance.com>
> ---
> v2:
>   - Ignore memcg pressure when raising memory allocated.
> ---
>  net/core/sock.c | 14 ++++++++++++--
>  1 file changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/net/core/sock.c b/net/core/sock.c
> index 9f969e3c2ddf..1d28e3e87970 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -3035,7 +3035,13 @@ EXPORT_SYMBOL(sk_wait_data);
>   *	@amt: pages to allocate
>   *	@kind: allocation type
>   *
> - *	Similar to __sk_mem_schedule(), but does not update sk_forward_alloc
> + *	Similar to __sk_mem_schedule(), but does not update sk_forward_alloc.
> + *
> + *	Unlike the globally shared limits among the sockets under same protocol,
> + *	consuming the budget of a memcg won't have direct effect on other ones.
> + *	So be optimistic about memcg's tolerance, and leave the callers to decide
> + *	whether or not to raise allocated through sk_under_memory_pressure() or
> + *	its variants.
>   */
>  int __sk_mem_raise_allocated(struct sock *sk, int size, int amt, int kind)
>  {
> @@ -3093,7 +3099,11 @@ int __sk_mem_raise_allocated(struct sock *sk, int size, int amt, int kind)
>  	if (sk_has_memory_pressure(sk)) {
>  		u64 alloc;
>  
> -		if (!sk_under_memory_pressure(sk))
> +		/* The following 'average' heuristic is within the
> +		 * scope of global accounting, so it only makes
> +		 * sense for global memory pressure.
> +		 */
> +		if (!sk_under_global_memory_pressure(sk))
>  			return 1;

Since the whole logic is fairly non-trivial, I'd like to explicitly note
(for my own future memory) that I think this is the correct approach.

The memcg granted the current allocation via the
mem_cgroup_charge_skmem() call above, so the heuristic that may
eventually suppress the allocation should be outside the memcg scope.

LGTM, thanks!

Paolo

Paolo Abeni Oct. 19, 2023, 8:53 a.m. UTC | #3
On Mon, 2023-10-16 at 21:28 +0800, Abel Wu wrote:
> Before sockets became aware of net-memcg's memory pressure since
> commit e1aab161e013 ("socket: initial cgroup code."), the memory
> usage would be granted to raise if below average even when under
> protocol's pressure. This provides fairness among the sockets of
> same protocol.
> 
> That commit changes this because the heuristic will also be
> effective when only memcg is under pressure which makes no sense.
> Fix this by reverting to the behavior before that commit.
> 
> After this fix, __sk_mem_raise_allocated() no longer considers
> memcg's pressure. As memcgs are isolated from each other w.r.t.
> memory accounting, consuming one's budget won't affect others.
> So except the places where buffer sizes are needed to be tuned,
> allow workloads to use the memory they are provisioned.
> 
> Fixes: e1aab161e013 ("socket: initial cgroup code.")

I think it's better to drop this Fixes tag. This is a functional change,
and with such a tag at this point of the cycle it will soon land in
every stable tree. That does not feel appropriate.

Please repost without the tag, thanks!

You can send the change to the stable trees later, if needed.

Paolo

Abel Wu Oct. 19, 2023, 11:21 a.m. UTC | #4
On 10/16/23 11:52 PM, Shakeel Butt Wrote:
> On Mon, Oct 16, 2023 at 6:28 AM Abel Wu <wuyun.abel@bytedance.com> wrote:
>>
>> Before sockets became aware of net-memcg's memory pressure since
>> commit e1aab161e013 ("socket: initial cgroup code."), the memory
>> usage would be granted to raise if below average even when under
>> protocol's pressure. This provides fairness among the sockets of
>> same protocol.
>>
>> That commit changes this because the heuristic will also be
>> effective when only memcg is under pressure which makes no sense.
>> Fix this by reverting to the behavior before that commit.
>>
>> After this fix, __sk_mem_raise_allocated() no longer considers
>> memcg's pressure. As memcgs are isolated from each other w.r.t.
>> memory accounting, consuming one's budget won't affect others.
>> So except the places where buffer sizes are needed to be tuned,
>> allow workloads to use the memory they are provisioned.
>>
>> Fixes: e1aab161e013 ("socket: initial cgroup code.")
>> Signed-off-by: Abel Wu <wuyun.abel@bytedance.com>
> 
> Acked-by: Shakeel Butt <shakeelb@google.com>

Thanks!

Abel Wu Oct. 19, 2023, 11:23 a.m. UTC | #5
On 10/19/23 4:53 PM, Paolo Abeni Wrote:
> On Mon, 2023-10-16 at 21:28 +0800, Abel Wu wrote:
>> Before sockets became aware of net-memcg's memory pressure since
>> commit e1aab161e013 ("socket: initial cgroup code."), the memory
>> usage would be granted to raise if below average even when under
>> protocol's pressure. This provides fairness among the sockets of
>> same protocol.
>>
>> That commit changes this because the heuristic will also be
>> effective when only memcg is under pressure which makes no sense.
>> Fix this by reverting to the behavior before that commit.
>>
>> After this fix, __sk_mem_raise_allocated() no longer considers
>> memcg's pressure. As memcgs are isolated from each other w.r.t.
>> memory accounting, consuming one's budget won't affect others.
>> So except the places where buffer sizes are needed to be tuned,
>> allow workloads to use the memory they are provisioned.
>>
>> Fixes: e1aab161e013 ("socket: initial cgroup code.")
> 
> I think it's better to drop this Fixes tag. This is a functional change,
> and with such a tag at this point of the cycle it will soon land in
> every stable tree. That does not feel appropriate.
> 
> Please repost without the tag, thanks!
> 
> You can send the change to the stable trees later, if needed.

OK. Shall I add an Acked-by tag for you?

Thanks!
	Abel

Paolo Abeni Oct. 19, 2023, 11:41 a.m. UTC | #6
On Thu, 2023-10-19 at 19:23 +0800, Abel Wu wrote:
> On 10/19/23 4:53 PM, Paolo Abeni Wrote:
> > On Mon, 2023-10-16 at 21:28 +0800, Abel Wu wrote:
> > > Before sockets became aware of net-memcg's memory pressure since
> > > commit e1aab161e013 ("socket: initial cgroup code."), the memory
> > > usage would be granted to raise if below average even when under
> > > protocol's pressure. This provides fairness among the sockets of
> > > same protocol.
> > > 
> > > That commit changes this because the heuristic will also be
> > > effective when only memcg is under pressure which makes no sense.
> > > Fix this by reverting to the behavior before that commit.
> > > 
> > > After this fix, __sk_mem_raise_allocated() no longer considers
> > > memcg's pressure. As memcgs are isolated from each other w.r.t.
> > > memory accounting, consuming one's budget won't affect others.
> > > So except the places where buffer sizes are needed to be tuned,
> > > allow workloads to use the memory they are provisioned.
> > > 
> > > Fixes: e1aab161e013 ("socket: initial cgroup code.")
> > 
> > I think it's better to drop this Fixes tag. This is a functional change,
> > and with such a tag at this point of the cycle it will soon land in
> > every stable tree. That does not feel appropriate.
> > 
> > Please repost without the tag, thanks!
> > 
> > You can send the change to the stable trees later, if needed.
> 
> OK. Shall I add an Acked-by tag for you?

Let's be formal:

Acked-by: Paolo Abeni <pabeni@redhat.com>

/P

Patch

diff --git a/net/core/sock.c b/net/core/sock.c
index 9f969e3c2ddf..1d28e3e87970 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -3035,7 +3035,13 @@  EXPORT_SYMBOL(sk_wait_data);
  *	@amt: pages to allocate
  *	@kind: allocation type
  *
- *	Similar to __sk_mem_schedule(), but does not update sk_forward_alloc
+ *	Similar to __sk_mem_schedule(), but does not update sk_forward_alloc.
+ *
+ *	Unlike the globally shared limits among the sockets under same protocol,
+ *	consuming the budget of a memcg won't have direct effect on other ones.
+ *	So be optimistic about memcg's tolerance, and leave the callers to decide
+ *	whether or not to raise allocated through sk_under_memory_pressure() or
+ *	its variants.
  */
 int __sk_mem_raise_allocated(struct sock *sk, int size, int amt, int kind)
 {
@@ -3093,7 +3099,11 @@  int __sk_mem_raise_allocated(struct sock *sk, int size, int amt, int kind)
 	if (sk_has_memory_pressure(sk)) {
 		u64 alloc;
 
-		if (!sk_under_memory_pressure(sk))
+		/* The following 'average' heuristic is within the
+		 * scope of global accounting, so it only makes
+		 * sense for global memory pressure.
+		 */
+		if (!sk_under_global_memory_pressure(sk))
 			return 1;
 
 		/* Try to be fair among all the sockets under global