diff mbox series

mm/vma: check retry_merge only for new vma case

Message ID 20241118021823.17386-1-richard.weiyang@gmail.com (mailing list archive)
State New
Headers show
Series mm/vma: check retry_merge only for new vma case | expand

Commit Message

Wei Yang Nov. 18, 2024, 2:18 a.m. UTC
Current code logic looks like this:

__mmap_region()
  vma = vma_merge_new_range(&vmg)
  if (!vma)
    __mmap_new_vma(&map, &vma)
      __mmap_new_file_vma(map, vma)
        map->retry_merge = xxx               --- (1)
  if (map.retry_merge)
    vma_merge_existing_range(vmg, &map, vma)

Location (1) is the only place where map.retry_merge is set, this means
it is not necessary to check it if already merged with adjacent vma.

Let's move the check and following operation into new vma case.

Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
CC: Liam R. Howlett <Liam.Howlett@Oracle.com>
CC: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
CC: Vlastimil Babka <vbabka@suse.cz>
CC: Jann Horn <jannh@google.com>
---
 mm/vma.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

Comments

Liam R. Howlett Nov. 18, 2024, 3:56 a.m. UTC | #1
* Wei Yang <richard.weiyang@gmail.com> [241117 21:21]:
> Current code logic looks like this:
> 
> __mmap_region()
>   vma = vma_merge_new_range(&vmg)
>   if (!vma)
>     __mmap_new_vma(&map, &vma)
>       __mmap_new_file_vma(map, vma)
>         map->retry_merge = xxx               --- (1)
>   if (map.retry_merge)
>     vma_merge_existing_range(vmg, &map, vma)

Please don't quote code in your commit log.  We can see the code in the
diff section.

> 
> Location (1) is the only place where map.retry_merge is set, this means
> it is not necessary to check it if already merged with adjacent vma.
> 
> Let's move the check and following operation into new vma case.

This makes sense, but this is a complex block of code.

I'm all for optimisations, but there is already a bug in the code that
you relocated in your patch, and the backport of these changes isn't
even complete.

Maybe we can give the existing code some time to soak before optimising?

> 
> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
> CC: Liam R. Howlett <Liam.Howlett@Oracle.com>
> CC: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> CC: Vlastimil Babka <vbabka@suse.cz>
> CC: Jann Horn <jannh@google.com>
> ---
>  mm/vma.c | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/mm/vma.c b/mm/vma.c
> index 8a454a7bbc80..80b1bd404f23 100644
> --- a/mm/vma.c
> +++ b/mm/vma.c
> @@ -2456,14 +2456,14 @@ unsigned long __mmap_region(struct file *file, unsigned long addr,
>  		error = __mmap_new_vma(&map, &vma);
>  		if (error)
>  			goto unacct_error;
> -	}
>  
> -	/* If flags changed, we might be able to merge, so try again. */
> -	if (map.retry_merge) {
> -		VMG_MMAP_STATE(vmg, &map, vma);
> +		/* If flags changed, we might be able to merge, so try again. */
> +		if (map.retry_merge) {
> +			VMG_MMAP_STATE(vmg, &map, vma);
>  
> -		vma_iter_config(map.vmi, map.addr, map.end);
> -		vma_merge_existing_range(&vmg);
> +			vma_iter_config(map.vmi, map.addr, map.end);
> +			vma_merge_existing_range(&vmg);
> +		}
>  	}
>  
>  	__mmap_complete(&map, vma);
> -- 
> 2.34.1
>
Wei Yang Nov. 18, 2024, 9:31 a.m. UTC | #2
On Sun, Nov 17, 2024 at 10:56:55PM -0500, Liam R. Howlett wrote:
>* Wei Yang <richard.weiyang@gmail.com> [241117 21:21]:
>> Current code logic looks like this:
>> 
>> __mmap_region()
>>   vma = vma_merge_new_range(&vmg)
>>   if (!vma)
>>     __mmap_new_vma(&map, &vma)
>>       __mmap_new_file_vma(map, vma)
>>         map->retry_merge = xxx               --- (1)
>>   if (map.retry_merge)
>>     vma_merge_existing_range(vmg, &map, vma)
>
>Please don't quote code in your commit log.  We can see the code in the
>diff section.
>

Sure, maybe I misunderstand Lorenzo's suggestion in pre-previous review.

Will not add these in change log in the future.

>> 
>> Location (1) is the only place where map.retry_merge is set, this means
>> it is not necessary to check it if already merged with adjacent vma.
>> 
>> Let's move the check and following operation into new vma case.
>
>This makes sense, but this is a complex block of code.
>
>I'm all for optimisations, but there is already a bug in the code that
>you relocated in your patch, and the backport of these changes isn't
>even complete.
>
>Maybe we can give the existing code some time to soak before optimising?
>

Sure, when is a proper time to re-send it if everything is fine?
Lorenzo Stoakes Nov. 18, 2024, 10:04 a.m. UTC | #3
On Mon, Nov 18, 2024 at 02:18:23AM +0000, Wei Yang wrote:
> Current code logic looks like this:
>
> __mmap_region()
>   vma = vma_merge_new_range(&vmg)
>   if (!vma)
>     __mmap_new_vma(&map, &vma)
>       __mmap_new_file_vma(map, vma)
>         map->retry_merge = xxx               --- (1)
>   if (map.retry_merge)
>     vma_merge_existing_range(vmg, &map, vma)
>
> Location (1) is the only place where map.retry_merge is set, this means
> it is not necessary to check it if already merged with adjacent vma.
>
> Let's move the check and following operation into new vma case.
>
> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>

Nack.

As I said to you in another patch, the aim isn't to find the mathematically
smallest size of patch. The aim is to keep things readable and nesting
things like this works against that aim.

Unless you can demonstrate a _significant_ perf improvement for not
checking a boolean here, then you're making the code more complicated than
it needs to be for no gain.

Right now it's (intentionally!) simple - try to merge, ok we can't, try to
map a new VMA. Finally, if the state says retry a merge, do it.

It's easy to read and easy to understand.

Nesting as you say takes away from that.

I also could have gone through this mmap() code and tried to reach some
mathematically perfect level of avoiding unnecessary things, but as this
wasn't the aim, I didn't.

Please try to focus on finding bugs, or _demonstrable_ perf issues rather
than stuff like this.

Thanks!

> CC: Liam R. Howlett <Liam.Howlett@Oracle.com>
> CC: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> CC: Vlastimil Babka <vbabka@suse.cz>
> CC: Jann Horn <jannh@google.com>
> ---
>  mm/vma.c | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/mm/vma.c b/mm/vma.c
> index 8a454a7bbc80..80b1bd404f23 100644
> --- a/mm/vma.c
> +++ b/mm/vma.c
> @@ -2456,14 +2456,14 @@ unsigned long __mmap_region(struct file *file, unsigned long addr,
>  		error = __mmap_new_vma(&map, &vma);
>  		if (error)
>  			goto unacct_error;
> -	}
>
> -	/* If flags changed, we might be able to merge, so try again. */
> -	if (map.retry_merge) {
> -		VMG_MMAP_STATE(vmg, &map, vma);
> +		/* If flags changed, we might be able to merge, so try again. */
> +		if (map.retry_merge) {
> +			VMG_MMAP_STATE(vmg, &map, vma);
>
> -		vma_iter_config(map.vmi, map.addr, map.end);
> -		vma_merge_existing_range(&vmg);
> +			vma_iter_config(map.vmi, map.addr, map.end);
> +			vma_merge_existing_range(&vmg);
> +		}
>  	}
>
>  	__mmap_complete(&map, vma);
> --
> 2.34.1
>
>
Lorenzo Stoakes Nov. 18, 2024, 10:06 a.m. UTC | #4
On Mon, Nov 18, 2024 at 09:31:09AM +0000, Wei Yang wrote:
> On Sun, Nov 17, 2024 at 10:56:55PM -0500, Liam R. Howlett wrote:
> >* Wei Yang <richard.weiyang@gmail.com> [241117 21:21]:
> >> Current code logic looks like this:
> >>
> >> __mmap_region()
> >>   vma = vma_merge_new_range(&vmg)
> >>   if (!vma)
> >>     __mmap_new_vma(&map, &vma)
> >>       __mmap_new_file_vma(map, vma)
> >>         map->retry_merge = xxx               --- (1)
> >>   if (map.retry_merge)
> >>     vma_merge_existing_range(vmg, &map, vma)
> >
> >Please don't quote code in your commit log.  We can see the code in the
> >diff section.
> >
>
> Sure, maybe I misunderstand Lorenzo's suggestion in pre-previous review.
>
> Will not add these in change log in the future.
>
> >>
> >> Location (1) is the only place where map.retry_merge is set, this means
> >> it is not necessary to check it if already merged with adjacent vma.
> >>
> >> Let's move the check and following operation into new vma case.
> >
> >This makes sense, but this is a complex block of code.
> >
> >I'm all for optimisations, but there is already a bug in the code that
> >you relocated in your patch, and the backport of these changes isn't
> >even complete.
> >
> >Maybe we can give the existing code some time to soak before optimising?
> >
>
> Sure, when is a proper time to re-send it if everything is fine?

I've NACKed, so this patch isn't something we want, but to highlight -
ideally when new things like this have been introduced that fundamentally
change core stuff, keep in mind we may be very busy ensuring it settles
down correctly, and really do not want to be constantly reorganising the
code.

So for things that are rejigs or small improvements, it's better to wait
until the thing has landed in mainline and fully settled down.

Bugs of course we want to know about asap, and your bug reports are much
appreciated!

Thanks.

>
>
> --
> Wei Yang
> Help you, Help me
>
Wei Yang Nov. 18, 2024, 12:34 p.m. UTC | #5
On Mon, Nov 18, 2024 at 10:04:44AM +0000, Lorenzo Stoakes wrote:
>On Mon, Nov 18, 2024 at 02:18:23AM +0000, Wei Yang wrote:
>> Current code logic looks like this:
>>
>> __mmap_region()
>>   vma = vma_merge_new_range(&vmg)
>>   if (!vma)
>>     __mmap_new_vma(&map, &vma)
>>       __mmap_new_file_vma(map, vma)
>>         map->retry_merge = xxx               --- (1)
>>   if (map.retry_merge)
>>     vma_merge_existing_range(vmg, &map, vma)
>>
>> Location (1) is the only place where map.retry_merge is set, this means
>> it is not necessary to check it if already merged with adjacent vma.
>>
>> Let's move the check and following operation into new vma case.
>>
>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>
>Nack.
>
>As I said to you in another patch, the aim isn't to find the mathematically
>smallest size of patch. The aim is to keep things readable and nesting
>things like this works against that aim.
>

IMHO, by nesting it, readers will notice this case only happens for new vma
case. So we don't need to retry if vma_merge_new_range() succeed.

It seems to be more readable, but maybe you have other thoughts I missed. If
you would like to share some thoughts, I'd appreciate.

>Unless you can demonstrate a _significant_ perf improvement for not
>checking a boolean here, then you're making the code more complicated than
>it needs to be for no gain.
>
>Right now it's (intentionally!) simple - try to merge, ok we can't, try to
>map a new VMA. Finally, if the state says retry a merge, do it.
>

So you want to make retry_merge a universal check state?

>It's easy to read and easy to understand.
>
>Nesting as you say takes away from that.
>
>I also could have gone through this mmap() code and tried to reach some
>mathematically perfect level of avoiding unnecessary things, but as this
>wasn't the aim, I didn't.
>
>Please try to focus on finding bugs, or _demonstrable_ perf issues rather
>than stuff like this.
>
>Thanks!
>
Lorenzo Stoakes Nov. 18, 2024, 12:52 p.m. UTC | #6
On Mon, Nov 18, 2024 at 12:34:39PM +0000, Wei Yang wrote:
> On Mon, Nov 18, 2024 at 10:04:44AM +0000, Lorenzo Stoakes wrote:
> >On Mon, Nov 18, 2024 at 02:18:23AM +0000, Wei Yang wrote:
> >> Current code logic looks like this:
> >>
> >> __mmap_region()
> >>   vma = vma_merge_new_range(&vmg)
> >>   if (!vma)
> >>     __mmap_new_vma(&map, &vma)
> >>       __mmap_new_file_vma(map, vma)
> >>         map->retry_merge = xxx               --- (1)
> >>   if (map.retry_merge)
> >>     vma_merge_existing_range(vmg, &map, vma)
> >>
> >> Location (1) is the only place where map.retry_merge is set, this means
> >> it is not necessary to check it if already merged with adjacent vma.
> >>
> >> Let's move the check and following operation into new vma case.
> >>
> >> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
> >
> >Nack.
> >
> >As I said to you in another patch, the aim isn't to find the mathematically
> >smallest size of patch. The aim is to keep things readable and nesting
> >things like this works against that aim.
> >
>
> IMHO, by nesting it, readers will notice this case only happens for new vma
> case. So we don't need to retry if vma_merge_new_range() succeed.

By this logic we should not split up the functions any more, because the
reader won't know that it's only if you create a new VMA AND it's
file-backed AND that succeeded AND etc. etc. etc.

So by your logic, we should undo every refactoring here since we need the
reader to understand precisely the conditions under which this can occur
right?

Or where do we stop? Because now it's 'misleading' to imply it might always
happen in this situation without stipulating that there are other
conditions?

Do you see how this breaks down?

We encapsulate the circumstances under which a retry_merge occurs, using
the map.retry_merge variable. That's it.

It's simple, it documents itself, and a reader wishing to know when this
happens can very easily find out.

The purpose of this code is - again - not to be as small as possible or to
avoid all possible branches (unless perf numbers guide us to do so) - it's
to be readable.

In the original refactoring I can see:

- prepare for mmap
- try to merge
- if merge failed, create new
- maybe do a deferred merge
- mmap complete, do 'after mmap' tasks.

In your proposed code I see:

- prepare for mmap
- try to merge
- if merge failed, create new
  - if merge failed, created new, and we required a deferred merge, do a deferred merge
- mmap complete, do 'after mmap' tasks.

It's about humans reading this and being able to understand what is going
on.

>
> It seems to be more readable, but maybe you have other thoughts I missed. If
> you would like to share some thoughts, I'd appreciate.
>
> >Unless you can demonstrate a _significant_ perf improvement for not
> >checking a boolean here, then you're making the code more complicated than
> >it needs to be for no gain.
> >
> >Right now it's (intentionally!) simple - try to merge, ok we can't, try to
> >map a new VMA. Finally, if the state says retry a merge, do it.
> >
>
> So you want to make retry_merge a universal check state?

I don't know what a 'universal check state' is? You mean we check it
regardless of whether we previously merged or not, well obviously I do want
to do that, but for readability reasons as outlined above, and previously.

>
> >It's easy to read and easy to understand.
> >
> >Nesting as you say takes away from that.
> >
> >I also could have gone through this mmap() code and tried to reach some
> >mathematically perfect level of avoiding unnecessary things, but as this
> >wasn't the aim, I didn't.
> >
> >Please try to focus on finding bugs, or _demonstrable_ perf issues rather
> >than stuff like this.
> >
> >Thanks!
> >
>
> --
> Wei Yang
> Help you, Help me
>

I would strongly suggest you look at mmap_region() prior to my refactorings
(say in v6.11) and after, and honestly assess whether you think the new
version is more readable than the previous one.

And consider that we have run into _countless_ bugs because of this
complexity, much of which very much tried to do exactly what you are trying
to do here.

Again, we appreciate your help, but I strongly suggest you focus on bugs
and performance optimisations (backed by numbers), rather than nitpicking
things like this when the code is incredibly new.
Wei Yang Nov. 19, 2024, 12:59 a.m. UTC | #7
On Mon, Nov 18, 2024 at 12:52:48PM +0000, Lorenzo Stoakes wrote:
>On Mon, Nov 18, 2024 at 12:34:39PM +0000, Wei Yang wrote:
>> On Mon, Nov 18, 2024 at 10:04:44AM +0000, Lorenzo Stoakes wrote:
>> >On Mon, Nov 18, 2024 at 02:18:23AM +0000, Wei Yang wrote:
>> >> Current code logic looks like this:
>> >>
>> >> __mmap_region()
>> >>   vma = vma_merge_new_range(&vmg)
>> >>   if (!vma)
>> >>     __mmap_new_vma(&map, &vma)
>> >>       __mmap_new_file_vma(map, vma)
>> >>         map->retry_merge = xxx               --- (1)
>> >>   if (map.retry_merge)
>> >>     vma_merge_existing_range(vmg, &map, vma)
>> >>
>> >> Location (1) is the only place where map.retry_merge is set, this means
>> >> it is not necessary to check it if already merged with adjacent vma.
>> >>
>> >> Let's move the check and following operation into new vma case.
>> >>
>> >> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>> >
>> >Nack.
>> >
>> >As I said to you in another patch, the aim isn't to find the mathematically
>> >smallest size of patch. The aim is to keep things readable and nesting
>> >things like this works against that aim.
>> >
>>
>> IMHO, by nesting it, readers will notice this case only happens for new vma
>> case. So we don't need to retry if vma_merge_new_range() succeed.
>
>By this logic we should not split up the functions any more, because the
>reader won't know that it's only if you create a new VMA AND it's
>file-backed AND that succeeded AND etc. etc. etc.
>
>So by your logic, we should undo every refactoring here since we need the
>reader to understand precisely the conditions under which this can occur
>right?
>
>Or where do we stop? Because now it's 'misleading' to imply it might always
>happen in this situation without stipulating that there are other
>conditions?
>
>Do you see how this breaks down?
>
>We encapsulate the circumstances under which a retry_merge occurs, using
>the map.retry_merge variable. That's it.
>
>It's simple, it documents itself, and a reader wishing to know when this
>happens can very easily find out.
>
>The purpose of this code is - again - not to be as small as possible or to
>avoid all possible branches (unless perf numbers guide us to do so) - it's
>to be readable.
>
>In the original refactoring I can see:
>
>- prepare for mmap
>- try to merge
>- if merge failed, create new
>- maybe do a deferred merge
>- mmap complete, do 'after mmap' tasks.

Thanks for your explanation. This looks neat.

So it hide the reason of deferred merge from user, we would think both two
cases would lead to a deferred merge. This is what we want to have, right?

>
>In your proposed code I see:
>
>- prepare for mmap
>- try to merge
>- if merge failed, create new
>  - if merge failed, created new, and we required a deferred merge, do a deferred merge

Maybe here is

   - maybe do a deferred merge

>- mmap complete, do 'after mmap' tasks.
>
>It's about humans reading this and being able to understand what is going
>on.
>
Lorenzo Stoakes Nov. 19, 2024, 10:14 a.m. UTC | #8
On Tue, Nov 19, 2024 at 12:59:23AM +0000, Wei Yang wrote:
>
> Thanks for your explanation. This looks neat.
>
> So it hide the reason of deferred merge from user, we would think both two
> cases would lead to a deferred merge. This is what we want to have, right?
>
> >
> >In your proposed code I see:
> >
> >- prepare for mmap
> >- try to merge
> >- if merge failed, create new
> >  - if merge failed, created new, and we required a deferred merge, do a deferred merge
>
> Maybe here is
>
>    - maybe do a deferred merge
>
> >- mmap complete, do 'after mmap' tasks.
> >
> >It's about humans reading this and being able to understand what is going
> >on.
> >
> --
> Wei Yang
> Help you, Help me
>

With respect Wei, I feel like I've already addressed all this, and I don't
want to go around in circles! :)
diff mbox series

Patch

diff --git a/mm/vma.c b/mm/vma.c
index 8a454a7bbc80..80b1bd404f23 100644
--- a/mm/vma.c
+++ b/mm/vma.c
@@ -2456,14 +2456,14 @@  unsigned long __mmap_region(struct file *file, unsigned long addr,
 		error = __mmap_new_vma(&map, &vma);
 		if (error)
 			goto unacct_error;
-	}
 
-	/* If flags changed, we might be able to merge, so try again. */
-	if (map.retry_merge) {
-		VMG_MMAP_STATE(vmg, &map, vma);
+		/* If flags changed, we might be able to merge, so try again. */
+		if (map.retry_merge) {
+			VMG_MMAP_STATE(vmg, &map, vma);
 
-		vma_iter_config(map.vmi, map.addr, map.end);
-		vma_merge_existing_range(&vmg);
+			vma_iter_config(map.vmi, map.addr, map.end);
+			vma_merge_existing_range(&vmg);
+		}
 	}
 
 	__mmap_complete(&map, vma);