
[1/1] mm/mmap: allow MAP_DROPPABLE | MAP_PRIVATE in mmap()

Message ID 20250120012607.4808-1-ioworker0@gmail.com (mailing list archive)
State New

Commit Message

Lance Yang Jan. 20, 2025, 1:26 a.m. UTC
Currently, mmap() fails with `-EINVAL` when both MAP_DROPPABLE and
MAP_PRIVATE are specified. This behavior might be inconsistent, as the
implementation of MAP_DROPPABLE under the hood already includes the
semantics of MAP_PRIVATE. So, IMO, whether MAP_PRIVATE is explicitly
specified or not, it should work as expected.

For example, when mmap() is called with `MAP_DROPPABLE | MAP_ANONYMOUS`,
it creates a private anonymous mapping. Users can verify this behavior
via `/proc/self/smaps`, where the resulting VMA is marked with the `dp`
(MAP_DROPPABLE) flag, and the `Private_*` fields confirm private memory
semantics. The output for a 2MiB mapping with these flags might look like:

```
f433ace00000-f433ad000000 rw-p 00000000 00:00 0
Size:               2048 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB
Rss:                2048 kB
Pss:                2048 kB
Pss_Dirty:          2048 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:      2048 kB
Referenced:         2048 kB
Anonymous:          2048 kB
...
VmFlags: rd wr mr mw me nr wf dd dp
```
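
Output like the above could be reproduced with a sketch along these lines. It assumes a 64-bit kernel with MAP_DROPPABLE support; `vmflags_of()` and `show_droppable_vmflags()` are hypothetical helper names, and the fallback flag value is an assumption taken from the uapi headers:

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

#ifndef MAP_DROPPABLE
#define MAP_DROPPABLE 0x08 /* assumption: value from the Linux uapi headers */
#endif

/*
 * Copy the VmFlags line of the VMA containing addr into out.
 * Returns 0 on success, -1 if smaps is unreadable or no VMA matches.
 */
static int vmflags_of(const void *addr, char *out, size_t outlen)
{
	FILE *f = fopen("/proc/self/smaps", "r");
	char line[512];
	int in_vma = 0, rc = -1;

	if (!f)
		return -1;
	while (fgets(line, sizeof(line), f)) {
		unsigned long start, end;

		/* VMA header lines begin with "start-end" in hex. */
		if (sscanf(line, "%lx-%lx", &start, &end) == 2) {
			in_vma = (unsigned long)addr >= start &&
				 (unsigned long)addr < end;
		} else if (in_vma && strncmp(line, "VmFlags:", 8) == 0) {
			snprintf(out, outlen, "%s", line);
			rc = 0;
			break;
		}
	}
	fclose(f);
	return rc;
}

/*
 * Create a droppable mapping, touch it, and fetch its VmFlags line.
 * Returns 0 on success, 1 if mmap() rejects the flags, -1 on other errors.
 */
static int show_droppable_vmflags(size_t len, char *out, size_t outlen)
{
	char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_DROPPABLE | MAP_ANONYMOUS, -1, 0);
	int rc;

	if (p == MAP_FAILED)
		return 1;
	memset(p, 1, len); /* fault the pages in so Rss is populated */
	rc = vmflags_of(p, out, outlen);
	munmap(p, len);
	return rc;
}
```

Calling `show_droppable_vmflags(2UL << 20, buf, sizeof(buf))` and printing `buf` should give a VmFlags line that includes `dp` when the kernel accepts the mapping.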

This patch changes mmap() to allow the combination of `MAP_DROPPABLE |
MAP_PRIVATE`. For mmap(), at least one of MAP_PRIVATE or MAP_SHARED could
be explicitly specified, regardless of the combination with other `MAP_*`
flags.

Fixes: 9651fcedf7b9 ("mm: add MAP_DROPPABLE for designating always lazily freeable mappings")
Signed-off-by: Mingzhe Yang <mingzhe.yang@ly.com>
Signed-off-by: Lance Yang <ioworker0@gmail.com>
---
 mm/mmap.c | 1 +
 1 file changed, 1 insertion(+)

Comments

David Hildenbrand Jan. 20, 2025, 7:45 a.m. UTC | #1
On 20.01.25 02:26, Lance Yang wrote:
> Currently, mmap() fails with `-EINVAL` when both MAP_DROPPABLE and
> MAP_PRIVATE are specified. This behavior might be inconsistent, as the
> implementation of MAP_DROPPABLE under the hood already includes the
> semantics of MAP_PRIVATE. So, IMO, whether MAP_PRIVATE is explicitly
> specified or not, it should work as expected.
> 
> For example, when mmap() is called with `MAP_DROPPABLE | MAP_ANONYMOUS`,
> it creates a private anonymous mapping. Users can verify this behavior
> via `/proc/self/smaps`, where the resulting VMA is marked with the `dp`
> (MAP_DROPPABLE) flag, and the `Private_*` fields confirm private memory
> semantics. The output for a 2MiB mapping with these flags might look like:

Note that "Private_" in the stats has *nothing* to do with MAP_PRIVATE.
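
A rough way to check this, assuming smaps classifies a page as private or shared by how many processes currently map it rather than by the mmap() flags: touch a `MAP_SHARED | MAP_ANONYMOUS` mapping from a single process and read its `Private_Dirty` field. The helper names below are hypothetical:

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Return the value in kB of the named smaps field for the VMA that
 * contains addr, or -1 if smaps is unreadable or the field is missing.
 */
static long smaps_field_kb(const void *addr, const char *field)
{
	FILE *f = fopen("/proc/self/smaps", "r");
	char line[512];
	size_t flen = strlen(field);
	long kb = -1;
	int in_vma = 0;

	if (!f)
		return -1;
	while (fgets(line, sizeof(line), f)) {
		unsigned long start, end;

		/* VMA header lines begin with "start-end" in hex. */
		if (sscanf(line, "%lx-%lx", &start, &end) == 2) {
			in_vma = (unsigned long)addr >= start &&
				 (unsigned long)addr < end;
		} else if (in_vma && strncmp(line, field, flen) == 0 &&
			   line[flen] == ':') {
			if (sscanf(line + flen + 1, "%ld", &kb) != 1)
				kb = -1;
			break;
		}
	}
	fclose(f);
	return kb;
}

/*
 * Map len bytes MAP_SHARED | MAP_ANONYMOUS, dirty every page, and
 * return the requested smaps field in kB (-1 on failure).
 */
static long shared_anon_field_kb(size_t len, const char *field)
{
	char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	long kb;

	if (p == MAP_FAILED)
		return -1;
	memset(p, 1, len);
	kb = smaps_field_kb(p, field);
	munmap(p, len);
	return kb;
}
```

If the assumption holds, `shared_anon_field_kb(2UL << 20, "Private_Dirty")` should report a non-zero value even though the mapping was created with MAP_SHARED.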

> 
> ```
> f433ace00000-f433ad000000 rw-p 00000000 00:00 0
> Size:               2048 kB
> KernelPageSize:        4 kB
> MMUPageSize:           4 kB
> Rss:                2048 kB
> Pss:                2048 kB
> Pss_Dirty:          2048 kB
> Shared_Clean:          0 kB
> Shared_Dirty:          0 kB
> Private_Clean:         0 kB
> Private_Dirty:      2048 kB
> Referenced:         2048 kB
> Anonymous:          2048 kB
> ...
> VmFlags: rd wr mr mw me nr wf dd dp
> ```
> 
> This patch changes mmap() to allow the combination of `MAP_DROPPABLE |
> MAP_PRIVATE`. For mmap(), at least one of MAP_PRIVATE or MAP_SHARED could
> be explicitly specified, regardless of the combination with other `MAP_*`
> flags.
> 
> Fixes: 9651fcedf7b9 ("mm: add MAP_DROPPABLE for designating always lazily freeable mappings")

"How about we just say that VM_DROPPABLE really is something separate
from MAP_PRIVATE or MAP_SHARED..

And then we make the rule be that VM_DROPPABLE is never dumped and
always dropped on fork, just to make things simpler." [1]

[1] https://lore.kernel.org/linux-mm/CAHk-=wi=XvCZ9r897LjEb4ZarLzLtKN1p+Fyig+F2fmQDF8GSA@mail.gmail.com/
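
The "always dropped on fork" rule quoted above can be observed from userspace with a small sketch, assuming a 64-bit kernel with MAP_DROPPABLE support (the `wf` flag in the VmFlags output is the wipe-on-fork marker). `child_view_after_fork()` is a hypothetical helper name, and the fallback flag value is an assumption taken from the uapi headers:

```c
#define _GNU_SOURCE
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef MAP_DROPPABLE
#define MAP_DROPPABLE 0x08 /* assumption: value from the Linux uapi headers */
#endif

/*
 * Write a marker byte into a droppable mapping, fork, and return the
 * byte the child sees at the same address. Returns -1 if the mapping
 * cannot be created or the child cannot be run.
 */
static int child_view_after_fork(void)
{
	unsigned char *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
				MAP_DROPPABLE | MAP_ANONYMOUS, -1, 0);
	int status, ret = -1;
	pid_t pid;

	if (p == MAP_FAILED)
		return -1;
	p[0] = 0xAA;

	pid = fork();
	if (pid == 0)
		_exit(p[0]); /* exit status carries the byte the child read */
	if (pid > 0 && waitpid(pid, &status, 0) == pid && WIFEXITED(status))
		ret = WEXITSTATUS(status);

	munmap(p, 4096);
	return ret;
}
```

With wipe-on-fork in effect the child is expected to read 0 rather than 0xAA, because the droppable pages are not copied into the child.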

So, nack from my side.
Lorenzo Stoakes Jan. 20, 2025, 10:38 a.m. UTC | #2
Agree with David, NACK.

On Mon, Jan 20, 2025 at 08:45:07AM +0100, David Hildenbrand wrote:
> On 20.01.25 02:26, Lance Yang wrote:
> > Currently, mmap() fails with `-EINVAL` when both MAP_DROPPABLE and
> > MAP_PRIVATE are specified. This behavior might be inconsistent, as the
> > implementation of MAP_DROPPABLE under the hood already includes the
> > semantics of MAP_PRIVATE. So, IMO, whether MAP_PRIVATE is explicitly
> > specified or not, it should work as expected.
> >
> > For example, when mmap() is called with `MAP_DROPPABLE | MAP_ANONYMOUS`,
> > it creates a private anonymous mapping. Users can verify this behavior
> > via `/proc/self/smaps`, where the resulting VMA is marked with the `dp`
> > (MAP_DROPPABLE) flag, and the `Private_*` fields confirm private memory
> > semantics. The output for a 2MiB mapping with these flags might look like:
>
> Note that "Private_" in the stats has *nothing* to do with MAP_PRIVATE.
>
> >
> > ```
> > f433ace00000-f433ad000000 rw-p 00000000 00:00 0
> > Size:               2048 kB
> > KernelPageSize:        4 kB
> > MMUPageSize:           4 kB
> > Rss:                2048 kB
> > Pss:                2048 kB
> > Pss_Dirty:          2048 kB
> > Shared_Clean:          0 kB
> > Shared_Dirty:          0 kB
> > Private_Clean:         0 kB
> > Private_Dirty:      2048 kB
> > Referenced:         2048 kB
> > Anonymous:          2048 kB
> > ...
> > VmFlags: rd wr mr mw me nr wf dd dp
> > ```
> >
> > This patch changes mmap() to allow the combination of `MAP_DROPPABLE |
> > MAP_PRIVATE`. For mmap(), at least one of MAP_PRIVATE or MAP_SHARED could
> > be explicitly specified, regardless of the combination with other `MAP_*`
> > flags.
> >
> > Fixes: 9651fcedf7b9 ("mm: add MAP_DROPPABLE for designating always lazily freeable mappings")
>
> "How about we just say that VM_DROPPABLE really is something separate
> from MAP_PRIVATE or MAP_SHARED..

Which is also how I view it. I -really- do not want to add a weird situation
where people wonder whether _not_ setting MAP_PRIVATE implies some different
semantics.

This mode is aggregate in behaviour by design and intended to be _specifically_
asked for, not in conjunction with other map flags.

>
> And then we make the rule be that VM_DROPPABLE is never dumped and
> always dropped on fork, just to make things simpler." [1]

Yup.

>
> [1] https://lore.kernel.org/linux-mm/CAHk-=wi=XvCZ9r897LjEb4ZarLzLtKN1p+Fyig+F2fmQDF8GSA@mail.gmail.com/
>
> So, nack from my side.

Also, mine.

>
> --
> Cheers,
>
> David / dhildenb
>
Lance Yang Jan. 20, 2025, 1:32 p.m. UTC | #3
Hi David and Lorenzo,

On Mon, Jan 20, 2025 at 6:38 PM Lorenzo Stoakes
<lorenzo.stoakes@oracle.com> wrote:
>
> Agree with David, NACK.
>
> On Mon, Jan 20, 2025 at 08:45:07AM +0100, David Hildenbrand wrote:
> > On 20.01.25 02:26, Lance Yang wrote:
> > > Currently, mmap() fails with `-EINVAL` when both MAP_DROPPABLE and
> > > MAP_PRIVATE are specified. This behavior might be inconsistent, as the
> > > implementation of MAP_DROPPABLE under the hood already includes the
> > > semantics of MAP_PRIVATE. So, IMO, whether MAP_PRIVATE is explicitly
> > > specified or not, it should work as expected.
> > >
> > > For example, when mmap() is called with `MAP_DROPPABLE | MAP_ANONYMOUS`,
> > > it creates a private anonymous mapping. Users can verify this behavior
> > > via `/proc/self/smaps`, where the resulting VMA is marked with the `dp`
> > > (MAP_DROPPABLE) flag, and the `Private_*` fields confirm private memory
> > > semantics. The output for a 2MiB mapping with these flags might look like:
> >
> > Note that "Private_" in the stats has *nothing* to do with MAP_PRIVATE.

Oh, I see. Thanks for pointing this out!

> >
> > >
> > > ```
> > > f433ace00000-f433ad000000 rw-p 00000000 00:00 0
> > > Size:               2048 kB
> > > KernelPageSize:        4 kB
> > > MMUPageSize:           4 kB
> > > Rss:                2048 kB
> > > Pss:                2048 kB
> > > Pss_Dirty:          2048 kB
> > > Shared_Clean:          0 kB
> > > Shared_Dirty:          0 kB
> > > Private_Clean:         0 kB
> > > Private_Dirty:      2048 kB
> > > Referenced:         2048 kB
> > > Anonymous:          2048 kB
> > > ...
> > > VmFlags: rd wr mr mw me nr wf dd dp
> > > ```
> > >
> > > This patch changes mmap() to allow the combination of `MAP_DROPPABLE |
> > > MAP_PRIVATE`. For mmap(), at least one of MAP_PRIVATE or MAP_SHARED could
> > > be explicitly specified, regardless of the combination with other `MAP_*`
> > > flags.
> > >
> > > Fixes: 9651fcedf7b9 ("mm: add MAP_DROPPABLE for designating always lazily freeable mappings")
> >
> > "How about we just say that VM_DROPPABLE really is something separate
> > from MAP_PRIVATE or MAP_SHARED..
>
> Which is also how I view it. I -really- do not want to add a weird situation
> where people wonder whether _not_ setting MAP_PRIVATE implies some different
> semantics.
>
> This mode is aggregate in behaviour by design and intended to be _specifically_
> asked for, not in conjunction with other map flags.

Thanks for the lesson! I missed this important info before :(

>
> >
> > And then we make the rule be that VM_DROPPABLE is never dumped and
> > always dropped on fork, just to make things simpler." [1]
>
> Yup.
>
> >
> > [1] https://lore.kernel.org/linux-mm/CAHk-=wi=XvCZ9r897LjEb4ZarLzLtKN1p+Fyig+F2fmQDF8GSA@mail.gmail.com/
> >
> > So, nack from my side.
>
> Also, mine.

Thanks again for your time!
Lance

>
> >
> > --
> > Cheers,
> >
> > David / dhildenb
> >

Patch

diff --git a/mm/mmap.c b/mm/mmap.c
index cda01071c7b1..840889b5bfb2 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -504,6 +504,7 @@  unsigned long do_mmap(struct file *file, unsigned long addr,
 			vm_flags |= VM_SHARED | VM_MAYSHARE;
 			break;
 		case MAP_DROPPABLE:
+		case MAP_DROPPABLE | MAP_PRIVATE:
 			if (VM_DROPPABLE == VM_NONE)
 				return -ENOTSUPP;
 			/*