mbox series

[0/4] fs: allow listmount() with reversed ordering

Message ID 20240607-vfs-listmount-reverse-v1-0-7877a2bfa5e5@kernel.org (mailing list archive)
Headers show
Series fs: allow listmount() with reversed ordering | expand

Message

Christian Brauner June 7, 2024, 2:55 p.m. UTC
util-linux is about to implement listmount() and statmount() support.
Karel requested the ability to scan the mount table in backwards order
because that's what libmount currently does in order to get the latest
mount first. We currently don't support this in listmount(). Add a new
LISTMOUNT_RESERVE flag to allow listing mounts in reverse order. For
example, listing all child mounts of /sys without LISTMOUNT_REVERSE
gives:

    /sys/kernel/security @ mnt_id: 4294968369
    /sys/fs/cgroup @ mnt_id: 4294968370
    /sys/firmware/efi/efivars @ mnt_id: 4294968371
    /sys/fs/bpf @ mnt_id: 4294968372
    /sys/kernel/tracing @ mnt_id: 4294968373
    /sys/kernel/debug @ mnt_id: 4294968374
    /sys/fs/fuse/connections @ mnt_id: 4294968375
    /sys/kernel/config @ mnt_id: 4294968376

whereas with LISTMOUNT_RESERVE it gives:

    /sys/kernel/config @ mnt_id: 4294968376
    /sys/fs/fuse/connections @ mnt_id: 4294968375
    /sys/kernel/debug @ mnt_id: 4294968374
    /sys/kernel/tracing @ mnt_id: 4294968373
    /sys/fs/bpf @ mnt_id: 4294968372
    /sys/firmware/efi/efivars @ mnt_id: 4294968371
    /sys/fs/cgroup @ mnt_id: 4294968370
    /sys/kernel/security @ mnt_id: 4294968369

A few smaller cleanups included in this series.

---
---
base-commit: 1613e604df0cd359cf2a7fbd9be7a0bcfacfabd0
change-id: 20240607-vfs-listmount-reverse-0f5d4d8248ee

Comments

Mateusz Guzik June 7, 2024, 4:08 p.m. UTC | #1
On Fri, Jun 07, 2024 at 04:55:33PM +0200, Christian Brauner wrote:
> util-linux is about to implement listmount() and statmount() support.
> Karel requested the ability to scan the mount table in backwards order
> because that's what libmount currently does in order to get the latest
> mount first. We currently don't support this in listmount(). Add a new
> LISTMOUNT_RESERVE flag to allow listing mounts in reverse order. For
> example, listing all child mounts of /sys without LISTMOUNT_REVERSE
> gives:
> 
>     /sys/kernel/security @ mnt_id: 4294968369
>     /sys/fs/cgroup @ mnt_id: 4294968370
>     /sys/firmware/efi/efivars @ mnt_id: 4294968371
>     /sys/fs/bpf @ mnt_id: 4294968372
>     /sys/kernel/tracing @ mnt_id: 4294968373
>     /sys/kernel/debug @ mnt_id: 4294968374
>     /sys/fs/fuse/connections @ mnt_id: 4294968375
>     /sys/kernel/config @ mnt_id: 4294968376
> 
> whereas with LISTMOUNT_RESERVE it gives:
> 
>     /sys/kernel/config @ mnt_id: 4294968376
>     /sys/fs/fuse/connections @ mnt_id: 4294968375
>     /sys/kernel/debug @ mnt_id: 4294968374
>     /sys/kernel/tracing @ mnt_id: 4294968373
>     /sys/fs/bpf @ mnt_id: 4294968372
>     /sys/firmware/efi/efivars @ mnt_id: 4294968371
>     /sys/fs/cgroup @ mnt_id: 4294968370
>     /sys/kernel/security @ mnt_id: 4294968369
> 
> A few smaller cleanups included in this series.
> 

I have some remarks about listmount, they are not patchset-specific
though.

If I'm reading things correctly put_user is performed with namespace_sem
held? While it probably works, it is fundamentally unsound behavior --
said put_user introduces funky lock orderings which don't need to be
there and can go off cpu indefinitely waiting for the fault to be
serviced.

Either the code needs to pre-wire user memory or it needs to export to
userspace in chunks while dropping the lock in-between (say 16 or
whatever IDs per batch, something stack-tolerable).

From cursory reading dropping the lock looks fine.

This has no bearing on the feature, but since you are cleaning up the
syscall apart from it perhaps you will be willing to sort this out.

Bonus question is why lockdep is not yelling about it. At best only
select locks should be allowed to take a page fault with.
Josef Bacik June 7, 2024, 7:03 p.m. UTC | #2
On Fri, Jun 07, 2024 at 04:55:33PM +0200, Christian Brauner wrote:
> util-linux is about to implement listmount() and statmount() support.
> Karel requested the ability to scan the mount table in backwards order
> because that's what libmount currently does in order to get the latest
> mount first. We currently don't support this in listmount(). Add a new
> LISTMOUNT_RESERVE flag to allow listing mounts in reverse order. For
> example, listing all child mounts of /sys without LISTMOUNT_REVERSE
> gives:
> 

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

for the entire series, thanks,

Josef
Christian Brauner June 10, 2024, 1:16 p.m. UTC | #3
> This has no bearing on the feature, but since you are cleaning up the
> syscall apart from it perhaps you will be willing to sort this out.

I'm aware of this and I did rewrite statmount() so it doesn't do that
before we merged it. listmount() can do the same. Please see #vfs.misc.