diff mbox series

[v4] memfd: `MFD_NOEXEC_SEAL` should not imply `MFD_ALLOW_SEALING`

Message ID 20240630184912.37335-1-pobrn@protonmail.com (mailing list archive)
State New
Headers show
Series [v4] memfd: `MFD_NOEXEC_SEAL` should not imply `MFD_ALLOW_SEALING` | expand

Commit Message

Barnabás Pőcze June 30, 2024, 6:49 p.m. UTC
`MFD_NOEXEC_SEAL` should remove the executable bits and set `F_SEAL_EXEC`
to prevent further modifications to the executable bits as per the comment
in the uapi header file:

  not executable and sealed to prevent changing to executable

However, commit 105ff5339f498a ("mm/memfd: add MFD_NOEXEC_SEAL and MFD_EXEC")
that introduced this feature made it so that `MFD_NOEXEC_SEAL` unsets
`F_SEAL_SEAL`, essentially acting as a superset of `MFD_ALLOW_SEALING`.

Nothing implies that it should be so, and indeed up until the second version
of the of the patchset[0] that introduced `MFD_EXEC` and `MFD_NOEXEC_SEAL`,
`F_SEAL_SEAL` was not removed, however, it was changed in the third revision
of the patchset[1] without a clear explanation.

This behaviour is surprising for application developers, there is no
documentation that would reveal that `MFD_NOEXEC_SEAL` has the additional
effect of `MFD_ALLOW_SEALING`. Additionally, combined with `vm.memfd_noexec=2`
it has the effect of making all memfds initially sealable.

So do not remove `F_SEAL_SEAL` when `MFD_NOEXEC_SEAL` is requested,
thereby returning to the pre-Linux 6.3 behaviour of only allowing
sealing when `MFD_ALLOW_SEALING` is specified.

Now, this is technically a uapi break. However, the damage is expected
to be minimal. To trigger user visible change, a program has to do the
following steps:

 - create memfd:
   - with `MFD_NOEXEC_SEAL`,
   - without `MFD_ALLOW_SEALING`;
 - try to add seals / check the seals.

But that seems unlikely to happen intentionally since this change
essentially reverts the kernel's behaviour to that of Linux <6.3,
so if a program worked correctly on those older kernels, it will
likely work correctly after this change.

I have used Debian Code Search and GitHub to try to find potential
breakages, and I could only find a single one. dbus-broker's
memfd_create() wrapper is aware of this implicit `MFD_ALLOW_SEALING`
behaviour, and tries to work around it[2]. This workaround will
break. Luckily, this only affects the test suite, it does not affect
the normal operations of dbus-broker. There is a PR with a fix[3].

I also carried out a smoke test by building a kernel with this change
and booting an Arch Linux system into GNOME and Plasma sessions.

There was also a previous attempt to address this peculiarity by
introducing a new flag[4].

[0]: https://lore.kernel.org/lkml/20220805222126.142525-3-jeffxu@google.com/
[1]: https://lore.kernel.org/lkml/20221202013404.163143-3-jeffxu@google.com/
[2]: https://github.com/bus1/dbus-broker/blob/9eb0b7e5826fc76cad7b025bc46f267d4a8784cb/src/util/misc.c#L114
[3]: https://github.com/bus1/dbus-broker/pull/366
[4]: https://lore.kernel.org/lkml/20230714114753.170814-1-david@readahead.eu/

Cc: stable@vger.kernel.org
Signed-off-by: Barnabás Pőcze <pobrn@protonmail.com>
---

* v3: https://lore.kernel.org/linux-mm/20240611231409.3899809-1-jeffxu@chromium.org/
* v2: https://lore.kernel.org/linux-mm/20240524033933.135049-1-jeffxu@google.com/
* v1: https://lore.kernel.org/linux-mm/20240513191544.94754-1-pobrn@protonmail.com/

This fourth version returns to removing the inconsistency as opposed to documenting
its existence, with the same code change as v1 but with a somewhat extended commit
message. This is sent because I believe it is worth at least a try; it can be easily
reverted if bigger application breakages are discovered than initially imagined.

---
 mm/memfd.c                                 | 9 ++++-----
 tools/testing/selftests/memfd/memfd_test.c | 2 +-
 2 files changed, 5 insertions(+), 6 deletions(-)

Comments

Jeff Xu July 1, 2024, 9:31 p.m. UTC | #1
Hi

On Sun, Jun 30, 2024 at 11:49 AM Barnabás Pőcze <pobrn@protonmail.com> wrote:
>
> `MFD_NOEXEC_SEAL` should remove the executable bits and set `F_SEAL_EXEC`
> to prevent further modifications to the executable bits as per the comment
> in the uapi header file:
>
>   not executable and sealed to prevent changing to executable
>
> However, commit 105ff5339f498a ("mm/memfd: add MFD_NOEXEC_SEAL and MFD_EXEC")
> that introduced this feature made it so that `MFD_NOEXEC_SEAL` unsets
> `F_SEAL_SEAL`, essentially acting as a superset of `MFD_ALLOW_SEALING`.
>
> Nothing implies that it should be so, and indeed up until the second version
> of the of the patchset[0] that introduced `MFD_EXEC` and `MFD_NOEXEC_SEAL`,
> `F_SEAL_SEAL` was not removed, however, it was changed in the third revision
> of the patchset[1] without a clear explanation.
>
> This behaviour is surprising for application developers, there is no
> documentation that would reveal that `MFD_NOEXEC_SEAL` has the additional
> effect of `MFD_ALLOW_SEALING`. Additionally, combined with `vm.memfd_noexec=2`
> it has the effect of making all memfds initially sealable.
>
The documentation is in linux main (653c5c75115c), I hope this gives
clarity to the usage of MFD_NOEXEC_SEAL flag to application
developers, furthermore I'm working on man page for memfd_create.

> So do not remove `F_SEAL_SEAL` when `MFD_NOEXEC_SEAL` is requested,
> thereby returning to the pre-Linux 6.3 behaviour of only allowing
> sealing when `MFD_ALLOW_SEALING` is specified.
>
> Now, this is technically a uapi break. However, the damage is expected
> to be minimal. To trigger user visible change, a program has to do the
> following steps:
>
>  - create memfd:
>    - with `MFD_NOEXEC_SEAL`,
>    - without `MFD_ALLOW_SEALING`;
>  - try to add seals / check the seals.
>
> But that seems unlikely to happen intentionally since this change
> essentially reverts the kernel's behaviour to that of Linux <6.3,
> so if a program worked correctly on those older kernels, it will
> likely work correctly after this change.
>
During V3 patch discussion,  I sent my reasoning, but here are summaries:

- As one might have noticed, unlike other flags in memfd_create,
MFD_NOEXEC_SEAL is actually a combination of multiple flags. The idea
is to make it easier to use memfd in the most common way, which is
NOEXEC + F_SEAL_EXEC + MFD_ALLOW_SEALING.

- The new sysctl vm.noexec = 1 helps existing applications move to a
more secure way of using memfd. IMO, MFD_ALLOW_SEALING is included by
default because most applications would rather have it than not. In
any case, an app can set F_SEAL_SEAL to disable the sealing.

- MFD_NOEXEC_SEAL has been added for more than one year, multiple
applications and distributions have backported and utilized it.
Altering ABI now presents a degree of risk and may lead to
disruptions.

Best regards,
-Jeff

> I have used Debian Code Search and GitHub to try to find potential
> breakages, and I could only find a single one. dbus-broker's
> memfd_create() wrapper is aware of this implicit `MFD_ALLOW_SEALING`
> behaviour, and tries to work around it[2]. This workaround will
> break. Luckily, this only affects the test suite, it does not affect
> the normal operations of dbus-broker. There is a PR with a fix[3].
>
> I also carried out a smoke test by building a kernel with this change
> and booting an Arch Linux system into GNOME and Plasma sessions.
>
> There was also a previous attempt to address this peculiarity by
> introducing a new flag[4].
>
> [0]: https://lore.kernel.org/lkml/20220805222126.142525-3-jeffxu@google.com/
> [1]: https://lore.kernel.org/lkml/20221202013404.163143-3-jeffxu@google.com/
> [2]: https://github.com/bus1/dbus-broker/blob/9eb0b7e5826fc76cad7b025bc46f267d4a8784cb/src/util/misc.c#L114
> [3]: https://github.com/bus1/dbus-broker/pull/366
> [4]: https://lore.kernel.org/lkml/20230714114753.170814-1-david@readahead.eu/
>
> Cc: stable@vger.kernel.org
> Signed-off-by: Barnabás Pőcze <pobrn@protonmail.com>
> ---
>
> * v3: https://lore.kernel.org/linux-mm/20240611231409.3899809-1-jeffxu@chromium.org/
> * v2: https://lore.kernel.org/linux-mm/20240524033933.135049-1-jeffxu@google.com/
> * v1: https://lore.kernel.org/linux-mm/20240513191544.94754-1-pobrn@protonmail.com/
>
> This fourth version returns to removing the inconsistency as opposed to documenting
> its existence, with the same code change as v1 but with a somewhat extended commit
> message. This is sent because I believe it is worth at least a try; it can be easily
> reverted if bigger application breakages are discovered than initially imagined.
>
> ---
>  mm/memfd.c                                 | 9 ++++-----
>  tools/testing/selftests/memfd/memfd_test.c | 2 +-
>  2 files changed, 5 insertions(+), 6 deletions(-)
>
> diff --git a/mm/memfd.c b/mm/memfd.c
> index 7d8d3ab3fa37..8b7f6afee21d 100644
> --- a/mm/memfd.c
> +++ b/mm/memfd.c
> @@ -356,12 +356,11 @@ SYSCALL_DEFINE2(memfd_create,
>
>                 inode->i_mode &= ~0111;
>                 file_seals = memfd_file_seals_ptr(file);
> -               if (file_seals) {
> -                       *file_seals &= ~F_SEAL_SEAL;
> +               if (file_seals)
>                         *file_seals |= F_SEAL_EXEC;
> -               }
> -       } else if (flags & MFD_ALLOW_SEALING) {
> -               /* MFD_EXEC and MFD_ALLOW_SEALING are set */
> +       }
> +
> +       if (flags & MFD_ALLOW_SEALING) {
>                 file_seals = memfd_file_seals_ptr(file);
>                 if (file_seals)
>                         *file_seals &= ~F_SEAL_SEAL;
> diff --git a/tools/testing/selftests/memfd/memfd_test.c b/tools/testing/selftests/memfd/memfd_test.c
> index 95af2d78fd31..7b78329f65b6 100644
> --- a/tools/testing/selftests/memfd/memfd_test.c
> +++ b/tools/testing/selftests/memfd/memfd_test.c
> @@ -1151,7 +1151,7 @@ static void test_noexec_seal(void)
>                             mfd_def_size,
>                             MFD_CLOEXEC | MFD_NOEXEC_SEAL);
>         mfd_assert_mode(fd, 0666);
> -       mfd_assert_has_seals(fd, F_SEAL_EXEC);
> +       mfd_assert_has_seals(fd, F_SEAL_SEAL | F_SEAL_EXEC);
>         mfd_fail_chmod(fd, 0777);
>         close(fd);
>  }
> --
> 2.45.2
>
>
Aleksa Sarai July 2, 2024, 6:24 a.m. UTC | #2
On 2024-06-30, Barnabás Pőcze <pobrn@protonmail.com> wrote:
> `MFD_NOEXEC_SEAL` should remove the executable bits and set `F_SEAL_EXEC`
> to prevent further modifications to the executable bits as per the comment
> in the uapi header file:
> 
>   not executable and sealed to prevent changing to executable
> 
> However, commit 105ff5339f498a ("mm/memfd: add MFD_NOEXEC_SEAL and MFD_EXEC")
> that introduced this feature made it so that `MFD_NOEXEC_SEAL` unsets
> `F_SEAL_SEAL`, essentially acting as a superset of `MFD_ALLOW_SEALING`.
> 
> Nothing implies that it should be so, and indeed up until the second version
> of the of the patchset[0] that introduced `MFD_EXEC` and `MFD_NOEXEC_SEAL`,
> `F_SEAL_SEAL` was not removed, however, it was changed in the third revision
> of the patchset[1] without a clear explanation.
> 
> This behaviour is surprising for application developers, there is no
> documentation that would reveal that `MFD_NOEXEC_SEAL` has the additional
> effect of `MFD_ALLOW_SEALING`. Additionally, combined with `vm.memfd_noexec=2`
> it has the effect of making all memfds initially sealable.
> 
> So do not remove `F_SEAL_SEAL` when `MFD_NOEXEC_SEAL` is requested,
> thereby returning to the pre-Linux 6.3 behaviour of only allowing
> sealing when `MFD_ALLOW_SEALING` is specified.

This behaviour makes sense, I'm a little sad I didn't catch it when I
was fixing vm.memfd_noexec. There is a possibility for breakage, but we
should give it a shot, given how new the API is (and the API itself was
also broken until Linux 6.6 anyway)...

Feel free to take my

Reviewed-by: Aleksa Sarai <cyphar@cyphar.com>

Thanks.

> Now, this is technically a uapi break. However, the damage is expected
> to be minimal. To trigger user visible change, a program has to do the
> following steps:
> 
>  - create memfd:
>    - with `MFD_NOEXEC_SEAL`,
>    - without `MFD_ALLOW_SEALING`;
>  - try to add seals / check the seals.
> 
> But that seems unlikely to happen intentionally since this change
> essentially reverts the kernel's behaviour to that of Linux <6.3,
> so if a program worked correctly on those older kernels, it will
> likely work correctly after this change.
> 
> I have used Debian Code Search and GitHub to try to find potential
> breakages, and I could only find a single one. dbus-broker's
> memfd_create() wrapper is aware of this implicit `MFD_ALLOW_SEALING`
> behaviour, and tries to work around it[2]. This workaround will
> break. Luckily, this only affects the test suite, it does not affect
> the normal operations of dbus-broker. There is a PR with a fix[3].
> 
> I also carried out a smoke test by building a kernel with this change
> and booting an Arch Linux system into GNOME and Plasma sessions.
> 
> There was also a previous attempt to address this peculiarity by
> introducing a new flag[4].
> 
> [0]: https://lore.kernel.org/lkml/20220805222126.142525-3-jeffxu@google.com/
> [1]: https://lore.kernel.org/lkml/20221202013404.163143-3-jeffxu@google.com/
> [2]: https://github.com/bus1/dbus-broker/blob/9eb0b7e5826fc76cad7b025bc46f267d4a8784cb/src/util/misc.c#L114
> [3]: https://github.com/bus1/dbus-broker/pull/366
> [4]: https://lore.kernel.org/lkml/20230714114753.170814-1-david@readahead.eu/
> 
> Cc: stable@vger.kernel.org
> Signed-off-by: Barnabás Pőcze <pobrn@protonmail.com>
> ---
> 
> * v3: https://lore.kernel.org/linux-mm/20240611231409.3899809-1-jeffxu@chromium.org/
> * v2: https://lore.kernel.org/linux-mm/20240524033933.135049-1-jeffxu@google.com/
> * v1: https://lore.kernel.org/linux-mm/20240513191544.94754-1-pobrn@protonmail.com/
> 
> This fourth version returns to removing the inconsistency as opposed to documenting
> its existence, with the same code change as v1 but with a somewhat extended commit
> message. This is sent because I believe it is worth at least a try; it can be easily
> reverted if bigger application breakages are discovered than initially imagined.
> 
> ---
>  mm/memfd.c                                 | 9 ++++-----
>  tools/testing/selftests/memfd/memfd_test.c | 2 +-
>  2 files changed, 5 insertions(+), 6 deletions(-)
> 
> diff --git a/mm/memfd.c b/mm/memfd.c
> index 7d8d3ab3fa37..8b7f6afee21d 100644
> --- a/mm/memfd.c
> +++ b/mm/memfd.c
> @@ -356,12 +356,11 @@ SYSCALL_DEFINE2(memfd_create,
>  
>  		inode->i_mode &= ~0111;
>  		file_seals = memfd_file_seals_ptr(file);
> -		if (file_seals) {
> -			*file_seals &= ~F_SEAL_SEAL;
> +		if (file_seals)
>  			*file_seals |= F_SEAL_EXEC;
> -		}
> -	} else if (flags & MFD_ALLOW_SEALING) {
> -		/* MFD_EXEC and MFD_ALLOW_SEALING are set */
> +	}
> +
> +	if (flags & MFD_ALLOW_SEALING) {
>  		file_seals = memfd_file_seals_ptr(file);
>  		if (file_seals)
>  			*file_seals &= ~F_SEAL_SEAL;
> diff --git a/tools/testing/selftests/memfd/memfd_test.c b/tools/testing/selftests/memfd/memfd_test.c
> index 95af2d78fd31..7b78329f65b6 100644
> --- a/tools/testing/selftests/memfd/memfd_test.c
> +++ b/tools/testing/selftests/memfd/memfd_test.c
> @@ -1151,7 +1151,7 @@ static void test_noexec_seal(void)
>  			    mfd_def_size,
>  			    MFD_CLOEXEC | MFD_NOEXEC_SEAL);
>  	mfd_assert_mode(fd, 0666);
> -	mfd_assert_has_seals(fd, F_SEAL_EXEC);
> +	mfd_assert_has_seals(fd, F_SEAL_SEAL | F_SEAL_EXEC);
>  	mfd_fail_chmod(fd, 0777);
>  	close(fd);
>  }
> -- 
> 2.45.2
> 
>
Barnabás Pőcze Sept. 27, 2024, 10:09 p.m. UTC | #3
Hi


Gentle ping. Is there any chance we could move forward with this? I am not aware
of any breakage it would cause; but longer the wait, the higher the likelihood.


Regards,
Barnabás Pőcze

2024. június 30., vasárnap 20:49 keltezéssel, Barnabás Pőcze <pobrn@protonmail.com> írta:

> `MFD_NOEXEC_SEAL` should remove the executable bits and set `F_SEAL_EXEC`
> to prevent further modifications to the executable bits as per the comment
> in the uapi header file:
> 
>   not executable and sealed to prevent changing to executable
> 
> However, commit 105ff5339f498a ("mm/memfd: add MFD_NOEXEC_SEAL and MFD_EXEC")
> that introduced this feature made it so that `MFD_NOEXEC_SEAL` unsets
> `F_SEAL_SEAL`, essentially acting as a superset of `MFD_ALLOW_SEALING`.
> 
> Nothing implies that it should be so, and indeed up until the second version
> of the of the patchset[0] that introduced `MFD_EXEC` and `MFD_NOEXEC_SEAL`,
> `F_SEAL_SEAL` was not removed, however, it was changed in the third revision
> of the patchset[1] without a clear explanation.
> 
> This behaviour is surprising for application developers, there is no
> documentation that would reveal that `MFD_NOEXEC_SEAL` has the additional
> effect of `MFD_ALLOW_SEALING`. Additionally, combined with `vm.memfd_noexec=2`
> it has the effect of making all memfds initially sealable.
> 
> So do not remove `F_SEAL_SEAL` when `MFD_NOEXEC_SEAL` is requested,
> thereby returning to the pre-Linux 6.3 behaviour of only allowing
> sealing when `MFD_ALLOW_SEALING` is specified.
> 
> Now, this is technically a uapi break. However, the damage is expected
> to be minimal. To trigger user visible change, a program has to do the
> following steps:
> 
>  - create memfd:
>    - with `MFD_NOEXEC_SEAL`,
>    - without `MFD_ALLOW_SEALING`;
>  - try to add seals / check the seals.
> 
> But that seems unlikely to happen intentionally since this change
> essentially reverts the kernel's behaviour to that of Linux <6.3,
> so if a program worked correctly on those older kernels, it will
> likely work correctly after this change.
> 
> I have used Debian Code Search and GitHub to try to find potential
> breakages, and I could only find a single one. dbus-broker's
> memfd_create() wrapper is aware of this implicit `MFD_ALLOW_SEALING`
> behaviour, and tries to work around it[2]. This workaround will
> break. Luckily, this only affects the test suite, it does not affect
> the normal operations of dbus-broker. There is a PR with a fix[3].
> 
> I also carried out a smoke test by building a kernel with this change
> and booting an Arch Linux system into GNOME and Plasma sessions.
> 
> There was also a previous attempt to address this peculiarity by
> introducing a new flag[4].
> 
> [0]: https://lore.kernel.org/lkml/20220805222126.142525-3-jeffxu@google.com/
> [1]: https://lore.kernel.org/lkml/20221202013404.163143-3-jeffxu@google.com/
> [2]: https://github.com/bus1/dbus-broker/blob/9eb0b7e5826fc76cad7b025bc46f267d4a8784cb/src/util/misc.c#L114
> [3]: https://github.com/bus1/dbus-broker/pull/366
> [4]: https://lore.kernel.org/lkml/20230714114753.170814-1-david@readahead.eu/
> 
> Cc: stable@vger.kernel.org
> Signed-off-by: Barnabás Pőcze <pobrn@protonmail.com>
> ---
> 
> * v3: https://lore.kernel.org/linux-mm/20240611231409.3899809-1-jeffxu@chromium.org/
> * v2: https://lore.kernel.org/linux-mm/20240524033933.135049-1-jeffxu@google.com/
> * v1: https://lore.kernel.org/linux-mm/20240513191544.94754-1-pobrn@protonmail.com/
> 
> This fourth version returns to removing the inconsistency as opposed to documenting
> its existence, with the same code change as v1 but with a somewhat extended commit
> message. This is sent because I believe it is worth at least a try; it can be easily
> reverted if bigger application breakages are discovered than initially imagined.
> 
> ---
>  mm/memfd.c                                 | 9 ++++-----
>  tools/testing/selftests/memfd/memfd_test.c | 2 +-
>  2 files changed, 5 insertions(+), 6 deletions(-)
> 
> diff --git a/mm/memfd.c b/mm/memfd.c
> index 7d8d3ab3fa37..8b7f6afee21d 100644
> --- a/mm/memfd.c
> +++ b/mm/memfd.c
> @@ -356,12 +356,11 @@ SYSCALL_DEFINE2(memfd_create,
>  
>  		inode->i_mode &= ~0111;
>  		file_seals = memfd_file_seals_ptr(file);
> -		if (file_seals) {
> -			*file_seals &= ~F_SEAL_SEAL;
> +		if (file_seals)
>  			*file_seals |= F_SEAL_EXEC;
> -		}
> -	} else if (flags & MFD_ALLOW_SEALING) {
> -		/* MFD_EXEC and MFD_ALLOW_SEALING are set */
> +	}
> +
> +	if (flags & MFD_ALLOW_SEALING) {
>  		file_seals = memfd_file_seals_ptr(file);
>  		if (file_seals)
>  			*file_seals &= ~F_SEAL_SEAL;
> diff --git a/tools/testing/selftests/memfd/memfd_test.c b/tools/testing/selftests/memfd/memfd_test.c
> index 95af2d78fd31..7b78329f65b6 100644
> --- a/tools/testing/selftests/memfd/memfd_test.c
> +++ b/tools/testing/selftests/memfd/memfd_test.c
> @@ -1151,7 +1151,7 @@ static void test_noexec_seal(void)
>  			    mfd_def_size,
>  			    MFD_CLOEXEC | MFD_NOEXEC_SEAL);
>  	mfd_assert_mode(fd, 0666);
> -	mfd_assert_has_seals(fd, F_SEAL_EXEC);
> +	mfd_assert_has_seals(fd, F_SEAL_SEAL | F_SEAL_EXEC);
>  	mfd_fail_chmod(fd, 0777);
>  	close(fd);
>  }
> -- 
> 2.45.2
>
Barnabás Pőcze Nov. 20, 2024, 5:20 p.m. UTC | #4
Hi


Gentle ping again. I am still hoping we can move forward with this.


Regards,
Barnabás Pőcze


2024. szeptember 28., szombat 0:09 keltezéssel, Barnabás Pőcze <pobrn@protonmail.com> írta:

> Hi
> 
> 
> Gentle ping. Is there any chance we could move forward with this? I am not aware
> of any breakage it would cause; but longer the wait, the higher the likelihood.
> 
> 
> Regards,
> Barnabás Pőcze
> 
> 2024. június 30., vasárnap 20:49 keltezéssel, Barnabás Pőcze <pobrn@protonmail.com> írta:
> 
> > `MFD_NOEXEC_SEAL` should remove the executable bits and set `F_SEAL_EXEC`
> > to prevent further modifications to the executable bits as per the comment
> > in the uapi header file:
> >
> >   not executable and sealed to prevent changing to executable
> >
> > However, commit 105ff5339f498a ("mm/memfd: add MFD_NOEXEC_SEAL and MFD_EXEC")
> > that introduced this feature made it so that `MFD_NOEXEC_SEAL` unsets
> > `F_SEAL_SEAL`, essentially acting as a superset of `MFD_ALLOW_SEALING`.
> >
> > Nothing implies that it should be so, and indeed up until the second version
> > of the of the patchset[0] that introduced `MFD_EXEC` and `MFD_NOEXEC_SEAL`,
> > `F_SEAL_SEAL` was not removed, however, it was changed in the third revision
> > of the patchset[1] without a clear explanation.
> >
> > This behaviour is surprising for application developers, there is no
> > documentation that would reveal that `MFD_NOEXEC_SEAL` has the additional
> > effect of `MFD_ALLOW_SEALING`. Additionally, combined with `vm.memfd_noexec=2`
> > it has the effect of making all memfds initially sealable.
> >
> > So do not remove `F_SEAL_SEAL` when `MFD_NOEXEC_SEAL` is requested,
> > thereby returning to the pre-Linux 6.3 behaviour of only allowing
> > sealing when `MFD_ALLOW_SEALING` is specified.
> >
> > Now, this is technically a uapi break. However, the damage is expected
> > to be minimal. To trigger user visible change, a program has to do the
> > following steps:
> >
> >  - create memfd:
> >    - with `MFD_NOEXEC_SEAL`,
> >    - without `MFD_ALLOW_SEALING`;
> >  - try to add seals / check the seals.
> >
> > But that seems unlikely to happen intentionally since this change
> > essentially reverts the kernel's behaviour to that of Linux <6.3,
> > so if a program worked correctly on those older kernels, it will
> > likely work correctly after this change.
> >
> > I have used Debian Code Search and GitHub to try to find potential
> > breakages, and I could only find a single one. dbus-broker's
> > memfd_create() wrapper is aware of this implicit `MFD_ALLOW_SEALING`
> > behaviour, and tries to work around it[2]. This workaround will
> > break. Luckily, this only affects the test suite, it does not affect
> > the normal operations of dbus-broker. There is a PR with a fix[3].
> >
> > I also carried out a smoke test by building a kernel with this change
> > and booting an Arch Linux system into GNOME and Plasma sessions.
> >
> > There was also a previous attempt to address this peculiarity by
> > introducing a new flag[4].
> >
> > [0]: https://lore.kernel.org/lkml/20220805222126.142525-3-jeffxu@google.com/
> > [1]: https://lore.kernel.org/lkml/20221202013404.163143-3-jeffxu@google.com/
> > [2]: https://github.com/bus1/dbus-broker/blob/9eb0b7e5826fc76cad7b025bc46f267d4a8784cb/src/util/misc.c#L114
> > [3]: https://github.com/bus1/dbus-broker/pull/366
> > [4]: https://lore.kernel.org/lkml/20230714114753.170814-1-david@readahead.eu/
> >
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Barnabás Pőcze <pobrn@protonmail.com>
> > ---
> >
> > * v3: https://lore.kernel.org/linux-mm/20240611231409.3899809-1-jeffxu@chromium.org/
> > * v2: https://lore.kernel.org/linux-mm/20240524033933.135049-1-jeffxu@google.com/
> > * v1: https://lore.kernel.org/linux-mm/20240513191544.94754-1-pobrn@protonmail.com/
> >
> > This fourth version returns to removing the inconsistency as opposed to documenting
> > its existence, with the same code change as v1 but with a somewhat extended commit
> > message. This is sent because I believe it is worth at least a try; it can be easily
> > reverted if bigger application breakages are discovered than initially imagined.
> >
> > ---
> >  mm/memfd.c                                 | 9 ++++-----
> >  tools/testing/selftests/memfd/memfd_test.c | 2 +-
> >  2 files changed, 5 insertions(+), 6 deletions(-)
> >
> > diff --git a/mm/memfd.c b/mm/memfd.c
> > index 7d8d3ab3fa37..8b7f6afee21d 100644
> > --- a/mm/memfd.c
> > +++ b/mm/memfd.c
> > @@ -356,12 +356,11 @@ SYSCALL_DEFINE2(memfd_create,
> >
> >  		inode->i_mode &= ~0111;
> >  		file_seals = memfd_file_seals_ptr(file);
> > -		if (file_seals) {
> > -			*file_seals &= ~F_SEAL_SEAL;
> > +		if (file_seals)
> >  			*file_seals |= F_SEAL_EXEC;
> > -		}
> > -	} else if (flags & MFD_ALLOW_SEALING) {
> > -		/* MFD_EXEC and MFD_ALLOW_SEALING are set */
> > +	}
> > +
> > +	if (flags & MFD_ALLOW_SEALING) {
> >  		file_seals = memfd_file_seals_ptr(file);
> >  		if (file_seals)
> >  			*file_seals &= ~F_SEAL_SEAL;
> > diff --git a/tools/testing/selftests/memfd/memfd_test.c b/tools/testing/selftests/memfd/memfd_test.c
> > index 95af2d78fd31..7b78329f65b6 100644
> > --- a/tools/testing/selftests/memfd/memfd_test.c
> > +++ b/tools/testing/selftests/memfd/memfd_test.c
> > @@ -1151,7 +1151,7 @@ static void test_noexec_seal(void)
> >  			    mfd_def_size,
> >  			    MFD_CLOEXEC | MFD_NOEXEC_SEAL);
> >  	mfd_assert_mode(fd, 0666);
> > -	mfd_assert_has_seals(fd, F_SEAL_EXEC);
> > +	mfd_assert_has_seals(fd, F_SEAL_SEAL | F_SEAL_EXEC);
> >  	mfd_fail_chmod(fd, 0777);
> >  	close(fd);
> >  }
> > --
> > 2.45.2
> >
Hugh Dickins Dec. 5, 2024, 4:29 a.m. UTC | #5
Jann, I notice you active in the SEALing arena recently, and wonder
whether you would would have time to consider Barnabás's patiently
pinged and repinged points here.  Instinct tells me that he knows
what he's talking about: but the SEALing protocol was a little too
subtle for me, even at the start, and would take me a bit too long
to remaster and comment now (which may well be true of others too).

Thanks!
Hugh

On Wed, 20 Nov 2024, Barnabás Pőcze wrote:

> Hi
> 
> 
> Gentle ping again. I am still hoping we can move forward with this.
> 
> 
> Regards,
> Barnabás Pőcze
> 
> 
> 2024. szeptember 28., szombat 0:09 keltezéssel, Barnabás Pőcze <pobrn@protonmail.com> írta:
> 
> > Hi
> > 
> > 
> > Gentle ping. Is there any chance we could move forward with this? I am not aware
> > of any breakage it would cause; but longer the wait, the higher the likelihood.
> > 
> > 
> > Regards,
> > Barnabás Pőcze
> > 
> > 2024. június 30., vasárnap 20:49 keltezéssel, Barnabás Pőcze <pobrn@protonmail.com> írta:
> > 
> > > `MFD_NOEXEC_SEAL` should remove the executable bits and set `F_SEAL_EXEC`
> > > to prevent further modifications to the executable bits as per the comment
> > > in the uapi header file:
> > >
> > >   not executable and sealed to prevent changing to executable
> > >
> > > However, commit 105ff5339f498a ("mm/memfd: add MFD_NOEXEC_SEAL and MFD_EXEC")
> > > that introduced this feature made it so that `MFD_NOEXEC_SEAL` unsets
> > > `F_SEAL_SEAL`, essentially acting as a superset of `MFD_ALLOW_SEALING`.
> > >
> > > Nothing implies that it should be so, and indeed up until the second version
> > > of the of the patchset[0] that introduced `MFD_EXEC` and `MFD_NOEXEC_SEAL`,
> > > `F_SEAL_SEAL` was not removed, however, it was changed in the third revision
> > > of the patchset[1] without a clear explanation.
> > >
> > > This behaviour is surprising for application developers, there is no
> > > documentation that would reveal that `MFD_NOEXEC_SEAL` has the additional
> > > effect of `MFD_ALLOW_SEALING`. Additionally, combined with `vm.memfd_noexec=2`
> > > it has the effect of making all memfds initially sealable.
> > >
> > > So do not remove `F_SEAL_SEAL` when `MFD_NOEXEC_SEAL` is requested,
> > > thereby returning to the pre-Linux 6.3 behaviour of only allowing
> > > sealing when `MFD_ALLOW_SEALING` is specified.
> > >
> > > Now, this is technically a uapi break. However, the damage is expected
> > > to be minimal. To trigger user visible change, a program has to do the
> > > following steps:
> > >
> > >  - create memfd:
> > >    - with `MFD_NOEXEC_SEAL`,
> > >    - without `MFD_ALLOW_SEALING`;
> > >  - try to add seals / check the seals.
> > >
> > > But that seems unlikely to happen intentionally since this change
> > > essentially reverts the kernel's behaviour to that of Linux <6.3,
> > > so if a program worked correctly on those older kernels, it will
> > > likely work correctly after this change.
> > >
> > > I have used Debian Code Search and GitHub to try to find potential
> > > breakages, and I could only find a single one. dbus-broker's
> > > memfd_create() wrapper is aware of this implicit `MFD_ALLOW_SEALING`
> > > behaviour, and tries to work around it[2]. This workaround will
> > > break. Luckily, this only affects the test suite, it does not affect
> > > the normal operations of dbus-broker. There is a PR with a fix[3].
> > >
> > > I also carried out a smoke test by building a kernel with this change
> > > and booting an Arch Linux system into GNOME and Plasma sessions.
> > >
> > > There was also a previous attempt to address this peculiarity by
> > > introducing a new flag[4].
> > >
> > > [0]: https://lore.kernel.org/lkml/20220805222126.142525-3-jeffxu@google.com/
> > > [1]: https://lore.kernel.org/lkml/20221202013404.163143-3-jeffxu@google.com/
> > > [2]: https://github.com/bus1/dbus-broker/blob/9eb0b7e5826fc76cad7b025bc46f267d4a8784cb/src/util/misc.c#L114
> > > [3]: https://github.com/bus1/dbus-broker/pull/366
> > > [4]: https://lore.kernel.org/lkml/20230714114753.170814-1-david@readahead.eu/
> > >
> > > Cc: stable@vger.kernel.org
> > > Signed-off-by: Barnabás Pőcze <pobrn@protonmail.com>
> > > ---
> > >
> > > * v3: https://lore.kernel.org/linux-mm/20240611231409.3899809-1-jeffxu@chromium.org/
> > > * v2: https://lore.kernel.org/linux-mm/20240524033933.135049-1-jeffxu@google.com/
> > > * v1: https://lore.kernel.org/linux-mm/20240513191544.94754-1-pobrn@protonmail.com/
> > >
> > > This fourth version returns to removing the inconsistency as opposed to documenting
> > > its existence, with the same code change as v1 but with a somewhat extended commit
> > > message. This is sent because I believe it is worth at least a try; it can be easily
> > > reverted if bigger application breakages are discovered than initially imagined.
> > >
> > > ---
> > >  mm/memfd.c                                 | 9 ++++-----
> > >  tools/testing/selftests/memfd/memfd_test.c | 2 +-
> > >  2 files changed, 5 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/mm/memfd.c b/mm/memfd.c
> > > index 7d8d3ab3fa37..8b7f6afee21d 100644
> > > --- a/mm/memfd.c
> > > +++ b/mm/memfd.c
> > > @@ -356,12 +356,11 @@ SYSCALL_DEFINE2(memfd_create,
> > >
> > >  		inode->i_mode &= ~0111;
> > >  		file_seals = memfd_file_seals_ptr(file);
> > > -		if (file_seals) {
> > > -			*file_seals &= ~F_SEAL_SEAL;
> > > +		if (file_seals)
> > >  			*file_seals |= F_SEAL_EXEC;
> > > -		}
> > > -	} else if (flags & MFD_ALLOW_SEALING) {
> > > -		/* MFD_EXEC and MFD_ALLOW_SEALING are set */
> > > +	}
> > > +
> > > +	if (flags & MFD_ALLOW_SEALING) {
> > >  		file_seals = memfd_file_seals_ptr(file);
> > >  		if (file_seals)
> > >  			*file_seals &= ~F_SEAL_SEAL;
> > > diff --git a/tools/testing/selftests/memfd/memfd_test.c b/tools/testing/selftests/memfd/memfd_test.c
> > > index 95af2d78fd31..7b78329f65b6 100644
> > > --- a/tools/testing/selftests/memfd/memfd_test.c
> > > +++ b/tools/testing/selftests/memfd/memfd_test.c
> > > @@ -1151,7 +1151,7 @@ static void test_noexec_seal(void)
> > >  			    mfd_def_size,
> > >  			    MFD_CLOEXEC | MFD_NOEXEC_SEAL);
> > >  	mfd_assert_mode(fd, 0666);
> > > -	mfd_assert_has_seals(fd, F_SEAL_EXEC);
> > > +	mfd_assert_has_seals(fd, F_SEAL_SEAL | F_SEAL_EXEC);
> > >  	mfd_fail_chmod(fd, 0777);
> > >  	close(fd);
> > >  }
> > > --
> > > 2.45.2
> > >
>
diff mbox series

Patch

diff --git a/mm/memfd.c b/mm/memfd.c
index 7d8d3ab3fa37..8b7f6afee21d 100644
--- a/mm/memfd.c
+++ b/mm/memfd.c
@@ -356,12 +356,11 @@  SYSCALL_DEFINE2(memfd_create,
 
 		inode->i_mode &= ~0111;
 		file_seals = memfd_file_seals_ptr(file);
-		if (file_seals) {
-			*file_seals &= ~F_SEAL_SEAL;
+		if (file_seals)
 			*file_seals |= F_SEAL_EXEC;
-		}
-	} else if (flags & MFD_ALLOW_SEALING) {
-		/* MFD_EXEC and MFD_ALLOW_SEALING are set */
+	}
+
+	if (flags & MFD_ALLOW_SEALING) {
 		file_seals = memfd_file_seals_ptr(file);
 		if (file_seals)
 			*file_seals &= ~F_SEAL_SEAL;
diff --git a/tools/testing/selftests/memfd/memfd_test.c b/tools/testing/selftests/memfd/memfd_test.c
index 95af2d78fd31..7b78329f65b6 100644
--- a/tools/testing/selftests/memfd/memfd_test.c
+++ b/tools/testing/selftests/memfd/memfd_test.c
@@ -1151,7 +1151,7 @@  static void test_noexec_seal(void)
 			    mfd_def_size,
 			    MFD_CLOEXEC | MFD_NOEXEC_SEAL);
 	mfd_assert_mode(fd, 0666);
-	mfd_assert_has_seals(fd, F_SEAL_EXEC);
+	mfd_assert_has_seals(fd, F_SEAL_SEAL | F_SEAL_EXEC);
 	mfd_fail_chmod(fd, 0777);
 	close(fd);
 }