mbox series

[v1,0/1] mm/memfd: add documentation for MFD_NOEXEC_SEAL

Message ID 20240607203543.2151433-1-jeffxu@google.com (mailing list archive)
Headers show
Series mm/memfd: add documentation for MFD_NOEXEC_SEAL | expand

Message

Jeff Xu June 7, 2024, 8:35 p.m. UTC
From: Jeff Xu <jeffxu@chromium.org>

When MFD_NOEXEC_SEAL was introduced, there was one big mistake: it
didn't have proper documentation. This led to a lot of confusion,
especially about whether or not memfd created with the MFD_NOEXEC_SEAL
flag is sealable. Before MFD_NOEXEC_SEAL, memfd had to explicitly set
MFD_ALLOW_SEALING to be sealable, so it's a fair question.

As one might have noticed, unlike other flags in memfd_create,
MFD_NOEXEC_SEAL is actually a combination of multiple flags. The idea
is to make it easier to use memfd in the most common way, which is
NOEXEC + F_SEAL_EXEC + MFD_ALLOW_SEALING. This works with sysctl
vm.noexec to help existing applications move to a more secure way of
using memfd.

Proposals have been made to put MFD_NOEXEC_SEAL non-sealable, unless
MFD_ALLOW_SEALING is set, to be consistent with other flags [1] [2],
Those are based on the viewpoint that each flag is an atomic unit,
which is a reasonable assumption. However, MFD_NOEXEC_SEAL was
designed with the intent of promoting the most secure method of using
memfd, therefore a combination of multiple functionalities into one
bit.

Furthermore, the MFD_NOEXEC_SEAL has been added for more than one
year, and multiple applications and distributions have backported and
utilized it. Altering ABI now presents a degree of risk and may lead
to disruption.

MFD_NOEXEC_SEAL is a new flag, and applications must change their code
to use it. There is no backward compatibility problem.

When sysctl vm.noexec == 1 or 2, applications that don't set
MFD_NOEXEC_SEAL or MFD_EXEC will get MFD_NOEXEC_SEAL memfd. And
old-application might break, that is by-design, in such a system
vm.noexec = 0 shall be used. Also no backward compatibility problem.

I propose to include this documentation patch to assist in clarifying
the semantics of MFD_NOEXEC_SEAL, thereby preventing any potential
future confusion.

This patch supersede previous patch which is trying different
direction [3], and please remove [2] from mm-unstable branch when
applying this patch.

Finally, I would like to express my gratitude to David Rheinsberg and
Barnabás Pőcze for initiating the discussion on the topic of sealability.

[1]
https://lore.kernel.org/lkml/20230714114753.170814-1-david@readahead.eu/

[2] 
https://lore.kernel.org/lkml/20240513191544.94754-1-pobrn@protonmail.com/

[3]
https://lore.kernel.org/lkml/20240524033933.135049-1-jeffxu@google.com/

Jeff Xu (1):
  mm/memfd: add documentation for MFD_NOEXEC_SEAL MFD_EXEC

 Documentation/userspace-api/index.rst      |  1 +
 Documentation/userspace-api/mfd_noexec.rst | 86 ++++++++++++++++++++++
 2 files changed, 87 insertions(+)
 create mode 100644 Documentation/userspace-api/mfd_noexec.rst

Comments

Barnabás Pőcze June 7, 2024, 9:41 p.m. UTC | #1
Hi


2024. június 7., péntek 22:35 keltezéssel, jeffxu@chromium.org <jeffxu@chromium.org> írta:

> From: Jeff Xu <jeffxu@chromium.org>
> 
> When MFD_NOEXEC_SEAL was introduced, there was one big mistake: it
> didn't have proper documentation. This led to a lot of confusion,
> especially about whether or not memfd created with the MFD_NOEXEC_SEAL
> flag is sealable. Before MFD_NOEXEC_SEAL, memfd had to explicitly set
> MFD_ALLOW_SEALING to be sealable, so it's a fair question.
> 
> As one might have noticed, unlike other flags in memfd_create,
> MFD_NOEXEC_SEAL is actually a combination of multiple flags. The idea
> is to make it easier to use memfd in the most common way, which is
> NOEXEC + F_SEAL_EXEC + MFD_ALLOW_SEALING. This works with sysctl
> vm.noexec to help existing applications move to a more secure way of
> using memfd.
> 
> Proposals have been made to put MFD_NOEXEC_SEAL non-sealable, unless
> MFD_ALLOW_SEALING is set, to be consistent with other flags [1] [2],
> Those are based on the viewpoint that each flag is an atomic unit,
> which is a reasonable assumption. However, MFD_NOEXEC_SEAL was
> designed with the intent of promoting the most secure method of using
> memfd, therefore a combination of multiple functionalities into one
> bit.
> 
> Furthermore, the MFD_NOEXEC_SEAL has been added for more than one
> year, and multiple applications and distributions have backported and
> utilized it. Altering ABI now presents a degree of risk and may lead
> to disruption.

I feel compelled to mention again that based on my investigation the risk is
minimal. Not to mention that it can easily be reverted if need be.

In my view, it is better to fix the inconsistency than to document it. I would
argue that "`MFD_ALLOW_SEALING` is needed to enable sealing except that XYZ"
is unintuitive and confusing for a non-significant amount of people.

In conclusion, I think it would be unfortunate if the inconsistency was not fixed and
the problem was considered "solved" by a passing mention in the documentation.


Regards,
Barnabás Pőcze

> 
> MFD_NOEXEC_SEAL is a new flag, and applications must change their code
> to use it. There is no backward compatibility problem.
> 
> When sysctl vm.noexec == 1 or 2, applications that don't set
> MFD_NOEXEC_SEAL or MFD_EXEC will get MFD_NOEXEC_SEAL memfd. And
> old-application might break, that is by-design, in such a system
> vm.noexec = 0 shall be used. Also no backward compatibility problem.
> 
> I propose to include this documentation patch to assist in clarifying
> the semantics of MFD_NOEXEC_SEAL, thereby preventing any potential
> future confusion.
> 
> This patch supersede previous patch which is trying different
> direction [3], and please remove [2] from mm-unstable branch when
> applying this patch.
> 
> Finally, I would like to express my gratitude to David Rheinsberg and
> Barnabás Pőcze for initiating the discussion on the topic of sealability.
> 
> [1]
> https://lore.kernel.org/lkml/20230714114753.170814-1-david@readahead.eu/
> 
> [2]
> https://lore.kernel.org/lkml/20240513191544.94754-1-pobrn@protonmail.com/
> 
> [3]
> https://lore.kernel.org/lkml/20240524033933.135049-1-jeffxu@google.com/
> 
> Jeff Xu (1):
>   mm/memfd: add documentation for MFD_NOEXEC_SEAL MFD_EXEC
> 
>  Documentation/userspace-api/index.rst      |  1 +
>  Documentation/userspace-api/mfd_noexec.rst | 86 ++++++++++++++++++++++
>  2 files changed, 87 insertions(+)
>  create mode 100644 Documentation/userspace-api/mfd_noexec.rst
> 
> --
> 2.45.2.505.gda0bf45e8d-goog
> 
>
Jeff Xu June 7, 2024, 10:31 p.m. UTC | #2
Hi

On Fri, Jun 7, 2024, 2:41 PM Barnabás Pőcze <pobrn@protonmail.com> wrote:

> Hi
>
>
> 2024. június 7., péntek 22:35 keltezéssel, jeffxu@chromium.org <
> jeffxu@chromium.org> írta:
>
> > From: Jeff Xu <jeffxu@chromium.org>
> >
> > When MFD_NOEXEC_SEAL was introduced, there was one big mistake: it
> > didn't have proper documentation. This led to a lot of confusion,
> > especially about whether or not memfd created with the MFD_NOEXEC_SEAL
> > flag is sealable. Before MFD_NOEXEC_SEAL, memfd had to explicitly set
> > MFD_ALLOW_SEALING to be sealable, so it's a fair question.
> >
> > As one might have noticed, unlike other flags in memfd_create,
> > MFD_NOEXEC_SEAL is actually a combination of multiple flags. The idea
> > is to make it easier to use memfd in the most common way, which is
> > NOEXEC + F_SEAL_EXEC + MFD_ALLOW_SEALING. This works with sysctl
> > vm.noexec to help existing applications move to a more secure way of
> > using memfd.
> >
> > Proposals have been made to put MFD_NOEXEC_SEAL non-sealable, unless
> > MFD_ALLOW_SEALING is set, to be consistent with other flags [1] [2],
> > Those are based on the viewpoint that each flag is an atomic unit,
> > which is a reasonable assumption. However, MFD_NOEXEC_SEAL was
> > designed with the intent of promoting the most secure method of using
> > memfd, therefore a combination of multiple functionalities into one
> > bit.
> >
> > Furthermore, the MFD_NOEXEC_SEAL has been added for more than one
> > year, and multiple applications and distributions have backported and
> > utilized it. Altering ABI now presents a degree of risk and may lead
> > to disruption.
>
> I feel compelled to mention again that based on my investigation the risk
> is
> minimal. Not to mention that it can easily be reverted if need be.
>


The risk is not zero. If we  changed the ABI it would  be propagated to
early kernel stable versions. Various Linux distributions also backported
the patch to earlier  kernels such as 5.4. If it needs a revert, then
everyone  has to do it again.


> In my view, it is better to fix the inconsistency than to document it. I
> would
> argue that "`MFD_ALLOW_SEALING` is needed to enable sealing except that
> XYZ"
> is unintuitive and confusing for a non-significant amount of people.
>

I understand,  documentation help resolve the confusion, the next step  i
will  updated man page for memfd.

Thanks
-Jeff
Jeff Xu June 8, 2024, 3:35 a.m. UTC | #3
Resent, (previous email is not plain text)

Hi

On Fri, Jun 7, 2024 at 2:41 PM Barnabás Pőcze <pobrn@protonmail.com> wrote:
>
> Hi
>
>
> 2024. június 7., péntek 22:35 keltezéssel, jeffxu@chromium.org <jeffxu@chromium.org> írta:
>
> > From: Jeff Xu <jeffxu@chromium.org>
> >
> > When MFD_NOEXEC_SEAL was introduced, there was one big mistake: it
> > didn't have proper documentation. This led to a lot of confusion,
> > especially about whether or not memfd created with the MFD_NOEXEC_SEAL
> > flag is sealable. Before MFD_NOEXEC_SEAL, memfd had to explicitly set
> > MFD_ALLOW_SEALING to be sealable, so it's a fair question.
> >
> > As one might have noticed, unlike other flags in memfd_create,
> > MFD_NOEXEC_SEAL is actually a combination of multiple flags. The idea
> > is to make it easier to use memfd in the most common way, which is
> > NOEXEC + F_SEAL_EXEC + MFD_ALLOW_SEALING. This works with sysctl
> > vm.noexec to help existing applications move to a more secure way of
> > using memfd.
> >
> > Proposals have been made to put MFD_NOEXEC_SEAL non-sealable, unless
> > MFD_ALLOW_SEALING is set, to be consistent with other flags [1] [2],
> > Those are based on the viewpoint that each flag is an atomic unit,
> > which is a reasonable assumption. However, MFD_NOEXEC_SEAL was
> > designed with the intent of promoting the most secure method of using
> > memfd, therefore a combination of multiple functionalities into one
> > bit.
> >
> > Furthermore, the MFD_NOEXEC_SEAL has been added for more than one
> > year, and multiple applications and distributions have backported and
> > utilized it. Altering ABI now presents a degree of risk and may lead
> > to disruption.
>
> I feel compelled to mention again that based on my investigation the risk is
> minimal. Not to mention that it can easily be reverted if need be.
>
The risk is not zero. If we changed the ABI it would be propagated to
early kernel stable versions. Various Linux distributions also
backported the patch to earlier kernels such as 5.4. If it needs a
revert, then everyone has to do it again.

> In my view, it is better to fix the inconsistency than to document it. I would
> argue that "`MFD_ALLOW_SEALING` is needed to enable sealing except that XYZ"
> is unintuitive and confusing for a non-significant amount of people.
>
I understand,  documentation helps resolve the confusion, the next
step  is to  update the man page for memfd.

Thanks
-Jeff