@@ -37,6 +37,7 @@ Currently, these files are in /proc/sys/fs:
- inode-nr
- inode-state
- nr_open
+- open_mayexec_enforce
- overflowuid
- overflowgid
- pipe-user-pages-hard
@@ -165,6 +166,48 @@ system needs to prune the inode list instead of allocating
more.
+open_mayexec_enforce
+--------------------
+
+The ``RESOLVE_MAYEXEC`` flag can be passed to :manpage:`openat2(2)` to only
+open regular files that are expected to be executable. If the file is not
+identified as executable, then the syscall returns -EACCES. This may allow a
+script interpreter to check executable permission before reading commands from
+a file. One interesting use case is to enforce a "write xor execute" policy
+through interpreters.
+
+The ability to restrict code execution must be thought as a system-wide policy,
+which first starts by restricting mount points with the ``noexec`` option.
+This option is also automatically applied to special filesystems such as /proc
+. This prevents files on such mount points to be directly executed by the
+kernel or mapped as executable memory (e.g. libraries). With script
+interpreters using the ``RESOLVE_MAYEXEC`` flag, the executable permission can
+then be checked before reading commands from files. This makes it possible to
+enforce the ``noexec`` at the interpreter level, and thus propagates this
+security policy to scripts. To be fully effective, these interpreters also
+need to handle the other ways to execute code: command line parameters (e.g.,
+option ``-e`` for Perl), module loading (e.g., option ``-m`` for Python),
+stdin, file sourcing, environment variables, configuration files, etc.
+According to the threat model, it may be acceptable to allow some script
+interpreters (e.g. Bash) to interpret commands from stdin, may it be a TTY or a
+pipe, because it may not be enough to (directly) perform syscalls.
+
+There are two complementary security policies: enforce the ``noexec`` mount
+option, and enforce executable file permission. These policies are handled by
+the ``fs.open_mayexec_enforce`` sysctl (writable only with ``CAP_MAC_ADMIN``)
+as a bitmask:
+
+1 - Mount restriction: checks that the mount options for the underlying VFS
+ mount do not prevent execution.
+
+2 - File permission restriction: checks that the to-be-opened file is marked as
+ executable for the current process (e.g., POSIX permissions).
+
+Code samples can be found in tools/testing/selftests/openat2/omayexec_test.c
+and (with the original ``O_MAYEXEC``) at
+https://github.com/clipos-archive/clipos4_portage-overlay/search?q=O_MAYEXEC .
+
+
overflowgid & overflowuid
-------------------------