[RFC,0/4] Landlock: ioctl support

Message ID 20230502171755.9788-1-gnoack3000@gmail.com (mailing list archive)

Message

Günther Noack May 2, 2023, 5:17 p.m. UTC
Hello!

These patches add ioctl support to Landlock.

It's an early version - it potentially needs more tests and
documentation.  I'd like to circulate the patches early to discuss
whether this approach is feasible.

Objective
~~~~~~~~~

Make ioctl(2) requests restrictable with Landlock,
in a way that is useful for real-world applications.

Proposed approach
~~~~~~~~~~~~~~~~~

Introduce the LANDLOCK_ACCESS_FS_IOCTL right, which restricts the use
of ioctl(2) on file descriptors.

We attach the LANDLOCK_ACCESS_FS_IOCTL right to opened file
descriptors, as we already do for LANDLOCK_ACCESS_FS_TRUNCATE.

I believe that this approach works for the majority of use cases, and
offers a good trade-off between complexity (of the Landlock API and
its implementation) and flexibility when the feature is used.

Notable implications of this approach
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* Existing inherited file descriptors stay unaffected
  when a program enables Landlock.

  This means in particular that in common scenarios,
  the terminal's ioctls (ioctl_tty(2)) continue to work.

* ioctl(2) continues to be available for file descriptors acquired
  through means other than open(2).  Example: Network sockets.
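
The two implications above follow from computing a file's rights once,
at open time.  A minimal userspace sketch of that idea, not the kernel
implementation: all model_* names and the bit value are hypothetical,
and the model ignores details such as which rights a ruleset handles.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical bit value; the real one is defined by the patch series. */
#define MODEL_ACCESS_FS_IOCTL (1ULL << 15)

/* One "path beneath" rule: rights granted below a path prefix. */
struct model_rule {
	const char *prefix;
	uint64_t allowed_access;
};

/* A sandbox domain; enforced == false means no policy is active yet. */
struct model_domain {
	bool enforced;
	const struct model_rule *rules;
	size_t num_rules;
};

/* An open file: its rights are computed once and never re-evaluated. */
struct model_file {
	uint64_t allowed_access;
};

/* True if path equals prefix or lies beneath it. */
static bool path_is_beneath(const char *path, const char *prefix)
{
	size_t n = strlen(prefix);

	assert(n > 0);
	if (strncmp(path, prefix, n) != 0)
		return false;
	return path[n] == '\0' || path[n] == '/' || prefix[n - 1] == '/';
}

/* Models open(2): the rights are frozen into the returned file. */
static struct model_file model_open(const struct model_domain *dom,
				    const char *path)
{
	struct model_file f = { .allowed_access = 0 };
	size_t i;

	if (!dom->enforced) {
		f.allowed_access = ~0ULL; /* unsandboxed: nothing restricted */
		return f;
	}
	for (i = 0; i < dom->num_rules; i++)
		if (path_is_beneath(path, dom->rules[i].prefix))
			f.allowed_access |= dom->rules[i].allowed_access;
	return f;
}

/* Models the ioctl(2) check: only the file's frozen rights matter. */
static bool model_ioctl_allowed(const struct model_file *f)
{
	return (f->allowed_access & MODEL_ACCESS_FS_IOCTL) != 0;
}
```

In this model, a file opened before the domain is enforced keeps its
ioctl right, while a file opened afterwards gets it only if a rule
covers its path.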

Examples
~~~~~~~~

Starting a sandboxed shell from $HOME with samples/landlock/sandboxer:

  LL_FS_RO=/ LL_FS_RW=. ./sandboxer /bin/bash

The LANDLOCK_ACCESS_FS_IOCTL right is part of the "read-write" rights
here, so we expect that newly opened files outside of $HOME don't work
with ioctl(2).

  * "stty" works: It probes terminal properties on the inherited
    terminal file descriptor.

  * "stty </dev/tty" fails: /dev/tty can be reopened, but the ioctl is
    denied.

  * "eject" fails: The ioctls for operating the CD-ROM drive are
    denied.

  * "ls /dev" works: It uses ioctl to get the terminal size for
    columnar layout.

  * The text editors "vim" and "mg" work.  (GNU Emacs fails because it
    attempts to reopen /dev/tty.)
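
For illustration, the kind of terminal probe that "stty" and "ls"
perform can be sketched as follows.  The function names are
hypothetical.  On an inherited terminal FD the first function would
keep working in the sandbox; the second one reopens the path, which is
the case that gets denied above (presumably with EACCES, as for other
Landlock denials; that error code is an assumption here).

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Returns the number of terminal columns for fd, or -errno on failure. */
static int terminal_columns(int fd)
{
	struct winsize ws;

	assert(fd >= 0);
	if (ioctl(fd, TIOCGWINSZ, &ws) == -1)
		return -errno;
	return ws.ws_col;
}

/*
 * Same probe, but on a freshly opened file, as in "stty </dev/tty".
 * Returns the number of columns, or -errno if open or ioctl fails.
 */
static int terminal_columns_of_path(const char *path)
{
	int fd = open(path, O_RDONLY | O_NOCTTY);
	int ret;

	if (fd == -1)
		return -errno;
	ret = terminal_columns(fd);
	close(fd);
	return ret;
}
```

Calling terminal_columns(STDIN_FILENO) corresponds to plain "stty",
and terminal_columns_of_path("/dev/tty") to the reopening variant.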

Alternatives considered: Allow-listing specific ioctl requests
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

It would be technically possible to keep an allow-list of ioctl
requests and thereby control in more detail which ioctls should work.

I believe that this is not needed for the majority of use cases and
that it is a reasonable trade-off to make here (but I'm happy to hear
about counterexamples).  The reasoning is:

* Many programs do not need ioctl at all,
  and denying all ioctl(2) requests is OK for these.

* Other programs need ioctl, but only for the terminal FDs.
  This is supported because these file descriptors are usually
  inherited from the parent process - so the parent process gets to
  control the ioctl(2) policy for them.

* Some programs need ioctl on specific files that they are opening
  themselves.  They can allow-list these file paths for ioctl(2).
  This makes the programs work, but it restricts a variety of other
  ioctl requests which are otherwise possible through opening other
  files.

Because the LANDLOCK_ACCESS_FS_IOCTL right is attached to the file
descriptor, programs have flexible options to control which ioctl
operations should work, without the implementation complexity of
additional ioctl allow-lists in the kernel.

Finally, the proposed approach is simpler in implementation and has
lower API complexity, but it does *not* preclude us from implementing
per-ioctl-request allow-lists later, if that turns out to be
necessary.

Related Work
~~~~~~~~~~~~

OpenBSD's pledge(2) [1] restricts ioctl(2) independent of the file
descriptor which is used.  The implementers maintain multiple
allow-lists of predefined ioctl(2) operations required for different
application domains such as "audio", "bpf", "tty" and "inet".

OpenBSD does not guarantee ABI backwards compatibility to the same
extent as Linux does, so it's easier for them to update these lists in
later versions.  It might not be a feasible approach for Linux though.

[1] https://man.openbsd.org/OpenBSD-7.3/pledge.2

Feedback I'm looking for
~~~~~~~~~~~~~~~~~~~~~~~~

Some specific points I would like to get your opinion on:

* Is this the right general approach to restricting ioctl(2)?

  It will probably be possible to find counter-examples where the
  alternative (see above) is better.  I'd be interested in these, and
  in how common they are, to understand whether we have picked the
  right trade-off here.

* Should we introduce a "landlock_fd_rights_limit()" syscall?

  We could potentially implement a system call for dropping the
  LANDLOCK_ACCESS_FS_IOCTL and LANDLOCK_ACCESS_FS_TRUNCATE rights from
  existing file descriptors (independent of the type of file
  descriptor, even).

  Possible use cases would be to (a) restrict the rights on inherited
  file descriptors like std{in,out,err} and to (b) restrict ioctl and
  truncate operations on file descriptors that are not acquired
  through open(2), such as network sockets.

  This would be similar to the cap_rights_limit(2) system call in
  FreeBSD's Capsicum.

  This idea feels somewhat orthogonal to the ioctl patch, but it would
  start to be more useful if the ioctl right exists.
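
To illustrate the intended semantics only: such a call would
presumably be monotonic, i.e. it could drop rights from a file
descriptor but never add them back.  A minimal userspace model, with
hypothetical names (no such syscall exists, and the IOCTL bit value is
an assumption):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model bits for the two rights attached to open files. */
#define MODEL_ACCESS_FS_TRUNCATE (1ULL << 14)
#define MODEL_ACCESS_FS_IOCTL (1ULL << 15)

/* Userspace stand-in for the rights stored with an open file. */
struct model_fd {
	uint64_t allowed_access;
};

/* Keeps only the rights listed in keep; can never add a right. */
static void model_fd_rights_limit(struct model_fd *fd, uint64_t keep)
{
	assert(fd != NULL);
	fd->allowed_access &= keep;
}
```

Because the operation is a bitwise AND, calling it again with a wider
mask leaves the already dropped rights dropped.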


Günther Noack (4):
  landlock: Increment Landlock ABI version to 4
  landlock: Add LANDLOCK_ACCESS_FS_IOCTL access right
  selftests/landlock: Test ioctl support
  samples/landlock: Add support for LANDLOCK_ACCESS_FS_IOCTL

 include/uapi/linux/landlock.h                | 19 ++++--
 samples/landlock/sandboxer.c                 | 12 +++-
 security/landlock/fs.c                       | 20 +++++-
 security/landlock/limits.h                   |  2 +-
 security/landlock/syscalls.c                 |  2 +-
 tools/testing/selftests/landlock/base_test.c |  2 +-
 tools/testing/selftests/landlock/fs_test.c   | 67 +++++++++++++++++++-
 7 files changed, 107 insertions(+), 17 deletions(-)


base-commit: 457391b0380335d5e9a5babdec90ac53928b23b4

Comments

Mickaël Salaün May 4, 2023, 9:12 p.m. UTC | #1
Thanks for this RFC, this is interesting!

I previously thought a lot about IOCTL restrictions and here are some 
notes:

IOCTLs are everywhere, from devices to filesystems (see fscrypt). Each 
different file type may behave differently for the same IOCTL 
command/ID. It is also worth noting that there are a lot of different 
IOCTLs and that their number is growing over time. Some are dedicated 
to getting data (i.e. read) and others to changing the state of a 
device (i.e. write); some are innocuous (e.g. FIOCLEX, FIONCLEX) and 
others potentially harmful. _IOC_READ and _IOC_WRITE can be useful, but 
they are not mandatory, there are exceptions, and it may be difficult 
to identify whether a command pertains to one, the other, or both 
kinds of actions.
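
As an illustration of why the direction bits are only a hint, here is
a minimal sketch that decodes them from a request number.  The enum
and function names are hypothetical.  Legacy commands defined as plain
constants (on x86, for example, TIOCGWINSZ is 0x5413) carry no
direction bits at all, although they do transfer data.

```c
#include <assert.h>
#include <linux/ioctl.h>

/* What the encoding of a request number claims about data direction. */
enum ioctl_dir_hint {
	HINT_NONE,       /* no direction encoded; says nothing about effects */
	HINT_READ,       /* data is copied from the kernel to user space */
	HINT_WRITE,      /* data is copied from user space to the kernel */
	HINT_READ_WRITE, /* data is copied in both directions */
};

/* Decodes the direction bits; this is a hint, not a guarantee. */
static enum ioctl_dir_hint ioctl_dir_hint(unsigned long request)
{
	switch (_IOC_DIR(request)) {
	case _IOC_READ:
		return HINT_READ;
	case _IOC_WRITE:
		return HINT_WRITE;
	case _IOC_READ | _IOC_WRITE:
		return HINT_READ_WRITE;
	default:
		return HINT_NONE;
	}
}
```

A classifier built only on this would have to treat every HINT_NONE
command as unknown, which is one reason a per-command list maintained
in user space looks attractive.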

I therefore think it would be very useful to be able to tie file/device 
types to a set of IOCTLs, letting user space libraries define and 
classify the IOCTL semantics.

Instead of relying on a LANDLOCK_ACCESS_FS_IOCTL which would allow or 
deny all IOCTLs, we can extend the path_beneath struct to manage IOCTLs 
in addition to regular file accesses. Because dealing with a set of 
IOCTLs would mean dealing with a lot of data and combinations, I 
thought about creating groups of IOCTLs (defining access semantics) 
that could be matched against file hierarchies. The composable nature 
of Landlock domains is also another constraint to keep in mind.

// New rule type dedicated to define groups of IOCTLs.
struct landlock_ioctl_attr {
   __u32 command; // IOCTL number/ID
   dev_t device; // must be 0 for regular file and directory
   __u8 file_type;
   __u8 id_mask; // if 0, then applied globally
};

We could use landlock_add_rule(2) to fill a set of landlock_ioctl_attr 
into a ruleset and use them with landlock_path_beneath_attr entries:

// LANDLOCK_RULE_PATH_BENEATH, leveraging the extensible design of
// landlock_path_beneath_attr, hence the same first fields.
struct landlock_path_beneath_attr {
   __u64 allowed_access;
   __s32 parent_fd;
   __u16 allowed_ioctl_id_mask;
};

landlock_ioctl_attr includes an 8-bit mask for which each bit identifies 
a set of allowed IOCTLs per device/file type. This mask is then tied to 
a path_beneath_attr. We cannot use number IDs because of dev_t+IOCTL->ID 
intersection conflicts. Using an id_mask makes it possible to group 
(specific) IOCTLs together, thereby creating synthetic access rights.

When merging a ruleset with a domain, each IOCTL ID mask is shifted and 
ORed with those of the other layers to get a (8*16) 128-bit mask, stored 
in an IOCTL/dev_t table and in the related landlock_object. When looking 
for an IOCTL request, Landlock first looks into the IOCTL set ID table 
and gets the global set ID mask, which kind of translates to a 
composition of synthetic access rights (stored with the 
landlock_layer.ioctl_access bitmask). We then walk through all the 
inodes to match the whole mask.

I realize that this is complex and this explanation might be confusing 
though. What do you think?



On 02/05/2023 19:17, Günther Noack wrote:
> Hello!
> 
> These patches add ioctl support to Landlock.
> 
> It's an early version - it potentially needs more tests and
> documentation.  I'd like to circulate the patches early to discuss
> whether this approach is feasible.
> 
> Objective
> ~~~~~~~~~
> 
> Make ioctl(2) requests restrictable with Landlock,
> in a way that is useful for real-world applications.
> 
> Proposed approach
> ~~~~~~~~~~~~~~~~~
> 
> Introduce the LANDLOCK_ACCESS_FS_IOCTL right, which restricts the use
> of ioctl(2) on file descriptors.
> 
> We attach the LANDLOCK_ACCESS_FS_IOCTL right to opened file
> descriptors, as we already do for LANDLOCK_ACCESS_FS_TRUNCATE.
> 
> I believe that this approach works for the majority of use cases, and
> offers a good trade-off between Landlock API and implementation
> complexity and flexibility when the feature is used.
> 
> Notable implications of this approach
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> 
> * Existing inherited file descriptors stay unaffected
>    when a program enables Landlock.
> 
>    This means in particular that in common scenarios,
>    the terminal's ioctls (ioctl_tty(2)) continue to work.
> 
> * ioctl(2) continues to be available for file descriptors acquired
>    through means other than open(2).  Example: Network sockets.
> 
> Examples
> ~~~~~~~~
> 
> Starting a sandboxed shell from $HOME with samples/landlock/sandboxer:
> 
>    LL_FS_RO=/ LL_FS_RW=. ./sandboxer /bin/bash
> 
> The LANDLOCK_ACCESS_FS_IOCTL right is part of the "read-write" rights
> here, so we expect that newly opened files outside of $HOME don't work
> with ioctl(2).
> 
>    * "stty" works: It probes terminal properties
> 
>    * "stty </dev/tty" fails: /dev/tty can be reopened, but the ioctl is
>      denied.
> 
>    * "eject" fails: ioctls to use CD-ROM drive are denied.
> 
>    * "ls /dev" works: It uses ioctl to get the terminal size for
>      columnar layout
> 
>    * The text editors "vim" and "mg" work.  (GNU Emacs fails because it
>      attempts to reopen /dev/tty.)
> 
> Alternatives considered: Allow-listing specific ioctl requests
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> 
> It would be technically possible to keep an allow-list of ioctl
> requests and thereby control in more detail which ioctls should work.
> 
> I believe that this is not needed for the majority of use cases and
> that it is a reasonable trade-off to make here (but I'm happy to hear
> about counterexamples).  The reasoning is:
> 
> * Many programs do not need ioctl at all,
>    and denying all ioctl(2) requests is OK for these.
> 
> * Other programs need ioctl, but only for the terminal FDs.
>    This is supported because these file descriptors are usually
>    inherited from the parent process - so the parent process gets to
>    control the ioctl(2) policy for them.
> 
> * Some programs need ioctl on specific files that they are opening
>    themselves.  They can allow-list these file paths for ioctl(2).
>    This makes the programs work, but it restricts a variety of other
>    ioctl requests which are otherwise possible through opening other
>    files.
> 
> Because the LANDLOCK_ACCESS_FS_IOCTL right is attached to the file
> descriptor, programs have flexible options to control which ioctl
> operations should work, without the implementation complexity of
> additional ioctl allow-lists in the kernel.
> 
> Finally, the proposed approach is simpler in implementation and has
> lower API complexity, but it does *not* preclude us from implementing
> per-ioctl-request allow lists later, if that turns out to be necessary
> at a later point.

I value this simplicity, but I'm also wondering how useful this 
allow/deny all IOCTLs approach would be in real-world scenarios. ;)


> 
> Related Work
> ~~~~~~~~~~~~
> 
> OpenBSD's pledge(2) [1] restricts ioctl(2) independent of the file
> descriptor which is used.  The implementers maintain multiple
> allow-lists of predefined ioctl(2) operations required for different
> application domains such as "audio", "bpf", "tty" and "inet".
> 
> OpenBSD does not guarantee ABI backwards compatibility to the same
> extent as Linux does, so it's easier for them to update these lists in
> later versions.  It might not be a feasible approach for Linux though.
> 
> [1] https://man.openbsd.org/OpenBSD-7.3/pledge.2
> 
> Feedback I'm looking for
> ~~~~~~~~~~~~~~~~~~~~~~~~
> 
> Some specific points I would like to get your opinion on:
> 
> * Is this the right general approach to restricting ioctl(2)?
> 
>    It will probably be possible to find counter-examples where the
>    alternative (see above) is better.  I'd be interested in these, and
>    in how common they are, to understand whether we have picked the
>    right trade-off here.

In the long term, I'd like Landlock to be able to restrict a set of 
IOCTLs, taking into account the type of file/device. Being able to deny 
all IOCTLs might be useful and is much easier to implement though.


> 
> * Should we introduce a "landlock_fd_rights_limit()" syscall?
> 
>    We could potentially implement a system call for dropping the
>    LANDLOCK_ACCESS_FS_IOCTL and LANDLOCK_ACCESS_FS_TRUNCATE rights from
>    existing file descriptors (independent of the type of file
>    descriptor, even).
> 
>    Possible use cases would be to (a) restrict the rights on inherited
>    file descriptors like std{in,out,err} and to (b) restrict ioctl and
>    truncate operations on file descriptors that are not acquired
>    through open(2), such as network sockets.
> 
>    This would be similar to the cap_rights_limit(2) system call in
>    FreeBSD's Capsicum.
> 
>    This idea feels somewhat orthogonal to the ioctl patch, but it would
>    start to be more useful if the ioctl right exists.

This is indeed interesting, and that should be useful for the cases you 
explained. I think supporting IOCTLs is more important for now, but a 
new syscall to restrict FDs could be useful in the future. We should 
also think about batch operations on FD (see the close_range syscall), 
for instance to deny all IOCTLs on inherited or received FDs.


> 
> 
> Günther Noack (4):
>    landlock: Increment Landlock ABI version to 4
>    landlock: Add LANDLOCK_ACCESS_FS_IOCTL access right
>    selftests/landlock: Test ioctl support
>    samples/landlock: Add support for LANDLOCK_ACCESS_FS_IOCTL
> 
>   include/uapi/linux/landlock.h                | 19 ++++--
>   samples/landlock/sandboxer.c                 | 12 +++-
>   security/landlock/fs.c                       | 20 +++++-
>   security/landlock/limits.h                   |  2 +-
>   security/landlock/syscalls.c                 |  2 +-
>   tools/testing/selftests/landlock/base_test.c |  2 +-
>   tools/testing/selftests/landlock/fs_test.c   | 67 +++++++++++++++++++-
>   7 files changed, 107 insertions(+), 17 deletions(-)
> 
> 
> base-commit: 457391b0380335d5e9a5babdec90ac53928b23b4

Günther Noack May 10, 2023, 7:21 p.m. UTC | #2
Hello Mickaël!

Sorry for the late reply.  This was indeed a bit difficult for me to
understand; maybe it just needs more clarification.

Let me try to paraphrase your proposal (inline below) so we can
resolve the misunderstandings.


On Thu, May 04, 2023 at 11:12:00PM +0200, Mickaël Salaün wrote:
> Thanks for this RFC, this is interesting!
> 
> I previously thought a lot about IOCTLs restrictions and here are some
> notes:
> 
> IOCTLs are everywhere, from devices to filesystems (see fscrypt). Each
> different file type may behave differently for the same IOCTL command/ID. It
> is also worth noting that there are a lot of different IOCTLs, they are
> growing over time, some might be dedicated to get some data (i.e. read) and
> others to change the state of a device (i.e. write), some might be innocuous
> (e.g. FIOCLEX, FIONCLEX) and others potentially harmful. _IOC_READ and
> _IOC_WRITE can be useful but they are not mandatory, there are exceptions,
> and it may be difficult to identify if a command pertains to one, the other,
> or both kind of actions.
> 
> I then think it would be very useful to be able to tie file/device types to
> a set of IOCTLs, letting user space libraries define and classify the IOCTL
> semantic.
> 
> Instead of relying on a LANDLOCK_ACCESS_FS_IOCTL which would allow or deny
> all IOCTLs, we can extend the path_beneath struct to manage IOCTLs in
> addition to regular file accesses. Because dealing with a set of IOCTLs
> would imply to deal with a lot of data and combinations, I thought about
> creating groups of IOCTLs (defining access semantic) that could be matched
> against file hierarchies. The composability nature of Landlock domains is
> also another constraint to keep in mind.
> 
> // New rule type dedicated to define groups of IOCTLs.
> struct landlock_ioctl_attr {
>   __u32 command; // IOCTL number/ID
>   dev_t device; // must be 0 for regular file and directory
>   __u8 file_type;
>   __u8 id_mask; // if 0, then applied globally
> };
>
> We could use landlock_add_rule(2) to fill a set of landlock_ioctl_attr into
> a ruleset and use them with landlock_path_beneath_attr entries:
> 
> // LANDLOCK_RULE_PATH_BENEATH, leveraging the extensible design of
> // landlock_path_beneath_attr, hence the same first fields.
> struct landlock_path_beneath_attr {
>   __u64 allowed_access;
>   __s32 parent_fd;
>   __u16 allowed_ioctl_id_mask;
> };
> 
> landlock_ioctl_attr includes a 8-bit mask for which each bit identifies a
> set of allowed IOCTLs per device/file type. This mask is then tied to a
> path_beneath_attr. We cannot use number IDs because of dev_t+IOCTL->ID
> intersection conflicts. Using an id_mask enables to group (specific) IOCTLs
> together, then creating synthetic access rights.
> 
> When merging a ruleset with a domain, each IOCTL ID mask is shifted and ORed
> with the other layer ones to get a (8*16) 128-bit mask, stored in an
> IOCTL/dev_t table and in the related landlock_object. When looking for an
> IOCTL request, Landlock first looks into the IOCTL set ID table and get the
> global set ID mask, which kind of translates to a composition of synthetic
> access rights (stored with the landlock_layer.ioctl_access bitmask). We then
> walk through all the inodes to match the whole mask.
> 
> I realize that this is complex and this explanation might be confusing
> though. What do you think?

To be honest, I am not fully sure I understand the landlock_ioctl_attr
struct correctly.

My current guess is:

  command:   a single ioctl request number that should be permitted

  device:    if device != 0, require the dev_t (major+minor number)
             of the file to match, before permitting the ioctl

  file_type: if set (!= 0?), require the file type to match,
             before permitting the ioctl

  id_mask:   Indicates the IDs of the groups where this ioctl
             should be permitted(?)

So -- if we were to implement this without any optimizations -- the
logic is presumably something like this:

  If Landlock checks an ioctl with a request_id on a file:

  We look up the allowed_ioctl_id_mask and loop over the bits set in
  that mask.

  For every bit set in that mask, we look up the matching ioctl group.
  Within the group, we look for a landlock_ioctl_attr rule that
  permits that ioctl request.

  The request is granted by a landlock_ioctl_attr if:

    attr.command == request_id
    && (attr.device == 0 || file_inode(file)->i_rdev == attr.device)
    && (attr.file_type == 0
        || landlock_get_file_type?(file) == attr.file_type)
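
To pin this reading down, here is the same unoptimized logic as a
small userspace sketch.  It reflects only my guess at the semantics:
the model_* types are hypothetical stand-ins, a rule counts as part of
a group if its id_mask shares a bit with allowed_ioctl_id_mask, and
the "id_mask == 0 applies globally" case is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

/* Userspace stand-in for the proposed struct; the layout is a guess. */
struct model_ioctl_attr {
	uint32_t command;  /* ioctl request number to permit */
	dev_t device;      /* 0: any device */
	uint8_t file_type; /* 0: any file type */
	uint8_t id_mask;   /* groups this rule belongs to */
};

/* The properties of the opened file that the check looks at. */
struct model_file {
	dev_t rdev;
	uint8_t file_type;
};

/* True if this single rule permits request_id on file. */
static bool attr_grants(const struct model_ioctl_attr *attr,
			uint32_t request_id, const struct model_file *file)
{
	return attr->command == request_id &&
	       (attr->device == 0 || file->rdev == attr->device) &&
	       (attr->file_type == 0 || file->file_type == attr->file_type);
}

/*
 * Linear scan over all rules: a rule is considered only if it belongs
 * to at least one group enabled in allowed_ioctl_id_mask.
 */
static bool ioctl_allowed(const struct model_ioctl_attr *attrs,
			  size_t num_attrs, uint16_t allowed_ioctl_id_mask,
			  uint32_t request_id, const struct model_file *file)
{
	size_t i;

	assert(attrs != NULL || num_attrs == 0);
	for (i = 0; i < num_attrs; i++) {
		if (!(attrs[i].id_mask & allowed_ioctl_id_mask))
			continue;
		if (attr_grants(&attrs[i], request_id, file))
			return true;
	}
	return false;
}
```

Scanning the rules and testing for group overlap is equivalent to
looping over the mask bits and then over each group's rules; it is
just shorter to write down.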

Is this roughly what you imagined?


Some specific things I don't understand well are:

* How does id_mask identify the ioctl group?  Do you envision an
  interface where you can add a landlock_ioctl_attr to multiple groups
  at once?

  When the added landlock_ioctl_attr has id_mask=3, does it add that
  attr to group 2 and 1?

* How does allowed_ioctl_id_mask match against the ioctl group IDs
  (id_mask)?  I'm particularly confused because the
  allowed_ioctl_id_mask is __u16, whereas id_mask is __u8?  Is this
  intentional?

* What is file_type?  Are those the file types as used in mknod(2),
  so that you can distinguish between regular files, directories,
  named pipes and the like?

* If I understand the proposal right, the .device and .file_type in
  landlock_ioctl_attr are narrowing down the set of files that the
  ioctl policy applies to.  In all rules up until Landlock v3, which
  we already have, the selection of the files which the rule applies
  to is purely done based on file hierarchy.

  Would it not be a more orthogonal API if the "file selection" part
  of the Landlock API and the "policy adding" part for these selected
  files were independent of each other?  Then the .device and
  .file_type selection scheme could be used for the existing policies
  as well?

* When restricting by dev_t (major and minor number), aren't there use
  cases where a system might have 256 CDROM drives, and you'd need to
  allow-list all of these minor number combinations?

* Aren't many ioctl use cases already usable with just the proposal I
  made?  If you add a rule to permit IOCTL for /dev/cdrom0, that
  opened file will anyway only expose a small subset of the ioctls
  that the kernel as a whole offers, no?  Are there ioctls which are
  offered independent of the file type which I'm overlooking? (Sorry,
  this might be a naive question. :))


> On 02/05/2023 19:17, Günther Noack wrote:
> > Alternatives considered: Allow-listing specific ioctl requests
> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > 
> > It would be technically possible to keep an allow-list of ioctl
> > requests and thereby control in more detail which ioctls should work.
> > 
> > I believe that this is not needed for the majority of use cases and
> > that it is a reasonable trade-off to make here (but I'm happy to hear
> > about counterexamples).  The reasoning is:
> > 
> > * Many programs do not need ioctl at all,
> >    and denying all ioctl(2) requests is OK for these.
> > 
> > * Other programs need ioctl, but only for the terminal FDs.
> >    This is supported because these file descriptors are usually
> >    inherited from the parent process - so the parent process gets to
> >    control the ioctl(2) policy for them.
> > 
> > * Some programs need ioctl on specific files that they are opening
> >    themselves.  They can allow-list these file paths for ioctl(2).
> >    This makes the programs work, but it restricts a variety of other
> >    ioctl requests which are otherwise possible through opening other
> >    files.
> > 
> > Because the LANDLOCK_ACCESS_FS_IOCTL right is attached to the file
> > descriptor, programs have flexible options to control which ioctl
> > operations should work, without the implementation complexity of
> > additional ioctl allow-lists in the kernel.
> > 
> > Finally, the proposed approach is simpler in implementation and has
> > lower API complexity, but it does *not* preclude us from implementing
> > per-ioctl-request allow lists later, if that turns out to be necessary
> > at a later point.
> 
> I value this simplicity, but I'm also wondering about how much this
> allow/deny all IOCTLs approach would be useful in real case scenarios. ;)

Mickaël, would you be open to gathering some more data on this from
existing users, to better understand which use cases they have?

(Looking in the direction of Jeff Xu, who has inquired about Landlock
for Chromium in the past -- do you happen to know in which ways you'd
want to restrict ioctls, if you have that need? :))


> > Related Work
> > ~~~~~~~~~~~~
> > 
> > OpenBSD's pledge(2) [1] restricts ioctl(2) independent of the file
> > descriptor which is used.  The implementers maintain multiple
> > allow-lists of predefined ioctl(2) operations required for different
> > application domains such as "audio", "bpf", "tty" and "inet".
> > 
> > OpenBSD does not guarantee ABI backwards compatibility to the same
> > extent as Linux does, so it's easier for them to update these lists in
> > later versions.  It might not be a feasible approach for Linux though.
> > 
> > [1] https://man.openbsd.org/OpenBSD-7.3/pledge.2
> > 
> > Feedback I'm looking for
> > ~~~~~~~~~~~~~~~~~~~~~~~~
> > 
> > Some specific points I would like to get your opinion on:
> > 
> > * Is this the right general approach to restricting ioctl(2)?
> > 
> >    It will probably be possible to find counter-examples where the
> >    alternative (see above) is better.  I'd be interested in these, and
> >    in how common they are, to understand whether we have picked the
> >    right trade-off here.
> 
> In the long term, I'd like Landlock to be able to restrict a set of IOCTLs,
> taking into account the type of file/device. Being able to deny all IOCTLs
> might be useful and is much easier to implement though.

In my mind, it's a trade-off between implementation complexity and
flexibility.

My proposal is more simple-minded than what you explained, but might
solve the bulk of real-life use cases for a lower implementation
complexity. (With the caveat that I don't understand the real life use
cases well enough to know how far it really gets us, that's why this
is just an RFC.)

I'm not *just* saying this because I'm lazy ;), but I also feel that
we should be careful with the amount of complexity that we take on,
especially if there is a chance that it won't be needed in practice.

I think that it might be a feasible approach to start with the
LANDLOCK_ACCESS_FS_IOCTL approach and then look at its usage to
understand whether we see a significant number of programs whose
sandboxes are too coarse because of this.

If more fine-grained control is needed, we can still put the other
approach on top, and the additional complexity from
LANDLOCK_ACCESS_FS_IOCTL that we have to support is not dramatically
high.

If more fine-grained control is not needed, we can skip the
implementation of the other approach and Landlock is simpler.

Then again, I'm somewhat new to kernel development still, I'm not sure
whether this is an approach that is deemed acceptable in this setting?


> > * Should we introduce a "landlock_fd_rights_limit()" syscall?
> > 
> >    We could potentially implement a system call for dropping the
> >    LANDLOCK_ACCESS_FS_IOCTL and LANDLOCK_ACCESS_FS_TRUNCATE rights from
> >    existing file descriptors (independent of the type of file
> >    descriptor, even).
> > 
> >    Possible use cases would be to (a) restrict the rights on inherited
> >    file descriptors like std{in,out,err} and to (b) restrict ioctl and
> >    truncate operations on file descriptors that are not acquired
> >    through open(2), such as network sockets.
> > 
> >    This would be similar to the cap_rights_limit(2) system call in
> >    FreeBSD's Capsicum.
> > 
> >    This idea feels somewhat orthogonal to the ioctl patch, but it would
> >    start to be more useful if the ioctl right exists.
> 
> This is indeed interesting, and that should be useful for the cases you
> explained. I think supporting IOCTLs is more important for now, but a new
> syscall to restrict FDs could be useful in the future.

Ack, OK.  I agree, it's not that urgent yet.


> We should also think about batch operations on FD (see the
> close_range syscall), for instance to deny all IOCTLs on inherited
> or received FDs.

Hm, you mean a landlock_fd_rights_limit_range() syscall to limit the
rights for an entire range of FDs?

I have to admit, I'm not familiar with the real-life use cases of
close_range().  In most programs I work with, it's difficult to reason
about the ordering of their file descriptors once the program has
really started to run. So I imagine that close_range() is mostly used
to "sanitize" the open file descriptors at the start of main(), and
you have a similar use case in mind for this one as well?

If it's just about closing the range from 0 to 2, I'm not sure it's
worth adding a custom syscall. :)

Thanks for the review!
–Günther

Jeff Xu May 24, 2023, 9:43 p.m. UTC | #3
Sorry for the late reply.
>
> (Looking in the direction of Jeff Xu, who has inquired about Landlock
> for Chromium in the past -- do you happen to know in which ways you'd
> want to restrict ioctls, if you have that need? :))
>

Regarding this patch, here is some feedback from ChromeOS:
- In the short term: we are looking to integrate Landlock into our
  sandboxer, so the ability to restrict to a specific device is huge.
- Fundamentally though, in the effort of bringing the expected process
  behaviour as close as possible to the allowed behaviour, the ability
  to speak of ioctl() path access in Landlock would be huge -- at
  least we can continue to enumerate in policy what processes are
  allowed to do, even if we still lack the ability to restrict
  individual ioctl commands for a specific device node.

Regarding the medium term:
My thoughts are, from a software architecture point of view, it would
be nice to think in planes, i.e. Data Plane / Control Plane /
Signaling Plane / Normal User Plane / Privileged User Plane. Let the
application define its planes and assign operations to them. Landlock
would provide the data structures and syscalls to construct the planes.

However, one thing I'm not sure about is the third argument of ioctl:
int ioctl(int fd, unsigned long request, ...);
Is it possible for a driver to use the same request ID and then put
whatever it wants into the third argument? How would we deal with
that effectively?

For real-world use cases, Dmitry Torokhov (added to list) can help.

PS: There is also an LWN article about the SELinux implementation of ioctl: [1]
[1] https://lwn.net/Articles/428140/

Thanks!
-Jeff Xu

Mickaël Salaün June 17, 2023, 9:47 a.m. UTC | #4
This subject is not easy, but I think we're reaching a consensus (see my 
3-step plan below). I answered your questions about the (complex) 
interface I proposed, but we should focus on the first step now (your 
initial proposal) and get back to the other steps later in another 
email thread.


On 10/05/2023 21:21, Günther Noack wrote:
> Hello Mickaël!
> 
> Sorry for the late reply.  This was indeed a bit difficult for me to
> understand; maybe it just needs more clarification.
> 
> Let me try to paraphrase your proposal (inline below) so we can
> resolve the misunderstandings.
> 
> 
> On Thu, May 04, 2023 at 11:12:00PM +0200, Mickaël Salaün wrote:
>> Thanks for this RFC, this is interesting!
>>
>> I previously thought a lot about IOCTLs restrictions and here are some
>> notes:
>>
>> IOCTLs are everywhere, from devices to filesystems (see fscrypt). Each
>> different file type may behave differently for the same IOCTL command/ID. It
>> is also worth noting that there are a lot of different IOCTLs, they are
>> growing over time, some might be dedicated to get some data (i.e. read) and
>> others to change the state of a device (i.e. write), some might be innocuous
>> (e.g. FIOCLEX, FIONCLEX) and others potentially harmful. _IOC_READ and
>> _IOC_WRITE can be useful but they are not mandatory, there are exceptions,
>> and it may be difficult to identify if a command pertains to one, the other,
>> or both kind of actions.
>>
>> I then think it would be very useful to be able to tie file/device types to
>> a set of IOCTLs, letting user space libraries define and classify the IOCTL
>> semantic.
>>
>> Instead of relying on a LANDLOCK_ACCESS_FS_IOCTL which would allow or deny
>> all IOCTLs, we can extend the path_beneath struct to manage IOCTLs in
>> addition to regular file accesses. Because dealing with a set of IOCTLs
>> would imply to deal with a lot of data and combinations, I thought about
>> creating groups of IOCTLs (defining access semantic) that could be matched
>> against file hierarchies. The composability nature of Landlock domains is
>> also another constraint to keep in mind.
>>
>> // New rule type dedicated to define groups of IOCTLs.
>> struct landlock_ioctl_attr {
>>    __u32 command; // IOCTL number/ID
>>    dev_t device; // must be 0 for regular file and directory
>>    __u8 file_type;
>>    __u8 id_mask; // if 0, then applied globally
>> };
>>
>> We could use landlock_add_rule(2) to fill a set of landlock_ioctl_attr into
>> a ruleset and use them with landlock_path_beneath_attr entries:
>>
>> // LANDLOCK_RULE_PATH_BENEATH, leveraging the extensible design of
>> // landlock_path_beneath_attr, hence the same first fields.
>> struct landlock_path_beneath_attr {
>>    __u64 allowed_access;
>>    __s32 parent_fd;
>>    __u16 allowed_ioctl_id_mask;
>> };
>>
>> landlock_ioctl_attr includes a 8-bit mask for which each bit identifies a
>> set of allowed IOCTLs per device/file type. This mask is then tied to a
>> path_beneath_attr. We cannot use number IDs because of dev_t+IOCTL->ID
>> intersection conflicts. Using an id_mask enables to group (specific) IOCTLs
>> together, then creating synthetic access rights.
>>
>> When merging a ruleset with a domain, each IOCTL ID mask is shifted and ORed
>> with the other layer ones to get a (8*16) 128-bit mask, stored in an
>> IOCTL/dev_t table and in the related landlock_object. When looking for an
>> IOCTL request, Landlock first looks into the IOCTL set ID table and get the
>> global set ID mask, which kind of translates to a composition of synthetic
>> access rights (stored with the landlock_layer.ioctl_access bitmask). We then
>> walk through all the inodes to match the whole mask.
>>
>> I realize that this is complex and this explanation might be confusing
>> though. What do you think?
> 
> To be honest, I am not fully sure I understand the landlock_ioctl_attr
> struct correctly.
> 
> My current guess is:
> 
>    command:   a single ioctl request number that should be permitted
> 
>    device:    if device != 0; require the dev_t (major+minor number)
>               of the file to match, before permitting the ioctl
> 
>    file_type: if set (!= 0?), require the file type to match,
>               before permitting the ioctl
> 
>    id_mask:   Indicates the IDs of the groups where this ioctl
>               should be permitted(?)
> 
> So -- if we were to implement this without any optimizations -- the
> logic of this is presumably something like this?:
> 
>    If Landlock checks an ioctl with a request_id on a file f:
> 
>    We look up the allowed_ioctl_id_mask and loop over the bits set in
>    that mask.
> 
>    For every bit set in that mask, we look up the matching ioctl group.
>    Within the group, we look whether we can find a landlock_ioctl_attr
>    rule that belongs to the group which permits that ioctl request.
> 
>    The request is granted by a landlock_ioctl_attr if:
> 
>      attr.command == request_id
>      && (attr.device == 0 || file_inode(file).i_rdev == attr.device)
>      && (attr.file_type == 0 || landlock_get_file_type?(file) == attr.file_type)
> 
> Is this roughly what you imagined?

Yes :)

> 
> 
> Some specific things I don't understand well are:
> 
> * How does id_mask identify the ioctl group?  Do you envision an
>    interface where you can add a landlock_ioctl_attr to multiple groups
>    at once?

id_mask would be an arbitrary value picked by the user (library). I was
thinking about a bitmask (for allowed_ioctl_id_mask) because different
ioctl_attr entries could overlap (e.g., for different file hierarchies).

> 
>    When the added landlock_ioctl_attr has id_mask=3, does it add that
>    attr to group 2 and 1?

No, it assigns this IOCTL attribute to group 3, which can then be
matched with allowed_ioctl_id_mask & (1 << 2).

> 
> * How does allowed_ioctl_id_mask match against the ioctl group IDs
>    (id_mask)?  I'm particularly confused because the
>    allowed_ioctl_id_mask is __u16, whereas id_mask is __u8?  Is this
>    intentional?

The idea was to assign each ioctl_attr to one group (ID) but to allow
matching several groups with path_beneath_attr (allowed_ioctl_id_mask
being the only bitmask, whereas id_mask is a number). Yes, the naming is
confusing…
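
Read this way, an ioctl_attr carries a group number and a path_beneath
rule carries a bitmask of groups. A small sketch of that mapping,
assuming groups are numbered from 1 so that group 3 corresponds to
(1 << 2) as in the answer above; group_bit() and rule_allows_group() are
hypothetical helpers, not proposed API:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Bit in allowed_ioctl_id_mask for a 1-based group ID (1..16).
 * Returns 0 for IDs that do not fit in the 16-bit mask.
 */
static uint16_t group_bit(unsigned int group_id)
{
	if (group_id < 1 || group_id > 16)
		return 0;
	return (uint16_t)(1u << (group_id - 1));
}

/* True if a path_beneath rule's mask selects the given group. */
static bool rule_allows_group(uint16_t allowed_ioctl_id_mask,
			      unsigned int group_id)
{
	return (allowed_ioctl_id_mask & group_bit(group_id)) != 0;
}
```

With this numbering, a mask of 0x0005 selects groups 1 and 3 but not
group 2.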

> 
> * What is file_type?  Are those the file types as used in mknod(2),
>    so that you can distinguish between regular files, directories,
>    named pipes and the like?

Yes

> 
> * If I understand the proposal right, the .device and .file_type in
>    landlock_ioctl_attr are narrowing down the set of files that the
>    ioctl policy applies to.  In all rules up until Landlock v3, which
>    we already have, the selection of the files which the rule applies
>    to is purely done based on file hierarchy.
> 
>    Would it not be a more orthogonal API if the "file selection" part
>    of the Landlock API and the "policy adding" part for these selected
>    files were independent of each other?  Then the .device and
>    .file_type selection scheme could be used for the existing policies
>    as well?

Both approaches have pros and cons. I propose a new incremental approach
below that starts with the simple case where there are no direct links
between different rule types (only the third step adds that).

> 
> * When restricting by dev_t (major and minor number), aren't there use
>    cases where a system might have 256 CDROM drives, and you'd need to
>    allow-list all of these minor number combinations?

Indeed, we should be able to just ignore device minors.
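
Ignoring the minor number amounts to comparing only the major part of
the dev_t. A minimal sketch using the major()/minor() macros from
<sys/sysmacros.h>; same_device() and its ignore_minor parameter are
hypothetical and only illustrate the comparison:

```c
#include <stdbool.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/*
 * Compares the device of a rule with the device of a file.
 * With ignore_minor set, only the major numbers have to match.
 */
static bool same_device(dev_t rule_dev, dev_t file_dev, bool ignore_minor)
{
	if (major(rule_dev) != major(file_dev))
		return false;
	return ignore_minor || minor(rule_dev) == minor(file_dev);
}
```

A single rule could then cover every drive handled by one driver,
instead of one rule per minor number.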

> 
> * Aren't many ioctl use cases already usable with just the proposal I
>    made?  If you add a rule to permit IOCTL for /dev/cdrom0, that
>    opened file will anyway only expose a small subset of the ioctls
>    that the kernel as a whole offers, no?  Are there ioctls which are
>    offered independent of the file type which I'm overlooking? (Sorry,
>    this might be a naive question. :))

This is correct until users allow IOCTL for a directory (e.g. /etc). It 
depends on use cases though.

> 
> 
>> On 02/05/2023 19:17, Günther Noack wrote:
>>> Alternatives considered: Allow-listing specific ioctl requests
>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>
>>> It would be technically possible to keep an allow-list of ioctl
>>> requests and thereby control in more detail which ioctls should work.
>>>
>>> I believe that this is not needed for the majority of use cases and
>>> that it is a reasonable trade-off to make here (but I'm happy to hear
>>> about counterexamples).  The reasoning is:
>>>
>>> * Many programs do not need ioctl at all,
>>>     and denying all ioctl(2) requests is OK for these.
>>>
>>> * Other programs need ioctl, but only for the terminal FDs.
>>>     This is supported because these file descriptors are usually
>>>     inherited from the parent process - so the parent process gets to
>>>     control the ioctl(2) policy for them.
>>>
>>> * Some programs need ioctl on specific files that they are opening
>>>     themselves.  They can allow-list these file paths for ioctl(2).
>>>     This makes the programs work, but it restricts a variety of other
>>>     ioctl requests which are otherwise possible through opening other
>>>     files.
>>>
>>> Because the LANDLOCK_ACCESS_FS_IOCTL right is attached to the file
>>> descriptor, programs have flexible options to control which ioctl
>>> operations should work, without the implementation complexity of
>>> additional ioctl allow-lists in the kernel.
>>>
>>> Finally, the proposed approach is simpler in implementation and has
>>> lower API complexity, but it does *not* preclude us from implementing
>>> per-ioctl-request allow lists later, if that turns out to be necessary
>>> at a later point.
>>
>> I value this simplicity, but I'm also wondering about how much this
>> allow/deny all IOCTLs approach would be useful in real case scenarios. ;)
> 
> Mickaël, would you be open to gather some more data for this from
> existing users, to understand better which use cases they have?

Of course, thanks!

I'm trying to project myself as an app developer, a sysadmin and a
desktop user. I know some sandboxed software, I have sandboxed some
myself, and I'm looking into new programs. The main issue I see (from
the sysadmin and user POV) is when we want to manage a whole file
hierarchy, which may contain different file/device types.


> 
> (Looking in the direction of Jeff Xu, who has inquired about Landlock
> for Chromium in the past -- do you happen to know in which ways you'd
> want to restrict ioctls, if you have that need? :))
> 
> 
>>> Related Work
>>> ~~~~~~~~~~~~
>>>
>>> OpenBSD's pledge(2) [1] restricts ioctl(2) independent of the file
>>> descriptor which is used.  The implementers maintain multiple
>>> allow-lists of predefined ioctl(2) operations required for different
>>> application domains such as "audio", "bpf", "tty" and "inet".
>>>
>>> OpenBSD does not guarantee ABI backwards compatibility to the same
>>> extent as Linux does, so it's easier for them to update these lists in
>>> later versions.  It might not be a feasible approach for Linux though.
>>>
>>> [1] https://man.openbsd.org/OpenBSD-7.3/pledge.2
>>>
>>> Feedback I'm looking for
>>> ~~~~~~~~~~~~~~~~~~~~~~~~
>>>
>>> Some specific points I would like to get your opinion on:
>>>
>>> * Is this the right general approach to restricting ioctl(2)?
>>>
>>>     It will probably be possible to find counter-examples where the
>>>     alternative (see below) is better.  I'd be interested in these, and
>>>     in how common they are, to understand whether we have picked the
>>>     right trade-off here.
>>
>> In the long term, I'd like Landlock to be able to restrict a set of IOCTLs,
>> taking into account the type of file/device. Being able to deny all IOCTLs
>> might be useful and is much easier to implement though.
> 
> In my mind, it's a trade-off between implementation complexity and
> flexibility.
> 
> My proposal is more simple-minded than what you explained, but might
> solve the bulk of real-life use cases for a lower implementation
> complexity. (With the caveat that I don't understand the real life use
> cases well enough to know how far it really gets us, that's why this
> is just an RFC.)
> 
> I'm not *just* saying this because I'm lazy ;), but I also feel that
> we should be careful with the amount of complexity that we take on,
> especially if there is a chance that it won't be needed in practice.

I agree :)

> 
> I think that it might be a feasible approach to start with the
> LANDLOCK_ACCESS_FS_IOCTL approach and then look at its usage to
> understand whether we see a significant number of programs whose
> sandboxes are too coarse because of this.
> 
> If more fine-granular control is needed, we can still put the other
> approach on top, and the additional complexity from
> LANDLOCK_ACCESS_FS_IOCTL that we have to support is not that
> dramatically high.
> 
> If more fine-granular control is not needed, we can skip the
> implementation of the other approach and Landlock is simpler.
> 
> Then again, I'm somewhat new to kernel development still, I'm not sure
> whether this is an approach that is deemed acceptable in this setting?

I agree that IOCTLs are a security risk and that we should propose a 
simple solution short-term, and maybe a more complete one long-term.

The main issue with a unique IOCTL access right for a file hierarchy is 
that we may not know which device/driver will be the target/server, and 
hence if we need to allow some IOCTL for regular files (e.g., fscrypt), 
we might end up allowing all IOCTLs.

Here is a plan to incrementally develop a fine-grained IOCTL access 
control in 3 steps:

1/ Add a simple IOCTL access right for path_beneath: what you initially 
proposed. For systems that already configure nodev mount points, it 
could be even more useful (e.g., safely allow IOCTL on /home for 
fscrypt, and specific /dev files otherwise).


2/ Create a new type of rule to identify file/device type:
struct landlock_inode_type_attr {
     __u64 allowed_access; /* same as for path_beneath */
     __u64 flags; /* bit 0: ignores device minor */
     dev_t device; /* same as stat's st_rdev */
     __u16 file_type; /* same as stat's st_mode & S_IFMT */
};

We'll probably need to differentiate the handled accesses for 
path_beneath rules and those for inode_type rules to be more useful.

One issue with this type of rule is that it could be used as an oracle 
to bypass stat restrictions. We could check if such (virtual) action is 
allowed without the current domain though.


3/ Add a new type of rule to match IOCTL commands, with a mechanism to 
tie this to inode_type rules (because a command ID is relative to a file 
type/device), and potentially the same mechanism to tie inode_type rules 
to path_beneath rules.


Each of these steps can be implemented one after the other, and each one
is valuable. What do you think?


> 
> 
>>> * Should we introduce a "landlock_fd_rights_limit()" syscall?
>>>
>>>     We could potentially implement a system call for dropping the
>>>     LANDLOCK_ACCESS_FS_IOCTL and LANDLOCK_ACCESS_FS_TRUNCATE rights from
>>>     existing file descriptors (independent of the type of file
>>>     descriptor, even).
>>>
>>>     Possible use cases would be to (a) restrict the rights on inherited
>>>     file descriptors like std{in,out,err} and to (b) restrict ioctl and
>>>     truncate operations on file descriptors that are not acquired
>>>     through open(2), such as network sockets.
>>>
>>>     This would be similar to the cap_rights_limit(2) system call in
>>>     FreeBSD's Capsicum.
>>>
>>>     This idea feels somewhat orthogonal to the ioctl patch, but it would
>>>     start to be more useful if the ioctl right exists.
>>
>> This is indeed interesting, and that should be useful for the cases you
>> explained. I think supporting IOCTLs is more important for now, but a new
>> syscall to restrict FDs could be useful in the future.
> 
> Ack, OK.  I agree, it's not that urgent yet.
> 
> 
>> We should also think about batch operations on FD (see the
>> close_range syscall), for instance to deny all IOCTLs on inherited
>> or received FDs.
> 
> Hm, you mean a landlock_fd_rights_limit_range() syscall to limit the
> rights for an entire range of FDs?
> 
> I have to admit, I'm not familiar with the real-life use cases of
> close_range().  In most programs I work with, it's difficult to reason
> about their ordering once the program has really started to run. So I
> imagine that close_range() is mostly used to "sanitize" the open file
> descriptors at the start of main(), and you have a similar use case in
> mind for this one as well?
> 
> If it's just about closing the range from 0 to 2, I'm not sure it's
> worth adding a custom syscall. :)

The advantage of this kind of range is to efficiently manage all
potential FDs, and the main use case is to close (or change, see the
flags) everything *except* 0-2 (i.e. 3-~0), and thereby avoid a lot of
(potentially useless) syscalls.

The Landlock interface doesn't need to be a syscall. We could just add a 
new rule type which could take a FD range and restrict them when calling 
landlock_restrict_self(). Something like this:
struct landlock_fd_attr {
     __u64 allowed_access;
     __u32 first;
     __u32 last;
};
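
A sketch of how such a range would be interpreted, assuming that `first`
and `last` are both inclusive, as they are for close_range(2);
struct fd_attr_sketch and fd_attr_covers() are hypothetical user-space
stand-ins for the rule type above:

```c
#include <stdbool.h>
#include <stdint.h>

/* User-space mirror of the landlock_fd_attr idea. */
struct fd_attr_sketch {
	uint64_t allowed_access;
	uint32_t first;
	uint32_t last; /* inclusive; ~0U means "up to the highest FD" */
};

/* True if the rule's FD range covers the given file descriptor. */
static bool fd_attr_covers(const struct fd_attr_sketch *attr, int fd)
{
	return fd >= 0 && (uint32_t)fd >= attr->first &&
	       (uint32_t)fd <= attr->last;
}
```

The "everything except 0-2" case is then first = 3 and last = ~0U.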

> 
> Thanks for the review!
> –Günther
Mickaël Salaün June 17, 2023, 9:48 a.m. UTC | #5
On 24/05/2023 23:43, Jeff Xu wrote:
> Sorry for the late reply.
>>
>> (Looking in the direction of Jeff Xu, who has inquired about Landlock
>> for Chromium in the past -- do you happen to know in which ways you'd
>> want to restrict ioctls, if you have that need? :))
>>
> 
> Regarding this patch, here is some feedback from ChromeOS:
>   - In the short term: we are looking to integrate Landlock into our
> sandboxer, so the ability to restrict to a specific device is huge.
> - Fundamentally though, in the effort of bringing process expected
> behaviour closest to allowed behaviour, the ability to speak of
> ioctl() path access in Landlock would be huge -- at least we can
> continue to enumerate in policy what processes are allowed to do, even
> if we still lack the ability to restrict individual ioctl commands for
> a specific device node.

Thanks for the feedback!

> 
> Regarding medium term:
> My thoughts are, from software architecture point of view, it would be
> nice to think in planes: i.e. Data plane / Control plane/ Signaling
> Plane/Normal User Plane/Privileged User Plane. Let the application
> define its planes, and assign operations to them. Landlock provides
> data structure and syscall to construct the planes.

I'm not sure I follow this plane idea. Could you give examples of
these planes applied to Landlock?


> 
> However, one thing I'm not sure is the third arg from ioctl:
> int ioctl(int fd, unsigned long request, ...);
> Is it possible for the driver to use the same request id, then put
> whatever into the third arg ? how to deal with that effectively ?

I'm not sure about the value of filtering all the arguments (other than
the command) vs. the complexity of doing so, but we can discuss that
when we reach this step.
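
For context, the command number itself can carry hints about the third
argument: by convention it encodes a direction, a driver "type" byte, a
number and the argument size. As noted earlier in the thread, _IOC_READ
and _IOC_WRITE are not mandatory, so this is only a hint. A sketch that
decodes these fields with the macros from <linux/ioctl.h>;
struct ioctl_cmd_fields and decode_ioctl_cmd() are hypothetical names:

```c
#include <linux/ioctl.h>

struct ioctl_cmd_fields {
	unsigned int dir;  /* _IOC_NONE, _IOC_READ, _IOC_WRITE or both */
	unsigned int type; /* driver "magic" byte */
	unsigned int nr;   /* command number within the type */
	unsigned int size; /* declared size of the third argument */
};

/* Splits a command number into its conventional fields. */
static struct ioctl_cmd_fields decode_ioctl_cmd(unsigned long cmd)
{
	struct ioctl_cmd_fields f = {
		.dir = _IOC_DIR(cmd),
		.type = _IOC_TYPE(cmd),
		.nr = _IOC_NR(cmd),
		.size = _IOC_SIZE(cmd),
	};

	return f;
}
```

Nothing forces a driver to stick to the declared direction or size,
which is part of why filtering on the third argument is hard.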

> 
> For real world user cases, Dmitry Torokhov (added to list) can help.

Yes please!

> 
> PS: There is also lwn article about SELinux implementation of ioctl: [1]
> [1] https://lwn.net/Articles/428140/

Thanks for the pointer, this shows how complex IOCTL access control
is. For Landlock, I'd like to provide the minimal features required to
enable user space to define its own rules, which means letting user
space (and especially libraries) identify useful or potentially
harmful IOCTLs.

> 
> Thanks!
> -Jeff Xu
Günther Noack June 19, 2023, 4:21 p.m. UTC | #6
Hello Mickaël!

On Sat, Jun 17, 2023 at 11:47:55AM +0200, Mickaël Salaün wrote:
> This subject is not easy, but I think we're reaching a consensus (see my
> 3-steps proposal plan below). I answered your questions about the (complex)
> interface I proposed, but we should focus on the first step now (your
> initial proposal) and get back to the other steps later in another email
> thread.

Thanks for the review!


> On 10/05/2023 21:21, Günther Noack wrote:
> > [...]
> > Some specific things I don't understand well are:
> > [...]

Thanks, this all makes sense now.
Mickaël Salaün June 19, 2023, 6:57 p.m. UTC | #7
On 19/06/2023 18:21, Günther Noack wrote:
> Hello Mickaël!
> 
> On Sat, Jun 17, 2023 at 11:47:55AM +0200, Mickaël Salaün wrote:
>> This subject is not easy, but I think we're reaching a consensus (see my
>> 3-steps proposal plan below). I answered your questions about the (complex)
>> interface I proposed, but we should focus on the first step now (your
>> initial proposal) and get back to the other steps later in another email
>> thread.
> 
> Thanks for the review!
> 
> 
>> On 10/05/2023 21:21, Günther Noack wrote:
>>> [...]
>>> Some specific things I don't understand well are:
>>> [...]
> 
> Thanks, this all make sense now. 
Jeff Xu June 20, 2023, 11:44 p.m. UTC | #8
On Sat, Jun 17, 2023 at 2:49 AM Mickaël Salaün <mic@digikod.net> wrote:
>
>
> On 24/05/2023 23:43, Jeff Xu wrote:
> > Sorry for the late reply.
> >>
> >> (Looking in the direction of Jeff Xu, who has inquired about Landlock
> >> for Chromium in the past -- do you happen to know in which ways you'd
> >> want to restrict ioctls, if you have that need? :))
> >>
> >
> > Regarding this patch, here is some feedback from ChromeOS:
> >   - In the short term: we are looking to integrate Landlock into our
> > sandboxer, so the ability to restrict to a specific device is huge.
> > - Fundamentally though, in the effort of bringing process expected
> > behaviour closest to allowed behaviour, the ability to speak of
> > ioctl() path access in Landlock would be huge -- at least we can
> > continue to enumerate in policy what processes are allowed to do, even
> > if we still lack the ability to restrict individual ioctl commands for
> > a specific device node.
>
> Thanks for the feedback!
>
> >
> > Regarding medium term:
> > My thoughts are, from software architecture point of view, it would be
> > nice to think in planes: i.e. Data plane / Control plane/ Signaling
> > Plane/Normal User Plane/Privileged User Plane. Let the application
> > define its planes, and assign operations to them. Landlock provides
> > data structure and syscall to construct the planes.
>
> I'm not sure to follow this plane thing. Could you give examples for
> these planes applied to Landlock?
>
The idea is probably along the same lines as yours: let user space
define/categorize ioctls.  For example, for a camera driver, users can
define two planes - control plane: setup parameters of lens, data
plane: setup data buffers for data transfer and do start/stop (I'm
just making up the example since I don't really know the camera
driver).

The idea is for Landlock to provide a mechanism that lets user space
divide/assign ioctls to different planes, such that user space
processes can set/define security boundaries according to the plane
they are on.
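
As a rough sketch of what such a user-space classification could look
like (the command numbers and names below are made up, just like the
camera example itself, and do not belong to any real driver):

```c
#include <stddef.h>

enum ioctl_plane { PLANE_UNKNOWN = 0, PLANE_CONTROL, PLANE_DATA };

struct plane_entry {
	unsigned long cmd;
	enum ioctl_plane plane;
};

/* Made-up command numbers for an imaginary camera driver. */
#define CAM_SET_LENS  0x1001UL
#define CAM_QUEUE_BUF 0x2001UL
#define CAM_STREAM_ON 0x2002UL

/* The application (or a library) assigns each command to a plane. */
static const struct plane_entry camera_planes[] = {
	{ CAM_SET_LENS, PLANE_CONTROL },
	{ CAM_QUEUE_BUF, PLANE_DATA },
	{ CAM_STREAM_ON, PLANE_DATA },
};

/* Looks up the plane of a command; unlisted commands are unknown. */
static enum ioctl_plane plane_of(unsigned long cmd)
{
	size_t n = sizeof(camera_planes) / sizeof(camera_planes[0]);

	for (size_t i = 0; i < n; i++)
		if (camera_planes[i].cmd == cmd)
			return camera_planes[i].plane;
	return PLANE_UNKNOWN;
}
```

A process on the data plane would then only be allowed commands for
which plane_of() returns PLANE_DATA.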

>
> >
> > However, one thing I'm not sure is the third arg from ioctl:
> > int ioctl(int fd, unsigned long request, ...);
> > Is it possible for the driver to use the same request id, then put
> > whatever into the third arg ? how to deal with that effectively ?
>
> I'm not sure about the value of all the arguments (except the command
> one) vs. the complexity to filter them, but we could discuss that when
> we'll reach this step.
>
> >
> > For real world user cases, Dmitry Torokhov (added to list) can help.
>
> Yes please!
>
Yes, it will help with the design if there is a real-world scenario to study.

> >
> > PS: There is also lwn article about SELinux implementation of ioctl: [1]
> > [1] https://lwn.net/Articles/428140/
>
> Thanks for the pointer, this shows how complex this IOCTL access control
> is. For Landlock, I'd like to provide the minimal required features to
> enable user space to define their own rules, which means to let user
> space (and especially libraries) to identify useful or potentially
> harmful IOCTLs.
>
Yes. That makes sense.

> >
> > Thanks!
> > -Jeff Xu
Mickaël Salaün June 21, 2023, 9:17 a.m. UTC | #9
On 21/06/2023 01:44, Jeff Xu wrote:
> On Sat, Jun 17, 2023 at 2:49 AM Mickaël Salaün <mic@digikod.net> wrote:
>>
>>
>> On 24/05/2023 23:43, Jeff Xu wrote:
>>> Sorry for the late reply.
>>>>
>>>> (Looking in the direction of Jeff Xu, who has inquired about Landlock
>>>> for Chromium in the past -- do you happen to know in which ways you'd
>>>> want to restrict ioctls, if you have that need? :))
>>>>
>>>
>>> Regarding this patch, here is some feedback from ChromeOS:
>>>    - In the short term: we are looking to integrate Landlock into our
>>> sandboxer, so the ability to restrict to a specific device is huge.
>>> - Fundamentally though, in the effort of bringing process expected
>>> behaviour closest to allowed behaviour, the ability to speak of
>>> ioctl() path access in Landlock would be huge -- at least we can
>>> continue to enumerate in policy what processes are allowed to do, even
>>> if we still lack the ability to restrict individual ioctl commands for
>>> a specific device node.
>>
>> Thanks for the feedback!
>>
>>>
>>> Regarding medium term:
>>> My thoughts are, from software architecture point of view, it would be
>>> nice to think in planes: i.e. Data plane / Control plane/ Signaling
>>> Plane/Normal User Plane/Privileged User Plane. Let the application
>>> define its planes, and assign operations to them. Landlock provides
>>> data structure and syscall to construct the planes.
>>
>> I'm not sure to follow this plane thing. Could you give examples for
>> these planes applied to Landlock?
>>
> The idea is probably along the same lines as yours: let user space
> define/categorize ioctls.  For example, for a camera driver, users can
> define two planes - control plane: setup parameters of lens, data
> plane: setup data buffers for data transfer and do start/stop (I'm
> just making up the example since I don't really know the camera
> driver).
> 
> The idea is for Landlock to provide a mechanism to let user space to
> divide/assign ioctls to different planes, such that the user space
> processes can set/define security boundaries according to the plane it
> is on.

Right, we're on the same track. :)


> 
>>
>>>
>>> However, one thing I'm not sure is the third arg from ioctl:
>>> int ioctl(int fd, unsigned long request, ...);
>>> Is it possible for the driver to use the same request id, then put
>>> whatever into the third arg ? how to deal with that effectively ?
>>
>> I'm not sure about the value of all the arguments (except the command
>> one) vs. the complexity to filter them, but we could discuss that when
>> we'll reach this step.
>>
>>>
>>> For real world user cases, Dmitry Torokhov (added to list) can help.
>>
>> Yes please!
>>
> ya,  it will help with the design if there is a real world scenario to study.

I'll get internal requirements too.


> 
>>>
>>> PS: There is also lwn article about SELinux implementation of ioctl: [1]
>>> [1] https://lwn.net/Articles/428140/
>>
>> Thanks for the pointer, this shows how complex this IOCTL access control
>> is. For Landlock, I'd like to provide the minimal required features to
>> enable user space to define their own rules, which means to let user
>> space (and especially libraries) to identify useful or potentially
>> harmful IOCTLs.
>>
> Yes. That makes sense.
> 
>>>
>>> Thanks!
>>> -Jeff Xu
Günther Noack July 12, 2023, 11:08 a.m. UTC | #10
Hello!

On Sat, Jun 17, 2023 at 11:47:55AM +0200, Mickaël Salaün wrote:
> > > We should also think about batch operations on FD (see the
> > > close_range syscall), for instance to deny all IOCTLs on inherited
> > > or received FDs.
> > 
> > Hm, you mean a landlock_fd_rights_limit_range() syscall to limit the
> > rights for an entire range of FDs?
> > 
> > I have to admit, I'm not familiar with the real-life use cases of
> > close_range().  In most programs I work with, it's difficult to reason
> > about their ordering once the program has really started to run. So I
> > imagine that close_range() is mostly used to "sanitize" the open file
> > descriptors at the start of main(), and you have a similar use case in
> > mind for this one as well?
> > 
> > If it's just about closing the range from 0 to 2, I'm not sure it's
> > worth adding a custom syscall. :)
> 
> The advantage of this kind of range is to efficiently manage all potential
> FDs, and the main use case is to close (or change, see the flags) everything
> *except" 0-2 (i.e. 3-~0), and then avoid a lot of (potentially useless)
> syscalls.
> 
> The Landlock interface doesn't need to be a syscall. We could just add a new
> rule type which could take a FD range and restrict them when calling
> landlock_restrict_self(). Something like this:
> struct landlock_fd_attr {
>     __u64 allowed_access;
>     __u32 first;
>     __u32 last;
> }

FYI, regarding the idea of dropping rights on already-opened files:
I'm starting to have doubts about how feasible this is in practice.

The "obvious" approach is to just remove the access rights from the security
blob flags on the struct file.

But these opened "struct file"s might be shared with other processes already,
and mutating them in place would have undesired side effects on other processes.

For example, if brltty uses ioctls on the terminal and then one of the programs
running in that terminal drops ioctl rights on that open file, it would affect
brltty as well, because both the Landlocked program and brltty use the same
struct file.

The rights could technically be stored next to the file descriptor list, where
the close-on-exec flag is also stored, but that seems more cumbersome than I had
hoped.  I don't have a good approach for that idea yet, so I'll drop it for now.
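
The difference between the two kinds of state can be observed from user
space. The sketch below uses dup(2) within one process as a stand-in for
sharing across processes: a file status flag set through one descriptor
is visible through the other (shared struct file), while FD_CLOEXEC is
not (per-descriptor state). struct sharing_result and probe_fd_sharing()
are hypothetical names.

```c
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

struct sharing_result {
	bool status_flag_shared; /* O_NONBLOCK set via the copy shows on the original */
	bool cloexec_shared;     /* FD_CLOEXEC set via the copy shows on the original */
};

/* Returns 0 on success, -1 if any syscall failed. */
static int probe_fd_sharing(struct sharing_result *res)
{
	int p[2], copy, fl, fdfl, ret = -1;

	if (pipe(p) != 0)
		return -1;
	copy = dup(p[0]);
	if (copy < 0)
		goto out_pipe;

	/* File status flags live in the shared open file description. */
	fl = fcntl(copy, F_GETFL);
	if (fl < 0 || fcntl(copy, F_SETFL, fl | O_NONBLOCK) != 0)
		goto out;
	fl = fcntl(p[0], F_GETFL);
	if (fl < 0)
		goto out;
	res->status_flag_shared = (fl & O_NONBLOCK) != 0;

	/* FD_CLOEXEC lives in the per-process descriptor table. */
	if (fcntl(copy, F_SETFD, FD_CLOEXEC) != 0)
		goto out;
	fdfl = fcntl(p[0], F_GETFD);
	if (fdfl < 0)
		goto out;
	res->cloexec_shared = (fdfl & FD_CLOEXEC) != 0;
	ret = 0;
out:
	close(copy);
out_pipe:
	close(p[0]);
	close(p[1]);
	return ret;
}
```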

Ideas are welcome. :)

—Günther
Mickaël Salaün July 12, 2023, 11:38 a.m. UTC | #11
On 12/07/2023 13:08, Günther Noack wrote:
> Hello!
> 
> On Sat, Jun 17, 2023 at 11:47:55AM +0200, Mickaël Salaün wrote:
>>>> We should also think about batch operations on FD (see the
>>>> close_range syscall), for instance to deny all IOCTLs on inherited
>>>> or received FDs.
>>>
>>> Hm, you mean a landlock_fd_rights_limit_range() syscall to limit the
>>> rights for an entire range of FDs?
>>>
>>> I have to admit, I'm not familiar with the real-life use cases of
>>> close_range().  In most programs I work with, it's difficult to reason
>>> about their ordering once the program has really started to run. So I
>>> imagine that close_range() is mostly used to "sanitize" the open file
>>> descriptors at the start of main(), and you have a similar use case in
>>> mind for this one as well?
>>>
>>> If it's just about closing the range from 0 to 2, I'm not sure it's
>>> worth adding a custom syscall. :)
>>
>> The advantage of this kind of range is to efficiently manage all potential
>> FDs, and the main use case is to close (or change, see the flags) everything
>> *except" 0-2 (i.e. 3-~0), and then avoid a lot of (potentially useless)
>> syscalls.
>>
>> The Landlock interface doesn't need to be a syscall. We could just add a new
>> rule type which could take a FD range and restrict them when calling
>> landlock_restrict_self(). Something like this:
>> struct landlock_fd_attr {
>>      __u64 allowed_access;
>>      __u32 first;
>>      __u32 last;
>> }
> 
> FYI, regarding the idea of dropping rights on already-opened files:
> I'm starting to have doubts about how feasible this is in practice.
> 
> The "obvious" approach is to just remove the access rights from the security
> blob flags on the struct file.
> 
> But these opened "struct file"s might be shared with other processes already,
> and mutating them in place would have undesired side effects on other processes.
> 
> For example, if brltty uses ioctls on the terminal and then one of the programs
> running in that terminal drops ioctl rights on that open file, it would affect
> brltty as well, because both the Landlocked program and brltty use the same
> struct file.
> 
> It could be technically stored next to the file descriptor list, where the
> close-on-exec flag is also stored, but that seems more cumbersome than I had
> hoped.  I don't have a good approach for that idea yet, so I'll drop it for now.

Indeed, as discussed in another thread (patch v9 network support), I now
think that file descriptors should be neither touched nor restricted by
Landlock. Even though there are both file *descriptions* and file
descriptors, Landlock should focus on what user space cannot already do
itself (user space can already close file descriptors). Already-acquired
file descriptors should be a concern for user space sandboxers and the
whole system/services.

> 
> Ideas are welcome. :)
> 
> —Günther
>