Message ID | 20230502171755.9788-1-gnoack3000@gmail.com (mailing list archive) |
---|---|
Series | Landlock: ioctl support |
Thanks for this RFC, this is interesting!

I previously thought a lot about IOCTLs restrictions and here are some
notes:

IOCTLs are everywhere, from devices to filesystems (see fscrypt). Each
different file type may behave differently for the same IOCTL command/ID. It
is also worth noting that there are a lot of different IOCTLs, they are
growing over time, some might be dedicated to get some data (i.e. read) and
others to change the state of a device (i.e. write), some might be innocuous
(e.g. FIOCLEX, FIONCLEX) and others potentially harmful. _IOC_READ and
_IOC_WRITE can be useful but they are not mandatory, there are exceptions,
and it may be difficult to identify if a command pertains to one, the other,
or both kind of actions.

I then think it would be very useful to be able to tie file/device types to
a set of IOCTLs, letting user space libraries define and classify the IOCTL
semantic.

Instead of relying on a LANDLOCK_ACCESS_FS_IOCTL which would allow or deny
all IOCTLs, we can extend the path_beneath struct to manage IOCTLs in
addition to regular file accesses. Because dealing with a set of IOCTLs
would imply to deal with a lot of data and combinations, I thought about
creating groups of IOCTLs (defining access semantic) that could be matched
against file hierarchies. The composability nature of Landlock domains is
also another constraint to keep in mind.

    // New rule type dedicated to define groups of IOCTLs.
    struct landlock_ioctl_attr {
        __u32 command; // IOCTL number/ID
        dev_t device; // must be 0 for regular file and directory
        __u8 file_type;
        __u8 id_mask; // if 0, then applied globally
    };

We could use landlock_add_rule(2) to fill a set of landlock_ioctl_attr into
a ruleset and use them with landlock_path_beneath_attr entries:

    // LANDLOCK_RULE_PATH_BENEATH, leveraging the extensible design of
    // landlock_path_beneath_attr, hence the same first fields.
    struct landlock_path_beneath_attr {
        __u64 allowed_access;
        __s32 parent_fd;
        __u16 allowed_ioctl_id_mask;
    };

landlock_ioctl_attr includes a 8-bit mask for which each bit identifies a
set of allowed IOCTLs per device/file type. This mask is then tied to a
path_beneath_attr. We cannot use number IDs because of dev_t+IOCTL->ID
intersection conflicts. Using an id_mask enables to group (specific) IOCTLs
together, then creating synthetic access rights.

When merging a ruleset with a domain, each IOCTL ID mask is shifted and ORed
with the other layer ones to get a (8*16) 128-bit mask, stored in an
IOCTL/dev_t table and in the related landlock_object. When looking for an
IOCTL request, Landlock first looks into the IOCTL set ID table and get the
global set ID mask, which kind of translates to a composition of synthetic
access rights (stored with the landlock_layer.ioctl_access bitmask). We then
walk through all the inodes to match the whole mask.

I realize that this is complex and this explanation might be confusing
though. What do you think?

On 02/05/2023 19:17, Günther Noack wrote:
> Hello!
>
> These patches add ioctl support to Landlock.
>
> It's an early version - it potentially needs more tests and
> documentation. I'd like to circulate the patches early to discuss
> whether this approach is feasible.
>
> Objective
> ~~~~~~~~~
>
> Make ioctl(2) requests restrictable with Landlock,
> in a way that is useful for real-world applications.
>
> Proposed approach
> ~~~~~~~~~~~~~~~~~
>
> Introduce the LANDLOCK_ACCESS_FS_IOCTL right, which restricts the use
> of ioctl(2) on file descriptors.
>
> We attach the LANDLOCK_ACCESS_FS_IOCTL right to opened file
> descriptors, as we already do for LANDLOCK_ACCESS_FS_TRUNCATE.
>
> I believe that this approach works for the majority of use cases, and
> offers a good trade-off between Landlock API and implementation
> complexity and flexibility when the feature is used.
> > Notable implications of this approach > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > * Existing inherited file descriptors stay unaffected > when a program enables Landlock. > > This means in particular that in common scenarios, > the terminal's ioctls (ioctl_tty(2)) continue to work. > > * ioctl(2) continues to be available for file descriptors acquired > through means other than open(2). Example: Network sockets. > > Examples > ~~~~~~~~ > > Starting a sandboxed shell from $HOME with samples/landlock/sandboxer: > > LL_FS_RO=/ LL_FS_RW=. ./sandboxer /bin/bash > > The LANDLOCK_ACCESS_FS_IOCTL right is part of the "read-write" rights > here, so we expect that newly opened files outside of $HOME don't work > with ioctl(2). > > * "stty" works: It probes terminal properties > > * "stty </dev/tty" fails: /dev/tty can be reopened, but the ioctl is > denied. > > * "eject" fails: ioctls to use CD-ROM drive are denied. > > * "ls /dev" works: It uses ioctl to get the terminal size for > columnar layout > > * The text editors "vim" and "mg" work. (GNU Emacs fails because it > attempts to reopen /dev/tty.) > > Alternatives considered: Allow-listing specific ioctl requests > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > It would be technically possible to keep an allow-list of ioctl > requests and thereby control in more detail which ioctls should work. > > I believe that this is not needed for the majority of use cases and > that it is a reasonable trade-off to make here (but I'm happy to hear > about counterexamples). The reasoning is: > > * Many programs do not need ioctl at all, > and denying all ioctl(2) requests OK for these. > > * Other programs need ioctl, but only for the terminal FDs. > This is supported because these file descriptors are usually > inherited from the parent process - so the parent process gets to > control the ioctl(2) policy for them. > > * Some programs need ioctl on specific files that they are opening > themselves. 
They can allow-list these file paths for ioctl(2). > This makes the programs work, but it restricts a variety of other > ioctl requests which are otherwise possible through opening other > files. > > Because the LANDLOCK_ACCESS_FS_IOCTL right is attached to the file > descriptor, programs have flexible options to control which ioctl > operations should work, without the implementation complexity of > additional ioctl allow-lists in the kernel. > > Finally, the proposed approach is simpler in implementation and has > lower API complexity, but it does *not* preclude us from implementing > per-ioctl-request allow lists later, if that turns out to be necessary > at a later point. I value this simplicity, but I'm also wondering about how much this allow/deny all IOCTLs approach would be useful in real case scenarios. ;) > > Related Work > ~~~~~~~~~~~~ > > OpenBSD's pledge(2) [1] restricts ioctl(2) independent of the file > descriptor which is used. The implementers maintain multiple > allow-lists of predefined ioctl(2) operations required for different > application domains such as "audio", "bpf", "tty" and "inet". > > OpenBSD does not guarantee ABI backwards compatibility to the same > extent as Linux does, so it's easier for them to update these lists in > later versions. It might not be a feasible approach for Linux though. > > [1] https://man.openbsd.org/OpenBSD-7.3/pledge.2 > > Feedback I'm looking for > ~~~~~~~~~~~~~~~~~~~~~~~~ > > Some specific points I would like to get your opinion on: > > * Is this the right general approach to restricting ioctl(2)? > > It will probably be possible to find counter-examples where the > alternative (see below) is better. I'd be interested in these, and > in how common they are, to understand whether we have picked the > right trade-off here. In the long term, I'd like Landlock to be able to restrict a set of IOCTLs, taking into account the type of file/device. 
Being able to deny all IOCTLs might be useful and is much easier to
implement though.

>
> * Should we introduce a "landlock_fd_rights_limit()" syscall?
>
>   We could potentially implement a system call for dropping the
>   LANDLOCK_ACCESS_FS_IOCTL and LANDLOCK_ACCESS_FS_TRUNCATE rights from
>   existing file descriptors (independent of the type of file
>   descriptor, even).
>
>   Possible use cases would be to (a) restrict the rights on inherited
>   file descriptors like std{in,out,err} and to (b) restrict ioctl and
>   truncate operations on file descriptors that are not acquired
>   through open(2), such as network sockets.
>
>   This would be similar to the cap_rights_limit(2) system call in
>   FreeBSD's Capsicum.
>
>   This idea feels somewhat orthogonal to the ioctl patch, but it would
>   start to be more useful if the ioctl right exists.

This is indeed interesting, and that should be useful for the cases you
explained. I think supporting IOCTLs is more important for now, but a new
syscall to restrict FDs could be useful in the future.

We should also think about batch operations on FD (see the
close_range syscall), for instance to deny all IOCTLs on inherited
or received FDs.

>
>
> Günther Noack (4):
>   landlock: Increment Landlock ABI version to 4
>   landlock: Add LANDLOCK_ACCESS_FS_IOCTL access right
>   selftests/landlock: Test ioctl support
>   samples/landlock: Add support for LANDLOCK_ACCESS_FS_IOCTL
>
>  include/uapi/linux/landlock.h                | 19 ++++--
>  samples/landlock/sandboxer.c                 | 12 +++-
>  security/landlock/fs.c                       | 20 +++++-
>  security/landlock/limits.h                   |  2 +-
>  security/landlock/syscalls.c                 |  2 +-
>  tools/testing/selftests/landlock/base_test.c |  2 +-
>  tools/testing/selftests/landlock/fs_test.c   | 67 +++++++++++++++++++-
>  7 files changed, 107 insertions(+), 17 deletions(-)
>
>
> base-commit: 457391b0380335d5e9a5babdec90ac53928b23b4
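
As an illustration of the remark above that _IOC_READ and _IOC_WRITE are useful but not mandatory, here is a minimal C sketch that decodes the fields which the usual Linux encoding packs into a request number. Only the _IOC_* macros from <linux/ioctl.h> are real; the names sketch_ioctl_info, sketch_decode and sketch_passes_data_to_kernel are hypothetical helpers made up for this example and are not part of Landlock or of the patch series.

```c
#include <linux/ioctl.h> /* _IOC_DIR, _IOC_TYPE, _IOC_NR, _IOC_SIZE */
#include <stdbool.h>

/* Hypothetical helper type, for illustration only. */
struct sketch_ioctl_info {
	unsigned int dir;  /* _IOC_NONE, _IOC_READ, _IOC_WRITE, or both */
	unsigned int type; /* "magic" byte chosen by the driver */
	unsigned int nr;   /* command number within that type */
	unsigned int size; /* argument size the command claims to use */
};

/* Split a request number into the fields of the conventional encoding. */
static struct sketch_ioctl_info sketch_decode(unsigned long request)
{
	struct sketch_ioctl_info info = {
		.dir = (unsigned int)_IOC_DIR(request),
		.type = (unsigned int)_IOC_TYPE(request),
		.nr = (unsigned int)_IOC_NR(request),
		.size = (unsigned int)_IOC_SIZE(request),
	};

	return info;
}

/*
 * _IOC_WRITE means user space passes data in to the kernel. This is only
 * a hint: it is not a statement about whether device state changes, and
 * a driver is free to ignore the convention.
 */
static bool sketch_passes_data_to_kernel(unsigned long request)
{
	return (_IOC_DIR(request) & _IOC_WRITE) != 0;
}
```

A request built with _IOR('x', 1, int) decodes to direction _IOC_READ and size sizeof(int). Legacy commands that predate this encoding carry no such information; for example, TCGETS is the plain value 0x5401 on architectures using the asm-generic numbering. That is one reason why a classification based on the direction bits alone cannot be complete.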
Hello Mickaël! Sorry for the late reply. This was indeed a bit difficult for me to understand; maybe it just needs more clarification. Let me try to paraphrase your proposal (inline below) so we can resolve the misunderstandings. On Thu, May 04, 2023 at 11:12:00PM +0200, Mickaël Salaün wrote: > Thanks for this RFC, this is interesting! > > I previously thought a lot about IOCTLs restrictions and here are some > notes: > > IOCTLs are everywhere, from devices to filesystems (see fscrypt). Each > different file type may behave differently for the same IOCTL command/ID. It > is also worth noting that there are a lot of different IOCTLs, they are > growing over time, some might be dedicated to get some data (i.e. read) and > others to change the state of a device (i.e. write), some might be innocuous > (e.g. FIOCLEX, FIONCLEX) and others potentially harmful. _IOC_READ and > _IOC_WRITE can be useful but they are not mandatory, there are exceptions, > and it may be difficult to identify if a command pertains to one, the other, > or both kind of actions. > > I then think it would be very useful to be able to tie file/device types to > a set of IOCTLs, letting user space libraries define and classify the IOCTL > semantic. > > Instead of relying on a LANDLOCK_ACCESS_FS_IOCTL which would allow or deny > all IOCTLs, we can extend the path_beneath struct to manage IOCTLs in > addition to regular file accesses. Because dealing with a set of IOCTLs > would imply to deal with a lot of data and combinations, I thought about > creating groups of IOCTLs (defining access semantic) that could be matched > against file hierarchies. The composability nature of Landlock domains is > also another constraint to keep in mind. > > // New rule type dedicated to define groups of IOCTLs. 
> struct landlock_ioctl_attr { > __u32 command; // IOCTL number/ID > dev_t device; // must be 0 for regular file and directory > __u8 file_type; > __u8 id_mask; // if 0, then applied globally > }; > > We could use landlock_add_rule(2) to fill a set of landlock_ioctl_attr into > a ruleset and use them with landlock_path_beneath_attr entries: > > // LANDLOCK_RULE_PATH_BENEATH, leveraging the extensible design of > // landlock_path_beneath_attr, hence the same first fields. > struct landlock_path_beneath_attr { > __u64 allowed_access; > __s32 parent_fd; > __u16 allowed_ioctl_id_mask; > }; > > landlock_ioctl_attr includes a 8-bit mask for which each bit identifies a > set of allowed IOCTLs per device/file type. This mask is then tied to a > path_beneath_attr. We cannot use number IDs because of dev_t+IOCTL->ID > intersection conflicts. Using an id_mask enables to group (specific) IOCTLs > together, then creating synthetic access rights. > > When merging a ruleset with a domain, each IOCTL ID mask is shifted and ORed > with the other layer ones to get a (8*16) 128-bit mask, stored in an > IOCTL/dev_t table and in the related landlock_object. When looking for an > IOCTL request, Landlock first looks into the IOCTL set ID table and get the > global set ID mask, which kind of translates to a composition of synthetic > access rights (stored with the landlock_layer.ioctl_access bitmask). We then > walk through all the inodes to match the whole mask. > > I realize that this is complex and this explanation might be confusing > though. What do you think? To be honest, I am not fully sure I understand the landlock_ioctl_attr struct correctly. 
My current guess is:

  command:   a single ioctl request number that should be permitted

  device:    if device != 0; require the dev_t (major+minor number)
             of the file to match, before permitting the ioctl

  file_type: if set (!= 0?), require the file type to match,
             before permitting the ioctl

  id_mask:   Indicates the IDs of the groups where this ioctl
             should be permitted(?)

So -- if we were to implement this without any optimizations -- the
logic of this is presumably something like this?:

  If Landlock checks an ioctl with a request_id on a file f:

  We look up the allowed_ioctl_id_mask and loop over the bits set in
  that mask.

  For every bit set in that mask, we look up the matching ioctl group.
  Within the group, we look whether we can find a landlock_ioctl_attr
  rule that belongs to the group which permits that ioctl request.

  The request is granted by a landlock_ioctl_attr if:

    attr.command == request_id
    && (attr.device == 0 || file_inode(file).i_rdev == attr.device)
    && (attr.file_type == 0 || landlock_get_file_type?(file))

Is this roughly what you imagined?


Some specific things I don't understand well are:

* How does id_mask identify the ioctl group? Do you envision an
  interface where you can add a landlock_ioctl_attr to multiple groups
  at once?

  When the added landlock_ioctl_attr has id_mask=3, does it add that
  attr to group 2 and 1?

* How does allowed_ioctl_id_mask match against the ioctl group IDs
  (id_mask)? I'm particularly confused because the
  allowed_ioctl_id_mask is __u16, whereas id_mask is __u8? Is this
  intentional?

* What is file_type? Are those the file types as used in mknod(2),
  so that you can distinguish between regular files, directories,
  named pipes and the like?

* If I understand the proposal right, the .device and .file_type in
  landlock_ioctl_attr are narrowing down the set of files that the
  ioctl policy applies to.
In all rules up until Landlock v3, which we already have, the selection of the files which the rule applies to is purely done based on file hierarchy. Would it not be a more orthogonal API if the "file selection" part of the Landlock API and the "policy adding" part for these selected files were independent of each other? Then the .device and .file_type selection scheme could be used for the existing policies as well? * When restricting by dev_t (major and minor number), aren't there use cases where a system might have 256 CDROM drives, and you'd need to allow-list all of these minor number combinations? * Aren't many ioctl use cases already usable with just the proposal I made? If you add a rule to permit IOCTL for /dev/cdrom0, that opened file will anyway only expose a small subset of the ioctls that the kernel as a whole offers, no? Are there ioctls which are offered independent of the file type which I'm overlooking? (Sorry, this might be a naive question. :)) > On 02/05/2023 19:17, Günther Noack wrote: > > Alternatives considered: Allow-listing specific ioctl requests > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > > > It would be technically possible to keep an allow-list of ioctl > > requests and thereby control in more detail which ioctls should work. > > > > I believe that this is not needed for the majority of use cases and > > that it is a reasonable trade-off to make here (but I'm happy to hear > > about counterexamples). The reasoning is: > > > > * Many programs do not need ioctl at all, > > and denying all ioctl(2) requests OK for these. > > > > * Other programs need ioctl, but only for the terminal FDs. > > This is supported because these file descriptors are usually > > inherited from the parent process - so the parent process gets to > > control the ioctl(2) policy for them. > > > > * Some programs need ioctl on specific files that they are opening > > themselves. They can allow-list these file paths for ioctl(2). 
> > This makes the programs work, but it restricts a variety of other > > ioctl requests which are otherwise possible through opening other > > files. > > > > Because the LANDLOCK_ACCESS_FS_IOCTL right is attached to the file > > descriptor, programs have flexible options to control which ioctl > > operations should work, without the implementation complexity of > > additional ioctl allow-lists in the kernel. > > > > Finally, the proposed approach is simpler in implementation and has > > lower API complexity, but it does *not* preclude us from implementing > > per-ioctl-request allow lists later, if that turns out to be necessary > > at a later point. > > I value this simplicity, but I'm also wondering about how much this > allow/deny all IOCTLs approach would be useful in real case scenarios. ;) Mickaël, would you be open to gather some more data for this from existing users, to understand better which use cases they have? (Looking in the direction of Jeff Xu, who has inquired about Landlock for Chromium in the past -- do you happen to know in which ways you'd want to restrict ioctls, if you have that need? :)) > > Related Work > > ~~~~~~~~~~~~ > > > > OpenBSD's pledge(2) [1] restricts ioctl(2) independent of the file > > descriptor which is used. The implementers maintain multiple > > allow-lists of predefined ioctl(2) operations required for different > > application domains such as "audio", "bpf", "tty" and "inet". > > > > OpenBSD does not guarantee ABI backwards compatibility to the same > > extent as Linux does, so it's easier for them to update these lists in > > later versions. It might not be a feasible approach for Linux though. > > > > [1] https://man.openbsd.org/OpenBSD-7.3/pledge.2 > > > > Feedback I'm looking for > > ~~~~~~~~~~~~~~~~~~~~~~~~ > > > > Some specific points I would like to get your opinion on: > > > > * Is this the right general approach to restricting ioctl(2)? 
> > > > It will probably be possible to find counter-examples where the > > alternative (see below) is better. I'd be interested in these, and > > in how common they are, to understand whether we have picked the > > right trade-off here. > > In the long term, I'd like Landlock to be able to restrict a set of IOCTLs, > taking into account the type of file/device. Being able to deny all IOCTLs > might be useful and is much easier to implement though. In my mind, it's a trade-off between implementation complexity and flexibility. My proposal is more simple-minded than what you explained, but might solve the bulk of real-life use cases for a lower implementation complexity. (With the caveat that I don't understand the real life use cases well enough to know how far it really gets us, that's why this is just an RFC.) I'm not *just* saying this because I'm lazy ;), but I also feel that we should be careful with the amount of complexity that we take on, especially if there is a chance that it won't be needed in practice. I think that it might be a feasible approach to start with the LANDLOCK_ACCESS_FS_IOCTL approach and then look at its usage to understand whether we see a significant number of programs whose sandboxes are too coarse because of this. If more fine-granular control is needed, we can still put the other approach on top, and the additional complexity from LANDLOCK_ACCESS_FS_IOCTL that we have to support is not that dramatically high. If more fine-granular control is not needed, we can skip the implementation of the other approach and Landlock is simpler. Then again, I'm somewhat new to kernel development still, I'm not sure whether this is an approach that is deemed acceptable in this setting? > > * Should we introduce a "landlock_fd_rights_limit()" syscall? 
> > > > We could potentially implement a system call for dropping the > > LANDLOCK_ACCESS_FS_IOCTL and LANDLOCK_ACCESS_FS_TRUNCATE rights from > > existing file descriptors (independent of the type of file > > descriptor, even). > > > > Possible use cases would be to (a) restrict the rights on inherited > > file descriptors like std{in,out,err} and to (b) restrict ioctl and > > truncate operations on file descriptors that are not acquired > > through open(2), such as network sockets. > > > > This would be similar to the cap_rights_limit(2) system call in > > FreeBSD's Capsicum. > > > > This idea feels somewhat orthogonal to the ioctl patch, but it would > > start to be more useful if the ioctl right exists. > > This is indeed interesting, and that should be useful for the cases you > explained. I think supporting IOCTLs is more important for now, but a new > syscall to restrict FDs could be useful in the future. Ack, OK. I agree, it's not that urgent yet. > We should also think about batch operations on FD (see the > close_range syscall), for instance to deny all IOCTLs on inherited > or received FDs. Hm, you mean a landlock_fd_rights_limit_range() syscall to limit the rights for an entire range of FDs? I have to admit, I'm not familiar with the real-life use cases of close_range(). In most programs I work with, it's difficult to reason about their ordering once the program has really started to run. So I imagine that close_range() is mostly used to "sanitize" the open file descriptors at the start of main(), and you have a similar use case in mind for this one as well? If it's just about closing the range from 0 to 2, I'm not sure it's worth adding a custom syscall. :) Thanks for the review! –Günther
Sorry for the late reply.

>
> (Looking in the direction of Jeff Xu, who has inquired about Landlock
> for Chromium in the past -- do you happen to know in which ways you'd
> want to restrict ioctls, if you have that need? :))
>

Regarding this patch, here is some feedback from ChromeOS:

- In the short term: we are looking to integrate Landlock into our
  sandboxer, so the ability to restrict to a specific device is huge.
- Fundamentally though, in the effort of bringing process expected
  behaviour closest to allowed behaviour, the ability to speak of
  ioctl() path access in Landlock would be huge -- at least we can
  continue to enumerate in policy what processes are allowed to do,
  even if we still lack the ability to restrict individual ioctl
  commands for a specific device node.

Regarding the medium term:
My thoughts are, from a software architecture point of view, it would be
nice to think in planes: i.e. Data plane / Control plane / Signaling
plane / Normal user plane / Privileged user plane. Let the application
define its planes, and assign operations to them. Landlock provides the
data structure and syscall to construct the planes.

However, one thing I'm not sure about is the third arg of ioctl:

    int ioctl(int fd, unsigned long request, ...);

Is it possible for the driver to use the same request id, then put
whatever into the third arg? How to deal with that effectively?

For real world use cases, Dmitry Torokhov (added to list) can help.

PS: There is also an LWN article about the SELinux implementation of
ioctl: [1]

[1] https://lwn.net/Articles/428140/

Thanks!
-Jeff Xu
This subject is not easy, but I think we're reaching a consensus (see my 3-steps proposal plan below). I answered your questions about the (complex) interface I proposed, but we should focus on the first step now (your initial proposal) and get back to the other steps later in another email thread. On 10/05/2023 21:21, Günther Noack wrote: > Hello Mickaël! > > Sorry for the late reply. This was indeed a bit difficult for me to > understand; maybe it just needs more clarification. > > Let me try to paraphrase your proposal (inline below) so we can > resolve the misunderstandings. > > > On Thu, May 04, 2023 at 11:12:00PM +0200, Mickaël Salaün wrote: >> Thanks for this RFC, this is interesting! >> >> I previously thought a lot about IOCTLs restrictions and here are some >> notes: >> >> IOCTLs are everywhere, from devices to filesystems (see fscrypt). Each >> different file type may behave differently for the same IOCTL command/ID. It >> is also worth noting that there are a lot of different IOCTLs, they are >> growing over time, some might be dedicated to get some data (i.e. read) and >> others to change the state of a device (i.e. write), some might be innocuous >> (e.g. FIOCLEX, FIONCLEX) and others potentially harmful. _IOC_READ and >> _IOC_WRITE can be useful but they are not mandatory, there are exceptions, >> and it may be difficult to identify if a command pertains to one, the other, >> or both kind of actions. >> >> I then think it would be very useful to be able to tie file/device types to >> a set of IOCTLs, letting user space libraries define and classify the IOCTL >> semantic. >> >> Instead of relying on a LANDLOCK_ACCESS_FS_IOCTL which would allow or deny >> all IOCTLs, we can extend the path_beneath struct to manage IOCTLs in >> addition to regular file accesses. 
Because dealing with a set of IOCTLs >> would imply to deal with a lot of data and combinations, I thought about >> creating groups of IOCTLs (defining access semantic) that could be matched >> against file hierarchies. The composability nature of Landlock domains is >> also another constraint to keep in mind. >> >> // New rule type dedicated to define groups of IOCTLs. >> struct landlock_ioctl_attr { >> __u32 command; // IOCTL number/ID >> dev_t device; // must be 0 for regular file and directory >> __u8 file_type; >> __u8 id_mask; // if 0, then applied globally >> }; >> >> We could use landlock_add_rule(2) to fill a set of landlock_ioctl_attr into >> a ruleset and use them with landlock_path_beneath_attr entries: >> >> // LANDLOCK_RULE_PATH_BENEATH, leveraging the extensible design of >> // landlock_path_beneath_attr, hence the same first fields. >> struct landlock_path_beneath_attr { >> __u64 allowed_access; >> __s32 parent_fd; >> __u16 allowed_ioctl_id_mask; >> }; >> >> landlock_ioctl_attr includes a 8-bit mask for which each bit identifies a >> set of allowed IOCTLs per device/file type. This mask is then tied to a >> path_beneath_attr. We cannot use number IDs because of dev_t+IOCTL->ID >> intersection conflicts. Using an id_mask enables to group (specific) IOCTLs >> together, then creating synthetic access rights. >> >> When merging a ruleset with a domain, each IOCTL ID mask is shifted and ORed >> with the other layer ones to get a (8*16) 128-bit mask, stored in an >> IOCTL/dev_t table and in the related landlock_object. When looking for an >> IOCTL request, Landlock first looks into the IOCTL set ID table and get the >> global set ID mask, which kind of translates to a composition of synthetic >> access rights (stored with the landlock_layer.ioctl_access bitmask). We then >> walk through all the inodes to match the whole mask. >> >> I realize that this is complex and this explanation might be confusing >> though. What do you think? 
>
> To be honest, I am not fully sure I understand the landlock_ioctl_attr
> struct correctly.
>
> My current guess is:
>
>   command:   a single ioctl request number that should be permitted
>
>   device:    if device != 0; require the dev_t (major+minor number)
>              of the file to match, before permitting the ioctl
>
>   file_type: if set (!= 0?), require the file type to match,
>              before permitting the ioctl
>
>   id_mask:   Indicates the IDs of the groups where this ioctl
>              should be permitted(?)
>
> So -- if we were to implement this without any optimizations -- the
> logic of this is presumably something like this?:
>
>   If Landlock checks an ioctl with a request_id on a file f:
>
>   We look up the allowed_ioctl_id_mask and loop over the bits set in
>   that mask.
>
>   For every bit set in that mask, we look up the matching ioctl group.
>   Within the group, we look whether we can find a landlock_ioctl_attr
>   rule that belongs to the group which permits that ioctl request.
>
>   The request is granted by a landlock_ioctl_attr if:
>
>     attr.command == request_id
>     && (attr.device == 0 || file_inode(file).i_rdev == attr.device)
>     && (attr.file_type == 0 || landlock_get_file_type?(file))
>
> Is this roughly what you imagined?

Yes :)

>
>
> Some specific things I don't understand well are:
>
> * How does id_mask identify the ioctl group? Do you envision an
>   interface where you can add a landlock_ioctl_attr to multiple groups
>   at once?

id_mask would be an arbitrary value picked by the user (library). I was
thinking about a bitmask (for allowed_ioctl_id_mask) because different
ioctl_attr could overlap (e.g., for different file hierarchy).

>
>   When the added landlock_ioctl_attr has id_mask=3, does it add that
>   attr to group 2 and 1?

No, it assigns this IOCTL attribute to the group 3, which can then be
matched against with allowed_ioctl_id_mask & (1 << 2).

>
> * How does allowed_ioctl_id_mask match against the ioctl group IDs
>   (id_mask)? I'm particularly confused because the
>   allowed_ioctl_id_mask is __u16, whereas id_mask is __u8? Is this
>   intentional?

The idea was to assign each ioctl_attr to one group (ID) but enable to
match several groups with path_beneath_attr (allowed_ioctl_id_mask being
the only bitmask, whereas id_mask is a number). Yes, the naming is
confusing…

>
> * What is file_type? Are those the file types as used in mknod(2),
>   so that you can distinguish between regular files, directories,
>   named pipes and the like?

Yes

>
> * If I understand the proposal right, the .device and .file_type in
>   landlock_ioctl_attr are narrowing down the set of files that the
>   ioctl policy applies to. In all rules up until Landlock v3, which
>   we already have, the selection of the files which the rule applies
>   to is purely done based on file hierarchy.
>
>   Would it not be a more orthogonal API if the "file selection" part
>   of the Landlock API and the "policy adding" part for these selected
>   files were independent of each other? Then the .device and
>   .file_type selection scheme could be used for the existing policies
>   as well?

Both approaches have pros and cons. I propose a new incremental approach
below that starts with the simple case where there is no direct links
between different rule types (only the third step add that).

>
> * When restricting by dev_t (major and minor number), aren't there use
>   cases where a system might have 256 CDROM drives, and you'd need to
>   allow-list all of these minor number combinations?

Indeed, we should be able to just ignore device minors.

>
> * Aren't many ioctl use cases already usable with just the proposal I
>   made? If you add a rule to permit IOCTL for /dev/cdrom0, that
>   opened file will anyway only expose a small subset of the ioctls
>   that the kernel as a whole offers, no? Are there ioctls which are
>   offered independent of the file type which I'm overlooking? (Sorry,
>   this might be a naive question.
:)) This is correct until users allow IOCTL for a directory (e.g. /etc). It depends on use cases though. > > >> On 02/05/2023 19:17, Günther Noack wrote: >>> Alternatives considered: Allow-listing specific ioctl requests >>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >>> >>> It would be technically possible to keep an allow-list of ioctl >>> requests and thereby control in more detail which ioctls should work. >>> >>> I believe that this is not needed for the majority of use cases and >>> that it is a reasonable trade-off to make here (but I'm happy to hear >>> about counterexamples). The reasoning is: >>> >>> * Many programs do not need ioctl at all, >>> and denying all ioctl(2) requests OK for these. >>> >>> * Other programs need ioctl, but only for the terminal FDs. >>> This is supported because these file descriptors are usually >>> inherited from the parent process - so the parent process gets to >>> control the ioctl(2) policy for them. >>> >>> * Some programs need ioctl on specific files that they are opening >>> themselves. They can allow-list these file paths for ioctl(2). >>> This makes the programs work, but it restricts a variety of other >>> ioctl requests which are otherwise possible through opening other >>> files. >>> >>> Because the LANDLOCK_ACCESS_FS_IOCTL right is attached to the file >>> descriptor, programs have flexible options to control which ioctl >>> operations should work, without the implementation complexity of >>> additional ioctl allow-lists in the kernel. >>> >>> Finally, the proposed approach is simpler in implementation and has >>> lower API complexity, but it does *not* preclude us from implementing >>> per-ioctl-request allow lists later, if that turns out to be necessary >>> at a later point. >> >> I value this simplicity, but I'm also wondering about how much this >> allow/deny all IOCTLs approach would be useful in real case scenarios. 
;) > > Mickaël, would you be open to gather some more data for this from > existing users, to understand better which use cases they have? Of course, thanks! I'm trying to project myself as an app developer, a sysadmin and a desktop user. I know some sandboxed softwares, I sandboxed some, and I'm looking into new ones. The main issue I see is (from the sysadmin and user POV) when we want to manage a whole file hierarchy, which may contain different file/device types. > > (Looking in the direction of Jeff Xu, who has inquired about Landlock > for Chromium in the past -- do you happen to know in which ways you'd > want to restrict ioctls, if you have that need? :)) > > >>> Related Work >>> ~~~~~~~~~~~~ >>> >>> OpenBSD's pledge(2) [1] restricts ioctl(2) independent of the file >>> descriptor which is used. The implementers maintain multiple >>> allow-lists of predefined ioctl(2) operations required for different >>> application domains such as "audio", "bpf", "tty" and "inet". >>> >>> OpenBSD does not guarantee ABI backwards compatibility to the same >>> extent as Linux does, so it's easier for them to update these lists in >>> later versions. It might not be a feasible approach for Linux though. >>> >>> [1] https://man.openbsd.org/OpenBSD-7.3/pledge.2 >>> >>> Feedback I'm looking for >>> ~~~~~~~~~~~~~~~~~~~~~~~~ >>> >>> Some specific points I would like to get your opinion on: >>> >>> * Is this the right general approach to restricting ioctl(2)? >>> >>> It will probably be possible to find counter-examples where the >>> alternative (see below) is better. I'd be interested in these, and >>> in how common they are, to understand whether we have picked the >>> right trade-off here. >> >> In the long term, I'd like Landlock to be able to restrict a set of IOCTLs, >> taking into account the type of file/device. Being able to deny all IOCTLs >> might be useful and is much easier to implement though. 
>
> In my mind, it's a trade-off between implementation complexity and
> flexibility.
>
> My proposal is more simple-minded than what you explained, but might
> solve the bulk of real-life use cases for a lower implementation
> complexity. (With the caveat that I don't understand the real life use
> cases well enough to know how far it really gets us, that's why this
> is just an RFC.)
>
> I'm not *just* saying this because I'm lazy ;), but I also feel that
> we should be careful with the amount of complexity that we take on,
> especially if there is a chance that it won't be needed in practice.

I agree :)

>
> I think that it might be a feasible approach to start with the
> LANDLOCK_ACCESS_FS_IOCTL approach and then look at its usage to
> understand whether we see a significant number of programs whose
> sandboxes are too coarse because of this.
>
> If more fine-granular control is needed, we can still put the other
> approach on top, and the additional complexity from
> LANDLOCK_ACCESS_FS_IOCTL that we have to support is not that
> dramatically high.
>
> If more fine-granular control is not needed, we can skip the
> implementation of the other approach and Landlock is simpler.
>
> Then again, I'm somewhat new to kernel development still, I'm not sure
> whether this is an approach that is deemed acceptable in this setting?

I agree that IOCTLs are a security risk and that we should propose a
simple solution in the short term, and maybe a more complete one in the
long term.

The main issue with a single IOCTL access right for a file hierarchy is
that we may not know which device/driver will be the target/server, and
hence if we need to allow some IOCTLs for regular files (e.g., fscrypt),
we might end up allowing all IOCTLs.

Here is a plan to incrementally develop fine-grained IOCTL access
control in 3 steps:

1/ Add a simple IOCTL access right for path_beneath: what you initially
proposed.
For systems that already configure nodev mount points, it could be even
more useful (e.g., safely allow IOCTL on /home for fscrypt, and specific
/dev files otherwise).

2/ Create a new type of rule to identify file/device type:

    struct landlock_inode_type_attr {
        __u64 allowed_access; /* same as for path_beneath */
        __u64 flags;          /* bit 0: ignores device minor */
        dev_t device;         /* same as stat's st_rdev */
        __u16 file_type;      /* same as stat's st_mode & S_IFMT */
    };

We'll probably need to differentiate the handled accesses for
path_beneath rules and those for inode_type rules to be more useful. One
issue with this type of rule is that it could be used as an oracle to
bypass stat restrictions. We could check if such a (virtual) action is
allowed without the current domain though.

3/ Add a new type of rule to match IOCTL commands, with a mechanism to
tie this to inode_type rules (because a command ID is relative to a file
type/device), and potentially the same mechanism to tie inode_type rules
to path_beneath rules.

Each of these steps can be implemented one after the other, and each one
is valuable. What do you think?

>
>
>>> * Should we introduce a "landlock_fd_rights_limit()" syscall?
>>>
>>>   We could potentially implement a system call for dropping the
>>>   LANDLOCK_ACCESS_FS_IOCTL and LANDLOCK_ACCESS_FS_TRUNCATE rights from
>>>   existing file descriptors (independent of the type of file
>>>   descriptor, even).
>>>
>>>   Possible use cases would be to (a) restrict the rights on inherited
>>>   file descriptors like std{in,out,err} and to (b) restrict ioctl and
>>>   truncate operations on file descriptors that are not acquired
>>>   through open(2), such as network sockets.
>>>
>>>   This would be similar to the cap_rights_limit(2) system call in
>>>   FreeBSD's Capsicum.
>>>
>>>   This idea feels somewhat orthogonal to the ioctl patch, but it would
>>>   start to be more useful if the ioctl right exists.
>>
>> This is indeed interesting, and that should be useful for the cases you
>> explained. I think supporting IOCTLs is more important for now, but a new
>> syscall to restrict FDs could be useful in the future.
>
> Ack, OK. I agree, it's not that urgent yet.
>
>
>> We should also think about batch operations on FD (see the
>> close_range syscall), for instance to deny all IOCTLs on inherited
>> or received FDs.
>
> Hm, you mean a landlock_fd_rights_limit_range() syscall to limit the
> rights for an entire range of FDs?
>
> I have to admit, I'm not familiar with the real-life use cases of
> close_range(). In most programs I work with, it's difficult to reason
> about their ordering once the program has really started to run. So I
> imagine that close_range() is mostly used to "sanitize" the open file
> descriptors at the start of main(), and you have a similar use case in
> mind for this one as well?
>
> If it's just about closing the range from 0 to 2, I'm not sure it's
> worth adding a custom syscall. :)

The advantage of this kind of range is to efficiently manage all
potential FDs, and the main use case is to close (or change, see the
flags) everything *except* 0-2 (i.e. 3-~0), and thereby avoid a lot of
(potentially useless) syscalls.

The Landlock interface doesn't need to be a syscall. We could just add a
new rule type which could take an FD range and restrict those FDs when
calling landlock_restrict_self(). Something like this:

    struct landlock_fd_attr {
        __u64 allowed_access;
        __u32 first;
        __u32 last;
    };

>
> Thanks for the review!
> –Günther
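As a rough illustration of the FD-range rule idea above, the following user-space sketch models how a set of such rules could be evaluated for a given file descriptor. Everything here is hypothetical: `fd_range_rule` merely mirrors the sketched struct, and the semantics (inclusive ranges, uncovered FDs keep all rights, overlapping rules intersect) are assumptions that the thread does not specify.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* User-space mirror of the sketched rule; not an existing UAPI struct. */
struct fd_range_rule {
	uint64_t allowed_access; /* rights that remain on matching FDs */
	uint32_t first;          /* first FD of the range */
	uint32_t last;           /* last FD of the range (assumed inclusive) */
};

#define ACCESS_ALL UINT64_MAX

/*
 * Returns the access rights left on fd after applying all rules.
 * Assumption: FDs outside every range are left untouched, and
 * overlapping ranges can only remove rights, never add them.
 */
static uint64_t fd_allowed_access(const struct fd_range_rule *rules,
				  size_t num_rules, uint32_t fd)
{
	uint64_t allowed = ACCESS_ALL;

	assert(rules != NULL || num_rules == 0);
	for (size_t i = 0; i < num_rules; i++) {
		if (fd >= rules[i].first && fd <= rules[i].last)
			allowed &= rules[i].allowed_access;
	}
	return allowed;
}
```

A single rule with `first = 3` and `last = UINT32_MAX` would then express "everything except 0-2" without one call per descriptor, which is the efficiency argument made for close_range().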
On 24/05/2023 23:43, Jeff Xu wrote:
> Sorry for the late reply.
>>
>> (Looking in the direction of Jeff Xu, who has inquired about Landlock
>> for Chromium in the past -- do you happen to know in which ways you'd
>> want to restrict ioctls, if you have that need? :))
>>
>
> Regarding this patch, here is some feedback from ChromeOS:
> - In the short term: we are looking to integrate Landlock into our
> sandboxer, so the ability to restrict to a specific device is huge.
> - Fundamentally though, in the effort of bringing process expected
> behaviour closest to allowed behaviour, the ability to speak of
> ioctl() path access in Landlock would be huge -- at least we can
> continue to enumerate in policy what processes are allowed to do, even
> if we still lack the ability to restrict individual ioctl commands for
> a specific device node.

Thanks for the feedback!

>
> Regarding medium term:
> My thoughts are, from software architecture point of view, it would be
> nice to think in planes: i.e. Data plane / Control plane/ Signaling
> Plane/Normal User Plane/Privileged User Plane. Let the application
> define its planes, and assign operations to them. Landlock provides
> data structure and syscall to construct the planes.

I'm not sure I follow this plane idea. Could you give examples of these
planes applied to Landlock?

>
> However, one thing I'm not sure is the third arg from ioctl:
> int ioctl(int fd, unsigned long request, ...);
> Is it possible for the driver to use the same request id, then put
> whatever into the third arg ? how to deal with that effectively ?

I'm not sure that filtering all the arguments (other than the command)
is worth the complexity, but we can discuss that when we reach this
step.

>
> For real world user cases, Dmitry Torokhov (added to list) can help.

Yes please!
>
> PS: There is also lwn article about SELinux implementation of ioctl: [1]
> [1] https://lwn.net/Articles/428140/

Thanks for the pointer, this shows how complex this IOCTL access control
is. For Landlock, I'd like to provide the minimal required features to
enable user space to define its own rules, which means letting user
space (and especially libraries) identify useful or potentially harmful
IOCTLs.

>
> Thanks!
> -Jeff Xu
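Regarding the question above about the ioctl request number versus its third argument, this small sketch shows what user space can learn from the request number alone, using the standard `_IOC_*` decoding macros from `<linux/ioctl.h>`. The struct and function names are made up for illustration. Note that this encoding is only a convention: not every driver follows it, and it says nothing about the contents of the third argument.

```c
#include <assert.h>
#include <stdbool.h>
#include <linux/ioctl.h>

/* Fields encoded in a conventionally built ioctl request number. */
struct ioctl_cmd_info {
	unsigned int type; /* driver "magic" byte */
	unsigned int nr;   /* command number within that type */
	unsigned int size; /* declared size of the argument */
	bool to_kernel;    /* _IOC_WRITE: user space passes data in */
	bool from_kernel;  /* _IOC_READ: the kernel returns data */
};

/* Splits a request number into the fields packed by _IO/_IOR/_IOW/_IOWR. */
static struct ioctl_cmd_info decode_ioctl_cmd(unsigned long cmd)
{
	struct ioctl_cmd_info info = {
		.type = (unsigned int)_IOC_TYPE(cmd),
		.nr = (unsigned int)_IOC_NR(cmd),
		.size = (unsigned int)_IOC_SIZE(cmd),
		.to_kernel = (_IOC_DIR(cmd) & _IOC_WRITE) != 0,
		.from_kernel = (_IOC_DIR(cmd) & _IOC_READ) != 0,
	};
	return info;
}
```

A filter based only on these fields can tell two commands apart, but two calls with the same request number and different argument contents look identical to it, which is the difficulty raised in the question.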
Hello Mickaël!

On Sat, Jun 17, 2023 at 11:47:55AM +0200, Mickaël Salaün wrote:
> This subject is not easy, but I think we're reaching a consensus (see my
> 3-steps proposal plan below). I answered your questions about the (complex)
> interface I proposed, but we should focus on the first step now (your
> initial proposal) and get back to the other steps later in another email
> thread.

Thanks for the review!

> On 10/05/2023 21:21, Günther Noack wrote:
> > [...]
> > Some specific things I don't understand well are:
> > [...]

Thanks, this all makes sense now.
On 19/06/2023 18:21, Günther Noack wrote:
> Hello Mickaël!
>
> On Sat, Jun 17, 2023 at 11:47:55AM +0200, Mickaël Salaün wrote:
>> This subject is not easy, but I think we're reaching a consensus (see my
>> 3-steps proposal plan below). I answered your questions about the (complex)
>> interface I proposed, but we should focus on the first step now (your
>> initial proposal) and get back to the other steps later in another email
>> thread.
>
> Thanks for the review!
>
>
>> On 10/05/2023 21:21, Günther Noack wrote:
>>> [...]
>>> Some specific things I don't understand well are:
>>> [...]
>
> Thanks, this all makes sense now.
On Sat, Jun 17, 2023 at 2:49 AM Mickaël Salaün <mic@digikod.net> wrote: > > > On 24/05/2023 23:43, Jeff Xu wrote: > > Sorry for the late reply. > >> > >> (Looking in the direction of Jeff Xu, who has inquired about Landlock > >> for Chromium in the past -- do you happen to know in which ways you'd > >> want to restrict ioctls, if you have that need? :)) > >> > > > > Regarding this patch, here is some feedback from ChromeOS: > > - In the short term: we are looking to integrate Landlock into our > > sandboxer, so the ability to restrict to a specific device is huge. > > - Fundamentally though, in the effort of bringing process expected > > behaviour closest to allowed behaviour, the ability to speak of > > ioctl() path access in Landlock would be huge -- at least we can > > continue to enumerate in policy what processes are allowed to do, even > > if we still lack the ability to restrict individual ioctl commands for > > a specific device node. > > Thanks for the feedback! > > > > > Regarding medium term: > > My thoughts are, from software architecture point of view, it would be > > nice to think in planes: i.e. Data plane / Control plane/ Signaling > > Plane/Normal User Plane/Privileged User Plane. Let the application > > define its planes, and assign operations to them. Landlock provides > > data structure and syscall to construct the planes. > > I'm not sure to follow this plane thing. Could you give examples for > these planes applied to Landlock? > The idea is probably along the same lines as yours: let user space define/categorize ioctls. For example, for a camera driver, users can define two planes - control plane: setup parameters of lens, data plane: setup data buffers for data transfer and do start/stop (I'm just making up the example since I don't really know the camera driver). 
The idea is for Landlock to provide a mechanism to let user space to divide/assign ioctls to different planes, such that the user space processes can set/define security boundaries according to the plane it is on. > > > > > However, one thing I'm not sure is the third arg from ioctl: > > int ioctl(int fd, unsigned long request, ...); > > Is it possible for the driver to use the same request id, then put > > whatever into the third arg ? how to deal with that effectively ? > > I'm not sure about the value of all the arguments (except the command > one) vs. the complexity to filter them, but we could discuss that when > we'll reach this step. > > > > > For real world user cases, Dmitry Torokhov (added to list) can help. > > Yes please! > ya, it will help with the design if there is a real world scenario to study. > > > > PS: There is also lwn article about SELinux implementation of ioctl: [1] > > [1] https://lwn.net/Articles/428140/ > > Thanks for the pointer, this shows how complex this IOCTL access control > is. For Landlock, I'd like to provide the minimal required features to > enable user space to define their own rules, which means to let user > space (and especially libraries) to identify useful or potentially > harmful IOCTLs. > Yes. That makes sense. > > > > Thanks! > > -Jeff Xu
On 21/06/2023 01:44, Jeff Xu wrote: > On Sat, Jun 17, 2023 at 2:49 AM Mickaël Salaün <mic@digikod.net> wrote: >> >> >> On 24/05/2023 23:43, Jeff Xu wrote: >>> Sorry for the late reply. >>>> >>>> (Looking in the direction of Jeff Xu, who has inquired about Landlock >>>> for Chromium in the past -- do you happen to know in which ways you'd >>>> want to restrict ioctls, if you have that need? :)) >>>> >>> >>> Regarding this patch, here is some feedback from ChromeOS: >>> - In the short term: we are looking to integrate Landlock into our >>> sandboxer, so the ability to restrict to a specific device is huge. >>> - Fundamentally though, in the effort of bringing process expected >>> behaviour closest to allowed behaviour, the ability to speak of >>> ioctl() path access in Landlock would be huge -- at least we can >>> continue to enumerate in policy what processes are allowed to do, even >>> if we still lack the ability to restrict individual ioctl commands for >>> a specific device node. >> >> Thanks for the feedback! >> >>> >>> Regarding medium term: >>> My thoughts are, from software architecture point of view, it would be >>> nice to think in planes: i.e. Data plane / Control plane/ Signaling >>> Plane/Normal User Plane/Privileged User Plane. Let the application >>> define its planes, and assign operations to them. Landlock provides >>> data structure and syscall to construct the planes. >> >> I'm not sure to follow this plane thing. Could you give examples for >> these planes applied to Landlock? >> > The idea is probably along the same lines as yours: let user space > define/categorize ioctls. For example, for a camera driver, users can > define two planes - control plane: setup parameters of lens, data > plane: setup data buffers for data transfer and do start/stop (I'm > just making up the example since I don't really know the camera > driver). 
> > The idea is for Landlock to provide a mechanism to let user space to > divide/assign ioctls to different planes, such that the user space > processes can set/define security boundaries according to the plane it > is on. Right, we're on the same track. :) > >> >>> >>> However, one thing I'm not sure is the third arg from ioctl: >>> int ioctl(int fd, unsigned long request, ...); >>> Is it possible for the driver to use the same request id, then put >>> whatever into the third arg ? how to deal with that effectively ? >> >> I'm not sure about the value of all the arguments (except the command >> one) vs. the complexity to filter them, but we could discuss that when >> we'll reach this step. >> >>> >>> For real world user cases, Dmitry Torokhov (added to list) can help. >> >> Yes please! >> > ya, it will help with the design if there is a real world scenario to study. I'll get internal requirements too. > >>> >>> PS: There is also lwn article about SELinux implementation of ioctl: [1] >>> [1] https://lwn.net/Articles/428140/ >> >> Thanks for the pointer, this shows how complex this IOCTL access control >> is. For Landlock, I'd like to provide the minimal required features to >> enable user space to define their own rules, which means to let user >> space (and especially libraries) to identify useful or potentially >> harmful IOCTLs. >> > Yes. That makes sense. > >>> >>> Thanks! >>> -Jeff Xu
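The "planes" idea discussed above, where user space assigns each ioctl to a category and a sandbox allows a set of categories, could look roughly like this in user space. This is only a sketch: the plane IDs, the camera command numbers and the function name are all invented for illustration, and treating unclassified commands as denied is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical plane IDs chosen by the application, not by the kernel. */
enum plane { PLANE_CONTROL = 0, PLANE_DATA = 1 };

/* Made-up request numbers for an imaginary camera driver. */
#define CAM_SET_FOCUS    0x1001UL
#define CAM_QUEUE_BUFFER 0x1002UL
#define CAM_START_STREAM 0x1003UL

/* Each command belongs to exactly one plane (a number, not a mask). */
struct ioctl_plane_entry {
	unsigned long cmd;
	uint8_t plane; /* must be below 16 to fit the mask */
};

/*
 * Returns true if cmd is classified and its plane is set in
 * allowed_planes (a bitmask, so several planes can be allowed).
 * Assumption: commands missing from the table are denied.
 */
static bool ioctl_allowed(const struct ioctl_plane_entry *table, size_t n,
			  uint16_t allowed_planes, unsigned long cmd)
{
	for (size_t i = 0; i < n; i++) {
		if (table[i].cmd != cmd)
			continue;
		assert(table[i].plane < 16);
		return (allowed_planes >> table[i].plane) & 1;
	}
	return false;
}
```

The split between a per-command plane number and a per-sandbox bitmask follows the same pattern as the id_mask and allowed_ioctl_id_mask fields discussed earlier in the thread.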
Hello! On Sat, Jun 17, 2023 at 11:47:55AM +0200, Mickaël Salaün wrote: > > > We should also think about batch operations on FD (see the > > > close_range syscall), for instance to deny all IOCTLs on inherited > > > or received FDs. > > > > Hm, you mean a landlock_fd_rights_limit_range() syscall to limit the > > rights for an entire range of FDs? > > > > I have to admit, I'm not familiar with the real-life use cases of > > close_range(). In most programs I work with, it's difficult to reason > > about their ordering once the program has really started to run. So I > > imagine that close_range() is mostly used to "sanitize" the open file > > descriptors at the start of main(), and you have a similar use case in > > mind for this one as well? > > > > If it's just about closing the range from 0 to 2, I'm not sure it's > > worth adding a custom syscall. :) > > The advantage of this kind of range is to efficiently manage all potential > FDs, and the main use case is to close (or change, see the flags) everything > *except" 0-2 (i.e. 3-~0), and then avoid a lot of (potentially useless) > syscalls. > > The Landlock interface doesn't need to be a syscall. We could just add a new > rule type which could take a FD range and restrict them when calling > landlock_restrict_self(). Something like this: > struct landlock_fd_attr { > __u64 allowed_access; > __u32 first; > __u32 last; > } FYI, regarding the idea of dropping rights on already-opened files: I'm starting to have doubts about how feasible this is in practice. The "obvious" approach is to just remove the access rights from the security blob flags on the struct file. But these opened "struct file"s might be shared with other processes already, and mutating them in place would have undesired side effects on other processes. 
For example, if brltty uses ioctls on the terminal and then one of the programs running in that terminal drops ioctl rights on that open file, it would affect brltty as well, because both the Landlocked program and brltty use the same struct file. It could be technically stored next to the file descriptor list, where the close-on-exec flag is also stored, but that seems more cumbersome than I had hoped. I don't have a good approach for that idea yet, so I'll drop it for now. Ideas are welcome. :) —Günther
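The distinction drawn above, between state stored on the shared open file ("struct file") and state stored next to the file descriptor like the close-on-exec flag, can be observed from user space with standard POSIX calls. The sketch below is only an illustration of that sharing behaviour, not of any Landlock mechanism; the helper names are made up. File status flags such as O_NONBLOCK live on the open file description and are shared by dup()ed descriptors, whereas FD_CLOEXEC is per descriptor.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/*
 * Sets O_NONBLOCK through a dup()ed descriptor and reports whether the
 * original descriptor sees it. Returns false on any error.
 */
static bool status_flags_shared_across_dup(void)
{
	int p[2];
	bool shared = false;

	if (pipe(p) != 0)
		return false;
	int copy = dup(p[0]);
	if (copy >= 0) {
		int fl = fcntl(copy, F_GETFL);
		if (fl != -1 && fcntl(copy, F_SETFL, fl | O_NONBLOCK) == 0) {
			int seen = fcntl(p[0], F_GETFL);
			shared = seen != -1 && (seen & O_NONBLOCK) != 0;
		}
		close(copy);
	}
	close(p[0]);
	close(p[1]);
	return shared;
}

/*
 * Sets FD_CLOEXEC on a dup()ed descriptor and reports whether the
 * original descriptor sees it. Returns false on any error.
 */
static bool cloexec_shared_across_dup(void)
{
	int p[2];
	bool shared = false;

	if (pipe(p) != 0)
		return false;
	int copy = dup(p[0]);
	if (copy >= 0) {
		if (fcntl(copy, F_SETFD, FD_CLOEXEC) == 0) {
			int seen = fcntl(p[0], F_GETFD);
			shared = seen != -1 && (seen & FD_CLOEXEC) != 0;
		}
		close(copy);
	}
	close(p[0]);
	close(p[1]);
	return shared;
}
```

Rights stored on the open file would behave like the first case and leak to every holder of that file, such as brltty in the example, while per-descriptor storage would behave like the second.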
On 12/07/2023 13:08, Günther Noack wrote: > Hello! > > On Sat, Jun 17, 2023 at 11:47:55AM +0200, Mickaël Salaün wrote: >>>> We should also think about batch operations on FD (see the >>>> close_range syscall), for instance to deny all IOCTLs on inherited >>>> or received FDs. >>> >>> Hm, you mean a landlock_fd_rights_limit_range() syscall to limit the >>> rights for an entire range of FDs? >>> >>> I have to admit, I'm not familiar with the real-life use cases of >>> close_range(). In most programs I work with, it's difficult to reason >>> about their ordering once the program has really started to run. So I >>> imagine that close_range() is mostly used to "sanitize" the open file >>> descriptors at the start of main(), and you have a similar use case in >>> mind for this one as well? >>> >>> If it's just about closing the range from 0 to 2, I'm not sure it's >>> worth adding a custom syscall. :) >> >> The advantage of this kind of range is to efficiently manage all potential >> FDs, and the main use case is to close (or change, see the flags) everything >> *except" 0-2 (i.e. 3-~0), and then avoid a lot of (potentially useless) >> syscalls. >> >> The Landlock interface doesn't need to be a syscall. We could just add a new >> rule type which could take a FD range and restrict them when calling >> landlock_restrict_self(). Something like this: >> struct landlock_fd_attr { >> __u64 allowed_access; >> __u32 first; >> __u32 last; >> } > > FYI, regarding the idea of dropping rights on already-opened files: > I'm starting to have doubts about how feasible this is in practice. > > The "obvious" approach is to just remove the access rights from the security > blob flags on the struct file. > > But these opened "struct file"s might be shared with other processes already, > and mutating them in place would have undesired side effects on other processes. 
> > For example, if brltty uses ioctls on the terminal and then one of the programs > running in that terminal drops ioctl rights on that open file, it would affect > brltty as well, because both the Landlocked program and brltty use the same > struct file. > > It could be technically stored next to the file descriptor list, where the > close-on-exec flag is also stored, but that seems more cumbersome than I had > hoped. I don't have a good approach for that idea yet, so I'll drop it for now. Indeed, as discussed in another thread (patch v9 network support), I now think that file descriptors should not be touched nor restricted by Landlock. Even if there are file *descriptions* and file descriptors, Landlock should focus on what user space cannot already do (i.e. close file descriptors). Already acquired file descriptors should be a concern for user space sandboxers and the whole system/services. > > Ideas are welcome. :) > > —Günther >