Message ID | 20240524093015.2402952-1-ivanov.mikhail1@huawei-partners.com (mailing list archive) |
---|---|
Series | Socket type control for Landlock |
On Fri, May 24, 2024 at 05:30:03PM +0800, Mikhail Ivanov wrote:
> Hello! This is v2 RFC patch dedicated to socket protocols restriction.
>
> It is based on the landlock's mic-next branch on top of v6.9 kernel
> version.

Hello Mikhail!

I patched in your patchset and tried to use the feature with a small
demo tool, but I ran into what I think is a bug -- do you happen to
know what this might be?

I used 6.10-rc1 as a base and patched your patches on top.

The code is a small tool called "nonet", which does the following:

  - Disable socket creation with a Landlock ruleset with the following
    attributes:

    struct landlock_ruleset_attr attr = {
            .handled_access_socket = LANDLOCK_ACCESS_SOCKET_CREATE,
    };

  - open("/dev/null", O_WRONLY)

Expected result:

  - open() should work

Observed result:

  - open() fails with EACCES.

I traced this with perf, and found that the open() gets rejected from
Landlock's hook_file_open, whereas hook_socket_create does not get
invoked. This is surprising to me -- Enabling a policy for socket
creation should not influence the outcome of opening files!

Tracing commands:

  sudo perf probe hook_socket_create '$params'
  sudo perf probe 'hook_file_open%return $retval'
  sudo perf record -e 'probe:*' -g -- ./nonet
  sudo perf report

You can find the tool in my landlock-examples repo in the nonet_bug branch:
https://github.com/gnoack/landlock-examples/blob/nonet_bug/nonet.c

Landlock is enabled like this:
https://github.com/gnoack/landlock-examples/blob/nonet_bug/sandbox_socket.c

Do you have a hunch what might be going on?

Thanks,
–Günther
6/4/2024 11:22 PM, Günther Noack wrote:
> On Fri, May 24, 2024 at 05:30:03PM +0800, Mikhail Ivanov wrote:
>> Hello! This is v2 RFC patch dedicated to socket protocols restriction.
>> [...]
>
> [...]
> Observed result:
>
>   - open() fails with EACCES.
>
> I traced this with perf, and found that the open() gets rejected from
> Landlock's hook_file_open, whereas hook_socket_create does not get
> invoked. This is surprising to me -- Enabling a policy for socket
> creation should not influence the outcome of opening files!
> [...]
> Do you have a hunch what might be going on?

Hello Günther!
Big thanks for this research!

I figured out that I define LANDLOCK_SHIFT_ACCESS_SOCKET macro in
really strange way (see landlock/limits.h):

  #define LANDLOCK_SHIFT_ACCESS_SOCKET    LANDLOCK_NUM_ACCESS_SOCKET

With this definition, socket access mask overlaps the fs access
mask in ruleset->access_masks[layer_level]. That's why
landlock_get_fs_access_mask() returns non-zero mask in hook_file_open().

So, the macro must be defined in this way:

  #define LANDLOCK_SHIFT_ACCESS_SOCKET    (LANDLOCK_NUM_ACCESS_NET + LANDLOCK_NUM_ACCESS_FS)

With this fix, open() doesn't fail in your example.

I'm really sorry that I somehow made such a stupid typo. I will try my
best to make sure this doesn't happen again.

>
> Thanks,
> –Günther
>
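The overlap described above can be reproduced outside the kernel with a small sketch. The bit counts (16 fs rights, 2 net rights, 1 socket right), the 32-bit packed word, and the helper names pack_socket() and get_fs_mask() are assumptions made for illustration only; they are not the definitions from landlock/limits.h or ruleset.h.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative sizes only (assumed, not copied from the kernel headers). */
#define NUM_ACCESS_FS      16
#define NUM_ACCESS_NET     2
#define NUM_ACCESS_SOCKET  1

/* Each access type occupies its own bit range in one packed word. */
#define SHIFT_ACCESS_FS    0
#define SHIFT_ACCESS_NET   NUM_ACCESS_FS

/* The typo: the shift is the socket bit count instead of its bit offset. */
#define SHIFT_SOCKET_BUGGY NUM_ACCESS_SOCKET
/* The fix: the socket bits start after the fs and net bits. */
#define SHIFT_SOCKET_FIXED (NUM_ACCESS_NET + NUM_ACCESS_FS)

#define MASK_FS            ((UINT32_C(1) << NUM_ACCESS_FS) - 1)
#define SOCKET_CREATE      (UINT32_C(1) << 0)

static_assert(SHIFT_SOCKET_FIXED + NUM_ACCESS_SOCKET <= 32,
	      "all access bits must fit in the packed 32-bit word");

/* Hypothetical helper: place the socket access bits at the given offset. */
static inline uint32_t pack_socket(uint32_t socket_access, unsigned int shift)
{
	return socket_access << shift;
}

/* Hypothetical helper: extract the fs bits from the packed word. */
static inline uint32_t get_fs_mask(uint32_t packed)
{
	return (packed >> SHIFT_ACCESS_FS) & MASK_FS;
}
```

With these assumed sizes, get_fs_mask(pack_socket(SOCKET_CREATE, SHIFT_SOCKET_BUGGY)) yields 0x2, so a ruleset that only handles socket creation looks as if it also handled one fs right. With SHIFT_SOCKET_FIXED the fs mask stays 0.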
Hello Mikhail!

On Thu, Jun 06, 2024 at 02:44:23PM +0300, Mikhail Ivanov wrote:
> 6/4/2024 11:22 PM, Günther Noack wrote:
> > [...]
> > Observed result:
> >
> >   - open() fails with EACCES.
> >
> > I traced this with perf, and found that the open() gets rejected from
> > Landlock's hook_file_open, whereas hook_socket_create does not get
> > invoked. This is surprising to me -- Enabling a policy for socket
> > creation should not influence the outcome of opening files!
> > [...]
> > Do you have a hunch what might be going on?
>
> Hello Günther!
> Big thanks for this research!
>
> I figured out that I define LANDLOCK_SHIFT_ACCESS_SOCKET macro in
> really strange way (see landlock/limits.h):
>
>   #define LANDLOCK_SHIFT_ACCESS_SOCKET    LANDLOCK_NUM_ACCESS_SOCKET
>
> With this definition, socket access mask overlaps the fs access
> mask in ruleset->access_masks[layer_level]. That's why
> landlock_get_fs_access_mask() returns non-zero mask in hook_file_open().
>
> So, the macro must be defined in this way:
>
>   #define LANDLOCK_SHIFT_ACCESS_SOCKET    (LANDLOCK_NUM_ACCESS_NET + LANDLOCK_NUM_ACCESS_FS)
>
> With this fix, open() doesn't fail in your example.
>
> I'm really sorry that I somehow made such a stupid typo. I will try my
> best to make sure this doesn't happen again.

Thanks for figuring it out so quickly. With that change, I'm getting some
compilation errors (some bit shifts are becoming too wide for the underlying
types), but I'm sure you can address that easily for the next version of the
patch set.

IMHO this shows that our reliance on bit manipulation is probably getting in the
way of code clarity. :-/ I hope we can simplify these internal structures at
some point. Once we have a better way to check for performance changes [1], we
can try to change this and measure whether this comprehensibility/performance
tradeoff is really worth it.

[1] https://github.com/landlock-lsm/linux/issues/24

The other takeaway in my mind is, we should probably have some tests for that,
to check that the enablement of one kind of policy does not affect the
operations that belong to other kinds of policies.
Like this, for instance (I was about to send this test to help debugging):

  TEST_F(mini, restricting_socket_does_not_affect_fs_actions)
  {
          const struct landlock_ruleset_attr ruleset_attr = {
                  .handled_access_socket = LANDLOCK_ACCESS_SOCKET_CREATE,
          };
          int ruleset_fd, fd;

          ruleset_fd = landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
          ASSERT_LE(0, ruleset_fd);

          enforce_ruleset(_metadata, ruleset_fd);
          ASSERT_EQ(0, close(ruleset_fd));

          /*
           * Accessing /dev/null for writing should be permitted,
           * because we did not add any file system restrictions.
           */
          fd = open("/dev/null", O_WRONLY);
          EXPECT_LE(0, fd);

          ASSERT_EQ(0, close(fd));
  }

Since these kinds of tests are a bit at the intersection between the
fs/net/socket tests, maybe they could go into a separate test file? The next
time we add a new kind of Landlock restriction, it would come more naturally to
add the matching test there and spot such issues earlier. Would you volunteer
to add such a test as part of your patch set? :)

Thanks,
Günther
On Thu, Jun 06, 2024 at 03:32:47PM +0200, Günther Noack wrote:
> Thanks for figuring it out so quickly. With that change, I'm getting some
> compilation errors (some bit shifts are becoming too wide for the underlying
> types), but I'm sure you can address that easily for the next version of the
> patch set.

Addendum, please ignore the remark about me getting compilation errors - I made
a typo myself, and it worked in the way you suggested without warnings or
errors.

–Günther
6/6/2024 4:32 PM, Günther Noack wrote:
> Hello Mikhail!
>
> On Thu, Jun 06, 2024 at 02:44:23PM +0300, Mikhail Ivanov wrote:
>> 6/4/2024 11:22 PM, Günther Noack wrote:
>>> [...]
>>> Do you have a hunch what might be going on?
>>
>> Hello Günther!
>> Big thanks for this research!
>>
>> I figured out that I define LANDLOCK_SHIFT_ACCESS_SOCKET macro in
>> really strange way (see landlock/limits.h):
>>
>>   #define LANDLOCK_SHIFT_ACCESS_SOCKET    LANDLOCK_NUM_ACCESS_SOCKET
>>
>> With this definition, socket access mask overlaps the fs access
>> mask in ruleset->access_masks[layer_level]. That's why
>> landlock_get_fs_access_mask() returns non-zero mask in hook_file_open().
>>
>> So, the macro must be defined in this way:
>>
>>   #define LANDLOCK_SHIFT_ACCESS_SOCKET    (LANDLOCK_NUM_ACCESS_NET + LANDLOCK_NUM_ACCESS_FS)
>>
>> With this fix, open() doesn't fail in your example.
>>
>> I'm really sorry that I somehow made such a stupid typo. I will try my
>> best to make sure this doesn't happen again.
>
> Thanks for figuring it out so quickly. With that change, I'm getting some
> compilation errors (some bit shifts are becoming too wide for the underlying
> types), but I'm sure you can address that easily for the next version of the
> patch set.
>
> IMHO this shows that our reliance on bit manipulation is probably getting in the
> way of code clarity. :-/ I hope we can simplify these internal structures at
> some point. Once we have a better way to check for performance changes [1], we
> can try to change this and measure whether this comprehensibility/performance
> tradeoff is really worth it.
>
> [1] https://github.com/landlock-lsm/linux/issues/24

Sounds great, probably this idea should be added to this issue [1].

[1] https://github.com/landlock-lsm/linux/issues/34

>
> The other takeaway in my mind is, we should probably have some tests for that,
> to check that the enablement of one kind of policy does not affect the
> operations that belong to other kinds of policies.
> Like this, for instance (I
> was about to send this test to help debugging):
>
>   TEST_F(mini, restricting_socket_does_not_affect_fs_actions)
>   {
>           const struct landlock_ruleset_attr ruleset_attr = {
>                   .handled_access_socket = LANDLOCK_ACCESS_SOCKET_CREATE,
>           };
>           int ruleset_fd, fd;
>
>           ruleset_fd = landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
>           ASSERT_LE(0, ruleset_fd);
>
>           enforce_ruleset(_metadata, ruleset_fd);
>           ASSERT_EQ(0, close(ruleset_fd));
>
>           /*
>            * Accessing /dev/null for writing should be permitted,
>            * because we did not add any file system restrictions.
>            */
>           fd = open("/dev/null", O_WRONLY);
>           EXPECT_LE(0, fd);
>
>           ASSERT_EQ(0, close(fd));
>   }
>
> Since these kinds of tests are a bit at the intersection between the
> fs/net/socket tests, maybe they could go into a separate test file? The next
> time we add a new kind of Landlock restriction, it would come more naturally to
> add the matching test there and spot such issues earlier. Would you volunteer
> to add such a test as part of your patch set? :)

Good idea! This test should probably be a part of the patch I mentioned
here [1]. WDYT? (Btw, [1] should also be a part of the issue mentioned
above).

[1] https://lore.kernel.org/all/f4b5e2b9-e960-fd08-fdf4-328bb475e2ef@huawei-partners.com/

>
> Thanks,
> Günther
On Thu, Jun 06, 2024 at 02:44:23PM +0300, Mikhail Ivanov wrote:
> 6/4/2024 11:22 PM, Günther Noack wrote:
> I figured out that I define LANDLOCK_SHIFT_ACCESS_SOCKET macro in
> really strange way (see landlock/limits.h):
>
>   #define LANDLOCK_SHIFT_ACCESS_SOCKET    LANDLOCK_NUM_ACCESS_SOCKET
>
> With this definition, socket access mask overlaps the fs access
> mask in ruleset->access_masks[layer_level]. That's why
> landlock_get_fs_access_mask() returns non-zero mask in hook_file_open().
>
> So, the macro must be defined in this way:
>
>   #define LANDLOCK_SHIFT_ACCESS_SOCKET    (LANDLOCK_NUM_ACCESS_NET + LANDLOCK_NUM_ACCESS_FS)
>
> With this fix, open() doesn't fail in your example.
>
> I'm really sorry that I somehow made such a stupid typo. I will try my
> best to make sure this doesn't happen again.

I found that we had the exact same bug with a wrongly defined "SHIFT" value in
[1].

Maybe we should define access_masks_t as a bit-field rather than doing the
bit-shifts by hand. Then the compiler would keep track of the bit-offsets
automatically.

Bit-fields have a bad reputation, but in my understanding, this is largely
because they make it hard to control the exact bit-by-bit layout. In our case,
we do not need such an exact control though, and it would be fine.

To quote Linus Torvalds on [2],

  Bitfields are fine if you don't actually care about the underlying format,
  and want gcc to just randomly assign bits, and want things to be
  convenient in that situation.

Let me send you a proposal patch which replaces access_masks_t with a bit-field
and removes the need for the "SHIFT" definition, which we already got wrong in
two patch sets now. It has the additional benefit of making the code a bit
shorter and also removing a few static_assert()s which are now guaranteed by the
compiler.

–Günther

[1] https://lore.kernel.org/all/ZmLEoBfHyUR3nKAV@google.com/
[2] https://yarchive.net/comp/linux/bitfields.html
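As a rough idea of what the bit-field approach could look like, here is a minimal userspace sketch. It is not the proposed kernel patch: the struct name access_masks_sketch, the field names, the bit counts, and the helper handle_socket_create() are all assumptions chosen for illustration.

```c
#include <assert.h>

/* Illustrative sizes only (assumed, not copied from the kernel headers). */
#define NUM_ACCESS_FS      16
#define NUM_ACCESS_NET     2
#define NUM_ACCESS_SOCKET  1

/*
 * One bit-field member per access type. The compiler assigns the bit
 * offsets, so there is no hand-written "SHIFT" constant to get wrong.
 */
struct access_masks_sketch {
	unsigned int fs : NUM_ACCESS_FS;
	unsigned int net : NUM_ACCESS_NET;
	unsigned int socket : NUM_ACCESS_SOCKET;
};

static_assert(NUM_ACCESS_FS + NUM_ACCESS_NET + NUM_ACCESS_SOCKET <= 32,
	      "sketch assumes all access bits fit in 32 bits");

/* Hypothetical helper: a mask that handles only socket creation. */
static inline struct access_masks_sketch handle_socket_create(void)
{
	struct access_masks_sketch masks = { .socket = 1 };

	return masks;
}
```

Reading handle_socket_create().fs gives 0 here, because setting the socket member cannot spill into the fs member. The exact placement of the members inside the storage unit is implementation-defined, which matches the point of the quote above: this only works when nobody depends on the bit layout.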
6/10/2024 11:03 AM, Günther Noack wrote:
> [...]
> I found that we had the exact same bug with a wrongly defined "SHIFT" value in
> [1].
>
> Maybe we should define access_masks_t as a bit-field rather than doing the
> bit-shifts by hand. Then the compiler would keep track of the bit-offsets
> automatically.
> [...]
> Let me send you a proposal patch which replaces access_masks_t with a bit-field
> and removes the need for the "SHIFT" definition, which we already got wrong in
> two patch sets now. It has the additional benefit of making the code a bit
> shorter and also removing a few static_assert()s which are now guaranteed by the
> compiler.
>
> –Günther
>
> [1] https://lore.kernel.org/all/ZmLEoBfHyUR3nKAV@google.com/
> [2] https://yarchive.net/comp/linux/bitfields.html

Thank you, Günther!
It really looks more clear.

This patch should be applied to Landlock separately, right?