Message ID | 20220221212522.320243-7-mic@digikod.net (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Landlock: file linking and renaming support | expand |
Hi "Mickaël, I love your patch! Yet something to improve: [auto build test ERROR on cfb92440ee71adcc2105b0890bb01ac3cddb8507] url: https://github.com/0day-ci/linux/commits/Micka-l-Sala-n/Landlock-file-linking-and-renaming-support/20220222-051842 base: cfb92440ee71adcc2105b0890bb01ac3cddb8507 config: hexagon-randconfig-r002-20220221 (https://download.01.org/0day-ci/archive/20220222/202202221149.qLO9DEqo-lkp@intel.com/config) compiler: clang version 15.0.0 (https://github.com/llvm/llvm-project d271fc04d5b97b12e6b797c6067d3c96a8d7470e) reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://github.com/0day-ci/linux/commit/c68b879f54d6262963d435a18cedbc238b7faeaf git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Micka-l-Sala-n/Landlock-file-linking-and-renaming-support/20220222-051842 git checkout c68b879f54d6262963d435a18cedbc238b7faeaf # save the config file to linux build tree mkdir build_dir COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=hexagon SHELL=/bin/bash If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot <lkp@intel.com> All errors (new ones prefixed by >>): >> security/landlock/fs.c:463:2: error: call to __compiletime_assert_228 declared with 'error' attribute: BUILD_BUG_ON failed: !layer_masks_dst_parent BUILD_BUG_ON(!layer_masks_dst_parent); ^ include/linux/build_bug.h:50:2: note: expanded from macro 'BUILD_BUG_ON' BUILD_BUG_ON_MSG(condition, "BUILD_BUG_ON failed: " #condition) ^ include/linux/build_bug.h:39:37: note: expanded from macro 'BUILD_BUG_ON_MSG' #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg) ^ include/linux/compiler_types.h:346:2: note: expanded from macro 'compiletime_assert' _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) ^ include/linux/compiler_types.h:334:2: note: expanded from macro '_compiletime_assert' __compiletime_assert(condition, msg, prefix, suffix) ^ include/linux/compiler_types.h:327:4: note: expanded from macro '__compiletime_assert' prefix ## suffix(); \ ^ <scratch space>:170:1: note: expanded from here __compiletime_assert_228 ^ >> security/landlock/fs.c:670:2: error: call to __compiletime_assert_229 declared with 'error' attribute: BUILD_BUG_ON failed: !layer_masks_dom BUILD_BUG_ON(!layer_masks_dom); ^ include/linux/build_bug.h:50:2: note: expanded from macro 'BUILD_BUG_ON' BUILD_BUG_ON_MSG(condition, "BUILD_BUG_ON failed: " #condition) ^ include/linux/build_bug.h:39:37: note: expanded from macro 'BUILD_BUG_ON_MSG' #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg) ^ include/linux/compiler_types.h:346:2: note: expanded from macro 'compiletime_assert' _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) ^ include/linux/compiler_types.h:334:2: note: expanded from macro '_compiletime_assert' __compiletime_assert(condition, msg, prefix, suffix) ^ include/linux/compiler_types.h:327:4: note: expanded from macro '__compiletime_assert' prefix ## suffix(); \ ^ <scratch space>:174:1: note: expanded from here __compiletime_assert_229 ^ 2 errors generated. vim +/error +463 security/landlock/fs.c 401 402 /** 403 * check_access_path_dual - Check a source and a destination accesses 404 * 405 * @domain: Domain to check against. 406 * @path: File hierarchy to walk through. 407 * @child_is_directory: Must be set to true if the (original) leaf is a 408 * directory, false otherwise. 409 * @access_request_dst_parent: Accesses to check, once @layer_masks_dst_parent 410 * is equal to @layer_masks_src_parent (if any). 411 * @layer_masks_dst_parent: Pointer to a matrix of layer masks per access 412 * masks, identifying the layers that forbid a specific access. Bits from 413 * this matrix can be unset according to the @path walk. An empty matrix 414 * means that @domain allows all possible Landlock accesses (i.e. not only 415 * those identified by @access_request_dst_parent). This matrix can 416 * initially refer to domain layer masks and, when the accesses for the 417 * destination and source are the same, to request layer masks. 418 * @access_request_src_parent: Similar to @access_request_dst_parent but for an 419 * initial source path request. Only taken into account if 420 * @layer_masks_src_parent is not NULL. 421 * @layer_masks_src_parent: Similar to @layer_masks_dst_parent but for an 422 * initial source path walk. This can be NULL if only dealing with a 423 * destination access request (i.e. not a rename nor a link action). 424 * @layer_masks_child: Similar to @layer_masks_src_parent but only for the 425 * linked or renamed inode (without hierarchy). This is only used if 426 * @layer_masks_src_parent is not NULL. 427 * 428 * This helper first checks that the destination has a superset of restrictions 429 * compared to the source (if any) for a common path. It then checks that the 430 * collected accesses and the remaining ones are enough to allow the request. 431 * 432 * Returns: 433 * - 0 if the access request is granted; 434 * - -EACCES if it is denied because of access right other than 435 * LANDLOCK_ACCESS_FS_REFER; 436 * - -EXDEV if the renaming or linking would be a privileged escalation 437 * (according to each layered policies), or if LANDLOCK_ACCESS_FS_REFER is 438 * not allowed by the source or the destination. 439 */ 440 static int check_access_path_dual(const struct landlock_ruleset *const domain, 441 const struct path *const path, 442 bool child_is_directory, 443 const access_mask_t access_request_dst_parent, 444 layer_mask_t (*const 445 layer_masks_dst_parent)[LANDLOCK_NUM_ACCESS_FS], 446 const access_mask_t access_request_src_parent, 447 layer_mask_t (*layer_masks_src_parent)[LANDLOCK_NUM_ACCESS_FS], 448 layer_mask_t (*layer_masks_child)[LANDLOCK_NUM_ACCESS_FS]) 449 { 450 bool allowed_dst_parent = false, allowed_src_parent = false, is_dom_check; 451 struct path walker_path; 452 access_mask_t access_masked_dst_parent, access_masked_src_parent; 453 454 if (!access_request_dst_parent && !access_request_src_parent) 455 return 0; 456 if (WARN_ON_ONCE(!domain || !path)) 457 return 0; 458 if (is_nouser_or_private(path->dentry)) 459 return 0; 460 if (WARN_ON_ONCE(domain->num_layers < 1)) 461 return -EACCES; 462 > 463 BUILD_BUG_ON(!layer_masks_dst_parent); 464 if (layer_masks_src_parent) { 465 if (WARN_ON_ONCE(!layer_masks_child)) 466 return -EACCES; 467 access_masked_dst_parent = access_masked_src_parent = 468 get_handled_accesses(domain); 469 is_dom_check = true; 470 } else { 471 if (WARN_ON_ONCE(layer_masks_child)) 472 return -EACCES; 473 access_masked_dst_parent = access_request_dst_parent; 474 access_masked_src_parent = access_request_src_parent; 475 is_dom_check = false; 476 } 477 478 walker_path = *path; 479 path_get(&walker_path); 480 /* 481 * We need to walk through all the hierarchy to not miss any relevant 482 * restriction. 483 */ 484 while (true) { 485 struct dentry *parent_dentry; 486 const struct landlock_rule *rule; 487 488 /* 489 * If at least all accesses allowed on the destination are 490 * already allowed on the source, respectively if there is at 491 * least as much as restrictions on the destination than on the 492 * source, then we can safely refer files from the source to 493 * the destination without risking a privilege escalation. 494 * This is crucial for standalone multilayered security 495 * policies. Furthermore, this helps avoid policy writers to 496 * shoot themselves in the foot. 497 */ 498 if (is_dom_check && is_superset(child_is_directory, 499 layer_masks_dst_parent, 500 layer_masks_src_parent, 501 layer_masks_child)) { 502 allowed_dst_parent = 503 scope_to_request(access_request_dst_parent, 504 layer_masks_dst_parent); 505 allowed_src_parent = 506 scope_to_request(access_request_src_parent, 507 layer_masks_src_parent); 508 509 /* Stops when all accesses are granted. */ 510 if (allowed_dst_parent && allowed_src_parent) 511 break; 512 513 /* 514 * Downgrades checks from domain handled accesses to 515 * requested accesses. 516 */ 517 is_dom_check = false; 518 access_masked_dst_parent = access_request_dst_parent; 519 access_masked_src_parent = access_request_src_parent; 520 } 521 522 rule = find_rule(domain, walker_path.dentry); 523 allowed_dst_parent = unmask_layers(rule, access_masked_dst_parent, 524 layer_masks_dst_parent); 525 allowed_src_parent = unmask_layers(rule, access_masked_src_parent, 526 layer_masks_src_parent); 527 528 /* Stops when a rule from each layer grants access. */ 529 if (allowed_dst_parent && allowed_src_parent) 530 break; 531 532 jump_up: 533 if (walker_path.dentry == walker_path.mnt->mnt_root) { 534 if (follow_up(&walker_path)) { 535 /* Ignores hidden mount points. */ 536 goto jump_up; 537 } else { 538 /* 539 * Stops at the real root. Denies access 540 * because not all layers have granted access. 541 */ 542 allowed_dst_parent = false; 543 break; 544 } 545 } 546 if (unlikely(IS_ROOT(walker_path.dentry))) { 547 /* 548 * Stops at disconnected root directories. Only allows 549 * access to internal filesystems (e.g. nsfs, which is 550 * reachable through /proc/<pid>/ns/<namespace>). 551 */ 552 allowed_dst_parent = !!(walker_path.mnt->mnt_flags & 553 MNT_INTERNAL); 554 break; 555 } 556 parent_dentry = dget_parent(walker_path.dentry); 557 dput(walker_path.dentry); 558 walker_path.dentry = parent_dentry; 559 } 560 path_put(&walker_path); 561 562 if (allowed_dst_parent && allowed_src_parent) 563 return 0; 564 565 /* 566 * Unfortunately, we cannot prioritize EACCES over EXDEV for all 567 * RENAME_EXCHANGE cases because it depends on the source and 568 * destination order. This could be changed with a new 569 * security_path_rename hook implementation. 570 */ 571 if (likely(is_eacces(layer_masks_dst_parent, access_request_dst_parent) 572 || is_eacces(layer_masks_src_parent, 573 access_request_src_parent))) 574 return -EACCES; 575 576 /* 577 * Gracefully forbids reparenting if the destination directory 578 * hierarchy is not a superset of restrictions of the source directory 579 * hierarchy, or if LANDLOCK_ACCESS_FS_REFER is not allowed by the 580 * source or the destination. 581 */ 582 return -EXDEV; 583 } 584 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
This error is because clang does not behave like GCC: check_access_path_dual() should be marked as __always_inline, or I should change from BUILD_BUG_ON() to WARN_ON_ONCE() if needed. I'll fix that in the next series. On 22/02/2022 04:16, kernel test robot wrote: > Hi "Mickaël, > > I love your patch! Yet something to improve: > > [auto build test ERROR on cfb92440ee71adcc2105b0890bb01ac3cddb8507] > > url: https://github.com/0day-ci/linux/commits/Micka-l-Sala-n/Landlock-file-linking-and-renaming-support/20220222-051842 > base: cfb92440ee71adcc2105b0890bb01ac3cddb8507 > config: hexagon-randconfig-r002-20220221 (https://download.01.org/0day-ci/archive/20220222/202202221149.qLO9DEqo-lkp@intel.com/config) > compiler: clang version 15.0.0 (https://github.com/llvm/llvm-project d271fc04d5b97b12e6b797c6067d3c96a8d7470e) > reproduce (this is a W=1 build): > wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross > chmod +x ~/bin/make.cross > # https://github.com/0day-ci/linux/commit/c68b879f54d6262963d435a18cedbc238b7faeaf > git remote add linux-review https://github.com/0day-ci/linux > git fetch --no-tags linux-review Micka-l-Sala-n/Landlock-file-linking-and-renaming-support/20220222-051842 > git checkout c68b879f54d6262963d435a18cedbc238b7faeaf > # save the config file to linux build tree > mkdir build_dir > COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=hexagon SHELL=/bin/bash > > If you fix the issue, kindly add following tag as appropriate > Reported-by: kernel test robot <lkp@intel.com> > > All errors (new ones prefixed by >>): > >>> security/landlock/fs.c:463:2: error: call to __compiletime_assert_228 declared with 'error' attribute: BUILD_BUG_ON failed: !layer_masks_dst_parent > BUILD_BUG_ON(!layer_masks_dst_parent); > ^ > include/linux/build_bug.h:50:2: note: expanded from macro 'BUILD_BUG_ON' > BUILD_BUG_ON_MSG(condition, "BUILD_BUG_ON failed: " #condition) > ^ > include/linux/build_bug.h:39:37: note: expanded from macro 'BUILD_BUG_ON_MSG' > #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg) > ^ > include/linux/compiler_types.h:346:2: note: expanded from macro 'compiletime_assert' > _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) > ^ > include/linux/compiler_types.h:334:2: note: expanded from macro '_compiletime_assert' > __compiletime_assert(condition, msg, prefix, suffix) > ^ > include/linux/compiler_types.h:327:4: note: expanded from macro '__compiletime_assert' > prefix ## suffix(); \ > ^ > <scratch space>:170:1: note: expanded from here > __compiletime_assert_228 > ^ >>> security/landlock/fs.c:670:2: error: call to __compiletime_assert_229 declared with 'error' attribute: BUILD_BUG_ON failed: !layer_masks_dom > BUILD_BUG_ON(!layer_masks_dom); > ^ > include/linux/build_bug.h:50:2: note: expanded from macro 'BUILD_BUG_ON' > BUILD_BUG_ON_MSG(condition, "BUILD_BUG_ON failed: " #condition) > ^ > include/linux/build_bug.h:39:37: note: expanded from macro 'BUILD_BUG_ON_MSG' > #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg) > ^ > include/linux/compiler_types.h:346:2: note: expanded from macro 'compiletime_assert' > _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) > ^ > include/linux/compiler_types.h:334:2: note: expanded from macro '_compiletime_assert' > __compiletime_assert(condition, msg, prefix, suffix) > ^ > include/linux/compiler_types.h:327:4: note: expanded from macro '__compiletime_assert' > prefix ## suffix(); \ > ^ > <scratch space>:174:1: note: expanded from here > __compiletime_assert_229 > ^ > 2 errors generated. > > > vim +/error +463 security/landlock/fs.c > > 401 > 402 /** > 403 * check_access_path_dual - Check a source and a destination accesses > 404 * > 405 * @domain: Domain to check against. > 406 * @path: File hierarchy to walk through. > 407 * @child_is_directory: Must be set to true if the (original) leaf is a > 408 * directory, false otherwise. > 409 * @access_request_dst_parent: Accesses to check, once @layer_masks_dst_parent > 410 * is equal to @layer_masks_src_parent (if any). > 411 * @layer_masks_dst_parent: Pointer to a matrix of layer masks per access > 412 * masks, identifying the layers that forbid a specific access. Bits from > 413 * this matrix can be unset according to the @path walk. An empty matrix > 414 * means that @domain allows all possible Landlock accesses (i.e. not only > 415 * those identified by @access_request_dst_parent). This matrix can > 416 * initially refer to domain layer masks and, when the accesses for the > 417 * destination and source are the same, to request layer masks. > 418 * @access_request_src_parent: Similar to @access_request_dst_parent but for an > 419 * initial source path request. Only taken into account if > 420 * @layer_masks_src_parent is not NULL. > 421 * @layer_masks_src_parent: Similar to @layer_masks_dst_parent but for an > 422 * initial source path walk. This can be NULL if only dealing with a > 423 * destination access request (i.e. not a rename nor a link action). > 424 * @layer_masks_child: Similar to @layer_masks_src_parent but only for the > 425 * linked or renamed inode (without hierarchy). This is only used if > 426 * @layer_masks_src_parent is not NULL. > 427 * > 428 * This helper first checks that the destination has a superset of restrictions > 429 * compared to the source (if any) for a common path. It then checks that the > 430 * collected accesses and the remaining ones are enough to allow the request. > 431 * > 432 * Returns: > 433 * - 0 if the access request is granted; > 434 * - -EACCES if it is denied because of access right other than > 435 * LANDLOCK_ACCESS_FS_REFER; > 436 * - -EXDEV if the renaming or linking would be a privileged escalation > 437 * (according to each layered policies), or if LANDLOCK_ACCESS_FS_REFER is > 438 * not allowed by the source or the destination. > 439 */ > 440 static int check_access_path_dual(const struct landlock_ruleset *const domain, > 441 const struct path *const path, > 442 bool child_is_directory, > 443 const access_mask_t access_request_dst_parent, > 444 layer_mask_t (*const > 445 layer_masks_dst_parent)[LANDLOCK_NUM_ACCESS_FS], > 446 const access_mask_t access_request_src_parent, > 447 layer_mask_t (*layer_masks_src_parent)[LANDLOCK_NUM_ACCESS_FS], > 448 layer_mask_t (*layer_masks_child)[LANDLOCK_NUM_ACCESS_FS]) > 449 { > 450 bool allowed_dst_parent = false, allowed_src_parent = false, is_dom_check; > 451 struct path walker_path; > 452 access_mask_t access_masked_dst_parent, access_masked_src_parent; > 453 > 454 if (!access_request_dst_parent && !access_request_src_parent) > 455 return 0; > 456 if (WARN_ON_ONCE(!domain || !path)) > 457 return 0; > 458 if (is_nouser_or_private(path->dentry)) > 459 return 0; > 460 if (WARN_ON_ONCE(domain->num_layers < 1)) > 461 return -EACCES; > 462 > > 463 BUILD_BUG_ON(!layer_masks_dst_parent); > 464 if (layer_masks_src_parent) { > 465 if (WARN_ON_ONCE(!layer_masks_child)) > 466 return -EACCES; > 467 access_masked_dst_parent = access_masked_src_parent = > 468 get_handled_accesses(domain); > 469 is_dom_check = true; > 470 } else { > 471 if (WARN_ON_ONCE(layer_masks_child)) > 472 return -EACCES; > 473 access_masked_dst_parent = access_request_dst_parent; > 474 access_masked_src_parent = access_request_src_parent; > 475 is_dom_check = false; > 476 } > 477 > 478 walker_path = *path; > 479 path_get(&walker_path); > 480 /* > 481 * We need to walk through all the hierarchy to not miss any relevant > 482 * restriction. > 483 */ > 484 while (true) { > 485 struct dentry *parent_dentry; > 486 const struct landlock_rule *rule; > 487 > 488 /* > 489 * If at least all accesses allowed on the destination are > 490 * already allowed on the source, respectively if there is at > 491 * least as much as restrictions on the destination than on the > 492 * source, then we can safely refer files from the source to > 493 * the destination without risking a privilege escalation. > 494 * This is crucial for standalone multilayered security > 495 * policies. Furthermore, this helps avoid policy writers to > 496 * shoot themselves in the foot. > 497 */ > 498 if (is_dom_check && is_superset(child_is_directory, > 499 layer_masks_dst_parent, > 500 layer_masks_src_parent, > 501 layer_masks_child)) { > 502 allowed_dst_parent = > 503 scope_to_request(access_request_dst_parent, > 504 layer_masks_dst_parent); > 505 allowed_src_parent = > 506 scope_to_request(access_request_src_parent, > 507 layer_masks_src_parent); > 508 > 509 /* Stops when all accesses are granted. */ > 510 if (allowed_dst_parent && allowed_src_parent) > 511 break; > 512 > 513 /* > 514 * Downgrades checks from domain handled accesses to > 515 * requested accesses. > 516 */ > 517 is_dom_check = false; > 518 access_masked_dst_parent = access_request_dst_parent; > 519 access_masked_src_parent = access_request_src_parent; > 520 } > 521 > 522 rule = find_rule(domain, walker_path.dentry); > 523 allowed_dst_parent = unmask_layers(rule, access_masked_dst_parent, > 524 layer_masks_dst_parent); > 525 allowed_src_parent = unmask_layers(rule, access_masked_src_parent, > 526 layer_masks_src_parent); > 527 > 528 /* Stops when a rule from each layer grants access. */ > 529 if (allowed_dst_parent && allowed_src_parent) > 530 break; > 531 > 532 jump_up: > 533 if (walker_path.dentry == walker_path.mnt->mnt_root) { > 534 if (follow_up(&walker_path)) { > 535 /* Ignores hidden mount points. */ > 536 goto jump_up; > 537 } else { > 538 /* > 539 * Stops at the real root. Denies access > 540 * because not all layers have granted access. > 541 */ > 542 allowed_dst_parent = false; > 543 break; > 544 } > 545 } > 546 if (unlikely(IS_ROOT(walker_path.dentry))) { > 547 /* > 548 * Stops at disconnected root directories. Only allows > 549 * access to internal filesystems (e.g. nsfs, which is > 550 * reachable through /proc/<pid>/ns/<namespace>). > 551 */ > 552 allowed_dst_parent = !!(walker_path.mnt->mnt_flags & > 553 MNT_INTERNAL); > 554 break; > 555 } > 556 parent_dentry = dget_parent(walker_path.dentry); > 557 dput(walker_path.dentry); > 558 walker_path.dentry = parent_dentry; > 559 } > 560 path_put(&walker_path); > 561 > 562 if (allowed_dst_parent && allowed_src_parent) > 563 return 0; > 564 > 565 /* > 566 * Unfortunately, we cannot prioritize EACCES over EXDEV for all > 567 * RENAME_EXCHANGE cases because it depends on the source and > 568 * destination order. This could be changed with a new > 569 * security_path_rename hook implementation. > 570 */ > 571 if (likely(is_eacces(layer_masks_dst_parent, access_request_dst_parent) > 572 || is_eacces(layer_masks_src_parent, > 573 access_request_src_parent))) > 574 return -EACCES; > 575 > 576 /* > 577 * Gracefully forbids reparenting if the destination directory > 578 * hierarchy is not a superset of restrictions of the source directory > 579 * hierarchy, or if LANDLOCK_ACCESS_FS_REFER is not allowed by the > 580 * source or the destination. > 581 */ > 582 return -EXDEV; > 583 } > 584 > > --- > 0-DAY CI Kernel Test Service, Intel Corporation > https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
On Mon, Feb 21, 2022 at 4:15 PM Mickaël Salaün <mic@digikod.net> wrote: > > From: Mickaël Salaün <mic@linux.microsoft.com> > > Add a new LANDLOCK_ACCESS_FS_REFER access right to enable policy writers > to allow sandboxed processes to link and rename files from and to a > specific set of file hierarchies. This access right should be composed > with LANDLOCK_ACCESS_FS_MAKE_* for the destination of a link or rename, > and with LANDLOCK_ACCESS_FS_REMOVE_* for a source of a rename. This > lift a Landlock limitation that always denied changing the parent of an > inode. > > Renaming or linking to the same directory is still always allowed, > whatever LANDLOCK_ACCESS_FS_REFER is used or not, because it is not > considered a threat to user data. > > However, creating multiple links or renaming to a different parent > directory may lead to privilege escalations if not handled properly. > Indeed, we must be sure that the source doesn't gain more privileges by > being accessible from the destination. This is handled by making sure > that the source hierarchy (including the referenced file or directory > itself) restricts at least as much the destination hierarchy. If it is > not the case, an EXDEV error is returned, making it potentially possible > for user space to copy the file hierarchy instead of moving or linking > it. > > Instead of creating different access rights for the source and the > destination, we choose to make it simple and consistent for users. > Indeed, considering the previous constraint, it would be weird to > require such destination access right to be also granted to the source > (to make it a superset). > > See the provided documentation for additional details. > > New tests are provided with a following commit. > > Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com> > Link: https://lore.kernel.org/r/20220221212522.320243-7-mic@digikod.net > --- > include/uapi/linux/landlock.h | 27 +- > security/landlock/fs.c | 550 ++++++++++++++++--- > security/landlock/limits.h | 2 +- > security/landlock/syscalls.c | 2 +- > tools/testing/selftests/landlock/base_test.c | 2 +- > tools/testing/selftests/landlock/fs_test.c | 3 +- > 6 files changed, 516 insertions(+), 70 deletions(-) ... > diff --git a/security/landlock/fs.c b/security/landlock/fs.c > index 3886f9ad1a60..c7c7ce4e7cd5 100644 > --- a/security/landlock/fs.c > +++ b/security/landlock/fs.c > @@ -4,6 +4,7 @@ > * > * Copyright © 2016-2020 Mickaël Salaün <mic@digikod.net> > * Copyright © 2018-2020 ANSSI > + * Copyright © 2021-2022 Microsoft Corporation > */ > > #include <linux/atomic.h> > @@ -269,16 +270,188 @@ static inline bool is_nouser_or_private(const struct dentry *dentry) > unlikely(IS_PRIVATE(d_backing_inode(dentry)))); > } > > -static int check_access_path(const struct landlock_ruleset *const domain, > - const struct path *const path, > +static inline access_mask_t get_handled_accesses( > + const struct landlock_ruleset *const domain) > +{ > + access_mask_t access_dom = 0; > + unsigned long access_bit; Would it be better to declare @access_bit as an access_mask_t type? You're not using any macros like for_each_set_bit() in this function so I believe it should be safe. > + for (access_bit = 0; access_bit < LANDLOCK_NUM_ACCESS_FS; > + access_bit++) { > + size_t layer_level; Considering the number of layers has dropped down to 16, it seems like a normal unsigned int might be big enough for @layer_level :) > + for (layer_level = 0; layer_level < domain->num_layers; > + layer_level++) { > + if (domain->fs_access_masks[layer_level] & > + BIT_ULL(access_bit)) { > + access_dom |= BIT_ULL(access_bit); > + break; > + } > + } > + } > + return access_dom; > +} > + > +static inline access_mask_t init_layer_masks( > + const struct landlock_ruleset *const domain, > + const access_mask_t access_request, > + layer_mask_t (*const layer_masks)[LANDLOCK_NUM_ACCESS_FS]) > +{ > + access_mask_t handled_accesses = 0; > + size_t layer_level; > + > + memset(layer_masks, 0, sizeof(*layer_masks)); > + if (WARN_ON_ONCE(!access_request)) > + return 0; > + > + /* Saves all handled accesses per layer. */ > + for (layer_level = 0; layer_level < domain->num_layers; > + layer_level++) { > + const unsigned long access_req = access_request; > + unsigned long access_bit; > + > + for_each_set_bit(access_bit, &access_req, > + ARRAY_SIZE(*layer_masks)) { > + if (domain->fs_access_masks[layer_level] & > + BIT_ULL(access_bit)) { > + (*layer_masks)[access_bit] |= > + BIT_ULL(layer_level); > + handled_accesses |= BIT_ULL(access_bit); > + } > + } > + } > + return handled_accesses; > +} > + > +/* > + * Check that a destination file hierarchy has more restrictions than a source > + * file hierarchy. This is only used for link and rename actions. > + */ > +static inline bool is_superset(bool child_is_directory, > + const layer_mask_t (*const > + layer_masks_dst_parent)[LANDLOCK_NUM_ACCESS_FS], > + const layer_mask_t (*const > + layer_masks_src_parent)[LANDLOCK_NUM_ACCESS_FS], > + const layer_mask_t (*const > + layer_masks_child)[LANDLOCK_NUM_ACCESS_FS]) > +{ > + unsigned long access_bit; > + > + for (access_bit = 0; access_bit < ARRAY_SIZE(*layer_masks_dst_parent); > + access_bit++) { > + /* Ignores accesses that only make sense for directories. */ > + if (!child_is_directory && !(BIT_ULL(access_bit) & ACCESS_FILE)) > + continue; > + > + /* > + * Checks if the destination restrictions are a superset of the > + * source ones (i.e. inherited access rights without child > + * exceptions). > + */ > + if ((((*layer_masks_src_parent)[access_bit] & (*layer_masks_child)[access_bit]) | > + (*layer_masks_dst_parent)[access_bit]) != > + (*layer_masks_dst_parent)[access_bit]) > + return false; > + } > + return true; > +} > + > +/* > + * Removes @layer_masks accesses that are not requested. > + * > + * Returns true if the request is allowed, false otherwise. > + */ > +static inline bool scope_to_request(const access_mask_t access_request, > + layer_mask_t (*const layer_masks)[LANDLOCK_NUM_ACCESS_FS]) > +{ > + const unsigned long access_req = access_request; > + unsigned long access_bit; > + > + if (WARN_ON_ONCE(!layer_masks)) > + return true; > + > + for_each_clear_bit(access_bit, &access_req, ARRAY_SIZE(*layer_masks)) > + (*layer_masks)[access_bit] = 0; > + return !memchr_inv(layer_masks, 0, sizeof(*layer_masks)); > +} > + > +/* > + * Returns true if there is at least one access right different than > + * LANDLOCK_ACCESS_FS_REFER. > + */ > +static inline bool is_eacces( > + const layer_mask_t (*const > + layer_masks)[LANDLOCK_NUM_ACCESS_FS], > const access_mask_t access_request) > { Granted, I don't have as deep of an understanding of Landlock as you do, but the function name "is_eacces" seems a little odd given the nature of the function. Perhaps "is_fsrefer"? > - layer_mask_t layer_masks[LANDLOCK_NUM_ACCESS_FS] = {}; > - bool allowed = false, has_access = false; > + unsigned long access_bit; > + /* LANDLOCK_ACCESS_FS_REFER alone must return -EXDEV. */ > + const unsigned long access_check = access_request & > + ~LANDLOCK_ACCESS_FS_REFER; > + > + if (!layer_masks) > + return false; > + > + for_each_set_bit(access_bit, &access_check, ARRAY_SIZE(*layer_masks)) { > + if ((*layer_masks)[access_bit]) > + return true; > + } Is calling for_each_set_bit() overkill here? @access_check should only ever have at most one bit set (LANDLOCK_ACCESS_FS_REFER), yes? > + return false; > +} > + > +/** > + * check_access_path_dual - Check a source and a destination accesses > + * > + * @domain: Domain to check against. > + * @path: File hierarchy to walk through. > + * @child_is_directory: Must be set to true if the (original) leaf is a > + * directory, false otherwise. > + * @access_request_dst_parent: Accesses to check, once @layer_masks_dst_parent > + * is equal to @layer_masks_src_parent (if any). > + * @layer_masks_dst_parent: Pointer to a matrix of layer masks per access > + * masks, identifying the layers that forbid a specific access. Bits from > + * this matrix can be unset according to the @path walk. An empty matrix > + * means that @domain allows all possible Landlock accesses (i.e. not only > + * those identified by @access_request_dst_parent). This matrix can > + * initially refer to domain layer masks and, when the accesses for the > + * destination and source are the same, to request layer masks. > + * @access_request_src_parent: Similar to @access_request_dst_parent but for an > + * initial source path request. Only taken into account if > + * @layer_masks_src_parent is not NULL. > + * @layer_masks_src_parent: Similar to @layer_masks_dst_parent but for an > + * initial source path walk. This can be NULL if only dealing with a > + * destination access request (i.e. not a rename nor a link action). > + * @layer_masks_child: Similar to @layer_masks_src_parent but only for the > + * linked or renamed inode (without hierarchy). This is only used if > + * @layer_masks_src_parent is not NULL. > + * > + * This helper first checks that the destination has a superset of restrictions > + * compared to the source (if any) for a common path. It then checks that the > + * collected accesses and the remaining ones are enough to allow the request. > + * > + * Returns: > + * - 0 if the access request is granted; > + * - -EACCES if it is denied because of access right other than > + * LANDLOCK_ACCESS_FS_REFER; > + * - -EXDEV if the renaming or linking would be a privileged escalation > + * (according to each layered policies), or if LANDLOCK_ACCESS_FS_REFER is > + * not allowed by the source or the destination. > + */ > +static int check_access_path_dual(const struct landlock_ruleset *const domain, > + const struct path *const path, > + bool child_is_directory, > + const access_mask_t access_request_dst_parent, > + layer_mask_t (*const > + layer_masks_dst_parent)[LANDLOCK_NUM_ACCESS_FS], > + const access_mask_t access_request_src_parent, > + layer_mask_t (*layer_masks_src_parent)[LANDLOCK_NUM_ACCESS_FS], > + layer_mask_t (*layer_masks_child)[LANDLOCK_NUM_ACCESS_FS]) > +{ > + bool allowed_dst_parent = false, allowed_src_parent = false, is_dom_check; > struct path walker_path; > - size_t i; > + access_mask_t access_masked_dst_parent, access_masked_src_parent; > > - if (!access_request) > + if (!access_request_dst_parent && !access_request_src_parent) > return 0; > if (WARN_ON_ONCE(!domain || !path)) > return 0; > @@ -287,22 +460,20 @@ static int check_access_path(const struct landlock_ruleset *const domain, > if (WARN_ON_ONCE(domain->num_layers < 1)) > return -EACCES; > > - /* Saves all layers handling a subset of requested accesses. */ > - for (i = 0; i < domain->num_layers; i++) { > - const unsigned long access_req = access_request; > - unsigned long access_bit; > - > - for_each_set_bit(access_bit, &access_req, > - ARRAY_SIZE(layer_masks)) { > - if (domain->fs_access_masks[i] & BIT_ULL(access_bit)) { > - layer_masks[access_bit] |= BIT_ULL(i); > - has_access = true; > - } > - } > + BUILD_BUG_ON(!layer_masks_dst_parent); I know the kbuild robot already flagged this, but checking function parameters with BUILD_BUG_ON() does seem a bit ... unusual :) > + if (layer_masks_src_parent) { > + if (WARN_ON_ONCE(!layer_masks_child)) > + return -EACCES; > + access_masked_dst_parent = access_masked_src_parent = > + get_handled_accesses(domain); > + is_dom_check = true; > + } else { > + if (WARN_ON_ONCE(layer_masks_child)) > + return -EACCES; > + access_masked_dst_parent = access_request_dst_parent; > + access_masked_src_parent = access_request_src_parent; > + is_dom_check = false; > } > - /* An access request not handled by the domain is allowed. */ > - if (!has_access) > - return 0; > > walker_path = *path; > path_get(&walker_path); > @@ -312,11 +483,50 @@ static int check_access_path(const struct landlock_ruleset *const domain, > */ > while (true) { > struct dentry *parent_dentry; > + const struct landlock_rule *rule; > + > + /* > + * If at least all accesses allowed on the destination are > + * already allowed on the source, respectively if there is at > + * least as much as restrictions on the destination than on the > + * source, then we can safely refer files from the source to > + * the destination without risking a privilege escalation. > + * This is crucial for standalone multilayered security > + * policies. Furthermore, this helps avoid policy writers to > + * shoot themselves in the foot. > + */ > + if (is_dom_check && is_superset(child_is_directory, > + layer_masks_dst_parent, > + layer_masks_src_parent, > + layer_masks_child)) { > + allowed_dst_parent = > + scope_to_request(access_request_dst_parent, > + layer_masks_dst_parent); > + allowed_src_parent = > + scope_to_request(access_request_src_parent, > + layer_masks_src_parent); > + > + /* Stops when all accesses are granted. */ > + if (allowed_dst_parent && allowed_src_parent) > + break; > + > + /* > + * Downgrades checks from domain handled accesses to > + * requested accesses. > + */ > + is_dom_check = false; > + access_masked_dst_parent = access_request_dst_parent; > + access_masked_src_parent = access_request_src_parent; > + } > + > + rule = find_rule(domain, walker_path.dentry); > + allowed_dst_parent = unmask_layers(rule, access_masked_dst_parent, > + layer_masks_dst_parent); > + allowed_src_parent = unmask_layers(rule, access_masked_src_parent, > + layer_masks_src_parent); > > - allowed = unmask_layers(find_rule(domain, walker_path.dentry), > - access_request, &layer_masks); > - if (allowed) > - /* Stops when a rule from each layer grants access. */ > + /* Stops when a rule from each layer grants access. */ > + if (allowed_dst_parent && allowed_src_parent) > break; If "(allowed_dst_parent && allowed_src_parent)" is true, you break out of the while loop only to do a path_put(), check the two booleans once more, and then return zero, yes? Why not just do the path_put() and return zero here? -- paul-moore.com
On 17/03/2022 02:26, Paul Moore wrote: > On Mon, Feb 21, 2022 at 4:15 PM Mickaël Salaün <mic@digikod.net> wrote: >> >> From: Mickaël Salaün <mic@linux.microsoft.com> >> >> Add a new LANDLOCK_ACCESS_FS_REFER access right to enable policy writers >> to allow sandboxed processes to link and rename files from and to a >> specific set of file hierarchies. This access right should be composed >> with LANDLOCK_ACCESS_FS_MAKE_* for the destination of a link or rename, >> and with LANDLOCK_ACCESS_FS_REMOVE_* for a source of a rename. This >> lift a Landlock limitation that always denied changing the parent of an >> inode. >> >> Renaming or linking to the same directory is still always allowed, >> whatever LANDLOCK_ACCESS_FS_REFER is used or not, because it is not >> considered a threat to user data. >> >> However, creating multiple links or renaming to a different parent >> directory may lead to privilege escalations if not handled properly. >> Indeed, we must be sure that the source doesn't gain more privileges by >> being accessible from the destination. This is handled by making sure >> that the source hierarchy (including the referenced file or directory >> itself) restricts at least as much the destination hierarchy. If it is >> not the case, an EXDEV error is returned, making it potentially possible >> for user space to copy the file hierarchy instead of moving or linking >> it. >> >> Instead of creating different access rights for the source and the >> destination, we choose to make it simple and consistent for users. >> Indeed, considering the previous constraint, it would be weird to >> require such destination access right to be also granted to the source >> (to make it a superset). >> >> See the provided documentation for additional details. >> >> New tests are provided with a following commit. >> >> Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com> >> Link: https://lore.kernel.org/r/20220221212522.320243-7-mic@digikod.net >> --- >> include/uapi/linux/landlock.h | 27 +- >> security/landlock/fs.c | 550 ++++++++++++++++--- >> security/landlock/limits.h | 2 +- >> security/landlock/syscalls.c | 2 +- >> tools/testing/selftests/landlock/base_test.c | 2 +- >> tools/testing/selftests/landlock/fs_test.c | 3 +- >> 6 files changed, 516 insertions(+), 70 deletions(-) > > ... > >> diff --git a/security/landlock/fs.c b/security/landlock/fs.c >> index 3886f9ad1a60..c7c7ce4e7cd5 100644 >> --- a/security/landlock/fs.c >> +++ b/security/landlock/fs.c >> @@ -4,6 +4,7 @@ >> * >> * Copyright © 2016-2020 Mickaël Salaün <mic@digikod.net> >> * Copyright © 2018-2020 ANSSI >> + * Copyright © 2021-2022 Microsoft Corporation >> */ >> >> #include <linux/atomic.h> >> @@ -269,16 +270,188 @@ static inline bool is_nouser_or_private(const struct dentry *dentry) >> unlikely(IS_PRIVATE(d_backing_inode(dentry)))); >> } >> >> -static int check_access_path(const struct landlock_ruleset *const domain, >> - const struct path *const path, >> +static inline access_mask_t get_handled_accesses( >> + const struct landlock_ruleset *const domain) >> +{ >> + access_mask_t access_dom = 0; >> + unsigned long access_bit; > > Would it be better to declare @access_bit as an access_mask_t type? > You're not using any macros like for_each_set_bit() in this function > so I believe it should be safe. Right, I'll change that. > >> + for (access_bit = 0; access_bit < LANDLOCK_NUM_ACCESS_FS; >> + access_bit++) { >> + size_t layer_level; > > Considering the number of layers has dropped down to 16, it seems like > a normal unsigned int might be big enough for @layer_level :) We could switch to u8, but I prefer to stick to size_t for array indexes which enable to reduce the cognitive workload related to the size of such array. ;) I guess there is enough info for compilers to optimize such code anyway. > >> + for (layer_level = 0; layer_level < domain->num_layers; >> + layer_level++) { >> + if (domain->fs_access_masks[layer_level] & >> + BIT_ULL(access_bit)) { >> + access_dom |= BIT_ULL(access_bit); >> + break; >> + } >> + } >> + } >> + return access_dom; >> +} >> + >> +static inline access_mask_t init_layer_masks( >> + const struct landlock_ruleset *const domain, >> + const access_mask_t access_request, >> + layer_mask_t (*const layer_masks)[LANDLOCK_NUM_ACCESS_FS]) >> +{ >> + access_mask_t handled_accesses = 0; >> + size_t layer_level; >> + >> + memset(layer_masks, 0, sizeof(*layer_masks)); >> + if (WARN_ON_ONCE(!access_request)) >> + return 0; >> + >> + /* Saves all handled accesses per layer. */ >> + for (layer_level = 0; layer_level < domain->num_layers; >> + layer_level++) { >> + const unsigned long access_req = access_request; >> + unsigned long access_bit; >> + >> + for_each_set_bit(access_bit, &access_req, >> + ARRAY_SIZE(*layer_masks)) { >> + if (domain->fs_access_masks[layer_level] & >> + BIT_ULL(access_bit)) { >> + (*layer_masks)[access_bit] |= >> + BIT_ULL(layer_level); >> + handled_accesses |= BIT_ULL(access_bit); >> + } >> + } >> + } >> + return handled_accesses; >> +} >> + >> +/* >> + * Check that a destination file hierarchy has more restrictions than a source >> + * file hierarchy. This is only used for link and rename actions. >> + */ >> +static inline bool is_superset(bool child_is_directory, >> + const layer_mask_t (*const >> + layer_masks_dst_parent)[LANDLOCK_NUM_ACCESS_FS], >> + const layer_mask_t (*const >> + layer_masks_src_parent)[LANDLOCK_NUM_ACCESS_FS], >> + const layer_mask_t (*const >> + layer_masks_child)[LANDLOCK_NUM_ACCESS_FS]) >> +{ >> + unsigned long access_bit; >> + >> + for (access_bit = 0; access_bit < ARRAY_SIZE(*layer_masks_dst_parent); >> + access_bit++) { >> + /* Ignores accesses that only make sense for directories. */ >> + if (!child_is_directory && !(BIT_ULL(access_bit) & ACCESS_FILE)) >> + continue; >> + >> + /* >> + * Checks if the destination restrictions are a superset of the >> + * source ones (i.e. inherited access rights without child >> + * exceptions). >> + */ >> + if ((((*layer_masks_src_parent)[access_bit] & (*layer_masks_child)[access_bit]) | >> + (*layer_masks_dst_parent)[access_bit]) != >> + (*layer_masks_dst_parent)[access_bit]) >> + return false; >> + } >> + return true; >> +} >> + >> +/* >> + * Removes @layer_masks accesses that are not requested. >> + * >> + * Returns true if the request is allowed, false otherwise. >> + */ >> +static inline bool scope_to_request(const access_mask_t access_request, >> + layer_mask_t (*const layer_masks)[LANDLOCK_NUM_ACCESS_FS]) >> +{ >> + const unsigned long access_req = access_request; >> + unsigned long access_bit; >> + >> + if (WARN_ON_ONCE(!layer_masks)) >> + return true; >> + >> + for_each_clear_bit(access_bit, &access_req, ARRAY_SIZE(*layer_masks)) >> + (*layer_masks)[access_bit] = 0; >> + return !memchr_inv(layer_masks, 0, sizeof(*layer_masks)); >> +} >> + >> +/* >> + * Returns true if there is at least one access right different than >> + * LANDLOCK_ACCESS_FS_REFER. >> + */ >> +static inline bool is_eacces( >> + const layer_mask_t (*const >> + layer_masks)[LANDLOCK_NUM_ACCESS_FS], >> const access_mask_t access_request) >> { > > Granted, I don't have as deep of an understanding of Landlock as you > do, but the function name "is_eacces" seems a little odd given the > nature of the function. Perhaps "is_fsrefer"? Hmm, this helper does multiple things which are necessary to know if we need to return -EACCES or -EXDEV. Renaming it to is_fsrefer() would require to inverse the logic and use boolean negations in the callers (because of ordering). Renaming to something like without_fs_refer() would not be completely correct because we also check if there is no layer_masks, which indicated that it doesn't contain an access right that should return -EACCES. This helper is named as such because the underlying semantic is to check for such error code, which is a tricky. I can rename it co contains_eacces() or something, but a longer name would require to cut the caller lines to fit 80 columns. :| > >> - layer_mask_t layer_masks[LANDLOCK_NUM_ACCESS_FS] = {}; >> - bool allowed = false, has_access = false; >> + unsigned long access_bit; >> + /* LANDLOCK_ACCESS_FS_REFER alone must return -EXDEV. */ >> + const unsigned long access_check = access_request & >> + ~LANDLOCK_ACCESS_FS_REFER; >> + >> + if (!layer_masks) >> + return false; >> + >> + for_each_set_bit(access_bit, &access_check, ARRAY_SIZE(*layer_masks)) { >> + if ((*layer_masks)[access_bit]) >> + return true; >> + } > > Is calling for_each_set_bit() overkill here? @access_check should > only ever have at most one bit set (LANDLOCK_ACCESS_FS_REFER), yes? No, it is the contrary, the bitmask is inverted and this loop check for non-FS_REFER access rights that should then return -EACCES. For instance, if a sandbox handles (and then restricts) MAKE_REG and REFER, a request to link a regular file would contains both of these bits, and the kernel should return -EACCES if MAKE_REG is not granted or -EXDEV if the request is only denied because of REFER. The reparent_* tests check the consistency of this behavior (with the exception of a RENAME_EXCHANGE case, see [1]). [1] https://lore.kernel.org/r/20220222175332.384545-1-mic@digikod.net > >> + return false; >> +} >> + >> +/** >> + * check_access_path_dual - Check a source and a destination accesses >> + * >> + * @domain: Domain to check against. >> + * @path: File hierarchy to walk through. >> + * @child_is_directory: Must be set to true if the (original) leaf is a >> + * directory, false otherwise. >> + * @access_request_dst_parent: Accesses to check, once @layer_masks_dst_parent >> + * is equal to @layer_masks_src_parent (if any). >> + * @layer_masks_dst_parent: Pointer to a matrix of layer masks per access >> + * masks, identifying the layers that forbid a specific access. Bits from >> + * this matrix can be unset according to the @path walk. An empty matrix >> + * means that @domain allows all possible Landlock accesses (i.e. not only >> + * those identified by @access_request_dst_parent). This matrix can >> + * initially refer to domain layer masks and, when the accesses for the >> + * destination and source are the same, to request layer masks. >> + * @access_request_src_parent: Similar to @access_request_dst_parent but for an >> + * initial source path request. Only taken into account if >> + * @layer_masks_src_parent is not NULL. >> + * @layer_masks_src_parent: Similar to @layer_masks_dst_parent but for an >> + * initial source path walk. This can be NULL if only dealing with a >> + * destination access request (i.e. not a rename nor a link action). >> + * @layer_masks_child: Similar to @layer_masks_src_parent but only for the >> + * linked or renamed inode (without hierarchy). This is only used if >> + * @layer_masks_src_parent is not NULL. >> + * >> + * This helper first checks that the destination has a superset of restrictions >> + * compared to the source (if any) for a common path. It then checks that the >> + * collected accesses and the remaining ones are enough to allow the request. >> + * >> + * Returns: >> + * - 0 if the access request is granted; >> + * - -EACCES if it is denied because of access right other than >> + * LANDLOCK_ACCESS_FS_REFER; >> + * - -EXDEV if the renaming or linking would be a privileged escalation >> + * (according to each layered policies), or if LANDLOCK_ACCESS_FS_REFER is >> + * not allowed by the source or the destination. >> + */ >> +static int check_access_path_dual(const struct landlock_ruleset *const domain, >> + const struct path *const path, >> + bool child_is_directory, >> + const access_mask_t access_request_dst_parent, >> + layer_mask_t (*const >> + layer_masks_dst_parent)[LANDLOCK_NUM_ACCESS_FS], >> + const access_mask_t access_request_src_parent, >> + layer_mask_t (*layer_masks_src_parent)[LANDLOCK_NUM_ACCESS_FS], >> + layer_mask_t (*layer_masks_child)[LANDLOCK_NUM_ACCESS_FS]) >> +{ >> + bool allowed_dst_parent = false, allowed_src_parent = false, is_dom_check; >> struct path walker_path; >> - size_t i; >> + access_mask_t access_masked_dst_parent, access_masked_src_parent; >> >> - if (!access_request) >> + if (!access_request_dst_parent && !access_request_src_parent) >> return 0; >> if (WARN_ON_ONCE(!domain || !path)) >> return 0; >> @@ -287,22 +460,20 @@ static int check_access_path(const struct landlock_ruleset *const domain, >> if (WARN_ON_ONCE(domain->num_layers < 1)) >> return -EACCES; >> >> - /* Saves all layers handling a subset of requested accesses. */ >> - for (i = 0; i < domain->num_layers; i++) { >> - const unsigned long access_req = access_request; >> - unsigned long access_bit; >> - >> - for_each_set_bit(access_bit, &access_req, >> - ARRAY_SIZE(layer_masks)) { >> - if (domain->fs_access_masks[i] & BIT_ULL(access_bit)) { >> - layer_masks[access_bit] |= BIT_ULL(i); >> - has_access = true; >> - } >> - } >> + BUILD_BUG_ON(!layer_masks_dst_parent); > > I know the kbuild robot already flagged this, but checking function > parameters with BUILD_BUG_ON() does seem a bit ... unusual :) Yeah, I like such guarantee but it may not work without __always_inline. I moved this check in the previous WARN_ON_ONCE(). > >> + if (layer_masks_src_parent) { >> + if (WARN_ON_ONCE(!layer_masks_child)) >> + return -EACCES; >> + access_masked_dst_parent = access_masked_src_parent = >> + get_handled_accesses(domain); >> + is_dom_check = true; >> + } else { >> + if (WARN_ON_ONCE(layer_masks_child)) >> + return -EACCES; >> + access_masked_dst_parent = access_request_dst_parent; >> + access_masked_src_parent = access_request_src_parent; >> + is_dom_check = false; >> } >> - /* An access request not handled by the domain is allowed. */ >> - if (!has_access) >> - return 0; >> >> walker_path = *path; >> path_get(&walker_path); >> @@ -312,11 +483,50 @@ static int check_access_path(const struct landlock_ruleset *const domain, >> */ >> while (true) { >> struct dentry *parent_dentry; >> + const struct landlock_rule *rule; >> + >> + /* >> + * If at least all accesses allowed on the destination are >> + * already allowed on the source, respectively if there is at >> + * least as much as restrictions on the destination than on the >> + * source, then we can safely refer files from the source to >> + * the destination without risking a privilege escalation. >> + * This is crucial for standalone multilayered security >> + * policies. Furthermore, this helps avoid policy writers to >> + * shoot themselves in the foot. >> + */ >> + if (is_dom_check && is_superset(child_is_directory, >> + layer_masks_dst_parent, >> + layer_masks_src_parent, >> + layer_masks_child)) { >> + allowed_dst_parent = >> + scope_to_request(access_request_dst_parent, >> + layer_masks_dst_parent); >> + allowed_src_parent = >> + scope_to_request(access_request_src_parent, >> + layer_masks_src_parent); >> + >> + /* Stops when all accesses are granted. */ >> + if (allowed_dst_parent && allowed_src_parent) >> + break; >> + >> + /* >> + * Downgrades checks from domain handled accesses to >> + * requested accesses. >> + */ >> + is_dom_check = false; >> + access_masked_dst_parent = access_request_dst_parent; >> + access_masked_src_parent = access_request_src_parent; >> + } >> + >> + rule = find_rule(domain, walker_path.dentry); >> + allowed_dst_parent = unmask_layers(rule, access_masked_dst_parent, >> + layer_masks_dst_parent); >> + allowed_src_parent = unmask_layers(rule, access_masked_src_parent, >> + layer_masks_src_parent); >> >> - allowed = unmask_layers(find_rule(domain, walker_path.dentry), >> - access_request, &layer_masks); >> - if (allowed) >> - /* Stops when a rule from each layer grants access. */ >> + /* Stops when a rule from each layer grants access. */ >> + if (allowed_dst_parent && allowed_src_parent) >> break; > > If "(allowed_dst_parent && allowed_src_parent)" is true, you break out > of the while loop only to do a path_put(), check the two booleans once > more, and then return zero, yes? Why not just do the path_put() and > return zero here? Correct, that would work, but I prefer not to duplicate the logic of granting access if it doesn't make the code more complex, which I think is not the case here, and I'm reluctant to duplicate path_get/put() calls. This loop break is a small optimization to avoid walking the path one more step, and writing it this way looks cleaner and less error-prone from my point of view.
On Thu, Mar 17, 2022 at 8:03 AM Mickaël Salaün <mic@digikod.net> wrote: > On 17/03/2022 02:26, Paul Moore wrote: > > On Mon, Feb 21, 2022 at 4:15 PM Mickaël Salaün <mic@digikod.net> wrote: > >> > >> From: Mickaël Salaün <mic@linux.microsoft.com> > >> > >> Add a new LANDLOCK_ACCESS_FS_REFER access right to enable policy writers > >> to allow sandboxed processes to link and rename files from and to a > >> specific set of file hierarchies. This access right should be composed > >> with LANDLOCK_ACCESS_FS_MAKE_* for the destination of a link or rename, > >> and with LANDLOCK_ACCESS_FS_REMOVE_* for a source of a rename. This > >> lift a Landlock limitation that always denied changing the parent of an > >> inode. > >> > >> Renaming or linking to the same directory is still always allowed, > >> whatever LANDLOCK_ACCESS_FS_REFER is used or not, because it is not > >> considered a threat to user data. > >> > >> However, creating multiple links or renaming to a different parent > >> directory may lead to privilege escalations if not handled properly. > >> Indeed, we must be sure that the source doesn't gain more privileges by > >> being accessible from the destination. This is handled by making sure > >> that the source hierarchy (including the referenced file or directory > >> itself) restricts at least as much the destination hierarchy. If it is > >> not the case, an EXDEV error is returned, making it potentially possible > >> for user space to copy the file hierarchy instead of moving or linking > >> it. > >> > >> Instead of creating different access rights for the source and the > >> destination, we choose to make it simple and consistent for users. > >> Indeed, considering the previous constraint, it would be weird to > >> require such destination access right to be also granted to the source > >> (to make it a superset). > >> > >> See the provided documentation for additional details. > >> > >> New tests are provided with a following commit. > >> > >> Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com> > >> Link: https://lore.kernel.org/r/20220221212522.320243-7-mic@digikod.net > >> --- > >> include/uapi/linux/landlock.h | 27 +- > >> security/landlock/fs.c | 550 ++++++++++++++++--- > >> security/landlock/limits.h | 2 +- > >> security/landlock/syscalls.c | 2 +- > >> tools/testing/selftests/landlock/base_test.c | 2 +- > >> tools/testing/selftests/landlock/fs_test.c | 3 +- > >> 6 files changed, 516 insertions(+), 70 deletions(-) ... > >> +/* > >> + * Returns true if there is at least one access right different than > >> + * LANDLOCK_ACCESS_FS_REFER. > >> + */ > >> +static inline bool is_eacces( > >> + const layer_mask_t (*const > >> + layer_masks)[LANDLOCK_NUM_ACCESS_FS], > >> const access_mask_t access_request) > >> { > > > > Granted, I don't have as deep of an understanding of Landlock as you > > do, but the function name "is_eacces" seems a little odd given the > > nature of the function. Perhaps "is_fsrefer"? > > Hmm, this helper does multiple things which are necessary to know if we > need to return -EACCES or -EXDEV. Renaming it to is_fsrefer() would > require to inverse the logic and use boolean negations in the callers > (because of ordering). Renaming to something like without_fs_refer() > would not be completely correct because we also check if there is no > layer_masks, which indicated that it doesn't contain an access right > that should return -EACCES. This helper is named as such because the > underlying semantic is to check for such error code, which is a tricky. > I can rename it co contains_eacces() or something, but a longer name > would require to cut the caller lines to fit 80 columns. :| You know the Landlock code better than I do, if you like "is_eacces()", then leave it as it is. > >> - layer_mask_t layer_masks[LANDLOCK_NUM_ACCESS_FS] = {}; > >> - bool allowed = false, has_access = false; > >> + unsigned long access_bit; > >> + /* LANDLOCK_ACCESS_FS_REFER alone must return -EXDEV. */ > >> + const unsigned long access_check = access_request & > >> + ~LANDLOCK_ACCESS_FS_REFER; > >> + > >> + if (!layer_masks) > >> + return false; > >> + > >> + for_each_set_bit(access_bit, &access_check, ARRAY_SIZE(*layer_masks)) { > >> + if ((*layer_masks)[access_bit]) > >> + return true; > >> + } > > > > Is calling for_each_set_bit() overkill here? @access_check should > > only ever have at most one bit set (LANDLOCK_ACCESS_FS_REFER), yes? > > No, it is the contrary ... Gotcha. Thanks for the clarification, I must have missed that when I was looking at it last night. > >> @@ -287,22 +460,20 @@ static int check_access_path(const struct landlock_ruleset *const domain, > >> if (WARN_ON_ONCE(domain->num_layers < 1)) > >> return -EACCES; > >> > >> - /* Saves all layers handling a subset of requested accesses. */ > >> - for (i = 0; i < domain->num_layers; i++) { > >> - const unsigned long access_req = access_request; > >> - unsigned long access_bit; > >> - > >> - for_each_set_bit(access_bit, &access_req, > >> - ARRAY_SIZE(layer_masks)) { > >> - if (domain->fs_access_masks[i] & BIT_ULL(access_bit)) { > >> - layer_masks[access_bit] |= BIT_ULL(i); > >> - has_access = true; > >> - } > >> - } > >> + BUILD_BUG_ON(!layer_masks_dst_parent); > > > > I know the kbuild robot already flagged this, but checking function > > parameters with BUILD_BUG_ON() does seem a bit ... unusual :) > > Yeah, I like such guarantee but it may not work without __always_inline. > I moved this check in the previous WARN_ON_ONCE(). That sounds good to me. > >> @@ -312,11 +483,50 @@ static int check_access_path(const struct landlock_ruleset *const domain, > >> */ > >> while (true) { > >> struct dentry *parent_dentry; > >> + const struct landlock_rule *rule; > >> + > >> + /* > >> + * If at least all accesses allowed on the destination are > >> + * already allowed on the source, respectively if there is at > >> + * least as much as restrictions on the destination than on the > >> + * source, then we can safely refer files from the source to > >> + * the destination without risking a privilege escalation. > >> + * This is crucial for standalone multilayered security > >> + * policies. Furthermore, this helps avoid policy writers to > >> + * shoot themselves in the foot. > >> + */ > >> + if (is_dom_check && is_superset(child_is_directory, > >> + layer_masks_dst_parent, > >> + layer_masks_src_parent, > >> + layer_masks_child)) { > >> + allowed_dst_parent = > >> + scope_to_request(access_request_dst_parent, > >> + layer_masks_dst_parent); > >> + allowed_src_parent = > >> + scope_to_request(access_request_src_parent, > >> + layer_masks_src_parent); > >> + > >> + /* Stops when all accesses are granted. */ > >> + if (allowed_dst_parent && allowed_src_parent) > >> + break; > >> + > >> + /* > >> + * Downgrades checks from domain handled accesses to > >> + * requested accesses. > >> + */ > >> + is_dom_check = false; > >> + access_masked_dst_parent = access_request_dst_parent; > >> + access_masked_src_parent = access_request_src_parent; > >> + } > >> + > >> + rule = find_rule(domain, walker_path.dentry); > >> + allowed_dst_parent = unmask_layers(rule, access_masked_dst_parent, > >> + layer_masks_dst_parent); > >> + allowed_src_parent = unmask_layers(rule, access_masked_src_parent, > >> + layer_masks_src_parent); > >> > >> - allowed = unmask_layers(find_rule(domain, walker_path.dentry), > >> - access_request, &layer_masks); > >> - if (allowed) > >> - /* Stops when a rule from each layer grants access. */ > >> + /* Stops when a rule from each layer grants access. */ > >> + if (allowed_dst_parent && allowed_src_parent) > >> break; > > > > If "(allowed_dst_parent && allowed_src_parent)" is true, you break out > > of the while loop only to do a path_put(), check the two booleans once > > more, and then return zero, yes? Why not just do the path_put() and > > return zero here? > > Correct, that would work, but I prefer not to duplicate the logic of > granting access if it doesn't make the code more complex, which I think > is not the case here, and I'm reluctant to duplicate path_get/put() > calls. This loop break is a small optimization to avoid walking the path > one more step, and writing it this way looks cleaner and less > error-prone from my point of view. I'm a big fan of maintainable code, and since you are the maintainer, if you prefer this approach I say stick with what you have :)
On 17/03/2022 13:04, Mickaël Salaün wrote: > > On 17/03/2022 02:26, Paul Moore wrote: [...] >>> @@ -269,16 +270,188 @@ static inline bool is_nouser_or_private(const >>> struct dentry *dentry) >>> >>> unlikely(IS_PRIVATE(d_backing_inode(dentry)))); >>> } >>> >>> -static int check_access_path(const struct landlock_ruleset *const >>> domain, >>> - const struct path *const path, >>> +static inline access_mask_t get_handled_accesses( >>> + const struct landlock_ruleset *const domain) >>> +{ >>> + access_mask_t access_dom = 0; >>> + unsigned long access_bit; >> >> Would it be better to declare @access_bit as an access_mask_t type? >> You're not using any macros like for_each_set_bit() in this function >> so I believe it should be safe. > > Right, I'll change that. Well, thinking about it again, access_bit is not an access mask but an index in such mask. access_mask_t gives enough space for such index but it is definitely not the right semantic. The best type should be size_t, but I prefer to stick to unsigned long (used for size_t anyway) for consistency with the other access_bit variable types. There is no need to use for_each_set_bit() here now but that could change, and I prefer to do my best to prevent future issues. ;) Anyway, I guess the compiler can optimize such code.
diff --git a/include/uapi/linux/landlock.h b/include/uapi/linux/landlock.h index b3d952067f59..f433d58a58f2 100644 --- a/include/uapi/linux/landlock.h +++ b/include/uapi/linux/landlock.h @@ -21,8 +21,14 @@ struct landlock_ruleset_attr { /** * @handled_access_fs: Bitmask of actions (cf. `Filesystem flags`_) * that is handled by this ruleset and should then be forbidden if no - * rule explicitly allow them. This is needed for backward - * compatibility reasons. + * rule explicitly allow them: it is a deny-by-default list that should + * contain as much Landlock access rights as possible. Indeed, all + * Landlock filesystem access rights that are not part of + * handled_access_fs are allowed. This is needed for backward + * compatibility reasons. One exception is the + * LANDLOCK_ACCESS_FS_REFER access right, which is always implicitly + * handled, but must still be explicitly handled to add new rules with + * this access right. */ __u64 handled_access_fs; }; @@ -109,6 +115,22 @@ struct landlock_path_beneath_attr { * - %LANDLOCK_ACCESS_FS_MAKE_FIFO: Create (or rename or link) a named pipe. * - %LANDLOCK_ACCESS_FS_MAKE_BLOCK: Create (or rename or link) a block device. * - %LANDLOCK_ACCESS_FS_MAKE_SYM: Create (or rename or link) a symbolic link. + * - %LANDLOCK_ACCESS_FS_REFER: Link or rename a file from or to a different + * directory (i.e. reparent a file hierarchy). This access right is + * available since the second version of the Landlock ABI. This is also the + * only access right which is always considered handled by any ruleset in + * such a way that reparenting a file hierarchy is always denied by default. + * To avoid privilege escalation, it is not enough to add a rule with this + * access right. When linking or renaming a file, the destination directory + * hierarchy must also always have the same or a superset of restrictions of + * the source hierarchy. If it is not the case, or if the domain doesn't + * handle this access right, such actions are denied by default with errno + * set to EXDEV. Linking also requires a LANDLOCK_ACCESS_FS_MAKE_* access + * right on the destination directory, and renaming also requires a + * LANDLOCK_ACCESS_FS_REMOVE_* access right on the source's (file or + * directory) parent. Otherwise, such actions are denied with errno set to + * EACCES. The EACCES errno prevails over EXDEV to let user space + * efficiently deal with an unrecoverable error. * * .. warning:: * @@ -133,5 +155,6 @@ struct landlock_path_beneath_attr { #define LANDLOCK_ACCESS_FS_MAKE_FIFO (1ULL << 10) #define LANDLOCK_ACCESS_FS_MAKE_BLOCK (1ULL << 11) #define LANDLOCK_ACCESS_FS_MAKE_SYM (1ULL << 12) +#define LANDLOCK_ACCESS_FS_REFER (1ULL << 13) #endif /* _UAPI_LINUX_LANDLOCK_H */ diff --git a/security/landlock/fs.c b/security/landlock/fs.c index 3886f9ad1a60..c7c7ce4e7cd5 100644 --- a/security/landlock/fs.c +++ b/security/landlock/fs.c @@ -4,6 +4,7 @@ * * Copyright © 2016-2020 Mickaël Salaün <mic@digikod.net> * Copyright © 2018-2020 ANSSI + * Copyright © 2021-2022 Microsoft Corporation */ #include <linux/atomic.h> @@ -269,16 +270,188 @@ static inline bool is_nouser_or_private(const struct dentry *dentry) unlikely(IS_PRIVATE(d_backing_inode(dentry)))); } -static int check_access_path(const struct landlock_ruleset *const domain, - const struct path *const path, +static inline access_mask_t get_handled_accesses( + const struct landlock_ruleset *const domain) +{ + access_mask_t access_dom = 0; + unsigned long access_bit; + + for (access_bit = 0; access_bit < LANDLOCK_NUM_ACCESS_FS; + access_bit++) { + size_t layer_level; + + for (layer_level = 0; layer_level < domain->num_layers; + layer_level++) { + if (domain->fs_access_masks[layer_level] & + BIT_ULL(access_bit)) { + access_dom |= BIT_ULL(access_bit); + break; + } + } + } + return access_dom; +} + +static inline access_mask_t init_layer_masks( + const struct landlock_ruleset *const domain, + const access_mask_t access_request, + layer_mask_t (*const layer_masks)[LANDLOCK_NUM_ACCESS_FS]) +{ + access_mask_t handled_accesses = 0; + size_t layer_level; + + memset(layer_masks, 0, sizeof(*layer_masks)); + if (WARN_ON_ONCE(!access_request)) + return 0; + + /* Saves all handled accesses per layer. */ + for (layer_level = 0; layer_level < domain->num_layers; + layer_level++) { + const unsigned long access_req = access_request; + unsigned long access_bit; + + for_each_set_bit(access_bit, &access_req, + ARRAY_SIZE(*layer_masks)) { + if (domain->fs_access_masks[layer_level] & + BIT_ULL(access_bit)) { + (*layer_masks)[access_bit] |= + BIT_ULL(layer_level); + handled_accesses |= BIT_ULL(access_bit); + } + } + } + return handled_accesses; +} + +/* + * Check that a destination file hierarchy has more restrictions than a source + * file hierarchy. This is only used for link and rename actions. + */ +static inline bool is_superset(bool child_is_directory, + const layer_mask_t (*const + layer_masks_dst_parent)[LANDLOCK_NUM_ACCESS_FS], + const layer_mask_t (*const + layer_masks_src_parent)[LANDLOCK_NUM_ACCESS_FS], + const layer_mask_t (*const + layer_masks_child)[LANDLOCK_NUM_ACCESS_FS]) +{ + unsigned long access_bit; + + for (access_bit = 0; access_bit < ARRAY_SIZE(*layer_masks_dst_parent); + access_bit++) { + /* Ignores accesses that only make sense for directories. */ + if (!child_is_directory && !(BIT_ULL(access_bit) & ACCESS_FILE)) + continue; + + /* + * Checks if the destination restrictions are a superset of the + * source ones (i.e. inherited access rights without child + * exceptions). + */ + if ((((*layer_masks_src_parent)[access_bit] & (*layer_masks_child)[access_bit]) | + (*layer_masks_dst_parent)[access_bit]) != + (*layer_masks_dst_parent)[access_bit]) + return false; + } + return true; +} + +/* + * Removes @layer_masks accesses that are not requested. + * + * Returns true if the request is allowed, false otherwise. + */ +static inline bool scope_to_request(const access_mask_t access_request, + layer_mask_t (*const layer_masks)[LANDLOCK_NUM_ACCESS_FS]) +{ + const unsigned long access_req = access_request; + unsigned long access_bit; + + if (WARN_ON_ONCE(!layer_masks)) + return true; + + for_each_clear_bit(access_bit, &access_req, ARRAY_SIZE(*layer_masks)) + (*layer_masks)[access_bit] = 0; + return !memchr_inv(layer_masks, 0, sizeof(*layer_masks)); +} + +/* + * Returns true if there is at least one access right different than + * LANDLOCK_ACCESS_FS_REFER. + */ +static inline bool is_eacces( + const layer_mask_t (*const + layer_masks)[LANDLOCK_NUM_ACCESS_FS], const access_mask_t access_request) { - layer_mask_t layer_masks[LANDLOCK_NUM_ACCESS_FS] = {}; - bool allowed = false, has_access = false; + unsigned long access_bit; + /* LANDLOCK_ACCESS_FS_REFER alone must return -EXDEV. */ + const unsigned long access_check = access_request & + ~LANDLOCK_ACCESS_FS_REFER; + + if (!layer_masks) + return false; + + for_each_set_bit(access_bit, &access_check, ARRAY_SIZE(*layer_masks)) { + if ((*layer_masks)[access_bit]) + return true; + } + return false; +} + +/** + * check_access_path_dual - Check a source and a destination accesses + * + * @domain: Domain to check against. + * @path: File hierarchy to walk through. + * @child_is_directory: Must be set to true if the (original) leaf is a + * directory, false otherwise. + * @access_request_dst_parent: Accesses to check, once @layer_masks_dst_parent + * is equal to @layer_masks_src_parent (if any). + * @layer_masks_dst_parent: Pointer to a matrix of layer masks per access + * masks, identifying the layers that forbid a specific access. Bits from + * this matrix can be unset according to the @path walk. An empty matrix + * means that @domain allows all possible Landlock accesses (i.e. not only + * those identified by @access_request_dst_parent). This matrix can + * initially refer to domain layer masks and, when the accesses for the + * destination and source are the same, to request layer masks. + * @access_request_src_parent: Similar to @access_request_dst_parent but for an + * initial source path request. Only taken into account if + * @layer_masks_src_parent is not NULL. + * @layer_masks_src_parent: Similar to @layer_masks_dst_parent but for an + * initial source path walk. This can be NULL if only dealing with a + * destination access request (i.e. not a rename nor a link action). + * @layer_masks_child: Similar to @layer_masks_src_parent but only for the + * linked or renamed inode (without hierarchy). This is only used if + * @layer_masks_src_parent is not NULL. + * + * This helper first checks that the destination has a superset of restrictions + * compared to the source (if any) for a common path. It then checks that the + * collected accesses and the remaining ones are enough to allow the request. + * + * Returns: + * - 0 if the access request is granted; + * - -EACCES if it is denied because of access right other than + * LANDLOCK_ACCESS_FS_REFER; + * - -EXDEV if the renaming or linking would be a privileged escalation + * (according to each layered policies), or if LANDLOCK_ACCESS_FS_REFER is + * not allowed by the source or the destination. + */ +static int check_access_path_dual(const struct landlock_ruleset *const domain, + const struct path *const path, + bool child_is_directory, + const access_mask_t access_request_dst_parent, + layer_mask_t (*const + layer_masks_dst_parent)[LANDLOCK_NUM_ACCESS_FS], + const access_mask_t access_request_src_parent, + layer_mask_t (*layer_masks_src_parent)[LANDLOCK_NUM_ACCESS_FS], + layer_mask_t (*layer_masks_child)[LANDLOCK_NUM_ACCESS_FS]) +{ + bool allowed_dst_parent = false, allowed_src_parent = false, is_dom_check; struct path walker_path; - size_t i; + access_mask_t access_masked_dst_parent, access_masked_src_parent; - if (!access_request) + if (!access_request_dst_parent && !access_request_src_parent) return 0; if (WARN_ON_ONCE(!domain || !path)) return 0; @@ -287,22 +460,20 @@ static int check_access_path(const struct landlock_ruleset *const domain, if (WARN_ON_ONCE(domain->num_layers < 1)) return -EACCES; - /* Saves all layers handling a subset of requested accesses. */ - for (i = 0; i < domain->num_layers; i++) { - const unsigned long access_req = access_request; - unsigned long access_bit; - - for_each_set_bit(access_bit, &access_req, - ARRAY_SIZE(layer_masks)) { - if (domain->fs_access_masks[i] & BIT_ULL(access_bit)) { - layer_masks[access_bit] |= BIT_ULL(i); - has_access = true; - } - } + BUILD_BUG_ON(!layer_masks_dst_parent); + if (layer_masks_src_parent) { + if (WARN_ON_ONCE(!layer_masks_child)) + return -EACCES; + access_masked_dst_parent = access_masked_src_parent = + get_handled_accesses(domain); + is_dom_check = true; + } else { + if (WARN_ON_ONCE(layer_masks_child)) + return -EACCES; + access_masked_dst_parent = access_request_dst_parent; + access_masked_src_parent = access_request_src_parent; + is_dom_check = false; } - /* An access request not handled by the domain is allowed. */ - if (!has_access) - return 0; walker_path = *path; path_get(&walker_path); @@ -312,11 +483,50 @@ static int check_access_path(const struct landlock_ruleset *const domain, */ while (true) { struct dentry *parent_dentry; + const struct landlock_rule *rule; + + /* + * If at least all accesses allowed on the destination are + * already allowed on the source, respectively if there is at + * least as much as restrictions on the destination than on the + * source, then we can safely refer files from the source to + * the destination without risking a privilege escalation. + * This is crucial for standalone multilayered security + * policies. Furthermore, this helps avoid policy writers to + * shoot themselves in the foot. + */ + if (is_dom_check && is_superset(child_is_directory, + layer_masks_dst_parent, + layer_masks_src_parent, + layer_masks_child)) { + allowed_dst_parent = + scope_to_request(access_request_dst_parent, + layer_masks_dst_parent); + allowed_src_parent = + scope_to_request(access_request_src_parent, + layer_masks_src_parent); + + /* Stops when all accesses are granted. */ + if (allowed_dst_parent && allowed_src_parent) + break; + + /* + * Downgrades checks from domain handled accesses to + * requested accesses. + */ + is_dom_check = false; + access_masked_dst_parent = access_request_dst_parent; + access_masked_src_parent = access_request_src_parent; + } + + rule = find_rule(domain, walker_path.dentry); + allowed_dst_parent = unmask_layers(rule, access_masked_dst_parent, + layer_masks_dst_parent); + allowed_src_parent = unmask_layers(rule, access_masked_src_parent, + layer_masks_src_parent); - allowed = unmask_layers(find_rule(domain, walker_path.dentry), - access_request, &layer_masks); - if (allowed) - /* Stops when a rule from each layer grants access. */ + /* Stops when a rule from each layer grants access. */ + if (allowed_dst_parent && allowed_src_parent) break; jump_up: @@ -329,7 +539,7 @@ static int check_access_path(const struct landlock_ruleset *const domain, * Stops at the real root. Denies access * because not all layers have granted access. */ - allowed = false; + allowed_dst_parent = false; break; } } @@ -339,7 +549,8 @@ static int check_access_path(const struct landlock_ruleset *const domain, * access to internal filesystems (e.g. nsfs, which is * reachable through /proc/<pid>/ns/<namespace>). */ - allowed = !!(walker_path.mnt->mnt_flags & MNT_INTERNAL); + allowed_dst_parent = !!(walker_path.mnt->mnt_flags & + MNT_INTERNAL); break; } parent_dentry = dget_parent(walker_path.dentry); @@ -347,7 +558,40 @@ static int check_access_path(const struct landlock_ruleset *const domain, walker_path.dentry = parent_dentry; } path_put(&walker_path); - return allowed ? 0 : -EACCES; + + if (allowed_dst_parent && allowed_src_parent) + return 0; + + /* + * Unfortunately, we cannot prioritize EACCES over EXDEV for all + * RENAME_EXCHANGE cases because it depends on the source and + * destination order. This could be changed with a new + * security_path_rename hook implementation. + */ + if (likely(is_eacces(layer_masks_dst_parent, access_request_dst_parent) + || is_eacces(layer_masks_src_parent, + access_request_src_parent))) + return -EACCES; + + /* + * Gracefully forbids reparenting if the destination directory + * hierarchy is not a superset of restrictions of the source directory + * hierarchy, or if LANDLOCK_ACCESS_FS_REFER is not allowed by the + * source or the destination. + */ + return -EXDEV; +} + +static inline int check_access_path(const struct landlock_ruleset *const domain, + const struct path *const path, + access_mask_t access_request) +{ + layer_mask_t layer_masks[LANDLOCK_NUM_ACCESS_FS] = {}; + + access_request = init_layer_masks(domain, access_request, + &layer_masks); + return check_access_path_dual(domain, path, d_is_dir(path->dentry), + access_request, &layer_masks, 0, NULL, NULL); } static inline int current_check_access_path(const struct path *const path, @@ -394,6 +638,217 @@ static inline access_mask_t maybe_remove(const struct dentry *const dentry) LANDLOCK_ACCESS_FS_REMOVE_FILE; } +/** + * collect_domain_accesses - Walk through a file path and collect accesses + * + * @domain: Domain to check against. + * @mnt_root: Last directory to check. + * @dir: Directory to start the walk from. + * @layer_masks_dom: Where to store the collected accesses. + * + * This helper is useful to begin a path walk from the @dir directory to a + * @mnt_root directory used as a mount point. This mount point is the common + * ancestor between the source and the destination of a renamed and linked + * file. While walking from @dir to @mnt_root, we record all the domain's + * allowed accesses in @layer_masks_dom. + * + * This is similar to check_access_path_dual() but much simpler because it only + * handles walking on the same mount point and only check one set of accesses. + * + * Returns: + * - true if all the domain access rights are allowed for @dir; + * - false if the walk reached @mnt_root. + */ +static bool collect_domain_accesses( + const struct landlock_ruleset *const domain, + const struct dentry *const mnt_root, struct dentry *dir, + layer_mask_t (*const layer_masks_dom)[LANDLOCK_NUM_ACCESS_FS]) +{ + unsigned long access_dom; + bool ret = false; + + BUILD_BUG_ON(!layer_masks_dom); + if (WARN_ON_ONCE(!domain || !mnt_root || !dir)) + return true; + if (is_nouser_or_private(dir)) + return true; + + access_dom = init_layer_masks(domain, LANDLOCK_MASK_ACCESS_FS, + layer_masks_dom); + + dget(dir); + while (true) { + struct dentry *parent_dentry; + + /* Gets all layers allowing all domain accesses. */ + if (unmask_layers(find_rule(domain, dir), access_dom, + layer_masks_dom)) { + /* + * Stops when all handled accesses are allowed by at + * least one rule in each layer. + */ + ret = true; + break; + } + + /* We should not reach a root other than @mnt_root. */ + if (dir == mnt_root || WARN_ON_ONCE(IS_ROOT(dir))) { + ret = false; + break; + } + + parent_dentry = dget_parent(dir); + dput(dir); + dir = parent_dentry; + } + dput(dir); + return ret; +} + +/** + * current_check_refer_path - Check if a rename or link action is allowed + * + * @old_dentry: File or directory requested to be moved or linked. + * @new_dir: Destination parent directory. + * @new_dentry: Destination file or directory. + * @removable: Sets to true if it is a rename operation. + * + * Because of its unprivileged constraints, Landlock relies on file hierarchies + * (and not only inodes) to tie access rights to files. Being able to link or + * rename a file hierarchy brings some challenges. Indeed, moving or linking a + * file (i.e. creating a new reference to an inode) can have an impact on the + * actions allowed for a set of files if it would change its parent directory + * (i.e. reparenting). + * + * To avoid trivial access right bypasses, Landlock first checks if the file or + * directory requested to be moved would gain new access rights inherited from + * its new hierarchy. Before returning any error, Landlock then checks that + * the parent source hierarchy and the destination hierarchy would allow the + * link or rename action. If it is not the case, an error with EACCES is + * returned to inform user space that there is no way to remove or create the + * requested source file type. If it should be allowed but the new inherited + * access rights would be greater than the source access rights, then the + * kernel returns an error with EXDEV. Prioritizing EACCES over EXDEV enables + * user space to abort the whole operation if there is no way to do it, or to + * manually copy the source to the destination if this remains allowed, e.g. + * because file creation is allowed on the destination directory but not direct + * linking. + * + * To achieve this goal, the kernel needs to compare two file hierarchies: the + * one identifying the source file or directory (including itself), and the + * destination one. This can be seen as a multilayer partial ordering problem. + * The kernel walks through these paths and collect in a matrix the access + * rights that are denied per layer. These matrices are then compared to see + * if the destination one has more (or the same) restrictions as the source + * one. If this is the case, the requested action will not return EXDEV, which + * doesn't mean the action is allowed. The parent hierarchy of the source + * (i.e. parent directory), and the destination hierarchy must also be checked + * to verify that they explicitly allow such action (i.e. referencing, + * creation and potentially removal rights). The kernel implementation is then + * required to rely on three matrices of access rights: one for the source file + * or directory (i.e. the child), one for the source parent hierarchy and one + * for the destination hierarchy. These ephemeral matrices take some space on + * the stack, which limits the number of layers to a deemed reasonable number: + * 16. + * + * Returns: + * - 0 if access is allowed; + * - -EXDEV if @old_dentry would inherit new access rights from @new_dir; + * - -EACCES if file removal or creation is denied. + */ +static int current_check_refer_path(struct dentry *const old_dentry, + const struct path *const new_dir, + struct dentry *const new_dentry, + bool removable) +{ + const struct landlock_ruleset *const dom = + landlock_get_current_domain(); + bool allow_dst_parent, allow_src_parent; + access_mask_t access_request_dst_parent, access_request_src_parent, + access_child; + struct path mnt_dir; + layer_mask_t layer_masks_dst_parent[LANDLOCK_NUM_ACCESS_FS], + layer_masks_src_parent[LANDLOCK_NUM_ACCESS_FS], + layer_masks_child[LANDLOCK_NUM_ACCESS_FS]; + + if (!dom) + return 0; + if (WARN_ON_ONCE(dom->num_layers < 1)) + return -EACCES; + if (unlikely(d_is_negative(old_dentry))) + return -ENOENT; + + access_request_dst_parent = + get_mode_access(d_backing_inode(old_dentry)->i_mode); + access_request_src_parent = 0; + if (removable) { + access_request_dst_parent |= maybe_remove(new_dentry); + access_request_src_parent |= maybe_remove(old_dentry); + } + + /* The mount points are the same for old and new paths, cf. EXDEV. */ + if (old_dentry->d_parent == new_dir->dentry) { + /* + * The LANDLOCK_ACCESS_FS_REFER access right is not required + * for same-directory referer (i.e. no reparenting). + */ + access_request_dst_parent = init_layer_masks(dom, + access_request_dst_parent | access_request_src_parent, + &layer_masks_dst_parent); + return check_access_path_dual(dom, new_dir, d_is_dir(old_dentry), + access_request_dst_parent, &layer_masks_dst_parent, + 0, NULL, NULL); + } + + /* Backward compatibility: no reparenting support. */ + if (!(get_handled_accesses(dom) & LANDLOCK_ACCESS_FS_REFER)) + return -EXDEV; + + access_request_src_parent |= LANDLOCK_ACCESS_FS_REFER; + access_request_dst_parent |= LANDLOCK_ACCESS_FS_REFER; + + /* Saves the common mount point. */ + mnt_dir.mnt = new_dir->mnt; + mnt_dir.dentry = new_dir->mnt->mnt_root; + + /* new_dir->dentry is equal to new_dentry->d_parent */ + allow_dst_parent = collect_domain_accesses(dom, mnt_dir.dentry, + new_dir->dentry, &layer_masks_dst_parent); + allow_src_parent = collect_domain_accesses(dom, mnt_dir.dentry, + old_dentry->d_parent, &layer_masks_src_parent); + + if (allow_src_parent) { + /* No need to go further if everything is allowed. */ + if (allow_dst_parent) + return 0; + + /* @new_dentry can only gain more restrictions. */ + if (scope_to_request(access_request_dst_parent, + &layer_masks_dst_parent)) + return 0; + + return check_access_path_dual(dom, &mnt_dir, d_is_dir(old_dentry), + access_request_dst_parent, &layer_masks_dst_parent, + 0, NULL, NULL); + } + + /* + * To be able to compare source and destination domain access rights, + * take into account the @old_dentry access rights aggregated with its + * parent access rights. This will be useful to compare with the + * destination parent access rights. + */ + access_child = init_layer_masks(dom, LANDLOCK_MASK_ACCESS_FS, + &layer_masks_child); + unmask_layers(find_rule(dom, old_dentry), access_child, + &layer_masks_child); + + return check_access_path_dual(dom, &mnt_dir, d_is_dir(old_dentry), + access_request_dst_parent, &layer_masks_dst_parent, + access_request_src_parent, &layer_masks_src_parent, + &layer_masks_child); +} + /* Inode hooks */ static void hook_inode_free_security(struct inode *const inode) @@ -587,31 +1042,11 @@ static int hook_sb_pivotroot(const struct path *const old_path, /* Path hooks */ -/* - * Creating multiple links or renaming may lead to privilege escalations if not - * handled properly. Indeed, we must be sure that the source doesn't gain more - * privileges by being accessible from the destination. This is getting more - * complex when dealing with multiple layers. The whole picture can be seen as - * a multilayer partial ordering problem. A future version of Landlock will - * deal with that. - */ static int hook_path_link(struct dentry *const old_dentry, const struct path *const new_dir, struct dentry *const new_dentry) { - const struct landlock_ruleset *const dom = - landlock_get_current_domain(); - - if (!dom) - return 0; - /* The mount points are the same for old and new paths, cf. EXDEV. */ - if (old_dentry->d_parent != new_dir->dentry) - /* Gracefully forbids reparenting. */ - return -EXDEV; - if (unlikely(d_is_negative(old_dentry))) - return -ENOENT; - return check_access_path(dom, new_dir, - get_mode_access(d_backing_inode(old_dentry)->i_mode)); + return current_check_refer_path(old_dentry, new_dir, new_dentry, false); } static int hook_path_rename(const struct path *const old_dir, @@ -619,21 +1054,8 @@ static int hook_path_rename(const struct path *const old_dir, const struct path *const new_dir, struct dentry *const new_dentry) { - const struct landlock_ruleset *const dom = - landlock_get_current_domain(); - - if (!dom) - return 0; - /* The mount points are the same for old and new paths, cf. EXDEV. */ - if (old_dir->dentry != new_dir->dentry) - /* Gracefully forbids reparenting. */ - return -EXDEV; - if (unlikely(d_is_negative(old_dentry))) - return -ENOENT; - /* RENAME_EXCHANGE is handled because directories are the same. */ - return check_access_path(dom, old_dir, maybe_remove(old_dentry) | - maybe_remove(new_dentry) | - get_mode_access(d_backing_inode(old_dentry)->i_mode)); + /* old_dir refers to old_dentry->d_parent and new_dir->mnt */ + return current_check_refer_path(old_dentry, new_dir, new_dentry, true); } static int hook_path_mkdir(const struct path *const dir, diff --git a/security/landlock/limits.h b/security/landlock/limits.h index 126d1ec04d34..26c8166d0265 100644 --- a/security/landlock/limits.h +++ b/security/landlock/limits.h @@ -16,7 +16,7 @@ #define LANDLOCK_MAX_NUM_LAYERS 16 #define LANDLOCK_MAX_NUM_RULES U32_MAX -#define LANDLOCK_LAST_ACCESS_FS LANDLOCK_ACCESS_FS_MAKE_SYM +#define LANDLOCK_LAST_ACCESS_FS LANDLOCK_ACCESS_FS_REFER #define LANDLOCK_MASK_ACCESS_FS ((LANDLOCK_LAST_ACCESS_FS << 1) - 1) #define LANDLOCK_NUM_ACCESS_FS __const_hweight64(LANDLOCK_MASK_ACCESS_FS) diff --git a/security/landlock/syscalls.c b/security/landlock/syscalls.c index 32396962f04d..fa14f09b6bf4 100644 --- a/security/landlock/syscalls.c +++ b/security/landlock/syscalls.c @@ -128,7 +128,7 @@ static const struct file_operations ruleset_fops = { .write = fop_dummy_write, }; -#define LANDLOCK_ABI_VERSION 1 +#define LANDLOCK_ABI_VERSION 2 /** * sys_landlock_create_ruleset - Create a new ruleset diff --git a/tools/testing/selftests/landlock/base_test.c b/tools/testing/selftests/landlock/base_test.c index ca40abe9daa8..99aab93d50e1 100644 --- a/tools/testing/selftests/landlock/base_test.c +++ b/tools/testing/selftests/landlock/base_test.c @@ -67,7 +67,7 @@ TEST(abi_version) { const struct landlock_ruleset_attr ruleset_attr = { .handled_access_fs = LANDLOCK_ACCESS_FS_READ_FILE, }; - ASSERT_EQ(1, landlock_create_ruleset(NULL, 0, + ASSERT_EQ(2, landlock_create_ruleset(NULL, 0, LANDLOCK_CREATE_RULESET_VERSION)); ASSERT_EQ(-1, landlock_create_ruleset(&ruleset_attr, 0, diff --git a/tools/testing/selftests/landlock/fs_test.c b/tools/testing/selftests/landlock/fs_test.c index 1ac41bfa7382..0568d1193492 100644 --- a/tools/testing/selftests/landlock/fs_test.c +++ b/tools/testing/selftests/landlock/fs_test.c @@ -381,7 +381,7 @@ TEST_F_FORK(layout1, inval) LANDLOCK_ACCESS_FS_WRITE_FILE | \ LANDLOCK_ACCESS_FS_READ_FILE) -#define ACCESS_LAST LANDLOCK_ACCESS_FS_MAKE_SYM +#define ACCESS_LAST LANDLOCK_ACCESS_FS_REFER #define ACCESS_ALL ( \ ACCESS_FILE | \ @@ -394,6 +394,7 @@ TEST_F_FORK(layout1, inval) LANDLOCK_ACCESS_FS_MAKE_SOCK | \ LANDLOCK_ACCESS_FS_MAKE_FIFO | \ LANDLOCK_ACCESS_FS_MAKE_BLOCK | \ + LANDLOCK_ACCESS_FS_MAKE_SYM | \ ACCESS_LAST) TEST_F_FORK(layout1, file_access_rights)