[v6,00/15] integrity: Introduce the Integrity Digest Cache

Message ID 20241119104922.2772571-1-roberto.sassu@huaweicloud.com (mailing list archive)

Message

Roberto Sassu Nov. 19, 2024, 10:49 a.m. UTC
From: Roberto Sassu <roberto.sassu@huawei.com>

Integrity detection and protection have long been desirable features, both
to reach a large user base and to mitigate the risk of software flaws and
attacks.

However, while solutions exist, they struggle to reach a large user base
because they impose constraints on performance, flexibility and
configurability that only security-conscious people are willing to accept.

For example, IMA measurement requires the target platform to collect
integrity measurements, and to protect them with the TPM, which introduces
a noticeable overhead (up to 10x slower in a microbenchmark) on frequently
used system calls, such as open().

IMA Appraisal currently requires individual files to be signed and
verified, and Linux distributions to rebuild all packages to include file
signatures (Fedora has adopted this approach since Fedora 39). Like TPM
operations, signature verification also introduces a significant overhead,
especially if it is used to check the integrity of many files.

This is where the new Integrity Digest Cache comes into play: it offers
additional support for new and existing integrity solutions, to make
them faster and easier to deploy.

The Integrity Digest Cache can help IMA to reduce the number of TPM
operations and to make them happen in a deterministic way. If IMA knows
that a file comes from a Linux distribution, it can measure files in a
different way: measure the list of digests coming from the distribution
(e.g. RPM package headers), and subsequently measure a file if it is not
found in that list.
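
The measurement flow above can be sketched in userspace C as follows. This
is only an illustration under stated assumptions: struct digest_list,
digest_list_contains() and needs_file_measurement() are hypothetical names,
the fixed 32-byte digest size and the linear search are simplifications,
and none of this is the API introduced by this patch set.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DIGEST_LEN 32 /* assumption: SHA-256 sized digests */

/* Hypothetical in-memory view of one digest list that was already measured. */
struct digest_list {
	const unsigned char (*digests)[DIGEST_LEN];
	size_t num_digests;
};

/* Return true if @digest appears in @list (linear search for brevity). */
static bool digest_list_contains(const struct digest_list *list,
				 const unsigned char *digest)
{
	for (size_t i = 0; i < list->num_digests; i++)
		if (!memcmp(list->digests[i], digest, DIGEST_LEN))
			return true;
	return false;
}

/*
 * Decide whether a file needs its own measurement entry (and TPM extend).
 * Files whose digest is covered by the measured digest list are skipped;
 * only unknown files are measured individually.
 */
static bool needs_file_measurement(const struct digest_list *list,
				   const unsigned char *file_digest)
{
	return !digest_list_contains(list, file_digest);
}
```

In this sketch the TPM is extended once for the digest list and then only
for files that needs_file_measurement() flags as unknown.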

The performance improvement comes at the cost of IMA not reporting which
files from installed packages were accessed, and in which temporal
sequence. This approach might not be suitable for all use cases.

The Integrity Digest Cache can also help IMA with appraisal. IMA can simply
look up the calculated digest of an accessed file in the list of digests
extracted from package headers, after verifying the header signature. It is
sufficient to verify only one signature for all files in the package, as
opposed to verifying a signature for each file.
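
A minimal userspace sketch of this idea, under stated assumptions: struct
pkg_digest_list, verify_header_signature() and appraise_file() are
hypothetical names, and the signature check is a stand-in that always
succeeds and only counts how often it runs. It is not the code of this
patch set or of the IMA integration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DIGEST_LEN 32 /* assumption: SHA-256 sized digests */

/* Hypothetical digest list extracted from one signed package header. */
struct pkg_digest_list {
	const unsigned char (*digests)[DIGEST_LEN];
	size_t num_digests;
	bool sig_checked; /* header signature already evaluated? */
	bool sig_valid;   /* result of that single evaluation */
};

/* Counts how many (expensive) signature verifications were performed. */
static unsigned int sig_verifications;

/* Stand-in for a real signature check; always succeeds in this sketch. */
static bool verify_header_signature(const struct pkg_digest_list *list)
{
	(void)list;
	sig_verifications++;
	return true;
}

/* Appraise one file: verify the header once, then do a cheap lookup. */
static bool appraise_file(struct pkg_digest_list *list,
			  const unsigned char *file_digest)
{
	if (!list->sig_checked) {
		list->sig_valid = verify_header_signature(list);
		list->sig_checked = true;
	}
	if (!list->sig_valid)
		return false;

	for (size_t i = 0; i < list->num_digests; i++)
		if (!memcmp(list->digests[i], file_digest, DIGEST_LEN))
			return true;
	return false;
}
```

However many files of the package are appraised, sig_verifications stays
at one per digest list in this sketch.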

The same approach can be followed by other LSMs, such as Integrity Policy
Enforcement (IPE), and BPF LSM.

The Integrity Digest Cache is not tied to a specific package format. The
kernel supports a TLV-based digest list format. More can be added through
third-party kernel modules. The TLV parser has been verified for memory
safety with the Frama-C static analyzer. The version with the Frama-C
assertions is available here:

https://github.com/robertosassu/rpm-formal/blob/main/validate_tlv.c
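
For readers unfamiliar with TLV (type-length-value) encodings, a generic
bounds-checked walker can be sketched as below. The wire layout used here
(16-bit big-endian type, 32-bit big-endian length, then the value) and the
names tlv_walk(), tlv_cb, struct tlv_stats and tlv_count() are assumptions
made for illustration; the actual field sizes and API are defined by the
patches in this series and may differ.

```c
#include <stddef.h>
#include <stdint.h>

#define TLV_HDR_LEN 6 /* assumption: 2-byte type + 4-byte length */

/* Callback invoked for each entry; a non-zero return aborts the walk. */
typedef int (*tlv_cb)(void *ctx, uint16_t type, const uint8_t *value,
		      uint32_t len);

/* Walk all entries in @buf; return 0 on success, -1 on malformed input. */
static int tlv_walk(const uint8_t *buf, size_t buf_len, tlv_cb cb, void *ctx)
{
	while (buf_len) {
		uint16_t type;
		uint32_t len;

		if (buf_len < TLV_HDR_LEN)
			return -1; /* truncated entry header */

		type = (uint16_t)(buf[0] << 8 | buf[1]);
		len = (uint32_t)buf[2] << 24 | (uint32_t)buf[3] << 16 |
		      (uint32_t)buf[4] << 8 | (uint32_t)buf[5];
		buf += TLV_HDR_LEN;
		buf_len -= TLV_HDR_LEN;

		if (len > buf_len)
			return -1; /* value would run past the buffer */

		if (cb(ctx, type, buf, len))
			return -1;

		buf += len;
		buf_len -= len;
	}
	return 0;
}

/* Example callback: count entries and total value bytes. */
struct tlv_stats {
	unsigned int entries;
	size_t bytes;
};

static int tlv_count(void *ctx, uint16_t type, const uint8_t *value,
		     uint32_t len)
{
	struct tlv_stats *stats = ctx;

	(void)type;
	(void)value;
	stats->entries++;
	stats->bytes += len;
	return 0;
}
```

The two length checks are the kind of property a static analyzer is asked
to prove: no read ever goes past buf + buf_len.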

Integrating the Integrity Digest Cache in IMA brings significant
performance improvements: up to 67% and 79% for measurement respectively in
sequential and parallel file reads; up to 65% and 43% for appraisal
respectively in sequential and parallel file reads.

The performance can be further enhanced by using fsverity digests instead
of conventional file digests, which would make IMA verify only the portion
of the file to be read. However, at the moment, fsverity digests are not
included in RPM packages. Even once rpm is extended to include them, Linux
distributions will still have to rebuild their packages.

The Integrity Digest Cache can support both digest types, so that the
functionality is immediately available without waiting for Linux
distributions to make the transition.

This patch set only includes the patches necessary to extract digests from
a TLV-based data format and to expose an API for LSMs to query them. A
separate patch set will be provided to integrate it in IMA.

This patch set and the follow-up IMA integration can be tested by following
the instructions at:

https://github.com/linux-integrity/digest-cache-tools

This patch set applies on top of:

https://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity.git/log/?h=next-integrity

with commit 08ae3e5f5fc8 ("integrity: Use static_assert() to check struct
sizes").

Changelog

v5:
- Remove the RPM parser and selftests (suggested by Linus)
- Return digest cache pointer from digest_cache_lookup()
- Export new Parser API, and allow registration of third-party digest list
  parsers (suggested by Mimi)
- Reduce sizes in TLV format and remove TLV header (suggested by Jani
  Nikula)
- Introduce new DIGEST_LIST_NUM_ENTRIES TLV field
- Pass file descriptor instead of dentry in digest_cache_get() to properly
  detect potential deadlocks
- Introduce digest_cache_opened_fd() to tell lockdep when it is safe to
  nest a mutex if digest_cache_get() is called with that mutex held
- Add new patch to introduce ksys_finit_module()
- Make the TLV parser configurable (y/n/m) with Kconfig (suggested by
  Mimi)
- Don't store the path structure in the digest cache and pass it between
  creation and initialization of the digest cache
- Remove digest_cache_dir_update_dig_user() and keep the digest cache
  retrieved during digest_cache_get()
- Fail with an error pointer in digest_cache_dir_lookup_digest() if the
  current and passed directory digest cache don't match, or the digest
  cache was reset
- Handle num_digest = 0 in digest_cache_htable_init()
- Accept -EOPNOTSUPP error in digest_cache_new()
- Implement inode_free_security_rcu LSM hook instead of inode_free_security
- Move reservation of file descriptor security blob inside the #ifdef in
  init_ima_lsm()
- Add test file_reset_again to check the error pointer returned by
  digest_cache_lookup()
- Remove TLV_FAILURE_HDR_LEN TLV error test
- Add missing MODULE_DESCRIPTION in kselftest kernel module (suggested by
  Jeff Johnson)
- Replace dentry_open() with kernel_file_open() in populate.c and dir.c
- Skip affected tests when CONFIG_DYNAMIC_FTRACE_WITH_ARGS=n

v4:
- Rename digest_cache LSM to Integrity Digest Cache (suggested by Paul
  Moore)
- Update documentation
- Remove forward declaration of struct digest_cache in
  include/linux/digest_cache.h (suggested by Jarkko)
- Add DIGEST_CACHE_FREE digest cache event for notification
- Remove digest_cache_found_t typedef and use uintptr_t instead
- Add header callback in TLV parser and unexport tlv_parse_hdr() and
  tlv_parse_data()
- Plug the Integrity Digest Cache into the 'ima' LSM
- Switch from constructor to zeroing the object cache
- Remove notifier and detect digest cache changes by comparing pointers
- Rename digest_cache_dir_create() to digest_cache_dir_add_entries()
- Introduce digest_cache_dir_create() to create and initialize a directory
  digest cache
- Introduce digest_cache_dir_update_dig_user() to update dig_user with a
  file digest cache on positive digest lookup
- Use up to date directory digest cache, to take into account possible
  inode eviction for the old ones
- Introduce digest_cache_dir_prefetch() to prefetch digest lists
- Adjust component name in debug messages (suggested by Jarkko)
- Add FILE_PREFETCH and FILE_READ digest cache flags, remove RESET_USER
- Reintroduce spin lock for digest cache verification data (needed for the
  selftests)
- Get inode and file descriptor security blob offsets from outside (IMA)
- Avoid use-after-free in digest_cache_unref() by decrementing the ref.
  count after printing the debug message
- Check for digest list lookup loops also for the parent directory
- Put and clear dig_owner directly in digest_cache_reset_clear_owner()
- Move digest cache initialization code from digest_cache_create() to
  digest_cache_init()
- Hold the digest list path until the digest cache is initialized (to avoid
  premature inode eviction)
- Avoid race condition on setting DIR_PREFETCH in the directory digest
  cache
- Introduce digest_cache_dir_prefetch() and do it between digest cache
  creation and initialization (to avoid lock inversion)
- Avoid unnecessary length check in digest_list_parse_rpm()
- Declare arrays of strings in tlv parser as static
- Emit reset for parent directory on directory entry modification
- Rename digest_cache_reset_owner() to digest_cache_reset_clear_owner()
  and digest_cache_reset_user() to digest_cache_clear_user()
- Execute digest_cache_file_release() if either FMODE_WRITE or
  FMODE_CREATED is set in the file descriptor f_mode
- Determine in digest_cache_verif_set() which gfp flag to use depending on
  verifier ID
- Update selftests

v3:
- Rewrite documentation, and remove the installation instructions since
  they are now included in the README of digest-cache-tools
- Add digest cache event notifier
- Drop digest_cache_was_reset(), and send asynchronous notifications
  instead
- Fix digest_cache LSM Kconfig style issues (suggested by Randy Dunlap)
- Propagate digest cache reset to directory entries
- Destroy per directory entry mutex
- Introduce RESET_USER bit, to clear the dig_user pointer on
  set/removexattr
- Replace 'file content' with 'file data' (suggested by Mimi)
- Introduce per digest cache mutex and replace verif_data_lock spinlock
- Track changes of security.digest_list xattr
- Stop tracking file_open and use file_release instead also for file writes
- Add error messages in digest_cache_create()
- Load/unload testing kernel module automatically during execution of test
- Add tests for digest cache event notifier
- Add test for ftruncate()
- Remove DIGEST_CACHE_RESET_PREFETCH_BUF command in test and clear the
  buffer on read instead

v2:
- Include the TLV parser in this patch set (from user asymmetric keys and
  signatures)
- Move from IMA and make an independent LSM
- Remove IMA-specific stuff from this patch set
- Add per algorithm hash table
- Expect all digest lists to be in the same directory and allow changing
  the default directory
- Support digest lookup on directories, when there is no
  security.digest_list xattr
- Add seq num to digest list file name, to impose ordering on directory
  iteration
- Add a new data type DIGEST_LIST_ENTRY_DATA for the nested data in the
  tlv digest list format
- Add the concept of verification data attached to digest caches
- Add the reset mechanism to track changes on digest lists and directory
  containing the digest lists
- Add kernel selftests

v1:
- Add documentation in Documentation/security/integrity-digest-cache.rst
- Pass the mask of IMA actions to digest_cache_alloc()
- Add a reference count to the digest cache
- Remove the path parameter from digest_cache_get(), and rely on the
  reference count to avoid the digest cache disappearing while being used
- Rename the dentry_to_check parameter of digest_cache_get() to dentry
- Rename digest_cache_get() to digest_cache_new() and add
  digest_cache_get() to set the digest cache in the iint of the inode for
  which the digest cache was requested
- Add dig_owner and dig_user to the iint, to distinguish the inode from
  which the digest cache was created from the inode that is using it;
  consequently, the digest cache becomes usable to measure/appraise other
  digest caches (support not yet enabled)
- Add dig_owner_mutex and dig_user_mutex to serialize accesses to dig_owner
  and dig_user until they are initialized
- Enforce strong synchronization and make the contenders wait until
  dig_owner and dig_user are assigned to the iint the first time
- Move checking IMA actions on the digest list earlier, and fail if no
  actions were performed (digest cache not usable)
- Remove digest_cache_put(), not needed anymore with the introduction of
  the reference count
- Fail immediately in digest_cache_lookup() if the digest algorithm is
  not set in the digest cache
- Use 64 bit mask for IMA actions on the digest list instead of 8 bit
- Return NULL in the inline version of digest_cache_get()
- Use list_add_tail() instead of list_add() in the iterator
- Copy the digest list path to a separate buffer in digest_cache_iter_dir()
- Use digest list parsers verified with Frama-C
- Explicitly disable (for now) the possibility in the IMA policy to use the
  digest cache to measure/appraise other digest lists
- Replace exit(<value>) with return <value> in manage_digest_lists.c

Roberto Sassu (15):
  lib: Add TLV parser
  module: Introduce ksys_finit_module()
  integrity: Introduce the Integrity Digest Cache
  digest_cache: Initialize digest caches
  digest_cache: Add securityfs interface
  digest_cache: Add hash tables and operations
  digest_cache: Allow registration of digest list parsers
  digest_cache: Parse tlv digest lists
  digest_cache: Populate the digest cache from a digest list
  digest_cache: Add management of verification data
  digest_cache: Add support for directories
  digest cache: Prefetch digest lists if requested
  digest_cache: Reset digest cache on file/directory change
  selftests/digest_cache: Add selftests for the Integrity Digest Cache
  docs: Add documentation of the Integrity Digest Cache

 Documentation/security/digest_cache.rst       | 850 ++++++++++++++++++
 Documentation/security/index.rst              |   1 +
 MAINTAINERS                                   |  10 +
 include/linux/digest_cache.h                  | 131 +++
 include/linux/kernel_read_file.h              |   1 +
 include/linux/syscalls.h                      |  10 +
 include/linux/tlv_parser.h                    |  32 +
 include/uapi/linux/tlv_digest_list.h          |  47 +
 include/uapi/linux/tlv_parser.h               |  41 +
 include/uapi/linux/xattr.h                    |   6 +
 kernel/module/main.c                          |  43 +-
 lib/Kconfig                                   |   3 +
 lib/Makefile                                  |   2 +
 lib/tlv_parser.c                              |  87 ++
 lib/tlv_parser.h                              |  18 +
 security/integrity/Kconfig                    |   1 +
 security/integrity/Makefile                   |   1 +
 security/integrity/digest_cache/Kconfig       |  43 +
 security/integrity/digest_cache/Makefile      |  11 +
 security/integrity/digest_cache/dir.c         | 400 +++++++++
 security/integrity/digest_cache/htable.c      | 260 ++++++
 security/integrity/digest_cache/internal.h    | 283 ++++++
 security/integrity/digest_cache/main.c        | 597 ++++++++++++
 security/integrity/digest_cache/modsig.c      |  66 ++
 security/integrity/digest_cache/parsers.c     | 257 ++++++
 security/integrity/digest_cache/parsers/tlv.c | 341 +++++++
 security/integrity/digest_cache/populate.c    | 104 +++
 security/integrity/digest_cache/reset.c       | 227 +++++
 security/integrity/digest_cache/secfs.c       | 104 +++
 security/integrity/digest_cache/verif.c       | 135 +++
 security/integrity/ima/ima.h                  |   1 +
 security/integrity/ima/ima_fs.c               |   6 +
 security/integrity/ima/ima_main.c             |  10 +-
 tools/testing/selftests/Makefile              |   1 +
 .../testing/selftests/digest_cache/.gitignore |   3 +
 tools/testing/selftests/digest_cache/Makefile |  24 +
 .../testing/selftests/digest_cache/all_test.c | 769 ++++++++++++++++
 tools/testing/selftests/digest_cache/common.c |  78 ++
 tools/testing/selftests/digest_cache/common.h |  93 ++
 .../selftests/digest_cache/common_user.c      |  33 +
 .../selftests/digest_cache/common_user.h      |  15 +
 tools/testing/selftests/digest_cache/config   |   2 +
 .../selftests/digest_cache/generators.c       | 130 +++
 .../selftests/digest_cache/generators.h       |  16 +
 .../selftests/digest_cache/testmod/Makefile   |  16 +
 .../selftests/digest_cache/testmod/kern.c     | 551 ++++++++++++
 46 files changed, 5849 insertions(+), 11 deletions(-)
 create mode 100644 Documentation/security/digest_cache.rst
 create mode 100644 include/linux/digest_cache.h
 create mode 100644 include/linux/tlv_parser.h
 create mode 100644 include/uapi/linux/tlv_digest_list.h
 create mode 100644 include/uapi/linux/tlv_parser.h
 create mode 100644 lib/tlv_parser.c
 create mode 100644 lib/tlv_parser.h
 create mode 100644 security/integrity/digest_cache/Kconfig
 create mode 100644 security/integrity/digest_cache/Makefile
 create mode 100644 security/integrity/digest_cache/dir.c
 create mode 100644 security/integrity/digest_cache/htable.c
 create mode 100644 security/integrity/digest_cache/internal.h
 create mode 100644 security/integrity/digest_cache/main.c
 create mode 100644 security/integrity/digest_cache/modsig.c
 create mode 100644 security/integrity/digest_cache/parsers.c
 create mode 100644 security/integrity/digest_cache/parsers/tlv.c
 create mode 100644 security/integrity/digest_cache/populate.c
 create mode 100644 security/integrity/digest_cache/reset.c
 create mode 100644 security/integrity/digest_cache/secfs.c
 create mode 100644 security/integrity/digest_cache/verif.c
 create mode 100644 tools/testing/selftests/digest_cache/.gitignore
 create mode 100644 tools/testing/selftests/digest_cache/Makefile
 create mode 100644 tools/testing/selftests/digest_cache/all_test.c
 create mode 100644 tools/testing/selftests/digest_cache/common.c
 create mode 100644 tools/testing/selftests/digest_cache/common.h
 create mode 100644 tools/testing/selftests/digest_cache/common_user.c
 create mode 100644 tools/testing/selftests/digest_cache/common_user.h
 create mode 100644 tools/testing/selftests/digest_cache/config
 create mode 100644 tools/testing/selftests/digest_cache/generators.c
 create mode 100644 tools/testing/selftests/digest_cache/generators.h
 create mode 100644 tools/testing/selftests/digest_cache/testmod/Makefile
 create mode 100644 tools/testing/selftests/digest_cache/testmod/kern.c

Comments

Luis Chamberlain Nov. 19, 2024, 8:03 p.m. UTC | #1
On Tue, Nov 19, 2024 at 11:49:07AM +0100, Roberto Sassu wrote:
> From: Roberto Sassu <roberto.sassu@huawei.com>
> v5:
> - Add new patch to introduce ksys_finit_module()

Why?

  Luis

Eric Snowberg Nov. 26, 2024, 12:13 a.m. UTC | #2
> On Nov 19, 2024, at 3:49 AM, Roberto Sassu <roberto.sassu@huaweicloud.com> wrote:
> 
> From: Roberto Sassu <roberto.sassu@huawei.com>
> 
> The Integrity Digest Cache can also help IMA for appraisal. IMA can simply
> lookup the calculated digest of an accessed file in the list of digests
> extracted from package headers, after verifying the header signature. It is
> sufficient to verify only one signature for all files in the package, as
> opposed to verifying a signature for each file.

Is there a way to maintain integrity over time?  Today if a CVE is discovered 
in a signed program, the program hash can be added to the blacklist keyring. 
Later if IMA appraisal is used, the signature validation will fail just for that 
program.  With the Integrity Digest Cache, is there a way to do this?

Roberto Sassu Nov. 26, 2024, 10:41 a.m. UTC | #3
On Tue, 2024-11-26 at 00:13 +0000, Eric Snowberg wrote:
> 
> > On Nov 19, 2024, at 3:49 AM, Roberto Sassu <roberto.sassu@huaweicloud.com> wrote:
> > 
> > From: Roberto Sassu <roberto.sassu@huawei.com>
> > 
> > The Integrity Digest Cache can also help IMA for appraisal. IMA can simply
> > lookup the calculated digest of an accessed file in the list of digests
> > extracted from package headers, after verifying the header signature. It is
> > sufficient to verify only one signature for all files in the package, as
> > opposed to verifying a signature for each file.
> 
> Is there a way to maintain integrity over time?  Today if a CVE is discovered 
> in a signed program, the program hash can be added to the blacklist keyring. 
> Later if IMA appraisal is used, the signature validation will fail just for that 
> program.  With the Integrity Digest Cache, is there a way to do this?  

As far as I can see, the ima_check_blacklist() call is before
ima_appraise_measurement(). If it fails, appraisal with the Integrity
Digest Cache will not be done.
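
The ordering can be sketched with stand-in functions as below. Note that
check_blacklist(), appraise_with_digest_cache() and appraise_file_ordered()
are hypothetical placeholders written for this illustration, not the
kernel's ima_check_blacklist() or ima_appraise_measurement(); only the
control flow is the point.

```c
#include <errno.h>
#include <stdbool.h>

/* Counts how often the (stand-in) appraisal step actually ran. */
static unsigned int appraisals_run;

/* Stand-in blacklist check: reject blacklisted digests with -EPERM. */
static int check_blacklist(bool digest_blacklisted)
{
	return digest_blacklisted ? -EPERM : 0;
}

/* Stand-in appraisal: succeed only if the digest is in a digest cache. */
static int appraise_with_digest_cache(bool digest_in_cache)
{
	appraisals_run++;
	return digest_in_cache ? 0 : -EACCES;
}

/* The blacklist check runs first; on failure, appraisal is never reached. */
static int appraise_file_ordered(bool digest_blacklisted, bool digest_in_cache)
{
	int rc = check_blacklist(digest_blacklisted);

	if (rc)
		return rc;
	return appraise_with_digest_cache(digest_in_cache);
}
```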

In the future, we might use the Integrity Digest Cache for blacklists
too. Since a digest cache is reset on a file/directory change, IMA
would have to revalidate the program digest against a new digest cache.

Thanks

Roberto

Dr. Greg Nov. 27, 2024, 5:30 p.m. UTC | #4
On Tue, Nov 19, 2024 at 11:49:07AM +0100, Roberto Sassu wrote:

Hi Roberto, I hope the week is going well for you.

> From: Roberto Sassu <roberto.sassu@huawei.com>
> 
> Integrity detection and protection has long been a desirable feature, to
> reach a large user base and mitigate the risk of flaws in the software
> and attacks.
> 
> However, while solutions exist, they struggle to reach a large user base,
> due to requiring higher than desired constraints on performance,
> flexibility and configurability, that only security conscious people are
> willing to accept.
> 
> For example, IMA measurement requires the target platform to collect
> integrity measurements, and to protect them with the TPM, which introduces
> a noticeable overhead (up to 10x slower in a microbenchmark) on frequently
> used system calls, like the open().
> 
> IMA Appraisal currently requires individual files to be signed and
> verified, and Linux distributions to rebuild all packages to include file
> signatures (this approach has been adopted from Fedora 39+). Like a TPM,
> also signature verification introduces a significant overhead, especially
> if it is used to check the integrity of many files.
> 
> This is where the new Integrity Digest Cache comes into play, it offers
> additional support for new and existing integrity solutions, to make
> them faster and easier to deploy.
> 
> The Integrity Digest Cache can help IMA to reduce the number of TPM
> operations and to make them happen in a deterministic way. If IMA knows
> that a file comes from a Linux distribution, it can measure files in a
> different way: measure the list of digests coming from the distribution
> (e.g. RPM package headers), and subsequently measure a file if it is not
> found in that list.
> 
> The performance improvement comes at the cost of IMA not reporting which
> files from installed packages were accessed, and in which temporal
> sequence. This approach might not be suitable for all use cases.
> 
> The Integrity Digest Cache can also help IMA for appraisal. IMA can simply
> lookup the calculated digest of an accessed file in the list of digests
> extracted from package headers, after verifying the header signature. It is
> sufficient to verify only one signature for all files in the package, as
> opposed to verifying a signature for each file.

Roberto, a big picture question for you, our apologies if we
completely misunderstand your patch series.

The performance benefit comes from the fact that the kernel doesn't
have to read a file and calculate the cryptographic digest when the
file is accessed.  The 'trusted' digest value comes from a signed list
of digests that a packaging entity provides and the kernel validates.
So there is an integrity guarantee that the supplied digests were the
same as when the package was built.

Is there a guarantee implemented, that we missed, that the on-disk
file actually has the digest value that was initially generated by the
packaging entity when the file is accessed operationally?

Secondly, and in a related issue, what happens in a container
environment when a pathname is accessed that is actually a different
file but with the same effective pathname as a file that is in the
vendor validated digest list?

Once again, apologies, if we completely misinterpret the issues
involved.

Have a good remainder of the week.

As always,
Dr. Greg

The Quixote Project - Flailing at the Travails of Cybersecurity
              https://github.com/Quixote-Project

Roberto Sassu Nov. 27, 2024, 5:56 p.m. UTC | #5
On Wed, 2024-11-27 at 11:30 -0600, Dr. Greg wrote:
> On Tue, Nov 19, 2024 at 11:49:07AM +0100, Roberto Sassu wrote:
> 
> Hi Roberto, I hope the week is going well for you.
> 
> > From: Roberto Sassu <roberto.sassu@huawei.com>
> > 
> > Integrity detection and protection has long been a desirable feature, to
> > reach a large user base and mitigate the risk of flaws in the software
> > and attacks.
> > 
> > However, while solutions exist, they struggle to reach a large user base,
> > due to requiring higher than desired constraints on performance,
> > flexibility and configurability, that only security conscious people are
> > willing to accept.
> > 
> > For example, IMA measurement requires the target platform to collect
> > integrity measurements, and to protect them with the TPM, which introduces
> > a noticeable overhead (up to 10x slower in a microbenchmark) on frequently
> > used system calls, like the open().
> > 
> > IMA Appraisal currently requires individual files to be signed and
> > verified, and Linux distributions to rebuild all packages to include file
> > signatures (this approach has been adopted from Fedora 39+). Like a TPM,
> > also signature verification introduces a significant overhead, especially
> > if it is used to check the integrity of many files.
> > 
> > This is where the new Integrity Digest Cache comes into play, it offers
> > additional support for new and existing integrity solutions, to make
> > them faster and easier to deploy.
> > 
> > The Integrity Digest Cache can help IMA to reduce the number of TPM
> > operations and to make them happen in a deterministic way. If IMA knows
> > that a file comes from a Linux distribution, it can measure files in a
> > different way: measure the list of digests coming from the distribution
> > (e.g. RPM package headers), and subsequently measure a file if it is not
> > found in that list.
> > 
> > The performance improvement comes at the cost of IMA not reporting which
> > files from installed packages were accessed, and in which temporal
> > sequence. This approach might not be suitable for all use cases.
> > 
> > The Integrity Digest Cache can also help IMA for appraisal. IMA can simply
> > lookup the calculated digest of an accessed file in the list of digests
> > extracted from package headers, after verifying the header signature. It is
> > sufficient to verify only one signature for all files in the package, as
> > opposed to verifying a signature for each file.
> 
> Roberto, a big picture question for you, our apologies if we
> completely misunderstand your patch series.

Hi Greg

no need to apologise, happy to answer your questions.

> The performance benefit comes from the fact that the kernel doesn't
> have to read a file and calculate the cryptographic digest when the
> file is accessed.  The 'trusted' digest value comes from a signed list
> of digests that a packaging entity provides and the kernel validates.
> So there is an integrity guarantee that the supplied digests were the
> same as when the package was built.

The performance benefit (for appraisal with my benchmark: 65% with
sequential file access and 43% with parallel file access) comes from
verifying the ECDSA signature of 303 digest lists, as opposed to the
ECDSA signature of 12312 files.

The additional performance boost due to switching from file data digest
to fsverity digests is on top of that.

> Is there a guarantee implemented, that we missed, that the on-disk
> file actually has the digest value that was initially generated by the
> packaging entity when the file is accessed operationally?

Yes, the guarantee is provided by IMA by measuring the actual file
digest and searching it in a digest cache. The integration in IMA of
the Integrity Digest Cache is done in a separate patch set:

https://lore.kernel.org/linux-security-module/20241119110103.2780453-1-roberto.sassu@huaweicloud.com/

The integrity evaluation result is invalidated when the file is
modified, or when the digest list used to verify the file is modified
too.

For fsverity, the guarantee similarly comes from searching the fsverity
digest in a digest cache but, unlike with IMA, the integrity evaluation
result does not need to be invalidated for a file write, since
fsverity-protected files are accessible only in read-only mode. However,
the result still needs to be invalidated if the digest list changes.
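
A minimal sketch of these invalidation rules, under stated assumptions:
struct cached_verdict, verdict_store() and verdict_usable() are
hypothetical names, the file version counter is a simplification of
whatever change detection the caller uses, and the digest cache is
identified by pointer only. It does not reproduce the IMA integration
code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Opaque stand-in for a digest cache object. */
struct digest_cache {
	int unused;
};

/* Hypothetical cached integrity verdict for one file. */
struct cached_verdict {
	bool valid;
	uint64_t file_version;            /* file version when evaluated */
	const struct digest_cache *cache; /* digest cache used back then */
};

/* Remember a successful evaluation and what it was based on. */
static void verdict_store(struct cached_verdict *v, uint64_t file_version,
			  const struct digest_cache *cache)
{
	v->valid = true;
	v->file_version = file_version;
	v->cache = cache;
}

/*
 * A cached verdict is reusable only if the digest cache is still the same
 * and, for regular (non-fsverity) files, the file has not been modified.
 */
static bool verdict_usable(const struct cached_verdict *v,
			   uint64_t cur_file_version,
			   const struct digest_cache *cur_cache,
			   bool fsverity)
{
	if (!v->valid)
		return false;
	if (v->cache != cur_cache)
		return false; /* digest list changed: re-evaluate */
	if (!fsverity && v->file_version != cur_file_version)
		return false; /* file written: re-evaluate */
	return true;
}
```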

> Secondly, and in a related issue, what happens in a container
> environment when a pathname is accessed that is actually a different
> file but with the same effective pathname as a file that is in the
> vendor validated digest list?

At the moment nothing, only the file data are evaluated. Currently, the
Integrity Digest Cache does not store the pathnames associated with a
digest. It can be done as an extension, if desired, and the pathnames
can be compared.

Roberto

> Once again, apologies, if we completely misinterpret the issues
> involved.
> 
> Have a good remainder of the week.
> 
> As always,
> Dr. Greg
> 
> The Quixote Project - Flailing at the Travails of Cybersecurity
>               https://github.com/Quixote-Project

Eric Snowberg Dec. 3, 2024, 8:06 p.m. UTC | #6
> On Nov 26, 2024, at 3:41 AM, Roberto Sassu <roberto.sassu@huaweicloud.com> wrote:
> 
> On Tue, 2024-11-26 at 00:13 +0000, Eric Snowberg wrote:
>> 
>>> On Nov 19, 2024, at 3:49 AM, Roberto Sassu <roberto.sassu@huaweicloud.com> wrote:
>>> 
>>> From: Roberto Sassu <roberto.sassu@huawei.com>
>>> 
>>> The Integrity Digest Cache can also help IMA for appraisal. IMA can simply
>>> lookup the calculated digest of an accessed file in the list of digests
>>> extracted from package headers, after verifying the header signature. It is
>>> sufficient to verify only one signature for all files in the package, as
>>> opposed to verifying a signature for each file.
>> 
>> Is there a way to maintain integrity over time?  Today if a CVE is discovered 
>> in a signed program, the program hash can be added to the blacklist keyring. 
>> Later if IMA appraisal is used, the signature validation will fail just for that 
>> program.  With the Integrity Digest Cache, is there a way to do this?  
> 
> As far as I can see, the ima_check_blacklist() call is before
> ima_appraise_measurement(). If it fails, appraisal with the Integrity
> Digest Cache will not be done.


It is good the program hash would be checked beforehand and fail if it is 
contained on the list. 

The .ima keyring may contain many keys.  If one of the keys was later 
revoked and added to the .blacklist, wouldn't this be missed?  It would 
be caught during signature validation when the file is later appraised, but 
now this step isn't taking place.  Correct?

With IMA appraisal, it is easy to maintain authenticity but challenging to 
maintain integrity over time. In user-space there are constantly new CVEs.  
To maintain integrity over time, either keys need to be rotated in the .ima 
keyring or program hashes need to be frequently added to the .blacklist.   
If neither is done, for an end-user on a distro, IMA-appraisal basically 
guarantees authenticity.

While I understand the intent of the series is to increase performance, 
have you considered using this to give the end-user the ability to maintain 
integrity of their system?  What I mean is, instead of trying to import anything 
from an RPM, just have the end-user provide this information in some format 
to the Digest Cache.  User-space tools could be built to collect and format 
the data needed by the Digest Cache.  This data  may allow multiple versions 
of the same program.  The data would then be signed by one of the system 
kernel keys (either something in the secondary or machine keyring), to maintain 
a root of trust.  This would give the end-user the ability to have integrity however 
they see fit.  This leaves the distro to provide signed programs and the end-user 
the ability to decide what level of software they want to run on their system.  If 
something isn't in the Digest Cache, it gets bumped down to the traditional 
IMA-appraisal.  I think it would simplify the problem you are trying to solve, 
especially around the missing kernel PGP code required for all this to work, 
since it wouldn't be necessary.   With this approach, besides the performance 
gain, the end-user would gain the ability to maintain integrity that is enforced by
the kernel.
Roberto Sassu Dec. 4, 2024, 10:44 a.m. UTC | #7
On Tue, 2024-12-03 at 20:06 +0000, Eric Snowberg wrote:
> 
> > On Nov 26, 2024, at 3:41 AM, Roberto Sassu <roberto.sassu@huaweicloud.com> wrote:
> > 
> > On Tue, 2024-11-26 at 00:13 +0000, Eric Snowberg wrote:
> > > 
> > > > On Nov 19, 2024, at 3:49 AM, Roberto Sassu <roberto.sassu@huaweicloud.com> wrote:
> > > > 
> > > > From: Roberto Sassu <roberto.sassu@huawei.com>
> > > > 
> > > > The Integrity Digest Cache can also help IMA for appraisal. IMA can simply
> > > > lookup the calculated digest of an accessed file in the list of digests
> > > > extracted from package headers, after verifying the header signature. It is
> > > > sufficient to verify only one signature for all files in the package, as
> > > > opposed to verifying a signature for each file.
> > > 
> > > Is there a way to maintain integrity over time?  Today if a CVE is discovered 
> > > in a signed program, the program hash can be added to the blacklist keyring. 
> > > Later if IMA appraisal is used, the signature validation will fail just for that 
> > > program.  With the Integrity Digest Cache, is there a way to do this?  
> > 
> > As far as I can see, the ima_check_blacklist() call is before
> > ima_appraise_measurement(). If it fails, appraisal with the Integrity
> > Digest Cache will not be done.
> 
> 
> It is good the program hash would be checked beforehand and fail if it is 
> contained on the list. 
> 
> The .ima keyring may contain many keys.  If one of the keys was later 
> revoked and added to the .blacklist, wouldn't this be missed?  It would 
> be caught during signature validation when the file is later appraised, but 
> now this step isn't taking place.  Correct?

For files included in the digest lists, yes, a later key revocation
will not be detected. However, detection still works at the package
level/digest list level, since digest lists are still appraised with a
signature.

We can add a mechanism (if it does not already exist) to invalidate the
integrity status based on key revocation, which can be propagated to
files verified with the affected digest lists.
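
A toy model of the invalidation mechanism suggested above, for
illustration only. It is not kernel code: the class and method names
are hypothetical, and it only assumes that each digest list remembers
which key verified its signature.

```python
from dataclasses import dataclass, field


@dataclass
class DigestList:
    """Hypothetical record of one signature-verified digest list."""
    name: str
    signer_key_id: str                      # key that verified the list signature
    digests: set = field(default_factory=set)
    valid: bool = True


class ToyRevocationCache:
    """Toy stand-in for a set of digest lists, keyed by list name."""

    def __init__(self):
        self.lists = {}

    def add_list(self, digest_list):
        self.lists[digest_list.name] = digest_list

    def lookup(self, digest):
        """A file digest passes only if a still-valid list contains it."""
        return any(dl.valid and digest in dl.digests
                   for dl in self.lists.values())

    def revoke_key(self, key_id):
        """Invalidate every digest list verified with the revoked key."""
        affected = [dl.name for dl in self.lists.values()
                    if dl.signer_key_id == key_id]
        for name in affected:
            self.lists[name].valid = False
        return affected
```

In this model, revoking a key makes every file that was only covered
by the affected digest lists fail the lookup again, while files covered
by lists signed with other keys are unaffected.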

> With IMA appraisal, it is easy to maintain authenticity but challenging to 
> maintain integrity over time. In user-space there are constantly new CVEs.  
> To maintain integrity over time, either keys need to be rotated in the .ima 
> keyring or program hashes need to be frequently added to the .blacklist.   
> If neither is done, for an end-user on a distro, IMA-appraisal basically 
> guarantees authenticity.
> 
> While I understand the intent of the series is to increase performance, 
> have you considered using this to give the end-user the ability to maintain 
> integrity of their system?  What I mean is, instead of trying to import anything 
> from an RPM, just have the end-user provide this information in some format 
> to the Digest Cache.  User-space tools could be built to collect and format 

This is already possible: digest-cache-tools
(https://github.com/linux-integrity/digest-cache-tools) already allows
creating a digest list with the files a user wants.

But in this case, the user is vouching for having taken the correct
measurement of the file at the time it was added to the digest list.
With RPMs or other packages shipped with Linux distributions, this
would instead be guaranteed automatically.

To mitigate the concerns about CVEs, we can probably implement a
rollback prevention mechanism, which would not allow loading a previous
version of a digest list.

> the data needed by the Digest Cache.  This data  may allow multiple versions 
> of the same program.  The data would then be signed by one of the system 
> kernel keys (either something in the secondary or machine keyring), to maintain 
> a root of trust.  This would give the end-user the ability to have integrity however 
> they see fit.  This leaves the distro to provide signed programs and the end-user 
> the ability to decide what level of software they want to run on their system.  If 
> something isn't in the Digest Cache, it gets bumped down to the traditional 
> IMA-appraisal.  I think it would simplify the problem you are trying to solve, 

Everything you describe is already possible. Users can generate and sign
their digest lists, and enroll their key in the kernel keyring.

> especially around the missing kernel PGP code required for all this to work, 
> since it wouldn't be necessary.   With this approach, besides the performance 
> gain, the end-user would gain the ability to maintain integrity that is enforced by
> the kernel.

From what I understood, Linus would not be against the
Eric Snowberg Dec. 5, 2024, 12:57 a.m. UTC | #8
> On Dec 4, 2024, at 3:44 AM, Roberto Sassu <roberto.sassu@huaweicloud.com> wrote:
> 
> On Tue, 2024-12-03 at 20:06 +0000, Eric Snowberg wrote:
>> 
>>> On Nov 26, 2024, at 3:41 AM, Roberto Sassu <roberto.sassu@huaweicloud.com> wrote:
>>> 
>>> On Tue, 2024-11-26 at 00:13 +0000, Eric Snowberg wrote:
>>>> 
>>>>> On Nov 19, 2024, at 3:49 AM, Roberto Sassu <roberto.sassu@huaweicloud.com> wrote:
>>>>> 
>>>>> From: Roberto Sassu <roberto.sassu@huawei.com>
>>>>> 
>>>>> The Integrity Digest Cache can also help IMA for appraisal. IMA can simply
>>>>> lookup the calculated digest of an accessed file in the list of digests
>>>>> extracted from package headers, after verifying the header signature. It is
>>>>> sufficient to verify only one signature for all files in the package, as
>>>>> opposed to verifying a signature for each file.
>>>> 
>>>> Is there a way to maintain integrity over time?  Today if a CVE is discovered 
>>>> in a signed program, the program hash can be added to the blacklist keyring. 
>>>> Later if IMA appraisal is used, the signature validation will fail just for that 
>>>> program.  With the Integrity Digest Cache, is there a way to do this?  
>>> 
>>> As far as I can see, the ima_check_blacklist() call is before
>>> ima_appraise_measurement(). If it fails, appraisal with the Integrity
>>> Digest Cache will not be done.
>> 
>> 
>> It is good the program hash would be checked beforehand and fail if it is 
>> contained on the list. 
>> 
>> The .ima keyring may contain many keys.  If one of the keys was later 
>> revoked and added to the .blacklist, wouldn't this be missed?  It would 
>> be caught during signature validation when the file is later appraised, but 
>> now this step isn't taking place.  Correct?
> 
> For files included in the digest lists, yes, there won't be detection
> of later revocation of a key. However, it will still work at package
> level/digest list level, since they are still appraised with a
> signature.
> 
> We can add a mechanism (if it does not already exist) to invalidate the
> integrity status based on key revocation, which can be propagated to
> files verified with the affected digest lists.
> 
>> With IMA appraisal, it is easy to maintain authenticity but challenging to 
>> maintain integrity over time. In user-space there are constantly new CVEs.  
>> To maintain integrity over time, either keys need to be rotated in the .ima 
>> keyring or program hashes need to be frequently added to the .blacklist.   
>> If neither is done, for an end-user on a distro, IMA-appraisal basically 
>> guarantees authenticity.
>> 
>> While I understand the intent of the series is to increase performance, 
>> have you considered using this to give the end-user the ability to maintain 
>> integrity of their system?  What I mean is, instead of trying to import anything 
>> from an RPM, just have the end-user provide this information in some format 
>> to the Digest Cache.  User-space tools could be built to collect and format 
> 
> This is already possible, digest-cache-tools
> (https://github.com/linux-integrity/digest-cache-tools) already allow
> to create a digest list with the file a user wants.
> 
> But in this case, the user is vouching for having taken the correct
> measure of the file at the time it was added to the digest list. This
> would be instead automatically guaranteed by RPMs or other packages
> shipped with Linux distributions.
> 
> To mitigate the concerns of CVEs, we can probably implement a rollback
> prevention mechanism, which would not allow to load a previous version
> of a digest list.

IMHO, pursuing this with the end-user in control of what the Digest Cache
contains, rather than tying it to what a distro ships, would provide more
value. It would be very beneficial to let the end-user easily update their
Digest Cache without having to do any type of revocation for old and
vulnerable applications with CVEs.

Is there a belief the Digest Cache would be used without signed kernel 
modules?  Is the performance gain worth changing how kernel modules 
get loaded at boot?  Couldn't this part just be dropped for easier acceptance?  
Integrity is already maintained with the current model of appended signatures.
Roberto Sassu Dec. 5, 2024, 8:53 a.m. UTC | #9
On Thu, 2024-12-05 at 00:57 +0000, Eric Snowberg wrote:
> 
> > On Dec 4, 2024, at 3:44 AM, Roberto Sassu <roberto.sassu@huaweicloud.com> wrote:
> > 
> > On Tue, 2024-12-03 at 20:06 +0000, Eric Snowberg wrote:
> > > 
> > > > On Nov 26, 2024, at 3:41 AM, Roberto Sassu <roberto.sassu@huaweicloud.com> wrote:
> > > > 
> > > > On Tue, 2024-11-26 at 00:13 +0000, Eric Snowberg wrote:
> > > > > 
> > > > > > On Nov 19, 2024, at 3:49 AM, Roberto Sassu <roberto.sassu@huaweicloud.com> wrote:
> > > > > > 
> > > > > > From: Roberto Sassu <roberto.sassu@huawei.com>
> > > > > > 
> > > > > > The Integrity Digest Cache can also help IMA for appraisal. IMA can simply
> > > > > > lookup the calculated digest of an accessed file in the list of digests
> > > > > > extracted from package headers, after verifying the header signature. It is
> > > > > > sufficient to verify only one signature for all files in the package, as
> > > > > > opposed to verifying a signature for each file.
> > > > > 
> > > > > Is there a way to maintain integrity over time?  Today if a CVE is discovered 
> > > > > in a signed program, the program hash can be added to the blacklist keyring. 
> > > > > Later if IMA appraisal is used, the signature validation will fail just for that 
> > > > > program.  With the Integrity Digest Cache, is there a way to do this?  
> > > > 
> > > > As far as I can see, the ima_check_blacklist() call is before
> > > > ima_appraise_measurement(). If it fails, appraisal with the Integrity
> > > > Digest Cache will not be done.
> > > 
> > > 
> > > It is good the program hash would be checked beforehand and fail if it is 
> > > contained on the list. 
> > > 
> > > The .ima keyring may contain many keys.  If one of the keys was later 
> > > revoked and added to the .blacklist, wouldn't this be missed?  It would 
> > > be caught during signature validation when the file is later appraised, but 
> > > now this step isn't taking place.  Correct?
> > 
> > For files included in the digest lists, yes, there won't be detection
> > of later revocation of a key. However, it will still work at package
> > level/digest list level, since they are still appraised with a
> > signature.
> > 
> > We can add a mechanism (if it does not already exist) to invalidate the
> > integrity status based on key revocation, which can be propagated to
> > files verified with the affected digest lists.
> > 
> > > With IMA appraisal, it is easy to maintain authenticity but challenging to 
> > > maintain integrity over time. In user-space there are constantly new CVEs.  
> > > To maintain integrity over time, either keys need to be rotated in the .ima 
> > > keyring or program hashes need to be frequently added to the .blacklist.   
> > > If neither is done, for an end-user on a distro, IMA-appraisal basically 
> > > guarantees authenticity.
> > > 
> > > While I understand the intent of the series is to increase performance, 
> > > have you considered using this to give the end-user the ability to maintain 
> > > integrity of their system?  What I mean is, instead of trying to import anything 
> > > from an RPM, just have the end-user provide this information in some format 
> > > to the Digest Cache.  User-space tools could be built to collect and format 
> > 
> > This is already possible, digest-cache-tools
> > (https://github.com/linux-integrity/digest-cache-tools) already allow
> > to create a digest list with the file a user wants.
> > 
> > But in this case, the user is vouching for having taken the correct
> > measure of the file at the time it was added to the digest list. This
> > would be instead automatically guaranteed by RPMs or other packages
> > shipped with Linux distributions.
> > 
> > To mitigate the concerns of CVEs, we can probably implement a rollback
> > prevention mechanism, which would not allow to load a previous version
> > of a digest list.
> 
> IMHO, pursuing this with the end-user being in control of what is contained 
> within the Digest Cache vs what is contained in a distro would provide more
> value. Allowing the end-user to easily update their Digest Cache in some way 
> without having to do any type of revocation for both old and vulnerable 
> applications with CVEs would be very beneficial.

Yes, deleting the digest list invalidates any integrity result obtained
with that digest list.

I also developed an rpm plugin that synchronizes the digest lists with
installed software. Old vulnerable software can no longer be verified
with the Integrity Digest Cache, since the rpm plugin deletes the digest
lists of the old software.

https://github.com/linux-integrity/digest-cache-tools/blob/main/rpm-plugin/digest_cache.c

The good thing is that the Integrity Digest Cache can be easily
controlled with filesystem operations (it works similarly to security
blobs attached to kernel objects, like inodes and file descriptors).

As soon as something changes (e.g. a digest list is written, or the
link to the digest lists is modified), a reset is triggered in the
Integrity Digest Cache, so digest lists and files need to be verified
again. Deleting the digest list also causes the in-kernel digest cache
to be wiped (when its reference count reaches zero).
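
The reset and reference-counting behaviour described above can be
pictured with a small user-space model. This is a sketch under stated
assumptions, not the kernel implementation: all names and paths are
hypothetical, and the only properties modelled are that a change to a
digest list discards the results obtained with it, and that the cache is
freed once the last reference is dropped.

```python
class ToyDigestCache:
    """Stand-in for the in-kernel digest cache built from one digest list."""

    def __init__(self, digests):
        self.digests = frozenset(digests)
        self.refcount = 1          # reference held on behalf of the digest list
        self.freed = False

    def get(self):
        self.refcount += 1
        return self

    def put(self):
        self.refcount -= 1
        if self.refcount == 0:     # last user gone: wipe the cache
            self.digests = frozenset()
            self.freed = True


class ToyIntegrityState:
    def __init__(self):
        self.caches = {}           # digest list path -> ToyDigestCache
        self.verified = {}         # file path -> digest list path used

    def load_list(self, list_path, digests):
        self.caches[list_path] = ToyDigestCache(digests)

    def verify(self, file_path, digest, list_path):
        cache = self.caches.get(list_path)
        if cache is None or digest not in cache.digests:
            return False
        self.verified[file_path] = list_path
        return True

    def on_list_changed(self, list_path):
        """Write or unlink on a digest list: forget results, drop our reference."""
        self.verified = {f: lst for f, lst in self.verified.items()
                         if lst != list_path}
        cache = self.caches.pop(list_path, None)
        if cache is not None:
            cache.put()


state = ToyIntegrityState()
state.load_list("/lists/pkg-a", {"aa"})
ok_before = state.verify("/usr/bin/a", "aa", "/lists/pkg-a")
in_use = state.caches["/lists/pkg-a"].get()   # a lookup still in progress
state.on_list_changed("/lists/pkg-a")         # e.g. the list was deleted
ok_after = state.verify("/usr/bin/a", "aa", "/lists/pkg-a")
freed_while_in_use = in_use.freed
in_use.put()                                  # last reference dropped
```

In this model the file must be verified again after the change, and the
cache contents survive only as long as some user still holds a
reference.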

> Is there a belief the Digest Cache would be used without signed kernel 
> modules?  Is the performance gain worth changing how kernel modules 
> get loaded at boot?  Couldn't this part just be dropped for easier acceptance?  
> Integrity is already maintained with the current model of appended signatures. 

I don't like making exceptions in the design, and I recently realized
that it should not be the task of the users of the Integrity Digest
Cache to limit themselves.

But the main problem was not the kernel modules themselves, but the
infrastructure needed in user space to load them, which might not be
available at the time a digest list parser needs to be loaded.

I hope ksys_finit_module() does not cause too much resistance (although
I need to document it better, as others noted). It is just a different
way to pass the same parameters as the finit_module() system call.

Thanks

Roberto
Roberto Sassu Dec. 5, 2024, 4:16 p.m. UTC | #10
On Thu, 2024-12-05 at 09:53 +0100, Roberto Sassu wrote:
> On Thu, 2024-12-05 at 00:57 +0000, Eric Snowberg wrote:
> > 
> > > On Dec 4, 2024, at 3:44 AM, Roberto Sassu <roberto.sassu@huaweicloud.com> wrote:
> > > 
> > > On Tue, 2024-12-03 at 20:06 +0000, Eric Snowberg wrote:
> > > > 
> > > > > On Nov 26, 2024, at 3:41 AM, Roberto Sassu <roberto.sassu@huaweicloud.com> wrote:
> > > > > 
> > > > > On Tue, 2024-11-26 at 00:13 +0000, Eric Snowberg wrote:
> > > > > > 
> > > > > > > On Nov 19, 2024, at 3:49 AM, Roberto Sassu <roberto.sassu@huaweicloud.com> wrote:
> > > > > > > 
> > > > > > > From: Roberto Sassu <roberto.sassu@huawei.com>
> > > > > > > 
> > > > > > > The Integrity Digest Cache can also help IMA for appraisal. IMA can simply
> > > > > > > lookup the calculated digest of an accessed file in the list of digests
> > > > > > > extracted from package headers, after verifying the header signature. It is
> > > > > > > sufficient to verify only one signature for all files in the package, as
> > > > > > > opposed to verifying a signature for each file.
> > > > > > 
> > > > > > Is there a way to maintain integrity over time?  Today if a CVE is discovered 
> > > > > > in a signed program, the program hash can be added to the blacklist keyring. 
> > > > > > Later if IMA appraisal is used, the signature validation will fail just for that 
> > > > > > program.  With the Integrity Digest Cache, is there a way to do this?  
> > > > > 
> > > > > As far as I can see, the ima_check_blacklist() call is before
> > > > > ima_appraise_measurement(). If it fails, appraisal with the Integrity
> > > > > Digest Cache will not be done.
> > > > 
> > > > 
> > > > It is good the program hash would be checked beforehand and fail if it is 
> > > > contained on the list. 
> > > > 
> > > > The .ima keyring may contain many keys.  If one of the keys was later 
> > > > revoked and added to the .blacklist, wouldn't this be missed?  It would 
> > > > be caught during signature validation when the file is later appraised, but 
> > > > now this step isn't taking place.  Correct?
> > > 
> > > For files included in the digest lists, yes, there won't be detection
> > > of later revocation of a key. However, it will still work at package
> > > level/digest list level, since they are still appraised with a
> > > signature.
> > > 
> > > We can add a mechanism (if it does not already exist) to invalidate the
> > > integrity status based on key revocation, which can be propagated to
> > > files verified with the affected digest lists.
> > > 
> > > > With IMA appraisal, it is easy to maintain authenticity but challenging to 
> > > > maintain integrity over time. In user-space there are constantly new CVEs.  
> > > > To maintain integrity over time, either keys need to be rotated in the .ima 
> > > > keyring or program hashes need to be frequently added to the .blacklist.   
> > > > If neither is done, for an end-user on a distro, IMA-appraisal basically 
> > > > guarantees authenticity.
> > > > 
> > > > While I understand the intent of the series is to increase performance, 
> > > > have you considered using this to give the end-user the ability to maintain 
> > > > integrity of their system?  What I mean is, instead of trying to import anything 
> > > > from an RPM, just have the end-user provide this information in some format 
> > > > to the Digest Cache.  User-space tools could be built to collect and format 
> > > 
> > > This is already possible, digest-cache-tools
> > > (https://github.com/linux-integrity/digest-cache-tools) already allow
> > > to create a digest list with the file a user wants.
> > > 
> > > But in this case, the user is vouching for having taken the correct
> > > measure of the file at the time it was added to the digest list. This
> > > would be instead automatically guaranteed by RPMs or other packages
> > > shipped with Linux distributions.
> > > 
> > > To mitigate the concerns of CVEs, we can probably implement a rollback
> > > prevention mechanism, which would not allow to load a previous version
> > > of a digest list.
> > 
> > IMHO, pursuing this with the end-user being in control of what is contained 
> > within the Digest Cache vs what is contained in a distro would provide more
> > value. Allowing the end-user to easily update their Digest Cache in some way 
> > without having to do any type of revocation for both old and vulnerable 
> > applications with CVEs would be very beneficial.
> 
> Yes, deleting the digest list would invalidate any integrity result
> done with that digest list.
> 
> I developed also an rpm plugin that synchronizes the digest lists with
> installed software. Old vulnerable software cannot be verified anymore
> with the Integrity Digest Cache, since the rpm plugin deletes the old
> software digest lists.
> 
> https://github.com/linux-integrity/digest-cache-tools/blob/main/rpm-plugin/digest_cache.c
> 
> The good thing is that the Integrity Digest Cache can be easily
> controlled with filesystem operations (it works similarly to security
> blobs attached to kernel objects, like inodes and file descriptors).
> 
> As soon as something changes (e.g. digest list written, link to the
> digest lists), this triggers a reset in the Integrity Digest Cache, so
> digest lists and files need to be verified again. Deleting the digest
> list causes the in-kernel digest cache to be wiped away too (when the
> reference count reaches zero).
> 
> > Is there a belief the Digest Cache would be used without signed kernel 
> > modules?  Is the performance gain worth changing how kernel modules 
> > get loaded at boot?  Couldn't this part just be dropped for easier acceptance?  
> > Integrity is already maintained with the current model of appended signatures. 
> 
> I don't like making exceptions in the design, and I recently realized
> that it should not be task of the users of the Integrity Digest Cache
> to limit themselves.

I forgot to mention that your use case is possible. The usage of the
Integrity Digest Cache must be explicitly enabled in the IMA policy. It
will be used if the matching rule has 'digest_cache=data' (it is
foreseen to be used for metadata as well).

For kernel modules, it is sufficient to not provide that keyword for
the MODULE_CHECK hook.

However, you may lose another advantage of the Integrity Digest Cache:
the predictability of the IMA PCR. Without digest caches, there is the
risk that the IMA PCR will be unstable, because kernel modules may be
loaded in a different order at each boot.
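
The order dependence comes from how a PCR is extended: the new value is
a hash of the old value concatenated with the new digest. The sketch
below illustrates this with SHA-256 and two made-up module digests. It
is a simplification (IMA extends the PCR with template digests, not raw
file digests), and the helper names are hypothetical.

```python
import hashlib


def pcr_extend(pcr: bytes, digest: bytes) -> bytes:
    """TPM-style extend: new PCR = H(old PCR || digest)."""
    return hashlib.sha256(pcr + digest).digest()


def replay(digests):
    """Extend a zeroed PCR with each digest, in the order given."""
    pcr = bytes(32)
    for digest in digests:
        pcr = pcr_extend(pcr, digest)
    return pcr


def replay_with_digest_list(list_digest, known, accessed):
    """Measure the digest list once, then only files not found in it."""
    pcr = pcr_extend(bytes(32), list_digest)
    for digest in accessed:
        if digest not in known:
            pcr = pcr_extend(pcr, digest)
    return pcr


mod_a = hashlib.sha256(b"module a").digest()
mod_b = hashlib.sha256(b"module b").digest()
list_digest = hashlib.sha256(b"digest list covering a and b").digest()
known = {mod_a, mod_b}

# Per-file measurement: the final PCR depends on the load order.
boot1 = replay([mod_a, mod_b])
boot2 = replay([mod_b, mod_a])

# Digest list measurement: known modules are not extended individually,
# so the load order no longer affects the final PCR.
boot1_dl = replay_with_digest_list(list_digest, known, [mod_a, mod_b])
boot2_dl = replay_with_digest_list(list_digest, known, [mod_b, mod_a])
```

Here boot1 and boot2 differ, while boot1_dl and boot2_dl are equal.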

Roberto


> But the main problem was not the kernel modules themselves, but the
> infrastructure needed in user space to load them, which might not be
> available at the time a digest list parser needs to be loaded.
> 
> I hope ksys_finit_module() does not cause too much resistance (however
> I need to document it better, as others noted). It is just a different
> way to pass the same parameters of the finit_module() system call.
> 
> Thanks
> 
> Roberto
Eric Snowberg Dec. 5, 2024, 7:41 p.m. UTC | #11
> On Dec 5, 2024, at 9:16 AM, Roberto Sassu <roberto.sassu@huaweicloud.com> wrote:
> 
> On Thu, 2024-12-05 at 09:53 +0100, Roberto Sassu wrote:
>> On Thu, 2024-12-05 at 00:57 +0000, Eric Snowberg wrote:
>>> 
>>>> On Dec 4, 2024, at 3:44 AM, Roberto Sassu <roberto.sassu@huaweicloud.com> wrote:
>>>> 
>>>> On Tue, 2024-12-03 at 20:06 +0000, Eric Snowberg wrote:
>>>>> 
>>>>>> On Nov 26, 2024, at 3:41 AM, Roberto Sassu <roberto.sassu@huaweicloud.com> wrote:
>>>>>> 
>>>>>> On Tue, 2024-11-26 at 00:13 +0000, Eric Snowberg wrote:
>>>>>>> 
>>>>>>>> On Nov 19, 2024, at 3:49 AM, Roberto Sassu <roberto.sassu@huaweicloud.com> wrote:
>>>>>>>> 
>>>>>>>> From: Roberto Sassu <roberto.sassu@huawei.com>
>>>>>>>> 
>>>>>>>> The Integrity Digest Cache can also help IMA for appraisal. IMA can simply
>>>>>>>> lookup the calculated digest of an accessed file in the list of digests
>>>>>>>> extracted from package headers, after verifying the header signature. It is
>>>>>>>> sufficient to verify only one signature for all files in the package, as
>>>>>>>> opposed to verifying a signature for each file.
>>>>>>> 
>>>>>>> Is there a way to maintain integrity over time?  Today if a CVE is discovered 
>>>>>>> in a signed program, the program hash can be added to the blacklist keyring. 
>>>>>>> Later if IMA appraisal is used, the signature validation will fail just for that 
>>>>>>> program.  With the Integrity Digest Cache, is there a way to do this?  
>>>>>> 
>>>>>> As far as I can see, the ima_check_blacklist() call is before
>>>>>> ima_appraise_measurement(). If it fails, appraisal with the Integrity
>>>>>> Digest Cache will not be done.
>>>>> 
>>>>> 
>>>>> It is good the program hash would be checked beforehand and fail if it is 
>>>>> contained on the list. 
>>>>> 
>>>>> The .ima keyring may contain many keys.  If one of the keys was later 
>>>>> revoked and added to the .blacklist, wouldn't this be missed?  It would 
>>>>> be caught during signature validation when the file is later appraised, but 
>>>>> now this step isn't taking place.  Correct?
>>>> 
>>>> For files included in the digest lists, yes, there won't be detection
>>>> of later revocation of a key. However, it will still work at package
>>>> level/digest list level, since they are still appraised with a
>>>> signature.
>>>> 
>>>> We can add a mechanism (if it does not already exist) to invalidate the
>>>> integrity status based on key revocation, which can be propagated to
>>>> files verified with the affected digest lists.
>>>> 
>>>>> With IMA appraisal, it is easy to maintain authenticity but challenging to 
>>>>> maintain integrity over time. In user-space there are constantly new CVEs.  
>>>>> To maintain integrity over time, either keys need to be rotated in the .ima 
>>>>> keyring or program hashes need to be frequently added to the .blacklist.   
>>>>> If neither is done, for an end-user on a distro, IMA-appraisal basically 
>>>>> guarantees authenticity.
>>>>> 
>>>>> While I understand the intent of the series is to increase performance, 
>>>>> have you considered using this to give the end-user the ability to maintain 
>>>>> integrity of their system?  What I mean is, instead of trying to import anything 
>>>>> from an RPM, just have the end-user provide this information in some format 
>>>>> to the Digest Cache.  User-space tools could be built to collect and format 
>>>> 
>>>> This is already possible, digest-cache-tools
>>>> (https://github.com/linux-integrity/digest-cache-tools) already allow
>>>> to create a digest list with the file a user wants.
>>>> 
>>>> But in this case, the user is vouching for having taken the correct
>>>> measure of the file at the time it was added to the digest list. This
>>>> would be instead automatically guaranteed by RPMs or other packages
>>>> shipped with Linux distributions.
>>>> 
>>>> To mitigate the concerns of CVEs, we can probably implement a rollback
>>>> prevention mechanism, which would not allow to load a previous version
>>>> of a digest list.
>>> 
>>> IMHO, pursuing this with the end-user being in control of what is contained 
>>> within the Digest Cache vs what is contained in a distro would provide more
>>> value. Allowing the end-user to easily update their Digest Cache in some way 
>>> without having to do any type of revocation for both old and vulnerable 
>>> applications with CVEs would be very beneficial.
>> 
>> Yes, deleting the digest list would invalidate any integrity result
>> done with that digest list.
>> 
>> I developed also an rpm plugin that synchronizes the digest lists with
>> installed software. Old vulnerable software cannot be verified anymore
>> with the Integrity Digest Cache, since the rpm plugin deletes the old
>> software digest lists.
>> 
>> https://github.com/linux-integrity/digest-cache-tools/blob/main/rpm-plugin/digest_cache.c
>> 
>> The good thing is that the Integrity Digest Cache can be easily
>> controlled with filesystem operations (it works similarly to security
>> blobs attached to kernel objects, like inodes and file descriptors).
>> 
>> As soon as something changes (e.g. digest list written, link to the
>> digest lists), this triggers a reset in the Integrity Digest Cache, so
>> digest lists and files need to be verified again. Deleting the digest
>> list causes the in-kernel digest cache to be wiped away too (when the
>> reference count reaches zero).
>> 
>>> Is there a belief the Digest Cache would be used without signed kernel 
>>> modules?  Is the performance gain worth changing how kernel modules 
>>> get loaded at boot?  Couldn't this part just be dropped for easier acceptance?  
>>> Integrity is already maintained with the current model of appended signatures. 
>> 
>> I don't like making exceptions in the design, and I recently realized
>> that it should not be task of the users of the Integrity Digest Cache
>> to limit themselves.
> 
> Forgot to mention that your use case is possible. The usage of the
> Integrity Digest Cache must be explicitly enabled in the IMA policy. It
> will be used if the matching rule has 'digest_cache=data' (its foreseen
> to be used also for metadata).

I see a lot of benefit if metadata integrity could be maintained, but in the
current form of this series, I don't think that is possible.  The Digest Cache
doesn't contain or enforce the file path, which would be necessary to
maintain integrity.  Here is an example of why it would be needed.  Say
you have two applications that each need a configuration file to start.  The
first application has an empty file, where no configuration options are
currently defined.  Now there is a hash for an empty file in the Digest Cache.
The second application can also be started with an empty configuration file,
but the end-user has added some options to it.  If the configuration file for
the second application is replaced with an empty file, it will not be detected,
since the empty-file hash is already in the Digest Cache.
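
A minimal sketch of the scenario above, assuming the Digest Cache can be
modelled as a plain set of digests with no path information. The file
paths and helper names are hypothetical, and the path-bound variant is
shown only for contrast; it is not something the series implements.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


# Path-less model: the first application's empty config file puts the
# empty-file digest into the cache.
digest_cache = {sha256_hex(b"")}


def appraise(content: bytes) -> bool:
    """Pass if the content digest is in the cache, whatever the path."""
    return sha256_hex(content) in digest_cache


# Contrast: a hypothetical cache keyed by (path, digest).
path_bound_cache = {("/etc/app1.conf", sha256_hex(b""))}


def appraise_with_path(path: str, content: bytes) -> bool:
    return (path, sha256_hex(content)) in path_bound_cache


# The second application's config is replaced with an empty file.
replaced_passes = appraise(b"")
replaced_passes_with_path = appraise_with_path("/etc/app2.conf", b"")
```

In the path-less model the replaced file passes (replaced_passes is
True); with the digest bound to a path, the same replacement is rejected
(replaced_passes_with_path is False).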

> For kernel modules, it is sufficient to not provide that keyword for
> the MODULE_CHECK hook.
> 
> However, there is the possibility that you lose another advantage of
> the Integrity Digest Cache, the predictability of the IMA PCR. By not
> using digest caches, there is the risk that the IMA PCR will be
> unstable, due to loading kernel modules in a different order at each
> boot.

Understood.  My recommendation was aimed at narrowing the series, to help
get something like this adopted more quickly.
Roberto Sassu Dec. 6, 2024, 10:06 a.m. UTC | #12
On Thu, 2024-12-05 at 19:41 +0000, Eric Snowberg wrote:
> 
> > On Dec 5, 2024, at 9:16 AM, Roberto Sassu <roberto.sassu@huaweicloud.com> wrote:
> > 
> > On Thu, 2024-12-05 at 09:53 +0100, Roberto Sassu wrote:
> > > On Thu, 2024-12-05 at 00:57 +0000, Eric Snowberg wrote:
> > > > 
> > > > > On Dec 4, 2024, at 3:44 AM, Roberto Sassu <roberto.sassu@huaweicloud.com> wrote:
> > > > > 
> > > > > On Tue, 2024-12-03 at 20:06 +0000, Eric Snowberg wrote:
> > > > > > 
> > > > > > > On Nov 26, 2024, at 3:41 AM, Roberto Sassu <roberto.sassu@huaweicloud.com> wrote:
> > > > > > > 
> > > > > > > On Tue, 2024-11-26 at 00:13 +0000, Eric Snowberg wrote:
> > > > > > > > 
> > > > > > > > > On Nov 19, 2024, at 3:49 AM, Roberto Sassu <roberto.sassu@huaweicloud.com> wrote:
> > > > > > > > > 
> > > > > > > > > From: Roberto Sassu <roberto.sassu@huawei.com>
> > > > > > > > > 
> > > > > > > > > The Integrity Digest Cache can also help IMA for appraisal. IMA can simply
> > > > > > > > > lookup the calculated digest of an accessed file in the list of digests
> > > > > > > > > extracted from package headers, after verifying the header signature. It is
> > > > > > > > > sufficient to verify only one signature for all files in the package, as
> > > > > > > > > opposed to verifying a signature for each file.
> > > > > > > > 
> > > > > > > > Is there a way to maintain integrity over time?  Today if a CVE is discovered 
> > > > > > > > in a signed program, the program hash can be added to the blacklist keyring. 
> > > > > > > > Later if IMA appraisal is used, the signature validation will fail just for that 
> > > > > > > > program.  With the Integrity Digest Cache, is there a way to do this?  
> > > > > > > 
> > > > > > > As far as I can see, the ima_check_blacklist() call is before
> > > > > > > ima_appraise_measurement(). If it fails, appraisal with the Integrity
> > > > > > > Digest Cache will not be done.
> > > > > > 
> > > > > > 
> > > > > > It is good the program hash would be checked beforehand and fail if it is 
> > > > > > contained on the list. 
> > > > > > 
> > > > > > The .ima keyring may contain many keys.  If one of the keys was later 
> > > > > > revoked and added to the .blacklist, wouldn't this be missed?  It would 
> > > > > > be caught during signature validation when the file is later appraised, but 
> > > > > > now this step isn't taking place.  Correct?
> > > > > 
> > > > > For files included in the digest lists, yes, there won't be detection
> > > > > of later revocation of a key. However, it will still work at package
> > > > > level/digest list level, since they are still appraised with a
> > > > > signature.
> > > > > 
> > > > > We can add a mechanism (if it does not already exist) to invalidate the
> > > > > integrity status based on key revocation, which can be propagated to
> > > > > files verified with the affected digest lists.
> > > > > 
> > > > > > With IMA appraisal, it is easy to maintain authenticity but challenging to 
> > > > > > maintain integrity over time. In user-space there are constantly new CVEs.  
> > > > > > To maintain integrity over time, either keys need to be rotated in the .ima 
> > > > > > keyring or program hashes need to be frequently added to the .blacklist.   
> > > > > > If neither is done, for an end-user on a distro, IMA-appraisal basically 
> > > > > > guarantees authenticity.
> > > > > > 
> > > > > > While I understand the intent of the series is to increase performance, 
> > > > > > have you considered using this to give the end-user the ability to maintain 
> > > > > > integrity of their system?  What I mean is, instead of trying to import anything 
> > > > > > from an RPM, just have the end-user provide this information in some format 
> > > > > > to the Digest Cache.  User-space tools could be built to collect and format 
> > > > > 
> > > > > This is already possible, digest-cache-tools
> > > > > (https://github.com/linux-integrity/digest-cache-tools) already allow
> > > > > to create a digest list with the file a user wants.
> > > > > 
> > > > > But in this case, the user is vouching for having taken the correct
> > > > > measure of the file at the time it was added to the digest list. This
> > > > > would be instead automatically guaranteed by RPMs or other packages
> > > > > shipped with Linux distributions.
> > > > > 
> > > > > To mitigate the concerns of CVEs, we can probably implement a rollback
> > > > > prevention mechanism, which would not allow to load a previous version
> > > > > of a digest list.
> > > > 
> > > > IMHO, pursuing this with the end-user being in control of what is contained 
> > > > within the Digest Cache vs what is contained in a distro would provide more
> > > > value. Allowing the end-user to easily update their Digest Cache in some way 
> > > > without having to do any type of revocation for both old and vulnerable 
> > > > applications with CVEs would be very beneficial.
> > > 
> > > Yes, deleting the digest list would invalidate any integrity result
> > > done with that digest list.
> > > 
> > > I developed also an rpm plugin that synchronizes the digest lists with
> > > installed software. Old vulnerable software cannot be verified anymore
> > > with the Integrity Digest Cache, since the rpm plugin deletes the old
> > > software digest lists.
> > > 
> > > https://github.com/linux-integrity/digest-cache-tools/blob/main/rpm-plugin/digest_cache.c
> > > 
> > > The good thing is that the Integrity Digest Cache can be easily
> > > controlled with filesystem operations (it works similarly to security
> > > blobs attached to kernel objects, like inodes and file descriptors).
> > > 
> > > As soon as something changes (e.g. digest list written, link to the
> > > digest lists), this triggers a reset in the Integrity Digest Cache, so
> > > digest lists and files need to be verified again. Deleting the digest
> > > list causes the in-kernel digest cache to be wiped away too (when the
> > > reference count reaches zero).
> > > 
> > > > Is there a belief the Digest Cache would be used without signed kernel 
> > > > modules?  Is the performance gain worth changing how kernel modules 
> > > > get loaded at boot?  Couldn't this part just be dropped for easier acceptance?  
> > > > Integrity is already maintained with the current model of appended signatures. 
> > > 
> > > I don't like making exceptions in the design, and I recently realized
> > > that it should not be task of the users of the Integrity Digest Cache
> > > to limit themselves.
> > 
> > Forgot to mention that your use case is possible. The usage of the
> > Integrity Digest Cache must be explicitly enabled in the IMA policy. It
> > will be used if the matching rule has 'digest_cache=data' (its foreseen
> > to be used also for metadata).
> 
> I see a lot of benefit if metadata integrity could be maintained, but in the 
> current form of this series, I don't think that is possible.  The Digest Cache 
> doesn't contain or enforce the file path, which would be necessary to 
> maintain integrity.  Here is an example of why it would be needed, say 
> you have two applications that need a configuration file to start.  The first 
> application has an empty file where no configuration options are currently 
> defined. Now there is a hash for an empty file in the Digest Cache.  The 
> second application can be started with an empty configuration file, however 
> the end-user has added some options to it.  If the configuration file for the 
> second application is replaced with an empty file, it will not be detected, 
> since the Digest Cache would see the empty file hash in its cache.

I was thinking more of storing digests of metadata in the digest cache
(including, for example, the expected SELinux label), which EVM can
look up.

In that way, the problem you foresee cannot happen: if you replace the
file belonging to app2_t with the one belonging to app1_t, SELinux
would deny the permission to access; if you change the SELinux label of
the file, EVM will deny the access.
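
A toy model of this idea, as a sketch only. The set of hashed fields, the encoding, and the names `metadata_digest`, `app1_conf_t` and `app2_conf_t` are assumptions made for illustration; they do not reflect what EVM actually hashes or what the series implements.

```python
import hashlib

EMPTY = hashlib.sha256(b"").hexdigest()


def metadata_digest(label: str, mode: int, uid: int, content_digest: str) -> str:
    """Digest over a hypothetical canonical encoding of file metadata."""
    record = "\0".join([label, oct(mode), str(uid), content_digest])
    return hashlib.sha256(record.encode()).hexdigest()


# Digest cache holding the initial (packaged) state of two config files.
app2_initial = hashlib.sha256(b"# defaults\n").hexdigest()
digest_cache = {
    metadata_digest("app1_conf_t", 0o644, 0, EMPTY),         # app1: empty config
    metadata_digest("app2_conf_t", 0o644, 0, app2_initial),  # app2: packaged config
}


def evm_lookup(label: str, mode: int, uid: int, content_digest: str) -> bool:
    """True if the file's current metadata digest is in the digest cache."""
    return metadata_digest(label, mode, uid, content_digest) in digest_cache


# app2's config replaced by an empty file that carries app2's label:
# this metadata digest was never recorded, so the lookup fails.
relabeled_ok = evm_lookup("app2_conf_t", 0o644, 0, EMPTY)

# The same empty file keeping app1's label: the lookup succeeds, and it is
# up to the SELinux policy to stop app2 from reading an app1_conf_t file.
copied_ok = evm_lookup("app1_conf_t", 0o644, 0, EMPTY)
```

In this model the protection depends on the two files having distinct labels; with identical labels the two cases collapse into one.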

You can still go back to the initial state; to prevent that, a rollback
prevention mechanism is needed (maybe EVM can remove the digest of the
initial state from the digest cache when it sees an update?).

In general, the Integrity Digest Cache should be considered as an
alternative mechanism to validate immutable files, or the initial state
of mutable files. For mutable files, EVM HMAC will protect further
updates.
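
A rough sketch of that split between initial state and later updates, under stated assumptions: the key, the `MutableFile` class and the `legitimate_update` helper are hypothetical, and dropping the initial digest on update is only the rollback-prevention idea floated above, not something the series does today.

```python
import hashlib
import hmac

KEY = b"hypothetical-evm-key"  # stand-in for the EVM HMAC key


def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def hmac_of(data: bytes) -> str:
    return hmac.new(KEY, data, hashlib.sha256).hexdigest()


class MutableFile:
    def __init__(self, content: bytes):
        self.content = content
        self.stored_hmac = None  # models an HMAC xattr; unset for a fresh file


digest_cache = {digest(b"initial\n")}  # initial state shipped by the package


def verify(f: MutableFile) -> bool:
    """Use the HMAC if present, otherwise fall back to the digest cache."""
    if f.stored_hmac is not None:
        return hmac.compare_digest(f.stored_hmac, hmac_of(f.content))
    return digest(f.content) in digest_cache


def legitimate_update(f: MutableFile, new_content: bytes) -> None:
    """Update through the running system: re-HMAC and retire the old digest."""
    old = digest(f.content)
    f.content = new_content
    f.stored_hmac = hmac_of(new_content)
    digest_cache.discard(old)  # assumed rollback prevention


conf = MutableFile(b"initial\n")
initial_ok = verify(conf)            # validated via the digest cache
legitimate_update(conf, b"custom\n")
updated_ok = verify(conf)            # validated via the HMAC

# Offline rollback: the packaged content is restored and the HMAC is stripped.
rolled_back = MutableFile(b"initial\n")
rollback_ok = verify(rolled_back)    # rejected, the initial digest is gone
```

A real design would have to handle several files sharing one digest, which this toy ignores.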

Roberto

> > For kernel modules, it is sufficient to not provide that keyword for
> > the MODULE_CHECK hook.
> > 
> > However, there is the possibility that you lose another advantage of
> > the Integrity Digest Cache, the predictability of the IMA PCR. By not
> > using digest caches, there is the risk that the IMA PCR will be
> > unstable, due to loading kernel modules in a different order at each
> > boot.
> 
> Understood, my recommendation was based on trying to narrow the series 
> to help try to get something like this adopted quicker.
>
Eric Snowberg Dec. 6, 2024, 3:15 p.m. UTC | #13
> On Dec 6, 2024, at 3:06 AM, Roberto Sassu <roberto.sassu@huaweicloud.com> wrote:
> 
> On Thu, 2024-12-05 at 19:41 +0000, Eric Snowberg wrote:
>> 
>>> On Dec 5, 2024, at 9:16 AM, Roberto Sassu <roberto.sassu@huaweicloud.com> wrote:
>>> 
>>> On Thu, 2024-12-05 at 09:53 +0100, Roberto Sassu wrote:
>>>> On Thu, 2024-12-05 at 00:57 +0000, Eric Snowberg wrote:
>>>>> 
>>>>>> On Dec 4, 2024, at 3:44 AM, Roberto Sassu <roberto.sassu@huaweicloud.com> wrote:
>>>>>> 
>>>>>> On Tue, 2024-12-03 at 20:06 +0000, Eric Snowberg wrote:
>>>>>>> 
>>>>>>>> On Nov 26, 2024, at 3:41 AM, Roberto Sassu <roberto.sassu@huaweicloud.com> wrote:
>>>>>>>> 
>>>>>>>> On Tue, 2024-11-26 at 00:13 +0000, Eric Snowberg wrote:
>>>>>>>>> 
>>>>>>>>>> On Nov 19, 2024, at 3:49 AM, Roberto Sassu <roberto.sassu@huaweicloud.com> wrote:
>>>>>>>>>> 
>>>>>>>>>> From: Roberto Sassu <roberto.sassu@huawei.com>
>>>>>>>>>> 
>>>>>>>>>> The Integrity Digest Cache can also help IMA for appraisal. IMA can simply
>>>>>>>>>> lookup the calculated digest of an accessed file in the list of digests
>>>>>>>>>> extracted from package headers, after verifying the header signature. It is
>>>>>>>>>> sufficient to verify only one signature for all files in the package, as
>>>>>>>>>> opposed to verifying a signature for each file.
>>>>>>>>> 
>>>>>>>>> Is there a way to maintain integrity over time?  Today if a CVE is discovered 
>>>>>>>>> in a signed program, the program hash can be added to the blacklist keyring. 
>>>>>>>>> Later if IMA appraisal is used, the signature validation will fail just for that 
>>>>>>>>> program.  With the Integrity Digest Cache, is there a way to do this?  
>>>>>>>> 
>>>>>>>> As far as I can see, the ima_check_blacklist() call is before
>>>>>>>> ima_appraise_measurement(). If it fails, appraisal with the Integrity
>>>>>>>> Digest Cache will not be done.
>>>>>>> 
>>>>>>> 
>>>>>>> It is good the program hash would be checked beforehand and fail if it is 
>>>>>>> contained on the list. 
>>>>>>> 
>>>>>>> The .ima keyring may contain many keys.  If one of the keys was later 
>>>>>>> revoked and added to the .blacklist, wouldn't this be missed?  It would 
>>>>>>> be caught during signature validation when the file is later appraised, but 
>>>>>>> now this step isn't taking place.  Correct?
>>>>>> 
>>>>>> For files included in the digest lists, yes, there won't be detection
>>>>>> of later revocation of a key. However, it will still work at package
>>>>>> level/digest list level, since they are still appraised with a
>>>>>> signature.
>>>>>> 
>>>>>> We can add a mechanism (if it does not already exist) to invalidate the
>>>>>> integrity status based on key revocation, which can be propagated to
>>>>>> files verified with the affected digest lists.
>>>>>> 
>>>>>>> With IMA appraisal, it is easy to maintain authenticity but challenging to 
>>>>>>> maintain integrity over time. In user-space there are constantly new CVEs.  
>>>>>>> To maintain integrity over time, either keys need to be rotated in the .ima 
>>>>>>> keyring or program hashes need to be frequently added to the .blacklist.   
>>>>>>> If neither is done, for an end-user on a distro, IMA-appraisal basically 
>>>>>>> guarantees authenticity.
>>>>>>> 
>>>>>>> While I understand the intent of the series is to increase performance, 
>>>>>>> have you considered using this to give the end-user the ability to maintain 
>>>>>>> integrity of their system?  What I mean is, instead of trying to import anything 
>>>>>>> from an RPM, just have the end-user provide this information in some format 
>>>>>>> to the Digest Cache.  User-space tools could be built to collect and format 
>>>>>> 
>>>>>> This is already possible, digest-cache-tools
>>>>>> (https://github.com/linux-integrity/digest-cache-tools) already allow
>>>>>> to create a digest list with the file a user wants.
>>>>>> 
>>>>>> But in this case, the user is vouching for having taken the correct
>>>>>> measure of the file at the time it was added to the digest list. This
>>>>>> would be instead automatically guaranteed by RPMs or other packages
>>>>>> shipped with Linux distributions.
>>>>>> 
>>>>>> To mitigate the concerns of CVEs, we can probably implement a rollback
>>>>>> prevention mechanism, which would not allow to load a previous version
>>>>>> of a digest list.
>>>>> 
>>>>> IMHO, pursuing this with the end-user being in control of what is contained 
>>>>> within the Digest Cache vs what is contained in a distro would provide more
>>>>> value. Allowing the end-user to easily update their Digest Cache in some way 
>>>>> without having to do any type of revocation for both old and vulnerable 
>>>>> applications with CVEs would be very beneficial.
>>>> 
>>>> Yes, deleting the digest list would invalidate any integrity result
>>>> done with that digest list.
>>>> 
>>>> I developed also an rpm plugin that synchronizes the digest lists with
>>>> installed software. Old vulnerable software cannot be verified anymore
>>>> with the Integrity Digest Cache, since the rpm plugin deletes the old
>>>> software digest lists.
>>>> 
>>>> https://github.com/linux-integrity/digest-cache-tools/blob/main/rpm-plugin/digest_cache.c
>>>> 
>>>> The good thing is that the Integrity Digest Cache can be easily
>>>> controlled with filesystem operations (it works similarly to security
>>>> blobs attached to kernel objects, like inodes and file descriptors).
>>>> 
>>>> As soon as something changes (e.g. digest list written, link to the
>>>> digest lists), this triggers a reset in the Integrity Digest Cache, so
>>>> digest lists and files need to be verified again. Deleting the digest
>>>> list causes the in-kernel digest cache to be wiped away too (when the
>>>> reference count reaches zero).
>>>> 
>>>>> Is there a belief the Digest Cache would be used without signed kernel 
>>>>> modules?  Is the performance gain worth changing how kernel modules 
>>>>> get loaded at boot?  Couldn't this part just be dropped for easier acceptance?  
>>>>> Integrity is already maintained with the current model of appended signatures. 
>>>> 
>>>> I don't like making exceptions in the design, and I recently realized
>>>> that it should not be task of the users of the Integrity Digest Cache
>>>> to limit themselves.
>>> 
>>> Forgot to mention that your use case is possible. The usage of the
>>> Integrity Digest Cache must be explicitly enabled in the IMA policy. It
>>> will be used if the matching rule has 'digest_cache=data' (its foreseen
>>> to be used also for metadata).
>> 
>> I see a lot of benefit if metadata integrity could be maintained, but in the 
>> current form of this series, I don't think that is possible.  The Digest Cache 
>> doesn't contain or enforce the file path, which would be necessary to 
>> maintain integrity.  Here is an example of why it would be needed, say 
>> you have two applications that need a configuration file to start.  The first 
>> application has an empty file where no configuration options are currently 
>> defined. Now there is a hash for an empty file in the Digest Cache.  The 
>> second application can be started with an empty configuration file, however 
>> the end-user has added some options to it.  If the configuration file for the 
>> second application is replaced with an empty file, it will not be detected, 
>> since the Digest Cache would see the empty file hash in its cache.
> 
> I was thinking more of storing digests of metadata in the digest cache
> (including, for example, the expected SELinux label), which EVM can
> look up.
> 
> In that way, the problem you foresee cannot happen: if you replace the
> file belonging to app2_t with the one belonging to app1_t, SELinux
> would deny the permission to access; if you change the SELinux label of
> the file, EVM will deny the access.

If two different applications have config files in /etc, wouldn't both files 
have the same SELinux label?
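
To make the concern concrete, here is a small sketch. The assumptions are mine: the digest cache is modeled as a plain set of (label, content digest) pairs with no file path, both config files share a hypothetical `etc_t` label, and the path-bound variant at the end is a hypothetical extension, not part of the series.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


SHARED_LABEL = "etc_t"  # assumed: both config files carry the same label

# Entries recorded for the packaged state of the two applications.
digest_cache = {
    (SHARED_LABEL, sha256_hex(b"")),                     # app1: empty config
    (SHARED_LABEL, sha256_hex(b"option_a=default\n")),   # app2: packaged config
}


def is_known(label: str, content: bytes) -> bool:
    """Path-less lookup: only the label and the content digest are checked."""
    return (label, sha256_hex(content)) in digest_cache


# app2's customized config is replaced with an empty file. The entry recorded
# for app1 matches, so the replacement is not detected.
replacement_accepted = is_known(SHARED_LABEL, b"")

# Binding each entry to a path (hypothetical extension) would catch it.
path_bound_cache = {
    ("/etc/app1.conf", sha256_hex(b"")),
    ("/etc/app2.conf", sha256_hex(b"option_a=default\n")),
}
replacement_detected = ("/etc/app2.conf", sha256_hex(b"")) not in path_bound_cache
```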

> You can still go back to the initial state; to prevent that, a rollback
> prevention mechanism is needed (maybe EVM can remove the digest of the
> initial state from the digest cache when it sees an update?).
> 
> In general, the Integrity Digest Cache should be considered as an
> alternative mechanism to validate immutable files, or the initial state
> of mutable files. For mutable files, EVM HMAC will protect further
> updates.

In the example above, from a distro standpoint, most files contained in /etc 
are viewed as being mutable.  However an end-user that wants to maintain 
integrity on their system wouldn't view it that way.  They don't want config 
changes they have made to be backed out. In the current form they would 
view this series as an Authenticity Digest Cache. I'm just trying to show that 
this could be a lot more valuable to the end-user if some things were changed.
Roberto Sassu Dec. 6, 2024, 3:26 p.m. UTC | #14
On Fri, 2024-12-06 at 15:15 +0000, Eric Snowberg wrote:
> 
> > On Dec 6, 2024, at 3:06 AM, Roberto Sassu <roberto.sassu@huaweicloud.com> wrote:
> > 
> > On Thu, 2024-12-05 at 19:41 +0000, Eric Snowberg wrote:
> > > 
> > > > On Dec 5, 2024, at 9:16 AM, Roberto Sassu <roberto.sassu@huaweicloud.com> wrote:
> > > > 
> > > > On Thu, 2024-12-05 at 09:53 +0100, Roberto Sassu wrote:
> > > > > On Thu, 2024-12-05 at 00:57 +0000, Eric Snowberg wrote:
> > > > > > 
> > > > > > > On Dec 4, 2024, at 3:44 AM, Roberto Sassu <roberto.sassu@huaweicloud.com> wrote:
> > > > > > > 
> > > > > > > On Tue, 2024-12-03 at 20:06 +0000, Eric Snowberg wrote:
> > > > > > > > 
> > > > > > > > > On Nov 26, 2024, at 3:41 AM, Roberto Sassu <roberto.sassu@huaweicloud.com> wrote:
> > > > > > > > > 
> > > > > > > > > On Tue, 2024-11-26 at 00:13 +0000, Eric Snowberg wrote:
> > > > > > > > > > 
> > > > > > > > > > > On Nov 19, 2024, at 3:49 AM, Roberto Sassu <roberto.sassu@huaweicloud.com> wrote:
> > > > > > > > > > > 
> > > > > > > > > > > From: Roberto Sassu <roberto.sassu@huawei.com>
> > > > > > > > > > > 
> > > > > > > > > > > The Integrity Digest Cache can also help IMA for appraisal. IMA can simply
> > > > > > > > > > > lookup the calculated digest of an accessed file in the list of digests
> > > > > > > > > > > extracted from package headers, after verifying the header signature. It is
> > > > > > > > > > > sufficient to verify only one signature for all files in the package, as
> > > > > > > > > > > opposed to verifying a signature for each file.
> > > > > > > > > > 
> > > > > > > > > > Is there a way to maintain integrity over time?  Today if a CVE is discovered 
> > > > > > > > > > in a signed program, the program hash can be added to the blacklist keyring. 
> > > > > > > > > > Later if IMA appraisal is used, the signature validation will fail just for that 
> > > > > > > > > > program.  With the Integrity Digest Cache, is there a way to do this?  
> > > > > > > > > 
> > > > > > > > > As far as I can see, the ima_check_blacklist() call is before
> > > > > > > > > ima_appraise_measurement(). If it fails, appraisal with the Integrity
> > > > > > > > > Digest Cache will not be done.
> > > > > > > > 
> > > > > > > > 
> > > > > > > > It is good the program hash would be checked beforehand and fail if it is 
> > > > > > > > contained on the list. 
> > > > > > > > 
> > > > > > > > The .ima keyring may contain many keys.  If one of the keys was later 
> > > > > > > > revoked and added to the .blacklist, wouldn't this be missed?  It would 
> > > > > > > > be caught during signature validation when the file is later appraised, but 
> > > > > > > > now this step isn't taking place.  Correct?
> > > > > > > 
> > > > > > > For files included in the digest lists, yes, there won't be detection
> > > > > > > of later revocation of a key. However, it will still work at package
> > > > > > > level/digest list level, since they are still appraised with a
> > > > > > > signature.
> > > > > > > 
> > > > > > > We can add a mechanism (if it does not already exist) to invalidate the
> > > > > > > integrity status based on key revocation, which can be propagated to
> > > > > > > files verified with the affected digest lists.
> > > > > > > 
> > > > > > > > With IMA appraisal, it is easy to maintain authenticity but challenging to 
> > > > > > > > maintain integrity over time. In user-space there are constantly new CVEs.  
> > > > > > > > To maintain integrity over time, either keys need to be rotated in the .ima 
> > > > > > > > keyring or program hashes need to be frequently added to the .blacklist.   
> > > > > > > > If neither is done, for an end-user on a distro, IMA-appraisal basically 
> > > > > > > > guarantees authenticity.
> > > > > > > > 
> > > > > > > > While I understand the intent of the series is to increase performance, 
> > > > > > > > have you considered using this to give the end-user the ability to maintain 
> > > > > > > > integrity of their system?  What I mean is, instead of trying to import anything 
> > > > > > > > from an RPM, just have the end-user provide this information in some format 
> > > > > > > > to the Digest Cache.  User-space tools could be built to collect and format 
> > > > > > > 
> > > > > > > This is already possible, digest-cache-tools
> > > > > > > (https://github.com/linux-integrity/digest-cache-tools) already allow
> > > > > > > to create a digest list with the file a user wants.
> > > > > > > 
> > > > > > > But in this case, the user is vouching for having taken the correct
> > > > > > > measure of the file at the time it was added to the digest list. This
> > > > > > > would be instead automatically guaranteed by RPMs or other packages
> > > > > > > shipped with Linux distributions.
> > > > > > > 
> > > > > > > To mitigate the concerns of CVEs, we can probably implement a rollback
> > > > > > > prevention mechanism, which would not allow to load a previous version
> > > > > > > of a digest list.
> > > > > > 
> > > > > > IMHO, pursuing this with the end-user being in control of what is contained 
> > > > > > within the Digest Cache vs what is contained in a distro would provide more
> > > > > > value. Allowing the end-user to easily update their Digest Cache in some way 
> > > > > > without having to do any type of revocation for both old and vulnerable 
> > > > > > applications with CVEs would be very beneficial.
> > > > > 
> > > > > Yes, deleting the digest list would invalidate any integrity result
> > > > > done with that digest list.
> > > > > 
> > > > > I developed also an rpm plugin that synchronizes the digest lists with
> > > > > installed software. Old vulnerable software cannot be verified anymore
> > > > > with the Integrity Digest Cache, since the rpm plugin deletes the old
> > > > > software digest lists.
> > > > > 
> > > > > https://github.com/linux-integrity/digest-cache-tools/blob/main/rpm-plugin/digest_cache.c
> > > > > 
> > > > > The good thing is that the Integrity Digest Cache can be easily
> > > > > controlled with filesystem operations (it works similarly to security
> > > > > blobs attached to kernel objects, like inodes and file descriptors).
> > > > > 
> > > > > As soon as something changes (e.g. digest list written, link to the
> > > > > digest lists), this triggers a reset in the Integrity Digest Cache, so
> > > > > digest lists and files need to be verified again. Deleting the digest
> > > > > list causes the in-kernel digest cache to be wiped away too (when the
> > > > > reference count reaches zero).
> > > > > 
> > > > > > Is there a belief the Digest Cache would be used without signed kernel 
> > > > > > modules?  Is the performance gain worth changing how kernel modules 
> > > > > > get loaded at boot?  Couldn't this part just be dropped for easier acceptance?  
> > > > > > Integrity is already maintained with the current model of appended signatures. 
> > > > > 
> > > > > I don't like making exceptions in the design, and I recently realized
> > > > > that it should not be task of the users of the Integrity Digest Cache
> > > > > to limit themselves.
> > > > 
> > > > Forgot to mention that your use case is possible. The usage of the
> > > > Integrity Digest Cache must be explicitly enabled in the IMA policy. It
> > > > will be used if the matching rule has 'digest_cache=data' (its foreseen
> > > > to be used also for metadata).
> > > 
> > > I see a lot of benefit if metadata integrity could be maintained, but in the 
> > > current form of this series, I don't think that is possible.  The Digest Cache 
> > > doesn't contain or enforce the file path, which would be necessary to 
> > > maintain integrity.  Here is an example of why it would be needed, say 
> > > you have two applications that need a configuration file to start.  The first 
> > > application has an empty file where no configuration options are currently 
> > > defined. Now there is a hash for an empty file in the Digest Cache.  The 
> > > second application can be started with an empty configuration file, however 
> > > the end-user has added some options to it.  If the configuration file for the 
> > > second application is replaced with an empty file, it will not be detected, 
> > > since the Digest Cache would see the empty file hash in its cache.
> > 
> > I was thinking more of storing digests of metadata in the digest cache
> > (including, for example, the expected SELinux label), which EVM can
> > look up.
> > 
> > In that way, the problem you foresee cannot happen: if you replace the
> > file belonging to app2_t with the one belonging to app1_t, SELinux
> > would deny the permission to access; if you change the SELinux label of
> > the file, EVM will deny the access.
> 
> If two different applications have config files in /etc, wouldn't both files 
> have the same SELinux label?

Likely, unless there is an application-specific policy.

> > You can still go back to the initial state; to prevent that, a rollback
> > prevention mechanism is needed (maybe EVM can remove the digest of the
> > initial state from the digest cache when it sees an update?).
> > 
> > In general, the Integrity Digest Cache should be considered as an
> > alternative mechanism to validate immutable files, or the initial state
> > of mutable files. For mutable files, EVM HMAC will protect further
> > updates.
> 
> In the example above, from a distro standpoint, most files contained in /etc 
> are viewed as being mutable.  However an end-user that wants to maintain 
> integrity on their system wouldn't view it that way.  They don't want config 
> changes they have made to be backed out. In the current form they would 
> view this series as an Authenticity Digest Cache. I'm just trying to show that 
> this could be a lot more valuable to the end-user if some things were changed.

I agree, I think the current patch set contains the minimum necessary,
and it can grow depending on use cases/requirements from the community.

Thanks

Roberto