diff mbox series

[v6,07/15] digest_cache: Allow registration of digest list parsers

Message ID 20241119104922.2772571-8-roberto.sassu@huaweicloud.com (mailing list archive)
State Handled Elsewhere
Series integrity: Introduce the Integrity Digest Cache

Checks

Context Check Description
mcgrof/vmtest-modules-next-VM_Test-1 success Logs for Run CI tests
mcgrof/vmtest-modules-next-VM_Test-0 success Logs for Run CI tests
mcgrof/vmtest-modules-next-VM_Test-4 success Logs for setup / Setup kdevops environment
mcgrof/vmtest-modules-next-VM_Test-5 success Logs for setup / Setup kdevops environment
mcgrof/vmtest-modules-next-VM_Test-3 success Logs for cleanup / Archive results and cleanup
mcgrof/vmtest-modules-next-VM_Test-2 success Logs for cleanup / Archive results and cleanup
mcgrof/vmtest-modules-next-PR fail merge-conflict

Commit Message

Roberto Sassu Nov. 19, 2024, 10:49 a.m. UTC
From: Roberto Sassu <roberto.sassu@huawei.com>

Allow kernel modules to register/deregister new digest list parsers,
respectively through digest_cache_register_parser() and
digest_cache_unregister_parser().

Those functions pass the new parser structure holding the linked list
pointers and a parsing function with the new type parser_func.

Introduce digest_cache_parse_digest_list(), which determines the desired
parser from the file name, looks up the parser among the registered ones
with lookup_get_parser(), calls the parser-specific function along with the
digest list data read by the kernel, and finally releases the kernel module
reference with put_parser().
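
As a rough illustration of the lookup / call / release sequence described
above, here is a single-threaded userspace model in C. It is a sketch, not the
code from this patch: all model_* names are hypothetical, the integer refcount
only stands in for the module reference taken with try_module_get(), the
callback omits the digest cache argument, and rejecting duplicate names is an
assumption of the sketch. The kernel code additionally protects the list with
a mutex.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified callback: the real parser_func also takes a digest cache. */
typedef int (*model_parser_func)(const unsigned char *data, size_t data_len);

struct model_parser {
	struct model_parser *next;	/* stands in for struct list_head */
	const char *name;		/* must match the format in the file name */
	model_parser_func func;
	int refcount;			/* stands in for the module refcount */
};

static struct model_parser *model_parsers;

/* Example parser for the model: "parses" by reporting the data length. */
int model_count_bytes(const unsigned char *data, size_t data_len)
{
	(void)data;
	return (int)data_len;
}

/* Add a parser; rejecting duplicate names is an assumption of this sketch. */
int model_register_parser(struct model_parser *parser)
{
	struct model_parser *cur;

	for (cur = model_parsers; cur; cur = cur->next)
		if (!strcmp(cur->name, parser->name))
			return -1;

	parser->refcount = 0;
	parser->next = model_parsers;
	model_parsers = parser;
	return 0;
}

/* Unlink a parser from the list, if present. */
void model_unregister_parser(struct model_parser *parser)
{
	struct model_parser **pp;

	for (pp = &model_parsers; *pp; pp = &(*pp)->next) {
		if (*pp == parser) {
			*pp = parser->next;
			parser->next = NULL;
			return;
		}
	}
}

/* Find a parser by name and take a reference on it. */
struct model_parser *model_lookup_get_parser(const char *name)
{
	struct model_parser *cur;

	for (cur = model_parsers; cur; cur = cur->next) {
		if (!strcmp(cur->name, name)) {
			cur->refcount++;
			return cur;
		}
	}
	return NULL;
}

/* Drop the reference taken by model_lookup_get_parser(). */
void model_put_parser(struct model_parser *parser)
{
	assert(parser->refcount > 0);
	parser->refcount--;
}

/* Look up the parser for @format, run it on @data, then release it. */
int model_parse_digest_list(const char *format, const unsigned char *data,
			    size_t data_len)
{
	struct model_parser *parser = model_lookup_get_parser(format);
	int ret;

	if (!parser)
		return -1;	/* no parser registered for this format */

	ret = parser->func(data, data_len);
	model_put_parser(parser);
	return ret;
}
```

Holding the reference only around the call to the parsing function means a
parser can be unregistered as soon as no parse is in flight.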

The expected digest list file name format is:

[<seq num>-]<digest list format>-<digest list name>

<seq num>- is an optional prefix that imposes the order in which digest lists
in a directory are parsed.
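
To make the file name format concrete, the following userspace C sketch
extracts the <digest list format> component from a file name. The helper name
digest_list_format() is hypothetical and not taken from the patch, and the
sketch assumes that the optional sequence number consists only of decimal
digits and that no format name is purely numeric.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): copy the <digest list format>
 * component of @file_name into @format.
 *
 * Assumption: an optional <seq num> prefix consists only of decimal digits.
 *
 * Returns 0 on success, -1 if the name does not match the expected layout
 * or @format is too small.
 */
int digest_list_format(const char *file_name, char *format,
		       size_t format_size)
{
	const char *start = file_name;
	const char *sep;
	size_t len;

	assert(file_name && format);

	/* Skip "<seq num>-" if the name begins with digits followed by '-'. */
	sep = start;
	while (isdigit((unsigned char)*sep))
		sep++;
	if (sep != start && *sep == '-')
		start = sep + 1;

	/* The format ends at the next '-'; the rest is the digest list name. */
	sep = strchr(start, '-');
	if (!sep || sep == start || sep[1] == '\0')
		return -1;

	len = (size_t)(sep - start);
	if (len >= format_size)
		return -1;

	memcpy(format, start, len);
	format[len] = '\0';
	return 0;
}
```

With this sketch, both "0-tlv-list" and "tlv-list" select the "tlv" format,
while a name without a '-' separated format component is rejected.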

Introduce load_parser() to load a kernel module containing a
parser for the requested digest list format (compressed kernel modules are
supported). Kernel modules are searched in the directory defined by
PARSERS_DIR in the Makefile, i.e.
/lib/modules/<kernel ver>/kernel/security/integrity/digest_cache/parsers.
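
The module path that load_parser() builds can be sketched in userspace C as
follows. This is an illustration only: the example_* names and the enum are
hypothetical, the directory string mirrors the PARSERS_DIR definition in the
Makefile hunk of this patch with "<kernel ver>" left unexpanded, and in the
kernel the compression suffix is chosen at build time from the
CONFIG_MODULE_COMPRESS_* options rather than passed at run time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Placeholder directory for illustration only. In the patch, PARSERS_DIR is
 * set at build time from $(MODLIB); "<kernel ver>" is left unexpanded here.
 */
#define EXAMPLE_PARSERS_DIR \
	"/lib/modules/<kernel ver>/kernel/security/integrity/digest_cache/parsers"

/* Hypothetical run-time stand-in for the CONFIG_MODULE_COMPRESS_* options. */
enum example_compress {
	EXAMPLE_COMPRESS_NONE,
	EXAMPLE_COMPRESS_GZIP,
	EXAMPLE_COMPRESS_XZ,
	EXAMPLE_COMPRESS_ZSTD,
};

/* Map the compression method to the suffix appended after ".ko". */
const char *example_compress_suffix(enum example_compress c)
{
	switch (c) {
	case EXAMPLE_COMPRESS_GZIP:
		return ".gz";
	case EXAMPLE_COMPRESS_XZ:
		return ".xz";
	case EXAMPLE_COMPRESS_ZSTD:
		return ".zst";
	default:
		return "";
	}
}

/*
 * Build "<dir>/<name>.ko<suffix>" in @buf.
 * Returns 0 on success, -1 on an empty name or if @buf is too small.
 */
int example_parser_path(char *buf, size_t size, const char *name,
			enum example_compress c)
{
	int n;

	assert(buf && name);

	if (strlen(name) == 0)
		return -1;

	n = snprintf(buf, size, "%s/%s.ko%s", EXAMPLE_PARSERS_DIR, name,
		     example_compress_suffix(c));
	if (n < 0 || (size_t)n >= size)
		return -1;

	return 0;
}
```

The parser name used in the path is the same string as the format component of
the digest list file name, so a digest list named "rpm-..." leads to a lookup
of rpm.ko (plus the compression suffix, if any).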

load_parser() calls ksys_finit_module() to load a kernel module directly
from the kernel. request_module() cannot be used at this point, since the
reference digests of modprobe and the linked libraries (required for IMA
appraisal) might not be yet available, resulting in modprobe execution
being denied.

Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
---
 include/linux/digest_cache.h               |  38 +++
 security/integrity/digest_cache/Kconfig    |   1 +
 security/integrity/digest_cache/Makefile   |   4 +-
 security/integrity/digest_cache/internal.h |   5 +
 security/integrity/digest_cache/parsers.c  | 257 +++++++++++++++++++++
 5 files changed, 304 insertions(+), 1 deletion(-)
 create mode 100644 security/integrity/digest_cache/parsers.c

Comments

Randy Dunlap Nov. 19, 2024, 4:46 p.m. UTC | #1
Hi--

On 11/19/24 2:49 AM, Roberto Sassu wrote:
> +/**
> + * struct parser - Structure to store a function pointer to parse digest list
> + * @list: Linked list
> + * @owner: Kernel module owning the parser
> + * @name: Parser name (must match the format in the digest list file name)
> + * @func: Function pointer for parsing
> + *
> + * This structure stores a function pointer to parse a digest list.
> + */
> +struct parser {
> +	struct list_head list;
> +	struct module *owner;
> +	const char name[NAME_MAX + 1];
> +	parser_func func;
> +};

I would make the struct name not so generic -- maybe digest_parser ...

thanks.
Roberto Sassu Nov. 19, 2024, 4:48 p.m. UTC | #2
On Tue, 2024-11-19 at 08:46 -0800, Randy Dunlap wrote:
> Hi--
> 
> On 11/19/24 2:49 AM, Roberto Sassu wrote:
> > +/**
> > + * struct parser - Structure to store a function pointer to parse digest list
> > + * @list: Linked list
> > + * @owner: Kernel module owning the parser
> > + * @name: Parser name (must match the format in the digest list file name)
> > + * @func: Function pointer for parsing
> > + *
> > + * This structure stores a function pointer to parse a digest list.
> > + */
> > +struct parser {
> > +	struct list_head list;
> > +	struct module *owner;
> > +	const char name[NAME_MAX + 1];
> > +	parser_func func;
> > +};
> 
> I would make the struct name not so generic -- maybe digest_parser ...

Hi

sure, thanks for the suggestion!

Roberto
Luis Chamberlain Nov. 25, 2024, 11:53 p.m. UTC | #3
On Tue, Nov 19, 2024 at 11:49:14AM +0100, Roberto Sassu wrote:
> From: Roberto Sassu <roberto.sassu@huawei.com>
> Introduce load_parser() to load a kernel module containing a
> parser for the requested digest list format (compressed kernel modules are
> supported). Kernel modules are searched in the
> /lib/modules/<kernel ver>/security/integrity/digest_cache directory.
> 
> load_parser() calls ksys_finit_module() to load a kernel module directly
> from the kernel. request_module() cannot be used at this point, since the
> reference digests of modprobe and the linked libraries (required for IMA
> appraisal) might not be yet available, resulting in modprobe execution
> being denied.

You are doing a full solution implementation of loading modules in-kernel.
Appraisals of modules is just part of the boot process, some module
loading may need firmware to be loaded to get some functionality to work,
for example some firmware to get a network device up or a GPU driver.
So module loading alone is not the only thing which may require
IMA appraisal, and this solution only addresses modules. There are other
things which may be needed other than firmware, eBPF programs are
another example.

It sounds more like you want to provide or extend LSM hooks to fit your
architecture and make kernel_read_file() LSM hooks optionally use it to
fit this model.

Because this is just for a *phase* in boot, which you've caught because of
a catch-22 situation, where you didn't have your parsers loaded. Which is
just a reflection that you hit that snag. It doesn't prove all snags
will be caught yet.

And you only want to rely on this .. in-kernel loading solution only
early on boot, is there a way to change this over to enable regular
operation later?

 Luis
Roberto Sassu Nov. 26, 2024, 10:25 a.m. UTC | #4
On Mon, 2024-11-25 at 15:53 -0800, Luis Chamberlain wrote:
> On Tue, Nov 19, 2024 at 11:49:14AM +0100, Roberto Sassu wrote:
> > From: Roberto Sassu <roberto.sassu@huawei.com>
> > Introduce load_parser() to load a kernel module containing a
> > parser for the requested digest list format (compressed kernel modules are
> > supported). Kernel modules are searched in the
> > /lib/modules/<kernel ver>/security/integrity/digest_cache directory.
> > 
> > load_parser() calls ksys_finit_module() to load a kernel module directly
> > from the kernel. request_module() cannot be used at this point, since the
> > reference digests of modprobe and the linked libraries (required for IMA
> > appraisal) might not be yet available, resulting in modprobe execution
> > being denied.
> 
> You are doing a full solution implementation of loading modules in-kernel.
> Appraisals of modules is just part of the boot process, some module
> loading may need firmware to be loaded to get some functionality to work,
> for example some firmware to get a network device up or a GPU driver.
> So module loading alone is not the only thing which may require
> IMA appraisal, and this solution only addresses modules. There are other
> things which may be needed other than firmware, eBPF programs are
> another example.

Firmware, eBPF programs and so on are supposed to be verified with
digest lists (or alternative methods, such as file signatures), once
the required digest list parsers are loaded.

The parser is an exceptional case, because user space cannot be
executed at this point. Once the parsers are loaded, verification of
everything else proceeds as normal. Fortunately, in most cases kernel
modules are signed, so digest lists are not required to verify them.

> It sounds more like you want to provide or extend LSM hooks to fit your
> architecture and make kernel_read_file() LSM hooks optionally use it to
> fit this model.

As far as the LSM infrastructure is concerned, I'm not adding new LSM
hooks, nor extending/modifying the existing ones. The operations the
Integrity Digest Cache is doing match the usage expectation by LSM (net
denying access, as discussed with Paul Moore).

The Integrity Digest Cache is supposed to be used as a supporting tool
for other LSMs to do regular access control based on file data and
metadata integrity. In doing that, it still needs the LSM
infrastructure to notify about filesystem changes, and to store
additional information in the inode and file descriptor security blobs.

The kernel_post_read_file LSM hook should be implemented by another LSM
to verify the integrity of a digest list, when the Integrity Digest
Cache calls kernel_read_file() to read that digest list. That LSM is
also responsible to provide the result of the integrity verification to
the Integrity Digest Cache, so that the latter can give this
information back to whoever wants to do a digest lookup from that
digest list and also wants to know whether or not the digest list was
authentic.

> Because this is just for a *phase* in boot, which you've caught because of
> a catch-22 situation, where you didn't have your parsers loaded. Which is
> just a reflection that you hit that snag. It doesn't prove all snags
> will be caught yet.

Yes, that didn't happen earlier, since all the parsers were compiled
built-in in the kernel. The Integrity Digest Cache already has a
deadlock avoidance mechanism for digest lists.

Supporting kernel modules opened the road for new deadlocks, since one
can ask a digest list to verify a kernel module, but that digest list
requires the same kernel module. That is why the in-kernel mechanism is
100% reliable, because the Integrity Digest Cache marks the file
descriptors it opens, and can recognize them, when those file
descriptors are passed back to it by other LSMs (e.g. through the
kernel_post_read_file LSM hook).

> And you only want to rely on this .. in-kernel loading solution only
> early on boot, is there a way to change this over to enable regular
> operation later?

User space can voluntarily load new digest list parsers, but the
Integrity Digest Cache cannot rely on it to be done. Also, using
request_module() does not seem a good idea, since it wouldn't allow the
Integrity Digest Cache to mark the file descriptor of kernel modules,
and thus the Integrity Digest Cache could not determine whether or not
a deadlock is happening.

Thanks

Roberto
Luis Chamberlain Nov. 26, 2024, 7:04 p.m. UTC | #5
On Tue, Nov 26, 2024 at 11:25:07AM +0100, Roberto Sassu wrote:
> On Mon, 2024-11-25 at 15:53 -0800, Luis Chamberlain wrote:
> 
> Firmware, eBPF programs and so on are supposed

Keyword: "supposed". 

> As far as the LSM infrastructure is concerned, I'm not adding new LSM
> hooks, nor extending/modifying the existing ones. The operations the
> Integrity Digest Cache is doing match the usage expectation by LSM (net
> denying access, as discussed with Paul Moore).

If modules are the only proven exception to your security model you are
not making the case for it clearly.

> The Integrity Digest Cache is supposed to be used as a supporting tool
> for other LSMs to do regular access control based on file data and
> metadata integrity. In doing that, it still needs the LSM
> infrastructure to notify about filesystem changes, and to store
> additional information in the inode and file descriptor security blobs.
> 
> The kernel_post_read_file LSM hook should be implemented by another LSM
> to verify the integrity of a digest list, when the Integrity Digest
> Cache calls kernel_read_file() to read that digest list.

If LSM folks *do* agree that this work is *supplementing* LSMs then sure,
it was not clear from the commit logs. But then you need to ensure the
parsers are special snowflakes which won't ever incur other additional
kernel_read_file() calls.

> Supporting kernel modules opened the road for new deadlocks, since one
> can ask a digest list to verify a kernel module, but that digest list
> requires the same kernel module. That is why the in-kernel mechanism is
> 100% reliable,

Are users of this infrastructure really in need of modules for these
parsers?

  Luis
Roberto Sassu Nov. 27, 2024, 9:51 a.m. UTC | #6
On Tue, 2024-11-26 at 11:04 -0800, Luis Chamberlain wrote:
> On Tue, Nov 26, 2024 at 11:25:07AM +0100, Roberto Sassu wrote:
> > On Mon, 2024-11-25 at 15:53 -0800, Luis Chamberlain wrote:
> > 
> > Firmware, eBPF programs and so on are supposed
> 
> Keyword: "supposed". 

It depends if they are in a policy. They can also be verified with
other methods, such as file signatures.

For eBPF programs we are also in a need for a better way to
measure/appraise them.

> > As far as the LSM infrastructure is concerned, I'm not adding new LSM
> > hooks, nor extending/modifying the existing ones. The operations the
> > Integrity Digest Cache is doing match the usage expectation by LSM (net
> > denying access, as discussed with Paul Moore).
> 
> If modules are the only proven exception to your security model you are
> not making the case for it clearly.

The Integrity Digest Cache is not implementing any security model; this
is delegated to other LSMs, which might decide to use the Integrity
Digest Cache based on a policy.

If we want to be super picky, the ksys_finit_module() helper is not
calling security_kernel_module_request(), which is done when using
request_module(). On the other hand, ksys_finit_module() is not
triggering user space, as the description of the function states
(anyway, apologies for not bringing up this earlier).

Apart from this (and we can discuss whether it is more appropriate to
call the LSM hook), the helper does not introduce any exception, since
security_file_open() is called when the kernel opens the file
descriptor, and security_kernel_read_file() and
security_kernel_post_read_file() are called in the same way regardless
of whether it is user space doing insmod or the kernel calling
ksys_finit_module().

The only exception is that the Integrity Digest Cache is unable to
verify the kernel modules containing the parsers, but I believe this is
fine because they are verified with their appended signature.

If there are any other concerns I'm missing, please let me know.

> > The Integrity Digest Cache is supposed to be used as a supporting tool
> > for other LSMs to do regular access control based on file data and
> > metadata integrity. In doing that, it still needs the LSM
> > infrastructure to notify about filesystem changes, and to store
> > additional information in the inode and file descriptor security blobs.
> > 
> > The kernel_post_read_file LSM hook should be implemented by another LSM
> > to verify the integrity of a digest list, when the Integrity Digest
> > Cache calls kernel_read_file() to read that digest list.
> 
> If LSM folks *do* agree that this work is *supplementing* LSMs then sure,
> it was not clear from the commit logs. But then you need to ensure the
> parsers are special snowflakes which won't ever incur other additional
> kernel_read_file() calls.

The Integrity Digest Cache was originally called digest_cache LSM, but
was renamed due to Paul's concern that it is not a proper LSM enforcing
a security model. If you are interested, I gave a talk at LSS NA 2024:

https://www.youtube.com/watch?v=aNwlKYSksg8

Given that the Integrity Digest Cache could not be standalone and use
the LSM infrastructure facilities, it is going to be directly
integrated in IMA, although it is not strictly necessary.

I planned to support IPE and BPF LSM as other users.

Uhm, let me clarify your last sentence a bit.

Let's assume that IMA is asked to verify a parser, when invoked through
the kernel_post_read_file hook. IMA is not handling the exception, and
is calling digest_cache_get() as usual. Normally, this would succeed,
but because digest_cache_get() realizes that the file descriptor passed
as argument is marked (i.e. it was opened by the Integrity Digest Cache
itself), it returns NULL.

That means that IMA falls back on another verification method, which is
verifying the appended signature.
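
The fallback described above can be modelled in a few lines of userspace C.
This is only a toy model under stated assumptions, not IMA or Integrity Digest
Cache code: all model_* names are hypothetical, the boolean
opened_by_digest_cache stands in for the mark stored in the file security
blob, and the two other booleans replace the real digest lookup and signature
verification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an open file as seen by the appraising LSM. */
struct model_file {
	bool opened_by_digest_cache;	/* mark set by the digest cache */
	bool digest_in_cache;		/* would a digest lookup succeed? */
	bool has_valid_appended_sig;	/* would signature checking succeed? */
};

struct model_digest_cache {
	int unused;
};

static struct model_digest_cache model_cache;

enum model_verdict {
	MODEL_DENY,
	MODEL_ALLOW_DIGEST,
	MODEL_ALLOW_SIG,
};

/*
 * Models digest_cache_get(): files opened by the digest cache itself (such
 * as a parser module being loaded) never get a digest cache, which avoids
 * the circular dependency.
 */
struct model_digest_cache *model_digest_cache_get(const struct model_file *f)
{
	if (f->opened_by_digest_cache)
		return NULL;

	return &model_cache;
}

/*
 * Models the caller: it does not know about the exception, it simply falls
 * back on the appended signature when no digest cache is returned or the
 * digest is not found.
 */
enum model_verdict model_appraise(const struct model_file *f)
{
	struct model_digest_cache *cache = model_digest_cache_get(f);

	if (cache && f->digest_in_cache)
		return MODEL_ALLOW_DIGEST;

	if (f->has_valid_appended_sig)
		return MODEL_ALLOW_SIG;

	return MODEL_DENY;
}
```

The caller contains no special case for parser modules; the only place that
knows about the mark is the model of digest_cache_get().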

The most important design principle that I introduced is that users of
the Integrity Digest Cache don't need to be aware of any exception,
everything is handled by the Integrity Digest Cache itself.

The same occurs when a kernel read occurs with file ID
READING_DIGEST_LIST (introduced in this patch set). Yes, I forbid
specifying an IMA policy which requires the Integrity Digest Cache to
verify digest lists, but due to the need of handling kernel modules
I decided to handle the exceptions in the Integrity Digest Cache itself
(this is why now I'm passing a file descriptor to digest_cache_get()
instead of a dentry).

Now, I'm trying to follow you on the additional kernel_read_file()
calls. I agree with you, if a parser tries to open again the file that
is being verified it would cause a deadlock in IMA (since the inode
mutex is already locked for verifying the original file).

In the Integrity Digest Cache itself, this is not going to happen,
since the file being verified with a digest cache is known and an
internal open of the same file fails. If it is really necessary, we can
pass the information to the parsers so that they are aware, it is just
an additional parameter.

However, I was assuming that a parser just receives the data read by
the Integrity Digest Cache, and just calls the Parser API to add the
extracted digests to the new digest cache. Also this can be discussed,
but I guess there is no immediate need.

> > Supporting kernel modules opened the road for new deadlocks, since one
> > can ask a digest list to verify a kernel module, but that digest list
> > requires the same kernel module. That is why the in-kernel mechanism is
> > 100% reliable,
> 
> Are users of this infrastructure really in need of modules for these
> parsers?

I planned to postpone this to later, and introduced two parsers built-
in (TLV and RPM). However, due to Linus's concern regarding the RPM
parser, I moved it out in a kernel module.

Also, a parser cannot be in user space, since the trust anchor is in
the kernel (the public keys and the signature verification mechanism),
it is not something that can be established in the initial ram disk
since the Integrity Digest Cache will be continuously used in the
running system (maybe more parsers will be loaded on demand depending
on requests from user space).

And finally, the parser cannot run in user space, since it would be at
the same level of what the kernel is verifying.

Thanks

Roberto
Luis Chamberlain Nov. 27, 2024, 7:53 p.m. UTC | #7
On Wed, Nov 27, 2024 at 10:51:11AM +0100, Roberto Sassu wrote:
> For eBPF programs we are also in a need for a better way to
> measure/appraise them.

I am confused now, I was under the impression this "Integrity Digest
Cache" is just a special thing for LSMs, and so I was under the
impression that kernel_read_file() lsm hook already would take care
of eBPF programs.

> Now, I'm trying to follow you on the additional kernel_read_file()
> calls. I agree with you, if a parser tries to open again the file that
> is being verified it would cause a deadlock in IMA (since the inode
> mutex is already locked for verifying the original file).

Just document this on the parser as a requirement.

> > > Supporting kernel modules opened the road for new deadlocks, since one
> > > can ask a digest list to verify a kernel module, but that digest list
> > > requires the same kernel module. That is why the in-kernel mechanism is
> > > 100% reliable,
> > 
> > Are users of this infrastructure really in need of modules for these
> > parsers?
> 
> I planned to postpone this to later, and introduced two parsers built-
> in (TLV and RPM). However, due to Linus's concern regarding the RPM
> parser, I moved it out in a kernel module.

OK this should be part of the commit log, ie that it is not desirable to
have an rpm parser in-kernel for some users.

  Luis
Roberto Sassu Nov. 28, 2024, 8:23 a.m. UTC | #8
On Wed, 2024-11-27 at 11:53 -0800, Luis Chamberlain wrote:
> On Wed, Nov 27, 2024 at 10:51:11AM +0100, Roberto Sassu wrote:
> > For eBPF programs we are also in a need for a better way to
> > measure/appraise them.
> 
> I am confused now, I was under the impression this "Integrity Digest
> Cache" is just a special thing for LSMs, and so I was under the
> impression that kernel_read_file() lsm hook already would take care
> of eBPF programs.

Yes, the problem is that eBPF programs are transformed in user space
before they are sent to the kernel:

https://lwn.net/Articles/977394/

The Integrity Digest Cache can be used for the measurement/appraisal of
the initial eBPF ELF files, when they are accessed from the filesystem,
but the resulting blob sent to the kernel will be different.

> > Now, I'm trying to follow you on the additional kernel_read_file()
> > calls. I agree with you, if a parser tries to open again the file that
> > is being verified it would cause a deadlock in IMA (since the inode
> > mutex is already locked for verifying the original file).
> 
> Just document this on the parser as a requirement.

Ok, will do.

> > > > Supporting kernel modules opened the road for new deadlocks, since one
> > > > can ask a digest list to verify a kernel module, but that digest list
> > > > requires the same kernel module. That is why the in-kernel mechanism is
> > > > 100% reliable,
> > > 
> > > Are users of this infrastructure really in need of modules for these
> > > parsers?
> > 
> > I planned to postpone this to later, and introduced two parsers built-
> > in (TLV and RPM). However, due to Linus's concern regarding the RPM
> > parser, I moved it out in a kernel module.
> 
> OK this should be part of the commit log, ie that it is not desirable to
> have an rpm parser in-kernel for some users.

I understand. Will add in the commit log.

Just to clarify, we are not talking about the full-blown librpm in the
kernel, but about 243 LOC that I rewrote to obtain only the information I
need. I also formally verified it with pseudo/totally random data with
Frama-C:

https://github.com/robertosassu/rpm-formal/blob/main/validate_rpm.c

Thanks

Roberto
Luis Chamberlain Nov. 28, 2024, 8:40 p.m. UTC | #9
On Thu, Nov 28, 2024 at 09:23:57AM +0100, Roberto Sassu wrote:
> On Wed, 2024-11-27 at 11:53 -0800, Luis Chamberlain wrote:
> > On Wed, Nov 27, 2024 at 10:51:11AM +0100, Roberto Sassu wrote:
> > > For eBPF programs we are also in a need for a better way to
> > > measure/appraise them.
> > 
> > I am confused now, I was under the impression this "Integrity Digest
> > Cache" is just a special thing for LSMs, and so I was under the
> > impression that kernel_read_file() lsm hook already would take care
> > of eBPF programs.
> 
> Yes, the problem is that eBPF programs are transformed in user space
> before they are sent to the kernel:
> 
> https://lwn.net/Articles/977394/

That issue seems to be orthogonal to your endeavor though, which just
supplements LSMs, right?

Anyway, in case this helps:

The Rust folks faced some slightly related challenges with our CRC
validations for symbols, our CRC are slapped on with genksyms but this
relies on the source code and with Rust the compiler may do final
touches to data. And so DWARF is being used [1].

Although I am not sure of the state of eBPF DWARF support, there is also
BTF support [0] and most distros are relying on it to make live introspection 
easier, and the output is much smaller. So could DWARF or BTF information
from eBPF programs be used by the verifier in similar way to verify eBPF
programs?

Note that supporting BTF implies DWARF and the leap of faith for Rust
modversions support is that most distros will support DWARF, and so BTF
can become the norm [2].

[0] https://www.kernel.org/doc/html/latest/bpf/btf.html
[1] https://lwn.net/Articles/986892/
[2] https://lwn.net/Articles/991719/

  Luis
Roberto Sassu Nov. 29, 2024, 8:30 a.m. UTC | #10
On Thu, 2024-11-28 at 12:40 -0800, Luis Chamberlain wrote:
> On Thu, Nov 28, 2024 at 09:23:57AM +0100, Roberto Sassu wrote:
> > On Wed, 2024-11-27 at 11:53 -0800, Luis Chamberlain wrote:
> > > On Wed, Nov 27, 2024 at 10:51:11AM +0100, Roberto Sassu wrote:
> > > > For eBPF programs we are also in a need for a better way to
> > > > measure/appraise them.
> > > 
> > > I am confused now, I was under the impression this "Integrity Digest
> > > Cache" is just a special thing for LSMs, and so I was under the
> > > impression that kernel_read_file() lsm hook already would take care
> > > of eBPF programs.
> > 
> > Yes, the problem is that eBPF programs are transformed in user space
> > before they are sent to the kernel:
> > 
> > https://lwn.net/Articles/977394/
> 
> That issue seems to be orthogonal to your endeavor though, which just
> supplements LSMs, right?

Yes, correct, the Integrity Digest Cache would be used to search
whatever digest was calculated by LSMs.

Thanks

Roberto

> Anyway, in case this helps:
> 
> The Rust folks faced some slightly related challenges with our CRC
> validations for symbols, our CRC are slapped on with genksyms but this
> relies on the source code and with Rust the compiler may do final
> touches to data. And so DWARF is being used [1].
> 
> Although I am not sure of the state of eBPF DWARF support, there is also
> BTF support [0] and most distros are relying on it to make live introspection 
> easier, and the output is much smaller. So could DWARF or BTF information
> from eBPF programs be used by the verifier in similar way to verify eBPF
> programs?
> 
> Note that supporting BTF implies DWARF and the leap of faith for Rust
> modversions support is that most distros will support DWARF, and so BTF
> can become the norm [2].
> 
> [0] https://www.kernel.org/doc/html/latest/bpf/btf.html
> [1] https://lwn.net/Articles/986892/
> [2] https://lwn.net/Articles/991719/
> 
>   Luis
Patch

diff --git a/include/linux/digest_cache.h b/include/linux/digest_cache.h
index 59a42c04cbb8..a9d731990b7c 100644
--- a/include/linux/digest_cache.h
+++ b/include/linux/digest_cache.h
@@ -13,6 +13,32 @@ 
 #include <linux/fs.h>
 #include <crypto/hash_info.h>
 
+struct digest_cache;
+
+/**
+ * typedef parser_func - Function to parse digest lists
+ *
+ * Define a function type to parse digest lists.
+ */
+typedef int (*parser_func)(struct digest_cache *digest_cache, const u8 *data,
+			   size_t data_len);
+
+/**
+ * struct parser - Structure to store a function pointer to parse digest list
+ * @list: Linked list
+ * @owner: Kernel module owning the parser
+ * @name: Parser name (must match the format in the digest list file name)
+ * @func: Function pointer for parsing
+ *
+ * This structure stores a function pointer to parse a digest list.
+ */
+struct parser {
+	struct list_head list;
+	struct module *owner;
+	const char name[NAME_MAX + 1];
+	parser_func func;
+};
+
 #ifdef CONFIG_INTEGRITY_DIGEST_CACHE
 /* Client API */
 struct digest_cache *digest_cache_get(struct file *file);
@@ -30,6 +56,8 @@  int digest_cache_htable_add(struct digest_cache *digest_cache, u8 *digest,
 int digest_cache_htable_lookup(struct dentry *dentry,
 			       struct digest_cache *digest_cache, u8 *digest,
 			       enum hash_algo algo);
+int digest_cache_register_parser(struct parser *parser);
+void digest_cache_unregister_parser(struct parser *parser);
 
 #else
 static inline struct digest_cache *digest_cache_get(struct file *file)
@@ -72,5 +100,15 @@  static inline int digest_cache_htable_lookup(struct dentry *dentry,
 	return -EOPNOTSUPP;
 }
 
+static inline int
+digest_cache_register_parser(struct parser *parser)
+{
+	return -EOPNOTSUPP;
+}
+
+static inline void digest_cache_unregister_parser(struct parser *parser)
+{
+}
+
 #endif /* CONFIG_INTEGRITY_DIGEST_CACHE */
 #endif /* _LINUX_DIGEST_CACHE_H */
diff --git a/security/integrity/digest_cache/Kconfig b/security/integrity/digest_cache/Kconfig
index 419011fb52c9..65c07110911b 100644
--- a/security/integrity/digest_cache/Kconfig
+++ b/security/integrity/digest_cache/Kconfig
@@ -1,6 +1,7 @@ 
 # SPDX-License-Identifier: GPL-2.0
 config INTEGRITY_DIGEST_CACHE
 	bool "Integrity Digest Cache"
+	select MODULE_DECOMPRESS if MODULE_COMPRESS
 	default n
 	help
 	  This option enables a cache of reference digests (e.g. of file
diff --git a/security/integrity/digest_cache/Makefile b/security/integrity/digest_cache/Makefile
index 0092c913979d..d68cae690241 100644
--- a/security/integrity/digest_cache/Makefile
+++ b/security/integrity/digest_cache/Makefile
@@ -4,4 +4,6 @@ 
 
 obj-$(CONFIG_INTEGRITY_DIGEST_CACHE) += digest_cache.o
 
-digest_cache-y := main.o secfs.o htable.o
+digest_cache-y := main.o secfs.o htable.o parsers.o
+
+CFLAGS_parsers.o += -DPARSERS_DIR=\"$(MODLIB)/kernel/security/integrity/digest_cache/parsers\"
diff --git a/security/integrity/digest_cache/internal.h b/security/integrity/digest_cache/internal.h
index e14343e96caa..e178549f9ff9 100644
--- a/security/integrity/digest_cache/internal.h
+++ b/security/integrity/digest_cache/internal.h
@@ -161,4 +161,9 @@  int __init digest_cache_secfs_init(struct dentry *dir);
 /* htable.c */
 void digest_cache_htable_free(struct digest_cache *digest_cache);
 
+/* parsers.c */
+int digest_cache_parse_digest_list(struct dentry *dentry,
+				   struct digest_cache *digest_cache,
+				   char *path_str, void *data, size_t data_len);
+
 #endif /* _DIGEST_CACHE_INTERNAL_H */
diff --git a/security/integrity/digest_cache/parsers.c b/security/integrity/digest_cache/parsers.c
new file mode 100644
index 000000000000..744c9742a44e
--- /dev/null
+++ b/security/integrity/digest_cache/parsers.c
@@ -0,0 +1,257 @@ 
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2024 Huawei Technologies Duesseldorf GmbH
+ *
+ * Author: Roberto Sassu <roberto.sassu@huawei.com>
+ *
+ * Implement the code to register digest list parsers.
+ */
+
+#define pr_fmt(fmt) "digest_cache: "fmt
+#include <linux/init_task.h>
+#include <linux/namei.h>
+#include <uapi/linux/module.h>
+#include <linux/syscalls.h>
+#include <linux/vmalloc.h>
+
+#include "internal.h"
+
+static DEFINE_MUTEX(parsers_mutex);
+static LIST_HEAD(parsers);
+
+/**
+ * load_parser - Load kernel module containing a parser
+ * @dentry: Dentry of the inode for which the digest cache will be used
+ * @digest_cache: Digest cache
+ * @name: Name of the parser to load
+ *
+ * This function opens a kernel module file in PARSERS_DIR (i.e.
+ * /lib/modules/<kernel ver>/kernel/security/integrity/digest_cache/parsers)
+ * and executes ksys_finit_module() to load the kernel module. After kernel
+ * module initialization, the parser should be found in the list of parsers.
+ *
+ * Return: Zero if kernel module is loaded, a POSIX error code otherwise.
+ */
+static int load_parser(struct dentry *dentry, struct digest_cache *digest_cache,
+		       const char *name)
+{
+	char *compress_suffix = "";
+	char *parser_path;
+	struct file *file;
+	struct path path;
+	int ret = 0, flags = 0;
+
+	/* Must be kept in sync with kernel/module/Kconfig. */
+	if (IS_ENABLED(CONFIG_MODULE_COMPRESS_GZIP))
+		compress_suffix = ".gz";
+	else if (IS_ENABLED(CONFIG_MODULE_COMPRESS_XZ))
+		compress_suffix = ".xz";
+	else if (IS_ENABLED(CONFIG_MODULE_COMPRESS_ZSTD))
+		compress_suffix = ".zst";
+
+	if (strlen(compress_suffix))
+		flags |= MODULE_INIT_COMPRESSED_FILE;
+
+	parser_path = kasprintf(GFP_KERNEL, "%s/%s.ko%s", PARSERS_DIR, name,
+				compress_suffix);
+	if (!parser_path)
+		return -ENOMEM;
+
+	ret = kern_path(parser_path, 0, &path);
+	if (ret < 0) {
+		pr_debug("Cannot find path %s\n", parser_path);
+		goto out;
+	}
+
+	/* Cannot request a digest cache for the kernel module inode. */
+	if (d_backing_inode(dentry) == d_backing_inode(path.dentry)) {
+		pr_debug("Cannot request a digest cache for kernel module %s\n",
+			 dentry->d_name.name);
+		ret = -EBUSY;
+		goto out;
+	}
+
+	file = kernel_file_open(&path, O_RDONLY, &init_cred);
+	if (IS_ERR(file)) {
+		pr_debug("Cannot open %s\n", parser_path);
+		ret = PTR_ERR(file);
+		goto out_path;
+	}
+
+	/* Mark the opened file as ours. */
+	digest_cache_to_file_sec(file, digest_cache);
+
+	ret = ksys_finit_module(file, "", flags);
+	if (ret < 0)
+		pr_debug("Cannot load module %s\n", parser_path);
+
+	fput(file);
+out_path:
+	path_put(&path);
+out:
+	kfree(parser_path);
+	return ret;
+}
+
+/**
+ * lookup_get_parser - Look up and get a parser among the registered ones
+ * @name: Name of the parser to search
+ *
+ * This function searches for a parser among the registered ones, and returns
+ * it to the caller, after incrementing the kernel module reference count.
+ *
+ * Must be called with parsers_mutex held.
+ *
+ * Return: A parser structure if parser is found and available, NULL otherwise.
+ */
+static struct parser *lookup_get_parser(const char *name)
+{
+	struct parser *entry, *found = NULL;
+
+	list_for_each_entry(entry, &parsers, list) {
+		if (!strcmp(entry->name, name) &&
+		    try_module_get(entry->owner)) {
+			found = entry;
+			break;
+		}
+	}
+
+	return found;
+}
+
+/**
+ * put_parser - Put parser
+ * @parser: Parser to put
+ *
+ * This function decreases the kernel module reference count.
+ */
+static void put_parser(struct parser *parser)
+{
+	module_put(parser->owner);
+}
+
+/**
+ * digest_cache_parse_digest_list - Parse a digest list
+ * @dentry: Dentry of the inode for which the digest cache will be used
+ * @digest_cache: Digest cache
+ * @path_str: Path string of the digest list
+ * @data: Data to parse
+ * @data_len: Length of @data
+ *
+ * This function selects a parser for a digest list depending on its file name,
+ * and calls the appropriate parsing function. It expects the file name to be
+ * in the format: [<seq num>-]<digest list format>-<digest list name>.
+ * <seq num>- is optional.
+ *
+ * Return: Zero on success, a POSIX error code otherwise.
+ */
+int digest_cache_parse_digest_list(struct dentry *dentry,
+				   struct digest_cache *digest_cache,
+				   char *path_str, void *data, size_t data_len)
+{
+	char *filename, *format, *next_sep;
+	struct parser *parser;
+	char format_buf[sizeof(parser->name)];
+	int ret = -EINVAL;
+
+	filename = strrchr(path_str, '/');
+	if (!filename)
+		return ret;
+
+	filename++;
+	format = filename;
+
+	/*
+	 * Files without <seq num> are expected to start with the digest list
+	 * format, so a leading digit reliably identifies the <seq num> prefix.
+	 */
+	if (filename[0] >= '0' && filename[0] <= '9') {
+		format = strchr(filename, '-');
+		if (!format)
+			return ret;
+
+		format++;
+	}
+
+	next_sep = strchr(format, '-');
+	if (!next_sep || next_sep - format >= sizeof(format_buf))
+		return ret;
+
+	snprintf(format_buf, sizeof(format_buf), "%.*s",
+		 (int)(next_sep - format), format);
+
+	pr_debug("Parsing %s, format: %s, size: %zu\n", path_str, format_buf,
+		 data_len);
+
+	mutex_lock(&parsers_mutex);
+	parser = lookup_get_parser(format_buf);
+	mutex_unlock(&parsers_mutex);
+
+	if (!parser) {
+		load_parser(dentry, digest_cache, format_buf);
+
+		mutex_lock(&parsers_mutex);
+		parser = lookup_get_parser(format_buf);
+		mutex_unlock(&parsers_mutex);
+
+		if (!parser) {
+			pr_debug("Digest list parser %s not found\n",
+				 format_buf);
+			return -ENOENT;
+		}
+	}
+
+	ret = parser->func(digest_cache, data, data_len);
+	put_parser(parser);
+
+	return ret;
+}
+
+/**
+ * digest_cache_register_parser - Register new parser
+ * @parser: Parser structure to register
+ *
+ * This function searches the parser name among the registered ones and, if not
+ * found, appends the parser to the linked list of parsers.
+ *
+ * Return: Zero on success, -EEXIST if a parser with the same name exists.
+ */
+int digest_cache_register_parser(struct parser *parser)
+{
+	struct parser *p;
+	int ret = 0;
+
+	mutex_lock(&parsers_mutex);
+	p = lookup_get_parser(parser->name);
+	if (p) {
+		put_parser(p);
+		ret = -EEXIST;
+		goto out;
+	}
+
+	list_add_tail(&parser->list, &parsers);
+out:
+	pr_debug("Digest list parser '%s' %s registered\n", parser->name,
+		 (ret < 0) ? "cannot be" : "successfully");
+
+	mutex_unlock(&parsers_mutex);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(digest_cache_register_parser);
+
+/**
+ * digest_cache_unregister_parser - Unregister parser
+ * @parser: Parser structure to unregister
+ *
+ * This function removes the passed parser from the linked list of parsers.
+ */
+void digest_cache_unregister_parser(struct parser *parser)
+{
+	mutex_lock(&parsers_mutex);
+	list_del(&parser->list);
+	mutex_unlock(&parsers_mutex);
+
+	pr_debug("Digest list parser '%s' successfully unregistered\n",
+		 parser->name);
+}
+EXPORT_SYMBOL_GPL(digest_cache_unregister_parser);
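
For readers who want to check the file name handling in digest_cache_parse_digest_list() outside the kernel, the format extraction can be sketched in user space roughly as follows. This is only an illustration and not part of the patch: extract_format() is a hypothetical helper name, the buffer size is chosen by the caller rather than taken from struct parser, and only the string handling is reproduced (no parser lookup or module loading).

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical user-space helper, not part of the patch: copy the
 * <digest list format> component of path into buf (buf_len bytes).
 * Returns 0 on success, -EINVAL if the file name does not match
 * [<seq num>-]<digest list format>-<digest list name>.
 */
static int extract_format(const char *path, char *buf, size_t buf_len)
{
	const char *filename, *format, *next_sep;

	/* As in the patch, a path without any '/' is rejected. */
	filename = strrchr(path, '/');
	if (!filename)
		return -EINVAL;

	filename++;
	format = filename;

	/* A leading digit marks the optional "<seq num>-" prefix. */
	if (filename[0] >= '0' && filename[0] <= '9') {
		format = strchr(filename, '-');
		if (!format)
			return -EINVAL;

		format++;
	}

	/* The format ends at the next '-' and must fit in buf with its NUL. */
	next_sep = strchr(format, '-');
	if (!next_sep || (size_t)(next_sep - format) >= buf_len)
		return -EINVAL;

	snprintf(buf, buf_len, "%.*s", (int)(next_sep - format), format);
	return 0;
}
```

With made-up paths such as "/etc/digest_lists/0-tlv-gzip" or "/etc/digest_lists/rpm-bash-5.2" (directory and format names are assumptions for illustration), the helper would yield "tlv" and "rpm" respectively, while a name with no '-' after the format is rejected with -EINVAL.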