Message ID | 20210726163700.2092768-2-roberto.sassu@huawei.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | integrity: Introduce DIGLIM | expand |
Em Mon, 26 Jul 2021 18:36:49 +0200 Roberto Sassu <roberto.sassu@huawei.com> escreveu: > Add an overview of DIGLIM to Documentation/security/diglim/introduction.rst > and the architecture to Documentation/security/diglim/architecture.rst > > Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> > --- > .../security/diglim/architecture.rst | 45 ++ > Documentation/security/diglim/index.rst | 11 + > .../security/diglim/introduction.rst | 631 ++++++++++++++++++ > Documentation/security/index.rst | 1 + > MAINTAINERS | 9 + > 5 files changed, 697 insertions(+) > create mode 100644 Documentation/security/diglim/architecture.rst > create mode 100644 Documentation/security/diglim/index.rst > create mode 100644 Documentation/security/diglim/introduction.rst > > diff --git a/Documentation/security/diglim/architecture.rst b/Documentation/security/diglim/architecture.rst > new file mode 100644 > index 000000000000..a54fe2453715 > --- /dev/null > +++ b/Documentation/security/diglim/architecture.rst > @@ -0,0 +1,45 @@ > +.. SPDX-License-Identifier: GPL-2.0 > + > +Architecture > +============ > + > +This section introduces the high level architecture of DIGLIM. > + > +:: > + > + 5. add/delete from hash table and add refs to digest list > + +---------------------------------------------+ > + | +-----+ +-------------+ +--+ > + | | key |-->| digest refs |-->...-->| | > + V +-----+ +-------------+ +--+ > + +-------------+ +-----+ +-------------+ > + | digest list | | key |-->| digest refs | > + | (compact) | +-----+ +-------------+ > + +-------------+ +-----+ +-------------+ > + ^ 4. copy to | key |-->| digest refs | > + | kernel memory +-----+ +-------------+ kernel space > + -------------------------------------------------------------------------- > + ^ ^ user space > + |<----------------+ 3b. upload | > + +-------------+ +------------+ | 6. query digest > + | digest list | | user space | 2b. convert > + | (compact) | | parser | > + +-------------+ +------------+ > + 1a. upload ^ 1b. read > + | > + +------------+ > + | RPM header | > + +------------+ > + > + > +As mentioned before, digest lists can be uploaded directly if they are in "before"? This is at the beginning of this document ;-) You should probably add a reference to introduction.rst here, like: As mentioned at Documentation/security/diglim/introduction.rst, ... > +the compact format (step 1a) or can be uploaded indirectly by the user > +space parser if they are in an alternative format (steps 1b-3b). > + > +During upload, the kernel makes a copy of the digest list to the kernel > +memory (step 4), and creates the necessary structures to index the digests > +(hash table and a linked list of digest list references to locate the > +digests in the digest list) (step 5). > + > +Finally, digests can be searched from user space through a securityfs file > +(step 6) or by the kernel itself. This probably applies to Documentation/security as a hole, but the best is to split the documents on two separate parts: - the kAPI and internals; - the admin-guide part. The audience for the admin-guide is distribution pagagers and syssadmins. > diff --git a/Documentation/security/diglim/index.rst b/Documentation/security/diglim/index.rst > new file mode 100644 > index 000000000000..0fc5ab019bc0 > --- /dev/null > +++ b/Documentation/security/diglim/index.rst > @@ -0,0 +1,11 @@ > +.. SPDX-License-Identifier: GPL-2.0 > + > +====================================== > +Digest Lists Integrity Module (DIGLIM) > +====================================== > + > +.. toctree:: > + :maxdepth: 1 > + > + introduction > + architecture > diff --git a/Documentation/security/diglim/introduction.rst b/Documentation/security/diglim/introduction.rst > new file mode 100644 > index 000000000000..d8d8b2a17222 > --- /dev/null > +++ b/Documentation/security/diglim/introduction.rst > @@ -0,0 +1,631 @@ > +.. SPDX-License-Identifier: GPL-2.0 > + > +Introduction > +============ > + > +Digest Lists Integrity Module (DIGLIM) is a new component added to the > +integrity subsystem in the kernel, primarily aiming to aid Integrity I would replace: "is a new component added to" -> "is a component of" As this is the kind of text that tends to be outdated with time... Imagine someone reading this paragraph maybe 10 years in the future ;-) > +Measurement Architecture (IMA) in the process of checking the integrity of > +file content and metadata. It accomplishes this task by storing reference > +values coming from software vendors and by reporting whether or not the > +digest of file content or metadata calculated by IMA (or EVM) is found > +among those values. In this way, IMA can decide, depending on the result of > +a query, if a measurement should be taken or access to the file should be > +granted. The `Security Assumptions`_ section explains more in detail why > +this component has been placed in the kernel. > + > +The main benefits of using IMA in conjunction with DIGLIM are the ability > +to implement advanced remote attestation schemes based on the usage of a > +TPM key for establishing a TLS secure channel [1][2], and to reduce the > +burden on Linux distribution vendors to extend secure boot at OS level to > +applications. > + > +DIGLIM does not have the complexity of feature-rich databases. In fact, its > +main functionality comes from the hash table primitives already in the > +kernel. It does not have an ad-hoc storage module, it just indexes data in > +a fixed format (digest lists, a set of concatenated digests preceded by a > +header), copied to kernel memory as they are. Lastly, it does not support > +database-oriented languages such as SQL, but only accepts a digest and its > +algorithm as a query. > + > +The only digest list format supported by DIGLIM is called ``compact``. > +However, Linux distribution vendors don't have to generate new digest lists > +in this format for the packages they release, as already available > +information, such as RPM headers and DEB package metadata, can be already > +used as a source for reference values (they already include file digests), -ETOMANY_already as "already" available... can be "already" ... "already" include... I would simplify the above text removing such redundancy. > +with a user space parser taking care of the conversion to the compact > +format. > + > +Although one might perceive that storing file or metadata digests for a > +Linux distribution would significantly increase the memory usage, this does > +not seem to be the case. As an anticipation of the evaluation done in the > +`Preliminary Performance Evaluation`_ section, protecting binaries and > +shared libraries of a minimal Fedora 33 installation requires 208K of > +memory for the digest lists plus 556K for indexing. > + > +In exchange for a slightly increased memory usage, DIGLIM improves the > +performance of the integrity subsystem. In the considered scenario, IMA > +measurement and appraisal with digest lists requires respectively less than > +one quarter and less than half the time, compared to the current solution. I found this paragraph a little bit confusing to understand. Could you please improve the description? I mean: what improved by one quarter? what improved by "less than half of the time"? > + > +DIGLIM also keeps track of whether digest lists have been processed in some > +way (e.g. measured or appraised by IMA). This is important for example for > +remote attestation, so that remote verifiers understand what has been > +uploaded to the kernel. > + > +DIGLIM behaves like a transactional database, i.e. it has the ability to > +roll back to the beginning of the transaction if an error occurred during > +the addition of a digest list (the deletion operation always succeeds). I don't think it makes sense to compare it with a transactional database. I would say, instead, something like: The inserts on DIGLIM are atomic: if an error occurs during the addition of a digest list, it rolls back the entire insert operation. > +This capability has been tested with an ad-hoc fault injection mechanism > +capable of simulating failures during the operations. > + > +Finally, DIGLIM exposes to user space, through securityfs, the digest lists > +currently loaded, the number of digests added, a query interface and an > +interface to set digest list labels. > + > +[1] LSS EU 2019 > + > +- slides: > + https://static.sched.com/hosted_files/lsseu2019/bd/secure_attested_communication_channels_lss_eu_2019.pdf > +- video: https://youtu.be/mffdQgkvDNY > + > +[2] FutureTPM EU project, final review meeting demo > + > +- slides: > + https://futuretpm.eu/images/07-3-FutureTPM-Final-Review-Slides-WP6-Device-Management-Use-Case-HWDU.pdf > +- video: https://vimeo.com/528251864/4c1d55abcd The above won't generate any cross-references with Sphinx. For it correct syntax, see: https://www.sphinx-doc.org/en/master/usage/restructuredtext/basics.html#citations > + > + > +Binary Integrity > +---------------- > + > +Integrity is a fundamental security property in information systems. > +Integrity could be described as the condition in which a generic > +component is just after it has been released by the entity that created it. Sounds a weird description for me. (ISC)2 defines integrity on its glossary[1] as: "Guarding against improper information modification or destruction and includes ensuring information non-repudiation and authenticity." [1] https://www.isc2.org/Certifications/CISSP/CISSP-Student-Glossary > +One way to check whether a component is in this condition (called binary > +integrity) is to calculate its digest and to compare it with a reference > +value (i.e. the digest calculated in controlled conditions, when the > +component is released). > + > +IMA, a software part of the integrity subsystem, can perform such > +evaluation and execute different actions: > + > +- store the digest in an integrity-protected measurement list, so that it > + can be sent to a remote verifier for analysis; > +- compare the calculated digest with a reference value (usually protected > + with a signature) and deny operations if the file is found corrupted; > +- store the digest in the system log. > + > + > +Contribution > +------------ I would rename this chapter to "Benefits". > + > +DIGLIM further enhances the capabilities offered by IMA-based solutions > +and, at the same time, makes them more practical to adopt by reusing > +existing sources as reference values for integrity decisions. > + > +Possible sources for digest lists are: > + > +- RPM headers; > +- Debian repository metadata. > + > + > +Benefits for IMA Measurement > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > + > +One of the issues that arises when files are measured by the OS is that, > +due to parallel execution, the order in which file accesses happen cannot > +be predicted. Since the TPM Platform Configuration Register (PCR) extend > +operation, executed after each file measurement, cryptographically binds > +the current measurement to the previous ones, the PCR value at the end of a > +workload cannot be predicted too. > + > +Thus, even if the usage of a TPM key, bound to a PCR value, should be > +allowed when only good files were accessed, the TPM could unexpectedly deny > +an operation on that key if files accesses did not happen as stated by the > +key policy (which allows only one of the possible sequences). > + > +DIGLIM solves this issue by making the PCR value stable over the time and > +not dependent on file accesses. The following figure depicts the current > +and the new approaches: > + > +:: > + > + IMA measurement list (current) > + > + entry# 1st boot 2nd boot 3rd boot > + +----+---------------+ +----+---------------+ +----+---------------+ > + 1: | 10 | file1 measur. | | 10 | file3 measur. | | 10 | file2 measur. | > + +----+---------------+ +----+---------------+ +----+---------------+ > + 2: | 10 | file2 measur. | | 10 | file2 measur. | | 10 | file3 measur. | > + +----+---------------+ +----+---------------+ +----+---------------+ > + 3: | 10 | file3 measur. | | 10 | file1 measur. | | 10 | file4 measur. | > + +----+---------------+ +----+---------------+ +----+---------------+ > + > + PCR: Extend != Extend != Extend > + file1, file2, file3 file3, file2, file1 file2, file3, file4 > + > + > + PCR Extend definition: > + > + PCR(new value) = Hash(Hash(meas. entry), PCR(previous value)) > + > +A new entry in the measurement list is created by IMA for each file access. > +Assuming that ``file1``, ``file2`` and ``file3`` are files provided by the > +software vendor, ``file4`` is an unknown file, the first two PCR values > +above represent a good system state, the third a bad system state. The PCR > +values are the result of the PCR extend operation performed for each > +measurement entry with the digest of the measurement entry as an input. > + > +:: > + > + IMA measurement list (with DIGLIM) > + > + dlist > + +--------------+ > + | header | > + +--------------+ > + | file1 digest | > + | file2 digest | > + | file3 digest | > + +--------------+ > + > +``dlist`` is a digest list containing the digest of ``file1``, ``file2`` > +and ``file3``. In the intended scenario, it is generated by a software > +vendor at the end of the building process, and retrieved by the > +administrator of the system where the digest list is loaded. > + > +:: > + > + entry# 1st boot 2nd boot 3rd boot > + +----+---------------+ +----+---------------+ +----+---------------+ > + 0: | 11 | dlist measur. | | 11 | dlist measur. | | 11 | dlist measur. | > + +----+---------------+ +----+---------------+ +----+---------------+ > + 1: < file1 measur. skip > < file3 measur. skip > < file2 measur. skip > > + > + 2: < file2 measur. skip > < file2 measur. skip > < file3 measur. skip > > + +----+---------------+ > + 3: < file3 measur. skip > < file1 measur. skip > | 11 | file4 measur. | > + +----+---------------+ > + > + PCR: Extend = Extend != Extend > + dlist dlist dlist, file4 > + > + > +The first entry in the measurement list contains the digest of the digest > +list uploaded to the kernel at kernel initialization time. > + > +When a file is accessed, IMA queries DIGLIM with the calculated file digest > +and, if it is found, IMA skips the measurement. > + > +Thus, the only information sent to remote verifiers are: the list of > +files that could possibly be accessed (from the digest list), but not if > +they were accessed and when; the measurement of unknown files. > + > +Despite providing less information, this solution has the advantage that > +the good system state (i.e. when only ``file1``, ``file2`` and ``file3`` > +are accessed) now can be represented with a deterministic PCR value (the > +PCR is extended only with the measurement of the digest list). Also, the > +bad system state can still be distinguished from the good state (the PCR is > +extended also with the measurement of ``file4``). > + > +If a TPM key is bound to the good PCR value, the TPM would allow the key to > +be used if ``file1``, ``file2`` or ``file3`` are accessed, regardless of > +the sequence in which they are accessed (the PCR value does not change), > +and would revoke the permission when the unknown ``file4`` is accessed (the > +PCR value changes). If a system is able to establish a TLS connection with > +a peer, this implicitly means that the system was in a good state (i.e. > +``file4`` was not accessed, otherwise the TPM would have denied the usage > +of the TPM key due to the key policy). > + > + > +Benefits for IMA Appraisal > +~~~~~~~~~~~~~~~~~~~~~~~~~~ > + > +Extending secure boot to applications means being able to verify the > +provenance of files accessed. IMA does it by verifying file signatures with > +a key that it trusts, which requires Linux distribution vendors to > +additionally include in the package header a signature for each file that > +must be verified (there is the dedicated ``RPMTAG_FILESIGNATURES`` section > +in the RPM header). > + > +The proposed approach would be instead to verify data provenance from > +already available metadata (file digests) in existing packages. IMA would > +verify the signature of package metadata and search file digests extracted > +from package metadata and added to the hash table in the kernel. > + > +For RPMs, file digests can be found in the ``RPMTAG_FILEDIGESTS`` section > +of ``RPMTAG_IMMUTABLE``, whose signature is in ``RPMTAG_RSAHEADER``. For > +DEBs, file digests (unsafe to use due to a weak digest algorithm) can be > +found in the ``md5sum`` file, which can be indirectly verified from > +``Release.gpg``. > + > +The following figure highlights the differences between the current and the > +proposed approach. > + > +:: > + > + IMA appraisal (current solution, with file signatures): > + > + appraise > + +-----------+ > + V | > + +-------------------------+-----+ +-------+-----+ | > + | RPM header | | ima rpm | file1 | sig | | > + | ... | | plugin +-------+-----+ +-----+ > + | file1 sig [to be added] | sig |--------> ... | IMA | > + | ... | | +-------+-----+ +-----+ > + | fileN sig [to be added] | | | fileN | sig | > + +-------------------------+-----+ +-------+-----+ > + > +In this case, file signatures must be added to the RPM header, so that the > +``ima`` rpm plugin can extract them together with the file content. The RPM > +header signature is not used. > + > +:: > + > + IMA appraisal (with DIGLIM): > + > + kernel hash table > + with RPM header content > + +---+ +--------------+ > + | |--->| file1 digest | > + +---+ +--------------+ > + ... > + +---+ appraise (file1) > + | | <--------------+ > + +----------------+-----+ +---+ | > + | RPM header | | ^ | > + | ... | | digest_list | | > + | file1 digest | sig | rpm plugin | +-------+ +-----+ > + | ... | |-------------+--->| file1 | | IMA | > + | fileN digest | | +-------+ +-----+ > + +----------------+-----+ | > + ^ | > + +------------------------------------+ > + appraise (RPM header) > + > +In this case, the RPM header is used as it is, and its signature is used > +for IMA appraisal. Then, the ``digest_list`` rpm plugin executes the user > +space parser to parse the RPM header and add the extracted digests to an > +hash table in the kernel. IMA appraisal of the files in the RPM package > +consists in searching their digest in the hash table. > + > +Other than reusing available information as digest list, another advantage > +is the lower computational overhead compared to the solution with file > +signatures (only one signature verification for many files and digest > +lookup, instead of per file signature verification, see `Preliminary > +Performance Evaluation`_ for more details). > + > + > +Lifecycle > +--------- > + > +The lifecycle of DIGLIM is represented in the following figure: > + > +:: You could just use: The lifecycle of DIGLIM is represented in the following figure:: > + > + Vendor premises (release process with modifications): > + > + +------------+ +-----------------------+ +------------------------+ > + | 1. build a | | 2. generate and sign | | 3. publish the package | > + | package |-->| a digest list from |-->| and digest list in | > + | | | packaged files | | a repository | > + +------------+ +-----------------------+ +------------------------+ > + | > + | > + User premises: | > + V > + +---------------------+ +------------------------+ +-----------------+ > + | 6. use digest lists | | 5. download the digest | | 4. download and | > + | for measurement |<--| list and upload to |<--| install the | > + | and/or appraisal | | the kernel | | package | > + +---------------------+ +------------------------+ +-----------------+ > + > +The figure above represents all the steps when a digest list is > +generated separately. However, as mentioned in `Contribution`_, in most > +cases existing packages can be already used as a source for digest lists, > +limiting the effort for software vendors. > + > +If, for example, RPMs are used as a source for digest lists, the figure > +above becomes: > + > +:: Same here. > + > + Vendor premises (release process without modifications): > + > + +------------+ +------------------------+ > + | 1. build a | | 2. publish the package | > + | package |-->| in a repository |---------------------+ > + | | | | | > + +------------+ +------------------------+ | > + | > + | > + User premises: | > + V > + +---------------------+ +------------------------+ +-----------------+ > + | 5. use digest lists | | 4. extract digest list | | 3. download and | > + | for measurement |<--| from the package |<--| install the | > + | and/or appraisal | | and upload to the | | package | > + | | | kernel | | | > + +---------------------+ +------------------------+ +-----------------+ > + > +Step 4 can be performed with the ``digest_list`` rpm plugin and the user > +space parser, without changes to rpm itself. > + > + > +Security Assumptions > +-------------------- > + > +As mentioned in the `Introduction`_, DIGLIM will be primarily used in > +conjunction with IMA to enforce a mandatory policy on all user space > +processes, including those owned by root. Even root, in a system with a > +locked-down kernel, cannot affect the enforcement of the mandatory policy > +or, if changes are permitted, it cannot do so without being detected. > + > +Given that the target of the enforcement are user space processes, DIGLIM > +cannot be placed in the target, as a Mandatory Access Control (MAC) design > +is required to have the components responsible to enforce the mandatory > +policy separated from the target. > + > +While locking-down a system and limiting actions with a mandatory policy is > +generally perceived by users as an obstacle, it has noteworthy benefits for > +the users themselves. > + > +First, it would timely block attempts by malicious software to steal or > +misuse user assets. Although users could query the package managers to > +detect them, detection would happen after the fact, or it wouldn't happen > +at all if the malicious software tampered with package managers. With a > +mandatory policy enforced by the kernel, users would still be able to > +decide which software they want to be executed except that, unlike package > +managers, the kernel is not affected by user space processes or root. > + > +Second, it might make systems more easily verifiable from outside, due to > +the limited actions the system allows. When users connect to a server, not > +only they would be able to verify the server identity, which is already > +possible with communication protocols like TLS, but also if the software > +running on that server can be trusted to handle their sensitive data. > + > + > +Adoption > +-------- > + > +A former version of DIGLIM is used in the following OSes: > + > +- openEuler 20.09 > + https://github.com/openeuler-mirror/kernel/tree/openEuler-20.09 > + > +- openEuler 21.03 > + https://github.com/openeuler-mirror/kernel/tree/openEuler-21.03 > + > +Originally, DIGLIM was part of IMA (known as IMA Digest Lists). In this > +version, it has been redesigned as a standalone module with an API that > +makes its functionality accessible by IMA and, eventually, other > +subsystems. > + > +User Space Support > +------------------ > + > +Digest lists can be generated and managed with ``digest-list-tools``: > + > +https://github.com/openeuler-mirror/digest-list-tools > + > +It includes two main applications: > + > +- ``gen_digest_lists``: generates digest lists from files in the > + filesystem or from the RPM database (more digest list sources can be > + supported); > +- ``manage_digest_lists``: converts and uploads digest lists to the > + kernel. > + > +Integration with rpm is done with the ``digest_list`` plugin: > + > +https://gitee.com/src-openeuler/rpm/blob/master/Add-digest-list-plugin.patch > + > +This plugin writes the RPM header and its signature to a file, so that the > +file is ready to be appraised by IMA, and calls the user space parser to > +convert and upload the digest list to the kernel. > + > + > +Simple Usage Example (Tested with Fedora 33) > +-------------------------------------------- > + > +1. Digest list generation (RPM headers and their signature are copied to > + the specified directory): > + > +.. code-block:: bash > + > + # mkdir /etc/digest_lists > + # gen_digest_lists -t file -f rpm+db -d /etc/digest_lists -o add > + > +2. Digest list upload with the user space parser: > + > +.. code-block:: bash > + > + # manage_digest_lists -p add-digest -d /etc/digest_lists > + > +3. First digest list query: > + > +.. code-block:: bash > + > + # echo sha256-$(sha256sum /bin/cat) > /sys/kernel/security/integrity/diglim/digest_query > + # cat /sys/kernel/security/integrity/diglim/digest_query > + sha256-[...]-0-file_list-rpm-coreutils-8.32-18.fc33.x86_64 (actions: 0): version: 1, algo: sha256, type: 2, modifiers: 1, count: 106, datalen: 3392 > + > +4. Second digest list query: > + > +.. code-block:: bash > + > + # echo sha256-$(sha256sum /bin/zip) > /sys/kernel/security/integrity/diglim/digest_query > + # cat /sys/kernel/security/integrity/diglim/digest_query > + sha256-[...]-0-file_list-rpm-zip-3.0-27.fc33.x86_64 (actions: 0): version: 1, algo: sha256, type: 2, modifiers: 1, count: 4, datalen: 128 > + > + > +Preliminary Performance Evaluation > +---------------------------------- > + > +This section provides an initial estimation of the overhead introduced by > +DIGLIM. The estimation has been performed on a Fedora 33 virtual machine > +with 1447 packages installed. The virtual machine has 16 vCPU (host CPU: > +AMD Ryzen Threadripper PRO 3955WX 16-Cores) and 2G of RAM (host memory: > +64G). The virtual machine also has a vTPM with libtpms and swtpm as > +backend. > + > +After writing the RPM headers to files, the size of the directory > +containing them is 36M. > + > +After converting the RPM headers to the compact digest list, the size of > +the data being uploaded to the kernel is 3.6M. > + > +The time to load the entire RPM database is 0.628s. > + > +After loading the digest lists to the kernel, the slab usage due to > +indexing is (obtained with slab_nomerge in the kernel command line): > + > +:: > + > + OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME > + 118144 118144 100% 0,03K 923 128 3692K digest_list_item_ref_cache > + 102400 102400 100% 0,03K 800 128 3200K digest_item_cache > + 2646 2646 100% 0,09K 63 42 252K digest_list_item_cache > + > +The stats, obtained from the ``digests_count`` interface, introduced later, > +are: > + > +:: > + > + Parser digests: 0 > + File digests: 99100 > + Metadata digests: 0 > + Digest list digests: 1423 > + > +On this installation, this would be the worst case in which all files are > +measured and/or appraised, which is currently not recommended without > +enforcing an integrity policy protecting mutable files. Infoflow LSM is a > +component to accomplish this task: > + > +https://patchwork.kernel.org/project/linux-integrity/cover/20190818235745.1417-1-roberto.sassu@huawei.com/ > + > +The first manageable goal of IMA with DIGLIM is to use an execution policy, > +with measurement and/or appraisal of files executed or mapped in memory as > +executable (in addition to kernel modules and firmware). In this > +case, the digest list contains the digest only for those files. The numbers > +above change as follows. > + > +After converting the RPM headers to the compact digest list, the size of > +the data being uploaded to the kernel is 208K. > + > +The time to load the digest of binaries and shared libraries is 0.062s. > + > +After loading the digest lists to the kernel, the slab usage due to > +indexing is: > + > +:: > + > + OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME > + 7168 7168 100% 0,03K 56 128 224K digest_list_item_ref_cache > + 7168 7168 100% 0,03K 56 128 224K digest_item_cache > + 1134 1134 100% 0,09K 27 42 108K digest_list_item_cache > + > + > +The stats, obtained from the ``digests_count`` interface, are: > + > +:: > + > + Parser digests: 0 > + File digests: 5986 > + Metadata digests: 0 > + Digest list digests: 1104 > + > + > +Comparison with IMA > +~~~~~~~~~~~~~~~~~~~ > + > +This section compares the performance between the current solution for IMA > +measurement and appraisal, and IMA with DIGLIM. > + > + > +Workload A (without DIGLIM): > + > +#. cat file[0-5985] > /dev/null > + > + > +Workload B (with DIGLIM): > + > +#. echo $PWD/0-file_list-compact-file[0-1103] > <securityfs>/integrity/diglim/digest_list_add > +#. cat file[0-5985] > /dev/null > + > + > +Workload A execution time without IMA policy: > + > +:: > + > + real 0m0,155s > + user 0m0,008s > + sys 0m0,066s > + > + > +Measurement > +........... > + > +IMA policy: > + > +:: > + > + measure fowner=2000 func=FILE_CHECK mask=MAY_READ use_diglim=allow pcr=11 ima_template=ima-sig > + > +``use_diglim`` is a policy keyword not yet supported by IMA. > + > + > +Workload A execution time with IMA and 5986 files with signature measured: > + > +:: > + > + real 0m8,273s > + user 0m0,008s > + sys 0m2,537s > + > + > +Workload B execution time with IMA, 1104 digest lists with signature > +measured and uploaded to the kernel, and 5986 files with signature accessed > +but not measured (due to the file digest being found in the hash table): > + > +:: > + > + real 0m1,837s > + user 0m0,036s > + sys 0m0,583s > + > + > +Appraisal > +......... > + > +IMA policy: > + > +:: > + > + appraise fowner=2000 func=FILE_CHECK mask=MAY_READ use_diglim=allow > + > +``use_diglim`` is a policy keyword not yet supported by IMA. > + > + > +Workload A execution time with IMA and 5986 files with file signature > +appraised: > + > +:: > + > + real 0m2,197s > + user 0m0,011s > + sys 0m2,022s > + > + > +Workload B execution time with IMA, 1104 digest lists with signature > +appraised and uploaded to the kernel, and with 5986 files with signature > +not verified (due to the file digest being found in the hash table): > + > +:: > + > + real 0m0,982s > + user 0m0,020s > + sys 0m0,865s > diff --git a/Documentation/security/index.rst b/Documentation/security/index.rst > index 16335de04e8c..6c3aea41c55b 100644 > --- a/Documentation/security/index.rst > +++ b/Documentation/security/index.rst > @@ -17,3 +17,4 @@ Security Documentation > tpm/index > digsig > landlock > + diglim/index > diff --git a/MAINTAINERS b/MAINTAINERS > index 6c8be735cc91..c914dadd7e65 100644 > --- a/MAINTAINERS > +++ b/MAINTAINERS > @@ -5452,6 +5452,15 @@ L: linux-gpio@vger.kernel.org > S: Maintained > F: drivers/gpio/gpio-gpio-mm.c > > +DIGLIM > +M: Roberto Sassu <roberto.sassu@huawei.com> > +L: linux-integrity@vger.kernel.org > +S: Supported > +T: git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity.git > +F: Documentation/security/diglim/architecture.rst > +F: Documentation/security/diglim/index.rst > +F: Documentation/security/diglim/introduction.rst > + > DIOLAN U2C-12 I2C DRIVER > M: Guenter Roeck <linux@roeck-us.net> > L: linux-i2c@vger.kernel.org Thanks, Mauro
> From: Mauro Carvalho Chehab [mailto:mchehab+huawei@kernel.org] > Sent: Wednesday, July 28, 2021 1:10 PM > Em Mon, 26 Jul 2021 18:36:49 +0200 > Roberto Sassu <roberto.sassu@huawei.com> escreveu: > > > Add an overview of DIGLIM to > Documentation/security/diglim/introduction.rst > > and the architecture to Documentation/security/diglim/architecture.rst > > > > Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> > > --- > > .../security/diglim/architecture.rst | 45 ++ > > Documentation/security/diglim/index.rst | 11 + > > .../security/diglim/introduction.rst | 631 ++++++++++++++++++ > > Documentation/security/index.rst | 1 + > > MAINTAINERS | 9 + > > 5 files changed, 697 insertions(+) > > create mode 100644 Documentation/security/diglim/architecture.rst > > create mode 100644 Documentation/security/diglim/index.rst > > create mode 100644 Documentation/security/diglim/introduction.rst > > > > diff --git a/Documentation/security/diglim/architecture.rst > b/Documentation/security/diglim/architecture.rst > > new file mode 100644 > > index 000000000000..a54fe2453715 > > --- /dev/null > > +++ b/Documentation/security/diglim/architecture.rst > > @@ -0,0 +1,45 @@ > > +.. SPDX-License-Identifier: GPL-2.0 > > + > > +Architecture > > +============ > > + > > +This section introduces the high level architecture of DIGLIM. > > + > > +:: > > + > > + 5. add/delete from hash table and add refs to digest list > > + +---------------------------------------------+ > > + | +-----+ +-------------+ +--+ > > + | | key |-->| digest refs |-->...-->| | > > + V +-----+ +-------------+ +--+ > > + +-------------+ +-----+ +-------------+ > > + | digest list | | key |-->| digest refs | > > + | (compact) | +-----+ +-------------+ > > + +-------------+ +-----+ +-------------+ > > + ^ 4. copy to | key |-->| digest refs | > > + | kernel memory +-----+ +-------------+ kernel space > > + -------------------------------------------------------------------------- > > + ^ ^ user space > > + |<----------------+ 3b. upload | > > + +-------------+ +------------+ | 6. query digest > > + | digest list | | user space | 2b. convert > > + | (compact) | | parser | > > + +-------------+ +------------+ > > + 1a. upload ^ 1b. read > > + | > > + +------------+ > > + | RPM header | > > + +------------+ > > + > > + > > +As mentioned before, digest lists can be uploaded directly if they are in > > "before"? This is at the beginning of this document ;-) > > You should probably add a reference to introduction.rst here, like: > > As mentioned at Documentation/security/diglim/introduction.rst, ... Hi Mauro ok. > > +the compact format (step 1a) or can be uploaded indirectly by the user > > +space parser if they are in an alternative format (steps 1b-3b). > > + > > +During upload, the kernel makes a copy of the digest list to the kernel > > +memory (step 4), and creates the necessary structures to index the digests > > +(hash table and a linked list of digest list references to locate the > > +digests in the digest list) (step 5). > > + > > +Finally, digests can be searched from user space through a securityfs file > > +(step 6) or by the kernel itself. > > This probably applies to Documentation/security as a hole, but the > best is to split the documents on two separate parts: > - the kAPI and internals; > - the admin-guide part. > > The audience for the admin-guide is distribution pagagers and > syssadmins. Ok. I will create an admin-guide. > > diff --git a/Documentation/security/diglim/index.rst > b/Documentation/security/diglim/index.rst > > new file mode 100644 > > index 000000000000..0fc5ab019bc0 > > --- /dev/null > > +++ b/Documentation/security/diglim/index.rst > > @@ -0,0 +1,11 @@ > > +.. SPDX-License-Identifier: GPL-2.0 > > + > > +====================================== > > +Digest Lists Integrity Module (DIGLIM) > > +====================================== > > + > > +.. toctree:: > > + :maxdepth: 1 > > + > > + introduction > > + architecture > > diff --git a/Documentation/security/diglim/introduction.rst > b/Documentation/security/diglim/introduction.rst > > new file mode 100644 > > index 000000000000..d8d8b2a17222 > > --- /dev/null > > +++ b/Documentation/security/diglim/introduction.rst > > @@ -0,0 +1,631 @@ > > +.. SPDX-License-Identifier: GPL-2.0 > > + > > +Introduction > > +============ > > + > > +Digest Lists Integrity Module (DIGLIM) is a new component added to the > > +integrity subsystem in the kernel, primarily aiming to aid Integrity > > I would replace: > > "is a new component added to" -> "is a component of" > > As this is the kind of text that tends to be outdated with time... > Imagine someone reading this paragraph maybe 10 years in the future ;-) Ok. > > +Measurement Architecture (IMA) in the process of checking the integrity of > > +file content and metadata. It accomplishes this task by storing reference > > +values coming from software vendors and by reporting whether or not the > > +digest of file content or metadata calculated by IMA (or EVM) is found > > +among those values. In this way, IMA can decide, depending on the result > of > > +a query, if a measurement should be taken or access to the file should be > > +granted. The `Security Assumptions`_ section explains more in detail why > > +this component has been placed in the kernel. > > + > > +The main benefits of using IMA in conjunction with DIGLIM are the ability > > +to implement advanced remote attestation schemes based on the usage of > a > > +TPM key for establishing a TLS secure channel [1][2], and to reduce the > > +burden on Linux distribution vendors to extend secure boot at OS level to > > +applications. > > + > > +DIGLIM does not have the complexity of feature-rich databases. In fact, its > > +main functionality comes from the hash table primitives already in the > > +kernel. It does not have an ad-hoc storage module, it just indexes data in > > +a fixed format (digest lists, a set of concatenated digests preceded by a > > +header), copied to kernel memory as they are. Lastly, it does not support > > +database-oriented languages such as SQL, but only accepts a digest and its > > +algorithm as a query. > > + > > +The only digest list format supported by DIGLIM is called ``compact``. > > +However, Linux distribution vendors don't have to generate new digest lists > > +in this format for the packages they release, as already available > > +information, such as RPM headers and DEB package metadata, can be > already > > +used as a source for reference values (they already include file digests), > > -ETOMANY_already > > as "already" available... can be "already" ... "already" include... > > I would simplify the above text removing such redundancy. Ok. > > +with a user space parser taking care of the conversion to the compact > > +format. > > + > > +Although one might perceive that storing file or metadata digests for a > > +Linux distribution would significantly increase the memory usage, this does > > +not seem to be the case. As an anticipation of the evaluation done in the > > +`Preliminary Performance Evaluation`_ section, protecting binaries and > > +shared libraries of a minimal Fedora 33 installation requires 208K of > > +memory for the digest lists plus 556K for indexing. > > + > > > > +In exchange for a slightly increased memory usage, DIGLIM improves the > > +performance of the integrity subsystem. In the considered scenario, IMA > > +measurement and appraisal with digest lists requires respectively less than > > +one quarter and less than half the time, compared to the current solution. > > I found this paragraph a little bit confusing to understand. Could you > please improve the description? > > I mean: > > what improved by one quarter? > what improved by "less than half of the time"? Ok. I didn't want to make the text too heavy. The tests are described in the Preliminary Performance Evaluation section. > > + > > +DIGLIM also keeps track of whether digest lists have been processed in > some > > +way (e.g. measured or appraised by IMA). This is important for example for > > +remote attestation, so that remote verifiers understand what has been > > +uploaded to the kernel. > > + > > > +DIGLIM behaves like a transactional database, i.e. it has the ability to > > +roll back to the beginning of the transaction if an error occurred during > > +the addition of a digest list (the deletion operation always succeeds). > > I don't think it makes sense to compare it with a transactional database. > > I would say, instead, something like: > > The inserts on DIGLIM are atomic: if an error occurs during the > addition > of a digest list, it rolls back the entire insert operation. Ok, better. > > +This capability has been tested with an ad-hoc fault injection mechanism > > +capable of simulating failures during the operations. > > + > > +Finally, DIGLIM exposes to user space, through securityfs, the digest lists > > +currently loaded, the number of digests added, a query interface and an > > +interface to set digest list labels. > > + > > +[1] LSS EU 2019 > > + > > +- slides: > > + > https://static.sched.com/hosted_files/lsseu2019/bd/secure_attested_commu > nication_channels_lss_eu_2019.pdf > > +- video: https://youtu.be/mffdQgkvDNY > > + > > +[2] FutureTPM EU project, final review meeting demo > > + > > +- slides: > > + https://futuretpm.eu/images/07-3-FutureTPM-Final-Review-Slides-WP6- > Device-Management-Use-Case-HWDU.pdf > > +- video: https://vimeo.com/528251864/4c1d55abcd > > The above won't generate any cross-references with Sphinx. > > For it correct syntax, see: > https://www.sphinx- > doc.org/en/master/usage/restructuredtext/basics.html#citations Ok, will fix it. > > + > > + > > +Binary Integrity > > +---------------- > > + > > +Integrity is a fundamental security property in information systems. > > > +Integrity could be described as the condition in which a generic > > +component is just after it has been released by the entity that created it. > > Sounds a weird description for me. (ISC)2 defines integrity on its > glossary[1] as: > > "Guarding against improper information modification or destruction > and > includes ensuring information non-repudiation and authenticity." > > [1] https://www.isc2.org/Certifications/CISSP/CISSP-Student-Glossary Ok, I meant integrity in the context of trusted computing. https://trustedcomputinggroup.org/wp-content/uploads/IWG_ArchitecturePartII_v1.0.pdf In general the term "integrity" is used to denote the pristine state of a component (page 13). > > +One way to check whether a component is in this condition (called binary > > +integrity) is to calculate its digest and to compare it with a reference > > +value (i.e. the digest calculated in controlled conditions, when the > > +component is released). > > + > > +IMA, a software part of the integrity subsystem, can perform such > > +evaluation and execute different actions: > > + > > +- store the digest in an integrity-protected measurement list, so that it > > + can be sent to a remote verifier for analysis; > > +- compare the calculated digest with a reference value (usually protected > > + with a signature) and deny operations if the file is found corrupted; > > +- store the digest in the system log. > > + > > + > > > > +Contribution > > +------------ > > I would rename this chapter to "Benefits". Ok. > > + > > +DIGLIM further enhances the capabilities offered by IMA-based solutions > > +and, at the same time, makes them more practical to adopt by reusing > > +existing sources as reference values for integrity decisions. > > + > > +Possible sources for digest lists are: > > + > > +- RPM headers; > > +- Debian repository metadata. > > + > > + > > +Benefits for IMA Measurement > > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > + > > +One of the issues that arises when files are measured by the OS is that, > > +due to parallel execution, the order in which file accesses happen cannot > > +be predicted. Since the TPM Platform Configuration Register (PCR) extend > > +operation, executed after each file measurement, cryptographically binds > > +the current measurement to the previous ones, the PCR value at the end of > a > > +workload cannot be predicted too. > > + > > +Thus, even if the usage of a TPM key, bound to a PCR value, should be > > +allowed when only good files were accessed, the TPM could unexpectedly > deny > > +an operation on that key if files accesses did not happen as stated by the > > +key policy (which allows only one of the possible sequences). > > + > > +DIGLIM solves this issue by making the PCR value stable over the time and > > +not dependent on file accesses. The following figure depicts the current > > +and the new approaches: > > + > > +:: > > + > > + IMA measurement list (current) > > + > > + entry# 1st boot 2nd boot 3rd boot > > + +----+---------------+ +----+---------------+ +----+---------------+ > > + 1: | 10 | file1 measur. | | 10 | file3 measur. | | 10 | file2 measur. | > > + +----+---------------+ +----+---------------+ +----+---------------+ > > + 2: | 10 | file2 measur. | | 10 | file2 measur. | | 10 | file3 measur. | > > + +----+---------------+ +----+---------------+ +----+---------------+ > > + 3: | 10 | file3 measur. | | 10 | file1 measur. | | 10 | file4 measur. | > > + +----+---------------+ +----+---------------+ +----+---------------+ > > + > > + PCR: Extend != Extend != Extend > > + file1, file2, file3 file3, file2, file1 file2, file3, file4 > > + > > + > > + PCR Extend definition: > > + > > + PCR(new value) = Hash(Hash(meas. entry), PCR(previous value)) > > + > > +A new entry in the measurement list is created by IMA for each file access. > > +Assuming that ``file1``, ``file2`` and ``file3`` are files provided by the > > +software vendor, ``file4`` is an unknown file, the first two PCR values > > +above represent a good system state, the third a bad system state. The PCR > > +values are the result of the PCR extend operation performed for each > > +measurement entry with the digest of the measurement entry as an input. > > + > > +:: > > + > > + IMA measurement list (with DIGLIM) > > + > > + dlist > > + +--------------+ > > + | header | > > + +--------------+ > > + | file1 digest | > > + | file2 digest | > > + | file3 digest | > > + +--------------+ > > + > > +``dlist`` is a digest list containing the digest of ``file1``, ``file2`` > > +and ``file3``. In the intended scenario, it is generated by a software > > +vendor at the end of the building process, and retrieved by the > > +administrator of the system where the digest list is loaded. > > + > > +:: > > + > > + entry# 1st boot 2nd boot 3rd boot > > + +----+---------------+ +----+---------------+ +----+---------------+ > > + 0: | 11 | dlist measur. | | 11 | dlist measur. | | 11 | dlist measur. | > > + +----+---------------+ +----+---------------+ +----+---------------+ > > + 1: < file1 measur. skip > < file3 measur. skip > < file2 measur. skip > > > + > > + 2: < file2 measur. skip > < file2 measur. skip > < file3 measur. skip > > > + +----+---------------+ > > + 3: < file3 measur. skip > < file1 measur. skip > | 11 | file4 measur. | > > + +----+---------------+ > > + > > + PCR: Extend = Extend != Extend > > + dlist dlist dlist, file4 > > + > > + > > +The first entry in the measurement list contains the digest of the digest > > +list uploaded to the kernel at kernel initialization time. > > + > > +When a file is accessed, IMA queries DIGLIM with the calculated file digest > > +and, if it is found, IMA skips the measurement. > > + > > +Thus, the only information sent to remote verifiers are: the list of > > +files that could possibly be accessed (from the digest list), but not if > > +they were accessed and when; the measurement of unknown files. > > + > > +Despite providing less information, this solution has the advantage that > > +the good system state (i.e. when only ``file1``, ``file2`` and ``file3`` > > +are accessed) now can be represented with a deterministic PCR value (the > > +PCR is extended only with the measurement of the digest list). Also, the > > +bad system state can still be distinguished from the good state (the PCR is > > +extended also with the measurement of ``file4``). > > + > > +If a TPM key is bound to the good PCR value, the TPM would allow the key > to > > +be used if ``file1``, ``file2`` or ``file3`` are accessed, regardless of > > +the sequence in which they are accessed (the PCR value does not change), > > +and would revoke the permission when the unknown ``file4`` is accessed > (the > > +PCR value changes). If a system is able to establish a TLS connection with > > +a peer, this implicitly means that the system was in a good state (i.e. > > +``file4`` was not accessed, otherwise the TPM would have denied the usage > > +of the TPM key due to the key policy). > > + > > + > > +Benefits for IMA Appraisal > > +~~~~~~~~~~~~~~~~~~~~~~~~~~ > > + > > +Extending secure boot to applications means being able to verify the > > +provenance of files accessed. IMA does it by verifying file signatures with > > +a key that it trusts, which requires Linux distribution vendors to > > +additionally include in the package header a signature for each file that > > +must be verified (there is the dedicated ``RPMTAG_FILESIGNATURES`` > section > > +in the RPM header). > > + > > +The proposed approach would be instead to verify data provenance from > > +already available metadata (file digests) in existing packages. IMA would > > +verify the signature of package metadata and search file digests extracted > > +from package metadata and added to the hash table in the kernel. > > + > > +For RPMs, file digests can be found in the ``RPMTAG_FILEDIGESTS`` section > > +of ``RPMTAG_IMMUTABLE``, whose signature is in > ``RPMTAG_RSAHEADER``. For > > +DEBs, file digests (unsafe to use due to a weak digest algorithm) can be > > +found in the ``md5sum`` file, which can be indirectly verified from > > +``Release.gpg``. > > + > > +The following figure highlights the differences between the current and the > > +proposed approach. > > + > > +:: > > + > > + IMA appraisal (current solution, with file signatures): > > + > > + appraise > > + +-----------+ > > + V | > > + +-------------------------+-----+ +-------+-----+ | > > + | RPM header | | ima rpm | file1 | sig | | > > + | ... | | plugin +-------+-----+ +-----+ > > + | file1 sig [to be added] | sig |--------> ... | IMA | > > + | ... | | +-------+-----+ +-----+ > > + | fileN sig [to be added] | | | fileN | sig | > > + +-------------------------+-----+ +-------+-----+ > > + > > +In this case, file signatures must be added to the RPM header, so that the > > +``ima`` rpm plugin can extract them together with the file content. The > RPM > > +header signature is not used. > > + > > +:: > > + > > + IMA appraisal (with DIGLIM): > > + > > + kernel hash table > > + with RPM header content > > + +---+ +--------------+ > > + | |--->| file1 digest | > > + +---+ +--------------+ > > + ... > > + +---+ appraise (file1) > > + | | <--------------+ > > + +----------------+-----+ +---+ | > > + | RPM header | | ^ | > > + | ... | | digest_list | | > > + | file1 digest | sig | rpm plugin | +-------+ +-----+ > > + | ... | |-------------+--->| file1 | | IMA | > > + | fileN digest | | +-------+ +-----+ > > + +----------------+-----+ | > > + ^ | > > + +------------------------------------+ > > + appraise (RPM header) > > + > > +In this case, the RPM header is used as it is, and its signature is used > > +for IMA appraisal. Then, the ``digest_list`` rpm plugin executes the user > > +space parser to parse the RPM header and add the extracted digests to an > > +hash table in the kernel. IMA appraisal of the files in the RPM package > > +consists in searching their digest in the hash table. > > + > > +Other than reusing available information as digest list, another advantage > > +is the lower computational overhead compared to the solution with file > > +signatures (only one signature verification for many files and digest > > +lookup, instead of per file signature verification, see `Preliminary > > +Performance Evaluation`_ for more details). > > + > > + > > +Lifecycle > > +--------- > > + > > +The lifecycle of DIGLIM is represented in the following figure: > > + > > +:: > > You could just use: > > The lifecycle of DIGLIM is represented in the following figure:: > > > + > > + Vendor premises (release process with modifications): > > + > > + +------------+ +-----------------------+ +------------------------+ > > + | 1. build a | | 2. generate and sign | | 3. publish the package | > > + | package |-->| a digest list from |-->| and digest list in | > > + | | | packaged files | | a repository | > > + +------------+ +-----------------------+ +------------------------+ > > + | > > + | > > + User premises: | > > + V > > + +---------------------+ +------------------------+ +-----------------+ > > + | 6. use digest lists | | 5. download the digest | | 4. download and | > > + | for measurement |<--| list and upload to |<--| install the | > > + | and/or appraisal | | the kernel | | package | > > + +---------------------+ +------------------------+ +-----------------+ > > + > > +The figure above represents all the steps when a digest list is > > +generated separately. However, as mentioned in `Contribution`_, in most > > +cases existing packages can be already used as a source for digest lists, > > +limiting the effort for software vendors. > > + > > +If, for example, RPMs are used as a source for digest lists, the figure > > +above becomes: > > + > > +:: > > Same here. Ok. Thanks Roberto HUAWEI TECHNOLOGIES Duesseldorf GmbH, HRB 56063 Managing Director: Li Peng, Li Jian, Shi Yanli > > + > > + Vendor premises (release process without modifications): > > + > > + +------------+ +------------------------+ > > + | 1. build a | | 2. publish the package | > > + | package |-->| in a repository |---------------------+ > > + | | | | | > > + +------------+ +------------------------+ | > > + | > > + | > > + User premises: | > > + V > > + +---------------------+ +------------------------+ +-----------------+ > > + | 5. use digest lists | | 4. extract digest list | | 3. download and | > > + | for measurement |<--| from the package |<--| install the | > > + | and/or appraisal | | and upload to the | | package | > > + | | | kernel | | | > > + +---------------------+ +------------------------+ +-----------------+ > > + > > +Step 4 can be performed with the ``digest_list`` rpm plugin and the user > > +space parser, without changes to rpm itself. > > + > > + > > +Security Assumptions > > +-------------------- > > + > > +As mentioned in the `Introduction`_, DIGLIM will be primarily used in > > +conjunction with IMA to enforce a mandatory policy on all user space > > +processes, including those owned by root. Even root, in a system with a > > +locked-down kernel, cannot affect the enforcement of the mandatory > policy > > +or, if changes are permitted, it cannot do so without being detected. > > + > > +Given that the target of the enforcement are user space processes, DIGLIM > > +cannot be placed in the target, as a Mandatory Access Control (MAC) > design > > +is required to have the components responsible to enforce the mandatory > > +policy separated from the target. > > + > > +While locking-down a system and limiting actions with a mandatory policy > is > > +generally perceived by users as an obstacle, it has noteworthy benefits for > > +the users themselves. > > + > > +First, it would timely block attempts by malicious software to steal or > > +misuse user assets. Although users could query the package managers to > > +detect them, detection would happen after the fact, or it wouldn't happen > > +at all if the malicious software tampered with package managers. With a > > +mandatory policy enforced by the kernel, users would still be able to > > +decide which software they want to be executed except that, unlike > package > > +managers, the kernel is not affected by user space processes or root. > > + > > +Second, it might make systems more easily verifiable from outside, due to > > +the limited actions the system allows. When users connect to a server, not > > +only they would be able to verify the server identity, which is already > > +possible with communication protocols like TLS, but also if the software > > +running on that server can be trusted to handle their sensitive data. > > + > > + > > +Adoption > > +-------- > > + > > +A former version of DIGLIM is used in the following OSes: > > + > > +- openEuler 20.09 > > + https://github.com/openeuler-mirror/kernel/tree/openEuler-20.09 > > + > > +- openEuler 21.03 > > + https://github.com/openeuler-mirror/kernel/tree/openEuler-21.03 > > + > > +Originally, DIGLIM was part of IMA (known as IMA Digest Lists). In this > > +version, it has been redesigned as a standalone module with an API that > > +makes its functionality accessible by IMA and, eventually, other > > +subsystems. > > + > > +User Space Support > > +------------------ > > + > > +Digest lists can be generated and managed with ``digest-list-tools``: > > + > > +https://github.com/openeuler-mirror/digest-list-tools > > + > > +It includes two main applications: > > + > > +- ``gen_digest_lists``: generates digest lists from files in the > > + filesystem or from the RPM database (more digest list sources can be > > + supported); > > +- ``manage_digest_lists``: converts and uploads digest lists to the > > + kernel. > > + > > +Integration with rpm is done with the ``digest_list`` plugin: > > + > > +https://gitee.com/src-openeuler/rpm/blob/master/Add-digest-list- > plugin.patch > > + > > +This plugin writes the RPM header and its signature to a file, so that the > > +file is ready to be appraised by IMA, and calls the user space parser to > > +convert and upload the digest list to the kernel. > > + > > + > > +Simple Usage Example (Tested with Fedora 33) > > +-------------------------------------------- > > + > > +1. Digest list generation (RPM headers and their signature are copied to > > + the specified directory): > > + > > +.. code-block:: bash > > + > > + # mkdir /etc/digest_lists > > + # gen_digest_lists -t file -f rpm+db -d /etc/digest_lists -o add > > + > > +2. Digest list upload with the user space parser: > > + > > +.. code-block:: bash > > + > > + # manage_digest_lists -p add-digest -d /etc/digest_lists > > + > > +3. First digest list query: > > + > > +.. code-block:: bash > > + > > + # echo sha256-$(sha256sum /bin/cat) > > /sys/kernel/security/integrity/diglim/digest_query > > + # cat /sys/kernel/security/integrity/diglim/digest_query > > + sha256-[...]-0-file_list-rpm-coreutils-8.32-18.fc33.x86_64 (actions: 0): > version: 1, algo: sha256, type: 2, modifiers: 1, count: 106, datalen: 3392 > > + > > +4. Second digest list query: > > + > > +.. code-block:: bash > > + > > + # echo sha256-$(sha256sum /bin/zip) > > /sys/kernel/security/integrity/diglim/digest_query > > + # cat /sys/kernel/security/integrity/diglim/digest_query > > + sha256-[...]-0-file_list-rpm-zip-3.0-27.fc33.x86_64 (actions: 0): version: 1, > algo: sha256, type: 2, modifiers: 1, count: 4, datalen: 128 > > + > > + > > +Preliminary Performance Evaluation > > +---------------------------------- > > + > > +This section provides an initial estimation of the overhead introduced by > > +DIGLIM. The estimation has been performed on a Fedora 33 virtual machine > > +with 1447 packages installed. The virtual machine has 16 vCPU (host CPU: > > +AMD Ryzen Threadripper PRO 3955WX 16-Cores) and 2G of RAM (host > memory: > > +64G). The virtual machine also has a vTPM with libtpms and swtpm as > > +backend. > > + > > +After writing the RPM headers to files, the size of the directory > > +containing them is 36M. > > + > > +After converting the RPM headers to the compact digest list, the size of > > +the data being uploaded to the kernel is 3.6M. > > + > > +The time to load the entire RPM database is 0.628s. > > + > > +After loading the digest lists to the kernel, the slab usage due to > > +indexing is (obtained with slab_nomerge in the kernel command line): > > + > > +:: > > + > > + OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME > > + 118144 118144 100% 0,03K 923 128 3692K > digest_list_item_ref_cache > > + 102400 102400 100% 0,03K 800 128 3200K digest_item_cache > > + 2646 2646 100% 0,09K 63 42 252K digest_list_item_cache > > + > > +The stats, obtained from the ``digests_count`` interface, introduced later, > > +are: > > + > > +:: > > + > > + Parser digests: 0 > > + File digests: 99100 > > + Metadata digests: 0 > > + Digest list digests: 1423 > > + > > +On this installation, this would be the worst case in which all files are > > +measured and/or appraised, which is currently not recommended without > > +enforcing an integrity policy protecting mutable files. Infoflow LSM is a > > +component to accomplish this task: > > + > > +https://patchwork.kernel.org/project/linux- > integrity/cover/20190818235745.1417-1-roberto.sassu@huawei.com/ > > + > > +The first manageable goal of IMA with DIGLIM is to use an execution policy, > > +with measurement and/or appraisal of files executed or mapped in memory > as > > +executable (in addition to kernel modules and firmware). In this > > +case, the digest list contains the digest only for those files. The numbers > > +above change as follows. > > + > > +After converting the RPM headers to the compact digest list, the size of > > +the data being uploaded to the kernel is 208K. > > + > > +The time to load the digest of binaries and shared libraries is 0.062s. > > + > > +After loading the digest lists to the kernel, the slab usage due to > > +indexing is: > > + > > +:: > > + > > + OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME > > + 7168 7168 100% 0,03K 56 128 224K digest_list_item_ref_cache > > + 7168 7168 100% 0,03K 56 128 224K digest_item_cache > > + 1134 1134 100% 0,09K 27 42 108K digest_list_item_cache > > + > > + > > +The stats, obtained from the ``digests_count`` interface, are: > > + > > +:: > > + > > + Parser digests: 0 > > + File digests: 5986 > > + Metadata digests: 0 > > + Digest list digests: 1104 > > + > > + > > +Comparison with IMA > > +~~~~~~~~~~~~~~~~~~~ > > + > > +This section compares the performance between the current solution for > IMA > > +measurement and appraisal, and IMA with DIGLIM. > > + > > + > > +Workload A (without DIGLIM): > > + > > +#. cat file[0-5985] > /dev/null > > + > > + > > +Workload B (with DIGLIM): > > + > > +#. echo $PWD/0-file_list-compact-file[0-1103] > > <securityfs>/integrity/diglim/digest_list_add > > +#. cat file[0-5985] > /dev/null > > + > > + > > +Workload A execution time without IMA policy: > > + > > +:: > > + > > + real 0m0,155s > > + user 0m0,008s > > + sys 0m0,066s > > + > > + > > +Measurement > > +........... > > + > > +IMA policy: > > + > > +:: > > + > > + measure fowner=2000 func=FILE_CHECK mask=MAY_READ > use_diglim=allow pcr=11 ima_template=ima-sig > > + > > +``use_diglim`` is a policy keyword not yet supported by IMA. > > + > > + > > +Workload A execution time with IMA and 5986 files with signature > measured: > > + > > +:: > > + > > + real 0m8,273s > > + user 0m0,008s > > + sys 0m2,537s > > + > > + > > +Workload B execution time with IMA, 1104 digest lists with signature > > +measured and uploaded to the kernel, and 5986 files with signature > accessed > > +but not measured (due to the file digest being found in the hash table): > > + > > +:: > > + > > + real 0m1,837s > > + user 0m0,036s > > + sys 0m0,583s > > + > > + > > +Appraisal > > +......... > > + > > +IMA policy: > > + > > +:: > > + > > + appraise fowner=2000 func=FILE_CHECK mask=MAY_READ > use_diglim=allow > > + > > +``use_diglim`` is a policy keyword not yet supported by IMA. > > + > > + > > +Workload A execution time with IMA and 5986 files with file signature > > +appraised: > > + > > +:: > > + > > + real 0m2,197s > > + user 0m0,011s > > + sys 0m2,022s > > + > > + > > +Workload B execution time with IMA, 1104 digest lists with signature > > +appraised and uploaded to the kernel, and with 5986 files with signature > > +not verified (due to the file digest being found in the hash table): > > + > > +:: > > + > > + real 0m0,982s > > + user 0m0,020s > > + sys 0m0,865s > > diff --git a/Documentation/security/index.rst > b/Documentation/security/index.rst > > index 16335de04e8c..6c3aea41c55b 100644 > > --- a/Documentation/security/index.rst > > +++ b/Documentation/security/index.rst > > @@ -17,3 +17,4 @@ Security Documentation > > tpm/index > > digsig > > landlock > > + diglim/index > > diff --git a/MAINTAINERS b/MAINTAINERS > > index 6c8be735cc91..c914dadd7e65 100644 > > --- a/MAINTAINERS > > +++ b/MAINTAINERS > > @@ -5452,6 +5452,15 @@ L: linux-gpio@vger.kernel.org > > S: Maintained > > F: drivers/gpio/gpio-gpio-mm.c > > > > +DIGLIM > > +M: Roberto Sassu <roberto.sassu@huawei.com> > > +L: linux-integrity@vger.kernel.org > > +S: Supported > > +T: git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity.git > > +F: Documentation/security/diglim/architecture.rst > > +F: Documentation/security/diglim/index.rst > > +F: Documentation/security/diglim/introduction.rst > > + > > DIOLAN U2C-12 I2C DRIVER > > M: Guenter Roeck <linux@roeck-us.net> > > L: linux-i2c@vger.kernel.org > > > > Thanks, > Mauro
diff --git a/Documentation/security/diglim/architecture.rst b/Documentation/security/diglim/architecture.rst new file mode 100644 index 000000000000..a54fe2453715 --- /dev/null +++ b/Documentation/security/diglim/architecture.rst @@ -0,0 +1,45 @@ +.. SPDX-License-Identifier: GPL-2.0 + +Architecture +============ + +This section introduces the high level architecture of DIGLIM. + +:: + + 5. add/delete from hash table and add refs to digest list + +---------------------------------------------+ + | +-----+ +-------------+ +--+ + | | key |-->| digest refs |-->...-->| | + V +-----+ +-------------+ +--+ + +-------------+ +-----+ +-------------+ + | digest list | | key |-->| digest refs | + | (compact) | +-----+ +-------------+ + +-------------+ +-----+ +-------------+ + ^ 4. copy to | key |-->| digest refs | + | kernel memory +-----+ +-------------+ kernel space + -------------------------------------------------------------------------- + ^ ^ user space + |<----------------+ 3b. upload | + +-------------+ +------------+ | 6. query digest + | digest list | | user space | 2b. convert + | (compact) | | parser | + +-------------+ +------------+ + 1a. upload ^ 1b. read + | + +------------+ + | RPM header | + +------------+ + + +As mentioned before, digest lists can be uploaded directly if they are in +the compact format (step 1a) or can be uploaded indirectly by the user +space parser if they are in an alternative format (steps 1b-3b). + +During upload, the kernel makes a copy of the digest list to the kernel +memory (step 4), and creates the necessary structures to index the digests +(hash table and a linked list of digest list references to locate the +digests in the digest list) (step 5). + +Finally, digests can be searched from user space through a securityfs file +(step 6) or by the kernel itself. diff --git a/Documentation/security/diglim/index.rst b/Documentation/security/diglim/index.rst new file mode 100644 index 000000000000..0fc5ab019bc0 --- /dev/null +++ b/Documentation/security/diglim/index.rst @@ -0,0 +1,11 @@ +.. SPDX-License-Identifier: GPL-2.0 + +====================================== +Digest Lists Integrity Module (DIGLIM) +====================================== + +.. toctree:: + :maxdepth: 1 + + introduction + architecture diff --git a/Documentation/security/diglim/introduction.rst b/Documentation/security/diglim/introduction.rst new file mode 100644 index 000000000000..d8d8b2a17222 --- /dev/null +++ b/Documentation/security/diglim/introduction.rst @@ -0,0 +1,631 @@ +.. SPDX-License-Identifier: GPL-2.0 + +Introduction +============ + +Digest Lists Integrity Module (DIGLIM) is a new component added to the +integrity subsystem in the kernel, primarily aiming to aid Integrity +Measurement Architecture (IMA) in the process of checking the integrity of +file content and metadata. It accomplishes this task by storing reference +values coming from software vendors and by reporting whether or not the +digest of file content or metadata calculated by IMA (or EVM) is found +among those values. In this way, IMA can decide, depending on the result of +a query, if a measurement should be taken or access to the file should be +granted. The `Security Assumptions`_ section explains more in detail why +this component has been placed in the kernel. + +The main benefits of using IMA in conjunction with DIGLIM are the ability +to implement advanced remote attestation schemes based on the usage of a +TPM key for establishing a TLS secure channel [1][2], and to reduce the +burden on Linux distribution vendors to extend secure boot at OS level to +applications. + +DIGLIM does not have the complexity of feature-rich databases. In fact, its +main functionality comes from the hash table primitives already in the +kernel. It does not have an ad-hoc storage module, it just indexes data in +a fixed format (digest lists, a set of concatenated digests preceded by a +header), copied to kernel memory as they are. Lastly, it does not support +database-oriented languages such as SQL, but only accepts a digest and its +algorithm as a query. + +The only digest list format supported by DIGLIM is called ``compact``. +However, Linux distribution vendors don't have to generate new digest lists +in this format for the packages they release, as already available +information, such as RPM headers and DEB package metadata, can be already +used as a source for reference values (they already include file digests), +with a user space parser taking care of the conversion to the compact +format. + +Although one might perceive that storing file or metadata digests for a +Linux distribution would significantly increase the memory usage, this does +not seem to be the case. As an anticipation of the evaluation done in the +`Preliminary Performance Evaluation`_ section, protecting binaries and +shared libraries of a minimal Fedora 33 installation requires 208K of +memory for the digest lists plus 556K for indexing. + +In exchange for a slightly increased memory usage, DIGLIM improves the +performance of the integrity subsystem. In the considered scenario, IMA +measurement and appraisal with digest lists requires respectively less than +one quarter and less than half the time, compared to the current solution. + +DIGLIM also keeps track of whether digest lists have been processed in some +way (e.g. measured or appraised by IMA). This is important for example for +remote attestation, so that remote verifiers understand what has been +uploaded to the kernel. + +DIGLIM behaves like a transactional database, i.e. it has the ability to +roll back to the beginning of the transaction if an error occurred during +the addition of a digest list (the deletion operation always succeeds). +This capability has been tested with an ad-hoc fault injection mechanism +capable of simulating failures during the operations. + +Finally, DIGLIM exposes to user space, through securityfs, the digest lists +currently loaded, the number of digests added, a query interface and an +interface to set digest list labels. + +[1] LSS EU 2019 + +- slides: + https://static.sched.com/hosted_files/lsseu2019/bd/secure_attested_communication_channels_lss_eu_2019.pdf +- video: https://youtu.be/mffdQgkvDNY + +[2] FutureTPM EU project, final review meeting demo + +- slides: + https://futuretpm.eu/images/07-3-FutureTPM-Final-Review-Slides-WP6-Device-Management-Use-Case-HWDU.pdf +- video: https://vimeo.com/528251864/4c1d55abcd + + +Binary Integrity +---------------- + +Integrity is a fundamental security property in information systems. +Integrity could be described as the condition in which a generic +component is just after it has been released by the entity that created it. + +One way to check whether a component is in this condition (called binary +integrity) is to calculate its digest and to compare it with a reference +value (i.e. the digest calculated in controlled conditions, when the +component is released). + +IMA, a software part of the integrity subsystem, can perform such +evaluation and execute different actions: + +- store the digest in an integrity-protected measurement list, so that it + can be sent to a remote verifier for analysis; +- compare the calculated digest with a reference value (usually protected + with a signature) and deny operations if the file is found corrupted; +- store the digest in the system log. + + +Contribution +------------ + +DIGLIM further enhances the capabilities offered by IMA-based solutions +and, at the same time, makes them more practical to adopt by reusing +existing sources as reference values for integrity decisions. + +Possible sources for digest lists are: + +- RPM headers; +- Debian repository metadata. + + +Benefits for IMA Measurement +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +One of the issues that arises when files are measured by the OS is that, +due to parallel execution, the order in which file accesses happen cannot +be predicted. Since the TPM Platform Configuration Register (PCR) extend +operation, executed after each file measurement, cryptographically binds +the current measurement to the previous ones, the PCR value at the end of a +workload cannot be predicted too. + +Thus, even if the usage of a TPM key, bound to a PCR value, should be +allowed when only good files were accessed, the TPM could unexpectedly deny +an operation on that key if files accesses did not happen as stated by the +key policy (which allows only one of the possible sequences). + +DIGLIM solves this issue by making the PCR value stable over the time and +not dependent on file accesses. The following figure depicts the current +and the new approaches: + +:: + + IMA measurement list (current) + + entry# 1st boot 2nd boot 3rd boot + +----+---------------+ +----+---------------+ +----+---------------+ + 1: | 10 | file1 measur. | | 10 | file3 measur. | | 10 | file2 measur. | + +----+---------------+ +----+---------------+ +----+---------------+ + 2: | 10 | file2 measur. | | 10 | file2 measur. | | 10 | file3 measur. | + +----+---------------+ +----+---------------+ +----+---------------+ + 3: | 10 | file3 measur. | | 10 | file1 measur. | | 10 | file4 measur. | + +----+---------------+ +----+---------------+ +----+---------------+ + + PCR: Extend != Extend != Extend + file1, file2, file3 file3, file2, file1 file2, file3, file4 + + + PCR Extend definition: + + PCR(new value) = Hash(Hash(meas. entry), PCR(previous value)) + +A new entry in the measurement list is created by IMA for each file access. +Assuming that ``file1``, ``file2`` and ``file3`` are files provided by the +software vendor, ``file4`` is an unknown file, the first two PCR values +above represent a good system state, the third a bad system state. The PCR +values are the result of the PCR extend operation performed for each +measurement entry with the digest of the measurement entry as an input. + +:: + + IMA measurement list (with DIGLIM) + + dlist + +--------------+ + | header | + +--------------+ + | file1 digest | + | file2 digest | + | file3 digest | + +--------------+ + +``dlist`` is a digest list containing the digest of ``file1``, ``file2`` +and ``file3``. In the intended scenario, it is generated by a software +vendor at the end of the building process, and retrieved by the +administrator of the system where the digest list is loaded. + +:: + + entry# 1st boot 2nd boot 3rd boot + +----+---------------+ +----+---------------+ +----+---------------+ + 0: | 11 | dlist measur. | | 11 | dlist measur. | | 11 | dlist measur. | + +----+---------------+ +----+---------------+ +----+---------------+ + 1: < file1 measur. skip > < file3 measur. skip > < file2 measur. skip > + + 2: < file2 measur. skip > < file2 measur. skip > < file3 measur. skip > + +----+---------------+ + 3: < file3 measur. skip > < file1 measur. skip > | 11 | file4 measur. | + +----+---------------+ + + PCR: Extend = Extend != Extend + dlist dlist dlist, file4 + + +The first entry in the measurement list contains the digest of the digest +list uploaded to the kernel at kernel initialization time. + +When a file is accessed, IMA queries DIGLIM with the calculated file digest +and, if it is found, IMA skips the measurement. + +Thus, the only information sent to remote verifiers are: the list of +files that could possibly be accessed (from the digest list), but not if +they were accessed and when; the measurement of unknown files. + +Despite providing less information, this solution has the advantage that +the good system state (i.e. when only ``file1``, ``file2`` and ``file3`` +are accessed) now can be represented with a deterministic PCR value (the +PCR is extended only with the measurement of the digest list). Also, the +bad system state can still be distinguished from the good state (the PCR is +extended also with the measurement of ``file4``). + +If a TPM key is bound to the good PCR value, the TPM would allow the key to +be used if ``file1``, ``file2`` or ``file3`` are accessed, regardless of +the sequence in which they are accessed (the PCR value does not change), +and would revoke the permission when the unknown ``file4`` is accessed (the +PCR value changes). If a system is able to establish a TLS connection with +a peer, this implicitly means that the system was in a good state (i.e. +``file4`` was not accessed, otherwise the TPM would have denied the usage +of the TPM key due to the key policy). + + +Benefits for IMA Appraisal +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Extending secure boot to applications means being able to verify the +provenance of files accessed. IMA does it by verifying file signatures with +a key that it trusts, which requires Linux distribution vendors to +additionally include in the package header a signature for each file that +must be verified (there is the dedicated ``RPMTAG_FILESIGNATURES`` section +in the RPM header). + +The proposed approach would be instead to verify data provenance from +already available metadata (file digests) in existing packages. IMA would +verify the signature of package metadata and search file digests extracted +from package metadata and added to the hash table in the kernel. + +For RPMs, file digests can be found in the ``RPMTAG_FILEDIGESTS`` section +of ``RPMTAG_IMMUTABLE``, whose signature is in ``RPMTAG_RSAHEADER``. For +DEBs, file digests (unsafe to use due to a weak digest algorithm) can be +found in the ``md5sum`` file, which can be indirectly verified from +``Release.gpg``. + +The following figure highlights the differences between the current and the +proposed approach. + +:: + + IMA appraisal (current solution, with file signatures): + + appraise + +-----------+ + V | + +-------------------------+-----+ +-------+-----+ | + | RPM header | | ima rpm | file1 | sig | | + | ... | | plugin +-------+-----+ +-----+ + | file1 sig [to be added] | sig |--------> ... | IMA | + | ... | | +-------+-----+ +-----+ + | fileN sig [to be added] | | | fileN | sig | + +-------------------------+-----+ +-------+-----+ + +In this case, file signatures must be added to the RPM header, so that the +``ima`` rpm plugin can extract them together with the file content. The RPM +header signature is not used. + +:: + + IMA appraisal (with DIGLIM): + + kernel hash table + with RPM header content + +---+ +--------------+ + | |--->| file1 digest | + +---+ +--------------+ + ... + +---+ appraise (file1) + | | <--------------+ + +----------------+-----+ +---+ | + | RPM header | | ^ | + | ... | | digest_list | | + | file1 digest | sig | rpm plugin | +-------+ +-----+ + | ... | |-------------+--->| file1 | | IMA | + | fileN digest | | +-------+ +-----+ + +----------------+-----+ | + ^ | + +------------------------------------+ + appraise (RPM header) + +In this case, the RPM header is used as it is, and its signature is used +for IMA appraisal. Then, the ``digest_list`` rpm plugin executes the user +space parser to parse the RPM header and add the extracted digests to an +hash table in the kernel. IMA appraisal of the files in the RPM package +consists in searching their digest in the hash table. + +Other than reusing available information as digest list, another advantage +is the lower computational overhead compared to the solution with file +signatures (only one signature verification for many files and digest +lookup, instead of per file signature verification, see `Preliminary +Performance Evaluation`_ for more details). + + +Lifecycle +--------- + +The lifecycle of DIGLIM is represented in the following figure: + +:: + + Vendor premises (release process with modifications): + + +------------+ +-----------------------+ +------------------------+ + | 1. build a | | 2. generate and sign | | 3. publish the package | + | package |-->| a digest list from |-->| and digest list in | + | | | packaged files | | a repository | + +------------+ +-----------------------+ +------------------------+ + | + | + User premises: | + V + +---------------------+ +------------------------+ +-----------------+ + | 6. use digest lists | | 5. download the digest | | 4. download and | + | for measurement |<--| list and upload to |<--| install the | + | and/or appraisal | | the kernel | | package | + +---------------------+ +------------------------+ +-----------------+ + +The figure above represents all the steps when a digest list is +generated separately. However, as mentioned in `Contribution`_, in most +cases existing packages can be already used as a source for digest lists, +limiting the effort for software vendors. + +If, for example, RPMs are used as a source for digest lists, the figure +above becomes: + +:: + + Vendor premises (release process without modifications): + + +------------+ +------------------------+ + | 1. build a | | 2. publish the package | + | package |-->| in a repository |---------------------+ + | | | | | + +------------+ +------------------------+ | + | + | + User premises: | + V + +---------------------+ +------------------------+ +-----------------+ + | 5. use digest lists | | 4. extract digest list | | 3. download and | + | for measurement |<--| from the package |<--| install the | + | and/or appraisal | | and upload to the | | package | + | | | kernel | | | + +---------------------+ +------------------------+ +-----------------+ + +Step 4 can be performed with the ``digest_list`` rpm plugin and the user +space parser, without changes to rpm itself. + + +Security Assumptions +-------------------- + +As mentioned in the `Introduction`_, DIGLIM will be primarily used in +conjunction with IMA to enforce a mandatory policy on all user space +processes, including those owned by root. Even root, in a system with a +locked-down kernel, cannot affect the enforcement of the mandatory policy +or, if changes are permitted, it cannot do so without being detected. + +Given that the target of the enforcement are user space processes, DIGLIM +cannot be placed in the target, as a Mandatory Access Control (MAC) design +is required to have the components responsible to enforce the mandatory +policy separated from the target. + +While locking-down a system and limiting actions with a mandatory policy is +generally perceived by users as an obstacle, it has noteworthy benefits for +the users themselves. + +First, it would timely block attempts by malicious software to steal or +misuse user assets. Although users could query the package managers to +detect them, detection would happen after the fact, or it wouldn't happen +at all if the malicious software tampered with package managers. With a +mandatory policy enforced by the kernel, users would still be able to +decide which software they want to be executed except that, unlike package +managers, the kernel is not affected by user space processes or root. + +Second, it might make systems more easily verifiable from outside, due to +the limited actions the system allows. When users connect to a server, not +only they would be able to verify the server identity, which is already +possible with communication protocols like TLS, but also if the software +running on that server can be trusted to handle their sensitive data. + + +Adoption +-------- + +A former version of DIGLIM is used in the following OSes: + +- openEuler 20.09 + https://github.com/openeuler-mirror/kernel/tree/openEuler-20.09 + +- openEuler 21.03 + https://github.com/openeuler-mirror/kernel/tree/openEuler-21.03 + +Originally, DIGLIM was part of IMA (known as IMA Digest Lists). In this +version, it has been redesigned as a standalone module with an API that +makes its functionality accessible by IMA and, eventually, other +subsystems. + +User Space Support +------------------ + +Digest lists can be generated and managed with ``digest-list-tools``: + +https://github.com/openeuler-mirror/digest-list-tools + +It includes two main applications: + +- ``gen_digest_lists``: generates digest lists from files in the + filesystem or from the RPM database (more digest list sources can be + supported); +- ``manage_digest_lists``: converts and uploads digest lists to the + kernel. + +Integration with rpm is done with the ``digest_list`` plugin: + +https://gitee.com/src-openeuler/rpm/blob/master/Add-digest-list-plugin.patch + +This plugin writes the RPM header and its signature to a file, so that the +file is ready to be appraised by IMA, and calls the user space parser to +convert and upload the digest list to the kernel. + + +Simple Usage Example (Tested with Fedora 33) +-------------------------------------------- + +1. Digest list generation (RPM headers and their signature are copied to + the specified directory): + +.. code-block:: bash + + # mkdir /etc/digest_lists + # gen_digest_lists -t file -f rpm+db -d /etc/digest_lists -o add + +2. Digest list upload with the user space parser: + +.. code-block:: bash + + # manage_digest_lists -p add-digest -d /etc/digest_lists + +3. First digest list query: + +.. code-block:: bash + + # echo sha256-$(sha256sum /bin/cat) > /sys/kernel/security/integrity/diglim/digest_query + # cat /sys/kernel/security/integrity/diglim/digest_query + sha256-[...]-0-file_list-rpm-coreutils-8.32-18.fc33.x86_64 (actions: 0): version: 1, algo: sha256, type: 2, modifiers: 1, count: 106, datalen: 3392 + +4. Second digest list query: + +.. code-block:: bash + + # echo sha256-$(sha256sum /bin/zip) > /sys/kernel/security/integrity/diglim/digest_query + # cat /sys/kernel/security/integrity/diglim/digest_query + sha256-[...]-0-file_list-rpm-zip-3.0-27.fc33.x86_64 (actions: 0): version: 1, algo: sha256, type: 2, modifiers: 1, count: 4, datalen: 128 + + +Preliminary Performance Evaluation +---------------------------------- + +This section provides an initial estimation of the overhead introduced by +DIGLIM. The estimation has been performed on a Fedora 33 virtual machine +with 1447 packages installed. The virtual machine has 16 vCPU (host CPU: +AMD Ryzen Threadripper PRO 3955WX 16-Cores) and 2G of RAM (host memory: +64G). The virtual machine also has a vTPM with libtpms and swtpm as +backend. + +After writing the RPM headers to files, the size of the directory +containing them is 36M. + +After converting the RPM headers to the compact digest list, the size of +the data being uploaded to the kernel is 3.6M. + +The time to load the entire RPM database is 0.628s. + +After loading the digest lists to the kernel, the slab usage due to +indexing is (obtained with slab_nomerge in the kernel command line): + +:: + + OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME + 118144 118144 100% 0,03K 923 128 3692K digest_list_item_ref_cache + 102400 102400 100% 0,03K 800 128 3200K digest_item_cache + 2646 2646 100% 0,09K 63 42 252K digest_list_item_cache + +The stats, obtained from the ``digests_count`` interface, introduced later, +are: + +:: + + Parser digests: 0 + File digests: 99100 + Metadata digests: 0 + Digest list digests: 1423 + +On this installation, this would be the worst case in which all files are +measured and/or appraised, which is currently not recommended without +enforcing an integrity policy protecting mutable files. Infoflow LSM is a +component to accomplish this task: + +https://patchwork.kernel.org/project/linux-integrity/cover/20190818235745.1417-1-roberto.sassu@huawei.com/ + +The first manageable goal of IMA with DIGLIM is to use an execution policy, +with measurement and/or appraisal of files executed or mapped in memory as +executable (in addition to kernel modules and firmware). In this +case, the digest list contains the digest only for those files. The numbers +above change as follows. + +After converting the RPM headers to the compact digest list, the size of +the data being uploaded to the kernel is 208K. + +The time to load the digest of binaries and shared libraries is 0.062s. + +After loading the digest lists to the kernel, the slab usage due to +indexing is: + +:: + + OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME + 7168 7168 100% 0,03K 56 128 224K digest_list_item_ref_cache + 7168 7168 100% 0,03K 56 128 224K digest_item_cache + 1134 1134 100% 0,09K 27 42 108K digest_list_item_cache + + +The stats, obtained from the ``digests_count`` interface, are: + +:: + + Parser digests: 0 + File digests: 5986 + Metadata digests: 0 + Digest list digests: 1104 + + +Comparison with IMA +~~~~~~~~~~~~~~~~~~~ + +This section compares the performance between the current solution for IMA +measurement and appraisal, and IMA with DIGLIM. + + +Workload A (without DIGLIM): + +#. cat file[0-5985] > /dev/null + + +Workload B (with DIGLIM): + +#. echo $PWD/0-file_list-compact-file[0-1103] > <securityfs>/integrity/diglim/digest_list_add +#. cat file[0-5985] > /dev/null + + +Workload A execution time without IMA policy: + +:: + + real 0m0,155s + user 0m0,008s + sys 0m0,066s + + +Measurement +........... + +IMA policy: + +:: + + measure fowner=2000 func=FILE_CHECK mask=MAY_READ use_diglim=allow pcr=11 ima_template=ima-sig + +``use_diglim`` is a policy keyword not yet supported by IMA. + + +Workload A execution time with IMA and 5986 files with signature measured: + +:: + + real 0m8,273s + user 0m0,008s + sys 0m2,537s + + +Workload B execution time with IMA, 1104 digest lists with signature +measured and uploaded to the kernel, and 5986 files with signature accessed +but not measured (due to the file digest being found in the hash table): + +:: + + real 0m1,837s + user 0m0,036s + sys 0m0,583s + + +Appraisal +......... + +IMA policy: + +:: + + appraise fowner=2000 func=FILE_CHECK mask=MAY_READ use_diglim=allow + +``use_diglim`` is a policy keyword not yet supported by IMA. + + +Workload A execution time with IMA and 5986 files with file signature +appraised: + +:: + + real 0m2,197s + user 0m0,011s + sys 0m2,022s + + +Workload B execution time with IMA, 1104 digest lists with signature +appraised and uploaded to the kernel, and with 5986 files with signature +not verified (due to the file digest being found in the hash table): + +:: + + real 0m0,982s + user 0m0,020s + sys 0m0,865s diff --git a/Documentation/security/index.rst b/Documentation/security/index.rst index 16335de04e8c..6c3aea41c55b 100644 --- a/Documentation/security/index.rst +++ b/Documentation/security/index.rst @@ -17,3 +17,4 @@ Security Documentation tpm/index digsig landlock + diglim/index diff --git a/MAINTAINERS b/MAINTAINERS index 6c8be735cc91..c914dadd7e65 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -5452,6 +5452,15 @@ L: linux-gpio@vger.kernel.org S: Maintained F: drivers/gpio/gpio-gpio-mm.c +DIGLIM +M: Roberto Sassu <roberto.sassu@huawei.com> +L: linux-integrity@vger.kernel.org +S: Supported +T: git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity.git +F: Documentation/security/diglim/architecture.rst +F: Documentation/security/diglim/index.rst +F: Documentation/security/diglim/introduction.rst + DIOLAN U2C-12 I2C DRIVER M: Guenter Roeck <linux@roeck-us.net> L: linux-i2c@vger.kernel.org
Add an overview of DIGLIM to Documentation/security/diglim/introduction.rst and the architecture to Documentation/security/diglim/architecture.rst Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> --- .../security/diglim/architecture.rst | 45 ++ Documentation/security/diglim/index.rst | 11 + .../security/diglim/introduction.rst | 631 ++++++++++++++++++ Documentation/security/index.rst | 1 + MAINTAINERS | 9 + 5 files changed, 697 insertions(+) create mode 100644 Documentation/security/diglim/architecture.rst create mode 100644 Documentation/security/diglim/index.rst create mode 100644 Documentation/security/diglim/introduction.rst