
[v2,00/10] Encrypted Hibernation

Message ID 20220823222526.1524851-1-evgreen@chromium.org (mailing list archive)

Message

Evan Green Aug. 23, 2022, 10:25 p.m. UTC
We are exploring enabling hibernation in some new scenarios. However,
our security team has a few requirements, listed below:
1. The hibernate image must be encrypted with protection derived from
   both the platform (e.g. TPM) and user authentication data (e.g.
   password).
2. Hibernation must not be a vector by which a malicious userspace can
   escalate to the kernel.

Requirement #1 can be achieved solely with uswsusp. Requirement #2,
however, requires mechanisms in the kernel to guarantee the integrity of
the hibernate image. The kernel needs a way to verify that it generated
the hibernate image being loaded, and that the image has not been
tampered with. Adding support for in-kernel AEAD encryption with a
TPM-sealed key allows us to achieve both requirements with a single
computation pass.

Matthew Garrett published a series [1] that aligns closely with this
goal. His series utilized the fact that PCR23 is a resettable PCR that
can be blocked from access by usermode. The TPM can create a sealed key
tied to PCR23 in two ways. First, the TPM can attest to the value of
PCR23 when the key was created, which the kernel can use on resume to
verify that the kernel must have created the key (since it is the only
one capable of modifying PCR23). It can also create a policy that enforces
PCR23 be set to a specific value as a condition of unsealing the key,
preventing usermode from unsealing the key by talking directly to the
TPM.
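
As a rough illustration of why a kernel-only PCR23 works as an origin
marker, here is a toy Python model of the extend rule. It is a sketch,
not TPM code: ToyPcr, KERNEL_MARKER and creation_data_ok() are
hypothetical names, and the marker value is made up. The only TPM
behavior assumed is the standard extend operation,
new = SHA-256(old || digest).

```python
import hashlib
import hmac

PCR_SIZE = 32  # size of one PCR in a SHA-256 bank


class ToyPcr:
    """Toy model of a single resettable PCR (hypothetical, not a TPM)."""

    def __init__(self):
        self.value = bytes(PCR_SIZE)

    def reset(self):
        # In the series, only the kernel is allowed to do this for PCR23.
        self.value = bytes(PCR_SIZE)

    def extend(self, data: bytes):
        # Extend rule: new = SHA-256(old || SHA-256(data)).
        digest = hashlib.sha256(data).digest()
        self.value = hashlib.sha256(self.value + digest).digest()


# Made-up value the "kernel" extends into PCR23 before creating the key.
KERNEL_MARKER = b"hypothetical-kernel-marker"


def expected_pcr23() -> bytes:
    """PCR23 value after a reset followed by one extend of the marker."""
    pcr = ToyPcr()
    pcr.extend(KERNEL_MARKER)
    return pcr.value


def creation_data_ok(recorded_pcr23: bytes) -> bool:
    """Check the PCR23 value recorded in the key's creation data."""
    return hmac.compare_digest(recorded_pcr23, expected_pcr23())
```

A party that can neither reset nor extend PCR23 cannot steer it to the
expected value, which is what lets the kernel treat a matching value in
the creation data as evidence that it created the key.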

This series adopts that primitive as a foundation, tweaking and building
on it a bit. Where Matthew's series used the TPM-backed key to encrypt a
hash of the image, this series uses the key directly as a gcm(aes)
encryption key, which the kernel uses to encrypt and decrypt the
hibernate image in chunks of 16 pages. This provides both encryption and
integrity, which turns out to be a noticeable performance improvement over
separate passes for encryption and hashing.
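
To give a feel for the cost of chunked AEAD, the following Python sketch
works out the chunk count and authentication tag overhead for an image.
The 16-page chunk size is from the text above; the 4 KiB page size and
the 16-byte GCM tag are assumptions, chunk_layout() is a hypothetical
helper, and handling of a final partial chunk is not modeled.

```python
PAGE_SIZE = 4096       # assumption: 4 KiB pages
PAGES_PER_CHUNK = 16   # chunk size stated in the cover letter
TAG_SIZE = 16          # assumption: full-length GCM authentication tag


def chunk_layout(image_pages: int):
    """Return (chunks, plaintext_bytes, tag_overhead_bytes) for an image."""
    chunks = -(-image_pages // PAGES_PER_CHUNK)  # ceiling division
    plaintext_bytes = image_pages * PAGE_SIZE
    tag_overhead_bytes = chunks * TAG_SIZE
    return chunks, plaintext_bytes, tag_overhead_bytes


# A 1 GiB image is 262144 pages of 4 KiB each.
print(chunk_layout(262144))  # (16384, 1073741824, 262144)
```

Under these assumptions the tags add 256 KiB to a 1 GiB image, about
0.02%, so the integrity check costs almost nothing in image size.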

The series also introduces the concept of mixing user key material into
the encryption key. This allows usermode to introduce key material
based on unspecified external authentication data (in our case derived
from something like the user password or PIN), without requiring
usermode to do a separate encryption pass.
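
The following Python sketch shows one plausible way to mix the two
keys. The cover letter does not spell out the construction, so hashing
the concatenation with SHA-256 is an assumption here (the changelog
below only mentions a 32-byte user key and CRYPTO_LIB_SHA256), and
mix_keys() is a hypothetical name.

```python
import hashlib

USER_KEY_LEN = 32  # user key length mentioned in the v2 changelog


def mix_keys(kernel_key: bytes, user_key: bytes) -> bytes:
    """Derive the image key from the TPM-backed key and the user key.

    Assumed construction: SHA-256(kernel_key || user_key). The series
    itself may combine the two differently.
    """
    if len(user_key) != USER_KEY_LEN:
        raise ValueError("user key must be %d bytes" % USER_KEY_LEN)
    return hashlib.sha256(kernel_key + user_key).digest()
```

With a construction like this, the image key cannot be rebuilt without
both inputs: the TPM-sealed key alone is not enough, and neither is the
user's authentication data.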

Matthew also documented issues his series had [2] related to generating
fake images by booting alternate kernels without the PCR23 restriction.
With access to PCR23 on the same machine, usermode can create fake
hibernate images that are indistinguishable to the new kernel from
genuine ones. His post outlines a solution that involves adding more
PCRs into the creation data and policy, with some gyrations to make this
work well on a standard PC.

Our approach would be similar: on our machines PCR0 indicates whether
the system is booted in secure/verified mode or developer mode. By
adding PCR0 to the policy, we can reject hibernate images made in
developer mode while in verified mode (or vice versa).
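
A toy Python model of that policy check is sketched below. It is not
how a TPM stores or releases keys (a real TPM never exposes the sealed
key in the blob); seal(), unseal() and policy_digest() are hypothetical
names, and the PCR values in the usage example are made up.

```python
import hashlib


def policy_digest(pcrs, selection):
    """Hash the selected PCR values in index order (toy policy digest)."""
    h = hashlib.sha256()
    for index in sorted(selection):
        h.update(bytes([index]) + pcrs[index])
    return h.digest()


def seal(key, pcrs, selection):
    """Bind a key to the current values of the selected PCRs."""
    return {
        "selection": sorted(selection),
        "policy": policy_digest(pcrs, selection),
        "key": key,  # a real TPM would keep this encrypted
    }


def unseal(blob, current_pcrs):
    """Release the key only if the selected PCRs still match the policy."""
    current = policy_digest(current_pcrs, blob["selection"])
    if current != blob["policy"]:
        raise PermissionError("PCR policy mismatch")
    return blob["key"]
```

For example, a key sealed with PCR0 and PCR23 selected while in
verified mode unseals again in verified mode, but raises
PermissionError once PCR0 holds the developer mode value, even if PCR23
matches.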

Additionally, mixing in the user authentication data limits both
data exfiltration attacks (e.g. a stolen laptop) and forged hibernation
image attacks to attackers that already know the authentication data
(e.g. the user's password). This, combined with our relatively sealed
userspace (dm-verity on the rootfs) and some judicious clearing of the
hibernate image (such as across an OS update), further reduces the risk
of an online attack. The remaining attack space of a forgery from
someone with physical access to the device and knowledge of the
authentication data is out of scope for us, given that flipping to
developer mode or reflashing RO firmware trivially achieves the same
thing.

A couple of patches still need to be written on top of this series. The
generalized functionality to OR in additional PCRs via Kconfig (like
PCR0 or PCR5) still needs to be added. We'll also need a patch that
disallows unencrypted forms of resume from hibernation, to fully close
the door to malicious userspace. However, I wanted to get this series
out first and get reactions from upstream before continuing to add to
it.

[1] https://patchwork.kernel.org/project/linux-pm/cover/20210220013255.1083202-1-matthewgarrett@google.com/
[2] https://mjg59.dreamwidth.org/58077.html

Changes in v2:
 - Fixed sparse warnings
 - Adjust hash len by 2 due to new ASN.1 storage, and add underflow
   check.
 - Rework load/create_kernel_key() to eliminate a label (Andrey)
 - Call put_device() needed from calling tpm_default_chip().
 - Add missing static on snapshot_encrypted_byte_count()
 - Fold in only the used kernel key bytes to the user key.
 - Make the user key length 32 (Eric)
 - Use CRYPTO_LIB_SHA256 for less boilerplate (Eric)
 - Fixed some sparse warnings
 - Use CRYPTO_LIB_SHA256 to get rid of sha256_data() (Eric)
 - Adjusted offsets due to new ASN.1 format, and added a creation data
   length check.
 - Fix sparse warnings
 - Fix session type comment (Andrey)
 - Eliminate extra label in get/create_kernel_key() (Andrey)
 - Call tpm_try_get_ops() before calling tpm2_flush_context().

Evan Green (7):
  security: keys: trusted: Include TPM2 creation data
  security: keys: trusted: Verify creation data
  PM: hibernate: Add kernel-based encryption
  PM: hibernate: Use TPM-backed keys to encrypt image
  PM: hibernate: Mix user key in encrypted hibernate
  PM: hibernate: Verify the digest encryption key
  PM: hibernate: seal the encryption key with a PCR policy

Matthew Garrett (3):
  tpm: Add support for in-kernel resetting of PCRs
  tpm: Allow PCR 23 to be restricted to kernel-only use
  security: keys: trusted: Allow storage of PCR values in creation data

 Documentation/power/userland-swsusp.rst       |    8 +
 .../security/keys/trusted-encrypted.rst       |    4 +
 drivers/char/tpm/Kconfig                      |   10 +
 drivers/char/tpm/tpm-dev-common.c             |    8 +
 drivers/char/tpm/tpm-interface.c              |   28 +
 drivers/char/tpm/tpm.h                        |   23 +
 drivers/char/tpm/tpm1-cmd.c                   |   69 ++
 drivers/char/tpm/tpm2-cmd.c                   |   58 +
 drivers/char/tpm/tpm2-space.c                 |    2 +-
 include/keys/trusted-type.h                   |    9 +
 include/linux/tpm.h                           |   12 +
 include/uapi/linux/suspend_ioctls.h           |   28 +-
 kernel/power/Kconfig                          |   16 +
 kernel/power/Makefile                         |    1 +
 kernel/power/power.h                          |    1 +
 kernel/power/snapenc.c                        | 1037 +++++++++++++++++
 kernel/power/snapshot.c                       |    5 +
 kernel/power/user.c                           |   44 +-
 kernel/power/user.h                           |  114 ++
 security/keys/trusted-keys/tpm2key.asn1       |    5 +-
 security/keys/trusted-keys/trusted_tpm1.c     |    9 +
 security/keys/trusted-keys/trusted_tpm2.c     |  304 ++++-
 22 files changed, 1754 insertions(+), 41 deletions(-)
 create mode 100644 kernel/power/snapenc.c
 create mode 100644 kernel/power/user.h

Comments

Mario Limonciello Aug. 31, 2022, 6:34 p.m. UTC | #1
Something else to think about in this series is what happens with
`hibernation_available` in kernel/power/hibernate.c. Currently,
hibernation is disabled if the system is locked down, but with a setup
like the one described here that should no longer be necessary.
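
The decision being discussed can be modeled roughly as follows. This is
a Python sketch of the logic only: the real function is C, checks more
conditions than shown, and the encrypted_resume_enforced flag is a
hypothetical input standing in for a follow-up that does not exist in
this series.

```python
def hibernation_available(nohibernate: bool, locked_down: bool,
                          encrypted_resume_enforced: bool = False) -> bool:
    """Toy model of whether hibernation should be offered."""
    if nohibernate:
        return False
    # Today, lockdown disables hibernation outright. A possible follow-up
    # would permit it when only authenticated, encrypted images can be
    # resumed (hypothetical flag).
    if locked_down and not encrypted_resume_enforced:
        return False
    return True
```

With the flag left at its default, the model matches the current
behavior described above: a locked-down system never offers hibernation.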

Evan Green Sept. 7, 2022, 5:03 p.m. UTC | #2
On Wed, Aug 31, 2022 at 11:35 AM Limonciello, Mario
<mario.limonciello@amd.com> wrote:
>
> Something else to think about in this series is what happens with
> `hibernation_available` in kernel/power/hibernate.c.  Currently if the
> system is locked down hibernate is disabled, but I would think that
> with a setup like that described here that should no longer be necessary.
>

Correct, I think that would be a reasonable followup to this series.

-Evan
Pavel Machek Sept. 20, 2022, 8:46 a.m. UTC | #3
Hi!

> We are exploring enabling hibernation in some new scenarios. However,
> our security team has a few requirements, listed below:
> 1. The hibernate image must be encrypted with protection derived from
>    both the platform (e.g. TPM) and user authentication data (e.g.
>    password).
> 2. Hibernation must not be a vector by which a malicious userspace can
>    escalate to the kernel.

Why is #2 a reasonable requirement?

We normally allow userspace with appropriate permissions to update the
kernel, for example.

Best regards,
								Pavel
Evan Green Sept. 20, 2022, 4:39 p.m. UTC | #4
On Tue, Sep 20, 2022 at 1:46 AM Pavel Machek <pavel@ucw.cz> wrote:
>
> Hi!
>
> > We are exploring enabling hibernation in some new scenarios. However,
> > our security team has a few requirements, listed below:
> > 1. The hibernate image must be encrypted with protection derived from
> >    both the platform (e.g. TPM) and user authentication data (e.g.
> >    password).
> > 2. Hibernation must not be a vector by which a malicious userspace can
> >    escalate to the kernel.
>
> Why is #2 a reasonable requirement?
>
> We normally allow userspace with appropriate permissions to update the
> kernel, for example.

I'll take a stab at answering this. I've also CCed one of our security
folks to keep me honest and add any needed additional context.

ChromeOS takes an approach of attempting to limit the blast radius of
any given vulnerability as much as possible. A vulnerable system
service may be running as root, but may also still be fairly
constrained by sandboxing: it may not have access to all processes,
the entire file system, or all capability bits. With Verified Boot
[1], our kernel and rootfs are statically signed by Google (or by
you, if the firmware has been reflashed). Even if a full root
compromise occurs, it's difficult for the attacker to persist across a
reboot, since they cannot update the kernel or init flow on disk
without the signing key.

We do our best to lock down other escalation vectors from root to
kernel as well. For instance, features like LoadPin help prevent a
malicious root from simply loading up a payload via insmod.

So in cases like ours, jumping from root execution to kernel execution
represents a real escalation in privilege. Hibernate as it exists
today represents a wide open door for root to become kernel, so we're
forced to disable the Kconfigs for it. This series, along with another
patch to restrict unencrypted resume, would add the guardrails we need
to prevent arbitrary code from moving into the kernel via resume.

-Evan

[1] https://www.chromium.org/chromium-os/chromiumos-design-docs/verified-boot/
Kees Cook Sept. 20, 2022, 10:52 p.m. UTC | #5
On Tue, Aug 23, 2022 at 03:25:16PM -0700, Evan Green wrote:
> This series adopts that primitive as a foundation, tweaking and building
> on it a bit. Where Matthew's series used the TPM-backed key to encrypt a
> hash of the image, this series uses the key directly as a gcm(aes)
> encryption key, which the kernel uses to encrypt and decrypt the
> hibernate image in chunks of 16 pages. This provides both encryption and
> integrity, which turns out to be a noticeable performance improvement over
> separate passes for encryption and hashing.

I like this series! I would ask that someone more familiar with the
cryptographic constraints here confirm that the primitives you're using
will actually provide the guarantees you want (i.e. encryption,
integrity, etc.). My understanding is that gcm(aes) is exactly right,
but I Am Not A Cryptographer. ;)

I'll reply more to individual patches ...
Jason Gunthorpe Sept. 21, 2022, 6:09 p.m. UTC | #6
On Tue, Sep 20, 2022 at 10:46:48AM +0200, Pavel Machek wrote:
> Hi!
> 
> > We are exploring enabling hibernation in some new scenarios. However,
> > our security team has a few requirements, listed below:
> > 1. The hibernate image must be encrypted with protection derived from
> >    both the platform (e.g. TPM) and user authentication data (e.g.
> >    password).
> > 2. Hibernation must not be a vector by which a malicious userspace can
> >    escalate to the kernel.
> 
> Why is #2 a reasonable requirement?

These days, with kernel lockdown, we don't allow userspace to enter the
kernel.

> We normally allow userspace with appropriate permissions to update the
> kernel, for example.

And in a lockdown secure boot environment only a signed kernel can be
booted in the first place.

A series like this is effectively carrying the secure boot trust
across the hibernation.

Jason