diff mbox series

[v3] tpm: Disable RNG for all AMD fTPMs

Message ID 20230803182428.25753-1-mario.limonciello@amd.com (mailing list archive)
State New, archived
Headers show
Series [v3] tpm: Disable RNG for all AMD fTPMs | expand

Commit Message

Mario Limonciello Aug. 3, 2023, 6:24 p.m. UTC
The TPM RNG functionality is not necessary for entropy when the CPU
already supports the RDRAND instruction. The TPM RNG functionality
was previously disabled on a subset of AMD fTPM series, but reports
continue to show problems on some systems causing stutter root caused
to TPM RNG functionality.

Expand disabling TPM RNG use for all AMD fTPMs whether they have versions
that claim to have fixed or not. To accomplish this, move the detection
into part of the TPM CRB registration and add a flag indicating that
the TPM should opt-out of registration to hwrng.

Cc: stable@vger.kernel.org # 5.5+
Fixes: b006c439d58d ("hwrng: core - start hwrng kthread also for untrusted sources")
Fixes: f1324bbc4011 ("tpm: disable hwrng for fTPM on some AMD designs")
Fixes: 3ef193822b25 ("tpm_crb: fix fTPM on AMD Zen+ CPUs")
Reported-by: daniil.stas@posteo.net
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217719
Reported-by: bitlord0xff@gmail.com
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217212
Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
---
v2->v3:
 * Pick up a R-b tag
 * Add another Fixes for when fTPM support enabled on AMD
 * Adjust stable tag further back to 5.5 (when fTPM enabled)
 * Fix an error handling case that would have introduced a NULL pointer dereference
v1->v2:
 * switch from callback to everything in tpm_crb
 * switch to open coded flags check instead of new inline
---
 drivers/char/tpm/tpm-chip.c | 71 +++----------------------------------
 drivers/char/tpm/tpm_crb.c  | 30 ++++++++++++++++
 include/linux/tpm.h         |  1 +
 3 files changed, 35 insertions(+), 67 deletions(-)

Comments

Jason A. Donenfeld Aug. 4, 2023, 1:28 p.m. UTC | #1
On Thu, Aug 03, 2023 at 01:24:28PM -0500, Mario Limonciello wrote:
> The TPM RNG functionality is not necessary for entropy when the CPU
> already supports the RDRAND instruction. The TPM RNG functionality
> was previously disabled on a subset of AMD fTPM series, but reports
> continue to show problems on some systems causing stutter root caused
> to TPM RNG functionality.
> 
> Expand disabling TPM RNG use for all AMD fTPMs whether they have versions
> that claim to have fixed or not. To accomplish this, move the detection
> into part of the TPM CRB registration and add a flag indicating that
> the TPM should opt-out of registration to hwrng.
> 
> Cc: stable@vger.kernel.org # 5.5+
> Fixes: b006c439d58d ("hwrng: core - start hwrng kthread also for untrusted sources")
> Fixes: f1324bbc4011 ("tpm: disable hwrng for fTPM on some AMD designs")
> Fixes: 3ef193822b25 ("tpm_crb: fix fTPM on AMD Zen+ CPUs")
> Reported-by: daniil.stas@posteo.net
> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217719
> Reported-by: bitlord0xff@gmail.com
> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217212
> Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>

LGTM. Jarkko - can you replace the commit in your staging tree with this
one?
Jarkko Sakkinen Aug. 4, 2023, 10:54 p.m. UTC | #2
On Thu Aug 3, 2023 at 9:24 PM EEST, Mario Limonciello wrote:
> The TPM RNG functionality is not necessary for entropy when the CPU
> already supports the RDRAND instruction. The TPM RNG functionality
> was previously disabled on a subset of AMD fTPM series, but reports
> continue to show problems on some systems causing stutter root caused
> to TPM RNG functionality.
>
> Expand disabling TPM RNG use for all AMD fTPMs whether they have versions
> that claim to have fixed or not. To accomplish this, move the detection
> into part of the TPM CRB registration and add a flag indicating that
> the TPM should opt-out of registration to hwrng.
>
> Cc: stable@vger.kernel.org # 5.5+
> Fixes: b006c439d58d ("hwrng: core - start hwrng kthread also for untrusted sources")
> Fixes: f1324bbc4011 ("tpm: disable hwrng for fTPM on some AMD designs")
> Fixes: 3ef193822b25 ("tpm_crb: fix fTPM on AMD Zen+ CPUs")
> Reported-by: daniil.stas@posteo.net
> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217719
> Reported-by: bitlord0xff@gmail.com
> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217212
> Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>

I will skip rc5 and send this for rc6 on Monday.

Has anyone with suitable AMD system tested this?

BR, Jarkko
Jarkko Sakkinen Aug. 4, 2023, 11:06 p.m. UTC | #3
On Fri Aug 4, 2023 at 4:28 PM EEST, Jason A. Donenfeld wrote:
> On Thu, Aug 03, 2023 at 01:24:28PM -0500, Mario Limonciello wrote:
> > The TPM RNG functionality is not necessary for entropy when the CPU
> > already supports the RDRAND instruction. The TPM RNG functionality
> > was previously disabled on a subset of AMD fTPM series, but reports
> > continue to show problems on some systems causing stutter root caused
> > to TPM RNG functionality.
> > 
> > Expand disabling TPM RNG use for all AMD fTPMs whether they have versions
> > that claim to have fixed or not. To accomplish this, move the detection
> > into part of the TPM CRB registration and add a flag indicating that
> > the TPM should opt-out of registration to hwrng.
> > 
> > Cc: stable@vger.kernel.org # 5.5+
> > Fixes: b006c439d58d ("hwrng: core - start hwrng kthread also for untrusted sources")
> > Fixes: f1324bbc4011 ("tpm: disable hwrng for fTPM on some AMD designs")
> > Fixes: 3ef193822b25 ("tpm_crb: fix fTPM on AMD Zen+ CPUs")
> > Reported-by: daniil.stas@posteo.net
> > Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217719
> > Reported-by: bitlord0xff@gmail.com
> > Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217212
> > Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
> > Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
>
> LGTM. Jarkko - can you replace the commit in your staging tree with this
> one?

I'll replace either during weekend or Monday and send the PR for rc6.

BR, Jarkko
Mario Limonciello Aug. 4, 2023, 11:21 p.m. UTC | #4
On 8/4/23 17:54, Jarkko Sakkinen wrote:
> On Thu Aug 3, 2023 at 9:24 PM EEST, Mario Limonciello wrote:
>> The TPM RNG functionality is not necessary for entropy when the CPU
>> already supports the RDRAND instruction. The TPM RNG functionality
>> was previously disabled on a subset of AMD fTPM series, but reports
>> continue to show problems on some systems causing stutter root caused
>> to TPM RNG functionality.
>>
>> Expand disabling TPM RNG use for all AMD fTPMs whether they have versions
>> that claim to have fixed or not. To accomplish this, move the detection
>> into part of the TPM CRB registration and add a flag indicating that
>> the TPM should opt-out of registration to hwrng.
>>
>> Cc: stable@vger.kernel.org # 5.5+
>> Fixes: b006c439d58d ("hwrng: core - start hwrng kthread also for untrusted sources")
>> Fixes: f1324bbc4011 ("tpm: disable hwrng for fTPM on some AMD designs")
>> Fixes: 3ef193822b25 ("tpm_crb: fix fTPM on AMD Zen+ CPUs")
>> Reported-by: daniil.stas@posteo.net
>> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217719
>> Reported-by: bitlord0xff@gmail.com
>> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217212
>> Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> 
> I will skip rc5 and send this for rc6 on Monday.
> 
> Has anyone with suitable AMD system tested this?

Probably obvious; but I tested with a system that can support both dTPM 
and fTPM and swapped between the two before I sent it.
Jarkko Sakkinen Aug. 4, 2023, 11:39 p.m. UTC | #5
On Sat Aug 5, 2023 at 2:21 AM EEST, Mario Limonciello wrote:
> On 8/4/23 17:54, Jarkko Sakkinen wrote:
> > On Thu Aug 3, 2023 at 9:24 PM EEST, Mario Limonciello wrote:
> >> The TPM RNG functionality is not necessary for entropy when the CPU
> >> already supports the RDRAND instruction. The TPM RNG functionality
> >> was previously disabled on a subset of AMD fTPM series, but reports
> >> continue to show problems on some systems causing stutter root caused
> >> to TPM RNG functionality.
> >>
> >> Expand disabling TPM RNG use for all AMD fTPMs whether they have versions
> >> that claim to have fixed or not. To accomplish this, move the detection
> >> into part of the TPM CRB registration and add a flag indicating that
> >> the TPM should opt-out of registration to hwrng.
> >>
> >> Cc: stable@vger.kernel.org # 5.5+
> >> Fixes: b006c439d58d ("hwrng: core - start hwrng kthread also for untrusted sources")
> >> Fixes: f1324bbc4011 ("tpm: disable hwrng for fTPM on some AMD designs")
> >> Fixes: 3ef193822b25 ("tpm_crb: fix fTPM on AMD Zen+ CPUs")
> >> Reported-by: daniil.stas@posteo.net
> >> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217719
> >> Reported-by: bitlord0xff@gmail.com
> >> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217212
> >> Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
> >> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> > 
> > I will skip rc5 and send this for rc6 on Monday.
> > 
> > Has anyone with suitable AMD system tested this?
>
> Probably obvious; but I tested with a system that can support both dTPM 
> and fTPM and swapped between the two before I sent it.

Ok, great. I've tested that with non-AMD system things continue to
work so I guess that is sufficient enough for:

Tested-by: Jarkko Sakkinen <jarkko@kernel.org>

BR, Jarkko
Jason A. Donenfeld Aug. 7, 2023, 10:28 p.m. UTC | #6
On Sat, Aug 05, 2023 at 02:39:11AM +0300, Jarkko Sakkinen wrote:
> On Sat Aug 5, 2023 at 2:21 AM EEST, Mario Limonciello wrote:
> > On 8/4/23 17:54, Jarkko Sakkinen wrote:
> > > On Thu Aug 3, 2023 at 9:24 PM EEST, Mario Limonciello wrote:
> > >> The TPM RNG functionality is not necessary for entropy when the CPU
> > >> already supports the RDRAND instruction. The TPM RNG functionality
> > >> was previously disabled on a subset of AMD fTPM series, but reports
> > >> continue to show problems on some systems causing stutter root caused
> > >> to TPM RNG functionality.
> > >>
> > >> Expand disabling TPM RNG use for all AMD fTPMs whether they have versions
> > >> that claim to have fixed or not. To accomplish this, move the detection
> > >> into part of the TPM CRB registration and add a flag indicating that
> > >> the TPM should opt-out of registration to hwrng.
> > >>
> > >> Cc: stable@vger.kernel.org # 5.5+
> > >> Fixes: b006c439d58d ("hwrng: core - start hwrng kthread also for untrusted sources")
> > >> Fixes: f1324bbc4011 ("tpm: disable hwrng for fTPM on some AMD designs")
> > >> Fixes: 3ef193822b25 ("tpm_crb: fix fTPM on AMD Zen+ CPUs")
> > >> Reported-by: daniil.stas@posteo.net
> > >> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217719
> > >> Reported-by: bitlord0xff@gmail.com
> > >> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217212
> > >> Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
> > >> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> > > 
> > > I will skip rc5 and send this for rc6 on Monday.
> > > 
> > > Has anyone with suitable AMD system tested this?
> >
> > Probably obvious; but I tested with a system that can support both dTPM 
> > and fTPM and swapped between the two before I sent it.
> 
> Ok, great. I've tested that with non-AMD system things continue to
> work so I guess that is sufficient enough for:
> 
> Tested-by: Jarkko Sakkinen <jarkko@kernel.org>
> 
> BR, Jarkko

Why is
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=554b841d470338a3b1d6335b14ee1cd0c8f5d754
in Linus' tree? After we told you on several email threads to take the
v3, and you said you would, you still took the v2? What are you doing?
I'm frustrated because this is not the first time you've been out
to lunch about this stuff. Now there's the wrong stable metadata and the
fix is incomplete. Shame.
Mario Limonciello Aug. 8, 2023, 12:15 a.m. UTC | #7
On 8/7/23 17:28, Jason A. Donenfeld wrote:
> On Sat, Aug 05, 2023 at 02:39:11AM +0300, Jarkko Sakkinen wrote:
>> On Sat Aug 5, 2023 at 2:21 AM EEST, Mario Limonciello wrote:
>>> On 8/4/23 17:54, Jarkko Sakkinen wrote:
>>>> On Thu Aug 3, 2023 at 9:24 PM EEST, Mario Limonciello wrote:
>>>>> The TPM RNG functionality is not necessary for entropy when the CPU
>>>>> already supports the RDRAND instruction. The TPM RNG functionality
>>>>> was previously disabled on a subset of AMD fTPM series, but reports
>>>>> continue to show problems on some systems causing stutter root caused
>>>>> to TPM RNG functionality.
>>>>>
>>>>> Expand disabling TPM RNG use for all AMD fTPMs whether they have versions
>>>>> that claim to have fixed or not. To accomplish this, move the detection
>>>>> into part of the TPM CRB registration and add a flag indicating that
>>>>> the TPM should opt-out of registration to hwrng.
>>>>>
>>>>> Cc: stable@vger.kernel.org # 5.5+
>>>>> Fixes: b006c439d58d ("hwrng: core - start hwrng kthread also for untrusted sources")
>>>>> Fixes: f1324bbc4011 ("tpm: disable hwrng for fTPM on some AMD designs")
>>>>> Fixes: 3ef193822b25 ("tpm_crb: fix fTPM on AMD Zen+ CPUs")
>>>>> Reported-by: daniil.stas@posteo.net
>>>>> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217719
>>>>> Reported-by: bitlord0xff@gmail.com
>>>>> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217212
>>>>> Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
>>>>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
>>>>
>>>> I will skip rc5 and send this for rc6 on Monday.
>>>>
>>>> Has anyone with suitable AMD system tested this?
>>>
>>> Probably obvious; but I tested with a system that can support both dTPM
>>> and fTPM and swapped between the two before I sent it.
>>
>> Ok, great. I've tested that with non-AMD system things continue to
>> work so I guess that is sufficient enough for:
>>
>> Tested-by: Jarkko Sakkinen <jarkko@kernel.org>
>>
>> BR, Jarkko
> 
> Why is
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=554b841d470338a3b1d6335b14ee1cd0c8f5d754
> in Linus' tree? After we told you on several email threads to take the
> v3, and you said you would, you still took the v2? What are you doing?
> I'm frustrated because this is not the first time you've been out
> to lunch about this stuff. Now there's the wrong stable metadata and the
> fix is incomplete. Shame.

I guess that means I need to re-send out the other fixup that was missed 
in v3 separately and with a new fixes tag against that hash that landed 
in Linus' tree?  Or Jarkko are you going to resolve the differences?
Jason A. Donenfeld Aug. 8, 2023, 12:39 a.m. UTC | #8
On Mon, Aug 07, 2023 at 07:15:44PM -0500, Mario Limonciello wrote:
> 
> 
> On 8/7/23 17:28, Jason A. Donenfeld wrote:
> > On Sat, Aug 05, 2023 at 02:39:11AM +0300, Jarkko Sakkinen wrote:
> >> On Sat Aug 5, 2023 at 2:21 AM EEST, Mario Limonciello wrote:
> >>> On 8/4/23 17:54, Jarkko Sakkinen wrote:
> >>>> On Thu Aug 3, 2023 at 9:24 PM EEST, Mario Limonciello wrote:
> >>>>> The TPM RNG functionality is not necessary for entropy when the CPU
> >>>>> already supports the RDRAND instruction. The TPM RNG functionality
> >>>>> was previously disabled on a subset of AMD fTPM series, but reports
> >>>>> continue to show problems on some systems causing stutter root caused
> >>>>> to TPM RNG functionality.
> >>>>>
> >>>>> Expand disabling TPM RNG use for all AMD fTPMs whether they have versions
> >>>>> that claim to have fixed or not. To accomplish this, move the detection
> >>>>> into part of the TPM CRB registration and add a flag indicating that
> >>>>> the TPM should opt-out of registration to hwrng.
> >>>>>
> >>>>> Cc: stable@vger.kernel.org # 5.5+
> >>>>> Fixes: b006c439d58d ("hwrng: core - start hwrng kthread also for untrusted sources")
> >>>>> Fixes: f1324bbc4011 ("tpm: disable hwrng for fTPM on some AMD designs")
> >>>>> Fixes: 3ef193822b25 ("tpm_crb: fix fTPM on AMD Zen+ CPUs")
> >>>>> Reported-by: daniil.stas@posteo.net
> >>>>> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217719
> >>>>> Reported-by: bitlord0xff@gmail.com
> >>>>> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217212
> >>>>> Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
> >>>>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> >>>>
> >>>> I will skip rc5 and send this for rc6 on Monday.
> >>>>
> >>>> Has anyone with suitable AMD system tested this?
> >>>
> >>> Probably obvious; but I tested with a system that can support both dTPM
> >>> and fTPM and swapped between the two before I sent it.
> >>
> >> Ok, great. I've tested that with non-AMD system things continue to
> >> work so I guess that is sufficient enough for:
> >>
> >> Tested-by: Jarkko Sakkinen <jarkko@kernel.org>
> >>
> >> BR, Jarkko
> > 
> > Why is
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=554b841d470338a3b1d6335b14ee1cd0c8f5d754
> > in Linus' tree? After we told you on several email threads to take the
> > v3, and you said you would, you still took the v2? What are you doing?
> > I'm frustrated because this is not the first time you've been out
> > to lunch about this stuff. Now there's the wrong stable metadata and the
> > fix is incomplete. Shame.
> 
> I guess that means I need to re-send out the other fixup that was missed 
> in v3 separately and with a new fixes tag against that hash that landed 
> in Linus' tree?  Or Jarkko are you going to resolve the differences?

Unless it can be fixed by sending the right patch to Linus and having
the merge commit resolve the difference, and then at least we'll have
the right patch with the right metadata for stable. Barring that, I'll
just be sure to communicate with Greg so things get picked properly.

I'm not sure what's best or what Linus prefers. Linus - Jarkko sent you
the wrong version patch. Do you want a fixup patch that accounts for the
difference, and then I'll address the stable@ metadata deficiency
manually by talking to Greg, or would you rather some merge commit
magic, or something else?

Jason
Linus Torvalds Aug. 8, 2023, 3:26 a.m. UTC | #9
On Mon, 7 Aug 2023 at 17:39, Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>
> I'm not sure what's best or what Linus prefers. Linus - Jarkko sent you
> the wrong version patch. Do you want a fixup patch that accounts for the
> difference, and then I'll address the stable@ metadata deficiency
> manually by talking to Greg, or would you rather some merge commit
> magic, or something else?

Either works for me, whatever ends up being easiest.

However, looking at that v3 patch, that "should we enable/disable the
hwrng" is now repeated *three* times, and that first one is

  if (!IS_ENABLED(CONFIG_HW_RANDOM_TPM) || tpm_is_firmware_upgrade(chip) ||
-     tpm_amd_is_rng_defective(chip))
+     chip->flags & TPM_CHIP_FLAG_HWRNG_DISABLED)

and wants fixing anyway: you want parenthesis around the '&'.

Yes, yes, it works (because bitwise ops have higher precedence than
logical ones), but let's not do that.

But more importantly, can we just have a single helper inline function
for this and *not* repeat the same multi-line expression three times
(just in negated and then 2x non-negated format)?

That test is ugly anyway. Why is "tpm_is_firmware_upgrade()" a wrapper
function around testing "chip->flags", but then right next to it it
tests them explicitly.

So if we have to re-do this all, let's re-do it properly. Ok?

Thinking about it, I do guess that makes it easier to just send an
incremental patch on top.

              Linus
Jason A. Donenfeld Aug. 8, 2023, 5:19 p.m. UTC | #10
On Mon, Aug 07, 2023 at 08:26:03PM -0700, Linus Torvalds wrote:
> On Mon, 7 Aug 2023 at 17:39, Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> >
> > I'm not sure what's best or what Linus prefers. Linus - Jarkko sent you
> > the wrong version patch. Do you want a fixup patch that accounts for the
> > difference, and then I'll address the stable@ metadata deficiency
> > manually by talking to Greg, or would you rather some merge commit
> > magic, or something else?
> 
> Either works for me, whatever ends up being easiest.
> 
> However, looking at that v3 patch, that "should we enable/disable the
> hwrng" is now repeated *three* times, and that first one is
> 
>   if (!IS_ENABLED(CONFIG_HW_RANDOM_TPM) || tpm_is_firmware_upgrade(chip) ||
> -     tpm_amd_is_rng_defective(chip))
> +     chip->flags & TPM_CHIP_FLAG_HWRNG_DISABLED)
> 
> and wants fixing anyway: you want parenthesis around the '&'.
> 
> Yes, yes, it works (because bitwise ops have higher precedence than
> logical ones), but let's not do that.
> 
> But more importantly, can we just have a single helper inline function
> for this and *not* repeat the same multi-line expression three times
> (just in negated and then 2x non-negated format)?
> 
> That test is ugly anyway. Why is "tpm_is_firmware_upgrade()" a wrapper
> function around testing "chip->flags", but then right next to it it
> tests them explicitly.
> 
> So if we have to re-do this all, let's re-do it properly. Ok?
> 
> Thinking about it, I do guess that makes it easier to just send an
> incremental patch on top.

Alright, looks like Mario took care of that:
https://lore.kernel.org/all/20230808041229.22514-1-mario.limonciello@amd.com/
Once this is in your tree I'll ping Greg about the right stable versions
to make up for the wrong tag.

Jason
Linus Torvalds Aug. 9, 2023, 5:06 p.m. UTC | #11
On Tue, 8 Aug 2023 at 10:19, Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>
> Alright, looks like Mario took care of that:

Ok, I just took that patch directly, so that we can close this issue.

It's

    cacc6e22932f ("tpm: Add a helper for checking hwrng enabled")

in my tree, and I'll push out shortly after it's gone through my trivial tests.

                  Linus
Jason A. Donenfeld Aug. 9, 2023, 9:35 p.m. UTC | #12
On Wed, Aug 09, 2023 at 10:06:31AM -0700, Linus Torvalds wrote:
> On Tue, 8 Aug 2023 at 10:19, Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> >
> > Alright, looks like Mario took care of that:
> 
> Ok, I just took that patch directly, so that we can close this issue.
> 
> It's
> 
>     cacc6e22932f ("tpm: Add a helper for checking hwrng enabled")
> 
> in my tree, and I'll push out shortly after it's gone through my trivial tests.

Thanks. Sent onward to stable:
https://lore.kernel.org/stable/ZNQGL6XUtc8WFk1e@zx2c4.com/

Jason
Jarkko Sakkinen Aug. 10, 2023, 2:42 p.m. UTC | #13
On Tue Aug 8, 2023 at 1:28 AM EEST, Jason A. Donenfeld wrote:
> On Sat, Aug 05, 2023 at 02:39:11AM +0300, Jarkko Sakkinen wrote:
> > On Sat Aug 5, 2023 at 2:21 AM EEST, Mario Limonciello wrote:
> > > On 8/4/23 17:54, Jarkko Sakkinen wrote:
> > > > On Thu Aug 3, 2023 at 9:24 PM EEST, Mario Limonciello wrote:
> > > >> The TPM RNG functionality is not necessary for entropy when the CPU
> > > >> already supports the RDRAND instruction. The TPM RNG functionality
> > > >> was previously disabled on a subset of AMD fTPM series, but reports
> > > >> continue to show problems on some systems causing stutter root caused
> > > >> to TPM RNG functionality.
> > > >>
> > > >> Expand disabling TPM RNG use for all AMD fTPMs whether they have versions
> > > >> that claim to have fixed or not. To accomplish this, move the detection
> > > >> into part of the TPM CRB registration and add a flag indicating that
> > > >> the TPM should opt-out of registration to hwrng.
> > > >>
> > > >> Cc: stable@vger.kernel.org # 5.5+
> > > >> Fixes: b006c439d58d ("hwrng: core - start hwrng kthread also for untrusted sources")
> > > >> Fixes: f1324bbc4011 ("tpm: disable hwrng for fTPM on some AMD designs")
> > > >> Fixes: 3ef193822b25 ("tpm_crb: fix fTPM on AMD Zen+ CPUs")
> > > >> Reported-by: daniil.stas@posteo.net
> > > >> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217719
> > > >> Reported-by: bitlord0xff@gmail.com
> > > >> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217212
> > > >> Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
> > > >> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> > > > 
> > > > I will skip rc5 and send this for rc6 on Monday.
> > > > 
> > > > Has anyone with suitable AMD system tested this?
> > >
> > > Probably obvious; but I tested with a system that can support both dTPM 
> > > and fTPM and swapped between the two before I sent it.
> > 
> > Ok, great. I've tested that with non-AMD system things continue to
> > work so I guess that is sufficient enough for:
> > 
> > Tested-by: Jarkko Sakkinen <jarkko@kernel.org>
> > 
> > BR, Jarkko
>
> Why is
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=554b841d470338a3b1d6335b14ee1cd0c8f5d754
> in Linus' tree? After we told you on several email threads to take the
> v3, and you said you would, you still took the v2? What are you doing?
> I'm frustrated because this is not the first time you've been out
> to lunch about this stuff. Now there's the wrong stable metadata and the
> fix is incomplete. Shame.

At least part of it must be transitioning from mutt to aerc but point
taken [1].

Should I revert the commit and send a PR with revert and the correct
patch?

[1] https://social.kernel.org/notice/AYOm9K4QULTHJMCN5E 

BR, Jarkko
Mario Limonciello Aug. 10, 2023, 2:45 p.m. UTC | #14
On 8/10/2023 9:42 AM, Jarkko Sakkinen wrote:
> On Tue Aug 8, 2023 at 1:28 AM EEST, Jason A. Donenfeld wrote:
>> On Sat, Aug 05, 2023 at 02:39:11AM +0300, Jarkko Sakkinen wrote:
>>> On Sat Aug 5, 2023 at 2:21 AM EEST, Mario Limonciello wrote:
>>>> On 8/4/23 17:54, Jarkko Sakkinen wrote:
>>>>> On Thu Aug 3, 2023 at 9:24 PM EEST, Mario Limonciello wrote:
>>>>>> The TPM RNG functionality is not necessary for entropy when the CPU
>>>>>> already supports the RDRAND instruction. The TPM RNG functionality
>>>>>> was previously disabled on a subset of AMD fTPM series, but reports
>>>>>> continue to show problems on some systems causing stutter root caused
>>>>>> to TPM RNG functionality.
>>>>>>
>>>>>> Expand disabling TPM RNG use for all AMD fTPMs whether they have versions
>>>>>> that claim to have fixed or not. To accomplish this, move the detection
>>>>>> into part of the TPM CRB registration and add a flag indicating that
>>>>>> the TPM should opt-out of registration to hwrng.
>>>>>>
>>>>>> Cc: stable@vger.kernel.org # 5.5+
>>>>>> Fixes: b006c439d58d ("hwrng: core - start hwrng kthread also for untrusted sources")
>>>>>> Fixes: f1324bbc4011 ("tpm: disable hwrng for fTPM on some AMD designs")
>>>>>> Fixes: 3ef193822b25 ("tpm_crb: fix fTPM on AMD Zen+ CPUs")
>>>>>> Reported-by: daniil.stas@posteo.net
>>>>>> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217719
>>>>>> Reported-by: bitlord0xff@gmail.com
>>>>>> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217212
>>>>>> Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
>>>>>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
>>>>>
>>>>> I will skip rc5 and send this for rc6 on Monday.
>>>>>
>>>>> Has anyone with suitable AMD system tested this?
>>>>
>>>> Probably obvious; but I tested with a system that can support both dTPM
>>>> and fTPM and swapped between the two before I sent it.
>>>
>>> Ok, great. I've tested that with non-AMD system things continue to
>>> work so I guess that is sufficient enough for:
>>>
>>> Tested-by: Jarkko Sakkinen <jarkko@kernel.org>
>>>
>>> BR, Jarkko
>>
>> Why is
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=554b841d470338a3b1d6335b14ee1cd0c8f5d754
>> in Linus' tree? After we told you on several email threads to take the
>> v3, and you said you would, you still took the v2? What are you doing?
>> I'm frustrated because this is not the first time you've been out
>> to lunch about this stuff. Now there's the wrong stable metadata and the
>> fix is incomplete. Shame.
> 
> At least part of it must be transitioning from mutt to aerc but point
> taken [1].
> 
> Should I revert the commit and send a PR with revert and the correct
> patch?
> 

It's all sorted in Linus' tree now, no need to do anything at this point.

> [1] https://social.kernel.org/notice/AYOm9K4QULTHJMCN5E
> 
> BR, Jarkko
Jarkko Sakkinen Aug. 10, 2023, 3:04 p.m. UTC | #15
On Tue Aug 8, 2023 at 3:15 AM EEST, Mario Limonciello wrote:
>
>
> On 8/7/23 17:28, Jason A. Donenfeld wrote:
> > On Sat, Aug 05, 2023 at 02:39:11AM +0300, Jarkko Sakkinen wrote:
> >> On Sat Aug 5, 2023 at 2:21 AM EEST, Mario Limonciello wrote:
> >>> On 8/4/23 17:54, Jarkko Sakkinen wrote:
> >>>> On Thu Aug 3, 2023 at 9:24 PM EEST, Mario Limonciello wrote:
> >>>>> The TPM RNG functionality is not necessary for entropy when the CPU
> >>>>> already supports the RDRAND instruction. The TPM RNG functionality
> >>>>> was previously disabled on a subset of AMD fTPM series, but reports
> >>>>> continue to show problems on some systems causing stutter root caused
> >>>>> to TPM RNG functionality.
> >>>>>
> >>>>> Expand disabling TPM RNG use for all AMD fTPMs whether they have versions
> >>>>> that claim to have fixed or not. To accomplish this, move the detection
> >>>>> into part of the TPM CRB registration and add a flag indicating that
> >>>>> the TPM should opt-out of registration to hwrng.
> >>>>>
> >>>>> Cc: stable@vger.kernel.org # 5.5+
> >>>>> Fixes: b006c439d58d ("hwrng: core - start hwrng kthread also for untrusted sources")
> >>>>> Fixes: f1324bbc4011 ("tpm: disable hwrng for fTPM on some AMD designs")
> >>>>> Fixes: 3ef193822b25 ("tpm_crb: fix fTPM on AMD Zen+ CPUs")
> >>>>> Reported-by: daniil.stas@posteo.net
> >>>>> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217719
> >>>>> Reported-by: bitlord0xff@gmail.com
> >>>>> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217212
> >>>>> Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
> >>>>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> >>>>
> >>>> I will skip rc5 and send this for rc6 on Monday.
> >>>>
> >>>> Has anyone with suitable AMD system tested this?
> >>>
> >>> Probably obvious; but I tested with a system that can support both dTPM
> >>> and fTPM and swapped between the two before I sent it.
> >>
> >> Ok, great. I've tested that with non-AMD system things continue to
> >> work so I guess that is sufficient enough for:
> >>
> >> Tested-by: Jarkko Sakkinen <jarkko@kernel.org>
> >>
> >> BR, Jarkko
> > 
> > Why is
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=554b841d470338a3b1d6335b14ee1cd0c8f5d754
> > in Linus' tree? After we told you on several email threads to take the
> > v3, and you said you would, you still took the v2? What are you doing?
> > I'm frustrated because this is not the first time you've been out
> > to lunch about this stuff. Now there's the wrong stable metadata and the
> > fix is incomplete. Shame.
>
> I guess that means I need to re-send out the other fixup that was missed 
> in v3 separately and with a new fixes tag against that hash that landed 
> in Linus' tree?  Or Jarkko are you going to resolve the differences?

I can also revert the commit and apply the correct patch. Either
option works for me.

BR, Jarkko
Jarkko Sakkinen Aug. 10, 2023, 3:06 p.m. UTC | #16
On Tue Aug 8, 2023 at 6:26 AM EEST, Linus Torvalds wrote:
> On Mon, 7 Aug 2023 at 17:39, Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> >
> > I'm not sure what's best or what Linus prefers. Linus - Jarkko sent you
> > the wrong version patch. Do you want a fixup patch that accounts for the
> > difference, and then I'll address the stable@ metadata deficiency
> > manually by talking to Greg, or would you rather some merge commit
> > magic, or something else?
>
> Either works for me, whatever ends up being easiest.
>
> However, looking at that v3 patch, that "should we enable/disable the
> hwrng" is now repeated *three* times, and that first one is
>
>   if (!IS_ENABLED(CONFIG_HW_RANDOM_TPM) || tpm_is_firmware_upgrade(chip) ||
> -     tpm_amd_is_rng_defective(chip))
> +     chip->flags & TPM_CHIP_FLAG_HWRNG_DISABLED)
>
> and wants fixing anyway: you want parenthesis around the '&'.
>
> Yes, yes, it works (because bitwise ops have higher precedence than
> logical ones), but let's not do that.
>
> But more importantly, can we just have a single helper inline function
> for this and *not* repeat the same multi-line expression three times
> (just in negated and then 2x non-negated format)?
>
> That test is ugly anyway. Why is "tpm_is_firmware_upgrade()" a wrapper
> function around testing "chip->flags", but then right next to it it
> tests them explicitly.
>
> So if we have to re-do this all, let's re-do it properly. Ok?
>
> Thinking about it, I do guess that makes it easier to just send an
> incremental patch on top.
>
>               Linus

What if I just revert the commit, apply the correct one, and send a PR?

BR, Jarkko
Jason A. Donenfeld Aug. 10, 2023, 3:14 p.m. UTC | #17
On Thu, Aug 10, 2023 at 06:06:48PM +0300, Jarkko Sakkinen wrote:
> On Tue Aug 8, 2023 at 6:26 AM EEST, Linus Torvalds wrote:
> > On Mon, 7 Aug 2023 at 17:39, Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> > >
> > > I'm not sure what's best or what Linus prefers. Linus - Jarkko sent you
> > > the wrong version patch. Do you want a fixup patch that accounts for the
> > > difference, and then I'll address the stable@ metadata deficiency
> > > manually by talking to Greg, or would you rather some merge commit
> > > magic, or something else?
> >
> > Either works for me, whatever ends up being easiest.
> >
> > However, looking at that v3 patch, that "should we enable/disable the
> > hwrng" is now repeated *three* times, and that first one is
> >
> >   if (!IS_ENABLED(CONFIG_HW_RANDOM_TPM) || tpm_is_firmware_upgrade(chip) ||
> > -     tpm_amd_is_rng_defective(chip))
> > +     chip->flags & TPM_CHIP_FLAG_HWRNG_DISABLED)
> >
> > and wants fixing anyway: you want parenthesis around the '&'.
> >
> > Yes, yes, it works (because bitwise ops have higher precedence than
> > logical ones), but let's not do that.
> >
> > But more importantly, can we just have a single helper inline function
> > for this and *not* repeat the same multi-line expression three times
> > (just in negated and then 2x non-negated format)?
> >
> > That test is ugly anyway. Why is "tpm_is_firmware_upgrade()" a wrapper
> > function around testing "chip->flags", but then right next to it it
> > tests them explicitly.
> >
> > So if we have to re-do this all, let's re-do it properly. Ok?
> >
> > Thinking about it, I do guess that makes it easier to just send an
> > incremental patch on top.
> >
> >               Linus
> 
> What if I just revert the commit, apply the correct one, and send a PR?

No, stop, please. We have it sorted already. There's a thread with Greg
now about the stable backport you can weigh in on, though. Please just
read all your emails from the last week, and then it'll be clear what's
up. Do that catchup before taking further actions, please.

Jason
Jarkko Sakkinen Aug. 10, 2023, 3:37 p.m. UTC | #18
On Wed Aug 9, 2023 at 8:06 PM EEST, Linus Torvalds wrote:
> On Tue, 8 Aug 2023 at 10:19, Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> >
> > Alright, looks like Mario took care of that:
>
> Ok, I just took that patch directly, so that we can close this issue.
>
> It's
>
>     cacc6e22932f ("tpm: Add a helper for checking hwrng enabled")
>
> in my tree, and I'll push out shortly after it's gone through my trivial tests.
>
>                   Linus

Thank you.

I'm still sending a small PR with fixup to disable tpm_tis irq usage
across the board tonight or tomorrow morning (GMT+3).

BR, Jarkko
diff mbox series

Patch

diff --git a/drivers/char/tpm/tpm-chip.c b/drivers/char/tpm/tpm-chip.c
index cf5499e51999b..8f61b784810d6 100644
--- a/drivers/char/tpm/tpm-chip.c
+++ b/drivers/char/tpm/tpm-chip.c
@@ -510,70 +510,6 @@  static int tpm_add_legacy_sysfs(struct tpm_chip *chip)
 	return 0;
 }
 
-/*
- * Some AMD fTPM versions may cause stutter
- * https://www.amd.com/en/support/kb/faq/pa-410
- *
- * Fixes are available in two series of fTPM firmware:
- * 6.x.y.z series: 6.0.18.6 +
- * 3.x.y.z series: 3.57.y.5 +
- */
-#ifdef CONFIG_X86
-static bool tpm_amd_is_rng_defective(struct tpm_chip *chip)
-{
-	u32 val1, val2;
-	u64 version;
-	int ret;
-
-	if (!(chip->flags & TPM_CHIP_FLAG_TPM2))
-		return false;
-
-	ret = tpm_request_locality(chip);
-	if (ret)
-		return false;
-
-	ret = tpm2_get_tpm_pt(chip, TPM2_PT_MANUFACTURER, &val1, NULL);
-	if (ret)
-		goto release;
-	if (val1 != 0x414D4400U /* AMD */) {
-		ret = -ENODEV;
-		goto release;
-	}
-	ret = tpm2_get_tpm_pt(chip, TPM2_PT_FIRMWARE_VERSION_1, &val1, NULL);
-	if (ret)
-		goto release;
-	ret = tpm2_get_tpm_pt(chip, TPM2_PT_FIRMWARE_VERSION_2, &val2, NULL);
-
-release:
-	tpm_relinquish_locality(chip);
-
-	if (ret)
-		return false;
-
-	version = ((u64)val1 << 32) | val2;
-	if ((version >> 48) == 6) {
-		if (version >= 0x0006000000180006ULL)
-			return false;
-	} else if ((version >> 48) == 3) {
-		if (version >= 0x0003005700000005ULL)
-			return false;
-	} else {
-		return false;
-	}
-
-	dev_warn(&chip->dev,
-		 "AMD fTPM version 0x%llx causes system stutter; hwrng disabled\n",
-		 version);
-
-	return true;
-}
-#else
-static inline bool tpm_amd_is_rng_defective(struct tpm_chip *chip)
-{
-	return false;
-}
-#endif /* CONFIG_X86 */
-
 static int tpm_hwrng_read(struct hwrng *rng, void *data, size_t max, bool wait)
 {
 	struct tpm_chip *chip = container_of(rng, struct tpm_chip, hwrng);
@@ -588,7 +524,7 @@  static int tpm_hwrng_read(struct hwrng *rng, void *data, size_t max, bool wait)
 static int tpm_add_hwrng(struct tpm_chip *chip)
 {
 	if (!IS_ENABLED(CONFIG_HW_RANDOM_TPM) || tpm_is_firmware_upgrade(chip) ||
-	    tpm_amd_is_rng_defective(chip))
+	    chip->flags & TPM_CHIP_FLAG_HWRNG_DISABLED)
 		return 0;
 
 	snprintf(chip->hwrng_name, sizeof(chip->hwrng_name),
@@ -693,7 +629,8 @@  int tpm_chip_register(struct tpm_chip *chip)
 	return 0;
 
 out_hwrng:
-	if (IS_ENABLED(CONFIG_HW_RANDOM_TPM) && !tpm_is_firmware_upgrade(chip))
+	if (IS_ENABLED(CONFIG_HW_RANDOM_TPM) && !tpm_is_firmware_upgrade(chip) &&
+	    !(chip->flags & TPM_CHIP_FLAG_HWRNG_DISABLED))
 		hwrng_unregister(&chip->hwrng);
 out_ppi:
 	tpm_bios_log_teardown(chip);
@@ -719,7 +656,7 @@  void tpm_chip_unregister(struct tpm_chip *chip)
 {
 	tpm_del_legacy_sysfs(chip);
 	if (IS_ENABLED(CONFIG_HW_RANDOM_TPM) && !tpm_is_firmware_upgrade(chip) &&
-	    !tpm_amd_is_rng_defective(chip))
+	    !(chip->flags & TPM_CHIP_FLAG_HWRNG_DISABLED))
 		hwrng_unregister(&chip->hwrng);
 	tpm_bios_log_teardown(chip);
 	if (chip->flags & TPM_CHIP_FLAG_TPM2 && !tpm_is_firmware_upgrade(chip))
diff --git a/drivers/char/tpm/tpm_crb.c b/drivers/char/tpm/tpm_crb.c
index 1a5d09b185134..9eb1a18590123 100644
--- a/drivers/char/tpm/tpm_crb.c
+++ b/drivers/char/tpm/tpm_crb.c
@@ -463,6 +463,28 @@  static bool crb_req_canceled(struct tpm_chip *chip, u8 status)
 	return (cancel & CRB_CANCEL_INVOKE) == CRB_CANCEL_INVOKE;
 }
 
+static int crb_check_flags(struct tpm_chip *chip)
+{
+	u32 val;
+	int ret;
+
+	ret = crb_request_locality(chip, 0);
+	if (ret)
+		return ret;
+
+	ret = tpm2_get_tpm_pt(chip, TPM2_PT_MANUFACTURER, &val, NULL);
+	if (ret)
+		goto release;
+
+	if (val == 0x414D4400U /* AMD */)
+		chip->flags |= TPM_CHIP_FLAG_HWRNG_DISABLED;
+
+release:
+	crb_relinquish_locality(chip, 0);
+
+	return ret;
+}
+
 static const struct tpm_class_ops tpm_crb = {
 	.flags = TPM_OPS_AUTO_STARTUP,
 	.status = crb_status,
@@ -800,6 +822,14 @@  static int crb_acpi_add(struct acpi_device *device)
 	chip->acpi_dev_handle = device->handle;
 	chip->flags = TPM_CHIP_FLAG_TPM2;
 
+	rc = tpm_chip_bootstrap(chip);
+	if (rc)
+		goto out;
+
+	rc = crb_check_flags(chip);
+	if (rc)
+		goto out;
+
 	rc = tpm_chip_register(chip);
 
 out:
diff --git a/include/linux/tpm.h b/include/linux/tpm.h
index 6a1e8f1572551..4ee9d13749adc 100644
--- a/include/linux/tpm.h
+++ b/include/linux/tpm.h
@@ -283,6 +283,7 @@  enum tpm_chip_flags {
 	TPM_CHIP_FLAG_FIRMWARE_POWER_MANAGED	= BIT(6),
 	TPM_CHIP_FLAG_FIRMWARE_UPGRADE		= BIT(7),
 	TPM_CHIP_FLAG_SUSPENDED			= BIT(8),
+	TPM_CHIP_FLAG_HWRNG_DISABLED		= BIT(9),
 };
 
 #define to_tpm_chip(d) container_of(d, struct tpm_chip, dev)