diff mbox series

[v4] crypto: jitter - permanent and intermittent health errors

Message ID 4478169.LvFx2qVVIh@positron.chronox.de (mailing list archive)
State Accepted
Delegated to: Herbert Xu
Headers show
Series [v4] crypto: jitter - permanent and intermittent health errors | expand

Commit Message

Stephan Mueller March 27, 2023, 7:03 a.m. UTC
According to SP800-90B, two health failures are allowed: the intermittend
and the permanent failure. So far, only the intermittent failure was
implemented. The permanent failure was achieved by resetting the entire
entropy source including its health test state and waiting for two or
more back-to-back health errors.

This approach is appropriate for RCT, but not for APT as APT has a
non-linear cutoff value. Thus, this patch implements 2 cutoff values
for both RCT/APT. This implies that the health state is left untouched
when an intermittent failure occurs. The noise source is reset
and a new APT powerup-self test is performed. Yet, whith the unchanged
health test state, the counting of failures continues until a permanent
failure is reached.

Any non-failing raw entropy value causes the health tests to reset.

The intermittent error has an unchanged significance level of 2^-30.
The permanent error has a significance level of 2^-60. Considering that
this level also indicates a false-positive rate (see SP800-90B section 4.2)
a false-positive must only be incurred with a low probability when
considering a fleet of Linux kernels as a whole. Hitting the permanent
error may cause a panic(), the following calculation applies: Assuming
that a fleet of 10^9 Linux kernels run concurrently with this patch in
FIPS mode and on each kernel 2 health tests are performed every minute
for one year, the chances of a false positive is about 1:1000
based on the binomial distribution.

In addition, any power-up health test errors triggered with
jent_entropy_init are treated as permanent errors.

A permanent failure causes the entire entropy source to permanently
return an error. This implies that a caller can only remedy the situation
by re-allocating a new instance of the Jitter RNG. In a subsequent
patch, a transparent re-allocation will be provided which also changes
the implied heuristic entropy assessment.

In addition, when the kernel is booted with fips=1, the Jitter RNG
is defined to be part of a FIPS module. The permanent error of the
Jitter RNG is translated as a FIPS module error. In this case, the entire
FIPS module must cease operation. This is implemented in the kernel by
invoking panic().

The patch also fixes an off-by-one in the RCT cutoff value which is now
set to 30 instead of 31. This is because the counting of the values
starts with 0.

Reviewed-by: Vladis Dronov <vdronov@redhat.com>
Signed-off-by: Stephan Mueller <smueller@chronox.de>
---

v4:
 - fix comment regarding fips=1
 - update patch subject to match common naming schema
 - remove now unused jent_panic function
 - added Reviewed-by line

v3:
 - remove an unused goto target

v2:
 - Drop the enforcement of permanent disabling the entropy source

 crypto/jitterentropy-kcapi.c |  51 ++++++-------
 crypto/jitterentropy.c       | 144 +++++++++++++----------------------
 crypto/jitterentropy.h       |   1 -
 3 files changed, 76 insertions(+), 120 deletions(-)

Comments

Marcelo Henrique Cerri March 27, 2023, 4:40 p.m. UTC | #1
It looks good to me too.

I also did a quick smoke test using AF_ALG and I didn't find any issues.

Average of 10 runs doing a million reads of 8, 32, 64 and 128 bytes each
time with drbg_nopr_sha256.

Without the patch:

bsize  count    total (secs)  user (secs)  system (secs)
8      1000000  3,739         0,483        3,247
32     1000000  3,835         0,49         3,337
64     1000000  4,652         0,502        4,14
128    1000000  6,3           0,562        5,73

With the patch:

bsize  count    total (secs)  user (secs)  system (secs)
8      1000000  3,376         0,429        2,936
32     1000000  3,361         0,422        2,927
64     1000000  4,072         0,446        3,614
128    1000000  5,439         0,424        4,981

Reviewed-by: Marcelo Henrique Cerri <marcelo.cerri@canonical.com>

--
Regards,
Marcelo
Herbert Xu April 6, 2023, 8:50 a.m. UTC | #2
On Mon, Mar 27, 2023 at 09:03:52AM +0200, Stephan Müller wrote:
> According to SP800-90B, two health failures are allowed: the intermittend
> and the permanent failure. So far, only the intermittent failure was
> implemented. The permanent failure was achieved by resetting the entire
> entropy source including its health test state and waiting for two or
> more back-to-back health errors.
> 
> This approach is appropriate for RCT, but not for APT as APT has a
> non-linear cutoff value. Thus, this patch implements 2 cutoff values
> for both RCT/APT. This implies that the health state is left untouched
> when an intermittent failure occurs. The noise source is reset
> and a new APT powerup-self test is performed. Yet, whith the unchanged
> health test state, the counting of failures continues until a permanent
> failure is reached.
> 
> Any non-failing raw entropy value causes the health tests to reset.
> 
> The intermittent error has an unchanged significance level of 2^-30.
> The permanent error has a significance level of 2^-60. Considering that
> this level also indicates a false-positive rate (see SP800-90B section 4.2)
> a false-positive must only be incurred with a low probability when
> considering a fleet of Linux kernels as a whole. Hitting the permanent
> error may cause a panic(), the following calculation applies: Assuming
> that a fleet of 10^9 Linux kernels run concurrently with this patch in
> FIPS mode and on each kernel 2 health tests are performed every minute
> for one year, the chances of a false positive is about 1:1000
> based on the binomial distribution.
> 
> In addition, any power-up health test errors triggered with
> jent_entropy_init are treated as permanent errors.
> 
> A permanent failure causes the entire entropy source to permanently
> return an error. This implies that a caller can only remedy the situation
> by re-allocating a new instance of the Jitter RNG. In a subsequent
> patch, a transparent re-allocation will be provided which also changes
> the implied heuristic entropy assessment.
> 
> In addition, when the kernel is booted with fips=1, the Jitter RNG
> is defined to be part of a FIPS module. The permanent error of the
> Jitter RNG is translated as a FIPS module error. In this case, the entire
> FIPS module must cease operation. This is implemented in the kernel by
> invoking panic().
> 
> The patch also fixes an off-by-one in the RCT cutoff value which is now
> set to 30 instead of 31. This is because the counting of the values
> starts with 0.
> 
> Reviewed-by: Vladis Dronov <vdronov@redhat.com>
> Signed-off-by: Stephan Mueller <smueller@chronox.de>
> ---
> 
> v4:
>  - fix comment regarding fips=1
>  - update patch subject to match common naming schema
>  - remove now unused jent_panic function
>  - added Reviewed-by line
> 
> v3:
>  - remove an unused goto target
> 
> v2:
>  - Drop the enforcement of permanent disabling the entropy source
> 
>  crypto/jitterentropy-kcapi.c |  51 ++++++-------
>  crypto/jitterentropy.c       | 144 +++++++++++++----------------------
>  crypto/jitterentropy.h       |   1 -
>  3 files changed, 76 insertions(+), 120 deletions(-)

Patch applied.  Thanks.
diff mbox series

Patch

diff --git a/crypto/jitterentropy-kcapi.c b/crypto/jitterentropy-kcapi.c
index 2d115bec15ae..b9edfaa51b27 100644
--- a/crypto/jitterentropy-kcapi.c
+++ b/crypto/jitterentropy-kcapi.c
@@ -37,6 +37,7 @@ 
  * DAMAGE.
  */
 
+#include <linux/fips.h>
 #include <linux/kernel.h>
 #include <linux/module.h>
 #include <linux/slab.h>
@@ -59,11 +60,6 @@  void jent_zfree(void *ptr)
 	kfree_sensitive(ptr);
 }
 
-void jent_panic(char *s)
-{
-	panic("%s", s);
-}
-
 void jent_memcpy(void *dest, const void *src, unsigned int n)
 {
 	memcpy(dest, src, n);
@@ -102,7 +98,6 @@  void jent_get_nstime(__u64 *out)
 struct jitterentropy {
 	spinlock_t jent_lock;
 	struct rand_data *entropy_collector;
-	unsigned int reset_cnt;
 };
 
 static int jent_kcapi_init(struct crypto_tfm *tfm)
@@ -138,32 +133,30 @@  static int jent_kcapi_random(struct crypto_rng *tfm,
 
 	spin_lock(&rng->jent_lock);
 
-	/* Return a permanent error in case we had too many resets in a row. */
-	if (rng->reset_cnt > (1<<10)) {
-		ret = -EFAULT;
-		goto out;
-	}
-
 	ret = jent_read_entropy(rng->entropy_collector, rdata, dlen);
 
-	/* Reset RNG in case of health failures */
-	if (ret < -1) {
-		pr_warn_ratelimited("Reset Jitter RNG due to health test failure: %s failure\n",
-				    (ret == -2) ? "Repetition Count Test" :
-						  "Adaptive Proportion Test");
-
-		rng->reset_cnt++;
-
+	if (ret == -3) {
+		/* Handle permanent health test error */
+		/*
+		 * If the kernel was booted with fips=1, it implies that
+		 * the entire kernel acts as a FIPS 140 module. In this case
+		 * an SP800-90B permanent health test error is treated as
+		 * a FIPS module error.
+		 */
+		if (fips_enabled)
+			panic("Jitter RNG permanent health test failure\n");
+
+		pr_err("Jitter RNG permanent health test failure\n");
+		ret = -EFAULT;
+	} else if (ret == -2) {
+		/* Handle intermittent health test error */
+		pr_warn_ratelimited("Reset Jitter RNG due to intermittent health test failure\n");
 		ret = -EAGAIN;
-	} else {
-		rng->reset_cnt = 0;
-
-		/* Convert the Jitter RNG error into a usable error code */
-		if (ret == -1)
-			ret = -EINVAL;
+	} else if (ret == -1) {
+		/* Handle other errors */
+		ret = -EINVAL;
 	}
 
-out:
 	spin_unlock(&rng->jent_lock);
 
 	return ret;
@@ -197,6 +190,10 @@  static int __init jent_mod_init(void)
 
 	ret = jent_entropy_init();
 	if (ret) {
+		/* Handle permanent health test error */
+		if (fips_enabled)
+			panic("jitterentropy: Initialization failed with host not compliant with requirements: %d\n", ret);
+
 		pr_info("jitterentropy: Initialization failed with host not compliant with requirements: %d\n", ret);
 		return -EFAULT;
 	}
diff --git a/crypto/jitterentropy.c b/crypto/jitterentropy.c
index 93bff3213823..22f48bf4c6f5 100644
--- a/crypto/jitterentropy.c
+++ b/crypto/jitterentropy.c
@@ -85,10 +85,14 @@  struct rand_data {
 				      * bit generation */
 
 	/* Repetition Count Test */
-	int rct_count;			/* Number of stuck values */
+	unsigned int rct_count;			/* Number of stuck values */
 
-	/* Adaptive Proportion Test for a significance level of 2^-30 */
+	/* Intermittent health test failure threshold of 2^-30 */
+#define JENT_RCT_CUTOFF		30	/* Taken from SP800-90B sec 4.4.1 */
 #define JENT_APT_CUTOFF		325	/* Taken from SP800-90B sec 4.4.2 */
+	/* Permanent health test failure threshold of 2^-60 */
+#define JENT_RCT_CUTOFF_PERMANENT	60
+#define JENT_APT_CUTOFF_PERMANENT	355
 #define JENT_APT_WINDOW_SIZE	512	/* Data window size */
 	/* LSB of time stamp to process */
 #define JENT_APT_LSB		16
@@ -97,8 +101,6 @@  struct rand_data {
 	unsigned int apt_count;		/* APT counter */
 	unsigned int apt_base;		/* APT base reference */
 	unsigned int apt_base_set:1;	/* APT base reference set? */
-
-	unsigned int health_failure:1;	/* Permanent health failure */
 };
 
 /* Flags that can be used to initialize the RNG */
@@ -169,19 +171,26 @@  static void jent_apt_insert(struct rand_data *ec, unsigned int delta_masked)
 		return;
 	}
 
-	if (delta_masked == ec->apt_base) {
+	if (delta_masked == ec->apt_base)
 		ec->apt_count++;
 
-		if (ec->apt_count >= JENT_APT_CUTOFF)
-			ec->health_failure = 1;
-	}
-
 	ec->apt_observations++;
 
 	if (ec->apt_observations >= JENT_APT_WINDOW_SIZE)
 		jent_apt_reset(ec, delta_masked);
 }
 
+/* APT health test failure detection */
+static int jent_apt_permanent_failure(struct rand_data *ec)
+{
+	return (ec->apt_count >= JENT_APT_CUTOFF_PERMANENT) ? 1 : 0;
+}
+
+static int jent_apt_failure(struct rand_data *ec)
+{
+	return (ec->apt_count >= JENT_APT_CUTOFF) ? 1 : 0;
+}
+
 /***************************************************************************
  * Stuck Test and its use as Repetition Count Test
  *
@@ -206,55 +215,14 @@  static void jent_apt_insert(struct rand_data *ec, unsigned int delta_masked)
  */
 static void jent_rct_insert(struct rand_data *ec, int stuck)
 {
-	/*
-	 * If we have a count less than zero, a previous RCT round identified
-	 * a failure. We will not overwrite it.
-	 */
-	if (ec->rct_count < 0)
-		return;
-
 	if (stuck) {
 		ec->rct_count++;
-
-		/*
-		 * The cutoff value is based on the following consideration:
-		 * alpha = 2^-30 as recommended in FIPS 140-2 IG 9.8.
-		 * In addition, we require an entropy value H of 1/OSR as this
-		 * is the minimum entropy required to provide full entropy.
-		 * Note, we collect 64 * OSR deltas for inserting them into
-		 * the entropy pool which should then have (close to) 64 bits
-		 * of entropy.
-		 *
-		 * Note, ec->rct_count (which equals to value B in the pseudo
-		 * code of SP800-90B section 4.4.1) starts with zero. Hence
-		 * we need to subtract one from the cutoff value as calculated
-		 * following SP800-90B.
-		 */
-		if ((unsigned int)ec->rct_count >= (31 * ec->osr)) {
-			ec->rct_count = -1;
-			ec->health_failure = 1;
-		}
 	} else {
+		/* Reset RCT */
 		ec->rct_count = 0;
 	}
 }
 
-/*
- * Is there an RCT health test failure?
- *
- * @ec [in] Reference to entropy collector
- *
- * @return
- * 	0 No health test failure
- * 	1 Permanent health test failure
- */
-static int jent_rct_failure(struct rand_data *ec)
-{
-	if (ec->rct_count < 0)
-		return 1;
-	return 0;
-}
-
 static inline __u64 jent_delta(__u64 prev, __u64 next)
 {
 #define JENT_UINT64_MAX		(__u64)(~((__u64) 0))
@@ -303,18 +271,26 @@  static int jent_stuck(struct rand_data *ec, __u64 current_delta)
 	return 0;
 }
 
-/*
- * Report any health test failures
- *
- * @ec [in] Reference to entropy collector
- *
- * @return
- * 	0 No health test failure
- * 	1 Permanent health test failure
- */
+/* RCT health test failure detection */
+static int jent_rct_permanent_failure(struct rand_data *ec)
+{
+	return (ec->rct_count >= JENT_RCT_CUTOFF_PERMANENT) ? 1 : 0;
+}
+
+static int jent_rct_failure(struct rand_data *ec)
+{
+	return (ec->rct_count >= JENT_RCT_CUTOFF) ? 1 : 0;
+}
+
+/* Report of health test failures */
 static int jent_health_failure(struct rand_data *ec)
 {
-	return ec->health_failure;
+	return jent_rct_failure(ec) | jent_apt_failure(ec);
+}
+
+static int jent_permanent_health_failure(struct rand_data *ec)
+{
+	return jent_rct_permanent_failure(ec) | jent_apt_permanent_failure(ec);
 }
 
 /***************************************************************************
@@ -600,8 +576,8 @@  static void jent_gen_entropy(struct rand_data *ec)
  *
  * The following error codes can occur:
  *	-1	entropy_collector is NULL
- *	-2	RCT failed
- *	-3	APT test failed
+ *	-2	Intermittent health failure
+ *	-3	Permanent health failure
  */
 int jent_read_entropy(struct rand_data *ec, unsigned char *data,
 		      unsigned int len)
@@ -616,39 +592,23 @@  int jent_read_entropy(struct rand_data *ec, unsigned char *data,
 
 		jent_gen_entropy(ec);
 
-		if (jent_health_failure(ec)) {
-			int ret;
-
-			if (jent_rct_failure(ec))
-				ret = -2;
-			else
-				ret = -3;
-
+		if (jent_permanent_health_failure(ec)) {
 			/*
-			 * Re-initialize the noise source
-			 *
-			 * If the health test fails, the Jitter RNG remains
-			 * in failure state and will return a health failure
-			 * during next invocation.
+			 * At this point, the Jitter RNG instance is considered
+			 * as a failed instance. There is no rerun of the
+			 * startup test any more, because the caller
+			 * is assumed to not further use this instance.
 			 */
-			if (jent_entropy_init())
-				return ret;
-
-			/* Set APT to initial state */
-			jent_apt_reset(ec, 0);
-			ec->apt_base_set = 0;
-
-			/* Set RCT to initial state */
-			ec->rct_count = 0;
-
-			/* Re-enable Jitter RNG */
-			ec->health_failure = 0;
-
+			return -3;
+		} else if (jent_health_failure(ec)) {
 			/*
-			 * Return the health test failure status to the
-			 * caller as the generated value is not appropriate.
+			 * Perform startup health tests and return permanent
+			 * error if it fails.
 			 */
-			return ret;
+			if (jent_entropy_init())
+				return -3;
+
+			return -2;
 		}
 
 		if ((DATA_SIZE_BITS / 8) < len)
diff --git a/crypto/jitterentropy.h b/crypto/jitterentropy.h
index b7397b617ef0..5cc583f6bc6b 100644
--- a/crypto/jitterentropy.h
+++ b/crypto/jitterentropy.h
@@ -2,7 +2,6 @@ 
 
 extern void *jent_zalloc(unsigned int len);
 extern void jent_zfree(void *ptr);
-extern void jent_panic(char *s);
 extern void jent_memcpy(void *dest, const void *src, unsigned int n);
 extern void jent_get_nstime(__u64 *out);