[v2,05/63] stddef: Introduce struct_group() helper macro

Message ID	20210818060533.3569517-6-keescook@chromium.org (mailing list archive)
State	New, archived
Headers	show Return-Path: <linux-kbuild-owner@kernel.org> From: Kees Cook <keescook@chromium.org> To: linux-kernel@vger.kernel.org Cc: Kees Cook <keescook@chromium.org>, Keith Packard <keithp@keithp.com>, "Gustavo A . R . Silva" <gustavoars@kernel.org>, Rasmus Villemoes <linux@rasmusvillemoes.dk>, Dan Williams <dan.j.williams@intel.com>, Daniel Vetter <daniel.vetter@ffwll.ch>, Greg Kroah-Hartman <gregkh@linuxfoundation.org>, Andrew Morton <akpm@linux-foundation.org>, linux-wireless@vger.kernel.org, netdev@vger.kernel.org, dri-devel@lists.freedesktop.org, linux-staging@lists.linux.dev, linux-block@vger.kernel.org, linux-kbuild@vger.kernel.org, clang-built-linux@googlegroups.com, linux-hardening@vger.kernel.org Subject: [PATCH v2 05/63] stddef: Introduce struct_group() helper macro Date: Tue, 17 Aug 2021 23:04:35 -0700 Message-Id: <20210818060533.3569517-6-keescook@chromium.org> In-Reply-To: <20210818060533.3569517-1-keescook@chromium.org> References: <20210818060533.3569517-1-keescook@chromium.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk
Series	Introduce strict memcpy() bounds checking \| expand [v2,00/63] Introduce strict memcpy() bounds checking [v2,01/63] ipw2x00: Avoid field-overflowing memcpy() [v2,02/63] net/mlx5e: Avoid field-overflowing memcpy() [v2,03/63] rpmsg: glink: Replace strncpy() with strscpy_pad() [v2,04/63] pcmcia: ray_cs: Split memcpy() to avoid bounds check warning [v2,05/63] stddef: Introduce struct_group() helper macro [v2,06/63] cxl/core: Replace unions with struct_group() [v2,07/63] skbuff: Switch structure bounds to struct_group() [v2,08/63] bnxt_en: Use struct_group_attr() for memcpy() region [v2,09/63] mwl8k: Use struct_group() for memcpy() region [v2,10/63] libertas: Use struct_group() for memcpy() region [v2,11/63] libertas_tf: Use struct_group() for memcpy() region [v2,12/63] thermal: intel: int340x_thermal: Use struct_group() for memcpy() region [v2,13/63] iommu/amd: Use struct_group() for memcpy() region [v2,14/63] cxgb3: Use struct_group() for memcpy() region [v2,15/63] intersil: Use struct_group() for memcpy() region [v2,16/63] cxgb4: Use struct_group() for memcpy() region [v2,17/63] bnx2x: Use struct_group() for memcpy() region [v2,18/63] drm/amd/pm: Use struct_group() for memcpy() region [v2,19/63] staging: wlan-ng: Use struct_group() for memcpy() region [v2,20/63] drm/mga/mga_ioc32: Use struct_group() for memcpy() region [v2,21/63] net/mlx5e: Use struct_group() for memcpy() region [v2,22/63] HID: cp2112: Use struct_group() for memcpy() region [v2,23/63] media: omap3isp: Use struct_group() for memcpy() region [v2,24/63] sata_fsl: Use struct_group() for memcpy() region [v2,25/63] compiler_types.h: Remove __compiletime_object_size() [v2,26/63] lib/string: Move helper functions out of string.c [v2,27/63] fortify: Move remaining fortify helpers into fortify-string.h [v2,28/63] fortify: Explicitly disable Clang support [v2,29/63] fortify: Fix dropped strcpy() compile-time write overflow check [v2,30/63] fortify: Prepare to improve strnlen() and strlen() warnings [v2,31/63] fortify: Allow strlen() and strnlen() to pass compile-time known lengths [v2,32/63] fortify: Add compile-time FORTIFY_SOURCE tests [v2,33/63] lib: Introduce CONFIG_TEST_MEMCPY [v2,34/63] fortify: Detect struct member overflows in memcpy() at compile-time [v2,35/63] fortify: Detect struct member overflows in memmove() at compile-time [v2,36/63] scsi: ibmvscsi: Avoid multi-field memset() overflow by aiming at srp [v2,37/63] string.h: Introduce memset_after() for wiping trailing members/padding [v2,38/63] xfrm: Use memset_after() to clear padding [v2,39/63] ipv6: Use memset_after() to zero rt6_info [v2,40/63] netfilter: conntrack: Use memset_startat() to zero struct nf_conn [v2,41/63] net: 802: Use memset_startat() to clear struct fields [v2,42/63] net: dccp: Use memset_startat() for TP zeroing [v2,43/63] net: qede: Use memset_startat() for counters [v2,44/63] mac80211: Use memset_after() to clear tx status [v2,45/63] ath11k: Use memset_startat() for clearing queue descriptors [v2,46/63] iw_cxgb4: Use memset_startat() for cpl_t5_pass_accept_rpl [v2,47/63] intel_th: msu: Use memset_startat() for clearing hw header [v2,48/63] IB/mthca: Use memset_startat() for clearing mpt_entry [v2,49/63] btrfs: Use memset_startat() to clear end of struct [v2,50/63] tracing: Use memset_startat() to zero struct trace_iterator [v2,51/63] drbd: Use struct_group() to zero algs [v2,52/63] cm4000_cs: Use struct_group() to zero struct cm4000_dev region [v2,53/63] KVM: x86: Use struct_group() to zero decode cache [v2,54/63] dm integrity: Use struct_group() to zero struct journal_sector [v2,55/63] HID: roccat: Use struct_group() to zero kone_mouse_event [v2,56/63] RDMA/mlx5: Use struct_group() to zero struct mlx5_ib_mr [v2,57/63] powerpc/signal32: Use struct_group() to zero spe regs [v2,58/63] ethtool: stats: Use struct_group() to clear all stats at once [v2,59/63] can: flexcan: Use struct_group() to zero struct flexcan_regs regions [v2,60/63] net/af_iucv: Use struct_group() to zero struct iucv_sock region [v2,61/63] powerpc: Split memset() to avoid multi-field overflow [v2,62/63] fortify: Detect struct member overflows in memset() at compile-time [v2,63/63] fortify: Work around Clang inlining bugs

Message ID

20210818060533.3569517-6-keescook@chromium.org (mailing list archive)

State

New, archived

Headers

From: Kees Cook <keescook@chromium.org>
To: linux-kernel@vger.kernel.org
Cc: Kees Cook <keescook@chromium.org>,
        Keith Packard <keithp@keithp.com>,
        "Gustavo A . R . Silva" <gustavoars@kernel.org>,
        Rasmus Villemoes <linux@rasmusvillemoes.dk>,
        Dan Williams <dan.j.williams@intel.com>,
        Daniel Vetter <daniel.vetter@ffwll.ch>,
        Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
        Andrew Morton <akpm@linux-foundation.org>,
        linux-wireless@vger.kernel.org, netdev@vger.kernel.org,
        dri-devel@lists.freedesktop.org, linux-staging@lists.linux.dev,
        linux-block@vger.kernel.org, linux-kbuild@vger.kernel.org,
        clang-built-linux@googlegroups.com, linux-hardening@vger.kernel.org
Subject: [PATCH v2 05/63] stddef: Introduce struct_group() helper macro
Date: Tue, 17 Aug 2021 23:04:35 -0700
Message-Id: <20210818060533.3569517-6-keescook@chromium.org>
In-Reply-To: <20210818060533.3569517-1-keescook@chromium.org>
References: <20210818060533.3569517-1-keescook@chromium.org>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Precedence: bulk

Series

Introduce strict memcpy() bounds checking | expand

Commit Message

Kees Cook Aug. 18, 2021, 6:04 a.m. UTC

Kernel code has a regular need to describe groups of members within a
structure usually when they need to be copied or initialized separately
from the rest of the surrounding structure. The generally accepted design
pattern in C is to use a named sub-struct:

	struct foo {
		int one;
		struct {
			int two;
			int three, four;
		} thing;
		int five;
	};

This would allow for traditional references and sizing:

	memcpy(&dst.thing, &src.thing, sizeof(dst.thing));

However, doing this would mean that referencing struct members enclosed
by such named structs would always require including the sub-struct name
in identifiers:

	do_something(dst.thing.three);

This has tended to be quite inflexible, especially when such groupings
need to be added to established code which causes huge naming churn.
Three workarounds exist in the kernel for this problem, and each have
other negative properties.

To avoid the naming churn, there is a design pattern of adding macro
aliases for the named struct:

	#define f_three thing.three

This ends up polluting the global namespace, and makes it difficult to
search for identifiers.

Another common work-around in kernel code avoids the pollution by avoiding
the named struct entirely, instead identifying the group's boundaries using
either a pair of empty anonymous structs of a pair of zero-element arrays:

	struct foo {
		int one;
		struct { } start;
		int two;
		int three, four;
		struct { } finish;
		int five;
	};

	struct foo {
		int one;
		int start[0];
		int two;
		int three, four;
		int finish[0];
		int five;
	};

This allows code to avoid needing to use a sub-struct named for member
references within the surrounding structure, but loses the benefits of
being able to actually use such a struct, making it rather fragile. Using
these requires open-coded calculation of sizes and offsets. The efforts
made to avoid common mistakes include lots of comments, or adding various
BUILD_BUG_ON()s. Such code is left with no way for the compiler to reason
about the boundaries (e.g. the "start" object looks like it's 0 bytes
in length), making bounds checking depend on open-coded calculations:

	if (length > offsetof(struct foo, finish) -
		     offsetof(struct foo, start))
		return -EINVAL;
	memcpy(&dst.start, &src.start, offsetof(struct foo, finish) -
				       offsetof(struct foo, start));

However, the vast majority of places in the kernel that operate on
groups of members do so without any identification of the grouping,
relying either on comments or implicit knowledge of the struct contents,
which is even harder for the compiler to reason about, and results in
even more fragile manual sizing, usually depending on member locations
outside of the region (e.g. to copy "two" and "three", use the start of
"four" to find the size):

	BUILD_BUG_ON((offsetof(struct foo, four) <
		      offsetof(struct foo, two)) ||
		     (offsetof(struct foo, four) <
		      offsetof(struct foo, three));
	if (length > offsetof(struct foo, four) -
		     offsetof(struct foo, two))
		return -EINVAL;
	memcpy(&dst.two, &src.two, length);

In order to have a regular programmatic way to describe a struct
region that can be used for references and sizing, can be examined for
bounds checking, avoids forcing the use of intermediate identifiers,
and avoids polluting the global namespace, introduce the struct_group()
macro. This macro wraps the member declarations to create an anonymous
union of an anonymous struct (no intermediate name) and a named struct
(for references and sizing):

	struct foo {
		int one;
		struct_group(thing,
			int two;
			int three, four;
		);
		int five;
	};

	if (length > sizeof(src.thing))
		return -EINVAL;
	memcpy(&dst.thing, &src.thing, length);
	do_something(dst.three);

There are some rare cases where the resulting struct_group() needs
attributes added, so struct_group_attr() is also introduced to allow
for specifying struct attributes (e.g. __align(x) or __packed).
Additionally, there are places where such declarations would like to
have the struct be typed, so struct_group_typed() is added.

Given there is a need for a handful of UAPI uses too, the underlying
__struct_group() macro has been defined in UAPI so it can be used there
too.

Co-developed-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/lkml/20210728023217.GC35706@embeddedor
Enhanced-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Link: https://lore.kernel.org/lkml/41183a98-bdb9-4ad6-7eab-5a7292a6df84@rasmusvillemoes.dk
Enhanced-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/lkml/1d9a2e6df2a9a35b2cdd50a9a68cac5991e7e5f0.camel@intel.com
Enhanced-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://lore.kernel.org/lkml/YQKa76A6XuFqgM03@phenom.ffwll.local
---
 include/linux/stddef.h      | 47 +++++++++++++++++++++++++++++++++++++
 include/uapi/linux/stddef.h | 21 +++++++++++++++++
 2 files changed, 68 insertions(+)

Comments

Dan Williams Aug. 18, 2021, 10:35 p.m. UTC | #1

On Tue, Aug 17, 2021 at 11:06 PM Kees Cook <keescook@chromium.org> wrote:
>
> Kernel code has a regular need to describe groups of members within a
> structure usually when they need to be copied or initialized separately
> from the rest of the surrounding structure. The generally accepted design
> pattern in C is to use a named sub-struct:
>
>         struct foo {
>                 int one;
>                 struct {
>                         int two;
>                         int three, four;
>                 } thing;
>                 int five;
>         };
>
> This would allow for traditional references and sizing:
>
>         memcpy(&dst.thing, &src.thing, sizeof(dst.thing));
>
> However, doing this would mean that referencing struct members enclosed
> by such named structs would always require including the sub-struct name
> in identifiers:
>
>         do_something(dst.thing.three);
>
> This has tended to be quite inflexible, especially when such groupings
> need to be added to established code which causes huge naming churn.
> Three workarounds exist in the kernel for this problem, and each have
> other negative properties.
>
> To avoid the naming churn, there is a design pattern of adding macro
> aliases for the named struct:
>
>         #define f_three thing.three
>
> This ends up polluting the global namespace, and makes it difficult to
> search for identifiers.
>
> Another common work-around in kernel code avoids the pollution by avoiding
> the named struct entirely, instead identifying the group's boundaries using
> either a pair of empty anonymous structs of a pair of zero-element arrays:
>
>         struct foo {
>                 int one;
>                 struct { } start;
>                 int two;
>                 int three, four;
>                 struct { } finish;
>                 int five;
>         };
>
>         struct foo {
>                 int one;
>                 int start[0];
>                 int two;
>                 int three, four;
>                 int finish[0];
>                 int five;
>         };
>
> This allows code to avoid needing to use a sub-struct named for member
> references within the surrounding structure, but loses the benefits of
> being able to actually use such a struct, making it rather fragile. Using
> these requires open-coded calculation of sizes and offsets. The efforts
> made to avoid common mistakes include lots of comments, or adding various
> BUILD_BUG_ON()s. Such code is left with no way for the compiler to reason
> about the boundaries (e.g. the "start" object looks like it's 0 bytes
> in length), making bounds checking depend on open-coded calculations:
>
>         if (length > offsetof(struct foo, finish) -
>                      offsetof(struct foo, start))
>                 return -EINVAL;
>         memcpy(&dst.start, &src.start, offsetof(struct foo, finish) -
>                                        offsetof(struct foo, start));
>
> However, the vast majority of places in the kernel that operate on
> groups of members do so without any identification of the grouping,
> relying either on comments or implicit knowledge of the struct contents,
> which is even harder for the compiler to reason about, and results in
> even more fragile manual sizing, usually depending on member locations
> outside of the region (e.g. to copy "two" and "three", use the start of
> "four" to find the size):
>
>         BUILD_BUG_ON((offsetof(struct foo, four) <
>                       offsetof(struct foo, two)) ||
>                      (offsetof(struct foo, four) <
>                       offsetof(struct foo, three));
>         if (length > offsetof(struct foo, four) -
>                      offsetof(struct foo, two))
>                 return -EINVAL;
>         memcpy(&dst.two, &src.two, length);
>
> In order to have a regular programmatic way to describe a struct
> region that can be used for references and sizing, can be examined for
> bounds checking, avoids forcing the use of intermediate identifiers,
> and avoids polluting the global namespace, introduce the struct_group()
> macro. This macro wraps the member declarations to create an anonymous
> union of an anonymous struct (no intermediate name) and a named struct
> (for references and sizing):
>
>         struct foo {
>                 int one;
>                 struct_group(thing,
>                         int two;
>                         int three, four;
>                 );
>                 int five;
>         };
>
>         if (length > sizeof(src.thing))
>                 return -EINVAL;
>         memcpy(&dst.thing, &src.thing, length);
>         do_something(dst.three);
>
> There are some rare cases where the resulting struct_group() needs
> attributes added, so struct_group_attr() is also introduced to allow
> for specifying struct attributes (e.g. __align(x) or __packed).
> Additionally, there are places where such declarations would like to
> have the struct be typed, so struct_group_typed() is added.
>
> Given there is a need for a handful of UAPI uses too, the underlying
> __struct_group() macro has been defined in UAPI so it can be used there
> too.
>
> Co-developed-by: Keith Packard <keithp@keithp.com>
> Signed-off-by: Keith Packard <keithp@keithp.com>
> Signed-off-by: Kees Cook <keescook@chromium.org>
> Acked-by: Gustavo A. R. Silva <gustavoars@kernel.org>
> Link: https://lore.kernel.org/lkml/20210728023217.GC35706@embeddedor
> Enhanced-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
> Link: https://lore.kernel.org/lkml/41183a98-bdb9-4ad6-7eab-5a7292a6df84@rasmusvillemoes.dk
> Enhanced-by: Dan Williams <dan.j.williams@intel.com>

Acked-by: Dan Williams <dan.j.williams@intel.com>

diff --git a/include/linux/stddef.h b/include/linux/stddef.h
index 998a4ba28eba..f2aefdb22d1d 100644
--- a/include/linux/stddef.h
+++ b/include/linux/stddef.h
@@ -36,4 +36,51 @@  enum {
 #define offsetofend(TYPE, MEMBER) \
 	(offsetof(TYPE, MEMBER)	+ sizeof_field(TYPE, MEMBER))
 
+/**
+ * struct_group(NAME, MEMBERS)
+ *
+ * Used to create an anonymous union of two structs with identical
+ * layout and size: one anonymous and one named. The former can be
+ * used normally without sub-struct naming, and the latter can be
+ * used to reason about the start, end, and size of the group of
+ * struct members.
+ *
+ * @NAME: The identifier name of the mirrored sub-struct
+ * @MEMBERS: The member declarations for the mirrored structs
+ */
+#define struct_group(NAME, MEMBERS...)	\
+	__struct_group(/* no tag */, NAME, /* no attrs */, MEMBERS)
+
+/**
+ * struct_group_attr(NAME, ATTRS, MEMBERS)
+ *
+ * Used to create an anonymous union of two structs with identical
+ * layout and size: one anonymous and one named. The former can be
+ * used normally without sub-struct naming, and the latter can be
+ * used to reason about the start, end, and size of the group of
+ * struct members. Includes structure attributes argument.
+ *
+ * @NAME: The identifier name of the mirrored sub-struct
+ * @ATTRS: Any struct attributes
+ * @MEMBERS: The member declarations for the mirrored structs
+ */
+#define struct_group_attr(NAME, ATTRS, MEMBERS...) \
+	__struct_group(/* no tag */, NAME, ATTRS, MEMBERS)
+
+/**
+ * struct_group_tagged(TAG, NAME, MEMBERS)
+ *
+ * Used to create an anonymous union of two structs with identical
+ * layout and size: one anonymous and one named. The former can be
+ * used normally without sub-struct naming, and the latter can be
+ * used to reason about the start, end, and size of the group of
+ * struct members. Includes struct tag argument for the named copy.
+ *
+ * @TAG: The tag name for the named sub-struct
+ * @NAME: The identifier name of the mirrored sub-struct
+ * @MEMBERS: The member declarations for the mirrored structs
+ */
+#define struct_group_tagged(TAG, NAME, MEMBERS...) \
+	__struct_group(TAG, NAME, /* no attrs */, MEMBERS)
+
 #endif
diff --git a/include/uapi/linux/stddef.h b/include/uapi/linux/stddef.h
index ee8220f8dcf5..0fbdf2f711aa 100644
--- a/include/uapi/linux/stddef.h
+++ b/include/uapi/linux/stddef.h
@@ -4,3 +4,24 @@ 
 #ifndef __always_inline
 #define __always_inline inline
 #endif
+
+/**
+ * __struct_group(TAG, NAME, ATTRS, MEMBERS)
+ *
+ * Used to create an anonymous union of two structs with identical layout
+ * and size: one anonymous and one named. The former's members can be used
+ * normally without sub-struct naming, and the latter can be used to
+ * reason about the start, end, and size of the group of struct members.
+ * The named struct can also be explicitly tagged, as well as both having
+ * struct attributes.
+ *
+ * @TAG: The tag name for the named sub-struct (usually empty)
+ * @NAME: The identifier name of the mirrored sub-struct
+ * @ATTRS: Any struct attributes (usually empty)
+ * @MEMBERS: The member declarations for the mirrored structs
+ */
+#define __struct_group(TAG, NAME, ATTRS, MEMBERS...) \
+	union { \
+		struct { MEMBERS } ATTRS; \
+		struct TAG { MEMBERS } ATTRS NAME; \
+	}

[v2,05/63] stddef: Introduce struct_group() helper macro

Commit Message

Comments

Patch