Message ID | 20230831132546.3525721-8-armbru@redhat.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | [1/7] migration/rdma: Fix save_page method to fail on polling error | expand |
On Thu, Aug 31, 2023 at 03:25:46PM +0200, Markus Armbruster wrote: [This paragraph written last: Bear with my stream of consciousness review below, where I end up duplicating some of the conslusions you reached before the point where I saw where the patch was headed] > Variables declared in macros can shadow other variables. Much of the > time, this is harmless, e.g.: > > #define _FDT(exp) \ > do { \ > int ret = (exp); \ > if (ret < 0) { \ > error_report("error creating device tree: %s: %s", \ > #exp, fdt_strerror(ret)); \ > exit(1); \ > } \ > } while (0) Which is why I've seen some projects require a strict namespace separation: if all macro parameters and any identifiers declared in macros use either a leading or a trailing _ (I prefer a trailing one, to avoid risking conflicts with libc reserved namespace; but leading is usually okay), and all other identifiers avoid that namespace, then you will never have shadowing by calling a macro from normal code. > > Harmless shadowing in h_client_architecture_support(): > > target_ulong ret; > > [...] > > ret = do_client_architecture_support(cpu, spapr, vec, fdt_bufsize); > if (ret == H_SUCCESS) { > _FDT((fdt_pack(spapr->fdt_blob))); > [...] > } > > return ret; > > However, we can get in trouble when the shadowed variable is used in a > macro argument: > > #define QOBJECT(obj) ({ \ > typeof(obj) o = (obj); \ > o ? container_of(&(o)->base, QObject, base) : NULL; \ > }) > > QOBJECT(o) expands into > > ({ > ---> typeof(o) o = (o); > o ? container_of(&(o)->base, QObject, base) : NULL; > }) > > Unintended variable name capture at --->. We'd be saved by > -Winit-self. But I could certainly construct more elaborate death > traps that don't trigger it. Indeed, not fully understanding the preprocessor makes for some interesting death traps. > > To reduce the risk of trapping ourselves, we use variable names in > macros that no sane person would use elsewhere. Here's our actual > definition of QOBJECT(): > > #define QOBJECT(obj) ({ \ > typeof(obj) _obj = (obj); \ > _obj ? container_of(&(_obj)->base, QObject, base) : NULL; \ > }) Yep, goes back to the policy I've seen of enforcing naming conventions that ensure macros can't shadow normal code. > > Works well enough until we nest macro calls. For instance, with > > #define qobject_ref(obj) ({ \ > typeof(obj) _obj = (obj); \ > qobject_ref_impl(QOBJECT(_obj)); \ > _obj; \ > }) > > the expression qobject_ref(obj) expands into > > ({ > typeof(obj) _obj = (obj); > qobject_ref_impl( > ({ > ---> typeof(_obj) _obj = (_obj); > _obj ? container_of(&(_obj)->base, QObject, base) : NULL; > })); > _obj; > }) > > Unintended variable name capture at --->. Yep, you've just proven how macros calling macros requires care, as we no longer have the namespace protections. If macro calls can only form a DAG, it is possible to choose unique intermediate declarations for every macro (although remembering to do that, and still keeping the macro definition legible, may not be easy). But if you can have macros that form any sort of nesting loop (and we WANT to allow things like MIN(MIN(a, b), c) - which is such a loop), dynamic naming becomes the only solution. > > The only reliable way to prevent unintended variable name capture is > -Wshadow. Yes, I would love to have that enabled eventually. > > One blocker for enabling it is shadowing hiding in function-like > macros like > > qdict_put(dict, "name", qobject_ref(...)) > > qdict_put() wraps its last argument in QOBJECT(), and the last > argument here contains another QOBJECT(). > > Use dark preprocessor sorcery to make the macros that give us this > problem use different variable names on every call. Sounds foreboding; hopefully not many macros are affected... > > Signed-off-by: Markus Armbruster <armbru@redhat.com> > --- > include/qapi/qmp/qobject.h | 8 +++++--- > include/qemu/atomic.h | 11 ++++++----- > include/qemu/osdep.h | 34 +++++++++++++++++++--------------- > 3 files changed, 30 insertions(+), 23 deletions(-) ...okay, the size of the diffstat seems tolerable (good that we don't have many macros that expand to a temporary variable declaration and which are likely to be heavily reused from nested contexts). > > diff --git a/include/qapi/qmp/qobject.h b/include/qapi/qmp/qobject.h > index 9003b71fd3..7b50fc905d 100644 > --- a/include/qapi/qmp/qobject.h > +++ b/include/qapi/qmp/qobject.h > @@ -45,10 +45,12 @@ struct QObject { > struct QObjectBase_ base; > }; > > -#define QOBJECT(obj) ({ \ > - typeof(obj) _obj = (obj); \ > - _obj ? container_of(&(_obj)->base, QObject, base) : NULL; \ > +#define QOBJECT_INTERNAL(obj, l) ({ \ > + typeof(obj) PASTE(_obj, l) = (obj); \ > + PASTE(_obj, l) \ > + ? container_of(&(PASTE(_obj, l))->base, QObject, base) : NULL; \ > }) > +#define QOBJECT(obj) QOBJECT_INTERNAL((obj), __COUNTER__) Slick! Every call to QOBJECT() defines a unique temporary variable name. But as written, QOBJECT(o) expands to this (when __COUNTER__ is 1): ({ typeof((o)) _obj1 = ((o)); _obj1 ? container_of(&(_obj1)->base, QObject, base) : NULL; }) which has three sets of redundant parens that could be elided. Why do you need to add () around obj when forwarding to QOBJECT_INTERNAL()? The only way the expansion of the text passed through the 'obj' parameter can contain a comma is if the user has already parenthesized it on their end (not that I expect a comma expression to be a frequent argument to QOBJECT(), but who knows). Arguing that it is to silence a syntax checker is weak; since we must NOT add parens around the parameters to QOBJECT_INTERNAL (calling PASTE((_obj), (l)) is obviously wrong). Meanwhile, the repetition of three calls to PASTE() is annoying. How about: #define QOBJECT_INTERNAL_2(_obj, _tmp) ({ \ typeof _obj _tmp = _obj; \ _tmp ? container_of(&_tmp->base, QObject, base) : NULL; \ )} #define QOBJECT_INTERNAL_1(_arg, _tmp) QOBJECT_INTERNAL_2(_arg, _tmp) #define QOBJECT_INTERNAL(_arg, _ctr) \ QOBJECT_INTERNAL_1(_arg, PASTE(_obj, _ctr)) #define QOBJECT(obj) QOBJECT_INTERNAL((obj), __COUNTER__) or: #define QOBJECT_INTERNAL_2(_obj, _tmp) ({ \ typeof(_obj) _tmp = (_obj); \ _tmp ? container_of(&_tmp->base, QObject, base) : NULL; \ )} #define QOBJECT_INTERNAL_1(_arg, _tmp) QOBJECT_INTERNAL_2(_arg, _tmp) #define QOBJECT_INTERNAL(_arg, _ctr) \ QOBJECT_INTERNAL_1(_arg, PASTE(_obj, _ctr)) #define QOBJECT(obj) QOBJECT_INTERNAL(obj, __COUNTER__) where, in either case, QOBJECT(o) should then expand to ({ typeof (o) _obj1 = (o); _obj1 ? container_of(&_obj1->base, QObject, base) : NULL; }) > > /* Required for qobject_to() */ > #define QTYPE_CAST_TO_QNull QTYPE_QNULL > diff --git a/include/qemu/atomic.h b/include/qemu/atomic.h > index d95612f7a0..3f80ffac69 100644 > --- a/include/qemu/atomic.h > +++ b/include/qemu/atomic.h > @@ -157,13 +157,14 @@ > smp_read_barrier_depends(); > #endif > > -#define qatomic_rcu_read(ptr) \ > - ({ \ > +#define qatomic_rcu_read_internal(ptr, l) \ > + ({ \ > qemu_build_assert(sizeof(*ptr) <= ATOMIC_REG_SIZE); \ > - typeof_strip_qual(*ptr) _val; \ > - qatomic_rcu_read__nocheck(ptr, &_val); \ > - _val; \ > + typeof_strip_qual(*ptr) PASTE(_val, l); \ > + qatomic_rcu_read__nocheck(ptr, &PASTE(_val, l)); \ > + PASTE(_val, l); \ > }) > +#define qatomic_rcu_read(ptr) qatomic_rcu_read_internal((ptr), __COUNTER__) Same observation about being able to reduce the number of PASTE() calls by adding yet another layer of macro invocations. > > #define qatomic_rcu_set(ptr, i) do { \ > qemu_build_assert(sizeof(*ptr) <= ATOMIC_REG_SIZE); \ > diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h > index 21ef8f1699..9c191ebe99 100644 > --- a/include/qemu/osdep.h > +++ b/include/qemu/osdep.h > @@ -371,18 +371,21 @@ void QEMU_ERROR("code path is reachable") > * have to open-code it. Sadly, Coverity is severely confused by the > * constant variants, so we have to dumb things down there. > */ > +#define PASTE(a, b) a##b > +#define MIN_INTERNAL(a, b, l) \ > + ({ \ > + typeof(1 ? (a) : (b)) PASTE(_a, l) = (a), PASTE(_b, l) = (b); \ > + PASTE(_a, l) < PASTE(_b, l) ? PASTE(_a, l) : PASTE(_b, l); \ > + }) And again. I think you are definitely on the right track to have all internal variable declarations within a macro definition use multi-layered expansion with the help of __COUNTER__ to ensure that the macro's temporary variable is globally unique; so if you leave it as written, I could live with: Reviewed-by: Eric Blake <eblake@redhat.com> But if you respin it to pick up any of my suggestions, I'll definitely spend another review cycle on the result. If it helps, here's what I ended up using in nbdkit: https://gitlab.com/nbdkit/nbdkit/-/blame/master/common/include/unique-name.h#L36 /* https://stackoverflow.com/a/1597129 * https://stackoverflow.com/a/12711226 */ #define XXUNIQUE_NAME(name, counter) name ## counter #define XUNIQUE_NAME(name, counter) XXUNIQUE_NAME (name, counter) #define NBDKIT_UNIQUE_NAME(name) XUNIQUE_NAME (name, __COUNTER__) https://gitlab.com/nbdkit/nbdkit/-/blame/master/common/include/minmax.h#L47 #include "unique-name.h" #undef MIN #define MIN(x, y) \ MIN_1 ((x), (y), NBDKIT_UNIQUE_NAME (_x), NBDKIT_UNIQUE_NAME (_y)) ... #define MIN_1(x, y, _x, _y) ({ \ __auto_type _x = (x); \ __auto_type _y = (y); \ _x < _y ? _x : _y; \ })
On 8/31/23 06:25, Markus Armbruster wrote:
> +#define PASTE(a, b) a##b
We already have glue() in qemu/compiler.h.
The rest of it looks quite sensible.
r~
Richard Henderson <richard.henderson@linaro.org> writes: > On 8/31/23 06:25, Markus Armbruster wrote: >> +#define PASTE(a, b) a##b > > We already have glue() in qemu/compiler.h. Missed it, will fix. > The rest of it looks quite sensible. Thanks!
Eric Blake <eblake@redhat.com> writes: > On Thu, Aug 31, 2023 at 03:25:46PM +0200, Markus Armbruster wrote: > > [This paragraph written last: Bear with my stream of consciousness > review below, where I end up duplicating some of the conslusions you > reached before the point where I saw where the patch was headed] > >> Variables declared in macros can shadow other variables. Much of the >> time, this is harmless, e.g.: >> >> #define _FDT(exp) \ >> do { \ >> int ret = (exp); \ >> if (ret < 0) { \ >> error_report("error creating device tree: %s: %s", \ >> #exp, fdt_strerror(ret)); \ >> exit(1); \ >> } \ >> } while (0) > > Which is why I've seen some projects require a strict namespace > separation: if all macro parameters and any identifiers declared in > macros use either a leading or a trailing _ (I prefer a trailing one, > to avoid risking conflicts with libc reserved namespace; but leading > is usually okay), and all other identifiers avoid that namespace, then > you will never have shadowing by calling a macro from normal code. > >> >> Harmless shadowing in h_client_architecture_support(): >> >> target_ulong ret; >> >> [...] >> >> ret = do_client_architecture_support(cpu, spapr, vec, fdt_bufsize); >> if (ret == H_SUCCESS) { >> _FDT((fdt_pack(spapr->fdt_blob))); >> [...] >> } >> >> return ret; >> >> However, we can get in trouble when the shadowed variable is used in a >> macro argument: >> >> #define QOBJECT(obj) ({ \ >> typeof(obj) o = (obj); \ >> o ? container_of(&(o)->base, QObject, base) : NULL; \ >> }) >> >> QOBJECT(o) expands into >> >> ({ >> ---> typeof(o) o = (o); >> o ? container_of(&(o)->base, QObject, base) : NULL; >> }) >> >> Unintended variable name capture at --->. We'd be saved by >> -Winit-self. But I could certainly construct more elaborate death >> traps that don't trigger it. > > Indeed, not fully understanding the preprocessor makes for some > interesting death traps. We use ALL_CAPS for macros to signal "watch out for traps". >> To reduce the risk of trapping ourselves, we use variable names in >> macros that no sane person would use elsewhere. Here's our actual >> definition of QOBJECT(): >> >> #define QOBJECT(obj) ({ \ >> typeof(obj) _obj = (obj); \ >> _obj ? container_of(&(_obj)->base, QObject, base) : NULL; \ >> }) > > Yep, goes back to the policy I've seen of enforcing naming conventions > that ensure macros can't shadow normal code. > >> >> Works well enough until we nest macro calls. For instance, with >> >> #define qobject_ref(obj) ({ \ >> typeof(obj) _obj = (obj); \ >> qobject_ref_impl(QOBJECT(_obj)); \ >> _obj; \ >> }) >> >> the expression qobject_ref(obj) expands into >> >> ({ >> typeof(obj) _obj = (obj); >> qobject_ref_impl( >> ({ >> ---> typeof(_obj) _obj = (_obj); >> _obj ? container_of(&(_obj)->base, QObject, base) : NULL; >> })); >> _obj; >> }) >> >> Unintended variable name capture at --->. > > Yep, you've just proven how macros calling macros requires care, as we > no longer have the namespace protections. If macro calls can only > form a DAG, it is possible to choose unique intermediate declarations > for every macro (although remembering to do that, and still keeping > the macro definition legible, may not be easy). But if you can have > macros that form any sort of nesting loop (and we WANT to allow things > like MIN(MIN(a, b), c) - which is such a loop), dynamic naming becomes > the only solution. > >> >> The only reliable way to prevent unintended variable name capture is >> -Wshadow. > > Yes, I would love to have that enabled eventually. > >> >> One blocker for enabling it is shadowing hiding in function-like >> macros like >> >> qdict_put(dict, "name", qobject_ref(...)) >> >> qdict_put() wraps its last argument in QOBJECT(), and the last >> argument here contains another QOBJECT(). >> >> Use dark preprocessor sorcery to make the macros that give us this >> problem use different variable names on every call. > > Sounds foreboding; hopefully not many macros are affected... > >> >> Signed-off-by: Markus Armbruster <armbru@redhat.com> >> --- >> include/qapi/qmp/qobject.h | 8 +++++--- >> include/qemu/atomic.h | 11 ++++++----- >> include/qemu/osdep.h | 34 +++++++++++++++++++--------------- >> 3 files changed, 30 insertions(+), 23 deletions(-) > > ...okay, the size of the diffstat seems tolerable (good that we don't > have many macros that expand to a temporary variable declaration and > which are likely to be heavily reused from nested contexts). > >> diff --git a/include/qapi/qmp/qobject.h b/include/qapi/qmp/qobject.h >> index 9003b71fd3..7b50fc905d 100644 >> --- a/include/qapi/qmp/qobject.h >> +++ b/include/qapi/qmp/qobject.h >> @@ -45,10 +45,12 @@ struct QObject { >> struct QObjectBase_ base; >> }; >> >> -#define QOBJECT(obj) ({ \ >> - typeof(obj) _obj = (obj); \ >> - _obj ? container_of(&(_obj)->base, QObject, base) : NULL; \ >> +#define QOBJECT_INTERNAL(obj, l) ({ \ >> + typeof(obj) PASTE(_obj, l) = (obj); \ >> + PASTE(_obj, l) \ >> + ? container_of(&(PASTE(_obj, l))->base, QObject, base) : NULL; \ >> }) >> +#define QOBJECT(obj) QOBJECT_INTERNAL((obj), __COUNTER__) > > Slick! Every call to QOBJECT() defines a unique temporary variable > name. But as written, QOBJECT(o) expands to this (when __COUNTER__ is > 1): > > ({ > typeof((o)) _obj1 = ((o)); > _obj1 ? container_of(&(_obj1)->base, QObject, base) : NULL; > }) > > which has three sets of redundant parens that could be elided. Why do > you need to add () around obj when forwarding to QOBJECT_INTERNAL()? Habit: enclose macro arguments in parenthesis unless there's a specific need not to. > The only way the expansion of the text passed through the 'obj' > parameter can contain a comma is if the user has already parenthesized > it on their end (not that I expect a comma expression to be a frequent > argument to QOBJECT(), but who knows). Arguing that it is to silence > a syntax checker is weak; since we must NOT add parens around the > parameters to QOBJECT_INTERNAL (calling PASTE((_obj), (l)) is > obviously wrong). > > Meanwhile, the repetition of three calls to PASTE() is annoying. How > about: > > #define QOBJECT_INTERNAL_2(_obj, _tmp) ({ \ > typeof _obj _tmp = _obj; \ > _tmp ? container_of(&_tmp->base, QObject, base) : NULL; \ > )} > #define QOBJECT_INTERNAL_1(_arg, _tmp) QOBJECT_INTERNAL_2(_arg, _tmp) > #define QOBJECT_INTERNAL(_arg, _ctr) \ > QOBJECT_INTERNAL_1(_arg, PASTE(_obj, _ctr)) > #define QOBJECT(obj) QOBJECT_INTERNAL((obj), __COUNTER__) > > or: > > #define QOBJECT_INTERNAL_2(_obj, _tmp) ({ \ > typeof(_obj) _tmp = (_obj); \ > _tmp ? container_of(&_tmp->base, QObject, base) : NULL; \ > )} > #define QOBJECT_INTERNAL_1(_arg, _tmp) QOBJECT_INTERNAL_2(_arg, _tmp) > #define QOBJECT_INTERNAL(_arg, _ctr) \ > QOBJECT_INTERNAL_1(_arg, PASTE(_obj, _ctr)) > #define QOBJECT(obj) QOBJECT_INTERNAL(obj, __COUNTER__) > > where, in either case, QOBJECT(o) should then expand to > > ({ > typeof (o) _obj1 = (o); > _obj1 ? container_of(&_obj1->base, QObject, base) : NULL; > }) The idea is to have the innermost macro take the variable name instead of the counter. "PASTE(_obj, l)" then becomes "_tmp" there, which is more legible. However, we pay for it by going through two more macros. Can you explain why you need two more? Complication: the innermost macro needs to take all its variable names as arguments. The macro above has just one. Macros below have two. Not quite sure which version is easier to understand. >> /* Required for qobject_to() */ >> #define QTYPE_CAST_TO_QNull QTYPE_QNULL >> diff --git a/include/qemu/atomic.h b/include/qemu/atomic.h >> index d95612f7a0..3f80ffac69 100644 >> --- a/include/qemu/atomic.h >> +++ b/include/qemu/atomic.h >> @@ -157,13 +157,14 @@ >> smp_read_barrier_depends(); >> #endif >> >> -#define qatomic_rcu_read(ptr) \ >> - ({ \ >> +#define qatomic_rcu_read_internal(ptr, l) \ >> + ({ \ >> qemu_build_assert(sizeof(*ptr) <= ATOMIC_REG_SIZE); \ >> - typeof_strip_qual(*ptr) _val; \ >> - qatomic_rcu_read__nocheck(ptr, &_val); \ >> - _val; \ >> + typeof_strip_qual(*ptr) PASTE(_val, l); \ >> + qatomic_rcu_read__nocheck(ptr, &PASTE(_val, l)); \ >> + PASTE(_val, l); \ >> }) >> +#define qatomic_rcu_read(ptr) qatomic_rcu_read_internal((ptr), __COUNTER__) > > Same observation about being able to reduce the number of PASTE() > calls by adding yet another layer of macro invocations. > >> >> #define qatomic_rcu_set(ptr, i) do { \ >> qemu_build_assert(sizeof(*ptr) <= ATOMIC_REG_SIZE); \ >> diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h >> index 21ef8f1699..9c191ebe99 100644 >> --- a/include/qemu/osdep.h >> +++ b/include/qemu/osdep.h >> @@ -371,18 +371,21 @@ void QEMU_ERROR("code path is reachable") >> * have to open-code it. Sadly, Coverity is severely confused by the >> * constant variants, so we have to dumb things down there. >> */ >> +#define PASTE(a, b) a##b >> +#define MIN_INTERNAL(a, b, l) \ >> + ({ \ >> + typeof(1 ? (a) : (b)) PASTE(_a, l) = (a), PASTE(_b, l) = (b); \ >> + PASTE(_a, l) < PASTE(_b, l) ? PASTE(_a, l) : PASTE(_b, l); \ >> + }) > > And again. > > I think you are definitely on the right track to have all internal > variable declarations within a macro definition use multi-layered > expansion with the help of __COUNTER__ to ensure that the macro's > temporary variable is globally unique; so if you leave it as written, > I could live with: > > Reviewed-by: Eric Blake <eblake@redhat.com> > > But if you respin it to pick up any of my suggestions, I'll definitely > spend another review cycle on the result. > > If it helps, here's what I ended up using in nbdkit: > > https://gitlab.com/nbdkit/nbdkit/-/blame/master/common/include/unique-name.h#L36 > /* https://stackoverflow.com/a/1597129 > * https://stackoverflow.com/a/12711226 > */ > #define XXUNIQUE_NAME(name, counter) name ## counter > #define XUNIQUE_NAME(name, counter) XXUNIQUE_NAME (name, counter) > #define NBDKIT_UNIQUE_NAME(name) XUNIQUE_NAME (name, __COUNTER__) > > https://gitlab.com/nbdkit/nbdkit/-/blame/master/common/include/minmax.h#L47 > #include "unique-name.h" > > #undef MIN > #define MIN(x, y) \ > MIN_1 ((x), (y), NBDKIT_UNIQUE_NAME (_x), NBDKIT_UNIQUE_NAME (_y)) > > ... > #define MIN_1(x, y, _x, _y) ({ \ > __auto_type _x = (x); \ > __auto_type _y = (y); \ > _x < _y ? _x : _y; \ > }) Thanks!
On 8/31/23 16:30, Eric Blake wrote: > On Thu, Aug 31, 2023 at 03:25:46PM +0200, Markus Armbruster wrote: > > [This paragraph written last: Bear with my stream of consciousness > review below, where I end up duplicating some of the conslusions you > reached before the point where I saw where the patch was headed] > >> Variables declared in macros can shadow other variables. Much of the >> time, this is harmless, e.g.: >> >> #define _FDT(exp) \ >> do { \ >> int ret = (exp); \ >> if (ret < 0) { \ >> error_report("error creating device tree: %s: %s", \ >> #exp, fdt_strerror(ret)); \ >> exit(1); \ >> } \ >> } while (0) > > Which is why I've seen some projects require a strict namespace > separation: if all macro parameters and any identifiers declared in > macros use either a leading or a trailing _ (I prefer a trailing one, > to avoid risking conflicts with libc reserved namespace; but leading > is usually okay), and all other identifiers avoid that namespace, then > you will never have shadowing by calling a macro from normal code. I started fixing the _FDT() macro since it is quite noisy at compile. Same for qemu_fdt_setprop_cells(). So are we ok with names like 'ret_' and 'i_' ? I used a 'local_' prefix for now but I can change. I also have a bunch of fixes for ppc. C. > >> >> Harmless shadowing in h_client_architecture_support(): >> >> target_ulong ret; >> >> [...] >> >> ret = do_client_architecture_support(cpu, spapr, vec, fdt_bufsize); >> if (ret == H_SUCCESS) { >> _FDT((fdt_pack(spapr->fdt_blob))); >> [...] >> } >> >> return ret; >> >> However, we can get in trouble when the shadowed variable is used in a >> macro argument: >> >> #define QOBJECT(obj) ({ \ >> typeof(obj) o = (obj); \ >> o ? container_of(&(o)->base, QObject, base) : NULL; \ >> }) >> >> QOBJECT(o) expands into >> >> ({ >> ---> typeof(o) o = (o); >> o ? container_of(&(o)->base, QObject, base) : NULL; >> }) >> >> Unintended variable name capture at --->. We'd be saved by >> -Winit-self. But I could certainly construct more elaborate death >> traps that don't trigger it. > > Indeed, not fully understanding the preprocessor makes for some > interesting death traps. > >> >> To reduce the risk of trapping ourselves, we use variable names in >> macros that no sane person would use elsewhere. Here's our actual >> definition of QOBJECT(): >> >> #define QOBJECT(obj) ({ \ >> typeof(obj) _obj = (obj); \ >> _obj ? container_of(&(_obj)->base, QObject, base) : NULL; \ >> }) > > Yep, goes back to the policy I've seen of enforcing naming conventions > that ensure macros can't shadow normal code. > >> >> Works well enough until we nest macro calls. For instance, with >> >> #define qobject_ref(obj) ({ \ >> typeof(obj) _obj = (obj); \ >> qobject_ref_impl(QOBJECT(_obj)); \ >> _obj; \ >> }) >> >> the expression qobject_ref(obj) expands into >> >> ({ >> typeof(obj) _obj = (obj); >> qobject_ref_impl( >> ({ >> ---> typeof(_obj) _obj = (_obj); >> _obj ? container_of(&(_obj)->base, QObject, base) : NULL; >> })); >> _obj; >> }) >> >> Unintended variable name capture at --->. > > Yep, you've just proven how macros calling macros requires care, as we > no longer have the namespace protections. If macro calls can only > form a DAG, it is possible to choose unique intermediate declarations > for every macro (although remembering to do that, and still keeping > the macro definition legible, may not be easy). But if you can have > macros that form any sort of nesting loop (and we WANT to allow things > like MIN(MIN(a, b), c) - which is such a loop), dynamic naming becomes > the only solution. > >> >> The only reliable way to prevent unintended variable name capture is >> -Wshadow. > > Yes, I would love to have that enabled eventually. > >> >> One blocker for enabling it is shadowing hiding in function-like >> macros like >> >> qdict_put(dict, "name", qobject_ref(...)) >> >> qdict_put() wraps its last argument in QOBJECT(), and the last >> argument here contains another QOBJECT(). >> >> Use dark preprocessor sorcery to make the macros that give us this >> problem use different variable names on every call. > > Sounds foreboding; hopefully not many macros are affected... > >> >> Signed-off-by: Markus Armbruster <armbru@redhat.com> >> --- >> include/qapi/qmp/qobject.h | 8 +++++--- >> include/qemu/atomic.h | 11 ++++++----- >> include/qemu/osdep.h | 34 +++++++++++++++++++--------------- >> 3 files changed, 30 insertions(+), 23 deletions(-) > > ...okay, the size of the diffstat seems tolerable (good that we don't > have many macros that expand to a temporary variable declaration and > which are likely to be heavily reused from nested contexts). > >> >> diff --git a/include/qapi/qmp/qobject.h b/include/qapi/qmp/qobject.h >> index 9003b71fd3..7b50fc905d 100644 >> --- a/include/qapi/qmp/qobject.h >> +++ b/include/qapi/qmp/qobject.h >> @@ -45,10 +45,12 @@ struct QObject { >> struct QObjectBase_ base; >> }; >> >> -#define QOBJECT(obj) ({ \ >> - typeof(obj) _obj = (obj); \ >> - _obj ? container_of(&(_obj)->base, QObject, base) : NULL; \ >> +#define QOBJECT_INTERNAL(obj, l) ({ \ >> + typeof(obj) PASTE(_obj, l) = (obj); \ >> + PASTE(_obj, l) \ >> + ? container_of(&(PASTE(_obj, l))->base, QObject, base) : NULL; \ >> }) >> +#define QOBJECT(obj) QOBJECT_INTERNAL((obj), __COUNTER__) > > Slick! Every call to QOBJECT() defines a unique temporary variable > name. But as written, QOBJECT(o) expands to this (when __COUNTER__ is > 1): > > ({ > typeof((o)) _obj1 = ((o)); > _obj1 ? container_of(&(_obj1)->base, QObject, base) : NULL; > }) > > which has three sets of redundant parens that could be elided. Why do > you need to add () around obj when forwarding to QOBJECT_INTERNAL()? > The only way the expansion of the text passed through the 'obj' > parameter can contain a comma is if the user has already parenthesized > it on their end (not that I expect a comma expression to be a frequent > argument to QOBJECT(), but who knows). Arguing that it is to silence > a syntax checker is weak; since we must NOT add parens around the > parameters to QOBJECT_INTERNAL (calling PASTE((_obj), (l)) is > obviously wrong). > > Meanwhile, the repetition of three calls to PASTE() is annoying. How > about: > > #define QOBJECT_INTERNAL_2(_obj, _tmp) ({ \ > typeof _obj _tmp = _obj; \ > _tmp ? container_of(&_tmp->base, QObject, base) : NULL; \ > )} > #define QOBJECT_INTERNAL_1(_arg, _tmp) QOBJECT_INTERNAL_2(_arg, _tmp) > #define QOBJECT_INTERNAL(_arg, _ctr) \ > QOBJECT_INTERNAL_1(_arg, PASTE(_obj, _ctr)) > #define QOBJECT(obj) QOBJECT_INTERNAL((obj), __COUNTER__) > > or: > > #define QOBJECT_INTERNAL_2(_obj, _tmp) ({ \ > typeof(_obj) _tmp = (_obj); \ > _tmp ? container_of(&_tmp->base, QObject, base) : NULL; \ > )} > #define QOBJECT_INTERNAL_1(_arg, _tmp) QOBJECT_INTERNAL_2(_arg, _tmp) > #define QOBJECT_INTERNAL(_arg, _ctr) \ > QOBJECT_INTERNAL_1(_arg, PASTE(_obj, _ctr)) > #define QOBJECT(obj) QOBJECT_INTERNAL(obj, __COUNTER__) > > where, in either case, QOBJECT(o) should then expand to > > ({ > typeof (o) _obj1 = (o); > _obj1 ? container_of(&_obj1->base, QObject, base) : NULL; > }) > >> >> /* Required for qobject_to() */ >> #define QTYPE_CAST_TO_QNull QTYPE_QNULL >> diff --git a/include/qemu/atomic.h b/include/qemu/atomic.h >> index d95612f7a0..3f80ffac69 100644 >> --- a/include/qemu/atomic.h >> +++ b/include/qemu/atomic.h >> @@ -157,13 +157,14 @@ >> smp_read_barrier_depends(); >> #endif >> >> -#define qatomic_rcu_read(ptr) \ >> - ({ \ >> +#define qatomic_rcu_read_internal(ptr, l) \ >> + ({ \ >> qemu_build_assert(sizeof(*ptr) <= ATOMIC_REG_SIZE); \ >> - typeof_strip_qual(*ptr) _val; \ >> - qatomic_rcu_read__nocheck(ptr, &_val); \ >> - _val; \ >> + typeof_strip_qual(*ptr) PASTE(_val, l); \ >> + qatomic_rcu_read__nocheck(ptr, &PASTE(_val, l)); \ >> + PASTE(_val, l); \ >> }) >> +#define qatomic_rcu_read(ptr) qatomic_rcu_read_internal((ptr), __COUNTER__) > > Same observation about being able to reduce the number of PASTE() > calls by adding yet another layer of macro invocations. > >> >> #define qatomic_rcu_set(ptr, i) do { \ >> qemu_build_assert(sizeof(*ptr) <= ATOMIC_REG_SIZE); \ >> diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h >> index 21ef8f1699..9c191ebe99 100644 >> --- a/include/qemu/osdep.h >> +++ b/include/qemu/osdep.h >> @@ -371,18 +371,21 @@ void QEMU_ERROR("code path is reachable") >> * have to open-code it. Sadly, Coverity is severely confused by the >> * constant variants, so we have to dumb things down there. >> */ >> +#define PASTE(a, b) a##b >> +#define MIN_INTERNAL(a, b, l) \ >> + ({ \ >> + typeof(1 ? (a) : (b)) PASTE(_a, l) = (a), PASTE(_b, l) = (b); \ >> + PASTE(_a, l) < PASTE(_b, l) ? PASTE(_a, l) : PASTE(_b, l); \ >> + }) > > And again. > > I think you are definitely on the right track to have all internal > variable declarations within a macro definition use multi-layered > expansion with the help of __COUNTER__ to ensure that the macro's > temporary variable is globally unique; so if you leave it as written, > I could live with: > > Reviewed-by: Eric Blake <eblake@redhat.com> > > But if you respin it to pick up any of my suggestions, I'll definitely > spend another review cycle on the result. > > If it helps, here's what I ended up using in nbdkit: > > https://gitlab.com/nbdkit/nbdkit/-/blame/master/common/include/unique-name.h#L36 > /* https://stackoverflow.com/a/1597129 > * https://stackoverflow.com/a/12711226 > */ > #define XXUNIQUE_NAME(name, counter) name ## counter > #define XUNIQUE_NAME(name, counter) XXUNIQUE_NAME (name, counter) > #define NBDKIT_UNIQUE_NAME(name) XUNIQUE_NAME (name, __COUNTER__) > > https://gitlab.com/nbdkit/nbdkit/-/blame/master/common/include/minmax.h#L47 > #include "unique-name.h" > > #undef MIN > #define MIN(x, y) \ > MIN_1 ((x), (y), NBDKIT_UNIQUE_NAME (_x), NBDKIT_UNIQUE_NAME (_y)) > > ... > #define MIN_1(x, y, _x, _y) ({ \ > __auto_type _x = (x); \ > __auto_type _y = (y); \ > _x < _y ? _x : _y; \ > }) >
On Fri, Sep 01, 2023 at 10:48:26AM +0200, Markus Armbruster wrote: > > Indeed, not fully understanding the preprocessor makes for some > > interesting death traps. > > We use ALL_CAPS for macros to signal "watch out for traps". > > >> -#define QOBJECT(obj) ({ \ > >> - typeof(obj) _obj = (obj); \ > >> - _obj ? container_of(&(_obj)->base, QObject, base) : NULL; \ > >> +#define QOBJECT_INTERNAL(obj, l) ({ \ > >> + typeof(obj) PASTE(_obj, l) = (obj); \ > >> + PASTE(_obj, l) \ > >> + ? container_of(&(PASTE(_obj, l))->base, QObject, base) : NULL; \ > >> }) > >> +#define QOBJECT(obj) QOBJECT_INTERNAL((obj), __COUNTER__) > > > > Slick! Every call to QOBJECT() defines a unique temporary variable > > name. But as written, QOBJECT(o) expands to this (when __COUNTER__ is > > 1): > > > > ({ > > typeof((o)) _obj1 = ((o)); > > _obj1 ? container_of(&(_obj1)->base, QObject, base) : NULL; > > }) > > > > which has three sets of redundant parens that could be elided. Why do > > you need to add () around obj when forwarding to QOBJECT_INTERNAL()? > > Habit: enclose macro arguments in parenthesis unless there's a specific > need not to. > > > The only way the expansion of the text passed through the 'obj' > > parameter can contain a comma is if the user has already parenthesized > > it on their end (not that I expect a comma expression to be a frequent > > argument to QOBJECT(), but who knows). Arguing that it is to silence > > a syntax checker is weak; since we must NOT add parens around the > > parameters to QOBJECT_INTERNAL (calling PASTE((_obj), (l)) is > > obviously wrong). > > > > Meanwhile, the repetition of three calls to PASTE() is annoying. How > > about: > > > > #define QOBJECT_INTERNAL_2(_obj, _tmp) ({ \ > > typeof _obj _tmp = _obj; \ > > _tmp ? container_of(&_tmp->base, QObject, base) : NULL; \ > > )} > > #define QOBJECT_INTERNAL_1(_arg, _tmp) QOBJECT_INTERNAL_2(_arg, _tmp) > > #define QOBJECT_INTERNAL(_arg, _ctr) \ > > QOBJECT_INTERNAL_1(_arg, PASTE(_obj, _ctr)) > > #define QOBJECT(obj) QOBJECT_INTERNAL((obj), __COUNTER__) > > > > or: > > > > #define QOBJECT_INTERNAL_2(_obj, _tmp) ({ \ > > typeof(_obj) _tmp = (_obj); \ > > _tmp ? container_of(&_tmp->base, QObject, base) : NULL; \ > > )} > > #define QOBJECT_INTERNAL_1(_arg, _tmp) QOBJECT_INTERNAL_2(_arg, _tmp) > > #define QOBJECT_INTERNAL(_arg, _ctr) \ > > QOBJECT_INTERNAL_1(_arg, PASTE(_obj, _ctr)) > > #define QOBJECT(obj) QOBJECT_INTERNAL(obj, __COUNTER__) > > > > where, in either case, QOBJECT(o) should then expand to > > > > ({ > > typeof (o) _obj1 = (o); > > _obj1 ? container_of(&_obj1->base, QObject, base) : NULL; > > }) > > The idea is to have the innermost macro take the variable name instead > of the counter. "PASTE(_obj, l)" then becomes "_tmp" there, which is > more legible. However, we pay for it by going through two more macros. > > Can you explain why you need two more? Likewise habit, on my part (and laziness - I wrote the previous email without testing anything). I've learned over the years that pasting is hard (you can't mix ## with a macro that you want expanded without 2 layers of indirection), so I started out by adding two layers of QOBJECT_INTERNAL, then wrote QOBJECT to forward to QOBJECT_INTERNAL. But now that you asked, I actually spent the time to test with the preprocessor, and your hunch is indeed correct: I over-compensated becaues of my habit. $cat foo.c #define PASTE(_a, _b) _a##_b #define QOBJECT_INTERNAL_2(_obj, _tmp) ({ \ typeof _obj _tmp = _obj; \ _tmp ? container_of(&_tmp->base, QObject, base) : NULL; \ )} #define QOBJECT_INTERNAL_1(_arg, _tmp) QOBJECT_INTERNAL_2(_arg, _tmp) #define QOBJECT_INTERNAL(_arg, _ctr) \ QOBJECT_INTERNAL_1(_arg, PASTE(_obj, _ctr)) #define QOBJECT(obj) QOBJECT_INTERNAL((obj), __COUNTER__) QOBJECT(o) #define Q_INTERNAL_1(_obj, _tmp) ({ \ typeof _obj _tmp = _obj; \ _tmp ? container_of(&_tmp->base, QObject, base) : NULL; \ )} #define Q_INTERNAL(_arg, _ctr) \ Q_INTERNAL_1(_arg, PASTE(_obj, _ctr)) #define Q(obj) Q_INTERNAL((obj), __COUNTER__) Q(o) $ gcc -E foo.c # 0 "foo.c" # 0 "<built-in>" # 0 "<command-line>" # 1 "/usr/include/stdc-predef.h" 1 3 4 # 0 "<command-line>" 2 # 1 "foo.c" # 12 "foo.c" ({ typeof (o) _obj0 = (o); _obj0 ? container_of(&_obj0->base, QObject, base) : NULL; )} # 22 "foo.c" ({ typeof (o) _obj1 = (o); _obj1 ? container_of(&_obj1->base, QObject, base) : NULL; )} So the important part was merely ensuring that __COUNTER__ has a chance to be expanded PRIOR to the point where ## gets its argument, and one less layer of indirection was still sufficient for that. > > Complication: the innermost macro needs to take all its variable names > as arguments. The macro above has just one. Macros below have two. > > Not quite sure which version is easier to understand. Proper comments may help; once you realize that you are passing in the unique variable name(s) to be used as the temporary identifier(s), rather than an integer that still needs to be glued, then separating the task of generating name(s) (which is done once per name, instead of repeated 3 times) makes sense to me. I also like minimizing the number of times I have to spell out __COUNTER__; wrapping unique name generation behind a well-named helper macro gives a better view of why it is needed in the first place. At any rate, this comment still applies: > > > > I think you are definitely on the right track to have all internal > > variable declarations within a macro definition use multi-layered > > expansion with the help of __COUNTER__ to ensure that the macro's > > temporary variable is globally unique; so if you leave it as written, > > I could live with: > > > > Reviewed-by: Eric Blake <eblake@redhat.com> > > > > But if you respin it to pick up any of my suggestions, I'll definitely > > spend another review cycle on the result. > > > > If it helps, here's what I ended up using in nbdkit: > > > > https://gitlab.com/nbdkit/nbdkit/-/blame/master/common/include/unique-name.h#L36 > > /* https://stackoverflow.com/a/1597129 > > * https://stackoverflow.com/a/12711226 > > */ > > #define XXUNIQUE_NAME(name, counter) name ## counter > > #define XUNIQUE_NAME(name, counter) XXUNIQUE_NAME (name, counter) > > #define NBDKIT_UNIQUE_NAME(name) XUNIQUE_NAME (name, __COUNTER__) > > > > https://gitlab.com/nbdkit/nbdkit/-/blame/master/common/include/minmax.h#L47 > > #include "unique-name.h" > > > > #undef MIN > > #define MIN(x, y) \ > > MIN_1 ((x), (y), NBDKIT_UNIQUE_NAME (_x), NBDKIT_UNIQUE_NAME (_y)) > > > > ... > > #define MIN_1(x, y, _x, _y) ({ \ > > __auto_type _x = (x); \ > > __auto_type _y = (y); \ > > _x < _y ? _x : _y; \ > > }) > > Thanks! and so I'm fine leaving it up to you if you want to copy the paradigm I've seen elsewhere, or if you want to leave it as you first proposed. The end result is the same, and it is more of a judgment call on which form is easier for you to reason about. But as other commenters mentioned, we already have a glue() macro, so use that instead of PASTE(), no matter what else you choose. Looking forward to v2!
On 1/9/23 14:59, Cédric Le Goater wrote: > On 8/31/23 16:30, Eric Blake wrote: >> On Thu, Aug 31, 2023 at 03:25:46PM +0200, Markus Armbruster wrote: >> >> [This paragraph written last: Bear with my stream of consciousness >> review below, where I end up duplicating some of the conslusions you >> reached before the point where I saw where the patch was headed] >> >>> Variables declared in macros can shadow other variables. Much of the >>> time, this is harmless, e.g.: >>> >>> #define >>> _FDT(exp) \ >>> do >>> { \ >>> int ret = >>> (exp); \ >>> if (ret < 0) >>> { \ >>> error_report("error creating device tree: %s: %s", \ >>> #exp, >>> fdt_strerror(ret)); \ >>> >>> exit(1); \ >>> >>> } \ >>> } while (0) >> >> Which is why I've seen some projects require a strict namespace >> separation: if all macro parameters and any identifiers declared in >> macros use either a leading or a trailing _ (I prefer a trailing one, >> to avoid risking conflicts with libc reserved namespace; but leading >> is usually okay), and all other identifiers avoid that namespace, then >> you will never have shadowing by calling a macro from normal code. > > I started fixing the _FDT() macro since it is quite noisy at compile. > Same for qemu_fdt_setprop_cells(). So are we ok with names like 'ret_' > and 'i_' ? I used a 'local_' prefix for now but I can change. See https://lore.kernel.org/qemu-devel/20230831225607.30829-12-philmd@linaro.org/
Cédric Le Goater <clg@kaod.org> writes: > On 8/31/23 16:30, Eric Blake wrote: >> On Thu, Aug 31, 2023 at 03:25:46PM +0200, Markus Armbruster wrote: >> [This paragraph written last: Bear with my stream of consciousness >> review below, where I end up duplicating some of the conslusions you >> reached before the point where I saw where the patch was headed] >> >>> Variables declared in macros can shadow other variables. Much of the >>> time, this is harmless, e.g.: >>> >>> #define _FDT(exp) \ >>> do { \ >>> int ret = (exp); \ >>> if (ret < 0) { \ >>> error_report("error creating device tree: %s: %s", \ >>> #exp, fdt_strerror(ret)); \ >>> exit(1); \ >>> } \ >>> } while (0) >> Which is why I've seen some projects require a strict namespace >> separation: if all macro parameters and any identifiers declared in >> macros use either a leading or a trailing _ (I prefer a trailing one, >> to avoid risking conflicts with libc reserved namespace; but leading >> is usually okay), and all other identifiers avoid that namespace, then >> you will never have shadowing by calling a macro from normal code. > > I started fixing the _FDT() macro since it is quite noisy at compile. > Same for qemu_fdt_setprop_cells(). So are we ok with names like 'ret_' > and 'i_' ? I used a 'local_' prefix for now but I can change. I believe identifiers with a '_' suffix are just fine in macros. We have quite a few of them already. > I also have a bunch of fixes for ppc. Appreciated!
On 9/1/23 16:50, Markus Armbruster wrote: > Cédric Le Goater <clg@kaod.org> writes: > >> On 8/31/23 16:30, Eric Blake wrote: >>> On Thu, Aug 31, 2023 at 03:25:46PM +0200, Markus Armbruster wrote: >>> [This paragraph written last: Bear with my stream of consciousness >>> review below, where I end up duplicating some of the conslusions you >>> reached before the point where I saw where the patch was headed] >>> >>>> Variables declared in macros can shadow other variables. Much of the >>>> time, this is harmless, e.g.: >>>> >>>> #define _FDT(exp) \ >>>> do { \ >>>> int ret = (exp); \ >>>> if (ret < 0) { \ >>>> error_report("error creating device tree: %s: %s", \ >>>> #exp, fdt_strerror(ret)); \ >>>> exit(1); \ >>>> } \ >>>> } while (0) >>> Which is why I've seen some projects require a strict namespace >>> separation: if all macro parameters and any identifiers declared in >>> macros use either a leading or a trailing _ (I prefer a trailing one, >>> to avoid risking conflicts with libc reserved namespace; but leading >>> is usually okay), and all other identifiers avoid that namespace, then >>> you will never have shadowing by calling a macro from normal code. >> >> I started fixing the _FDT() macro since it is quite noisy at compile. >> Same for qemu_fdt_setprop_cells(). So are we ok with names like 'ret_' >> and 'i_' ? I used a 'local_' prefix for now but I can change. > > I believe identifiers with a '_' suffix are just fine in macros. We > have quite a few of them already. ok >> I also have a bunch of fixes for ppc. > > Appreciated! count me on for the ppc files : hw/ppc/pnv_psi.c hw/ppc/spapr.c hw/ppc/spapr_drc.c hw/ppc/spapr_pci.c include/hw/ppc/fdt.h and surely some other files under target/ppc/ This one was taken care of by Phil: include/sysemu/device_tree.h C.
Eric Blake <eblake@redhat.com> writes: > On Fri, Sep 01, 2023 at 10:48:26AM +0200, Markus Armbruster wrote: >> > Indeed, not fully understanding the preprocessor makes for some >> > interesting death traps. >> >> We use ALL_CAPS for macros to signal "watch out for traps". >> > >> >> -#define QOBJECT(obj) ({ \ >> >> - typeof(obj) _obj = (obj); \ >> >> - _obj ? container_of(&(_obj)->base, QObject, base) : NULL; \ >> >> +#define QOBJECT_INTERNAL(obj, l) ({ \ >> >> + typeof(obj) PASTE(_obj, l) = (obj); \ >> >> + PASTE(_obj, l) \ >> >> + ? container_of(&(PASTE(_obj, l))->base, QObject, base) : NULL; \ >> >> }) >> >> +#define QOBJECT(obj) QOBJECT_INTERNAL((obj), __COUNTER__) >> > >> > Slick! Every call to QOBJECT() defines a unique temporary variable >> > name. But as written, QOBJECT(o) expands to this (when __COUNTER__ is >> > 1): >> > >> > ({ >> > typeof((o)) _obj1 = ((o)); >> > _obj1 ? container_of(&(_obj1)->base, QObject, base) : NULL; >> > }) >> > >> > which has three sets of redundant parens that could be elided. Why do >> > you need to add () around obj when forwarding to QOBJECT_INTERNAL()? >> >> Habit: enclose macro arguments in parenthesis unless there's a specific >> need not to. >> >> > The only way the expansion of the text passed through the 'obj' >> > parameter can contain a comma is if the user has already parenthesized >> > it on their end (not that I expect a comma expression to be a frequent >> > argument to QOBJECT(), but who knows). Arguing that it is to silence >> > a syntax checker is weak; since we must NOT add parens around the >> > parameters to QOBJECT_INTERNAL (calling PASTE((_obj), (l)) is >> > obviously wrong). >> > >> > Meanwhile, the repetition of three calls to PASTE() is annoying. How >> > about: >> > >> > #define QOBJECT_INTERNAL_2(_obj, _tmp) ({ \ >> > typeof _obj _tmp = _obj; \ >> > _tmp ? container_of(&_tmp->base, QObject, base) : NULL; \ >> > )} >> > #define QOBJECT_INTERNAL_1(_arg, _tmp) QOBJECT_INTERNAL_2(_arg, _tmp) >> > #define QOBJECT_INTERNAL(_arg, _ctr) \ >> > QOBJECT_INTERNAL_1(_arg, PASTE(_obj, _ctr)) >> > #define QOBJECT(obj) QOBJECT_INTERNAL((obj), __COUNTER__) >> > >> > or: >> > >> > #define QOBJECT_INTERNAL_2(_obj, _tmp) ({ \ >> > typeof(_obj) _tmp = (_obj); \ >> > _tmp ? container_of(&_tmp->base, QObject, base) : NULL; \ >> > )} >> > #define QOBJECT_INTERNAL_1(_arg, _tmp) QOBJECT_INTERNAL_2(_arg, _tmp) >> > #define QOBJECT_INTERNAL(_arg, _ctr) \ >> > QOBJECT_INTERNAL_1(_arg, PASTE(_obj, _ctr)) >> > #define QOBJECT(obj) QOBJECT_INTERNAL(obj, __COUNTER__) >> > >> > where, in either case, QOBJECT(o) should then expand to >> > >> > ({ >> > typeof (o) _obj1 = (o); >> > _obj1 ? container_of(&_obj1->base, QObject, base) : NULL; >> > }) >> >> The idea is to have the innermost macro take the variable name instead >> of the counter. "PASTE(_obj, l)" then becomes "_tmp" there, which is >> more legible. However, we pay for it by going through two more macros. >> >> Can you explain why you need two more? > > Likewise habit, on my part (and laziness - I wrote the previous email > without testing anything). I've learned over the years that pasting is > hard (you can't mix ## with a macro that you want expanded without 2 > layers of indirection), so I started out by adding two layers of > QOBJECT_INTERNAL, then wrote QOBJECT to forward to QOBJECT_INTERNAL. > But now that you asked, I actually spent the time to test with the > preprocessor, and your hunch is indeed correct: I over-compensated > becaues of my habit. > > $cat foo.c > #define PASTE(_a, _b) _a##_b > > #define QOBJECT_INTERNAL_2(_obj, _tmp) ({ \ > typeof _obj _tmp = _obj; \ > _tmp ? container_of(&_tmp->base, QObject, base) : NULL; \ > )} > #define QOBJECT_INTERNAL_1(_arg, _tmp) QOBJECT_INTERNAL_2(_arg, _tmp) > #define QOBJECT_INTERNAL(_arg, _ctr) \ > QOBJECT_INTERNAL_1(_arg, PASTE(_obj, _ctr)) > #define QOBJECT(obj) QOBJECT_INTERNAL((obj), __COUNTER__) > > QOBJECT(o) > > #define Q_INTERNAL_1(_obj, _tmp) ({ \ > typeof _obj _tmp = _obj; \ > _tmp ? container_of(&_tmp->base, QObject, base) : NULL; \ > )} > #define Q_INTERNAL(_arg, _ctr) \ > Q_INTERNAL_1(_arg, PASTE(_obj, _ctr)) > #define Q(obj) Q_INTERNAL((obj), __COUNTER__) > > Q(o) > $ gcc -E foo.c > # 0 "foo.c" > # 0 "<built-in>" > # 0 "<command-line>" > # 1 "/usr/include/stdc-predef.h" 1 3 4 > # 0 "<command-line>" 2 > # 1 "foo.c" > # 12 "foo.c" > ({ typeof (o) _obj0 = (o); _obj0 ? container_of(&_obj0->base, QObject, base) : NULL; )} > # 22 "foo.c" > ({ typeof (o) _obj1 = (o); _obj1 ? container_of(&_obj1->base, QObject, base) : NULL; )} > > So the important part was merely ensuring that __COUNTER__ has a > chance to be expanded PRIOR to the point where ## gets its argument, > and one less layer of indirection was still sufficient for that. The version with less indirection is easier to understand for me. >> >> Complication: the innermost macro needs to take all its variable names >> as arguments. The macro above has just one. Macros below have two. >> >> Not quite sure which version is easier to understand. > > Proper comments may help; once you realize that you are passing in the > unique variable name(s) to be used as the temporary identifier(s), > rather than an integer that still needs to be glued, then separating > the task of generating name(s) (which is done once per name, instead > of repeated 3 times) makes sense to me. I also like minimizing the > number of times I have to spell out __COUNTER__; wrapping unique name > generation behind a well-named helper macro gives a better view of why > it is needed in the first place. I can add this comment to every instance of the __COUNTER__ trick: /* * Preprocessor wizardry ahead: glue(_val, l) expands to a new * identifier in each macro expansion. Helps avoid shadowing * variables and the unwanted name captures that come with it. */ > At any rate, this comment still applies: > >> > >> > I think you are definitely on the right track to have all internal >> > variable declarations within a macro definition use multi-layered >> > expansion with the help of __COUNTER__ to ensure that the macro's >> > temporary variable is globally unique; so if you leave it as written, >> > I could live with: >> > >> > Reviewed-by: Eric Blake <eblake@redhat.com> >> > >> > But if you respin it to pick up any of my suggestions, I'll definitely >> > spend another review cycle on the result. >> > >> > If it helps, here's what I ended up using in nbdkit: >> > >> > https://gitlab.com/nbdkit/nbdkit/-/blame/master/common/include/unique-name.h#L36 >> > /* https://stackoverflow.com/a/1597129 >> > * https://stackoverflow.com/a/12711226 >> > */ >> > #define XXUNIQUE_NAME(name, counter) name ## counter >> > #define XUNIQUE_NAME(name, counter) XXUNIQUE_NAME (name, counter) >> > #define NBDKIT_UNIQUE_NAME(name) XUNIQUE_NAME (name, __COUNTER__) >> > >> > https://gitlab.com/nbdkit/nbdkit/-/blame/master/common/include/minmax.h#L47 >> > #include "unique-name.h" >> > >> > #undef MIN >> > #define MIN(x, y) \ >> > MIN_1 ((x), (y), NBDKIT_UNIQUE_NAME (_x), NBDKIT_UNIQUE_NAME (_y)) >> > >> > ... >> > #define MIN_1(x, y, _x, _y) ({ \ >> > __auto_type _x = (x); \ >> > __auto_type _y = (y); \ >> > _x < _y ? _x : _y; \ >> > }) >> >> Thanks! > > and so I'm fine leaving it up to you if you want to copy the paradigm > I've seen elsewhere, or if you want to leave it as you first proposed. > The end result is the same, and it is more of a judgment call on which > form is easier for you to reason about. I need to review the two competing versions once more to decide which I find easier to read. Or shall I say less hard; preprocessor wizardry is never really easy. > But as other commenters mentioned, we already have a glue() macro, so > use that instead of PASTE(), no matter what else you choose. > > Looking forward to v2! Thanks again!
diff --git a/include/qapi/qmp/qobject.h b/include/qapi/qmp/qobject.h index 9003b71fd3..7b50fc905d 100644 --- a/include/qapi/qmp/qobject.h +++ b/include/qapi/qmp/qobject.h @@ -45,10 +45,12 @@ struct QObject { struct QObjectBase_ base; }; -#define QOBJECT(obj) ({ \ - typeof(obj) _obj = (obj); \ - _obj ? container_of(&(_obj)->base, QObject, base) : NULL; \ +#define QOBJECT_INTERNAL(obj, l) ({ \ + typeof(obj) PASTE(_obj, l) = (obj); \ + PASTE(_obj, l) \ + ? container_of(&(PASTE(_obj, l))->base, QObject, base) : NULL; \ }) +#define QOBJECT(obj) QOBJECT_INTERNAL((obj), __COUNTER__) /* Required for qobject_to() */ #define QTYPE_CAST_TO_QNull QTYPE_QNULL diff --git a/include/qemu/atomic.h b/include/qemu/atomic.h index d95612f7a0..3f80ffac69 100644 --- a/include/qemu/atomic.h +++ b/include/qemu/atomic.h @@ -157,13 +157,14 @@ smp_read_barrier_depends(); #endif -#define qatomic_rcu_read(ptr) \ - ({ \ +#define qatomic_rcu_read_internal(ptr, l) \ + ({ \ qemu_build_assert(sizeof(*ptr) <= ATOMIC_REG_SIZE); \ - typeof_strip_qual(*ptr) _val; \ - qatomic_rcu_read__nocheck(ptr, &_val); \ - _val; \ + typeof_strip_qual(*ptr) PASTE(_val, l); \ + qatomic_rcu_read__nocheck(ptr, &PASTE(_val, l)); \ + PASTE(_val, l); \ }) +#define qatomic_rcu_read(ptr) qatomic_rcu_read_internal((ptr), __COUNTER__) #define qatomic_rcu_set(ptr, i) do { \ qemu_build_assert(sizeof(*ptr) <= ATOMIC_REG_SIZE); \ diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h index 21ef8f1699..9c191ebe99 100644 --- a/include/qemu/osdep.h +++ b/include/qemu/osdep.h @@ -371,18 +371,21 @@ void QEMU_ERROR("code path is reachable") * have to open-code it. Sadly, Coverity is severely confused by the * constant variants, so we have to dumb things down there. */ +#define PASTE(a, b) a##b +#define MIN_INTERNAL(a, b, l) \ + ({ \ + typeof(1 ? (a) : (b)) PASTE(_a, l) = (a), PASTE(_b, l) = (b); \ + PASTE(_a, l) < PASTE(_b, l) ? PASTE(_a, l) : PASTE(_b, l); \ + }) #undef MIN -#define MIN(a, b) \ - ({ \ - typeof(1 ? (a) : (b)) _a = (a), _b = (b); \ - _a < _b ? _a : _b; \ +#define MIN(a, b) MIN_INTERNAL((a), (b), __COUNTER__) +#define MAX_INTERNAL(a, b, l) \ + ({ \ + typeof(1 ? (a) : (b)) PASTE(_a, l) = (a), PASTE(_b, l) = (b); \ + PASTE(_a, l) > PASTE(_b, l) ? PASTE(_a, l) : PASTE(_b, l); \ }) #undef MAX -#define MAX(a, b) \ - ({ \ - typeof(1 ? (a) : (b)) _a = (a), _b = (b); \ - _a > _b ? _a : _b; \ - }) +#define MAX(a, b) MAX_INTERNAL((a), (b), __COUNTER__) #ifdef __COVERITY__ # define MIN_CONST(a, b) ((a) < (b) ? (a) : (b)) @@ -404,13 +407,14 @@ void QEMU_ERROR("code path is reachable") * Minimum function that returns zero only if both values are zero. * Intended for use with unsigned values only. */ -#ifndef MIN_NON_ZERO -#define MIN_NON_ZERO(a, b) \ - ({ \ - typeof(1 ? (a) : (b)) _a = (a), _b = (b); \ - _a == 0 ? _b : (_b == 0 || _b > _a) ? _a : _b; \ +#define MIN_NON_ZERO_INTERNAL(a, b, l) \ + ({ \ + typeof(1 ? (a) : (b)) PASTE(_a, l) = (a), PASTE(_b, l) = (b); \ + PASTE(_a, l) == 0 ? PASTE(_b, l) \ + : (PASTE(_b, l) == 0 || PASTE(_b, l) > PASTE(_a, l)) ? PASTE(_a, l) \ + : PASTE(_b, l); \ }) -#endif +#define MIN_NON_ZERO(a, b) MIN_NON_ZERO_INTERNAL((a), (b), __COUNTER__) /* * Round number down to multiple. Safe when m is not a power of 2 (see
Variables declared in macros can shadow other variables. Much of the time, this is harmless, e.g.: #define _FDT(exp) \ do { \ int ret = (exp); \ if (ret < 0) { \ error_report("error creating device tree: %s: %s", \ #exp, fdt_strerror(ret)); \ exit(1); \ } \ } while (0) Harmless shadowing in h_client_architecture_support(): target_ulong ret; [...] ret = do_client_architecture_support(cpu, spapr, vec, fdt_bufsize); if (ret == H_SUCCESS) { _FDT((fdt_pack(spapr->fdt_blob))); [...] } return ret; However, we can get in trouble when the shadowed variable is used in a macro argument: #define QOBJECT(obj) ({ \ typeof(obj) o = (obj); \ o ? container_of(&(o)->base, QObject, base) : NULL; \ }) QOBJECT(o) expands into ({ ---> typeof(o) o = (o); o ? container_of(&(o)->base, QObject, base) : NULL; }) Unintended variable name capture at --->. We'd be saved by -Winit-self. But I could certainly construct more elaborate death traps that don't trigger it. To reduce the risk of trapping ourselves, we use variable names in macros that no sane person would use elsewhere. Here's our actual definition of QOBJECT(): #define QOBJECT(obj) ({ \ typeof(obj) _obj = (obj); \ _obj ? container_of(&(_obj)->base, QObject, base) : NULL; \ }) Works well enough until we nest macro calls. For instance, with #define qobject_ref(obj) ({ \ typeof(obj) _obj = (obj); \ qobject_ref_impl(QOBJECT(_obj)); \ _obj; \ }) the expression qobject_ref(obj) expands into ({ typeof(obj) _obj = (obj); qobject_ref_impl( ({ ---> typeof(_obj) _obj = (_obj); _obj ? container_of(&(_obj)->base, QObject, base) : NULL; })); _obj; }) Unintended variable name capture at --->. The only reliable way to prevent unintended variable name capture is -Wshadow. One blocker for enabling it is shadowing hiding in function-like macros like qdict_put(dict, "name", qobject_ref(...)) qdict_put() wraps its last argument in QOBJECT(), and the last argument here contains another QOBJECT(). Use dark preprocessor sorcery to make the macros that give us this problem use different variable names on every call. Signed-off-by: Markus Armbruster <armbru@redhat.com> --- include/qapi/qmp/qobject.h | 8 +++++--- include/qemu/atomic.h | 11 ++++++----- include/qemu/osdep.h | 34 +++++++++++++++++++--------------- 3 files changed, 30 insertions(+), 23 deletions(-)