Message ID | 1455582057-27565-13-git-send-email-eblake@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Eric Blake <eblake@redhat.com> writes:

> Commit cee2dedb noticed that if you have a partial flat union
> (such as if an input parse failed due to a missing
> discriminator), calling the dealloc visitor could result in
> trying to dereference a NULL pointer if we attempted to visit
> an object branch without an earlier successful call to
> visit_start_implicit_struct() allocating the pointer for that
> branch. But the "fix" it implemented requires the use of a
> '.data' member in the union, which may or may not be the same
> size as other branches of the union (consider a 32-bit platform
> where one of the branches is an int64), which feels fairly dirty.

Well, until the previous commit, it was the same, wasn't it? All
pointers.

> Plus, as mentioned in that commit, it only works if you can
> assume that '.data' would be zero-initialized even if '.kind' was
> uninitialized, which is rather poor logic: our usage of
> visit_start_struct() happens to zero-initialize both fields,
> which means '.kind' is never truly uninitialized - but if we
> changed visit_start_struct() to use g_new() instead of g_new0(),
> then '.data' would not be any more reliable as a condition on
> whether to visit the branch matching '.kind', regardless of
> whether '.kind' was 0).
>
> Menawhile, now that we have just inlined the fields of all flat
> unions, there is no longer the possibility of a null pointer to
> dereference in the first place. Where the branch structure used
> to be separately allocated by visit_start_implicit_struct(), it
> is now just pointing to a subset of the memory already
> zero-allocated by visit_start_struct().
>
> Thus, we can instead fix things to delete the misguided
> visit_start_union(), as it is no longer providing any benefit.
> And it finishes the cleanup we started in commit 7c91aabd when
> we deleted visit_end_union(). Generated code changes as follows:
>
> |@@ -2366,9 +2363,6 @@ void visit_type_ChardevBackend(Visitor *
> |     if (err) {
> |         goto out_obj;
> |     }
> |-    if (!visit_start_union(v, !!(*obj)->u.data, &err) || err) {
> |-        goto out_obj;
> |-    }
> |     switch ((*obj)->type) {
> |     case CHARDEV_BACKEND_KIND_FILE:
> |         visit_type_ChardevFile(v, "data", &(*obj)->u.file, &err);
>
> Signed-off-by: Eric Blake <eblake@redhat.com>
>
> ---
> v10: retitle, hoist earlier in series, rebase, drop R-b
> v9: no change
> v8: rebase to 'name' motion
> v7: rebase to earlier context changes, simplify 'obj && !*obj'
> condition based on contract
> v6: rebase due to deferring 7/46, and gen_err_check() improvements;
> rewrite gen_visit_implicit_struct() more like other patterns
> ---
>  include/qapi/visitor.h      |  1 -
>  include/qapi/visitor-impl.h |  2 --
>  scripts/qapi-visit.py       |  3 ---
>  qapi/qapi-visit-core.c      |  8 --------
>  qapi/qapi-dealloc-visitor.c | 26 --------------------------
>  5 files changed, 40 deletions(-)
>
> diff --git a/include/qapi/visitor.h b/include/qapi/visitor.h
> index c131a32..b8ae1b5 100644
> --- a/include/qapi/visitor.h
> +++ b/include/qapi/visitor.h
> @@ -80,6 +80,5 @@ void visit_type_str(Visitor *v, const char *name, char **obj, Error **errp);
>  void visit_type_number(Visitor *v, const char *name, double *obj,
>                         Error **errp);
>  void visit_type_any(Visitor *v, const char *name, QObject **obj, Error **errp);
> -bool visit_start_union(Visitor *v, bool data_present, Error **errp);
>
>  #endif
> diff --git a/include/qapi/visitor-impl.h b/include/qapi/visitor-impl.h
> index 7905a28..c4af3e0 100644
> --- a/include/qapi/visitor-impl.h
> +++ b/include/qapi/visitor-impl.h
> @@ -58,8 +58,6 @@ struct Visitor
>
>      /* May be NULL; most useful for input visitors. */
>      void (*optional)(Visitor *v, const char *name, bool *present);
> -
> -    bool (*start_union)(Visitor *v, bool data_present, Error **errp);
>  };
>
>  void input_type_enum(Visitor *v, const char *name, int *obj,
> diff --git a/scripts/qapi-visit.py b/scripts/qapi-visit.py
> index 68354d8..02f0122 100644
> --- a/scripts/qapi-visit.py
> +++ b/scripts/qapi-visit.py
> @@ -246,9 +246,6 @@ void visit_type_%(c_name)s(Visitor *v, const char *name, %(c_name)s **obj, Error
>      if variants:
>          ret += gen_err_check(label='out_obj')
>          ret += mcgen('''
> -    if (!visit_start_union(v, !!(*obj)->u.data, &err) || err) {
> -        goto out_obj;
> -    }

I'm afraid the previous commit broke this for flat unions.

Before the previous commit, all members of (*obj)->u were pointers to
the struct holding the variant members both for flat and simple unions.
!!(*obj)->u.data tests whether the struct holding the variant members
has been allocated. This relies on uniform pointer format.

The dealloc visitor uses the "has been allocated" bit to suppress
visiting the struct when it hasn't been allocated.

The previous commit unboxes the struct for flat unions. Now ->u.data
reinterprets the first few bytes of that struct as pointer. If you're
"lucky", they're not all zero, and the struct gets visited.

Obvious fix: squash this hunk into the previous commit, then let this
commit drop the code that's no longer used.

However, simple unions are still boxed. Why can't their pointer be null
in the dealloc visitor?

>      switch ((*obj)->%(c_name)s) {
>  ''',
>                   c_name=c_name(variants.tag_member.name))
> diff --git a/qapi/qapi-visit-core.c b/qapi/qapi-visit-core.c
> index 6fa66f1..976106e 100644
> --- a/qapi/qapi-visit-core.c
> +++ b/qapi/qapi-visit-core.c
> @@ -60,14 +60,6 @@ void visit_end_list(Visitor *v)
>      v->end_list(v);
>  }
>
> -bool visit_start_union(Visitor *v, bool data_present, Error **errp)
> -{
> -    if (v->start_union) {
> -        return v->start_union(v, data_present, errp);
> -    }
> -    return true;
> -}
> -
>  bool visit_optional(Visitor *v, const char *name, bool *present)
>  {
>      if (v->optional) {
> diff --git a/qapi/qapi-dealloc-visitor.c b/qapi/qapi-dealloc-visitor.c
> index 6667e8c..4eae555 100644
> --- a/qapi/qapi-dealloc-visitor.c
> +++ b/qapi/qapi-dealloc-visitor.c
> @@ -169,31 +169,6 @@ static void qapi_dealloc_type_enum(Visitor *v, const char *name, int *obj,
>  {
>  }
>
> -/* If there's no data present, the dealloc visitor has nothing to free.
> - * Thus, indicate to visitor code that the subsequent union fields can
> - * be skipped. This is not an error condition, since the cleanup of the
> - * rest of an object can continue unhindered, so leave errp unset in
> - * these cases.
> - *
> - * NOTE: In cases where we're attempting to deallocate an object that
> - * may have missing fields, the field indicating the union type may
> - * be missing. In such a case, it's possible we don't have enough
> - * information to differentiate data_present == false from a case where
> - * data *is* present but happens to be a scalar with a value of 0.
> - * This is okay, since in the case of the dealloc visitor there's no
> - * work that needs to done in either situation.
> - *
> - * The current inability in QAPI code to more thoroughly verify a union
> - * type in such cases will likely need to be addressed if we wish to
> - * implement this interface for other types of visitors in the future,
> - * however.
> - */
> -static bool qapi_dealloc_start_union(Visitor *v, bool data_present,
> -                                     Error **errp)
> -{
> -    return data_present;
> -}
> -
>  Visitor *qapi_dealloc_get_visitor(QapiDeallocVisitor *v)
>  {
>      return &v->visitor;
> @@ -224,7 +199,6 @@ QapiDeallocVisitor *qapi_dealloc_visitor_new(void)
>      v->visitor.type_str = qapi_dealloc_type_str;
>      v->visitor.type_number = qapi_dealloc_type_number;
>      v->visitor.type_any = qapi_dealloc_type_anything;
> -    v->visitor.start_union = qapi_dealloc_start_union;
>
>      QTAILQ_INIT(&v->stack);

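The objection above about `->u.data` reinterpreting the first bytes of an unboxed struct can be reproduced in isolation. What follows is a minimal sketch, not QEMU code: `Branch`, `FlatUnion`, `new_flat_union()` and `legacy_data_present()` are hypothetical names chosen for illustration. It assumes the usual platform behavior where a null pointer is all-bits-zero and reading a union member other than the one last written simply reinterprets the stored bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical branch type. After unboxing it is stored inline in the
 * union instead of behind a pointer. */
typedef struct Branch {
    int64_t value;
    char *label;
} Branch;

typedef struct FlatUnion {
    int kind;
    union {
        void *data;     /* legacy probe member, meaningful only when boxed */
        Branch branch;  /* unboxed: overlaps 'data' byte for byte */
    } u;
} FlatUnion;

/* Zero-allocating constructor, standing in for a g_new0()-style call. */
static FlatUnion *new_flat_union(void)
{
    return calloc(1, sizeof(FlatUnion));
}

/* Mirrors the removed check '!!(*obj)->u.data'. With an unboxed branch
 * this no longer answers "was the branch allocated?"; it only reports
 * whether the leading bytes of the branch happen to be nonzero. */
static bool legacy_data_present(const FlatUnion *obj)
{
    return obj->u.data != NULL;
}
```

With this layout a fully populated branch whose first member is 0 is reported as absent, so a dealloc pass guarded by the probe would skip it, while a nonzero first member makes the same branch look present. That data dependence is the "lucky" case described above.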
On 02/17/2016 11:08 AM, Markus Armbruster wrote:
> Eric Blake <eblake@redhat.com> writes:
>
>> Commit cee2dedb noticed that if you have a partial flat union
>> (such as if an input parse failed due to a missing
>> discriminator), calling the dealloc visitor could result in
>> trying to dereference a NULL pointer if we attempted to visit
>> an object branch without an earlier successful call to
>> visit_start_implicit_struct() allocating the pointer for that
>> branch. But the "fix" it implemented requires the use of a
>> '.data' member in the union, which may or may not be the same
>> size as other branches of the union (consider a 32-bit platform
>> where one of the branches is an int64), which feels fairly dirty.
>
> Well, until the previous commit, it was the same, wasn't it? All
> pointers.

For simple unions, you could have (well, still can have, until my later
patch gets rid of the simple_union_type() magic):

struct SU {
    SUKind type;
    union {
        void *data;
        int8_t byte;
    } u;
};

But you're right - for flat unions, ALL branches were represented as
pointers (until this series unboxed them).

>
>> Plus, as mentioned in that commit, it only works if you can
>> assume that '.data' would be zero-initialized even if '.kind' was
>> uninitialized, which is rather poor logic: our usage of
>> visit_start_struct() happens to zero-initialize both fields,
>> which means '.kind' is never truly uninitialized - but if we
>> changed visit_start_struct() to use g_new() instead of g_new0(),
>> then '.data' would not be any more reliable as a condition on
>> whether to visit the branch matching '.kind', regardless of
>> whether '.kind' was 0).
>>
>> Menawhile, now that we have just inlined the fields of all flat

Meanwhile,

>> unions, there is no longer the possibility of a null pointer to
>> dereference in the first place. Where the branch structure used
>> to be separately allocated by visit_start_implicit_struct(), it
>> is now just pointing to a subset of the memory already
>> zero-allocated by visit_start_struct().

I guess I may try and reword this slightly, and point to the fact that
the NULL dereference was due to calling visit_start_implicit_FOO() (only
done for flat unions; for simple unions the branches call
visit_type_FOO(), and that call safely handled NULL); because we were
using visit_start/end_implicit_struct() for its allocation effects. But
the net result is the same - now that we no longer call
visit_start_implicit_struct() for a union visit, the dealloc visitor no
longer has to worry about a NULL dereference on a partially constructed
object, so we no longer need to probe if the union contains any data.

>> +++ b/scripts/qapi-visit.py
>> @@ -246,9 +246,6 @@ void visit_type_%(c_name)s(Visitor *v, const char *name, %(c_name)s **obj, Error
>>      if variants:
>>          ret += gen_err_check(label='out_obj')
>>          ret += mcgen('''
>> -    if (!visit_start_union(v, !!(*obj)->u.data, &err) || err) {
>> -        goto out_obj;
>> -    }
>
> I'm afraid the previous commit broke this for flat unions.
>
> Before the previous commit, all members of (*obj)->u were pointers to
> the struct holding the variant members both for flat and simple unions.
> !!(*obj)->u.data tests whether the struct holding the variant members
> has been allocated. This relies on uniform pointer format.
>
> The dealloc visitor uses the "has been allocated" bit to suppress
> visiting the struct when it hasn't been allocated.
>
> The previous commit unboxes the struct for flat unions. Now ->u.data
> reinterprets the first few bytes of that struct as pointer. If you're
> "lucky", they're not all zero, and the struct gets visited.

You're right - and I bet I could come up with a case where valgrind
could call me on it.

>
> Obvious fix: squash this hunk into the previous commit, then let this
> commit drop the code that's no longer used.

Yep, for bisectability, I think that's what I'll end up doing.

>
> However, simple unions are still boxed. Why can't their pointer be null
> in the dealloc visitor?

Simple unions still go through visit_type_FOO(), and _that_ function
properly checks for NULL. It was only visit_type_implicit_FOO() that
blindly dereferenced things. In fact, in the earlier incantation of
this patch, my fix was to teach visit_type_implicit_FOO() how to check
for NULL:
https://lists.gnu.org/archive/html/qemu-devel/2015-09/msg05442.html

But now that visit_type_implicit_FOO() is gone, my earlier incantation
got reduced in size. I guess it's all in how I document the commit message.

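The distinction drawn above, between a boxed branch whose visitor checks for NULL and an inlined branch that has no pointer to check, can be sketched as follows. This is an illustration under stated assumptions rather than the generated QEMU code: `Foo`, `dealloc_boxed_foo()` and `dealloc_inline_foo()` are hypothetical stand-ins, and plain `free()` replaces the real dealloc visitor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical branch payload with one owned member. */
typedef struct Foo {
    char *name;
} Foo;

/* Boxed branch: the union holds a Foo *. Like the NULL-tolerant
 * visit_type_FOO() described above, this returns early when the branch
 * was never allocated. Returns true if an object was released. */
static bool dealloc_boxed_foo(Foo **obj)
{
    if (!obj || !*obj) {
        return false;   /* partially constructed object: nothing to free */
    }
    free((*obj)->name);
    free(*obj);
    *obj = NULL;
    return true;
}

/* Inlined branch: the Foo lives inside the parent's zero-allocated
 * memory, so there is no branch pointer that could be NULL. A member
 * that was never filled in is NULL, and free(NULL) is a no-op. */
static void dealloc_inline_foo(Foo *obj)
{
    free(obj->name);
    obj->name = NULL;
}
```

In both shapes a partially constructed object is cleaned up without a separate "is any data present" probe, which is the property the thread relies on when dropping visit_start_union().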
Eric Blake <eblake@redhat.com> writes:

> On 02/17/2016 11:08 AM, Markus Armbruster wrote:
>> Eric Blake <eblake@redhat.com> writes:
>>
>>> Commit cee2dedb noticed that if you have a partial flat union
>>> (such as if an input parse failed due to a missing
>>> discriminator), calling the dealloc visitor could result in
>>> trying to dereference a NULL pointer if we attempted to visit
>>> an object branch without an earlier successful call to
>>> visit_start_implicit_struct() allocating the pointer for that
>>> branch. But the "fix" it implemented requires the use of a
>>> '.data' member in the union, which may or may not be the same
>>> size as other branches of the union (consider a 32-bit platform
>>> where one of the branches is an int64), which feels fairly dirty.
>>
>> Well, until the previous commit, it was the same, wasn't it? All
>> pointers.
>
> For simple unions, you could have (well, still can have, until my later
> patch gets rid of the simple_union_type() magic):
>
> struct SU {
>     SUKind type;
>     union {
>         void *data;
>         int8_t byte;
>     } u;
> };

Begs the question why that works :)

> But you're right - for flat unions, ALL branches were represented as
> pointers (until this series unboxed them).
>
>>
>>> Plus, as mentioned in that commit, it only works if you can
>>> assume that '.data' would be zero-initialized even if '.kind' was
>>> uninitialized, which is rather poor logic: our usage of
>>> visit_start_struct() happens to zero-initialize both fields,
>>> which means '.kind' is never truly uninitialized - but if we
>>> changed visit_start_struct() to use g_new() instead of g_new0(),
>>> then '.data' would not be any more reliable as a condition on
>>> whether to visit the branch matching '.kind', regardless of
>>> whether '.kind' was 0).
>>>
>>> Menawhile, now that we have just inlined the fields of all flat
>
> Meanwhile,
>
>>> unions, there is no longer the possibility of a null pointer to
>>> dereference in the first place. Where the branch structure used
>>> to be separately allocated by visit_start_implicit_struct(), it
>>> is now just pointing to a subset of the memory already
>>> zero-allocated by visit_start_struct().
>
> I guess I may try and reword this slightly, and point to the fact that
> the NULL dereference was due to calling visit_start_implicit_FOO() (only
> done for flat unions; for simple unions the branches call
> visit_type_FOO(), and that call safely handled NULL);

That's why it works?

> because we were
> using visit_start/end_implicit_struct() for its allocation effects. But
> the net result is the same - now that we no longer call
> visit_start_implicit_struct() for a union visit, the dealloc visitor no
> longer has to worry about a NULL dereference on a partially constructed
> object, so we no longer need to probe if the union contains any data.
>
>>> +++ b/scripts/qapi-visit.py
>>> @@ -246,9 +246,6 @@ void visit_type_%(c_name)s(Visitor *v, const char *name, %(c_name)s **obj, Error
>>>      if variants:
>>>          ret += gen_err_check(label='out_obj')
>>>          ret += mcgen('''
>>> -    if (!visit_start_union(v, !!(*obj)->u.data, &err) || err) {
>>> -        goto out_obj;
>>> -    }
>>
>> I'm afraid the previous commit broke this for flat unions.
>>
>> Before the previous commit, all members of (*obj)->u were pointers to
>> the struct holding the variant members both for flat and simple unions.
>> !!(*obj)->u.data tests whether the struct holding the variant members
>> has been allocated. This relies on uniform pointer format.
>>
>> The dealloc visitor uses the "has been allocated" bit to suppress
>> visiting the struct when it hasn't been allocated.
>>
>> The previous commit unboxes the struct for flat unions. Now ->u.data
>> reinterprets the first few bytes of that struct as pointer. If you're
>> "lucky", they're not all zero, and the struct gets visited.
>
> You're right - and I bet I could come up with a case where valgrind
> could call me on it.
>
>>
>> Obvious fix: squash this hunk into the previous commit, then let this
>> commit drop the code that's no longer used.
>
> Yep, for bisectability, I think that's what I'll end up doing.
>
>>
>> However, simple unions are still boxed. Why can't their pointer be null
>> in the dealloc visitor?
>
> Simple unions still go through visit_type_FOO(), and _that_ function
> properly checks for NULL. It was only visit_type_implicit_FOO() that
> blindly dereferenced things. In fact, in the earlier incantation of
> this patch, my fix was to teach visit_type_implicit_FOO() how to check
> for NULL:
> https://lists.gnu.org/archive/html/qemu-devel/2015-09/msg05442.html
>
> But now that visit_type_implicit_FOO() is gone, my earlier incantation
> got reduced in size. I guess it's all in how I document the commit message.

Give it a try :)

On 02/18/2016 01:24 AM, Markus Armbruster wrote:

>> For simple unions, you could have (well, still can have, until my later
>> patch gets rid of the simple_union_type() magic):
>>
>> struct SU {
>>     SUKind type;
>>     union {
>>         void *data;
>>         int8_t byte;
>>     } u;
>> };
>
> Begs the question why that works :)

By sheer luck, and (poorly?) documented in a hairy comment in
qapi-dealloc-visitor.c (at least, until I delete visit_start_union).

We have a data-dependent decision (not only the contents of 'byte', but
ALSO the contents of the padding bits), but either the decision results
in calling visit_type_int8() (and doing nothing) or skipping the call
(and likewise doing nothing).

>> I guess I may try and reword this slightly, and point to the fact that
>> the NULL dereference was due to calling visit_start_implicit_FOO() (only
>> done for flat unions; for simple unions the branches call
>> visit_type_FOO(), and that call safely handled NULL);
>
> That's why it works?
>

>> But now that visit_type_implicit_FOO() is gone, my earlier incantation
>> got reduced in size. I guess it's all in how I document the commit message.
>
> Give it a try :)

I gave it my best in v11 :) Maybe you'll still have wording
improvements, but this back-and-forth has helped both of us try to
actually characterize what is going on.

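The "sheer luck" case for the `struct SU` layout quoted above can be sketched as below. This is a hedged illustration, not the generated code: the enumerator `SU_KIND_BYTE` and the functions `dealloc_type_int8()` and `dealloc_su()` are hypothetical, and the sketch assumes that reading `u.data` after storing to `u.byte` reinterprets whatever bytes are in the union, padding included.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef enum SUKind {
    SU_KIND_BYTE    /* hypothetical single branch for the sketch */
} SUKind;

typedef struct SU {
    SUKind type;
    union {
        void *data;     /* probe member, wider than 'byte' */
        int8_t byte;    /* scalar branch stored directly */
    } u;
} SU;

/* Deallocating a scalar is a no-op: it owns no memory. */
static void dealloc_type_int8(int8_t *obj)
{
    (void)obj;
}

/* Probe-then-visit, as the removed visit_start_union() guard did.
 * The probe result depends on 'byte' AND on the bytes next to it, but
 * both outcomes end up doing nothing for a scalar branch.
 * Returns the probe result so callers can observe the decision. */
static bool dealloc_su(SU *obj)
{
    bool data_present = obj->u.data != NULL;

    if (data_present && obj->type == SU_KIND_BYTE) {
        dealloc_type_int8(&obj->u.byte);
    }
    return data_present;
}
```

Whether the probe reports true or false here, the object is left untouched, which matches the explanation above that the decision is data-dependent yet harmless.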
diff --git a/include/qapi/visitor.h b/include/qapi/visitor.h
index c131a32..b8ae1b5 100644
--- a/include/qapi/visitor.h
+++ b/include/qapi/visitor.h
@@ -80,6 +80,5 @@ void visit_type_str(Visitor *v, const char *name, char **obj, Error **errp);
 void visit_type_number(Visitor *v, const char *name, double *obj,
                        Error **errp);
 void visit_type_any(Visitor *v, const char *name, QObject **obj, Error **errp);
-bool visit_start_union(Visitor *v, bool data_present, Error **errp);

 #endif
diff --git a/include/qapi/visitor-impl.h b/include/qapi/visitor-impl.h
index 7905a28..c4af3e0 100644
--- a/include/qapi/visitor-impl.h
+++ b/include/qapi/visitor-impl.h
@@ -58,8 +58,6 @@ struct Visitor

     /* May be NULL; most useful for input visitors. */
     void (*optional)(Visitor *v, const char *name, bool *present);
-
-    bool (*start_union)(Visitor *v, bool data_present, Error **errp);
 };

 void input_type_enum(Visitor *v, const char *name, int *obj,
diff --git a/scripts/qapi-visit.py b/scripts/qapi-visit.py
index 68354d8..02f0122 100644
--- a/scripts/qapi-visit.py
+++ b/scripts/qapi-visit.py
@@ -246,9 +246,6 @@ void visit_type_%(c_name)s(Visitor *v, const char *name, %(c_name)s **obj, Error
     if variants:
         ret += gen_err_check(label='out_obj')
         ret += mcgen('''
-    if (!visit_start_union(v, !!(*obj)->u.data, &err) || err) {
-        goto out_obj;
-    }
     switch ((*obj)->%(c_name)s) {
 ''',
                  c_name=c_name(variants.tag_member.name))
diff --git a/qapi/qapi-visit-core.c b/qapi/qapi-visit-core.c
index 6fa66f1..976106e 100644
--- a/qapi/qapi-visit-core.c
+++ b/qapi/qapi-visit-core.c
@@ -60,14 +60,6 @@ void visit_end_list(Visitor *v)
     v->end_list(v);
 }

-bool visit_start_union(Visitor *v, bool data_present, Error **errp)
-{
-    if (v->start_union) {
-        return v->start_union(v, data_present, errp);
-    }
-    return true;
-}
-
 bool visit_optional(Visitor *v, const char *name, bool *present)
 {
     if (v->optional) {
diff --git a/qapi/qapi-dealloc-visitor.c b/qapi/qapi-dealloc-visitor.c
index 6667e8c..4eae555 100644
--- a/qapi/qapi-dealloc-visitor.c
+++ b/qapi/qapi-dealloc-visitor.c
@@ -169,31 +169,6 @@ static void qapi_dealloc_type_enum(Visitor *v, const char *name, int *obj,
 {
 }

-/* If there's no data present, the dealloc visitor has nothing to free.
- * Thus, indicate to visitor code that the subsequent union fields can
- * be skipped. This is not an error condition, since the cleanup of the
- * rest of an object can continue unhindered, so leave errp unset in
- * these cases.
- *
- * NOTE: In cases where we're attempting to deallocate an object that
- * may have missing fields, the field indicating the union type may
- * be missing. In such a case, it's possible we don't have enough
- * information to differentiate data_present == false from a case where
- * data *is* present but happens to be a scalar with a value of 0.
- * This is okay, since in the case of the dealloc visitor there's no
- * work that needs to done in either situation.
- *
- * The current inability in QAPI code to more thoroughly verify a union
- * type in such cases will likely need to be addressed if we wish to
- * implement this interface for other types of visitors in the future,
- * however.
- */
-static bool qapi_dealloc_start_union(Visitor *v, bool data_present,
-                                     Error **errp)
-{
-    return data_present;
-}
-
 Visitor *qapi_dealloc_get_visitor(QapiDeallocVisitor *v)
 {
     return &v->visitor;
@@ -224,7 +199,6 @@ QapiDeallocVisitor *qapi_dealloc_visitor_new(void)
     v->visitor.type_str = qapi_dealloc_type_str;
     v->visitor.type_number = qapi_dealloc_type_number;
     v->visitor.type_any = qapi_dealloc_type_anything;
-    v->visitor.start_union = qapi_dealloc_start_union;

     QTAILQ_INIT(&v->stack);

Commit cee2dedb noticed that if you have a partial flat union
(such as if an input parse failed due to a missing
discriminator), calling the dealloc visitor could result in
trying to dereference a NULL pointer if we attempted to visit
an object branch without an earlier successful call to
visit_start_implicit_struct() allocating the pointer for that
branch. But the "fix" it implemented requires the use of a
'.data' member in the union, which may or may not be the same
size as other branches of the union (consider a 32-bit platform
where one of the branches is an int64), which feels fairly dirty.
Plus, as mentioned in that commit, it only works if you can
assume that '.data' would be zero-initialized even if '.kind' was
uninitialized, which is rather poor logic: our usage of
visit_start_struct() happens to zero-initialize both fields,
which means '.kind' is never truly uninitialized - but if we
changed visit_start_struct() to use g_new() instead of g_new0(),
then '.data' would not be any more reliable as a condition on
whether to visit the branch matching '.kind', regardless of
whether '.kind' was 0.

Meanwhile, now that we have just inlined the fields of all flat
unions, there is no longer the possibility of a null pointer to
dereference in the first place. Where the branch structure used
to be separately allocated by visit_start_implicit_struct(), it
is now just pointing to a subset of the memory already
zero-allocated by visit_start_struct().

Thus, we can instead fix things to delete the misguided
visit_start_union(), as it is no longer providing any benefit.
And it finishes the cleanup we started in commit 7c91aabd when
we deleted visit_end_union(). Generated code changes as follows:

|@@ -2366,9 +2363,6 @@ void visit_type_ChardevBackend(Visitor *
|     if (err) {
|         goto out_obj;
|     }
|-    if (!visit_start_union(v, !!(*obj)->u.data, &err) || err) {
|-        goto out_obj;
|-    }
|     switch ((*obj)->type) {
|     case CHARDEV_BACKEND_KIND_FILE:
|         visit_type_ChardevFile(v, "data", &(*obj)->u.file, &err);

Signed-off-by: Eric Blake <eblake@redhat.com>

---
v10: retitle, hoist earlier in series, rebase, drop R-b
v9: no change
v8: rebase to 'name' motion
v7: rebase to earlier context changes, simplify 'obj && !*obj'
condition based on contract
v6: rebase due to deferring 7/46, and gen_err_check() improvements;
rewrite gen_visit_implicit_struct() more like other patterns
---
 include/qapi/visitor.h      |  1 -
 include/qapi/visitor-impl.h |  2 --
 scripts/qapi-visit.py       |  3 ---
 qapi/qapi-visit-core.c      |  8 --------
 qapi/qapi-dealloc-visitor.c | 26 --------------------------
 5 files changed, 40 deletions(-)