Message ID | c3012f1da361af354a904f821b83d61f2534ccb2.1587297254.git.martin.agren@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | strbuf: fix doc for `strbuf_attach()` and avoid it | expand |
Martin Ågren <martin.agren@gmail.com> writes: > +/** > + * Attach a string to a buffer similar to `strbuf_attachstr_len()`. > + * Useful if you do not know the length of the string. > + */ > +static inline void strbuf_attachstr(struct strbuf *sb, char *str) > +{ > + size_t len = strlen(str); > + > + strbuf_attach(sb, str, len, len + 1); > +} This is somewhat worrysome in that the interface is _so_ simple that people may fail to see that str must be allocated piece of memory, and it is preferrable if string fully fills the allocation. We should repeat that (instead of just trusting "similar to ..." would tell them enough) in the doc, perhaps? > diff --git a/trailer.c b/trailer.c > index 0c414f2fed..56c4027943 100644 > --- a/trailer.c > +++ b/trailer.c > @@ -1095,7 +1095,7 @@ void trailer_info_get(struct trailer_info *info, const char *str, > for (ptr = trailer_lines; *ptr; ptr++) { > if (last && isspace((*ptr)->buf[0])) { > struct strbuf sb = STRBUF_INIT; > - strbuf_attach(&sb, *last, strlen(*last), strlen(*last)); > + strbuf_attachstr(&sb, *last); > strbuf_addbuf(&sb, *ptr); > *last = strbuf_detach(&sb, NULL); > continue; This is not wrong per-se, but it is unclear if use of strbuf_attach* family to avoid an explicit malloc/copy/free is buying much at this callsite. Simplifying the code here of course is not within the scope of this series.
On Mon, 20 Apr 2020 at 21:39, Junio C Hamano <gitster@pobox.com> wrote: > > Martin Ågren <martin.agren@gmail.com> writes: > > > +/** > > + * Attach a string to a buffer similar to `strbuf_attachstr_len()`. > > + * Useful if you do not know the length of the string. > > + */ > > +static inline void strbuf_attachstr(struct strbuf *sb, char *str) > > +{ > > + size_t len = strlen(str); > > + > > + strbuf_attach(sb, str, len, len + 1); > > +} > > This is somewhat worrysome in that the interface is _so_ simple that > people may fail to see that str must be allocated piece of memory, > and it is preferrable if string fully fills the allocation. > > We should repeat that (instead of just trusting "similar to ..." > would tell them enough) in the doc, perhaps? Yeah, that's a good point. I'll expand on this to try to better get through that there are things to consider here. > > @@ -1095,7 +1095,7 @@ void trailer_info_get(struct trailer_info *info, const char *str, > > for (ptr = trailer_lines; *ptr; ptr++) { > > if (last && isspace((*ptr)->buf[0])) { > > struct strbuf sb = STRBUF_INIT; > > - strbuf_attach(&sb, *last, strlen(*last), strlen(*last)); > > + strbuf_attachstr(&sb, *last); > > strbuf_addbuf(&sb, *ptr); > > *last = strbuf_detach(&sb, NULL); > > continue; > > This is not wrong per-se, but it is unclear if use of strbuf_attach* > family to avoid an explicit malloc/copy/free is buying much at this > callsite. Simplifying the code here of course is not within the > scope of this series. For the other patches in this series, I spent some time and effort investigating where strings came from, "do I really feel certain that they're NUL-terminated?". But for this patch, I more or less went "we've been using strlen on this all this time, surely if it wasn't guaranteed to be NUL-terminated we'd have messed up already". And I don't think I'm making anything worse. But yeah, I didn't really step back to look at what these sites are really doing, and how, as much as I did for the others. Martin
diff --git a/strbuf.h b/strbuf.h index 7d0aeda434..32cc15de0c 100644 --- a/strbuf.h +++ b/strbuf.h @@ -140,6 +140,17 @@ static inline void strbuf_attachstr_len(struct strbuf *sb, strbuf_attach(sb, str, len, len + 1); } +/** + * Attach a string to a buffer similar to `strbuf_attachstr_len()`. + * Useful if you do not know the length of the string. + */ +static inline void strbuf_attachstr(struct strbuf *sb, char *str) +{ + size_t len = strlen(str); + + strbuf_attach(sb, str, len, len + 1); +} + /** * Swap the contents of two string buffers. */ diff --git a/path.c b/path.c index 9bd717c307..3cd8fd56b4 100644 --- a/path.c +++ b/path.c @@ -815,8 +815,7 @@ const char *enter_repo(const char *path, int strict) char *newpath = expand_user_path(used_path.buf, 0); if (!newpath) return NULL; - strbuf_attach(&used_path, newpath, strlen(newpath), - strlen(newpath)); + strbuf_attachstr(&used_path, newpath); } for (i = 0; suffix[i]; i++) { struct stat st; diff --git a/pretty.c b/pretty.c index e171830389..5ecdf0cbb2 100644 --- a/pretty.c +++ b/pretty.c @@ -590,7 +590,7 @@ static char *replace_encoding_header(char *buf, const char *encoding) return buf; /* should not happen but be defensive */ len = cp + 1 - (buf + start); - strbuf_attach(&tmp, buf, strlen(buf), strlen(buf) + 1); + strbuf_attachstr(&tmp, buf); if (is_encoding_utf8(encoding)) { /* we have re-coded to UTF-8; drop the header */ strbuf_remove(&tmp, start, len); diff --git a/refs/files-backend.c b/refs/files-backend.c index 561c33ac8a..eb058d85b6 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -1511,10 +1511,9 @@ static int commit_ref(struct ref_lock *lock) * the lockfile to. Hopefully it is empty; try to * delete it. */ - size_t len = strlen(path); struct strbuf sb_path = STRBUF_INIT; - strbuf_attach(&sb_path, path, len, len); + strbuf_attachstr(&sb_path, path); /* * If this fails, commit_lock_file() will also fail diff --git a/trailer.c b/trailer.c index 0c414f2fed..56c4027943 100644 --- a/trailer.c +++ b/trailer.c @@ -1095,7 +1095,7 @@ void trailer_info_get(struct trailer_info *info, const char *str, for (ptr = trailer_lines; *ptr; ptr++) { if (last && isspace((*ptr)->buf[0])) { struct strbuf sb = STRBUF_INIT; - strbuf_attach(&sb, *last, strlen(*last), strlen(*last)); + strbuf_attachstr(&sb, *last); strbuf_addbuf(&sb, *ptr); *last = strbuf_detach(&sb, NULL); continue;
Similar to the previous commit, introduce `strbuf_attachstr()` where we don't even have to pass in the length of the string that we want to attach. Convert existing callers of `strbuf_attachstr()` that use `strlen()`. Note how only one caller passes in `mem == len + 1` and that the others have been using `strbuf_attach()` in direct contradiction to how it was (incorrectly) documented up until a few commits ago. Now that the documentation has been fixed, you might say these are all fine. But the calling convention of `strbuf_attach()` seems sufficiently hard to get right that it's probably a good idea to introduce this helper. This could help reduce reallocations and memory waste. When we pessimistically pass in `strlen(foo)` for `mem`, the strbuf will have `alloc == len` and will do a reallocation, not just to get one more byte for the NUL (which would have been a no-op), but because we're using `ALLOC_GROW` under the hood, we will ask for 16 more bytes and another 50% on top of that. Signed-off-by: Martin Ågren <martin.agren@gmail.com> --- strbuf.h | 11 +++++++++++ path.c | 3 +-- pretty.c | 2 +- refs/files-backend.c | 3 +-- trailer.c | 2 +- 5 files changed, 15 insertions(+), 6 deletions(-)