Message ID | 20240731134014.2299361-2-christian.couder@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Introduce a "promisor-remote" capability | expand |
Christian Couder <christian.couder@gmail.com> writes: > diff --git a/strbuf.c b/strbuf.c > index 3d2189a7f6..cccfdec0e3 100644 > --- a/strbuf.c > +++ b/strbuf.c > @@ -1082,3 +1082,12 @@ void strbuf_strip_file_from_path(struct strbuf *sb) > char *path_sep = find_last_dir_sep(sb->buf); > strbuf_setlen(sb, path_sep ? path_sep - sb->buf + 1 : 0); > } > + > +void strbuf_sanitize(struct strbuf *sb) > +{ > + strbuf_trim(sb); > + for (size_t i = 0; i < sb->len; i++) { > + if (sb->buf[i] <= 32 || sb->buf[i] >= 127) > + sb->buf[i] = '.'; > + } > +} This looked a bit _too_ specific for the use of the transport layer (which raises the question if it should even live in strbuf.[ch]). It also made me wonder if different callers likely want to have different variants (e.g., do not trim, only trim at the tail, squash a run of unprintables into a single '.', use '?' instead of '.', etc., etc.). It turns out that there is only *one* existing caller that gets replaced with this "common" version, which made it a Meh to me. Let's hope that there will be many new callers to make this step worthwhile. > __attribute__((format (printf,1,2))) > int printf_ln(const char *fmt, ...); > __attribute__((format (printf,2,3))) > diff --git a/version.c b/version.c > index 41b718c29e..951e6dca74 100644 > --- a/version.c > +++ b/version.c > @@ -24,15 +24,10 @@ const char *git_user_agent_sanitized(void) > > if (!agent) { > struct strbuf buf = STRBUF_INIT; > - int i; > > strbuf_addstr(&buf, git_user_agent()); > - strbuf_trim(&buf); > - for (i = 0; i < buf.len; i++) { > - if (buf.buf[i] <= 32 || buf.buf[i] >= 127) > - buf.buf[i] = '.'; > - } > - agent = buf.buf; > + strbuf_sanitize(&buf); > + agent = strbuf_detach(&buf, NULL); > } > > return agent;
On Wed, Jul 31, 2024 at 7:18 PM Junio C Hamano <gitster@pobox.com> wrote: > > Christian Couder <christian.couder@gmail.com> writes: > > > diff --git a/strbuf.c b/strbuf.c > > index 3d2189a7f6..cccfdec0e3 100644 > > --- a/strbuf.c > > +++ b/strbuf.c > > @@ -1082,3 +1082,12 @@ void strbuf_strip_file_from_path(struct strbuf *sb) > > char *path_sep = find_last_dir_sep(sb->buf); > > strbuf_setlen(sb, path_sep ? path_sep - sb->buf + 1 : 0); > > } > > + > > +void strbuf_sanitize(struct strbuf *sb) > > +{ > > + strbuf_trim(sb); > > + for (size_t i = 0; i < sb->len; i++) { > > + if (sb->buf[i] <= 32 || sb->buf[i] >= 127) > > + sb->buf[i] = '.'; > > + } > > +} > > This looked a bit _too_ specific for the use of the transport layer > (which raises the question if it should even live in strbuf.[ch]). > It also made me wonder if different callers likely want to have > different variants (e.g., do not trim, only trim at the tail, squash > a run of unprintables into a single '.', use '?' instead of '.', > etc., etc.). > > It turns out that there is only *one* existing caller that gets > replaced with this "common" version, which made it a Meh to me. > > Let's hope that there will be many new callers to make this step > worthwhile. A very similar step was also part of my previous patch series to add an OS version to the protocol. See: https://lore.kernel.org/git/20240619125708.3719150-2-christian.couder@gmail.com/ My opinion is that the code is doing something often needed when dealing with the protocol, so it is worth it to refactor that code soon, and then adapt it later when needed with options (to not trim, only trim at the tail, use '?' instead of '.', etc). I am not sure if it should live in strbuf.[ch], but on the other hand if we indeed adapt it over time with a number of options for different use cases, it might end up in strbuf.[ch], so it is a reasonable bet to put it there right away. I must also say that I don't know which other place(s) would be a good home for it.
diff --git a/strbuf.c b/strbuf.c index 3d2189a7f6..cccfdec0e3 100644 --- a/strbuf.c +++ b/strbuf.c @@ -1082,3 +1082,12 @@ void strbuf_strip_file_from_path(struct strbuf *sb) char *path_sep = find_last_dir_sep(sb->buf); strbuf_setlen(sb, path_sep ? path_sep - sb->buf + 1 : 0); } + +void strbuf_sanitize(struct strbuf *sb) +{ + strbuf_trim(sb); + for (size_t i = 0; i < sb->len; i++) { + if (sb->buf[i] <= 32 || sb->buf[i] >= 127) + sb->buf[i] = '.'; + } +} diff --git a/strbuf.h b/strbuf.h index 003f880ff7..884157873e 100644 --- a/strbuf.h +++ b/strbuf.h @@ -664,6 +664,13 @@ typedef int (*char_predicate)(char ch); void strbuf_addstr_urlencode(struct strbuf *sb, const char *name, char_predicate allow_unencoded_fn); +/* + * Trim and replace each character with ascii code below 32 or above + * 127 (included) using a dot '.' character. Useful for sending + * capabilities. + */ +void strbuf_sanitize(struct strbuf *sb); + __attribute__((format (printf,1,2))) int printf_ln(const char *fmt, ...); __attribute__((format (printf,2,3))) diff --git a/version.c b/version.c index 41b718c29e..951e6dca74 100644 --- a/version.c +++ b/version.c @@ -24,15 +24,10 @@ const char *git_user_agent_sanitized(void) if (!agent) { struct strbuf buf = STRBUF_INIT; - int i; strbuf_addstr(&buf, git_user_agent()); - strbuf_trim(&buf); - for (i = 0; i < buf.len; i++) { - if (buf.buf[i] <= 32 || buf.buf[i] >= 127) - buf.buf[i] = '.'; - } - agent = buf.buf; + strbuf_sanitize(&buf); + agent = strbuf_detach(&buf, NULL); } return agent;
The git_user_agent_sanitized() function performs some sanitizing to avoid special characters being sent over the line and possibly messing up with the protocol or with the parsing on the other side. Let's extract this sanitizing into a new strbuf_sanitize() function, as we will want to reuse it in a following patch, and let's put it into strbuf.{c,h}. While at it, let's also make a few small improvements: - use 'size_t' for 'i' instead of 'int', - move the declaration of 'i' inside the 'for ( ... )', - use strbuf_detach() to explicitely detach the string contained by the 'sb' strbuf. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> --- strbuf.c | 9 +++++++++ strbuf.h | 7 +++++++ version.c | 9 ++------- 3 files changed, 18 insertions(+), 7 deletions(-)