Message ID | 9ec999882d790aa770aba8c0916b9260661af9be.1696293862.git.me@ttaylorr.com (mailing list archive) |
---|---|
State | Accepted |
Commit | b5b1f4c0ecf20c069d0301732edfdbfec167aa0c |
Headers | show |
Series | repack: implement `--cruft-max-size` | expand |
On Mon, Oct 02, 2023 at 08:44:29PM -0400, Taylor Blau wrote: > The repack builtin takes a `--max-pack-size` command-line argument which > it uses to feed into any of the pack-objects children that it may spawn > when generating a new pack. > > This option is parsed with OPT_STRING, meaning that we'll accept > anything as input, punting on more fine-grained validation until we get > down into pack-objects. > > This is fine, but it's wasteful to spend an entire sub-process just to > figure out that one of its option is bogus. Instead, parse the value of > `--max-pack-size` with OPT_MAGNITUDE in 'git repack', and then pass the > knonw-good result down to pack-objects. Tiny nit: s/knonw/known. Other than that this patch looks good to me. Patrick > Suggested-by: Junio C Hamano <gitster@pobox.com> > Signed-off-by: Taylor Blau <me@ttaylorr.com> > --- > builtin/repack.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/builtin/repack.c b/builtin/repack.c > index 529e13120d..8a5bbb9cba 100644 > --- a/builtin/repack.c > +++ b/builtin/repack.c > @@ -51,7 +51,7 @@ struct pack_objects_args { > const char *window_memory; > const char *depth; > const char *threads; > - const char *max_pack_size; > + unsigned long max_pack_size; > int no_reuse_delta; > int no_reuse_object; > int quiet; > @@ -242,7 +242,7 @@ static void prepare_pack_objects(struct child_process *cmd, > if (args->threads) > strvec_pushf(&cmd->args, "--threads=%s", args->threads); > if (args->max_pack_size) > - strvec_pushf(&cmd->args, "--max-pack-size=%s", args->max_pack_size); > + strvec_pushf(&cmd->args, "--max-pack-size=%lu", args->max_pack_size); > if (args->no_reuse_delta) > strvec_pushf(&cmd->args, "--no-reuse-delta"); > if (args->no_reuse_object) > @@ -946,7 +946,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix) > N_("limits the maximum delta depth")), > OPT_STRING(0, "threads", &po_args.threads, N_("n"), > N_("limits the maximum number of threads")), > - OPT_STRING(0, "max-pack-size", &po_args.max_pack_size, N_("bytes"), > + OPT_MAGNITUDE(0, "max-pack-size", &po_args.max_pack_size, > N_("maximum size of each packfile")), > OPT_BOOL(0, "pack-kept-objects", &pack_kept_objects, > N_("repack objects in packs marked with .keep")), > -- > 2.42.0.310.g9604b54f73.dirty >
On Thu, Oct 05, 2023 at 01:31:02PM +0200, Patrick Steinhardt wrote: > On Mon, Oct 02, 2023 at 08:44:29PM -0400, Taylor Blau wrote: > > The repack builtin takes a `--max-pack-size` command-line argument which > > it uses to feed into any of the pack-objects children that it may spawn > > when generating a new pack. > > > > This option is parsed with OPT_STRING, meaning that we'll accept > > anything as input, punting on more fine-grained validation until we get > > down into pack-objects. > > > > This is fine, but it's wasteful to spend an entire sub-process just to > > figure out that one of its option is bogus. Instead, parse the value of > > `--max-pack-size` with OPT_MAGNITUDE in 'git repack', and then pass the > > knonw-good result down to pack-objects. > > Tiny nit: s/knonw/known. > > Other than that this patch looks good to me. Oops, good eyes. Perhaps Junio can tweak this when queuing, but if he doesn't, I don't think it'd be the end of the world... Thanks, Taylor
Taylor Blau <me@ttaylorr.com> writes: > On Thu, Oct 05, 2023 at 01:31:02PM +0200, Patrick Steinhardt wrote: >> On Mon, Oct 02, 2023 at 08:44:29PM -0400, Taylor Blau wrote: >> > The repack builtin takes a `--max-pack-size` command-line argument which >> > it uses to feed into any of the pack-objects children that it may spawn >> > when generating a new pack. >> > >> > This option is parsed with OPT_STRING, meaning that we'll accept >> > anything as input, punting on more fine-grained validation until we get >> > down into pack-objects. >> > >> > This is fine, but it's wasteful to spend an entire sub-process just to >> > figure out that one of its option is bogus. Instead, parse the value of >> > `--max-pack-size` with OPT_MAGNITUDE in 'git repack', and then pass the >> > knonw-good result down to pack-objects. >> >> Tiny nit: s/knonw/known. >> >> Other than that this patch looks good to me. > > Oops, good eyes. Perhaps Junio can tweak this when queuing, but if he > doesn't, I don't think it'd be the end of the world... I cannot do so "when queuing" anymore, but let me see if it is still outside 'next' and then run "rebase -i" on it. This is a bit different from what I would have written (I would have passed the original string to underlying subprocess for it to sanity check the value to its own liking, after we checked if it makes sense when interpreted as magnitude, so that the underlying subprocess, if it has stricter limit than we know of, can complain with the string the end-user gave it, not the result of us turning into a large integer string. But that is minor. Thanks, both.
diff --git a/builtin/repack.c b/builtin/repack.c index 529e13120d..8a5bbb9cba 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -51,7 +51,7 @@ struct pack_objects_args { const char *window_memory; const char *depth; const char *threads; - const char *max_pack_size; + unsigned long max_pack_size; int no_reuse_delta; int no_reuse_object; int quiet; @@ -242,7 +242,7 @@ static void prepare_pack_objects(struct child_process *cmd, if (args->threads) strvec_pushf(&cmd->args, "--threads=%s", args->threads); if (args->max_pack_size) - strvec_pushf(&cmd->args, "--max-pack-size=%s", args->max_pack_size); + strvec_pushf(&cmd->args, "--max-pack-size=%lu", args->max_pack_size); if (args->no_reuse_delta) strvec_pushf(&cmd->args, "--no-reuse-delta"); if (args->no_reuse_object) @@ -946,7 +946,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix) N_("limits the maximum delta depth")), OPT_STRING(0, "threads", &po_args.threads, N_("n"), N_("limits the maximum number of threads")), - OPT_STRING(0, "max-pack-size", &po_args.max_pack_size, N_("bytes"), + OPT_MAGNITUDE(0, "max-pack-size", &po_args.max_pack_size, N_("maximum size of each packfile")), OPT_BOOL(0, "pack-kept-objects", &pack_kept_objects, N_("repack objects in packs marked with .keep")),
The repack builtin takes a `--max-pack-size` command-line argument which it uses to feed into any of the pack-objects children that it may spawn when generating a new pack. This option is parsed with OPT_STRING, meaning that we'll accept anything as input, punting on more fine-grained validation until we get down into pack-objects. This is fine, but it's wasteful to spend an entire sub-process just to figure out that one of its option is bogus. Instead, parse the value of `--max-pack-size` with OPT_MAGNITUDE in 'git repack', and then pass the knonw-good result down to pack-objects. Suggested-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> --- builtin/repack.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)