diff mbox series

[v3,2/4] unbundle: introduce unbundle_fsck_flags for fsckobjects handling

Message ID 057c697970ff49301cd9dc6adef099f53d440c3c.1716824518.git.gitgitgadget@gmail.com (mailing list archive)
State Superseded
Headers show
Series object checking related additions and fixes for bundles in fetches | expand

Commit Message

Xing Xin May 27, 2024, 3:41 p.m. UTC
From: Xing Xin <xingxin.xx@bytedance.com>

This commit adds a new enum `unbundle_fsck_flags` which is designed to
control the fsck behavior when unbundling. `unbundle` can use this newly
passed in enum to further decide whether to enable `--fsck-objects` for
"git-index-pack".

Currently only `UNBUNDLE_FSCK_NEVER` and `UNBUNDLE_FSCK_ALWAYS` are
supported as the very basic options. Another interesting option would be
added in later commits.

Signed-off-by: Xing Xin <xingxin.xx@bytedance.com>
---
 builtin/bundle.c |  2 +-
 bundle-uri.c     |  2 +-
 bundle.c         | 12 +++++++++++-
 bundle.h         |  8 +++++++-
 transport.c      |  2 +-
 5 files changed, 21 insertions(+), 5 deletions(-)

Comments

Patrick Steinhardt May 28, 2024, 12:03 p.m. UTC | #1
On Mon, May 27, 2024 at 03:41:55PM +0000, Xing Xin via GitGitGadget wrote:
[snip]
> diff --git a/bundle.h b/bundle.h
> index 021adbdcbb3..cfa9daddda6 100644
> --- a/bundle.h
> +++ b/bundle.h
> @@ -30,6 +30,11 @@ int create_bundle(struct repository *r, const char *path,
>  		  int argc, const char **argv, struct strvec *pack_options,
>  		  int version);
>  
> +enum unbundle_fsck_flags {
> +	UNBUNDLE_FSCK_NEVER = 0,
> +	UNBUNDLE_FSCK_ALWAYS,
> +};
> +
>  enum verify_bundle_flags {
>  	VERIFY_BUNDLE_VERBOSE = (1 << 0),
>  	VERIFY_BUNDLE_QUIET = (1 << 1),

Wouldn't this have been a natural fit for the new flag, e.g. via
something like `VERIFY_BUNDLE_FSCK`?

Patrick
Xing Xin May 29, 2024, 6:12 p.m. UTC | #2
At 2024-05-28 20:03:25, "Patrick Steinhardt" <ps@pks.im> wrote:
>On Mon, May 27, 2024 at 03:41:55PM +0000, Xing Xin via GitGitGadget wrote:
>[snip]
>> diff --git a/bundle.h b/bundle.h
>> index 021adbdcbb3..cfa9daddda6 100644
>> --- a/bundle.h
>> +++ b/bundle.h
>> @@ -30,6 +30,11 @@ int create_bundle(struct repository *r, const char *path,
>>  		  int argc, const char **argv, struct strvec *pack_options,
>>  		  int version);
>>  
>> +enum unbundle_fsck_flags {
>> +	UNBUNDLE_FSCK_NEVER = 0,
>> +	UNBUNDLE_FSCK_ALWAYS,
>> +};
>> +
>>  enum verify_bundle_flags {
>>  	VERIFY_BUNDLE_VERBOSE = (1 << 0),
>>  	VERIFY_BUNDLE_QUIET = (1 << 1),
>
>Wouldn't this have been a natural fit for the new flag, e.g. via
>something like `VERIFY_BUNDLE_FSCK`?

It makes sense to me. Currently, verify_bundle_flags controls the amount
of information displayed when checking a bundle's prerequisites. The
newly added unbundle_fsck_flags is designed to check for broken objects
during the unbundle process, which is essentially a form of bundle
verification. I believe we should extend some object verification
capabilities to the git bundle verify command as well, perhaps by adding
a --fsck-objects option.

With this in mind, I support adding new options to verify_bundle_flags.
Since bundle.c:unbundle needs to combine multiple options, we must
define new options using bitwise shifting:

	enum verify_bundle_flags {
		VERIFY_BUNDLE_VERBOSE = (1 << 0),
		VERIFY_BUNDLE_QUIET = (1 << 1),
		VERIFY_BUNDLE_FSCK_OBJECTS_ALWAYS = (1 << 2),
		VERIFY_BUNDLE_FSCK_OBJECTS_FOLLOW_FETCH = (1 << 3),
	};

How about the naming? I'm not very good at naming :)

Xing Xin
Patrick Steinhardt May 30, 2024, 4:38 a.m. UTC | #3
On Thu, May 30, 2024 at 02:12:47AM +0800, Xing Xin wrote:
> At 2024-05-28 20:03:25, "Patrick Steinhardt" <ps@pks.im> wrote:
> >On Mon, May 27, 2024 at 03:41:55PM +0000, Xing Xin via GitGitGadget wrote:
> >[snip]
> >> diff --git a/bundle.h b/bundle.h
> >> index 021adbdcbb3..cfa9daddda6 100644
> >> --- a/bundle.h
> >> +++ b/bundle.h
> >> @@ -30,6 +30,11 @@ int create_bundle(struct repository *r, const char *path,
> >>  		  int argc, const char **argv, struct strvec *pack_options,
> >>  		  int version);
> >>  
> >> +enum unbundle_fsck_flags {
> >> +	UNBUNDLE_FSCK_NEVER = 0,
> >> +	UNBUNDLE_FSCK_ALWAYS,
> >> +};
> >> +
> >>  enum verify_bundle_flags {
> >>  	VERIFY_BUNDLE_VERBOSE = (1 << 0),
> >>  	VERIFY_BUNDLE_QUIET = (1 << 1),
> >
> >Wouldn't this have been a natural fit for the new flag, e.g. via
> >something like `VERIFY_BUNDLE_FSCK`?
> 
> It makes sense to me. Currently, verify_bundle_flags controls the amount
> of information displayed when checking a bundle's prerequisites. The
> newly added unbundle_fsck_flags is designed to check for broken objects
> during the unbundle process, which is essentially a form of bundle
> verification. I believe we should extend some object verification
> capabilities to the git bundle verify command as well, perhaps by adding
> a --fsck-objects option.
> 
> With this in mind, I support adding new options to verify_bundle_flags.
> Since bundle.c:unbundle needs to combine multiple options, we must
> define new options using bitwise shifting:
> 
> 	enum verify_bundle_flags {
> 		VERIFY_BUNDLE_VERBOSE = (1 << 0),
> 		VERIFY_BUNDLE_QUIET = (1 << 1),
> 		VERIFY_BUNDLE_FSCK_OBJECTS_ALWAYS = (1 << 2),
> 		VERIFY_BUNDLE_FSCK_OBJECTS_FOLLOW_FETCH = (1 << 3),
> 	};
> 
> How about the naming? I'm not very good at naming :)

I later noticed that you extend the `unbundle_fsck_flags` in a later
patch. With that in mind I don't think it's all that important anymore
to merge those into the `verify_bundle_flags` as you would otherwise
allow for weirdness. What happens for example when both `ALWAYS` and
`FOLLOW_FETCH` are set?

So feel free to ignore this advice. If you still think it's a good idea
then the above naming looks okay to me.

Patrick
Xing Xin May 30, 2024, 8:46 a.m. UTC | #4
At 2024-05-30 12:38:49, "Patrick Steinhardt" <ps@pks.im> wrote:
[snip]
>> >
>> >Wouldn't this have been a natural fit for the new flag, e.g. via
>> >something like `VERIFY_BUNDLE_FSCK`?
>> 
>> It makes sense to me. Currently, verify_bundle_flags controls the amount
>> of information displayed when checking a bundle's prerequisites. The
>> newly added unbundle_fsck_flags is designed to check for broken objects
>> during the unbundle process, which is essentially a form of bundle
>> verification. I believe we should extend some object verification
>> capabilities to the git bundle verify command as well, perhaps by adding
>> a --fsck-objects option.
>> 
>> With this in mind, I support adding new options to verify_bundle_flags.
>> Since bundle.c:unbundle needs to combine multiple options, we must
>> define new options using bitwise shifting:
>> 
>> 	enum verify_bundle_flags {
>> 		VERIFY_BUNDLE_VERBOSE = (1 << 0),
>> 		VERIFY_BUNDLE_QUIET = (1 << 1),
>> 		VERIFY_BUNDLE_FSCK_OBJECTS_ALWAYS = (1 << 2),
>> 		VERIFY_BUNDLE_FSCK_OBJECTS_FOLLOW_FETCH = (1 << 3),
>> 	};
>> 
>> How about the naming? I'm not very good at naming :)
>
>I later noticed that you extend the `unbundle_fsck_flags` in a later
>patch. With that in mind I don't think it's all that important anymore
>to merge those into the `verify_bundle_flags` as you would otherwise
>allow for weirdness. What happens for example when both `ALWAYS` and
>`FOLLOW_FETCH` are set?
>
>So feel free to ignore this advice. If you still think it's a good idea
>then the above naming looks okay to me.

With the idea of extending "--fsck-objects" support for "git bundle verify" and
"git bundle unbundle", I prefer to grouping these options together. Especially
in the "git bundle verify" scenario, adding a new parameter like
`unbundle_fsck_flags` for `bundle.c:verify_bundle` is confusing.

Xing Xin
diff mbox series

Patch

diff --git a/builtin/bundle.c b/builtin/bundle.c
index 3ad11dc5d05..6c10961c640 100644
--- a/builtin/bundle.c
+++ b/builtin/bundle.c
@@ -212,7 +212,7 @@  static int cmd_bundle_unbundle(int argc, const char **argv, const char *prefix)
 		strvec_pushl(&extra_index_pack_args, "-v", "--progress-title",
 			     _("Unbundling objects"), NULL);
 	ret = !!unbundle(the_repository, &header, bundle_fd,
-			 &extra_index_pack_args, 0) ||
+			 &extra_index_pack_args, 0, UNBUNDLE_FSCK_NEVER) ||
 		list_bundle_refs(&header, argc, argv);
 	bundle_header_release(&header);
 cleanup:
diff --git a/bundle-uri.c b/bundle-uri.c
index 65666a11d9c..80f02aac6f1 100644
--- a/bundle-uri.c
+++ b/bundle-uri.c
@@ -373,7 +373,7 @@  static int unbundle_from_file(struct repository *r, const char *file)
 	 * the prerequisite commits.
 	 */
 	if ((result = unbundle(r, &header, bundle_fd, NULL,
-			       VERIFY_BUNDLE_QUIET)))
+			       VERIFY_BUNDLE_QUIET, UNBUNDLE_FSCK_ALWAYS)))
 		return 1;
 
 	/*
diff --git a/bundle.c b/bundle.c
index 95367c2d0a0..a922d592782 100644
--- a/bundle.c
+++ b/bundle.c
@@ -612,7 +612,8 @@  int create_bundle(struct repository *r, const char *path,
 
 int unbundle(struct repository *r, struct bundle_header *header,
 	     int bundle_fd, struct strvec *extra_index_pack_args,
-	     enum verify_bundle_flags flags)
+	     enum verify_bundle_flags flags,
+	     enum unbundle_fsck_flags fsck_flags)
 {
 	struct child_process ip = CHILD_PROCESS_INIT;
 
@@ -625,6 +626,15 @@  int unbundle(struct repository *r, struct bundle_header *header,
 	if (header->filter.choice)
 		strvec_push(&ip.args, "--promisor=from-bundle");
 
+	switch (fsck_flags) {
+	case UNBUNDLE_FSCK_ALWAYS:
+		strvec_push(&ip.args, "--fsck-objects");
+		break;
+	case UNBUNDLE_FSCK_NEVER:
+	default:
+		break;
+	}
+
 	if (extra_index_pack_args) {
 		strvec_pushv(&ip.args, extra_index_pack_args->v);
 		strvec_clear(extra_index_pack_args);
diff --git a/bundle.h b/bundle.h
index 021adbdcbb3..cfa9daddda6 100644
--- a/bundle.h
+++ b/bundle.h
@@ -30,6 +30,11 @@  int create_bundle(struct repository *r, const char *path,
 		  int argc, const char **argv, struct strvec *pack_options,
 		  int version);
 
+enum unbundle_fsck_flags {
+	UNBUNDLE_FSCK_NEVER = 0,
+	UNBUNDLE_FSCK_ALWAYS,
+};
+
 enum verify_bundle_flags {
 	VERIFY_BUNDLE_VERBOSE = (1 << 0),
 	VERIFY_BUNDLE_QUIET = (1 << 1),
@@ -53,7 +58,8 @@  int verify_bundle(struct repository *r, struct bundle_header *header,
  */
 int unbundle(struct repository *r, struct bundle_header *header,
 	     int bundle_fd, struct strvec *extra_index_pack_args,
-	     enum verify_bundle_flags flags);
+	     enum verify_bundle_flags flags,
+	     enum unbundle_fsck_flags fsck_flags);
 int list_bundle_refs(struct bundle_header *header,
 		int argc, const char **argv);
 
diff --git a/transport.c b/transport.c
index 0ad04b77fd2..6799988f10c 100644
--- a/transport.c
+++ b/transport.c
@@ -184,7 +184,7 @@  static int fetch_refs_from_bundle(struct transport *transport,
 	if (!data->get_refs_from_bundle_called)
 		get_refs_from_bundle_inner(transport);
 	ret = unbundle(the_repository, &data->header, data->fd,
-		       &extra_index_pack_args, 0);
+		       &extra_index_pack_args, 0, UNBUNDLE_FSCK_ALWAYS);
 	transport->hash_algo = data->header.hash_algo;
 	return ret;
 }