[10/32] bulk-checkin: Only accept blobs

Message ID 20230908231049.2035003-10-ebiederm@xmission.com (mailing list archive)
State Accepted
Commit 9eb5419799f08402ee3bd185c2d2c50ded669b06
Series SHA256 and SHA1 interoperability

Commit Message

Eric W. Biederman Sept. 8, 2023, 11:10 p.m. UTC
As the code is written today bulk_checkin only accepts blobs.  When
dealing with multiple hash algorithms it is necessary to distinguish
between blobs and object types that have embedded oids.  For objects
that embed oids, a completely new object needs to be generated in
order to compute the compatibility hash.  For blobs, however, all
that is needed is to compute the compatibility hash over the same
content as the default hash.
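
To illustrate the point, here is a minimal sketch (not git code; it
uses OpenSSL's SHA1/SHA256 APIs and a hypothetical hash_blob() helper)
showing that the default and compatibility hashes of a blob are
computed over exactly the same bytes, the "blob <size>\0" header
followed by the content, so no new object has to be constructed:

    #include <stdio.h>
    #include <string.h>
    #include <openssl/sha.h>

    /* Hash the same "blob <size>\0" + content bytes with both algorithms. */
    static void hash_blob(const char *content, size_t len)
    {
        SHA_CTX c1;
        SHA256_CTX c2;
        unsigned char sha1[SHA_DIGEST_LENGTH];
        unsigned char sha256[SHA256_DIGEST_LENGTH];
        char hdr[32];
        /* +1 keeps the NUL terminator, as in git's object header. */
        int hdrlen = snprintf(hdr, sizeof(hdr), "blob %zu", len) + 1;

        SHA1_Init(&c1);
        SHA1_Update(&c1, hdr, hdrlen);
        SHA1_Update(&c1, content, len);
        SHA1_Final(sha1, &c1);              /* default hash */

        SHA256_Init(&c2);
        SHA256_Update(&c2, hdr, hdrlen);
        SHA256_Update(&c2, content, len);
        SHA256_Final(sha256, &c2);          /* compatibility hash */

        for (int i = 0; i < SHA_DIGEST_LENGTH; i++)
            printf("%02x", sha1[i]);
        printf("\n");
        for (int i = 0; i < SHA256_DIGEST_LENGTH; i++)
            printf("%02x", sha256[i]);
        printf("\n");
    }

    int main(void)
    {
        const char *content = "hello\n";
        hash_blob(content, strlen(content));
        return 0;
    }

Built with "cc sketch.c -lcrypto", the SHA-1 line should match what
"git hash-object" reports for a file with the same content.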

As the code will soon need the compatibility hash from a bulk
checkin, remove support for bulk checkin of anything except blobs.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---
 bulk-checkin.c | 35 +++++++++++++++++------------------
 bulk-checkin.h |  6 +++---
 object-file.c  | 12 ++++++------
 3 files changed, 26 insertions(+), 27 deletions(-)
Patch

diff --git a/bulk-checkin.c b/bulk-checkin.c
index 73bff3a23d27..223562b4e748 100644
--- a/bulk-checkin.c
+++ b/bulk-checkin.c
@@ -155,10 +155,10 @@  static int already_written(struct bulk_checkin_packfile *state, struct object_id
  * status before calling us just in case we ask it to call us again
  * with a new pack.
  */
-static int stream_to_pack(struct bulk_checkin_packfile *state,
-			  git_hash_ctx *ctx, off_t *already_hashed_to,
-			  int fd, size_t size, enum object_type type,
-			  const char *path, unsigned flags)
+static int stream_blob_to_pack(struct bulk_checkin_packfile *state,
+			       git_hash_ctx *ctx, off_t *already_hashed_to,
+			       int fd, size_t size, const char *path,
+			       unsigned flags)
 {
 	git_zstream s;
 	unsigned char ibuf[16384];
@@ -170,7 +170,7 @@  static int stream_to_pack(struct bulk_checkin_packfile *state,
 
 	git_deflate_init(&s, pack_compression_level);
 
-	hdrlen = encode_in_pack_object_header(obuf, sizeof(obuf), type, size);
+	hdrlen = encode_in_pack_object_header(obuf, sizeof(obuf), OBJ_BLOB, size);
 	s.next_out = obuf + hdrlen;
 	s.avail_out = sizeof(obuf) - hdrlen;
 
@@ -247,11 +247,10 @@  static void prepare_to_stream(struct bulk_checkin_packfile *state,
 		die_errno("unable to write pack header");
 }
 
-static int deflate_to_pack(struct bulk_checkin_packfile *state,
-			   struct object_id *result_oid,
-			   int fd, size_t size,
-			   enum object_type type, const char *path,
-			   unsigned flags)
+static int deflate_blob_to_pack(struct bulk_checkin_packfile *state,
+				struct object_id *result_oid,
+				int fd, size_t size,
+				const char *path, unsigned flags)
 {
 	off_t seekback, already_hashed_to;
 	git_hash_ctx ctx;
@@ -265,7 +264,7 @@  static int deflate_to_pack(struct bulk_checkin_packfile *state,
 		return error("cannot find the current offset");
 
 	header_len = format_object_header((char *)obuf, sizeof(obuf),
-					  type, size);
+					  OBJ_BLOB, size);
 	the_hash_algo->init_fn(&ctx);
 	the_hash_algo->update_fn(&ctx, obuf, header_len);
 
@@ -282,8 +281,8 @@  static int deflate_to_pack(struct bulk_checkin_packfile *state,
 			idx->offset = state->offset;
 			crc32_begin(state->f);
 		}
-		if (!stream_to_pack(state, &ctx, &already_hashed_to,
-				    fd, size, type, path, flags))
+		if (!stream_blob_to_pack(state, &ctx, &already_hashed_to,
+					 fd, size, path, flags))
 			break;
 		/*
 		 * Writing this object to the current pack will make
@@ -350,12 +349,12 @@  void fsync_loose_object_bulk_checkin(int fd, const char *filename)
 	}
 }
 
-int index_bulk_checkin(struct object_id *oid,
-		       int fd, size_t size, enum object_type type,
-		       const char *path, unsigned flags)
+int index_blob_bulk_checkin(struct object_id *oid,
+			    int fd, size_t size,
+			    const char *path, unsigned flags)
 {
-	int status = deflate_to_pack(&bulk_checkin_packfile, oid, fd, size, type,
-				     path, flags);
+	int status = deflate_blob_to_pack(&bulk_checkin_packfile, oid, fd, size,
+					  path, flags);
 	if (!odb_transaction_nesting)
 		flush_bulk_checkin_packfile(&bulk_checkin_packfile);
 	return status;
diff --git a/bulk-checkin.h b/bulk-checkin.h
index 48fe9a6e9171..aa7286a7b3e1 100644
--- a/bulk-checkin.h
+++ b/bulk-checkin.h
@@ -9,9 +9,9 @@ 
 void prepare_loose_object_bulk_checkin(void);
 void fsync_loose_object_bulk_checkin(int fd, const char *filename);
 
-int index_bulk_checkin(struct object_id *oid,
-		       int fd, size_t size, enum object_type type,
-		       const char *path, unsigned flags);
+int index_blob_bulk_checkin(struct object_id *oid,
+			    int fd, size_t size,
+			    const char *path, unsigned flags);
 
 /*
  * Tell the object database to optimize for adding
diff --git a/object-file.c b/object-file.c
index 6a14b8875343..6cc4ae1fd957 100644
--- a/object-file.c
+++ b/object-file.c
@@ -2587,11 +2587,11 @@  static int index_core(struct index_state *istate,
  * binary blobs, they generally do not want to get any conversion, and
  * callers should avoid this code path when filters are requested.
  */
-static int index_stream(struct object_id *oid, int fd, size_t size,
-			enum object_type type, const char *path,
-			unsigned flags)
+static int index_blob_stream(struct object_id *oid, int fd, size_t size,
+			     const char *path,
+			     unsigned flags)
 {
-	return index_bulk_checkin(oid, fd, size, type, path, flags);
+	return index_blob_bulk_checkin(oid, fd, size, path, flags);
 }
 
 int index_fd(struct index_state *istate, struct object_id *oid,
@@ -2613,8 +2613,8 @@  int index_fd(struct index_state *istate, struct object_id *oid,
 		ret = index_core(istate, oid, fd, xsize_t(st->st_size),
 				 type, path, flags);
 	else
-		ret = index_stream(oid, fd, xsize_t(st->st_size), type, path,
-				   flags);
+		ret = index_blob_stream(oid, fd, xsize_t(st->st_size), path,
+					flags);
 	close(fd);
 	return ret;
 }