diff mbox series

[v1,1/1] index_bulk_checkin(): Take off_t, not size_t

Message ID 20181018191140.23318-1-tboegi@web.de (mailing list archive)
State New, archived
Headers show
Series [v1,1/1] index_bulk_checkin(): Take off_t, not size_t | expand

Commit Message

Torsten Bögershausen Oct. 18, 2018, 7:11 p.m. UTC
From: Torsten Bögershausen <tboegi@web.de>

When streaming data from disk into a blob, use off_t instead of
size_t, which is a better choice for file length.

Signed-off-by: Torsten Bögershausen <tboegi@web.de>
---

This is based on an old patch from 2017, which never made it to the list.
I think it make sense to have off_t/size_t more consistent,
reviews/comments are welcome.

bulk-checkin.c | 4 ++--
 bulk-checkin.h | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

Comments

Junio C Hamano Oct. 19, 2018, 1:58 a.m. UTC | #1
tboegi@web.de writes:

> bulk-checkin.c | 4 ++--
>  bulk-checkin.h | 2 +-
>  2 files changed, 3 insertions(+), 3 deletions(-)

If you lost SP in your editor, then it is OK but if format-patch
lost it for some reason, plasee tell me as we need to find the bug.

>
> diff --git a/bulk-checkin.c b/bulk-checkin.c
> index 409ecb566b..2631e82d6c 100644
> --- a/bulk-checkin.c
> +++ b/bulk-checkin.c
> @@ -189,7 +189,7 @@ static void prepare_to_stream(struct bulk_checkin_state *state,
>  
>  static int deflate_to_pack(struct bulk_checkin_state *state,
>  			   struct object_id *result_oid,
> -			   int fd, size_t size,
> +			   int fd, off_t size,

The size is once casted to uintmax_t for recording in the object
header (which is fine), and then passed to stream_to_pack(), which
still takes, and more importantly, does comparisons and chunking in,
size_t after this patch.  Without xsize_t() around size passed in
the call to stream_to_pack(), you may silently be truncating off_t
down to size_t in this function.

> @@ -258,7 +258,7 @@ static int deflate_to_pack(struct bulk_checkin_state *state,
>  }
>  
>  int index_bulk_checkin(struct object_id *oid,
> -		       int fd, size_t size, enum object_type type,
> +		       int fd, off_t size, enum object_type type,
>  		       const char *path, unsigned flags)
>  {
>  	int status = deflate_to_pack(&state, oid, fd, size, type,

This one is a thin wrapper around deflate_to_pack() above.

Its sole caller is sha1-file.c::index_stream() and takes size_t from
its callers, and passes size_t to index_bulk_checkin().

The sole caller of index_stream(), sha1-file.c::index_fd(), wants to
pass st->st_size, and it uses xsize_t() because index_stream() and
callchain underneath currently take size_t.  You want that callchain
to take off_t with this patch.

The whole purpose of stream_to_pack() is to take potentially large
input from the file on the filesystem, chop that into manageable
chunks and feed the underlying hashing and deflating machinery that
takes possibly smaller integer types to represent the sizes of data
they take in a single call, so once that function is taught to take
ofs_t I think you can say you converted the entire callchain from
index_fd() down.

So, this looks like a good starting step, but I suspect it needs one
more level of digging for it to become truly useful.
diff mbox series

Patch

diff --git a/bulk-checkin.c b/bulk-checkin.c
index 409ecb566b..2631e82d6c 100644
--- a/bulk-checkin.c
+++ b/bulk-checkin.c
@@ -189,7 +189,7 @@  static void prepare_to_stream(struct bulk_checkin_state *state,
 
 static int deflate_to_pack(struct bulk_checkin_state *state,
 			   struct object_id *result_oid,
-			   int fd, size_t size,
+			   int fd, off_t size,
 			   enum object_type type, const char *path,
 			   unsigned flags)
 {
@@ -258,7 +258,7 @@  static int deflate_to_pack(struct bulk_checkin_state *state,
 }
 
 int index_bulk_checkin(struct object_id *oid,
-		       int fd, size_t size, enum object_type type,
+		       int fd, off_t size, enum object_type type,
 		       const char *path, unsigned flags)
 {
 	int status = deflate_to_pack(&state, oid, fd, size, type,
diff --git a/bulk-checkin.h b/bulk-checkin.h
index f438f93811..09b2affdf3 100644
--- a/bulk-checkin.h
+++ b/bulk-checkin.h
@@ -7,7 +7,7 @@ 
 #include "cache.h"
 
 extern int index_bulk_checkin(struct object_id *oid,
-			      int fd, size_t size, enum object_type type,
+			      int fd, off_t size, enum object_type type,
 			      const char *path, unsigned flags);
 
 extern void plug_bulk_checkin(void);