diff mbox series

[v5,5/5] apply: fix uninitialized hash function

Message ID 20240520231434.1816979-6-gitster@pobox.com (mailing list archive)
State New, archived
Headers show
Series Fix use of uninitialized hash algorithms | expand

Commit Message

Junio C Hamano May 20, 2024, 11:14 p.m. UTC
"git apply" can work outside a repository as a better "GNU patch",
but when it does so, it still assumed that it can access
the_hash_algo, which is no longer true in the new world order.

Make sure we explicitly fall back to SHA-1 algorithm for backward
compatibility.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/apply.c         | 4 ++++
 t/t1517-outside-repo.sh | 2 +-
 2 files changed, 5 insertions(+), 1 deletion(-)

Comments

Patrick Steinhardt May 21, 2024, 7:58 a.m. UTC | #1
On Mon, May 20, 2024 at 04:14:34PM -0700, Junio C Hamano wrote:
> "git apply" can work outside a repository as a better "GNU patch",
> but when it does so, it still assumed that it can access
> the_hash_algo, which is no longer true in the new world order.
> 
> Make sure we explicitly fall back to SHA-1 algorithm for backward
> compatibility.
> 
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
>  builtin/apply.c         | 4 ++++
>  t/t1517-outside-repo.sh | 2 +-
>  2 files changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/builtin/apply.c b/builtin/apply.c
> index 861a01910c..e9175f820f 100644
> --- a/builtin/apply.c
> +++ b/builtin/apply.c
> @@ -1,6 +1,7 @@
>  #include "builtin.h"
>  #include "gettext.h"
>  #include "repository.h"
> +#include "hash.h"
>  #include "apply.h"
>  
>  static const char * const apply_usage[] = {
> @@ -18,6 +19,9 @@ int cmd_apply(int argc, const char **argv, const char *prefix)
>  	if (init_apply_state(&state, the_repository, prefix))
>  		exit(128);
>  
> +	if (!the_hash_algo)
> +		repo_set_hash_algo(the_repository, GIT_HASH_SHA1);
> +

Do we also want to add a comment here that mentions that we may want to
make this configureable via a command line option, like we have in the
preceding commits?

Patrick

>  	argc = apply_parse_options(argc, argv,
>  				   &state, &force_apply, &options,
>  				   apply_usage);
> diff --git a/t/t1517-outside-repo.sh b/t/t1517-outside-repo.sh
> index 2d8982d61a..557808ffa7 100755
> --- a/t/t1517-outside-repo.sh
> +++ b/t/t1517-outside-repo.sh
> @@ -37,7 +37,7 @@ test_expect_success 'hash-object outside repository (uses SHA-1)' '
>  	test_cmp hash.expect hash.actual
>  '
>  
> -test_expect_failure 'apply a patch outside repository' '
> +test_expect_success 'apply a patch outside repository' '
>  	(
>  		cd non-repo &&
>  		cp ../nums.old nums &&
> -- 
> 2.45.1-216-g4365c6fcf9
> 
>
Junio C Hamano May 21, 2024, 1:36 p.m. UTC | #2
Patrick Steinhardt <ps@pks.im> writes:

>> +	if (!the_hash_algo)
>> +		repo_set_hash_algo(the_repository, GIT_HASH_SHA1);
>> +
>
> Do we also want to add a comment here that mentions that we may want to
> make this configureable via a command line option, like we have in the
> preceding commits?

We may want a comment here that says 

    we could to redo the "apply.c" machinery to make this arbitrary
    fallback unnecessary, but the benefit to do so is dubious and
    the risk of breaking the code is probably not worth the effort.

When working as "better GNU patch" mode without a repository, we
should not and do not use the_hash_algo for the purpose of hashing
at all.  We do not do the binary diff (because we cannot grab the
preimage object out of the object store (that does not exist) to
apply the delta to form the postimage, we do not do the 3-way
fallback using the preimage blob object names that appear on the
"index" lines.  As far as I recall, the only thing we use
the_hash_algo for is for the max length of the hash to ask "is this
hexadecimal string a plausible looking object name?  We are parsing
a line that started with 'index ' and trying to see if the line
syntactically looks like a valid 'index' line" and the like.  If we
assume SHA-1 and if somebody tries to injest a SHA-256 --full-index
patch, that logic may say "nah, we didn't find a valid 'index'
header", but I think we'll just leave the fields like old-object
blank, which does not affect anything because we won't do "apply
--3way" anyway.

So we _could_ identify such places and tell the code "when
the_hash_algo is NULL, instead of the_hash_algo->hexsz, use 0 as
hexsz" (this example is from apply.c:gitdiff_index()) and it indeed
was the approach I tried in my unpublished draft before the first
version I posted.  It quickly got ugly.
diff mbox series

Patch

diff --git a/builtin/apply.c b/builtin/apply.c
index 861a01910c..e9175f820f 100644
--- a/builtin/apply.c
+++ b/builtin/apply.c
@@ -1,6 +1,7 @@ 
 #include "builtin.h"
 #include "gettext.h"
 #include "repository.h"
+#include "hash.h"
 #include "apply.h"
 
 static const char * const apply_usage[] = {
@@ -18,6 +19,9 @@  int cmd_apply(int argc, const char **argv, const char *prefix)
 	if (init_apply_state(&state, the_repository, prefix))
 		exit(128);
 
+	if (!the_hash_algo)
+		repo_set_hash_algo(the_repository, GIT_HASH_SHA1);
+
 	argc = apply_parse_options(argc, argv,
 				   &state, &force_apply, &options,
 				   apply_usage);
diff --git a/t/t1517-outside-repo.sh b/t/t1517-outside-repo.sh
index 2d8982d61a..557808ffa7 100755
--- a/t/t1517-outside-repo.sh
+++ b/t/t1517-outside-repo.sh
@@ -37,7 +37,7 @@  test_expect_success 'hash-object outside repository (uses SHA-1)' '
 	test_cmp hash.expect hash.actual
 '
 
-test_expect_failure 'apply a patch outside repository' '
+test_expect_success 'apply a patch outside repository' '
 	(
 		cd non-repo &&
 		cp ../nums.old nums &&