diff mbox series

set LC_TIME even if locale dir is not present

Message ID pull.1189.git.1648371489398.gitgitgadget@gmail.com (mailing list archive)
State New, archived
Headers show
Series set LC_TIME even if locale dir is not present | expand

Commit Message

Matthias Aßhauer March 27, 2022, 8:58 a.m. UTC
From: =?UTF-8?q?Matthias=20A=C3=9Fhauer?= <mha1993@live.de>

Since Commit aa1462c (introduce "format" date-mode, 2015-06-25) git log can
pass user specified format strings directly to strftime(). One special
format string we explicitly mention in our documentation is %c, which
depends on the system locale. To accommodate for %c we added a call to
setlocale() in git_setup_gettext().

In Commit cc5e1bf (gettext: avoid initialization if the locale dir is not
present, 2018-04-21) we added an early exit to git_setup_gettext() in case
no textdomain directory is present. This early exit is so early, that we
don't even set the locale for %c in that case, despite strftime() not
needing the textdomain directory at all.

This leads to a subtle bug where `git log --date=format:%c` will use C
locale instead of the system locale on systems without a valid textdomain
directory.

This fixes https://github.com/git-for-windows/git/issues/2959

Signed-off-by: Matthias Aßhauer <mha1993@live.de>
---
    [RFC] set LC_TIME even if locale dir is not present
    
    This is a small bug fix with a large and unwieldy regression test. The
    whole prepare_time_locale() bit and and the Makefile change is obviously
    based on prepare_utf8_locale(), but I'm not really happy with it. I'm
    not even sure how to fully put the issues I have with the test in words.
    
     * I feel like it's not really testing anything on builds without
       gettext, but adding a GETTEXT prerequisite to a test that something
       works without gettext is very counter intuitive.
    
     * I'm also not exactly happy about how I choose a locale, but can't
       think of a better way. It's a reasonable assumption that C locale
       uses a US date format on most, if not all supported systems, but I
       have no good way to make sure that the selected locale actually
       formats dates differently. Defining a custom locale would solve this,
       but seems like a convoluted way to go about things.
    
     * I'm not entirely happy with testing the output of git log
       -format=date:%c against the output of the exact same command. I've
       tried a version of the test based on date(1) and got it working with
       the GNU version, but looking at the BSD version for our OS X based CI
       builds and the POSIX spec for that command, they share barely more
       than their name.
    
    So, looking at the points above, I expect this to take a few re-rolls to
    get into a reasonable shape.

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1189%2Frimrul%2Fdate-format-without-gettext-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1189/rimrul/date-format-without-gettext-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1189

 Makefile                      |  7 +++++++
 gettext.c                     |  3 ++-
 t/t4205-log-pretty-formats.sh | 31 +++++++++++++++++++++++++++++++
 3 files changed, 40 insertions(+), 1 deletion(-)


base-commit: abf474a5dd901f28013c52155411a48fd4c09922

Comments

Ævar Arnfjörð Bjarmason March 27, 2022, 10:13 a.m. UTC | #1
On Sun, Mar 27 2022, Matthias Aßhauer via GitGitGadget wrote:

> From: =?UTF-8?q?Matthias=20A=C3=9Fhauer?= <mha1993@live.de>
>
> Since Commit aa1462c (introduce "format" date-mode, 2015-06-25) git log can
> pass user specified format strings directly to strftime(). One special
> format string we explicitly mention in our documentation is %c, which
> depends on the system locale. To accommodate for %c we added a call to
> setlocale() in git_setup_gettext().
>
> In Commit cc5e1bf (gettext: avoid initialization if the locale dir is not
> present, 2018-04-21) we added an early exit to git_setup_gettext() in case
> no textdomain directory is present. This early exit is so early, that we
> don't even set the locale for %c in that case, despite strftime() not
> needing the textdomain directory at all.

Thanks for tracking this down. This commit & end-state looks good to me,
I just have comments about the implementation below, i.e. you're doing
more work than you need to, some of this we have test infrastructure
already...

> diff --git a/Makefile b/Makefile
> index e8aba291d7f..ddca29b550b 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -410,6 +410,10 @@ include shared.mak
>  # If it isn't set, fallback to $LC_ALL, $LANG or use the first utf-8
>  # locale returned by "locale -a".
>  #
> +# Define GIT_TEST_TIME_LOCALE to preferred non-us locale for testing.
> +# If it isn't set, fallback to $LC_ALL, $LANG or use the first non-us
> +# locale returned by "locale -a".
> +#
>  # Define HAVE_CLOCK_GETTIME if your platform has clock_gettime.
>  #
>  # Define HAVE_CLOCK_MONOTONIC if your platform has CLOCK_MONOTONIC.
> @@ -2862,6 +2866,9 @@ ifdef GIT_TEST_CMP_USE_COPIED_CONTEXT
>  endif
>  ifdef GIT_TEST_UTF8_LOCALE
>  	@echo GIT_TEST_UTF8_LOCALE=\''$(subst ','\'',$(subst ','\'',$(GIT_TEST_UTF8_LOCALE)))'\' >>$@+
> +endif
> +ifdef GIT_TEST_TIME_LOCALE
> +	@echo GIT_TEST_TIME_LOCALE=\''$(subst ','\'',$(subst ','\'',$(GIT_TEST_TIME_LOCALE)))'\' >>$@+
>  endif
>  	@echo NO_GETTEXT=\''$(subst ','\'',$(subst ','\'',$(NO_GETTEXT)))'\' >>$@+
>  ifdef GIT_PERF_REPEAT_COUNT

You won't need this, more later...

> diff --git a/gettext.c b/gettext.c
> index bb5ba1fe7cc..2b614c2b8c6 100644
> --- a/gettext.c
> +++ b/gettext.c
> @@ -107,6 +107,8 @@ void git_setup_gettext(void)
>  	const char *podir = getenv(GIT_TEXT_DOMAIN_DIR_ENVIRONMENT);
>  	char *p = NULL;
>  
> +	setlocale(LC_TIME, "");
> +
>  	if (!podir)
>  		podir = p = system_path(GIT_LOCALE_PATH);
>  
> @@ -117,7 +119,6 @@ void git_setup_gettext(void)
>  
>  	bindtextdomain("git", podir);
>  	setlocale(LC_MESSAGES, "");
> -	setlocale(LC_TIME, "");
>  	init_gettext_charset("git");
>  	textdomain("git");
>  
> diff --git a/t/t4205-log-pretty-formats.sh b/t/t4205-log-pretty-formats.sh
> index e448ef2928a..01a1e61ecea 100755
> --- a/t/t4205-log-pretty-formats.sh
> +++ b/t/t4205-log-pretty-formats.sh
> @@ -25,6 +25,29 @@ commit_msg () {
>  	fi
>  }
>  
> +prepare_time_locale () {
> +	if test -z "$GIT_TEST_TIME_LOCALE"
> +	then
> +		case "${LC_ALL:-$LANG}" in
> +		C | C.* | POSIX | POSIX.* | en_US | en_US.* )
> +			GIT_TEST_TIME_LOCALE=$(locale -a | sed -n '/^\(C\|POSIX\|en_US\)/I !{
> +				p
> +				q
> +			}')
> +			;;
> +		*)
> +			GIT_TEST_TIME_LOCALE="${LC_ALL:-$LANG}"
> +			;;
> +		esac
> +	fi
> +	if test -n "$GIT_TEST_TIME_LOCALE"
> +	then
> +		test_set_prereq TIME_LOCALE
> +	else
> +		say "# No non-us locale available, some tests are skipped"
> +	fi
> +}
> +

And this setup we do already elsewhere.

I think the below would be a good candidate to squash into this. The C
change doesn't change the behavior from your version, but I think it's
good to take the change here about being explicit what we do and don't
seup with/without a podir.

Your prepare_time_locale() is then duplicating lib-gettext.sh, the below
shows a replacement for your test piggy-backing on existing test infra.

I was then worried that we'd introduced a regression in cc5e1bf in
assuming that we didn't need to setlocale(LC_MESSAGES) just because we
didn't have a podir, because the C library might have some, but the
below test passe for me on linux+glibc.

Maybe there's still a regression there, I'm not sure. Perhaps it's also
good to drop that "optimization" on non-Windows platforms?

diff --git a/gettext.c b/gettext.c
index bb5ba1fe7cc..9b46c224230 100644
--- a/gettext.c
+++ b/gettext.c
@@ -102,25 +102,34 @@ static void init_gettext_charset(const char *domain)
 		setlocale(LC_CTYPE, "C");
 }
 
+static void git_setup_gettext_no_podir(void)
+{
+	setlocale(LC_TIME, "");
+}
+
+static void git_setup_gettext_podir(const char *podir)
+{
+	bindtextdomain("git", podir);
+	setlocale(LC_MESSAGES, "");
+	init_gettext_charset("git");
+	textdomain("git");
+}
+
 void git_setup_gettext(void)
 {
 	const char *podir = getenv(GIT_TEXT_DOMAIN_DIR_ENVIRONMENT);
 	char *p = NULL;
 
+	git_setup_gettext_no_podir();
+
 	if (!podir)
 		podir = p = system_path(GIT_LOCALE_PATH);
 
-	if (!is_directory(podir)) {
-		free(p);
-		return;
-	}
-
-	bindtextdomain("git", podir);
-	setlocale(LC_MESSAGES, "");
-	setlocale(LC_TIME, "");
-	init_gettext_charset("git");
-	textdomain("git");
+	if (!is_directory(podir))
+		goto done;
 
+	git_setup_gettext_podir(podir);
+done:
 	free(p);
 }
 
diff --git a/t/t0203-gettext-setlocale-sanity.sh b/t/t0203-gettext-setlocale-sanity.sh
index 0ce1f22eff6..69facd2f8ed 100755
--- a/t/t0203-gettext-setlocale-sanity.sh
+++ b/t/t0203-gettext-setlocale-sanity.sh
@@ -23,4 +23,49 @@ test_expect_success GETTEXT_LOCALE 'git show a ISO-8859-1 commit under a UTF-8 l
 	grep -q "iso-utf8-commit" out
 '
 
+test_expect_success GETTEXT_LOCALE 'the %c date format works even without a localedir (LC_TIME)' '
+	test_when_finished "rm -rf empty" &&
+	mkdir empty &&
+	LANGUAGE=is LC_ALL="$is_IS_locale" GIT_TEXTDOMAINDIR="$PWD/empty" \
+		git log --pretty=format:%ad --date=format:%c HEAD^1..HEAD >actual &&
+
+	# Avoid testing the raw format (it might differ?). But
+	# Thursday is Fimmtudagur in Icelandic, so grepping "fim" is
+	# pretty certain to test that the locale was used.
+	grep -iF fim actual
+'
+
+test_lazy_prereq GETTEXT_HAVE_STRERROR_TRANSLATED '
+	test_have_prereq GETTEXT_LOCALE &&
+	test_have_prereq POSIXPERM &&
+
+	test_when_finished "rm -f file" &&
+	>file &&
+
+	# German is more likely to have a strerror() translation
+	test_must_fail git init file 2>loc-C &&
+	grep "cannot mkdir" loc-C &&
+	test_must_fail env \
+		LANGUAGE=de LC_ALL="de_DE.utf8" \
+		git init file 2>loc-de &&
p+	! grep "cannot mkdir" loc-de
+
+'
+
+test_expect_success GETTEXT_LOCALE,GETTEXT_HAVE_STRERROR_TRANSLATED \
+	'LC_MESSAGES is set up without a localedir (for strerror())' '
+	test_when_finished "rm -f file" &&
+	>file &&
+	test_when_finished "rm -rf no-domain" &&
+	mkdir no-domain &&
+
+	test_must_fail git init file 2>loc-C &&
+	test_must_fail env \
+		LANGUAGE=de LC_ALL="de_DE.utf8" \
+		NO_SET_GIT_TEXTDOMAINDIR=StopDoingThatInBinWrappers \
+		GIT_TEXTDOMAINDIR="$PWD/no-domain" \
+		git init file 2>loc-de-no-textdomain &&
+	! test_cmp loc-C loc-de-no-textdomain
+'
+
 test_done
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 515b1af7ed4..886d260082d 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1428,6 +1428,8 @@ else # normal case, use ../bin-wrappers only unless $with_dashes:
 			fi
 			with_dashes=t
 		fi
+		GIT_RUNING_TEST_LIB_SH=t
+		export GIT_RUNING_TEST_LIB_SH
 		PATH="$git_bin_dir:$PATH"
 	fi
 	GIT_EXEC_PATH=$GIT_BUILD_DIR
diff --git a/wrap-for-bin.sh b/wrap-for-bin.sh
index 95851b85b6b..3089bcad37c 100644
--- a/wrap-for-bin.sh
+++ b/wrap-for-bin.sh
@@ -15,10 +15,14 @@ else
 	export GIT_TEMPLATE_DIR
 fi
 GITPERLLIB='@@BUILD_DIR@@/perl/build/lib'"${GITPERLLIB:+:$GITPERLLIB}"
-GIT_TEXTDOMAINDIR='@@BUILD_DIR@@/po/build/locale'
+if test -z "$NO_SET_GIT_TEXTDOMAINDIR"
+then
+	GIT_TEXTDOMAINDIR='@@BUILD_DIR@@/po/build/locale'
+	export GIT_TEXTDOMAINDIR
+fi
 PATH='@@BUILD_DIR@@/bin-wrappers:'"$PATH"
 
-export GIT_EXEC_PATH GITPERLLIB PATH GIT_TEXTDOMAINDIR
+export GIT_EXEC_PATH GITPERLLIB PATH
 
 case "$GIT_DEBUGGER" in
 '')
diff mbox series

Patch

diff --git a/Makefile b/Makefile
index e8aba291d7f..ddca29b550b 100644
--- a/Makefile
+++ b/Makefile
@@ -410,6 +410,10 @@  include shared.mak
 # If it isn't set, fallback to $LC_ALL, $LANG or use the first utf-8
 # locale returned by "locale -a".
 #
+# Define GIT_TEST_TIME_LOCALE to preferred non-us locale for testing.
+# If it isn't set, fallback to $LC_ALL, $LANG or use the first non-us
+# locale returned by "locale -a".
+#
 # Define HAVE_CLOCK_GETTIME if your platform has clock_gettime.
 #
 # Define HAVE_CLOCK_MONOTONIC if your platform has CLOCK_MONOTONIC.
@@ -2862,6 +2866,9 @@  ifdef GIT_TEST_CMP_USE_COPIED_CONTEXT
 endif
 ifdef GIT_TEST_UTF8_LOCALE
 	@echo GIT_TEST_UTF8_LOCALE=\''$(subst ','\'',$(subst ','\'',$(GIT_TEST_UTF8_LOCALE)))'\' >>$@+
+endif
+ifdef GIT_TEST_TIME_LOCALE
+	@echo GIT_TEST_TIME_LOCALE=\''$(subst ','\'',$(subst ','\'',$(GIT_TEST_TIME_LOCALE)))'\' >>$@+
 endif
 	@echo NO_GETTEXT=\''$(subst ','\'',$(subst ','\'',$(NO_GETTEXT)))'\' >>$@+
 ifdef GIT_PERF_REPEAT_COUNT
diff --git a/gettext.c b/gettext.c
index bb5ba1fe7cc..2b614c2b8c6 100644
--- a/gettext.c
+++ b/gettext.c
@@ -107,6 +107,8 @@  void git_setup_gettext(void)
 	const char *podir = getenv(GIT_TEXT_DOMAIN_DIR_ENVIRONMENT);
 	char *p = NULL;
 
+	setlocale(LC_TIME, "");
+
 	if (!podir)
 		podir = p = system_path(GIT_LOCALE_PATH);
 
@@ -117,7 +119,6 @@  void git_setup_gettext(void)
 
 	bindtextdomain("git", podir);
 	setlocale(LC_MESSAGES, "");
-	setlocale(LC_TIME, "");
 	init_gettext_charset("git");
 	textdomain("git");
 
diff --git a/t/t4205-log-pretty-formats.sh b/t/t4205-log-pretty-formats.sh
index e448ef2928a..01a1e61ecea 100755
--- a/t/t4205-log-pretty-formats.sh
+++ b/t/t4205-log-pretty-formats.sh
@@ -25,6 +25,29 @@  commit_msg () {
 	fi
 }
 
+prepare_time_locale () {
+	if test -z "$GIT_TEST_TIME_LOCALE"
+	then
+		case "${LC_ALL:-$LANG}" in
+		C | C.* | POSIX | POSIX.* | en_US | en_US.* )
+			GIT_TEST_TIME_LOCALE=$(locale -a | sed -n '/^\(C\|POSIX\|en_US\)/I !{
+				p
+				q
+			}')
+			;;
+		*)
+			GIT_TEST_TIME_LOCALE="${LC_ALL:-$LANG}"
+			;;
+		esac
+	fi
+	if test -n "$GIT_TEST_TIME_LOCALE"
+	then
+		test_set_prereq TIME_LOCALE
+	else
+		say "# No non-us locale available, some tests are skipped"
+	fi
+}
+
 test_expect_success 'set up basic repos' '
 	>foo &&
 	>bar &&
@@ -544,6 +567,14 @@  test_expect_success '--date=human %ad%cd is the same as %ah%ch' '
 	test_cmp expected actual
 '
 
+prepare_time_locale
+test_expect_success TIME_LOCALE '--date=format:%c does not need gettext' '
+	rm -fr no-such-dir &&
+	LC_ALL=$GIT_TEST_TIME_LOCALE git log --date=format:%c HEAD^1..HEAD >expected &&
+	GIT_TEXTDOMAINDIR=no-such-dir LC_ALL=$GIT_TEST_TIME_LOCALE git log --date=format:%c HEAD^1..HEAD >actual &&
+	test_cmp expected actual
+'
+
 # get new digests (with no abbreviations)
 test_expect_success 'set up log decoration tests' '
 	head1=$(git rev-parse --verify HEAD~0) &&