mbox series

[0/2] macos: safeguard git maintenance against highly concurrent operations

Message ID pull.1024.git.1629819840.gitgitgadget@gmail.com (mailing list archive)
Headers show
Series macos: safeguard git maintenance against highly concurrent operations | expand

Message

Philippe Blain via GitGitGadget Aug. 24, 2021, 3:43 p.m. UTC
When we started porting Scalar to C, one of the first things we did was to
run Scalar's quite extensive set of functional tests. And there, we ran into
immediate problems in the macOS job because git maintenance was registering
a large number of repositories concurrently, and our code could be more
robust in such scenarios.

The culprit lies not with Scalar, of course, but with the way git
maintenance wants to write the plist for use by launchctl and register it,
every time, and if two concurrent processes try to do that, they stumble
over each other.

This pair of patches makes git maintenance much less fragile in those
situations.

Please note that this patch series conflicts with lh/systemd-timers,
although in a trivial way: the latter changes the signature of
launchctl_schedule_plist() to lose its cmd parameter. The resolution is to
adjust the conflicting code to lose the cmd parameter, and also drop it from
launchctl_list_contains_plist() (and define it in the same way as
launchctl_boot_plist() does). I assume that lh/systemd-timers will advance
to next pretty soon; I plan on rebasing this patch series on top of it at
that stage.

Derrick Stolee (1):
  maintenance: skip bootout/bootstrap when plist is registered

Johannes Schindelin (1):
  maintenance: create `launchctl` configuration using a lock file

 builtin/gc.c           | 91 ++++++++++++++++++++++++++++++++----------
 t/t7900-maintenance.sh | 17 ++++++++
 2 files changed, 87 insertions(+), 21 deletions(-)


base-commit: 48bf2fa8bad054d66bd79c6ba903c89c704201f7
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1024%2Fdscho%2Fmaintenance%2Flaunchctl-concurrent-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1024/dscho/maintenance/launchctl-concurrent-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1024

Comments

Junio C Hamano Aug. 25, 2021, 12:49 a.m. UTC | #1
"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> Please note that this patch series conflicts with lh/systemd-timers,
> although in a trivial way: the latter changes the signature of
> launchctl_schedule_plist() to lose its cmd parameter. The resolution is to
> adjust the conflicting code to lose the cmd parameter, and also drop it from
> launchctl_list_contains_plist() (and define it in the same way as
> launchctl_boot_plist() does). I assume that lh/systemd-timers will advance
> to next pretty soon; I plan on rebasing this patch series on top of it at
> that stage.

Sounds like a plan.

Here is my attempt to merge lh/systemd-timers into the result of
applying these two to 'master', with focus on the top part of the
launchctl_schedule_plist().  Sanity-checking is appreciated.

diff --cc builtin/gc.c
index 22e670b508,6a57d0fde5..0000000000
--- i/builtin/gc.c
+++ w/builtin/gc.c
@@@ -1593,48 -1678,26 +1678,50 @@@ static int launchctl_remove_plist(enum 
  	return result;
  }
  
- static int launchctl_remove_plists(const char *cmd)
+ static int launchctl_remove_plists(void)
  {
- 	return launchctl_remove_plist(SCHEDULE_HOURLY, cmd) ||
- 		launchctl_remove_plist(SCHEDULE_DAILY, cmd) ||
- 		launchctl_remove_plist(SCHEDULE_WEEKLY, cmd);
+ 	return launchctl_remove_plist(SCHEDULE_HOURLY) ||
+ 	       launchctl_remove_plist(SCHEDULE_DAILY) ||
+ 	       launchctl_remove_plist(SCHEDULE_WEEKLY);
  }
  
 +static int launchctl_list_contains_plist(const char *name, const char *cmd)
 +{
 +	int result;
 +	struct child_process child = CHILD_PROCESS_INIT;
 +	char *uid = launchctl_get_uid();
 +
 +	strvec_split(&child.args, cmd);
 +	strvec_pushl(&child.args, "list", name, NULL);
 +
 +	child.no_stderr = 1;
 +	child.no_stdout = 1;
 +
 +	if (start_command(&child))
 +		die(_("failed to start launchctl"));
 +
 +	result = finish_command(&child);
 +
 +	free(uid);
 +
 +	/* Returns failure if 'name' doesn't exist. */
 +	return !result;
 +}
 +
- static int launchctl_schedule_plist(const char *exec_path, enum schedule_priority schedule, const char *cmd)
+ static int launchctl_schedule_plist(const char *exec_path, enum schedule_priority schedule)
  {
 -	FILE *plist;
 -	int i;
 +	int i, fd;
  	const char *preamble, *repeat;
  	const char *frequency = get_frequency(schedule);
  	char *name = launchctl_service_name(frequency);
  	char *filename = launchctl_service_filename(name);
 +	struct lock_file lk = LOCK_INIT;
 +	static unsigned long lock_file_timeout_ms = ULONG_MAX;
 +	struct strbuf plist = STRBUF_INIT, plist2 = STRBUF_INIT;
 +	struct stat st;
++	const char *cmd = "launchctl";
  
 -	if (safe_create_leading_directories(filename))
 -		die(_("failed to create directories for '%s'"), filename);
 -	plist = xfopen(filename, "w");
 -
++	get_schedule_cmd(&cmd, NULL);
  	preamble = "<?xml version=\"1.0\"?>\n"
  		   "<!DOCTYPE plist PUBLIC \"-//Apple//DTD PLIST 1.0//EN\" \"http://www.apple.com/DTDs/PropertyList-1.0.dtd\">\n"
  		   "<plist version=\"1.0\">"
@@@ -1687,38 -1750,13 +1774,38 @@@
  		/* unreachable */
  		break;
  	}
 -	fprintf(plist, "</array>\n</dict>\n</plist>\n");
 -	fclose(plist);
 +	strbuf_addstr(&plist, "</array>\n</dict>\n</plist>\n");
  
 -	/* bootout might fail if not already running, so ignore */
 -	launchctl_boot_plist(0, filename);
 -	if (launchctl_boot_plist(1, filename))
 -		die(_("failed to bootstrap service %s"), filename);
 +	if (safe_create_leading_directories(filename))
 +		die(_("failed to create directories for '%s'"), filename);
 +
 +	if ((long)lock_file_timeout_ms < 0 &&
 +	    git_config_get_ulong("gc.launchctlplistlocktimeoutms",
 +				 &lock_file_timeout_ms))
 +		lock_file_timeout_ms = 150;
 +
 +	fd = hold_lock_file_for_update_timeout(&lk, filename, LOCK_DIE_ON_ERROR,
 +					       lock_file_timeout_ms);
 +
 +	/*
 +	 * Does this file already exist? With the intended contents? Is it
 +	 * registered already? Then it does not need to be re-registered.
 +	 */
 +	if (!stat(filename, &st) && st.st_size == plist.len &&
 +	    strbuf_read_file(&plist2, filename, plist.len) == plist.len &&
 +	    !strbuf_cmp(&plist, &plist2) &&
 +	    launchctl_list_contains_plist(name, cmd))
 +		rollback_lock_file(&lk);
 +	else {
 +		if (write_in_full(fd, plist.buf, plist.len) < 0 ||
 +		    commit_lock_file(&lk))
 +			die_errno(_("could not write '%s'"), filename);
 +
 +		/* bootout might fail if not already running, so ignore */
- 		launchctl_boot_plist(0, filename, cmd);
- 		if (launchctl_boot_plist(1, filename, cmd))
++		launchctl_boot_plist(0, filename);
++		if (launchctl_boot_plist(1, filename))
 +			die(_("failed to bootstrap service %s"), filename);
 +	}
  
  	free(filename);
  	free(name);
Johannes Schindelin Aug. 25, 2021, 12:23 p.m. UTC | #2
Hi Junio,

On Tue, 24 Aug 2021, Junio C Hamano wrote:

> "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
> writes:
>
> > Please note that this patch series conflicts with lh/systemd-timers,
> > although in a trivial way: the latter changes the signature of
> > launchctl_schedule_plist() to lose its cmd parameter. The resolution is to
> > adjust the conflicting code to lose the cmd parameter, and also drop it from
> > launchctl_list_contains_plist() (and define it in the same way as
> > launchctl_boot_plist() does). I assume that lh/systemd-timers will advance
> > to next pretty soon; I plan on rebasing this patch series on top of it at
> > that stage.
>
> Sounds like a plan.
>
> Here is my attempt to merge lh/systemd-timers into the result of
> applying these two to 'master', with focus on the top part of the
> launchctl_schedule_plist().  Sanity-checking is appreciated.

My local version (hence `git reset -hard`'ed away) looked almost precisely
like yours, I only added the definition of `cmd` to the top of
`launchctl_list_contains_plist()` and removed its `cmd` parameter (and
adjusted the callers). But that's the only difference I can spot.

Thanks,
Dscho

>
> diff --cc builtin/gc.c
> index 22e670b508,6a57d0fde5..0000000000
> --- i/builtin/gc.c
> +++ w/builtin/gc.c
> @@@ -1593,48 -1678,26 +1678,50 @@@ static int launchctl_remove_plist(enum
>   	return result;
>   }
>
> - static int launchctl_remove_plists(const char *cmd)
> + static int launchctl_remove_plists(void)
>   {
> - 	return launchctl_remove_plist(SCHEDULE_HOURLY, cmd) ||
> - 		launchctl_remove_plist(SCHEDULE_DAILY, cmd) ||
> - 		launchctl_remove_plist(SCHEDULE_WEEKLY, cmd);
> + 	return launchctl_remove_plist(SCHEDULE_HOURLY) ||
> + 	       launchctl_remove_plist(SCHEDULE_DAILY) ||
> + 	       launchctl_remove_plist(SCHEDULE_WEEKLY);
>   }
>
>  +static int launchctl_list_contains_plist(const char *name, const char *cmd)
>  +{
>  +	int result;
>  +	struct child_process child = CHILD_PROCESS_INIT;
>  +	char *uid = launchctl_get_uid();
>  +
>  +	strvec_split(&child.args, cmd);
>  +	strvec_pushl(&child.args, "list", name, NULL);
>  +
>  +	child.no_stderr = 1;
>  +	child.no_stdout = 1;
>  +
>  +	if (start_command(&child))
>  +		die(_("failed to start launchctl"));
>  +
>  +	result = finish_command(&child);
>  +
>  +	free(uid);
>  +
>  +	/* Returns failure if 'name' doesn't exist. */
>  +	return !result;
>  +}
>  +
> - static int launchctl_schedule_plist(const char *exec_path, enum schedule_priority schedule, const char *cmd)
> + static int launchctl_schedule_plist(const char *exec_path, enum schedule_priority schedule)
>   {
>  -	FILE *plist;
>  -	int i;
>  +	int i, fd;
>   	const char *preamble, *repeat;
>   	const char *frequency = get_frequency(schedule);
>   	char *name = launchctl_service_name(frequency);
>   	char *filename = launchctl_service_filename(name);
>  +	struct lock_file lk = LOCK_INIT;
>  +	static unsigned long lock_file_timeout_ms = ULONG_MAX;
>  +	struct strbuf plist = STRBUF_INIT, plist2 = STRBUF_INIT;
>  +	struct stat st;
> ++	const char *cmd = "launchctl";
>
>  -	if (safe_create_leading_directories(filename))
>  -		die(_("failed to create directories for '%s'"), filename);
>  -	plist = xfopen(filename, "w");
>  -
> ++	get_schedule_cmd(&cmd, NULL);
>   	preamble = "<?xml version=\"1.0\"?>\n"
>   		   "<!DOCTYPE plist PUBLIC \"-//Apple//DTD PLIST 1.0//EN\" \"http://www.apple.com/DTDs/PropertyList-1.0.dtd\">\n"
>   		   "<plist version=\"1.0\">"
> @@@ -1687,38 -1750,13 +1774,38 @@@
>   		/* unreachable */
>   		break;
>   	}
>  -	fprintf(plist, "</array>\n</dict>\n</plist>\n");
>  -	fclose(plist);
>  +	strbuf_addstr(&plist, "</array>\n</dict>\n</plist>\n");
>
>  -	/* bootout might fail if not already running, so ignore */
>  -	launchctl_boot_plist(0, filename);
>  -	if (launchctl_boot_plist(1, filename))
>  -		die(_("failed to bootstrap service %s"), filename);
>  +	if (safe_create_leading_directories(filename))
>  +		die(_("failed to create directories for '%s'"), filename);
>  +
>  +	if ((long)lock_file_timeout_ms < 0 &&
>  +	    git_config_get_ulong("gc.launchctlplistlocktimeoutms",
>  +				 &lock_file_timeout_ms))
>  +		lock_file_timeout_ms = 150;
>  +
>  +	fd = hold_lock_file_for_update_timeout(&lk, filename, LOCK_DIE_ON_ERROR,
>  +					       lock_file_timeout_ms);
>  +
>  +	/*
>  +	 * Does this file already exist? With the intended contents? Is it
>  +	 * registered already? Then it does not need to be re-registered.
>  +	 */
>  +	if (!stat(filename, &st) && st.st_size == plist.len &&
>  +	    strbuf_read_file(&plist2, filename, plist.len) == plist.len &&
>  +	    !strbuf_cmp(&plist, &plist2) &&
>  +	    launchctl_list_contains_plist(name, cmd))
>  +		rollback_lock_file(&lk);
>  +	else {
>  +		if (write_in_full(fd, plist.buf, plist.len) < 0 ||
>  +		    commit_lock_file(&lk))
>  +			die_errno(_("could not write '%s'"), filename);
>  +
>  +		/* bootout might fail if not already running, so ignore */
> - 		launchctl_boot_plist(0, filename, cmd);
> - 		if (launchctl_boot_plist(1, filename, cmd))
> ++		launchctl_boot_plist(0, filename);
> ++		if (launchctl_boot_plist(1, filename))
>  +			die(_("failed to bootstrap service %s"), filename);
>  +	}
>
>   	free(filename);
>   	free(name);
>