diff mbox series

[6/6] maintenance: use random minute in systemd scheduler

Message ID 14e340b75faaa66980479f42fec14c457aea5c74.1691434300.git.gitgitgadget@gmail.com (mailing list archive)
State New, archived
Headers show
Series maintenance: schedule maintenance on a random minute | expand

Commit Message

Derrick Stolee Aug. 7, 2023, 6:51 p.m. UTC
From: Derrick Stolee <derrickstolee@github.com>

The get_random_minute() method was created to allow maintenance
schedules to be fixed to a random minute of the hour. This randomness is
only intended to spread out the load from a number of clients, but each
client should have an hour between each maintenance cycle.

Add this random minute to the systemd integration.

This integration is more complicated than similar changes for other
schedulers because of a neat trick that systemd allows: templating.

The previous implementation generated two template files with names
of the form 'git-maintenance@.(timer|service)'. The '.timer' or
'.service' indicates that this is a template that is picked up when we
later specify '...@<schedule>.timer' or '...@<schedule>.service'. The
'<schedule>' string is then used to insert into the template both the
'OnCalendar' schedule setting and the '--schedule' parameter of the
'git maintenance run' command.

In order to set these schedules to a given minute, we can no longer use
the 'hourly', 'daily', or 'weekly' strings for '<schedule>' and instead
need to abandon the template model.

Modify the template with a custom schedule in the 'OnCalendar' setting.
This schedule has some interesting differences from cron-like patterns,
but is relatively easy to figure out from context. The one that might be
confusing is that '*-*-*' is a date-based pattern, but this must be
omitted when using 'Mon' to signal that we care about the day of the
week. Monday is used since that matches the day used for the 'weekly'
schedule used previously.

The rest of the change involves making sure we are writing these .timer
and .service files before initializing the schedule with 'systemctl' and
deleting the files when we are done. Some changes are also made to share
the random minute along with a single computation of the execution path
of the current Git executable.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 builtin/gc.c           | 82 ++++++++++++++++++++++++++++++++----------
 t/t7900-maintenance.sh |  4 ++-
 2 files changed, 66 insertions(+), 20 deletions(-)

Comments

Taylor Blau Aug. 7, 2023, 9:31 p.m. UTC | #1
On Mon, Aug 07, 2023 at 06:51:40PM +0000, Derrick Stolee via GitGitGadget wrote:
> In order to set these schedules to a given minute, we can no longer use
> the 'hourly', 'daily', or 'weekly' strings for '<schedule>' and instead
> need to abandon the template model.

Makes sense.

> Modify the template with a custom schedule in the 'OnCalendar' setting.
> This schedule has some interesting differences from cron-like patterns,
> but is relatively easy to figure out from context. The one that might be
> confusing is that '*-*-*' is a date-based pattern, but this must be
> omitted when using 'Mon' to signal that we care about the day of the
> week. Monday is used since that matches the day used for the 'weekly'
> schedule used previously.

I think the launchd version (which uses "0" for the day of the week)
runs on Sunday, if I remember correctly. I don't think that these two
necessarily need to run on the same day of the week when configured to
run weekly.

But I figured I'd raise the question in case you did mean for them to
both run on either Sunday or Monday.

> The rest of the change involves making sure we are writing these .timer
> and .service files before initializing the schedule with 'systemctl' and
> deleting the files when we are done. Some changes are also made to share
> the random minute along with a single computation of the execution path
> of the current Git executable.
>
> Signed-off-by: Derrick Stolee <derrickstolee@github.com>
> ---
>  builtin/gc.c           | 82 ++++++++++++++++++++++++++++++++----------
>  t/t7900-maintenance.sh |  4 ++-
>  2 files changed, 66 insertions(+), 20 deletions(-)
>
> diff --git a/builtin/gc.c b/builtin/gc.c
> index b3ef95b10aa..5f5bb95641f 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -2299,13 +2299,20 @@ static char *xdg_config_home_systemd(const char *filename)
>  	return xdg_config_home_for("systemd/user", filename);
>  }
>
> -static int systemd_timer_write_unit_templates(const char *exec_path)
> +static int systemd_timer_write_unit_template(enum schedule_priority schedule,
> +					     const char *exec_path,
> +					     int minute)
>  {
>  	char *filename;
>  	FILE *file;
>  	const char *unit;
> +	char *schedule_pattern = NULL;

You should be able to drop the NULL initialization, since you assign
this value unconditionally in the switch statement below (or BUG() on an
unknown schedule type).

> +	const char *frequency = get_frequency(schedule);
> +	char *local_timer_name = xstrfmt("git-maintenance@%s.timer", frequency);
> +	char *local_service_name = xstrfmt("git-maintenance@%s.service", frequency);
> +
> +	filename = xdg_config_home_systemd(local_timer_name);
>
> -	filename = xdg_config_home_systemd("git-maintenance@.timer");
>  	if (safe_create_leading_directories(filename)) {
>  		error(_("failed to create directories for '%s'"), filename);
>  		goto error;
> @@ -2314,6 +2321,23 @@ static int systemd_timer_write_unit_templates(const char *exec_path)
>  	if (!file)
>  		goto error;
>
> +	switch (schedule) {
> +	case SCHEDULE_HOURLY:
> +		schedule_pattern = xstrfmt("*-*-* *:%02d:00", minute);
> +		break;
> +
> +	case SCHEDULE_DAILY:
> +		schedule_pattern = xstrfmt("*-*-* 0:%02d:00", minute);
> +		break;
> +
> +	case SCHEDULE_WEEKLY:
> +		schedule_pattern = xstrfmt("Mon 0:%02d:00", minute);
> +		break;
> +
> +	default:
> +		BUG("Unhandled schedule_priority");
> +	}
> +
>  	unit = "# This file was created and is maintained by Git.\n"
>  	       "# Any edits made in this file might be replaced in the future\n"
>  	       "# by a Git command.\n"
> @@ -2322,12 +2346,12 @@ static int systemd_timer_write_unit_templates(const char *exec_path)
>  	       "Description=Optimize Git repositories data\n"
>  	       "\n"
>  	       "[Timer]\n"
> -	       "OnCalendar=%i\n"
> +	       "OnCalendar=%s\n"
>  	       "Persistent=true\n"
>  	       "\n"
>  	       "[Install]\n"
>  	       "WantedBy=timers.target\n";
> -	if (fputs(unit, file) == EOF) {
> +	if (fprintf(file, unit, schedule_pattern) < 0) {

OK, this is the templating part that you were mentioning earlier. I was
wondering what we were doing fputs()-ing a string with "%i" in it
without a formatting value to fill it in with. But that "%i" pertains to
systemd's instance value, IIUC.

The rest all looks good, thanks.

Thanks,
Taylor
Phillip Wood Aug. 8, 2023, 9:53 a.m. UTC | #2
Hi Stolee

On 07/08/2023 19:51, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <derrickstolee@github.com>
> +	switch (schedule) {
> +	case SCHEDULE_HOURLY:
> +		schedule_pattern = xstrfmt("*-*-* *:%02d:00", minute);
> +		break;
> +
> +	case SCHEDULE_DAILY:
> +		schedule_pattern = xstrfmt("*-*-* 0:%02d:00", minute);
> +		break;
> +
> +	case SCHEDULE_WEEKLY:
> +		schedule_pattern = xstrfmt("Mon 0:%02d:00", minute);
> +		break;

This is not a new issue with this patch but we run the hourly job even 
when we want to run the daily job or the weekly job and we run the daily 
job when we want to run the weekly job.  maintenance_run_tasks() contains

	if (hold_lock_file_for_update(&lk, lock_path, LOCK_NO_DEREF) < 0) {
		/*
		 * Another maintenance command is running.
		 *
		 * If --auto was provided, then it is likely due to a
		 * recursive process stack. Do not report an error in
		 * that case.
		 */
		if (!opts->auto_flag && !opts->quiet)
			warning(_("lock file '%s' exists, skipping maintenance"),
				lock_path);
		free(lock_path);
		return 0;
	}

So only one of these jobs will succeed. The cron entries are careful to 
only run one job at a time, I think it would be worth doing the same 
thing here. I think the using the following format strings would fix this.

Hourly: "Tue..Sun *-*-* 1..23:00:%02d"
Daily:  "Tue..Sun *-*-* 00:00:%02d"
Weekly: "Mon      *-*-* 00:00:%02d"

It looks like the launchctl schedule has the same issue.

One thing I've been wondering about which is related to maintenance but 
totally off-topic for this patch is that I think when auto maintenance 
is enabled we stop automatically running "gc" so how do the reflogs get 
expired?

Best Wishes

Phillip
Phillip Wood Aug. 8, 2023, 12:08 p.m. UTC | #3
On 07/08/2023 19:51, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <derrickstolee@github.com>
> 
> The get_random_minute() method was created to allow maintenance
> schedules to be fixed to a random minute of the hour. This randomness is
> only intended to spread out the load from a number of clients, but each
> client should have an hour between each maintenance cycle.
> 
> Add this random minute to the systemd integration.

I think it makes sense to keep the random minute implementation the same 
across all the schedulers, but we could use RandomizedDelaySec (possibly 
combined with FixedRandomDelay) to randomize when the job is actually run.

> This integration is more complicated than similar changes for other
> schedulers because of a neat trick that systemd allows: templating.
> 
> The previous implementation generated two template files with names
> of the form 'git-maintenance@.(timer|service)'. The '.timer' or
> '.service' indicates that this is a template that is picked up when we
> later specify '...@<schedule>.timer' or '...@<schedule>.service'. The
> '<schedule>' string is then used to insert into the template both the
> 'OnCalendar' schedule setting and the '--schedule' parameter of the
> 'git maintenance run' command.
> 
> In order to set these schedules to a given minute, we can no longer use
> the 'hourly', 'daily', or 'weekly' strings for '<schedule>' and instead
> need to abandon the template model.

I've left some comments about this below.


> @@ -2299,13 +2299,20 @@ static char *xdg_config_home_systemd(const char *filename)
>   	return xdg_config_home_for("systemd/user", filename);
>   }
>   
> -static int systemd_timer_write_unit_templates(const char *exec_path)
> +static int systemd_timer_write_unit_template(enum schedule_priority schedule,

As we're not writing template files any more I think we should rename 
this to systemd_timer_write_unit_file()

> +					     const char *exec_path,
> +					     int minute)
>   {
>   	char *filename;
>   	FILE *file;
>   	const char *unit;
> +	char *schedule_pattern = NULL;
> +	const char *frequency = get_frequency(schedule);
> +	char *local_timer_name = xstrfmt("git-maintenance@%s.timer", frequency);

The "@" in the name signifies that it is a template unit which it isn't 
anymore so I think we want to change this to "git-maintenance-%s.timer"

> +	char *local_service_name = xstrfmt("git-maintenance@%s.service", frequency);

Same change to the name here. I wonder if we could still use a template 
service file but that would complicate the implementation as we'd need 
to write three timer files but only one service file.

> [...]
> @@ -2375,13 +2399,16 @@ static int systemd_timer_write_unit_templates(const char *exec_path)
>   	return 0;
>   
>   error:
> +	free(schedule_pattern);
> +	free(local_timer_name);
>   	free(filename);
> -	systemd_timer_delete_unit_templates();

This looks like a change in behavior as previously we'd remove any files 
if there was an error rather than leaving behind a timer file without a 
corresponding unit file.

Looking at maintenance_start() we call maintenance_register() which 
disables "gc --auto" before we get to this point so if we fail to write 
the files we'll end up disabling any form of gc in the repository.

> [...]
> -static int systemd_timer_delete_unit_templates(void)
> +static int systemd_timer_delete_unit_template(enum schedule_priority priority)

Same suggestion as above to rename this to ..._unit_file()

>   {
> +	const char *frequency = get_frequency(priority);
> +	char *local_timer_name = xstrfmt("git-maintenance@%s.timer", frequency);
> +	char *local_service_name = xstrfmt("git-maintenance@%s.service", frequency);

I'm not sure it is worth it but we could perhaps

	#define SYSTEMD_UNIT_FORMAT "git-maintenance-%s.%s"

above and then these lines and the ones in 
systemd_timer_write_unit_file() would become

	char *local_timer_name = xstrfmt(SYSTEMD_UNIT_FORMAT, frequency, "timer");
	char *local_service = xstrfmt(SYSTEMD_UNIT_FORMAT, frequency, "service");

> [...]
> +static int systemd_timer_delete_unit_templates(void)

Naming again.

Best Wishes

Phillip
Phillip Wood Aug. 8, 2023, 1:03 p.m. UTC | #4
On 08/08/2023 10:53, Phillip Wood wrote:
> Hourly: "Tue..Sun *-*-* 1..23:00:%02d"
> Daily:  "Tue..Sun *-*-* 00:00:%02d"
> Weekly: "Mon      *-*-* 00:00:%02d"

Thinking about it some more, this only fixes the problem if the computer 
is actually on at midnight. If the computer is switched off overnight 
then we still try to start three maintenance jobs at the same time when 
the user turns their computer on on Tuesday morning. We could stop 
marking the hourly job as persistent on the assumption that it will be 
run soon anyway but that does not solve the problem of the daily and 
weekly jobs running concurrently on a Tuesday morning.

Best Wishes

Phillip
Derrick Stolee Aug. 8, 2023, 1:49 p.m. UTC | #5
On 8/7/2023 5:31 PM, Taylor Blau wrote:
> On Mon, Aug 07, 2023 at 06:51:40PM +0000, Derrick Stolee via GitGitGadget wrote:

>> Modify the template with a custom schedule in the 'OnCalendar' setting.
>> This schedule has some interesting differences from cron-like patterns,
>> but is relatively easy to figure out from context. The one that might be
>> confusing is that '*-*-*' is a date-based pattern, but this must be
>> omitted when using 'Mon' to signal that we care about the day of the
>> week. Monday is used since that matches the day used for the 'weekly'
>> schedule used previously.
> 
> I think the launchd version (which uses "0" for the day of the week)
> runs on Sunday, if I remember correctly. I don't think that these two
> necessarily need to run on the same day of the week when configured to
> run weekly.
> 
> But I figured I'd raise the question in case you did mean for them to
> both run on either Sunday or Monday.

I don't intend to change the day that is run as part of this change.

I think all other schedulers run on Sunday, but systemd running on Monday
is a particular detail of its "weekly" schedule.

>> -static int systemd_timer_write_unit_templates(const char *exec_path)
>> +static int systemd_timer_write_unit_template(enum schedule_priority schedule,
>> +					     const char *exec_path,
>> +					     int minute)
>>  {
>>  	char *filename;
>>  	FILE *file;
>>  	const char *unit;
>> +	char *schedule_pattern = NULL;
> 
> You should be able to drop the NULL initialization, since you assign
> this value unconditionally in the switch statement below (or BUG() on an
> unknown schedule type).

Unfortunately, GCC complained about a possibly-unassigned value when I
left this unset during development, so this actually is necessary for
that compiler.

Thanks,
-Stolee
Derrick Stolee Aug. 8, 2023, 1:56 p.m. UTC | #6
On 8/8/2023 5:53 AM, Phillip Wood wrote:> Hi Stolee
> 
> On 07/08/2023 19:51, Derrick Stolee via GitGitGadget wrote:
>> From: Derrick Stolee <derrickstolee@github.com>
>> +    switch (schedule) {
>> +    case SCHEDULE_HOURLY:
>> +        schedule_pattern = xstrfmt("*-*-* *:%02d:00", minute);
>> +        break;
>> +
>> +    case SCHEDULE_DAILY:
>> +        schedule_pattern = xstrfmt("*-*-* 0:%02d:00", minute);
>> +        break;
>> +
>> +    case SCHEDULE_WEEKLY:
>> +        schedule_pattern = xstrfmt("Mon 0:%02d:00", minute);
>> +        break;
> 
> This is not a new issue with this patch but we run the hourly job even
> when we want to run the daily job or the weekly job and we run the daily
> job when we want to run the weekly job.

This is an excellent point! Thanks for bringing it up.

> So only one of these jobs will succeed. The cron entries are careful to
> only run one job at a time, I think it would be worth doing the same
> thing here. I think the using the following format strings would fix this.
> 
> Hourly: "Tue..Sun *-*-* 1..23:00:%02d"
> Daily:  "Tue..Sun *-*-* 00:00:%02d"
> Weekly: "Mon      *-*-* 00:00:%02d"

I would modify this with dropping the "Tue..Sun" from the hourly schedule,
as we still want 23 runs on Mondays.

> It looks like the launchctl schedule has the same issue.

I will take a look at this and consider some additional patches to correct
these issues across both schedulers. Thank you for the attention to detail!

> One thing I've been wondering about which is related to maintenance but
> totally off-topic for this patch is that I think when auto maintenance
> is enabled we stop automatically running "gc" so how do the reflogs get
> expired?

The default maintenance schedule does not include a 'gc' run as it does
not intend to remove any data. Reflog expiration could be considered as a
separate maintenance task that might be useful in the default maintenance
schedule.

Thanks,
-Stolee
Derrick Stolee Aug. 8, 2023, 5:06 p.m. UTC | #7
On 8/8/2023 8:08 AM, Phillip Wood wrote:
> On 07/08/2023 19:51, Derrick Stolee via GitGitGadget wrote:
>> From: Derrick Stolee <derrickstolee@github.com>
>>
>> The get_random_minute() method was created to allow maintenance
>> schedules to be fixed to a random minute of the hour. This randomness is
>> only intended to spread out the load from a number of clients, but each
>> client should have an hour between each maintenance cycle.
>>
>> Add this random minute to the systemd integration.
> 
> I think it makes sense to keep the random minute implementation the same across all the schedulers, but we could use RandomizedDelaySec (possibly combined with FixedRandomDelay) to randomize when the job is actually run.

That's an interesting suggestion, but I also think it is valuable to
have consistent gaps between maintenance activities on the client, but
RandomizedDelaySec would present the possibility of 60-3540 seconds between
"hourly" maintenance runs (if I understand the option correctly).

>> @@ -2299,13 +2299,20 @@ static char *xdg_config_home_systemd(const char *filename)
>>       return xdg_config_home_for("systemd/user", filename);
>>   }
>>   -static int systemd_timer_write_unit_templates(const char *exec_path)
>> +static int systemd_timer_write_unit_template(enum schedule_priority schedule,
> 
> As we're not writing template files any more I think we should rename this to systemd_timer_write_unit_file()

Good point. I have adjusted all the names in my next version.

>> +                         const char *exec_path,
>> +                         int minute)
>>   {
>>       char *filename;
>>       FILE *file;
>>       const char *unit;
>> +    char *schedule_pattern = NULL;
>> +    const char *frequency = get_frequency(schedule);
>> +    char *local_timer_name = xstrfmt("git-maintenance@%s.timer", frequency);
> 
> The "@" in the name signifies that it is a template unit which it isn't anymore so I think we want to change this to "git-maintenance-%s.timer"

I'll also take your SYSTEMD_UNIT_FORMAT macro suggestion to simplify things.
 
>> +    char *local_service_name = xstrfmt("git-maintenance@%s.service", frequency);
> 
> Same change to the name here. I wonder if we could still use a template service file but that would complicate the implementation as we'd need to write three timer files but only one service file.

Since we execute systemctl only passing the .timer filename, I think we'd
need to keep the '@' symbol in the name if we wanted to use .schedule
templates. Best to keep things simple and abandon templates completely.

>> [...]
>> @@ -2375,13 +2399,16 @@ static int systemd_timer_write_unit_templates(const char *exec_path)
>>       return 0;
>>     error:
>> +    free(schedule_pattern);
>> +    free(local_timer_name);
>>       free(filename);
>> -    systemd_timer_delete_unit_templates();
> 
> This looks like a change in behavior as previously we'd remove any files if there was an error rather than leaving behind a timer file without a corresponding unit file.

The callers take care of deleting the unit files, but there was one place
where a short-circuit could have avoided this deletion. I'll clean that up.

> Looking at maintenance_start() we call maintenance_register() which disables "gc --auto" before we get to this point so if we fail to write the files we'll end up disabling any form of gc in the repository.

Adding this to the list of cleanups at the end. Thanks.

I appreciate the careful review!

Thanks,
-Stolee
Derrick Stolee Aug. 8, 2023, 5:14 p.m. UTC | #8
On 8/8/2023 1:06 PM, Derrick Stolee wrote:
> On 8/8/2023 8:08 AM, Phillip Wood wrote:
>> On 07/08/2023 19:51, Derrick Stolee via GitGitGadget wrote:

>>> +    char *local_timer_name = xstrfmt("git-maintenance@%s.timer", frequency);
>>
>> The "@" in the name signifies that it is a template unit which it isn't anymore so I think we want to change this to "git-maintenance-%s.timer"
> 
> I'll also take your SYSTEMD_UNIT_FORMAT macro suggestion to simplify things.

As I was checking things, it turns out that we _should_ keep the '@' symbol
if only to make sure that our new schedule overwrites the old schedule.

The alternative is that we manually try to delete the old schedule, but that
feels like an inefficient way to do it, leaving some cruft around long-term.

For completeness, here is what I did to check:

$ systemctl --user list-timers
NEXT                        LEFT        LAST                        PASSED       UNIT                         ACTIVATES                     
Tue 2023-08-08 13:13:00 EDT 6s left     n/a                         n/a          git-maintenance-hourly.timer git-maintenance-hourly.service
Tue 2023-08-08 13:50:00 EDT 37min left  Tue 2023-08-08 12:50:10 EDT 22min ago    git-maintenance@hourly.timer git-maintenance@hourly.service
Wed 2023-08-09 00:13:00 EDT 11h left    n/a                         n/a          git-maintenance-daily.timer  git-maintenance-daily.service
Wed 2023-08-09 00:50:00 EDT 11h left    Tue 2023-08-08 09:35:31 EDT 3h 37min ago git-maintenance@daily.timer  git-maintenance@daily.service
Mon 2023-08-14 00:13:00 EDT 5 days left n/a                         n/a          git-maintenance-weekly.timer git-maintenance-weekly.service
Mon 2023-08-14 00:50:00 EDT 5 days left Mon 2023-08-07 10:28:10 EDT 1 day 2h ago git-maintenance@weekly.timer git-maintenance@weekly.service

Do you have an alternative idea how to handle that?

Thanks,
-Stolee
Derrick Stolee Aug. 8, 2023, 5:24 p.m. UTC | #9
On 8/8/2023 9:56 AM, Derrick Stolee wrote:
> On 8/8/2023 5:53 AM, Phillip Wood wrote:> Hi Stolee

>> So only one of these jobs will succeed. The cron entries are careful to
>> only run one job at a time, I think it would be worth doing the same
>> thing here. I think the using the following format strings would fix this.
>>
>> Hourly: "Tue..Sun *-*-* 1..23:00:%02d"
>> Daily:  "Tue..Sun *-*-* 00:00:%02d"
>> Weekly: "Mon      *-*-* 00:00:%02d"
> 
> I would modify this with dropping the "Tue..Sun" from the hourly schedule,
> as we still want 23 runs on Mondays.
> 
>> It looks like the launchctl schedule has the same issue.
> 
> I will take a look at this and consider some additional patches to correct
> these issues across both schedulers. Thank you for the attention to detail!

Taking a look, it seems that launchctl does not have this same problem.

The schedule is set via an <array> of <dict>s as follows:

	case SCHEDULE_HOURLY:
		repeat = "<dict>\n"
			 "<key>Hour</key><integer>%d</integer>\n"
			 "<key>Minute</key><integer>%d</integer>\n"
			 "</dict>\n";
		for (i = 1; i <= 23; i++)
			strbuf_addf(&plist, repeat, i, minute);
		break;

	case SCHEDULE_DAILY:
		repeat = "<dict>\n"
			 "<key>Day</key><integer>%d</integer>\n"
			 "<key>Hour</key><integer>0</integer>\n"
			 "<key>Minute</key><integer>%d</integer>\n"
			 "</dict>\n";
		for (i = 1; i <= 6; i++)
			strbuf_addf(&plist, repeat, i, minute);
		break;

This means that we create an hourly schedule for each hour 1..23
(no 0th hour means no collision with daily or weekly schedule) and
a daily schedule for each day 1..6 (no 0th day means no collision
with the weekly schedule).

Does this match your understanding?

The overlap _definitely_ exists in systemd, which I will fix.

Thanks,
-Stolee
Taylor Blau Aug. 8, 2023, 8:05 p.m. UTC | #10
On Tue, Aug 08, 2023 at 09:49:40AM -0400, Derrick Stolee wrote:
> > But I figured I'd raise the question in case you did mean for them to
> > both run on either Sunday or Monday.
>
> I don't intend to change the day that is run as part of this change.
>
> I think all other schedulers run on Sunday, but systemd running on Monday
> is a particular detail of its "weekly" schedule.

No problem -- I figured that you didn't intend on changing the days
around here, just wanted to make sure it was known that the systemd
scheduler picks a different day of the week for its weekly schedule than
the others do.

> >> -static int systemd_timer_write_unit_templates(const char *exec_path)
> >> +static int systemd_timer_write_unit_template(enum schedule_priority schedule,
> >> +					     const char *exec_path,
> >> +					     int minute)
> >>  {
> >>  	char *filename;
> >>  	FILE *file;
> >>  	const char *unit;
> >> +	char *schedule_pattern = NULL;
> >
> > You should be able to drop the NULL initialization, since you assign
> > this value unconditionally in the switch statement below (or BUG() on an
> > unknown schedule type).
>
> Unfortunately, GCC complained about a possibly-unassigned value when I
> left this unset during development, so this actually is necessary for
> that compiler.

Ah, I would have thought that GCC would be smart enough to figure out
that schedule_pattern is unconditionally initialized via the switch
statement, but I guess not. Makes sense.

Thanks,
Taylor
Phillip Wood Aug. 9, 2023, 10 a.m. UTC | #11
On 08/08/2023 18:14, Derrick Stolee wrote:
> On 8/8/2023 1:06 PM, Derrick Stolee wrote:
>> On 8/8/2023 8:08 AM, Phillip Wood wrote:
>>> On 07/08/2023 19:51, Derrick Stolee via GitGitGadget wrote:
> 
>>>> +    char *local_timer_name = xstrfmt("git-maintenance@%s.timer", frequency);
>>>
>>> The "@" in the name signifies that it is a template unit which it isn't anymore so I think we want to change this to "git-maintenance-%s.timer"
>>
>> I'll also take your SYSTEMD_UNIT_FORMAT macro suggestion to simplify things.
> 
> As I was checking things, it turns out that we _should_ keep the '@' symbol
> if only to make sure that our new schedule overwrites the old schedule.

Oh, so if the user already has scheduled maintenance set up then running 
"git maintenance start" adds a new set of timers. I'd not though about that.

> The alternative is that we manually try to delete the old schedule, but that
> feels like an inefficient way to do it, leaving some cruft around long-term.

This patch still changes the names of the files we write. Currently we write

	$XDG_CONFIG_HOME/systemd/user/git-maintenance@.service
	$XDG_CONFIG_HOME/systemd/user/git-maintenance@.timer

and this patch changes that to

	$XDG_CONFIG_HOME/systemd/user/git-maintenance@hourly.service
	$XDG_CONFIG_HOME/systemd/user/git-maintenance@daily.service
	$XDG_CONFIG_HOME/systemd/user/git-maintenance@weekly.service
	$XDG_CONFIG_HOME/systemd/user/git-maintenance@hourly.timer
	$XDG_CONFIG_HOME/systemd/user/git-maintenance@daily.timer
	$XDG_CONFIG_HOME/systemd/user/git-maintenance@weekly.timer

If the user has already enabled maintenance then

	$XDG_CONFIG_HOME/systemd/user/timers.target.wants/git-maintenance@hourly.timer
	$XDG_CONFIG_HOME/systemd/user/timers.target.wants/git-maintenance@daily.timer
	$XDG_CONFIG_HOME/systemd/user/timers.target.wants/git-maintenance@weekly.timer

will exist and are all symlinks pointing to

	$XDG_CONFIG_HOME/systemd/user/git-maintenance@.timer

After this patch if the user runs "git maintenance start" again then 
systemctl will update the symlinks tot point to the matching unit files 
rather than the old template file. That means the user will pick up the 
new schedule but we leave behind the original files that are unused.

> For completeness, here is what I did to check:
> 
> $ systemctl --user list-timers
> NEXT                        LEFT        LAST                        PASSED       UNIT                         ACTIVATES
> Tue 2023-08-08 13:13:00 EDT 6s left     n/a                         n/a          git-maintenance-hourly.timer git-maintenance-hourly.service
> Tue 2023-08-08 13:50:00 EDT 37min left  Tue 2023-08-08 12:50:10 EDT 22min ago    git-maintenance@hourly.timer git-maintenance@hourly.service
> Wed 2023-08-09 00:13:00 EDT 11h left    n/a                         n/a          git-maintenance-daily.timer  git-maintenance-daily.service
> Wed 2023-08-09 00:50:00 EDT 11h left    Tue 2023-08-08 09:35:31 EDT 3h 37min ago git-maintenance@daily.timer  git-maintenance@daily.service
> Mon 2023-08-14 00:13:00 EDT 5 days left n/a                         n/a          git-maintenance-weekly.timer git-maintenance-weekly.service
> Mon 2023-08-14 00:50:00 EDT 5 days left Mon 2023-08-07 10:28:10 EDT 1 day 2h ago git-maintenance@weekly.timer git-maintenance@weekly.service
> 
> Do you have an alternative idea how to handle that?

I think we should stick with the names as you have them. It might be 
worth keeping the service file as a template so we only write the new 
timer files and have a reason to use the "@" naming scheme. We could 
update systemd_timer_setup_units() to delete git-maintenance@.timer if 
we successfully enable the new timer unit files.

Sorry for the confusion, I should have thought about the user running 
"git maintenance start" for a second time.

Best Wishes

Phillip
Phillip Wood Aug. 9, 2023, 10:03 a.m. UTC | #12
On 08/08/2023 18:24, Derrick Stolee wrote:
> On 8/8/2023 9:56 AM, Derrick Stolee wrote:
>> On 8/8/2023 5:53 AM, Phillip Wood wrote:> Hi Stolee
> 
>>> So only one of these jobs will succeed. The cron entries are careful to
>>> only run one job at a time, I think it would be worth doing the same
>>> thing here. I think the using the following format strings would fix this.
>>>
>>> Hourly: "Tue..Sun *-*-* 1..23:00:%02d"
>>> Daily:  "Tue..Sun *-*-* 00:00:%02d"
>>> Weekly: "Mon      *-*-* 00:00:%02d"
>>
>> I would modify this with dropping the "Tue..Sun" from the hourly schedule,
>> as we still want 23 runs on Mondays.

Oops, well spotted

>>> It looks like the launchctl schedule has the same issue.
>>
>> I will take a look at this and consider some additional patches to correct
>> these issues across both schedulers. Thank you for the attention to detail!
> 
> Taking a look, it seems that launchctl does not have this same problem.
> 
> The schedule is set via an <array> of <dict>s as follows:
> 
> 	case SCHEDULE_HOURLY:
> 		repeat = "<dict>\n"
> 			 "<key>Hour</key><integer>%d</integer>\n"
> 			 "<key>Minute</key><integer>%d</integer>\n"
> 			 "</dict>\n";
> 		for (i = 1; i <= 23; i++)
> 			strbuf_addf(&plist, repeat, i, minute);
> 		break;
> 
> 	case SCHEDULE_DAILY:
> 		repeat = "<dict>\n"
> 			 "<key>Day</key><integer>%d</integer>\n"
> 			 "<key>Hour</key><integer>0</integer>\n"
> 			 "<key>Minute</key><integer>%d</integer>\n"
> 			 "</dict>\n";
> 		for (i = 1; i <= 6; i++)
> 			strbuf_addf(&plist, repeat, i, minute);
> 		break;
> 
> This means that we create an hourly schedule for each hour 1..23
> (no 0th hour means no collision with daily or weekly schedule) and
> a daily schedule for each day 1..6 (no 0th day means no collision
> with the weekly schedule).
> 
> Does this match your understanding?

Yes, having read it again - sorry I'd misunderstood it yesterday.

> The overlap _definitely_ exists in systemd, which I will fix.

That's great

Best Wishes

Phillip
diff mbox series

Patch

diff --git a/builtin/gc.c b/builtin/gc.c
index b3ef95b10aa..5f5bb95641f 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -2299,13 +2299,20 @@  static char *xdg_config_home_systemd(const char *filename)
 	return xdg_config_home_for("systemd/user", filename);
 }
 
-static int systemd_timer_write_unit_templates(const char *exec_path)
+static int systemd_timer_write_unit_template(enum schedule_priority schedule,
+					     const char *exec_path,
+					     int minute)
 {
 	char *filename;
 	FILE *file;
 	const char *unit;
+	char *schedule_pattern = NULL;
+	const char *frequency = get_frequency(schedule);
+	char *local_timer_name = xstrfmt("git-maintenance@%s.timer", frequency);
+	char *local_service_name = xstrfmt("git-maintenance@%s.service", frequency);
+
+	filename = xdg_config_home_systemd(local_timer_name);
 
-	filename = xdg_config_home_systemd("git-maintenance@.timer");
 	if (safe_create_leading_directories(filename)) {
 		error(_("failed to create directories for '%s'"), filename);
 		goto error;
@@ -2314,6 +2321,23 @@  static int systemd_timer_write_unit_templates(const char *exec_path)
 	if (!file)
 		goto error;
 
+	switch (schedule) {
+	case SCHEDULE_HOURLY:
+		schedule_pattern = xstrfmt("*-*-* *:%02d:00", minute);
+		break;
+
+	case SCHEDULE_DAILY:
+		schedule_pattern = xstrfmt("*-*-* 0:%02d:00", minute);
+		break;
+
+	case SCHEDULE_WEEKLY:
+		schedule_pattern = xstrfmt("Mon 0:%02d:00", minute);
+		break;
+
+	default:
+		BUG("Unhandled schedule_priority");
+	}
+
 	unit = "# This file was created and is maintained by Git.\n"
 	       "# Any edits made in this file might be replaced in the future\n"
 	       "# by a Git command.\n"
@@ -2322,12 +2346,12 @@  static int systemd_timer_write_unit_templates(const char *exec_path)
 	       "Description=Optimize Git repositories data\n"
 	       "\n"
 	       "[Timer]\n"
-	       "OnCalendar=%i\n"
+	       "OnCalendar=%s\n"
 	       "Persistent=true\n"
 	       "\n"
 	       "[Install]\n"
 	       "WantedBy=timers.target\n";
-	if (fputs(unit, file) == EOF) {
+	if (fprintf(file, unit, schedule_pattern) < 0) {
 		error(_("failed to write to '%s'"), filename);
 		fclose(file);
 		goto error;
@@ -2338,7 +2362,7 @@  static int systemd_timer_write_unit_templates(const char *exec_path)
 	}
 	free(filename);
 
-	filename = xdg_config_home_systemd("git-maintenance@.service");
+	filename = xdg_config_home_systemd(local_service_name);
 	file = fopen_or_warn(filename, "w");
 	if (!file)
 		goto error;
@@ -2352,7 +2376,7 @@  static int systemd_timer_write_unit_templates(const char *exec_path)
 	       "\n"
 	       "[Service]\n"
 	       "Type=oneshot\n"
-	       "ExecStart=\"%s/git\" --exec-path=\"%s\" for-each-repo --config=maintenance.repo maintenance run --schedule=%%i\n"
+	       "ExecStart=\"%s/git\" --exec-path=\"%s\" for-each-repo --config=maintenance.repo maintenance run --schedule=%s\n"
 	       "LockPersonality=yes\n"
 	       "MemoryDenyWriteExecute=yes\n"
 	       "NoNewPrivileges=yes\n"
@@ -2362,7 +2386,7 @@  static int systemd_timer_write_unit_templates(const char *exec_path)
 	       "RestrictSUIDSGID=yes\n"
 	       "SystemCallArchitectures=native\n"
 	       "SystemCallFilter=@system-service\n";
-	if (fprintf(file, unit, exec_path, exec_path) < 0) {
+	if (fprintf(file, unit, exec_path, exec_path, frequency) < 0) {
 		error(_("failed to write to '%s'"), filename);
 		fclose(file);
 		goto error;
@@ -2375,13 +2399,16 @@  static int systemd_timer_write_unit_templates(const char *exec_path)
 	return 0;
 
 error:
+	free(schedule_pattern);
+	free(local_timer_name);
 	free(filename);
-	systemd_timer_delete_unit_templates();
 	return -1;
 }
 
 static int systemd_timer_enable_unit(int enable,
-				     enum schedule_priority schedule)
+				     enum schedule_priority schedule,
+				     const char *exec_path,
+				     int minute)
 {
 	const char *cmd = "systemctl";
 	struct child_process child = CHILD_PROCESS_INIT;
@@ -2398,6 +2425,8 @@  static int systemd_timer_enable_unit(int enable,
 	 */
 	if (!enable)
 		child.no_stderr = 1;
+	else if (systemd_timer_write_unit_template(schedule, exec_path, minute))
+		return -1;
 
 	get_schedule_cmd(&cmd, NULL);
 	strvec_split(&child.args, cmd);
@@ -2420,38 +2449,53 @@  static int systemd_timer_enable_unit(int enable,
 	return 0;
 }
 
-static int systemd_timer_delete_unit_templates(void)
+static int systemd_timer_delete_unit_template(enum schedule_priority priority)
 {
+	const char *frequency = get_frequency(priority);
+	char *local_timer_name = xstrfmt("git-maintenance@%s.timer", frequency);
+	char *local_service_name = xstrfmt("git-maintenance@%s.service", frequency);
 	int ret = 0;
-	char *filename = xdg_config_home_systemd("git-maintenance@.timer");
+	char *filename = xdg_config_home_systemd(local_timer_name);
 	if (unlink(filename) && !is_missing_file_error(errno))
 		ret = error_errno(_("failed to delete '%s'"), filename);
 	FREE_AND_NULL(filename);
 
-	filename = xdg_config_home_systemd("git-maintenance@.service");
+	filename = xdg_config_home_systemd(local_service_name);
 	if (unlink(filename) && !is_missing_file_error(errno))
 		ret = error_errno(_("failed to delete '%s'"), filename);
 
 	free(filename);
+	free(local_timer_name);
+	free(local_service_name);
 	return ret;
 }
 
+static int systemd_timer_delete_unit_templates(void)
+{
+	/* Purposefully not short-circuited to make sure all are called. */
+	return systemd_timer_delete_unit_template(SCHEDULE_HOURLY) |
+	       systemd_timer_delete_unit_template(SCHEDULE_DAILY) |
+	       systemd_timer_delete_unit_template(SCHEDULE_WEEKLY);
+}
+
 static int systemd_timer_delete_units(void)
 {
-	return systemd_timer_enable_unit(0, SCHEDULE_HOURLY) ||
-	       systemd_timer_enable_unit(0, SCHEDULE_DAILY) ||
-	       systemd_timer_enable_unit(0, SCHEDULE_WEEKLY) ||
+	int minute = get_random_minute();
+	const char *exec_path = git_exec_path();
+	return systemd_timer_enable_unit(0, SCHEDULE_HOURLY, exec_path, minute) ||
+	       systemd_timer_enable_unit(0, SCHEDULE_DAILY, exec_path, minute) ||
+	       systemd_timer_enable_unit(0, SCHEDULE_WEEKLY, exec_path, minute) ||
 	       systemd_timer_delete_unit_templates();
 }
 
 static int systemd_timer_setup_units(void)
 {
+	int minute = get_random_minute();
 	const char *exec_path = git_exec_path();
 
-	int ret = systemd_timer_write_unit_templates(exec_path) ||
-		  systemd_timer_enable_unit(1, SCHEDULE_HOURLY) ||
-		  systemd_timer_enable_unit(1, SCHEDULE_DAILY) ||
-		  systemd_timer_enable_unit(1, SCHEDULE_WEEKLY);
+	int ret = systemd_timer_enable_unit(1, SCHEDULE_HOURLY, exec_path, minute) ||
+		  systemd_timer_enable_unit(1, SCHEDULE_DAILY, exec_path, minute) ||
+		  systemd_timer_enable_unit(1, SCHEDULE_WEEKLY, exec_path, minute);
 	if (ret)
 		systemd_timer_delete_units();
 	return ret;
diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index 487e326b3fa..f5a93f5aecf 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -744,7 +744,9 @@  test_expect_success 'start and stop Linux/systemd maintenance' '
 	# start registers the repo
 	git config --get --global --fixed-value maintenance.repo "$(pwd)" &&
 
-	test_systemd_analyze_verify "systemd/user/git-maintenance@.service" &&
+	test_systemd_analyze_verify "systemd/user/git-maintenance@hourly.service" &&
+	test_systemd_analyze_verify "systemd/user/git-maintenance@daily.service" &&
+	test_systemd_analyze_verify "systemd/user/git-maintenance@weekly.service" &&
 
 	printf -- "--user enable --now git-maintenance@%s.timer\n" hourly daily weekly >expect &&
 	test_cmp expect args &&