Message ID | c6d5f045fb5644306a3676e5fa4145ba4c6e9b93.1617291666.git.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Series | Builtin FSMonitor Feature |
On 4/1/2021 11:41 AM, Jeff Hostetler via GitGitGadget wrote:
> From: Jeff Hostetler <jeffhost@microsoft.com>
>
> Define GIT_TEST_FSMONITOR_CLIENT_DELAY as a millisecond delay.

This is a second delay introduced in this feature, but the units are
different. Could we put a unit in the name? Perhaps a "_MS" suffix.

> Introduce an artificial delay when processing client requests.
> This make the CI/PR test suite a little more stable and avoids
> the need to load up test scripts with sleep statements to avoid
> racy failures. This was mostly seen on 1 or 2 core CI build
> machines where the test script would create a file and quickly
> try to confirm that the daemon had seen it *before* the daemon
> had received the kernel event and causing a test failure.

Isn't the cookie file supposed to prevent this from happening?

Yes, our test suite interacts with the filesystem and Git commands
more quickly than a human user would, but Git is used all the time
by scripts or build machines to quickly process data. The FS Monitor
feature should be robust to such a situation. I feel that as currently
described, this patch is only hiding a bug that shows up during heavy
use.

Perhaps the test failures are limited to a small number of specific
tests that are checking the FS Monitor daemon in a non-standard way,
especially in a way that circumvents the cookie file. In this case,
I'd like to see _in this patch_ how the environment variable is used
in the test suite.

I understand that it is difficult to simultaneously build a new
feature like this in small increments, but the biggest issue I have
with the series' organization so far is that we are 18 patches deep
and I still haven't seen a single test. This is a case where I think
this only serves the purpose of the test suite, so it would be good
to delay until only seeing its value in a test script.

Looking ahead, I see that you insert it as a blanket statement in
the t7527 test script, which seems like it has potential to hide
bugs instead of being an isolated cover for a specific interaction.

As for the code, it all looks correct. However, please update
t/README with a description of the new GIT_TEST_* variable.

Thanks,
-Stolee
diff --git a/builtin/fsmonitor--daemon.c b/builtin/fsmonitor--daemon.c
index e9a9aea59ad6..0cb09ef0b984 100644
--- a/builtin/fsmonitor--daemon.c
+++ b/builtin/fsmonitor--daemon.c
@@ -150,6 +150,30 @@ static int do_as_client__send_flush(void)
 	return 0;
 }
 
+static int lookup_client_test_delay(void)
+{
+	static int delay_ms = -1;
+
+	const char *s;
+	int ms;
+
+	if (delay_ms >= 0)
+		return delay_ms;
+
+	delay_ms = 0;
+
+	s = getenv("GIT_TEST_FSMONITOR_CLIENT_DELAY");
+	if (!s)
+		return delay_ms;
+
+	ms = atoi(s);
+	if (ms < 0)
+		return delay_ms;
+
+	delay_ms = ms;
+	return delay_ms;
+}
+
 /*
  * Requests to and from a FSMonitor Protocol V2 provider use an opaque
  * "token" as a virtual timestamp. Clients can request a summary of all
@@ -526,6 +550,18 @@ static int do_handle_client(struct fsmonitor_daemon_state *state,
 		return SIMPLE_IPC_QUIT;
 	}
 
+	/*
+	 * For testing purposes, introduce an artificial delay in this
+	 * worker to allow the filesystem listener thread to receive
+	 * any fs events that may have been generated by the client
+	 * process on the other end of the pipe/socket. This helps
+	 * make the CI/PR test suite runs a little more predictable
+	 * and hopefully eliminates the need to introduce `sleep`
+	 * commands in the test scripts.
+	 */
+	if (state->test_client_delay_ms)
+		sleep_millisec(state->test_client_delay_ms);
+
 	if (!strcmp(command, "flush")) {
 		/*
 		 * Flush all of our cached data and generate a new token
@@ -1038,7 +1074,7 @@ static int fsmonitor_run_daemon(void)
 	pthread_mutex_init(&state.main_lock, NULL);
 	state.error_code = 0;
 	state.current_token_data = fsmonitor_new_token_data();
-	state.test_client_delay_ms = 0;
+	state.test_client_delay_ms = lookup_client_test_delay();
 
 	/* Prepare to (recursively) watch the <worktree-root> directory. */
 	strbuf_init(&state.path_worktree_watch, 0);
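
For illustration only, the review's suggestion to carry the unit in the variable name might look roughly like the standalone sketch below. The name `GIT_TEST_FSMONITOR_CLIENT_DELAY_MS`, the macro `CLIENT_DELAY_ENV`, and the helper `parse_delay_ms()` are hypothetical and do not appear in the patch as posted. The sketch also swaps `atoi()` for `strtol()` so that trailing garbage and out-of-range values fall back to no delay, which is an assumption about the desired behavior rather than something the thread asks for.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical variable name: the "_MS" suffix follows the review
 * suggestion and is not part of the patch as posted.
 */
#define CLIENT_DELAY_ENV "GIT_TEST_FSMONITOR_CLIENT_DELAY_MS"

/*
 * Parse a millisecond delay from the named environment variable.
 * Returns 0 (no delay) when the variable is unset, empty, has
 * characters left over after the number, is negative, or exceeds
 * INT_MAX.
 */
static int parse_delay_ms(const char *env_name)
{
	const char *s = getenv(env_name);
	char *end;
	long ms;

	if (!s || !*s)
		return 0;

	errno = 0;
	ms = strtol(s, &end, 10);
	if (errno || *end != '\0' || ms < 0 || ms > INT_MAX)
		return 0;

	return (int)ms;
}
```

Unlike `lookup_client_test_delay()` in the patch, this helper does not cache its result in a static variable. That keeps it easy to exercise with different values, and a caller that reads it once at daemon startup would not need the cache.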