diff mbox series

[4/5] tools/xenstored: remove "-N" command line option

Message ID 20231113124309.10862-5-jgross@suse.com (mailing list archive)
State Superseded
Headers show
Series tools/xenstored: remove some command line options | expand

Commit Message

Jürgen Groß Nov. 13, 2023, 12:43 p.m. UTC
The "-N" (do not daemonize) command line option is of questionable use:
its sole purpose seems to be to aid debugging of xenstored by making it
easier to start xenstored under gdb, or to see any debug messages
easily.

Debug messages can as well be sent to syslog(), while gdb can be
attached to the daemon easily. The only not covered case is an error
while initializing xenstored, but this could be handled e.g. by saving
a core dump, which can be analyzed later.

The call of talloc_enable_leak_report_full() done only with "-N"
specified is no longer needed, as the same can be achieved via
"xenstore-control memreport".

Signed-off-by: Juergen Gross <jgross@suse.com>
---
Slightly RFC, as this is making debugging a little bit harder in
specific cases. OTOH I didn't use this option since years, in spite of
having done a _lot_ of xenstore hacking.

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 tools/xenstored/core.c | 26 +++++++-------------------
 1 file changed, 7 insertions(+), 19 deletions(-)

Comments

Julien Grall Nov. 14, 2023, 9:15 p.m. UTC | #1
Hi,

On 13/11/2023 12:43, Juergen Gross wrote:
> The "-N" (do not daemonize) command line option is of questionable use:
> its sole purpose seems to be to aid debugging of xenstored by making it
> easier to start xenstored under gdb, or to see any debug messages
> easily.
> 
> Debug messages can as well be sent to syslog(), while gdb can be
> attached to the daemon easily. The only not covered case is an error
> while initializing xenstored, but this could be handled e.g. by saving
> a core dump, which can be analyzed later.

The slight trouble is a debug message in the syslog (such as from 
barf()) is often not enough. What you want is a coredump or attach gdb 
to analyze.

For the former, I found quite useful when debugging the Live-Update code 
to replace exit() with abort() in bar*(). So a coredump will be 
generated anytime barf*() is called.

> 
> The call of talloc_enable_leak_report_full() done only with "-N"
> specified is no longer needed, as the same can be achieved via
> "xenstore-control memreport".
> 
> Signed-off-by: Juergen Gross <jgross@suse.com>
> ---
> Slightly RFC, as this is making debugging a little bit harder in
> specific cases. OTOH I didn't use this option since years, in spite of
> having done a _lot_ of xenstore hacking.

I have used a few times but when trying to run Xenstored with ASAN but 
this wasn't trivial to use.

> 
> Signed-off-by: Juergen Gross <jgross@suse.com>

Reviewed-by: Julien Grall <jgrall@amazon.com.

Cheers,
Andrew Cooper Nov. 14, 2023, 10:11 p.m. UTC | #2
On 13/11/2023 12:43 pm, Juergen Gross wrote:
> The "-N" (do not daemonize) command line option is of questionable use:
> its sole purpose seems to be to aid debugging of xenstored by making it
> easier to start xenstored under gdb, or to see any debug messages
> easily.
>
> Debug messages can as well be sent to syslog(), while gdb can be
> attached to the daemon easily. The only not covered case is an error
> while initializing xenstored, but this could be handled e.g. by saving
> a core dump, which can be analyzed later.
>
> The call of talloc_enable_leak_report_full() done only with "-N"
> specified is no longer needed, as the same can be achieved via
> "xenstore-control memreport".
>
> Signed-off-by: Juergen Gross <jgross@suse.com>

CC Edvin.

There's a patch out to specifically use this option (under systemd) to
fix a bug we found.

I can't recall the details, but I seem to recall it wasn't specific to
oxenstored.

~Andrew
Edwin Torok Nov. 15, 2023, 8:31 a.m. UTC | #3
> On 14 Nov 2023, at 22:11, Andrew Cooper <Andrew.Cooper3@citrix.com> wrote:
> 
> On 13/11/2023 12:43 pm, Juergen Gross wrote:
>> The "-N" (do not daemonize) command line option is of questionable use:
>> its sole purpose seems to be to aid debugging of xenstored by making it
>> easier to start xenstored under gdb, or to see any debug messages
>> easily.
>> 
>> Debug messages can as well be sent to syslog(), while gdb can be
>> attached to the daemon easily. The only not covered case is an error
>> while initializing xenstored, but this could be handled e.g. by saving
>> a core dump, which can be analyzed later.
>> 
>> The call of talloc_enable_leak_report_full() done only with "-N"
>> specified is no longer needed, as the same can be achieved via
>> "xenstore-control memreport".
>> 
>> Signed-off-by: Juergen Gross <jgross@suse.com>
> 
> CC Edvin.
> 
> There's a patch out to specifically use this option (under systemd) to
> fix a bug we found.
> 
> I can't recall the details, but I seem to recall it wasn't specific to
> oxenstored.
> 


The patch is here https://lore.kernel.org/xen-devel/ECFA15A7-9DC8-4476-8D0B-44A6D12192D6@cloud.com/

It is about not losing stderr when run under systemd (well you can use syslog but what if your connection to syslog fails or you fail before getting to the point of parsing which syslog to use). 
Alternatively we could have a "don't redirect stderr to /dev/null" flag,
but such redirections are usually expected when daemonizing.

It'd be good if the --no-fork flag could be retained for C xenstored, currently there is a CI failure on a non-systemd system that I'm trying to fix (a bit blindly because I don't have such a system myself), but from my testing the flag does work on a systemd system with C xenstored.

The patch is motivated by having some undebuggable issues in oxenstored where it just exits without writing any messages and without dumping core, so by retaining the stderr path we should have an output of last resort in case something goes so wrong that the syslog error handler cannot function.

Best regards,
--Edwin
Jürgen Groß Nov. 15, 2023, 9:50 a.m. UTC | #4
On 15.11.23 09:31, Edwin Torok wrote:
> 
> 
>> On 14 Nov 2023, at 22:11, Andrew Cooper <Andrew.Cooper3@citrix.com> wrote:
>>
>> On 13/11/2023 12:43 pm, Juergen Gross wrote:
>>> The "-N" (do not daemonize) command line option is of questionable use:
>>> its sole purpose seems to be to aid debugging of xenstored by making it
>>> easier to start xenstored under gdb, or to see any debug messages
>>> easily.
>>>
>>> Debug messages can as well be sent to syslog(), while gdb can be
>>> attached to the daemon easily. The only not covered case is an error
>>> while initializing xenstored, but this could be handled e.g. by saving
>>> a core dump, which can be analyzed later.
>>>
>>> The call of talloc_enable_leak_report_full() done only with "-N"
>>> specified is no longer needed, as the same can be achieved via
>>> "xenstore-control memreport".
>>>
>>> Signed-off-by: Juergen Gross <jgross@suse.com>
>>
>> CC Edvin.
>>
>> There's a patch out to specifically use this option (under systemd) to
>> fix a bug we found.
>>
>> I can't recall the details, but I seem to recall it wasn't specific to
>> oxenstored.
>>
> 
> 
> The patch is here https://lore.kernel.org/xen-devel/ECFA15A7-9DC8-4476-8D0B-44A6D12192D6@cloud.com/
> 
> It is about not losing stderr when run under systemd (well you can use syslog but what if your connection to syslog fails or you fail before getting to the point of parsing which syslog to use).
> Alternatively we could have a "don't redirect stderr to /dev/null" flag,
> but such redirections are usually expected when daemonizing.
> 
> It'd be good if the --no-fork flag could be retained for C xenstored, currently there is a CI failure on a non-systemd system that I'm trying to fix (a bit blindly because I don't have such a system myself), but from my testing the flag does work on a systemd system with C xenstored.
> 
> The patch is motivated by having some undebuggable issues in oxenstored where it just exits without writing any messages and without dumping core, so by retaining the stderr path we should have an output of last resort in case something goes so wrong that the syslog error handler cannot function.

Using the --no-fork as in your patch would mean that setting the oom_score in
the lauch_xenstore script wouldn't be executed any longer, right?


Juergen
diff mbox series

Patch

diff --git a/tools/xenstored/core.c b/tools/xenstored/core.c
index 5a6378316a..c3cfef0965 100644
--- a/tools/xenstored/core.c
+++ b/tools/xenstored/core.c
@@ -2652,7 +2652,6 @@  static void usage(void)
 "\n"
 "  -F, --pid-file <file>   giving a file for the daemon's pid to be written,\n"
 "  -H, --help              to output this message,\n"
-"  -N, --no-fork           to request that the daemon does not fork,\n"
 "  -T, --trace-file <file> giving the file for logging, and\n"
 "      --trace-control=+<switch> activate a specific <switch>\n"
 "      --trace-control=-<switch> deactivate a specific <switch>\n"
@@ -2699,7 +2698,6 @@  static struct option options[] = {
 	{ "event", 1, NULL, 'e' },
 	{ "master-domid", 1, NULL, 'm' },
 	{ "help", 0, NULL, 'H' },
-	{ "no-fork", 0, NULL, 'N' },
 	{ "priv-domid", 1, NULL, 'p' },
 	{ "entry-size", 1, NULL, 'S' },
 	{ "trace-file", 1, NULL, 'T' },
@@ -2822,7 +2820,6 @@  int main(int argc, char *argv[])
 {
 	int opt;
 	int sock_pollfd_idx = -1;
-	bool dofork = true;
 	bool live_update = false;
 	const char *pidfile = NULL;
 	int timeout;
@@ -2831,7 +2828,7 @@  int main(int argc, char *argv[])
 	orig_argv = argv;
 
 	while ((opt = getopt_long(argc, argv,
-				  "E:F:H::KNS:t:A:M:Q:q:T:RW:w:U",
+				  "E:F:H::KS:t:A:M:Q:q:T:RW:w:U",
 				  options, NULL)) != -1) {
 		switch (opt) {
 		case 'E':
@@ -2843,9 +2840,6 @@  int main(int argc, char *argv[])
 		case 'H':
 			usage();
 			return 0;
-		case 'N':
-			dofork = false;
-			break;
 		case 'R':
 			recovery = false;
 			break;
@@ -2911,18 +2905,13 @@  int main(int argc, char *argv[])
 	/* Errors ignored here, will be reported when we open files */
 	mkdir(xenstore_daemon_rundir(), 0755);
 
-	if (dofork) {
-		openlog("xenstored", 0, LOG_DAEMON);
-		if (!live_update)
-			daemonize();
-	}
+	openlog("xenstored", 0, LOG_DAEMON);
+	if (!live_update)
+		daemonize();
+
 	if (pidfile)
 		write_pidfile(pidfile);
 
-	/* Talloc leak reports go to stderr, which is closed if we fork. */
-	if (!dofork)
-		talloc_enable_leak_report_full();
-
 	/* Don't kill us with SIGPIPE. */
 	signal(SIGPIPE, SIG_IGN);
 
@@ -2942,11 +2931,10 @@  int main(int argc, char *argv[])
 	}
 
 	/* redirect to /dev/null now we're ready to accept connections */
-	if (dofork && !live_update)
+	if (!live_update)
 		finish_daemonize();
 #ifndef __MINIOS__
-	if (dofork)
-		xprintf = trace;
+	xprintf = trace;
 #endif
 
 	signal(SIGHUP, trigger_reopen_log);