diff mbox series

[V4,18/20] rv/monitor: Add safe watchdog monitor

Message ID 6366fdc89d65b8d9b14ccd1e42fa0d793fbe9f73.1655368610.git.bristot@kernel.org (mailing list archive)
State Superseded
Headers show
Series The Runtime Verification (RV) interface | expand

Commit Message

Daniel Bristot de Oliveira June 16, 2022, 8:45 a.m. UTC
The watchdog is an essential building block for the usage of Linux in
safety-critical systems because it allows the system to be monitored from
an external element - the watchdog hardware, acting as a safety-monitor.

A user-space application controls the watchdog device via the watchdog
interface. This application, hereafter safety_app, enables the watchdog
and periodically pets the watchdog upon correct completion of the safety
related processing.

If the safety_app, for any reason, stops pinging the watchdog,
the watchdog hardware can set the system in a fail-safe state. For
example, shutting the system down.

Given the importance of the safety_app / watchdog hardware couple,
the interaction between these software pieces also needs some
sort of monitoring. In other words, "who monitors the monitor?"

The safe watchdog (safe_wtd) RV monitor monitors the interaction between
the safety_app and the watchdog device, enforcing the correct sequence of
events that leads the system to a safe state.

Furthermore, the safety_app can monitor the RV monitor by collecting the
events generated by the RV monitor itself via tracing interface. In this way,
closing the monitoring loop with the safety_app.

To reach a safe state, the safe_wtd RV monitor requires the
safety_app to:

	- Open the watchdog device
	- Start the watchdog
	- Set a timeout
	- ping at least once

The RV monitor also avoids some undesired actions. For example, to have
other threads to touch the watchdog.

The monitor also has a set of options, enabled via kernel command
line/module options. They are:

	- watchdog_id: the device id to monitor (default 0).
	- dont_stop: once enabled, do not allow the RV monitor to be stopped
		(default off);
	- safe_timeout: define a maximum safe value that an user-space
		application can set as the watchdog timeout
		(default unlimited).
	- check_timeout: After every ping, check if the time left in the
		watchdog is less than or equal to the last timeout set
		for the watchdog. It only works for watchdog devices that
		provide the get_timeleft() function (default off).

For further information, please refer to:
	Documentation/trace/rv/watchdog-monitor.rst

The monitor specification was developed together with Gabriele Paoloni,
in the context of the Linux Foundation Elisa Project.

Cc: Wim Van Sebroeck <wim@linux-watchdog.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marco Elver <elver@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Gabriele Paoloni <gpaoloni@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-trace-devel@vger.kernel.org
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
---
 include/trace/events/rv.h                    |  11 +
 kernel/trace/rv/Kconfig                      |  10 +
 kernel/trace/rv/Makefile                     |   1 +
 kernel/trace/rv/monitors/safe_wtd/safe_wtd.c | 300 +++++++++++++++++++
 kernel/trace/rv/monitors/safe_wtd/safe_wtd.h |  84 ++++++
 5 files changed, 406 insertions(+)
 create mode 100644 kernel/trace/rv/monitors/safe_wtd/safe_wtd.c
 create mode 100644 kernel/trace/rv/monitors/safe_wtd/safe_wtd.h

Comments

Guenter Roeck June 16, 2022, 1:36 p.m. UTC | #1
On 6/16/22 01:45, Daniel Bristot de Oliveira wrote:
> The watchdog is an essential building block for the usage of Linux in
> safety-critical systems because it allows the system to be monitored from
> an external element - the watchdog hardware, acting as a safety-monitor.
> 
> A user-space application controls the watchdog device via the watchdog
> interface. This application, hereafter safety_app, enables the watchdog
> and periodically pets the watchdog upon correct completion of the safety
> related processing.
> 
> If the safety_app, for any reason, stops pinging the watchdog,
> the watchdog hardware can set the system in a fail-safe state. For
> example, shutting the system down.
> 
> Given the importance of the safety_app / watchdog hardware couple,
> the interaction between these software pieces also needs some
> sort of monitoring. In other words, "who monitors the monitor?"
> 
> The safe watchdog (safe_wtd) RV monitor monitors the interaction between
> the safety_app and the watchdog device, enforcing the correct sequence of
> events that leads the system to a safe state.
> 
> Furthermore, the safety_app can monitor the RV monitor by collecting the
> events generated by the RV monitor itself via tracing interface. In this way,
> closing the monitoring loop with the safety_app.
> 
> To reach a safe state, the safe_wtd RV monitor requires the
> safety_app to:
> 
> 	- Open the watchdog device
> 	- Start the watchdog
> 	- Set a timeout
> 	- ping at least once
> 
> The RV monitor also avoids some undesired actions. For example, to have
> other threads to touch the watchdog.
> 
> The monitor also has a set of options, enabled via kernel command
> line/module options. They are:
> 
> 	- watchdog_id: the device id to monitor (default 0).
> 	- dont_stop: once enabled, do not allow the RV monitor to be stopped
> 		(default off);
> 	- safe_timeout: define a maximum safe value that an user-space
> 		application can set as the watchdog timeout
> 		(default unlimited).
> 	- check_timeout: After every ping, check if the time left in the
> 		watchdog is less than or equal to the last timeout set
> 		for the watchdog. It only works for watchdog devices that
> 		provide the get_timeleft() function (default off).
> 
> For further information, please refer to:
> 	Documentation/trace/rv/watchdog-monitor.rst
> 
> The monitor specification was developed together with Gabriele Paoloni,
> in the context of the Linux Foundation Elisa Project.
> 
> Cc: Wim Van Sebroeck <wim@linux-watchdog.org>
> Cc: Guenter Roeck <linux@roeck-us.net>
> Cc: Jonathan Corbet <corbet@lwn.net>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Will Deacon <will@kernel.org>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Marco Elver <elver@google.com>
> Cc: Dmitry Vyukov <dvyukov@google.com>
> Cc: "Paul E. McKenney" <paulmck@kernel.org>
> Cc: Shuah Khan <skhan@linuxfoundation.org>
> Cc: Gabriele Paoloni <gpaoloni@redhat.com>
> Cc: Juri Lelli <juri.lelli@redhat.com>
> Cc: Clark Williams <williams@redhat.com>
> Cc: linux-doc@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Cc: linux-trace-devel@vger.kernel.org
> Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
> ---
>   include/trace/events/rv.h                    |  11 +
>   kernel/trace/rv/Kconfig                      |  10 +
>   kernel/trace/rv/Makefile                     |   1 +
>   kernel/trace/rv/monitors/safe_wtd/safe_wtd.c | 300 +++++++++++++++++++
>   kernel/trace/rv/monitors/safe_wtd/safe_wtd.h |  84 ++++++
>   5 files changed, 406 insertions(+)
>   create mode 100644 kernel/trace/rv/monitors/safe_wtd/safe_wtd.c
>   create mode 100644 kernel/trace/rv/monitors/safe_wtd/safe_wtd.h
> 
> diff --git a/include/trace/events/rv.h b/include/trace/events/rv.h
> index 00f11a8dac3b..895eb3435ed7 100644
> --- a/include/trace/events/rv.h
> +++ b/include/trace/events/rv.h
> @@ -66,6 +66,17 @@ DEFINE_EVENT(error_da_monitor, error_wip,
>   	     TP_PROTO(char *state, char *event),
>   	     TP_ARGS(state, event));
>   #endif /* CONFIG_RV_MON_WIP */
> +
> +#ifdef CONFIG_RV_MON_SAFE_WTD
> +DEFINE_EVENT(event_da_monitor, event_safe_wtd,
> +	     TP_PROTO(char *state, char *event, char *next_state, bool safe),
> +	     TP_ARGS(state, event, next_state, safe));
> +
> +DEFINE_EVENT(error_da_monitor, error_safe_wtd,
> +	     TP_PROTO(char *state, char *event),
> +	     TP_ARGS(state, event));
> +#endif /* CONFIG_RV_MON_SAFE_WTD */
> +
>   #endif /* CONFIG_DA_MON_EVENTS_IMPLICIT */
>   
>   #ifdef CONFIG_DA_MON_EVENTS_ID
> diff --git a/kernel/trace/rv/Kconfig b/kernel/trace/rv/Kconfig
> index 21f03fb3101a..b14ae63e792b 100644
> --- a/kernel/trace/rv/Kconfig
> +++ b/kernel/trace/rv/Kconfig
> @@ -45,6 +45,16 @@ config RV_MON_WWNR
>   	  illustrates the usage of per-task monitor. The model is
>   	  broken on purpose: it serves to test reactors.
>   
> +config RV_MON_SAFE_WTD
> +	select DA_MON_EVENTS_IMPLICIT
> +	bool "Safety watchdog"
> +	help
> +	  Enable safe_wtd, this monitor observes the interaction
> +	  between a user-space safety monitor and a watchdog device.
> +
> +	  For futher information see:
> +	    Documentation/trace/rv/safety-monitor.rst
> +
>   config RV_REACTORS
>   	bool "Runtime verification reactors"
>   	default y if RV
> diff --git a/kernel/trace/rv/Makefile b/kernel/trace/rv/Makefile
> index 963d14875b45..904db96c7eae 100644
> --- a/kernel/trace/rv/Makefile
> +++ b/kernel/trace/rv/Makefile
> @@ -3,6 +3,7 @@
>   obj-$(CONFIG_RV) += rv.o
>   obj-$(CONFIG_RV_MON_WIP) += monitors/wip/wip.o
>   obj-$(CONFIG_RV_MON_WWNR) += monitors/wwnr/wwnr.o
> +obj-$(CONFIG_RV_MON_SAFE_WTD) += monitors/safe_wtd/safe_wtd.o
>   obj-$(CONFIG_RV_REACTORS) += rv_reactors.o
>   obj-$(CONFIG_RV_REACT_PRINTK) += reactor_printk.o
>   obj-$(CONFIG_RV_REACT_PANIC) += reactor_panic.o
> diff --git a/kernel/trace/rv/monitors/safe_wtd/safe_wtd.c b/kernel/trace/rv/monitors/safe_wtd/safe_wtd.c
> new file mode 100644
> index 000000000000..9856e0770d0d
> --- /dev/null
> +++ b/kernel/trace/rv/monitors/safe_wtd/safe_wtd.c
> @@ -0,0 +1,300 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include <linux/ftrace.h>
> +#include <linux/tracepoint.h>
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <linux/init.h>
> +#include <linux/rv.h>
> +#include <rv/instrumentation.h>
> +#include <rv/da_monitor.h>
> +
> +#include <linux/watchdog.h>
> +#include <linux/moduleparam.h>
> +
> +#include <trace/events/rv.h>
> +#include <trace/events/watchdog.h>
> +
> +#define MODULE_NAME "safe_wtd"
> +
> +/*
> + * This is the self-generated part of the monitor. Generally, there is no need
> + * to touch this section.
> + */
> +#include "safe_wtd.h"
> +
> +/*
> + * Declare the deterministic automata monitor.
> + *
> + * The rv monitor reference is needed for the monitor declaration.
> + */
> +struct rv_monitor rv_safe_wtd;
> +DECLARE_DA_MON_GLOBAL(safe_wtd, char);
> +
> +/*
> + * custom: safe_timeout is the maximum value a watchdog monitor
> + * can set. This value is registered here to duplicate the information.
> + * In this way, a miss-behaving monitor can be detected.
> + */
> +static int safe_timeout = ~0;
> +module_param(safe_timeout, int, 0444);
> +
> +/*
> + * custom: if check_timeout is set, the monitor will check if the time left
> + * in the watchdog is less than or equals to the last safe timeout set by
> + * user-space. This check is done after each ping. In this way, if any
> + * code by-passed the watchdog dev interface setting a higher (so unsafe)
> + * timeout, this monitor will catch the side effect and react.
> + */
> +static int last_timeout_set = 0;
> +static int check_timeout = 0;
> +module_param(check_timeout, int, 0444);
> +
> +/*
> + * custom: if dont_stop is set the monitor will react if stopped.
> + */
> +static int dont_stop = 0;
> +module_param(dont_stop, int, 0444);
> +
> +/*
> + * custom: there are some states that are kept after the watchdog is closed.
> + * For example, the nowayout state.
> + *
> + * Thus, the RV monitor needs to keep track of these states after a start/stop
> + * of the RV monitor itself, and should not reset after each restart - keeping the
> + * know state until the system shutdown.
> + *
> + * If for an unknown reason an RV monitor would like to reset the RV monitor at each
> + * RV monitor start, set it to one.
> + */
> +static int reset_on_restart = 0;
> +module_param(reset_on_restart, int, 0444);
> +
> +/*
> + * open_pid takes note of the first thread that opened the watchdog.
> + *
> + * Any other thread that generates an event will cause an "other_threads"
> + * event in the monitor.
> + */
> +static int open_pid = 0;

Userspace could open a watchdog, create a child process, and handle it
from the child. That is perfectly valid.

> +
> +/*
> + * watchdog_id: the watchdog to monitor
> + */
> +static int watchdog_id = 0;
> +module_param(watchdog_id, int, 0444);

Limiting the watcher to a single watchdog sounds less than perfect.
What if the system supports more than one, more than one is enabled,
and the non-monitored watchdog misbehaves ?

> +
> +static void handle_nowayout(void *data, struct watchdog_device *wdd)
> +{
> +	if (wdd->id != watchdog_id)
> +		return;
> +
> +	da_handle_init_run_event_safe_wtd(nowayout_safe_wtd);
> +}
> +
> +static void handle_close(void *data, struct watchdog_device *wdd)
> +{
> +	if (wdd->id != watchdog_id)
> +		return;
> +
> +	if (open_pid && current->pid != open_pid) {
> +		da_handle_init_run_event_safe_wtd(other_threads_safe_wtd);
> +	} else {
> +		da_handle_event_safe_wtd(close_safe_wtd);
> +		open_pid = 0;
> +	}
> +}
> +
> +static void handle_open(void *data, struct watchdog_device *wdd)
> +{
> +	if (wdd->id != watchdog_id)
> +		return;
> +
> +	if (open_pid && current->pid != open_pid) {
> +		da_handle_init_run_event_safe_wtd(other_threads_safe_wtd);
> +	} else {
> +		da_handle_init_run_event_safe_wtd(open_safe_wtd);
> +		open_pid = current->pid;
> +	}
> +}
> +
> +static void blocked_events(void *data, struct watchdog_device *wdd)
> +{
> +	if (wdd->id != watchdog_id)
> +		return;
> +
> +	if (open_pid && current->pid != open_pid) {
> +		da_handle_init_run_event_safe_wtd(other_threads_safe_wtd);
> +		return;
> +	}
> +	da_handle_event_safe_wtd(other_threads_safe_wtd);
> +}
> +
> +static void blocked_events_timeout(void *data, struct watchdog_device *wdd, u64 timeout)
> +{
> +	blocked_events(data, wdd);
> +}
> +
> +static void handle_ping(void *data, struct watchdog_device *wdd)
> +{
> +	char msg[128];
> +	unsigned int timeout;
> +
> +	if (wdd->id != watchdog_id)
> +		return;
> +
> +	if (open_pid && current->pid != open_pid) {
> +		da_handle_init_run_event_safe_wtd(other_threads_safe_wtd);
> +		return;
> +	}
> +
> +	da_handle_event_safe_wtd(ping_safe_wtd);
> +
> +	if (!check_timeout)
> +		return;
> +
> +	if (wdd->ops->get_timeleft) {
> +		timeout = wdd->ops->get_timeleft(wdd);
> +		if (timeout > last_timeout_set) {
> +			snprintf(msg, 128,
> +				 "watchdog timeout is %u > than previously set (%d)\n",
> +				 timeout, last_timeout_set);
> +			cond_react(msg);
> +		}
> +	} else {
> +		snprintf(msg, 128, "error getting timeout: option not supported\n");

This is not an error. The get_timeleft callback is optional.

> +		cond_react(msg);
> +	}
> +}
> +
> +static void handle_set_safe_timeout(void *data, struct watchdog_device *wdd, u64 timeout)
> +{
> +	char msg[128];
> +
> +	if (wdd->id != watchdog_id)
> +		return;
> +
> +	if (open_pid && current->pid != open_pid) {
> +		da_handle_init_run_event_safe_wtd(other_threads_safe_wtd);
> +		return;
> +	}
> +
> +	da_handle_event_safe_wtd(set_safe_timeout_safe_wtd);
> +
> +	if (timeout > safe_timeout) {
> +		snprintf(msg, 128, "set safety timeout is too high: %d", (int) timeout);
> +		cond_react(msg);
> +	}
> +
> +	if (check_timeout)
> +		last_timeout_set = timeout;
> +}
> +
> +static void handle_start(void *data, struct watchdog_device *wdd)
> +{
> +	if (wdd->id != watchdog_id)
> +		return;
> +
> +	if (open_pid && current->pid != open_pid) {
> +		da_handle_init_run_event_safe_wtd(other_threads_safe_wtd);
> +		return;
> +	}
> +
> +	da_handle_event_safe_wtd(start_safe_wtd);
> +}
> +
> +static void handle_stop(void *data, struct watchdog_device *wdd)
> +{
> +	if (wdd->id != watchdog_id)
> +		return;
> +
> +	if (open_pid && current->pid != open_pid) {
> +		da_handle_init_run_event_safe_wtd(other_threads_safe_wtd);
> +		return;
> +	}
> +
> +	da_handle_event_safe_wtd(stop_safe_wtd);
> +}
> +
> +static int mon_started = 0;
> +
> +static int start_safe_wtd_monitor(void)
> +{
> +	int retval;
> +
> +	if (!mon_started || reset_on_restart) {
> +		retval = da_monitor_init_safe_wtd();
> +		if (retval)
> +			return retval;
> +
> +		mon_started = 1;
> +	}
> +
> +	rv_attach_trace_probe("safe_wtd", watchdog_close, handle_close);
> +	rv_attach_trace_probe("safe_wtd", watchdog_nowayout, handle_nowayout);
> +	rv_attach_trace_probe("safe_wtd", watchdog_open, handle_open);
> +	rv_attach_trace_probe("safe_wtd", watchdog_ping, handle_ping);
> +	rv_attach_trace_probe("safe_wtd", watchdog_set_timeout, handle_set_safe_timeout);
> +	rv_attach_trace_probe("safe_wtd", watchdog_start, handle_start);
> +	rv_attach_trace_probe("safe_wtd", watchdog_stop, handle_stop);
> +	rv_attach_trace_probe("safe_wtd", watchdog_set_keep_alive, blocked_events_timeout);
> +	rv_attach_trace_probe("safe_wtd", watchdog_keep_alive, blocked_events);
> +	rv_attach_trace_probe("safe_wtd", watchdog_set_pretimeout, blocked_events_timeout);
> +	rv_attach_trace_probe("safe_wtd", watchdog_pretimeout, blocked_events);
> +
> +	return 0;
> +}
> +
> +static void stop_safe_wtd_monitor(void)
> +{
> +	if (dont_stop)
> +		cond_react("dont_stop safe_wtd is set.");
> +
> +	rv_safe_wtd.enabled = 0;
> +
> +	rv_detach_trace_probe("safe_wtd", watchdog_close, handle_close);
> +	rv_detach_trace_probe("safe_wtd", watchdog_nowayout, handle_nowayout);
> +	rv_detach_trace_probe("safe_wtd", watchdog_open, handle_open);
> +	rv_detach_trace_probe("safe_wtd", watchdog_ping, handle_ping);
> +	rv_detach_trace_probe("safe_wtd", watchdog_set_timeout, handle_set_safe_timeout);
> +	rv_detach_trace_probe("safe_wtd", watchdog_start, handle_start);
> +	rv_detach_trace_probe("safe_wtd", watchdog_stop, handle_stop);
> +	rv_detach_trace_probe("safe_wtd", watchdog_set_keep_alive, blocked_events_timeout);
> +	rv_detach_trace_probe("safe_wtd", watchdog_keep_alive, blocked_events);
> +	rv_detach_trace_probe("safe_wtd", watchdog_set_pretimeout, blocked_events_timeout);
> +	rv_detach_trace_probe("safe_wtd", watchdog_pretimeout, blocked_events);
> +
> +	da_monitor_destroy_safe_wtd();
> +}
> +
> +/*
> + * This is the monitor register section.
> + */
> +struct rv_monitor rv_safe_wtd = {
> +	.name = "safe_wtd",
> +	.description = "A watchdog monitor guarding a safety monitor actions",
> +	.start = start_safe_wtd_monitor,
> +	.stop = stop_safe_wtd_monitor,
> +	.reset = da_monitor_reset_all_safe_wtd,
> +	.enabled = 0,
> +};
> +
> +int register_safe_wtd(void)
> +{
> +	rv_register_monitor(&rv_safe_wtd);
> +	return 0;
> +}
> +
> +void unregister_safe_wtd(void)
> +{
> +	if (rv_safe_wtd.enabled)
> +		stop_safe_wtd_monitor();
> +
> +	rv_unregister_monitor(&rv_safe_wtd);
> +}
> +
> +module_init(register_safe_wtd);
> +module_exit(unregister_safe_wtd);
> +
> +MODULE_LICENSE("GPL");
> +MODULE_AUTHOR("Daniel Bristot de Oliveira <bristot@kernel.org>");
> +MODULE_DESCRIPTION("Safe watchdog RV monitor");
> diff --git a/kernel/trace/rv/monitors/safe_wtd/safe_wtd.h b/kernel/trace/rv/monitors/safe_wtd/safe_wtd.h
> new file mode 100644
> index 000000000000..835c9d0979f6
> --- /dev/null
> +++ b/kernel/trace/rv/monitors/safe_wtd/safe_wtd.h
> @@ -0,0 +1,84 @@
> +enum states_safe_wtd {
> +	init_safe_wtd = 0,
> +	closed_running_safe_wtd,
> +	closed_running_nwo_safe_wtd,
> +	nwo_safe_wtd,
> +	opened_safe_wtd,
> +	opened_nwo_safe_wtd,
> +	reopened_safe_wtd,
> +	safe_safe_wtd,
> +	safe_nwo_safe_wtd,
> +	set_safe_wtd,
> +	set_nwo_safe_wtd,
> +	started_safe_wtd,
> +	started_nwo_safe_wtd,
> +	stoped_safe_wtd,
> +	state_max_safe_wtd
> +};
> +
> +enum events_safe_wtd {
> +	close_safe_wtd = 0,
> +	nowayout_safe_wtd,
> +	open_safe_wtd,
> +	other_threads_safe_wtd,
> +	ping_safe_wtd,
> +	set_safe_timeout_safe_wtd,
> +	start_safe_wtd,
> +	stop_safe_wtd,
> +	event_max_safe_wtd
> +};
> +
> +struct automaton_safe_wtd {
> +	char *state_names[state_max_safe_wtd];
> +	char *event_names[event_max_safe_wtd];
> +	char function[state_max_safe_wtd][event_max_safe_wtd];
> +	char initial_state;
> +	char final_states[state_max_safe_wtd];
> +};
> +
> +struct automaton_safe_wtd automaton_safe_wtd = {
> +	.state_names = {
> +		"init",
> +		"closed_running",
> +		"closed_running_nwo",
> +		"nwo",
> +		"opened",
> +		"opened_nwo",
> +		"reopened",
> +		"safe",
> +		"safe_nwo",
> +		"set",
> +		"set_nwo",
> +		"started",
> +		"started_nwo",
> +		"stoped"
> +	},
> +	.event_names = {
> +		"close",
> +		"nowayout",
> +		"open",
> +		"other_threads",
> +		"ping",
> +		"set_safe_timeout",
> +		"start",
> +		"stop"
> +	},
> +	.function = {
> +		{                          -1,                nwo_safe_wtd,             opened_safe_wtd,               init_safe_wtd,                          -1,                          -1,                          -1,                          -1 },
> +		{                          -1, closed_running_nwo_safe_wtd,           reopened_safe_wtd,     closed_running_safe_wtd,                          -1,                          -1,                          -1,                          -1 },
> +		{                          -1, closed_running_nwo_safe_wtd,        started_nwo_safe_wtd, closed_running_nwo_safe_wtd,                          -1,                          -1,                          -1,                          -1 },
> +		{                          -1,                nwo_safe_wtd,         opened_nwo_safe_wtd,                nwo_safe_wtd,                          -1,                          -1,                          -1,                          -1 },
> +		{               init_safe_wtd,                          -1,                          -1,                          -1,                          -1,                          -1,            started_safe_wtd,                          -1 },
> +		{                nwo_safe_wtd,                          -1,                          -1,                          -1,                          -1,                          -1,        started_nwo_safe_wtd,                          -1 },
> +		{     closed_running_safe_wtd,                          -1,                          -1,                          -1,                          -1,                set_safe_wtd,                          -1,             opened_safe_wtd },
> +		{     closed_running_safe_wtd,                          -1,                          -1,                          -1,               safe_safe_wtd,                          -1,                          -1,             stoped_safe_wtd },
> +		{ closed_running_nwo_safe_wtd,                          -1,                          -1,                          -1,           safe_nwo_safe_wtd,                          -1,                          -1,                          -1 },
> +		{                          -1,                          -1,                          -1,                          -1,               safe_safe_wtd,                          -1,                          -1,                          -1 },
> +		{                          -1,                          -1,                          -1,                          -1,           safe_nwo_safe_wtd,                          -1,                          -1,                          -1 },
> +		{     closed_running_safe_wtd,                          -1,                          -1,                          -1,                          -1,                set_safe_wtd,                          -1,             stoped_safe_wtd },
> +		{ closed_running_nwo_safe_wtd,                          -1,                          -1,                          -1,                          -1,            set_nwo_safe_wtd,                          -1,                          -1 },
> +		{               init_safe_wtd,                          -1,                          -1,                          -1,                          -1,                          -1,                          -1,                          -1 },
> +	},
> +	.initial_state = init_safe_wtd,
> +	.final_states = { 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 },

I find this event table all but impossible to verify.


> +};
Daniel Bristot de Oliveira June 16, 2022, 3:29 p.m. UTC | #2
Hi Guenter,

On 6/16/22 15:36, Guenter Roeck wrote:
> On 6/16/22 01:45, Daniel Bristot de Oliveira wrote:
>> The watchdog is an essential building block for the usage of Linux in
>> safety-critical systems because it allows the system to be monitored from
>> an external element - the watchdog hardware, acting as a safety-monitor.
>>
>> A user-space application controls the watchdog device via the watchdog
>> interface. This application, hereafter safety_app, enables the watchdog
>> and periodically pets the watchdog upon correct completion of the safety
>> related processing.
>>
>> If the safety_app, for any reason, stops pinging the watchdog,
>> the watchdog hardware can set the system in a fail-safe state. For
>> example, shutting the system down.
>>
>> Given the importance of the safety_app / watchdog hardware couple,
>> the interaction between these software pieces also needs some
>> sort of monitoring. In other words, "who monitors the monitor?"
>>
>> The safe watchdog (safe_wtd) RV monitor monitors the interaction between
>> the safety_app and the watchdog device, enforcing the correct sequence of
>> events that leads the system to a safe state.
>>
>> Furthermore, the safety_app can monitor the RV monitor by collecting the
>> events generated by the RV monitor itself via tracing interface. In this way,
>> closing the monitoring loop with the safety_app.
>>
>> To reach a safe state, the safe_wtd RV monitor requires the
>> safety_app to:
>>
>>     - Open the watchdog device
>>     - Start the watchdog
>>     - Set a timeout
>>     - ping at least once
>>
>> The RV monitor also avoids some undesired actions. For example, to have
>> other threads to touch the watchdog.
>>
>> The monitor also has a set of options, enabled via kernel command
>> line/module options. They are:
>>
>>     - watchdog_id: the device id to monitor (default 0).
>>     - dont_stop: once enabled, do not allow the RV monitor to be stopped
>>         (default off);
>>     - safe_timeout: define a maximum safe value that an user-space
>>         application can set as the watchdog timeout
>>         (default unlimited).
>>     - check_timeout: After every ping, check if the time left in the
>>         watchdog is less than or equal to the last timeout set
>>         for the watchdog. It only works for watchdog devices that
>>         provide the get_timeleft() function (default off).
>>
>> For further information, please refer to:
>>     Documentation/trace/rv/watchdog-monitor.rst
>>
>> The monitor specification was developed together with Gabriele Paoloni,
>> in the context of the Linux Foundation Elisa Project.
>>
>> Cc: Wim Van Sebroeck <wim@linux-watchdog.org>
>> Cc: Guenter Roeck <linux@roeck-us.net>
>> Cc: Jonathan Corbet <corbet@lwn.net>
>> Cc: Steven Rostedt <rostedt@goodmis.org>
>> Cc: Ingo Molnar <mingo@redhat.com>
>> Cc: Thomas Gleixner <tglx@linutronix.de>
>> Cc: Peter Zijlstra <peterz@infradead.org>
>> Cc: Will Deacon <will@kernel.org>
>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>> Cc: Marco Elver <elver@google.com>
>> Cc: Dmitry Vyukov <dvyukov@google.com>
>> Cc: "Paul E. McKenney" <paulmck@kernel.org>
>> Cc: Shuah Khan <skhan@linuxfoundation.org>
>> Cc: Gabriele Paoloni <gpaoloni@redhat.com>
>> Cc: Juri Lelli <juri.lelli@redhat.com>
>> Cc: Clark Williams <williams@redhat.com>
>> Cc: linux-doc@vger.kernel.org
>> Cc: linux-kernel@vger.kernel.org
>> Cc: linux-trace-devel@vger.kernel.org
>> Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
>> ---
>>   include/trace/events/rv.h                    |  11 +
>>   kernel/trace/rv/Kconfig                      |  10 +
>>   kernel/trace/rv/Makefile                     |   1 +
>>   kernel/trace/rv/monitors/safe_wtd/safe_wtd.c | 300 +++++++++++++++++++
>>   kernel/trace/rv/monitors/safe_wtd/safe_wtd.h |  84 ++++++
>>   5 files changed, 406 insertions(+)
>>   create mode 100644 kernel/trace/rv/monitors/safe_wtd/safe_wtd.c
>>   create mode 100644 kernel/trace/rv/monitors/safe_wtd/safe_wtd.h
>>
>> diff --git a/include/trace/events/rv.h b/include/trace/events/rv.h
>> index 00f11a8dac3b..895eb3435ed7 100644
>> --- a/include/trace/events/rv.h
>> +++ b/include/trace/events/rv.h
>> @@ -66,6 +66,17 @@ DEFINE_EVENT(error_da_monitor, error_wip,
>>            TP_PROTO(char *state, char *event),
>>            TP_ARGS(state, event));
>>   #endif /* CONFIG_RV_MON_WIP */
>> +
>> +#ifdef CONFIG_RV_MON_SAFE_WTD
>> +DEFINE_EVENT(event_da_monitor, event_safe_wtd,
>> +         TP_PROTO(char *state, char *event, char *next_state, bool safe),
>> +         TP_ARGS(state, event, next_state, safe));
>> +
>> +DEFINE_EVENT(error_da_monitor, error_safe_wtd,
>> +         TP_PROTO(char *state, char *event),
>> +         TP_ARGS(state, event));
>> +#endif /* CONFIG_RV_MON_SAFE_WTD */
>> +
>>   #endif /* CONFIG_DA_MON_EVENTS_IMPLICIT */
>>     #ifdef CONFIG_DA_MON_EVENTS_ID
>> diff --git a/kernel/trace/rv/Kconfig b/kernel/trace/rv/Kconfig
>> index 21f03fb3101a..b14ae63e792b 100644
>> --- a/kernel/trace/rv/Kconfig
>> +++ b/kernel/trace/rv/Kconfig
>> @@ -45,6 +45,16 @@ config RV_MON_WWNR
>>         illustrates the usage of per-task monitor. The model is
>>         broken on purpose: it serves to test reactors.
>>   +config RV_MON_SAFE_WTD
>> +    select DA_MON_EVENTS_IMPLICIT
>> +    bool "Safety watchdog"
>> +    help
>> +      Enable safe_wtd, this monitor observes the interaction
>> +      between a user-space safety monitor and a watchdog device.
>> +
>> +      For futher information see:
>> +        Documentation/trace/rv/safety-monitor.rst
>> +
>>   config RV_REACTORS
>>       bool "Runtime verification reactors"
>>       default y if RV
>> diff --git a/kernel/trace/rv/Makefile b/kernel/trace/rv/Makefile
>> index 963d14875b45..904db96c7eae 100644
>> --- a/kernel/trace/rv/Makefile
>> +++ b/kernel/trace/rv/Makefile
>> @@ -3,6 +3,7 @@
>>   obj-$(CONFIG_RV) += rv.o
>>   obj-$(CONFIG_RV_MON_WIP) += monitors/wip/wip.o
>>   obj-$(CONFIG_RV_MON_WWNR) += monitors/wwnr/wwnr.o
>> +obj-$(CONFIG_RV_MON_SAFE_WTD) += monitors/safe_wtd/safe_wtd.o
>>   obj-$(CONFIG_RV_REACTORS) += rv_reactors.o
>>   obj-$(CONFIG_RV_REACT_PRINTK) += reactor_printk.o
>>   obj-$(CONFIG_RV_REACT_PANIC) += reactor_panic.o
>> diff --git a/kernel/trace/rv/monitors/safe_wtd/safe_wtd.c
>> b/kernel/trace/rv/monitors/safe_wtd/safe_wtd.c
>> new file mode 100644
>> index 000000000000..9856e0770d0d
>> --- /dev/null
>> +++ b/kernel/trace/rv/monitors/safe_wtd/safe_wtd.c
>> @@ -0,0 +1,300 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +#include <linux/ftrace.h>
>> +#include <linux/tracepoint.h>
>> +#include <linux/kernel.h>
>> +#include <linux/module.h>
>> +#include <linux/init.h>
>> +#include <linux/rv.h>
>> +#include <rv/instrumentation.h>
>> +#include <rv/da_monitor.h>
>> +
>> +#include <linux/watchdog.h>
>> +#include <linux/moduleparam.h>
>> +
>> +#include <trace/events/rv.h>
>> +#include <trace/events/watchdog.h>
>> +
>> +#define MODULE_NAME "safe_wtd"
>> +
>> +/*
>> + * This is the self-generated part of the monitor. Generally, there is no need
>> + * to touch this section.
>> + */
>> +#include "safe_wtd.h"
>> +
>> +/*
>> + * Declare the deterministic automata monitor.
>> + *
>> + * The rv monitor reference is needed for the monitor declaration.
>> + */
>> +struct rv_monitor rv_safe_wtd;
>> +DECLARE_DA_MON_GLOBAL(safe_wtd, char);
>> +
>> +/*
>> + * custom: safe_timeout is the maximum value a watchdog monitor
>> + * can set. This value is registered here to duplicate the information.
>> + * In this way, a miss-behaving monitor can be detected.
>> + */
>> +static int safe_timeout = ~0;
>> +module_param(safe_timeout, int, 0444);
>> +
>> +/*
>> + * custom: if check_timeout is set, the monitor will check if the time left
>> + * in the watchdog is less than or equals to the last safe timeout set by
>> + * user-space. This check is done after each ping. In this way, if any
>> + * code by-passed the watchdog dev interface setting a higher (so unsafe)
>> + * timeout, this monitor will catch the side effect and react.
>> + */
>> +static int last_timeout_set = 0;
>> +static int check_timeout = 0;
>> +module_param(check_timeout, int, 0444);
>> +
>> +/*
>> + * custom: if dont_stop is set the monitor will react if stopped.
>> + */
>> +static int dont_stop = 0;
>> +module_param(dont_stop, int, 0444);
>> +
>> +/*
>> + * custom: there are some states that are kept after the watchdog is closed.
>> + * For example, the nowayout state.
>> + *
>> + * Thus, the RV monitor needs to keep track of these states after a start/stop
>> + * of the RV monitor itself, and should not reset after each restart -
>> keeping the
>> + * know state until the system shutdown.
>> + *
>> + * If for an unknown reason an RV monitor would like to reset the RV monitor
>> at each
>> + * RV monitor start, set it to one.
>> + */
>> +static int reset_on_restart = 0;
>> +module_param(reset_on_restart, int, 0444);
>> +
>> +/*
>> + * open_pid takes note of the first thread that opened the watchdog.
>> + *
>> + * Any other thread that generates an event will cause an "other_threads"
>> + * event in the monitor.
>> + */
>> +static int open_pid = 0;
> 
> Userspace could open a watchdog, create a child process, and handle it
> from the child. That is perfectly valid.

Right! It is a correct usage of the watchdog subsystem.

However, the idea here is to allow a "restricted" set of operations based on the
safety analysis made by people in the LF Elisa Workgroup (Gabriele Paoloni in Cc:).

One of the specifications says that: only one process should touch the watchdog.

There are details about it in the "watchdog-monitor.rst," section "RV monitor
specification."

There could be another monitor, a less resticted one, in which the operation you
mention would be allowed.

I will complement this commit log in the next version of the patch set,
clarifying that it is not a "full representation of the watchdog operations" but
a restricted set of operations specified by...

>> +
>> +/*
>> + * watchdog_id: the watchdog to monitor
>> + */
>> +static int watchdog_id = 0;
>> +module_param(watchdog_id, int, 0444);
> 
> Limiting the watcher to a single watchdog sounds less than perfect.
> What if the system supports more than one, more than one is enabled,
> and the non-monitored watchdog misbehaves ?

I can add one monitor per watchdog dev. The easiest way would be adding a
"struct da_monitor" variable in the watchdog_device structure, e.g.,

struct watchdog_device {
...
	#ifdef CONFIG_RV_MON_SAFE_WTD
	struct da_monitor da_mon;
	#endif
...
}

A simplified version of the the "per task" monitor, in the patch 01, changes in
include/linux/sched.h.

>> +
>> +static void handle_nowayout(void *data, struct watchdog_device *wdd)
>> +{
>> +    if (wdd->id != watchdog_id)
>> +        return;
>> +
>> +    da_handle_init_run_event_safe_wtd(nowayout_safe_wtd);
>> +}
>> +
>> +static void handle_close(void *data, struct watchdog_device *wdd)
>> +{
>> +    if (wdd->id != watchdog_id)
>> +        return;
>> +
>> +    if (open_pid && current->pid != open_pid) {
>> +        da_handle_init_run_event_safe_wtd(other_threads_safe_wtd);
>> +    } else {
>> +        da_handle_event_safe_wtd(close_safe_wtd);
>> +        open_pid = 0;
>> +    }
>> +}
>> +
>> +static void handle_open(void *data, struct watchdog_device *wdd)
>> +{
>> +    if (wdd->id != watchdog_id)
>> +        return;
>> +
>> +    if (open_pid && current->pid != open_pid) {
>> +        da_handle_init_run_event_safe_wtd(other_threads_safe_wtd);
>> +    } else {
>> +        da_handle_init_run_event_safe_wtd(open_safe_wtd);
>> +        open_pid = current->pid;
>> +    }
>> +}
>> +
>> +static void blocked_events(void *data, struct watchdog_device *wdd)
>> +{
>> +    if (wdd->id != watchdog_id)
>> +        return;
>> +
>> +    if (open_pid && current->pid != open_pid) {
>> +        da_handle_init_run_event_safe_wtd(other_threads_safe_wtd);
>> +        return;
>> +    }
>> +    da_handle_event_safe_wtd(other_threads_safe_wtd);
>> +}
>> +
>> +static void blocked_events_timeout(void *data, struct watchdog_device *wdd,
>> u64 timeout)
>> +{
>> +    blocked_events(data, wdd);
>> +}
>> +
>> +static void handle_ping(void *data, struct watchdog_device *wdd)
>> +{
>> +    char msg[128];
>> +    unsigned int timeout;
>> +
>> +    if (wdd->id != watchdog_id)
>> +        return;
>> +
>> +    if (open_pid && current->pid != open_pid) {
>> +        da_handle_init_run_event_safe_wtd(other_threads_safe_wtd);
>> +        return;
>> +    }
>> +
>> +    da_handle_event_safe_wtd(ping_safe_wtd);
>> +
>> +    if (!check_timeout)
>> +        return;
>> +
>> +    if (wdd->ops->get_timeleft) {
>> +        timeout = wdd->ops->get_timeleft(wdd);
>> +        if (timeout > last_timeout_set) {
>> +            snprintf(msg, 128,
>> +                 "watchdog timeout is %u > than previously set (%d)\n",
>> +                 timeout, last_timeout_set);
>> +            cond_react(msg);
>> +        }
>> +    } else {
>> +        snprintf(msg, 128, "error getting timeout: option not supported\n");
> 
> This is not an error. The get_timeleft callback is optional.

Right... but this part of the code is only reachable if the user explicitly
asked to check the timeout (if (!check_timeout)...return before this code).

So, if the user only considers the system safe if the monitor also checks the
written timeout, but the watchdog is one of those that do not have the callback
implemented (which is ok for a Linux watchdog), the monitor captures this
"undesired" behavior.

This monitor is not checking if the watchdog subsystem is correct at its
plenitude, it is checking if the watchdog usage is following a set of
specifications (raised by people in the LF Elisa workgroup).

>> +        cond_react(msg);
>> +    }
>> +}
>> +

[...]

>> +
>> +struct automaton_safe_wtd automaton_safe_wtd = {
>> +    .state_names = {
>> +        "init",
>> +        "closed_running",
>> +        "closed_running_nwo",
>> +        "nwo",
>> +        "opened",
>> +        "opened_nwo",
>> +        "reopened",
>> +        "safe",
>> +        "safe_nwo",
>> +        "set",
>> +        "set_nwo",
>> +        "started",
>> +        "started_nwo",
>> +        "stoped"
>> +    },
>> +    .event_names = {
>> +        "close",
>> +        "nowayout",
>> +        "open",
>> +        "other_threads",
>> +        "ping",
>> +        "set_safe_timeout",
>> +        "start",
>> +        "stop"
>> +    },
>> +    .function = {
>> +        {                          -1,                nwo_safe_wtd,             opened_safe_wtd,               init_safe_wtd,                          -1,                          -1,                         -1,                          -1 },
>> +        {                          -1, closed_running_nwo_safe_wtd,           reopened_safe_wtd,     closed_running_safe_wtd,                          -1,                          -1,                         -1,                          -1 },
>> +        {                          -1, closed_running_nwo_safe_wtd,        started_nwo_safe_wtd, closed_running_nwo_safe_wtd,                          -1,                          -1,                         -1,                          -1 },
>> +        {                          -1,                nwo_safe_wtd,         opened_nwo_safe_wtd,                nwo_safe_wtd,                          -1,                          -1,                         -1,                          -1 },
>> +        {               init_safe_wtd,                          -1,                          -1,                          -1,                          -1,                          -1,           started_safe_wtd,                          -1 },
>> +        {                nwo_safe_wtd,                          -1,                          -1,                          -1,                          -1,                          -1,       started_nwo_safe_wtd,                          -1 },
>> +        {     closed_running_safe_wtd,                          -1,                          -1,                          -1,                          -1,                set_safe_wtd,                          1,             opened_safe_wtd },
>> +        {     closed_running_safe_wtd,                          -1,                          -1,                          -1,               safe_safe_wtd,                          -1,                          1,             stoped_safe_wtd },
>> +        { closed_running_nwo_safe_wtd,                          -1,                          -1,                          -1,           safe_nwo_safe_wtd,                          -1,                         -1,                          -1 },
>> +        {                          -1,                          -1,                          -1,                          -1,               safe_safe_wtd,                          -1,                         -1,                          -1 },
>> +        {                          -1,                          -1,                          -1,                          -1,           safe_nwo_safe_wtd,                          -1,                         -1,                          -1 },
>> +        {     closed_running_safe_wtd,                          -1,                          -1,                          -1,                          -1,                set_safe_wtd,                         -1,             stoped_safe_wtd },
>> +        { closed_running_nwo_safe_wtd,                          -1,                          -1,                          -1,                          -1,            set_nwo_safe_wtd,                         -1,                          -1 },
>> +        {               init_safe_wtd,                          -1,                          -1,                          -1,                          -1,                          -1,                         -1,                          -1 },
>> +    },
>> +    .initial_state = init_safe_wtd,
>> +    .final_states = { 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 },
> 
> I find this event table all but impossible to verify.

It is a matrix. Lines are states, and columns are events.

On a given state/line, receiving a given event/column, the data is the next
state/row.

For instance, let's say "init" (row 0), event "nwo" (column 1), and the next
state is the "nwo" (row 3).

-1 means invalid/blocked state (yeah, maybe it is better to have an #define
INVALID_STATE -1).

This is the C representation of an automaton, following the formal definition of
a deterministic automaton. I've added an explanation of this representation in
the documentation (patch 15, file da_monitor_synthesis.rst).

A deeper look into this subject is here (peer-reviewed conference paper at
Software Engineer and Formal Methods 2019):
https://bristot.me/wp-content/uploads/2019/09/paper.pdf

One could translate it back to the automaton's graphical format... to a format
of by a tool used to analyze automaton properties... that is the good point of
using a well-established formalism. (The bad part is that they are often
boring... c'est la vie :-)).

-- Daniel
> 
>> +};
>
Randy Dunlap June 16, 2022, 8:57 p.m. UTC | #3
Hi--

On 6/16/22 01:45, Daniel Bristot de Oliveira wrote:
> diff --git a/kernel/trace/rv/Kconfig b/kernel/trace/rv/Kconfig
> index 21f03fb3101a..b14ae63e792b 100644
> --- a/kernel/trace/rv/Kconfig
> +++ b/kernel/trace/rv/Kconfig
> @@ -45,6 +45,16 @@ config RV_MON_WWNR
>  	  illustrates the usage of per-task monitor. The model is
>  	  broken on purpose: it serves to test reactors.
>  
> +config RV_MON_SAFE_WTD
> +	select DA_MON_EVENTS_IMPLICIT
> +	bool "Safety watchdog"
> +	help
> +	  Enable safe_wtd, this monitor observes the interaction
> +	  between a user-space safety monitor and a watchdog device.
> +
> +	  For futher information see:
> +	    Documentation/trace/rv/safety-monitor.rst

I'm curious about what "WTD" means.

I see lots of WDT in drivers/watchdog/Kconfig
(where it means WatchDog Timer AFAIK).

thanks.
Guenter Roeck June 16, 2022, 11:53 p.m. UTC | #4
On 6/16/22 08:43, Gabriele Paoloni wrote:
> 
> 
> On Thu, Jun 16, 2022 at 5:29 PM Daniel Bristot de Oliveira <bristot@kernel.org <mailto:bristot@kernel.org>> wrote:
> 
>     Hi Guenter,
> 
>     On 6/16/22 15:36, Guenter Roeck wrote:
>      > On 6/16/22 01:45, Daniel Bristot de Oliveira wrote:
>      >> The watchdog is an essential building block for the usage of Linux in
>      >> safety-critical systems because it allows the system to be monitored from
>      >> an external element - the watchdog hardware, acting as a safety-monitor.
>      >>
>      >> A user-space application controls the watchdog device via the watchdog
>      >> interface. This application, hereafter safety_app, enables the watchdog
>      >> and periodically pets the watchdog upon correct completion of the safety
>      >> related processing.
>      >>
>      >> If the safety_app, for any reason, stops pinging the watchdog,
>      >> the watchdog hardware can set the system in a fail-safe state. For
>      >> example, shutting the system down.
>      >>
>      >> Given the importance of the safety_app / watchdog hardware couple,
>      >> the interaction between these software pieces also needs some
>      >> sort of monitoring. In other words, "who monitors the monitor?"
>      >>
>      >> The safe watchdog (safe_wtd) RV monitor monitors the interaction between
>      >> the safety_app and the watchdog device, enforcing the correct sequence of
>      >> events that leads the system to a safe state.
>      >>
>      >> Furthermore, the safety_app can monitor the RV monitor by collecting the
>      >> events generated by the RV monitor itself via tracing interface. In this way,
>      >> closing the monitoring loop with the safety_app.
>      >>
>      >> To reach a safe state, the safe_wtd RV monitor requires the
>      >> safety_app to:
>      >>
>      >>     - Open the watchdog device
>      >>     - Start the watchdog
>      >>     - Set a timeout
>      >>     - ping at least once
>      >>
>      >> The RV monitor also avoids some undesired actions. For example, to have
>      >> other threads to touch the watchdog.
>      >>
>      >> The monitor also has a set of options, enabled via kernel command
>      >> line/module options. They are:
>      >>
>      >>     - watchdog_id: the device id to monitor (default 0).
>      >>     - dont_stop: once enabled, do not allow the RV monitor to be stopped
>      >>         (default off);
>      >>     - safe_timeout: define a maximum safe value that an user-space
>      >>         application can set as the watchdog timeout
>      >>         (default unlimited).
>      >>     - check_timeout: After every ping, check if the time left in the
>      >>         watchdog is less than or equal to the last timeout set
>      >>         for the watchdog. It only works for watchdog devices that
>      >>         provide the get_timeleft() function (default off).
>      >>
>      >> For further information, please refer to:
>      >>     Documentation/trace/rv/watchdog-monitor.rst
>      >>
>      >> The monitor specification was developed together with Gabriele Paoloni,
>      >> in the context of the Linux Foundation Elisa Project.
>      >>
>      >> Cc: Wim Van Sebroeck <wim@linux-watchdog.org <mailto:wim@linux-watchdog.org>>
>      >> Cc: Guenter Roeck <linux@roeck-us.net <mailto:linux@roeck-us.net>>
>      >> Cc: Jonathan Corbet <corbet@lwn.net <mailto:corbet@lwn.net>>
>      >> Cc: Steven Rostedt <rostedt@goodmis.org <mailto:rostedt@goodmis.org>>
>      >> Cc: Ingo Molnar <mingo@redhat.com <mailto:mingo@redhat.com>>
>      >> Cc: Thomas Gleixner <tglx@linutronix.de <mailto:tglx@linutronix.de>>
>      >> Cc: Peter Zijlstra <peterz@infradead.org <mailto:peterz@infradead.org>>
>      >> Cc: Will Deacon <will@kernel.org <mailto:will@kernel.org>>
>      >> Cc: Catalin Marinas <catalin.marinas@arm.com <mailto:catalin.marinas@arm.com>>
>      >> Cc: Marco Elver <elver@google.com <mailto:elver@google.com>>
>      >> Cc: Dmitry Vyukov <dvyukov@google.com <mailto:dvyukov@google.com>>
>      >> Cc: "Paul E. McKenney" <paulmck@kernel.org <mailto:paulmck@kernel.org>>
>      >> Cc: Shuah Khan <skhan@linuxfoundation.org <mailto:skhan@linuxfoundation.org>>
>      >> Cc: Gabriele Paoloni <gpaoloni@redhat.com <mailto:gpaoloni@redhat.com>>
>      >> Cc: Juri Lelli <juri.lelli@redhat.com <mailto:juri.lelli@redhat.com>>
>      >> Cc: Clark Williams <williams@redhat.com <mailto:williams@redhat.com>>
>      >> Cc: linux-doc@vger.kernel.org <mailto:linux-doc@vger.kernel.org>
>      >> Cc: linux-kernel@vger.kernel.org <mailto:linux-kernel@vger.kernel.org>
>      >> Cc: linux-trace-devel@vger.kernel.org <mailto:linux-trace-devel@vger.kernel.org>
>      >> Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org <mailto:bristot@kernel.org>>
>      >> ---
>      >>   include/trace/events/rv.h                    |  11 +
>      >>   kernel/trace/rv/Kconfig                      |  10 +
>      >>   kernel/trace/rv/Makefile                     |   1 +
>      >>   kernel/trace/rv/monitors/safe_wtd/safe_wtd.c | 300 +++++++++++++++++++
>      >>   kernel/trace/rv/monitors/safe_wtd/safe_wtd.h |  84 ++++++
>      >>   5 files changed, 406 insertions(+)
>      >>   create mode 100644 kernel/trace/rv/monitors/safe_wtd/safe_wtd.c
>      >>   create mode 100644 kernel/trace/rv/monitors/safe_wtd/safe_wtd.h
>      >>
>      >> diff --git a/include/trace/events/rv.h b/include/trace/events/rv.h
>      >> index 00f11a8dac3b..895eb3435ed7 100644
>      >> --- a/include/trace/events/rv.h
>      >> +++ b/include/trace/events/rv.h
>      >> @@ -66,6 +66,17 @@ DEFINE_EVENT(error_da_monitor, error_wip,
>      >>            TP_PROTO(char *state, char *event),
>      >>            TP_ARGS(state, event));
>      >>   #endif /* CONFIG_RV_MON_WIP */
>      >> +
>      >> +#ifdef CONFIG_RV_MON_SAFE_WTD
>      >> +DEFINE_EVENT(event_da_monitor, event_safe_wtd,
>      >> +         TP_PROTO(char *state, char *event, char *next_state, bool safe),
>      >> +         TP_ARGS(state, event, next_state, safe));
>      >> +
>      >> +DEFINE_EVENT(error_da_monitor, error_safe_wtd,
>      >> +         TP_PROTO(char *state, char *event),
>      >> +         TP_ARGS(state, event));
>      >> +#endif /* CONFIG_RV_MON_SAFE_WTD */
>      >> +
>      >>   #endif /* CONFIG_DA_MON_EVENTS_IMPLICIT */
>      >>     #ifdef CONFIG_DA_MON_EVENTS_ID
>      >> diff --git a/kernel/trace/rv/Kconfig b/kernel/trace/rv/Kconfig
>      >> index 21f03fb3101a..b14ae63e792b 100644
>      >> --- a/kernel/trace/rv/Kconfig
>      >> +++ b/kernel/trace/rv/Kconfig
>      >> @@ -45,6 +45,16 @@ config RV_MON_WWNR
>      >>         illustrates the usage of per-task monitor. The model is
>      >>         broken on purpose: it serves to test reactors.
>      >>   +config RV_MON_SAFE_WTD
>      >> +    select DA_MON_EVENTS_IMPLICIT
>      >> +    bool "Safety watchdog"
>      >> +    help
>      >> +      Enable safe_wtd, this monitor observes the interaction
>      >> +      between a user-space safety monitor and a watchdog device.
>      >> +
>      >> +      For futher information see:
>      >> +        Documentation/trace/rv/safety-monitor.rst
>      >> +
>      >>   config RV_REACTORS
>      >>       bool "Runtime verification reactors"
>      >>       default y if RV
>      >> diff --git a/kernel/trace/rv/Makefile b/kernel/trace/rv/Makefile
>      >> index 963d14875b45..904db96c7eae 100644
>      >> --- a/kernel/trace/rv/Makefile
>      >> +++ b/kernel/trace/rv/Makefile
>      >> @@ -3,6 +3,7 @@
>      >>   obj-$(CONFIG_RV) += rv.o
>      >>   obj-$(CONFIG_RV_MON_WIP) += monitors/wip/wip.o
>      >>   obj-$(CONFIG_RV_MON_WWNR) += monitors/wwnr/wwnr.o
>      >> +obj-$(CONFIG_RV_MON_SAFE_WTD) += monitors/safe_wtd/safe_wtd.o
>      >>   obj-$(CONFIG_RV_REACTORS) += rv_reactors.o
>      >>   obj-$(CONFIG_RV_REACT_PRINTK) += reactor_printk.o
>      >>   obj-$(CONFIG_RV_REACT_PANIC) += reactor_panic.o
>      >> diff --git a/kernel/trace/rv/monitors/safe_wtd/safe_wtd.c
>      >> b/kernel/trace/rv/monitors/safe_wtd/safe_wtd.c
>      >> new file mode 100644
>      >> index 000000000000..9856e0770d0d
>      >> --- /dev/null
>      >> +++ b/kernel/trace/rv/monitors/safe_wtd/safe_wtd.c
>      >> @@ -0,0 +1,300 @@
>      >> +// SPDX-License-Identifier: GPL-2.0
>      >> +#include <linux/ftrace.h>
>      >> +#include <linux/tracepoint.h>
>      >> +#include <linux/kernel.h>
>      >> +#include <linux/module.h>
>      >> +#include <linux/init.h>
>      >> +#include <linux/rv.h>
>      >> +#include <rv/instrumentation.h>
>      >> +#include <rv/da_monitor.h>
>      >> +
>      >> +#include <linux/watchdog.h>
>      >> +#include <linux/moduleparam.h>
>      >> +
>      >> +#include <trace/events/rv.h>
>      >> +#include <trace/events/watchdog.h>
>      >> +
>      >> +#define MODULE_NAME "safe_wtd"
>      >> +
>      >> +/*
>      >> + * This is the self-generated part of the monitor. Generally, there is no need
>      >> + * to touch this section.
>      >> + */
>      >> +#include "safe_wtd.h"
>      >> +
>      >> +/*
>      >> + * Declare the deterministic automata monitor.
>      >> + *
>      >> + * The rv monitor reference is needed for the monitor declaration.
>      >> + */
>      >> +struct rv_monitor rv_safe_wtd;
>      >> +DECLARE_DA_MON_GLOBAL(safe_wtd, char);
>      >> +
>      >> +/*
>      >> + * custom: safe_timeout is the maximum value a watchdog monitor
>      >> + * can set. This value is registered here to duplicate the information.
>      >> + * In this way, a miss-behaving monitor can be detected.
>      >> + */
>      >> +static int safe_timeout = ~0;
>      >> +module_param(safe_timeout, int, 0444);
>      >> +
>      >> +/*
>      >> + * custom: if check_timeout is set, the monitor will check if the time left
>      >> + * in the watchdog is less than or equals to the last safe timeout set by
>      >> + * user-space. This check is done after each ping. In this way, if any
>      >> + * code by-passed the watchdog dev interface setting a higher (so unsafe)
>      >> + * timeout, this monitor will catch the side effect and react.
>      >> + */
>      >> +static int last_timeout_set = 0;
>      >> +static int check_timeout = 0;
>      >> +module_param(check_timeout, int, 0444);
>      >> +
>      >> +/*
>      >> + * custom: if dont_stop is set the monitor will react if stopped.
>      >> + */
>      >> +static int dont_stop = 0;
>      >> +module_param(dont_stop, int, 0444);
>      >> +
>      >> +/*
>      >> + * custom: there are some states that are kept after the watchdog is closed.
>      >> + * For example, the nowayout state.
>      >> + *
>      >> + * Thus, the RV monitor needs to keep track of these states after a start/stop
>      >> + * of the RV monitor itself, and should not reset after each restart -
>      >> keeping the
>      >> + * know state until the system shutdown.
>      >> + *
>      >> + * If for an unknown reason an RV monitor would like to reset the RV monitor
>      >> at each
>      >> + * RV monitor start, set it to one.
>      >> + */
>      >> +static int reset_on_restart = 0;
>      >> +module_param(reset_on_restart, int, 0444);
>      >> +
>      >> +/*
>      >> + * open_pid takes note of the first thread that opened the watchdog.
>      >> + *
>      >> + * Any other thread that generates an event will cause an "other_threads"
>      >> + * event in the monitor.
>      >> + */
>      >> +static int open_pid = 0;
>      >
>      > Userspace could open a watchdog, create a child process, and handle it
>      > from the child. That is perfectly valid.
> 
>     Right! It is a correct usage of the watchdog subsystem.
> 
>     However, the idea here is to allow a "restricted" set of operations based on the
>     safety analysis made by people in the LF Elisa Workgroup (Gabriele Paoloni in Cc:).
> 
> 
> Yes this is correct. This model represents the instance of a specific monitor
> resulting from a specific use case that was analysed in ELISA. From my
> understanding nothing prevents extending the monitor to a more complex
> model in the future that may eventually result in a group of allowed
> processes for the WTD manipulation...
> 
> Kind Regards
> Gab
> 
> 
>     One of the specifications says that: only one process should touch the watchdog.
> 
>     There are details about it in the "watchdog-monitor.rst," section "RV monitor
>     specification."
> 
>     There could be another monitor, a less resticted one, in which the operation you
>     mention would be allowed.
> 
>     I will complement this commit log in the next version of the patch set,
>     clarifying that it is not a "full representation of the watchdog operations" but
>     a restricted set of operations specified by...
> 
>      >> +
>      >> +/*
>      >> + * watchdog_id: the watchdog to monitor
>      >> + */
>      >> +static int watchdog_id = 0;
>      >> +module_param(watchdog_id, int, 0444);
>      >
>      > Limiting the watcher to a single watchdog sounds less than perfect.
>      > What if the system supports more than one, more than one is enabled,
>      > and the non-monitored watchdog misbehaves ?
> 
>     I can add one monitor per watchdog dev. The easiest way would be adding a
>     "struct da_monitor" variable in the watchdog_device structure, e.g.,
> 
>     struct watchdog_device {
>     ...
>              #ifdef CONFIG_RV_MON_SAFE_WTD
>              struct da_monitor da_mon;
>              #endif
>     ...
>     }
> 

In my opinion shis should be dynamically allocated and not waste space in driver
code if unused.

>     A simplified version of the the "per task" monitor, in the patch 01, changes in
>     include/linux/sched.h.
> 
>      >> +
>      >> +static void handle_nowayout(void *data, struct watchdog_device *wdd)
>      >> +{
>      >> +    if (wdd->id != watchdog_id)
>      >> +        return;
>      >> +
>      >> +    da_handle_init_run_event_safe_wtd(nowayout_safe_wtd);
>      >> +}
>      >> +
>      >> +static void handle_close(void *data, struct watchdog_device *wdd)
>      >> +{
>      >> +    if (wdd->id != watchdog_id)
>      >> +        return;
>      >> +
>      >> +    if (open_pid && current->pid != open_pid) {
>      >> +        da_handle_init_run_event_safe_wtd(other_threads_safe_wtd);
>      >> +    } else {
>      >> +        da_handle_event_safe_wtd(close_safe_wtd);
>      >> +        open_pid = 0;
>      >> +    }
>      >> +}
>      >> +
>      >> +static void handle_open(void *data, struct watchdog_device *wdd)
>      >> +{
>      >> +    if (wdd->id != watchdog_id)
>      >> +        return;
>      >> +
>      >> +    if (open_pid && current->pid != open_pid) {
>      >> +        da_handle_init_run_event_safe_wtd(other_threads_safe_wtd);
>      >> +    } else {
>      >> +        da_handle_init_run_event_safe_wtd(open_safe_wtd);
>      >> +        open_pid = current->pid;
>      >> +    }
>      >> +}
>      >> +
>      >> +static void blocked_events(void *data, struct watchdog_device *wdd)
>      >> +{
>      >> +    if (wdd->id != watchdog_id)
>      >> +        return;
>      >> +
>      >> +    if (open_pid && current->pid != open_pid) {
>      >> +        da_handle_init_run_event_safe_wtd(other_threads_safe_wtd);
>      >> +        return;
>      >> +    }
>      >> +    da_handle_event_safe_wtd(other_threads_safe_wtd);
>      >> +}
>      >> +
>      >> +static void blocked_events_timeout(void *data, struct watchdog_device *wdd,
>      >> u64 timeout)
>      >> +{
>      >> +    blocked_events(data, wdd);
>      >> +}
>      >> +
>      >> +static void handle_ping(void *data, struct watchdog_device *wdd)
>      >> +{
>      >> +    char msg[128];
>      >> +    unsigned int timeout;
>      >> +
>      >> +    if (wdd->id != watchdog_id)
>      >> +        return;
>      >> +
>      >> +    if (open_pid && current->pid != open_pid) {
>      >> +        da_handle_init_run_event_safe_wtd(other_threads_safe_wtd);
>      >> +        return;
>      >> +    }
>      >> +
>      >> +    da_handle_event_safe_wtd(ping_safe_wtd);
>      >> +
>      >> +    if (!check_timeout)
>      >> +        return;
>      >> +
>      >> +    if (wdd->ops->get_timeleft) {
>      >> +        timeout = wdd->ops->get_timeleft(wdd);
>      >> +        if (timeout > last_timeout_set) {
>      >> +            snprintf(msg, 128,
>      >> +                 "watchdog timeout is %u > than previously set (%d)\n",
>      >> +                 timeout, last_timeout_set);
>      >> +            cond_react(msg);
>      >> +        }
>      >> +    } else {
>      >> +        snprintf(msg, 128, "error getting timeout: option not supported\n");
>      >
>      > This is not an error. The get_timeleft callback is optional.
> 
>     Right... but this part of the code is only reachable if the user explicitly
>     asked to check the timeout (if (!check_timeout)...return before this code).
> 
>     So, if the user only considers the system safe if the monitor also checks the
>     written timeout, but the watchdog is one of those that do not have the callback
>     implemented (which is ok for a Linux watchdog), the monitor captures this
>     "undesired" behavior.
> 
>     This monitor is not checking if the watchdog subsystem is correct at its
>     plenitude, it is checking if the watchdog usage is following a set of
>     specifications (raised by people in the LF Elisa workgroup).
> 

The kernel is not intended for special use cases. The callback is optional,
period. The test for check_timeout is way too late. A check like this should
be made when the check is requested, not when it is executed - in other words,
when the user requests it. That request should fail.

>      >> +        cond_react(msg);
>      >> +    }
>      >> +}
>      >> +
> 
>     [...]
> 
>      >> +
>      >> +struct automaton_safe_wtd automaton_safe_wtd = {
>      >> +    .state_names = {
>      >> +        "init",
>      >> +        "closed_running",
>      >> +        "closed_running_nwo",
>      >> +        "nwo",
>      >> +        "opened",
>      >> +        "opened_nwo",
>      >> +        "reopened",
>      >> +        "safe",
>      >> +        "safe_nwo",
>      >> +        "set",
>      >> +        "set_nwo",
>      >> +        "started",
>      >> +        "started_nwo",
>      >> +        "stoped"
>      >> +    },
>      >> +    .event_names = {
>      >> +        "close",
>      >> +        "nowayout",
>      >> +        "open",
>      >> +        "other_threads",
>      >> +        "ping",
>      >> +        "set_safe_timeout",
>      >> +        "start",
>      >> +        "stop"
>      >> +    },
>      >> +    .function = {
>      >> +        {                          -1,                nwo_safe_wtd,             opened_safe_wtd,               init_safe_wtd,                          -1,                          -1,                         -1,                          -1 },
>      >> +        {                          -1, closed_running_nwo_safe_wtd,           reopened_safe_wtd,     closed_running_safe_wtd,                          -1,                          -1,                         -1,                          -1 },
>      >> +        {                          -1, closed_running_nwo_safe_wtd,        started_nwo_safe_wtd, closed_running_nwo_safe_wtd,                          -1,                          -1,                         -1,                          -1 },
>      >> +        {                          -1,                nwo_safe_wtd,         opened_nwo_safe_wtd,                nwo_safe_wtd,                          -1,                          -1,                         -1,                          -1 },
>      >> +        {               init_safe_wtd,                          -1,                          -1,                          -1,                          -1,                          -1,           started_safe_wtd,                          -1 },
>      >> +        {                nwo_safe_wtd,                          -1,                          -1,                          -1,                          -1,                          -1,       started_nwo_safe_wtd,                          -1 },
>      >> +        {     closed_running_safe_wtd,                          -1,                          -1,                          -1,                          -1,                set_safe_wtd,                          1,             opened_safe_wtd },
>      >> +        {     closed_running_safe_wtd,                          -1,                          -1,                          -1,               safe_safe_wtd,                          -1,                          1,             stoped_safe_wtd },
>      >> +        { closed_running_nwo_safe_wtd,                          -1,                          -1,                          -1,           safe_nwo_safe_wtd,                          -1,                         -1,                          -1 },
>      >> +        {                          -1,                          -1,                          -1,                          -1,               safe_safe_wtd,                          -1,                         -1,                          -1 },
>      >> +        {                          -1,                          -1,                          -1,                          -1,           safe_nwo_safe_wtd,                          -1,                         -1,                          -1 },
>      >> +        {     closed_running_safe_wtd,                          -1,                          -1,                          -1,                          -1,                set_safe_wtd,                         -1,             stoped_safe_wtd },
>      >> +        { closed_running_nwo_safe_wtd,                          -1,                          -1,                          -1,                          -1,            set_nwo_safe_wtd,                         -1,                          -1 },
>      >> +        {               init_safe_wtd,                          -1,                          -1,                          -1,                          -1,                          -1,                         -1,                          -1 },
>      >> +    },
>      >> +    .initial_state = init_safe_wtd,
>      >> +    .final_states = { 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 },
>      >
>      > I find this event table all but impossible to verify.
> 
>     It is a matrix. Lines are states, and columns are events.
> 
>     On a given state/line, receiving a given event/column, the data is the next
>     state/row.
> 

I am aware of that, and I did program state machines before.

>     For instance, let's say "init" (row 0), event "nwo" (column 1), and the next
>     state is the "nwo" (row 3).
> 
>     -1 means invalid/blocked state (yeah, maybe it is better to have an #define
>     INVALID_STATE -1).
> 
>     This is the C representation of an automaton, following the formal definition of
>     a deterministic automaton. I've added an explanation of this representation in
>     the documentation (patch 15, file da_monitor_synthesis.rst).
> 
>     A deeper look into this subject is here (peer-reviewed conference paper at
>     Software Engineer and Formal Methods 2019):
>     https://bristot.me/wp-content/uploads/2019/09/paper.pdf <https://bristot.me/wp-content/uploads/2019/09/paper.pdf>
> 
>     One could translate it back to the automaton's graphical format... to a format
>     of by a tool used to analyze automaton properties... that is the good point of
>     using a well-established formalism. (The bad part is that they are often
>     boring... c'est la vie :-)).
> 

If the above state machine fails, no one but the authors will be able to even
remotely figure out what happened, and if the watchdog driver is at fault or
its monitor. It is a state machine making assumptions about state transitions,
sure, but who knows if those asssumptions are even remotely correct or match
reality. For example, I have no idea if the lack of a 'ping' function is handled
correctly,  if the lack of a 'stop' function is handled correctly, or what
happens if any of the driver functions returns an error.

I already found three assumptions which do not or not necessarily match
reality:

- The function to read the remaining timeout is optional and must not be
   used unconditionally, and its lack is not an error.
- The requested timeout (and pretimeout) do not have to match the actually
   configured timeout, and userspace must not rely on the assumption that
   the values match.
- The code assumes that the process opening the watchdog and the process
   accessing it are the same. While that is in general the case, it might
   well be that some application opens the watchdog and then handles it
   from a child process.

And that is just after briefly browsing through the code.

I am open to suggestions from others, but at this point I have serious doubts
that this code is maintainable in the kernel.

Guenter
Daniel Bristot de Oliveira June 17, 2022, 4:17 p.m. UTC | #5
On 6/16/22 22:57, Randy Dunlap wrote:
> Hi--
> 
> On 6/16/22 01:45, Daniel Bristot de Oliveira wrote:
>> diff --git a/kernel/trace/rv/Kconfig b/kernel/trace/rv/Kconfig
>> index 21f03fb3101a..b14ae63e792b 100644
>> --- a/kernel/trace/rv/Kconfig
>> +++ b/kernel/trace/rv/Kconfig
>> @@ -45,6 +45,16 @@ config RV_MON_WWNR
>>  	  illustrates the usage of per-task monitor. The model is
>>  	  broken on purpose: it serves to test reactors.
>>  
>> +config RV_MON_SAFE_WTD
>> +	select DA_MON_EVENTS_IMPLICIT
>> +	bool "Safety watchdog"
>> +	help
>> +	  Enable safe_wtd, this monitor observes the interaction
>> +	  between a user-space safety monitor and a watchdog device.
>> +
>> +	  For futher information see:
>> +	    Documentation/trace/rv/safety-monitor.rst
> I'm curious about what "WTD" means.
> 
> I see lots of WDT in drivers/watchdog/Kconfig
> (where it means WatchDog Timer AFAIK).

Yep, watchdog. I will add the long description right after the first WTD appearance.

-- Daniel

> thanks.
Daniel Bristot de Oliveira June 17, 2022, 5:06 p.m. UTC | #6
On 6/17/22 01:53, Guenter Roeck wrote:
>>
>>     struct watchdog_device {
>>     ...
>>              #ifdef CONFIG_RV_MON_SAFE_WTD
>>              struct da_monitor da_mon;
>>              #endif
>>     ...
>>     }
>>
> 
> In my opinion shis should be dynamically allocated and not waste space in driver
> code if unused.

ack

>>     A simplified version of the the "per task" monitor, in the patch 01, changes in
>>
>>     include/linux/sched.h.
>>
>>      >> +
>>      >> +static void handle_nowayout(void *data, struct watchdog_device *wdd)
>>      >> +{
>>      >> +    if (wdd->id != watchdog_id)
>>      >> +        return;
>>      >> +
>>      >> +    da_handle_init_run_event_safe_wtd(nowayout_safe_wtd);
>>      >> +}
>>      >> +
>>      >> +static void handle_close(void *data, struct watchdog_device *wdd)
>>      >> +{
>>      >> +    if (wdd->id != watchdog_id)
>>      >> +        return;
>>      >> +
>>      >> +    if (open_pid && current->pid != open_pid) {
>>      >> +        da_handle_init_run_event_safe_wtd(other_threads_safe_wtd);
>>      >> +    } else {
>>      >> +        da_handle_event_safe_wtd(close_safe_wtd);
>>      >> +        open_pid = 0;
>>      >> +    }
>>      >> +}
>>      >> +
>>      >> +static void handle_open(void *data, struct watchdog_device *wdd)
>>      >> +{
>>      >> +    if (wdd->id != watchdog_id)
>>      >> +        return;
>>      >> +
>>      >> +    if (open_pid && current->pid != open_pid) {
>>      >> +        da_handle_init_run_event_safe_wtd(other_threads_safe_wtd);
>>      >> +    } else {
>>      >> +        da_handle_init_run_event_safe_wtd(open_safe_wtd);
>>      >> +        open_pid = current->pid;
>>      >> +    }
>>      >> +}
>>      >> +
>>      >> +static void blocked_events(void *data, struct watchdog_device *wdd)
>>      >> +{
>>      >> +    if (wdd->id != watchdog_id)
>>      >> +        return;
>>      >> +
>>      >> +    if (open_pid && current->pid != open_pid) {
>>      >> +        da_handle_init_run_event_safe_wtd(other_threads_safe_wtd);
>>      >> +        return;
>>      >> +    }
>>      >> +    da_handle_event_safe_wtd(other_threads_safe_wtd);
>>      >> +}
>>      >> +
>>      >> +static void blocked_events_timeout(void *data, struct watchdog_device *wdd,
>>
>>      >> u64 timeout)
>>      >> +{
>>      >> +    blocked_events(data, wdd);
>>      >> +}
>>      >> +
>>      >> +static void handle_ping(void *data, struct watchdog_device *wdd)
>>      >> +{
>>      >> +    char msg[128];
>>      >> +    unsigned int timeout;
>>      >> +
>>      >> +    if (wdd->id != watchdog_id)
>>      >> +        return;
>>      >> +
>>      >> +    if (open_pid && current->pid != open_pid) {
>>      >> +        da_handle_init_run_event_safe_wtd(other_threads_safe_wtd);
>>      >> +        return;
>>      >> +    }
>>      >> +
>>      >> +    da_handle_event_safe_wtd(ping_safe_wtd);
>>      >> +
>>      >> +    if (!check_timeout)
>>      >> +        return;
>>      >> +
>>      >> +    if (wdd->ops->get_timeleft) {
>>      >> +        timeout = wdd->ops->get_timeleft(wdd);
>>      >> +        if (timeout > last_timeout_set) {
>>      >> +            snprintf(msg, 128,
>>      >> +                 "watchdog timeout is %u > than previously set (%d)\n",
>>      >> +                 timeout, last_timeout_set);
>>      >> +            cond_react(msg);
>>      >> +        }
>>      >> +    } else {
>>      >> +        snprintf(msg, 128, "error getting timeout: option not supported\n");
>>
>>      >
>>      > This is not an error. The get_timeleft callback is optional.
>>
>>     Right... but this part of the code is only reachable if the user explicitly
>>     asked to check the timeout (if (!check_timeout)...return before this code).
>>
>>     So, if the user only considers the system safe if the monitor also checks the
>>     written timeout, but the watchdog is one of those that do not have the callback
>>
>>     implemented (which is ok for a Linux watchdog), the monitor captures this
>>     "undesired" behavior.
>>
>>     This monitor is not checking if the watchdog subsystem is correct at its
>>     plenitude, it is checking if the watchdog usage is following a set of
>>     specifications (raised by people in the LF Elisa workgroup).
>>
> 
> The kernel is not intended for special use cases. The callback is optional,
> period. The test for check_timeout is way too late. A check like this should
> be made when the check is requested, not when it is executed - in other words,
> when the user requests it. That request should fail.
> 
>>      >> +        cond_react(msg);
>>      >> +    }

[...]

>>      >
>>      > I find this event table all but impossible to verify.
>>
>>     It is a matrix. Lines are states, and columns are events.
>>
>>     On a given state/line, receiving a given event/column, the data is the next
>>     state/row.
>>
> 
> I am aware of that, and I did program state machines before.
> 
>>     For instance, let's say "init" (row 0), event "nwo" (column 1), and the next
>>     state is the "nwo" (row 3).
>>
>>     -1 means invalid/blocked state (yeah, maybe it is better to have an #define
>>     INVALID_STATE -1).
>>
>>     This is the C representation of an automaton, following the formal definition of
>>
>>     a deterministic automaton. I've added an explanation of this representation in
>>
>>     the documentation (patch 15, file da_monitor_synthesis.rst).
>>
>>     A deeper look into this subject is here (peer-reviewed conference paper at
>>     Software Engineer and Formal Methods 2019):
>>     https://bristot.me/wp-content/uploads/2019/09/paper.pdf <https://bristot.me/wp-content/uploads/2019/09/paper.pdf>
>>
>>
>>     One could translate it back to the automaton's graphical format... to a format
>>
>>     of by a tool used to analyze automaton properties... that is the good point of
>>
>>     using a well-established formalism. (The bad part is that they are often
>>     boring... c'est la vie :-)).
>>
> 
> If the above state machine fails, no one but the authors will be able to even
> remotely figure out what happened, and if the watchdog driver is at fault or
> its monitor.

That is a point we can improve. If we look only for the matrix, yes, it is not
the best way to try to understand what is going on. I agree.

The idea here is that one can describe the expected behavior and check. But we fail
to explain the reasons for these choices... 

The patch "20/20 Documentation/rv: Add watchdog-monitor documentation" attempts
to explain the idea behind the monitor, and includes the ASCII representation
of the monitor. But indeed, it fails on explaining why those assumptions were
made, what it is trying to monitor, and what it is not...

It is a state machine making assumptions about state transitions,
> sure, but who knows if those asssumptions are even remotely correct or match
> reality. For example, I have no idea if the lack of a 'ping' function is handled
> correctly,  if the lack of a 'stop' function is handled correctly, or what
> happens if any of the driver functions returns an error.
> 
> I already found three assumptions which do not or not necessarily match
> reality:
> 
> - The function to read the remaining timeout is optional and must not be
>   used unconditionally, and its lack is not an error.

Ack, that can be removed.

> - The requested timeout (and pretimeout) do not have to match the actually
>   configured timeout, and userspace must not rely on the assumption that
>   the values match.

I agree, the model is lacking that, and it needs it.

> - The code assumes that the process opening the watchdog and the process
>   accessing it are the same. While that is in general the case, it might
>   well be that some application opens the watchdog and then handles it
>   from a child process.

What is not clear on our documentation are the reasons behind these choices.
They are based on a hazard analysis made by people safety experts in the
Elisa group, that need to be clarified.

/me looks at people from Elisa...

-- Daniel

> And that is just after briefly browsing through the code> I am open to suggestions from others, but at this point I have serious doubts
> that this code is maintainable in the kernel.
> 
> Guenter
Steven Rostedt June 28, 2022, 7:32 p.m. UTC | #7
On Thu, 16 Jun 2022 16:53:54 -0700
Guenter Roeck <linux@roeck-us.net> wrote:

> >      >> +
> >      >> +struct automaton_safe_wtd automaton_safe_wtd = {
> >      >> +    .state_names = {
> >      >> +        "init",
> >      >> +        "closed_running",
> >      >> +        "closed_running_nwo",
> >      >> +        "nwo",
> >      >> +        "opened",
> >      >> +        "opened_nwo",
> >      >> +        "reopened",
> >      >> +        "safe",
> >      >> +        "safe_nwo",
> >      >> +        "set",
> >      >> +        "set_nwo",
> >      >> +        "started",
> >      >> +        "started_nwo",
> >      >> +        "stoped"
> >      >> +    },
> >      >> +    .event_names = {
> >      >> +        "close",
> >      >> +        "nowayout",
> >      >> +        "open",
> >      >> +        "other_threads",
> >      >> +        "ping",
> >      >> +        "set_safe_timeout",
> >      >> +        "start",
> >      >> +        "stop"
> >      >> +    },
> >      >> +    .function = {

I think it could become much more readable if you added comments here that
show what each column is. That way, we do not need to try to remember it.

And maybe even a diagram in the comments. If we can automate the generating
of an ASCII DFA diagram then that would be awesome too :-)

> >      >> +        {                          -1,                nwo_safe_wtd,             opened_safe_wtd,               init_safe_wtd,                          -1,                          -1,                         -1,                          -1 },
> >      >> +        {                          -1, closed_running_nwo_safe_wtd,           reopened_safe_wtd,     closed_running_safe_wtd,                          -1,                          -1,                         -1,                          -1 },
> >      >> +        {                          -1, closed_running_nwo_safe_wtd,        started_nwo_safe_wtd, closed_running_nwo_safe_wtd,                          -1,                          -1,                         -1,                          -1 },
> >      >> +        {                          -1,                nwo_safe_wtd,         opened_nwo_safe_wtd,                nwo_safe_wtd,                          -1,                          -1,                         -1,                          -1 },
> >      >> +        {               init_safe_wtd,                          -1,                          -1,                          -1,                          -1,                          -1,           started_safe_wtd,                          -1 },
> >      >> +        {                nwo_safe_wtd,                          -1,                          -1,                          -1,                          -1,                          -1,       started_nwo_safe_wtd,                          -1 },
> >      >> +        {     closed_running_safe_wtd,                          -1,                          -1,                          -1,                          -1,                set_safe_wtd,                          1,             opened_safe_wtd },
> >      >> +        {     closed_running_safe_wtd,                          -1,                          -1,                          -1,               safe_safe_wtd,                          -1,                          1,             stoped_safe_wtd },
> >      >> +        { closed_running_nwo_safe_wtd,                          -1,                          -1,                          -1,           safe_nwo_safe_wtd,                          -1,                         -1,                          -1 },
> >      >> +        {                          -1,                          -1,                          -1,                          -1,               safe_safe_wtd,                          -1,                         -1,                          -1 },
> >      >> +        {                          -1,                          -1,                          -1,                          -1,           safe_nwo_safe_wtd,                          -1,                         -1,                          -1 },
> >      >> +        {     closed_running_safe_wtd,                          -1,                          -1,                          -1,                          -1,                set_safe_wtd,                         -1,             stoped_safe_wtd },
> >      >> +        { closed_running_nwo_safe_wtd,                          -1,                          -1,                          -1,                          -1,            set_nwo_safe_wtd,                         -1,                          -1 },
> >      >> +        {               init_safe_wtd,                          -1,                          -1,                          -1,                          -1,                          -1,                         -1,                          -1 },
> >      >> +    },
> >      >> +    .initial_state = init_safe_wtd,
> >      >> +    .final_states = { 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 },  
> >      >
> >      > I find this event table all but impossible to verify.  
> > 
> >     It is a matrix. Lines are states, and columns are events.
> > 
> >     On a given state/line, receiving a given event/column, the data is the next
> >     state/row.
> >   
> 
> I am aware of that, and I did program state machines before.
> 
> >     For instance, let's say "init" (row 0), event "nwo" (column 1), and the next
> >     state is the "nwo" (row 3).
> > 
> >     -1 means invalid/blocked state (yeah, maybe it is better to have an #define
> >     INVALID_STATE -1).
> > 
> >     This is the C representation of an automaton, following the formal definition of
> >     a deterministic automaton. I've added an explanation of this representation in
> >     the documentation (patch 15, file da_monitor_synthesis.rst).
> > 
> >     A deeper look into this subject is here (peer-reviewed conference paper at
> >     Software Engineer and Formal Methods 2019):
> >     https://bristot.me/wp-content/uploads/2019/09/paper.pdf <https://bristot.me/wp-content/uploads/2019/09/paper.pdf>
> > 
> >     One could translate it back to the automaton's graphical format... to a format
> >     of by a tool used to analyze automaton properties... that is the good point of
> >     using a well-established formalism. (The bad part is that they are often
> >     boring... c'est la vie :-)).
> >   
> 
> If the above state machine fails, no one but the authors will be able to even
> remotely figure out what happened, and if the watchdog driver is at fault or
> its monitor. It is a state machine making assumptions about state transitions,
> sure, but who knows if those asssumptions are even remotely correct or match
> reality. For example, I have no idea if the lack of a 'ping' function is handled
> correctly,  if the lack of a 'stop' function is handled correctly, or what
> happens if any of the driver functions returns an error.
> 
> I already found three assumptions which do not or not necessarily match
> reality:
> 
> - The function to read the remaining timeout is optional and must not be
>    used unconditionally, and its lack is not an error.
> - The requested timeout (and pretimeout) do not have to match the actually
>    configured timeout, and userspace must not rely on the assumption that
>    the values match.
> - The code assumes that the process opening the watchdog and the process
>    accessing it are the same. While that is in general the case, it might
>    well be that some application opens the watchdog and then handles it
>    from a child process.
> 
> And that is just after briefly browsing through the code.
> 
> I am open to suggestions from others, but at this point I have serious doubts
> that this code is maintainable in the kernel.
> 

Let me give some background on this. Various safety critical users
(automotive, power plants, etc) are looking at using Linux in the field.
Obviously, Linux itself is too big to verify properly. In fact, even small
OSs that say they are verifiable seldom are. Part of the solution is to add
a way to check that Linux is running as expected, and be able to detect
when it is not. When that happens, you fall into a "fail-safe" mode (for
example, if you have an autonomous driving vehicle, it could sound an alarm
and inform the driver to go into manual mode, or it is going to simply
"pull over". Note, this is just a hypothetical example, not something I've
seen in reality).

But to implement this, you need to verify the verifier. In this case, the
watchdog timer is what verifies that Linux is running properly. It is
possible to verify parts of the kernel when you limit the inputs of the
kernel. This means we are not trying to verify all use cases of the
watchdog timer. That would be pretty much impossible, or take decades to
complete. We only need to verify the use cases used in safety critical
systems.

Do not confuse this with static analyzers or other general purpose tooling
to find bugs. This is not the purpose of the run time verify. It is just to
prove that certain use cases will perform as expected, given a limited
input.

Years ago there was an interview with Linus where he was asked if he would
trust Linux in real-time safety critical projects. And Linus's response was
something to the effect that he would, as those projects have a very
limited use case, and the code that actually gets run is not as big as you
would think. And that code can be verified.

That's exactly what we are doing here. There are very strict standards on
what is considered safe for safety critical projects and by limiting the
use cases of the code we are verifying, we can achieve those objectives.

But your complaint about not being able to read or understand the table is
legitimate, and that needs to be addressed. And what the limiting input is
must also be commented as well, that way people will not get confused when
they see a legitimate use case not covered by the DFA.

Action Items:
 1. Add column title to the matrix. 
 2. Ideally, also add an ASCII DFA diagram.
 3. And finally, comment what the range of inputs are (what use cases are
    being covered).

-- Steve
Guenter Roeck July 1, 2022, 2:45 p.m. UTC | #8
On 6/28/22 12:32, Steven Rostedt wrote:
[ ... ]
>> I am open to suggestions from others, but at this point I have serious doubts
>> that this code is maintainable in the kernel.
>>
> 
> Let me give some background on this. Various safety critical users
> (automotive, power plants, etc) are looking at using Linux in the field.
> Obviously, Linux itself is too big to verify properly. In fact, even small
> OSs that say they are verifiable seldom are. Part of the solution is to add
> a way to check that Linux is running as expected, and be able to detect
> when it is not. When that happens, you fall into a "fail-safe" mode (for
> example, if you have an autonomous driving vehicle, it could sound an alarm
> and inform the driver to go into manual mode, or it is going to simply
> "pull over". Note, this is just a hypothetical example, not something I've
> seen in reality).
> 
> But to implement this, you need to verify the verifier. In this case, the
> watchdog timer is what verifies that Linux is running properly. It is
> possible to verify parts of the kernel when you limit the inputs of the
> kernel. This means we are not trying to verify all use cases of the
> watchdog timer. That would be pretty much impossible, or take decades to
> complete. We only need to verify the use cases used in safety critical
> systems.
> 

I understand the context. However, ...

> Do not confuse this with static analyzers or other general purpose tooling
> to find bugs. This is not the purpose of the run time verify. It is just to
> prove that certain use cases will perform as expected, given a limited
> input.
> 

... this is, unfortunately, not explained in the patch. I would have much less
of a problem with the series if those details were included.

Not that I would mind such a verifier, if it was possible to define one,
but it would have to be tested with a large number of watchdog drivers
to ensure that it addresses all use cases, or at least with a substantial
percentage of use cases. It would also require that the state machine is
readable to give people a chance to fix it if turns out to be broken.
It would also have to be robust, meaning it would have to reject invalid
(unsupported) settings from the start and not only during runtime.

Thanks,
Guenter
Steven Rostedt July 1, 2022, 3:38 p.m. UTC | #9
On Fri, 1 Jul 2022 07:45:50 -0700
Guenter Roeck <linux@roeck-us.net> wrote:

> > Do not confuse this with static analyzers or other general purpose tooling
> > to find bugs. This is not the purpose of the run time verify. It is just to
> > prove that certain use cases will perform as expected, given a limited
> > input.
> >   
> 
> ... this is, unfortunately, not explained in the patch. I would have much less
> of a problem with the series if those details were included.

It's one of those cases where developers get so involved in their code that
they leave out the things that are so obvious to them, but not obvious to
others ;-)

My new saying is: "We work in a field where the obvious seldom is".

Hmm, I think I'll go tweet that :-)

> 
> Not that I would mind such a verifier, if it was possible to define one,
> but it would have to be tested with a large number of watchdog drivers
> to ensure that it addresses all use cases, or at least with a substantial
> percentage of use cases. It would also require that the state machine is
> readable to give people a chance to fix it if turns out to be broken.
> It would also have to be robust, meaning it would have to reject invalid
> (unsupported) settings from the start and not only during runtime.

I would agree than any module would need to state up front exactly what it
is modeling. In safety critical systems, all the components that are used
are defined up front. Not sure if we can have the model not load if the
required drivers to test are not loaded or ones not part of the model are
(Daniel?).

Anyway, thanks for the feedback.

-- Steve
Daniel Bristot de Oliveira July 4, 2022, 12:41 p.m. UTC | #10
On 7/1/22 17:38, Steven Rostedt wrote:
> On Fri, 1 Jul 2022 07:45:50 -0700
> Guenter Roeck <linux@roeck-us.net> wrote:
> 
>>> Do not confuse this with static analyzers or other general purpose tooling
>>> to find bugs. This is not the purpose of the run time verify. It is just to
>>> prove that certain use cases will perform as expected, given a limited
>>> input.
>>>   
>>
>> ... this is, unfortunately, not explained in the patch. I would have much less
>> of a problem with the series if those details were included.
> 
> It's one of those cases where developers get so involved in their code that
> they leave out the things that are so obvious to them, but not obvious to
> others ;-)

I agree that the main point here is that the documentation [patch 20/20] needs
to be extended and correctly linked to the code.

The goal of the model is to specify the minimum but obligatory steps to set a
watchdog (start, set a safe timeout, ping...) so it can be used by
"safety/heath" monitors in safety-critical systems.

Another goal is to reduce the amount of code/dependencies that will require
deeper inspections to qualify the subsystem for usage in a given context via
monitoring (as steven mentioned - more about it here [1]), without having to
reduce the generic subsystem.

Although the method allows one to create a complete model of the watchdog device
layer, covering all use cases, that is not the idea of this monitor. Moreover, a
full model would not be the adequate model for this specific (but relevant) case
raised and discussed in the Elisa workgroup [2].

The goal of the monitor (that uses the model) is to verify that the interaction
between the watchdog device layer and the "safety/heath" monitors follows this
established model, at runtime.

> My new saying is: "We work in a field where the obvious seldom is".
> 
> Hmm, I think I'll go tweet that :-)

/me liked the tweet... and yep, you clarified well the context in which this is
being applied.

>>
>> Not that I would mind such a verifier, if it was possible to define one,
>> but it would have to be tested with a large number of watchdog drivers
>> to ensure that it addresses all use cases, or at least with a substantial
>> percentage of use cases. It would also require that the state machine is
>> readable to give people a chance to fix it if turns out to be broken.

The patch 20/20 has the automaton in the ASCII art format. Both C and ASCII
models were extracted from the same .dot file. I am not including the .dot file
because there was a previous discussion with the doc people that prefer the
ASCII format in the documentation. But I can add as well, not linked with the
rst file. By reading the model in the ASCII format, it is possible to see that
it is broad enough to cover many watchdogs as it uses simple/generic operations.

The model (the .h) and the instrumentation (the .c) can be updated at any time.
I am adding the tooling to facilitate that, like [patch 06/20].

Patch 19/20 adds a safety application that enables the RV monitor, uses the
watchdog, and collects feedback from the monitor to see if the requested actions
are occurring in the model - and it can be used to test the RV monitor with
any watchdog.

The goal is the one I described above, so an exception generated by this monitor
needs to be read accordingly: it does not imply that a watchdog driver is
broken. It means the interaction between the safety/health monitor and the
watchdog is not following specifications and must be checked at the development
phase, or that something went really wrong at runtime phase.

I will add that to the documentation and emphasize the context of this monitor.

>> It would also have to be robust, meaning it would have to reject invalid
>> (unsupported) settings from the start and not only during runtime.
> 
> I would agree than any module would need to state up front exactly what it
> is modeling. In safety critical systems, all the components that are used
> are defined up front. Not sure if we can have the model not load if the
> required drivers to test are not loaded or ones not part of the model are
> (Daniel?).

I can add a check in the time in which the model is being enabled. But for this
specific case, removing that part might be better and adding it later if it
becomes fundamental.

In practice, selecting a watchdog device can assume that the get_timeleft
implementation is a requirement - so that option was just an additional check.

[1] This is one use case for the broader goal of the strategy discussed here:
https://www.youtube.com/watch?v=DkiwkAKOXNs

[2] https://www.youtube.com/watch?v=qFSYlCbHCYk
> Anyway, thanks for the feedback.
> 
> -- Steve
Daniel Bristot de Oliveira July 13, 2022, 7:13 p.m. UTC | #11
On 6/16/22 10:45, Daniel Bristot de Oliveira wrote:
> The watchdog is an essential building block for the usage of Linux in
> safety-critical systems because it allows the system to be monitored from
> an external element - the watchdog hardware, acting as a safety-monitor.

Guenter and Steven,

I will move the watchdog monitor to a separated thread, as it will require
further discussions specific to it, mainly to get feedback from watchdog
maintainers.

Anyways, I am adding additional information for the sample monitors based on our
discussions here.

They are:

I added documentation about the automata format and the translation between
the formal <-> dot -> C.

I am adding the .dot file to the tools/verification/models/ so that one can get
the .dot file and convert it to other formats, like, a png file. This
will make it easy to read the automata model.

I am adding a .rst documentation for each model, including details about it.

I will send the v5 of RV without the safe_wtd monitor and then start a new
one about the watchdog later, after getting the RV interface series ready.

-- Daniel
diff mbox series

Patch

diff --git a/include/trace/events/rv.h b/include/trace/events/rv.h
index 00f11a8dac3b..895eb3435ed7 100644
--- a/include/trace/events/rv.h
+++ b/include/trace/events/rv.h
@@ -66,6 +66,17 @@  DEFINE_EVENT(error_da_monitor, error_wip,
 	     TP_PROTO(char *state, char *event),
 	     TP_ARGS(state, event));
 #endif /* CONFIG_RV_MON_WIP */
+
+#ifdef CONFIG_RV_MON_SAFE_WTD
+DEFINE_EVENT(event_da_monitor, event_safe_wtd,
+	     TP_PROTO(char *state, char *event, char *next_state, bool safe),
+	     TP_ARGS(state, event, next_state, safe));
+
+DEFINE_EVENT(error_da_monitor, error_safe_wtd,
+	     TP_PROTO(char *state, char *event),
+	     TP_ARGS(state, event));
+#endif /* CONFIG_RV_MON_SAFE_WTD */
+
 #endif /* CONFIG_DA_MON_EVENTS_IMPLICIT */
 
 #ifdef CONFIG_DA_MON_EVENTS_ID
diff --git a/kernel/trace/rv/Kconfig b/kernel/trace/rv/Kconfig
index 21f03fb3101a..b14ae63e792b 100644
--- a/kernel/trace/rv/Kconfig
+++ b/kernel/trace/rv/Kconfig
@@ -45,6 +45,16 @@  config RV_MON_WWNR
 	  illustrates the usage of per-task monitor. The model is
 	  broken on purpose: it serves to test reactors.
 
+config RV_MON_SAFE_WTD
+	select DA_MON_EVENTS_IMPLICIT
+	bool "Safety watchdog"
+	help
+	  Enable safe_wtd, this monitor observes the interaction
+	  between a user-space safety monitor and a watchdog device.
+
+	  For futher information see:
+	    Documentation/trace/rv/safety-monitor.rst
+
 config RV_REACTORS
 	bool "Runtime verification reactors"
 	default y if RV
diff --git a/kernel/trace/rv/Makefile b/kernel/trace/rv/Makefile
index 963d14875b45..904db96c7eae 100644
--- a/kernel/trace/rv/Makefile
+++ b/kernel/trace/rv/Makefile
@@ -3,6 +3,7 @@ 
 obj-$(CONFIG_RV) += rv.o
 obj-$(CONFIG_RV_MON_WIP) += monitors/wip/wip.o
 obj-$(CONFIG_RV_MON_WWNR) += monitors/wwnr/wwnr.o
+obj-$(CONFIG_RV_MON_SAFE_WTD) += monitors/safe_wtd/safe_wtd.o
 obj-$(CONFIG_RV_REACTORS) += rv_reactors.o
 obj-$(CONFIG_RV_REACT_PRINTK) += reactor_printk.o
 obj-$(CONFIG_RV_REACT_PANIC) += reactor_panic.o
diff --git a/kernel/trace/rv/monitors/safe_wtd/safe_wtd.c b/kernel/trace/rv/monitors/safe_wtd/safe_wtd.c
new file mode 100644
index 000000000000..9856e0770d0d
--- /dev/null
+++ b/kernel/trace/rv/monitors/safe_wtd/safe_wtd.c
@@ -0,0 +1,300 @@ 
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/ftrace.h>
+#include <linux/tracepoint.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/rv.h>
+#include <rv/instrumentation.h>
+#include <rv/da_monitor.h>
+
+#include <linux/watchdog.h>
+#include <linux/moduleparam.h>
+
+#include <trace/events/rv.h>
+#include <trace/events/watchdog.h>
+
+#define MODULE_NAME "safe_wtd"
+
+/*
+ * This is the self-generated part of the monitor. Generally, there is no need
+ * to touch this section.
+ */
+#include "safe_wtd.h"
+
+/*
+ * Declare the deterministic automata monitor.
+ *
+ * The rv monitor reference is needed for the monitor declaration.
+ */
+struct rv_monitor rv_safe_wtd;
+DECLARE_DA_MON_GLOBAL(safe_wtd, char);
+
+/*
+ * custom: safe_timeout is the maximum value a watchdog monitor
+ * can set. This value is registered here to duplicate the information.
+ * In this way, a miss-behaving monitor can be detected.
+ */
+static int safe_timeout = ~0;
+module_param(safe_timeout, int, 0444);
+
+/*
+ * custom: if check_timeout is set, the monitor will check if the time left
+ * in the watchdog is less than or equals to the last safe timeout set by
+ * user-space. This check is done after each ping. In this way, if any
+ * code by-passed the watchdog dev interface setting a higher (so unsafe)
+ * timeout, this monitor will catch the side effect and react.
+ */
+static int last_timeout_set = 0;
+static int check_timeout = 0;
+module_param(check_timeout, int, 0444);
+
+/*
+ * custom: if dont_stop is set the monitor will react if stopped.
+ */
+static int dont_stop = 0;
+module_param(dont_stop, int, 0444);
+
+/*
+ * custom: there are some states that are kept after the watchdog is closed.
+ * For example, the nowayout state.
+ *
+ * Thus, the RV monitor needs to keep track of these states after a start/stop
+ * of the RV monitor itself, and should not reset after each restart - keeping the
+ * know state until the system shutdown.
+ *
+ * If for an unknown reason an RV monitor would like to reset the RV monitor at each
+ * RV monitor start, set it to one.
+ */
+static int reset_on_restart = 0;
+module_param(reset_on_restart, int, 0444);
+
+/*
+ * open_pid takes note of the first thread that opened the watchdog.
+ *
+ * Any other thread that generates an event will cause an "other_threads"
+ * event in the monitor.
+ */
+static int open_pid = 0;
+
+/*
+ * watchdog_id: the watchdog to monitor
+ */
+static int watchdog_id = 0;
+module_param(watchdog_id, int, 0444);
+
+static void handle_nowayout(void *data, struct watchdog_device *wdd)
+{
+	if (wdd->id != watchdog_id)
+		return;
+
+	da_handle_init_run_event_safe_wtd(nowayout_safe_wtd);
+}
+
+static void handle_close(void *data, struct watchdog_device *wdd)
+{
+	if (wdd->id != watchdog_id)
+		return;
+
+	if (open_pid && current->pid != open_pid) {
+		da_handle_init_run_event_safe_wtd(other_threads_safe_wtd);
+	} else {
+		da_handle_event_safe_wtd(close_safe_wtd);
+		open_pid = 0;
+	}
+}
+
+static void handle_open(void *data, struct watchdog_device *wdd)
+{
+	if (wdd->id != watchdog_id)
+		return;
+
+	if (open_pid && current->pid != open_pid) {
+		da_handle_init_run_event_safe_wtd(other_threads_safe_wtd);
+	} else {
+		da_handle_init_run_event_safe_wtd(open_safe_wtd);
+		open_pid = current->pid;
+	}
+}
+
+static void blocked_events(void *data, struct watchdog_device *wdd)
+{
+	if (wdd->id != watchdog_id)
+		return;
+
+	if (open_pid && current->pid != open_pid) {
+		da_handle_init_run_event_safe_wtd(other_threads_safe_wtd);
+		return;
+	}
+	da_handle_event_safe_wtd(other_threads_safe_wtd);
+}
+
+static void blocked_events_timeout(void *data, struct watchdog_device *wdd, u64 timeout)
+{
+	blocked_events(data, wdd);
+}
+
+static void handle_ping(void *data, struct watchdog_device *wdd)
+{
+	char msg[128];
+	unsigned int timeout;
+
+	if (wdd->id != watchdog_id)
+		return;
+
+	if (open_pid && current->pid != open_pid) {
+		da_handle_init_run_event_safe_wtd(other_threads_safe_wtd);
+		return;
+	}
+
+	da_handle_event_safe_wtd(ping_safe_wtd);
+
+	if (!check_timeout)
+		return;
+
+	if (wdd->ops->get_timeleft) {
+		timeout = wdd->ops->get_timeleft(wdd);
+		if (timeout > last_timeout_set) {
+			snprintf(msg, 128,
+				 "watchdog timeout is %u > than previously set (%d)\n",
+				 timeout, last_timeout_set);
+			cond_react(msg);
+		}
+	} else {
+		snprintf(msg, 128, "error getting timeout: option not supported\n");
+		cond_react(msg);
+	}
+}
+
+static void handle_set_safe_timeout(void *data, struct watchdog_device *wdd, u64 timeout)
+{
+	char msg[128];
+
+	if (wdd->id != watchdog_id)
+		return;
+
+	if (open_pid && current->pid != open_pid) {
+		da_handle_init_run_event_safe_wtd(other_threads_safe_wtd);
+		return;
+	}
+
+	da_handle_event_safe_wtd(set_safe_timeout_safe_wtd);
+
+	if (timeout > safe_timeout) {
+		snprintf(msg, 128, "set safety timeout is too high: %d", (int) timeout);
+		cond_react(msg);
+	}
+
+	if (check_timeout)
+		last_timeout_set = timeout;
+}
+
+static void handle_start(void *data, struct watchdog_device *wdd)
+{
+	if (wdd->id != watchdog_id)
+		return;
+
+	if (open_pid && current->pid != open_pid) {
+		da_handle_init_run_event_safe_wtd(other_threads_safe_wtd);
+		return;
+	}
+
+	da_handle_event_safe_wtd(start_safe_wtd);
+}
+
+static void handle_stop(void *data, struct watchdog_device *wdd)
+{
+	if (wdd->id != watchdog_id)
+		return;
+
+	if (open_pid && current->pid != open_pid) {
+		da_handle_init_run_event_safe_wtd(other_threads_safe_wtd);
+		return;
+	}
+
+	da_handle_event_safe_wtd(stop_safe_wtd);
+}
+
+static int mon_started = 0;
+
+static int start_safe_wtd_monitor(void)
+{
+	int retval;
+
+	if (!mon_started || reset_on_restart) {
+		retval = da_monitor_init_safe_wtd();
+		if (retval)
+			return retval;
+
+		mon_started = 1;
+	}
+
+	rv_attach_trace_probe("safe_wtd", watchdog_close, handle_close);
+	rv_attach_trace_probe("safe_wtd", watchdog_nowayout, handle_nowayout);
+	rv_attach_trace_probe("safe_wtd", watchdog_open, handle_open);
+	rv_attach_trace_probe("safe_wtd", watchdog_ping, handle_ping);
+	rv_attach_trace_probe("safe_wtd", watchdog_set_timeout, handle_set_safe_timeout);
+	rv_attach_trace_probe("safe_wtd", watchdog_start, handle_start);
+	rv_attach_trace_probe("safe_wtd", watchdog_stop, handle_stop);
+	rv_attach_trace_probe("safe_wtd", watchdog_set_keep_alive, blocked_events_timeout);
+	rv_attach_trace_probe("safe_wtd", watchdog_keep_alive, blocked_events);
+	rv_attach_trace_probe("safe_wtd", watchdog_set_pretimeout, blocked_events_timeout);
+	rv_attach_trace_probe("safe_wtd", watchdog_pretimeout, blocked_events);
+
+	return 0;
+}
+
+static void stop_safe_wtd_monitor(void)
+{
+	if (dont_stop)
+		cond_react("dont_stop safe_wtd is set.");
+
+	rv_safe_wtd.enabled = 0;
+
+	rv_detach_trace_probe("safe_wtd", watchdog_close, handle_close);
+	rv_detach_trace_probe("safe_wtd", watchdog_nowayout, handle_nowayout);
+	rv_detach_trace_probe("safe_wtd", watchdog_open, handle_open);
+	rv_detach_trace_probe("safe_wtd", watchdog_ping, handle_ping);
+	rv_detach_trace_probe("safe_wtd", watchdog_set_timeout, handle_set_safe_timeout);
+	rv_detach_trace_probe("safe_wtd", watchdog_start, handle_start);
+	rv_detach_trace_probe("safe_wtd", watchdog_stop, handle_stop);
+	rv_detach_trace_probe("safe_wtd", watchdog_set_keep_alive, blocked_events_timeout);
+	rv_detach_trace_probe("safe_wtd", watchdog_keep_alive, blocked_events);
+	rv_detach_trace_probe("safe_wtd", watchdog_set_pretimeout, blocked_events_timeout);
+	rv_detach_trace_probe("safe_wtd", watchdog_pretimeout, blocked_events);
+
+	da_monitor_destroy_safe_wtd();
+}
+
+/*
+ * This is the monitor register section.
+ */
+struct rv_monitor rv_safe_wtd = {
+	.name = "safe_wtd",
+	.description = "A watchdog monitor guarding a safety monitor actions",
+	.start = start_safe_wtd_monitor,
+	.stop = stop_safe_wtd_monitor,
+	.reset = da_monitor_reset_all_safe_wtd,
+	.enabled = 0,
+};
+
+int register_safe_wtd(void)
+{
+	rv_register_monitor(&rv_safe_wtd);
+	return 0;
+}
+
+void unregister_safe_wtd(void)
+{
+	if (rv_safe_wtd.enabled)
+		stop_safe_wtd_monitor();
+
+	rv_unregister_monitor(&rv_safe_wtd);
+}
+
+module_init(register_safe_wtd);
+module_exit(unregister_safe_wtd);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Daniel Bristot de Oliveira <bristot@kernel.org>");
+MODULE_DESCRIPTION("Safe watchdog RV monitor");
diff --git a/kernel/trace/rv/monitors/safe_wtd/safe_wtd.h b/kernel/trace/rv/monitors/safe_wtd/safe_wtd.h
new file mode 100644
index 000000000000..835c9d0979f6
--- /dev/null
+++ b/kernel/trace/rv/monitors/safe_wtd/safe_wtd.h
@@ -0,0 +1,84 @@ 
+enum states_safe_wtd {
+	init_safe_wtd = 0,
+	closed_running_safe_wtd,
+	closed_running_nwo_safe_wtd,
+	nwo_safe_wtd,
+	opened_safe_wtd,
+	opened_nwo_safe_wtd,
+	reopened_safe_wtd,
+	safe_safe_wtd,
+	safe_nwo_safe_wtd,
+	set_safe_wtd,
+	set_nwo_safe_wtd,
+	started_safe_wtd,
+	started_nwo_safe_wtd,
+	stoped_safe_wtd,
+	state_max_safe_wtd
+};
+
+enum events_safe_wtd {
+	close_safe_wtd = 0,
+	nowayout_safe_wtd,
+	open_safe_wtd,
+	other_threads_safe_wtd,
+	ping_safe_wtd,
+	set_safe_timeout_safe_wtd,
+	start_safe_wtd,
+	stop_safe_wtd,
+	event_max_safe_wtd
+};
+
+struct automaton_safe_wtd {
+	char *state_names[state_max_safe_wtd];
+	char *event_names[event_max_safe_wtd];
+	char function[state_max_safe_wtd][event_max_safe_wtd];
+	char initial_state;
+	char final_states[state_max_safe_wtd];
+};
+
+struct automaton_safe_wtd automaton_safe_wtd = {
+	.state_names = {
+		"init",
+		"closed_running",
+		"closed_running_nwo",
+		"nwo",
+		"opened",
+		"opened_nwo",
+		"reopened",
+		"safe",
+		"safe_nwo",
+		"set",
+		"set_nwo",
+		"started",
+		"started_nwo",
+		"stoped"
+	},
+	.event_names = {
+		"close",
+		"nowayout",
+		"open",
+		"other_threads",
+		"ping",
+		"set_safe_timeout",
+		"start",
+		"stop"
+	},
+	.function = {
+		{                          -1,                nwo_safe_wtd,             opened_safe_wtd,               init_safe_wtd,                          -1,                          -1,                          -1,                          -1 },
+		{                          -1, closed_running_nwo_safe_wtd,           reopened_safe_wtd,     closed_running_safe_wtd,                          -1,                          -1,                          -1,                          -1 },
+		{                          -1, closed_running_nwo_safe_wtd,        started_nwo_safe_wtd, closed_running_nwo_safe_wtd,                          -1,                          -1,                          -1,                          -1 },
+		{                          -1,                nwo_safe_wtd,         opened_nwo_safe_wtd,                nwo_safe_wtd,                          -1,                          -1,                          -1,                          -1 },
+		{               init_safe_wtd,                          -1,                          -1,                          -1,                          -1,                          -1,            started_safe_wtd,                          -1 },
+		{                nwo_safe_wtd,                          -1,                          -1,                          -1,                          -1,                          -1,        started_nwo_safe_wtd,                          -1 },
+		{     closed_running_safe_wtd,                          -1,                          -1,                          -1,                          -1,                set_safe_wtd,                          -1,             opened_safe_wtd },
+		{     closed_running_safe_wtd,                          -1,                          -1,                          -1,               safe_safe_wtd,                          -1,                          -1,             stoped_safe_wtd },
+		{ closed_running_nwo_safe_wtd,                          -1,                          -1,                          -1,           safe_nwo_safe_wtd,                          -1,                          -1,                          -1 },
+		{                          -1,                          -1,                          -1,                          -1,               safe_safe_wtd,                          -1,                          -1,                          -1 },
+		{                          -1,                          -1,                          -1,                          -1,           safe_nwo_safe_wtd,                          -1,                          -1,                          -1 },
+		{     closed_running_safe_wtd,                          -1,                          -1,                          -1,                          -1,                set_safe_wtd,                          -1,             stoped_safe_wtd },
+		{ closed_running_nwo_safe_wtd,                          -1,                          -1,                          -1,                          -1,            set_nwo_safe_wtd,                          -1,                          -1 },
+		{               init_safe_wtd,                          -1,                          -1,                          -1,                          -1,                          -1,                          -1,                          -1 },
+	},
+	.initial_state = init_safe_wtd,
+	.final_states = { 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 },
+};