===============================================
The Linux WatchDog Timer Driver Core kernel API
===============================================

Last reviewed: 12-Feb-2013

Wim Van Sebroeck <wim@iguana.be>

Introduction
------------
This document does not describe what a WatchDog Timer (WDT) Driver or Device is.
It also does not describe the API which can be used by user space to communicate
with a WatchDog Timer. If you want to know this then please read the following
file: Documentation/watchdog/watchdog-api.rst.

So what does this document describe? It describes the API that can be used by
WatchDog Timer Drivers that want to use the WatchDog Timer Driver Core
Framework. This framework provides all interfacing towards user space so that
the same code does not have to be reproduced each time. This also means that
a watchdog timer driver then only needs to provide the different routines
(operations) that control the watchdog timer (WDT).

The API
-------
Each watchdog timer driver that wants to use the WatchDog Timer Driver Core
must #include <linux/watchdog.h> (you would have to do this anyway when
writing a watchdog device driver). This include file contains the following
register/unregister routines::

	extern int watchdog_register_device(struct watchdog_device *);
	extern void watchdog_unregister_device(struct watchdog_device *);

The watchdog_register_device routine registers a watchdog timer device.
The parameter of this routine is a pointer to a watchdog_device structure.
This routine returns zero on success and a negative errno code for failure.

The watchdog_unregister_device routine deregisters a registered watchdog timer
device. The parameter of this routine is the pointer to the registered
watchdog_device structure.

The watchdog subsystem includes a registration deferral mechanism,
which allows you to register a watchdog as early as you wish during
the boot process.
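
The register/unregister calls above can be sketched as a small userspace
model. The struct definitions and the two routine bodies below are simplified
stand-ins so that the sketch compiles outside the kernel (a real driver gets
them from <linux/watchdog.h>). The demo_* names, the "registered" flag and the
-EINVAL check inside the stand-in are assumptions made for illustration::

	#include <errno.h>
	#include <stddef.h>

	/* Stand-ins for the kernel types (assumption: heavily simplified). */
	struct watchdog_device;
	struct watchdog_info { const char *identity; };
	struct watchdog_ops { int (*start)(struct watchdog_device *); };
	struct watchdog_device {
		int id;
		const struct watchdog_info *info;
		const struct watchdog_ops *ops;
		unsigned int timeout;
	};

	static int registered;	/* model state: is a device registered? */

	/* Hypothetical stand-in: rejects incomplete devices, else assigns id 0. */
	static int watchdog_register_device(struct watchdog_device *wdd)
	{
		if (!wdd || !wdd->info || !wdd->ops || !wdd->ops->start)
			return -EINVAL;
		wdd->id = 0;
		registered = 1;
		return 0;
	}

	/* Hypothetical stand-in: only clears the model state. */
	static void watchdog_unregister_device(struct watchdog_device *wdd)
	{
		(void)wdd;
		registered = 0;
	}

	static int demo_start(struct watchdog_device *wdd)
	{
		(void)wdd;	/* a real driver would program the hardware here */
		return 0;
	}

	static const struct watchdog_info demo_info = { .identity = "demo-wdt" };
	static const struct watchdog_ops demo_ops = { .start = demo_start };
	static struct watchdog_device demo_wdd = {
		.info = &demo_info,
		.ops = &demo_ops,
		.timeout = 30,	/* default timeout in seconds */
	};

	/* Register the device and pass a negative errno back to the caller. */
	static int demo_probe(void)
	{
		int ret = watchdog_register_device(&demo_wdd);

		if (ret < 0)
			return ret;
		return 0;
	}

	/* Undo the registration with the same watchdog_device pointer. */
	static void demo_remove(void)
	{
		watchdog_unregister_device(&demo_wdd);
	}

In a real driver, demo_probe() and demo_remove() would roughly correspond to
the driver's probe and remove callbacks.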

The watchdog device structure looks like this::

  struct watchdog_device {
	int id;
	struct device *parent;
	const struct attribute_group **groups;
	const struct watchdog_info *info;
	const struct watchdog_ops *ops;
	const struct watchdog_governor *gov;
	unsigned int bootstatus;
	unsigned int timeout;
	unsigned int pretimeout;
	unsigned int min_timeout;
	unsigned int max_timeout;
	unsigned int min_hw_heartbeat_ms;
	unsigned int max_hw_heartbeat_ms;
	struct notifier_block reboot_nb;
	struct notifier_block restart_nb;
	void *driver_data;
	struct watchdog_core_data *wd_data;
	unsigned long status;
	struct list_head deferred;
  };

It contains the following fields:

* id: set automatically by watchdog_register_device. id 0 is special: it has
  both a /dev/watchdog0 cdev (dynamic major, minor 0) and the old
  /dev/watchdog miscdev.
* parent: set this to the parent device (or NULL) before calling
  watchdog_register_device.
* groups: List of sysfs attribute groups to create when creating the watchdog
  device.
* info: a pointer to a watchdog_info structure. This structure gives some
  additional information about the watchdog timer itself (like its unique
  name).
* ops: a pointer to the list of watchdog operations that the watchdog supports.
* gov: a pointer to the assigned watchdog device pretimeout governor or NULL.
* timeout: the watchdog timer's timeout value (in seconds).
  If WDOG_ACTIVE is set, this is the time after which the system will reboot
  if user space does not send a heartbeat request.
* pretimeout: the watchdog timer's pretimeout value (in seconds).
* min_timeout: the watchdog timer's minimum timeout value (in seconds).
  If set, the minimum configurable value for 'timeout'.
* max_timeout: the watchdog timer's maximum timeout value (in seconds),
  as seen from userspace. If set, the maximum configurable value for
  'timeout'. Not used if max_hw_heartbeat_ms is non-zero.
* min_hw_heartbeat_ms: Hardware limit for minimum time between heartbeats,
  in milliseconds. This value is normally 0; it should only be provided
  if the hardware cannot tolerate shorter intervals between heartbeats.
* max_hw_heartbeat_ms: Maximum hardware heartbeat, in milliseconds.
  If set, the infrastructure will send heartbeats to the watchdog driver
  if 'timeout' is larger than max_hw_heartbeat_ms, unless WDOG_ACTIVE
  is set and userspace failed to send a heartbeat for at least 'timeout'
  seconds. max_hw_heartbeat_ms must be set if a driver does not implement
  the stop function.
* reboot_nb: notifier block that is registered for reboot notifications, for
  internal use only. If the driver calls watchdog_stop_on_reboot, watchdog core
  will stop the watchdog on such notifications.
* restart_nb: notifier block that is registered for machine restart, for
  internal use only. If a watchdog is capable of restarting the machine, it
  should define ops->restart. Priority can be changed through
  watchdog_set_restart_priority.
* bootstatus: status of the device after booting (reported with watchdog
  WDIOF_* status bits).
* driver_data: a pointer to the driver's private data of a watchdog device.
  This data should only be accessed via the watchdog_set_drvdata and
  watchdog_get_drvdata routines.
* wd_data: a pointer to watchdog core internal data.
* status: this field contains a number of status bits that give extra
  information about the status of the device (for example: is the watchdog
  timer running/active, or is the nowayout bit set).
* deferred: entry in wtd_deferred_reg_list which is used to
  register early initialized watchdogs.

The list of watchdog operations is defined as::

  struct watchdog_ops {
	struct module *owner;
	/* mandatory operations */
	int (*start)(struct watchdog_device *);
	/* optional operations */
	int (*stop)(struct watchdog_device *);
	int (*ping)(struct watchdog_device *);
	unsigned int (*status)(struct watchdog_device *);
	int (*set_timeout)(struct watchdog_device *, unsigned int);
	int (*set_pretimeout)(struct watchdog_device *, unsigned int);
	unsigned int (*get_timeleft)(struct watchdog_device *);
	int (*restart)(struct watchdog_device *);
	long (*ioctl)(struct watchdog_device *, unsigned int, unsigned long);
  };

It is important that you first define the module owner of the watchdog timer
driver's operations. This module owner will be used to lock the module when
the watchdog is active. (This is to avoid a system crash when you unload the
module and /dev/watchdog is still open).

Some operations are mandatory and some are optional. The mandatory operations
are:

* start: this is a pointer to the routine that starts the watchdog timer
  device.
  The routine needs a pointer to the watchdog timer device structure as a
  parameter. It returns zero on success or a negative errno code for failure.

Not all watchdog timer hardware supports the same functionality. That's why
all other routines/operations are optional. They only need to be provided if
they are supported. These optional routines/operations are:

* stop: this routine stops the watchdog timer device.

  The routine needs a pointer to the watchdog timer device structure as a
  parameter. It returns zero on success or a negative errno code for failure.
  Some watchdog timer hardware can only be started and not be stopped. A
  driver supporting such hardware does not have to implement the stop routine.

  If a driver has no stop function, the watchdog core will set WDOG_HW_RUNNING
  and start calling the driver's keepalive ping function after the watchdog
  device is closed.

  If a watchdog driver does not implement the stop function, it must set
  max_hw_heartbeat_ms.
* ping: this is the routine that sends a keepalive ping to the watchdog timer
  hardware.

  The routine needs a pointer to the watchdog timer device structure as a
  parameter. It returns zero on success or a negative errno code for failure.

  Most hardware that does not support this as a separate function uses the
  start function to restart the watchdog timer hardware. And that's also what
  the watchdog timer driver core does: to send a keepalive ping to the watchdog
  timer hardware it will either use the ping operation (when available) or the
  start operation (when the ping operation is not available).

  (Note: the WDIOC_KEEPALIVE ioctl call will only be active when the
  WDIOF_KEEPALIVEPING bit has been set in the options field of the watchdog's
  info structure).
* status: this routine checks the status of the watchdog timer device. The
  status of the device is reported with watchdog WDIOF_* status flags/bits.

  WDIOF_MAGICCLOSE and WDIOF_KEEPALIVEPING are reported by the watchdog core;
  it is not necessary to report those bits from the driver. Also, if no status
  function is provided by the driver, the watchdog core reports the status bits
  provided in the bootstatus variable of struct watchdog_device.

* set_timeout: this routine checks and changes the timeout of the watchdog
  timer device. It returns 0 on success, -EINVAL for "parameter out of range"
  and -EIO for "could not write value to the watchdog". On success this
  routine should set the timeout value of the watchdog_device to the
  achieved timeout value (which may be different from the requested one
  because the watchdog does not necessarily have a 1 second resolution).

  Drivers implementing max_hw_heartbeat_ms set the hardware watchdog heartbeat
  to the minimum of timeout and max_hw_heartbeat_ms. Those drivers set the
  timeout value of the watchdog_device either to the requested timeout value
  (if it is larger than max_hw_heartbeat_ms), or to the achieved timeout value.
  (Note: the WDIOF_SETTIMEOUT bit needs to be set in the options field of the
  watchdog's info structure).

  If the watchdog driver does not have to perform any action but setting the
  watchdog_device.timeout, this callback can be omitted.

  If set_timeout is not provided but WDIOF_SETTIMEOUT is set, the watchdog
  infrastructure updates the timeout value of the watchdog_device internally
  to the requested value.

  If the pretimeout feature is used (WDIOF_PRETIMEOUT), then set_timeout must
  also take care of checking if pretimeout is still valid and set up the timer
  accordingly. This can't be done in the core without races, so it is the
  duty of the driver.
* set_pretimeout: this routine checks and changes the pretimeout value of
  the watchdog. It is optional because not all watchdogs support pretimeout
  notification. The pretimeout value is not an absolute time, but the number
  of seconds before the actual timeout would happen. It returns 0 on success,
  -EINVAL for "parameter out of range" and -EIO for "could not write value to
  the watchdog". A value of 0 disables pretimeout notification.

  (Note: the WDIOF_PRETIMEOUT bit needs to be set in the options field of the
  watchdog's info structure).

  If the watchdog driver does not have to perform any action but setting the
  watchdog_device.pretimeout, this callback can be omitted. That means if
  set_pretimeout is not provided but WDIOF_PRETIMEOUT is set, the watchdog
  infrastructure updates the pretimeout value of the watchdog_device internally
  to the requested value.

* get_timeleft: this routine returns the time that's left before a reset.
* restart: this routine restarts the machine. It returns 0 on success or a
  negative errno code for failure.
* ioctl: if this routine is present then it will be called first before we do
  our own internal ioctl call handling. This routine should return -ENOIOCTLCMD
  if a command is not supported. The parameters that are passed to the ioctl
  call are: watchdog_device, cmd and arg.
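
The keepalive behaviour described for the ping operation (use ping when the
driver provides it, otherwise fall back to start) can be sketched as a
userspace model. All model_* and demo_* names are hypothetical stand-ins, and
the call counters exist only to make the behaviour observable::

	#include <stddef.h>

	struct model_wdd;

	/* Reduced operations table (stand-in for struct watchdog_ops). */
	struct model_ops {
		int (*start)(struct model_wdd *);	/* mandatory */
		int (*ping)(struct model_wdd *);	/* optional, may be NULL */
	};

	struct model_wdd {
		const struct model_ops *ops;
		int starts;	/* number of start calls seen */
		int pings;	/* number of ping calls seen */
	};

	static int demo_start(struct model_wdd *w)
	{
		w->starts++;
		return 0;
	}

	static int demo_ping(struct model_wdd *w)
	{
		w->pings++;
		return 0;
	}

	/* Send a keepalive: prefer ping, fall back to start if ping is absent. */
	static int model_keepalive(struct model_wdd *w)
	{
		if (w->ops->ping)
			return w->ops->ping(w);
		return w->ops->start(w);
	}

	static const struct model_ops ops_with_ping = {
		.start = demo_start,
		.ping = demo_ping,
	};

	static const struct model_ops ops_start_only = {
		.start = demo_start,
	};

A driver whose hardware is re-armed by simply starting it again can therefore
leave the ping pointer unset and still be kept alive through start.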

The status bits should (preferably) be set with set_bit, clear_bit and
similar bit operations. The status bits that are defined are:

* WDOG_ACTIVE: this status bit indicates whether a watchdog timer device
  is active from the user's perspective. User space is expected to send
  heartbeat requests to the driver while this flag is set.
* WDOG_NO_WAY_OUT: this bit stores the nowayout setting for the watchdog.
  If this bit is set then the watchdog timer will not be able to stop.
* WDOG_HW_RUNNING: Set by the watchdog driver if the hardware watchdog is
  running. The bit must be set if the watchdog timer hardware cannot be
  stopped. The bit may also be set if the watchdog timer is running after
  booting, before the watchdog device is opened. If set, the watchdog
  infrastructure will send keepalives to the watchdog hardware while
  WDOG_ACTIVE is not set.
  Note: when you register the watchdog timer device with this bit set,
  then opening /dev/watchdog will skip the start operation but send a keepalive
  request instead.

  To set the WDOG_NO_WAY_OUT status bit (before registering your watchdog
  timer device) you can either:

  * set it statically in your watchdog_device struct with::

	.status = WATCHDOG_NOWAYOUT_INIT_STATUS,

    (this will set the value the same as CONFIG_WATCHDOG_NOWAYOUT) or
  * use the following helper function::

	static inline void watchdog_set_nowayout(struct watchdog_device *wdd,
						 int nowayout)

Note:
   The WatchDog Timer Driver Core supports the magic close feature and
   the nowayout feature. To use the magic close feature you must set the
   WDIOF_MAGICCLOSE bit in the options field of the watchdog's info structure.

The nowayout feature will overrule the magic close feature.
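
The precedence of nowayout over magic close can be sketched as a userspace
model. The bit position, the struct and the model_* names are hypothetical,
and the assumption that a zero nowayout argument leaves the bit untouched is
made only for this sketch::

	#include <stdbool.h>

	/* Illustrative bit position; the real one comes from <linux/watchdog.h>. */
	#define MODEL_WDOG_NO_WAY_OUT	1

	struct model_wdd {
		unsigned long status;
		bool magicclose;	/* models WDIOF_MAGICCLOSE in info->options */
	};

	/* Model of the helper: a non-zero argument sets the status bit. */
	static void model_set_nowayout(struct model_wdd *w, int nowayout)
	{
		if (nowayout)
			w->status |= 1UL << MODEL_WDOG_NO_WAY_OUT;
	}

	/*
	 * May the watchdog be stopped when user space closes the device?
	 * magic_seen tells whether user space performed the magic close.
	 */
	static bool model_may_stop_on_close(const struct model_wdd *w,
					    bool magic_seen)
	{
		if (w->status & (1UL << MODEL_WDOG_NO_WAY_OUT))
			return false;		/* nowayout overrules magic close */
		if (w->magicclose)
			return magic_seen;	/* magic close is required */
		return true;			/* no restriction configured */
	}

In this model, once nowayout is set, a correct magic close no longer allows
the watchdog to be stopped.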

To get or set driver specific data the following two helper functions should be
used::

  static inline void watchdog_set_drvdata(struct watchdog_device *wdd,
					  void *data)
  static inline void *watchdog_get_drvdata(struct watchdog_device *wdd)

The watchdog_set_drvdata function allows you to add driver specific data. The
arguments of this function are the watchdog device where you want to add the
driver specific data to and a pointer to the data itself.

The watchdog_get_drvdata function allows you to retrieve driver specific data.
The argument of this function is the watchdog device where you want to retrieve
data from. The function returns the pointer to the driver specific data.
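
A typical use of these two helpers is to reach the driver's private state
from inside an operation. The following is a userspace sketch: the reduced
struct watchdog_device and the helper bodies are stand-ins so that the code
compiles outside the kernel, and struct demo_priv with its fields is a
hypothetical example::

	#include <stddef.h>

	/* Stand-in with only the field the helpers need. */
	struct watchdog_device {
		void *driver_data;
	};

	/* Stand-in bodies for the helpers declared in <linux/watchdog.h>. */
	static inline void watchdog_set_drvdata(struct watchdog_device *wdd,
						void *data)
	{
		wdd->driver_data = data;
	}

	static inline void *watchdog_get_drvdata(struct watchdog_device *wdd)
	{
		return wdd->driver_data;
	}

	/* Hypothetical per-device state of a driver. */
	struct demo_priv {
		int kicks;			/* keepalives sent so far */
		struct watchdog_device wdd;
	};

	/* An operation only receives wdd; the private data is looked up. */
	static int demo_ping(struct watchdog_device *wdd)
	{
		struct demo_priv *priv = watchdog_get_drvdata(wdd);

		priv->kicks++;
		return 0;
	}

	/* Attach the private data before the device would be registered. */
	static void demo_setup(struct demo_priv *priv)
	{
		priv->kicks = 0;
		watchdog_set_drvdata(&priv->wdd, priv);
	}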

To initialize the timeout field, the following function can be used::

  extern int watchdog_init_timeout(struct watchdog_device *wdd,
                                   unsigned int timeout_parm,
                                   struct device *dev);

The watchdog_init_timeout function allows you to initialize the timeout field
using the module timeout parameter or by retrieving the timeout-sec property from
the device tree (if the module timeout parameter is invalid). Best practice is
to set the default timeout value as timeout value in the watchdog_device and
then use this function to set the user "preferred" timeout value.
This routine returns zero on success and a negative errno code for failure.
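
The selection order described above (module parameter first, then the
timeout-sec property, otherwise the driver default stays in place) can be
sketched as a userspace model. The model_* names are hypothetical, the device
tree value is passed in directly instead of being read from a node, and
treating 0 as "not supplied" is an assumption of this sketch::

	#include <errno.h>
	#include <stdbool.h>

	struct model_wdd {
		unsigned int timeout;		/* seconds, preset to the default */
		unsigned int min_timeout;	/* seconds, 0 = not set */
		unsigned int max_timeout;	/* seconds, 0 = not set */
	};

	static bool model_timeout_invalid(const struct model_wdd *w, unsigned int t)
	{
		return t < w->min_timeout || (w->max_timeout && t > w->max_timeout);
	}

	/*
	 * timeout_parm:   module parameter, 0 = not supplied (assumption)
	 * dt_timeout_sec: timeout-sec property, 0 = absent (assumption)
	 */
	static int model_init_timeout(struct model_wdd *w,
				      unsigned int timeout_parm,
				      unsigned int dt_timeout_sec)
	{
		int ret = 0;

		if (timeout_parm) {
			if (!model_timeout_invalid(w, timeout_parm)) {
				w->timeout = timeout_parm;
				return 0;
			}
			ret = -EINVAL;	/* bad parameter, try the property next */
		}

		if (dt_timeout_sec) {
			if (!model_timeout_invalid(w, dt_timeout_sec)) {
				w->timeout = dt_timeout_sec;
				return 0;
			}
			ret = -EINVAL;
		}

		return ret;	/* w->timeout still holds the driver default */
	}

Because the default is stored in the timeout field before the call, a failed
lookup leaves the device with a usable value, which is the reason for the
best practice mentioned above.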

To disable the watchdog on reboot, the user must call the following helper::

  static inline void watchdog_stop_on_reboot(struct watchdog_device *wdd);

To disable the watchdog when unregistering the watchdog, the user must call
the following helper. Note that this will only stop the watchdog if the
nowayout flag is not set.

::

  static inline void watchdog_stop_on_unregister(struct watchdog_device *wdd);

To change the priority of the restart handler the following helper should be
used::

  void watchdog_set_restart_priority(struct watchdog_device *wdd, int priority);

Users should follow these guidelines for setting the priority:

* 0: should be called in last resort, has limited restart capabilities
* 128: default restart handler, use if no other handler is expected to be
  available, and/or if restart is sufficient to restart the entire system
* 255: highest priority, will preempt all other restart handlers

To raise a pretimeout notification, the following function should be used::

  void watchdog_notify_pretimeout(struct watchdog_device *wdd)

The function can be called in interrupt context. If the watchdog pretimeout
governor framework (kbuild CONFIG_WATCHDOG_PRETIMEOUT_GOV symbol) is enabled,
an action is taken by a preconfigured pretimeout governor preassigned to
the watchdog device. If the watchdog pretimeout governor framework is not
enabled, watchdog_notify_pretimeout() prints a notification message to
the kernel log buffer.

To set the last known HW keepalive time for a watchdog, the following function
should be used::

  int watchdog_set_last_hw_keepalive(struct watchdog_device *wdd,
                                     unsigned int last_ping_ms)

This function must be called immediately after watchdog registration. It
sets the last known hardware heartbeat to have happened last_ping_ms before
current time. Calling this is only needed if the watchdog is already running
when probe is called, and the watchdog can only be pinged after the
min_hw_heartbeat_ms time has passed from the last ping.
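
The timing consequence of last_ping_ms can be sketched with a small userspace
helper. The function name is hypothetical and the calculation only restates
the rule above (no ping before min_hw_heartbeat_ms has passed since the last
one); it is not taken from the core's implementation::

	/*
	 * Milliseconds that still have to pass before the hardware may be
	 * pinged again, given that the last ping happened last_ping_ms ago.
	 */
	static unsigned int model_ms_until_ping_allowed(
					unsigned int min_hw_heartbeat_ms,
					unsigned int last_ping_ms)
	{
		if (last_ping_ms >= min_hw_heartbeat_ms)
			return 0;	/* the minimum interval has already passed */
		return min_hw_heartbeat_ms - last_ping_ms;
	}

For example, with min_hw_heartbeat_ms = 500 and a boot loader ping that
happened 200 ms before probe, the next ping has to wait another 300 ms.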