What:		/dev/kmsg
Date:		May 2012
KernelVersion:	3.5
Contact:	Kay Sievers <kay@vrfy.org>
Description:	The /dev/kmsg character device node provides userspace access
		to the kernel's printk buffer.

		Injecting messages:
		Every write() to the opened device node places a log entry in
		the kernel's printk buffer.

		The logged line can be prefixed with a <N> syslog prefix, which
		carries the syslog priority and facility. The single decimal
		prefix number encodes the syslog priority in its 3 lowest bits
		and the syslog facility number in the next 8 bits.
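
		The bit layout above can be sketched as follows. This is a
		minimal illustration; the helper names encode_prefix() and
		decode_prefix() are hypothetical and only the bit layout is
		taken from this description.

```python
PRIORITY_BITS = 3   # the 3 lowest bits carry the priority
FACILITY_BITS = 8   # the next 8 bits carry the facility


def encode_prefix(facility, priority):
    """Build the "<N>" prefix string for a facility/priority pair."""
    if not 0 <= priority < (1 << PRIORITY_BITS):
        raise ValueError("priority must fit in 3 bits")
    if not 0 <= facility < (1 << FACILITY_BITS):
        raise ValueError("facility must fit in 8 bits")
    return "<%d>" % ((facility << PRIORITY_BITS) | priority)


def decode_prefix(number):
    """Split a decimal prefix number into (facility, priority)."""
    return number >> PRIORITY_BITS, number & ((1 << PRIORITY_BITS) - 1)


print(encode_prefix(1, 6))   # facility 1 (LOG_USER), priority 6 -> <14>
print(decode_prefix(30))     # -> (3, 6)
```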

		If no prefix is given, the priority number is the default kernel
		log priority and the facility number is set to LOG_USER (1). It
		is not possible to inject messages from userspace with the
		facility number LOG_KERN (0), to make sure that the origin of
		the messages can always be reliably determined.

		Accessing the buffer:
		Every read() from the opened device node receives one record
		of the kernel's printk buffer.

		The first read() directly following an open() always returns
		the first message in the buffer. There is no kernel-internal
		persistent state; many readers can concurrently open the device
		and read from it without affecting other readers.

		Every read() will receive the next available record. If no more
		records are available, read() will block, or return -EAGAIN if
		O_NONBLOCK is used.

		Messages in the record ring buffer get overwritten as a whole;
		read() never receives partial messages.

		In case messages get overwritten in the circular buffer while
		the device is kept open, the next read() will return -EPIPE,
		and the seek position will be updated to the next available
		record. Subsequent read() calls will return available records
		again.
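
		A reader loop following the rules above might look like this
		sketch. The function name read_records(), the injectable
		"read" parameter (present only so the loop can be exercised
		without the device) and the 8192 byte buffer size are
		assumptions of the example, not part of the interface.

```python
import os


def read_records(fd, read=os.read, bufsize=8192):
    """Yield raw records from a descriptor opened with O_NONBLOCK."""
    while True:
        try:
            data = read(fd, bufsize)
        except BrokenPipeError:
            # -EPIPE: records were overwritten while the device was open.
            # The position already points at the next available record,
            # so the read is simply retried.
            continue
        except BlockingIOError:
            # -EAGAIN: no more records are available right now.
            return
        yield data


# Possible usage on a system where the device is readable:
#   fd = os.open("/dev/kmsg", os.O_RDONLY | os.O_NONBLOCK)
#   for record in read_records(fd):
#       print(record.decode(errors="replace"), end="")
```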

		Unlike the classic syslog() interface, the 64 bit record
		sequence numbers make it possible to calculate the number of
		lost messages in case the buffer gets overwritten. They also
		make it possible to reconnect to the buffer and reconstruct the
		read position if needed, without limiting the interface to a
		single reader.
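
		As a small worked example of the loss calculation, assuming
		that sequence numbers increase by one per record, a reader
		could compare the last sequence number it saw with the next
		one it receives. The helper name lost_records() is
		hypothetical.

```python
def lost_records(last_seen_seq, next_seq):
    """Count the records missed between two sequence numbers read in order."""
    return max(0, next_seq - last_seen_seq - 1)


print(lost_records(160, 161))   # consecutive records -> 0
print(lost_records(160, 339))   # -> 178
```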

		The device supports seek with the following parameters:
		SEEK_SET, 0
		  seek to the first entry in the buffer
		SEEK_END, 0
		  seek after the last entry in the buffer
		SEEK_DATA, 0
		  seek after the last record available at the time
		  the last SYSLOG_ACTION_CLEAR was issued.

		Due to the record nature of this interface with a "read all"
		behavior and the specific positions each seek operation sets,
		SEEK_CUR is not supported; a SEEK_CUR request with a zero
		offset fails with -EINVAL.
638ece3b3eSBruno Meneguele
64bc885f1aSBruno Meneguele		Other seek operations or offsets are not supported because of
65bc885f1aSBruno Meneguele		the special behavior this device has. The device allows to read
66bc885f1aSBruno Meneguele		or write only whole variable length messages (records) that are
67bc885f1aSBruno Meneguele		stored in a ring buffer.
68bc885f1aSBruno Meneguele
69bc885f1aSBruno Meneguele		Because of the non-standard behavior also the error values are
70bc885f1aSBruno Meneguele		non-standard. -ESPIPE is returned for non-zero offset. -EINVAL
71bc885f1aSBruno Meneguele		is returned for other operations, e.g. SEEK_CUR. This behavior
72bc885f1aSBruno Meneguele		and values are historical and could not be modified without the
73bc885f1aSBruno Meneguele		risk of breaking userspace.
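
		The supported seek calls can be wrapped as in the sketch
		below. The wrapper name kmsg_seek() is hypothetical, and the
		fallback value 3 for SEEK_DATA is the Linux value, used only
		if the Python os module does not export the constant.

```python
import os

# SEEK_DATA is 3 on Linux; fall back to it if os lacks the constant.
SEEK_DATA = getattr(os, "SEEK_DATA", 3)
SUPPORTED_WHENCE = (os.SEEK_SET, os.SEEK_END, SEEK_DATA)


def kmsg_seek(fd, whence):
    """Reposition a /dev/kmsg descriptor; the offset is always 0."""
    if whence not in SUPPORTED_WHENCE:
        # Reject e.g. SEEK_CUR here instead of letting the kernel fail it.
        raise ValueError("only SEEK_SET, SEEK_END and SEEK_DATA are supported")
    os.lseek(fd, 0, whence)


# Possible usage: skip everything already in the buffer and wait for
# new records only.
#   fd = os.open("/dev/kmsg", os.O_RDONLY)
#   kmsg_seek(fd, os.SEEK_END)
```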

		The output format consists of a prefix carrying the syslog
		prefix including priority and facility, the 64 bit message
		sequence number and the monotonic timestamp in microseconds,
		and a flag field. All fields are separated by a ','.

		Future extensions might add more comma separated values before
		the terminating ';'. Unknown fields and values should be
		gracefully ignored.

		The human readable text string starts directly after the ';'
		and is terminated by a '\n'. Because untrusted values derived
		from hardware or other facilities are printed, all
		non-printable characters and '\' itself in the log message are
		escaped using "\x00" C-style hex encoding.
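
		A consumer that wants the original bytes back can reverse the
		escaping, as in this sketch. It works on bytes so that every
		escaped byte is restored before any text decoding takes
		place; the function name unescape() is hypothetical.

```python
import re

# Matches one "\xNN" escape as described above.
_HEX_ESCAPE = re.compile(rb"\\x([0-9a-fA-F]{2})")


def unescape(raw):
    """Replace every "\\xNN" escape in a bytes string with the byte NN."""
    return _HEX_ESCAPE.sub(lambda m: bytes([int(m.group(1), 16)]), raw)


print(unescape(rb"tab\x09here"))   # -> b'tab\there'
```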

		A line starting with ' ' is a continuation line, adding
		key/value pairs to the log message, which provide the machine
		readable context of the message, for reliable processing in
		userspace.

		Example:
		7,160,424069,-;pci_root PNP0A03:00: host bridge window [io  0x0000-0x0cf7] (ignored)
		 SUBSYSTEM=acpi
		 DEVICE=+acpi:PNP0A03:00
		6,339,5140900,-;NET: Registered protocol family 10
		30,340,5690716,-;udevd[80]: starting version 181
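
		Records like the ones in the example can be split into their
		fields with a small parser. The following is only a sketch:
		parse_record() and the dictionary keys are hypothetical names,
		and any header fields beyond the first four are ignored, as
		recommended for unknown fields.

```python
def parse_record(record):
    """Parse one record (already decoded to text) into a dict."""
    lines = record.rstrip("\n").split("\n")
    # The header ends at the first ';'; the message text follows it.
    header, _, message = lines[0].partition(";")
    prefix, seq, usec, flags = header.split(",")[:4]
    prefix = int(prefix)
    context = {}
    for line in lines[1:]:
        if line.startswith(" "):
            # Continuation line: " KEY=value"
            key, _, value = line[1:].partition("=")
            context[key] = value
    return {
        "priority": prefix & 7,
        "facility": prefix >> 3,
        "seq": int(seq),
        "timestamp_us": int(usec),
        "flags": flags,
        "message": message,
        "context": context,
    }


rec = parse_record("30,340,5690716,-;udevd[80]: starting version 181\n")
print(rec["facility"], rec["priority"], rec["seq"])   # -> 3 6 340
```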

		The DEVICE= key uniquely identifies devices the following way:
		  b12:8        - block dev_t
		  c127:3       - char dev_t
		  n8           - netdev ifindex
		  +sound:card0 - subsystem:devname
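
		The four DEVICE= forms listed above can be told apart by their
		first character. A minimal sketch, with the hypothetical helper
		name parse_device() and return tuples chosen for the example:

```python
def parse_device(value):
    """Classify a DEVICE= value according to its leading character."""
    if not value:
        raise ValueError("empty DEVICE value")
    kind, rest = value[0], value[1:]
    if kind in ("b", "c"):
        # Block or char dev_t, written as major:minor.
        major, minor = rest.split(":")
        return ("block" if kind == "b" else "char", int(major), int(minor))
    if kind == "n":
        return ("netdev", int(rest))           # network interface index
    if kind == "+":
        subsystem, _, devname = rest.partition(":")
        return ("subsystem", subsystem, devname)
    raise ValueError("unknown DEVICE form: %r" % value)


print(parse_device("b12:8"))          # -> ('block', 12, 8)
print(parse_device("+sound:card0"))   # -> ('subsystem', 'sound', 'card0')
```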

		The flags field carries '-' by default. A 'c' indicates a
		fragment of a line. Note that these hints about continuation
		lines are not necessarily correct, and the stream could be
		interleaved with unrelated messages, but merging the lines in
		the output usually produces better human readable results. A
		similar logic is used internally when messages are printed to
		the console, /proc/kmsg or the syslog() syscall.

		By default, the kernel tries to avoid fragments by concatenating
		when it can, and fragments are rare; however, when extended
		console support is enabled, the in-kernel concatenation is
		disabled and /dev/kmsg output will contain more fragments. If
		the log consumer performs concatenation, the end result
		should be the same. In the future, the in-kernel concatenation
		may be removed entirely and /dev/kmsg users are recommended to
		implement fragment handling.

Users:		dmesg(1), userspace kernel log consumers