xref: /openbmc/phosphor-logging/docs/structured-logging.md (revision 9fd25af597865f7e69bfb4f15338782370e61baa)
1# Structured Logging
2
3There are currently two APIs for structured logging:
4[log](../lib/include/phosphor-logging/log.hpp) and
5[lg2](../lib/include/phosphor-logging/lg2.hpp). If your code is C++20 (or later)
6it is preferred to use `lg2`.
7
8## Why structured logging?
9
10Structured logging is a method of logging where every variable piece of data is
11tagged with some identifier for the data. This is opposite of unstructured
12logging where logged events are free-form strings.
13
14The principal logging daemon in OpenBMC (systemd-journald) natively supports
15structured logging. As a result, there are some designs in place where specific
16structured events are added to the journal and downstream these events can be
17consumed. For example, one implementation of the IPMI SEL utilizes specific
18journal structured data to stored and later retrieve SEL events.
19
20Even if an argument might be made against the merits of using the journal as a
21form of IPC, the value of structured logging persists. It is very common as part
22of various failure-analysis operations, either on the part of a system
23manufacturer or an end-user, to need to interrogate the system logs to determine
24when/where/why a situation degraded. With unstructured logging, the
25implementation is left chasing message format and data changes, where as with
26structured logging the format is somewhat static and easily parsed.
27
28A specific example of where structured logging is beneficial is a service which
29gathers `error` or higher log reports and creates issues when a new or unknown
30signature is discovered. If the only information available is an unstructured
31string, any kind of signature identification requires creating a regular
32expression (likely). With structured log, specific identifiers can be used as
33the error signature while others are ignored. For instance, maybe a specific
34`ERROR_RC` is critical to identifying the scenario but `FILE_PATH` is variable
35and ignored.
36
37For deeper understanding of the OpenBMC logging subsystem, it may be useful to
38read the manpages for `man 1 journalctl` and `man 3 sd_journal_send`. Generally
39accepted log-levels and their definition is historically documented in
40`man 3 syslog`.
41
42## log
43
44The pre-C++20 logging APIs presented by phosphor-logging are
45`phosphor::logging::log`. The basic format of a log call is:
46
47```cpp
48    log<level>("A message", entry("TAG0=%s", "value"), entry("TAG1=%x", 2));
49```
50
51Each log call has a level or priority, which corresponds to syslog priorities,
52such as 'debug', 'info', 'emergency', a free-form message string, and any number
53of entries, which are key-value pairs of data.
54
55The 'key' of an entry is an upper-case tag for the data along with a
56printf-style format string for the data. Journald imposes some restrictions on
57the tag: it must be all capital letters, numbers, or underscores and must not
58start with an underscore. Unfortunately, if these restrictions are not followed
59or the printf string is invalid for the data, the code will compile but journald
60may silently drop the log request (or pieces of it).
61
62It is highly discouraged to dynamically create the free-form message string
63because the contents are then, effectively, unstructured.
64
65## lg2
66
67The post-C++20 logging APIs presented by phosphor-logging are `lg2::log`. The
68basic format of a log call is:
69
70```cpp
71    lg2::level("A {TAG0} occured.", "TAG0", "foo"s, "TAG1", lg2::hex, 2);
72```
73
74Each log call has a level or priority, but the level is indicated by the
75function call name (such as `lg2::debug(...)`). The log call also has a
76free-form message string and any number of entries indicated by 2 or 3 argument
77sets:
78
79- key name (with the same `[_A-Z0-9]` requirements imposed by journald).
80- [optional] set of format flags
81- data value
82
83The free-form message may also optionally contain brace-wrapped key names, for
84which the message will be dynamically modified to contain the formatted value in
85place of the `{KEY}`. This enables human-friendly message strings to be formed
86without relying on verbose journald output modes.
87
88Note: Since a free-form message with data can be created using the `{KEY}`
89mechanism, no other string formatting libraries are necessary or should be used.
90Avoiding the `{KEY}` feature causes the journal messages to become unstructured.
91Do not use `sstream` or `{fmt}` to format the message!
92
93The supported format flags are:
94
95- `bin`, `dec`, `hex`
96  - The [integer] data should be formatted in the requested manner.
97  - Decimal is the default.
98  - Examples:
99    - `bin, 0xabcd` -> `0b1010101111001101`
100    - `hex, 1234` -> `0x4d2`
101- `field8`, `field16`, `field32`, `field64`
102  - The [integer] data should be padded as if it were a field of specified
103    bit-length (useful only for `bin` or `hex` data).
104  - Examples:
105    - `(bin | field8), 0xff` -> `0b11111111`
106    - `(hex | field16), 10` -> `0x000a`
107
108Format flags can be OR'd together as necessary: `hex | field32`.
109
110The APIs can handle (and format appropriately) any data of the following types:
111signed or unsigned integers, floating point numbers, booleans, strings
112(C-strings, std::strings, or `std::string_views`), `std::filesystem::paths`,
113sdbusplus enums and `object_paths`, generic enumerations (as numbers),
114exceptions, and pointers. The APIs will also allow arbitrary type formatting for
115any type which has a `to_string` function defined.
116
117The APIs also perform compile-time analysis of the arguments to give descriptive
118error messages for incorrect parameters or format flags. Some examples are:
119
120- `(dec | hex)` yields:
121  - `error: static assertion failed: Conflicting flags found for value type.`
122- `dec` applied to a string yields:
123  - `error: static assertion failed: Prohibited flag found for value type.`
124- Missing a key yields:
125  - `error: static assertion failed: Found value without expected header field.`
126- Missing data yields:
127  - `error: static assertion failed: Found header field without expected data.`
128- Missing a message yields:
129  - `error: use of deleted function ‘lg2::debug<>::debug()’`
130
131Debug-level messages are only sent to journald if the `DEBUG_INVOCATION`
132environment variable is set, as suggested by the `systemd.exec` manpage.
133
134### LOG2_FMTMSG key
135
136The API adds an extra journald key to represent the original message prior to
137`{KEY}` replacement, which is saved with the `LOG2_FMTMSG` key. This is done to
138facilitate searching the journal with a known fixed version of the message
139(prior to the dynamic replacement).
140
141### Key format checking
142
143The journald APIs require that keys (also called data 'headers') may only
144contain underscore, uppercase letters, or numbers (`[_A-Z0-9]`) and may not
145start with underscores. If these requirements are ignored, the journal API
146silently drops journal requests. In order to prevent silent bugs, the code
147performs compile-time checking of these requirements.
148
149The code that enables compile-time header checking imposes two constraints:
150
1511. Keys / headers must be passed as constant C-string values.
152   - `"KEY"` is valid; `"KEY"s` or `variable_key` is not.
1532. Any constant C-string may be interpreted as a key and give non-obvious
154   compile warnings about format violations.
155   - Constant C-strings (`"a string"`) should be passed as a C++ literal
156     (`"a string"s`) instead.
157
158### stderr output
159
160When running an application or daemon on a console or SSH session, it might not
161be obvious that the application is writing to the journal. The `lg2` APIs detect
162if the application is running on a TTY and additionally log to the TTY.
163
164Output to stderr can also be forced by setting the `LG2_FORCE_STDERR`
165environment variable to any value. This is especially useful to see log output
166in OpenBMC CI test verfication.
167
168The verbosity of this output can be filtered by setting `LG2_LOG_LEVEL`
169environment variable to the desired maximum log level (0-7). Messages with a
170priority value greater than this setting will be supressed. If `LG2_LOG_LEVEL`
171is not set, it defaults to 7, showing all messages.
172
173The format of information sent to the TTY can be adjusted by setting the desired
174format string in the `LG2_FORMAT` environment variable. Supported fields are:
175
176- `%%` : a `'%'` literal
177- `%f` : the logging function's name
178- `%F` : the logging function's file
179- `%l` : the log level as an integer
180- `%L` : the logging function's line number
181- `%m` : the lg2 message
182
183The default format is `"<%l> %m"`.
184
185### Why a new API?
186
187There were a number of issues raised by `logging::log` which are not easily
188fixed with the existing API.
189
1901. The mixture of template and temporary-constructed `entry` parameters is
191   verbose and clumsy. `lg2` is far more succinct in this regard.
1922. The printf format-strings were error prone and potentially missed in code
193   reviews. `lg2` eliminates the need for printf strings by generating the
194   formatting internally.
1953. `logging::log` generates incorrect code location information (see
196   openbmc/openbmc#2297). `lg2` uses C++20's `source_location` to, by default,
197   generate correct code location info and can optionally be passed a
198   non-defaulted `source_location` for rare but appropriate cases.
1994. The previous APIs had no mechanism to generate dynamic user-friendly strings
200   which caused some developers to rely on external formatting libraries such as
201   `{fmt}`. `{KEY}` replacement is now a core feature of the new API.
2025. When running on a TTY, `logging::log` sent data to journal and the TTY was
203   silent. This resulted in some applications creating custom code to write some
204   data to `stdout` and some to `logging::log` APIs. `lg2` automatically detects
205   if it is running on a TTY and duplicates logging data to the console and the
206   journal.
207
208It is possible #3 and #5 could be fixed with the existing APIs, but the
209remainder are only possible to be resolved with changes to the API syntax.
210
211### Why C++20?
212
213Solving issue openbmc/openbmc#2297 requires C++20's `source_location` feature.
214It is possible that this could be solved in the `logging::log` APIs by utilizing
215`#ifdef` detection of the `source_location` feature so that C++20 applications
216gain this support.
217
218Implementing much of the syntactic improvements of the `lg2` API is made
219possible by leveraging C++20's Concepts feature. Experts in C++ may argue that
220this could be implemented in earlier revisions of C++ using complex SFINAE
221techniques with templated-class partial-specialization and overloading. Those
222experts are more than welcome to implement the `lg2` API in C++17 on their own.
223
224### Why didn't you do ...?
225
226> Why didn't you just use `{fmt}`?
227
228`{fmt}` is a great API for creating unstructured logging strings, but we are
229trying to create structured logging. `{fmt}` doesn't address that problem
230domain.
231
232> Why invent your own formatting and not use `{fmt}`?
233
234The formatting performed by this API is purposefully minimal. `{fmt}` is very
235capable and especially apt for human-facing string formatting. That is not the
236typical use-case for our logging. Instead we prioritized the following:
237
2381. Reasonable syntactic ergonomics so that the API can easily be adopted.
2392. Simple, consistent, machine parse-able data contents.
2403. Sufficient human-facing messages for developer-level debug.
2414. Very tight code generation at logging call sites and reasonably performant
242   code.
243
244(1) The lg2 API is roughly equivalent to `printf`, `{fmt}`, `cout` in terms of
245ergonomics, but significantly better compile-time error proofing than the others
246(except on par with `{fmt}` for errors).
247
248(2) Adding robust formatting would lead to less consistent structured data with
249essentially no additional benefit. Detailed field specifiers like `{.4f}` do not
250serve any purpose when the consumer is another computer program, and only
251minimal enhancement for developers. The typical utility formatting for
252hardware-facing usage is implemented (hex, binary, field-size).
253
254(3) The `{HEADER}` placeholders allow data to be placed in a human-friendly
255manner on par with `{fmt}`.
256
257(4) The call-site code generated by this API is almost identical to a `printf`
258and the journal-facing code is on similar performance footing to the
259journal_send APIs. We save some processing by using `iovec` interfaces and
260providing the source-code information, compared to the older `logging` APIs and
261have similar formatting performance to the printf-style formatting that
262journal_send used. The only difference is that this is done in our library
263rather than in `libsystemd`.
264
265Utilizing `{fmt}` for each structured data element would impose much greater
266overheads. Either we insert the `{fmt}` calls at the call-site (N calls plus N
267string objects for N structured data elements), or we do them in the library
268side where we lose the compile-time format checking. Also, the performance of
269the more robust formatting would almost certainly be worse, especially if we do
270the formatting library-side.
271
272Logging is done often. Shifting a few values onto the stack for a printf-type
273call compared to a kilobyte+ of generated code for inline `{fmt}` calls is a
274significant trade-off. And one with the only major advantage being more
275universally standardized API. The `lg2` API seems obvious enough in ergonomics
276such that this should not be an impediment to our community of developers.
277
278If it is later decided that we need more robust formatting or the `lg2::format`
279flags were a bad idea they could be deprecated and replaced. The format flags
280are a unique C++ type, which makes syntax parsing easier. As long as they are
281replaced with a similar unique C++ type both syntaxes could be supported for a
282time. Thus enhancing to support something like `fmt::arg` in the future could be
283done without impacting existing code usage. Also, an ambitious developer could
284write a Clang-Tidy code upgrader to transition from format flags to something
285else, like Abseil provides for API changes.
286
287> Doesn't duplicating the structured data in the message decrease the available
288> journal space?
289
290Far less than you might imagine. Journald keeps the messages in a compressed
291binary format. Since the data embedded in the message and the structured data
292are identical in content, and very near each other in the on-disk format, they
293compress exceptionally well. Likely on the order of 1-2 bytes per structured
294data element.
295
296The lack of data in the default `journalctl` output was a serious impediment to
297adoption of the `logging` API by some members of the development community.
298Unless we dispense with structured logging entirely, this duplication seems like
299a reasonable compromise.
300
301> Doesn't the `{HEADER}` syntax needlessly lengthen the message strings?
302
303Lengthen, yes. Needlessly, no?
304
305An earlier `lg2` proposal had a format flag that appended data to the message
306string instead of putting it in-place. The consensus was that this did not
307create as human-friendly messages as developers desired so the placeholder
308syntax was implemented instead.
309
310`{fmt}` can use shorter placeholders of `{}` or `{3}`. The non-indexed syntax
311would require structured data elements be in specific order and could be error
312prone with code maintenance. The indexed syntax is similarly error prone and
313harder to review. Both of them are more work for string formatting on the
314library.
315
316The `{HEADER}` syntax is identical to `{fmt}`'s "Named Argument" syntax but
317actually with better parameter ergonomics as `{fmt}` would require wrapping the
318named argument with a `fmt::arg` call, which is similar to `logging`'s `entry`.
319