xref: /openbmc/docs/anti-patterns.md (revision c0aae66e0d2a6d7b52471a4858aa463789071581)
1c7205cb0SBrad Bishop# OpenBMC Anti-patterns
2c7205cb0SBrad Bishop
3c7205cb0SBrad BishopFrom [Wikipedia](https://en.wikipedia.org/wiki/Anti-pattern):
4c7205cb0SBrad Bishop
5c7205cb0SBrad Bishop"An anti-pattern is a common response to a recurring problem that is usually
6c7205cb0SBrad Bishopineffective and risks being highly counterproductive."
7c7205cb0SBrad Bishop
8c7205cb0SBrad BishopThe developers of OpenBMC do not get 100% of decisions right 100% of the time.
9c7205cb0SBrad BishopThat, combined with the fact that software development is often an exercise in
10c7205cb0SBrad Bishopcopying and pasting, results in mistakes happening over and over again.
11c7205cb0SBrad Bishop
12c7205cb0SBrad BishopThis page aims to document some of the anti-patterns that exist in OpenBMC to
13f4febd00SPatrick Williamsease the job of those reviewing code. If an anti-pattern is spotted, rather that
14f4febd00SPatrick Williamsrepeating the same explanations over and over, a link to this document can be
15f4febd00SPatrick Williamsprovided.
16c7205cb0SBrad Bishop
17c7205cb0SBrad Bishop<!-- begin copy/paste on next line -->
18c7205cb0SBrad Bishop
19c7205cb0SBrad Bishop## Anti-pattern template [one line description]
20c7205cb0SBrad Bishop
21c7205cb0SBrad Bishop### Identification
22f4febd00SPatrick Williams
23c7205cb0SBrad Bishop(1 paragraph) Describe how to spot the anti-pattern.
24c7205cb0SBrad Bishop
25c7205cb0SBrad Bishop### Description
26f4febd00SPatrick Williams
27c7205cb0SBrad Bishop(1 paragraph) Describe the negative effects of the anti-pattern.
28c7205cb0SBrad Bishop
29c7205cb0SBrad Bishop### Background
30f4febd00SPatrick Williams
31c7205cb0SBrad Bishop(1 paragraph) Describe why the anti-pattern exists. If you don't know, try
32c7205cb0SBrad Bishoprunning git blame and look at who wrote the code originally, and ask them on the
33f4febd00SPatrick Williamsmailing list or in Discord what their original intent was, so it can be
34f4febd00SPatrick Williamsdocumented here (and you may possibly discover it isn't as much of an
35f4febd00SPatrick Williamsanti-pattern as you thought). If you are unable to determine why the
36f4febd00SPatrick Williamsanti-pattern exists, put: "Unknown" here.
37c7205cb0SBrad Bishop
38c7205cb0SBrad Bishop### Resolution
39f4febd00SPatrick Williams
40c7205cb0SBrad Bishop(1 paragraph) Describe the preferred way to solve the problem solved by the
41c7205cb0SBrad Bishopanti-pattern and the positive effects of solving it in the manner described.
42c7205cb0SBrad Bishop
43c7205cb0SBrad Bishop<!-- end copy/paste on previous line -->
4420d70b14SBrad Bishop
45b410d1a7SPatrick Venture## Custom ArgumentParser object
46b410d1a7SPatrick Venture
47b410d1a7SPatrick Venture### Identification
48b410d1a7SPatrick Venture
49f4febd00SPatrick WilliamsThe ArgumentParser object is typically present to wrap calls to get options. It
50f4febd00SPatrick Williamsabstracts away the parsing and provides a `[]` operator to access the
51b410d1a7SPatrick Ventureparameters.
52b410d1a7SPatrick Venture
53b410d1a7SPatrick Venture### Description
54b410d1a7SPatrick Venture
55b410d1a7SPatrick VentureWriting a custom ArgumentParser object creates nearly duplicate code in a
56f4febd00SPatrick Williamsrepository. The ArgumentParser itself is the same, however, the options provided
57f4febd00SPatrick Williamsdiffer. Writing a custom argument parser re-invents the wheel on c++ command
58f4febd00SPatrick Williamsline argument parsing.
59b410d1a7SPatrick Venture
60b410d1a7SPatrick Venture### Background
61b410d1a7SPatrick Venture
62b410d1a7SPatrick VentureThe ArgumentParser exists because it was in place early and then copied into
63b410d1a7SPatrick Ventureeach new repository as an easy way to handle argument parsing.
64b410d1a7SPatrick Venture
65b410d1a7SPatrick Venture### Resolution
66b410d1a7SPatrick Venture
67b410d1a7SPatrick VentureThe CLI11 library was designed and implemented specifically to support modern
68b410d1a7SPatrick Ventureargument parsing. It handles the cases seen in OpenBMC daemons and has some
69b410d1a7SPatrick Venturehandy built-in validators, and allows easy customizations to validation.
70b410d1a7SPatrick Venture
71f3fe1216SPatrick Venture## Explicit AC_MSG_ERROR on PKG_CHECK_MODULES failure
72f3fe1216SPatrick Venture
73f3fe1216SPatrick Venture### Identification
74f4febd00SPatrick Williams
75f3fe1216SPatrick Venture```
76f3fe1216SPatrick VenturePKG_CHECK_MODULES(
77f3fe1216SPatrick Venture    [PHOSPHOR_LOGGING],
78f3fe1216SPatrick Venture    [phosphor-logging],
79f3fe1216SPatrick Venture    [],
80f3fe1216SPatrick Venture    [AC_MSG_ERROR([Could not find phosphor-logging...openbmc/phosphor-logging package required])])
81f3fe1216SPatrick Venture```
82f3fe1216SPatrick Venture
83f3fe1216SPatrick Venture### Description
84f3fe1216SPatrick Venture
85f4febd00SPatrick WilliamsThe autotools PKG_CHECK_MODULES macro provides the ability to specify an "if
86f4febd00SPatrick Williamsfound" and "if not found" behavior. By default, the "if not found" behavior will
87f4febd00SPatrick Williamslist the package not found. In many cases, this is sufficient to a developer to
88f4febd00SPatrick Williamsknow what package is missing. In most cases, it's another OpenBMC package.
89f3fe1216SPatrick Venture
90f4febd00SPatrick WilliamsIf the library sought's name isn't related to the package providing it, then the
91f4febd00SPatrick Williamsfailure message should be set to something more useful to the developer.
92f3fe1216SPatrick Venture
93f3fe1216SPatrick Venture### Resolution
94f3fe1216SPatrick Venture
95f3fe1216SPatrick VentureUse the default macro behavior when it is clear that the missing package is
96f3fe1216SPatrick Ventureanother OpenBMC package.
97f3fe1216SPatrick Venture
98f3fe1216SPatrick Venture```
99f3fe1216SPatrick VenturePKG_CHECK_MODULES([PHOSPHOR_LOGGING], [phosphor-logging])
100f3fe1216SPatrick Venture```
101f3fe1216SPatrick Venture
10220d70b14SBrad Bishop## Explicit listing of shared library packages in RDEPENDS in bitbake metadata
10320d70b14SBrad Bishop
10420d70b14SBrad Bishop### Identification
105f4febd00SPatrick Williams
10620d70b14SBrad Bishop```
10720d70b14SBrad BishopRDEPENDS_${PN} = "libsystemd"
10820d70b14SBrad Bishop```
10920d70b14SBrad Bishop
11020d70b14SBrad Bishop### Description
111f4febd00SPatrick Williams
11220d70b14SBrad BishopOut of the box bitbake examines built applications, automatically adds runtime
11320d70b14SBrad Bishopdependencies and thus ensures any library packages dependencies are
11420d70b14SBrad Bishopautomatically added to images, sdks, etc. There is no need to list them
11520d70b14SBrad Bishopexplicitly in a recipe.
11620d70b14SBrad Bishop
11720d70b14SBrad BishopDependencies change over time, and listing them explicitly is likely prone to
11820d70b14SBrad Bishoperrors - the net effect being unnecessary shared library packages being
11920d70b14SBrad Bishopinstalled into images.
12020d70b14SBrad Bishop
12120d70b14SBrad BishopConsult
12220d70b14SBrad Bishophttps://www.yoctoproject.org/docs/latest/mega-manual/mega-manual.html#var-RDEPENDS
12320d70b14SBrad Bishopfor information on when to use explicit runtime dependencies.
12420d70b14SBrad Bishop
12520d70b14SBrad Bishop### Background
126f4febd00SPatrick Williams
127f4febd00SPatrick WilliamsThe initial bitbake metadata author for OpenBMC was not aware that bitbake added
128f4febd00SPatrick Williamsthese dependencies automatically. Initial bitbake metadata therefore listed
129f4febd00SPatrick Williamsshared library dependencies explicitly, and was subsequently copy pasted.
13020d70b14SBrad Bishop
13120d70b14SBrad Bishop### Resolution
132f4febd00SPatrick Williams
133f4febd00SPatrick WilliamsDo not list shared library packages in RDEPENDS. This eliminates the possibility
134f4febd00SPatrick Williamsof installing unnecessary shared library packages due to unmaintained library
135f4febd00SPatrick Williamsdependency lists in bitbake metadata.
136acedc151SBrad Bishop
137acedc151SBrad Bishop## Use of /usr/bin/env in systemd service files
138acedc151SBrad Bishop
139acedc151SBrad Bishop### Identification
140f4febd00SPatrick Williams
141acedc151SBrad BishopIn systemd unit files:
142f4febd00SPatrick Williams
143acedc151SBrad Bishop```
144acedc151SBrad Bishop[Service]
145acedc151SBrad Bishop
146acedc151SBrad BishopExecStart=/usr/bin/env some-application
147acedc151SBrad Bishop```
148acedc151SBrad Bishop
149acedc151SBrad Bishop### Description
150f4febd00SPatrick Williams
151acedc151SBrad BishopOutside of OpenBMC, most applications that provide systemd unit files don't
152f4febd00SPatrick Williamslaunch applications in this way. So if nothing else, this just looks strange and
153f4febd00SPatrick Williamsviolates the
154f4febd00SPatrick Williams[princple of least astonishment](https://en.wikipedia.org/wiki/Principle_of_least_astonishment).
155acedc151SBrad Bishop
156acedc151SBrad Bishop### Background
157f4febd00SPatrick Williams
158acedc151SBrad BishopThis anti-pattern exists because a requirement exists to enable live patching of
159acedc151SBrad Bishopapplications on read-only filesystems. Launching applications in this way was
160acedc151SBrad Bishoppart of the implementation that satisfied the live patch requirement. For
161acedc151SBrad Bishopexample:
162acedc151SBrad Bishop
163acedc151SBrad Bishop```
164acedc151SBrad Bishop/usr/bin/phosphor-hwmon
165acedc151SBrad Bishop```
166acedc151SBrad Bishop
167acedc151SBrad Bishopon a read-only filesystem becomes:
168acedc151SBrad Bishop
169acedc151SBrad Bishop```
170acedc151SBrad Bishop/usr/local/bin/phosphor-hwmon`
171acedc151SBrad Bishop```
172acedc151SBrad Bishop
173acedc151SBrad Bishopon a writeable /usr/local filesystem.
174acedc151SBrad Bishop
175acedc151SBrad Bishop### Resolution
176f4febd00SPatrick Williams
177acedc151SBrad BishopThe /usr/bin/env method only enables live patching of applications. A method
178acedc151SBrad Bishopthat supports live patching of any file in the read-only filesystem has emerged.
179acedc151SBrad BishopAssuming a writeable filesystem exists _somewhere_ on the bmc, something like:
180acedc151SBrad Bishop
181acedc151SBrad Bishop```
182acedc151SBrad Bishopmkdir -p /var/persist/usr
183acedc151SBrad Bishopmkdir -p /var/persist/work/usr
184acedc151SBrad Bishopmount -t overlay -o lowerdir=/usr,upperdir=/var/persist/usr,workdir=/var/persist/work/usr overlay /usr
185acedc151SBrad Bishop```
186f4febd00SPatrick Williams
187acedc151SBrad Bishopcan enable live system patching without any additional requirements on how
188acedc151SBrad Bishopapplications are launched from systemd service files. This is the preferred
189acedc151SBrad Bishopmethod for enabling live system patching as it allows OpenBMC developers to
190acedc151SBrad Bishopwrite systemd service files in the same way as most other projects.
191acedc151SBrad Bishop
192acedc151SBrad BishopTo undo existing instances of this anti-pattern remove /usr/bin/env from systemd
193acedc151SBrad Bishopservice files and replace with the fully qualified path to the application being
194acedc151SBrad Bishoplaunched. For example, given an application in /usr/bin:
195acedc151SBrad Bishop
196acedc151SBrad Bishop```
197acedc151SBrad Bishopsed -i s,/usr/bin/env ,/usr/bin/, foo.service
198acedc151SBrad Bishop```
1998b37263fSBrad Bishop
200a77235f1SPatrick Williams## Incorrect placement of executables in /sbin, /usr/sbin or /bin, /usr/bin
2018b37263fSBrad Bishop
2028b37263fSBrad Bishop### Identification
203f4febd00SPatrick Williams
204a77235f1SPatrick WilliamsOpenBMC executables that are installed in `/usr/sbin`. `$sbindir` in bitbake
205a77235f1SPatrick Williamsmetadata, makefiles or configure scripts. systemd service files pointing to
206a77235f1SPatrick Williams`/bin` or `/usr/bin` executables.
2078b37263fSBrad Bishop
2088b37263fSBrad Bishop### Description
209f4febd00SPatrick Williams
210a77235f1SPatrick WilliamsInstalling OpenBMC applications in incorrect locations violates the
211f4febd00SPatrick Williams[principle of least astonishment](https://en.wikipedia.org/wiki/Principle_of_least_astonishment)
212f4febd00SPatrick Williamsand more importantly violates the
2138b37263fSBrad Bishop[FHS](https://en.wikipedia.org/wiki/Filesystem_Hierarchy_Standard).
2148b37263fSBrad Bishop
2158b37263fSBrad Bishop### Background
216a77235f1SPatrick Williams
217a77235f1SPatrick WilliamsThere are typically two types of executables:
218a77235f1SPatrick Williams
219a77235f1SPatrick Williams1. Long-running daemons started by, for instance, systemd service files and
220f4febd00SPatrick Williams   _NOT_ intended to be directly executed by users.
221a77235f1SPatrick Williams2. Utilities intended to be used by a user as a CLI.
222a77235f1SPatrick Williams
223f4febd00SPatrick WilliamsExecutables of type 1 should not be placed anywhere in `${PATH}` because it is
224f4febd00SPatrick Williamsconfusing and error-prone to users, but should instead be placed in a
225a77235f1SPatrick Williams`/usr/libexec/<package>` subdirectory.
226a77235f1SPatrick Williams
227a77235f1SPatrick WilliamsExecutables of type 2 should be placed in `/usr/bin` because they are intended
228f4febd00SPatrick Williamsto be used by users and should be in `${PATH}` (also, `sbin` is inappropriate as
229f4febd00SPatrick Williamswe transition to having non-root access).
230a77235f1SPatrick Williams
231a77235f1SPatrick WilliamsThe sbin anti-pattern exists because the FHS was misinterpreted at an early
232a77235f1SPatrick Williamspoint in OpenBMC project history, and has proliferated ever since.
2338b37263fSBrad Bishop
2348b37263fSBrad BishopFrom the hier(7) man page:
2358b37263fSBrad Bishop
2368b37263fSBrad Bishop```
2378b37263fSBrad Bishop/usr/bin This is the primary directory for executable programs.  Most programs
2388b37263fSBrad Bishopexecuted by normal users which are not needed for booting or for repairing the
2398b37263fSBrad Bishopsystem and which are not installed locally should be placed in this directory.
2408b37263fSBrad Bishop
2418b37263fSBrad Bishop/usr/sbin This directory contains program binaries for system administration
2428b37263fSBrad Bishopwhich are not essential for the boot process, for mounting /usr, or for system
2438b37263fSBrad Bishoprepair.
244a77235f1SPatrick Williams
245a77235f1SPatrick Williams/usr/libexec Directory  contains  binaries for internal use only and they are
246a77235f1SPatrick Williamsnot meant to be executed directly by users shell or scripts.
2478b37263fSBrad Bishop```
248a77235f1SPatrick Williams
24967465bdeSPatrick WilliamsThe FHS description for `/usr/sbin` refers to "system administration" but the
2508b37263fSBrad Bishopde-facto interpretation of the system being administered refers to the OS
2518b37263fSBrad Bishopinstallation and not a system in the OpenBMC sense of managed system. As such
25267465bdeSPatrick WilliamsOpenBMC applications should be installed in `/usr/bin`.
2538b37263fSBrad Bishop
254a77235f1SPatrick WilliamsIt is becoming common practice in Linux for daemons to now be moved to `libexec`
255a77235f1SPatrick Williamsand considered "internal use" from the perspective of the systemd service file
256a77235f1SPatrick Williamsthat controls their launch.
257a77235f1SPatrick Williams
2588b37263fSBrad Bishop### Resolution
259f4febd00SPatrick Williams
26067465bdeSPatrick WilliamsInstall OpenBMC applications in `/usr/libexec` or `/usr/bin/` as appropriate.
261daa78cb1SJoseph Reynolds
262daa78cb1SJoseph Reynolds## Handling unexpected error codes and exceptions
263daa78cb1SJoseph Reynolds
264daa78cb1SJoseph Reynolds### Identification
265f4febd00SPatrick Williams
266daa78cb1SJoseph ReynoldsThe anti-pattern is for an application to continue processing after it
267daa78cb1SJoseph Reynoldsencounters unexpected conditions in the form of error codes and exceptions and
268daa78cb1SJoseph Reynoldsnot capturing the data needed to resolve the problem.
269daa78cb1SJoseph Reynolds
270daa78cb1SJoseph ReynoldsExample C++ code:
271f4febd00SPatrick Williams
272daa78cb1SJoseph Reynolds```
273daa78cb1SJoseph Reynoldsusing InternalFailure = sdbusplus::xyz::openbmc_project::Common::Error::InternalFailure;
274daa78cb1SJoseph Reynoldstry
275daa78cb1SJoseph Reynolds{
276daa78cb1SJoseph Reynolds  ... use d-Bus APIs...
277daa78cb1SJoseph Reynolds}
278daa78cb1SJoseph Reynoldscatch (InternalFailure& e)
279daa78cb1SJoseph Reynolds{
280daa78cb1SJoseph Reynolds    phosphor::logging::commit<InternalFailure>();
281daa78cb1SJoseph Reynolds}
282daa78cb1SJoseph Reynolds```
283daa78cb1SJoseph Reynolds
284daa78cb1SJoseph Reynolds### Description
285f4febd00SPatrick Williams
286daa78cb1SJoseph ReynoldsSuppressing unexpected errors can lead an application to incorrect or erratic
287daa78cb1SJoseph Reynoldsbehavior which can affect the service it provides and the overall system.
288daa78cb1SJoseph Reynolds
289daa78cb1SJoseph ReynoldsWriting a log entry instead of a core dump may not give enough information to
290daa78cb1SJoseph Reynoldsdebug a truly unexpected problem, so developers do not get a chance to
291daa78cb1SJoseph Reynoldsinvestigate problems and the system's reliability does not improve over time.
292daa78cb1SJoseph Reynolds
293daa78cb1SJoseph Reynolds### Background
294daa78cb1SJoseph Reynolds
295f4febd00SPatrick WilliamsProgrammers want their application to continue processing when it encounters
296f4febd00SPatrick Williamsconditions it can recover from. Sometimes they try too hard and continue when it
297f4febd00SPatrick Williamsis not appropriate.
298f4febd00SPatrick Williams
299f4febd00SPatrick WilliamsProgrammers also want to log what is happening in the application, so they write
300f4febd00SPatrick Williamslog entries that give debug data when something goes wrong, typically targeted
301f4febd00SPatrick Williamsfor their use. They don't consider how the log entry is consumed by the BMC
302f4febd00SPatrick Williamsadministrator or automated service tools.
303daa78cb1SJoseph Reynolds
304daa78cb1SJoseph ReynoldsThe `InternalFailure` in the [Phosphor logging README][] is overused.
305daa78cb1SJoseph Reynolds
306f4febd00SPatrick Williams[phosphor logging readme]:
307f4febd00SPatrick Williams  https://github.com/openbmc/phosphor-logging/blob/master/README.md
308daa78cb1SJoseph Reynolds
309daa78cb1SJoseph Reynolds### Resolution
310daa78cb1SJoseph Reynolds
311daa78cb1SJoseph ReynoldsSeveral items are needed:
312f4febd00SPatrick Williams
313daa78cb1SJoseph Reynolds1. Check all places where a return code or errno value is given. Strongly
314f4febd00SPatrick Williams   consider that a default error handler should throw an exception, for example
315f4febd00SPatrick Williams   `std::system_error`.
316daa78cb1SJoseph Reynolds2. Have a good reason to handle specific error codes. See below.
317f4febd00SPatrick Williams3. Have a good reason to handle specific exceptions. Allow other exceptions to
318f4febd00SPatrick Williams   propagate.
319daa78cb1SJoseph Reynolds4. Document (in terms of impacts to other services) what happens when this
320daa78cb1SJoseph Reynolds   service fails, stops, or is restarted. Use that to inform the recovery
321daa78cb1SJoseph Reynolds   strategy.
322daa78cb1SJoseph Reynolds
323daa78cb1SJoseph ReynoldsIn the error handler:
324daa78cb1SJoseph Reynolds
325f4febd00SPatrick Williams1. Consider what data (if any) should be logged. Determine who will consume the
326f4febd00SPatrick Williams   log entry: BMC developers, administrator, or an analysis tool. Usually the
327f4febd00SPatrick Williams   answer is more than one of these.
328daa78cb1SJoseph Reynolds
329daa78cb1SJoseph Reynolds   The following example log entries use an IPMI request to set network access
330daa78cb1SJoseph Reynolds   on, but the user input is invalid.
331daa78cb1SJoseph Reynolds
332f4febd00SPatrick Williams   - BMC Developer: Reference internal applications, services, pids, etc. the
333f4febd00SPatrick Williams     developer would be familiar with.
334daa78cb1SJoseph Reynolds
335f4febd00SPatrick Williams     Example:
336f4febd00SPatrick Williams     `ipmid service successfully processed a network setting packet, however the user input of USB0 is not a valid network interface to configure.`
337daa78cb1SJoseph Reynolds
338daa78cb1SJoseph Reynolds   - Administrator: Reference the external interfaces of the BMC such as the
339f4febd00SPatrick Williams     REST API. They can respond to feedback about invalid input, or a need to
340f4febd00SPatrick Williams     restart the BMC.
341daa78cb1SJoseph Reynolds
342f4febd00SPatrick Williams     Example:
343f4febd00SPatrick Williams     `The network interface of USB0 is not a valid option. Retry the IPMI command with a valid interface.`
344daa78cb1SJoseph Reynolds
345daa78cb1SJoseph Reynolds   - Analyzer tool: Consider breaking the log down and including several
346f4febd00SPatrick Williams     properties which an analyzer can leverage. For instance, tagging the log
347f4febd00SPatrick Williams     with 'Internal' is not helpful. However, breaking that down into something
348f4febd00SPatrick Williams     like [UserInput][ipmi][Network] tells at a quick glance that the input
349f4febd00SPatrick Williams     received for configuring the network via an IPMI command was invalid.
350f4febd00SPatrick Williams     Categorization and system impact are key things to focus on when creating
351f4febd00SPatrick Williams     logs for an analysis application.
352daa78cb1SJoseph Reynolds
353f4febd00SPatrick Williams     Example:
354f4febd00SPatrick Williams     `[UserInput][IPMI][Network][Config][Warning] Interface USB0 not valid.`
355daa78cb1SJoseph Reynolds
356daa78cb1SJoseph Reynolds2. Determine if the application can fully recover from the condition. If not,
357daa78cb1SJoseph Reynolds   don't continue. Allow the system to determine if it writes a core dump or
358daa78cb1SJoseph Reynolds   restarts the service. If there are severe impacts when the service fails,
359daa78cb1SJoseph Reynolds   consider using a better error recovery mechanism.
360f96db51dSAndrew Geissler
361f96db51dSAndrew Geissler## Non-standard debug application options and logging
362f96db51dSAndrew Geissler
363f96db51dSAndrew Geissler### Identification
364f4febd00SPatrick Williams
365f4febd00SPatrick WilliamsAn application uses non-standard methods on startup to indicate verbose logging
366f4febd00SPatrick Williamsand/or does not utilize standard systemd-journald debug levels for logging.
367f96db51dSAndrew Geissler
368f96db51dSAndrew Geissler### Description
369f4febd00SPatrick Williams
370f4febd00SPatrick WilliamsWhen debugging issues within OpenBMC that cross multiple applications, it's very
371f4febd00SPatrick Williamsdifficult to enable the appropriate debug when different applications have
372f4febd00SPatrick Williamsdifferent mechanisms for enabling debug. For example, different OpenBMC
373f96db51dSAndrew Geisslerapplications take the following as command line parameters to enable extra
374f96db51dSAndrew Geisslerdebug:
375f4febd00SPatrick Williams
376f96db51dSAndrew Geissler- 0xff, --vv, -vv, -v, --verbose, <and more>
377f96db51dSAndrew Geissler
378f96db51dSAndrew GeisslerAlong these same lines, some applications then have their own internal methods
379f4febd00SPatrick Williamsof writing debug data. They use std::cout, printf, fprintf, ... Doing this
380f4febd00SPatrick Williamscauses other issues when trying to compare debug data across different
381f4febd00SPatrick Williamsapplications that may be having their buffers flushed at different times (and
382f4febd00SPatrick Williamspicked up by journald).
383f96db51dSAndrew Geissler
384f96db51dSAndrew Geissler### Background
385f4febd00SPatrick Williams
386f4febd00SPatrick WilliamsEveryone has their own ways to debug. There was no real standard within OpenBMC
387f4febd00SPatrick Williamson how to do it so everyone came up with whatever they were familiar with.
388f96db51dSAndrew Geissler
389f96db51dSAndrew Geissler### Resolution
390f4febd00SPatrick Williams
391f96db51dSAndrew GeisslerIf an OpenBMC application is to support enhanced debug via a command line then
392f96db51dSAndrew Geisslerit will support the standard "-v,--verbose" option.
393f96db51dSAndrew Geissler
394f96db51dSAndrew GeisslerIn general, OpenBMC developers should utilize the "debug" journald level for
395f4febd00SPatrick Williamsdebug messages. This can be enabled/disabled globally and would apply to all
396f4febd00SPatrick Williamsapplications. If a developer believes this would cause too much debug data in
397f4febd00SPatrick Williamscertain cases then they can protect these journald debug calls around a
398f4febd00SPatrick Williams--verbose command line option.
3997e7e0e40SAndrew Jeffery
4007e7e0e40SAndrew Jeffery## DBus interface representing GPIOs
4017e7e0e40SAndrew Jeffery
4027e7e0e40SAndrew Jeffery### Identification
403f4febd00SPatrick Williams
4047e7e0e40SAndrew JefferyDesire to expose a DBus interface to drive GPIOs, for example:
4057e7e0e40SAndrew Jeffery
406f4febd00SPatrick Williams- https://lore.kernel.org/openbmc/YV21cD3HOOGi7K2f@heinlein/
407f4febd00SPatrick Williams- https://lore.kernel.org/openbmc/CAH2-KxBV9_0Dt79Quy0f4HkXXPdHfBw9FsG=4KwdWXBYNEA-ew@mail.gmail.com/
408f4febd00SPatrick Williams- https://lore.kernel.org/openbmc/YtPrcDzaxXiM6vYJ@heinlein.stwcx.org.github.beta.tailscale.net/
4097e7e0e40SAndrew Jeffery
4107e7e0e40SAndrew Jeffery### Description
411f4febd00SPatrick Williams
4127e7e0e40SAndrew JefferyPlatform functionality selected by GPIOs might equally be selected by other
4137e7e0e40SAndrew Jefferymeans with a shift in system design philosophy. As such, GPIOs are a (hardware)
4147e7e0e40SAndrew Jefferyimplementation detail. Exposing the implementation on DBus forces the
4157e7e0e40SAndrew Jefferydistribution of platform design details across multiple applications, which
4167e7e0e40SAndrew Jefferyviolates the software design principle of [low coupling][coupling] and impacts
4177e7e0e40SAndrew Jefferyour confidence in maintenance.
4187e7e0e40SAndrew Jeffery
4197e7e0e40SAndrew Jeffery[coupling]: https://en.wikipedia.org/wiki/Coupling_%28computer_programming%29
4207e7e0e40SAndrew Jeffery
4217e7e0e40SAndrew Jeffery### Background
422f4febd00SPatrick Williams
4237e7e0e40SAndrew JefferyGPIOs are often used to select functionality that might be hard to generalise,
424f4febd00SPatrick Williamsand therefore hard to abstract. If this is the case, then the functionality in
425f4febd00SPatrick Williamsquestion is probably best wrapped up as an implementation detail of a behaviour
426f4febd00SPatrick Williamsthat is more generally applicable (e.g. host power-on procedures).
4277e7e0e40SAndrew Jeffery
4287e7e0e40SAndrew Jeffery### Resolution
429f4febd00SPatrick Williams
4307e7e0e40SAndrew JefferyConsider what functionality the GPIO provides and design or exploit an existing
4317e7e0e40SAndrew Jefferyinterface to expose that behaviour instead.
43217d84a2dSAndrew Jeffery
43317d84a2dSAndrew Jeffery## Very long lambda callbacks
43417d84a2dSAndrew Jeffery
43517d84a2dSAndrew Jeffery### Identification
43617d84a2dSAndrew Jeffery
43717d84a2dSAndrew JefferyC++ code that is similar to the following:
43817d84a2dSAndrew Jeffery
43917d84a2dSAndrew Jeffery```cpp
44017d84a2dSAndrew Jefferydbus::utility::getSubTree("/", interfaces,
44117d84a2dSAndrew Jeffery                         [asyncResp](boost::system::error_code& ec,
44217d84a2dSAndrew Jeffery                                     MapperGetSubTreeResult& res){
44317d84a2dSAndrew Jeffery                            <too many lines of code>
44417d84a2dSAndrew Jeffery                         })
44517d84a2dSAndrew Jeffery```
44617d84a2dSAndrew Jeffery
44717d84a2dSAndrew Jeffery### Description
44817d84a2dSAndrew Jeffery
44917d84a2dSAndrew JefferyInline lambdas, while useful in some contexts, are difficult to read, and have
45017d84a2dSAndrew Jefferyinconsistent formatting with tools like `clang-format`, which causes significant
45117d84a2dSAndrew Jefferyproblems in review, and in applying patchsets that might have minor conflicts.
45217d84a2dSAndrew JefferyIn addition, because they are declared in a function scope, they are difficult
45317d84a2dSAndrew Jefferyto unit test, and produce log messages that are difficult to read given their
45417d84a2dSAndrew Jefferyunnamed nature. They are also difficult to debug, lacking any symbol name to
45517d84a2dSAndrew Jefferywhich to attach breakpoints or tracepoints.
45617d84a2dSAndrew Jeffery
45717d84a2dSAndrew Jeffery### Background
45817d84a2dSAndrew Jeffery
45917d84a2dSAndrew JefferyLambdas are a natural approach to implementing callbacks in the context of
46017d84a2dSAndrew Jefferyasynchronous IO. Further, the Boost and sdbusplus ASIO APIs tend to encourage
46117d84a2dSAndrew Jefferythis approach. Doing something other than lambdas requires more effort, and so
46217d84a2dSAndrew Jefferythese options tend not to be chosen without pressure to do so.
46317d84a2dSAndrew Jeffery
46417d84a2dSAndrew Jeffery### Resolution
46517d84a2dSAndrew Jeffery
46617d84a2dSAndrew JefferyPrefer to either use `std::bind_front` and a method or static function to handle
46717d84a2dSAndrew Jefferythe return, or a lambda that is less than 10 lines of code to handle an error
46817d84a2dSAndrew Jefferyinline. In cases where `std::bind_front` cannot be used, such as in
46917d84a2dSAndrew Jeffery`sdbusplus::asio::connection::async_method_call`, keep the lambda length less
47017d84a2dSAndrew Jefferythan 10 lines, and call the appropriate function for handling non-trivial
47117d84a2dSAndrew Jefferytransforms.
47217d84a2dSAndrew Jeffery
47317d84a2dSAndrew Jeffery```cpp
47417d84a2dSAndrew Jefferyvoid afterGetSubTree(std::shared_ptr<bmcweb::AsyncResp>& asyncResp,
47517d84a2dSAndrew Jeffery                     boost::system::error_code& ec,
47617d84a2dSAndrew Jeffery                     MapperGetSubTreeResult& res){
47717d84a2dSAndrew Jeffery   <many lines of code>
47817d84a2dSAndrew Jeffery}
47917d84a2dSAndrew Jefferydbus::utility::getSubTree("/xyz/openbmc_project/inventory", interfaces,
48017d84a2dSAndrew Jeffery                         std::bind_front(afterGetSubTree, asyncResp));
48117d84a2dSAndrew Jeffery```
48217d84a2dSAndrew Jeffery
48317d84a2dSAndrew JefferySee also the [Cpp Core Guidelines][] for generalized guidelines on when lambdas
48417d84a2dSAndrew Jefferyare appropriate. The above recommendation is aligned with the Cpp Core
48517d84a2dSAndrew JefferyGuidelines.
48617d84a2dSAndrew Jeffery
48717d84a2dSAndrew Jeffery[Cpp Core Guidelines]:
48817d84a2dSAndrew Jeffery  https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#f11-use-an-unnamed-lambda-if-you-need-a-simple-function-object-in-one-place-only
4895eda2a9bSAndrew Jeffery
4905eda2a9bSAndrew Jeffery## Placing internal headers in a parallel subtree
4915eda2a9bSAndrew Jeffery
4925eda2a9bSAndrew Jeffery### Identification
4935eda2a9bSAndrew Jeffery
4945eda2a9bSAndrew JefferyDeclaring types and functions that do not participate in a public API of a
4955eda2a9bSAndrew Jefferylibrary in header files that do not live alongside their implementation in the
4965eda2a9bSAndrew Jefferysource tree.
4975eda2a9bSAndrew Jeffery
4985eda2a9bSAndrew Jeffery### Description
4995eda2a9bSAndrew Jeffery
5005eda2a9bSAndrew JefferyThere's no reason to put internal headers in a parallel subtree (for example,
5015eda2a9bSAndrew Jeffery`include/`). It's more effort to organise, puts unnecessary distance between
5025eda2a9bSAndrew Jefferydeclarations and definitions, and increases effort required in the build system
5035eda2a9bSAndrew Jefferyconfiguration.
5045eda2a9bSAndrew Jeffery
5055eda2a9bSAndrew Jeffery### Background
5065eda2a9bSAndrew Jeffery
5075eda2a9bSAndrew JefferyIn C and C++, header files expose the public API of a library to its consumers
5085eda2a9bSAndrew Jefferyand need to be installed onto the developer's system in a location known to the
5095eda2a9bSAndrew Jefferycompiler. To delineate which header files are to be installed and which are not,
5105eda2a9bSAndrew Jefferythe public header files are often placed into an `include/` subdirectory of the
5115eda2a9bSAndrew Jefferysource tree to mark their importance.
5125eda2a9bSAndrew Jeffery
5135eda2a9bSAndrew JefferyAny functions or structures that are implementation details of the library
5145eda2a9bSAndrew Jefferyshould not be provided in its installed header files. Ignoring this philosphy
5155eda2a9bSAndrew Jefferyover-exposes the library's design and may lead to otherwise unnecessary API or
5165eda2a9bSAndrew JefferyABI breaks in the future.
5175eda2a9bSAndrew Jeffery
5185eda2a9bSAndrew JefferyFurther, projects whose artifacts are only application binaries have no public
5195eda2a9bSAndrew JefferyAPI or ABI in the sense of a library. Any header files in the source tree of
5205eda2a9bSAndrew Jefferysuch projects have no need to be installed onto a developer's system and
5215eda2a9bSAndrew Jefferysegregation in path from the implementation serves no purpose.
5225eda2a9bSAndrew Jeffery
5235eda2a9bSAndrew Jeffery### Resolution
5245eda2a9bSAndrew Jeffery
5255eda2a9bSAndrew JefferyPlace internal header files immediately alongside source files containing the
5265eda2a9bSAndrew Jefferycorresponding implementation.
527*c0aae66eSAndrew Jeffery
528*c0aae66eSAndrew Jeffery## Ill-defined data structuring in `lg2` message strings
529*c0aae66eSAndrew Jeffery
530*c0aae66eSAndrew Jeffery### Identification
531*c0aae66eSAndrew Jeffery
532*c0aae66eSAndrew JefferyAttempts at encoding information into the journal's MESSAGE string that is at
533*c0aae66eSAndrew Jefferymost only plausible to parse using a regex while also reducing human
534*c0aae66eSAndrew Jefferyreadability. For example:
535*c0aae66eSAndrew Jeffery
536*c0aae66eSAndrew Jeffery```
537*c0aae66eSAndrew Jefferyerror(
538*c0aae66eSAndrew Jeffery    "Error getting time, PATH={BMC_TIME_PATH} TIME INTERACE={TIME_INTF}  ERROR={ERR_EXCEP}",
539*c0aae66eSAndrew Jeffery    "BMC_TIME_PATH", bmcTimePath, "TIME_INTF", timeInterface,
540*c0aae66eSAndrew Jeffery    "ERR_EXCEP", e);
541*c0aae66eSAndrew Jeffery```
542*c0aae66eSAndrew Jeffery
543*c0aae66eSAndrew Jeffery### Description
544*c0aae66eSAndrew Jeffery
545*c0aae66eSAndrew Jeffery[`lg2` is OpenBMC's preferred C++ logging interface][phosphor-logging-lg2] and
546*c0aae66eSAndrew Jefferyis implemented on top of the systemd journal and its library APIs.
547*c0aae66eSAndrew Jeffery[systemd-journald provides structured logging][systemd-structured-logging],
548*c0aae66eSAndrew Jefferywhich allows us to capture additional metadata alongside the provided message.
549*c0aae66eSAndrew Jeffery
550*c0aae66eSAndrew Jeffery[phosphor-logging-lg2]:
551*c0aae66eSAndrew Jeffery  https://github.com/openbmc/phosphor-logging/blob/master/docs/structured-logging.md
552*c0aae66eSAndrew Jeffery[systemd-structured-logging]:
553*c0aae66eSAndrew Jeffery  https://0pointer.de/blog/projects/journal-submit.html
554*c0aae66eSAndrew Jeffery
555*c0aae66eSAndrew JefferyThe concept of structured logging allows for convenient programmable access to
556*c0aae66eSAndrew Jefferymetadata associated with a log event. The journal captures a default set of
557*c0aae66eSAndrew Jefferymetadata with each message logged. However, the primary way the entries are
558*c0aae66eSAndrew Jefferyexposed to users is `journalctl`'s default behaviour of printing just the
559*c0aae66eSAndrew Jeffery`MESSAGE` value for each log entry. This may lead to a misunderstanding that the
560*c0aae66eSAndrew Jefferyprinted message is the only way to expose related metadata for investigating
561*c0aae66eSAndrew Jefferydefects.
562*c0aae66eSAndrew Jeffery
563*c0aae66eSAndrew JefferyFor human ergonomics `lg2` provides the ability to interpolate structured data
564*c0aae66eSAndrew Jefferyinto the journal's `MESSAGE` string. This aids with human inspection of the logs
565*c0aae66eSAndrew Jefferyas it becomes possible to highlight important metadata for a log event. However,
566*c0aae66eSAndrew Jefferyit's not intended that this interpolation be used for ad-hoc, ill-defined
567*c0aae66eSAndrew Jefferyattempts at exposing metadata for automated analysis.
568*c0aae66eSAndrew Jeffery
569*c0aae66eSAndrew JefferyAll key-value pairs provided to the `lg2` logging APIs are captured in the
570*c0aae66eSAndrew Jefferystructured log event, regardless of whether any particular key is interpolated
571*c0aae66eSAndrew Jefferyinto the `MESSAGE` string. It is always possible to recover the information
572*c0aae66eSAndrew Jefferyassociated with the log event even if it's not captured in the `MESSAGE` string.
573*c0aae66eSAndrew Jeffery
574*c0aae66eSAndrew Jeffery`phosphor-time-manager` demonstrates a reasonable use of the `lg2` APIs. One
575*c0aae66eSAndrew Jefferylogging instance in the code [is as follows][phosphor-time-manager-lg2]:
576*c0aae66eSAndrew Jeffery
577*c0aae66eSAndrew Jeffery[phosphor-time-manager-lg2]:
578*c0aae66eSAndrew Jeffery  https://github.com/openbmc/phosphor-time-manager/blob/5ce9ac0e56440312997b25771507585905e8b360/manager.cpp#L98
579*c0aae66eSAndrew Jeffery
580*c0aae66eSAndrew Jeffery```
581*c0aae66eSAndrew Jefferyinfo("Time mode has been changed to {MODE}", "MODE", newMode);
582*c0aae66eSAndrew Jeffery```
583*c0aae66eSAndrew Jeffery
584*c0aae66eSAndrew JefferyBy default, this renders in the output of `journalctl` as:
585*c0aae66eSAndrew Jeffery
586*c0aae66eSAndrew Jeffery```
587*c0aae66eSAndrew JefferySep 23 06:09:57 bmc phosphor-time-manager[373]: Time mode has been changed to xyz.openbmc_project.Time.Synchronization.Method.NTP
588*c0aae66eSAndrew Jeffery```
589*c0aae66eSAndrew Jeffery
590*c0aae66eSAndrew JefferyHowever, we can use some journalctl commandline options to inspect the
591*c0aae66eSAndrew Jefferystructured data associated with the log entry:
592*c0aae66eSAndrew Jeffery
593*c0aae66eSAndrew Jeffery```
594*c0aae66eSAndrew Jeffery# journalctl --identifier=phosphor-time-manager --boot --output=verbose | grep -v '^    _' | head -n 9
595*c0aae66eSAndrew JefferySat 2023-09-23 06:09:57.645208 UTC [s=85c1cb5f8e02445aa110a5164c9c07f6;i=244;b=ffd111d3cdca41c8893bb728a1c6cb20;m=133a5a0;t=606009314d0d9;x=9a54e8714754a6cb]
596*c0aae66eSAndrew Jeffery    PRIORITY=6
597*c0aae66eSAndrew Jeffery    MESSAGE=Time mode has been changed to xyz.openbmc_project.Time.Synchronization.Method.NTP
598*c0aae66eSAndrew Jeffery    LOG2_FMTMSG=Time mode has been changed to {MODE}
599*c0aae66eSAndrew Jeffery    CODE_FILE=/usr/src/debug/phosphor-time-manager/1.0+git/manager.cpp
600*c0aae66eSAndrew Jeffery    CODE_LINE=98
601*c0aae66eSAndrew Jeffery    CODE_FUNC=bool phosphor::time::Manager::setCurrentTimeMode(const std::string&)
602*c0aae66eSAndrew Jeffery    MODE=xyz.openbmc_project.Time.Synchronization.Method.NTP
603*c0aae66eSAndrew Jeffery    SYSLOG_IDENTIFIER=phosphor-time-manager
604*c0aae66eSAndrew Jeffery```
605*c0aae66eSAndrew Jeffery
606*c0aae66eSAndrew JefferyHere we find that `MODE` and its value are captured as its own metadata entry in
607*c0aae66eSAndrew Jefferythe structured data, as well as being interpolated into `MESSAGE` as requested.
608*c0aae66eSAndrew JefferyAdditionally, from the log entry we can find _how_ `MODE` was interpolated into
609*c0aae66eSAndrew Jeffery`MESSAGE` using the format string captured in the `LOG2_FMTMSG` metadata entry.
610*c0aae66eSAndrew Jeffery
611*c0aae66eSAndrew Jeffery`LOG2_FMTMSG` also provides a stable handle for identifying the existence of a
612*c0aae66eSAndrew Jefferyspecific class of log events in the journal, further aiding automated analysis.
613*c0aae66eSAndrew Jeffery
614*c0aae66eSAndrew Jeffery### Background
615*c0aae66eSAndrew Jeffery
616*c0aae66eSAndrew JefferyA variety of patches such as [PLDM:Catching exception precisely and printing
617*c0aae66eSAndrew Jefferyit][openbmc-gerrit-67994] added a number of ad-hoc, ill-defined attempts at
618*c0aae66eSAndrew Jefferyproviding all the metadata through the `MESSAGE` entry.
619*c0aae66eSAndrew Jeffery
620*c0aae66eSAndrew Jeffery[openbmc-gerrit-67994]: https://gerrit.openbmc.org/c/openbmc/pldm/+/67994
621*c0aae66eSAndrew Jeffery
622*c0aae66eSAndrew Jeffery### Resolution
623*c0aae66eSAndrew Jeffery
624*c0aae66eSAndrew Jeffery`lg2` messages should be formatted for consumption by humans. They should not
625*c0aae66eSAndrew Jefferycontain ad-hoc, ill-defined attempts at integrating metadata for automated
626*c0aae66eSAndrew Jefferyanalysis.
627