1# Checklist for making changes to `libpldm`
2
3## Philosophy and influences
4
5- [Good Practices in Library Design, Implementation, and Maintenance - Ulrich
6  Drepper][goodpractice]
7
8[goodpractice]: https://www.akkadia.org/drepper/goodpractice.pdf
9
10- [How Do I Make This Hard to Misuse? - Rusty Russell][rusty-api-scale-good]
11
12[rusty-api-scale-good]: https://ozlabs.org/~rusty/index.cgi/tech/2008-03-30.html
13
14- [What If I Don't Actually Like My Users? - Rusty Russell][rusty-api-scale-bad]
15
16[rusty-api-scale-bad]: https://ozlabs.org/~rusty/index.cgi/tech/2008-04-01.html
17
18- [Red flags that indicate questionable quality - Lennart
19  Poettering][poettering-library-red-flags]
20
21[poettering-library-red-flags]:
22  https://mastodon.social/@pid_eins/112517953375791453
23
24- [Not sure if this is a gcc bug or some weird corner of UB or what... - Andrew
25  Zonenberg][azonenberg-packed-struct]
26
27[azonenberg-packed-struct]: https://ioc.exchange/@azonenberg/112535511250395148
28
29- [The Good, the Bad, and the Weird - Trail of Bits
30  Blog][trail-of-bits-weird-machines]
31
32[trail-of-bits-weird-machines]:
33  https://blog.trailofbits.com/2018/10/26/the-good-the-bad-and-the-weird/
34
35- [Logic for Programmers - Hillel Wayne][logic-for-programmers]
36
37[logic-for-programmers]: https://leanpub.com/logic
38
39- [Parse, don’t validate - Alexis King][alexis-king-parse-dont-validate]
40
41[alexis-king-parse-dont-validate]:
42  https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/
43
44- [The byte order fallacy - command center][command-center-byte-order-fallacy]
45
46[command-center-byte-order-fallacy]:
47  https://commandcenter.blogspot.com/2012/04/byte-order-fallacy.html
48
49- [what’s the most annoying thing about writing C - Jessica
50  Paquette][barrelshifter-annoying-c]
51
52[barrelshifter-annoying-c]:
53  https://mastodon.social/@barrelshifter/114016810912730423
54
55- [The Power of Ten: Rules for Developing Safety Critical Code - NASA/JPL
56  Laboratory for Reliable Software][nasa-reliable-code]
57
58[nasa-reliable-code]:
59  https://www.cs.otago.ac.nz/cosc345/resources/nasa-10-rules.pdf
60
61- [C Isn't A Programming Language Anymore - Aria Desires][aria-c-is-a-protocol]
62
63[aria-c-is-a-protocol]: https://faultlore.com/blah/c-isnt-a-language/
64
65- [To be a better programmer, write little proofs in your head - Matthew
66  Prast][matthew-prast-little-proofs]
67
68[matthew-prast-little-proofs]:
69  https://the-nerve-blog.ghost.io/to-be-a-better-programmer-write-little-proofs-in-your-head/
70
71## References
72
73- [The C programming language - C Standards Committee][c-language]
74
75[c-language]: https://www.c-language.org/
76
77- [SEI CERT C Coding Standard][sei-cert-c-coding-standard]
78
79[sei-cert-c-coding-standard]:
80  https://wiki.sei.cmu.edu/confluence/display/c/SEI+CERT+C+Coding+Standard
81
82- [Common Weakness Enumeration (CWE) - Software
83  Development][common-weakness-enumeration-sw]
84
85[common-weakness-enumeration-sw]:
86  https://cwe.mitre.org/data/definitions/699.html
87
88## Definitions
89
90- **Error condition**: An invalid state reached at runtime, caused either by
91  resource exhaustion, or incorrect use of the library's public APIs and data
92  types.
93
94- **Invariant**: A condition in the library's implementation that must never
95  evaluate false.
96
97- **Public API**: Any definitions and declarations under `include/libpldm`.
98
99- **Wire format**: Any message structure defined in the DMTF PLDM protocol
100  specifications.
101
102## Elaborations
103
104- Resource exhaustion is always an error condition and never an invariant
105  violation.
106
107- An invariant violation is always a programming failure of the library's
108  implementation, and never the result of incorrect use of the library's public
109  APIs (see error condition).
110
111- Corollaries of the above two points:
112  - Incorrect use of public API functions is always an error condition, and is
113    dealt with by returning an error code.
114
115  - Incorrect use of static functions in the library's implementation is an
116    invariant violation which may be established using `assert()`.
117
118- `assert()` is the recommended way to demonstrate invariants are upheld.
119
120## Library background
121
122### The ABI lifecycle
123
124```mermaid
125---
126title: libpldm symbol lifecycle
127---
128stateDiagram-v2
129    direction LR
130    [*] --> Testing: Add
131    Testing --> Testing: Change
132    Testing --> [*]: Remove
133    Testing --> Stable: Stabilise
134    Stable --> Deprecated: Deprecate
135    Deprecated --> [*]: Remove
136```
137
138The ABI of the library produced by the build is controlled using the `abi` meson
139option. The following use cases determine how the `abi` option should be
140specified:
141
142| Use Case    | Meson Configuration               |
143| ----------- | --------------------------------- |
144| Production  | `-Dabi=deprecated,stable`         |
145| Maintenance | `-Dabi=stable`                    |
146| Development | `-Dabi=deprecated,stable,testing` |
147
148### Maintenance
149
150Applications and libraries that depend on `libpldm` can identify how to migrate
151off of deprecated APIs by constraining the library ABI to the stable category.
152This will force the compiler identify any call-sites that try to link against
153deprecated symbols.
154
155### Development
156
157Applications and libraries often require functionality that doesn't yet exist in
158`libpldm`. The work is thus in two parts:
159
1601. Add the required APIs to `libpldm`
1612. Use the new APIs from `libpldm` in the dependent application or library
162
163Adding APIs to a library is a difficult task. Generally, once an API is exposed
164in the library's ABI, any changes to the API risk breaking applications already
165making use of it. To make sure we have more than one shot at getting an API
166right, all new APIs must first be exposed in the testing category. Concretely:
167
168Patches adding new APIs MUST mark them as testing and MUST NOT mark them as
169stable.
170
171### Marking functions as testing, stable or deprecated
172
173Three macros are provided through `config.h` (automatically included for all
174translation units) to mark functions as testing, stable or deprecated:
175
1761. `LIBPLDM_ABI_TESTING`
1772. `LIBPLDM_ABI_STABLE`
1783. `LIBPLDM_ABI_DEPRECATED`
179
180These annotations go immediately before your function signature:
181
182```c
183LIBPLDM_ABI_TESTING
184pldm_requester_rc_t pldm_transport_send_msg(struct pldm_transport *transport,
185                                            pldm_tid_t tid,
186                                            const void *pldm_req_msg,
187                                            size_t req_msg_len)
188{
189    ...
190}
191```
192
193### What does it mean to mark a function as stable?
194
195Marking a function as stable makes the following promise to users of the
196library:
197
198> We will not remove or change the symbol name, argument count, argument types,
199> return type, or interpretation of relevant values for the function before
200> first marking it as `LIBPLDM_ABI_DEPRECATED` and then subsequently creating a
201> tagged release
202
203Marking a function as stable does _not_ promise that it is free of
204implementation bugs. It is just a promise that the prototype won't change
205without notice.
206
207Given this, it is always okay to implement functions marked stable in terms of
208functions marked testing inside of libpldm. If we remove or change the prototype
209of a function marked testing the only impact is that we need to fix up any call
210sites of that function in the same patch.
211
212### Requirements for stabilising a function
213
214To move a function from the testing category to the stable category, it's
215required that patches demonstrating use of the function in a dependent
216application or library be linked in the commit message of the stabilisation
217change. We require this to demonstrate that the implementer has considered its
218use in context _before_ preventing us from making changes to the API.
219
220### Building a dependent application or library against a testing ABI
221
222Meson is broadly used in the OpenBMC ecosystem, the historical home of
223`libpldm`. Meson's subprojects are a relatively painless way of managing
224dependencies for the purpose of developing complex applications and libraries.
225Use of `libpldm` as a subproject is both supported and encouraged.
226
227`libpldm`'s ABI can be controlled from a parent project through meson's
228subproject configuration syntax:
229
230```shell
231meson setup ... -Dlibpldm:abi=deprecated,stable,testing ...
232```
233
234## General practices
235
236- [ ] My commit message subject is prefixed with the name of the impacted
237      subsystem
238
239- [ ] My commit message describes testing practices only if the discussion is
240      substantive.
241  - The description must contain enough information for someone else to
242    reproduce the setup and verify the results.
243
244  - A section on testing should not be added if there was no specific testing
245    performed
246
247## Adding a new API
248
249### Naming macros, functions and types
250
251- [ ] All publicly exposed macros, types and functions relating to the PLDM
252      specifications must be prefixed with either `pldm_` or `PLDM_` as
253      appropriate
254  - The only (temporary) exception are the `encode_*()` and `decode_*()`
255    function symbols
256
257- [ ] All publicly exposed macros, types and functions relating to the library
258      implementation must be prefixed with `libpldm_` or `LIBPLDM_`
259
260- [ ] All `pldm_`-prefixed symbols must also name the related specification. For
261      example, for DSP0248 Platform Monitoring and Control, the symbol prefix
262      should be `pldm_platform_`.
263
264- [ ] All enum members must be prefixed with the type name
265
266### API design
267
268- [ ] If I've added support for a new PLDM message type, then I've defined both
269      the encoder and decoder for that message.
270  - This applies for both request _and_ response message types.
271
272- [ ] I've designed my APIs so their implementation does not require heap
273      allocation.
274  - Prefer [defining iterators][libpldm-iterator] over the message buffer to
275    extract sub-structures from variable-length messages. Iterators avoid both
276    requiring heap allocation in the implementation or splitting the API to
277    allow the caller to allocate appropriate space. Instead, the caller is
278    provided with an on-stack struct containing the extracted sub-structure.
279
280[libpldm-iterator]:
281  https://github.com/openbmc/libpldm/commit/3a2c6589c5660d2066b612bae28ca393a8aa1c2b
282
283- [ ] My `encode_*()` APIs exchange with the caller the size of the destination
284      buffer and the length of the encoded data using an in-out buffer length
285      parameter
286
287- [ ] My new public message codec functions take a `struct` representing the
288      message as a parameter
289  - Function prototypes must _not_ decompose the message to individual
290    parameters. This approach is not ergonomic and is difficult to make
291    type-safe. This is especially true for message decoding functions which must
292    use pointers for out-parameters, where it has often become ambiguous whether
293    the underlying memory represents a single object or an array.
294
295- [ ] Each new `struct` I've defined is used in at least one new function I've
296      added to the public API.
297
298- [ ] My new public `struct` definitions are _not_ marked
299      `__attribute__((packed))`
300
301- [ ] My new public `struct` definitions do _not_ define a flexible array
302      member, unless:
303  - [ ] It's contained in an `#ifndef __cplusplus` macro guard, as flexible
304        arrays are not specified by C++, and
305
306  - [ ] I've implemented an accessor function so the array base pointer can be
307        accessed from C++, and
308
309  - [ ] It is defined as per the C17 specification by omitting the length[^1]
310    - Note: Any array defined with length 1 is _not_ a flexible array, and any
311      access beyond the first element invokes undefined behaviour in both C and
312      C++.
313
314  - [ ] I've annotated the flexible array member with `LIBPLDM_CC_COUNTED_BY()`
315
316[^1]:
317    [C17 draft specification][c17-draft-standard], 6.7.2.1 Structure and union
318    specifiers, paragraph 18.
319
320[c17-draft-standard]:
321  https://web.archive.org/web/20181230041359/http://www.open-std.org/jtc1/sc22/wg14/www/abq/c17_updated_proposed_fdis.pdf
322
323### ABI control
324
325- [ ] My new function symbols are marked with `LIBPLDM_ABI_TESTING` in the
326      implementation
327
328- [ ] I've guarded the test cases of functions marked `LIBPLDM_ABI_TESTING` so
329      that they are not compiled when the corresponding function symbols aren't
330      visible
331
332### Error handling and invariants
333
334- [ ] All my error conditions are handled by returning an error code to the
335      caller.
336
337- [ ] All my invariants are tested using `assert()`.
338
339- [ ] I have not used `assert()` to evaluate any error conditions without also
340      handling the error condition by returning an error code the the caller.
341  - Release builds of the library are configured with `assert()` disabled
342    (`-Db_ndebug=if-release`, which provides `-DNDEBUG` in `CFLAGS`).
343
344- [ ] My new APIs return negative `errno` values on error and not PLDM
345      completion codes.
346  - [ ] The specific error values my function returns and their meaning in the
347        context of the function call are listed in the API documentation.
348
349### Implementation
350
351- [ ] If my work interacts with the PLDM wire format, then I have done so using
352      the `msgbuf` APIs found in `src/msgbuf.h` (and under `src/msgbuf/`) to
353      minimise concerns around spatial memory safety and endian-correctness.
354
355- [ ] I've used `goto` to clean up resources acquired prior to encountering an
356      error condition
357  - Replication of resource cleanup across multiple error paths is error-prone,
358    especially when multiple, dependent resources have been acquired.
359
360- [ ] I've released acquired resources in stack-order
361  - This should be the case regardless of whether we're in the happy path at the
362    end of object lifetime or an error path during construction.
363
364- [ ] I've declared variables in [reverse-christmas-tree (inverted pyramid)
365      order][hisham-make-pyramids] in any block scopes I've added or changed.
366
367[hisham-make-pyramids]:
368  https://web.archive.org/web/20220404224603/https://hisham.hm/2018/06/16/when-listing-repeated-things-make-pyramids/
369
370### Testing
371
372- [ ] I've implemented test cases with reasonable branch coverage of each new
373      function I've added
374
375### Maintenance
376
377- [ ] If I've added support for a new message type, then my commit message
378      specifies all of:
379  - [ ] The relevant DMTF specification by its DSP number and title
380  - [ ] The relevant version of the specification
381  - [ ] The section of the specification that defines the message type
382
383- [ ] If my work impacts the public API of the library, then I've added an entry
384      to `CHANGELOG.md` describing my work
385
386### OEM/vendor-specific APIs
387
388- [ ] I've documented the wire format for all OEM messages under
389      `docs/oem/${OEM_NAME}/`
390
391- [ ] I've added public OEM API declarations and definitions under
392      `include/libpldm/oem/${OEM_NAME}/`, and installed them to the same
393      relative location.
394
395- [ ] I've implemented the public OEM APIs under `src/oem/${OEM_NAME}/`
396
397- [ ] I've implemented the OEM API tests under `tests/oem/${OEM_NAME}/`
398
399The `${OEM_NAME}` folder must be created with the name of the OEM/vendor in
400lower case.
401
402Finally, the OEM name must be added to the list of choices for the `oem` meson
403option, and the `meson.build` files updated throughout the tree to guard
404integration of the OEM extensions.
405
406## Stabilising an existing API
407
408- [ ] The API of interest is currently marked `LIBPLDM_ABI_TESTING`
409
410- [ ] My commit message links to a publicly visible patch that makes use of the
411      API
412
413- [ ] My commit updates the annotation from `LIBPLDM_ABI_TESTING` to
414      `LIBPLDM_ABI_STABLE` only for the function symbols demonstrated by the
415      patch linked in the commit message.
416
417- [ ] I've removed guards from the function's tests so they are always compiled
418
419- [ ] If I've updated the ABI dump, then I've used the OpenBMC CI container to
420      do so.
421
422## Updating an ABI dump
423
424To update the ABI dump you'll need to build an appropriate OpenBMC CI container
425image of your own. Some hints on how to do this locally can be found [in the
426openbmc/docs repository][openbmc-docs-local-ci]. You can list your locally built
427images with `docker images`.
428
429[openbmc-docs-local-ci]:
430  https://github.com/openbmc/docs/blob/master/testing/local-ci-build.md
431
432Assuming:
433
434```shell
435export OPENBMC_CI_IMAGE=openbmc/ubuntu-unit-test:2024-W21-ce361f95ff4fa669
436```
437
438the ABI dump can be updated with:
439
440```shell
441docker run \
442  --cap-add=sys_admin \
443  --rm=true \
444  --privileged=true \
445  -u $USER \
446  -w $(pwd) \
447  -v $(pwd):$(pwd) \
448  -e MAKEFLAGS= \
449  -t $OPENBMC_CI_IMAGE \
450  ./scripts/abi-dump-updater
451```
452
453## Removing an API
454
455- [ ] If the function is marked `LIBPLDM_ABI_TESTING`, then I have removed it
456
457- [ ] If the function is marked `LIBPLDM_ABI_STABLE`, then I have changed the
458      annotation to `LIBPLDM_ABI_DEPRECATED` and left it in-place.
459  - [ ] I have updated the ABI dump, or will mark the change as WIP until it has
460        been.
461
462- [ ] If the function is marked `LIBPLDM_ABI_DEPRECATED`, then I have removed it
463      only after satisfying myself that each of the following is true:
464  - [ ] There are no known users of the function left in the community
465  - [ ] There has been at least one tagged release of `libpldm` subsequent to
466        the API being marked deprecated
467
468## Renaming an API
469
470A change to an API is a pure rename only if there are no additional behavioural
471changes. Renaming an API with no other behavioural changes is really two
472actions:
473
4741. Introducing the new API name
4752. Deprecating the old API name
476
477- [ ] Only the name of the function has changed. None of its behaviour has
478      changed.
479
480- [ ] Both the new and the old functions are declared in the public headers
481
482- [ ] I've aliased the old function name to the new function name via the
483      `libpldm_deprecated_aliases` list in `meson.build`
484
485- [ ] I've added a [semantic patch][coccinelle] to migrate users from the old
486      name to the new name
487
488[coccinelle]: https://coccinelle.gitlabpages.inria.fr/website/
489
490- [ ] I've updated the ABI dump to capture the rename, or will mark the change
491      as WIP until it has been.
492
493## Fixing Implementation Defects
494
495- [ ] My change fixing the bug includes a [Fixes tag][linux-kernel-fixes-tag]
496      identifying the change introducing the defect.
497
498[linux-kernel-fixes-tag]:
499  https://docs.kernel.org/process/submitting-patches.html#describe-your-changes
500
501- [ ] My change fixing the bug includes test cases demonstrating that the bug is
502      fixed.
503