1# Checklist for making changes to `libpldm` 2 3## Philosophy and influences 4 5- [Good Practices in Library Design, Implementation, and Maintenance - Ulrich 6 Drepper][goodpractice] 7 8[goodpractice]: https://www.akkadia.org/drepper/goodpractice.pdf 9 10- [How Do I Make This Hard to Misuse? - Rusty Russell][rusty-api-scale-good] 11 12[rusty-api-scale-good]: https://ozlabs.org/~rusty/index.cgi/tech/2008-03-30.html 13 14- [What If I Don't Actually Like My Users? - Rusty Russell][rusty-api-scale-bad] 15 16[rusty-api-scale-bad]: https://ozlabs.org/~rusty/index.cgi/tech/2008-04-01.html 17 18- [Red flags that indicate questionable quality - Lennart 19 Poettering][poettering-library-red-flags] 20 21[poettering-library-red-flags]: 22 https://mastodon.social/@pid_eins/112517953375791453 23 24- [Not sure if this is a gcc bug or some weird corner of UB or what... - Andrew 25 Zonenberg][azonenberg-packed-struct] 26 27[azonenberg-packed-struct]: https://ioc.exchange/@azonenberg/112535511250395148 28 29- [The Good, the Bad, and the Weird - Trail of Bits 30 Blog][trail-of-bits-weird-machines] 31 32[trail-of-bits-weird-machines]: 33 https://blog.trailofbits.com/2018/10/26/the-good-the-bad-and-the-weird/ 34 35- [Logic for Programmers - Hillel Wayne][logic-for-programmers] 36 37[logic-for-programmers]: https://leanpub.com/logic 38 39- [Parse, don’t validate - Alexis King][alexis-king-parse-dont-validate] 40 41[alexis-king-parse-dont-validate]: 42 https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/ 43 44- [The byte order fallacy - command center][command-center-byte-order-fallacy] 45 46[command-center-byte-order-fallacy]: 47 https://commandcenter.blogspot.com/2012/04/byte-order-fallacy.html 48 49- [what’s the most annoying thing about writing C - Jessica 50 Paquette][barrelshifter-annoying-c] 51 52[barrelshifter-annoying-c]: 53 https://mastodon.social/@barrelshifter/114016810912730423 54 55- [The Power of Ten: Rules for Developing Safety Critical Code - NASA/JPL 56 Laboratory for Reliable Software][nasa-reliable-code] 57 58[nasa-reliable-code]: 59 https://www.cs.otago.ac.nz/cosc345/resources/nasa-10-rules.pdf 60 61- [C Isn't A Programming Language Anymore - Aria Desires][aria-c-is-a-protocol] 62 63[aria-c-is-a-protocol]: https://faultlore.com/blah/c-isnt-a-language/ 64 65- [To be a better programmer, write little proofs in your head - Matthew 66 Prast][matthew-prast-little-proofs] 67 68[matthew-prast-little-proofs]: 69 https://the-nerve-blog.ghost.io/to-be-a-better-programmer-write-little-proofs-in-your-head/ 70 71## References 72 73- [The C programming language - C Standards Committee][c-language] 74 75[c-language]: https://www.c-language.org/ 76 77- [SEI CERT C Coding Standard][sei-cert-c-coding-standard] 78 79[sei-cert-c-coding-standard]: 80 https://wiki.sei.cmu.edu/confluence/display/c/SEI+CERT+C+Coding+Standard 81 82- [Common Weakness Enumeration (CWE) - Software 83 Development][common-weakness-enumeration-sw] 84 85[common-weakness-enumeration-sw]: 86 https://cwe.mitre.org/data/definitions/699.html 87 88## Definitions 89 90- **Error condition**: An invalid state reached at runtime, caused either by 91 resource exhaustion, or incorrect use of the library's public APIs and data 92 types. 93 94- **Invariant**: A condition in the library's implementation that must never 95 evaluate false. 96 97- **Public API**: Any definitions and declarations under `include/libpldm`. 98 99- **Wire format**: Any message structure defined in the DMTF PLDM protocol 100 specifications. 101 102## Elaborations 103 104- Resource exhaustion is always an error condition and never an invariant 105 violation. 106 107- An invariant violation is always a programming failure of the library's 108 implementation, and never the result of incorrect use of the library's public 109 APIs (see error condition). 110 111- Corollaries of the above two points: 112 - Incorrect use of public API functions is always an error condition, and is 113 dealt with by returning an error code. 114 115 - Incorrect use of static functions in the library's implementation is an 116 invariant violation which may be established using `assert()`. 117 118- `assert()` is the recommended way to demonstrate invariants are upheld. 119 120## Library background 121 122### The ABI lifecycle 123 124```mermaid 125--- 126title: libpldm symbol lifecycle 127--- 128stateDiagram-v2 129 direction LR 130 [*] --> Testing: Add 131 Testing --> Testing: Change 132 Testing --> [*]: Remove 133 Testing --> Stable: Stabilise 134 Stable --> Deprecated: Deprecate 135 Deprecated --> [*]: Remove 136``` 137 138The ABI of the library produced by the build is controlled using the `abi` meson 139option. The following use cases determine how the `abi` option should be 140specified: 141 142| Use Case | Meson Configuration | 143| ----------- | --------------------------------- | 144| Production | `-Dabi=deprecated,stable` | 145| Maintenance | `-Dabi=stable` | 146| Development | `-Dabi=deprecated,stable,testing` | 147 148### Maintenance 149 150Applications and libraries that depend on `libpldm` can identify how to migrate 151off of deprecated APIs by constraining the library ABI to the stable category. 152This will force the compiler identify any call-sites that try to link against 153deprecated symbols. 154 155### Development 156 157Applications and libraries often require functionality that doesn't yet exist in 158`libpldm`. The work is thus in two parts: 159 1601. Add the required APIs to `libpldm` 1612. Use the new APIs from `libpldm` in the dependent application or library 162 163Adding APIs to a library is a difficult task. Generally, once an API is exposed 164in the library's ABI, any changes to the API risk breaking applications already 165making use of it. To make sure we have more than one shot at getting an API 166right, all new APIs must first be exposed in the testing category. Concretely: 167 168Patches adding new APIs MUST mark them as testing and MUST NOT mark them as 169stable. 170 171### Marking functions as testing, stable or deprecated 172 173Three macros are provided through `config.h` (automatically included for all 174translation units) to mark functions as testing, stable or deprecated: 175 1761. `LIBPLDM_ABI_TESTING` 1772. `LIBPLDM_ABI_STABLE` 1783. `LIBPLDM_ABI_DEPRECATED` 179 180These annotations go immediately before your function signature: 181 182```c 183LIBPLDM_ABI_TESTING 184pldm_requester_rc_t pldm_transport_send_msg(struct pldm_transport *transport, 185 pldm_tid_t tid, 186 const void *pldm_req_msg, 187 size_t req_msg_len) 188{ 189 ... 190} 191``` 192 193### What does it mean to mark a function as stable? 194 195Marking a function as stable makes the following promise to users of the 196library: 197 198> We will not remove or change the symbol name, argument count, argument types, 199> return type, or interpretation of relevant values for the function before 200> first marking it as `LIBPLDM_ABI_DEPRECATED` and then subsequently creating a 201> tagged release 202 203Marking a function as stable does _not_ promise that it is free of 204implementation bugs. It is just a promise that the prototype won't change 205without notice. 206 207Given this, it is always okay to implement functions marked stable in terms of 208functions marked testing inside of libpldm. If we remove or change the prototype 209of a function marked testing the only impact is that we need to fix up any call 210sites of that function in the same patch. 211 212### Requirements for stabilising a function 213 214To move a function from the testing category to the stable category, it's 215required that patches demonstrating use of the function in a dependent 216application or library be linked in the commit message of the stabilisation 217change. We require this to demonstrate that the implementer has considered its 218use in context _before_ preventing us from making changes to the API. 219 220### Building a dependent application or library against a testing ABI 221 222Meson is broadly used in the OpenBMC ecosystem, the historical home of 223`libpldm`. Meson's subprojects are a relatively painless way of managing 224dependencies for the purpose of developing complex applications and libraries. 225Use of `libpldm` as a subproject is both supported and encouraged. 226 227`libpldm`'s ABI can be controlled from a parent project through meson's 228subproject configuration syntax: 229 230```shell 231meson setup ... -Dlibpldm:abi=deprecated,stable,testing ... 232``` 233 234## General practices 235 236- [ ] My commit message subject is prefixed with the name of the impacted 237 subsystem 238 239- [ ] My commit message describes testing practices only if the discussion is 240 substantive. 241 - The description must contain enough information for someone else to 242 reproduce the setup and verify the results. 243 244 - A section on testing should not be added if there was no specific testing 245 performed 246 247## Adding a new API 248 249### Naming macros, functions and types 250 251- [ ] All publicly exposed macros, types and functions relating to the PLDM 252 specifications must be prefixed with either `pldm_` or `PLDM_` as 253 appropriate 254 - The only (temporary) exception are the `encode_*()` and `decode_*()` 255 function symbols 256 257- [ ] All publicly exposed macros, types and functions relating to the library 258 implementation must be prefixed with `libpldm_` or `LIBPLDM_` 259 260- [ ] All `pldm_`-prefixed symbols must also name the related specification. For 261 example, for DSP0248 Platform Monitoring and Control, the symbol prefix 262 should be `pldm_platform_`. 263 264- [ ] All enum members must be prefixed with the type name 265 266### API design 267 268- [ ] If I've added support for a new PLDM message type, then I've defined both 269 the encoder and decoder for that message. 270 - This applies for both request _and_ response message types. 271 272- [ ] I've designed my APIs so their implementation does not require heap 273 allocation. 274 - Prefer [defining iterators][libpldm-iterator] over the message buffer to 275 extract sub-structures from variable-length messages. Iterators avoid both 276 requiring heap allocation in the implementation or splitting the API to 277 allow the caller to allocate appropriate space. Instead, the caller is 278 provided with an on-stack struct containing the extracted sub-structure. 279 280[libpldm-iterator]: 281 https://github.com/openbmc/libpldm/commit/3a2c6589c5660d2066b612bae28ca393a8aa1c2b 282 283- [ ] My `encode_*()` APIs exchange with the caller the size of the destination 284 buffer and the length of the encoded data using an in-out buffer length 285 parameter 286 287- [ ] My new public message codec functions take a `struct` representing the 288 message as a parameter 289 - Function prototypes must _not_ decompose the message to individual 290 parameters. This approach is not ergonomic and is difficult to make 291 type-safe. This is especially true for message decoding functions which must 292 use pointers for out-parameters, where it has often become ambiguous whether 293 the underlying memory represents a single object or an array. 294 295- [ ] Each new `struct` I've defined is used in at least one new function I've 296 added to the public API. 297 298- [ ] My new public `struct` definitions are _not_ marked 299 `__attribute__((packed))` 300 301- [ ] My new public `struct` definitions do _not_ define a flexible array 302 member, unless: 303 - [ ] It's contained in an `#ifndef __cplusplus` macro guard, as flexible 304 arrays are not specified by C++, and 305 306 - [ ] I've implemented an accessor function so the array base pointer can be 307 accessed from C++, and 308 309 - [ ] It is defined as per the C17 specification by omitting the length[^1] 310 - Note: Any array defined with length 1 is _not_ a flexible array, and any 311 access beyond the first element invokes undefined behaviour in both C and 312 C++. 313 314 - [ ] I've annotated the flexible array member with `LIBPLDM_CC_COUNTED_BY()` 315 316[^1]: 317 [C17 draft specification][c17-draft-standard], 6.7.2.1 Structure and union 318 specifiers, paragraph 18. 319 320[c17-draft-standard]: 321 https://web.archive.org/web/20181230041359/http://www.open-std.org/jtc1/sc22/wg14/www/abq/c17_updated_proposed_fdis.pdf 322 323### ABI control 324 325- [ ] My new function symbols are marked with `LIBPLDM_ABI_TESTING` in the 326 implementation 327 328- [ ] I've guarded the test cases of functions marked `LIBPLDM_ABI_TESTING` so 329 that they are not compiled when the corresponding function symbols aren't 330 visible 331 332### Error handling and invariants 333 334- [ ] All my error conditions are handled by returning an error code to the 335 caller. 336 337- [ ] All my invariants are tested using `assert()`. 338 339- [ ] I have not used `assert()` to evaluate any error conditions without also 340 handling the error condition by returning an error code the the caller. 341 - Release builds of the library are configured with `assert()` disabled 342 (`-Db_ndebug=if-release`, which provides `-DNDEBUG` in `CFLAGS`). 343 344- [ ] My new APIs return negative `errno` values on error and not PLDM 345 completion codes. 346 - [ ] The specific error values my function returns and their meaning in the 347 context of the function call are listed in the API documentation. 348 349### Implementation 350 351- [ ] If my work interacts with the PLDM wire format, then I have done so using 352 the `msgbuf` APIs found in `src/msgbuf.h` (and under `src/msgbuf/`) to 353 minimise concerns around spatial memory safety and endian-correctness. 354 355- [ ] I've used `goto` to clean up resources acquired prior to encountering an 356 error condition 357 - Replication of resource cleanup across multiple error paths is error-prone, 358 especially when multiple, dependent resources have been acquired. 359 360- [ ] I've released acquired resources in stack-order 361 - This should be the case regardless of whether we're in the happy path at the 362 end of object lifetime or an error path during construction. 363 364- [ ] I've declared variables in [reverse-christmas-tree (inverted pyramid) 365 order][hisham-make-pyramids] in any block scopes I've added or changed. 366 367[hisham-make-pyramids]: 368 https://web.archive.org/web/20220404224603/https://hisham.hm/2018/06/16/when-listing-repeated-things-make-pyramids/ 369 370### Testing 371 372- [ ] I've implemented test cases with reasonable branch coverage of each new 373 function I've added 374 375### Maintenance 376 377- [ ] If I've added support for a new message type, then my commit message 378 specifies all of: 379 - [ ] The relevant DMTF specification by its DSP number and title 380 - [ ] The relevant version of the specification 381 - [ ] The section of the specification that defines the message type 382 383- [ ] If my work impacts the public API of the library, then I've added an entry 384 to `CHANGELOG.md` describing my work 385 386### OEM/vendor-specific APIs 387 388- [ ] I've documented the wire format for all OEM messages under 389 `docs/oem/${OEM_NAME}/` 390 391- [ ] I've added public OEM API declarations and definitions under 392 `include/libpldm/oem/${OEM_NAME}/`, and installed them to the same 393 relative location. 394 395- [ ] I've implemented the public OEM APIs under `src/oem/${OEM_NAME}/` 396 397- [ ] I've implemented the OEM API tests under `tests/oem/${OEM_NAME}/` 398 399The `${OEM_NAME}` folder must be created with the name of the OEM/vendor in 400lower case. 401 402Finally, the OEM name must be added to the list of choices for the `oem` meson 403option, and the `meson.build` files updated throughout the tree to guard 404integration of the OEM extensions. 405 406## Stabilising an existing API 407 408- [ ] The API of interest is currently marked `LIBPLDM_ABI_TESTING` 409 410- [ ] My commit message links to a publicly visible patch that makes use of the 411 API 412 413- [ ] My commit updates the annotation from `LIBPLDM_ABI_TESTING` to 414 `LIBPLDM_ABI_STABLE` only for the function symbols demonstrated by the 415 patch linked in the commit message. 416 417- [ ] I've removed guards from the function's tests so they are always compiled 418 419- [ ] If I've updated the ABI dump, then I've used the OpenBMC CI container to 420 do so. 421 422## Updating an ABI dump 423 424To update the ABI dump you'll need to build an appropriate OpenBMC CI container 425image of your own. Some hints on how to do this locally can be found [in the 426openbmc/docs repository][openbmc-docs-local-ci]. You can list your locally built 427images with `docker images`. 428 429[openbmc-docs-local-ci]: 430 https://github.com/openbmc/docs/blob/master/testing/local-ci-build.md 431 432Assuming: 433 434```shell 435export OPENBMC_CI_IMAGE=openbmc/ubuntu-unit-test:2024-W21-ce361f95ff4fa669 436``` 437 438the ABI dump can be updated with: 439 440```shell 441docker run \ 442 --cap-add=sys_admin \ 443 --rm=true \ 444 --privileged=true \ 445 -u $USER \ 446 -w $(pwd) \ 447 -v $(pwd):$(pwd) \ 448 -e MAKEFLAGS= \ 449 -t $OPENBMC_CI_IMAGE \ 450 ./scripts/abi-dump-updater 451``` 452 453## Removing an API 454 455- [ ] If the function is marked `LIBPLDM_ABI_TESTING`, then I have removed it 456 457- [ ] If the function is marked `LIBPLDM_ABI_STABLE`, then I have changed the 458 annotation to `LIBPLDM_ABI_DEPRECATED` and left it in-place. 459 - [ ] I have updated the ABI dump, or will mark the change as WIP until it has 460 been. 461 462- [ ] If the function is marked `LIBPLDM_ABI_DEPRECATED`, then I have removed it 463 only after satisfying myself that each of the following is true: 464 - [ ] There are no known users of the function left in the community 465 - [ ] There has been at least one tagged release of `libpldm` subsequent to 466 the API being marked deprecated 467 468## Renaming an API 469 470A change to an API is a pure rename only if there are no additional behavioural 471changes. Renaming an API with no other behavioural changes is really two 472actions: 473 4741. Introducing the new API name 4752. Deprecating the old API name 476 477- [ ] Only the name of the function has changed. None of its behaviour has 478 changed. 479 480- [ ] Both the new and the old functions are declared in the public headers 481 482- [ ] I've aliased the old function name to the new function name via the 483 `libpldm_deprecated_aliases` list in `meson.build` 484 485- [ ] I've added a [semantic patch][coccinelle] to migrate users from the old 486 name to the new name 487 488[coccinelle]: https://coccinelle.gitlabpages.inria.fr/website/ 489 490- [ ] I've updated the ABI dump to capture the rename, or will mark the change 491 as WIP until it has been. 492 493## Fixing Implementation Defects 494 495- [ ] My change fixing the bug includes a [Fixes tag][linux-kernel-fixes-tag] 496 identifying the change introducing the defect. 497 498[linux-kernel-fixes-tag]: 499 https://docs.kernel.org/process/submitting-patches.html#describe-your-changes 500 501- [ ] My change fixing the bug includes test cases demonstrating that the bug is 502 fixed. 503