1# Checklist for making changes to `libpldm` 2 3## Philosophy and influences 4 5- [Good Practices in Library Design, Implementation, and Maintenance - Ulrich 6 Drepper][goodpractice] 7 8[goodpractice]: https://www.akkadia.org/drepper/goodpractice.pdf 9 10- [How Do I Make This Hard to Misuse? - Rusty Russell][rusty-api-scale-good] 11 12[rusty-api-scale-good]: https://ozlabs.org/~rusty/index.cgi/tech/2008-03-30.html 13 14- [What If I Don't Actually Like My Users? - Rusty Russell][rusty-api-scale-bad] 15 16[rusty-api-scale-bad]: https://ozlabs.org/~rusty/index.cgi/tech/2008-04-01.html 17 18- [Red flags that indicate questionable quality - Lennart 19 Poettering][poettering-library-red-flags] 20 21[poettering-library-red-flags]: 22 https://mastodon.social/@pid_eins/112517953375791453 23 24- [Not sure if this is a gcc bug or some weird corner of UB or what... - Andrew 25 Zonenberg][azonenberg-packed-struct] 26 27[azonenberg-packed-struct]: https://ioc.exchange/@azonenberg/112535511250395148 28 29- [The Good, the Bad, and the Weird - Trail of Bits 30 Blog][trail-of-bits-weird-machines] 31 32[trail-of-bits-weird-machines]: 33 https://blog.trailofbits.com/2018/10/26/the-good-the-bad-and-the-weird/ 34 35- [Logic for Programmers - Hillel Wayne][logic-for-programmers] 36 37[logic-for-programmers]: https://leanpub.com/logic 38 39- [Parse, don’t validate - Alexis King][alexis-king-parse-dont-validate] 40 41[alexis-king-parse-dont-validate]: 42 https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/ 43 44- [The byte order fallacy - command center][command-center-byte-order-fallacy] 45 46[command-center-byte-order-fallacy]: 47 https://commandcenter.blogspot.com/2012/04/byte-order-fallacy.html 48 49- [what’s the most annoying thing about writing C - Jessica 50 Paquette][barrelshifter-annoying-c] 51 52[barrelshifter-annoying-c]: 53 https://mastodon.social/@barrelshifter/114016810912730423 54 55- [The Power of Ten: Rules for Developing Safety Critical Code - NASA/JPL 56 Laboratory for Reliable Software][nasa-reliable-code] 57 58[nasa-reliable-code]: 59 https://www.cs.otago.ac.nz/cosc345/resources/nasa-10-rules.pdf 60 61- [C Isn't A Programming Language Anymore - Aria Desires][aria-c-is-a-protocol] 62 63[aria-c-is-a-protocol]: https://faultlore.com/blah/c-isnt-a-language/ 64 65## References 66 67- [The C programming language - C Standards Committee][c-language] 68 69[c-language]: https://www.c-language.org/ 70 71- [SEI CERT C Coding Standard][sei-cert-c-coding-standard] 72 73[sei-cert-c-coding-standard]: 74 https://wiki.sei.cmu.edu/confluence/display/c/SEI+CERT+C+Coding+Standard 75 76- [Common Weakness Enumeration (CWE) - Software 77 Development][common-weakness-enumeration-sw] 78 79[common-weakness-enumeration-sw]: 80 https://cwe.mitre.org/data/definitions/699.html 81 82## Definitions 83 84- **Error condition**: An invalid state reached at runtime, caused either by 85 resource exhaustion, or incorrect use of the library's public APIs and data 86 types. 87 88- **Invariant**: A condition in the library's implementation that must never 89 evaluate false. 90 91- **Public API**: Any definitions and declarations under `include/libpldm`. 92 93- **Wire format**: Any message structure defined in the DMTF PLDM protocol 94 specifications. 95 96## Elaborations 97 98- Resource exhaustion is always an error condition and never an invariant 99 violation. 100 101- An invariant violation is always a programming failure of the library's 102 implementation, and never the result of incorrect use of the library's public 103 APIs (see error condition). 104 105- Corollaries of the above two points: 106 - Incorrect use of public API functions is always an error condition, and is 107 dealt with by returning an error code. 108 109 - Incorrect use of static functions in the library's implementation is an 110 invariant violation which may be established using `assert()`. 111 112- `assert()` is the recommended way to demonstrate invariants are upheld. 113 114## Library background 115 116### The ABI lifecycle 117 118```mermaid 119--- 120title: libpldm symbol lifecycle 121--- 122stateDiagram-v2 123 direction LR 124 [*] --> Testing: Add 125 Testing --> Testing: Change 126 Testing --> [*]: Remove 127 Testing --> Stable: Stabilise 128 Stable --> Deprecated: Deprecate 129 Deprecated --> [*]: Remove 130``` 131 132The ABI of the library produced by the build is controlled using the `abi` meson 133option. The following use cases determine how the `abi` option should be 134specified: 135 136| Use Case | Meson Configuration | 137| ----------- | --------------------------------- | 138| Production | `-Dabi=deprecated,stable` | 139| Maintenance | `-Dabi=stable` | 140| Development | `-Dabi=deprecated,stable,testing` | 141 142### Maintenance 143 144Applications and libraries that depend on `libpldm` can identify how to migrate 145off of deprecated APIs by constraining the library ABI to the stable category. 146This will force the compiler identify any call-sites that try to link against 147deprecated symbols. 148 149### Development 150 151Applications and libraries often require functionality that doesn't yet exist in 152`libpldm`. The work is thus in two parts: 153 1541. Add the required APIs to `libpldm` 1552. Use the new APIs from `libpldm` in the dependent application or library 156 157Adding APIs to a library is a difficult task. Generally, once an API is exposed 158in the library's ABI, any changes to the API risk breaking applications already 159making use of it. To make sure we have more than one shot at getting an API 160right, all new APIs must first be exposed in the testing category. Concretely: 161 162Patches adding new APIs MUST mark them as testing and MUST NOT mark them as 163stable. 164 165### Marking functions as testing, stable or deprecated 166 167Three macros are provided through `config.h` (automatically included for all 168translation units) to mark functions as testing, stable or deprecated: 169 1701. `LIBPLDM_ABI_TESTING` 1712. `LIBPLDM_ABI_STABLE` 1723. `LIBPLDM_ABI_DEPRECATED` 173 174These annotations go immediately before your function signature: 175 176```c 177LIBPLDM_ABI_TESTING 178pldm_requester_rc_t pldm_transport_send_msg(struct pldm_transport *transport, 179 pldm_tid_t tid, 180 const void *pldm_req_msg, 181 size_t req_msg_len) 182{ 183 ... 184} 185``` 186 187### What does it mean to mark a function as stable? 188 189Marking a function as stable makes the following promise to users of the 190library: 191 192> We will not remove or change the symbol name, argument count, argument types, 193> return type, or interpretation of relevant values for the function before 194> first marking it as `LIBPLDM_ABI_DEPRECATED` and then subsequently creating a 195> tagged release 196 197Marking a function as stable does _not_ promise that it is free of 198implementation bugs. It is just a promise that the prototype won't change 199without notice. 200 201Given this, it is always okay to implement functions marked stable in terms of 202functions marked testing inside of libpldm. If we remove or change the prototype 203of a function marked testing the only impact is that we need to fix up any call 204sites of that function in the same patch. 205 206### Requirements for stabilising a function 207 208To move a function from the testing category to the stable category, it's 209required that patches demonstrating use of the function in a dependent 210application or library be linked in the commit message of the stabilisation 211change. We require this to demonstrate that the implementer has considered its 212use in context _before_ preventing us from making changes to the API. 213 214### Building a dependent application or library against a testing ABI 215 216Meson is broadly used in the OpenBMC ecosystem, the historical home of 217`libpldm`. Meson's subprojects are a relatively painless way of managing 218dependencies for the purpose of developing complex applications and libraries. 219Use of `libpldm` as a subproject is both supported and encouraged. 220 221`libpldm`'s ABI can be controlled from a parent project through meson's 222subproject configuration syntax: 223 224```shell 225meson setup ... -Dlibpldm:abi=deprecated,stable,testing ... 226``` 227 228## Adding a new API 229 230### Naming macros, functions and types 231 232- [ ] All publicly exposed macros, types and functions relating to the PLDM 233 specifications must be prefixed with either `pldm_` or `PLDM_` as 234 appropriate 235 - The only (temporary) exception are the `encode_*()` and `decode_*()` 236 function symbols 237 238- [ ] All publicly exposed macros, types and functions relating to the library 239 implementation must be prefixed with `libpldm_` or `LIBPLDM_` 240 241- [ ] All `pldm_`-prefixed symbols must also name the related specification. For 242 example, for DSP0248 Platform Monitoring and Control, the symbol prefix 243 should be `pldm_platform_`. 244 245- [ ] All enum members must be prefixed with the type name 246 247### API design 248 249- [ ] If I've added support for a new PLDM message type, then I've defined both 250 the encoder and decoder for that message. 251 - This applies for both request _and_ response message types. 252 253- [ ] I've designed my APIs so their implementation does not require heap 254 allocation. 255 - Prefer [defining iterators][libpldm-iterator] over the message buffer to 256 extract sub-structures from variable-length messages. Iterators avoid both 257 requiring heap allocation in the implementation or splitting the API to 258 allow the caller to allocate appropriate space. Instead, the caller is 259 provided with an on-stack struct containing the extracted sub-structure. 260 261[libpldm-iterator]: 262 https://github.com/openbmc/libpldm/commit/3a2c6589c5660d2066b612bae28ca393a8aa1c2b 263 264- [ ] My new public message codec functions take a `struct` representing the 265 message as a parameter 266 - Function prototypes must _not_ decompose the message to individual 267 parameters. This approach is not ergonomic and is difficult to make 268 type-safe. This is especially true for message decoding functions which must 269 use pointers for out-parameters, where it has often become ambiguous whether 270 the underlying memory represents a single object or an array. 271 272- [ ] Each new `struct` I've defined is used in at least one new function I've 273 added to the public API. 274 275- [ ] My new public `struct` definitions are _not_ marked 276 `__attribute__((packed))` 277 278- [ ] My new public `struct` definitions do _not_ define a flexible array 279 member, unless: 280 - [ ] It's contained in an `#ifndef __cplusplus` macro guard, as flexible 281 arrays are not specified by C++, and 282 283 - [ ] I've implemented an accessor function so the array base pointer can be 284 accessed from C++, and 285 286 - [ ] It is defined as per the C17 specification by omitting the length[^1] 287 - Note: Any array defined with length 1 is _not_ a flexible array, and any 288 access beyond the first element invokes undefined behaviour in both C and 289 C++. 290 291 - [ ] I've annotated the flexible array member with `LIBPLDM_CC_COUNTED_BY()` 292 293[^1]: 294 [C17 draft specification][c17-draft-standard], 6.7.2.1 Structure and union 295 specifiers, paragraph 18. 296 297[c17-draft-standard]: 298 https://web.archive.org/web/20181230041359/http://www.open-std.org/jtc1/sc22/wg14/www/abq/c17_updated_proposed_fdis.pdf 299 300### ABI control 301 302- [ ] My new function symbols are marked with `LIBPLDM_ABI_TESTING` in the 303 implementation 304 305- [ ] I've guarded the test cases of functions marked `LIBPLDM_ABI_TESTING` so 306 that they are not compiled when the corresponding function symbols aren't 307 visible 308 309### Error handling and invariants 310 311- [ ] All my error conditions are handled by returning an error code to the 312 caller. 313 314- [ ] All my invariants are tested using `assert()`. 315 316- [ ] I have not used `assert()` to evaluate any error conditions without also 317 handling the error condition by returning an error code the the caller. 318 - Release builds of the library are configured with `assert()` disabled 319 (`-Db_ndebug=if-release`, which provides `-DNDEBUG` in `CFLAGS`). 320 321- [ ] My new APIs return negative `errno` values on error and not PLDM 322 completion codes. 323 - [ ] The specific error values my function returns and their meaning in the 324 context of the function call are listed in the API documentation. 325 326### Implementation 327 328- [ ] If my work interacts with the PLDM wire format, then I have done so using 329 the `msgbuf` APIs found in `src/msgbuf.h` (and under `src/msgbuf/`) to 330 minimise concerns around spatial memory safety and endian-correctness. 331 332- [ ] I've used `goto` to clean up resources acquired prior to encountering an 333 error condition 334 - Replication of resource cleanup across multiple error paths is error-prone, 335 especially when multiple, dependent resources have been acquired. 336 337- [ ] I've released acquired resources in stack-order 338 - This should be the case regardless of whether we're in the happy path at the 339 end of object lifetime or an error path during construction. 340 341- [ ] I've declared variables in [reverse-christmas-tree (inverted pyramid) 342 order][hisham-make-pyramids] in any block scopes I've added or changed. 343 344[hisham-make-pyramids]: 345 https://web.archive.org/web/20220404224603/https://hisham.hm/2018/06/16/when-listing-repeated-things-make-pyramids/ 346 347### Testing 348 349- [ ] I've implemented test cases with reasonable branch coverage of each new 350 function I've added 351 352### Maintenance 353 354- [ ] If I've added support for a new message type, then my commit message 355 specifies all of: 356 - [ ] The relevant DMTF specification by its DSP number and title 357 - [ ] The relevant version of the specification 358 - [ ] The section of the specification that defines the message type 359 360- [ ] If my work impacts the public API of the library, then I've added an entry 361 to `CHANGELOG.md` describing my work 362 363### OEM/vendor-specific APIs 364 365- [ ] I've documented the wire format for all OEM messages under 366 `docs/oem/${OEM_NAME}/` 367 368- [ ] I've added public OEM API declarations and definitions under 369 `include/libpldm/oem/${OEM_NAME}/`, and installed them to the same 370 relative location. 371 372- [ ] I've implemented the public OEM APIs under `src/oem/${OEM_NAME}/` 373 374- [ ] I've implemented the OEM API tests under `tests/oem/${OEM_NAME}/` 375 376The `${OEM_NAME}` folder must be created with the name of the OEM/vendor in 377lower case. 378 379Finally, the OEM name must be added to the list of choices for the `oem` meson 380option, and the `meson.build` files updated throughout the tree to guard 381integration of the OEM extensions. 382 383## Stabilising an existing API 384 385- [ ] The API of interest is currently marked `LIBPLDM_ABI_TESTING` 386 387- [ ] My commit message links to a publicly visible patch that makes use of the 388 API 389 390- [ ] My commit updates the annotation from `LIBPLDM_ABI_TESTING` to 391 `LIBPLDM_ABI_STABLE` only for the function symbols demonstrated by the 392 patch linked in the commit message. 393 394- [ ] I've removed guards from the function's tests so they are always compiled 395 396- [ ] If I've updated the ABI dump, then I've used the OpenBMC CI container to 397 do so. 398 399## Updating an ABI dump 400 401To update the ABI dump you'll need to build an appropriate OpenBMC CI container 402image of your own. Some hints on how to do this locally can be found [in the 403openbmc/docs repository][openbmc-docs-local-ci]. You can list your locally built 404images with `docker images`. 405 406[openbmc-docs-local-ci]: 407 https://github.com/openbmc/docs/blob/master/testing/local-ci-build.md 408 409Assuming: 410 411```shell 412export OPENBMC_CI_IMAGE=openbmc/ubuntu-unit-test:2024-W21-ce361f95ff4fa669 413``` 414 415the ABI dump can be updated with: 416 417```shell 418docker run \ 419 --cap-add=sys_admin \ 420 --rm=true \ 421 --privileged=true \ 422 -u $USER \ 423 -w $(pwd) \ 424 -v $(pwd):$(pwd) \ 425 -e MAKEFLAGS= \ 426 -t $OPENBMC_CI_IMAGE \ 427 ./scripts/abi-dump-updater 428``` 429 430## Removing an API 431 432- [ ] If the function is marked `LIBPLDM_ABI_TESTING`, then I have removed it 433 434- [ ] If the function is marked `LIBPLDM_ABI_STABLE`, then I have changed the 435 annotation to `LIBPLDM_ABI_DEPRECATED` and left it in-place. 436 - [ ] I have updated the ABI dump, or will mark the change as WIP until it has 437 been. 438 439- [ ] If the function is marked `LIBPLDM_ABI_DEPRECATED`, then I have removed it 440 only after satisfying myself that each of the following is true: 441 - [ ] There are no known users of the function left in the community 442 - [ ] There has been at least one tagged release of `libpldm` subsequent to 443 the API being marked deprecated 444 445## Renaming an API 446 447A change to an API is a pure rename only if there are no additional behavioural 448changes. Renaming an API with no other behavioural changes is really two 449actions: 450 4511. Introducing the new API name 4522. Deprecating the old API name 453 454- [ ] Only the name of the function has changed. None of its behaviour has 455 changed. 456 457- [ ] Both the new and the old functions are declared in the public headers 458 459- [ ] I've aliased the old function name to the new function name via the 460 `libpldm_deprecated_aliases` list in `meson.build` 461 462- [ ] I've added a [semantic patch][coccinelle] to migrate users from the old 463 name to the new name 464 465[coccinelle]: https://coccinelle.gitlabpages.inria.fr/website/ 466 467- [ ] I've updated the ABI dump to capture the rename, or will mark the change 468 as WIP until it has been. 469 470## Fixing Implementation Defects 471 472- [ ] My change fixing the bug includes a [Fixes tag][linux-kernel-fixes-tag] 473 identifying the change introducing the defect. 474 475[linux-kernel-fixes-tag]: 476 https://docs.kernel.org/process/submitting-patches.html#describe-your-changes 477 478- [ ] My change fixing the bug includes test cases demonstrating that the bug is 479 fixed. 480