# In-Band Update of BMC Firmware (and others) using OEM IPMI Blob Transport

Author: Patrick Venture <venture!>

Created: 2018-10-18

## Problem Description

The BMC needs a mechanism for receiving a new firmware image from the host
through a variety of mechanisms. This can best be served with one protocol into
which multiple approaches can be routed.

## Background and References

BMC hardware provides at a minimum some interface for sending and receiving IPMI
messages. This hardware may also provide regions that can be memory mapped for
higher speed communication between the BMC and the host. Certain infrastructures
do not provide network access to the BMC, therefore it is required to provide an
update mechanism that can be done in-band between the host and the BMC.

In-band here refers to a communications channel that is directly connected
between the host and BMC.

1. Serial
1. IPMI over LPC
1. IPMI over i2c
1. LPC Memory-Mapped Region
1. P2A bridge

## Primer

Please read the IPMI BLOB protocol design as a primer
[here](https://github.com/openbmc/phosphor-ipmi-blobs/blob/master/README.md).

## Requirements

The following statements are reflective of the initial requirements.

* Any update mechanism must provide support for UBI tarballs and legacy
  (static layout) flash images. Leveraging the BLOB protocol allows a system
  to provide support for any image type simply by implementing a mechanism for
  handling it.

* Any update mechanism must allow for triggering an image verification step
  before the image is used.

* Any update mechanism must allow implementing the data staging via different
  in-band mechanisms.

* Any update mechanism must provide a handshake or equivalent protocol for
  coordinating the data transfer. For instance, whether the BMC should enable
  the P2A bridge and what region to use or whether to turn on the LPC memory
  map bridge.

* Any update mechanism must attempt to maintain security, insofar as not
  leaving a memory region open by default. For example, before starting the
  verification step, the staged firmware image must no longer be accessible
  from the host.

## Proposed Design

OpenBMC supports a BLOB protocol that provides primitives. These primitives
allow a variety of different "handlers" to exist that implement those primitives
for specific "blobs." A blob in this context is a file path that is strictly
unique.

Sending the firmware image over the BLOB protocol will be done via routing the
[phosphor-ipmi-flash design](https://github.com/openbmc/phosphor-ipmi-flash/blob/master/README.md)
through a BLOB handler. This is meant to supplant `phosphor-ipmi-flash`'s
current approach to centralize on one flexible handler.

### Sequencing Control

To enforce sequencing control, the design requires that only one blob be open at
a time. If the verification blob is open, the other blobs cannot be opened, and
likewise if a client has a data blob open, the verification blob cannot be
opened.

### Defining Blobs

The BLOB protocol allows a handler to specify a list of blob ids. This list will
be leveraged to specify whether the platform supports either the legacy (static
layout) or the UBI mechanism, or both. The flags provided to the open command
identify the mechanism selected by the client-side. The stat command will return
the list of supported mechanisms for the blob.

The blob ids for the mechanisms will be as follows:

Flash Blob Id    | Type
---------------- | ---------------
`/flash/image`   | Static Layout
`/flash/tarball` | UBI
`/flash/bios`    | Host BIOS image

The flash handler will determine what commands it should expect to receive and
responses it will return given the blob opened, based on the flags provided to
open.
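
The single-open rule and the blob id table above can be sketched roughly as
follows. This is only an illustration under stated assumptions, not the
`phosphor-ipmi-flash` implementation: `kFlashBlobs`, `FlashHandlerSketch`, and
`exampleSingleOpen` are hypothetical names, transport flags are ignored, and
only the three flash blob ids from the table are modeled.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical table: the flash blob ids listed above and their image type.
static const std::map<std::string, std::string> kFlashBlobs = {
    {"/flash/image", "Static Layout"},
    {"/flash/tarball", "UBI"},
    {"/flash/bios", "Host BIOS image"},
};

// Hypothetical handler state that allows at most one open blob at a time.
class FlashHandlerSketch
{
  public:
    // Returns true if the blob was opened; false if the id is unknown or
    // another blob is already open.
    bool open(const std::string& blobId)
    {
        if (!current_.empty() || kFlashBlobs.count(blobId) == 0)
        {
            return false;
        }
        current_ = blobId;
        return true;
    }

    // Closes whichever blob is open; returns false if none is open.
    bool close()
    {
        if (current_.empty())
        {
            return false;
        }
        current_.clear();
        return true;
    }

  private:
    std::string current_; // empty means no blob is open
};

// Usage: a second open is refused until the first blob is closed.
void exampleSingleOpen()
{
    FlashHandlerSketch handler;
    assert(handler.open("/flash/image"));
    assert(!handler.open("/flash/bios")); // refused: image is still open
    assert(handler.close());
    assert(handler.open("/flash/bios")); // allowed after the close
}
```

A real handler would additionally validate the open flags and track the other
blob ids the design defines; the sketch only shows the mutual exclusion.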

The flash handler will only allow one of the above blobs to be opened for a
sequence of commands, such that you cannot open `/flash/image` and then open
`/flash/bios` without completing (or later aborting) the first update process
started.

The following blob id is defined for storing the hash for the image:

Hash Blob Id  | Mechanism
------------- | -------------------------------
`/flash/hash` | Whichever flash blob was opened

The flash handler will only allow one open file at a time, such that if the host
attempts to send a firmware image down over IPMI BlockTransfer, it won't allow
the host to start a PCI send until the BlockTransfer file is closed.

There is only one hash "file" mechanism. The exact hash used will only be
important to your verification service. The value provided will be written to a
known place.

When a transfer is active, the handler creates the blob ids
`/flash/active/image` and `/flash/active/hash`.

#### Verification Blob

The following blob id is defined once the image or hash upload has started. Its
purpose is to trigger and monitor the firmware verification process. Therefore,
the BmcBlobOpen command will fail until both the hash and image file are closed.
The ideal command sequence is described further below.

Trigger Blob    | Note
--------------- | ------------------------
`/flash/verify` | Verify Trigger Mechanism

When the verification file is closed, if verification was completed
successfully, it'll add an update blob id, defined below.

The verification process used is not defined by this design.

#### Update Blob

The update blob id is available once `/flash/verify` is closed with a valid image
or tarball. The update blob needs to be opened and commit() called on that blob
id to trigger the update mechanism.

The update process can be checked periodically by calling stat() on the update
blob id.
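
The rules above for when the active, verify, and update blob ids become visible
can be summarized in a small sketch. It is a hedged illustration only:
`UpdateProgressSketch`, its fields, `dynamicBlobIds`, and `exampleVisibility`
are hypothetical names, and the sketch assumes each active id appears once its
own blob has been opened, which the text does not state explicitly.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical snapshot of how far an update has progressed.
struct UpdateProgressSketch
{
    bool imageOpened;    // an image (or tarball) upload has started
    bool hashOpened;     // a hash upload has started
    bool verifyClosedOk; // /flash/verify was closed after a successful run
};

// Returns the blob ids that appear dynamically, in addition to the fixed
// flash and hash blob ids the platform already lists.
std::vector<std::string> dynamicBlobIds(const UpdateProgressSketch& p)
{
    std::vector<std::string> ids;
    if (p.imageOpened)
    {
        ids.push_back("/flash/active/image");
    }
    if (p.hashOpened)
    {
        ids.push_back("/flash/active/hash");
    }
    // The verify blob is defined once either upload has started.
    if (p.imageOpened || p.hashOpened)
    {
        ids.push_back("/flash/verify");
    }
    // The update blob is added only after a successful verification.
    if (p.verifyClosedOk)
    {
        ids.push_back("/flash/update");
    }
    return ids;
}

// Usage: nothing dynamic is listed before an upload begins.
void exampleVisibility()
{
    UpdateProgressSketch idle = {false, false, false};
    assert(dynamicBlobIds(idle).empty());
}
```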

Update Blob     | Note
--------------- | ------------------------
`/flash/update` | Trigger Update Mechanism

The update process used is not defined by this design.

#### Cleanup Blob

The cleanup blob id is always present. The goal of this blob is to handle
deletion of update artifacts on failure or success. It can be implemented to
do any manner of cleanup required, but for systems under memory pressure, it is
a convenient cleanup mechanism.

The cleanup blob has no state or knowledge and is meant to provide a simple
system cleanup mechanism. This could also be accomplished by warm rebooting
the BMC. The cleanup blob will delete a list of files. The cleanup blob has
no state recognition for the update process, and therefore can interfere with
an update process. The host tool will only use it on failure cases. Any other
tool developed should respect this and not employ it unless the goal is to
clean up artifacts.

To trigger the cleanup, simply open the blob, commit, and close. It has no
knowledge of the update process. It is kept this simple because it is designed
as a convenience mechanism rather than a required mechanism.

Cleanup Blob     | Note
---------------- | -------------------------
`/flash/cleanup` | Trigger Cleanup Mechanism

### Caching Images

Similarly to the OEM IPMI Flash protocol, the flash image will be staged in a
compile-time configured location.

Other mechanisms can readily be added by adding more blob ids or flags to the
handler.

### Commands

The update mechanism will expect a specific sequence of commands depending on
the transport mechanism selected. Some mechanisms require a handshake.

#### BlockTransfer Sequence

1. Open (for image or tarball)
1. Write
1. Close
1. Open (`/flash/hash`)
1. Write
1. Close
1. Open (`/flash/verify`)
1. Commit
1. SessionStat (to read back verification status)
1. Close
1. Open (`/flash/update`)
1. Commit
1. SessionStat (to read back update status)
1. Close

#### P2A Sequence

1. Open (for image or tarball)
1. SessionStat (P2A Region for P2A mapping)
1. Write
1. Close
1. Open (`/flash/hash`)
1. SessionStat (P2A Region)
1. Write
1. Close
1. Open (`/flash/verify`)
1. Commit
1. SessionStat (to read back verification status)
1. Close
1. Open (`/flash/update`)
1. Commit
1. SessionStat (to read back update status)
1. Close

#### LPC Sequence

1. Open (for image or tarball)
1. WriteMeta (specify region information from host for LPC)
1. SessionStat (verify the contents from the above)
1. Write
1. Close
1. Open (`/flash/hash`)
1. WriteMeta (LPC Region)
1. SessionStat (verify LPC config)
1. Write
1. Close
1. Open (`/flash/verify`)
1. Commit
1. SessionStat (to read back verification status)
1. Close
1. Open (`/flash/update`)
1. Commit
1. SessionStat (to read back update status)
1. Close

### Stale Images

If an image update process is started but goes stale, there are multiple
mechanisms in place to ensure cleanup. If a session is left open past the blob
timeout period, it will be closed. Because expiration is not the same action as
closing, the cache will be flushed and any staged pieces deleted.

The image itself, in legacy (static layout) mode, will be placed and named in
such a way that it will disappear if the BMC reboots. In the UBI case, the file
will be stored in `/tmp` and deleted accordingly.

At any point during the upload process, one can abort by closing the open blobs
and deleting them by name.

### Blob Primitives

The update mechanism will implement the Blob primitives as follows.

#### BmcBlobOpen

The blob open primitive allows supplying blob specific flags.
These flags are used for specifying the transport mechanism. To obtain the list
of supported mechanisms on a platform, see the `BmcBlobStat` command below.

```
enum OpenFlags
{
    read = (1 << 0),
    write = (1 << 1),
};

/* These bits start in the blob specific range of the flags. */
enum FirmwareUpdateFlags
{
    bt = (1 << 8),   /* Expect to send contents over IPMI BlockTransfer. */
    p2a = (1 << 9),  /* Expect to send contents over P2A bridge. */
    lpc = (1 << 10), /* Expect to send contents over LPC bridge. */
};
```

An open request must specify that it is opening for writing and one transport
mechanism, otherwise it is rejected. If the request is also set for reading,
this is not rejected but currently provides no additional value.

Once opened, a new file will appear in the blob_id list (for both the image and
hash) indicating they are in progress. The names will be `/flash/active/image`
and `/flash/active/hash`, which have no meaning beyond representing the current
update in progress. Closing the file does not delete the staged images. Only
delete will.

***Note*** The active image blob_ids cannot be opened. This can be reconsidered
later.

#### BmcBlobRead

This will initially not perform any function and will return success with 0
bytes.

#### BmcBlobWrite

The write command's contents will depend on the transport mechanism. This
command must not return until it has copied the data out of the mapped region
into either a staging buffer or written down to a staging file. How the command
reads from the mapped region is beyond the scope of this design.

##### If BT

The data section of the payload is only data.

##### If P2A

The data section of the payload is the following structure:

```
struct ExtChunkHdr
{
    uint32_t length; /* Length of the data queued (little endian). */
};
```

##### If LPC

The data section of the payload is the following structure:

```
struct ExtChunkHdr
{
    uint32_t length; /* Length of the data queued (little endian). */
};
```

#### BmcBlobCommit

If this command is called on the session of the firmware image itself, nothing
will happen at present. It will return a no-op success.

If this command is called on the session for the hash image, nothing will happen
at present. It will return a no-op success.

If this command is called on the session for the verify blob id, it'll trigger a
systemd service `verify_image.service` to attempt to verify the image. Before
doing this, if the transport mechanism is not IPMI BT, it'll shut down the
mechanism used for transport, preventing the host from updating anything.

When this is started, only the BmcBlobSessionStat command will respond. Details
on that response are below under BmcBlobSessionStat.

#### BmcBlobClose

Close must be called on the firmware image and the hash file before opening the
verify blob.

If the `verify_image.service` returned success, closing the verify file will
have a specific behavior depending on the update. If it's UBI, it'll perform the
install. If it's legacy (static layout), it'll do nothing. The
`verify_image.service` in the legacy case is responsible for placing the file in
the correct staging position. A BMC warm reset command will initiate the
firmware update process.

If the image verification fails, it will automatically delete any files
associated with the update.

***Note:*** During development testing, a developer will want to upload files
that are not signed. Therefore, an additional bit will be added to the flags to
change this behavior.

#### BmcBlobDelete

Aborts any update that's in progress:

1. Stops the `verify_image.service` if started.
1. Deletes any staged files.

In the event the update is already in progress, such as when the tarball
mechanism is used and is in the middle of updating the files, it cannot be
aborted.

#### BmcBlobStat

Blob stat on a blob_id (not SessionStat) will return the capabilities of the
blob_id handler.

```
struct BmcBlobStatRx {
    uint16_t crc16;
    /* This will have the bits set from the FirmwareUpdateFlags enum. */
    uint16_t blob_state;
    uint32_t size; /* 0 - it's set to zero when there's no session */
    uint8_t metadata_len; /* 0 */
};
```

#### BmcBlobSessionStat

If called pre-commit, it'll return the following information:

```
struct BmcBlobStatRx {
    uint16_t crc16;
    uint16_t blob_state; /* OpenFlags::write | (one of the interfaces) */
    uint32_t size; /* Size in bytes so far written */
    uint8_t metadata_len; /* 0. */
};
```

If it's called and the data transport mechanism is P2A, it'll return a 32-bit
address for use to configure the P2A region as part of the metadata portion of
the `BmcBlobStatRx`.

```
struct BmcBlobStatRx {
    uint16_t crc16;
    uint16_t blob_state; /* OpenFlags::write | (one of the interfaces) */
    uint32_t size; /* Size in bytes so far written */
    uint8_t metadata_len = sizeof(struct P2ARegion);
    struct P2ARegion {
        uint32_t address;
    };
};
```

If called post-commit on the verify file session, it'll return:

```
struct BmcBlobStatRx {
    uint16_t crc16;
    uint16_t blob_state; /* OpenFlags::write | (one of the interfaces) */
    uint32_t size; /* Size in bytes so far written */
    uint8_t metadata_len; /* 1. */
    uint8_t verify_response; /* one byte from the below enum */
};

enum VerifyCheckResponses
{
    VerifyRunning = 0x00,
    VerifySuccess = 0x01,
    VerifyFailed = 0x02,
    VerifyOther = 0x03,
};
```

If called post-commit on the update file session, it'll return:

```
struct BmcBlobStatRx {
    uint16_t crc16;
    uint16_t blob_state; /* OpenFlags::write | (one of the interfaces) */
    uint32_t size; /* Size in bytes so far written */
    uint8_t metadata_len; /* 1. */
    uint8_t update_response; /* one byte from the below enum */
};

enum UpdateStatus
{
    UpdateRunning = 0x00,
    UpdateSuccessful = 0x01,
    UpdateFailed = 0x02,
    UpdateStatusUnknown = 0x03
};
```

The `UpdateStatus` and `VerifyCheckResponses` are currently identical, but this
may change over time.

#### BmcBlobWriteMeta

The write metadata command is meant to allow the host to provide specific
configuration data to the BMC for the in-band update. Currently that is only
aimed at LPC which needs to be told the memory address so it can configure the
window.

The write meta command's blob will be this structure:

```
struct LpcRegion
{
    uint32_t address; /* Host LPC address where the chunk is to be mapped. */
    uint32_t length; /* Size of the chunk to be mapped. */
};
```

## Alternatives Considered

There is currently another implementation in use by Google that leverages the
same mechanisms; however, it's not as flexible because every command is a custom
piece. Mapping it onto blob primitives allows for easier future modification
while maintaining backwards compatibility (without simply adding a separate OEM
library to handle a new process, etc).

## Impacts

This impacts security because it can leverage the memory mapped windows.
No performance impact is expected, since the presence of the blob handler only
adds a couple of extra entries to the blob enumerate command's response.

## Testing

Where possible (nearly everywhere), mockable interfaces will be used such that
the entire process has individual unit-tests that verify flags are checked, as
well as states and sequences.

### Scenarios

#### Sending an image with a bad hash

A required functional test is one whereby an image is sent down to the BMC,
however the signature is invalid for that image. The expected result is that the
verification step will return failure and the files will be deleted from the BMC
without user intervention.

#### Sending an image with a good hash

A required functional test is one whereby an image is sent down to the BMC with
a valid signature. The expected result is that the verification step will return
success.

## Configuration

See the configuration section of
[Secure Flash Update Mechanism](https://github.com/openbmc/phosphor-ipmi-flash/blob/master/README.md).