# In-Band Update of BMC Firmware (and others) using OEM IPMI Blob Transport

Author: Patrick Venture <venture!>

Created: 2018-10-18

## Problem Description

The BMC needs a mechanism for receiving a new firmware image from the host
through a variety of mechanisms. This can best be served with one protocol into
which multiple approaches can be routed.

## Background and References

BMC hardware provides at a minimum some interface for sending and receiving IPMI
messages. This hardware may also provide regions that can be memory mapped for
higher speed communication between the BMC and the host. Certain infrastructures
do not provide network access to the BMC, so an update mechanism that works
in-band between the host and the BMC is required.

In-band here refers to a communications channel that is directly connected
between the host and BMC, for example:

1.  Serial
1.  IPMI over LPC
1.  IPMI over i2c
1.  LPC Memory-Mapped Region
1.  P2A bridge

## Primer

Please read the IPMI BLOB protocol design as a primer
[here](https://github.com/openbmc/phosphor-ipmi-blobs/blob/master/README.md).

## Requirements

The following statements are reflective of the initial requirements.

- Any update mechanism must provide support for UBI tarballs and legacy (static
  layout) flash images. Leveraging the BLOB protocol allows a system to provide
  support for any image type simply by implementing a mechanism for handling it.

- Any update mechanism must allow for triggering an image verification step
  before the image is used.

- Any update mechanism must allow implementing the data staging via different
  in-band mechanisms.

- Any update mechanism must provide a handshake or equivalent protocol for
  coordinating the data transfer. For instance, whether the BMC should enable
  the P2A bridge and what region to use or whether to turn on the LPC memory map
  bridge.

- Any update mechanism must attempt to maintain security, insomuch as not
  leaving a memory region open by default. For example, before the verification
  step starts, the staged firmware image must no longer be accessible from the
  host.

## Proposed Design

OpenBMC supports a BLOB protocol that provides primitives. These primitives
allow a variety of different "handlers" to exist that implement those primitives
for specific "blobs." A blob in this context is a file path that is strictly
unique.

Sending the firmware image over the BLOB protocol will be done via routing the
[phosphor-ipmi-flash design](https://github.com/openbmc/phosphor-ipmi-flash/blob/master/README.md)
through a BLOB handler. This is meant to supplant `phosphor-ipmi-flash`'s
current approach to centralize on one flexible handler.

### Sequencing Control

To enforce sequencing control, the design requires that only one blob be open at
a time. If the verification blob is open, the other blobs cannot be opened, and
likewise if a client has a data blob open, the verification blob cannot be
opened.
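The single-open rule can be sketched as a small gate object. This is a minimal illustration under stated assumptions, not the handler's actual implementation: `OpenBlobGate` and its `open` and `close` methods are hypothetical names introduced here.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical sketch: remembers the one blob the handler allows to be open.
class OpenBlobGate
{
  public:
    // Records blobId and returns true only if no other blob is open.
    bool open(const std::string& blobId)
    {
        if (current_.has_value())
        {
            return false; // a data, hash, or verify blob is already open
        }
        current_ = blobId;
        return true;
    }

    // Returns true if blobId was the open blob; the gate is then free again.
    bool close(const std::string& blobId)
    {
        if (current_ != blobId)
        {
            return false;
        }
        current_.reset();
        return true;
    }

  private:
    std::optional<std::string> current_;
};
```

With a gate like this, an attempt to open `/flash/verify` while `/flash/image` is still open is refused, and it succeeds once the image blob has been closed.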

### Defining Blobs

The BLOB protocol allows a handler to specify a list of blob ids. This list will
be leveraged to specify whether the platform supports either the legacy (static
layout) or the UBI mechanism, or both. The flags provided to the open command
identify the mechanism selected by the client-side. The stat command will return
the list of supported mechanisms for the blob.

The blob ids for the mechanisms will be as follows:

| Flash Blob Id    | Type            |
| ---------------- | --------------- |
| `/flash/image`   | Static Layout   |
| `/flash/tarball` | UBI             |
| `/flash/bios`    | Host BIOS image |

The flash handler will determine what commands it should expect to receive and
responses it will return given the blob opened, based on the flags provided to
open.

The flash handler will only allow one of the above blobs to be opened for a
sequence of commands, such that you cannot open `/flash/image` and then open
`/flash/bios` without completing (or later aborting) the first update process
started.

The following blob id is defined for storing the hash for the image:

| Hash Blob Id  | Mechanism                       |
| ------------- | ------------------------------- |
| `/flash/hash` | Whichever flash blob was opened |

The flash handler will only allow one open file at a time, such that if the host
attempts to send a firmware image down over IPMI BlockTransfer, it won't allow
the host to start a PCI send until the BlockTransfer file is closed.

There is only one hash "file" mechanism. The exact hash used will only be
important to your verification service. The value provided will be written to a
known place.

When a transfer is active, the handler will create the blob ids
`/flash/active/image` and `/flash/active/hash`.

#### Verification Blob

The following blob id is defined once the image or hash upload has started. Its
purpose is to trigger and monitor the firmware verification process. Therefore,
the BmcBlobOpen command will fail until both the hash and image file are closed.
The ideal command sequence is described further below.

| Trigger Blob    | Note                     |
| --------------- | ------------------------ |
| `/flash/verify` | Verify Trigger Mechanism |

When the verification file is closed, if verification was completed
successfully, it'll add an update blob id, defined below.

The verification process used is not defined by this design.

#### Update Blob

The update blob id is available once `/flash/verify` is closed with a valid
image or tarball. The update blob needs to be opened and commit() called on that
blob id to trigger the update mechanism.

The update process can be checked periodically by calling stat() on the update
blob id.

| Update Blob     | Note                     |
| --------------- | ------------------------ |
| `/flash/update` | Trigger Update Mechanism |

The update process used is not defined by this design.

#### Cleanup Blob

The cleanup blob id is always present. The goal of this blob is to handle
deletion of update artifacts on failure, or success. It can be implemented to do
any manner of cleanup required, but for systems under memory pressure, it is a
convenient cleanup mechanism.

The cleanup blob has no state or knowledge and is meant to provide a simple
system cleanup mechanism. This could also be accomplished by warm rebooting the
BMC. The cleanup blob will delete a list of files. The cleanup blob has no state
recognition for the update process, and therefore can interfere with an update
process. The host tool will only use it on failure cases. Any other tool
developed should respect this and not employ it unless the goal is to clean up
artifacts.

To trigger the cleanup, simply open the blob, commit, and close. It has no
knowledge of the update process. This simplicity is deliberate: the cleanup blob
is designed as a convenience mechanism, not a required one.

| Cleanup Blob     | Note                      |
| ---------------- | ------------------------- |
| `/flash/cleanup` | Trigger Cleanup Mechanism |

### Caching Images

Similarly to the OEM IPMI Flash protocol, the flash image will be staged in a
compile-time configured location.

Other mechanisms can readily be added by adding more blob ids or flags to the
handler.

### Commands

The update mechanism will expect a specific sequence of commands depending on
the transport mechanism selected. Some mechanisms require a handshake.

#### BlockTransfer Sequence

1.  Open (for Image or tarball)
1.  Write
1.  Close
1.  Open (`/flash/hash`)
1.  Write
1.  Close
1.  Open (`/flash/verify`)
1.  Commit
1.  SessionStat (to read back verification status)
1.  Close
1.  Open (`/flash/update`)
1.  Commit
1.  SessionStat (to read back update status)
1.  Close
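The BlockTransfer sequence above can be sketched from the host's side as follows. This is an illustration under several assumptions, not the host tool's real code: `BlobInterface`, `upload`, `trigger`, `sendOverBt`, and `RecordingBlobs` are hypothetical names, the whole payload is written in one call rather than in transport-sized chunks, the verify and update blobs are assumed to take the same open flags as the data blobs, and status polling is bounded and has no delay between attempts.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical host-side view of the blob primitives (not a real library API).
struct BlobInterface
{
    virtual ~BlobInterface() = default;
    virtual std::uint16_t open(const std::string& blobId,
                               std::uint16_t flags) = 0;
    virtual void write(std::uint16_t session, std::uint32_t offset,
                       const std::vector<std::uint8_t>& data) = 0;
    virtual void commit(std::uint16_t session) = 0;
    // Assumed to return the one-byte status from the SessionStat metadata.
    virtual std::uint8_t sessionStat(std::uint16_t session) = 0;
    virtual void close(std::uint16_t session) = 0;
};

constexpr std::uint16_t openWrite = 1 << 1;   // OpenFlags::write
constexpr std::uint16_t transportBt = 1 << 8; // FirmwareUpdateFlags::bt
constexpr std::uint8_t statusRunning = 0x00;
constexpr std::uint8_t statusSuccess = 0x01;

// Open, Write, Close for one data blob.
void upload(BlobInterface& blobs, const std::string& blobId,
            const std::vector<std::uint8_t>& data)
{
    const std::uint16_t session = blobs.open(blobId, openWrite | transportBt);
    blobs.write(session, 0, data);
    blobs.close(session);
}

// Open, Commit, SessionStat (polled while still running), Close.
bool trigger(BlobInterface& blobs, const std::string& blobId)
{
    const std::uint16_t session = blobs.open(blobId, openWrite | transportBt);
    blobs.commit(session);
    std::uint8_t status = blobs.sessionStat(session);
    for (int attempt = 0; status == statusRunning && attempt < 100; ++attempt)
    {
        status = blobs.sessionStat(session); // a real tool would sleep here
    }
    blobs.close(session);
    return status == statusSuccess;
}

// Runs the full sequence; the update step is skipped if verification fails.
bool sendOverBt(BlobInterface& blobs, const std::string& imageBlob,
                const std::vector<std::uint8_t>& image,
                const std::vector<std::uint8_t>& hash)
{
    upload(blobs, imageBlob, image);
    upload(blobs, "/flash/hash", hash);
    return trigger(blobs, "/flash/verify") && trigger(blobs, "/flash/update");
}

// Fake that records call order, useful for checking the sequence in a test.
struct RecordingBlobs : BlobInterface
{
    std::vector<std::string> log;
    std::uint16_t open(const std::string& blobId, std::uint16_t) override
    {
        log.push_back("open " + blobId);
        return 1;
    }
    void write(std::uint16_t, std::uint32_t,
               const std::vector<std::uint8_t>&) override
    {
        log.push_back("write");
    }
    void commit(std::uint16_t) override { log.push_back("commit"); }
    std::uint8_t sessionStat(std::uint16_t) override
    {
        log.push_back("stat");
        return statusSuccess;
    }
    void close(std::uint16_t) override { log.push_back("close"); }
};
```

Keeping the primitives behind an interface lets the ordering be checked against a fake such as `RecordingBlobs` without any BMC hardware.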

#### P2A Sequence

1.  Open (for Image or tarball)
1.  SessionStat (P2A Region for P2A mapping)
1.  Write
1.  Close
1.  Open (`/flash/hash`)
1.  SessionStat (P2A Region)
1.  Write
1.  Close
1.  Open (`/flash/verify`)
1.  Commit
1.  SessionStat (to read back verification status)
1.  Close
1.  Open (`/flash/update`)
1.  Commit
1.  SessionStat (to read back update status)
1.  Close

#### LPC Sequence

1.  Open (for image or tarball)
1.  WriteMeta (specify region information from host for LPC)
1.  SessionStat (verify the contents from the above)
1.  Write
1.  Close
1.  Open (`/flash/hash`)
1.  WriteMeta (LPC Region)
1.  SessionStat (verify LPC config)
1.  Write
1.  Close
1.  Open (`/flash/verify`)
1.  Commit
1.  SessionStat (to read back verification status)
1.  Close
1.  Open (`/flash/update`)
1.  Commit
1.  SessionStat (to read back update status)
1.  Close

### Stale Images

If an image update process is started but goes stale, there are multiple
mechanisms in place to ensure cleanup. If a session is left open after the blob
timeout period, it'll be closed. Because expiration is not the same action as
closing, the cache will be flushed and any staged pieces deleted.

The image itself, in legacy (static layout) mode, will be placed and named in
such a way that it will disappear if the BMC reboots. In the UBI case, the file
will be stored in `/tmp` and deleted accordingly.

At any point during the upload process, one can abort by closing the open blobs
and deleting them by name.

### Blob Primitives

The update mechanism will implement the Blob primitives as follows.

#### BmcBlobOpen

The blob open primitive allows supplying blob specific flags. These flags are
used for specifying the transport mechanism. To obtain the list of supported
mechanisms on a platform, see the `Stat` command below.

```
enum OpenFlags
{
    read = (1 << 0),
    write = (1 << 1),
};

/* These bits start in the blob specific range of the flags. */
enum FirmwareUpdateFlags
{
    bt = (1 << 8),   /* Expect to send contents over IPMI BlockTransfer. */
    p2a = (1 << 9),  /* Expect to send contents over P2A bridge. */
    lpc = (1 << 10), /* Expect to send contents over LPC bridge. */
};
```

An open request must specify that it is opening for writing and one transport
mechanism, otherwise it is rejected. If the request is also set for reading,
this is not rejected but currently provides no additional value.

Once opened, a new file will appear in the blob_id list (for both the image and
hash) indicating they are in progress. The names will be `/flash/active/image`
and `/flash/active/hash`, which have no meaning beyond representing the current
update in progress. Closing the file does not delete the staged images. Only
delete will.

**_Note:_** The active image blob_ids cannot be opened. This can be reconsidered
later.
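The open-flag rule described in this section (write access plus one transport mechanism) can be sketched as a small predicate. This is a minimal illustration: the constant names and `isValidOpenRequest` are hypothetical, the bit values mirror the `OpenFlags` and `FirmwareUpdateFlags` enums, and "one transport mechanism" is interpreted here as exactly one transport bit.

```cpp
#include <cassert>
#include <cstdint>

// Bit values mirror OpenFlags and FirmwareUpdateFlags; names are illustrative.
constexpr std::uint16_t openRead = 1 << 0;
constexpr std::uint16_t openWrite = 1 << 1;
constexpr std::uint16_t transportBt = 1 << 8;
constexpr std::uint16_t transportP2a = 1 << 9;
constexpr std::uint16_t transportLpc = 1 << 10;
constexpr std::uint16_t transportMask =
    transportBt | transportP2a | transportLpc;

// True when exactly one bit is set in value.
constexpr bool hasSingleBit(std::uint16_t value)
{
    return value != 0 && (value & (value - 1)) == 0;
}

// Accepts a request only if it asks for write access and exactly one
// transport. The read bit is tolerated and otherwise ignored.
constexpr bool isValidOpenRequest(std::uint16_t flags)
{
    return (flags & openWrite) != 0 && hasSingleBit(flags & transportMask);
}
```

For example, `openWrite | transportBt` passes, while `openWrite` alone or `openWrite | transportBt | transportP2a` would be rejected.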

#### BmcBlobRead

This will initially not perform any function and will return success with 0
bytes.

#### BmcBlobWrite

The write command's contents will depend on the transport mechanism. This
command must not return until it has copied the data out of the mapped region
into either a staging buffer or written down to a staging file. How the command
reads from the mapped region is beyond the scope of this design.

##### If BT

The data section of the payload is only data.

##### If P2A

The data section of the payload is the following structure:

```
struct ExtChunkHdr
{
    uint32_t length; /* Length of the data queued (little endian). */
};
```

##### If LPC

The data section of the payload is the following structure:

```
struct ExtChunkHdr
{
    uint32_t length; /* Length of the data queued (little endian). */
};
```
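For the P2A and LPC cases, decoding the little-endian `length` field from the write payload might look like the sketch below. `parseChunkLength` is a hypothetical helper, and it assumes the payload carries at least the four header bytes; anything shorter is treated as malformed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Decodes ExtChunkHdr::length from the first four payload bytes (little
// endian). Returns std::nullopt when the payload is too short.
std::optional<std::uint32_t>
    parseChunkLength(const std::vector<std::uint8_t>& payload)
{
    constexpr std::size_t headerSize = sizeof(std::uint32_t);
    if (payload.size() < headerSize)
    {
        return std::nullopt;
    }
    std::uint32_t length = 0;
    for (std::size_t i = 0; i < headerSize; ++i)
    {
        length |= static_cast<std::uint32_t>(payload[i]) << (8 * i);
    }
    return length;
}
```

Assembling the value byte by byte keeps the result independent of the BMC's own endianness and avoids unaligned reads from the payload buffer.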

#### BmcBlobCommit

If this command is called on the session of the firmware image itself, nothing
will happen at present. It will return a no-op success.

If this command is called on the session for the hash image, nothing will happen
at present. It will return a no-op success.

If this command is called on the session for the verify blob id, it'll trigger a
systemd service `verify_image.service` to attempt to verify the image. Before
doing this, if the transport mechanism is not IPMI BT, it'll shut down the
mechanism used for transport, preventing the host from updating anything.

When this is started, only the BmcBlobSessionStat command will respond. Details
on that response are below under BmcBlobSessionStat.

#### BmcBlobClose

Close must be called on the firmware image and the hash file before opening the
verify blob.

If the `verify_image.service` returned success, closing the verify file will
have a specific behavior depending on the update. If it's UBI, it'll perform the
install. If it's legacy (static layout), it'll do nothing. The
`verify_image.service` in the legacy case is responsible for placing the file in
the correct staging position. A BMC warm reset command will initiate the
firmware update process.

If the image verification fails, it will automatically delete any files
associated with the update.

**_Note:_** During development testing, a developer will want to upload files
that are not signed. Therefore, an additional bit will be added to the flags to
change this behavior.

#### BmcBlobDelete

Aborts any update that's in progress:

1.  Stops the `verify_image.service` if started.
1.  Deletes any staged files.

If the update itself is already in progress, such as when the tarball mechanism
is used and is in the middle of updating the files, it cannot be aborted.

#### BmcBlobStat

Blob stat on a blob_id (not SessionStat) will return the capabilities of the
blob_id handler.

```
struct BmcBlobStatRx {
    uint16_t crc16;
    /* This will have the bits set from the FirmwareUpdateFlags enum. */
    uint16_t blob_state;
    uint32_t size; /* 0 - it's set to zero when there's no session */
    uint8_t  metadata_len; /* 0 */
};
```
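A host tool could turn the `blob_state` capability bits into a list of usable transports along these lines. This is a sketch only: `supportedTransports` is a hypothetical helper, and the bit positions are taken from the `FirmwareUpdateFlags` enum.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Lists the transports advertised in blob_state, using the bit positions from
// FirmwareUpdateFlags (bt = bit 8, p2a = bit 9, lpc = bit 10).
std::vector<std::string> supportedTransports(std::uint16_t blobState)
{
    std::vector<std::string> names;
    if ((blobState & (1 << 8)) != 0)
    {
        names.push_back("bt");
    }
    if ((blobState & (1 << 9)) != 0)
    {
        names.push_back("p2a");
    }
    if ((blobState & (1 << 10)) != 0)
    {
        names.push_back("lpc");
    }
    return names;
}
```

The host can then pick one of the returned transports and set the matching flag in its open request.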

#### BmcBlobSessionStat

If called pre-commit, it'll return the following information:

```
struct BmcBlobStatRx {
    uint16_t crc16;
    uint16_t blob_state; /* OpenFlags::write | (one of the interfaces) */
    uint32_t size; /* Size in bytes so far written */
    uint8_t  metadata_len; /* 0. */
};
```

If it's called and the data transport mechanism is P2A, it'll return, in the
metadata portion of the `BmcBlobStatRx`, a 32-bit address used to configure the
P2A region.

```
struct P2ARegion {
    uint32_t address;
};

struct BmcBlobStatRx {
    uint16_t crc16;
    uint16_t blob_state; /* OpenFlags::write | (one of the interfaces) */
    uint32_t size; /* Size in bytes so far written */
    uint8_t  metadata_len; /* sizeof(struct P2ARegion) */
    struct P2ARegion metadata;
};
```

If called post-commit on the verify file session, it'll return:

```
struct BmcBlobStatRx {
    uint16_t crc16;
    uint16_t blob_state; /* OpenFlags::write | (one of the interfaces) */
    uint32_t size; /* Size in bytes so far written */
    uint8_t  metadata_len; /* 1. */
    uint8_t  verify_response; /* one byte from the below enum */
};

enum VerifyCheckResponses
{
    VerifyRunning = 0x00,
    VerifySuccess = 0x01,
    VerifyFailed  = 0x02,
    VerifyOther   = 0x03,
};
```

If called post-commit on the update file session, it'll return:

```
struct BmcBlobStatRx {
    uint16_t crc16;
    uint16_t blob_state; /* OpenFlags::write | (one of the interfaces) */
    uint32_t size; /* Size in bytes so far written */
    uint8_t  metadata_len; /* 1. */
    uint8_t  update_response; /* one byte from the below enum */
};

enum UpdateStatus
{
    UpdateRunning = 0x00,
    UpdateSuccessful = 0x01,
    UpdateFailed = 0x02,
    UpdateStatusUnknown = 0x03
};
```

The `UpdateStatus` and `VerifyCheckResponses` are currently identical, but this
may change over time.
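Extracting the verification status from a raw SessionStat response could be sketched as below. The byte offsets assume the `BmcBlobStatRx` fields arrive packed in the order shown (crc16, blob_state, size, metadata_len, metadata), which this design does not state explicitly; `parseVerifyStatus` and the offset constants are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

enum class VerifyCheckResponses : std::uint8_t
{
    VerifyRunning = 0x00,
    VerifySuccess = 0x01,
    VerifyFailed = 0x02,
    VerifyOther = 0x03,
};

// Assumed packed layout: crc16 (2), blob_state (2), size (4),
// metadata_len (1), then the metadata bytes.
constexpr std::size_t metadataLenOffset = 8;
constexpr std::size_t metadataOffset = 9;

// Returns the verify status byte, or std::nullopt if the response is too
// short, does not carry exactly one metadata byte, or holds an unknown value.
std::optional<VerifyCheckResponses>
    parseVerifyStatus(const std::vector<std::uint8_t>& response)
{
    if (response.size() <= metadataOffset ||
        response[metadataLenOffset] != 1)
    {
        return std::nullopt;
    }
    const std::uint8_t raw = response[metadataOffset];
    if (raw > static_cast<std::uint8_t>(VerifyCheckResponses::VerifyOther))
    {
        return std::nullopt;
    }
    return static_cast<VerifyCheckResponses>(raw);
}
```

Because `UpdateStatus` currently uses the same values, the same approach would apply to the update session, though a separate enum keeps the two free to diverge.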

#### BmcBlobWriteMeta

The write metadata command is meant to allow the host to provide specific
configuration data to the BMC for the in-band update. Currently that is only
aimed at LPC which needs to be told the memory address so it can configure the
window.

The write meta command's blob will be this structure:

```
struct LpcRegion
{
    uint32_t address; /* Host LPC address where the chunk is to be mapped. */
    uint32_t length; /* Size of the chunk to be mapped. */
};
```
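On the host side, building the WriteMeta payload for `LpcRegion` might look like the following sketch. The byte order is an assumption: little endian is used here to match `ExtChunkHdr`, but this design does not state it for `LpcRegion`. `packLpcRegion` is a hypothetical helper.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Serializes LpcRegion as address followed by length, each as four
// little-endian bytes (byte order assumed, see the note above).
std::array<std::uint8_t, 8> packLpcRegion(std::uint32_t address,
                                          std::uint32_t length)
{
    std::array<std::uint8_t, 8> out{};
    for (std::size_t i = 0; i < 4; ++i)
    {
        out[i] = static_cast<std::uint8_t>(address >> (8 * i));
        out[4 + i] = static_cast<std::uint8_t>(length >> (8 * i));
    }
    return out;
}
```

Serializing field by field, rather than copying the struct's memory, avoids depending on the host compiler's padding and endianness.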

## Alternatives Considered

There is currently another implementation in use by Google that leverages the
same mechanisms; however, it's not as flexible because every command is a custom
piece. Mapping it into blob primitives allows for easier future modification
while maintaining backwards compatibility (without simply adding a separate OEM
library to handle a new process, etc).

## Impacts

This impacts security because it can leverage the memory mapped windows. There
is not an expected performance impact, as the presence of the blob handler only
adds a couple of extra entries to the blob enumerate command's response.

## Testing

Where possible (nearly everywhere), mockable interfaces will be used such that
the entire process has individual unit-tests that verify flags are checked, as
well as states and sequences.

### Scenarios

#### Sending an image with a bad hash

A required functional test is one whereby an image is sent down to the BMC,
however the signature is invalid for that image. The expected result is that the
verification step will return failure and the files will be deleted from the BMC
without user intervention.

#### Sending an image with a good hash

A required functional test is one whereby an image is sent down to the BMC with
a valid signature. The expected result is that the verification step will return
success.

## Configuration

See the configuration section of
[Secure Flash Update Mechanism](https://github.com/openbmc/phosphor-ipmi-flash/blob/master/README.md).
