# In-Band Update of BMC Firmware (and others) using OEM IPMI Blob Transport

Author: Patrick Venture <venture!>

Created: 2018-10-18

## Problem Description

The BMC needs a way to receive a new firmware image from the host over a
variety of transports. This is best served by a single protocol into which
multiple approaches can be routed.

## Background and References

BMC hardware provides at a minimum some interface for sending and receiving IPMI
messages. This hardware may also provide regions that can be memory mapped for
higher speed communication between the BMC and the host. Certain infrastructures
do not provide network access to the BMC, so an update mechanism that works
in-band between the host and the BMC is required.

In-band here refers to a communications channel that is directly connected
between the host and BMC. Examples include:

1.  Serial
1.  IPMI over LPC
1.  IPMI over i2c
1.  LPC Memory-Mapped Region
1.  P2A bridge

## Primer

Please read the IPMI BLOB protocol design as a primer
[here](https://github.com/openbmc/phosphor-ipmi-blobs/blob/master/README.md).

## Requirements

The following statements are reflective of the initial requirements.

*   Any update mechanism must provide support for UBI tarballs and legacy
    (static layout) flash images. Leveraging the BLOB protocol allows a system
    to provide support for any image type simply by implementing a mechanism for
    handling it.

*   Any update mechanism must allow for triggering an image verification step
    before the image is used.

*   Any update mechanism must allow implementing the data staging via different
    in-band mechanisms.

*   Any update mechanism must provide a handshake or equivalent protocol for
    coordinating the data transfer. For instance, whether the BMC should enable
    the P2A bridge and what region to use or whether to turn on the LPC memory
    map bridge.

*   Any update mechanism must attempt to maintain security, insomuch as not
    leaving a memory region open by default. For example, before starting the
    verification step, the staged firmware image must no longer be accessible
    from the host.

## Proposed Design

OpenBMC supports a BLOB protocol that provides primitives. These primitives
allow a variety of different "handlers" to exist that implement those primitives
for specific "blobs." A blob in this context is a file path that is strictly
unique.

Sending the firmware image over the BLOB protocol will be done via routing the
[phosphor-ipmi-flash design](https://github.com/openbmc/phosphor-ipmi-flash/blob/master/README.md)
through a BLOB handler. This is meant to supplant `phosphor-ipmi-flash`'s
current approach to centralize on one flexible handler.

### Sequencing Control

To enforce sequencing control, the design requires that only one blob be open at
a time. If the verification blob is open, the other blobs cannot be opened, and
likewise if a client has a data blob open, the verification blob cannot be
opened.
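The rule above can be sketched as a small guard object. This is a minimal illustration only: `SessionGate` and its methods are hypothetical names, not part of `phosphor-ipmi-blobs`, and a real handler would track considerably more state.

```cpp
#include <optional>
#include <string>

// Hypothetical helper: remembers the single blob that may be open at a time.
class SessionGate
{
  public:
    // Records the blob and returns true only if nothing else is open.
    bool open(const std::string& blobId)
    {
        if (active_)
        {
            return false; // a data, hash, or verify blob is already open
        }
        active_ = blobId;
        return true;
    }

    // Returns true only if blobId is the blob that is currently open.
    bool close(const std::string& blobId)
    {
        if (!active_ || *active_ != blobId)
        {
            return false;
        }
        active_.reset();
        return true;
    }

  private:
    std::optional<std::string> active_;
};
```

With such a guard, an attempt to open `/flash/verify` while `/flash/image` is still open is refused until the image blob has been closed.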

### Defining Blobs

The BLOB protocol allows a handler to specify a list of blob ids. This list will
be leveraged to specify whether the platform supports either the legacy (static
layout) or the UBI mechanism, or both. The flags provided to the open command
identify the mechanism selected by the client-side. The stat command will return
the list of supported mechanisms for the blob.

The blob ids for the mechanisms will be as follows:

Flash Blob Id    | Type
---------------- | ---------------
`/flash/image`   | Static Layout
`/flash/tarball` | UBI
`/flash/bios`    | Host BIOS image

The flash handler will determine what commands it should expect to receive and
responses it will return given the blob opened, based on the flags provided to
open.

The flash handler will only allow one of the above blobs to be opened for a
sequence of commands, such that you cannot open `/flash/image` and then open
`/flash/bios` without completing (or aborting) the first update process
started.

The following blob id is defined for storing the hash for the image:

Hash Blob Id  | Mechanism
------------- | -------------------------------
`/flash/hash` | Whichever flash blob was opened

The flash handler will only allow one open file at a time, such that if the host
attempts to send a firmware image down over IPMI BlockTransfer, it won't allow
the host to start a PCI send until the BlockTransfer file is closed.

There is only one hash "file" mechanism. The exact hash used will only be
important to your verification service. The value provided will be written to a
known place.

When a transfer is active, it'll create a blob_id of `/flash/active/image` and
`/flash/active/hash`.

#### Verification Blob

The following blob id is defined once the image or hash upload has started. Its
purpose is to trigger and monitor the firmware verification process. Therefore,
the BmcBlobOpen command will fail until both the hash and image file are closed.
The ideal command sequence is described further below.

Trigger Blob    | Note
--------------- | ------------------------
`/flash/verify` | Verify Trigger Mechanism

When the verification file is closed, if verification was completed
successfully, it'll add an update blob id, defined below.

The verification process used is not defined by this design.

#### Update Blob

The update blob id is available once `/flash/verify` is closed with a valid image
or tarball. The update blob needs to be opened and commit() called on that blob
id to trigger the update mechanism.

The update process can be checked periodically by calling stat() on the update
blob id.

Update Blob     | Note
--------------- | ------------------------
`/flash/update` | Trigger Update Mechanism

The update process used is not defined by this design.

#### Cleanup Blob

The cleanup blob id is always present. The goal of this blob is to handle
deletion of update artifacts on failure or success. It can be implemented to
do any manner of cleanup required, but for systems under memory pressure, it is
a convenient cleanup mechanism.

The cleanup blob has no state or knowledge and is meant to provide a simple
system cleanup mechanism. This could also be accomplished by warm rebooting
the BMC. The cleanup blob will delete a list of files. Because it has no state
recognition for the update process, it can interfere with an update that is in
progress. The host tool will only use it on failure cases. Any other tool
developed should respect this and not employ it unless the goal is to clean up
artifacts.

To trigger the cleanup, simply open the blob, commit, and close. It has no
knowledge of the update process. It is kept this simple because it is designed
as a convenience mechanism rather than a required one.

Cleanup Blob     | Note
---------------- | -------------------------
`/flash/cleanup` | Trigger Cleanup Mechanism

### Caching Images

Similarly to the OEM IPMI Flash protocol, the flash image will be staged in a
compile-time configured location.

Other mechanisms can readily be added by adding more blob ids or flags to the
handler.

### Commands

The update mechanism will expect a specific sequence of commands depending on
the transport mechanism selected. Some mechanisms require a handshake.

#### BlockTransfer Sequence

1.  Open (for image or tarball)
1.  Write
1.  Close
1.  Open (`/flash/hash`)
1.  Write
1.  Close
1.  Open (`/flash/verify`)
1.  Commit
1.  SessionStat (to read back verification status)
1.  Close
1.  Open (`/flash/update`)
1.  Commit
1.  SessionStat (to read back update status)
1.  Close

#### P2A Sequence

1.  Open (for image or tarball)
1.  SessionStat (P2A Region for P2A mapping)
1.  Write
1.  Close
1.  Open (`/flash/hash`)
1.  SessionStat (P2A Region)
1.  Write
1.  Close
1.  Open (`/flash/verify`)
1.  Commit
1.  SessionStat (to read back verification status)
1.  Close
1.  Open (`/flash/update`)
1.  Commit
1.  SessionStat (to read back update status)
1.  Close

#### LPC Sequence

1.  Open (for image or tarball)
1.  WriteMeta (specify region information from host for LPC)
1.  SessionStat (verify the contents from the above)
1.  Write
1.  Close
1.  Open (`/flash/hash`)
1.  WriteMeta (LPC Region)
1.  SessionStat (verify LPC config)
1.  Write
1.  Close
1.  Open (`/flash/verify`)
1.  Commit
1.  SessionStat (to read back verification status)
1.  Close
1.  Open (`/flash/update`)
1.  Commit
1.  SessionStat (to read back update status)
1.  Close
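The three sequences above differ only in the handshake steps taken after a data blob (image, tarball, or hash) is opened; the verify and update steps are the same for every transport. As a rough sketch, that difference could be written down as follows. `Transport` and `dataBlobSteps` are hypothetical names used purely for illustration.

```cpp
#include <string>
#include <vector>

// Transports named in this design.
enum class Transport
{
    bt,
    p2a,
    lpc,
};

// Hypothetical helper: the commands issued for one data blob (image, tarball,
// or hash) on the given transport, in the order listed above.
std::vector<std::string> dataBlobSteps(Transport transport)
{
    std::vector<std::string> steps{"Open"};
    if (transport == Transport::lpc)
    {
        steps.push_back("WriteMeta"); // host tells the BMC which region to map
    }
    if (transport != Transport::bt)
    {
        steps.push_back("SessionStat"); // read back the region configuration
    }
    steps.push_back("Write");
    steps.push_back("Close");
    return steps;
}
```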

### Stale Images

If an image update process is started but goes stale, there are multiple
mechanisms in place to ensure cleanup. If a session is left open after the blob
timeout period, it'll be closed. Because expiration is not the same action as
closing, the cache will be flushed and any staged pieces deleted.

The image itself, in legacy (static layout) mode, will be placed and named in
such a way that it will disappear if the BMC reboots. In the UBI case, the file
will be stored in `/tmp` and deleted accordingly.

At any point during the upload process, one can abort by closing the open blobs
and deleting them by name.

### Blob Primitives

The update mechanism will implement the Blob primitives as follows.

#### BmcBlobOpen

The blob open primitive allows supplying blob specific flags. These flags are
used for specifying the transport mechanism. To obtain the list of supported
mechanisms on a platform, see the `Stat` command below.

```
enum OpenFlags
{
    read = (1 << 0),
    write = (1 << 1),
};

/* These bits start in the blob specific range of the flags. */
enum FirmwareUpdateFlags
{
    bt = (1 << 8),   /* Expect to send contents over IPMI BlockTransfer. */
    p2a = (1 << 9),  /* Expect to send contents over P2A bridge. */
    lpc = (1 << 10), /* Expect to send contents over LPC bridge. */
};
```

An open request must specify that it is opening for writing and one transport
mechanism, otherwise it is rejected. If the request is also set for reading,
this is not rejected but currently provides no additional value.

Once opened, a new file will appear in the blob_id list (for both the image and
hash) indicating they are in progress. The names will be `/flash/active/image`
and `/flash/active/hash`, which have no meaning beyond representing the current
update in progress. Closing the file does not delete the staged images. Only
delete will.

***Note*** The active image blob_ids cannot be opened. This can be reconsidered
later.
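The open-flag rule in this section (write access plus one transport) can be sketched as a small check. The constant names and `isValidOpen` are hypothetical, the bit values are copied from the enums above, and reading "one transport mechanism" as "exactly one" is an assumption rather than something this design states outright.

```cpp
#include <cstdint>

// Bit values copied from OpenFlags and FirmwareUpdateFlags above.
constexpr std::uint16_t openWrite = 1 << 1;
constexpr std::uint16_t transportBt = 1 << 8;
constexpr std::uint16_t transportP2a = 1 << 9;
constexpr std::uint16_t transportLpc = 1 << 10;
constexpr std::uint16_t transportMask =
    transportBt | transportP2a | transportLpc;

// Hypothetical check: accept only write access plus exactly one transport bit.
// The read bit is ignored, matching "not rejected but provides no value".
constexpr bool isValidOpen(std::uint16_t flags)
{
    const std::uint16_t transport = flags & transportMask;
    // A value with exactly one bit set is non-zero and has no bits in common
    // with itself minus one.
    const bool oneTransport =
        transport != 0 && (transport & (transport - 1)) == 0;
    return (flags & openWrite) != 0 && oneTransport;
}
```

Under these assumptions, `openWrite | transportBt` is accepted, while a request with no transport bit or with both `transportBt` and `transportP2a` set is rejected.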

#### BmcBlobRead

This will initially not perform any function and will return success with 0
bytes.

#### BmcBlobWrite

The write command's contents will depend on the transport mechanism. This
command must not return until it has copied the data out of the mapped region
into either a staging buffer or written down to a staging file. How the command
reads from the mapped region is beyond the scope of this design.

##### If BT

The data section of the payload is only data.

##### If P2A

The data section of the payload is the following structure:

```
struct ExtChunkHdr
{
    uint32_t length; /* Length of the data queued (little endian). */
};
```

##### If LPC

The data section of the payload is the following structure:

```
struct ExtChunkHdr
{
    uint32_t length; /* Length of the data queued (little endian). */
};
```
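Because `ExtChunkHdr::length` is specified as little endian, a host tool cannot rely on its native byte order when building the payload. A minimal sketch of an explicit encoding follows; `encodeChunkLength` is a hypothetical helper name, not an existing API.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper: serialize ExtChunkHdr::length as four little-endian
// bytes, least significant byte first, regardless of host byte order.
constexpr std::array<std::uint8_t, 4> encodeChunkLength(std::uint32_t length)
{
    return {static_cast<std::uint8_t>(length & 0xff),
            static_cast<std::uint8_t>((length >> 8) & 0xff),
            static_cast<std::uint8_t>((length >> 16) & 0xff),
            static_cast<std::uint8_t>((length >> 24) & 0xff)};
}
```

For example, a queued length of `0x12345678` is sent as the bytes `78 56 34 12`.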

#### BmcBlobCommit

If this command is called on the session of the firmware image itself, nothing
will happen at present. It will return a no-op success.

If this command is called on the session for the hash image, nothing will happen
at present. It will return a no-op success.

If this command is called on the session for the verify blob id, it'll trigger a
systemd service `verify_image.service` to attempt to verify the image. Before
doing this, if the transport mechanism is not IPMI BT, it'll shut down the
mechanism used for transport preventing the host from updating anything.

When this is started, only the BmcBlobSessionStat command will respond. Details
on that response are below under BmcBlobSessionStat.
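The per-blob commit behavior can be summarized as a simple dispatch. This is a sketch only: `CommitAction` and `commitActionFor` are hypothetical names, the update case is taken from the Update Blob section, and treating `/flash/bios` like the other image blobs and rejecting every other blob id are assumptions.

```cpp
#include <optional>
#include <string>

// Hypothetical names for the commit outcomes described in this design.
enum class CommitAction
{
    noop,              // image and hash sessions: succeed without doing anything
    startVerification, // verify blob: start verify_image.service
    startUpdate,       // update blob: trigger the update mechanism
};

// Returns the action for a commit on the given blob id, or std::nullopt for
// blob ids this sketch does not cover (for example the active blob ids, which
// cannot be opened).
std::optional<CommitAction> commitActionFor(const std::string& blobId)
{
    if (blobId == "/flash/image" || blobId == "/flash/tarball" ||
        blobId == "/flash/bios" || blobId == "/flash/hash")
    {
        return CommitAction::noop;
    }
    if (blobId == "/flash/verify")
    {
        return CommitAction::startVerification;
    }
    if (blobId == "/flash/update")
    {
        return CommitAction::startUpdate;
    }
    return std::nullopt;
}
```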

#### BmcBlobClose

Close must be called on the firmware image and the hash file before opening the
verify blob.

If the `verify_image.service` returned success, closing the verify file will
have a specific behavior depending on the update. If it's UBI, it'll perform the
install. If it's legacy (static layout), it'll do nothing. The verify_image
service in the legacy case is responsible for placing the file in the correct
staging position. A BMC warm reset command will initiate the firmware update
process.

If the image verification fails, it will automatically delete any files
associated with the update.

***Note:*** During development testing, a developer will want to upload files
that are not signed. Therefore, an additional bit will be added to the flags to
change this behavior.

#### BmcBlobDelete

Aborts any update that's in progress:

1.  Stops the verify_image.service if started.
1.  Deletes any staged files.

If the update itself is already in progress, such as when the tarball mechanism
is used and is in the middle of updating the files, it cannot be aborted.

#### BmcBlobStat

Blob stat on a blob_id (not SessionStat) will return the capabilities of the
blob_id handler.

```
struct BmcBlobStatRx {
    uint16_t crc16;
    /* This will have the bits set from the FirmwareUpdateFlags enum. */
    uint16_t blob_state;
    uint32_t size; /* 0 - it's set to zero when there's no session */
    uint8_t  metadata_len; /* 0 */
};
```
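On the host side, the `blob_state` field returned by stat can be decoded into the transports a platform supports. The following is a minimal sketch, with `supportedTransports` as a hypothetical helper name and the bit positions taken from `FirmwareUpdateFlags`.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical host-side helper: list the transports advertised in blob_state.
std::vector<std::string> supportedTransports(std::uint16_t blobState)
{
    std::vector<std::string> names;
    if (blobState & (1u << 8)) // FirmwareUpdateFlags::bt
    {
        names.push_back("bt");
    }
    if (blobState & (1u << 9)) // FirmwareUpdateFlags::p2a
    {
        names.push_back("p2a");
    }
    if (blobState & (1u << 10)) // FirmwareUpdateFlags::lpc
    {
        names.push_back("lpc");
    }
    return names;
}
```

A host tool could use this list to pick the fastest transport the BMC advertises before issuing the open command.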

#### BmcBlobSessionStat

If called pre-commit, it'll return the following information:

```
struct BmcBlobStatRx {
    uint16_t crc16;
    uint16_t blob_state; /* OpenFlags::write | (one of the interfaces) */
    uint32_t size; /* Size in bytes so far written */
    uint8_t  metadata_len; /* 0. */
};
```

If it's called and the data transport mechanism is P2A, it'll return a 32-bit
address for use to configure the P2A region as part of the metadata portion of
the `BmcBlobStatRx`.

```
struct BmcBlobStatRx {
    uint16_t crc16;
    uint16_t blob_state; /* OpenFlags::write | (one of the interfaces) */
    uint32_t size; /* Size in bytes so far written */
    uint8_t  metadata_len = sizeof(struct P2ARegion);
    struct P2ARegion {
        uint32_t address;
    };
};
```
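A host tool might pull the P2A address out of that response roughly as follows. This sketch assumes the response bytes are packed in the field order shown above and that multi-byte fields are little endian; neither is stated in this section, and `readLe32` and `p2aAddress` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Read a 32-bit value at offset, assuming little-endian wire order.
std::uint32_t readLe32(const std::vector<std::uint8_t>& buf, std::size_t offset)
{
    return static_cast<std::uint32_t>(buf[offset]) |
           static_cast<std::uint32_t>(buf[offset + 1]) << 8 |
           static_cast<std::uint32_t>(buf[offset + 2]) << 16 |
           static_cast<std::uint32_t>(buf[offset + 3]) << 24;
}

// Hypothetical parser: returns the P2A address if the metadata carries one.
std::optional<std::uint32_t> p2aAddress(const std::vector<std::uint8_t>& resp)
{
    // Assumed packed layout: crc16 (2) + blob_state (2) + size (4), then
    // metadata_len (1), then the metadata bytes.
    constexpr std::size_t metadataLenOffset = 8;
    constexpr std::size_t metadataOffset = 9;
    if (resp.size() < metadataOffset + 4 || resp[metadataLenOffset] != 4)
    {
        return std::nullopt; // too short, or metadata is not a P2ARegion
    }
    return readLe32(resp, metadataOffset);
}
```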

If called post-commit on the verify file session, it'll return:

```
struct BmcBlobStatRx {
    uint16_t crc16;
    uint16_t blob_state; /* OpenFlags::write | (one of the interfaces) */
    uint32_t size; /* Size in bytes so far written */
    uint8_t  metadata_len; /* 1. */
    uint8_t  verify_response; /* one byte from the below enum */
};

enum VerifyCheckResponses
{
    VerifyRunning = 0x00,
    VerifySuccess = 0x01,
    VerifyFailed  = 0x02,
    VerifyOther   = 0x03,
};
```

If called post-commit on the update file session, it'll return:

```
struct BmcBlobStatRx {
    uint16_t crc16;
    uint16_t blob_state; /* OpenFlags::write | (one of the interfaces) */
    uint32_t size; /* Size in bytes so far written */
    uint8_t  metadata_len; /* 1. */
    uint8_t  update_response; /* one byte from the below enum */
};

enum UpdateStatus
{
    UpdateRunning = 0x00,
    UpdateSuccessful = 0x01,
    UpdateFailed = 0x02,
    UpdateStatusUnknown = 0x03
};
```

The `UpdateStatus` and `VerifyCheckResponses` are currently identical, but this
may change over time.
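Since both enums currently share the same values, a host tool polling SessionStat could interpret the single status byte with one helper. This is a sketch: `PollAction` and `nextAction` are hypothetical names, and treating the "other"/"unknown" value as a reason to abort is an assumption, not something this design prescribes.

```cpp
#include <cstdint>

// Hypothetical host-side decision after each SessionStat poll.
enum class PollAction
{
    keepPolling, // 0x00: verification or update still running
    proceed,     // 0x01: step finished successfully
    abort,       // 0x02, 0x03, or anything unexpected
};

// Maps the one-byte verify_response / update_response to the next action.
constexpr PollAction nextAction(std::uint8_t statusByte)
{
    switch (statusByte)
    {
        case 0x00:
            return PollAction::keepPolling;
        case 0x01:
            return PollAction::proceed;
        default:
            return PollAction::abort;
    }
}
```

If the two enums diverge later, this helper would need to be split into one function per blob.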

#### BmcBlobWriteMeta

The write metadata command is meant to allow the host to provide specific
configuration data to the BMC for the in-band update. Currently that is only
aimed at LPC which needs to be told the memory address so it can configure the
window.

The write meta command's blob will be this structure:

```
struct LpcRegion
{
    uint32_t address; /* Host LPC address where the chunk is to be mapped. */
    uint32_t length; /* Size of the chunk to be mapped. */
};
```
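A host tool would need to turn `LpcRegion` into the bytes carried by the WriteMeta command. The sketch below assumes both fields are sent little endian and packed back to back, which this design does not state explicitly; `appendLe32` and `packLpcRegion` are hypothetical helper names.

```cpp
#include <cstdint>
#include <vector>

// Mirrors the structure above.
struct LpcRegion
{
    std::uint32_t address; // Host LPC address where the chunk is to be mapped.
    std::uint32_t length;  // Size of the chunk to be mapped.
};

// Append a 32-bit value least significant byte first (assumed wire order).
void appendLe32(std::vector<std::uint8_t>& out, std::uint32_t value)
{
    for (int shift = 0; shift < 32; shift += 8)
    {
        out.push_back(static_cast<std::uint8_t>((value >> shift) & 0xff));
    }
}

// Hypothetical helper: build the 8-byte WriteMeta payload for an LPC region.
std::vector<std::uint8_t> packLpcRegion(const LpcRegion& region)
{
    std::vector<std::uint8_t> payload;
    payload.reserve(8);
    appendLe32(payload, region.address);
    appendLe32(payload, region.length);
    return payload;
}
```

Serializing field by field avoids depending on the compiler's struct padding or the host's byte order.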

## Alternatives Considered

There is currently another implementation in use by Google that leverages the
same mechanisms; however, it's not as flexible because every command is a custom
piece. Mapping it into blob primitives allows for easier future modification
while maintaining backwards compatibility (without simply adding a separate OEM
library to handle a new process, etc.).

## Impacts

This impacts security because it can leverage the memory mapped windows. There
is not an expected performance impact, as the blob handler's existence only
adds a couple of extra entries to the blob enumerate command's response.

## Testing

Where possible (nearly everywhere), mockable interfaces will be used such that
the entire process has individual unit-tests that verify flags are checked, as
well as states and sequences.

### Scenarios

#### Sending an image with a bad hash

A required functional test is one whereby an image is sent down to the BMC,
however the signature is invalid for that image. The expected result is that the
verification step will return failure and the files will be deleted from the BMC
without user intervention.

#### Sending an image with a good hash

A required functional test is one whereby an image is sent down to the BMC with
a valid signature. The expected result is that the verification step will return
success.

## Configuration

See the configuration section of
[Secure Flash Update Mechanism](https://github.com/openbmc/phosphor-ipmi-flash/blob/master/README.md).
