# In-Band Update of BMC Firmware (and others) using OEM IPMI Blob Transport

Author: Patrick Venture <venture!>

Primary assignee: Patrick Venture

Created: 2018-10-18

## Problem Description

The BMC needs a mechanism for receiving a new firmware image from the host
through a variety of mechanisms. This can best be served with one protocol into
which multiple approaches can be routed.

## Background and References

BMC hardware provides at a minimum some interface for sending and receiving IPMI
messages. This hardware may also provide regions that can be memory mapped for
higher speed communication between the BMC and the host. Certain infrastructures
do not provide network access to the BMC, therefore it is required to provide an
update mechanism that can be done in-band between the host and the BMC.

In-band here refers to a communications channel that is directly connected
between the host and BMC. Examples of such channels include:

1.  Serial
1.  IPMI over LPC
1.  IPMI over i2c
1.  LPC Memory-Mapped Region
1.  P2A bridge

## Primer

Please read the IPMI BLOB protocol design as a primer
[here](https://github.com/openbmc/phosphor-ipmi-blobs/blob/master/README.md).

## Requirements

The following statements are reflective of the initial requirements.

*   Any update mechanism must provide support for UBI tarballs and legacy
    (static layout) flash images. Leveraging the BLOB protocol allows a system
    to provide support for any image type simply by implementing a mechanism for
    handling it.

*   Any update mechanism must allow for triggering an image verification step
    before the image is used.

*   Any update mechanism must allow implementing the data staging via different
    in-band mechanisms.

*   Any update mechanism must provide a handshake or equivalent protocol for
    coordinating the data transfer. For instance, whether the BMC should enable
    the P2A bridge and what region to use or whether to turn on the LPC memory
    map bridge.

*   Any update mechanism must attempt to maintain security, insomuch as not
    leaving a memory region open by default. For example, before the
    verification step starts, the staged firmware image must no longer be
    accessible from the host.

## Proposed Design

OpenBMC supports a BLOB protocol that provides primitives. These primitives
allow a variety of different "handlers" to exist that implement those primitives
for specific "blobs." A blob in this context is a file path that is strictly
unique.

Sending the firmware image over the BLOB protocol will be done via routing the
[phosphor-ipmi-flash design](https://github.com/openbmc/phosphor-ipmi-flash/blob/master/README.md)
through a BLOB handler. This is meant to supplant `phosphor-ipmi-flash`'s
current approach to centralize on one flexible handler.

### Sequencing Control

To enforce sequencing control, the design requires that only one blob be open at
a time. If the verification blob is open, the other blobs cannot be opened, and
likewise if a client has a data blob open, the verification blob cannot be
opened.

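
The one-open-blob rule above could be enforced with a small guard object. The
following is a minimal sketch only; the `SessionGuard` class and its method
names are hypothetical and are not taken from `phosphor-ipmi-flash`.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical guard enforcing "only one blob open at a time".
class SessionGuard
{
  public:
    // Records the blob as open and returns true if nothing else is open.
    bool open(const std::string& blobId)
    {
        if (active_)
        {
            return false; // Another blob (data, hash or verify) is open.
        }
        active_ = blobId;
        return true;
    }

    // Returns true only if the given blob is the one currently open.
    bool close(const std::string& blobId)
    {
        if (!active_ || *active_ != blobId)
        {
            return false;
        }
        active_.reset();
        return true;
    }

  private:
    std::optional<std::string> active_; // Empty when no blob is open.
};
```

With this guard, an attempt to open `/flash/verify` while `/flash/image` is
still open is rejected until the image blob is closed.
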
### Defining Blobs

The BLOB protocol allows a handler to specify a list of blob ids. This list will
be leveraged to specify whether the platform supports either the legacy (static
layout) or the UBI mechanism, or both. The flags provided to the open command
identify the mechanism selected by the client-side. The stat command will return
the list of supported mechanisms for the blob.

The blob ids for the mechanisms will be as follows:

Flash Blob Id    | Type
---------------- | ---------------
`/flash/image`   | Static Layout
`/flash/tarball` | UBI
`/flash/bios`    | Host BIOS image

The flash handler will determine what commands it should expect to receive and
responses it will return given the blob opened, based on the flags provided to
open.

The flash handler will only allow one of the above blobs to be opened for a
sequence of commands, such that you cannot open `/flash/image` and then open
`/flash/bios` without completing (or later aborting) the first update process
started.

The following blob id is defined for storing the hash for the image:

Hash Blob Id  | Mechanism
------------- | -------------------------------
`/flash/hash` | Whichever flash blob was opened

The flash handler will only allow one open file at a time, such that if the host
attempts to send a firmware image down over IPMI BlockTransfer, it won't allow
the host to start a PCI send until the BlockTransfer file is closed.

There is only one hash "file" mechanism. The exact hash used will only be
important to your verification service. The value provided will be written to a
known place.

When a transfer is active, the handler will create the blob ids
`/flash/active/image` and `/flash/active/hash`.

#### Verification Blob

The following blob id is defined once the image or hash upload has started. Its
purpose is to trigger and monitor the firmware verification process. Therefore,
the BmcBlobOpen command will fail until both the hash and image file are closed.
The ideal command sequence is described further below.

Trigger Blob    | Note
--------------- | ------------------------
`/flash/verify` | Verify Trigger Mechanism

When the verification file is closed, if verification was completed
successfully, it'll add an update blob id, defined below.

The verification process used is not defined by this design.

#### Update Blob

The update blob id is available once `/flash/verify` is closed with a valid image
or tarball. The update blob needs to be opened and commit() called on that blob
id to trigger the update mechanism.

The update process can be checked periodically by calling stat() on the update
blob id.

Update Blob     | Note
--------------- | ------------------------
`/flash/update` | Trigger Update Mechanism

The update process used is not defined by this design.

#### Cleanup Blob

The cleanup blob id is always present. The goal of this blob is to handle
deletion of update artifacts on failure or success. It can be implemented to do
any manner of cleanup required, and it is particularly convenient for systems
under memory pressure.

The cleanup blob has no state or knowledge and is meant to provide a simple
system cleanup mechanism. This could also be accomplished by warm rebooting the
BMC. The cleanup blob will delete a list of files. The cleanup blob has no state
recognition for the update process, and therefore can interfere with an update
process. The host tool will only use it on failure cases. Any other tool
developed should respect this and not employ it unless the goal is to clean up
artifacts.

To trigger the cleanup, simply open the blob, commit, and close. It has no
knowledge of the update process. This simplicity is deliberate, because the blob
is designed as a convenience mechanism rather than a required one.

Cleanup Blob     | Note
---------------- | -------------------------
`/flash/cleanup` | Trigger Cleanup Mechanism

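
The way the blob id list changes over the course of an update, as described in
the subsections above, can be summarized in code. This is a rough sketch under
stated assumptions: the `UpdateState` structure and `listBlobs` function are
hypothetical, and the platform is assumed to support all three flash blobs.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical snapshot of the handler's state (illustrative field names).
struct UpdateState
{
    bool imageStarted = false; // A flash blob has been opened for upload.
    bool hashStarted = false;  // /flash/hash has been opened for upload.
    bool verifyPassed = false; // /flash/verify was closed after a success.
};

// Returns the blob ids that would be listed for the given state.
std::vector<std::string> listBlobs(const UpdateState& s)
{
    // Always present (assuming every flash blob type is supported).
    std::vector<std::string> ids = {"/flash/image", "/flash/tarball",
                                    "/flash/bios", "/flash/hash",
                                    "/flash/cleanup"};
    if (s.imageStarted)
    {
        ids.push_back("/flash/active/image");
    }
    if (s.hashStarted)
    {
        ids.push_back("/flash/active/hash");
    }
    if (s.imageStarted || s.hashStarted)
    {
        ids.push_back("/flash/verify"); // Defined once an upload has started.
    }
    if (s.verifyPassed)
    {
        ids.push_back("/flash/update"); // Added after successful verification.
    }
    return ids;
}
```
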
### Caching Images

Similarly to the OEM IPMI Flash protocol, the flash image will be staged in a
compile-time configured location.

Other mechanisms can readily be added by adding more blob ids or flags to the
handler.

### Commands

The update mechanism will expect a specific sequence of commands depending on
the transport mechanism selected. Some mechanisms require a handshake.

#### BlockTransfer Sequence

1.  Open (for image or tarball)
1.  Write
1.  Close
1.  Open (`/flash/hash`)
1.  Write
1.  Close
1.  Open (`/flash/verify`)
1.  Commit
1.  SessionStat (to read back verification status)
1.  Close
1.  Open (`/flash/update`)
1.  Commit
1.  SessionStat (to read back update status)
1.  Close

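
The BlockTransfer sequence above might look as follows from the host side. This
is a sketch under stated assumptions, not the real host tool: the `BlobClient`
interface, the `RecordingClient` test double and the helper functions are all
hypothetical, and opening the verify and update blobs with the same flags as the
data blobs is an assumption. A real tool would also sleep between status polls.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Flag values taken from the OpenFlags and FirmwareUpdateFlags enums.
constexpr std::uint16_t kWrite = 1 << 1;
constexpr std::uint16_t kBt = 1 << 8;

// Hypothetical host-side view of the blob primitives (not a library API).
struct BlobClient
{
    virtual ~BlobClient() = default;
    virtual std::uint16_t open(const std::string& id, std::uint16_t flags) = 0;
    virtual void write(std::uint16_t session,
                       const std::vector<std::uint8_t>& data) = 0;
    virtual void commit(std::uint16_t session) = 0;
    virtual std::uint8_t sessionStat(std::uint16_t session) = 0; // status byte
    virtual void close(std::uint16_t session) = 0;
};

// Test double that records the call order and always reports success.
struct RecordingClient : BlobClient
{
    std::vector<std::string> log;
    std::uint16_t open(const std::string& id, std::uint16_t) override
    {
        log.push_back("open " + id);
        return 1;
    }
    void write(std::uint16_t, const std::vector<std::uint8_t>&) override
    {
        log.push_back("write");
    }
    void commit(std::uint16_t) override { log.push_back("commit"); }
    std::uint8_t sessionStat(std::uint16_t) override
    {
        log.push_back("stat");
        return 0x01; // Success.
    }
    void close(std::uint16_t) override { log.push_back("close"); }
};

// Open, Write, Close.
void sendBlob(BlobClient& c, const std::string& id,
              const std::vector<std::uint8_t>& data)
{
    const std::uint16_t session = c.open(id, kWrite | kBt);
    c.write(session, data);
    c.close(session);
}

// Open, Commit, SessionStat until no longer running, Close.
bool trigger(BlobClient& c, const std::string& id)
{
    const std::uint16_t session = c.open(id, kWrite | kBt);
    c.commit(session);
    std::uint8_t status = 0x00;
    while ((status = c.sessionStat(session)) == 0x00) // 0x00: still running
    {
    }
    c.close(session);
    return status == 0x01;
}

bool updateOverBt(BlobClient& c, const std::vector<std::uint8_t>& image,
                  const std::vector<std::uint8_t>& hash)
{
    sendBlob(c, "/flash/image", image);
    sendBlob(c, "/flash/hash", hash);
    return trigger(c, "/flash/verify") && trigger(c, "/flash/update");
}
```

Running `updateOverBt` against the `RecordingClient` yields a log of fourteen
calls, one per step in the list above.
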
#### P2A Sequence

1.  Open (for image or tarball)
1.  SessionStat (P2A Region for P2A mapping)
1.  Write
1.  Close
1.  Open (`/flash/hash`)
1.  SessionStat (P2A Region)
1.  Write
1.  Close
1.  Open (`/flash/verify`)
1.  Commit
1.  SessionStat (to read back verification status)
1.  Close
1.  Open (`/flash/update`)
1.  Commit
1.  SessionStat (to read back update status)
1.  Close

#### LPC Sequence

1.  Open (for image or tarball)
1.  WriteMeta (specify region information from host for LPC)
1.  SessionStat (verify the contents from the above)
1.  Write
1.  Close
1.  Open (`/flash/hash`)
1.  WriteMeta (LPC Region)
1.  SessionStat (verify LPC config)
1.  Write
1.  Close
1.  Open (`/flash/verify`)
1.  Commit
1.  SessionStat (to read back verification status)
1.  Close
1.  Open (`/flash/update`)
1.  Commit
1.  SessionStat (to read back update status)
1.  Close

### Stale Images

If an image update process is started but goes stale, there are multiple
mechanisms in place to ensure cleanup. If a session is left open after the blob
timeout period, it'll be closed. Because expiration is not the same action as
closing, the cache will be flushed and any staged pieces deleted.

The image itself, in legacy (static layout) mode, will be placed and named in
such a way that it will disappear if the BMC reboots. In the UBI case, the file
will be stored in `/tmp` and deleted accordingly.

At any point during the upload process, one can abort by closing the open blobs
and deleting them by name.

### Blob Primitives

The update mechanism will implement the Blob primitives as follows.

#### BmcBlobOpen

The blob open primitive allows supplying blob specific flags. These flags are
used for specifying the transport mechanism. To obtain the list of supported
mechanisms on a platform, see the `BmcBlobStat` command below.

```
enum OpenFlags
{
    read = (1 << 0),
    write = (1 << 1),
};

/* These bits start in the blob specific range of the flags. */
enum FirmwareUpdateFlags
{
    bt = (1 << 8),   /* Expect to send contents over IPMI BlockTransfer. */
    p2a = (1 << 9),  /* Expect to send contents over P2A bridge. */
    lpc = (1 << 10), /* Expect to send contents over LPC bridge. */
};
```

An open request must specify that it is opening for writing and one transport
mechanism, otherwise it is rejected. If the request is also set for reading,
this is not rejected but currently provides no additional value.

Once opened, a new file will appear in the blob_id list (for both the image and
hash) indicating they are in progress. The names will be `/flash/active/image`
and `/flash/active/hash`, which have no meaning beyond representing the current
update in progress. Closing the file does not delete the staged images. Only
delete will.

***Note*** The active image blob_ids cannot be opened. This can be reconsidered
later.

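
The open-flag rule above (writing plus one transport mechanism) could be checked
as in the following sketch. The constant and function names are hypothetical,
and reading "one transport mechanism" as "exactly one" is an assumption. Bits
outside the read, write and transport flags are ignored here.

```cpp
#include <cassert>
#include <cstdint>

// Bit values copied from the OpenFlags and FirmwareUpdateFlags enums.
constexpr std::uint16_t kOpenRead = 1 << 0;
constexpr std::uint16_t kOpenWrite = 1 << 1;
constexpr std::uint16_t kBt = 1 << 8;
constexpr std::uint16_t kP2a = 1 << 9;
constexpr std::uint16_t kLpc = 1 << 10;
constexpr std::uint16_t kTransportMask = kBt | kP2a | kLpc;

// Returns true if the flags request writing and exactly one transport.
constexpr bool validOpenFlags(std::uint16_t flags)
{
    if ((flags & kOpenWrite) == 0)
    {
        return false; // Opening for writing is mandatory.
    }
    const std::uint16_t transport = flags & kTransportMask;
    // A value with exactly one bit set is non-zero and shares no bits with
    // itself minus one.
    return transport != 0 && (transport & (transport - 1)) == 0;
}
```
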
#### BmcBlobRead

This will initially not perform any function and will return success with 0
bytes.

#### BmcBlobWrite

The write command's contents will depend on the transport mechanism. This
command must not return until it has copied the data out of the mapped region
into either a staging buffer or written down to a staging file. How the command
reads from the mapped region is beyond the scope of this design.

##### If BT

The data section of the payload is only data.

##### If P2A

The data section of the payload is the following structure:

```
struct ExtChunkHdr
{
    uint32_t length; /* Length of the data queued (little endian). */
};
```

##### If LPC

The data section of the payload is the following structure:

```
struct ExtChunkHdr
{
    uint32_t length; /* Length of the data queued (little endian). */
};
```

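
For the P2A and LPC cases, the handler has to decode the little-endian `length`
field from the write payload before copying from the mapped region. A minimal
sketch of that decoding step follows; the `parseChunkLength` function name is
hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Decodes the little-endian ExtChunkHdr length from a write payload.
// Returns std::nullopt when the payload is too short to hold the header.
std::optional<std::uint32_t>
    parseChunkLength(const std::vector<std::uint8_t>& payload)
{
    constexpr std::size_t headerSize = sizeof(std::uint32_t);
    if (payload.size() < headerSize)
    {
        return std::nullopt;
    }
    std::uint32_t length = 0;
    for (std::size_t i = 0; i < headerSize; ++i)
    {
        // Byte 0 is the least significant byte.
        length |= static_cast<std::uint32_t>(payload[i]) << (8 * i);
    }
    return length;
}
```

Decoding byte by byte keeps the result independent of the BMC's own byte order.
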
#### BmcBlobCommit

If this command is called on the session of the firmware image itself, nothing
will happen at present. It will return a no-op success.

If this command is called on the session for the hash image, nothing will happen
at present. It will return a no-op success.

If this command is called on the session for the verify blob id, it'll trigger a
systemd service `verify_image.service` to attempt to verify the image. Before
doing this, if the transport mechanism is not IPMI BT, it'll shut down the
mechanism used for transport, preventing the host from updating anything.

Once verification has started, only the BmcBlobSessionStat command will respond.
Details on that response are below under BmcBlobSessionStat.

#### BmcBlobClose

Close must be called on the firmware image and the hash file before opening the
verify blob.

If the `verify_image.service` returned success, closing the verify file will
have a specific behavior depending on the update. If it's UBI, it'll perform the
install. If it's legacy (static layout), it'll do nothing. The verify_image
service in the legacy case is responsible for placing the file in the correct
staging position. A BMC warm reset command will initiate the firmware update
process.

If the image verification fails, it will automatically delete any files
associated with the update.

***Note:*** During development testing, a developer will want to upload files
that are not signed. Therefore, an additional bit will be added to the flags to
change this behavior.

#### BmcBlobDelete

Aborts any update that's in progress:

1.  Stops the verify_image.service if started.
1.  Deletes any staged files.

In the event the update is already in progress, such as when the tarball
mechanism is used and is in the middle of updating the files, it cannot be
aborted.

#### BmcBlobStat

Blob stat on a blob_id (not SessionStat) will return the capabilities of the
blob_id handler.

```
struct BmcBlobStatRx {
    uint16_t crc16;
    /* This will have the bits set from the FirmwareUpdateFlags enum. */
    uint16_t blob_state;
    uint32_t size; /* 0 - it's set to zero when there's no session */
    uint8_t  metadata_len; /* 0 */
};
```

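
A host tool could use the `blob_state` field of this response to discover which
transports a platform supports before choosing open flags. The following sketch
shows the idea; the `supportedTransports` function is hypothetical, and the bit
positions are those of the `FirmwareUpdateFlags` enum.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Lists the transports advertised in a BmcBlobStat blob_state value.
std::vector<std::string> supportedTransports(std::uint16_t blobState)
{
    std::vector<std::string> names;
    if (blobState & (1 << 8))
    {
        names.push_back("bt"); // IPMI BlockTransfer
    }
    if (blobState & (1 << 9))
    {
        names.push_back("p2a"); // P2A bridge
    }
    if (blobState & (1 << 10))
    {
        names.push_back("lpc"); // LPC bridge
    }
    return names;
}
```
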
#### BmcBlobSessionStat

If called pre-commit, it'll return the following information:

```
struct BmcBlobStatRx {
    uint16_t crc16;
    uint16_t blob_state; /* OpenFlags::write | (one of the interfaces) */
    uint32_t size; /* Size in bytes so far written */
    uint8_t  metadata_len; /* 0. */
};
```

If it's called and the data transport mechanism is P2A, it'll return a 32-bit
address for use to configure the P2A region as part of the metadata portion of
the `BmcBlobStatRx`.

```
struct P2ARegion {
    uint32_t address; /* Address used to configure the P2A region. */
};

struct BmcBlobStatRx {
    uint16_t crc16;
    uint16_t blob_state; /* OpenFlags::write | (one of the interfaces) */
    uint32_t size; /* Size in bytes so far written */
    uint8_t  metadata_len; /* sizeof(struct P2ARegion) */
    struct P2ARegion metadata;
};
```

If called post-commit on the verify file session, it'll return:

```
struct BmcBlobStatRx {
    uint16_t crc16;
    uint16_t blob_state; /* OpenFlags::write | (one of the interfaces) */
    uint32_t size; /* Size in bytes so far written */
    uint8_t  metadata_len; /* 1. */
    uint8_t  verify_response; /* one byte from the below enum */
};

enum VerifyCheckResponses
{
    VerifyRunning = 0x00,
    VerifySuccess = 0x01,
    VerifyFailed  = 0x02,
    VerifyOther   = 0x03,
};
```

If called post-commit on the update file session, it'll return:

```
struct BmcBlobStatRx {
    uint16_t crc16;
    uint16_t blob_state; /* OpenFlags::write | (one of the interfaces) */
    uint32_t size; /* Size in bytes so far written */
    uint8_t  metadata_len; /* 1. */
    uint8_t  update_response; /* one byte from the below enum */
};

enum UpdateStatus
{
    UpdateRunning = 0x00,
    UpdateSuccessful = 0x01,
    UpdateFailed = 0x02,
    UpdateStatusUnknown = 0x03
};
```

The `UpdateStatus` and `VerifyCheckResponses` are currently identical, but this
may change over time.

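
On the host side, the single metadata byte of a post-commit SessionStat response
has to be mapped back to a status. The sketch below does this for the verify
case. The `decodeVerifyStatus` function is hypothetical, the enum is redeclared
as a scoped enum for the example, and treating malformed metadata as
`VerifyOther` is an assumption rather than part of this design.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Same values as the VerifyCheckResponses enum above.
enum class VerifyCheckResponses : std::uint8_t
{
    VerifyRunning = 0x00,
    VerifySuccess = 0x01,
    VerifyFailed = 0x02,
    VerifyOther = 0x03,
};

// Maps the metadata bytes of a post-commit SessionStat to a status.
VerifyCheckResponses decodeVerifyStatus(const std::vector<std::uint8_t>& metadata)
{
    // Assumption: anything other than one in-range byte is reported as Other.
    if (metadata.size() != 1 || metadata[0] > 0x03)
    {
        return VerifyCheckResponses::VerifyOther;
    }
    return static_cast<VerifyCheckResponses>(metadata[0]);
}
```

A polling loop would keep calling SessionStat while the decoded value is
`VerifyRunning` and stop on any other value.
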
#### BmcBlobWriteMeta

The write metadata command is meant to allow the host to provide specific
configuration data to the BMC for the in-band update. Currently that is only
aimed at LPC which needs to be told the memory address so it can configure the
window.

The write meta command's blob will be this structure:

```
struct LpcRegion
{
    uint32_t address; /* Host LPC address where the chunk is to be mapped. */
    uint32_t length; /* Size of the chunk to be mapped. */
};
```

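
A host tool would need to serialize `LpcRegion` into the WriteMeta payload. The
sketch below packs both fields as little-endian values. Note that this design
only states little endian for `ExtChunkHdr`, so using the same byte order here
is an assumption, and the `packLpcRegion` function name is hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

struct LpcRegion
{
    std::uint32_t address; // Host LPC address where the chunk is mapped.
    std::uint32_t length;  // Size of the chunk to be mapped.
};

// Serializes the region as two 32-bit values, least significant byte first
// (byte order assumed, see above).
std::array<std::uint8_t, 8> packLpcRegion(const LpcRegion& region)
{
    std::array<std::uint8_t, 8> out{};
    for (std::size_t i = 0; i < 4; ++i)
    {
        out[i] = static_cast<std::uint8_t>(region.address >> (8 * i));
        out[4 + i] = static_cast<std::uint8_t>(region.length >> (8 * i));
    }
    return out;
}
```
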
## Alternatives Considered

There is currently another implementation in use by Google that leverages the
same mechanisms; however, it's not as flexible because every command is a custom
piece. Mapping it into blob primitives allows for easier future modification
while maintaining backwards compatibility (without simply adding a separate OEM
library to handle a new process, etc).

## Impacts

This impacts security because it can leverage the memory mapped windows. There
is not an expected performance impact, as the existence of the blob handler only
adds a couple of extra entries to the blob enumerate command's response.

## Testing

Where possible (nearly everywhere), mockable interfaces will be used such that
the entire process has individual unit-tests that verify flags are checked, as
well as states and sequences.

### Scenarios

#### Sending an image with a bad hash

A required functional test is one whereby an image is sent down to the BMC,
however the signature is invalid for that image. The expected result is that the
verification step will return failure and the files will be deleted from the BMC
without user intervention.

#### Sending an image with a good hash

A required functional test is one whereby an image is sent down to the BMC with
a valid signature. The expected result is that the verification step will return
success.

## Configuration

See the configuration section of
[Secure Flash Update Mechanism](https://github.com/openbmc/phosphor-ipmi-flash/blob/master/README.md).