Copyright 2017 IBM

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

## Intro

This document describes a protocol for host to BMC communication via the
mailbox registers present on the Aspeed 2400 and 2500 chips.
This protocol is specifically designed to allow a host to request and manage
access to the flash. How the host is required to control that access is
described below.

## Version

Both version 1 and version 2 of the protocol are described below. Anything
specific to version 2 is marked with V2 in brackets - (V2).

## Problem Overview

"mbox" is the name we use to represent a protocol we have established between
the host and the BMC via the Aspeed mailbox registers. This protocol is used
for the host to control the flash.

Prior to the mbox protocol, the host used a backdoor into the BMC address space
(the iLPC-to-AHB bridge) to directly manipulate the BMC's own flash controller.

This is not sustainable for a number of reasons. The main ones are:

1. Every piece of the host software stack that needs flash access (HostBoot,
   OCC, OPAL, ...) has to have a complete driver for the flash controller,
   update it on each BMC generation, have all the quirks for all the flash
   chips supported etc... We have 3 copies on the host already in addition to
   the one in the BMC itself.

2. There are serious issues of access conflicts to that controller between the
   host and the BMC.

3. It's very hard to support "BMC reboots" when doing that.

4. It's slow.

5. Last but probably most important, having that backdoor open is a security
   risk. It means the host can access any address on the BMC internal bus and
   implant malware in the BMC itself. So if the host is a "bare metal" shared
   system in some kind of data center, not only the host flash needs to be
   reflashed when switching from one customer to another, but the entire BMC
   flash too as nothing can be trusted. So we want to disable it.

To address all these, we have implemented a new mechanism that we call mbox.

When using this mechanism, the BMC is solely responsible for directly accessing
the flash controller. All flash erase and write operations are performed by the
BMC and the BMC only. (We can allow direct reads from flash under some
circumstances but we tend to prefer going via memory).

The host uses the mailbox registers to send "commands" to the BMC, which
responds via the same mechanism. Those commands allow the host to control a
"window" (which is the LPC -> AHB FW space mapping) that is either a read
window or a write window onto the flash.

When set for writing, the BMC makes the window point to a chunk of RAM instead.
When the host "commits" a change (via MBOX), then the BMC can perform the
actual flashing from the data in the RAM window.

The idea is to have the LPC FW space be routed to an active "window". That
window can be a read or a write window. The commands allow the host to control
which window is active and which offset into the flash it maps.

* A read window can be a direct window to the flash controller space (i.e.
  0x3000\_0000) or it can be a window to a RAM image of a flash. It doesn't have
  to be the full size of the flash per protocol (commands can be used to "slide"
  it to various parts of the flash) but if it's set to map the actual flash
  controller space at 0x3000\_0000, it's probably simpler to make it the full
  flash. The host makes no assumption, it's your choice what to provide. The
  simplest implementation is to just route to the flash read-only.

* A write window has to be a chunk of BMC memory. The minimum size is not
  defined in the spec, but it should be at least one block (4k for now but it
  should support larger block sizes in the future). When the BMC receives the
  command to map the write window at a given offset of the flash, the BMC should
  copy that portion of the flash into a reserved memory buffer, and modify the
  LPC mapping to point to that buffer.

The host can then write to that window directly (updating the BMC memory) and
send a command to "commit" those updates to flash.

Finally there is a `RESET_STATE`. It's the state in which the bootloader in the
SEEPROM of the POWER9 chip will find what it needs to load HostBoot. The
details are still being ironed out: either mapping the full flash read only or
reset to a "window" that is either at the bottom or top of the flash. The
current implementation resets to point to the full flash.

## Where is the code?

The mbox userspace is available [on GitHub](https://github.com/openbmc/mboxbridge).
This is Apache licensed but we are keen to see any enhancements you may have.

The kernel driver is still in the process of being upstreamed but can be found
in the OpenBMC Linux kernel staging tree:

https://github.com/openbmc/linux/commit/85770a7d1caa6a1fa1a291c33dfe46e05755a2ef

## Building

The autotools build of this project requires the autoconf-archive package for
your system.

## The Hardware

The Aspeed mailbox consists of 16 (8 bit) data registers; see Layout for their
use. Mailbox interrupt enabling, masking and triggering is done using a pair
of control registers, one accessible by the host the other by the BMC.
Interrupts can also be raised per write to each data register, for BMC and
host. Write triggered interrupts are configured using two 8 bit registers where
each bit represents a data register and if an interrupt should fire on write.
Two 8 bit registers are present to act as a mask for write triggered
interrupts.

### Layout

```
Byte 0: COMMAND
Byte 1: Sequence
Byte 2-12: Arguments
Byte 13: Response code
Byte 14: Host controlled status reg
Byte 15: BMC controlled status reg
```

## Low Level Protocol Flow

What we essentially have is a set of registers which either the host or BMC can
write to in order to communicate to the other which will respond in some way.
There are 3 basic types of communication.

1. Commands sent from the Host to the BMC
2. Responses sent from the BMC to the Host in response to commands
3. Asynchronous events raised by the BMC

### General Use

Messages usually originate from the host to the BMC. There are special
cases for a back channel for the BMC to pass new information to the
host which will be discussed later.

To initiate a request the host must set a command code (see
Commands) into mailbox data register 0. It is also the host's
responsibility to generate a unique sequence number into mailbox
register 1. After this any command specific data should be written
(see Layout). The host must then generate an interrupt to the BMC by
using bit 0 of its control register and wait for an interrupt on the
response register. Generating an interrupt automatically sets bit 7 of the
corresponding control register. This bit can be used to poll for
messages.
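
The host side of this exchange can be sketched as follows. This is only an illustration of the register layout given under Layout: the names `build_request` and `SequenceCounter` are hypothetical, and the wrapping one-byte counter is just one assumed way of producing sequence numbers. Triggering the interrupt through the control register is hardware specific and is not modelled here.

```python
"""Sketch: packing a host request into the 16 mailbox data registers."""

MBOX_REG_COUNT = 16   # 16 one-byte data registers
CMD_REG = 0           # Byte 0: command
SEQ_REG = 1           # Byte 1: sequence number
ARGS_START = 2        # Bytes 2-12: arguments
ARGS_LEN = 11
RESPONSE_REG = 13     # Byte 13: response code, filled in by the BMC


def build_request(command: int, sequence: int, args: bytes = b"") -> bytearray:
    """Return the register contents the host would write for one command."""
    if not 0 <= command <= 0xFF or not 0 <= sequence <= 0xFF:
        raise ValueError("command and sequence must each fit in one byte")
    if len(args) > ARGS_LEN:
        raise ValueError("at most 11 argument bytes fit in registers 2-12")
    regs = bytearray(MBOX_REG_COUNT)
    regs[CMD_REG] = command
    regs[SEQ_REG] = sequence
    # Same-length slice assignment, so the register count stays at 16.
    regs[ARGS_START:ARGS_START + len(args)] = args
    return regs


class SequenceCounter:
    """Hypothetical helper: hands out sequence numbers that wrap at one byte."""

    def __init__(self) -> None:
        self._next = 0

    def next(self) -> int:
        value = self._next
        self._next = (self._next + 1) & 0xFF
        return value
```

For example, `build_request(0x02, seq.next(), bytes([2]))` would describe a GET_MBOX_INFO request advertising API version 2.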

On receiving an interrupt (or polling on bit 7 of its Control
Register) the BMC should read the message from the general registers
of the mailbox and perform the necessary action before responding. On
responding the BMC must ensure that the sequence number is the same as
the one in the request from the host. The BMC must also ensure that
mailbox data register 13 is a valid response code (see Responses). The
BMC should then use its control register to generate an interrupt for
the host to notify it of a response.

### Asynchronous BMC to Host Events

BMC to host communication is also possible for notification of events
from the BMC. This requires that the host have interrupts enabled on
mailbox data register 15 (or otherwise poll on bit 7 of mailbox status
register 1). On receiving such a notification the host should read
mailbox data register 15 to determine the event code which was set by the
BMC (see BMC Event notifications in Commands for detail). Events which are
defined as being able to be acknowledged by the host must be acknowledged
with a BMC_EVENT_ACK command.

## High Level Protocol Flow

When a host wants to communicate with the BMC via the mbox protocol the first
thing it should do is call GET_MBOX_INFO in order to establish the protocol
version which each understands. Before this the only other commands which are
allowed are RESET_STATE and BMC_EVENT_ACK.

After this the host can open and close windows with the CREATE_READ_WINDOW,
CREATE_WRITE_WINDOW and CLOSE_WINDOW commands. Creating a window is how the
host requests access to a section of flash. It is worth noting that the host
can only ever have one window that it is accessing at a time - henceforth
referred to as the active window.
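
The ordering rule above can be expressed as a small gate on the host side. This is a minimal sketch: the command codes are taken from the Commands section, while `command_allowed` and the use of `None` for "no version negotiated yet" are assumptions made for illustration.

```python
from typing import Optional

# Command codes as listed in the Commands section.
RESET_STATE = 0x01
GET_MBOX_INFO = 0x02
GET_FLASH_INFO = 0x03
CREATE_READ_WINDOW = 0x04
CLOSE_WINDOW = 0x05
CREATE_WRITE_WINDOW = 0x06
MARK_WRITE_DIRTY = 0x07
WRITE_FLUSH = 0x08
BMC_EVENT_ACK = 0x09
MARK_WRITE_ERASED = 0x0A  # V2 only

# Commands permitted before a protocol version has been agreed.
ALWAYS_ALLOWED = frozenset({RESET_STATE, GET_MBOX_INFO, BMC_EVENT_ACK})
V1_COMMANDS = frozenset(range(RESET_STATE, BMC_EVENT_ACK + 1))
V2_COMMANDS = V1_COMMANDS | {MARK_WRITE_ERASED}


def command_allowed(command: int, version: Optional[int]) -> bool:
    """Return True if the host may send `command` in its current state.

    `version` is None until GET_MBOX_INFO has succeeded (an assumption of
    this sketch), otherwise the agreed protocol version.
    """
    if version is None:
        return command in ALWAYS_ALLOWED
    if version == 1:
        return command in V1_COMMANDS
    if version == 2:
        return command in V2_COMMANDS
    raise ValueError("only protocol versions 1 and 2 are described here")
```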

When the active window is a write window the host can perform MARK_WRITE_DIRTY,
MARK_WRITE_ERASED and WRITE_FLUSH commands to identify changed blocks and
control when the changed blocks are written to flash.

Independently, and at any point not during an existing mbox command
transaction, the BMC may raise asynchronous events with the host to
communicate a change in state.

### Version Negotiation

Given that a majority of command and response arguments are specified as a
multiple of block size it is necessary for the host and BMC to agree on a
protocol version as this determines the block size. In V1 it is hard coded at
4K and in V2 the BMC chooses and specifies this to the host as a response
argument to `GET_MBOX_INFO`. Thus the host must always call `GET_MBOX_INFO`
before any other command which specifies an argument in block size.

When invoking `GET_MBOX_INFO` the host must provide the BMC its highest
supported version of the protocol. The BMC must respond with a protocol version
less than or equal to that requested by the host, or in the event that there is
no such value, an error code. In the event that an error is returned the host
must not continue to communicate with the BMC. Otherwise, the protocol version
returned by the BMC is the agreed protocol version for all further
communication. The host may at a future point request a change in protocol
version by issuing a subsequent `GET_MBOX_INFO` command.

### Window Management

In order to access flash contents the host must request a window be opened at
the flash offset it would like to access. The host may give a hint as to how
much data it would like to access or otherwise set this argument to zero. The
BMC must respond with the LPC bus address to access this window and the
window size. The host must not access past the end of the active window.
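
As an illustration of the bounds rule, the sketch below turns a flash byte offset into an LPC byte address using the values a V2 window response carries (LPC address, window size and actual flash offset, all in blocks, as described under Commands in detail). The class name `ActiveWindow`, its field names and the idea of passing the block size as a shift are assumptions of this example, not part of the protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActiveWindow:
    lpc_block: int    # LPC bus address of the window, in blocks
    size_blocks: int  # actual window size, in blocks
    flash_block: int  # flash offset the window actually maps, in blocks
    block_shift: int  # block size is 1 << block_shift bytes

    def lpc_address(self, flash_offset: int, length: int) -> int:
        """Map a flash byte range onto the LPC bus, refusing any access
        that is not fully contained in the active window."""
        block_size = 1 << self.block_shift
        start = self.flash_block * block_size
        end = start + self.size_blocks * block_size
        if length <= 0 or flash_offset < start or flash_offset + length > end:
            raise ValueError("access falls outside the active window")
        return self.lpc_block * block_size + (flash_offset - start)
```

With 4K blocks (`block_shift=12`), a window reported at LPC block 0xFF00 that maps flash block 0x100 places flash byte 0x100010 at LPC address 0x0FF00010.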

There is only ever one active window which is the window created by the most
recent CREATE_READ_WINDOW or CREATE_WRITE_WINDOW call which succeeded. Even
though there are two types of windows there can still only be one active window
irrespective of type. A host must not write to a read window. A host may read
from a write window and the BMC must guarantee that the window reflects what
the host has written there.

A window can be closed by calling CLOSE_WINDOW in which case there is no active
window and the host must not access the LPC window after it has been closed.
If the host closes an active write window then the BMC must perform an
implicit flush. If the host tries to open a new window with an already active
window then the active window is closed (and implicitly flushed if it was a
write window). If the new window is successfully opened then it is the new
active window, if the command fails then there is no active window and the
previous active window must no longer be accessed.

The host must not access an LPC address other than that which is contained by
the active window. The host must not use write management functions (see below)
if the active window is a read window or if there is no active window.

### Write Management

The BMC has no method for intercepting writes that occur over the LPC bus. Thus
the host must explicitly notify the BMC of where and when a write has
occurred. The host must use the MARK_WRITE_DIRTY command to tell the BMC where
within the write window it has modified. The host may also use the
MARK_WRITE_ERASED command to erase large parts of the active window without the
need to write 0xFF. The BMC must ensure that if the host
reads from an area it has erased that the read values are 0xFF. Any part of the
active window marked dirty/erased is only marked for the lifetime of the current
active write window and does not persist if the active window is closed either
implicitly or explicitly by the host or the BMC. The BMC may at any time,
and must on a call to WRITE_FLUSH, flush the changes which it has been notified
of back to the flash, at which point the dirty or erased marking is cleared
for the active window. The host must not assume that any changes have been
written to flash unless an explicit flush call was successful, a close of an
active write window was successful or a create window command with an active
write window was successful - otherwise consistency between the flash and memory
contents cannot be guaranteed.

The host is not required to perform an erase before a write command and the
BMC must ensure that a write performs as expected - that is if an erase is
required before a write then the BMC must perform this itself.

### BMC Events

The BMC can raise events with the host asynchronously to communicate to the
host a change in state which it should take notice of. The host must (if
possible for the given event) acknowledge it to inform the BMC it has been
received.

If the BMC raises a BMC Reboot event then the host must renegotiate the
protocol version so that both the BMC and the host agree on the block size.
A BMC Reboot event implies a BMC Windows Reset event.
If the BMC raises a BMC Windows Reset event then the host must
assume that there is no longer an active window - that is if there was an
active window it has been closed by the BMC and if it was a write window
then the host must not assume that it was flushed unless a previous explicit
flush call was successful.
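
One way a host might track these rules is sketched below. The bit values come from the Bit Definitions section; the `HostState` class, its fields and the `needs_rewrite` flag are hypothetical bookkeeping invented for this example, not something the protocol defines.

```python
from dataclasses import dataclass
from typing import Optional

BMC_REBOOT = 0x01         # ackable event bit
BMC_WINDOWS_RESET = 0x02  # ackable event bit (V2)


@dataclass
class HostState:
    version: Optional[int] = None        # None means negotiation is required
    active_window: Optional[str] = None  # "read", "write" or None
    dirty: bool = False                  # writes made since the last successful flush
    needs_rewrite: bool = False          # unflushed writes may have been lost

    def on_bmc_event(self, status: int) -> int:
        """Apply the event bits in `status`; return the bits to acknowledge."""
        reboot = bool(status & BMC_REBOOT)
        # A BMC Reboot implies a BMC Windows Reset.
        windows_reset = reboot or bool(status & BMC_WINDOWS_RESET)
        if reboot:
            self.version = None  # protocol version must be renegotiated
        if windows_reset:
            if self.active_window == "write" and self.dirty:
                # No successful flush covered these writes, so they
                # cannot be assumed to have reached flash.
                self.needs_rewrite = True
            self.active_window = None
            self.dirty = False
        # Only acknowledge ackable bits that were actually raised.
        return status & (BMC_REBOOT | BMC_WINDOWS_RESET)
```

The returned mask would be passed as the argument of a BMC_EVENT_ACK command.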

The BMC may at some points require access to the flash and the BMC daemon must
set the BMC Flash Control Lost event when the BMC is accessing the flash behind
the BMC daemon's back. When this event is set the host must assume that the
contents of the active window could be inconsistent with the contents of flash.

## Protocol Definition

### Commands

```
RESET_STATE          0x01
GET_MBOX_INFO        0x02
GET_FLASH_INFO       0x03
CREATE_READ_WINDOW   0x04
CLOSE_WINDOW         0x05
CREATE_WRITE_WINDOW  0x06
MARK_WRITE_DIRTY     0x07
WRITE_FLUSH          0x08
BMC_EVENT_ACK        0x09
MARK_WRITE_ERASED    0x0a (V2)
```

### Sequence

The host must ensure a unique sequence number at the start of a
command/response pair. The BMC must ensure the responses to
a particular message contain the same sequence number that was in the
command request from the host.

### Responses

```
SUCCESS       1
PARAM_ERROR   2
WRITE_ERROR   3
SYSTEM_ERROR  4
TIMEOUT       5
BUSY          6 (V2)
WINDOW_ERROR  7 (V2)
```

#### Description:

SUCCESS - Command completed successfully

PARAM_ERROR - Error with parameters supplied or command invalid

WRITE_ERROR - Error writing to the backing file system

SYSTEM_ERROR - Error in BMC performing system action

TIMEOUT - Timeout in performing action

BUSY - Daemon in suspended state (currently unable to access flash)
     - Retry again later

WINDOW_ERROR - Command not valid for active window or no active window
             - Try opening an appropriate window and retrying the command

### Information
- All multibyte messages are LSB first (little endian)
- All responses must have a valid return code in byte 13


### Commands in detail

Note: in V1 the block size is hard coded to 4K; in V2 it is variable and must
be queried with GET_MBOX_INFO.
Sizes and addresses are specified in either bytes - (bytes)
                                  or blocks - (blocks)
Sizes and addresses specified in blocks must be converted to bytes by
multiplying by the block size.
```
Command:
    RESET_STATE
    Implemented in Versions:
        V1, V2
    Arguments:
        -
    Response:
        -
    Notes:
        This command is designed to inform the BMC that it should put
        host LPC mapping back in a state where the SBE will be able to
        use it. Currently this means pointing back to BMC flash
        pre mailbox protocol. Final behaviour is still TBD.

Command:
    GET_MBOX_INFO
    Implemented in Versions:
        V1, V2
    Arguments:
        V1:
        Args 0: API version

        V2:
        Args 0: API version

    Response:
        V1:
        Args 0: API version
        Args 1-2: default read window size (blocks)
        Args 3-4: default write window size (blocks)

        V2:
        Args 0: API version
        Args 1-2: reserved
        Args 3-4: reserved
        Args 5: Block size as power of two (encoded as a shift)

Command:
    GET_FLASH_INFO
    Implemented in Versions:
        V1, V2
    Arguments:
        -
    Response:
        V1:
        Args 0-3: Flash size (bytes)
        Args 4-7: Erase granule (bytes)

        V2:
        Args 0-1: Flash size (blocks)
        Args 2-3: Erase granule (blocks)

Command:
    CREATE_{READ/WRITE}_WINDOW
    Implemented in Versions:
        V1, V2
    Arguments:
        V1:
        Args 0-1: Window location as offset into flash (blocks)

        V2:
        Args 0-1: Window location as offset into flash (blocks)
        Args 2-3: Requested window size (blocks)

    Response:
        V1:
        Args 0-1: LPC bus address of window (blocks)

        V2:
        Args 0-1: LPC bus address of window (blocks)
        Args 2-3: Actual window size (blocks)
        Args 4-5: Actual window location as offset into flash (blocks)
    Notes:
        Window location is always given as an offset into flash as
        taken from the start of flash - that is it is an absolute
        address.

        LPC bus address is always given from the start of the LPC
        address space - that is it is an absolute address.

        The requested window size is only a hint. The response
        indicates the actual size of the window. The BMC may
        want to use the requested size to pre-load the remainder
        of the request. The host must not access past the end of the
        active window.

        The actual window location indicates the absolute flash offset
        that the window actually maps and is not required to be equal
        to the flash offset requested by the host, but must be
        less than or equal to it. Thus the first block of the window at
        the LPC address in the response will map the first block at the
        actual flash offset also contained in the response. It is the
        responsibility of the host to use this information to access
        any offset which is required.

        The requested window size may be zero. In this case the
        BMC is free to create any sized window but it must contain
        at least the first block of data requested by the host. A large
        window is of course preferred and should correspond to
        the default size returned in the GET_MBOX_INFO command.

        If this command returns successfully then the window which the
        host requested is the active window. If it fails then there is
        no active window.

Command:
    CLOSE_WINDOW
    Implemented in Versions:
        V1, V2
    Arguments:
        V1:
        -

        V2:
        Args 0: Flags
    Response:
        -
    Notes:
        Closes the active window. Any further access to the LPC bus
        address specified to address the previously active window will
        have undefined effects. If the active window is a
        write window then the BMC must perform an implicit flush.

        The Flags argument allows the host to provide some
        hints to the BMC. Defined Values:
            0x01 - Short Lifetime:
                The window is unlikely to be accessed
                anytime again in the near future. The effect of
                this will depend on BMC implementation. In
                the event that the BMC performs some caching
                the BMC daemon could mark data contained in a
                window closed with this flag as first to be
                evicted from the cache.

Command:
    MARK_WRITE_DIRTY
    Implemented in Versions:
        V1, V2
    Arguments:
        V1:
        Args 0-1: Flash offset to mark from base of flash (blocks)
        Args 2-5: Number to mark dirty at offset (bytes)

        V2:
        Args 0-1: Window offset to mark (blocks)
        Args 2-3: Number to mark dirty at offset (blocks)

    Response:
        -
    Notes:
        The BMC has no method for intercepting writes that
        occur over the LPC bus. The host must explicitly notify
        the daemon of where and when a write has occurred so it
        can be flushed to backing storage.

        Offsets are absolute (either into flash (V1) or into the
        active window (V2)) and a zero offset refers to the first
        block. If the offset + number exceeds the size of the active
        window then the command must not succeed.

Command:
    WRITE_FLUSH
    Implemented in Versions:
        V1, V2
    Arguments:
        V1:
        Args 0-1: Flash offset to mark from base of flash (blocks)
        Args 2-5: Number to mark dirty at offset (bytes)

        V2:
        -

    Response:
        -
    Notes:
        Flushes any dirty/erased blocks in the active window to
        the backing storage.

        In V1 this can also be used to mark parts of the flash
        dirty and flush in a single command. In V2 the explicit
        mark dirty command must be used before a call to flush
        since there are no longer any arguments. If the offset + number
        exceeds the size of the active window then the command must not
        succeed.


Command:
    BMC_EVENT_ACK
    Implemented in Versions:
        V1, V2
    Arguments:
        Args 0: Bits in the BMC status byte (mailbox data
            register 15) to ack
    Response:
        *clears the bits in mailbox data register 15*
    Notes:
        The host should use this command to acknowledge BMC events
        supplied in mailbox register 15.

Command:
    MARK_WRITE_ERASED
    Implemented in Versions:
        V2
    Arguments:
        V2:
        Args 0-1: Window offset to erase (blocks)
        Args 2-3: Number to erase at offset (blocks)
    Response:
        -
    Notes:
        This command allows the host to erase a large area
        without the need to individually write 0xFF
        repetitively.

        Offset is the offset within the active window to start erasing
        from (zero refers to the first block of the active window) and
        number is the number of blocks of the active window to erase
        starting at offset. If the offset + number exceeds the size of
        the active window then the command must not succeed.
```

### BMC Events in Detail:

If the BMC needs to tell the host something then it simply
writes to Byte 15. The host should have interrupts enabled
on that register, or otherwise be polling it.

#### Bit Definitions:

Events which should be ACKed:
```
0x01: BMC Reboot
0x02: BMC Windows Reset (V2)
```

Events which cannot be ACKed (BMC will clear when no longer
applicable):
```
0x40: BMC Flash Control Lost (V2)
0x80: BMC MBOX Daemon Ready (V2)
```

#### Event Description:

Events which must be ACKed:
The host should acknowledge these events with BMC_EVENT_ACK to
let the BMC know that they have been received and understood.
```
0x01 - BMC Reboot:
    Used to inform the host that a BMC reboot has occurred.
    The host must perform protocol version negotiation again and
    must assume it has no active window. The host must not assume
    that any command which did not receive a successful response
    succeeded.
0x02 - BMC Windows Reset: (V2)
    The host must assume that its active window has been closed and
    that it no longer has an active window. The host is not
    required to perform protocol version negotiation again. The
    host must not assume that any command which did not receive a
    successful response succeeded.
```

Events which cannot be ACKed:
These events cannot be acknowledged by the host and a call to
BMC_EVENT_ACK with these bits set will have no effect. The BMC
will clear these bits when they are no longer applicable.
```
0x40 - BMC Flash Control Lost: (V2)
    The BMC daemon has been suspended and thus no longer
    controls access to the flash (most likely because some
    other process on the BMC required direct access to the
    flash and has suspended the BMC daemon to preclude
    concurrent access).
    The BMC daemon must clear this bit itself when it regains
    control of the flash (the host isn't able to clear it
    through an acknowledge command).
    The host must not assume that the contents of the active window
    correctly reflect the contents of flash while this bit is set.
0x80 - BMC MBOX Daemon Ready: (V2)
    Used to inform the host that the BMC daemon is ready to
    accept command requests. The host isn't able to clear
    this bit through an acknowledge command, the BMC daemon must
    clear it before it terminates (assuming it didn't
    terminate unexpectedly).
    The host should not expect a response while this bit is
    not set.
    Note that this bit being set is not a guarantee that the BMC daemon
    will respond as it or the BMC may have crashed without clearing
    it.
```
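
The bit definitions above could be decoded on the host roughly as follows. This is a minimal sketch: the bit values and event names are those listed in this section, while the function names (`decode_events`, `ack_mask`, `daemon_ready`, `window_consistent`) are hypothetical.

```python
# Event bits in mailbox data register 15, as listed under Bit Definitions.
ACKABLE = {0x01: "BMC Reboot", 0x02: "BMC Windows Reset"}
NON_ACKABLE = {0x40: "BMC Flash Control Lost", 0x80: "BMC MBOX Daemon Ready"}
ACKABLE_MASK = 0x01 | 0x02


def decode_events(status: int) -> list:
    """Return the names of all defined events set in the status byte."""
    names = []
    for table in (ACKABLE, NON_ACKABLE):
        for bit, name in table.items():
            if status & bit:
                names.append(name)
    return names


def ack_mask(status: int) -> int:
    """Bits worth passing to BMC_EVENT_ACK; acking the others has no effect."""
    return status & ACKABLE_MASK


def daemon_ready(status: int) -> bool:
    """False means the host should not expect a response to commands."""
    return bool(status & 0x80)


def window_consistent(status: int) -> bool:
    """False while the BMC daemon has lost control of the flash."""
    return not status & 0x40
```

Note that `daemon_ready` returning True is still not a guarantee of a response, since the daemon or the BMC may have crashed without clearing the bit.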