Copyright 2016 IBM

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

## Intro

This document describes a protocol for host to BMC communication via the
mailbox registers present on the Aspeed 2400 and 2500 chips.
This protocol is specifically designed to allow a host to request and manage
access to the flash; the specifics of how the host is required to control this
are described below.

## Version

Both version 1 and version 2 of the protocol are described below, with details
specific to version 2 marked with V2 in brackets - (V2).

## Problem Overview

"mbox" is the name we use to represent a protocol we have established between
the host and the BMC via the Aspeed mailbox registers. This protocol is used
for the host to control the flash.

Prior to the mbox protocol, the host used a backdoor into the BMC address space
(the iLPC-to-AHB bridge) to directly manipulate the BMC's own flash controller.

This is not sustainable for a number of reasons. The main ones are:

1. Every piece of the host software stack that needs flash access (HostBoot,
   OCC, OPAL, ...) has to have a complete driver for the flash controller,
   update it on each BMC generation, have all the quirks for all the flash
   chips supported etc... We have 3 copies on the host already in addition to
   the one in the BMC itself.

2. There are serious issues of access conflicts to that controller between the
   host and the BMC.

3. It's very hard to support "BMC reboots" when doing that.

4. It's slow.

5. Last but probably most important, having that backdoor open is a security
   risk. It means the host can access any address on the BMC internal bus and
   implant malware in the BMC itself. So if the host is a "bare metal" shared
   system in some kind of data center, not only does the host flash need to be
   reflashed when switching from one customer to another, but the entire BMC
   flash too, as nothing can be trusted. So we want to disable it.

To address all these, we have implemented a new mechanism that we call mbox.

When using this mechanism, the BMC is solely responsible for directly accessing
the flash controller. All flash erase and write operations are performed by the
BMC and the BMC only. (We can allow direct reads from flash under some
circumstances but we tend to prefer going via memory).

The host uses the mailbox registers to send "commands" to the BMC, which
responds via the same mechanism. Those commands allow the host to control a
"window" (which is the LPC -> AHB FW space mapping) that is either a read
window or a write window onto the flash.

When set for writing, the BMC makes the window point to a chunk of RAM instead.
When the host "commits" a change (via MBOX), then the BMC can perform the
actual flashing from the data in the RAM window.

The idea is to have the LPC FW space be routed to an active "window". That
window can be a read or a write window. The commands allow the host to control
which window is active and which offset into the flash it maps.

* A read window can be a direct window to the flash controller space (i.e.
  0x3000\_0000) or it can be a window to a RAM image of a flash.
  It doesn't have to be the full size of the flash per protocol (commands can
  be used to "slide" it to various parts of the flash) but if it's set to map
  the actual flash controller space at 0x3000\_0000, it's probably simpler to
  make it the full flash. The host makes no assumption, it's your choice what
  to provide. The simplest implementation is to just route to the flash
  read-only.

* A write window has to be a chunk of BMC memory. The minimum size is not
  defined in the spec, but it should be at least one block (4k for now but it
  should support larger block sizes in the future). When the BMC receives the
  command to map the write window at a given offset of the flash, the BMC should
  copy that portion of the flash into a reserved memory buffer, and modify the
  LPC mapping to point to that buffer.

The host can then write to that window directly (updating the BMC memory) and
send a command to "commit" those updates to flash.

Finally there is a `RESET_STATE`. It's the state in which the bootloader in the
SEEPROM of the POWER9 chip will find what it needs to load HostBoot. The
details are still being ironed out: either mapping the full flash read only or
reset to a "window" that is either at the bottom or top of the flash. The
current implementation resets to point to the full flash.

## Where is the code?

The mbox userspace is available [on GitHub](https://github.com/openbmc/mboxbridge).
This is Apache licensed but we are keen to see any enhancements you may have.

The kernel driver is still in the process of being upstreamed but can be found
in the OpenBMC Linux kernel staging tree:

https://github.com/openbmc/linux/commit/85770a7d1caa6a1fa1a291c33dfe46e05755a2ef

## Building

The autotools build requires the autoconf-archive package to be installed on
your system.

## The Hardware

The Aspeed mailbox consists of 16 (8 bit) data registers; see Layout for their
use.
Mailbox interrupt enabling, masking and triggering is done using a pair
of control registers, one accessible by the host, the other by the BMC.
Interrupts can also be raised per write to each data register, for BMC and
host. Write triggered interrupts are configured using two 8 bit registers where
each bit represents a data register and whether an interrupt should fire on
write. Two 8 bit registers are present to act as a mask for write triggered
interrupts.

### Layout

```
Byte 0: COMMAND
Byte 1: Sequence
Byte 2-12: Arguments
Byte 13: Response code
Byte 14: Host controlled status reg
Byte 15: BMC controlled status reg
```

## Low Level Protocol Flow

What we essentially have is a set of registers which either the host or BMC can
write to in order to communicate to the other, which will respond in some way.
There are 3 basic types of communication.

1. Commands sent from the Host to the BMC
2. Responses sent from the BMC to the Host in response to commands
3. Asynchronous events raised by the BMC

### General Use

Messages usually originate from the host to the BMC. There are special
cases for a back channel for the BMC to pass new information to the
host which will be discussed later.

To initiate a request the host must set a command code (see
Commands) into mailbox data register 0. It is also the host's
responsibility to generate a unique sequence number into mailbox
register 1. After this any command specific data should be written
(see Layout). The host must then generate an interrupt to the BMC by
using bit 0 of its control register and wait for an interrupt on the
response register. Generating an interrupt automatically sets bit 7 of the
corresponding control register. This bit can be used to poll for
messages.
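
As an illustration of the request layout above, the following is a minimal
sketch of how a host might assemble the values for data registers 0 to 12
before raising the interrupt. The helper name `build_request` and the example
argument values are hypothetical; only the register layout, the command code
and the little endian argument encoding are taken from this document.

```python
import struct

# Command code as listed in the Commands section of this document.
CREATE_READ_WINDOW = 0x04

# Data registers 2-12 carry the arguments (11 bytes in total).
MAX_ARGS = 11


def build_request(command, sequence, args=b""):
    """Return the 13 bytes the host writes to data registers 0-12.

    Hypothetical helper: registers 13-15 (response code and the two
    status bytes) are deliberately not part of the returned value.
    """
    if len(args) > MAX_ARGS:
        raise ValueError("arguments must fit in data registers 2-12")
    regs = bytearray(2 + MAX_ARGS)
    regs[0] = command & 0xFF      # Byte 0: COMMAND
    regs[1] = sequence & 0xFF     # Byte 1: Sequence
    regs[2:2 + len(args)] = args  # Byte 2-12: Arguments, unused bytes stay 0
    return bytes(regs)


# Example: a V2 CREATE_READ_WINDOW request for flash offset 0x10 blocks
# with a requested size of 0x20 blocks, both encoded LSB first.
args = struct.pack("<HH", 0x10, 0x20)
request = build_request(CREATE_READ_WINDOW, sequence=7, args=args)
```

How the bytes reach the hardware (and how the interrupt is raised) depends on
the platform and is outside the scope of this sketch.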

On receiving an interrupt (or polling on bit 7 of its Control
Register) the BMC should read the message from the general registers
of the mailbox and perform the necessary action before responding. On
responding the BMC must ensure that the sequence number is the same as
the one in the request from the host. The BMC must also ensure that
mailbox data register 13 is a valid response code (see Responses). The
BMC should then use its control register to generate an interrupt for
the host to notify it of a response.

### Asynchronous BMC to Host Events

BMC to host communication is also possible for notification of events
from the BMC. This requires that the host have interrupts enabled on
mailbox data register 15 (or otherwise poll on bit 7 of mailbox status
register 1). On receiving such a notification the host should read
mailbox data register 15 to determine the event code which was set by the
BMC (see BMC Event notifications in Commands for detail). Events which are
defined as being able to be acknowledged by the host must be acknowledged
with a BMC_EVENT_ACK command.

## High Level Protocol Flow

When a host wants to communicate with the BMC via the mbox protocol the first
thing it should do is call GET_MBOX_INFO in order to establish the protocol
version which each understands. Before this the only other commands which are
allowed are RESET_STATE and BMC_EVENT_ACK.

After this the host can open and close windows with the CREATE_READ_WINDOW,
CREATE_WRITE_WINDOW and CLOSE_WINDOW commands. Creating a window is how the
host requests access to a section of flash. It is worth noting that the host
can only ever have one window that it is accessing at a time - henceforth
referred to as the active window.
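
The ordering rule at the start of this section (only GET_MBOX_INFO,
RESET_STATE and BMC_EVENT_ACK are allowed before the version has been
established) can be sketched as a simple host-side check. The function name
`command_allowed` and the `negotiated` flag are hypothetical names chosen for
this example; the command codes come from the Commands section.

```python
# Command codes as listed in the Commands section of this document.
RESET_STATE = 0x01
GET_MBOX_INFO = 0x02
GET_FLASH_INFO = 0x03
CREATE_READ_WINDOW = 0x04
CLOSE_WINDOW = 0x05
CREATE_WRITE_WINDOW = 0x06
MARK_WRITE_DIRTY = 0x07
WRITE_FLUSH = 0x08
BMC_EVENT_ACK = 0x09
MARK_WRITE_ERASED = 0x0A  # (V2)

KNOWN_COMMANDS = set(range(RESET_STATE, MARK_WRITE_ERASED + 1))

# Commands the host may issue before the protocol version is established.
ALLOWED_BEFORE_NEGOTIATION = {RESET_STATE, GET_MBOX_INFO, BMC_EVENT_ACK}


def command_allowed(command, negotiated):
    """Return True if the host may send `command` in its current state.

    `negotiated` is a hypothetical flag that the host sets once a
    GET_MBOX_INFO exchange has completed successfully.
    """
    if command not in KNOWN_COMMANDS:
        return False
    return negotiated or command in ALLOWED_BEFORE_NEGOTIATION
```

A host implementation could call such a check before writing a command to the
mailbox, so that ordering mistakes are caught locally rather than reported by
the BMC.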

When the active window is a write window the host can perform MARK_WRITE_DIRTY,
MARK_WRITE_ERASED and WRITE_FLUSH commands to identify changed blocks and
control when the changed blocks are written to flash.

Independently, and at any point not during an existing mbox command
transaction, the BMC may raise asynchronous events with the host to
communicate a change in state.

### Version Negotiation

Given that a majority of command and response arguments are specified as a
multiple of block size it is necessary for the host and BMC to agree on a
protocol version as this determines the block size. In V1 it is hard coded at
4K and in V2 the BMC chooses and specifies this to the host as a response
argument to GET_MBOX_INFO. Thus the host must always call GET_MBOX_INFO before
any other command which specifies an argument in block size.

The host must tell the BMC the highest protocol level which it supports. The
BMC will then respond with a protocol level. If the host doesn't understand
the protocol level specified by the BMC then it must not continue to
communicate with the BMC. Otherwise the protocol level specified by the
BMC is taken to be the protocol level used for further communication and can
only be changed by another call to GET_MBOX_INFO. The BMC should use the
request from the host to influence its protocol version choice.

### Window Management

In order to access flash contents the host must request a window be opened at
the flash offset it would like to access. The host may give a hint as to how
much data it would like to access or otherwise set this argument to zero. The
BMC must respond with the LPC bus address to access this window and the
window size. The host must not access past the end of the active window.
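
The negotiation and block size rules described above can be sketched as
follows. This is only an illustration under stated assumptions: the function
names are hypothetical, and the sketch assumes that a host which supports
version N also understands every version from 1 to N.

```python
V1_BLOCK_SHIFT = 12  # V1 hard codes the block size at 4K


def negotiate(host_max_version, bmc_version):
    """Return the protocol version to use, or raise if the host must stop.

    Assumption: the host understands every version from 1 up to the
    highest version it advertised to the BMC.
    """
    if not 1 <= bmc_version <= host_max_version:
        raise RuntimeError("BMC chose a version the host does not understand")
    return bmc_version


def block_size(version, block_shift=None):
    """Return the block size in bytes for the negotiated version.

    In V2 `block_shift` is the shift returned in the GET_MBOX_INFO response.
    """
    if version == 1:
        return 1 << V1_BLOCK_SHIFT
    if block_shift is None:
        raise ValueError("V2 needs the block shift from GET_MBOX_INFO")
    return 1 << block_shift


def window_range(lpc_addr_blocks, size_blocks, bs):
    """Convert a window given in blocks to a [start, end) byte range.

    The host must stay below `end` when accessing the active window.
    """
    start = lpc_addr_blocks * bs
    return start, start + size_blocks * bs
```

For example, with 4K blocks a window at LPC block address 0x0FF0 that is 0x10
blocks long covers the byte range 0x0FF0000 up to, but not including,
0x1000000.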

There is only ever one active window which is the window created by the most
recent CREATE_READ_WINDOW or CREATE_WRITE_WINDOW call which succeeded. Even
though there are two types of windows there can still only be one active window
irrespective of type. A host must not write to a read window. A host may read
from a write window and the BMC must guarantee that the window reflects what
the host has written there.

A window can be closed by calling CLOSE_WINDOW in which case there is no active
window and the host must not access the LPC window after it has been closed.
If the host closes an active write window then the BMC must perform an
implicit flush. If the host tries to open a new window with an already active
window then the active window is closed (and implicitly flushed if it was a
write window). If the new window is successfully opened then it is the new
active window; if the command fails then there is no active window and the
previous active window must no longer be accessed.

The host must not access an LPC address other than that which is contained by
the active window. The host must not use write management functions (see below)
if the active window is a read window or if there is no active window.

### Write Management

The BMC has no method for intercepting writes that occur over the LPC bus. Thus
the host must explicitly notify the BMC of where and when a write has
occurred. The host must use the MARK_WRITE_DIRTY command to tell the BMC where
within the write window it has modified. The host may also use the
MARK_WRITE_ERASED command to erase large parts of the active window without the
need to write 0xFF. The BMC must ensure that if the host
reads from an area it has erased that the read values are 0xFF.
Any part of the
active window marked dirty/erased is only marked for the lifetime of the current
active write window and does not persist if the active window is closed either
implicitly or explicitly by the host or the BMC. The BMC may at any time
or must on a call to WRITE_FLUSH flush the changes which it has been notified
of back to the flash, at which point the dirty or erased marking is cleared
for the active window. The host must not assume that any changes have been
written to flash unless an explicit flush call was successful, a close of an
active write window was successful or a create window command with an active
write window was successful - otherwise consistency between the flash and memory
contents cannot be guaranteed.

The host is not required to perform an erase before a write command and the
BMC must ensure that a write performs as expected - that is if an erase is
required before a write then the BMC must perform this itself.

### BMC Events

The BMC can raise events with the host asynchronously to communicate to the
host a change in state which it should take notice of. The host must (if
possible for the given event) acknowledge it to inform the BMC it has been
received.

If the BMC raises a BMC Reboot event then the host must renegotiate the
protocol version so that both the BMC and the host agree on the block size.
A BMC Reboot event implies a BMC Windows Reset event.
If the BMC raises a BMC Windows Reset event then the host must
assume that there is no longer an active window - that is if there was an
active window it has been closed by the BMC and if it was a write window
then the host must not assume that it was flushed unless a previous explicit
flush call was successful.
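
The marking rules from the Write Management section, together with the effect
of the BMC Reboot and BMC Windows Reset events described above, can be
modelled with a small host-side sketch. The class and attribute names are
hypothetical, offsets follow the V2 convention (blocks relative to the active
window), and letting a later mark replace an earlier one on the same block is
an assumption of this sketch rather than something the protocol states.

```python
# Event bits as listed under Bit Definitions in this document.
BMC_EVENT_REBOOT = 0x01
BMC_EVENT_WINDOWS_RESET = 0x02


class WriteWindow:
    """Tracks which blocks of the active write window await a flush."""

    def __init__(self, size_blocks):
        self.size_blocks = size_blocks
        self.dirty = set()
        self.erased = set()

    def _blocks(self, offset, count):
        # A range exceeding the active window must not succeed.
        if offset < 0 or count < 0 or offset + count > self.size_blocks:
            raise ValueError("range exceeds the active window")
        return range(offset, offset + count)

    def mark_dirty(self, offset, count):
        blocks = self._blocks(offset, count)
        self.erased.difference_update(blocks)  # assumption: last mark wins
        self.dirty.update(blocks)

    def mark_erased(self, offset, count):
        blocks = self._blocks(offset, count)
        self.dirty.difference_update(blocks)   # assumption: last mark wins
        self.erased.update(blocks)

    def flush(self):
        """Return (dirty, erased) block lists and clear the markings."""
        pending = (sorted(self.dirty), sorted(self.erased))
        self.dirty.clear()
        self.erased.clear()
        return pending


class HostState:
    """Hypothetical host view of the negotiated version and active window."""

    def __init__(self):
        self.version = None  # None means negotiation is required
        self.window = None   # None means there is no active window

    def on_events(self, status):
        if status & BMC_EVENT_REBOOT:
            self.version = None  # must renegotiate the protocol version
            self.window = None   # a reboot implies a windows reset
        if status & BMC_EVENT_WINDOWS_RESET:
            self.window = None   # unflushed marks are lost with the window
```

Dropping the window object on a reset reflects the rule that the host must not
assume unflushed changes reached the flash.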

The BMC may at some points require access to the flash and the BMC daemon must
set the BMC Flash Control Lost event when the BMC is accessing the flash behind
the BMC daemon's back. When this event is set the host must assume that the
contents of the active window could be inconsistent with the contents of flash.

## Protocol Definition

### Commands

```
RESET_STATE          0x01
GET_MBOX_INFO        0x02
GET_FLASH_INFO       0x03
CREATE_READ_WINDOW   0x04
CLOSE_WINDOW         0x05
CREATE_WRITE_WINDOW  0x06
MARK_WRITE_DIRTY     0x07
WRITE_FLUSH          0x08
BMC_EVENT_ACK        0x09
MARK_WRITE_ERASED    0x0a	(V2)
```

### Sequence

The host must ensure a unique sequence number at the start of a
command/response pair. The BMC must ensure the responses to
a particular message contain the same sequence number that was in the
command request from the host.

### Responses

```
SUCCESS       1
PARAM_ERROR   2
WRITE_ERROR   3
SYSTEM_ERROR  4
TIMEOUT       5
BUSY          6	(V2)
WINDOW_ERROR  7	(V2)
```

#### Description:

SUCCESS		- Command completed successfully

PARAM_ERROR	- Error with parameters supplied or command invalid

WRITE_ERROR	- Error writing to the backing file system

SYSTEM_ERROR	- Error in BMC performing system action

TIMEOUT		- Timeout in performing action

BUSY		- Daemon in suspended state (currently unable to access flash)
		- Retry later

WINDOW_ERROR	- Command not valid for active window or no active window
		- Try opening an appropriate window and retrying the command

### Information
- All multibyte messages are LSB first (little endian)
- All responses must have a valid return code in byte 13


### Commands in detail

Note in V1 block size is hard coded to 4K, in V2 it is variable and must be
queried with GET_MBOX_INFO.
Sizes and addresses are specified in either bytes, marked (bytes), or blocks,
marked (blocks).
Sizes and addresses specified in blocks must be converted to bytes by
multiplying by the block size.
```
Command:
    RESET_STATE
    Implemented in Versions:
        V1, V2
    Arguments:
        -
    Response:
        -
    Notes:
        This command is designed to inform the BMC that it should put
        host LPC mapping back in a state where the SBE will be able to
        use it. Currently this means pointing back to BMC flash
        pre mailbox protocol. Final behaviour is still TBD.

Command:
    GET_MBOX_INFO
    Implemented in Versions:
        V1, V2
    Arguments:
        V1:
            Args 0: API version

        V2:
            Args 0: API version

    Response:
        V1:
            Args 0: API version
            Args 1-2: default read window size (blocks)
            Args 3-4: default write window size (blocks)

        V2:
            Args 0: API version
            Args 1-2: reserved
            Args 3-4: reserved
            Args 5: Block size as power of two (encoded as a shift)

Command:
    GET_FLASH_INFO
    Implemented in Versions:
        V1, V2
    Arguments:
        -
    Response:
        V1:
            Args 0-3: Flash size (bytes)
            Args 4-7: Erase granule (bytes)

        V2:
            Args 0-1: Flash size (blocks)
            Args 2-3: Erase granule (blocks)

Command:
    CREATE_{READ/WRITE}_WINDOW
    Implemented in Versions:
        V1, V2
    Arguments:
        V1:
            Args 0-1: Window location as offset into flash (blocks)

        V2:
            Args 0-1: Window location as offset into flash (blocks)
            Args 2-3: Requested window size (blocks)

    Response:
        V1:
            Args 0-1: LPC bus address of window (blocks)

        V2:
            Args 0-1: LPC bus address of window (blocks)
            Args 2-3: Actual window size (blocks)
            Args 4-5: Actual window location as offset into flash (blocks)
    Notes:
        Window location is always given as an offset into flash as
        taken from the start of flash - that is it is an absolute
        address.

        LPC bus address is always given from the start of the LPC
        address space - that is it is an absolute address.

        The requested window size is only a hint. The response
        indicates the actual size of the window. The BMC may
        want to use the requested size to pre-load the remainder
        of the request. The host must not access past the end of the
        active window.

        The actual window location indicates the absolute flash offset
        that the window actually maps and is not required to be equal
        to the flash offset requested by the host, but must be
        less than or equal to it. Thus the first block of the window at
        the LPC address in the response will map the first block at the
        actual flash offset also contained in the response. It is the
        responsibility of the host to use this information to access
        any offset which is required.

        The requested window size may be zero. In this case the
        BMC is free to create any sized window but it must contain
        at least the first block of data requested by the host. A large
        window is of course preferred and should correspond to
        the default size returned in the GET_MBOX_INFO command.

        If this command returns successfully then the window which the
        host requested is the active window. If it fails then there is
        no active window.

Command:
    CLOSE_WINDOW
    Implemented in Versions:
        V1, V2
    Arguments:
        V1:
            -

        V2:
            Args 0: Flags
    Response:
        -
    Notes:
        Closes the active window. Any further access to the LPC bus
        address specified to address the previously active window will
        have undefined effects. If the active window is a
        write window then the BMC must perform an implicit flush.

        The Flags argument allows the host to provide some
        hints to the BMC. Defined Values:
            0x01 - Short Lifetime:
                The window is unlikely to be accessed
                again in the near future.
                The effect of
                this will depend on the BMC implementation. In
                the event that the BMC performs some caching
                the BMC daemon could mark data contained in a
                window closed with this flag as first to be
                evicted from the cache.

Command:
    MARK_WRITE_DIRTY
    Implemented in Versions:
        V1, V2
    Arguments:
        V1:
            Args 0-1: Flash offset to mark from base of flash (blocks)
            Args 2-5: Number to mark dirty at offset (bytes)

        V2:
            Args 0-1: Window offset to mark (blocks)
            Args 2-3: Number to mark dirty at offset (blocks)

    Response:
        -
    Notes:
        The BMC has no method for intercepting writes that
        occur over the LPC bus. The host must explicitly notify
        the daemon of where and when a write has occurred so it
        can be flushed to backing storage.

        Offsets are absolute (either into flash (V1) or into the
        active window (V2)) and a zero offset refers to the first
        block. If the offset + number exceeds the size of the active
        window then the command must not succeed.

Command:
    WRITE_FLUSH
    Implemented in Versions:
        V1, V2
    Arguments:
        V1:
            Args 0-1: Flash offset to mark from base of flash (blocks)
            Args 2-5: Number to mark dirty at offset (bytes)

        V2:
            -

    Response:
        -
    Notes:
        Flushes any dirty/erased blocks in the active window to
        the backing storage.

        In V1 this can also be used to mark parts of the flash
        dirty and flush in a single command. In V2 the explicit
        mark dirty command must be used before a call to flush
        since there are no longer any arguments. If the offset + number
        exceeds the size of the active window then the command must not
        succeed.

Command:
    BMC_EVENT_ACK
    Implemented in Versions:
        V1, V2
    Arguments:
        Args 0: Bits in the BMC status byte (mailbox data
                register 15) to ack
    Response:
        *clears the bits in mailbox data register 15*
    Notes:
        The host should use this command to acknowledge BMC events
        supplied in mailbox register 15.

Command:
    MARK_WRITE_ERASED
    Implemented in Versions:
        V2
    Arguments:
        V2:
            Args 0-1: Window offset to erase (blocks)
            Args 2-3: Number to erase at offset (blocks)
    Response:
        -
    Notes:
        This command allows the host to erase a large area
        without the need to individually write 0xFF
        repetitively.

        Offset is the offset within the active window to start erasing
        from (zero refers to the first block of the active window) and
        number is the number of blocks of the active window to erase
        starting at offset. If the offset + number exceeds the size of
        the active window then the command must not succeed.
```

### BMC Events in Detail:

If the BMC needs to tell the host something then it simply
writes to Byte 15. The host should have interrupts enabled
on that register, or otherwise be polling it.

#### Bit Definitions:

Events which should be ACKed:
```
0x01: BMC Reboot
0x02: BMC Windows Reset (V2)
```

Events which cannot be ACKed (BMC will clear when no longer
applicable):
```
0x40: BMC Flash Control Lost (V2)
0x80: BMC MBOX Daemon Ready (V2)
```

#### Event Description:

Events which must be ACKed:
The host should acknowledge these events with BMC_EVENT_ACK to
let the BMC know that they have been received and understood.
```
0x01 - BMC Reboot:
    Used to inform the host that a BMC reboot has occurred.
    The host must perform protocol version negotiation again and
    must assume it has no active window.
    The host must not assume
    that any command which did not respond with success succeeded.
0x02 - BMC Windows Reset: (V2)
    The host must assume that its active window has been closed and
    that it no longer has an active window. The host is not
    required to perform protocol version negotiation again. The
    host must not assume that any command which did not respond with
    success succeeded.
```

Events which cannot be ACKed:
These events cannot be acknowledged by the host and a call to
BMC_EVENT_ACK with these bits set will have no effect. The BMC
will clear these bits when they are no longer applicable.
```
0x40 - BMC Flash Control Lost: (V2)
    The BMC daemon has been suspended and thus no longer
    controls access to the flash (most likely because some
    other process on the BMC required direct access to the
    flash and has suspended the BMC daemon to preclude
    concurrent access).
    The BMC daemon must clear this bit itself when it regains
    control of the flash (the host isn't able to clear it
    through an acknowledge command).
    The host must not assume that the contents of the active window
    correctly reflect the contents of flash while this bit is set.
0x80 - BMC MBOX Daemon Ready: (V2)
    Used to inform the host that the BMC daemon is ready to
    accept command requests. The host isn't able to clear
    this bit through an acknowledge command, the BMC daemon must
    clear it before it terminates (assuming it didn't
    terminate unexpectedly).
    The host should not expect a response while this bit is
    not set.
    Note that this bit being set is not a guarantee that the BMC daemon
    will respond as it or the BMC may have crashed without clearing
    it.
```
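
As an illustration of the bit definitions above, a host might decode the BMC
status byte (mailbox data register 15) along the following lines. This is a
sketch only: the function names are hypothetical, while the bit values and
event names are those listed in this section.

```python
# Events the host acknowledges with BMC_EVENT_ACK.
ACKABLE_EVENTS = {
    0x01: "BMC Reboot",
    0x02: "BMC Windows Reset",
}

# Events only the BMC can clear.
BMC_CLEARED_EVENTS = {
    0x40: "BMC Flash Control Lost",
    0x80: "BMC MBOX Daemon Ready",
}

ACK_MASK = 0x01 | 0x02


def decode_events(status):
    """Return the names of all events set in the BMC status byte."""
    all_events = {**ACKABLE_EVENTS, **BMC_CLEARED_EVENTS}
    return [name for bit, name in all_events.items() if status & bit]


def ack_argument(status):
    """Return the Args 0 value for BMC_EVENT_ACK: only the ackable bits."""
    return status & ACK_MASK


def daemon_ready(status):
    """True if the BMC MBOX Daemon Ready bit is set."""
    return bool(status & 0x80)


def window_consistent(status):
    """False while BMC Flash Control Lost is set."""
    return not status & 0x40
```

Masking the acknowledge argument keeps the host from sending bits that
BMC_EVENT_ACK cannot clear, which would have no effect anyway.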