Copyright 2016 IBM

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

  http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

## Intro

This document describes a protocol for host to BMC communication via the
mailbox registers present on the Aspeed 2400 and 2500 chips.
The protocol is specifically designed to allow a host to request and manage
access to the flash. How the host is required to control that access is
described below.

## Version

Both version 1 and version 2 of the protocol are described below, with
version 2 specifics marked with V2 in brackets: (V2).

## Problem Overview

"mbox" is the name we use to represent a protocol we have established between
the host and the BMC via the Aspeed mailbox registers. This protocol is used
for the host to control the flash.

Prior to the mbox protocol, the host used a backdoor into the BMC address space
(the iLPC-to-AHB bridge) to directly manipulate the BMC's own flash controller.

This is not sustainable for a number of reasons. The main ones are:

1. Every piece of the host software stack that needs flash access (HostBoot,
   OCC, OPAL, ...) has to have a complete driver for the flash controller,
   update it on each BMC generation, have all the quirks for all the flash
   chips supported, etc. We have 3 copies on the host already in addition to
   the one in the BMC itself.

2. There are serious issues of access conflicts to that controller between the
   host and the BMC.

3. It's very hard to support "BMC reboots" when doing that.

4. It's slow.

5. Last but probably most important, having that backdoor open is a security
   risk. It means the host can access any address on the BMC internal bus and
   implant malware in the BMC itself. So if the host is a "bare metal" shared
   system in some kind of data center, not only the host flash needs to be
   reflashed when switching from one customer to another, but the entire BMC
   flash too as nothing can be trusted. So we want to disable it.

To address all these, we have implemented a new mechanism that we call mbox.

When using this mechanism, the BMC is solely responsible for directly accessing
the flash controller. All flash erase and write operations are performed by the
BMC and the BMC only. (We can allow direct reads from flash under some
circumstances but we tend to prefer going via memory.)

The host uses the mailbox registers to send "commands" to the BMC, which
responds via the same mechanism. Those commands allow the host to control a
"window" (which is the LPC -> AHB FW space mapping) that is either a read
window or a write window onto the flash.

When set for writing, the BMC makes the window point to a chunk of RAM instead.
When the host "commits" a change (via MBOX), then the BMC can perform the
actual flashing from the data in the RAM window.

The idea is to have the LPC FW space be routed to an active "window". That
window can be a read or a write window. The commands control which window is
active and which offset into the flash it maps.

* A read window can be a direct window to the flash controller space (i.e.
  0x3000\_0000) or it can be a window to a RAM image of a flash. It doesn't have
  to be the full size of the flash per protocol (commands can be used to "slide"
  it to various parts of the flash) but if it's set to map the actual flash
  controller space at 0x3000\_0000, it's probably simpler to make it the full
  flash. The host makes no assumption, it's your choice what to provide. The
  simplest implementation is to just route to the flash read-only.

* A write window has to be a chunk of BMC memory. The minimum size is not
  defined in the spec, but it should be at least one block (4k for now but it
  should support larger block sizes in the future). When the BMC receives the
  command to map the write window at a given offset of the flash, the BMC should
  copy that portion of the flash into a reserved memory buffer, and modify the
  LPC mapping to point to that buffer.

The host can then write to that window directly (updating the BMC memory) and
send a command to "commit" those updates to flash.

Finally there is a `RESET_STATE`. It's the state in which the bootloader in the
SEEPROM of the POWER9 chip will find what it needs to load HostBoot. The
details are still being ironed out: either mapping the full flash read-only or
resetting to a "window" that is either at the bottom or top of the flash. The
current implementation resets to point to the full flash.

## Where is the code?

The mbox userspace is available [on GitHub](https://github.com/openbmc/mboxbridge).
This is Apache licensed but we are keen to see any enhancements you may have.

The kernel driver is still in the process of being upstreamed but can be found
in the OpenBMC Linux kernel staging tree:

https://github.com/openbmc/linux/commit/85770a7d1caa6a1fa1a291c33dfe46e05755a2ef

## Building

The autotools build requires the autoconf-archive package to be installed on
your system.

## The Hardware

The Aspeed mailbox consists of 16 (8-bit) data registers; see Layout for their
use. Mailbox interrupt enabling, masking and triggering is done using a pair
of control registers, one accessible by the host and the other by the BMC.
Interrupts can also be raised per write to each data register, for BMC and
host. Write-triggered interrupts are configured using two 8-bit registers where
each bit represents a data register and whether an interrupt should fire on
write. Two 8-bit registers are present to act as a mask for write-triggered
interrupts.

### Layout

```
Byte 0: COMMAND
Byte 1: Sequence
Byte 2-12: Arguments
Byte 13: Response code
Byte 14: Host controlled status reg
Byte 15: BMC controlled status reg
```
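The layout above can be written down as byte offsets into the 16 data registers. The following is a minimal sketch in C; the `MBOX_REG_*` names and the `mbox_arg()` helper are illustrative assumptions and are not taken from the mboxbridge source.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte offsets of the 16 mailbox data registers (names are illustrative). */
enum {
	MBOX_REG_COMMAND     = 0,
	MBOX_REG_SEQUENCE    = 1,
	MBOX_REG_ARGS        = 2,  /* argument bytes occupy registers 2-12 */
	MBOX_REG_RESPONSE    = 13,
	MBOX_REG_HOST_STATUS = 14,
	MBOX_REG_BMC_STATUS  = 15,
	MBOX_NUM_REGS        = 16,
	MBOX_NUM_ARGS        = 11
};

/* Return a pointer to argument byte n, or NULL if n is out of range. */
static uint8_t *mbox_arg(uint8_t regs[MBOX_NUM_REGS], unsigned int n)
{
	if (n >= MBOX_NUM_ARGS)
		return NULL;
	return &regs[MBOX_REG_ARGS + n];
}
```

With this numbering, "Args 0" in the command descriptions is data register 2 and "Args 10" is data register 12.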

## Low Level Protocol Flow

What we essentially have is a set of registers which either the host or BMC can
write to in order to communicate to the other, which will respond in some way.
There are 3 basic types of communication.

1. Commands sent from the Host to the BMC
2. Responses sent from the BMC to the Host in response to commands
3. Asynchronous events raised by the BMC

### General Use

Messages usually originate from the host to the BMC. There are special
cases for a back channel for the BMC to pass new information to the
host which will be discussed later.

To initiate a request the host must write a command code (see
Commands) into mailbox data register 0. It is also the host's
responsibility to write a unique sequence number into mailbox
register 1. After this any command specific data should be written
(see Layout). The host must then generate an interrupt to the BMC by
using bit 0 of its control register and wait for an interrupt on the
response register. Generating an interrupt automatically sets bit 7 of the
corresponding control register. This bit can be used to poll for
messages.

On receiving an interrupt (or polling on bit 7 of its Control
Register) the BMC should read the message from the general registers
of the mailbox and perform the necessary action before responding. On
responding the BMC must ensure that the sequence number is the same as
the one in the request from the host. The BMC must also ensure that
mailbox data register 13 is a valid response code (see Responses). The
BMC should then use its control register to generate an interrupt for
the host to notify it of a response.
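The request/response exchange can be sketched against a plain 16-byte array standing in for the data registers. This is only an illustration under that assumption: the function names are hypothetical, and the interrupt and control register handling is left out because it is hardware specific.

```c
#include <stdint.h>
#include <string.h>

#define REG_COMMAND  0
#define REG_SEQUENCE 1
#define REG_ARGS     2
#define REG_RESPONSE 13
#define NUM_ARGS     11

/* Host side: fill in command, sequence number and up to 11 argument bytes. */
static void host_build_request(uint8_t regs[16], uint8_t cmd, uint8_t seq,
			       const uint8_t *args, size_t nargs)
{
	regs[REG_COMMAND] = cmd;
	regs[REG_SEQUENCE] = seq;
	memset(&regs[REG_ARGS], 0, NUM_ARGS);
	if (nargs > NUM_ARGS)
		nargs = NUM_ARGS;
	if (nargs)
		memcpy(&regs[REG_ARGS], args, nargs);
	/* A real host would now raise the interrupt via its control register. */
}

/* BMC side: keep the sequence number, set a response code in byte 13. */
static void bmc_build_response(uint8_t regs[16], uint8_t response_code)
{
	/* regs[REG_SEQUENCE] is deliberately left untouched. */
	regs[REG_RESPONSE] = response_code;
}

/* Host side: a response belongs to a request only if the sequence matches. */
static int host_response_matches(const uint8_t regs[16], uint8_t seq)
{
	return regs[REG_SEQUENCE] == seq;
}
```

A host would typically increment its sequence number for every new request so that a stale response can be told apart from the current one.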

### Asynchronous BMC to Host Events

BMC to host communication is also possible for notification of events
from the BMC. This requires that the host have interrupts enabled on
mailbox data register 15 (or otherwise poll on bit 7 of mailbox status
register 1). On receiving such a notification the host should read
mailbox data register 15 to determine the event code which was set by the
BMC (see BMC Event notifications in Commands for detail). Events which are
defined as acknowledgeable must be acknowledged by the host with a
BMC_EVENT_ACK command.

## High Level Protocol Flow

When a host wants to communicate with the BMC via the mbox protocol the first
thing it should do is call GET_MBOX_INFO in order to establish the protocol
version which each understands. Before this the only other commands which are
allowed are RESET_STATE and BMC_EVENT_ACK.

After this the host can open and close windows with the CREATE_READ_WINDOW,
CREATE_WRITE_WINDOW and CLOSE_WINDOW commands. Creating a window is how the
host requests access to a section of flash. It is worth noting that the host
can only ever have one window that it is accessing at a time, henceforth
referred to as the active window.

When the active window is a write window the host can perform MARK_WRITE_DIRTY,
MARK_WRITE_ERASED and WRITE_FLUSH commands to identify changed blocks and
control when the changed blocks are written to flash.

Independently, and at any point not during an existing mbox command
transaction, the BMC may raise asynchronous events with the host to
communicate a change in state.

### Version Negotiation

Given that a majority of command and response arguments are specified as a
multiple of block size it is necessary for the host and BMC to agree on a
protocol version as this determines the block size. In V1 it is hard coded at
4K and in V2 the BMC chooses and specifies this to the host as a response
argument to GET_MBOX_INFO. Thus the host must always call GET_MBOX_INFO before
any other command which specifies an argument in block size.

The host must tell the BMC the highest protocol level which it supports. The
BMC will then respond with a protocol level. If the host doesn't understand
the protocol level specified by the BMC then it must not continue to
communicate with the BMC. Otherwise the protocol level specified by the
BMC is taken to be the protocol level used for further communication and can
only be changed by another call to GET_MBOX_INFO. The BMC should use the
request from the host to influence its protocol version choice.
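One way to picture the negotiation is the sketch below. It assumes the BMC simply picks the lower of the two supported versions, which is one reasonable policy but is not mandated by the text above; the function names and `HOST_MAX_VERSION` are hypothetical.

```c
#include <stdint.h>

#define HOST_MAX_VERSION 2 /* highest version this hypothetical host speaks */

/* BMC side (assumed policy): never answer above what either side supports. */
static uint8_t bmc_choose_version(uint8_t host_version, uint8_t bmc_max)
{
	return host_version < bmc_max ? host_version : bmc_max;
}

/*
 * Host side: derive the block size in bytes from the GET_MBOX_INFO response.
 * Returns 0 if the host does not understand the version and must stop
 * communicating with the BMC.
 */
static uint32_t host_block_size(uint8_t bmc_version, uint8_t block_shift)
{
	if (bmc_version == 0 || bmc_version > HOST_MAX_VERSION)
		return 0;
	if (bmc_version == 1)
		return 4096; /* V1: hard coded at 4K */
	if (block_shift > 31)
		return 0;    /* would not fit in 32 bits */
	return (uint32_t)1 << block_shift; /* V2: encoded as a shift */
}
```

For example, a V2 response with a block size shift of 12 gives the same 4096 byte blocks as V1, while a shift of 16 gives 64K blocks.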

### Window Management

In order to access flash contents the host must request a window be opened at
the flash offset it would like to access. The host may give a hint as to how
much data it would like to access or otherwise set this argument to zero. The
BMC must respond with the LPC bus address to access this window and the
window size. The host must not access past the end of the active window.

There is only ever one active window which is the window created by the most
recent CREATE_READ_WINDOW or CREATE_WRITE_WINDOW call which succeeded. Even
though there are two types of windows there can still only be one active window
irrespective of type. A host must not write to a read window. A host may read
from a write window and the BMC must guarantee that the window reflects what
the host has written there.

A window can be closed by calling CLOSE_WINDOW in which case there is no active
window and the host must not access the LPC window after it has been closed.
If the host closes an active write window then the BMC must perform an
implicit flush. If the host tries to open a new window with an already active
window then the active window is closed (and implicitly flushed if it was a
write window). If the new window is successfully opened then it is the new
active window. If the command fails then there is no active window and the
previous active window must no longer be accessed.

The host must not access an LPC address other than that which is contained by
the active window. The host must not use write management functions (see below)
if the active window is a read window or if there is no active window.
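The host-side bookkeeping implied by these rules can be sketched as follows. The struct and function names are hypothetical; the sketch only tracks which LPC range the host is currently allowed to touch and does not talk to any hardware.

```c
#include <stdint.h>

enum window_type { WINDOW_NONE, WINDOW_READ, WINDOW_WRITE };

struct host_window {
	enum window_type type;
	uint32_t lpc_addr; /* LPC bus address of the window, in bytes */
	uint32_t size;     /* window size, in bytes */
};

/* No active window: the host must not touch the LPC window at all. */
static void window_closed(struct host_window *w)
{
	w->type = WINDOW_NONE;
	w->lpc_addr = 0;
	w->size = 0;
}

/*
 * Record the outcome of a CREATE_{READ,WRITE}_WINDOW command. A failed
 * create leaves no active window, even if one existed beforehand.
 */
static void window_created(struct host_window *w, int success,
			   enum window_type type, uint32_t lpc_addr,
			   uint32_t size)
{
	if (!success) {
		window_closed(w);
		return;
	}
	w->type = type;
	w->lpc_addr = lpc_addr;
	w->size = size;
}

/* May the host access [addr, addr + len) on the LPC bus right now? */
static int window_may_access(const struct host_window *w, uint32_t addr,
			     uint32_t len, int is_write)
{
	if (w->type == WINDOW_NONE)
		return 0;
	if (is_write && w->type != WINDOW_WRITE)
		return 0;
	if (addr < w->lpc_addr)
		return 0;
	if (len > w->size || addr - w->lpc_addr > w->size - len)
		return 0;
	return 1;
}
```

The bounds check is written with subtractions rather than `addr + len` so that it cannot overflow for addresses near the top of the 32-bit range.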

### Write Management

The BMC has no method for intercepting writes that occur over the LPC bus. Thus
the host must explicitly notify the BMC of where and when a write has
occurred. The host must use the MARK_WRITE_DIRTY command to tell the BMC where
within the write window it has modified. The host may also use the
MARK_WRITE_ERASED command to erase large parts of the active window without the
need to write 0xFF. The BMC must ensure that if the host
reads from an area it has erased that the read values are 0xFF. Any part of the
active window marked dirty/erased is only marked for the lifetime of the current
active write window and does not persist if the active window is closed either
implicitly or explicitly by the host or the BMC. The BMC may at any time,
and must on a call to WRITE_FLUSH, flush the changes which it has been notified
of back to the flash, at which point the dirty or erased marking is cleared
for the active window. The host must not assume that any changes have been
written to flash unless an explicit flush call was successful, a close of an
active write window was successful or a create window command with an active
write window was successful. Otherwise consistency between the flash and memory
contents cannot be guaranteed.

The host is not required to perform an erase before a write command and the
BMC must ensure that a write performs as expected - that is if an erase is
required before a write then the BMC must perform this itself.
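As an illustration of the bookkeeping this requires on the BMC side, the sketch below keeps one state byte per block of the active write window. It is a toy model under stated assumptions: `MAX_BLOCKS`, the struct and the function names are hypothetical, `nblocks` is assumed to be at most `MAX_BLOCKS`, and the actual flash programming is replaced by a counter.

```c
#include <stdint.h>
#include <string.h>

#define MAX_BLOCKS 64 /* illustrative upper bound on window size in blocks */

enum block_state { BLOCK_CLEAN = 0, BLOCK_DIRTY, BLOCK_ERASED };

struct write_window {
	uint16_t nblocks;          /* size of the active window in blocks */
	uint8_t state[MAX_BLOCKS]; /* one enum block_state per block */
};

/*
 * Mark count blocks starting at window offset as dirty or erased.
 * Fails (returns -1) if the range exceeds the active window.
 */
static int window_mark(struct write_window *w, uint16_t offset,
		       uint16_t count, enum block_state s)
{
	if (offset > w->nblocks || count > w->nblocks - offset)
		return -1;
	memset(&w->state[offset], s, count);
	return 0;
}

/* Flush: handle every marked block, then clear the marking. */
static unsigned int window_flush(struct write_window *w)
{
	unsigned int flushed = 0;

	for (uint16_t i = 0; i < w->nblocks; i++) {
		if (w->state[i] != BLOCK_CLEAN) {
			/* A real daemon would erase and/or program flash here. */
			flushed++;
			w->state[i] = BLOCK_CLEAN;
		}
	}
	return flushed;
}
```

A second flush with no intervening marks finds nothing to do, which mirrors the rule that the markings are cleared once the changes have been written back.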

### BMC Events

The BMC can raise events with the host asynchronously to communicate to the
host a change in state which it should take notice of. The host must (if
possible for the given event) acknowledge it to inform the BMC it has been
received.

If the BMC raises a BMC Reboot event then the host must renegotiate the
protocol version so that both the BMC and the host agree on the block size.
A BMC Reboot event implies a BMC Windows Reset event.
If the BMC raises a BMC Windows Reset event then the host must
assume that there is no longer an active window - that is if there was an
active window it has been closed by the BMC and if it was a write window
then the host must not assume that it was flushed unless a previous explicit
flush call was successful.

The BMC may at some points require access to the flash and the BMC daemon must
set the BMC Flash Control Lost event when the BMC is accessing the flash behind
the BMC daemon's back. When this event is set the host must assume that the
contents of the active window could be inconsistent with the contents of flash.

## Protocol Definition

### Commands

```
RESET_STATE          0x01
GET_MBOX_INFO        0x02
GET_FLASH_INFO       0x03
CREATE_READ_WINDOW   0x04
CLOSE_WINDOW         0x05
CREATE_WRITE_WINDOW  0x06
MARK_WRITE_DIRTY     0x07
WRITE_FLUSH          0x08
BMC_EVENT_ACK        0x09
MARK_WRITE_ERASED    0x0a	(V2)
```
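The command codes map naturally onto a C enum. The sketch below also encodes two rules stated elsewhere in this document: only GET_MBOX_INFO, RESET_STATE and BMC_EVENT_ACK are allowed before version negotiation, and MARK_WRITE_ERASED exists only in V2. The `MBOX_C_` prefix and the helper names are illustrative assumptions.

```c
#include <stdint.h>

/* Command codes from the table above; the MBOX_C_ prefix is illustrative. */
enum mbox_command {
	MBOX_C_RESET_STATE         = 0x01,
	MBOX_C_GET_MBOX_INFO       = 0x02,
	MBOX_C_GET_FLASH_INFO      = 0x03,
	MBOX_C_CREATE_READ_WINDOW  = 0x04,
	MBOX_C_CLOSE_WINDOW        = 0x05,
	MBOX_C_CREATE_WRITE_WINDOW = 0x06,
	MBOX_C_MARK_WRITE_DIRTY    = 0x07,
	MBOX_C_WRITE_FLUSH         = 0x08,
	MBOX_C_BMC_EVENT_ACK       = 0x09,
	MBOX_C_MARK_WRITE_ERASED   = 0x0a  /* V2 only */
};

/* May cmd be issued before version negotiation has completed? */
static int allowed_before_negotiation(uint8_t cmd)
{
	return cmd == MBOX_C_GET_MBOX_INFO ||
	       cmd == MBOX_C_RESET_STATE ||
	       cmd == MBOX_C_BMC_EVENT_ACK;
}

/* Is cmd defined for the negotiated protocol version (1 or 2)? */
static int command_valid(uint8_t cmd, uint8_t version)
{
	if (cmd < MBOX_C_RESET_STATE || cmd > MBOX_C_MARK_WRITE_ERASED)
		return 0;
	if (cmd == MBOX_C_MARK_WRITE_ERASED)
		return version >= 2;
	return version >= 1;
}
```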

### Sequence

The host must ensure a unique sequence number at the start of a
command/response pair. The BMC must ensure the responses to
a particular message contain the same sequence number that was in the
command request from the host.

### Responses

```
SUCCESS		1
PARAM_ERROR	2
WRITE_ERROR	3
SYSTEM_ERROR	4
TIMEOUT		5
BUSY		6	(V2)
WINDOW_ERROR	7	(V2)
```

#### Description:

SUCCESS		- Command completed successfully

PARAM_ERROR	- Error with parameters supplied or command invalid

WRITE_ERROR	- Error writing to the backing file system

SYSTEM_ERROR	- Error in BMC performing system action

TIMEOUT		- Timeout in performing action

BUSY		- Daemon in suspended state (currently unable to access flash)
		- Retry again later

WINDOW_ERROR	- Command not valid for active window or no active window
		- Try opening an appropriate window and retrying the command
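A host could turn these response codes into a next step along the following lines. This is a sketch only: the `MBOX_R_` names and the `host_action` enum are hypothetical, and treating every other error as a plain failure is an assumption rather than something the protocol prescribes.

```c
#include <stdint.h>

/* Response codes from the table above; the MBOX_R_ prefix is illustrative. */
enum mbox_response {
	MBOX_R_SUCCESS      = 1,
	MBOX_R_PARAM_ERROR  = 2,
	MBOX_R_WRITE_ERROR  = 3,
	MBOX_R_SYSTEM_ERROR = 4,
	MBOX_R_TIMEOUT      = 5,
	MBOX_R_BUSY         = 6, /* V2 */
	MBOX_R_WINDOW_ERROR = 7  /* V2 */
};

enum host_action {
	HOST_DONE,          /* command completed */
	HOST_RETRY_LATER,   /* daemon suspended, try again later */
	HOST_REOPEN_WINDOW, /* open an appropriate window, then retry */
	HOST_FAIL           /* report the error to the caller */
};

/* Map the response code in byte 13 to what the host should do next. */
static enum host_action host_next_action(uint8_t code)
{
	switch (code) {
	case MBOX_R_SUCCESS:
		return HOST_DONE;
	case MBOX_R_BUSY:
		return HOST_RETRY_LATER;
	case MBOX_R_WINDOW_ERROR:
		return HOST_REOPEN_WINDOW;
	default:
		return HOST_FAIL;
	}
}
```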

### Information
- All multibyte messages are LSB first (little endian)
- All responses must have a valid return code in byte 13
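Because multibyte arguments are little endian, it is safest to pack and unpack them byte by byte rather than casting register memory to wider types. A minimal sketch, with hypothetical helper names:

```c
#include <stdint.h>

/* Store a 16-bit value LSB first. */
static void put_le16(uint8_t *p, uint16_t v)
{
	p[0] = (uint8_t)(v & 0xff);
	p[1] = (uint8_t)(v >> 8);
}

/* Load a 16-bit value stored LSB first. */
static uint16_t get_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/* Store a 32-bit value LSB first. */
static void put_le32(uint8_t *p, uint32_t v)
{
	for (int i = 0; i < 4; i++)
		p[i] = (uint8_t)(v >> (8 * i));
}

/* Load a 32-bit value stored LSB first. */
static uint32_t get_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}
```

These helpers behave the same regardless of the endianness of the CPU they run on.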

### Commands in detail

Note in V1 block size is hard coded to 4K, in V2 it is variable and must be
queried with GET_MBOX_INFO.
Sizes and addresses are specified in either bytes, marked (bytes), or blocks,
marked (blocks).
Sizes and addresses specified in blocks must be converted to bytes by
multiplying by the block size.
```
Command:
	RESET_STATE
	Implemented in Versions:
		V1, V2
	Arguments:
		-
	Response:
		-
	Notes:
		This command is designed to inform the BMC that it should put
		host LPC mapping back in a state where the SBE will be able to
		use it. Currently this means pointing back to BMC flash
		pre mailbox protocol. Final behaviour is still TBD.

Command:
	GET_MBOX_INFO
	Implemented in Versions:
		V1, V2
	Arguments:
		V1:
		Args 0: API version

		V2:
		Args 0: API version

	Response:
		V1:
		Args 0: API version
		Args 1-2: default read window size (blocks)
		Args 3-4: default write window size (blocks)

		V2:
		Args 0: API version
		Args 1-2: reserved
		Args 3-4: reserved
		Args 5: Block size as power of two (encoded as a shift)

Command:
	GET_FLASH_INFO
	Implemented in Versions:
		V1, V2
	Arguments:
		-
	Response:
		V1:
		Args 0-3: Flash size (bytes)
		Args 4-7: Erase granule (bytes)

		V2:
		Args 0-1: Flash size (blocks)
		Args 2-3: Erase granule (blocks)

Command:
	CREATE_{READ/WRITE}_WINDOW
	Implemented in Versions:
		V1, V2
	Arguments:
		V1:
		Args 0-1: Window location as offset into flash (blocks)

		V2:
		Args 0-1: Window location as offset into flash (blocks)
		Args 2-3: Requested window size (blocks)

	Response:
		V1:
		Args 0-1: LPC bus address of window (blocks)

		V2:
		Args 0-1: LPC bus address of window (blocks)
		Args 2-3: Actual window size (blocks)
		Args 4-5: Actual window location as offset into flash (blocks)
	Notes:
		Window location is always given as an offset into flash as
		taken from the start of flash - that is it is an absolute
		address.

		LPC bus address is always given from the start of the LPC
		address space - that is it is an absolute address.

		The requested window size is only a hint. The response
		indicates the actual size of the window. The BMC may
		want to use the requested size to pre-load the remainder
		of the request. The host must not access past the end of the
		active window.

		The actual window location indicates the absolute flash offset
		that the window actually maps and is not required to be equal
		to the flash offset requested by the host, but must be
		less than or equal to it. Thus the first block of the window at
		the LPC address in the response will map the first block at the
		actual flash offset also contained in the response. It is the
		responsibility of the host to use this information to access
		any offset which is required.

		The requested window size may be zero. In this case the
		BMC is free to create any sized window but it must contain
		at least the first block of data requested by the host. A large
		window is of course preferred and should correspond to
		the default size returned in the GET_MBOX_INFO command.

		If this command returns successfully then the window which the
		host requested is the active window. If it fails then there is
		no active window.

Command:
	CLOSE_WINDOW
	Implemented in Versions:
		V1, V2
	Arguments:
		V1:
		-

		V2:
		Args 0: Flags
	Response:
		-
	Notes:
		Closes the active window. Any further access to the LPC bus
		address specified to address the previously active window will
		have undefined effects. If the active window is a
		write window then the BMC must perform an implicit flush.

		The Flags argument allows the host to provide some
		hints to the BMC. Defined Values:
			0x01 - Short Lifetime:
				The window is unlikely to be accessed
				anytime again in the near future. The effect of
				this will depend on BMC implementation. In
				the event that the BMC performs some caching
				the BMC daemon could mark data contained in a
				window closed with this flag as first to be
				evicted from the cache.

Command:
	MARK_WRITE_DIRTY
	Implemented in Versions:
		V1, V2
	Arguments:
		V1:
		Args 0-1: Flash offset to mark from base of flash (blocks)
		Args 2-5: Number to mark dirty at offset (bytes)

		V2:
		Args 0-1: Window offset to mark (blocks)
		Args 2-3: Number to mark dirty at offset (blocks)

	Response:
		-
	Notes:
		The BMC has no method for intercepting writes that
		occur over the LPC bus. The host must explicitly notify
		the daemon of where and when a write has occurred so it
		can be flushed to backing storage.

		Offsets are given as an absolute (either into flash (V1) or the
		active window (V2)) and a zero offset refers to the first
		block. If the offset + number exceeds the size of the active
		window then the command must not succeed.

Command:
	WRITE_FLUSH
	Implemented in Versions:
		V1, V2
	Arguments:
		V1:
		Args 0-1: Flash offset to mark from base of flash (blocks)
		Args 2-5: Number to mark dirty at offset (bytes)

		V2:
		-

	Response:
		-
	Notes:
		Flushes any dirty/erased blocks in the active window to
		the backing storage.

		In V1 this can also be used to mark parts of the flash
		dirty and flush in a single command. In V2 the explicit
		mark dirty command must be used before a call to flush
		since there are no longer any arguments. If the offset + number
		exceeds the size of the active window then the command must not
		succeed.

Command:
	BMC_EVENT_ACK
	Implemented in Versions:
		V1, V2
	Arguments:
		Args 0:	Bits in the BMC status byte (mailbox data
			register 15) to ack
	Response:
		*clears the bits in mailbox data register 15*
	Notes:
		The host should use this command to acknowledge BMC events
		supplied in mailbox register 15.

Command:
	MARK_WRITE_ERASED
	Implemented in Versions:
		V2
	Arguments:
		V2:
		Args 0-1: Window offset to erase (blocks)
		Args 2-3: Number to erase at offset (blocks)
	Response:
		-
	Notes:
		This command allows the host to erase a large area
		without the need to individually write 0xFF
		repetitively.

		Offset is the offset within the active window to start erasing
		from (zero refers to the first block of the active window) and
		number is the number of blocks of the active window to erase
		starting at offset. If the offset + number exceeds the size of
		the active window then the command must not succeed.
```
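The notes on CREATE_{READ/WRITE}_WINDOW leave it to the host to translate a flash offset into an LPC address using the response. The sketch below shows that arithmetic for a V2 response. The struct and function names are hypothetical, `args` is assumed to point at mailbox data register 2, and the block size shift is assumed to be at most 16 so that the byte values fit in 32 bits.

```c
#include <stdint.h>

/* Decoded V2 CREATE_{READ,WRITE}_WINDOW response, converted to bytes. */
struct window_resp {
	uint32_t lpc_addr;     /* LPC bus address of the window */
	uint32_t size;         /* actual window size */
	uint32_t flash_offset; /* actual flash offset the window maps */
};

/* Load a 16-bit little-endian value. */
static uint16_t le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/* Decode the response arguments, converting blocks to bytes. */
static void parse_window_resp(const uint8_t *args, uint8_t block_shift,
			      struct window_resp *r)
{
	r->lpc_addr = (uint32_t)le16(args) << block_shift;
	r->size = (uint32_t)le16(args + 2) << block_shift;
	r->flash_offset = (uint32_t)le16(args + 4) << block_shift;
}

/*
 * Translate an absolute flash byte offset into an LPC bus address.
 * Returns 0 on success, -1 if the offset is outside the window.
 */
static int flash_to_lpc(const struct window_resp *r, uint32_t flash_off,
			uint32_t *lpc)
{
	if (flash_off < r->flash_offset ||
	    flash_off - r->flash_offset >= r->size)
		return -1;
	*lpc = r->lpc_addr + (flash_off - r->flash_offset);
	return 0;
}
```

Because the actual window location may be lower than the offset the host asked for, the host should always compute addresses from the returned flash offset rather than from its own request.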

### BMC Events in Detail:

If the BMC needs to tell the host something then it simply
writes to Byte 15. The host should have interrupts enabled
on that register, or otherwise be polling it.

#### Bit Definitions:

Events which should be ACKed:
```
0x01: BMC Reboot
0x02: BMC Windows Reset (V2)
```

Events which cannot be ACKed (BMC will clear when no longer
applicable):
```
0x40: BMC Flash Control Lost (V2)
0x80: BMC MBOX Daemon Ready (V2)
```

#### Event Description:

Events which must be ACKed:
The host should acknowledge these events with BMC_EVENT_ACK to
let the BMC know that they have been received and understood.
```
0x01 - BMC Reboot:
	Used to inform the host that a BMC reboot has occurred.
	The host must perform protocol version negotiation again and
	must assume it has no active window. The host must not assume
	that any command succeeded unless it received a response
	saying so.
0x02 - BMC Windows Reset: (V2)
	The host must assume that its active window has been closed and
	that it no longer has an active window. The host is not
	required to perform protocol version negotiation again. The
	host must not assume that any command succeeded unless it
	received a response saying so.
```

Events which cannot be ACKed:
These events cannot be acknowledged by the host and a call to
BMC_EVENT_ACK with these bits set will have no effect. The BMC
will clear these bits when they are no longer applicable.
```
0x40 - BMC Flash Control Lost: (V2)
	The BMC daemon has been suspended and thus no longer
	controls access to the flash (most likely because some
	other process on the BMC required direct access to the
	flash and has suspended the BMC daemon to preclude
	concurrent access).
	The BMC daemon must clear this bit itself when it regains
	control of the flash (the host isn't able to clear it
	through an acknowledge command).
	The host must not assume that the contents of the active window
	correctly reflect the contents of flash while this bit is set.
0x80 - BMC MBOX Daemon Ready: (V2)
	Used to inform the host that the BMC daemon is ready to
	accept command requests. The host isn't able to clear
	this bit through an acknowledge command, the BMC daemon must
	clear it before it terminates (assuming it didn't
	terminate unexpectedly).
	The host should not expect a response while this bit is
	not set.
	Note that this bit being set is not a guarantee that the BMC daemon
	will respond as it or the BMC may have crashed without clearing
	it.
```
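A host-side handler for the BMC status byte could look like the sketch below. The `EVT_*` names, the `host_state` struct and both functions are hypothetical. The handler updates the host's view of the protocol state and returns the bits that should be passed to BMC_EVENT_ACK, leaving the non-acknowledgeable bits alone.

```c
#include <stdint.h>

/* Bits of mailbox data register 15 (names are illustrative). */
#define EVT_REBOOT             0x01
#define EVT_WINDOWS_RESET      0x02 /* V2 */
#define EVT_FLASH_CONTROL_LOST 0x40 /* V2, cleared by the BMC */
#define EVT_DAEMON_READY       0x80 /* V2, cleared by the BMC */
#define EVT_ACKABLE            (EVT_REBOOT | EVT_WINDOWS_RESET)

struct host_state {
	int negotiated;  /* protocol version has been agreed */
	int have_window; /* there is an active window */
};

/*
 * Apply the events in the BMC status byte to the host's state and
 * return the bits to send as the argument of BMC_EVENT_ACK.
 */
static uint8_t host_handle_events(struct host_state *s, uint8_t status)
{
	if (status & EVT_REBOOT) {
		/* Reboot implies Windows Reset and forces renegotiation. */
		s->negotiated = 0;
		s->have_window = 0;
	}
	if (status & EVT_WINDOWS_RESET)
		s->have_window = 0;
	return status & EVT_ACKABLE;
}

/* The host should not expect a response unless the daemon is ready. */
static int bmc_daemon_ready(uint8_t status)
{
	return (status & EVT_DAEMON_READY) != 0;
}
```

After a BMC Reboot event the host would issue GET_MBOX_INFO again before creating any new window.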