# Management Component Transport Protocol (MCTP) LPC Transport Binding Specification for ASPEED BMC Systems

## Scope

This design provides an efficient method to transfer MCTP packets between the
host and BMC over the LPC bus on ASPEED BMC platforms.

## References

The following referenced documents are indispensable for the application of
this document.

1. DMTF DSP0236, Management Component Transport Protocol (MCTP) Base
Specification 1.0,
http://www.dmtf.org/standards/published_documents/DSP0236_1.0.pdf

2. Intel (R) Low Pin Count (LPC) Interface Specification 1.1,
https://www.intel.com/content/dam/www/program/design/us/en/documents/low-pin-count-interface-specification.pdf

3. IPMI Consortium, Intelligent Platform Management Interface Specification,
v1.5 Revision 1.1 February 20, 2002,
http://download.intel.com/design/servers/ipmi/IPMIv1_5rev1_1.pdf

## Definitions

**BTU: Baseline Transmission Unit**

Defined by the MCTP base specification as the smallest maximum packet size all
MCTP-compliant endpoints must accept.

**KCS: Keyboard-Controller-Style**

A set of bit definitions and operation of the registers typically used in
keyboard microcontrollers and embedded controllers. The term "Keyboard
Controller Style" reflects that the register definition was originally used as
the legacy "8742" keyboard controller interface in PC architecture computer
systems. This interface is available built-in to several commercially
available microcontrollers. Data is transferred across the KCS interface using
a per-byte handshake.

**LPC Bus: Low Pin Count Bus**

A bus specification that implements the ISA bus in a reduced physical form
while extending ISA's capabilities.

**LPC FW: LPC Firmware Cycles**

LPC firmware cycles allow separate boot BIOS firmware memory cycles and
application memory cycles with respect to the LPC bus. The ASPEED BMCs allow
remapping of the LPC firmware cycles onto arbitrary regions of the BMC's
physical address space, including RAM.

**MTU: Maximum Transmission Unit**

The largest payload the link will accept for a packet. The Maximum Transmission
Unit represents a value that is at least as large as the BTU. Negotiation of
MTU values larger than the BTU may improve throughput for data-intensive
transfers.

## Conventions

Where unspecified, state, command and sequence descriptions apply to all
versions of the protocol unless marked otherwise.

## MCTP over LPC Transport

### Concepts

The basic components used for the transfer are:

* An interrupt mechanism using the IPMI KCS interface
* A window of the LPC FW address space, where reads and writes are forwarded to
  BMC memory, using the LPC2AHB hardware

In order to transfer a packet, either side of the channel (BMC or host) will:

1. Write the packet to the LPC FW window
   * The BMC will perform writes by writing to the memory backing the LPC
     window
   * The host will perform writes by writing to the LPC bus, at predefined
     addresses
2. Trigger an interrupt on the remote side, by writing to the KCS data buffer

On this indication, the remote side will:

1. Read from the KCS status register, which shows that the single-byte KCS data
   buffer is full
2. Read the MCTP packet from the LPC FW window
3. Read from the KCS buffer, to clear the 'buffer full' state.

### Scope

The document limits itself to describing the operation of the binding protocol.
The following issues of protocol ABI are considered out of scope:

1. The LPC IO address and Serial IRQ parameters of the KCS device
2. The concrete location of the control region in the LPC FW address space

### KCS Interface

The KCS hardware on the ASPEED BMCs is used as a method of indicating, to the
remote side, that a packet is ready to be transferred through the LPC FW
mapping.

The KCS hardware consists of two single-byte buffers: the Output Data Register
(ODR) and the Input Data Register (IDR). The ODR is written by the BMC and read
by the host. The IDR is the converse: it is written by the host and read by the
BMC.

The KCS unit also contains a status register, allowing both host and BMC to
determine if there is data in the IDR or ODR. These are single-bit flags,
designated Input/Output Buffer Full (IBF/OBF), and are automatically set by
hardware when data has been written to the corresponding IDR/ODR buffer (and
cleared when data has been read).

While the IBF and OBF flags are managed in hardware, the remaining
software-defined bits in the status register are used to carry other required
protocol state. A problematic feature of the KCS status register is described
in the IPMI specification, which states that an interrupt may be triggered on
writes to the KCS status register but hardware implementations are not required
to do so. In contrast, writes to the data registers must set the
corresponding buffer-full flag and invoke an interrupt.

To ensure interrupts are generated for status updates, we exploit the OBF
interrupt to signal a status update by writing a dummy command to ODR after
updating the status register, as outlined below.

### LPC FW Window

The window of BMC-memory-backed LPC FW address space has a predefined format,
consisting of:

* A control descriptor, describing static data about the rest of the window
* A receive area, for BMC-to-host packets
* A transmit area, for host-to-BMC packets

The control descriptor contains a version, and offset and size data for the
transmit and receive areas. These offsets are relative to the start of the LPC
FW window.

The full definition of the control area is given below, and it will be the base
for all future versions.

```
#include <stdint.h>

struct mctp_lpcmap_hdr {
   uint32_t magic;            /* LPC_MAGIC, marks the start of the control area */

   /* Protocol versions advertised by the BMC */
   uint16_t bmc_ver_min;
   uint16_t bmc_ver_cur;
   /* Protocol versions advertised by the host */
   uint16_t host_ver_min;
   uint16_t host_ver_cur;
   uint16_t negotiated_ver;   /* written by the BMC during initialisation */
   uint16_t pad0;

   /* Offsets are relative to the start of the LPC FW window */
   uint32_t rx_offset;
   uint32_t rx_size;
   uint32_t tx_offset;
   uint32_t tx_size;
} __attribute__((packed));
```

Where the magic value marking the beginning of the control area is the ASCII
encoding of "MCTP":

```
#define LPC_MAGIC 0x4d435450
```
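
As a quick check of the value above, the following minimal sketch assembles the
four ASCII bytes of "MCTP" in big-endian order, the byte order used for control
data, and compares the result with `LPC_MAGIC`. The helper names `be32_load`
and `lpc_window_has_magic` are hypothetical and do not come from any library.

```c
#include <stdbool.h>
#include <stdint.h>

#define LPC_MAGIC 0x4d435450

/* Hypothetical helper: assemble a big-endian 32-bit value from four bytes. */
static uint32_t be32_load(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Returns true if the first four bytes of the window hold the magic value. */
static bool lpc_window_has_magic(const uint8_t *window)
{
    return be32_load(window) == LPC_MAGIC;
}
```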

The transmit and receive areas contain a length field, followed by the actual
MCTP packet to be transferred. At version 1, only a single MCTP packet is
present in the Rx and Tx areas. This may change for future versions of the
protocol.

All control data is in big-endian format. MCTP packet data is transferred
exactly as presented, and no data escaping is performed.
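
The area format described above can be sketched as follows. The sketch assumes
a 4-byte length field, since the text does not state its width, and the helper
names (`be32_store`, `be32_load`, `area_write_packet`, `area_read_packet`) are
hypothetical. The packet bytes are copied unmodified, matching the statement
that no escaping is performed.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: the length field is 4 bytes wide; the text leaves this open. */
#define LEN_FIELD_SIZE 4u

/* Store a 32-bit value in big-endian byte order. */
static void be32_store(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Load a 32-bit value stored in big-endian byte order. */
static uint32_t be32_load(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Write one packet into an area: the length field first, then the packet
 * bytes as-is. Returns 0 on success, -1 if the packet does not fit.
 */
static int area_write_packet(uint8_t *area, size_t area_size,
                             const uint8_t *pkt, uint32_t len)
{
    if (area_size < LEN_FIELD_SIZE || len > area_size - LEN_FIELD_SIZE)
        return -1;
    be32_store(area, len);
    memcpy(area + LEN_FIELD_SIZE, pkt, len);
    return 0;
}

/*
 * Read the packet back out of an area. Returns the packet length, or -1 if
 * the stored length does not fit the area or the output buffer.
 */
static long area_read_packet(const uint8_t *area, size_t area_size,
                             uint8_t *out, size_t out_size)
{
    uint32_t len;

    if (area_size < LEN_FIELD_SIZE)
        return -1;
    len = be32_load(area);
    if (len > area_size - LEN_FIELD_SIZE || len > out_size)
        return -1;
    memcpy(out, area + LEN_FIELD_SIZE, len);
    return (long)len;
}
```

Checking the stored length against the area size on the read path is a design
choice in this sketch: the window is shared memory, so the reader should not
trust the length written by the remote side.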

#### Negotiation of the Maximum Transmission Unit

Version 1 of the protocol offers no mechanism for negotiation of the maximum
transmission unit. The Rx and Tx buffers must be sized to accommodate packets
up to the Baseline Transmission Unit, and the implementation assumes that the
MTU is set to the BTU regardless of the values of `rx_size` and `tx_size`.

Version 2 of the protocol exploits the `rx_size` and `tx_size` fields in the
control region to negotiate the link MTU. Note that at the time that the MTU is
under negotiation the protocol version has not been finalised, so the process
is necessarily backwards-compatible.

The relevant property that each endpoint must control is the MTU of packets it
will receive, as this governs how the remote endpoint's packetisation impacts
memory pressure at the local endpoint. As such, while the BMC MUST populate
`rx_size` for backwards compatibility with version 1, the host MAY write
`rx_size` without regard for its current value if the host supports version 2.
The BMC controls the value of `tx_size`, and MAY choose to adjust it in
response to the host's proposed `rx_size` value. As such, when `Channel Active`
is set by the BMC, the host MUST read both `rx_size` and `tx_size` in response
to ensure both the BMC and the host have a consistent understanding of the MTU
in each direction. It is convention for `rx_size` and `tx_size` to be set to
the same value by the BMC as part of finalising the channel, though it is not
invalid to have asymmetric MTUs.

For all protocol versions, the Rx and Tx buffers are only considered valid if
they:

* Intersect neither each other nor the control region
* Do not extend beyond the window allocated to MCTP in the LPC FW address space
* Accommodate at least BTU-sized payloads

The BMC MAY choose to fail channel initialisation if these properties are
violated in the negotiation process.
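
The three validity properties above could be checked along the following
lines. This is a sketch under stated assumptions rather than the reference
implementation: it assumes the control descriptor sits at offset 0 of the
window and occupies the 32 bytes of `struct mctp_lpcmap_hdr`, and that each
area must hold a 4-byte length field and the 4-byte MCTP packet header in
addition to a 64-byte BTU payload. The `struct region` type and the function
names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define CTRL_SIZE 32u /* 4 + 6 * 2 + 4 * 4 bytes in struct mctp_lpcmap_hdr */
#define MCTP_BTU  64u /* baseline payload size */
/* Assumption: 4-byte length field plus 4-byte MCTP header precede the payload. */
#define MIN_AREA_SIZE (4u + 4u + MCTP_BTU)

/* Hypothetical description of one area inside the LPC FW window. */
struct region {
    uint32_t offset;
    uint32_t size;
};

/* True if the two regions share at least one byte (64-bit maths avoids wrap). */
static bool regions_overlap(struct region a, struct region b)
{
    uint64_t a_end = (uint64_t)a.offset + a.size;
    uint64_t b_end = (uint64_t)b.offset + b.size;

    return a.offset < b_end && b.offset < a_end;
}

/* True if the region lies entirely inside a window of window_size bytes. */
static bool region_in_window(struct region r, uint32_t window_size)
{
    return (uint64_t)r.offset + r.size <= window_size;
}

/* Apply all three properties to a proposed Rx/Tx layout. */
static bool layout_is_valid(struct region rx, struct region tx,
                            uint32_t window_size)
{
    const struct region ctrl = { 0, CTRL_SIZE }; /* assumed at offset 0 */

    if (rx.size < MIN_AREA_SIZE || tx.size < MIN_AREA_SIZE)
        return false;
    if (!region_in_window(rx, window_size) ||
        !region_in_window(tx, window_size))
        return false;
    if (regions_overlap(rx, ctrl) || regions_overlap(tx, ctrl))
        return false;
    return !regions_overlap(rx, tx);
}
```

For example, with these assumptions an Rx area at offset 32 and a Tx area at
offset 104, both 72 bytes long, are valid in a 176-byte window but not in a
175-byte one.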

### KCS Status and Control Sequences

The KCS status flags and command set govern the state of the protocol, defining
the ability to send and receive packets on the LPC bus.

#### KCS Status Register Layout

| Bit | Managed By | Description |
|-----|------------|-------------|
|  7  |  Software  | (MSB) BMC Active  |
|  6  |  Software  | Channel active, version negotiated |
|  5  |  Software  | Unused      |
|  4  |  Software  | Unused      |
|  3  |  Hardware  | Command / Data |
|  2  |  Software  | Unused      |
|  1  |  Hardware  | Input Buffer Full |
|  0  |  Hardware  | (LSB) Output Buffer Full |
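
The register layout above maps naturally onto bit masks. The sketch below uses
hypothetical macro and function names; only the bit positions are taken from
the table.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for the bits listed in the table. */
#define KCS_STATUS_BMC_ACTIVE     (1u << 7)
#define KCS_STATUS_CHANNEL_ACTIVE (1u << 6)
#define KCS_STATUS_CD             (1u << 3)
#define KCS_STATUS_IBF            (1u << 1)
#define KCS_STATUS_OBF            (1u << 0)

/* Packet transfer requires both software-managed flags to be set. */
static bool kcs_channel_usable(uint8_t status)
{
    const uint8_t need = KCS_STATUS_BMC_ACTIVE | KCS_STATUS_CHANNEL_ACTIVE;

    return (status & need) == need;
}

/* True if the host has written a byte to IDR that the BMC has not yet read. */
static bool kcs_input_pending(uint8_t status)
{
    return (status & KCS_STATUS_IBF) != 0;
}

/* True if the BMC has written a byte to ODR that the host has not yet read. */
static bool kcs_output_pending(uint8_t status)
{
    return (status & KCS_STATUS_OBF) != 0;
}
```

A status value of `0xc1`, for instance, decodes as BMC active, channel active
and OBF set, with no input pending.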

#### KCS Data Register Commands

| Command | Description |
|---------|-------------|
|  0x00   | Initialise  |
|  0x01   | Tx Begin    |
|  0x02   | Rx Complete |
|  0xff   | Dummy Value |

#### Host Command to BMC Sequence

The host sends commands to the BMC to signal channel initialisation, begin
transmission of a packet, or to complete reception of a packet.

| Step | Description                                             |
|------|---------------------------------------------------------|
|  1   | The host writes a command value to IDR                  |
|  2   | The hardware sets IBF, which triggers a BMC interrupt   |
|  3   | The BMC reads the status register for IBF               |
|  4   | If IBF is set, the BMC reads the host command from IDR  |
|  5   | The interrupt is acknowledged by the data register read |

#### BMC Command to Host Sequence

The BMC sends commands to the host to begin transmission of a packet or to
complete reception of a packet.

| Step | Description                                             |
|------|---------------------------------------------------------|
|  1   | The BMC writes a command value to ODR                   |
|  2   | The hardware sets OBF, which triggers a host interrupt  |
|  3   | The host reads the status register for OBF              |
|  4   | If OBF is set, the host reads the BMC command from ODR  |
|  5   | The interrupt is acknowledged by the data register read |
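
The two command sequences above are mirror images of each other. The sketch
below illustrates them against a software model of the KCS registers; on real
hardware the buffer-full flags are set and cleared by the KCS unit itself, and
the register accessors would be driver or firmware specific. The
`struct kcs_model` type and all function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define KCS_STATUS_IBF 0x02
#define KCS_STATUS_OBF 0x01

/* Command values from the KCS Data Register Commands table. */
enum kcs_cmd {
    KCS_CMD_INIT = 0x00,
    KCS_CMD_TX_BEGIN = 0x01,
    KCS_CMD_RX_COMPLETE = 0x02,
    KCS_CMD_DUMMY = 0xff,
};

/* Software stand-in for the KCS registers, for illustration only. */
struct kcs_model {
    uint8_t idr;
    uint8_t odr;
    uint8_t status;
};

/* Host, steps 1-2: write IDR; the model sets IBF as the hardware would. */
static void host_write_idr(struct kcs_model *k, uint8_t cmd)
{
    k->idr = cmd;
    k->status |= KCS_STATUS_IBF;
}

/* BMC, steps 3-5: check IBF, read IDR; the read clears IBF. */
static bool bmc_read_idr(struct kcs_model *k, uint8_t *cmd)
{
    if (!(k->status & KCS_STATUS_IBF))
        return false;
    *cmd = k->idr;
    k->status &= (uint8_t)~KCS_STATUS_IBF;
    return true;
}

/* BMC, steps 1-2: write ODR; the model sets OBF as the hardware would. */
static void bmc_write_odr(struct kcs_model *k, uint8_t cmd)
{
    k->odr = cmd;
    k->status |= KCS_STATUS_OBF;
}

/* Host, steps 3-5: check OBF, read ODR; the read clears OBF. */
static bool host_read_odr(struct kcs_model *k, uint8_t *cmd)
{
    if (!(k->status & KCS_STATUS_OBF))
        return false;
    *cmd = k->odr;
    k->status &= (uint8_t)~KCS_STATUS_OBF;
    return true;
}
```

In this model a `Tx Begin` written by the host is seen exactly once by the BMC:
a second call to `bmc_read_idr` returns `false` because the first read cleared
IBF.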

#### BMC Status Update Sequence

The BMC sends status updates to the host to signal loss of function, loss of
channel state, or the presence of a command in the KCS data register.

| Step | Description                                                    |
|------|----------------------------------------------------------------|
|  1   | The BMC writes the status value to the status register         |
|  2   | The BMC writes the dummy command to ODR                        |
|  3   | The hardware sets OBF, which triggers a host interrupt         |
|  4   | If OBF is set, the host reads the BMC command from ODR         |
|  5   | The interrupt is acknowledged by the data register read        |
|  6   | The host observes the command is the dummy command             |
|  7   | The host reads the status register to capture the state change |
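
The status update sequence can be sketched in the same style. The model below
is again a software stand-in for the KCS registers with hypothetical names; it
shows the BMC updating only the software-managed bits and then writing the
dummy value so that an OBF interrupt is raised, and the host distinguishing a
status update from an ordinary command.

```c
#include <stdbool.h>
#include <stdint.h>

#define KCS_STATUS_BMC_ACTIVE 0x80
#define KCS_STATUS_OBF        0x01
/* Software-managed bits per the status register layout: 7, 6, 5, 4 and 2. */
#define KCS_STATUS_SW_MASK    0xf4
#define KCS_CMD_DUMMY         0xff

/* Software stand-in for the relevant KCS registers, for illustration only. */
struct kcs_model {
    uint8_t odr;
    uint8_t status;
};

/* BMC, steps 1-3: update the software bits, then write the dummy command. */
static void bmc_update_status(struct kcs_model *k, uint8_t sw_bits)
{
    k->status = (uint8_t)((k->status & ~KCS_STATUS_SW_MASK) |
                          (sw_bits & KCS_STATUS_SW_MASK));
    k->odr = KCS_CMD_DUMMY;
    k->status |= KCS_STATUS_OBF; /* set by hardware on a real device */
}

/*
 * Host, steps 4-7: read ODR if OBF is set (which clears OBF) and, if the
 * value is the dummy command, capture the status register. Returns true only
 * when a status update was observed.
 */
static bool host_handle_obf(struct kcs_model *k, uint8_t *status_out)
{
    uint8_t cmd;

    if (!(k->status & KCS_STATUS_OBF))
        return false;
    cmd = k->odr;
    k->status &= (uint8_t)~KCS_STATUS_OBF;
    if (cmd != KCS_CMD_DUMMY)
        return false; /* an ordinary command, handled elsewhere */
    *status_out = k->status;
    return true;
}
```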

#### LPC Window Ownership and Synchronisation

Because the LPC FW window is shared between the host and the BMC we need
strict rules on which entity is allowed to access it at specific times.

Firstly, we have rules for modification:

* The control data is only written during initialisation. The control area
  is never modified once the channel is active.
* Only the BMC may write to the Rx buffer described in the control area
* Only the host may write to the Tx buffer described in the control area

During packet transmission, the following sequence occurs:

1. The Tx side writes the packet to its Tx buffer
2. The Tx side sends a `Tx Begin` message, indicating that the buffer ownership
   is transferred
3. The Rx side now owns the buffer, and reads the message from its Rx area
4. The Rx side sends a `Rx Complete` once done, indicating that the buffer
   ownership is transferred back to the Tx side.
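
The ownership hand-off above amounts to a two-state machine per buffer. The
following sketch, with hypothetical type and function names, tracks which side
may touch a buffer and rejects out-of-order `Tx Begin` and `Rx Complete`
events.

```c
#include <stdbool.h>

/* Which side currently owns a transfer buffer. */
enum buf_owner {
    OWNER_TX_SIDE, /* the sender may write a new packet */
    OWNER_RX_SIDE, /* the receiver may read the packet */
};

struct xfer_buf {
    enum buf_owner owner;
};

/* Tx Begin: valid only while the Tx side owns the buffer. */
static bool buf_tx_begin(struct xfer_buf *b)
{
    if (b->owner != OWNER_TX_SIDE)
        return false;
    b->owner = OWNER_RX_SIDE;
    return true;
}

/* Rx Complete: valid only while the Rx side owns the buffer. */
static bool buf_rx_complete(struct xfer_buf *b)
{
    if (b->owner != OWNER_RX_SIDE)
        return false;
    b->owner = OWNER_TX_SIDE;
    return true;
}
```

Tracking ownership explicitly makes the "wait on any previous `Rx Complete`"
step of the transmission sequences a simple check: a sender may only start a
new packet while `owner` is `OWNER_TX_SIDE`.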

### LPC Binding Operation

The binding operation is not symmetric as the BMC is the only side that can
drive the status register. Each side's initialisation sequence is outlined
below.

The sequences below contain steps where the BMC updates the channel status and
where commands are sent between the BMC and the host. The act of updating
status or sending a command invokes the behaviour outlined in [KCS Status and
Control Sequences](#kcs-status-and-control-sequences).

The packet transmission sequences assume that `BMC Active` and `Channel Active`
are set.

#### BMC Initialisation Sequence

| Step | Description                              |
|------|------------------------------------------|
|  1   | The BMC initialises the control area: magic value, BMC versions and buffer parameters |
|  2   | The BMC sets the status to `BMC Active`  |

#### Host Initialisation Sequence

| Step | v1 | v2 | Description                                    |
|------|----|----|------------------------------------------------|
|  1   | ✓  | ✓  | The host waits for the `BMC Active` state      |
|  2   | ✓  | ✓  | The host populates its version fields          |
|  3   |    | ✓  | The host derives and writes to `rx_size` the packet size associated with its desired MTU |
|  4   | ✓  | ✓  | The host sends the `Initialise` command        |
|  5   | ✓  | ✓  | The BMC observes the `Initialise` command      |
|  6   | ✓  | ✓  | The BMC calculates and writes `negotiated_ver` |
|  7   |    | ✓  | The BMC calculates the MTUs and updates neither, one or both of `rx_size` and `tx_size` |
|  8   | ✓  | ✓  | The BMC sets the status to `Channel Active`    |
|  9   | ✓  | ✓  | The host observes that `Channel Active` is set |
|  10  | ✓  | ✓  | The host reads the negotiated version          |
|  11  |    | ✓  | The host reads both `rx_size` and `tx_size` to derive the negotiated MTUs |
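
The text does not spell out how the BMC calculates `negotiated_ver` in step 6.
One plausible rule, shown here purely as an assumption, is to pick the highest
version that lies inside both advertised `[min, cur]` ranges and to use 0 to
signal that no common version exists. The function name is hypothetical.

```c
#include <stdint.h>

/*
 * Assumed rule: the negotiated version is the highest version inside the
 * overlap of the BMC and host ranges. Returns 0 when the ranges are malformed
 * or do not overlap.
 */
static uint16_t negotiate_version(uint16_t bmc_min, uint16_t bmc_cur,
                                  uint16_t host_min, uint16_t host_cur)
{
    uint16_t lo = bmc_min > host_min ? bmc_min : host_min;
    uint16_t hi = bmc_cur < host_cur ? bmc_cur : host_cur;

    /* Reject ranges that are empty or that use 0 as a version number. */
    if (bmc_min == 0 || host_min == 0 ||
        bmc_min > bmc_cur || host_min > host_cur)
        return 0;

    return lo <= hi ? hi : 0;
}
```

Under this rule a BMC supporting versions 1 to 2 and a host supporting only
version 1 settle on version 1, while a BMC that requires version 2 and a host
limited to version 1 have no common version.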

#### Host Packet Transmission Sequence

| Step | Description                                                  |
|------|--------------------------------------------------------------|
|  1   | The host waits on any previous `Rx Complete` message         |
|  2   | The host writes the packet to its Tx area (BMC Rx area)      |
|  3   | The host sends the `Tx Begin` command, transferring ownership of its Tx buffer to the BMC |
|  4   | The BMC observes the `Tx Begin` command                      |
|  5   | The BMC reads the packet from its Rx area (host Tx area)     |
|  6   | The BMC sends the `Rx Complete` command, transferring ownership of its Rx buffer to the host |
|  7   | The host observes the `Rx Complete` command                  |

#### BMC Packet Transmission Sequence

| Step | Description                                                   |
|------|---------------------------------------------------------------|
|  1   | The BMC waits on any previous `Rx Complete` message           |
|  2   | The BMC writes the packet to its Tx area (host Rx area)       |
|  3   | The BMC sends the `Tx Begin` command, transferring ownership of its Tx buffer to the host |
|  4   | The host observes the `Tx Begin` command                      |
|  5   | The host reads the packet from the host Rx area (BMC Tx area) |
|  6   | The host sends the `Rx Complete` command, transferring ownership of its Rx buffer to the BMC |
|  7   | The BMC observes the `Rx Complete` command                    |

## Implementation Notes

On the BMC the initial prototype implementation makes use of the following
components:

* An LPC KCS device exposed by a [binding-specific kernel driver][mctp-driver]
* The reserved memory mapped by the LPC2AHB bridge via the [aspeed-lpc-ctrl
  driver][aspeed-lpc-ctrl]
* The astlpc binding found in [libmctp][libmctp]

[mctp-driver]: https://github.com/openbmc/linux/commit/9a3b539a175cf4fe1f8fc2997e8a91abec25c37f
[aspeed-lpc-ctrl]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/soc/aspeed/aspeed-lpc-ctrl.c?h=v5.7
[libmctp]: https://github.com/openbmc/libmctp

From the host side, the LPC Firmware and KCS IO cycles are driven by
free-standing firmware. Some firmware implementations use [libmctp][libmctp] by
implementing the driver hooks for direct access to the LPC devices.

## Alternatives Considered

### The KCS MCTP Binding (DSP0254)

The KCS hardware (used as the full transfer channel) can be used to transfer
arbitrarily-sized MCTP messages. However, there are much larger overheads in
synchronisation between host and BMC for every byte transferred.

### The MCTP Serial Binding (DSP0253)

We could use the VUART hardware to transfer the MCTP packets according to the
existing MCTP Serial Binding. However, the VUART device is already used for
console data. Multiplexing both MCTP and console would be an alternative, but
the complexity introduced would make low-level debugging both more difficult
and less reliable.

### The BT interface

The BT interface allows for block-at-a-time transfers. However, the BT buffer
size is only 64 bytes on the AST2500 hardware, which does not allow us to
comply with the MCTP Base Specification (DSP0236), which requires a 64-byte
payload size as the minimum. The 64-byte BT buffer does not allow for MCTP and
transport headers.

Additionally, we would like to develop the MCTP channel alongside the existing
IPMI interfaces, to allow a gradual transition from IPMI to MCTP. As the BT
channel is already used on OpenPOWER systems for IPMI transfers, we would not
be able to support both in parallel.

### Using the AST2500 LPC Mailbox

This would require enabling the SuperIO interface, which allows the host to
access the entire BMC address space, and so introduces security
vulnerabilities.