# OpenBMC platform communication channel: MCTP & PLDM in userspace

Author: Jeremy Kerr <jk@ozlabs.org> <jk>

Please refer to the [MCTP Overview](mctp.md) document for general MCTP design
description, background and requirements.

This document describes a userspace implementation of MCTP infrastructure,
allowing a straightforward mechanism of supporting MCTP messaging within an
OpenBMC system.

## Proposed Design

The MCTP core specification just provides the packetisation, routing and
addressing mechanisms. The actual transmit/receive of those packets is up to the
hardware binding of the MCTP transport.

For OpenBMC, we would introduce an MCTP daemon, which implements the transport
over a configurable hardware channel (e.g. serial UART, I2C or PCIe), and
provides a socket-based interface for other processes to send and receive
complete MCTP messages. This daemon is responsible for the packetisation and
routing of MCTP messages from external endpoints, and for forwarding these
messages to and from individual handler applications. This includes handling
local MCTP-stack configuration, like local EID assignments.

This daemon has a few components:

1. the core MCTP stack

2. one or more binding implementations (e.g. MCTP-over-serial), which interact
   with the hardware channel(s).

3. an interface to handler applications over a unix-domain socket.

The proposed implementation here is to produce an MCTP "library" which provides
the packetisation and routing functions, between:

- an "upper" messaging transmit/receive interface, for tx/rx of a full message
  to a specific endpoint (i.e. (1) above)

- a "lower" hardware binding for transmit/receive of individual packets, which
  provides a method for the core to tx/rx each packet to hardware, and defines
  the parameters of the common packetisation code (i.e. (2) above).

The lower interface would be plugged in to one of a number of hardware-specific
binding implementations. Most of these would be included in the library source
tree, but others can be plugged in too, perhaps where the physical layer
implementation does not make sense to include in the platform-agnostic library.

The reason for a library is to allow the same MCTP implementation to be used in
both OpenBMC and host firmware; the library should be bidirectional. To allow
this, the library would be written in portable C (structured in a way that can
be compiled as "extern C" in C++ codebases), and be able to be configured to
suit those runtime environments (for example, POSIX IO may not be available on
all platforms; we should be able to compile the library to suit). The licence
for the library should also allow this re-use; a dual Apache & GPLv2+ licence
may be best.
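
To make the upper/lower split more concrete, here is a minimal sketch of how a
core could drive a binding. All names (`sketch_binding`, `sketch_message_tx`,
`sketch_log_tx`) are hypothetical and are not the library's API. MCTP packet
headers are left out, so this only illustrates how a binding-supplied packet
size parameterises the common packetisation code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "lower" interface: one instance per hardware channel. */
struct sketch_binding {
    /* Largest payload the channel carries per packet; this is the
     * parameter the binding supplies to the packetisation code. */
    size_t pkt_size;
    /* Hand one packet payload to the hardware; returns 0 on success. */
    int (*tx)(struct sketch_binding *b, const uint8_t *payload, size_t len,
              bool first, bool last);
    /* Binding-private state (file descriptor, register base, ...). */
    void *priv;
};

/* Hypothetical "upper" entry point: split a full message into packets and
 * push each one through the binding. Returns 0 on success, -1 on error. */
int sketch_message_tx(struct sketch_binding *b, const uint8_t *msg, size_t len)
{
    size_t off = 0;

    assert(b && b->tx && b->pkt_size > 0);
    if (!msg || len == 0)
        return -1;

    while (off < len) {
        size_t n = len - off;

        if (n > b->pkt_size)
            n = b->pkt_size;
        /* first/last mark the start and end of the message */
        if (b->tx(b, msg + off, n, off == 0, off + n == len) != 0)
            return -1;
        off += n;
    }
    return 0;
}

/* Example binding that only records what it was asked to send. */
struct sketch_log {
    size_t packets;
    size_t bytes;
    size_t last_len;
};

int sketch_log_tx(struct sketch_binding *b, const uint8_t *payload, size_t len,
                  bool first, bool last)
{
    struct sketch_log *log = b->priv;

    (void)payload;
    (void)first;
    log->packets++;
    log->bytes += len;
    if (last)
        log->last_len = len;
    return 0;
}
```

A real binding would replace `sketch_log_tx` with code that writes to a file
descriptor or to hardware registers; the core loop would not need to change.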

These "lower" binding implementations may have very different methods of
transferring packets to the physical layer. For example, a serial binding
implementation for running on a Linux environment may be implemented through
`read()`/`write()` syscalls to a PTY device. An I2C binding for use in low-level
host firmware environments may interact directly with hardware registers to
perform packet transfers.

The application-specific handlers implement the actual functionality provided
over the MCTP channel, and connect to the central daemon over a UNIX domain
socket. Each of these would register with the MCTP daemon to receive MCTP
messages of a certain type, and would transmit MCTP messages of that same type.

The daemon's sockets to these handlers are configured for non-blocking IO, to
allow the daemon to be decoupled from any blocking behaviour of handlers. The
daemon would use a message queue to enable message reception/transmission to a
blocked handler, but this would be of a limited size. Handlers whose pending
messages exceed this queue limit would be disconnected from the daemon.

One design intention of the multiplexer daemon is to allow a future kernel-based
MCTP implementation without requiring major structural changes to handler
applications. The socket-based interface facilitates this, as the unix-domain
socket interface could be fairly easily swapped out with a new kernel-based
socket type.
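
The limited-size queue for blocked handlers could look something like the
sketch below. The names and both limits (`SKETCH_QUEUE_MAX`, `SKETCH_MSG_MAX`)
are assumptions made for illustration; this document does not specify a queue
depth or a message size, and the real daemon may structure this differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_QUEUE_MAX 4  /* assumed queue depth per handler */
#define SKETCH_MSG_MAX   64 /* assumed largest queued message */

struct sketch_msg {
    size_t len;
    uint8_t data[SKETCH_MSG_MAX];
};

/* Hypothetical per-handler state kept by the daemon. */
struct sketch_handler {
    bool connected;
    size_t head;  /* index of the oldest queued message */
    size_t count; /* number of queued messages */
    struct sketch_msg q[SKETCH_QUEUE_MAX];
};

/* Queue a message whose non-blocking send would have blocked. Returns false
 * if the message was not queued; on overflow the handler is also marked
 * disconnected, which is where a daemon would close its socket. */
bool sketch_queue_push(struct sketch_handler *h, const uint8_t *data,
                       size_t len)
{
    struct sketch_msg *slot;

    assert(h && data);
    if (!h->connected || len > SKETCH_MSG_MAX)
        return false;
    if (h->count == SKETCH_QUEUE_MAX) {
        /* queue limit exceeded: drop the handler rather than block */
        h->connected = false;
        h->count = 0;
        return false;
    }
    slot = &h->q[(h->head + h->count) % SKETCH_QUEUE_MAX];
    memcpy(slot->data, data, len);
    slot->len = len;
    h->count++;
    return true;
}

/* Take the oldest message once the socket is writable again; NULL if the
 * queue is empty. The returned slot is only valid until the next push. */
const struct sketch_msg *sketch_queue_pop(struct sketch_handler *h)
{
    const struct sketch_msg *slot;

    assert(h);
    if (h->count == 0)
        return NULL;
    slot = &h->q[h->head];
    h->head = (h->head + 1) % SKETCH_QUEUE_MAX;
    h->count--;
    return slot;
}
```

Because the queue is bounded, a handler that never drains its socket costs the
daemon a fixed amount of memory before it is dropped, and other handlers are
not affected.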

MCTP is intended to be an optional component of OpenBMC. Platforms using OpenBMC
are free to adopt it as they see fit.

### Demultiplexer daemon interface

MCTP handlers (i.e. clients of the demultiplexer) connect using a unix-domain
socket, at the abstract socket address:

```
\0mctp-demux
```

The socket type used should be `SOCK_SEQPACKET`.

Once connected, the client sends a single byte message, indicating what type of
MCTP messages should be forwarded to the client. Types must be greater than
zero.

Subsequent messages sent over the socket are MCTP messages sent/received by the
demultiplexer, that match the specified MCTP message type. Clients should use
the send/recv syscalls to interact with the socket.

Each message has a fixed small header:

```
uint8_t eid
```

For messages coming from the demux daemon, this indicates the source EID of the
received MCTP message. For messages going to the demux daemon, this indicates
the destination EID.

The rest of the message data is the complete MCTP message, including the MCTP
message type field.
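
As an illustration of the client side of this interface, the following sketch
connects to the abstract socket, registers for one message type, and builds an
outgoing frame. The function names (`demux_addr`, `demux_connect`,
`demux_frame`) are made up for this example and are not part of any existing
API; the socket name is the one given above, and abstract socket addresses are
assumed to be available (they are Linux-specific).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

/* The leading NUL selects the abstract namespace, so no filesystem path
 * is involved. */
static const char demux_name[] = "\0mctp-demux";

/* Fill in the abstract address; returns the length to pass to connect(). */
socklen_t demux_addr(struct sockaddr_un *addr)
{
    /* sizeof() counts the literal's trailing NUL, which is not part of
     * the socket name */
    size_t name_len = sizeof(demux_name) - 1;

    assert(addr);
    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    memcpy(addr->sun_path, demux_name, name_len);
    return (socklen_t)(offsetof(struct sockaddr_un, sun_path) + name_len);
}

/* Connect and register for one MCTP message type (must be non-zero).
 * Returns a socket descriptor, or -1 on failure. */
int demux_connect(uint8_t type)
{
    struct sockaddr_un addr;
    socklen_t len = demux_addr(&addr);
    int sd;

    if (type == 0)
        return -1;
    sd = socket(AF_UNIX, SOCK_SEQPACKET, 0);
    if (sd < 0)
        return -1;
    /* the first message on the socket is the single type byte */
    if (connect(sd, (struct sockaddr *)&addr, len) < 0 ||
        send(sd, &type, 1, 0) != 1) {
        close(sd);
        return -1;
    }
    return sd;
}

/* Build [eid][complete MCTP message] into out. Returns the frame length,
 * or 0 if out is too small. */
size_t demux_frame(uint8_t eid, const uint8_t *msg, size_t msg_len,
                   uint8_t *out, size_t out_len)
{
    assert(msg && out);
    if (out_len < msg_len + 1)
        return 0;
    out[0] = eid; /* destination EID when sending to the daemon */
    memcpy(out + 1, msg, msg_len);
    return msg_len + 1;
}
```

A handler would call `demux_connect()` with its message type, pass each frame
built by `demux_frame()` to a single `send()`, and read replies with `recv()`.
In a received frame, the first byte is the source EID and the remainder is the
MCTP message, beginning with its message type field.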

The daemon does not provide a facility for clients to specify or retrieve values
for the tag field in individual MCTP packets.

## Alternatives Considered

In terms of an MCTP daemon structure, an alternative is to have the MCTP
implementation contained within a single process, using the libmctp API directly
for passing messages from the core code to application-level handlers. The
drawback of this approach is that this single process needs to implement all
possible functionality that is available over MCTP, which may be quite a
disjoint set. This would likely lead to unnecessary restrictions on the
implementation of those application-level handlers (programming language,
frameworks used, etc). Also, this single-process approach would likely need more
significant modifications if/when MCTP protocol support is moved to the kernel.

The interface between the demultiplexer daemon and clients is currently defined
as a socket-based interface. However, an alternative here would be to pass MCTP
messages over D-Bus instead. The reason for the choice of sockets rather than
D-Bus is that the former allows a direct transition to a kernel-based socket API
when suitable.

## Testing

For the core MCTP library, we are able to run tests in complete isolation (I
have already been able to run a prototype MCTP stack through the afl fuzzer) to
ensure that the core transport protocol works.

For MCTP hardware bindings, we would develop channel-specific tests that would
be run in CI on both host and BMC.

For the OpenBMC MCTP daemon implementation, testing models would depend on the
structure we adopt in the design section.