# PLDM stack on OpenBMC

Author: Deepak Kodihalli <dkodihal@linux.vnet.ibm.com> <dkodihal>

Primary assignee: Deepak Kodihalli

Created: 2019-01-22

## Problem Description
On OpenBMC, in-band IPMI is currently the primary industry-standard means of
communication between the BMC and the Host firmware. We've started hitting some
inherent limitations of IPMI on OpenPOWER servers: a limited number of sensors,
and a lack of a generic control mechanism (sensors are a generic monitoring
mechanism) are the major ones. There is a need to improve upon the communication
protocol, but at the same time inventing a custom protocol is undesirable.

This design aims to employ Platform Level Data Model (PLDM), a standard
application layer communication protocol defined by the DMTF. PLDM draws inputs
from IPMI, but it overcomes most of the latter's limitations. PLDM is also
designed to run on standard transport protocols, e.g. MCTP (also designed by
the DMTF). MCTP provides for a common transport layer over several physical
channels, by defining hardware bindings. The solution of PLDM over MCTP also
helps overcome some of the limitations of the hardware channels that IPMI uses.

PLDM's purpose is to enable all sorts of "inside the box communication": BMC -
Host, BMC - BMC, BMC - Network Controller and BMC - Other (e.g. sensor)
devices.

## Background and References
PLDM is designed to be an effective interface and data model that provides
efficient access to low-level platform inventory, monitoring, control, event,
and data/parameters transfer functions. For example, temperature, voltage, or
fan sensors can have a PLDM representation that can be used to monitor and
control the platform using a set of PLDM messages. PLDM defines data
representations and commands that abstract the platform management hardware.

PLDM groups commands under broader functions, and defines
separate specifications for each of these functions (also called PLDM "Types").
The currently defined Types (and corresponding specs) are: PLDM base (with
associated IDs and states specs), BIOS, FRU, Platform monitoring and control,
Firmware Update and SMBIOS. All these specifications are available at:

https://www.dmtf.org/standards/pmci

Some of the reasons PLDM sounds promising (some of these are advantages over
IPMI):

- Common in-band communication protocol.

- Already existing PLDM Type specifications that cover the most common
  communication requirements. Up to 64 PLDM Types can be defined (the last one
  is OEM). At the moment, 6 are defined. Each PLDM type can house up to 256 PLDM
  commands.

- PLDM sensor IDs are 2 bytes in length (IPMI sensor numbers are 1 byte).

- PLDM introduces the concept of effecters - a control mechanism. Both sensors
  and effecters are associated with entities (similar to IPMI, entities can be
  physical or logical), where sensors are a mechanism for monitoring and
  effecters are a mechanism for control. Effecters can be numeric or state
  based. PLDM defines commonly used entities and their IDs, but there are 8K
  slots available to define OEM entities.

- A very active PLDM related working group in the DMTF.

The plan is to run PLDM over MCTP. MCTP is defined in a spec of its own, and a
proposal on the MCTP design is in discussion already. There's going to be an
intermediate PLDM over MCTP binding layer, which lets us send PLDM messages over
MCTP. This is defined in a spec of its own, and the design for this binding will
be proposed separately.

## Requirements
How different BMC applications make use of PLDM messages is outside the scope
of this requirements doc.
The requirements listed here are related to the PLDM
protocol stack and the request/response model:

- Marshalling and unmarshalling of PLDM messages, defined in various PLDM Type
  specs, must be implemented. This can of course be staged based on the need of
  specific Types and functions. Since this is just encoding and decoding PLDM
  messages, this can be a library that could be shared between the BMC and
  other firmware stacks. The specifics of each PLDM Type (such as FRU table
  structures, sensor PDR structures, etc) are implemented by this lib.

- Mapping PLDM concepts to native OpenBMC concepts must be implemented. For
  example: mapping PLDM sensors to phosphor-hwmon hosted D-Bus objects, mapping
  PLDM FRU data to D-Bus objects hosted by phosphor-inventory-manager, etc. The
  mapping shouldn't be restricted to D-Bus alone (meaning it shouldn't be
  necessary to put objects on the Bus just to serve PLDM requests, a problem
  that exists with phosphor-host-ipmid today). Essentially these are platform
  specific PLDM message handlers.

- The BMC should be able to act as a PLDM responder as well as a PLDM requester.
  As a PLDM requester, the BMC can monitor/control other devices. As a PLDM
  responder, the BMC can react to PLDM messages directed to it via requesters in
  the platform.

- As a PLDM requester, the BMC must be able to discover other PLDM enabled
  components in the platform.

- As a PLDM requester, the BMC must be able to send simultaneous messages to
  different responders.

- As a PLDM requester, the BMC must be able to handle out-of-order responses.

- As a PLDM responder, the BMC may simultaneously respond to messages from
  different requesters, but the spec doesn't mandate this. In other words the
  responder could be single-threaded.

- It should be possible to plug in OEM PLDM types/functions into the PLDM stack.

## Proposed Design
This document covers the architectural, interface, and design details. It
provides recommendations for implementations, but implementation details are
outside the scope of this document.

The design aims at having a single PLDM daemon serve both the requester and
responder functions, and having transport specific endpoints to communicate
on different channels.

The design enables concurrency aspects of the requester and responder functions,
but the goal is to employ asynchronous IO and event loops, instead of multiple
threads, wherever possible.

The following are high level structural elements of the design:

### PLDM encode/decode libraries

This library would take a PLDM message, decode it and extract the different
fields of the message. Conversely, given a PLDM Type, command code, and the
command's data fields, it would make a PLDM message. The thought is to design
this as a common library, that can be used by the BMC and other firmware stacks,
because it's the encode/decode and protocol piece (and not the handling of a
message).

### PLDM provider libraries

These libraries would implement the platform specific handling of incoming PLDM
requests (basically helping with the PLDM responder implementation, see the
Request/Response Model section), so for instance they would query D-Bus objects
(or even something like a JSON file) to fetch platform specific information to
respond to the PLDM message. They would link with the encode/decode lib.

It should be possible to plug in a provider library, that lets someone add
functionality for new PLDM (standard as well as OEM) Types. The libraries would
implement a "register" API to plug in handlers for specific PLDM messages.
Something like:

    #include <cstdint>
    // Named registerHandler because `register` is a reserved word in C++.
    template <typename Handler, typename... args>
    auto registerHandler(uint8_t type, uint8_t command, Handler handler);

This allows for providing a strongly-typed C++ handler registration scheme. It
would also be possible to validate the parameters passed to the handler at
compile time.

### Request/Response Model

The PLDM daemon links with the encode/decode and provider libs. The daemon
would have to implement the following functions:

#### Receiver/Responder
The receiver wakes up on getting notified of incoming PLDM messages (via D-Bus
signal or callback from the transport layer) from a remote PLDM device. If the
message type is "Request", it would route the message to a PLDM provider
library. Via the library, asynchronous D-Bus calls (using sdbusplus-asio) would
be made, so that the receiver can register a handler for the D-Bus response,
instead of having to wait for the D-Bus response. This way it can go back to
listening for incoming PLDM messages.

In the D-Bus response handler, the receiver will send out the PLDM response
message via the transport's send message API. If the transport's send message
API blocks for a considerably long duration, then it would have to be run in a
thread of its own.
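
As a rough illustration of the provider registration and request routing
described above, the following is a minimal sketch and not the proposed
implementation. `Payload`, `Handler`, `PldmHeader`, `decodeHeader` and
`HandlerRegistry` are hypothetical names. The header layout (Rq bit and 5-bit
instance id in byte 0, 6-bit Type in byte 1, command code in byte 2) is assumed
from the PLDM base specification. The handler is synchronous here for brevity,
whereas the design calls for asynchronous D-Bus calls.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <optional>
#include <utility>
#include <vector>

// All names below are hypothetical; this is not an existing library's API.
using Payload = std::vector<uint8_t>;
// A handler gets the request payload (header stripped) and returns the
// response payload (completion code followed by response data).
using Handler = std::function<Payload(const Payload&)>;

// Assumed three-byte PLDM message header.
struct PldmHeader
{
    bool request;       // Rq bit: true for a request, false for a response
    uint8_t instanceId; // 5-bit instance id
    uint8_t type;       // 6-bit PLDM Type
    uint8_t command;    // command code
};

std::optional<PldmHeader> decodeHeader(const Payload& msg)
{
    if (msg.size() < 3)
    {
        return std::nullopt; // too short to hold a header
    }
    PldmHeader hdr{};
    hdr.request = (msg[0] & 0x80) != 0;
    hdr.instanceId = msg[0] & 0x1F;
    hdr.type = msg[1] & 0x3F;
    hdr.command = msg[2];
    return hdr;
}

class HandlerRegistry
{
  public:
    using Key = std::pair<uint8_t, uint8_t>; // (Type, command)

    void registerHandler(uint8_t type, uint8_t command, Handler handler)
    {
        assert(type <= 0x3F && "PLDM Type is a 6-bit field");
        handlers[Key{type, command}] = std::move(handler);
    }

    // Routes a request to its handler and builds the response message.
    // Returns nullopt for responses, malformed messages and unknown commands.
    std::optional<Payload> dispatch(const Payload& msg) const
    {
        auto hdr = decodeHeader(msg);
        if (!hdr || !hdr->request)
        {
            return std::nullopt;
        }
        auto it = handlers.find(Key{hdr->type, hdr->command});
        if (it == handlers.end())
        {
            return std::nullopt;
        }
        Payload body(msg.begin() + 3, msg.end());
        Payload data = it->second(body);
        // Response header: Rq bit cleared, instance id echoed back,
        // Type and command bytes copied from the request.
        Payload response{hdr->instanceId, msg[1], msg[2]};
        response.insert(response.end(), data.begin(), data.end());
        return response;
    }

  private:
    std::map<Key, Handler> handlers;
};
```

Keying the registry on the (Type, command) pair is one way to let an OEM
provider add handlers without changes to the receiver itself.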

If the incoming PLDM message is of type "Response", then the receiver emits a
D-Bus signal pointing to the response message. Any time the message is too
large to fit in a D-Bus payload, the message is written to a file, and a
read-only file descriptor pointing to that file is contained in the D-Bus
signal.

#### Requester
Designing the BMC as a PLDM requester is interesting. We haven't had this with
IPMI, because the BMC was typically an IPMI server. PLDM requester functions
will be spread across multiple OpenBMC applications (instead of a single big
requester app) - based on the responder they're talking to and the high level
function they implement. For example, there could be an app that lets the BMC
upgrade firmware for other devices using PLDM - this would be a generic app
in the sense that the same set of commands might have to be run irrespective
of the device on the other side. There could also be an app that does fan
control on a remote device, based on sensors from that device and algorithms
specific to that device.

The PLDM daemon would have to implement D-Bus interfaces to form the requester
functions: a method to send a PLDM message over the underlying transport (again,
this will have two versions: one that accepts a byte stream, and the other that
accepts an fd, for large messages) and a signal to indicate a PLDM response from
the remote PLDM device. The signal would consist of the transport headers, PLDM
headers, and the PLDM payload.

The typical flow for a requester app would be to send the PLDM message via the
D-Bus API (the PLDM daemon would have to assign an instance id), and add a
handler for the D-Bus signal containing the response. As this flow is
asynchronous, the requester app can execute other scheduled work, if any, in its
event loop, while it waits for the D-Bus signal containing the response. The
D-Bus API to send a PLDM message to the remote PLDM device would call the
underlying transport's send API. If that API blocks for too long, the call may
have to run in a thread of its own. The D-Bus signal containing a response
message is emitted by the receiver (see above).

### Multiple transport channels
The PLDM daemon might have to talk to remote PLDM devices via different
channels.
While a level of abstraction might be provided by MCTP, the PLDM
daemon would have to implement a D-Bus interface to target a specific
transport channel, so that requester apps on the BMC can send messages over
that transport. Also, it should be possible to plug in platform specific D-Bus
objects that implement an interface to target a platform specific transport.

## Alternatives Considered
Continue using IPMI, but start making more use of OEM extensions to
suit the requirements of new platforms. However, given that the IPMI
standard is no longer under active development, we would likely end up
with a large amount of platform-specific customisations. This also does
not solve the hardware channel issues in a standard manner.
On OpenPOWER hardware at least, we've started to hit some of the limitations of
IPMI (for example, we need more than 255 sensors).

## Impacts
Development would be required to implement the PLDM protocol, the
request/response model, and platform specific handling. Low level design is
required to implement the protocol specifics of each of the PLDM Types. Such low
level design is not included in this proposal.

Design and development needs to involve the firmware stacks of management
controllers and management devices of a platform management subsystem.

## Testing
Testing can be done without having to depend on the underlying transport layer.

The responder function can be tested by mocking a requester and the transport
layer: this would essentially test the protocol handling and platform specific
handling. The requester function can be tested by mocking a responder: this
would test the instance id handling and the send/receive functions.

APIs from the shared libraries can be tested via fuzzing.
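
To make the instance id testing mentioned above a little more concrete, here is
a small sketch under stated assumptions. `InstanceIdPool` and
`testOutOfOrderResponses` are hypothetical names, the 5-bit instance id range
(0 to 31) is assumed from the PLDM base specification, and keeping one pool per
remote endpoint is an assumption of this sketch and not something this design
prescribes. The test function plays the role of a mocked responder that answers
out of order.

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical helper that hands out PLDM instance ids for one remote endpoint.
class InstanceIdPool
{
  public:
    // Returns the lowest free id, or nullopt if all ids are outstanding.
    std::optional<uint8_t> allocate()
    {
        for (uint8_t id = 0; id < maxIds; ++id)
        {
            if (!inUse.test(id))
            {
                inUse.set(id);
                return id;
            }
        }
        return std::nullopt;
    }

    // Called when a response arrives. Returns false if no request with this
    // id is outstanding (for example a stale or duplicate response).
    bool release(uint8_t id)
    {
        if (id >= maxIds || !inUse.test(id))
        {
            return false;
        }
        inUse.reset(id);
        return true;
    }

  private:
    static constexpr uint8_t maxIds = 32; // 5-bit instance id field
    std::bitset<maxIds> inUse;
};

// Two requests are sent, and the mocked responder answers the second first.
void testOutOfOrderResponses()
{
    InstanceIdPool pool;
    auto first = pool.allocate();
    auto second = pool.allocate();
    assert(first && second && *first != *second);

    assert(pool.release(*second));  // response to the second request
    assert(!pool.release(*second)); // a duplicate response is rejected
    assert(pool.release(*first));   // late response to the first request
}
```

A fuller test would also check that each response payload is handed to the
requester app that sent the matching request.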