# PLDM stack on OpenBMC

Author: Deepak Kodihalli <dkodihal@linux.vnet.ibm.com> <dkodihal>

Created: 2019-01-22

## Problem Description
On OpenBMC, in-band IPMI is currently the primary industry-standard means of
communication between the BMC and the Host firmware. We've started hitting some
inherent limitations of IPMI on OpenPOWER servers: a limited number of sensors,
and a lack of a generic control mechanism (sensors are a generic monitoring
mechanism) are the major ones. There is a need to improve upon the communication
protocol, but at the same time inventing a custom protocol is undesirable.

This design aims to employ Platform Level Data Model (PLDM), a standard
application layer communication protocol defined by the DMTF. PLDM draws inputs
from IPMI, but it overcomes most of the latter's limitations. PLDM is also
designed to run on standard transport protocols, e.g. MCTP (also designed by
the DMTF). MCTP provides for a common transport layer over several physical
channels, by defining hardware bindings. The solution of PLDM over MCTP also
helps overcome some of the limitations of the hardware channels that IPMI uses.

PLDM's purpose is to enable all sorts of "inside the box communication": BMC -
Host, BMC - BMC, BMC - Network Controller and BMC - Other (e.g. sensor)
devices.

## Background and References
PLDM is designed to be an effective interface and data model that provides
efficient access to low-level platform inventory, monitoring, control, event,
and data/parameters transfer functions. For example, temperature, voltage, or
fan sensors can have a PLDM representation that can be used to monitor and
control the platform using a set of PLDM messages. PLDM defines data
representations and commands that abstract the platform management hardware.

PLDM groups commands under broader functions, and defines
separate specifications for each of these functions (also called PLDM "Types").
The currently defined Types (and corresponding specs) are: PLDM base (with
associated IDs and states specs), BIOS, FRU, Platform monitoring and control,
Firmware Update and SMBIOS. All these specifications are available at:

https://www.dmtf.org/standards/pmci

Some of the reasons PLDM sounds promising (some of these are advantages over
IPMI):

- Common in-band communication protocol.

- Already existing PLDM Type specifications that cover the most common
  communication requirements. Up to 64 PLDM Types can be defined (the last one
  is OEM). At the moment, 6 are defined. Each PLDM type can house up to 256 PLDM
  commands.

- PLDM sensor IDs are 2 bytes in length (IPMI sensor numbers are 1 byte).

- PLDM introduces the concept of effecters - a control mechanism. Both sensors
  and effecters are associated with entities (similar to IPMI, entities can be
  physical or logical), where sensors are a mechanism for monitoring and
  effecters are a mechanism for control. Effecters can be numeric or state
  based. PLDM defines commonly used entities and their IDs, but there are 8K
  slots available to define OEM entities.

- A very active PLDM related working group in the DMTF.

The plan is to run PLDM over MCTP. MCTP is defined in a spec of its own, and a
proposal on the MCTP design is in discussion already. There's going to be an
intermediate PLDM over MCTP binding layer, which lets us send PLDM messages over
MCTP. This is defined in a spec of its own, and the design for this binding will
be proposed separately.

## Requirements
How different BMC applications make use of PLDM messages is outside the scope
of this requirements doc. The requirements listed here are related to the PLDM
protocol stack and the request/response model:

- Marshalling and unmarshalling of PLDM messages, defined in various PLDM Type
  specs, must be implemented. This can of course be staged based on the need of
  specific Types and functions. Since this is just encoding and decoding PLDM
  messages, this can be a library that could be shared between the BMC and other
  firmware stacks. The specifics of each PLDM Type (such as FRU table
  structures, sensor PDR structures, etc.) are implemented by this lib.

- Mapping PLDM concepts to native OpenBMC concepts must be implemented. For
  example: mapping PLDM sensors to phosphor-hwmon hosted D-Bus objects, mapping
  PLDM FRU data to D-Bus objects hosted by phosphor-inventory-manager, etc. The
  mapping shouldn't be restricted to D-Bus alone (meaning it shouldn't be
  necessary to put objects on the Bus just to serve PLDM requests, a problem
  that exists with phosphor-host-ipmid today). Essentially these are platform
  specific PLDM message handlers.

- The BMC should be able to act as a PLDM responder as well as a PLDM requester.
  As a PLDM requester, the BMC can monitor/control other devices. As a PLDM
  responder, the BMC can react to PLDM messages directed to it via requesters in
  the platform.

- As a PLDM requester, the BMC must be able to discover other PLDM enabled
  components in the platform.

- As a PLDM requester, the BMC must be able to send simultaneous messages to
  different responders.

- As a PLDM requester, the BMC must be able to handle out of order responses.

- As a PLDM responder, the BMC may simultaneously respond to messages from
  different requesters, but the spec doesn't mandate this. In other words the
  responder could be single-threaded.

- It should be possible to plug in OEM PLDM types/functions into the PLDM stack.

## Proposed Design
This document covers the architectural, interface, and design details. It
provides recommendations for implementations, but implementation details are
outside the scope of this document.

The design aims at having a single PLDM daemon serve both the requester and
responder functions, and having transport specific endpoints to communicate
on different channels.

The design enables concurrency aspects of the requester and responder functions,
but the goal is to employ asynchronous IO and event loops, instead of multiple
threads, wherever possible.

The following are high level structural elements of the design:

### PLDM encode/decode libraries

This library would take a PLDM message, decode it and extract the different
fields of the message. Conversely, given a PLDM Type, command code, and the
command's data fields, it would make a PLDM message. The thought is to design
this as a common library that can be used by the BMC and other firmware stacks,
because it only covers encoding, decoding and protocol details (and not the
handling of a message).

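As a rough illustration of the encode/decode role, the sketch below packs and
unpacks the 3-byte PLDM message header. The field layout follows my reading of
the PLDM base specification (DSP0240): request bit and 5-bit instance id in the
first byte, 6-bit PLDM Type in the second, command code in the third. The names
`PldmHeader`, `packHeader` and `unpackHeader` are hypothetical and are not part
of any existing library.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Fields of the 3-byte PLDM message header (layout assumed from DSP0240).
struct PldmHeader
{
    bool request;       // Rq bit: true for a request, false for a response
    uint8_t instanceId; // 5 bits, 0..31
    uint8_t type;       // 6 bits, 0..63 (the PLDM Type)
    uint8_t command;    // 8 bits (command code within the Type)
};

// Hypothetical helper: packs the fields into the three header bytes.
// Returns false if a field does not fit its bit width.
inline bool packHeader(const PldmHeader& hdr, std::array<uint8_t, 3>& out)
{
    if (hdr.instanceId > 0x1F || hdr.type > 0x3F)
    {
        return false;
    }
    // Byte 0: Rq (bit 7), D (bit 6, left at 0), reserved (bit 5), instance id.
    out[0] = static_cast<uint8_t>((hdr.request ? 0x80 : 0x00) | hdr.instanceId);
    // Byte 1: header version (bits 7:6, left at 0) and PLDM Type (bits 5:0).
    out[1] = hdr.type;
    // Byte 2: command code.
    out[2] = hdr.command;
    return true;
}

// Hypothetical helper: extracts the fields from the three header bytes.
inline PldmHeader unpackHeader(const std::array<uint8_t, 3>& in)
{
    PldmHeader hdr{};
    hdr.request = (in[0] & 0x80) != 0;
    hdr.instanceId = static_cast<uint8_t>(in[0] & 0x1F);
    hdr.type = static_cast<uint8_t>(in[1] & 0x3F);
    hdr.command = in[2];
    return hdr;
}
```

Because the helpers only touch bytes and have no D-Bus or transport
dependencies, code of this shape could be shared with other firmware stacks, as
the section suggests.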
### PLDM provider libraries

These libraries would implement the platform specific handling of incoming PLDM
requests (basically helping with the PLDM responder implementation, see the
Request/Response Model section below), so for instance they would query D-Bus
objects (or even something like a JSON file) to fetch platform specific
information to respond to the PLDM message. They would link with the
encode/decode lib.

It should be possible to plug in a provider library that lets someone add
functionality for new PLDM (standard as well as OEM) Types. The libraries would
implement a "register" API to plug in handlers for specific PLDM messages.
Something like:

    // "register" is a reserved word in C++, hence the name registerHandler.
    template <typename Handler, typename... args>
    auto registerHandler(uint8_t type, uint8_t command, Handler handler);

This allows for providing a strongly-typed C++ handler registration scheme. It
would also be possible to validate the parameters passed to the handler at
compile time.

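A minimal sketch of how such a registry could look is given below. It is
deliberately simpler than the strongly-typed scheme described above: handlers
are type-erased with `std::function` and exchange raw payload bytes. The class
name `HandlerRegistry`, the `dispatch` method and the `Payload` alias are
hypothetical, and the method is called `registerHandler` because `register` is
a reserved word in C++.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <optional>
#include <utility>
#include <vector>

using Payload = std::vector<uint8_t>;
// A handler takes the request payload and returns the response payload.
using Handler = std::function<Payload(const Payload& request)>;

class HandlerRegistry
{
  public:
    // Registers a handler for a (PLDM Type, command) pair.
    // Returns false if a handler is already registered for that pair.
    bool registerHandler(uint8_t type, uint8_t command, Handler handler)
    {
        return handlers
            .emplace(std::make_pair(type, command), std::move(handler))
            .second;
    }

    // Runs the matching handler. std::nullopt means no provider library
    // registered this (Type, command), so the caller can report it as
    // unsupported.
    std::optional<Payload> dispatch(uint8_t type, uint8_t command,
                                    const Payload& request) const
    {
        auto it = handlers.find({type, command});
        if (it == handlers.end())
        {
            return std::nullopt;
        }
        return it->second(request);
    }

  private:
    std::map<std::pair<uint8_t, uint8_t>, Handler> handlers;
};
```

A provider library for an OEM Type would call `registerHandler` once per
command at load time, and the daemon would call `dispatch` for each incoming
request.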
### Request/Response Model

The PLDM daemon links with the encode/decode and provider libs. The daemon
would have to implement the following functions:

#### Receiver/Responder
The receiver wakes up on getting notified of incoming PLDM messages (via D-Bus
signal or callback from the transport layer) from a remote PLDM device. If the
message type is "Request", it would route the message to a PLDM provider
library. Via the library, asynchronous D-Bus calls (using sdbusplus-asio) would
be made, so that the receiver can register a handler for the D-Bus response,
instead of having to wait for the D-Bus response. This way it can go back to
listening for incoming PLDM messages.

In the D-Bus response handler, the receiver will send out the PLDM response
message via the transport's send message API. If the transport's send message
API blocks for a considerably long duration, then it would have to be run in a
thread of its own.

If the incoming PLDM message is of type "Response", then the receiver emits a
D-Bus signal pointing to the response message. Any time the message is too
large to fit in a D-Bus payload, the message is written to a file, and a
read-only file descriptor pointing to that file is contained in the D-Bus
signal.

#### Requester
Designing the BMC as a PLDM requester is interesting. We haven't had this with
IPMI, because the BMC was typically an IPMI server. PLDM requester functions
will be spread across multiple OpenBMC applications (instead of a single big
requester app) - based on the responder they're talking to and the high level
function they implement. For example, there could be an app that lets the BMC
upgrade firmware for other devices using PLDM - this would be a generic app
in the sense that the same set of commands might have to be run irrespective
of the device on the other side. There could also be an app that does fan
control on a remote device, based on sensors from that device and algorithms
specific to that device.

##### Proposed requester design

A requester app/flow comprises the following:

- Linkage with a PLDM encode/decode library, to be able to pack PLDM requests
  and unpack PLDM responses.

- A D-Bus API to generate a unique PLDM instance id. The id needs to be unique
  across all outgoing PLDM messages (from potentially different processes).
  This needs to be on D-Bus because the id needs to be unique across PLDM
  requester app processes.

- A requester client API that provides blocking and non-blocking functions to
  transfer a PLDM request message and to receive the corresponding response
  message, over MCTP (the blocking send() will return a PLDM response).
  This will be a thin wrapper over the socket API provided by the mctp demux
  daemon. This will provide APIs for common tasks so that they need not
  be re-implemented in each PLDM requester app. This set of APIs will be built
  into the encode/decode library (so libpldm would house encode/decode APIs, and
  based on a compile time flag, the requester APIs as well). A PLDM requester
  app can choose to not use the client requester APIs, and instead can directly
  talk to the MCTP demux daemon.

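The instance id service mentioned in the list above could be backed by
something like the following sketch. It assumes that the instance id is a 5-bit
field (32 values) and that ids are tracked separately per destination endpoint
id, which is my reading of the PLDM base specification rather than something
this design states. `InstanceIdPool` and its methods are hypothetical names; a
real service would expose `allocate` over D-Bus and would also need a policy
for reclaiming ids from requesters that exit without releasing them.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <optional>

class InstanceIdPool
{
  public:
    // The instance id field is assumed to be 5 bits wide.
    static constexpr std::size_t maxIds = 32;

    // Hands out the lowest free id for the given endpoint, or std::nullopt
    // if all ids for that endpoint are currently in use.
    std::optional<uint8_t> allocate(uint8_t eid)
    {
        auto& inUse = pools[eid];
        for (std::size_t id = 0; id < maxIds; ++id)
        {
            if (!inUse.test(id))
            {
                inUse.set(id);
                return static_cast<uint8_t>(id);
            }
        }
        return std::nullopt;
    }

    // Marks an id as free again, e.g. once the response has been received.
    void release(uint8_t eid, uint8_t id)
    {
        auto it = pools.find(eid);
        if (it != pools.end() && id < maxIds)
        {
            it->second.reset(id);
        }
    }

  private:
    std::map<uint8_t, std::bitset<maxIds>> pools;
};
```

Keeping this state in one process is what makes the ids unique across
requester apps; each app would request an id before `send_msg` and release it
after decoding the response.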
##### Proposed requester design - flow diagrams

a) With blocking API

```
+---------------+               +----------------+            +----------------+               +-----------------+
|BMC requester/ |               |PLDM requester  |            |PLDM responder  |               |PLDM Daemon      |
|client app     |               |lib (part of    |            |                |               |                 |
|               |               |libpldm)        |            |                |               |                 |
+-------+-------+               +-------+--------+            +--------+-------+               +---------+-------+
        |                               |                              |                                 |
        |App starts                     |                              |                                 |
        |                               |                              |                                 |
        +------------------------------->setup connection with         |                                 |
        |init(non_block=false)          |MCTP daemon                   |                                 |
        |                               |                              |                                 |
        +<-------+return_code+----------+                              |                                 |
        |                               |                              |                                 |
        |                               |                              |                                 |
        |                               |                              |                                 |
        +------------------------------>+                              |                                 |
        |encode_pldm_cmd(cmd code, args)|                              |                                 |
        |                               |                              |                                 |
        +<----+returns pldm_msg+--------+                              |                                 |
        |                               |                              |                                 |
        |                               |                              |                                 |
        |----------------------------------------------------------------------------------------------->|
        |DBus.getPLDMInstanceId()       |                              |                                 |
        |                               |                              |                                 |
        |<-------------------------returns PLDM instance id----------------------------------------------|
        |                               |                              |                                 |
        +------------------------------>+                              |                                 |
        |send_msg(mctp_eids, pldm_msg)  +----------------------------->+                                 |
        |                               |write msg to MCTP socket      |                                 |
        |                               +----------------------------->+                                 |
        |                               |call blocking recv() on socket|                                 |
        |                               |                              |                                 |
        |                               +<-+returns pldm_response+-----+                                 |
        |                               |                              |                                 |
        |                               +----+                         |                                 |
        |                               |    | verify eids, instance id|                                 |
        |                               +<---+                         |                                 |
        |                               |                              |                                 |
        +<--+returns pldm_response+-----+                              |                                 |
        |                               |                              |                                 |
        |                               |                              |                                 |
        |                               |                              |                                 |
        +------------------------------>+                              |                                 |
        |decode_pldm_cmd(pldm_resp,     |                              |                                 |
        |                output args)   |                              |                                 |
        |                               |                              |                                 |
        +------------------------------>+                              |                                 |
        |close_connection()             |                              |                                 |
        +                               +                              +                                 +
```

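The "verify eids, instance id" step in the blocking flow could look roughly
like the sketch below. It assumes the 3-byte header layout from the PLDM base
specification (request and datagram bits plus instance id in byte 0, PLDM Type
in byte 1, command code in byte 2). The function name `isMatchingResponse` and
its parameters are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns true if `response`, received from `responseSrcEid`, answers the
// `request` that was sent to `requestDestEid`.
inline bool isMatchingResponse(uint8_t requestDestEid,
                               const std::vector<uint8_t>& request,
                               uint8_t responseSrcEid,
                               const std::vector<uint8_t>& response)
{
    constexpr std::size_t headerSize = 3;
    if (request.size() < headerSize || response.size() < headerSize)
    {
        return false;
    }
    // A response has both the Rq (bit 7) and D (bit 6) bits cleared.
    const bool isResponse = (response[0] & 0xC0) == 0;
    const bool sameInstance = (request[0] & 0x1F) == (response[0] & 0x1F);
    const bool sameType = (request[1] & 0x3F) == (response[1] & 0x3F);
    const bool sameCommand = request[2] == response[2];
    return responseSrcEid == requestDestEid && isResponse && sameInstance &&
           sameType && sameCommand;
}
```

In the blocking case, the requester lib would keep calling `recv()` and discard
messages until this check passes or a timeout expires.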

b) With non-blocking API

```
+---------------+               +----------------+            +----------------+             +---------------+
|BMC requester/ |               |PLDM requester  |            |PLDM responder  |             |PLDM daemon    |
|client app     |               |lib (part of    |            |                |             |               |
|               |               |libpldm)        |            |                |             |               |
+-------+-------+               +-------+--------+            +--------+-------+             +--------+------+
        |                               |                              |                              |
        |App starts                     |                              |                              |
        |                               |                              |                              |
        +------------------------------->setup connection with         |                              |
        |init(non_block=true,           |MCTP daemon                   |                              |
        |     int* o_mctp_fd)           |                              |                              |
        |                               |                              |                              |
        +<-------+return_code+----------+                              |                              |
        |                               |                              |                              |
        |                               |                              |                              |
        |                               |                              |                              |
        +------------------------------>+                              |                              |
        |encode_pldm_cmd(cmd code, args)|                              |                              |
        |                               |                              |                              |
        +<----+returns pldm_msg+--------+                              |                              |
        |                               |                              |                              |
        |-------------------------------------------------------------------------------------------->|
        |DBus.getPLDMInstanceId()       |                              |                              |
        |                               |                              |                              |
        |<-------------------------returns PLDM instance id-------------------------------------------|
        |                               |                              |                              |
        |                               |                              |                              |
        +------------------------------>+                              |                              |
        |send_msg(eids, pldm_msg,       +----------------------------->+                              |
        |         non_block=true)       |write msg to MCTP socket      |                              |
        |                               +<---+return_code+-------------+                              |
        +<-+returns rc, doesn't block+--+                              |                              |
        |                               |                              |                              |
        +------+                        |                              |                              |
        |      |Add EPOLLIN on mctp_fd  |                              |                              |
        |      |to self.event_loop      |                              |                              |
        +<-----+                        |                              |                              |
        |                               +                              |                              |
        +<----------------------+PLDM response msg written to mctp_fd+-+                              |
        |                               +                              |                              |
        +------+EPOLLIN on mctp_fd      |                              |                              |
        |      |received                |                              |                              |
        |      |                        |                              |                              |
        +<-----+                        |                              |                              |
        |                               |                              |                              |
        +------------------------------>+                              |                              |
        |decode_pldm_cmd(pldm_response) |                              |                              |
        |                               |                              |                              |
        +------------------------------>+                              |                              |
        |close_connection()             |                              |                              |
        +                               +                              +                              +
```

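With the non-blocking API an app can have several requests in flight, and the
responses may come back in any order. One way to handle that, sketched below,
is a table of pending requests keyed by endpoint id and instance id, which the
app's event loop consults when `mctp_fd` becomes readable. The class
`PendingRequests` and its methods are hypothetical, and the instance id is
assumed to sit in the low 5 bits of the first PLDM header byte.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>
#include <vector>

class PendingRequests
{
  public:
    using Callback = std::function<void(const std::vector<uint8_t>& response)>;

    // Remembers what to run when the response for (eid, instanceId) arrives.
    void add(uint8_t eid, uint8_t instanceId, Callback cb)
    {
        pending[{eid, instanceId}] = std::move(cb);
    }

    // Meant to be called from the event loop after a message has been read
    // from the MCTP fd. Returns false for responses nobody is waiting for.
    bool complete(uint8_t eid, const std::vector<uint8_t>& response)
    {
        if (response.empty())
        {
            return false;
        }
        const auto instanceId = static_cast<uint8_t>(response[0] & 0x1F);
        auto it = pending.find({eid, instanceId});
        if (it == pending.end())
        {
            return false;
        }
        // Remove the entry before running the callback, so the callback may
        // safely issue a new request that reuses the same instance id.
        Callback cb = std::move(it->second);
        pending.erase(it);
        cb(response);
        return true;
    }

  private:
    std::map<std::pair<uint8_t, uint8_t>, Callback> pending;
};
```

Since lookups are by key and not by arrival order, a response to the second
request can be completed before the response to the first.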
##### Alternative to the proposed requester design

a) Define D-Bus interfaces to send and receive PLDM messages:

```
method sendPLDM(uint8 mctp_eid, uint8 msg[])

signal recvPLDM(uint8 mctp_eid, uint8 pldm_instance_id, uint8 msg[])
```

PLDM requester apps can then invoke the above D-Bus interfaces. While this
simplifies things for the user, it has two disadvantages:
- the app implementing such an interface could be a single point of failure,
  plus sending messages concurrently would be a challenge.
- the message payload could be large (several pages), and copying the same for
  D-Bus transfers might be undesirable.

### Multiple transport channels
The PLDM daemon might have to talk to remote PLDM devices via different
channels. While a level of abstraction might be provided by MCTP, the PLDM
daemon would have to implement a D-Bus interface to target a specific
transport channel, so that requester apps on the BMC can send messages over
that transport. Also, it should be possible to plug in platform specific D-Bus
objects that implement an interface to target a platform specific transport.

### Processing PLDM FRU information sent down by the host firmware

Note: while this is specific to host-BMC communication, most of this might
apply to processing PLDM FRU information received from a device connected to the
BMC as well.

The requirement is for the BMC to consume PLDM FRU information received from the
host firmware and then have the same exposed via Redfish. An example can be the
host firmware sending down processor and core information via PLDM FRU commands,
and the BMC making this information available via the Processor and
ProcessorCollection schemas.

This design is built around the pldmd and entity-manager applications on the
BMC:

- The pldmd asks the host firmware's PLDM stack for the host's FRU record table,
  by sending it the PLDM GetFRURecordTable command. The pldmd should send this
  command if the host indicates support for the PLDM FRU spec. The pldmd
  receives a PLDM FRU record table from the host firmware
  (www.dmtf.org/sites/default/files/standards/documents/DSP0257_1.0.0.pdf). The
  daemon parses the FRU record table and hosts raw PLDM FRU information on
  D-Bus. It will house the PLDM FRU properties for a certain FRU under an
  xyz.openbmc_project.Inventory.Source.PLDM.FRU D-Bus interface, and house the
  PLDM entity info extracted from the FRU record set PDR under an
  xyz.openbmc_project.Source.PLDM.Entity interface.

- Configurations can be written for entity-manager to probe an interface like
  xyz.openbmc_project.Inventory.Source.PLDM.FRU, and create FRU inventory D-Bus
  objects. Inventory interfaces from the xyz.openbmc_project.Inventory
  namespace can be applied on these objects, by converting PLDM FRU property
  values into xyz.openbmc_project.Inventory.Decorator.Asset property values,
  such as Part Number and Serial Number, in the entity-manager configuration
  file. Bmcweb can find these FRU inventory objects based on D-Bus interfaces,
  as it does today.

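To make the "parses the FRU record table" step more concrete, here is a sketch
that walks a single FRU record. The layout is an assumption based on my reading
of DSP0257: a 2-byte little-endian record set identifier, a record type byte, a
field count byte, an encoding byte, and then one type/length/value triplet per
field. All names (`FruField`, `FruRecord`, `parseFruRecord`) are hypothetical,
and the table-level padding and checksum that follow the records are not
handled here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

struct FruField
{
    uint8_t type;      // FRU field type, e.g. part number or serial number
    std::string value; // raw field bytes, interpreted per the record encoding
};

struct FruRecord
{
    uint16_t recordSetId;
    uint8_t recordType;
    uint8_t encoding;
    std::vector<FruField> fields;
};

// Parses one record starting at data[offset]. On success, advances offset
// past the record. Returns std::nullopt if the data is truncated.
inline std::optional<FruRecord> parseFruRecord(const std::vector<uint8_t>& data,
                                               std::size_t& offset)
{
    constexpr std::size_t fixedSize = 5;
    if (offset > data.size() || data.size() - offset < fixedSize)
    {
        return std::nullopt;
    }
    std::size_t pos = offset;
    FruRecord rec{};
    rec.recordSetId = static_cast<uint16_t>(data[pos] | (data[pos + 1] << 8));
    rec.recordType = data[pos + 2];
    const uint8_t numFields = data[pos + 3];
    rec.encoding = data[pos + 4];
    pos += fixedSize;

    for (uint8_t i = 0; i < numFields; ++i)
    {
        if (data.size() - pos < 2)
        {
            return std::nullopt; // no room for the type and length bytes
        }
        const uint8_t type = data[pos];
        const std::size_t length = data[pos + 1];
        pos += 2;
        if (data.size() - pos < length)
        {
            return std::nullopt; // value runs past the end of the data
        }
        rec.fields.push_back(FruField{
            type, std::string(data.begin() + pos, data.begin() + pos + length)});
        pos += length;
    }
    offset = pos;
    return rec;
}
```

Each parsed field would then become a property on the
xyz.openbmc_project.Inventory.Source.PLDM.FRU interface of the corresponding
D-Bus object, leaving the mapping to Redfish-facing inventory properties to
entity-manager.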
## Alternatives Considered
Continue using IPMI, but start making more use of OEM extensions to
suit the requirements of new platforms. However, given that the IPMI
standard is no longer under active development, we would likely end up
with a large amount of platform-specific customisations. This also does
not solve the hardware channel issues in a standard manner.
On OpenPOWER hardware at least, we've started to hit some of the limitations of
IPMI (for example, we need more than 255 sensors).

## Impacts
Development would be required to implement the PLDM protocol, the
request/response model, and platform specific handling. Low level design is
required to implement the protocol specifics of each of the PLDM Types. Such low
level design is not included in this proposal.

Design and development needs to involve the firmware stacks of management
controllers and management devices of a platform management subsystem.

## Testing
Testing can be done without having to depend on the underlying transport layer.

The responder function can be tested by mocking a requester and the transport
layer: this would essentially test the protocol handling and platform specific
handling. The requester function can be tested by mocking a responder: this
would test the instance id handling and the send/receive functions.

APIs from the shared libraries can be tested via fuzzing.

413