# PLDM stack on OpenBMC

Author: Deepak Kodihalli <dkodihal@linux.vnet.ibm.com> <dkodihal>

Primary assignee: Deepak Kodihalli

Created: 2019-01-22

## Problem Description
On OpenBMC, in-band IPMI is currently the primary industry-standard means of
communication between the BMC and the Host firmware. We've started hitting some
inherent limitations of IPMI on OpenPOWER servers: a limited number of sensors
and the lack of a generic control mechanism (sensors are a generic monitoring
mechanism) are the major ones. There is a need to improve upon the communication
protocol, but at the same time inventing a custom protocol is undesirable.

This design aims to employ Platform Level Data Model (PLDM), a standard
application layer communication protocol defined by the DMTF. PLDM draws inputs
from IPMI, but it overcomes most of the latter's limitations. PLDM is also
designed to run on standard transport protocols, e.g. MCTP (also designed by
the DMTF). MCTP provides a common transport layer over several physical
channels, by defining hardware bindings. The solution of PLDM over MCTP also
helps overcome some of the limitations of the hardware channels that IPMI uses.

PLDM's purpose is to enable all sorts of "inside the box communication": BMC -
Host, BMC - BMC, BMC - Network Controller and BMC - Other (e.g. sensor)
devices.

## Background and References
PLDM is designed to be an effective interface and data model that provides
efficient access to low-level platform inventory, monitoring, control, event,
and data/parameters transfer functions. For example, temperature, voltage, or
fan sensors can have a PLDM representation that can be used to monitor and
control the platform using a set of PLDM messages. PLDM defines data
representations and commands that abstract the platform management hardware.

PLDM groups commands under broader functions, and defines
separate specifications for each of these functions (also called PLDM "Types").
The currently defined Types (and corresponding specs) are: PLDM base (with
associated IDs and states specs), BIOS, FRU, Platform monitoring and control,
Firmware Update and SMBIOS. All these specifications are available at:

https://www.dmtf.org/standards/pmci

Some of the reasons PLDM sounds promising (some of these are advantages over
IPMI):

- Common in-band communication protocol.

- Already existing PLDM Type specifications that cover the most common
  communication requirements. Up to 64 PLDM Types can be defined (the last one
  is OEM). At the moment, 6 are defined. Each PLDM Type can house up to 256 PLDM
  commands.

- PLDM sensor IDs are 2 bytes in length, so a platform is not limited to 255
  sensors.

- PLDM introduces the concept of effecters - a control mechanism. Both sensors
  and effecters are associated with entities (similar to IPMI, entities can be
  physical or logical), where sensors are a mechanism for monitoring and
  effecters are a mechanism for control. Effecters can be numeric or state
  based. PLDM defines commonly used entities and their IDs, but there are 8K
  slots available to define OEM entities.

- A very active PLDM related working group in the DMTF.

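The Type and command limits in the list above follow from the field widths of
the PLDM message header. As a rough illustration, the sketch below packs and
unpacks the three header bytes, assuming the layout given in the PLDM base
specification (DSP0240): a request bit and a 5-bit instance id in the first
byte, a 6-bit Type in the second, and an 8-bit command code in the third. The
names `PldmHeader`, `packHeader` and `unpackHeader` are hypothetical and are not
taken from any existing library.

```cpp
#include <array>
#include <cstdint>

// Hypothetical in-memory view of the fixed PLDM header fields.
struct PldmHeader
{
    bool request;       // Rq bit: true for a request, false for a response
    uint8_t instanceId; // 5 bits
    uint8_t type;       // 6 bits, hence up to 64 PLDM Types
    uint8_t command;    // 8 bits, hence up to 256 commands per Type
};

// Pack the fields into the 3 header bytes. The datagram bit and the header
// version are left at 0 in this sketch.
constexpr std::array<uint8_t, 3> packHeader(const PldmHeader& h)
{
    return {{static_cast<uint8_t>((h.request ? 0x80 : 0x00) |
                                  (h.instanceId & 0x1F)),
             static_cast<uint8_t>(h.type & 0x3F), h.command}};
}

// Recover the fields from the 3 header bytes.
constexpr PldmHeader unpackHeader(const std::array<uint8_t, 3>& b)
{
    return {(b[0] & 0x80) != 0, static_cast<uint8_t>(b[0] & 0x1F),
            static_cast<uint8_t>(b[1] & 0x3F), b[2]};
}
```

With this layout, a request with instance id 3 starts with the byte `0x83`, and
the highest Type value, `0x3F`, is the one the list above describes as OEM.
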
The plan is to run PLDM over MCTP. MCTP is defined in a spec of its own, and a
proposal on the MCTP design is in discussion already. There's going to be an
intermediate PLDM over MCTP binding layer, which lets us send PLDM messages over
MCTP. This is defined in a spec of its own, and the design for this binding will
be proposed separately.

## Requirements
How different BMC applications make use of PLDM messages is outside the scope
of this requirements doc. The requirements listed here are related to the PLDM
protocol stack and the request/response model:

- Marshalling and unmarshalling of PLDM messages, defined in various PLDM Type
  specs, must be implemented. This can of course be staged based on the need of
  specific Types and functions. Since this is just encoding and decoding PLDM
  messages, this can be a library that could be shared between the BMC and
  other firmware stacks. The specifics of each PLDM Type (such as FRU table
  structures, sensor PDR structures, etc.) are implemented by this lib.

- Mapping PLDM concepts to native OpenBMC concepts must be implemented. For
  example: mapping PLDM sensors to phosphor-hwmon hosted D-Bus objects, mapping
  PLDM FRU data to D-Bus objects hosted by phosphor-inventory-manager, etc. The
  mapping shouldn't be restricted to D-Bus alone (meaning it shouldn't be
  necessary to put objects on the Bus just to serve PLDM requests, a problem
  that exists with phosphor-host-ipmid today). Essentially these are platform
  specific PLDM message handlers.

- The BMC should be able to act as a PLDM responder as well as a PLDM requester.
  As a PLDM requester, the BMC can monitor/control other devices. As a PLDM
  responder, the BMC can react to PLDM messages directed to it by requesters in
  the platform.

- As a PLDM requester, the BMC must be able to discover other PLDM enabled
  components in the platform.

- As a PLDM requester, the BMC must be able to send simultaneous messages to
  different responders.

- As a PLDM requester, the BMC must be able to handle out of order responses.

- As a PLDM responder, the BMC may simultaneously respond to messages from
  different requesters, but the spec doesn't mandate this. In other words the
  responder could be single-threaded.

- It should be possible to plug in OEM PLDM types/functions into the PLDM stack.

## Proposed Design
This document covers the architectural, interface, and design details. It
provides recommendations for implementations, but implementation details are
outside the scope of this document.

The design aims at having a single PLDM daemon serve both the requester and
responder functions, and having transport specific endpoints to communicate
on different channels.

The design enables concurrency aspects of the requester and responder functions,
but the goal is to employ asynchronous IO and event loops, instead of multiple
threads, wherever possible.

The following are high level structural elements of the design:

### PLDM encode/decode libraries

This library would take a PLDM message, decode it and extract the different
fields of the message. Conversely, given a PLDM Type, command code, and the
command's data fields, it would build a PLDM message. The thought is to design
this as a common library that can be used by the BMC and other firmware stacks,
because it's the encode/decode and protocol piece (and not the handling of a
message).

### PLDM provider libraries

These libraries would implement the platform specific handling of incoming PLDM
requests (basically helping with the PLDM responder implementation, see the
Request/Response Model section below), so for instance they would query D-Bus
objects (or even something like a JSON file) to fetch platform specific
information to respond to the PLDM message. They would link with the
encode/decode lib.

It should be possible to plug in a provider library that lets someone add
functionality for new PLDM (standard as well as OEM) Types. The libraries would
implement a "register" API to plug in handlers for specific PLDM messages.
Something like:

template <typename Handler, typename... args>
auto register(uint8_t type, uint8_t command, Handler handler);

This allows for providing a strongly-typed C++ handler registration scheme. It
would also be possible to validate the parameters passed to the handler at
compile time.

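To make the registration idea a little more concrete, here is a minimal sketch
of a registry keyed by (Type, command). It is an illustration under stated
assumptions, not the proposed API: `register` is a reserved word in C++, so the
sketch uses the hypothetical name `registerHandler`, and it type-erases
handlers behind `std::function` over raw payload bytes instead of providing the
strongly-typed, compile-time-checked scheme described above. `Payload`,
`Handler`, `HandlerRegistry` and `dispatch` are likewise made-up names.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>
#include <vector>

using Payload = std::vector<uint8_t>;
// Type-erased handler: takes the request payload, returns the response payload.
using Handler = std::function<Payload(const Payload&)>;

class HandlerRegistry
{
  public:
    // Plug in a handler for one (PLDM Type, command) pair.
    void registerHandler(uint8_t type, uint8_t command, Handler handler)
    {
        assert(handler != nullptr); // an empty handler could not be invoked
        handlers[std::make_pair(type, command)] = std::move(handler);
    }

    // Route a request to its handler. Returns false if nothing is registered
    // for this (Type, command) pair; otherwise fills in `response`.
    bool dispatch(uint8_t type, uint8_t command, const Payload& request,
                  Payload& response) const
    {
        auto it = handlers.find(std::make_pair(type, command));
        if (it == handlers.end())
        {
            return false;
        }
        response = it->second(request);
        return true;
    }

  private:
    std::map<std::pair<uint8_t, uint8_t>, Handler> handlers;
};
```

A provider library could call `registerHandler` once per command it supports at
load time, and the responder would call `dispatch` for each incoming request. A
`false` return would typically be turned into an error response by the caller.
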
### Request/Response Model

The PLDM daemon links with the encode/decode and provider libs. The daemon
would have to implement the following functions:

#### Receiver/Responder
The receiver wakes up on getting notified of incoming PLDM messages (via D-Bus
signal or callback from the transport layer) from a remote PLDM device. If the
message type is "Request" it would route it to a PLDM provider library. Via
the library, asynchronous D-Bus calls (using sdbusplus-asio) would be made, so
that the receiver can register a handler for the D-Bus response, instead of
having to wait for the D-Bus response. This way it can go back to listening for
incoming PLDM messages.

In the D-Bus response handler, the receiver will send out the PLDM response
message via the transport's send message API. If the transport's send message
API blocks for a considerably long duration, then it would have to be run in a
thread of its own.

If the incoming PLDM message is of type "Response", then the receiver emits a
D-Bus signal pointing to the response message. Any time the message is too
large to fit in a D-Bus payload, the message is written to a file, and a
read-only file descriptor pointing to that file is contained in the D-Bus
signal.

#### Requester
Designing the BMC as a PLDM requester is interesting. We haven't had this with
IPMI, because the BMC was typically an IPMI server. PLDM requester functions
will be spread across multiple OpenBMC applications (instead of a single big
requester app) - based on the responder they're talking to and the high level
function they implement. For example, there could be an app that lets the BMC
upgrade firmware for other devices using PLDM - this would be a generic app
in the sense that the same set of commands might have to be run irrespective
of the device on the other side. There could also be an app that does fan
control on a remote device, based on sensors from that device and algorithms
specific to that device.

##### Proposed requester design

A requester app/flow comprises the following:

- Linkage with a PLDM encode/decode library, to be able to pack PLDM requests
  and unpack PLDM responses.

- A D-Bus API to generate a unique PLDM instance id. The id needs to be unique
  across all outgoing PLDM messages (from potentially different processes).
  This needs to be on D-Bus because the id needs to be unique across PLDM
  requester app processes.

- A requester client API that provides blocking and non-blocking functions to
  transfer a PLDM request message and to receive the corresponding response
  message, over MCTP (the blocking send() will return a PLDM response).
  This will be a thin wrapper over the socket API provided by the mctp demux
  daemon. This will provide APIs for common tasks so that they need not
  be re-implemented in each PLDM requester app. This set of APIs will be built
  into the encode/decode library (so libpldm would house encode/decode APIs, and
  based on a compile time flag, the requester APIs as well). A PLDM requester
  app can choose to not use the client requester APIs, and instead can directly
  talk to the MCTP demux daemon.

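The instance id bookkeeping behind the D-Bus API in the list above could look
roughly like the sketch below. It assumes the 5-bit instance id field of the
PLDM base specification (so 32 ids) and a single shared pool, which is the
simplest reading of "unique across all outgoing PLDM messages"; an
implementation might instead keep one pool per remote endpoint. The class name
`InstanceIdPool` and its methods are hypothetical, and the D-Bus plumbing
itself is not shown.

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>

class InstanceIdPool
{
  public:
    static constexpr int numIds = 32; // 5-bit instance id field

    // Returns a free instance id, or -1 if all ids are currently in flight.
    int allocate()
    {
        for (int id = 0; id < numIds; ++id)
        {
            if (!inUse.test(id))
            {
                inUse.set(id);
                return id;
            }
        }
        return -1;
    }

    // Called once the response has arrived (or the request has timed out).
    void release(int id)
    {
        assert(id >= 0 && id < numIds);
        inUse.reset(id);
    }

  private:
    std::bitset<32> inUse; // bit N set means id N is in use
};
```

Because ids are only reusable after `release`, a requester that never releases
them will eventually see `allocate` return -1, so some timeout-based cleanup
would be needed in practice.
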
##### Proposed requester design - flow diagrams

a) With blocking API

+---------------+               +----------------+            +----------------+               +-----------------+
|BMC requester/ |               |PLDM requester  |            |PLDM responder  |               |PLDM Daemon      |
|client app     |               |lib (part of    |            |                |               |                 |
|               |               |libpldm)        |            |                |               |                 |
+-------+-------+               +-------+--------+            +--------+-------+               +---------+-------+
        |                               |                              |                                 |
        |App starts                     |                              |                                 |
        |                               |                              |                                 |
        +------------------------------->setup connection with         |                                 |
        |init(non_block=false)          |MCTP daemon                   |                                 |
        |                               |                              |                                 |
        +<-------+return_code+----------+                              |                                 |
        |                               |                              |                                 |
        |                               |                              |                                 |
        |                               |                              |                                 |
        +------------------------------>+                              |                                 |
        |encode_pldm_cmd(cmd code, args)|                              |                                 |
        |                               |                              |                                 |
        +<----+returns pldm_msg+--------+                              |                                 |
        |                               |                              |                                 |
        |                               |                              |                                 |
        |----------------------------------------------------------------------------------------------->|
        |DBus.getPLDMInstanceId()       |                              |                                 |
        |                               |                              |                                 |
        |<-------------------------returns PLDM instance id----------------------------------------------|
        |                               |                              |                                 |
        +------------------------------>+                              |                                 |
        |send_msg(mctp_eids, pldm_msg)  +----------------------------->+                                 |
        |                               |write msg to MCTP socket      |                                 |
        |                               +----------------------------->+                                 |
        |                               |call blocking recv() on socket|                                 |
        |                               |                              |                                 |
        |                               +<-+returns pldm_response+-----+                                 |
        |                               |                              |                                 |
        |                               +----+                         |                                 |
        |                               |    | verify eids, instance id|                                 |
        |                               +<---+                         |                                 |
        |                               |                              |                                 |
        +<--+returns pldm_response+-----+                              |                                 |
        |                               |                              |                                 |
        |                               |                              |                                 |
        |                               |                              |                                 |
        +------------------------------>+                              |                                 |
        |decode_pldm_cmd(pldm_resp,     |                              |                                 |
        |                output args)   |                              |                                 |
        |                               |                              |                                 |
        +------------------------------>+                              |                                 |
        |close_connection()             |                              |                                 |
        +                               +                              +                                 +

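The "verify eids, instance id" step in the blocking flow above could be
sketched as follows. The check assumes the header layout from the PLDM base
specification (request bit in bit 7 and instance id in bits 4:0 of the first
byte, with a 3-byte header overall). `PendingRequest` and `isResponseTo` are
hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// What the requester library remembers about the request it just sent.
struct PendingRequest
{
    uint8_t destEid;    // MCTP endpoint the request was sent to
    uint8_t instanceId; // instance id placed in the request header
};

// Returns true if `msg` (a PLDM message received from `srcEid`) is a response
// that corresponds to `req`.
bool isResponseTo(const PendingRequest& req, uint8_t srcEid,
                  const uint8_t* msg, std::size_t len)
{
    assert(msg != nullptr);
    constexpr std::size_t headerLen = 3;
    return len >= headerLen && srcEid == req.destEid &&
           (msg[0] & 0x80) == 0 &&            // Rq bit clear: a response
           (msg[0] & 0x1F) == req.instanceId; // same 5-bit instance id
}
```

A blocking `send_msg()` could keep calling `recv()` and discard messages until
this check passes, since the socket may also deliver traffic that belongs to
other requesters.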

b) With non-blocking API

+---------------+               +----------------+            +----------------+             +---------------+
|BMC requester/ |               |PLDM requester  |            |PLDM responder  |             |PLDM daemon    |
|client app     |               |lib (part of    |            |                |             |               |
|               |               |libpldm)        |            |                |             |               |
+-------+-------+               +-------+--------+            +--------+-------+             +--------+------+
        |                               |                              |                              |
        |App starts                     |                              |                              |
        |                               |                              |                              |
        +------------------------------->setup connection with         |                              |
        |init(non_block=true,           |MCTP daemon                   |                              |
        |     int* o_mctp_fd)           |                              |                              |
        |                               |                              |                              |
        +<-------+return_code+----------+                              |                              |
        |                               |                              |                              |
        |                               |                              |                              |
        |                               |                              |                              |
        +------------------------------>+                              |                              |
        |encode_pldm_cmd(cmd code, args)|                              |                              |
        |                               |                              |                              |
        +<----+returns pldm_msg+--------+                              |                              |
        |                               |                              |                              |
        |-------------------------------------------------------------------------------------------->|
        |DBus.getPLDMInstanceId()       |                              |                              |
        |                               |                              |                              |
        |<-------------------------returns PLDM instance id-------------------------------------------|
        |                               |                              |                              |
        |                               |                              |                              |
        +------------------------------>+                              |                              |
        |send_msg(eids, pldm_msg,       +----------------------------->+                              |
        |         non_block=true)       |write msg to MCTP socket      |                              |
        |                               +<---+return_code+-------------+                              |
        +<-+returns rc, doesn't block+--+                              |                              |
        |                               |                              |                              |
        +------+                        |                              |                              |
        |      |Add EPOLLIN on mctp_fd  |                              |                              |
        |      |to self.event_loop      |                              |                              |
        +<-----+                        |                              |                              |
        |                               +                              |                              |
        +<----------------------+PLDM response msg written to mctp_fd+-+                              |
        |                               +                              |                              |
        +------+EPOLLIN on mctp_fd      |                              |                              |
        |      |received                |                              |                              |
        |      |                        |                              |                              |
        +<-----+                        |                              |                              |
        |                               |                              |                              |
        +------------------------------>+                              |                              |
        |decode_pldm_cmd(pldm_response) |                              |                              |
        |                               |                              |                              |
        +------------------------------>+                              |                              |
        |close_connection()             |                              |                              |
        +                               +                              +                              +

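The "add EPOLLIN on mctp_fd" and "EPOLLIN received" steps of the non-blocking
flow above can be illustrated with the Linux epoll API. The sketch below is a
simplification: it waits once for a single fd and reads one message, whereas a
real app would add the fd returned by `init()` to its existing event loop and
carry on with other work. `waitForResponse` is a hypothetical helper, and the
1024-byte buffer is an arbitrary choice for the example.

```cpp
#include <sys/epoll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#include <cassert>
#include <cstdint>
#include <vector>

// Waits up to timeoutMs for `fd` to become readable, then reads one message.
// Returns an empty vector on timeout or error.
std::vector<uint8_t> waitForResponse(int fd, int timeoutMs)
{
    assert(fd >= 0);
    std::vector<uint8_t> out;

    int ep = epoll_create1(0);
    if (ep < 0)
    {
        return out;
    }

    epoll_event ev{};
    ev.events = EPOLLIN; // wake up when the fd has data to read
    ev.data.fd = fd;
    if (epoll_ctl(ep, EPOLL_CTL_ADD, fd, &ev) == 0)
    {
        epoll_event ready{};
        if (epoll_wait(ep, &ready, 1, timeoutMs) == 1 &&
            (ready.events & EPOLLIN))
        {
            uint8_t buf[1024];
            ssize_t n = recv(fd, buf, sizeof(buf), MSG_DONTWAIT);
            if (n > 0)
            {
                out.assign(buf, buf + n);
            }
        }
    }
    close(ep);
    return out;
}
```

The bytes returned here would then be handed to the decode function, after the
same eid and instance id verification that the blocking flow performs.
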
##### Alternative to the proposed requester design

a) Define D-Bus interfaces to send and receive PLDM messages:

```
method sendPLDM(uint8 mctp_eid, uint8 msg[])

signal recvPLDM(uint8 mctp_eid, uint8 pldm_instance_id, uint8 msg[])
```

PLDM requester apps can then invoke the above interfaces. While this
simplifies things for the user, it has two disadvantages:
- the app implementing such an interface could be a single point of failure,
  plus sending messages concurrently would be a challenge.
- the message payload could be large (several pages), and copying the same for
  D-Bus transfers might be undesirable.

### Multiple transport channels
The PLDM daemon might have to talk to remote PLDM devices via different
channels. While a level of abstraction might be provided by MCTP, the PLDM
daemon would have to implement a D-Bus interface to target a specific
transport channel, so that requester apps on the BMC can send messages over
that transport. Also, it should be possible to plug in platform specific D-Bus
objects that implement an interface to target a platform specific transport.

## Alternatives Considered
Continue using IPMI, but start making more use of OEM extensions to
suit the requirements of new platforms. However, given that the IPMI
standard is no longer under active development, we would likely end up
with a large amount of platform-specific customisations. This also does
not solve the hardware channel issues in a standard manner.
On OpenPOWER hardware at least, we've started to hit some of the limitations of
IPMI (for example, we need more than 255 sensors).

## Impacts
Development would be required to implement the PLDM protocol, the
request/response model, and platform specific handling. Low level design is
required to implement the protocol specifics of each of the PLDM Types. Such low
level design is not included in this proposal.

Design and development needs to involve the firmware stacks of management
controllers and management devices of a platform management subsystem.

## Testing
Testing can be done without having to depend on the underlying transport layer.

The responder function can be tested by mocking a requester and the transport
layer: this would essentially test the protocol handling and platform specific
handling. The requester function can be tested by mocking a responder: this
would test the instance id handling and the send/receive functions.

APIs from the shared libraries can be tested via fuzzing.

376