# PLDM stack on OpenBMC

Author: Deepak Kodihalli <dkodihal@linux.vnet.ibm.com> <dkodihal>

Primary assignee: Deepak Kodihalli

Created: 2019-01-22

## Problem Description
On OpenBMC, in-band IPMI is currently the primary industry-standard means of
communication between the BMC and the Host firmware. We've started hitting some
inherent limitations of IPMI on OpenPOWER servers: a limited number of sensors
and the lack of a generic control mechanism (sensors are a generic monitoring
mechanism) are the major ones. There is a need to improve upon the communication
protocol, but at the same time inventing a custom protocol is undesirable.

This design aims to employ Platform Level Data Model (PLDM), a standard
application layer communication protocol defined by the DMTF. PLDM draws inputs
from IPMI, but it overcomes most of the latter's limitations. PLDM is also
designed to run on standard transport protocols, e.g. MCTP (also designed by
the DMTF). MCTP provides a common transport layer over several physical
channels, by defining hardware bindings. The solution of PLDM over MCTP also
helps overcome some of the limitations of the hardware channels that IPMI uses.

PLDM's purpose is to enable all sorts of "inside the box communication": BMC -
Host, BMC - BMC, BMC - Network Controller and BMC - Other (e.g. sensor)
devices.

## Background and References
PLDM is designed to be an effective interface and data model that provides
efficient access to low-level platform inventory, monitoring, control, event,
and data/parameters transfer functions. For example, temperature, voltage, or
fan sensors can have a PLDM representation that can be used to monitor and
control the platform using a set of PLDM messages. PLDM defines data
representations and commands that abstract the platform management hardware.

PLDM groups commands under broader functions, and defines
separate specifications for each of these functions (also called PLDM "Types").
The currently defined Types (and corresponding specs) are: PLDM base (with
associated IDs and states specs), BIOS, FRU, Platform monitoring and control,
Firmware Update and SMBIOS. All these specifications are available at:

https://www.dmtf.org/standards/pmci

Some of the reasons PLDM sounds promising (some of these are advantages over
IPMI):

- Common in-band communication protocol.

- Already existing PLDM Type specifications that cover the most common
  communication requirements. Up to 64 PLDM Types can be defined (the last one
  is OEM). At the moment, 6 are defined. Each PLDM type can house up to 256 PLDM
  commands.

- PLDM sensor IDs are 2 bytes in length.

- PLDM introduces the concept of effecters - a control mechanism. Both sensors
  and effecters are associated with entities (similar to IPMI, entities can be
  physical or logical), where sensors are a mechanism for monitoring and
  effecters are a mechanism for control. Effecters can be numeric or state
  based. PLDM defines commonly used entities and their IDs, but there are 8K
  slots available to define OEM entities.

- A very active PLDM related working group in the DMTF.

The plan is to run PLDM over MCTP. MCTP is defined in a spec of its own, and a
proposal on the MCTP design is in discussion already. There's going to be an
intermediate PLDM over MCTP binding layer, which lets us send PLDM messages over
MCTP. This is defined in a spec of its own, and the design for this binding will
be proposed separately.

## Requirements
How different BMC applications make use of PLDM messages is outside the scope
of this requirements doc. The requirements listed here are related to the PLDM
protocol stack and the request/response model:

- Marshalling and unmarshalling of PLDM messages, defined in various PLDM Type
  specs, must be implemented. This can of course be staged based on the need of
  specific Types and functions. Since this is just encoding and decoding PLDM
  messages, this can be a library that could be shared between the BMC and other
  firmware stacks. The specifics of each PLDM Type (such as FRU table
  structures, sensor PDR structures, etc.) are implemented by this lib.

- Mapping PLDM concepts to native OpenBMC concepts must be implemented. For
  example: mapping PLDM sensors to phosphor-hwmon hosted D-Bus objects, mapping
  PLDM FRU data to D-Bus objects hosted by phosphor-inventory-manager, etc. The
  mapping shouldn't be restricted to D-Bus alone (meaning it shouldn't be
  necessary to put objects on the Bus just to serve PLDM requests, a problem
  that exists with phosphor-host-ipmid today). Essentially these are platform
  specific PLDM message handlers.

- The BMC should be able to act as a PLDM responder as well as a PLDM requester.
  As a PLDM requester, the BMC can monitor/control other devices. As a PLDM
  responder, the BMC can react to PLDM messages directed to it via requesters in
  the platform.

- As a PLDM requester, the BMC must be able to discover other PLDM enabled
  components in the platform.

- As a PLDM requester, the BMC must be able to send simultaneous messages to
  different responders.

- As a PLDM requester, the BMC must be able to handle out-of-order responses.

- As a PLDM responder, the BMC may simultaneously respond to messages from
  different requesters, but the spec doesn't mandate this. In other words the
  responder could be single-threaded.

- It should be possible to plug in OEM PLDM types/functions into the PLDM stack.

## Proposed Design
This document covers the architectural, interface, and design details. It
provides recommendations for implementations, but implementation details are
outside the scope of this document.

The design aims at having a single PLDM daemon serve both the requester and
responder functions, and having transport specific endpoints to communicate
on different channels.

The design enables concurrency aspects of the requester and responder functions,
but the goal is to employ asynchronous IO and event loops, instead of multiple
threads, wherever possible.

The following are high level structural elements of the design:

### PLDM encode/decode libraries

This library would take a PLDM message, decode it and extract the different
fields of the message. Conversely, given a PLDM Type, command code, and the
command's data fields, it would make a PLDM message. The thought is to design
this as a common library that can be used by the BMC and other firmware stacks,
because it's the encode/decode and protocol piece (and not the handling of a
message).

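As a rough illustration of the encode/decode split described above, here is a
minimal sketch of the header handling. All names (`PldmHeader`,
`encodeMessage`, `decodeMessage`) are hypothetical and not an existing API. The
3-byte header layout (request bit and 5-bit instance id in byte 0, 6-bit Type
in byte 1, command code in byte 2) is an assumption based on our reading of the
PLDM base spec and should be checked against it; the header version is fixed
at 0 here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical types and functions, for illustration only.
struct PldmHeader
{
    bool request;       // Rq bit: true for a request, false for a response
    uint8_t instanceId; // 5 bits, 0..31
    uint8_t type;       // 6 bits, 0..63 (hence "up to 64 PLDM Types")
    uint8_t command;    // 8 bits, 0..255 (hence "up to 256 commands")
};

constexpr std::size_t headerSize = 3;

// Build a PLDM message: the 3-byte header followed by the command's data.
inline std::vector<uint8_t> encodeMessage(const PldmHeader& hdr,
                                          const std::vector<uint8_t>& data)
{
    assert(hdr.instanceId < 32 && hdr.type < 64);
    std::vector<uint8_t> msg;
    msg.reserve(headerSize + data.size());
    msg.push_back(static_cast<uint8_t>((hdr.request ? 0x80 : 0x00) |
                                       (hdr.instanceId & 0x1F)));
    msg.push_back(static_cast<uint8_t>(hdr.type & 0x3F)); // header version 0
    msg.push_back(hdr.command);
    msg.insert(msg.end(), data.begin(), data.end());
    return msg;
}

struct DecodedMessage
{
    PldmHeader header;
    std::vector<uint8_t> data;
};

// Split a raw message into header fields and data; nullopt if too short.
inline std::optional<DecodedMessage>
    decodeMessage(const std::vector<uint8_t>& msg)
{
    if (msg.size() < headerSize)
    {
        return std::nullopt;
    }
    DecodedMessage out{};
    out.header.request = (msg[0] & 0x80) != 0;
    out.header.instanceId = msg[0] & 0x1F;
    out.header.type = msg[1] & 0x3F;
    out.header.command = msg[2];
    out.data.assign(msg.begin() + headerSize, msg.end());
    return out;
}
```

Because the sketch only touches byte buffers, nothing in it depends on D-Bus or
on the BMC, which is what would let other firmware stacks reuse such a library.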
### PLDM provider libraries

These libraries would implement the platform specific handling of incoming PLDM
requests (basically helping with the PLDM responder implementation, see the
Request/Response Model section below), so for instance they would query D-Bus
objects (or even something like a JSON file) to fetch platform specific
information to respond to the PLDM message. They would link with the
encode/decode lib.

It should be possible to plug in a provider library that lets someone add
functionality for new PLDM (standard as well as OEM) Types. The libraries would
implement a "register" API to plug in handlers for specific PLDM messages.
Something like:

    // "register" is a reserved word in C++, hence registerHandler
    template <typename Handler, typename... args>
    auto registerHandler(uint8_t type, uint8_t command, Handler handler);

This allows for providing a strongly-typed C++ handler registration scheme. It
would also be possible to validate the parameters passed to the handler at
compile time.

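One way the registration scheme above could look, as a minimal sketch rather
than a proposed API: `HandlerRegistry`, `Payload` and `dispatch` are
hypothetical names, and the handler signature (request payload in, response
payload out) is an assumption made for the example. The `static_assert` shows
the kind of compile-time validation mentioned above.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <optional>
#include <type_traits>
#include <utility>
#include <vector>

using Payload = std::vector<uint8_t>;

// Hypothetical registry keyed by (PLDM Type, command code).
class HandlerRegistry
{
  public:
    template <typename Handler>
    void registerHandler(uint8_t type, uint8_t command, Handler&& handler)
    {
        // Reject handlers with the wrong signature at compile time.
        static_assert(std::is_invocable_r_v<Payload, Handler, const Payload&>,
                      "handler must be callable as Payload(const Payload&)");
        handlers_[{type, command}] = std::forward<Handler>(handler);
    }

    // Run the handler registered for (type, command); nullopt if none exists.
    std::optional<Payload> dispatch(uint8_t type, uint8_t command,
                                    const Payload& request) const
    {
        auto it = handlers_.find({type, command});
        if (it == handlers_.end())
        {
            return std::nullopt;
        }
        return it->second(request);
    }

  private:
    std::map<std::pair<uint8_t, uint8_t>,
             std::function<Payload(const Payload&)>>
        handlers_;
};
```

A provider library, standard or OEM, would call `registerHandler` once per
command it supports; the daemon would then only need `dispatch`. A missing
entry maps naturally to an "unsupported command" response.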
### Request/Response Model

The PLDM daemon links with the encode/decode and provider libs. The daemon
would have to implement the following functions:

#### Receiver/Responder
The receiver wakes up on getting notified of incoming PLDM messages (via D-Bus
signal or callback from the transport layer) from a remote PLDM device. If the
message type is "Request", it would route the message to a PLDM provider
library. Via the library, asynchronous D-Bus calls (using sdbusplus-asio) would
be made, so that the receiver can register a handler for the D-Bus response,
instead of having to wait for the D-Bus response. This way it can go back to
listening for incoming PLDM messages.

In the D-Bus response handler, the receiver will send out the PLDM response
message via the transport's send message API. If the transport's send message
API blocks for a considerably long duration, then it would have to be run in a
thread of its own.

If the incoming PLDM message is of type "Response", then the receiver emits a
D-Bus signal pointing to the response message. Any time the message is too
large to fit in a D-Bus payload, the message is written to a file, and a
read-only file descriptor pointing to that file is contained in the D-Bus
signal.

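The routing decision described in this section can be sketched as follows. This
is only an illustration: `Receiver`, `Delivery` and `maxInlinePayload` are
hypothetical, the 4096-byte threshold is an arbitrary placeholder (the design
does not state a limit), and the sketch looks only at the request bit in the
first header byte, ignoring the datagram bit. The two callbacks stand in for
the provider library and for D-Bus signal emission.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

using Message = std::vector<uint8_t>;

// How a response is handed to requester apps (hypothetical).
enum class Delivery
{
    Inline,        // payload carried in the D-Bus signal itself
    FileDescriptor // payload written to a file, fd carried in the signal
};

// Assumed threshold, not taken from the design.
constexpr std::size_t maxInlinePayload = 4096;

class Receiver
{
  public:
    using RequestSink = std::function<void(const Message&)>;
    using ResponseSink = std::function<void(const Message&, Delivery)>;

    Receiver(RequestSink onRequest, ResponseSink onResponse) :
        onRequest_(std::move(onRequest)), onResponse_(std::move(onResponse))
    {}

    // Called from the event loop when the transport delivers a message.
    // Returns false if the message is too short to hold a PLDM header.
    bool onMessage(const Message& msg) const
    {
        if (msg.size() < 3)
        {
            return false;
        }
        const bool isRequest = (msg[0] & 0x80) != 0;
        if (isRequest)
        {
            onRequest_(msg); // route to a provider library
        }
        else
        {
            onResponse_(msg, msg.size() > maxInlinePayload
                                 ? Delivery::FileDescriptor
                                 : Delivery::Inline);
        }
        return true;
    }

  private:
    RequestSink onRequest_;
    ResponseSink onResponse_;
};
```

Both sinks are expected to return quickly (for example by starting an
asynchronous D-Bus call), so that `onMessage` never blocks the event loop.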
#### Requester
Designing the BMC as a PLDM requester is interesting. We haven't had this with
IPMI, because the BMC was typically an IPMI server. PLDM requester functions
will be spread across multiple OpenBMC applications (instead of a single big
requester app) - based on the responder they're talking to and the high level
function they implement. For example, there could be an app that lets the BMC
upgrade firmware for other devices using PLDM - this would be a generic app
in the sense that the same set of commands might have to be run irrespective
of the device on the other side. There could also be an app that does fan
control on a remote device, based on sensors from that device and algorithms
specific to that device.

The PLDM daemon would have to implement D-Bus interfaces to form the requester
functions: a method to send a PLDM message over the underlying transport (again,
this will have two versions: one that accepts a byte stream, and the other that
accepts an fd, for large messages) and a signal to indicate a PLDM response from
the remote PLDM device. The signal would consist of the transport headers, PLDM
headers, and the PLDM payload.

The typical flow for a requester app would be to send the PLDM message via the
D-Bus API (the PLDM daemon would have to assign an instance id), and add a
handler for the D-Bus signal containing the response. As this flow is
asynchronous, the requester app can execute other scheduled work, if any, in its
event loop, while it waits for the D-Bus signal containing the response. The
D-Bus API to send a PLDM message to the remote PLDM device would call the
underlying transport's send API. If that API blocks for too long, the call may
have to run in a thread of its own. The D-Bus signal containing a response
message is emitted by the receiver (see above).

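Instance id assignment and the matching of out-of-order responses could be
handled along these lines. This is a sketch under assumptions: `RequestTracker`
and its methods are hypothetical, the `endpoint` key stands for whatever
identifies the remote device on the transport, and the limit of 32 ids per
endpoint follows from the 5-bit instance id field as we read the PLDM base
spec.

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <optional>
#include <utility>
#include <vector>

using Payload = std::vector<uint8_t>;
using ResponseHandler = std::function<void(const Payload&)>;

// Hypothetical bookkeeping for requests that are still awaiting a response.
class RequestTracker
{
  public:
    // Reserve a free instance id for `endpoint` and remember the handler.
    // Returns nullopt when all ids for that endpoint are in flight.
    std::optional<uint8_t> begin(uint8_t endpoint, ResponseHandler handler)
    {
        auto& used = inUse_[endpoint];
        for (uint8_t id = 0; id < maxIds; ++id)
        {
            if (!used.test(id))
            {
                used.set(id);
                pending_[{endpoint, id}] = std::move(handler);
                return id;
            }
        }
        return std::nullopt;
    }

    // Match a response to its request, free the id, and run the handler.
    // Returns false for a response that matches no pending request.
    bool complete(uint8_t endpoint, uint8_t id, const Payload& response)
    {
        auto it = pending_.find({endpoint, id});
        if (it == pending_.end())
        {
            return false;
        }
        ResponseHandler handler = std::move(it->second);
        pending_.erase(it);
        inUse_[endpoint].reset(id);
        handler(response);
        return true;
    }

  private:
    static constexpr uint8_t maxIds = 32; // 5-bit instance id (assumption)
    std::map<uint8_t, std::bitset<maxIds>> inUse_;
    std::map<std::pair<uint8_t, uint8_t>, ResponseHandler> pending_;
};
```

Since responses are looked up by (endpoint, instance id) rather than by arrival
order, a late response to an earlier request still reaches the right handler.
A real implementation would also need timeouts to reclaim ids whose responses
never arrive.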
### Multiple transport channels
The PLDM daemon might have to talk to remote PLDM devices via different
channels. While a level of abstraction might be provided by MCTP, the PLDM
daemon would have to implement a D-Bus interface to target a specific
transport channel, so that requester apps on the BMC can send messages over
that transport. Also, it should be possible to plug in platform specific D-Bus
objects that implement an interface to target a platform specific transport.

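Inside the daemon, targeting a specific channel might reduce to a small
interface plus a lookup by name, sketched below. `Transport`,
`TransportRouter` and `RecordingTransport` are hypothetical, as is the idea of
naming channels with strings; the D-Bus interface itself is not shown.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

using Message = std::vector<uint8_t>;

// Hypothetical abstraction over one transport channel.
class Transport
{
  public:
    virtual ~Transport() = default;
    // Send `msg` to the remote device `endpoint`; true on success.
    virtual bool send(uint8_t endpoint, const Message& msg) = 0;
};

// Routes a send request to the channel the caller asked for.
class TransportRouter
{
  public:
    void add(const std::string& name, std::unique_ptr<Transport> transport)
    {
        transports_[name] = std::move(transport);
    }

    // Returns false if no channel with that name has been plugged in.
    bool send(const std::string& name, uint8_t endpoint, const Message& msg)
    {
        auto it = transports_.find(name);
        return it != transports_.end() && it->second->send(endpoint, msg);
    }

  private:
    std::map<std::string, std::unique_ptr<Transport>> transports_;
};

// Stand-in channel that records what was sent instead of touching hardware.
class RecordingTransport : public Transport
{
  public:
    explicit RecordingTransport(std::vector<Message>& log) : log_(log) {}

    bool send(uint8_t /*endpoint*/, const Message& msg) override
    {
        log_.push_back(msg);
        return true;
    }

  private:
    std::vector<Message>& log_;
};
```

A platform specific transport would be one more `Transport` subclass added to
the router, without changes to the requester-facing code.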
## Alternatives Considered
Continue using IPMI, but start making more use of OEM extensions to
suit the requirements of new platforms. However, given that the IPMI
standard is no longer under active development, we would likely end up
with a large amount of platform-specific customisations. This also does
not solve the hardware channel issues in a standard manner.
On OpenPOWER hardware at least, we've started to hit some of the limitations of
IPMI (for example, we need more than 255 sensors).

## Impacts
Development would be required to implement the PLDM protocol, the
request/response model, and platform specific handling. Low level design is
required to implement the protocol specifics of each of the PLDM Types. Such low
level design is not included in this proposal.

Design and development need to involve the firmware stacks of management
controllers and management devices of a platform management subsystem.

## Testing
Testing can be done without having to depend on the underlying transport layer.

The responder function can be tested by mocking a requester and the transport
layer: this would essentially test the protocol handling and platform specific
handling. The requester function can be tested by mocking a responder: this
would test the instance id handling and the send/receive functions.

APIs from the shared libraries can be tested via fuzzing.
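
As an illustration of the fuzzing approach, a harness for a decode routine
could look like the sketch below. `decodeHeader` is a hypothetical stand-in
for a function from the encode/decode library, and the header layout it
assumes (6-bit Type in byte 1, command code in byte 2) should be checked
against the PLDM base spec. `LLVMFuzzerTestOneInput` is the entry point
libFuzzer expects.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a decode routine from the shared library.
inline bool decodeHeader(const uint8_t* data, std::size_t size, uint8_t& type,
                         uint8_t& command)
{
    if (data == nullptr || size < 3)
    {
        return false;
    }
    type = data[1] & 0x3F;
    command = data[2];
    return true;
}

// libFuzzer calls this repeatedly with generated inputs.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, std::size_t size)
{
    uint8_t type = 0;
    uint8_t command = 0;
    if (decodeHeader(data, size, type, command))
    {
        // Invariant the decoder must uphold for every accepted input.
        assert(type < 64);
    }
    return 0;
}
```

Built with `clang++ -fsanitize=fuzzer,address`, such a harness needs no
transport layer or D-Bus, in line with the transport-independent testing
described above.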