.. SPDX-License-Identifier: GPL-2.0

==============================================
Management Component Transport Protocol (MCTP)
==============================================

net/mctp/ contains protocol support for MCTP, as defined by DMTF standard
DSP0236. Physical interface drivers ("bindings" in the specification) are
provided in drivers/net/mctp/.

The core code provides a socket-based interface to send and receive MCTP
messages, through an AF_MCTP, SOCK_DGRAM socket.

Structure: interfaces & networks
================================

The kernel models the local MCTP topology through two items: interfaces and
networks.

An interface (or "link") is an instance of an MCTP physical transport binding
(as defined by DSP0236, section 3.2.47), likely connected to a specific hardware
device. This is represented as a ``struct net_device``.

A network defines a unique address space for MCTP endpoints by endpoint-ID
(described by DSP0236, section 3.2.31). A network has a user-visible identifier
to allow references from userspace. Route definitions are specific to one
network.

Each interface is associated with one network. A network may be associated with
one or more interfaces.

If multiple networks are present, each may contain endpoint IDs (EIDs) that are
also present on other networks.
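
Because EIDs are only unique within a network, userspace code that tracks
endpoints generally needs to key on the (network, EID) pair rather than on the
EID alone. A minimal sketch of that idea follows; ``struct endpoint_key`` and
``endpoint_key_equal()`` are hypothetical names used for illustration, and are
not part of any kernel or uapi header:

.. code-block:: C

    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical illustration type: one endpoint on one network. */
    struct endpoint_key {
            unsigned int net;       /* network identifier */
            uint8_t eid;            /* endpoint ID, unique only within 'net' */
    };

    /* Two keys name the same endpoint only if both fields match. */
    static bool endpoint_key_equal(struct endpoint_key a, struct endpoint_key b)
    {
            return a.net == b.net && a.eid == b.eid;
    }

With this comparison, EID 8 on network 1 and EID 8 on network 2 are treated as
different endpoints.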

Sockets API
===========

Protocol definitions
--------------------

MCTP uses ``AF_MCTP`` / ``PF_MCTP`` for the address and protocol families.
Since MCTP is message-based, only ``SOCK_DGRAM`` sockets are supported.

.. code-block:: C

    int sd = socket(AF_MCTP, SOCK_DGRAM, 0);

Currently, the only valid value for the ``protocol`` argument is 0.

As with all socket address families, source and destination addresses are
specified with a ``sockaddr`` type, with a single-byte endpoint address:

.. code-block:: C

    typedef __u8		mctp_eid_t;

    struct mctp_addr {
            mctp_eid_t		s_addr;
    };

    struct sockaddr_mctp {
            __kernel_sa_family_t smctp_family;
            unsigned int         smctp_network;
            struct mctp_addr     smctp_addr;
            __u8                 smctp_type;
            __u8                 smctp_tag;
    };

    #define MCTP_NET_ANY	0x0
    #define MCTP_ADDR_ANY	0xff
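
Applications fill in every field of ``struct sockaddr_mctp`` before passing it
to a socket call. The sketch below wraps that in a hypothetical helper,
``mctp_addr_init()``, which is not provided by any header. It assumes the
definitions above are available from ``<linux/mctp.h>``, and falls back to the
value 45 for ``AF_MCTP`` when the C library headers do not define it:

.. code-block:: C

    #include <string.h>
    #include <sys/socket.h>
    #include <linux/mctp.h>

    #ifndef AF_MCTP
    #define AF_MCTP 45      /* assumed fallback for older libc headers */
    #endif

    /*
     * Hypothetical helper: initialise a sockaddr_mctp. Zeroing the whole
     * structure first also clears any bytes between the named fields.
     */
    static void mctp_addr_init(struct sockaddr_mctp *addr, unsigned int net,
                               mctp_eid_t eid, __u8 type, __u8 tag)
    {
            memset(addr, 0, sizeof(*addr));
            addr->smctp_family = AF_MCTP;
            addr->smctp_network = net;
            addr->smctp_addr.s_addr = eid;
            addr->smctp_type = type;
            addr->smctp_tag = tag;
    }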


Syscall behaviour
-----------------

The following sections describe the MCTP-specific behaviours of the standard
socket system calls. These behaviours have been chosen to map closely to the
existing sockets APIs.

``bind()`` : set local socket address
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Sockets that receive incoming request packets will bind to a local address,
using the ``bind()`` syscall.

.. code-block:: C

    /* zero any fields not set explicitly below */
    struct sockaddr_mctp addr = { 0 };

    addr.smctp_family = AF_MCTP;
    addr.smctp_network = MCTP_NET_ANY;
    addr.smctp_addr.s_addr = MCTP_ADDR_ANY;
    addr.smctp_type = MCTP_TYPE_PLDM;
    addr.smctp_tag = MCTP_TAG_OWNER;

    int rc = bind(sd, (struct sockaddr *)&addr, sizeof(addr));

This establishes the local address of the socket. Incoming MCTP messages that
match the network, address, and message type will be received by this socket.
The reference to 'incoming' is important here; a bound socket will only receive
messages with the TO (tag owner) bit set, which indicates an incoming request
message rather than a response.

The ``smctp_tag`` value will configure the tags accepted from the remote side of
this socket. Given the above, the only valid value is ``MCTP_TAG_OWNER``, which
will result in remotely "owned" tags being routed to this socket. Since
``MCTP_TAG_OWNER`` is set, the 3 least-significant bits of ``smctp_tag`` are not
used; callers must set them to zero.

A ``smctp_network`` value of ``MCTP_NET_ANY`` will configure the socket to
receive incoming packets from any locally-connected network. A specific network
value will cause the socket to only receive incoming messages from that network.

The ``smctp_addr`` field specifies a local address to bind to. A value of
``MCTP_ADDR_ANY`` configures the socket to receive messages addressed to any
local destination EID.

The ``smctp_type`` field specifies which message types to receive. Only the
lower 7 bits of the type are matched on incoming messages (i.e., the
most-significant IC bit is not part of the match). This results in the socket
receiving packets with and without a message integrity check footer.
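
The type match described above can be illustrated in a few lines of C. This is
only a sketch of the rule as stated in this document, not the kernel's lookup
code; ``TYPE_MASK``, ``IC_BIT`` and both function names are hypothetical:

.. code-block:: C

    #include <stdbool.h>
    #include <stdint.h>

    #define TYPE_MASK 0x7f  /* lower 7 bits: message type */
    #define IC_BIT    0x80  /* most-significant bit: integrity check present */

    /* Compare a bound type against the type byte of an incoming message. */
    static bool type_matches(uint8_t bound_type, uint8_t msg_type_byte)
    {
            return bound_type == (msg_type_byte & TYPE_MASK);
    }

    /* Report whether the incoming message carries an integrity check. */
    static bool has_integrity_check(uint8_t msg_type_byte)
    {
            return (msg_type_byte & IC_BIT) != 0;
    }

For a socket bound to type ``0x01``, incoming type bytes ``0x01`` and ``0x81``
both match; only the second indicates an integrity check.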

``sendto()``, ``sendmsg()``, ``send()`` : transmit an MCTP message
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

An MCTP message is transmitted using one of the ``sendto()``, ``sendmsg()`` or
``send()`` syscalls. Using ``sendto()`` as the primary example:

.. code-block:: C

    struct sockaddr_mctp addr = { 0 };
    char buf[14];
    ssize_t len;

    /* set message destination */
    addr.smctp_family = AF_MCTP;
    addr.smctp_network = 0;
    addr.smctp_addr.s_addr = 8;
    addr.smctp_tag = MCTP_TAG_OWNER;
    addr.smctp_type = MCTP_TYPE_ECHO;

    /* arbitrary message to send, with message-type header */
    buf[0] = MCTP_TYPE_ECHO;
    memcpy(buf + 1, "hello, world!", sizeof(buf) - 1);

    len = sendto(sd, buf, sizeof(buf), 0,
                    (struct sockaddr *)&addr, sizeof(addr));

The network and address fields of ``addr`` define the remote address to send to.
If ``smctp_tag`` has the ``MCTP_TAG_OWNER`` bit set, the kernel will ignore any
bits set in the tag value (the 3 least-significant bits, ``MCTP_TAG_MASK``), and
generate a tag value suitable for the destination EID. If ``MCTP_TAG_OWNER`` is
not set, the message will be sent with the tag value as specified. If a tag
value cannot be allocated, the system call will report an errno of ``EAGAIN``.
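
To make the layout of ``smctp_tag`` concrete, the following sketch splits a tag
byte into its owner bit and its 3-bit value. ``TAG_OWNER_BIT`` and
``TAG_VALUE_MASK`` are local stand-ins, assumed here to have the values 0x08
and 0x07; the function names are hypothetical:

.. code-block:: C

    #include <stdbool.h>
    #include <stdint.h>

    /* Local stand-ins; the values are assumptions of this sketch. */
    #define TAG_OWNER_BIT   0x08
    #define TAG_VALUE_MASK  0x07

    /* True if the tag-owner bit is set in an smctp_tag value. */
    static bool tag_is_owner(uint8_t smctp_tag)
    {
            return (smctp_tag & TAG_OWNER_BIT) != 0;
    }

    /* Extract the 3-bit tag value from an smctp_tag value. */
    static uint8_t tag_value(uint8_t smctp_tag)
    {
            return smctp_tag & TAG_VALUE_MASK;
    }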

The application must provide the message type byte as the first byte of the
message buffer passed to ``sendto()``. If a message integrity check is to be
included in the transmitted message, it must also be provided in the message
buffer, and the most-significant bit of the message type byte must be 1.
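
A sketch of laying out such a buffer is shown below. ``mctp_msg_build()`` is a
hypothetical helper, not an existing API. It assumes that, when an integrity
check is requested, the caller has already appended the check bytes to the
payload:

.. code-block:: C

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    /*
     * Hypothetical helper: write the type byte followed by the payload.
     * When 'with_ic' is non-zero, 'payload' is expected to already end with
     * the integrity check bytes; this function only sets the IC bit.
     * Returns the total message length, or 0 if 'buf' is too small.
     */
    static size_t mctp_msg_build(uint8_t *buf, size_t buflen, uint8_t type,
                                 int with_ic, const void *payload, size_t len)
    {
            if (len >= buflen)
                    return 0;

            buf[0] = (uint8_t)((type & 0x7f) | (with_ic ? 0x80 : 0));
            memcpy(buf + 1, payload, len);
            return len + 1;
    }

The returned length is the value to pass as the ``len`` argument of
``sendto()``.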

The ``sendmsg()`` system call allows a more compact argument interface, and the
message buffer to be specified as a scatter-gather list. At present no ancillary
message types (used for the ``msg_control`` data passed to ``sendmsg()``) are
defined.
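
As an illustration of the scatter-gather form, the sketch below sends the type
byte and the payload from two separate buffers. ``mctp_send_iov()`` is a
hypothetical helper; it assumes ``struct sockaddr_mctp`` is available from
``<linux/mctp.h>`` and that ``dst`` has already been filled in by the caller:

.. code-block:: C

    #include <stdint.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/types.h>
    #include <sys/uio.h>
    #include <linux/mctp.h>

    /*
     * Hypothetical helper: send a type byte and payload as two iovec entries.
     * Returns the sendmsg() result: bytes sent, or -1 with errno set.
     */
    static ssize_t mctp_send_iov(int sd, const struct sockaddr_mctp *dst,
                                 uint8_t type, const void *payload, size_t len)
    {
            struct iovec iov[2] = {
                    { .iov_base = &type, .iov_len = 1 },
                    { .iov_base = (void *)payload, .iov_len = len },
            };
            struct msghdr msg;

            memset(&msg, 0, sizeof(msg));
            msg.msg_name = (void *)dst;
            msg.msg_namelen = sizeof(*dst);
            msg.msg_iov = iov;
            msg.msg_iovlen = 2;
            /* no ancillary data is defined, so msg_control stays NULL */

            return sendmsg(sd, &msg, 0);
    }

Keeping the type byte in its own ``iovec`` entry avoids copying the payload
into a second buffer just to prepend one byte.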

Transmitting a message on an unconnected socket with ``MCTP_TAG_OWNER``
specified will cause an allocation of a tag, if no valid tag is already
allocated for that destination. The (destination-eid,tag) tuple acts as an
implicit local socket address, to allow the socket to receive responses to this
outgoing message. If any previous allocation has been performed (for a
different remote EID), that allocation is lost.

Sockets will only receive responses to requests they have sent (with TO=1) and
may only respond (with TO=0) to requests they have received.

``recvfrom()``, ``recvmsg()``, ``recv()`` : receive an MCTP message
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

An MCTP message can be received by an application using one of the
``recvfrom()``, ``recvmsg()``, or ``recv()`` system calls. Using ``recvfrom()``
as the primary example:

.. code-block:: C

    struct sockaddr_mctp addr;
    socklen_t addrlen;
    char buf[14];
    ssize_t len;

    addrlen = sizeof(addr);

    len = recvfrom(sd, buf, sizeof(buf), 0,
                    (struct sockaddr *)&addr, &addrlen);

    /* We can expect addr to describe an MCTP address */
    assert(addrlen >= sizeof(addr));
    assert(addr.smctp_family == AF_MCTP);

    printf("received %zd bytes from remote EID %d\n", len,
           addr.smctp_addr.s_addr);

The address argument to ``recvfrom`` and ``recvmsg`` is populated with the
remote address of the incoming message, including tag value (this will be needed
in order to reply to the message).

The first byte of the message buffer will contain the message type byte. If an
integrity check follows the message, it will be included in the received buffer.

The ``recv()`` system call behaves in a similar way, but does not provide a
remote address to the application. Therefore, it is only useful if the
remote address is already known, or the message does not require a reply.

As with the send calls, sockets will only receive responses to requests they
have sent (TO=1) and may only respond (TO=0) to requests they have received.
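
Replying to a received request therefore means reusing the address returned by
``recvfrom()``, with the same tag value but with the tag-owner bit cleared. A
minimal sketch, using a hypothetical helper ``mctp_reply_addr()`` and assuming
``MCTP_TAG_OWNER`` is available from ``<linux/mctp.h>``:

.. code-block:: C

    #include <linux/mctp.h>

    /*
     * Hypothetical helper: derive the destination address for a response
     * from the address that recvfrom() filled in for the request. The tag
     * value is kept; the owner bit is cleared (TO=0) because the remote
     * side owns the tag.
     */
    static struct sockaddr_mctp mctp_reply_addr(const struct sockaddr_mctp *from)
    {
            struct sockaddr_mctp to = *from;

            to.smctp_tag &= ~MCTP_TAG_OWNER;
            return to;
    }

The result can then be passed to ``sendto()`` in the same way as in the
transmit example, cast to ``struct sockaddr *``.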

Kernel internals
================

There are a few possible packet flows in the MCTP stack:

1. local TX to remote endpoint, message <= MTU::

	sendmsg()
	 -> mctp_local_output()
	    : route lookup
	    -> rt->output() (== mctp_route_output)
	       -> dev_queue_xmit()

2. local TX to remote endpoint, message > MTU::

	sendmsg()
	 -> mctp_local_output()
	    -> mctp_do_fragment_route()
	       : creates packet-sized skbs. For each new skb:
	       -> rt->output() (== mctp_route_output)
	          -> dev_queue_xmit()

3. remote TX to local endpoint, single-packet message::

	mctp_pkttype_receive()
	: route lookup
	-> rt->output() (== mctp_route_input)
	   : sk_key lookup
	   -> sock_queue_rcv_skb()

4. remote TX to local endpoint, multiple-packet message::

	mctp_pkttype_receive()
	: route lookup
	-> rt->output() (== mctp_route_input)
	   : sk_key lookup
	   : stores skb in struct sk_key->reasm_head

	mctp_pkttype_receive()
	: route lookup
	-> rt->output() (== mctp_route_input)
	   : sk_key lookup
	   : finds existing reassembly in sk_key->reasm_head
	   : appends new fragment
	   -> sock_queue_rcv_skb()
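
The fragmentation and reassembly flows (2 and 4) can be modelled in userspace
C. This is a simplified sketch, not the kernel implementation: the kernel
chains skbs rather than copying into a flat buffer. It assumes a 4-byte MCTP
transport header that is counted in the MTU, and the start-of-message,
end-of-message and 2-bit sequence fields from DSP0236. All names here
(``HDR_LEN``, ``packet_count()``, ``struct reasm``, ``reasm_add()``) are
hypothetical:

.. code-block:: C

    #include <assert.h>
    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    #define HDR_LEN 4       /* assumed MCTP transport header size, in bytes */

    /* Packets needed for 'msg_len' bytes, assuming 'mtu' includes the header. */
    static size_t packet_count(size_t msg_len, size_t mtu)
    {
            size_t per_packet;

            assert(mtu > HDR_LEN);
            per_packet = mtu - HDR_LEN;
            return (msg_len + per_packet - 1) / per_packet;
    }

    /* Hypothetical reassembly state for one message context. */
    struct reasm {
            uint8_t data[256];
            size_t len;
            uint8_t next_seq;       /* expected 2-bit packet sequence number */
            bool active;
    };

    /*
     * Feed one fragment. Returns 1 when the message is complete, 0 when more
     * fragments are expected, and -1 if the fragment is dropped.
     */
    static int reasm_add(struct reasm *r, bool som, bool eom, uint8_t seq,
                         const void *payload, size_t len)
    {
            if (som) {
                    /* start of message: begin a new reassembly */
                    r->len = 0;
                    r->active = true;
            } else if (!r->active || seq != r->next_seq) {
                    /* nothing in progress, or an out-of-sequence packet */
                    r->active = false;
                    return -1;
            }

            if (len > sizeof(r->data) - r->len) {
                    /* message would overflow the buffer: abandon it */
                    r->active = false;
                    return -1;
            }

            memcpy(r->data + r->len, payload, len);
            r->len += len;
            r->next_seq = (seq + 1) & 0x3;

            if (eom) {
                    /* end of message: deliver to the socket */
                    r->active = false;
                    return 1;
            }
            return 0;
    }

Under these assumptions, a 65-byte message on a link with an MTU of 68 needs
two packets, and the receiver completes the message once the fragment with the
end-of-message flag arrives in sequence.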

Key refcounts
-------------

 * keys are referenced by:

   - a skb: during route output, stored in ``skb->cb``.

   - netns and sock lists.

 * keys can be associated with a device, in which case they hold a
   reference to the dev (set through ``key->dev``, counted through
   ``dev->key_count``). Multiple keys can reference the device.