# NC-SI core dump

Author: DelphineCCChiu (<Delphine_CC_Chiu@wiwynn.com>)

Created: 03/12/2024

## Problem Description

A NIC core dump is essential for debugging NIC issues. It can be retrieved
through either an in-band or an out-of-band method. This design provides a
solution for retrieving the NIC core dump out of band, from the BMC, over the
NC-SI protocol.

## Background and References

NC-SI command for dump retrieval, NC-SI spec v1.2, section 8.4.114:
<https://www.dmtf.org/sites/default/files/standards/documents/DSP0222_1.2.0.pdf>

NC-SI over MCTP:
<https://www.dmtf.org/sites/default/files/standards/documents/DSP0261_1.3.0.pdf>

## Requirements

This feature requires the net/ncsi module of the Linux kernel to support
transferring the new NC-SI command (0x4D):
<https://github.com/torvalds/linux/commits/master/net/ncsi>

## Proposed Design

### Interface

This design reuses the existing phosphor-debug-collector module
(<https://github.com/openbmc/phosphor-debug-collector>) and extends the dump
creation interface with a new "NC-SICoreDump" dump type.

The D-Bus interface for dump creation is the "xyz.openbmc_project.Dump.Create"
interface on the object path "/xyz/openbmc_project/dump/bmc" of the service
"xyz.openbmc_project.Dump.Manager".

To indicate which NC-SI link to target, the CreateDump method needs one
additional input parameter: "NICTarget". An EID or a network interface name,
such as eth0, is a valid value.

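The design does not specify how a "NICTarget" value is told apart as an EID or
an interface name. As a minimal sketch, assuming a purely numeric value is meant
as an MCTP EID and anything else as a network interface name, the distinction
could look like this. The function name `classify_nic_target`, the "mctp"/"rbt"
labels, and the validation rules are hypothetical and not part of
phosphor-debug-collector.

```python
import re
from typing import Tuple, Union


def classify_nic_target(value: str) -> Tuple[str, Union[int, str]]:
    """Classify a NICTarget string (hypothetical helper, not an OpenBMC API).

    Returns ("mctp", eid) for a numeric value and ("rbt", ifname) otherwise.
    """
    text = value.strip()
    if not text:
        raise ValueError("NICTarget must not be empty")

    # Assumption: a purely numeric value is an MCTP endpoint ID.
    if text.isascii() and text.isdigit():
        eid = int(text)
        # An MCTP EID is one byte wide, so reject anything larger.
        if eid > 255:
            raise ValueError(f"EID out of range: {eid}")
        return ("mctp", eid)

    # Assumption: everything else is a Linux interface name, which is at
    # most 15 characters long and contains no whitespace or '/'.
    if len(text) > 15 or re.search(r"[\s/]", text):
        raise ValueError(f"invalid interface name: {text!r}")
    return ("rbt", text)
```

With these assumptions, `classify_nic_target("eth0")` yields `("rbt", "eth0")`
and `classify_nic_target("9")` yields `("mctp", 9)`.
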
### Dump Retrieval

The dump is retrieved with the standard NC-SI command Retrieve Data From NC
(0x4D), sent over either NC-SI over RBT or NC-SI over MCTP. The whole NC-SI
dump procedure will be implemented in the ncsi-netlink and ncsi-mctp utilities
in phosphor-networkd:
<https://github.com/openbmc/phosphor-networkd/blob/master/src/ncsi_netlink_main.cpp>

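A large dump will likely not fit into a single response, so the utilities
presumably issue the command repeatedly and concatenate the returned data. The
sketch below shows only that generic assembly loop. How DSP0222 encodes the
fragments and marks the final one is not reproduced here: the `fetch_chunk`
callable, its `(data, is_last)` return shape, and the `max_chunks` guard are
assumptions made for illustration, not the ncsi-netlink or ncsi-mctp
implementation.

```python
from typing import Callable, Tuple

# Hypothetical transport hook: given a fragment index, send one
# Retrieve Data From NC (0x4D) request and return (payload, is_last).
FetchChunk = Callable[[int], Tuple[bytes, bool]]


def retrieve_dump(fetch_chunk: FetchChunk, max_chunks: int = 65536) -> bytes:
    """Collect fragments in order until the transport reports the last one."""
    dump = bytearray()
    for index in range(max_chunks):
        data, is_last = fetch_chunk(index)
        dump.extend(data)
        if is_last:
            return bytes(dump)
    # Guard against a NIC that never signals the end of the dump.
    raise RuntimeError("dump did not terminate within max_chunks fragments")


def fake_transport(index: int) -> Tuple[bytes, bool]:
    """Stand-in for a real NC-SI transport: three one-byte fragments."""
    return bytes([index]), index == 2


if __name__ == "__main__":
    print(retrieve_dump(fake_transport).hex())
```

Keeping the transport behind a callable lets the same loop serve both the RBT
and the MCTP path in this sketch.
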
### Integrate with phosphor-debug-collector

Since phosphor-debug-collector uses shell scripts for data collection, a new
collector script named "ncsicoredump" will be added. Depending on the
NICTarget, this script calls ncsi-netlink or ncsi-mctp and generates the dump
file under a specific folder.

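The dispatch described above can be sketched as follows. The actual collector
is a shell script; Python is used here only to illustrate the decision. The
function name `plan_collection`, the "rbt"/"mctp" labels, and the
`ncsi-coredump-<target>.bin` file name are hypothetical, and the command-line
arguments of the two utilities are deliberately left out because the design
does not define them.

```python
from pathlib import PurePosixPath
from typing import Tuple

# Utility names come from the design; the keys are illustrative labels.
TOOLS = {"rbt": "ncsi-netlink", "mctp": "ncsi-mctp"}


def plan_collection(kind: str, target: str, dump_dir: str) -> Tuple[str, str]:
    """Pick the utility for a target kind and name the dump file.

    Returns (utility_name, output_path). The file name pattern is an
    assumption made for this sketch.
    """
    if kind not in TOOLS:
        raise ValueError(f"unsupported NICTarget kind: {kind!r}")
    output = PurePosixPath(dump_dir) / f"ncsi-coredump-{target}.bin"
    return TOOLS[kind], str(output)


if __name__ == "__main__":
    # Example: an interface target collected over RBT via netlink.
    print(plan_collection("rbt", "eth0", "/tmp/dump"))
```
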
The following block diagram illustrates the entire dump procedure and the
relationship between modules:

```text

                           +----------------+           +-----------+
                           |                |           |           |
              ------------->  Dump manager  +-----------> DumpEntry |
               CreateDump  |                |           |           |
                           +--------+-------+           +-----------+
                                    |
                                    |
                                    |
                           +--------v-------+
                           |                |
                           |    Dreport     |
                           |                |
                           +--------+-------+
                                    |
                                    |
                                    |
                           +--------v-------+
                           |                |
                           |  Plugin:       +------------------+
                           |  ncsicoredump  |                  |
                           |                |                  |
                           +--------+-------+                  |
                                    |                          |
                                    |                          |
                                    |                          |
                                    |                          |
      +------------+       +--------v-------+          +-------v------+        +------------+
      |            |       |                |          |              |        |            |
      |  DumpFile  <-------+  NCSI-NetLink  |          |  NCSI-MCTP   +-------->  DumpFile  |
      |            |       |                |          |              |        |            |
      +------------+       +--------^-------+          +-------^------+        +------------+
                                    |                          |
    --------------------------------+--------------------------+-----------------------------
        Kernel                      |Netlink                   |MCTP
                           +--------v-------+         +--------v-------+
                           |                |         |                |
                           |Net/NC-SI module|         |   I2C driver   |
                           |                |         |                |
                           +--------^-------+         +--------^-------+
                                    |                          |
                                    |NC-SI                     |SMBUS
                                    |                          |
                           +--------v--------------------------v-------+
                           |                                           |
                           |                     NIC                   |
                           |                                           |
                           +-------------------------------------------+

```

## Alternatives Considered

Duplicate dump procedures shall be blocked, because the reception ordering of
the NC-SI command (0x4D) must be maintained. Since a core dump contains up to
2 crash dumps, only the core dump is supported for now; it is sufficient for
current usage.

## Impacts

None.

## Testing

Work with the NIC vendor (Broadcom) to validate the dump process and the dump
file.