# phosphor-audit

Author:
  Ivan Mikhaylov, [i.mikhaylov@yadro.com](mailto:i.mikhaylov@yadro.com)

Primary assignee:
  Ivan Mikhaylov, [i.mikhaylov@yadro.com](mailto:i.mikhaylov@yadro.com)

Other contributors:
  Alexander Amelkin, [a.amelkin@yadro.com](mailto:a.amelkin@yadro.com)
  Alexander Filippov, [a.filippov@yadro.com](mailto:a.filippov@yadro.com)

Created:
  2019-07-23

## Problem Description

End users of OpenBMC may take actions that change the system state and/or
configuration. Such actions may be taken using any of the numerous interfaces
provided by OpenBMC. That includes Redfish, IPMI, SSH or serial console shell,
and other interfaces, including future ones.

The consequences of those actions may sometimes be harmful, and an
investigation may be conducted to find the person responsible for the unwelcome
changes. Currently, most changes leave no trace in OpenBMC logs, which hampers
such an investigation.

A mechanism is required that allows for tracking such user activity, logging
it, and taking certain actions if necessary.

## Background and References

YADRO had an internal solution for the problem. It was only applicable to an
outdated version of OpenBMC and needed a redesign. There was also a parallel
effort by IBM that can be found here:
[REST and Redfish Traffic Logging](https://gerrit.openbmc-project.xyz/c/openbmc/bmcweb/+/22699)

## Assumptions

This design assumes that an end user is never given direct access to the
system shell. The shell allows for direct manipulation of the user database
(add/remove users, change passwords) and system configuration (network scripts,
etc.), and it doesn't seem feasible to track such user actions taken within the
shell. This design assumes that all user interaction with OpenBMC is limited to
controlled interfaces served by other Phosphor OpenBMC components interacting
via D-Bus.

## Requirements

 * Provide a unified method of logging user actions independent of the user
   interface, where possible user actions are:
   * Redfish/REST PUT/POST/DELETE/PATCH
   * IPMI
   * PAM
   * PLDM
   * Any other suitable service
 * Provide a way to configure system response actions taken upon certain user
   actions, where possible response actions are:
   * Log an event
   * Notify an administrator or an arbitrary notification receiver
   * Run an arbitrary command
 * Provide a way to configure notification receivers:
   * E-mail
   * SNMP
   * Instant messengers
   * D-Bus

## Proposed Design

The main idea is to catch D-Bus requests sent by user interfaces, then handle
each request according to the configuration. In the future, support for
flexible policies may be implemented to allow finer control over handling and
tracking.

The phosphor-audit service tracks user activity and takes the configured
actions in response to user actions.

The key benefit of using phosphor-audit is that all action handling is kept
inside this project instead of being spread across multiple dedicated interface
services, with the risk of missing a handler for some action in one of them and
bloating the codebase.

The component diagram below gives an overview of the service.

```ascii
  +----------------+  audit event                           +-----------------+
  |    IPMI NET    +-----------+                            | action          |
  +----------------+           |                            | +-------------+ |
                               |                            | |   logging   | |
  +----------------+           |                            | +-------------+ |
  |   IPMI HOST    +-----------+      +--------------+      |                 |
  +----------------+           |      |    audit     |      | +-------------+ |
                               +----->+   service    +----->| |   command   | |
  +----------------+           |      |              |      | +-------------+ |
  |  Redfish/REST  +-----------+      +--------------+      |                 |
  +----------------+           |                            | +-------------+ |
                               |                            | |   notify    | |
  +----------------+           |                            | +-------------+ |
  |  any service   +-----------+                            |                 |
  +----------------+                                        | +-------------+ |
                                                            | |     ...     | |
                                                            | +-------------+ |
                                                            +-----------------+
```

The audit event in the diagram is generated by an application to track user
activity. The application sends a signal to the audit service via D-Bus. What
happens next in the audit service's handler depends on user requirements. The
handler can store logs, run an arbitrary command, notify someone, or do all of
the above, and each of these actions is optional.

**Audit event call**

The audit event call preprocesses incoming data on the application side before
sending it to the audit service. If the request is filtered out, it is dropped
at this point and is not processed any further. After the filter check, the
audit event call sends the data through D-Bus to the audit service, which
decides on the next steps. The call also caches the list of possible commands
(blacklist or whitelist) and the status of the service (disabled or enabled).
If the service is in an undefined state, the call checks whether the service is
alive.

 > `audit_event(type, rc, request, user, host, data)`
 > *  type - type of event source: IPMI, REST, PAM, etc.
 > *  rc   - return code of the event handler (status, rc, etc.)
 > *  request - a generalized identifier of the event, e.g. IPMI command
 >     (cmd/netfn/lun), web path, or anything else that can describe the event.
 > *  user - the user account on behalf of which the event was processed.
 >     Depends on context; NA/None if no user is available.
 > *  host - identifier of the host that the event has originated from. This can
 >     be literally "host" for events originating from the local host (via locally
 >     connected IPMI), or an IP address or a hostname of a remote host.
 > *  data - any supplementary data that can help better identify the event
 >     (e.g., the first few bytes of the IPMI command data).

The service itself can control the flow of events through its own configuration.

Example pseudocode:

    audit_event(NET_IPMI, "access denied"(rc=-1), "ipmi cmd", "qwerty223",
                          "192.168.0.1", <some additional data if needed>)
    audit_event(REST, "login successful"(rc=200), "rest login",
                      "qwerty223", "192.168.0.1", NULL)
    audit_event(HOST_IPMI, "shutting down the host"(rc=0), "host poweroff",
                       NULL, NULL, NULL)

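`audit_event(blob_data)`

The blob can be described as a structure:

    #include <netinet/in.h> /* struct sockaddr_in6 */
    #include <stdint.h>
    #include <sys/uio.h>    /* struct iovec */

    struct blob_audit
    {
        uint8_t type;              /* type of event source */
        int32_t rc;                /* return code of the event handler */
        uint32_t request_id;       /* generalized identifier of the event */
        char *user;                /* user account the event was processed for */
        struct sockaddr_in6 *addr; /* address the event originated from */
        struct iovec *data;        /* supplementary data */
    };
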
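As an illustration, the REST login example from the pseudocode above could be packed into such a blob as follows. This is only a sketch under assumptions: the `EventType` values, the `REQ_REST_LOGIN` identifier, and the `fillRestLogin` helper are hypothetical, and storing the IPv4 peer as an IPv4-mapped IPv6 address is just one possible convention that the design does not prescribe.

```cpp
#include <arpa/inet.h>  // inet_pton
#include <netinet/in.h> // sockaddr_in6, in_addr
#include <sys/uio.h>    // iovec

#include <cstdint>
#include <cstring>

// Mirrors the blob structure described above.
struct blob_audit
{
    uint8_t type;
    int32_t rc;
    uint32_t request_id;
    char* user;
    sockaddr_in6* addr;
    iovec* data;
};

// Hypothetical event source types; the design does not fix the values.
enum EventType : uint8_t
{
    NET_IPMI = 0,
    HOST_IPMI = 1,
    REST = 2,
};

// Hypothetical request identifier standing for "rest login".
constexpr uint32_t REQ_REST_LOGIN = 1;

// Fills `blob` for a successful REST login of `user` coming from the IPv4
// address `ipv4`. The caller owns `user` and `addr`; the blob only points
// at them. Returns false if `ipv4` cannot be parsed.
inline bool fillRestLogin(blob_audit& blob, char* user, const char* ipv4,
                          sockaddr_in6& addr)
{
    in_addr in4{};
    if (inet_pton(AF_INET, ipv4, &in4) != 1)
    {
        return false;
    }

    // Assumption: store the peer as an IPv4-mapped address (::ffff:a.b.c.d).
    std::memset(&addr, 0, sizeof(addr));
    addr.sin6_family = AF_INET6;
    addr.sin6_addr.s6_addr[10] = 0xff;
    addr.sin6_addr.s6_addr[11] = 0xff;
    std::memcpy(&addr.sin6_addr.s6_addr[12], &in4.s_addr, sizeof(in4.s_addr));

    blob.type = REST;
    blob.rc = 200; // "login successful" in the pseudocode above
    blob.request_id = REQ_REST_LOGIN;
    blob.user = user;
    blob.addr = &addr;
    blob.data = nullptr; // no supplementary data
    return true;
}
```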
When the call reaches the server via D-Bus, the server already knows that the
call should be processed through the predefined list of actions set in the
server configuration.

Step-by-step execution of the call:
 * client's layer
    1. checks whether audit is enabled for the service
    2. checks whether the audit event is whitelisted or blacklisted in the
       audit service configuration, to avoid flooding the audit service with
       unneeded events
    3. sends the data to the audit service via D-Bus
 * server's layer
    1. accepts the D-Bus request
    2. goes through the list of actions configured for the service

How the checks are processed at the client's layer:
 1. check the status of the service and cache that value
 2. check the list of possible actions which should be logged and cache it as
    well
 3. listen for the 'PropertiesChanged' signal in case the list or the status
    of the service changes

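The client-side checks above can be sketched as a small cache in front of the D-Bus call. This is a minimal sketch, not the actual implementation: the `ClientFilter` class, its method names, and the `send` callback that stands in for the real D-Bus call are all hypothetical. It also assumes that a whitelist passes only listed requests and a blacklist drops only listed requests.

```cpp
#include <functional>
#include <set>
#include <string>
#include <utility>

// Hypothetical filter modes matching the whitelist/blacklist wording above.
enum class FilterMode
{
    Whitelist, // only listed requests are sent
    Blacklist, // listed requests are dropped
};

// Hypothetical client-side cache of the audit service state.
class ClientFilter
{
  public:
    // `send` stands in for the D-Bus call to the audit service.
    explicit ClientFilter(std::function<void(const std::string&)> send) :
        send_(std::move(send))
    {}

    // Would be called from a 'PropertiesChanged' handler to refresh the cache.
    void update(bool enabled, FilterMode mode, std::set<std::string> requests)
    {
        enabled_ = enabled;
        mode_ = mode;
        requests_ = std::move(requests);
    }

    // Returns true if the event passed the checks and was handed to `send`.
    bool auditEvent(const std::string& request)
    {
        if (!enabled_)
        {
            return false; // step 1: audit is disabled for this service
        }
        const bool listed = requests_.count(request) != 0;
        const bool pass = (mode_ == FilterMode::Whitelist) ? listed : !listed;
        if (!pass)
        {
            return false; // step 2: filtered out, dropped at the client
        }
        send_(request); // step 3: forward to the audit service
        return true;
    }

  private:
    std::function<void(const std::string&)> send_;
    bool enabled_ = false;
    FilterMode mode_ = FilterMode::Blacklist;
    std::set<std::string> requests_;
};
```

In this sketch a service would call `update()` once at startup and again on every 'PropertiesChanged' signal, so `auditEvent()` never has to query the audit service while handling a request.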
## Service configuration

The configuration can be described as a tree of options. An example of the
structure:

```
[IPMI]
   [Enabled]
   [Whitelist]
     [Cmd 0x01] ["reset request"]
     [Cmd 0x02] ["hello world"]
     [Cmd 0x03] ["goodbye cruel world"]
   [Actions]
     [Notify type1] [Recipient]
     [Notify type2] [Recipient]
     [Notify type3] [Recipient]
     [Logging type] [Options]
     [Exec] [ExternalCommand]
[REST]
   [Disabled]
   [Blacklist]
     [Path1] [Options]
     [Path2] [Options]
   [Actions]
     [Notify type2] [Recipient]
     [Logging type] [Options]
```

Options can be updated via D-Bus properties. The audit service watches the
configuration file for changes and emits a 'PropertiesChanged' signal with the
changed details.

* Whitelisting and blacklisting

 > A list of requests to be filtered.
 > A 'Whitelist' allows only the listed requests to be processed.
 > A 'Blacklist' blocks only the listed requests.

* Enabling/disabling event processing for a given service, where the service
  is any suitable service that can use the audit service.

 > Each audit processing type can be disabled or enabled at runtime via
 > the config file or a D-Bus property.

* Notification setup via SNMP/E-mail/Instant messengers/D-Bus

 > The notification system for the end recipient, with different transports.

* Logging

 > phosphor-logging, journald, or any other suitable logging facility.

* User actions

 > Running a command as a consequent action.

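To make the example tree above concrete, it can be modeled with a few plain data types. This is only a sketch: the type and field names (`ServiceConfig`, `FilterMode`, `Action`, `AuditConfig`) and the `exampleConfig()` helper are hypothetical, and the recipient, option, and command strings are the placeholders from the example tree rather than real values.

```cpp
#include <map>
#include <string>
#include <vector>

enum class FilterMode
{
    Whitelist,
    Blacklist,
};

// One entry under [Actions]: its kind ("Notify type1", "Logging type",
// "Exec") plus its argument (recipient, options, or external command).
struct Action
{
    std::string kind;
    std::string argument;
};

// Configuration of one event source such as [IPMI] or [REST].
struct ServiceConfig
{
    bool enabled = false;
    FilterMode mode = FilterMode::Blacklist;
    std::map<std::string, std::string> filter; // request -> description/options
    std::vector<Action> actions;
};

using AuditConfig = std::map<std::string, ServiceConfig>;

// Builds the example tree shown above.
inline AuditConfig exampleConfig()
{
    AuditConfig cfg;

    ServiceConfig ipmi;
    ipmi.enabled = true;
    ipmi.mode = FilterMode::Whitelist;
    ipmi.filter = {{"Cmd 0x01", "reset request"},
                   {"Cmd 0x02", "hello world"},
                   {"Cmd 0x03", "goodbye cruel world"}};
    ipmi.actions = {{"Notify type1", "Recipient"},
                    {"Notify type2", "Recipient"},
                    {"Notify type3", "Recipient"},
                    {"Logging type", "Options"},
                    {"Exec", "ExternalCommand"}};
    cfg["IPMI"] = ipmi;

    ServiceConfig rest;
    rest.enabled = false;
    rest.mode = FilterMode::Blacklist;
    rest.filter = {{"Path1", "Options"}, {"Path2", "Options"}};
    rest.actions = {{"Notify type2", "Recipient"}, {"Logging type", "Options"}};
    cfg["REST"] = rest;

    return cfg;
}
```

Keeping the per-service settings in one value type like this would let the same structure back both the configuration file and the D-Bus properties, although the design does not commit to any particular representation.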
## Workflow

An example of a possible flow:

```ascii
           +----------------+
           |   NET   IPMI   |
           |    REQUEST     |
           +----------------+
                   |
 +--------------------------------------------------------------------------+
 |         +-------v--------+                                         IPMI  |
 |         |    NET IPMI    |                                               |
 |         +----------------+                                               |
 |                 |                                                        |
 |         +-------v--------+        +---------------------------+          |
 |         | rc = handle()  +------->|  audit_event<NET_IPMI>()  |          |
 |         +----------------+        +---------------------------+          |
 |                 |                              |                         |
 |                 |                              |                         |
 |         +-------v--------+                     |                         |
 |         |   Processing   |                     |                         |
 |         |    further     |                     |                         |
 |         +----------------+                     |                         |
 +--------------------------------------------------------------------------+
                                                  |
                                                  |
 +--------------------------------------------------------------------------+
 |                  +-----------------------------+                         |
 |                  |                                        Audit Service  |
 |                  |                                                       |
 |                  |                                                       |
 |                  |                                                       |
 |            +-----v------+                                                |
 |        NO  | Is logging |        YES                                     |
 |     +------+  enabled   +--------------------+                           |
 |     |      | for  type? |                    |                           |
 |     |      +------------+            +-------v-----+                     |
 |     |                           NO   | Is request  |   YES               |
 |     |                       +--------+    type     +--------+            |
 |     |                       |        |  filtered?  |        |            |
 |     |                       |        +-------------+        |            |
 |     |                       |                               |            |
 |     |               +-------v-------+                       |            |
 |     |               |    Notify     |                       |            |
 |     |               | Administrator |                       |            |
 |     |               +---------------+                       |            |
 |     |                       |                               |            |
 |     |               +-------v-------+                       |            |
 |     |               |   Log Event   |                       |            |
 |     |               +---------------+                       |            |
 |     |                       |                               |            |
 |     |               +-------v-------+                       |            |
 |     |               |     User      |                       |            |
 |     |               |    actions    |                       |            |
 |     |               +---------------+                       |            |
 |     |                       |                               |            |
 |     |               +-------v-------+                       |            |
 |     +-------------->|      End      |<----------------------+            |
 |                     +---------------+                                    |
 |                                                                          |
 +--------------------------------------------------------------------------+
```

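The decision flow in the Audit Service part of the diagram can be sketched as a single function. This is a minimal sketch under assumptions: `handleEvent`, `Handlers`, and the two boolean inputs are hypothetical names, and the handlers are plain callbacks standing in for the real notification, logging, and user-action code.

```cpp
#include <functional>
#include <initializer_list>

// Hypothetical set of response actions; any of them may be left empty.
struct Handlers
{
    std::function<void()> notifyAdministrator;
    std::function<void()> logEvent;
    std::function<void()> userActions;
};

// Follows the diagram: stop if logging is disabled for the event type or if
// the request type is filtered; otherwise notify, log, and run user actions
// in that order. Returns the number of handlers that were run.
inline int handleEvent(bool loggingEnabledForType, bool requestFiltered,
                       const Handlers& handlers)
{
    if (!loggingEnabledForType || requestFiltered)
    {
        return 0;
    }

    int executed = 0;
    for (const auto* handler : {&handlers.notifyAdministrator,
                                &handlers.logEvent, &handlers.userActions})
    {
        if (*handler) // optional actions may be unset
        {
            (*handler)();
            ++executed;
        }
    }
    return executed;
}
```

The two early exits correspond to the "NO" branch of "Is logging enabled for type?" and the "YES" branch of "Is request type filtered?", both of which lead straight to "End" in the diagram.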
## Notification mechanisms

A unified model for reporting incidents to the end user, where the transport
can be:

* E-mail

  > Send a message to the recipient set in the configuration via sendmail or
  > a similar tool.

* SNMP

  > Send a notification via SNMP trap messages to the recipient set in the
  > configuration.

* Instant messengers

  > Send a notification to the recipient set in the configuration via
  > jabber/sametime/gtalk/etc.

* D-Bus

  > Notify another service set in the configuration via 'method_call' or
  > 'signal'.

Notifications are skipped if none of the above rules is set in the
configuration. Rules can be picked up at runtime.

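One way to keep the transports interchangeable is a common interface that the service iterates over. The sketch below is illustrative only: `Transport`, `RecordingTransport`, `NotifyRule`, and `notifyAll` are hypothetical names, and no real e-mail, SNMP, messenger, or D-Bus code is shown. It only demonstrates the dispatch and the rule that notifications are skipped when nothing is configured.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical common interface for all notification transports.
class Transport
{
  public:
    virtual ~Transport() = default;
    virtual void send(const std::string& recipient,
                      const std::string& message) = 0;
};

// Stand-in transport that only records what it was asked to send.
class RecordingTransport : public Transport
{
  public:
    void send(const std::string& recipient,
              const std::string& message) override
    {
        sent.push_back(recipient + ": " + message);
    }

    std::vector<std::string> sent;
};

// One configured rule: which transport to use and whom to notify.
struct NotifyRule
{
    std::shared_ptr<Transport> transport;
    std::string recipient;
};

// Sends `message` through every rule that has a transport. Returns the
// number of notifications sent; zero means nothing was configured and the
// notification step was skipped.
inline std::size_t notifyAll(const std::vector<NotifyRule>& rules,
                             const std::string& message)
{
    std::size_t count = 0;
    for (const auto& rule : rules)
    {
        if (rule.transport)
        {
            rule.transport->send(rule.recipient, message);
            ++count;
        }
    }
    return count;
}
```

Because the rule list is passed in on each call, a list rebuilt after a configuration change would take effect on the next event, which is one possible reading of picking up rules at runtime.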
## User Actions

 * Execute an application via the 'system' call.
 * Code for a specific event type inside the handler itself.
   For example, for 'net ipmi', on an unsuccessful user login the handler:
   * Sends a notification to the administrator.
   * Runs `echo heartbeat > /sys/class/leds/alarm_red/trigger`.

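Running an external command through the 'system' call could look roughly like this. The `runCommand` helper is a hypothetical name, and the sketch assumes a POSIX system where a shell is available. It does not sanitize the command string, which a real implementation would need to consider.

```cpp
#include <sys/wait.h> // WIFEXITED, WEXITSTATUS

#include <cstdlib>
#include <string>

// Runs `command` through the shell via std::system().
// Returns the command's exit status, or -1 if the command could not be
// started or did not terminate normally.
inline int runCommand(const std::string& command)
{
    const int status = std::system(command.c_str());
    if (status == -1 || !WIFEXITED(status))
    {
        return -1;
    }
    return WEXITSTATUS(status);
}
```

With such a helper, the LED example above would become `runCommand("echo heartbeat > /sys/class/leds/alarm_red/trigger")`.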
## Alternatives Considered

Processing user requests in each dedicated interface service and logging
them separately for each of the interfaces. Scattered handling looks like
an error-prone and rigid approach.

## Impacts

Improves system manageability and security.

Impacts when phosphor-audit is not enabled:
 - Many services will have slightly larger code size and longer CPU path length
   due to invocations of audit_event().
 - Increased D-Bus traffic.

Impacts when phosphor-audit is enabled:
All of the above, plus:
 - Additional BMC processor time needed to handle audit events.
 - Additional BMC flash storage needed to store logged events.
 - Additional outbound network traffic to notify users.
 - Additional space for notification libraries.

## Testing

`dbus-send` can be used as a command-line tool for generating audit events.

Scenarios:
 - For each supported service (such as Redfish, net IPMI, host IPMI, PLDM),
   create audit events and validate that they get logged.
 - Ensure message-type and request-type filtering works as expected.
 - Ensure basic notification actions work as expected (log, command, notify).
 - While continuously generating audit events, change the phosphor-audit
   service's configuration, and validate that no audit events are lost and the
   new configuration takes effect.
365