# phosphor-audit

Author: Ivan Mikhaylov, [i.mikhaylov@yadro.com](mailto:i.mikhaylov@yadro.com)

Other contributors: Alexander Amelkin,
[a.amelkin@yadro.com](mailto:a.amelkin@yadro.com); Alexander Filippov,
[a.filippov@yadro.com](mailto:a.filippov@yadro.com)

Created: 2019-07-23

## Problem Description

End users of OpenBMC may take actions that change the system state and/or
configuration. Such actions may be taken using any of the numerous interfaces
provided by OpenBMC, including Redfish, IPMI, SSH or a serial console shell,
and other interfaces, including future ones.

The consequences of those actions may sometimes be harmful, and an
investigation may be conducted to identify the person responsible for the
unwelcome changes. Currently, most changes leave no trace in OpenBMC logs, which
hampers such an investigation.

A mechanism is required that tracks such user activity, logs it, and takes
certain actions if necessary.

## Background and References

YADRO had an internal solution for the problem. It was only applicable to an
outdated version of OpenBMC and needed a redesign. There was also a parallel
effort by IBM that can be found here:
[REST and Redfish Traffic Logging](https://gerrit.openbmc.org/c/openbmc/bmcweb/+/22699)

## Assumptions

This design assumes that an end user is never given direct access to the
system shell. The shell allows direct manipulation of the user database
(add/remove users, change passwords) and system configuration (network scripts,
etc.), and it does not seem feasible to track such user actions taken within the
shell. This design assumes that all user interaction with OpenBMC is limited to
controlled interfaces served by other Phosphor OpenBMC components interacting
via D-Bus.

## Requirements

- Provide a unified method of logging user actions independent of the user
  interface, where possible user actions are:
  - Redfish/REST PUT/POST/DELETE/PATCH
  - IPMI
  - PAM
  - PLDM
  - Any other suitable service
- Provide a way to configure system response actions taken upon certain user
  actions, where possible response actions are:
  - Log an event
  - Notify an administrator or an arbitrary notification receiver
  - Run an arbitrary command
- Provide a way to configure notification receivers:
  - E-mail
  - SNMP
  - Instant messengers
  - D-Bus

## Proposed Design

The main idea is to catch D-Bus requests sent by user interfaces and then handle
each request according to the configuration. In the future, support for
policies may be added to allow more flexible handling and tracking.

The phosphor-audit service tracks user activity and takes the configured
actions in response to user actions.

The key benefit of using phosphor-audit is that all action handling is kept
inside this project instead of being spread across multiple dedicated interface
services, which risks missing a handler for some action in one of them and
bloating the codebase.

The component diagram below gives an example overview of the service.

```ascii
  +----------------+  audit event                           +-----------------+
  |    IPMI NET    +-----------+                            | action          |
  +----------------+           |                            | +-------------+ |
                               |                            | |   logging   | |
  +----------------+           |                            | +-------------+ |
  |   IPMI HOST    +-----------+      +--------------+      |                 |
  +----------------+           |      |    audit     |      | +-------------+ |
                               +----->+   service    +----->| |   command   | |
  +----------------+           |      |              |      | +-------------+ |
  |  Redfish/REST  +-----------+      +--------------+      |                 |
  +----------------+           |                            | +-------------+ |
                               |                            | |   notify    | |
  +----------------+           |                            | +-------------+ |
  |  any service   +-----------+                            |                 |
  +----------------+                                        | +-------------+ |
                                                            | |     ...     | |
                                                            | +-------------+ |
                                                            +-----------------+
```

The audit event in the diagram is generated by an application to track user
activity. The application sends a 'signal' to the audit service via D-Bus. What
happens next in the audit service's handler depends on user requirements. The
handler can store logs, run an arbitrary command, notify someone, or do all of
the above; each of these actions is optional.

**Audit event call**

The audit event call preprocesses incoming data on the application side before
sending it to the audit service. If the request is filtered out, it is dropped
at this point and is not processed further. After the filter check, the audit
event call sends the data through D-Bus to the audit service, which decides on
the next steps. The call also caches the list of possible commands (blacklist or
whitelist) and the status of its service (disabled or enabled). If the service
is in an undefined state, the call checks whether the service is alive.

> `audit_event(type, rc, request, user, source, data)`
>
> - type - type of event source: IPMI, REST, PAM, etc.
> - rc - return code of the event handler (status, rc, etc.)
> - request - a generalized identifier of the event, e.g. IPMI command
>   (cmd/netfn/lun), web path, or anything else that can describe the event.
> - user - the user account on behalf of which the event was processed. Depends
>   on context; NA/None if the user cannot be determined.
> - source - identifier of the host that the event originated from. This can
>   be literally "host" for events originating from the local host (via locally
>   connected IPMI), or an IP address or a hostname of a remote host.
> - data - any supplementary data that can help better identify the event (e.g.,
>   the first few bytes of the IPMI command data).

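The parameter list above can be pictured as a small record type. The following Python sketch is only an illustration of the fields and their optionality; the `EventType` enum, the `AuditEvent` class, and the `describe()` helper are hypothetical names that do not come from the design, and the real call is expected to travel over D-Bus rather than live in Python.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class EventType(Enum):
    """Event sources mentioned in the text; the enum itself is illustrative."""
    NET_IPMI = "net_ipmi"
    HOST_IPMI = "host_ipmi"
    REST = "rest"
    PAM = "pam"


@dataclass(frozen=True)
class AuditEvent:
    type: EventType                 # type of event source
    rc: int                         # return code of the event handler
    request: str                    # generalized identifier of the event
    user: Optional[str] = None      # None when the user cannot be determined
    source: Optional[str] = None    # "host", an IP address, or a hostname
    data: Optional[bytes] = None    # supplementary data, if any

    def describe(self) -> str:
        """Render a one-line summary, printing NA for missing user/source."""
        user = self.user if self.user is not None else "NA"
        source = self.source if self.source is not None else "NA"
        return (f"{self.type.name} request={self.request!r} rc={self.rc} "
                f"user={user} source={source}")


# Two events modeled on the examples in this document.
denied = AuditEvent(EventType.NET_IPMI, -1, "ipmi cmd", "qwerty223", "192.168.0.1")
poweroff = AuditEvent(EventType.HOST_IPMI, 0, "host poweroff")
```
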
The service itself can control the flow of events through its own configuration.

Example pseudocode:

    audit_event(NET_IPMI, -1 /* access denied */, "ipmi cmd", "qwerty223",
                "192.168.0.1", data /* additional data, if needed */);
    audit_event(REST, 200 /* login successful */, "rest login", "qwerty223",
                "192.168.0.1", NULL);
    audit_event(HOST_IPMI, 0 /* shutting down the host */, "host poweroff",
                NULL, NULL, NULL);

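`audit_event(blob_data)`: the blob can be described by the following structure:

    #include <netinet/in.h> /* struct sockaddr_in6 */
    #include <stdint.h>
    #include <sys/uio.h>    /* struct iovec */

    struct blob_audit
    {
        uint8_t type;              /* type of event source */
        int32_t rc;                /* return code of the event handler */
        uint32_t request_id;       /* generalized identifier of the event */
        char *user;                /* user account, or NULL */
        struct sockaddr_in6 *addr; /* originating host, or NULL */
        struct iovec *data;        /* supplementary data, or NULL */
    };
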
When the call reaches the server via D-Bus, the server already knows that the
call should be processed using the predefined list of actions set in the server
configuration.

Step-by-step execution of a call:

- client's layer
  1. checks whether audit is enabled for the service
  2. checks whether the audit event is whitelisted or blacklisted on the audit
     service side, to avoid flooding the audit service with unneeded events
  3. sends the data to the audit service via D-Bus
- server's layer
  1. accepts the D-Bus request
  2. goes through the list of actions configured for the service

How the checks are processed at the client's layer:

1.  check the status of the service and cache that value
2.  check the list of possible actions which should be logged and cache it as
    well
3.  listen for the 'PropertiesChanged' signal in case the list or the status of
    the service changes

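The three client-side checks can be sketched as a small cache object. This is a minimal Python illustration under stated assumptions: `fetch` stands in for a D-Bus property read, the property names `Enabled` and `Whitelist` are hypothetical, and only the whitelist case is shown. The point is that the service is queried once, and later changes arrive through a 'PropertiesChanged'-style callback instead of repeated reads.

```python
from typing import Callable, Dict, FrozenSet, Optional


class ClientAuditCache:
    """Client-side cache of the audit service status and request list."""

    def __init__(self, fetch: Callable[[], Dict[str, object]]) -> None:
        self._fetch = fetch                       # hypothetical property reader
        self._enabled: Optional[bool] = None      # None means "undefined state"
        self._whitelist: FrozenSet[str] = frozenset()

    def _refresh(self) -> None:
        """Steps 1 and 2: read status and list once, then cache them."""
        props = self._fetch()
        self._enabled = bool(props["Enabled"])
        self._whitelist = frozenset(props["Whitelist"])

    def on_properties_changed(self, changed: Dict[str, object]) -> None:
        """Step 3: update the cache from a change notification."""
        if "Enabled" in changed:
            self._enabled = bool(changed["Enabled"])
        if "Whitelist" in changed:
            self._whitelist = frozenset(changed["Whitelist"])

    def should_send(self, request: str) -> bool:
        """Decide locally whether an event is worth sending over D-Bus."""
        if self._enabled is None:
            self._refresh()
        return bool(self._enabled) and request in self._whitelist


reads = []


def fake_fetch() -> Dict[str, object]:
    """Stand-in for the service; records how often it is queried."""
    reads.append(1)
    return {"Enabled": True, "Whitelist": ["reset request"]}


cache = ClientAuditCache(fake_fetch)
```
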
## Service configuration

The configuration can be described as a tree of options, for example:

```
[IPMI]
   [Enabled]
   [Whitelist]
     [Cmd 0x01] ["reset request"]
     [Cmd 0x02] ["hello world"]
     [Cmd 0x03] ["goodbye cruel world"]
   [Actions]
     [Notify type1] [Recipient]
     [Notify type2] [Recipient]
     [Notify type3] [Recipient]
     [Logging type] [Options]
     [Exec] [ExternalCommand]
[REST]
   [Disabled]
   [Blacklist]
     [Path1] [Options]
     [Path2] [Options]
   [Actions]
     [Notify type2] [Recipient]
     [Logging type] [Options]
```

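One way to picture the tree above in memory is a nested mapping keyed by service. The Python sketch below is an assumption about representation only: the key names (`enabled`, `whitelist`, `blacklist`, `actions`) and the `actions_for()` helper are hypothetical, and the placeholder strings are copied from the tree rather than being real option values.

```python
# Hypothetical in-memory form of the configuration tree; not a file format.
CONFIG = {
    "IPMI": {
        "enabled": True,
        "whitelist": {
            0x01: "reset request",
            0x02: "hello world",
            0x03: "goodbye cruel world",
        },
        "actions": [
            ("Notify type1", "Recipient"),
            ("Notify type2", "Recipient"),
            ("Notify type3", "Recipient"),
            ("Logging type", "Options"),
            ("Exec", "ExternalCommand"),
        ],
    },
    "REST": {
        "enabled": False,
        "blacklist": {"Path1", "Path2"},
        "actions": [
            ("Notify type2", "Recipient"),
            ("Logging type", "Options"),
        ],
    },
}


def actions_for(config, service):
    """Return the action names configured for an enabled service.

    Unknown or disabled services yield an empty list (an assumption of
    this sketch; the design does not state the behavior explicitly).
    """
    section = config.get(service)
    if section is None or not section.get("enabled", False):
        return []
    return [name for name, _argument in section["actions"]]
```
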
Options can be updated via D-Bus properties. The audit service watches the
configuration file for changes and emits a 'PropertiesChanged' signal with the
changed details.

- Whitelisting and blacklisting

> The list of requests to be filtered and processed. A 'Whitelist' allows only
> the listed requests to be processed. A 'Blacklist' blocks only the exact
> requests listed.

- Enabling/disabling event processing for directed services, where a directed
  service is any suitable service that can use the audit service.

> Each audit processing type can be disabled or enabled at runtime via the
> config file or a D-Bus property.

- Notification setup via SNMP/E-mail/Instant messengers/D-Bus

> The end-recipient notification system with different transports.

- Logging

> phosphor-logging, journald, or any other suitable logging backend.

- User actions

> Running a command as a consequent action.

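The difference between the two list types described above can be made concrete with a short predicate. This Python sketch assumes that a service carries either a whitelist or a blacklist, and that a service with neither processes every request; both the function name `is_processed` and that fallback are assumptions, not part of the design.

```python
def is_processed(request, whitelist=None, blacklist=None):
    """Return True if the request should be processed by the audit service.

    Whitelist: only the listed requests are processed.
    Blacklist: every request except the exact listed ones is processed.
    """
    if whitelist is not None:
        return request in whitelist
    if blacklist is not None:
        return request not in blacklist
    return True  # assumption: no list configured means no filtering


ipmi_whitelist = {0x01, 0x02, 0x03}
rest_blacklist = {"Path1", "Path2"}
```
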
## Workflow

An example of possible flow:

```ascii
           +----------------+
           |   NET   IPMI   |
           |    REQUEST     |
           +----------------+
                   |
 +--------------------------------------------------------------------------+
 |         +-------v--------+                                         IPMI  |
 |         |    NET IPMI    |                                               |
 |         +----------------+                                               |
 |                 |                                                        |
 |         +-------v--------+        +---------------------------+          |
 |         | rc = handle()  +------->|  audit_event<NET_IPMI>()  |          |
 |         +----------------+        +---------------------------+          |
 |                 |                              |                         |
 |                 |                              |                         |
 |         +-------v--------+                     |                         |
 |         |   Processing   |                     |                         |
 |         |    further     |                     |                         |
 |         +----------------+                     |                         |
 +--------------------------------------------------------------------------+
                                                  |
                                                  |
 +--------------------------------------------------------------------------+
 |                  +-----------------------------+                         |
 |                  |                                        Audit Service  |
 |                  |                                                       |
 |                  |                                                       |
 |                  |                                                       |
 |            +-----v------+                                                |
 |        NO  | Is logging |        YES                                     |
 |     +------+  enabled   +--------------------+                           |
 |     |      | for  type? |                    |                           |
 |     |      +------------+            +-------v-----+                     |
 |     |                           NO   | Is request  |   YES               |
 |     |                       +--------+    type     +--------+            |
 |     |                       |        |  filtered?  |        |            |
 |     |                       |        +-------------+        |            |
 |     |                       |                               |            |
 |     |               +-------v-------+                       |            |
 |     |               |    Notify     |                       |            |
 |     |               | Administrator |                       |            |
 |     |               +---------------+                       |            |
 |     |                       |                               |            |
 |     |               +-------v-------+                       |            |
 |     |               |   Log Event   |                       |            |
 |     |               +---------------+                       |            |
 |     |                       |                               |            |
 |     |               +-------v-------+                       |            |
 |     |               |     User      |                       |            |
 |     |               |    actions    |                       |            |
 |     |               +---------------+                       |            |
 |     |                       |                               |            |
 |     |               +-------v-------+                       |            |
 |     +-------------->|      End      |<----------------------+            |
 |                     +---------------+                                    |
 |                                                                          |
 +--------------------------------------------------------------------------+
```

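The audit-service half of the flowchart reduces to two early exits followed by a fixed sequence of actions. The Python sketch below mirrors that order (notify, log, user actions). The function name `handle_event`, the `enabled` and `filtered` keys, and the callback table are hypothetical and exist only to make the control flow testable.

```python
def handle_event(event_type, request, config, actions):
    """Walk the flowchart for one event and return the actions performed."""
    section = config.get(event_type, {})
    if not section.get("enabled", False):
        return []                        # "Is logging enabled for type?" -> NO
    if request in section.get("filtered", ()):
        return []                        # "Is request type filtered?" -> YES
    performed = []
    for name in ("notify", "log", "user_actions"):   # order taken from the diagram
        action = actions.get(name)
        if action is not None:           # every action is optional
            action(event_type, request)
            performed.append(name)
    return performed


calls = []
ACTIONS = {
    "notify": lambda t, r: calls.append(("notify", t, r)),
    "log": lambda t, r: calls.append(("log", t, r)),
    "user_actions": lambda t, r: calls.append(("user_actions", t, r)),
}
FLOW_CONFIG = {
    "NET_IPMI": {"enabled": True, "filtered": {"get device id"}},
    "REST": {"enabled": False},
}
```
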
## Notification mechanisms

A unified model for reporting incidents to the end user, where the transport
can be:

- E-mail

  > Sending a note to the recipient set in the configuration, via sendmail or
  > any other mail transport.

- SNMP

  > Sending a notification via SNMP trap messages to the recipient set in the
  > configuration.

- Instant messengers

  > Sending a notification to the recipient set in the configuration via
  > jabber/sametime/gtalk/etc.

- D-Bus

  > Notifying another service set in the configuration via 'method_call' or
  > 'signal'.

Notifications are skipped if none of the above rules is set in the
configuration. Rules can be picked up at runtime.

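A transport-independent dispatcher is one way to read the unified model above. In this Python sketch each rule is a `(transport, recipient)` pair and each transport is a plain callable. The names `notify_all`, `TRANSPORTS`, and the in-memory `outbox` are hypothetical stand-ins for real e-mail or SNMP senders, and ignoring unknown transports is an assumption of the sketch.

```python
def notify_all(message, rules, transports):
    """Send `message` through every configured rule.

    With an empty rule list nothing is sent, which matches the statement
    that notifications are skipped when no rule is configured.
    """
    delivered = []
    for transport_name, recipient in rules:
        send = transports.get(transport_name)
        if send is None:
            continue  # assumption: rules naming an unknown transport are ignored
        send(recipient, message)
        delivered.append((transport_name, recipient))
    return delivered


outbox = []
TRANSPORTS = {
    "email": lambda recipient, message: outbox.append(("email", recipient, message)),
    "snmp": lambda recipient, message: outbox.append(("snmp", recipient, message)),
}
```
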
## User Actions

- Execute an application via the 'system' call.
- Handling code for a specific event type inside the handler itself. For
  example, for 'net ipmi', on an unsuccessful user login the handler:
  - Sends a notification to the administrator.
  - Runs `echo heartbeat > /sys/class/leds/alarm_red/trigger`

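Running an external command as a user action could look roughly like the following Python sketch. Note the swap: the text mentions the 'system' call, while this example uses `subprocess.run` with an argument list, which avoids shell interpretation of event data. The helper name `run_user_action`, the timeout value, and the `-1` result for commands that cannot be run are assumptions.

```python
import subprocess
import sys


def run_user_action(argv, timeout=5.0):
    """Run an external command and return its exit status.

    Returns -1 if the command cannot be started or exceeds the timeout
    (a convention chosen for this sketch only).
    """
    try:
        completed = subprocess.run(argv, timeout=timeout, check=False)
    except (OSError, subprocess.TimeoutExpired):
        return -1
    return completed.returncode


# The Python interpreter itself serves as a portable stand-in command.
ok = run_user_action([sys.executable, "-c", "raise SystemExit(0)"])
failed = run_user_action([sys.executable, "-c", "raise SystemExit(3)"])
```
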
## Alternatives Considered

Processing user requests in each dedicated interface service and logging them
separately for each of the interfaces. Scattered handling looks like an
error-prone and rigid approach.

## Impacts

Improves system manageability and security.

Impacts when phosphor-audit is not enabled:

- Many services will have slightly larger code size and longer CPU path length
  due to invocations of audit_event().
- Increased D-Bus traffic.

Impacts when phosphor-audit is enabled: All of the above, plus:

- Additional BMC processor time needed to handle audit events.
- Additional BMC flash storage needed to store logged events.
- Additional outbound network traffic to notify users.
- Additional space for notification libraries.

## Testing

The `dbus-send` command-line tool can be used to generate audit events.

Scenarios:

- For each supported service (such as Redfish, net IPMI, host IPMI, PLDM),
  create audit events and validate that they get logged.
- Ensure message-type and request-type filtering works as expected.
- Ensure basic notification actions work as expected (log, command, notify).
- While continuously generating audit events, change the phosphor-audit
  service's configuration, and validate that no audit events are lost and that
  the new configuration takes effect.

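For scripted tests, the `dbus-send` command line can be assembled programmatically. The Python sketch below only builds the argument list and does not run it. The object path `/example/audit` and the signal name `example.Audit.Event` are placeholders, since this design does not define the actual D-Bus names, and the argument order simply follows the `audit_event` parameter list.

```python
import shlex

# Placeholder D-Bus names for illustration only; not defined by this design.
AUDIT_PATH = "/example/audit"
AUDIT_SIGNAL = "example.Audit.Event"


def dbus_send_argv(event_type, rc, request, user, source):
    """Build a dbus-send invocation that emits one audit event as a signal."""
    return [
        "dbus-send",
        "--system",
        "--type=signal",
        AUDIT_PATH,
        AUDIT_SIGNAL,
        f"string:{event_type}",
        f"int32:{rc}",
        f"string:{request}",
        f"string:{user}",
        f"string:{source}",
    ]


argv = dbus_send_argv("NET_IPMI", -1, "ipmi cmd", "qwerty223", "192.168.0.1")
command_line = shlex.join(argv)  # shell-quoted form, e.g. for a test log
```
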
369