# phosphor-audit

Author:
  Ivan Mikhaylov, [i.mikhaylov@yadro.com](mailto:i.mikhaylov@yadro.com)

Other contributors:
  Alexander Amelkin, [a.amelkin@yadro.com](mailto:a.amelkin@yadro.com)
  Alexander Filippov, [a.filippov@yadro.com](mailto:a.filippov@yadro.com)

Created:
  2019-07-23

## Problem Description

End users of OpenBMC may take actions that change the system state and/or
configuration. Such actions may be taken using any of the numerous interfaces
provided by OpenBMC. That includes Redfish, IPMI, ssh or serial console shell,
and other interfaces, including future ones.

Consequences of those actions may sometimes be harmful, and an investigation may
be conducted in order to find out the person responsible for the unwelcome
changes. Currently, most changes leave no trace in OpenBMC logs, which hampers
such an investigation.

A mechanism is required that would allow for tracking such user activity,
logging it, and taking certain actions if necessary.

## Background and References

YADRO had an internal solution for the problem. It was only applicable to an
outdated version of OpenBMC and needed a redesign. There was also a parallel
effort by IBM that can be found here:
[REST and Redfish Traffic Logging](https://gerrit.openbmc-project.xyz/c/openbmc/bmcweb/+/22699)

## Assumptions

This design assumes that an end user is never given direct access to the
system shell. The shell allows for direct manipulation of the user database
(add/remove users, change passwords) and system configuration (network scripts,
etc.), and it doesn't seem feasible to track such user actions taken within the
shell. This design assumes that all user interaction with OpenBMC is limited to
controlled interfaces served by other Phosphor OpenBMC components interacting
via D-Bus.

## Requirements

 * Provide a unified method of logging user actions independent of the user
   interface, where possible user actions are:
   * Redfish/REST PUT/POST/DELETE/PATCH
   * IPMI
   * PAM
   * PLDM
   * Any other suitable service
 * Provide a way to configure system response actions taken upon certain user
   actions, where possible response actions are:
   * Log an event
   * Notify an administrator or an arbitrary notification receiver
   * Run an arbitrary command
 * Provide a way to configure notification receivers:
   * E-mail
   * SNMP
   * Instant messengers
   * D-Bus

## Proposed Design

The main idea is to catch D-Bus requests sent by user interfaces, then handle
each request according to the configuration. In the future, support for
flexible policies may be implemented to allow finer control over handling and
tracking.

The phosphor-audit service tracks user activity and takes the configured
actions in response to user actions.

The key benefit of using phosphor-audit is that all action handling will be kept
inside this project instead of spreading it across multiple dedicated interface
services with a risk of missing a handler for some action in one of them and
bloating the codebase.

The component diagram below shows an example overview of the service.

```ascii
  +----------------+  audit event                           +-----------------+
  |    IPMI NET    +-----------+                            | action          |
  +----------------+           |                            | +-------------+ |
                               |                            | |   logging   | |
  +----------------+           |                            | +-------------+ |
  |   IPMI HOST    +-----------+      +--------------+      |                 |
  +----------------+           |      |    audit     |      | +-------------+ |
                               +----->+   service    +----->| |   command   | |
  +----------------+           |      |              |      | +-------------+ |
  |  Redfish/REST  +-----------+      +--------------+      |                 |
  +----------------+           |                            | +-------------+ |
                               |                            | |   notify    | |
  +----------------+           |                            | +-------------+ |
  |  any service   +-----------+                            |                 |
  +----------------+                                        | +-------------+ |
                                                            | |     ...     | |
                                                            | +-------------+ |
                                                            +-----------------+
```

The audit event in the diagram is generated by an application to track user
activity. The application sends a 'signal' to the audit service via D-Bus. What
happens next in the audit service's handler depends on user requirements and
needs. The handler can store logs, run an arbitrary command, notify someone, or
do all of the above; each of these actions is optional.

**Audit event call**

The audit event call preprocesses incoming data on the application side before
sending it to the audit service. If the request is filtered out, it is dropped
at this point and is not processed any further. After the filter check, the
audit event call sends the data through D-Bus to the audit service, which
decides on the next steps. The call also caches the list of possible commands
(blacklist or whitelist) and the status of the audit service (disabled or
enabled). If the service is in an undefined state, the call checks whether the
service is alive.

 > `audit_event(type, rc, request, user, source, data)`
 > *  type - type of event source: IPMI, REST, PAM, etc.
 > *  rc   - return code of the event handler (status, rc, etc.)
 > *  request - a generalized identifier of the event, e.g. IPMI command
 > (cmd/netfn/lun), web path, or anything else that can describe the event.
 > *  user - the user account on behalf of which the event was processed.
 >           Depends on context; NA/None if the user cannot be determined.
 > *  source - identifier of the host that the event has originated from. This can
 >     be literally "host" for events originating from the local host (via locally
 >     connected IPMI), or an IP address or a hostname of a remote host.
 > *  data - any supplementary data that can help better identify the event
 >      (e.g., the first few bytes of the IPMI command data).

The service itself can control the flow of events through its own configuration.

Example in pseudocode:

    audit_event(NET_IPMI, -1 /* access denied */, "ipmi cmd", "qwerty223",
                "192.168.0.1", data /* additional data if needed */)
    audit_event(REST, 200 /* login successful */, "rest login",
                "qwerty223", "192.168.0.1", NULL)
    audit_event(HOST_IPMI, 0 /* shutting down the host */, "host poweroff",
                NULL, NULL, NULL)

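`audit_event(blob_data)`
The blob can be described as the following structure:

    #include <stdint.h>      /* uint8_t, int32_t, uint32_t */
    #include <netinet/in.h>  /* struct sockaddr_in6 */
    #include <sys/uio.h>     /* struct iovec */

    struct blob_audit
    {
        uint8_t type;               /* event source: IPMI, REST, PAM, ... */
        int32_t rc;                 /* return code of the event handler */
        uint32_t request_id;        /* generalized identifier of the event */
        char *user;                 /* user account, may be NULL */
        struct sockaddr_in6 *addr;  /* origin of the event, may be NULL */
        struct iovec *data;         /* optional supplementary data */
    };
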
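The arguments of the audit event call can be mirrored in a small record type. The following is only a minimal sketch, written in Python for illustration; the names `EventType`, `AuditEvent` and `describe`, as well as the numeric values of the event types, are assumptions and not part of this design.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class EventType(IntEnum):
    """Event sources named in this document (numeric values are assumed)."""
    NET_IPMI = 0
    HOST_IPMI = 1
    REST = 2
    PAM = 3


@dataclass(frozen=True)
class AuditEvent:
    """One audit record; the fields follow the audit_event() arguments."""
    type: EventType
    rc: int
    request: str
    user: Optional[str] = None    # None when no user account is known
    source: Optional[str] = None  # "host", an IP address, or a hostname
    data: bytes = b""             # optional supplementary payload

    def describe(self) -> str:
        """Render the record as a single log-friendly line."""
        who = self.user or "NA"
        where = self.source or "NA"
        return (f"{self.type.name} rc={self.rc} request={self.request!r} "
                f"user={who} source={where}")


# The first and third calls from the pseudocode example.
denied = AuditEvent(EventType.NET_IPMI, -1, "ipmi cmd", "qwerty223", "192.168.0.1")
poweroff = AuditEvent(EventType.HOST_IPMI, 0, "host poweroff")
```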
When the call reaches the server via D-Bus, the server already knows that the
call should be processed according to the predefined list of actions set in the
server configuration.

Step-by-step execution of the call:
 * client's layer
    1. check whether audit is enabled for the service
    2. check the event against the whitelist or blacklist held by the audit
       service, so that the audit service is not spammed with unneeded events
    3. send the data to the audit service via D-Bus
 * server's layer
    1. accept the D-Bus request
    2. go through the list of actions configured for the service

How the checks are processed at the client's layer:
 1. check the status of the service and cache that value
 2. check the list of actions that should be logged and cache it as well
 3. listen for the 'PropertiesChanged' signal in case the list or the status
    of the service changes

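The client-side checks above can be sketched as a small cache object. This is an illustration under assumptions, not the project's implementation: the class `AuditClientCache`, the `probe` callback, and the property names `Enabled`, `Mode` and `Requests` are hypothetical, and no real D-Bus binding is used.

```python
from typing import Callable, Optional, Set


class AuditClientCache:
    """Client-side cache of the audit service status and request filter."""

    def __init__(self, probe: Callable[[], bool]) -> None:
        self.probe = probe                   # asks whether the service is alive
        self.enabled: Optional[bool] = None  # None means "undefined state"
        self.mode = "whitelist"              # or "blacklist"
        self.requests: Set[str] = set()

    def on_properties_changed(self, changed: dict) -> None:
        """Refresh cached values from a PropertiesChanged-style payload."""
        if "Enabled" in changed:
            self.enabled = bool(changed["Enabled"])
        if "Mode" in changed:
            self.mode = changed["Mode"]
        if "Requests" in changed:
            self.requests = set(changed["Requests"])

    def should_send(self, request: str) -> bool:
        """Return True if the event should be forwarded to the audit service."""
        if self.enabled is None:
            # Undefined state: check once whether the service is alive.
            self.enabled = self.probe()
        if not self.enabled:
            return False
        listed = request in self.requests
        # Whitelist: send only listed requests. Blacklist: drop listed ones.
        return listed if self.mode == "whitelist" else not listed
```

Because the status and the filter are cached, a disabled or filtered event costs the calling service only a local lookup and no D-Bus round trip.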
176
177## Service configuration
178
179The configuration structure can be described as tree with set of options,
180as example of structure:
181
182```
183[IPMI]
184   [Enabled]
185   [Whitelist]
186     [Cmd 0x01] ["reset request"]
187     [Cmd 0x02] ["hello world"]
188     [Cmd 0x03] ["goodbye cruel world"]
189   [Actions]
190     [Notify type1] [Recipient]
191     [Notify type2] [Recipient]
192     [Notify type3] [Recipient]
193     [Logging type] [Options]
194     [Exec] [ExternalCommand]
195[REST]
196   [Disabled]
197   [Blacklist]
198     [Path1] [Options]
199     [Path2] [Options]
200   [Actions]
201     [Notify type2] [Recipient]
202     [Logging type] [Options]
203```
204
Options can be updated via D-Bus properties. The audit service listens for
changes to the configuration file and emits a 'PropertiesChanged' signal with
the changed details.

* Whitelisting and blacklisting

 > The list of requests to be filtered and processed.
 > 'Whitelist' lists the only requests that can be processed.
 > 'Blacklist' blocks only the exact requests listed.

* Enabling/disabling event processing for directed services, where a directed
  service is any suitable service that can use the audit service.

 > Each audit processing type can be disabled or enabled at runtime via
 > the config file or a D-Bus property.

* Notification setup via SNMP/E-mail/Instant messengers/D-Bus

 > Notification of the end recipient over different transports.

* Logging

 > phosphor-logging, journald, or anything else suitable.

* User actions

 > Running a command as a consequent action.

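One possible in-memory form of the configuration tree, together with the whitelist and blacklist semantics, is sketched below. The dictionary layout and the helper `actions_for` are assumptions made for illustration; the design does not prescribe a file format or a data model.

```python
# Hypothetical in-memory form of the example configuration tree.
CONFIG = {
    "IPMI": {
        "enabled": True,
        "whitelist": {
            0x01: "reset request",
            0x02: "hello world",
            0x03: "goodbye cruel world",
        },
        "actions": [
            ("notify", "type1"),
            ("notify", "type2"),
            ("notify", "type3"),
            ("log", "type"),
            ("exec", "ExternalCommand"),
        ],
    },
    "REST": {
        "enabled": False,
        "blacklist": {"Path1": "Options", "Path2": "Options"},
        "actions": [("notify", "type2"), ("log", "type")],
    },
}


def actions_for(config, source, request):
    """Return the actions to run for a request, or [] if it is dropped."""
    section = config.get(source)
    if section is None or not section["enabled"]:
        return []
    if "whitelist" in section and request not in section["whitelist"]:
        return []  # whitelist: only listed requests are processed
    if request in section.get("blacklist", {}):
        return []  # blacklist: only the exact listed requests are blocked
    return section["actions"]
```

With this layout, `actions_for(CONFIG, "IPMI", 0x01)` yields the five IPMI actions, while any REST request yields an empty list because that section is disabled.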
## Workflow

An example of a possible flow:

```ascii
           +----------------+
           |   NET   IPMI   |
           |    REQUEST     |
           +----------------+
                   |
 +--------------------------------------------------------------------------+
 |         +-------v--------+                                         IPMI  |
 |         |    NET IPMI    |                                               |
 |         +----------------+                                               |
 |                 |                                                        |
 |         +-------v--------+        +---------------------------+          |
 |         | rc = handle()  +------->|  audit_event<NET_IPMI>()  |          |
 |         +----------------+        +---------------------------+          |
 |                 |                              |                         |
 |                 |                              |                         |
 |         +-------v--------+                     |                         |
 |         |   Processing   |                     |                         |
 |         |    further     |                     |                         |
 |         +----------------+                     |                         |
 +--------------------------------------------------------------------------+
                                                  |
                                                  |
 +--------------------------------------------------------------------------+
 |                  +-----------------------------+                         |
 |                  |                                        Audit Service  |
 |                  |                                                       |
 |                  |                                                       |
 |                  |                                                       |
 |            +-----v------+                                                |
 |        NO  | Is logging |        YES                                     |
 |     +------+  enabled   +--------------------+                           |
 |     |      | for  type? |                    |                           |
 |     |      +------------+            +-------v-----+                     |
 |     |                           NO   | Is request  |   YES               |
 |     |                       +--------+    type     +--------+            |
 |     |                       |        |  filtered?  |        |            |
 |     |                       |        +-------------+        |            |
 |     |                       |                               |            |
 |     |               +-------v-------+                       |            |
 |     |               |    Notify     |                       |            |
 |     |               | Administrator |                       |            |
 |     |               +---------------+                       |            |
 |     |                       |                               |            |
 |     |               +-------v-------+                       |            |
 |     |               |   Log Event   |                       |            |
 |     |               +---------------+                       |            |
 |     |                       |                               |            |
 |     |               +-------v-------+                       |            |
 |     |               |     User      |                       |            |
 |     |               |    actions    |                       |            |
 |     |               +---------------+                       |            |
 |     |                       |                               |            |
 |     |               +-------v-------+                       |            |
 |     +-------------->|      End      |<----------------------+            |
 |                     +---------------+                                    |
 |                                                                          |
 +--------------------------------------------------------------------------+
```

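The Audit Service half of the diagram amounts to two checks followed by an ordered chain of stages. A minimal sketch of that control flow, assuming a hypothetical `handle_audit_event` function that receives the checks and the stages as callables:

```python
from typing import Callable, List


def handle_audit_event(event: dict,
                       is_enabled: Callable[[str], bool],
                       is_filtered: Callable[[dict], bool],
                       stages: List[Callable[[dict], None]]) -> bool:
    """Walk one event through the flow shown in the diagram.

    Returns True if the event reached the action stages, or False if it went
    straight to "End" because logging is disabled for its type or because the
    request type is filtered.
    """
    if not is_enabled(event["type"]):
        return False
    if is_filtered(event):
        return False
    for stage in stages:  # notify administrator, log event, user actions
        stage(event)
    return True
```

Passing the stages as a list keeps the order from the diagram (notify, log, user actions) explicit, and lets any of them be omitted when it is not configured.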
## Notification mechanisms

The unified model for reporting incidents to the end user, where the transport
can be:

* E-mail

  > Sending a note via sendmail or anything else to the recipient set in the
  > configuration.

* SNMP

  > Sending a notification via SNMP trap messages to the recipient set in the
  > configuration.

* Instant messengers

  > Sending a notification via jabber/sametime/gtalk/etc. to the recipient set
  > in the configuration.

* D-Bus

  > Notifying another service set in the configuration via 'method_call' or
  > 'signal'.

Notifications are skipped if none of the above rules is set in the
configuration. Rules can be picked up at runtime.

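The transport-independent model can be sketched as a table of transports keyed by name. The function `notify` and the rule format are hypothetical; real transports (sendmail, SNMP traps, instant messengers, D-Bus) are out of scope here and would be supplied as callables.

```python
from typing import Callable, Dict, List, Tuple

# A transport takes (recipient, message) and delivers the message.
Transport = Callable[[str, str], None]


def notify(rules: List[Tuple[str, str]],
           transports: Dict[str, Transport],
           message: str) -> int:
    """Send a message through every configured rule; return the number sent.

    `rules` is a list of (transport name, recipient) pairs. An empty list
    means that no notification is configured, so nothing is sent.
    """
    sent = 0
    for name, recipient in rules:
        transport = transports.get(name)
        if transport is None:
            continue  # unknown transport name: skip instead of failing
        transport(recipient, message)
        sent += 1
    return sent
```

Since the rule list is passed in on each call, rules changed at runtime take effect on the next event without restarting anything.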
## User Actions

 * Execute an application via the 'system' call.
 * Code for a directed handling type inside the handler itself.
   For example, for 'net ipmi', in case of an unsuccessful user login the
   handler:
   * Sends a notification to the administrator.
   * Runs `echo heartbeat > /sys/class/leds/alarm_red/trigger`

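Both kinds of user action can be sketched in a few lines. Note the deliberate difference from the text: instead of a `system`-style shell call, this sketch uses Python's `subprocess.run` with an argument list, which avoids shell quoting issues. The helper names `run_command` and `set_led_trigger` are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import List


def run_command(argv: List[str], timeout: float = 10.0) -> int:
    """Run an external command without a shell and return its exit status."""
    completed = subprocess.run(argv, timeout=timeout, check=False)
    return completed.returncode


def set_led_trigger(trigger_file: Path, trigger: str) -> None:
    """Write a trigger name to an LED trigger file.

    Equivalent to `echo <trigger> > <trigger_file>`.
    """
    trigger_file.write_text(trigger + "\n")
```

For the example above, the handler would call `set_led_trigger(Path("/sys/class/leds/alarm_red/trigger"), "heartbeat")`.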
## Alternatives Considered

Processing user requests in each dedicated interface service and logging
them separately for each of the interfaces. Scattered handling looks like
an error-prone and rigid approach.

## Impacts

Improves system manageability and security.

Impacts when phosphor-audit is not enabled:
 - Many services will have slightly larger code size and longer CPU path length
   due to invocations of audit_event().
 - Increased D-Bus traffic.

Impacts when phosphor-audit is enabled:
All of the above, plus:
 - Additional BMC processor time needed to handle audit events.
 - Additional BMC flash storage needed to store logged events.
 - Additional outbound network traffic to notify users.
 - Additional space for notification libraries.

## Testing

`dbus-send` can be used as a command-line tool for generating audit events.

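A test harness could assemble such a command line as sketched below. The object path and the interface name are placeholders invented for this example, since the design does not define them, and the argument layout is likewise an assumption; only the `dbus-send` options and type prefixes themselves are real.

```python
import shlex
from typing import List

# Hypothetical names: the design does not define the object path or interface.
AUDIT_PATH = "/xyz/openbmc_project/audit"
AUDIT_SIGNAL = "xyz.openbmc_project.Audit.Event"


def dbus_send_argv(event_type: int, rc: int, request: str,
                   user: str, source: str) -> List[str]:
    """Build a dbus-send command line that emits one test audit signal."""
    return [
        "dbus-send", "--system", "--type=signal",
        AUDIT_PATH, AUDIT_SIGNAL,
        f"byte:{event_type}", f"int32:{rc}",
        f"string:{request}", f"string:{user}", f"string:{source}",
    ]


# Print a shell-quoted command that can be pasted into a BMC shell.
print(shlex.join(dbus_send_argv(0, -1, "ipmi cmd", "qwerty223", "192.168.0.1")))
```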
Scenarios:
 - For each supported service (such as Redfish, net IPMI, host IPMI, PLDM),
   create audit events and validate that they get logged.
 - Ensure message-type and request-type filtering works as expected.
 - Ensure basic notification actions work as expected (log, command, notify).
 - While continuously generating audit events, change the phosphor-audit
   service's configuration, and validate that no audit events are lost and the
   new configuration takes effect.
362