Lines Matching +full:library +full:- +full:sel
3 Author: [Patrick Williams][patrick-email] `<stwcx>`
5 [patrick-email]: mailto:patrick@stwcx.xyz
13 There is currently not a consistent end-to-end error and event reporting design
15 primarily using phosphor-logging and one using rsyslog, both of which have gaps
17 end-to-end design handling both errors and tracing events which facilitate
26 of the IPMI "System Event Log (SEL)".
28 The IPMI SEL is the location where the BMC can collect errors and events,
30 be "DIMM-A0 encountered an uncorrectable ECC error" or "System boot successful".
31 These SEL records are exposed as human-readable strings, either natively by an
32 OEM SEL design or by tools such as `ipmitool`, which are typically unique to
36 temperature threshold exceeded: ["temperature threshold exceeded"][HPE-Example]
37 and ["Temperature #0x30 Upper Critical going high"][Oracle-Example]. There is
45 reference][Registry-Example] from the DMTF gives this example:
84 Within OpenBMC, there is currently a [limited design][existing-design] for this
85 Redfish feature and it requires inserting specially formed Redfish-specific
88 observed that these [strings][app-example], when used, are often out of date
89 with the [message registry][obmc-registry-example] advertised by `bmcweb`. Some
90 maintainers have rejected adding new Redfish-specific logging messages to their
94 …nbmc/bmcweb/blob/de0c960c4262169ea92a4b852dd5ebbe3810bf00/redfish-core/schema/dmtf/json-schema/Log…
95 [HPE-Example]:
96 …ublic/docDisplay?docId=sd00002092en_us&docLocale=en_US&page=GUID-D7147C7F-2016-0901-06CE-000000000…
97 [Oracle-Example]:
98 https://docs.oracle.com/cd/E19464-01/820-6850-11/IPMItool.html#50602039_63068
99 [Registry-Example]:
100 https://www.dmtf.org/sites/default/files/Redfish%20School%20-%20Events_0.pdf
101 [existing-design]:
102 https://github.com/openbmc/docs/blob/master/architecture/redfish-logging-in-bmcweb.md
103 [app-example]:
104 …https://github.com/openbmc/phosphor-post-code-manager/blob/f2da78deb3a105c7270f74d9d747c77f0feaae2…
105 [obmc-registry-example]:
106 …https://github.com/openbmc/bmcweb/blob/4ba5be51e3fcbeed49a6a312b4e6b2f1ea7447ba/redfish-core/inclu…
108 ### Existing phosphor-logging implementation
118 client. `phosphor-logging` extended this to also add metadata associated to the
125 - name: InvalidCertificate
129 `phosphor-logging` metadata definition (in
133 - name: InvalidCertificate
135 - str: "REASON=%s"
154 `phosphor-logging`'s daemon for recording. As a side-effect of both calls, the
157 When an error is sent to the `phosphor-logging` daemon, it will:
161 2. Create an [`xyz.openbmc_project.Logging.Entry`][Logging-Entry] DBus object
166 `xyz.openbmc_project.Logging.Entry` objects advertised by `phosphor-logging`
170 perform (hand-coded) regular-expressions to extract any information from the
171 `Message` field of the `LogEntry`. Furthermore, these regular-expressions are
175 [Logging-Entry]:
176 …https://github.com/openbmc/phosphor-dbus-interfaces/blob/9012243e543abdc5851b7e878c17c991b2a2a8b7/…
180 - There are two different implementations of error logging, neither of which are
184 - The `REDFISH_MESSAGE_ID` log approach leads to differences between the Redfish
189 also does not provide compile-time assurance of appropriate metadata
190 collection, which can lead to producing code being out-of-date with the
193 - The `phosphor-logging` approach does not provide compile-time assurance of
197 - The `sdbusplus` bindings for error reporting do not currently handle lossless
200 - Similar applications can result in different Redfish `LogEntry` for the same
202 between `dbus-sensors`, `phosphor-hwmon`, `phosphor-virtual-sensor`, and
203 `phosphor-health-monitor`. One cause of this is two different error reporting
208 - Applications running on the BMC must be able to report errors and failure
212 - These errors must be structured, versioned, and the complete set of errors
213 able to be created by the BMC should be available at build-time of a BMC
215 - The set of errors, able to be created by the BMC, must be able to be
217 - For Redfish, the transformation must comply with the Redfish standard
219 - For Redfish, the transformation should allow mapping internally defined
220 events to pre-existing Redfish Message Registries for broader
222 - For Redfish, the implementation must also support the EventService
223 mechanics for push-reporting.
224 - Errors reported by the BMC should contain sufficient information to allow
228 - Applications running on the BMC should be able to report important tracing
232 - All requirements relevant to errors are also applicable to tracing events.
233 - The implementation must have a mechanism for vendors to be able to disable
236 - Applications running on the BMC should be able to determine when a previously
240 - The BMC should provide a mechanism for managed entities within the server to
244 - The implementation on the BMC should scale to a minimum of
245 [10,000][error-discussion] errors and events without impacting the BMC or
248 - The implementation should provide a mechanism to allow OEM or vendor
250 the Redfish Message Registry) for usage in closed-source or non-upstreamed
252 vendor-specific and not be tied to the OpenBMC project.
254 - APIs to implement error and event reporting should have good ergonomics. These
255 APIs must provide compile-time identification, for applicable programming
259 - The generated error classes and APIs should not require exceptions but
263 [error-discussion]:
268 The proposed design has a few high-level design elements:
270 - Consolidate the `sdbusplus` and `phosphor-logging` implementation of error
272 associated APIs and add compile-time checking of missing metadata.
274 - Add APIs to `phosphor-logging` to enable daemons to easily look up their own
277 - Add to `phosphor-logging` a compile-time mechanism to disable recording of
278 specific tracing events for vendor-level customization.
280 - Generate a Redfish Message Registry for all errors and events defined in
281 `phosphor-dbus-interfaces`, using binding generators from `sdbusplus`. Enhance
283 cover the Redfish Message Registry and `phosphor-logging` enhancements;
285 Base64-encoded JSON representation of the entire `Logging.Entry` for
287 `bmcweb` EventService implementation to support `phosphor-logging`-hosted
293 `Foo.metadata.yaml` files specified by `phosphor-logging` and specified by a new
300 The `sdbusplus` library will be enhanced to provide the following:
302 - JSON serialization and de-serialization of generated exception types with
307 - A facility to register exception types, at library load time, with the
308 `sdbusplus` library for automatic conversion back to C++ exception types in
313 - Generate complete C++ exception types, with compile-time checking of missing
317 - size-type and signed integer
318 - floating-point number
319 - string
320 - DBus object path
322 - Generate a format that `bmcweb` can use to create and populate a Redfish
323 Message Registry, and translate from `phosphor-logging` to Redfish `LogEntry`
331 ### `phosphor-dbus-interfaces`
336 generators from `sdbusplus`. A small library enhancement will be done to
340 ### `phosphor-logging`
344 > managed separately. The `phosphor-logging` default `meson.options` have
355 - name: CreateEntry
357 - name: Message
359 - name: Severity
361 - name: AdditionalData
363 - name: Hint
367 - name: Entry
380 - property: Hint
387 - name: FindEntry
389 - name: Hint
392 - name: Entry
395 - xyz.openbmc_project.Common.ResourceNotFound
404 There are outstanding performance concerns with the `phosphor-logging`
406 This issue is expected to be self-contained within `phosphor-logging`, except
407 for potential future changes to the log-retrieval interfaces used by `bmcweb`.
409 APIs, from the experimentation and improvements in `phosphor-logging`, we will
421 `bmcweb` already has support for build-time conversion from a Redfish Message
425 from bitbake from `phosphor-dbus-interfaces` and vendor-specific event
427 also be added for adding `phosphor-dbus-interfaces` as a Meson subproject for
428 stand-alone testing.
445 1. A Base64-encoded JSON representation of the `Logging.Entry` will be assigned
464 `phosphor-logging` hosted events. The implementation of `LogService` should be
465 enhanced to support log paging for `phosphor-logging` hosted events.
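
The Base64-encoded JSON representation of a `Logging.Entry` mentioned above can be illustrated with a short round-trip sketch. This is only an assumption-laden illustration: the helper names `encode_entry` and `decode_entry` and the entry contents are hypothetical, and the actual serialization used by `phosphor-logging` and `bmcweb` may differ.

```python
import base64
import json


def encode_entry(entry: dict) -> str:
    """Serialize an entry to compact JSON, then Base64-encode it (hypothetical helper)."""
    raw = json.dumps(entry, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def decode_entry(blob: str) -> dict:
    """Reverse encode_entry: Base64-decode, then parse the JSON (hypothetical helper)."""
    return json.loads(base64.b64decode(blob).decode("utf-8"))


# Hypothetical entry contents, for illustration only.
entry = {
    "Message": "xyz.openbmc_project.Example.Error",
    "Severity": "Error",
    "AdditionalData": {"TARGET": "/example/path", "ERRNO": 5},
}

blob = encode_entry(entry)
restored = decode_entry(blob)
```

Because the encoding is lossless, a consumer can recover the full structured entry rather than parsing a human-readable message string.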
467 ### `phosphor-sel-logger`
469 The `phosphor-sel-logger` has a meson option `send-to-logger` which toggles
470 between using `phosphor-logging` or the [`REDFISH_MESSAGE_ID`
471 mechanism][existing-design]. The `phosphor-logging`-utilizing paths will be
472 updated to utilize `phosphor-dbus-interfaces` specified errors and events.
476 Consider an example file in `phosphor-dbus-interfaces` as
484 - name: UpdateFailure
487 - name: TARGET
490 - name: ERRNO
492 - name: CALLOUT_HARDWARE
500 - name: BMCUpdateFailure
505 redfish-mapping: OpenBMC.FirmwareUpdateFailed
508 - name: UpdateProgress
510 - name: TARGET
513 - name: COMPLETION
524 schema][yaml-schema] is contained in the sdbusplus repository.
554 implementation will provide compile-time assurance that all of the metadata
568 [yaml-schema]:
575 - Adjusting a description or message should result in a `PATCH` increment.
576 - Adding a new error or event, or adding metadata to an existing error or event,
578 - Deprecating an error or event should result in a `MAJOR` increment.
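
The increment rules above can be sketched as a small helper. This is a minimal illustration, not part of any OpenBMC repository: the `Change` enum and `bump` function are hypothetical names, and the sketch assumes that additive changes map to a `MINOR` increment, following the usual semantic-versioning convention.

```python
from enum import Enum


class Change(Enum):
    """Hypothetical change categories for a registry update."""

    DESCRIPTION = "description"  # description or message text adjusted
    ADDITION = "addition"        # new error/event or new metadata (assumed MINOR)
    DEPRECATION = "deprecation"  # error or event deprecated


def bump(version: str, change: Change) -> str:
    """Return the next MAJOR.MINOR.PATCH version for the given change."""
    major, minor, patch = (int(part) for part in version.split("."))
    if change is Change.DEPRECATION:
        return f"{major + 1}.0.0"
    if change is Change.ADDITION:
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"
```

For example, `bump("1.2.3", Change.ADDITION)` yields `"1.3.0"`, resetting the lower-order component as semantic versioning prescribes.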
580 There is [guidance on maintenance][registry-guidance] of the OpenBMC Message
582 `phosphor-dbus-interfaces` policy.
584 [registry-guidance]:
585 …https://github.com/openbmc/bmcweb/blob/master/redfish-core/include/registries/openbmc_message_regi…
621 `phosphor-logging`. Events defined in other repositories will be expected to use
622 some other prefix. Vendor-defined repositories should use a vendor-owned prefix
632 identifier naming). The `sdbusplus` (and `phosphor-logging` and `bmcweb`)
640 `phosphor-dbus-interfaces` defined events. Vendors must not add their own events
641 to `phosphor-dbus-interfaces` in downstream implementations because it would
643 OpenBMC-owned Registry which is not the case, but they should add them to their
652 this proposal there are many minor alternatives that have been assessed.
656 The original `phosphor-logging` error descriptions allowed inheritance between
659 - This introduces complexity in the Redfish Message Registry versioning because
662 - It makes it difficult for a developer to clearly identify all of the fields
669 understand, and can provide compile-time awareness of missing metadata fields.
695 LSP-enabled editors to give completions for the metadata fields but
697 constructed but not thrown, which means we cannot get compile-time checking
700 2. This syntax uses tag-dispatch to enable compile-time checking of all
701 metadata fields and potential LSP-completion of the tag-types, but is more
705 `phosphor-logging`'s `lg2` API, but does not allow LSP-completion of the
709 enable compile-time checking that all metadata fields have been populated by
710 the lambda. The LSP-completion is likely not as strong as option (1), due to
711 the use of `auto`, and the lambda necessity will likely be a hang-up for
715 provide compile-time confirmation that all fields have been populated.
733 - Use a date code (ex. `2024.17.x`) representing the ISO 8601 week when the
736 - This does not cover vendors that may choose to branch for stabilization
738 OpenBMC-versioned message registry with different content.
740 - Use the most recent `openbmc/openbmc` tag as the version.
742 - This does not cover vendors that build off HEAD and may deploy multiple
745 - Generate the version based on the git-history.
747 - This requires `phosphor-dbus-interfaces` to be built from a git repository,
749 non-trivial processing that continues to scale over time.
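
For reference, the date-code alternative listed above (ex. `2024.17.x`) could be computed from the ISO 8601 week as in the following sketch. The function name `date_code` and the choice not to zero-pad the week number are assumptions made only for illustration.

```python
from datetime import date


def date_code(day: date, patch: int = 0) -> str:
    """Build a YEAR.WEEK.PATCH version string from the ISO 8601 week of `day`."""
    # isocalendar() returns the ISO year, ISO week number, and ISO weekday.
    # The ISO year can differ from the calendar year near year boundaries.
    iso_year, iso_week, _ = day.isocalendar()
    return f"{iso_year}.{iso_week}.{patch}"
```

A date in the week of 22 April 2024 gives `2024.17.0`. Note that 30 December 2024 already belongs to ISO week 1 of 2025, so the ISO year must be used rather than the calendar year.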
756 Intel-specific code that is not pulled into any upstreamed machine, 39 are
760 and many of the others do not have attributes that would facilitate a multi-host
766 existing format, we can maintain those call-sites for a time period of 1-2
770 `phosphor-dbus-interfaces` defined events to the current `OpenBMC.0.4.0`
775 - phosphor-post-code-manager
776 - BIOSPOSTCode (unique)
777 - dbus-sensors
778 - ChassisIntrusionDetected (unique)
779 - ChassisIntrusionReset (unique)
780 - FanInserted
781 - FanRedundancyLost (unique)
782 - FanRedundancyRegained (unique)
783 - FanRemoved
784 - LanLost
785 - LanRegained
786 - PowerSupplyConfigurationError (unique)
787 - PowerSupplyConfigurationErrorRecovered (unique)
788 - PowerSupplyFailed
789 - PowerSupplyFailurePredicted (unique)
790 - PowerSupplyFanFailed
791 - PowerSupplyFanRecovered
792 - PowerSupplyPowerLost
793 - PowerSupplyPowerRestored
794 - PowerSupplyPredictedFailureRecovered (unique)
795 - PowerSupplyRecovered
796 - phosphor-sel-logger
797 - IPMIWatchdog (unique)
798 - `SensorThreshold*` : 8 different events
799 - phosphor-net-ipmid
800 - InvalidLoginAttempted (unique)
801 - entity-manager
802 - InventoryAdded (unique)
803 - InventoryRemoved (unique)
804 - estoraged
805 - ServiceStarted
806 - x86-power-control
807 - NMIButtonPressed (unique)
808 - NMIDiagnosticInterrupt (unique)
809 - PowerButtonPressed (unique)
810 - PowerRestorePolicyApplied (unique)
811 - PowerSupplyPowerGoodFailed (unique)
812 - ResetButtonPressed (unique)
813 - SystemPowerGoodFailed (unique)
815 Intel-only implementations:
817 - intel-ipmi-oem
818 - ADDDCCorrectable
819 - BIOSPostERROR
820 - BIOSRecoveryComplete
821 - BIOSRecoveryStart
822 - FirmwareUpdateCompleted
823 - IntelUPILinkWidthReducedToHalf
824 - IntelUPILinkWidthReducedToQuarter
825 - LegacyPCIPERR
826 - LegacyPCISERR
827 - `ME*` : 29 different events
828 - `Memory*` : 9 different events
829 - MirroringRedundancyDegraded
830 - MirroringRedundancyFull
831 - `PCIeCorrectable*`, `PCIeFatal` : 29 different events
832 - SELEntryAdded
833 - SparingRedundancyDegraded
834 - pfr-manager
835 - BIOSFirmwareRecoveryReason
836 - BIOSFirmwarePanicReason
837 - BMCFirmwarePanicReason
838 - BMCFirmwareRecoveryReason
839 - BMCFirmwareResiliencyError
840 - CPLDFirmwarePanicReason
841 - CPLDFirmwareResilencyError
842 - FirmwareResiliencyError
843 - host-error-monitor
844 - CPUError
845 - CPUMismatch
846 - CPUThermalTrip
847 - ComponentOverTemperature
848 - SsbThermalTrip
849 - VoltageRegulatorOverheated
850 - s2600wf-misc
851 - DriveError
852 - InventoryAdded
856 - New APIs are defined for error and event logging. This will deprecate existing
857 `phosphor-logging` APIs, with a time to migrate, for error reporting.
859 - The design should improve performance by eliminating the regular parsing of
865 - Backwards compatibility and documentation should be improved by the automatic
871 - **Does this design require a new repository?**
872 - No
873 - **Who will be the initial maintainer(s) of this repository?**
874 - N/A
875 - **Which repositories are expected to be modified to execute this design?**
876 - `sdbusplus`
877 - `phosphor-dbus-interfaces`
878 - `phosphor-logging`
879 - `bmcweb`
880 - Any repository creating an error or event.
884 - Unit tests will be written in `sdbusplus` and `phosphor-logging` for the error
888 - Unit tests will be written for `bmcweb` for basic `Logging.Entry`
891 - Integration tests should be leveraged (and enhanced as necessary) from
892 `openbmc-test-automation` to cover the end-to-end error creation and Redfish