History log of /openbmc/openpower-hw-diags/analyzer/ (Results 26 – 50 of 179)
Revision Date Author Comments
adda0540 06-Apr-2023 Zane Shelley <zshelle@us.ibm.com>

Clarify definition of chip checkstop

Previously, the ATTN_TYPE_CHECKSTOP associated with a signature was
synonymous with a system checkstop event. This is certainly true if a
processor chip checkstops. However, it is not true if a connected OCMB
chip checkstops because in some cases the system can recover. To
differentiate them from system checkstops, OCMB chip checkstops were
previously reported as unit checkstops. With the addition of Odyssey
OCMBs, which have the ability to report both chip and unit checkstops,
we decided to fix the confusion and disassociate a chip checkstop from
a system checkstop. Now the signatures will properly report the chip
attention type, and the signature filtering code has been modified to
treat only chip checkstops from processor chips as system checkstop
attentions.

Signed-off-by: Zane Shelley <zshelle@us.ibm.com>
Change-Id: Iff9822ff8c9c0ae1afe84353010e94759dbdf49d
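
The rule this commit describes can be sketched as below. This is a minimal illustration only: ChipKind, AttnType, Signature, and isSystemCheckstop are hypothetical names invented for the sketch, not the analyzer's actual types or functions.

```cpp
// Hypothetical types for illustration; the real analyzer uses its own
// signature and attention-type definitions.
enum class ChipKind
{
    Processor,
    Ocmb
};

enum class AttnType
{
    ChipCheckstop,
    UnitCheckstop,
    Recoverable
};

struct Signature
{
    ChipKind chip;
    AttnType attn;
};

// A chip checkstop is treated as a system checkstop only when it comes
// from a processor chip. An OCMB chip checkstop keeps its own attention
// type because the system may be able to recover from it.
inline bool isSystemCheckstop(const Signature& sig)
{
    return sig.attn == AttnType::ChipCheckstop &&
           sig.chip == ChipKind::Processor;
}
```

With this shape, an OCMB signature can report AttnType::ChipCheckstop truthfully without being mistaken for a system checkstop.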

93b001c5 24-Mar-2023 Zane Shelley <zshelle@us.ibm.com>

Remove support for deprecated RAS data version 1

Signed-off-by: Zane Shelley <zshelle@us.ibm.com>
Change-Id: I91572c057169e3416bc543bad5ccab1a505d1485

100c7a26 03-Mar-2023 Caleb Palmer <cnpalmer@us.ibm.com>

Updates to Odyssey RAS data for TP and MEM local FIRs

Signed-off-by: Caleb Palmer <cnpalmer@us.ibm.com>
Change-Id: Ia916f92fe7afc7278c6f8efc2d46ad5223064eb5

51f8202c 22-Feb-2023 Caleb Palmer <cnpalmer@us.ibm.com>

Update DSTL_FIR callouts in the event of failure to analyze an OCMB

Change-Id: I40c17703ad032aa98f02b43d9cb321b7fc86fea3
Signed-off-by: Caleb Palmer <cnpalmer@us.ibm.com>

d5fa9584 27-Feb-2023 Caleb Palmer <cnpalmer@us.ibm.com>

Add initial RAS data files for Odyssey

Change-Id: I70a596dd364057a4ce555546c36cc8369764b785
Signed-off-by: Caleb Palmer <cnpalmer@us.ibm.com>

31a8753a 22-Feb-2023 Caleb Palmer <cnpalmer@us.ibm.com>

Update Explorer RAS data json with sorted keys

Change-Id: I6ebc92345d2cb857abcbe5b73ca73e3c7a3a5191
Signed-off-by: Caleb Palmer <cnpalmer@us.ibm.com>

5836f4a6 09-Feb-2023 Zane Shelley <zshelle@us.ibm.com>

use nlohmann json exceptions instead of std

nlohmann::json::at() will throw nlohmann::json::out_of_range instead
of std::out_of_range. This resulted in missed exceptions in the
signature filtering code.

Change-Id: I573e1ed4455bbda4f05c100edd315eb0ccdc9c3f
Signed-off-by: Zane Shelley <zshelle@us.ibm.com>
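
Why the exceptions were missed can be shown with a small stand-in. The JSON library's out_of_range type derives from the library's own exception base (itself a std::exception), not from std::out_of_range, so a handler for std::out_of_range never fires. The sketch below does not use the real library: LibOutOfRange, libAt, and whichHandler are hypothetical names that only mimic that hierarchy.

```cpp
#include <exception>
#include <map>
#include <stdexcept>
#include <string>

// Stand-in for a library exception that derives from std::exception but
// NOT from std::out_of_range (hypothetical, not the real library type).
struct LibOutOfRange : std::exception
{
    const char* what() const noexcept override
    {
        return "key not found";
    }
};

// Stand-in for a library lookup that throws its own exception type.
int libAt(const std::map<std::string, int>& data, const std::string& key)
{
    auto it = data.find(key);
    if (it == data.end())
    {
        throw LibOutOfRange{};
    }
    return it->second;
}

// Reports which handler caught the failed lookup: "std", "lib", or
// "none" when nothing was thrown.
std::string whichHandler(const std::map<std::string, int>& data,
                         const std::string& key)
{
    try
    {
        libAt(data, key);
    }
    catch (const std::out_of_range&)
    {
        return "std"; // not reached: the types are unrelated
    }
    catch (const LibOutOfRange&)
    {
        return "lib";
    }
    return "none";
}
```

If the second handler were absent, the exception would propagate past the std::out_of_range handler, which matches the symptom the commit message reports.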

02d59af5 07-Feb-2023 Zane Shelley <zshelle@us.ibm.com>

Exception handling with flags in ras-data-parser

It is possible that a signature may not be defined in the RAS data, in
which case trying to access the flags for the undefined signature would
throw an exception. This is not the desired behavior. Instead, we'll
catch the exceptions and move on as if the flag is not defined.

Change-Id: I4d3cff52ce5f32074fca9863f60b84726dd590aa
Signed-off-by: Zane Shelley <zshelle@us.ibm.com>
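
The "catch and treat as not defined" behavior might look roughly like the following. This is a sketch under assumptions: FlagTable and isFlagSet are hypothetical names, and a std::map stands in for the parsed RAS data, so the exception caught here is std::out_of_range rather than the JSON library's own type.

```cpp
#include <map>
#include <set>
#include <stdexcept>
#include <string>

// Hypothetical in-memory form of the RAS data flags:
// signature name -> set of flag names defined for that signature.
using FlagTable = std::map<std::string, std::set<std::string>>;

// Returns true only if the signature exists and carries the flag.
// An undefined signature is treated the same as an unset flag.
bool isFlagSet(const FlagTable& table, const std::string& signature,
               const std::string& flag)
{
    try
    {
        // std::map::at throws std::out_of_range for a missing key.
        return table.at(signature).count(flag) != 0;
    }
    catch (const std::out_of_range&)
    {
        return false;
    }
}
```

Callers can then query any signature without first checking whether the RAS data defines it.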

ecde53fc 13-Dec-2022 Caleb Palmer <cnpalmer@us.ibm.com>

Adjust TI root cause filter to skip INT_CQ_FIR[47:50]

These bits are recoverable errors and should not
be blamed as the root cause of a TI.

Change-Id: I666eadbde0c2a0935fa47206f337112bc44a100f
Signed-off-by: Caleb Palmer <cnpalmer@us.ibm.com>

b69b2ba0 14-Dec-2022 Caleb Palmer <cnpalmer@us.ibm.com>

Updated P10 RAS data json with added thresholds

Change-Id: Iddc7d587c69560eb8194edf22235b3e9d903412e
Signed-off-by: Caleb Palmer <cnpalmer@us.ibm.com>

8b10d699 08-Dec-2022 Patrick Williams <patrick@stwcx.xyz>

prettier: re-format

Prettier is enabled in openbmc-build-scripts on Markdown, JSON, and YAML
files to have consistent formatting for these file types. Re-run the
formatter on the whole repository.

Change-Id: Ib936836ce0d698dc522bc047a78d4f1b0060c13c
Signed-off-by: Patrick Williams <patrick@stwcx.xyz>

1a4f0e70 07-Nov-2022 Caleb Palmer <cnpalmer@us.ibm.com>

Update root cause filtering to use RAS data flags

Change-Id: I172540905a39533139821d3cb1676424824bd804
Signed-off-by: Caleb Palmer <cnpalmer@us.ibm.com>

de220920 05-Dec-2022 Zane Shelley <zshelle@us.ibm.com>

Change scope of auto-generated build info header

Changes had to be made in libhei to make the build information header
more portable. These changes are in reaction to that.

Signed-off-by: Zane Shelley <zshelle@us.ibm.com>
Change-Id: Ifeb04f302d850446eff42ae66c2b29b1693c5889

934635e0 03-Nov-2022 Caleb Palmer <cnpalmer@us.ibm.com>

Update Explorer RAS data json to auto-generated v2

Signed-off-by: Caleb Palmer <cnpalmer@us.ibm.com>
Change-Id: I0b5062f3f8dac85d9f8abfe2115c15aae3e12d0c

f1184392 07-Oct-2022 Caleb Palmer <cnpalmer@us.ibm.com>

Add RAS data parser handling for getting RAS data flags

In the future we will be supporting an additional 'flags'
type stored in the RAS data files for specific bits. This
adds the handling to the RAS data parser to get those flags.

Signed-off-by: Caleb Palmer <cnpalmer@us.ibm.com>
Change-Id: Ie7889135ae7a643fec287565143a8ee7edc33777

dd74a84f 02-Nov-2022 Caleb Palmer <cnpalmer@us.ibm.com>

Update RAS data json with flags for root cause filtering

Change-Id: If39f871c3d02c06cb5ad972a361c326ab8391748
Signed-off-by: Caleb Palmer <cnpalmer@us.ibm.com>

e36866c3 31-Oct-2022 Caleb Palmer <cnpalmer@us.ibm.com>

Add auto-generated json RAS data and supporting changes

Moving forward we want to use json RAS data files that
have been auto-generated instead of maintaining the
json itself. This updates the current json RAS data
to version 2 and makes accompanying changes in the
RAS data parser and schema.

Change-Id: I1278c65f6479437630de5b9d3440d4a19f42a1f6
Signed-off-by: Caleb Palmer <cnpalmer@us.ibm.com>

329dbbde 03-Oct-2022 Caleb Palmer <cnpalmer@us.ibm.com>

Adjust root cause filtering for IUE thresholds

After handling an IUE threshold, a channel fail will
be initiated by firmware. If that channel fail causes
a system checkstop, we want to blame the IUE FIR bits
as the root cause.

Change-Id: Idd28b0b4310b83b97258755bc8da0dad1f58d2a6
Signed-off-by: Caleb Palmer <cnpalmer@us.ibm.com>

7a465259 09-Sep-2022 Caleb Palmer <cnpalmer@us.ibm.com>

Add FFDC for signatures stored in scratch registers

If analysis was interrupted by a system checkstop there may
exist an error signature within two Hostboot scratch regs
that indicates the signature from that analysis. This commit
adds support to add that signature as FFDC to the PEL if it
exists to indicate that a prior analysis was interrupted
such that we may be missing a PEL for that signature.

Change-Id: I53216e2c7910c69c4e7e74010a5c0045b793bfde
Signed-off-by: Caleb Palmer <cnpalmer@us.ibm.com>

fc7e2476 24-Jun-2022 Zane Shelley <zshelle@us.ibm.com>

CORE_FIR recoverables could be blamed as checkstop root cause

If a CORE_FIR recoverable attention fails recovery, it will trigger a
core unit checkstop attention via another bit. All core unit checkstop
attentions have the potential to trigger a system checkstop attention.
Therefore, all CORE_FIR recoverable attentions could be blamed as system
checkstop root cause attentions.

Signed-off-by: Zane Shelley <zshelle@us.ibm.com>
Change-Id: Ib2f3916218b4dce88797f645a302716ef4fd4d49

b82cbf75 27-Jun-2022 Zane Shelley <zshelle@us.ibm.com>

Update to clang-format-14

Required because the Jenkins CI tools have moved to v14.

Signed-off-by: Zane Shelley <zshelle@us.ibm.com>
Change-Id: I3cf4df1b45325545a423bdcb810040724a598ec5

513f64aa 15-Jun-2022 Zane Shelley <zshelle@us.ibm.com>

Handling for host detected LPC timeout

For reasons not yet explained, hardware will not initiate an LPC timeout
attention via the NCU timeout FIR bit as we expected. When the host firmware
detects an LPC timeout, it will manually set N1_LOCAL_FIR[61] to force a
system checkstop. The service response for this bit will be to call out
the hardware as if there was a hardware reported LPC timeout.

Signed-off-by: Zane Shelley <zshelle@us.ibm.com>
Change-Id: I863e8aa3ef50a4b18b5106b3a45c4cf81b2c7808

ed3ab8f9 24-May-2022 Zane Shelley <zshelle@us.ibm.com>

Fix outdated comment in analyzer filter support

Signed-off-by: Zane Shelley <zshelle@us.ibm.com>
Change-Id: I5e14eb82a4017ed794314d2800ea88dd0d706942

026e5a3f 05-May-2022 Zane Shelley <zshelle@us.ibm.com>

Avoid guarding on TOD interface errors

The error could be anywhere between the two processors in the interface.
Fatally guarding the MDMT will cause a system outage until service is
done. Instead, do not guard on TOD interface errors to avoid an outage.

Signed-off-by: Zane Shelley <zshelle@us.ibm.com>
Change-Id: I446917bad985e5143657398b2fbadacf6e8c4a9d

7bf1bfa5 27-Apr-2022 Zane Shelley <zshelle@us.ibm.com>

Enable LPC timeout handling

It turns out the plugin exists, but nothing in the RAS data was calling
the plugin.

Change-Id: I9d35a61064e5f412f216ffbea96597b4d691a98a
