History log of /openbmc/linux/arch/powerpc/platforms/powernv/opal.c (Results 51 – 75 of 299)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
# 8e84b2d1 09-Aug-2017 Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>

powerpc/powernv: Add support to set power-shifting-ratio

This patch adds support to set power-shifting-ratio which hints the
firmware how to distribute/throttle power between different entities
in a

powerpc/powernv: Add support to set power-shifting-ratio

This patch adds support to set power-shifting-ratio which hints the
firmware how to distribute/throttle power between different entities
in a system (e.g CPU v/s GPU). This ratio is used by OCC for power
capping algorithm.

Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

show more ...


# cb8b340d 09-Aug-2017 Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>

powerpc/powernv: Add support for powercap framework

Adds a generic powercap framework to change the system powercap
inband through OPAL-OCC command/response interface.

Signed-off-by: Shilpasri G Bh

powerpc/powernv: Add support for powercap framework

Adds a generic powercap framework to change the system powercap
inband through OPAL-OCC command/response interface.

Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

show more ...


# 8f95faaa 18-Jul-2017 Madhavan Srinivasan <maddy@linux.vnet.ibm.com>

powerpc/powernv: Detect and create IMC device

Code to create platform device for the In-Memory Collection (IMC)
counters. Platform devices are created based on the IMC compatibility.
New header file

powerpc/powernv: Detect and create IMC device

Code to create platform device for the In-Memory Collection (IMC)
counters. Platform devices are created based on the IMC compatibility.
New header file created to contain the data structures and macros
needed for In-Memory Collection (IMC) counter pmu devices.

The device tree for IMC counters starts at the node "imc-counters".
This node contains all the IMC PMU nodes and event nodes for these IMC
PMUs. Device probe() parses the device to locate three possible IMC
device types (Nest/Core/Thread). Function then branch to parse each
unit nodes to populate vital information such as device memory sizes,
event nodes information, base address for reserve memory access (if
any) and so on. Simple bare-minimum shutdown function added which only
"stops" the engines.

Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Signed-off-by: Hemant Kumar <hemant@linux.vnet.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
[mpe: Fix build with CONFIG_PERF_EVENTS=n]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

show more ...


# a70b487b 17-Jul-2017 Michael Ellerman <mpe@ellerman.id.au>

powerpc/powernv: Fix boot on Power8 bare metal due to opal_configure_cores()

In commit 1c0eaf0f56d6 ("powerpc/powernv: Tell OPAL about our MMU mode
on POWER9"), we added additional flags to the OPAL

powerpc/powernv: Fix boot on Power8 bare metal due to opal_configure_cores()

In commit 1c0eaf0f56d6 ("powerpc/powernv: Tell OPAL about our MMU mode
on POWER9"), we added additional flags to the OPAL call to configure
CPUs at boot.

These flags only work on Power9 firmwares, and worse can cause boot
failures on Power8 machines, so we check for CPU_FTR_ARCH_300 (aka POWER9)
before adding the extra flags.

Unfortunately we forgot that opal_configure_cores() is called before
the CPU feature checks are dynamically patched, meaning the check
always returns true.

We definitely need to do something to make the CPU feature checks less
prone to bugs like this, but for now the minimal fix is to use
early_cpu_has_feature().

Reported-and-tested-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Fixes: 1c0eaf0f56d6 ("powerpc/powernv: Tell OPAL about our MMU mode on POWER9")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

show more ...


# 1c0eaf0f 30-Jun-2017 Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/powernv: Tell OPAL about our MMU mode on POWER9

That will allow OPAL to configure the CPU in an optimal way.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by:

powerpc/powernv: Tell OPAL about our MMU mode on POWER9

That will allow OPAL to configure the CPU in an optimal way.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

show more ...


# 5af50993 05-Apr-2017 Benjamin Herrenschmidt <benh@kernel.crashing.org>

KVM: PPC: Book3S HV: Native usage of the XIVE interrupt controller

This patch makes KVM capable of using the XIVE interrupt controller
to provide the standard PAPR "XICS" style hypercalls. It is nec

KVM: PPC: Book3S HV: Native usage of the XIVE interrupt controller

This patch makes KVM capable of using the XIVE interrupt controller
to provide the standard PAPR "XICS" style hypercalls. It is necessary
for proper operations when the host uses XIVE natively.

This has been lightly tested on an actual system, including PCI
pass-through with a TG3 device.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[mpe: Cleanup pr_xxx(), unsplit pr_xxx() strings, etc., fix build
failures by adding KVM_XIVE which depends on KVM_XICS and XIVE, and
adding empty stubs for the kvm_xive_xxx() routines, fixup subject,
integrate fixes from Paul for building PR=y HV=n]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

show more ...


# 83c49190 26-Apr-2017 Michael Ellerman <mpe@ellerman.id.au>

powerpc/powernv: Fix missing attr initialisation in opal_export_attrs()

In opal_export_attrs() we dynamically allocate some bin_attributes. They're
allocated with kmalloc() and although we initialis

powerpc/powernv: Fix missing attr initialisation in opal_export_attrs()

In opal_export_attrs() we dynamically allocate some bin_attributes. They're
allocated with kmalloc() and although we initialise most of the fields, we don't
initialise write() or mmap(), and in particular we don't initialise the lockdep
related fields in the embedded struct attribute.

This leads to a lockdep warning at boot:

BUG: key c0000000f11906d8 not in .data!
WARNING: CPU: 0 PID: 1 at ../kernel/locking/lockdep.c:3136 lockdep_init_map+0x28c/0x2a0
...
Call Trace:
lockdep_init_map+0x288/0x2a0 (unreliable)
__kernfs_create_file+0x8c/0x170
sysfs_add_file_mode_ns+0xc8/0x240
__machine_initcall_powernv_opal_init+0x60c/0x684
do_one_initcall+0x60/0x1c0
kernel_init_freeable+0x2f4/0x3d4
kernel_init+0x24/0x160
ret_from_kernel_thread+0x5c/0xb0

Fix it by kzalloc'ing the attr, which fixes the uninitialised write() and
mmap(), and calling sysfs_bin_attr_init() on it to initialise the lockdep
fields.

Fixes: 11fe909d2362 ("powerpc/powernv: Add OPAL exports attributes to sysfs")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

show more ...


# 11fe909d 29-Mar-2017 Matt Brown <matthew.brown.dev@gmail.com>

powerpc/powernv: Add OPAL exports attributes to sysfs

New versions of OPAL have a device node /ibm,opal/firmware/exports, each
property of which describes a range of memory in OPAL that Linux might

powerpc/powernv: Add OPAL exports attributes to sysfs

New versions of OPAL have a device node /ibm,opal/firmware/exports, each
property of which describes a range of memory in OPAL that Linux might
want to export to userspace for debugging.

This patch adds a sysfs file under 'opal/exports' for each property
found there, and makes it read-only by root.

Signed-off-by: Matt Brown <matthew.brown.dev@gmail.com>
[mpe: Drop counting of props, rename to attr, free on sysfs error, c'log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

show more ...


# 63f44d65 03-Apr-2017 Michael Ellerman <mpe@ellerman.id.au>

powerpc/book3s: Print task info if we take a machine check in user mode

For an MCE (Machine Check Exception) that hits while in user mode
MSR(PR=1), print the task info to the console MCE error log.

powerpc/book3s: Print task info if we take a machine check in user mode

For an MCE (Machine Check Exception) that hits while in user mode
MSR(PR=1), print the task info to the console MCE error log. This may
help to identify an application that triggered the MCE.

After this patch the MCE console looks like:

Severe Machine check interrupt [Recovered]
NIP: [0000000010039778] PID: 762 Comm: ebizzy
Initiator: CPU
Error type: SLB [Multihit]
Effective address: 0000000010039778

Severe Machine check interrupt [Not recovered]
NIP: [0000000010039778] PID: 763 Comm: ebizzy
Initiator: CPU
Error type: UE [Page table walk ifetch]
Effective address: 0000000010039778
ebizzy[763]: unhandled signal 7 at 0000000010039778 nip 0000000010039778 lr 0000000010001b44 code 30004

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

show more ...


# 1363875b 27-Feb-2017 Nicholas Piggin <npiggin@gmail.com>

powerpc/64s: fix handling of non-synchronous machine checks

A synchronous machine check is an exception raised by the attempt to
execute the current instruction. If the error can't be corrected, it

powerpc/64s: fix handling of non-synchronous machine checks

A synchronous machine check is an exception raised by the attempt to
execute the current instruction. If the error can't be corrected, it
can make sense to SIGBUS the currently running process.

In other cases, the error condition is not related to the current
instruction, so killing the current process is not the right thing to
do.

Today, all machine checks are MCE_SEV_ERROR_SYNC, so this has no
practical change. It will be used to handle POWER9 asynchronous
machine checks.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

show more ...


# 1d0761d2 13-Dec-2016 Alistair Popple <alistair@popple.id.au>

powerpc/powernv: Initialise nest mmu

POWER9 contains an off core mmu called the nest mmu (NMMU). This is
used by other hardware units on the chip to translate virtual
addresses into real addresses.

powerpc/powernv: Initialise nest mmu

POWER9 contains an off core mmu called the nest mmu (NMMU). This is
used by other hardware units on the chip to translate virtual
addresses into real addresses. The unit attempting an address
translation provides the majority of the context required for the
translation request except for the base address of the partition table
(ie. the PTCR) which needs to be programmed into the NMMU.

This patch adds a call to OPAL to set the PTCR for the nest mmu in
opal_init().

Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

show more ...


# ffe6d810 20-Nov-2016 Paul Mackerras <paulus@ozlabs.org>

powerpc/powernv: Define real-mode versions of OPAL XICS accessors

This defines real-mode versions of opal_int_get_xirr(), opal_int_eoi()
and opal_int_set_mfrr(), for use by KVM real-mode code.

It a

powerpc/powernv: Define real-mode versions of OPAL XICS accessors

This defines real-mode versions of opal_int_get_xirr(), opal_int_eoi()
and opal_int_set_mfrr(), for use by KVM real-mode code.

It also exports opal_int_set_mfrr() so that the modular part of KVM
can use it to send IPIs.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

show more ...


Revision tags: v4.4.27, v4.7.10, openbmc-4.4-20161021-1, v4.7.9, v4.4.26, v4.7.8, v4.4.25, v4.4.24, v4.7.7, v4.8, v4.4.23, v4.7.6, v4.7.5, v4.4.22, v4.4.21, v4.7.4, v4.7.3, v4.4.20, v4.7.2, v4.4.19, openbmc-4.4-20160819-1, v4.7.1, v4.4.18
# 9e4f51bd 10-Aug-2016 Jack Miller <jack@codezen.org>

powerpc/powernv: Simplify searching for compatible device nodes

This condenses the opal node searching into a single function that finds
all compatible nodes, instead of just searching the ibm,opal

powerpc/powernv: Simplify searching for compatible device nodes

This condenses the opal node searching into a single function that finds
all compatible nodes, instead of just searching the ibm,opal children,
for ipmi, flash, and prd similar to how opal-i2c nodes are found.

Signed-off-by: Jack Miller <jack@codezen.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

show more ...


Revision tags: v4.4.17
# c74dd88e 09-Aug-2016 Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>

powerpc/book3s: Fix MCE console messages for unrecoverable MCE.

When machine check occurs with MSR(RI=0), it means MC interrupt is
unrecoverable and kernel goes down to panic path. But the console
m

powerpc/book3s: Fix MCE console messages for unrecoverable MCE.

When machine check occurs with MSR(RI=0), it means MC interrupt is
unrecoverable and kernel goes down to panic path. But the console
message still shows it as recovered. This patch fixes the MCE console
messages.

Fixes: 36df96f8acaf ("powerpc/book3s: Decode and save machine check event.")
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

show more ...


Revision tags: openbmc-4.4-20160804-1, v4.4.16, v4.7, openbmc-4.4-20160722-1, openbmc-20160722-1, openbmc-20160713-1, v4.4.15, v4.6.4
# d3cbff1b 05-Jul-2016 Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc: Put exception configuration in a common place

The various calls to establish exception endianness and AIL are
now done from a single point using already established CPU and FW
feature bits

powerpc: Put exception configuration in a common place

The various calls to establish exception endianness and AIL are
now done from a single point using already established CPU and FW
feature bits to decide what to do.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

show more ...


# a203658b 03-Jul-2016 Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/opal: Wake up kopald polling thread before waiting for events

On some environments (prototype machines, some simulators, etc...)
there is no functional interrupt source to signal completion,

powerpc/opal: Wake up kopald polling thread before waiting for events

On some environments (prototype machines, some simulators, etc...)
there is no functional interrupt source to signal completion, so
we rely on the fairly slow OPAL heartbeat.

In a number of cases, the calls complete very quickly or even
immediately. We've observed that it helps a lot to wakeup the OPAL
heartbeat thread before waiting for event in those cases, it will
call OPAL immediately to collect completions for anything that
finished fast enough.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-By: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

show more ...


# 43a1dd9b 28-Jun-2016 Suraj Jitindar Singh <sjitindarsingh@gmail.com>

powerpc/powernv: Add driver for operator panel on FSP machines

Implement new character device driver to allow access from user space
to the operator panel display present on IBM Power Systems machin

powerpc/powernv: Add driver for operator panel on FSP machines

Implement new character device driver to allow access from user space
to the operator panel display present on IBM Power Systems machines
with FSPs.

This will allow status information to be presented on the display which
is visible to a user.

The driver implements a character buffer which a user can read/write
by accessing the device (/dev/op_panel). This buffer is then displayed on
the operator panel display. Any attempt to write past the last character
position will have no effect and attempts to write more characters than
the size of the display will be truncated. The device may only be accessed
by a single process at a time.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

show more ...


Revision tags: v4.6.3, v4.4.14, v4.6.2, v4.4.13, openbmc-20160606-1, v4.6.1, v4.4.12, openbmc-20160521-1, v4.4.11, openbmc-20160518-1, v4.6, v4.4.10, openbmc-20160511-1, openbmc-20160505-1, v4.4.9, v4.4.8, v4.4.7, openbmc-20160329-2, openbmc-20160329-1, openbmc-20160321-1, v4.4.6, v4.5, v4.4.5, v4.4.4, v4.4.3, openbmc-20160222-1, v4.4.2, openbmc-20160212-1, openbmc-20160210-1
# 9b4fffa1 09-Feb-2016 Andrew Donnellan <andrew.donnellan@au1.ibm.com>

powerpc/powernv: new function to access OPAL msglog

Currently, the OPAL msglog/console buffer is exposed as a sysfs file, with
the sysfs read handler responsible for retrieving the log from the OPAL

powerpc/powernv: new function to access OPAL msglog

Currently, the OPAL msglog/console buffer is exposed as a sysfs file, with
the sysfs read handler responsible for retrieving the log from the OPAL
buffer. We'd like to be able to use it in xmon as well.

Refactor the OPAL msglog code to create a new function, opal_msglog_copy(),
that copies to an arbitrary buffer. Separate the initialisation code into
generic memcons init and sysfs file creation.

Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

show more ...


Revision tags: openbmc-20160202-2, openbmc-20160202-1, v4.4.1, openbmc-20160127-1, openbmc-20160120-1, v4.4
# dc3799bb 21-Dec-2015 Andrew Donnellan <andrew.donnellan@au1.ibm.com>

powerpc/powernv: Fix minor off-by-one error in opal_mce_check_early_recovery()

Fix off-by-one error in opal_mce_check_early_recovery() when checking
whether the NIP falls within OPAL space.

Signed-

powerpc/powernv: Fix minor off-by-one error in opal_mce_check_early_recovery()

Fix off-by-one error in opal_mce_check_early_recovery() when checking
whether the NIP falls within OPAL space.

Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

show more ...


Revision tags: openbmc-20151217-1, openbmc-20151210-1, openbmc-20151202-1
# affddff6 27-Nov-2015 Russell Currey <ruscur@russell.cc>

powerpc/powernv: Add a kmsg_dumper that flushes console output on panic

On BMC machines, console output is controlled by the OPAL firmware and is
only flushed when its pollers are called. When the

powerpc/powernv: Add a kmsg_dumper that flushes console output on panic

On BMC machines, console output is controlled by the OPAL firmware and is
only flushed when its pollers are called. When the kernel is in a panic
state, it no longer calls these pollers and thus console output does not
completely flush, causing some output from the panic to be lost.

Output is only actually lost when the kernel is configured to not power off
or reboot after panic (i.e. CONFIG_PANIC_TIMEOUT is set to 0) since OPAL
flushes the console buffer as part of its power down routines. Before this
patch, however, only partial output would be printed during the timeout wait.

This patch adds a new kmsg_dumper which gets called at panic time to ensure
panic output is not lost. It accomplishes this by calling OPAL_CONSOLE_FLUSH
in the OPAL API, and if that is not available, the pollers are called enough
times to (hopefully) completely flush the buffer.

The flushing mechanism will only affect output printed at and before the
kmsg_dump call in kernel/panic.c:panic(). As such, the "end Kernel panic"
message may still be truncated as follows:

>Call Trace:
>[c000000f1f603b00] [c0000000008e9458] dump_stack+0x90/0xbc (unreliable)
>[c000000f1f603b30] [c0000000008e7e78] panic+0xf8/0x2c4
>[c000000f1f603bc0] [c000000000be4860] mount_block_root+0x288/0x33c
>[c000000f1f603c80] [c000000000be4d14] prepare_namespace+0x1f4/0x254
>[c000000f1f603d00] [c000000000be43e8] kernel_init_freeable+0x318/0x350
>[c000000f1f603dc0] [c00000000000bd74] kernel_init+0x24/0x130
>[c000000f1f603e30] [c0000000000095b0] ret_from_kernel_thread+0x5c/0xac
>---[ end Kernel panic - not

This functionality is implemented as a kmsg_dumper as it seems to be the
most sensible way to introduce platform-specific functionality to the
panic function.

Signed-off-by: Russell Currey <ruscur@russell.cc>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

show more ...


# e4d54f71 09-Dec-2015 Stewart Smith <stewart@linux.vnet.ibm.com>

powerpc/powernv: remove FW_FEATURE_OPALv3 and just use FW_FEATURE_OPAL

Long ago, only in the lab, there was OPALv1 and OPALv2. Now there is
just OPALv3, with nobody ever expecting anything on pre-OP

powerpc/powernv: remove FW_FEATURE_OPALv3 and just use FW_FEATURE_OPAL

Long ago, only in the lab, there was OPALv1 and OPALv2. Now there is
just OPALv3, with nobody ever expecting anything on pre-OPALv3 to
be cared about or supported by mainline kernels.

So, let's remove FW_FEATURE_OPALv3 and instead use FW_FEATURE_OPAL
exclusively.

Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

show more ...


# 7261aafc 09-Dec-2015 Stewart Smith <stewart@linux.vnet.ibm.com>

powerpc/powernv: Remove OPALv2 firmware define and references

OPALv2 only ever existed in the lab and didn't escape to the world.
All OPAL systems in the wild are OPALv3.

The probability of there b

powerpc/powernv: Remove OPALv2 firmware define and references

OPALv2 only ever existed in the lab and didn't escape to the world.
All OPAL systems in the wild are OPALv3.

The probability of there being an OPALv2 system still powered on
anywhere inside IBM is approximately zero, let alone anyone
expecting to run mainline kernels.

So, start to remove references to OPALv2.

Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

show more ...


# 786842b6 09-Dec-2015 Stewart Smith <stewart@linux.vnet.ibm.com>

powerpc/powernv: panic() on OPAL < V3

The OpenPower Abstraction Layer firmware went through a couple
of iterations in the lab before being released. What we now know
as OPAL advertises itself as OPA

powerpc/powernv: panic() on OPAL < V3

The OpenPower Abstraction Layer firmware went through a couple
of iterations in the lab before being released. What we now know
as OPAL advertises itself as OPALv3.

OPALv2 and OPALv1 never made it outside the lab, and the possibility
of anyone at all ever building a mainline kernel today and expecting
it to boot on such hardware is zero.

Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

show more ...


# 98da62b7 10-Dec-2015 Stewart Smith <stewart@linux.vnet.ibm.com>

powerpc/powernv: pr_warn_once on unsupported OPAL_MSG type

When running on newer OPAL firmware that supports sending extra
OPAL_MSG types, we would print a warning on *every* message received.

This

powerpc/powernv: pr_warn_once on unsupported OPAL_MSG type

When running on newer OPAL firmware that supports sending extra
OPAL_MSG types, we would print a warning on *every* message received.

This could be a problem for kernels that don't support OPAL_MSG_OCC
on machines that are running real close to thermal limits and the
OCC is throttling the chip. For a kernel that is paying attention to
the message queue, we could get these notifications quite often.

Conceivably, future message types could also come fairly often,
and printing that we didn't understand them 10,000 times provides
no further information than printing them once.

Cc: stable@vger.kernel.org
Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

show more ...


Revision tags: openbmc-20151123-1, openbmc-20151118-1, openbmc-20151104-1, v4.3, openbmc-20151102-1, openbmc-20151028-1
# f2dd80ec 23-Sep-2015 Daniel Axtens <dja@axtens.net>

powerpc/powernv: Panic on unhandled Machine Check

All unrecovered machine check errors on PowerNV should cause an
immediate panic. There are 2 reasons that this is the right policy:
it's not safe to

powerpc/powernv: Panic on unhandled Machine Check

All unrecovered machine check errors on PowerNV should cause an
immediate panic. There are 2 reasons that this is the right policy:
it's not safe to continue, and we're already trying to reboot.

Firstly, if we go through the recovery process and do not successfully
recover, we can't be sure about the state of the machine, and it is
not safe to recover and proceed.

Linux knows about the following sources of Machine Check Errors:
- Uncorrectable Errors (UE)
- Effective - Real Address Translation (ERAT)
- Segment Lookaside Buffer (SLB)
- Translation Lookaside Buffer (TLB)
- Unknown/Unrecognised

In the SLB, TLB and ERAT cases, we can further categorise these as
parity errors, multihit errors or unknown/unrecognised.

We can handle SLB errors by flushing and reloading the SLB. We can
handle TLB and ERAT multihit errors by flushing the TLB. (It appears
we may not handle TLB and ERAT parity errors: I will investigate
further and send a followup patch if appropriate.)

This leaves us with uncorrectable errors. Uncorrectable errors are
usually the result of ECC memory detecting an error that it cannot
correct, but they also crop up in the context of PCI cards failing
during DMA writes, and during CAPI error events.

There are several types of UE, and there are 3 places a UE can occur:
Skiboot, the kernel, and userspace. For Skiboot errors, we have the
facility to make some recoverable. For userspace, we can simply kill
(SIGBUS) the affected process. We have no meaningful way to deal with
UEs in kernel space or in unrecoverable sections of Skiboot.

Currently, these unrecovered UEs fall through to
machine_check_expection() in traps.c, which calls die(), which OOPSes
and sends SIGBUS to the process. This sometimes allows us to stumble
onwards. For example we've seen UEs kill the kernel eehd and
khugepaged. However, the process killed could have held a lock, or it
could have been a more important process, etc: we can no longer make
any assertions about the state of the machine. Similarly if we see a
UE in skiboot (and again we've seen this happen), we're not in a
position where we can make any assertions about the state of the
machine.

Likewise, for unknown or unrecognised errors, we're not able to say
anything about the state of the machine.

Therefore, if we have an unrecovered MCE, the most appropriate thing
to do is to panic.

The second reason is that since e784b6499d9c ("powerpc/powernv: Invoke
opal_cec_reboot2() on unrecoverable machine check errors."), we
attempt a special OPAL reboot on an unhandled MCE. This is so the
hardware can record error data for later debugging.

The comments in that commit assert that we are heading down the panic
path anyway. At the moment this is not always true. With UEs in kernel
space, for instance, they are marked as recoverable by the hardware,
so if the attempt to reboot failed (e.g. old Skiboot), we wouldn't
panic() but would simply die() and OOPS. It doesn't make sense to be
staggering on if we've just tried to reboot: we should panic().

Explicitly panic() on unrecovered MCEs on PowerNV.
Update the comments appropriately.

This fixes some hangs following EEH events on cxlflash setups.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reviewed-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

show more ...


12345678910>>...12