Revision tags: v4.12 |
|
#
a0db971e |
| 18-Jun-2017 |
Frederic Weisbecker <fweisbec@gmail.com> |
nohz: Move idle balancer registration to the idle path
The idle load balancing registration path assumes that we only stop the tick when the CPU is idle, ignoring the nohz full case. As a result, a
nohz: Move idle balancer registration to the idle path
The idle load balancing registration path assumes that we only stop the tick when the CPU is idle, ignoring the nohz full case. As a result, a nohz full CPU that is running a task may be chosen to perform idle load balancing.
Lets make sure that only CPUs in dynticks idle mode can be picked as idle load balancers.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Rik van Riel <riel@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1497838322-10913-3-git-send-email-fweisbec@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
show more ...
|
#
3c85d6db |
| 18-Jun-2017 |
Frederic Weisbecker <fweisbec@gmail.com> |
sched/loadavg: Generalize "_idle" naming to "_nohz"
The loadavg naming code still assumes that nohz == idle whereas its code is actually handling well both nohz idle and nohz full.
So lets fix the
sched/loadavg: Generalize "_idle" naming to "_nohz"
The loadavg naming code still assumes that nohz == idle whereas its code is actually handling well both nohz idle and nohz full.
So lets fix the naming according to what the code actually does, to unconfuse the reader.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Rik van Riel <riel@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1497838322-10913-2-git-send-email-fweisbec@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
show more ...
|
#
d4af6d93 |
| 12-Jun-2017 |
Frederic Weisbecker <fweisbec@gmail.com> |
nohz: Fix spurious warning when hrtimer and clockevent get out of sync
The sanity check ensuring that the tick expiry cache (ts->next_tick) is actually in sync with the hardware clock (dev->next_eve
nohz: Fix spurious warning when hrtimer and clockevent get out of sync
The sanity check ensuring that the tick expiry cache (ts->next_tick) is actually in sync with the hardware clock (dev->next_event) makes the wrong assumption that the clock can't be programmed later than the hrtimer deadline.
In fact the clock hardware can be programmed later on some conditions such as:
* The hrtimer deadline is already in the past. * The hrtimer deadline is earlier than the minimum delay supported by the hardware.
Such conditions can be met when we program the tick, for example if the last jiffies update hasn't been seen by the current CPU yet, we may program the hrtimer to a deadline that is earlier than ktime_get() because last_jiffies_update is our timestamp base to compute the next tick.
As a result, we can randomly observe such warning:
WARNING: CPU: 5 PID: 0 at kernel/time/tick-sched.c:794 tick_nohz_stop_sched_tick kernel/time/tick-sched.c:791 [inline] Call Trace: tick_nohz_irq_exit tick_irq_exit irq_exit exiting_irq smp_call_function_interrupt smp_call_function_single_interrupt call_function_single_interrupt
Therefore, let's rather make sure that the tick expiry cache is sync'ed with the tick hrtimer deadline, against which it is not supposed to drift away. The clock hardware instead has its own will and can't be used as a reliable comparison point.
Reported-and-tested-by: Sasha Levin <alexander.levin@verizon.com> Reported-and-tested-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: James Hartsock <hartsjc@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tim Wright <tim@binbash.co.uk> Link: http://lkml.kernel.org/r/1497326654-14122-1-git-send-email-fweisbec@gmail.com [ Minor readability edit. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
show more ...
|
#
f99973e1 |
| 01-Jun-2017 |
Frederic Weisbecker <fweisbec@gmail.com> |
nohz: Fix buggy tick delay on IRQ storms
When the tick is stopped and we reach the dynticks evaluation code on IRQ exit, we perform a soft tick restart if we observe an expired timer from there. It
nohz: Fix buggy tick delay on IRQ storms
When the tick is stopped and we reach the dynticks evaluation code on IRQ exit, we perform a soft tick restart if we observe an expired timer from there. It means we program the nearest possible tick but we stay in dynticks mode (ts->tick_stopped = 1) because we may need to stop the tick again after that expired timer is handled.
Now this solution works most of the time but if we suffer an IRQ storm and those interrupts trigger faster than the hardware clockevents min delay, our tick won't fire until that IRQ storm is finished.
Here is the problem: on IRQ exit we reprog the timer to at least NOW() + min_clockevents_delay. Another IRQ fires before the tick so we reschedule again to NOW() + min_clockevents_delay, etc... The tick is eternally rescheduled min_clockevents_delay ahead.
A solution is to simply remove this soft tick restart. After all the normal dynticks evaluation path can handle 0 delay just fine. And by doing that we benefit from the optimization branch which avoids clock reprogramming if the clockevents deadline hasn't changed since the last reprog. This fixes our issue because we don't do repetitive clock reprog that always add hardware min delay.
As a side effect it should even optimize the 0 delay path in general.
Reported-and-tested-by: Octavian Purdila <octavian.purdila@nxp.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1496328429-13317-1-git-send-email-fweisbec@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
show more ...
|
Revision tags: v4.10.17 |
|
#
7c259045 |
| 15-May-2017 |
Frederic Weisbecker <fweisbec@gmail.com> |
nohz: Reset next_tick cache even when the timer has no regs
Handle tick interrupts whose regs are NULL, out of general paranoia. It happens when hrtimer_interrupt() is called from non-interrupt cont
nohz: Reset next_tick cache even when the timer has no regs
Handle tick interrupts whose regs are NULL, out of general paranoia. It happens when hrtimer_interrupt() is called from non-interrupt contexts, such as hotplug CPU down events.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
show more ...
|
Revision tags: v4.10.16, v4.10.15, v4.10.14, v4.10.13 |
|
#
411fe24e |
| 21-Apr-2017 |
Frederic Weisbecker <fweisbec@gmail.com> |
nohz: Fix collision between tick and other hrtimers, again
This restores commit:
24b91e360ef5: ("nohz: Fix collision between tick and other hrtimers")
... which got reverted by commit:
558e8e
nohz: Fix collision between tick and other hrtimers, again
This restores commit:
24b91e360ef5: ("nohz: Fix collision between tick and other hrtimers")
... which got reverted by commit:
558e8e27e73f: ('Revert "nohz: Fix collision between tick and other hrtimers"')
... due to a regression where CPUs spuriously stopped ticking.
The bug happened when a tick fired too early past its expected expiration: on IRQ exit the tick was scheduled again to the same deadline but skipped reprogramming because ts->next_tick still kept in cache the deadline. This has been fixed now with resetting ts->next_tick from the tick itself. Extra care has also been taken to prevent from obsolete values throughout CPU hotplug operations.
When the tick is stopped and an interrupt occurs afterward, we check on that interrupt exit if the next tick needs to be rescheduled. If it doesn't need any update, we don't want to do anything.
In order to check if the tick needs an update, we compare it against the clockevent device deadline. Now that's a problem because the clockevent device is at a lower level than the tick itself if it is implemented on top of hrtimer.
Every hrtimer share this clockevent device. So comparing the next tick deadline against the clockevent device deadline is wrong because the device may be programmed for another hrtimer whose deadline collides with the tick. As a result we may end up not reprogramming the tick accidentally.
In a worst case scenario under full dynticks mode, the tick stops firing as it is supposed to every 1hz, leaving /proc/stat stalled:
Task in a full dynticks CPU ----------------------------
* hrtimer A is queued 2 seconds ahead * the tick is stopped, scheduled 1 second ahead * tick fires 1 second later * on tick exit, nohz schedules the tick 1 second ahead but sees the clockevent device is already programmed to that deadline, fooled by hrtimer A, the tick isn't rescheduled. * hrtimer A is cancelled before its deadline * tick never fires again until an interrupt happens...
In order to fix this, store the next tick deadline to the tick_sched local structure and reuse that value later to check whether we need to reprogram the clock after an interrupt.
On the other hand, ts->sleep_length still wants to know about the next clock event and not just the tick, so we want to improve the related comment to avoid confusion.
Reported-and-tested-by: Tim Wright <tim@binbash.co.uk> Reported-and-tested-by: Pavel Machek <pavel@ucw.cz> Reported-by: James Hartsock <hartsjc@redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Rik van Riel <riel@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1492783255-5051-2-git-send-email-fweisbec@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
show more ...
|
#
ce6cf9a1 |
| 11-May-2017 |
Frederic Weisbecker <fweisbec@gmail.com> |
nohz: Add hrtimer sanity check
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
ac1e843f |
| 21-Apr-2017 |
Peter Zijlstra <peterz@infradead.org> |
sched/clock: Remove unused argument to sched_clock_idle_wakeup_event()
The argument to sched_clock_idle_wakeup_event() has not been used in a long time. Remove it.
Signed-off-by: Peter Zijlstra (In
sched/clock: Remove unused argument to sched_clock_idle_wakeup_event()
The argument to sched_clock_idle_wakeup_event() has not been used in a long time. Remove it.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
show more ...
|
Revision tags: v4.10.12, v4.10.11, v4.10.10, v4.10.9, v4.10.8, v4.10.7, v4.10.6, v4.10.5 |
|
#
b7eaf1aa |
| 21-Mar-2017 |
Rafael J. Wysocki <rafael.j.wysocki@intel.com> |
cpufreq: schedutil: Avoid reducing frequency of busy CPUs prematurely
The way the schedutil governor uses the PELT metric causes it to underestimate the CPU utilization in some cases.
That can be e
cpufreq: schedutil: Avoid reducing frequency of busy CPUs prematurely
The way the schedutil governor uses the PELT metric causes it to underestimate the CPU utilization in some cases.
That can be easily demonstrated by running kernel compilation on a Sandy Bridge Intel processor, running turbostat in parallel with it and looking at the values written to the MSR_IA32_PERF_CTL register. Namely, the expected result would be that when all CPUs were 100% busy, all of them would be requested to run in the maximum P-state, but observation shows that this clearly isn't the case. The CPUs run in the maximum P-state for a while and then are requested to run slower and go back to the maximum P-state after a while again. That causes the actual frequency of the processor to visibly oscillate below the sustainable maximum in a jittery fashion which clearly is not desirable.
That has been attributed to CPU utilization metric updates on task migration that cause the total utilization value for the CPU to be reduced by the utilization of the migrated task. If that happens, the schedutil governor may see a CPU utilization reduction and will attempt to reduce the CPU frequency accordingly right away. That may be premature, though, for example if the system is generally busy and there are other runnable tasks waiting to be run on that CPU already.
This is unlikely to be an issue on systems where cpufreq policies are shared between multiple CPUs, because in those cases the policy utilization is computed as the maximum of the CPU utilization values over the whole policy and if that turns out to be low, reducing the frequency for the policy most likely is a good idea anyway. On systems with one CPU per policy, however, it may affect performance adversely and even lead to increased energy consumption in some cases.
On those systems it may be addressed by taking another utilization metric into consideration, like whether or not the CPU whose frequency is about to be reduced has been idle recently, because if that's not the case, the CPU is likely to be busy in the near future and its frequency should not be reduced.
To that end, use the counter of idle calls in the timekeeping code. Namely, make the schedutil governor look at that counter for the current CPU every time before its frequency is about to be reduced. If the counter has not changed since the previous iteration of the governor computations for that CPU, the CPU has been busy for all that time and its frequency should not be decreased, so if the new frequency would be lower than the one set previously, the governor will skip the frequency update.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Joel Fernandes <joelaf@google.com>
show more ...
|
Revision tags: v4.10.4, v4.10.3, v4.10.2, v4.10.1, v4.10 |
|
#
370c9135 |
| 08-Feb-2017 |
Ingo Molnar <mingo@kernel.org> |
sched/headers: Prepare for new header dependencies before moving code to <linux/sched/nohz.h>
We are going to split <linux/sched/nohz.h> out of <linux/sched.h>, which will have to be picked up from
sched/headers: Prepare for new header dependencies before moving code to <linux/sched/nohz.h>
We are going to split <linux/sched/nohz.h> out of <linux/sched.h>, which will have to be picked up from other headers and a couple of .c files.
Create a trivial placeholder <linux/sched/nohz.h> file that just maps to <linux/sched.h> to make this patch obviously correct and bisectable.
Include the new header in the files that are going to need it.
Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
show more ...
|
#
03441a34 |
| 08-Feb-2017 |
Ingo Molnar <mingo@kernel.org> |
sched/headers: Prepare for new header dependencies before moving code to <linux/sched/stat.h>
We are going to split <linux/sched/stat.h> out of <linux/sched.h>, which will have to be picked up from
sched/headers: Prepare for new header dependencies before moving code to <linux/sched/stat.h>
We are going to split <linux/sched/stat.h> out of <linux/sched.h>, which will have to be picked up from other headers and a couple of .c files.
Create a trivial placeholder <linux/sched/stat.h> file that just maps to <linux/sched.h> to make this patch obviously correct and bisectable.
Include the new header in the files that are going to need it.
Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
show more ...
|
#
38b8d208 |
| 08-Feb-2017 |
Ingo Molnar <mingo@kernel.org> |
sched/headers: Prepare for new header dependencies before moving code to <linux/sched/nmi.h>
We are going to move softlockup APIs out of <linux/sched.h>, which will have to be picked up from other h
sched/headers: Prepare for new header dependencies before moving code to <linux/sched/nmi.h>
We are going to move softlockup APIs out of <linux/sched.h>, which will have to be picked up from other headers and a couple of .c files.
<linux/nmi.h> already includes <linux/sched.h>.
Include the <linux/nmi.h> header in the files that are going to need it.
Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
show more ...
|
#
3f07c014 |
| 08-Feb-2017 |
Ingo Molnar <mingo@kernel.org> |
sched/headers: Prepare for new header dependencies before moving code to <linux/sched/signal.h>
We are going to split <linux/sched/signal.h> out of <linux/sched.h>, which will have to be picked up f
sched/headers: Prepare for new header dependencies before moving code to <linux/sched/signal.h>
We are going to split <linux/sched/signal.h> out of <linux/sched.h>, which will have to be picked up from other headers and a couple of .c files.
Create a trivial placeholder <linux/sched/signal.h> file that just maps to <linux/sched.h> to make this patch obviously correct and bisectable.
Include the new header in the files that are going to need it.
Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
show more ...
|
#
e6017571 |
| 01-Feb-2017 |
Ingo Molnar <mingo@kernel.org> |
sched/headers: Prepare for new header dependencies before moving code to <linux/sched/clock.h>
We are going to split <linux/sched/clock.h> out of <linux/sched.h>, which will have to be picked up fro
sched/headers: Prepare for new header dependencies before moving code to <linux/sched/clock.h>
We are going to split <linux/sched/clock.h> out of <linux/sched.h>, which will have to be picked up from other headers and .c files.
Create a trivial placeholder <linux/sched/clock.h> file that just maps to <linux/sched.h> to make this patch obviously correct and bisectable.
Include the new header in the files that are going to need it.
Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
show more ...
|
#
558e8e27 |
| 16-Feb-2017 |
Linus Torvalds <torvalds@linux-foundation.org> |
Revert "nohz: Fix collision between tick and other hrtimers"
This reverts commit 24b91e360ef521a2808771633d76ebc68bd5604b and commit 7bdb59f1ad47 ("tick/nohz: Fix possible missing clock reprog after
Revert "nohz: Fix collision between tick and other hrtimers"
This reverts commit 24b91e360ef521a2808771633d76ebc68bd5604b and commit 7bdb59f1ad47 ("tick/nohz: Fix possible missing clock reprog after tick soft restart") that depends on it,
Pavel reports that it causes occasional boot hangs for him that seem to depend on just how the machine was booted. In particular, his machine hangs at around the PCI fixups of the EHCI USB host controller, but only hangs from cold boot, not from a warm boot.
Thomas Gleixner suspecs it's a CPU hotplug interaction, particularly since Pavel also saw suspend/resume issues that seem to be related. We're reverting for now while trying to figure out the root cause.
Reported-bisected-and-tested-by: Pavel Machek <pavel@ucw.cz> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Wanpeng Li <wanpeng.li@hotmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@kernel.org # reverted commits were marked for stable Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
#
7bdb59f1 |
| 07-Feb-2017 |
Frederic Weisbecker <fweisbec@gmail.com> |
tick/nohz: Fix possible missing clock reprog after tick soft restart
ts->next_tick keeps track of the next tick deadline in order to optimize clock programmation on irq exit and avoid redundant cloc
tick/nohz: Fix possible missing clock reprog after tick soft restart
ts->next_tick keeps track of the next tick deadline in order to optimize clock programmation on irq exit and avoid redundant clock device writes.
Now if ts->next_tick missed an update, we may spuriously miss a clock reprog later as the nohz code is fooled by an obsolete next_tick value.
This is what happens here on a specific path: when we observe an expired timer from the nohz update code on irq exit, we perform a soft tick restart which simply fires the closest possible tick without actually exiting the nohz mode and restoring a periodic state. But we forget to update ts->next_tick accordingly.
As a result, after the next tick resulting from such soft tick restart, the nohz code sees a stale value on ts->next_tick which doesn't match the clock deadline that just expired. If that obsolete ts->next_tick value happens to collide with the actual next tick deadline to be scheduled, we may spuriously bypass the clock reprogramming. In the worst case, the tick may never fire again.
Fix this with a ts->next_tick reset on soft tick restart.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Reviewed: Wanpeng Li <wanpeng.li@hotmail.com> Acked-by: Rik van Riel <riel@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1486485894-29173-1-git-send-email-fweisbec@gmail.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
show more ...
|
#
24b91e36 |
| 04-Jan-2017 |
Frederic Weisbecker <fweisbec@gmail.com> |
nohz: Fix collision between tick and other hrtimers
When the tick is stopped and an interrupt occurs afterward, we check on that interrupt exit if the next tick needs to be rescheduled. If it doesn'
nohz: Fix collision between tick and other hrtimers
When the tick is stopped and an interrupt occurs afterward, we check on that interrupt exit if the next tick needs to be rescheduled. If it doesn't need any update, we don't want to do anything.
In order to check if the tick needs an update, we compare it against the clockevent device deadline. Now that's a problem because the clockevent device is at a lower level than the tick itself if it is implemented on top of hrtimer.
Every hrtimer share this clockevent device. So comparing the next tick deadline against the clockevent device deadline is wrong because the device may be programmed for another hrtimer whose deadline collides with the tick. As a result we may end up not reprogramming the tick accidentally.
In a worst case scenario under full dynticks mode, the tick stops firing as it is supposed to every 1hz, leaving /proc/stat stalled:
Task in a full dynticks CPU ----------------------------
* hrtimer A is queued 2 seconds ahead * the tick is stopped, scheduled 1 second ahead * tick fires 1 second later * on tick exit, nohz schedules the tick 1 second ahead but sees the clockevent device is already programmed to that deadline, fooled by hrtimer A, the tick isn't rescheduled. * hrtimer A is cancelled before its deadline * tick never fires again until an interrupt happens...
In order to fix this, store the next tick deadline to the tick_sched local structure and reuse that value later to check whether we need to reprogram the clock after an interrupt.
On the other hand, ts->sleep_length still wants to know about the next clock event and not just the tick, so we want to improve the related comment to avoid confusion.
Reported-by: James Hartsock <hartsjc@redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Reviewed-by: Wanpeng Li <wanpeng.li@hotmail.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Rik van Riel <riel@redhat.com> Link: http://lkml.kernel.org/r/1483539124-5693-1-git-send-email-fweisbec@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
show more ...
|
#
2456e855 |
| 25-Dec-2016 |
Thomas Gleixner <tglx@linutronix.de> |
ktime: Get rid of the union
ktime is a union because the initial implementation stored the time in scalar nanoseconds on 64 bit machine and in a endianess optimized timespec variant for 32bit machin
ktime: Get rid of the union
ktime is a union because the initial implementation stored the time in scalar nanoseconds on 64 bit machine and in a endianess optimized timespec variant for 32bit machines. The Y2038 cleanup removed the timespec variant and switched everything to scalar nanoseconds. The union remained, but become completely pointless.
Get rid of the union and just keep ktime_t as simple typedef of type s64.
The conversion was done with coccinelle and some manual mopping up.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org>
show more ...
|
Revision tags: v4.9, openbmc-4.4-20161121-1, v4.4.33 |
|
#
31eff243 |
| 17-Nov-2016 |
Sebastian Andrzej Siewior <bigeasy@linutronix.de> |
sched/nohz: Convert to hotplug state machine
Install the callbacks via the state machine.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: rt@linuxtronix.de Link: http://lkml.ke
sched/nohz: Convert to hotplug state machine
Install the callbacks via the state machine.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: rt@linuxtronix.de Link: http://lkml.kernel.org/r/20161117183541.8588-14-bigeasy@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
show more ...
|
Revision tags: v4.4.32, v4.4.31, v4.4.30, v4.4.29, v4.4.28, v4.4.27, v4.7.10, openbmc-4.4-20161021-1, v4.7.9, v4.4.26, v4.7.8, v4.4.25, v4.4.24, v4.7.7, v4.8, v4.4.23, v4.7.6, v4.7.5, v4.4.22, v4.4.21, v4.7.4 |
|
#
57ccdf44 |
| 07-Sep-2016 |
Wanpeng Li <wanpeng.li@hotmail.com> |
tick/nohz: Prevent stopping the tick on an offline CPU
can_stop_full_tick() has no check for offline cpus. So it allows to stop the tick on an offline cpu from the interrupt return path, which is wr
tick/nohz: Prevent stopping the tick on an offline CPU
can_stop_full_tick() has no check for offline cpus. So it allows to stop the tick on an offline cpu from the interrupt return path, which is wrong and subsequently makes irq_work_needs_cpu() warn about being called for an offline cpu.
Commit f7ea0fd639c2c4 ("tick: Don't invoke tick_nohz_stop_sched_tick() if the cpu is offline") added prevention for can_stop_idle_tick(), but forgot to do the same in can_stop_full_tick(). Add it.
[ tglx: Massaged changelog ]
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/1473245473-4463-1-git-send-email-wanpeng.li@hotmail.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
show more ...
|
Revision tags: v4.7.3, v4.4.20 |
|
#
08d07259 |
| 02-Sep-2016 |
Wanpeng Li <wanpeng.li@hotmail.com> |
tick/nohz: Fix softlockup on scheduler stalls in kvm guest
tick_nohz_start_idle() is prevented to be called if the idle tick can't be stopped since commit 1f3b0f8243cb934 ("tick/nohz: Optimize nohz
tick/nohz: Fix softlockup on scheduler stalls in kvm guest
tick_nohz_start_idle() is prevented to be called if the idle tick can't be stopped since commit 1f3b0f8243cb934 ("tick/nohz: Optimize nohz idle enter"). As a result, after suspend/resume the host machine, full dynticks kvm guest will softlockup:
NMI watchdog: BUG: soft lockup - CPU#0 stuck for 26s! [swapper/0:0] Call Trace: default_idle+0x31/0x1a0 arch_cpu_idle+0xf/0x20 default_idle_call+0x2a/0x50 cpu_startup_entry+0x39b/0x4d0 rest_init+0x138/0x140 ? rest_init+0x5/0x140 start_kernel+0x4c1/0x4ce ? set_init_arg+0x55/0x55 ? early_idt_handler_array+0x120/0x120 x86_64_start_reservations+0x24/0x26 x86_64_start_kernel+0x142/0x14f
In addition, cat /proc/stat | grep cpu in guest or host:
cpu 398 16 5049 15754 5490 0 1 46 0 0 cpu0 206 5 450 0 0 0 1 14 0 0 cpu1 81 0 3937 3149 1514 0 0 9 0 0 cpu2 45 6 332 6052 2243 0 0 11 0 0 cpu3 65 2 328 6552 1732 0 0 11 0 0
The idle and iowait states are weird 0 for cpu0(housekeeping).
The bug is present in both guest and host kernels, and they both have cpu0's idle and iowait states issue, however, host kernel's suspend/resume path etc will touch watchdog to avoid the softlockup.
- The watchdog will not be touched in tick_nohz_stop_idle path (need be touched since the scheduler stall is expected) if idle_active flags are not detected. - The idle and iowait states will not be accounted when exit idle loop (resched or interrupt) if idle start time and idle_active flags are not set.
This patch fixes it by reverting commit 1f3b0f8243cb934 since can't stop idle tick doesn't mean can't be idle.
Fixes: 1f3b0f8243cb934 ("tick/nohz: Optimize nohz idle enter") Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Cc: Sanjeev Yadav<sanjeev.yadav@spreadtrum.com> Cc: Gaurav Jindal<gaurav.jindal@spreadtrum.com> Cc: stable@vger.kernel.org Cc: kvm@vger.kernel.org Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: http://lkml.kernel.org/r/1472798303-4154-1-git-send-email-wanpeng.li@hotmail.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
show more ...
|
Revision tags: v4.7.2, v4.4.19, openbmc-4.4-20160819-1, v4.7.1, v4.4.18, v4.4.17, openbmc-4.4-20160804-1, v4.4.16, v4.7, openbmc-4.4-20160722-1, openbmc-20160722-1 |
|
#
1f3b0f82 |
| 14-Jul-2016 |
Gaurav Jindal <Gaurav.Jindal@spreadtrum.com> |
tick/nohz: Optimize nohz idle enter
tick_nohz_start_idle is called before checking whether the idle tick can be stopped. If the tick cannot be stopped, calling tick_nohz_start_idle() is pointless an
tick/nohz: Optimize nohz idle enter
tick_nohz_start_idle is called before checking whether the idle tick can be stopped. If the tick cannot be stopped, calling tick_nohz_start_idle() is pointless and just wasting CPU cycles.
Only invoke tick_nohz_start_idle() when can_stop_idle_tick() returns true. A short one minute observation of the effect on ARM64 shows a reduction of calls by 1.5% thus optimizing the idle entry sequence.
[tglx: Massaged changelog ]
Co-developed-by: Sanjeev Yadav<sanjeev.yadav@spreadtrum.com> Signed-off-by: Gaurav Jindal<gaurav.jindal@spreadtrum.com> Link: http://lkml.kernel.org/r/20160714120416.GB21099@gaurav.jindal@spreadtrum.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
show more ...
|
Revision tags: openbmc-20160713-1, v4.4.15, v4.6.4 |
|
#
a683f390 |
| 04-Jul-2016 |
Thomas Gleixner <tglx@linutronix.de> |
timers: Forward the wheel clock whenever possible
The wheel clock is stale when a CPU goes into a long idle sleep. This has the side effect that timers which are queued end up in the outer wheel lev
timers: Forward the wheel clock whenever possible
The wheel clock is stale when a CPU goes into a long idle sleep. This has the side effect that timers which are queued end up in the outer wheel levels. That results in coarser granularity.
To solve this, we keep track of the idle state and forward the wheel clock whenever possible.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Chris Mason <clm@fb.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: George Spelvin <linux@sciencehorizons.net> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20160704094342.512039360@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
show more ...
|
#
ff006732 |
| 04-Jul-2016 |
Thomas Gleixner <tglx@linutronix.de> |
timers/nohz: Remove pointless tick_nohz_kick_tick() function
This was a failed attempt to optimize the timer expiry in idle, which was disabled and never revisited. Remove the cruft.
Signed-off-by:
timers/nohz: Remove pointless tick_nohz_kick_tick() function
This was a failed attempt to optimize the timer expiry in idle, which was disabled and never revisited. Remove the cruft.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Chris Mason <clm@fb.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: George Spelvin <linux@sciencehorizons.net> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: rt@linutronix.de Link: http://lkml.kernel.org/r/20160704094342.431073782@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org>
show more ...
|
#
0de7611a |
| 01-Jul-2016 |
Ingo Molnar <mingo@kernel.org> |
timers/nohz: Capitalize 'CPU' consistently
While reviewing another patch I noticed that kernel/time/tick-sched.c had a charmingly (confusingly, annoyingly) rich set of variants for spelling 'CPU':
timers/nohz: Capitalize 'CPU' consistently
While reviewing another patch I noticed that kernel/time/tick-sched.c had a charmingly (confusingly, annoyingly) rich set of variants for spelling 'CPU':
cpu cpus CPU CPUs per CPU per-CPU per cpu
... sometimes these were mixed even within the same comment block!
Compress these variants down to a single consistent set of:
CPU CPUs per-CPU
Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
show more ...
|