#
351f181f |
| 24-Oct-2012 |
Chuansheng Liu <chuansheng.liu@intel.com> |
timers, sched: Correct the comments for tick_sched_timer()
In the comments of function tick_sched_timer(), the sentence "timer->base->cpu_base->lock held" is not right.
In function __run_hrtimer(),
timers, sched: Correct the comments for tick_sched_timer()
In the comments of function tick_sched_timer(), the sentence "timer->base->cpu_base->lock held" is not right.
In function __run_hrtimer(), before call timer->function(), the cpu_base->lock has been unlocked.
Signed-off-by: liu chuansheng <chuansheng.liu@intel.com> Cc: fei.li@intel.com Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1351098455.15558.1421.camel@cliu38-desktop-build Signed-off-by: Ingo Molnar <mingo@kernel.org>
show more ...
|
#
94a57140 |
| 15-Oct-2012 |
Frederic Weisbecker <fweisbec@gmail.com> |
tick: Conditionally build nohz specific code in tick handler
This optimize a bit the high res tick sched handler.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Thomas Gleixner <tglx@l
tick: Conditionally build nohz specific code in tick handler
This optimize a bit the high res tick sched handler.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org>
show more ...
|
#
9e8f559b |
| 14-Oct-2012 |
Frederic Weisbecker <fweisbec@gmail.com> |
tick: Consolidate tick handling for high and low res handlers
Besides unifying code, this also adds the idle check before processing idle accounting specifics on the low res handler. This way we als
tick: Consolidate tick handling for high and low res handlers
Besides unifying code, this also adds the idle check before processing idle accounting specifics on the low res handler. This way we also generalize this part of the nohz code for !CONFIG_HIGH_RES_TIMERS to prepare for the adaptive tickless features.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org>
show more ...
|
#
5bb96226 |
| 14-Oct-2012 |
Frederic Weisbecker <fweisbec@gmail.com> |
tick: Consolidate timekeeping handling code
Unify the duplicated timekeeping handling code of low and high res tick sched handlers.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Thoma
tick: Consolidate timekeeping handling code
Unify the duplicated timekeeping handling code of low and high res tick sched handlers.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org>
show more ...
|
#
2b17c545 |
| 03-Oct-2012 |
Frederic Weisbecker <fweisbec@gmail.com> |
nohz: Fix one jiffy count too far in idle cputime
When we stop the tick in idle, we save the current jiffies value in ts->idle_jiffies. This snapshot is substracted from the later value of jiffies w
nohz: Fix one jiffy count too far in idle cputime
When we stop the tick in idle, we save the current jiffies value in ts->idle_jiffies. This snapshot is substracted from the later value of jiffies when the tick is restarted and the resulting delta is accounted as idle cputime. This is how we handle the idle cputime accounting without the tick.
But sometimes we need to schedule the next tick to some time in the future instead of completely stopping it. In this case, a tick may happen before we restart the periodic behaviour and from that tick we account one jiffy to idle cputime as usual but we also increment the ts->idle_jiffies snapshot by one so that when we compute the delta to account, we substract the one jiffy we just accounted.
To prepare for stopping the tick outside idle, we introduced a check that prevents from fixing up that ts->idle_jiffies if we are not running the idle task. But we use idle_cpu() for that and this is a problem if we run the tick while another CPU remotely enqueues a ttwu to our runqueue:
CPU 0: CPU 1:
tick_sched_timer() { ttwu_queue_remote() if (idle_cpu(CPU 0)) ts->idle_jiffies++; }
Here, idle_cpu() notes that &rq->wake_list is not empty and hence won't consider the CPU as idle. As a result, ts->idle_jiffies won't be incremented. But this is wrong because we actually account the current jiffy to idle cputime. And that jiffy won't get substracted from the nohz time delta. So in the end, this jiffy is accounted twice.
Fix this by changing idle_cpu(smp_processor_id()) with is_idle_task(current). This way the jiffy is substracted correctly even if a ttwu operation is enqueued on the CPU.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: <stable@vger.kernel.org> # 3.5+ Link: http://lkml.kernel.org/r/1349308004-3482-1-git-send-email-fweisbec@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
show more ...
|
#
803b0eba |
| 23-Aug-2012 |
Paul E. McKenney <paul.mckenney@linaro.org> |
time: RCU permitted to stop idle entry via softirq
The can_stop_idle_tick() function complains if a softirq vector is raised too late in the idle-entry process, presumably in order to prevent dangli
time: RCU permitted to stop idle entry via softirq
The can_stop_idle_tick() function complains if a softirq vector is raised too late in the idle-entry process, presumably in order to prevent dangling softirq invocations from being delayed across the full idle period, which might be indefinitely long -- and if softirq was asserted any later than the call to this function, such a delay might well happen.
However, RCU needs to be able to use softirq to stop idle entry in order to be able to drain RCU callbacks from the current CPU, which in turn enables faster entry into dyntick-idle mode, which in turn reduces power consumption. Because RCU takes this action at a well-defined point in the idle-entry path, it is safe for RCU to take this approach.
This commit therefore silences the error message that is sometimes produced when the going-idle CPU suddenly finds that it has an RCU_SOFTIRQ to process. The error message will continue to be issued for other softirq vectors.
Reported-by: Sedat Dilek <sedat.dilek@gmail.com> Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
show more ...
|
#
c1cc017c |
| 10-Sep-2012 |
Alex Shi <alex.shi@intel.com> |
sched/nohz: Clean up select_nohz_load_balancer()
There is no load_balancer to be selected now. It just sets the state of the nohz tick to stop.
So rename the function, pass the 'cpu' as a parameter
sched/nohz: Clean up select_nohz_load_balancer()
There is no load_balancer to be selected now. It just sets the state of the nohz tick to stop.
So rename the function, pass the 'cpu' as a parameter and then remove the useless call from tick_nohz_restart_sched_tick().
[ s/set_nohz_tick_stopped/nohz_balance_enter_idle/g s/clear_nohz_tick_stopped/nohz_balance_exit_idle/g ] Signed-off-by: Alex Shi <alex.shi@intel.com> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Venkatesh Pallipadi <venki@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1347261059-24747-1-git-send-email-alex.shi@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
show more ...
|
#
749c8814 |
| 20-Aug-2012 |
Charles Wang <muming.wq@taobao.com> |
sched: Add missing call to calc_load_exit_idle()
Azat Khuzhin reported high loadavg in Linux v3.6
After checking the upstream scheduler code, I found Peter's commit:
5167e8d5417b sched/nohz: Rew
sched: Add missing call to calc_load_exit_idle()
Azat Khuzhin reported high loadavg in Linux v3.6
After checking the upstream scheduler code, I found Peter's commit:
5167e8d5417b sched/nohz: Rewrite and fix load-avg computation -- again
not fully applied, missing the call to calc_load_exit_idle().
After that idle exit in sampling window will always be calculated to non-idle, and the load will be higher than normal.
This patch adds the missing call to calc_load_exit_idle().
Signed-off-by: Charles Wang <muming.wq@taobao.com> Cc: stable@kernel.org Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1345449754-27130-1-git-send-email-muming.wq@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
show more ...
|
#
5167e8d5 |
| 22-Jun-2012 |
Peter Zijlstra <a.p.zijlstra@chello.nl> |
sched/nohz: Rewrite and fix load-avg computation -- again
Thanks to Charles Wang for spotting the defects in the current code:
- If we go idle during the sample window -- after sampling, we get a
sched/nohz: Rewrite and fix load-avg computation -- again
Thanks to Charles Wang for spotting the defects in the current code:
- If we go idle during the sample window -- after sampling, we get a negative bias because we can negate our own sample.
- If we wake up during the sample window we get a positive bias because we push the sample to a known active period.
So rewrite the entire nohz load-avg muck once again, now adding copious documentation to the code.
Reported-and-tested-by: Doug Smythies <dsmythies@telus.net> Reported-and-tested-by: Charles Wang <muming.wq@gmail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: stable@kernel.org Link: http://lkml.kernel.org/r/1340373782.18025.74.camel@twins [ minor edits ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
show more ...
|
#
9d2ad243 |
| 24-Jun-2012 |
Paul E. McKenney <paul.mckenney@linaro.org> |
rcu: Make RCU_FAST_NO_HZ respect nohz= boot parameter
If the nohz= boot parameter disables nohz, then RCU_FAST_NO_HZ needs to also disable itself. This commit therefore checks for tick_nohz_enabled
rcu: Make RCU_FAST_NO_HZ respect nohz= boot parameter
If the nohz= boot parameter disables nohz, then RCU_FAST_NO_HZ needs to also disable itself. This commit therefore checks for tick_nohz_enabled being zero, disabling rcu_prepare_for_idle() if so. This commit assumes that tick_nohz_enabled can change at runtime: If this is not the case, then a simpler approach suffices.
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
show more ...
|
Revision tags: v3.1-rc1 |
|
#
84bf1bcc |
| 31-Jul-2011 |
Frederic Weisbecker <fweisbec@gmail.com> |
nohz: Move next idle expiry time record into idle logic area
The next idle expiry time record and idle sleeps tracking are statistics that only concern idle.
Since we want the nohz APIs to become u
nohz: Move next idle expiry time record into idle logic area
The next idle expiry time record and idle sleeps tracking are statistics that only concern idle.
Since we want the nohz APIs to become usable further idle context, let's pull up the handling of these statistics to the callers in idle.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Alessio Igor Bogani <abogani@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Avi Kivity <avi@redhat.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Christoph Lameter <cl@linux.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Geoff Levand <geoff@infradead.org> Cc: Gilad Ben Yossef <gilad@benyossef.com> Cc: Hakan Akkan <hakanakkan@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Kevin Hilman <khilman@ti.com> Cc: Max Krasnyansky <maxk@qualcomm.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de>
show more ...
|
#
5b39939a |
| 31-Jul-2011 |
Frederic Weisbecker <fweisbec@gmail.com> |
nohz: Move ts->idle_calls incrementation into strict idle logic
Since we want to prepare for making the nohz API to work further the idle case, we need to pull ts->idle_calls incrementation up to th
nohz: Move ts->idle_calls incrementation into strict idle logic
Since we want to prepare for making the nohz API to work further the idle case, we need to pull ts->idle_calls incrementation up to the callers in idle.
To perform this, we split tick_nohz_stop_sched_tick() in two parts: a first one that checks if we can really stop the tick for idle, and another that actually stops it. Then from the callers in idle, we check if we can stop the tick and only then we increment idle_calls and finally relay to the nohz API that won't care about these details anymore.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Alessio Igor Bogani <abogani@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Avi Kivity <avi@redhat.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Christoph Lameter <cl@linux.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Geoff Levand <geoff@infradead.org> Cc: Gilad Ben Yossef <gilad@benyossef.com> Cc: Hakan Akkan <hakanakkan@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Kevin Hilman <khilman@ti.com> Cc: Max Krasnyansky <maxk@qualcomm.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de>
show more ...
|
#
f5d411c9 |
| 31-Jul-2011 |
Frederic Weisbecker <fweisbec@gmail.com> |
nohz: Rename ts->idle_tick to ts->last_tick
Now that idle and nohz logics are going to be independant each others, ts->idle_tick becomes too much a biased name to describe the field that saves the l
nohz: Rename ts->idle_tick to ts->last_tick
Now that idle and nohz logics are going to be independant each others, ts->idle_tick becomes too much a biased name to describe the field that saves the last scheduled tick on top of which we re-calculate the next tick to schedule when the timer is restarted.
We want to reuse this even to stop the tick outside idle cases. So let's rename it to some more generic name: ts->last_tick.
This changes a bit the timer list stat export so we need to increase its version.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Alessio Igor Bogani <abogani@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Avi Kivity <avi@redhat.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Christoph Lameter <cl@linux.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Geoff Levand <geoff@infradead.org> Cc: Gilad Ben Yossef <gilad@benyossef.com> Cc: Hakan Akkan <hakanakkan@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Kevin Hilman <khilman@ti.com> Cc: Max Krasnyansky <maxk@qualcomm.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de>
show more ...
|
#
2ac0d98f |
| 27-Jul-2011 |
Frederic Weisbecker <fweisbec@gmail.com> |
nohz: Make nohz API agnostic against idle ticks cputime accounting
When the timer tick fires, it accounts the new jiffy as either part of system, user or idle time. This is how we record the cputime
nohz: Make nohz API agnostic against idle ticks cputime accounting
When the timer tick fires, it accounts the new jiffy as either part of system, user or idle time. This is how we record the cputime statistics.
But when the tick is stopped from the idle task, we still need to record the number of jiffies spent tickless until we restart the tick and fall back to traditional tick-based cputime accounting.
To do this, we take a snapshot of jiffies when the tick is stopped and compute the difference against the new value of jiffies when the tick is restarted. Then we account this whole difference to the idle cputime.
However we are preparing to be able to stop the tick from other places than idle. So this idle time accounting needs to be performed from the callers of nohz APIs, not from the nohz APIs themselves because we now want them to be agnostic against places that stop/restart tick.
Therefore, we pull the tickless idle time accounting out of generic nohz helpers up to idle entry/exit callers.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Alessio Igor Bogani <abogani@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Avi Kivity <avi@redhat.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Christoph Lameter <cl@linux.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Geoff Levand <geoff@infradead.org> Cc: Gilad Ben Yossef <gilad@benyossef.com> Cc: Hakan Akkan <hakanakkan@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Kevin Hilman <khilman@ti.com> Cc: Max Krasnyansky <maxk@qualcomm.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de>
show more ...
|
#
19f5f736 |
| 27-Jul-2011 |
Frederic Weisbecker <fweisbec@gmail.com> |
nohz: Separate idle sleeping time accounting from nohz logic
As we plan to be able to stop the tick outside the idle task, we need to prepare for separating nohz logic from idle. As a start, this pu
nohz: Separate idle sleeping time accounting from nohz logic
As we plan to be able to stop the tick outside the idle task, we need to prepare for separating nohz logic from idle. As a start, this pulls the idle sleeping time accounting out of the tick stop/restart API to the callers on idle entry/exit.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Alessio Igor Bogani <abogani@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Avi Kivity <avi@redhat.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Christoph Lameter <cl@linux.com> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Geoff Levand <geoff@infradead.org> Cc: Gilad Ben Yossef <gilad@benyossef.com> Cc: Hakan Akkan <hakanakkan@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Kevin Hilman <khilman@ti.com> Cc: Max Krasnyansky <maxk@qualcomm.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de>
show more ...
|
#
aa9b1630 |
| 10-May-2012 |
Paul E. McKenney <paul.mckenney@linaro.org> |
rcu: Precompute RCU_FAST_NO_HZ timer offsets
When a CPU is entering dyntick-idle mode, tick_nohz_stop_sched_tick() calls rcu_needs_cpu() see if RCU needs that CPU, and, if not, computes the next wak
rcu: Precompute RCU_FAST_NO_HZ timer offsets
When a CPU is entering dyntick-idle mode, tick_nohz_stop_sched_tick() calls rcu_needs_cpu() see if RCU needs that CPU, and, if not, computes the next wakeup time based on the timer wheels. Only later, when actually entering the idle loop, rcu_prepare_for_idle() will be invoked. In some cases, rcu_prepare_for_idle() will post timers to wake the CPU back up. But all for naught: The next wakeup time for the CPU has already been computed, and posting a timer afterwards does not force that wakeup time to be recomputed. This means that rcu_prepare_for_idle()'s have no effect.
This is not a problem on a busy system because something else will wake up the CPU soon enough. However, on lightly loaded systems, the CPU might stay asleep for a considerable length of time. If that CPU has a callback that the rest of the system is waiting on, the system might run very slowly or (in theory) even hang.
This commit avoids this problem by having rcu_needs_cpu() give tick_nohz_stop_sched_tick() an estimate of when RCU will need the CPU to wake back up, which tick_nohz_stop_sched_tick() takes into account when programming the CPU's wakeup time. An alternative approach is for rcu_prepare_for_idle() to use hrtimers instead of normal timers, but timers are much more efficient than are hrtimers for frequently and repeatedly posting and cancelling a given timer, which is exactly what RCU_FAST_NO_HZ does.
Reported-by: Pascal Chapperon <pascal.chapperon@wanadoo.fr> Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Heiko Carstens <heiko.carstens@de.ibm.com> Tested-by: Pascal Chapperon <pascal.chapperon@wanadoo.fr>
show more ...
|
#
5aaa0b7a |
| 17-May-2012 |
Peter Zijlstra <a.p.zijlstra@chello.nl> |
sched/nohz: Fix rq->cpu_load calculations some more
Follow up on commit 556061b00 ("sched/nohz: Fix rq->cpu_load[] calculations") since while that fixed the busy case it regressed the mostly idle ca
sched/nohz: Fix rq->cpu_load calculations some more
Follow up on commit 556061b00 ("sched/nohz: Fix rq->cpu_load[] calculations") since while that fixed the busy case it regressed the mostly idle case.
Add a callback from the nohz exit to also age the rq->cpu_load[] array. This closes the hole where either there was no nohz load balance pass during the nohz, or there was a 'significant' amount of idle time between the last nohz balance and the nohz exit.
So we'll update unconditionally from the tick to not insert any accidental 0 load periods while busy, and we try and catch up from nohz idle balance and nohz exit. Both these are still prone to missing a jiffy, but that has always been the case.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: pjt@google.com Cc: Venkatesh Pallipadi <venki@google.com> Link: http://lkml.kernel.org/n/tip-kt0trz0apodbf84ucjfdbr1a@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
show more ...
|
#
62cf20b3 |
| 25-May-2012 |
Thomas Gleixner <tglx@linutronix.de> |
tick: Move skew_tick option into the HIGH_RES_TIMER section
commit 5307c95 (tick: Add tick skew boot option) broke the !CONFIG_HIGH_RES_TIMERS build.
Move the boot option parsing into the CONFIG_HI
tick: Move skew_tick option into the HIGH_RES_TIMER section
commit 5307c95 (tick: Add tick skew boot option) broke the !CONFIG_HIGH_RES_TIMERS build.
Move the boot option parsing into the CONFIG_HIGH_RES_TIMERS section.
Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Mike Galbraith <mgalbraith@suse.de>
show more ...
|
#
5307c955 |
| 08-May-2012 |
Mike Galbraith <mgalbraith@suse.de> |
tick: Add tick skew boot option
Let the user decide whether power consumption or jitter is the more important consideration for their machines.
Quoting removal commit af5ab277ded04bd9bc6b048c5a2f0e
tick: Add tick skew boot option
Let the user decide whether power consumption or jitter is the more important consideration for their machines.
Quoting removal commit af5ab277ded04bd9bc6b048c5a2f0e7d70ef0867:
"Historically, Linux has tried to make the regular timer tick on the various CPUs not happen at the same time, to avoid contention on xtime_lock. Nowadays, with the tickless kernel, this contention no longer happens since time keeping and updating are done differently. In addition, this skew is actually hurting power consumption in a measurable way on many-core systems."
Problems:
- Contrary to the above, systems do encounter contention on both xtime_lock and RCU structure locks when the tick is synchronized. - Moderate sized RT systems suffer intolerable jitter due to the tick being synchronized.
- SGI reports the same for their large systems.
- Fully utilized systems reap no power saving benefit from skew removal, but do suffer from resulting induced lock contention.
- 0209f649 rcu: limit rcu_node leaf-level fanout This patch was born to combat lock contention which testing showed to have been _induced by_ skew removal. Skew the tick, contention disappeared virtually completely.
Signed-off-by: Mike Galbraith <mgalbraith@suse.de> Link: http://lkml.kernel.org/r/1336472458.21924.78.camel@marge.simpson.net Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
show more ...
|
#
6f103929 |
| 27-Mar-2012 |
Neal Cardwell <ncardwell@google.com> |
nohz: Fix stale jiffies update in tick_nohz_restart()
Fix tick_nohz_restart() to not use a stale ktime_t "now" value when calling tick_do_update_jiffies64(now).
If we reach this point in the loop i
nohz: Fix stale jiffies update in tick_nohz_restart()
Fix tick_nohz_restart() to not use a stale ktime_t "now" value when calling tick_do_update_jiffies64(now).
If we reach this point in the loop it means that we crossed a tick boundary since we grabbed the "now" timestamp, so at this point "now" refers to a time in the old jiffy, so using the old value for "now" is incorrect, and is likely to give us a stale jiffies value.
In particular, the first time through the loop the tick_do_update_jiffies64(now) call is always a no-op, since the caller, tick_nohz_restart_sched_tick(), will have already called tick_do_update_jiffies64(now) with that "now" value.
Note that tick_nohz_stop_sched_tick() already uses the correct approach: when we notice we cross a jiffy boundary, grab a new timestamp with ktime_get(), and *then* update jiffies.
Signed-off-by: Neal Cardwell <ncardwell@google.com> Cc: Ben Segall <bsegall@google.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1332875377-23014-1-git-send-email-ncardwell@google.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
show more ...
|
#
15f827be |
| 24-Jan-2012 |
Frederic Weisbecker <fweisbec@gmail.com> |
nohz: Remove ts->Einidle checks before restarting the tick
ts->inidle is set by tick_nohz_idle_enter() and unset by tick_nohz_idle_exit(). However these two calls are assumed to be always paired. Th
nohz: Remove ts->Einidle checks before restarting the tick
ts->inidle is set by tick_nohz_idle_enter() and unset by tick_nohz_idle_exit(). However these two calls are assumed to be always paired. This means that by the time we call tick_nohz_idle_exit(), ts->inidle is supposed to be always set to 1.
Remove the checks for ts->inidle in tick_nohz_idle_exit(). This simplifies a bit the code and improves its debuggability (ie: ensure the call is paired with a tick_nohz_idle_enter() call).
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Reviewed-by: Yong Zhang <yong.zhang0@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: John Stultz <john.stultz@linaro.org> Cc: Ingo Molnar <mingo@elte.hu> Link: http://lkml.kernel.org/r/1327427984-23282-2-git-send-email-fweisbec@gmail.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
show more ...
|
#
430ee881 |
| 01-Dec-2011 |
Michal Hocko <mhocko@suse.cz> |
nohz: Remove update_ts_time_stat from tick_nohz_start_idle
There is no reason to call update_ts_time_stat from tick_nohz_start_idle anymore (after e0e37c20 sched: Eliminate the ts->idle_lastupdate f
nohz: Remove update_ts_time_stat from tick_nohz_start_idle
There is no reason to call update_ts_time_stat from tick_nohz_start_idle anymore (after e0e37c20 sched: Eliminate the ts->idle_lastupdate field) when we updated idle_lastupdate unconditionally.
We haven't set idle_active yet and do not provide last_update_time so the whole call end up being just 2 wasted branches.
Signed-off-by: Michal Hocko <mhocko@suse.cz> Cc: Arjan van de Ven <arjan@linux.intel.com> Link: http://lkml.kernel.org/r/1322755222-6951-1-git-send-email-mhocko@suse.cz Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
show more ...
|
#
1268fbc7 |
| 17-Nov-2011 |
Frederic Weisbecker <fweisbec@gmail.com> |
nohz: Remove tick_nohz_idle_enter_norcu() / tick_nohz_idle_exit_norcu()
Those two APIs were provided to optimize the calls of tick_nohz_idle_enter() and rcu_idle_enter() into a single irq disabled s
nohz: Remove tick_nohz_idle_enter_norcu() / tick_nohz_idle_exit_norcu()
Those two APIs were provided to optimize the calls of tick_nohz_idle_enter() and rcu_idle_enter() into a single irq disabled section. This way no interrupt happening in-between would needlessly process any RCU job.
Now we are talking about an optimization for which benefits have yet to be measured. Let's start simple and completely decouple idle rcu and dyntick idle logics to simplify.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
show more ...
|
#
2bbb6817 |
| 08-Oct-2011 |
Frederic Weisbecker <fweisbec@gmail.com> |
nohz: Allow rcu extended quiescent state handling seperately from tick stop
It is assumed that rcu won't be used once we switch to tickless mode and until we restart the tick. However this is not al
nohz: Allow rcu extended quiescent state handling seperately from tick stop
It is assumed that rcu won't be used once we switch to tickless mode and until we restart the tick. However this is not always true, as in x86-64 where we dereference the idle notifiers after the tick is stopped.
To prepare for fixing this, add two new APIs: tick_nohz_idle_enter_norcu() and tick_nohz_idle_exit_norcu().
If no use of RCU is made in the idle loop between tick_nohz_enter_idle() and tick_nohz_exit_idle() calls, the arch must instead call the new *_norcu() version such that the arch doesn't need to call rcu_idle_enter() and rcu_idle_exit().
Otherwise the arch must call tick_nohz_enter_idle() and tick_nohz_exit_idle() and also call explicitly:
- rcu_idle_enter() after its last use of RCU before the CPU is put to sleep. - rcu_idle_exit() before the first use of RCU after the CPU is woken up.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Frysinger <vapier@gentoo.org> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: David Miller <davem@davemloft.net> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Paul Mackerras <paulus@samba.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
show more ...
|
#
280f0677 |
| 07-Oct-2011 |
Frederic Weisbecker <fweisbec@gmail.com> |
nohz: Separate out irq exit and idle loop dyntick logic
The tick_nohz_stop_sched_tick() function, which tries to delay the next timer tick as long as possible, can be called from two places:
- From
nohz: Separate out irq exit and idle loop dyntick logic
The tick_nohz_stop_sched_tick() function, which tries to delay the next timer tick as long as possible, can be called from two places:
- From the idle loop to start the dytick idle mode - From interrupt exit if we have interrupted the dyntick idle mode, so that we reprogram the next tick event in case the irq changed some internal state that requires this action.
There are only few minor differences between both that are handled by that function, driven by the ts->inidle cpu variable and the inidle parameter. The whole guarantees that we only update the dyntick mode on irq exit if we actually interrupted the dyntick idle mode, and that we enter in RCU extended quiescent state from idle loop entry only.
Split this function into:
- tick_nohz_idle_enter(), which sets ts->inidle to 1, enters dynticks idle mode unconditionally if it can, and enters into RCU extended quiescent state.
- tick_nohz_irq_exit() which only updates the dynticks idle mode when ts->inidle is set (ie: if tick_nohz_idle_enter() has been called).
To maintain symmetry, tick_nohz_restart_sched_tick() has been renamed into tick_nohz_idle_exit().
This simplifies the code and micro-optimize the irq exit path (no need for local_irq_save there). This also prepares for the split between dynticks and rcu extended quiescent state logics. We'll need this split to further fix illegal uses of RCU in extended quiescent states in the idle loop.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Frysinger <vapier@gentoo.org> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: David Miller <davem@davemloft.net> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Paul Mackerras <paulus@samba.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
show more ...
|