
Searched hist:b8f8c3cf (Results 1 – 10 of 10) sorted by relevance

/openbmc/linux/arch/powerpc/kernel/
idle.c  b8f8c3cf Fri Jul 18 10:27:28 CDT 2008 Thomas Gleixner <tglx@linutronix.de> nohz: prevent tick stop outside of the idle loop

Jack Ren and Eric Miao tracked down the following long-standing
problem in the NOHZ code:

    scheduler switch to idle task
    enable interrupts

    Window starts here

    ----> interrupt happens (does not set NEED_RESCHED)
          irq_exit() stops the tick

    ----> interrupt happens (does set NEED_RESCHED)

    return from schedule()

    cpu_idle(): preempt_disable();

    Window ends here

The interrupts can happen at any point inside the race window. The
first interrupt stops the tick; the second one causes the scheduler to
rerun and switch away from idle again, and we end up with the tick
disabled.

The fact that it needs two interrupts, where the first one does not set
NEED_RESCHED and the second one does, made the bug obscure and extremely
hard to reproduce and analyse. Kudos to Jack and Eric.

Solution: Limit the NOHZ functionality to the idle loop to make sure
that we cannot run into such a situation ever again.

    cpu_idle()
    {
        preempt_disable();

        while(1) {
            tick_nohz_stop_sched_tick(1);   <- tell NOHZ code that we
                                               are in the idle loop

            while (!need_resched())
                halt();

            tick_nohz_restart_sched_tick(); <- disables NOHZ mode
            preempt_enable_no_resched();
            schedule();
            preempt_disable();
        }
    }

In hindsight we should have done this forever, but ...

/me grabs a large brown paperbag.

Debugged-by: Jack Ren <jack.ren@marvell.com>
Debugged-by: eric miao <eric.y.miao@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
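
The race and the fix described in the commit message above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not kernel code: `struct cpu_model`, `model_irq_exit()`, `tick_lost_after_race()` and `idle_loop_can_stop_tick()` are hypothetical names invented for this example, and the model reduces the real logic to a single question, namely whether the interrupt exit path is allowed to stop the tick.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy per-CPU state. All names are illustrative, not kernel symbols. */
struct cpu_model {
    bool idle_task_current; /* the idle task is the running task        */
    bool in_idle_loop;      /* idle loop has announced itself (the fix) */
    bool need_resched;      /* stands in for NEED_RESCHED               */
    bool tick_stopped;
};

/*
 * Interrupt exit path of the model.
 * Old behaviour (fixed == false): the tick may be stopped whenever the
 * idle task is current. With the fix, it may be stopped only while the
 * idle loop has marked itself as active.
 */
static void model_irq_exit(struct cpu_model *cpu, bool fixed)
{
    bool may_stop = fixed ? cpu->in_idle_loop : cpu->idle_task_current;

    if (may_stop && !cpu->need_resched)
        cpu->tick_stopped = true;
}

/* Replays the race window; returns true if the tick ends up stopped. */
static bool tick_lost_after_race(bool fixed)
{
    struct cpu_model cpu = { .idle_task_current = true };

    /* Window starts: idle task is current, idle loop not yet entered. */
    model_irq_exit(&cpu, fixed);  /* first interrupt, no NEED_RESCHED */

    cpu.need_resched = true;      /* second interrupt sets NEED_RESCHED */
    model_irq_exit(&cpu, fixed);

    /* The scheduler switches away from idle; in this model nothing
     * restarts the tick afterwards. */
    cpu.idle_task_current = false;
    return cpu.tick_stopped;
}

/* Inside the idle loop, the fixed path may still stop the tick. */
static bool idle_loop_can_stop_tick(void)
{
    struct cpu_model cpu = { .idle_task_current = true,
                             .in_idle_loop = true };

    model_irq_exit(&cpu, true);
    return cpu.tick_stopped;
}

/* Self-check of the scenarios; call from main() or a test harness. */
static void check_race_model(void)
{
    assert(tick_lost_after_race(false));  /* old: tick left stopped */
    assert(!tick_lost_after_race(true));  /* fix: tick keeps running */
    assert(idle_loop_can_stop_tick());    /* NOHZ still works in idle */
}
```

In this model, `tick_lost_after_race(false)` returns true and `tick_lost_after_race(true)` returns false, which mirrors the commit's point: once tick stopping is limited to the idle loop, an interrupt arriving in the window before the loop is entered can no longer leave the tick disabled.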
/openbmc/linux/arch/sh/kernel/
process_32.c  b8f8c3cf Fri Jul 18 10:27:28 CDT 2008 Thomas Gleixner <tglx@linutronix.de> nohz: prevent tick stop outside of the idle loop

/openbmc/linux/arch/um/kernel/
process.c  b8f8c3cf Fri Jul 18 10:27:28 CDT 2008 Thomas Gleixner <tglx@linutronix.de> nohz: prevent tick stop outside of the idle loop

/openbmc/linux/arch/arm/kernel/
process.c  b8f8c3cf Fri Jul 18 10:27:28 CDT 2008 Thomas Gleixner <tglx@linutronix.de> nohz: prevent tick stop outside of the idle loop

/openbmc/linux/include/linux/
tick.h  b8f8c3cf Fri Jul 18 10:27:28 CDT 2008 Thomas Gleixner <tglx@linutronix.de> nohz: prevent tick stop outside of the idle loop

/openbmc/linux/arch/mips/kernel/
process.c  b8f8c3cf Fri Jul 18 10:27:28 CDT 2008 Thomas Gleixner <tglx@linutronix.de> nohz: prevent tick stop outside of the idle loop

/openbmc/linux/arch/x86/kernel/
process_32.c  b8f8c3cf Fri Jul 18 10:27:28 CDT 2008 Thomas Gleixner <tglx@linutronix.de> nohz: prevent tick stop outside of the idle loop

process_64.c  b8f8c3cf Fri Jul 18 10:27:28 CDT 2008 Thomas Gleixner <tglx@linutronix.de> nohz: prevent tick stop outside of the idle loop

/openbmc/linux/kernel/
softirq.c  b8f8c3cf Fri Jul 18 10:27:28 CDT 2008 Thomas Gleixner <tglx@linutronix.de> nohz: prevent tick stop outside of the idle loop

/openbmc/linux/kernel/time/
tick-sched.c  b8f8c3cf Fri Jul 18 10:27:28 CDT 2008 Thomas Gleixner <tglx@linutronix.de> nohz: prevent tick stop outside of the idle loop
