- Nov 09, 2007
-
-
Ingo Molnar authored
clean up the wakeup preemption check. No code changed:

   text    data     bss     dec     hex filename
  44227    3326      36   47589    b9e5 sched.o.before
  44227    3326      36   47589    b9e5 sched.o.after

Signed-off-by:
Ingo Molnar <mingo@elte.hu>
-
Ingo Molnar authored
wakeup preemption fix: do not make it dependent on p->prio. Preemption purely depends on ->vruntime. This improves preemption in mixed-nice-level workloads. Signed-off-by:
Ingo Molnar <mingo@elte.hu>
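A hedged sketch of the rule above (illustrative names, not the kernel's actual code): preempt the running task only when the woken task's vruntime is behind by more than a granularity threshold; the tasks' nice levels play no part in the decision.

/* sketch only -- names and threshold handling are assumptions */
static int wakeup_should_preempt(unsigned long long curr_vruntime,
				 unsigned long long woken_vruntime,
				 unsigned long long wakeup_gran_ns)
{
	long long vdiff = (long long)(curr_vruntime - woken_vruntime);

	return vdiff > (long long)wakeup_gran_ns;
}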
-
Ingo Molnar authored
remove PREEMPT_RESTRICT. (this is a separate commit so that any regression related to the removal itself is bisectable) Signed-off-by:
Ingo Molnar <mingo@elte.hu>
-
Ingo Molnar authored
PREEMPT_RESTRICT was a method aimed at reducing the amount of wakeup-related preemption. It has a disadvantage though: it can prevent legitimate wakeups if a task is 'unlucky' enough to be hit too early by a tick that clears peer_preempt. Now that the wakeup preemption has been cleaned up we don't seem to have excessive preemptions anymore, so this feature can be turned off (and removed in the next patch). Signed-off-by:
Ingo Molnar <mingo@elte.hu>
-
Eric Dumazet authored
1) The hardcoded 1000000000 value is used five times in places where NSEC_PER_SEC might be more readable.
2) A conversion from nsec to msec uses the hardcoded 1000000 value, which is a candidate for NSEC_PER_MSEC.

No code changed:

   text    data     bss     dec     hex filename
  44359    3326      36   47721    ba69 sched.o.before
  44359    3326      36   47721    ba69 sched.o.after

Signed-off-by:
Eric Dumazet <dada1@cosmosbay.com> Signed-off-by:
Ingo Molnar <mingo@elte.hu>
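A hedged illustration of the readability change, not the exact patch; it assumes the usual <linux/time.h> constants (NSEC_PER_SEC = 1000000000L, NSEC_PER_MSEC = 1000000L), and the helper name is made up.

static unsigned long long ns_to_msecs(unsigned long long ns)
{
	return ns / NSEC_PER_MSEC;	/* previously: ns / 1000000 */
}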
-
Ingo Molnar authored
Yanmin Zhang reported an aim7 regression and bisected it down to:

 | commit 38ad464d
 | Author: Ingo Molnar <mingo@elte.hu>
 | Date:   Mon Oct 15 17:00:02 2007 +0200
 |
 |     sched: uniform tunings
 |
 |     use the same defaults on both UP and SMP.

fix this by reintroducing similar SMP tunings again. This resolves the regression. (also update the comments to match the ilog2(nr_cpus) tuning effect) Signed-off-by:
Ingo Molnar <mingo@elte.hu>
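A hedged sketch of the ilog2(nr_cpus) scaling mentioned above; the exact factor and the set of tunables touched are assumptions based on the commit text, not a copy of the patch.

static void scale_smp_tunings(void)
{
	unsigned int factor = 1 + ilog2(num_online_cpus());

	/* scale the UP defaults up on larger machines */
	sysctl_sched_latency *= factor;
	sysctl_sched_min_granularity *= factor;
	sysctl_sched_wakeup_granularity *= factor;
}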
-
Paul Mackerras authored
Since powerpc started using CONFIG_GENERIC_CLOCKEVENTS, the deterministic CPU accounting (CONFIG_VIRT_CPU_ACCOUNTING) has been broken on powerpc, because we end up counting user time twice: once in timer_interrupt() and once in update_process_times(). This fixes the problem by pulling the code in update_process_times that updates utime and stime into a separate function called account_process_tick. If CONFIG_VIRT_CPU_ACCOUNTING is not defined, there is a version of account_process_tick in kernel/timer.c that simply accounts a whole tick to either utime or stime as before. If CONFIG_VIRT_CPU_ACCOUNTING is defined, then arch code gets to implement account_process_tick. This also lets us simplify the s390 code a bit; it means that the s390 timer interrupt can now call update_process_times even when CONFIG_VIRT_CPU_ACCOUNTING is turned on, and can just implement a suitable account_process_tick(). account_process_tick() now takes the task_struct * as an argument. Tested both with and without CONFIG_VIRT_CPU_ACCOUNTING. Signed-off-by:
Paul Mackerras <paulus@samba.org> Signed-off-by:
Ingo Molnar <mingo@elte.hu>
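A hedged sketch of the default (!CONFIG_VIRT_CPU_ACCOUNTING) helper described above: one whole tick is charged to either utime or stime. Only the function name and the split come from the commit text; the body is illustrative.

#ifndef CONFIG_VIRT_CPU_ACCOUNTING
void account_process_tick(struct task_struct *p, int user_tick)
{
	cputime_t one_tick = jiffies_to_cputime(1);

	if (user_tick)
		account_user_time(p, one_tick);			/* whole tick to utime */
	else
		account_system_time(p, HARDIRQ_OFFSET, one_tick);	/* whole tick to stime */
}
#endif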
-
Balbir Singh authored
Fix the delay accounting regression introduced by commit 75d4ef16. The rq no longer has sched_info data associated with it; the task_struct's sched_info structure is what delay accounting uses to report statistics back to user space. Also remove the direct use of sched_clock() (which is no longer a valid thing to do) and use rq->clock instead. Signed-off-by:
Balbir Singh <balbir@linux.vnet.ibm.com> Signed-off-by:
Ingo Molnar <mingo@elte.hu>
-
Peter Zijlstra authored
we lost the sched_min_granularity tunable to a clever optimization that uses the sched_latency/min_granularity ratio - but the ratio is quite unintuitive to users and can also crash the kernel if the ratio is set to 0. So reintroduce the min_granularity tunable, while keeping the ratio maintained internally. No functionality changed. [ mingo@elte.hu: some fixlets. ] Signed-off-by:
Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by:
Ingo Molnar <mingo@elte.hu>
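A minimal sketch of the approach, with illustrative names and default values: both values stay user-visible, and the derived ratio is recomputed whenever either tunable is written, guarding against a zero granularity so the divide cannot crash.

static unsigned int sysctl_sched_latency         = 20000000;	/* 20 ms */
static unsigned int sysctl_sched_min_granularity =  4000000;	/*  4 ms */
static unsigned int sched_nr_latency             = 5;		/* derived ratio */

/* call this from the sysctl handlers after either value is written */
static void update_sched_nr_latency(void)
{
	if (sysctl_sched_min_granularity)	/* never divide by zero */
		sched_nr_latency = sysctl_sched_latency /
				   sysctl_sched_min_granularity;
}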
-
Peter Zijlstra authored
Add a few comments to place_entity(). No code changed. Signed-off-by:
Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by:
Ingo Molnar <mingo@elte.hu>
-
Peter Zijlstra authored
vslice was missing a factor of NICE_0_LOAD, as weight is in weight*NICE_0_LOAD units. The effect of this bug was larger initial slices and thus latency-noisier forks. Signed-off-by:
Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by:
Ingo Molnar <mingo@elte.hu>
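A hedged sketch of the corrected scaling; only the NICE_0_LOAD factor comes from the commit text, the helper and parameter names are assumptions.

static u64 sched_vslice_sketch(u64 period, unsigned long queue_weight)
{
	u64 vslice = period * NICE_0_LOAD;	/* the factor this fix adds back */

	do_div(vslice, queue_weight);
	return vslice;
}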
-
- Nov 05, 2007
-
-
Li Zefan authored
Signed-off-by:
Li Zefan <lizf@cn.fujitsu.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: john stultz <johnstul@us.ibm.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Alexey Dobriyan authored
Let's make it immediately obvious where the sysctl messages come from, and make the messages themselves more noticeable. Signed-off-by:
Alexey Dobriyan <adobriyan@gmail.com> Acked-by:
"Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Adrian Bunk authored
The following functions can now become static again: - get_futex_key() - get_futex_key_refs() - drop_futex_key_refs() Signed-off-by:
Adrian Bunk <bunk@kernel.org> Signed-off-by:
Rusty Russell <rusty@rustcorp.com.au>
-
- Oct 31, 2007
-
-
David S. Miller authored
Add some missing cond_syscall() entries for this case. Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Oct 30, 2007
-
-
Rafael J. Wysocki authored
Do not allow processes to clear their TIF_SIGPENDING if TIF_FREEZE is set, so that they will not race with the freezer (like mysqld does, for example). Signed-off-by:
Rafael J. Wysocki <rjw@sisk.pl> Acked-by:
Nigel Cunningham <nigel@suspend2.net> Acked-by:
Pavel Machek <pavel@ucw.cz> Cc: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
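A hedged sketch of the rule: a task that is being frozen keeps TIF_SIGPENDING set even if no real signal is queued, so the freezer's fake signal cannot be lost. Here 'work_pending' stands in for the real pending-signal test and the helper name is illustrative.

static void recalc_sigpending_sketch(struct task_struct *t, int work_pending)
{
	if (work_pending || freezing(t))
		set_tsk_thread_flag(t, TIF_SIGPENDING);
	else
		clear_tsk_thread_flag(t, TIF_SIGPENDING);
}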
-
- Oct 29, 2007
-
-
Balbir Singh authored
Extend Peter's patch to fix accounting issues, by keeping stime monotonic too. Signed-off-by:
Balbir Singh <balbir@linux.vnet.ibm.com> Signed-off-by:
Ingo Molnar <mingo@elte.hu> Tested-by:
Frans Pop <elendil@planet.nl>
-
Ingo Molnar authored
fallout of recent commits: small coding style fixes. Signed-off-by:
Ingo Molnar <mingo@elte.hu>
-
Ingo Molnar authored
fix style of swap() macro in kernel/sched_fair.c. ( this macro should eventually move to a general header, as ext3 uses a similar construct too. ) Signed-off-by:
Ingo Molnar <mingo@elte.hu>
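For reference, the usual shape of such a local swap() helper (a generic illustration; the kernel's exact version may differ):

#define swap(a, b) \
	do { typeof(a) __tmp = (a); (a) = (b); (b) = __tmp; } while (0)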
-
Paul Menage authored
Adds a cpu.usage file to the CFS cgroup that reports CPU usage in milliseconds for that cgroup's tasks. [ mingo@elte.hu: style cleanups. ] Signed-off-by:
Paul Menage <menage@google.com> Signed-off-by:
Ingo Molnar <mingo@elte.hu>
-
Srivatsa Vaddagiri authored
Peter Zijlstra noticed that the rcu_head object need not be present in every cfs_rq of a group. Move it to the task_group structure instead. Signed-off-by:
Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Signed-off-by:
Ingo Molnar <mingo@elte.hu>
-
James Bottomley authored
This patch:

  commit 9b5b7751
  Author: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
  Date:   Mon Oct 15 17:00:09 2007 +0200

      sched: clean up code under CONFIG_FAIR_GROUP_SCHED

introduced an assumption of the existence of CPU0 via this line:

  cfs_rq = tg->cfs_rq[0];

If you have no CPU0, that will be NULL. The fix seems to be just to take whatever cfs_rq queue comes out of the for_each_possible_cpu() loop, since they're all equally good for the destruction operation. Signed-off-by:
James Bottomley <James.Bottomley@SteelEye.com> Signed-off-by:
Ingo Molnar <mingo@elte.hu>
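A hedged sketch of the described fix: keep whichever cfs_rq the loop last produced instead of indexing CPU0 directly. The destruction call and field names follow the commit's description but are illustrative.

	struct cfs_rq *cfs_rq = NULL;
	int i;

	for_each_possible_cpu(i)
		cfs_rq = tg->cfs_rq[i];		/* any of these will do */

	BUG_ON(!cfs_rq);
	call_rcu(&cfs_rq->rcu, free_sched_group);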
-
Peter Zijlstra authored
keep utime/stime monotonic. cpustats use utime/stime as a ratio against sum_exec_runtime; as a consequence it can happen - when the ratio changes faster than time accumulates - that either value can appear to go backwards. Signed-off-by:
Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by:
Ingo Molnar <mingo@elte.hu>
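A minimal sketch of the monotonicity guarantee (illustrative names): remember the last values handed out and never report anything smaller, so a shifting utime/stime split cannot make either value appear to go backwards.

static unsigned long long prev_utime, prev_stime;

static void report_cpu_times(unsigned long long utime, unsigned long long stime)
{
	if (utime > prev_utime)
		prev_utime = utime;
	if (stime > prev_stime)
		prev_stime = stime;
	/* prev_utime / prev_stime are what gets exported to user space */
}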
-
Adrian Bunk authored
account_guest_time() can become static. Signed-off-by:
Adrian Bunk <bunk@kernel.org> Signed-off-by:
Ingo Molnar <mingo@elte.hu>
-
Al Viro authored
Don't undef __i386__/__x86_64__ in uml anymore; make sure that the (few) places that required ifdef adjustments got them. Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Michael Ellerman authored
Change the hrtimer printk "Switched to high resolution mode .." to be KERN_DEBUG, rather than KERN_INFO. If users need to see this they can pass "loglevel" or "debug" on the command line, or check dmesg. Signed-off-by:
Michael Ellerman <michael@ellerman.id.au> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de>

 kernel/hrtimer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
-
Vegard Nossum authored
This makes sure printk format strings contain no more than a single line. Signed-off-by:
Vegard Nossum <vegard.nossum@gmail.com> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de>
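An illustration of the kind of change implied, with placeholder strings (hedged; not the actual hunk): each output line becomes its own printk call, so every line carries an explicit log level.

	/* before: one call carrying two output lines */
	printk(KERN_INFO "example line one\n"
	       "example line two\n");

	/* after: one line per printk call */
	printk(KERN_INFO "example line one\n");
	printk(KERN_INFO "example line two\n");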
-
Adrian Bunk authored
This patch removes the unused EXPORT_SYMBOL_GPL(tick_nohz_get_sleep_length). Signed-off-by:
Adrian Bunk <bunk@kernel.org> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de>
-
- Oct 28, 2007
-
-
Gautham R Shenoy authored
Fix a typo in the __lock_acquire comment. Signed-off-by:
Gautham R Shenoy <ego@in.ibm.com> Signed-off-by:
Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by:
Ingo Molnar <mingo@elte.hu>
-
- Oct 25, 2007
-
-
Adrian Bunk authored
This patch removes the unused EXPORT_SYMBOL_GPL(tick_nohz_get_sleep_length), which we no longer use because we no longer build optional modules. Signed-off-by:
Adrian Bunk <bunk@kernel.org> Acked-by:
Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Acked-by:
Thomas Gleixner <tglx@linutronix.de> Signed-off-by:
Len Brown <len.brown@intel.com>
-
Peter Zijlstra authored
Lockdep noticed that this lock can also be taken from hardirq context, and can thus not unconditionally disable/enable irqs.

 WARNING: at kernel/lockdep.c:2033 trace_hardirqs_on()
  [show_trace_log_lvl+26/48] show_trace_log_lvl+0x1a/0x30
  [show_trace+18/32] show_trace+0x12/0x20
  [dump_stack+22/32] dump_stack+0x16/0x20
  [trace_hardirqs_on+405/416] trace_hardirqs_on+0x195/0x1a0
  [_read_unlock_irq+34/48] _read_unlock_irq+0x22/0x30
  [sched_debug_show+2615/4224] sched_debug_show+0xa37/0x1080
  [show_state_filter+326/368] show_state_filter+0x146/0x170
  [sysrq_handle_showstate+10/16] sysrq_handle_showstate+0xa/0x10
  [__handle_sysrq+123/288] __handle_sysrq+0x7b/0x120
  [handle_sysrq+40/64] handle_sysrq+0x28/0x40
  [kbd_event+1045/1680] kbd_event+0x415/0x690
  [input_pass_event+206/208] input_pass_event+0xce/0xd0
  [input_handle_event+170/928] input_handle_event+0xaa/0x3a0
  [input_event+95/112] input_event+0x5f/0x70
  [atkbd_interrupt+434/1456] atkbd_interrupt+0x1b2/0x5b0
  [serio_interrupt+59/128] serio_interrupt+0x3b/0x80
  [i8042_interrupt+263/576] i8042_interrupt+0x107/0x240
  [handle_IRQ_event+40/96] handle_IRQ_event+0x28/0x60
  [handle_edge_irq+175/320] handle_edge_irq+0xaf/0x140
  [do_IRQ+64/128] do_IRQ+0x40/0x80
  [common_interrupt+46/52] common_interrupt+0x2e/0x34

Signed-off-by:
Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by:
Ingo Molnar <mingo@elte.hu>
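A hedged sketch of the remedy the report points at: use the irq-saving lock variants so the same path is safe whether it runs in process context or from a hardirq (sysrq-t from the keyboard interrupt, as in the trace above). The lock and helper shown are illustrative.

static void dump_task_states(void)
{
	unsigned long flags;

	read_lock_irqsave(&tasklist_lock, flags);
	/* walk the task list and print per-task scheduler state here */
	read_unlock_irqrestore(&tasklist_lock, flags);
}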
-
- Oct 24, 2007
-
-
Peter Williams authored
At the moment, a lot of load balancing code that is irrelevant to non-SMP systems gets included during non-SMP builds. This patch addresses this issue and reduces the binary size on non-SMP systems:

   text    data     bss     dec     hex filename
  10983      28    1192   12203    2fab sched.o.before
  10739      28    1192   11959    2eb7 sched.o.after

Signed-off-by:
Peter Williams <pwil3058@bigpond.net.au> Signed-off-by:
Ingo Molnar <mingo@elte.hu>
-
Peter Williams authored
At the moment, balance_tasks() provides low-level functionality for both move_tasks() and move_one_task() (indirectly) via the load_balance() function (in the sched_class interface), which also provides dual functionality. This dual functionality complicates the interfaces and internal mechanisms and adds to the run-time overhead of operations that are called with two run queue locks held. This patch addresses this issue and reduces the overhead of these operations. Signed-off-by:
Peter Williams <pwil3058@bigpond.net.au> Signed-off-by:
Ingo Molnar <mingo@elte.hu>
-
Adrian Bunk authored
cpu_shares_{show,store}() can become static. Signed-off-by:
Adrian Bunk <bunk@kernel.org> Signed-off-by:
Ingo Molnar <mingo@elte.hu>
-
Paul Menage authored
- replace "cont" with "cgrp" in a few places in the CFS cgroup code, - use write_uint rather than write for cpu.shares write function Signed-off-by:
Paul Menage <menage@google.com> Acked-by : Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com> Signed-off-by:
Ingo Molnar <mingo@elte.hu>
-
Mel Gorman authored
profile=sleep only works if CONFIG_SCHEDSTATS is set. This patch notes the limitation in Documentation/kernel-parameters.txt and prints a warning at boot time if profile=sleep is used without CONFIG_SCHEDSTATS. Signed-off-by:
Mel Gorman <mel@csn.ul.ie> Signed-off-by:
Ingo Molnar <mingo@elte.hu>
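A hedged sketch of such a boot-time check; prof_on and SLEEP_PROFILING are the existing profiling knobs, but the placement and the warning wording here are assumptions.

static void __init check_sleep_profiling(void)
{
#ifndef CONFIG_SCHEDSTATS
	if (prof_on == SLEEP_PROFILING)
		printk(KERN_WARNING
		       "kernel profiling: profile=sleep requires CONFIG_SCHEDSTATS\n");
#endif
}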
-
Satyam Sharma authored
A full register dump along with stack backtrace would make the "scheduling while atomic" message more helpful. Use show_regs() instead of dump_stack() for this. We already know we're atomic in here (that is why this function was called) so show_regs()'s atomicity expectations are guaranteed. Also, modify the output of the "BUG: scheduling while atomic:" header a bit to keep task->comm and task->pid together and preempt_count() after them. Signed-off-by:
Satyam Sharma <satyam@infradead.org> Signed-off-by:
Ingo Molnar <mingo@elte.hu>
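A hedged sketch of the reshaped report: comm and pid kept together with preempt_count() right after them, then a full register dump via show_regs() when interrupt registers are available; the helper name and the dump_stack() fallback are assumptions.

static void schedule_bug_report(void)
{
	struct pt_regs *regs = get_irq_regs();

	printk(KERN_ERR "BUG: scheduling while atomic: %s/%d/0x%08x\n",
	       current->comm, current->pid, preempt_count());

	if (regs)
		show_regs(regs);	/* registers plus stack backtrace */
	else
		dump_stack();
}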
-
Ingo Molnar authored
clean up sched_domain_debug(). this also shrinks the code a bit:

   text    data     bss     dec     hex filename
  50474    4306     480   55260    d7dc sched.o.before
  50404    4306     480   55190    d796 sched.o.after

Signed-off-by:
Ingo Molnar <mingo@elte.hu>
-
Ingo Molnar authored
Jeff Dike noticed that wait_for_completion_interruptible()'s prototype had a mismatched fastcall. Fix this by removing the fastcall attributes from all the completion APIs. Found-by:
Jeff Dike <jdike@linux.intel.com> Signed-off-by:
Ingo Molnar <mingo@elte.hu>
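For illustration, the resulting prototypes with the attribute gone (a trimmed list; signatures as in the completion API of that era):

extern void wait_for_completion(struct completion *x);
extern int wait_for_completion_interruptible(struct completion *x);
extern unsigned long wait_for_completion_timeout(struct completion *x,
						 unsigned long timeout);
extern void complete(struct completion *x);
extern void complete_all(struct completion *x);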
-
Milton Miller authored
commit 029190c5 (cpuset sched_load_balance flag) was not tested with SCHED_DEBUG enabled as committed: it dereferences NULL when used, and it reordered the sysctl registration so that it never shows any domains or their tunables.

Fixes:

1) Restore the arch_init_sched_domains ordering: we can't walk the domains before we build them; presently we register cpus with empty directories (no domain directories or files).

2) Make unregister_sched_domain_sysctl do nothing when already unregistered. detach_destroy_domains is now called one set of cpus at a time, and unregister_sysctl dereferences NULL if called with a null. While the function would always dereference null if called twice, in the previous code it was always called once and then followed by a register. So only the hidden bug of the sysctl_root_table not being allocated, followed by an attempt to free it, would have shown the error.

3) Always call unregister and register in partition_sched_domains. The code is "smart" about unregistering only needed domains. Since we aren't guaranteed any calls to unregister, always unregister. Without calling register on the way out we will not have a table or any sysctl tree.

4) Warn if register is called without unregistering. The previous table memory is lost, leaving pointers to the later-freed memory in sysctl and leaking the memory of the tables.

Before this patch, on a 2-core 4-thread box compiled for SMT and NUMA, the domains appear empty (there are actually 3 levels per cpu). And as soon as there are two domains, a null pointer is dereferenced (the "(unreliable)" entry in this case is stack garbage):

 bu19a:~# ls -R /proc/sys/kernel/sched_domain/
 /proc/sys/kernel/sched_domain/:
 cpu0  cpu1  cpu2  cpu3
 /proc/sys/kernel/sched_domain/cpu0:
 /proc/sys/kernel/sched_domain/cpu1:
 /proc/sys/kernel/sched_domain/cpu2:
 /proc/sys/kernel/sched_domain/cpu3:
 bu19a:~# mkdir /dev/cpuset
 bu19a:~# mount -tcpuset cpuset /dev/cpuset/
 bu19a:~# cd /dev/cpuset/
 bu19a:/dev/cpuset# echo 0 > sched_load_balance
 bu19a:/dev/cpuset# mkdir one
 bu19a:/dev/cpuset# echo 1 > one/cpus
 bu19a:/dev/cpuset# echo 0 > one/sched_load_balance
 Unable to handle kernel paging request for data at address 0x00000018
 Faulting instruction address: 0xc00000000006b608
 NIP: c00000000006b608 LR: c00000000006b604 CTR: 0000000000000000
 REGS: c000000018d973f0 TRAP: 0300  Not tainted (2.6.23-bml)
 MSR: 9000000000009032 <EE,ME,IR,DR> CR: 28242442 XER: 00000000
 DAR: 0000000000000018, DSISR: 0000000040000000
 TASK = c00000001912e340[1987] 'bash' THREAD: c000000018d94000 CPU: 2
 ..
 NIP [c00000000006b608] .unregister_sysctl_table+0x38/0x110
 LR [c00000000006b604] .unregister_sysctl_table+0x34/0x110
 Call Trace:
 [c000000018d97670] [c000000007017270] 0xc000000007017270 (unreliable)
 [c000000018d97720] [c000000000058710] .detach_destroy_domains+0x30/0xb0
 [c000000018d977b0] [c00000000005cf1c] .partition_sched_domains+0x1bc/0x230
 [c000000018d97870] [c00000000009fdc4] .rebuild_sched_domains+0xb4/0x4c0
 [c000000018d97970] [c0000000000a02e8] .update_flag+0x118/0x170
 [c000000018d97a80] [c0000000000a1768] .cpuset_common_file_write+0x568/0x820
 [c000000018d97c00] [c00000000009d95c] .cgroup_file_write+0x7c/0x180
 [c000000018d97cf0] [c0000000000e76b8] .vfs_write+0xe8/0x1b0
 [c000000018d97d90] [c0000000000e810c] .sys_write+0x4c/0x90
 [c000000018d97e30] [c00000000000852c] syscall_exit+0x0/0x40

Signed-off-by:
Milton Miller <miltonm@bga.com> Signed-off-by:
Ingo Molnar <mingo@elte.hu>
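A hedged sketch of fix (2) from the list above: let the unregister path tolerate being called when nothing is registered, so per-cpuset detach/attach cycles cannot dereference NULL. The static variable name is an assumption in line with the commit's sysctl-table theme.

static struct ctl_table_header *sd_sysctl_header;

static void unregister_sched_domain_sysctl(void)
{
	if (sd_sysctl_header)
		unregister_sysctl_table(sd_sysctl_header);
	sd_sysctl_header = NULL;
}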
-