- Apr 12, 2013
-
-
Tejun Heo authored
This reverts commit 84cfb6ab. There are scheduled changes which make use of the removed callback. Signed-off-by:
Tejun Heo <tj@kernel.org> Cc: Rami Rosen <ramirose@gmail.com> Cc: Li Zefan <lizefan@huawei.com>
-
- Apr 10, 2013
-
-
Tejun Heo authored
perf_event is one of a couple remaining cgroup controllers with broken hierarchy support. Converting it to support hierarchy is almost trivial. The only thing necessary is to consider a task belonging to a descendant cgroup as a match. IOW, if the cgroup of the currently executing task (@cpuctx->cgrp) equals or is a descendant of the event's cgroup (@event->cgrp), then the event should be enabled. Implement hierarchy support and remove .broken_hierarchy tag along with the incorrect comment on what needs to be done for hierarchy support. Signed-off-by:
Tejun Heo <tj@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Stephane Eranian <eranian@google.com> Cc: Namhyung Kim <namhyung.kim@lge.com>
-
Li Zefan authored
A couple controllers want to determine whether two cgroups are in ancestor/descendant relationship. As it's more likely that the descendant is the primary subject of interest and there are other operations focusing on the descendants, let's ask is_descendent rather than is_ancestor. Implementation is trivial as the previous patch guarantees that all ancestors of a cgroup stay accessible as long as the cgroup is accessible. tj: Removed depth optimization, renamed from cgroup_is_ancestor(), rewrote descriptions. Signed-off-by:
Li Zefan <lizefan@huawei.com> Signed-off-by:
Tejun Heo <tj@kernel.org>
-
Li Zefan authored
Suppose we rmdir a cgroup and there're still css refs, this cgroup won't be freed. Then we rmdir the parent cgroup, and the parent is freed immediately due to css ref draining to 0. Now it would be a disaster if the still-alive child cgroup tries to access its parent. Make sure this won't happen. Signed-off-by:
Li Zefan <lizefan@huawei.com> Reviewed-by:
Michal Hocko <mhocko@suse.cz> Acked-by:
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by:
Tejun Heo <tj@kernel.org>
-
Rami Rosen authored
The bind() method of cgroup_subsys is not used in any of the controllers (cpuset, freezer, blkio, net_cls, memcg, net_prio, devices, perf, hugetlb, cpu and cpuacct) tj: Removed the entry on ->bind() from Documentation/cgroups/cgroups.txt. Also updated a couple paragraphs which were suggesting that dynamic re-binding may be implemented. It's not gonna. Signed-off-by:
Rami Rosen <ramirose@gmail.com> Signed-off-by:
Tejun Heo <tj@kernel.org>
-
- Apr 07, 2013
-
-
Tejun Heo authored
We don't want controllers to assume that the information is officially available and do funky things with it. The only user is task_subsys_state_check() which uses it to verify RCU access context. We can move cgroup_lock_is_held() inside CONFIG_PROVE_RCU but that doesn't add meaningful protection compared to conditionally exposing cgroup_mutex. Remove cgroup_lock_is_held(), export cgroup_mutex iff CONFIG_PROVE_RCU and use lockdep_is_held() directly on the mutex in task_subsys_state_check(). While at it, add parentheses around macro arguments in task_subsys_state_check(). Signed-off-by:
Tejun Heo <tj@kernel.org> Acked-by:
Li Zefan <lizefan@huawei.com>
-
Tejun Heo authored
Now that locking interface is unexported, there's no reason to keep around these thin wrappers. Kill them and use mutex operations directly. Signed-off-by:
Tejun Heo <tj@kernel.org> Acked-by:
Li Zefan <lizefan@huawei.com>
-
Tejun Heo authored
Now that all external cgroup_lock() users are gone, we can finally unexport the locking interface and prevent future abuse of cgroup_mutex. Make cgroup_[un]lock() and cgroup_lock_live_group() static. Also, cgroup_attach_task() doesn't have any user left and can't be used without locking interface anyway. Make it static too. Signed-off-by:
Tejun Heo <tj@kernel.org> Acked-by:
Li Zefan <lizefan@huawei.com>
-
Tejun Heo authored
cgroup_lock_live_group() and cgroup_attach_task() are scheduled to be made static. Relocate the former and cgroup_attach_task_all() so that we don't need forward declarations. This patch is pure relocation. Signed-off-by:
Tejun Heo <tj@kernel.org> Acked-by:
Li Zefan <lizefan@huawei.com>
-
Tejun Heo authored
When a cpuset becomes empty (no CPU or memory), its tasks are transferred with the nearest ancestor with execution resources. This is implemented using cgroup_scan_tasks() with a callback which grabs cgroup_mutex and invokes cgroup_attach_task() on each task. Both cgroup_mutex and cgroup_attach_task() are scheduled to be unexported. Implement cgroup_transfer_tasks() in cgroup proper which is essentially the same as move_member_tasks_to_cpuset() except that it takes cgroups instead of cpusets and @to comes before @from like normal functions with those arguments, and replace move_member_tasks_to_cpuset() with it. Signed-off-by:
Tejun Heo <tj@kernel.org> Acked-by:
Li Zefan <lizefan@huawei.com>
-
- Apr 03, 2013
-
-
Kevin Wilson authored
This patch removes unused parameter from cgroup_task_migrate(). Signed-off-by:
Kevin Wilson <wkevils@gmail.com> Acked-by:
Acked-by: Li Zefan <lizefan@huawei.com> Signed-off-by:
Tejun Heo <tj@kernel.org>
-
- Mar 20, 2013
-
-
Li Zefan authored
These two functions share most of the code. Signed-off-by:
Li Zefan <lizefan@huawei.com> Signed-off-by:
Tejun Heo <tj@kernel.org>
-
Li Zefan authored
The 3rd parameter of flex_array_prealloc() is the number of elements, not the index of the last element. The effect of the bug is, when opening cgroup.procs, a flex array will be allocated and all elements of the array is allocated with GFP_KERNEL flag, but the last one is GFP_ATOMIC, and if we fail to allocate memory for it, it'll trigger a BUG_ON(). Signed-off-by:
Li Zefan <lizefan@huawei.com> Signed-off-by:
Tejun Heo <tj@kernel.org> Cc: stable@vger.kernel.org
-
- Mar 12, 2013
-
-
Li Zefan authored
eventfd_poll() never returns POLLHUP. Signed-off-by:
Li Zefan <lizefan@huawei.com> Signed-off-by:
Tejun Heo <tj@kernel.org>
-
Li Zefan authored
When we open cgroup.procs, we'll allocate an buffer and store all tasks' tgid in it, and then duplicate entries will be stripped. If that results in a much smaller pid list, we'll re-allocate a smaller buffer. But we've already sucessfully allocated memory and reading the procs file is a short period and the memory will be freed very soon, so why bother to re-allocate memory. Signed-off-by:
Li Zefan <lizefan@huawei.com> Signed-off-by:
Tejun Heo <tj@kernel.org>
-
Li Zefan authored
cpuset no longer nests cgroup_mutex inside cpu_hotplug lock, so we don't have to release cgroup_mutex before calling css_offline(). Signed-off-by:
Li Zefan <lizefan@huawei.com> Signed-off-by:
Tejun Heo <tj@kernel.org>
-
Li Zefan authored
Signed-off-by:
Li Zefan <lizefan@huawei.com> Signed-off-by:
Tejun Heo <tj@kernel.org>
-
Li Zefan authored
It was used by ns cgroup, and ns cgroup was removed long ago. Signed-off-by:
Li Zefan <lizefan@huawei.com> Signed-off-by:
Tejun Heo <tj@kernel.org>
-
Li Zefan authored
Sasha reported a lockdep warning when OOM was triggered. The reason is cgroup_name() should be called with rcu_read_lock() held. Reported-by:
Sasha Levin <sasha.levin@oracle.com> Signed-off-by:
Li Zefan <lizefan@huawei.com> Signed-off-by:
Tejun Heo <tj@kernel.org>
-
- Mar 05, 2013
-
-
Li Zefan authored
subsys[i] is set to NULL in cgroup_unload_subsys() at modular unload, and that's protected by cgroup_mutex, and then the memory *subsys[i] resides will be freed. So this is unsafe without any locking: if (!ss || ss->module) ... v2: - add a comment for enum cgroup_subsys_id - simplify the comment in cgroup_exit() Signed-off-by:
Li Zefan <lizefan@huawei.com> Signed-off-by:
Tejun Heo <tj@kernel.org>
-
- Mar 04, 2013
-
-
Li Zefan authored
We no longer fail rmdir() when there're still css refs, so we don't need to check css refs in check_for_release(). This also voids a bug. cgroup_has_css_refs() accesses subsys[i] without cgroup_mutex, so it can race with cgroup_unload_subsys(). cgroup_has_css_refs() ... if (ss == NULL || ss->root != cgrp->root) if ss pointers to net_cls_subsys, and cls_cgroup module is unloaded right after the former check but before the latter, the memory that net_cls_subsys resides has become invalid. Signed-off-by:
Li Zefan <lizefan@huawei.com> Signed-off-by:
Tejun Heo <tj@kernel.org>
-
Li Zefan authored
Use cgroup_name() instead of cgrp->dentry->name. This makes the code a bit simpler. While at it, remove cpuset_name and make cpuset_nodelist a local variable to cpuset_print_task_mems_allowed(). Signed-off-by:
Li Zefan <lizefan@huawei.com> Signed-off-by:
Tejun Heo <tj@kernel.org>
-
Li Zefan authored
rename() will change dentry->d_name. The result of this race can be worse than seeing partially rewritten name, but we might access a stale pointer because rename() will re-allocate memory to hold a longer name. As accessing dentry->name must be protected by dentry->d_lock or parent inode's i_mutex, while on the other hand cgroup-path() can be called with some irq-safe spinlocks held, we can't generate cgroup path using dentry->d_name. Alternatively we make a copy of dentry->d_name and save it in cgrp->name when a cgroup is created, and update cgrp->name at rename(). v5: use flexible array instead of zero-size array. v4: - allocate root_cgroup_name and all root_cgroup->name points to it. - add cgroup_name() wrapper. v3: use kfree_rcu() instead of synchronize_rcu() in user-visible path. v2: make cgrp->name RCU safe. Signed-off-by:
Li Zefan <lizefan@huawei.com> Signed-off-by:
Tejun Heo <tj@kernel.org>
-
- Mar 03, 2013
-
-
Al Viro authored
Converting bitmask to 32bit granularity is fine, but we'd better _do_ something with the result. Such as "copy it to userland"... Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk>
-
- Mar 02, 2013
-
-
James Hogan authored
Some 32 bit architectures require 64 bit values to be aligned (for example Meta which has 64 bit read/write instructions). These require 8 byte alignment of event data too, so use !CONFIG_HAVE_64BIT_ALIGNED_ACCESS instead of !CONFIG_64BIT || CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS to decide alignment, and align buffer_data_page::data accordingly. Signed-off-by:
James Hogan <james.hogan@imgtec.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> (previous version subtly different)
-
Vincent authored
The 'ssb' command can only be handled when we have a disassembler, to check for branches, so remove the 'ssb' command for now. Signed-off-by:
Vincent Stehlé <vincent.stehle@laposte.net> Signed-off-by:
Jason Wessel <jason.wessel@windriver.com>
-
Jason Wessel authored
The kdb_defcmd can only be used to display the available command aliases while using the kernel debug shell. If you try to define a new macro while the kernel debugger is active it will oops. The debug shell macros must use pre-allocated memory set aside at the time kdb_init() is run, and the kdb_defcmd is restricted to only working at the time that the kdb_init sequence is being run, which only occurs if you actually activate the kernel debugger. Signed-off-by:
Jason Wessel <jason.wessel@windriver.com>
-
Jason Wessel authored
Recently some code inspection was done after fixing a problem with kmalloc used while in the kernel debugger context (which is not legal), and it turned up the fact that kdb ll command will oops the kernel. Given that there have been zero bug reports on the command combined with the fact it will oops the kernel it is clearly not being used. Instead of fixing it, it will be removed. Signed-off-by:
Jason Wessel <jason.wessel@windriver.com>
-
Jason Wessel authored
The help command was chopping all the usage instructions such that they were not readable. Example: bta [D|R|S|T|C|Z|E|U|I| Backtrace all processes matching state flag per_cpu <sym> [<bytes>] [<c Display per_cpu variables Where as it should look like: bta [D|R|S|T|C|Z|E|U|I|M|A] Backtrace all processes matching state flag per_cpu <sym> [<bytes>] [<cpu>] Display per_cpu variables All that is needed is to check the how long the cmd_usage is and jump to the next line when appropriate. Signed-off-by:
Jason Wessel <jason.wessel@windriver.com>
-
Jason Wessel authored
Maxime reported that strcpy(s->usage, s->usage+1) has no definitive guarantee that it will work on all archs the same way when you have overlapping memory. The fix is simple for the kdb code because we still have the original string memory in the function scope, so we just have to use that as the argument instead. Reported-by:
Maxime Villard <rustyBSD@gmx.fr> Signed-off-by:
Jason Wessel <jason.wessel@windriver.com>
-
Matt Klein authored
Although invasive kdb commands are not supported via kgdb, some useful non-invasive commands like bt* require basic kdb state to be setup before calling into the kdb code. Factor out some of this code and call it before and after executing kdb commands via kgdb. Signed-off-by:
Matt Klein <mklein@twitter.com> Signed-off-by:
Jason Wessel <jason.wessel@windriver.com>
-
Sasha Levin authored
Signed-off-by:
Sasha Levin <sasha.levin@oracle.com> Signed-off-by:
Jason Wessel <jason.wessel@windriver.com>
-
John Blackwood authored
When locally adding in some additional kdb commands, I stumbled across an issue with the dynamic expansion of the kdb command table. When the number of kdb commands exceeds the size of the statically allocated kdb_base_commands[] array, additional space is allocated in the kdb_register_repeat() routine. The unused portion of the newly allocated array was not being initialized to zero properly and this would result in segfaults when help '?' was executed or when a search for a non-existing command would traverse the command table beyond the end of valid command entries and then attempt to use the non-zeroed area as actual command entries. Signed-off-by:
John Blackwood <john.blackwood@ccur.com> Signed-off-by:
Jason Wessel <jason.wessel@windriver.com>
-
- Feb 28, 2013
-
-
Sasha Levin authored
I'm not sure why, but the hlist for each entry iterators were conceived list_for_each_entry(pos, head, member) The hlist ones were greedy and wanted an extra parameter: hlist_for_each_entry(tpos, pos, head, member) Why did they need an extra pos parameter? I'm not quite sure. Not only they don't really need it, it also prevents the iterator from looking exactly like the list iterator, which is unfortunate. Besides the semantic patch, there was some manual work required: - Fix up the actual hlist iterators in linux/list.h - Fix up the declaration of other iterators based on the hlist ones. - A very small amount of places were using the 'node' parameter, this was modified to use 'obj->member' instead. - Coccinelle didn't handle the hlist_for_each_entry_safe iterator properly, so those had to be fixed up manually. The semantic patch which is mostly the work of Peter Senna Tschudin is here: @@ iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host; type T; expression a,c,d,e; identifier b; statement S; @@ -T b; <+... when != b ( hlist_for_each_entry(a, - b, c, d) S | hlist_for_each_entry_continue(a, - b, c) S | hlist_for_each_entry_from(a, - b, c) S | hlist_for_each_entry_rcu(a, - b, c, d) S | hlist_for_each_entry_rcu_bh(a, - b, c, d) S | hlist_for_each_entry_continue_rcu_bh(a, - b, c) S | for_each_busy_worker(a, c, - b, d) S | ax25_uid_for_each(a, - b, c) S | ax25_for_each(a, - b, c) S | inet_bind_bucket_for_each(a, - b, c) S | sctp_for_each_hentry(a, - b, c) S | sk_for_each(a, - b, c) S | sk_for_each_rcu(a, - b, c) S | sk_for_each_from -(a, b) +(a) S + sk_for_each_from(a) S | sk_for_each_safe(a, - b, c, d) S | sk_for_each_bound(a, - b, c) S | hlist_for_each_entry_safe(a, - b, c, d, e) S | hlist_for_each_entry_continue_rcu(a, - b, c) S | nr_neigh_for_each(a, - b, c) S | nr_neigh_for_each_safe(a, - b, c, d) S | nr_node_for_each(a, - b, c) S | nr_node_for_each_safe(a, - b, c, d) S | - for_each_gfn_sp(a, c, d, b) S + for_each_gfn_sp(a, c, d) S | - for_each_gfn_indirect_valid_sp(a, c, d, b) S + for_each_gfn_indirect_valid_sp(a, c, d) S | for_each_host(a, - b, c) S | for_each_host_safe(a, - b, c, d) S | for_each_mesh_entry(a, - b, c, d) S ) ...+> [akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c] [akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c] [akpm@linux-foundation.org: checkpatch fixes] [akpm@linux-foundation.org: fix warnings] [akpm@linux-foudnation.org: redo intrusive kvm changes] Tested-by:
Peter Senna Tschudin <peter.senna@gmail.com> Acked-by:
Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by:
Sasha Levin <sasha.levin@oracle.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Gleb Natapov <gleb@redhat.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Cyrill Gorcunov authored
Since kcmp syscall has been implemented (initially on x86 architecture) a number of other archs wire it up as well: xtensa, sparc, sh, s390, mips, microblaze, m68k (not taking into account those who uses <asm-generic/unistd.h> for syscall numbers definitions). But the Makefile, which turns kcmp.o generation on still depends on former config-x86. Thus get rid of this limitation and make kcmp.o depend on CHECKPOINT_RESTORE option. Signed-off-by:
Cyrill Gorcunov <gorcunov@openvz.org> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Andrey Vagin <avagin@openvz.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Stefani Seibold authored
Move kfifo.c from kernel/ to lib/ Signed-off-by:
Stefani Seibold <stefani@seibold.net> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Yuanhan Liu authored
Fix the wrong comment about the return value of clone_uts_ns() Signed-off-by:
Yuanhan Liu <yuanhan.liu@linux.intel.com> Acked-by:
Serge Hallyn <serge.hallyn@canonical.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Yuanhan Liu authored
Put get/get_uts() into CONFIG_PROC_SYSCTL code block as they are used only when CONFIG_PROC_SYSCTL is enabled. Signed-off-by:
Yuanhan Liu <yuanhan.liu@linux.intel.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Xi Wang authored
The null check of `strchr() + 1' is broken, which is always non-null, leading to OOB read. Instead, check the result of strchr(). Signed-off-by:
Xi Wang <xi.wang@gmail.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Tejun Heo authored
Convert to the much saner new idr interface. Signed-off-by:
Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-