- Jan 23, 2016
-
-
Tetsuo Handa authored
There are many locations that do if (memory_was_allocated_by_vmalloc) vfree(ptr); else kfree(ptr); but kvfree() can handle both kmalloc()ed memory and vmalloc()ed memory using is_vmalloc_addr(). Unless callers have special reasons, we can replace this branch with kvfree(). Please check and reply if you found problems. Signed-off-by:
Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by:
Michal Hocko <mhocko@suse.com> Acked-by:
Jan Kara <jack@suse.com> Acked-by:
Russell King <rmk+kernel@arm.linux.org.uk> Reviewed-by:
Andreas Dilger <andreas.dilger@intel.com> Acked-by:
"Rafael J. Wysocki" <rjw@rjwysocki.net> Acked-by:
David Rientjes <rientjes@google.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Oleg Drokin <oleg.drokin@intel.com> Cc: Boris Petkov <bp@suse.de> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- Jan 22, 2016
-
-
Al Viro authored
parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested}, inode_foo(inode) being mutex_foo(&inode->i_mutex). Please, use those for access to ->i_mutex; over the coming cycle ->i_mutex will become rwsem, with ->lookup() done with it held only shared. Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk>
-
- Jan 21, 2016
-
-
Ilya Dryomov authored
MClientMount{,Ack} are long gone. The receipt of bare monmap doesn't actually indicate a mount success as we are yet to authenticate at that point in time. Signed-off-by:
Ilya Dryomov <idryomov@gmail.com>
-
Ilya Dryomov authored
With it gone, no need to preserve ceph_timespec in process_one_ticket() either. Signed-off-by:
Ilya Dryomov <idryomov@gmail.com> Reviewed-by:
Sage Weil <sage@redhat.com>
-
Ilya Dryomov authored
If we fault due to authentication, we invalidate the service ticket we have and request a new one - the idea being that if a service rejected our authorizer, it must have expired, despite mon_client's attempts at periodic renewal. (The other possibility is that our ticket is too new and the service hasn't gotten it yet, in which case invalidating isn't necessary but doesn't hurt.) Invalidating just the service ticket is not enough, though. If we assume a failure on mon_client's part to renew a service ticket, we have to assume the same for the AUTH ticket. If our AUTH ticket is bad, we won't get any service tickets no matter how hard we try, so invalidate AUTH ticket along with the service ticket. Signed-off-by:
Ilya Dryomov <idryomov@gmail.com> Reviewed-by:
Sage Weil <sage@redhat.com>
-
Ilya Dryomov authored
Back in 2013, commit 4b8e8b5d ("libceph: fix authorizer invalidation") tried to fix authorizer invalidation issues by clearing validity field. However, nothing ever consults this field, so it doesn't force us to request any new secrets in any way and therefore we never get out of the exponential backoff mode: [ 129.973812] libceph: osd2 192.168.122.1:6810 connect authorization failure [ 130.706785] libceph: osd2 192.168.122.1:6810 connect authorization failure [ 131.710088] libceph: osd2 192.168.122.1:6810 connect authorization failure [ 133.708321] libceph: osd2 192.168.122.1:6810 connect authorization failure [ 137.706598] libceph: osd2 192.168.122.1:6810 connect authorization failure ... AFAICT this was the case at the time 4b8e8b5d was merged, too. Using timespec solely as a bool isn't nice, so introduce a new have_key flag, specifically for this purpose. Signed-off-by:
Ilya Dryomov <idryomov@gmail.com> Reviewed-by:
Sage Weil <sage@redhat.com>
-
Ilya Dryomov authored
Commit 20e55c4c ("libceph: clear messenger auth_retry flag when we authenticate") got us only half way there. We clear the flag if the second attempt succeeds, but it also needs to be cleared if that attempt fails, to allow for the exponential backoff to kick in. Otherwise, if ->should_authenticate() thinks our keys are valid, we will busy loop, incrementing auth_retry to no avail: process_connect ffff880079a63830 got BADAUTHORIZER attempt 1 process_connect ffff880079a63830 got BADAUTHORIZER attempt 2 process_connect ffff880079a63830 got BADAUTHORIZER attempt 3 process_connect ffff880079a63830 got BADAUTHORIZER attempt 4 process_connect ffff880079a63830 got BADAUTHORIZER attempt 5 ... Signed-off-by:
Ilya Dryomov <idryomov@gmail.com> Reviewed-by:
Sage Weil <sage@redhat.com>
-
Ilya Dryomov authored
There are a number of problems with revoking a "was sending" message: (1) We never make any attempt to revoke data - only kvecs contibute to con->out_skip. However, once the header (envelope) is written to the socket, our peer learns data_len and sets itself to expect at least data_len bytes to follow front or front+middle. If ceph_msg_revoke() is called while the messenger is sending message's data portion, anything we send after that call is counted by the OSD towards the now revoked message's data portion. The effects vary, the most common one is the eventual hang - higher layers get stuck waiting for the reply to the message that was sent out after ceph_msg_revoke() returned and treated by the OSD as a bunch of data bytes. This is what Matt ran into. (2) Flat out zeroing con->out_kvec_bytes worth of bytes to handle kvecs is wrong. If ceph_msg_revoke() is called before the tag is sent out or while the messenger is sending the header, we will get a connection reset, either due to a bad tag (0 is not a valid tag) or a bad header CRC, which kind of defeats the purpose of revoke. Currently the kernel client refuses to work with header CRCs disabled, but that will likely change in the future, making this even worse. (3) con->out_skip is not reset on connection reset, leading to one or more spurious connection resets if we happen to get a real one between con->out_skip is set in ceph_msg_revoke() and before it's cleared in write_partial_skip(). Fixing (1) and (3) is trivial. The idea behind fixing (2) is to never zero the tag or the header, i.e. send out tag+header regardless of when ceph_msg_revoke() is called. That way the header is always correct, no unnecessary resets are induced and revoke stands ready for disabled CRCs. Since ceph_msg_revoke() rips out con->out_msg, introduce a new "message out temp" and copy the header into it before sending. Cc: stable@vger.kernel.org # 4.0+ Reported-by:
Matt Conner <matt.conner@keepertech.com> Signed-off-by:
Ilya Dryomov <idryomov@gmail.com> Tested-by:
Matt Conner <matt.conner@keepertech.com> Reviewed-by:
Sage Weil <sage@redhat.com>
-
Geliang Tang authored
Use list_for_each_entry_safe() instead of list_for_each_safe() to simplify the code. Signed-off-by:
Geliang Tang <geliangtang@163.com> [idryomov@gmail.com: nuke call to list_splice_init() as well] Signed-off-by:
Ilya Dryomov <idryomov@gmail.com>
-
Geliang Tang authored
list_next_entry has been defined in list.h, so I replace list_entry_next with it. Signed-off-by:
Geliang Tang <geliangtang@163.com> Signed-off-by:
Ilya Dryomov <idryomov@gmail.com>
-
Vladimir Davydov authored
tcp_memcontrol.c only contains legacy memory.tcp.kmem.* file definitions and mem_cgroup->tcp_mem init/destroy stuff. This doesn't belong to network subsys. Let's move it to memcontrol.c. This also allows us to reuse generic code for handling legacy memcg files. Signed-off-by:
Vladimir Davydov <vdavydov@virtuozzo.com> Acked-by:
Johannes Weiner <hannes@cmpxchg.org> Cc: "David S. Miller" <davem@davemloft.net> Acked-by:
Michal Hocko <mhocko@suse.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Johannes Weiner authored
Let the user know that CONFIG_MEMCG_KMEM does not apply to the cgroup2 interface. This also makes legacy-only code sections stand out better. [arnd@arndb.de: mm: memcontrol: only manage socket pressure for CONFIG_INET] Signed-off-by:
Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Tejun Heo <tj@kernel.org> Acked-by:
Vladimir Davydov <vdavydov@virtuozzo.com> Signed-off-by:
Arnd Bergmann <arnd@arndb.de> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Johannes Weiner authored
This series adds accounting of the historical "kmem" memory consumers to the cgroup2 memory controller. These consumers include the dentry cache, the inode cache, kernel stack pages, and a few others that are pointed out in patch 7/8. The footprint of these consumers is directly tied to userspace activity in common workloads, and so they have to be part of the minimally viable configuration in order to present a complete feature to our users. The cgroup2 interface of the memory controller is far from complete, but this series, along with the socket memory accounting series, provides the final semantic changes for the existing memory knobs in the cgroup2 interface, which is scheduled for initial release in the next merge window. This patch (of 8): Remove unused css argument frmo memcg_init_kmem() Signed-off-by:
Johannes Weiner <hannes@cmpxchg.org> Acked-by:
Michal Hocko <mhocko@suse.com> Cc: Tejun Heo <tj@kernel.org> Acked-by:
Vladimir Davydov <vdavydov@virtuozzo.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Andrey Ryabinin authored
With upcoming CONFIG_UBSAN the following BUILD_BUG_ON in net/mac80211/debugfs.c starts to trigger: BUILD_BUG_ON(hw_flag_names[NUM_IEEE80211_HW_FLAGS] != (void *)0x1); It seems, that compiler instrumentation causes some code deoptimizations. Because of that GCC is not being able to resolve condition in BUILD_BUG_ON() at compile time. We could make size of hw_flag_names array unspecified and replace the condition in BUILD_BUG_ON() with following: ARRAY_SIZE(hw_flag_names) != NUM_IEEE80211_HW_FLAGS That will have the same effect as before (adding new flag without updating array will trigger build failure) except it doesn't fail with CONFIG_UBSAN. As a bonus this patch slightly decreases size of hw_flag_names array. Signed-off-by:
Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- Jan 19, 2016
-
-
Christoph Hellwig authored
We now alwasy have a per-PD local_dma_lkey available. Make use of that fact in svc_rdma and stop registering our own MR. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Sagi Grimberg <sagig@mellanox.com> Reviewed-by:
Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Reviewed-by:
Chuck Lever <chuck.lever@oracle.com> Reviewed-by:
Steve Wise <swise@opengridcomputing.com> Acked-by:
J. Bruce Fields <bfields@redhat.com> Signed-off-by:
Doug Ledford <dledford@redhat.com>
-
Chuck Lever authored
To support the server-side of an NFSv4.1 backchannel on RDMA connections, add a transport class that enables backward direction messages on an existing forward channel connection. Signed-off-by:
Chuck Lever <chuck.lever@oracle.com> Acked-by:
Bruce Fields <bfields@fieldses.org> Signed-off-by:
Doug Ledford <dledford@redhat.com>
-
Chuck Lever authored
Extra resources for handling backchannel requests have to be pre-allocated when a transport instance is created. Set up additional fields in svcxprt_rdma to track these resources. The max_requests fields are elements of the RPC-over-RDMA protocol, so they should be u32. To ensure that unsigned arithmetic is used everywhere, some other fields in the svcxprt_rdma struct are updated. Signed-off-by:
Chuck Lever <chuck.lever@oracle.com> Acked-by:
Bruce Fields <bfields@fieldses.org> Signed-off-by:
Doug Ledford <dledford@redhat.com>
-
Chuck Lever authored
Pre-requisite to use map_xdr in the backchannel code. Signed-off-by:
Chuck Lever <chuck.lever@oracle.com> Acked-by:
Bruce Fields <bfields@fieldses.org> Signed-off-by:
Doug Ledford <dledford@redhat.com>
-
Chuck Lever authored
Clean up. These functions can otherwise fail, so check for page allocation failures too. Signed-off-by:
Chuck Lever <chuck.lever@oracle.com> Acked-by:
Bruce Fields <bfields@fieldses.org> Signed-off-by:
Doug Ledford <dledford@redhat.com>
-
Chuck Lever authored
svc_rdma_post_recv() allocates pages for receive buffers on-demand. It uses GFP_KERNEL so the allocator tries hard, and may sleep. But I'm about to add a call to svc_rdma_post_recv() from a function that may not sleep. Since all svc_rdma_post_recv() call sites can tolerate its failure, allow it to fail if the page allocator returns nothing. Longer term, receive buffers, being a finite resource per-connection, should be pre-allocated and re-used. Signed-off-by:
Chuck Lever <chuck.lever@oracle.com> Acked-by:
Bruce Fields <bfields@fieldses.org> Signed-off-by:
Doug Ledford <dledford@redhat.com>
-
Chuck Lever authored
Clean up. Signed-off-by:
Chuck Lever <chuck.lever@oracle.com> Acked-by:
Bruce Fields <bfields@fieldses.org> Signed-off-by:
Doug Ledford <dledford@redhat.com>
-
Chuck Lever authored
To ensure this allocation cannot fail and will not sleep, pre-allocate the req_map structures per-connection. Signed-off-by:
Chuck Lever <chuck.lever@oracle.com> Acked-by:
Bruce Fields <bfields@fieldses.org> Signed-off-by:
Doug Ledford <dledford@redhat.com>
-
Chuck Lever authored
When the maximum payload size of NFS READ and WRITE was increased by commit cc9a903d ("svcrdma: Change maximum server payload back to RPCSVC_MAXPAYLOAD"), the size of struct svc_rdma_op_ctxt increased to over 6KB (on x86_64). That makes allocating one of these from a kmem_cache more likely to fail in situations when system memory is exhausted. Since I'm about to add a caller where this allocation must always work _and_ it cannot sleep, pre-allocate ctxts for each connection. Another motivation for this change is that NFSv4.x servers are required by specification not to drop NFS requests. Pre-allocating memory resources reduces the likelihood of a drop. Signed-off-by:
Chuck Lever <chuck.lever@oracle.com> Acked-by:
Bruce Fields <bfields@fieldses.org> Signed-off-by:
Doug Ledford <dledford@redhat.com>
-
Chuck Lever authored
Be sure the completed ctxt is put in every path. The xprt enqueue can take a while, so put the completed ctxt back in circulation _before_ enqueuing the xprt. Remove/disable debugging. Signed-off-by:
Chuck Lever <chuck.lever@oracle.com> Acked-by:
Bruce Fields <bfields@fieldses.org> Signed-off-by:
Doug Ledford <dledford@redhat.com>
-
Chuck Lever authored
kzalloc is used here, so setting the atomic fields to zero is unnecessary. sc_ord is set again in handle_connect_req. The other fields are re-initialized in svc_rdma_accept(). Signed-off-by:
Chuck Lever <chuck.lever@oracle.com> Acked-by:
Bruce Fields <bfields@fieldses.org> Signed-off-by:
Doug Ledford <dledford@redhat.com>
-
- Jan 18, 2016
-
-
Hannes Frederic Sowa authored
It was seen that defective configurations of openvswitch could overwrite the STACK_END_MAGIC and cause a hard crash of the kernel because of too many recursions within ovs. This problem arises due to the high stack usage of openvswitch. The rest of the kernel is fine with the current limit of 10 (RECURSION_LIMIT). We use the already existing recursion counter in ovs_execute_actions to implement an upper bound of 5 recursions. Cc: Pravin Shelar <pshelar@ovn.org> Cc: Simon Horman <simon.horman@netronome.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Simon Horman <simon.horman@netronome.com> Signed-off-by:
Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Xin Long authored
Re-establish the previous behavior and avoid hashing temporary asocs by checking t->asoc->temp in sctp_(un)hash_transport. Also, remove the check of t->asoc->temp in __sctp_lookup_association, since they are never hashed now. Fixes: 4f008781 ("sctp: apply rhashtable api to send/recv path") Signed-off-by:
Xin Long <lucien.xin@gmail.com> Acked-by:
Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Reported-by:
Vlad Yasevich <vyasevich@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Jan 16, 2016
-
-
Sven Eckelmann authored
It is not allowed to free the memory of an object which is part of a list which is protected by rcu-read-side-critical sections without making sure that no other context is accessing the object anymore. This usually happens by removing the references to this object and then waiting until the rcu grace period is over and no one (allowedly) accesses it anymore. But the _now functions ignore this completely. They free the object directly even when a different context still tries to access it. This has to be avoided and thus these functions must be removed and all functions have to use batadv_orig_node_free_ref. Fixes: 72822225 ("batman-adv: Fix rcu_barrier() miss due to double call_rcu() in TT code") Signed-off-by:
Sven Eckelmann <sven@narfation.org> Signed-off-by:
Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by:
Antonio Quartulli <a@unstable.cc>
-
Sven Eckelmann authored
It is not allowed to free the memory of an object which is part of a list which is protected by rcu-read-side-critical sections without making sure that no other context is accessing the object anymore. This usually happens by removing the references to this object and then waiting until the rcu grace period is over and no one (allowedly) accesses it anymore. But the _now functions ignore this completely. They free the object directly even when a different context still tries to access it. This has to be avoided and thus these functions must be removed and all functions have to use batadv_hardif_free_ref. Fixes: 89652331 ("batman-adv: split tq information in neigh_node struct") Signed-off-by:
Sven Eckelmann <sven@narfation.org> Signed-off-by:
Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by:
Antonio Quartulli <a@unstable.cc>
-
Sven Eckelmann authored
It is not allowed to free the memory of an object which is part of a list which is protected by rcu-read-side-critical sections without making sure that no other context is accessing the object anymore. This usually happens by removing the references to this object and then waiting until the rcu grace period is over and no one (allowedly) accesses it anymore. But the _now functions ignore this completely. They free the object directly even when a different context still tries to access it. This has to be avoided and thus these functions must be removed and all functions have to use batadv_neigh_ifinfo_free_ref. Fixes: 89652331 ("batman-adv: split tq information in neigh_node struct") Signed-off-by:
Sven Eckelmann <sven@narfation.org> Signed-off-by:
Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by:
Antonio Quartulli <a@unstable.cc>
-
Sven Eckelmann authored
It is not allowed to free the memory of an object which is part of a list which is protected by rcu-read-side-critical sections without making sure that no other context is accessing the object anymore. This usually happens by removing the references to this object and then waiting until the rcu grace period is over and no one (allowedly) accesses it anymore. But the _now functions ignore this completely. They free the object directly even when a different context still tries to access it. This has to be avoided and thus these functions must be removed and all functions have to use batadv_hardif_neigh_free_ref. Fixes: cef63419 ("batman-adv: add list of unique single hop neighbors per hard-interface") Signed-off-by:
Sven Eckelmann <sven@narfation.org> Signed-off-by:
Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by:
Antonio Quartulli <a@unstable.cc>
-
Sven Eckelmann authored
It is not allowed to free the memory of an object which is part of a list which is protected by rcu-read-side-critical sections without making sure that no other context is accessing the object anymore. This usually happens by removing the references to this object and then waiting until the rcu grace period is over and no one (allowedly) accesses it anymore. But the _now functions ignore this completely. They free the object directly even when a different context still tries to access it. This has to be avoided and thus these functions must be removed and all functions have to use batadv_neigh_node_free_ref. Fixes: 89652331 ("batman-adv: split tq information in neigh_node struct") Signed-off-by:
Sven Eckelmann <sven@narfation.org> Signed-off-by:
Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by:
Antonio Quartulli <a@unstable.cc>
-
Sven Eckelmann authored
It is not allowed to free the memory of an object which is part of a list which is protected by rcu-read-side-critical sections without making sure that no other context is accessing the object anymore. This usually happens by removing the references to this object and then waiting until the rcu grace period is over and no one (allowedly) accesses it anymore. But the _now functions ignore this completely. They free the object directly even when a different context still tries to access it. This has to be avoided and thus these functions must be removed and all functions have to use batadv_orig_ifinfo_free_ref. Fixes: 7351a482 ("batman-adv: split out router from orig_node") Signed-off-by:
Sven Eckelmann <sven@narfation.org> Signed-off-by:
Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by:
Antonio Quartulli <a@unstable.cc>
-
Sven Eckelmann authored
The batadv_nc_node_free_ref function uses call_rcu to delay the free of the batadv_nc_node object until no (already started) rcu_read_lock is enabled anymore. This makes sure that no context is still trying to access the object which should be removed. But batadv_nc_node also contains a reference to orig_node which must be removed. The reference drop of orig_node was done in the call_rcu function batadv_nc_node_free_rcu but should actually be done in the batadv_nc_node_release function to avoid nested call_rcus. This is important because rcu_barrier (e.g. batadv_softif_free or batadv_exit) will not detect the inner call_rcu as relevant for its execution. Otherwise this barrier will most likely be inserted in the queue before the callback of the first call_rcu was executed. The caller of rcu_barrier will therefore continue to run before the inner call_rcu callback finished. Fixes: d56b1705 ("batman-adv: network coding - detect coding nodes and remove these after timeout") Signed-off-by:
Sven Eckelmann <sven@narfation.org> Signed-off-by:
Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by:
Antonio Quartulli <a@unstable.cc>
-
Sven Eckelmann authored
The batadv_claim_free_ref function uses call_rcu to delay the free of the batadv_bla_claim object until no (already started) rcu_read_lock is enabled anymore. This makes sure that no context is still trying to access the object which should be removed. But batadv_bla_claim also contains a reference to backbone_gw which must be removed. The reference drop of backbone_gw was done in the call_rcu function batadv_claim_free_rcu but should actually be done in the batadv_claim_release function to avoid nested call_rcus. This is important because rcu_barrier (e.g. batadv_softif_free or batadv_exit) will not detect the inner call_rcu as relevant for its execution. Otherwise this barrier will most likely be inserted in the queue before the callback of the first call_rcu was executed. The caller of rcu_barrier will therefore continue to run before the inner call_rcu callback finished. Fixes: 23721387 ("batman-adv: add basic bridge loop avoidance code") Signed-off-by:
Sven Eckelmann <sven@narfation.org> Acked-by:
Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by:
Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by:
Antonio Quartulli <a@unstable.cc>
-
- Jan 15, 2016
-
-
Nikolay Aleksandrov authored
After promisc mode management was introduced a bridge device could do dev_set_promiscuity from its ndo_change_rx_flags() callback which in turn can be called after the bridge's addr_list_lock has been taken (e.g. by dev_uc_add). This causes a false positive lockdep splat because the port interfaces' addr_list_lock is taken when br_manage_promisc() runs after the bridge's addr list lock was already taken. To remove the false positive introduce a custom bridge addr_list_lock class and set it on bridge init. A simple way to reproduce this is with the following: $ brctl addbr br0 $ ip l add l br0 br0.100 type vlan id 100 $ ip l set br0 up $ ip l set br0.100 up $ echo 1 > /sys/class/net/br0/bridge/vlan_filtering $ brctl addif br0 eth0 Splat: [ 43.684325] ============================================= [ 43.684485] [ INFO: possible recursive locking detected ] [ 43.684636] 4.4.0-rc8+ #54 Not tainted [ 43.684755] --------------------------------------------- [ 43.684906] brctl/1187 is trying to acquire lock: [ 43.685047] (_xmit_ETHER){+.....}, at: [<ffffffff8150169e>] dev_set_rx_mode+0x1e/0x40 [ 43.685460] but task is already holding lock: [ 43.685618] (_xmit_ETHER){+.....}, at: [<ffffffff815072a7>] dev_uc_add+0x27/0x80 [ 43.686015] other info that might help us debug this: [ 43.686316] Possible unsafe locking scenario: [ 43.686743] CPU0 [ 43.686967] ---- [ 43.687197] lock(_xmit_ETHER); [ 43.687544] lock(_xmit_ETHER); [ 43.687886] *** DEADLOCK *** [ 43.688438] May be due to missing lock nesting notation [ 43.688882] 2 locks held by brctl/1187: [ 43.689134] #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff81510317>] rtnl_lock+0x17/0x20 [ 43.689852] #1: (_xmit_ETHER){+.....}, at: [<ffffffff815072a7>] dev_uc_add+0x27/0x80 [ 43.690575] stack backtrace: [ 43.690970] CPU: 0 PID: 1187 Comm: brctl Not tainted 4.4.0-rc8+ #54 [ 43.691270] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150318_183358- 04/01/2014 [ 43.691770] ffffffff826a25c0 ffff8800369fb8e0 ffffffff81360ceb ffffffff826a25c0 [ 43.692425] ffff8800369fb9b8 ffffffff810d0466 ffff8800369fb968 ffffffff81537139 [ 43.693071] ffff88003a08c880 0000000000000000 00000000ffffffff 0000000002080020 [ 43.693709] Call Trace: [ 43.693931] [<ffffffff81360ceb>] dump_stack+0x4b/0x70 [ 43.694199] [<ffffffff810d0466>] __lock_acquire+0x1e46/0x1e90 [ 43.694483] [<ffffffff81537139>] ? netlink_broadcast_filtered+0x139/0x3e0 [ 43.694789] [<ffffffff8153b5da>] ? nlmsg_notify+0x5a/0xc0 [ 43.695064] [<ffffffff810d10f5>] lock_acquire+0xe5/0x1f0 [ 43.695340] [<ffffffff8150169e>] ? dev_set_rx_mode+0x1e/0x40 [ 43.695623] [<ffffffff815edea5>] _raw_spin_lock_bh+0x45/0x80 [ 43.695901] [<ffffffff8150169e>] ? dev_set_rx_mode+0x1e/0x40 [ 43.696180] [<ffffffff8150169e>] dev_set_rx_mode+0x1e/0x40 [ 43.696460] [<ffffffff8150189c>] dev_set_promiscuity+0x3c/0x50 [ 43.696750] [<ffffffffa0586845>] br_port_set_promisc+0x25/0x50 [bridge] [ 43.697052] [<ffffffffa05869aa>] br_manage_promisc+0x8a/0xe0 [bridge] [ 43.697348] [<ffffffffa05826ee>] br_dev_change_rx_flags+0x1e/0x20 [bridge] [ 43.697655] [<ffffffff81501532>] __dev_set_promiscuity+0x132/0x1f0 [ 43.697943] [<ffffffff81501672>] __dev_set_rx_mode+0x82/0x90 [ 43.698223] [<ffffffff815072de>] dev_uc_add+0x5e/0x80 [ 43.698498] [<ffffffffa05b3c62>] vlan_device_event+0x542/0x650 [8021q] [ 43.698798] [<ffffffff8109886d>] notifier_call_chain+0x5d/0x80 [ 43.699083] [<ffffffff810988b6>] raw_notifier_call_chain+0x16/0x20 [ 43.699374] [<ffffffff814f456e>] call_netdevice_notifiers_info+0x6e/0x80 [ 43.699678] [<ffffffff814f4596>] call_netdevice_notifiers+0x16/0x20 [ 43.699973] [<ffffffffa05872be>] br_add_if+0x47e/0x4c0 [bridge] [ 43.700259] [<ffffffffa058801e>] add_del_if+0x6e/0x80 [bridge] [ 43.700548] [<ffffffffa0588b5f>] br_dev_ioctl+0xaf/0xc0 [bridge] [ 43.700836] [<ffffffff8151a7ac>] dev_ifsioc+0x30c/0x3c0 [ 43.701106] [<ffffffff8151aac9>] dev_ioctl+0xf9/0x6f0 [ 43.701379] [<ffffffff81254345>] ? mntput_no_expire+0x5/0x450 [ 43.701665] [<ffffffff812543ee>] ? mntput_no_expire+0xae/0x450 [ 43.701947] [<ffffffff814d7b02>] sock_do_ioctl+0x42/0x50 [ 43.702219] [<ffffffff814d8175>] sock_ioctl+0x1e5/0x290 [ 43.702500] [<ffffffff81242d0b>] do_vfs_ioctl+0x2cb/0x5c0 [ 43.702771] [<ffffffff81243079>] SyS_ioctl+0x79/0x90 [ 43.703033] [<ffffffff815eebb6>] entry_SYSCALL_64_fastpath+0x16/0x7a CC: Vlad Yasevich <vyasevic@redhat.com> CC: Stephen Hemminger <stephen@networkplumber.org> CC: Bridge list <bridge@lists.linux-foundation.org> CC: Andy Gospodarek <gospo@cumulusnetworks.com> CC: Roopa Prabhu <roopa@cumulusnetworks.com> Fixes: 2796d0c6 ("bridge: Automatically manage port promiscuous mode.") Reported-by:
Andy Gospodarek <gospo@cumulusnetworks.com> Signed-off-by:
Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Geert Uytterhoeven authored
net/sctp/proc.c: In function ‘sctp_transport_get_idx’: net/sctp/proc.c:313: warning: ‘obj’ may be used uninitialized in this function This is currently a false positive, as all callers check for a zero offset first, and handle this case in the exact same way. Move the check and handling into sctp_transport_get_idx() to kill the compiler warning, and avoid future bugs. Signed-off-by:
Geert Uytterhoeven <geert@linux-m68k.org> Acked-by:
Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
When a tunnel decapsulates the outer header, it has to comply with RFC 6080 and eventually propagate CE mark into inner header. It turns out IP6_ECN_set_ce() does not correctly update skb->csum for CHECKSUM_COMPLETE packets, triggering infamous "hw csum failure" messages and stack traces. Signed-off-by:
Eric Dumazet <edumazet@google.com> Acked-by:
Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Xin Long authored
Now, when we sendmsg, we translate the ep to laddr by selecting the first element of the list, and then do a lookup for a transport. But sctp_hash_cmp() will compare it against asoc addr_list, which may be a subset of ep addr_list, meaning that this chosen laddr may not be there, and thus making it impossible to find the transport. So we fix it by using ep + paddr to lookup transports in hashtable. In sctp_hash_cmp, if .ep is set, we will check if this ep == asoc->ep, or we will do the laddr check. Fixes: d6c0256a ("sctp: add the rhashtable apis for sctp global transport hashtable") Signed-off-by:
Xin Long <lucien.xin@gmail.com> Acked-by:
Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Reported-by:
Vlad Yasevich <vyasevich@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Konstantin Khlebnikov authored
Skb_gso_segment() uses skb control block during segmentation. This patch adds 32-bytes room for previous control block which will be copied into all resulting segments. This patch fixes kernel crash during fragmenting forwarded packets. Fragmentation requires valid IP CB in skb for clearing ip options. Also patch removes custom save/restore in ovs code, now it's redundant. Signed-off-by:
Konstantin Khlebnikov <koct9i@gmail.com> Link: http://lkml.kernel.org/r/CALYGNiP-0MZ-FExV2HutTvE9U-QQtkKSoE--KN=JQE5STYsjAA@mail.gmail.com Signed-off-by:
David S. Miller <davem@davemloft.net>
-