- Aug 05, 2009
-
-
Jan Engelhardt authored
String literals are constant, and usually, we can also tag the array of pointers const too, moving it to the .rodata section. Signed-off-by:
Jan Engelhardt <jengelh@medozas.de> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Aug 03, 2009
-
-
Eric Dumazet authored
Current neigh_periodic_timer() function is fired by timer IRQ, and scans one hash bucket each round (very litle work in fact) As we are supposed to scan whole hash table in 15 seconds, this means neigh_periodic_timer() can be fired very often. (depending on the number of concurrent hash entries we stored in this table) Converting this to a workqueue permits scanning whole table, minimizing icache pollution, and firing this work every 15 seconds, independantly of hash table size. This 15 seconds delay is not a hard number, as work is a deferrable one. Signed-off-by:
Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Jul 27, 2009
-
-
Johannes Berg authored
In order to make cfg80211/nl80211 aware of network namespaces, we have to do the following things: * del_virtual_intf method takes an interface index rather than a netdev pointer - simply change this * nl80211 uses init_net a lot, it changes to use the sender's network namespace * scan requests use the interface index, hold a netdev pointer and reference instead * we want a wiphy and its associated virtual interfaces to be in one netns together, so - we need to be able to change ns for a given interface, so export dev_change_net_namespace() - for each virtual interface set the NETIF_F_NETNS_LOCAL flag, and clear that flag only when the wiphy changes ns, to disallow breaking this invariant * when a network namespace goes away, we need to reparent the wiphy to init_net * cfg80211 users that support creating virtual interfaces must create them in the wiphy's namespace, currently this affects only mac80211 The end result is that you can now switch an entire wiphy into a different network namespace with the new command iw phy#<idx> set netns <pid> and all virtual interfaces will follow (or the operation fails). Signed-off-by:
Johannes Berg <johannes@sipsolutions.net> Signed-off-by:
John W. Linville <linville@tuxdriver.com>
-
Eric Dumazet authored
Signed-off-by:
Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Sridhar Samudrala authored
This helps avoid error messages with ethtool -k on devices that don't provide device specific routines. Signed-off-by:
Sridhar Samudrala <sri@us.ibm.com> ------------------------------------------------------------------ Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Jul 24, 2009
-
-
Johannes Berg authored
mac80211 required this due to the master netdev, but now it can put all information into skb->cb and this can go. Signed-off-by:
Johannes Berg <johannes@sipsolutions.net> Signed-off-by:
John W. Linville <linville@tuxdriver.com>
-
Johannes Berg authored
For mac80211, with the master netdev removal, we need to be able to sync a multicast address list onto another list that is not tracked within a netdev, so we need access to the functions doing that. Signed-off-by:
Johannes Berg <johannes@sipsolutions.net> Signed-off-by:
John W. Linville <linville@tuxdriver.com>
-
- Jul 20, 2009
-
-
Rémi Denis-Courmont authored
I guess it should be -EINVAL rather than EINVAL. I have not checked when the bug came in. Perhaps a candidate for -stable? Signed-off-by:
Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Jul 17, 2009
-
-
Eric Dumazet authored
Commit e912b114 (net: sk_prot_alloc() should not blindly overwrite memory) took care of not zeroing whole new socket at allocation time. sock_copy() is another spot where we should be very careful. We should not set refcnt to a non null value, until we are sure other fields are correctly setup, or a lockless reader could catch this socket by mistake, while not fully (re)initialized. This patch puts sk_node & sk_refcnt to the very beginning of struct sock to ease sock_copy() & sk_prot_alloc() job. We add appropriate smp_wmb() before sk_refcnt initializations to match our RCU requirements (changes to sock keys should be committed to memory before sk_refcnt setting) Signed-off-by:
Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Jul 13, 2009
-
-
Tobias Klauser authored
Rename lookup_neigh_params to lookup_neigh_parms as the struct is named neigh_parms and all other functions dealing with the struct carry neigh_parms in their names. Signed-off-by:
Tobias Klauser <klto@zhaw.ch> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Jul 12, 2009
-
-
Johannes Berg authored
The function get_net_ns_by_pid(), to get a network namespace from a pid_t, will be required in cfg80211 as well. Therefore, let's move it to net_namespace.c and export it. We can't make it a static inline in the !NETNS case because it needs to verify that the given pid even exists (and return -ESRCH). Signed-off-by:
Johannes Berg <johannes@sipsolutions.net> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Johannes Berg authored
All we need to take care of is using proper RCU list add/del primitives and inserting a synchronize_rcu() at one place to make sure the exit notifiers are run after everybody has stopped iterating the list. Signed-off-by:
Johannes Berg <johannes@sipsolutions.net> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
Some sockets use SLAB_DESTROY_BY_RCU, and our RCU code correctness depends on sk->sk_nulls_node.next being always valid. A NULL value is not allowed as it might fault a lockless reader. Current sk_prot_alloc() implementation doesnt respect this hypothesis, calling kmem_cache_alloc() with __GFP_ZERO. Just call memset() around the forbidden field. Signed-off-by:
Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Jul 10, 2009
-
-
Jiri Olsa authored
Adding memory barrier after the poll_wait function, paired with receive callbacks. Adding fuctions sock_poll_wait and sk_has_sleeper to wrap the memory barrier. Without the memory barrier, following race can happen. The race fires, when following code paths meet, and the tp->rcv_nxt and __add_wait_queue updates stay in CPU caches. CPU1 CPU2 sys_select receive packet ... ... __add_wait_queue update tp->rcv_nxt ... ... tp->rcv_nxt check sock_def_readable ... { schedule ... if (sk->sk_sleep && waitqueue_active(sk->sk_sleep)) wake_up_interruptible(sk->sk_sleep) ... } If there was no cache the code would work ok, since the wait_queue and rcv_nxt are opposit to each other. Meaning that once tp->rcv_nxt is updated by CPU2, the CPU1 either already passed the tp->rcv_nxt check and sleeps, or will get the new value for tp->rcv_nxt and will return with new data mask. In both cases the process (CPU1) is being added to the wait queue, so the waitqueue_active (CPU2) call cannot miss and will wake up CPU1. The bad case is when the __add_wait_queue changes done by CPU1 stay in its cache, and so does the tp->rcv_nxt update on CPU2 side. The CPU1 will then endup calling schedule and sleep forever if there are no more data on the socket. Calls to poll_wait in following modules were ommited: net/bluetooth/af_bluetooth.c net/irda/af_irda.c net/irda/irnet/irnet_ppp.c net/mac80211/rc80211_pid_debugfs.c net/phonet/socket.c net/rds/af_rds.c net/rfkill/core.c net/sunrpc/cache.c net/sunrpc/rpc_pipe.c net/tipc/socket.c Signed-off-by:
Jiri Olsa <jolsa@redhat.com> Signed-off-by:
Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Jul 09, 2009
-
-
Anton Vorontsov authored
Using early netconsole and gianfar driver this error pops up: netconsole: timeout waiting for carrier It appears that net/core/netpoll.c:netpoll_setup() is using cond_resched() in a loop waiting for a carrier. The thing is that cond_resched() is a no-op when system_state != SYSTEM_RUNNING, and so drivers/net/phy/phy.c's state_queue is never scheduled, therefore link detection doesn't work. I belive that the main problem is in cond_resched()[1], but despite how the cond_resched() story ends, it might be a good idea to call msleep(1) instead of cond_resched(), as suggested by Andrew Morton. [1] http://lkml.org/lkml/2009/7/7/463 Signed-off-by:
Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Jul 08, 2009
-
-
Anton Vorontsov authored
Some PHYs require longer timeouts for carrier detection, and auto-negotiation process may take indefinite amount of time. It may be inconvenient to force longer timeouts for sane PHYs, so let's introduce a kernel command line option. Since we're using module_param(), the option also can be changed in runtime. Signed-off-by:
Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Jul 06, 2009
-
-
Patrick McHardy authored
This patch converts the remaining occurences of raw return values to their symbolic counterparts in ndo_start_xmit() functions that were missed by the previous automatic conversion. Additionally code that assumed the symbolic value of NETDEV_TX_OK to be zero is changed to explicitly use NETDEV_TX_OK. Signed-off-by:
Patrick McHardy <kaber@trash.net> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Jun 27, 2009
-
-
Herbert Xu authored
When NAPI is disabled while we're in net_rx_action, we end up calling __napi_complete without flushing GRO packets. This is a bug as it would cause the GRO packets to linger, of course it also literally BUGs to catch error like this :) This patch changes it to napi_complete, with the obligatory IRQ reenabling. This should be safe because we've only just disabled IRQs and it does not materially affect the test conditions in between. Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Jun 23, 2009
-
-
Herbert Xu authored
In order to get the tun driver to account packets, we need to be able to receive packets with destructors set. To be on the safe side, I added an skb_orphan call for all protocols by default since some of them (IP in particular) cannot handle receiving packets destructors properly. Now it seems that at least one protocol (CAN) expects to be able to pass skb->sk through the rx path without getting clobbered. So this patch attempts to fix this properly by moving the skb_orphan call to where it's actually needed. In particular, I've added it to skb_set_owner_[rw] which is what most users of skb->destructor call. This is actually an improvement for tun too since it means that we only give back the amount charged to the socket when the skb is passed to another socket that will also be charged accordingly. Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au> Tested-by:
Oliver Hartkopp <olver@hartkopp.net> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Jun 18, 2009
-
-
Jiri Pirko authored
This patch is inspired by patch recently posted by Johannes Berg. Basically what my patch does is to group list and a count of addresses into newly introduced structure netdev_hw_addr_list. This brings us two benefits: 1) struct net_device becames a bit nicer. 2) in the future there will be a possibility to operate with lists independently on netdevices (with exporting right functions). I wanted to introduce this patch before I'll post a multicast lists conversion. Signed-off-by:
Jiri Pirko <jpirko@redhat.com> drivers/net/bnx2.c | 4 +- drivers/net/e1000/e1000_main.c | 4 +- drivers/net/ixgbe/ixgbe_main.c | 6 +- drivers/net/mv643xx_eth.c | 2 +- drivers/net/niu.c | 4 +- drivers/net/virtio_net.c | 10 ++-- drivers/s390/net/qeth_l2_main.c | 2 +- include/linux/netdevice.h | 17 +++-- net/core/dev.c | 130 ++++++++++++++++++-------------------- 9 files changed, 89 insertions(+), 90 deletions(-) Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Stephen Hemminger authored
The skb mac_header field is sometimes NULL (or ~0u) as a sentinel value. The places where skb is expanded add an offset which would change this flag into an invalid pointer (or offset). Signed-off-by:
Stephen Hemminger <shemminger@vyatta.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Stephen Hemminger authored
Looking at the crash in log_martians(), one suspect is that the check for mac header being set is not correct. The value of mac_header defaults to 0 on allocation, therefore skb_mac_header_was_set will always be true on platforms using NET_SKBUFF_USES_OFFSET. Signed-off-by:
Stephen Hemminger <shemminger@vyatta.com> Acked-by:
Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Jun 15, 2009
-
-
Vegard Nossum authored
2009/2/24 Ingo Molnar <mingo@elte.hu>: > ok, this is the last warning i have from today's overnight -tip > testruns - a 32-bit system warning in sock_init_data(): > > [ 2.610389] NET: Registered protocol family 16 > [ 2.616138] initcall netlink_proto_init+0x0/0x170 returned 0 after 7812 usecs > [ 2.620010] WARNING: kmemcheck: Caught 32-bit read from uninitialized memory (f642c184) > [ 2.624002] 010000000200000000000000604990c000000000000000000000000000000000 > [ 2.634076] i i i i i i u u i i i i i i i i i i i i i i i i i i i i i i i i > [ 2.641038] ^ > [ 2.643376] > [ 2.644004] Pid: 1, comm: swapper Not tainted (2.6.29-rc6-tip-01751-g4d1c22c-dirty #885) > [ 2.648003] EIP: 0060:[<c07141a1>] EFLAGS: 00010282 CPU: 0 > [ 2.652008] EIP is at sock_init_data+0xa1/0x190 > [ 2.656003] EAX: 0001a800 EBX: f6836c00 ECX: 00463000 EDX: c0e46fe0 > [ 2.660003] ESI: f642c180 EDI: c0b83088 EBP: f6863ed8 ESP: c0c412ec > [ 2.664003] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 > [ 2.668003] CR0: 8005003b CR2: f682c400 CR3: 00b91000 CR4: 000006f0 > [ 2.672003] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000 > [ 2.676003] DR6: ffff4ff0 DR7: 00000400 > [ 2.680002] [<c07423e5>] __netlink_create+0x35/0xa0 > [ 2.684002] [<c07443cc>] netlink_kernel_create+0x4c/0x140 > [ 2.688002] [<c072755e>] rtnetlink_net_init+0x1e/0x40 > [ 2.696002] [<c071b601>] register_pernet_operations+0x11/0x30 > [ 2.700002] [<c071b72c>] register_pernet_subsys+0x1c/0x30 > [ 2.704002] [<c0bf3c8c>] rtnetlink_init+0x4c/0x100 > [ 2.708002] [<c0bf4669>] netlink_proto_init+0x159/0x170 > [ 2.712002] [<c0101124>] do_one_initcall+0x24/0x150 > [ 2.716002] [<c0bbf3c7>] do_initcalls+0x27/0x40 > [ 2.723201] [<c0bbf3fc>] do_basic_setup+0x1c/0x20 > [ 2.728002] [<c0bbfb8a>] kernel_init+0x5a/0xa0 > [ 2.732002] [<c0103e47>] kernel_thread_helper+0x7/0x10 > [ 2.736002] [<ffffffff>] 0xffffffff We fix this false positive by annotating the bitfield in struct sock. Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
-
Vegard Nossum authored
Signed-off-by:
Vegard Nossum <vegard.nossum@gmail.com>
-
- Jun 12, 2009
-
-
Michał Mirosław authored
This patch changes FDB entry check for ATM LANE bridge integration. There's no point in holding a FDB entry around SKB building. br_fdb_get()/br_fdb_put() pair are changed into single br_fdb_test_addr() hook that checks if the addr has FDB entry pointing to other port to the one the request arrived on. FDB entry refcounting is removed as it's not used anywhere else. Signed-off-by:
Michał Mirosław <mirq-linux@rere.qmqm.pl> Acked-by:
Stephen Hemminger <shemminger@vyatta.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
John Dykstra authored
Commit b00055aa " [NET] core: add RFC2863 operstate" defined new interface flag values. Its documentation specified that these flags could be accessed from user space via SIOCGIFFLAGS. However, this does not work because the new flags do not fit in that ioctl's argument width. Change the documentation to match the code's behavior. Also change the source to explicitly show the truncation. This _should_ have no effect on executable code, and did not with gcc 4.2.4 generating x86 code. A new ioctl could be defined to return all interface flags to user space. However, since this has been broken for three years with no one complaining, there doesn't seem much need. They are still accessible via netlink. Reported-by:
"Fredrik Arnerup" <fredrik.arnerup@edgeware.tv> Signed-off-by:
John Dykstra <john.dykstra1@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Jun 11, 2009
-
-
Timo Teras authored
The current code errors out the INCOMPLETE neigh entry skb queue only from the timer if maximum probes have been attempted and there has been no reply. This also causes the transtion to FAILED state. However, the neigh entry can be also updated via Netlink to inform that the address is unavailable. Currently, neigh_update() just stops the timers and leaves the pending skb's unreleased. This results that the clean up code in the timer callback is never called, preventing also proper garbage collection. This fixes neigh_update() to process the pending skb queue immediately if INCOMPLETE -> FAILED state transtion occurs due to a Netlink request. Signed-off-by:
Timo Teras <timo.teras@iki.fi> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
One of the problem with sock memory accounting is it uses a pair of sock_hold()/sock_put() for each transmitted packet. This slows down bidirectional flows because the receive path also needs to take a refcount on socket and might use a different cpu than transmit path or transmit completion path. So these two atomic operations also trigger cache line bounces. We can see this in tx or tx/rx workloads (media gateways for example), where sock_wfree() can be in top five functions in profiles. We use this sock_hold()/sock_put() so that sock freeing is delayed until all tx packets are completed. As we also update sk_wmem_alloc, we could offset sk_wmem_alloc by one unit at init time, until sk_free() is called. Once sk_free() is called, we atomic_dec_and_test(sk_wmem_alloc) to decrement initial offset and atomicaly check if any packets are in flight. skb_set_owner_w() doesnt call sock_hold() anymore sock_wfree() doesnt call sock_put() anymore, but check if sk_wmem_alloc reached 0 to perform the final freeing. Drawback is that a skb->truesize error could lead to unfreeable sockets, or even worse, prematurely calling __sk_free() on a live socket. Nice speedups on SMP. tbench for example, going from 2691 MB/s to 2711 MB/s on my 8 cpu dev machine, even if tbench was not really hitting sk_refcnt contention point. 5 % speedup on a UDP transmit workload (depends on number of flows), lowering TX completion cpu usage. Signed-off-by:
Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Jun 10, 2009
-
-
Johannes Berg authored
In order to handle powersave frames properly we had needed to pass these out to the device queues again, and introduce the skb->requeue bit. This, however, also has unnecessary overhead by needing to 'clean up' already tried frames, and this clean-up code is also buggy when software encryption is used. Instead of sending the frames via the master netdev queue again, simply put them into the pending queue. This also fixes a problem where frames for that particular station could be reordered when some were still on the software queues and older ones are re-injected into the software queue after them. Signed-off-by:
Johannes Berg <johannes@sipsolutions.net> Signed-off-by:
John W. Linville <linville@tuxdriver.com>
-
- Jun 09, 2009
-
-
Sergey Lapin authored
IEEE 802.15.4 stack requires several constants to be defined/adjusted. Signed-off-by:
Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> Signed-off-by:
Sergey Lapin <slapin@ossfans.org> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
commit f001fde5 (net: introduce a list of device addresses dev_addr_list (v6)) added one regression Vegard Nossum found in its testings. With kmemcheck help, Vegard found some uninitialized memory was read and reported to user, potentialy leaking kernel data. ( thread can be found on http://lkml.org/lkml/2009/5/30/177 ) dev_addr_init() incorrectly uses sizeof() operator. We were initializing one byte instead of MAX_ADDR_LEN bytes. Reported-by:
Vegard Nossum <vegard.nossum@gmail.com> Signed-off-by:
Eric Dumazet <eric.dumazet@gmail.com> Acked-by:
Jiri Pirko <jpirko@redhat.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Signed-off-by:
David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Signed-off-by:
David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Signed-off-by:
David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Jun 08, 2009
-
-
Figo.zhang authored
vfree() does its own 'NULL' check, so no need for check before calling it. Signed-off-by:
Figo.zhang <figo1802@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Sridhar Samudrala authored
Increment the iovec base by the offset passed in for the initial copy_to_user() in memcpy_to_iovecend(). Signed-off-by:
Sridhar Samudrala <sri@us.ibm.com> Acked-by:
Michael S. Tsirkin <mst@redhat.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Sridhar Samudrala authored
I am working on enabling UFO between KVM guests using virtio-net and i have some patches that i got working with 2.6.30-rc8. When i wanted to try them with net-next-2.6, i noticed that virtio-net is not working with that tree. After some debugging, it turned out to be several bugs in the recent patches to fix aio with tun driver, specifically the following 2 commits. http://git.kernel.org/?p=linux/kernel/git/davem/net-next-2.6.git;a=commitdiff;h=0a1ec07a67bd8b0033dace237249654d015efa21 http://git.kernel.org/?p=linux/kernel/git/davem/net-next-2.6.git;a=commitdiff;h=6f26c9a7555e5bcca3560919db9b852015077dae Fix the call to memcpy_from_iovecend() in skb_copy_datagram_from_iovec to pass the right iovec offset. Signed-off-by:
Sridhar Samudrala <sri@us.ibm.com> Acked-by:
Michael S. Tsirkin <mst@redhat.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
skb_dma_unmap() is quite expensive for small packets, because we use two different cache lines from skb_shared_info. One to access nr_frags, one to access dma_maps[0] Instead of dma_maps being an array of MAX_SKB_FRAGS + 1 elements, let dma_head alone in a new dma_head field, close to nr_frags, to reduce cache lines misses. Tested on my dev machine (bnx2 & tg3 adapters), nice speedup ! Signed-off-by:
Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
Get rid of num_dma_maps in struct skb_shared_info, as it seems unused. Signed-off-by:
Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-