- Aug 29, 2013
-
-
Daniel Borkmann authored
Reduce cacheline usage from 2 to 1 cacheline for sctp_globals structure. By reordering elements, we can close gaps and simply achieve the following: Current situation: /* size: 80, cachelines: 2, members: 10 */ /* sum members: 57, holes: 4, sum holes: 16 */ /* padding: 7 */ /* last cacheline: 16 bytes */ Afterwards: /* size: 64, cachelines: 1, members: 10 */ /* padding: 7 */ Signed-off-by:
Daniel Borkmann <dborkman@redhat.com> Acked-by:
Neil Horman <nhorman@tuxdriver.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Aug 27, 2013
-
-
Patrick McHardy authored
Extract the local TCP stack independant parts of tcp_v6_init_sequence() and cookie_v6_check() and export them for use by the upcoming IPv6 SYNPROXY target. Signed-off-by:
Patrick McHardy <kaber@trash.net> Acked-by:
David S. Miller <davem@davemloft.net> Tested-by:
Martin Topholm <mph@one.com> Signed-off-by:
Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org>
-
Patrick McHardy authored
Add a SYNPROXY for netfilter. The code is split into two parts, the synproxy core with common functions and an address family specific target. The SYNPROXY receives the connection request from the client, responds with a SYN/ACK containing a SYN cookie and announcing a zero window and checks whether the final ACK from the client contains a valid cookie. It then establishes a connection to the original destination and, if successful, sends a window update to the client with the window size announced by the server. Support for timestamps, SACK, window scaling and MSS options can be statically configured as target parameters if the features of the server are known. If timestamps are used, the timestamp value sent back to the client in the SYN/ACK will be different from the real timestamp of the server. In order to now break PAWS, the timestamps are translated in the direction server->client. Signed-off-by:
Patrick McHardy <kaber@trash.net> Tested-by:
Martin Topholm <mph@one.com> Signed-off-by:
Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org>
-
Patrick McHardy authored
Extract the local TCP stack independant parts of tcp_v4_init_sequence() and cookie_v4_check() and export them for use by the upcoming SYNPROXY target. Signed-off-by:
Patrick McHardy <kaber@trash.net> Acked-by:
David S. Miller <davem@davemloft.net> Tested-by:
Martin Topholm <mph@one.com> Signed-off-by:
Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org>
-
Patrick McHardy authored
Split out sequence number adjustments from NAT and move them to the conntrack core to make them usable for SYN proxying. The sequence number adjustment information is moved to a seperate extend. The extend is added to new conntracks when a NAT mapping is set up for a connection using a helper. As a side effect, this saves 24 bytes per connection with NAT in the common case that a connection does not have a helper assigned. Signed-off-by:
Patrick McHardy <kaber@trash.net> Tested-by:
Martin Topholm <mph@one.com> Signed-off-by:
Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org>
-
- Aug 23, 2013
-
-
Joe Stringer authored
Signed-off-by:
Joe Stringer <joe@wand.net.nz> Signed-off-by:
Jesse Gross <jesse@nicira.com>
-
Duan Jiong authored
rfc 4861 says the Redirected Header option is optional, so the kernel should not drop the Redirect Message that has no Redirected Header option. In this patch, the function ip6_redirect_no_header() is introduced to deal with that condition. Signed-off-by:
Duan Jiong <duanj.fnst@cn.fujitsu.com> Acked-by:
Hannes Frederic Sowa <hannes@stressinduktion.org>
-
- Aug 20, 2013
-
-
Pravin B Shelar authored
Following patch allows more code sharing between vxlan and ovs-vxlan. Signed-off-by:
Pravin B Shelar <pshelar@nicira.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Pravin B Shelar authored
Following patch adds data field to vxlan socket and export vxlan handler api. vh->data is required to store private data per vxlan handler. Signed-off-by:
Pravin B Shelar <pshelar@nicira.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Aug 15, 2013
-
-
Jesper Dangaard Brouer authored
commit 56b765b7 ("htb: improved accuracy at high rates") broke the "linklayer atm" handling. tc class add ... htb rate X ceil Y linklayer atm The linklayer setting is implemented by modifying the rate table which is send to the kernel. No direct parameter were transferred to the kernel indicating the linklayer setting. The commit 56b765b7 ("htb: improved accuracy at high rates") removed the use of the rate table system. To keep compatible with older iproute2 utils, this patch detects the linklayer by parsing the rate table. It also supports future versions of iproute2 to send this linklayer parameter to the kernel directly. This is done by using the __reserved field in struct tc_ratespec, to convey the choosen linklayer option, but only using the lower 4 bits of this field. Linklayer detection is limited to speeds below 100Mbit/s, because at high rates the rtab is gets too inaccurate, so bad that several fields contain the same values, this resembling the ATM detect. Fields even start to contain "0" time to send, e.g. at 1000Mbit/s sending a 96 bytes packet cost "0", thus the rtab have been more broken than we first realized. Signed-off-by:
Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Nicolas Dichtel authored
This patch allows to switch the netns when packet is encapsulated or decapsulated. In other word, the encapsulated packet is received in a netns, where the lookup is done to find the tunnel. Once the tunnel is found, the packet is decapsulated and injecting into the corresponding interface which stands to another netns. When one of the two netns is removed, the tunnel is destroyed. Signed-off-by:
Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Nicolas Dichtel authored
This patch allows to switch the netns when packet is encapsulated or decapsulated. In other word, the encapsulated packet is received in a netns, where the lookup is done to find the tunnel. Once the tunnel is found, the packet is decapsulated and injecting into the corresponding interface which stands to another netns. When one of the two netns is removed, the tunnel is destroyed. Signed-off-by:
Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Aug 13, 2013
-
-
Pravin B Shelar authored
Using inner-id for tunnel id is not safe in some rare cases. E.g. packets coming from multiple sources entering same tunnel can have same id. Therefore on tunnel packet receive we could have packets from two different stream but with same source and dst IP with same ip-id which could confuse ip packet reassembly. Following patch reverts optimization from commit 490ab081 (IP_GRE: Fix IP-Identification.) CC: Jarno Rajahalme <jrajahalme@nicira.com> CC: Ansis Atteka <aatteka@nicira.com> Signed-off-by:
Pravin B Shelar <pshelar@nicira.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Pablo Neira Ayuso authored
This patch adds the capability to attach expectations via nfnetlink_queue. This is required by conntrack helpers that trigger expectations based on the first packet seen like the TFTP and the DHCPv6 user-space helpers. Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org>
-
- Aug 10, 2013
-
-
Eric Dumazet authored
Adding paged frags skbs to af_unix sockets introduced a performance regression on large sends because of additional page allocations, even if each skb could carry at least 100% more payload than before. We can instruct sock_alloc_send_pskb() to attempt high order allocations. Most of the time, it does a single page allocation instead of 8. I added an additional parameter to sock_alloc_send_pskb() to let other users to opt-in for this new feature on followup patches. Tested: Before patch : $ netperf -t STREAM_STREAM STREAM STREAM TEST Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec 2304 212992 212992 10.00 46861.15 After patch : $ netperf -t STREAM_STREAM STREAM STREAM TEST Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec 2304 212992 212992 10.00 57981.11 Signed-off-by:
Eric Dumazet <edumazet@google.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
unix_stream_sendmsg() currently uses order-2 allocations, and we had numerous reports this can fail. The __GFP_REPEAT flag present in sock_alloc_send_pskb() is not helping. This patch extends the work done in commit eb6a2481 ("af_unix: reduce high order page allocations) for datagram sockets. This opens the possibility of zero copy IO (splice() and friends) The trick is to not use skb_pull() anymore in recvmsg() path, and instead add a @consumed field in UNIXCB() to track amount of already read payload in the skb. There is a performance regression for large sends because of extra page allocations that will be addressed in a follow-up patch, allowing sock_alloc_send_pskb() to attempt high order page allocations. Signed-off-by:
Eric Dumazet <edumazet@google.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Yuchung Cheng authored
Encrypt the cookie with both server and client IPv4 addresses, such that multi-homed server will grant different cookies based on both the source and destination IPs. No client change is needed since cookie is opaque to the client. Signed-off-by:
Yuchung Cheng <ycheng@google.com> Reviewed-by:
Eric Dumazet <edumazet@google.com> Acked-by:
Neal Cardwell <ncardwell@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Aug 09, 2013
-
-
David S. Miller authored
This reverts commit cda5f98e. As per Vlad's request. Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Eliezer Tamir authored
Rename mib counter from "low latency" to "busy poll" v1 also moved the counter to the ip MIB (suggested by Shawn Bohrer) Eric Dumazet suggested that the current location is better. So v2 just renames the counter to fit the new naming convention. Signed-off-by:
Eliezer Tamir <eliezer.tamir@linux.intel.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Daniel Borkmann authored
With the restructuring of the lksctp.org site, we only allow bug reports through the SCTP mailing list linux-sctp@vger.kernel.org, not via SF, as SF is only used for web hosting and nothing more. While at it, also remove the obvious statement that bugs will be fixed and incooperated into the kernel. Signed-off-by:
Daniel Borkmann <dborkman@redhat.com> Acked-by:
Vlad Yasevich <vyasevich@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Daniel Borkmann authored
Get rid of the last module parameter for SCTP and make this configurable via sysctl for SCTP like all the rest of SCTP's configuration knobs. Signed-off-by:
Daniel Borkmann <dborkman@redhat.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Florian Westphal authored
Let nf_ct_delete handle delivery of the DESTROY event. Based on earlier patch from Pablo Neira. Signed-off-by:
Florian Westphal <fw@strlen.de> Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org>
-
- Aug 07, 2013
-
-
stephen hemminger authored
The IP tunnel hash heads can be embedded in the per-net structure since it is a fixed size. Reduce the size so that the total structure fits in a page size. The original size was overly large, even NETDEV_HASHBITS is only 8 bits! Also, add some white space for readability. Signed-off-by:
Stephen Hemminger <stephen@networkplumber.org> Acked-by:
Pravin B Shelar <pshelar@nicira.com>.> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Aug 05, 2013
-
-
fan.du authored
As dst_cookie is used in fast path sctp_transport_dst_check. Before: struct sctp_transport { struct list_head transports; /* 0 16 */ atomic_t refcnt; /* 16 4 */ __u32 dead:1; /* 20:31 4 */ __u32 rto_pending:1; /* 20:30 4 */ __u32 hb_sent:1; /* 20:29 4 */ __u32 pmtu_pending:1; /* 20:28 4 */ /* XXX 28 bits hole, try to pack */ __u32 sack_generation; /* 24 4 */ /* XXX 4 bytes hole, try to pack */ struct flowi fl; /* 32 64 */ /* --- cacheline 1 boundary (64 bytes) was 32 bytes ago --- */ union sctp_addr ipaddr; /* 96 28 */ After: struct sctp_transport { struct list_head transports; /* 0 16 */ atomic_t refcnt; /* 16 4 */ __u32 dead:1; /* 20:31 4 */ __u32 rto_pending:1; /* 20:30 4 */ __u32 hb_sent:1; /* 20:29 4 */ __u32 pmtu_pending:1; /* 20:28 4 */ /* XXX 28 bits hole, try to pack */ __u32 sack_generation; /* 24 4 */ u32 dst_cookie; /* 28 4 */ struct flowi fl; /* 32 64 */ /* --- cacheline 1 boundary (64 bytes) was 32 bytes ago --- */ union sctp_addr ipaddr; /* 96 28 */ Signed-off-by:
Fan Du <fan.du@windriver.com> Acked-by:
Neil Horman <nhorman@tuxdriver.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Mathias Krause authored
The mark argument is read only, so constify it. Also make dummy_mark in af_key const -- only used as dummy argument for this very function. Signed-off-by:
Mathias Krause <minipli@googlemail.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by:
Steffen Klassert <steffen.klassert@secunet.com>
-
Eliezer Tamir authored
When renaming ll_poll to busy poll, I introduced a typo in the name of the do-nothing placeholder for sk_busy_loop and called it sk_busy_poll. This broke compile when busy poll was not configured. Cong Wang submitted a patch to fixed that. This patch removes the now redundant, misspelled placeholder. Signed-off-by:
Eliezer Tamir <eliezer.tamir@linux.intel.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Aug 03, 2013
-
-
Eric Dumazet authored
Move refcnt, pref, suppress_ifgroup, suppress_prefixlen out of first cache line, as they are not used in fast path. Make sure ctarget & fr_net are in first cache line. (Assuming 64 bit arches and 64 bytes cache lines) Signed-off-by:
Eric Dumazet <edumazet@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Stefan Tomanek authored
This change brings the suppressor attribute names into line; it also changes the data types to provide a more consistent interface. While -1 indicates that the suppressor is not enabled, values >= 0 for suppress_prefixlen or suppress_ifgroup reject routing decisions violating the constraint. This changes the previously presented behaviour of suppress_prefixlen, where a prefix length _less_ than the attribute value was rejected. After this change, a prefix length less than *or* equal to the value is considered a violation of the rule constraint. It also changes the default values for default and newly added rules (disabling any suppression for those). Signed-off-by:
Stefan Tomanek <stefan.tomanek@wertarbyte.de> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Aug 02, 2013
-
-
Stefan Tomanek authored
This change adds the ability to suppress a routing decision based upon the interface group the selected interface belongs to. This allows it to exclude specific devices from a routing decision. Signed-off-by:
Stefan Tomanek <stefan.tomanek@wertarbyte.de> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
fan.du authored
When sctp sits on IPv6, sctp_transport_dst_check pass cookie as ZERO, as a result ip6_dst_check always fail out. This behaviour makes transport->dst useless, because every sctp_packet_transmit must look for valid dst. Add a dst_cookie into sctp_transport, and set the cookie whenever we get new dst for sctp_transport. So dst validness could be checked against it. Since I have split genid for IPv4 and IPv6, also delete/add IPv6 address will also bump IPv6 genid. So issues we discussed in: http://marc.info/?l=linux-netdev&m=137404469219410&w=4 have all been sloved for this patch. Signed-off-by:
Fan Du <fan.du@windriver.com> Acked-by:
Vlad Yasevich <vyasevich@gmail.com> Acked-by:
Neil Horman <nhorman@tuxdriver.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Joe Perches authored
It's convenient to have ethernet mac addresses use ETH_ALEN to be able to grep for them a bit easier and also to ensure that the addresses are __aligned(2). Add #include <linux/if_ether.h> as necessary. Signed-off-by:
Joe Perches <joe@perches.com> Acked-by:
Mauro Carvalho Chehab <m.chehab@samsung.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Aug 01, 2013
-
-
Cong Wang authored
Eliezer renames several *ll_poll to *busy_poll, but forgets CONFIG_NET_LL_RX_POLL, so in case of confusion, rename it too. Cc: Eliezer Tamir <eliezer.tamir@linux.intel.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by:
Cong Wang <amwang@redhat.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Cong Wang authored
When CONFIG_NET_LL_RX_POLL is not set, I got: net/socket.c: In function ‘sock_poll’: net/socket.c:1165:4: error: implicit declaration of function ‘sk_busy_loop’ [-Werror=implicit-function-declaration] Fix this by adding a nop when !CONFIG_NET_LL_RX_POLL. Cc: Eliezer Tamir <eliezer.tamir@linux.intel.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by:
Cong Wang <amwang@redhat.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Michal Kubeček authored
On a high-traffic router with many processors and many IPv6 dst entries, soft lockup in fib6_run_gc() can occur when number of entries reaches gc_thresh. This happens because fib6_run_gc() uses fib6_gc_lock to allow only one thread to run the garbage collector but ip6_dst_gc() doesn't update net->ipv6.ip6_rt_last_gc until fib6_run_gc() returns. On a system with many entries, this can take some time so that in the meantime, other threads pass the tests in ip6_dst_gc() (ip6_rt_last_gc is still not updated) and wait for the lock. They then have to run the garbage collector one after another which blocks them for quite long. Resolve this by replacing special value ~0UL of expire parameter to fib6_run_gc() by explicit "force" parameter to choose between spin_lock_bh() and spin_trylock_bh() and call fib6_run_gc() with force=false if gc_thresh is reached but not max_size. Signed-off-by:
Michal Kubecek <mkubecek@suse.cz> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Simon Wunderlich authored
The count field in CSA must be decremented with each beacon transmitted. This patch implements the functionality for drivers using ieee80211_beacon_get(). Other drivers must call back manually after reaching count == 0. This patch also contains the handling and finish worker for the channel switch command, and mac80211/chanctx code to allow to change a channel definition of an active channel context. Signed-off-by:
Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by:
Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de> [small cleanups, catch identical chandef] Signed-off-by:
Johannes Berg <johannes.berg@intel.com>
-
Simon Wunderlich authored
To allow channel switch announcements within beacons, add the channel switch command to nl80211/cfg80211. This is implementation is intended for AP and (later) IBSS mode. Signed-off-by:
Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by:
Mathias Kretschmer <mathias.kretschmer@fokus.fraunhofer.de> Signed-off-by:
Johannes Berg <johannes.berg@intel.com>
-
Joe Perches authored
There are a mix of function prototypes with and without extern in the kernel sources. Standardize on not using extern for function prototypes. Function prototypes don't need to be written with extern. extern is assumed by the compiler. Its use is as unnecessary as using auto to declare automatic/local variables in a block. Signed-off-by:
Joe Perches <joe@perches.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Joe Perches authored
There are a mix of function prototypes with and without extern in the kernel sources. Standardize on not using extern for function prototypes. Function prototypes don't need to be written with extern. extern is assumed by the compiler. Its use is as unnecessary as using auto to declare automatic/local variables in a block. Signed-off-by:
Joe Perches <joe@perches.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Joe Perches authored
There are a mix of function prototypes with and without extern in the kernel sources. Standardize on not using extern for function prototypes. Function prototypes don't need to be written with extern. extern is assumed by the compiler. Its use is as unnecessary as using auto to declare automatic/local variables in a block. Reflow modified prototypes to 80 columns. Signed-off-by:
Joe Perches <joe@perches.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Joe Perches authored
There are a mix of function prototypes with and without extern in the kernel sources. Standardize on not using extern for function prototypes. Function prototypes don't need to be written with extern. extern is assumed by the compiler. Its use is as unnecessary as using auto to declare automatic/local variables in a block. Reflow modified prototypes to 80 columns. Signed-off-by:
Joe Perches <joe@perches.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-