- Jul 11, 2012
-
-
Eric Dumazet authored
On 64 bit arches having efficient unaligned accesses (eg x86_64) we can use long words to reduce number of instructions for free. Joe Perches suggested to change ipv6_masked_addr_cmp() to return a bool instead of 'int', to make sure ipv6_masked_addr_cmp() cannot be used in a sorting function. Signed-off-by:
Eric Dumazet <edumazet@google.com> Cc: Joe Perches <joe@perches.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
David S. Miller authored
No longer used. Signed-off-by:
David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Maintaining this in the inetpeer entries was not the right way to do this at all. Signed-off-by:
David S. Miller <davem@davemloft.net>
-
David S. Miller authored
No longer needed. TCP writes metrics, but now in it's own special cache that does not dirty the route metrics. Therefore there is no longer any reason to pre-cow metrics in this way. Signed-off-by:
David S. Miller <davem@davemloft.net>
-
David S. Miller authored
No longer used. Signed-off-by:
David S. Miller <davem@davemloft.net>
-
David S. Miller authored
With help from Lin Ming. Signed-off-by:
David S. Miller <davem@davemloft.net>
-
David S. Miller authored
No longer used. Signed-off-by:
David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Maintain a local hash table of TCP dynamic metrics blobs. Computed TCP metrics are no longer maintained in the route metrics. The table uses RCU and an extremely simple hash so that it has low latency and low overhead. A simple hash is legitimate because we only make metrics blobs for fully established connections. Some tweaking of the default hash table sizes, metric timeouts, and the hash chain length limit certainly could use some tweaking. But the basic design seems sound. With help from Eric Dumazet and Joe Perches. Signed-off-by:
David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Signed-off-by:
David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Signed-off-by:
David S. Miller <davem@davemloft.net>
-
David S. Miller authored
All paths assume, when CONFIG_IP_MULTIPLE_TABLES is enabled, that any successful call to fib_lookup() will initialize the fib_result->r value to something. We violated that expectation in the new fib_lookup() fast path. Reported-by:
Or Gerlitz <ogerlitz@mellanox.com> Tested-by:
Eric Dumazet <eric.dumazet@gmail.com> Tested-by:
Greg Rose <gregory.v.rose@intel.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Jul 09, 2012
-
-
Pablo Neira Ayuso authored
Hans reports that he's still hitting: BUG: unable to handle kernel NULL pointer dereference at 000000000000027c IP: [<ffffffff813615db>] netlink_has_listeners+0xb/0x60 PGD 0 Oops: 0000 [#3] PREEMPT SMP CPU 0 It happens when adding a number of containers with do: nfct_query(h, NFCT_Q_CREATE, ct); and most likely one namespace shuts down. this problem was supposed to be fixed by: 70e9942f netfilter: nf_conntrack: make event callback registration per-netns Still, it was missing one rcu_access_pointer to check if the callback is set or not. Reported-by:
Hans Schillstrom <hans@schillstrom.com> Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org>
-
- Jul 06, 2012
-
-
David S. Miller authored
If the user hasn't actually installed any custom rules, or fiddled with the default ones, don't go through the whole FIB rules layer. It's just pure overhead. Instead do what we do with CONFIG_IP_MULTIPLE_TABLES disabled, check the individual tables by hand, one by one. Also, move fib_num_tclassid_users into the ipv4 network namespace. Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Jul 05, 2012
-
-
David S. Miller authored
No longer used. Signed-off-by:
David S. Miller <davem@davemloft.net>
-
David S. Miller authored
This makes for a simplified conversion away from dst_get_neighbour*(). All code outside of ipv6 will use neigh lookups via dst_neigh_lookup*(). Signed-off-by:
David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Signed-off-by:
David S. Miller <davem@davemloft.net>
-
David S. Miller authored
This allows an easy conversion away from dst_get_neighbour*(). Signed-off-by:
David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Causes the handler to use the daddr in the ipv4/ipv6 header when the route gateway is unspecified (local subnet). Signed-off-by:
David S. Miller <davem@davemloft.net>
-
David S. Miller authored
When a dst_confirm() happens, mark the confirmation as pending in the dst. Then on the next packet out, when we have the neigh in-hand, do the update. This removes the dependency in dst_confirm() of dst's having an attached neigh. While we're here, remove the explicit 'dst' NULL check, all except 2 or 3 call sites ensure it's not NULL. So just fix those cases up. Signed-off-by:
David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Do not use the dst cached neigh, we'll be getting rid of that. Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Jul 04, 2012
-
-
Pablo Neira Ayuso authored
This patch generalizes nf_ct_l4proto_net by splitting it into chunks and moving the corresponding protocol part to where it really belongs to. To clarify, note that we follow two different approaches to support per-net depending if it's built-in or run-time loadable protocol tracker. Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org> Acked-by:
Gao feng <gaofeng@cn.fujitsu.com>
-
- Jul 01, 2012
-
-
Neil Horman authored
It was noticed recently that when we send data on a transport, its possible that we might bundle a sack that arrived on a different transport. While this isn't a major problem, it does go against the SHOULD requirement in section 6.4 of RFC 2960: An endpoint SHOULD transmit reply chunks (e.g., SACK, HEARTBEAT ACK, etc.) to the same destination transport address from which it received the DATA or control chunk to which it is replying. This rule should also be followed if the endpoint is bundling DATA chunks together with the reply chunk. This patch seeks to correct that. It restricts the bundling of sack operations to only those transports which have moved the ctsn of the association forward since the last sack. By doing this we guarantee that we only bundle outbound saks on a transport that has received a chunk since the last sack. This brings us into stricter compliance with the RFC. Vlad had initially suggested that we strictly allow only sack bundling on the transport that last moved the ctsn forward. While this makes sense, I was concerned that doing so prevented us from bundling in the case where we had received chunks that moved the ctsn on multiple transports. In those cases, the RFC allows us to select any of the transports having received chunks to bundle the sack on. so I've modified the approach to allow for that, by adding a state variable to each transport that tracks weather it has moved the ctsn since the last sack. This I think keeps our behavior (and performance), close enough to our current profile that I think we can do this without a sysctl knob to enable/disable it. Signed-off-by:
Neil Horman <nhorman@tuxdriver.com> CC: Vlad Yaseivch <vyasevich@gmail.com> CC: David S. Miller <davem@davemloft.net> CC: linux-sctp@vger.kernel.org Reported-by:
Michele Baldessari <michele@redhat.com> Reported-by:
sorin serban <sserban@redhat.com> Acked-by:
Vlad Yasevich <vyasevich@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Jun 29, 2012
-
-
David S. Miller authored
If rpfilter is off (or the SKB has an IPSEC path) and there are not tclassid users, we don't have to do anything at all when fib_validate_source() is invoked besides setting the itag to zero. We monitor tclassid uses with a counter (modified only under RTNL and marked __read_mostly) and we protect the fib_validate_source() real work with a test against this counter and whether rpfilter is to be done. Having a way to know whether we need no tclassid processing or not also opens the door for future optimized rpfilter algorithms that do not perform full FIB lookups. Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Ville Nuorvala authored
At Facebook, we do Layer-3 DSR via IP-in-IP tunneling. Our load balancers wrap an extra IP header on incoming packets so they can be routed to the backend. In the v4 tunnel driver, when these packets fall on the default tunl0 device, the behavior is to decapsulate them and drop them back on the stack. So our setup is that tunl0 has the VIP and eth0 has (obviously) the backend's real address. In IPv6 we do the same thing, but the v6 tunnel driver didn't have this same behavior - if you didn't have an explicit tunnel setup, it would drop the packet. This patch brings that v4 feature to the v6 driver. The same IPv6 address checks are performed as with any normal tunnel, but as the fallback tunnel endpoint addresses are unspecified, the checks must be performed on a per-packet basis, rather than at tunnel configuration time. [Patch description modified by phil@ipom.com] Signed-off-by:
Ville Nuorvala <ville.nuorvala@gmail.com> Tested-by:
Phil Dibowitz <phil@ipom.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Checking for in_dev being NULL is pointless. In fact, all of our callers have in_dev precomputed already, so just pass it in and remove the NULL checking. Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Thomas Graf authored
Using NLMSG_GOODSIZE results in multiple pages being used as nlmsg_new() will automatically add the size of the netlink header to the payload thus exceeding the page limit. NLMSG_DEFAULT_SIZE takes this into account. Signed-off-by:
Thomas Graf <tgraf@suug.ch> Cc: Jiri Pirko <jpirko@redhat.com> Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> Cc: Sergey Lapin <slapin@ossfans.org> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Lauro Ramos Venancio <lauro.venancio@openbossa.org> Cc: Aloisio Almeida Jr <aloisio.almeida@openbossa.org> Cc: Samuel Ortiz <sameo@linux.intel.com> Reviewed-by:
Jiri Pirko <jpirko@redhat.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Neal Cardwell authored
This commit changes inet_csk_route_req() so that it uses a pointer to a struct flowi6, rather than allocating its own on the stack. This brings its behavior in line with its IPv4 cousin, inet_csk_route_req(), and allows a follow-on patch to fix a dst leak. Signed-off-by:
Neal Cardwell <ncardwell@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Jun 28, 2012
-
-
David S. Miller authored
Signed-off-by:
David S. Miller <davem@davemloft.net>
-
David S. Miller authored
The specific destination is the host we direct unicast replies to. Usually this is the original packet source address, but if we are responding to a multicast or broadcast packet we have to use something different. Specifically we must use the source address we would use if we were to send a packet to the unicast source of the original packet. The routing cache precomputes this value, but we want to remove that precomputation because it creates a hard dependency on the expensive rpfilter source address validation which we'd like to make cheaper. There are only three places where this matters: 1) ICMP replies. 2) pktinfo CMSG 3) IP options Now there will be no real users of rt->rt_spec_dst and we can simply remove it altogether. Signed-off-by:
David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Rename it to ip_send_unicast_reply() and add explicit 'saddr' argument. This removed one of the few users of rt->rt_spec_dst. Signed-off-by:
David S. Miller <davem@davemloft.net>
-
David S. Miller authored
It's completely unnecessary. Signed-off-by:
David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Instead of using a fixed value of "-1" or "-EMSGSIZE", propagate what the nla_*() interfaces actually return. Signed-off-by:
David S. Miller <davem@davemloft.net>
-
David S. Miller authored
This reverts commit c074da28. This change has several unwanted side effects: 1) Sockets will cache the DST_NOCACHE route in sk->sk_rx_dst and we'll thus never create a real cached route. 2) All TCP traffic will use DST_NOCACHE and never use the routing cache at all. Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Jun 27, 2012
-
-
Eric Dumazet authored
DDOS synflood attacks hit badly IP route cache. On typical machines, this cache is allowed to hold up to 8 Millions dst entries, 256 bytes for each, for a total of 2GB of memory. rt_garbage_collect() triggers and tries to cleanup things. Eventually route cache is disabled but machine is under fire and might OOM and crash. This patch exploits the new TCP early demux, to set a nocache boolean in case incoming TCP frame is for a not yet ESTABLISHED or TIMEWAIT socket. This 'nocache' boolean is then used in case dst entry is not found in route cache, to create an unhashed dst entry (DST_NOCACHE) SYN-cookie-ACK sent use a similar mechanism (ipv4: tcp: dont cache output dst for syncookies), so after this patch, a machine is able to absorb a DDOS synflood attack without polluting its IP route cache. Signed-off-by:
Eric Dumazet <edumazet@google.com> Cc: Hans Schillstrom <hans.schillstrom@ericsson.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Gao feng authored
This patch is a cleanup. It adds nf_ct_kfree_compat_sysctl_table to release l4proto's compat sysctl table and set the compat sysctl table point to NULL. This new function will be used by follow-up patches. Signed-off-by:
Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org>
-
Gao feng authored
l4proto->init contain quite redundant code. We can simplify this by adding a new parameter l3proto. This patch prepares that code simplification. Signed-off-by:
Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org>
-
alex.bluesman.smirnov@gmail.com authored
Every real 802.15.4 transceiver, which works with software MAC layer, can be classified as a wpan device in this stack. So the wpan device implementation provides missing link in datapath between the device drivers and the Linux network queue. According to the IEEE 802.15.4 standard each packet can be one of the following types: - beacon - MAC layer command - ACK - data This patch adds support for the data packet-type only, but this is enough to perform data transmission and receiving over radio. Signed-off-by:
Alexander Smirnov <alex.bluesman.smirnov@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Jun 26, 2012
-
-
Thomas Pedersen authored
Support configuring an RSSI threshold in dBm (s32) when requesting scheduled scan, below which a BSS won't be reported by the cfg80211 driver. Signed-off-by:
Thomas Pedersen <c_tpeder@qca.qualcomm.com> Signed-off-by:
Johannes Berg <johannes.berg@intel.com>
-
- Jun 25, 2012
-
-
Sjur Brændeland authored
Remove use of module parameters on caif hsi device, as rtnl configuration parameters are already supported. All caif hsi configuration data is put in cfhsi_config, and default values in hsi_default_config. Signed-off-by:
Sjur Brændeland <sjur.brandeland@stericsson.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Sjur Brændeland authored
Remove use of struct platform_device, and replace it with struct cfhsi_ops. Updated variable names in the same spirit: cfhsi_get_dev to cfhsi_get_ops, cfhsi->dev to cfhsi->ops and, cfhsi->dev.drv to cfhsi->ops->cb_ops. Signed-off-by:
Sjur Brændeland <sjur.brandeland@stericsson.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-