  1. Jun 14, 2017
    • rxrpc: Cache the congestion window setting · f7aec129
      David Howells authored
      
      
      When a call finishes, cache the congestion window setting that was
      determined during its transmission phase so that it can be used by the
      next call to the same peer, shortcutting the slow-start algorithm.
      
      The value is stored in the rxrpc_peer struct and is accessed without
      locking.  Each call takes the value that happens to be there when it starts
      and just overwrites the value when it finishes.
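
      A minimal C sketch of the lockless caching pattern described above;
      the struct and function names are illustrative stand-ins, not the
      actual rxrpc code, and READ_ONCE()/WRITE_ONCE() are used here only to
      make the intentional lockless access explicit:

              struct peer {
                      unsigned int cong_cwnd; /* last cwnd; no locking */
              };

              struct call {
                      unsigned int cong_cwnd; /* working window */
                      struct peer *peer;
              };

              /* On call start: take whatever value happens to be cached.
               * A stale read is harmless, it only seeds the window. */
              static void call_start(struct call *call)
              {
                      call->cong_cwnd = READ_ONCE(call->peer->cong_cwnd);
              }

              /* On completion: overwrite unconditionally; last one wins. */
              static void call_finish(struct call *call)
              {
                      WRITE_ONCE(call->peer->cong_cwnd, call->cong_cwnd);
              }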
      
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: don't global ICMP rate limit packets originating from loopback · 849a44de
      Jesper Dangaard Brouer authored
      
      
      Florian Weimer seems to have a glibc test-case which requires that
      loopback interfaces do not get ICMP rate limited.  This was broken by
      commit c0303efe ("net: reduce cycles spend on ICMP replies that
      gets rate limited").

      An ICMP response will usually be routed back out the same incoming
      interface.  Take advantage of this and skip the global ICMP rate
      limit when the incoming device is loopback.  In the unlikely event
      that the outgoing device is not loopback, due to strange routing
      policy rules, ICMP rate limiting still works via the peer rate
      limiting in icmpv4_xrlim_allow(), so we still comply with RFC1812
      (section 4.3.2.8 "Rate Limiting").

      This seems to fix the reproducer given by Florian, while still
      avoiding the expensive and unneeded outgoing route lookup for
      rate-limited packets (in the non-loopback case).
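
      A hedged sketch of the described check; icmp_global_allow() and
      IFF_LOOPBACK are real kernel symbols, but the wrapper below is an
      illustrative stand-in, not the literal patch:

              /* Skip the global ICMP rate limit for packets that arrived
               * on a loopback device; per-peer rate limiting still
               * applies later via icmpv4_xrlim_allow(). */
              static bool icmp_global_allow_in(const struct net_device *in_dev)
              {
                      if (in_dev && (in_dev->flags & IFF_LOOPBACK))
                              return true;    /* bypass global limiter */
                      return icmp_global_allow();  /* global token bucket */
              }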
      
      Fixes: c0303efe ("net: reduce cycles spend on ICMP replies that gets rate limited")
      Reported-by: Florian Weimer <fweimer@redhat.com>
      Reported-by: "H.J. Lu" <hjl.tools@gmail.com>
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/act_pedit: fix an error code · c4f65b09
      Dan Carpenter authored
      
      
      I'm reviewing static checker warnings where we do ERR_PTR(0), which is
      the same as NULL.  I'm pretty sure we intended to return ERR_PTR(-EINVAL)
      here.  Sometimes these bugs lead to a NULL dereference but I don't
      immediately see that problem here.
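
      For illustration, the bug class looks like this; ERR_PTR() and
      IS_ERR() are from <linux/err.h>, while the lookup function is
      hypothetical:

              #include <linux/err.h>

              struct entry;

              static struct entry *entry_lookup(int id)
              {
                      if (id < 0)
                              return ERR_PTR(-EINVAL); /* real errno */
                      /*
                       * Returning ERR_PTR(0) here would be the bug: it is
                       * just NULL, so IS_ERR() on the result is false and
                       * the caller never sees an error.
                       */
                      return NULL;    /* distinct meaning: "not found" */
              }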
      
      Fixes: 71d0ed70 ("net/act_pedit: Support using offset relative to the conventional network headers")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Acked-by: Amir Vadai <amir@vadai.me>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: use skb_unref() in napi_consume_skb() · 7608894e
      Paolo Abeni authored
      
      
      The commit 83ada39bb79d ("net: factor out a helper to decrement the
      skb refcount") provided and used a helper for decrementing skb usage,
      but I missed at least one spot.

      This change removes some more duplicated code by reusing skb_unref()
      in napi_consume_skb(), too.  The helper performs an additional,
      unneeded unlikely(!skb) test - napi_consume_skb() already checks it a
      few lines above - but the compiler is smart enough to optimize the
      duplicated test out.
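
      The consolidation at the call site presumably has this shape (a
      sketch of the before/after, not the literal diff; skb->users was an
      atomic_t at the time):

              /* Before: open-coded refcount drop in napi_consume_skb() */
              if (likely(atomic_read(&skb->users) == 1))
                      smp_rmb();
              else if (likely(!atomic_dec_and_test(&skb->users)))
                      return;

              /* After: skb_unref() does the same work in one call */
              if (!skb_unref(skb))
                      return;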
      
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf: permits narrower load from bpf program context fields · 31fd8581
      Yonghong Song authored
      
      
      Currently, the verifier will reject a program if it contains a
      narrower load from the bpf context structure.  For example,
              __u8 h = __sk_buff->hash, or
              __u16 p = __sk_buff->protocol, or
              __u32 sample_period = bpf_perf_event_data->sample_period
      which are narrower loads of a 4-byte or 8-byte field.

      This patch solves the issue by:
        . Introducing a new parameter ctx_field_size to carry the field
          size of a narrower load from the prog type specific
          *__is_valid_access validator back to the verifier.
        . A non-zero ctx_field_size for a memory access indicates that
          (1) the underlying prog type specific convert_ctx_accesses
              supports non-whole-field access, and
          (2) the current insn is a narrower or whole field access.
        . In the verifier, for such loads where the load memory size is
          less than ctx_field_size, the verifier transforms the load into
          a full field load followed by proper masking (see the sketch
          after this message).
        . Currently, __sk_buff and bpf_perf_event_data->sample_period
          support narrower loads.
        . Narrower stores are still not allowed, as typical ctx stores
          are just normal stores.

      Because of this change, some tests in the verifier will fail and
      these tests are removed.  As a bonus, rename some out-of-bound
      __sk_buff->cb accesses to the proper field name and remove two
      redundant "skb cb oob" tests.
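
      A small standalone C model of the verifier rewrite: the narrow load
      is widened to a full-field load plus a mask (little-endian, offset-0
      case; purely illustrative):

              #include <stdint.h>
              #include <stdio.h>

              /* The program asked for a 1-byte load from a 4-byte context
               * field; the verifier emits a 4-byte load of the whole
               * field and masks the result to the requested width. */
              static uint8_t narrow_load_u8(uint32_t full_field)
              {
                      return (uint8_t)(full_field & 0xff);
              }

              int main(void)
              {
                      uint32_t hash = 0x12345678; /* ~ __sk_buff->hash */
                      printf("0x%x\n", narrow_load_u8(hash)); /* 0x78 */
                      return 0;
              }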
      
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net_sched: move tcf_lock down after gen_replace_estimator() · 74030603
      WANG Cong authored
      
      
      Laura reported a sleep-in-atomic kernel warning inside
      tcf_act_police_init(), which calls gen_replace_estimator() with
      spinlock protection.

      The spinlock is not necessary in this case: we already hold the RTNL
      lock here, which is enough to protect against concurrent writers.
      The reader, i.e. tcf_act_police(), needs to make its decision based
      on this rate estimator; in the worst case we drop more or fewer
      packets than necessary while changing the rate in parallel, which is
      still acceptable.
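
      The reordering presumably has this shape (a sketch; the real
      gen_replace_estimator() takes several more arguments, elided here,
      and the lock name follows the commit title):

              /* gen_replace_estimator() may sleep, so call it under RTNL
               * only; RTNL already serializes configuration writers. */
              err = gen_replace_estimator(/* ... */);
              if (err)
                      goto failure;

              /* Take the spinlock only for non-sleeping updates. */
              spin_lock_bh(&police->tcf_lock);
              /* ... update police parameters ... */
              spin_unlock_bh(&police->tcf_lock);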
      
      Reported-by: Laura Abbott <labbott@redhat.com>
      Reported-by: Nick Huber <nicholashuber@gmail.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. Jun 12, 2017
    • hsr: fix incorrect warning · 675c8da0
      Karicheri, Muralidharan authored
      
      
      When an HSR interface is set up using the ip link command, an
      annoying warning appears with the trace below:
      
      [  203.019828] hsr_get_node: Non-HSR frame
      [  203.019833] Modules linked in:
      [  203.019848] CPU: 0 PID: 158 Comm: sd-resolve Tainted: G        W       4.12.0-rc3-00052-g9fa6bf70 #2
      [  203.019853] Hardware name: Generic DRA74X (Flattened Device Tree)
      [  203.019869] [<c0110280>] (unwind_backtrace) from [<c010c2f4>] (show_stack+0x10/0x14)
      [  203.019880] [<c010c2f4>] (show_stack) from [<c04b9f64>] (dump_stack+0xac/0xe0)
      [  203.019894] [<c04b9f64>] (dump_stack) from [<c01374e8>] (__warn+0xd8/0x104)
      [  203.019907] [<c01374e8>] (__warn) from [<c0137548>] (warn_slowpath_fmt+0x34/0x44)
      [  203.019921] [<c0137548>] (warn_slowpath_fmt) from [<c081126c>] (hsr_get_node+0x148/0x170)
      [  203.019932] [<c081126c>] (hsr_get_node) from [<c0814240>] (hsr_forward_skb+0x110/0x7c0)
      [  203.019942] [<c0814240>] (hsr_forward_skb) from [<c0811d64>] (hsr_dev_xmit+0x2c/0x34)
      [  203.019954] [<c0811d64>] (hsr_dev_xmit) from [<c06c0828>] (dev_hard_start_xmit+0xc4/0x3bc)
      [  203.019963] [<c06c0828>] (dev_hard_start_xmit) from [<c06c13d8>] (__dev_queue_xmit+0x7c4/0x98c)
      [  203.019974] [<c06c13d8>] (__dev_queue_xmit) from [<c0782f54>] (ip6_finish_output2+0x330/0xc1c)
      [  203.019983] [<c0782f54>] (ip6_finish_output2) from [<c0788f0c>] (ip6_output+0x58/0x454)
      [  203.019994] [<c0788f0c>] (ip6_output) from [<c07b16cc>] (mld_sendpack+0x420/0x744)
      
      Since this is an expected path into hsr_get_node(), with the frame
      coming from the master interface, add a check to ensure the packet
      did not arrive on the master port before warning.
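
      The added check presumably looks like this; HSR_PT_MASTER is the
      real port-type enum value, while is_hsr_frame() and the surrounding
      code are paraphrased stand-ins:

              /* Only warn about a non-HSR frame when it did not enter
               * through the master port; frames from the master are an
               * expected path here. */
              if (!is_hsr_frame(skb) && port->type != HSR_PT_MASTER) {
                      WARN_ONCE(1, "%s: Non-HSR frame\n", __func__);
                      return NULL;
              }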
      
      Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • udp: try to avoid 2 cache miss on dequeue · b65ac446
      Paolo Abeni authored
      
      
      When udp_recvmsg() is executed, on x86_64 and other archs, most skb
      fields are on cold cachelines.
      If the skb is linear and the kernel doesn't need to compute the UDP
      csum, only a handful of skb fields are required by udp_recvmsg().
      Since we already use skb->dev_scratch to cache hot data, and there
      are 32 bits unused on 64 bit archs, use that field to cache as much
      data as we can, and try to prefetch on dequeue the relevant fields
      that are left out.

      This can save up to 2 cache misses per packet.

      v1 -> v2:
        - changed udp_dev_scratch field types to the u{32,16} variants,
          replaced bitfield with bool
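
      The cached layout is approximately the struct below, as described in
      the commit text (treat the exact field details as a sketch):

              /* Hot data packed into skb->dev_scratch at enqueue time so
               * udp_recvmsg() avoids cold skb cachelines on dequeue. */
              struct udp_dev_scratch {
                      u32 _tsize_state;       /* truncated size + state */
              #if BITS_PER_LONG == 64
                      /* the 32 spare bits on 64-bit archs cache more: */
                      u16 len;                /* skb->len */
                      bool is_linear;         /* no paged data */
                      bool csum_unnecessary;  /* csum already verified */
              #endif
              };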
      
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • udp: avoid a cache miss on dequeue · 0a463c78
      Paolo Abeni authored
      
      
      Since UDP no longer uses sk->destructor, we can completely clear the
      skb head state before enqueuing.  Amend and use
      skb_release_head_state() for that.

      All head states share a single cacheline, which is not normally
      used/accessed on dequeue.  We can avoid touching that cacheline
      entirely by implementing, and using in the UDP code, a specialized
      skb free helper which ignores the skb head state.

      This saves a cacheline miss at skb deallocation time.

      v1 -> v2:
        replaced secpath_reset() with skb_release_head_state()
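
      A sketch of the specialized free helper described above, modeled on
      the patch but simplified:

              /* Free an skb whose head state was already cleared at
               * enqueue time: skip skb_release_head_state() and the cold
               * cacheline it would touch. */
              static void consume_stateless_skb(struct sk_buff *skb)
              {
                      if (!skb_unref(skb))
                              return; /* other references remain */
                      skb_release_data(skb);  /* free the payload */
                      kfree_skbmem(skb);      /* free the skb itself */
              }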
      
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: factor out a helper to decrement the skb refcount · 3889a803
      Paolo Abeni authored
      
      
      The same code is replicated in 3 different places; move it to a
      common helper.
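
      The helper is approximately the following, paraphrased from the
      kernel of that era (skb->users was still an atomic_t):

              /* Decrement skb->users and report whether the caller should
               * free the skb.  The fast path (users == 1) avoids an
               * atomic RMW; smp_rmb() orders the read before the frees. */
              static inline bool skb_unref(struct sk_buff *skb)
              {
                      if (unlikely(!skb))
                              return false;
                      if (likely(atomic_read(&skb->users) == 1))
                              smp_rmb();
                      else if (likely(!atomic_dec_and_test(&skb->users)))
                              return false;
                      return true;
              }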
      
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • proc: snmp6: Use correct type in memset · 3500cd73
      Christian Perle authored
      
      
      Reading /proc/net/snmp6 yields bogus values on 32 bit kernels: the
      counters are u64, but the buffer was cleared with
      sizeof(unsigned long), which is only 4 bytes there, so half of it
      was left uninitialized.  Use "u64" instead of "unsigned long" in
      sizeof().
      
      Fixes: 4a4857b1 ("proc: Reduce cache miss in snmp6_seq_show")
      Signed-off-by: Christian Perle <christian.perle@secunet.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>