  1. Nov 25, 2014
    • Chuck Lever's avatar
      xprtrdma: Refactor tasklet scheduling · f1a03b76
      Chuck Lever authored
      
      
      Restore the separate function that schedules the reply handling
      tasklet. I need to call it from two different paths.
      
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
      f1a03b76
    • Chuck Lever's avatar
      xprtrdma: unmap all FMRs during transport disconnect · 467c9674
      Chuck Lever authored
      
      
      When using RPCRDMA_MTHCAFMR memory registration, after a few
      transport disconnect / reconnect cycles, ib_map_phys_fmr() starts to
      return EINVAL because the provider has exhausted its map pool.
      
      Make sure that all FMRs are unmapped during transport disconnect,
      and that ->send_request remarshals them during an RPC retransmit.
      This resets the transport's MRs to ensure that none are leaked
      during a disconnect.
      
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
      467c9674
    • Chuck Lever's avatar
      xprtrdma: Cap req_cqinit · e7104a2a
      Chuck Lever authored
      Recent work made FRMR registration and invalidation completions
      unsignaled. This greatly reduces the adapter interrupt rate.
      
      Every so often, however, a posted send Work Request is allowed to
      signal. Otherwise, the provider's Work Queue will wrap and the
      workload will hang.
      
      The number of Work Requests that are allowed to remain unsignaled is
      determined by the value of req_cqinit. Currently, this is set to the
      size of the send Work Queue divided by two, minus 1.
      
      For FRMR, the send Work Queue is the maximum number of concurrent
      RPCs (currently 32) times the maximum number of Work Requests an
      RPC might use (currently 7, though some adapters may need more).
      
      For mlx4, this is 224 entries. This leaves completion signaling
      disabled for 111 send Work Requests.
      
      Some providers hold back dispatching Work Requests until a CQE is
      generated.  If completions are disabled, then no CQEs are generated
      for quite some time, and that can stall the Work Queue.
      
      I've seen this occur running xfstests generic/113 over NFSv4, where
      eventually, posting a FAST_REG_MR Work Request fails with -ENOMEM
      because the Work Queue has overflowed. The connection is dropped
      and re-established.
      
      Cap the rep_cqinit setting so completions are not left turned off
      for too long.
      
      BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=269
      
      
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
      e7104a2a
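The arithmetic in the commit above can be checked with a short user-space sketch. The function name and the cap value of 32 are assumptions made for illustration; the commit message does not say what the cap is.

```python
def unsignaled_sends(max_rpcs, wrs_per_rpc, cap=None):
    """Return (send queue depth, number of sends left unsignaled)."""
    send_wq = max_rpcs * wrs_per_rpc   # send Work Queue depth
    cqinit = send_wq // 2 - 1          # half the queue, minus 1
    if cap is not None and cqinit > cap:
        cqinit = cap                   # hypothetical cap, not taken from the patch
    return send_wq, cqinit


print(unsignaled_sends(32, 7))          # (224, 111)
print(unsignaled_sends(32, 7, cap=32))  # (224, 32)
```

With 32 concurrent RPCs and 7 Work Requests each, the uncapped setting reproduces the 224 and 111 figures quoted in the message.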
    • Chuck Lever's avatar
      xprtrdma: Return an errno from rpcrdma_register_external() · 92b98361
      Chuck Lever authored
      
      
      The RPC/RDMA send_request method and the chunk registration code
      expect an errno from the registration function. This allows
      the upper layers to distinguish between a recoverable failure
      (for example, temporary memory exhaustion) and a hard failure
      (for example, a bug in the registration logic).
      
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
      92b98361
  2. Nov 21, 2014
    • Calvin Owens's avatar
      tcp: Restore RFC5961-compliant behavior for SYN packets · 0c228e83
      Calvin Owens authored
      
      
      Commit c3ae62af ("tcp: should drop incoming frames without ACK
      flag set") was created to mitigate a security vulnerability in which a
      local attacker is able to inject data into locally-opened sockets by
      using TCP protocol statistics in procfs to quickly find the correct
      sequence number.
      
      This broke the RFC5961 requirement to send a challenge ACK in response
      to spurious RST packets, which was subsequently fixed by commit
      7b514a88 ("tcp: accept RST without ACK flag").
      
      Unfortunately, the RFC5961 requirement that spurious SYN packets be
      handled in a similar manner remains broken.
      
      RFC5961 section 4 states that:
      
         ... the handling of the SYN in the synchronized state SHOULD be
         performed as follows:
      
         1) If the SYN bit is set, irrespective of the sequence number, TCP
            MUST send an ACK (also referred to as challenge ACK) to the remote
            peer:
      
            <SEQ=SND.NXT><ACK=RCV.NXT><CTL=ACK>
      
            After sending the acknowledgment, TCP MUST drop the unacceptable
            segment and stop processing further.
      
         By sending an ACK, the remote peer is challenged to confirm the loss
         of the previous connection and the request to start a new connection.
         A legitimate peer, after restart, would not have a TCB in the
         synchronized state.  Thus, when the ACK arrives, the peer should send
         a RST segment back with the sequence number derived from the ACK
         field that caused the RST.
      
         This RST will confirm that the remote peer has indeed closed the
         previous connection.  Upon receipt of a valid RST, the local TCP
         endpoint MUST terminate its connection.  The local TCP endpoint
         should then rely on SYN retransmission from the remote end to
         re-establish the connection.
      
      This patch lets SYN packets through the discard added in c3ae62af,
      so that spurious SYN packets are properly dealt with as per the RFC.
      
      The challenge ACK is sent unconditionally and is rate-limited, so the
      original vulnerability is not reintroduced by this patch.
      
      Signed-off-by: Calvin Owens <calvinowens@fb.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0c228e83
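The flag handling described in the commit above can be modelled roughly as follows. This is a simplified user-space sketch, not the kernel code: the function names are hypothetical, and rate limiting and sequence-number checks are left out.

```python
def early_discard(ack, rst, syn):
    """True if the 'frame without ACK flag' check drops the segment.

    RST segments are let through (7b514a88) and, with this patch,
    so are SYN segments.
    """
    return not (ack or rst or syn)


def handle_synchronized(ack=False, rst=False, syn=False):
    """Decide what to do with a segment on a synchronized connection."""
    if early_discard(ack, rst, syn):
        return "discard"
    if syn:
        # RFC 5961 section 4: send a challenge ACK, then drop the segment.
        return "challenge_ack_then_drop"
    return "continue"


print(handle_synchronized(syn=True))  # challenge_ack_then_drop
print(handle_synchronized())          # discard
```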
    • Eric Dumazet's avatar
      net: Revert "net: avoid one atomic operation in skb_clone()" · e7820e39
      Eric Dumazet authored
      
      
      Not sure what I was thinking, but doing anything after
      releasing a refcount is suicidal or/and embarrassing.
      
      By the time we set skb->fclone to SKB_FCLONE_FREE, another cpu
      could have released last reference and freed whole skb.
      
      We potentially corrupt memory or trap if CONFIG_DEBUG_PAGEALLOC is set.
      
      Reported-by: Chris Mason <clm@fb.com>
      Fixes: ce1a4ea3 ("net: avoid one atomic operation in skb_clone()")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Sabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e7820e39
    • Jiri Bohac's avatar
      ipx: fix locking regression in ipx_sendmsg and ipx_recvmsg · 01462405
      Jiri Bohac authored
      
      
      This fixes an old regression introduced by commit
      b0d0d915 (ipx: remove the BKL).
      
      When a recvmsg syscall blocks waiting for new data, no data can be sent on the
      same socket with sendmsg because ipx_recvmsg() sleeps with the socket locked.
      
      This breaks mars-nwe (NetWare emulator):
      - the ncpserv process reads the request using recvmsg
      - ncpserv forks and spawns nwconn
      - ncpserv calls a (blocking) recvmsg and waits for new requests
      - nwconn deadlocks in sendmsg on the same socket
      
      Commit b0d0d915 has simply replaced BKL locking with
      lock_sock/release_sock. Unlike now, BKL got unlocked while
      sleeping, so a blocking recvmsg did not block a concurrent
      sendmsg.
      
      Only keep the socket locked while actually working with the socket data and
      release it prior to calling skb_recv_datagram().
      
      Signed-off-by: Jiri Bohac <jbohac@suse.cz>
      Reviewed-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      01462405
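The locking pattern the ipx fix describes (hold the socket lock only while touching socket state, and drop it before the blocking wait) can be sketched in user space. The `Sock` class and both functions are hypothetical stand-ins; a `threading.Lock` plays the role of lock_sock()/release_sock() and `queue.Queue.get()` plays the role of the blocking skb_recv_datagram().

```python
import queue
import threading


class Sock:
    def __init__(self):
        self.lock = threading.Lock()  # stands in for the socket lock
        self.rx = queue.Queue()       # receive queue
        self.tx = []                  # record of sent datagrams


def recvmsg(sock):
    with sock.lock:
        pass                          # inspect socket state under the lock
    return sock.rx.get()              # blocking wait, lock NOT held


def sendmsg(sock, data, timeout=1.0):
    # Give up instead of hanging forever if the lock is never released.
    if not sock.lock.acquire(timeout=timeout):
        return False
    try:
        sock.tx.append(data)
        return True
    finally:
        sock.lock.release()


sock = Sock()
received = []
reader = threading.Thread(target=lambda: received.append(recvmsg(sock)))
reader.start()                        # reader blocks waiting for data
sent = sendmsg(sock, b"reply")        # still succeeds on the same socket
sock.rx.put(b"request")               # wake the reader
reader.join()
print(sent, received)
```

If `recvmsg` instead held `sock.lock` across `sock.rx.get()`, the `sendmsg` call would time out, which is the user-space analogue of the nwconn deadlock.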
    • Joe Stringer's avatar
      openvswitch: Don't validate IPv6 label masks. · d3052bb5
      Joe Stringer authored
      
      
      When userspace doesn't provide a mask, OVS datapath generates a fully
      unwildcarded mask for the flow by copying the flow and setting all bits
      in all fields. For IPv6 label, this creates a mask that matches on the
      upper 12 bits, causing the following error:
      
      openvswitch: netlink: Invalid IPv6 flow label value (value=ffffffff, max=fffff)
      
      This patch ignores the label validation check for masks, avoiding this
      error.
      
      Signed-off-by: Joe Stringer <joestringer@nicira.com>
      Acked-by: Pravin B Shelar <pshelar@nicira.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d3052bb5
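A minimal model of the check described in the openvswitch commit above: the IPv6 flow label is 20 bits wide, so a flow key above 0xfffff is invalid, while an all-ones mask is legitimate and is no longer validated. The function name and signature are hypothetical.

```python
IPV6_LABEL_MAX = 0xFFFFF  # the flow label is a 20-bit field


def label_is_acceptable(value, is_mask):
    """Range-check flow keys only; masks are accepted as-is."""
    if is_mask:
        return True
    return value <= IPV6_LABEL_MAX


print(label_is_acceptable(0xFFFFFFFF, is_mask=True))   # True
print(label_is_acceptable(0xFFFFFFFF, is_mask=False))  # False
```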
  3. Nov 19, 2014
  4. Nov 18, 2014
  5. Nov 17, 2014
    • Linus Lüssing's avatar
      bridge: fix netfilter/NF_BR_LOCAL_OUT for own, locally generated queries · f0b4eece
      Linus Lüssing authored
      
      
      Ebtables on the OUTPUT chain (NF_BR_LOCAL_OUT) would not work as expected
      for both locally generated IGMP and MLD queries. The IP header specific
      filter options are off by 14 Bytes for netfilter (actual output on
      interfaces is fine).
      
      NF_HOOK() expects the skb->data to point to the IP header, not the
      ethernet one (while dev_queue_xmit() does not). Luckily there is an
      br_dev_queue_push_xmit() helper function already - let's just use that.
      
      Introduced by eb1d1641
      ("bridge: Add core IGMP snooping support")
      
      Ebtables example:
      
      $ ebtables -I OUTPUT -p IPv6 -o eth1 --logical-out br0 \
      	--log --log-level 6 --log-ip6 --log-prefix="~EBT: " -j DROP
      
      before (broken):
      
      ~EBT:  IN= OUT=eth1 MAC source = 02:04:64:a4:39:c2 \
      	MAC dest = 33:33:00:00:00:01 proto = 0x86dd IPv6 \
      	SRC=64a4:39c2:86dd:6000:0000:0020:0001:fe80 IPv6 \
      	DST=0000:0000:0000:0004:64ff:fea4:39c2:ff02, \
      	IPv6 priority=0x3, Next Header=2
      
      after (working):
      
      ~EBT:  IN= OUT=eth1 MAC source = 02:04:64:a4:39:c2 \
      	MAC dest = 33:33:00:00:00:01 proto = 0x86dd IPv6 \
      	SRC=fe80:0000:0000:0000:0004:64ff:fea4:39c2 IPv6 \
      	DST=ff02:0000:0000:0000:0000:0000:0000:0001, \
      	IPv6 priority=0x0, Next Header=0
      
      Signed-off-by: Linus Lüssing <linus.luessing@web.de>
      Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      f0b4eece
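The 14-byte offset in the bridge commit above can be reproduced in user space by parsing the same frame twice, once from the start of the Ethernet header and once from the start of the IPv6 header. The frame is rebuilt from the addresses in the ebtables log; the helper names and the zero-filled 32-byte payload are assumptions for illustration.

```python
ETH_HLEN = 14  # destination MAC (6) + source MAC (6) + EtherType (2)


def format_addr(raw):
    """Format 16 raw bytes as eight uncompressed 16-bit groups."""
    return ":".join(f"{raw[i]:02x}{raw[i + 1]:02x}" for i in range(0, 16, 2))


def parse_ipv6(buf, offset):
    """Read IPv6 header fields as if the header started at `offset`."""
    hdr = buf[offset:offset + 40]
    return {
        "next_header": hdr[6],
        "src": format_addr(hdr[8:24]),
        "dst": format_addr(hdr[24:40]),
    }


eth = bytes.fromhex("333300000001" "020464a439c2" "86dd")
ipv6 = (
    bytes.fromhex("60000000" "0020" "00" "01")
    + bytes.fromhex("fe80 0000 0000 0000 0004 64ff fea4 39c2")
    + bytes.fromhex("ff02 0000 0000 0000 0000 0000 0000 0001")
)
frame = eth + ipv6 + bytes(32)  # payload contents are irrelevant here

broken = parse_ipv6(frame, 0)          # data pointer left at the Ethernet header
working = parse_ipv6(frame, ETH_HLEN)  # data pointer at the IPv6 header
print(broken["src"], broken["dst"], broken["next_header"])
print(working["src"], working["dst"], working["next_header"])
```

Parsing from offset 0 yields the same SRC, DST and Next Header values as the "before (broken)" log, and parsing from offset 14 yields the "after (working)" values.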
    • Pablo Neira Ayuso's avatar
      netfilter: nfnetlink: fix insufficient validation in nfnetlink_bind · 97840cb6
      Pablo Neira Ayuso authored
      
      
      Make sure the netlink group exists, otherwise you can trigger an
      out-of-bounds array access from the netlink_bind() path. This splat
      can be triggered only by the superuser.
      
      [  180.203600] UBSan: Undefined behaviour in ../net/netfilter/nfnetlink.c:467:28
      [  180.204249] index 9 is out of range for type 'int [9]'
      [  180.204697] CPU: 0 PID: 1771 Comm: trinity-main Not tainted 3.18.0-rc4-mm1+ #122
      [  180.205365] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
      [  180.206498]  0000000000000018 0000000000000000 0000000000000009 ffff88007bdf7da8
      [  180.207220]  ffffffff82b0ef5f 0000000000000092 ffffffff845ae2e0 ffff88007bdf7db8
      [  180.207887]  ffffffff8199e489 ffff88007bdf7e18 ffffffff8199ea22 0000003900000000
      [  180.208639] Call Trace:
      [  180.208857] dump_stack (lib/dump_stack.c:52)
      [  180.209370] ubsan_epilogue (lib/ubsan.c:174)
      [  180.209849] __ubsan_handle_out_of_bounds (lib/ubsan.c:400)
      [  180.210512] nfnetlink_bind (net/netfilter/nfnetlink.c:467)
      [  180.210986] netlink_bind (net/netlink/af_netlink.c:1483)
      [  180.211495] SYSC_bind (net/socket.c:1541)
      
      Moreover, define the missing nf_tables and nf_acct multicast groups too.
      
      Reported-by: Andrey Ryabinin <a.ryabinin@samsung.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      97840cb6
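The missing validation in the nfnetlink commit above amounts to a bounds check before a table lookup. The sketch below is a user-space model under assumed names: the nine-entry table mirrors the `int [9]` in the UBSan report, but its contents and the function name are hypothetical.

```python
import errno

# Hypothetical group-to-subsystem table with 9 entries (indices 0..8).
GROUP_TO_TYPE = list(range(9))


def bind_group(group):
    """Return the table entry for `group`, or -EINVAL if out of range."""
    if group < 0 or group >= len(GROUP_TO_TYPE):
        return -errno.EINVAL
    return GROUP_TO_TYPE[group]


print(bind_group(8))  # 8
print(bind_group(9) == -errno.EINVAL)  # True
```

In C, the unchecked lookup with index 9 reads past the end of the array rather than failing cleanly, which is what the UBSan report flags.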
  6. Nov 16, 2014
    • Daniel Borkmann's avatar
      ipv6: mld: fix add_grhead skb_over_panic for devs with large MTUs · feb91a02
      Daniel Borkmann authored
      
      
      It has been reported that generating an MLD listener report on
      devices with large MTUs (e.g. 9000) and a high number of IPv6
      addresses can trigger a skb_over_panic():
      
      skbuff: skb_over_panic: text:ffffffff80612a5d len:3776 put:20
      head:ffff88046d751000 data:ffff88046d751010 tail:0xed0 end:0xec0
      dev:port1
       ------------[ cut here ]------------
      kernel BUG at net/core/skbuff.c:100!
      invalid opcode: 0000 [#1] SMP
      Modules linked in: ixgbe(O)
      CPU: 3 PID: 0 Comm: swapper/3 Tainted: G O 3.14.23+ #4
      [...]
      Call Trace:
       <IRQ>
       [<ffffffff80578226>] ? skb_put+0x3a/0x3b
       [<ffffffff80612a5d>] ? add_grhead+0x45/0x8e
       [<ffffffff80612e3a>] ? add_grec+0x394/0x3d4
       [<ffffffff80613222>] ? mld_ifc_timer_expire+0x195/0x20d
       [<ffffffff8061308d>] ? mld_dad_timer_expire+0x45/0x45
       [<ffffffff80255b5d>] ? call_timer_fn.isra.29+0x12/0x68
       [<ffffffff80255d16>] ? run_timer_softirq+0x163/0x182
       [<ffffffff80250e6f>] ? __do_softirq+0xe0/0x21d
       [<ffffffff8025112b>] ? irq_exit+0x4e/0xd3
       [<ffffffff802214bb>] ? smp_apic_timer_interrupt+0x3b/0x46
       [<ffffffff8063f10a>] ? apic_timer_interrupt+0x6a/0x70
      
      mld_newpack() skb allocations are usually requested with dev->mtu
      in size. Since commit 72e09ad1 ("ipv6: avoid high order allocations"),
      the limit has been changed so that allocations are less likely to fail.
      
      However, in MLD/IGMP code, we have some rather ugly AVAILABLE(skb)
      macros, which determine if we may end up doing an skb_put() for
      adding another record. To avoid possible fragmentation, we check
      the skb's tailroom as skb->dev->mtu - skb->len, which is a wrong
      assumption as the actual max allocation size can be much smaller.
      
      The IGMP case doesn't have this issue as commit 57e1ab6e
      ("igmp: refine skb allocations") stores the allocation size in
      the cb[].
      
      Set a reserved_tailroom to make it fit into the MTU and use the
      skb_availroom() helper instead. This also lets us get rid of
      igmp_skb_size().
      
      Reported-by: Wei Liu <lw1a2.jing@gmail.com>
      Fixes: 72e09ad1 ("ipv6: avoid high order allocations")
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Cc: David L Stevens <david.stevens@oracle.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      feb91a02
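The difference between the old AVAILABLE(skb) estimate and the real tailroom in the MLD commit above can be shown with the numbers from the panic line. This is a user-space model only: the `Skb` class is hypothetical, the MTU of 9000 is the example value from the message, and reserved_tailroom is assumed to be 0.

```python
class Skb:
    """Tiny stand-in for an sk_buff; offsets are relative to head."""

    def __init__(self, end, data, tail, dev_mtu, reserved_tailroom=0):
        self.end = end
        self.data = data
        self.tail = tail
        self.dev_mtu = dev_mtu
        self.reserved_tailroom = reserved_tailroom

    @property
    def len(self):
        return self.tail - self.data

    def available_old(self):
        # Old estimate: assumes the buffer is as large as the MTU.
        return self.dev_mtu - self.len

    def availroom(self):
        # Real room left in the buffer, minus any reserved tailroom.
        return self.end - self.tail - self.reserved_tailroom


# State just before the failing 20-byte skb_put(): tail:0xed0 minus put:20.
skb = Skb(end=0xEC0, data=0x10, tail=0xED0 - 20, dev_mtu=9000)
record = 20
print(skb.available_old() >= record)  # True: old check allows the put
print(skb.availroom() >= record)      # False: only 4 bytes are left
```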
    • Anish Bhatt's avatar
      dcbnl : Disable software interrupts before taking dcb_lock · 52cff74e
      Anish Bhatt authored
      
      
      Solves possible lockup issues that can be seen from firmware DCB agents calling
      into the DCB app api.
      
      DCB firmware event queues can be tied in with NAPI so that dcb events are
      generated in softIRQ context. This can result in calls to dcb_*app()
      functions which try to take the dcb_lock.
      
      If the event triggers while we also hold the dcb_lock because lldpad or
      some other agent happened to be issuing a get/set command, we could see a cpu
      lockup.
      
      This code was not originally written with firmware agents in mind, hence
      grabbing dcb_lock from softIRQ context was not considered.
      
      Signed-off-by: Anish Bhatt <anish@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      52cff74e
    • Panu Matilainen's avatar
      ipv4: Fix incorrect error code when adding an unreachable route · 49dd18ba
      Panu Matilainen authored
      
      
      Trying to add an unreachable route incorrectly returns -ESRCH if
      custom FIB rules are present:
      
      [root@localhost ~]# ip route add 74.125.31.199 dev eth0 via 1.2.3.4
      RTNETLINK answers: Network is unreachable
      [root@localhost ~]# ip rule add to 55.66.77.88 table 200
      [root@localhost ~]# ip route add 74.125.31.199 dev eth0 via 1.2.3.4
      RTNETLINK answers: No such process
      [root@localhost ~]#
      
      Commit 83886b6b ("[NET]: Change "not found"
      return value for rule lookup") changed fib_rules_lookup()
      to use -ESRCH as a "not found" code internally, but for user space it
      should be translated into -ENETUNREACH. Handle the translation centrally in
      ipv4-specific fib_lookup(), leaving the DECnet case alone.
      
      On a related note, commit b7a71b51
      ("ipv4: removed redundant conditional") removed a similar translation from
      ip_route_input_slow() prematurely AIUI.
      
      Fixes: b7a71b51 ("ipv4: removed redundant conditional")
      Signed-off-by: Panu Matilainen <pmatilai@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      49dd18ba
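The central translation described in the ipv4 commit above can be sketched as a thin wrapper. The Python functions are stand-ins named after the kernel functions in the message; their signatures are assumptions for illustration.

```python
import errno


def fib_rules_lookup(found):
    """Internal lookup: uses -ESRCH as its 'not found' code."""
    return 0 if found else -errno.ESRCH


def fib_lookup(found):
    """IPv4 wrapper: translate the internal code for user space."""
    err = fib_rules_lookup(found)
    if err == -errno.ESRCH:
        err = -errno.ENETUNREACH
    return err


print(fib_lookup(False) == -errno.ENETUNREACH)  # True
print(fib_lookup(True))                         # 0
```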
  7. Nov 14, 2014
  8. Nov 13, 2014
    • Ilya Dryomov's avatar
      libceph: change from BUG to WARN for __remove_osd() asserts · cc9f1f51
      Ilya Dryomov authored
      
      
      No reason to use BUG_ON for osd request list assertions.
      
      Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
      Reviewed-by: Alex Elder <elder@linaro.org>
      cc9f1f51
    • Ilya Dryomov's avatar
      libceph: clear r_req_lru_item in __unregister_linger_request() · ba9d114e
      Ilya Dryomov authored
      
      
      kick_requests() can put linger requests on the notarget list.  This
      means we need to clear the much-overloaded req->r_req_lru_item in
      __unregister_linger_request() as well, or we get an assertion failure
      in ceph_osdc_release_request() - !list_empty(&req->r_req_lru_item).
      
      AFAICT the assumption was that registered linger requests cannot be on
      any of req->r_req_lru_item lists, but that's clearly not the case.
      
      Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
      Reviewed-by: Alex Elder <elder@linaro.org>
      ba9d114e
    • Ilya Dryomov's avatar
      libceph: unlink from o_linger_requests when clearing r_osd · a390de02
      Ilya Dryomov authored
      
      
      Requests have to be unlinked from both osd->o_requests (normal
      requests) and osd->o_linger_requests (linger requests) lists when
      clearing req->r_osd.  Otherwise __unregister_linger_request() gets
      confused and we trip over a !list_empty(&osd->o_linger_requests)
      assert in __remove_osd().
      
      MON=1 OSD=1:
      
          # cat remove-osd.sh
          #!/bin/bash
          rbd create --size 1 test
          DEV=$(rbd map test)
          ceph osd out 0
          sleep 3
          rbd map dne/dne # obtain a new osdmap as a side effect
          rbd unmap $DEV & # will block
          sleep 3
          ceph osd in 0
      
      Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
      Reviewed-by: Alex Elder <elder@linaro.org>
      a390de02
    • Ilya Dryomov's avatar
      libceph: do not crash on large auth tickets · aaef3170
      Ilya Dryomov authored
      
      
      Large (greater than 32k, the value of PAGE_ALLOC_COSTLY_ORDER) auth
      tickets will have their buffers vmalloc'ed, which leads to the
      following crash in crypto:
      
      [   28.685082] BUG: unable to handle kernel paging request at ffffeb04000032c0
      [   28.686032] IP: [<ffffffff81392b42>] scatterwalk_pagedone+0x22/0x80
      [   28.686032] PGD 0
      [   28.688088] Oops: 0000 [#1] PREEMPT SMP
      [   28.688088] Modules linked in:
      [   28.688088] CPU: 0 PID: 878 Comm: kworker/0:2 Not tainted 3.17.0-vm+ #305
      [   28.688088] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
      [   28.688088] Workqueue: ceph-msgr con_work
      [   28.688088] task: ffff88011a7f9030 ti: ffff8800d903c000 task.ti: ffff8800d903c000
      [   28.688088] RIP: 0010:[<ffffffff81392b42>]  [<ffffffff81392b42>] scatterwalk_pagedone+0x22/0x80
      [   28.688088] RSP: 0018:ffff8800d903f688  EFLAGS: 00010286
      [   28.688088] RAX: ffffeb04000032c0 RBX: ffff8800d903f718 RCX: ffffeb04000032c0
      [   28.688088] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8800d903f750
      [   28.688088] RBP: ffff8800d903f688 R08: 00000000000007de R09: ffff8800d903f880
      [   28.688088] R10: 18df467c72d6257b R11: 0000000000000000 R12: 0000000000000010
      [   28.688088] R13: ffff8800d903f750 R14: ffff8800d903f8a0 R15: 0000000000000000
      [   28.688088] FS:  00007f50a41c7700(0000) GS:ffff88011fc00000(0000) knlGS:0000000000000000
      [   28.688088] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [   28.688088] CR2: ffffeb04000032c0 CR3: 00000000da3f3000 CR4: 00000000000006b0
      [   28.688088] Stack:
      [   28.688088]  ffff8800d903f698 ffffffff81392ca8 ffff8800d903f6e8 ffffffff81395d32
      [   28.688088]  ffff8800dac96000 ffff880000000000 ffff8800d903f980 ffff880119b7e020
      [   28.688088]  ffff880119b7e010 0000000000000000 0000000000000010 0000000000000010
      [   28.688088] Call Trace:
      [   28.688088]  [<ffffffff81392ca8>] scatterwalk_done+0x38/0x40
      [   28.688088]  [<ffffffff81392ca8>] scatterwalk_done+0x38/0x40
      [   28.688088]  [<ffffffff81395d32>] blkcipher_walk_done+0x182/0x220
      [   28.688088]  [<ffffffff813990bf>] crypto_cbc_encrypt+0x15f/0x180
      [   28.688088]  [<ffffffff81399780>] ? crypto_aes_set_key+0x30/0x30
      [   28.688088]  [<ffffffff8156c40c>] ceph_aes_encrypt2+0x29c/0x2e0
      [   28.688088]  [<ffffffff8156d2a3>] ceph_encrypt2+0x93/0xb0
      [   28.688088]  [<ffffffff8156d7da>] ceph_x_encrypt+0x4a/0x60
      [   28.688088]  [<ffffffff8155b39d>] ? ceph_buffer_new+0x5d/0xf0
      [   28.688088]  [<ffffffff8156e837>] ceph_x_build_authorizer.isra.6+0x297/0x360
      [   28.688088]  [<ffffffff8112089b>] ? kmem_cache_alloc_trace+0x11b/0x1c0
      [   28.688088]  [<ffffffff8156b496>] ? ceph_auth_create_authorizer+0x36/0x80
      [   28.688088]  [<ffffffff8156ed83>] ceph_x_create_authorizer+0x63/0xd0
      [   28.688088]  [<ffffffff8156b4b4>] ceph_auth_create_authorizer+0x54/0x80
      [   28.688088]  [<ffffffff8155f7c0>] get_authorizer+0x80/0xd0
      [   28.688088]  [<ffffffff81555a8b>] prepare_write_connect+0x18b/0x2b0
      [   28.688088]  [<ffffffff81559289>] try_read+0x1e59/0x1f10
      
      This is because we set up crypto scatterlists as if all buffers were
      kmalloc'ed.  Fix it.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
      Reviewed-by: Sage Weil <sage@redhat.com>
      aaef3170
    • Jeff Layton's avatar
      sunrpc: fix sleeping under rcu_read_lock in gss_stringify_acceptor · b3ecba09
      Jeff Layton authored
      
      
      Bruce reported that he was seeing the following BUG pop:
      
          BUG: sleeping function called from invalid context at mm/slab.c:2846
          in_atomic(): 0, irqs_disabled(): 0, pid: 4539, name: mount.nfs
          2 locks held by mount.nfs/4539:
          #0:  (nfs_clid_init_mutex){+.+.+.}, at: [<ffffffffa01c0a9a>] nfs4_discover_server_trunking+0x4a/0x2f0 [nfsv4]
          #1:  (rcu_read_lock){......}, at: [<ffffffffa00e3185>] gss_stringify_acceptor+0x5/0xb0 [auth_rpcgss]
          Preemption disabled at:[<ffffffff81a4f082>] printk+0x4d/0x4f
      
          CPU: 3 PID: 4539 Comm: mount.nfs Not tainted 3.18.0-rc1-00013-g5b095e9 #3393
          Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
          ffff880021499390 ffff8800381476a8 ffffffff81a534cf 0000000000000001
          0000000000000000 ffff8800381476c8 ffffffff81097854 00000000000000d0
          0000000000000018 ffff880038147718 ffffffff8118e4f3 0000000020479f00
          Call Trace:
          [<ffffffff81a534cf>] dump_stack+0x4f/0x7c
          [<ffffffff81097854>] __might_sleep+0x114/0x180
          [<ffffffff8118e4f3>] __kmalloc+0x1a3/0x280
          [<ffffffffa00e31d8>] gss_stringify_acceptor+0x58/0xb0 [auth_rpcgss]
          [<ffffffffa00e3185>] ? gss_stringify_acceptor+0x5/0xb0 [auth_rpcgss]
          [<ffffffffa006b438>] rpcauth_stringify_acceptor+0x18/0x30 [sunrpc]
          [<ffffffffa01b0469>] nfs4_proc_setclientid+0x199/0x380 [nfsv4]
          [<ffffffffa01b04d0>] ? nfs4_proc_setclientid+0x200/0x380 [nfsv4]
          [<ffffffffa01bdf1a>] nfs40_discover_server_trunking+0xda/0x150 [nfsv4]
          [<ffffffffa01bde45>] ? nfs40_discover_server_trunking+0x5/0x150 [nfsv4]
          [<ffffffffa01c0acf>] nfs4_discover_server_trunking+0x7f/0x2f0 [nfsv4]
          [<ffffffffa01c8e24>] nfs4_init_client+0x104/0x2f0 [nfsv4]
          [<ffffffffa01539b4>] nfs_get_client+0x314/0x3f0 [nfs]
          [<ffffffffa0153780>] ? nfs_get_client+0xe0/0x3f0 [nfs]
          [<ffffffffa01c83aa>] nfs4_set_client+0x8a/0x110 [nfsv4]
          [<ffffffffa0069708>] ? __rpc_init_priority_wait_queue+0xa8/0xf0 [sunrpc]
          [<ffffffffa01c9b2f>] nfs4_create_server+0x12f/0x390 [nfsv4]
          [<ffffffffa01c1472>] nfs4_remote_mount+0x32/0x60 [nfsv4]
          [<ffffffff81196489>] mount_fs+0x39/0x1b0
          [<ffffffff81166145>] ? __alloc_percpu+0x15/0x20
          [<ffffffff811b276b>] vfs_kern_mount+0x6b/0x150
          [<ffffffffa01c1396>] nfs_do_root_mount+0x86/0xc0 [nfsv4]
          [<ffffffffa01c1784>] nfs4_try_mount+0x44/0xc0 [nfsv4]
          [<ffffffffa01549b7>] ? get_nfs_version+0x27/0x90 [nfs]
          [<ffffffffa0161a2d>] nfs_fs_mount+0x47d/0xd60 [nfs]
          [<ffffffff81a59c5e>] ? mutex_unlock+0xe/0x10
          [<ffffffffa01606a0>] ? nfs_remount+0x430/0x430 [nfs]
          [<ffffffffa01609c0>] ? nfs_clone_super+0x140/0x140 [nfs]
          [<ffffffff81196489>] mount_fs+0x39/0x1b0
          [<ffffffff81166145>] ? __alloc_percpu+0x15/0x20
          [<ffffffff811b276b>] vfs_kern_mount+0x6b/0x150
          [<ffffffff811b5830>] do_mount+0x210/0xbe0
          [<ffffffff811b54ca>] ? copy_mount_options+0x3a/0x160
          [<ffffffff811b651f>] SyS_mount+0x6f/0xb0
          [<ffffffff81a5c852>] system_call_fastpath+0x12/0x17
      
      Sleeping under the rcu_read_lock is bad. This patch fixes it by dropping
      the rcu_read_lock before doing the allocation and then reacquiring it
      and redoing the dereference before doing the copy. If we find that the
      string has somehow grown in the meantime, we'll reallocate and try again.
      
      Cc: <stable@vger.kernel.org> # v3.17+
      Reported-by: "J. Bruce Fields" <bfields@fieldses.org>
      Signed-off-by: Jeff Layton <jlayton@primarydata.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      b3ecba09
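The drop, allocate, reacquire and retry sequence from the sunrpc commit above can be sketched in user space. A `threading.Lock` stands in for the RCU read-side critical section, and the `Acceptor` class, the `stringify` function and its `between` hook are hypothetical names used only to make the race reproducible.

```python
import threading


class Acceptor:
    def __init__(self, name):
        self.lock = threading.Lock()  # stands in for rcu_read_lock()
        self.name = name


def stringify(acc, between=None):
    while True:
        with acc.lock:
            needed = len(acc.name)    # measure under the lock
        buf = bytearray(needed)       # "may sleep": allocate with lock dropped
        if between is not None:
            between()                 # test hook: lets the string change
        with acc.lock:
            current = acc.name        # redo the dereference
            if len(current) <= needed:
                buf[:len(current)] = current
                return bytes(buf[:len(current)])
        # The string grew in the meantime: reallocate and try again.


acc = Acceptor(b"nfs@server")
calls = []


def grow_once():
    if not calls:
        with acc.lock:
            acc.name = b"nfs@server.example.com"
    calls.append(1)


result = stringify(acc, between=grow_once)
print(result, len(calls))
```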
  9. Nov 12, 2014
  10. Nov 11, 2014
    • Eric Dumazet's avatar
      ipv6: fix IPV6_PKTINFO with v4 mapped · 5337b5b7
      Eric Dumazet authored
      
      
      Use IS_ENABLED(CONFIG_IPV6), to enable this code if IPv6 is
      a module.
      
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Fixes: c8e6ad08 ("ipv6: honor IPV6_PKTINFO with v4 mapped addresses on sendmsg")
      Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5337b5b7
    • Daniel Borkmann's avatar
      net: sctp: fix memory leak in auth key management · 4184b2a7
      Daniel Borkmann authored
      
      
      A very minimal and simple user space application allocating an SCTP
      socket, setting SCTP_AUTH_KEY setsockopt(2) on it and then closing
      the socket again will leak the memory containing the authentication
      key from user space:
      
      unreferenced object 0xffff8800837047c0 (size 16):
        comm "a.out", pid 2789, jiffies 4296954322 (age 192.258s)
        hex dump (first 16 bytes):
          01 00 00 00 04 00 00 00 00 00 00 00 00 00 00 00  ................
        backtrace:
          [<ffffffff816d7e8e>] kmemleak_alloc+0x4e/0xb0
          [<ffffffff811c88d8>] __kmalloc+0xe8/0x270
          [<ffffffffa0870c23>] sctp_auth_create_key+0x23/0x50 [sctp]
          [<ffffffffa08718b1>] sctp_auth_set_key+0xa1/0x140 [sctp]
          [<ffffffffa086b383>] sctp_setsockopt+0xd03/0x1180 [sctp]
          [<ffffffff815bfd94>] sock_common_setsockopt+0x14/0x20
          [<ffffffff815beb61>] SyS_setsockopt+0x71/0xd0
          [<ffffffff816e58a9>] system_call_fastpath+0x12/0x17
          [<ffffffffffffffff>] 0xffffffffffffffff
      
      This is bad for two reasons: we can bring down a machine from
      user space when auth_enable=1, and we also leave security-sensitive
      keying material in memory without clearing it after use. The issue is
      that sctp_auth_create_key() already sets the refcount to 1, but after
      allocation sctp_auth_set_key() takes an additional reference on it,
      leaving the key around when we free the socket.
      
      Fixes: 65b07e5d ("[SCTP]: API updates to suport SCTP-AUTH extensions.")
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Cc: Vlad Yasevich <vyasevich@gmail.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4184b2a7
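The reference-count imbalance behind the SCTP auth key leak above can be modelled in a few lines. The class and function names are hypothetical, and zeroing the key on the final put is an assumption that reflects the message's concern about keying material staying in memory.

```python
class AuthKey:
    def __init__(self, data):
        self.data = bytearray(data)
        self.refcnt = 1               # creation already holds one reference
        self.freed = False

    def hold(self):
        self.refcnt += 1

    def put(self):
        self.refcnt -= 1
        if self.refcnt == 0:
            for i in range(len(self.data)):
                self.data[i] = 0      # clear key material before freeing
            self.freed = True


def set_key(data, extra_hold):
    key = AuthKey(data)
    if extra_hold:
        key.hold()                    # the additional, unbalanced reference
    return key


def close_socket(key):
    key.put()                         # the socket drops its single reference


leaky = set_key(b"secret", extra_hold=True)
close_socket(leaky)
fixed = set_key(b"secret", extra_hold=False)
close_socket(fixed)
print(leaky.freed, fixed.freed)       # False True
```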
    • Daniel Borkmann's avatar
      net: sctp: fix NULL pointer dereference in af->from_addr_param on malformed packet · e40607cb
      Daniel Borkmann authored
      
      
      An SCTP server doing ASCONF will panic on malformed INIT ping-of-death
      in the form of:
      
        ------------ INIT[PARAM: SET_PRIMARY_IP] ------------>
      
      While the INIT chunk parameter verification dissects many things
      in order to detect malformed input, it fails to check parameters
      nested inside of parameters. E.g. RFC5061, section 4.2.4 proposes a 'set
      primary IP address' parameter in ASCONF, which has as a subparameter an
      address parameter.
      
      So an attacker may send a parameter type other than SCTP_PARAM_IPV4_ADDRESS
      or SCTP_PARAM_IPV6_ADDRESS, param_type2af() will subsequently return 0
      and thus sctp_get_af_specific() returns NULL, too, which we then happily
      dereference unconditionally through af->from_addr_param().
      
      The trace for the log:
      
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000078
      IP: [<ffffffffa01e9c62>] sctp_process_init+0x492/0x990 [sctp]
      PGD 0
      Oops: 0000 [#1] SMP
      [...]
      Pid: 0, comm: swapper Not tainted 2.6.32-504.el6.x86_64 #1 Bochs Bochs
      RIP: 0010:[<ffffffffa01e9c62>]  [<ffffffffa01e9c62>] sctp_process_init+0x492/0x990 [sctp]
      [...]
      Call Trace:
       <IRQ>
       [<ffffffffa01f2add>] ? sctp_bind_addr_copy+0x5d/0xe0 [sctp]
       [<ffffffffa01e1fcb>] sctp_sf_do_5_1B_init+0x21b/0x340 [sctp]
       [<ffffffffa01e3751>] sctp_do_sm+0x71/0x1210 [sctp]
       [<ffffffffa01e5c09>] ? sctp_endpoint_lookup_assoc+0xc9/0xf0 [sctp]
       [<ffffffffa01e61f6>] sctp_endpoint_bh_rcv+0x116/0x230 [sctp]
       [<ffffffffa01ee986>] sctp_inq_push+0x56/0x80 [sctp]
       [<ffffffffa01fcc42>] sctp_rcv+0x982/0xa10 [sctp]
       [<ffffffffa01d5123>] ? ipt_local_in_hook+0x23/0x28 [iptable_filter]
       [<ffffffff8148bdc9>] ? nf_iterate+0x69/0xb0
       [<ffffffff81496d10>] ? ip_local_deliver_finish+0x0/0x2d0
       [<ffffffff8148bf86>] ? nf_hook_slow+0x76/0x120
       [<ffffffff81496d10>] ? ip_local_deliver_finish+0x0/0x2d0
      [...]
      
      A minimal way to address this is to check for NULL as we do on all
      other such occasions where we know sctp_get_af_specific() could
      possibly return with NULL.
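
      The NULL check described above can be sketched in userspace as
      follows. This is only an illustration, not the kernel patch: every
      name here (type_to_family, get_af_ops, process_set_primary, the
      enum values) is a hypothetical stand-in for the helpers the commit
      message mentions.

      ```c
      #include <stddef.h>

      /* Hypothetical stand-ins for parameter types and address families. */
      enum { PARAM_IPV4_ADDRESS = 5, PARAM_IPV6_ADDRESS = 6 };
      enum { FAMILY_NONE = 0, FAMILY_INET = 2, FAMILY_INET6 = 10 };

      struct af_ops {
          int family;
          int (*from_addr_param)(const unsigned char *raw, size_t len);
      };

      static int parse_v4(const unsigned char *raw, size_t len)
      {
          (void)raw;
          return len >= 4 ? 0 : -1;
      }

      static int parse_v6(const unsigned char *raw, size_t len)
      {
          (void)raw;
          return len >= 16 ? 0 : -1;
      }

      static const struct af_ops v4_ops = { FAMILY_INET, parse_v4 };
      static const struct af_ops v6_ops = { FAMILY_INET6, parse_v6 };

      /* Maps a parameter type to a family; 0 means "not an address type". */
      static int type_to_family(int param_type)
      {
          switch (param_type) {
          case PARAM_IPV4_ADDRESS: return FAMILY_INET;
          case PARAM_IPV6_ADDRESS: return FAMILY_INET6;
          default:                 return FAMILY_NONE;
          }
      }

      /* Returns NULL for any family it does not know about. */
      static const struct af_ops *get_af_ops(int family)
      {
          switch (family) {
          case FAMILY_INET:  return &v4_ops;
          case FAMILY_INET6: return &v6_ops;
          default:           return NULL;
          }
      }

      /* Returns 0 on success, -1 if the nested parameter is malformed. */
      int process_set_primary(int param_type, const unsigned char *raw, size_t len)
      {
          const struct af_ops *af = get_af_ops(type_to_family(param_type));

          if (af == NULL)   /* unknown nested type: reject instead of crashing */
              return -1;
          return af->from_addr_param(raw, len);
      }
      ```

      With the check in place, an unexpected nested parameter type makes
      process_set_primary() return an error instead of calling through a
      NULL pointer.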
      
      Fixes: d6de3097 ("[SCTP]: Add the handling of "Set Primary IP Address" parameter to INIT")
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Cc: Vlad Yasevich <vyasevich@gmail.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e40607cb
    • netfilter: ipset: small potential read beyond the end of buffer · 2196937e
      Dan Carpenter authored
      
      
      We could be reading 8 bytes into a 4 byte buffer here.  It seems
      harmless but adding a check is the right thing to do and it silences a
      static checker warning.
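
      The kind of check described above can be sketched generically:
      compare the caller-supplied length with the size of the structure
      before reading it. The structure layout and function name below are
      hypothetical and are not taken from the ipset code.

      ```c
      #include <stddef.h>
      #include <stdint.h>
      #include <string.h>

      /* Hypothetical 8-byte request layout; not the real ipset structure. */
      struct version_req {
          uint32_t op;
          uint32_t version;
      };

      /*
       * Copies a request out of a caller-supplied buffer.
       * Returns 0 on success, -1 if the buffer is too short for the request.
       */
      int read_version_req(const void *buf, size_t len, struct version_req *out)
      {
          if (len < sizeof(*out))   /* a 4-byte buffer cannot hold 8 bytes */
              return -1;
          memcpy(out, buf, sizeof(*out));
          return 0;
      }
      ```

      A caller that passes only four bytes now gets an error back rather
      than having the function read past the end of its buffer.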
      
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      2196937e
  11. Nov 10, 2014
  12. Nov 06, 2014
    • net: dsa: slave: Fix autoneg for phys on switch MDIO bus · b31f65fb
      Andrew Lunn authored
      
      
      When the ports' PHYs are connected to the switch's internal MDIO bus,
      we need to connect the PHY to the slave netdev; otherwise
      auto-negotiation etc. does not work.
      
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b31f65fb
    • mac80211: Fix regression that triggers a kernel BUG with CCMP · 4f031fa9
      Ronald Wahl authored
      
      
      Commit 7ec7c4a9 (mac80211: port CCMP to cryptoapi's CCM driver)
      introduced a regression when decrypting empty packets (data_len == 0).
      This leads to backtraces like:
      
      (scatterwalk_start) from [<c01312f4>] (scatterwalk_map_and_copy+0x2c/0xa8)
      (scatterwalk_map_and_copy) from [<c013a5a0>] (crypto_ccm_decrypt+0x7c/0x25c)
      (crypto_ccm_decrypt) from [<c032886c>] (ieee80211_aes_ccm_decrypt+0x160/0x170)
      (ieee80211_aes_ccm_decrypt) from [<c031c628>] (ieee80211_crypto_ccmp_decrypt+0x1ac/0x238)
      (ieee80211_crypto_ccmp_decrypt) from [<c032ef28>] (ieee80211_rx_handlers+0x870/0x1d24)
      (ieee80211_rx_handlers) from [<c0330c7c>] (ieee80211_prepare_and_rx_handle+0x8a0/0x91c)
      (ieee80211_prepare_and_rx_handle) from [<c0331260>] (ieee80211_rx+0x568/0x730)
      (ieee80211_rx) from [<c01d3054>] (__carl9170_rx+0x94c/0xa20)
      (__carl9170_rx) from [<c01d3324>] (carl9170_rx_stream+0x1fc/0x320)
      (carl9170_rx_stream) from [<c01cbccc>] (carl9170_usb_tasklet+0x80/0xc8)
      (carl9170_usb_tasklet) from [<c00199dc>] (tasklet_hi_action+0x88/0xcc)
      (tasklet_hi_action) from [<c00193c8>] (__do_softirq+0xcc/0x200)
      (__do_softirq) from [<c0019734>] (irq_exit+0x80/0xe0)
      (irq_exit) from [<c0009c10>] (handle_IRQ+0x64/0x80)
      (handle_IRQ) from [<c000c3a0>] (__irq_svc+0x40/0x4c)
      (__irq_svc) from [<c0009d44>] (arch_cpu_idle+0x2c/0x34)
      
      Such packets can appear for example when using the carl9170 wireless driver
      because hardware sometimes generates garbage when the internal FIFO overruns.
      
      This patch adds an additional length check.
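
      As a rough sketch of such a length check (not the actual mac80211
      change), a wrapper can reject an empty payload before it reaches
      the cipher. The names cipher_decrypt, decrypt_calls and
      ccmp_decrypt_checked are hypothetical, and the cipher is replaced
      by a stub that only counts how often it is called.

      ```c
      #include <stddef.h>

      /* Counts calls into the stub cipher so callers can see it was skipped. */
      int decrypt_calls = 0;

      /* Hypothetical stand-in for the real cipher, assumed to need data_len > 0. */
      static int cipher_decrypt(unsigned char *data, size_t data_len)
      {
          (void)data;
          (void)data_len;
          decrypt_calls++;
          return 0;
      }

      /* Returns -1 for empty payloads without touching the cipher at all. */
      int ccmp_decrypt_checked(unsigned char *data, size_t data_len)
      {
          if (data_len == 0)
              return -1;
          return cipher_decrypt(data, data_len);
      }
      ```

      Returning early keeps a zero-length frame, such as the garbage a
      FIFO overrun may produce, away from code that assumes there is at
      least one byte to process.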
      
      Cc: stable@vger.kernel.org
      Fixes: 7ec7c4a9 ("mac80211: port CCMP to cryptoapi's CCM driver")
      Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Ronald Wahl <ronald.wahl@raritan.com>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      4f031fa9