  1. Dec 08, 2020
  2. Sep 26, 2020
    • net: dsa: allow drivers to request promiscuous mode on master · c3975400
      Vladimir Oltean authored
      
      
      Currently DSA assumes that taggers don't mess with the destination MAC
      address of the frames on RX. That is not always the case. Some DSA
      headers are placed before the Ethernet header (ocelot), and others
      simply mangle random bytes from the destination MAC address (sja1105
      with its incl_srcpt option).
      
      Currently the DSA master goes to promiscuous mode automatically when the
      slave devices go too (such as when enslaved to a bridge), but in
      standalone mode this is a problem that needs to be dealt with.
      
      So give drivers the possibility to signal that frames using their tagging
      protocol will otherwise be randomly dropped, and let DSA deal with fixing that.
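
As a rough illustration of the opt-in described above, the following userspace sketch models a tagger that asks for the master to be kept promiscuous. All types and function names here (`tagger_model`, `master_model`, `master_setup`, `master_teardown`) are hypothetical stand-ins, not the kernel's structures, and the promiscuity counter only mimics the idea of a reference count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a tagging protocol descriptor. */
struct tagger_model {
    const char *name;
    bool promisc_on_master; /* tagger alters or displaces the destination MAC */
};

/* Hypothetical model of the DSA master: only a promiscuity refcount. */
struct master_model {
    int promiscuity;
};

/* When the master is set up, honor the tagger's request (assumed flow). */
static void master_setup(struct master_model *m, const struct tagger_model *t)
{
    assert(m != NULL && t != NULL);
    if (t->promisc_on_master)
        m->promiscuity++;
}

/* On teardown, drop the reference that setup took, if any. */
static void master_teardown(struct master_model *m, const struct tagger_model *t)
{
    assert(m != NULL && t != NULL);
    if (t->promisc_on_master)
        m->promiscuity--;
}
```

With this shape, a tagger that leaves the destination MAC untouched costs nothing, and the request is undone symmetrically on teardown.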
      
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c3975400
  3. Jul 23, 2020
    • net: dsa: stop overriding master's ndo_get_phys_port_name · 5df5661a
      Vladimir Oltean authored
      
      
      The purpose of this override is to give the user an indication of what
      the number of the CPU port is (in DSA, the CPU port is a hardware
      implementation detail and not a network interface capable of traffic).
      
      However, it has always failed (by design) at providing this information
      to the user in a reliable fashion.
      
      Prior to commit 3369afba ("net: Call into DSA netdevice_ops
      wrappers"), the behavior was to only override this callback if it was
      not provided by the DSA master.
      
      That was its first failure: if the DSA master itself was a DSA port or a
      switchdev, then the user would not see the number of the CPU port in
      /sys/class/net/eth0/phys_port_name, but the number of the DSA master
      port within its respective physical switch.
      
      But that was actually ok in a way. The commit mentioned above changed
      that behavior, and now overrides the master's ndo_get_phys_port_name
      unconditionally. That comes with problems of its own, which are worse in
      a way.
      
      The idea is that it's typical for switchdev users to have udev rules for
      consistent interface naming. These are based, among other things, on
      the phys_port_name attribute. If we let the DSA switch at the bottom
      start randomly overriding ndo_get_phys_port_name with its own CPU
      port, we basically lose any predictability in interface naming, or even
      uniqueness, for that matter.
      
      So, there are reasons to let DSA override the master's callback (to
      provide a consistent interface, a number which has a clear meaning and
      must not be interpreted according to context), and there are reasons to
      not let DSA override it (it breaks udev matching for the DSA master).
      
      But, there is an alternative method for users to retrieve the number of
      the CPU port of each DSA switch in the system:
      
        $ devlink port
        pci/0000:00:00.5/0: type eth netdev swp0 flavour physical port 0
        pci/0000:00:00.5/2: type eth netdev swp2 flavour physical port 2
        pci/0000:00:00.5/4: type notset flavour cpu port 4
        spi/spi2.0/0: type eth netdev sw0p0 flavour physical port 0
        spi/spi2.0/1: type eth netdev sw0p1 flavour physical port 1
        spi/spi2.0/2: type eth netdev sw0p2 flavour physical port 2
        spi/spi2.0/4: type notset flavour cpu port 4
        spi/spi2.1/0: type eth netdev sw1p0 flavour physical port 0
        spi/spi2.1/1: type eth netdev sw1p1 flavour physical port 1
        spi/spi2.1/2: type eth netdev sw1p2 flavour physical port 2
        spi/spi2.1/3: type eth netdev sw1p3 flavour physical port 3
        spi/spi2.1/4: type notset flavour cpu port 4
      
      So remove this duplicated, unreliable and troublesome method. From this
      patch on, the phys_port_name attribute of the DSA master will only
      contain information about itself (if at all). If the users need reliable
      information about the CPU port they're probably using devlink anyway.
      
      Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
      Acked-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5df5661a
  4. Jul 20, 2020
    • net: dsa: Setup dsa_netdev_ops · 9c0c7014
      Florian Fainelli authored
      
      
      Now that we have all the infrastructure in place for calling into the
      dsa_ptr->netdev_ops function pointers, install them when we configure
      the DSA CPU/management interface and tear them down. The flow is
      unchanged from before, but now we preserve equality of tests when
      network device drivers do tests like dev->netdev_ops == &foo_ops which
      was not the case before since we were allocating an entirely new
      structure.
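
The pointer-equality point can be sketched in plain C. This is a simplified userspace model, not kernel code: `ops_model`, `dsa_model`, `dev_model` and every function below are hypothetical names chosen for illustration. It contrasts copying the ops table (which changes the pointer a driver compares against) with hanging DSA's overrides off a separate pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct ops_model { int (*do_ioctl)(int cmd); };

/* Stand-in for the per-master DSA state that carries the overrides. */
struct dsa_model { const struct ops_model *netdev_ops; };

struct dev_model {
    const struct ops_model *netdev_ops; /* the driver's own table */
    struct dsa_model *dsa_ptr;          /* NULL unless DSA is attached */
};

static int foo_ioctl(int cmd) { return cmd; }
static int dsa_ioctl(int cmd) { return -cmd; }

static const struct ops_model foo_ops = { .do_ioctl = foo_ioctl };
static const struct ops_model dsa_ops = { .do_ioctl = dsa_ioctl };

/* The kind of identity test a driver might perform. */
static bool driver_owns(const struct dev_model *dev)
{
    return dev->netdev_ops == &foo_ops;
}

/* Old approach: copy the table, patch one callback, swap the pointer. */
static struct ops_model shadow_ops;
static void install_by_copy(struct dev_model *dev)
{
    shadow_ops = *dev->netdev_ops;
    shadow_ops.do_ioctl = dsa_ioctl;
    dev->netdev_ops = &shadow_ops;
}

/* New approach: leave netdev_ops alone, attach overrides on the side. */
static void install_side_table(struct dev_model *dev, struct dsa_model *dsa)
{
    dsa->netdev_ops = &dsa_ops;
    dev->dsa_ptr = dsa;
}

/* Dispatch wrapper: prefer the DSA override when one is installed. */
static int call_ioctl(const struct dev_model *dev, int cmd)
{
    assert(dev != NULL);
    if (dev->dsa_ptr && dev->dsa_ptr->netdev_ops &&
        dev->dsa_ptr->netdev_ops->do_ioctl)
        return dev->dsa_ptr->netdev_ops->do_ioctl(cmd);
    return dev->netdev_ops->do_ioctl(cmd);
}
```

After `install_by_copy()` the identity test fails even though the device is still the same driver's; after `install_side_table()` it keeps passing while calls are still routed to the override.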
      
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9c0c7014
  5. Jun 09, 2020
    • net: change addr_list_lock back to static key · 845e0ebb
      Cong Wang authored
      
      
      The dynamic key update for addr_list_lock still causes trouble,
      for example the following race condition still exists:
      
      CPU 0:				CPU 1:
      (RCU read lock)			(RTNL lock)
      dev_mc_seq_show()		netdev_update_lockdep_key()
      				  -> lockdep_unregister_key()
       -> netif_addr_lock_bh()
      
      because lockdep doesn't provide an API to update it atomically.
      Therefore, we have to move it back to static keys and use subclass
      for nest locking like before.
      
      In commit 1a33e10e ("net: partially revert dynamic lockdep key
      changes"), I already reverted most parts of commit ab92d68f
      ("net: core: add generic lockdep keys").
      
      This patch reverts the rest and also part of commit f3b0a18b
      ("net: remove unnecessary variables and callback"). After this
      patch, addr_list_lock changes back to using static keys and
      subclasses to satisfy lockdep. Thanks to dev->lower_level, we do
      not have to change back to ->ndo_get_lock_subclass().
      
      And hopefully this reduces some syzbot lockdep noise too.
      
      Reported-by: <syzbot+f3a0e80c34b3fc28ac5e@syzkaller.appspotmail.com>
      Cc: Taehee Yoo <ap420073@gmail.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      845e0ebb
  6. May 07, 2020
  7. Mar 27, 2020
    • net: dsa: configure the MTU for switch ports · bfcb8132
      Vladimir Oltean authored
      
      
      It is useful to be able to configure port policers on a switch to accept
      frames of various sizes:
      
      - Increase the MTU for better throughput from the default of 1500 if it
        is known that there is no 10/100 Mbps device in the network.
      - Decrease the MTU to limit the latency of high-priority frames under
        congestion, or work around various network segments that add extra
        headers to packets which can't be fragmented.
      
      For DSA slave ports, this is mostly a pass-through callback, called
      through the regular ndo ops and at probe time (to ensure consistency
      across all supported switches).
      
      The CPU port is called with an MTU equal to the largest configured MTU
      of the slave ports. The assumption is that the user might want to
      sustain a bidirectional conversation with a partner over any switch
      port.
      
      The DSA master is configured the same as the CPU port, plus the tagger
      overhead. Since the MTU is by definition L2 payload (sans Ethernet
      header), it is up to each individual driver to figure out if it needs to
      do anything special for its frame tags on the CPU port (it shouldn't
      except in special cases). So the MTU does not contain the tagger
      overhead on the CPU port.
      However the MTU of the DSA master, minus the tagger overhead, is used as
      a proxy for the MTU of the CPU port, which does not have a net device.
      This is to avoid uselessly calling the .change_mtu function on the CPU
      port when nothing should change.
      
      So it is safe to assume that the DSA master and the CPU port MTUs are
      apart by exactly the tagger's overhead in bytes.
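
The MTU relationships described above reduce to a little arithmetic, sketched here in userspace C. The helper names are hypothetical and the tagger overhead is just a parameter; nothing below is taken from the actual patch.

```c
#include <assert.h>
#include <stddef.h>

/* CPU port MTU: the largest MTU configured on any slave (user) port. */
static int cpu_port_mtu(const int *slave_mtus, size_t n)
{
    assert(slave_mtus != NULL && n > 0);
    int largest = slave_mtus[0];
    for (size_t i = 1; i < n; i++)
        if (slave_mtus[i] > largest)
            largest = slave_mtus[i];
    return largest;
}

/* DSA master MTU: the CPU port MTU plus the tagger overhead in bytes. */
static int master_mtu(int cpu_mtu, int tag_overhead)
{
    return cpu_mtu + tag_overhead;
}

/* Proxy for the CPU port MTU, which has no net device of its own. */
static int cpu_mtu_from_master(int master_mtu_value, int tag_overhead)
{
    return master_mtu_value - tag_overhead;
}
```

For example, with slave MTUs of 1500, 9000 and 2000 and an assumed 4-byte tag, the CPU port would be configured for 9000 and the master for 9004; comparing the proxy value against the newly computed one is what allows a redundant MTU change on the CPU port to be skipped.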
      
      Some changes were made around dsa_master_set_mtu(), a function which has
      now been removed, for 2 reasons:
        - dev_set_mtu() already calls dev_validate_mtu(), so it's redundant to
          do the same thing in DSA
        - __dev_set_mtu() returns 0 if ops->ndo_change_mtu is an absent method
      That is to say, there's no need for this function in DSA: we can safely
      call dev_set_mtu() directly, take the rtnl lock when necessary, and just
      propagate whatever errors get reported (since the user probably wants to
      be informed).
      
      Some inspiration (mainly in the MTU DSA notifier) was taken from a
      vaguely similar patch from Murali and Florian, who are credited as
      co-developers down below.
      
      Co-developed-by: Murali Krishna Policharla <murali.policharla@broadcom.com>
      Signed-off-by: Murali Krishna Policharla <murali.policharla@broadcom.com>
      Co-developed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bfcb8132
  8. Dec 28, 2019
    • net: dsa: Deny PTP on master if switch supports it · f685e609
      Vladimir Oltean authored
      It is possible to kill PTP on a DSA switch completely and absolutely,
      until a reboot, with a simple command:
      
      tcpdump -i eth2 -j adapter_unsynced
      
      where eth2 is the switch's DSA master.
      
      Why? Well, in short, the PTP API in place today is a bit rudimentary and
      relies on applications to retrieve the TX timestamps by polling the
      error queue and looking at the cmsg structure. But there is no timestamp
      identification of any sort (except whether it's HW or SW), you don't
      know how many more timestamps are there to come, which one is this one,
      from whom it is, etc. In other words, the SO_TIMESTAMPING API is
      fundamentally limited in that you can get a single HW timestamp from the
      stack.
      
      And the "-j adapter_unsynced" flag of tcpdump enables hardware
      timestamping.
      
      So let's imagine what happens when the DSA master decides it wants to
      deliver TX timestamps to the skb's socket too:
      - The timestamp that the user space sees is taken by the DSA master.
        Whereas the RX timestamp will eventually be overwritten by the DSA
        switch. So the RX and TX timestamps will be in different time bases
        (aka garbage).
      - The user space applications have no way to deal with the second (real)
        TX timestamp finally delivered by the DSA switch, or even to know to
        wait for it.
      
      Take ptp4l from the linuxptp project, for example. This is its behavior
      after running tcpdump, before the patch:
      
      ptp4l[172]: [6469.594] Unexpected data on socket err queue:
      ptp4l[172]: [6469.693] rms    8 max   16 freq -21257 +/-  11 delay   748 +/-   0
      ptp4l[172]: [6469.711] Unexpected data on socket err queue:
      ptp4l[172]: 0020 00 00 00 1f 7b ff fe 63 02 48 00 03 aa 05 00 fd
      ptp4l[172]: 0030 00 00 00 00 00 00 00 00 00 00
      ptp4l[172]: [6469.721] Unexpected data on socket err queue:
      ptp4l[172]: 0000 01 80 c2 00 00 0e 00 1f 7b 63 02 48 88 f7 10 02
      ptp4l[172]: 0010 00 2c 00 00 02 00 00 00 00 00 00 00 00 00 00 00
      ptp4l[172]: 0020 00 00 00 1f 7b ff fe 63 02 48 00 01 c6 b1 00 fd
      ptp4l[172]: 0030 00 00 00 00 00 00 00 00 00 00
      ptp4l[172]: [6469.838] Unexpected data on socket err queue:
      ptp4l[172]: 0000 01 80 c2 00 00 0e 00 1f 7b 63 02 48 88 f7 10 02
      ptp4l[172]: 0010 00 2c 00 00 02 00 00 00 00 00 00 00 00 00 00 00
      ptp4l[172]: 0020 00 00 00 1f 7b ff fe 63 02 48 00 03 aa 06 00 fd
      ptp4l[172]: 0030 00 00 00 00 00 00 00 00 00 00
      ptp4l[172]: [6469.848] Unexpected data on socket err queue:
      ptp4l[172]: 0000 01 80 c2 00 00 0e 00 1f 7b 63 02 48 88 f7 13 02
      ptp4l[172]: 0010 00 36 00 00 02 00 00 00 00 00 00 00 00 00 00 00
      ptp4l[172]: 0020 00 00 00 1f 7b ff fe 63 02 48 00 04 1a 45 05 7f
      ptp4l[172]: 0030 00 00 5e 05 41 32 27 c2 1a 68 00 04 9f ff fe 05
      ptp4l[172]: 0040 de 06 00 01
      ptp4l[172]: [6469.855] Unexpected data on socket err queue:
      ptp4l[172]: 0000 01 80 c2 00 00 0e 00 1f 7b 63 02 48 88 f7 10 02
      ptp4l[172]: 0010 00 2c 00 00 02 00 00 00 00 00 00 00 00 00 00 00
      ptp4l[172]: 0020 00 00 00 1f 7b ff fe 63 02 48 00 01 c6 b2 00 fd
      ptp4l[172]: 0030 00 00 00 00 00 00 00 00 00 00
      ptp4l[172]: [6469.974] Unexpected data on socket err queue:
      ptp4l[172]: 0000 01 80 c2 00 00 0e 00 1f 7b 63 02 48 88 f7 10 02
      ptp4l[172]: 0010 00 2c 00 00 02 00 00 00 00 00 00 00 00 00 00 00
      ptp4l[172]: 0020 00 00 00 1f 7b ff fe 63 02 48 00 03 aa 07 00 fd
      ptp4l[172]: 0030 00 00 00 00 00 00 00 00 00 00
      
      The ptp4l program itself is heavily patched to show this (more details
      here [0]). Otherwise, by default it just hangs.
      
      On the other hand, with the DSA patch to disallow HW timestamping
      applied:
      
      tcpdump -i eth2 -j adapter_unsynced
      tcpdump: SIOCSHWTSTAMP failed: Device or resource busy
      
      So it is a fact of life that PTP timestamping on the DSA master is
      incompatible with timestamping on the switch MAC, at least with the
      current API. And if the switch supports PTP, taking the timestamps from
      the switch MAC is highly preferable anyway, due to the fact that those
      don't contain the queuing latencies of the switch. So just disallow PTP
      on the DSA master if there is any PTP-capable switch attached.
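
The policy in the last sentence can be modelled as a simple check. This is a userspace sketch under stated assumptions: `port_model` and `master_hwtstamp_request` are hypothetical names, and the only detail taken from the text is that the request fails with "Device or resource busy" (EBUSY).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a switch port attached below the DSA master. */
struct port_model {
    bool supports_hwtstamp; /* the port's switch can timestamp PTP frames */
};

/*
 * Decide whether a hardware timestamping request on the DSA master may
 * proceed. Returns -EBUSY if any attached port is PTP-capable; returns 0
 * otherwise, at which point the request would reach the master's own handler.
 */
static int master_hwtstamp_request(const struct port_model *ports, size_t n)
{
    assert(ports != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (ports[i].supports_hwtstamp)
            return -EBUSY;
    return 0;
}
```

A master with no PTP-capable switch below it is unaffected, which keeps the restriction as narrow as the commit message describes.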
      
      [0]: https://sourceforge.net/p/linuxptp/mailman/message/36880648/
      
      
      
      Fixes: 0336369d ("net: dsa: forward hardware timestamping ioctls to switch driver")
      Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
      Acked-by: Richard Cochran <richardcochran@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f685e609
  9. Oct 24, 2019
    • net: core: add generic lockdep keys · ab92d68f
      Taehee Yoo authored
      
      
      Some interface types can be nested
      (VLAN, BONDING, TEAM, MACSEC, MACVLAN, IPVLAN, VIRT_WIFI, VXLAN, etc.).
      These interface types should set a lockdep class because, without a lockdep
      class key, lockdep always warns about nonexistent circular locking.
      
      In the current code, these interfaces have their own lockdep class keys and
      manage them themselves, so there is a lot of duplicated code under
      drivers/net/ and net/.
      This patch adds new generic lockdep keys and some helper functions for them.
      
      This patch makes the following changes.
      a) Add lockdep class keys in struct net_device
         - qdisc_running, xmit, addr_list, qdisc_busylock
         - these keys are used as dynamic lockdep keys.
      b) When net_device is being allocated, lockdep keys are registered.
         - alloc_netdev_mqs()
      c) When net_device is being freed, lockdep keys are unregistered.
         - free_netdev()
      d) Add generic lockdep key helper function
         - netdev_register_lockdep_key()
         - netdev_unregister_lockdep_key()
         - netdev_update_lockdep_key()
      e) Remove unnecessary generic lockdep macro and functions
      f) Remove unnecessary lockdep code from each interface type.
      
      After this patch, interface modules no longer need to maintain
      their own lockdep keys.
      
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ab92d68f
  10. Aug 06, 2019
    • net: dsa: dump CPU port regs through master · 48e23311
      Vivien Didelot authored
      
      
      Merge the CPU port registers dump into the master interface registers
      dump through ethtool, by nesting the ethtool_drvinfo and ethtool_regs
      structures of the CPU port into the dump.
      
      drvinfo->regdump_len will contain the full data length, while regs->len
      will contain only the master interface registers dump length.
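
One way to picture the two lengths is as offsets into a single buffer. The sketch below is only an illustration: the ordering (master registers first, then the CPU port's nested drvinfo header, regs header and register data) is an assumption, and `dump_layout_model` and `plan_dump` are hypothetical names. The header sizes are passed in as parameters rather than taken from any kernel header.

```c
#include <assert.h>
#include <stddef.h>

struct dump_layout_model {
    size_t regs_len;        /* reported via regs->len: master registers only */
    size_t regdump_len;     /* reported via drvinfo->regdump_len: everything */
    size_t cpu_drvinfo_off; /* assumed start of the nested drvinfo header */
    size_t cpu_regs_off;    /* assumed start of the nested regs header */
    size_t cpu_data_off;    /* assumed start of the CPU port register data */
};

/* Compute the assumed layout from the four component sizes, in bytes. */
static struct dump_layout_model
plan_dump(size_t master_len, size_t drvinfo_size,
          size_t regs_hdr_size, size_t cpu_len)
{
    struct dump_layout_model l;

    l.regs_len = master_len;
    l.cpu_drvinfo_off = master_len;
    l.cpu_regs_off = l.cpu_drvinfo_off + drvinfo_size;
    l.cpu_data_off = l.cpu_regs_off + regs_hdr_size;
    l.regdump_len = l.cpu_data_off + cpu_len;

    /* The full dump can never be shorter than the master's part. */
    assert(l.regdump_len >= l.regs_len);
    return l;
}
```

A consumer that only understands the master's registers can stop at `regs_len`, while one that knows about the nesting can keep reading up to `regdump_len`.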
      
      This allows, for example, dumping the CPU port registers on a ZII Dev
      C board like this:
      
          # ethtool -d eth1
          0x004:                                              0x00000000
          0x008:                                              0x0a8000aa
          0x010:                                              0x01000000
          0x014:                                              0x00000000
          0x024:                                              0xf0000102
          0x040:                                              0x6d82c800
          0x044:                                              0x00000020
          0x064:                                              0x40000000
          0x084: RCR (Receive Control Register)               0x47c00104
              MAX_FL (Maximum frame length)                   1984
              FCE (Flow control enable)                       0
              BC_REJ (Broadcast frame reject)                 0
              PROM (Promiscuous mode)                         0
              DRT (Disable receive on transmit)               0
              LOOP (Internal loopback)                        0
          0x0c4: TCR (Transmit Control Register)              0x00000004
              RFC_PAUSE (Receive frame control pause)         0
              TFC_PAUSE (Transmit frame control pause)        0
              FDEN (Full duplex enable)                       1
              HBC (Heartbeat control)                         0
              GTS (Graceful transmit stop)                    0
          0x0e4:                                              0x76735d6d
          0x0e8:                                              0x7e9e8808
          0x0ec:                                              0x00010000
          .
          .
          .
          88E6352  Switch Port Registers
          ------------------------------
          00: Port Status                            0x4d04
                Pause Enabled                        0
                My Pause                             1
                802.3 PHY Detected                   0
                Link Status                          Up
                Duplex                               Full
                Speed                                100 or 200 Mbps
                EEE Enabled                          0
                Transmitter Paused                   0
                Flow Control                         0
                Config Mode                          0x4
          01: Physical Control                       0x003d
                RGMII Receive Timing Control         Default
                RGMII Transmit Timing Control        Default
                200 BASE Mode                        100
                Flow Control's Forced value          0
                Force Flow Control                   0
                Link's Forced value                  Up
                Force Link                           1
                Duplex's Forced value                Full
                Force Duplex                         1
                Force Speed                          100 or 200 Mbps
          .
          .
          .
      
      Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      48e23311
  11. May 30, 2019
  12. Feb 05, 2019
    • net: dsa: Fix lockdep false positive splat · c8101f77
      Marc Zyngier authored
      
      
      Creating a macvtap on a DSA-backed interface results in the following
      splat when lockdep is enabled:
      
      [   19.638080] IPv6: ADDRCONF(NETDEV_CHANGE): lan0: link becomes ready
      [   23.041198] device lan0 entered promiscuous mode
      [   23.043445] device eth0 entered promiscuous mode
      [   23.049255]
      [   23.049557] ============================================
      [   23.055021] WARNING: possible recursive locking detected
      [   23.060490] 5.0.0-rc3-00013-g56c857a1b8d3 #118 Not tainted
      [   23.066132] --------------------------------------------
      [   23.071598] ip/2861 is trying to acquire lock:
      [   23.076171] 00000000f61990cb (_xmit_ETHER){+...}, at: dev_set_rx_mode+0x1c/0x38
      [   23.083693]
      [   23.083693] but task is already holding lock:
      [   23.089696] 00000000ecf0c3b4 (_xmit_ETHER){+...}, at: dev_uc_add+0x24/0x70
      [   23.096774]
      [   23.096774] other info that might help us debug this:
      [   23.103494]  Possible unsafe locking scenario:
      [   23.103494]
      [   23.109584]        CPU0
      [   23.112093]        ----
      [   23.114601]   lock(_xmit_ETHER);
      [   23.117917]   lock(_xmit_ETHER);
      [   23.121233]
      [   23.121233]  *** DEADLOCK ***
      [   23.121233]
      [   23.127325]  May be due to missing lock nesting notation
      [   23.127325]
      [   23.134315] 2 locks held by ip/2861:
      [   23.137987]  #0: 000000003b766c72 (rtnl_mutex){+.+.}, at: rtnetlink_rcv_msg+0x338/0x4e0
      [   23.146231]  #1: 00000000ecf0c3b4 (_xmit_ETHER){+...}, at: dev_uc_add+0x24/0x70
      [   23.153757]
      [   23.153757] stack backtrace:
      [   23.158243] CPU: 0 PID: 2861 Comm: ip Not tainted 5.0.0-rc3-00013-g56c857a1b8d3 #118
      [   23.166212] Hardware name: Globalscale Marvell ESPRESSOBin Board (DT)
      [   23.172843] Call trace:
      [   23.175358]  dump_backtrace+0x0/0x188
      [   23.179116]  show_stack+0x14/0x20
      [   23.182524]  dump_stack+0xb4/0xec
      [   23.185928]  __lock_acquire+0x123c/0x1860
      [   23.190048]  lock_acquire+0xc8/0x248
      [   23.193724]  _raw_spin_lock_bh+0x40/0x58
      [   23.197755]  dev_set_rx_mode+0x1c/0x38
      [   23.201607]  dev_set_promiscuity+0x3c/0x50
      [   23.205820]  dsa_slave_change_rx_flags+0x5c/0x70
      [   23.210567]  __dev_set_promiscuity+0x148/0x1e0
      [   23.215136]  __dev_set_rx_mode+0x74/0x98
      [   23.219167]  dev_uc_add+0x54/0x70
      [   23.222575]  macvlan_open+0x170/0x1d0
      [   23.226336]  __dev_open+0xe0/0x160
      [   23.229830]  __dev_change_flags+0x16c/0x1b8
      [   23.234132]  dev_change_flags+0x20/0x60
      [   23.238074]  do_setlink+0x2d0/0xc50
      [   23.241658]  __rtnl_newlink+0x5f8/0x6e8
      [   23.245601]  rtnl_newlink+0x50/0x78
      [   23.249184]  rtnetlink_rcv_msg+0x360/0x4e0
      [   23.253397]  netlink_rcv_skb+0xe8/0x130
      [   23.257338]  rtnetlink_rcv+0x14/0x20
      [   23.261012]  netlink_unicast+0x190/0x210
      [   23.265043]  netlink_sendmsg+0x288/0x350
      [   23.269075]  sock_sendmsg+0x18/0x30
      [   23.272659]  ___sys_sendmsg+0x29c/0x2c8
      [   23.276602]  __sys_sendmsg+0x60/0xb8
      [   23.280276]  __arm64_sys_sendmsg+0x1c/0x28
      [   23.284488]  el0_svc_common+0xd8/0x138
      [   23.288340]  el0_svc_handler+0x24/0x80
      [   23.292192]  el0_svc+0x8/0xc
      
      This looks fairly harmless (no actual deadlock occurs), and is
      fixed in a similar way to c6894dec ("bridge: fix lockdep
      addr_list_lock false positive splat") by putting the addr_list_lock
      in its own lockdep class.
      
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c8101f77
  13. Jan 17, 2019
  14. Dec 09, 2018
  15. Dec 06, 2018
  16. Dec 01, 2018
  17. Apr 27, 2018
  18. Mar 04, 2018
  19. Nov 09, 2017
  20. Oct 01, 2017
  21. Sep 19, 2017