  1. Mar 04, 2019
    • get rid of legacy 'get_ds()' function · 736706be
      Linus Torvalds authored
      
      
      Every in-kernel use of this function defined it to KERNEL_DS (either as
      an actual define, or as an inline function).  It's an entirely
      historical artifact, and long long long ago used to actually read the
      segment selector value of '%ds' on x86.
      
      Which in the kernel is always KERNEL_DS.
      
      Inspired by a patch from Jann Horn that just did this for a very small
      subset of users (the ones in fs/), along with Al who suggested a script.
      I then just took it to the logical extreme and removed all the remaining
      gunk.
      
      Roughly scripted with
      
         git grep -l '(get_ds())' -- :^tools/ | xargs sed -i 's/(get_ds())/(KERNEL_DS)/'
         git grep -lw 'get_ds' -- :^tools/ | xargs sed -i '/^#define get_ds()/d'
      
      plus manual fixups to remove a few unusual usage patterns, the couple of
      inline function cases and to fix up a comment that had become stale.
      
      The 'get_ds()' function remains in an x86 kvm selftest, since in user
      space it actually does something relevant.
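
For illustration, a user-space reader of the segment selector might look like the sketch below. It assumes an x86 target and GCC/Clang inline assembly; the name `read_ds` is hypothetical, and the fallback of 0 on other targets is an assumption made only so the example compiles everywhere.

```c
#include <stdint.h>

/* Hypothetical helper: read the data segment selector from user space.
 * Only meaningful on x86; other targets get a placeholder value of 0. */
static inline uint16_t read_ds(void)
{
#if defined(__x86_64__) || defined(__i386__)
	uint16_t ds;

	/* Copy the 16-bit %ds selector into a register or memory operand. */
	__asm__ __volatile__("mov %%ds, %0" : "=rm"(ds));
	return ds;
#else
	return 0; /* assumption: no comparable register on this target */
#endif
}
```

Inside the kernel the equivalent value is fixed, which is why the in-kernel helper could be replaced by the KERNEL_DS constant.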
      
      Inspired-by: Jann Horn <jannh@google.com>
      Inspired-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      736706be
  2. Feb 21, 2019
  3. Feb 03, 2019
    • sock: Add SO_RCVTIMEO_NEW and SO_SNDTIMEO_NEW · a9beb86a
      Deepa Dinamani authored
      
      
      Add new socket timeout options that are y2038 safe.
      
      Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
      Acked-by: Willem de Bruijn <willemb@google.com>
      Cc: ccaulfie@redhat.com
      Cc: davem@davemloft.net
      Cc: deller@gmx.de
      Cc: paulus@samba.org
      Cc: ralf@linux-mips.org
      Cc: rth@twiddle.net
      Cc: cluster-devel@redhat.com
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: linux-alpha@vger.kernel.org
      Cc: linux-arch@vger.kernel.org
      Cc: linux-mips@vger.kernel.org
      Cc: linux-parisc@vger.kernel.org
      Cc: sparclinux@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a9beb86a
    • socket: Rename SO_RCVTIMEO/ SO_SNDTIMEO with _OLD suffixes · 45bdc661
      Deepa Dinamani authored
      
      
      SO_RCVTIMEO and SO_SNDTIMEO socket options use struct timeval
      as the time format. struct timeval is not y2038 safe.
      The subsequent patches in the series add support for new socket
      timeout options with _NEW suffix that will use y2038 safe
      data structures. Although the existing struct timeval layout
      is wide enough to represent timeouts, libc interprets time_t
      according to a user-defined flag, so these new options provide
      a structure that is the same on all architectures.
      Rename the existing options with _OLD suffix forms so that the
      right option is enabled for userspace applications according
      to the architecture and time_t definition of libc.
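
The selection described above can be sketched roughly as follows. This is a minimal illustration, not the header's actual macro: the helper name `pick_rcvtimeo` and the `SKETCH_` constants are hypothetical, and the numeric values are the asm-generic ones (alpha, mips, parisc and sparc use different numbers).

```c
#include <stddef.h>

/* Assumed asm-generic option numbers; several architectures differ. */
#define SKETCH_SO_RCVTIMEO_OLD 20
#define SKETCH_SO_RCVTIMEO_NEW 66

/* Hypothetical helper: keep the _OLD option when libc's time_t has the
 * same width as the kernel's native long, otherwise select the _NEW
 * option, which uses a 64-bit time layout. */
static int pick_rcvtimeo(size_t time_t_size, size_t kernel_long_size)
{
	return time_t_size == kernel_long_size ? SKETCH_SO_RCVTIMEO_OLD
					       : SKETCH_SO_RCVTIMEO_NEW;
}
```

A 32-bit application built with a 64-bit time_t would pass sizes 8 and 4 and so end up with the _NEW option, while existing binaries keep using the _OLD number unchanged.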
      
      Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
      Acked-by: Willem de Bruijn <willemb@google.com>
      Cc: ccaulfie@redhat.com
      Cc: deller@gmx.de
      Cc: paulus@samba.org
      Cc: ralf@linux-mips.org
      Cc: rth@twiddle.net
      Cc: cluster-devel@redhat.com
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: linux-alpha@vger.kernel.org
      Cc: linux-arch@vger.kernel.org
      Cc: linux-mips@vger.kernel.org
      Cc: linux-parisc@vger.kernel.org
      Cc: sparclinux@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
      45bdc661
    • socket: Add SO_TIMESTAMPING_NEW · 9718475e
      Deepa Dinamani authored
      
      
      Add SO_TIMESTAMPING_NEW variant of socket timestamp options.
      This is the y2038-safe version of SO_TIMESTAMPING_OLD
      for all architectures.
      
      Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
      Acked-by: Willem de Bruijn <willemb@google.com>
      Cc: chris@zankel.net
      Cc: fenghua.yu@intel.com
      Cc: rth@twiddle.net
      Cc: tglx@linutronix.de
      Cc: ubraun@linux.ibm.com
      Cc: linux-alpha@vger.kernel.org
      Cc: linux-arch@vger.kernel.org
      Cc: linux-ia64@vger.kernel.org
      Cc: linux-mips@linux-mips.org
      Cc: linux-s390@vger.kernel.org
      Cc: linux-xtensa@linux-xtensa.org
      Cc: sparclinux@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9718475e
    • socket: Add SO_TIMESTAMP[NS]_NEW · 887feae3
      Deepa Dinamani authored
      
      
      Add SO_TIMESTAMP_NEW and SO_TIMESTAMPNS_NEW variants of
      socket timestamp options.
      These are the y2038-safe versions of SO_TIMESTAMP_OLD
      and SO_TIMESTAMPNS_OLD for all architectures.
      
      Note that the format of scm_timestamping.ts[0] is not changed
      in this patch.
      
      Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
      Acked-by: Willem de Bruijn <willemb@google.com>
      Cc: jejb@parisc-linux.org
      Cc: ralf@linux-mips.org
      Cc: rth@twiddle.net
      Cc: linux-alpha@vger.kernel.org
      Cc: linux-mips@linux-mips.org
      Cc: linux-parisc@vger.kernel.org
      Cc: linux-rdma@vger.kernel.org
      Cc: netdev@vger.kernel.org
      Cc: sparclinux@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
      887feae3
    • sockopt: Rename SO_TIMESTAMP* to SO_TIMESTAMP*_OLD · 7f1bc6e9
      Deepa Dinamani authored
      
      
      SO_TIMESTAMP, SO_TIMESTAMPNS and SO_TIMESTAMPING options, the
      way they are currently defined, are not y2038 safe.
      Subsequent patches in the series add new y2038 safe versions
      of these options which provide 64 bit timestamps on all
      architectures uniformly.
      Hence, rename existing options with OLD tag suffixes.
      
      Also note that the kernel will no longer use the untagged
      SO_TIMESTAMP* and SCM_TIMESTAMP* options internally.
      
      Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
      Acked-by: Willem de Bruijn <willemb@google.com>
      Cc: deller@gmx.de
      Cc: dhowells@redhat.com
      Cc: jejb@parisc-linux.org
      Cc: ralf@linux-mips.org
      Cc: rth@twiddle.net
      Cc: linux-afs@lists.infradead.org
      Cc: linux-alpha@vger.kernel.org
      Cc: linux-arch@vger.kernel.org
      Cc: linux-mips@linux-mips.org
      Cc: linux-parisc@vger.kernel.org
      Cc: linux-rdma@vger.kernel.org
      Cc: sparclinux@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7f1bc6e9
  4. Jan 17, 2019
    • net: introduce SO_BINDTOIFINDEX sockopt · f5dd3d0c
      David Rheinsberg authored
      
      
      This introduces a new generic SOL_SOCKET-level socket option called
      SO_BINDTOIFINDEX. It behaves similarly to SO_BINDTODEVICE, but takes a
      network interface index as argument, rather than the network interface
      name.
      
      User-space often refers to network-interfaces via their index, but has
      to temporarily resolve it to a name for a call into SO_BINDTODEVICE.
      This might pose problems when the network-device is renamed
      asynchronously by other parts of the system. When this happens, the
      SO_BINDTODEVICE might either fail, or worse, it might bind to the wrong
      device.
      
      In most cases user-space only ever operates on devices which it
      either manages itself, or for which it otherwise has a guarantee that
      the device name will not change (e.g., devices that are UP cannot be
      renamed). However, particularly in libraries this guarantee is
      non-obvious and it would be nice if that race-condition simply did
      not exist. It would
      make it easier for those libraries to operate even in situations where
      the device-name might change under the hood.
      
      A real use-case that we recently hit is trying to start the network
      stack early in the initrd but make it survive into the real system.
      Existing distributions rename network-interfaces during the transition
      from initrd into the real system. This, obviously, cannot affect
      devices that are up and running (unless you also consider moving them
      between network-namespaces). However, the network manager now has to
      make sure its management engine for dormant devices will not run in
      parallel to these renames. In particular, when you offload operations
      like DHCP into separate processes, these might set up their sockets
      early, and thus have to resolve the device-name, possibly running into
      this race-condition.
      
      By avoiding a call to resolve the device-name, we no longer depend on
      the name and can run network setup of dormant devices in parallel to
      the transition off the initrd. The SO_BINDTOIFINDEX socket option
      plugs this race.
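
From user space, binding by index might look like the sketch below. The helper name `bind_to_ifindex` is hypothetical, and the fallback definition of `SO_BINDTOIFINDEX` as 62 is an assumption that holds only for architectures using the asm-generic socket numbers; it is needed only when the installed headers predate the option.

```c
#include <errno.h>
#include <sys/socket.h>

#ifndef SO_BINDTOIFINDEX
#define SO_BINDTOIFINDEX 62 /* assumption: asm-generic value */
#endif

/* Hypothetical helper: bind a socket to the interface with the given
 * index. Returns 0 on success or a negative errno value on failure. */
static int bind_to_ifindex(int fd, int ifindex)
{
	if (setsockopt(fd, SOL_SOCKET, SO_BINDTOIFINDEX,
		       &ifindex, sizeof(ifindex)) < 0)
		return -errno;
	return 0;
}
```

A caller that already tracks interfaces by index (for example from netlink) can pass that index straight through, so the interface name never has to be looked up and a concurrent rename cannot redirect the bind.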
      
      Reviewed-by: Tom Gundersen <teg@jklm.no>
      Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
      Acked-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f5dd3d0c
  5. Jan 06, 2019
  6. Jan 04, 2019
    • mm: treewide: remove unused address argument from pte_alloc functions · 4cf58924
      Joel Fernandes (Google) authored
      Patch series "Add support for fast mremap".
      
      This series speeds up the mremap(2) syscall by copying page tables at
      the PMD level even for non-THP systems.  There is concern that the extra
      'address' argument that mremap passes to pte_alloc may do something
      subtle and architecture-related in the future that may make the scheme
      not work.  We also find that there is no point in passing the 'address'
      to pte_alloc since it's unused.  This patch therefore removes this
      argument tree-wide, resulting in a nice negative diff as well.  It also
      ensures along the way that the enabled architectures do not do anything
      funky with the 'address' argument that goes unnoticed by the
      optimization.
      
      Build and boot tested on x86-64.  Build tested on arm64.  The config
      enablement patch for arm64 will be posted in the future after more
      testing.
      
      The changes were obtained by applying the following Coccinelle script
      (thanks Julia for answering all Coccinelle questions!).
      The following fixups were done manually:
      * Removal of the address argument from pte_fragment_alloc
      * Removal of pte_alloc_one_fast definitions from m68k and microblaze.
      
      // Options: --include-headers --no-includes
      // Note: I split the 'identifier fn' line, so if you are manually
      // running it, please unsplit it so it runs for you.
      
      virtual patch
      
      @pte_alloc_func_def depends on patch exists@
      identifier E2;
      identifier fn =~
      "^(__pte_alloc|pte_alloc_one|pte_alloc|__pte_alloc_kernel|pte_alloc_one_kernel)$";
      type T2;
      @@
      
       fn(...
      - , T2 E2
       )
       { ... }
      
      @pte_alloc_func_proto_noarg depends on patch exists@
      type T1, T2, T3, T4;
      identifier fn =~ "^(__pte_alloc|pte_alloc_one|pte_alloc|__pte_alloc_kernel|pte_alloc_one_kernel)$";
      @@
      
      (
      - T3 fn(T1, T2);
      + T3 fn(T1);
      |
      - T3 fn(T1, T2, T4);
      + T3 fn(T1, T2);
      )
      
      @pte_alloc_func_proto depends on patch exists@
      identifier E1, E2, E4;
      type T1, T2, T3, T4;
      identifier fn =~
      "^(__pte_alloc|pte_alloc_one|pte_alloc|__pte_alloc_kernel|pte_alloc_one_kernel)$";
      @@
      
      (
      - T3 fn(T1 E1, T2 E2);
      + T3 fn(T1 E1);
      |
      - T3 fn(T1 E1, T2 E2, T4 E4);
      + T3 fn(T1 E1, T2 E2);
      )
      
      @pte_alloc_func_call depends on patch exists@
      expression E2;
      identifier fn =~
      "^(__pte_alloc|pte_alloc_one|pte_alloc|__pte_alloc_kernel|pte_alloc_one_kernel)$";
      @@
      
       fn(...
      -,  E2
       )
      
      @pte_alloc_macro depends on patch exists@
      identifier fn =~
      "^(__pte_alloc|pte_alloc_one|pte_alloc|__pte_alloc_kernel|pte_alloc_one_kernel)$";
      identifier a, b, c;
      expression e;
      position p;
      @@
      
      (
      - #define fn(a, b, c) e
      + #define fn(a, b) e
      |
      - #define fn(a, b) e
      + #define fn(a) e
      )
      
      Link: http://lkml.kernel.org/r/20181108181201.88826-2-joelaf@google.com
      
      
      Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
      Suggested-by: Kirill A. Shutemov <kirill@shutemov.name>
      Acked-by: Kirill A. Shutemov <kirill@shutemov.name>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Julia Lawall <Julia.Lawall@lip6.fr>
      Cc: Kirill A. Shutemov <kirill@shutemov.name>
      Cc: William Kucharski <william.kucharski@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      4cf58924
    • fls: change parameter to unsigned int · 3fc2579e
      Matthew Wilcox authored
      When testing in userspace, UBSAN pointed out that shifting into the sign
      bit is undefined behaviour.  It doesn't really make sense to ask for the
      highest set bit of a negative value, so just turn the argument type into
      an unsigned int.
      
      Some architectures (eg ppc) already had it declared as an unsigned int,
      so I don't expect too many problems.
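
As an illustration of why the unsigned type matters, here is a sketch modeled on a generic binary-search "find last set" routine. The name `fls_sketch` is hypothetical and the code assumes a 32-bit `unsigned int`; with a signed argument, the left shifts below could move a bit into the sign position, which is the undefined behaviour UBSAN reported.

```c
/* Find the most significant set bit, counting from 1; returns 0 when
 * x is 0. Assumes unsigned int is 32 bits wide. Because x is unsigned,
 * every left shift below is well defined. */
static int fls_sketch(unsigned int x)
{
	int r = 32;

	if (!x)
		return 0;
	if (!(x & 0xffff0000u)) { x <<= 16; r -= 16; }
	if (!(x & 0xff000000u)) { x <<= 8;  r -= 8;  }
	if (!(x & 0xf0000000u)) { x <<= 4;  r -= 4;  }
	if (!(x & 0xc0000000u)) { x <<= 2;  r -= 2;  }
	if (!(x & 0x80000000u)) { r -= 1; }
	return r;
}
```

For example, `fls_sketch(1)` yields 1 and `fls_sketch(0x80000000u)` yields 32.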
      
      Link: http://lkml.kernel.org/r/20181105221117.31828-1-willy@infradead.org
      
      
      Signed-off-by: Matthew Wilcox <willy@infradead.org>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: <linux-arch@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      3fc2579e
    • Remove 'type' argument from access_ok() function · 96d4f267
      Linus Torvalds authored
      
      
      Nobody has actually used the type (VERIFY_READ vs VERIFY_WRITE) argument
      of the user address range verification function since we got rid of the
      old racy i386-only code to walk page tables by hand.
      
      It existed because the original 80386 would not honor the write protect
      bit when in kernel mode, so you had to do COW by hand before doing any
      user access.  But we haven't supported that in a long time, and these
      days the 'type' argument is a purely historical artifact.
      
      A discussion about extending 'user_access_begin()' to do the range
      checking resulted in this patch, because there is no way we're going to
      move the old VERIFY_xyz interface to that model.  And it's best done at
      the end of the merge window when I've done most of my merges, so let's
      just get this done once and for all.
      
      This patch was mostly done with a sed-script, with manual fix-ups for
      the cases that weren't of the trivial 'access_ok(VERIFY_xyz' form.
      
      There were a couple of notable cases:
      
       - csky still had the old "verify_area()" name as an alias.
      
       - the iter_iov code had magical hardcoded knowledge of the actual
         values of VERIFY_{READ,WRITE} (not that they mattered, since nothing
         really used it)
      
       - microblaze used the type argument for a debug printout
      
      but other than those oddities this should be a total no-op patch.
      
      I tried to fix up all architectures, did fairly extensive grepping for
      access_ok() uses, and the changes are trivial, but I may have missed
      something.  Any missed conversion should be trivially fixable, though.
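
To show why no read/write 'type' is needed for the check itself, here is a user-space model of an address range check. It is only a sketch and not the kernel's implementation: the name `range_ok` is hypothetical, and `limit` stands in for whatever user address limit an architecture applies.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: the range [addr, addr + size) must lie at or below
 * 'limit' without wrapping around. The decision depends only on the
 * address and the size, not on whether the access is a read or a write. */
static bool range_ok(uintptr_t addr, size_t size, uintptr_t limit)
{
	/* Written so that addr + size is never computed and cannot overflow. */
	return size <= limit && addr <= limit - size;
}
```

Under this model a call site only needs to supply a pointer and a length, which matches the two-argument form that the conversion leaves behind.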
      
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      96d4f267
  7. Dec 10, 2018
    • parisc: generate uapi header and system call table files · 575afc4d
      Firoz Khan authored
      
      
      The system call table generation script must be run to generate
      the unistd_32/64.h and syscall_table_32/64/c32.h files. This patch
      adds the changes that invoke the script.

      The script is invoked by parisc/Makefile and generates the
      unistd_32/64.h and syscall_table_32/64/c32.h files; the generated
      files must be identical to the files they replace.

      The generated uapi header file is included by uapi/asm/unistd.h,
      and the generated system call table header file is included by
      kernel/syscall.S.
      
      Signed-off-by: Firoz Khan <firoz.khan@linaro.org>
      Acked-by: Helge Deller <deller@gmx.de>
      Signed-off-by: Helge Deller <deller@gmx.de>
      575afc4d
    • parisc: remove __NR_Linux from uapi header file. · 28ff62a4
      Firoz Khan authored
      
      
      __NR_Linux was defined as 0 to support HP-UX syscalls, acting as
      an offset for the other system calls. Support for HP-UX is gone,
      so there is no longer any need to define __NR_Linux as 0.

      One of the patches in this series generates the uapi header file,
      and the generator supports offset logic. Since the offset value
      __NR_Linux is defined as 0, however, it has no effect, so remove
      the __NR_Linux offset from the uapi header file.
      
      Signed-off-by: Firoz Khan <firoz.khan@linaro.org>
      Acked-by: Helge Deller <deller@gmx.de>
      Signed-off-by: Helge Deller <deller@gmx.de>
      28ff62a4
    • parisc: add __NR_syscalls along with __NR_Linux_syscalls · dbf91a54
      Firoz Khan authored
      
      
      The __NR_Linux_syscalls macro holds the number of system calls
      that exist on the parisc architecture. Its value has to be changed
      whenever a system call is added or deleted.

      One of the patches in this series adds a script that generates a
      uapi header from the syscall.tbl file. The syscall.tbl file
      contains the information needed to determine the total number of
      system calls, so there are two options for updating
      __NR_Linux_syscalls:

      1. Update __NR_Linux_syscalls in asm/unistd.h manually by counting
         the number of system calls. It only needs to be updated when a
         system call is added or deleted.

      2. Keep this feature in the above-mentioned script, which counts
         the number of syscalls and stores it in a generated file. In
         this case __NR_Linux_syscalls in asm/unistd.h does not need to
         be updated explicitly.

      The second option is the recommended one. For that, I added the
      __NR_syscalls macro in uapi/asm/unistd.h alongside
      __NR_Linux_syscalls in asm/unistd.h. The __NR_syscalls macro also
      makes the naming convention the same across all architectures.
      While __NR_syscalls isn't strictly part of the uapi, having it as
      part of the generated header simplifies the implementation. The
      macro also needs to be enclosed in #ifdef __KERNEL__ to avoid
      side effects.
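
The counting step of the second option can be sketched as follows. This is not the script from the series: the helper name `count_syscalls` is hypothetical, the table layout (a syscall number in the first column, '#' for comments) is an assumption, and the result is taken to be the highest number plus one, which equals the count when the numbers are dense from 0.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: scan syscall.tbl-style text and return the value a
 * generator could emit as __NR_syscalls (highest syscall number + 1).
 * Empty lines and lines starting with '#' are skipped. */
static int count_syscalls(const char *tbl)
{
	int max_nr = -1;

	while (*tbl) {
		const char *eol = strchr(tbl, '\n');
		const char *next = eol ? eol + 1 : tbl + strlen(tbl);
		size_t len = eol ? (size_t)(eol - tbl) : strlen(tbl);
		char line[256];
		int nr;

		/* Copy one line into a bounded, NUL-terminated buffer. */
		if (len >= sizeof(line))
			len = sizeof(line) - 1;
		memcpy(line, tbl, len);
		line[len] = '\0';

		/* The first column is assumed to hold the syscall number. */
		if (line[0] != '#' && sscanf(line, "%d", &nr) == 1 && nr > max_nr)
			max_nr = nr;

		tbl = next;
	}
	return max_nr + 1;
}
```

A generator built this way would write the result into the generated header, so adding a line to the table updates the total without any manual edit.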
      
      Signed-off-by: Firoz Khan <firoz.khan@linaro.org>
      Acked-by: Helge Deller <deller@gmx.de>
      Signed-off-by: Helge Deller <deller@gmx.de>
      dbf91a54
    • parisc: move __IGNORE* entries to non uapi header · dfddd1a8
      Firoz Khan authored
      
      
      All the __IGNORE* entries reside in the uapi header file. Move
      them to the non-uapi header asm/unistd.h, as they are not used by
      any user space applications.

      It is correct to keep the __IGNORE* entries in the non-uapi header
      asm/unistd.h, while uapi/asm/unistd.h must hold only information
      useful for user space applications.

      One of the patches in this series generates the uapi header file.
      Only information that is used directly by user space applications
      must be present in the uapi file.
      
      Signed-off-by: Firoz Khan <firoz.khan@linaro.org>
      Acked-by: Helge Deller <deller@gmx.de>
      Signed-off-by: Helge Deller <deller@gmx.de>
      dfddd1a8
    • parisc: Split out alternative live patching code · 8cc28269
      Helge Deller authored
      
      
      Move the alternative implementation code to alternative.c and add code to
      patch modules while loading.
      
      Signed-off-by: Helge Deller <deller@gmx.de>
      8cc28269
  8. Nov 06, 2018
    • parisc: Revert "Release spinlocks using ordered store" · 86d4d068
      John David Anglin authored
      
      
      This reverts commit d27dfa13.
      
      Unfortunately, this patch needs to be reverted.  We need the full sync
      barrier and not the limited barrier provided by using an ordered store.
      The sync ensures that all accesses and cache purge instructions that
      follow the sync are performed after all such instructions prior to the sync
      instruction have completed executing.
      
      The patch breaks the rwlock implementation in glibc.  This caused the
      test-lock application in the libprelude testsuite to hang.  With the
      change reverted, the test runs correctly and the libprelude package
      builds successfully.
      
      Signed-off-by: John David Anglin <dave.anglin@bell.net>
      Signed-off-by: Helge Deller <deller@gmx.de>
      86d4d068
  9. Nov 02, 2018
  10. Oct 31, 2018
  11. Oct 26, 2018