  1. Apr 10, 2019
  2. Apr 08, 2019
  3. Apr 07, 2019
  4. Apr 01, 2019
  5. Mar 29, 2019
  6. Mar 26, 2019
  7. Mar 25, 2019
    • powerpc/64: Fix memcmp reading past the end of src/dest · d9470757
      Michael Ellerman authored
      
      
      Chandan reported that fstests' generic/026 test hit a crash:
      
        BUG: Unable to handle kernel data access at 0xc00000062ac40000
        Faulting instruction address: 0xc000000000092240
        Oops: Kernel access of bad area, sig: 11 [#1]
        LE SMP NR_CPUS=2048 DEBUG_PAGEALLOC NUMA pSeries
        CPU: 0 PID: 27828 Comm: chacl Not tainted 5.0.0-rc2-next-20190115-00001-g6de6dba64dda #1
        NIP:  c000000000092240 LR: c00000000066a55c CTR: 0000000000000000
        REGS: c00000062c0c3430 TRAP: 0300   Not tainted  (5.0.0-rc2-next-20190115-00001-g6de6dba64dda)
        MSR:  8000000002009033 <SF,VEC,EE,ME,IR,DR,RI,LE>  CR: 44000842  XER: 20000000
        CFAR: 00007fff7f3108ac DAR: c00000062ac40000 DSISR: 40000000 IRQMASK: 0
        GPR00: 0000000000000000 c00000062c0c36c0 c0000000017f4c00 c00000000121a660
        GPR04: c00000062ac3fff9 0000000000000004 0000000000000020 00000000275b19c4
        GPR08: 000000000000000c 46494c4500000000 5347495f41434c5f c0000000026073a0
        GPR12: 0000000000000000 c0000000027a0000 0000000000000000 0000000000000000
        GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
        GPR20: c00000062ea70020 c00000062c0c38d0 0000000000000002 0000000000000002
        GPR24: c00000062ac3ffe8 00000000275b19c4 0000000000000001 c00000062ac30000
        GPR28: c00000062c0c38d0 c00000062ac30050 c00000062ac30058 0000000000000000
        NIP memcmp+0x120/0x690
        LR  xfs_attr3_leaf_lookup_int+0x53c/0x5b0
        Call Trace:
          xfs_attr3_leaf_lookup_int+0x78/0x5b0 (unreliable)
          xfs_da3_node_lookup_int+0x32c/0x5a0
          xfs_attr_node_addname+0x170/0x6b0
          xfs_attr_set+0x2ac/0x340
          __xfs_set_acl+0xf0/0x230
          xfs_set_acl+0xd0/0x160
          set_posix_acl+0xc0/0x130
          posix_acl_xattr_set+0x68/0x110
          __vfs_setxattr+0xa4/0x110
          __vfs_setxattr_noperm+0xac/0x240
          vfs_setxattr+0x128/0x130
          setxattr+0x248/0x600
          path_setxattr+0x108/0x120
          sys_setxattr+0x28/0x40
          system_call+0x5c/0x70
        Instruction dump:
        7d201c28 7d402428 7c295040 38630008 38840008 408201f0 4200ffe8 2c050000
        4182ff6c 20c50008 54c61838 7d201c28 <7d402428> 7d293436 7d4a3436 7c295040
      
      The instruction dump decodes as:
        subfic  r6,r5,8
        rlwinm  r6,r6,3,0,28
        ldbrx   r9,0,r3
        ldbrx   r10,0,r4      <-
      
      Which shows us doing an 8 byte load from c00000062ac3fff9, which
      crosses the page boundary at c00000062ac40000 and faults.
      
      It's not OK for memcmp to read past the end of the source or
      destination buffers if that would cross a page boundary, because we
      don't know that the next page is mapped.
      
      As pointed out by Segher, we can read past the end of the source or
      destination as long as we don't cross a 4K boundary, because that's
      our minimum page size on all platforms.
      
      The bug is in the code at the .Lcmp_rest_lt8bytes label. When we get
      there we know that s1 is 8-byte aligned and we have at least 1 byte to
      read, so a single 8-byte load won't read past the end of s1 and cross
      a page boundary.
      
      But we have to be more careful with s2. So check if it's within 8
      bytes of a 4K boundary and if so go to the byte-by-byte loop.
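
      The boundary condition described above can be sketched in C. This is
      an illustrative helper only: the name load8_crosses_4k is hypothetical,
      the actual fix lives in the powerpc assembly, and the sketch assumes a
      4K minimum page size.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not the kernel's code: report whether an 8-byte
 * load starting at addr would touch bytes in the next 4K page.
 */
static bool load8_crosses_4k(uint64_t addr)
{
    /* Offset of the first loaded byte within its 4K page. */
    uint64_t offset = addr & 0xfff;

    /* A load at offset 0xff8 ends exactly at 0xfff; anything later spills over. */
    return offset > 0xff8;
}
```

      For the faulting address in the oops, c00000062ac3fff9, the page offset
      is 0xff9, so the check reports that the load would spill into the next
      page. A caller would then fall back to a byte-by-byte comparison.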
      
      Fixes: 2d9ee327 ("powerpc/64: Align bytes before fall back to .Lshort in powerpc64 memcmp()")
      Cc: stable@vger.kernel.org # v4.19+
      Reported-by: Chandan Rajendra <chandan@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org>
      Tested-by: Chandan Rajendra <chandan@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  8. Mar 23, 2019
    • x86/gart: Exclude GART aperture from kcore · ffc8599a
      Kairui Song authored
      
      
      On machines where the GART aperture is mapped over physical RAM,
      /proc/kcore contains the GART aperture range. Accessing the GART range via
      /proc/kcore results in a kernel crash.
      
      vmcore used to have the same issue, until it was fixed with commit
      2a3e83c6 ("x86/gart: Exclude GART aperture from vmcore"), which leverages
      the existing hook infrastructure in vmcore to make /proc/vmcore return
      zeroes when the aperture region is read, so that it never reads from the
      actual memory.
      
      Apply the same workaround for kcore. First implement the same hook
      infrastructure for kcore, then reuse the hook functions introduced in the
      previous vmcore fix, with minor adjustments: rename some functions for
      more general usage, and simplify the hook infrastructure a bit as there
      is no module usage yet.
      
      Suggested-by: Baoquan He <bhe@redhat.com>
      Signed-off-by: Kairui Song <kasong@redhat.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Jiri Bohac <jbohac@suse.cz>
      Acked-by: Baoquan He <bhe@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Omar Sandoval <osandov@fb.com>
      Cc: Dave Young <dyoung@redhat.com>
      Link: https://lkml.kernel.org/r/20190308030508.13548-1-kasong@redhat.com
      
  9. Mar 22, 2019
  10. Mar 21, 2019
  11. Mar 20, 2019
  12. Mar 19, 2019
  13. Mar 18, 2019
    • powerpc/6xx: fix setup and use of SPRN_SPRG_PGDIR for hash32 · 4622a2d4
      Christophe Leroy authored
      
      
      Not only the 603 but all 6xx need SPRN_SPRG_PGDIR to be initialised at
      startup. This patch moves the initialisation from __setup_cpu_603() to
      start_here() and __secondary_start(), close to the initialisation of
      SPRN_THREAD.
      
      Previously, the virtual address of PGDIR was retrieved from the thread
      struct. Now that the physical address is stored in SPRN_SPRG_PGDIR,
      hash_page() must no longer convert it to a physical address.
      This patch removes the conversion.
      
      Fixes: 93c4a162 ("powerpc/6xx: Store PGDIR physical address in a SPRG")
      Reported-by: Guenter Roeck <linux@roeck-us.net>
      Tested-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/vdso64: Fix CLOCK_MONOTONIC inconsistencies across Y2038 · b5b4453e
      Michael Ellerman authored
      Jakub Drnec reported:
        Setting the realtime clock can sometimes make the monotonic clock go
        back by over a hundred years. Decreasing the realtime clock across
        the y2k38 threshold is one reliable way to reproduce. Allegedly this
        can also happen just by running ntpd, I have not managed to
        reproduce that other than booting with rtc at >2038 and then running
        ntp. When this happens, anything with timers (e.g. openjdk) breaks
        rather badly.
      
      And included a test case (slightly edited for brevity):
        #define _POSIX_C_SOURCE 199309L
        #include <stdio.h>
        #include <time.h>
        #include <stdlib.h>
        #include <unistd.h>
      
        long get_time(void) {
          struct timespec tp;
          clock_gettime(CLOCK_MONOTONIC, &tp);
          return tp.tv_sec + tp.tv_nsec / 1000000000;
        }
      
        int main(void) {
          long last = get_time();
          while(1) {
            long now = get_time();
            if (now < last) {
              printf("clock went backwards by %ld seconds!\n", last - now);
            }
            last = now;
            sleep(1);
          }
          return 0;
        }
      
      Which when run concurrently with:
       # date -s 2040-1-1
       # date -s 2037-1-1
      
      Will detect the clock going backward.
      
      The root cause is that wtom_clock_sec in struct vdso_data is only a
      32-bit signed value, even though we set its value to be equal to
      tk->wall_to_monotonic.tv_sec, which is 64 bits.
      
      Because the monotonic clock starts at zero when the system boots, the
      wall_to_monotonic.tv_sec offset is negative for current and future
      dates. Currently on a freshly booted system the offset will be in the
      vicinity of negative 1.5 billion seconds.
      
      However, if the wall clock is set past the Y2038 boundary, the offset
      from wall to monotonic becomes less than negative 2^31, and no longer
      fits in 32 bits. When that value is assigned to wtom_clock_sec it is
      truncated and becomes positive, causing the VDSO assembly code to
      calculate CLOCK_MONOTONIC incorrectly.
      
      That causes CLOCK_MONOTONIC to jump ahead by ~4 billion seconds which
      it is not meant to do. Worse, if the time is then set back before the
      Y2038 boundary CLOCK_MONOTONIC will jump backward.
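
      The truncation can be reproduced with plain integer arithmetic. The
      sketch below is not the VDSO code: truncate_to_s32 is a hypothetical
      helper that mimics storing a 64-bit offset in a 32-bit signed field and
      reading it back, and the sample offsets are illustrative values.

```c
#include <stdint.h>

/*
 * Hypothetical helper: keep only the low 32 bits of a 64-bit value and
 * interpret them as a signed 32-bit quantity, as a 32-bit field would.
 */
static int64_t truncate_to_s32(int64_t value)
{
    /* Conversion to an unsigned type is well-defined modulo 2^32. */
    uint32_t low = (uint32_t)value;

    /* Sign-extend by hand to avoid implementation-defined conversions. */
    if (low & 0x80000000u)
        return (int64_t)low - INT64_C(4294967296);
    return (int64_t)low;
}
```

      An offset of -1500000000 survives unchanged, while -2200000000 (below
      negative 2^31) comes back as 2094967296, exactly 2^32 seconds too
      large. That matches the jump of roughly 4 billion seconds described
      above.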
      
      We can fix it simply by storing the full 64-bit offset in the
      vdso_data, and using that in the VDSO assembly code. We also shuffle
      some of the fields in vdso_data to avoid creating a hole.
      
      The original commit that added the CLOCK_MONOTONIC support to the VDSO
      did actually use a 64-bit value for wtom_clock_sec, see commit
      a7f290da ("[PATCH] powerpc: Merge vdso's and add vdso support to
      32 bits kernel") (Nov 2005). However just 3 days later it was
      converted to 32-bits in commit 0c37ec2a ("[PATCH] powerpc: vdso
      fixes (take #2)"), and the bug has existed since then AFAICS.
      
      Fixes: 0c37ec2a ("[PATCH] powerpc: vdso fixes (take #2)")
      Cc: stable@vger.kernel.org # v2.6.15+
      Link: http://lkml.kernel.org/r/HaC.ZfES.62bwlnvAvMP.1STMMj@seznam.cz
      
      
      Reported-by: Jakub Drnec <jaydee@email.cz>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  14. Mar 17, 2019
  15. Mar 15, 2019
  16. Mar 14, 2019
    • MIPS: Remove custom MIPS32 __kernel_fsid_t type · f6cab793
      Paul Burton authored
      For MIPS32 kernels we have a custom definition of __kernel_fsid_t. This
      differs from the asm-generic version used by all other architectures &
      MIPS64 in one way - it declares the val field as an array of long,
      rather than an array of int. Since int & long have identical size &
      alignment when targeting MIPS32 anyway, this makes little sense.
      
      Beyond the pointlessness, this causes problems for code which prints
      entries from the val array, for example the fanotify_encode_fid()
      function [1]. If such code uses a format specifier suited to an int then
      it encounters compiler warnings when building for MIPS32, such as:
      
        In file included from include/linux/kernel.h:14:0,
                         from include/linux/list.h:9,
                         from include/linux/preempt.h:11,
                         from include/linux/spinlock.h:51,
                         from include/linux/fdtable.h:11,
                         from fs/notify/fanotify/fanotify.c:3:
        fs/notify/fanotify/fanotify.c: In function 'fanotify_encode_fid':
        include/linux/kern_levels.h:5:18: warning: format '%x' expects argument
          of type 'unsigned int', but argument 2 has type 'long int' [-Wformat=]
      
      Remove the custom __kernel_fsid_t definition & make use of the
      asm-generic version which will have an identical layout in memory
      anyway, in order to remove the inconsistency with other architectures.
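
      As a rough userspace illustration of the format-specifier issue (not
      the fanotify code itself), the sketch below uses a hypothetical
      fsid_int_t type with the int-array layout described above and a
      hypothetical format_fsid() helper. With val declared as an array of
      long instead, passing the entries to %x without a cast is what
      triggers the -Wformat warning.

```c
#include <stdio.h>

/* Hypothetical stand-in for the int-array layout; not the kernel typedef. */
typedef struct {
    int val[2];
} fsid_int_t;

/* Format both entries as hex; %x expects unsigned int, hence the casts. */
static int format_fsid(char *buf, size_t len, const fsid_int_t *fsid)
{
    return snprintf(buf, len, "%x:%x",
                    (unsigned int)fsid->val[0],
                    (unsigned int)fsid->val[1]);
}
```

      Because the entries are int, the same format string builds without
      warnings on every architecture that uses this layout.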
      
      One possible regression this could cause is if any code is attempting to
      print entries from the val array with a long-sized format specifier, in
      which case it would begin seeing compiler warnings when built against
      kernel headers including this change. Since such code is exceedingly
      rare, and would have to be MIPS32-specific to expect a long, this seems
      to be a problem that it's extremely unlikely anyone will encounter.
      
      [1] https://lore.kernel.org/linux-mips/CAOQ4uxiEkczB7PNCXegFC-eYb9zAGaio_o=OgHAJHFd7eavBxA@mail.gmail.com/T/#mb43103277c79ef06b884359209e817db1c136140
      
      
      
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: Amir Goldstein <amir73il@gmail.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Jan Kara <jack@suse.cz>
      Cc: linux-arch@vger.kernel.org
      Cc: linux-mips@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org