Skip to content
Snippets Groups Projects
  1. Aug 18, 2017
  2. Aug 16, 2017
  3. Aug 09, 2017
  4. Aug 08, 2017
    • Gautham R. Shenoy's avatar
      powerpc/powernv/idle: Disable LOSE_FULL_CONTEXT states when stop-api fails · 785a12af
      Gautham R. Shenoy authored
      
      Currently, we use the opal call opal_slw_set_reg() to inform the
      Sleep-Winkle Engine (SLW) to restore the contents of some of the
      Hypervisor state on wakeup from deep idle states that lose full
      hypervisor context (characterized by the flag
      OPAL_PM_LOSE_FULL_CONTEXT).
      
      However, the current code has a bug in that if opal_slw_set_reg()
      fails, we don't disable the use of these deep states (winkle on
      POWER8, stop4 onwards on POWER9).
      
      This patch fixes this bug by ensuring that if programing the
      sleep-winkle engine to restore the hypervisor states in
      pnv_save_sprs_for_deep_states() fails, then we exclude such states by
      clearing the OPAL_PM_LOSE_FULL_CONTEXT flag from
      supported_cpuidle_states. As a result POWER8 will be prevented from
      using winkle for CPU-Hotplug, and POWER9 will put the offlined CPUs to
      the default stop state when available.
      
      Further, we ensure in the initialization of the cpuidle-powernv driver
      to only include those states whose flags are present in
      supported_cpuidle_states, thereby skipping OPAL_PM_LOSE_FULL_CONTEXT
      states when they have been disabled due to stop-api failure.
      
      Fixes: 1e1601b3 ("powerpc/powernv/idle: Restore SPRs for deep idle
      states via stop API.")
      
      Signed-off-by: default avatarGautham R. Shenoy <ego@linux.vnet.ibm.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      785a12af
  5. Aug 07, 2017
    • Michael Ellerman's avatar
      Revert "powerpc/64: Avoid restore_math call if possible in syscall exit" · 44a12806
      Michael Ellerman authored
      
      This reverts commit bc4f65e4.
      
      As reported by Andreas, this commit is causing unrecoverable SLB misses in the
      system call exit path:
      
        Unrecoverable exception 4100 at c00000000000a1ec
        Oops: Unrecoverable exception, sig: 6 [#1]
        SMP NR_CPUS=2 PowerMac
        ...
        CPU: 0 PID: 18626 Comm: rm Not tainted 4.13.0-rc3 #1
        task: c00000018335e080 task.stack: c000000139e50000
        NIP: c00000000000a1ec LR: c00000000000a118 CTR: 0000000000000000
        REGS: c000000139e53bb0 TRAP: 4100   Not tainted  (4.13.0-rc3)
        MSR: 9000000000001030 <SF,HV,ME,IR,DR> CR: 24000044  XER: 20000000 SOFTE: 1
        GPR00: 0000000000000000 c000000139e53e30 c000000000abb500 fffffffffffffffe
        GPR04: c0000001eb866298 0000000000000000 0000000000000000 c00000018335e080
        GPR08: 900000000000d032 0000000000000000 0000000000000002 fffffffffffff001
        GPR12: c000000139e50000 c00000000ffff000 00003fffa8c0dca0 00003fffa8c0dc88
        GPR16: 0000000010000000 0000000000000001 00003fffa8c0eaa0 0000000000000000
        GPR20: 00003fffa8c27528 00003fffa8c27b00 0000000000000000 0000000000000000
        GPR24: 00003fffa8c0d918 00003ffff1b3efa0 00003fffa8c26d68 0000000000000000
        GPR28: 00003fffa8c249e8 00003fffa8c263d0 00003fffa8c27550 00003ffff1b3ef10
        NIP [c00000000000a1ec] system_call_exit+0xc0/0x21c
        LR [c00000000000a118] system_call+0x58/0x6c
        Call Trace:
        [c000000139e53e30] [c00000000000a118] system_call+0x58/0x6c (unreliable)
        Instruction dump:
        64a51000 7c6300d0 f8a101a0 4bffff9c 3c000000 60000006 780007c6 64000000
        60000000 7c004039 4082001c e8ed0170 <88070b78> 88c70b79 7c003214 2c200000
      
      This is caused by us trying to load THREAD_LOAD_FP with MSR_RI=0, and taking an
      SLB miss on the thread struct.
      
      Reported-by: default avatarAndreas Schwab <schwab@linux-m68k.org>
      Diagnosed-by: default avatarNicholas Piggin <npiggin@gmail.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      44a12806
  6. Aug 04, 2017
  7. Aug 02, 2017
  8. Jul 31, 2017
    • Nicholas Piggin's avatar
      powerpc/64s: Fix stack setup in watchdog soft_nmi_common() · cc491f1d
      Nicholas Piggin authored
      
      The watchdog soft-NMI exception stack setup loads a stack pointer
      twice, which is an obvious error. It ends up using the system reset
      interrupt (true-NMI) stack, which is also a bug because the watchdog
      could be preempted by a system reset interrupt that overwrites the
      NMI stack.
      
      Change the soft-NMI to use the "emergency stack". The current kernel
      stack is not used, because of the longer-term goal to prevent
      asynchronous stack access using soft-disable.
      
      Fixes: 2104180a ("powerpc/64s: implement arch-specific hardlockup watchdog")
      Signed-off-by: default avatarNicholas Piggin <npiggin@gmail.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      cc491f1d
  9. Jul 28, 2017
    • Alistair Popple's avatar
      powerpc/powernv/pci: Return failure for some uses of dma_set_mask() · 253fd51e
      Alistair Popple authored
      
      Commit 8e3f1b1d ("powerpc/powernv/pci: Enable 64-bit devices to access
      >4GB DMA space") introduced the ability for PCI device drivers to request a
      DMA mask between 64 and 32 bits and actually get a mask greater than
      32-bits. However currently if certain machine configuration dependent
      conditions are not meet the code silently falls back to a 32-bit mask.
      
      This makes it hard for device drivers to detect which mask they actually
      got. Instead we should return an error when the request could not be
      fulfilled which allows drivers to either fallback or implement other
      workarounds as documented in DMA-API-HOWTO.txt.
      
      Signed-off-by: default avatarAlistair Popple <alistair@popple.id.au>
      Acked-by: default avatarRussell Currey <ruscur@russell.cc>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      253fd51e
    • Michael Ellerman's avatar
      powerpc/boot: Fix 64-bit boot wrapper build with non-biarch compiler · 65c5ec11
      Michael Ellerman authored
      
      Historically the boot wrapper was always built 32-bit big endian, even
      for 64-bit kernels. That was because old firmwares didn't necessarily
      support booting a 64-bit image. Because of that arch/powerpc/boot/Makefile
      uses CROSS32CC for compilation.
      
      However when we added 64-bit little endian support, we also added
      support for building the boot wrapper 64-bit. However we kept using
      CROSS32CC, because in most cases it is just CC and everything works.
      
      However if the user doesn't specify CROSS32_COMPILE (which no one ever
      does AFAIK), and CC is *not* biarch (32/64-bit capable), then CROSS32CC
      becomes just "gcc". On native systems that is probably OK, but if we're
      cross building it definitely isn't, leading to eg:
      
        gcc ... -m64 -mlittle-endian -mabi=elfv2 ... arch/powerpc/boot/cpm-serial.c
        gcc: error: unrecognized argument in option ‘-mabi=elfv2’
        gcc: error: unrecognized command line option ‘-mlittle-endian’
        make: *** [zImage] Error 2
      
      To fix it, stop using CROSS32CC, because we may or may not be building
      32-bit. Instead setup a BOOTCC, which defaults to CC, and only use
      CROSS32_COMPILE if it's set and we're building for 32-bit.
      
      Fixes: 147c0516 ("powerpc/boot: Add support for 64bit little endian wrapper")
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Reviewed-by: default avatarCyril Bur <cyrilbur@gmail.com>
      65c5ec11
    • Michael Ellerman's avatar
      powerpc/smp: Call smp_ops->setup_cpu() directly on the boot CPU · 7b7622bb
      Michael Ellerman authored
      
      In smp_cpus_done() we need to call smp_ops->setup_cpu() for the boot
      CPU, which means it has to run *on* the boot CPU.
      
      In the past we ensured it ran on the boot CPU by changing the CPU
      affinity mask of current directly. That was removed in commit
      6d11b87d ("powerpc/smp: Replace open coded task affinity logic"),
      and replaced with a work queue call.
      
      Unfortunately using a work queue leads to a lockdep warning, now that
      the CPU hotplug lock is a regular semaphore:
      
        ======================================================
        WARNING: possible circular locking dependency detected
        ...
        kworker/0:1/971 is trying to acquire lock:
         (cpu_hotplug_lock.rw_sem){++++++}, at: [<c000000000100974>] apply_workqueue_attrs+0x34/0xa0
      
        but task is already holding lock:
         ((&wfc.work)){+.+.+.}, at: [<c0000000000fdb2c>] process_one_work+0x25c/0x800
        ...
             CPU0                    CPU1
             ----                    ----
        lock((&wfc.work));
                                     lock(cpu_hotplug_lock.rw_sem);
                                     lock((&wfc.work));
        lock(cpu_hotplug_lock.rw_sem);
      
      Although the deadlock can't happen in practice, because
      smp_cpus_done() only runs in early boot before CPU hotplug is allowed,
      lockdep can't tell that.
      
      Luckily in commit 8fb12156 ("init: Pin init task to the boot CPU,
      initially") tglx changed the generic code to pin init to the boot CPU
      to begin with. The unpinning of init from the boot CPU happens in
      sched_init_smp(), which is called after smp_cpus_done().
      
      So smp_cpus_done() is always called on the boot CPU, which means we
      don't need the work queue call at all - and the lockdep warning goes
      away.
      
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Reviewed-by: default avatarThomas Gleixner <tglx@linutronix.de>
      7b7622bb
    • Gustavo Romero's avatar
      powerpc/tm: Fix saving of TM SPRs in core dump · cd63f3cf
      Gustavo Romero authored
      
      Currently flush_tmregs_to_thread() does not save the TM SPRs (TFHAR,
      TFIAR, TEXASR) to the thread struct, unless the process is currently
      inside a suspended transaction.
      
      If the process is core dumping, and the TM SPRs have changed since the
      last time the process was context switched, then we will save stale
      values of the TM SPRs to the core dump.
      
      Fix it by saving the live register state to the thread struct in that
      case.
      
      Fixes: 08e1c01d ("powerpc/ptrace: Enable support for TM SPR state")
      Cc: stable@vger.kernel.org # v4.8+
      Signed-off-by: default avatarGustavo Romero <gromero@linux.vnet.ibm.com>
      Reviewed-by: default avatarCyril Bur <cyrilbur@gmail.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      cd63f3cf
    • Oliver O'Halloran's avatar
      powerpc/mm: Fix pmd/pte_devmap() on non-leaf entries · c9c98bc5
      Oliver O'Halloran authored
      
      The Radix MMU translation tree as defined in ISA v3.0 contains two
      different types of entry, directories and leaves. Leaves are
      identified by _PAGE_PTE being set.
      
      The formats of the two entries are different, with the directory
      entries containing no spare bits for use by software. In particular
      the bit we use for _PAGE_DEVMAP is not reserved for software, and is
      part of the NLB (Next Level Base) field, essentially the address of
      the next level in the tree.
      
      Note that the Linux pte_t is not == _PAGE_PTE. A huge page pmd
      entry (or devmap!) is also a leaf and so has _PAGE_PTE set, even
      though we use a pmd_t for it in Linux.
      
      The fix is to ensure that the pmd/pte_devmap() confirm they are
      looking at a leaf entry (_PAGE_PTE) as well as checking _PAGE_DEVMAP.
      
      Fixes: ebd31197 ("powerpc/mm: Add devmap support for ppc64")
      Signed-off-by: default avatarOliver O'Halloran <oohall@gmail.com>
      Tested-by: default avatarLaurent Vivier <lvivier@redhat.com>
      Tested-by: default avatarJose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
      Reviewed-by: default avatarSuraj Jitindar Singh <sjitindarsingh@gmail.com>
      Reviewed-by: default avatarAneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      [mpe: Add a comment in the code and flesh out change log]
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      c9c98bc5
  10. Jul 27, 2017
  11. Jul 26, 2017
    • Michael Ellerman's avatar
      powerpc/Makefile: Fix ld version check with 64-bit LE-only toolchain · b40b2386
      Michael Ellerman authored
      
      In commit efe0160c ("powerpc/64: Linker on-demand sfpr functions
      for modules"), we added an ld version check early in the powerpc
      top-level Makefile.
      
      Because the Makefile runs before the kernel config is setup, the
      checks for CONFIG_CPU_LITTLE_ENDIAN etc. all take the default case. So
      we end up configuring ld for 32-bit big endian.
      
      That would be OK, except that for historical (or perhaps no) reason,
      we use 'override LD' to add the endian flags to the LD variable
      itself, rather than the normal approach of adding them to LDFLAGS.
      
      The end result is that when we check the ld version we run it as:
      
        $(CROSS_COMPILE)ld -EB -m elf32ppc --version
      
      This often works, unless you are using a 64-bit only and/or little
      endian only, toolchain. In which case you see something like:
      
        $ make defconfig
        powerpc64le-linux-ld: unrecognised emulation mode: elf32ppc
        Supported emulations: elf64lppc elf32lppc elf32lppclinux elf32lppcsim
        /bin/sh: 1: [: -ge: unexpected operator
      
      The proper fix is to stop using 'override LD', but that will require a
      fair bit of testing. Instead we can fix it for now just by reordering
      the Makefile to do the version check earlier.
      
      Fixes: efe0160c ("powerpc/64: Linker on-demand sfpr functions for modules")
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      b40b2386
    • Laurent Vivier's avatar
      powerpc/pseries: Fix of_node_put() underflow during reconfig remove · 4fd1bd44
      Laurent Vivier authored
      
      As for commit 68baf692 ("powerpc/pseries: Fix of_node_put()
      underflow during DLPAR remove"), the call to of_node_put() must be
      removed from pSeries_reconfig_remove_node().
      
      dlpar_detach_node() and pSeries_reconfig_remove_node() both call
      of_detach_node(), and thus the node should not be released in both
      cases.
      
      Fixes: 0829f6d1 ("of: device_node kobject lifecycle fixes")
      Cc: stable@vger.kernel.org # v3.15+
      Signed-off-by: default avatarLaurent Vivier <lvivier@redhat.com>
      Reviewed-by: default avatarDavid Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      4fd1bd44
    • Benjamin Herrenschmidt's avatar
      powerpc/mm/radix: Workaround prefetch issue with KVM · a25bd72b
      Benjamin Herrenschmidt authored
      
      There's a somewhat architectural issue with Radix MMU and KVM.
      
      When coming out of a guest with AIL (Alternate Interrupt Location, ie,
      MMU enabled), we start executing hypervisor code with the PID register
      still containing whatever the guest has been using.
      
      The problem is that the CPU can (and will) then start prefetching or
      speculatively load from whatever host context has that same PID (if
      any), thus bringing translations for that context into the TLB, which
      Linux doesn't know about.
      
      This can cause stale translations and subsequent crashes.
      
      Fixing this in a way that is neither racy nor a huge performance
      impact is difficult. We could just make the host invalidations always
      use broadcast forms but that would hurt single threaded programs for
      example.
      
      We chose to fix it instead by partitioning the PID space between guest
      and host. This is possible because today Linux only use 19 out of the
      20 bits of PID space, so existing guests will work if we make the host
      use the top half of the 20 bits space.
      
      We additionally add support for a property to indicate to Linux the
      size of the PID register which will be useful if we eventually have
      processors with a larger PID space available.
      
      There is still an issue with malicious guests purposefully setting the
      PID register to a value in the hosts PID range. Hopefully future HW
      can prevent that, but in the meantime, we handle it with a pair of
      kludges:
      
       - On the way out of a guest, before we clear the current VCPU in the
         PACA, we check the PID and if it's outside of the permitted range
         we flush the TLB for that PID.
      
       - When context switching, if the mm is "new" on that CPU (the
         corresponding bit was set for the first time in the mm cpumask), we
         check if any sibling thread is in KVM (has a non-NULL VCPU pointer
         in the PACA). If that is the case, we also flush the PID for that
         CPU (core).
      
      This second part is needed to handle the case where a process is
      migrated (or starts a new pthread) on a sibling thread of the CPU
      coming out of KVM, as there's a window where stale translations can
      exist before we detect it and flush them out.
      
      A future optimization could be added by keeping track of whether the
      PID has ever been used and avoid doing that for completely fresh PIDs.
      We could similarily mark PIDs that have been the subject of a global
      invalidation as "fresh". But for now this will do.
      
      Signed-off-by: default avatarBenjamin Herrenschmidt <benh@kernel.crashing.org>
      [mpe: Rework the asm to build with CONFIG_PPC_RADIX_MMU=n, drop
            unneeded include of kvm_book3s_asm.h]
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      a25bd72b
  12. Jul 18, 2017
  13. Jul 17, 2017
    • Michael Ellerman's avatar
      powerpc/powernv: Fix boot on Power8 bare metal due to opal_configure_cores() · a70b487b
      Michael Ellerman authored
      
      In commit 1c0eaf0f ("powerpc/powernv: Tell OPAL about our MMU mode
      on POWER9"), we added additional flags to the OPAL call to configure
      CPUs at boot.
      
      These flags only work on Power9 firmwares, and worse can cause boot
      failures on Power8 machines, so we check for CPU_FTR_ARCH_300 (aka POWER9)
      before adding the extra flags.
      
      Unfortunately we forgot that opal_configure_cores() is called before
      the CPU feature checks are dynamically patched, meaning the check
      always returns true.
      
      We definitely need to do something to make the CPU feature checks less
      prone to bugs like this, but for now the minimal fix is to use
      early_cpu_has_feature().
      
      Reported-and-tested-by: default avatarAbdul Haleem <abdhalee@linux.vnet.ibm.com>
      Fixes: 1c0eaf0f ("powerpc/powernv: Tell OPAL about our MMU mode on POWER9")
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      a70b487b
  14. Jul 12, 2017
Loading