  1. Nov 01, 2017
    • KVM: PPC: Book3S HV: Unify dirty page map between HPT and radix · e641a317
      Paul Mackerras authored

      Currently, the HPT code in HV KVM maintains a dirty bit per guest page
      in the rmap array, whether or not dirty page tracking has been enabled
      for the memory slot.  In contrast, the radix code maintains a dirty
      bit per guest page in memslot->dirty_bitmap, and only does so when
      dirty page tracking has been enabled.
      
      This changes the HPT code to maintain the dirty bits in the memslot
      dirty_bitmap like radix does.  This results in slightly less code
      overall, and will mean that we do not lose the dirty bits when
      transitioning between HPT and radix mode in future.
      
      There is one minor change to behaviour as a result.  With HPT, when
      dirty tracking was enabled for a memslot, we would previously clear
      all the dirty bits at that point (both in the HPT entries and in the
      rmap arrays), meaning that a KVM_GET_DIRTY_LOG ioctl immediately
      following would show no pages as dirty (assuming no vcpus have run
      in the meantime).  With this change, the dirty bits on HPT entries
      are not cleared at the point where dirty tracking is enabled, so
      KVM_GET_DIRTY_LOG would show as dirty any guest pages that are
      resident in the HPT and dirty.  This is consistent with what happens
      on radix.
      
      This also fixes a bug in the mark_pages_dirty() function for radix
      (in the sense that the function no longer exists).  In the case where
      a large page of 64 normal pages or more is marked dirty, the
      addressing of the dirty bitmap was incorrect and could write past
      the end of the bitmap.  Fortunately this case was never hit in
      practice because a 2MB large page is only 32 x 64kB pages, and we
      don't support backing the guest with 1GB huge pages at this point.
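
As an illustration of the addressing issue described above (not the kernel's code), here is a minimal standalone sketch of marking a run of guest pages dirty in a word-based bitmap. The helper name `mark_range_dirty`, the 64-bit word size and the bounds parameter `nbits` are assumptions made for this example. The word index is derived from the bit number for every page, so a run of 64 or more pages crosses word boundaries correctly and never writes past the end of the bitmap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BITS_PER_WORD 64u   /* assumption: one bitmap word holds 64 page bits */

/*
 * Set `npages` consecutive dirty bits starting at bit `first`.
 * `nbits` is the number of valid bits in the bitmap; anything at or
 * beyond it is ignored rather than written.
 */
static void mark_range_dirty(uint64_t *bitmap, size_t nbits,
                             size_t first, size_t npages)
{
    assert(bitmap != NULL);

    for (size_t i = 0; i < npages; i++) {
        size_t bit = first + i;

        if (bit >= nbits)
            break;  /* never touch memory past the end of the bitmap */

        /* Word index and bit offset are both derived from the bit number. */
        bitmap[bit / BITS_PER_WORD] |= (uint64_t)1 << (bit % BITS_PER_WORD);
    }
}
```

With a three-word bitmap of which 128 bits are valid, `mark_range_dirty(bm, 128, 96, 64)` sets only bits 96 to 127 and leaves the third word untouched.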
      
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
    • KVM: PPC: Book3S HV: Rename hpte_setup_done to mmu_ready · 1b151ce4
      Paul Mackerras authored

      This renames the kvm->arch.hpte_setup_done field to mmu_ready because
      we will want to use it for radix guests too -- both for setting things
      up before vcpu execution, and for excluding vcpus from executing while
      MMU-related things get changed, such as in future switching the MMU
      from radix to HPT mode or vice-versa.
      
      This also moves the call to kvmppc_setup_partition_table() that was
      done in kvmppc_hv_setup_htab_rma() for HPT guests, and the setting
      of mmu_ready, into the caller in kvmppc_vcpu_run_hv().
      
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
    • KVM: PPC: Book3S HV: Don't rely on host's page size information · 8dc6cca5
      Paul Mackerras authored

      This removes the dependence of KVM on the mmu_psize_defs array (which
      stores information about hardware support for various page sizes) and
      the things derived from it, chiefly hpte_page_sizes[], hpte_page_size(),
      hpte_actual_page_size() and get_sllp_encoding().  We also no longer
      rely on the mmu_slb_size variable or the MMU_FTR_1T_SEGMENTS feature
      bit.
      
      The reason for doing this is so we can support a HPT guest on a radix
      host.  In a radix host, the mmu_psize_defs array contains information
      about page sizes supported by the MMU in radix mode rather than the
      page sizes supported by the MMU in HPT mode.  Similarly, mmu_slb_size
      and the MMU_FTR_1T_SEGMENTS bit are not set.
      
      Instead we hard-code knowledge of the behaviour of the HPT MMU in the
      POWER7, POWER8 and POWER9 processors (which are the only processors
      supported by HV KVM) - specifically the encoding of the LP fields in
      the HPT and SLB entries, and the fact that they have 32 SLB entries
      and support 1TB segments.
      
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
    • KVM: PPC: Book3S: Fix gas warning due to using r0 as immediate 0 · 93897a1f
      Nicholas Piggin authored

      This fixes the message:
      
      arch/powerpc/kvm/book3s_segment.S: Assembler messages:
      arch/powerpc/kvm/book3s_segment.S:330: Warning: invalid register expression
      
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
    • KVM: PPC: Book3S PR: Only install valid SLBs during KVM_SET_SREGS · f4093ee9
      Greg Kurz authored

      Userland passes an array of 64 SLB descriptors to KVM_SET_SREGS,
      some of which are valid (ie, SLB_ESID_V is set) and the rest are
      likely all-zeroes (with QEMU at least).
      
      Each of them is then passed to kvmppc_mmu_book3s_64_slbmte(), which
      expects to find the SLB index in the 3 lower bits of its rb argument.
      When passed zeroed arguments, it happily overwrites the 0th SLB entry
      with zeroes. This is exactly what happens while doing live migration
      with QEMU when the destination pushes the incoming SLB descriptors to
      KVM PR. When reloading the SLBs at the next synchronization, QEMU first
      clears its SLB array and only restores valid ones, but the 0th one is
      now gone and we cannot access the corresponding memory anymore:
      
      (qemu) x/x $pc
      c0000000000b742c: Cannot access memory
      
      To avoid this, let's filter out non-valid SLB entries. While here, we
      also force a full SLB flush before installing new entries. Since SLB
      is for 64-bit only, we now build this path conditionally to avoid a
      build break on 32-bit, which doesn't define SLB_ESID_V.
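
The failure mode and the fix can be modelled in a few lines of standalone C. This is a simplified sketch, not the KVM PR code: `model_slbmte`, `MODEL_ESID_V` (a stand-in for SLB_ESID_V), the index mask and the two `set_sregs_*` helpers are all assumptions made for the example. It shows how installing all 64 descriptors lets an all-zero descriptor clobber entry 0, and how flushing first and then skipping non-valid descriptors avoids that.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUM_SLB          64
#define MODEL_ESID_V     0x0000000008000000ULL  /* stand-in for SLB_ESID_V */
#define MODEL_INDEX_MASK 0xfffULL               /* assumed position of the index field */

struct slb_desc {
    uint64_t slbe;
    uint64_t slbv;
};

/* Models an slbmte-style install: the target slot is taken from the low bits of rb. */
static void model_slbmte(struct slb_desc *slb, uint64_t rs, uint64_t rb)
{
    uint64_t idx = rb & MODEL_INDEX_MASK;

    if (idx >= NUM_SLB)
        return;
    slb[idx].slbe = rb;
    slb[idx].slbv = rs;
}

/* Old behaviour: every descriptor is installed, so zeroed ones overwrite slot 0. */
static void set_sregs_unfiltered(struct slb_desc *slb, const struct slb_desc *in)
{
    assert(slb != NULL && in != NULL);
    for (int i = 0; i < NUM_SLB; i++)
        model_slbmte(slb, in[i].slbv, in[i].slbe);
}

/* New behaviour: flush everything first, then install only valid descriptors. */
static void set_sregs_filtered(struct slb_desc *slb, const struct slb_desc *in)
{
    assert(slb != NULL && in != NULL);
    memset(slb, 0, NUM_SLB * sizeof(*slb));
    for (int i = 0; i < NUM_SLB; i++) {
        if (in[i].slbe & MODEL_ESID_V)
            model_slbmte(slb, in[i].slbv, in[i].slbe);
    }
}
```

With one valid descriptor for slot 0 followed by 63 zeroed descriptors, the unfiltered helper leaves slot 0 empty, while the filtered helper preserves it.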
      
      Signed-off-by: Greg Kurz <groug@kaod.org>
      Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
    • KVM: PPC: Book3S HV: Don't call real-mode XICS hypercall handlers if not enabled · 00bb6ae5
      Paul Mackerras authored

      When running a guest on a POWER9 system with the in-kernel XICS
      emulation disabled (for example by running QEMU with the parameter
      "-machine pseries,kernel_irqchip=off"), the kernel does not pass
      the XICS-related hypercalls such as H_CPPR up to userspace for
      emulation there as it should.
      
      The reason for this is that the real-mode handlers for these
      hypercalls don't check whether a XICS device has been instantiated
      before calling the xics-on-xive code.  That code doesn't check
      either, leading to potential NULL pointer dereferences because
      vcpu->arch.xive_vcpu is NULL.  Those dereferences won't cause an
      exception in real mode but will lead to kernel memory corruption.
      
      This fixes it by adding kvmppc_xics_enabled() checks before calling
      the XICS functions.
      
      Cc: stable@vger.kernel.org # v4.11+
      Fixes: 5af50993 ("KVM: PPC: Book3S HV: Native usage of the XIVE interrupt controller")
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
  2. Oct 20, 2017
    • KVM: PPC: Tie KVM_CAP_PPC_HTM to the user-visible TM feature · 2a3d6553
      Michael Ellerman authored

      Currently we use CPU_FTR_TM to decide if the CPU/kernel can support
      TM (Transactional Memory), and if it's true we advertise that to
      Qemu (or similar) via KVM_CAP_PPC_HTM.
      
      PPC_FEATURE2_HTM is the user-visible feature bit, which indicates that
      the CPU and kernel can support TM. Currently CPU_FTR_TM and
      PPC_FEATURE2_HTM always have the same value, either true or false, so
      using the former for KVM_CAP_PPC_HTM is correct.
      
      However some Power9 CPUs can operate in a mode where TM is enabled but
      TM suspended state is disabled. In this mode CPU_FTR_TM is true, but
      PPC_FEATURE2_HTM is false. Instead a different PPC_FEATURE2 bit is
      set, to indicate that this different mode of TM is available.
      
      It is not safe to let guests use TM as-is, when the CPU is in this
      mode. So to prevent that from happening, use PPC_FEATURE2_HTM to
      determine the value of KVM_CAP_PPC_HTM.
      
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  3. Oct 19, 2017
  4. Oct 15, 2017
    • KVM: PPC: Book3S HV: Explicitly disable HPT operations on radix guests · 891f1ebf
      Paul Mackerras authored

      This adds code to make sure that we don't try to access the
      non-existent HPT for a radix guest using the htab file for the VM
      in debugfs, a file descriptor obtained using the KVM_PPC_GET_HTAB_FD
      ioctl, or via the KVM_PPC_RESIZE_HPT_{PREPARE,COMMIT} ioctls.
      
      At present nothing bad happens if userspace does access these
      interfaces on a radix guest, mostly because kvmppc_hpt_npte()
      gives 0 for a radix guest, which in turn is because 1 << -4
      comes out as 0 on POWER processors.  However, that relies on
      undefined behaviour, so it is better to be explicit about not
      accessing the HPT for a radix guest.
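
A minimal sketch of the "be explicit" approach follows, under the assumption that the entry count is computed as 1 << (order - 4), as the `1 << -4` remark above suggests. `struct model_kvm` and `model_hpt_npte` are hypothetical names for this example, not the kernel's definitions. Shifting by a negative amount is undefined behaviour in C, so the guard returns 0 before any shift is evaluated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_kvm {
    bool radix;              /* true if the guest uses radix, so there is no HPT */
    unsigned int hpt_order;  /* log2 of the HPT size in bytes; 0 when there is no HPT */
};

/*
 * Number of HPT entries, assuming 16-byte entries (2^4 bytes each).
 * The explicit checks avoid ever evaluating a shift by a negative count.
 */
static unsigned long model_hpt_npte(const struct model_kvm *kvm)
{
    assert(kvm != NULL);

    if (kvm->radix || kvm->hpt_order < 4)
        return 0;

    return 1UL << (kvm->hpt_order - 4);
}
```

A caller can then treat a result of 0 as "nothing to access" without depending on what a particular processor does with an out-of-range shift.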
      
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
  5. Oct 14, 2017
  6. Oct 06, 2017
  7. Oct 04, 2017
    • powerpc/mm: Call flush_tlb_kernel_range with interrupts enabled · 7c6a4f3b
      Guenter Roeck authored

      flush_tlb_kernel_range() may call smp_call_function_many(), which expects
      interrupts to be enabled. Calling it with interrupts disabled results in
      the following traceback:
      
      WARNING: CPU: 0 PID: 1 at kernel/smp.c:416 smp_call_function_many+0xcc/0x2fc
      CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.14.0-rc1-00009-g0666f56 #1
      task: cf830000 task.stack: cf82e000
      NIP:  c00a93c8 LR: c00a9634 CTR: 00000001
      REGS: cf82fde0 TRAP: 0700   Not tainted  (4.14.0-rc1-00009-g0666f56)
      MSR:  00021000 <CE,ME>  CR: 24000082  XER: 00000000
      
      GPR00: c00a9634 cf82fe90 cf830000 c050ad3c c0015a54 00000000 00000001 00000001
      GPR08: 00000001 00000000 00000000 cf82e000 24000084 00000000 c0003150 00000000
      GPR16: 00000000 00000000 00000000 00000000 00000000 00000001 00000000 c0510000
      GPR24: 00000000 c0015a54 00000000 c050ad3c c051823c c050ad3c 00000025 00000000
      NIP [c00a93c8] smp_call_function_many+0xcc/0x2fc
      LR [c00a9634] smp_call_function+0x3c/0x50
      Call Trace:
      [cf82fe90] [00000010] 0x10 (unreliable)
      [cf82fed0] [c00a9634] smp_call_function+0x3c/0x50
      [cf82fee0] [c0015d2c] flush_tlb_kernel_range+0x20/0x38
      [cf82fef0] [c001524c] mark_initmem_nx+0x154/0x16c
      [cf82ff20] [c001484c] free_initmem+0x20/0x4c
      [cf82ff30] [c000316c] kernel_init+0x1c/0x108
      [cf82ff40] [c000f3a8] ret_from_kernel_thread+0x5c/0x64
      Instruction dump:
      7c0803a6 7d808120 38210040 4e800020 3d20c052 812981a0 2f890000 40beffac
      3d20c051 8929ac64 2f890000 40beff9c <0fe00000> 4bffff94 7fc3f378 7f64db78
      
      Fixes: 3184cc4b ("powerpc/mm: Fix kernel RAM protection after freeing ...")
      Fixes: e611939f ("powerpc/mm: Ensure change_page_attr() doesn't ...")
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/xive: Clear XIVE internal structures when a CPU is removed · cc569398
      Cédric Le Goater authored

      Commit eac1e731 ("powerpc/xive: guest exploitation of the XIVE
      interrupt controller") introduced support for the XIVE exploitation
      mode of the P9 interrupt controller on the pseries platform.
      
      At that time, support for CPU removal was not complete on PowerVM and
      CPU hot unplug remained untested. It appears that some cleanups of the
      XIVE internal structures are required before releasing the CPU,
      without which the kernel crashes in a RTAS call doing the CPU
      isolation.
      
      These changes fix the crash by deconfiguring the IPI interrupt source
      and clearing the event queues of the CPU when it is removed.
      
      Fixes: eac1e731 ("powerpc/xive: guest exploitation of the XIVE interrupt controller")
      Signed-off-by: Cédric Le Goater <clg@kaod.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/xive: Fix IPI reset · 74f12821
      Cédric Le Goater authored

      When resetting an IPI, hw_ipi should also be set to zero.
      
      Fixes: eac1e731 ("powerpc/xive: guest exploitation of the XIVE interrupt controller")
      Signed-off-by: Cédric Le Goater <clg@kaod.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/watchdog: Make use of watchdog_nmi_probe() · 34ddaa3e
      Thomas Gleixner authored

      The rework of the core hotplug code triggers the WARN_ON in start_wd_on_cpu()
      on powerpc because it is called multiple times for the boot CPU.
      
      The first call is via:
      
        start_wd_on_cpu+0x80/0x2f0
        watchdog_nmi_reconfigure+0x124/0x170
        softlockup_reconfigure_threads+0x110/0x130
        lockup_detector_init+0xbc/0xe0
        kernel_init_freeable+0x18c/0x37c
        kernel_init+0x2c/0x160
        ret_from_kernel_thread+0x5c/0xbc
      
      And then again via the CPU hotplug registration:
      
        start_wd_on_cpu+0x80/0x2f0
        cpuhp_invoke_callback+0x194/0x620
        cpuhp_thread_fun+0x7c/0x1b0
        smpboot_thread_fn+0x290/0x2a0
        kthread+0x168/0x1b0
        ret_from_kernel_thread+0x5c/0xbc
      
      This can be avoided by setting up the cpu hotplug state with nocalls and
      moving the initialization to the watchdog_nmi_probe() function. That
      initializes the hotplug callbacks without invoking the callback and the
      following core initialization function then configures the watchdog for the
      online CPUs (in this case CPU0) via softlockup_reconfigure_threads().
      
      Reported-and-tested-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: linuxppc-dev@lists.ozlabs.org
    • watchdog/core, powerpc: Lock cpus across reconfiguration · e31d6883
      Thomas Gleixner authored

      Instead of dropping the cpu hotplug lock after stopping the NMI watchdog
      and threads and reacquiring it for restart, hold the cpu hotplug lock
      across the full reconfiguration. This makes the code and the protection
      rules more obvious.
      
      Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: linuxppc-dev@lists.ozlabs.org
      Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1710022105570.2114@nanos
    • watchdog/core, powerpc: Replace watchdog_nmi_reconfigure() · 6b9dc480
      Thomas Gleixner authored

      The recent cleanup of the watchdog code split watchdog_nmi_reconfigure()
      into two stages. One to stop the NMI and one to restart it after
      reconfiguration. That was done by adding a boolean 'run' argument to the
      code, which is functionally correct but not necessarily a piece of art.
      
      Replace it by two explicit functions: watchdog_nmi_stop() and
      watchdog_nmi_start().
      
      Fixes: 6592ad2f ("watchdog/core, powerpc: Make watchdog_nmi_reconfigure() two stage")
      Requested-by: Linus 'Nursing his pet-peeve' Torvalds <torvalds@linuxfoundation.org>
      Signed-off-by: Thomas 'Mopping up garbage' Gleixner <tglx@linutronix.de>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: linuxppc-dev@lists.ozlabs.org
      Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1710021957480.2114@nanos
    • rapidio: remove global irq spinlocks from the subsystem · 31d1e130
      Ioan Nicu authored

      Locking of config and doorbell operations should be done only if the
      underlying hardware requires it.
      
      This patch removes the global spinlocks from the rapidio subsystem and
      moves them to the mport drivers (fsl_rio and tsi721), only to the
      necessary places.  For example, local config space read and write
      operations (lcread/lcwrite) are atomic in all existing drivers, so there
      should be no need for locking, while the cread/cwrite operations which
      generate maintenance transactions need to be synchronized with a lock.
      
      Later, each driver could choose to use a per-port lock instead of a
      global one, or even more granular locking.
      
      Link: http://lkml.kernel.org/r/20170824113023.GD50104@nokia.com
      Signed-off-by: Ioan Nicu <ioan.nicu.ext@nokia.com>
      Signed-off-by: Frank Kunz <frank.kunz@nokia.com>
      Acked-by: Alexandre Bounine <alexandre.bounine@idt.com>
      Cc: Matt Porter <mporter@kernel.crashing.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  8. Oct 03, 2017
    • KVM: PPC: Book3S: Fix server always zero from kvmppc_xive_get_xive() · 2fb1e946
      Sam Bobroff authored

      In KVM's XICS-on-XIVE emulation, kvmppc_xive_get_xive() returns the
      value of state->guest_server as "server". However, this value is not
      set by its counterpart kvmppc_xive_set_xive(). When the guest uses
      this interface to migrate interrupts away from a CPU that is going
      offline, it sees all interrupts as belonging to CPU 0, so they are
      left assigned to (now) offline CPUs.

      This patch removes the guest_server field from the state, and returns
      act_server in its place (that is, the CPU actually handling the
      interrupt, which may differ from the one requested).
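
The mismatch between setter and getter can be shown with a small standalone model. Everything here is hypothetical (`struct model_irq_state` and the `model_*` functions are not the kernel's definitions), and the setter is simplified so that the requested server always becomes the acting server. The old getter reads a field the setter never writes, so it always reports 0; the new getter reads the field that is actually maintained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct model_irq_state {
    uint32_t guest_server;  /* never written by the setter below */
    uint32_t act_server;    /* server actually handling the interrupt */
    uint8_t  priority;
};

/* Simplified setter: records the acting server and priority only. */
static void model_set_xive(struct model_irq_state *s, uint32_t server, uint8_t prio)
{
    assert(s != NULL);
    s->act_server = server;
    s->priority = prio;
}

/* Old getter: returns a field that is still at its initial value of 0. */
static uint32_t model_get_server_old(const struct model_irq_state *s)
{
    assert(s != NULL);
    return s->guest_server;
}

/* New getter: returns the field the setter maintains. */
static uint32_t model_get_server_new(const struct model_irq_state *s)
{
    assert(s != NULL);
    return s->act_server;
}
```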
      
      Fixes: 5af50993 ("KVM: PPC: Book3S HV: Native usage of the XIVE interrupt controller")
      Cc: stable@vger.kernel.org
      Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
      Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
    • powerpc/4xx: Fix compile error with 64K pages on 40x, 44x · 070e0049
      Christian Lamparter authored

      The mmu context on the 40x, 44x does not define a pte_frag entry. This
      causes gcc to abort the compilation due to:
      
        setup-common.c: In function ‘setup_arch’:
        setup-common.c:908: error: ‘mm_context_t’ has no ‘pte_frag’
      
      This patch fixes the issue by removing the pte_frag initialization in
      setup-common.c.
      
      This is possible, because the compiler will do the initialization,
      since the mm_context is a sub struct of init_mm. init_mm is declared
      in mm_types.h as external linkage.
      
      According to C99 6.2.4.3:
        An object whose identifier is declared with external linkage
        [...] has static storage duration.
      
      C99 defines in 6.7.8.10 that:
        If an object that has static storage duration is not
        initialized explicitly, then:
        - if it has pointer type, it is initialized to a null pointer
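
The C99 rules quoted above can be checked with a short standalone example. The struct and variable names are hypothetical stand-ins (they are not the kernel's mm_struct or init_mm). The object is defined at file scope with only one member initialised explicitly, so every remaining member, including the nested pointer, is zero-initialised by the compiler.

```c
#include <assert.h>
#include <stddef.h>

struct model_context {
    void *pte_frag;     /* pointer member that is never initialised explicitly */
    unsigned long id;
};

struct model_mm {
    int users;
    struct model_context context;
};

/* File-scope object: it has static storage duration. */
struct model_mm model_init_mm = { .users = 2 };

/* Confirms that members without an explicit initialiser start out as zero/NULL. */
static void check_zero_initialised(void)
{
    assert(model_init_mm.users == 2);
    assert(model_init_mm.context.pte_frag == NULL);
    assert(model_init_mm.context.id == 0);
}
```

Because of this guarantee, an explicit `pte_frag = NULL` assignment at startup adds nothing for an object like this one.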
      
      Fixes: b1923caa ("powerpc: Merge 32-bit and 64-bit setup_arch()")
      Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
      Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc: Fix action argument for cpufeatures-based TLB flush · 3b7af5c0
      Jeremy Kerr authored

      Commit 41d0c2ec ("powerpc/powernv: Fix local TLB flush for boot
      and MCE on POWER9") introduced calls to __flush_tlb_power[89] from the
      cpufeatures code, specifying the number of sets to flush.
      
      However, these functions take an action argument, not a number of
      sets. This means we hit the BUG() in __flush_tlb_{206,300} when using
      cpufeatures-style configuration.
      
      This change passes TLB_INVAL_SCOPE_GLOBAL instead.
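
The argument mix-up can be modelled as follows. This is a standalone sketch with hypothetical names and values: `enum model_tlb_scope` stands in for the action constants, and the error return stands in for the BUG() mentioned above. A flush routine that switches on an action rejects anything that is not a known action, which is what happens when a number of sets is passed instead.

```c
#include <assert.h>

/* Stand-ins for the action constants; the numeric values are assumptions. */
enum model_tlb_scope {
    MODEL_SCOPE_GLOBAL = 0,
    MODEL_SCOPE_LPID   = 1
};

/*
 * Model of an action-based flush. Returns 0 on success and -1 for an
 * unknown action (the real code hits BUG() in that case).
 */
static int model_flush_tlb(unsigned int action)
{
    switch (action) {
    case MODEL_SCOPE_GLOBAL:
        return 0;   /* would flush all entries */
    case MODEL_SCOPE_LPID:
        return 0;   /* would flush entries for the current partition */
    default:
        return -1;  /* not an action at all, e.g. a set count */
    }
}

/* The corrected call site passes an action, not a number of sets. */
static void model_boot_flush(void)
{
    int rc = model_flush_tlb(MODEL_SCOPE_GLOBAL);

    assert(rc == 0);
    (void)rc;
}
```

Passing an illustrative set count such as 128 to `model_flush_tlb()` takes the default branch, mirroring the failure described in the commit message.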
      
      Fixes: 41d0c2ec ("powerpc/powernv: Fix local TLB flush for boot and MCE on POWER9")
      Cc: stable@vger.kernel.org # v4.13+
      Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
      Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  9. Sep 29, 2017
  10. Sep 26, 2017
  11. Sep 22, 2017
    • KVM: PPC: Book3S HV: Check for updated HDSISR on P9 HDSI exception · e001fa78
      Michael Neuling authored

      On POWER9 DD2.1 and below, sometimes on a Hypervisor Data Storage
      Interrupt (HDSI) the HDSISR is not updated at all.
      
      To work around this we put a canary value into the HDSISR before
      returning to a guest and then check for this canary when we take a
      HDSI. If we find the canary on a HDSI, we know the hardware didn't
      update the HDSISR. In this case we return to the guest to retake the
      HDSI, which should correctly update the HDSISR on the second HDSI
      entry.
      
      After talking to Paulus we've applied this workaround to all POWER9
      CPUs. The workaround of returning to the guest shouldn't ever be
      triggered on a well-behaved CPU. The extra instructions should have
      negligible performance impact.
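
The canary workaround can be sketched in plain C as a model of the control flow only. The register is represented by a struct field, and `MODEL_CANARY` is a hypothetical value assumed never to be produced by hardware; none of these names come from the kernel source. The canary is written before guest entry, and on an HDSI the handler checks whether it is still there.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_CANARY 0x7fffu  /* hypothetical value hardware is assumed never to report */

struct model_vcpu {
    uint32_t hdsisr;  /* models the HDSISR register */
};

/* Before returning to the guest, plant the canary in the register. */
static void model_guest_enter(struct model_vcpu *v)
{
    assert(v != NULL);
    v->hdsisr = MODEL_CANARY;
}

/*
 * On an HDSI: returns true if the register was updated and the fault can
 * be handled, false if the canary is still present and the guest should
 * be re-entered so the interrupt is taken again.
 */
static bool model_hdsi_valid(const struct model_vcpu *v)
{
    assert(v != NULL);
    return v->hdsisr != MODEL_CANARY;
}
```

On a CPU that always updates the register, `model_hdsi_valid()` returns true on the first attempt, so the retry path costs only the comparison.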
      
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  12. Sep 21, 2017
    • powerpc/pseries: Fix parent_dn reference leak in add_dt_node() · b537ca6f
      Tyrel Datwyler authored

      A reference to the parent device node is held by add_dt_node() for the
      node to be added. If the call to dlpar_configure_connector() fails
      add_dt_node() returns ENOENT and that reference is not freed.
      
      Add a call to of_node_put(parent_dn) prior to bailing out after a
      failed dlpar_configure_connector() call.
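
The leak and its fix follow a common reference-counting pattern, modelled here in standalone C. The `model_*` names are hypothetical stand-ins for the device-tree helpers, the success path is reduced to its reference handling, and the negative errno return is a convention of this sketch. The rule it illustrates: every exit path after taking the reference must drop it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct model_node {
    int refcount;
};

/* Take a reference on a node (models a "get" helper). */
static struct model_node *model_node_get(struct model_node *n)
{
    if (n)
        n->refcount++;
    return n;
}

/* Drop a reference on a node (models a "put" helper). */
static void model_node_put(struct model_node *n)
{
    if (n) {
        assert(n->refcount > 0);
        n->refcount--;
    }
}

/* Stand-in for the configure step; returns NULL when `fail` is non-zero. */
static struct model_node *model_configure(struct model_node *child, int fail)
{
    return fail ? NULL : child;
}

static int model_add_node(struct model_node *parent, struct model_node *child, int fail)
{
    struct model_node *p = model_node_get(parent);
    struct model_node *dn = model_configure(child, fail);

    if (!dn) {
        model_node_put(p);  /* the fix: release the parent before bailing out */
        return -ENOENT;
    }

    /* ... attach dn under p here ... */
    model_node_put(p);
    return 0;
}
```

After either outcome the parent's reference count is back to its starting value, which is the property the fix restores.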
      
      Fixes: 8d5ff320 ("powerpc/pseries: Make dlpar_configure_connector parent node aware")
      Cc: stable@vger.kernel.org # v3.12+
      Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/pseries: Fix "OF: ERROR: Bad of_node_put() on /cpus" during DLPAR · 087ff6a5
      Tyrel Datwyler authored

      Commit 215ee763 ("powerpc: pseries: remove dlpar_attach_node
      dependency on full path") reworked dlpar_attach_node() to no longer
      look up the parent node "/cpus", but instead to have the parent node
      passed by the caller in the function parameter list.
      
      As a result dlpar_attach_node() is no longer responsible for freeing
      the reference to the parent node. However, commit 215ee763 failed
      to remove the of_node_put(parent) call in dlpar_attach_node(), or to
      take into account that the reference to the parent in the caller
      dlpar_cpu_add() needs to be held until after dlpar_attach_node()
      returns.
      
      As a result doing repeated cpu add/remove dlpar operations will
      eventually result in the following error:
      
        OF: ERROR: Bad of_node_put() on /cpus
        CPU: 0 PID: 10896 Comm: drmgr Not tainted 4.13.0-autotest #1
        Call Trace:
         dump_stack+0x15c/0x1f8 (unreliable)
         of_node_release+0x1a4/0x1c0
         kobject_put+0x1a8/0x310
         kobject_del+0xbc/0xf0
         __of_detach_node_sysfs+0x144/0x210
         of_detach_node+0xf0/0x180
         dlpar_detach_node+0xc4/0x120
         dlpar_cpu_remove+0x280/0x560
         dlpar_cpu_release+0xbc/0x1b0
         arch_cpu_release+0x6c/0xb0
         cpu_release_store+0xa0/0x100
         dev_attr_store+0x68/0xa0
         sysfs_kf_write+0xa8/0xf0
         kernfs_fop_write+0x2cc/0x400
         __vfs_write+0x5c/0x340
         vfs_write+0x1a8/0x3d0
         SyS_write+0xa8/0x1a0
         system_call+0x58/0x6c
      
      Fix the issue by removing the of_node_put(parent) call from
      dlpar_attach_node(), and ensuring that the reference to the parent
      node is properly held and released by the caller dlpar_cpu_add().
      
      Fixes: 215ee763 ("powerpc: pseries: remove dlpar_attach_node dependency on full path")
      Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
      Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
      [mpe: Add a comment in the code and frob the change log slightly]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/eeh: Create PHB PEs after EEH is initialized · 3e77adee
      Benjamin Herrenschmidt authored

      Otherwise we end up not yet having computed the right diag data size
      on powernv where EEH initialization is delayed, thus causing memory
      corruption later on when calling OPAL.
      
      Fixes: 5cb1f8fd ("powerpc/powernv/pci: Dynamically allocate PHB diag data")
      Cc: stable@vger.kernel.org # v4.13+
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Acked-by: Russell Currey <ruscur@russell.cc>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  13. Sep 20, 2017
  14. Sep 15, 2017