  1. Apr 17, 2015
  2. Apr 14, 2015
    • mm: fold arch_randomize_brk into ARCH_HAS_ELF_RANDOMIZE · 204db6ed
      Kees Cook authored

      The arch_randomize_brk() function is used on several architectures,
      even those that don't support ET_DYN ASLR. To avoid bulky extern/#define
      tricks, consolidate the support under CONFIG_ARCH_HAS_ELF_RANDOMIZE for
      the architectures that support it, while still handling CONFIG_COMPAT_BRK.
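
      As a sketch of the pattern (not the literal diff), the per-arch
      extern/#define tricks collapse into one Kconfig-gated declaration,
      with the existing randomize_va_space default carrying the
      CONFIG_COMPAT_BRK behaviour:

        /* Architectures that support it select
         * CONFIG_ARCH_HAS_ELF_RANDOMIZE and provide: */
        #ifdef CONFIG_ARCH_HAS_ELF_RANDOMIZE
        extern unsigned long arch_randomize_brk(struct mm_struct *mm);
        #endif

        /* Generic ELF loader side (sketch): randomize_va_space stays
         * below 2 when CONFIG_COMPAT_BRK is set, so brk is left alone. */
        if ((current->flags & PF_RANDOMIZE) && (randomize_va_space > 1))
                current->mm->brk = arch_randomize_brk(current->mm);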
      
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Cc: Hector Marco-Gisbert <hecmargi@upv.es>
      Cc: Russell King <linux@arm.linux.org.uk>
      Reviewed-by: Ingo Molnar <mingo@kernel.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: "David A. Long" <dave.long@linaro.org>
      Cc: Andrey Ryabinin <a.ryabinin@samsung.com>
      Cc: Arun Chandran <achandran@mvista.com>
      Cc: Yann Droneaud <ydroneaud@opteya.com>
      Cc: Min-Hua Chen <orca.chen@gmail.com>
      Cc: Paul Burton <paul.burton@imgtec.com>
      Cc: Alex Smith <alex@alex-smith.me.uk>
      Cc: Markos Chandras <markos.chandras@imgtec.com>
      Cc: Vineeth Vijayan <vvijayan@mvista.com>
      Cc: Jeff Bailey <jeffbailey@google.com>
      Cc: Michael Holzheu <holzheu@linux.vnet.ibm.com>
      Cc: Ben Hutchings <ben@decadent.org.uk>
      Cc: Behan Webster <behanw@converseincode.com>
      Cc: Ismael Ripoll <iripoll@upv.es>
      Cc: Jan-Simon Möller <dl9pf@gmx.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ARM: allow 16-bit instructions in ALT_UP() · 89c6bc58
      Russell King authored

      Allow ALT_UP() to cope with a 16-bit Thumb instruction by automatically
      inserting a following nop instruction.  This allows us to care less
      about getting the assembler to emit a 32-bit Thumb instruction.
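
      Roughly, the macro measures what the assembler actually emitted and
      pads when needed (a sketch of the idea; labels and section names as
      in asm/assembler.h):

        #define ALT_UP(instr...)                                  \
                .pushsection ".alt.smp.init", "a"                ;\
                .long   9998b                                    ;\
        9997:   instr                                            ;\
                .if . - 9997b == 2                               ;\
                        nop     /* pad 16-bit Thumb to 4 bytes */;\
                .endif                                           ;\
                .popsection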
      
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
  3. Apr 12, 2015
  4. Apr 10, 2015
  5. Apr 09, 2015
    • jump_label: Allow asm/jump_label.h to be included in assembly · 55dd0df7
      Anton Blanchard authored

      Wrap asm/jump_label.h for all archs with #ifndef __ASSEMBLY__.
      Since these are kernel-only headers, we don't need #ifdef
      __KERNEL__, so we can simplify things a bit.
      
      If an architecture wants to use jump labels in assembly, it
      will still need to define a macro to create the __jump_table
      entries (see ARCH_STATIC_BRANCH in the powerpc asm/jump_label.h
      for an example).
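
      The guard itself is the stock pattern; a minimal sketch (arch details
      elided, the arch_static_branch body is hypothetical):

        /* asm/jump_label.h */
        #ifndef _ASM_JUMP_LABEL_H
        #define _ASM_JUMP_LABEL_H

        #define JUMP_LABEL_NOP_SIZE 4   /* still visible to assembly */

        #ifndef __ASSEMBLY__
        struct static_key;

        static __always_inline bool arch_static_branch(struct static_key *key)
        {
                /* arch nop + __jump_table entry emitted here via asm goto */
                return false;
        }
        #endif /* __ASSEMBLY__ */
        #endif /* _ASM_JUMP_LABEL_H */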
      
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: benh@kernel.crashing.org
      Cc: catalin.marinas@arm.com
      Cc: davem@davemloft.net
      Cc: heiko.carstens@de.ibm.com
      Cc: jbaron@akamai.com
      Cc: linux@arm.linux.org.uk
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: liuj97@gmail.com
      Cc: mgorman@suse.de
      Cc: mmarek@suse.cz
      Cc: mpe@ellerman.id.au
      Cc: paulus@samba.org
      Cc: ralf@linux-mips.org
      Cc: rostedt@goodmis.org
      Cc: schwidefsky@de.ibm.com
      Cc: will.deacon@arm.com
      Link: http://lkml.kernel.org/r/1428551492-21977-1-git-send-email-anton@samba.org
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  6. Apr 03, 2015
  7. Apr 02, 2015
    • ARM: 8338/1: kexec: Relax SMP validation to improve DT compatibility · fee3fd4f
      Geert Uytterhoeven authored

      When trying to kexec into a new kernel on a platform where multiple CPU
      cores are present, but no SMP bringup code is available yet, the
      kexec_load system call fails with:
      
          kexec_load failed: Invalid argument
      
      The SMP test added to machine_kexec_prepare() in commit 2103f6cb
      ("ARM: 7807/1: kexec: validate CPU hotplug support") wants to prohibit
      kexec on SMP platforms where it cannot disable secondary CPUs.
      However, this test is too strict: if the secondary CPUs couldn't be
      enabled in the first place, there's no need to disable them later at
      kexec time.  Hence skip the test in the absence of SMP bringup code.
      
      This allows adding all CPU cores to the DTS from the beginning, without
      having to implement SMP bringup first, improving DT compatibility.
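
      The relaxed check in machine_kexec_prepare() is roughly the following
      (helper names as merged upstream, but treat the snippet as a sketch):

        /* Refuse kexec only if secondaries could have been booted but
         * cannot be hotplugged off again; no bringup code means there
         * is nothing to stop. */
        if (num_possible_cpus() > 1 && platform_can_secondary_boot() &&
            !platform_can_cpu_hotplug())
                return -EINVAL;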
      
      Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
      Acked-by: Stephen Warren <swarren@nvidia.com>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
  8. Mar 30, 2015
    • KVM: arm/arm64: rework MMIO abort handling to use KVM MMIO bus · 950324ab
      Andre Przywara authored

      Currently we have struct kvm_exit_mmio for encapsulating MMIO abort
      data to be passed on from syndrome decoding all the way down to the
      VGIC register handlers. Now that we are switching the MMIO handling
      to be routed through the KVM MMIO bus, it no longer makes sense to
      use that structure from the very beginning. So we keep the data in
      local variables until we put them into the kvm_io_bus framework.
      Then we fill kvm_exit_mmio in the VGIC only, making it a VGIC-private
      structure. Along the way we replace the data buffer in that structure
      with a pointer to a single location in a local variable, so we get
      rid of some copying on the way.
      With all of the virtual GIC emulation code now being registered with
      the kvm_io_bus, we can remove all of the old MMIO handling code and
      its dispatching functionality.
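
      For reference, hooking a VGIC register range into the generic bus
      looks roughly like this (the vgic_io_read/vgic_io_write callbacks are
      illustrative; the kvm_io_bus calls are the stock API):

        static const struct kvm_io_device_ops vgic_io_ops = {
                .read   = vgic_io_read,
                .write  = vgic_io_write,
        };

        kvm_iodevice_init(&iodev->dev, &vgic_io_ops);
        mutex_lock(&kvm->slots_lock);
        ret = kvm_io_bus_register_dev(kvm, KVM_MMIO_BUS,
                                      base, len, &iodev->dev);
        mutex_unlock(&kvm->slots_lock);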
      
      I didn't bother to rename kvm_exit_mmio (to vgic_mmio or something),
      because that touches a lot of code lines without any good reason.
      
      This is based on an original patch by Nikolay.
      
      Signed-off-by: Andre Przywara <andre.przywara@arm.com>
      Cc: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
  9. Mar 29, 2015
  10. Mar 28, 2015
  11. Mar 27, 2015
    • ARM: 8329/1: miscellaneous vdso infrastructure, preparation · 1713ce7c
      Nathan Lynch authored

      Define the layout of the data structure shared between kernel and
      userspace.
      
      Track the vdso address in the mm_context; needed for communicating
      AT_SYSINFO_EHDR to the ELF loader.
      
      Add declarations for arm_install_vdso; implementation is in a
      following patch.
      
      Define AT_SYSINFO_EHDR, and, if CONFIG_VDSO=y, report the vdso shared
      object address via the ELF auxiliary vector.
      
      Note - this adds the AT_SYSINFO_EHDR in a new user-visible header
      asm/auxvec.h; this is consistent with other architectures.
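
      The aux-vector plumbing follows the usual pattern; as a sketch:

        /* asm/auxvec.h (new user-visible header) */
        #define AT_SYSINFO_EHDR 33

        /* asm/elf.h -- report the vdso base recorded in the mm_context */
        #define ARCH_DLINFO                                              \
        do {                                                             \
                NEW_AUX_ENT(AT_SYSINFO_EHDR,                             \
                            (elf_addr_t)current->mm->context.vdso);      \
        } while (0)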
      
      Signed-off-by: Nathan Lynch <nathan_lynch@mentor.com>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
    • arm-cci: Get rid of secure transactions for PMU driver · 772742a6
      Suzuki K. Poulose authored

      Avoid secure transactions while probing the CCI PMU. The
      existing code makes use of the Peripheral ID2 (PID2) register
      to determine the revision of the CCI400, which requires a
      secure transaction. This limits the usage of the driver on
      systems running non-secure Linux (e.g., ARM64).

      Update the device-tree binding for the CCI PMU node to add an
      explicit revision number to the compatible field.
      
      The supported strings are:
      	arm,cci-400-pmu,r0
      	arm,cci-400-pmu,r1
      	arm,cci-400-pmu - DEPRECATED. See NOTE below
      
      NOTE: If the revision is not mentioned, we need to probe the CCI revision,
      which could be fatal on a platform running non-secure. We need a reliable way
      to know if we can poke the CCI registers at runtime on ARM32. We depend on
      mcpm_is_available() when it is available; it returns true only when there is
      a registered driver for MCPM. Otherwise, we assume that we don't have secure
      access, and skip probing the revision number (the ARM64 case).

      The MCPM layer should figure out if it is safe to access the CCI.
      Unfortunately there isn't a reliable way to indicate this via the dtb. This
      patch doesn't address/change the current situation. It only deals with the
      CCI PMU, leaving the assumptions about secure access as they were prior to
      this patch.
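
      On the driver side this naturally becomes match data on the compatible
      strings; a sketch (the CCI_REV_* enum is illustrative):

        static const struct of_device_id arm_cci_pmu_matches[] = {
                { .compatible = "arm,cci-400-pmu,r0", .data = (void *)CCI_REV_R0 },
                { .compatible = "arm,cci-400-pmu,r1", .data = (void *)CCI_REV_R1 },
                { .compatible = "arm,cci-400-pmu" },  /* deprecated: probe PID2 */
                {},
        };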
      
      Cc: devicetree@vger.kernel.org
      Cc: Punit Agrawal <punit.agrawal@arm.com>
      Tested-by: Sudeep Holla <sudeep.holla@arm.com>
      Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org>
      Acked-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  12. Mar 24, 2015
    • ARM: pmu: add support for interrupt-affinity property · 9fd85eb5
      Will Deacon authored

      Historically, the PMU devicetree bindings have expected SPIs to be
      listed in order of *logical* CPU number. This is problematic for
      bootloaders, especially when the boot CPU (logical ID 0) isn't listed
      first in the devicetree.
      
      This patch adds a new optional property, interrupt-affinity, to the
      PMU node which allows the interrupt affinity to be described using
      a list of phandles to CPU nodes, with each entry in the list
      corresponding to the SPI at the same index in the interrupts property.
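
      Parsing is the usual phandle-array walk; a sketch:

        /* Map each SPI to the CPU named by interrupt-affinity[i]. */
        for (i = 0; i < nr_irqs; i++) {
                struct device_node *dn;

                dn = of_parse_phandle(node, "interrupt-affinity", i);
                if (!dn)
                        break;  /* property absent: fall back to logical order */
                /* resolve dn to a logical CPU and record it for SPI i */
                of_node_put(dn);
        }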
      
      Cc: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • ARM: cpuidle: Add a cpuidle ops structure to be used for DT · 449e056c
      Daniel Lezcano authored

      Currently, the different cpuidle drivers have their PM operations
      passed via platform_data, using the platform driver paradigm.

      This approach made it possible to split the low-level PM code from
      the arch-specific and generic cpuidle code.

      Unfortunately there are complaints about this approach: in the context
      of a single kernel image, we have multiple drivers loaded in memory for
      nothing, and the platform driver paradigm is not a good fit for cpuidle.

      This patch provides a common interface via cpuidle ops for all new
      cpuidle drivers, and a definition for the device tree.

      It will allow, with the next patches, to have a common definition with
      ARM64 and to share the same cpuidle driver.

      The code is optimized to use the __init section intensively in order
      to reduce the memory footprint after the driver is initialized, and
      unifies the function names with ARM64.
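
      The interface is deliberately small; a sketch of the ops structure and
      its DT registration (the vendor string and ops name are illustrative):

        struct cpuidle_ops {
                int (*suspend)(int cpu, unsigned long arg);
                int (*init)(struct device_node *, int cpu);
        };

        /* A driver ties its ops to an enable-method string from the DT: */
        CPUIDLE_METHOD_OF_DECLARE(foo_idle, "vendor,foo-idle", &foo_idle_ops);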
      
      Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
      Acked-by: Kevin Hilman <khilman@linaro.org>
      Acked-by: Rob Herring <robherring2@gmail.com>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
  13. Mar 23, 2015
  14. Mar 12, 2015
  15. Mar 11, 2015
    • arm64: KVM: Do not use pgd_index to index stage-2 pgd · 04b8dc85
      Marc Zyngier authored

      The kernel's pgd_index macro is designed to index a normal,
      page-sized array. KVM is a bit different, as we can use concatenated
      pages to have a bigger address space (for example, a 40bit IPA with
      4kB pages gives us an 8kB PGD).

      In the above case, the use of pgd_index will always return an index
      inside the first 4kB, which makes a guest that has memory above
      0x8000000000 rather unhappy, as it spins forever in a page fault,
      whilst the host happily corrupts the lower pgd.
      
      The obvious fix is to get our own kvm_pgd_index that does the right
      thing(tm).
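
      The helper boils down to masking with the stage-2 PGD size instead of
      the host's (sketch):

        /* Index into a (possibly concatenated) stage-2 pgd. */
        #define kvm_pgd_index(addr) \
                (((addr) >> PGDIR_SHIFT) & (PTRS_PER_S2_PGD - 1))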
      
      Tested on X-Gene with a hacked kvmtool that put memory at a stupidly
      high address.
      
      Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
    • arm64: KVM: Fix stage-2 PGD allocation to have per-page refcounting · a987370f
      Marc Zyngier authored

      We're using __get_free_pages to allocate the guest's stage-2
      PGD. The standard behaviour of this function is to return a set of
      pages where only the head page has a valid refcount.
      
      This behaviour gets us into trouble when we're trying to increment
      the refcount on a non-head page:
      
      page:ffff7c00cfb693c0 count:0 mapcount:0 mapping:          (null) index:0x0
      flags: 0x4000000000000000()
      page dumped because: VM_BUG_ON_PAGE((*({ __attribute__((unused)) typeof((&page->_count)->counter) __var = ( typeof((&page->_count)->counter)) 0; (volatile typeof((&page->_count)->counter) *)&((&page->_count)->counter); })) <= 0)
      BUG: failure at include/linux/mm.h:548/get_page()!
      Kernel panic - not syncing: BUG!
      CPU: 1 PID: 1695 Comm: kvm-vcpu-0 Not tainted 4.0.0-rc1+ #3825
      Hardware name: APM X-Gene Mustang board (DT)
      Call trace:
      [<ffff80000008a09c>] dump_backtrace+0x0/0x13c
      [<ffff80000008a1e8>] show_stack+0x10/0x1c
      [<ffff800000691da8>] dump_stack+0x74/0x94
      [<ffff800000690d78>] panic+0x100/0x240
      [<ffff8000000a0bc4>] stage2_get_pmd+0x17c/0x2bc
      [<ffff8000000a1dc4>] kvm_handle_guest_abort+0x4b4/0x6b0
      [<ffff8000000a420c>] handle_exit+0x58/0x180
      [<ffff80000009e7a4>] kvm_arch_vcpu_ioctl_run+0x114/0x45c
      [<ffff800000099df4>] kvm_vcpu_ioctl+0x2e0/0x754
      [<ffff8000001c0a18>] do_vfs_ioctl+0x424/0x5c8
      [<ffff8000001c0bfc>] SyS_ioctl+0x40/0x78
      CPU0: stopping
      
      A possible approach for this is to split the compound page using
      split_page() at allocation time, and change the teardown path to
      free one page at a time.  It turns out that alloc_pages_exact() and
      free_pages_exact() do exactly that.
      
      While we're at it, the PGD allocation code is reworked to reduce
      duplication.
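
      The allocation side then reads roughly:

        /* Every page of the stage-2 PGD now has a valid refcount. */
        size_t pgd_size = PTRS_PER_S2_PGD * sizeof(pgd_t);
        pgd_t *pgd = alloc_pages_exact(pgd_size, GFP_KERNEL | __GFP_ZERO);

        if (!pgd)
                return -ENOMEM;
        /* ...and on teardown: */
        free_pages_exact(pgd, pgd_size);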
      
      This has been tested on an X-Gene platform with a 4kB/48bit-VA host
      kernel, and kvmtool hacked to place memory in the second page of
      the hardware PGD (PUD for the host kernel). Also regression-tested
      on a Cubietruck (Cortex-A7).
      
       [ Reworked to use alloc_pages_exact() and free_pages_exact() and to
         return pointers directly instead of by reference as arguments
          - Christoffer ]
      
      Reported-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
  16. Feb 23, 2015
  17. Feb 13, 2015
    • all arches, signal: move restart_block to struct task_struct · f56141e3
      Andy Lutomirski authored

      If an attacker can cause a controlled kernel stack overflow, overwriting
      the restart block is a very juicy exploit target.  This is because the
      restart_block is held in the same memory allocation as the kernel stack.
      
      Moving the restart block to struct task_struct prevents this exploit by
      making the restart_block harder to locate.
      
      Note that there are other fields in thread_info that are also easy
      targets, at least on some architectures.
      
      It's also a decent simplification, since the restart code is more or less
      identical on all architectures.
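
      The conversion itself is mechanical, e.g.:

        /* Before: restart_block lives in thread_info, i.e. in the same
         * allocation as the kernel stack. */
        task_thread_info(current)->restart_block.fn = do_no_restart_syscall;

        /* After: it lives in task_struct, away from the stack. */
        current->restart_block.fn = do_no_restart_syscall;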
      
      [james.hogan@imgtec.com: metag: align thread_info::supervisor_stack]
      Signed-off-by: Andy Lutomirski <luto@amacapital.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: David Miller <davem@davemloft.net>
      Acked-by: Richard Weinberger <richard@nod.at>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
      Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
      Cc: Steven Miao <realmz6@gmail.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Helge Deller <deller@gmx.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
      Tested-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Chen Liqin <liqin.linux@gmail.com>
      Cc: Lennox Wu <lennox.wu@gmail.com>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: convert p[te|md]_mknonnuma and remaining page table manipulations · 4d942466
      Mel Gorman authored

      With PROT_NONE, the traditional page table manipulation functions are
      sufficient.
      
      [andre.przywara@arm.com: fix compiler warning in pmdp_invalidate()]
      [akpm@linux-foundation.org: fix build with STRICT_MM_TYPECHECKS]
      Signed-off-by: Mel Gorman <mgorman@suse.de>
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Acked-by: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com>
      Tested-by: Sasha Levin <sasha.levin@oracle.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Dave Jones <davej@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Rik van Riel <riel@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  18. Feb 12, 2015
  19. Feb 10, 2015
  20. Feb 06, 2015
    • kvm: add halt_poll_ns module parameter · f7819512
      Paolo Bonzini authored

      This patch introduces a new module parameter for the KVM module; when
      it is set, KVM attempts a bit of polling on every HLT before scheduling
      itself out via kvm_vcpu_block.
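
      The mechanism is roughly the following (a sketch; assume
      kvm_vcpu_check_block() returns negative once there is a reason to
      leave the halt):

        static unsigned int halt_poll_ns;    /* 0 = no polling */
        module_param(halt_poll_ns, uint, S_IRUGO | S_IWUSR);

        /* In kvm_vcpu_block(), before scheduling out: */
        if (halt_poll_ns) {
                ktime_t stop = ktime_add_ns(ktime_get(), halt_poll_ns);

                do {
                        if (kvm_vcpu_check_block(vcpu) < 0) {
                                ++vcpu->stat.halt_successful_poll;
                                goto out;   /* work arrived: skip sleeping */
                        }
                } while (ktime_before(ktime_get(), stop));
        }
        /* fall back to the normal schedule-out path */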
      
      This parameter helps a lot for latency-bound workloads---in particular
      I tested it with O_DSYNC writes with a battery-backed disk in the host.
      In this case, writes are fast (because the data doesn't have to go all
      the way to the platters) but they cannot be merged by either the host or
      the guest.  KVM's performance here is usually around 30% of bare metal,
      or 50% if you use cache=directsync or cache=writethrough (these
      parameters prevent the guest from sending pointless flush requests, and
      at the same time they are not slow thanks to the battery-backed cache).
      The bad performance happens because on every halt the host CPU decides
      to halt itself too.  When the interrupt comes, the vCPU thread is then
      migrated to a new physical CPU, and in general the latency is horrible
      because the vCPU thread has to be scheduled back in.
      
      With this patch performance reaches 60-65% of bare metal and, more
      important, 99% of what you get if you use idle=poll in the guest.  This
      means that the tunable gets rid of this particular bottleneck, and more
      work can be done to improve performance in the kernel or QEMU.
      
      Of course there is some price to pay; every time an otherwise idle
      vCPU is interrupted by an interrupt, it will poll unnecessarily and
      thus impose a little load on the host.  The above results were obtained
      with a mostly random value of the parameter (500000), and the load was
      around 1.5-2.5% CPU usage on one of the host's cores for each idle
      guest vCPU.
      
      The patch also adds a new stat, /sys/kernel/debug/kvm/halt_successful_poll,
      that can be used to tune the parameter.  It counts how many HLT
      instructions received an interrupt during the polling period; each
      successful poll saves Linux from scheduling the VCPU thread out and back
      in, and may also avoid a likely trip to C1 and back for the physical CPU.
      
      While the VM is idle, a 4-VCPU Linux VM halts around 10 times per second.
      Of these halts, almost all are failed polls.  During the benchmark,
      instead, basically all halts end within the polling period, except a more
      or less constant stream of 50 per second coming from vCPUs that are not
      running the benchmark.  The wasted time is thus very low.  Things may
      be slightly different for Windows VMs, which have a ~10 ms timer tick.
      
      The effect is also visible on Marcelo's recently-introduced latency
      test for the TSC deadline timer.  Though of course a non-RT kernel has
      awful latency bounds, the latency of the timer is around 8000-10000 clock
      cycles compared to 20000-120000 without setting halt_poll_ns.  For the TSC
      deadline timer, thus, the effect is both a smaller average latency and
      a smaller variance.
      
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  21. Jan 29, 2015
    • arm/arm64: KVM: Use kernel mapping to perform invalidation on page fault · 0d3e4d4f
      Marc Zyngier authored

      When handling a fault in stage-2, we need to resync I$ and D$, just
      to be sure we don't leave any old cache line behind.
      
      That's very good, except that we do so using the *user* address.
      Under heavy load (swapping like crazy), we may end up in a situation
      where the page gets mapped in stage-2 while being unmapped from
      userspace by another CPU.
      
      At that point, the DC/IC instructions can generate a fault, which
      we handle with kvm->mmu_lock held. The box quickly deadlocks, user
      is unhappy.
      
      Instead, perform this invalidation through the kernel mapping,
      which is guaranteed to be present. The box is much happier, and so
      am I.
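
      The invalidation thus switches to the linear-map alias, roughly as
      follows (helper name as on arm64; treat the snippet as illustrative):

        /* The kernel mapping of the pfn is guaranteed to be present. */
        void *va = page_address(pfn_to_page(pfn));

        kvm_flush_dcache_to_poc(va, size);  /* D-side, by kernel VA */
        /* ...then the matching I-cache invalidation on the same alias. */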
      
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
    • arm/arm64: KVM: Invalidate data cache on unmap · 363ef89f
      Marc Zyngier authored

      Let's assume a guest has created an uncached mapping, and written
      to that page. Let's also assume that the host uses a cache-coherent
      IO subsystem. Let's finally assume that the host is under memory
      pressure and starts to swap things out.
      
      Before this "uncached" page is evicted, we need to make sure
      we invalidate potentially speculated, clean cache lines that are
      sitting there, or the IO subsystem is going to swap out the
      cached view, losing the data that has been written directly
      into memory.
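
      In the stage-2 unmap path this becomes, as a sketch:

        pte_t old_pte = *pte;

        kvm_set_pte(pte, __pte(0));          /* clear the stage-2 entry */
        kvm_tlb_flush_vmid_ipa(kvm, addr);
        kvm_flush_dcache_pte(old_pte);       /* flush via the kernel alias */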
      
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
    • arm/arm64: KVM: Use set/way op trapping to track the state of the caches · 3c1e7165
      Marc Zyngier authored

      Trying to emulate the behaviour of set/way cache ops is fairly
      pointless, as there are too many ways we can end up missing stuff.
      Also, there are some system caches out there that simply ignore
      set/way operations.
      
      So instead of trying to implement them, let's convert it to VA ops,
      and use them as a way to re-enable the trapping of VM ops. That way,
      we can detect the point when the MMU/caches are turned off, and do
      a full VM flush (which is what the guest was trying to do anyway).
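
      As a sketch (function names as merged, details illustrative):

        /* Any trapped set/way op triggers one full stage-2 flush and
         * re-arms trapping of the VM registers: */
        kvm_set_way_flush(vcpu);

        /* On a trapped SCTLR write, flush again when the caches flip: */
        bool was_enabled = vcpu_has_cache_enabled(vcpu);
        /* ...perform the register update... */
        kvm_toggle_cache(vcpu, was_enabled);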
      
      This allows a 32bit zImage to boot on the APM thingy, and will
      probably help bootloaders in general.
      
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
  22. Jan 28, 2015
  23. Jan 20, 2015
  24. Jan 16, 2015