Skip to content
  1. Dec 25, 2016
  2. Dec 07, 2016
    • Julien Grall's avatar
      arm/xen: Use alloc_percpu rather than __alloc_percpu · 24d5373d
      Julien Grall authored
      
      
      The function xen_guest_init is using __alloc_percpu with an alignment
      which are not power of two.
      
      However, the percpu allocator never supported alignments which are not power
      of two and has always behaved incorectly in thise case.
      
      Commit 3ca45a46 "percpu: ensure requested alignment is power of two"
      introduced a check which trigger a warning [1] when booting linux-next
      on Xen. But in reality this bug was always present.
      
      This can be fixed by replacing the call to __alloc_percpu with
      alloc_percpu. The latter will use an alignment which are a power of two.
      
      [1]
      
      [    0.023921] illegal size (48) or align (48) for percpu allocation
      [    0.024167] ------------[ cut here ]------------
      [    0.024344] WARNING: CPU: 0 PID: 1 at linux/mm/percpu.c:892 pcpu_alloc+0x88/0x6c0
      [    0.024584] Modules linked in:
      [    0.024708]
      [    0.024804] CPU: 0 PID: 1 Comm: swapper/0 Not tainted
      4.9.0-rc7-next-20161128 #473
      [    0.025012] Hardware name: Foundation-v8A (DT)
      [    0.025162] task: ffff80003d870000 task.stack: ffff80003d844000
      [    0.025351] PC is at pcpu_alloc+0x88/0x6c0
      [    0.025490] LR is at pcpu_alloc+0x88/0x6c0
      [    0.025624] pc : [<ffff00000818e678>] lr : [<ffff00000818e678>]
      pstate: 60000045
      [    0.025830] sp : ffff80003d847cd0
      [    0.025946] x29: ffff80003d847cd0 x28: 0000000000000000
      [    0.026147] x27: 0000000000000000 x26: 0000000000000000
      [    0.026348] x25: 0000000000000000 x24: 0000000000000000
      [    0.026549] x23: 0000000000000000 x22: 00000000024000c0
      [    0.026752] x21: ffff000008e97000 x20: 0000000000000000
      [    0.026953] x19: 0000000000000030 x18: 0000000000000010
      [    0.027155] x17: 0000000000000a3f x16: 00000000deadbeef
      [    0.027357] x15: 0000000000000006 x14: ffff000088f79c3f
      [    0.027573] x13: ffff000008f79c4d x12: 0000000000000041
      [    0.027782] x11: 0000000000000006 x10: 0000000000000042
      [    0.027995] x9 : ffff80003d847a40 x8 : 6f697461636f6c6c
      [    0.028208] x7 : 6120757063726570 x6 : ffff000008f79c84
      [    0.028419] x5 : 0000000000000005 x4 : 0000000000000000
      [    0.028628] x3 : 0000000000000000 x2 : 000000000000017f
      [    0.028840] x1 : ffff80003d870000 x0 : 0000000000000035
      [    0.029056]
      [    0.029152] ---[ end trace 0000000000000000 ]---
      [    0.029297] Call trace:
      [    0.029403] Exception stack(0xffff80003d847b00 to
                                     0xffff80003d847c30)
      [    0.029621] 7b00: 0000000000000030 0001000000000000
      ffff80003d847cd0 ffff00000818e678
      [    0.029901] 7b20: 0000000000000002 0000000000000004
      ffff000008f7c060 0000000000000035
      [    0.030153] 7b40: ffff000008f79000 ffff000008c4cd88
      ffff80003d847bf0 ffff000008101778
      [    0.030402] 7b60: 0000000000000030 0000000000000000
      ffff000008e97000 00000000024000c0
      [    0.030647] 7b80: 0000000000000000 0000000000000000
      0000000000000000 0000000000000000
      [    0.030895] 7ba0: 0000000000000035 ffff80003d870000
      000000000000017f 0000000000000000
      [    0.031144] 7bc0: 0000000000000000 0000000000000005
      ffff000008f79c84 6120757063726570
      [    0.031394] 7be0: 6f697461636f6c6c ffff80003d847a40
      0000000000000042 0000000000000006
      [    0.031643] 7c00: 0000000000000041 ffff000008f79c4d
      ffff000088f79c3f 0000000000000006
      [    0.031877] 7c20: 00000000deadbeef 0000000000000a3f
      [    0.032051] [<ffff00000818e678>] pcpu_alloc+0x88/0x6c0
      [    0.032229] [<ffff00000818ece8>] __alloc_percpu+0x18/0x20
      [    0.032409] [<ffff000008d9606c>] xen_guest_init+0x174/0x2f4
      [    0.032591] [<ffff0000080830f8>] do_one_initcall+0x38/0x130
      [    0.032783] [<ffff000008d90c34>] kernel_init_freeable+0xe0/0x248
      [    0.032995] [<ffff00000899a890>] kernel_init+0x10/0x100
      [    0.033172] [<ffff000008082ec0>] ret_from_fork+0x10/0x50
      
      Reported-by: default avatarWei Chen <wei.chen@arm.com>
      Link: https://lkml.org/lkml/2016/11/28/669
      
      
      Signed-off-by: default avatarJulien Grall <julien.grall@arm.com>
      Signed-off-by: default avatarStefano Stabellini <sstabellini@kernel.org>
      Reviewed-by: default avatarStefano Stabellini <sstabellini@kernel.org>
      Cc: stable@vger.kernel.org
      24d5373d
  3. Nov 07, 2016
    • Alexander Duyck's avatar
      swiotlb-xen: Enforce return of DMA_ERROR_CODE in mapping function · 76418421
      Alexander Duyck authored
      
      
      The mapping function should always return DMA_ERROR_CODE when a mapping has
      failed as this is what the DMA API expects when a DMA error has occurred.
      The current function for mapping a page in Xen was returning either
      DMA_ERROR_CODE or 0 depending on where it failed.
      
      On x86 DMA_ERROR_CODE is 0, but on other architectures such as ARM it is
      ~0. We need to make sure we return the same error value if either the
      mapping failed or the device is not capable of accessing the mapping.
      
      If we are returning DMA_ERROR_CODE as our error value we can drop the
      function for checking the error code as the default is to compare the
      return value against DMA_ERROR_CODE if no function is defined.
      
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: default avatarAlexander Duyck <alexander.h.duyck@intel.com>
      Signed-off-by: default avatarKonrad Rzeszutek Wilk <konrad@kernel.org>
      76418421
  4. Sep 14, 2016
    • Vitaly Kuznetsov's avatar
      arm/xen: fix SMP guests boot · de75abbe
      Vitaly Kuznetsov authored
      
      
      Commit 88e957d6 ("xen: introduce xen_vcpu_id mapping") broke SMP
      ARM guests on Xen. When FIFO-based event channels are in use (this is
      the default), evtchn_fifo_alloc_control_block() is called on
      CPU_UP_PREPARE event and this happens before we set up xen_vcpu_id
      mapping in xen_starting_cpu. Temporary fix the issue by setting direct
      Linux CPU id <-> Xen vCPU id mapping for all possible CPUs at boot. We
      don't currently support kexec/kdump on Xen/ARM so these ids always
      match.
      
      In future, we have several ways to solve the issue, e.g.:
      
      - Eliminate all hypercalls from CPU_UP_PREPARE, do them from the
        starting CPU. This can probably be done for both x86 and ARM and, if
        done, will allow us to get Xen's idea of vCPU id from CPUID/MPIDR on
        the starting CPU directly, no messing with ACPI/device tree
        required.
      
      - Save vCPU id information from ACPI/device tree on ARM and use it to
        initialize xen_vcpu_id mapping. This is the same trick we currently
        do on x86.
      
      Reported-by: default avatarJulien Grall <julien.grall@arm.com>
      Tested-by: default avatarWei Chen <Wei.Chen@arm.com>
      Signed-off-by: default avatarVitaly Kuznetsov <vkuznets@redhat.com>
      Acked-by: default avatarStefano Stabellini <sstabellini@kernel.org>
      Signed-off-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      de75abbe
  5. Aug 24, 2016
  6. Aug 04, 2016
    • Krzysztof Kozlowski's avatar
      dma-mapping: use unsigned long for dma_attrs · 00085f1e
      Krzysztof Kozlowski authored
      The dma-mapping core and the implementations do not change the DMA
      attributes passed by pointer.  Thus the pointer can point to const data.
      However the attributes do not have to be a bitfield.  Instead unsigned
      long will do fine:
      
      1. This is just simpler.  Both in terms of reading the code and setting
         attributes.  Instead of initializing local attributes on the stack
         and passing pointer to it to dma_set_attr(), just set the bits.
      
      2. It brings safeness and checking for const correctness because the
         attributes are passed by value.
      
      Semantic patches for this change (at least most of them):
      
          virtual patch
          virtual context
      
          @r@
          identifier f, attrs;
      
          @@
          f(...,
          - struct dma_attrs *attrs
          + unsigned long attrs
          , ...)
          {
          ...
          }
      
          @@
          identifier r.f;
          @@
          f(...,
          - NULL
          + 0
           )
      
      and
      
          // Options: --all-includes
          virtual patch
          virtual context
      
          @r@
          identifier f, attrs;
          type t;
      
          @@
          t f(..., struct dma_attrs *attrs);
      
          @@
          identifier r.f;
          @@
          f(...,
          - NULL
          + 0
           )
      
      Link: http://lkml.kernel.org/r/1468399300-5399-2-git-send-email-k.kozlowski@samsung.com
      
      
      Signed-off-by: default avatarKrzysztof Kozlowski <k.kozlowski@samsung.com>
      Acked-by: default avatarVineet Gupta <vgupta@synopsys.com>
      Acked-by: default avatarRobin Murphy <robin.murphy@arm.com>
      Acked-by: default avatarHans-Christian Noren Egtvedt <egtvedt@samfundet.no>
      Acked-by: Mark Salter <msalter@redhat.com> [c6x]
      Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> [cris]
      Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> [drm]
      Reviewed-by: default avatarBart Van Assche <bart.vanassche@sandisk.com>
      Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
      Acked-by: Fabien Dessenne <fabien.dessenne@st.com> [bdisp]
      Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com> [vb2-core]
      Acked-by: David Vrabel <david.vrabel@citrix.com> [xen]
      Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [xen swiotlb]
      Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
      Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon]
      Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k]
      Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> [s390]
      Acked-by: default avatarBjorn Andersson <bjorn.andersson@linaro.org>
      Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no> [avr32]
      Acked-by: Vineet Gupta <vgupta@synopsys.com> [arc]
      Acked-by: Robin Murphy <robin.murphy@arm.com> [arm64 and dma-iommu]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      00085f1e
  7. Jul 25, 2016
    • Vitaly Kuznetsov's avatar
      x86/xen: use xen_vcpu_id mapping for HYPERVISOR_vcpu_op · ad5475f9
      Vitaly Kuznetsov authored
      
      
      HYPERVISOR_vcpu_op() passes Linux's idea of vCPU id as a parameter
      while Xen's idea is expected. In some cases these ideas diverge so we
      need to do remapping.
      
      Convert all callers of HYPERVISOR_vcpu_op() to use xen_vcpu_nr().
      
      Leave xen_fill_possible_map() and xen_filter_cpu_maps() intact as
      they're only being called by PV guests before perpu areas are
      initialized. While the issue could be solved by switching to
      early_percpu for xen_vcpu_id I think it's not worth it: PV guests will
      probably never get to the point where their idea of vCPU id diverges
      from Xen's.
      
      Signed-off-by: default avatarVitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      ad5475f9
    • Vitaly Kuznetsov's avatar
      xen: introduce xen_vcpu_id mapping · 88e957d6
      Vitaly Kuznetsov authored
      
      
      It may happen that Xen's and Linux's ideas of vCPU id diverge. In
      particular, when we crash on a secondary vCPU we may want to do kdump
      and unlike plain kexec where we do migrate_to_reboot_cpu() we try
      booting on the vCPU which crashed. This doesn't work very well for
      PVHVM guests as we have a number of hypercalls where we pass vCPU id
      as a parameter. These hypercalls either fail or do something
      unexpected.
      
      To solve the issue introduce percpu xen_vcpu_id mapping. ARM and PV
      guests get direct mapping for now. Boot CPU for PVHVM guest gets its
      id from CPUID. With secondary CPUs it is a bit more
      trickier. Currently, we initialize IPI vectors before these CPUs boot
      so we can't use CPUID. Use ACPI ids from MADT instead.
      
      Signed-off-by: default avatarVitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      88e957d6
  8. Jul 15, 2016
  9. Jul 06, 2016
  10. Dec 21, 2015
  11. Nov 07, 2015
    • Mel Gorman's avatar
      mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep... · d0164adc
      Mel Gorman authored
      
      mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd
      
      __GFP_WAIT has been used to identify atomic context in callers that hold
      spinlocks or are in interrupts.  They are expected to be high priority and
      have access one of two watermarks lower than "min" which can be referred
      to as the "atomic reserve".  __GFP_HIGH users get access to the first
      lower watermark and can be called the "high priority reserve".
      
      Over time, callers had a requirement to not block when fallback options
      were available.  Some have abused __GFP_WAIT leading to a situation where
      an optimisitic allocation with a fallback option can access atomic
      reserves.
      
      This patch uses __GFP_ATOMIC to identify callers that are truely atomic,
      cannot sleep and have no alternative.  High priority users continue to use
      __GFP_HIGH.  __GFP_DIRECT_RECLAIM identifies callers that can sleep and
      are willing to enter direct reclaim.  __GFP_KSWAPD_RECLAIM to identify
      callers that want to wake kswapd for background reclaim.  __GFP_WAIT is
      redefined as a caller that is willing to enter direct reclaim and wake
      kswapd for background reclaim.
      
      This patch then converts a number of sites
      
      o __GFP_ATOMIC is used by callers that are high priority and have memory
        pools for those requests. GFP_ATOMIC uses this flag.
      
      o Callers that have a limited mempool to guarantee forward progress clear
        __GFP_DIRECT_RECLAIM but keep __GFP_KSWAPD_RECLAIM. bio allocations fall
        into this category where kswapd will still be woken but atomic reserves
        are not used as there is a one-entry mempool to guarantee progress.
      
      o Callers that are checking if they are non-blocking should use the
        helper gfpflags_allow_blocking() where possible. This is because
        checking for __GFP_WAIT as was done historically now can trigger false
        positives. Some exceptions like dm-crypt.c exist where the code intent
        is clearer if __GFP_DIRECT_RECLAIM is used instead of the helper due to
        flag manipulations.
      
      o Callers that built their own GFP flags instead of starting with GFP_KERNEL
        and friends now also need to specify __GFP_KSWAPD_RECLAIM.
      
      The first key hazard to watch out for is callers that removed __GFP_WAIT
      and was depending on access to atomic reserves for inconspicuous reasons.
      In some cases it may be appropriate for them to use __GFP_HIGH.
      
      The second key hazard is callers that assembled their own combination of
      GFP flags instead of starting with something like GFP_KERNEL.  They may
      now wish to specify __GFP_KSWAPD_RECLAIM.  It's almost certainly harmless
      if it's missed in most cases as other activity will wake kswapd.
      
      Signed-off-by: default avatarMel Gorman <mgorman@techsingularity.net>
      Acked-by: default avatarVlastimil Babka <vbabka@suse.cz>
      Acked-by: default avatarMichal Hocko <mhocko@suse.com>
      Acked-by: default avatarJohannes Weiner <hannes@cmpxchg.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Vitaly Wool <vitalywool@gmail.com>
      Cc: Rik van Riel <riel@redhat.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      d0164adc
  12. Oct 23, 2015
  13. Sep 11, 2015
  14. Sep 08, 2015
    • Julien Grall's avatar
      xen/privcmd: Further s/MFN/GFN/ clean-up · a13d7201
      Julien Grall authored
      
      
      The privcmd code is mixing the usage of GFN and MFN within the same
      functions which make the code difficult to understand when you only work
      with auto-translated guests.
      
      The privcmd driver is only dealing with GFN so replace all the mention
      of MFN into GFN.
      
      The ioctl structure used to map foreign change has been left unchanged
      given that the userspace is using it. Nonetheless, add a comment to
      explain the expected value within the "mfn" field.
      
      Signed-off-by: default avatarJulien Grall <julien.grall@citrix.com>
      Reviewed-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      Signed-off-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      a13d7201
    • Julien Grall's avatar
      xen: Make clear that swiotlb and biomerge are dealing with DMA address · 32e09870
      Julien Grall authored
      
      
      The swiotlb is required when programming a DMA address on ARM when a
      device is not protected by an IOMMU.
      
      In this case, the DMA address should always be equal to the machine address.
      For DOM0 memory, Xen ensure it by have an identity mapping between the
      guest address and host address. However, when mapping a foreign grant
      reference, the 1:1 model doesn't work.
      
      For ARM guest, most of the callers of pfn_to_mfn expects to get a GFN
      (Guest Frame Number), i.e a PFN (Page Frame Number) from the Linux point
      of view given that all ARM guest are auto-translated.
      
      Even though the name pfn_to_mfn is misleading, we need to ensure that
      those caller get a GFN and not by mistake a MFN. In pratical, I haven't
      seen error related to this but we should fix it for the sake of
      correctness.
      
      In order to fix the implementation of pfn_to_mfn on ARM in a follow-up
      patch, we have to introduce new helpers to return the DMA from a PFN and
      the invert.
      
      On x86, the new helpers will be an alias of pfn_to_mfn and mfn_to_pfn.
      
      The helpers will be used in swiotlb and xen_biovec_phys_mergeable.
      
      This is necessary in the latter because we have to ensure that the
      biovec code will not try to merge a biovec using foreign page and
      another using Linux memory.
      
      Lastly, the helper mfn_to_local_pfn has been renamed to bfn_to_local_pfn
      given that the only usage was in swiotlb.
      
      Signed-off-by: default avatarJulien Grall <julien.grall@citrix.com>
      Reviewed-by: default avatarStefano Stabellini <stefano.stabellini@eu.citrix.com>
      Signed-off-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      32e09870
  15. Aug 20, 2015
  16. Jun 17, 2015
  17. May 28, 2015
  18. May 18, 2015
  19. May 06, 2015
  20. Mar 16, 2015
    • David Vrabel's avatar
      xen/privcmd: improve performance of MMAPBATCH_V2 · 4e8c0c8c
      David Vrabel authored
      
      
      Make the IOCTL_PRIVCMD_MMAPBATCH_V2 (and older V1 version) map
      multiple frames at a time rather than one at a time, despite the pages
      being non-consecutive GFNs.
      
      xen_remap_foreign_mfn_array() is added which maps an array of GFNs
      (instead of a consecutive range of GFNs).
      
      Since per-frame errors are returned in an array, privcmd must set the
      MMAPBATCH_V1 error bits as part of the "report errors" phase, after
      all the frames are mapped.
      
      Migrate times are significantly improved (when using a PV toolstack
      domain).  For example, for an idle 12 GiB PV guest:
      
              Before     After
        real  0m38.179s  0m26.868s
        user  0m15.096s  0m13.652s
        sys   0m28.988s  0m18.732s
      
      Signed-off-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      Reviewed-by: default avatarStefano Stabellini <stefano.stabellini@eu.citrix.com>
      4e8c0c8c
    • David Vrabel's avatar
      xen: unify foreign GFN map/unmap for auto-xlated physmap guests · 628c28ee
      David Vrabel authored
      
      
      Auto-translated physmap guests (arm, arm64 and x86 PVHVM/PVH) map and
      unmap foreign GFNs using the same method (updating the physmap).
      Unify the two arm and x86 implementations into one commont one.
      
      Note that on arm and arm64, the correct error code will be returned
      (instead of always -EFAULT) and map/unmap failure warnings are no
      longer printed.  These changes are required if the foreign domain is
      paging (-ENOENT failures are expected and must be propagated up to the
      caller).
      
      Signed-off-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      Reviewed-by: default avatarStefano Stabellini <stefano.stabellini@eu.citrix.com>
      628c28ee
Loading