Skip to content
  1. Sep 14, 2014
    • Dave Young's avatar
      x86 early_ioremap: Increase FIX_BTMAPS_SLOTS to 8 · 3eddc69f
      Dave Young authored
      
      
      3.16 kernel boot fail with earlyprintk=efi, it keeps scrolling at the
      bottom line of screen.
      
      Bisected, the first bad commit is below:
      commit 86dfc6f3
      Author: Lv Zheng <lv.zheng@intel.com>
      Date:   Fri Apr 4 12:38:57 2014 +0800
      
          ACPICA: Tables: Fix table checksums verification before installation.
      
      I did some debugging by enabling both serial and efi earlyprintk, below is
      some debug dmesg, seems early_ioremap fails in scroll up function due to
      no free slot, see below dmesg output:
      
        WARNING: CPU: 0 PID: 0 at mm/early_ioremap.c:116 __early_ioremap+0x90/0x1c4()
        __early_ioremap(ed00c800, 00000c80) not found slot
        Modules linked in:
        CPU: 0 PID: 0 Comm: swapper Not tainted 3.17.0-rc1+ #204
        Hardware name: Hewlett-Packard HP Z420 Workstation/1589, BIOS J61 v03.15 05/09/2013
        Call Trace:
          dump_stack+0x4e/0x7a
          warn_slowpath_common+0x75/0x8e
          ? __early_ioremap+0x90/0x1c4
          warn_slowpath_fmt+0x47/0x49
          __early_ioremap+0x90/0x1c4
          ? sprintf+0x46/0x48
          early_ioremap+0x13/0x15
          early_efi_map+0x24/0x26
          early_efi_scroll_up+0x6d/0xc0
          early_efi_write+0x1b0/0x214
          call_console_drivers.constprop.21+0x73/0x7e
          console_unlock+0x151/0x3b2
          ? vprintk_emit+0x49f/0x532
          vprintk_emit+0x521/0x532
          ? console_unlock+0x383/0x3b2
          printk+0x4f/0x51
          acpi_os_vprintf+0x2b/0x2d
          acpi_os_printf+0x43/0x45
          acpi_info+0x5c/0x63
          ? __acpi_map_table+0x13/0x18
          ? acpi_os_map_iomem+0x21/0x147
          acpi_tb_print_table_header+0x177/0x186
          acpi_tb_install_table_with_override+0x4b/0x62
          acpi_tb_install_standard_table+0xd9/0x215
          ? early_ioremap+0x13/0x15
          ? __acpi_map_table+0x13/0x18
          acpi_tb_parse_root_table+0x16e/0x1b4
          acpi_initialize_tables+0x57/0x59
          acpi_table_init+0x50/0xce
          acpi_boot_table_init+0x1e/0x85
          setup_arch+0x9b7/0xcc4
          start_kernel+0x94/0x42d
          ? early_idt_handlers+0x120/0x120
          x86_64_start_reservations+0x2a/0x2c
          x86_64_start_kernel+0xf3/0x100
      
      Quote reply from Lv.zheng about the early ioremap slot usage in this case:
      
      """
      In early_efi_scroll_up(), 2 mapping entries will be used for the src/dst screen buffer.
      In drivers/acpi/acpica/tbutils.c, we've improved the early table loading code in acpi_tb_parse_root_table().
      We now need 2 mapping entries:
      1. One mapping entry is used for RSDT table mapping. Each RSDT entry contains an address for another ACPI table.
      2. For each entry in RSDP, we need another mapping entry to map the table to perform necessary check/override before installing it.
      
      When acpi_tb_parse_root_table() prints something through EFI earlyprintk console, we'll have 4 mapping entries used.
      The current 4 slots setting of early_ioremap() seems to be too small for such a use case.
      """
      
      Thus increase the slot to 8 in this patch to fix this issue.
      boot-time mappings become 512 page with this patch.
      
      Signed-off-by: default avatarDave Young <dyoung@redhat.com>
      Cc: <stable@vger.kernel.org> # v3.16
      Signed-off-by: default avatarMatt Fleming <matt.fleming@intel.com>
      3eddc69f
  2. Sep 09, 2014
  3. Sep 08, 2014
  4. Sep 01, 2014
    • Jiang Liu's avatar
      x86, irq: Fix build error caused by 9eabc99a · f3761db1
      Jiang Liu authored
      
      
      Commit 9eabc99a causes following build error when
      IOAPIC is disabled.
         arch/x86/pci/irq.c: In function 'pirq_disable_irq':
      >> arch/x86/pci/irq.c:1259:2: error: implicit declaration of function 'mp_should_keep_irq' [-Werror=implicit-function-declaration]
           if (io_apic_assign_pci_irqs && !mp_should_keep_irq(&dev->dev) &&
           ^
         cc1: some warnings being treated as errors
      
      Signed-off-by: default avatarJiang Liu <jiang.liu@linux.intel.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Grant Likely <grant.likely@linaro.org>
      Link: http://lkml.kernel.org/r/1409382916-10649-1-git-send-email-jiang.liu@linux.intel.com
      
      
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      f3761db1
  5. Aug 29, 2014
    • Michael Welling's avatar
      kexec: purgatory: add clean-up for purgatory directory · b0108f9e
      Michael Welling authored
      
      
      Without this patch the kexec-purgatory.c and purgatory.ro files are not
      removed after make mrproper.
      
      Signed-off-by: default avatarMichael Welling <mwelling@ieee.org>
      Acked-by: default avatarVivek Goyal <vgoyal@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      b0108f9e
    • Vivek Goyal's avatar
      x86/purgatory: use approprate -m64/-32 build flag for arch/x86/purgatory · 4df4185a
      Vivek Goyal authored
      
      
      Thomas reported that build of x86_64 kernel was failing for him.  He is
      using 32bit tool chain.
      
      Problem is that while compiling purgatory, I have not specified -m64
      flag.  And 32bit tool chain must be assuming -m32 by default.
      
      Following is error message.
      
      (mini) [~/work/linux-2.6] make
      scripts/kconfig/conf --silentoldconfig Kconfig
        CHK     include/config/kernel.release
        UPD     include/config/kernel.release
        CHK     include/generated/uapi/linux/version.h
        CHK     include/generated/utsrelease.h
        UPD     include/generated/utsrelease.h
        CC      arch/x86/purgatory/purgatory.o
      arch/x86/purgatory/purgatory.c:1:0: error: code model 'large' not supported in
      the 32 bit mode
      
      Fix it by explicitly passing appropriate -m64/-m32 build flag for
      purgatory.
      
      Reported-by: default avatarThomas Glanzmann <thomas@glanzmann.de>
      Tested-by: default avatarThomas Glanzmann <thomas@glanzmann.de>
      Suggested-by: default avatarH. Peter Anvin <hpa@zytor.com>
      Signed-off-by: default avatarVivek Goyal <vgoyal@redhat.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      4df4185a
    • Vivek Goyal's avatar
      kexec: create a new config option CONFIG_KEXEC_FILE for new syscall · 74ca317c
      Vivek Goyal authored
      
      
      Currently new system call kexec_file_load() and all the associated code
      compiles if CONFIG_KEXEC=y.  But new syscall also compiles purgatory
      code which currently uses gcc option -mcmodel=large.  This option seems
      to be available only gcc 4.4 onwards.
      
      Hiding new functionality behind a new config option will not break
      existing users of old gcc.  Those who wish to enable new functionality
      will require new gcc.  Having said that, I am trying to figure out how
      can I move away from using -mcmodel=large but that can take a while.
      
      I think there are other advantages of introducing this new config
      option.  As this option will be enabled only on x86_64, other arches
      don't have to compile generic kexec code which will never be used.  This
      new code selects CRYPTO=y and CRYPTO_SHA256=y.  And all other arches had
      to do this for CONFIG_KEXEC.  Now with introduction of new config
      option, we can remove crypto dependency from other arches.
      
      Now CONFIG_KEXEC_FILE is available only on x86_64.  So whereever I had
      CONFIG_X86_64 defined, I got rid of that.
      
      For CONFIG_KEXEC_FILE, instead of doing select CRYPTO=y, I changed it to
      "depends on CRYPTO=y".  This should be safer as "select" is not
      recursive.
      
      Signed-off-by: default avatarVivek Goyal <vgoyal@redhat.com>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Tested-by: default avatarShaun Ruffell <sruffell@digium.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      74ca317c
    • Hugh Dickins's avatar
      x86,mm: fix pte_special versus pte_numa · b38af472
      Hugh Dickins authored
      
      
      Sasha Levin has shown oopses on ffffea0003480048 and ffffea0003480008 at
      mm/memory.c:1132, running Trinity on different 3.16-rc-next kernels:
      where zap_pte_range() checks page->mapping to see if PageAnon(page).
      
      Those addresses fit struct pages for pfns d2001 and d2000, and in each
      dump a register or a stack slot showed d2001730 or d2000730: pte flags
      0x730 are PCD ACCESSED PROTNONE SPECIAL IOMAP; and Sasha's e820 map has
      a hole between cfffffff and 100000000, which would need special access.
      
      Commit c46a7c81 ("x86: define _PAGE_NUMA by reusing software bits on
      the PMD and PTE levels") has broken vm_normal_page(): a PROTNONE SPECIAL
      pte no longer passes the pte_special() test, so zap_pte_range() goes on
      to try to access a non-existent struct page.
      
      Fix this by refining pte_special() (SPECIAL with PRESENT or PROTNONE) to
      complement pte_numa() (SPECIAL with neither PRESENT nor PROTNONE).  A
      hint that this was a problem was that c46a7c81 added pte_numa() test
      to vm_normal_page(), and moved its is_zero_pfn() test from slow to fast
      path: This was papering over a pte_special() snag when the zero page was
      encountered during zap.  This patch reverts vm_normal_page() to how it
      was before, relying on pte_special().
      
      It still appears that this patch may be incomplete: aren't there other
      places which need to be handling PROTNONE along with PRESENT?  For
      example, pte_mknuma() clears _PAGE_PRESENT and sets _PAGE_NUMA, but on a
      PROT_NONE area, that would make it pte_special().  This is side-stepped
      by the fact that NUMA hinting faults skipped PROT_NONE VMAs and there
      are no grounds where a NUMA hinting fault on a PROT_NONE VMA would be
      interesting.
      
      Fixes: c46a7c81 ("x86: define _PAGE_NUMA by reusing software bits on the PMD and PTE levels")
      Reported-by: default avatarSasha Levin <sasha.levin@oracle.com>
      Tested-by: default avatarSasha Levin <sasha.levin@oracle.com>
      Signed-off-by: default avatarHugh Dickins <hughd@google.com>
      Signed-off-by: default avatarMel Gorman <mgorman@suse.de>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Cyrill Gorcunov <gorcunov@gmail.com>
      Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
      Cc: <stable@vger.kernel.org>	[3.16]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      b38af472
    • Jiang Liu's avatar
      x86, irq, PCI: Keep IRQ assignment for runtime power management · 9eabc99a
      Jiang Liu authored
      Now IOAPIC driver dynamically allocates IRQ numbers for IOAPIC pins.
      We need to keep IRQ assignment for PCI devices during runtime power
      management, otherwise it may cause failure of device wakeups.
      
      Commit 3eec5952 "x86, irq, PCI: Keep IRQ assignment for PCI
      devices during suspend/hibernation" has fixed the issue for suspend/
      hibernation, we also need the same fix for runtime device sleep too.
      
      Fix: https://bugzilla.kernel.org/show_bug.cgi?id=83271
      
      
      Reported-and-Tested-by: default avatarEmanueL Czirai <amanual@openmailbox.org>
      Signed-off-by: default avatarJiang Liu <jiang.liu@linux.intel.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: EmanueL Czirai <amanual@openmailbox.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Grant Likely <grant.likely@linaro.org>
      Link: http://lkml.kernel.org/r/1409304383-18806-1-git-send-email-jiang.liu@linux.intel.com
      
      
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      9eabc99a
  6. Aug 27, 2014
    • Jiang Liu's avatar
      x86: irq: Fix bug in setting IOAPIC pin attributes · f395dcae
      Jiang Liu authored
      
      
      Commit 15a3c7cc "x86, irq: Introduce two helper functions
      to support irqdomain map operation" breaks LPSS ACPI enumerated
      devices.
      
      On startup, IOAPIC driver preallocates IRQ descriptors and programs
      IOAPIC pins with default level and polarity attributes for all legacy
      IRQs. Later legacy IRQ users may fail to set IOAPIC pin attributes
      if the requested attributes conflicts with the default IOAPIC pin
      attributes. So change mp_irqdomain_map() to allow the first legacy IRQ
      user to reprogram IOAPIC pin with different attributes.
      
      Reported-and-tested-by: default avatarMika Westerberg <mika.westerberg@linux.intel.com>
      Signed-off-by: default avatarJiang Liu <jiang.liu@linux.intel.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Grant Likely <grant.likely@linaro.org>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Link: http://lkml.kernel.org/r/1409118795-17046-1-git-send-email-jiang.liu@linux.intel.com
      
      
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      f395dcae
  7. Aug 25, 2014
  8. Aug 19, 2014
  9. Aug 15, 2014
  10. Aug 12, 2014
  11. Aug 11, 2014
    • David Vrabel's avatar
      x86/xen: use vmap() to map grant table pages in PVH guests · 7d951f3c
      David Vrabel authored
      
      
      Commit b7dd0e35 (x86/xen: safely map and unmap grant frames when
      in atomic context) causes PVH guests to crash in
      arch_gnttab_map_shared() when they attempted to map the pages for the
      grant table.
      
      This use of a PV-specific function during the PVH grant table setup is
      non-obvious and not needed.  The standard vmap() function does the
      right thing.
      
      Signed-off-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      Reported-by: default avatarMukesh Rathor <mukesh.rathor@oracle.com>
      Tested-by: default avatarMukesh Rathor <mukesh.rathor@oracle.com>
      Cc: stable@vger.kernel.org
      7d951f3c
    • David Vrabel's avatar
      x86/xen: resume timer irqs early · 8d5999df
      David Vrabel authored
      
      
      If the timer irqs are resumed during device resume it is possible in
      certain circumstances for the resume to hang early on, before device
      interrupts are resumed.  For an Ubuntu 14.04 PVHVM guest this would
      occur in ~0.5% of resume attempts.
      
      It is not entirely clear what is occuring the point of the hang but I
      think a task necessary for the resume calls schedule_timeout(),
      waiting for a timer interrupt (which never arrives).  This failure may
      require specific tasks to be running on the other VCPUs to trigger
      (processes are not frozen during a suspend/resume if PREEMPT is
      disabled).
      
      Add IRQF_EARLY_RESUME to the timer interrupts so they are resumed in
      syscore_resume().
      
      Signed-off-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      Reviewed-by: default avatarBoris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: stable@vger.kernel.org
      8d5999df
  12. Aug 10, 2014
  13. Aug 08, 2014
    • Vivek Goyal's avatar
      kexec: verify the signature of signed PE bzImage · 8e7d8381
      Vivek Goyal authored
      
      
      This is the final piece of the puzzle of verifying kernel image signature
      during kexec_file_load() syscall.
      
      This patch calls into PE file routines to verify signature of bzImage.  If
      signature are valid, kexec_file_load() succeeds otherwise it fails.
      
      Two new config options have been introduced.  First one is
      CONFIG_KEXEC_VERIFY_SIG.  This option enforces that kernel has to be
      validly signed otherwise kernel load will fail.  If this option is not
      set, no signature verification will be done.  Only exception will be when
      secureboot is enabled.  In that case signature verification should be
      automatically enforced when secureboot is enabled.  But that will happen
      when secureboot patches are merged.
      
      Second config option is CONFIG_KEXEC_BZIMAGE_VERIFY_SIG.  This option
      enables signature verification support on bzImage.  If this option is not
      set and previous one is set, kernel image loading will fail because kernel
      does not have support to verify signature of bzImage.
      
      I tested these patches with both "pesign" and "sbsign" signed bzImages.
      
      I used signing_key.priv key and signing_key.x509 cert for signing as
      generated during kernel build process (if module signing is enabled).
      
      Used following method to sign bzImage.
      
      pesign
      ======
      - Convert DER format cert to PEM format cert
      openssl x509 -in signing_key.x509 -inform DER -out signing_key.x509.PEM -outform
      PEM
      
      - Generate a .p12 file from existing cert and private key file
      openssl pkcs12 -export -out kernel-key.p12 -inkey signing_key.priv -in
      signing_key.x509.PEM
      
      - Import .p12 file into pesign db
      pk12util -i /tmp/kernel-key.p12 -d /etc/pki/pesign
      
      - Sign bzImage
      pesign -i /boot/vmlinuz-3.16.0-rc3+ -o /boot/vmlinuz-3.16.0-rc3+.signed.pesign
      -c "Glacier signing key - Magrathea" -s
      
      sbsign
      ======
      sbsign --key signing_key.priv --cert signing_key.x509.PEM --output
      /boot/vmlinuz-3.16.0-rc3+.signed.sbsign /boot/vmlinuz-3.16.0-rc3+
      
      Patch details:
      
      Well all the hard work is done in previous patches.  Now bzImage loader
      has just call into that code and verify whether bzImage signature are
      valid or not.
      
      Also create two config options.  First one is CONFIG_KEXEC_VERIFY_SIG.
      This option enforces that kernel has to be validly signed otherwise kernel
      load will fail.  If this option is not set, no signature verification will
      be done.  Only exception will be when secureboot is enabled.  In that case
      signature verification should be automatically enforced when secureboot is
      enabled.  But that will happen when secureboot patches are merged.
      
      Second config option is CONFIG_KEXEC_BZIMAGE_VERIFY_SIG.  This option
      enables signature verification support on bzImage.  If this option is not
      set and previous one is set, kernel image loading will fail because kernel
      does not have support to verify signature of bzImage.
      
      Signed-off-by: default avatarVivek Goyal <vgoyal@redhat.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: Greg Kroah-Hartman <greg@kroah.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: WANG Chao <chaowang@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Matt Fleming <matt@console-pimps.org>
      Cc: David Howells <dhowells@redhat.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      8e7d8381
    • Vivek Goyal's avatar
      kexec: support kexec/kdump on EFI systems · 6a2c20e7
      Vivek Goyal authored
      
      
      This patch does two things.  It passes EFI run time mappings to second
      kernel in bootparams efi_info.  Second kernel parse this info and create
      new mappings in second kernel.  That means mappings in first and second
      kernel will be same.  This paves the way to enable EFI in kexec kernel.
      
      This patch also prepares and passes EFI setup data through bootparams.
      This contains bunch of information about various tables and their
      addresses.
      
      These information gathering and passing has been written along the lines
      of what current kexec-tools is doing to make kexec work with UEFI.
      
      [akpm@linux-foundation.org: s/get_efi/efi_get/g, per Matt]
      Signed-off-by: default avatarVivek Goyal <vgoyal@redhat.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: Greg Kroah-Hartman <greg@kroah.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: WANG Chao <chaowang@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Matt Fleming <matt@console-pimps.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      6a2c20e7
    • Vivek Goyal's avatar
      kexec: support for kexec on panic using new system call · dd5f7260
      Vivek Goyal authored
      
      
      This patch adds support for loading a kexec on panic (kdump) kernel usning
      new system call.
      
      It prepares ELF headers for memory areas to be dumped and for saved cpu
      registers.  Also prepares the memory map for second kernel and limits its
      boot to reserved areas only.
      
      Signed-off-by: default avatarVivek Goyal <vgoyal@redhat.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: Greg Kroah-Hartman <greg@kroah.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: WANG Chao <chaowang@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      dd5f7260
    • Vivek Goyal's avatar
      kexec-bzImage64: support for loading bzImage using 64bit entry · 27f48d3e
      Vivek Goyal authored
      
      
      This is loader specific code which can load bzImage and set it up for
      64bit entry.  This does not take care of 32bit entry or real mode entry.
      
      32bit mode entry can be implemented if somebody needs it.
      
      Signed-off-by: default avatarVivek Goyal <vgoyal@redhat.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: Greg Kroah-Hartman <greg@kroah.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: WANG Chao <chaowang@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      27f48d3e
    • Vivek Goyal's avatar
      kexec: load and relocate purgatory at kernel load time · 12db5562
      Vivek Goyal authored
      
      
      Load purgatory code in RAM and relocate it based on the location.
      Relocation code has been inspired by module relocation code and purgatory
      relocation code in kexec-tools.
      
      Also compute the checksums of loaded kexec segments and store them in
      purgatory.
      
      Arch independent code provides this functionality so that arch dependent
      bootloaders can make use of it.
      
      Helper functions are provided to get/set symbol values in purgatory which
      are used by bootloaders later to set things like stack and entry point of
      second kernel etc.
      
      Signed-off-by: default avatarVivek Goyal <vgoyal@redhat.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: Greg Kroah-Hartman <greg@kroah.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: WANG Chao <chaowang@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      12db5562
    • Vivek Goyal's avatar
      purgatory: core purgatory functionality · 8fc5b4d4
      Vivek Goyal authored
      
      
      Create a stand alone relocatable object purgatory which runs between two
      kernels.  This name, concept and some code has been taken from
      kexec-tools.  Idea is that this code runs after a crash and it runs in
      minimal environment.  So keep it separate from rest of the kernel and in
      long term we will have to practically do no maintenance of this code.
      
      This code also has the logic to do verify sha256 hashes of various
      segments which have been loaded into memory.  So first we verify that the
      kernel we are jumping to is fine and has not been corrupted and make
      progress only if checsums are verified.
      
      This code also takes care of copying some memory contents to backup region.
      
      [sfr@canb.auug.org.au: run host built programs from objtree]
      Signed-off-by: default avatarVivek Goyal <vgoyal@redhat.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: Greg Kroah-Hartman <greg@kroah.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: WANG Chao <chaowang@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Signed-off-by: default avatarStephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      8fc5b4d4
    • Vivek Goyal's avatar
      purgatory/sha256: provide implementation of sha256 in purgaotory context · daeba064
      Vivek Goyal authored
      
      
      Next two patches provide code for purgatory.  This is a code which does
      not link against the kernel and runs stand alone.  This code runs between
      two kernels.  One of the primary purpose of this code is to verify the
      digest of newly loaded kernel and making sure it matches the digest
      computed at kernel load time.
      
      We use sha256 for calculating digest of kexec segmetns.  Purgatory can't
      use stanard crypto API as that API is not available in purgatory context.
      
      Hence, I have copied code from crypto/sha256_generic.c and compiled it
      with purgaotry code so that it could be used.  I could not #include
      sha256_generic.c file here as some of the function signature requiered
      little tweaking.  Original functions work with crypto API but these ones
      don't
      
      So instead of doing #include on sha256_generic.c I just copied relevant
      portions of code into arch/x86/purgatory/sha256.c.  Now we shouldn't have
      to touch this code at all.  Do let me know if there are better ways to
      handle it.
      
      This patch does not enable compiling of this code.  That happens in next
      patch.  I wanted to highlight this change in a separate patch for easy
      review.
      
      Signed-off-by: default avatarVivek Goyal <vgoyal@redhat.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: Greg Kroah-Hartman <greg@kroah.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: WANG Chao <chaowang@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      daeba064
    • Vivek Goyal's avatar
      kexec: implementation of new syscall kexec_file_load · cb105258
      Vivek Goyal authored
      
      
      Previous patch provided the interface definition and this patch prvides
      implementation of new syscall.
      
      Previously segment list was prepared in user space.  Now user space just
      passes kernel fd, initrd fd and command line and kernel will create a
      segment list internally.
      
      This patch contains generic part of the code.  Actual segment preparation
      and loading is done by arch and image specific loader.  Which comes in
      next patch.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: default avatarVivek Goyal <vgoyal@redhat.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: Greg Kroah-Hartman <greg@kroah.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: WANG Chao <chaowang@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      cb105258
    • Vivek Goyal's avatar
      kexec: new syscall kexec_file_load() declaration · f0895685
      Vivek Goyal authored
      
      
      This is the new syscall kexec_file_load() declaration/interface.  I have
      reserved the syscall number only for x86_64 so far.  Other architectures
      (including i386) can reserve syscall number when they enable the support
      for this new syscall.
      
      Signed-off-by: default avatarVivek Goyal <vgoyal@redhat.com>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: Greg Kroah-Hartman <greg@kroah.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: WANG Chao <chaowang@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      f0895685
    • Vivek Goyal's avatar
      kernel: build bin2c based on config option CONFIG_BUILD_BIN2C · de5b56ba
      Vivek Goyal authored
      
      
      currently bin2c builds only if CONFIG_IKCONFIG=y. But bin2c will now be
      used by kexec too.  So make it compilation dependent on CONFIG_BUILD_BIN2C
      and this config option can be selected by CONFIG_KEXEC and CONFIG_IKCONFIG.
      
      Signed-off-by: default avatarVivek Goyal <vgoyal@redhat.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: Greg Kroah-Hartman <greg@kroah.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: WANG Chao <chaowang@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      de5b56ba
    • David Rheinsberg's avatar
      shm: add memfd_create() syscall · 9183df25
      David Rheinsberg authored
      
      
      memfd_create() is similar to mmap(MAP_ANON), but returns a file-descriptor
      that you can pass to mmap().  It can support sealing and avoids any
      connection to user-visible mount-points.  Thus, it's not subject to quotas
      on mounted file-systems, but can be used like malloc()'ed memory, but with
      a file-descriptor to it.
      
      memfd_create() returns the raw shmem file, so calls like ftruncate() can
      be used to modify the underlying inode.  Also calls like fstat() will
      return proper information and mark the file as regular file.  If you want
      sealing, you can specify MFD_ALLOW_SEALING.  Otherwise, sealing is not
      supported (like on all other regular files).
      
      Compared to O_TMPFILE, it does not require a tmpfs mount-point and is not
      subject to a filesystem size limit.  It is still properly accounted to
      memcg limits, though, and to the same overcommit or no-overcommit
      accounting as all user memory.
      
      Signed-off-by: default avatarDavid Herrmann <dh.herrmann@gmail.com>
      Acked-by: default avatarHugh Dickins <hughd@google.com>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Ryan Lortie <desrt@desrt.ca>
      Cc: Lennart Poettering <lennart@poettering.net>
      Cc: Daniel Mack <zonque@gmail.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      9183df25
    • Daniel Walter's avatar
      arch/x86: replace strict_strto calls · 164109e3
      Daniel Walter authored
      
      
      Replace obsolete strict_strto calls with appropriate kstrto calls
      
      Signed-off-by: default avatarDaniel Walter <dwalter@google.com>
      Acked-by: default avatarBorislav Petkov <bp@suse.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      164109e3
    • Andy Lutomirski's avatar
      arm64,ia64,ppc,s390,sh,tile,um,x86,mm: remove default gate area · a6c19dfe
      Andy Lutomirski authored
      
      
      The core mm code will provide a default gate area based on
      FIXADDR_USER_START and FIXADDR_USER_END if
      !defined(__HAVE_ARCH_GATE_AREA) && defined(AT_SYSINFO_EHDR).
      
      This default is only useful for ia64.  arm64, ppc, s390, sh, tile, 64-bit
      UML, and x86_32 have their own code just to disable it.  arm, 32-bit UML,
      and x86_64 have gate areas, but they have their own implementations.
      
      This gets rid of the default and moves the code into ia64.
      
      This should save some code on architectures without a gate area: it's now
      possible to inline the gate_area functions in the default case.
      
      Signed-off-by: default avatarAndy Lutomirski <luto@amacapital.net>
      Acked-by: default avatarNathan Lynch <nathan_lynch@mentor.com>
      Acked-by: default avatarH. Peter Anvin <hpa@linux.intel.com>
      Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [in principle]
      Acked-by: Richard Weinberger <richard@nod.at> [for um]
      Acked-by: Will Deacon <will.deacon@arm.com> [for arm64]
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Chris Metcalf <cmetcalf@tilera.com>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Nathan Lynch <Nathan_Lynch@mentor.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      a6c19dfe
    • Laura Abbott's avatar
      lib/scatterlist: make ARCH_HAS_SG_CHAIN an actual Kconfig · 308c09f1
      Laura Abbott authored
      
      
      Rather than have architectures #define ARCH_HAS_SG_CHAIN in an
      architecture specific scatterlist.h, make it a proper Kconfig option and
      use that instead.  At same time, remove the header files are are now
      mostly useless and just include asm-generic/scatterlist.h.
      
      [sfr@canb.auug.org.au: powerpc files now need asm/dma.h]
      Signed-off-by: default avatarLaura Abbott <lauraa@codeaurora.org>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>			[x86]
      Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	[powerpc]
      Acked-by: default avatarHeiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: default avatarStephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      308c09f1
    • Jiang Liu's avatar
      x86, irq, PCI: Keep IRQ assignment for PCI devices during suspend/hibernation · 3eec5952
      Jiang Liu authored
      
      
      Now IOAPIC driver dynamically allocates IRQ numbers for IOAPIC pins.
      We need to keep IRQ assignment for PCI devices during suspend/hibernation,
      otherwise it may cause failure of suspend/hibernation due to:
      1) Device driver calls pci_enable_device() to allocate an IRQ number
         and register interrupt handler on the returned IRQ.
      2) Device driver's suspend callback calls pci_disable_device() and
         release assigned IRQ in turn.
      3) Device driver's resume callback calls pci_enable_device() to
         allocate IRQ number again. A different IRQ number may be assigned
         by IOAPIC driver this time.
      4) Now the hardware delivers interrupt to the new IRQ but interrupt
         handler is still registered against the old IRQ, so it breaks
         suspend/hibernation.
      
      To fix this issue, we keep IRQ assignment during suspend/hibernation.
      Flag pci_dev.dev.power.is_prepared is used to detect that
      pci_disable_device() is called during suspend/hibernation.
      
      Reported-and-Tested-by: default avatarBorislav Petkov <bp@suse.de>
      Signed-off-by: default avatarJiang Liu <jiang.liu@linux.intel.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Grant Likely <grant.likely@linaro.org>
      Cc: Len Brown <lenb@kernel.org>
      Link: http://lkml.kernel.org/r/1407478071-29399-1-git-send-email-jiang.liu@linux.intel.com
      
      
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      3eec5952
    • Dave Hansen's avatar
      x86/mm: Fix RCU splat from new TLB tracepoints · 7c7f1547
      Dave Hansen authored
      Dave Jones reported seeing a bug from one of my TLB tracepoints:
      
              http://lkml.kernel.org/r/20140806181801.GA4605@redhat.com
      
      According to Paul McKenney, the right way to fix this is adding
      an _rcuidle suffix to the tracepoint.
      
              http://lkml.kernel.org/r/20140807065055.GA5821@linux.vnet.ibm.com
      
      
      
      This patch does just that.
      
      Reported-by: default avatarDave Jones <davej@redhat.com&gt;,>
      Signed-off-by: default avatarDave Hansen <dave.hansen@linux.intel.com>
      Reviewed-by: default avatarPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/20140807175841.5C92D878@viggo.jf.intel.com
      
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      7c7f1547
  14. Aug 07, 2014
Loading